Nested versus non-nested characterization of vegetation composition and species richness at multiple spatial scales

Thomas R. Wentworth1

Peter S. White2

Brooke E. Wheeler2

Kristin Taverna2

Dane Kuppinger2

Lee Anne Jacobs2

Jason D. Fridley2

Jack Weiss2 ??

Robert K. Peet2

1Department of Botany

Campus Box 7612

North Carolina State University

Raleigh, NC 27695-7612 USA

2Department of Biology

CB#3280, Coker Hall

University of North Carolina

Chapel Hill, NC 27599-3280 USA

Abstract

The importance of characterizing vegetation composition and species richness at multiple spatial scales is increasingly recognized by ecologists, but there is no consensus as to whether variation with scale is best characterized by subplots arranged in a nested or a non-nested design. Many crucial aspects of biodiversity management and research may depend on this design choice. The fundamental difference between these designs lies in how they are influenced by spatial autocorrelation and in how well they allow characterization of change in vegetation structure with change in grain of observation. We compared nested and non-nested designs for characterizing species richness of vascular plants (using species-accumulation curves) with data collected by the Carolina Vegetation Survey in 873 0.04 ha plots. Our objectives were to compare exponential (Gleason) and power-function (Arrhenius) models for fitting species-accumulation curves and to determine which design allowed more effective extrapolation of the species-accumulation relationship to larger spatial extent. The arithmetic-Arrhenius model provided the best fit and most closely predicted 400 m2 richness for both sampling designs. With the arithmetic-Arrhenius model, the average deviation of predicted 400 m2 richness values using the nested design was near zero, while the average deviation using the non-nested design was positive, suggesting a consistent over-prediction of 400 m2 richness for the non-nested design. The latter finding was consistent with our hypothesis that data collected using a non-nested design are prone to over-predict richness in larger areas within which they are nested. From these empirical results, we conclude that data collected using a nested design are to be preferred over those collected using a non-nested design. From a conceptual perspective, the nested design offers a means to interpret underlying spatial patterns of richness that the non-nested design does not. 
In particular, the nested design is more appropriate if the goal is to assess changes in species composition with changing grain size of observation, because the non-nested design confounds the influence of grain with that of extent. For this and other conceptual reasons, we also conclude that the nested design is equivalent or superior to the non-nested design for most applications and should be the standard method for multi-scale inventories.

Key words: Arrhenius model, Gleason model, nested sampling design, non-nested sampling design, spatial extent, spatial scale, species richness, species-area curves, species-accumulation curves.

Introduction

Assessment of species richness (number of species per unit area) is a common objective for terrestrial plant ecologists concerned with both inventory and conservation of natural resources. Among community properties, species richness is of particular interest because it is the outcome of numerous density-dependent and density-independent processes (Huston 1979, Grace 1999), such as competition and density-independent population reduction, and because it contributes to community structure and function (Loreau et al. 2001). Because processes relating to species richness are scale-dependent (Huston 1994, Rosenzweig 1995, Fridley 2001, Chase and Leibold 2002), it is essential that species richness patterns be quantified at a variety of spatial scales. Ecologists have the choice of characterizing species richness at single or multiple spatial scales. We argue that characterization of species richness at multiple scales is preferable. Because species richness increases with scale of inventory (typically area in terrestrial systems), estimates of richness conducted at a single, arbitrary spatial scale limit the utility of the data for two reasons: (1) results can only be compared with those of other researchers if they adopted the same spatial scale; and (2) the investigator may not have selected the most appropriate spatial scale for answering a particular research question. 
In contrast, determination of richness at multiple spatial scales has three benefits: (1) as research questions and objectives evolve, analyses are possible at a variety of spatial scales, some of which may be more suited to particular research questions; (2) the pattern of increase in species richness may be the attribute of greatest interest, not the richness at any particular scale (Gleason 1925, Williams 1964, Rosenzweig 1995); and (3) species richness data spanning multiple spatial scales facilitate both interpolation of richness for comparison with studies at varied scales and extrapolation of the species-area relationship to larger scales.

Implementing an inventory protocol at multiple spatial scales involves many choices, including the spatial arrangement and relative sizes of subplots and how such subplots are analyzed to make inferences about the underlying spatial structure of the vegetation. A design choice of particular importance is whether variation of species richness with increasing scale is best characterized by subplots arranged in a design that is nested or non-nested. In a nested design, each area inventoried is fully enclosed within the next larger area; in a non-nested design, areas of different sizes are inventoried independently of one another. There is no consensus as to the circumstances under which nested or non-nested multi-scalar sampling designs should be preferred. Recently, Stohlgren et al. (1995) and Barnett and Stohlgren (2003) have promoted a non-nested design, using a “modified-Whittaker” 0.1 ha plot. [Here we need brief mention of why Stohlgren chose the non-nested approach.] Researchers of the Carolina Vegetation Survey (Peet et al. 1998) have advocated use of a nested design, also using 0.1 ha plots. Rosenzweig (1995, p. 10), following the pioneering work of Gleason (1925), also recommended use of contiguous, nested subplots because data from non-nested or “scattered” subplots result in species-area curves that climb “too fast” and exhibit “too much curvature.” Given the influence of Stohlgren and colleagues’ non-nested modified-Whittaker design, and prevailing counter-arguments favoring the nested approach, there is a pressing need for an empirical evaluation of the effectiveness of nested versus non-nested designs in characterizing species richness patterns.

Beyond the issue of nested versus non-nested subplot arrangement, there is a secondary problem of the appropriate model for fitting species-area relationships obtained using either design. It may be that both nested and non-nested subplot arrangements are appropriate methods for characterizing species-richness patterns, but their efficacy must be evaluated by application of different models. Such model-fitting has a long history, extending back to seminal work by Arrhenius (1921) and Gleason (1922). While many possible models exist, data from relatively small areas (defined for our purposes as those ≤ 0.1 ha) seem best fit by either exponential (S = z log(A) + c) or power-function (S = cA^z) models (He and Legendre 1996), referred to in this paper as Gleason and Arrhenius models, respectively. In both models, S is the number of species, A is the area examined, and c and z are constants. While the Gleason model is appropriately fit using simple linear regression, two alternatives exist for fitting the Arrhenius model. As noted by Rosenzweig (1995), most authors transform the Arrhenius model to its linear form, log(S) = z log(A) + log(c), and estimate its parameters using simple linear regression. However, Wright (1981) recommended using non-linear regression techniques to estimate the parameters of the Arrhenius model in its non-linear form, S = cA^z. We chose to compare the results using both linear regression (referred to hereafter as the log-Arrhenius model) and non-linear regression (referred to hereafter as the arithmetic-Arrhenius model). Of the three, the log-Arrhenius model is the most difficult to apply to small-scale data because zero richness values, whose logarithm is undefined, are prevalent at the smallest scales. [This might be our first major decision in translating the poster to manuscript form: how much do we want to focus on the issue of log- vs. non-log Arrhenius? 
The point has been made before (haven’t read the Wright paper, but Rosenzweig 1995 discusses this a bit); is this analysis essential to our argument? Should we just pick one? Also, how much model-fitting redundancy should we have with the fine-scale SPARCs paper? I am more worried about the issue of redundancy with the SPARCs paper. A case could be made for omitting this stuff here, assuming the SPARCs paper will be submitted at nearly the same time. I rather like the one paper – one idea approach here. Perhaps we could simply reference the SPARCs paper in place of repeating it here.]
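The three fitting approaches described above can be illustrated with a minimal sketch. It is not the code used for the analyses reported here: the function names are hypothetical, base-10 logarithms are an assumption (the models are written with an unspecified log), and the arithmetic-Arrhenius fit uses a simple golden-section search over z (with the least-squares c solved in closed form for each z), which assumes the residual sum of squares has a single minimum between z = 0 and z = 2. A dedicated non-linear regression routine would normally be used instead.

```python
import math

def linear_fit(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def fit_gleason(areas, richness):
    """Gleason model S = z*log10(A) + c; returns (z, c)."""
    return linear_fit([math.log10(a) for a in areas], richness)

def fit_log_arrhenius(areas, richness):
    """Log-Arrhenius: log10(S) = z*log10(A) + log10(c); returns (z, c).
    Requires S > 0 at every scale, since log10(0) is undefined."""
    z, log_c = linear_fit([math.log10(a) for a in areas],
                          [math.log10(s) for s in richness])
    return z, 10 ** log_c

def fit_arithmetic_arrhenius(areas, richness, lo=0.0, hi=2.0, iters=200):
    """Arithmetic-Arrhenius: S = c*A**z fit on the untransformed scale.
    Assumption: the sum of squared errors is unimodal in z on [lo, hi]."""
    def best_c(z):
        # For fixed z, the least-squares c has a closed form.
        num = sum(s * a ** z for a, s in zip(areas, richness))
        den = sum(a ** (2 * z) for a in areas)
        return num / den

    def sse(z):
        c = best_c(z)
        return sum((s - c * a ** z) ** 2 for a, s in zip(areas, richness))

    g = (math.sqrt(5) - 1) / 2  # golden-section ratio
    for _ in range(iters):
        m1 = hi - g * (hi - lo)
        m2 = lo + g * (hi - lo)
        if sse(m1) < sse(m2):
            hi = m2
        else:
            lo = m1
    z = (lo + hi) / 2
    return z, best_c(z)
```

Because the log-Arrhenius fit minimizes error in log(S) while the arithmetic-Arrhenius fit minimizes error in S itself, the two generally return different parameter estimates for the same data unless the data follow the power function exactly.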

Non-nested subplot designs add an additional level of complexity by allowing two different ways to tally species with increasing area. The first, a species-area curve (SPARC), simply charts the number of species found at each increasingly larger area. The second, a species-accumulation curve (SPACC), charts the cumulative number of different species encountered as larger areas are inventoried. For nested designs, SPARCs and SPACCs are identical; however, for non-nested designs, SPARCs and SPACCs may differ. Individual SPARCs developed from nested data never decrease with increasing area, while individual SPARCs developed from non-nested data may not be monotonically increasing functions. However, it can be shown that the SPARCs developed by averaging both nested and non-nested sets of data collected from within the same larger area are expected to be identical; efforts to compare them would thus be uninteresting. [But the issue of replication is an important one; although mean values of subplot richness should be the same, CVS design allows for several semi-independent estimates of small-scale species-area curves; this is not as easily achieved with non-nested data because you’d have to “jump around” in space to construct replicate curves. Yup] However, SPACCs developed from nested and non-nested data collected from within the same larger area may be different, and this paper focuses exclusively on comparison of species-area relationships described by SPACCs. Because SPACCs developed from non-nested data encompass greater spatial extent than those developed from nested data, the cumulative species list will generally be greater than the corresponding list developed from nested data. [This assumes that, unlike our comparisons below, nested and non-nested approaches are not both nested within the representative largest area (i.e., 0.1 ha plot). 
In Gleason and Rosenzweig’s examples of SPACCs with nested and non-nested data, the ends of the curves are fixed, and the shape of the non-nested curve is convex.]
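The distinction between the two tallies can be made concrete with a short sketch. The function names and the single-letter species codes are hypothetical and serve only as illustration; each subplot is represented as a pair of its area and the set of species recorded in it, listed in order of increasing area.

```python
def sparc(subplots):
    """Species-area curve: richness of each subplot taken on its own."""
    return [len(species) for _area, species in subplots]

def spacc(subplots):
    """Species-accumulation curve: cumulative number of distinct species
    encountered as successively larger subplots are added."""
    seen = set()
    counts = []
    for _area, species in subplots:
        seen |= species
        counts.append(len(seen))
    return counts

# Nested: each subplot contains the previous one, so its list is a superset.
nested = [(0.1, {"a"}), (1, {"a", "b"}), (10, {"a", "b", "c"})]

# Non-nested: subplots are inventoried independently of one another.
non_nested = [(0.1, {"a", "b"}), (1, {"c"}), (10, {"a", "d", "e"})]
```

For the nested example the two tallies coincide. For the non-nested example the SPARC is not monotonic (the 1 m2 subplot holds fewer species than the 0.1 m2 subplot), whereas the SPACC can only increase or stay level.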

Because of continued interest in protocols for species inventory at multiple spatial scales, we chose to compare nested and non-nested approaches using data available from the Carolina Vegetation Survey (Peet et al. 1998). We compared SPACCs developed from nested data with SPACCs developed from non-nested data to evaluate the following questions: (1) For both nested and non-nested designs, which model provides the best fit to the species richness-area relationship? (2) Does one design result in consistently better model fitting? (3) Do nested or non-nested subplot designs provide more accurate predictions of richness at larger scales? [Given the much larger-scale focus of our other paper, perhaps we should be more specific with our “projecting upward” analysis to relatively small scales, say within a community? Yup]

[The major component missing from this current introduction is the rationale for choosing nested vs. non-nested designs in the first place, without regard to model fitting. What did we know about nested vs. non-nested techniques BEFORE our analysis? How then does our analysis change what we thought we knew (or does it)? We at least knew that: 1) nested samples are generally logistically easier to perform and make more efficient use of field time; 2) nested samples are most appropriate for concerns over confounding grain and extent; 3) nested samples are more mathematically desirable because they are strictly monotonic; 4) non-nested samples are often thought to be more statistically desirable because richness values of different quadrats are more statistically independent; 5) extrapolation of small-scale data to larger scales is ALWAYS performed in a nested framework, so nested designs at small scales are methodologically consistent with larger-scale curves; 6) given equal sampling time, non-nested quadrats will almost always find more species because they can cover a greater spatial extent. So the question becomes how model fitting procedures add to this previous perspective (and perhaps more rigorously shore up some of our preconceived notions, or perhaps even change some). We also need some rationale for our stated 3 goals: why is prediction of richness at larger scales important? Why is it necessary to have a consistent model for fine-scale SPARCs?]

[I agree with Jason’s concerns expressed in the previous paragraph. Note, however, that to address them would add length to an already much, much too long introduction. This leads me to think we need a structural change in the paper. We should have a short intro that is at most 2 paragraphs long where we explain the need for an analysis of the conceptual merits and advantages of both methods, and the need to empirically test model fits. This should be followed by a section on theory and logic, wherein Jason’s points are addressed. Then we move on to the empirical section wherein we start with methods.]

Methods

A total of 873 0.1 ha plots with nonzero richness values at each of four spatial scales were available from the Carolina Vegetation Survey data set, covering a wide range of communities and environmental conditions across the Carolinas (USA) [wanna use the figure from the other paper, or a subset of it? – issue?]. The data were collected following the protocol of Peet et al. (1998) (Figure 1). Within each plot, four contiguous 10 x 10 m modules with a total area of 400 m2 were used to generate richness values for this study. Subplot data came from the 0.1, 1, 10, and 100 m2 scales (subplot data from the 0.01 m2 scale were also available but not used in this investigation because of the large number of zero-richness values encountered at that scale). These data represent the number of vascular plant species rooted within the area of concern (i.e., to be considered present at a particular spatial scale, a species had to be represented by at least one shoot emerging within the subplot corresponding to that scale; cf. Williamson 2003). The four 10 x 10 m modules were nested within larger 1000 m2 plots, but we focus exclusively on the contiguous block of 10 x 10 m modules, with a combined area of 400 m2. Four separate sets of nested richness values were generated for each plot using values collected within each of the four 10 x 10 m modules. Four separate sets of non-nested richness values were also generated; each set started with a value for the smallest scale drawn from one of the four modules and then accumulated values for increasingly larger scales from the remaining modules in a counter-clockwise fashion through the block of four modules, such that no value at a given scale was selected more than once. Each set of richness values was used to generate an individual SPACC, resulting in four nested replicate and four non-nested replicate SPACCs in a given plot. 
In modeling SPACCs from nested data, the values for the independent variable (cumulative area) were 0.1, 1, 10, and 100 m2. However, SPACCs from non-nested data accumulated area at a slightly greater rate, so the corresponding values for cumulative area were adjusted to be 0.1, 1.1, 11.1, and 111.1 m2.
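The adjusted cumulative areas follow from summing the areas of the separately located subplots, and the rotation through modules can be expressed as a simple index rule. The sketch below is illustrative only: the names are hypothetical, and numbering the modules 0 to 3 in counter-clockwise order is an assumed convention for describing the procedure above.

```python
from itertools import accumulate

SCALES_M2 = [0.1, 1, 10, 100]  # subplot sizes used in this study
N_MODULES = 4                  # modules 0-3, assumed numbered counter-clockwise

def nested_areas():
    """Nested subplots share a corner, so cumulative area equals subplot size."""
    return list(SCALES_M2)

def non_nested_areas():
    """Non-nested subplots lie in different modules, so their areas add up."""
    return list(accumulate(SCALES_M2))

def non_nested_module_order(start):
    """Module supplying each successive scale for the replicate that begins in
    module `start`, moving one module counter-clockwise per scale."""
    return [(start + step) % N_MODULES for step in range(len(SCALES_M2))]
```

With this rule, the four replicates draw each scale from four different modules, so no value at a given scale is used more than once within a plot.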

Both Gleason and Arrhenius models were fit to each of the four nested and four non-nested sets of richness data in each plot; we fit the Arrhenius model using both linear (log-Arrhenius) and nonlinear (arithmetic-Arrhenius) regression. [Here I wonder if we should say we fit both untransformed and log-transformed Arrhenius models but got consistently better fits for the untransformed, and omit all subsequent discussion of log-Arr? – I like this idea] We will subsequently refer to each of these three approaches to curve-fitting as a separate “model.” [I guess I prefer two models only. Me too] Figures 2a-c illustrate our curve-fitting procedure for nested data from one plot representing median richness from the Carolina Vegetation Survey data set. The model results were extrapolated to predict species richness at the 400 m2 scale for each of four nested and non-nested replicates in each plot, and deviations of the predicted values from the actual 400 m2 richness were calculated. In the scatter plots of predicted versus observed richness values, the result from each set of richness values is individually depicted (resulting in 873 x 4 points per scatter plot) for data from both non-nested and nested designs (a and b, respectively, in Figures 3-5).
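The extrapolation step can be sketched as follows. The function names are hypothetical, base-10 logarithms are again an assumption, and the deviation is taken as predicted minus observed so that a positive mean indicates over-prediction, matching the sign convention used in the Abstract.

```python
import math

TARGET_AREA_M2 = 400  # combined area of the four 10 x 10 m modules

def predict_gleason(z, c, area=TARGET_AREA_M2):
    """Richness predicted by the Gleason model S = z*log10(A) + c."""
    return z * math.log10(area) + c

def predict_arrhenius(z, c, area=TARGET_AREA_M2):
    """Richness predicted by the Arrhenius model S = c*A**z."""
    return c * area ** z

def mean_deviation(predicted, observed):
    """Average of (predicted - observed); positive means over-prediction."""
    diffs = [p - o for p, o in zip(predicted, observed)]
    return sum(diffs) / len(diffs)
```

In use, the parameters fitted to each replicate at the four subplot scales would be passed to the matching prediction function, and the resulting values compared against the observed 400 m2 richness of the same plot.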