Online Resource 1: Quantification and characterization of natural and anthropogenic environmental patterns across the study area

ARTICLE TITLE: Effects of urbanization on herbaceous forest vegetation: the relative impacts of soil, geography, biotic interactions, human access, and an invasive shrub

JOURNAL: Urban Ecosystems

AUTHORS: Guy N. Cameron1, Theresa M. Culley1, Sarah E. Kolbe2, Arnold I. Miller2, and Stephen F. Matter1

AFFILIATION: 1Department of Biological Sciences and 2Department of Geology

University of Cincinnati, Cincinnati, Ohio 45221

Corresponding Author: G. Cameron

This supplement provides an overview of natural and anthropogenic environmental patterns at our study sites [Miami Whitewater Forest (MWW), Mt. Airy Forest (MAF), Benedict Nature Preserve (BEN), East Fork Wildlife Area (EF), Tranquility Wildlife Area (TRA), Edge of Appalachia Preserve (EOA); see main text (Methods and Materials: Study sites) for details on study sites]. In the following sections, we describe the methods used for collecting and analyzing environmental data, present the results of environmental analyses, and provide the basis for classifying these sites as Urban, Exurban, and Wildland.

Methods used to collect data

Edaphic variables:

Our methods for collecting and analyzing data from soil cores are described in the main text (Methods: Environmental measures). Additional data on soil taxonomy, drainage class, and available water storage capacity were extracted from the Soil Survey Geographic Database (SSURGO 2011). Likewise, our use of GIS datasets to acquire information on elevation, aspect, and slope of our study plots, as well as the distance of each plot to the nearest primary and secondary roads, is described in the main text (Methods: Environmental measures).

Deposition of atmospheric pollutants:

Measures of wet deposition of NO3-, NH4+, total nitrogen (TN), and SO42- were obtained using raster datasets created from data collected through the National Atmospheric Deposition Program (NRSP-3 2012; 4-km resolution) to assess the impacts of atmospheric deposition from coal-fired power plants and other major point sources of pollution. Annual precipitation, average maximum temperature, and average minimum temperature are based on 30-year climate-normal data from 1971 to 2000. These data were extracted from raster datasets compiled by the PRISM Climate Group (PRISM 2008) from weather stations maintained by the National Weather Service, the Natural Resources Conservation Service, the United States Forest Service, the Bureau of Land Management, and other state and local station networks.

Population density:

Population density in 2010 and change in population density between 2000 and 2010 were assembled from U.S. Census data (U.S. Census Bureau 2011) to assess contemporary population characteristics and recent trends near each plot. Change in population density was assessed over this relatively short time interval to capture volatility. The short time interval also decreases the influence of artefactual differences in population density related to changes in census block borders through time. To estimate population density for each time period, we created a buffer with a 1-km radius around each plot centroid, and calculated average population density for the buffer based on the population density of census blocks that intersected the buffer, weighted by their percent areal coverage within the buffered zone. This was preferred over extracting population density values directly from the census blocks containing the plots because many plots were located near the borders of census blocks, so that population densities of the blocks may not accurately represent true population densities near the plots.
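As an illustration only, the area-weighted averaging described above can be sketched as follows. The function name and the input values are hypothetical; in practice, the overlap area of each census block with the buffer would come from a GIS intersection, which is not shown here.

```python
def buffer_population_density(blocks):
    """Area-weighted mean population density for one plot buffer.

    blocks: iterable of (density_people_per_km2, overlap_area_km2) pairs,
    one pair per census block that intersects the buffer (hypothetical inputs).
    """
    blocks = list(blocks)
    total_area = sum(area for _, area in blocks)
    if total_area <= 0:
        raise ValueError("blocks must cover a positive area of the buffer")
    # Weight each block's density by its share of the covered buffer area.
    return sum(density * area for density, area in blocks) / total_area


# Illustrative values only, not data from the study.
example_blocks = [(1200.0, 1.0), (300.0, 2.0)]
print(buffer_population_density(example_blocks))
```

Weighting by overlap area rather than by whole-block area keeps a large block that barely touches the buffer from dominating the estimate.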

A buffer radius of 1 km was selected to balance the need to characterize the area around each plot against the need to minimize overlap among buffers of adjacent plots. Previous studies suggest that even in fully forested environments, edge-affected zones may be as deep as 1 km, so it is reasonable to expect that anthropogenic activities occurring within that distance from our plots may affect the communities present there (Gascon et al. 2000). Using this relatively large buffer size is a conservative approach: buffers serve as windows to a landscape and its features, and heterogeneous landscapes generally appear more homogeneous as buffer size increases, decreasing the likelihood of significant differences among different areas (Baker et al. 1995). Although the 1-km plot buffers overlap to some degree at each locality, the buffer approach was favored over the use of a locality “average” value because it permits diagnosis of variation among plots within localities even when parts of buffers were shared.

Roads:

In addition to proximity to roads, described in the main text (Methods: Environmental measures), we also calculated total density of roads around study plots by summing the total length of roads circumscribed by a 1-km buffer around each plot centroid, analogous to the standard method for calculating drainage density within a basin area (Tucker et al. 2001).
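A minimal sketch of a road-density calculation is given below. It assumes, by analogy with drainage density (channel length per unit basin area), that summed road length is divided by the area of the circular buffer; the function name and inputs are hypothetical, and clipping road segments to the buffer is assumed to have been done beforehand in a GIS.

```python
import math


def road_density(segment_lengths_km, buffer_radius_km=1.0):
    """Road density (km of road per km2) within a circular plot buffer.

    segment_lengths_km: lengths of road segments already clipped to the
    buffer (hypothetical input). Dividing by buffer area is an assumption
    made by analogy with drainage density.
    """
    if buffer_radius_km <= 0:
        raise ValueError("buffer radius must be positive")
    buffer_area_km2 = math.pi * buffer_radius_km ** 2
    return sum(segment_lengths_km) / buffer_area_km2


# Illustrative values only: three clipped segments inside a 1-km buffer.
print(road_density([0.8, 1.5, 0.4]))
```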

Land cover and land-use change:

Multispectral remote sensing imagery was used to characterize current land cover and changes in land use at each study site over the 23-year interval from 1988 to 2011, which was selected based on image availability. We used Landsat 5 imagery because this satellite has been operational since 1984, allowing for comparison among images collected by the same platform using identical bands and sensors in both 1988 and 2011. For each locality, a satellite image was selected from each year using USGS EarthExplorer. Images were selected to be cloud-free and to represent the same time of year to minimize potential differences introduced by variation in moisture and plant phenology. For MWW, BEN, and EF, cloud-free scenes from June 6, 1988 and June 6, 2011 were selected. For EOA and TRA, scenes from May 30, 1988 and May 30, 2011 were selected. Appropriate cloud-free scenes were not available for May or June prior to 1988. A high-resolution panchromatic band from Landsat 7 ETM+ was also downloaded for each area to aid in the identification of training areas for land cover classification.

After image acquisition, composite images were formed using bands 1-5 and 7, representing visible blue, green, red, near-infrared, and mid-infrared wavelengths, and training areas representing forest, agricultural, water, residential/transitional, and urban land cover were selected (Online Resource 1 Fig. 1). The same training areas were used for each image wherever possible; overlap between the east and west scenes permitted the identification of training areas for each land cover type present in all four images. Supplemental training areas were identified for each image as necessary to fully characterize each land cover type. Maximum likelihood-based supervised classification was applied to determine land-cover types for each image individually using ENVI 4.4, and post-processing majority analysis was used to smooth images. Classified images were then exported into a geographic information system and combined into a single composite raster representing land cover transitions over the 23-year interval. Changes in land use were quantified within a 1-km buffered radius to determine the total percentage of land that experienced any transition in land use over the interval, the percentage of land that experienced a transition to “urban” land use, and the percentage of land that transitioned to any disturbed state (e.g. “residential”, “agricultural”). The last measure distinguishes active, anthropogenically mediated transitions in land use from passive transitions, such as reforestation after agricultural abandonment.
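The three transition percentages described above can be illustrated with a short sketch. It assumes that the two classified images have already been reduced to aligned lists of per-pixel class labels within one buffer; the function name, label strings, and example pixels are hypothetical, and the set of "disturbed" classes follows the states named in this supplement.

```python
from collections import Counter

# Classes treated as disturbed end states (an assumption based on the text).
DISTURBED = {"urban", "residential", "agricultural"}


def summarize_transitions(cover_1988, cover_2011):
    """Percent of pixels that changed class, changed to urban, or changed
    to any disturbed class between two aligned classified images."""
    if len(cover_1988) != len(cover_2011) or not cover_1988:
        raise ValueError("inputs must be non-empty and the same length")
    pairs = Counter(zip(cover_1988, cover_2011))
    n = len(cover_1988)
    changed = sum(c for (old, new), c in pairs.items() if old != new)
    to_urban = sum(c for (old, new), c in pairs.items()
                   if old != new and new == "urban")
    to_disturbed = sum(c for (old, new), c in pairs.items()
                       if old != new and new in DISTURBED)
    return {
        "any_change_pct": 100.0 * changed / n,
        "to_urban_pct": 100.0 * to_urban / n,
        "to_disturbed_pct": 100.0 * to_disturbed / n,
    }


# Ten illustrative pixels, not data from the study.
before = ["forest", "forest", "agricultural", "agricultural", "residential",
          "forest", "forest", "forest", "urban", "forest"]
after = ["forest", "residential", "forest", "agricultural", "urban",
         "forest", "forest", "forest", "urban", "agricultural"]
print(summarize_transitions(before, after))
```

In this example the agricultural-to-forest pixel counts toward total change but not toward disturbed transitions, which is the distinction between passive and active change drawn in the text.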

The classified image from 2011 also was used to quantify current land cover conditions and measures of habitat fragmentation near each plot. The percentage of each land cover type was determined within a 1-km buffered radius of each plot. To quantify forest fragmentation, we calculated the area-weighted mean size, size variability, and edge-to-area ratio of forest patches within each plot’s buffer radius. To characterize landscape heterogeneity, we also calculated the diversity of land cover types and the total edge density (in km/km2) within each 1-km buffer.
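Two of these landscape metrics can be sketched briefly. The area-weighted mean patch size below uses the common definition in which each patch is weighted by its own share of total patch area. The diversity index is not named in this supplement, so the Shannon index shown here is an assumption chosen purely for illustration. Function names and inputs are hypothetical.

```python
import math


def area_weighted_mean_patch_size(patch_areas_km2):
    """Mean patch size with each patch weighted by its share of total area."""
    total = sum(patch_areas_km2)
    if total <= 0:
        raise ValueError("patch areas must sum to a positive value")
    return sum(a * a for a in patch_areas_km2) / total


def shannon_diversity(cover_amounts):
    """Shannon index of land cover types (assumed index, see lead-in).

    cover_amounts: area or pixel count of each land cover class in a buffer.
    """
    total = sum(cover_amounts)
    if total <= 0:
        raise ValueError("cover amounts must sum to a positive value")
    return -sum((c / total) * math.log(c / total)
                for c in cover_amounts if c > 0)


# Illustrative values only, not data from the study.
print(area_weighted_mean_patch_size([1.0, 3.0]))
print(shannon_diversity([43, 45, 12]))
```

Area weighting emphasizes the patch in which a randomly chosen forest pixel is likely to sit, so a few large patches raise the value more than many small ones.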

Estimating canopy cover from LiDAR:

Canopy cover, which represents the percentage of forest floor covered by the vertical projection of tree crowns (McLane et al. 2009), strongly influences the amount of light available in the forest understory, and therefore may influence tree community composition and structure. To estimate canopy cover at each study site, we used discrete-return LiDAR data available through the Ohio Statewide Imagery Program (OSIP). These data were collected with a Leica ALS50 digital LiDAR System at a flight altitude of 2.2 km, resulting in an average post spacing of approximately 2 m. LiDAR returns were separated into ground and aboveground points. LiDAR-based canopy cover (CCAR) was calculated as the ratio of all (first, last, single, and intermediate) canopy returns to all (ground plus aboveground) returns:

CCAR = RCanopy(all) / RTotal(all)

where RCanopy(all) represents all canopy returns, and RTotal(all) represents all ground and aboveground returns. A similar metric using the canopy-to-total ratio of first returns (CCFR) is also frequently used as a measure of canopy cover, and is sometimes favored because intermediate and last returns provide little additional information when closely spaced first returns are available (Morsdorf et al. 2006). Given the post spacing of the OSIP data and the density of the canopy cover at our study sites, however, there were not enough first ground returns to reliably calculate CCFR. CCAR has been demonstrated to reliably represent true canopy cover (Smith et al. 2009, Hopkinson and Chasmer 2009), and LiDAR-based measures of canopy cover can actually surpass the accuracy of quickly measured field-based estimates (Morsdorf et al. 2006, Smith et al. 2009, Korhonen et al. 2011). A height threshold of 1.37 m was used to separate canopy returns from other nonground returns. This threshold was selected because it is the height at which field-based measurements of vertical canopy cover and diameter at breast height are typically recorded (Morsdorf et al. 2006, Smith et al. 2009, Korhonen et al. 2011), and because overstory and understory vegetation can be efficiently separated at this height across a broad range of forest types (McLane et al. 2009).
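The all-returns canopy cover ratio can be illustrated with the sketch below. It assumes that each LiDAR return has already been classified as ground or aboveground and assigned a height above ground; the function name, the input layout, and the choice to count returns at exactly 1.37 m as canopy are assumptions made for illustration.

```python
def canopy_cover_all_returns(returns, threshold_m=1.37):
    """Ratio of canopy returns to all returns for one plot.

    returns: list of (height_above_ground_m, is_ground) pairs, one per LiDAR
    return of any type (first, last, single, intermediate); hypothetical input.
    """
    if not returns:
        raise ValueError("at least one return is required")
    # Canopy returns are aboveground returns at or above the height threshold.
    canopy = sum(1 for height, is_ground in returns
                 if not is_ground and height >= threshold_m)
    return canopy / len(returns)


# Eight illustrative returns, not data from the study.
example_returns = [(0.0, True), (0.0, True), (0.5, False), (2.0, False),
                   (15.0, False), (22.0, False), (18.0, False), (0.0, True)]
print(canopy_cover_all_returns(example_returns))
```

Here the return at 0.5 m is aboveground but below the threshold, so it counts toward the denominator only.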

Quantitative data analyses

In our initial analyses of the environmental data listed below in Results, we illustrate differences in degree of anthropogenic influence among Urban, Exurban, and Wildland sites based on variables that describe population density and flux, proximity to roads, current land cover and transitions in land use, canopy cover at our study sites based on LiDAR, measures of habitat fragmentation and heterogeneity, and sources of atmospheric pollution. For variables that were measured using a 1-km buffer radius, the explanation of differences is descriptive, as the presence of overlapping buffers at some sites prohibits the use of any statistical test that is based on an assumption of independent samples. For other variables that met the distributional assumptions of the test, Kruskal-Wallis tests were used to identify significant differences among Urban, Exurban, and Wildland sites; Wilcoxon rank sum tests were then used to assess pairwise differences among the three groups.
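For readers unfamiliar with the Kruskal-Wallis test, a minimal sketch of the statistic for three groups is given below. It is not the software used in this study. The p-value relies on the usual chi-square approximation, which for three groups (2 degrees of freedom) reduces to exp(-H/2); function names and example values are hypothetical.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis_three_groups(a, b, c):
    """Return (H, approximate p) for three independent samples."""
    groups = [list(a), list(b), list(c)]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    h, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * h - 3 * (n + 1)
    # Standard correction for tied observations.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction
    return h, math.exp(-h / 2)  # chi-square survival function, 2 df


# Illustrative values only, not data from the study.
print(kruskal_wallis_three_groups([1, 2, 3], [4, 5, 6], [7, 8, 9]))
```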

Finally, to test our initial assignments of each study site to one of the three categories (Urban, Exurban, Wildland), we used a classification tree and linear discriminant analysis (McGarigal et al. 2000). For both approaches, all anthropogenic and natural explanatory environmental variables were included as possible predictors of study site classification to determine whether study sites could reliably be assigned to the categories based on differences in any of the environmental variables. In both analyses, classification error was calculated as the percent of plots incorrectly assigned to a site category. For the discriminant analysis, error was further assessed using leave-one-out cross-validation (LOOCV), in which a single sample is omitted from the data set, a classification function is derived, and the omitted sample is classified (Allen 1971). The process is repeated sequentially for each sample, and the resulting misclassification rate for the omitted samples represents the LOOCV error. LOOCV provides unbiased estimates of the accuracy of the linear discriminant analysis when the number of explanatory variables is large relative to the number of samples (Molinaro et al. 2005).
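The leave-one-out procedure can be sketched as follows. To keep the example self-contained, a simple nearest-centroid classifier stands in for linear discriminant analysis; it is not the classifier used in this study, and all names and values are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Assign x to the class whose mean feature vector is closest.

    A stand-in classifier for illustration, not linear discriminant analysis.
    """
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [s / counts[label] for s in acc]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_error(xs, ys, predict=nearest_centroid_predict):
    """Percent of samples misclassified when each is held out in turn."""
    errors = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]  # omit sample i
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) != ys[i]:
            errors += 1
    return 100.0 * errors / len(xs)


# Illustrative plots with two made-up environmental variables each.
xs = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
      [5.0, 5.1], [5.2, 4.9], [4.9, 5.0],
      [10.0, 10.1], [10.2, 9.9], [9.9, 10.0]]
ys = ["Urban"] * 3 + ["Exurban"] * 3 + ["Wildland"] * 3
print(loocv_error(xs, ys))
```

Because the classifier is refit without the held-out plot at every step, each plot is always classified by a function that never saw it.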

Results

As expected, human population density was highest at Urban sites, lowest at Wildland sites, and intermediate at Exurban sites (Online Resource 1 Fig. 2a). Urban populations also were more volatile over the period from 2000 to 2010. Change in population density over the 10-year interval ranged from -195 to +82 people/km2 near Urban sites, as compared to Wildland and Exurban sites, where change in population density ranged from -1 to +1 people/km2 and -9 to +19 people/km2, respectively (Online Resource 1 Fig. 2b).

At all study sites, plots were < 1 km from the nearest road (Online Resource 1 Fig. 3a). Mean distance to the nearest road was lowest at Urban sites, significantly higher at Exurban sites (Wilcoxon rank sum test, W = 174.5, p < 0.001), and intermediate at Wildland sites. Plots in Wildland sites varied the most in their distances from roads. Patterns were similar when only major roads (primary and secondary highways) were considered (Online Resource 1 Fig. 3b). Urban sites had a much higher road density within a 1-km buffer radius than Exurban or Wildland sites, while Exurban sites had the lowest road density (Online Resource 1 Fig. 3c).

The distribution of land cover types varied among sites (Online Resource 1 Fig. 4). At Exurban and Wildland sites, forest cover was dominant within the 1-km buffer radius (70 and 71%, respectively), while at Urban sites, forest cover was less common (43%), and the mean area of residential and forest cover was approximately equal. Agricultural cover was absent near Urban sites, and highest near Wildland sites (16%). Residential cover was highest at Urban sites (45%), intermediate at Exurban sites (19%), and lowest at Wildland sites (12%). Urban cover was highest near Urban sites (12%), and nearly absent near Wildland sites (0.3%). Areal coverage of rivers and lakes was substantial only at Exurban sites, where an average of approximately 6% of land near plots was classified as “water”, reflecting the proximity of our Exurban plots at EF to East Fork Lake.

Transitions in land use were greatest in areas surrounding Wildland sites, where more than 25% of land experienced a change in land cover over the period from 1988 to 2011 (Online Resource 1 Fig. 5). Exurban sites also had relatively high levels of total land use change (24%), while Urban sites experienced the lowest levels (18%). However, when only transitions to a highly anthropogenically disturbed state (“urban”, “residential”, and “agricultural”) were considered, the degree of land use transition experienced at each site over the 23-year interval was approximately equal (14 to 15% at all sites). The difference for each site between total land use transitions and transitions to disturbed states reflects the relative importance of reversions from agricultural or residential land to forest vegetation. Reversion to forest was relatively high near Wildland and Exurban sites, where it accounted for a high proportion of total land-use change (45% and 43% of total land use change at Wildland and Exurban sites, respectively). Land-use transitions from forest, agricultural, or residential cover to urban land cover were greatest near Urban sites (4.4%), and much lower near Exurban (0.7%) and Wildland sites (0.3%).

The area-weighted mean size of forest patches within a 1-km buffer radius of each plot was lowest at Urban sites, and 2-3 times higher at Exurban and Wildland sites (Online Resource 1 Fig. 6a). Variance in the size of forest patches contained within a buffer also varied along the gradient: Urban sites had relatively low variability in forest patch size, Wildland sites had the highest variability, and Exurban sites had an intermediate level of variability (Online Resource 1 Fig. 6b). This reflects the absence of large forest patches in urban settings; the mean maximum forest patch size observed near Urban plots was 1.2 km2, compared to 2.1 and 2.2 km2 at Exurban and Wildland sites, respectively. The ratio of forest edge to forest area decreased from Urban to Exurban to Wildland sites (Online Resource 1 Fig. 6c), indicating that core forest area is largest at Wildland sites, and smallest at Urban sites. Total edge density was highest at Urban sites (19.1 km/km2) and lower at Exurban (16.6 km/km2) and Wildland sites (16.4 km/km2), indicating a higher degree of landscape heterogeneity and fragmentation in urban settings.

Atmospheric wet deposition of NH4+, NO3-, SO42-, and TN generally increased along a west-east transect, although differences among study sites were not statistically significant. For all measures, wet atmospheric deposition was lower at the Urban sites and at the western Exurban site (MWW), and higher at the Wildland sites and the eastern Exurban site (EF) (Online Resource 1 Fig. 7). Variation along the gradient was strongest for SO42-, while variation among sites in NH4+, NO3-, and TN was relatively low. Because all measures of wet deposition were highly correlated (R2 = 0.75 to 0.97), a principal components analysis was used to reduce the four variables to one composite PCA axis, which summarized 98% of the variation in the original dataset. This composite atmospheric deposition variable was used in all subsequent analyses described in the main text.
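The idea of collapsing several correlated variables onto one principal axis can be sketched as below. This is an illustration under stated assumptions, not the analysis performed in this study: variables are standardized, the correlation matrix is formed, and its leading eigenvalue is found by power iteration, which assumes the variables are positively correlated and that none is constant. Function names and values are hypothetical.

```python
import math


def standardize(column):
    """Center a variable and scale it to unit sample standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]


def first_pc_variance_fraction(columns, iterations=500):
    """Fraction of total variance captured by the first principal component.

    columns: one list of site values per variable (hypothetical input).
    """
    z = [standardize(c) for c in columns]
    n, p = len(z[0]), len(z)
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1)
             for j in range(p)] for i in range(p)]
    # Slightly uneven start vector for the power iteration.
    vec = [1.0 + 0.1 * i for i in range(p)]
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    # Rayleigh quotient of the unit vector gives the leading eigenvalue.
    eigenvalue = sum(vec[i] * sum(corr[i][j] * vec[j] for j in range(p))
                     for i in range(p))
    return eigenvalue / p  # total variance of p standardized variables is p


# Illustrative values only, not data from the study.
deposition = [[1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 8.0], [1.4, 2.6, 3.3, 4.6]]
print(first_pc_variance_fraction(deposition))
```

When the variables are nearly collinear, as reported for the four deposition measures, this fraction approaches 1 and little information is lost by using a single axis.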