Investigating Variance in a Data Set: Hopkins vs. Point Lobos

M.H. Schmitt

Long Marine Laboratory, 100 Shaffer Rd, University of California, Santa Cruz, CA 95060

Abstract:

Understanding the different types of variability is crucial to understanding and collecting quality data. To gain a better understanding of sampling issues, we compared reef assemblages between two sites (Point Lobos and Hopkins Marine Life Refuge) over two days. We hypothesize that there will be a significant difference between the two sites in overall species assemblages but that certain taxa may respond to variation in conditions, yielding an interaction effect between site and sampling day. We found that there is a statistically significant difference in overall species assemblages between the sites (PERMANOVA: site effect, P=.001). We found a strong effect of site on the algal assemblages at Hopkins and Point Lobos (PERMANOVA: site effect, P=.001). There is a strong interaction effect between site and day for the composition of fish assemblages for Hopkins and Point Lobos (PERMANOVA: interaction effect, P=.015). We also found that invertebrate assemblages show differences among the sites (PERMANOVA: site effect, P=.001).

Introduction:

Variability can be either positive or negative. If one is testing to see if two samples are the same, variation would be considered positive since this would indicate that the samples are actually different (Levine and Stephan 51-52). However, if the variability comes from a known or unexplained source that interferes with one’s ability to test a hypothesis, then it is considered to be negative.

Understanding the different types of variability is crucial to understanding and collecting quality data. Variance can come from many known sources such as differences in site/location, the period of time (season, time of day) during which data were collected, and the conditions under which the data were collected. When sampling for a project, it is important to take all possible sources of variation into account to try to identify and limit the ones that hinder the study.

If sampling is conducted over a short period of time, it is safe to assume a population is not changing. Factors that influence population change, such as birth, death, immigration, and emigration, are typically seen only over longer time frames rather than over a matter of days. However, if sampling occurs over long periods of time (months to years), one has to account for the potential population changes. Theoretically, an observer should get the same results if sampling a population over a short period of time under uniform conditions.

Sampling in subtidal systems, such as kelp forests, can be very challenging due to the nature of the system. There are a multitude of factors that can cause unwanted variation in a data set such as weather and light conditions as well as water clarity. It is most likely that not all sampling will be conducted in one day, so it is also apparent that “day” might be a source of variation.

To gain a better understanding of sampling issues, we compared reef assemblages between two sites (Point Lobos and Hopkins Marine Life Refuge) over two days. We formulated a series of hypotheses to test for the different sources of variance (site and day). We hypothesize that: 1) there is a difference in species composition between Hopkins and Point Lobos; 2) the difference in species composition between Hopkins and Point Lobos varies by taxa; 3) overall species composition does not change between days; 4) the difference in species composition between days varies by taxa; 5) both site and sampling day affect species composition (interaction effect); 6) the interaction between site and sampling day varies by taxa.

We expect to see a significant difference between the two sites in overall species assemblages but that certain taxa may respond to variation in conditions yielding an interaction effect between site and sampling day.

Methods:

Study sites

To test for variance in reef assemblages, two sites were sampled over a two-day period (October 11th and 13th, 2011). Stanford’s Hopkins Marine Life Refuge (36°36'N, 121°54'W), located in Pacific Grove on the southern side of Monterey Bay in California, USA, is a kelp forest growing on granitic reefs primarily dominated by Macrocystis pyrifera (McLean 1962, North 1971, Gerard 1976, Watanabe 1984). In shallow areas, large slabs of granite form high-relief outcrops, while in deeper areas, shell-fragment rubble covers granite rock reef (Gerard 1976, Watanabe 1984). This reef hosts a variety of species from invertebrates to mammals. This site is one of the most sheltered reefs in central California, giving better protection to the kelp forest and its inhabitants as well as allowing divers to access the reef more easily and often than other areas.

Point Lobos National Monument comprises both terrestrial and marine habitats. We used Whaler’s Middle Reef within Whaler’s Cove as our comparative study site (36°31′1.56″N 121°56′33.36″W). From observations, this site is much more exposed to swells and has much higher subtidal relief than Hopkins. We also observed that the predominant species of kelp at Point Lobos are Macrocystis pyrifera, Eisenia arborea, and Pterygophora californica. To access our dive site, we boated to our location and set up our transects.

Survey methods

Comparing algal communities

To test for the three sources of variance (site, sampling day, or interaction effect) in algae composition, we followed a predetermined sampling pattern. The sampling method was uniform for both Hopkins and Point Lobos, except for the actual position of the initial transect and the direction of sampling transects. At Hopkins, there is a permanent cable marked in 10 meter increments, allowing each buddy pair to find the start of their assigned line transect. Shallow-deep transects were spaced 10 meters apart along the main north-south cable. A total of 14 divers comprised 7 buddy pairs (one-half of the total observers) and surveyed the same transects each day. Each buddy pair was responsible for one set of shallow-deep transects; all transects were 30 meters long and at depths between 34 and 40 feet. Each transect was counted as a replicate.

At Point Lobos, however, there is no permanent benthic cable. Instead, a transect line was laid out, and the observers used it as the starting point for their 30 meter transects. Instead of surveying a “deep” and “shallow” side of the transect line, as was done at Hopkins, observers surveyed two transects in the same direction (there is little depth variation within the area sampled, so it is easiest if transects are done in a uniform direction).

We started from our given mark on the cable (Hopkins) or initial transect line (Point Lobos) and laid out our meter tapes at a 90 degree angle from the main cable/transect for a distance of 30 meters. As we progressed along our transect line, each person in the buddy pair collected data in 10 meter increments. The 30 meter transect was divided into three sections for ease of sampling. A predetermined list of algae was sampled (see table below). Each buddy was responsible for one meter on either side of the meter tape (left or right). We repeated this same process as we completed our shallow (Hopkins) or second (Point Lobos) transect. Observers who sampled Hopkins the first day sampled Point Lobos the second sampling day and vice versa.

Once the dives were completed, we compiled our data as a class and then ran the appropriate statistical analyses. To test for differences in algal communities between Hopkins and Point Lobos we used a PERMANOVA. We used this same analysis to test for differences between algal assemblages over each sampling day. This test reveals whether algal species compositions are sensitive to site, differences in conditions (day) or if there is a combination of effects (interaction effect). The relative importance of individual species was also examined for those effects that were significant (such as site).
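The logic of a PERMANOVA can be illustrated with a minimal sketch. The version below is a simplified one-way test (site only), not the two-factor site-by-day analysis used in this study. It assumes Bray-Curtis dissimilarity, which the text does not specify, and the counts are hypothetical values chosen for illustration rather than data from our surveys. The function names are likewise illustrative.

```python
import random


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors (assumed metric)."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0


def pseudo_f(dist, groups):
    """Pseudo-F statistic for a one-way design, computed from pairwise distances."""
    n = len(groups)
    labels = sorted(set(groups))
    # Total sum of squares: squared distances over all pairs, divided by n
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares: same calculation inside each group
    ss_within = 0.0
    for g in labels:
        idx = [i for i in range(n) if groups[i] == g]
        pairs = sum(dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:])
        ss_within += pairs / len(idx)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(samples, groups, n_perm=999, seed=0):
    """Return (pseudo-F, permutation P-value) for a one-way grouping."""
    n = len(samples)
    dist = [[bray_curtis(samples[i], samples[j]) for j in range(n)] for i in range(n)]
    observed = pseudo_f(dist, groups)
    rng = random.Random(seed)
    shuffled = list(groups)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign site labels to transects at random
        if pseudo_f(dist, shuffled) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_perm + 1)


# Hypothetical counts of three species on eight transects (not study data)
samples = [
    [10, 2, 0], [12, 1, 0], [9, 3, 1], [11, 2, 0],  # "Hopkins"
    [1, 8, 6], [0, 9, 7], [2, 7, 5], [1, 10, 6],    # "Point Lobos"
]
sites = ["Hopkins"] * 4 + ["Point Lobos"] * 4
f_stat, p_value = permanova(samples, sites)
print(f"pseudo-F = {f_stat:.2f}, P = {p_value:.3f}")
```

With 999 permutations the smallest attainable P-value is 0.001, which matches the form in which the site effects are reported here.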

In addition, multidimensional scaling (MDS) plots were created using each transect as a replicate. These plots show whether algal species assemblages differed among the sites or sampling days. Mean abundance was calculated by summing all values for each species and dividing by the number of replicates. This permits us to compare site variance and relative species abundance at each site over both sampling days, which can provide information about which species might be the best and the poorest candidates for sampling in a range of conditions.
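The mean abundance calculation described above can be sketched in a few lines. The counts below are hypothetical and only illustrate the arithmetic; a species not recorded on a transect is assumed to contribute zero to the sum.

```python
def mean_abundance(counts_by_transect):
    """Sum each species' counts and divide by the number of replicates."""
    n_replicates = len(counts_by_transect)
    totals = {}
    for transect in counts_by_transect:
        for species, count in transect.items():
            totals[species] = totals.get(species, 0) + count
    return {species: total / n_replicates for species, total in totals.items()}


# Hypothetical counts for three transects (not data from this study)
transects = [
    {"Macrocystis pyrifera": 4, "Cystoseira osmundacea": 10},
    {"Macrocystis pyrifera": 6, "Cystoseira osmundacea": 7},
    {"Macrocystis pyrifera": 5},  # Cystoseira absent here, so it counts as zero
]
means = mean_abundance(transects)
for species, value in means.items():
    print(f"{species}: {value:.2f} per transect")
```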

Comparing invertebrate communities

The same sampling regime was adopted to compare invertebrate communities between Hopkins and Point Lobos. All the data collected were compiled and analyzed in the same manner as for algae. To test for differences in invertebrate assemblages between Hopkins and Point Lobos, we used a PERMANOVA. To test for differences in invertebrate compositions over the two sampling days we also used a PERMANOVA. A PERMANOVA was also used to determine if an interaction effect was present for invertebrate communities.

Comparing fish communities

The same prescribed sampling methods were used for fish surveys, except the observer sampled across a volume of water instead of a meter swath to either side of the transect line as done for the invertebrates and algae. The observer sampled one meter out from the transect, two meters ahead and one meter above the bottom, forming a two meter squared sampling area on either side of the transect.

To test for differences in fish assemblages between Hopkins and Point Lobos, we used a PERMANOVA. To test for differences in fish compositions over the two sampling days we also used a PERMANOVA. A PERMANOVA was also used to determine if an interaction effect was present for fish communities.

Results:

To compare reef assemblages, we collected data at both Hopkins and Point Lobos at depths between 34 and 40 feet, with a median depth at both sites around 36.5 feet (Figure 1). The depths of sampling did not differ between the two sites. We found that there is a statistically significant difference in overall species assemblages between the sites (PERMANOVA: site effect, P=.001). Multidimensional scaling (MDS) analyses were performed to produce a spatial model to interpret the data collected. For these figures, each point represents a transect (replicate), displaying how closely related each replicate is. For all species combined, the markers revealed a separation by site (Point Lobos vs. Hopkins) but not by sampling day (indicated by a “1” or a “2”) (Figure 2).

Algae

We found a strong effect of site on the algal assemblages at Hopkins and Point Lobos (PERMANOVA: site effect, P=.001). We also found that variance stemmed only from site (Figure 9). Algal abundance is compared over both days and sites in the multidimensional scaling (MDS) analysis, which depicts a clear spatial separation indicating a significant difference in algal communities between Hopkins and Point Lobos (Figure 3). Relative abundances of the specific species counted, across both days for both sites, reveal this same pattern (Figure 4).

It is notable that there is not much variation across sampling days but there are large differences in which species form the algal communities of the two sites (Figure 4). Several species of algae, such as Chondracanthus corymbiferus and Cystoseira osmundacea, are far more abundant at Hopkins but very rare at Point Lobos. The most abundant species at Point Lobos were Pterygophora californica and Eisenia arborea, which were almost absent at Hopkins (Figure 4).

Fishes

There is a strong interaction effect between site and day for the composition of fish assemblages for Hopkins and Point Lobos (PERMANOVA: interaction effect, P=.015). There were two sources of variance found: both site and day (Figure 9). From the multidimensional scaling (MDS) analysis, it is clear that there is no discernible pattern either for site or day in the abundance of fishes (Figure 5). Hopkins and Point Lobos fish assemblages show no pattern across the sampling days even within each site (Figure 5, Figure 6). The relative species abundance also shows that even within a site, certain species that were abundant the first sampling day nearly disappeared the next (Sebastes atrovirens at Point Lobos [0.35 fish/transect to 0.05 fish/transect]), while another species at the same site that was nearly absent the first day was the most abundant species the second sampling day (Embiotoca lateralis at Point Lobos [0.3 fish/transect to 0.2 fish/transect]) (Figure 6). Similar discrepancies can be seen for the data collected at Hopkins. The first sampling day showed a low abundance of Sebastes mystinus (0.1 fish/transect), but on the second sampling day it was the most abundant species (0.3 fish/transect) (Figure 6). Sebastes atrovirens, the most abundant species at Hopkins on sampling day one, was only half as abundant on day two (0.4 fish/transect to 0.2 fish/transect) (Figure 6).

Invertebrates

We also found that invertebrate assemblages show differences among the sites (PERMANOVA: site effect, P=.001). Multidimensional scaling (MDS) analysis reveals that replicates mostly cluster by site in the two-dimensional figure, indicating there is a significant site pattern (Figure 7). We also found that site was the only source of variance for invertebrate composition (Figure 9). Relative abundances are nearly a perfect match across both sampling days for each site, showing that there is little variance between sampling days (Figure 8). Hopkins had a much higher abundance of Balanophyllia elegans and Calliostoma ligatum than Point Lobos did, forming Hopkins’ unique invertebrate community.

Figure 1: Depths at which data were collected.

Figure 2: Total species composition for Hopkins and Lobos. Note the separation of the markers by site.

Figure 3: Algal composition for Hopkins and Lobos. Note the separation of the markers by site.

Figure 4: Relative abundance of algae at each site on each day.

Figure 5: Fish composition for Hopkins and Lobos. Note there is no discernible pattern.

Figure 6: Relative abundance of fish species at each site on each day.

Figure 7: Invertebrate composition for Hopkins and Lobos. Note there is a pattern by site.

Figure 8: Relative abundance of invertebrate species at each site on each day.

Figure 9: Sources of variance by taxa.

Figure 10: Estimate for appropriate sample size.

Figure 11: Estimate of appropriate sample size by each species of algae.

Figure 12: Estimate of appropriate sample size by each species of fish.

Figure 13: Estimate of sample size by each species of invertebrate.

Discussion:

Overall species compositions

Variance can stem from a variety of sources, some wanted and controllable, and others not. After inspecting the reef assemblage data collected at Point Lobos and Hopkins, it is apparent that for overall species composition there is a significant site effect, meaning that species assemblages are dependent on location. We can conclude that there is a difference in species compositions between Hopkins and Point Lobos and that the composition did not change overall over the two sampling days (Figure 2). This lack of variance between sampling days suggests that conditions (such as visibility or swell) between sampling days do not play a large role when assessing overall species compositions. We did not detect an interaction effect either, which also supports the notion that there does not need to be a strict rubric about what conditions are required for sampling of overall species compositions.

Theoretically there should be no change in populations of reef species of algae, fish and invertebrates when sampling over short periods of time because the factors that influence population size (birth, death, immigration, emigration) occur over longer time scales. The results we found for differences in overall species composition by site but not day support this.

Taxa specific results

Algae

From our results, we can conclude that overall algal assemblages are different between Hopkins and Point Lobos. Our data also show that there is no change in species composition over the two sampling days for either site, indicating that algae can be sampled reliably under different conditions. Since there was no day effect, there is no interaction effect.

We calculated a power index to determine whether we collected sufficient data to support our conclusions, and it revealed that we had indeed reached a minimum sample size. We calculated that at least 22 replicates (transects) are needed to get an adequate estimate of differences in algal compositions between Hopkins and Point Lobos (Figure 10), and our data set exceeds that (34 transects). However, a power analysis for each species (Figure 11) indicated that only a few species were actually satisfactorily sampled. These included Macrocystis pyrifera, Pterygophora californica and Cystoseira osmundacea. To decrease unwanted unexplained variance, further sampling would be needed for the remaining species.
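As a rough illustration of how a minimum number of replicates can be estimated for a single species, the sketch below uses a simple precision-based rule: choose n so that the standard error (s/√n) is no larger than a chosen fraction of the mean. This is not the power index used for Figures 10 and 11, whose calculation is not described here; the pilot counts and the 20% precision target are assumptions made only for the example.

```python
import math
import statistics


def replicates_needed(counts, precision=0.2):
    """Smallest n for which s / sqrt(n) <= precision * mean.

    Rearranging gives n = (s / (precision * mean)) ** 2, rounded up.
    """
    mean = statistics.mean(counts)
    sd = statistics.stdev(counts)  # sample standard deviation
    return math.ceil((sd / (precision * mean)) ** 2)


# Hypothetical pilot counts of one species on eight transects (not study data)
pilot_counts = [4, 6, 5, 9, 2, 7, 3, 4]
n_needed = replicates_needed(pilot_counts)
print(f"Replicates needed for 20% precision: {n_needed}")
```

A patchy species with a high standard deviation relative to its mean would require many more transects under this rule, which is consistent with only a few species being satisfactorily sampled.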

Fishes

From our results, we conclude that there is a strong interaction effect, meaning that both site and day affect the species compositions of each site. In particular, day has different effects depending on the site. We expected to see a difference in compositions between Hopkins and Point Lobos, but this relationship was complicated by a change in conditions.