Review of a proposed monitoring plan for a Toba Montrose Hydro project.

Carl James Schwarz, P.Stat.

Department of Statistics and Actuarial Science

Simon Fraser University

8888 University Drive

Burnaby BC Canada

2010-03-24

DRAFT 1.

1. Introduction

Toba Montrose Hydro, Inc. is proposing an independent power production facility in the Montrose Creek and East Toba River watersheds. As part of the license agreement, a monitoring study must be implemented, with monitoring taking place pre- and post-startup. One of the response variables to be monitored is the density of aquatic invertebrates.

To date, sampling has taken place in fall 2008, spring 2009, fall 2009, and spring 2010. All dates are pre-startup with startup proposed for summer 2010.

One of the standard designs for monitoring environmental impacts is the paired-BACI (before-after, control-impact) design, in which both the control and project streams are monitored before and after impact. The pairing indicates that sampling occurs at the same time on both streams to account for “seasonal” effects that simultaneously affect the response variable in both streams. It is necessary to monitor the control stream to account for changes in the response variable that occur in the absence of an environmental impact. Evidence of an environmental impact is declared if the change in the response variable between pre- and post-project is different in the project stream than in the control stream.

A preliminary power analysis for the two projects showed that detecting a 50% change in the mean response variable (density of invertebrates) with 80% power would require substantial monitoring (15 years!) prior to the proposed start of the project in 2010.[1] In response, an alternate monitoring methodology has been proposed in the same document.

In this proposed monitoring scheme, the control and project streams will be monitored with 5 samples taken each year in both streams in a paired fashion. Then regression analysis (either parametric or non-parametric) will be applied and a comparison of the trends over time between the project and control streams will be used to detect environmental impact.

Wiens and Parker (1995) discuss a variety of designs suitable for monitoring environmental impact when little (or no) pre-project data are available. Their “impact-level-by-time interaction” design is very similar to the proposal: monitor the project and control sites to see if the response over time follows a parallel trajectory.

2. Example of proposed methodology.

This section will provide a numerical example (using hypothetical data) to illustrate how the proposed monitoring program’s data will be analyzed. Suppose that there are 7 years of data available. In each year, five samples are taken in both the control and project streams. A plot of the data is illustrated in Figure 1.

Figure 1. Hypothetical data illustrating the proposed analysis. Each stream is measured at the same time in 7 years. In each stream in each year, five samples are taken to measure the density of invertebrates.

In Figure 1, the control stream shows evidence of a downward trend due to factors other than the project (e.g. climate change). The decline in mean density in the project stream is steeper than in the control stream. The difference in slopes would be evidence of an environmental impact of the project.

There are several sources of variation in the monitoring system. The lowest-level variation is measurement error. The density of invertebrates over the entirety of each stream is unknown and must be estimated in each year by taking the five samples. Measurement error can be reduced by increasing the number of samples taken each year.

Even if the density were known exactly, the true densities for each stream will NOT lie exactly on the underlying trend line. This is known generally as process error. No amount of sampling will eliminate process error. However, because the sampling in each year is paired (i.e. taken at the same time of year), some of the process error is common to both the control and project streams. For example, in year 4, the density of invertebrates has been depressed in both streams due to a common cause, while in year 5, the density of invertebrates has been increased by a common cause. But even after accounting for the common process error, additional process error still exists because the increase/decrease in years 4 and 5 is not exactly the same in both streams. This is illustrated in Figure 2.

Figure 2. Hypothetical data illustrating the three sources of variation. In year 4, the common process error depresses density in both streams. The stream-specific process error accounts for additional variation in the true density over and above the common process error. Finally, the five samples from each stream exhibit sampling error around the year-specific density for each stream.

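More formally, the model for this experiment can be expressed as:

$$Y_{pts} = \beta_{0p} + \beta_{1p}\,t + \tau_t + \delta_{pt} + \varepsilon_{pts}$$

$$Y_{cts} = \beta_{0c} + \beta_{1c}\,t + \tau_t + \delta_{ct} + \varepsilon_{cts}$$

where

$Y_{pts}$, $Y_{cts}$ are the observed densities in year t in sample s in the project (p) or control (c) stream;

$\beta_{0p}$, $\beta_{0c}$ are the intercepts of the underlying trend line for the project or control stream;

$\beta_{1p}$, $\beta_{1c}$ are the slopes of the underlying trend line for the project or control stream;

$\tau_t$ are the common process errors affecting both streams in year t;

$\delta_{pt}$, $\delta_{ct}$ are the stream-specific process errors in year t for the project or control streams;

$\varepsilon_{pts}$, $\varepsilon_{cts}$ are the sampling errors for sample s in year t for the project or control streams.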
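The structure of this model can be illustrated with a small simulation. The following is only a sketch: the function name, the intercepts, the slopes, and the three standard deviations are hypothetical values chosen for illustration (they are not taken from the report or from any field data), and normally distributed errors are assumed.

```python
import random

def simulate_streams(n_years=7, n_samples=5, seed=1,
                     intercept=(10.0, 10.0), slope=(-0.05, -0.17),
                     sd_common=0.3, sd_stream=0.1, sd_sample=0.2):
    """Return rows (stream, year, sample, density) for control ('c') and project ('p').

    All parameter values are hypothetical and for illustration only.
    """
    rng = random.Random(seed)
    rows = []
    for t in range(1, n_years + 1):
        # Common process error: one draw per year, shared by both streams.
        common = rng.gauss(0.0, sd_common)
        for k, stream in enumerate(("c", "p")):
            # Stream-specific process error: one draw per stream per year.
            specific = rng.gauss(0.0, sd_stream)
            true_density = intercept[k] + slope[k] * t + common + specific
            for s in range(1, n_samples + 1):
                # Sampling error: one draw per individual sample.
                y = true_density + rng.gauss(0.0, sd_sample)
                rows.append((stream, t, s, y))
    return rows

rows = simulate_streams()
print(len(rows))  # 7 years x 2 streams x 5 samples = 70 rows
```

Because the common error is drawn once per year and added to both streams, it shifts both streams up or down together, which is the feature that pairing exploits.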

Such experiments can be analyzed using a variant of Analysis of Covariance (ANCOVA) that accounts for the three levels of error.[2] This gives the following estimates of effects (using JMP software)

The line in the effect test table corresponding to the Year*Stream effect is the test for parallelism of the slopes. In this case, the very small p-value (.0007) indicates strong evidence that the slopes in the two streams are not parallel. The estimated difference in slopes is found using a contrast:

The estimated difference in slopes (found on the last line of the above output) is .12 (SE .016). The p-value to test if the difference in slopes is 0 is .0007 and is mathematically equivalent to testing the hypothesis of parallel slopes.

A simpler analysis can also be done (which eventually leads to the proposed test in the report) by first averaging over the five sub-samples in each year-stream combination. This reduces the dataset from 70 values (7 years x 2 streams x 5 samples) to 14 values (7 years x 2 streams). Again an ANCOVA type model can be applied to this reduced dataset[3] which leads to the following effect tests:

The results are identical to the previous table. There is no loss of information by averaging over the five samples – the individual sample values only provide information on variability within that stream on that year, but no information on the slope. The estimated difference in slopes (last line) is also identical:

The analysis can also be simplified further to match the proposed non-parametric procedure by taking the averaged values for each year, and computing the difference between the averaged values of each stream. This reduces the dataset to 7 values (7 years of differences). Notice that by taking the differences in the averages, the common process error will cancel and so the total unexplained variation is reduced. This is the key advantage of pairing sampling across streams, and the plot of the differences over time is:

A straight line can be fit to the differences over time. The test for parallelism of the original slopes is equivalent to testing if the slope of the line through the differences is zero:

The final line in the above table is the estimated difference in slopes (which matches the previous results) and the p-value is also identical to the previous results.
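The averaging and differencing steps can be sketched in a few lines of code. This is an illustrative ordinary least-squares fit to the yearly differences, not the JMP analysis used in this report; the function name and the yearly averages are hypothetical.

```python
from math import sqrt
from statistics import mean

def slope_of_differences(years, project_means, control_means):
    """Least-squares slope of (project - control) yearly averages over time.

    Returns (slope, standard error, residual degrees of freedom).
    """
    diffs = [p - c for p, c in zip(project_means, control_means)]
    n = len(years)
    tbar, dbar = mean(years), mean(diffs)
    sxx = sum((t - tbar) ** 2 for t in years)
    sxy = sum((t - tbar) * (d - dbar) for t, d in zip(years, diffs))
    slope = sxy / sxx
    intercept = dbar - slope * tbar
    resid_ss = sum((d - intercept - slope * t) ** 2 for t, d in zip(years, diffs))
    se = sqrt(resid_ss / (n - 2) / sxx)
    return slope, se, n - 2

# Hypothetical yearly averages (illustration only, not the report's data).
years = [1, 2, 3, 4, 5, 6, 7]
control = [9.9, 9.8, 9.9, 9.4, 10.1, 9.7, 9.6]
project = [9.8, 9.5, 9.5, 8.9, 9.5, 9.0, 8.7]
slope, se, df = slope_of_differences(years, project, control)
print(f"slope of differences = {slope:.3f}, SE = {se:.3f}, t = {slope / se:.2f} on {df} df")
```

The ratio slope/SE would be referred to a t-distribution with n - 2 degrees of freedom. Adding the same amount to both streams in a given year leaves the differences, and hence the fitted slope, unchanged, which is how the common process error drops out.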

A non-parametric test for trend is obtained using the Mann-Kendall procedure on the differences in averages over time (which is mathematically equivalent to the non-parametric estimator of correlation called Kendall’s tau; refer to Conover, 1999). The non-parametric test examines whether there is no concordance between the time variable and the difference (i.e. no general increase/decrease in the response over time):

There is again strong evidence that the slopes of the lines in the two streams are not parallel. The p-value for the Mann-Kendall (Kendall’s tau) test is based on large-sample approximations and may not be exact for small samples, but will be approximately correct. A non-parametric estimate of the slope (of the differences in averages over time) can be computed as described in Conover (1999, Section 5.5). For each pair of years (i and j), compute the “two-point” slope: $b_{ij} = (D_j - D_i)/(t_j - t_i)$, where $D_i$ is the difference in averages in year $t_i$. Order the set of all slopes obtained, and find the median of these “two-point” slopes. The resulting estimate matches the parametric estimate of slope, with a 95% confidence interval (refer to Conover, 1999) of (.07 to .17), which is slightly wider than the 95% confidence interval from the parametric procedures.
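The Mann-Kendall statistic and the median of the “two-point” slopes are simple enough to compute directly. The sketch below assumes there are no tied values and uses the usual large-sample normal approximation with a continuity correction; the function names are hypothetical, and the confidence interval procedure in Conover (1999) is not reproduced.

```python
from itertools import combinations
from math import erfc, sqrt
from statistics import median

def mann_kendall(values):
    """Return (S, Kendall's tau, two-sided approximate p-value); assumes no ties.

    `values` must be listed in time order.
    """
    n = len(values)
    # S = (number of concordant pairs) - (number of discordant pairs).
    s = sum((b > a) - (b < a) for a, b in combinations(values, 2))
    tau = s / (n * (n - 1) / 2)
    var_s = n * (n - 1) * (2 * n + 5) / 18
    z = 0.0 if s == 0 else (abs(s) - 1) / sqrt(var_s)  # continuity correction
    p_value = erfc(z / sqrt(2))  # two-sided normal tail probability
    return s, tau, p_value

def median_two_point_slope(times, values):
    """Median of all pairwise slopes (values[j] - values[i]) / (times[j] - times[i])."""
    slopes = [(values[j] - values[i]) / (times[j] - times[i])
              for i, j in combinations(range(len(times)), 2)]
    return median(slopes)

# Hypothetical yearly differences in averages (illustration only).
times = [1, 2, 3, 4, 5, 6, 7]
diffs = [-0.1, -0.3, -0.4, -0.5, -0.6, -0.7, -0.9]
print(mann_kendall(diffs))
print(median_two_point_slope(times, diffs))
```

With only 7 years there are 21 pairs of years, so even a perfectly monotone sequence cannot produce an arbitrarily small approximate p-value.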

4. General discussion of proposed approach.

It is not surprising that the non-parametric procedure gives similar results to the parametric procedure. Conover (1999) indicates that the asymptotic relative efficiency (ARE, the ratio of the sample sizes that the two methods require to achieve the same power) for evenly spaced data with normally distributed errors is .98 (i.e. for a given power, the parametric test requires 98% of the sample size that the non-parametric test requires). The ARE is never less than .95 for any distribution and can be greater than 1 (i.e. the non-parametric test can be more powerful in the case of unusual distributions).

Consequently, there really is little penalty in using the Mann-Kendall test in place of the ANCOVA models. However, the ANCOVA models can deal with situations where there are missing values in some years (e.g. the control stream may not be measured and so no difference can be computed), whereas the Mann-Kendall test must delete those years from the analysis. Similarly, if there are unpaired data (e.g. observations are taken in the spring on one stream and the fall on the other stream), the parametric ANCOVA can be suitably modified, but the Mann-Kendall procedure will have to drop the unpaired points. Potthoff (1974) discusses a non-parametric test of whether two regression lines are parallel, but this is not reviewed here. His method does not account for the common process error that could be present in the study (and which is “eliminated” by taking differences in the means), and so it will lose power to detect a difference in slopes because of the increased variability.

Both the parametric and non-parametric methods can be easily modified for unequally spaced sampling, e.g. twice a year in some years, or every second year.

If the number of samples in each year differs, then the method of averaging is only approximately correct, but unless the number of samples varies by orders of magnitude, the approximate analysis will be suitable.

The key determinants of the power of this type of experiment are the process and sampling errors. In many cases, sampling variance is much smaller than process variance, and increases in sampling rates (e.g. to 10 samples per year) may have little impact on the performance. If the pairing is successful, a substantial part of the total process error, the common process error, can be “eliminated” by taking differences in the averages.
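The effect of pairing and of the number of samples can be made explicit with a short variance calculation. This is a sketch under assumptions that are not stated in the proposal: the three error components are mutually independent, their variances (written here as $\sigma^2_{\text{common}}$, $\sigma^2_{\text{stream}}$ and $\sigma^2_{\text{sample}}$, notation introduced only for this illustration) are the same in both streams, and $n$ samples are taken per stream per year. Writing $\bar{Y}_{pt}$ and $\bar{Y}_{ct}$ for the yearly averages in the project and control streams:

```latex
\operatorname{Var}\left(\bar{Y}_{pt}\right)
  = \sigma^2_{\text{common}} + \sigma^2_{\text{stream}} + \frac{\sigma^2_{\text{sample}}}{n},
\qquad
\operatorname{Var}\left(\bar{Y}_{pt} - \bar{Y}_{ct}\right)
  = 2\,\sigma^2_{\text{stream}} + \frac{2\,\sigma^2_{\text{sample}}}{n}.
```

The common component does not appear in the variance of the difference, and increasing $n$ (e.g. from 5 to 10) shrinks only the last term, which matters little when the stream-specific process variance dominates.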

While the trend design implicitly assumes a linear trend, this really is not an impediment to its usage if the trend is non-linear. First, the parametric methods can be suitably modified to account for the non-linearity. Second, the non-parametric methods are more general in that they look for “concordance” between the response and time: generally speaking, as time increases, does the response also increase/decrease? Consequently, if the trend is curvilinear with increasing effects over time, the non-parametric methods will not be affected.

Both the non-parametric and parametric methods implicitly assume that the yearly responses are independent from year to year. Specifically, knowledge about the density in year t provides no information about density in year t+1. In the case of short-lived invertebrates this is more likely to be true than for long-lived mammals. For example, if the response variable were the density of bears, a depressed population (e.g. a severe winter kill, which manifests itself as a process error) would have a lingering effect for many years.

The paired-BACI approach assumes that the response to the project is a simple step change once the project starts, with no recovery or increase in impact. The trend design assumes that the impact effects are cumulative and tend to increase over time. Consequently, the preference for one design over the other depends to a large extent on the type of response expected. If the step response occurs, then, all else being equal, the paired-BACI design will be more powerful to detect changes than the trend design, and vice versa.

5. Power analysis of proposed design

To be done. Need the pre-impact data from the 4 sample times to estimate common process, stream specific process, and sampling error.
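Once estimates of the variance components are available, the power of the trend design could be approximated by simulation. The sketch below is only an outline of one possible approach, not the analysis that remains to be done: it simulates the yearly differences in averages directly (so the common process error has already cancelled), assumes independent normal errors, and applies the Mann-Kendall test with its large-sample p-value. The function names and all numerical inputs are hypothetical.

```python
import random
from itertools import combinations
from math import erfc, sqrt

def mk_p_value(values):
    """Two-sided Mann-Kendall p-value (normal approximation, no ties assumed)."""
    n = len(values)
    s = sum((b > a) - (b < a) for a, b in combinations(values, 2))
    if s == 0:
        return 1.0
    var_s = n * (n - 1) * (2 * n + 5) / 18
    return erfc((abs(s) - 1) / sqrt(var_s) / sqrt(2))

def simulated_power(slope_diff, sd_stream, sd_sample, n_years=7, n_samples=5,
                    alpha=0.05, n_sims=2000, seed=1):
    """Fraction of simulated studies in which the trend in differences is detected."""
    rng = random.Random(seed)
    # SD of a yearly difference in averages: stream-specific process error in
    # each stream plus sampling error of each stream's average.
    sd_diff = sqrt(2 * sd_stream ** 2 + 2 * sd_sample ** 2 / n_samples)
    rejections = 0
    for _ in range(n_sims):
        diffs = [slope_diff * t + rng.gauss(0.0, sd_diff)
                 for t in range(1, n_years + 1)]
        if mk_p_value(diffs) < alpha:
            rejections += 1
    return rejections / n_sims

# Hypothetical inputs; real values must come from the pre-impact data.
print(simulated_power(slope_diff=-0.12, sd_stream=0.1, sd_sample=0.2))
```

Setting slope_diff to 0 gives the simulated false-positive rate of the test, which is a useful check before interpreting the power for non-zero differences in slope.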

6. Specific comments about proposed methods applied to this project.

The project has 4 sample periods that are pre-impact, followed by points after impact. Consequently, using the entire sequence of points is not really valid because the shape of the curve is likely flat until the project starts and then decreases. These pre-impact data points should be “clustered” together, e.g. at years 2009.96, 2009.97, 2009.98, 2009.99, to serve as “intercept” (pre-project) points. Elaborate on this more after power analysis is done.

7. Conclusion.

To be done later.

References:

Conover, W.J. (1999). Practical non-parametric statistics. Wiley: New York.

Potthoff, R. F. (1974). A non-parametric test of whether two simple regression lines are parallel. Annals of Statistics, 2, 295-310.

Wiens, J. A. and Parker, K. R. (1995). Analyzing the effects of accidental environmental impacts: approaches and assumptions. Ecological Applications, 5, 1069-1083.


[1] Refer to letter Table of Commitments Condition 4 Water Licence Condition m – Aquatic Effects Monitoring dated 2010-03-10, File VA10-00252

[2] The formal model using standard statistical notation is:

Density = Year Stream Year*Stream YearC&R Stream*YearC&R

where YearC&R is a categorical variable for the random effect of year (rather than the continuous variable Year) which represents common process error, and Stream*YearC&R is the stream specific process error in each year. The sampling variation is implicit and not specified directly.

[3] The formal model using standard statistical notation is:

AvgDensity = Year Stream Year*Stream YearC&R

where YearC&R is a categorical variable for the random effect of year (rather than the continuous variable Year) which represents common process error. The sampling error and stream-specific process error are combined in the implicit error term.