Analysis of preliminary creel survey and
recommendations for a creel survey of
Kootenay Lake, British Columbia.
Prepared for: Fish & Wildlife Compensation Program – Columbia Basin,*
103-333 Victoria Street,
Nelson, B.C. V1L 4K3
by
Carl James Schwarz
Department of Statistics and Actuarial Science
Simon Fraser University
Burnaby, BC, V5A 1S6
2010-06-09
* The Fish & Wildlife Compensation Program (FWCP) was established in 1995 to offset the impacts resulting from construction of BC Hydro dams in the Columbia Basin, and works to deliver a wide range of conservation and enhancement projects for fish and wildlife, on behalf of its program partners BC Hydro, the BC Ministry of Environment (MOE) and the Department of Fisheries and Oceans Canada.
Executive Summary
A survey of anglers was conducted on Kootenay Lake from October 2009 through March 2010 to provide data for designing a one year creel survey, and a preliminary estimate of angler effort and harvest. Eleven days were sampled at five access locations with overflight boat counts occurring on seven of the sampled days.
The expansion factor to adjust access interview data to total effort was about 1.4 and relatively stable over the sampled period.
It is recommended that a model-assisted approach with multiple imputation be used to deal with missing overflight information. This method is very flexible and provides estimates that properly account for the uncertainty in the missing data.
Estimated angler effort during the October to March survey was 6870 (SE 984) angler days and 38,403 (SE 4850) rod hours. Bull trout and rainbow trout harvest were estimated as 1031 (SE 288) and 1016 (SE 303), respectively. Daily catch per unit effort (pooled over all sites) ranged from zero to 0.064 for bull trout and zero to 0.149 for rainbow trout. Average length and weight (range) of sampled bull trout was 60 cm (46 – 82) and 2.9 kg (1.0 – 7.0), and for rainbow trout 54 cm (34 – 78) and 2.6 kg (0.4 – 7.4). The log(weight) vs. log(length) relationship cannot be distinguished between the two species.
A review of the proposed design for the year round creel survey suggested that effort be approximately split between weekend and weekdays, but that effort be shifted from sampling in the winter months to the summer months. Based on the stability of expansion factor during the off-peak season seen in the preliminary creel survey, the number of overflights can be reduced by about 50% during the off-peak months.
A simulator was used to investigate different designs for a one year survey using the feasibility data and other information to approximate the expected precision. The proposed design (and modifications) should provide estimates with relative standard errors (standard error/estimate) at the yearly level of 10% or less, which would give 95% confidence intervals at the yearly level of ±20% or less of the estimates.
1. Introduction
This report analyzes the preliminary creel survey conducted in fall 2009 and winter 2010 on Kootenay Lake, British Columbia and then uses the information from the preliminary survey to suggest a design of a year long creel survey of the same lake.
The preliminary survey was conducted from October 2009 through March 2010. Briefly, the creel was sampled at 5 access points (Kaslo, Woodbury, Balfour, Queens Bay East, and Boswell) on 11 days during this period, as shown in Table 1. [On 2 of the days, sampling was not done at Kaslo.] During the sampled days, all angling parties returning to the access point were interviewed to determine the number and species of fish kept and released, the start and end time of the angling trip, and other variables. On 7 of the sampled days, an aerial survey was also conducted at approximately noon that counted the number of active boats on the lake. The number was counted once as the airplane flew out and again on the return flight (Table 1). Additional details are provided in Anonymous (2010).
Table 1: Summary of interviews conducted and air counts of active boats from the preliminary survey.
Date / 24 Oct 2009 / 03 Nov 2009 / 21 Nov 2009 / 20 Dec 2009 / 09 Jan 2010 / 29 Jan 2010 / 16 Feb 2010 / 21 Feb 2010 / 04 Mar 2010 / 14 Mar 2010 / 20 Mar 2010
Day type¹ / WE / WD / WE / WE / WE / WD / WD / WE / WD / WE / WE
Kaslo / 7 / * / 2 / 0 / 0 / 0 / 3 / 5 / * / 3 / 7
Woodbury / 8 / 2 / 2 / 2 / 1 / 0 / 1 / 3 / 2 / 5 / 3
Balfour / 15 / 5 / 6 / 4 / 4 / 0 / 3 / 11 / 5 / 14 / 10
Queens Bay East / 8 / 2 / 1 / 1 / 2 / 2 / 1 / 3 / 3 / 4 / 2
Boswell/Kuskanook / 4 / 6 / 2 / 3 / 1 / 2 / 1 / 7 / 4 / 6 / 12
Total / 42 / 15 / 13 / 10 / 8 / 4 / 9 / 29 / 14 / 32 / 34
Air counts Flight 1 / 41 / - / 20 / - / 9 / - / - / 27 / 12 / 36 / 32
Air counts Flight 2 / 38 / - / 14 / - / 9 / - / - / 33 / 15 / 36 / 37
Active Flight 1² / 33 / - / 9 / - / 7 / - / - / 20 / 11 / 22 / 28
Active Flight 2 / 33 / - / 9 / - / 7 / - / - / 22 / 10 / 25 / 28
* Interviews not conducted.
- No overflight on this date.
¹ WE = Weekend, WD = Weekday.
² Number of access interviews where the start/end time of the boating trip overlapped the start/end time of the overflight.
All analyses in this report were done using SAS 9.2. Programs and results are available at:
Length and weight measurements for sampled fish are appended to this report.
2. Analysis of Preliminary Survey
2.1 Estimates of catch and related variables.
The preliminary survey is an example of an aerial-access survey as discussed by Pollock et al. (1994) with non-randomized flight times. The interviews at the access points provide only partial information about the total catch on the sampled day because other fishing parties may use access points where creel clerks were not stationed. The aerial counts provide information on the fraction of the total effort that was sampled at the access points. As outlined in Dauk and Schwarz (2001), the number of active boats counted by the flight is compared to the number of interviewed parties that were active on the water during the flight time (determined by examining whether the flight time occurred between the start and end of the trip). For example, if 40 angling parties were interviewed on a date, of which 30 were active during the overflight, and if the overflight counted 60 active parties, then it is estimated that only 30/60 = 0.50 of the total effort was interviewed, and so the catch based on the 40 parties must be inflated by a factor of 2 (1/0.50).
Unfortunately, estimates at the monthly level will be difficult to obtain for the feasibility data. Due to funding limitations, no sampling on weekdays was done in October 2009 and December 2009, which implies that estimates for weekdays for these months will have to be based on, perhaps, the relationship between weekends and weekdays from other similar months. As well, no replicate days of the same daytype were sampled in any month except March 2010, and even then only the weekend daytype was sampled twice. Consequently, there is very little information on the variability among days of the same daytype within a month. One possible analysis would compute the variability among days of the same daytype over a longer period of time (e.g. seasons, see below) and then use this as an estimate of the variation within a month. For this reason, initial estimates will be provided at a larger unit of analysis than the month level.
The analysis of the preliminary survey first stratifies the six-month survey into two strata. The shoulder season is defined as October, February, and March. The winter season is defined as November, December and January. Within each of these strata, days of the week were subdivided into two daytypes: weekdays (Monday to Friday) and weekends (Saturday and Sunday). Holiday Mondays and other statutory holidays were also defined as “weekend” days. This stratification ensured that at least 2 days of each daytype were sampled in each season and provides a design-based estimate of variance based on replicate days of each daytype. This avoids having to make additional assumptions about the variation across days of the same daytype in a season when only one day of the daytype is sampled. It is assumed that days within each season-daytype stratum were selected at random.
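The stratification rule described above can be sketched in a few lines of Python. This is only an illustration: the function names `classify` and `stratum_sizes` and the `holidays` argument are hypothetical, and the report does not list which statutory holidays were treated as weekend days, so the holiday set is left for the user to supply.

```python
from collections import Counter
from datetime import date, timedelta

# Season definitions taken from the text (calendar month numbers).
SHOULDER_MONTHS = {10, 2, 3}   # October, February, March
WINTER_MONTHS = {11, 12, 1}    # November, December, January


def classify(day, holidays=frozenset()):
    """Return (season, daytype) for a date in the October-March survey period.

    `holidays` is a hypothetical set of statutory-holiday dates that are
    treated as weekend days; the report does not list them.
    """
    if day.month in SHOULDER_MONTHS:
        season = "Shoulder"
    elif day.month in WINTER_MONTHS:
        season = "Winter"
    else:
        raise ValueError(f"{day} is outside the October-March survey period")
    # date.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    daytype = "WE" if day.weekday() >= 5 or day in holidays else "WD"
    return season, daytype


def stratum_sizes(first, last, holidays=frozenset()):
    """Count the days in each (season, daytype) stratum from first to last."""
    counts = Counter()
    day = first
    while day <= last:
        counts[classify(day, holidays)] += 1
        day += timedelta(days=1)
    return counts


print(classify(date(2009, 10, 24)))   # a sampled Saturday in October
print(classify(date(2010, 1, 29)))    # a sampled Friday in January
sizes = stratum_sizes(date(2009, 10, 1), date(2010, 3, 31))
print(dict(sizes))
```

The counts returned by `stratum_sizes` are the stratum sizes needed when a daily average is expanded to a season-daytype total; they change if holidays are supplied.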
Normally in aerial-access surveys, aerial flights are conducted on all sample days; in this preliminary survey, only 7 of the 11 sampled days had flights conducted. It will be assumed that the days selected for aerial flights were a random sample of days selected for sampling.
The usual assumptions are made about data being collected properly (both at the access points and during the aerial surveys).
Estimates are formed as a simple expansion of the average response for a daytype within a season by the number of days of that daytype within that season. The standard error at this first step is based on that for estimating a total from a simple random sample, as outlined in many books on sampling and demonstrated by Pollock et al. (1994).
Let $Y_{std}$ be the response variable (e.g. total catch of a species at all access points) in season s, day-type t, and date d when summed over all interviews i for that day. Ordinarily, the expansion of the response variable using the aerial count would be done on each day; but because not every day had an overflight, this expansion will be left to the later steps.
Step 1. Compute the average response and standard deviation of the response over multiple days of each daytype in each season:
$n_{st}$ / Number of days sampled of day-type t in season s.
$\bar{Y}_{st} = \frac{1}{n_{st}} \sum_{d} Y_{std}$ / Average response for each season x day-type combination, where $n_{st}$ is the number of sampled days of that day-type within the season.
$s_{st} = \sqrt{\frac{1}{n_{st}-1} \sum_{d} \left( Y_{std} - \bar{Y}_{st} \right)^2}$ / Standard deviation of response over the days for each season x day-type combination.
Step 2. Expand the average to estimate the total for the season-day-type combination.
$N_{st}$ / Total number of days of day-type t in season s.
$\hat{T}_{st} = N_{st} \bar{Y}_{st}$ / Estimated total response for season s and day-type t.
$SE(\hat{T}_{st}) = N_{st} \frac{s_{st}}{\sqrt{n_{st}}} \sqrt{1 - \frac{n_{st}}{N_{st}}}$ / Estimated standard error for the total response for season s and day-type t.
Step 3. Combine totals over daytypes within a season.
$\hat{T}_{s} = \sum_{t} \hat{T}_{st}$ / Estimated total response for season s.
$SE(\hat{T}_{s}) = \sqrt{\sum_{t} SE(\hat{T}_{st})^2}$ / Estimated standard error for total response for season s.
Step 4. Grand total over all seasons.
$\hat{T} = \sum_{s} \hat{T}_{s}$ / Estimated grand total over all seasons.
$SE(\hat{T}) = \sqrt{\sum_{s} SE(\hat{T}_{s})^2}$ / Estimated standard error for grand total over all seasons.
This procedure can be repeated for each variable of interest, e.g. number of fish kept, number of fish released, angler hours, rod hours, etc. A SAS program to do the computations is available.
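As an illustration of Steps 1 to 4 (not the SAS program referred to above), the computations can be sketched in Python. The function names, the daily responses and the stratum sizes below are all hypothetical. The standard error assumes the usual simple-random-sample formula for a total, including a finite population correction.

```python
import math
from statistics import mean, stdev


def stratum_total(daily_values, n_days_in_stratum):
    """Steps 1 and 2: expand the mean daily response to a stratum total."""
    n = len(daily_values)
    if n < 2:
        raise ValueError("need at least two sampled days to estimate variability")
    avg = mean(daily_values)
    sd = stdev(daily_values)                 # sample SD (divisor n - 1)
    total = n_days_in_stratum * avg
    fpc = 1 - n / n_days_in_stratum          # finite population correction (assumed)
    se = n_days_in_stratum * sd / math.sqrt(n) * math.sqrt(fpc)
    return total, se


def combine(estimates):
    """Steps 3 and 4: add totals; add variances of independent strata."""
    total = sum(t for t, _ in estimates)
    se = math.sqrt(sum(s ** 2 for _, s in estimates))
    return total, se


# Hypothetical daily responses and stratum sizes, for illustration only.
strata = {
    ("Shoulder", "WE"): ([10, 14, 12, 8], 30),
    ("Shoulder", "WD"): ([4, 6], 60),
    ("Winter", "WE"): ([3, 5, 4], 28),
    ("Winter", "WD"): ([1, 3], 64),
}

by_season = {}
for (season, daytype), (values, n_days) in strata.items():
    by_season.setdefault(season, []).append(stratum_total(values, n_days))

season_totals = {season: combine(parts) for season, parts in by_season.items()}
grand_total = combine(list(season_totals.values()))
print(season_totals, grand_total)
```

The same two functions can be applied to each response variable in turn (fish kept, fish released, angler hours, rod hours).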
A summary of the results for the preliminary survey, before adjusting based on the overflights, is shown in Table 2.
Table 2. Estimates from preliminary creel survey. No adjustments have been made for expansion based on the aerial overflights.
Response Variable / 2009/2010 Shoulder: Est Total / 2009/2010 Shoulder: SE Total / 2009/2010 Winter: Est Total / 2009/2010 Winter: SE Total / Grand total: Est Total / Grand total: SE Total
Angler- anglers / 3417 / 392 / 1607 / 517 / 5024 / 649
Angler- hours / 19220 / 2170 / 7267 / 2264 / 26487 / 3136
Angler- rod-hrs / 20092 / 1925 / 7992 / 2424 / 28084 / 3095
Angler- rods / 3699 / 378 / 1852 / 549 / 5551 / 667
Fish - BT - kept / 560 / 163 / 194 / 125 / 754 / 206
Fish - BT - r/k / 710 / 181 / 277 / 162 / 986 / 243
Fish - BT - rel / 150 / 69 / 82 / 40 / 232 / 79
Fish - Oth - kept / 0 / 0 / 0 / 0 / 0 / 0
Fish - Oth - r/k / 0 / 0 / 0 / 0 / 0 / 0
Fish - Oth - rel / 0 / 0 / 0 / 0 / 0 / 0
Fish - RT - kept / 447 / 114 / 297 / 184 / 743 / 217
Fish - RT - r/k / 710 / 181 / 277 / 162 / 986 / 243
Fish - RT - rel / 615 / 269 / 378 / 271 / 992 / 382
Fish - Tot - kept / 1007 / 197 / 491 / 301 / 1498 / 360
Fish - Tot - r/k / 1771 / 374 / 951 / 600 / 2722 / 707
Fish - Tot - rel / 764 / 259 / 460 / 300 / 1224 / 397
RT = Rainbow Trout; BT=Bull Trout; r/k = released + kept
Dauk and Schwarz (2001) outlined how to expand the information from the access surveys for a non-randomized overflight. Briefly, define $\delta_{stdif} = 0/1$ if fishing party i in season s, day-type t, and sampling day d was not active/active during overflight f. For example, if the overflight occurred at 13:00, then the fishing party was active if the start and end times of their trip included 13:00. Then the expanded response variable for that day is found as:
$$Y^{*}_{std} = Y_{std} \times \frac{\sum_{f} C_{stdf}}{\sum_{f} \sum_{i} \delta_{stdif}}$$
where $Y_{std}$ is the response summed over all interviews on that day and $C_{stdf}$ is the overflight count during flight f. For example, consider the data from Table 1 for 21 February 2010. There were 27 and 33 boats seen by the two overflights so $\sum_{f} C_{stdf} = 27 + 33 = 60$. Of the 29 interviews, 20 and 22 were active during the two overflights so $\sum_{f} \sum_{i} \delta_{stdif} = 20 + 22 = 42$. The estimated expansion factor for that day is $60/42 = 1.43$, i.e., inflate the total response over all interviews for that day by a factor of 1.43.
If every sampled day had an overflight, the estimates (accounting for expansion) would be found by replacing $Y_{std}$ by $Y^{*}_{std}$ in Step 1 above. The final standard errors would automatically incorporate any variability in the expansion factors over the different days.
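The daily expansion described above can be sketched as follows. The helper names `was_active` and `expansion_factor` and the three trip records are hypothetical; the counts for 21 February 2010 are taken from Table 1.

```python
from datetime import time


def was_active(trip_start, trip_end, flight_time):
    """Return True if the trip was under way at the time of the overflight."""
    return trip_start <= flight_time <= trip_end


def expansion_factor(air_counts, active_counts):
    """Boats counted from the air divided by interviewed parties active at
    the time, pooled over the flights made on one day."""
    active = sum(active_counts)
    if active == 0:
        raise ValueError("no interviewed party was active during the overflights")
    return sum(air_counts) / active


# Hypothetical interview records: (trip start, trip end) for three parties.
trips = [
    (time(8, 0), time(14, 30)),
    (time(13, 15), time(17, 0)),
    (time(9, 0), time(12, 0)),
]
flight = time(13, 0)
n_active = sum(was_active(start, end, flight) for start, end in trips)

# Counts for 21 February 2010 from Table 1: two flights, 29 interviews.
factor = expansion_factor(air_counts=[27, 33], active_counts=[20, 22])
print(n_active, round(factor, 2))
```

Multiplying a day's total response over all interviews by `factor` gives the expanded response for that day.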
However, not every day had an overflight in the preliminary survey. There are two ways (equivalent in large samples) in which the expansion factors can be applied in the case of missing data.
In the first method, the estimates from the unadjusted counts are adjusted using an appropriate factor and the uncertainty in the factor. For example, if the expansion factor was roughly equal across all dates, the final totals could be expanded by the average expansion factor. Table 3 (and Figure 1) summarizes the estimated expansion factors over the seven days with overflights. Given the estimated uncertainty in each expansion factor, there is no evidence that the expansion factors differ across strata or by day type, and an average expansion factor based on all the overflight data will be used to expand the estimates presented in Table 2.
Table 3. Estimated expansion factors on sampled days with overflights.
Date / Day-type / Air count 1 / Air count 2 / Active 1 / Active 2 / Expansion factor / SE¹
24OCT09 / WE / 41 / 36 / 33 / 33 / 1.17 / 0.08
21NOV09 / WE / 20 / 14 / 9 / 9 / 1.89 / 0.43
09JAN10 / WE / 9 / 9 / 7 / 7 / 1.29 / 0.23
21FEB10 / WE / 27 / 33 / 20 / 22 / 1.43 / 0.17
04MAR10 / WD / 14 / 17 / 11 / 10 / 1.48 / 0.26
14MAR10 / WE / 36 / 36 / 22 / 25 / 1.53 / 0.19
20MAR10 / WE / 32 / 37 / 28 / 28 / 1.23 / 0.10
Average / - / 179 / 182 / 130 / 134 / 1.37 / 0.06
¹ Standard error was computed assuming a binomial distribution for the proportion of interviews that were active during the overflight.
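The report does not give the formula behind the standard errors in Table 3. The sketch below is one possible reading, offered as an assumption rather than as the method actually used: the proportion interviewed is treated as binomial with sample size equal to the mean air count per flight, and the delta method is applied to its reciprocal. The function name is hypothetical; the counts are taken from Table 3.

```python
import math


def expansion_factor_se(air_counts, active_counts):
    """Expansion factor and a delta-method SE under a binomial assumption.

    Assumption (not stated in the report): the binomial sample size is the
    mean air count per flight.
    """
    total_air = sum(air_counts)
    p = sum(active_counts) / total_air       # proportion of boats interviewed
    n = total_air / len(air_counts)          # assumed binomial sample size
    se_p = math.sqrt(p * (1 - p) / n)
    factor = 1 / p
    se_factor = se_p / p ** 2                # delta method for 1/p
    return factor, se_factor


# Air counts and active-interview counts for three of the dates in Table 3.
table3 = {
    "24OCT09": ([41, 36], [33, 33]),
    "21FEB10": ([27, 33], [20, 22]),
    "20MAR10": ([32, 37], [28, 28]),
}
for day, (air, active) in table3.items():
    factor, se = expansion_factor_se(air, active)
    print(day, round(factor, 2), round(se, 2))
```

Under this assumption the three rows shown agree with the tabulated values to two decimal places.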
Figure 1. Comparison of expansion factors across the surveyed dates.
The estimates in Table 2 are expanded by the average expansion factor (E) of 1.37 (SE 0.06). The standard error of the expanded estimates is approximated by:
$$SE(E \hat{T}) \approx \sqrt{E^2 \, SE(\hat{T})^2 + \hat{T}^2 \, SE(E)^2}$$
where $\hat{T}$ is the unexpanded estimate. This assumes that the estimated expansion factor is independent of the estimate. This is only approximately true because some of the same data are used for both the estimate and the expansion factor, but the effects of non-independence are expected to be small. The expanded estimates are presented in Table 4a.
Table 4a. Estimates from preliminary creel survey after adjustment with average expansion factor of 1.37 (SE 0.06).
Response Variable / 2009/2010 Shoulder: Est Total / 2009/2010 Shoulder: SE Total / 2009/2010 Winter: Est Total / 2009/2010 Winter: SE Total / Grand total: Est Total / Grand total: SE Total
Angler- anglers / 4672 / 609 / 2197 / 720 / 6870 / 984
Angler- hours / 26282 / 3381 / 9938 / 3156 / 36219 / 4835
Angler- rod-hrs / 27474 / 3130 / 10929 / 3382 / 38403 / 4850
Angler- rods / 5058 / 604 / 2533 / 767 / 7591 / 1025
Fish - BT - kept / 766 / 228 / 266 / 172 / 1031 / 288
Fish - BT - r/k / 970 / 255 / 378 / 223 / 1348 / 343
Fish - BT - rel / 204 / 95 / 112 / 55 / 317 / 110
Fish - Oth - kept / 0 / 0 / 0 / 0 / 0 / 0
Fish - Oth - r/k / 0 / 0 / 0 / 0 / 0 / 0
Fish - Oth - rel / 0 / 0 / 0 / 0 / 0 / 0
Fish - RT - kept / 611 / 160 / 406 / 253 / 1016 / 303
Fish - RT - r/k / 970 / 255 / 378 / 223 / 1348 / 343
Fish - RT - rel / 840 / 372 / 517 / 372 / 1357 / 529
Fish - Tot - kept / 1376 / 283 / 671 / 414 / 2048 / 508
Fish - Tot - r/k / 2421 / 533 / 1300 / 825 / 3721 / 994
Fish - Tot - rel / 1045 / 360 / 629 / 413 / 1674 / 552
RT = Rainbow Trout; BT=Bull Trout; r/k = released + kept
A second way to impute values for the missing expansion factors is the multiple imputation method as outlined in Little and Rubin (2002, Section 5.4). In simple imputation methods, an imputed value is used for any missing expansion factor (e.g. the mean of the expansion factors). The “complete” data are then analyzed in the standard way by replacing $Y_{std}$ by $Y^{*}_{std}$ (the unexpanded daily response by the expanded daily response) in Step 1 above. As long as the imputed value is an unbiased estimate of the missing value, the final estimates will still be unbiased, but the reported standard errors will be too small. Under multiple imputation, a model for the missing expansion factors is first determined. In this case, it seems sensible that the missing expansion factors come from the same distribution as the observed expansion factors. Then a total of M imputed datasets are created. In each of the M imputed datasets, a new imputed value is chosen for each missing value. [In this case, you could randomly choose from the seven observed expansion factors for each missing flight, or you could generate these from a normal distribution with the same mean and standard deviation as the seven observed expansion factors.] For each of the “complete” datasets, compute the estimate of interest (e.g. estimated total catch) and its estimated variance (standard error squared), denoted as $\hat{T}_i$ and $V_i$ for i=1,…,M respectively. The final estimate is the average of the estimates over the M “complete” datasets:
$$\hat{T}_{MI} = \frac{1}{M} \sum_{i=1}^{M} \hat{T}_i$$
The final standard error combines the average variance from the “complete” datasets plus a correction term for the extra variation in the estimates over the different imputations:
$$SE(\hat{T}_{MI}) = \sqrt{\frac{1}{M} \sum_{i=1}^{M} V_i + \frac{M+1}{M} \cdot \frac{1}{M-1} \sum_{i=1}^{M} \left( \hat{T}_i - \hat{T}_{MI} \right)^2}$$
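The combining rules described above (the multiple imputation rules in Little and Rubin, 2002) can be sketched in Python. The function names and the per-imputation estimates below are hypothetical; the seven observed expansion factors are those reported in Table 3, and drawing from them with replacement is only one of the two imputation options mentioned in the text.

```python
import math
import random
from statistics import mean, variance


def impute_factors(observed, n_missing, rng):
    """Draw each missing expansion factor at random from the observed ones."""
    return [rng.choice(observed) for _ in range(n_missing)]


def pool_imputations(estimates, variances):
    """Combine M completed-data estimates and their variances."""
    m = len(estimates)
    if m < 2:
        raise ValueError("need at least two imputed datasets")
    pooled = mean(estimates)
    within = mean(variances)                 # average completed-data variance
    between = variance(estimates)            # divisor M - 1
    total_var = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total_var)


observed = [1.17, 1.89, 1.29, 1.43, 1.48, 1.53, 1.23]   # Table 3
rng = random.Random(2010)                                # arbitrary seed
draws = impute_factors(observed, n_missing=4, rng=rng)   # 4 days had no flight

# Hypothetical estimates and variances from M = 3 completed datasets.
pooled, se = pool_imputations([100.0, 110.0, 90.0], [25.0, 25.0, 25.0])
print(draws, pooled, round(se, 2))
```

In a full analysis, each set of draws would be used to expand the daily responses on the days without flights, and the four-step estimation procedure would be rerun on each completed dataset before pooling.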
These two methods are asymptotically equivalent. In our case, if a large number of “complete” datasets are imputed, the different values of the expansion factor will average out for the days with the missing expansion factor. Estimates for the preliminary survey using the multiple imputation method are presented in Table 4b. The estimates and estimated variances are very similar to those from the first method (Table 4a).
Table 4b. Estimates from preliminary creel survey after using multiple imputations (M=10) for the missing expansion factors.
Response Variable / 2009/2010 Shoulder: Est Total / 2009/2010 Shoulder: SE Total / 2009/2010 Winter: Est Total / 2009/2010 Winter: SE Total / Grand total: Est Total / Grand total: SE Total
Angler- anglers / 4770 / 522 / 2319 / 733 / 7089 / 900
Angler- hours / 26567 / 3096 / 10743 / 3534 / 37310 / 4698
Angler- rod-hrs / 28020 / 3020 / 11794 / 3726 / 39814 / 4796
Angler- rods / 5113 / 609 / 2775 / 853 / 7888 / 1048
Fish - BT - kept / 808 / 239 / 272 / 182 / 1080 / 300
Fish - BT - r/k / 1015 / 270 / 385 / 232 / 1400 / 356
Fish - BT - rel / 211 / 100 / 116 / 61 / 327 / 117
Fish - Oth - kept / 0 / 0 / 0 / 0 / 0 / 0
Fish - Oth - r/k / 0 / 0 / 0 / 0 / 0 / 0
Fish - Oth - rel / 0 / 0 / 0 / 0 / 0 / 0
Fish - RT - kept / 582 / 125 / 434 / 271 / 1015 / 299
Fish - RT - r/k / 1020 / 268 / 402 / 253 / 1422 / 369
Fish - RT - rel / 815 / 319 / 567 / 401 / 1381 / 513
Fish - Tot - kept / 1386 / 273 / 706 / 436 / 2092 / 515
Fish - Tot - r/k / 2361 / 442 / 1345 / 834 / 3706 / 945
Fish - Tot - rel / 1004 / 298 / 655 / 427 / 1659 / 521
RT = Rainbow Trout; BT=Bull Trout; r/k = released + kept
The multiple imputation method requires the analysis of multiple complete datasets and some additional computations to roll up the estimates from the multiple imputations, but with modern software this is relatively simple. A key advantage of the multiple imputation approach is that the model to generate the imputed values can be very general: it is not necessary to use the same distribution for all missing values if there is strong evidence that the missing values depend upon a covariate. For example, if the observed expansion factors showed a dependence upon weather variables, these weather variables could be used to impute a more appropriate expansion factor than the simple average.
In the case of the preliminary survey, using the average expansion factor at the end of the analysis and imputing based on a common distribution for all overflights are asymptotically equivalent.
The method of multiple imputation could also be used to impute the missing access records for Kaslo on two dates. Given the small number of anglers typically using this access point during the shoulder and winter seasons, this was not done and the estimates are only slightly affected.
2.2 CPUE
Only the interview data is required to estimate the CPUE. The CPUE was estimated for particular species and location/date as: