Supplemental Materials belonging to:
Assessing conservation risks to populations of an anadromous Arctic salmonid, the northern Dolly Varden (Salvelinus malma malma), via estimates of effective and census population sizes and approximate Bayesian computation
Submitted by:
LES N. HARRIS1*, FRISO P. PALSTRA2*, ROBERT BAJNO1, COLIN P. GALLAGHER1, KIMBERLY L. HOWLAND1, ERIC B. TAYLOR3 AND JAMES D. REIST1
1Fisheries and Oceans Canada, 501 University Crescent, Winnipeg, MB, Canada, R3T 2N6
2 CNRS UMR 7206 Eco-anthropologie et Ethnobiologie, Equipe “Génétique des populations humaines”, Muséum National d'Histoire Naturelle, Paris Cedex 05, France
3Department of Zoology, Biodiversity Research Centre and Beaty Biodiversity Museum, University of British Columbia, 6270 University Blvd., Vancouver, BC, Canada, V6T 1Z4
* These authors contributed equally to this work.
Correspondence:
Les N. Harris
Fisheries and Oceans Canada, 501 University Crescent, Winnipeg, MB, Canada, R3T 2N6.
Tel: +1-204-983-5143; Fax: +1-204-984-2403
E-mail:
Table S1. Basic descriptive statistics for each of the 15 microsatellite loci for all sampling locations assessed in northern Dolly Varden (Salvelinus malma malma) showing the average number of alleles per locus (NA), expected (HE) and observed (HO) heterozygosities, inbreeding coefficient (FIS), allelic richness (AR), and private allelic richness (PAR). Sample codes are shown in Table 1 of the manuscript.
Table S3. Goodness-of-fit of 24 simulated summary statistics (S) to empirically observed values (Obs), as assessed by the location of each observed value within the distribution of simulated values. The summary statistics are the number of alleles (NA), mean genic diversity (HE; Nei 1987), the population size reduction parameter (M; Garza & Williamson 2001), FST (θ; Weir & Cockerham 1984) and the average squared distance between alleles (δμ2; Goldstein et al. 1995). For each summary statistic and scenario (columns 1 to 9), the proportion of simulated values smaller than the observed value is given, based on 500,000 simulations per scenario. Displayed in bold are those values that fall below 5% (0.05) or above 95% (0.95), indicating a poor fit. Population codes in parentheses after the summary statistics are given in Table 1 of the manuscript. All simulations were performed in FastSimcoal 2.0 (Excoffier et al. 2013) and all summary statistics were calculated using arlsumstat (Excoffier & Lischer 2010).
S / Obs / 1 / 2 / 3 / 4 / 5 / 6 / 7 / 8 / 9
NA(FIR) / 15.5 / 0.4146 / 0.3269 / 0.3042 / 0.4186 / 0.3300 / 0.3086 / 0.4089 / 0.3405 / 0.3234
NA(BAB) / 11.2 / 0.2470 / 0.1735 / 0.1630 / 0.2488 / 0.1761 / 0.1652 / 0.2485 / 0.1963 / 0.1883
NA(BIG) / 13.1 / 0.2784 / 0.1990 / 0.1868 / 0.2816 / 0.2008 / 0.1895 / 0.2774 / 0.2179 / 0.2096
NA(RAT) / 10.3 / 0.1901 / 0.1353 / 0.1234 / 0.1940 / 0.1373 / 0.1253 / 0.1965 / 0.1579 / 0.1478
HE(FIR) / 0.84 / 0.2537 / 0.1922 / 0.1790 / 0.2572 / 0.1944 / 0.1806 / 0.2585 / 0.2183 / 0.2081
HE(BAB) / 0.77 / 0.1801 / 0.1332 / 0.1246 / 0.1814 / 0.1343 / 0.1264 / 0.1931 / 0.1629 / 0.1570
HE(BIG) / 0.77 / 0.1767 / 0.1310 / 0.1226 / 0.1774 / 0.1320 / 0.1246 / 0.1902 / 0.1606 / 0.1550
HE(RAT) / 0.75 / 0.1562 / 0.1218 / 0.1137 / 0.1589 / 0.1227 / 0.1154 / 0.1719 / 0.1523 / 0.1458
M(FIR) / 0.61 / 0.1061 / 0.0637 / 0.0523 / 0.1093 / 0.0646 / 0.0554 / 0.1025 / 0.0650 / 0.0570
M(BAB) / 0.45 / 0.0222 / 0.0105 / 0.0087 / 0.0228 / 0.0107 / 0.0091 / 0.0245 / 0.0131 / 0.0121
M(BIG) / 0.55 / 0.0454 / 0.0216 / 0.0186 / 0.0467 / 0.0219 / 0.0195 / 0.0471 / 0.0246 / 0.0230
M(RAT) / 0.50 / 0.0246 / 0.0130 / 0.0103 / 0.0262 / 0.0129 / 0.0106 / 0.0252 / 0.0156 / 0.0138
FST (FIR-BIG) / 0.048 / 0.6398 / 0.9990 / 1.0000 / 0.6298 / 0.9989 / 0.9998 / 0.7573 / 0.9991 / 1.0000
FST (FIR-BAB) / 0.034 / 0.5574 / 0.9919 / 0.9999 / 0.5478 / 0.9915 / 0.9998 / 0.6613 / 0.9922 / 0.9999
FST (BIG-BAB) / 0.085 / 0.7901 / 0.9998 / 1.0000 / 0.7831 / 0.9997 / 0.9999 / 0.8759 / 0.9998 / 1.0000
FST (FIR-RAT) / 0.079 / 0.7727 / 0.9964 / 1.0000 / 0.7629 / 0.9962 / 1.0000 / 0.8694 / 0.9965 / 1.0000
FST (BIG-RAT) / 0.104 / 0.8353 / 0.9990 / 1.0000 / 0.8267 / 0.9989 / 1.0000 / 0.9072 / 0.9991 / 1.0000
FST (BAB-RAT) / 0.074 / 0.7543 / 0.9996 / 1.0000 / 0.7442 / 0.9994 / 0.9999 / 0.8556 / 0.9995 / 1.0000
δμ2(FIR-BIG) / 15.24 / 0.8026 / 0.9999 / 1.0000 / 0.7929 / 1.0000 / 1.0000 / 0.9071 / 1.0000 / 1.0000
δμ2(FIR-BAB) / 7.54 / 0.5727 / 0.9984 / 1.0000 / 0.5598 / 0.9981 / 1.0000 / 0.7577 / 0.9983 / 1.0000
δμ2(BIG-BAB) / 8.09 / 0.6071 / 0.9997 / 1.0000 / 0.5932 / 0.9997 / 1.0000 / 0.7835 / 0.9997 / 1.0000
δμ2(FIR-RAT) / 7.81 / 0.5854 / 0.9965 / 1.0000 / 0.5729 / 0.9962 / 1.0000 / 0.7827 / 0.9967 / 1.0000
δμ2(BIG-RAT) / 16.57 / 0.8404 / 0.9998 / 1.0000 / 0.8343 / 0.9997 / 1.0000 / 0.9333 / 0.9999 / 1.0000
δμ2(BAB-RAT) / 14.48 / 0.8104 / 1.0000 / 1.0000 / 0.8014 / 0.9999 / 1.0000 / 0.9190 / 1.0000 / 1.0000
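The goodness-of-fit measure in Table S3 can be illustrated with a minimal Python sketch. It assumes the measure is simply the fraction of simulated values strictly smaller than the observed value, flagged as a poor fit below 0.05 or above 0.95. The function names and the toy list of simulated values are hypothetical; the three proportions checked at the end are copied from the table.

```python
from bisect import bisect_left


def fraction_below(simulated, observed):
    """Fraction of simulated values strictly smaller than the observed value."""
    ordered = sorted(simulated)
    return bisect_left(ordered, observed) / len(ordered)


def poor_fit(fraction, lower=0.05, upper=0.95):
    """True when the observed value lies in either tail of the simulations."""
    return fraction < lower or fraction > upper


# Toy example (hypothetical values): 3 of 10 simulated values are below 3.5.
toy_fraction = fraction_below(list(range(1, 11)), 3.5)

# Proportions copied from Table S3: (statistic, scenario, proportion).
reported = [
    ("NA(FIR)", "1", 0.4146),
    ("M(BAB)", "1", 0.0222),
    ("FST (FIR-BIG)", "2", 0.9990),
]
flags = [poor_fit(value) for _, _, value in reported]
```

Under these assumptions, NA(FIR) under scenario 1 is not flagged, whereas M(BAB) under scenario 1 and FST (FIR-BIG) under scenario 2 are.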
Table S4. Evaluation of confidence in the model choice procedure for the demographic history of four northern Dolly Varden (Salvelinus malma malma) population samples, as represented by the three most likely candidate scenarios. For each scenario, we simulated 100 sets of summary statistics and used these as pseudo-observed values in the model selection (using a logistic regression procedure based on the 1% closest simulations). Given for each simulated scenario are the proportion of times that the scenario was correctly identified as the preferred one and the proportion of times it was misidentified as another scenario. Averaged across scenarios, these correspond to type I and type II errors. All analyses were performed using the package abc (Csillery et al. 2012) in R v.2.15.1 (R Core Team, 2012).
Simulated as scenario / Identified as 1 / Identified as 4 / Identified as 7 / error I
1 / 0.32 / 0.32 / 0.36 / 0.34
4 / 0.32 / 0.37 / 0.31 / 0.32
7 / 0.25 / 0.23 / 0.52 / 0.24
error II / 0.29 / 0.28 / 0.34
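The error values in Table S4 appear to be consistent with averaging the misidentification proportions: across each row for error I and down each column for error II. This is an inference from the reported numbers, not a procedure stated in the caption. The Python sketch below checks it using the values copied from the table; the variable and function names are hypothetical.

```python
# Rows: simulated scenario (1, 4, 7); columns: identified scenario (1, 4, 7).
confusion = [
    [0.32, 0.32, 0.36],
    [0.32, 0.37, 0.31],
    [0.25, 0.23, 0.52],
]
reported_error_I = [0.34, 0.32, 0.24]   # last column of Table S4
reported_error_II = [0.29, 0.28, 0.34]  # last row of Table S4


def off_diagonal_mean(values, skip):
    """Mean of all entries except the one at index `skip` (the correct call)."""
    others = [v for i, v in enumerate(values) if i != skip]
    return sum(others) / len(others)


# Error I: mean misidentification proportion within each simulated scenario.
error_I = [off_diagonal_mean(row, i) for i, row in enumerate(confusion)]

# Error II: mean proportion of other scenarios wrongly assigned to each one.
columns = list(zip(*confusion))
error_II = [off_diagonal_mean(col, j) for j, col in enumerate(columns)]

# Agreement with the reported values, allowing for two-decimal rounding.
tolerance = 0.0051
matches_I = all(abs(a - b) <= tolerance for a, b in zip(error_I, reported_error_I))
matches_II = all(abs(a - b) <= tolerance for a, b in zip(error_II, reported_error_II))
```

Both sets of computed means agree with the reported values to within rounding at two decimal places.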
Table S5. Posterior distributions of model parameters of the preferred demographic scenario for four northern Dolly Varden (Salvelinus malma malma) populations. Given for each parameter are the median, mean and 95% confidence interval, based on a neural network regression on the 0.5% of simulations closest to the observed values. All analyses were performed using the package abc (Csillery et al. 2012) in R v.2.15.1 (R Core Team, 2012). See Figure S4 for more details on the distributions.
Parameter / Median / Mean / -95% CI / +95% CI
NFIR / 4.4 × 10^4 / 4.2 × 10^4 / 2.0 × 10^4 / 5.0 × 10^4
NBAB / 1.4 × 10^4 / 1.6 × 10^4 / 4.3 × 10^3 / 4.4 × 10^4
NBIG / 1.4 × 10^4 / 1.7 × 10^4 / 2.6 × 10^3 / 4.3 × 10^4
NRAT / 9.3 × 10^3 / 1.2 × 10^4 / 3.4 × 10^3 / 4.3 × 10^4
tA / 1.5 × 10^3 / 1.5 × 10^3 / 1.0 × 10^3 / 1.9 × 10^3
RA / 3.8 / 1.4 / 1.2 × 10^-2 / 8.5 × 10^1
μmicro / 1.4 × 10^-3 / 1.6 × 10^-3 / 5 × 10^-4 / 3.8 × 10^-3
Pmicro / 4.7 × 10^-1 / 4.7 × 10^-1 / 3.4 × 10^-1 / 6.0 × 10^-1
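The idea of summarizing a posterior from the 0.5% of simulations closest to the observed values can be sketched in Python. This is a toy illustration only: it uses plain rejection sampling, not the neural network regression adjustment of the abc package used for Table S5, and the prior, the one-statistic simulator, the observed value and all names are hypothetical.

```python
import random
import statistics


def rejection_abc(observed, simulate, draw_prior, n_sims, tolerance, rng):
    """Keep the parameter draws whose simulated statistics are closest to observed."""
    draws = []
    for _ in range(n_sims):
        theta = draw_prior(rng)
        stats = simulate(theta, rng)
        # Euclidean distance between simulated and observed summary statistics.
        distance = sum((s - o) ** 2 for s, o in zip(stats, observed)) ** 0.5
        draws.append((distance, theta))
    draws.sort(key=lambda pair: pair[0])
    n_keep = max(1, round(n_sims * tolerance))
    return [theta for _, theta in draws[:n_keep]]


def summarize(sample):
    """Median, mean and an approximate central 95% interval of the sample."""
    ordered = sorted(sample)
    lower = ordered[int(0.025 * (len(ordered) - 1))]
    upper = ordered[int(0.975 * (len(ordered) - 1))]
    return statistics.median(ordered), statistics.fmean(ordered), lower, upper


rng = random.Random(1)  # fixed seed so the sketch is repeatable


def draw_prior(rng):
    return rng.uniform(0.0, 10.0)  # hypothetical flat prior


def simulate(theta, rng):
    return (theta + rng.gauss(0.0, 0.5),)  # hypothetical one-statistic model


accepted = rejection_abc((4.0,), simulate, draw_prior, 20000, 0.005, rng)
median, mean, lower, upper = summarize(accepted)
```

With 20,000 simulations and a tolerance of 0.005, 100 draws are retained, and the median, mean and interval bounds are read off that retained sample in the same way as the columns of the table.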
Figure S1. Relationship between effective population size (Ne) and available spawning habitat (above), and between Ne and census population size (Nc) (below), for the four anadromous northern Dolly Varden (Salvelinus malma malma) populations assessed in the present study.
Figure S2. Prior and posterior distributions of 24 summary statistics used for the ABC analyses on the demographic history of four northern Dolly Varden (Salvelinus malma malma) populations. For each population sample (population codes are shown in Table 1 of the manuscript), within-population summary statistics include the number of alleles (Na), the genic diversity (He) and the statistic M, averaged across 15 microsatellite loci. Between-population summary statistics include genetic differentiation (FST) and average squared allelic distance (δμ2). For each summary statistic, the prior (grey) and posterior distributions (blue) are given (based on 0.5% closest values), as well as empirically observed values (solid black vertical lines).
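For readers unfamiliar with the summary statistics named in this caption, the Python sketch below computes single-locus versions from allele sizes expressed in repeat units. It assumes the standard textbook definitions (unbiased gene diversity after Nei 1987, M as the number of alleles divided by the allele size range after Garza & Williamson 2001, and the squared difference in mean allele size after Goldstein et al. 1995). It is not the arlsumstat implementation, which may differ in detail, and the two samples are hypothetical.

```python
from collections import Counter


def number_of_alleles(alleles):
    """Number of distinct alleles observed at one locus."""
    return len(set(alleles))


def gene_diversity(alleles):
    """Unbiased expected heterozygosity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(alleles)
    freqs = [count / n for count in Counter(alleles).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def m_ratio(alleles):
    """Number of alleles divided by the allele size range (max - min + 1)."""
    k = number_of_alleles(alleles)
    size_range = max(alleles) - min(alleles) + 1
    return k / size_range


def delta_mu_squared(alleles_a, alleles_b):
    """Squared difference between the mean allele sizes of two samples."""
    mean_a = sum(alleles_a) / len(alleles_a)
    mean_b = sum(alleles_b) / len(alleles_b)
    return (mean_a - mean_b) ** 2


# Hypothetical allele sizes (repeat units) at one locus, 8 gene copies each.
pop_a = [10, 10, 11, 13, 13, 13, 16, 16]
pop_b = [12, 12, 14, 14, 14, 15, 15, 18]

na_a = number_of_alleles(pop_a)
he_a = gene_diversity(pop_a)
m_a = m_ratio(pop_a)
dmu2_ab = delta_mu_squared(pop_a, pop_b)
```

In the analyses described in the caption, the within-population statistics are averaged across the 15 loci; the sketch stops at the single-locus values.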
Figure S3. Principal components analysis (PCA) summarizing the variance observed in a set of 24 simulated summary statistics across four northern Dolly Varden (Salvelinus malma malma) populations. Displayed are simulations under all scenarios (various colours), in relation to the empirically observed values (yellow dots). Scenarios with an island model of gene flow are displayed in shades of grey, scenarios with a stepping stone model of gene flow in shades of green, and scenarios with complete isolation in shades of red (see Figure 2 for further details on the candidate scenarios). Displayed are the first three axes of the PCA, which together explain 71% of the total variance observed. Analyses were performed in R v.2.15.1 (R Core Team, 2012).
Figure S4. Posterior distributions of model parameters of the preferred candidate scenario (scenario 7, see main text) for the demographic history underlying four northern Dolly Varden (Salvelinus malma malma) populations. Given are prior (grey) and posterior (blue) distributions, based on a neural network analysis of the 0.5% of simulations with values closest to the observed ones, for population sizes (N), splitting time (tA) and population size change at splitting (RA). All analyses were performed using the package abc (Csillery et al. 2012) in R v.2.15.1 (R Core Team, 2012).
REFERENCES
Csillery K, Francois O, Blum MGB (2012) abc: an R package for approximate Bayesian computation (ABC). Methods Ecol Evol 3:475-479
Excoffier L, Lischer HEL (2010) Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol Ecol Resour 10:564-567
Excoffier L, Dupanloup I, Huerta-Sánchez E, Sousa VC, Foll M (2013) Robust demographic inference from genomic and SNP data. PLoS Genet 9(10):e1003905
Garza JC, Williamson EG (2001) Detection of reduction in population size using data from microsatellite loci. Mol Ecol 10:305-318
Goldstein DB, Linares AR, Cavalli-Sforza LL, Feldman MW (1995) An evaluation of genetic distances for use with microsatellite loci. Genetics 139:463-471
Nei M (1987) Molecular Evolutionary Genetics. Columbia University Press, New York
R Core Team (2012) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/
Weir BS, Cockerham CC (1984) Estimating F-statistics for the analysis of population structure. Evolution 38:1358-1370