cpDNA rbcL sequences for Biodiverse analyses

Final Report

Prepared for the Ministry for the Environment by Dr Peter Heenan

Allan Herbarium (Landcare Research)

Disclaimer

This report has been prepared by Landcare Research for the Ministry for the Environment. If used by other parties, no warranty or representation is given as to its accuracy and no liability is accepted for loss or damage arising directly or indirectly from reliance on the information in it.

This report may be cited as:

Ministry for the Environment. 2015. cpDNA rbcL sequences for Biodiverse analyses: Final Report. Wellington: Ministry for the Environment.

Published in October 2015 by the
Ministry for the Environment
Manatū Mō Te Taiao
PO Box 10362, Wellington 6143, New Zealand

ISBN: 978-0-908339-13-6 (electronic)

Publication number: ME 1218

© Crown copyright New Zealand 2015

This document is available on the Ministry for the Environment’s website:

Contents

1 Aim

2 Background

3 Work completed

Obtain plant material

DNA extraction and PCR

DNA sequencing

Data analyses

4 Spatial and phylogenetic analyses results and discussion

Generic phylogenetic corrected weighted endemism

Phylogenetic diversity and endemism at genus level

Phylogenetic diversity and endemism at species level

Genus-level categorical analyses of neo- and palaeo-endemism

Species-level categorical analyses of neo- and palaeo-endemism

Utility of Biodiverse spatial and phylogenetic analyses

Additional rbcL sequencing requirements

5 References

Appendix 1. Sample information for the new rbcL sequences generated for this contract

Sorted alphabetically by genus

Appendix 2. Genus rbcL phylogeny

Appendix 3. Generic phylogenetic endemism

Appendix 4. Genus-level patterns of PD, PWE and neo- and palaeo-endemism

Appendix 5. Species-level patterns of PD, PWE and neo- and palaeo-endemism

1 Aim

To obtain DNA sequence data for the chloroplast large subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase (rbcL) from New Zealand samples representing c. 200 genera for analyses in Biodiverse, and to use these data to undertake spatial analyses in Biodiverse.

2 Background

There are 436 genera accepted in the New Zealand indigenous flora. Recent spatial and genetic analyses of the New Zealand flora used a generic-level phylogeny to obtain genetic metrics for Biodiverse analyses (Heenan unpubl. data). Of the 436 genera included in the Biodiverse study, 214 genera were represented by sequences of species indigenous to New Zealand. However, 222 sequences were ‘surrogates’, being based on non-New Zealand species of the same genus or, in a small number of cases, a close generic relative.

This project was to obtain rbcL sequences from indigenous New Zealand species to replace sequences obtained from non-New Zealand samples and used as surrogates in the Biodiverse analyses. Spatial analyses utilising the new rbcL sequence data in the phylogenetic dataset will be undertaken.

3 Work completed

Obtain plant material

  1. We obtained samples representing 211 indigenous New Zealand genera from dried collections in the Allan Herbarium or fresh collections from cultivated material. These 211 genera represented 95% of the 222 genera that were represented by surrogates in the Biodiverse study.
  2. Herbarium voucher information is presented in Appendix 1.

DNA extraction and PCR

  1. We extracted DNA from 309 samples and obtained suitable PCR product and clean sequences from 191 samples.
  2. For 118 samples we did not obtain sequences because there was no PCR product, the PCR product was weak, or the sequence reads were of poor quality.
  3. For some species we included more than one sample as we attempted multiple DNA extractions.
  4. Using a robot for DNA extractions allowed us to increase the number of samples analysed, so we were able to send three plates (each with c. 95 samples) for sequencing rather than the two plates initially envisaged. For some samples that were unsuccessful in plate 1 (or plate 2), this allowed us to attempt another DNA extraction, PCR and sequencing run with a different sample in plates 2 and/or 3. For some genera we attempted sequencing up to three different samples, which enabled us to obtain sequences representing more genera.
  5. For some samples identified as being particularly difficult, we attempted DNA extractions using a mortar and pestle and altered PCR protocols to obtain suitable DNA product for sequencing.

DNA sequencing

  1. We obtained sequences for 191 indigenous species that are representative of New Zealand indigenous genera; this is 86% of the target number of 222. For 167 samples we obtained read lengths of between 970 and the maximum of 1324 bases; for 10 samples there was a gap (32–223 bases) in the sequence between the internal primers; and for 14 samples we obtained only the 5′ or 3′ half of rbcL.
  2. Appendix 1 provides a summary of the successful sequence results, including GenBank numbers.

Data analyses

  1. A data matrix was constructed comprising rbcL sequences representing 405 genera from this study and GenBank sequences that are based on indigenous New Zealand species. Sequences representing 31 genera that are based on non-New Zealand species acted as surrogates where there was not a sequence available based on a New Zealand indigenous species. One sequence was selected to represent each genus.
  2. The total dataset comprised 436 genera and was aligned in MEGA 5.0 (Tamura et al. 2011). A model of sequence evolution for rbcL was selected using ModelTest.
  3. An optimised Maximum Likelihood tree was used as the base tree for model likelihood calculations and the best model of sequence substitution was selected using the Bayesian information criterion. Bayesian inference of phylogeny was performed using MrBayes version 3.2.3 through the CIPRES Science Gateway version 3.3. Two runs, each with eight chains, were run for 36,000,000 generations with a sample frequency of 5000, resulting in 7200 trees for each run. The first 6000 trees of each run were discarded as burn-in, and the remaining 1200 trees from each run (2400 in total) were combined in a 95% majority rule consensus tree using SumTrees version 3.3.1. The consensus tree is presented in Appendix 2.
  4. The Biodiverse software package version 1.0 was used for all analyses (Laffan et al. 2010). Spatial data used for this study comprised 213,141 georeferenced specimens from the New Zealand Virtual Herbarium (NZVH). All analyses were performed using a cell size of 0.12°, resulting in 2393 cells.
  5. The genus-level spatial data and phylogenetic tree were used to calculate phylogenetic diversity (PD) and phylogenetic corrected weighted endemism (PE_CWE) for the entire New Zealand archipelago and the main New Zealand islands. Statistical significance of the resulting patterns of endemism for each of the phylogenetic and non-phylogenetic analyses was assessed with a two-tailed test involving 999 random realisations of the observed datasets using the preserved model implemented in Biodiverse.
  6. Categorical Analysis of Neo- and Palaeo-Endemism (CANAPE). Phylogenetic diversity (PD) and phylogenetic weighted endemism (PWE) were calculated following Mishler et al. (2014) at genus and species rank. Statistical significance of the resulting biodiversity patterns for PD and PWE was assessed with 999 random realisations of the observed datasets using the preserved model implemented in Biodiverse. This model randomises the spatial locations of each taxon while preserving the taxon range and maintaining the taxon richness within each cell.
  7. A CANAPE analysis was performed on the genus and species spatial and phylogenetic data following Mishler et al. (2014). Differences in neo- and palaeo-endemism among islands were visualised as bar plots by plotting the distribution of p(RPE) values for the entire New Zealand archipelago, North Island, South Island and offshore islands.
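
The two phylogenetic metrics named in the steps above can be illustrated on a toy example. The sketch below is not the Biodiverse implementation; it assumes the common definitions in which PD is the summed length of all branches represented in a cell and PWE is the same sum with each branch length divided by the number of cells in which that branch occurs. The tree, branch lengths, cell names and occurrences are all hypothetical.

```python
# Toy tree: each branch is (length, set of tip genera that descend from it).
# All names, lengths and occurrences are hypothetical illustration data.
branches = [
    (1.0, {"A"}),
    (1.0, {"B"}),
    (2.0, {"C"}),
    (1.0, {"A", "B"}),  # internal branch subtending genera A and B
]

# Hypothetical occurrences: grid cell -> genera recorded in that cell.
cells = {
    "cell1": {"A", "B"},
    "cell2": {"B", "C"},
    "cell3": {"C"},
}


def branch_range(tips):
    """Number of cells in which at least one descendant of the branch occurs."""
    return sum(1 for taxa in cells.values() if taxa & tips)


def phylo_diversity(cell):
    """PD: total length of the branches represented in the cell."""
    return sum(length for length, tips in branches if cells[cell] & tips)


def phylo_weighted_endemism(cell):
    """PWE: branch lengths in the cell, each divided by the branch's range."""
    return sum(
        length / branch_range(tips)
        for length, tips in branches
        if cells[cell] & tips
    )


for cell in cells:
    print(f"{cell}: PD={phylo_diversity(cell)}, PWE={phylo_weighted_endemism(cell)}")
# cell1: PD=3.0, PWE=2.0
# cell2: PD=4.0, PWE=2.0
# cell3: PD=2.0, PWE=1.0
```

In this toy case cell2 has the highest PD, but cell1 matches it on PWE because genus A is restricted to cell1, which shows how range weighting can rank cells differently from diversity alone.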

4 Spatial and phylogenetic analyses results and discussion

Generic phylogenetic corrected weighted endemism

For the New Zealand archipelago generic phylogenetic CWE has most primary and secondary endemism concentrated in the northern offshore islands (Kermadec and Three Kings islands) and the upper North Island areas of Surville Cliffs, Karikari Peninsula, Great Barrier Island, and greater Auckland area (Appendix 3C). Randomisations showed that primary and secondary areas of generic phylogenetic CWE for the northern offshore Kermadec and Three Kings islands, along with northern North Island areas of Surville Cliffs and Karikari Peninsula, are significantly greater than expected from random. The majority of primary and secondary areas of CWE on the main New Zealand islands were not significantly different from random. In the South Island, the majority of cells with significantly higher values of phylogenetic CWE than expected from random occurred in inner montane basins and eastern areas, and cells with significantly less CWE than expected from random were scattered throughout the island. In the North Island, the majority of cells with significantly less generic phylogenetic CWE than expected from random occurred in the lower half of the island.

For the main New Zealand islands, generic phylogenetic diversity (Appendix 3A) shows high richness in the North Island and parts of the upper South Island. The general pattern is decreasing richness with increasing latitude. Generic phylogenetic CWE for the main New Zealand islands is very similar to generic phylogenetic CWE for the New Zealand archipelago (Appendix 3B, 3C). The main differences are additional primary cells on the main New Zealand islands being associated with Surville Cliffs, Kaitaia, Great Barrier Island and greater Auckland area, and new single primary cells on the Volcanic Plateau and near Wellington City. No South Island cells were part of the top 1% for generic phylogenetic endemism.

Phylogenetic diversity and endemism at genus level

In the genus-level analyses of phylogenetic diversity (PD), few cells with significantly high PD were indicated by randomisation analysis (Appendix 4). Those few that were indicated were mostly in the northern and central North Island or in the southern South Island (blue cells in Appendix 4A). However, cells with significantly low values of PD were distributed more or less throughout the archipelago (red cells in Appendix 4A). Many more cells had significantly high values of PWE (blue cells in Appendix 4B) than had significantly high PD. The most prevalent concentrations of PWE cells were in the northern North Island, Three Kings Islands and Kermadec Islands, but a number of areas also appeared in the South Island. Areas with significantly low values of PWE were less frequent than were significantly low values of PD, and were entirely absent from the northern North Island but still frequent in the central and southern North Island and the South Island.

Phylogenetic diversity and endemism at species level

Few cells with significantly high PD were observed in the species-level analysis and all of these occurred in the central or northern North Island (blue cells in Appendix 5A). Cells with significantly low PD were scattered throughout the northern North Island but much more frequent and contiguous in the southern North Island and South Island, Stewart Island and Chatham and subantarctic islands (red cells in Appendix 5A). Cells with significantly high PWE were most frequent in the northern North Island, Kermadec Islands and Three Kings Islands and found in scattered patches throughout the southern North Island and South Island (blue cells in Appendix 5B). Cells with significantly low PWE were common in the lower North Island and throughout the South Island (red cells in Appendix 5B). A cluster of cells on Stewart Island and one cell from the Chatham Islands had high PWE, and Auckland and Antipodes islands each had a single low PWE cell.

Genus-level categorical analyses of neo- and palaeo-endemism

The New Zealand archipelago has a relatively even distribution of genus-level endemism types with similar numbers of neo- and palaeo-endemics (Appendix 4C, D). Detailed analyses revealed there are some distinct patterns and a two-sample bootstrap Kolmogorov-Smirnov test confirmed differences among the distribution of endemism types between the North and South islands (D = 0.4251, p-value < 0.001). The North Island is biased toward palaeo-endemics, including mixed-endemics being strongly skewed toward palaeo-endemics, and neo-endemics are poorly represented (Appendix 4C, E). The South Island has similar numbers of neo- and palaeo-endemics and a more even distribution of mixed-endemics (Appendix 4C, F). The offshore islands, with the exception of the Antipodes Islands, all show cells with high levels of endemism with a latitudinal trend of increasing neo-endemics with increasing latitude (Appendix 4C, G). The northern Kermadec and Three Kings islands have only mixed-endemic cells, the mid-latitude Chatham Islands have neo- and mixed-endemic cells, the Snares have mixed-endemic cells, and the southernmost subantarctic islands, Auckland and Campbell islands, have only neo-endemic cells.
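
The neo-, palaeo- and mixed-endemism categories used above come from the two-step CANAPE decision rule. The sketch below is a simplified reading of that rule as described by Mishler et al. (2014), not the Biodiverse code used for this report. It assumes each cell has three randomisation ranks between 0 and 1: for phylogenetic endemism on the original tree, for phylogenetic endemism on a comparison tree with equal branch lengths, and for their ratio (RPE). The function name and thresholds are illustrative assumptions, and the further subdivision of mixed-endemism described in that paper is omitted.

```python
# Assumed thresholds: one-tailed 0.05 for step 1, two-tailed 0.05 for step 2.
PE_HIGH = 0.95
RPE_LOW = 0.025
RPE_HIGH = 0.975


def canape_category(p_pe_orig, p_pe_alt, p_rpe):
    """Classify one grid cell from its randomisation ranks (hypothetical helper).

    p_pe_orig: rank of PE on the original tree among the randomisations
    p_pe_alt:  rank of PE on the equal-branch-length comparison tree
    p_rpe:     rank of the ratio of the two (relative phylogenetic endemism)
    """
    # Step 1: the cell must be significantly high in at least one PE measure.
    if p_pe_orig <= PE_HIGH and p_pe_alt <= PE_HIGH:
        return "not significant"
    # Step 2: a two-tailed test on RPE separates the endemism types.
    if p_rpe < RPE_LOW:
        return "neo"      # range-restricted short branches dominate
    if p_rpe > RPE_HIGH:
        return "palaeo"   # range-restricted long branches dominate
    return "mixed"        # neither short nor long branches dominate


print(canape_category(0.99, 0.50, 0.01))   # neo
print(canape_category(0.50, 0.99, 0.99))   # palaeo
print(canape_category(0.99, 0.99, 0.50))   # mixed
print(canape_category(0.50, 0.50, 0.99))   # not significant
```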

Species-level categorical analyses of neo- and palaeo-endemism

The analyses of species-level endemism for the New Zealand archipelago revealed a predominance of neo- and palaeo-endemics (Appendix 5C, D). There are some well-defined geographic patterns and a two-sample Bootstrap Kolmogorov-Smirnov test confirmed differences in endemism types between the North and South islands (D = 0.8665, p-value < 0.001). In the northern North Island there are extensive areas of palaeo-endemism, accompanied by only a few neo- and mixed-endemism cells (Appendix 5C, E). The South Island is dominated by extensive and contiguous areas of neo-endemism in the northern South Island and southern South Island (Appendix 5C, F). Stewart Island is a hotspot of neo-endemism. The offshore islands are all hotspots of endemism, with the northernmost Kermadec and Three Kings islands comprising mixed- and palaeo-endemics, whereas the Chatham and subantarctic islands are dominated by neo-endemics (Appendix 5C, G).
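
The D values reported for the North and South Island comparisons are two-sample Kolmogorov-Smirnov statistics: the largest vertical gap between the two empirical cumulative distributions. The sketch below shows how such a statistic and a bootstrap p-value could be computed. The report does not state which software was used, so the resampling scheme (drawing both samples with replacement from the pooled values), the function names and the toy inputs are assumptions made for illustration.

```python
import bisect
import random


def ks_statistic(x, y):
    """Largest absolute difference between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for v in xs + ys:
        fx = bisect.bisect_right(xs, v) / len(xs)  # share of x values <= v
        fy = bisect.bisect_right(ys, v) / len(ys)  # share of y values <= v
        d = max(d, abs(fx - fy))
    return d


def ks_bootstrap_p(x, y, n_boot=999, seed=0):
    """Bootstrap p-value: how often resampled pooled data give D >= observed."""
    rng = random.Random(seed)
    observed = ks_statistic(x, y)
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_boot):
        bx = [rng.choice(pooled) for _ in x]
        by = [rng.choice(pooled) for _ in y]
        if ks_statistic(bx, by) >= observed:
            hits += 1
    return (hits + 1) / (n_boot + 1)


# Hypothetical p(RPE) values for cells on two islands.
north = [0.1, 0.2, 0.3, 0.4]
south = [0.3, 0.4, 0.5, 0.6]
print(ks_statistic(north, south))  # 0.5
print(ks_bootstrap_p(north, south))
```

A bootstrap version is a reasonable choice here because p(RPE) values are ranks from a finite number of randomisations and so contain ties, which the classical Kolmogorov-Smirnov p-value does not handle well.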

Utility of Biodiverse spatial and phylogenetic analyses

This study is one of the first to utilise PD, PWE and CANAPE in analyses of the entire vascular flora of an archipelago at two taxonomic ranks, with previous studies having focused on species within genera or genera within families in continental Australia.

The results of this study are generally consistent with current understanding of New Zealand areas of vascular plant endemism and regional biogeographic patterns. However, the analyses presented here provide important new insights into the positions of some of the major biogeographic boundaries and into the types of endemism observed. New centres of endemism can be revealed using the sophisticated analyses in Biodiverse, and patterns of endemism can be identified at a finer scale of resolution and more accurately than in earlier studies. The types of analyses presented here have wide application, including: 1) generating new hypotheses of areas, boundaries and types of endemism; 2) identifying significant areas to plan and prioritise conservation efforts; and 3) using the phylogenetic metrics for environmental reporting to provide insights into and measures of biodiversity at genetic scales not previously possible.

In conclusion, Biodiverse provides a valuable framework for objectively analysing and visualising biodiversity data for the entire vascular flora of the widely dispersed New Zealand archipelago. Further research is needed to better understand the behaviour of endemism metrics with regard to the effects of data biases, taxonomic rank and geographic scale.

Additional rbcL sequencing requirements

  1. Complete the sequencing for the 10 samples with a sequence gap between internal 3′ and 5′ primers and obtain the complete sequence for the 14 samples missing either the 5′ or 3′ half of rbcL.
  2. For the 31 genera for which we have not obtained DNA sequences from New Zealand indigenous species, these missing data should be obtained to complete the dataset of rbcL sequences representative of all New Zealand genera. This is achievable, although loans from other herbaria with recent collections of these genera and some field work are required to obtain suitable plant material for sequencing. It should be noted that the orchids Gastrodia and Molloybas and the parasitic Dactylanthus do not have chlorophyll and therefore are not able to be sequenced for rbcL.
  3. Having almost completed the construction of an rbcL phylogeny for New Zealand indigenous vascular plant genera, consideration should be given to expanding this to include indigenous non-vascular moss, liverwort and hornwort genera (bryophytes).
  4. Discussions have been held with Dr Robbie Holdaway (Landcare Research) about utilising the New Zealand indigenous genus rbcL phylogeny as part of the environmental DNA project (A national framework for biological heritage assessment across natural and productive landscapes) in The National Science Challenge, New Zealand’s Biological Heritage. Completing the rbcL sequencing for the missing genera is a priority as the resulting dataset would have multiple uses.

5 References

Laffan, S.W., Lubarsky, E. and Rosauer, D.F., 2010. Biodiverse, a tool for the spatial analysis of biological and related diversity. Ecography, 33, 643–647. doi:10.1111/j.1600-0587.2010.06237.x

Mishler, B.D., Knerr, N.J., Gonzalez-Orozco, C.E., Thornhill, A.H., Laffan, S. and Miller, J.T., 2014. Phylogenetic measures of biodiversity and neo- and palaeo-endemism in Australian Acacia. Nature Communications, 5, 4473.

Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M. and Kumar, S., 2011. MEGA5: Molecular Evolutionary Genetics Analysis using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods. Molecular Biology and Evolution, 28, 2731–2739.

Appendix 1. Sample information for the new rbcL sequences generated for this contract

Sorted alphabetically by genus

Family / Genus / Species / GenBank number / Accession number / Sequence length (5' + 3', 1324 max)
Asteraceae / Abrotanella / A. caespitosa / KT626656 / CHR 607884 / 1115
Rosaceae / Acaena / A. rorida / KT626657 / CHR 688794 / 1324
Gramineae / Achnatherum / A. petriei / KT626658 / CHR 586059 / 691; 5' only