Scholl and Wiens page: 21

Electronic Supplementary Material

(Summary)

Appendix S1. Sources for phylogeny, taxonomy, and species richness data (including Supplementary References)

Appendix S2. Effects of incomplete taxon sampling on estimated clade ages and relationships between diversification rates, richness, and clade ages.

Appendix S3. Randomizing and testing the relationship between richness and diversification

Table S1. Summary of the number of sampled higher-level taxa in each kingdom used in the analyses at different ranks.

Table S2. Summary of chronograms used in this study, and the percentage of clades at each rank that are included in each one.

Table S3. Relationships between diversification rates, clade age, and species richness based on PGLS regression analyses among higher taxa of the same rank (phylum, class, order, and family) within eight major clades.

Table S4. Results testing the effects of randomizing clade ages among clades on the relationship between richness and diversification rate (re-estimated based on the new clade ages).

Table S5. Results testing the effects of randomizing observed richness values among clades on the relationship between richness and observed diversification rate.

Table S6. Results testing the effects of randomizing observed diversification rate values among clades on the relationship between richness and observed diversification rate.

Table S7. Results from experimental tests of the effects of strong negative or positive relationships between age and richness on the relationships between richness and diversification rate, at the kingdom, phylum, class, and order ranks.

Table S8. Estimated diversification rates for Eubacteria, assuming actual bacterial richness is orders of magnitude higher than their current described richness.

Table S9. Effects of reduced taxon sampling on estimated clade ages and on estimated relationships between richness and diversification and between richness and clade age.

Figure S1. Relationship between diversification rates and richness for three different values of epsilon.

Figure S2. Phylogenetically corrected linear regression analyses of clade species richness (ln) versus clade age (in Myr, ln-transformed) for all groups across taxonomic ranks.

Figure S3. The relationship between clade age and species richness when taxa of different ranks are compared.

Database S1. Data on richness, ages, and diversification rates used in analyses of different ranks across the Tree of Life (SEPARATE FILE)

Database S2. Trees used in analyses of different ranks across the Tree of Life. (SEPARATE FILE)

Appendix S1. Sources for phylogeny, taxonomy, and richness data

Archaea and Eubacteria

Phylogenetic relationships and divergence times within Archaea were based on Blank [1]. Those within Eubacteria were based on Battistuzzi & Hedges [2]. These studies provided the most comprehensive phylogenies within these groups. In addition, estimates from Blank [1] were generally similar to mean divergence times found in several other studies [3–5]. Likewise, the stem ages estimated for Eubacteria by Battistuzzi & Hedges [2] were broadly consistent with those in several other studies [3,4,6–8]. Data on richness and taxonomy for Archaea and Eubacteria were primarily from the Catalogue of Life database (see main text). However, when taxa present in the tree could not be located there, we obtained data from the NCBI taxonomy database [9–11], the Global Biodiversity Information Facility [12], and the Taxonomicon [13]. This was especially necessary for the cyanobacteria.

Backbone Eukaryote Phylogeny

We used a comprehensive chronogram representing all the major lineages of eukaryotes [14]. Other eukaryote-wide phylogenies have been published but are not time calibrated (e.g. [15]).

Plants

The phylogeny of plant families was taken from Fiz-Palacios et al. [16], which provided a relatively recent and comprehensive chronogram of plant families. We did not use the more recent chronogram from Zanne et al. [17] as it only covered 438 families and was therefore not sufficiently comprehensive for our purposes here. Taxonomy and richness data for the kingdom and for phyla, classes, and orders of plants were obtained from the Catalogue of Life [18]. Species richness data on families were from Fiz-Palacios et al. [16].

Fungi

Phylogeny and divergence times for clades within Fungi were based on Gueidan et al. [19]. This study provided a relatively comprehensive tree. Taylor & Berbee [20] also provided a relatively comprehensive tree but with highly unstable age estimates. In addition, clade age estimates from Gueidan et al. [19] were similar to those in several other studies [21–25]. Taxonomic data for orders and above were from Gueidan et al. [19]. Richness data for all ranks were taken from Mycobank [26].

Protists

The phylogeny of protists was from Parfrey et al. [14]. Other studies have addressed protist phylogeny [27–29] but were not as comprehensive. Species richness data at the kingdom level were from Pawlowski et al. [30] whereas data on richness for phyla, classes, orders, and families were from the NCBI database [9–11].

Backbone animal phylogeny

A time-calibrated tree including most animal phyla was taken from Wiens [17], using Tree 2 from that study. In that study, existing data were used to estimate a time-calibrated tree, and the topology was constrained to reflect well-supported relationships found in previous studies. Specifically, a slightly expanded version of the matrix of Philippe et al. [32] was generated. The matrix of Philippe et al. [32] is itself modified from that of Dunn et al. [33]. This matrix included 150 genes from 77 taxa (71 metazoans, 6 outgroups), mostly from ribosomal proteins and ESTs (expressed sequence tags). A time-calibrated phylogeny was generated using the Bayesian relaxed lognormal approach in BEAST, version 2.1.3 [34], after reducing the matrix to 16 relatively complete genes (each ≥100 amino acids in length, and >90% complete in their taxon sampling). Topological constraints were based primarily on the tree of Dunn et al. [35], with other relationships constrained based on the trees of Dunn et al. [33] and Philippe et al. [32]. The tree of Dunn et al. [35] provides a consensus of animal phylogeny based on the results of many recent studies. To estimate the ages of clades across the tree, a total of 20 fossil calibration points were used. We used Tree 2 from Wiens [17], which constrains the age of Porifera to 836 Ma and generates younger clade ages at the base of the tree (i.e. 876 Ma), intermediate between those of Erwin et al. [36] and Blair [37].
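The gene-filtering criteria described above (each gene ≥100 amino acids long and >90% complete in taxon sampling) can be illustrated with a minimal sketch. The gene names and per-gene values below are hypothetical, and treating "completeness" as the fraction of the 77 taxa with data for a gene is an assumption; the original study may have defined it differently.

```python
# Hypothetical per-gene summaries: (length in amino acids, number of taxa with data).
# Gene names and values are invented for illustration only.
TOTAL_TAXA = 77  # number of taxa in the matrix, as stated in the text

genes = {
    "geneA": (250, 75),
    "geneB": (80, 77),   # fails: shorter than 100 amino acids
    "geneC": (140, 60),  # fails: data for too few taxa
    "geneD": (100, 70),
}

def passes_filter(length_aa, taxa_with_data, total_taxa=TOTAL_TAXA,
                  min_length=100, min_completeness=0.90):
    """Return True for genes >=100 aa long and >90% complete in taxon sampling.

    Completeness is assumed here to be taxa_with_data / total_taxa.
    """
    completeness = taxa_with_data / total_taxa
    return length_aa >= min_length and completeness > min_completeness

# Names of the genes that would be retained under these assumed criteria.
retained = sorted(name for name, (length, n_taxa) in genes.items()
                  if passes_filter(length, n_taxa))
print(retained)
```

In this toy input, only the genes that satisfy both thresholds are retained; the real analysis retained 16 of the 150 genes.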

Animal phyla: Cnidaria

The phylogeny of cnidarians was taken from Rogers [38]. Other studies include relatively few higher taxa [36,39]. Data on species richness and taxonomy were from the Catalogue of Life [18].

Animal phyla: Echinodermata

The phylogeny of echinoderms was from O’Hara [40]. A time-calibrated phylogeny of echinoderms was also presented by Smith [41], but this tree was not as comprehensive as that of O’Hara [40]. Data on species richness and taxonomy were from the Catalogue of Life [18].

Animal phyla: Mollusca

The phylogeny of molluscs was taken from Stoger et al. [42], which is relatively complete. Other chronograms have more limited taxonomic sampling, such as Strugnell & Allcock [43] and Strugnell et al. [44]. Data on species richness and taxonomy were from the Catalogue of Life [18].

Animal phyla: Nematoda

The phylogeny of nematodes was taken from Blaxter [45]. Several chronograms for nematodes have been published but with limited taxon sampling [22,46]. Data on species richness and taxonomy were from the Catalogue of Life [18].

Animal phyla: Arthropoda

Higher-level arthropod relationships were from Tree 2 of Wiens [17], as described above. Data on species richness and taxonomy were from the Catalogue of Life [18]. These data were supplemented by the NCBI taxonomy database [9–11] for taxa missing from the Catalogue of Life.

Arthropod clades: Araneae (Spiders)

The phylogeny of spiders was based on the time-calibrated phylogenomic tree from Bond et al. [47], which provided the most comprehensive treatment of the group to date. Data on species richness and taxonomy were from the Catalogue of Life [18].

Arthropod clades: Hexapoda (insects and relatives)

The phylogeny of hexapods was taken from Rainford et al. [48], which provided a time-calibrated tree of nearly all hexapod families based on multiple nuclear and mitochondrial genes. Data on species richness and taxonomy were from the Catalogue of Life [18].

Animal phyla: Chordata

The higher-level phylogeny of vertebrates (within Chordata) was taken largely from Alfaro et al. [49], which included all major clades of gnathostomes (jawed vertebrates). We added the cyclostomes (Myxiniformes, Petromyzontiformes) to this tree using the age of the vertebrate crown group from Erwin et al. [36] and the age of the cyclostome crown group from Kuraku & Kuratani [50], following Wiens [29]. We used estimated species numbers for Myxiniformes, Petromyzontiformes, Chondrichthyes, Actinistia, and Dipnoi from the Catalogue of Life [18].

Vertebrate clades: Chondrichthyes (sharks and rays)

The phylogeny was taken from Sorenson et al. [52]. Data on species richness and taxonomy were taken largely from Sorenson et al. [52] with additional information from the Catalogue of Fishes [53] and the Catalogue of Life [18].

Vertebrate clades: Actinopterygii (ray-finned fish)

The phylogeny and taxonomy were taken from Betancur et al. [54]. The phylogeny was embedded in our supertree using the chronogram from Alfaro et al. [49]. Species numbers were primarily from the Catalogue of Fishes [53] and supplemented with data from the Catalogue of Life [18].

Vertebrate clades: Amphibians

The phylogeny was taken from Pyron & Wiens [55], and species numbers for families were from AmphibiaWeb [56]. Taxonomy and richness values for the clade overall were from the Catalogue of Life [18].

Vertebrate clades: Testudines (Turtles)

The phylogeny was taken from Jaffe et al. [57]. Other available phylogenies lacked time calibration [58]. Taxonomy and species numbers were from the Reptile Database [59].

Vertebrate clades: Crocodylia (alligators and crocodiles)

The phylogeny was taken from Oaks [60]. Other phylogenies were not as comprehensive and lacked time calibration [61]. Taxonomy and species numbers were from the Reptile Database [59].

Vertebrate clades: Lepidosauria (lizards, snakes, tuatara)

The timing of the split between Rhynchocephalia (tuatara) and Squamata (lizards and snakes) was taken from Alfaro et al. [49], and the tree within squamates was from Pyron et al. [62], with time calibration from Pyron & Burbrink [63]. Species numbers were from the Reptile Database [59]. The taxonomy of the Reptile Database for squamates differed from traditional taxonomy in several respects, but without phylogenetic justification (i.e. the changes were not needed to resolve a non-monophyletic group). In these cases, we followed traditional taxonomy instead. Specifically, we: (1) treated Anguidae, Diploglossidae, and Anniellidae as one family (Anguidae); (2) treated Typhlopidae, Gerrhopilidae, and Xenotyphlopidae as one family (Typhlopidae); and (3) treated Dipsadidae, Natricidae, and Pseudoxenodontidae as parts of Colubridae.
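As an illustration of the family-level lumping listed above, the following sketch pools species counts from Reptile Database families into the traditional families used here. The mapping follows the three rules in the text; the species counts and the function name `lump_richness` are hypothetical and serve only to show the bookkeeping.

```python
from collections import defaultdict

# Reptile Database family -> traditional family used in this study (rules 1-3 above).
LUMP = {
    "Diploglossidae": "Anguidae",
    "Anniellidae": "Anguidae",
    "Gerrhopilidae": "Typhlopidae",
    "Xenotyphlopidae": "Typhlopidae",
    "Dipsadidae": "Colubridae",
    "Natricidae": "Colubridae",
    "Pseudoxenodontidae": "Colubridae",
}

def lump_richness(counts):
    """Sum species counts after mapping each family to its traditional family.

    Families absent from LUMP are kept under their own name.
    """
    lumped = defaultdict(int)
    for family, n_species in counts.items():
        lumped[LUMP.get(family, family)] += n_species
    return dict(lumped)

# Hypothetical species counts, NOT the values used in the analyses.
example = {
    "Anguidae": 10, "Diploglossidae": 5, "Anniellidae": 1,
    "Typhlopidae": 20, "Xenotyphlopidae": 1, "Iguanidae": 7,
}
print(lump_richness(example))
```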

Vertebrate clades: Aves (Birds)

The phylogeny for birds was taken from Jetz et al. [64], but this requires some explanation, since their paper provides distributions of trees rather than a single tree. We developed a list of 192 species, representing all bird families as recognized by Jetz et al. [64]. We used the backbone tree based on Hackett et al. [65], and included only species that had sequence data. We then obtained a set of 1,000 trees from the full set of 10,000 trees for all 9,993 taxa, including only these 192 species. We then used TreeAnnotator in BEAST 1.5.4 to summarize these 1,000 trees as a consensus tree, with clade ages based on the mean ages. This tree was generally very well-supported (based on posterior probabilities of clades), but some clades had relatively weak support. Species numbers were obtained from the list of Jetz et al. [64], with a total of 9,993 species among 192 families. However, some families did not appear to be monophyletic, and assignment of species numbers in some cases was very unclear. We included all families, but could assign only 9,273 species among these families. Diversification rates for these families may therefore be underestimated in some cases (or overestimated if non-monophyly causes the ages of families to be underestimated). A potential alternative approach was to use orders instead, but many of these appeared to be non-monophyletic as well. Richness values for orders and Aves overall were taken from the Catalogue of Life [18].
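To make concrete why unassigned species can lead to underestimated diversification rates, the sketch below uses the stem-age method-of-moments estimator, r = ln[n(1 − ε) + ε] / t, of the kind associated with Magallón and Sanderson. That this is the estimator used for the analyses is an assumption here (the estimator is described in the main text, not in this appendix), and the family richness, age, and ε values are hypothetical.

```python
import math

def stem_rate(n_species, stem_age_myr, epsilon=0.5):
    """Net diversification rate from richness and stem age.

    Assumed method-of-moments form: r = ln(n * (1 - eps) + eps) / t,
    where eps is the relative extinction fraction.
    """
    return math.log(n_species * (1.0 - epsilon) + epsilon) / stem_age_myr

# Overall fraction of bird species that could be assigned to families (from the text).
assigned_fraction = 9273 / 9993

# Hypothetical family: 100 species in truth, 93 assigned, stem age of 30 Myr.
true_rate = stem_rate(100, 30.0)
observed_rate = stem_rate(93, 30.0)

print(f"fraction assigned: {assigned_fraction:.3f}")
print(f"underestimated: {observed_rate < true_rate}")
```

Because the estimator increases with richness for a fixed age, any shortfall in assigned species lowers the estimated rate; conversely, an underestimated age raises it, which matches the caveat in the text.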

Vertebrate clades: Mammalia

The phylogeny of mammal families was taken from Meredith et al. [66], which was based on multiple nuclear loci. We used the phylogeny presented in Fig. 1 of that paper (based on amino acid sequences). Species numbers for each family (and family-level taxonomy) were taken from IUCN [67].

Supplementary References

1. Blank CE. 2011 An expansion of age constraints for microbial clades that lack a conventional fossil record using phylogenomic dating. J. Mol. Evol. 73, 188–208. (doi:10.1007/s00239-011-9467-y)

2. Battistuzzi FU, Hedges SB. 2009 Eubacteria. In The Timetree of Life (eds SB Hedges & S Kumar), pp. 106–115. Oxford University Press.

3. Sheridan PP, Freeman KH, Brenchley JE. 2003 Estimated minimal divergence times of the major bacterial and archaeal phyla. Geomicrobiol. J. 20, 1–14. (doi:10.1080/01490450390144330)

4. Battistuzzi FU, Feijao A, Hedges SB. 2004 A genomic timescale of prokaryote evolution: insights into the origin of methanogenesis, phototrophy, and the colonization of land. BMC Evol. Biol. 4, 44. (doi:10.1186/1471-2148-4-44)

5. Blank CE. 2009 Not so old Archaea - the antiquity of biogeochemical processes in the archaeal domain of life. Geobiology 7, 495–514. (doi:10.1111/j.1472-4669.2009.00219.x)

6. Blank CE, Sanchez-Baracaldo P. 2010 Timing of morphological and ecological innovations in the cyanobacteria - a key to understanding the rise in atmospheric oxygen. Geobiology 8, 1–23. (doi:10.1111/j.1472-4669.2009.00220.x)

7. Luo H, Csűros M, Hughes AL, Moran MA. 2013 Evolution of divergent life history strategies in marine Alphaproteobacteria. mBio 4, e00373–13. (doi:10.1128/mBio.00373-13)

8. Blank CE. 2013 Origin and early evolution of photosynthetic eukaryotes in freshwater environments: reinterpreting Proterozoic paleobiology and biogeochemical processes in light of trait evolution. J. Phycol. 49, 1040–1055.

9. Benson DA, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. 2015 GenBank. Nucleic Acids Res. 43, D30–D35. (doi:10.1093/nar/gku1216)

10. Federhen S. 2012 The NCBI Taxonomy database. Nucleic Acids Res. 40, D136–D143. (doi:10.1093/nar/gkr1178)

11. Sayers EW et al. 2009 Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 37, D5–D15. (doi:10.1093/nar/gkn741)

12. GBIF. 2015 Global Biodiversity Information Facility. (www.gbif.org/)

13. Brands SJ. 2015 The Taxonomicon. (http://taxonomicon.taxonomy.nl/)