Supplement 1

for

Novel [NiFe]- and [FeFe]-Hydrogenase Genes Indicative of Active Facultative Aerobes and Obligate Anaerobes in Earthworm Gut Contents

Oliver Schmidt, Pia K. Wüst, Susanne Hellmuth, Katharina Borst, Marcus A. Horn,

and Harold L. Drake

Department of Ecological Microbiology, University of Bayreuth, 95440 Bayreuth, Germany

------

Criteria for Establishing Hydrogenase Operational Taxonomic Units (OTUs)

Fermentative H2 producers possess either [FeFe]- or Group 4 [NiFe]-hydrogenases (S12, S19, S23). Thus, genes encoding these hydrogenases can be targeted for assessing polyphyletic fermentative H2 producers (S2, S3, S17, S25). Diverse organisms with other metabolic capabilities (e.g., methanogens, carbon monoxide oxidizers, and sulfate reducers) also harbor such hydrogenases (S5, S19) and can also be detected by using these structural gene markers. However, phylogenetic inferences based on hydrogenase sequence data should be made with caution. The number of reference hydrogenase sequences available in public databases is limited. This limitation will ease as more microbial genomes (H2 producers among others) are sequenced (S11) and as available hydrogenase primers (S2, S3, S17, S25, and this study) are used to obtain hydrogenase sequences from pure cultures. Differing topologies of hydrogenase and 16S rRNA gene phylogenetic trees and the presence of several homologous hydrogenase genes in one genome also complicate the interpretation of environmental hydrogenase sequence data (S11, S17, S23). Thus, the correlation between [FeFe]- and [NiFe]-hydrogenase fragments and the corresponding 16S rRNA genes was determined, and threshold similarity values for [FeFe]- and Group 4 [NiFe]-hydrogenases were calculated to standardize OTU assignment of environmental hydrogenase sequences.

Calculation of hydrogenase/16S rRNA similarity correlation plots. 184 Group 4 [NiFe]-hydrogenase gene sequences and 229 [FeFe]-hydrogenase gene sequences from pure cultures (Supplemental Table S1) were retrieved from GenBank and aligned as described in the Materials and Methods of the main text. Aligned 16S rRNA gene sequences of the microorganisms harboring these [FeFe]- and [NiFe]-hydrogenases were retrieved from the latest 16S rRNA gene database from the SILVA homepage (S14) (Supplemental Table S1). The fragments of the [NiFe]- and [FeFe]-hydrogenase genes that were analyzed correspond to regions amplified with the hydrogenase primers listed in Table 1. Approximately 1350 bp (positions 93 to 1450 of the 16S rRNA gene from E. coli [GenBank acc. no. U00096]) were analyzed for 16S rRNA genes. Distance matrices (with D being the number of base or amino acid differences per site) for pairwise comparisons of all 16S rRNA gene sequences and in silico translated [FeFe]- and [NiFe]-hydrogenase amino acid sequences were calculated with the MEGA software (S21). The similarity (S) was expressed as S = 1 – D. This approach is similar to those of previous studies (S13, S15).
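The similarity calculation described above can be illustrated with a minimal Python sketch. It is not the MEGA implementation: the function names, the toy sequences, and the pairwise exclusion of gapped sites are assumptions made for illustration only. D is computed as the proportion of differing sites between two aligned sequences, and S = 1 – D.

```python
from itertools import combinations

GAP_CHARS = {"-", "."}  # assumption: these characters mark alignment gaps


def p_distance(seq_a, seq_b):
    """Return D, the number of differences per compared site, for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue  # assumption: gapped sites are skipped for this pair only
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def similarity_matrix(aligned):
    """Return {(name_a, name_b): S} with S = 1 - D for every pair of sequences."""
    return {
        (name_a, name_b): 1.0 - p_distance(aligned[name_a], aligned[name_b])
        for name_a, name_b in combinations(sorted(aligned), 2)
    }


# Hypothetical aligned amino acid fragments (not real hydrogenase sequences).
toy = {"hydA_1": "MKTLIDGV", "hydA_2": "MKTLVDGV", "hydA_3": "MRSL-DAV"}
sims = similarity_matrix(toy)
for pair, s in sims.items():
    print(pair, round(s, 3))
```

The same function applies to nucleotide alignments of 16S rRNA genes, since D is defined per site irrespective of the alphabet.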

Correlations between [FeFe]- or Group 4 [NiFe]-hydrogenase fragments and 16S rRNA gene sequences. Topologies of [FeFe]- or Group 4 [NiFe]-hydrogenase phylogenetic trees differed in part from those of the corresponding 16S rRNA phylogenetic trees (Supplemental Fig. S3A; S17). These differences are reflected by the non-linear correlation of [FeFe]- or [NiFe]-hydrogenase amino acid similarities to the corresponding 16S rRNA gene similarities (Supplemental Fig. S4A and S4C). The numerous dots located in the lower right corner of Supplemental Fig. S4A and S4C indicate that relatively closely related organisms can have distantly related hydrogenases. Furthermore, many microorganisms harbor several homologous hydrogenase genes. Of the species listed in Supplemental Table S1, 23% and 32% have two or more homologs of Group 4 [NiFe]-hydrogenases and [FeFe]-hydrogenases, respectively. The average number of homologs is 1.26 and 1.48 for Group 4 [NiFe]-hydrogenases and [FeFe]-hydrogenases, respectively (Supplemental Table S1). The different hydrogenase homologs often have low similarities to each other (S17), as reflected by the dots at 100% 16S rRNA gene similarity that often have hydrogenase amino acid sequence similarities below 50% (Supplemental Fig. S4A and S4C). On the other hand, distantly related organisms generally did not have closely related [FeFe]- or Group 4 [NiFe]-hydrogenases (Supplemental Fig. S4A and S4C). A notable exception is found within the Thermotogaceae, where Marinitoga piezophila shares 16S rRNA gene similarities of only 83% with Thermotoga maritima and other Thermotoga spp., whereas amino acid sequence similarities of the [FeFe]-hydrogenases are 98-100% (the dots corresponding to the aforementioned correlations are highlighted in Supplemental Fig. S4A with a black box).
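The homolog statistics and the discordant comparisons described above can be tallied as in the following Python sketch. The data structures, function names, and toy values are hypothetical; only the cutoffs (100% 16S rRNA gene similarity, below 50% hydrogenase similarity) are taken from the text.

```python
def homolog_summary(homologs_by_species):
    """Return (fraction of species with >= 2 homologs, mean homologs per species)."""
    counts = [len(homologs) for homologs in homologs_by_species.values()]
    if not counts:
        raise ValueError("no species supplied")
    multi = sum(1 for count in counts if count >= 2)
    return multi / len(counts), sum(counts) / len(counts)


def discordant_pairs(pairs, min_rrna=1.0, max_hyd=0.5):
    """Return labels of comparisons with high 16S rRNA but low hydrogenase similarity.

    `pairs` is a list of (label, rrna_similarity, hydrogenase_similarity) tuples.
    """
    return [
        label
        for label, rrna_sim, hyd_sim in pairs
        if rrna_sim >= min_rrna and hyd_sim < max_hyd
    ]


# Hypothetical toy data; accession-like names are placeholders.
toy_homologs = {
    "Species A": ["hyd_1"],
    "Species B": ["hyd_2", "hyd_3"],
    "Species C": ["hyd_4"],
    "Species D": ["hyd_5"],
}
fraction_multi, mean_homologs = homolog_summary(toy_homologs)

toy_pairs = [
    ("B: hyd_2 vs hyd_3", 1.00, 0.42),  # two homologs in one genome
    ("A vs C", 0.93, 0.88),
    ("C vs D", 1.00, 0.97),
]
flagged = discordant_pairs(toy_pairs)
print(fraction_multi, mean_homologs, flagged)
```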

Despite the aforementioned exception, the lack of closely related [FeFe]- and Group 4 [NiFe]-hydrogenases in distantly related organisms indicates that the different homologs of these hydrogenases probably originate from gene duplication and subsequent diversification and only to a lesser extent from horizontal gene transfer (S11, S23). Hydrogenase homologs can have different structural compositions and are stabilized in the genome by performing different functions (S1, S18). Thus, phylogenetic inferences for host organisms are most accurate if only [FeFe]- and Group 4 [NiFe]-hydrogenases with related functions are considered for amino acid sequence comparison. Therefore, a functional reclassification of [FeFe]- and Group 4 [NiFe]-hydrogenases might be helpful. However, most available hydrogenase gene sequences were obtained from sequenced genomes (S11), and their in situ functions are not resolved in all cases.

Determination of OTU threshold values. Similarity correlation plots between structural gene markers and corresponding 16S rRNA genes, similar to those in Supplemental Fig. S4, have been used to calculate threshold similarity values at the species level (S8, S13, S15) and can also be applied at the genus, family, and phylum levels (S7, this study). Such threshold similarity values can be used to estimate novelty and diversity of a functional group in an environmental sample based on structural gene sequence data. However, the non-linear correlation between [FeFe]- or Group 4 [NiFe]-hydrogenases and corresponding 16S rRNA genes (as explained above) is reflected in the calculated taxa level thresholds (Supplemental Table S5). [FeFe]-hydrogenase threshold similarity values were almost equally low from the species to the phylum level, whereas the threshold at the family level was higher than the thresholds at the species and genus levels for Group 4 [NiFe]-hydrogenases (Supplemental Table S5). Thus, [FeFe]-hydrogenase amino acid sequence threshold similarities cannot be used to accurately estimate species-, genus-, and family-level diversities of H2-producing microorganisms, and Group 4 [NiFe]-hydrogenase amino acid sequence threshold similarities cannot be used to accurately estimate species- and genus-level diversities.

The hydrogenase amino acid sequences were restricted to only one homolog per organism in Supplemental Fig. S4B and S4D. The resulting threshold similarity values decrease stepwise from the species to the phylum level (Supplemental Table S5) (note: the selection of the hydrogenase homologs that were either included or excluded in these calculations was subjective, and the selection of other homologs would result in slightly different thresholds). The stepwise decrease of threshold similarity values calculated with the restricted dataset of hydrogenases (Supplemental Fig. S4B and S4D) differed from the threshold similarity values calculated from the complete dataset of hydrogenase amino acid sequences (including all homologs, see Supplemental Fig. S4A, S4C and Supplemental Table S5). The different thresholds obtained from the two datasets of hydrogenase amino acid sequences underscore the likelihood that the presence of several hydrogenase homologs in genomes can result in inconsistencies between hydrogenase phylogeny and 16S rRNA gene phylogeny of H2-producing microorganisms (S17). Despite the generally large differences between hydrogenase thresholds calculated from all or only one homolog(s), the family level thresholds for Group 4 [NiFe]-hydrogenases were very similar (Supplemental Table S5). Thus, 68% seems to be an effective threshold for family-level assignment of Group 4 [NiFe]-hydrogenases.

Group 4 [NiFe]-hydrogenases of the Gammaproteobacteria form a well-resolved monophyletic cluster (Supplemental Fig. S3A). An exception is the hydrogenase of Allochromatium vinosum (GenBank acc. no. EER67355), which is located in a mixed cluster of hydrogenases from, e.g., Archaea, Firmicutes, and Actinobacteria (Supplemental Fig. S3A). All Group 4 [NiFe]-hydrogenases of the Gammaproteobacteria cluster according to their family affiliation (Supplemental Fig. S3B). The Group 4 [NiFe]-hydrogenases of the Enterobacteriaceae form two distinct clusters with hydrogenase genes related to either hycE or hyfG of the E. coli hydrogenases III and IV, respectively (Supplemental Fig. S3B). In addition to E. coli, only the Shigella species S. boydii, S. dysenteriae, and S. sonnei have homologs of both Group 4 [NiFe]-hydrogenase genes (Supplemental Table S1). Due to the low number of Group 4 [NiFe]-hydrogenase homologs within the Gammaproteobacteria, calculated genus and family level threshold similarity values were for the most part unaffected when all or only one hydrogenase homolog(s) were considered (Supplemental Table S5). Furthermore, minimum similarities close to the threshold for the genus and the family levels (Supplemental Table S5) indicated that 73% and 71% were useful thresholds to estimate genus and family level diversities of the Gammaproteobacteria in environmental samples, respectively. However, hydrogenase amino acid sequence-based affiliation at the genus level was not possible within the Enterobacteriaceae since these species did not cluster according to their genus affiliation based on either 16S rRNA genes or Group 4 [NiFe]-hydrogenases (Supplemental Fig. S3B). Furthermore, species-level thresholds for Group 4 [NiFe]-hydrogenases of the Gammaproteobacteria were affected when all or only one homolog(s) were utilized for calculating the thresholds. Thus, species-level thresholds for Group 4 [NiFe]-hydrogenases of the Gammaproteobacteria were considered unsuitable for use.

The observation that closely related [FeFe]- or Group 4 [NiFe]-hydrogenases generally belong to closely related organisms (see Supplemental Fig. S4A, S4C, and text above) facilitates an alternative standardized assignment of OTUs. The 16S rRNA threshold similarity for closely related [FeFe]- and Group 4 [NiFe]-hydrogenases (e.g., hydrogenases with ≥ 80% similarity) is 91% and 93%, respectively (Supplemental Table S5). In other words, all [FeFe]- or Group 4 [NiFe]-hydrogenase fragments within one OTU (at 80% hydrogenase threshold similarity) most probably belong to the same family (a conservative family-level threshold for 16S rRNA sequences is 87.5% [S26]).
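A 16S rRNA threshold of the kind described above could be derived from the pairwise comparisons as in the following Python sketch. The text does not state how the threshold is read from the correlation plot, so the rule used here (the 16S rRNA similarity that at least 90% of the selected comparisons meet or exceed) is an assumption, as are the function name, the `excluded_fraction` parameter, and the toy values. Only the 80% hydrogenase cutoff is taken from the text.

```python
import math


def rrna_threshold(pairs, hyd_cutoff=0.80, excluded_fraction=0.10):
    """Return a 16S rRNA similarity threshold for closely related hydrogenases.

    `pairs` is a list of (hydrogenase_similarity, rrna_similarity) tuples.
    Comparisons with hydrogenase similarity >= hyd_cutoff are selected; the
    returned value is met or exceeded by at least (1 - excluded_fraction) of
    them (assumed rule, not taken from the source).
    """
    selected = sorted(rrna for hyd, rrna in pairs if hyd >= hyd_cutoff)
    if not selected:
        raise ValueError("no comparisons at or above the hydrogenase cutoff")
    index = min(len(selected) - 1, math.floor(excluded_fraction * len(selected)))
    return selected[index]


# Hypothetical (hydrogenase similarity, 16S rRNA similarity) comparisons.
toy_pairs = [
    (0.95, 0.99), (0.90, 0.97), (0.85, 0.95), (0.82, 0.93),
    (0.81, 0.91), (0.60, 0.99), (0.40, 1.00), (0.88, 0.96),
    (0.99, 1.00), (0.83, 0.94), (0.86, 0.92), (0.91, 0.98),
]
threshold_90 = rrna_threshold(toy_pairs)
threshold_min = rrna_threshold(toy_pairs, excluded_fraction=0.0)
print(threshold_90, threshold_min)
```

Setting `excluded_fraction` to zero returns the minimum 16S rRNA similarity among the selected comparisons, which is the most conservative choice but is sensitive to single outliers such as the Thermotogaceae case noted earlier.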

Thus, for environmental samples, [FeFe]- and Group 4 [NiFe]-hydrogenase gene sequences that share at least 80% similarity to those of pure cultures available in public databases can be assigned to the family of the corresponding organism. Furthermore, amplified [FeFe]- or Group 4 [NiFe]-hydrogenase fragments that share less than 80% similarity to known hydrogenases can be grouped together as one (‘novel’) family if they share at least 80% similarity to each other. However, it cannot be ruled out that such ‘novel’ hydrogenases belong to organisms of known families since closely related organisms can have distantly related hydrogenases (see Supplemental Fig. S4A, S4C and text above).
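The assignment rule above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure used in the study: the function and label names are hypothetical, the similarity lookup holds invented values, and 'novel' sequences are grouped greedily so that a sequence joins a group only if it shares at least 80% similarity with every member already in it (a furthest-neighbor reading of "to each other").

```python
def assign_otus(queries, references, similarity, cutoff=0.80):
    """Assign each query to a known family or to a 'novel' family-level OTU.

    queries: list of query sequence ids
    references: dict mapping reference sequence id -> family name
    similarity: callable (id_a, id_b) -> amino acid similarity in [0, 1]
    """
    assigned = {}
    novel = []
    for query in queries:
        best_ref = max(references, key=lambda ref: similarity(query, ref), default=None)
        if best_ref is not None and similarity(query, best_ref) >= cutoff:
            assigned[query] = references[best_ref]  # family of the closest pure culture
        else:
            novel.append(query)

    # Greedy grouping of the remaining sequences (order-dependent; assumption).
    clusters = []
    for query in novel:
        for cluster in clusters:
            if all(similarity(query, member) >= cutoff for member in cluster):
                cluster.append(query)
                break
        else:
            clusters.append([query])

    for number, cluster in enumerate(clusters, start=1):
        for query in cluster:
            assigned[query] = f"novel_family_{number}"
    return assigned


# Hypothetical similarities; unlisted pairs default to 0.0.
toy_sims = {
    frozenset(("q1", "refA")): 0.92,
    frozenset(("q2", "refA")): 0.55,
    frozenset(("q3", "refA")): 0.50,
    frozenset(("q4", "refA")): 0.45,
    frozenset(("q2", "q3")): 0.85,
    frozenset(("q2", "q4")): 0.30,
    frozenset(("q3", "q4")): 0.35,
}


def lookup(id_a, id_b):
    return toy_sims.get(frozenset((id_a, id_b)), 0.0)


result = assign_otus(["q1", "q2", "q3", "q4"], {"refA": "Enterobacteriaceae"}, lookup)
print(result)
```

Because the grouping is greedy, the outcome can depend on the input order; a hierarchical clustering tool would avoid that at the cost of more code.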

References for Supplemental Material

S1. Bagramyan, K., and A. Trchounian. 2003. Structural and functional features of formate hydrogen lyase, an enzyme of mixed-acid fermentation from Escherichia coli. Biochem. Int. 68:1159-1170.

S2. Boyd, E. S., J. R. Spear, and J. W. Peters. 2009. [Fe-Fe]-hydrogenase genetic diversity provides insight into molecular adaption in a saline microbial mat community. Appl. Environ. Microbiol. 75:4620-4623.

S3. Chang, J.-J., W.-E. Chen, S.-Y. Shih, S.-J. Yu, J.-J. Lay, F.-S. Wen, and C.-C. Huang. 2006. Molecular detection of the clostridia in an anaerobic biohydrogen system by hydrogenase mRNA targeted reverse transcription-PCR. Appl. Microbiol. Biotechnol. 70:598-604.

S4. Euzéby, J. P. 2011. List of prokaryotic names with standing in nomenclature

S5. Hedderich, R., and L. Forzi. 2005. Energy-converting [NiFe] hydrogenases: More than just H2 activation. J. Mol. Microbiol. Biotechnol. 10:92-104.

S6. Heck, K. L., G. Vanbelle, and D. Simberloff. 1975. Explicit calculation of rarefaction diversity measurement and determination of sufficient sample size. Ecology 56:1459-1461.

S7. Hunger, S., O. Schmidt, M. Hilgarth, M. A. Horn, S. Kolb, R. Conrad, and H. L. Drake. 2011. Competing formate- and carbon dioxide-utilizing prokaryotes in an anoxic methane-emitting fen soil. Appl. Environ. Microbiol. doi:10.1128.

S8. Kjeldsen, K. U., A. Loy, T. F. Jakobsen, T. R. Thomsen, M. Wagner, and K. Ingvorsen. 2007. Diversity of sulfate-reducing bacteria from an extreme hypersaline sediment, Great Salt Lake (Utah). FEMS Microbiol. Ecol. 60:287-298.

S9. Ludwig, W., K.-H. Schleifer, and W. B. Whitman. 2009. Revised road map to the phylum Firmicutes. In P. De Vos, G. Garrity, D. Jones, N. R. Krieg, W. Ludwig, F. A. Rainey, K.-H. Schleifer, and W. B. Whitman (eds.), Bergey’s manual of systematic bacteriology, 2nd ed., vol. 3. The Firmicutes. Springer, New York, NY.

S10. Ludwig, W., O. Strunk, R. Westram, L. Richter, H. Meier, Yadhukumar, A. Buchner, T. Lai, S. Steppi, G. Jobb, W. Forster, I. Brettske, S. Gerber, W. A. Ginhart, O. Gross, S. Grumann, S. Hermann, R. Jost, A. Konig, T. Liss, R. Lussmann, M. May, B. Nonhoff, B. Reichel, R. Strehlow, A. Stamatakis, N. Stuckmann, A. Vilbig, M. Lenke, T. Ludwig, A. Bode, and K.-H. Schleifer. 2004. ARB: a software environment for sequence data. Nucleic Acids Res. 32:1363-1371.

S11. Meyer, J. 2007. [Fe-Fe] hydrogenases and their evolution: a genomic perspective. Cell. Mol. Life Sci. 64:1063-1084.

S12. Nandi, R., and S. Sengupta. 1998. Microbial production of hydrogen: An overview. Crit. Rev. Microbiol. 24:61-84.

S13. Palmer, K., H. L. Drake, and M. A. Horn. 2009. Genome-derived criteria for assigning environmental narG and nosZ sequences to operational taxonomic units of nitrate reducers. Appl. Environ. Microbiol. 75:5170-5174.

S14. Pruesse, E., C. Quast, K. Knittel, B. Fuchs, W. Ludwig, J. Peplies, and F. O. Glöckner. 2007. SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB. Nucleic Acids Res. 35:7188-7196.

S15. Purkhold, U., A. Pommerening-Röser, S. Juretschko, M. C. Schmid, H.-P. Koops, and M. Wagner. 2000. Phylogeny of all recognized species of ammonia oxidizers based on comparative 16S rRNA and amoA sequence analysis: implications for molecular diversity surveys. Appl. Environ. Microbiol. 66:5368-5382.

S16. Schloss, P. D., and J. Handelsman. 2005. Introducing DOTUR, a computer program for defining operational taxonomic units and estimating species richness. Appl. Environ. Microbiol. 71:1501-1506.

S17. Schmidt, O., H. L. Drake, and M. A. Horn. 2010. Hitherto Unknown [FeFe]-Hydrogenase Gene Diversity in Anaerobes and Anoxic Enrichments from a Moderately Acidic Fen. Appl. Environ. Microbiol. 76:2027-2031.

S18. Schut, G. I., and M. W. W. Adams. 2009. The iron-hydrogenase of Thermotoga maritima utilizes ferredoxin and NADH synergistically: a new perspective on anaerobic hydrogen production. J. Bacteriol. 191:4451-4457.

S19. Schwartz, E., and B. Friedrich. 2006. The H2-metabolizing Prokaryotes, p. 496-563. In A. Balows, H. G. Trüper, M. Dworkin, W. Harder, and K.-H. Schleifer (ed.), The Prokaryotes, 3rd ed. Springer-Verlag, New York, NY.

S20. Stackebrandt, E., and J. Ebert. 2006. Taxonomic parameters revisited: tarnished gold standards. Microbiol. Today 33:152-155.

S21. Tamura, K., J. Dudley, M. Nei, and S. Kumar. 2007. MEGA 4: Molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24:1596-1599.

S22. Vignais, P. M., and B. Billoud. 2007. Occurrence, classification, and biological function of hydrogenases: An overview. Chem. Rev. 107:4206-4272.

S23. Vignais, P. M., B. Billoud, and J. Meyer. 2001. Classification and phylogeny of hydrogenases. FEMS Microbiol. Rev. 25:455-501.

S24. Wüst, P. K., M. A. Horn, and H. L. Drake. 2011. Clostridiaceae and Enterobacteriaceae as active fermenters in earthworm gut content. ISME J. 5:92-106.

S25. Xing, D., N. Ren, and B. E. Rittmann. 2008. Genetic diversity of hydrogen-producing Bacteria in an acidophilic ethanol-H2-coproducing system, analyzed using the [Fe]-hydrogenase gene. Appl. Environ. Microbiol. 74:1232-1239.

S26. Yarza, P., M. Richter, J. Peplies, J. Euzeby, R. Amann, K.-H. Schleifer, W. Ludwig, F. O. Glöckner, and R. Rosselló-Móra. 2008. The All-Species Living Tree project: a 16S rRNA-based phylogenetic tree of all sequenced type strains. Syst. Appl. Microbiol. 31:241-250.
