Supplementary Figure 1: Location map of the sites drilled during Ocean Drilling Program (ODP) Leg 201 in the equatorial Pacific and on the Peru margin. Sites where logs were recorded are shown in red. Leg 201 was the first ODP leg dedicated primarily to the study of life in deep marine sediments. The seven sites drilled had previously been visited by ODP/DSDP legs and were chosen to be representative of the general range of biological activity and diversity in marine sediments. Source: ODP (

Supplementary Figure 2: Bias in Illumina® reads after MDA of SAG Dsc1. Analysis with PRINSEQ ( revealed a high exact duplication rate (>90%) in all three raw Illumina® datasets. Analysis of the reads showed differences in coverage of up to 4 orders of magnitude, indicating a large bias in the first MDA reaction.
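The exact duplication rate reported above can be illustrated with a minimal sketch. It assumes that the rate is the fraction of reads whose sequence is identical to that of an earlier read; this is an illustrative definition and not PRINSEQ's own implementation, and the function name and example reads are hypothetical.

```python
from collections import Counter


def exact_duplication_rate(reads):
    """Return the fraction of reads that are exact copies of another read.

    Assumption: every read beyond the first occurrence of a sequence
    counts as one duplicate.
    """
    if not reads:
        return 0.0
    unique_sequences = len(Counter(reads))
    duplicates = len(reads) - unique_sequences
    return duplicates / len(reads)


# Hypothetical toy data: five reads, two distinct sequences.
example_reads = ["ACGT", "ACGT", "ACGT", "TTGA", "ACGT"]
print(exact_duplication_rate(example_reads))
```

Under this definition, a rate above 90% means that fewer than one read in ten carries a sequence not already seen in the dataset.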

Supplementary Figure 3: Workflow of the SCG pipeline. (A) A frozen sample from Peru Margin site 1230, collected at 7.3 mbsf and stored at -80 °C for 8 years, was used for single cell genome analysis. (B) Physical isolation of the single cells was performed by Fluorescence-Activated Cell Sorting (FACS) into two 384-well plates. (C) After separation, the single cells were lysed to release their DNA. (D) The DNA was amplified by Multiple Displacement Amplification (MDA). (E) The MDA products were screened by PCR with broad eubacterial and archaeal 16S rRNA primers and by nanoliter qPCR (Supplementary Table 4), followed by Sanger sequencing. (F) The MDA products were sequenced separately on an Illumina® HiSeq platform and with the PacBio® RS MagBead CLR sequencing technique. (G) After quality assessment, trimming and/or normalization of the sequencing reads, bioinformatics tools were used for assembly, ORF calling and annotation of the genes.

Supplementary Figure 4: Concentrations of dissolved sulfate and methane in interstitial water from Site 1230. Methane concentrations were determined after 7 days at 22 °C. The grey bar represents the anaerobic oxidation of methane zone of a Peru slope hydrate. The figure was adapted from Figures F3 and F19 of the Shipboard Scientific Party report for Site 1230 (D'Hondt et al 2003a, D'Hondt et al 2003b).

Supplementary Figure 5: Conserved Rdh domains in DscP2_00865. A PSI-BLAST search was run at NCBI with the following parameters: Live blast search RID=4PWH5ZVG015, Database: cdsearch/cdd v3.10, Low complexity filter: yes, e-value threshold: 0.01 and maximum number of hits: 500.

Supplementary Figure 6: Amino acid identity comparison. An all-vs-all blastp was conducted on translated protein sequences for the reference genomes Dehalococcoides mccartyi 195 (NC_002936.faa) (Seshadri et al 2005) and Dehalogenimonas lykanthroporepellens BL-DC-9 (NC_014314.faa) (Moe et al 2009). For the partial genomes from this study, Dsc1 and DscP2, translated protein sequences predicted by RAST were used. For Dehalococcoidetes bacterium DEH-J10, a partial genome from another study (Wasmund et al 2013), the concatenated contigs were used. The default Blastall parameters were used with an e-value cut-off of 0.001. A custom Python script was used to select the single best alignment in each organism-to-organism comparison, with DscP2 as the reference. Alignments shorter than 50 amino acids in length were excluded from consideration. A density plot of the percent identity for each alignment was generated using the R statistical environment.
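The best-hit selection described above could look roughly like the following sketch. It is not the custom script used in the study. It assumes BLAST tabular output with the 12 standard columns (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, e-value, bit score) and assumes that "best" means highest bit score; the function name and the sequence identifiers in the example are hypothetical.

```python
import csv
import io


def best_hits(tabular_text, min_length=50, max_evalue=1e-3):
    """Return {query_id: (subject_id, percent_identity)} for the best hit per query.

    Alignments shorter than min_length or with an e-value above
    max_evalue are skipped. Ties on bit score keep the first hit seen.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject = row[0], row[1]
        percent_identity = float(row[2])
        alignment_length = int(row[3])
        evalue = float(row[10])
        bit_score = float(row[11])
        if alignment_length < min_length or evalue > max_evalue:
            continue
        if query not in best or bit_score > best[query][2]:
            best[query] = (subject, percent_identity, bit_score)
    return {q: (s, pid) for q, (s, pid, _) in best.items()}


# Hypothetical example rows (identifiers are made up for illustration).
example = (
    "queryA\tsubj1\t62.5\t120\t45\t0\t1\t120\t1\t120\t1e-40\t150\n"
    "queryA\tsubj2\t80.0\t200\t40\t0\t1\t200\t1\t200\t1e-90\t320\n"
    "queryB\tsubj3\t95.0\t30\t1\t0\t1\t30\t1\t30\t1e-10\t60\n"
)
print(best_hits(example))
```

The percent identities collected this way, one per query protein, are the kind of values a density plot would be drawn from.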

Supplementary Table 1: DNA was extracted from 5-8 g of each of 31 deep-sea sediment samples using the FastDNA® SPIN Kit for Soil (MP Biomedicals, Solon, OH). The samples originated from the eastern equatorial Pacific, the southeast Pacific off Peru (Peruvian Margin, ODP Leg 201), the northeast Pacific at the Juan de Fuca Ridge flank off Oregon (IODP Expedition 301), the northwest Pacific off Japan (JAMSTEC Chikyu Shakedown Expedition CK06-06) and the Nankai Trough Forearc Basin off Japan (ODP Expedition 315). 1 µl of the extracted DNA was amplified by Multiple Displacement Amplification (MDA) (Lasken 2007) in three independent reactions using the REPLI-g Mini Kit for single cells (Qiagen®). The amplified DNA was then pooled and cleaned up using the QIAamp kit (Qiagen®). rdh genes were amplified using primers RFF2 (5'-SHMGBMGWGATTTYATGAARR-3') and B1R (5'-CHADHAGCCAYTCRTACCA-3') (Futagami et al 2009). nd: no description.

Supplementary Table 2: PCR programs used for screening of SAGs and extracted DNA. Different polymerases were tested because an inhibitory effect, most likely due to humic acids, was observed in some samples. The OmniTaq LA DNA Polymerase from Klentaq performed better overall than the other polymerases and was less inhibited. After the extracted DNA had been amplified by Multiple Displacement Amplification (MDA) (Lasken 2007) and cleaned up, no inhibitory effect was observed in the PCRs.

Supplementary Table 3: A panel of novel qPCR primer pairs was developed (Mayer-Blackwell and Spormann, in prep.) to detect a large fraction of known reductive dehalogenase genes (rdhA) at a single annealing temperature and buffer chemistry. Non-redundant full-length and near full-length rdhA gene sequences curated in the protein family Pfam PF13486 as of July 2012 (Punta et al 2012) were clustered based on percent identity (PID) using an all-vs-all Blastp (Altschul et al 1990). The sequences considered ranged between 350 and 700 amino acids in length. Specific assays were designed for 50 reference sequences, each with at least one known high-PID homolog. Thousands of candidate primer pairs were generated using primer3 (Rozen and Skaletsky 2000) and filtered based on complementarity to at least 3 distinct sequences sharing high PID with the reference sequence. Where possible, lack of complementarity to homologs with lower PID was used as an additional criterion in assay selection. A second class of general capture assays was selected based on complementarity to the reference sequence and to as many homolog sequences as possible.

Supplementary Table 4: DNA samples applied to the nanoliter-qPCR chip (Wafergen). Values represent the mean of the replicates. Ct values for which multiple peaks were detected were counted as non-detected (MP). A Ct value greater than 28 was considered a non-ID. Some primer sets showed good Ct values in water, probably due to dimer formation, and were therefore excluded. Sample 1: control; sample 2: purified MDA products of single Dsc cells # 1, 2 and 3; sample 3: 33 purified MDA products of SAGs that showed 16S rRNA amplification with broad eubacterial or archaeal primers; sample 4: MDA products of all SAGs from plate #1, not purified; sample 5: MDA products of all SAGs from plate #2, not purified; sample 6: extracted non-amplified DNA from Peru Margin site 1230, collected at 7.3 mbsf; sample 7: extracted non-amplified DNA from multiple depths of Peru Margin (ODP Leg 201) sites 1227, 1229 and 1230; sample 8: extracted non-amplified DNA of samples from the sites in Futagami et al. 2009, Table 1.
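The calling rules above can be summarized in a short sketch. The Ct cutoff of 28 and the treatment of multiple peaks come from the legend; how replicates are combined is an assumption made here for illustration (any replicate with more than one peak marks the assay as MP, otherwise the mean Ct is compared with the cutoff). The function name and the input layout are hypothetical.

```python
from statistics import mean

CT_CUTOFF = 28.0  # Ct values above this are treated as non-ID


def summarize_assay(replicates):
    """Summarize one primer set on one sample.

    replicates: list of (ct, n_peaks) tuples, one per replicate reaction.
    Returns "MP", "non-ID", or the mean Ct as a float.
    Assumption: a single multi-peak replicate is enough to call MP.
    """
    if any(n_peaks > 1 for _, n_peaks in replicates):
        return "MP"
    mean_ct = mean(ct for ct, _ in replicates)
    if mean_ct > CT_CUTOFF:
        return "non-ID"
    return mean_ct


print(summarize_assay([(24.0, 1), (26.0, 1)]))  # mean Ct below the cutoff
print(summarize_assay([(24.0, 1), (25.0, 2)]))  # one replicate has two peaks
print(summarize_assay([(29.0, 1), (31.0, 1)]))  # mean Ct above the cutoff
```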

Supplementary Table 5: Statistics of different assembly strategies for Dsc1 sequencing data. Different strategies were applied to assemble the reads of the individual cells, and later to combine single cells Dsc # 2 and # 3, in order to get the most out of the sequencing data. Statistics were checked with assemblathon (Earl et al 2011). It is noteworthy that good assembly statistics do not automatically mean that the assembly is optimal. Assemblies were therefore always run through the RAST pipeline (Aziz et al 2008) to check for misassemblies, such as gene duplications, and to ensure a maximum number of coding sequences. A. CLC bio ( B. spades 2.3 (Bankevich et al 2012); C. spades-n (Bankevich et al 2012); D. velvet-sc, kmer=37 (Chitsaz et al 2011); E. velvet-sc n (Chitsaz et al 2011); F. Celera (CA) (Miller et al 2008); G. Hybrid error correction method using CA assembled Illumina® data to correct long PacBio® reads (Koren et al 2012); H. velvet assembly using Euler correction, kmer=55 (Chitsaz et al 2011); I. spades assembly of Illumina®-only combined via PCAP with CA assembly of PacBio® corrected by PacBio® only; J. velvet-sc assembly of Illumina®-only combined via PCAP with CA assembly of PacBio® corrected by PacBio® only; K. velvet-sc assembly of Illumina®-only combined via PCAP with CA assembly of PacBio® corrected by Illumina®-only (Huang et al 2003); L. spades assembly of Illumina®-only combined via PCAP with CA assembly of PacBio® corrected by Illumina®-only (Huang et al 2003, Koren et al 2012, Miller et al 2008); n = Normalization of the Illumina® reads (
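As an illustration of the kind of statistic compared across assembly strategies, the sketch below computes N50 from a list of contig lengths. The legend does not state which statistics were tabulated, so N50 is used here only as a commonly reported example; the function name and the contig lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length of the contig at which the running total of
    contig lengths, taken from longest to shortest, first reaches
    half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running_total = 0
    for length in lengths:
        running_total += length
        if running_total >= half_total:
            return length
    return 0  # empty assembly


# Hypothetical contig lengths in base pairs.
print(n50([100, 80, 50, 30, 20, 10]))
```

A high N50 alone says nothing about misassemblies, which is why a separate check such as the RAST annotation step remains useful.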

Supplementary Table 6: Phylogenetic distribution of Dsc1 genes based on the distribution of best BLAST hits of its protein-coding genes. The hit genome count is shown in brackets at 30%, 60%, and 90% BLAST identities. Unassigned genes are those whose hits fall below the percent identity cutoff (30%), those with no best hit at the cutoff, or those with no hits at all (Markowitz et al 2010).

Supplementary Table 7: Phylogenetic distribution of DscP2 genes based on the distribution of best BLAST hits of its protein-coding genes. The hit genome count is shown in brackets at 30%, 60%, and 90% BLAST identities. Unassigned genes are those whose hits fall below the percent identity cutoff (30%), those with no best hit at the cutoff, or those with no hits at all (Markowitz et al 2010).