Ms. No. TIMI-D-08-00185 Final Version

Supplementary Material

This section explores the apparent DNA replication and repair functionality in the eleven insect and bivalve symbiont genomes analysed in this study. It also discusses whether phage recombinases might contribute to genome reduction in the early phase of genomic erosion in these species. This material could not be accommodated within the article itself due to space constraints and efforts to maintain the overall focus of the review.

DNA replication and repair in mutualistic bacteria

The processes of recombination, replication and repair are fundamentally interconnected1,2. Hence an appraisal of the DNA replication and repair genes present in all eleven γ-proteobacterial symbiont genomes is merited (Table S1).

With the exception of a few accessory subunits (χ, ψ, θ), the majority of the symbionts appear to have a fully intact DNA polymerase III apparatus for chromosome replication3. How the replisome is loaded onto the template, however, must differ significantly from the situation in E. coli. For instance, Blochmannia, Baumannia and Wigglesworthia lack an orthologue of the origin initiation protein DnaA (Table S1a). Together with DnaC (also missing in all but Buchnera and Sodalis), DnaA facilitates duplex opening and loading of the DnaB replicative helicase at a single point, oriC, in the bacterial chromosome4. DnaA is only dispensable in E. coli if the RNase HI activity is eliminated5. This ribonuclease normally removes RNA from RNA-DNA hybrids at the oriC origin, which would otherwise allow priming of DNA synthesis6. However, all retain an rnhA gene, although it is possible that RNA-primed replication initiation may be retained because other ribonucleases are absent from these species. Recombination-dependent replication can also compensate for the absence of DnaA in E. coli6,7. This may explain why Wigglesworthia and Baumannia retain a functional RecA recombinase. Other mechanisms, however, must be feasible since Blochmannia lacks DnaA, DnaC and RecA. The elevated A+T content in mutualistic symbiont genomes (Table 1) could alleviate the inherent barrier to replication initiation evident in those organisms with a higher G+C nucleotide composition. Alternatively, these bacteria may have adopted some form of plasmid replication, which could also account for an increase in chromosomal copy number.

Recruitment of DnaB for the replication restart pathway in E. coli also requires appropriate mediator proteins, typically the primosomal assembly factors PriA/PriB/DnaT or PriC, together with DnaC8. Significantly, several endocellular symbiotic bacteria lack all of these primosome components (Table S1a). In E. coli, the need for the PriA branch-specific helicase can be circumvented by mutations in DnaC that permit direct loading of DnaB9,10, a scenario potentially applicable to Buchnera. However, the patchy distribution of DnaC and PriA supports an alternative means of loading DnaB for initiation at both origin and stalled or collapsed replication forks. For those forks that fail to progress, symbiotic bacteria may simply tolerate the loss of partially completed replicons. Buchnera and Blochmannia contain about 100 genomic copies per cell11,12 and, if this feature is true of other mutualists, discarding a few chromosomes per cell cycle may have little impact on the production of viable progeny.
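The argument that losing a few partially completed replicons may matter little when about 100 genomic copies are present can be illustrated with a simple calculation. The sketch below is not taken from the study: it assumes that each copy fails to complete replication independently and with the same probability, and the failure probability used (0.05) is purely hypothetical.

```python
def expected_intact_copies(copies: int, p_fail: float) -> float:
    """Expected number of chromosome copies that complete replication."""
    return copies * (1.0 - p_fail)


def prob_no_intact_copy(copies: int, p_fail: float) -> float:
    """Probability that every copy fails, assuming independent failures."""
    return p_fail ** copies


if __name__ == "__main__":
    copies = 100   # approximate copy number reported for Buchnera and Blochmannia
    p_fail = 0.05  # hypothetical per-copy failure probability (an assumption)
    print("expected intact copies:", expected_intact_copies(copies, p_fail))
    print("probability of no intact copy:", prob_no_intact_copy(copies, p_fail))
```

Under these assumptions the chance of a cell being left without a single completed chromosome is vanishingly small, which is the intuition behind the tolerance argument; real fork failures are unlikely to be fully independent.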

In addition to recombinational processes, all organisms rely on a further set of repair pathways to eliminate lesions likely to obstruct DNA replication or lead to mutation. Defects in more than one pathway of repair are usually synergistic, a feature worth considering when evaluating an ability to cope with corruption of the genome in the absence of genetic recombination. All of the mutualistic symbionts appear to have a restricted capacity to deal with small non-bulky adducts and mismatches (Table S1b), although genes for certain repair pathways have been retained in some cases, notably (i) base excision repair (Ung, MutY, Nfo and Nth) to remove damaged or incorrect nucleotides incorporated in DNA13, (ii) mismatch repair (MutS and MutL) to target repair to the newly synthesised strands following replication, although several lack mutH and dam genes normally found in E. coli but dispensable in other species14 and (iii) direct repair pathways including DNA ligase (LigA) to seal nicks in duplexes and DNA photolyase (Phr) to eliminate cyclobutane pyrimidine dimers15,16. These symbionts generally lack other base excision repair enzymes that deal with oxidative damage to nucleotides (e.g. MutT and MutM) and may therefore be sensitive to reactive oxygen species17. It seems likely that the insect symbionts are exposed to oxygen radicals given they possess their own genes for aerobic respiration, while lacking those for fermentation and anaerobic respiration, and are housed within cells that harbour mitochondria18. The elevated level of transcription of two antioxidant enzymes, SodA and AhpC, in Blochmannia floridanus12 is consistent with an effort to defend the genome against products of oxygen metabolism. Mutualistic bacteria from marine bivalves are probably exposed to both oxic and anoxic conditions19; in the latter case, utilisation of nitrate as an electron acceptor instead of oxygen would still generate reactive species with damaging effects on nucleic acids. 
Defects in base excision repair pathways frequently favour a GC to AT mutational bias20 and loss of these activities most likely explains the emergence of a high A+T content in mutualist genomes21. All of the symbionts are also devoid of methyltransferases (Ada, AlkA, Ogt), so they are likely to have limited repair capacity against alkylating agents.
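The link between a GC to AT mutational bias and a high A+T content can be illustrated with a standard two-state mutation model. This is a sketch under stated assumptions rather than an analysis from the study: selection is ignored, every site is treated identically, and the per-site rates u (GC to AT) and v (AT to GC) are hypothetical values chosen for illustration. In this model the A+T fraction converges to u / (u + v).

```python
def equilibrium_at_fraction(u_gc_to_at: float, v_at_to_gc: float) -> float:
    """Closed-form equilibrium A+T fraction for the two-state model."""
    return u_gc_to_at / (u_gc_to_at + v_at_to_gc)


def simulate_at_fraction(at0: float, u: float, v: float, generations: int) -> float:
    """Iterate the expected A+T fraction one generation at a time."""
    at = at0
    for _ in range(generations):
        # GC sites (1 - at) convert to AT at rate u; AT sites revert at rate v.
        at = at + (1.0 - at) * u - at * v
    return at


if __name__ == "__main__":
    u, v = 3e-3, 1e-3  # hypothetical rates with a threefold GC-to-AT bias
    print("closed form:", equilibrium_at_fraction(u, v))
    print("simulated:  ", simulate_at_fraction(0.5, u, v, 10_000))
```

With a threefold bias the model settles at an A+T fraction of about 0.75 regardless of the starting composition, showing how a shift in mutational spectrum alone could drive base composition.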

Conspicuous by their absence are the genes for UvrABC-mediated nucleotide excision repair, a versatile system for removal of bulky lesions by excising and replacing short tracts of DNA22. E. coli strains deficient in both recA and uvrA are extremely sensitive to DNA damage, with a single photoproduct in the chromosome thought to be lethal23. Blochmannia species lack both of these repair pathways, meaning they may be highly susceptible to agents that induce bulky adducts, such as ultraviolet light. The presence of DNA photolyase in two Buchnera strains (Ap and Bp) and Wigglesworthia (Table S1b) implies that UV photoproducts do occur and that sunlight can penetrate the host insect’s abdomen; light is needed to induce the initial cyclobutane pyrimidine dimer (254 nm) and for nucleotide restoration by DNA photolyase (350-450 nm)16. Surprisingly, given the absence of nucleotide excision repair, several of the symbionts do possess an orthologue of E. coli Mfd (Table S1b), a large, multifunctional enzyme responsible for recruiting UvrA and releasing RNA polymerase stalled at a lesion in DNA24. Mfd directs repair to the transcribed strand, ensuring efficient gene expression and DNA replication by disengaging both polymerase and transcript from the template. Mfd in mutualistic bacteria could still function in this capacity by allowing the replication apparatus to bypass the lesion or repair enzymes other than UvrABC to gain access. Consistent with this idea, the N-terminal UvrA-binding domain is missing from Buchnera and Wigglesworthia Mfd proteins (Table 2c and Figure S2b), whereas the central RNA polymerase interaction domain (residues 482-603 in E. coli Mfd) is preserved along with the superfamily 2 helicase motifs required for Mfd translocation on duplex DNA25. An N-terminal deletion of E. coli Mfd, removing residues 1-378, can dissociate RNA polymerase and transcripts from DNA templates but cannot interact with UvrA26.

In summary, little is known about how efficiently symbionts replicate their DNA and repair damage to it, primarily because their unique mutualistic lifestyle makes such experiments difficult to perform. From the information garnered on the repair networks they do possess, coupled with an estimated high mutation rate relative to their free-living relatives27,28, it seems their capabilities are severely constrained.

Do phage recombinases contribute to genome reduction?

Enterobacterial species commonly carry prophages or cryptic prophage-like elements and many of these possess their own recombination functions. Normally these phage enzymes are repressed in the lysogenic state or, in the case of cryptic elements, lack the necessary signals for transcription. However, their expression can be activated by mutations or rearrangements that generate a functional promoter, meaning they can have an impact on bacterial genome repair and evolution. All eleven symbiont genomes (Table 1) were probed with a selection of bacteriophage proteins known to participate in either initiation or resolution of genetic recombination. Consistent with the absence of phage sequences among obligate bacterial symbionts, no trace of λ Rap or β, E. coli Rac prophage RecE, P22 Erf, T4 UvsX, UvsY or endonuclease VII and T7 endonuclease I was detected. However, homologues of Exo and Orf from phage λ, RecT from E. coli cryptic prophage Rac and RusA from E. coli defective lambdoid prophage DLP12 were found exclusively in Sodalis. The two homologues of λ Exo, a 5’-3’ exonuclease active on DNA ends, both contain frameshift mutations. Sodalis does, however, carry an intact RecT protein sharing 60% identity over 237 residues. RecT is a ssDNA annealing protein29 and together with RecE comprises an end joining system functionally analogous to the λ Red recombination system30,31. Assuming that the Sodalis RecT is expressed, it is conceivable that it stimulates recombinational splicing by annealing ssDNA exposed by bacterial exonucleases. The relaxed fidelity inherent in these reactions could boost genome attrition by encouraging ectopic fusions between dispersed repetitive sequences. It may be significant that E. coli strains expressing RecET in combination with an sbcC mutation exhibit a substantial increase in the rate of inversions at small inverted repeats32.
Sodalis also carries six predicted copies of the Orf (NinB) protein (sharing 30-33% identity with the λ protein). Genetic studies indicate that Orf substitutes for the RecF, RecO and RecR proteins in loading the RecA recombinase33. Multiple copies of Orf in a cell could substantially improve RecA nucleation efficiency in various contexts, leading to an accelerated rate of genetic rearrangement. Finally, Sodalis has six homologues (ranging from 34-42% identity) of the RusA Holliday junction resolvase34. Any or all of these three recombination activities may already be participants in the ongoing genome degeneration process apparent in Sodalis. It is certainly plausible that phage recombination and repair enzymes contribute to the rearrangements and losses occurring in the early phase of symbiosis before they in turn succumb to the attrition process.
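Identity values such as those quoted above for RecT, Orf and RusA are typically derived from a pairwise alignment. The source does not state how its figures were calculated, so the following is only a minimal sketch of one common definition: identical residues divided by the number of aligned columns in which neither sequence has a gap. The function name and the toy sequences are hypothetical, and other tools may use a different denominator.

```python
from typing import Tuple


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> Tuple[float, int]:
    """Return (percent identity, columns compared) for two aligned sequences.

    Columns where either sequence carries a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped column: excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared, compared


if __name__ == "__main__":
    # Toy alignment for illustration only; not real RecT, Orf or RusA sequence.
    pid, n = percent_identity("MKT-AYLE", "MRTQAY-E")
    print(f"{pid:.1f}% identity over {n} residues")
```

Reporting the number of columns compared alongside the percentage mirrors the "60% identity over 237 residues" style used in the text and makes short, spurious matches easier to spot.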

Figure legends

Figure S1. Comparison of the genetic organisation between Escherichia coli K12 (Eco) and Buchnera aphidicola Ap (Ap) at recBCD, recF, recO, recR and ruvABC loci. Orthologues present in both species are coloured accordingly. Gene orientations are indicated by open triangles and the diagram is drawn to scale.

Figure S2. Deletions affecting RecD and Mfd proteins in mutualistic bacteria. (a) Crystal structure of E. coli RecBCD complexed with dsDNA35 highlighting the region missing in Wigglesworthia RecD. DNA strands are separated and pass through channels within the interior of the protein eventually emerging close to the nuclease active site (red). The C-terminal segment (residues 335-608; cyan) is absent from the Wigglesworthia RecD subunit and results in the loss of domain 3 corresponding to superfamily 1 helicase domain 2A35. Removal of this region could block 5’-3’ helicase translocation by this subunit and impact on the processivity of the bidirectional helicase complex. The 5’ strand of a duplex end would normally be fed past this C-terminal region on its journey to the nuclease domain of RecB. In Wigglesworthia this 5’ tail could be spooled out from the side of the complex and removed by RecJ. This fits with the fact that an E. coli RecBC complex, missing the RecD subunit, is deficient in nuclease activities but retains helicase and RecA-loading functions without the need for χ sites36. Furthermore, the recombination proficiency of recD mutants depends on the presence of the RecJ 5’-3’ ssDNA exonuclease37,38. (b) Crystal structure of E. coli Mfd25 highlighting regions missing in Wigglesworthia and strains of Buchnera. The smallest deletion removes 351 residues (cyan), while the largest affects additional residues from 352-481 (blue). The C-terminal segment (residues 482-1148; green) encompasses domains important for helicase activity and interaction with RNA polymerase25.

References

1. Kowalczykowski, S.C. (2000) Initiation of genetic recombination and recombination-dependent replication. Trends Biochem. Sci. 25, 156-165

2. McGlynn, P., and Lloyd, R.G. (2002) Recombinational repair and restart of damaged replication forks. Nat. Rev. Mol. Cell Biol. 3, 859-871

3. O'Donnell, M. (2006) Replisome architecture and dynamics in Escherichia coli. J. Biol. Chem. 281, 10653-10656

4. Kaguni, J.M. (2006) DnaA: controlling the initiation of bacterial DNA replication and more. Annu. Rev. Microbiol. 60, 351-375

5. Kogoma, T., et al. (1985) Function of ribonuclease H in initiation of DNA replication in Escherichia coli K-12. Mol. Gen. Genet. 200, 103-109

6. Kogoma, T. (1997) Stable DNA replication: interplay between DNA replication, homologous recombination, and transcription. Microbiol. Mol. Biol. Rev. 61, 212-238

7. Magee, T.R., et al. (1992) DNA damage-inducible origins of DNA replication in Escherichia coli. EMBO J. 11, 4219-4225

8. Heller, R.C., and Marians, K.J. (2005) The disposition of nascent strands at stalled replication forks dictates the pathway of replisome loading during restart. Mol. Cell 17, 733-743

9. Sandler, S.J., et al. (1996) Differential suppression of priA2::kan phenotypes in Escherichia coli K-12 by mutations in priA, lexA, and dnaC. Genetics 143, 5-13

10. Xu, L., and Marians, K.J. (2000) Purification and characterization of DnaC810, a primosomal protein capable of bypassing PriA function. J. Biol. Chem. 275, 8196-8205

11. Komaki, K., and Ishikawa, H. (1999) Intracellular bacterial symbionts of aphids possess many genomic copies per bacterium. J. Mol. Evol. 48, 717-722

12. Stoll, S., et al. (2008) Transcriptional profiling of the endosymbiont Blochmannia floridanus during different developmental stages of its holometabolous ant host. Environ. Microbiol. in press

13. Seeberg, E., et al. (1995) The base excision repair pathway. Trends Biochem. Sci. 20, 391-397

14. Modrich, P. (1989) Methyl-directed DNA mismatch correction. J. Biol. Chem. 264, 6597-6600

15. Wilkinson, A., et al. (2001) Bacterial DNA ligases. Mol. Microbiol. 40, 1241-1248

16. Park, H.W., et al. (1995) Crystal structure of DNA photolyase from Escherichia coli. Science 268, 1866-1872

17. Demple, B., and Harrison, L. (1994) Repair of oxidative damage to DNA: enzymology and biology. Annu. Rev. Biochem. 63, 915-948

18. Shigenobu, S., et al. (2000) Genome sequence of the endocellular bacterial symbiont of aphids Buchnera sp. APS. Nature 407, 81-86

19. Newton, I.L., et al. (2008) Comparative genomics of vesicomyid clam (Bivalvia: Mollusca) chemosynthetic symbionts. BMC Genomics 9, 585

20. Horst, J.P., et al. (1999) Escherichia coli mutator genes. Trends Microbiol. 7, 29-36