Symbiosis with an endobacterium increases the fitness of a mycorrhizal fungus, raising its bioenergetic potential

Alessandra Salvioli1, Stefano Ghignone2, Mara Novero1,Lorella Navazio3, Francesco Venice1, Paolo Bagnaresi4, Paola Bonfante*1

Summary

The Supplementary Information includes Supplementary Materials containing Methods and Results, eleven Supplementary Figures and twelve Supplementary Tables.

Supplementary Materials and Methods

RNA extraction and sample preparation for sequencing

Samples belonging to Data Set 1 (see below) were subjected to a normalization procedure that equalizes transcript concentrations to increase the sequencing efficiency of rare transcripts. First, cDNA was obtained from 500 ng of total RNA for each sample using the SMARTer PCR cDNA synthesis kit (Clontech, Mountain View, CA, USA). Then the cDNA libraries were normalized using the Trimmer-2 cDNA normalization kit (Evrogen, Moskow, Russia) following the manufacturer's instructions. The normalized cDNAs were purified using the QIAquick PCR purification kit (Qiagen, Germany) and 1.5 μg of clean material was sent to the sequencing company (Fasteris, Geneva, Switzerland), where independent indexed libraries were prepared and run in a single Illumina Hiseq channel.

For Data Set 2 (not normalized, see below), 1.4 μg of total RNA was sent to the sequencing facility for library construction and Illumina Hiseq sequencing.

Generation of Data Sets 1 and 2 and de novo transcriptome assembly

The normalized paired-end libraries were generated with the aim of enriching the dataset for rare transcripts, whereas the single-end libraries were constructed to investigate the differential gene expression in the presence and in the absence of the endobacterium.

Data Set 1 was composed of four in vitro normalised paired-end libraries, obtained from the B+ line of G. margarita, sampled in the following stages of the fungal life cycle: quiescent spores (GOU-13), germinating spores (GOU-14), spores treated with strigolactone (GOU-15), and extraradical mycelium (GOU-16).

The total number of reads in channel (PE) was 188'118'376 for a total 18'811'837'600 bases, with an average Q30 of 85.6%. A custom script, derived from Cutadapt (Martin M., 2011)(courtesy P. Giannini & M. Malavasi) was used to remove any remaining sequences from the primers used in library construction. Read quality was then checked by FastQC script, and since quality scores dropped below 25, TrimmomaticPE (Bolger et al., 2014)was used to trim poor quality ends from each pair (TrimmomaticPE settings: LEADING:10 TRAILING:10 SLIDINGWINDOW:4:25 MINLEN:25). After quality trimming, 76.00 to 79.13% of both paired ends remained (Table S11).

Data Set 2 was composed of 14 single-end libraries, obtained from the B+ and B- lines of G.margarita, sampled in the following stages of the fungal life cycle: germinating spores (B+: GDR-25/26/27; B-: GDR-28/29/30), spores treated with strigolactone (B+: GDR-31/32; B-: GDR-33/34), and mycorrhizal roots (B+: GDR-35/36; B-: GDR-37/38). According to the ENCODE Consortium Standards, Guidelines and Best Practices for RNA-Seq (v1.0), a minimum of two replicates were collected for critical samples. (The total number of reads was 357,777,717, with an average Q30 of 91.42%. No quality trimming was required for this dataset.

The de novo assembly of Data Set 1 and 2 libraries was performed on a 60 core and 256 GB RAM machine, running Ubuntu server 12.04 LTS, using Trinity v.Trinityrnaseq_r20131110 (Grabherr et al., 2011). First trials indicated that the available amount of memory was not sufficient to handle all the raw reads and, following the Trinity manual, we performed in silico reads normalization for each of the libraries from Data Set 1 and 2, to a max coverage of 30. Normalization results are summarized in Table S11. Libraries from myc tips (GDR-35 to GDR-38) were not subjected to normalization and were not used for the de novo assembly, since only a fraction of reads were ascribable to the fungal transcriptome (13.4%, 17.7%, 16.6%, 7.6%), whereas the majority of reads were from the plant host (Lotus). The Data Set 1 and 2 read counts are reported in Table S11.

All the normalised single-end Data Set 2 libraries were merged together with the paired-end Data Set 1 left ends.

Trinity was run with the following characterizing options, suited to assemble a gene-dense compact genome, such as a fungal genomes, and to minimize the number of isoforms per transcript: Trinity.pl --seqType fq --CPU 30 --JM 150G --min_contig_length 350 --jaccard_clip --min_kmer_cov 2 --CuffFly --group_pairs_distance 300 –extended_lock.

Transcript abundance estimation (computing expression values) was performed using RSEM, using the original Data Set 2 reads (not in silico normalized), which were mapped against the Trinity transcripts. RSEM was executed to estimate expression values based on the resulting alignments. All the libraries from Data Set 2 were used in this step, including the four libraries (GDR-35 to GDR-38) from mycorrhizal roots not used for the de novo assembly. Identification and Analysis of Differentially Expressed Trinity Genes and Transcripts were performed as described below.

Further downstream analyses were performed on nucleotide sequences of transcripts or on likely coding regions, extracted from Trinity transcripts using TransDecoder, including PFAM domain searches as ORF retention criteria. Sequence similarities were captured with BLAST suite programs, using an e-value of 1E-5, querying all Trinity transcripts or peptide predictions against the refseq_protein and R. irregularis protein databases. Protein domains were identified using HMMER and TransDecoder. SignalP was used to predict signal peptides and tmHMM was used to predict transmembrane regions. RNAMMER was used to identify rRNA transcripts. All downstream analyses were loaded into a Trinotate SQLite Database. Whole Trinity transcript annotation was performed with Blast2Go.

Since the smallest contig size was set to 350 bp to reduce the number of isoforms per component, the current assembly is not well suited to investigate the effector-like and secreted protein repertoire of genes encoded by the fungus. Only one transcript (comp7520_c0_seq1) annotated as secreted protein was in fact found in the Best Reciprocal Hit (BRH) dataset with Rhizophagus (transcript ID:336770). Nevertheless, the list of the 1,340 transcripts predicted in G. margarita as putatively secreted proteins by SignalP is reported in Table S12.

GOslim representations (based on S. pombe GOslim, Figure S10) were produced to summarize G. margarita biological processes and pinpoint highly represented GO groups.

GO enrichment analyses

GO enrichment analyses were conducted with the goseq bioconductor package version 1.14.0, while KEGG pathway pictures, KO (KEGG Orthology) mappings were first obtained from KAAS (KEGG Automatic Annotation Server; using as query trinity-assembled Gigaspora transcripts against the KEGG GENES database. All the details are provided in the Supplemental material.

The Goseq package is specifically designed to limit length bias, which may affect RNA-seq data(Secco et al., 2012). Gene lengths were obtained from the Trinity assembler output file (i.e. genes_feature_lengths file).

GO mappings for each gene were obtained by first running BLASTX against the RefSeq_protein database (December 2013) using as query trinity-assembled isoforms (86,183 isoforms) with an Expect value of 1e-05. The resulting XML file was then loaded in Blast2GO (Gotz et al., 2008) and annotated with the following settings: E-value filter: 1.0E-6; Annot_cutoff: 55; GO _weight: 5; HSP-Hit_Cov_cutoff: 0). GO terms associated with isoforms were exported from Blast2GO and collapsed over the corresponding genes (4,554 unique genes with at least one GO). An FDR cutoff of 0.1 was used as goseq parameter for GO enrichments.

KEGG maps

In order to obtain KEGG pathway pictures, KO (KEGG Orthology) mappings were first obtained from KAAS (KEGG Automatic Annotation Server; using as query trinity-assembled Gigaspora transcripts against the KEGG GENES database. The selected fungal database organisms were: uma, spo, tml, pcs, ang, ani, bfu, ncr, mgr, fgr, sce, yli, ecu, mgl, mpr, ppl, cne, ure, cpw, aor, afm, ssl, ago, kla, ppa, vpo, cgr, and dha.

The resulting KO mappings were collapsed to genes and combined with the whole expression data estimated by DESeq2. The Bioconductor package pathview version 1.2.3 (Luo & Brouwer, 2013) was used to generate relevant KEGG pathway pictures.

Phosphate measurements in roots and shoots from mycorrhizal clover plants

About 5 g of frozen roots and shoots from clover plants mycorrhized with B+ or B- G. margarita were dried at 70°C for 48 hours, and digested in 2 ml 6 M HNO3at 90°C for one hour. The digestion product was diluted to 6 ml with milliQ water and filtered. The P contents were determined using Inductively Coupled Plasma Atomic Emission Spectrometry (ICP-AES) performed using a Liberty 100 Varian apparatus equipped with a VGroove nebulizer and a Czerny–Turner monochromator (Department of Mineralogical and Petrological Science, University of Turin).

Supplementary Results

The de novo assembly

We found that most of the transcripts described as key features of the R. irregularis genome (Liu et al., 2013; Tisserant et al., 2013)were expressed by G. margarita as well. In addition to the features illustrated in the main text, the following traits are noteworthy.

For primary metabolism, the enzymes involved in glycolysis, gluconeogenesis, and TCA cycle were all expressed at high levels; galactose metabolism was also highly represented. The genes for the metabolism of purines, pyrimidines, and amino acids were all well represented in the transcriptome of G. margarita at the asymbiotic phase. By contrast, some of the central enzymes involved in fatty acid (FA) biosynthesis seemed not to be expressed, but FA degradation was highly represented. The genes involved in cell cycle and ribosome biogenesis were well represented in the global transcriptome, as were the genes for ubiquitin-mediated proteolysis, proteaseome, and the sec-dependent pathway. Also the BTB/POZ domain, a common structural domain, was over-represented in both lines. The BTB/POZ domain is a versatile protein-protein interaction motif that participates in a wide range of cellular functions, including transcriptional regulation, cytoskeletal dynamics, ion channel assembly and gating, and targeting proteins for ubiquitination (Stogios et al., 2005).

We found that other features were shared with mycorrhizal fungi whose genomes had been sequenced. The expansion of the tyrosine kinase-encoding gene family involved in signaling pathways was also a key feature of the ectomycorrhizal L. bicolor genome (Martin & Selosse, 2008). The Sel1 repeat domain is a subclass of the tetratricopeptide repeat region (TPR), a structural motif present in a wide range of proteins, which are also overrepresented in G. margarita transcriptome. The TPR domain mediates protein-protein interactions and the assembly of multiprotein complexes (Blatch & Lassle, 1999). Proteins containing TPRs are involved in a variety of biological processes, such as cell cycle regulation, transcriptional control, mitochondrial and peroxisomal protein transport, and protein folding (D'Andrea & Regan, 2003). Sel1-like repeat proteins have been shown to be involved in signal transduction (Mittl & Schneider-Brachert, 2007). In fungi, bimA, a member of the tetratricopeptide repeat family of proteins, has been found to be essential for the completion of mitosis in A. nidulans (Odonnell et al., 1991); the TPR domain of a serine/threonine phosphatase of A. oryzae was necessary for the full activity of the enzyme (Feng et al., 2007). Other evidences indicate that the TPR domain may be involved in virulence of fungal pathogens; C. neoformans with a mutation in the TPR-containing gene CCN1 failed to cause systemic infection in mice and regained the virulence when complemented with the wild-type CCN1 gene (Chung et al., 2003). Candida albicans Tcc1p, a TPR domain containing protein, interacts with Tup1p to regulate the morphological transition and virulence (Kaneko et al., 2006).

Quantitative analysis of differentially expressed genes: the presence of the endobacterium is more relevant during the presymbiotic phase

Due to the complexity of the experimental design (two fungal lines investigated at three stages, Figure S1) as a first step we performed a quantitative analysis of the differentially expressed genes (DEGs) to understand whether the main driver of gene expression in G. margarita depended on the life cycle stage or on the presence/absence of the endobacterium. When the whole number of DEGs was considered (B+ and B- lines considered together, purple arrows in Figure S1) as shown in Venn diagrams (Figure S12), we concluded that the major transcriptional changes take place during the transition from the asymbiotic to the symbiotic stage (1559 versus 8420 transcripts, respectively). This quantitative analysis shows that the association of G. margarita with its host plant leads to an important change in the fungal transcriptomic profile in agreement with the patterns already shown by other symbiotic fungi (i.e. Laccaria, Tuber, Rhizophagus) irrespective of their taxonomic position and of the mycorrhizal typology they support: Laccaria and Tuber are ectomycorrhizal fungi belonging to the Basidiomycota and Ascomycota, respectively (Martin & Selosse, 2008; Martin et al., 2010)while Rhizophagus is an arbuscular mycorrhizal fungus (Tisserant et al., 2013). Gigaspora seems therefore to follow the trend typical of mycorrhizal fungi as well as of other plant-interacting fungi, such as the endophyte Pirimorphospora indica(Zuccaro et al., 2011).

By contrast, when the data sets were investigated by comparing the fungus with and without the endobacterium, Venn diagrams (Figure S4), showed that the major changes in the transcriptome landscape related to the endobacterium presence were recorded during the pre-symbiotic phase, i.e. in the germinating stage (G) and strigolactone-treated condition (SL). As commented in the main text, the most DEGs were identified in the G condition, followed by SL (9609 and 3427 transcripts, respectively), while very few transcripts were differentially expressed during the symbiotic phase. The low number of fungal transcripts may have two explanations: the fungal transcripts may have been diluted by the massive number of plant transcripts, or, more likely, the low number is related to a limited impact of the endobacterium during the mycorrhizal phase. As demonstrated, the presence or the absence of the endobacterium does not have any impact on the success of the mycorrhization (Figure S5).

Gene expression and differential expression analysis between B+ and B- lines

In addition to the genes commented in the main text, other relevant differentially expressed genes were also detected:

Phosphate transporter: a sequence sharing a very high similarity with P transporters from filamentous fungi and yeasts, was downregulated in the B+ line (comp27005_c0). It shares the highest similarity with Pho87, a P transporter containing an SPX domain involved in Pi signaling and homeostasis (Secco et al., 2012). According with what we observed in our dataset, in Saccharomyces, the P transporters Pho87 and Pho84 were reported to behave in opposite ways, being active at high and low Pi levels, respectively (Secco et al., 2012).Interestingly, a similar trend was obtained for sulphur: three sulfate transporters of G. margarita were down regulated in the presence of the endobacterium, but one was consistently upregulated, also leading to a higher S content in the plant (not shown).

Starch and sucrose metabolism, and fatty acid beta oxidation appeared to be up-regulated in the G condition, while components of the spliceosome were rather up-regulated in both G and SL. Some genes involved in N-glycan biosynthesis were slightly repressed in SL, as well as many enzymes of the shikimate pathway. Confirming the GO analysis, nitrogen metabolism was downregulated by the endobacterium especially in the G condition, while sulfur metabolism appeared to be up-regulated exclusively in SL.

A transcript encoding a protein kinase C (K0: 2677), involved in the calcium-mediated signalling pathway was downregulated in SL. Along the same line, the phosphatidylinositol signaling system (KO: 04070) was downregulated.

Some genes encoding to two-component systems (KO: 02020) seem to be influenced by the endobacterial presence. In many bacteria the expression of phosphate-regulated genes is modulated by the two-component system PhoR–PhoP (Sola-Landa et al., 2003). In our transcriptome, a putative alkaline phosphatase phoA was downregulated in SL, and a phosphate sensing membrane transporter phoP was upregulated in the presence of the endobacterium.

The fungal-bacterial symbiosis provides protection from ROS induced by oxidative stress or strigolactone treatment

According to the behavior of other plant-and animal interacting fungi for Magnaporthe and Beauveria, spores were treated with variable concentrations of hydrogen peroxide (from 100 mM to 0.25mM) to check their germination responses after seven days. Spores (both cured and WT) treated with hydrogen peroxide at 100 mM, 10 mM, 2 mM, 1 mM, 0.75 mM and 0.5 mM did not germinate and in addition at 100 mM and 10 mM, the spore walls lost their natural coloration and became completely white. Spores start to germinate when treated with hydrogen peroxide 0.3 mM. This concentration of 0.3 mM was therefore selected for the transcriptomic experiments (see main text).

The phenotypic impact of SL on the two lines of G. margarita was checked again, revealing that the presence of the bacterium had some very subtle effects: the hyphae showed a curved/waving phenotype quite different from hyphae developing in water or from those originating from spores that do not contain the endobacterium.

References

Blatch GL, Lassle M. (1999) The tetratricopeptide repeat: a structural motif mediating protein-protein interactions. Bioessays21: 932-939.

Bolger AM, Lohse M, Usadel B. (2014)Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics30: 2114-2120.

Chung S, Mondon P, Chang YC, Kwon-Chung KJ. (2003) Cryptococcus neoformans with a mutation in the tetratricopeptide repeat-containing gene, CCN1, causes subcutaneous lesions but fails to cause systemic infection. Infect Immun71: 1988-1994.

D'Andrea LD, Regan L. (2003) TPR proteins: the versatile helix. Trends Biochem Sci28: 655-662.

Feng B, Zhao CH, Tanaka S, Imanaka H, Imamura K, Nakanishi K. (2007) TPR domain of Ser/Thr phosphatase of Aspergillus oryzae shows no auto-inhibitory effect on the dephosphorylation activity. Int J Biol Macromol41: 281-285.

Gotz S, Garcia-Gomez JM, Terol J, Williams TD, Nagaraj SH, Nueda MJ et al. (2008) High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic Acids Res36: 3420-3435.

Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I et al(2011) Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol29: 644-U130.

Kaneko A, Umeyama T, Utena-Abe Y, Yamagoe S, Niimi M, Uehara Y. (2006) Tcc1p, a novel protein containing the tetratricopeptide repeat motif, interacts with Tup1p to regulate morphological transition and virulence in Candida albicans. Eukaryotic Cell5: 1894-1905.