Regulation of gene expression in the CMD2 cassava locus

Abdulsalam Toyin1, Carluccio Anna Vittoria2, Rabbi Y. Ismail1, Gedil Melaku1, Gisel Andreas1,3 and Stavolone Livia*1,2

1) International Institute of Tropical Agriculture, PMB 5320, Oyo Road, Ibadan 200001, Oyo State, Nigeria

2) Institute for Sustainable Plant Protection, CNR, Bari, Italy

3) Institute for Biomedical Technologies, CNR, Bari, Italy

* Registrant ID# 286

Cassava mosaic disease (CMD) is the most severe and widespread viral disease of cassava. It is caused by different species of cassava mosaic geminiviruses, which induce a variety of foliar symptoms whose severity is inversely correlated with the number and size of the roots produced.

Cultivated cassava species are protected from CMD by a polygenic resistance introgressed from the wild species Manihot glaziovii and by a dominant resistance conferred by a single QTL, named CMD2, discovered in African landraces. Recent studies, based on sound statistical analysis, confirmed the narrow basis of the single CMD2 locus, but the gene(s) involved in the regulation of CMD resistance have not yet been uncovered. To identify known and unknown regulatory pathways involved in the mechanisms of CMD resistance, we performed total RNA sequencing of four cassava lines: two susceptible, TMEB117 and TMS-4(2)425, and two resistant, TMS-96/1089 and TMS-011412.

All plants were infested with Bemisia tabaci carrying African cassava mosaic virus (ACMV) and East African cassava mosaic virus (EACMV), and were kept confined until systemic symptoms appeared on the susceptible cassava lines. Total RNA was extracted and used to prepare TruSeq stranded total RNA libraries for Illumina sequencing.

We obtained between 28M and 46M paired-end reads per sample with a read length of 150 nt (average 116 nt after trimming). Mapping against the cassava genome yielded between 35% and 47% uniquely mapped reads. Cufflinks recovered between 3.6M and 4.6M splice events from the susceptible cassava lines, compared with between 2.0M and 2.3M splice events from the resistant lines.

We focus on the CMD2 locus, which contains 406 annotated transcripts, 338 of which we recovered by RNA-seq analysis. Of these, 255 transcripts were not differentially regulated between the two resistant lines. We selected this set of transcripts and subjected it to a differential expression analysis between susceptible and resistant lines, which revealed 52 differentially regulated transcripts. Gene ontology enrichment indicates that these gene products are involved in processes such as metabolism, transport and response to stress; that they cover functions such as catalytic and transporter activities and nucleic acid binding; and that they are mainly associated with, or integral components of, membranes.
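The successive filtering steps described above can be sketched as plain set operations. This is only an illustrative sketch: the function name, argument names and transcript identifiers are hypothetical, and the differential expression calls are assumed to have been produced beforehand by whatever tool was used in the analysis.

```python
def select_candidates(locus_ids, recovered_ids,
                      de_between_resistant, de_susceptible_vs_resistant):
    """Narrow the annotated transcripts of a locus down to candidates.

    All arguments are iterables of transcript identifiers (hypothetical
    inputs; the differential expression sets are assumed to be precomputed).
    """
    # Annotated transcripts of the locus that were recovered by RNA-seq.
    recovered = set(locus_ids) & set(recovered_ids)
    # Keep transcripts that behave the same in the two resistant lines.
    stable = recovered - set(de_between_resistant)
    # Of those, keep transcripts that differ between susceptible and resistant lines.
    candidates = stable & set(de_susceptible_vs_resistant)
    return recovered, stable, candidates


# Toy example with made-up identifiers.
recovered, stable, candidates = select_candidates(
    locus_ids=["t1", "t2", "t3", "t4", "t5", "t6"],
    recovered_ids=["t1", "t2", "t3", "t4", "x9"],
    de_between_resistant=["t2"],
    de_susceptible_vs_resistant=["t1", "t2", "t5"],
)
print(len(recovered), len(stable), len(candidates))
```

In the analysis reported here, the corresponding set sizes would be 406 annotated, 338 recovered, 255 stable between the resistant lines, and 52 candidates.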

We found that a subset of these transcripts was consistently up-regulated or down-regulated in both susceptible lines, TMEB117 and TMS-4(2)425. Confirmation of the differential expression levels of the transcripts with known functions and bioinformatic characterization of the transcripts with unknown function are currently underway.

Initial de novo transcript discovery revealed several non-annotated transcripts in the CMD2 QTL, which are presently being characterized and annotated.