Note 1: The tAI and the justification for using it as a predictor of the co-adaptation between codon bias and the tRNA pool

Because codon-anticodon coupling is not unique owing to wobble interactions, several anticodons can recognize the same codon, each with a different efficiency weight (see dos Reis et al. for the full set of codon-anticodon relations).

Let ni be the number of tRNA isoacceptors recognizing codon i. Let tGCNij be the copy number of the jth tRNA that recognizes the ith codon, and let Sij be the selective constraint on the efficiency of the codon-anticodon coupling. We define the absolute adaptiveness, Wi, for each codon i as:

$$W_i = \sum_{j=1}^{n_i} (1 - S_{ij})\,\mathrm{tGCN}_{ij}$$

From Wi we obtain wi, the relative adaptiveness value of codon i, by normalizing the Wi values (dividing each by the maximum of all 61 Wi).
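The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard form from dos Reis et al., Wi = sum over j of (1 - Sij) * tGCNij; the function names and the toy copy numbers and constraints are hypothetical and are not taken from Supplementary Table 2.

```python
from typing import Dict, List, Tuple

# For each codon: a list of (tGCN_ij, S_ij) pairs, one per recognizing tRNA.
Pairings = Dict[str, List[Tuple[float, float]]]


def absolute_adaptiveness(pairings: Pairings) -> Dict[str, float]:
    """W_i = sum_j (1 - S_ij) * tGCN_ij (assumed dos Reis et al. form)."""
    return {
        codon: sum((1.0 - s) * tgcn for tgcn, s in pairs)
        for codon, pairs in pairings.items()
    }


def relative_adaptiveness(W: Dict[str, float]) -> Dict[str, float]:
    """w_i = W_i divided by the maximal W over all codons supplied."""
    w_max = max(W.values())
    return {codon: value / w_max for codon, value in W.items()}


# Toy input with made-up numbers (illustrative only, three codons instead of 61).
toy = {
    "GAA": [(4, 0.0)],            # perfect pairing, 4 gene copies
    "GAG": [(4, 0.5)],            # wobble pairing with constraint 0.5
    "AAA": [(2, 0.0), (1, 0.5)],  # two tRNAs recognize this codon
}
W = absolute_adaptiveness(toy)
w = relative_adaptiveness(W)
```

In a real calculation the dictionary would hold all 61 sense codons, so the normalization would run over the full set as described in the text.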

The tAI of codon i is defined as wi. The similarity in tRNA pools between two organisms is defined as the non-parametric Spearman correlation between the two vectors (of length 61) of their codons' tAI (denoted by tRs).
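As an illustration of the tRs measure, the sketch below computes a Spearman correlation as the Pearson correlation of average ranks, using only the standard library. The function names are hypothetical, and the short toy vectors stand in for the length-61 vectors of codon tAI values.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def trs(w_a: Sequence[float], w_b: Sequence[float]) -> float:
    """Spearman correlation between two organisms' codon tAI vectors."""
    if len(w_a) != len(w_b):
        raise ValueError("tAI vectors must list the same codons in the same order")
    return pearson(average_ranks(w_a), average_ranks(w_b))


# Toy vectors (made-up values; real vectors have 61 entries).
same_order = trs([0.1, 0.5, 1.0, 0.3], [0.2, 0.6, 0.9, 0.4])
reversed_order = trs([0.1, 0.5, 1.0, 0.3], [0.9, 0.4, 0.2, 0.6])
```

Both vectors must list the codons in the same order; the correlation depends only on the ranking of codons within each organism, not on the absolute wi values.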

For a gene (a vector of codons), the tAI of gene g is the geometric mean of the wi values of all its codons:

$$\mathrm{tAI}_g = \left( \prod_{k=1}^{l_g} w_{i_{kg}} \right)^{1/l_g}$$

where ikg is the codon defined by the kth triplet of gene g, and lg is the length of the gene in codons (excluding stop codons). tRNA copy numbers of all organisms analyzed in this study appear in Supplementary Table 2. The tAI of a COG with more than one gene in a given organism is the mean tAI of all the corresponding genes.
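The gene-level and COG-level definitions can be sketched as follows. This is an illustrative sketch, not the code used in the study: the function names are hypothetical, the wi table is assumed to be keyed by DNA codons with strictly positive values, and the toy wi values are made up. The geometric mean is computed through logarithms to avoid numerical underflow on long genes.

```python
import math
from typing import Dict, Sequence

STOP_CODONS = {"TAA", "TAG", "TGA"}


def gene_tai(sequence: str, w: Dict[str, float]) -> float:
    """Geometric mean of w over the codons of a gene, stop codons excluded.

    Assumes every sense codon in the gene has a strictly positive w value.
    """
    seq = sequence.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial triplet
    codons = [seq[k:k + 3] for k in range(0, usable, 3)]
    codons = [c for c in codons if c not in STOP_CODONS]
    if not codons:
        raise ValueError("gene contains no sense codons")
    log_sum = sum(math.log(w[c]) for c in codons)
    return math.exp(log_sum / len(codons))


def cog_tai(gene_tais: Sequence[float]) -> float:
    """Arithmetic mean of the tAI values of the genes of one COG in one organism."""
    return sum(gene_tais) / len(gene_tais)


# Toy relative adaptiveness values (made up for illustration).
toy_w = {"ATG": 1.0, "GAA": 0.5, "GAG": 0.125}
tai_one = gene_tai("ATGGAAGAGTAA", toy_w)  # codons ATG, GAA, GAG
tai_two = gene_tai("GAAGAA", toy_w)        # codons GAA, GAA
tai_cog = cog_tai([tai_one, tai_two])
```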

The Sij values can be organized in a vector (S-vector) as described in (13); each component of this vector corresponds to one wobble nucleoside-nucleoside pairing: I:U, G:U, G:C, I:C, U:A, I:A, etc. The wi values for all codons (except stop codons) of all organisms analyzed in this study appear in Supplementary Table 2.

The tAI uses the genomic tRNA copy number (tGCN) as a surrogate for the cellular abundance of tRNAs; this choice is justified by several observations.

First, it has been observed in many organisms that the in vivo concentration of a tRNA bearing a certain anticodon is highly proportional to the number of gene copies coding for this tRNA type. Specifically, in S. cerevisiae a correlation of r = 0.91 was reported (1). In B. subtilis, a correlation of 0.86 between tRNA copy number and tRNA abundance was reported (2). Similarly, previous studies reported a significant correlation between genomic tRNA copy number and tRNA abundance in E. coli (3, 4). A related result is the analysis of (5), who measured the translation rate of the two glutamate codons, GAA and GAG. They found a threefold difference in translation rate between them (21.6 and 6.4 codons per second, respectively). Remarkably, the wi of these codons, which is based on the tRNA pool and the affinity of codon-anticodon coupling and is the basis for the tAI calculation, captures the ratio of translation rates between the two codons. Calculating wi values for E. coli, we found that the ratio between the wi of GAA and GAG is 3.125 (0.5/0.16), as compared to the 3.34 reported in the experiments (21.4/6.4). This result suggests a direct relation between the adaptation of a codon to the tRNA pool, based on the genomic tRNA copy number, and the time it takes to translate it.

Second, a recent study showed that in S. cerevisiae the promoters of many of the tRNA genes have a low predicted affinity for the nucleosome, suggesting constitutive expression with little capacity for transcriptional regulation (6). Thus, for fully sequenced genomes, the relative concentrations of the various tRNAs in the cell, and therefore the optimality of the various codons in terms of translation, can be approximated using the respective tRNA gene copy numbers in the genome. Additionally, the tAI has been shown to be highly correlated (r = 0.63 for S. cerevisiae) with protein expression levels (7, 8). It was found that even among genes with similar transcript levels, higher tAI often corresponds to higher protein abundance (7).

This definition stems from an early observation of a trend of increasing codon usage bias with increasing gene expression levels in a sample of E. coli genes (9), and from the finding that tRNA concentrations are rate limiting in the elongation of nascent peptides (10). The translation efficiency, as defined above, has also been shown to be correlated with translation rate and accuracy (11), phenotypic divergence of yeast species (7), and evolutionary rate (12), and to play a part in protein functionality (13).

Note 2: The evolutionary tree used in the analysis

((((((((Wsuccinogenes: 0.030586, (Hpylori: 0.011149, Hhepaticus: 0.013950): 0.002479): 0.002836, Cjejuni: 0.016420): 0.034409, ((Dvulgaris: 0.069430, (Gsulfurreducens: 0.078311, (Dpsychrophila: 0.078909, Bbacteriovorus: 0.071062): 0.000000): 0.000421): 0.013175, (((Mcapsulatus: 0.086560, (((Xfastidiosa: 0.009861, (Xcitri: 0.016158, (Xoryzae: 0.044481, Xcampestris: 0.025537): 0.002055): 0.132758): 0.053734, Cburnetii: 0.033290): 0.002253, ((Lpneumophila: 0.053660, Ftularensis: 0.028762): 0.005770, (((Pputida: 0.062120, (Psyringae: 0.095675, Pfluorescens: 0.095896): 0.021094): 0.022466, Paeruginosa: 0.095695): 0.155208, ((Parcticum: 0.027147, Acinetobacter: 0.073012): 0.023812, ((Iloihiensis: 0.037096, Cpsychrerythraea: 0.095506): 0.028625, ((Soneidensis: 0.076146, (Pprofundum: 0.075595, (((Vcholerae: 0.053065, Vvulnificus: 0.033163): 0.012388, Vparahaemolyticus: 0.036982): 0.031231, Vfischeri: 0.038869): 0.016545): 0.058093): 0.042984, ((Hducreyi: 0.012415, (Msucciniciproducens: 0.032385, (Pmultocida: 0.022254, Hinfluenzae: 0.017301): 0.001579): 0.009447): 0.018857, (((Wbrevipalpis: 0.008819, Bfloridanus: 0.005839): 0.002618, (Buchnera: 0.000004, Baphidicola: 0.001220): 0.002794): 0.031075, ((((Styphimurium: 0.005134, (Styphi: 0.036775, Senterica: 0.020241): 0.004276): 0.070544, (Sflexneri: 0.027910, Ecoli: 0.040651): 0.024059): 0.052160, ((Ypseudotuberculosis: 0.005792, Ypestis: 0.011419): 0.078435, Ecarotovora: 0.086397): 0.008092): 0.027947, Pluminescens: 0.071417): 0.056291): 0.000000): 0.007399): 0.003100): 0.012914): 0.000000): 0.009863): 0.000000): 0.000000): 0.000592, (((Nmeningitidis: 0.005382, Ngonorrhoeae: 0.003589): 0.045176, Cviolaceum: 0.131736): 0.004795, ((((Bpertussis: 0.000367, (Bparapertussis: 0.013541, Bbronchiseptica: 0.013472): 0.074202): 0.116872, ((Rsolanacearum: 0.070685, Reutropha: 0.136131): 0.040674, (Bpseudomallei: 0.051771, Bmallei: 0.000009): 0.133756): 0.045889): 0.033048, Neuropaea: 0.061117): 0.000000, (Daromatica: 0.077185, 
Azoarcus: 0.080537): 0.042089): 0.008270): 0.011144): 0.015162, (((CPelagibacter: 0.033449, (((Rtyphi: 0.000485, Rprowazekii: 0.000004): 0.000021, (Rconorii: 0.001678, Rfelis: 0.002791): 0.005886): 0.022874, (Wendosymbiont: 0.000826, (Eruminantium: 0.001741, Amarginale: 0.000592): 0.002516): 0.008060): 0.006976): 0.002251, ((Goxydans: 0.052032, Ccrescentus: 0.098234): 0.004696, ((Spomeroyi: 0.099204, (Rpalustris: 0.042518, Bjaponicum: 0.137031): 0.113653): 0.018250, ((Mloti: 0.130010, (Smeliloti: 0.091923, Atumefaciens: 0.082334): 0.044258): 0.094429, ((Bquintana: 0.000282, Bhenselae: 0.002224): 0.048182, (Bsuis: 0.000004, (Babortus: 0.006683, Bmelitensis: 0.017646): 0.007124): 0.108137): 0.000000): 0.004707): 0.022758): 0.017742): 0.000000, Zmobilis: 0.054477): 0.004746): 0.002222): 0.001821): 0.004035, ((Linterrogans: 0.065301, ((Tpallidum: 0.002042, Tdenticola: 0.042206): 0.021458, (Bburgdorferi: 0.000004, Bgarinii: 0.000004): 0.041841): 0.005455): 0.000740, ((Parachlamydia: 0.017061, ((Ctrachomatis: 0.009987, Cmuridarum: 0.000004): 0.002236, ((Cabortus: 0.000004, Ccaviae: 0.000547): 0.002493, Cpneumoniae: 0.000004): 0.001753): 0.020130): 0.015801, (Pgingivalis: 0.004975, (Bfragilis: 0.008918, Bthetaiotaomicron: 0.017344): 0.098079): 0.053616): 0.000000): 0.002693): 0.000006, ((Oyellows: 0.003329, ((Mmobile: 0.003518, ((Mpulmonis: 0.002461, Msynoviae: 0.003062): 0.000780, Mhyopneumoniae: 0.002847): 0.000808): 0.003583, ((Mmycoides: 0.003119, Mflorum: 0.001220): 0.008233, ((Uurealyticum: 0.003014, Mpenetrans: 0.007727): 0.000736, ((Mpneumoniae: 0.000793, Mgenitalium: 0.000004): 0.008208, Mgallisepticum: 0.002941): 0.002650): 0.002290): 0.000000): 0.001820): 0.007562, ((Ttengcongensis: 0.051160, ((Ctetani: 0.040848, Cacetobutylicum: 0.077084): 0.012407, Cperfringens: 0.045819): 0.026744): 0.006563, (Bclausii: 0.091014, (Bhalodurans: 0.077181, (Gkaustophilus: 0.070652, ((Bsubtilis: 0.029008, Blicheniformis: 0.036042): 0.066228, ((Bthuringiensis: 0.003902, (Bcereus: 
0.040505, Banthracis: 0.018615): 0.017715): 0.191381, (Oiheyensis: 0.065922, (((Shaemolyticus: 0.019192, Sepidermidis: 0.011055): 0.004896, Saureus: 0.010684): 0.063709, ((Lmonocytogenes: 0.002777, Linnocua: 0.005995): 0.085395, (Efaecalis: 0.055043, (Lplantarum: 0.048736, (((Smutans: 0.023486, (Spneumoniae: 0.027130, (Sthermophilus: 0.015744, (Spyogenes: 0.013835, Sagalactiae: 0.024347): 0.012111): 0.000327): 0.001743): 0.015946, Llactis: 0.026312): 0.012937, (Ljohnsonii: 0.008114, Lacidophilus: 0.004918): 0.045346): 0.002890): 0.005714): 0.012571): 0.010676): 0.038292): 0.013825): 0.000385): 0.004083): 0.009737): 0.001851): 0.077537): 0.024696): 0.000000): 0.000273, ((((Pacnes: 0.049270, Blongum: 0.035484): 0.002140, (((Nfarcinica: 0.112702, ((Mleprae: 0.048713, (Mtuberculosis: 0.000960, Mbovis: 0.000671): 0.103300): 0.002127, Mavium: 0.074525): 0.019277): 0.049895, ((Cglutamicum: 0.014310, Cefficiens: 0.011165): 0.048994, (Cjeikeium: 0.019719, Cdiphtheriae: 0.031462): 0.001922): 0.019157): 0.020860, ((Scoelicolor: 0.059601, Savermitilis: 0.047197): 0.240202, (Twhipplei: 0.016545, Lxyli: 0.030340): 0.005671): 0.000000): 0.001394): 0.000361, Tfusca: 0.094725): 0.020169, Sthermophilum: 0.098641): 0.002211): 0.000000, (Gviolaceus: 0.084784, (Telongatus: 0.028168, ((Selongatus: 0.028875, (Synechococcus: 0.008547, Pmarinus: 0.010767): 0.036943): 0.005246, (Synechocystis: 0.022803, Nostoc: 0.077702): 0.027566): 0.001440): 0.020757): 0.064815): 0.000000, ((Mkandleri: 0.018659, ((Tkodakaraensis: 0.008351, (Pfuriosus: 0.006324, (Phorikoshii: 0.009433, Pabyssi: 0.003742): 0.005454): 0.008392): 0.066312, ((Mthermoautotrophicum: 0.019769, (Mmaripaludis: 0.014085, Mjannaschii: 0.009818): 0.023101): 0.012704, (Afulgidus: 0.047593, (((Mmazei: 0.006479, Macetivorans: 0.019625): 0.122794, (Hmarismortui: 0.028169, Halobacterium: 0.004998): 0.073132): 0.001469, ((Tvolcanium: 0.002136, Ptorridus: 0.036580): 0.000699, Tacidophilum: 0.002626): 0.076412): 0.001076): 0.003544): 
0.000466): 0.000567): 0.007687, (Paerophilum: 0.022637, (Apernix: 0.008760, ((Ssolfataricus: 0.012994, Sacidocaldarius: 0.019433): 0.007346, Stokodaii: 0.003272): 0.094831): 0.003025): 0.012283): 0.033254);

Note 3: Gene sharing and tRNA pool similarity in the dataset of Beiko et al.

We also examined the relation between the number of shared genes and the similarity of tRNA pools using the data of Beiko et al. (22), who identified highways of gene sharing in prokaryotes based on phylogenetic reconstruction. In agreement with our protein similarity-based findings, this dataset also shows a significant correlation between the number of shared genes and the mean tRs of the corresponding groups of prokaryotes (r = 0.36, p = 4.7 × 10^-4; comparison with the correlations obtained after permuting the values gives an empirical p-value of 0.018).
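An empirical p-value of this kind can be obtained with a permutation test along the following lines. This is only a sketch under stated assumptions: the text does not specify the correlation coefficient, the number of permutations, or the sidedness of the test, so the sketch uses Pearson correlation, 1000 permutations, a one-sided comparison, and the common (hits + 1)/(permutations + 1) correction. The function names and toy data are hypothetical.

```python
import random
from math import sqrt
from typing import Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def empirical_p_value(
    x: Sequence[float],
    y: Sequence[float],
    n_permutations: int = 1000,  # assumed; not stated in the text
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (observed r, one-sided empirical p-value) from shuffling y."""
    rng = random.Random(seed)
    observed = pearson(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        # Count permutations whose correlation is at least the observed one.
        if pearson(x, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed arrangement as one of the permutations.
    return observed, (hits + 1) / (n_permutations + 1)


# Toy data (made up): shared-gene counts versus mean tRs for eight group pairs.
shared_genes = [1, 2, 3, 4, 5, 6, 7, 8]
mean_trs = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80]
r_obs, p_emp = empirical_p_value(shared_genes, mean_trs)
```

The empirical p-value is the fraction of shuffled datasets whose correlation is at least as large as the observed one, so it makes no distributional assumption about the values being correlated.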

References

1. Percudani R, Pavesi A, & Ottonello S (1997) Transfer RNA gene redundancy and translational selection in Saccharomyces cerevisiae J Mol Biol 268, 322-330.

2. Kanaya S, Yamada Y, Kudo Y, & Ikemura T (1999) Studies of codon usage and tRNA genes of 18 unicellular organisms and quantification of Bacillus subtilis tRNAs: gene expression level and species-specific diversity of codon usage based on multivariate analysis Gene 238, 143-155.

3. Ikemura T (1981) Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for a synonymous codon choice that is optimal for the E. coli translational system J Mol Biol 151, 389-409.

4. Dong H, Nilsson L, & Kurland CG (1996) Co-variation of tRNA abundance and codon usage in Escherichia coli at different growth rates J Mol Biol 260, 649-663.

5. Sorensen MA & Pedersen S (1991) Absolute in vivo translation rates of individual codons in Escherichia coli. The two glutamic acid codons GAA and GAG are translated with a threefold difference in rate J Mol Biol 222, 265-280.

6. Segal E, Fondufe-Mittendorf Y, Chen L, Thastrom A, Field Y, et al. (2006) A genomic code for nucleosome positioning Nature.

7. Man O & Pilpel Y (2007) Differential translation efficiency of orthologous genes is involved in phenotypic divergence of yeast species Nat Genet 39, 415-421.

8. Tuller T, Kupiec M, & Ruppin E (2007) Determinants of protein abundance and translation efficiency in S. cerevisiae PLoS Comput Biol 3, e248.

9. Sharp PM & Li WH (1986) An evolutionary perspective on synonymous codon usage in unicellular organisms J Mol Evol 24, 28-38.

10. Varenne S, Buc J, Lloubes R, & Lazdunski C (1984) Translation is a non-uniform process: Effect of tRNA availability on the rate of elongation of nascent polypeptide chains J Mol Biol 180, 549-576.

11. Akashi H (2003) Translational Selection and Yeast Proteome Evolution Genetics 164, 1291-1303.

12. Wall DP, Hirsh AE, Fraser HB, Kumm J, Giaever G, et al. (2005) Functional genomic analysis of the rates of protein evolution Proc Natl Acad Sci U S A 102, 5483-5488.

13. Kimchi-Sarfaty C, Oh JM, Kim I-W, Sauna ZE, Calcagno AM, et al. (2007) A "Silent" Polymorphism in the MDR1 Gene Changes Substrate Specificity Science 315, 525-528.