SUPPLEMENTARY TABLES S3-S6: Evaluation of sequence parameters in ribosomal proteins and RNAs
Unless stated otherwise, test significances are for post hoc Bonferroni t tests: p < 0.1 where bracketed, p < 0.05 otherwise.
Boldface font indicates tests within LSU ("LSU:LSU") or SSU ("SSU:SSU") sets, plain font between LSU and SSU ("LSU:SSU" and "SSU:LSU").
For species and access codes used see tables S1 (LSU) and S2 (SSU).
TABLE
Table S3A Accruing mean differences of ionic parameters for LSU ribosomal proteins
Table S3.1 The number of residues in ribosomal proteins examined (Fig. 1A in the main text)
Table S3.2 Hydrophobicity of the ribosomal proteins examined (Fig. 1B in the main text)
Table S3.3 ILMV>2 clusters as % sequence of the ribosomal proteins examined (Fig. 1C in the main text)
Table S3.4 AGILMV>2 clusters as % sequence of the ribosomal proteins examined (Fig. 1D in the main text)
Table S3.5A Numbers of residues per RP sequence in ILMV>2 clusters
Table S3.5B Numbers of residues per RP sequence in AGILMV>2 clusters
Table S3.6 Numbers of all ILMV and AG residues per RP sequence in LSU ribosomal proteins
Table S3.7 Basic amino acid residues as % RP sequence (Fig. 2A in the main text)
Table S3.8 Acidic amino acid residues as % RP sequence (Fig. 2B in the main text)
Table S3.9 Homobasic segments with >1 HKR residue as % RP sequence (Fig. 2C in the main text)
Table S3.10 Homoacidic segments with >1 DE as % RP sequence (Fig. 2D in the main text)
Table S3.11 Basic PCNs as % RP sequence (Fig. 3A in the main text)
Table S3.12 Acidic PCNs as % RP sequence (Fig. 3B in the main text)
Table S3.13 The number of basic residues in bPCNs per RP sequence ("basic PCN impact") (Fig. 3C in the main text)
Table S3.14 The number of acidic residues in aPCNs per RP sequence ("acidic PCN impact") (Fig. 3D in the main text)
Table S4.1 Eukaryote-only (EO) LSU ribosomal proteins
Table S4.2 Eukaryote and archaeal alignable (EA) LSU ribosomal proteins
Table S4.3 Eukaryote, archaeal and bacterial alignable (EAB) LSU ribosomal proteins
Table S4.4.1 LSU 41 EO/EA/EAB, archaeal and bacterial RPs: sequence identity to human orthologs [Fig. 4A]
Table S4.4.2 LSU 41 EO/EA/EAB, archaeal and bacterial RPs: bPCN alignment identity to human [Fig. 4B]
Table S4.4.3 LSU numbers of bPCNs per 100 aa and per entire sequence [Fig. 4C]
Table S4.5 LSU bPCN alignment with ribosomal proteins of group-representative species
Table S4.6.1 The families and species compared in LSU alignments of bPCNs in archaea and bacteria (Fig. 6 in the main text)
Table S4.6.2 Sequence and bPCN identities with M. jannaschii for archaeal LSU tunnel proteins (Fig. 6A in the main text)
Table S4.6.3 Sequence and bPCN identities to M. jannaschii for the bacterial LSU tunnel proteins (Fig. 6B in the main text)
Table S4.6.4 Sequence and bPCN identities to M. tuberculosis for the bacterial LSU tunnel proteins (Fig. 6C in the main text)
Table S4.7.1 The bulk HKR% of ribosomal proteins relative to human [Fig. 7A in the main text]
Table S4.7.2 HKR in bPCN of ribosomal proteins as % sequence relative to human [Fig. 7B in the main text]
Table S4.7.3 Percent rRNA sequence in expansion segments relative to human [Fig. 7C in the main text]
Table S4.7.4 rRNA expansion segment GC% relative to human [Fig. 7D in the main text]
Table S4.8.1 Ratios of residues in ES as % human to sequence HKR% as % human [Fig. 8A in the main text]
Table S4.8.2 Ratios of ES GC% as % human to sequence HKR% as % human [Fig. 8B in the main text]
Table S5 Basic PCN clusters in nuclear localization signals
Table S6.1 LSU ES boundaries defined by alignment with human 28S rRNA compared with published models for 25-28S rRNAs
Table S6.2 LSU ES boundaries defined by alignment with yeast 25S rRNA compared with human 28S rRNA-defined boundaries
Table S3A Accruing mean differences of ionic parameters for LSU ribosomal proteins
Percent difference of ionic parameter accrued means from the final group mean in LSU ribosomal proteins
Means for all species preceding and including the current column were subtracted from the final mean for all species of a group, and the difference expressed as % of that mean.
This table illustrates the degree of parameter homogeneity within the examined groups of species.
Parameter abbreviations:
DE = acidic residues as % sequence aa
HKR = basic residues as % sequence aa
azip = % sequence in homoionic segments containing >1 DE residue
bzip = % sequence in homoionic segments containing >1 HKR residue
aPCN = acidic PCN segments with >2 DE residues and >=50% DE content
bPCN = basic PCN segments with >2 HKR residues and >=50% HKR content
mdif1, 2, 3… = % difference from the overall mean for the mean after addition of the parameter of species 1, 2, 3…
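The accrued mean difference described above can be sketched as follows. This is a minimal illustration of the stated definition (running mean subtracted from the final group mean, expressed as % of the final mean); the function name and the input values are hypothetical and are not taken from the tables.

```python
def accrued_mean_differences(values):
    """Return mdif1..mdif(n-1) for one parameter across the n species of a group.

    mdif_k = 100 * (final_mean - mean(values[0:k])) / final_mean
    The last species is omitted because its running mean equals the final mean.
    """
    final_mean = sum(values) / len(values)
    diffs = []
    running_total = 0.0
    for k, value in enumerate(values[:-1], start=1):
        running_total += value
        running_mean = running_total / k
        diffs.append(100.0 * (final_mean - running_mean) / final_mean)
    return diffs


# Hypothetical HKR% values for a group of four species (illustration only).
hkr_percent = [20.0, 24.0, 22.0, 22.0]
mdif = accrued_mean_differences(hkr_percent)
print([round(d, 2) for d in mdif])
```

A group of n species yields n - 1 columns, which matches the layout of the table (for example, 11 mdif columns for 12 archaeal species).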
Archaea (12) / mdif1 / mdif2 / mdif3 / mdif4 / mdif5 / mdif6 / mdif7 / mdif8 / mdif9 / mdif10 / mdif11
DE / -23.54 / 8.61 / 9.79 / 4.48 / 0.33 / -2.19 / 3.78 / 1.07 / 5.97 / 3.82 / 1.4
HKR / 6.02 / -6.66 / -2.15 / 2.67 / 2.04 / 1.62 / -1.52 / 0.38 / -2.87 / -1.18 / -0.68
azip / -43.62 / 13.65 / 12.21 / 3.23 / -0.75 / -3.65 / 7.19 / 1.98 / 11.25 / 6.81 / 3.05
bzip / 12.58 / -8.61 / -6.44 / -1.3 / 0.92 / 1.98 / -2.24 / 0 / -4.32 / -2.4 / -0.96
aPCN / -64.89 / 28.64 / 23.01 / 4.47 / -8.08 / -15.88 / 6.09 / -0.71 / 17.22 / 12.1 / 3.89
bPCN / 2.79 / -15.32 / -5.02 / 6.96 / 3.95 / 1.75 / -3.36 / 0.26 / -5.24 / -2.08 / -1.76
Bacteria (12) / mdif1 / mdif2 / mdif3 / mdif4 / mdif5 / mdif6 / mdif7 / mdif8 / mdif9 / mdif10 / mdif11
DE / -0.64 / -1.02 / 1.37 / 1.83 / 0.18 / -0.02 / -0.27 / -1.7 / -1.86 / -0.32 / 0.18
HKR / -0.23 / -1.81 / -1.69 / -0.98 / -0.39 / -0.21 / 0.07 / 0.3 / 0.02 / -0.18 / 0.13
azip / -3.74 / 2.23 / 4.43 / 4.87 / 2.03 / 0.9 / -0.08 / -2.28 / -1.39 / 0.28 / 0.13
bzip / 0.79 / -1.42 / -2.41 / -2.09 / -0.79 / -0.39 / 0 / 0.92 / 0.98 / 0.28 / 0.11
aPCN / 18.55 / 8.6 / 1.95 / -7.53 / -3.25 / -2.83 / -0.14 / -5.54 / -5.73 / 1.77 / 1.3
bPCN / -14.15 / -1.78 / 3.33 / 0.76 / 5.49 / 3.76 / 1.25 / 1.83 / 0.74 / -0.31 / -0.16
Lower eukarya (3) / mdif1 / mdif2
DE / -3.44 / -1.93
HKR / 3.12 / 1.08
azip / -5.3 / -11.2
bzip / 3.18 / 0.88
aPCN / -24.31 / -19.44
bPCN / 9.65 / 0.23
Insects (2) / mdif1
DE / 1.87
HKR / -2.6
azip / 5.19
bzip / -1.26
aPCN / -7.24
bPCN / 0.09
Non-mammalian vertebrates (3) / mdif1 / mdif2
DE / 0.45 / -0.44
HKR / 0.87 / -0.06
azip / 9.06 / 0.66
bzip / -0.95 / -0.12
aPCN / -2.51 / -0.51
bPCN / 4.88 / 0.3
Mammalian (5) / mdif1 / mdif2 / mdif3 / mdif4
DE / 1.85 / -1.51 / -2.19 / -0.13
HKR / -0.55 / 1.06 / 1.09 / -0.11
azip / 2.27 / -2.51 / -3.56 / 0.07
bzip / -0.54 / 1.06 / 1.25 / -0.01
aPCN / 7.45 / 9.79 / 7.9 / -1.96
bPCN / -2.71 / 1.44 / 2.23 / -0.09
Plants (3) / mdif1 / mdif2
DE / 0.21 / -0.5
HKR / -1.4 / 1.23
azip / 4.88 / -1.43
bzip / -0.96 / 0.28
aPCN / -34.03 / -12.25
bPCN / -2.18 / 3.64
Table S3.1 The number of residues in ribosomal proteins examined (Fig. 1A in the main text)
group / LSU #residues / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU #residues / se / SSU:SSU sgnf. / SSU:LSU sgnf. / LSU/SSU size ratio
archaeal [A] / 140.2 / 7.46 / 142.8 / 3.98 / 0.982
bacterial [B] / 123.1 / 2.89 / 135 / 4.61 / 0.912
lower eukarya [L] / 168 / 6.858 / AB / AB / 166.5 / 6.57 / AB / AB / 1.009
insect [I] / 178.5 / 9.489 / AB / ABM[N] / 169.1 / 9.232 / AB / AB / 1.056
non-mammalian [N] / 171.7 / 7.125 / AB / AB[M] / 159.6 / 6.631 / 1.076
mammalian [M] / 169.2 / 5.321 / AB / AB / 157.5 / 4.941 / 1.074
plant [P] / 172.7 / 7.015 / AB / AB[M] / 167.9 / 6.643 / AB / AB / 1.029
Giardia / 167.9 / 10.27 / n. t. / n. t. / 164.3 / 9.762 / n. t. / n. t. / 1.022
n.t. = not tested
Table S3.2 Hydrophobicity of the ribosomal proteins examined (Fig. 1B in the main text)
group / LSU / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal [A] / -0.5878 / 0.019 / IMLN[L] / IMLNP / -0.5412 / 0.0160
bacterial [B] / -0.4832 / 0.0176 / AILMNP / IMLNP / -0.5665 / 0.0168
lower eukarya [L] / -0.6827 / 0.03099 / IM[N] / IMLNP / -0.5545 / 0.02731
insect [I] / -0.7819 / 0.03773 / IMLNP / -0.5697 / 0.03362
non-mammalian [N] / -0.7587 / 0.0293 / IMLNP / -0.5792 / 0.03027
mammalian [M] / -0.7853 / 0.02688 / IMLNP / -0.5714 / 0.02271
plant [P] / -0.7114 / 0.03351 / M / IMLNP[A] / -0.5453 / 0.02884
Giardia / -0.5670 / 0.04404 / n. t. / n. t. / -0.4480 / 0.0460 / n. t. / n. t.
Table S3.3 ILMV>2 clusters as % sequence of the ribosomal proteins examined (Fig. 1C in the main text)
group / LSU-ILMV>2% / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU-ILMV>2% / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal [A] / 3.123 / 0.143 / ILMNP / BIN / 3.006 / 0.153 / B / ILP[M]
bacterial [B] / 2.72 / 0.141 / I[LP] / B / 1.797 / 0.127
lower eukarya [L] / 2.254 / 0.2003 / 2.65 / 0.2647 / B / [I]
insect [I] / 1.932 / 0.219 / 2.395 / 0.3198
non-mammalian [N] / 2.489 / 0.2085 / [B] / 2.49 / 0.265 / [B]
mammalian [M] / 2.441 / 0.1576 / [B] / 2.632 / 0.2015 / B / I
plant [P] / 2.327 / 0.2 / 2.703 / 0.2771 / B / I
Giardia / 1.921 / 0.3452 / n. t. / n. t. / 3.086 / 0.569 / n. t. / n. t.
Table S3.4 AGILMV>2 clusters as % sequence of the ribosomal proteins examined (Fig. 1D in the main text)
group / LSU-AGILMV>2% / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU-AGILMV>2% / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal [A] / 10.98 / 0.354 / ILMNP / 12.08 / 0.316 / ILMNP / ILMNP
bacterial [B] / 12.09 / 0.328 / ILMNP / ILMNP / 12.06 / 0.367 / AILMNP / ILMNP
lower eukarya [L] / 9.522 / 0.4298 / 10.74 / 0.6029 / ILMNP
insect [I] / 8.345 / 0.5217 / 9.445 / 0.6251
non-mammalian [N] / 8.864 / 0.4393 / 9.533 / 0.5539
mammalian [M] / 9.04 / 0.3376 / 9.673 / 0.421 / [I]
plant [P] / 8.902 / 0.4499 / 10.24 / 0.5368 / I[MNP]
Giardia / 9.17 / 0.8086 / n. t. / n. t. / 11.39 / 1.317 / n. t. / n. t.
Table S3.5A Numbers of residues per RP sequence in ILMV >2 clusters
group / LSU # ILMV in >2 / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU # ILMV in >2 / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal / 4.29 / 0.2924 / BI / B / 4.572 / 0.2555 / B
bacterial / 3.673 / 0.2915 / B / 2.69 / 0.2154 / B
lower eukarya / 4.024 / 0.3438 / B / 4.362 / 0.4264 / B
insect / 3.448 / 0.4049 / B / 4.355 / 0.5624 / B
non-mammalian / 4.605 / 0.4084 / B / 4.172 / 0.4451 / B
mammalian / 4.397 / 0.2943 / B[I] / B / 4.352 / 0.3274 / B
plant / 4.232 / 0.3382 / B / 4.591 / 0.4762 / B
Giardia / 3.162 / 0.4985 / n. t. / n. t. / 5.368 / n. t. / n. t.
# ILMV>2 = the average number of ILMV residues in the respective clusters of RP sequences
Table S3.5B Numbers of residues per RP sequence in AGILMV>2 clusters
group / LSU # AGILMV in >2 clusters / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU # AGILMV in >2 clusters / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal / 16.95 / 0.675 / 18.06 / 0.6433 / M[N]
bacterial / 16.31 / 0.5913 / 17.65 / 0.9197 / [M]
lower eukarya / 16.71 / 1.034 / 18.15 / 1.132 / [M]
insect / 15.42 / 1.199 / 16.52 / 1.337
non-mammalian / 15.72 / 0.9245 / 15.53 / 1.1
mammalian / 15.89 / 0.6932 / 15.58 / 0.808
plant / 16.9 / 1.047 / 17.11 / 1.069
Giardia / 15.43 / 1.582 / n. t. / n. t. / 19.11 / 2.21 / n. t. / n. t.
# AGILMV = the average number of AGILMV residues in the respective clusters of RP sequences
Table S3.6 Numbers of all ILMV and AG residues per RP sequence in LSU ribosomal proteins
group / # ILMV / se / % mammal / # AG / se / % mammal
archaeal / 33.51 / 0.9115 / 89.79 / 22.98 / 0.7077 / 91.99
bacterial / 29.55 / 0.7447 / 79.18 / 23.2 / 0.708 / 92.87
lower eukarya / 36.39 / 1.579 / 97.51 / 26.4 / 1.399 / 105.68
insect / 37.41 / 2.081 / 100.24 / 28.08 / 2.337 / 112.41
non-mammalian / 37.48 / 1.643 / 100.43 / 25.1 / 1.332 / 100.48
mammalian / 37.32 / 1.262 / 100 / 24.98 / 1.004 / 100
plant / 37.9 / 1.668 / 101.55 / 26.63 / 1.384 / 106.61
# ILMV, # AG = the average numbers of the respective residues per LSU RP sequence
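The "% mammal" columns of Table S3.6 can be reproduced by dividing each group mean by the mammalian mean. The short sketch below does this for the # ILMV column, using the tabulated means; the variable names are illustrative only.

```python
# Mean numbers of ILMV residues per LSU RP sequence, copied from Table S3.6.
ilmv_mean = {
    "archaeal": 33.51,
    "bacterial": 29.55,
    "lower eukarya": 36.39,
    "insect": 37.41,
    "non-mammalian": 37.48,
    "mammalian": 37.32,
    "plant": 37.9,
}

# Express every group mean as a percentage of the mammalian mean.
reference = ilmv_mean["mammalian"]
percent_of_mammal = {
    group: round(100.0 * value / reference, 2)
    for group, value in ilmv_mean.items()
}

for group, pct in percent_of_mammal.items():
    print(f"{group}: {pct}")
```

The same calculation applied to the # AG means, with 24.98 as the reference, gives the second "% mammal" column.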
Table S3.7 Basic amino acid residues as % RP sequence (Fig. 2A in the main text)
group / LSU-HKR% / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU-HKR% / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal [A] / 21.71 / 0.356 / A / 19.64 / 0.288
bacterial [B] / 22.17 / 0.324 / A / 22.6 / 0.31 / A
lower eukarya [L] / 26.01 / 0.5633 / AB / ABILMP / 21.54 / 0.4798 / A
insect [I] / 28.11 / 0.7149 / ABLP / ABILMNP / 22.58 / 0.5726 / A
non-mammalian [N] / 27.52 / 0.5453 / ABL[P] / ABILMNP / 23 / 0.4888 / A
mammalian [M] / 28.25 / 0.5363 / ABLP / ABILMNP / 23.06 / 0.3479 / A[L]
plant [P] / 26.17 / 0.6222 / AB / ABILMNP / 21.78 / 0.4636 / A
Giardia / 23.24 / 0.777 / n. t. / 21.46 / 0.758 / n. t.
Table S3.8 Acidic amino acid residues as % RP sequence (Fig. 2B in the main text)
group / LSU-DE% / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU-DE% / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal [A] / 12.45 / 0.291 / BILMNP / BIMNP / 13.56 / 0.284 / BIMNP / ABILMNP
bacterial [B] / 9.658 / 0.192 / ILMNP / 9.72 / 0.299 / [P] / ILMNP
lower eukarya [L] / 7.181 / 0.2553 / 9.88 / 0.3142 / ILMNP
insect [I] / 7.142 / 0.3061 / 10.44 / 0.3617 / ILMNP
non-mammalian [N] / 7.466 / 0.2992 / 9.669 / 0.2942 / ILMNP
mammalian [M] / 7.2 / 0.2291 / 9.687 / 0.2192 / ILMNP
plant [P] / 8.117 / 0.293 / LM[I] / 9.581 / 0.2895 / ILMNP
Giardia / 8.88 / 0.483 / n. t. / n. t. / 8.7 / 0.47 / n. t. / n. t.
Table S3.9 Homobasic segments with >1 HKR residue as % RP sequence (Fig. 2C in the main text)
group / LSU % sequence / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU % sequence / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal [A] / 45.94 / 1.012 / A / 40.76 / 0.808
bacterial [B] / 48.15 / 0.808 / A / 47.77 / 1.157 / A
lower eukarya [L] / 63.67 / 1.173 / ABP / ABILMNP / 51.78 / 1.274 / AB / AB
insect [I] / 67.11 / 1.268 / ABP[L] / ABILMNP / 52.03 / 1.682 / AB / AB
non-mammalian [N] / 65.06 / 1.271 / ABP / ABILMNP / 53.96 / 1.393 / AB / AB
mammalian [M] / 66.31 / 0.9583 / ABP / ABILMNP / 54.35 / 1.034 / AB / AB
plant [P] / 59.62 / 1.148 / AB / ABILMNP / 53.63 / 1.509 / AB / AB
Giardia / 56.75 / 2.081 / n. t. / n. t. / 53.81 / 2.44 / n. t.
Table S3.10 Homoacidic segments with >1 DE as % RP sequence (Fig. 2D in the main text)
group / LSU % sequence / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU % sequence / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal [A] / 18.68 / 0.747 / BILMNP / BILMNP / 20.95 / 0.728 / BILMNP / ABIMNPU
bacterial [B] / 13.9 / 0.523 / ILMNP / 13.77 / 0.788 / ILMNP
lower eukarya [L] / 7.833 / 0.5318 / 12.88 / 0.8552 / ILMNP
insect [I] / 6.703 / 0.6028 / 14.55 / 0.9957 / ILMNP
non-mammalian [N] / 8.012 / 0.5485 / 12.48 / 0.7566 / ILMNP
mammalian [M] / 7.949 / 0.4201 / 12.89 / 0.542 / ILMNP
plant [P] / 9.636 / 0.5705 / I[LM] / 14.11 / 0.9039 / ILMNP
Giardia / 10.24 / 1.168 / n. t. / n. t. / 10.83 / 1.13 / n. t. / n. t.
Table S3.11 Basic PCNs as % RP sequence (Fig. 3A in the main text)
group / LSU-bPCN% / se / LSU:LSU sgnf. / LSU:SSU sgnf. / found in % LSU / SSU-bPCN% / se / SSU:SSU sgnf. / SSU:LSU sgnf. / found in % SSU
archaeal [A] / 9.812 / 0.444 / A[L] / 84.71 / 7.073 / 0.33 / 83.5
bacterial [B] / 9.556 / 0.492 / A[L] / 80.5 / 10.08 / 0.779 / A / 97.98
lower eukarya [L] / 13.53 / 0.9332 / AB / ABILMP[N] / 98.43 / 7.92 / 0.4977 / 91.49
insect [I] / 16.65 / 1.302 / ABL / ABILMNP / 98.82 / 9.478 / 0.7796 / [A] / 95.16
non-mammalian [N] / 16.85 / 0.9051 / ABLP / ABILMNP / 99.25 / 10.98 / 0.6735 / AL / 96.55
mammalian [M] / 18.48 / 0.9724 / ABLP / ABILMNP / 100 / 10.66 / 0.4917 / AL / 97.48
plant [P] / 15.48 / 1.046 / AB / ABILMNP / 99.28 / 9.476 / 0.6249 / [A] / 96.77
Giardia / 10.02 / 0.9725 / n. t. / n. t. / 9.815 / 1.1 / n. t. / n. t.
Table S3.12 Acidic PCNs as % RP sequence (Fig. 3B in the main text)
group / LSU-aPCN% / se / LSU:LSU sgnf. / LSU:SSU sgnf. / found in % LSU / SSU-aPCN% / se / SSU:SSU sgnf. / SSU:LSU sgnf. / found in % SSU
archaeal [A] / 3.061 / 0.237 / BILMNP / BIMNPU / 49.76 / 3.269 / 0.278 / BILMNP / ABILMNP / 54.21
bacterial [B] / 0.7734 / 0.0861 / 21 / 0.7483 / 0.127 / 24.19
lower eukarya [L] / 0.581 / 0.127 / 20.47 / 1.544 / 0.2176 / B / BILMN[P] / 44.68
insect [I] / 0.4828 / 0.1163 / 20 / 1.523 / 0.3104 / [B] / BILMU / 41.94
non-mammalian [N] / 0.8536 / 0.1481 / 28.57 / 1.126 / 0.1987 / [I] / 31.03
mammalian [M] / 0.793 / 0.1052 / 26.2 / 1.113 / 0.1459 / [IL] / 31.45
plant [P] / 0.9301 / 0.1922 / 23.91 / 1.314 / 0.1701 / IL[BM] / 47.31
Giardia / 0.871 / 0.2373 / n. t. / n. t. / 0.816 / 0.374 / n. t. / n. t.
Table S3.13 The number of basic residues in bPCNs per RP sequence ("basic PCN impact") (Fig. 3C in the main text)
group / LSU #HKR in bPCN / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU #HKR in bPCN / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal [A] / 8.915 / 0.353 / 7.256 / 0.327
bacterial [B] / 7.06 / 0.28 / 8.702 / 0.258
lower eukarya [L] / 15.42 / 0.7976 / AB / AILMP[N] / 9.883 / 0.7173 / A / B
insect [I] / 19.18 / 0.996 / ABL[P] / ABILMNP / 12.1 / 1.107 / AB[L] / AB
non-mammalian [N] / 19.62 / 0.8322 / ABLP / ABILMNP / 13.34 / 0.937 / ABL / AB
mammalian [M] / 20.04 / 0.6527 / ABLP / ABILMNP / 12.8 / 0.6942 / ABL / AB
plant [P] / 17.35 / 0.7681 / ABL / ABILMNP / 11.74 / 0.8198 / AB / AB
Giardia / 12.22 / 1.15 / n. t. / n. t. / 7.489 / 0.846 / n. t. / n. t.
Table S3.14 The number of acidic residues in aPCNs per RP sequence ("acidic PCN impact") (Fig. 3D in the main text)
group / LSU #DE in aPCN / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU #DE in aPCN / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal [A] / 3.551 / 0.277 / BILMNP / BILMNP / 4 / 0.458 / BILMNP / ABILMNP
bacterial [B] / 0.833 / 0.054 / 1.29 / 0.192
lower eukarya [L] / 0.9685 / 0.2219 / 2.287 / 0.3271 / B[N] / BILM[NP]
insect [I] / 0.8706 / 0.22 / 1.919 / 0.3214 / B[IL]
non-mammalian [N] / 1.331 / 0.2121 / 1.471 / 0.2563
mammalian [M] / 1.192 / 0.1476 / 1.43 / 0.1837
plant [P] / 1.377 / 0.2944 / 2.064 / 0.2489 / BILM[NP]
Giardia / 1.351 / 0.3975 / n. t. / n. t. / 0.684 / 0.299 / n. t. / n. t.
Eukaryote-only (EO), Eukaryote/Archaeal (EA) and Eukaryote/Archaeal/Bacterial (EAB) LSU proteins
LSU ribosomal proteins grouped by interdomain sequence alignment. The classification is largely as proposed by Klinge et al., Science 334:941-948 (2011).
Alignments with human LSU RPs were obtained with the SSEARCH3 program (Pearson WR (2000) Methods Mol Biol 132:185-219).
The SWR (Smith-Waterman ratio) parameter listed is the average of Smith-Waterman indices expressed per number of residues in the corresponding sequences.
Normalizing by sequence length makes the SW data comparable across protein sequences of different sizes.
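As a rough sketch of the SWR parameter described above, each Smith-Waterman score can be divided by the residue count of the corresponding sequence and the quotients averaged. The scores and lengths below are hypothetical, and which sequence length the authors used as the denominator is an assumption here.

```python
def smith_waterman_ratio(alignments):
    """Average of (SW score / number of residues) over a set of alignments.

    `alignments` is a list of (sw_score, n_residues) pairs. Using the length
    of the compared sequence as n_residues is an assumption of this sketch.
    """
    ratios = [score / n_residues for score, n_residues in alignments]
    return sum(ratios) / len(ratios)


# Hypothetical SW scores and sequence lengths for two ribosomal proteins.
example = [(600, 120), (450, 150)]
swr = smith_waterman_ratio(example)
print(swr)
```

Dividing before averaging keeps a long protein with a large raw score from dominating the group mean.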
Abbreviations: A = archaeal; B = Bacterial; L = Lower eukarya; I = Insect; N = Non-mammalian vertebrate; M = Mammalian vertebrate; P = Angiosperm plant.
Five mammalian, three plant, five each archaeal and bacterial, three non-mammalian vertebrate, two insect and three lower eukaryotic species were compared.
The archaeal species used were H. marismortui, M. kandleri, M. jannaschii, M. maripaludis and S. solfataricus.
The bacterial species used were E. coli, M. leprae, M. smegmatis, M. tuberculosis and B. subtilis.
The eukaryotic sequences included five mammalian, three plant and two insect species listed in Table S1, two non-mammalian vertebrates (D. rerio and X. laevis) and two lower eukaryotes (S. cerevisiae and T. thermophila).
Significance is for Bonferroni post hoc tests (p < 0.05 for non-bracketed, p < 0.1 for bracketed items).
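The significance labels used throughout the tables (for example "AB[M]") can be illustrated with a small sketch. It assumes a plain Bonferroni adjustment (raw p multiplied by the number of comparisons) and 21 pairwise comparisons among seven groups; both the comparison count and the raw p-values are hypothetical and are not taken from the analysis itself.

```python
def significance_label(raw_p_values, n_comparisons):
    """Build a label such as 'AB[M]' from raw pairwise p-values.

    Groups with Bonferroni-adjusted p < 0.05 are listed plainly; groups with
    adjusted p between 0.05 and 0.1 are listed inside square brackets.
    """
    plain, bracketed = [], []
    for group, p in raw_p_values.items():
        adjusted = min(1.0, p * n_comparisons)
        if adjusted < 0.05:
            plain.append(group)
        elif adjusted < 0.1:
            bracketed.append(group)
    label = "".join(plain)
    if bracketed:
        label += "[" + "".join(bracketed) + "]"
    return label


# Hypothetical raw p-values for one group compared against four others.
raw = {"A": 0.0001, "B": 0.0004, "M": 0.004, "P": 0.02}
label = significance_label(raw, n_comparisons=21)
print(label)
```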
Table S4.1 Eukaryote-only (EO) LSU ribosomal proteins
LSU proteins of five archaeal species most comparable to human LSU "EO" proteins were also aligned, to illustrate the lack of similarity with eukaryotic sequences.
The RP correspondence table:
Eukaryote / Archaeal
RL06 / RL14E
RL13 / RL13
RL22 / RL15E
RL27 / RL06
RL28 / RL15E
RL29 / RL31
RL30 / RL22
Smith-Waterman ratios compared for EO LSU ribosomal proteins
Parameter / A / L / I / N / M / P
mean / 0.4779 / 2.808 / 3.282 / 5.862 / 6.227 / 3.131
se / 0.0572 / 0.2221 / 0.208 / 0.1156 / 0.06688 / 0.1714
min / 0.162 / 0.9403 / 1.692 / 5.229 / 5.138 / 1.643
max / 1.608 / 4.19 / 4.382 / 6.431 / 6.548 / 4.417
# proteins / 35 / 14 / 14 / 14 / 35 / 21
mean as % mammal / 7.67 / 45.09 / 52.71 / 94.14 / 100 / 50.28
significantly larger than / A / AL / AILP / AILNP / A[L]
A = archaea; L = lower eukarya; I = insects; N = non-mammalian vertebrate; M = mammalian; P = plant
Numbers of residues compared for EO LSU ribosomal proteins
Parameter / A / L / I / N / M / P
mean / 149.1 / 134.5 / 168.9 / 149.2 / 166.3 / 145
se / 8.14 / 12.27 / 19.23 / 16.5 / 10.28 / 11.77
min / 77 / 58 / 76 / 64 / 92 / 60
max / 216 / 206 / 299 / 258 / 298 / 233
# proteins / 35 / 14 / 14 / 14 / 35 / 21
mean as % mammal / 99.93 / 90.15 / 113.2 / 100 / 111.46 / 97.18
significantly larger than / n / n / n / n / n / n
Table S4.2 Eukaryote and archaeal alignable (EA) LSU ribosomal proteins
Nineteen LSU RPs included.
The RP correspondence table:
Eukaryote / Archaeal / Eukaryote / Archaeal
RL07A / RL07A / RL39 / RL39
RL14 / RL14E / RL40 / RL40
RL15 / RL15E
RL18 / RL18E
RL18A / RLX
RL19 / RL19E
RL21 / RL21E
RL24 / RL24E
RL30 / RL30E
RL31 / RL31
RL32 / RL32
RL34 / RL34
RL35A / RL35A
RL36A / RL44E
RL37 / RL37
RL37A / RL37A
RL38 / RL38
Smith-Waterman ratios compared for EA LSU ribosomal proteins
Parameter / A / L / I / N / M / P
mean / 1.942 / 3.703 / 4.741 / 6.23 / 6.58 / 4.148
se / 0.07679 / 0.1567 / 0.1693 / 0.09539 / 0.0361 / 0.1419
min / 0.219 / 1.461 / 2.414 / 1.712 / 5.188 / 0.6316
max / 3.721 / 6.039 / 6.827 / 7.192 / 7.192 / 6.385
# RPs / 95 / 38 / 38 / 57 / 95 / 57
mean as % mammal / 29.51 / 56.28 / 72.05 / 94.68 / 100 / 63.04
significantly larger than / A / ALP / AILP / AINLP / AL
A = archaea; L = lower eukarya; I = insects; N = non-mammalian vertebrate; M = mammal; P = plant
Numbers of residues compared for EA LSU ribosomal proteins
Parameter / A / L / I / N / M / P
mean / 98.46 / 132.3 / 141.3 / 133.4 / 138 / 135.4
se / 4.037 / 8.461 / 9.243 / 7.073 / 5.865 / 7.275
min / 47 / 50 / 51 / 50 / 50 / 51
max / 241 / 255 / 271 / 266 / 266 / 268
# proteins / 95 / 38 / 38 / 57 / 95 / 57
mean as % mammal / 71.35 / 95.87 / 102.39 / 96.67 / 100 / 98.12
significantly larger than / A / A / A / A / A
Table S4.3 Eukaryote, archaeal and bacterial alignable (EAB) LSU ribosomal proteins
Fifteen eukaryote LSU RPs included: RL3, 4, 5, 7, 8, 9, 10, 11, 13A, 17, 23, 23A, 26, 27A, 35.
The correspondence table:
Eukaryote / Archaeal / Bacterial
RL03 / RL03 / RL03
RL04 / RL04 / RL04
RL05 / RL18 / RL17
RL07 / RL30 / RL30
RL08 / RL02 / RL02
RL09 / RL06 / RL06
RL10 / RL10 / RL16
RL11 / RL05 / RL05
RL13A / RL13 / RL13
RL17 / RL22 / RL22
RL23 / RL14 / RL14
RL23A / RL23 / RL23
RL26 / RL24 / RL24
RL27A / RL15 / RL15
RL35 / RL29 / RL29
Smith-Waterman index / sequence length ratios compared for EAB LSU ribosomal proteins
Parameter / A / B / L / I / N / M / P
mean / 2.196 / 0.9158 / 3.848 / 4.554 / 6.113 / 6.522 / 4.201
se / 0.07645 / 0.04712 / 0.1368 / 0.1675 / 0.06651 / 0.02995 / 0.1154
min / 0.662 / 0.2 / 2.587 / 2.3 / 5.189 / 5.726 / 2.168
max / 3.556 / 1.91 / 5.404 / 6.114 / 6.679 / 7 / 5.721
# proteins / 75 / 75 / 30 / 30 / 45 / 75 / 45
mean as % mammal / 33.67 / 14.04 / 59 / 69.83 / 93.73 / 100 / 64.41
significantly larger than / B / AB / ABLP / ABILP / ABILP / ABL
Numbers of residues compared for EAB LSU ribosomal proteins
Parameter / A / B / L / I / N / M / P
mean / 173.4 / 151.1 / 214.5 / 228 / 219.4 / 220.9 / 220
sd / 66.42 / 57.03 / 84.76 / 92.23 / 86.08 / 89.31 / 85.5
se / 7.67 / 6.586 / 15.48 / 16.84 / 12.83 / 10.31 / 12.75
min / 70 / 58 / 119 / 123 / 123 / 122 / 123
max / 361 / 280 / 410 / 435 / 421 / 427 / 407
# RPs / 75 / 75 / 30 / 30 / 45 / 75 / 45
mean as % mammal / 78.5 / 68.4 / 97.1 / 103.21 / 99.32 / 100 / 99.59
significantly larger than / B / B / B[A] / AB / B[A]
Table S4.4.1 LSU 41 EO/EA/EAB, archaeal and bacterial RPs: whole sequence identity to human orthologs [Fig. 4A]
group / ID% / se / sgnf
archaeal [A] / 31.37 / 0.8189 / B
bacterial [B] / 20.31 / 0.7961
lower eukarya [L] / 51.79 / 1.009 / AB
insect [I] / 65.13 / 1.548 / ABPL
non-mammalian [N] / 90.61 / 1.029 / ABIPL
mammalian [M] / 98.23 / 0.209 / ABINPL
plant [P] / 59.77 / 1.144 / ABL
Table S4.4.2 LSU 41 EO/EA/EAB, archaeal and bacterial RPs: bPCN alignment identity to human [Fig. 4B]
group / ID% / se / sgnf
archaeal [A] / 19.61 / 0.9831 / B
bacterial [B] / 14.65 / 1.158
lower eukarya [L] / 36.51 / 1.653 / AB
insect [I] / 51.93 / 2.371 / ABLP
non-mammalian [N] / 89.4 / 2.267 / ABILP
mammalian [M] / 94.8 / 0.984 / ABILNP
plant [P] / 45.92 / 2.029 / ABL
Table S4.4.3 Numbers of bPCNs per 100 aa and per entire LSU sequence [Fig. 4C]
group / #bPCNs per 100 aa / se / sgnf / #bPCNs per sequence / se / sgnf
archaeal [A] / 1.54 / 0.0788 / 2.663 / 0.1366 / [B]
bacterial [B] / 1.41 / 0.1147 / 2.13 / 0.1733
lower eukarya [L] / 2.03 / 0.1094 / AB / 4.35 / 0.2346 / AB
insect [I] / 2.32 / 0.1216 / ABL / 5.28 / 0.2772 / ABL
non-mammalian [N] / 2.38 / 0.1012 / ABL / 5.228 / 0.222 / ABL
mammalian [M] / 2.47 / 0.0761 / ABPL / 5.454 / 0.168 / ABLNP
plant [P] / 2.17 / 0.101 / AB / 4.78 / 0.2222 / AB
Table S4.5 LSU bPCN alignment with ribosomal proteins of group-representative species (Fig. 5 in the main text)
species / bPCN ID% / se
YEAST
G. lamblia [35] / 35.7 / 3
T. thermophila [41] / 39.62 / 3.35
T. brucei [41] / 44.6 / 3.57
S. pombe [43] / 53.18 / 4.012
RICE
A. thaliana [41] / 76.78 / 3.545
Z. mays [41] / 78.86 / 3.549
FRUIT FLY
A. gambiae [41] / 65.85 / 3.71
HUMAN
D. rerio [41] / 86.79 / 2.444
X. laevis [42] / 90.34 / 2.242
G. gallus [41] / 92.8 / 1.951
M. musculus [44] / 96.12 / 1.67
R. norvegicus [44] / 96.17 / 1.625
C. familiaris [45] / 98.39 / 0.7169
M. mulatta [42] / 99.40 / 0.415
bPCN ID% = average % identity of bPCNs in LSU RPs with those in the group comparator sequences.
Table S4.6.1 The families and species compared in LSU alignments of bPCNs in archaea and bacteria (Fig. 6)
ARCHAEA
family / species
Desulfurococcaceae / Aeropyrum pernix
Halobacteriaceae / Haloarcula marismortui
Halobacteriaceae / Halobacterium salinarum
Halobacteriaceae / Natronomonas pharaonis
Methanocaldococcaceae / Methanocaldococcus jannaschii
Methanococcaceae / Methanococcus maripaludis
Methanococcaceae / Methanococcus vannielii
Methanopyraceae / Methanopyrus kandleri
Sulfolobaceae / Sulfolobus solfataricus
Sulfolobaceae / Sulfolobus tokodaii
Thermococcaceae / Pyrococcus abyssi
Thermococcaceae / Pyrococcus furiosus
BACTERIA
family / species
Bacillaceae / Bacillus anthracis
Bacillaceae / Bacillus subtilis
Enterobacteriaceae / Escherichia coli
Enterobacteriaceae / Salmonella typhimurium
Enterobacteriaceae / Shigella flexneri
Enterobacteriaceae / Yersinia pestis
Mycobacteriaceae / Mycobacterium leprae
Mycobacteriaceae / Mycobacterium smegmatis
Mycobacteriaceae / Mycobacterium tuberculosis
Spirochaetaceae / Treponema pallidum
Staphylococcaceae / Staphylococcus aureus
Streptococcaceae / Streptococcus pyogenes
Table S4.6.2 Sequence and bPCN identities with M. jannaschii for archaeal LSU tunnel proteins (Fig. 6A in the main text)
family / % sequence identity / se / % bPCN identity / se
Methanopyraceae / 62.41 / 2.477 / 40.47 / 7.468
Thermococcaceae / 60.66 / 2.781 / 36.72 / 7.708
Methanococcaceae [2] / 77.11 / 0.9206 / 33.51 / 8.369
Sulfolobaceae [2] / 50.3 / 3.343 / 28.01 / 3.566
Desulfurococcaceae / 55.18 / 4.302 / 25.77 / 7.963
Halobacteriaceae [3] / 56.2 / 1.421 / 9.749 / 1.595
The LSU tunnel proteins L4, L22, L23, L24, L29 and L31 were aligned with the respective M. jannaschii orthologs.
The number of species examined is shown in brackets if above 1.
Table S4.6.3 Sequence and bPCN identities to M. jannaschii for the bacterial LSU tunnel proteins (Fig. 6B in the main text)
family / % sequence identity / se / % bPCN identity / se
Spirochaetaceae / 29.3 / 5.73 / 22.66 / 4.229
Mycobacteriaceae [2] / 27.99 / 3.285 / 17.37 / 3.132
Enterobacteriaceae [4] / 31.59 / 3.484 / 12.12 / 3.536
Streptococcaceae / 36.39 / 5.07 / 11.68 / 5.307
Staphylococcaceae / 37.83 / 5.211 / 11.67 / 6.691
Bacillaceae [2] / 35.42 / 4.659 / 11.22 / 3.461
See the footnote of Table S4.6.2 for proteins examined.
Table S4.6.4 Sequence and bPCN identities to M. tuberculosis for the bacterial LSU tunnel proteins (Fig. 6C in the main text)
family / % sequence identity / se / % bPCN identity / se
Mycobacteriaceae [2] / 88.58 / 1.496 / 88.37 / 5.85
Staphylococcaceae / 61.61 / 1.902 / 42.34 / 14.8
Spirochaetaceae / 53.73 / 4.96 / 37.67 / 5.207
Bacillaceae [2] / 63.52 / 1.769 / 37.15 / 9.561
Streptococcaceae / 58.46 / 3.464 / 22.96 / 11.76
Enterobacteriaceae [4] / 57.03 / 1.386 / 21.67 / 4.459
See the footnote of Table S4.6.2 for proteins examined.
Table S4.7.1 The bulk HKR% of ribosomal proteins relative to human [Fig. 7A in the main text]
species / LSU / se / SSU / se
A. gambiae / 1.023 / 0.01069 / 1.012 / 0.01654
A. thaliana / 0.9481 / 0.01273 / 0.9519 / 0.0228
D. melanogaster / 1.006 / 0.01006 / 0.9985 / 0.01332
M. musculus / 0.9991 / 0.00342 / 0.9881 / 0.01451
O. sativa / 0.9451 / 0.01176 / 0.9777 / 0.02385
R. norvegicus / 0.9994 / 0.00327 / 1.004 / 0.00350
S. cerevisiae / 0.8802 / 0.01585 / 0.8965 / 0.02217
T. brucei / 1.014 / 0.01831 / 1.007 / 0.0286
T. thermophila / 0.9337 / 0.01371 / 0.971 / 0.01894
X. laevis / 0.9975 / 0.00461 / 0.9991 / 0.02154
Z. mays / 0.9363 / 0.01544 / 0.9372 / 0.01838
Table S4.7.2 HKR in bPCN of ribosomal proteins as % sequence relative to human [Fig. 7B in the main text]
species / LSU / se / SSU / se
A. gambiae / 1.057 / 0.09124 / 1.093 / 0.08364
A. thaliana / 0.9323 / 0.05968 / 0.9373 / 0.0745
D. melanogaster / 0.9504 / 0.05641 / 0.9963 / 0.08068
M. musculus / 1.001 / 0.01496 / 0.9725 / 0.02779
O. sativa / 0.9444 / 0.06096 / 1.004 / 0.08851
R. norvegicus / 1.003 / 0.01158 / 1.036 / 0.05497
S. cerevisiae / 0.6639 / 0.05139 / 0.761 / 0.07894
T. brucei / 1.06 / 0.07913 / 0.9427 / 0.1233
T. thermophila / 0.7733 / 0.07074 / 0.8413 / 0.06367
X. laevis / 0.9937 / 0.02419 / 1 / 0.04369
Z. mays / 0.9112 / 0.05315 / 0.8236 / 0.06901
Table S4.7.3 Percent rRNA sequence in expansion segments relative to human [Fig. 7C in the main text]
species / LSU RNA ES / SSU RNA ES
A. gambiae / 0.70449 / 0.97374
A. thaliana / 0.53744 / 0.85593
D. melanogaster / 0.70196 / 0.91235
M. musculus / 0.92449 / 0.99823
O. sativa / 0.49968 / 0.99503
R. norvegicus / 0.9401 / 0.99787
S. cerevisiae / 0.53301 / 0.93045
T. brucei / 0.8823 / 1.11285
T. thermophila / 0.47859 / 0.87544
X. laevis / 0.73023 / 0.96842
Z. mays / 0.50791 / 0.95138
Table S4.7.4 rRNA expansion segment GC% relative to human [Fig. 7D in the main text]
species / LSU RNA GC / SSU RNA GC
A. gambiae / 0.67962 / 0.84834
A. thaliana / 0.75783 / 0.8034
D. melanogaster / 0.35011 / 0.68416
M. musculus / 0.95934 / 0.99919
O. sativa / 0.8999 / 0.87286
R. norvegicus / 0.97608 / 0.99395
S. cerevisiae / 0.60213 / 0.67633
T. brucei / 0.5641 / 0.9118
T. thermophila / 0.52332 / 0.64097
X. laevis / 0.99785 / 0.94912
Z. mays / 0.88627 / 0.8691
Table S4.8.1 Ratios of residues in ES as % human to sequence HKR% as % human [Fig. 8A in the main text]
Species / 6ESt/HKRt / se6ESt/HKRt / 4ESt/HKRt / se4ESt/HKRt / EH LSU/SSU
T. thermophila / 0.517 / 0.00736 / 0.9125 / 0.01903 / 0.567
O. sativa / 0.5321 / 0.00683 / 1.034 / 0.02257 / 0.515
Z. mays / 0.5487 / 0.00967 / 1.028 / 0.02207 / 0.534
A. thaliana / 0.571 / 0.00785 / 0.9165 / 0.02455 / 0.623
S. cerevisiae / 0.6132 / 0.01088 / 1.064 / 0.0367 / 0.576
A. gambiae / 0.6914 / 0.00678 / 0.9706 / 0.01666 / 0.712
D. melanogaster / 0.7005 / 0.00674 / 0.9188 / 0.01245 / 0.762
X. laevis / 0.7327 / 0.00330 / 0.9865 / 0.02944 / 0.743
T. brucei / 0.8807 / 0.01453 / 1.13 / 0.03106 / 0.779
M. musculus / 0.9254 / 0.00334 / 1.021 / 0.0259 / 0.906
R. norvegicus / 0.9410 / 0.00308 / 0.9945 / 0.00342 / 0.946
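The final column of Table S4.8.1 ("EH LSU/SSU") appears to be the quotient of the two ratio columns (6ESt/HKRt divided by 4ESt/HKRt). This reading is inferred from the tabulated values rather than stated in the table, so the check below should be taken as an illustration only; the variable names are ours.

```python
# (6ESt/HKRt, 4ESt/HKRt, tabulated EH LSU/SSU) for three rows of Table S4.8.1.
rows = {
    "T. thermophila": (0.517, 0.9125, 0.567),
    "O. sativa": (0.5321, 1.034, 0.515),
    "Z. mays": (0.5487, 1.028, 0.534),
}

# Recompute the last column as the quotient of the first two.
recomputed = {
    species: round(lsu_ratio / ssu_ratio, 3)
    for species, (lsu_ratio, ssu_ratio, _tabulated) in rows.items()
}

for species, value in recomputed.items():
    print(species, value, rows[species][2])
```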
Table S4.8.2 Ratios of ES GC% as % human to sequence HKR% as % human [Fig. 8B in the main text]
species / 6ESGCt/HKRt / se6GC / 4ESGCt/HKRt / se4GC
D. melanogaster / 0.3494 / 0.00336 / 0.6889 / 0.00934
T. brucei / 0.5631 / 0.00930 / 0.9257 / 0.02544
T. thermophila / 0.5652 / 0.00806 / 0.6681 / 0.01392
A. gambiae / 0.667 / 0.00654 / 0.8457 / 0.01451
S. cerevisiae / 0.6929 / 0.01229 / 0.7733 / 0.02667
A. thaliana / 0.8052 / 0.01105 / 0.8601 / 0.02303
Z. mays / 0.9575 / 0.01689 / 0.939 / 0.02016
O. sativa / 0.9583 / 0.01231 / 0.9068 / 0.0198
M. musculus / 0.9605 / 0.00348 / 1.023 / 0.02593
R. norvegicus / 0.977 / 0.00319 / 0.9905 / 0.00340
X. laevis / 1.001 / 0.00452 / 0.9669 / 0.02886
Table S5 Basic PCN clusters in nuclear localization signals
Nuclear localization signals (NLS) in human and human viral proteins that are listed in the UniProt database as experimentally confirmed.
This list excludes entries such as "by similarity" and "probable", and most entries are supported by journal references within the respective records.
The list does not contain any NLS of ribosomal proteins, although of course such motifs are present in most of the ribosomal proteins.
[All human LSU RPs have at least one bPCN cluster with three or more HKR residues and at least 50% HKR content.]
Multiple basic clusters in ribosomal proteins could be assumed to represent sufficient nuclear import signals in any case.
The bPCNs as defined in this review are found in 170 of 174 NLS in this list (97.7%), and cover 70.3% of the sequence of the listed NLSs.
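To make the bPCN criterion concrete, the sketch below scans a sequence for segments that start and end on a basic residue, contain at least three HKR residues, and are at least 50% HKR. The greedy longest-extension rule is an assumption of this sketch, since the exact segmentation procedure is not given in this supplement. Applied to the NLS ARKRKPSP (entry Q7Z5L9), it returns RKRK, which is 50% of the NLS length, in agreement with the tabulated values.

```python
BASIC = set("HKR")


def find_bpcns(seq, min_basic=3, min_fraction=0.5):
    """Return candidate bPCN segments in `seq` (greedy, left to right).

    A segment starts and ends on H, K or R, holds at least `min_basic` basic
    residues, and has a basic fraction of at least `min_fraction`. From each
    start, the longest qualifying extension is taken (sketch assumption).
    """
    segments = []
    i = 0
    while i < len(seq):
        if seq[i] not in BASIC:
            i += 1
            continue
        best_end = None
        basic_count = 0
        for j in range(i, len(seq)):
            if seq[j] in BASIC:
                basic_count += 1
                length = j - i + 1
                if basic_count >= min_basic and basic_count / length >= min_fraction:
                    best_end = j
        if best_end is None:
            i += 1
        else:
            segments.append(seq[i:best_end + 1])
            i = best_end + 1
    return segments


nls = "ARKRKPSP"
bpcns = find_bpcns(nls)
coverage = 100.0 * sum(len(s) for s in bpcns) / len(nls)
print(bpcns, coverage)
```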
UniProt accession / UniProt designation / NLS sequence / bPCN sequence(s) / bPCN % NLS length
Q7Z5L9 / NLS / ARKRKPSP / RKRK / 50