
SUPPLEMENTARY TABLES S3-S6: Evaluation of sequence parameters in ribosomal proteins and RNAs

Unless stated otherwise, test significances are for post hoc Bonferroni t tests: p < 0.1 where bracketed, p < 0.05 otherwise.

Boldface font indicates tests within LSU ("LSU:LSU") or SSU ("SSU:SSU") sets, plain font between LSU and SSU ("LSU:SSU" and "SSU:LSU").

For the species and accession codes used, see Tables S1 (LSU) and S2 (SSU).

TABLE

Table S3A Accruing mean differences of ionic parameters for LSU ribosomal proteins

Table S3.1 The number of residues in ribosomal proteins examined (Fig. 1A in the main text)

Table S3.2 Hydrophobicity of the ribosomal proteins examined (Fig. 1B in the main text)

Table S3.3 ILMV>2 clusters as % sequence of the ribosomal proteins examined (Fig. 1C in the main text)

Table S3.4 AGILMV>2 clusters as % sequence of the ribosomal proteins examined (Fig. 1D in the main text)

Table S3.5A Numbers of residues per RP sequence in ILMV>2 clusters

Table S3.5B Numbers of residues per RP sequence in AGILMV>2 clusters

Table S3.6 Numbers of all ILMV and AG residues per RP sequence in LSU ribosomal proteins

Table S3.7 Basic amino acid residues as % RP sequence (Fig. 2A in the main text)

Table S3.8 Acidic amino acid residues as % RP sequence (Fig. 2B in the main text)

Table S3.9 Homobasic segments with >1 HKR residue as % RP sequence (Fig. 2C in the main text)

Table S3.10 Homoacidic segments with >1 DE as % RP sequence (Fig. 2D in the main text)

Table S3.11 Basic PCNs as % RP sequence (Fig. 3A in the main text)

Table S3.12 Acidic PCNs as % RP sequence (Fig. 3B in the main text)

Table S3.13 The number of basic residues in bPCNs per RP sequence ("basic PCN impact") (Fig. 3C in the main text)

Table S3.14 The number of acidic residues in aPCNs per RP sequence ("acidic PCN impact") (Fig. 3D in the main text)

Table S4.1 Eukaryote-only (EO) LSU ribosomal proteins

Table S4.2 Eukaryote and archaeal alignable (EA) LSU ribosomal proteins

Table S4.3 Eukaryote, archaeal and bacterial alignable (EAB) LSU ribosomal proteins

Table S4.4.1 LSU 41 EO/EA/EAB, archaeal and bacterial RPs: sequence identity to human orthologs [Fig. 4A]

Table S4.4.2 LSU 41 EO/EA/EAB, archaeal and bacterial RPs: bPCN alignment identity to human [Fig. 4B]

Table S4.4.3 LSU numbers of bPCNs per 100 aa and per entire sequence [Fig. 4C]

Table S4.5 LSU bPCN alignment with ribosomal proteins of group-representative species

Table S4.6.1 The families and species compared in LSU alignments of bPCNs in archaea and bacteria (Fig. 6 in the main text)

Table S4.6.2 Sequence and bPCN identities with M. jannaschii for archaeal LSU tunnel proteins (Fig. 6A in the main text)

Table S4.6.3 Sequence and bPCN identities to M. jannaschii for the bacterial LSU tunnel proteins (Fig. 6B in the main text)

Table S4.6.4 Sequence and bPCN identities to M. tuberculosis for the bacterial LSU tunnel proteins (Fig. 6C in the main text)

Table S4.7.1 The bulk HKR% of ribosomal proteins relative to human [Fig. 7A in the main text]

Table S4.7.2 HKR in bPCN of ribosomal proteins as % sequence relative to human [Fig. 7B in the main text]

Table S4.7.3 Percent rRNA sequence in expansion segments relative to human [Fig. 7C in the main text]

Table S4.7.4 rRNA expansion segment GC% relative to human [Fig. 7D in the main text]

Table S4.8.1 Ratios of residues in ES as % human to sequence HKR% as % human [Fig. 8A in the main text]

Table S4.8.2 Ratios of ES GC% as % human to sequence HKR% as % human [Fig. 8B in the main text]

Table S5 Basic PCN clusters in nuclear localization signals

Table S6.1 LSU ES boundaries defined by alignment with human 28S rRNA compared with published models for 25-28S rRNAs

Table S6.2 LSU ES boundaries defined by alignment with yeast 25S rRNA compared with human 28S rRNA-defined boundaries

Table S3A Accruing mean differences of ionic parameters for LSU ribosomal proteins

Percent difference of ionic parameter accrued means from the final group mean in LSU ribosomal proteins

Means for all species preceding and including the current column were subtracted from the final mean for all species of a group, and the difference expressed as % of that mean.

This table illustrates the degree of parameter homogeneity within the examined groups of species.
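The accrued-mean calculation described above can be sketched in Python as follows. This is a minimal illustration, not the authors' script: the function name `accrued_mean_differences` and the example values are hypothetical, and the sign convention (final mean minus accrued mean) is taken from the wording of the description.

```python
from itertools import accumulate


def accrued_mean_differences(values):
    """Return mdif1..mdif(n-1) for one parameter within one group of species.

    For each k, the mean of the first k species is subtracted from the final
    mean over all n species, and the difference is expressed as % of the final
    mean. The n-th value would always be 0, so it is omitted.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two species in a group")
    final_mean = sum(values) / n
    if final_mean == 0:
        raise ValueError("final mean is zero; % difference is undefined")
    running_sums = list(accumulate(values))
    return [
        100.0 * (final_mean - running_sums[k] / (k + 1)) / final_mean
        for k in range(n - 1)
    ]


# Hypothetical per-species HKR% values for a three-species group
# (illustrative only, not taken from the tables).
example = [20.0, 24.0, 22.0]
print([round(d, 2) for d in accrued_mean_differences(example)])
```

A group of n species yields n - 1 values, which matches the column counts in the table (for example, eleven mdif columns for twelve archaeal species).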

Parameter abbreviations:

DE = acidic residues as % sequence aa

HKR = basic residues as % sequence aa

azip = % sequence in homoionic segments containing >1 DE residue

bzip = % sequence in homoionic segments containing >1 HKR residue

aPCN = acidic PCN segments with >2 and >=50% DE residues

bPCN = basic PCN segments with >2 and >=50% HKR residues

mdif1, 2, 3… = % difference from the overall mean for the mean after addition of the parameter of species 1, 2, 3…
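The composition parameters and the PCN criterion above can be illustrated with a short Python sketch. The DE% and HKR% functions follow the definitions directly. The segment finder is only one possible reading of ">2 and >=50%": it assumes that a segment starts and ends at a charged residue of the given type and is extended greedily while the density stays at or above 50%. That scanning rule and all function names are assumptions, not the procedure used to build the tables.

```python
BASIC = set("HKR")
ACIDIC = set("DE")


def percent_residues(seq, residues):
    """Residues of the given set as % of the sequence length."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(aa in residues for aa in seq) / len(seq)


def find_pcn_segments(seq, residues=BASIC, min_count=3, min_fraction=0.5):
    """Return half-open (start, end) index pairs of candidate PCN segments.

    Assumed rule: start at a residue of the set, extend to the next such
    residue while the fraction of set residues in the span stays at or above
    min_fraction, and keep spans holding at least min_count set residues.
    """
    seq = seq.upper()
    hits = [i for i, aa in enumerate(seq) if aa in residues]
    segments = []
    i = 0
    while i < len(hits):
        start = hits[i]
        j = i
        while j + 1 < len(hits):
            count = (j + 1) - i + 1           # set residues if extended
            length = hits[j + 1] - start + 1  # span length if extended
            if count / length < min_fraction:
                break
            j += 1
        if j - i + 1 >= min_count:
            segments.append((start, hits[j] + 1))
            i = j + 1
        else:
            i += 1
    return segments


seq = "ARKRKPSP"  # NLS of UniProt entry Q7Z5L9 as listed in Table S5
print(percent_residues(seq, BASIC))
print([seq[s:e] for s, e in find_pcn_segments(seq)])
```

Passing `residues=ACIDIC` gives the corresponding acidic (aPCN) scan under the same assumed rule.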

Archaea (12) / mdif1 / mdif2 / mdif3 / mdif4 / mdif5 / mdif6 / mdif7 / mdif8 / mdif9 / mdif10 / mdif11
DE / -23.54 / 8.61 / 9.79 / 4.48 / 0.33 / -2.19 / 3.78 / 1.07 / 5.97 / 3.82 / 1.4
HKR / 6.02 / -6.66 / -2.15 / 2.67 / 2.04 / 1.62 / -1.52 / 0.38 / -2.87 / -1.18 / -0.68
azip / -43.62 / 13.65 / 12.21 / 3.23 / -0.75 / -3.65 / 7.19 / 1.98 / 11.25 / 6.81 / 3.05
bzip / 12.58 / -8.61 / -6.44 / -1.3 / 0.92 / 1.98 / -2.24 / 0 / -4.32 / -2.4 / -0.96
aPCN / -64.89 / 28.64 / 23.01 / 4.47 / -8.08 / -15.88 / 6.09 / -0.71 / 17.22 / 12.1 / 3.89
bPCN / 2.79 / -15.32 / -5.02 / 6.96 / 3.95 / 1.75 / -3.36 / 0.26 / -5.24 / -2.08 / -1.76
Bacteria (12) / mdif1 / mdif2 / mdif3 / mdif4 / mdif5 / mdif6 / mdif7 / mdif8 / mdif9 / mdif10 / mdif11
DE / -0.64 / -1.02 / 1.37 / 1.83 / 0.18 / -0.02 / -0.27 / -1.7 / -1.86 / -0.32 / 0.18
HKR / -0.23 / -1.81 / -1.69 / -0.98 / -0.39 / -0.21 / 0.07 / 0.3 / 0.02 / -0.18 / 0.13
azip / -3.74 / 2.23 / 4.43 / 4.87 / 2.03 / 0.9 / -0.08 / -2.28 / -1.39 / 0.28 / 0.13
bzip / 0.79 / -1.42 / -2.41 / -2.09 / -0.79 / -0.39 / 0 / 0.92 / 0.98 / 0.28 / 0.11
aPCN / 18.55 / 8.6 / 1.95 / -7.53 / -3.25 / -2.83 / -0.14 / -5.54 / -5.73 / 1.77 / 1.3
bPCN / -14.15 / -1.78 / 3.33 / 0.76 / 5.49 / 3.76 / 1.25 / 1.83 / 0.74 / -0.31 / -0.16
Lower eukarya (3) / mdif1 / mdif2
DE / -3.44 / -1.93
HKR / 3.12 / 1.08
azip / -5.3 / -11.2
bzip / 3.18 / 0.88
aPCN / -24.31 / -19.44
bPCN / 9.65 / 0.23
Insects (2) / mdif1
DE / 1.87
HKR / -2.6
azip / 5.19
bzip / -1.26
aPCN / -7.24
bPCN / 0.09
Non-mammalian vertebrates [3] / mdif1 / mdif2
DE / 0.45 / -0.44
HKR / 0.87 / -0.06
azip / 9.06 / 0.66
bzip / -0.95 / -0.12
aPCN / -2.51 / -0.51
bPCN / 4.88 / 0.3

Mammalian [5] / mdif1 / mdif2 / mdif3 / mdif4
DE / 1.85 / -1.51 / -2.19 / -0.13
HKR / -0.55 / 1.06 / 1.09 / -0.11
azip / 2.27 / -2.51 / -3.56 / 0.07
bzip / -0.54 / 1.06 / 1.25 / -0.01
aPCN / 7.45 / 9.79 / 7.9 / -1.96
bPCN / -2.71 / 1.44 / 2.23 / -0.09
Plants [3] / mdif1 / mdif2
DE / 0.21 / -0.5
HKR / -1.4 / 1.23
azip / 4.88 / -1.43
bzip / -0.96 / 0.28
aPCN / -34.03 / -12.25
bPCN / -2.18 / 3.64

Table S3.1 The number of residues in ribosomal proteins examined (Fig. 1A in the main text)

group / LSU #residues / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU #residues / se / SSU:SSU sgnf. / SSU:LSU sgnf. / LSU/SSU size ratio
archaeal [A] / 140.2 / 7.46 / 142.8 / 3.98 / 0.982
bacterial [B] / 123.1 / 2.89 / 135 / 4.61 / 0.912
lower eukarya [L] / 168 / 6.858 / AB / AB / 166.5 / 6.57 / AB / AB / 1.009
insect [I] / 178.5 / 9.489 / AB / ABM[N] / 169.1 / 9.232 / AB / AB / 1.056
non-mammalian [N] / 171.7 / 7.125 / AB / AB[M] / 159.6 / 6.631 / 1.076
mammalian [M] / 169.2 / 5.321 / AB / AB / 157.5 / 4.941 / 1.074
plant [P] / 172.7 / 7.015 / AB / AB[M] / 167.9 / 6.643 / AB / AB / 1.029
Giardia / 167.9 / 10.27 / n. t. / n. t. / 164.3 / 9.762 / n. t. / n. t. / 1.022

n.t. = not tested

Table S3.2 Hydrophobicity of the ribosomal proteins examined (Fig. 1B in the main text)

group / LSU / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal [A] / -0.5878 / 0.019 / IMLN[L] / IMLNP / -0.5412 / 0.0160
bacterial [B] / -0.4832 / 0.0176 / AILMNP / IMLNP / -0.5665 / 0.0168
lower eukarya [L] / -0.6827 / 0.03099 / IM[N] / IMLNP / -0.5545 / 0.02731
insect [I] / -0.7819 / 0.03773 / IMLNP / -0.5697 / 0.03362
non-mammalian [N] / -0.7587 / 0.0293 / IMLNP / -0.5792 / 0.03027
mammalian [M] / -0.7853 / 0.02688 / IMLNP / -0.5714 / 0.02271
plant [P] / -0.7114 / 0.03351 / M / IMLNP[A] / -0.5453 / 0.02884
Giardia / -0.5670 / 0.04404 / n. t. / n. t. / -0.4480 / 0.0460 / n. t. / n. t.

Table S3.3 ILMV>2 clusters as % sequence of the ribosomal proteins examined (Fig. 1C in the main text)

group / LSU-ILMV>2% / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU-ILMV>2% / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal [A] / 3.123 / 0.143 / ILMNP / BIN / 3.006 / 0.153 / B / ILP[M]
bacterial [B] / 2.72 / 0.141 / I[LP] / B / 1.797 / 0.127
lower eukarya [L] / 2.254 / 0.2003 / 2.65 / 0.2647 / B / [I]
insect [I] / 1.932 / 0.219 / 2.395 / 0.3198
non-mammalian [N] / 2.489 / 0.2085 / [B] / 2.49 / 0.265 / [B]
mammalian [M] / 2.441 / 0.1576 / [B] / 2.632 / 0.2015 / B / I
plant [P] / 2.327 / 0.2 / 2.703 / 0.2771 / B / I
Giardia / 1.921 / 0.3452 / n. t. / n. t. / 3.086 / 0.569 / n. t. / n. t.

Table S3.4 AGILMV>2 clusters as % sequence of the ribosomal proteins examined (Fig. 1D in the main text)

group / LSU-AGILMV>2 % / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU-AGILMV>2% / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal [A] / 10.98 / 0.354 / ILMNP / 12.08 / 0.316 / ILMNP / ILMNP
bacterial [B] / 12.09 / 0.328 / ILMNP / ILMNP / 12.06 / 0.367 / AILMNP / ILMNP
lower eukarya [L] / 9.522 / 0.4298 / 10.74 / 0.6029 / ILMNP
insect [I] / 8.345 / 0.5217 / 9.445 / 0.6251
non-mammalian [N] / 8.864 / 0.4393 / 9.533 / 0.5539
mammalian [M] / 9.04 / 0.3376 / 9.673 / 0.421 / [I]
plant [P] / 8.902 / 0.4499 / 10.24 / 0.5368 / I[MNP]
Giardia / 9.17 / 0.8086 / n. t. / n. t. / 11.39 / 1.317 / n. t. / n. t.

Table S3.5A Numbers of residues per RP sequence in ILMV >2 clusters

group / LSU # ILMV in >2 / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU # ILMV in >2 / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal / 4.29 / 0.2924 / BI / B / 4.572 / 0.2555 / B
bacterial / 3.673 / 0.2915 / B / 2.69 / 0.2154 / B
lower eukarya / 4.024 / 0.3438 / B / 4.362 / 0.4264 / B
insect / 3.448 / 0.4049 / B / 4.355 / 0.5624 / B
non-mammalian / 4.605 / 0.4084 / B / 4.172 / 0.4451 / B
mammalian / 4.397 / 0.2943 / B[I] / B / 4.352 / 0.3274 / B
plant / 4.232 / 0.3382 / B / 4.591 / 0.4762 / B
Giardia / 3.162 / 0.4985 / n. t. / n. t. / 5.368 / n. t. / n. t.

# ILMV>2 = the average number of ILMV residues in the respective clusters of RP sequences

Table S3.5B Numbers of residues per RP sequence in AGILMV>2 clusters

group / LSU # AGILMV in >2 clusters / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU #AGILMV in >2 clusters / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal / 16.95 / 0.675 / 18.06 / 0.6433 / M[N]
bacterial / 16.31 / 0.5913 / 17.65 / 0.9197 / [M]
lower eukarya / 16.71 / 1.034 / 18.15 / 1.132 / [M]
insect / 15.42 / 1.199 / 16.52 / 1.337
non-mammalian / 15.72 / 0.9245 / 15.53 / 1.1
mammalian / 15.89 / 0.6932 / 15.58 / 0.808
plant / 16.9 / 1.047 / 17.11 / 1.069
Giardia / 15.43 / 1.582 / n. t. / n. t. / 19.11 / 2.21 / n. t. / n. t.

# AGILMV = the average number of AGILMV residues in the respective clusters of RP sequences

Table S3.6 Numbers of all ILMV and AG residues per RP sequence in LSU ribosomal proteins

group / # ILMV / se / % mammal / # AG / se / % mammal
archaeal / 33.51 / 0.9115 / 89.79 / 22.98 / 0.7077 / 91.99
bacterial / 29.55 / 0.7447 / 79.18 / 23.2 / 0.708 / 92.87
lower eukarya / 36.39 / 1.579 / 97.51 / 26.4 / 1.399 / 105.68
insect / 37.41 / 2.081 / 100.24 / 28.08 / 2.337 / 112.41
non-mammalian / 37.48 / 1.643 / 100.43 / 25.1 / 1.332 / 100.48
mammalian / 37.32 / 1.262 / 100 / 24.98 / 1.004 / 100
plant / 37.9 / 1.668 / 101.55 / 26.63 / 1.384 / 106.61

# ILMV, # AG = the average numbers of the respective residues per LSU RP sequence

Table S3.7 Basic amino acid residues as % RP sequence (Fig. 2A in the main text)

group / LSU-HKR% / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU-HKR% / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal [A] / 21.71 / 0.356 / A / 19.64 / 0.288
bacterial [B] / 22.17 / 0.324 / A / 22.6 / 0.31 / A
lower eukarya [L] / 26.01 / 0.5633 / AB / ABILMP / 21.54 / 0.4798 / A
insect [I] / 28.11 / 0.7149 / ABLP / ABILMNP / 22.58 / 0.5726 / A
non-mammalian [N] / 27.52 / 0.5453 / ABL[P] / ABILMNP / 23 / 0.4888 / A
mammalian [M] / 28.25 / 0.5363 / ABLP / ABILMNP / 23.06 / 0.3479 / A[L]
plant [P] / 26.17 / 0.6222 / AB / ABILMNP / 21.78 / 0.4636 / A
Giardia / 23.24 / 0.777 / n. t. / 21.46 / 0.758 / n. t.

Table S3.8 Acidic amino acid residues as % RP sequence (Fig. 2B in the main text)

group / LSU-DE% / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU-DE% / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal [A] / 12.45 / 0.291 / BILMNP / BIMNP / 13.56 / 0.284 / BIMNP / ABILMNP
bacterial [B] / 9.658 / 0.192 / ILMNP / 9.72 / 0.299 / [P] / ILMNP
lower eukarya [L] / 7.181 / 0.2553 / 9.88 / 0.3142 / ILMNP
insect [I] / 7.142 / 0.3061 / 10.44 / 0.3617 / ILMNP
non-mammalian [N] / 7.466 / 0.2992 / 9.669 / 0.2942 / ILMNP
mammalian [M] / 7.2 / 0.2291 / 9.687 / 0.2192 / ILMNP
plant [P] / 8.117 / 0.293 / LM[I] / 9.581 / 0.2895 / ILMNP
Giardia / 8.88 / 0.483 / n. t. / n. t. / 8.7 / 0.47 / n. t. / n. t.

Table S3.9 Homobasic segments with >1 HKR residue as % RP sequence (Fig. 2C in the main text)

group / LSU % sequence / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU % sequence / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal [A] / 45.94 / 1.012 / A / 40.76 / 0.808
bacterial [B] / 48.15 / 0.808 / A / 47.77 / 1.157 / A
lower eukarya [L] / 63.67 / 1.173 / ABP / ABILMNP / 51.78 / 1.274 / AB / AB
insect [I] / 67.11 / 1.268 / ABP[L] / ABILMNP / 52.03 / 1.682 / AB / AB
non-mammalian [N] / 65.06 / 1.271 / ABP / ABILMNP / 53.96 / 1.393 / AB / AB
mammalian [M] / 66.31 / 0.9583 / ABP / ABILMNP / 54.35 / 1.034 / AB / AB
plant [P] / 59.62 / 1.148 / AB / ABILMNP / 53.63 / 1.509 / AB / AB
Giardia / 56.75 / 2.081 / n. t. / n. t. / 53.81 / 2.44 / n. t.

Table S3.10 Homoacidic segments with >1 DE as % RP sequence (Fig. 2D in the main text)

group / LSU % sequence / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU % sequence / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal [A] / 18.68 / 0.747 / BILMNP / BILMNP / 20.95 / 0.728 / BILMNP / ABIMNPU
bacterial [B] / 13.9 / 0.523 / ILMNP / 13.77 / 0.788 / ILMNP
lower eukarya [L] / 7.833 / 0.5318 / 12.88 / 0.8552 / ILMNP
insect [I] / 6.703 / 0.6028 / 14.55 / 0.9957 / ILMNP
non-mammalian [N] / 8.012 / 0.5485 / 12.48 / 0.7566 / ILMNP
mammalian [M] / 7.949 / 0.4201 / 12.89 / 0.542 / ILMNP
plant [P] / 9.636 / 0.5705 / I[LM] / 14.11 / 0.9039 / ILMNP
Giardia / 10.24 / 1.168 / n. t. / n. t. / 10.83 / 1.13 / n. t. / n. t.

Table S3.11 Basic PCNs as % RP sequence (Fig. 3A in the main text)

group / LSU-bPCN% / se / LSU:LSU sgnf. / LSU:SSU sgnf. / found in % LSU / SSU-bPCN% / se / SSU:SSU sgnf. / SSU:LSU sgnf. / found in % SSU
archaeal [A] / 9.812 / 0.444 / A[L] / 84.71 / 7.073 / 0.33 / 83.5
bacterial [B] / 9.556 / 0.492 / A[L] / 80.5 / 10.08 / 0.779 / A / 97.98
lower eukarya [L] / 13.53 / 0.9332 / AB / ABILMP[N] / 98.43 / 7.92 / 0.4977 / 91.49
insect [I] / 16.65 / 1.302 / ABL / ABILMNP / 98.82 / 9.478 / 0.7796 / [A] / 95.16
non-mammalian [N] / 16.85 / 0.9051 / ABLP / ABILMNP / 99.25 / 10.98 / 0.6735 / AL / 96.55
mammalian [M] / 18.48 / 0.9724 / ABLP / ABILMNP / 100 / 10.66 / 0.4917 / AL / 97.48
plant [P] / 15.48 / 1.046 / AB / ABILMNP / 99.28 / 9.476 / 0.6249 / [A] / 96.77
Giardia / 10.02 / 0.9725 / n. t. / n. t. / 9.815 / 1.1 / n. t. / n. t.

Table S3.12 Acidic PCNs as % RP sequence (Fig. 3B in the main text)

group / LSU-aPCN% / se / LSU:LSU sgnf. / LSU:SSU sgnf. / found in % LSU / SSU-aPCN% / se / SSU:SSU sgnf. / SSU:LSU sgnf. / found in % SSU
archaeal [A] / 3.061 / 0.237 / BILMNP / BIMNPU / 49.76 / 3.269 / 0.278 / BILMNP / ABILMNP / 54.21
bacterial [B] / 0.7734 / 0.0861 / 21 / 0.7483 / 0.127 / 24.19
lower eukarya [L] / 0.581 / 0.127 / 20.47 / 1.544 / 0.2176 / B / BILMN[P] / 44.68
insect [I] / 0.4828 / 0.1163 / 20 / 1.523 / 0.3104 / [B] / BILMU / 41.94
non-mammalian [N] / 0.8536 / 0.1481 / 28.57 / 1.126 / 0.1987 / [I] / 31.03
mammalian [M] / 0.793 / 0.1052 / 26.2 / 1.113 / 0.1459 / [IL] / 31.45
plant [P] / 0.9301 / 0.1922 / 23.91 / 1.314 / 0.1701 / IL[BM] / 47.31
Giardia / 0.871 / 0.2373 / n. t. / n. t. / 0.816 / 0.374 / n. t. / n. t.

Table S3.13 The number of basic residues in bPCNs per RP sequence ("basic PCN impact") (Fig. 3C in the main text)

group / LSU #HKR in bPCN / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU #HKR in bPCN / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal [A] / 8.915 / 0.353 / 7.256 / 0.327
bacterial [B] / 7.06 / 0.28 / 8.702 / 0.258
lower eukarya [L] / 15.42 / 0.7976 / AB / AILMP[N] / 9.883 / 0.7173 / A / B
insect [I] / 19.18 / 0.996 / ABL[P] / ABILMNP / 12.1 / 1.107 / AB[L] / AB
non-mammalian [N] / 19.62 / 0.8322 / ABLP / ABILMNP / 13.34 / 0.937 / ABL / AB
mammalian [M] / 20.04 / 0.6527 / ABLP / ABILMNP / 12.8 / 0.6942 / ABL / AB
plant [P] / 17.35 / 0.7681 / ABL / ABILMNP / 11.74 / 0.8198 / AB / AB
Giardia / 12.22 / 1.15 / n. t. / n. t. / 7.489 / 0.846 / n. t. / n. t.

Table S3.14 The number of acidic residues in aPCNs per RP sequence ("acidic PCN impact") (Fig. 3D in the main text)

group / LSU #DE in aPCN / se / LSU:LSU sgnf. / LSU:SSU sgnf. / SSU #DE in aPCN / se / SSU:SSU sgnf. / SSU:LSU sgnf.
archaeal [A] / 3.551 / 0.277 / BILMNP / BILMNP / 4 / 0.458 / BILMNP / ABILMNP
bacterial [B] / 0.833 / 0.054 / 1.29 / 0.192
lower eukarya [L] / 0.9685 / 0.2219 / 2.287 / 0.3271 / B[N] / BILM[NP]
insect [I] / 0.8706 / 0.22 / 1.919 / 0.3214 / B[IL]
non-mammalian [N] / 1.331 / 0.2121 / 1.471 / 0.2563
mammalian [M] / 1.192 / 0.1476 / 1.43 / 0.1837
plant [P] / 1.377 / 0.2944 / 2.064 / 0.2489 / BILM[NP]
Giardia / 1.351 / 0.3975 / n. t. / n. t. / 0.684 / 0.299 / n. t. / n. t.

Eukaryote-only (EO), Eukaryote/Archaeal (EA) and Eukaryote/Archaeal/Bacterial (EAB) LSU proteins

LSU ribosomal proteins grouped by interdomain sequence alignment. The classification is largely as proposed by Klinge et al., Science 334:941-948 (2011).

Alignments with human LSU RPs were obtained with the SSEARCH3 program (Pearson WR (2000) Methods Mol Biol 132:185-219).

The SWR (Smith-Waterman ratio) parameter listed is the average of Smith-Waterman indices expressed per number of residues in the corresponding sequences.

This normalization by sequence length makes the SW data comparable across protein sequences of different sizes.
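As a minimal sketch of the SWR calculation, the Python snippet below divides each Smith-Waterman score by a residue count and averages the resulting ratios. Two points are assumptions because the text does not fix them: which sequence length serves as the denominator, and that the per-alignment ratios are averaged rather than taking a ratio of sums. The scores and lengths are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def smith_waterman_ratio(alignments):
    """Return (mean, standard error) of SW score per residue.

    alignments: iterable of (sw_score, n_residues) pairs, one per alignment.
    """
    ratios = [score / n_residues for score, n_residues in alignments]
    if len(ratios) < 2:
        raise ValueError("need at least two alignments for a standard error")
    return mean(ratios), stdev(ratios) / sqrt(len(ratios))


# Hypothetical SW scores and sequence lengths (not from the tables).
swr, se = smith_waterman_ratio([(600, 100), (900, 200)])
print(swr, se)
```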

Abbreviations: A = archaeal; B = Bacterial; L = Lower eukarya; I = Insect; N = Non-mammalian vertebrate; M = Mammalian vertebrate; P = Angiosperm plant.

Five mammalian, three plant, five each archaeal and bacterial, three non-mammalian vertebrate, two insect and three lower eukaryotic species were compared.

The archaeal species used were H. marismortui, M. kandleri, M. jannaschii, M. maripaludis and S. solfataricus.

The bacterial species used were E. coli, M. leprae, M. smegmatis, M. tuberculosis and B. subtilis.

The eukaryotic sequences included five mammalian, three plant and two insect species listed in Table S1, two non-mammalian vertebrates (D. rerio and X. laevis) and two lower eukaryotic species (S. cerevisiae and T. thermophila).

Significance is for Bonferroni post hoc tests (p < 0.05 for non-bracketed, p < 0.1 for bracketed items).
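The significance notation can be illustrated with a small Python sketch that builds a label such as `A[B]` from pairwise p-values. It assumes the standard Bonferroni adjustment (raw p multiplied by the number of comparisons, capped at 1) and that the letters list the groups whose mean is significantly smaller, as in the "significantly larger than" rows of the S4 tables. The function name, group means and raw p-values are hypothetical.

```python
def significance_label(group, means, raw_p, alpha=0.05, alpha_bracketed=0.1):
    """Return a label listing groups that `group` significantly exceeds.

    means: dict of group letter -> group mean.
    raw_p: dict of frozenset({g1, g2}) -> unadjusted pairwise p-value.
    Plain letters: adjusted p < alpha; bracketed: alpha <= adjusted p < alpha_bracketed.
    """
    n_comparisons = len(raw_p)
    plain, bracketed = [], []
    for other in sorted(means):
        if other == group or means[group] <= means[other]:
            continue
        p_adj = min(1.0, raw_p[frozenset((group, other))] * n_comparisons)
        if p_adj < alpha:
            plain.append(other)
        elif p_adj < alpha_bracketed:
            bracketed.append(other)
    label = "".join(plain)
    if bracketed:
        label += "[" + "".join(bracketed) + "]"
    return label


# Hypothetical means and unadjusted p-values for three groups.
means = {"A": 1.0, "B": 0.8, "M": 3.0}
raw_p = {
    frozenset("AB"): 0.5,
    frozenset("AM"): 0.001,
    frozenset("BM"): 0.02,
}
print(significance_label("M", means, raw_p))
```

With three comparisons, the M-versus-B p-value of 0.02 becomes 0.06 after adjustment, so B appears in brackets, while M-versus-A remains below 0.05 and appears unbracketed.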

Table S4.1 Eukaryote-only (EO) LSU ribosomal proteins

LSU proteins of five archaeal species most comparable to human LSU "EO" proteins were also aligned, to illustrate the lack of similarity with eukaryotic sequences.

The RP correspondence table:

Eukaryote / Archaeal
RL06 / RL14E
RL13 / RL13
RL22 / RL15E
RL27 / RL06
RL28 / RL15E
RL29 / RL31
RL30 / RL22

Smith-Waterman ratios compared for EO LSU ribosomal proteins

Parameter / A / L / I / N / M / P
mean / 0.4779 / 2.808 / 3.282 / 5.862 / 6.227 / 3.131
se / 0.0572 / 0.2221 / 0.208 / 0.1156 / 0.06688 / 0.1714
min / 0.162 / 0.9403 / 1.692 / 5.229 / 5.138 / 1.643
max / 1.608 / 4.19 / 4.382 / 6.431 / 6.548 / 4.417
# proteins / 35 / 14 / 14 / 14 / 35 / 21
mean as % mammal / 7.67 / 45.09 / 52.71 / 94.14 / 100 / 50.28
significantly larger than / A / AL / AILP / AILNP / A[L]

A = archaea; L = lower eukarya; I = insects; N = non-mammalian vertebrate; M = mammalian; P = plant

Numbers of residues compared for EO LSU ribosomal proteins

Parameter / A / L / I / N / M / P
mean / 149.1 / 134.5 / 168.9 / 149.2 / 166.3 / 145
se / 8.14 / 12.27 / 19.23 / 16.5 / 10.28 / 11.77
min / 77 / 58 / 76 / 64 / 92 / 60
max / 216 / 206 / 299 / 258 / 298 / 233
# proteins / 35 / 14 / 14 / 14 / 35 / 21
mean as % non-mammalian [N] / 99.93 / 90.15 / 113.2 / 100 / 111.46 / 97.18
significantly larger than / n / n / n / n / n / n

Table S4.2 Eukaryote and archaeal alignable (EA) LSU ribosomal proteins

Nineteen LSU RPs included.

The RP correspondence table:

Eukaryote / Archaeal / Eukaryote / Archaeal
RL07A / RL07A / RL39 / RL39
RL14 / RL14E / RL40 / RL40
RL15 / RL15E
RL18 / RL18E
RL18A / RLX
RL19 / RL19E
RL21 / RL21E
RL24 / RL24E
RL30 / RL30E
RL31 / RL31
RL32 / RL32
RL34 / RL34
RL35A / RL35A
RL36A / RL44E
RL37 / RL37
RL37A / RL37A
RL38 / RL38

Smith-Waterman ratios compared for EA LSU ribosomal proteins

Parameter / A / L / I / N / M / P
mean / 1.942 / 3.703 / 4.741 / 6.23 / 6.58 / 4.148
se / 0.07679 / 0.1567 / 0.1693 / 0.09539 / 0.0361 / 0.1419
min / 0.219 / 1.461 / 2.414 / 1.712 / 5.188 / 0.6316
max / 3.721 / 6.039 / 6.827 / 7.192 / 7.192 / 6.385
# RPs / 95 / 38 / 38 / 57 / 95 / 57
mean as % mammal / 29.51 / 56.28 / 72.05 / 94.68 / 100 / 63.04
significantly larger than / A / ALP / AILP / AINLP / AL

A = archaea; L = lower eukarya; I = insects; N = non-mammalian vertebrate; M = mammal; P = plant

Numbers of residues compared for EA LSU ribosomal proteins

Parameter / A / L / I / N / M / P
mean / 98.46 / 132.3 / 141.3 / 133.4 / 138 / 135.4
se / 4.037 / 8.461 / 9.243 / 7.073 / 5.865 / 7.275
min / 47 / 50 / 51 / 50 / 50 / 51
max / 241 / 255 / 271 / 266 / 266 / 268
# proteins / 95 / 38 / 38 / 57 / 95 / 57
mean as % mammal / 71.35 / 95.87 / 102.39 / 96.67 / 100 / 98.12
significantly larger than / A / A / A / A / A

Table S4.3 Eukaryote, archaeal and bacterial alignable (EAB) LSU ribosomal proteins

Fifteen eukaryote LSU RPs included: RL3, 4, 5, 7, 8, 9, 10, 11, 13A, 17, 23, 23A, 26, 27A, 35.

The correspondence table:

Eukaryote / Archaeal / Bacterial
RL03 / RL03 / RL03
RL04 / RL04 / RL04
RL05 / RL18 / RL17
RL07 / RL30 / RL30
RL08 / RL02 / RL02
RL09 / RL06 / RL06
RL10 / RL10 / RL16
RL11 / RL05 / RL05
RL13A / RL13 / RL13
RL17 / RL22 / RL22
RL23 / RL14 / RL14
RL23A / RL23 / RL23
RL26 / RL24 / RL24
RL27A / RL15 / RL15
RL35 / RL29 / RL29

Smith-Waterman index / sequence length ratios compared for EAB LSU ribosomal proteins

Parameter / A / B / L / I / N / M / P
mean / 2.196 / 0.9158 / 3.848 / 4.554 / 6.113 / 6.522 / 4.201
se / 0.07645 / 0.04712 / 0.1368 / 0.1675 / 0.06651 / 0.02995 / 0.1154
min / 0.662 / 0.2 / 2.587 / 2.3 / 5.189 / 5.726 / 2.168
max / 3.556 / 1.91 / 5.404 / 6.114 / 6.679 / 7 / 5.721
# proteins / 75 / 75 / 30 / 30 / 45 / 75 / 45
mean as % mammal / 33.67 / 14.04 / 59 / 69.83 / 93.73 / 100 / 64.41
significantly larger than / B / AB / ABLP / ABILP / ABILP / ABL

Numbers of residues compared for EAB LSU ribosomal proteins

Parameter / A / B / L / I / N / M / P
mean / 173.4 / 151.1 / 214.5 / 228 / 219.4 / 220.9 / 220
sd / 66.42 / 57.03 / 84.76 / 92.23 / 86.08 / 89.31 / 85.5
se / 7.67 / 6.586 / 15.48 / 16.84 / 12.83 / 10.31 / 12.75
min / 70 / 58 / 119 / 123 / 123 / 122 / 123
max / 361 / 280 / 410 / 435 / 421 / 427 / 407
# RPs / 75 / 75 / 30 / 30 / 45 / 75 / 45
mean as % mammal / 78.5 / 68.4 / 97.1 / 103.21 / 99.32 / 100 / 99.59
significantly larger than / B / B / B[A] / AB / B[A]

Table S4.4.1 LSU 41 EO/EA/EAB, archaeal and bacterial RPs: whole sequence identity to human orthologs [Fig. 4A]

group / ID% / se / sgnf
archaeal [A] / 31.37 / 0.8189 / B
bacterial [B] / 20.31 / 0.7961
lower eukarya [L] / 51.79 / 1.009 / AB
insect [I] / 65.13 / 1.548 / ABPL
non-mammalian [N] / 90.61 / 1.029 / ABIPL
mammalian [M] / 98.23 / 0.209 / ABINPL
plant [P] / 59.77 / 1.144 / ABL

Table S4.4.2 LSU 41 EO/EA/EAB, archaeal and bacterial RPs: bPCN alignment identity to human [Fig. 4B]

group / ID% / se / sgnf
archaeal [A] / 19.61 / 0.9831 / B
bacterial [B] / 14.65 / 1.158
lower eukarya [L] / 36.51 / 1.653 / AB
insect [I] / 51.93 / 2.371 / ABLP
non-mammalian [N] / 89.4 / 2.267 / ABILP
mammalian [M] / 94.8 / 0.984 / ABILNP
plant [P] / 45.92 / 2.029 / ABL

Table S4.4.3 Numbers of bPCNs per 100 aa and per entire LSU sequence [Fig. 4C]

group / #bPCNs per 100 aa / se / sgnf / #bPCNs per sequence / se / sgnf
archaeal [A] / 1.54 / 0.0788 / 2.663 / 0.1366 / [B]
bacterial [B] / 1.41 / 0.1147 / 2.13 / 0.1733
lower eukarya [L] / 2.03 / 0.1094 / AB / 4.35 / 0.2346 / AB
insect [I] / 2.32 / 0.1216 / ABL / 5.28 / 0.2772 / ABL
non-mammalian [N] / 2.38 / 0.1012 / ABL / 5.228 / 0.222 / ABL
mammalian [M] / 2.47 / 0.0761 / ABPL / 5.454 / 0.168 / ABLNP
plant [P] / 2.17 / 0.101 / AB / 4.78 / 0.2222 / AB

Table S4.5 LSU bPCN alignment with ribosomal proteins of group-representative species (Fig. 5 in the main text)

species / bPCN ID% / se
YEAST
G. lamblia [35] / 35.7 / 3
T. thermophila [41] / 39.62 / 3.35
T. brucei [41] / 44.6 / 3.57
S. pombe [43] / 53.18 / 4.012
RICE
A. thaliana [41] / 76.78 / 3.545
Z. mays [41] / 78.86 / 3.549
FRUIT FLY
A. gambiae [41] / 65.85 / 3.71
HUMAN
D. rerio [41] / 86.79 / 2.444
X. laevis [42] / 90.34 / 2.242
G. gallus [41] / 92.8 / 1.951
M. musculus [44] / 96.12 / 1.67
R. norvegicus [44] / 96.17 / 1.625
C. familiaris [45] / 98.39 / 0.7169
M. mulatta [42] / 99.40 / 0.415

bPCN ID% = average % identity of bPCNs in LSU RPs with those in the group comparator sequences.
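One way to compute an identity restricted to bPCN positions is sketched below in Python. It assumes a pairwise alignment given as two equal-length gapped strings and bPCN spans expressed in alignment columns of the comparator. Counting only comparator residues in the denominator is also an assumption, since the footnote does not specify it. The sequences and the function name are hypothetical.

```python
def bpcn_identity_percent(comparator_aln, subject_aln, bpcn_spans):
    """Percent identity over comparator bPCN columns of a pairwise alignment.

    comparator_aln, subject_aln: aligned strings of equal length ('-' = gap).
    bpcn_spans: half-open (start, end) column ranges covering the comparator bPCNs.
    """
    if len(comparator_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    total = 0
    for start, end in bpcn_spans:
        for c, s in zip(comparator_aln[start:end], subject_aln[start:end]):
            if c == "-":
                continue  # count only columns holding a comparator residue
            total += 1
            matches += c == s
    return 100.0 * matches / total if total else 0.0


# Hypothetical aligned fragments; the comparator bPCN "RKRK" occupies columns 2-5.
print(bpcn_identity_percent("MARKRKPSP", "MAKKRQPSP", [(2, 6)]))
```

Averaging this value over all LSU RPs of a species would then give a per-species figure comparable in form to the bPCN ID% column.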

Table S4.6.1 The families and species compared in LSU alignments of bPCNs in archaea and bacteria (Fig. 6)

ARCHAEA

family / species
Desulfurococcaceae / Aeropyrum pernix
Halobacteriaceae / Haloarcula marismortui
Halobacteriaceae / Haloarcula salinarum
Halobacteriaceae / Natronomonas pharaonis
Methanocaldococcaceae / Methanocaldococcus jannaschii
Methanococcaceae / Methanococcus maripaludis
Methanococcaceae / Methanococcus vannielii
Methanopyraceae / Methanopyrus kandleri
Sulfolobaceae / Sulfolobus solfataricus
Sulfolobaceae / Sulfolobus tokodaii
Thermococcaceae / Pyrococcus abyssi
Thermococcaceae / Pyrococcus furiosus

BACTERIA

family / species
Bacillaceae / Bacillus anthracis
Bacillaceae / Bacillus subtilis
Enterobacteriaceae / Escherichia coli
Enterobacteriaceae / Salmonella typhimurium
Enterobacteriaceae / Shigella flexneri
Enterobacteriaceae / Yersinia pestis
Mycobacteriaceae / Mycobacterium leprae
Mycobacteriaceae / Mycobacterium smegmatis
Mycobacteriaceae / Mycobacterium tuberculosis
Spirochaetaceae / Treponema pallidum
Staphylococcaceae / Staphylococcus aureus
Streptococcaceae / Streptococcus pyogenes

Table S4.6.2 Sequence and bPCN identities with M. jannaschii for archaeal LSU tunnel proteins (Fig. 6A in the main text)

family / % sequence identity / se / % bPCN identity / se
Methanopyraceae / 62.41 / 2.477 / 40.47 / 7.468
Thermococcaceae / 60.66 / 2.781 / 36.72 / 7.708
Methanococcaceae [2] / 77.11 / 0.9206 / 33.51 / 8.369
Sulfolobaceae [2] / 50.3 / 3.343 / 28.01 / 3.566
Desulfurococcaceae / 55.18 / 4.302 / 25.77 / 7.963
Halobacteriaceae [3] / 56.2 / 1.421 / 9.749 / 1.595

The LSU tunnel proteins L4, L22, L23, L24, L29 and L31 were aligned with the respective M. jannaschii orthologs.

The number of species examined is shown in brackets if above 1.

Table S4.6.3 Sequence and bPCN identities to M. jannaschii for the bacterial LSU tunnel proteins (Fig. 6B in the main text)

family / % sequence identity / se / % bPCN identity / se
Spirochaetaceae / 29.3 / 5.73 / 22.66 / 4.229
Mycobacteriaceae [2] / 27.99 / 3.285 / 17.37 / 3.132
Enterobacteriaceae [4] / 31.59 / 3.484 / 12.12 / 3.536
Streptococcaceae / 36.39 / 5.07 / 11.68 / 5.307
Staphylococcaceae / 37.83 / 5.211 / 11.67 / 6.691
Bacillaceae [2] / 35.42 / 4.659 / 11.22 / 3.461

See the footnote of Table S4.6.2 for the proteins examined.

Table S4.6.4 Sequence and bPCN identities to M. tuberculosis for the bacterial LSU tunnel proteins (Fig. 6C in the main text)

family / % sequence identity / se / % bPCN identity / se
Mycobacteriaceae [2] / 88.58 / 1.496 / 88.37 / 5.85
Staphylococcaceae / 61.61 / 1.902 / 42.34 / 14.8
Spirochaetaceae / 53.73 / 4.96 / 37.67 / 5.207
Bacillaceae [2] / 63.52 / 1.769 / 37.15 / 9.561
Streptococcaceae / 58.46 / 3.464 / 22.96 / 11.76
Enterobacteriaceae [4] / 57.03 / 1.386 / 21.67 / 4.459

See the footnote of Table S4.6.2 for the proteins examined.

Table S4.7.1 The bulk HKR% of ribosomal proteins relative to human [Fig. 7A in the main text]

species / LSU / se / SSU / se
A. gambiae / 1.023 / 0.01069 / 1.012 / 0.01654
A. thaliana / 0.9481 / 0.01273 / 0.9519 / 0.0228
D. melanogaster / 1.006 / 0.01006 / 0.9985 / 0.01332
M. musculus / 0.9991 / 0.00342 / 0.9881 / 0.01451
O. sativa / 0.9451 / 0.01176 / 0.9777 / 0.02385
R. norvegicus / 0.9994 / 0.00327 / 1.004 / 0.00350
S. cerevisiae / 0.8802 / 0.01585 / 0.8965 / 0.02217
T. brucei / 1.014 / 0.01831 / 1.007 / 0.0286
T. thermophila / 0.9337 / 0.01371 / 0.971 / 0.01894
X. laevis / 0.9975 / 0.00461 / 0.9991 / 0.02154
Z. mays / 0.9363 / 0.01544 / 0.9372 / 0.01838

Table S4.7.2 HKR in bPCN of ribosomal proteins as % sequence relative to human [Fig. 7B in the main text]

species / LSU / se / SSU / se
A. gambiae / 1.057 / 0.09124 / 1.093 / 0.08364
A. thaliana / 0.9323 / 0.05968 / 0.9373 / 0.0745
D. melanogaster / 0.9504 / 0.05641 / 0.9963 / 0.08068
M. musculus / 1.001 / 0.01496 / 0.9725 / 0.02779
O. sativa / 0.9444 / 0.06096 / 1.004 / 0.08851
R. norvegicus / 1.003 / 0.01158 / 1.036 / 0.05497
S. cerevisiae / 0.6639 / 0.05139 / 0.761 / 0.07894
T. brucei / 1.06 / 0.07913 / 0.9427 / 0.1233
T. thermophila / 0.7733 / 0.07074 / 0.8413 / 0.06367
X. laevis / 0.9937 / 0.02419 / 1 / 0.04369
Z. mays / 0.9112 / 0.05315 / 0.8236 / 0.06901

Table S4.7.3 Percent rRNA sequence in expansion segments relative to human [Fig. 7C in the main text]

species / LSU RNA ES / SSU RNA ES
A. gambiae / 0.70449 / 0.97374
A. thaliana / 0.53744 / 0.85593
D. melanogaster / 0.70196 / 0.91235
M. musculus / 0.92449 / 0.99823
O. sativa / 0.49968 / 0.99503
R. norvegicus / 0.9401 / 0.99787
S. cerevisiae / 0.53301 / 0.93045
T. brucei / 0.8823 / 1.11285
T. thermophila / 0.47859 / 0.87544
X. laevis / 0.73023 / 0.96842
Z. mays / 0.50791 / 0.95138

Table S4.7.4 rRNA expansion segment GC% relative to human [Fig. 7D in the main text]

species / LSU RNA GC / SSU RNA GC
A. gambiae / 0.67962 / 0.84834
A. thaliana / 0.75783 / 0.8034
D. melanogaster / 0.35011 / 0.68416
M. musculus / 0.95934 / 0.99919
O. sativa / 0.8999 / 0.87286
R. norvegicus / 0.97608 / 0.99395
S. cerevisiae / 0.60213 / 0.67633
T. brucei / 0.5641 / 0.9118
T. thermophila / 0.52332 / 0.64097
X. laevis / 0.99785 / 0.94912
Z. mays / 0.88627 / 0.8691

Table S4.8.1 Ratios of residues in ES as % human to sequence HKR% as % human [Fig. 8A in the main text]

Species / 6ESt/HKRt / se6ESt/HKRt / 4ESt/HKRt / se4ESt/HKRt / EH LSU/SSU
T. thermophila / 0.517 / 0.00736 / 0.9125 / 0.01903 / 0.567
O. sativa / 0.5321 / 0.00683 / 1.034 / 0.02257 / 0.515
Z. mays / 0.5487 / 0.00967 / 1.028 / 0.02207 / 0.534
A. thaliana / 0.571 / 0.00785 / 0.9165 / 0.02455 / 0.623
S. cerevisiae / 0.6132 / 0.01088 / 1.064 / 0.0367 / 0.576
A. gambiae / 0.6914 / 0.00678 / 0.9706 / 0.01666 / 0.712
D. melanogaster / 0.7005 / 0.00674 / 0.9188 / 0.01245 / 0.762
X. laevis / 0.7327 / 0.00330 / 0.9865 / 0.02944 / 0.743
T. brucei / 0.8807 / 0.01453 / 1.13 / 0.03106 / 0.779
M. musculus / 0.9254 / 0.00334 / 1.021 / 0.0259 / 0.906
R. norvegicus / 0.9410 / 0.00308 / 0.9945 / 0.00342 / 0.946

Table S4.8.2 Ratios of ES GC% as % human to sequence HKR% as % human [Fig. 8B in the main text]

species / 6ESGCt/HKRt / se6GC / 4ESGCt/HKRt / se4GC
D. melanogaster / 0.3494 / 0.00336 / 0.6889 / 0.00934
T. brucei / 0.5631 / 0.00930 / 0.9257 / 0.02544
T. thermophila / 0.5652 / 0.00806 / 0.6681 / 0.01392
A. gambiae / 0.667 / 0.00654 / 0.8457 / 0.01451
S. cerevisiae / 0.6929 / 0.01229 / 0.7733 / 0.02667
A. thaliana / 0.8052 / 0.01105 / 0.8601 / 0.02303
Z. mays / 0.9575 / 0.01689 / 0.939 / 0.02016
O. sativa / 0.9583 / 0.01231 / 0.9068 / 0.0198
M. musculus / 0.9605 / 0.00348 / 1.023 / 0.02593
R. norvegicus / 0.977 / 0.00319 / 0.9905 / 0.00340
X. laevis / 1.001 / 0.00452 / 0.9669 / 0.02886

Table S5 Basic PCN clusters in nuclear localization signals

Nuclear localization signals (NLSs) in human and human viral proteins that are listed in the UniProt database as experimentally confirmed.

This list excludes entries annotated as "by similarity" or "probable", and most entries are supported by journal references within the respective records.

The list does not contain any NLS of ribosomal proteins, although of course such motifs are present in most of the ribosomal proteins.

[All human LSU RPs have at least one bPCN cluster containing three or more HKR residues that make up at least 50% of the cluster.]

Multiple basic clusters in ribosomal proteins could be assumed to represent sufficient nuclear import signals in any case.

The bPCNs as defined in this review are found in 170 of 174 NLSs in this list (97.7%) and account for 70.3% of the sequence of the listed NLSs.
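The coverage figures quoted above reduce to simple length ratios, sketched here in Python. The helper names are hypothetical, and the per-NLS calculation assumes that the listed bPCN sequences of one NLS do not overlap. The example NLS and bPCN are those of entry Q7Z5L9 in the table.

```python
def bpcn_percent_of_nls(nls, bpcns):
    """Summed bPCN length as % of the NLS length (assumes non-overlapping bPCNs)."""
    if not nls:
        raise ValueError("empty NLS sequence")
    return 100.0 * sum(len(b) for b in bpcns) / len(nls)


def percent_of_total(part, total):
    """Share of `part` in `total`, as a percentage."""
    return 100.0 * part / total


# Entry Q7Z5L9: NLS ARKRKPSP with bPCN RKRK.
print(bpcn_percent_of_nls("ARKRKPSP", ["RKRK"]))
# Share of listed NLSs that contain at least one bPCN.
print(round(percent_of_total(170, 174), 1))
```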

UniProt accession / UniProt designation / NLS sequence / bPCN sequence(s) / bPCN % NLS length
Q7Z5L9 / NLS / ARKRKPSP / RKRK / 50