Table S1. Organisms from the superkingdoms Eukarya (E), Bacteria (B), and Archaea (A) that were analyzed, with their name abbreviations (Abbr.), inclusion in the set described in Figure S1 (SF1, marked +), and the number of sequences retrieved for analysis.
Organisms (Eukarya 219, Bacteria 478, Archaea 52) / Abbr. / Domain / SF1 / Sequences / Sequences with no structure
Homo sapiens 49_36k / hs / E / + / 30713 / 15878
Pan troglodytes 49_21h / xp / E / 21712 / 11425
Gorilla gorilla 52_1 / gx / E / + / 11200 / 5582
Pongo pygmaeus 49_1 / of / E / + / 15446 / 7963
Macaca mulatta 49_10h / ru / E / 24287 / 12097
Otolemur garnettii 49_1c / ob / E / 10460 / 4988
Microcebus murinus 49_1 / io / E / 11069 / 5250
Tarsius syrichta 51_1 / ih / E / 9217 / 4344
Rattus norvegicus 49_34s / rn / E / + / 23024 / 9924
Mus musculus 49_37b / mm / E / 26914 / 12753
Spermophilus tridecemlineatus 49_1e / gb / E / 9939 / 4891
Dipodomys ordii 51_1 / in / E / + / 10933 / 4817
Cavia porcellus 51_3 / gu / E / 14096 / 5678
Oryctolagus cuniculus 49_1f / ok / E / 10396 / 5042
Ochotona princeps 49_1 / oq / E / 10876 / 4967
Tupaia belangeri 49_1d / tz / E / 10471 / 4991
Bos taurus 49_3f / bv / E / 18673 / 9021
Vicugna pacos 51_1 / vu / E / 7953 / 3752
Tursiops truncatus 51_1 / ut / E / 11462 / 5031
Canis familiaris 49_2g / dg / E / 18338 / 7220
Felis catus 49_1c / fe / E / 9979 / 4864
Equus caballus 49_2 / eq / E / 15957 / 6791
Myotis lucifugus 49_1e / lu / E / 11080 / 5152
Pteropus vampyrus 51_1 / vr / E / 11842 / 5089
Sorex araneus 49_1c / xr / E / 9122 / 4070
Erinaceus europaeus 49_1c / ek / E / 9973 / 4619
Procavia capensis 51_1 / vn / E / 10997 / 5006
Loxodonta africana 49_1d / lk / E / 10678 / 5039
Echinops telfairi 49_1e / ee / E / 11076 / 5486
Dasypus novemcinctus 49_1f / d5 / E / + / 10368 / 5171
Monodelphis domestica 49_5d / op / E / 24682 / 7875
Ornithorhynchus anatinus 49_1f / oh / E / 18455 / 8381
Gallus gallus 49_2g / gg / E / 14376 / 7819
Xenopus laevis / xl / E / + / 23167 / 7873
Xenopus tropicalis 49_41i / xn / E / 19964 / 7747
Danio rerio 49_7c / da / E / 23072 / 8671
Gasterosteus aculeatus 49_1f / gc / E / + / 19521 / 8056
Oryzias latipes 49_1e / ol / E / 17226 / 7435
Tetraodon nigroviridis 49_1k / tn / E / 16043 / 11948
Takifugu rubripes 49_4i / to / E / + / 38145 / 9696
Branchiostoma floridae 1.0 / bf / E / 33445 / 17372
Ciona savignyi 49_2f / c0 / E / + / 12650 / 7493
Ciona intestinalis 49_2i / is / E / 11913 / 7945
Strongylocentrotus purpuratus / tu / E / + / 27118 / 15302
Helobdella robusta / hx / E / 11381 / 12051
Capitella sp. I / i1 / E / 17170 / 15245
Bombyx mori / om / E / 8802 / 12500
Nasonia vitripennis / nz / E / 6664 / 2590
Apis mellifera 37.2d / ai / E / 15858 / 11897
Drosophila grimshawi 1.3 / gd / E / 8205 / 6781
Drosophila willistoni 1.3 / gw / E / 8120 / 7393
Drosophila pseudoobscura 2.3 / do / E / 8260 / 7811
Drosophila persimilis 1.3 / dq / E / 8389 / 8489
Drosophila yakuba 1.3 / dy / E / 8196 / 7886
Drosophila simulans 1.3 / dw / E / 7657 / 7758
Drosophila sechellia 1.3 / dz / E / 8512 / 7959
Drosophila melanogaster Ensembl 49_54 / dd / E / + / 12692 / 8123
Drosophila erecta 1.3 / ge / E / 7987 / 7061
Drosophila ananassae 1.3 / dk / E / + / 8077 / 6993
Drosophila virilis 1.2 / du / E / 7898 / 6593
Drosophila mojavensis 1.3 / dx / E / + / 7746 / 6849
Aedes aegypti 49_1b / ax / E / 10220 / 6569
Culex pipiens quinquefasciatus / ie / E / 10769 / 8114
Anopheles gambiae 49_3j / ag / E / 8072 / 5061
Tribolium castaneum 3.0 / tj / E / + / 8490 / 7932
Acyrthosiphon pisum / qp / E / 7061 / 3405
Daphnia pulex / d7 / E / 11750 / 19190
Lottia gigantea / gy / E / 12223 / 11628
Pristionchus pacificus / wp / E / 10791 / 18853
Meloidogyne hapla / wm / E / 5928 / 8493
Brugia malayi / r0 / E / + / 3643 / 4208
Caenorhabditis japonica / wj / E / 8865 / 10640
Caenorhabditis brenneri / wn / E / 8090 / 18150
Caenorhabditis remanei / wr / E / + / 12755 / 18886
Caenorhabditis elegans Ensembl 49_180a / cl / E / 14297 / 12605
Caenorhabditis briggsae 2 / cw / E / 9625 / 12352
Nematostella vectensis 1.0 / nw / E / 15306 / 11967
Hydra magnipapillata / qm / E / 10132 / 7266
Trichoplax adhaerens / rq / E / 7663 / 3857
Monosiga brevicollis / ov / E / 5777 / 3419
Malassezia globosa CBS 7966 / gz / E / 2752 / 1534
Ustilago maydis / um / E / 3829 / 2693
Puccinia graminis f. sp. tritici CRL 75-36-700-3 / p4 / E / + / 5100 / 15466
Melampsora laricis-populina / ql / E / 4893 / 11801
Sporobolomyces roseus IAM 13481 / yu / E / 3411 / 2125
Coprinopsis cinerea okayama7 130 / or / E / 6143 / 7401
Laccaria bicolor S238N-H82 / lo / E / + / 7148 / 13466
Schizophyllum commune / zc / E / 6267 / 6914
Heterobasidion annosum / qa / E / + / 5306 / 6964
Phanerochaete chrysosporium RP-78 2.1 / fc / E / 5688 / 4360
Postia placenta / ia / E / 8835 / 8338
Cryptococcus neoformans JEC21 / cf / E / 3601 / 2281
Schizosaccharomyces octosporus yFS286 / o9 / E / 3136 / 1789
Schizosaccharomyces japonicus yFS275 / hj / E / + / 3059 / 1755
Schizosaccharomyces pombe / po / E / 3225 / 1796
Magnaporthe grisea 70-15 / gr / E / 5871 / 5172
Podospora anserina / nq / E / 5772 / 4842
Chaetomium globosum CBS 148.51 / hg / E / 5692 / 5432
Neurospora tetrasperma / n4 / E / 4724 / 5916
Neurospora discreta FGSC 8579 / n3 / E / 4631 / 5317
Neurospora crassa OR74A / ns / E / 4745 / 5080
Cryphonectria parasitica / yh / E / 6191 / 4993
Fusarium oxysporum f. sp. lycopersici 4286 / fo / E / + / 9210 / 8398
Nectria haematococca mpVI / nx / E / + / 9236 / 6471
Fusarium verticillioides 7600 / fv / E / 7599 / 6576
Fusarium graminearum / fg / E / + / 6894 / 6427
Trichoderma atroviride / wt / E / 6519 / 4581
Trichoderma reesei 1.2 / re / E / + / 5432 / 3697
Trichoderma virens Gv29-8 / nj / E / 6981 / 4662
Verticillium dahliae VdLs.17 / vd / E / 5850 / 4685
Verticillium albo-atrum VaMs.102 / ve / E / + / 5552 / 4668
Botrytis cinerea B05.10 / b7 / E / 5958 / 10490
Sclerotinia sclerotiorum / lz / E / 5564 / 8958
Ajellomyces dermatitidis SLH14081 / j2 / E / 4568 / 4987
Histoplasma capsulatum class NAmI strain WU24 / hk / E / 4238 / 5010
Microsporum gypseum / gp / E / + / 4713 / 4163
Microsporum canis CBS 113480 / qc / E / 4850 / 3915
Trichophyton equinum CBS 127.97 / o7 / E / 4450 / 4110
Paracoccidioides brasiliensis Pb18 / iv / E / 4084 / 4657
Coccidioides posadasii RMSCC 3488 / i6 / E / 4530 / 5367
Coccidioides immitis RS / im / E / 4542 / 5810
Uncinocarpus reesii 1704 / ur / E / 4291 / 3507
Neosartorya fischeri NRRL 181 / nh / E / 6296 / 4111
Aspergillus terreus NIH2624 / gi / E / 6411 / 3995
Aspergillus fumigatus Af293 / ao / E / 5874 / 4013
Aspergillus oryzae RIB40 / a8 / E / 7194 / 4869
Aspergillus niger ATCC 1015 / a5 / E / 5208 / 3384
Aspergillus flavus NRRL3357 / gq / E / + / 7388 / 5199
Aspergillus clavatus NRRL 1 / a7 / E / 5505 / 3615
Aspergillus nidulans FGSC A4 / an / E / 6335 / 4330
Alternaria brassicicola / l3 / E / 5336 / 5352
Pyrenophora tritici-repentis / t4 / E / 5903 / 6266
Cochliobolus heterostrophus / wc / E / 5904 / 3729
Stagonospora nodorum / nd / E / 6930 / 9667
Mycosphaerella fijiensis CIRAD86 / ym / E / + / 5585 / 4728
Mycosphaerella graminicola IPO323 / yo / E / 5774 / 5621
Candida tropicalis MYA-3404 / t3 / E / + / 3650 / 2608
Candida parapsilosis / ik / E / + / 3483 / 2250
Candida albicans SC5314 / al / E / + / 3635 / 2530
Yarrowia lipolytica CLIB122 / yl / E / 3853 / 2595
Candida lusitaniae ATCC 42720 / iu / E / 3316 / 2625
Vanderwaltozyma polyspora DSM 70294 / vw / E / 3191 / 2154
Candida glabrata CBS138 / gl / E / 3155 / 2047
Kluyveromyces thermotolerans CBS 6340 / yv / E / 3163 / 1929
Lachancea kluyveri / y4 / E / 3272 / 2049
Kluyveromyces waltii / kw / E / 3106 / 2108
Lodderomyces elongisporus NRRL YB-4239 / ly / E / + / 3356 / 2443
Ashbya gossypii ATCC 10895 / go / E / + / 2908 / 1809
Debaryomyces hansenii / dh / E / 3682 / 2590
Zygosaccharomyces rouxii / yz / E / + / 3084 / 1907
Saccharomyces mikatae MIT / y6 / E / + / 3602 / 6713
Saccharomyces paradoxus MIT / y8 / E / + / 3489 / 7070
Saccharomyces cerevisiae SGD / sc / E / + / 3523 / 3194
Saccharomyces cerevisiae Ensembl 49_1h / xs / E / 3517 / 3181
Saccharomyces bayanus MIT / y1 / E / 3573 / 8423
Candida guilliermondii ATCC 6260 / ng / E / + / 5692 / 5432
Pichia stipitis CBS 6054 / ip / E / 3609 / 2230
Kluyveromyces lactis / kl / E / 3096 / 1980
Rhizopus oryzae RA 99-880 / ry / E / + / 7845 / 9622
Phycomyces blakesleeanus / hn / E / 6497 / 8295
Mucor circinelloides / ul / E / 6567 / 4363
Encephalitozoon cuniculi / eu / E / 1059 / 937
Batrachochytrium dendrobatidis JEL423 / b8 / E / 4302 / 4491
Dictyostelium discoideum / dt / E / 6643 / 6816
Dictyostelium purpureum / yd / E / 6623 / 5787
Entamoeba histolytica 1 / en / E / + / 5009 / 4763
Vitis vinifera / vt / E / 17268 / 13166
Arabidopsis lyrata / l4 / E / + / 17472 / 15198
Arabidopsis thaliana 8 / at / E / + / 19612 / 13213
Carica papaya / r6 / E / + / 12095 / 16494
Medicago truncatula / mw / E / 16819 / 27485
Populus trichocarpa 1.1 / pt / E / 25547 / 20008
Oryza sativa ssp. japonica 5.0 / os / E / + / 31684 / 35026
Sorghum bicolor / oj / E / 18423 / 16073
Selaginella moellendorffii / gj / E / 20302 / 14395
Physcomitrella patens subsp. patens / pw / E / 13310 / 22628
Chlorella sp. NC64A / h2 / E / 6153 / 3638
Chlorella vulgaris / vg / E / 5560 / 4434
Volvox carteri f. nagariensis / vo / E / 6986 / 8558
Chlamydomonas reinhardtii 3.1 / cy / E / 7132 / 7466
Ostreococcus sp. RCC809 / o8 / E / + / 4511 / 3262
Ostreococcus lucimarinus CCE9901 / oz / E / 4608 / 3043
Ostreococcus tauri / ou / E / 4295 / 3430
Micromonas sp. RCC299 / ij / E / 5638 / 4177
Micromonas pusilla CCMP1545 / iq / E / 5601 / 4874
Cyanidioschyzon merolae / ya / E / 3152 / 1862
Giardia lamblia / gf / E / 2426 / 2463
Trypanosoma cruzi strain CL Brener / uz / E / 8565 / 11042
Trypanosoma brucei / tb / E / 3931 / 4827
Leishmania major strain Friedlin / em / E / + / 4132 / 4170
Leishmania infantum JPCM5 / lh / E / 4058 / 4115
Leishmania braziliensis MHOM/BR/75/M2904 / ib / E / 3993 / 4085
Aureococcus anophagefferens / a6 / E / 7871 / 3630
Phytophthora ramorum 1.1 / ra / E / 8800 / 6943
Phytophthora sojae 1.1 / sj / E / 9867 / 9160
Phytophthora infestans T30-4 / iy / E / 9664 / 12994
Phytophthora capsici / yy / E / 8932 / 8482
Phaeodactylum tricornutum / hr / E / 5800 / 4602
Thalassiosira pseudonana / tl / E / 6238 / 5538
Paramecium tetraurelia / ir / E / 17934 / 21638
Tetrahymena thermophila SB210 1 / hy / E / + / 11268 / 13457
Babesia bovis T2Bo / o0 / E / + / 1801 / 1905
Theileria parva / pv / E / 1813 / 2266
Theileria annulata / nu / E / 1796 / 1999
Plasmodium falciparum 3D7 / pl / E / 2468 / 2992
Plasmodium vivax SaI-1 / vx / E / 2373 / 3059
Plasmodium knowlesi strain H / fw / E / 2351 / 2834
Plasmodium yoelii ssp. yoelii 1 / py / E / 2452 / 5409
Plasmodium chabaudi / fy / E / 3287 / 11720
Plasmodium berghei ANKA / fb / E / + / 2917 / 9318
Cryptosporidium hominis / rm / E / 1699 / 2235
Cryptosporidium muris / uc / E / 2025 / 1909
Cryptosporidium parvum Iowa II / cv / E / 1899 / 1906
Neospora caninum / un / E / 2981 / 2606
Toxoplasma gondii ME49 / ue / E / 3422 / 4571
Naegleria gruberi / eb / E / 8619 / 7134
Trichomonas vaginalis / tx / E / 24902 / 72981
Guillardia theta / gt / E / 355 / 243
Emiliania huxleyi CCMP1516 / ex / E / 19074 / 20051
Candidatus Phytoplasma mali / 9y / B / 313 / 166
Aster yellows witches-broom phytoplasma AYWB / ay / B / 371 / 300
Onion yellows phytoplasma OY-M / oy / B / 465 / 289
Acholeplasma laidlawii PG-8A / 8b / B / + / 904 / 476
Mesoplasma florum L1 / mf / B / 471 / 211
Ureaplasma parvum ser. 3 ATCC 700970 / uu / B / 377 / 237
Ureaplasma urealyticum ser. 10 ATCC 33699 / kf / B / + / 381 / 265
Mycoplasma penetrans HF-2 / me / B / 575 / 462
Mycoplasma mobile 163K / m0 / B / 422 / 211
Mycoplasma arthritidis 158L3-1 / 9w / B / 368 / 263
Mycoplasma agalactiae PG2 / 45 / B / 446 / 296
Mycoplasma synoviae 53 / yc / B / 412 / 247
Mycoplasma pulmonis UAB CTIP / mq / B / 468 / 314
Mycoplasma pneumoniae M129 / mp / B / 436 / 253
Mycoplasma mycoides ssp. mycoides SC PG1 / mi / B / 600 / 416
Mycoplasma hyopneumoniae 232 / x9 / B / 405 / 286
Mycoplasma genitalium G37 / mg / B / + / 354 / 123
Mycoplasma gallisepticum R / my / B / 441 / 285
Mycoplasma capricolum ssp. capricolum ATCC 27343 / yi / B / 493 / 319
Leptospira borgpetersenii ser. Hardjo-bovis L550 / 0k / B / 1804 / 1141
Leptospira interrogans ser. Lai 56601 / lr / B / 2088 / 2639
Leptospira biflexa ser. Patoc Patoc 1 (Paris) / 83 / B / 2230 / 1437
Treponema pallidum ssp. pallidum Nichols / tp / B / 615 / 421
Treponema denticola ATCC 35405 / td / B / 1478 / 1289
Borrelia garinii PBi / ga / B / 543 / 289
Borrelia afzelii PKo / 04 / B / 547 / 308
Borrelia burgdorferi B31 / bb / B / 544 / 307
Borrelia recurrentis A1 / g7 / B / 546 / 254
Borrelia duttonii Ly / f7 / B / + / 560 / 260
Borrelia turicatae 91E135 / 9c / B / 557 / 261
Borrelia hermsii DAH / 9u / B / 560 / 259
Rhodopirellula baltica SH 1 / pi / B / 3219 / 4106
Bifidobacterium longum NCC2705 / bl / B / 1185 / 542
Bifidobacterium animalis ssp. lactis AD011 / g5 / B / 1047 / 481
Bifidobacterium adolescentis ATCC 15703 / 5g / B / 1108 / 523
Kineococcus radiotolerans SRS30216 / 58 / B / 2945 / 1535
Acidothermus cellulolyticus 11B / 3q / B / 1550 / 607
Frankia sp. CcI3 / fk / B / 2831 / 1668
Frankia alni ACN14a / 1b / B / 4089 / 2622