Supporting Information

Molecular Determinants for Substrate Selectivity of ω-Transaminases

Eul-Soo Park, Minji Kim and Jong-Shik Shin*

Department of Biotechnology, Yonsei University, Shinchon-Dong 134, Seodaemun-Gu, Seoul 120-749, Korea

Table S1. Amino donor specificity of ω-TAs measured with cell-free extracts of wild-type bacterial strains.a

Relative reactivity (%)b
Substrate / AP / AB / OA
(S)-α-methylbenzylamine / 100.0 (68.2)c / 100.0 (98.7)c / 100.0 (117.3)c
(S)-α-ethylbenzylamine / 19.2 / 12.9 / 11.3
(S)-1-aminoindan / 207.1 / 167.6 / 181.7
benzylamine / 98.8 / 81.3 / 72.3
(R)-α-methylbenzylamine / < 0.1 / 0.3 / 0.2

a A. pasteurianus, A. baumannii and O. anthropi were cultivated in defined culture media recommended by the bacterial stock supplier. Medium 1 (3 g/L beef extract and 5 g/L peptone) was used for A. baumannii and O. anthropi, and medium 2 (5 g/L yeast extract, 3 g/L peptone and 25 g/L mannitol) was used for A. pasteurianus. Cells harvested from the culture broth were disrupted by sonication. Supernatants obtained after centrifugation were used as the cell-free extracts.

b Relative reactivity represents the initial reaction rate normalized by that for (S)-α-MBA. Reaction conditions were 10 mM amine, 10 mM pyruvate and 50 mM phosphate buffer (pH 7).

c The values in parentheses represent initial reaction rates (μM/min).
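As a worked illustration of footnotes b and c, a relative reactivity can be converted back to an approximate absolute initial rate by multiplying it by the reference rate given in parentheses for the same strain. The sketch below is not part of the original analysis: the function names and the dictionary are illustrative, and the rates carry the units stated in footnote c.

```python
# Reference initial rates for the (S)-alpha-MBA / pyruvate reaction, copied from
# the parenthesised values in Table S1 (units as stated in footnote c).
REFERENCE_RATE = {"AP": 68.2, "AB": 98.7, "OA": 117.3}


def relative_reactivity(rate, strain):
    """Initial rate expressed as a percentage of the strain's reference rate."""
    return 100.0 * rate / REFERENCE_RATE[strain]


def absolute_rate(relative_percent, strain):
    """Inverse of relative_reactivity: recover the initial rate from a percentage."""
    return relative_percent / 100.0 * REFERENCE_RATE[strain]


if __name__ == "__main__":
    # Benzylamine with the A. pasteurianus extract: 98.8 % of 68.2
    print(round(absolute_rate(98.8, "AP"), 1))  # 67.4
```

Because each strain is normalized to its own reference rate, the percentages compare substrates within one strain; comparing across strains requires converting back to absolute rates first.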

Table S2. Amino acceptor specificity of ω-TAs measured with cell-free extracts of wild-type bacterial strains.a

Relative reactivity (%)b
Substrate / AP / AB / OA
pyruvate / 100.0 (68.2)c / 100.0 (98.7)c / 100.0 (117.3)c
2-oxobutyrate / 75.6 / 81.9 / 84.1
trimethylpyruvate / < 0.1 / < 0.1 / 0.5
butanal / 101.2 / 63.6 / 90.3
propiophenone / 2.7 / 3.3 / 2.4

a A. pasteurianus, A. baumannii and O. anthropi were cultivated in defined culture media recommended by the bacterial stock supplier. Medium 1 (3 g/L beef extract and 5 g/L peptone) was used for A. baumannii and O. anthropi, and medium 2 (5 g/L yeast extract, 3 g/L peptone and 25 g/L mannitol) was used for A. pasteurianus. Cells harvested from the culture broth were disrupted by sonication. Supernatants obtained after centrifugation were used as the cell-free extracts.

b Relative reactivity represents the initial reaction rate normalized by that for pyruvate. Reaction conditions were 10 mM (S)-α-MBA, 10 mM amino acceptor and 50 mM phosphate buffer (pH 7).

c The values in parentheses represent initial reaction rates (μM/min).

Figure Legends

Figure S1. DNA and amino acid sequences of the cloned AB-TA. Nucleotides shaded in green indicate the sites of point mutations. Underlined bold characters represent amino acid substitutions compared with the original NCBI sequence (shown in italics).

Figure S2. DNA and amino acid sequences of the cloned AP-TA. Font use is consistent with Figure S1.

Figure S3. DNA and amino acid sequences of the cloned PF-TA. Font use is consistent with Figure S1.

Figure S4. Adapted active site structure of PP-TA in which the side chain of Arg414 was moved toward Gln421. The green dotted lines represent hydrogen bonds of the guanidyl group of Arg414 with Gln421 and Ser231.

Figure S5. Multiple sequence alignment of the ω-TAs from Pseudomonas putida (PP), Acetobacter pasteurianus (AP), Acinetobacter baumannii (AB), Ochrobactrum anthropi (OA), Paracoccus denitrificans (PD), Chromobacterium violaceum (CV), Alcaligenes denitrificans (AD), Vibrio fluvialis (VF) and Caulobacter crescentus (CC). Residues conserved in all nine enzymes are marked by asterisks. Dashes represent gaps introduced during the alignment process.

1 ATGTTTGATA CGGATAAATT CAGCGACTCT GAACATACTT TGGACGCGGT

MXXFXXDXXTXXXDXXKXXFXXXSXXDXXSXXXEXXHXXTXXLXXXDXXAXXV

51 TCAAACCAAT AATAATATGC ATATAAATTA TCAGGCACAC TGGATGCCTT

XQXXTXXNXXXNXXNXXMXXHXXXIXXNXXYXXXQXXAXXHXXXWXXMXXPXXF

101 TTTCAGCAAA CCGAAACTTT GCTAAAGACC CACGTATGAT TGTGGGTGCC

XXSXXAXXNXXXRXXNXXFXXXAXXKXXDXXPXXXRXXMXXIXXXVXXGXXA

151 AAAGGGTCTT ATCTGATTGA TGATTCTGGC CGTGAAATTT ATGACTCACT

KXXGXXSXXYXXXLXXIXXDXXXDXXSXXGXXXRXXEXXIXXYFXXDXXSXXL

201 TTCGGGCTTA TGGACGTGTG GTGCAGGTCA TACCTTGCCT GAAATTCAAC

XSXXGXXLXXXWXXTXXCXXGXXXAXXGXXHXXXTXXLXXPXXXEXXIXXQXXQ

251 AAGCGGTAAG TGCTCAATTA GGTCAGCTCG ACTACTCACC AGCGTTCCAG

XXAXXVXXSXXXATQXXLXXXGXXQXXLXXDXXXYXXSXXPXXXAXXFXXQXX

301 TTTGGTCATC CGCTTTCTTT TAAGTTGGCA GATAAAATTG TTCAGCATAT

FXXGXXHXXPXXXLXXSXXFXXXKXXLXXAXXXDXXKXXIXXVXXXQXXHXXM

351 GCCTGAAAAA CTACAACACG TTTTCTTTAC CAACTCGGGT TCGGAGTCGG

XPXXEXXKXXXLXXQXXHXXVXXXFXXFXXTXXXNXXSXXGXXXSXXEXXSXXA

401 CAGATACGTC TATAAAAATG GCACGCGCCT ATTGGCGCAT TAAAGGTAAA

XXDXXTXXSXXXIXXKXXMXXXAXXRXXAXXYXXXWXXRXXIXXXKXXGXXKXX

451 CCAAGTAAAA CCAAATTGAT TGGTCGTGCC CGTGGCTATC ATGGTGTGAA

PXXSXXKXXTXXXKXXLXXIXXXGXXRXXAXXXRXXGXXYXXHXXXGXXVXXN

501 CGTTGCAGGG ACAAGTTTAG GTGGGATTGG CGGCAACCGT AAAATGTTTG

XVXXAXXGXXXTXXSXXLXXGXXXGXXIXXGXXXGXXNXXRXXXKXXMXXFXXG

551 GACAACTTAT GGATGTAGAC CATTTACCTC ATACTTTGCA ACCTGATTTA

XXQXXLXXMXXXDXXVXXDXXXHXXLXXPXXHXXXTXXLXXQXXXPXXDXXL

601 ACTTTTACCA AAGGCTGTGC AGAAACAGGC GGGGTAGAAC TTGCCAATGA

TXXFXXTXXKXXXGXXCXXAXXXEXXTXXGXXXGXXVXXEXXLXXXAXXNXXE

651 AATGCTTAAG TTAATTGAGC TACACGATGC TTCAAATATT GCAGCTGTCA

XMXXLXXKXXXLXXIXXEXXLXXXHXXDXXAXXXSXXNXXIXXXAXXAXXVXXI

701 TTGTGGAGCC TATTTCTGGT TCTGCGGGTT GTATTGTGCC GCCAACAGGC

XXVXXEXXPXXXIXXSXXGXXXSXXAXXGXXCXXXIXXVXXPXXXPXXTXXG

751 TATTTACAAC GTTTAAGAGA GATCTGTGAT CAGCATGACA TCTTGTTAAT

YXXLXXQXXRXXXLXXRXXEXXXIXXCXXDXXXQXXHXXDXXIXXXLXXLXXI

801 TTTTGATGAG GTGATTACAG GCTTTGGGCG TTTAGGAACG TGGACGGCGG

XFXXDXXEXXXVXXIXXTXXGXXXFXXGXXRXXXLXXGXXTXXXWXXTXXAXXA

851 CAGAATATTT TGGGGTAACA CCAGATATTT TGAATTTTGC GAAACAGGTG

XXEXXYXXFXXXGXXVXXTXXXPXXDXXIXXLXXXNXXFXXAXXXKXXQXXV

901 ACCAATGGTG CTATTCCTTT AGGTGGAGTG GTGGCAAGCC ATGAAATTTA

TXXNXXGXXAXXXIXXPXXLXXXGXXGXXVXXXVXXAXXSXXHXXXEXXIXXY

951 CTCTGCCTTT ATGCAGCAAG ACTTACCAGA GCATGCCATT GAATTTACCC

XSXXAXXFXXXMXXQXXQXXDXXXLXXPXXEXXXHXXAXXIXXXEXXFXXTXXH

1001 ACGGCTATAC CTATTCGGCA CATCCGGTTG CTTGTGCTGC CGCTTTAGCT

XXGXXYXXTXXXYXXSXXAXXXHXXPXXVXXAXXXCXXAXXAXXXAXXLXXA

1051 GCGCTTGAAA TTTTAGAGAA GAAAAACCTG CTGGCTCAAT CGGCAGCGTT

AXXLXXEXXIXXXLXXEXXKXXXKXXNXXLXXXLXXAXXQXXSXXXAXXAXXL

1101 GGCACCAAGT TTTGAAAAAA TGCTGCATGG TTTAAAAGGC GCCCCGCATA

XAXXPXXSXXXFXXEXXKXXMXXXLXXHXXGXXXLXXKXXGSXXAXXPXXHXXI

1151 TTTTAGATAT TCGCAACTGT GGCTTGATTG GTGCTTTGCA GTTAGCTCCG

XXLXXDXXIXXXRXXNXXCXXXGXXLXXIXXGXXXAXXLXXQXXXLXXAXXP

1201 CGTGATGGCG ATGCTGCCAT CCGAGGCTTT GAGCTTGGTA TGAAACTCTG

RXXDXXGXXDXXXAXXATXIXXXRXXGXXFXXXEXXLXXGXXMXXXKXXLXXW

1251 GAAAGAAGGT TTCTATGTCC GCTTTGGAGG CGACACGCTT CAGTTCGGCC

XKXXEXXGXXXFXXYXXVXXRXXXFXXGXXGXXXDXXTXXLXXXQXXFXXGXXP

1301 CAATGTTTAA TAGTACAGAA GCAGATATTG ACCGCTTAAT GAATGCTGTG

XXMXXFXXNXXXSXXTXXEXXXAXXDXXIXXDXXXRXXLXXMXXXNXXAXXV

1351 GGCGATGCGC TTTATCAAGT GAATTAA

GXXDXXAXXLXXXYXXQXXVXXXNXX*

Fig. S1
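In the listings for Figures S1 to S3 as reproduced here, the amino acid rows appear to use the letter X purely as spacing, so that each residue sits under its codon. A minimal sketch, assuming X is padding only and that the standard genetic code applies, strips the padding and checks a row against a translation of the nucleotide row above it. The function names are illustrative, and the example strings are the first two rows of the AB-TA listing.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate complete codons from the start of dna; trailing bases are ignored."""
    seq = dna.replace(" ", "").upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def strip_padding(row, pad="X"):
    """Remove the spacing characters from an amino acid row (assumes pad is never a residue)."""
    return row.replace(pad, "")


if __name__ == "__main__":
    dna_row = "ATGTTTGATA CGGATAAATT CAGCGACTCT GAACATACTT TGGACGCGGT"
    aa_row = "MXXFXXDXXTXXXDXXKXXFXXXSXXDXXSXXXEXXHXXTXXLXXXDXXAXXV"
    print(translate(dna_row))     # MFDTDKFSDSEHTLDA
    print(strip_padding(aa_row))  # MFDTDKFSDSEHTLDAV
```

The stripped row carries one residue more than the translation because its last codon is completed by the first base of the next nucleotide row. Where a row shows two adjacent letters at one position, the legend to Figure S1 indicates a substitution site (cloned residue next to the original NCBI residue), so such rows will not match a plain translation character for character.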

1 ATGGTAGACA TGAGCAGCAA TTTTGATTCT GTAAATGAGG CGCGTAAGGG

MXXVXXDXXMXXXSXXSXXNXXXFXXDXXSXXXVXXNXXEXXAXXXRXXKXXG

51 CACTTACTGG CAGCCTTTTA CAAGCAATAG GTTATTGCGT GCTGATCCGG

XTXXYXXWXXXQXXPXXFXXTXXXSXXNXXRXXXLXXLXXRXXXAXXDXXPXXE

101 AACCCCGGAT GCTTACAAAA GCAGAAGGTA TTTATTATAC GTCTATCAAT

XXPXXRXXMXXXLXXTXXKXXXAXXEXXGXXIXXXYXXYXXTXXXSXXIXXN

151 GGCACACGTT TGCTGGATAC GCTTTCAGGC TTATGGTGTA CGCCTTTGGG

GXXTXXRXXLXXXLXXDXXTXXXLXXSXXGXXXLXXWXXCXXTXXXPXXLXXG

201 ACACGCTCAC CCGCGTATTG CAGAAGCTGT TAAAACACAG GTGGAAACCT

XHXXAXXHXXXPXXRXXIXXAXXXEXXAXXVXXXKXXTXXQXXXVXXEXXTXXL

251 TGGATTTTGC TCCAAGTTTT CAGATGACGC ATCCTGGAGC CATAAGTCTT

XXDXXFXXAXXXPXXSXXFXXXQXXMXXTXXHXXXPXXGXXAXXXIXXSXXL

301 GCTGAGCGTA TCGCAGAGAT GGCCCCTGAG GGAATGAACC ATGTATTTTT

AXXEXXRXXIXXXAXXEXXMXXXAXXPXXEXXXGXXMXXNXXHXXXVXXFXXF

351 TGCAAATTCA GGTTCAGAAT CTGTTGATAC AGCATTAAAA GTGGCTTTGG

XAXXNXXSXXXGXXSXXEXXSXXXVXXDXXTXXXAXXLXXKXXXVXXAXXLXXG

401 GGTTTCATCG TATCAAAGGT GAAGGCAATC GCTTTCGTAT GATAGGGCGT

XXFXXHXXRXXXIXXKXXGXXXEXXGXXNXXRXXXFXXRXXMXXXIXXGXXR

451 GAGCGCGGGT ATCATGGTGT CGGGTTTGGT GGCATGTCTG TTGGAGGCAT

EXXRXXGXXYXXXHXXGXXVXXXGXXFXXGXXXGXXMXXSXXVXXXGXXGXXI

501 TGTATCTAAC CGTAAGATGT TTGCACCGTG TATGATGCCT GGAGTGGATC

XVXXSXXNXXXRXXKXXMXXFXXXAXXPXXCXXXMXXMXXPXXXGXXVXXDXXH

551 ATCTTCGGCA CCCCTATGAA CCGGAATATG CAGCTTTTTC ACATGGTCAA

XXLXXRXXHXXXPXXYXXEXXXPXXEXXYXXAXXXAXXFXXSXXXHXXGXXQ

601 CCAACATGGG GTGCGGAACG GGCGGAAGAT TTGCAGCGTC TGGTGGCTTT

PXXTXXWXXGXXXAXXEXXRXXXAXXEXXDXXXLXXQXXRXXLXXXVXXAXXL

651 GCATGATGCC TCTACAATTG CCGCAGTTAT TGTTGAACCC GTACAGGGAT

XHXXDXXAXXXSXXTXXIXXAXXXAXXVXXIXXXVXXEXXPXXXVXXQXXGXXS

701 CGACAGGCGT GTTGGTTCCG CCTGTAGGGT ATTTGGAACG TCTGCGCGAA

XXTXXGXXVXXXLXXVXXPXXXPXXVXXGXXYXXXLXXEXXRXXXLXXRXXE

751 ATTTGTACGC AAAACGGCAT CCTTCTTATT TTTGATGAAG TGATTACCGG

IXXCXXTXXQXXXNXXGXXIXXXLXXLXXIXXXFXXDXXEXXVXXXIXXTXXG

801 GTTTGGGCGT ATGGGTGCCC CGTTTGCCGC CCAGAGATTT GGGGTAAAGC

XFXXGXXRXXXMXXGXXAXXPXXXFXXAXXAXXXQXXRXXFXXXGXXVXXKXXP

851 CAGATATCAT CACTTTTGCC AAGGCGGTTA CAAACGGTGT GGTGCCTATG

XXDXXIXXIXXXTXXFXXAXXXKXXAXXVXXTXXXNXXGXXVXXXVXXPXXM

901 GGAGGAGTGA TTGTTACAGA TGAAATTTAC AACACATTTA TGACAGGTCC

GXXGXXVXXIXXXVXXTXXDXXXEXXIXXYXXXNXXTXXFXXMXXXTXXGXXP

951 TGAAAGTGCC ATAGAATTTT GTCATGGATA TACGTATTCT GGGCACCCTT

XEXXSXXAXXXIXXEXXFXXCXXXHXXGXXYXXXTXXYXXSXXXGXXHXXPXXL

1001 TGGCAGCGGC AGTAGGGCAT GTTGTGCTCG ATATTATGGA AAGTGAAGAT

XXAXXAXXAXXXVXXGXXHXXXVXXVXXLXXDXXXIXXMXXEXXXSXXEXXD

1051 ATTTTTGCCC GTGTGCGTGC GCTGGAACCT GTTCTGGAAG AAGAAGTTCA

IXXFXXAXXRXXXVXXRXXAXXXLXXEXXPXXXVXXLXXEXXEXXXEXXVXXH

1101 TGGTCTGAAG GATCTTCCCT GTGTTTCTGA TATCCGTAAC ATCGGTTTGA

XGXXLXXKXXXDXXLXXPXXCXXXVXXSXXDXXXIXXRXXNXXXIXXGXXLXXT

1151 CTGCAGCCGT AGATATGAAG CCAATAGAGG GAAAGCCCGG AGCTCGATGG

XXAXXAXXVXXXDXXMXXKXXXPXXIXXEGXGXXXKXXPXXGXXXAXXRXXW

1201 TCTGCGGTGT TTGAGGAAGG TTTACGTCAG GGGCTTTTGC TGCGTTGTAC

SXXAXXVXXFXXXEXXEXXGXXXLXXRXXQXXXGXXLXXLXXLXXXRXXCXXT

1251 AGGGGATACG GTTTCTTTTG GTCCTCCTTT TGTCGCCTCA GAGCAGGAAT

XGXXDXXTXXXVXXSXXFXXGXXXPXXPXXFXXXVXXAXXSXXXEXXQXXEXXL

1301 TGCGAGGTAT GATTGCCTCG TTCCGTAAAG TGCTTGAAGC TGTTGGTTAA

XXRXXGXXMXXXIXXAXXSXXXFXXRXXKXXVXXXLXXEXXAXXXVXXGXX*

Fig. S2

1 ATGAACATGC CCGAAAACGC CCCTTCGTCC CTGGCCAGCC AACTGAAGTT

MXXNXXMXXPXXXEXXNXXAXXXPXXSXXSXXXLXXAXXSXXQXXXLXXKXXL

51 GGATGCTCAC TGGATGCCCT ACACCGCCAA CCGTAACTTC CAGCGTGACC

XDXXAXXHXXXWXXMXXPXXYXXXTXXAXXNXRXXNXXFXXQXXXRXXDXXP

101 CGCGCCTGAT CGTGGCCGCC GAAGGCAGCT GGTTGACCGA TGACAAGGGG

XXRXXLXXIXXXVXXAXXAXXXEXXGXXSXXWXXXLXXTIXDXXXDXXKXXG

151 CGCAAGGTGT ACGACTCATT GTCGGGCCTG TGGACGTGCG GCGCCGGGCA

XXRXXKXXVXXYXXXDXXSXXLXXXSXXGXXLXXXWXXTXXCXXGXXXAXXGXXH

201 TACCCGCAAG GAAATCCAGG CAGCTGTATC CAAACAATTG GGCACGCTGG

XTXXRXXKXXXEXXIXXQXXAEXXAXXVXXSAXXKXXQXXLXXXGXXTXXLXXD

251 ATTACTCCCC AGGCTTCCAA TACGGTCATC CGTTGTCATT CCAACTGGCG

XXYXXSXXPXXXGXXFXXQXXXYXXGXXHXXPXXXLXXSXXFXXXQXXLXXA

301 GAAAAGATTA CCGATCTGAC CCCAGGCAAC CTGAACCACG TGTTCTTCAC

EXXKXXIXXTXXXDXXLXXTXXXPXXGXXNXXXLXXNXXHXXVXXXFXXFXXT

351 CGATTCCGGT TCCGAGTGCG CCGATACCGC AGTGAAGATG GTACGTGCTT

XDXXSXXGXXXSXXEXXCXXAXXXDXXTXXAXXXVXXKXXMXXXVXXRXXAXXY

401 ACTGGCGCCT GAAGGGCCAG GCGACCAAGA CCAAAATGAT CGGCCGCGCC

XXWXXRXXLXXXKXXGXXQXXXAXXTXXKXXTXXXKXXMXXIXXXGXXRXXA

451 CGTGGTTATC ACGGTGTGAA CATTGCCGGT ACCAGCCTGG GCGGCGTCAA

RXXGXXYXXHXXXGXXVXXNXXXIXXAXXGXXXTXXSXXLXXGXXXGXXVXXN

501 CGGTAACCGT AAGCTGTTTG GTCAAGGCTT GATGGATGTT GACCACCTGC

XGXXNXXRXXXKXXLXXFXXGXXXQXXGXXLXXXMXXDXXVXXXDXXHXXLXXP

551 CTCACACCTT GCTGGCGAGC AATGCCTTCT CCCGTGGCAT GCCGGAGCAG

XXHXXTXXLXXXLXXAXXSXXXNXXAXXFXXSXXXRXXGXXMXXXPXXEXXQ

601 GGCGGTATCG CCTTGGCCGA TGAGCTGCTC AAACTGATCG AGTTGCACGA

GXXGXXIXXAXXXLXXAXXDXXXEXXLXXLXXXKXXLXXIXXEXXXLXXHXXD

651 CGCGTCGAAT ATCGCGGCGG TGTTTGTCGA GCCGATGGCC GGCTCCGCGG

XAXXSXXNXXXIXXAXXAXXVXXXFXXVXXEXXXPXXMXXAXXXGXXSXXAXXG

701 GCGTGCTGGT GCCGCCGCAA GGCTATCTCA AACGTCTGCG TGAGATCTGC

XXVXXLXXVXXXPXXPXXQXXXGXXYXXLXXKXXXRXXLXXRXXXEXXIXXC

751 GACCAGCACA ACATCCTGCT GGTGTTCGAT GAAGTCATCA CCGGTTTCGG

DXXQXXHXXNXXXIXXLXXLXXXVXXFXXDXXXEXXVXXIXXTXXXGXXFXXG

801 CCGTACCGGC TCGATGTTCG GTGCCGACAG CTTTGGCGTG ACCCCGGACC

XRXXTXXGXXXSXXMXXFXXGXXXAXXDXXSXXXFXXGXXVXXXTXXPXXDXXL

851 TGATGTGCAT CGCCAAGCAA GTCACCAACG GCGCGATCCC GATGGGCGCG

XXMXXCXXIXXXAXXKXXQXXXVXXTXXNXXGXXXAXXIXXPXXXMXXGXXA

901 GTGATTGCCA GCAGCGAGAT CTACCAGACC TTCATGAACC AGGCGACGCC

VXXIXXAXXSXXXSXXEXXIXXXYXXQXXTXXXFXXMXXNXXQXXXAXXTXXP

951 GGAATACGCG GTGGAATTCC CCCACGGTTA TACCTACTCG GCGCACCCGG

XEXXYXXAXXXVXXEXXFXXPXXXHXXGXXYXXXTXXYXXSXXXAXXHXXPXXV

1001 TGGCTTGCGC CGCTGGCCTG GCGGCATTGG ACCTGTTGCA GAAAGAAAAC

XXAXXCXXAXXXAXXGXXLXXXAXXAXXLXXDEXXLXXLXXQXXXKXXEXXN

1051 CTGGTGCAGA GCGTCGCCGA GGTTGCCCCG CACTTTGAGA ATGCGCTGCA

LXXVXXQXXSXXXVXXAXXEXXXVXXAXXPXXXHXXFXXEXXNXXXAXXLXXH

1101 CGGTTTGAAG GGCAGCAAGA ACGTGATCGA TATTCGCAAC TACGGCCTGG

XGXXLXXKXXXGXXSXXKXXNXXXVXXIXXDXXXIXXRXXNXXXYXXGXXLXXA

1151 CCGGCGCGAT CCAGATTGCC CCGCGTGACG GTGATGCCAT CGTGCGTCCA

XXGXXAXXIXXXQXXIXXAXXXPXXRXXDXXGXXXDXXAXXIXXXVXXRXXP

1201 TTTGAGGCGG GTATGGCCTT GTGGAAAGCC GGTTTCTACG TGCGCTTTGG

FXXEXXAXXGXXXMXXAXXLXXXWXXKXXAXXXGXFXXYXXVXXXRXXFXXG

1251 CGGTGACACC CTGCAGTTCG GGCCAACCTT CAACAGCAAG CCGCAGGACC

XGXXDXXTXXXLXXQXXFXXGXXXPXXTXXFXXXNXXSXXKXXXPXXQXXDXXL

1301 TGGATCGCCT GTTCGATGCG GTCGGCGAAG TGCTGAACAA GATCGACTGA

XDXXRXXLXXXFXXDXXAXXXVXXGXXEXXVXXXLXXNXXKXXXIXXDXX*

Fig. S3

Fig. S4

PP ------NMPEHAGASLAS----QLKLDAHWMPYTANRNFLRDP--RLIVAAEGSWLVDD 47

AP ------MVDMSSNFDSVNEA-----RKGTYWQPFTSNRLLRADPEPRMLTKAEGIYYTSI 49

AB MFDTDKFSDSEHTLDAVQTNNNMHINYQAHWMPFSANRNFAKDP--RMIVGAKGSYLIDD 58

OA ------MTAQPNSLEAR-----DIRYHLHSYTDAVRLEAEG-PLVIERGDGIYVEDV 45

PD ------MNQPQSWEAR-----AETYSLYGFTDMPSVHQRG-TVVVTHGEGPYIVDV 44

CV ------MQKQRTTSQWREL-----DAAHHLHPFTDTASLNQAG-ARVMTRGEGVYLWDS 47

AD ------MSAAKLP-----DLSHLWMPFTANRQFKANP--RLLASAKGMYYTSF 40

VF ------MNKPQSWEAR-----AETYSLYGFTDMPSLHQRG-TVVVTHGEGPYIVDV 44

CC ----MQALARMLPMPDFGAN-----DLDAFWMPFTPNRRFKRHP--RMLSSASGMWYRTP 49

CC ------*

PP KGRKVYDSLSGLWTCGAGHTRKEIQEAVAKQLSTLDYSPGFQYG-HPLSFQLAEKITDLT 106

AP NGTRLLDTLSGLWCTPLGHAHPRIAEAVKTQVETLDFAPSFQMT-HPGAISLAERIAEMA 108

AB SGREIFDSLSGLWTCGAGHTLPEIQQAVSTQLGQLDYSPAFQFG-HPLSFKLADKIVQHM 117

OA SGKRYIEAMSGLWSVGVGFSEPRLAEAAARQMKKLPFYHTFSYRSHGPVIDLAEKLVSMA 105

PD HGRRYLDANSGLWNMVAGFDHKGLIEAAKAQYDRFPGYHAFFGRMSDQTVMLSEKLVEVS 104

CV EGNKIIDGMAGLWCVNVGYGRKDFAEAARRQMEELPFYNTFFKTTHPAVVELSSLLAEVT 107

AD DGRQILDGTAGLWCVNAGHCREEIVSAIASQAGVMDYAPGFQLG-HPLAFEAATAVAGLM 99

VF NGRRYLDANSGLWNMVAGFDHKGLIDAAKAQYERFPGYHAFFGRMSDQTVMLSEKLVEVS 104

CC ESREVLDATSGLWCVNAGHDRPKIREAIQKQAAEMDYAPCFNMG-HPLAFQFASRLAQIT 108

CC ------***----*------*---*------*

PP PGNLNHVFFTDSGSECALTAVKMVRAYWRLKGQATKTKMIGRARGYHGVNIAGTSLGGVN 166

AP PEGMNHVFFANSGSESVDTALKVALGFHRIKGEGNRFRMIGRERGYHGVGFGGMSVGGIV 168

AB PEKLQHVFFTNSGSESADTSIKMARAYWRIKGKPSKTKLIGRARGYHGVNVAGTSLGGIG 177

OA PVPMSKAYFTNSGSEANDTVVKLIWYRSNALGEPERKKIISRKRGYHGVTIASASLTGLP 165

PD PFDNGRVFYTNSGSEANDTMVKMLWFLHAAEGKPQKRKILTRWNAYHGVTAVSASMTGKP 164

CV PAGFDRVFYTNSGSESVDTMIRMVRRYWDVQGKPEKKTLIGRWNGYHGSTIGGASLGGMK 167

AD PQGLDRVFFTNSGSESVDTALKIALAYHRARGEAQRTRLIGRERGYHGVGFGGISVGGIS 159

VF PFDSGRVFYTNSGSEANDTMVKMLWFLHAAEGKPQKRKILTRWNAYHGVTAVSASMTGKP 164

CC PKGLDRIFFTNSGSESVDTALKIALAYHRARGKGTKTRLIGRERGYHGVGFGGISVGGIP 168

CC *------****---*------*------*---***------*: *

PP GNRKLFGQP-MQDVDHLPHTLLAS-NAYSRGMPKEGGIALADELLKLIELHDASNIAAVF 224

AP SNRKMFAPCMMPGVDHLRHPYEPEYAAFSHGQP-TWGAERAEDLQRLVALHDASTIAAVI 227

AB GNRKMFGQ--LMDVDHLPHTLQPD-LTFTKGCAETGGVELANEMLKLIELHDASNIAAVI 234

OA NNHRSFDLP-IDRILHTGCPHFYREGQAGESEE-QFATRLADELEQLIIAEGPHTIAAFI 223

PD YNS-VFGLP-LPGFIHLTCPHYWRYGEEGETEA-QFVARLARELEDTITREGADTIAGFF 221

CV YMHEQGDLP-IPGMAHIEQPWWYKHG-KDMTPD-EFGVVAARWLEEKILEIGADKVAAFV 224

AD PNRKTFSGALLPAVDHLPHTHSLEHNAFTRGQP-EWGAHLADELERIIALHDASTIAAVI 218

VF YNS-VFGLP-LPGFVHLTCPHYWRYGEEGETEE-QFVARLARELEETIQREGADTIAGFF 221

CC KNR-MYFGSLLTGVDHLPHTHGLPGNTCAKGQP-ENGAHLADDLERIVALHDASNIAAVI 226

CC *------*------*

PP VEPLAGSAGVLVPPEGYLKRTREICNQHNILLVFDEVITGFGRTGSMFGADSFGVTPDLM 284

AP VEPVQGSTGVLVPPVGYLERLREICTQNGILLIFDEVITGFGRMGAPFAAQRFGVKPDII 287

AB VEPISGSAGCIVPPTGYLQRLREICDQHDILLIFDEVITGFGRLGTWTAAEYFGVTPDIL 294

OA GEPVMGAGGVVVPPKTYWEKVQAVLKRYDILLIADEVICGFGRTGNLFGSQTFDMKPDIL 283

PD AEPVMGAGGVIPPAKGYFQAILPILRKYDIPMISDEVICGFGRTGNTWGCLTYDFMPDAI 281

CV GEPIQGAGGVIVPPATYWPEIERICRKYDVLLVADEVICGFGRTGEWFGHQHFGFQPDLF 284

AD VEPMAGSTGVLVPPKGYLEKLREITARHGILLIFDEVITAYGRLGEATAAAYFGVTPDLI 278

VF AEPVMGAGGVIPPAKGYFQAILPILRKYDIPVISDEVICGFGRTGNTWGCVTYDFTPDAI 281

CC VEPVAGSTGVLIPPKGYLERLRAICDKHDILLIFDEVITGFGRVGAPFAAERFGVTPDLI 286

CC -**--*- *---*---*------****---** *------**

PP CIAKQVTNGAIPMGAVIASTEIYQTFMNQPTPEYAVEFPHGYTYSAHPVACAAGLAALCL 344

AP TFAKAVTNGVVPMGGVIVTDEIYNTFMTG--PESAIEFCHGYTYSGHPLAAAVGHVVLDI 345

AB NFAKQVTNGAIPLGGVVASHEIYSAFMQQDLPEHAIEFTHGYTYSAHPVACAAALAALEI 354

OA VMSKQLSSSYLPISAFLINERVYAPIAEE--SHKIGTLGTGFTASGHPVAAAVALENLAI 341

PD ISSKNLTAGFFPMGAVILGPDLAKRVEAA--VEAIEEFPHGFTASGHPVGCAIALKAIDV 339

CV TAAKGLSSGYLPIGAVFVG----KRVAEG--LIAGGDFNHGFTYSGHPVCAAVAHANVAA 338

AD TMAKGVSNAAVPAGAVAVRREVHDAIVNG--PQGGIEFFHGYTYSAHPLAAAAVLATLDI 336

VF ISSKNLTAGFFPMGAVILGPELSKRLETA--IEAIEEFPHGFTASGHPVGCAIALKAIDV 339

CC CMAKGLTNAAVPCGAVAASGKIYDAMMDG--ADAPIELFHGYTYSAHPLACAAGLATLET 344

CC ---*------*------*-* *-**---*

PP LQKENLVQSVAE-VAPHFEKALHGIKGA-KNVIDIRNFGLAGAIQIAPRD------GDAI 396

AP MESEDIFARVRA-LEPVLEEEVHGLKDL-PCVSDIRNIGLTAAVDMKPIG------GKPG 397

AB LEKKNLLAQSAA-LAPSFEKMLHGLKSA-PHILDIRNCGLIGALQLAPRD------GDAT 406

OA IEERDLVANARD-RGTYMQKRLRELQDH-PLVGEVRGVGLIAGVELVTDKQAKTGLEPTG 399

PD VMNEGLAENVRR-LAPRFEAGLKRIADR-PNIGEYRGIGFMWALEAVKDKPTKTPFDANL 397

CV LRDEGIVQRVKDDIGPYMQKRWRETFSRFEHVDDVRGVGMVQAFTLVKNKAKRELFPDFG 398

AD YRREDLFARARK-LSAAFEEAAHSLKGA-PHVIDVRNIGLVAGIELSPRE------GAPG 388

VF VMNEGLAENVRR-LAPRFEERLKHIAER-PNIGEYRGIGFMWALEAVKDKASKTPFDGNL 397

CC YREDDLFARAAG-LEGYWQDAMHSLADA-RHVVDVRNLGLVAGIELEPRP------GAPT 396

.: : : : : *. *

PP VRPFEAGMALWKAGFYVRFGGDTLQFGPTFNSKPQDLDRLFDAVGEVLNKLLD------ 449

AP ARWSAVFEEGLRQGLLLRCTGDTVSFGPPFVASEQELRGMIASFRKVLEAVG------ 449

AB IRGFELGMKLWKEGFYVRFGGDTLQFGPMFNSTEADIDRLMNAVGDALYQVN------ 458

OA ALGAKANAVLQERGVISRAMGDTLAFCPPLIINDQQVDTMVSALEATLNDVQASLTR--- 456

PD SVSERIANTCTDLGLICRPLGQSIVLCPPFILTEAQMDEMFEKLEKALDKVFAEVA---- 453

CV EIGTLCRDIFFRNNLIMRACGDHIVSAPPLVMTRAEVDEMLAVAERCLEEFEQTLKARGLA 459

AD ARAAEAFQKCFDTGLMVRYTGDILAVSPPLIVDENQIGQIFEGIGKVLKEVA------ 440

VF SVSERIANTCTDLGLICRPLGQSVVLCPPFILTEAQMDEMFDKLEKALDKVFAEVA---- 453

CC ARAMEVFETCFDEGLLIRVTGDIIALSPPLILEKDHIDRMVETIRRVLGQVD------ 448

------*--*------*------*

Fig. S5
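The asterisk convention described in the legend to Figure S5 can be sketched as follows: a column is marked only when every sequence carries the same residue and none has a gap. This is an illustrative reimplementation, not the alignment program used for the figure. The function name is hypothetical, and the nine-residue fragments are copied from the third alignment block (the window ending in SGSE), where no sequence contains a gap.

```python
def conservation_line(aligned):
    """Return '*' under columns where all sequences share one residue, else ' '."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*aligned):
        residues = set(column)
        conserved = len(residues) == 1 and "-" not in residues
        marks.append("*" if conserved else " ")
    return "".join(marks)


# Equivalent nine-column window from each of the nine enzymes in Figure S5.
FRAGMENTS = {
    "PP": "VFFTDSGSE",
    "AP": "VFFANSGSE",
    "AB": "VFFTNSGSE",
    "OA": "AYFTNSGSE",
    "PD": "VFYTNSGSE",
    "CV": "VFYTNSGSE",
    "AD": "VFFTNSGSE",
    "VF": "VFYTNSGSE",
    "CC": "IFFTNSGSE",
}

if __name__ == "__main__":
    for name, fragment in FRAGMENTS.items():
        print(f"{name} {fragment}")
    print("   " + conservation_line(list(FRAGMENTS.values())))
```

For this window only the last four columns (S, G, S, E) are identical in all nine sequences, so the marker row is five spaces followed by four asterisks.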
