Phylogeny of the STE kinases

Using the same methods as those used for the receptor tyrosine kinases, the sequence of the kinase region of the stem STE kinase has been deduced as:

FELLEEIGSG AFGVVYKARD KPTGQLvAVK KIxLESDEEE EIEQKEILIL

KELKHPNIVK YYGAFLRGGE LWICMEFCEG GSLDDLLKKK NGPLTEEEIa

YILRQVLKGL EYLHSKGIIH RDIKGANILL TSDGQVKLAD FGVSAQLTDt

VIkrKSFVGT PYWMAPEVIL QQRGYDAKAD IWSLGITLIE LATGKPPYAN

LNPMRALFLI AKGPPPRLEE PELSPEFRDF VKQCLEKDPE KRPSAEELLK

HPF

Methods

STE protein kinase domains were selected from the Swiss-Prot database (http://www.ncbi.nlm.nih.gov/entrez/) and arranged into families on the basis of sequence homology. This grouping corresponded to the families defined using extracellular structure (ref. 1). A family tree (Fig. 1) shows the sequence similarity between protein kinase domains, derived from public sequences and gene prediction methods (ref. 2). Domains were defined by hidden Markov model profile analysis and multiple sequence alignment. The initial branching pattern was taken from a neighbor-joining tree derived from a ClustalW protein sequence alignment of the domains, prepared by pairwise comparison (in practice, a tree available on a commercial website was used: http://www.cellsignal.com/reference/kinase/tk.asp, accessed 6/03).

Assuming that each branch point represents a gene duplication event, the immediate ancestral gene as it was at the time of duplication was given a name (Fig. 1), and its sequence was determined as a consensus of the sequences of its progeny. Where the progeny differed, the nearest neighbour was used as an outgroup to determine which amino acid was the original; ‘x’ was used where this could not be determined. To enable this, the amino acid sequences of the gene products had to be aligned. Sequences were ‘piled up’ to locate conserved stretches and variable inserts. Initially, the Clustal alignment of the NCBI conserved domain database for kinases (http://www.ncbi.nlm.nih.gov/Structure/cdd/cddsrv.cgi) was used to give each amino acid a number in the (longest aggregate) sequence (Supplementary Fig. 1 online), though some adjustments were made subsequently as necessary to improve the fit. The greatest need for adjustment was observed at the edges of the conserved domains.
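The consensus rule described above can be illustrated with a minimal Python sketch. It assumes the three sequences are already aligned to the same length; the function names and the toy sequences are hypothetical and are not taken from the original analysis.

```python
def ancestral_residue(child_a: str, child_b: str, outgroup: str) -> str:
    """Infer one ancestral residue from two progeny and an outgroup."""
    if child_a == child_b:
        # Both progeny agree, so the shared residue is taken as ancestral.
        return child_a
    if outgroup == child_a:
        return child_a
    if outgroup == child_b:
        return child_b
    # The outgroup matches neither progeny: the residue is undetermined.
    return "x"


def ancestral_sequence(seq_a: str, seq_b: str, seq_out: str) -> str:
    """Apply the per-position rule along three pre-aligned sequences."""
    if not (len(seq_a) == len(seq_b) == len(seq_out)):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        ancestral_residue(a, b, o) for a, b, o in zip(seq_a, seq_b, seq_out)
    )


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, for illustration only).
    print(ancestral_sequence("GSGAF", "GSGSF", "GTGAY"))
    print(ancestral_sequence("KIV", "KLV", "KMV"))
```

In the first toy call the progeny differ only at the fourth position, where the outgroup sides with the first sequence; in the second the outgroup matches neither progeny at the middle position, so that position is left as ‘x’.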

For each amino acid position, an evolutionary tree was constructed by using successive neighbours or derived neighbours as outgroups. The final stem sequence (S1) was rooted using, as outgroups, stem sequences of the TKL and RTK families derived in the same way.

Where ‘x’s accumulated, a tentative assignment was made by looking for amino acids that appeared in progeny on both sides of a divide. Finally, the tree requiring the fewest mutations overall was constructed. Where there was a choice of equal parsimony, it was assumed that the same mutation had occurred twice during the development of the family, rather than that a forward mutation had subsequently been reversed.
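As a rough illustration of these two steps, the Python sketch below resolves an undetermined position from the residues seen on each side of a divide and counts the substitutions implied by a tree whose nodes already carry sequences. It is a simplified reading of the procedure, not the original implementation: the function names, the child-to-parent dictionary layout and the toy sequences are all assumptions, and positions marked ‘x’ are simply skipped when counting.

```python
from typing import Dict, List


def tentative_assignment(left: List[str], right: List[str]) -> str:
    """Return the residue seen on both sides of a divide, or 'x' if unclear."""
    shared = (set(left) & set(right)) - {"x"}
    if len(shared) == 1:
        return next(iter(shared))
    # No shared residue, or more than one candidate: leave undetermined.
    return "x"


def count_mutations(parent_of: Dict[str, str], sequences: Dict[str, str]) -> int:
    """Count residue differences along every child-to-parent edge."""
    total = 0
    for child, parent in parent_of.items():
        for c, p in zip(sequences[child], sequences[parent]):
            if c == "x" or p == "x":
                continue  # undetermined positions are not scored
            if c != p:
                total += 1
    return total


if __name__ == "__main__":
    # Residues at one position in the progeny on each side (toy data).
    print(tentative_assignment(["I", "L"], ["L", "M"]))

    # Node names follow Fig. 1; the sequences are hypothetical fragments.
    parent_of = {"MEKK3": "S35", "MEKK2": "S35"}
    sequences = {"S35": "KxV", "MEKK3": "KIV", "MEKK2": "KLA"}
    print(count_mutations(parent_of, sequences))
```

Comparing the count returned for alternative candidate trees gives a simple way to pick the one requiring the fewest mutations; the tie-breaking rule stated above (repeated mutation preferred over reversal) is not modelled here.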

1 Fantl, W.J., Johnson, D.E. & Williams, L.T. Signalling by receptor tyrosine kinases. Annu. Rev. Biochem. 62, 453-481 (1993).

2 Manning, G., Whyte, D.B., Martinez, R., Hunter, T. & Sudarsanam, S. The protein kinase complement of the human genome. Science 298, 1912-1934 (2002).

Fig. 1 STE named tree

MEKK3 ( )

MEKK2 ( )---- (S35)

MAP3K8 ( )-----(S23)

MEKK1 ( )------(S12)

()

MAP3K4 ( ) ()------(S8 )

TAK1 ( ) ()------(S13) ()

ASK1 ( )---- (S36) () ()

ASK2 ( )---- (S24) ()

()------(S4 )

MEK1 ( ) () ()

MEK2 ( )---- (S37) () ()

MEK5 ( )------(S14) () ()

MKK3 ( ) () () ()----- (S2 )

MKK6 ( )---- (S38) ()--- (S9 ) () ()

MKK4 ( )---- (S25) () () ()

MKK7 ( )------(S15) () ()

COT ( ) () ()

NIK ( )------(S5 ) ()

()

STLK3 ( ) ()

OSR1 ( )---- (S39) ()

()------(S26) ()

STLK6 ( )---- (S40) () ()

STLK5 ( ) () ()

PAK2 ( ) ()---(S18) ()

PAK1 ( ) ()--- (S30) () () ()

PAK3 ( )---- (S41) () () () ()

() () () ()

PAK4 ( ) ()--- (S27) () ()

PAK5 ( )---- (S42) () () ()--- (S1 )

()--- (S31) () ()

PAK6 ( ) ()--- (S16) ()

() () ()

MST1 ( ) () () ()

MST2 ( )------(S32) () () ()

() () () ()

MST3 ( )-----(S43) ()---(S28) () () ()

MST4 ( ) ()--- (S33) () () () ()

YSK1 ( ) () () () ()

()---(S19) () ()

TAO2 ( ) () () ()

TAO1 ( )---- (S44) () ()--- (S10) ()

()------(S29) () () ()

TAO3 ( ) () () ()

() () ()

ZC1 ( ) () () ()

ZC3 ( )---- (S45) () () ()

()--- (S34) () () ()

ZC2 ( ) ()--- (S20) () () ()

ZC4 ( ) () () ()--- (S6 ) ()

MYO3A ( ) ()------(S17) () () ()

MYO3B ( )------(S21) () () ()

() () ()

KHS1 ( ) () () ()

KHS2 ( )---- (S46) () ()--- (S3 )

()--- (S22) () ()

GCK ( ) ()------(S11) ()

HPK1 ( ) ()

LOK ( ) ()

SLK ( )------(S7 )