Table S1
Brome Mosaic Virus coat protein, variants from Portage la Prairie, Manitoba, and from Nebraska
P @STSGTGKMT RAQRRAAARR NRRTARVQPV IVEPLAAGQG KAIKAIAGYS 50
N ------V------G------50
P ISKWEASSDA ITAKATNAMS ITLPHELSSE KNKELKVGRV LLWLGLLPSV 100
N ------100
P AGRIKACVAE KQAQAEAAFQ VALAVADSSK EVVAAMYTDA FRGATLGDLL 150
N ------150
P NLQIYLYASE AVPAKAVVVH LEVEHVRPTF DDFFTPVYR 189
N ------189
The predicted N-terminal methionine was found to be deleted and the resulting N-terminal serine acetylated (denoted by @), but the original numbering has been retained [28].
Foxtail Mosaic Virus H93 coat protein, variant from Kansas.
@ATQNADVTD ATDYKKPPAE TEQKALTIQP RSNKAPSDEE LVRIINAAQK 50
RGLTPAAFVQ AAIIFTMESM DKGATDSTIF TGKYNTFPMK QLALCKDAGV 100
PVHKLCYFYT KPAYANRRVA NQPPARWTNE NVPKANKWAA FDTFDALLDP 150
YVVPSSVPYD EPTPEDRQVN EIFKKDNLSQ AASRNQLLGT QASITRGRLN 200
GAPALPNNGQ YFIEAPQ 217
Compared to the predicted sequence in the Swiss-Prot databank (accession # P22172, submitted by Bancroft et al. [30]), the following changes were observed (indicated by a double underline): The predicted N-terminal methionine was found to be deleted and the resulting N-terminal alanine acetylated (denoted by @), but the original numbering has been retained. The typographical error at position 4 has been corrected (G→Q), as in the Swiss-Prot databank [36]. At position 64, V has been replaced by I. At positions 67-69, MES has been inserted, and the peptide has been renumbered from this point. At position 91, Q has replaced S. Between L(94) and C(95), R has been deleted.
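The positions cited in the note above can be checked against the listed sequence with a short script. This is only a minimal sketch: the helper name residues is hypothetical, and it assumes that the @ symbol occupies position 1 (the deleted methionine), which is how the retained numbering appears to work.

```python
# Observed FoMV H93 coat protein, copied from the table above.
# Assumption: "@" (acetylated N-terminus) stands in for position 1,
# the deleted methionine, so the original numbering is retained.
FOMV_H93 = (
    "@ATQNADVTD" "ATDYKKPPAE" "TEQKALTIQP" "RSNKAPSDEE" "LVRIINAAQK"
    "RGLTPAAFVQ" "AAIIFTMESM" "DKGATDSTIF" "TGKYNTFPMK" "QLALCKDAGV"
    "PVHKLCYFYT" "KPAYANRRVA" "NQPPARWTNE" "NVPKANKWAA" "FDTFDALLDP"
    "YVVPSSVPYD" "EPTPEDRQVN" "EIFKKDNLSQ" "AASRNQLLGT" "QASITRGRLN"
    "GAPALPNNGQ" "YFIEAPQ"
)


def residues(seq, start, end=None):
    """Return the residue(s) at 1-based positions start..end (inclusive)."""
    end = start if end is None else end
    return seq[start - 1:end]


# Positions named in the note, with the residues it reports there.
reported = {
    (4, 4): "Q",     # corrected typographical error
    (64, 64): "I",   # V replaced by I
    (67, 69): "MES", # inserted tripeptide
    (91, 91): "Q",   # Q replaced S
    (94, 95): "LC",  # R deleted between L(94) and C(95)
}

for (start, end), expected in reported.items():
    found = residues(FOMV_H93, start, end)
    status = "ok" if found == expected else "MISMATCH"
    print(f"{start}-{end}: expected {expected}, found {found} ({status})")
```

Under that numbering assumption, each reported residue matches the listed sequence, and the total length agrees with the final count of 217.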
Wheat Streak Mosaic Virus coat proteins (various strains)
IHC QSNNVSVMAG LDTGGAKTGQ GSGSKGTGGS FTSNPVRTGG RATDVQDQTP 50
CO85 ------
BT95 ------
EB3 QAGGT-T--- V---R--V-- A------S-- -I-----N------
IHC GLVFPAPKIT TKAIYMPKTV RDKIKPEMIN NMIKYQPRAE LIDNRYATTE 100
CO85 -----T------R------
BT95 -----T------
EB3 ------R-- -E------V------I-
IHC QLNTWIKEAS EGLDVTEDVF INTLLPGWVY HCIINTTSPE NRALGTWRVV 150
CO85 ------
BT95 ------
EB3 ------V------K------
IHC NNAGKDNEQQ LEFKIEPMYK AAKPSLRAIM RHFGEGARVM IEESVRIGKP 200
CO85 ------
BT95 ------
EB3 --V-----E- H-Y--D------
IHC IIPRGFDKAG VLNINNIVAA CDFIMRGADD TPNFVQVQNS VAVNRLRGIQ 250
CO85 ------S------
BT95 ------S------
EB3 ------SV------TS------
IHC NKLFAQARLS AGTNEDNSRH DADDVRENTH SFNGVNALA 289
CO85 ------
BT95 ------
EB3 ------
The protein sequences of the WSMV isolates were confirmed by mass spectrometry to be those in the NCBI database: IHC (Canada, accession # AAM48214), CO85 (Colorado, accession # AAM48195), BT95 (Kansas, accession # AAM48191) and EB3 (El Batan 3 strain from Mexico, accession # AAG28732) [37].
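The strain rows in the alignment above appear to follow the common convention that a dash marks a residue identical to the reference row (here IHC); the table does not state this, so it is an assumption. Under that assumption, a variant row can be expanded to a full sequence as in the following minimal sketch, which uses only the first ten-residue block of the IHC and EB3 rows. The function name expand_variant is hypothetical.

```python
def expand_variant(reference, variant):
    """Expand a dash-notation row into a full sequence.

    Assumption: "-" in the variant row means "same residue as the
    reference at this position"; any letter is a substitution.
    """
    if len(reference) != len(variant):
        raise ValueError("reference and variant blocks differ in length")
    return "".join(r if v == "-" else v for r, v in zip(reference, variant))


# First block (positions 1-10) of the WSMV alignment, copied from the table.
IHC_BLOCK = "QSNNVSVMAG"
EB3_BLOCK = "QAGGT-T---"

expanded = expand_variant(IHC_BLOCK, EB3_BLOCK)

# List the substitutions as (position, reference residue, variant residue).
substitutions = [
    (i + 1, r, v)
    for i, (r, v) in enumerate(zip(IHC_BLOCK, expanded))
    if r != v
]

print(expanded)
for pos, ref, var in substitutions:
    print(f"position {pos}: {ref} -> {var}")
```

The same function could be applied block by block to the other strain rows, provided each block is aligned to the reference at equal length.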
Johnsongrass Mosaic Virus coat protein (various strains).
TrEMBL SGNEDAGKQK SATPAANQTA SGDGKPVQTT ATADNKPSSD NTSNAQGTSQ 50
Aus @------R------A------E---A------50
KS @------R------E--P- -A----A--- --S------T----- 50
NI @------NV- ---G -K EN -S D KEK- --E-G -AA--KA--N 40
IS @--TV-V-Q-- -Q-ESQDKET GESVNKDKQN KEGESGKTTQ DEKDKT---T 50
TrEMBL TKGGGESGGT NATATKKDKD VDVGSTGTFV IPKLKKVSPK MRLPMVSNKA 100
Aus ------100
KS A--DS----- K-STAT------100
NI --ED-GES-- KP-TAN------90
IS -NSQKND-K- SQEG ------T-T---IT V----AMTK- ----KANG-- 99
TrEMBL ILNLDHLIQY KPDQRDISNA RATHTQFQFW YNRVKKEYDV DDEQMRILMN 150
Aus ------I------150
KS ------I------150
NI ------I------140
IS -----F-LT- A-Q-Q----T ---QEE-NR- --AI----E- E-S--GV--D 149
TrEMBL GLMVWCIENG TSPDINGYWT MVDGNNQSEF PLKPIVENAK PTLRQCMMHF 200
Aus ------D--D------SA-D------200
KS ------200
NI ------190
IS ------N---V-- -M--ET-VTY ----V----S -----I-H-- 199
TrEMBL SDAAEAYIEM RNLDEPYMPR YGLLRNLNDK SLARYAFDFY EINSRTPNRA 250
Aus ------250
KS ------250
NI ------240
IS ------Y --SK-R------Q---T-F N------T-K-TT-- 249
TrEMBL REAHAQMKAA AIRGSTNHMF GLDGNVGESS ENTERHTAAD VSRNVHSYRG 300
Aus ------300
KS ------300
NI ------T--- 290
IS ----M------V---SAR------Q ------S- ----M--LL- 299
TrEMBL AKI 303
Aus --- 303
KS --- 303
NI --- 293
IS VQQSH 304
Strains are identified according to the origin of the virus: Aus, Australia; KS, Kansas; NI, Nigeria; IS, Israel. The N-terminal acetyl group (denoted by @) is not included in the numbering of the sequences. JGMV-NI is a new virus whose measured sequences correspond (except at position 287) to positions 67 – 303 of previously identified strains [33]. Allowing for 10 deletions, parts of the early sequence can also be matched to known strains. The strain identified as JGMV-IS is not actually a JGM virus [34]; it is listed here with the JGM viruses for convenience, because it has extended sequences that can be matched to those in previously identified JGMV strains.
High Plains Virus coat protein
@ALSFKNSSGV LKAKTLKDGF VTSSDIETTV HDFSYEKPDL SSVDGFSLKS 50
LLSSDGWHIV VAYQSVTNSE RLNNNKKNNK TQRFKLFTFD IIVIPGLKPN 100
KSKNVVSYNR FMALCIGMIC YHKKWKVFNW SNKRYEDNKN TINFNEDDDF 150
MNKLAMSAGF SKEHKYHWFY STGFEYTFDI FPAEVIAMSL FRWSHRVELK 200
IKYEHESDLV APMVRQVTKR GNISDVMDIV GKDIIAKKYE EIVKDRSSIG 250
IGTKYNDILD EFKDIFNKID SSSLDSTIKN CFNKIDGE 288
There are two strains, Kansas (KS96) and Idaho (ID97), whose sequences, confirmed by mass spectrometry [35], are identical and can be found in the Swiss-Prot databank [36]. The N-terminal acetyl group is not included in the numbering.