Table S1

Brome Mosaic Virus coat protein, variants from Portage la Prairie, Manitoba (P), and from Nebraska (N)

P @STSGTGKMT RAQRRAAARR NRRTARVQPV IVEPLAAGQG KAIKAIAGYS 50

N ------V------G------ 50

P ISKWEASSDA ITAKATNAMS ITLPHELSSE KNKELKVGRV LLWLGLLPSV 100

N ------ 100

P AGRIKACVAE KQAQAEAAFQ VALAVADSSK EVVAAMYTDA FRGATLGDLL 150

N ------ 150

P NLQIYLYASE AVPAKAVVVH LEVEHVRPTF DDFFTPVYR 189

N ------ 189

The predicted N-terminal methionine was found to be deleted and the resulting N-terminal serine acetylated (denoted by @), but the original numbering has been retained [28].

Foxtail Mosaic Virus H93 coat protein, variant from Kansas.

@ATQNADVTD ATDYKKPPAE TEQKALTIQP RSNKAPSDEE LVRIINAAQK 50

RGLTPAAFVQ AAIIFTMESM DKGATDSTIF TGKYNTFPMK QLALCKDAGV 100

PVHKLCYFYT KPAYANRRVA NQPPARWTNE NVPKANKWAA FDTFDALLDP 150

YVVPSSVPYD EPTPEDRQVN EIFKKDNLSQ AASRNQLLGT QASITRGRLN 200

GAPALPNNGQ YFIEAPQ 217

Compared to the predicted sequence in the Swiss-Prot databank (accession # P22172, submitted by Bancroft et al. [30]), the following changes were observed (indicated by a double underline). The predicted N-terminal methionine was found to be deleted and the resulting N-terminal alanine acetylated (denoted by @), but the original numbering has been retained. The typographical error at position 4 has been corrected (G→Q), as in the Swiss-Prot databank [36]. At position 64, V has been replaced by I. At positions 67–69, MES has been inserted, and the peptide has been renumbered from this point. At position 91, Q has replaced S. Between L(94) and C(95), R has been deleted.
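The position statements above can be checked directly against the printed sequence. The following is a minimal Python sketch, not part of the original analysis: the helper name `residues` is hypothetical, and it assumes that the "@" symbol occupies position 1 (the place of the deleted methionine), which is how the retained numbering appears to work in the table.

```python
# First 100 positions of the FoMV H93 coat protein, copied from the table.
# Assumption: "@" (the acetyl group) stands in for the deleted Met-1, so
# string index 0 corresponds to position 1 of the retained numbering.
FOMV_BLOCKS = (
    "@ATQNADVTD ATDYKKPPAE TEQKALTIQP RSNKAPSDEE LVRIINAAQK "
    "RGLTPAAFVQ AAIIFTMESM DKGATDSTIF TGKYNTFPMK QLALCKDAGV"
)
FOMV = FOMV_BLOCKS.replace(" ", "")


def residues(seq, start, end=None):
    """Return the residue(s) at 1-based positions start..end (inclusive)."""
    if end is None:
        end = start
    if start < 1 or end > len(seq) or start > end:
        raise ValueError("position out of range")
    return seq[start - 1:end]


if __name__ == "__main__":
    # Positions named in the note above.
    for label, start, end in [
        ("position 4", 4, 4),
        ("position 64", 64, 64),
        ("positions 67-69", 67, 69),
        ("position 91", 91, 91),
        ("positions 94-95", 94, 95),
    ]:
        print(label, residues(FOMV, start, end))
```

Under this assumption the lookups return Q, I, MES, Q and LC respectively, which agrees with the changes listed in the note.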

Wheat Streak Mosaic Virus coat proteins (various strains)

IHC QSNNVSVMAG LDTGGAKTGQ GSGSKGTGGS FTSNPVRTGG RATDVQDQTP 50

CO85 ------
BT95 ------
EB3 QAGGT-T--- V---R--V-- A------S-- -I-----N------
IHC GLVFPAPKIT TKAIYMPKTV RDKIKPEMIN NMIKYQPRAE LIDNRYATTE 100
CO85 -----T------R------
BT95 -----T------
EB3 ------R-- -E------V------I-
IHC QLNTWIKEAS EGLDVTEDVF INTLLPGWVY HCIINTTSPE NRALGTWRVV 150
CO85 ------

BT95 ------

EB3 ------V------K------

IHC NNAGKDNEQQ LEFKIEPMYK AAKPSLRAIM RHFGEGARVM IEESVRIGKP 200

CO85 ------

BT95 ------

EB3 --V-----E- H-Y--D------

IHC IIPRGFDKAG VLNINNIVAA CDFIMRGADD TPNFVQVQNS VAVNRLRGIQ 250

CO85 ------S------

BT95 ------S------

EB3 ------SV------TS------

IHC NKLFAQARLS AGTNEDNSRH DADDVRENTH SFNGVNALA 289

CO85 ------

BT95 ------

EB3 ------

The protein sequences of WSMV isolates were confirmed by mass spectrometry to be those in the NCBI database: IHC (Canada, accession # AAM48214), CO85 (Colorado, accession # AAM48195), BT95 (Kansas, accession # AAM48191) and EB3 (El Batan 3 strain from Mexico, accession # AAG28732) [37].
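The alignment rows in these tables can be expanded into full sequences. The sketch below is illustrative only and assumes the usual alignment convention, which the table does not state explicitly, that "-" in a strain row means "identical to the top (reference) row at this position". The function names `expand_row` and `differences` are hypothetical. The example uses the first ten-residue block of the IHC and EB3 rows.

```python
def expand_row(reference, row):
    """Replace each '-' in an alignment row with the reference residue.

    Assumes '-' denotes identity with the reference row (an assumption,
    not stated in the table) and that both strings have equal length.
    """
    if len(reference) != len(row):
        raise ValueError("reference and row must have the same length")
    return "".join(ref if aa == "-" else aa for ref, aa in zip(reference, row))


def differences(reference, row, first_position=1):
    """List (position, reference residue, strain residue) for each mismatch."""
    if len(reference) != len(row):
        raise ValueError("reference and row must have the same length")
    return [
        (pos, ref, aa)
        for pos, (ref, aa) in enumerate(zip(reference, row), start=first_position)
        if aa != "-" and aa != ref
    ]


if __name__ == "__main__":
    ihc_block = "QSNNVSVMAG"  # IHC, positions 1-10
    eb3_block = "QAGGT-T---"  # EB3, positions 1-10
    print(expand_row(ihc_block, eb3_block))
    print(differences(ihc_block, eb3_block))
```

Only intact ten-character blocks should be passed to these helpers, since the length check rejects rows whose dash runs do not match the reference length.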

Johnsongrass Mosaic Virus coat protein (various strains).

TrEMBL SGNEDAGKQK SATPAANQTA SGDGKPVQTT ATADNKPSSD NTSNAQGTSQ 50

Aus @------R------A------E---A------ 50

KS @------R------E--P- -A----A--- --S------T----- 50

NI @------NV- ---G -K EN -S D KEK- --E-G -AA--KA--N 40

IS @--TV-V-Q-- -Q-ESQDKET GESVNKDKQN KEGESGKTTQ DEKDKT---T 50

TrEMBL TKGGGESGGT NATATKKDKD VDVGSTGTFV IPKLKKVSPK MRLPMVSNKA 100

Aus ------ 100

KS A--DS----- K-STAT------ 100

NI --ED-GES-- KP-TAN------ 90

IS -NSQKND-K- SQEG ------T-T---IT V----AMTK- ----KANG-- 99

TrEMBL ILNLDHLIQY KPDQRDISNA RATHTQFQFW YNRVKKEYDV DDEQMRILMN 150

Aus ------I------ 150

KS ------I------ 150

NI ------I------ 140

IS -----F-LT- A-Q-Q----T ---QEE-NR- --AI----E- E-S--GV--D 149

TrEMBL GLMVWCIENG TSPDINGYWT MVDGNNQSEF PLKPIVENAK PTLRQCMMHF 200

Aus ------D--D------SA-D------ 200

KS ------ 200

NI ------ 190

IS ------N---V-- -M--ET-VTY ----V----S -----I-H-- 199

TrEMBL SDAAEAYIEM RNLDEPYMPR YGLLRNLNDK SLARYAFDFY EINSRTPNRA 250

Aus ------ 250

KS ------ 250

NI ------ 240

IS ------Y --SK-R------Q---T-F N------T-K-TT-- 249

TrEMBL REAHAQMKAA AIRGSTNHMF GLDGNVGESS ENTERHTAAD VSRNVHSYRG 300

Aus ------ 300

KS ------ 300

NI ------T--- 290

IS ----M------V---SAR------Q ------S- ----M--LL- 299

TrEMBL AKI 303

Aus --- 303

KS --- 303

NI --- 293

IS VQQSH 304

Strains are identified according to the origin of the virus: Aus, Australia; KS, Kansas; NI, Nigeria; IS, Israel. The N-terminal acetyl group (denoted by @) is not included in the numbering. JGMV-NI is a new virus whose measured sequences correspond (except at position 287) to positions 67–303 of previously identified strains [33]. Allowing for 10 deletions, parts of the early sequence can also be matched to known strains. The strain identified as JGMV-IS is not actually a JGM virus; it is listed here with the JGM viruses for convenience because it has extended sequences that can be matched to those in previously identified JGMV strains [34].
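The relationship between the JGMV-NI numbering and the numbering of the previously identified strains can be written as simple arithmetic. The sketch below is an inference from the table rather than something stated in it: it assumes that all 10 deletions fall before reference position 67, so that the NI row numbers (40, 90, ..., 293) trail the reference row numbers (50, 100, ..., 303) by a constant 10 over the matched region. The names `NI_OFFSET` and `ni_to_reference` are hypothetical.

```python
# Assumption: a constant offset of 10 over the matched region, inferred from
# the row-end numbers in the table (e.g. NI 293 versus reference 303).
NI_OFFSET = 10
REF_START, REF_END = 67, 303  # matched region in reference numbering


def ni_to_reference(ni_position):
    """Map a JGMV-NI position to the reference numbering (matched region only)."""
    ref_position = ni_position + NI_OFFSET
    if not REF_START <= ref_position <= REF_END:
        raise ValueError("position lies outside the matched region 67-303")
    return ref_position


if __name__ == "__main__":
    print(ni_to_reference(57))   # first matched NI position
    print(ni_to_reference(293))  # last NI position
```

Positions before the matched region are deliberately rejected, because the placement of the individual deletions in the early sequence cannot be recovered from the row numbers alone.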

High Plains Virus Coat Protein

@ALSFKNSSGV LKAKTLKDGF VTSSDIETTV HDFSYEKPDL SSVDGFSLKS 50

LLSSDGWHIV VAYQSVTNSE RLNNNKKNNK TQRFKLFTFD IIVIPGLKPN 100

KSKNVVSYNR FMALCIGMIC YHKKWKVFNW SNKRYEDNKN TINFNEDDDF 150

MNKLAMSAGF SKEHKYHWFY STGFEYTFDI FPAEVIAMSL FRWSHRVELK 200

IKYEHESDLV APMVRQVTKR GNISDVMDIV GKDIIAKKYE EIVKDRSSIG 250

IGTKYNDILD EFKDIFNKID SSSLDSTIKN CFNKIDGE

There are two strains, Kansas (KS96) and Idaho (ID97), whose sequences, confirmed by mass spectrometry [35], are identical and can be found in the Swiss-Prot databank [36]. The N-terminal acetyl group is not included in the numbering.