Supplementary Table

Gene / Codon(s) / Forward Primer (5’ to 3’) / Reverse Primer (5’ to 3’)
KRAS / 11/12/13 / ACACTCTTTCCCTACACGACGCTCTTCCGATCTagctgtatcgtcaaggcactct / GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTggcctgctgaaaatgactga
KRAS / 61 / ACACTCTTTCCCTACACGACGCTCTTCCGATCTtcctcatgtactggtccctcatt / GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTaattgatggagaaacctgtctctt
BRAF / 600 / ACACTCTTTCCCTACACGACGCTCTTCCGATCTtccagacaactgttcaaactgat / GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTtgaagacctcacagtaaaaatagg
EGFR / 718 / ACACTCTTTCCCTACACGACGCTCTTCCGATCTgctcccaaccaagctctctt / GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTtatacaccgtgccgaacgc
EGFR / 744/752 / ACACTCTTTCCCTACACGACGCTCTTCCGATCTtcccagaaggtgagaaagttaaaa / GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTacacagcaaagcagaaactcac
EGFR / 767/773 / ACACTCTTTCCCTACACGACGCTCTTCCGATCTctccaggaagcctacgtgat / GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTaggcagatgcccagcag
EGFR / 789 / ACACTCTTTCCCTACACGACGCTCTTCCGATCTctgggcatctgcctcacctc / GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTgtctttgtgttcccggacat
EGFR / 857/860 / ACACTCTTTCCCTACACGACGCTCTTCCGATCTaaacaccgcagcatgtcaa / GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTcctccttctgcatggtattctttc

Table s1: Primer sequences used to amplify the sequences of interest. Genomic sequences are in lower case, while the Illumina adaptor tag sequences are in upper case.
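
Because the case convention in Table s1 marks where the adaptor tag ends, the two parts of a primer can be separated programmatically. The following is a minimal sketch, assuming the tag is the leading upper-case run and the remainder is the lower-case genomic sequence; the function name `split_primer` is hypothetical and is not part of any tool described here.

```python
def split_primer(primer: str) -> tuple[str, str]:
    """Split a tagged primer into (adaptor_tag, genomic_sequence) by letter case."""
    # The tag is the leading run of upper-case bases.
    boundary = 0
    while boundary < len(primer) and primer[boundary].isupper():
        boundary += 1
    tag, genomic = primer[:boundary], primer[boundary:]
    # Anything after the tag is expected to be entirely lower case.
    if genomic and not genomic.islower():
        raise ValueError("mixed case after the adaptor tag")
    return tag, genomic


# KRAS codon 11/12/13 forward primer from Table s1.
kras_forward = "ACACTCTTTCCCTACACGACGCTCTTCCGATCTagctgtatcgtcaaggcactct"
tag, genomic = split_primer(kras_forward)
print(tag)      # Illumina adaptor tag
print(genomic)  # gene-specific sequence
```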

Name / Homologous to / Illumina adaptor specific sequence
Universal adaptor / Forward primer tag / AAT GAT ACG GCG ACC ACC GAG ATC TAC ACT CTT TCC CTA CAC GAC GCT CTT CCG ATC*T
Barcoded adaptor / Reverse primer tag / CAA GCA GAA GAC GGC ATA CGA GAT NNN NNN GTG ACT GGA GTT CAG ACG TGT GCT CTT CCG ATC*T

Table s2: The sequences of the Illumina adaptor oligonucleotides used in library production. “*” denotes a phosphorothioate linkage and the “NNN NNN” sequence indicates the position of the sample-specific index sequence.
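
As an illustration of how the “NNN NNN” placeholder is used, the sketch below substitutes a six-base index into the barcoded adaptor and checks that the 3' end of the adaptor matches the reverse primer tag from Table s1. The function name `insert_index` and the index `ACGTAC` are assumptions made for this example only; the actual index sequences are not listed in this supplement.

```python
# Barcoded adaptor as written in Table s2 ("*" marks the phosphorothioate linkage).
BARCODED_ADAPTOR = (
    "CAA GCA GAA GAC GGC ATA CGA GAT NNN NNN GTG "
    "ACT GGA GTT CAG ACG TGT GCT CTT CCG ATC*T"
)
# Reverse primer tag from Table s1.
REVERSE_PRIMER_TAG = "GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"


def insert_index(adaptor: str, index: str) -> str:
    """Replace the NNNNNN placeholder with a six-base sample index."""
    template = adaptor.replace(" ", "")
    if len(index) != 6 or set(index.upper()) - set("ACGT"):
        raise ValueError("index must be six A/C/G/T bases")
    if template.count("NNNNNN") != 1:
        raise ValueError("adaptor must contain exactly one NNNNNN placeholder")
    return template.replace("NNNNNN", index.upper())


# "ACGTAC" is a made-up index used only for illustration.
oligo = insert_index(BARCODED_ADAPTOR, "ACGTAC")
print(oligo)
# True if the 3' end of the adaptor (ignoring "*") matches the reverse primer tag.
print(oligo.replace("*", "").endswith(REVERSE_PRIMER_TAG))
```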

Reagent / Volume (ul) / Final Conc / MM x 6 (ul)
FastStart High Fidelity Buffer without MgCl2 (10x) / 2.5 / 1x / 15
MgCl2 (25mM) / 4.5 / 4.5mM / 27
DMSO / 1.25 / 5% / 7.5
Nucleotides (10mM) / 0.5 / 0.2mM / 3
FastStart High Fidelity Enzyme (5U/ul) / 0.25 / - / 1.5
Forward Target Primer (12.5uM) / 0.5 / 500nM / 3
Reverse Target Primer (12.5uM) / 0.5 / 500nM / 3
Universal adaptor (25uM) / 0.5 / 500nM / 3
H2O / 11.5 / - / 69
Barcode adaptor (25uM) / 1 / 500nM / -
DNA (10ng/ul) / 2 / - / -
Total / 25 / - / -

Table s3: PCR reagents. The composition of the PCR reaction mix used to create the amplicon libraries.
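
The master-mix column in Table s3 is the per-reaction volume multiplied by the number of reactions, and a final concentration follows from stock concentration x volume added / total reaction volume. The sketch below reproduces that arithmetic for the master-mix components; it assumes all volumes are in ul and a 25 ul reaction, and the function names are hypothetical.

```python
# Per-reaction volumes (ul) of the master-mix components listed in Table s3.
# The barcode adaptor and DNA are added per sample and are not part of the mix.
MASTER_MIX_UL = {
    "FastStart High Fidelity Buffer (10x)": 2.5,
    "MgCl2 (25mM)": 4.5,
    "DMSO": 1.25,
    "Nucleotides (10mM)": 0.5,
    "FastStart High Fidelity Enzyme (5U/ul)": 0.25,
    "Forward Target Primer": 0.5,
    "Reverse Target Primer": 0.5,
    "Universal adaptor": 0.5,
    "H2O": 11.5,
}


def scale_master_mix(per_reaction: dict, n_reactions: int) -> dict:
    """Multiply every per-reaction volume by the number of reactions."""
    return {name: volume * n_reactions for name, volume in per_reaction.items()}


def final_concentration(stock: float, volume_ul: float, total_ul: float = 25.0) -> float:
    """Dilution formula: final = stock * volume added / total volume."""
    return stock * volume_ul / total_ul


mix_x6 = scale_master_mix(MASTER_MIX_UL, 6)
for name, volume in mix_x6.items():
    print(f"{name}: {volume} ul")
print("Master mix total (ul):", sum(mix_x6.values()))
print("MgCl2 final concentration (mM):", final_concentration(25, 4.5))
```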

Temp (°C) / Time / Cycles
50 / 2 min / 1 cycle
70 / 20 min
95 / 10 min
95 / 15 sec / 10 cycles
60 / 30 sec
72 / 1 min
95 / 15 sec / 2 cycles
80 / 30 sec
60 / 30 sec
72 / 1 min
95 / 15 sec / 8 cycles
60 / 30 sec
72 / 1 min
95 / 15 sec / 2 cycles
80 / 30 sec
60 / 30 sec
72 / 1 min
95 / 15 sec / 8 cycles
60 / 30 sec
72 / 1 min
95 / 15 sec / 5 cycles
80 / 30 sec
60 / 30 sec
72 / 1 min

Table s4: The PCR cycle conditions used during library creation and amplification
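
To make the overall length of the program in Table s4 explicit, the sketch below encodes it as a list of blocks and sums the amplification cycles and programmed hold times. It assumes that each cycle count applies to every row up to the next count and that ramp times are ignored; the data structure and function names are illustrative only.

```python
# Steps are (temperature in C, hold time in seconds).
THREE_STEP = [(95, 15), (60, 30), (72, 60)]
FOUR_STEP = [(95, 15), (80, 30), (60, 30), (72, 60)]

# Each block is (number of cycles, steps), in the order given in Table s4.
PROGRAM = [
    (1, [(50, 120), (70, 1200), (95, 600)]),  # initial incubations
    (10, THREE_STEP),
    (2, FOUR_STEP),
    (8, THREE_STEP),
    (2, FOUR_STEP),
    (8, THREE_STEP),
    (5, FOUR_STEP),
]


def amplification_cycles(program) -> int:
    """Total cycles, excluding the first (initial incubation) block."""
    return sum(cycles for cycles, _ in program[1:])


def hold_seconds(program) -> int:
    """Total programmed hold time in seconds, ignoring ramping."""
    return sum(cycles * sum(sec for _, sec in steps) for cycles, steps in program)


print("Amplification cycles:", amplification_cycles(PROGRAM))
print("Programmed hold time (min):", hold_seconds(PROGRAM) / 60)
```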

Proportion of non-reference nucleotides at a position / Number of positions
0% to 0.01% / 152
0.01% to 0.1% / 573
0.1% to 0.2% / 733
0.2% to 0.3% / 622
0.3% to 0.4% / 162
0.4% to 0.5% / 47
0.5% to 0.6% / 26
0.6% to 0.7% / 8
0.7% to 0.8% / 8
0.8% to 0.9% / 3
0.9% to 1.0% / 1
2% to 10% / 4
10% to 100% / 17

Table s5: The distribution of the proportion of base calls that differ from the reference sequence in the non-primer positions in the BRAF and KRAS amplicons.
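
The counts in Table s5 can be summarised as the fraction of positions whose non-reference proportion does not exceed a given limit. The sketch below does this for the tabulated bins; it keys each bin by its upper bound and treats that bound as inclusive, which is an assumption, and the function name is hypothetical.

```python
# (upper bound of bin in %, number of positions), copied from Table s5.
BINS = [
    (0.01, 152), (0.1, 573), (0.2, 733), (0.3, 622), (0.4, 162),
    (0.5, 47), (0.6, 26), (0.7, 8), (0.8, 8), (0.9, 3), (1.0, 1),
    (10.0, 4), (100.0, 17),
]


def fraction_at_or_below(bins, limit_percent: float) -> float:
    """Fraction of positions in bins whose upper bound is <= limit_percent."""
    total = sum(count for _, count in bins)
    below = sum(count for upper, count in bins if upper <= limit_percent)
    return below / total


total_positions = sum(count for _, count in BINS)
print("Positions tabulated:", total_positions)
print("Fraction at or below 1%:", round(fraction_at_or_below(BINS, 1.0), 3))
```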

Sample ID / BRAF_1 / KRAS_2 / KRAS_3
g.140453136A>T / g.25380276T>A / g.25380277G>A / g.25398281C>T / g.25398282C>A / g.25398284C>T / g.25398284C>A / g.25398285C>G / WT
V>E / Q>L / Q>K / G>D / G>C / G>D / G>V / G>C
11-1 / 55%, 55%, 51%
11-52 / 54%, 54%, 47%
11-67 / 36%, 36%, 31%
11-108 / 37%, 36%, 27%
11-243 / 36%, 36%, 26% / < 5%, 2%, < 5%
11-260 / 31%, 31%, 29%
11-295 / WT, WT, WT
11-346 / 38%, 36%, 33%
11-457 / 33%, 33%, 20%
11-463 / 38%, 36%, 35%
12-4 / 14%, 14%, 12%
12-79 / 20%, 20%, 18%
12-102 / < 5%, 1%, < 5% / 49%, 49%, 39%
12-166 / 46%, 46%, 35%
12-177 / 28%, 28%, 26%
12-219 / 21%, 19%, 15% / < 5%, 2%, < 5%
12-238 / 51%, 49%, 43%
12-242 / 16%, 15%, 13%
12-268 / WT, WT, WT
12-303 / 44%, 44%, 30%

Table s6: An extended version of Table 1 which includes the variants identified by the BWA/VarScan pipeline. Variants identified in the BRAF (NM_004333.4) and KRAS (NM_004985.3) data by AgileSMAll, AgileSMPoint and the BWA/VarScan pipeline. For each variant, the proportion of reads indicating the variant is shown as a percentage of the total number of reads analysed by AgileSMAll and AgileSMPoint, respectively. WT identifies samples found to contain no variants. Shaded cells identify variants reported in the diagnostic screening.

PCR product / EGFR_1 / EGFR_2 / EGFR_3 / EGFR_4 / EGFR_5
Genomic position and substitution / g.55241677G>A / g.55241707G>T / g.55241708G>C / g.55242465Del GGAATTAAGAGAAGC / g.55242466Del GAATTAAGAGAAG>AGAC / g.55242467Del AATTAAGAGAAGCAACATC / g.55242470Del TAAGAGAAGCAACATCTC / g.55249002C>T / g.55249003C>T / g.55249004A>G / g.55249005G>T / g.55249011A>G / g.55249011InsCCAGCGTGG / g.55249063G>A / g.55249071C>T / g.55259515T>G / g.55259524T>A
Protein change / E>K / G>C / G>A / Ex / Ex / Ex / Ex / A>V / WT / S>G / S>I / D>G / Ex / WT / T>M / L>R / L>Q
SNP / + / - / - / - / - / - / - / + / - / - / - / - / - / + / - / - / -
1 / 39%, 35%, < 5%
2 / 25%, 24%, 22% / 51%, NA, 23% / 100%, NA, 99%
3 / 42%, 40%, 38% / 51%, 50%, 45% / 58%, NA, 56%
4 / 100%, NA, 99% / 8%, 7%, 8%
5 / 63%, NA, 61% / 15%, 15%, 15%
6 / 23%, 23%, 22% / 17%, 17%, 17%
7 / 42%, NA, 39% / 51%, NA, 49% / 42%, 41%, 40%
8 / 99%, NA, 99% / 41%, 40%, 38%
9 / 58%, NA, 54% / 37%, 37%, 37%
10 / 40%, 69%, < 5% / 100%, NA, 99%
11 / 48%, 46%, < 5% / 59%, NA, 56%
12 / 5%, NA, < 5% / 5%, NA, < 5% / 5%, NA, < 5% / 13%, 12%, 13%
13 / 37%, NA, 35% / 28%, 30%, 26% / 42%, 41%, 40%
14 / 72%, 68%, < 5% / 75%, NA, 74%
15 / 51%, NA, 48% / 7%, 6%, 7%
16 / 31%, 31%, 43%
17 / 100%, NA, 100%
18 / 52%, NA, 48%
19 / 100%, NA, 100%
20 / *5%, 4%, < 5% / 100%, NA, 100%
21 / 100%, NA, 100% / < 5%, 1%, < 5%
22 / 73%, 72%, 70% / 100%, NA, 100%
23 / 66%, NA, 63% / 43%, 42%, 43%
24 / 100%, NA, 100% / < 5%, 1%, < 5%
25 / 53%, NA, 49% / 15%, 15%, 15%

Table s7: An extended version of Table 2 which includes the variants identified by the BWA/VarScan pipeline. Variants identified in the EGFR (NM_005228.3) data by AgileSMAll, AgileSMPoint and the BWA/VarScan pipeline. For each variant, the proportion of reads indicating the variant is shown as a percentage of the total number of reads scored (shown in brackets) by AgileSMAll and AgileSMPoint, respectively. Shaded cells identify variants that had been reported in the previous diagnostic screening. * The insertion variant g.55249011InsCCAGCGTGG was correctly reported only in reads originating from the forward strand, as the read depth for the variant on the reverse strand was below the 5% cut-off value used in this analysis. ¥ The number in brackets indicates which amplicon contained each variant. † FS and WT indicate frameshift and wild-type variants, respectively.

Minimum variant read depth / Number of variants in BRAF/KRAS dataset / Number of variants in EGFR dataset
10% / 18 / 39
5% / 18 / 45
2.5% / 18 / 101
1% / 22 / 296
0.5% / 70 / 421
0.25% / 425 / 742
0.1% / 1601 / 3123

Table s8: The number of variants of unknown significance identified at different cut-off values for the minimum proportion of reads suggesting a variant allele.
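
The counts in Table s8 grow as the cut-off is lowered because every variant that passes a stricter cut-off also passes a looser one. The sketch below shows that kind of tally on made-up data; the percentages in `example` are invented for illustration, the function name is hypothetical, and treating the cut-off as inclusive is an assumption.

```python
# Cut-off values (%) used as rows in Table s8.
CUTOFFS_PERCENT = [10, 5, 2.5, 1, 0.5, 0.25, 0.1]


def count_by_cutoff(variant_percentages, cutoffs=CUTOFFS_PERCENT) -> dict:
    """Count variants whose read percentage is at or above each cut-off."""
    return {
        cutoff: sum(1 for pct in variant_percentages if pct >= cutoff)
        for cutoff in cutoffs
    }


# Made-up variant read percentages, for illustration only.
example = [55.0, 12.0, 6.0, 3.0, 0.8, 0.3, 0.12, 0.05]
counts = count_by_cutoff(example)
for cutoff, n in counts.items():
    print(f"{cutoff}%: {n}")
```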


Figure S1: Fragment of an AgileSMPoint report file showing the file’s structure. The reports for the read data files are displayed sequentially, each headed by the name of the data file (highlighted by the blue line). The name of each amplicon (highlighted by the red line) is followed by a list of variants identified in the amplicon file (highlighted by the green line). If no variants are found, the phrase ‘No mutations’ is displayed.

A

B

Figure S2: Fragment of an AgileSMPoint raw data file opened in Excel. Read depth data for each file spans 9 rows, with the data for each amplicon written as data sets in sequential columns, as shown in panel B. The read depth for each position of interest is given (underlined in black), followed by the read depth of possible indels (underlined in red). The total number of reads linked to an amplicon is shown (underlined in green), along with the number of reads suggesting no indel, which is also the total number of nucleotides analysed to identify a substitution (underlined in blue).

Target gene KRAS on chromosome 12, 25398251 bp. Number of reads mapped: 46900 (with possible indel: 2895, with no indel: 43791, Pseudogene reads: 214)
KRAS chr12 g.25398284C>T G>D 24162 19565 55.26 CCDS8702.1
Target gene KRAS on chromosome 12, 25380238 bp. Number of reads mapped: 119376 (with possible indel: 1499, with no indel: 93803, Pseudogene reads: 24074)
No mutations
Target gene BRAF on chromosome 7, 140453090 bp. Number of reads mapped: 118937 (with possible indel: 824, with no indel: 114324, Pseudogene reads: 3789)
No mutations

Figure S3: Structure of the report file produced by AgileSMAll
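
The amplicon header lines in the report fragment above follow a regular pattern, so they can be read with a regular expression. The sketch below is one possible parser, written only from the three header lines shown here; the field names and the function `parse_header` are assumptions, and the variant lines are deliberately not interpreted because their column meanings are not stated in this supplement.

```python
import re

# Pattern inferred from the header lines in the report fragment above.
HEADER_RE = re.compile(
    r"Target gene (?P<gene>\S+) on chromosome (?P<chrom>[^,\s]+), (?P<position>\d+) bp\. "
    r"Number of reads mapped: (?P<mapped>\d+) "
    r"\(with possible indel: (?P<indel>\d+), with no indel: (?P<no_indel>\d+), "
    r"Pseudogene reads: (?P<pseudogene>\d+)\)"
)


def parse_header(line: str):
    """Return the header fields as a dict, or None if the line is not a header."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Gene and chromosome stay as text; all other fields are read counts or positions.
    return {k: (v if k in ("gene", "chrom") else int(v)) for k, v in fields.items()}


line = (
    "Target gene KRAS on chromosome 12, 25398251 bp. Number of reads mapped: 46900 "
    "(with possible indel: 2895, with no indel: 43791, Pseudogene reads: 214)"
)
header = parse_header(line)
print(header)
# In the lines shown, the three read categories add up to the mapped total.
print(header["indel"] + header["no_indel"] + header["pseudogene"] == header["mapped"])
```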

Figure S4: Structure of the indel alignment data file produced by AgileSMAll. The forward and reverse reads are analysed independently and merged into a single reported variant only if the forward and reverse results match.
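
The strand-agreement rule described for Figure S4 can be pictured as keeping only the calls that appear in both the forward-read and reverse-read results. The sketch below is a simplified illustration of that idea and not the AgileSMAll implementation; the function name and the call lists are hypothetical, with variant strings written in the style of Table s7.

```python
def merge_strand_calls(forward_calls, reverse_calls):
    """Return the calls present on both strands, in forward-strand order."""
    reverse_set = set(reverse_calls)
    return [call for call in forward_calls if call in reverse_set]


# Hypothetical per-strand call lists, for illustration only.
forward = ["g.55242465Del GGAATTAAGAGAAGC", "g.55249011InsCCAGCGTGG"]
reverse = ["g.55242465Del GGAATTAAGAGAAGC"]

merged = merge_strand_calls(forward, reverse)
print(merged)
```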

Figure S5: Structure of the raw data file exported by AgileSMAll. This information is used to identify substitution variants only and does not contain data from reads with possible indels.

Figure S6: The distribution of the proportion of base calls that differ from the reference sequence in the non-primer positions in the BRAF and KRAS amplicons.