Supplementary Methods

Sample Selection

We included 50 individuals recruited to the Familial Breast Cancer Study (FBCS) through the Breast Cancer Susceptibility Collaboration (BCSC) (UK). A full list of contributors to this collaboration is given in the Supplementary Note. All participants in the study gave informed consent and the research was approved by the NHS National Research Ethics Service (MREC/01/2/18). All samples were screened for mutations in BRCA1 and BRCA2 by Sanger sequencing or heteroduplex methods in combination with Multiplex Ligation-dependent Probe Amplification (MLPA). Each individual was assigned a Family History Score (FHS) that quantifies the strength of the family history of breast cancer, as previously described [1]. A score of 1.0 is assigned to the index case, with an additional 0.5 for each affected first degree relative and an additional 0.25 for each affected second degree relative. The score of an individual with bilateral cancer is doubled. The median Family History Score was 3, which is the score of a woman with bilateral breast cancer and two first degree relatives with breast cancer, or equivalent. All individuals in this study had a score of ≥ 2. The breast cancer status and FHS for each proband are given in Supplementary Table 1.
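The scoring rule above can be sketched as a short function. This is an illustrative reading of the text, not the published implementation from [1]: the function name and arguments are hypothetical, and it assumes that the bilateral doubling applies to each individual's own contribution rather than to the family total, which is the reading consistent with the worked example of a score of 3.

```python
def family_history_score(index_bilateral, first_degree, second_degree):
    """Family History Score under the weighting described in the text.

    first_degree and second_degree are lists of booleans, one entry per
    affected relative, True if that relative had bilateral breast cancer.
    Assumption: doubling applies per individual, not to the family total.
    """
    def weight(base, bilateral):
        return base * (2 if bilateral else 1)

    score = weight(1.0, index_bilateral)                    # index case
    score += sum(weight(0.5, b) for b in first_degree)      # first degree relatives
    score += sum(weight(0.25, b) for b in second_degree)    # second degree relatives
    return score


# Worked example from the text: bilateral index case with two affected
# (unilateral) first degree relatives.
example = family_history_score(True, [False, False], [])
```

Under this reading the example evaluates to 2.0 + 0.5 + 0.5 = 3.0, and a bilateral index case with no affected relatives scores 2.0, the minimum score in this study.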

Exome Sequencing

For 21 exomes, targeted capture from 3 µg of genomic DNA was undertaken using a prototype version of the Agilent SureSelect Human All Exon 50 Mb Kit developed with the Wellcome Trust Sanger Institute (the GENCODE exome) [2]. For the remaining 29 exomes, targeted capture from 3 µg of genomic DNA was undertaken using the Agilent SureSelect Human All Exon 38 Mb Kit (see Supplementary Table 2). The manufacturer's protocol was followed in both cases (http://www.chem.agilent.com/Library/usermanuals/Public/G3360-90000_SureSelect_IlluminaPaired_1.1.1.pdf).

Sequencing was undertaken on one lane of an Illumina Genome Analyzer IIx platform. Read length was 76 bp. Illumina-specific FASTQ files containing sequence information and quality scores for each base were exported for further analysis. Read mapping and variant analysis were undertaken using NextGENe software version 2.10 (http://www.softgenetics.com/NextGene), with default settings for format conversion and alignment to the whole human genome reference sequence (GRCh37) [3]. NextGENe undertakes functional annotation from an inbuilt database using the seq_contig.md and seq_gene.md files from NCBI (ftp://ftp.ncbi.nih.gov/genomes/MapView/Homo_sapiens/sequence/BUILD.37.2/initial_release/).

Call quality filters

We undertook initial validation of 142 variants within our dataset (81 base substitutions and 61 indels). Of variants at a coverage of <15 reads, 51% were false positives, compared to 30% of variants at a coverage of ≥ 15 reads. All base substitutions with a mutant read percentage of <30% were false positives. By contrast, only 22% of indel variants with a mutant read percentage of <30% were false positives, emphasising the difficulties of aligning and calling this class of variant in short read data. To reduce our false positive rate, and to reduce our dataset to a pragmatic number of variants for validation purposes, all variants with a coverage of <15 reads and all base substitutions with a mutant read percentage of <30% were excluded from the analysis.
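The exclusion rule above can be expressed as a small predicate. This is a minimal sketch: the `VariantCall` record and its field names are hypothetical and do not correspond to the NextGENe output format, and the example calls are invented purely to exercise each branch of the rule.

```python
from dataclasses import dataclass

MIN_COVERAGE = 15            # reads; variants below this are excluded
MIN_MUTANT_READ_PCT = 30.0   # percent; applied to base substitutions only


@dataclass
class VariantCall:
    kind: str                # "substitution" or "indel" (hypothetical labels)
    coverage: int            # total reads covering the variant position
    mutant_read_pct: float   # percentage of reads carrying the variant


def passes_filters(v):
    """Return True if the call survives both filters described in the text."""
    if v.coverage < MIN_COVERAGE:
        return False
    if v.kind == "substitution" and v.mutant_read_pct < MIN_MUTANT_READ_PCT:
        return False
    return True


# Invented calls, one per branch of the rule.
calls = [
    VariantCall("substitution", 40, 48.0),  # kept
    VariantCall("substitution", 40, 22.0),  # excluded: low mutant read percentage
    VariantCall("indel", 40, 22.0),         # kept: percentage filter not applied to indels
    VariantCall("indel", 9, 60.0),          # excluded: coverage below 15
]
kept = [v for v in calls if passes_filters(v)]
```

The mutant read percentage threshold is deliberately not applied to indels, reflecting the observation that most low-percentage indel calls were true positives.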

Truncating variant validation

We undertook experimental validation of all truncating variants by Sanger sequencing in cases 1-12. Cases 1-4 have mutations in known breast cancer predisposition genes. Cases 5-12 were negative for mutations in known breast cancer predisposition genes and were selected to ensure that both exome platforms were equally represented in the validation experiment.

We designed primers for amplicons of 501 bp, with the variant base at the central position. Variants within 50 bp of each other were included in the same amplicon, with one of the variants in the central position. We uploaded 292 sequence amplicons containing 316 variants to BatchPrimer3 (http://probes.pw.usda.gov/cgi-bin/batchprimer3/batchprimer3.cgi) in FASTA format. Primers were selected using “General Settings for Generic Primers”, with the following exceptions: product size was Min 200, Opt 350, Max 501, and Max Tm difference was 5.0. All PCR reactions were run with an annealing temperature of 60 °C. We performed the sequencing reactions with the BigDye Terminator Cycle Sequencing Kit (ABI) after removing excess dNTPs and primers from PCR products using Exofast. Sequencing products were run on an ABI 3730 sequencer and the data were analysed using Mutation Surveyor (SoftGenetics, version 3.20) and by visual inspection.
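The amplicon layout described above can be illustrated as follows. This is a sketch under stated assumptions rather than the procedure actually used: the function names are hypothetical, variants are grouped if they lie within 50 bp of the first variant in the group, the first variant is taken as the central one, and coordinates are 1-based and inclusive on a single chromosome. Windows that would run past the start of a chromosome are not handled.

```python
AMPLICON_LENGTH = 501
FLANK = AMPLICON_LENGTH // 2   # 250 bp on each side of the central base
MAX_GAP = 50                   # variants this close share an amplicon


def group_variants(positions, max_gap=MAX_GAP):
    """Group positions on one chromosome.

    Assumption: a variant joins the current group if it lies within
    max_gap of the first (central) variant of that group.
    """
    groups = []
    for pos in sorted(positions):
        if groups and pos - groups[-1][0] <= max_gap:
            groups[-1].append(pos)
        else:
            groups.append([pos])
    return groups


def amplicon_window(group):
    """Return the (start, end) of a 501 bp window centred on the first variant."""
    centre = group[0]
    return centre - FLANK, centre + FLANK


# Invented positions: two variants 30 bp apart and one isolated variant.
groups = group_variants([1000, 1030, 5000])
windows = [amplicon_window(g) for g in groups]
```

With these invented positions, the two nearby variants share the window 750 to 1250 and the isolated variant gets 4750 to 5250, each spanning exactly 501 bases.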

Gene Enrichment Analysis

We entered the 122 genes in which we confirmed truncating variants into the ToppGene Suite ToppFun analysis software (http://toppgene.cchmc.org/enrichment.jsp). Search terms were “GO : Molecular Function”, “GO : Biological Process” and “GO : Cellular Component”. The Bonferroni correction was applied with a p-value cut-off of 0.05 [4].
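For readers unfamiliar with the Bonferroni correction, the sketch below shows the general calculation: each raw p-value is multiplied by the number of tests, capped at 1, and compared with the cut-off. It does not reproduce the internals of ToppFun; the function name, the term labels, and the p-values are all hypothetical, and whether the comparison is strict is also an assumption.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Map each term to (adjusted p-value, passes cut-off).

    p_values: dict of term name -> raw p-value, one entry per test.
    """
    n_tests = len(p_values)
    results = {}
    for term, p in p_values.items():
        adjusted = min(1.0, p * n_tests)   # Bonferroni adjustment, capped at 1
        results[term] = (adjusted, adjusted < alpha)
    return results


# Invented p-values for three hypothetical annotation terms.
hypothetical = {"term_A": 0.0001, "term_B": 0.02, "term_C": 0.4}
results = bonferroni_significant(hypothetical)
```

In this invented example only term_A remains below 0.05 after adjustment; term_B has a raw p-value under 0.05 but an adjusted value of 0.06.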


References

1. Turnbull C, Ahmed S, Morrison J, Pernet D, Renwick A, Maranian M, Seal S, Ghoussaini M, Hines S, Healey CS et al. (2010) Genome-wide association study identifies five new breast cancer susceptibility loci. Nat Genet 42:504-507.

2. Coffey AJ, Kokocinski F, Calafato MS, Scott CE, Palta P, Drury E, Joyce CJ, Leproust EM, Harrow J, Hunt S et al. (2011) The GENCODE exome: sequencing the complete human exome. Eur J Hum Genet 19:827-831.

3. Snape K, Hanks S, Ruark E, Barros-Nunez P, Elliott A, Murray A, Lane AH, Shannon N, Callier P, Chitayat D et al. (2011) Mutations in CEP57 cause mosaic variegated aneuploidy syndrome. Nat Genet 43:527-529.

4. Chen J, Bardes EE, Aronow BJ, Jegga AG (2009) ToppGene Suite for gene list enrichment analysis and candidate gene prioritization. Nucleic Acids Res 37:W305-311.


Supplementary Table 1. Characteristics of 50 individuals in whom exome sequencing was performed.

ID / Breast cancer status, age at diagnosis / Family History Score / ID / Breast cancer status, age at diagnosis / Family History Score
1 / Bi, 50, 59 / 3.5 / 26 / Bi, 45, 63 / 3.25
2 / Bi, 71, 80 / 3.25 / 27 / Uni, 50 / 4
3 / Bi, 40, 49 / 3.25 / 28 / Bi, 52, 61 / 3.25
4 / Bi, 58, 68 / 3 / 29 / Bi, 59, 59 / 2.5
5 / Uni, 31 / 3 / 30 / Uni, 56 / 2.75
6 / Bi, 59, 63 / 2.5 / 31 / Bi, 51, 62 / 2.75
7 / Bi, 50, 66 / 4.5 / 32 / Uni, 53 / 3
8 / Bi, 57, 71 / 3.25 / 33 / Bi, 56, 66 / 2.75
9 / Bi, 58, 73 / 3 / 34 / Bi, 49, 59 / 2.75
10 / Bi, 50, 50 / 3.25 / 35 / Bi, 59, 59 / 3.5
11 / Bi, 64, 64 / 3.5 / 36 / Bi, 55, 59 / 4.5
12 / Bi, 52, 61 / 3 / 37 / Bi, 57, 76 / 2.75
13 / Bi, 60, 62 / 2 / 38 / Bi, 54, 54 / 3
14 / Bi, 39, 52 / 3 / 39 / Bi, 49, 68 / 3
15 / Bi, 60, 60 / 4 / 40 / Bi, 44, 44 / 3.25
16 / Bi, 44, 44 / 2.75 / 41 / Bi, 53, 60 / 3.25
17 / Bi, 56, 62 / 2.75 / 42 / Bi, 58, 58 / 3.25
18 / Bi, 65, 65 / 3 / 43 / Bi, 45, 45 / 2.75
19 / Bi, 52, 53 / 2.5 / 44 / Uni, 48 / 2.5
20 / Uni, 54 / 3.25 / 45 / Bi, 66, 66 / 3.25
21 / Uni, 53 / 2.5 / 46 / Bi, 58, 58 / 2.5
22 / Bi, 37, 39 / 2.75 / 47 / Bi, 51, 53 / 3
23 / Bi, 35, 58 / 2.75 / 48 / Bi, 36, 45 / 3
24 / Uni, 44 / 2.75 / 49 / Bi, 65, 65 / 3.25
25 / Bi, 43, 57 / 4 / 50 / Bi, 56, 60 / 3

Bi = Bilateral breast cancer, Uni = Unilateral breast cancer


Supplementary Table 2. Exome sequencing metrics for 50 individuals with familial breast cancer. Validation of all truncating variants passing filters was undertaken in cases 1-12.

ID / Size of target exome / Total Number of Reads / Mapped Reads / Percentage of Mapped Reads / Percentage of target bases with coverage ≥ 15
1 / 50Mb / 41115790 / 40561471 / 98.7% / 62%
2 / 38Mb / 34750198 / 32694381 / 94.1% / 76%
3 / 38Mb / 67955486 / 48139149 / 70.8% / 81%
4 / 38Mb / 62701831 / 62216748 / 99.2% / 86%
5 / 50Mb / 35162494 / 34693641 / 98.7% / 58%
6 / 50Mb / 31058365 / 30450069 / 98.0% / 48%
7 / 50Mb / 32631451 / 31794830 / 97.4% / 55%
8 / 38Mb / 56521669 / 56200146 / 99.4% / 86%
9 / 38Mb / 52491128 / 52120340 / 99.3% / 87%
10 / 38Mb / 56243271 / 55773045 / 99.2% / 83%
11 / 38Mb / 50421970 / 50061713 / 99.3% / 83%
12 / 38Mb / 57021443 / 56654682 / 99.4% / 84%
13 / 38Mb / 42003989 / 41533300 / 98.9% / 79%
14 / 38Mb / 56025719 / 55799031 / 99.6% / 85%
15 / 38Mb / 87385236 / 62705226 / 71.8% / 83%
16 / 38Mb / 51823741 / 51446927 / 99.3% / 84%
17 / 38Mb / 63355611 / 63016395 / 99.5% / 87%
18 / 38Mb / 57298053 / 56952996 / 99.4% / 87%
19 / 50Mb / 37962415 / 37378097 / 98.5% / 58%
20 / 50Mb / 34208988 / 33735200 / 98.6% / 57%
21 / 50Mb / 35110021 / 34639332 / 98.7% / 56%
22 / 50Mb / 33509515 / 32182068 / 96.0% / 41%
23 / 50Mb / 32536401 / 32141420 / 98.8% / 58%
24 / 50Mb / 33633086 / 33178208 / 98.6% / 59%
25 / 50Mb / 42962125 / 42573229 / 99.1% / 71%
26 / 50Mb / 36823566 / 36243127 / 98.4% / 55%
27 / 50Mb / 36855386 / 36179154 / 98.2% / 59%
28 / 50Mb / 35109324 / 34521557 / 98.3% / 52%
29 / 50Mb / 37457958 / 36920161 / 98.6% / 56%
30 / 50Mb / 69572624 / 69091602 / 99.3% / 72%
31 / 38Mb / 57895667 / 57474931 / 99.3% / 87%
32 / 50Mb / 57526978 / 56587053 / 98.4% / 66%
33 / 38Mb / 57894805 / 57513445 / 99.3% / 85%
34 / 38Mb / 51355016 / 51010811 / 99.3% / 85%
35 / 50Mb / 41019489 / 40527205 / 98.8% / 60%
36 / 38Mb / 60455569 / 60059488 / 99.3% / 88%
37 / 38Mb / 54532840 / 54183335 / 99.4% / 86%
38 / 38Mb / 64352912 / 63727056 / 99.0% / 87%
39 / 38Mb / 58867830 / 58453243 / 99.3% / 88%
40 / 38Mb / 59966571 / 59606324 / 99.4% / 88%
41 / 38Mb / 59641159 / 59089467 / 99.1% / 88%
42 / 38Mb / 55271122 / 54943552 / 99.4% / 88%
43 / 38Mb / 57956482 / 57613108 / 99.4% / 87%
44 / 50Mb / 31780806 / 31170175 / 98.1% / 50%
45 / 38Mb / 56655031 / 56332124 / 99.4% / 87%
46 / 50Mb / 29123178 / 28771413 / 98.8% / 52%
47 / 38Mb / 59629268 / 59081301 / 99.1% / 87%
48 / 38Mb / 57018711 / 56584582 / 99.2% / 87%
49 / 38Mb / 59300434 / 58848129 / 99.2% / 85%
50 / 50Mb / 35480736 / 34876654 / 98.3% / 53%
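
The "Percentage of Mapped Reads" column can be recomputed from the two read-count columns. The short check below uses the values for cases 1 and 3 from the table; the function name is hypothetical and rounding to one decimal place is assumed from the way the table is reported.

```python
def percent_mapped(total_reads, mapped_reads):
    """Percentage of reads that mapped, rounded to one decimal place."""
    return round(100.0 * mapped_reads / total_reads, 1)


# Read counts taken from Supplementary Table 2.
case_1 = percent_mapped(41115790, 40561471)
case_3 = percent_mapped(67955486, 48139149)
```

These give 98.7 and 70.8 respectively, matching the tabulated percentages for those two cases.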


Supplementary Table 3. Confirmed truncating variants in 12 individuals with familial breast cancer.

ID / Gene / HGNC Approved Name / Truncating mutation
1 / BRCA2 / breast cancer 2, early onset / c.7977-1G>C
BRIX1 / biogenesis of ribosomes, homolog (S. cerevisiae) / c.793-2_793-1insA
CASP5 / caspase 5, apoptosis-related cysteine peptidase / c.1135+1C>T
CXCL6 / Chemokine (C-X-C motif) ligand 6 (granulocyte chemotactic protein 2) / c.239_240insT
FILIP1 / filamin A interacting protein 1 / c.303delG
HEATR7B2 / HEAT repeat family member 7B2 / c.2214+5A>G
IGSF22 / immunoglobulin superfamily, member 22 / c.479-2T>A
MLL4 / myeloid/lymphoid or mixed-lineage leukemia 4 / c.3059_3060dupG
PTCHD3 / patched domain containing 3 / c.923_924dupG
SLAMF6 / SLAM family member 6 / c.321G>C, p.Y107X
SMARCD2 / SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily d, member 2 / c.574G>A, p.R136X
SSX9 / synovial sarcoma, X breakpoint 9 / c.110delC
TNFAIP6 / tumor necrosis factor, alpha-induced protein 6 / c.90G>A, p.W30X
2 / CHEK2 / CHK2 checkpoint homolog (S. pombe) / c.1100delC
C2orf63 / chromosome 2 open reading frame 63 / c.1384+2A>T
CFHR5 / complement factor H-related 5 / c.486_487insA
PPEF2 / protein phosphatase, EF-hand calcium binding domain 2 / c.1960G>A, p.R654X
SERPINI2 / serpin peptidase inhibitor, clade I (pancpin), member 2 / c.628_629delAC
3 / CHEK2 / CHK2 checkpoint homolog (S. pombe) / c.658T>A, p.K220X
ABCC11 / ATP-binding cassette, sub-family C (CFTR/MRP), member 11 / c.2813C>G, p.S938X
DNMT3A / DNA (cytosine-5-)-methyltransferase 3 alpha / c.1025_1026insC
EPS8L1 / EPS8-like 1 / c.1514_1515dupT
FTMT / ferritin mitochondrial / c.436A>T, p.K146X
LOC647020 / Locus 647020 / c.303_304delAT
MCAT / malonyl CoA:ACP acyltransferase (mitochondrial) / c.729+1G>T
NOD2 / nucleotide-binding oligomerization domain containing 2 / c.3019_3020dupC
PRMT7 / protein arginine methyltransferase 7 / c.1056-1G>T
PRSS7 / transmembrane protease, serine 15 / c.2042_2043dupT
VPS13B / vacuolar protein sorting 13 homolog B (yeast) / c.6732+1G>A
WRN / Werner syndrome, RecQ helicase-like / c.1230_1231insA
ZNF451 / zinc finger protein 451 / c.488G>G/A, p.W163X
ZNF582 / zinc finger protein 582 / c.136+1G>T
4 / ATM / ataxia telangiectasia mutated / c.4396C>T, p.R1466X
FETUB / fetuin B / c.127_128insCA
KIAA1919 / KIAA1919 / c.614delT
SLC26A10 / solute carrier family 26, member 10 / c.1483C>T, p.R495X
TAOK1 / TAO kinase 1 / c.2544+5A>G
ZIM2 / zinc finger, imprinted 2 / c.1513C>T, p.R505X
5 / ATG4C / ATG4 autophagy related 4 homolog C (S. cerevisiae) / c.959_960delTG
BIRC8 / baculoviral IAP repeat containing 8 / c.711+5T>G
C17orf57 / chromosome 17 open reading frame 57 / c.1297A>T, p.K433X
C5orf52 / chromosome 5 open reading frame 52 / c.247delA
CHRNB3 / Cholinergic receptor, nicotinic, beta 3 / c.1249C>T, p.Q417X
DCD / Dermcidin / c.217_218insT
DDX60L / DEAD (Asp-Glu-Ala-Asp) box polypeptide 60-like / c.4600delA
ECEL1 / Endothelin converting enzyme-like 1 / c.1059+1C>T
ENDOD1 / Endonuclease domain containing 1 / c.708delT
GIMAP6 / GTPase, IMAP family member 6 / c.879+2G>A
IL25 / interleukin 25 / c.10C>T, p.R4X
NDUFA10 / NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 10, 42kDa / c.669+1C>G
NOX1 / NADPH oxidase 1 / c.544delT
SLC6A5 / solute carrier family 6 (neurotransmitter transporter, glycine), member 5 / c.1-2A>G