Development of DNA Confirmatory and High-Risk Diagnostic Testing for Newborns Using Targeted Next-Generation DNA Sequencing

SUPPLEMENTARY INFORMATION

Supplementary Note 1- DNA Isolation from Dried Blood Spots (DBS).

Supplementary Note 2- Costs per Subject.

Figure S1: Quality and Performance of DNA Isolated from Bio-specimens.

Figure S2: Uniformity of Coverage and Reproducibility of NBDx.

Figure S3: Variant Management for Filtering Blinded Samples.

Table S1. Sequencing and Enrichment Statistics for the NBDx and WES Samples.

Table S2. Yield and DNA Isolation from Biological Specimens.

Table S3. Tabulation of Disease Positive Calls.

Table S4. Coverage for NBDx Samples per Tiled Region.


Supplementary Note 1- DNA Isolation from Dried Blood Spots (DBS).

Over the years, several groups have isolated DNA from DBS for various assays using methods such as boiling or alkaline washes, which yield single-stranded DNA¹. In contrast, NGS library construction requires double-stranded DNA (dsDNA) for enzymatic incorporation of universal adapters via ligation or transposition. We tested several methods for DNA isolation from DBS: a modified version of the QiaAmp® Blood Micro kit (Qiagen, Germantown, MD); a modified version of GenSolve™ (IntegenX, Pleasanton, CA) isolation coupled with Qiagen MinElute™ purification, similar to previously described methods²; and the ChargeSwitch® Forensic kit (Life Technologies, Carlsbad, CA). Multiple blood spots were collected from two individual volunteers as standardized sample input and compared with our protocol for DNA isolation from saliva and extremely small volume (25-50 µl) whole blood (control sample set). The best purifications in terms of yield and other quality metrics were from our modified protocols with the QiaAmp Blood Micro kit (Table S2 and data not shown). Small-volume blood was collected in K2EDTA Microtainer tubes from Becton Dickinson (Franklin Lakes, NJ), and saliva was collected with OraGene® assisted devices from DNA Genotek (Ontario, Canada).

DNA isolated from DBS, small-volume whole blood and saliva was examined for quality metrics, as summarized in Table S2 and Figure S1. dsDNA yield increased, without loss of quality, as the input increased from ~1/4 spot to an entire DBS spot (Figure S1A). Capture performance metrics were similar for DBS, whole blood, and saliva (see Figure S1B for a comparison of WES metrics). Bacterial contamination of up to 30% in saliva samples did not appear to negatively affect enrichment or sequencing throughput, because the hybrid-capture process enriches human sequences over non-endogenous sequences.

Supplementary Note 2- Costs per Subject.

We estimated upstream costs of capture and sequencing at approximately $900 for exome (WES) and $365 for NBDx, based on the Illumina HiSeq 2500 workflow. The NBDx capture and sequencing cost is typically 15-30% of the overall cost. Variant analysis and curation costs are based on genetic counselor salaries of $60 per hour and commercial flat-rate contracting of $300-500 per case. This cost is determined by the number of variants mined, which is much lower in NBDx panels. It covers review of evidence for pathogenicity, including the time required for literature searches (finding literature cataloged in mutation databases and performing independent PubMed and Google searches for the genetic variant and gene), and secondary review. The cost of interpretation is thus close to 50% of the overall cost. Follow-up tests cost an additional $250 per parent (~20%). We estimate that a 50% year-over-year reduction in sequencing and capture costs alone would reduce the overall cost by only 7-15%. These estimates do not consider CLIA license overhead or other compliance costs and requirements (such as data storage, utilities, and indirect costs). If volumes are low, the fixed-cost components rise, increasing the cost per test. NBDx can also be run on a MiSeq at a lower throughput of 3-4 samples per run.

Current Cost Estimates (per sample on HiSeq 2500)
Panel (samples/lane) / Capture & Sequencing / Analysis & Counseling / Ancillary Tests / Infrastructure / Total
WES (4 samples/lane) / $900 (24%) / $1600 (43%) / $500 / $750 / $3,750.00
NBDx (20 samples/lane) / $365 (16%) / $650 (29%) / $500 / $750 / $2,265.00
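
The arithmetic behind the cost table can be checked with a short sketch. The dollar figures below are copied from the table; the dictionary layout and the function names (`total_cost`, `share`, `net_reduction`) are illustrative assumptions, not part of the original analysis.

```python
# Per-sample cost components in USD, copied from the cost table above.
# The key names are illustrative labels, not terms defined by the authors.
COSTS = {
    "WES": {"capture_sequencing": 900, "analysis_counseling": 1600,
            "ancillary_tests": 500, "infrastructure": 750},
    "NBDx": {"capture_sequencing": 365, "analysis_counseling": 650,
             "ancillary_tests": 500, "infrastructure": 750},
}


def total_cost(components):
    """Sum of all per-sample cost components."""
    return sum(components.values())


def share(components, key):
    """Fraction of the total cost contributed by one component."""
    return components[key] / total_cost(components)


def net_reduction(components, key, cut):
    """Fractional drop in total cost if one component falls by `cut` (0-1)."""
    return cut * share(components, key)


for panel, comp in COSTS.items():
    print(panel,
          total_cost(comp),
          f"{share(comp, 'capture_sequencing'):.0%}",
          f"{net_reduction(comp, 'capture_sequencing', 0.5):.0%}")
# WES 3750 24% 12%
# NBDx 2265 16% 8%
```

Under these figures, halving capture and sequencing costs lowers the per-sample total by roughly 12% for WES and 8% for NBDx, which falls within the 7-15% range quoted in the text.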


Supplementary References

1. Saavedra-Matiz CA, Isabelle JT, Biski CK, et al. Cost-effective and scalable DNA extraction method from dried blood spots. Clin Chem 2013;59:1045-1051.

2. Beyan H, Down TA, Ramagopalan SV, et al. Guthrie card methylomics identifies temporally stable epialleles that are present at birth in humans. Genome Res 2012;22:2138-2145.

3. Kumar P, Henikoff S, Ng PC. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat Protoc 2009;4:1073-1081.

4. Adzhubei IA, Schmidt S, Peshkin L, et al. A method and server for predicting damaging missense mutations. Nat Methods 2010;7:248-249.

5. Schwarz JM, Rödelsperger C, Schuelke M, Seelow D. MutationTaster evaluates disease-causing potential of sequence alterations. Nat Methods 2010;7:575-576.

6. Pollard KS, Hubisz MJ, Rosenbloom KR, Siepel A. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res 2010;20:110-121.

7. Breiman L. Random forests. Machine Learning 2001;45:5-32.

8. Landrum MJ, Lee JM, Riley GR, et al. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res 2014;42:D980-5.

9. Online Mendelian Inheritance in Man. McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University (Baltimore, MD); available at www.omim.org/statistics [accessed June 1, 2013].

10. Klein TE, Chang JT, Cho MK, et al. Integrating genotype and phenotype information: an overview of the PharmGKB project. Pharmacogenetics Research Network and Knowledge Base. Pharmacogenomics J 2001;1:167-170. http://www.pharmgkb.org

11. Hindorff LA, MacArthur J, Morales J, et al. A Catalog of Published Genome-Wide Association Studies. Available at: www.genome.gov/gwastudies.

12. Giardine B, Riemer C, Hefferon T, et al. PhenCode: connecting ENCODE data with mutations and phenotype. Hum Mutat 2007;28:554-562. http://phencode.bx.psu.edu/

13. 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature 2012;491:56-65. www.1000genomes.org

14. Database of Single Nucleotide Polymorphisms (dbSNP). Bethesda (MD): National Center for Biotechnology Information, National Library of Medicine. http://www.ncbi.nlm.nih.gov/SNP/

15. Stenson PD, Ball EV, Mort M, et al. The Human Gene Mutation Database (HGMD®): 2003 Update. Hum Mutat 2003;21:577-581. http://www.hgmd.cf.ac.uk

16. Fokkema IF, Taschner PE, Schaafsma GC, Celli J, Laros JF, den Dunnen JT. LOVD v.2.0: the next generation in gene variant databases. Hum Mutat 2011;32:557-563.

17. Nolan T, Hands RE, Ogunkolade W, Bustin SA. SPUD: a quantitative PCR assay for the detection of inhibitors in nucleic acid preparations. Anal Biochem 2006;351:308-310.

Figure S1: Quality and Performance of DNA Isolated from Bio-specimens. (A) Agarose Gel QC of genomic DNA purified from DBS. DNA is high molecular weight and yield increases with spot area sampled. Sufficient yield is obtained from a single spot for NGS library preparation. (B) TNGS performance metrics. DNA isolated from DBS, Whole Blood and Saliva of the same individual perform similarly in TNGS. Graphs show WES results for %Reads On-Target (Reads On-Target/Reads Mapped) and Coverage at least 1, 10, 20, 50 and 100 reads on target. NBDx panel capture results were also similar across bio-specimen types (data not shown).

Figure S2: Uniformity of Coverage and Reproducibility of NBDx. Histogram of coverage counts for all bases in the tiled regions as generated by GATK's Base Coverage Distribution program. (A) NBDx and WES distribution for the respective target regions. (B) Representative pairwise-comparison of variant read depth. Read depth of variants in exons of the 126 NBS genes plotted for coverage depth from independent capture and sequencing runs of a single patient sample. Variants with ≥10 reads were included. The GATK pipeline coverage threshold was 200 reads. The same sample is compared pairwise for WES and NBDx capture (~140 variants/sample).

Figure S3: Variant Management for Filtering Blinded Samples. Variant filtering workflow for calling causative mutations from the blinded samples (Table 3). Variant files (VCF) were loaded into Opal for annotation and filters were applied in Variant Miner. ᵃProtein impact was categorized as Stop Gained/Lost, Indel/Frameshift, Splice Site and Non-synonymous. ᵇVariant scoring used prediction algorithms including SIFT³, PolyPhen2⁴, MutationTaster⁵, PhyloP⁶ and the Omicia Score (a random-forest classifier⁷ that creates an integrative score between 0 and 1). ᶜDatabases include ClinVar⁸, OMIM⁹, PharmGKB¹⁰, GWAS¹¹, Locus Specific Databases (from PhenCode¹²), 1000 Genomes¹³, dbSNP¹⁴, HGMD¹⁵, LOVD¹⁶, and an in-house database. Literature searches were also included to more fully understand the classification of filtered variants. Intronic mutations were annotated in Opal and identified through variant scoring after a heterozygous deleterious mutation had been identified for a disorder indicated by the clinical summary.
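
The filtering logic described in the Figure S3 caption can be illustrated with a minimal sketch. This is not the Opal or Variant Miner implementation: the `Variant` record, the `filter_variants` function, the 0.5 score cutoff, the rule that a database hit rescues a low-scoring variant, and the example records are all hypothetical. Only the four protein-impact categories and the 0-1 score range come from the caption.

```python
from dataclasses import dataclass

# Protein-impact categories named in the caption (identifier spelling is ours).
IMPACT_CATEGORIES = {
    "stop_gained_lost",
    "indel_frameshift",
    "splice_site",
    "non_synonymous",
}


@dataclass
class Variant:
    gene: str
    impact: str        # protein-impact category
    score: float       # integrative pathogenicity score in [0, 1]
    in_database: bool  # True if reported in a clinical variant database


def filter_variants(variants, min_score=0.5):
    """Keep variants with a qualifying protein impact that either reach the
    score cutoff or have a database record; rank by score, highest first.
    The cutoff and the OR rule are assumptions made for illustration."""
    kept = [
        v for v in variants
        if v.impact in IMPACT_CATEGORIES
        and (v.score >= min_score or v.in_database)
    ]
    return sorted(kept, key=lambda v: v.score, reverse=True)


# Hypothetical records, not data from the study.
example = [
    Variant("PAH", "non_synonymous", 0.92, True),
    Variant("PAH", "synonymous", 0.10, False),
    Variant("GCDH", "splice_site", 0.30, True),
    Variant("GCDH", "non_synonymous", 0.20, False),
]
kept = filter_variants(example)
```

In practice the surviving variants would still go through the literature review and clinical-summary comparison that the caption describes; the sketch covers only the automated filtering step.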

Table S1. Sequencing and Enrichment Statistics for the NBDx and WES Samples.

ID / Panel / Raw Reads (Millions) / Reads Mapped (Millions) / Reads On-Target (Millions) / %Target Covered 1X / %Target Covered 10X / %Target Covered 20X / %Target Covered 50X / %Target Covered 100X / Average Reads / Specificity
S1 / WES / 90.0 / 85.8 / 65.8 / 99.2 / 95.2 / 89.9 / 69.1 / 37.4 / 99 / 76.7
S3 / WES / 84.5 / 79.8 / 61.4 / 99.4 / 95.4 / 89.9 / 67.6 / 34.2 / 97 / 76.9
S4 / WES / 95.6 / 91.2 / 70.9 / 99.4 / 95.7 / 90.9 / 71.9 / 41.2 / 108 / 77.7
S5 / WES / 49.4 / 37.4 / 20.7 / 99.6 / 92.3 / 76.2 / 29.9 / 5.8 / 48 / 55.3
S6 / WES / 93.0 / 89.1 / 69.0 / 99.3 / 95.1 / 89.6 / 69.4 / 39.1 / 102 / 77.4
S7 / WES / 67.7 / 58.3 / 38.2 / 99.5 / 95.3 / 87.6 / 54.4 / 17.4 / 72 / 65.4
S9 / WES / 76.8 / 72.7 / 56.1 / 99.3 / 94.7 / 88.2 / 63.3 / 29.5 / 89 / 77.2
S10 / WES / 75.1 / 72.0 / 56.5 / 99.2 / 93.9 / 86.9 / 62.2 / 29.7 / 88 / 78.5
S1 / NBDx / 17.5 / 17.0 / 14.7 / 97.0 / 94.3 / 92.2 / 87.4 / 74.7 / 147 / 86.5
S3 / NBDx / 17.1 / 16.4 / 14.0 / 97.0 / 94.6 / 92.6 / 87.8 / 73.6 / 149 / 85.6
S4 / NBDx / 16.2 / 15.8 / 13.7 / 97.2 / 94.3 / 92.1 / 86.6 / 71.3 / 148 / 86.7
S5 / NBDx / 11.5 / 9.3 / 7.0 / 96.7 / 92.4 / 89.5 / 68.9 / 24.4 / 81 / 75.4
S6 / NBDx / 16.3 / 15.9 / 13.7 / 97.2 / 94.1 / 91.8 / 86.5 / 71.2 / 150 / 86.6
S7 / NBDx / 13.8 / 13.3 / 11.7 / 97.0 / 94.4 / 92.2 / 86.3 / 63.9 / 134 / 88.2
S9 / NBDx / 16.1 / 15.6 / 13.5 / 97.0 / 94.1 / 91.8 / 86.3 / 70.3 / 140 / 86.5
S10 / NBDx / 15.9 / 15.4 / 13.4 / 97.1 / 94.2 / 92.1 / 86.6 / 70.6 / 142 / 86.5
S11 / NBDx / 13.5 / 13.2 / 11.7 / 97.2 / 94.6 / 92.4 / 86.2 / 64.3 / 133 / 88.7
4963 / NBDx / 18.6 / 17.9 / 15.5 / 97.5 / 94.7 / 92.8 / 87.9 / 76.5 / 161 / 86.5
6810 / NBDx / 18.7 / 18.0 / 15.6 / 97.5 / 95.0 / 93.1 / 88.3 / 77.2 / 160 / 86.5
7066 / NBDx / 17.6 / 17.0 / 14.7 / 97.2 / 94.6 / 92.7 / 87.7 / 75.3 / 154 / 86.5
7241 / NBDx / 18.1 / 17.6 / 15.6 / 97.5 / 95.4 / 93.8 / 89.3 / 78.5 / 158 / 88.8
7656 / NBDx / 18.3 / 17.6 / 15.3 / 97.4 / 94.6 / 92.6 / 87.7 / 75.9 / 166 / 86.6
7901 / NBDx / 20.7 / 20.0 / 17.3 / 97.4 / 94.9 / 93.1 / 88.7 / 79.5 / 173 / 86.8
7912 / NBDx / 18.1 / 17.5 / 15.2 / 97.2 / 94.5 / 92.6 / 87.7 / 75.7 / 160 / 86.8
10241 / NBDx / 19.6 / 19.0 / 16.4 / 97.5 / 94.9 / 93.1 / 88.5 / 77.9 / 163 / 86.3
10642 / NBDx / 23.1 / 22.3 / 19.2 / 97.2 / 94.8 / 93.2 / 89.4 / 82.2 / 176 / 86.1
13925 / NBDx / 15.4 / 15.0 / 13.4 / 96.9 / 94.5 / 92.4 / 87.0 / 70.9 / 145 / 89.4
14691 / NBDx / 16.9 / 16.4 / 14.1 / 97.4 / 94.7 / 92.7 / 87.4 / 73.5 / 148 / 86.3
16622 / NBDx / 19.0 / 18.4 / 15.9 / 97.4 / 95.1 / 93.3 / 88.6 / 77.5 / 172 / 86.3
19283 / NBDx / 14.2 / 13.8 / 12.3 / 97.0 / 94.3 / 92.0 / 85.7 / 66.2 / 138 / 89.2
21901 / NBDx / 17.7 / 17.1 / 14.7 / 97.4 / 94.7 / 92.7 / 87.7 / 75.5 / 155 / 86.1
22785 / NBDx / 20.1 / 19.4 / 16.7 / 97.6 / 95.1 / 93.3 / 89.0 / 79.9 / 173 / 86.0
23275 / NBDx / 14.8 / 14.5 / 12.9 / 97.2 / 94.7 / 92.6 / 86.8 / 69.2 / 133 / 88.9
23279 / NBDx / 14.9 / 14.5 / 12.9 / 97.3 / 94.8 / 92.7 / 86.8 / 69.7 / 142 / 88.8
25875 / NBDx / 18.7 / 18.1 / 15.7 / 97.5 / 95.1 / 93.3 / 88.5 / 77.1 / 159 / 86.3
26607 / NBDx / 17.2 / 16.7 / 14.5 / 97.3 / 94.8 / 92.8 / 87.4 / 74.5 / 150 / 86.6
27244 / NBDx / 13.8 / 13.4 / 11.9 / 97.2 / 94.2 / 91.8 / 85.5 / 64.5 / 130 / 88.9
28907 / NBDx / 17.8 / 17.3 / 15.4 / 97.4 / 95.1 / 93.1 / 88.3 / 76.6 / 159 / 88.6
29351 / NBDx / 20.4 / 19.8 / 17.0 / 97.7 / 95.3 / 93.7 / 89.4 / 80.3 / 170 / 86.3
31206 / NBDx / 18.4 / 17.8 / 15.5 / 97.1 / 94.6 / 92.7 / 87.8 / 75.9 / 157 / 86.7
Average / WES / 79 / 73.3 / 55 / 99 / 95 / 87 / 61 / 29 / 88 / 73
Stdev / WES / 15 / 18.1 / 17 / 0 / 1 / 5 / 14 / 12 / 19 / 8
Average / NBDx / 17 / 16.6 / 14 / 97 / 95 / 93 / 87 / 72 / 151 / 87
Stdev / NBDx / 2 / 2.5 / 2 / 0 / 1 / 1 / 3 / 10 / 18 / 2

Samples were run using NimbleGen SeqCap capture and HiSeq 2500 sequencing, in sets of 4 samples for WES and 20 samples for NBDx. As measured in Picard, PCR duplication rates were ~5% for WES and ~7% for NBDx (data not shown). An additional 10 samples with mutations spanning PAH and 5 GCDH samples were run from archival DBS stored at room temperature for over 10 years. While we were able to call mutations, the majority of these samples were highly degraded, required whole genome amplification, and did not have a priori Sanger data; as such, they are not included here (data not shown).
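
The Specificity column in Table S1 appears to equal Reads On-Target divided by Reads Mapped, the same ratio that the Figure S1 caption gives for %Reads On-Target. That reading is an inference from the tabulated values, not a definition stated by the authors. The sketch below recomputes the ratio for a few rows copied from the table; the function name `specificity` is illustrative.

```python
# (sample ID, panel, reads mapped [M], reads on-target [M], reported specificity [%])
# Values copied from Table S1.
ROWS = [
    ("S1", "WES", 85.8, 65.8, 76.7),
    ("S5", "WES", 37.4, 20.7, 55.3),
    ("S1", "NBDx", 17.0, 14.7, 86.5),
    ("S5", "NBDx", 9.3, 7.0, 75.4),
]


def specificity(mapped_millions, on_target_millions):
    """Percentage of mapped reads that fall on target (assumed definition)."""
    return 100.0 * on_target_millions / mapped_millions


for sample, panel, mapped, on_target, reported in ROWS:
    computed = specificity(mapped, on_target)
    print(f"{sample} {panel}: computed {computed:.1f}%, reported {reported:.1f}%")
```

Because the read counts in the table are rounded to one decimal place, the recomputed values can differ from the reported ones by a few tenths of a percentage point.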