Minimal Residual Disease Quantification Using Consensus Primers and High-Throughput IGH Sequencing Universally Predicts Post-Transplant Relapse in Chronic Lymphocytic Leukemia
Aaron C. Logan,1 Bing Zhang,2 Balasubramanian Narasimhan,3 Malek Faham,4 Victoria Carlton,4 Jianbiao Zheng,4 Martin Moorhead,4 Mark R. Krampf,1 Carol D. Jones,2 Amna N. Waqar,2 James L. Zehnder,2 and David B. Miklos1
Supplemental Methods
MRD samples
Peripheral blood mononuclear cells (PBMC) were isolated from 10 mL of whole blood by centrifugation on a Ficoll-Hypaque gradient, followed by washing with phosphate buffered saline (PBS) and cryopreservation in freezing medium consisting of 50% fetal bovine serum, 10% dimethylsulfoxide (DMSO), and 40% Roswell Park Memorial Institute (RPMI) medium. Cells were stored in liquid nitrogen vapor until used for MRD quantification. To assess MRD, samples were removed from cryopreservation, thawed rapidly in a 37°C water bath, and washed twice with PBS. A cell lysate was made immediately using Buffer AL with proteinase K from a Qiagen Blood and Tissue kit (Qiagen, Valencia, CA), and the remainder of DNA isolation proceeded according to the manufacturer’s instructions. For some samples, DNA was harvested from peripheral blood collected in EDTA tubes using the Gentra Puregene Blood kit (Qiagen) per the manufacturer’s instructions.
IGH amplification and sequencing
First stage primers for multiplex PCR were designed to amplify all known germline IGH sequences. Each IGHV segment was amplified by 3 primers, decreasing the likelihood that somatic hypermutation would prevent amplification. Primers were optimized such that each possible IGHV and IGHJ segment was amplified at a similar rate so as to minimally skew the repertoire frequency distribution during linear amplification. A given sequence may have been amplified by multiple primers; this was handled bioinformatically during IGH clonotype quantification such that one amplimer was used for quantification of each specific clonotype. This methodology led to slightly different primer designs than have been published previously for similar IGH amplification approaches.1 The numbers and positions of these primers are shown in Faham and Willis.2
A universal sequence complementary to a set of second stage PCR primers was appended at the 5’ ends of the IGHV segment primers. Similarly, the primers on the IGHJ side had a 5’ tail with a universal sequence complementary to second stage PCR primers. Second stage PCR primers additionally contained a sequencing primer site and the P5 sequence used for cluster formation in the Illumina Genome Analyzer sequencer. The primers on the IGHV side of the amplification constituted one of a set of primers, each of which had a 3’ region that annealed to the overhang sequence appended in the first reaction and further contained one of multiple 6 or 9 base pair indices that allowed for sample multiplexing on the sequencer. Each of these primers also contained a 5’ tail with the P7 sequence used for cluster formation in the Illumina Genome Analyzer sequencer.
First stage PCR was carried out using a high fidelity polymerase (AccuPrime, Life Technologies) for 16 cycles. One-hundredth of this amplification reaction was then used as the template for a second PCR using the second stage primers, which append sample indices and cluster formation sequences. The second stage PCR was carried out for 22 cycles. Different samples were pooled for sequencing in the same Illumina Genome Analyzer sequencing lane, and the pool was then purified using the QIAquick PCR purification kit (Qiagen).
Cluster formation and sequencing were carried out per the manufacturer’s protocol (Illumina, Inc., La Jolla, CA). Specifically, three sequencing reactions were performed. First, 115 bp were sequenced from the IGHJ side, which was sufficient to sequence through the CDR3 junctional sequence in the IGH J-to-V direction. The synthesized strand was then denatured and washed off. A second sequencing primer was annealed that allowed the sample index to be sequenced for 6 cycles to identify the sample. The reverse complement strand was then generated and sequenced in a third sequencing reaction per the Illumina protocol. This final sequencing read of 95 bp, obtained in the IGH V-to-J direction, provided ample sequence to map the IGHV segment accurately using germline sequences published by the international ImMunoGeneTics information system (IMGT).3
Clonotype determination
Algorithmic methods were utilized for clonotype determination. Briefly, sequence data were analyzed to determine the clonotype sequences, including mapping to germline V and J consensus sequences.3 First, the forward sequence read was used to map the J segment. After J segment identification, V segments were mapped using the reverse sequence read. The IGHV primer was mapped and the bases under this primer were excluded from further analysis of the reverse read. Thereafter, the next ~70 bases of the reverse read were mapped to the known IGHV segments. Read pairs that did not map to V segments were excluded. The next step in mapping involved identifying the frame that related the forward and reverse reads, which allowed a continuous sequence from J to V to be constructed.
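The order of the mapping steps described above can be sketched in Python as follows. This is only an illustrative sketch, not the pipeline used in the study: the function names, the ungapped mismatch-counting comparison, the `max_mismatch_fraction` tolerance, and the toy reference sequences in the usage note are all assumptions introduced here. A production pipeline would need an alignment method that tolerates insertions, deletions, and somatic hypermutation.

```python
def count_mismatches(a: str, b: str) -> int:
    """Count mismatched positions over the shared length of two sequences."""
    return sum(x != y for x, y in zip(a, b))


def best_match(query, references, max_mismatch_fraction):
    """Return the name of the reference with the fewest mismatches to the
    start of `query`, or None if no reference is within tolerance."""
    best = None
    for name in sorted(references):
        ref = references[name]
        n = min(len(query), len(ref))
        if n == 0:
            continue
        mm = count_mismatches(query[:n], ref[:n])
        if mm <= max_mismatch_fraction * n and (best is None or mm < best[1]):
            best = (name, mm)
    return best[0] if best else None


def map_read_pair(forward_read, reverse_read, j_refs, v_refs,
                  v_primer_len, v_window=70, max_mismatch_fraction=0.1):
    """Map J from the forward read, then V from the reverse read.

    Returns (v_name, j_name), or None when either segment fails to map
    (such read pairs are excluded from further analysis).
    """
    # Step 1: the forward read identifies the J segment.
    j_name = best_match(forward_read, j_refs, max_mismatch_fraction)
    if j_name is None:
        return None
    # Step 2: skip the bases under the IGHV primer, then map the next
    # ~70 bases of the reverse read against the known V segments.
    v_query = reverse_read[v_primer_len:v_primer_len + v_window]
    v_name = best_match(v_query, v_refs, max_mismatch_fraction)
    if v_name is None:
        return None
    return (v_name, j_name)
```

For example, with hypothetical toy references `j_refs = {"J4": "ACTGGTAC", "J6": "TTGACCGA"}` and `v_refs = {"V1": "GGCCTTAAGG", "V3": "CATCATCATC"}`, calling `map_read_pair("ACTGGTACGGGG", "AAAAACATCATCATCGG", j_refs, v_refs, v_primer_len=5)` returns `("V3", "J4")`, while a reverse read matching neither V reference returns `None`.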
To generate a clonotype, identification of at least two identical sequences was required. We developed an algorithm to determine whether similar sequencing reads were the result of biological differences in the initial sample or technical artifact (i.e., sequencing or PCR error). The algorithm takes into account the number of sequencing reads and the degree of sequence variation between the clonotypes in question. For example, two sequences with one base difference but present at vastly different frequencies were considered consistent with sequencing or PCR error. On the other hand, two sequences with two base differences and present at similar magnitudes were considered unlikely to arise from sequencing error. Non-functional rearrangements (generally less than 20% of all VDJ rearrangements) were included in the analysis.
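The reasoning behind this rule can be illustrated with a short Python sketch. The specific thresholds (`max_mismatches`, `min_ratio`) and the greedy most-abundant-first merging strategy are assumptions made here for illustration; the parameters of the algorithm actually used are not stated in this supplement.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Mismatch count; sequences of unequal length are treated as unrelated."""
    if len(a) != len(b):
        return max(len(a), len(b))
    return sum(x != y for x, y in zip(a, b))


def call_clonotypes(reads, max_mismatches=1, min_ratio=100, min_reads=2):
    """Group reads into clonotypes, folding likely error variants into parents.

    A sequence is treated as a technical artifact of an already accepted
    clonotype when it differs by at most `max_mismatches` bases AND the
    clonotype is at least `min_ratio` times more abundant (hypothetical
    thresholds). Clonotypes supported by fewer than `min_reads` reads
    are discarded.
    """
    counts = Counter(reads)
    # Visit sequences from most to least abundant (ties broken by sequence).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    clonotypes = {}
    for seq, n in ordered:
        parent = None
        for kept, kept_n in clonotypes.items():
            if hamming(seq, kept) <= max_mismatches and kept_n >= min_ratio * n:
                parent = kept
                break
        if parent is not None:
            clonotypes[parent] += n   # rare near-identical variant: likely error
        else:
            clonotypes[seq] = n       # distinct sequence or comparable abundance
    return {s: n for s, n in clonotypes.items() if n >= min_reads}
```

With this sketch, a variant seen 3 times that differs by one base from a sequence seen 1000 times is merged into the abundant sequence, whereas two sequences that differ by one or more bases but occur at similar read counts are kept as separate clonotypes.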
MRD quantification
To determine the absolute measure of the total leukemia-derived molecules present in the follow-up sample, we added a known quantity of reference IGH sequence into the reaction and counted the associated sequencing reads. The known quantity of reference IGH sequence was derived from a pool of plasmids containing 3 unique IGH clonotypes, quantified using standard real-time PCR (RT-PCR) methods. The resulting factor (number of molecules per sequence read) was then applied to the leukemia-associated clonal rearrangement reads to obtain an absolute measure of the total leukemia-derived molecules in the reaction. A similar calculation was performed to assess the total number of rearranged IGH molecules, or B-lineage cells, in the reaction. Finally, we calculated the total leukocytes in the reaction by measuring the total DNA in the reaction using standard PicoGreen methods and RT-PCR of β-actin DNA, assuming an average human diploid genome mass of 6.49 picograms. These metrics were combined to calculate a final MRD measurement, defined as the number of leukemia-derived molecules divided by the total leukocytes in the sample (capped at 1 million CLL clonotypes per 1 million input PBMC genomes).
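The arithmetic described above can be written out as a brief Python sketch. The function and parameter names are hypothetical, the read counts are assumed to exclude reads from the reference spike-in, and total DNA is taken as a single measured mass (the sketch does not model how the PicoGreen and β-actin measurements were combined). Only the 6.49 pg genome mass and the cap at 1 million come from the text.

```python
PG_PER_DIPLOID_GENOME = 6.49  # average human diploid genome mass (pg), from the text


def quantify_mrd(clone_reads, total_igh_reads, spike_reads,
                 spike_molecules, input_dna_pg):
    """Convert read counts to absolute molecules and an MRD level.

    clone_reads      reads matching the leukemia-associated clonotype
    total_igh_reads  all rearranged IGH reads (spike-in reads excluded)
    spike_reads      reads derived from the reference IGH spike-in
    spike_molecules  known number of spike-in molecules added
    input_dna_pg     total DNA in the reaction, in picograms
    """
    if spike_reads <= 0 or input_dna_pg <= 0:
        raise ValueError("spike_reads and input_dna_pg must be positive")
    # Conversion factor: molecules represented by each sequencing read.
    molecules_per_read = spike_molecules / spike_reads
    clone_molecules = clone_reads * molecules_per_read
    igh_molecules = total_igh_reads * molecules_per_read
    # Total leukocyte genomes in the reaction, from DNA mass.
    leukocytes = input_dna_pg / PG_PER_DIPLOID_GENOME
    # MRD per million leukocytes, capped at 1 million.
    mrd_per_million = min(clone_molecules / leukocytes * 1e6, 1e6)
    return {
        "clone_molecules": clone_molecules,
        "igh_molecules": igh_molecules,
        "leukocytes": leukocytes,
        "mrd_per_million": mrd_per_million,
    }
```

As a worked example under these assumptions, 500 spike-in molecules yielding 1000 reads gives 0.5 molecules per read; 50 clonal reads then correspond to 25 leukemia-derived molecules, and with 649,000 pg of input DNA (about 100,000 genomes) the MRD level is about 250 CLL clonotypes per million leukocytes.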
IGH allele-specific oligonucleotide PCR
An IGH V-region consensus TaqMan probe was first evaluated for predicted success based on a set of criteria including the number, type, and position of mismatches.4 If the consensus probe was predicted to be successful, allele-specific primers were designed to work with the probe, with the forward primer annealing 5’ of the probe and the reverse primer annealing in the complementarity determining region 3 (CDR3). If the consensus probe was predicted to be unsuccessful or was deemed empirically to be insensitive for a specific patient, a CDR3-specific probe and corresponding primers were designed. Q-PCR reactions were performed on an ABI 7900HT real-time PCR instrument (Applied Biosystems, Carlsbad, CA) with 500 ng of total leukocyte DNA and TaqMan Universal PCR master mix (Applied Biosystems). Human genomic DNA (Roche Diagnostics, Germany) was used as a reference GAPDH standard. The IGH data were normalized to the corresponding GAPDH quantification, and the final result was reported as the number of CLL IGH copies/μg of human DNA.
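The normalization in the final step can be sketched in Python as follows. This assumes the conventional log-linear real-time PCR standard curve (Ct = slope × log10(quantity) + intercept) for both assays, with the GAPDH curve expressed in nanograms of human genomic DNA; the supplement does not specify the curve-fitting details, so the function names and the curve parameters in the usage note are hypothetical.

```python
def quantity_from_ct(ct, slope, intercept):
    """Invert a log-linear standard curve: Ct = slope * log10(quantity) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def igh_copies_per_ug(igh_ct, gapdh_ct, igh_curve, gapdh_curve):
    """Normalize clonal IGH copies to amplifiable human DNA.

    igh_curve    (slope, intercept) of the IGH standard curve, in copies
    gapdh_curve  (slope, intercept) of the GAPDH standard curve, in ng of DNA
    """
    igh_copies = quantity_from_ct(igh_ct, *igh_curve)
    dna_ng = quantity_from_ct(gapdh_ct, *gapdh_curve)
    # Convert ng to micrograms and report copies per microgram of DNA.
    return igh_copies / (dna_ng / 1000.0)
```

For instance, with hypothetical curves `igh_curve = (-3.5, 38.0)` and `gapdh_curve = (-3.5, 30.0)`, an IGH Ct of 31 corresponds to 100 copies and a GAPDH Ct of 23 corresponds to 100 ng of DNA, giving about 1000 CLL IGH copies/μg.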
Supplemental Tables
Supplemental Table 1. Patient characteristics and outcomes
[Separate file due to landscape orientation.]
Supplemental Figures
Supplemental Figure 1. Patient outcomes. Overall survival (OS) and disease-free survival (DFS) for the patient cohort studied here are shown.
Supplemental Figure 2. MRD quantification in relapsed and non-relapsed patients. MRD quantification is shown for patients relapsing within 12 months post-HCT (A) or after 12 months post-HCT (B). MRD patterns for patients who remained free of relapse are shown in (C). The marked patient (SPN 3723) had apparent MRD progression but died from complications of chronic GVHD prior to meeting criteria for clinical relapse.
Supplemental References
1. van Dongen JJ, Langerak AW, Bruggemann M, Evans PA, Hummel M, Lavender FL et al. Design and standardization of PCR primers and protocols for detection of clonal immunoglobulin and T-cell receptor gene recombinations in suspect lymphoproliferations: report of the BIOMED-2 Concerted Action BMH4-CT98-3936. Leukemia 2003; 17(12): 2257-317.
2. Faham M, Willis TD. Monitoring health and disease status using clonotype profiles. In: USPTO ed. Vol. 2011/0207134A1. United States; 2011.
3. Giudicelli V, Chaume D, Lefranc MP. IMGT/GENE-DB: a comprehensive database for human and mouse immunoglobulin and T cell receptor genes. Nucleic Acids Res 2005; 33(Database issue): D256-61.
4. Ladetto M, Donovan JW, Harig S, Trojan A, Poor C, Schlossman R et al. Real-time polymerase chain reaction of immunoglobulin rearrangements for quantitative evaluation of minimal residual disease in multiple myeloma. Biol Blood Marrow Transplant 2000; 6(3): 241-53.