The Genomic Landscape of Balanced Cytogenetic Abnormalities Associated with Human Developmental Anomalies

Claire Redin1,2,3, Harrison Brand1,2,3, Ryan L. Collins1,3,4, Tammy Kammin5, Elyse Mitchell6, Jennelle C. Hodge6,7, Carrie Hanscom1, Vamsee Pillalamarri1, Catarina M. Seabra1,8, Mary-Alice Abbott9, Omar Abdul-Rahman10, Erika Aberg11, Rhett Adley1, Sofia Alcaraz-Estrada12, Fowzan S. Alkuraya13, Yu An1,14, Mary-Anne Anderson15, Caroline Antolik1, Kwame Anyane-Yeboa16, Joan F. Atkin17,18, Tina Bartell19, Jonathan A. Bernstein20, Elizabeth Beyer21, Ernie M.H.F. Bongers22, Eva H. Brilstra23, Chester W. Brown24,25, Hennie T. Brüggenwirth26, Bert Callewaert27, Ken Corning28, Helen Cox29, Benjamin B. Currall1,5,30, Tom Cushing31, Dezso David32, Matthew A. Deardorff33,34, Annelies Dheedene27, Marc D’hooghe35, Bert B.A. de Vries22, Dawn L. Earl36, Heather L. Ferguson5, Heather Fisher37, David R. FitzPatrick38, Pamela Gerrol5, Daniela Giachino39, Joseph T. Glessner1,2,3, Troy Gliem6, Margo Grady40, Brett H. Graham41,42, Cristin Griffis21, Karen W. Gripp43, Andrea L. Gropman44, Andrea Hanson-Kahn45, David J. Harris46,47, Mark Hayden5, Ron Hochstenbach23, Jodi D. Hoffman48, Monika W. Hubshman49, A. Micheil Innes50, Mira Irons51, Melita Irving52,53, Sandra Janssens27, Tamison Jewett54, John P. Johnson55, Marjolijn C. Jongmans22, Stephen G. Kahler56, David A. Koolen22, Peter M. Kroisel57, Yves Lacassie58, William Lawless1, Emmanuelle Lemyre59, Kathleen Leppig60,61, Alex V. Levin62, Haibo Li63, Hong Li63, Eric C. Liao64,65,66, Cynthia Lim67,68, Edward J. Lose69, Diane Lucente1, Michael J. Macera70, Poornima Manavalan1, Giorgia Mandrile39, Carlo L. Marcelis22, Tamaron Mason71, Lauren Margolin71, Diane Masser-Frye72, Michael W. McClellan73, Björn Menten27, Liya R. Mikami74,75, Emily Moe21, Shehla Mohammed76, Tarja Mononen77, Megan E. Mortenson78,79, Graciela Moya80, Aggie Nieuwint81, Zehra Ordulu5,82, Sandhya Parkash83, Susan P. Pauker84, Shahrin Pereira5, Danielle Perrin71, Katy Phelan85, Raul E. Piña Aguilar12,86, Pino J. Poddighe81, Giulia Pregno39, Salmo Raskin74, Linda Reis87, William Rhead88, Debra Rita89, Ivo Renkens23, Filip Roelens90, Jayla Ruliera15, Patrick Rump91, Samantha L.P. Schilit92, Ranad Shaheen13, Rebecca Sparkes93,94, Erica Spiegel95, Blair Stevens96, Matthew R. Stone1, Julia Tagoe97, Joseph V. Thakuria30,98, Bregje W. van Bon22, Jiddeke van de Kamp81, Ton van Essen91, Conny M. van Ravenswaaij-Arts91, Markus J. van Roosmalen23, Sarah Vergult27, Catharina M.L. Volker-Touw23, Dorothy P. Warburton99, Matthew J. Waterman1, Susan Wiley100,101, Anna Wilson1, Maria de la Concepcion A. Yerena-de Vega102, Roberto T. Zori103, Brynn Levy104, Han G. Brunner22,105, Nicole de Leeuw22, Wigard P. Kloosterman23, Erik C. Thorland6, Cynthia C. Morton3,5,106,107, James F. Gusella1,2,3,92, Michael E. Talkowski1,2,3,*

1Molecular Neurogenetics Unit and Psychiatric and Neurodevelopmental Genetics Unit, Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA;

2Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA;

3Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02141, USA;

4Program in Bioinformatics and Integrative Genomics, Division of Medical Sciences, and Departments of Neurology and Genetics, Harvard Medical School, Boston, MA 02115, USA

5Department of Obstetrics, Gynecology, and Reproductive Biology, Brigham and Women's Hospital, Boston, MA 02115, USA;

6Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN 55902, USA;

7Department of Pathology and Laboratory Medicine, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA;

8GABBA Program, University of Porto, Porto, Portugal;

9Medical Genetics, Baystate Medical Genetics, and Baystate Children’s Subspecialty Center, Springfield, MA 01199, USA;

10Department of Pediatrics, University of Mississippi Medical Center, Jackson, MS

11Maritime Medical Genetics Service, IWK Health Centre, Halifax, Nova Scotia, Canada

12Medical Genomics Division, Centro Medico Nacional 20 de Noviembre, ISSSTE, Mexico City, Mexico

13King Faisal Specialist Hospital and Research Center, MBC-03 PO BOX 3354, Riyadh 11211, Saudi Arabia;

14Institutes of Biomedical Sciences and MOE Key Laboratory of Contemporary Anthropology, Fudan University, Shanghai, China

15Center for Human Genetic Research DNA and Tissue Culture Resource, Boston, MA 02114, USA;

16Columbia University Medical Center, New York, NY 10032, USA;

17Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH 43210, USA;

18Division of Molecular and Human Genetics, Nationwide Children's Hospital, Columbus, OH 43205, USA

19Sacramento Medical Center, Department of Genetics, Sacramento, CA 95815, USA;

20Department of Pediatrics, Stanford University School of Medicine, CA, USA

21Children's Hospital of Wisconsin and Departments of Pediatrics, Medical College of Wisconsin.

22Department of Human Genetics, Radboud Institute for Molecular Life Sciences and Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen 6500 HB, the Netherlands;

23Department of Genetics, Division of Biomedical Genetics, University Medical Center Utrecht, 3508 AB Utrecht, The Netherlands;

24Department of Molecular and Human Genetics, Department of Pediatrics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA;

25Texas Children's Hospital, 6621 Fannin, Houston, TX 77030, USA;

26Department of Clinical Genetics, Erasmus University Medical Centre, 3000 CA Rotterdam, The Netherlands;

27Center for Medical Genetics, Ghent University, De Pintelaan 185, 9000 Ghent, Belgium;

28Greenwood Genetic Center, Columbia, SC, 29201, USA;

29West Midlands Regional Clinical Genetics Unit, Birmingham Women's Hospital, Edgbaston, Birmingham B15 2TG, England, UK;

30Department of Genetics, Harvard Medical School, Boston, MA, USA;

31University of New Mexico, School of Medicine, Department of Pediatrics, Division of Pediatric Genetics, Albuquerque, NM 87131, USA;

32Department of Human Genetics, Organization National Institute of Health Dr Ricardo Jorge, Lisbon, Portugal

33Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA;

34Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, PA, USA;

35Algemeen Ziekenhuis Sint-Jan, Brugge, Belgium;

36Seattle Children’s, Seattle, Washington, WA 98105, USA;

37Mount Sinai West Hospital, New York, NY 10019, USA;

38Medical Research Council Human Genetics Unit, Institute of Genetic and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK;

39Medical Genetics Unit, Department of Clinical and Biological Sciences, University of Torino, Italy

40UW Cancer Center at ProHealth Care, Waukesha, Wisconsin, USA

41Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA;

42Department of Genetics, Texas Children's Hospital, Houston, TX 77054, USA;

43Sidney Kimmel Medical School at T. Jefferson University, Philadelphia, PA, USA;

44Children's National Medical Center, N.W. Washington, D.C, USA

45Department of Pediatrics and Genetics, Stanford University School of Medicine, CA, USA

46Division of Genetics, Boston Children's Hospital, Boston, MA, USA;

47Department of Pediatrics, Harvard Medical School, Boston, MA, USA;

48Department of Pediatrics, Division of Genetics, Boston Medical Center, MA, USA;

49Schneider Medical Centre, Genetics, Israel;

50Department of Medical Genetics and Alberta Children’s Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB T3B 6A8, Canada;

51Academic Affairs, American Board of Medical Specialties, Chicago, IL 60654, USA;

52Guy's and St Thomas' NHS Foundation Trust London, UK;

53Honorary Reader, Division of Medical and Molecular Genetics, King's College London, London, UK

54Wake Forest Baptist Medical Center, Winston Salem, NC 27157, USA;

55Shodair Children's Hospital, Molecular Genetics Department, Helena, MT, USA;

56Division of Genetics and Metabolism, Arkansas Children's Hospital, AR, USA;

57Institute of Human Genetics, Medical University of Graz, Graz, Austria;

58Department of Pediatrics LSUHSC and Children's Hospital, New Orleans, LA, USA;

59CHU Sainte-Justine, 3175 chemin de la Côte-Sainte-Catherine, Montréal QC, Canada;

60Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington, USA;

61Clinical Genetics, Group Health Cooperative, Seattle, Washington, USA;

62Wills Eye Hospital, Ste. 1210, 840 Walnut Street, Philadelphia, PA, USA;

63Center for Reproduction and Genetics, The affiliated Suzhou Hospital of Nanjing Medical University, Suzhou, Jiangsu, China;

64Center for Regenerative Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA;

65Division of Plastic and Reconstructive Surgery, Massachusetts General Hospital, Boston, MA 02114, USA;

66Harvard Stem Cell Institute, Cambridge, MA 02138

67HonorHealth/Virginia G. Piper Cancer Center, Scottsdale, AZ 85258, USA;

68Arkansas Children’s Hospital, Little Rock, AR 72202, USA;

69Department of Medical Genetics, University of Alabama Hospital at Birmingham, Birmingham, AL, USA;

70New York-Presbyterian Hospital, Columbia University Medical Center, New York, USA;

71Program in Medical and Population Genetics and Genomics Platform, Broad Institute of Harvard and MIT, Cambridge, MA 02141, USA.

72Department of Genetics, Rady Children's Hospital San Diego, CA, USA;

73Department of Obstetrics & Gynaecology, Madigan Army Medical Center, Tacoma, WA 98431, USA;

74Group for Advanced Molecular Investigation, Graduate Program in Health Sciences, School of Medicine, Pontifícia Universidade Católica do Paraná, Curitiba, Paraná, Brazil;

75Centro Universitário Autônomo do Brasil (Unibrasil), Curitiba, Paraná, Brazil;

76Guy's Hospital, London

77Department of Clinical Genetics, Kuopio University Hospital, Finland;

78Wake Forest Baptist Medical Center, Winston Salem, NC 27157, USA;

79Novant Health Derrick L. Davis Cancer Center, Winston Salem, NC 27103, USA;

80GENOS Laboratory, Buenos Aires, Argentina;

81Department of Clinical Genetics, VU University Medical Center, De Boelelaan 1117, Amsterdam 1081 HV, The Netherlands;

82Harvard Medical School, Boston, MA, USA;

83Department of Pediatrics, Maritime Medical Genetics Service, IWK Health Centre, Dalhousie University, Halifax, Nova Scotia, Canada;

84Medical Genetics, Harvard Vanguard Medical Associates, 485 Arsenal St. Watertown, MA 02472, USA;

85Hayward Genetics Program, Department of Pediatrics, Tulane University School of Medicine, New Orleans, LA, USA;

86School of Medicine, Medical Sciences and Nutrition, University of Aberdeen, Aberdeen, United Kingdom

87Department of Pediatrics and Children’s Research Institute, Medical College of Wisconsin, Milwaukee, WI, 53226 USA

88Children's Hospital of Wisconsin and Departments of Pediatrics and Pathology, Medical College of Wisconsin.

89Midwest Diagnostic Pathology, Aurora Clinical Labs, Rosemont, IL, USA;

90Algemeen Ziekenhuis Delta, Roeselare, Belgium;

91University of Groningen, University Medical Center Groningen, Department of Genetics, PO Box 30.001, 9700RB Groningen, The Netherlands;

92Department of Genetics, Harvard Medical School, Boston, MA, USA;

93Department of Medical Genetics and Paediatrics, Cumming School of Medicine, University of Calgary, AB, Canada;

94Alberta Children's Hospital, Alberta Health Services, Calgary, AB, Canada;

95Columbia University Medical Center, New York, NY 10032, USA;

96University of Texas Medical School at Houston, Houston, TX, USA;

97Genetic Services, Alberta Health Services, Alberta, Canada T1J 4L5

98Division of Medical Genetics, Massachusetts General Hospital, Boston, MA 02114;

99Department of Clinical Genetics and Development, Columbia University Medical Center, New York, NY 10032, USA;

100Cincinnati Children’s Hospital Medical Center, Division of Developmental at University of Cincinnati, OH, USA;

101Cincinnati Children’s Hospital Medical Center, Division of Behavioral Pediatrics at University of Cincinnati, OH, USA;

102Laboratory of Genetics, Centro Medico Nacional 20 de Noviembre, ISSSTE, Mexico City, Mexico.

103Division of Pediatric Genetics & Metabolism, University of Florida, Florida, USA;

104Department of Pathology, Columbia University, New York, NY, USA;

105Department of Clinical Genetics, Maastricht University Medical Centre, Universiteitssingel 50, 6229 ER Maastricht, The Netherlands;

106Department of Pathology, Brigham and Women's Hospital, Boston, MA 02115, USA;

107University of Manchester, Manchester Academic Health Science Center, Manchester, UK;

*Correspondence:

Michael E. Talkowski, Ph.D.

Associate Professor of Neurology, Psychiatry, and Pathology

Center for Human Genetic Research

Massachusetts General Hospital, Harvard Medical School, Broad Institute

185 Cambridge St., Boston, MA 02114


ABSTRACT

Despite their clinical significance, characterization of balanced chromosomal abnormalities (BCAs) has largely been restricted to cytogenetic resolution. We explored the landscape of BCAs at nucleotide resolution in 273 subjects with congenital or developmental anomalies. Whole-genome sequencing revised 93% of karyotypes and revealed complexity that was cryptic to karyotyping in 21% of BCAs, highlighting the limitations of conventional cytogenetic approaches. At least 33% of BCAs resulted in gene disruption that likely contributed to the developmental phenotype, 4% were associated with pathogenic genomic imbalances, and at least 9% disrupted topologically associated domains (TADs) encompassing known syndromic loci. Remarkably, 8 subjects harbored BCA breakpoints that localized to a single TAD encompassing MEF2C, a known driver of the 5q14.3 microdeletion syndrome, resulting in altered MEF2C expression by positional effect. This study proposes that sequence-level resolution dramatically improves prediction of clinical outcomes for balanced rearrangements and provides insight into novel pathogenic mechanisms such as long-range regulatory changes in chromosome topology.

Keywords: Cytogenetics, structural variation, balanced chromosomal abnormality, congenital anomaly, intellectual disability, autism, translocation, inversion, chromothripsis, topologically associated domain (TAD), MEF2C


Balanced chromosomal abnormalities (BCAs) are a class of structural variation that involves rearrangement of the chromosome structure and results in a change in the orientation or localization of a genomic segment without a large concomitant gain or loss of DNA. This class of variation includes inversions, translocations, excisions/insertions, and more complex rearrangements consisting of combinations of such events. Cytogenetic studies of unselected newborns and control adult males estimate a prevalence of 0.2-0.5% for BCAs in the general population1-3. By contrast, an approximate five-fold increase in the prevalence of BCAs has been reported among subjects with neurodevelopmental disorders, particularly intellectual disability (1.5%)4 and autism spectrum disorder (ASD; 1.3%)5. These data suggest that BCAs represent highly penetrant mutations in a meaningful fraction of subjects with associated congenital anomalies or neurodevelopmental disorders.

Delineating the breakpoints of BCAs and the genomic regions that they disrupt has long been a fertile area of novel gene discovery in human genetic research and has greatly contributed to the annotation of the morbid map of the human genome6,7. Despite their significance in human disease, the clinical detection of this unique class of chromosomal rearrangements still relies upon conventional cytogenetic methods such as karyotyping that are limited to microscopic resolution (~3-10 Mb, depending on the chromosome banding pattern and specimen type)8. The absence of gross genomic imbalances renders BCAs invisible to the higher-resolution techniques that currently serve as first-tier diagnostic screens for many developmental anomalies of unknown etiology: chromosomal microarray (CMA), which can detect microscopic and sub-microscopic copy-number variants (CNVs), and whole-exome sequencing (WES), which surveys single nucleotide variants within coding regions. Without access to precise breakpoint localization, clinical interpretation of de novo BCAs has been limited to estimates of an untoward outcome from population cytogenetic studies based solely on the presence of a rearrangement (6.1% for de novo reciprocal translocations, 9.4% for de novo inversions)9. We have recently shown that innovations in genomic technologies can efficiently reveal BCA breakpoints at nucleotide resolution with a cost and timeframe comparable to clinical CMA or karyotyping; however, only a limited number of BCAs have been evaluated to date10-16.

In this study, we explored several fundamental but previously intractable questions regarding de novo BCAs associated with human developmental anomalies, such as the origins of their formation, the genomic properties of the sequences they disrupt, and the mechanisms by which BCAs act as dominant pathogenic mutations. We evaluated 273 subjects ascertained based on the presence of a BCA discovered by karyotyping that presented in a proband with a developmental defect. We defined the genomic sequences that were altered by the breakpoints and created a framework in which we interpreted their significance based on convergent genomic datasets. This included CNV and WES data in tens of thousands of individuals, as well as prediction of long-range regulatory effects from recent studies that have established high-resolution maps of chromosomal compartmentalization in the nucleus17,18. Our findings indicate that formation of BCAs involves a variety of mechanisms and sequence characteristics, that the end-result often reflects substantial complexity invisible to cytogenetic assessment, that BCAs directly disrupt genes likely to contribute to the developmental defect in approximately one-third of subjects, and that the developmental anomaly can be caused by long-range regulatory changes due to alterations to the chromosome structure. These results highlight the myriad genomic features of BCAs that have been largely unexplored in conventional cytogenetic research and demonstrate mechanisms by which they contribute to human developmental defects.

RESULTS

Sequencing BCAs reveals cryptic complexity

This study sequenced 273 subjects across five primary referral sites that collectively represented an international consortium of over 100 clinical investigators. Subjects harbored a BCA that was detected by karyotyping and presented with various congenital anomalies and other developmental defects. Most of the 273 subjects were surveyed using large-insert whole-genome sequencing (liWGS or ‘jumping libraries’; 83%), with the remainder of subjects being analyzed by standard short-insert WGS or targeted breakpoint sequencing (see Online Methods; Supplementary Table S1). Subjects were preferentially selected with confirmed de novo BCAs based on cytogenetic studies at the referring site or rearrangements that segregated with a phenotypic anomaly within a family (72.5% of subjects); however, inheritance information was not available for one or both parents in the remaining 27.5% of subjects. No subjects were included if a BCA was confirmed to be inherited from an unaffected parent. Subjects presented with a spectrum of clinical features: congenital anomalies ranged from organ-specific disorders to multisystem abnormalities, as well as neurodevelopmental conditions such as intellectual disability or autism spectrum disorder (ASD; Table 1). While no specific phenotypes were prioritized for inclusion (see Supplementary Figure S1), neurological defects were the most common feature in the cohort (80% of subjects when using digitalized phenotypes from the Human Phenotype Ontology [HPO]19; Table 1; Supplementary Table S2).

Breakpoints were successfully identified in 248 of the 273 cases (90.8%); all subsequent analyses were restricted to these 248 subjects (Fig. 1). This success rate was consistent with expectations: simulation of one million random breakpoints in the genome and comparison against all uniquely alignable 10 bp to 100 bp k-mers suggested that 7.6% of simulated breakpoints localized within N-masked regions or genomic segments that cannot be confidently mapped by short-read sequencing (Supplementary Fig. S2). Sequencing identified 874 breakpoints genome-wide and revised the breakpoint localization by at least one sub-band in 93% of subjects when compared to the karyotype interpretation (Fig. 1a; breakpoint positions provided in Supplementary Table S3). Across all rearrangements, 26% of BCAs were revealed to be complex (i.e., involved three or more breakpoints), including 5.3% that were consistent with the phenomena of chromothripsis or chromoplexy that we and others have previously defined in cancer genomes and the human germline (complex reassembly of the chromosomes involving extensive shattering and random ligation of fragments from one or more chromosomes)20-24. The most complex BCA involved 57 breakpoints (Supplementary Fig. S3). When analyses were restricted to the 230 subjects for which the karyotype suggested a simple chromosomal exchange, 48 (21%) were revealed to be rearrangements with complexity that was cryptic to the karyotype, emphasizing the insights that are gained from nucleotide resolution. Across all BCAs, 81% resolved to less than ten kilobases of total genomic imbalance, although several cases harbored large cryptic imbalances (mostly deletions) of varied impact (Fig. 1b, Supplementary Table S4). Importantly, 8.8% of BCAs displayed an overall genomic imbalance greater than 1 Mb and 12.2% had imbalances of >100 kb in this study, representing a significantly lower fraction than previous cytogenetic estimates25. The overall genomic imbalance associated with a BCA was larger among cases without CMA pre-screening: 13.6% and 18.2% of these subjects harbored imbalances greater than 1 Mb and 100 kb, respectively (Fig. 1b, Supplementary Table S4). The overall genomic imbalance also generally increased with the number of breakpoints, although a few chromothripsis and chromoplexy events were practically balanced (e.g., subject NIJ19 involved 13 junctions across five chromosomes that resolved to a final genomic imbalance of only 631 bases).
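
The logic of the breakpoint-mappability simulation can be illustrated with a minimal sketch. It is not the pipeline used in this study: the two-chromosome genome, the masked intervals, and the names `TOY_LENGTHS`, `TOY_MASKED`, and `fraction_unmappable` are hypothetical, and the toy intervals were deliberately chosen to cover 7.6% of the toy genome, so the estimate mirrors the reported figure only by construction. A real analysis would draw one million breakpoints against reference chromosome lengths and k-mer alignability tracks.

```python
import bisect
import random

# Hypothetical toy genome (bp) and masked intervals; NOT real reference data.
TOY_LENGTHS = {"chrA": 1_000_000, "chrB": 500_000}
# Sorted, non-overlapping, half-open (start, end) intervals per chromosome.
TOY_MASKED = {
    "chrA": [(0, 50_000), (400_000, 450_000)],
    "chrB": [(100_000, 114_000)],
}


def fraction_unmappable(chrom_lengths, masked, n_sim=100_000, seed=0):
    """Estimate the fraction of uniformly random breakpoints in masked intervals."""
    rng = random.Random(seed)
    chroms = list(chrom_lengths)
    # Weight chromosomes by length so every base is equally likely to be drawn.
    weights = [chrom_lengths[c] for c in chroms]
    starts = {c: [s for s, _ in masked.get(c, [])] for c in chroms}
    hits = 0
    for _ in range(n_sim):
        chrom = rng.choices(chroms, weights=weights)[0]
        pos = rng.randrange(chrom_lengths[chrom])
        # Find the last interval starting at or before pos, then test its end.
        i = bisect.bisect_right(starts[chrom], pos) - 1
        if i >= 0 and pos < masked[chrom][i][1]:
            hits += 1
    return hits / n_sim


if __name__ == "__main__":
    print(f"{fraction_unmappable(TOY_LENGTHS, TOY_MASKED):.3f}")
```

Under this framing, the expected breakpoint-detection rate is roughly one minus the unmappable fraction, which is the comparison drawn above between the 90.8% success rate and the 7.6% of simulated breakpoints falling in unmappable sequence.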

BCA formation is mediated by multiple molecular mechanisms

Extensive mechanistic studies have been performed on breakpoints of large CNV datasets; however, the limited scale and resolution of BCA studies have precluded similar analyses for balanced rearrangements. Using precise junction sequences from 661 breakpoints, we found that nearly half displayed signatures of blunt-end ligation (45%), presumably driven by non-homologous end joining (NHEJ) (Fig. 1c)7. A substantial fraction (29%) involved microhomology of 2-15 bp at the junction point (median: 3-bp microhomology), indicating that template-switching coupled to DNA-replication mechanisms such as microhomology-mediated break-induced replication (MMBIR) contributes to a substantial fraction of BCAs26. A comparable fraction (25%) of junctions harbored micro-insertions of several base pairs (1 to 375 inserted bases, median: 6 bp), consistent with NHEJ or fork stalling and template switching (FoSTeS) mechanisms (Fig. 1c). Finally, only nine junctions (1%) contained long stretches of homologous sequence (>100 bp) that would be consistent with homology-mediated repair. It is, however, important to note that this is almost certainly an underestimate given the limitations of short-read sequencing to capture rearrangements localized within highly homologous sequences such as segmental duplications or microsatellites. BCA breakpoints harbored significantly different signatures when compared to CNV breakpoints, particularly deletions, suggesting that they arise from distinct mechanisms. BCA breakpoints were enriched for blunt-end signatures and depleted for microhomology and long homologous sequences (Supplementary Fig. S4), consistent with a greater contribution of NHEJ.
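
The junction categories described above can be expressed as a simple decision rule. The following is a minimal sketch, not the classification code used in this study: the function names, the order in which the rules are applied, the treatment of 0-1 bp of microhomology as blunt-ended, and the "unclassified" bucket for values outside the stated ranges are all assumptions made for illustration. Only the thresholds (2-15 bp microhomology, at least 1 inserted base, >100 bp homology) come from the text.

```python
from collections import Counter


def classify_junction(microhomology_bp=0, inserted_bp=0, homology_bp=0):
    """Assign one junction to a signature class (rule order is an assumption)."""
    if homology_bp > 100:
        return "long_homology"    # consistent with homology-mediated repair
    if inserted_bp >= 1:
        return "micro_insertion"  # consistent with NHEJ or FoSTeS
    if 2 <= microhomology_bp <= 15:
        return "microhomology"    # consistent with MMBIR-like mechanisms
    if microhomology_bp <= 1:
        return "blunt_end"        # presumably NHEJ; 0-1 bp cutoff is assumed
    return "unclassified"         # e.g. 16-100 bp, not specified in the text


def signature_fractions(junctions):
    """Return the fraction of junctions in each class for a list of dicts."""
    counts = Counter(classify_junction(**j) for j in junctions)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical junctions for illustration only.
    toy = [
        {},
        {"microhomology_bp": 3},
        {"inserted_bp": 6},
        {"homology_bp": 150},
    ]
    print(signature_fractions(toy))
```

Applying such a rule to every sequenced junction and tabulating the class fractions yields the kind of breakdown reported in Fig. 1c.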