Supplementary Fig. S1. A long PCR scheme for retrieving the missing sequence corresponding to a breakpoint in a presumed misassembled region at the end of a pseudomolecule.

Supplementary Fig. S2. Diversity along soybean chromosomes 1 to 20. The horizontal axes are Mbp coordinates along the Williams 82 reference genome, and the approximate centromere positions proposed by Schmutz et al. (2010) are denoted by thick arrows. Shown for each chromosome are the relationship between physical and genetic positions (cM, black line) and the corresponding recombination rates (cM/Mb, red line), calculated from 100-kb sliding windows for the genomic regions between adjacent markers (top panel); the numbers of genes per 100 kb (black line) and of transposable elements (TE) per 100 kb (middle panel); and the numbers of single nucleotide polymorphism (SNP) sites per 100 kb (left y axis) and the percentages of SNPs shared by Hwangkeum relative to the number of SNPs in IT182932 (bottom panel). In the top panel, regions that are discrepant between the genetic and sequence-based physical maps are denoted by discontinuities. In the bottom panel, green lines represent Williams 82K, blue lines IT182932, red lines Hwangkeum, and black lines the percentage of shared SNPs.

Supplementary Fig. S3. Single nucleotide divergence among soybean varieties. The approximately 3.2 million single nucleotide differences among three soybean accessions, identified by aligning short reads against the Williams 82 reference genome sequence, were classified into shared and unique variations. The overlapping regions represent variations shared between two accessions or among all three.

Supplementary Table S2. Distribution of diversity or polymorphism information content (PIC) values from genotyping 258 indel markers on a diverse set of 12 soybean varieties

PIC value / Number of indel markers
0.1531 / 138
0.2779 / 27
0.2918 / 5
0.3750 / 20
0.3775 / 3
0.4029 / 4
0.4449 / 22
0.4862 / 16
0.4866 / 5
0.5000 / 6
0.5418 / 4
0.5695 / 1
0.5696 / 2
0.6112 / 2
0.6528 / 1
0.7085 / 1
0.7502 / 1
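
The source does not state how the PIC values were computed. Several tabulated values (for example 0.3750 and 0.5000) agree with the simplified diversity index 1 − Σ p_i², where p_i is the frequency of the i-th allele among the 12 varieties, so the sketch below assumes that formula. The function name `pic` and the example allele calls are hypothetical.

```python
from collections import Counter

def pic(alleles):
    """Diversity index 1 - sum(p_i^2) over allele frequencies (assumed formula)."""
    counts = Counter(alleles)
    n = len(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

# Hypothetical calls for one indel marker across 12 varieties:
# 3 varieties carry allele "A" and 9 carry allele "B".
calls = ["A"] * 3 + ["B"] * 9
print(round(pic(calls), 4))  # 0.375
```

Under this assumption, a two-allele marker reaches its maximum of 0.5 when both alleles are equally frequent; higher values in the table would require three or more alleles.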

Supplementary Table S3. List of indel markers in marker intervals corresponding to putative introgression regions in Hwangkeum (Glycine max) from G. soja, where more than half of the indel markers polymorphic between Williams 82 (G. max) and IT182932 (G. soja) were monomorphic between Hwangkeum and IT182932

Indel marker in marker-sparse interval / Genome sequence position / Polymorphism or monomorphism between Hwangkeum and IT182932 / Forward primer / Reverse primer / Allele size in Williams 82
GMES0924 to Satt631 (Gm03: 1310955..2916475)
GSINDEL19600 / 1516234 / Polymorphism / GAATTGTATTCTGAAACAGC / CACTCAAATCCTACGTTTAC / 161
GSINDEL19661 / 1684062 / Monomorphism / TTTACTATGGCATGATTTCT / TTCACCTAGGTAATTTTGAA / 143
GSINDEL19680 / 1742470 / Monomorphism / TTACATTGTTCAATCCTACC / TTTTCTTCTTGCCTTTAGTA / 152
GSINDEL19728 / 1908133 / Monomorphism / GGATTTTTCAATTGATTTTA / TCATCTCTCTCCTAACAGAA / 115
GSINDEL19743 / 1942952 / Monomorphism / TGCATTCCAATACTATTACC / AGTGATTTATGCTTTTTCAC / 116
GSINDEL19792 / 2045321 / Polymorphism / GTACTTCCATTAAAACATGC / ATGCTTTTGTTGTTGATTAT / 243
GSINDEL19902 / 2324329 / Monomorphism / GTCCTCTGAACAATAAACTG / TACACCGATTCCTTTAAATA / 216
GSINDEL19990 / 2558325 / Polymorphism / AATGGTTCACAAAACTTAGA / TATCACAGAAGAAGAGGCTA / 248
Satt316 to Sca-364a (Gm06: 47485161..48661496)
GSINDEL56203 / 47572099 / Monomorphism / AAATAAGCAATAGGCACTAA / GTTTTTAATTATGAGGCAAA / 130
GSINDEL56300 / 47798156 / Monomorphism / ATACGTGGCAATAGTATGAT / CAAGATTTTGAGTTAAGGTG / 122
GSINDEL56311 / 47820269 / Monomorphism / TGAGAAATCGTTTATTTCAT / CTTTGTTTTTCTTAAGGTGA / 179
GSINDEL56321 / 47837905 / Polymorphism / CTTGTTTTTGTTGATTCTTC / TTAACCTATTTTCTGTCCAA / 200
GSINDEL56371 / 47956076 / Monomorphism / TATTTCTTTTGAAACAGACC / TACTCTTCCCTTTGTCATTA / 180
GSINDEL56385 / 47986268 / Monomorphism / GAATAAGAAAGAGAGGAAGC / TAGGGGAAAATGAAGACTA / 177
GSINDEL56400 / 48014768 / Monomorphism / ATACATTTCATTTCATCCAG / AAGTTTCACGTCAGTTAAAA / 160
GSINDEL56414 / 48074822 / Monomorphism / TGACAACTAAAATGACAATAAA / CATTTGACATTGCTATTATG / 132
GSINDEL56439 / 48178039 / Monomorphism / GATCCAACTTACCATAATCA / TCAAAAATAAAATGGAGTGT / 198
GSINDEL56531 / 48415540 / Polymorphism / TCTTCAATTCCGAATACTAA / ATATATCAACGAAATGCTTC / 161
GMES1600 to GMES6736 (Gm09: 39751797..41849066)
GSINDEL84893 / 40031613 / Polymorphism / CAATTTTTAAACAAGCTCAA / AGTCTTTTCATGTTATGCAC / 195
GSINDEL85007 / 40364072 / Monomorphism / ACCAGCAACACATTATTTAT / TGCTGAACTGTCTTCTACTT / 210
GSINDEL85023 / 40412092 / Monomorphism / GTAACACGACACAAACTTCT / GAACAAAATGAAAATATGCT / 163
GSINDEL85030 / 40419277 / Monomorphism / GAATGAATGAATGTTTGTTT / GGTAGTGAATTACAACCAAG / 130
GSINDEL85048 / 40498520 / Polymorphism / GTAAGGACTAAGGATAAAGC / CTTTCAAGCTGGATTTGAC / 204
GSINDEL85121 / 40623731 / Monomorphism / ACTGTGTTGTTAGCATTTTT / CCAACTCGTCAACTCTATT / 187
GSINDEL85147 / 40673473 / Monomorphism / AAAGAGTTGCATTACAAGAG / CTTCCCTTTTCTTCTTTTAT / 134
GSINDEL85155 / 40694764 / Monomorphism / TGCCATATCTTATCTTTTGT / GGACTGTGTACTTGATAGGA / 141
GSINDEL85193 / 40763413 / Monomorphism / GACTCTTCTTCTGTCTCCTC / TTTTAATTGGGTGAGAGTAA / 169
GSINDEL85269 / 41096510 / Monomorphism / TATGCTCGTACTGAGATTTT / GAGAGTGATCCATTCAAAG / 232
GSINDEL85271 / 41103592 / Monomorphism / ACTCAGGAGATTCTTGAAAT / GCTAGTCAATTGGAAACAT / 134
GSINDEL85272 / 41109897 / Monomorphism / TATACACCGAGCTTAATAGG / AAGACCTTCAGTACAGTTCA / 137
GSINDEL85282 / 41140003 / Monomorphism / ATTTACCATGAGCAGATTTA / CTTGGTCCAATCTTAGTGT / 232
GSINDEL85324 / 41280723 / Monomorphism / GTGCCACTTATGTGAGATAC / AAAAACTTTGATATTGTGGA / 130
GSINDEL85332 / 41294386 / Polymorphism / TATACACAAAGTTGCACAAA / ATGTCACTCAAAATAGATGC / 179
GSINDEL85333 / 41295818 / Monomorphism / GAGGGGATATCTGTGTATCT / CTTCACTTGGTGATAGAGAA / 244
GSINDEL85335 / 41297158 / Polymorphism / ACGTGAAAAGTGTCTCTAAA / TTCATCTTCTCCTTTTCATA / 216
GSINDEL85359 / 41384266 / Monomorphism / ATTGAAGAGTCCTCTACCTC / GTAGCTAGCATTTCAAGAAG / 183
GSINDEL85435 / 41537237 / Polymorphism / CCCATGACTCTTATCTCATA / GATACTTGGGAAGAGAAAGT / 229
A203.p1 to Sca_189b (Gm15: 7469277..8817679)
GSINDEL136839 / 7777973 / Monomorphism / AAAAGAGTGCATAATGATTT / ATTTCCAAGATTTTTCTTTT / 117
GSINDEL136934 / 8075682 / Polymorphism / TCTCAAAATAAAAATGGAAG / TTATCAAATAACAAGGGAAT / 131
GSINDEL136982 / 8167899 / Monomorphism / ACAAATCCAGCAAACTATA / CTTAGGAAATTCATTTGATG / 156
GSINDEL137010 / 8288840 / Polymorphism / CTTTGCAAAATAAGTTTAGG / CTTTTTCTCTCAATTTTTCA / 195
GSINDEL137069 / 8507084 / Monomorphism / TGGAATTTTCTGAAATAAAG / TAATCTCAAGAGGAGATGAA / 174
GSINDEL137071 / 8509641 / Monomorphism / TTTAGATAACCTTCCTCACA / TTCACAGTAGGTTAGACGTT / 171
GSINDEL137103 / 8609958 / Polymorphism / CATAAGGGAGGGTAATACTT / TTAATTGATCCATGTTCATC / 133
GSINDEL137141 / 8721390 / Monomorphism / TTGGTGGTATCACTAACTTT / ATTTAGGCTTAGGGTCTAAC / 141
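
The "more than half monomorphic" criterion in the title of Supplementary Table S3 can be checked directly against the rows above. The sketch below is illustrative only: the P/M strings are transcribed from the table in row order, and the helper name `monomorphic_fraction` is hypothetical.

```python
# Polymorphism (P) / monomorphism (M) calls between Hwangkeum and IT182932,
# transcribed in table order for each marker interval.
calls = {
    "Gm03": "PMMMMPMP",
    "Gm06": "MMM" + "P" + "M" * 5 + "P",
    "Gm09": "P" + "M" * 3 + "P" + "M" * 9 + "PMPMP",
    "Gm15": "MPMPMMPM",
}

def monomorphic_fraction(status):
    """Fraction of markers in an interval scored as monomorphic."""
    return status.count("M") / len(status)

for interval, status in calls.items():
    frac = monomorphic_fraction(status)
    # Intervals with a fraction above 0.5 meet the criterion in the table title.
    print(f"{interval}: {status.count('M')}/{len(status)} monomorphic "
          f"({frac:.2f}), above half: {frac > 0.5}")
```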

Supplementary Table S4. List of primer sets for long PCR amplification of breakpoints of discrepant segments between the current genetic map and the Williams 82 genome sequence assembly (Glyma1), and GenBank accession numbers of the retrieved sequences

Discrepant segment / Primer set / Forward primer sequence / Reverse primer sequence / GenBank accession number
Chr 5 / Chr5A1 / gcaacgtttgtcttcgttca / gttaatctcgccggaaaattg / JQ924190
Chr 11 / JQ924191
BE020413-containing 5’ end / BE020413 / AGTTAAGATATGTTGCTTGG / AGTGTTTGTTGTATGGTTGT
3’ end / Chr11B1 (primary PCR) / GGCCACTTCTGGAATCGTAA / GCCCCACTGGAAGTATTTGA
Chr11B1-2 (nested PCR) / GACTCGGTGACACCATAAGT / GTGAATTGTGTACGGGTTTT
Chr 14 / Chr14B2 / GAACATATATGGGGTGCATGA / CATTCTACGCTAGAAGCTGAA / JQ924192
Insertion site of unplaced scaffold_41 on Chr 17
Upper / Scaffold41-be / ATATGCCACCCAAATAAAAA / GTTTGGGTGAAAAACAAGAG / JQ924193
Lower / Scaffold41-end / CCAGACAAAAGAGAAAGTGG / GGAAGGACAAGGGTTATTTT / JQ924194
Chr 19 / Chr19L / GAAGGATACAAGTGAAAAAGTACAA / GATGTAGACAACATATCCCCTTC / JQ924195

Supplementary Table S5. Summary of mapping by chromosome

Chromosome number / Number of markers / Distance (cM) / cM/marker / Physical length (Mb) / kb/marker / Recombination rate (cM/Mb)
1 / 128 / 107.9 / 0.8 / 55.9 / 436.7 / 1.9
2 / 113 / 138.6 / 1.2 / 51.7 / 457.5 / 2.7
3 / 67 / 113.2 / 1.7 / 47.8 / 713.4 / 2.4
4 / 59 / 105.7 / 1.8 / 49.2 / 833.9 / 2.1
5 / 79 / 110.0 / 1.4 / 41.9 / 530.4 / 2.6
6 / 74 / 125.1 / 1.7 / 50.7 / 685.1 / 2.5
7 / 64 / 114.9 / 1.8 / 44.6 / 696.9 / 2.6
8 / 93 / 155.3 / 1.7 / 47.0 / 505.4 / 3.3
9 / 73 / 109.4 / 1.5 / 46.8 / 641.1 / 2.3
10 / 80 / 142.6 / 1.8 / 51.0 / 637.5 / 2.8
11 / 76 / 133.0 / 1.8 / 39.2 / 515.8 / 3.4
12 / 72 / 106.1 / 1.5 / 40.1 / 556.9 / 2.6
13 / 110 / 115.3 / 1.0 / 44.4 / 403.6 / 2.6
14 / 76 / 113.4 / 1.5 / 49.7 / 653.9 / 2.3
15 / 78 / 124.9 / 1.6 / 50.9 / 652.6 / 2.5
16 / 65 / 91.9 / 1.4 / 37.4 / 575.4 / 2.5
17 / 64 / 119.2 / 1.9 / 41.9 / 654.7 / 2.8
18 / 73 / 116.8 / 1.6 / 62.3 / 853.4 / 1.9
19 / 75 / 112.0 / 1.5 / 50.6 / 674.7 / 2.2
20 / 62 / 105.9 / 1.7 / 46.8 / 754.8 / 2.3
Total/average / 1581 / 2361.2 / 1.5 / 950.0 / 600.9 / 2.5
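
The three derived columns of Supplementary Table S5 appear to follow from the marker count, map distance, and physical length of each chromosome. A minimal sketch, assuming simple ratios with 1 Mb = 1000 kb; the function name and dictionary keys are hypothetical.

```python
def per_chromosome_summary(n_markers, map_cm, physical_mb):
    """Derive marker density and recombination rate for one chromosome."""
    return {
        "cM_per_marker": map_cm / n_markers,
        "kb_per_marker": physical_mb * 1000 / n_markers,
        "cM_per_Mb": map_cm / physical_mb,
    }

# Chromosome 1 row: 128 markers, 107.9 cM, 55.9 Mb.
row = per_chromosome_summary(128, 107.9, 55.9)
print({name: round(value, 1) for name, value in row.items()})
# {'cM_per_marker': 0.8, 'kb_per_marker': 436.7, 'cM_per_Mb': 1.9}
```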

Supplementary Table S6. Comparison of recombination rates of plants with sequenced genomes

Species name (common name) / Predicted genome size (Mb) / Number of chromosomes / Sequenced genome size (Mb) / Transposable element (TE) content (%) / Sequenced genome size excluding TE (Mb) / Genetic map length (cM) / Recombination rate (cM/Mb)a / Adjusted recombination rate (cM/Mb)b / Referencesd
Arabidopsis thaliana / 125 / 5 / 119 / 16 / 102 / 597 / 5.0 / 5.9 / 1, 2, 3
Arabidopsis lyrata / 230 / 8 / 207 / 23 / 159 / 515 / 2.5 / 3.2 / 2, 4, 5
Fragaria vesca (strawberry) / 240 / 7 / 210 / 22 / 164 / 559 / 2.7 / 3.4 / 6
Brachypodium distachyon / 355 / 5 / 272 / 26 / 201 / 1598 / 5.9 / 8 / 7, 8
Brassica rapa (pakchoi) / 485 / 10 / 284 / 40 / 170 / 1123 / 4.0 / 6.6 / 9, 10
Medicago truncatula / 454 / 8 / 375 / 30 / 263 / 567 / 1.5 / 2.2 / 11, 12
Oryza sativa (rice) / 405 / 12 / 389 / 35 / 264 / 1530 / 3.9 / 5.8 / 13, 14
Solanum tuberosum (potato) / 844 / 12 / 727 / 62 / 276 / 762c / 1.1 / 2.8 / 15, 16
Sorghum bicolor (sorghum) / 748 / 10 / 730 / 62 / 277 / 1059 / 1.5 / 3.8 / 17, 18
Glycine max (soybean) / 1115 / 20 / 937 / 57 / 402 / 2361 / 2.5 / 5.9 / 19, this study
Zea mays (maize) / 2300 / 10 / 2300 / 85 / 345 / 2349 / 1.0 / 6.8 / 20, 21

a Calculated by dividing the map length by the sequenced genome size. Because the sequenced genomes of rice and Arabidopsis showed that flow cytometry tends to overestimate genome size [Arabidopsis Genome Initiative (2000); International Rice Genome Sequencing Project (2005)], we presume that the sequenced genome sizes are more accurate than the predicted genome sizes.

b Calculated by dividing the map length by the sequenced genome size excluding TE.

c Average of the genetic lengths of 751 cM for the maternal map and 773 cM for the paternal map.
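
The calculations described in footnotes a and b can be illustrated with the soybean row of Supplementary Table S6. This is a minimal sketch of the two ratios, not the authors' script; the function and argument names are hypothetical.

```python
def recombination_rates(map_cm, sequenced_mb, sequenced_non_te_mb):
    """Return (rate, adjusted rate) in cM/Mb.

    rate     = map length / sequenced genome size (footnote a)
    adjusted = map length / sequenced genome size excluding TE (footnote b)
    """
    rate = map_cm / sequenced_mb
    adjusted = map_cm / sequenced_non_te_mb
    return rate, adjusted

# Glycine max row: 2361 cM, 937 Mb sequenced, 402 Mb excluding TE.
rate, adjusted = recombination_rates(2361, 937, 402)
print(round(rate, 1), round(adjusted, 1))  # 2.5 5.9
```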

d References

1. Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408:796–815

2. Hollister JD, Smith LM, Guo YL, Ott F, Weigel D, Gaut BS (2011) Transposable elements and small RNAs contribute to gene expression divergence between Arabidopsis thaliana and Arabidopsis lyrata. Proc Natl Acad Sci USA 108:2322-2327

3. Lister C, Dean C (1993) Recombinant inbred lines for mapping RFLP and phenotypic markers in Arabidopsis thaliana. Plant J 4:745–750

4. Hu TT, Pattyn P, Bakker EG, Cao J, Cheng JF, Clark RM, Fahlgren N, Fawcett JA, Grimwood J, Gundlach H, Haberer G, Hollister JD, Ossowski S, Otillar RP, Salamov AA, Schneeberger K, Spannagl M, Wang X, Yang L, Nasrallah ME, Bergelson J, Carrington JC, Gaut BS, Schmutz J, Mayer KFX, Van de Peer Y, Grigoriev IV, Nordborg M, Weigel D, Guo Y-L (2011) The Arabidopsis lyrata genome sequence and the basis of rapid genome size change. Nat Genet 43:476-481

5. Kuittinen H, de Haan AA, Vogl C, Oikarinen S, Leppälä J, Koch M, Mitchell-Olds T, Langley CH, Savolainen O (2004) Comparing the linkage maps of the close relatives Arabidopsis lyrata and A. thaliana. Genetics 168:1575-1584

6. Shulaev V, Sargent DJ, Crowhurst RN, Mockler TC, Folkerts O, Delcher AL, Jaiswal P, Mockaitis K, Liston A, Mane SP, Burns P, Davis TM, Slovin JP, Bassil N, Hellens RP, Evans C, Harkins T, Kodira C, Desany B, Crasta OR, Jensen RV, Allan AC, Michael TP, Setubal JC, Celton J-M, Rees DJG, Williams KP, Holt SH, Rojas JJR, Chatterjee M, Liu B, Silva H, Meisel L, Adato A, Filichkin SA, Troggio M, Viola R, Ashman TL, Wang H, Dharmawardhana P, Elser J, Raja R, Priest HD, Bryant DW Jr, Fox SE, Givan SA, Wilhelm LJ, Naithani S, Christoffels A, Salama DY, Carter J, Lopez Girona E, Zdepski A, Wang W, Kerstetter RA, Schwab W, Korban SS, Davik J, Monfort A, Denoyes-Rothan B, Arus P, Mittler R, Flinn B, Aharoni A, Bennetzen JL, Salzberg SL, Dickerman AW, Velasco R, Borodovsky M, Veilleux RE, Folta KM (2011) The genome of woodland strawberry (Fragaria vesca). Nat Genet 43:109-116

7. Huo N, Garvin DF, You FM, McMahon S, Luo MC, Gu YQ, Lazo GR, Vogel JP (2011) Comparison of a high-density genetic linkage map to genome features in the model grass Brachypodium distachyon. Theor Appl Genet 123:455-464

8. International Brachypodium Initiative (2010) Genome sequencing and analysis of the model grass Brachypodium distachyon. Nature 463:763–768

9. Kim H, Choi SR, Bae J, Hong CP, Lee SY, Hossain MJ, Van Nguyen D, Jin M, Park BS, Bang JW, Bancroft I, Lim YP (2009) Sequenced BAC anchored reference genetic map that reconciles the ten individual chromosomes of Brassica rapa. BMC Genomics 10:432

10. The Brassica rapa Genome Sequencing Project Consortium (2011) The genome of the mesopolyploid crop species Brassica rapa. Nat Genet 43:1035-1039

11. Mun JH, Kim DJ, Choi HK, Gish J, Debellé F, Mudge J, Denny R, Endré G, Saurat O, Dudez AM, Kiss GB, Roe B, Young ND, Cook DR (2006) Distribution of microsatellites in the genome of Medicago truncatula: A resource of genetic markers that integrate genetic and physical maps. Genetics 172:2541-2555

12. Young ND, Debellé F, Oldroyd GE, Geurts R, Cannon SB, Udvardi MK, Benedito VA, Mayer KF, Gouzy J, Schoof H, Van de Peer Y, Proost S, Cook DR, Meyers BC, Spannagl M, Cheung F, De Mita S, Krishnakumar V, Gundlach H, Zhou S, Mudge J, Bharti AK, Murray JD, Naoumkina MA, Rosen B, Silverstein KA, Tang H, Rombauts S, Zhao PX, Zhou P, Barbe V, Bardou P, Bechner M, Bellec A, Berger A, Bergès H, Bidwell S, Bisseling T, Choisne N, Couloux A, Denny R, Deshpande S, Dai X, Doyle JJ, Dudez AM, Farmer AD, Fouteau S, Franken C, Gibelin C, Gish J, Goldstein S, González AJ, Green PJ, Hallab A, Hartog M, Hua A, Humphray SJ, Jeong DH, Jing Y, Jöcker A, Kenton SM, Kim DJ, Klee K, Lai H, Lang C, Lin S, Macmil SL, Magdelenat G, Matthews L, McCorrison J, Monaghan EL, Mun JH, Najar FZ, Nicholson C, Noirot C, O'Bleness M, Paule CR, Poulain J, Prion F, Qin B, Qu C, Retzel EF, Riddle C, Sallet E, Samain S, Samson N, Sanders I, Saurat O, Scarpelli C, Schiex T, Segurens B, Severin AJ, Sherrier DJ, Shi R, Sims S, Singer SR, Sinharoy S, Sterck L, Viollet A, Wang BB, Wang K, Wang M, Wang X, Warfsmann J, Weissenbach J, White DD, White JD, Wiley GB, Wincker P, Xing Y, Yang L, Yao Z, Ying F, Zhai J, Zhou L, Zuber A, Dénarié J, Dixon RA, May GD, Schwartz DC, Rogers J, Quétier F, Town CD, Roe BA (2011) The Medicago genome provides insight into the evolution of rhizobial symbioses. Nature 480:520-524

13. Harushima Y, Yano M, Shomura A, Sato M, Shimano T, Kuboki Y, Yamamoto T, Lin SY, Antonio BA, Parco A, Kajiya H, Huang N, Yamamoto K, Nagamura Y, Kurata N, Khush GS, Sasaki T (1998) A high-density rice genetic linkage map with 2275 markers using a single F2 population. Genetics 148(1):479-494

14. International Rice Genome Sequencing Project (2005) The map-based sequence of the rice genome. Nature 436:793–800

15. The Potato Genome Sequencing Consortium (2011) Genome sequence and analysis of the tuber crop potato. Nature 474:189-195

16. van Os H, Andrzejewski S, Bakker E, Barrena I, Bryan GJ, Caromel B, Ghareeb B, Isidore E, de Jong W, van Koert P, Lefebvre V, Milbourne D, Ritter E, van der Voort JN, Rousselle-Bourgeois F, van Vliet J, Waugh R, Visser RG, Bakker J, van Eck HJ (2006) Construction of a 10,000-marker ultradense genetic recombination map of potato: providing a framework for accelerated gene isolation and a genomewide physical map. Genetics 173:1075-1087

17. Bowers JE, Abbey C, Anderson S, Chang C, Draye X, Hoppe AH, Jessup R, Lemke C, Lennington J, Li Z, Lin Y, Liu S, Luo L, Marler BS, Ming R, Mitchell SE, Qiang D, Reischmann K, Schulze SR, Skinner DN, Wang Y, Kresovich S, Schertz KF, Paterson AH (2003) A high-density genetic recombination map of sequence-tagged sites for Sorghum, as a framework for comparative structural and evolutionary genomics of tropical grains and grasses. Genetics 165:367–386

18. Paterson AH, Bowers JE, Bruggmann R, Dubchak I, Grimwood J, Gundlach H, Haberer G, Hellsten U, Mitros T, Poliakov A, Schmutz J, Spannagl M, Tang H, Wang X, Wicker T, Bharti AK, Chapman J, Feltus FA, Gowik U, Grigoriev IV, Lyons E, Maher CA, Martis M, Narechania A, Otillar RP, Penning BW, Salamov AA, Wang Y, Zhang L, Carpita NC, Freeling M, Gingle AR, Hash CT, Keller B, Klein P, Kresovich S, McCann MC, Ming R, Peterson DG, Mehboob Ur R, Ware D, Westhoff P, Mayer KFX, Messing J, Rokhsar DS (2009) The Sorghum bicolor genome and the diversification of grasses. Nature 457:551–556

19. Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, Xu D, Hellsten U, May GD, Yu Y, Sakurai T, Umezawa T, Bhattacharyya MK, Sandhu D, Valliyodan B, Lindquist E, Peto M, Grant D, Shu S, Goodstein D, Barry K, Futrell-Griggs M, Abernathy B, Du J, Tian Z, Zhu L, Gill N, Joshi T, Libault M, Sethuraman A, Zhang X-C, Shinozaki K, Nguyen HT, Wing RA, Cregan PB, Specht J, Grimwood J, Rokhsar D, Stacey G, Shoemaker RC, Jackson SA (2010) Genome sequence of the palaeopolyploid soybean. Nature 463(7278):178-183

20. Liu S, Yeh C-T, Ji T, Ying K, Wu H, Tang HM, Fu Y, Nettleton D, Schnable PS (2009) Mu transposon insertion sites and meiotic recombination events co-localize with epigenetic marks for open chromatin across the maize genome. PLoS Genet 5:e1000733

21. Schnable PS, Ware D, Fulton RS, Stein JC, Wei F, Pasternak S, Liang C, Zhang J, Fulton L, Graves TA, Minx P, Reily AD, Courtney L, Kruchowski SS, Tomlinson C, Strong C, Delehaunty K, Fronick C, Courtney B, Rock SM, Belter E, Du F, Kim K, Abbott RM, Cotton M, Levy A, Marchetto P, Ochoa K, Jackson SM, Gillam B, Chen W, Yan L, Higginbotham J, Cardenas M, Waligorski J, Applebaum E, Phelps L, Falcone J, Kanchi K, Thane T, Scimone A, Thane N, Henke J, Wang T, Ruppert J, Shah N, Rotter K, Hodges J, Ingenthron E, Cordes M, Kohlberg S, Sgro J, Delgado B, Mead K, Chinwalla A, Leonard S, Crouse K, Collura K, Kudrna D, Currie J, He R, Angelova A, Rajasekar S, Mueller T, Lomeli R, Scara G, Ko A, Delaney K, Wissotski M, Lopez G, Campos D, Braidotti M, Ashley E, Golser W, Kim H, Lee S, Lin J, Dujmic Z, Kim W, Talag J, Zuccolo A, Fan C, Sebastian A, Kramer M, Spiegel L, Nascimento L, Zutavern T, Miller B, Ambroise C, Muller S, Spooner W, Narechania A, Ren L, Wei S, Kumari S, Faga B, Levy MJ, McMahan L, Van Buren P, Vaughn MW, Ying K, Yeh CT, Emrich SJ, Jia Y, Kalyanaraman A, Hsia AP, Barbazuk WB, Baucom RS, Brutnell TP, Carpita NC, Chaparro C, Chia JM, Deragon JM, Estill JC, Fu Y, Jeddeloh JA, Han Y, Lee H, Li P, Lisch DR, Liu S, Liu Z, Nagel DH, McCann MC, SanMiguel P, Myers AM, Nettleton D, Nguyen J, Penning BW, Ponnala L, Schneider KL, Schwartz DC, Sharma A, Soderlund C, Springer NM, Sun Q, Wang H, Waterman M, Westerman R, Wolfgruber TK, Yang L, Yu Y, Zhang L, Zhou S, Zhu Q, Bennetzen JL, Dawe RK, Jiang J, Jiang N, Presting GG, Wessler SR, Aluru S, Martienssen RA, Clifton SW, McCombie WR, Wing RA, Wilson RK. (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326:1112-1115

Supplementary Table S7. Summary of sequencing and variations for three soybean varieties

Variety
Category / IT182932 / Hwangkeum / Williams 82
Mapping
Total bases / 36,880,852,541 / 19,983,900,227 / 16,544,562,819
Mean depth / 38.82 / 21.03 / 17.41
Coveragea
%_bases_above_1 / 97.4 / 97.4 / 98.5
%_bases_above_5 / 93.2 / 95.9 / 97.6
%_bases_above_10 / 85.5 / 92.0 / 89.8
%_bases_above_20 / 69.2 / 56.2 / 34.2
SNP
Total / 2,397,205 / 1,236,277 / 113,587
Known / 1,365,216 / 817,605 / 44,566
Homozygous / 2,286,168 / 1,165,945 / 51,454
Heterozygous / 111,037 / 70,332 / 62,133
Transition / 1,575,531 / 785,647 / 63,381
Transversion / 821,674 / 450,630 / 50,206
Exon / 116,121 / 55,029 / 5,610
Exon known / 73,935 / 41,882 / 2,008
Exon novel / 42,186 / 13,147 / 3,602
Exon homozygous / 108,819 / 50,081 / 2,268
Exon heterozygous / 7,302 / 4,948 / 3,342
Exon transition / 68,941 / 32,637 / 3,081
Exon transversion / 47,180 / 22,392 / 2,529
CDS / 85,330 / 40,543 / 4,388
5’ UTR / 9,706 / 4,420 / 423
3’ UTR / 21,886 / 10,468 / 856
Intron / 236,471 / 115,952 / 9,295
Silent / 38,587 / 18,489 / 1,932
Missense / 45,798 / 21,624 / 2,401
Nonsense / 896 / 414 / 50
Readthrough / 131 / 65 / 9
Splice_site / 701 / 357 / 50
Start_codon / 162 / 68 / 10
Indel
Total / 302,013 / 236,276 / 29,105
Homozygous / 286,097 / 224,203 / 20,269
Heterozygous / 15,916 / 12,073 / 8,836
Tandem_repeat / 213,098 / 173,403 / 22,621
Exon / 14,072 / 7,127 / 1,162
Exon novel / 14,072 / 7,127 / 1,162
Exon homozygous / 13,191 / 6,714 / 930
Exon heterozygous / 881 / 413 / 232
Exon tandem_repeat / 9,885 / 5,171 / 917
CDS / 4,879 / 2,452 / 655
5’ UTR / 3,476 / 1,633 / 199
3’ UTR / 5,775 / 3,073 / 316
Intron / 47,994 / 28,876 / 3,285
Frameshift / 2,772 / 1,531 / 595
Inframe / 2,107 / 921 / 60
Splice_site / 264 / 139 / 31
Start_codon / 85 / 41 / 5

a Percentage of the Williams 82 reference genome sequence covered by short reads at read depths above 1, 5, 10, and 20