Journal of the Royal Society Interface

SUPPLEMENTARY MATERIAL

Population genetics of immune-related multilocus CNV in Native Americans

Luciana W Zuccherato, Silvana Schneider, Eduardo Tarazona-Santos, Robert J Hardwick, Douglas E Berg, Helen Bogle, Mateus H Gouveia, Lee Machado, Moara Machado, Fernanda Rodrigues-Soares, Giordano B Soares-Souza, Diego Leal Togni, Roxana Zamudio, Robert H Gilman, Denise Duarte, Edward J Hollox, Maíra R Rodrigues

MATERIAL AND METHODS

Estimating Allele Frequencies and fCNV

Our method estimates both fCNV and the allele frequencies using the profiled likelihood function and the expectation maximization (EM) algorithm, where the set of allele frequencies is the parameter of interest and fCNV is the nuisance parameter. In the original likelihood, the nuisance parameter is replaced by its maximum likelihood estimate, computed for fixed values of the allele frequencies.

Since the allele frequencies and fCNV are initially unknown, the method starts by assigning random values to the allele frequencies. Thus, considering the alleles with copy numbers k and q and their frequencies pk and pq, the genotypic proportion in the g-th iteration is given by:

(1)

Since the observed phenotype j (the experimentally observed diploid copy number) is known and k+q=j, the proportion of the j-th diploid copy number, given a population structure parameter f, is the sum over all possible genotypes for that diploid copy number:

(2)
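Equations (1) and (2) are not reproduced in this text version, so the following is only a minimal sketch of the step they describe. It assumes the standard inbreeding-type parameterization of genotype proportions (homozygotes p_k^2 + f p_k (1 - p_k), heterozygotes 2 p_k p_q (1 - f)); that form, and the function names, are assumptions for illustration and may differ from the authors' exact equations.

```python
from itertools import combinations_with_replacement


def genotype_proportions(p, f):
    """Map each unordered genotype (k, q), k <= q, to its proportion.

    p[k] is the frequency of the allele carrying k copies; f is the
    population structure parameter. The parameterization is an assumption.
    """
    props = {}
    for k, q in combinations_with_replacement(range(len(p)), 2):
        if k == q:
            props[(k, q)] = p[k] ** 2 + f * p[k] * (1.0 - p[k])
        else:
            props[(k, q)] = 2.0 * p[k] * p[q] * (1.0 - f)
    return props


def diploid_cn_proportions(p, f):
    """Sum genotype proportions over all genotypes with k + q = j."""
    out = [0.0] * (2 * (len(p) - 1) + 1)
    for (k, q), g in genotype_proportions(p, f).items():
        out[k + q] += g
    return out


# Allele frequencies of Pop1 in Table S1 (alleles with 0, 1 and 2 copies).
pop1 = [0.3, 0.4, 0.3]
for j, pj in enumerate(diploid_cn_proportions(pop1, 0.05)):
    print(f"diploid copy number {j}: {pj:.4f}")
```

Several genotypes can share the same diploid copy number (for example 0+2 and 1+1 both give j = 2), which is why the allele frequencies cannot be read directly from the observed copy numbers.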

The diploid copy numbers follow a multinomial distribution, with probability function given by:

(3)

where n is the total number of individuals in the sample, jmax is the highest value of a diploid copy number and nj is the number of individuals with diploid copy number j.
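As an illustration of the multinomial model described above, the sketch below evaluates the log of the multinomial probability of a vector of diploid copy number counts n_j given class proportions. The counts and proportions in the example are hypothetical, and the function name is not taken from CNVice.

```python
import math


def multinomial_loglik(counts, probs):
    """Log multinomial probability of counts n_j given class proportions.

    counts[j] is the number of individuals with diploid copy number j and
    probs[j] is the model proportion of that class.
    """
    n = sum(counts)
    ll = math.lgamma(n + 1)  # log(n!)
    for nj, pj in zip(counts, probs):
        ll -= math.lgamma(nj + 1)  # minus log(n_j!)
        if nj > 0:
            if pj <= 0.0:
                return float("-inf")  # observed class with zero probability
            ll += nj * math.log(pj)
    return ll


# Hypothetical sample of n = 100 individuals with diploid copy numbers 0..4.
counts = [12, 25, 33, 20, 10]
probs = [0.10, 0.25, 0.30, 0.25, 0.10]
print(multinomial_loglik(counts, probs))
```

Working on the log scale avoids overflow in the factorial terms for realistic sample sizes.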

Based on equations (1), (2) and (3), the joint likelihood function for fCNV and p can be obtained. With p being the vector of all proportions pk and pq (k+q = j), this function is given by:

(4)

To solve the joint likelihood function for fCNV and p, we use the profiled likelihood method (Severini 2000). In this likelihood, for fixed values of p, the maximum likelihood estimate of fCNV is generally a function of p, such that:

(5)
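The profiling step can be sketched as follows: for a fixed vector p, the log-likelihood is evaluated over candidate values of fCNV and the maximizing value is kept. The grid search and the genotype parameterization (homozygotes p_k^2 + f p_k (1 - p_k), heterozygotes 2 p_k p_q (1 - f)) are assumptions made for this illustration; the text does not state how the maximization is carried out.

```python
import math


def cn_probs(p, f):
    """Diploid copy number proportions for allele frequencies p and parameter f."""
    m = len(p)
    out = [0.0] * (2 * m - 1)
    for k in range(m):
        for q in range(k, m):
            if k == q:
                g = p[k] ** 2 + f * p[k] * (1.0 - p[k])
            else:
                g = 2.0 * p[k] * p[q] * (1.0 - f)
            out[k + q] += g
    return out


def loglik(counts, p, f):
    """Multinomial log-likelihood, dropping the constant that is free of f and p."""
    ll = 0.0
    for nj, pj in zip(counts, cn_probs(p, f)):
        if nj > 0:
            if pj <= 0.0:
                return float("-inf")
            ll += nj * math.log(pj)
    return ll


def profile_f(counts, p, grid_size=1001):
    """Return (f_hat, loglik) maximizing the likelihood over f in [0, 1] for fixed p."""
    best_f, best_ll = 0.0, float("-inf")
    for i in range(grid_size):
        f = i / (grid_size - 1)
        ll = loglik(counts, p, f)
        if ll > best_ll:
            best_f, best_ll = f, ll
    return best_f, best_ll


# Counts equal to the expected values for p = (0.5, 0.5), f = 0.2, n = 1000.
f_hat, ll_hat = profile_f([300, 400, 300], [0.5, 0.5])
print(f_hat)
```

In this two-allele example the genotypes are fully observed, so the profiled estimate can be checked by hand: setting the derivative 600/(1+f) - 400/(1-f) to zero gives f = 0.2.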

The profiled likelihood is then treated as an ordinary likelihood and, once the fCNV estimate is available, we go back to the initial equations to calculate the genotypic proportion of the g-th iteration as follows:

(6)

Afterwards, the proportion of class j, considering k+q=j, is given by:

(7)

After that, the genotypic proportion is updated with the results of equations (6) and (7) and with the sample information, as follows:

(8)

The allelic proportion is then given by:

(9)

Where it indicates the number of times that allele t appears in genotype i = (hk, hq), and jmax is the maximum phenotype. These values are stored in the initial values of Equation 9 and the algorithm starts over until convergence, which occurs when the difference between the current and the previous estimated allelic proportions is smaller than 10-8, or until it reaches a maximum of 10.000 iterations.

Populations, Samples and Genotyping

We used 103 validated ancestry informative markers (Pereira et al. 2012) to estimate continental admixture proportions in a set of 112 Aymara, 72 Ashaninka, 72 Shimaa and 25 Monte Carmelo individuals from the same populations studied in this article. Most of these individuals were also genotyped for the mCNV data presented in this study. We used the following populations from public databases for the continental ancestry analyses: 176 Yoruba from Ibadan, Nigeria (YRI) and 174 European descendants living in Utah (CEU), both from the HapMap database. We estimated individual ancestry using the software ADMIXTURE v.1.2 (Alexander et al. 2009), running the analyses in unsupervised mode with the command: admixture input.ped 3.

To compare the inference of genetic structure of the studied populations using mCNVs with that using SNP data, we genotyped 695 SNPs, unlinked to one another and to the studied mCNV loci, in 124 individuals from three populations: Ashaninka (n=44), Matsiguenga (n=64) and Aymara (n=16). The f statistics were calculated from SNP genotypes using the hierfstat R package (Goudet 2005), with the varcomp.glob function to estimate the fixation index and the boot.vc function with 1000 bootstraps to assess the confidence interval of the variance components. The list of these SNPs is at the end of this document.

CNV detection performed by SNP arrays tends to be robust in detecting simple CNVs, such as large deletions and duplications (Alkan et al. 2011). However, for smaller mCNVs, such as those analyzed in this paper, noisy data prevent discrimination between individual copy numbers. For accurate determination of diploid mCNV copy number, we performed the paralogue ratio test (PRT), a quantitative PCR that uses a single primer pair to simultaneously amplify both a test locus and a reference locus (Armour et al. 2007). The combined use of multiplex restriction enzyme digest variant ratio (REDVR) with PRT allows the determination of copy number, SNPs and paralogous genes, which adds a further layer of complexity.

SUPPLEMENTARY TABLES

Table S1. Allele frequencies and heterozygosity (H) for six theoretical populations used for testing CNVice performance (Sample size: 100 diploid individuals).

Population / Frequencies of copy number alleles (alleles with zero frequency omitted) / H
Copy number alleles: 0 / 1 / 2 / 3 / 4 / 5 / 6 / 7 / 8
Pop1 / 0.3 / 0.4 / 0.3 / 0.66
Pop2 / 0.1 / 0.1 / 0.8 / 0.34
Pop3 / 0.2 / 0.6 / 0.2 / 0.56
Pop4 / 0.1 / 0.1 / 0.1 / 0.1 / 0.2 / 0.1 / 0.1 / 0.1 / 0.1 / 0.88
Pop5 / 0.01 / 0.01 / 0.01 / 0.01 / 0.01 / 0.01 / 0.02 / 0.02 / 0.9 / 0.19
Pop6 / 0.05 / 0.05 / 0.1 / 0.6 / 0.1 / 0.05 / 0.05 / 0.61

H: genetic diversity or expected heterozygosity under Hardy-Weinberg equilibrium.
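The H column of Table S1 can be checked against the usual definition of expected heterozygosity under Hardy-Weinberg equilibrium, H = 1 - sum of squared allele frequencies. The short sketch below recomputes it from the frequencies listed in the table; the function name and dictionary layout are illustrative only.

```python
def expected_heterozygosity(freqs):
    """H = 1 - sum(p_i^2), the expected heterozygosity under Hardy-Weinberg."""
    return 1.0 - sum(p * p for p in freqs)


# Non-zero allele frequencies as listed in Table S1.
table_s1 = {
    "Pop1": [0.3, 0.4, 0.3],
    "Pop2": [0.1, 0.1, 0.8],
    "Pop3": [0.2, 0.6, 0.2],
    "Pop4": [0.1, 0.1, 0.1, 0.1, 0.2, 0.1, 0.1, 0.1, 0.1],
    "Pop5": [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.02, 0.02, 0.9],
    "Pop6": [0.05, 0.05, 0.1, 0.6, 0.1, 0.05, 0.05],
}

for name, freqs in table_s1.items():
    print(name, round(expected_heterozygosity(freqs), 2))
```

Rounded to two decimals, the recomputed values agree with the H column of Table S1.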

Table S2. Estimation of population structure parameter f and allele frequencies by CNVice. Monte Carlo simulations (1000 replications) were performed sampling from populations with the allele frequencies specified in Table S1, assuming a sample size of 100 individuals and different values of the f parameter: 0.05, 0.10 and 0.20.

Pop1: 3 alleles, Gene diversity: 0.66.

Parameter f = 0.05 / Parameter f = 0.10 / Parameter f = 0.20
Allele / True allele freq. / Estimated allele freq. / [CI 95%] / Estimated allele freq. / [CI 95%] / Estimated allele freq. / [CI 95%]
0 / 0.3 / 0.28713 / [0.28056; 0.29369] / 0.29496 / [0.28353; 0.30638] / 0.30609 / [0.28831; 0.32388]
1 / 0.4 / 0.42505 / [0.40838; 0.44173] / 0.41098 / [0.38400; 0.43796] / 0.39151 / [0.35117; 0.43184]
2 / 0.3 / 0.28642 / [0.27266; 0.30019] / 0.29251 / [0.27270; 0.31232] / 0.30106 / [0.27329; 0.32883]
3 / 0.0 / 0.00140 / [0.00000; 0.00503] / 0.00155 / [0.00000; 0.00579] / 0.00134 / [0.00000; 0.00645]
Estimated fCNV / 0.07150 / [0.00000; 0.23972] / 0.10221 / [0.00000; 0.29497] / 0.17170 / [0.00000; 0.40610]

Pop2: 4 alleles, Gene diversity: 0.38.

Parameter f = 0.05 / Parameter f = 0.10 / Parameter f = 0.20
Allele / True allele freq. / Estimated allele freq. / [CI 95%] / Estimated allele freq. / [CI 95%] / Estimated allele freq. / [CI 95%]
0 / 0.000 / 0.03495 / [0.00000; 0.15910] / 0.03342 / [0.00000; 0.15267] / 0.03933 / [0.00000; 0.17683]
1 / 0.080 / 0.09168 / [0.03458; 0.14878] / 0.09731 / [0.01434; 0.18029] / 0.10845 / [0.00000; 0.23587]
2 / 0.770 / 0.68696 / [0.36475; 1.00000] / 0.67723 / [0.33026; 1.00000] / 0.64648 / [0.21399; 1.00000]
3 / 0.110 / 0.12701 / [0.05353; 0.20050] / 0.13378 / [0.02924; 0.23833] / 0.14005 / [0.00000; 0.29228]
4 / 0.029 / 0.05708 / [0.00000; 0.19746] / 0.05563 / [0.00000; 0.19444] / 0.06254 / [0.00000; 0.22044]
5 / 0.000 / 0.00169 / [0.00000; 0.00331] / 0.00157 / [0.00000; 0.00332] / 0.00184 / [0.00000; 0.00469]
Estimated fCNV / 0.03901 / [0.00000; 0.21126] / 0.06026 / [0.00000; 0.27732] / 0.10469 / [0.00000; 0.39309]

Pop3: 3 alleles, Gene diversity: 0.56.

Parameter f = 0.05 / Parameter f = 0.10 / Parameter f = 0.20
Allele / True allele freq. / Estimated allele freq. / [CI 95%] / Estimated allele freq. / [CI 95%] / Estimated allele freq. / [CI 95%]
0 / 0.2 / 0.19705 / [0.19576; 0.19834] / 0.20173 / [0.19835; 0.20511] / 0.20424 / [0.18876; 0.21971]
1 / 0.6 / 0.60595 / [0.60215; 0.60974] / 0.60081 / [0.59176; 0.60986] / 0.59166 / [0.55854; 0.62477]
2 / 0.2 / 0.19529 / [0.19122; 0.19936] / 0.19516 / [0.18646; 0.20385] / 0.20286 / [0.18197; 0.22375]
3 / 0.0 / 0.00171 / [0.00025; 0.00318] / 0.00231 / [0.00000; 0.00511] / 0.00125 / [0.00000; 0.00420]
Estimated fCNV / 0.048583 / [0.00000; 0.16891] / 0.08628 / [0.00000; 0.23701] / 0.17951 / [0.00000; 0.37278]

Pop4: 9 alleles, Gene diversity: 0.88.

Parameter f = 0.05 / Parameter f = 0.10 / Parameter f = 0.20
Allele / True allele freq. / Estimated allele freq. / [CI 95%] / Estimated allele freq. / [CI 95%] / Estimated allele freq. / [CI 95%]
0 / 0.1 / 0.09668 / [0.06342; 0.12995] / 0.10153 / [0.06921; 0.13385] / 0.10533 / [0.07330; 0.13736]
1 / 0.1 / 0.10119 / [0.04658; 0.15579] / 0.09917 / [0.04737; 0.15096] / 0.09639 / [0.05346; 0.13933]
2 / 0.1 / 0.10510 / [0.01483; 0.19537] / 0.10560 / [0.02386; 0.18734] / 0.10907 / [0.03987; 0.17827]
3 / 0.1 / 0.11663 / [0.03224; 0.20101] / 0.11251 / [0.03677; 0.18824] / 0.10986 / [0.05351; 0.16620]
4 / 0.2 / 0.14993 / [0.02492; 0.27493] / 0.15322 / [0.04541; 0.26103] / 0.15986 / [0.07601; 0.24372]
5 / 0.1 / 0.13284 / [0.03620; 0.22948] / 0.12743 / [0.04409; 0.21077] / 0.11495 / [0.05831; 0.17159]
6 / 0.1 / 0.11939 / [0.00761; 0.23116] / 0.11230 / [0.01498; 0.20962] / 0.10664 / [0.03176; 0.18151]
7 / 0.1 / 0.09576 / [0.01298; 0.17853] / 0.09792 / [0.02667; 0.16918] / 0.09792 / [0.04308; 0.15277]
8 / 0.1 / 0.06488 / [0.00229; 0.12747] / 0.07563 / [0.01478; 0.13648] / 0.09290 / [0.04445; 0.14135]
Estimated fCNV / 0.05316 / [0.00000; 0.20805] / 0.08369 / [0.00000; 0.27313] / 0.17085 / [0.00000; 0.41368]

Pop5: 9 alleles, Gene diversity: 0.19.

Parameter f = 0.05 / Parameter f = 0.10 / Parameter f = 0.20
Allele / True allele freq. / Estimated allele freq. / [CI 95%] / Estimated allele freq. / [CI 95%] / Estimated allele freq. / [CI 95%]
0 / 0.01 / 0.02717 / [0.00000; 0.12428] / 0.02864 / [0.00000; 0.13154] / 0.03087 / [0.00000; 0.13896]
1 / 0.01 / 0.03333 / [0.00000; 0.14768] / 0.03248 / [0.00000; 0.14279] / 0.03380 / [0.00000; 0.14386]
2 / 0.01 / 0.03307 / [0.00000; 0.15328] / 0.03064 / [0.00000; 0.14352] / 0.03763 / [0.00000; 0.17213]
3 / 0.01 / 0.03903 / [0.00000; 0.17152] / 0.03861 / [0.00000; 0.16426] / 0.03600 / [0.00000; 0.15410]
4 / 0.01 / 0.04718 / [0.00000; 0.19029] / 0.05116 / [0.00000; 0.20754] / 0.04932 / [0.00000; 0.20232]
5 / 0.01 / 0.04880 / [0.00000; 0.19295] / 0.04874 / [0.00000; 0.19558] / 0.04513 / [0.00000; 0.17916]
6 / 0.02 / 0.04614 / [0.00000; 0.19394] / 0.04417 / [0.00000; 0.18752] / 0.04211 / [0.00000; 0.17709]
7 / 0.02 / 0.05944 / [0.00000; 0.22024] / 0.06025 / [0.00000; 0.22737] / 0.05568 / [0.00000; 0.20933]
8 / 0.9 / 0.43021 / [0.00000; 1.00000] / 0.42857 / [0.00000; 1.00000] / 0.43405 / [0.00000; 1.00000]
Estimated fCNV / 0.17217 / [0.00000; 0.69289] / 0.18816 / [0.00000; 0.72492] / 0.23368 / [0.00000; 0.81062]

Pop6: 7 alleles, Gene diversity: 0.61.

Parameter f = 0.05 / Parameter f = 0.10 / Parameter f = 0.20
Allele / True allele freq. / Estimated allele freq. / [CI 95%] / Estimated allele freq. / [CI 95%] / Estimated allele freq. / [CI 95%]
0 / 0.00 / 0.00807 / [0.00000; 0.02756] / 0.00783 / [0.00000; 0.02551] / 0.00894 / [0.00000; 0.03293]
1 / 0.05 / 0.03549 / [0.00000; 0.07619] / 0.03735 / [0.00000; 0.07782] / 0.04205 / [0.00000; 0.08779]
2 / 0.05 / 0.06317 / [0.00307; 0.12327] / 0.06112 / [0.00000; 0.12665] / 0.06254 / [0.00000; 0.14237]
3 / 0.10 / 0.17317 / [0.00000; 0.36011] / 0.16563 / [0.00000; 0.35098] / 0.15173 / [0.00000; 0.31855]
4 / 0.60 / 0.44226 / [0.02572; 0.85880] / 0.45457 / [0.03759; 0.87155] / 0.46907 / [0.00000; 0.86922]
5 / 0.1 / 0.17316 / [0.00000; 0.36623] / 0.17036 / [0.00000; 0.36078] / 0.15507 / [0.00000; 0.33001]
6 / 0.05 / 0.06204 / [0.00336; 0.12071] / 0.06012 / [0.00000; 0.12202] / 0.06236 / [0.00000; 0.13825]
7 / 0.05 / 0.03331 / [0.00000; 0.07310] / 0.03334 / [0.00000; 0.07187] / 0.03798 / [0.00000; 0.08389]
8 / 0.00 / 0.00652 / [0.00000; 0.02452] / 0.00586 / [0.00000; 0.02107] / 0.00633 / [0.00000; 0.02579]
Estimated fCNV / 0.03502 / [0.00000; 0.19410] / 0.05859 / [0.00000; 0.26130] / 0.12928 / [0.00000; 0.42511]
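
The sampling step of the Monte Carlo design summarized in the caption of Table S2 (1000 replicates of 100 individuals for a given f) can be sketched as below. The genotype parameterization, the seed and the function name are assumptions for illustration. The sketch covers only the simulation of diploid copy number counts, because the supplement does not state how the estimates and confidence intervals were derived from the replicates.

```python
import random


def simulate_diploid_counts(p, f, n=100, rng=None):
    """Draw n individuals and return counts of each diploid copy number.

    Genotype proportions are assumed to be p_k^2 + f*p_k*(1 - p_k) for
    homozygotes and 2*p_k*p_q*(1 - f) for heterozygotes.
    """
    rng = rng or random.Random()
    genotypes, weights = [], []
    for k in range(len(p)):
        for q in range(k, len(p)):
            if k == q:
                w = p[k] ** 2 + f * p[k] * (1.0 - p[k])
            else:
                w = 2.0 * p[k] * p[q] * (1.0 - f)
            genotypes.append((k, q))
            weights.append(w)
    counts = [0] * (2 * len(p) - 1)
    for k, q in rng.choices(genotypes, weights=weights, k=n):
        counts[k + q] += 1  # only the sum k + q is observed
    return counts


rng = random.Random(2024)  # arbitrary seed, for reproducibility only
pop3 = [0.2, 0.6, 0.2]  # true allele frequencies of Pop3 (alleles 0, 1, 2)
replicates = [simulate_diploid_counts(pop3, 0.10, n=100, rng=rng) for _ in range(1000)]
print(replicates[0])
```

Each element of replicates is one simulated sample that an estimator such as CNVice would take as input.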

Tables S3-S9 are in Excel files.

Table S3. Diploid copy number distribution of beta-defensin CNV loci and inferred allelic frequency in populations and subpopulations with CoNVEM (A) and CNVice (B).

Table S4. Diploid copy number distribution of CCL3L1/CCL4L1 CNV locus and inferred allelic frequencies in populations and subpopulations with CoNVEM (A) and CNVice (B).

Table S5. Diploid copy number distribution of FCGR3A CNV locus and inferred allelic frequencies in populations and subpopulations with CoNVEM (A) and CNVice (B).

Table S6. Diploid copy number distribution of FCGR3B CNV locus and inferred allelic frequencies in populations and subpopulations with CoNVEM (A) and CNVice (B).

Table S7. Diploid copy number distribution of FCGR2C CNV locus and inferred allelic frequencies in populations and subpopulations with CoNVEM (A) and CNVice (B).

Table S8. Diploid copy number distribution of DEFB CNV locus for worldwide populations (Human Genome Diversity Project panel) published by Hardwick et al. (2011) and allele frequencies and fCNV inferred using CoNVEM (A) and CNVice (B).

Table S9. Diploid copy number distribution of FCGR3B CNV locus for worldwide populations (Human Genome Diversity Project panel) published by Machado et al. (2012), and allele frequencies and fCNV inferred using CoNVEM (A) and CNVice (B).

Table S10. Allelic frequency of functional SNPs associated with CNV loci. The ratio of the two alleles of allotypes HNA1a and HNA1b of FCGR3B, alleles Q and X of FCGR2C and alleles of Clade II and I of DEFB103 was assigned for each individual, considering the copy number of each allele. The population frequency was calculated by the mean of individual ratios.

Allele (SNP) / Ashaninka / Monte Carmelo / Shimaa / Aymara
Clade II DEFB103 (rs2737902) / 0.048 / 0.098 / 0.131 / 0.075
HNA1a FCGR3B (rs447536) / 0.820 / 0.780 / 0.710 / 0.680
Q57 FCGR2C (rs10917661) / 0.070 / 0.060 / 0.090 / 0.060
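
The calculation described in the caption of Table S10 (a per-individual allele ratio that accounts for copy number, averaged over the population) can be illustrated with the sketch below. The individuals in the example are hypothetical, and skipping individuals with zero copies of the locus is an assumption of this sketch, not something stated in the caption.

```python
def population_frequency(individuals):
    """Mean of individual allele ratios.

    individuals is a list of (copies_of_allele, total_copies) pairs, e.g. an
    individual with 4 copies of the locus, 1 of which carries the allele of
    interest, is written (1, 4).
    """
    ratios = [a / total for a, total in individuals if total > 0]
    return sum(ratios) / len(ratios)


# Hypothetical sample of four individuals with different copy numbers.
sample = [(1, 4), (0, 3), (2, 4), (1, 5)]
print(population_frequency(sample))
```

Averaging ratios gives every individual the same weight regardless of copy number, which differs from pooling all copies before dividing.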

Table S11. Genotyping of 53 trios with unique diploid copy number combinations for beta-defensins, CCL3L1/CCL4L1, FCGR3A, FCGR3B and FCGR2C. (IN EXCEL FILE)

Table S12. Description of the Peruvian native populations included in this study.

Ethnic group / Village / Sample size (n=376) / Geographic coordinates / District / Province / Region
Ashaninka (n=143) / Cushireni / 20 / 11°12'S, 73°42'W / Rio Tambo / Satipo / Junin
Mayapo / 36 / 11°10'S, 73°42'W
Charahuaja / 35 / 11°04'S, 73°43'W
Capitiri / 20 / 11°06'S, 73°43'W
Ivotsote / 32 / 11°09'S, 73°42'W
Matsiguenga (n=113) / Monte Carmelo / 24 / 12°13'S, 72°58'W / Echarate / La Convencion / Cusco
Shimaa / 89 / 12°33'S, 73°05'W
Aymara (n=120) / Jayu Jayu / 10 / 15°59'S, 69°37'W / Acora / Puno / Puno
Ccopamaya / 51 / 15°58'S, 69°38'W
Laraqueri / 8 / 16°14'S, 70°06'W / Pichacani
Pichacani / 37 / 16°06'S, 70°09'W
Camicachi / 14 / 16°09'S, 69°63'W / Ilave / El Collao

Table S13. Reference samples with known copy numbers from the European Collection of Cell Cultures used to normalize the values in the PRT technique. (IN EXCEL FILE)

SUPPLEMENTARY FIGURES

Figure S1. Comparison of allele frequency estimations by CNVice and CoNVEM in simulated populations when the f parameter varies. Sample size is 100 individuals. Characteristics of simulated populations 1-6 (allele frequencies and heterozygosity H) are in Table S1. The true values of simulated allele frequencies are indicated by the “reference” line.

Figure S2. Comparison of CNVice individual genotype probability estimation with and without trio information (posterior and prior probabilities respectively). Data comes from 53 Ashaninka trios with unique diploid copy number combinations for beta-defensins, CCL3L1/CCL4L1, FCGR3A, FCGR3B and FCGR2C. Posterior probability of the offspring’s most likely genotype increases, reducing the level of uncertainty for its genotype, in 49% of cases (blue points). Yellow points indicate no change in the posterior probability (43.4% of cases) and red points indicate a decrease in the posterior probability (7.6% of cases).

Figure S3. Profiled likelihood and maximum likelihood estimation of the population structure parameter fCNV for the DEFB and FCGR3B multilocus CNVs, using worldwide data from the HGDP panel (DEFB: Hardwick et al. 2011, Human Mutation; FCGR3B: Machado et al. 2012, AJHG). The vertical axis of each locus is standardized according to its maximum likelihood.

List of SNPs used for f estimation in Figure 3 (Bottom).

rs12121543, rs1801133, rs2066470, rs1052576, rs2020902, rs6300, rs2088102, rs1048771, rs1883454, rs2235544, rs1887285, rs7602, rs1137100, rs1137101, rs6413471, rs663649, rs559062, rs540049, rs5845, rs1041163, rs2392221, rs3176879, rs2234696, rs7483, rs879332, rs4659175, rs2064902, rs1889740, rs2864873, rs7517566, rs1205, rs1061217, rs164283, rs2295612, rs6413453, rs5085, rs5082, rs15049, rs5361, rs929087, rs11072, rs486907, rs2274064, rs689466, rs800292, rs2274700, rs1065489, rs2820312, rs3024496, rs3021094, rs1800871, rs1800896, rs1800890, rs2671272, rs3738043, rs2854456, rs2234922, rs1051741, rs1136410, rs1805415, rs1805414, rs1805087, rs2275565, rs735943, rs4149963, rs1042034, rs1469513, rs1367117, rs2303287, rs4149372, rs1049987, rs163077, rs1367696, rs163086, rs10916, rs162562, rs1800440, rs162556, rs162555, rs10175368, rs6544718, rs1863332, rs4952887, rs7607076, rs1981928, rs7602094, rs17036614, rs3821227, rs4608577, rs6544991, rs3136228, rs1800935, rs17561, rs1071676, rs1143634, rs3136558, rs1143627, rs419598, rs4150416, rs853785, rs3770603, rs5742926, rs5742938, rs1233297, rs5743072, rs1233258, rs5743116, rs256564, rs256563, rs256550, rs2066804, rs3900115, rs2293554, rs1035142, rs4673222, rs11571315, rs4553808, rs11571317, rs231775, rs3087243, rs5031011, rs2070094, rs2229571, rs2070096, rs828910, rs207916, rs1051685, rs2440, rs1106037, rs2372848, rs2270360, rs2241193, rs1042640, rs7568, rs125701, rs2304277, rs2938392, rs1175541, rs2228000, rs1799977, rs3732378, rs3732379, rs3864004, rs4135385, rs2953, rs9883073, rs4987053, rs3138042, rs936426, rs3448, rs2228017, rs9282638, rs1385520, rs3732361, rs10934500, rs1732170, rs1719895, rs2319398, rs1574154, rs4072520, rs17810235, rs17204878, rs17810676, rs9282641, rs1129055, rs4678045, rs1042636, rs1127717, rs2305230, rs140696, rs582537, rs1317082, rs3733001, rs1001073, rs3774268, rs698090, rs3105782, rs3864099, rs13094773, rs13089330, rs3172469, rs6789961, rs6790167, rs35861864, rs35592567, rs3817672, rs2855262, 
rs7041, rs2227549, rs230547, rs4444903, rs2237051, rs1396080, rs769242, rs2254514, rs2676330, rs2714805, rs3804099, rs3804100, rs10078, rs13167280, rs2853677, rs1801075, rs6347, rs460700, rs2287780, rs10380, rs1802059, rs6863657, rs34677, rs10941112, rs34689, rs3195676, rs1494555, rs2972418, rs2972419, rs2940944, rs6873545, rs6878512, rs6179, rs6180, rs6413428, rs7579, rs2972388, rs2303151, rs1805355, rs26279, rs2075685, rs3777015, rs1805377, rs2909786, rs41115, rs866006, rs459552, rs40401, rs839, rs1295686, rs20541, rs2243248, rs2243250, rs2070874, rs2243268, rs2243290, rs1042124, rs10063949, rs3829987, rs2228422, rs2042235, rs1946234, rs8177404, rs8177426, rs2277940, rs3212227, rs5326, rs2235718, rs984253, rs2745599, rs5369, rs5370, rs1799945, rs1572982, rs707889, rs909253, rs1799964, rs1800630, rs1800629, rs3093661, rs2072915, rs2076310, rs513349, rs1757000, rs12197850, rs10507, rs25648, rs730775, rs513688, rs405729, rs367836, rs543613, rs4986947, rs130058, rs6155, rs497186, rs581235, rs11914, rs3799488, rs488133, rs2077647, rs2273206, rs2228480, rs1799971, rs9282821, rs562859, rs4880, rs1570070, rs998075, rs629849, rs2282140, rs1803989, rs2009115, rs2066853, rs1800797, rs4619, rs2017000, rs1140475, rs2293347, rs1211152, rs2235074, rs9282564, rs1801197, rs8713, rs1049334, rs1049337, rs11762213, rs35775721, rs13223756, rs41736, rs10250202, rs10244817, rs2167270, rs2069456, rs1549760, rs10245199, rs33945943, rs34206126, rs33972793, rs7006985, rs9644708, rs263, rs316, rs327, rs328, rs1126452, rs2230009, rs1800389, rs1800392, rs1346044, rs3136717, rs1129660, rs1031552, rs3779870, rs2306494, rs2306492, rs1063053, rs1063045, rs3088440, rs3731239, rs3731217, rs1800975, rs868, rs2230806, rs1805330, rs1805313, rs2228083, rs1139488, rs5788, rs1330684, rs10795241, rs12529, rs2245191, rs2275928, rs1937920, rs3829125, rs2296135, rs2296141, rs2228059, rs3136614, rs1149901, rs2229359, rs3781093, rs569421, rs1800858, rs1800860, rs4986832, rs2029253, rs1369214, rs2228529, 
rs10082466, rs10824793, rs5030737, rs2243639, rs1903858, rs1468063, rs4986894, rs4244285, rs717620, rs2273697, rs619824, rs10883782, rs4919682, rs4919687, rs743572, rs12917, rs2070676, rs4987059, rs2230949, rs734351, rs3213223, rs3213216, rs1003483, rs3751058, rs3751052, rs6256, rs6254, rs2279900, rs12574333, rs3740617, rs769218, rs769217, rs475043, rs2227973, rs312016, rs608343, rs607887, rs7177, rs2075577, rs1393350, rs1870019, rs1042839, rs492457, rs1042838, rs543215, rs568157, rs3758841, rs1943781, rs2071230, rs10488, rs664677, rs1800889, rs1801516, rs664143, rs4585, rs1079597, rs1799978, rs9610, rs506504, rs6413436, rs887477, rs3213427, rs921, rs2075267, rs1045733, rs1055151, rs3748302, rs715398, rs10160846, rs13096, rs9266, rs17329025, rs2970532, rs757343, rs7974876, rs12821902, rs2280698, rs2280699, rs703842, rs769412, rs2288729, rs1007573, rs5742714, rs978458, rs4764883, rs5742667, rs2162679, rs11038, rs3751144, rs989892, rs4765621, rs3924313, rs4765181, rs750771, rs1539096, rs1799943, rs144848, rs1801406, rs543304, rs1799955, rs15869, rs3765535, rs1047768, rs2227869, rs17655, rs1805386, rs1713449, rs1760904, rs1760898, rs1760897, rs1760944, rs3136814, rs1130409, rs945352, rs2234636, rs4986938, rs3020450, rs2071566, rs2296327, rs3825644, rs3742599, rs2277502, rs7101, rs1063169, rs4645856, rs1046428, rs1799794, rs1130233, rs1900758, rs1800407, rs1800404, rs11636097, rs936216, rs2242119, rs2619681, rs2304579, rs2412546, rs1153829, rs4646, rs10046, rs700518, rs12907866, rs749292, rs1058219, rs1800588, rs3825776, rs1968687, rs1968689, rs6083, rs2242064, rs6074, rs4646421, rs17861115, rs2472299, rs3129, rs2072352, rs389480, rs2229765, rs1065663, rs7102, rs4280262, rs1800067, rs2057768, rs3024544, rs1805015, rs8832, rs820299, rs289717, rs1801706, rs5923, rs1109166, rs1801026, rs251796, rs10517, rs689453, rs723012, rs1424151, rs1061646, rs2159116, rs3785275, rs2016571, rs2239359, rs2239360, rs7220870, rs1126667, rs5435, rs4227, rs13894, rs858520, rs858517, rs6259, 
rs727428, rs1641535, rs1641512, rs1614984, rs2909430, rs8079544, rs2287499, rs17885803, rs2287498, rs3744263, rs9282801, rs2297518, rs1052536, rs1042658, rs2269858, rs598126, rs1060915, rs4986852, rs16940, rs1799949, rs1799950, rs799923, rs4987082, rs2114443, rs2071409, rs4986763, rs1015771, rs4988340, rs2048718, rs11868547, rs7210356, rs4128941, rs3923087, rs9282553, rs699517, rs2790, rs7614, rs1145315, rs757228, rs3745551, rs1051690, rs1799817, rs8110533, rs891087, rs1035942, rs3745545, rs35931207, rs919275, rs2233679, rs889162, rs5030390, rs281432, rs5498, rs3093032, rs5930, rs5925, rs2116898, rs14158, rs3008, rs3212752, rs1059369, rs1800469, rs25487, rs13181, rs6966, rs3212986, rs11615, rs20580, rs20579, rs905238, rs2304206, rs2304204, rs2278414, rs2278415, rs4988334, rs35560557, rs1110277, rs1776964, rs1715364, rs25406, rs1484994, rs619289, rs285164, rs34771484, rs2076546, rs396221, rs2296241, rs2248359, rs10485805, rs6024840, rs1047972, rs6064387, rs6099129, rs12482371, rs1893650, rs2070424, rs3153, rs1059293, rs25678, rs2156406, rs458582, rs469390, rs2070229, rs2280807, rs2072683, rs1050008, rs469304, rs12613, rs234706, rs2236467, rs7499, rs1051266, rs6518591, rs1800706, rs4646310, rs4646312, rs4680, rs2240716, rs12233352, rs140504, rs2267131, rs2097461, rs2239815, rs2267155, rs86582, rs2016755, rs4376

References of the Supplementary Material that are not in the Main Text

Armour J et al. Accurate, high-throughput typing of copy number variation using paralogue ratios from dispersed repeats. Nucleic Acids Research 2007;35(3):-.

Alkan C, Coe BP, Eichler EE (2011) Genome structural variation discovery and genotyping. Nature Reviews Genetics 12, 363-376.

Machado L.R., et al. Evolutionary History of Copy-Number-Variable Locus for the Low-Affinity Fcγ Receptor: Mutation Rate, Autoimmune Disease, and the Legacy of Helminth Infection. Am J Hum Genet 2012.

Hardwick, R., et al. A Worldwide Analysis of Beta-Defensin Copy Number Variation Suggests Recent Selection of a High-Expressing DEFB103 Gene Copy in East Asia. Human Mutation 2011;32(7):743-750.
