SUPPLEMENTARY METHODS

Y chromosome and mitochondrial genotyping and sequencing

DNA was extracted from peripheral blood following standard protocols. To establish paternal lineages, we determined the variation in the non-recombining part of the Y chromosome (NRY) by genotyping 35 NRY single nucleotide polymorphisms (SNPs) (Table 1 of the manuscript). A total of 24 SNPs (M170, M96, M45, M207, M213, M9, SRY1532, M174, 12f2, M106, M175, M69, M168, M343, M20, M214, M145, M124, tat, M172, M201, MEH2, M173, M91) were amplified in one multiplex reaction, followed by detection in one minisequencing reaction using the SNaPshot Multiplex Kit (Applied Biosystems), as described by Corach et al. (2010)1. Purified minisequencing products were analyzed on an ABI 3100 Genetic Analyzer using LIZ 120 as the size standard. Data were analyzed using GeneMapper ID v3.2.1 software, allowing us to identify unique event polymorphisms (UEPs). We followed the nomenclature of the Y Chromosome Consortium (updated to 2009)2 to assign each UEP to a major haplogroup. To further dissect specific haplogroups, we sequenced the remaining 11 SNPs in subsets of samples, chosen according to the results for the first 24 SNPs. Mutations M17, M18 and M269 were typed in samples assigned to haplogroup R1; M35, M78 and M123 in samples assigned to haplogroup E; M67, M92 and M102 in samples assigned to haplogroup J2; and M26 and M161 in samples assigned to haplogroup I. Primers for sequencing the 11 additional SNPs were taken from the literature2. Sequencing reactions were performed using BigDye Terminators v3.1 (Applied Biosystems) and run on a 3730 Genetic Analyzer. Data were analyzed using SeqScape software (Applied Biosystems).
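The second-round typing scheme described above amounts to a lookup from first-round haplogroup to follow-up SNP panel. The sketch below is only an illustration of that mapping: the function name `follow_up_panel` and the use of plain haplogroup labels as dictionary keys are assumptions made for the example, not part of the original workflow.

```python
# Follow-up SNPs sequenced per major haplogroup, as listed in the text.
FOLLOW_UP_SNPS = {
    "R1": ["M17", "M18", "M269"],
    "E": ["M35", "M78", "M123"],
    "J2": ["M67", "M92", "M102"],
    "I": ["M26", "M161"],
}


def follow_up_panel(haplogroup):
    """Return the SNPs to sequence for a sample, given its first-round haplogroup.

    Hypothetical helper: samples whose haplogroup has no follow-up panel
    get an empty list, meaning no additional sequencing is performed.
    """
    return list(FOLLOW_UP_SNPS.get(haplogroup, []))


if __name__ == "__main__":
    for label in ("R1", "J2", "Q"):
        print(label, follow_up_panel(label))
```

Together the four panels contain the 11 additional SNPs, which, added to the 24 SNPs of the first round, give the 35 NRY SNPs genotyped in total.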

Further, we genotyped 17 Y-chromosomal microsatellite (short tandem repeat, STR) markers (DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS385a and DYS385b, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS635, Y-GATA-H4) using the AmpFlSTR® Yfiler® PCR Amplification Kit (Applied Biosystems), according to the manufacturer's instructions. However, we excluded markers DYS385a and DYS385b from the analysis, as these markers are duplicated and therefore did not provide additional information. Microsatellite alleles were coded according to the nomenclature of the Y-STR reference database.
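As an illustration of the marker exclusion described above, the following sketch builds the haplotype used for analysis from a full 17-marker profile. The data layout (a dictionary of marker name to allele) and the function name `analysis_haplotype` are assumptions made for this example only.

```python
# The 17 Y-STR markers genotyped, in the order given in the text.
YFILER_MARKERS = [
    "DYS19", "DYS389I", "DYS389II", "DYS390", "DYS391", "DYS392", "DYS393",
    "DYS385a", "DYS385b", "DYS437", "DYS438", "DYS439", "DYS448",
    "DYS456", "DYS458", "DYS635", "Y-GATA-H4",
]

# Duplicated-locus markers excluded from the analysis.
EXCLUDED_MARKERS = {"DYS385a", "DYS385b"}

ANALYSIS_MARKERS = [m for m in YFILER_MARKERS if m not in EXCLUDED_MARKERS]


def analysis_haplotype(profile):
    """Return the alleles of the retained markers as an ordered tuple.

    `profile` is a hypothetical dict mapping marker name to allele
    (repeat count); markers missing from the profile are reported as None.
    """
    return tuple(profile.get(marker) for marker in ANALYSIS_MARKERS)
```

With the two DYS385 markers removed, each haplotype consists of 15 markers.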

We also determined maternal lineages. We sequenced the hypervariable segments HVS-I (nucleotides 16001-16568) and HVS-II (nucleotides 001-574) of the mitochondrial DNA (mtDNA) control region, numbered according to the Cambridge Reference Sequence (updated to 2009). DNA amplification and sequencing of the products were performed according to the protocol of the VariantSEQr Resequencing System (Applied Biosystems). To define mtDNA haplogroups, we used the mtDNAmanager software3, which provides the expected haplogroups from specific mutations and their surrounding haplotype background, using a reference set of 2,619 mtDNA sequences.

Autosomal SNP markers

A sample of 118 individuals, comprising 72 males from the Y chromosome sample and 46 females, was genotyped using Affymetrix Genome-Wide Human SNP Array 5.0 chips, which contain 500,568 SNPs. Experiments were performed according to the protocol provided by the supplier. Briefly, 250 ng of genomic DNA was digested with NspI and StyI, ligated to specific adaptors, and amplified by PCR using the kit primers. The amplicons were purified, quantified, fragmented and labelled prior to hybridization to the array chips at 48°C for 16–18 h. Excess unhybridized products were removed by multiple washing steps, and the arrays were scanned with a GeneChip Scanner 3000 (Affymetrix).

Five individuals with low CEL intensity signal calls (<86%) were eliminated from the analyses. Intensity signals were converted to genotype calls using the BRLMM (Bayesian Robust Linear Model with Mahalanobis distance) algorithm4 with the Genotyping Console v3.0 software from Affymetrix (http://www.affymetrix.com). Unmapped and duplicated SNPs were removed. Next, SNPs with genotype call rates lower than 95% were excluded (No. SNPs: 42,783). Further quality controls included removal of SNPs deviating from Hardy-Weinberg equilibrium (HWE) at a significance threshold of p = 0.001 (No. SNPs: 882), SNPs with missing genotype rates of >=10% (No. SNPs: 34) and individuals with missing genotype rates of >5% (none). After the quality control procedure, 113 individuals (44 females and 69 males) genotyped for 402,566 autosomal markers were available for analysis.
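The per-SNP filters above can be illustrated with a minimal sketch. It is not the procedure used to produce the reported counts: the text does not state which HWE test was applied, so a one-degree-of-freedom chi-square goodness-of-fit test is assumed here, as is the genotype coding (0, 1 or 2 copies of one allele, with None for a missing call). The function names are hypothetical; only the 95% call-rate and p = 0.001 thresholds come from the text.

```python
import math


def call_rate(genotypes):
    """Fraction of non-missing calls; None marks a missing genotype."""
    if not genotypes:
        return 0.0
    return sum(g is not None for g in genotypes) / len(genotypes)


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic SNP: no deviation can be tested
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(genotypes, min_call_rate=0.95, hwe_alpha=0.001):
    """Keep a SNP if its call rate and HWE p-value meet both thresholds."""
    if call_rate(genotypes) < min_call_rate:
        return False
    called = [g for g in genotypes if g is not None]
    counts = [called.count(k) for k in (0, 1, 2)]
    return hwe_chi2_pvalue(*counts) >= hwe_alpha
```

For example, a SNP typed in 100 individuals with genotype counts 25/50/25 is retained, whereas one with 100 heterozygotes, or one with 10 missing calls, is removed. With samples of this size an exact HWE test is often preferred over the chi-square approximation, particularly for SNPs with a rare allele.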

REFERENCES

1 Corach D, Lao O, Bobillo C et al: Inferring Continental Ancestry of Argentineans from Autosomal, Y-Chromosomal and Mitochondrial DNA. Annals of Human Genetics 2010; 74: 65-76.

2 Karafet TM, Mendez FL, Meilerman MB, Underhill PA, Zegura SL, Hammer MF: New binary polymorphisms reshape and increase resolution of the human Y chromosomal haplogroup tree. Genome Research 2008; 18: 830-838.

3 Lee HY, Song I, Ha E, Cho S-B, Yang WI, Shin K-J: mtDNAmanager: a Web-based tool for the management and quality analysis of mitochondrial DNA control-region sequences. BMC Bioinformatics 2008; 9: 483.

4 Rabbee N, Speed TP: A genotype calling algorithm for affymetrix SNP arrays. Bioinformatics 2006; 22: 7-12.
