Supplemental Table 1. Loci, genes, and SNPs reviewed for type 2 diabetes.*

Locus / Gene / Variant / # Entities listing / Listed by
10q23.33 / HHEX / rs1111875 / 5 / NIH / D / N / 23 / PG
8q24.11 / SLC30A8 / rs13266634 / 5 / NIH / D / N / 23 / PG
3p25.2 / PPARG / rs1801282 / 5 / NIH / D / N / 23 / PG
11p15.4 / KCNQ1 / rs2237892 / 4 / NIH / D / 23 / PG
3q27.2 / IGF2BP2 / rs4402960 / 4 / NIH / D / N / 23
10q25.2 / TCF7L2 / rs7903146 / 4 / NIH / D / 23 / PG
7p15.1 / JAZF1 / rs864745 / 4 / NIH / D / N / PG
4p16.1 / WFS1 / rs10010131 / 3 / D / N / PG
9p21.3 / CDKN2B / rs10811661 / 3 / NIH / N / PG
1p12 / NOTCH2, ADAM30 / rs10923931 / 3 / NIH / N / PG
9p21.3 / CDKN2A, CDKN2B / rs2383208 / 3 / NIH / D / 23
17q12 / HNF1B, TCF2 / rs4430796 / 3 / NIH / D / N
11p15.1 / KCNJ11 / rs5215 / 3 / NIH / D / N
11p15.1 / KCNJ11 / rs5219 / 3 / NIH / 23 / PG
6p22.3 / CDKAL1 / rs7756992 / 3 / NIH / D / N
12q21.1 / TSPAN8, LGR5 / rs7961581 / 3 / NIH / D / N
16q12.2 / FTO / rs8050136 / 3 / NIH / N / PG
11q21 / MTNR1B / rs10830963 / 2 / D / PG
6p22.3 / CDKAL1 / rs10946398 / 2 / NIH / PG
11q14.3 / MTNR1B / rs1387153 / 2 / NIH / 23
3q27.2 / IGF2BP2 / rs1470579 / 2 / NIH / PG
3q12.3 / ZPLD1 / rs2063640 / 2 / NIH / 23
10q25.2 / TCF7L2 / rs4506565 / 2 / NIH / N
3p14.1 / ADAMTS9 / rs4607103 / 2 / NIH / N
6p22.3 / CDKAL1 / rs4712523 / 2 / NIH / 23
2p21 / THADA / rs7578597 / 2 / NIH / D
11p12 / Intergenic / rs9300039 / 2 / NIH / N

Abbreviations: NIH, NIH GWAS catalog; D, deCODEme; N, Navigenics; 23, 23andMe; PG, Pathway Genomics

*Because the underlying SNP association studies are the basis for both the NIH GWAS catalog and consumer genomics analyses, examining the studies cited is another way to evaluate the core SNP list concept and to reveal the most important studies. Of the 74 studies cited by the 5 organizations in their type 2 diabetes analyses, Table 2 lists the 15 that are cited by more than one entity. The full list is available in the Supplementary Material. The pattern is similar to that of Table 1. No single study was cited by all 5 organizations; only 1 study was cited by 4 organizations (reference 14), 5 studies were cited by 3 organizations, and 9 studies were cited by 2 organizations. Eighty percent of the studies (59) were cited by only 1 organization, in most cases 23andMe.

We attempted to compare Pathway Genomics’s DRPs with those of the other companies using the three volunteers. However, Pathway Genomics did not report relative risks (RR); it reported disease risks as categories (for example, 1: low risk, 3: normal risk, 5: high risk). We therefore excluded Pathway Genomics from the DRP comparison analyses in the main sections and include it in this supplemental section for reference only.

Supplemental Table 2. Underlying research studies reviewed for type 2 diabetes SNPs.

Study PMID / # Entities Listing / Listed by:
17460697 / 4 / NIH / D / N / 23
17463248 / 3 / NIH / N / 23
17463249 / 3 / NIH / N / 23
17554300 / 3 / NIH / N / 23
17603484 / 3 / D / N / 23
18372903 / 3 / NIH / D / N
17463246 / 2 / NIH / 23
17603485 / 2 / D / N
17668382 / 2 / NIH / N
18711366 / 2 / NIH / N
18711367 / 2 / NIH / D
19060907 / 2 / D / 23
19401414 / 2 / NIH / 23
20581827 / 2 / NIH / 23
20818381 / 2 / NIH / 23

Abbreviations: PMID, PubMed Paper Identification Number; NIH, NIH GWAS catalog; D, deCODEme; N, Navigenics; 23, 23andMe; PG, Pathway Genomics.

Pathway Genomics (PG) does not provide PMID references by SNP.
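The counts quoted in the footnote to Supplemental Table 1 can be cross-checked against the rows of Supplemental Table 2. The following is a minimal sketch in Python; the row text is copied from the table above, the totals 74 and 15 are taken from the footnote, and the names `ROWS`, `parse_rows`, and `tally` are hypothetical helpers introduced only for this illustration.

```python
from collections import Counter

# Rows copied from Supplemental Table 2: PMID / # entities / listed by ...
ROWS = """\
17460697 / 4 / NIH / D / N / 23
17463248 / 3 / NIH / N / 23
17463249 / 3 / NIH / N / 23
17554300 / 3 / NIH / N / 23
17603484 / 3 / D / N / 23
18372903 / 3 / NIH / D / N
17463246 / 2 / NIH / 23
17603485 / 2 / D / N
17668382 / 2 / NIH / N
18711366 / 2 / NIH / N
18711367 / 2 / NIH / D
19060907 / 2 / D / 23
19401414 / 2 / NIH / 23
20581827 / 2 / NIH / 23
20818381 / 2 / NIH / 23
"""


def parse_rows(text):
    """Return a list of (pmid, declared_count, entities) tuples."""
    parsed = []
    for line in text.strip().splitlines():
        fields = [field.strip() for field in line.split("/")]
        parsed.append((fields[0], int(fields[1]), fields[2:]))
    return parsed


rows = parse_rows(ROWS)

# Each declared count should equal the number of entities actually listed.
consistent = all(count == len(entities) for _, count, entities in rows)

# Number of studies cited by 4, 3, and 2 entities, respectively.
tally = Counter(count for _, count, _ in rows)

# Totals quoted in the footnote: 74 studies overall, 15 cited more than once.
total_studies = 74
cited_once = total_studies - len(rows)
share_cited_once = 100 * cited_once / total_studies

print(consistent, dict(tally), cited_once, round(share_cited_once, 1))
```

The tally by number of citing entities and the share of studies cited only once can then be compared directly with the figures given in the footnote.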

Supplemental Table 3. Average risk.

Values are average risk (%), given by sex, company, and reference population.

Disease conditions / Male 23andMe European / Male 23andMe Asian / Male Navigenics Cauc. / Male deCODEme Cauc. / Female 23andMe European / Female 23andMe Asian / Female Navigenics Cauc. / Female deCODEme Cauc.
Type 2 diabetes / 25.7 / 27.8 / 25.0 / 40.0 / 20.7 / 21.9 / 30.0 / 28.0
Rheumatoid arthritis / 2.4 / 0.9 / 1.6 / 1.0 / 4.2 / 1.3 / 3.3 / 1.0
Restless legs syndrome / 2 / - / 4.0 / 7.0 / 4.2 / - / 4.0 / 13.0
Psoriasis / 11.4 / - / 4.0 / 2.0 / 10.1 / - / 4.0 / 2.0
Prostate cancer / 17.8 / 11.2 / 17.0 / 19.0 / - / - / - / -
Multiple sclerosis / 0.34 / - / 0.3 / 0.2 / 0.7 / - / 0.8 / 0.5
Lupus (systemic lupus erythematosus) / - / - / 0.0 / 0.1 / 0.25 / 0.32 / 0.3 / -
Heart attack / 46.8 / - / 42.0 / 42.0 / - / 25.0 / 25.0
Crohn’s disease / 0.53 / - / 0.6 / 0.5 / 0.47 / - / 0.5 / 0.5
Celiac disease / 0.12 / - / 0.1 / 1.0 / 0.24 / - / 0.1 / 1.0
Breast cancer / - / - / - / - / 13.5 / 8.8 / 13.0 / 12.0
Osteoarthritis / 40.0 / - / 47.0 / -
Atrial fibrillation / 27.2 / - / 26.0 / - / 15.9 / - / 23.0 / 25.0
Obesity / 63.9 / - / 34.0 / 39.5 / 59 / - / 32.0 / 39.5
Lung cancer / 8.5 / - / 8.0 / 17.2 / 6.2 / - / 6.0 / 11.6
Abdominal aortic aneurysm / 3.1 / 17.0 / 1.5 / -
Melanoma / 2.9 / - / 3.7 / - / 1.7 / - / 2.6 / -
Stomach cancer, diffuse / - / 0.23 / 2.4 / - / - / <0.1 / 1.7 / -
Brain aneurysm / 0.6 / 5.0 / 0.9 / 5.0
Age-related macular degeneration / 6.5 / - / 3.1 / 8.0 / 7 / - / 3.1 / 8.0
Colorectal cancer / 5.6 / 4.9 / 6.0 / 6.0 / 4 / 3.4 / 5.0 / 6.0
Alzheimer’s disease / 7.2 / - / 9.0 / 6.0 / 7.1 / - / 17.0 / -

Grey area: no risk value because of preliminary research.

Supplemental Table 4. Differences in risk prediction calculations between deCODEme and 23andMe.

A: Comparisons of disease risk prediction algorithms between deCODEme and 23andMe (r = 1.2, q = 0.01).

r = 1.2, q = 0.01 / deCODEme / 23andMe
p / d1 / d2 / d3 / d1 / d2 / d3
0.1 / 0.0096 / 0.0115 / 0.0138 / 0.0096 / 0.0115 / 0.0138
0.2 / 0.0092 / 0.0111 / 0.0133 / 0.0093 / 0.0111 / 0.0133
0.3 / 0.0089 / 0.0107 / 0.0128 / 0.0089 / 0.0107 / 0.0128
0.4 / 0.0086 / 0.0103 / 0.0123 / 0.0086 / 0.0103 / 0.0123
0.5 / 0.0083 / 0.0099 / 0.0119 / 0.0083 / 0.0099 / 0.0119
0.6 / 0.0080 / 0.0096 / 0.0115 / 0.0080 / 0.0096 / 0.0115
0.7 / 0.0077 / 0.0092 / 0.0111 / 0.0077 / 0.0092 / 0.0111
0.8 / 0.0074 / 0.0089 / 0.0107 / 0.0075 / 0.0089 / 0.0107
0.9 / 0.0072 / 0.0086 / 0.0103 / 0.0072 / 0.0086 / 0.0103

B: Comparisons of disease risk prediction algorithms between deCODEme and 23andMe (r = 1.5, q = 0.01)

r = 1.5, q = 0.01 / deCODEme / 23andMe
p / d1 / d2 / d3 / d1 / d2 / d3
0.1 / 0.0091 / 0.0136 / 0.0204 / 0.0091 / 0.0136 / 0.0202
0.2 / 0.0083 / 0.0124 / 0.0186 / 0.0083 / 0.0124 / 0.0184
0.3 / 0.0076 / 0.0113 / 0.0170 / 0.0076 / 0.0113 / 0.0169
0.4 / 0.0069 / 0.0104 / 0.0156 / 0.0070 / 0.0104 / 0.0156
0.5 / 0.0064 / 0.0096 / 0.0144 / 0.0064 / 0.0096 / 0.0143
0.6 / 0.0059 / 0.0089 / 0.0133 / 0.0059 / 0.0089 / 0.0133
0.7 / 0.0055 / 0.0082 / 0.0123 / 0.0055 / 0.0082 / 0.0123
0.8 / 0.0051 / 0.0077 / 0.0115 / 0.0051 / 0.0077 / 0.0115
0.9 / 0.0048 / 0.0071 / 0.0107 / 0.0048 / 0.0072 / 0.0107

C: Comparisons of disease risk prediction algorithms between deCODEme and 23andMe (r = 2, q = 0.01)

r = 2, q = 0.01 / deCODEme / 23andMe
p / d1 / d2 / d3 / d1 / d2 / d3
0.1 / 0.0083 / 0.0165 / 0.0331 / 0.0083 / 0.0164 / 0.0324
0.2 / 0.0069 / 0.0139 / 0.0278 / 0.0070 / 0.0137 / 0.0274
0.3 / 0.0059 / 0.0118 / 0.0237 / 0.0060 / 0.0118 / 0.0234
0.4 / 0.0051 / 0.0102 / 0.0204 / 0.0051 / 0.0102 / 0.0202
0.5 / 0.0044 / 0.0089 / 0.0178 / 0.0045 / 0.0089 / 0.0177
0.6 / 0.0039 / 0.0078 / 0.0156 / 0.0039 / 0.0078 / 0.0156
0.7 / 0.0035 / 0.0069 / 0.0138 / 0.0035 / 0.0070 / 0.0138
0.8 / 0.0031 / 0.0062 / 0.0123 / 0.0031 / 0.0062 / 0.0123
0.9 / 0.0028 / 0.0055 / 0.0111 / 0.0028 / 0.0056 / 0.0111


Supplemental Note 1. Risk prediction algorithms for 3 DTC companies.

We considered the case in which 1 biallelic marker was associated with a disease (denoted as a disease-associated locus). A disease-associated locus was assumed to consist of 2 alleles, a and A, the former resulting in an increased disease risk. Let p be the frequency of the risk allele a. Assuming Hardy-Weinberg equilibrium, the frequencies of genotypes AA, Aa, and aa can be represented by $(1-p)^2$, $2p(1-p)$, and $p^2$, respectively. Let d1, d2, and d3 be the penetrances (probabilities of developing the disease) of genotypes AA, Aa, and aa, respectively.

We can represent the prevalence of the disease in the population (q) as follows:

\[ q = (1-p)^2 d_1 + 2p(1-p)\, d_2 + p^2 d_3 \]

Here $d_i$ (i = 1, 2, 3) denotes the probability that an individual with the corresponding genotype, randomly selected from the population, has the disease.
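The relationship between allele frequency, penetrance, and prevalence can be illustrated with a short Python sketch. It assumes only Hardy-Weinberg proportions and takes q as the frequency-weighted sum of the three penetrances; the function names `genotype_frequencies` and `prevalence` are hypothetical and are not taken from any company's software.

```python
def genotype_frequencies(p):
    """Hardy-Weinberg frequencies of genotypes (AA, Aa, aa).

    p is the frequency of the risk allele a.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)


def prevalence(p, penetrances):
    """Population prevalence q as the frequency-weighted sum of (d1, d2, d3)."""
    return sum(f * d for f, d in zip(genotype_frequencies(p), penetrances))


# Rounded deCODEme penetrances for p = 0.1 from Supplemental Table 4A.
q = prevalence(0.1, (0.0096, 0.0115, 0.0138))
print(f"{q:.4f}")  # close to the q = 0.01 used for the table, up to rounding
```

Feeding a row of Supplemental Table 4 back into `prevalence` in this way is a simple check that the tabulated penetrances are consistent with the stated q.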

By using these notations, the risk prediction algorithm for the 3 companies (the method for calculating an individual’s absolute risk with multi-allelic markers) can be described as follows.

23andMe

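Let r(1) and r(2) be the odds ratios (ORs) of genotypes Aa and aa relative to genotype AA, defined by the following formulas:

\[ r(1) = \frac{d_2/(1-d_2)}{d_1/(1-d_1)}, \qquad r(2) = \frac{d_3/(1-d_3)}{d_1/(1-d_1)} \]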

Therefore, if we are given q, p, r(1), and r(2), then we can get (d1, d2, and d3) by solving the above 3 equations, since there are 3 equations with 3 variables.
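One way to solve this system numerically is sketched below in Python. The sketch assumes that both ORs are expressed relative to genotype AA, so that the odds of Aa and aa are r(1) and r(2) times the AA odds, and it finds the AA odds by bisection on the prevalence equation. The function name `penetrances_from_odds_ratios` and the use of bisection are choices made for this illustration, not a description of 23andMe's implementation.

```python
def penetrances_from_odds_ratios(q, p, r1, r2, iterations=200):
    """Solve for (d1, d2, d3) given prevalence q, risk-allele frequency p,
    and odds ratios r1 (Aa vs AA) and r2 (aa vs AA)."""
    if not 0.0 < q < 1.0 or r1 <= 0 or r2 <= 0:
        raise ValueError("need 0 < q < 1 and positive odds ratios")
    freqs = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)

    def penetrances(baseline_odds):
        # Genotype odds are multiples of the AA odds; d = odds / (1 + odds).
        odds = (baseline_odds, r1 * baseline_odds, r2 * baseline_odds)
        return tuple(o / (1 + o) for o in odds)

    def implied_prevalence(baseline_odds):
        return sum(f * d for f, d in zip(freqs, penetrances(baseline_odds)))

    # Implied prevalence increases with the baseline odds, so bracket the
    # root and then bisect.
    low, high = 0.0, 1.0
    while implied_prevalence(high) < q:
        high *= 2.0
    for _ in range(iterations):
        mid = (low + high) / 2.0
        if implied_prevalence(mid) < q:
            low = mid
        else:
            high = mid
    return penetrances((low + high) / 2.0)


# Example with q = 0.01, p = 0.1, and (as an assumption) r(1) = 2, r(2) = 4.
d1, d2, d3 = penetrances_from_odds_ratios(0.01, 0.1, 2.0, 4.0)
print(f"{d1:.4f} {d2:.4f} {d3:.4f}")
```

If r(1) = r and r(2) = r^2 are assumed, the output for these inputs can be compared with the 23andMe columns of Supplemental Table 4C at p = 0.1.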

Then, individual absolute risk with multi-allelic markers (P) is calculated by the following formulas.

where n represents the number of allelic markers and $d_{i g_i}$ represents the penetrance at locus i when the individual’s genotype at that locus is $g_i$.

Navigenics

Navigenics uses the risk ratios $\lambda_{RN} = d_2/d_1$ and $\lambda_{RR} = d_3/d_1$ instead of using d1, d2, and d3 directly.

We can obtain λRN and λRR by solving the following 2 equations:

Then, individual absolute risk with multi-allelic markers (P), which is termed GCI (genetic composite index) by Navigenics, is calculated by the following formula:

where n represents the number of allelic markers, $g_i$ represents the genotype at locus i, and $\lambda_{g_i}$ represents the risk ratio for that genotype (for genotypes AA, Aa, and aa, $\lambda_{g_i}$ = 1, d2/d1, and d3/d1, respectively).

deCODEme

deCODEme uses the odds ratio (OR) instead of the risk ratio; that is, it assumes that the risk ratio can be approximated by the value of the OR.

Since the prevalence of the disease in the population (q) can be described as follows:

,

the relative risk for each genotype aa, Aa, AA can be obtained by solving the above equations as follows:

where, .

Then, individual absolute risk with multi-allelic markers (P) is calculated by the following formulas.

where n represents the number of allelic markers, and $d_{i g_i}$ represents the penetrance at locus i when the individual’s genotype at that locus is $g_i$.
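The single-locus step of the deCODEme approach, in which the OR is treated as a risk ratio, can be sketched in Python as follows. This is an illustration under stated assumptions rather than deCODEme's own code: the ORs r(1) and r(2) are taken as the ratios d2/d1 and d3/d1, and d1 is obtained by substituting these ratios into the Hardy-Weinberg prevalence equation for q. The function name `penetrances_or_as_rr` is hypothetical. The multi-marker combination into P is not shown.

```python
def penetrances_or_as_rr(q, p, r1, r2):
    """Single-locus penetrances (d1, d2, d3) when ORs are treated as risk ratios.

    q  : disease prevalence in the population
    p  : frequency of the risk allele a
    r1 : OR for Aa vs AA, used here as d2 / d1 (assumption)
    r2 : OR for aa vs AA, used here as d3 / d1 (assumption)
    """
    f_AA, f_Aa, f_aa = (1 - p) ** 2, 2 * p * (1 - p), p ** 2
    # q = f_AA * d1 + f_Aa * r1 * d1 + f_aa * r2 * d1, solved for d1.
    d1 = q / (f_AA + f_Aa * r1 + f_aa * r2)
    return (d1, r1 * d1, r2 * d1)


# Example with q = 0.01, p = 0.1, and (as an assumption) r(1) = 2, r(2) = 4.
d1, d2, d3 = penetrances_or_as_rr(0.01, 0.1, 2.0, 4.0)
print(f"{d1:.4f} {d2:.4f} {d3:.4f}")
```

With r(1) = r and r(2) = r^2, the output for these inputs can be compared with the deCODEme columns of Supplemental Table 4C at p = 0.1. The difference from the 23andMe columns grows with r, which is where the risk-ratio approximation of the OR becomes less accurate.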