APPENDIX B: Details of the Simulation Study

In this appendix, we describe the simulations carried out in this paper. We consider 2 unlinked QTLs whose genotypes are coded additively as (-1, 0, 1). We denote by p1 the minor allele frequency (MAF) of QTL 1 and by p2 the MAF of QTL 2.

Scenario 1: No population stratification

We start each simulation by generating genotypes at the QTLs for the required number of subjects, according to the specified MAFs and the Mendelian laws. We then simulate the trait values according to the setting under study.
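The genotype step can be sketched as follows. This is a minimal illustration rather than the code used in the study. It assumes a nuclear family, founder alleles drawn independently with the minor allele at frequency MAF, and the coding "number of minor alleles minus one" for (-1, 0, 1). All function names are hypothetical.

```python
import random

def founder_alleles(maf, rng):
    """Draw two alleles for a founder; 1 marks the minor allele (frequency maf)."""
    return tuple(1 if rng.random() < maf else 0 for _ in range(2))

def offspring_alleles(father, mother, rng):
    """Mendelian transmission: each parent passes on one of its two alleles at random."""
    return (rng.choice(father), rng.choice(mother))

def coded_genotype(alleles):
    """Additive coding (-1, 0, 1): number of minor alleles minus one (assumed convention)."""
    return sum(alleles) - 1

def simulate_family_genotypes(maf, n_offspring, rng):
    """Return coded genotypes [father, mother, offspring...] for one QTL."""
    father = founder_alleles(maf, rng)
    mother = founder_alleles(maf, rng)
    kids = [offspring_alleles(father, mother, rng) for _ in range(n_offspring)]
    return [coded_genotype(a) for a in (father, mother, *kids)]

rng = random.Random(1)
# The two QTLs are unlinked, so they are simulated independently of each other.
qtl1 = simulate_family_genotypes(0.3, 2, rng)  # illustrative MAF
qtl2 = simulate_family_genotypes(0.5, 2, rng)  # illustrative MAF
```

Because the QTLs are unlinked, calling the function once per locus is sufficient; linked loci would require modelling recombination, which is outside this sketch.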

When no population stratification is present, the trait values are simulated from a decomposition of the total trait variance V. We assume that

$$V = V_{a1} + V_{a2} + V_{aa} + V_e + V_p$$

where Vai is the additive major gene variance explained by QTL i, Vaa is the additive genetic variance explained by the gene-gene interaction between the two QTLs, Ve is the (non-shared) environmental variance and Vp is the polygenic variance. To simulate the trait values, we start by drawing a value from the univariate normal distribution with mean zero and variance Vp, N(0, Vp), for each founder. For each offspring, we independently draw a value from a univariate normal distribution with mean equal to the mean of the two parental polygenic values and variance Vp/2. Generating the polygenic component this way yields a polygenic variance of Vp for every family member and a covariance of φVp between the trait values of two family members, where φ equals twice their kinship coefficient. We can show this as follows.
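Assume that X is multivariate normally distributed with mean

$$\operatorname{E}(X) = (0, 0, 0, 0)^T$$

and variance

$$\operatorname{Var}(X) = \begin{pmatrix} V_p & 0 & 0 & 0 \\ 0 & V_p & 0 & 0 \\ 0 & 0 & V_p/2 & 0 \\ 0 & 0 & 0 & V_p/2 \end{pmatrix}.$$

We define

$$A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1/2 & 1/2 & 1 & 0 \\ 1/2 & 1/2 & 0 & 1 \end{pmatrix}.$$

Drawing from AX corresponds to simulating the polygenic component for a family with 2 parents and 2 offspring in the way described above: the first two components are the parental values, and each offspring value is the mean of the parental values plus an independent deviation with variance Vp/2.

By a general property of the multivariate normal distribution, AX is multivariate normally distributed with mean

$$A \operatorname{E}(X) = (0, 0, 0, 0)^T$$

and variance

$$A \operatorname{Var}(X) A^T = V_p \begin{pmatrix} 1 & 0 & 1/2 & 1/2 \\ 0 & 1 & 1/2 & 1/2 \\ 1/2 & 1/2 & 1 & 1/2 \\ 1/2 & 1/2 & 1/2 & 1 \end{pmatrix}.$$

The correlation matrix is thus twice the kinship matrix, as stated before.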
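The matrix identity behind this argument can be checked with exact arithmetic. The sketch below assumes that X collects four independent draws (two founders with variance Vp, two offspring deviations with variance Vp/2) and that A adds the parental mean to each offspring deviation. Variances are expressed in units of Vp, and the names sigma, A and twice_kinship are illustrative.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

half = F(1, 2)

# Var(X) in units of Vp: founders get 1, offspring deviations get 1/2.
sigma = [[1, 0, 0, 0],
         [0, 1, 0, 0],
         [0, 0, half, 0],
         [0, 0, 0, half]]

# Rows: father, mother, offspring 1, offspring 2.
A = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [half, half, 1, 0],
     [half, half, 0, 1]]

cov = matmul(matmul(A, sigma), transpose(A))

# Twice the kinship matrix for two unrelated parents and two full sibs.
twice_kinship = [[1, 0, half, half],
                 [0, 1, half, half],
                 [half, half, 1, half],
                 [half, half, half, 1]]

print(cov == twice_kinship)
```

Using fractions avoids any floating-point tolerance in the comparison, so the equality is an exact check of the covariance structure.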

Next, we add to each subject's simulated polygenic value an independent draw from the univariate normal distribution with mean zero and variance Ve, N(0, Ve).

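In a final step, we add values based on the following model:

$$G = \mu + a_1 x_1 + a_2 x_2 + a_{12} x_1 x_2, \qquad (1)$$

where x1 and x2 are the coded genotypes (-1, 0, 1) at QTL 1 and QTL 2 and G is the value added to the trait.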

The effects a1, a2 and a12 can be calculated from the given additive genetic variances Va1, Va2 and Vaa. The following formulae are taken from [1]:

$$V_{a1} = 2p_1(1-p_1)\,[a_1 + (p_2-(1-p_2))\,a_{12}]^2$$

$$V_{a2} = 2p_2(1-p_2)\,[a_2 + (p_1-(1-p_1))\,a_{12}]^2$$

$$V_{aa} = 4p_1(1-p_1)\,p_2(1-p_2)\,a_{12}^2$$

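From those, the effects follow up to the choice of sign of the square roots; taking the positive roots, we can derive that:

$$a_{12} = \sqrt{\frac{V_{aa}}{4p_1(1-p_1)\,p_2(1-p_2)}}$$

$$a_1 = \sqrt{\frac{V_{a1}}{2p_1(1-p_1)}} - (p_2-(1-p_2))\,a_{12}$$

$$a_2 = \sqrt{\frac{V_{a2}}{2p_2(1-p_2)}} - (p_1-(1-p_1))\,a_{12}$$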

In model (1), we specify values a1, a2 and a12 according to these formulae and put µ=0. We calculate the value of model (1) for each subject and add it to the trait value already simulated from Vp and Ve. Adding this part of the trait value adds Va1+Va2+Vaa to the variance of the trait values and π1Va1 + π2Va2 + π1π2Vaa to the covariance of the trait values of 2 individuals in the family, where π1 is the proportion of alleles shared IBD at locus 1 (analogously for π2) and the product π1π2 is the Hadamard product (element-wise multiplication) of the 2 matrices.
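The conversion between variance components and effects can be sketched as a round trip. This is an illustration under stated assumptions: model (1) is taken to have the form µ + a1·x1 + a2·x2 + a12·x1·x2 in the coded genotypes (the form implied by the variance formulae), the positive square roots are chosen when inverting, and the MAFs and target variances are illustrative values, not the settings of the study.

```python
import math

def variance_components(p1, p2, a1, a2, a12):
    """Va1, Va2, Vaa computed from the effects with the three formulae in the text."""
    va1 = 2 * p1 * (1 - p1) * (a1 + (p2 - (1 - p2)) * a12) ** 2
    va2 = 2 * p2 * (1 - p2) * (a2 + (p1 - (1 - p1)) * a12) ** 2
    vaa = 4 * p1 * (1 - p1) * p2 * (1 - p2) * a12 ** 2
    return va1, va2, vaa

def effects_from_variances(p1, p2, va1, va2, vaa):
    """Invert the formulae, taking positive square roots (an assumed sign convention)."""
    a12 = math.sqrt(vaa / (4 * p1 * (1 - p1) * p2 * (1 - p2)))
    a1 = math.sqrt(va1 / (2 * p1 * (1 - p1))) - (p2 - (1 - p2)) * a12
    a2 = math.sqrt(va2 / (2 * p2 * (1 - p2))) - (p1 - (1 - p1)) * a12
    return a1, a2, a12

def genetic_value(x1, x2, a1, a2, a12, mu=0.0):
    """Assumed form of model (1); x1 and x2 are genotypes coded -1, 0 or 1."""
    return mu + a1 * x1 + a2 * x2 + a12 * x1 * x2

p1, p2 = 0.3, 0.5                # illustrative MAFs
target = (0.10, 0.10, 0.05)      # illustrative Va1, Va2, Vaa
a1, a2, a12 = effects_from_variances(p1, p2, *target)
recovered = variance_components(p1, p2, a1, a2, a12)
```

Feeding the recovered effects back into the variance formulae should reproduce the target components up to rounding error, which is a convenient sanity check before simulating trait values.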

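We conclude that simulating the trait values as described above results in the following variance-covariance matrix for the trait values y_ij and y_ik of individuals j and k of family i:

$$\operatorname{Cov}(y_{ij}, y_{ik}) = \begin{cases} V_{a1} + V_{a2} + V_{aa} + V_e + V_p & \text{if } j = k, \\ \pi_1 V_{a1} + \pi_2 V_{a2} + \pi_1 \pi_2 V_{aa} + \varphi V_p & \text{if } j \neq k, \end{cases}$$

as is described in the paper.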
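The entries of this matrix can be evaluated directly from the variance components. The sketch below assumes the total variance on the diagonal and, off the diagonal, the covariance terms given in the text (π1Va1 + π2Va2 + π1π2Vaa for the major genes and φVp for the polygenic part). The function names and the numerical values are illustrative.

```python
def trait_variance(va1, va2, vaa, ve, vp):
    """Diagonal entry: total variance of the trait value of one individual."""
    return va1 + va2 + vaa + ve + vp

def trait_covariance(va1, va2, vaa, vp, pi1, pi2, phi):
    """Off-diagonal entry for two relatives.

    pi1, pi2: proportions of alleles shared IBD at QTL 1 and QTL 2.
    phi: twice the kinship coefficient of the pair.
    """
    return pi1 * va1 + pi2 * va2 + pi1 * pi2 * vaa + phi * vp

components = dict(va1=0.10, va2=0.10, vaa=0.05, vp=0.25)  # illustrative values
ve = 0.50                                                  # illustrative value

total = trait_variance(ve=ve, **components)

# Parent-offspring pair: exactly half of the alleles are shared IBD at each locus.
parent_offspring = trait_covariance(pi1=0.5, pi2=0.5, phi=0.5, **components)

# Sib pair sharing both alleles IBD at QTL 1 and none at QTL 2.
sibs = trait_covariance(pi1=1.0, pi2=0.0, phi=0.5, **components)
```

The environmental variance Ve enters only the diagonal, since it is not shared between family members.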

We chose to simulate the data this way to allow comparison with the QTDT paper [2].

Scenario 2: Population stratification

When simulating data in the presence of population stratification, we also start by simulating genotypic data for all subjects according to the specified MAFs of the QTLs and the Mendelian laws. The difference is that the MAFs of the QTLs now depend on the stratum. We simulate two strata and divide the families equally over the 2 strata. We do not take admixture into account. The specified MAFs are 0.1/0.3/0.4 for stratum 1 and 0.5 for stratum 2.

To simulate the trait values, we specify the effects a1, a2 and a12 in model (1) directly, instead of the variance decomposition of the trait. This way, we can measure the bias in the coefficients estimated by the epiQTDT method. We keep these parameters fixed across the 2 strata, but specify µ=1 for stratum 1 and µ=10 for stratum 2. Furthermore, we also specify the variances Ve and Vp.

The simulation of the trait values is analogous to the case of no population stratification, based on the specified a1, a2, a12, Ve and Vp. We note that the additive genetic variances Va1, Va2 and Vaa will differ between the 2 strata, since the MAFs of both QTLs differ between the strata.
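The dependence of the variance components on the stratum can be made concrete with the formulae for Va1, Va2 and Vaa. The sketch below keeps the effects fixed and only changes the MAFs. It assumes that both QTLs have the same MAF within a stratum (0.1 in stratum 1, 0.5 in stratum 2), and the effect sizes are illustrative values, not those of the study.

```python
def variance_components(p1, p2, a1, a2, a12):
    """Va1, Va2, Vaa computed from the effects with the formulae in the text."""
    va1 = 2 * p1 * (1 - p1) * (a1 + (p2 - (1 - p2)) * a12) ** 2
    va2 = 2 * p2 * (1 - p2) * (a2 + (p1 - (1 - p1)) * a12) ** 2
    vaa = 4 * p1 * (1 - p1) * p2 * (1 - p2) * a12 ** 2
    return va1, va2, vaa

effects = dict(a1=0.5, a2=0.5, a12=0.25)  # illustrative effects, equal in both strata

stratum1 = variance_components(0.1, 0.1, **effects)  # assumed MAF 0.1 at both QTLs
stratum2 = variance_components(0.5, 0.5, **effects)  # MAF 0.5 at both QTLs

for name, comps in (("stratum 1", stratum1), ("stratum 2", stratum2)):
    print(name, [round(v, 6) for v in comps])
```

With a MAF of 0.5 the mean coded genotype is zero, so the interaction effect a12 does not contribute to Va1 or Va2 in stratum 2, whereas it does in stratum 1.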

1. Tiwari HK: Deriving components of genetic variance for multilocus models. Genet Epidemiol 1997; 14: 1131-1136.

2. Abecasis GR, Cardon LR, Cookson WO: A general test of association for quantitative traits in nuclear families. Am J Hum Genet 2000; 66: 279-292.