Mathematical Modeling Using Hardy-Weinberg: modeling the effects of genetic drift on populations of different sizes

Purpose: The main purpose of this lab is to test the effects of genetic drift on populations of different sizes. To do this, you will work with your own mathematical model based on the Hardy-Weinberg principle. Mathematical models include parameters and assumptions. The model you will work with is a simplified one that is easily built in Excel.

Hypothesis: The effect of genetic drift on a population’s allele frequencies from one generation to the next is greater in smaller populations. One prediction we are testing is that if genetic drift has stronger evolutionary effects on smaller populations, then we would expect smaller populations to reach fixation of an allele in fewer generations than larger populations.

Methods:

1) One parameter you will add to your model is population size. For simplicity, work with populations of n = 5, 50, and 500. The letter ‘n’ stands for sample size, which here is the number of individuals in the population. Dedicate one Excel file to each population size. Save your files as Population 5, Population 50, and Population 500.

2) The general idea of this lab is as follows:

  1. In each population, you will start with allele frequencies of p = 0.5 and q = 0.5. In your model, you will allow ‘random’ mating for 10 generations.
  2. In generation 1, your population starts with p = 0.5 and q = 0.5. This means that there are equal proportions of the dominant (A) and recessive (B) alleles in the gene pool. (In class we called the recessive allele ‘a’, but for this model we will use ‘B’.) Individuals will mate randomly in generation 1 and produce zygotes (individuals) with specific genotypes (combinations of alleles, which could be AA, AB, or BB).
  3. You will calculate the genotype frequencies, using the total number of individuals/genotypes as we did in class, and use those frequencies to calculate the allele frequencies (p and q) for generation 2.
  4. You will repeat this process until you get to generation 10.
  5. You will then graph the relationship between p, the frequency of the dominant allele (A), and generation number. In your results section, you will have a total of three graphs, one for each population size, and you will use these graphs to discuss the effects of genetic drift on allele frequencies in populations of different sizes.
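
The generation-by-generation procedure above can also be sketched outside Excel. The Python snippet below is a minimal illustration and is not part of the lab files. The function name `simulate_drift`, the seed, and the shortcut of drawing 2n alleles directly (instead of building zygotes and counting genotypes) are assumptions made for brevity. Counting A alleles among the 2n drawn is equivalent to p = freq(AA) + 0.5 × freq(AB), which may differ from the calculation you use in the spreadsheet.

```python
import random

def simulate_drift(n, p0=0.5, generations=10, seed=None):
    """Return the list of p values for generations 1..generations.

    Assumption: each generation draws 2*n alleles at random from the
    previous generation's gene pool, and the new p is the fraction of
    A alleles among them.
    """
    rng = random.Random(seed)
    p = p0
    trajectory = [p]                      # generation 1 starts at p0
    for _ in range(generations - 1):
        # Each drawn allele is A with probability p, otherwise B.
        a_count = sum(rng.random() < p for _ in range(2 * n))
        p = a_count / (2 * n)
        trajectory.append(p)
    return trajectory

for size in (5, 50, 500):
    run = simulate_drift(size, seed=1)    # seed chosen arbitrarily
    print(size, [round(x, 2) for x in run])
```

Running it several times with different seeds gives a feel for how much more the n = 5 trajectory can wander than the n = 500 one, though any single run is subject to chance.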

3) Specific methods:

  1. In the ‘Population n = 5’ Excel file, add 9 sheets so that you have one sheet for each of your 10 generations. Do this by going to the bottom of the Excel file and clicking the + symbol next to the sheet labelled “Generation 1”. Label the new sheets “Generation 2”, “Generation 3”, … “Generation 10”. Next, add another sheet at the end and label it “Data Chart”.

  1. You will see that Generation 1 is already completed for you. Your allele frequencies are already set so that p and q both equal 0.5. You will also see that your gamete, zygote, and genotype data sections are already set up.

  1. Click on cell E5 and then press the F9 button. The allele may change. The file is set up such that when you press F9, alleles are randomly chosen from your gene pool. This simulates random mating. The chance of getting an A or a B allele depends on p and q, which are both set to 0.5 for generation 1.
  2. The ‘zygote’ column shows the genotype of the individual that results from the random mating. You will see that there are only 5 zygotes in your Population 5 data file. That’s because you are modelling genetic drift for a sample size of 5 individuals here.
  3. At the bottom of columns H-J, you will find a number that equals the total count for the corresponding genotype (AA, AB, or BB).
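
What pressing F9 does can be pictured with a short Python sketch. This is an assumption about the spreadsheet's behavior based on the description above (each gamete is A with probability p, otherwise B). The names `random_mating` and `count_genotypes` are hypothetical and do not come from the Excel file.

```python
import random
from collections import Counter

def random_mating(p, n, rng):
    """Return n zygote genotypes, each built from two randomly drawn gametes."""
    zygotes = []
    for _ in range(n):
        gametes = ["A" if rng.random() < p else "B" for _ in range(2)]
        # Sort so that "BA" and "AB" are recorded as the same genotype.
        zygotes.append("".join(sorted(gametes)))
    return zygotes

def count_genotypes(zygotes):
    """Tally AA, AB, and BB, like the totals under the genotype columns."""
    counts = Counter(zygotes)
    return {g: counts.get(g, 0) for g in ("AA", "AB", "BB")}

rng = random.Random()          # unseeded: every run differs, like pressing F9
zygotes = random_mating(p=0.5, n=5, rng=rng)
print(zygotes, count_genotypes(zygotes))
```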


  1. This number is not the genotype frequency; it is the total number of individuals with that genotype. You will use these numbers to calculate the frequency of each genotype. These frequencies correspond to p², 2pq, and q² in the Hardy-Weinberg equation.
  2. Using these newly calculated genotypic frequencies, you will calculate p and q. You will use these new p and q values for generation 2. Copy everything from the Generation 1 sheet and paste it into the Generation 2 sheet.

  1. Then you will enter that newly calculated p value into cell D2. The q value will automatically be calculated.
  2. For example, in generation 1, I started with p = 0.5. Based on the random matings in my Generation 1, I had 2 individuals who were AA. Using this number, I calculated p² = 0.4 (2 AA individuals out of 5). Taking the square root of this genotypic frequency, my new p value for generation 2 was 0.63. As you see in the figure above, I entered 0.63 for my new p value. The new q value was automatically calculated for me.
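
The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: it reproduces the 0.63 above if the new p is taken as the square root of the AA frequency, and that reading of the calculation is an assumption. The variable names are illustrative.

```python
import math

n = 5            # individuals in the population
aa_count = 2     # AA individuals observed in generation 1

aa_freq = aa_count / n          # genotype frequency of AA, i.e. p^2
p_new = math.sqrt(aa_freq)      # assumed step: p = sqrt(p^2)
q_new = 1 - p_new               # q follows from p + q = 1

print(aa_freq, round(p_new, 2), round(q_new, 2))   # 0.4 0.63 0.37
```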

  1. Once you enter your new p value for generation 2, click on cell E5 and press F9 again. This will simulate the random matings in your population for generation 2. Again, look at how many individuals were AA, AB, or BB. Using these numbers, calculate the genotypic frequencies (p², 2pq, and q²). Use the genotypic frequencies to calculate a new p value. Copy everything from the Generation 2 sheet and paste it into the Generation 3 sheet. Then enter the new p value that you calculated from Generation 2 in cell D2 of the Generation 3 sheet. Keep doing this until you get a new p value for Generation 10.
  2. Lastly, you will graph your data in the Data Chart sheet. In cell A1, enter ‘Generation’. In cell B1, enter ‘p’. In the rows below, list each generation number in column A and its p value in column B.

  1. Make a scatterplot of your data. What is the independent variable? What is the dependent variable? Make your chart by selecting all your data from A2 and B2 all the way to the last data row. Click on the ‘Insert’ tab and select the scatterplot chart type.
  2. Once you have your data chart, you may copy it and paste it into a Word file. Label this Word document as ‘Data Charts’ for your records. Make sure to indicate that this scatterplot belongs to a population size of n = 5. You will add 2 more data charts to this document, for populations n = 50 and n = 500.
  3. Congratulations! You finished modelling drift for a population size of n = 5 individuals. Now, you will do this all over again for populations n = 50 and n = 500. You will repeat the same steps, but each population will have its own Excel file, just so things are organized. Make scatterplots for each population and paste them into your ‘Data Charts’ Word document.
  4. Once you have all your charts, add a horizontal line to each chart at p = 0.5 that runs across all your generations. Click on ‘Insert’, ‘Shapes’, and click the second shape, which is a straight line. It should look like my charts below:
  1. This allows you to see how p fluctuates from the original p value in generation 1, and how these fluctuations may differ among populations of different sizes. Use these charts to help you interpret your results. Do your results match your predictions? What does it mean if they do or do not? If they don’t match, don’t simply say that your results don’t match your predictions. What inferences can you make from your data, in terms of how genetic drift affects allele frequencies over time in populations of different sizes?
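
When interpreting the charts, it can help to put numbers on what you see. The sketch below is one possible way to do that, assuming you have typed your ten p values into a Python list. The function name `summarize` and the example values are made up for illustration and are not results from the lab.

```python
def summarize(p_values, start=0.5):
    """Return (largest deviation from start, generation of fixation or None)."""
    max_dev = max(abs(p - start) for p in p_values)
    fixed_at = None
    for generation, p in enumerate(p_values, start=1):
        if p in (0.0, 1.0):          # one allele has been lost
            fixed_at = generation
            break
    return max_dev, fixed_at

# Hypothetical trajectory, for illustration only.
example = [0.5, 0.6, 0.8, 0.7, 0.9, 1.0, 1.0, 1.0, 1.0, 1.0]
print(summarize(example))   # (0.5, 6)
```

Comparing the two numbers across your three population sizes gives a compact way to state whether smaller populations drifted further from p = 0.5 or reached fixation sooner.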

Note: The points in your data charts may not look like mine. You will get different data based on random chance.