Week 3 Changes in Allele and Genotype Frequency

Over the last two tutorials, we learned the basics of R and how to manipulate, visualize and explore data. From this tutorial onwards, we will change direction a little and reinforce some of the concepts of evolutionary genetics that you have been learning during the class sessions. This doesn’t mean we are going to throw you in at the deep end and expect you to be completely relaxed in R; we will still take the opportunity to work through some of the concepts of the language we have already touched upon. Again, remember that it is perfectly fine to ask for assistance or to look up R code you don’t understand - we use R every day and turn to Google for solutions almost constantly!

For the bulk of this session, we will focus on exploring the Hardy-Weinberg (HW) model and testing for deviations from Hardy-Weinberg Expectation (HWE) using R code. You will recall that the HW model is an idealized model of the relationship between allele and genotype frequencies in the absence of demographic processes such as inbreeding and evolutionary processes such as genetic drift or selection. The model therefore acts as a null model for testing whether such processes have taken place. In other words, we can compare our real data to the expectation and draw inferences about which demographic or evolutionary processes might be acting in the real world. As well as allowing us to flex our R abilities, examining the HW model also requires us to perform some basic statistical analysis, particularly a goodness-of-fit test.
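To give a flavour of what this comparison looks like in practice, here is a minimal sketch in base R. The genotype counts are invented purely for illustration and the variable names (`obs`, `exp_counts`, `chi_sq`) are our own choices here; the code used later in the tutorial may look different. It uses the usual notation in which the two allele frequencies are p and q = 1 - p, so the expected genotype frequencies are p^2, 2pq and q^2.

```r
# Hypothetical genotype counts, invented for illustration only
obs <- c(AA = 50, Aa = 30, aa = 20)
n <- sum(obs)

# Frequency of allele A: each AA individual carries two copies,
# each Aa individual carries one, and there are 2n alleles in total
p <- (2 * obs[["AA"]] + obs[["Aa"]]) / (2 * n)
q <- 1 - p

# Expected genotype counts under HWE: (p^2, 2pq, q^2) times sample size
exp_counts <- c(AA = p^2, Aa = 2 * p * q, aa = q^2) * n

# Chi-squared goodness-of-fit statistic: sum of (observed - expected)^2 / expected
chi_sq <- sum((obs - exp_counts)^2 / exp_counts)

# One degree of freedom: 3 genotype classes - 1 - 1 estimated allele frequency
p_value <- pchisq(chi_sq, df = 1, lower.tail = FALSE)

print(round(exp_counts, 2))
print(chi_sq)
print(p_value)
```

A small p-value suggests that the observed genotype counts deviate from what the HW model predicts, although on its own it does not tell us which process is responsible.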

In addition to the HW model and testing for deviations from HWE, we will also learn how to simulate genetic drift in R. Here we will make use of the visualisation skills we learned with ggplot2 in Chapter 2 to recreate Figure 3.7 from the textbook. We hope that this will have the dual benefit of letting you play with population parameters in order to understand genetic drift and helping you develop your R programming skills in more detail.
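As a rough preview, a drift simulation can be sketched with a for-loop and binomial sampling. This is only a sketch under our own assumptions: it uses a simple Wright-Fisher-style model, and the population size, number of generations, starting frequency and number of replicate populations below are arbitrary values chosen for illustration, not the settings behind the textbook figure.

```r
set.seed(42)   # make the random draws reproducible

# Assumed parameters, chosen arbitrarily for illustration
n_ind <- 50    # diploid individuals, so 2 * n_ind allele copies
n_gen <- 100   # number of generations to simulate
p0    <- 0.5   # starting allele frequency
n_pop <- 5     # number of independent replicate populations

# One row per generation (plus the starting point), one column per population
freqs <- matrix(NA_real_, nrow = n_gen + 1, ncol = n_pop)
freqs[1, ] <- p0

for (g in seq_len(n_gen)) {
  # Each generation, 2 * n_ind allele copies are drawn at random,
  # with success probability equal to the current allele frequency
  copies <- rbinom(n_pop, size = 2 * n_ind, prob = freqs[g, ])
  freqs[g + 1, ] <- copies / (2 * n_ind)
}

# Allele frequency in each population after the final generation
print(freqs[n_gen + 1, ])
```

Changing `n_ind` is a quick way to see how population size affects the strength of drift. For a quick look at the trajectories in base R, `matplot(0:n_gen, freqs, type = "l")` draws one line per population; the same matrix could instead be reshaped into a data frame and plotted with ggplot2.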

What to expect

In this section we will:

  • learn about for-loops in R
  • explore the concept of Hardy-Weinberg Equilibrium using R
  • simulate genetic drift under the Hardy-Weinberg expectation
  • take a first look at how we can use R to program

The tutorial is divided into two parts:

  1. A section focused on general R programming, where you learn concepts that can be used across many fields and even programming languages.
  2. A section focused on using R for evolutionary biology, where you apply some of the concepts you’ve learned in this tutorial and earlier to solve problems in evolutionary biology.

In the second section, most of the code is quite straightforward, but you might encounter some more complicated code that you haven’t learned enough to understand yet. This will be clearly indicated, and we only expect you to have a general sense of what’s going on, not understand what every bit of the code does.