Week 4 assignment 3
1. Functions
You need to define the calc_geno()
function from the tutorial to complete this part of the assignment.
- Create a function with two arguments,
x
andy
, that checks whetherx
is larger thany
. You can decide yourself if you want the function to returnTRUE
/FALSE
or a string of text.
The goal of functions is to make life easier for yourself, by having to repeat yourself less, and making your code more readable and maintainable. In the R-part of the tutorial, we created the function calc_geno
, which uses \(p\) to calculate expected genotype frequencies under Hardy-Weinberg equilibrium. If you want to calculate this, but only have the genotype counts (number of individuals \(A_1 A_1\), \(A_1 A_2\) and \(A_2A_2\)), not \(p\), it makes sense to create a function that does this for you. If you do that, you can simply write:
Or even calc_geno(calc_p(geno_counts))
if you prefer! Notice how easy it is to follow what happens in the code above. This adheres to a principle called modular programming, where larger tasks are separated into smaller, more manageable tasks. Now it’s your turn to actually turn the above into working code by creatng the calc_p
function.
Create a function
calc_p()
that takes a vector of genotype counts as an argument and returns the allele frequency of \(A_1\) (i.e., \(p\)). To test your code, run the following:p
should be0.1875
, andgeno_freq
should be0.03515625 0.30468750 0.66015625
Optional: create a function called
geno_from_counts()
that takes a vector of genotype counts as an argument, and then calls firstcalc_p()
, thencalc_geno()
to get expected genotype frequencies directly from genotype counts.
For the rest of the assignment, feel free to use these functions, and create new functions if you like. Note: be aware that calc_geno()
calculates expected genotype frequencies, and cannot be used where you need observed genotype frequencies.
2. Fitness
- What are absolute and relative fitness? What do we mean by marginal fitness?
We sample a 325 individuals from a population of snails which have a brown/yellow polymorphism. This is controlled by a single locus B. B1B1 individuals are brown, so are B1B2 individuals but B2B2 is yellow. In our sample, 184 snails are B1B1, 60 are B1B2 and 81 are B2B2. We know from our experimental work that on average, brown individuals have 33 offspring, whereas yellow individuals have 24. Calculate the following:
the frequency of the B1 and B2 alleles
the relative fitnesses
the marginal fitness
the mean population fitness.
3. Selection
You need to define the functions selection_model()
and selection_sim()
to complete this part of the assignment.
selection_model <- function(p, rel_fit){
# define q
q <- 1 - p
# calculate genotype frequencies (under HWE)
gf <- c(p^2, 2*(p*q), q^2)
# calculate mean pop fitness
w_bar <- sum(rel_fit*gf)
# calculate marginal allele frequencies
w1 <- (p*rel_fit[1]) + (q*rel_fit[2])
w2 <- (p*rel_fit[2]) + (q*rel_fit[3])
# calculate freq of p in the next generation
p_t <- (p*w1)/w_bar
# make vector for output
output <- c(p = p, q = q, w_bar = w_bar,
w1 = w1, w2 = w2, p_t = p_t)
# return the results
return(output)
}
selection_sim <- function(p_init, rel_fit, ngen){
# Set first generation
mod_pars <- t(selection_model(p = p_init, rel_fit = rel_fit))
# simulate for n generations
for(i in 2:ngen){
mod_pars <- rbind(mod_pars, selection_model(p = mod_pars[i-1, "p_t"], rel_fit = rel_fit))
}
# make generations object
g <- 1:ngen
# return the result as a data frame
return(as.data.frame(cbind(g, mod_pars)))
}
Using R code we learned during the tutorial, simulate selection over 100 generations for a case where A1A1 has the highest fitness and the relative fitness of the genotypes A1A2 and A2A2 is 0.4. The initial frequency of the A1 allele is 0.01. Plot the frequency of p over the generations. Is A1 dominant, additive or recessive? What can we infer from the plot?
What is the difference between directional selection and balancing selection?