Week 9 Reconstructing the Past

How can we use genetics and genomics to understand the evolutionary history of organisms? Although evolutionary change can be rapid, it mainly occurs on a timescale that extends far beyond the average human lifespan, so we can rarely observe it directly. For this reason, we must turn to other means in order to reconstruct a picture of the evolutionary history of species and populations. In this tutorial, we will focus on two such methods. The first of these is phylogenetic analysis, a means of visualising the evolutionary relationships among species. We will then turn to principal components analysis (PCA), a popular method for examining the variation we see in genomic data and a first step towards understanding the processes that might have produced genetic structure within or between species.

What to expect

In this section we will:

  • learn some tools for visualising phylogenetic trees
  • learn how to create phylogenies
  • perform a PCA on genomic data

Getting started

As always, we need to set up our R environment. We’ll load tidyverse as usual, but we will also need a few more packages today to help us handle different types of data. Chief among these is ape, which is the basis for a lot of phylogenetic analysis in R. We will also load another phylogenetic package, phangorn (which has an extremely geeky reference in its name). We will use the package adegenet to perform some population genomic analyses and, since these are quite computationally intensive, we will also load parallel, a package that allows R to run computations in parallel to speed up analysis. Unlike the others, parallel is included with R itself, so it only needs to be loaded, not installed.

# clear the R environment
rm(list = ls())

# install new packages (parallel ships with R, so it is not installed here)
install.packages("ape")
install.packages("phangorn")
install.packages("adegenet")

# load packages
library(ape)
library(phangorn)
library(adegenet)
library(tidyverse)
library(parallel)
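
If you would like to confirm that everything is in place before moving on, a quick check along the following lines may help. This is only a minimal sketch and not part of the main workflow. The object names pkgs, installed, missing_pkgs and n_cores are made up for illustration. The check uses requireNamespace() to test whether each package can be found, and detectCores() from parallel to report how many cores R can see on your machine.

```r
# packages this tutorial relies on (names taken from the setup code above)
pkgs <- c("ape", "phangorn", "adegenet", "tidyverse", "parallel")

# requireNamespace() returns TRUE if a package is installed and can be loaded;
# it does not attach the package the way library() does
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# report anything that still needs installing
missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Still to install: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All packages are available")
}

# number of cores R can detect (may be NA on some systems)
n_cores <- parallel::detectCores()
message("Cores detected: ", n_cores)
```

If any package names are reported as missing, rerun the matching install.packages() line before continuing.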

With these packages installed, we are ready to begin!