Week 7 Speciation Genomics

Genomic data has revolutionised the way we conduct speciation research over the past decade. With high-throughput sequencing, it is now possible to examine variation at thousands of markers from across the genome. Genome-wide studies of genetic differentiation, particularly those using F_ST, have been used to identify regions of the genome that might be involved in speciation. The rationale is relatively simple: F_ST is a measure of genetic differentiation, and when species diverge in the presence of gene flow, we might expect genome regions underlying traits that prevent gene flow between species to show higher F_ST than regions that do not. In other words, genome scan analyses can, in principle, be used to identify barrier loci involved in the speciation process. This approach became extremely popular in many early speciation genomic studies, but it overlooked a crucial point: other processes, unrelated to speciation, can produce the same patterns in the genome. In this session, we will draw on our ability to handle high-throughput, whole-genome resequencing data to investigate patterns of nucleotide diversity, genetic differentiation and genetic divergence across a chromosome. We will examine what might explain some of the patterns we observe and learn that, while genome scans can be a powerful tool for speciation research, they must be used with caution.
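To make the F_ST rationale concrete, here is a minimal base R sketch for a single biallelic locus. It uses the textbook heterozygosity-based definition, F_ST = (H_T - H_S) / H_T, and assumes two populations of equal size with known allele frequencies. The function name fst_biallelic and the example frequencies are hypothetical and chosen only for illustration. This is not the estimator PopGenome uses, and it ignores sampling error.

```r
# Hypothetical helper (not part of PopGenome): F_ST for one biallelic locus,
# given the frequency of the same allele in two equally sized populations.
fst_biallelic <- function(p1, p2) {
  # mean allele frequency across the two populations
  p_bar <- (p1 + p2) / 2
  # H_S: mean expected heterozygosity within populations
  h_s <- (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
  # H_T: expected heterozygosity of the pooled population
  h_t <- 2 * p_bar * (1 - p_bar)
  # F_ST is undefined when the locus is fixed for the same allele in both
  ifelse(h_t == 0, NA_real_, (h_t - h_s) / h_t)
}

# made-up frequencies: a locus where the populations barely differ ...
fst_similar <- fst_biallelic(0.50, 0.55)  # about 0.0025
# ... and a locus close to fixed for alternative alleles
fst_barrier <- fst_biallelic(0.05, 0.95)  # about 0.81

print(c(similar = fst_similar, barrier = fst_barrier))
```

A locus with nearly identical frequencies gives an F_ST close to 0, whereas a locus close to fixed for alternative alleles gives a value close to 1. A genome scan looks for this contrast, but a high value on its own does not show that a locus is a barrier to gene flow.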

What to expect

In this section we will:

Learn tools for visualizing data with many dimensions
Contrast and compare genome-wide measures of F_ST among species
Examine variation in recombination rate and its influence on differentiation

Getting started

The first thing we need to do is set up the R environment. Today we’ll be using the tidyverse and PopGenome packages, both of which we installed and loaded in the last session.

# clear the R environment
rm(list = ls())
# load the packages used in this session
library(tidyverse)
library(PopGenome)