R tutorials for the course BIOS1140
R tutorials for the course BIOS1140 at the University of Oslo
5.5 Going further
Graham Coop’s notes on F statistics
A detailed tutorial on calculating population differentiation with several R-based population genetic packages
A nice, thorough exploration of the normal distribution using R functions and plotting