Week 9 assignment

1. Working with trees

  1. Draw a random tree with 23 taxa. Make it a radial (or “fan”) tree with an edge width of 1.5, in any colour other than black. Tip: to name the tree tips, look at the letters object that is built into R.

  2. Extract the Proaves clade from the bird.orders dataset and plot it. Then, test whether ducks and geese (Anseriformes) and chickens and turkeys (Galliformes) are monophyletic. Hint: Find out which node represents the last common ancestor of the Proaves.
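As a rough illustration of the functions these two questions lean on, here is a minimal sketch on toy trees rather than on bird.orders. It assumes the ape package is installed. The six-tip random tree, the Newick string, the colour and the object names are all made up for the example.

```r
library(ape)

set.seed(1)  # make the random topology reproducible

# letters is a built-in character vector: "a", "b", ..., "z"
tip_names <- letters[1:6]
random_tree <- rtree(n = 6, tip.label = tip_names)

# Fan layout with thicker, coloured edges
plot(random_tree, type = "fan", edge.width = 2, edge.color = "steelblue")

# A fixed toy tree so the clade checks below are deterministic
toy <- read.tree(text = "((a,b),(c,(d,e)));")

# Node number of the last common ancestor of tips d and e
anc <- getMRCA(toy, c("d", "e"))

# Pull out everything descending from that node
clade <- extract.clade(toy, anc)

is.monophyletic(toy, c("d", "e"))  # d and e form a clade on their own
is.monophyletic(toy, c("a", "c"))  # a and c do not
```

If you prefer to find a node by eye, calling nodelabels() right after plot() prints the node numbers on the tree.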

2. Primates

In the tutorial, we worked with a subset (Hominidae) of the primates data set. In this part of the assignment, you will run the same analyses on the full data set.

  1. Using the full primates dataset, calculate the pairwise distance matrix using the most suitable sequence model. Then, construct both UPGMA and NJ trees and plot them.

  2. Root the NJ tree with Mouse as the outgroup. What are the parsimony scores of the rooted NJ tree and the UPGMA tree, and which tree is more parsimonious?

  3. Look at the placement of Bovine in the rooted NJ tree. Is it what you would expect? What can explain this placement?
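The overall workflow for this part (distances, then trees, then rooting, then parsimony scores) can be sketched on a stand-in data set. The example below assumes the ape and phangorn packages and uses the woodmouse alignment that ships with ape, not the primates data. The K80 model and the choice of outgroup are arbitrary placeholders, not recommendations.

```r
library(ape)
library(phangorn)

data(woodmouse)  # stand-in alignment shipped with ape (class DNAbin)

# Pairwise distances under a chosen substitution model (K80 here is arbitrary)
d <- dist.dna(woodmouse, model = "K80")

tree_upgma <- upgma(d)  # rooted tree
tree_nj <- nj(d)        # unrooted tree

# Root the NJ tree on one tip; taking the first label is an arbitrary choice
outgroup <- tree_nj$tip.label[1]
tree_nj_rooted <- root(tree_nj, outgroup = outgroup, resolve.root = TRUE)

# Parsimony scores need the alignment as a phyDat object
aln <- as.phyDat(woodmouse)
score_upgma <- parsimony(tree_upgma, aln)
score_nj <- parsimony(tree_nj_rooted, aln)

plot(tree_upgma, main = "UPGMA")
plot(tree_nj_rooted, main = "NJ, rooted")
```

A lower parsimony score means fewer inferred changes, so the tree with the smaller score is the more parsimonious one.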

3. Village dogs PCA

  1. From the eigenvalues generated by our dogs_pca (i.e. the subset we worked with in the tutorial), how many principal components should we consider to account for approximately 10% of the variance?
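The arithmetic behind this kind of question can be sketched in base R. The eigenvalues and the 50% target below are hypothetical and chosen only to keep the numbers easy to follow. The sketch also assumes that the eigenvalue vector covers all the components you want to compare against. If the file holds only the top few components, the percentages are relative to those components alone.

```r
# Hypothetical eigenvalues, for illustration only (not the course data)
eigenval <- c(4, 3, 2, 1)

# Percentage of variance explained by each component
pve <- eigenval / sum(eigenval) * 100

# Running total across components
cum_pve <- cumsum(pve)

# Smallest number of components whose running total reaches a target
target <- 50  # hypothetical threshold, in percent
n_pc <- which(cum_pve >= target)[1]
```

The same pve vector also answers questions about individual components: sum(pve[1:2]), for example, is the share explained by the first two.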

For the rest of the assignment, we will be working with the PCA results and eigenvalues from the full village dogs data set. The PCA can be found here. The eigenvalues can be found here.

  1. Plot the PCA for the full village dogs data set, colouring the points by location.

  2. Which group is the most divergent? What other patterns can you see in the PCA?
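One possible way to colour points by group is sketched below in base R graphics. The tutorial may use a different plotting approach. The data frame is simulated, and its column names (PC1, PC2, location) and site labels are assumptions that will need adapting to the real PCA file.

```r
set.seed(42)

# Simulated stand-in for PCA output: two components and a location label
pca <- data.frame(
  PC1 = rnorm(30),
  PC2 = rnorm(30),
  location = rep(c("site_A", "site_B", "site_C"), each = 10)
)

# Convert the labels to a factor so each location maps to one colour
loc <- factor(pca$location)
cols <- c("darkorange", "steelblue", "forestgreen")

plot(pca$PC1, pca$PC2, col = cols[as.integer(loc)], pch = 19,
     xlab = "PC1", ylab = "PC2")
legend("topright", legend = levels(loc), col = cols, pch = 19)
```

With more locations than colours in cols, the vector of colours would need to be extended to one entry per factor level.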

To read in the eigenvalues, try using the scan() function. If you open the file, you can see that it’s just a plain text file with a single column of values, so read.table() could also work.
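A small sketch of the difference between the two readers, using a temporary file of made-up values in place of the real eigenvalue file (the file extension and the numbers are placeholders):

```r
# Write a small single-column file of hypothetical values to a temp location
path <- tempfile(fileext = ".eigenval")
writeLines(c("4.2", "2.1", "1.3"), path)

# scan() returns a plain numeric vector
ev_scan <- scan(path)

# read.table() returns a data frame; the values sit in its first column
ev_table <- read.table(path)[[1]]
```

Both objects hold the same numbers. scan() simply skips the step of pulling a column out of a data frame.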

  1. What proportion of the variance do the first two principal components explain?