Friday, November 16, 2007 - 00:51
As many of you know, I'm a big fan of do-it-yourself biology. Digital biology, the field that I write about, is particularly well-suited to this kind of fun and exploration. Last week, I wrote some instructions for making a phylogenetic tree from mitochondrial genomes. This week, we'll continue our analysis.
I wrote this activity, in part, because of an awful handout that my oldest daughter brought home last year. She presented me with an overly photocopied paper that showed cytochrome c protein sequences from several creatures. She said she was supposed to count the differences or something and make a tree. I guess this is a standard activity in high school biology, but there are so many better ways to do this sort of thing. Of course, my suggestion that we get the sequences from GenBank and make a tree with the computer was not popular. "Mooom! I have to do the assignment the way the teacher told me to do it, not do it better!" Sigh.
Moving onward: last week we got sequences, we got blast results, and I left it up to you to decipher those weird scientific names, figure out which animals were which, and draw a simple diagram showing how you thought they might be grouped together. You know, like Sesame Street. "One of these things is not like the other ..." I would group creatures like horses and donkeys together, primates together, you get the idea.
So what were the results? First, we have a graph showing our matching sequences ranked by score. In this case, the score is derived from the number of nucleotides that are identical to our query sequence. You can see that the mitochondrial genome isn't terribly large, and that parts of it are similar in all the creatures we tested, while certain parts show lots of variability. (Which parts do you think encode proteins required for metabolism?)
There is also a table that presents blast scores and other values. I sorted the results in that table by clicking the column called "Total score." This value comes from adding up the scores for all the regions where the pairs of sequences match. You can see that the Pan paniscus (bonobo), Pan troglodytes (chimpanzee), and Gorilla mitochondrial sequences match the human sequence the best. And you can see that they're very similar, 91-94% identical in fact.
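If you'd like to see the arithmetic behind numbers like "percent identical" and "Total score," here is a minimal Python sketch. It assumes the two sequences are already aligned to the same length, and the short fragments and region scores are made up for illustration; they are not real mitochondrial data. Real blast scores also account for mismatches and gaps, so treat this as the classroom version of the idea, not a description of what blast does internally.

```python
def count_differences(seq_a, seq_b):
    """Count positions where two already-aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions that are identical."""
    matches = len(seq_a) - count_differences(seq_a, seq_b)
    return 100.0 * matches / len(seq_a)


def total_score(region_scores):
    """Add up the scores of every matching region between two sequences."""
    return sum(region_scores)


# Hypothetical 20-nucleotide fragments (invented for this example).
query = "ATGACCAACATTCGAAAATC"
subject = "ATGACCAATATTCGTAAATC"

print(count_differences(query, subject))  # 2
print(percent_identity(query, subject))   # 90.0
print(total_score([1200, 850, 430]))      # 2480 (made-up region scores)
```

Counting differences is exactly what the paper handout asks for. The computer just does it over thousands of positions instead of a few dozen.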
Now, let's make a tree. We click the link that says "Distance tree of results" and we get a phylogenetic tree.
Pretty cool, huh?
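For the curious, here is a rough sketch of how a tree can be built from pairwise distances. It uses UPGMA, a simple textbook clustering method that I've swapped in because it fits in a few lines; I'm not claiming it is the method behind the "Distance tree of results" link. The distances below are invented so that the example runs, and the function name `upgma` is my own.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a tree by repeatedly joining the two closest clusters (UPGMA).

    dist maps frozenset({a, b}) to the distance between a and b.
    Returns the tree topology as a Newick-style string without branch lengths.
    """
    clusters = {name: 1 for name in names}  # cluster label -> number of members
    d = dict(dist)
    while len(clusters) > 1:
        # Find the closest pair of clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        merged = f"({a},{b})"
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        # Distance to the new cluster is the size-weighted average.
        for other in clusters:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters)) + ";"


# Made-up distances (fraction of positions that differ), for illustration only.
toy = {
    ("human", "chimp"): 0.09, ("human", "gorilla"): 0.11,
    ("chimp", "gorilla"): 0.11, ("horse", "donkey"): 0.07,
    ("human", "horse"): 0.30, ("human", "donkey"): 0.30,
    ("chimp", "horse"): 0.30, ("chimp", "donkey"): 0.30,
    ("gorilla", "horse"): 0.30, ("gorilla", "donkey"): 0.30,
}
names = ["human", "chimp", "gorilla", "horse", "donkey"]
dist = {frozenset(pair): value for pair, value in toy.items()}

tree = upgma(names, dist)
print(tree)  # ((horse,donkey),(gorilla,(human,chimp)));
```

The nested parentheses are the tree: horse and donkey pair up first, human and chimp pair up next, and gorilla joins the primates before the two groups meet at the root.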
(FYI: I did try some other methods for making this kind of tree, but I found that both Clustal and Phylip can be really finicky on school computers, and these sequences are too big to be analyzed by someone else's web service. I did try JalView and Clustal but gave up after an hour. Research-grade methods aren't always robust enough for a classroom.)
Do the placements of these creatures match the way you grouped them together? Would you like to add more creatures to the tree? If there's a creature that you want to add, write it in the comments and I'll find the accession number (if its mitochondrial genome has been sequenced, that is). Or you can go to the NCBI, find mitochondrial genomes, and get the accession number yourself. You can add to the list I wrote last week and see what you get.