Thursday, July 12, 2007 - 06:39
By now, many of you have probably seen the new BLAST web interface at the NCBI. There are many good things that I can say about it, but there are a few others that caught me by surprise during my last couple of classes.
Because of these changes, and because I'm giving a workshop for teachers on BLAST at the Fralin Biotechnology Conference in Blacksburg, VA, next week, it seemed like a good time to update our animated BLAST tutorial at Geospiza Education and save myself some trouble.

I originally created the BLAST for beginners tutorial to accompany an activity called "BLASTing through the kingdom life." In this activity, students use blastn to compare an unidentified DNA sequence with sequences in the nr database. Some of the goals in this activity are for students to identify the unknown DNA sequence and to identify related sequences in other organisms. Personally, I think it's pretty cool to find that a DNA sequence from a frog, for example, has a counterpart in a chicken.

The tutorial is also good for an activity that I call "Head, Shoulders, Knees, and Toes," where students identify unknown sequences and look for tissue-specific or developmentally specific gene expression.

Anyway, the new interface makes this slightly more complicated than it used to be.

No more doing things by default

To summarize the first set of changes, we can't use the default settings in BLAST anymore. In the earlier incarnation, NCBI's BLAST server automatically searched a database with a large and varied collection of sequences. Now, we have to pay attention. If we wish to look at sequences that come from organisms other than humans (and we do!), we must choose the proper database.

We also need to adjust the stringency of the search. The current default setting for a nucleotide BLAST search asks that the sequences match pretty closely. This works great if we are looking for a sequence that's almost identical to the sequence we have, but if we want to find a related sequence from a different organism, we won't find it by using the default setting.

In the tutorial, I show how to change these parameters and present a summary afterwards listing what's been changed.
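For anyone who wants to see why a stringent search can miss a related sequence, here is a toy sketch in Python. It is not BLAST's actual algorithm. It only counts the exact "words" that two sequences share, which is roughly how a search gets its starting points. The sequences are made up, the helper names (`mutate`, `shared_words`) are hypothetical, and the word sizes 11 and 28 are the commonly cited defaults for blastn and megablast, used here as assumptions rather than as the exact settings on the NCBI page.

```python
# Toy illustration only: counts shared exact words, nothing more.

def mutate(seq: str, positions: list[int]) -> str:
    """Return seq with a different base substituted at each given position."""
    swap = {"A": "C", "C": "A", "G": "T", "T": "G"}
    bases = list(seq)
    for pos in positions:
        bases[pos] = swap[bases[pos]]
    return "".join(bases)


def shared_words(query: str, subject: str, word_size: int) -> int:
    """Count distinct words of length word_size found in both sequences."""
    query_words = {
        query[i:i + word_size] for i in range(len(query) - word_size + 1)
    }
    subject_words = {
        subject[i:i + word_size] for i in range(len(subject) - word_size + 1)
    }
    return len(query_words & subject_words)


# A made-up 40-base query and a "relative" that differs at three positions.
query = "ATGGCGTACGTTAGCCTAGGATCCGATTACGGCATTAGCA"
related = mutate(query, [12, 25, 38])

# Long words (stringent): no exact 28-base stretch survives the differences.
print("word size 28:", shared_words(query, related, 28))

# Short words (relaxed): some 11-base stretches still match exactly.
print("word size 11:", shared_words(query, related, 11))
```

With the long word size, the related sequence offers no starting point at all, even though it matches the query at 37 of 40 positions. With the shorter word size, a few exact stretches remain for the search to build on.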
It's not hard, and I suppose at some level it's just as well to have to do this because you have to think about what you're doing a bit more carefully than before. But you do have to remember to do it. It's like looking at a gel box to make sure that the right electrodes are plugged in at the correct ends. You get used to it. And it does add one more place for things to go wrong when your students do a search.

More and more information

The new BLAST results pages no longer present some bits of information (like the gi number), but some of the information that you do get is more useful, like the query coverage and Max % identity. I added a page to define these terms.

You can find worksheets, sets of taxonomically diverse or tissue-specific unknown sequences, and the BLAST for beginners tutorial all right here. Enjoy!
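For readers who like to see the arithmetic behind the two result columns mentioned above, here is a minimal Python sketch. It assumes the usual definitions: percent identity is the number of identical positions divided by the alignment length, and query coverage is the share of the query that falls inside the alignment. The function names and the example sequences are hypothetical, and the real results page combines several aligned segments per hit (Max % identity is the highest value among them), which this sketch does not attempt.

```python
def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Percent of alignment columns where both sequences have the same base.

    Both strings must be the same length; '-' marks a gap.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must be the same length")
    matches = sum(
        q == s and q != "-" for q, s in zip(aligned_query, aligned_subject)
    )
    return 100.0 * matches / len(aligned_query)


def query_coverage(query_length: int, align_start: int, align_end: int) -> float:
    """Percent of the query covered by one alignment (1-based, inclusive)."""
    return 100.0 * (align_end - align_start + 1) / query_length


# Made-up alignment: 10 columns, one mismatch in the fifth column.
aligned_query = "ACGTACGTAC"
aligned_subject = "ACGTTCGTAC"
print(percent_identity(aligned_query, aligned_subject))  # 90.0

# The same alignment covers bases 1 to 10 of a 40-base query.
print(query_coverage(40, 1, 10))  # 25.0
```

A hit like this one would look strong on identity but weak on coverage, which is a useful distinction for students to notice when they compare hits.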