Wednesday, June 18, 2008 - 01:19
In a recent post, I wrote about an article that I read in Science magazine on the genetics of learning. One of the things about the article that surprised me quite a bit was a mistake the authors made in placing the polymorphism in the wrong gene. I wrote about that yesterday. The other thing that surprised me was something that I found at the NCBI.

The article definitely contained a mistake, and I don't understand why it wasn't caught by the reviewers. I found it pretty quickly by searching OMIM, and I was only trying to find information about dopamine, not verify results. Anyway, the authors of the paper assumed that the polymorphism they were studying mapped to the dopamine D2 receptor gene. They were wrong.

The more surprising, and more disappointing, finding was the extent of wrong information in the Gene database at the NCBI. I thought the Gene database was curated, not necessarily by the NCBI staff, but at least by the community. Well, if the community is curating the information, they're doing a very poor job. I found about 31 papers in the GeneRIF section with what I think is the wrong information. All of these papers concern an allele (the TaqI A allele) that many people thought mapped to the dopamine D2 receptor gene. According to a 2004 paper by Neville, Johnstone, and Walton, it does not. They claim that this polymorphism maps to a different gene, yet the GeneRIFs still cite lots of other papers that discuss the TaqI A allele of the dopamine D2 receptor.

A quick look at the instructions shows that the problem bit of information, the GeneRIF, or Gene Reference Into Function, may have been entered by the authors themselves or by someone else who read those papers. I understand finding incorrect information in PubMed. We all know that science is a process and it's wise to be skeptical when you see the first publication on a topic.
But somehow I had the impression that the Gene database was curated and that the information in it had been verified. I can also see that the NCBI has a mechanism for the scientific community to correct information that's been entered incorrectly. Apparently no one has gone back and corrected this, and now that I'm looking at the page for submitting corrections and realizing the extent of the problem, I'm not so sure that I want to spend the time on this either.

I don't even work in this area. Unlike the researchers who submitted these references and were paid for their work, and the journals that reviewed the work and were paid by subscribers, I'm just an innocent bystander. I can see that 31 citations will take a bit of time to correct, and I wish I hadn't even found this problem, because now I'm torn between a possible obligation to a larger community, who will neither value my contribution nor be pleased with my input, and the temptation to forget I ever saw the problem in the first place. Damn.

UPDATE: here's a picture from the human DRD2 record in the Gene database showing some of the incorrect references. These are misleading because a naive reader would miss that the correct gene location was identified 4 years ago and would think that this polymorphism maps to the DRD2 gene when it doesn't.