Science and its interpretation are wonderful. Today I saw a post on Twitter from @LAbizar, referencing an @GEN post that stated "8.2% of Human DNA is Functional," with a link to a GEN article: "Surprise: Only 8.2% of Human DNA Is Functional." The GEN writeup cited a PLoS Genetics article, "8.2% of the Human Genome Is Constrained: Variation in Rates of Turnover across Functional Element Classes in the Human Lineage."
How much of the human genome is functional?
In 2012, the ENCODE (Encyclopedia of DNA Elements) project published a landmark summary, "An integrated encyclopedia of DNA elements in the human genome," from nine years of work measuring the ways in which DNA structure and its interactions with proteins such as transcription factors might contribute to the regulation of genes. In the paper's abstract the team stated that "These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions." As only a very small fraction of the genome (~1%) encodes protein sequences, a long-standing question in science has been: what does the other 99% do? ENCODE data demonstrated that much of this DNA participates in biochemistry in some way. Many lauded the work as a tour de force, and the resources it contributed have been significant.
Others disagreed with ENCODE's findings. One of the first criticisms was an article entitled "On the Immortality of Television Sets: “Function” in the Human Genome According to the Evolution-Free Gospel of ENCODE." In the abstract the authors reject the ENCODE thesis, claiming that the evolutionarily constrained regions make up less than 10% of the genome's total DNA. They close their abstract by generating additional controversy: "The ENCODE results were predicted by one of its authors to necessitate the rewriting of textbooks. We agree, many textbooks dealing with marketing, mass-media hype, and public relations may well have to be rewritten." This criticism was followed by the more detailed PLoS Genetics analysis cited above, which examines the fraction of the genome that is evolutionarily constrained in mammals and provides an extrapolated estimate that 8.2% (with a range of 7.1-9.2%) of the genome is conserved.
So, what's the correct answer?
As usual, the correct answer depends on the kind of question you are asking, how you define function, and what kind of attention you seek. It also depends on how you interpret what others say. For example, the ENCODE data describe interactions between proteins and DNA, and count transcribed bases as functional. The proteins can be contained in nucleosomes that pack DNA, or can be factors that bind promoters or operators to activate or inhibit gene expression. We also know that large-scale DNA structure is important and contributes to the activity of enhancers. Finally, non-coding genes - herein defined as those that are transcribed to produce different kinds of non-coding RNA - are much more abundant than we once thought; hence 80% of the genome can be annotated as functional. The ENCODE team found that 15-20% of the genome can bind something or be accessible, and there are often correlations between different measurements that indicate a useful role for these regions of DNA.
On the other hand, the fraction of the genome that is conserved between different mammalian species is small. It is even smaller if you examine non-mammalian species. However, a challenge with using species conservation as the rule for defining function is that it does not accommodate continual evolution very well. After all, we [humans] are not mice. Indeed, the senior author of the PLoS paper commented in the GEN article:
“This is in large part a matter of different definitions of what is 'functional' DNA,” says joint senior author Chris Ponting, Ph.D., of the MRC Functional Genomics Unit at Oxford University. “We don't think our figure is actually too different from what you would get looking at ENCODE's bank of data using the same definition for functional DNA.”
So, it's a matter of definition - and interpretation. When the PLoS title is read closely, it simply says that 8.2% of the genome is constrained, with turnover rates that vary across functional element classes; it does not say that only 8.2% of the genome is functional, as the GEN article states - which might be about seeking attention.
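To make the point about definitions concrete, here is a minimal toy sketch in Python. Everything in it is an assumption made up for illustration: the 1,000-base "genome," the interval coordinates, and the names `covered_fraction`, `transcribed`, `protein_bound`, and `conserved` are hypothetical and are not taken from ENCODE or the PLoS paper. It only shows that the same sequence can yield very different "functional" fractions depending on which annotations you count.

```python
def covered_fraction(intervals, genome_length):
    """Fraction of bases covered by the union of half-open [start, end) intervals."""
    covered = 0
    current_end = 0
    for start, end in sorted(intervals):
        start = max(start, current_end)  # skip bases already counted
        if end > start:
            covered += end - start
            current_end = end
    return covered / genome_length


GENOME_LENGTH = 1000  # toy genome length, not a real genome size

# Hypothetical annotations, invented for illustration only
transcribed = [(0, 600), (650, 800)]
protein_bound = [(550, 700), (900, 950)]
conserved = [(100, 150), (700, 730)]

# "Biochemical" definition: any base that is transcribed or protein-bound
biochemical = covered_fraction(transcribed + protein_bound, GENOME_LENGTH)

# "Constrained" definition: only bases conserved across species
constrained = covered_fraction(conserved, GENOME_LENGTH)

print(f"functional by biochemical activity: {biochemical:.0%}")
print(f"functional by conservation:         {constrained:.0%}")
```

With these made-up intervals the two definitions give 85% and 8% for the same toy sequence. Neither number is wrong; they answer different questions.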