When researchers working on the Human Genome Project completely mapped the genetic blueprint of humans in 2001, they were surprised to find only around 20,000 genes that produce proteins. Could it be that humans have only about twice as many genes as a common fly? Scientists had expected considerably more.
Now, researchers from 20 institutions worldwide bring together more than 7,200 unrecognized gene segments that potentially code for new proteins. For the first time, the study makes use of a new technology to find possible proteins in humans—looking in detail at the protein-producing machinery in cells. The new study suggests the gene discovery efforts of the Human Genome Project were just the beginning, and the research consortium aims to encourage the scientific community to integrate the data into the major human genome databases.
Note that so-called orphan genes have been discussed in depth at Evolution News:
Orphan genes (sometimes called ORFan genes in bacteria) are those open reading frames that lack identifiable sequence similarity to other protein-coding genes. Lack of similarity is hard to prove, given the size of the genomic universe. Methods vary from researcher to researcher, so each study needs to be evaluated carefully. There is also always the possibility that any given ORF has no function. No doubt some orphan genes will prove to be artifacts of incomplete evidence (see below). But orphan genes are a reality, nonetheless, based on numerous and substantial studies.
Thus, the existence and prevalence of orphan genes raises a number of significant questions.…
Then there is the elephant in the room that evolutionary biologists don’t want to acknowledge. Perhaps we see so many species- and clade-specific orphan genes because they are uniquely designed for species- and clade-specific functions. Certainly, this runs contrary to the expectation of common descent.
Continuing with the Phys.org article…
New gene sequences remained out of reach
In the past few years, thousands of frequently very small open reading frames (ORFs) have been discovered in the human genome. These are spans of DNA sequence that may contain instructions for building proteins.
Traditionally, protein-coding regions in genes have been identified by comparing DNA sequences from multiple species: the most important coding regions have been preserved during animal evolution. But this method has a drawback: coding regions that are relatively young, i.e., that arose during the evolution of primates, fall through the cracks and are therefore missing from the databases.
So now the task is to integrate the largely ignored ORFs into the largest reference databases, because researchers have so far had to specifically search for them in the literature if they wanted to study them.
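For readers who want a concrete picture of what an open reading frame is, here is a minimal sketch in Python. It is only an illustration and not the method used in the study, which relies on experimental evidence from the protein-producing machinery in cells. The function name `find_orfs` and the `min_codons` threshold are hypothetical choices made for this example. The sketch assumes the standard start codon ATG and the three standard stop codons, and it scans only the forward strand.

```python
# Illustrative sketch only: a naive scan for open reading frames (ORFs).
# Assumptions (not from the article): ATG start codon, standard stop codons,
# forward strand only, and a hypothetical minimum length in codons.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return sorted (start, end, sequence) tuples for ATG-to-stop spans."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):  # the three forward reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # remember the first start codon in this frame
            elif start is not None and codon in STOP_CODONS:
                end = i + 3  # include the stop codon in the span
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, dna[start:end]))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


print(find_orfs("CCATGAAATGGTAACC"))
# → [(2, 14, 'ATGAAATGGTAA')]
```

A scan like this flags any stretch that could in principle be read as a protein recipe, which is why it finds far more candidates than are actually translated. Deciding which candidates are real requires additional evidence, such as the cross-species comparison described above or direct observation of protein production.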
ORFs likely play a role in common diseases
Dr. Sebastiaan van Heesch, group leader at the Princess Máxima Center for pediatric oncology, says that their “research marks a huge step forward in understanding the genetic make-up and complete number of proteins in humans. It’s tremendously exciting to enable the research community with our new catalog. It’s too soon to say whether all of the unexplored sections of DNA truly represent proteins, but we can clearly see that something unexplored is happening across the human genome and that the world should be paying attention.”
“It is especially remarkable that most of these 7,200 ORFs are exclusive to primates and might represent evolutionary innovations unique to our species,” reports Jorge Ruiz-Orera, an evolutionary biologist working in Hübner’s lab at the MDC. “This shows how these elements can provide important hints of what makes us humans.”
Read the complete article at Phys.org.
Another “elephant in the room” is the question of how the significant amount of information needed for novel protein-coding regions could have arisen without intelligent design.