All that follows is from Dr. JDD:
Hopefully the first of these two posts has sufficiently introduced the simplified concept of AltORFs that overlap existing genes. It appears to me that this is an area of research vastly underrepresented not only in the literature, but also in the minds and understanding of many PhD-level scientists today. I think the very fact that it is barely mentioned in papers such as those referenced previously (Nature publications on the human proteome, for example) illustrates this point to a degree.
The paper I wish to discuss is this one from 2013:
An out-of-frame overlapping reading frame in the ataxin-1 coding sequence encodes a novel ataxin-1 interacting protein. Bergeron D, et al. J Biol Chem. 2013 Jul 26;288(30):21824-35
ATXN1 is a gene that encodes a protein (ataxin-1) implicated in neurodegeneration. In particular, the expansion of a polyglutamine (poly-Q) repeat in the middle of the ~815 amino acid protein is a hallmark of spinocerebellar ataxia type 1 (you can read more about this condition online). As a side note – this is quite common in several neurodegenerative diseases: a repeat expansion of a triplet codon that encodes a specific amino acid, usually Q. There is often some variation in how many Q’s are present in such a region from one person to the next, but above a certain number of repeats a degenerative phenotype is observed, with severity of disease usually associated with the number of repeats. Understanding the regulation of the gene/protein, and of other genes/proteins associated with it, is therefore essential for better understanding the disease.
The normal function of ataxin-1 was unclear; however, it is known to bind RNA and several transcription factors, and it localises to the cell nucleus. The researchers examined the genetic sequence of ATXN1 and noticed a potential overlapping ORF (open reading frame) that spanned from bp 30 up to bp 587.
They then elucidated the sequence of what they found was the dominant product of this AltORF, which was around 180aa in length (compared with ~815aa for ataxin-1). This AltORF was an alternative frame read that started on the 3rd codon of the normal ORF. As a result, here is the protein sequence that is yielded from this AltORF:
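For readers who want to see what "noticing a potential overlapping ORF" involves in practice, here is a minimal Python sketch. It is not the method used in the paper, and the sequence is a made-up toy, not ATXN1; the names find_orfs, span_codons and TOY are my own and purely illustrative. It scans the three forward reading frames for stretches that run from an ATG to the next in-frame stop codon.

```python
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=2):
    """Scan the three forward frames for ATG...stop ORFs.

    Returns tuples (frame, start, end, n_codons) with 1-based inclusive
    coordinates; 'end' includes the stop codon, 'n_codons' does not.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens the ORF
            elif start is not None and codon in STOPS:
                n_codons = (i - start) // 3    # coding codons, stop excluded
                if n_codons >= min_codons:
                    orfs.append((frame + 1, start + 1, i + 3, n_codons))
                start = None                   # look for the next ORF
    return orfs

def span_codons(start, end):
    """Whole codons in a span, assuming both coordinates are inclusive."""
    return (end - start + 1) // 3

# Toy sequence (invented for illustration): a frame-1 ORF with a shorter
# frame-3 ORF nested inside it.
TOY = "ATGCAATGGCTCAGACCTGACGTTTAA"

for orf in find_orfs(TOY):
    print(orf)

# Sanity check on the coordinates quoted in the post (assumption: bp 30
# and bp 587 are both inclusive).
print(span_codons(30, 587))
```

On the toy sequence this reports one ORF in frame 1 covering the whole sequence and a second, shorter one in frame 3 nested inside it. Under the inclusive-coordinate assumption, bp 30 to bp 587 works out to 186 codons, which is in line with the "around 180aa" product described in the post; whether that span counts the stop codon is not stated there.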
If we take the amino acid sequence translated from the same region of the consensus ORF we get this sequence:
As you can see, these do not align at all. If you run the alt-ATXN1 amino acid sequence through a simple BLASTp against nr sequences (default parameters), nothing in the database aligns with it with any measurable homology. It therefore appears to be unique. It does not even align with any part of the ATXN1 protein (which is not unexpected, given that it is read in the +3 reading frame rather than the frame of the normal ORF).
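To make the frame point concrete, here is a small Python sketch that translates one stretch of DNA in frame 1 and again in frame 3, then counts position-by-position matches. The sequence is an invented toy, not ATXN1, and the helper names translate and identity are hypothetical; a single toy example proves nothing general, it only shows why the same nucleotides read in a different frame need not resemble the original protein.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(seq, offset=0):
    """Translate seq codon by codon from 'offset'; stops appear as '*'."""
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)

def identity(a, b):
    """Fraction of identical residues over the shorter of two peptides
    (naive ungapped comparison, not a real alignment)."""
    n = min(len(a), len(b))
    return sum(x == y for x, y in zip(a, b)) / n if n else 0.0

# Toy sequence, invented for illustration only.
toy = "ATGCAATGGCTCAGACCTGACGTTTAA"

ref = translate(toy, 0)   # frame 1, the "normal" reading
alt = translate(toy, 2)   # frame 3, shifted by two nucleotides

print(ref)
print(alt)
print(identity(ref, alt))
```

Here the two readings share no residues at any position. BLASTp does far more than this naive comparison, of course, so treat it only as an illustration of the principle.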
So the researchers show that this protein is real, it is expressed in the cell, and most astonishingly, it interacts with the ataxin-1 protein. Not only does it localise to the same place (nucleus) but there is clear interaction between the two proteins that appears to be specific.
Now let us take a step back for one second. Why is this astonishing? I am sure many of our materialist friends will say this is not astonishing at all (despite the researchers using that common word “surprisingly” we so often read). They will maintain of course that it co-evolved as it is within the same gene, so it would only make sense for it to be transcribed and translated at the same time as the normal ATXN1, as it is on the same transcript. But do not let such playing down of this issue fool you – think about the complexity and impact of a 180aa protein with no homology to an 815aa protein co-evolving, where the evolution of one inherently impacts that of the other (in that region), yet they seem to be functioning together. If you can remove your bias from the evolutionary paradigm and not shout “co-evolution” then you can see the astonishing nature of this unique gene within a gene. This is the stuff programmers dream of – embedding useful, functional and meaningful code within code. If that does not amaze you I do not know what will in molecular biology, quite frankly!
Speaking qualitatively, we must come back to the point that this is an example of information that is both complex (dual code layering) and specified (shared functionality – proteins interacting with each other and localising to the same place). I can give you no fancy formulas, probabilities nor statistical significance analysis. However, consider the arguments of Douglas Axe, Michael Behe, Stephen Meyer and others, including people here at UD, who have looked objectively at the probabilities of unguided random processes generating a single protein of, say, the size of ataxin-1 (~815aa long) that is complex and clearly has a specific role and function in the cell. If you lean towards those arguments, such work casts doubt on the ability of a protein like this to arise in the proposed manner. When you then add another layer of complexity to that – well, the probabilities do not really need to be calculated (I am not sure how they can be calculated). I think this is a case of “data speaking for itself.” I know most/all materialists will disagree with me here – but I write this more for sharing the fascination of biology with those who will be able to see how astonishingly complex our genes are, and think objectively about the implications.
So, let us return to the paper. The researchers do demonstrate that the Alt-ATXN1 protein interacts with ataxin-1. Another striking feature is that it does not contain a classical nuclear localisation signal (NLS), yet it is clearly transported back into the nucleus after translation in the cytoplasm. Here too they show a novel way in which Alt-ATXN1 is transported to the nucleus, which allows it to correctly localise with ataxin-1 and interact with it.
It should be emphasised that interaction does not automatically mean functionality or an essential role. However, the researchers do show a specific interaction of the two proteins, and they show that significant amounts of each are expressed. The products of some quite small AltORFs are thought to be broken down almost instantly after they are made, and so have a very short cellular half-life. This does not seem to be the case here. The paper therefore presents several lines of evidence that would suggest this Alt-ATXN1 has a cellular role to play or fulfil. The authors suggest from some of these results that Alt-ATXN1 appears to be a “mediator of the function of ATXN1 in normal and/or pathological conditions.” They also demonstrate RNA-binding activity for the Alt-ATXN1 protein. Therefore, it would be foolish and naive to conclude that this Alt-ATXN1 protein is insignificant, merely the by-product of an aberrant event in which an out-of-frame coding sequence is accidentally translated.
In summary of this work we see several things that are significant when we consider random mutations within genes globally. Several key questions therefore arise:
1) How common are such AltORFs?
2) How many AltORFs produce translated products that are directly involved with the normal ORF product?
3) How can we model the pathway of “evolution” to account for such AltORF products?
Consider the comment on the Nature paper in my first post – they anticipate/estimate up to 3.88 AltORFs per gene. Additionally, around 50% of the MS spectra from the peptides identified to define the human proteome could not be mapped to sequences in the public datasets, implying that we are indeed missing many proteins (often with tissue-restricted expression) and that our understanding of how the genetic information defines the proteinaceous compartment is still relatively naïve. If we find that overlapping AltORFs are common (even if only 2-5% of proteins turn out to contain such AltORFs), this to my mind poses a serious problem for understanding and accepting that unguided processes generated such codes, which have specific function and complexity. This is especially true if such proteins are indispensable in developmental pathways.
Perhaps you are reading and already knew some (or all) of this and it is old news. Personally, I still get fascinated by such work even when I contemplate it for the second, third and times beyond. To me, it cries out incredible design. Sure, design that is broken, but still design, and I would actually be less OK if I saw design that was not broken, given my faith. We all must have faith though – faith that something so intricate and complex as multiple layers of code within the same code can be accounted for by random unguided accidents, or faith that what cries out design is actually the product of a designer outside our known space-time dimension and therefore invisible to our eyes (and full understanding). Personally I strongly believe that no more faith is required for the latter when you consider the vastness of even the most minuscule cellular process, and quite frankly, how little we know.