Uncommon Descent Serving The Intelligent Design Community

The spliceosome: a molecular machine that defies any non-design explanation.


OK, let’s start with a very simple fact: eukaryotic genes have introns.

IOWs, they are not continuous. They are made of exons and introns: exon – intron – exon – intron – exon and so on. Exons code for the protein. Introns don’t.

So, when the content of the gene is copied to the pre-mRNA, introns must be cut away and only exons retained, so that the mature mRNA can be transferred to the cytoplasm and translated by the ribosome.

This process of removing introns is called splicing.

Now, a few clarifications:

a) Introns exist in prokaryotes too, but they are rather rare. For our purposes, we will only discuss introns in eukaryotes.

b) Introns exist in many different types of genes. For our purposes, we will discuss only those in protein coding genes.

c) The origin and possible function of introns are still a mystery.

d) Introns are usually longer than exons. In humans, for example, they amount to approximately 35% of the whole genome, vs about 1.5% for coding exons.

e) However, the number and length of introns can vary a lot in different organisms. An extreme example is yeast (S. cerevisiae), whose genome contains a very small number of introns (about 250 out of about 6250 genes).

When the gene is transcribed, both exons and introns are transcribed. A 5′ UTR (untranslated region) and a 3′ UTR are also part of the pre-mRNA.

 

Fig. 1

Pre-mRNA is the first form of RNA created through transcription in protein synthesis. The pre-mRNA lacks structures that the messenger RNA (mRNA) requires. First all introns have to be removed from the transcribed RNA through a process known as splicing. Before the RNA is ready for export, a Poly(A)tail is added to the 3’ end of the RNA and a 5’ cap is added to the 5’ end. – By Nastypatty (Own work) [CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0)], via Wikimedia Commons

So the question is: how are introns removed from pre-mRNA? IOWs, how is splicing achieved?

And there is more. As everybody probably knows, splicing is not always done in the same way. Different isoforms of the same protein can be obtained by alternative splicing, and they can have functional differences. I will not go into details about that, but here is the Wikipedia page about alternative splicing:

https://en.wikipedia.org/wiki/Alternative_splicing

So, how is splicing done, and how is alternative splicing regulated?

We know much about the first question, very little about the second.

There are three ways to perform splicing:

  1. Spliceosomal splicing
  2. Self-splicing
  3. tRNA splicing

The last two modalities are rare, and can be found both in prokaryotes and eukaryotes. I will not discuss them here.

So, the subject of this OP is spliceosomal splicing, which is restricted to eukaryotes.

Moreover, I will discuss only the major spliceosome, which is responsible for the vast majority of splicing in eukaryotes. It must be said, however, that a minor spliceosome also exists, and that it acts in a minority of cases.

So, the spliceosome.

The first important point is:

It is an amazing molecular machine. Even more, it is an amazing molecular cycle, involving many different stages each of which is an amazing molecular machine.

Let’s see. Here is a figure which summarizes the main stages of the spliceosome cycle:

 

Fig. 2

 

Spliceosomal splicing cycle – By JBrain [CC BY-SA 2.0 de (https://creativecommons.org/licenses/by-sa/2.0/de/deed.en)], via Wikimedia Commons

To make it simple, the spliceosome units are built upon 5 specific RNAs, called small nuclear RNAs (snRNA). These are, in humans:

U1 (164 bases), U2 (187 bases), U4 (145 bases), U5 (116 bases), U6 (107 bases)

They are transcribed from multiple gene copies. While their sequences are not particularly conserved (U6 being the most conserved of all), their secondary structure seems to be very conserved.

snRNAs are very important in the spliceosome, because they seem to be responsible for the catalytic activities.

Each of the 5 snRNAs forms a complex with proteins, and the complex takes the name of snRNP. The whole spliceosome includes at least 145 different proteins, maybe more, some of which are still not well known.

I will mention here some of the most important:

 

U1 snRNP:

U1 snRNP 70 kDa  (P08621, 437 AAs)

U1 snRNP A (P09012, 282 AAs)

U1 snRNP C  (P09234, 159 AAs)

Sm proteins: 7 small proteins (76 – 240 AAs) which form the “Sm core” ring in spliceosome subunits U1, U2, U4, U5, a ring which hosts the specific snRNA molecule.

 

U2 snRNP:

U2 snRNP A’ (P09661, 255 AAs)

U2 snRNP  B” (P08579, 225 AAs)

SF3a120 (Q15459, 793 AAs)

SF3a66 (Q15428, 464 AAs)

SF3a60 (Q12874, 501 AAs)

SF3b155 (O75533, 1304 AAs)

SF3b145 (Q13435,  895 AAs)

SF3b130 (Q15393, 1217 AAs)

SF3b49 (Q15427, 424 AAs)

SF3b14a/p14 (Q9Y3B4, 125 AAs)

SF3b10 (Q9BWJ5, 86 AAs)

Sm proteins

 

U4/U6 snRNP:

Prp3 (O43395, 683 AAs)

Prp31 (Q8WWY3, 499 AAs)

Prp4 (O43172, 522 AAs)

Cyph (O43447, 177 AAs)

15.5 K (P55769, 128 AAs)

Sm proteins (for the U4 snRNA)

Lsm proteins: a number of proteins similar to Sm proteins (usually Lsm 2-8), which form a specific ring for the U6 snRNA.

 

U5 snRNP:

Prp8 (Q6P2Q9, 2335 AAs)

BRR2 (O75643, 2136 AAs)

Snu114 (Q15029, 972 AAs)

Prp6 (O94906, 941 AAs)

Prp28 (Q9BUQ8, 820 AAs)

52 K (O95400, 341 AAs)

40 K (Q96DI7, 357 AAs)

Sm proteins

 

Additional proteins in the U4/U6/U5 complex:

hSnu66 (O43290, 800 AAs)

hSad1 (Q53GS9, 565 AAs)

27 K (Q8WVK2, 155 AAs)

 

OK, these are only the main components, and the best understood. We are still far from the sum total of 145/150 proteins which are involved in the spliceosome cycle.

But how does it work?

Always in brief. Here is a typical exon-intron structure:

Fig. 3

 

Parts of an intron – By miguelferig (Own work) [Public domain], via Wikimedia Commons

 

GU, A and AG are nucleotides almost universally conserved in all introns, approximately at the positions shown in the figure, and which have a fundamental role in splicing. However, the real stuff is much more complex than this (see the splicing code section). GU (n. 4 in the figure) is near to the 5′ end of the intron, AG (n. 1) near to the 3′ end. The A (n. 3 in the figure) is called “the branch point”. The py-py-py (n. 2 in the figure) is the “polypyrimidine tract”.

a) The U1 subunit binds to the GU sequence at the 5′ splice site in the intron

b) The U2 subunit binds to the “branch point”.

c) The U4/U6/U5 binds to the complex.

d) Numerous further modifications take place, causing the formation of a “lariat” (including the intron), which is then cleaved, while the two exons are ligated.
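As a toy illustration of the net result of steps a) to d), here is a minimal Python sketch that treats the pre-mRNA as a plain string and removes one intron delimited by the conserved GU and AG dinucleotides. The function name `splice_intron` and the sequences are hypothetical, invented only for this example. The sketch ignores the branch point, the polypyrimidine tract and all the snRNPs, so it shows what splicing achieves, not how the spliceosome does it.

```python
def splice_intron(pre_mrna: str, intron_start: int, intron_end: int) -> str:
    """Remove pre_mrna[intron_start:intron_end] and join the flanking exons.

    Toy model only: it checks the conserved GU (5' end) and AG (3' end)
    dinucleotides of the intron and nothing else.
    """
    intron = pre_mrna[intron_start:intron_end]
    if not (intron.startswith("GU") and intron.endswith("AG")):
        raise ValueError("intron does not start with GU and end with AG")
    # Ligate the upstream exon to the downstream exon.
    return pre_mrna[:intron_start] + pre_mrna[intron_end:]


# Hypothetical sequences, chosen only to make the example readable.
exon1 = "AUGGCC"
intron = "GUAAGUCUAACUUUCCUUUCAG"
exon2 = "GAUUAA"
pre_mrna = exon1 + intron + exon2

mature = splice_intron(pre_mrna, len(exon1), len(exon1) + len(intron))
print(mature)
```

In this sketch the intron coordinates are simply given. Finding them in a real transcript is the hard part, which is what the splicing code section is about.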

I will spare you the many complexities in all the various steps, which are well summarized (in a very simplified way) in this Wikipedia page:

RNA splicing

See in particular the “Formation and activity” section.

Or, if you like more detail, here:

Spliceosome Structure and Function

And here is a very good video on the whole splicing process in yeast:

 

 

Now, if somebody still has doubts about the complexity of this molecular machine/process, let’s consider some important aspects.

  • 1. The spliceosome is a molecular machine which appears in eukaryotes.

I quote from this paper (in the abstract):

Origin and evolution of spliceosomal introns

There is no indication that any prokaryote has ever possessed a spliceosome or introns in protein-coding genes, other than relatively rare mobile self-splicing introns.

The following table shows how some of the main proteins involved in the spliceosome activity show practically no trace of homology in prokaryotes. I have also included in the table two examples which show low homologies due to some domain which is already expressed in prokaryotes: Prp4, whose 209 bits of homology are due to a specific domain, WD40, and Prp28, which exhibits 313 bits linked to the DEXDc domain. The point is: many of the spliceosome proteins are complete novelties in eukaryotes, but not all of them.

 

 Protein Bacteria Archaea
U1 snRNP 70 kDa  (P08621, 437 AAs) 67
U2 SF3b130 (Q15393, 1217 AAs) 43.5
U4/U6 hPrp3 (O43395, 683 AAs)
U4/U6/U5 hSnu66 (O43290, 800 AAs)
U5 Prp8 (Q6P2Q9, 2335 AAs)
U5 Prp6 (O94906, 941 AAs) 100
U4/U6 hPrp4 (O43172, 522 AAs) 209 150
U5 Prp28 (Q9BUQ8, 820 AAs) 313 286

 

I will analyze in more detail one of the most important proteins in the spliceosome, Prp8, in the last part of this OP.

(Just a technical note: if you blast Prp8, you will find 3 hits which are obviously  an error due to unverified sequences, probably cases of contamination).

  • 2. The spliceosome is a molecular machine which is universally present in eukaryotes.

All eukaryotes have introns, even if in very different degrees, and the spliceosome, even if in some organisms parts of the spliceosome complex can be lost. For a more detailed discussion, look at the Rogozin paper quoted above (Origin and evolution of spliceosomal introns), in particular the section:

Intron density, size and distribution in protein coding genes across the eukaryote domain

I quote this important conclusion:

As pointed out above, despite the existence of numerous, diverse intron-poor genomes, eukaryotes do not lose the “last” intron or the spliceosome although degradation of the spliceosome including loss of many components does occur, e.g. in yeast. The only firmly established exception is the tiny genome of a nucleomorph (an extremely degraded intracellular symbiont of algae) that has lost both all the introns and the spliceosome [7]; preliminary genomic data indicate that all introns might have been lost also in a microsporidium, a highly degraded intracellular parasite distantly related to Fungi [54].

So, we can conclude that both introns and the spliceosome are a universal feature of eukaryotes, the few exceptions being simply cases of loss of information.

  • 3. The spliceosome is a molecular machine whose information is extremely conserved throughout eukaryotes, up to humans.

I have already mentioned that the 5 RNAs which form the core of the spliceosome are not extremely conserved at sequence level, even if they are extremely conserved at structure level.

However, many of the proteins that compose the spliceosome show an amazing sequence conservation throughout eukaryotes. Now, even if we cannot be certain of when eukaryotes really emerged, and of their early evolutionary history (both issues being at present highly controversial), we can reasonably assume that protein sequences which are highly conserved in all eukaryotes have been conserved for something like 2 billion years (more or less). As anybody who has followed my previous OPs about information conservation in vertebrates certainly knows, that is an evolutionary time frame which certainly allows us to equate conservation with extremely high functional constraint.

But how conserved are spliceosome proteins? We can analyze a few of them with my usual methodology: looking at human conserved information. The results show that many proteins involved in the spliceosome are amazingly conserved in all eukaryotes. While there are a few cases which have a rather different evolutionary history, this is by far the most common behaviour for spliceosomal proteins.

Here is a sample of some important sequences that show high homology with the human form in all major groups of single celled eukaryotes. The 5 groups of single celled organisms chosen here, indeed, cover rather well the whole range of single celled eukaryotes.

 

Fig. 4

 

These four important proteins, as shown, have an amazing amount of information shared with the human form, ranging from more than 1000 to more than 4000 bits. In bits per AA, the range goes from 0.88 to 1.80 bits per amino acid (baa). As can be seen, the highest homology is found in fungi, as expected, because fungi are the closest relatives of metazoa among these groups. The lowest homologies are observed in Naegleria (Excavata) or in Alveolata.

Of course, these proteins remain highly similar to the human form in the following evolutionary history in Metazoa.

So, we can safely state that most spliceosomal proteins, while emerging almost entirely in eukaryotes and showing only trivial homologies with prokaryotes, were probably already universally present in the Last Eukaryotic Common Ancestor (LECA), and in a form already very similar at sequence level to what we observe in metazoa and in humans.

  • 4. The spliceosome is a wonderful example of irreducible complexity.

OK, we have already said that splicing can be achieved in at least three different ways. For example, bacterial introns, although rare, are of the self-splicing type. So, we know that the generic function of splicing introns can be implemented in different ways.

But eukaryotic introns are of the spliceosomal type, and they are spliced only by the spliceosome.

We have also said that a minor spliceosome exists. It shares some features with the major spliceosome, but it is a different structure and acts on different, and much rarer, introns.

So, for the vast majority of eukaryotic introns, the major spliceosome, and only the major spliceosome, can effectively accomplish the splicing.

Now, I don’t mean here that the major spliceosome must always be absolutely complete, with all its 150 proteins, to be able to work. That’s not what I mean when I say that it is irreducibly complex.

Maybe in some organisms the spliceosome can be partially defective, and still work. It is difficult to say, because we still don’t understand the role of all the components of the spliceosome.

However, as far as we can understand, most of the principal features must be present, because, as we have seen, splicing is the result of a complex cycle, involving the RNAs and the subunits, and all stages are essential to the final result.

So, the spliceosome is certainly highly irreducibly complex, even if we may not be able to clearly identify the essential nucleus of molecules which is absolutely necessary to the minimal function.

Moreover, the spliceosome would be useless if spliceosomal introns did not exist, with their properties and code (see later), and spliceosomal introns could not exist if the spliceosome were not there to splice them, because otherwise transcription would be completely ineffective. So, in that sense, spliceosomal introns and the spliceosome are a good example of a chicken-and-egg paradox, or we could also say that they form an irreducibly complex system at a higher level. And remember, the whole system seems to have been already present in LECA, in a form very similar to its current form in humans, as we have seen.

But, of course, our darwinist friends will simply say that they have co-evolved!  🙂

Moreover, we can and should ask ourselves: why is the spliceosome so complex? The answer is not easy, because we still understand very little, but it is certainly related to the complexity of the splicing code, and to the fundamental issue of alternative splicing.

The Splicing code.

To splice introns by localizing 5 (or a few more) conserved nucleotides and then cutting at the ends of a lariat and rejoining the two exons is certainly a complex task, but apparently not so complex that one of the biggest and most impressive known molecular machines is needed for that. But the simple truth is that recognizing and appropriately splicing all introns is a much more complex task than that.

That is because the conserved nucleotides at the ends are not a sufficient signal to identify the segment that has to be spliced: many other components (not always well understood) are necessary for that. Moreover, splicing is not always done in the same way, and alternative splicing is a very powerful tool for transcription regulation.

The subject is very complex, and I will not deal with it in depth. However, those interested could look at this recent review:

The splicing code.

Unfortunately, the paper is paywalled, but the abstract is very informative.

The complexity of the splicing code can give us some insight about the true reasons for the complexity of the spliceosome, and definitely supports the idea that the whole system, introns, splicing code and spliceosome, is irreducibly complex.

The Prp8 protein.

I will add a few words about this protein, which is probably the most amazing component of the spliceosome, and well represents its essential features.

This protein has many amazing characteristics:

a) It is, as far as I can say, the longest protein in the whole spliceosomal system, and a very long protein indeed: 2335 AAs.

b) It is completely absent in prokaryotes.

c) It is extremely conserved in eukaryotes, probably the most conserved protein in the whole spliceosomal complex (see Fig. 4). In Naegleria, we have 3345 bits of homology with the human protein, corresponding to 1.4 baa, while in fungi we have the amazing result of 4211 bits of human-conserved information, corresponding to 1.8 baa. IOWs, in fungi we already find almost 90% of the functional information present in human Prp8 (remember, the highest possible bitscore is about 1.2 baa), corresponding to 1995 identities (86%) and 2155 positives (92%): a result which is incredibly rare, considering that such information has been conserved for about 2 billion years.
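The bits per amino acid (baa) values quoted above are simply the BLAST bitscore divided by the length of the human protein. A minimal sketch of that arithmetic, using only the numbers given in the text (the function name `bits_per_aa` is just an illustrative choice):

```python
PRP8_LENGTH = 2335  # length of human Prp8 in amino acids, as given in the text


def bits_per_aa(bitscore: float, length: int = PRP8_LENGTH) -> float:
    """Bitscore normalized by the length of the human protein."""
    return bitscore / length


naegleria_baa = bits_per_aa(3345)  # best hit in Naegleria
fungi_baa = bits_per_aa(4211)      # best hit in fungi

print(f"Naegleria: {naegleria_baa:.2f} baa")
print(f"Fungi:     {fungi_baa:.2f} baa")
```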

Just to emphasize the importance of this fact, here is the blast between the human protein and the best hit in fungi (Basidiobolus meristosporus):

 

 

d) It is a protein which is extremely important for the function of the spliceosome. Here is a very good paper about this protein and its functional relevance:

Prp8 protein: At the heart of the spliceosome

I quote from the conclusions:

Prp8p is central to the expression of all nuclear intron-containing mRNAs. In higher eukaryotes, it is responsible for processing thousands of transcripts in alternative splicing pathways, and in both U2 and U12 spliceosomes. It is important for the pathology of human disease, as all eukaryotic pathogens and parasites require Prp8p to functionally express their genes, in some cases via the trans-spliceosome. Retinitis Pigmentosa, a human genetic disorder that causes progressive blindness, positions Prp8p as a target for therapeutic medicine.

Moreover, another way to assess the functional constraints of a protein is to check its tolerance to polymorphisms in humans. That can be done consulting a very recent and useful database, the ExAC browser, which reports data from about 60000 human genomes.

ExAC gives two important metrics to assess how much a protein is tolerant to polymorphisms and variants. I quote here from the site FAQ:

What are the constraint metrics?

For synonymous and missense, we created a signed Z score for the deviation of observed counts from the expected number. Positive Z scores indicate increased constraint (intolerance to variation) and therefore that the gene had fewer variants than expected. Negative Z scores are given to genes that had a more variants than expected.

For LoF, we assume that there are three classes of genes with respect to tolerance to LoF variation: null (where LoF variation is completely tolerated), recessive (where heterozygous LoFs are tolerated), and haploinsufficient (where heterozygous LoFs are not tolerated). We used the observed and expected variants counts to determine the probability that a given gene is extremely intolerant of loss-of-function variation (falls into the third category). The closer pLI is to one, the more LoF intolerant the gene appears to be. We consider pLI >= 0.9 as an extremely LoF intolerant set of genes.

Now, for Prp8 we get the following results:

Constraint from ExAC | Expected no. variants | Observed no. variants | Constraint metric
Synonymous | 366.6 | 460 | z = -3.02
Missense | 920.2 | 266 | z = 10.55
LoF | 92.6 | 8 | pLI = 1.00

That means the following:

a) The number of observed missense variants is so low vs expected (266 vs 920.2) that the z value is 10.55 (IOWs, 10.55 standard deviations, more than 10 sigma). Believe me, this is a really exceptional result: most proteins are much more tolerant to missense variants.

b) The pLI, that is the probability that the gene is extremely intolerant of loss-of-function variation, is 1.00. That means that the gene falls in the class where even heterozygous LoF is not tolerated.
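To make the size of the depletion more tangible, here is a small sketch that computes the plain observed/expected ratio for each variant class from the ExAC counts reported above. This ratio is only an illustration of the raw counts: it is not the procedure ExAC uses to derive the z scores or pLI, which rely on their own statistical model.

```python
# Expected and observed variant counts for Prp8, as reported above from ExAC.
counts = {
    "Synonymous": (366.6, 460),
    "Missense": (920.2, 266),
    "LoF": (92.6, 8),
}

# Simple observed/expected ratio (an illustrative measure, not an ExAC metric).
ratios = {name: observed / expected for name, (expected, observed) in counts.items()}

for name, ratio in ratios.items():
    print(f"{name}: observed/expected = {ratio:.2f}")
```

With these numbers, missense variants are observed at less than a third of the expected count, and LoF variants at less than a tenth.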

This is an extremely functional molecule, if ever there was one!

A brief conclusion.

To sum up the meaning of this rather long discussion, I would simply say:

  1. The introns – spliceosome system is a molecular machine of amazing complexity. I have touched only its main aspects in this OP, but believe me, there are layers of complexity there that would require a whole treatise and that I have not even started to mention here.
  2. Whoever can really believe that all this can be explained by some RV + NS model is, IMO, really admirable for his faith in a wrong paradigm.
Comments
Irremediable complexity? See the 112 papers citing the featured paper: https://www.researchgate.net/publication/47755596_Irremediable_Complexity
Dionisio, December 29, 2017, 01:47 AM PDT
No matter how persuasive the CNE might be, I still prefer Cinderella's fairytale, because it makes more sense from a rational perspective: a pumpkin was converted into an elegant carriage, mice became beautiful horses, a grasshopper was hired as the coachman. :) Is CNE in the "third way" arsenal?
Dionisio, December 29, 2017, 01:31 AM PDT
DATCG and gpuccio, let's not rush to conclusions before reading the breakthrough revelation that professor Arthur Hunt promised. Let's be patient. :)
Dionisio, December 29, 2017, 01:26 AM PDT
DATCG: Yes, the interesting part is that, as I have tried to emphasize, nobody denies the complexity. And nobody denies that the complexity works, that it is indeed functional. So, according to these people, we have:
Irremediable futile functional complexity
The concept of the century! :)
gpuccio, December 29, 2017, 01:25 AM PDT
GPuccio, the article by Behe. You might appreciate the lead line. ;-)
An intriguing “hypothesis” paper entitled “How a neutral evolutionary ratchet can build cellular complexity”1, where the authors speculate about a possible solution to a possible problem, recently appeared in the journal IUBMB Life. It is an expanded version of a short essay called “Irremediable Complexity?”2 published last year in Science.
DATCG, December 28, 2017, 09:59 PM PDT
DATCG: Thank you for the link! :) I did not know that Behe had already touched those points, and so brilliantly, otherwise I would certainly have quoted him! Luckily, you have provided the reference. I am really happy to be, once again, in perfect accord with what Behe says. :)
gpuccio, December 28, 2017, 09:58 PM PDT
Chuckles :) OK, that was funny. Thanks for the continued reviews, Gpuccio, and for the digging on your part! Irremediable Complexity was briefly addressed by Michael Behe in 2011...
Brief Review of Irremediable Complexity and CNE by Behe
Note: it's from 2011. I'm sure more experimental work has been done since to "support the feasibility of the model."
The authors think the evolution of such a complex is well beyond the powers of positive natural selection: “Even Darwin might be reluctant to advance a claim that eukaryotic spliceosomal introns remove themselves more efficiently or accurately from mRNAs than did their self-splicing group II antecedents, or that they achieved this by ‘numerous, successive, slight modifications’ each driven by selection to this end.”1
Well, I can certainly agree with them about the unlikelihood of Darwinian processes putting together something as complex as the spliceosome. However, leaving aside the few RNAs involved in the splicesome, I think their hypothesis of CNE as the cause for the interaction of hundreds of proteins — or even a handful — is quite implausible. (An essay skeptical of large claims for CNE, written from a Darwinian-selectionist viewpoint, has appeared recently-3 along with a response from the authors-4).
see references 3 and 4 above at the linked article
The authors’ rationale for how a protein drifts into becoming part of a larger complex is illustrated by Figure 1 of their recent paper (similar to the single figure in their Science essay). A hypothetical “Protein A” is imagined to be working just fine on its own, when hypothetical “Protein B” serendipitously mutates to bind to it. This interaction, postulate the authors, is neutral, neither helping nor harming the ability of Protein A to do its job. Over the generations Protein A eventually suffers a mutation which would have decreased or eliminated its activity. However, because of the fact that Protein B is bound to it, the mutation does not harm the activity of Protein A. This is still envisioned to be a neutral interaction by the authors, and organisms containing the Protein A-Protein B complex drift to fixation in the population. Then other mutations come along, co-adapting the structures of Protein A and Protein B to each other. At this point the AB complex is necessary for the activity of Protein A. Repeat this process several hundred more times with other proteins, and you’ve(voila) built up a protein aggregate with complexity of the order of the spliceosome.
(voila) emphasis mine
Or at least so the story goes. Is it plausible? As a neo-Darwinian tale? Maybe. But with more details, I hope. I'll leave the rest of the review for readers to pursue.
DATCG, December 28, 2017, 09:45 PM PDT
Irremediable complexity? It sounds important, doesn't it? :) How about complex functionally specified informational complexity?
Dionisio, December 28, 2017, 08:21 PM PDT
So CNE includes some kind of co-option too? Hey, why not? At least it sounds important. :)
Dionisio, December 28, 2017, 08:12 PM PDT
TWSYF @209: @216 follow-up. You may want to read the comment @206.
Dionisio, December 28, 2017, 08:02 PM PDT
Discussion between AH and GP
Index of posted comments:
AH @25
…….……. GP @28
AH @50
AH @51
…….……. GP @54
AH @56 *
…….……. GP @60
…….……. GP @69
…….……. GP @75
…….……. GP @86
…….……. GP @98
…….……. GP @106
…….……. GP @118
…….……. GP @127
…….……. GP @129
AH @130
…….……. GP @136
…….……. GP @138
…….……. GP @146
…….……. GP @162
AH @164
…….……. GP @167
…….……. GP @176
…….……. GP @182
…….……. GP @198
…….……. GP @200
…….……. GP @201
…….……. GP @210
…….……. GP @211
…….……. GP @212
AH is the distinguished professor Arthur Hunt
GP is the author of the excellent OP that started this discussion thread
(*) first publicly admitted mistake @56
(to be continued…)
Dionisio, December 28, 2017, 07:56 PM PDT
gpuccio:
Is everything clear?
Clear as chloroplaster!
Mung, December 28, 2017, 07:38 PM PDT
TWSYF @209: gpuccio is the one who has set the bar too high for his politely dissenting interlocutors. I'm just trying to make it easier for the rest of us to keep track of the interesting discussion GP is having with a distinguished professor. BTW, gpuccio posted three additional comments that render the info @203 incomplete. I'm awaiting the breakthrough revelation that professor Arthur Hunt promised.
Dionisio, December 28, 2017, 07:20 PM PDT
"CNE is rubbish!" That's so rude. :)
Dionisio, December 28, 2017, 07:14 PM PDT
ET: Shall I say it? OK: CNE is rubbish! :)
gpuccio, December 28, 2017, 04:35 PM PDT
CNE? That one cracks me up. Molecules just diffuse throughout the cell and if they happen to meet another and join then that's just wonderful if it makes something useful. It isn't as if those molecules had something to do on their own or some other molecule to link up with. They just happened to be there and meet up with a mate. Larry Moran is a fan of constructive neutral theory. To me it's just another just-so story. Tornado in a junkyard, anyone?
ET, December 28, 2017, 04:25 PM PDT
Arthur Hunt (and all interested): Just an add-on to the previous post. For those who liked the concept of irremediable complexity, there is more. From the Introduction in the same paper:
The whims of the plastid—the “spoiled kid”—are thus compensated by a liberal nuclear genome. In the long term, this might allow the occurrence and persistence of apparently futile steps of gene expression, such as RNA editing or trans-splicing.
(emphasis mine) And from the Discussion:
Posttranscriptional RNA editing provides another striking example of complex but apparently futile RNA metabolism.
(emphasis mine)
As you can see (I repeat myself) nobody here is denying the complexity. But this fascinating type of complexity is now labeled as both irremediable complexity and futile complexity. Behe will probably be envious! :)
gpuccio, December 28, 2017, 03:46 PM PDT
Arthur Hunt (and all interested): The experimental part of the paper is simple enough and clear enough, and it aims essentially to demonstrate that, at least in the case of psaA, theory b), the ratchet theory, is probably true. And I must say that their argument is of some interest. Our model organism, Chlamydomonas, is very good for getting mutants. So, they took three different mutants, one for each of the three proteins raa1, raa2 and raa3. As expected, each of these mutant was defective at photosynthesis, because of the defective psaA trans-splicing. Of course, according to what we have already said, in the raa1 mutant the splcing of both introns was compromised, while in raa2 mutant only intron 2 was involved, and in the raa3 mtant only intron 1. Then they inserted into the chloroplast genome of the three mutants an intron-less version of the psaA gene. The simple result is that photsynthesis was rescued in all three mutants. But there is more: they compared the behaviour of the "rescued" mutants with the Wild Type, and they could not observe any difference. The comparison was made not only in standard conditions, but also in some stress conditions: iron deprivation, competitive growth, high temperature, low oxygen. In all these settings, growth and behaviour of the rescued strains and of the WT were not different. They conclude, therefore that theory b) must be true: the trans-splicing system, however complex it is, has apparently only the purpose of trans-splicing the degenerate psaA gene. If we provide the organism with an intron-less gene, it is no more necessary. Now, this is very interesting, and I don't want in any way to underemphasize these results, which I have tried to report as simply and clearly as possible. I will add, however, a couple of cautionary notes: 1) Test in the lab of the relative fitness of different populations should be considered with some caution. 
Of course, in the wild and in long evolutionary times a lot of differences could be revealed which cannot be apparent in a short test under controlled conditions. This is true for all these kinds of tests. IOWs, it is still possible that the trans-splicing system has regulatory functions which could not be observed in the tests made during the experiment. 2) Let's remember that we still have a very limited understanding of this system. I would reasonably wait for further information about the proteins involved, the structure, the molecular function, and above all the similarities and differences between different trans-splicing systems in different organisms, before drawing final conclusions about this complex issue. That said, I frankly admit that these results, taken as they are, are in favour of theory b) for this specific system. Which makes the system itself, as we understand it at present, a really weird object from all points of view. Many more things could be said about this aspect, but I will stop here for the moment, and leave some space to a possible discussion. In next post, I will rather outline the (few) similarities between this trans-splicing system and our spliceosome, and the (many) differences. And maybe add some interesting recent information about the spliceosome itself.gpuccio
December 28, 2017, 03:36 PM PDT
Arthur Hunt (and all interested): Now, before going to the experimental part of the paper referenced at #201, let's say something more about theory b): the ratchet theory for psaA trans-splicing. As we have seen, this theory assumes that mutations at the level of the nuclear genome (what they call "suppressor mutations") "compensate" for the alteration in the chloroplast gene (the “spoiled kid”). So, in this case, the compensating "suppressor mutations" would be those which generate the many proteins in the nuclear genome which effect the trans-splicing, while the "spoiled kid" is the degenerate and fragmented psaA gene in the chloroplast genome. But there is a problem. If the degeneracy of the gene (for example, being fragmented into three separate exons/intron-parts, plus a small RNA coding gene) is such that it completely prevents an important function (photosynthesis), IOWs if the spoiled kid is really and severely spoiled, the organism should be subject to strong negative selection, and there is no evolutionary time to "wait" for the compensating mutations to provide the complex solution. In the words of the authors:
When a plastid mutation severely affects RNA metabolism, the theory of constructive neutral evolution (CNE) proposes that suppression may involve a preexisting nucleus-encoded factor which restores adequate gene expression, and allows a step towards “irremediable complexity”.
What does this mean? It means that they are proposing a new sub-theory, called CNE. That theory is not really new. We find a good summary of it, in more general terms, in the abstract of a 2011 paper: How a neutral evolutionary ratchet can build cellular complexity http://onlinelibrary.wiley.com/doi/10.1002/iub.489/abstract
Complex cellular machines and processes are commonly believed to be products of selection, and it is typically understood to be the job of evolutionary biologists to show how selective advantage can account for each step in their origin and subsequent growth in complexity. Here, we describe how complex machines might instead evolve in the absence of positive selection through a process of “presuppression,” first termed constructive neutral evolution (CNE) more than a decade ago. If an autonomously functioning cellular component acquires mutations that make it dependent for function on another, pre-existing component or process, and if there are multiple ways in which such dependence may arise, then dependence inevitably will arise and reversal to independence is unlikely. Thus, CNE is a unidirectional evolutionary ratchet leading to complexity, if complexity is equated with the number of components or steps necessary to carry out a cellular process. CNE can explain “functions” that seem to make little sense in terms of cellular economy, like RNA editing or splicing, but it may also contribute to the complexity of machines with clear benefit to the cell, like the ribosome, and to organismal complexity overall. We suggest that CNE-based evolutionary scenarios are in these and other cases less forced than the selectionist or adaptationist narratives that are generally told.
To put it simply, in cases like the psaA degeneracy, where survival does not seem likely without some remedy, the remedy must already be there at the time when the kid becomes spoiled. So, the idea is that neutral evolution provides the new information in advance, so that it is ready to act when the situation requires it. That's probably why they call that neutral evolution "constructive"! :) So, going back to our psaA, when the chloroplast gene becomes fragmented and degenerate (which, frankly, does not seem an event which should require a lot of time), the 14+ proteins that will effect its trans-splicing are already there, kindly arranged by "constructive" neutral evolution. I know what you are thinking... :) In the words of the authors of the paper referenced at #201:
A pool of potential preexisting suppressors for chloroplast mutations could be constituted by proteins that have functions in RNA metabolism and can be co-opted for the new task.
Did they say "co-opted"? Yes, they did! OK, no problem in that, I suppose. After all, it is a "pool" of potential compensating proteins. Maybe a lot of them, who knows? And after all, they only have to recognize the transcripts from four different genomic sites, reconstruct a complex RNA structure from them, and in some way correctly effect the splicing. Where's the problem? And for those who are tired of Behe's old concept of "irreducible complexity", we have here a brilliant, new and certainly more fashionable concept: Irremediable complexity! IOWs, a complexity which should not be necessary, but becomes "irremediable" because of serious, unavoidable events (spoiled kids, which, as every parent knows, can cause a lot of trouble).
The CNE theory explains how this type of transgenomic suppression mechanism could be part of a “drive toward irremediable complexity”.
OK, no comment for the moment. We still have to look at the experimental part of the paper: On the Complexity of Chloroplast RNA Metabolism: psaA Trans-splicing Can be Bypassed in Chlamydomonas. In the next post.
gpuccio
December 28, 2017, 03:06 PM PDT
Dionisio @ 203: You set the bar very high, my friend.
Truth Will Set You Free
December 28, 2017, 01:30 PM PDT
@200:
So, as the result of all this information, I would like to highlight a few points: a) The complexity of the system is apparently very high, and nobody seems to deny that, even if we still miss a lot of information to really understand it as a whole. b) One amazing aspect of this system is how specific it seems to be. Indeed, as far as we know, it is specific not only for the organism Chlamydomonas reinhardtii, but also for the psaA gene. But there is more: it is specific for each of the two introns in that gene: indeed, many proteins are involved only in the splicing of one of the two introns. c) Finally, this amazing system seems to be made mainly of specific proteins that we find practically only in this organism. I hope you are as amazed as I am.
That functional complexity is just an illusion. We just don't understand evolution. :)
Dionisio
December 28, 2017, 12:57 PM PDT
ET @204 & 205: Of course, that's just pure physics. Is there another option? :)
Dionisio
December 28, 2017, 10:24 AM PDT
@200 & @201: As usual, very thorough attention to detail, explained with much pedagogy. This discussion could be a valuable chapter of a biology textbook for postdocs. The complexity of the described issues is so visible that one gets dizzy trying to follow the whole explanation for the first time. Several rereads might be required for the penny to drop. It seems like gpuccio has rolled up his sleeves and gotten to work hard on this. Many of his readers (myself included) are benefiting from his effort. Thanks.
Dionisio
December 28, 2017, 10:19 AM PDT
(sarcasm alert with respect to the spliceosome)- But it's all material processes using physical material. There isn't anything magical going on, just plain ole physics and chemistry.
ET
December 28, 2017, 09:32 AM PDT
(sarcasm alert with respect to the spliceosome)- But it's all material processes using physical material. There isn't anything magical going on, just plain ole physics and chemistry.
ET
December 28, 2017, 09:32 AM PDT
Discussion between AH and GP
Index of posted comments:
AH @25
…….……. GP @28
AH @50
AH @51
…….……. GP @54
AH @56 *
…….……. GP @60
…….……. GP @69
…….……. GP @75
…….……. GP @86
…….……. GP @98
…….……. GP @106
…….……. GP @118
…….……. GP @127
…….……. GP @129
AH @130
…….……. GP @136
…….……. GP @138
…….……. GP @146
…….……. GP @162
AH @164
…….……. GP @167
…….……. GP @176
…….……. GP @182
…….……. GP @198
…….……. GP @200
…….……. GP @201
AH is the distinguished professor Arthur Hunt
GP is the author of the excellent OP that started this discussion thread
(*) first publicly admitted mistake @56
(to be continued…)
Dionisio
December 28, 2017, 09:28 AM PDT
@198: The plot thickens... This discussion is getting better with every comment posted. And some of us here are learning much from it.
Dionisio
December 28, 2017, 09:26 AM PDT
Arthur Hunt (and all interested): Another aspect about psaA is related to the following paper: On the Complexity of Chloroplast RNA Metabolism: psaA Trans-splicing Can be Bypassed in Chlamydomonas https://academic.oup.com/mbe/article/31/10/2697/1013909 I will try to make it simple. In the introduction, this paper gives us an important piece of information:
This type of trans-splicing occurs in the plastids and mitochondria of many organisms and in a variety of different genes (Glanz and Kuck 2009). For example, psaA is trans-spliced in Chlamydomonas but is an intron-less gene in higher plant plastids, and conversely rps12 is trans-spliced in higher plants but not in Chlamydomonas.
OK. Let's remember that. Then comes some information that we already know, but it can be useful to refresh it:
Most of these genes are required specifically for splicing of only one of the two split introns. At least seven genes are essential for trans-splicing of the first intron, some of which are necessary for processing of tscA from a polycistronic precursor (Hahn et al. 1998; Rivier 2000; Balczun et al. 2005; Glanz et al. 2012). Five loci are required for trans-splicing of the second intron (Perron et al. 1999), and two are involved in splicing of both the introns (Merendino et al. 2006).
Then the paper briefly discusses two possible theories to "explain" the complexity of the trans-splicing system we are discussing (psaA and other similar cases). As I have already pointed out, nobody here is doubting that complexity. The system appears to be complex, everyone is rather certain of that. But how can we explain that complexity? What is the reason for it? I would like to emphasize that the main question, in this discussion in the paper, is not so much "how", but rather "why". Why did such a complex system originate? Why is it necessary? Of course, they also say something about the "how", as we will see later.

This discussion is interesting, because of course every one of us is probably wondering, at this point: why such a complex system? For one specific task? And of course, that kind of discussion, at least in a very general sense, can also be used as a starting guide to debate the spliceosome itself (I will try later to point out the important differences, however).

So, the authors of the paper give a clear and simple summary of two possible theories about the "why". I will try to simplify them even further here, but if you want you can read their words directly in the paper, in the two paragraphs that start with: "There may be multiple reasons for the remarkable complexity of chloroplast RNA metabolism." Essentially, the two theories are:

a) The complexity is needed for adaptational or regulatory reasons. We can call this theory the "regulatory" theory. The complexity offers tools to change the working of the system in response to different situations.

b) The complexity arises simply to compensate for random errors in the genome, in particular the degradation of the original DNA (fragmentation, transformation, loss of function). The authors call this the “spoiled kid hypothesis”, where the spoiled kid is the degenerate DNA of the gene, and the mutations in the nuclear genome which compensate for that degeneracy are called an "evolutionary ratchet". IOWs, in this theory the complexity that arises to compensate for the degeneracy has one purpose only: to compensate for the degeneracy.

Theory a) is clear enough. It is usually the theory applied to many complex systems, including the spliceosome, whose many regulatory functions have been well demonstrated. One form of the theory, however, assumes that the spliceosome originated as a ratchet, and then evolved new functions. But of course it is possible that both the compensating function and the regulatory function were developed at the same time.

In theory b), instead, there is no regulatory function. The system is only a ratchet. It serves no other purpose. (By the way, I am not responsible for the "ratchet" analogy: it is in the paper, and not only in this one.)

Is everything clear? Let's stop just a moment. To the next post.
gpuccio
December 28, 2017, 09:17 AM PDT
Arthur Hunt (and all interested): The first question is: how complex is the trans-splicing of psaA in Chlamydomonas? Of course, the description I have already given of the process seems to imply great complexity. But what is, in detail, the role of the protein component? Unfortunately, even for this deeply studied case, we probably know too little. But we do know something. The paper I have referenced here, which is from 2016, makes a good summary of what is known, and adds a lot of original information. Let's start with this very good summary of the main premises, which I have in some way already provided:
In the green alga Chlamydomonas reinhardtii, exceptional examples of split group II introns were described as part of the chloroplast psaA gene that encodes an apoprotein of photosystem I. The psaA gene is split into three dispersed exons, which are flanked by truncated group II intron sequences (14). Whereas the second intron (psaA-i2) is bipartite, the first intron (psaA-i1) is tripartite, and the missing group II intron secondary structure is delivered in trans by the chloroplast-encoded tscA RNA (15). After separate transcription, two group II introns are built up by base pairing, followed by two trans-splicing reactions and, ultimately, formation of the mature psaA mRNA. Splicing of such variant group II introns relies on nucleus-encoded splicing factors to compensate for lack of functional motifs and to retain splicing activity (11, 16). However, it is still unknown and under scientific debate whether these splicing factors function in a spliceosomal-like yet intron-specific manner.
I will anticipate here that much of the interest in these trans-splicing complexes is that they are thought to be some model for the origin of the spliceosome. In that sense, they are certainly related to our main discussion. But how complex is the trans-splicing machine for psaA? In previous works, at least 14 proteins were thought to be involved, but only 7 of them have really been identified.
For C. reinhardtii, seven splicing factors, specific for group II introns, have been described at the molecular level (17,–23). Whereas splicing factor Raa1 (RNA maturation of psaA RNA) is involved in splicing of both reactions, Raa3, Raa4, Raa8, and Rat2 (RNA maturation of psaA tscA RNA) are psaA-i1-specific. Raa2 and Raa7 act specifically on the splicing of psaA-i2.
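To keep these assignments straight, here is a minimal Python sketch of the quoted paragraph as a lookup table. It is only a bookkeeping aid: the names FACTOR_TARGETS, introns_affected and factors_for are my own hypothetical labels, not anything from the paper, and the table covers only the seven factors listed in the quote.

```python
# Hypothetical lookup table transcribed from the quoted passage:
# which psaA intron(s) each characterized splicing factor acts on.
FACTOR_TARGETS = {
    "Raa1": {"psaA-i1", "psaA-i2"},  # involved in both splicing reactions
    "Raa3": {"psaA-i1"},
    "Raa4": {"psaA-i1"},
    "Raa8": {"psaA-i1"},
    "Rat2": {"psaA-i1"},
    "Raa2": {"psaA-i2"},
    "Raa7": {"psaA-i2"},
}


def introns_affected(factor):
    """Introns whose trans-splicing would be expected to fail if this factor is lost."""
    return sorted(FACTOR_TARGETS[factor])


def factors_for(intron):
    """Characterized factors that act on the given intron."""
    return sorted(f for f, targets in FACTOR_TARGETS.items() if intron in targets)


if __name__ == "__main__":
    for name in ("Raa1", "Raa2", "Raa3"):
        print(name, "->", introns_affected(name))
```

Under this table, losing Raa1 touches both introns, while Raa2 and Raa3 each touch only one, which is the pattern of intron specificity the quote describes.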
We will go back to these proteins later. The role of many of these proteins has been assessed because specific mutants showed impairment of photosynthesis. It was also supposed that those proteins acted in the form of some high molecular weight complex. However, the referenced paper has for the first time given evidence of that.
The results presented here define two core-splicing complexes, subcomplexes I and II. We further demonstrate that there is an interaction network of at least 11 and 7 interaction partners in subcomplex I and II, respectively. Several of the uncharacterized proteins are most probably further not yet identified trans-splicing factors.
As an image is better than a thousand words, let's look at Fig. 1, which gives in B and D a gross scheme of the two subcomplexes, and in A and C a list of the core components of each of them. Raa1 is a big protein which seems to have an important role in both subcomplexes. The paper also confirms the importance of the "missing piece", the tscA gene, in completing the structure of intron 1, and the intricate role of the proteins in the subcomplexes in helping the maturation of this small RNA molecule.

But what do we know of the 7 proteins that have been characterized so far? If you look again at Figure 1, the lists in A and C, you can find some important information in the column "description". For many of them, no functional annotation is available (IOWs, no known domains have been identified in the sequences). A couple of them show some known domain homology (for example, Raa2 has a pseudouridine synthase domain; for some reason that is not shown in Fig. 1, but you can find the information in Fig. 4). However, the most common domain found in them is the OPR sequence, an Octatricopeptide repeat sequence probably involved in RNA binding. Pentatricopeptide repeat sequences are also involved in other cases of trans-splicing.

The general idea is: these proteins are, at least some of them, rather isolated proteins. For example, the very big Raa1 protein (2103 AAs), if blasted, shows only 4 low homology hits (107-352 bits), all with green algae organisms. IOWs, these are not widely conserved proteins. Not at all. As far as we know, they are rather organism-specific proteins, with some low homology to a few existing domains. A lot of further details can be found in the paper, but I will not deal with them here.

So, as the result of all this information, I would like to highlight a few points:

a) The complexity of the system is apparently very high, and nobody seems to deny that, even if we still miss a lot of information to really understand it as a whole.

b) One amazing aspect of this system is how specific it seems to be. Indeed, as far as we know, it is specific not only for the organism Chlamydomonas reinhardtii, but also for the psaA gene. But there is more: it is specific for each of the two introns in that gene: indeed, many proteins are involved only in the splicing of one of the two introns.

c) Finally, this amazing system seems to be made mainly of specific proteins that we find practically only in this organism.

I hope you are as amazed as I am. There is a final important point to illustrate, and I will do that in the next post.
gpuccio
December 28, 2017, 06:28 AM PDT
gpuccio, The more we know, the more is ahead for us to learn. This fascinating topic that you've brought up for discussion, starting with an excellent OP that's followed by a very comprehensive series of insightful comments (mostly in response to professor Arthur Hunt's inputs), has motivated me to search for additional information that could shed more light on what is known about this amazing biological machinery and its functioning. But I have to admit that the available information is becoming so overwhelming in its increasing volume that sometimes I lose track of the papers. This happens with other biological topics too. The Big Data problem they have been talking about for the last several years seems to get worse with the avalanche of research discoveries made in wet and dry labs out there. The free Zotero tool may help to alleviate the problem, but discipline is still required to avoid skipping important papers after they have been located. Zotero also helps to prevent repetitions. My lack of constant discipline has made me lose track of interesting papers I had located before. It's frustrating to realize that my laziness has led me to squander time I had spent searching for information. It's encouraging to see that sometimes you may use some of the papers that have been found. That motivates me to try to do better next time.
Dionisio
December 28, 2017, 05:47 AM PDT