Exon Shuffling, and the Origins of Protein Folds

Jonathan McLatchie

February 22, 2015

Evolution


A frequently made claim in the scientific literature is that protein domains can be readily recombined to form novel folds. In Darwin’s Doubt, Stephen Meyer addresses this subject in detail (see Chapter 11). Over the course of this article, I want to briefly expand on what was said there.

Defining Our Terms

Before going on, it may be useful for me to define certain key terms and concepts. I will be referring frequently to “exons” and “introns.” Exons are sections of genes that code for proteins, whereas introns are sections of genes that don’t code for proteins.

[Figure: Introns and exons]

Proteins have multiple structural levels. Primary structure refers to the linear sequence of amino acids comprising the protein chain. When segments within this chain fold into structures such as helices and loops, this is referred to as secondary structure. Common units of secondary structure include α-helices and β-strands. Tertiary structure is the biologically active form of the protein, and refers to the packing of secondary structural elements into domains. Since a protein’s tertiary structure optimizes the forces of attraction between amino acids, it is the most stable form of the protein. When multiple folded polypeptide chains (subunits) are arranged in a multi-subunit complex, the arrangement is referred to as quaternary structure.

A further concept is domain shuffling. This is the hypothesis that fundamentally new protein folds can be created by recombining already-existing domains. This is thought to be accomplished by moving exons from one part of the genome to another (exon shuffling). There are various ways in which exon shuffling might be achieved, and it is to this subject that I now turn.

The Mechanisms of Exon Shuffling

There are several ways in which exon shuffling may occur. Exon shuffling can be transposon-mediated, or it can occur as a result of crossover during meiosis and recombination between non-homologous or (less frequently) short homologous DNA sequences. Alternative splicing is also thought to play a role in facilitating exon shuffling.

When domain shuffling occurs as a result of crossover during sexual recombination, it is hypothesized that it takes place in three stages (called the “modularization hypothesis”). First, introns are gained at positions that correspond to domain boundaries, forming a “protomodule.” Introns are typically longer than exons, and thus the majority of crossover events take place in the noncoding regions. Second, within the inserted introns, the newly formed protomodule undergoes tandem duplication. Third, intronic recombination facilitates the movement of the protomodule to a different, non-homologous, gene.

Another hypothesized mechanism for domain shuffling involves transposable elements such as LINE-1 retroelements and Helitron transposons, as well as LTR retroelements. LINE-1 elements are transcribed into an mRNA that specifies proteins called ORF1 and ORF2, both of which are essential for the process of transposition. LINE-1 frequently associates with 3′ flanking DNA, transporting the flanking sequence to a new locus elsewhere in the genome (Ejima and Yang, 2003; Moran et al., 1999; Eickbush, 1999). This association can happen if the weak polyadenylation signal of the LINE-1 element is bypassed during transcription, causing downstream exons to be included on the RNA transcript. Since LINE-1 elements are “copy-and-paste” elements (i.e. they transpose via an RNA intermediate), the donor sequence remains unaltered.

Long terminal repeat (LTR) retrotransposons have also been shown to facilitate exon shuffling, notably in rice (e.g. Zhang et al., 2013; Wang et al., 2006). LTR retrotransposons possess a gag and a pol gene. The pol gene translates into a polyprotein composed of an aspartic protease (which cleaves the polyprotein), and various other enzymes including reverse transcriptase (which reverse transcribes RNA into DNA), integrase (used for integrating the element into the host genome), and RNase H (which serves to degrade the RNA strand of the RNA-DNA hybrid, resulting in single-stranded DNA). Like LINE-1 elements, LTR retrotransposons transpose in a “copy-and-paste” fashion via an RNA intermediate. There are a number of subfamilies of LTR retrotransposons, including endogenous retroviruses, Bel/Pao, Ty1/copia, and Ty3/gypsy.

Alternative splicing by exon skipping is also believed to play a role in exon shuffling (Keren et al., 2010). Alternative splicing allows the exons of a pre-mRNA transcript to be spliced into a number of different isoforms to produce multiple proteins from the same transcript. This is facilitated by the joining of a 5′ donor site of one intron to the 3′ site of another intron downstream, resulting in the “skipping” of exons that lie in between. This process may result in introns flanking exons. If this genomic structure is reinserted somewhere else in the genome, the result is exon shuffling.

There are of course other mechanisms that are hypothesized to play a role in exon shuffling. But this will suffice for our present purposes. Next, we will look at the evidence for and against domain shuffling as an explanation for the origin of new protein folds.

Introns Early vs. Introns Late

It was hypothesized fairly early, after the discovery of introns in vertebrate genes, that they could have contributed to the evolution of proteins. In a 1978 article in Nature, Walter Gilbert first proposed that exons could be independently assorted by recombination within introns (Gilbert, 1978). Gilbert also hypothesized that introns are in fact relics of the original RNA world (Gilbert, 1986). According to the “introns early” hypothesis, all protein-coding genes were created from exon modules, coding for secondary structural elements (such as α-helices, β-sheets, signal peptides, or transmembrane helices) or folding domains, by a process of intron-mediated recombination (Gilbert and Glynias, 1993; Dorit et al., 1990).

The alternative “introns late” scenario proposed that introns only appeared much later in the genes of eukaryotes (Hickey and Benkel, 1986; Sharp, 1985; Cavalier-Smith, 1985; Orgel and Crick, 1980). Such a scenario renders exon shuffling moot in accounting for the origins of the most ancient proteins.

The “introns early” hypothesis was the dominant view in the 1980s. The frequently cited evidence for this was the then widespread belief in the general correspondence between exon-intron structure and protein secondary structure.

From the mid-1980s, however, this view became increasingly untenable as new information came to light (e.g. see Palmer and Logsdon, 1991; and Patthy, 1996; 1994; 1991; 1987) that raised doubts about a general correlation between protein structure and intron-exon structure. Such a correspondence is not borne out in many ancient protein-coding genes. Moreover, the apparently clearest examples of exon shuffling all took place fairly late in the evolution of eukaryotes, becoming significant only at the time of the emergence of the first multicellular animals (Patthy, 1996; 1994).

In addition, analysis of intron splicing junctions suggested a similar pattern of late-arising exon shuffling. The location where introns are inserted and interrupt the protein’s reading frame determines whether exons can be recombined, duplicated or deleted by intronic recombination without altering the downstream reading frame of the modified protein (Patthy, 1987). Introns can be grouped according to three “phases”: Phase 0 introns insert between two consecutive codons; phase 1 introns insert between the first and second nucleotide of a codon; and phase 2 introns insert between the second and third nucleotide.
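To make the notion of intron phase concrete, here is a minimal Python sketch. It assumes that each exon is described only by the number of coding bases it contributes, counted from the first base of the start codon; the function names and the toy gene are hypothetical and are not taken from Patthy's papers. It computes the phases of the introns flanking an internal exon and checks whether inserting that exon into an intron of a given phase would leave the reading frame intact.

```python
def flanking_phases(coding_exon_lengths, index):
    """Return (5' phase, 3' phase) of the introns flanking an internal exon.

    coding_exon_lengths: coding bases contributed by each exon, in order,
    counted from the first base of the start codon (an assumption of this sketch).
    """
    if not 0 < index < len(coding_exon_lengths) - 1:
        raise ValueError("only internal exons have an intron on both sides")
    upstream = sum(coding_exon_lengths[:index])
    phase_5 = upstream % 3                                  # bases of a codon already read
    phase_3 = (upstream + coding_exon_lengths[index]) % 3   # same, after this exon
    return phase_5, phase_3


def is_symmetrical(phases):
    """A symmetrical exon is flanked by introns of the same phase (0-0, 1-1, 2-2)."""
    return phases[0] == phases[1]


def insertion_keeps_frame(recipient_intron_phase, exon_phases):
    """True if inserting the exon into an intron of the given phase keeps both
    the inserted exon and the downstream exons in their original frame."""
    return exon_phases == (recipient_intron_phase, recipient_intron_phase)


# Made-up coding lengths, for illustration only.
toy_gene = [100, 90, 50, 61]
class_1_1 = flanking_phases(toy_gene, 1)  # (1, 1): length 90 is a multiple of 3
class_1_0 = flanking_phases(toy_gene, 2)  # (1, 0): length 50 shifts the frame
```

In this sketch an internal exon is symmetrical exactly when its coding length is a multiple of three, and it can be dropped into a recipient intron without disturbing the frame only when that intron has the same phase.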

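Only exons flanked by introns of the same phase (“symmetrical” exons) can be inserted into an intron of that phase, duplicated, or deleted without shifting the downstream reading frame. Thus, if exon shuffling played a major role in protein evolution, we should expect a characteristic intron phase distribution, with an excess of such symmetrical exons. But the hypothetical modules of ancient proteins do not conform to such expectations (Patthy, 1991; 1987).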

It is clear, then, that exon shuffling (at the very least) is unlikely to explain the origins of the most ancient proteins that have emerged in the history of life. But is this mechanism adequate to explain the origins of later proteins such as those that arise in the evolution of eukaryotes? I now turn to evaluate the evidence for and against the role of exon shuffling in protein origins.

The Case for Exon Shuffling

What, then, are the best arguments for exon shuffling? If the thesis is correct, a prediction would be that exon boundaries should correlate strongly with protein domains. In other words, one exon should code for a single protein domain. One argument, therefore, points to the fact that there is a statistically significant correlation between exon boundaries and protein domains (e.g., see Liu et al., 2005 and Liu and Grigoriev, 2004).

However, there are many examples where this correspondence does not hold. In many cases, single exons code for multiple domains. For instance, protocadherin genes typically involve large exons coding for multiple domains (Wu and Maniatis, 2000). In other cases, multiple exons are required to specify a single domain (e.g. see Ramasarma et al., 2012; or Buljan et al., 2010).

A further argument for the role of exon shuffling in protein evolution is the intron phase distributions found in the exons coding for protein domains in humans. In 2002, Henrik Kaessmann and colleagues reported that “introns at the boundaries of domains show high excess of symmetrical phase combinations (i.e., 0-0, 1-1, and 2-2), whereas nonboundary introns show no excess symmetry” (Kaessmann et al., 2002). Their conclusion was thus that “exon shuffling has primarily involved rearrangement of structural and functional domains as a whole.” They also performed a similar analysis on the nematode worm Caenorhabditis elegans, finding that “Although the C. elegans data generally concur with the human patterns, we identified fewer intron-bounded domains in this organism, consistent with the lower complexity of C. elegans genes.”
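What an “excess of symmetrical phase combinations” means can be illustrated with a rough Python sketch. It compares the observed fraction of symmetrical pairs (0-0, 1-1, 2-2) with the fraction expected if the 5′ and 3′ intron phases were independent of one another. That null model is my own simplifying assumption, not necessarily the statistic used by Kaessmann and colleagues, and the phase pairs below are made up purely for illustration.

```python
from collections import Counter


def symmetric_fractions(phase_pairs):
    """Return (observed, expected) fractions of symmetrical phase pairs.

    phase_pairs: list of (5' intron phase, 3' intron phase), one per domain.
    The expected value assumes the two phases are independent (a simplifying
    null model chosen for this sketch).
    """
    n = len(phase_pairs)
    observed = sum(1 for five, three in phase_pairs if five == three) / n
    five_counts = Counter(five for five, _ in phase_pairs)
    three_counts = Counter(three for _, three in phase_pairs)
    expected = sum(
        (five_counts[p] / n) * (three_counts[p] / n) for p in range(3)
    )
    return observed, expected


# Hypothetical phase pairs, not real data.
toy_pairs = [(1, 1), (1, 1), (0, 0), (1, 2), (0, 1), (2, 2)]
observed, expected = symmetric_fractions(toy_pairs)
excess = observed - expected  # positive when symmetrical pairs are over-represented
```

A positive excess in such a comparison describes a pattern in the data; as discussed below, it does not by itself establish the process that produced the pattern.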

Another line of evidence relates to genes that appear to be chimeras of parent genes. These are typically associated with signs indicative of their mode of origin. One famous example is the jingwei gene in Drosophila, which may have arisen when “the sequence of the processed Adh [alcohol dehydrogenase] messenger RNA became part of a new functional gene by capturing several upstream exons and introns of an unrelated gene” (Long and Langley, 1993).

We must take care, however, not to confuse the observed pattern of intron phase distribution, or exon/domain mapping, with proof that exon shuffling is actually the process by which this pattern arose.

Perhaps common ancestry is the cause, but this must be demonstrated and not assumed. It is the biologist’s duty to determine whether unintelligent chance-based mechanisms actually can produce novel genes in this manner. It is to this question that I now turn.

The Problems with Domain Shuffling as an Explanation for Protein Folds

While the hypothesis of exon shuffling does, taken at face value, have some attractive elements, it suffers from a number of problems. For one thing, the model at its core presupposes the prior existence of protein domains. A protein’s lower-level secondary structures (α-helices and β-strands) exist stably only in the context of the tertiary structures in which they are found. In other words, the domain level is the lowest level at which self-contained stable structural modules exist. This leaves the origins of these domains in the first place unaccounted for. But stable and functional protein domains are demonstrably rare within amino-acid sequence space (e.g. Axe, 2010; Axe, 2004; Taylor et al., 2001; Keefe and Szostak, 2001; Reidhaar-Olson and Sauer, 1990; Salisbury, 1969).

A fairly recent study examined many different combinations of E. coli secondary structural elements (α-helices, β-strands and loops), assembling them “semirandomly into sequences comprised of as many as 800 amino acid residues” (Graziano et al., 2008). The researchers screened 10⁸ variants for features that might suggest folded structure. They failed, however, to find any folded protein structures. Reporting on this study, Axe (2010) writes:

“After a definitive demonstration that the most promising candidates were not properly folded, the authors concluded that “the selected clones should therefore not be viewed as ‘native-like’ proteins but rather ‘molten-globule-like’”, by which they mean that secondary structure is present only transiently flickering in and out of existence along a compact but mobile chain. This contrasts with native-like structure, where secondary structure is locked-in to form a well defined and stable tertiary fold. Their finding accords well with what we should expect in view of the above considerations. Indeed, it would be very puzzling if secondary structure were modular.”

“For those elements to work as robust modules,” explains Axe, “their structure would have to be effectively context-independent, allowing them to be combined in any number of ways to form new folds.” In the case of protein secondary structure, however, this requirement is not met.

The model also seems to require that the diversity and disparity of functions carried out by proteins in the cell can in principle originate by mixing and matching prior existing domains. But this presupposes the ability of blind evolutionary processes to account for a specific “toolbox” of domains that can be recombined in various ways to yield new functions. This seems unlikely, especially in light of the estimation that “1000 to 7000 exons were needed to construct all proteins” (Dorit et al., 1990). In other words, a primordial toolkit of thousands of diverse protein domains needs to be constructed before the exon shuffling hypothesis even becomes a possibility. And even then there are severe problems.

A further issue relates to interface compatibility. The domain shuffling hypothesis in many cases requires the formation of new binding interfaces. Since the amino acids that comprise polypeptide chains are distinguished from one another by the specificity of their side-chains, however, the binding interfaces that allow units of secondary structure (i.e. α-helices and β-strands) to come together to form elements of tertiary structure are dependent upon the specific sequence of amino acids. That is to say, they are non-generic in the sense that they are strictly dependent upon the particulars of the components. Domains that must bind and interact with one another can’t simply be pieced together like Jenga tiles.

In his 2010 paper in the journal BIO-Complexity Douglas Axe reports on an experiment conducted using β-lactamase enzymes which illustrates this difficulty (Axe, 2010). Take a look at the following figure, excerpted from the paper:

[Figure: Beta-lactamase comparison]

The top half of the figure (labeled “A”) reveals the ribbon structure of the TEM-1 β-lactamase (left) and the PER-1 β-lactamase (right). The bottom half of the figure (labeled “B”) reveals the backbone alignments for the two corresponding domains in the two proteins. Note the high level of structural similarity between the two enzymes. Axe attempted to recombine sections of the two genes to produce a chimeric protein from the domains colored green and red. Since the two parent enzymes exhibit extremely high levels of structural and functional similarity, this should be expected to work. No detectable function was identified in the chimeric construct, though, presumably as a consequence of the substantial dissimilarity between the respective amino-acid sequences and the interface incompatibility between the two domains.

This isn’t by any means the only study demonstrating the difficulty of shuffling domains to form new functional proteins. Another study by Axe (2000) described “a set of hybrid sequences” from “the 50%-identical TEM-1 and Proteus mirabilis β-lactamases,” which were created such that the “hybrids match[ed] the TEM-1 sequence except for a region at the C-terminal end, where they [were] random composites of the two parents.” The results? “All of these hybrids are biologically inactive.”

In fact, in the few cases where protein chimeras do possess detectable function, they work only because the researchers used an algorithm (developed by Meyer et al., 2006) to carefully select the sections of a protein structure that possess the fewest side-chain interactions with the rest of the fold, and chose parent proteins with relatively high sequence identity (Voigt et al., 2002). This only serves to underscore the problem. Even under these highly favorable circumstances, the success rate in the Voigt study was quite low, with only one in five chimeras possessing discernible functionality.

Conclusion

To conclude, although there is some indirect inferential evidence for the role of exon shuffling in protein evolution, a consideration of how such a process might work in reality reveals that the hypothesis itself is fraught with severe difficulties.

This article was originally published at Evolution News & Views (part 1; part 2)

Comments

Indels exist. How did you determine they are blind watchmaker processes? And CONTEXT matters, duh. We were discussing polypeptides of 80 amino acids evolving to polypeptides of 300+ amino acids.
Joe, March 4, 2015, 07:10 AM PDT

[within-species protein variations] are their own proteins, perhaps related by a common design.
You really ought to try this in a court of law. Close relatives are more 'commonly designed' than distant ones! Heh heh heh.
And one more time- what is your experimental evidence for proteins growing by adding amino acids?
There is a very well-known class of mutation known as an 'indel'. Are you honestly saying that there is no such thing? It's not just a question of doing a lit-search for a single example of length change - you'd just deny it anyway; see your ridiculous claims about within-species variation. This is embedded in genetics. It's fundamental stuff. I gave you 4 known mechanisms. People didn't just make them up. Do you really think they NEVER happen? Well, that's genetics sorted!
Well seeing that your position can’t account for the tyrosine kinase, that would be an issue for you.
Ha ha! Stock Move #13. Demand evidence for X, get it, then huff "Well your position can't account for the precursor of X!". We'll take it as a gracious concession that proteins can, indeed, grow.
Hangonasec, March 4, 2015, 05:13 AM PDT

Well seeing that your position can't account for the tyrosine kinase, that would be an issue for you.
Joe, March 3, 2015, 10:06 AM PDT

How about a 1,130 amino acid tyrosine kinase 'growing' to be a 1,530 amino acid fusion protein, that still retains tyrosine kinase activity? Would that satisfy you?
DNA_Jock, March 3, 2015, 07:31 AM PDT

We have already pointed out that, any time a stop codon mutates, the protein “grows” by ~~20 amino acids.
And it still functions? Evidence please. What you need is evidence that an 80 amino acid protein can grow into a 300 amino acid protein. That is the context of what I was discussing.
Joe, March 3, 2015, 05:52 AM PDT

Joe:
Hi DNA Jock- How does your link affect what I am saying? In what way is that an example of a protein growing?
We have already pointed out that, any time a stop codon mutates, the protein "grows" by ~~20 amino acids. Your response was the truly bizarre claim that :
additional amino acids will either alter or bury the active site. That is just the way it is.
This statement is demonstrably false. The two-hybrid system that I linked to demonstrates that one can add many different (random) peptides to many different proteins without disturbing their function. Similarly, it is common practice to make fusion proteins, adding substance P, avidin, a Ni-binding domain, a Maltose binding domain, a myc-tag, FLAG-tag, hemagglutinin-tag, to a protein of interest in order to facilitate its purification. If you had read and understood Keefe & Szostak, you would be aware of this. Perhaps one of the better-informed ID proponents will help us set you straight on this matter. I doubt it, however; there is a curious reticence...
DNA_Jock, March 3, 2015, 05:49 AM PDT

What happens to activity when a cassette skipped in one spliceoform is included in another? Does the active site get buried or inactivated?
Most likely a new active site emerges, by design. And your position can't explain exon shuffling anyway...
Joe, March 3, 2015, 04:31 AM PDT

So, what’s the JoeWorld explanation for within-species protein length differences? If they aren’t homologues, related by descent … what are they?
They are their own proteins, perhaps related by a common design. And one more time- what is your experimental evidence for proteins growing by adding amino acids?
Joe, March 3, 2015, 04:29 AM PDT

Hi DNA Jock- How does your link affect what I am saying? In what way is that an example of a protein growing?
Joe, March 3, 2015, 04:24 AM PDT

Oh Joe,
additional amino acids will either alter or bury the active site. That is just the way it is.
So how do you explain the fact that two-hybrid screening works? http://en.wikipedia.org/wiki/Two-hybrid_screening Hilarious.
DNA_Jock, March 2, 2015, 05:38 AM PDT

If all you have is one extra amino acid then you don’t have anything to discuss.
Do you understand what the qualifier "as a minimum" means? Or even 'growth', for that matter?
As for homologs- you cannot tell if it is or if it just looks like a homolog.
Yes - as I predicted: "Of course, I know what the next dodge is going to be – “you cannot prove they are homologues” [...] Care to bet there is no within-species variation in protein length, then?" So, what's the JoeWorld explanation for within-species protein length differences? If they aren't homologues, related by descent ... what are they?
Me: Particularly amusing in a thread about exon shuffling, where different-length mature products are all functional, the extra amino acids in the longer version not ‘getting in the way’ at all. Joe: LoL! They are all distinct proteins in their own right.
Whoosh! The point goes soaring over Joe's head. What happens to activity when a cassette skipped in one spliceoform is included in another? Does the active site get buried or inactivated?
Again additional amino acids will either alter or bury the active site. That is just the way it is.
So no protein can ever change its length because length change is universally fatal? Is that what you are saying? Can any non-ID mechanism cause indels?
Hangonasec, March 2, 2015, 05:32 AM PDT

If all you have is one extra amino acid then you don't have anything to discuss. As for homologs- you cannot tell if it is or if it just looks like a homolog.
Particularly amusing in a thread about exon shuffling, where different-length mature products are all functional, the extra amino acids in the longer version not ‘getting in the way’ at all.
LoL! They are all distinct proteins in their own right. Again additional amino acids will either alter or bury the active site. That is just the way it is.
Joe, March 2, 2015, 04:43 AM PDT

Joe
OK hangonasec doesn’t have any evidence for growing proteins. That is what we thought…
In order to show that proteins can grow, I need as a minimum only give 1 instance of a mutation resulting in 1 extra amino acid. Do you really think I can't? Are you basically saying that indels as a class contain nothing but 'dels'? Or that they don't even exist, in either direction?
Hangonasec, March 2, 2015, 04:35 AM PDT

DNA_Jock: I am happy that you laugh. That is good for your health. All the best to you, sincerely.
gpuccio, March 2, 2015, 03:36 AM PDT

gpuccio, So you have no rebuttal at all to my comment 87. Good to know. (Note that any agnosticism about whether the improvements are in affinity or in yield in no way affects the validity of the two conclusions outlined therein.) And similarly, if you were being consistent, you would dismiss Axe & Gauger's work as being a 'methodological cheat', given your statement @78
Obviously, it [reproductive advantage] is the only property that anyone is allowed to test if one wants to derive conclusions about NS.
Okay, I guess. You should stop citing Axe's work then. LOL
DNA_Jock, March 1, 2015, 10:11 AM PDT

DNA_Jock: "given that they do not report the Kd of the ancestral sequence (and it may be impossible to measure), the phrase “improve ATP-binding relative to the ancestral sequence” is agnostic as to whether the “improvement” is in Kd or in yield." So, I am right in being very "agnostic" about those sequences in the original library. Those sequences about which we really know very little, but about which so bold conclusions are made, both in the paper and in your discussion. I think I will stick to my agnosticism, and to my position that a paper which is agnostic about the object of its conclusions is a bad paper. I am happy that you are entertained by my silence about, I suppose, this statement of yours: "According to your bizarre view of indirect measurement, I can breezily dismiss Axe & Gauger’s work because they wash their cells four times in ice-cold phosphate-buffered saline, which is far removed from measuring any “reproductive advantage”, and therefore (according to gpuccio-logic) one can derive no conclusions whatsoever from their work about what NS can or cannot achieve." but the simple reason for my silence is that I find that statement lacking any detectable meaning, like many of the things that you said recently. When I can find some sense in what you say, I still answer (maybe I will change my mind). Otherwise, I don't.
gpuccio, March 1, 2015, 09:05 AM PDT

So, you think all homologues of a given protein are the same length? Hee hee. Of course, I know what the next dodge is going to be - "you cannot prove they are homologues". Because I wasn't there. Care to bet there is no within-species variation in protein length, then? Particularly amusing in a thread about exon shuffling, where different-length mature products are all functional, the extra amino acids in the longer version not 'getting in the way' at all.
Hangonasec, March 1, 2015, 09:03 AM PDT

OK hangonasec doesn't have any evidence for growing proteins. That is what we thought...
Joe, March 1, 2015, 08:51 AM PDT

By what? The additional amino acids, duh.
So you think 'additional amino acids' are always fatal to protein function? In what journal have you published this startling revelation?
Hangonasec, March 1, 2015, 08:29 AM PDT

By what? The additional amino acids, duh.
Also known as ‘mechanisms’.
Non-sequitur. Look either you have evidence for proteins growing or you don't. And obviously you don't.
Joe, March 1, 2015, 08:23 AM PDT

But then you don’t have a STOP codon and your active substrate gets buried.
By what? The additional C-terminus tail? Universally, in all proteins, ever?
Proteins have functional sections that bind and/ or catalyze reactions. If you block those sections you lose functionality. If you add an amino acid which somehow gets in the way then you lose. I asked for evidence and you bring stories.
Also known as 'mechanisms'. This is the kind of garbage that people routinely bring as 'evidence' against mutation in general. "Mutations are frequently catastrophic". Yeah, I know, so what? Unless they are always catastrophic, you have failed to demonstrate that proteins are unable to grow by mechanisms such as those I outlined, which appears to be your position. Care to place a bet that there are no homologues, in any species pairs anywhere, in which one exemplar has a STOP where the other has an amino acid and subsequent 'tail'?
Hangonasec, March 1, 2015, 08:04 AM PDT

gpuccio, I don't see why you would think that the passage from Keefe and Szostak that you quoted @88 disagrees in any way with what I have written. Please note the phrase "contributing to the formation of a folded structure". Perhaps you were misled by the final sentence: given that they do not report the Kd of the ancestral sequence (and it may be impossible to measure), the phrase "improve ATP-binding relative to the ancestral sequence" is agnostic as to whether the "improvement" is in Kd or in yield. Note that their selection procedure measures and optimizes primarily for yield, rather than affinity. Separately, I am entertained by your silence on the question of whether (according to gpuccio-logic) Axe and Gauger are guilty of a 'methodological cheat'.
DNA_Jock, March 1, 2015, 07:59 AM PDT

Hangonasec:
1) Mutate a STOP codon into one with a corresponding charged tRNA, and bingo, your protein has grown, as if by magic, adding (as many residues as sit between this position and the next STOP) + 1 amino acids.
But then you don't have a STOP codon and your active substrate gets buried. Proteins have functional sections that bind and/ or catalyze reactions. If you block those sections you lose functionality. If you add an amino acid which somehow gets in the way then you lose. I asked for evidence and you bring stories.
Joe, March 1, 2015, 06:33 AM PDT

DNA_Jock:
Comparing the round 18 sequences with the ancestral sequence showed that four amino-acid substitutions had become predominant in the selected population (present more than 39 times in 56 sequences, Fig. 3b), and that 16 other substitutions had also been selectively enriched (present more than 4 times in 56 sequences, Fig. 3b). In addition, each clone contained a variable number of other substitutions. The selectively enriched substitutions are distributed over the 62 amino-terminal amino acids of the original 80-amino-acid random region, suggesting that amino acids throughout this region are contributing to the formation of a folded structure, at least in the complex with ATP. The substitutions in each of the assayed clones improve ATP-binding relative to the ancestral sequence.
gpuccio, March 1, 2015, 06:08 AM PDT

gpuccio writes:
That ATP-binding is a known biological function is something we agree upon, why do you insist on giving me references about that? My point is that the weak ATP binding in the original sequences in the random library, as far as we know, would be of no use in a real biological context, and that the paper gives no evidence at all to believe or hypothesize differently. While it gives a lot of unnecessary and irrelevant information about an engineered protein which however, as far as we know (and as far as it has been tested) would be of no use in a real biological context.
You appear to be agreeing with us that ATP binding is a ‘known biological function’, but asserting that this function (as embodied in the final, 'engineered' protein) would be of no use in a real biological context. Just. Plain. Weird.
Either they are deriving no conclusions about what NS can do, or they are. If they are, why are they discussing at length an engineered protein, and not the original proteins in the random library? Why didn’t they focus on the original proteins (which are apparently the object of their paper), and on their non engineered properties, and on how NS could act on them?
Your objection to Keefe and Szostak has now migrated from the rather sad “that’s not the experiment I would do” to the genuinely peculiar “that’s not the way I would write the paper”. Seriously, you conclude that they are NOT deriving conclusions about what NS can do, because they spend so much time discussing the results of NS, rather than the starting point. Really? This makes no sense.
The paper demonstrates two things:
1) Proteins with a minimal function exist (at a low frequency) in random peptide libraries.
2) These proteins can, via mutation and selection, evolve into proteins with good function.
There is a subtlety here: the round 8 proteins, which we have all been referring to as “weak ATP binders”, have not had their Kd measured (in this paper at least). It would be more accurate to refer to the round 8 proteins as “inefficient ATP binders”, since they bind to ATP with a yield of only 5 - 15%. This ‘conformational heterogeneity’ makes studying the proteins very difficult: nobody in their right mind tries to do enzyme kinetics using a sample that is 85 - 95% inactive. So Keefe focused on the biochemical properties of the later proteins because those were the ones that he could measure.
It is an interesting thought that the initial binders might bind with quite high affinity (the paper gives no evidence at all to believe or hypothesize differently) and that the subsequent mutation and selection achieved an increase in the yield of protein that had the right conformation. After all, a somewhat disordered peptide can ‘explore’ a huge conformational space, thanks to conformational heterogeneity and induced fit. Subsequent M&S can stabilize the conformation that functions best.
DNA_Jock, March 1, 2015, 05:55 AM PDT

Joe @75:
Proteins can grow? Evidence please. Proteins are not stalagmites, nor are they living organisms.
1) Mutate a STOP codon into one with a corresponding charged tRNA, and bingo, your protein has grown, as if by magic, adding (as many residues as sit between this position and the next STOP) + 1 amino acids.
2) Insert extra DNA bases between Start and STOP and (provided a truncating STOP is not thereby generated), once again, it has grown.
3) Alter the position of an exon/intron boundary such that it includes more exon and less intron.
4) Less readily, mutate the Shine-Dalgarno sequence or the initial AUG-methionine, lengthening 'the other way', provided there are other initiation signals upstream.
Hangonasec, March 1, 2015, 03:54 AM PDT
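
The first mechanism in the list above (a STOP codon mutating into a sense codon) can be illustrated with a minimal sketch. It assumes the standard genetic code and a made-up 24-base toy gene; the sequences and the `translate` helper are hypothetical illustrations, not taken from any paper discussed in this thread.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate from the first ATG up to (not including) the first in-frame STOP."""
    start = dna.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # STOP codon: translation ends here
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical toy gene: ATG GCT AAA TAA GGT GAA CTG TGA
original = "ATGGCTAAATAAGGTGAACTGTGA"
# Single-base change T -> C turns the first STOP (TAA) into CAA (glutamine).
mutant = "ATGGCTAAACAAGGTGAACTGTGA"

print(translate(original))  # MAK
print(translate(mutant))    # MAKQGEL
```

Here three residues sit between the mutated codon and the next STOP, so the product lengthens by 3 + 1 = 4 amino acids, matching the count given in item 1. Whether the extended product folds or functions is a separate question that the sketch does not address.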

DNA_Jock: Just because I have some spare time today: Either they are deriving no conclusions about what NS can do, or they are. If they are, why are they discussing at length an engineered protein, and not the original proteins in the random library? Why didn't they focus on the original proteins (which are apparently the object of their paper), and on their non engineered properties, and on how NS could act on them?
gpuccio, March 1, 2015, 02:04 AM PDT

Zachriel: That ATP-binding is a known biological function is something we agree upon, why do you insist on giving me references about that? My point is that the weak ATP binding in the original sequences in the random library, as far as we know, would be of no use in a real biological context, and that the paper gives no evidence at all to believe or hypothesize differently. While it gives a lot of unnecessary and irrelevant information about an engineered protein which however, as far as we know (and as far as it has been tested) would be of no use in a real biological context. But, strangely, both you and DNA_Jock seem to evade that aspect. So, if you prefer, just go on giving me references about things we agree upon.
gpuccio, March 1, 2015, 02:00 AM PDT

gpuccio @78
Pitiful. Are you denying that reproductive advantage is the only property which is selected by NS?
Heavens, no; I agree that it is the only property that is directly selected by NS.
Obviously, it is the only property that anyone is allowed to test if one wants to derive conclusions about NS.
Obviously not. One can test other properties and make an inference (reasonable or otherwise) that these properties would confer a selectable advantage. You are welcome to argue against the reasonableness of the inference (if you actually had an argument), but to deny that the inference can ever be made is truly "pitiful". According to your bizarre view of indirect measurement, I can breezily dismiss Axe & Gauger's work because they wash their cells four times in ice-cold phosphate-buffered saline, which is far removed from measuring any "reproductive advantage", and therefore (according to gpuccio-logic) one can derive no conclusions whatsoever from their work about what NS can or cannot achieve. I (unlike you, apparently) am comfortable with Axe & Gauger testing for the Bio phenotype without their having to prove that it confers a reproductive advantage. Rather, I dismiss their work because they are asking the wrong question.
DNA_Jock, February 28, 2015, 12:15 PM PDT

gpuccio: I said: “which as far as we know would be of no use in a real biological context” Then we pointed out that ATP-binding is a known biological function. See, for instance, Matte & Delbaere, ATP-binding Motifs, in Encyclopedia of Life Sciences, Wiley-Blackwell 2002.
Zachriel, February 28, 2015, 06:15 AM PDT
