Intelligent Design

Axe (2004) And The Evolution Of Protein Folds


In my second response to Arthur Hunt on the origin of the T-urf13 gene (which specifies a mitochondrial ligand-gated pore-forming receptor for T-toxin in maize), I briefly mentioned, towards the end of my post, Hunt’s comments on the Panda’s Thumb blog regarding the Axe (2004) result concerning the rarity of catalytic domains within sequence space.

As I noted in my previous post, Axe’s 2004 JMB paper is not an isolated result. I cited a number of papers which attained similar results with respect to the rarity of functional domains within sequence space. In one study, published in Nature in 2001 by Keefe & Szostak, it was documented that more than a million million (10^12) random sequences were required in order to stumble upon a functioning ATP-binding protein, a protein substantially smaller than the transmembrane protein specified by the gene, T-urf13, discussed by Hunt. In addition, I noted, a similar result was obtained by Taylor et al. in their 2001 PNAS paper. This paper examined the AroQ-type chorismate mutase and arrived at a similarly low prevalence: 1 in 10^24 for the 93-amino-acid enzyme, which, when adjusted to reflect a sequence of the same length as the 150-amino-acid section analysed from beta-lactamase, yields 1 in 10^53. Yet another paper, by Sauer and Reidhaar-Olson (1990), reported on “the high level of degeneracy in the information that specifies a particular protein fold,” which it gives as 1 in 10^63. In my previous post, I also strongly encouraged Arthur Hunt and others to read Douglas Axe’s excellent review article in BIO-Complexity, which covers this topic in more detail, as well as the recently published The Nature of Nature: Examining the Role of Naturalism in Science, which is highly accessible for non-specialists.

Yesterday, I posted a short italicised update to my previous article, having now looked somewhat more closely at the article to which Hunt referred me. For those who missed it, allow me to highlight just a few of the points at which Hunt errs.

The key shortcoming of Hunt’s analysis appears to be in the categoric conflation of (a) the rarity of functional folds in sequence space, and (b) the ability to optimise those functional folds. But it was the purpose of Axe’s 2004 JMB paper to provide an estimate for the former, and not the latter.

Axe’s research set out to ascertain the prevalence, within the space of combinatorial possibilities, of sequence variants with a particular hydropathic signature which could form a functional structure. Hunt tells us that “Axe deliberately identified and chose for study a temperature sensitive variant. In altering the enzyme in this way, he molded a variant that would be exquisitely sensitive to mutation.” And, indeed, Axe did begin with an extremely weak (temperature sensitive) variant, on the reasoning that a newly evolving fold would be expected to be poorly functional. And why would Axe do this? Because he sought to detect variants operating at the lowest level (the threshold, if you will) of detectability.

Axe sought to provide an estimate of the rarity of functional folds in all sequence space, which he gives as 1 in 10^77. This estimate was extrapolated from the number of variants which were able to carry out the function, no matter how weakly, of the TEM-1 beta-lactamase enzyme. The graphic in Hunt’s article seems to me to betray something of a misapprehension of Axe’s result and experimental design, and also appears to misconstrue the real-life scenario. His graphic illustrates the shape of a generously favourable fitness landscape for one particular fold. What he should have shown is the landscape for all sequence space, portraying functional folds, as they are in the real world of biology, as isolated peaks. The Darwinian mechanism may well be able to optimise a protein to a higher and higher level of function if by chance one can locate the base of a smooth fitness peak. But the problem facing neo-Darwinism is its impotence in finding those functional peaks in the first place.

In summary, then, we can conclude that Arthur Hunt appears to substantially misapprehend the significance of Axe’s result. The key shortcoming of Hunt’s argument is that he conflates two very different questions: the rarity of functional protein folds in sequence space, and the difficulty of optimising those folds. Consider a fitness landscape comprising a few thousand peaks, each one representing a different functional fold. These peaks are extremely rare and, moreover, widely dispersed throughout sequence space. If by some fluke of chance one landed at the base of one of those peaks, then it stands to reason that one might be able to scale that peak by virtue of a Darwinian-type process. But if one were to land somewhere on the flat plain of non-functionality, miles from any peak, the Darwinian model relies too heavily on random chance to be considered a viable means of locating a functional peak via a blind search. This problem, of course, is compounded many times over by the need for multiple, functionally specific proteins which must work together in even the cell’s most basic activities. In sum, there is no reason to think that this is even plausible.

4 Replies to “Axe (2004) And The Evolution Of Protein Folds”

  1. 1
    bornagain77 says:

    Thanks again Jonathan, definitely a keeper!


    I liked Dr. Hunter’s observation on the minimal ‘million-million’ ATP binding protein that is often quoted by Darwinists as proof against Axe’s more rigorous proof for rarity of protein folds of 1 in 10^77;

    How Proteins Evolved – Cornelius Hunter – December 2010
    Excerpt: Comparing ATP binding with the incredible feats of hemoglobin, for example, is like comparing a tricycle with a jet airplane. And even the one in 10^12 shot, though it pales in comparison to the odds of constructing a more useful protein machine, is no small barrier. If that is what is required to even achieve simple ATP binding, then evolution would need to be incessantly running unsuccessful trials. The machinery to construct, use and benefit from a potential protein product would have to be in place, while failure after failure results. Evolution would make Thomas Edison appear lazy, running millions of trials after millions of trials before finding even the tiniest of function.

    The entire episode of Szostak’s failed attempt to establish the legitimacy of the 1 in 10^12 functional protein number for a randomly generated protein can be read here:

    This following paper was the paper that put the final nail in the coffin for Szostak’s work:

    A Man-Made ATP-Binding Protein Evolved Independent of Nature Causes Abnormal Growth in Bacterial Cells
    Excerpt: “Recent advances in de novo protein evolution have made it possible to create synthetic proteins from unbiased libraries that fold into stable tertiary structures with predefined functions. However, it is not known whether such proteins will be functional when expressed inside living cells or how a host organism would respond to an encounter with a non-biological protein. Here, we examine the physiology and morphology of Escherichia coli cells engineered to express a synthetic ATP-binding protein evolved entirely from non-biological origins. We show that this man-made protein disrupts the normal energetic balance of the cell by altering the levels of intracellular ATP. This disruption cascades into a series of events that ultimately limit reproductive competency by inhibiting cell division.”

  2. 2
    bornagain77 says:

    I hope this is not too far off your specific topic Jonathan, but I think it is of central importance;

    further note;

    Quantum Information/Entanglement In DNA & Protein Folding – short video

    Further evidence that quantum entanglement/information is found throughout entire protein structures:

    It is very interesting to note that quantum entanglement, which conclusively demonstrated that ‘information’ is completely transcendent of any time and space constraints, should be found in molecular biology, for how can quantum entanglement, in molecular biology, possibly be explained by the materialistic framework of neo-Darwinism, a framework which is predicated on the presupposition of being constrained by time and space, when Alain Aspect and company falsified the validity of local realism (reductive materialism) in the first place with quantum entanglement? It is simply ludicrous to appeal to the materialistic framework, which undergirds the entire neo-Darwinian framework, that has been falsified by the very same quantum entanglement effect that one is seeking an explanation to! To give a coherent explanation for an effect that is shown to be completely independent of any time and space constraints one is forced to appeal to a cause that is itself not limited to time and space! Probability arguments, which have been a staple of the arguments against neo-Darwinism, simply do not apply in trying to explain quantum entanglement in biology, since it is shown to be impossible for quantum entanglement to be explained by the materialistic framework in the first place! i.e. It simply does not follow for neo-Darwinism to even begin to presume itself to be the sufficient cause for the effect in question!

    The Failure Of Local Realism – Materialism – Alain Aspect – video

    Physicists close two loopholes while violating local realism – November 2010
    Excerpt: The latest test in quantum mechanics provides even stronger support than before for the view that nature violates local realism and is thus in contradiction with a classical worldview.

    Quantum Measurements: Common Sense Is Not Enough, Physicists Show – July 2009
    Excerpt: scientists have now proven comprehensively in an experiment for the first time that the experimentally observed phenomena cannot be described by non-contextual models with hidden variables.

    Further notes:

    Quantum entanglement holds together life’s blueprint – 2010
    Excerpt: “If you didn’t have entanglement, then DNA would have a simple flat structure, and you would never get the twist that seems to be important to the functioning of DNA,” says team member Vlatko Vedral of the University of Oxford.

    Information and entropy – top-down or bottom-up development in living systems? A.C. McINTOSH
    Excerpt: It is proposed in conclusion that it is the non-material information (transcendent to the matter and energy) that is actually itself constraining the local thermodynamics to be in ordered disequilibrium and with specified raised free energy levels necessary for the molecular and cellular machinery to operate.

    The ‘Fourth Dimension’ Of Living Systems

  3. 3
    PaV says:

    IIRC, I looked at Hunt’s paper years ago. My reaction was exactly as yours: he conflates optimization with realization.

    IIRC, he wanted to say that there was a whole family of proteins that were functional and which could be arrived at via putative evolutionary mechanisms. But, of course, in looking at things in this fashion, he is merely, and conveniently, overlooking how infinitesimal the entire family of proteins is given the phase space we’re dealing with.

  4. 4
    Arthur Hunt says:

    Hi Jonathan,

    Once again, thanks for the interest in my essays.

    Above, you said:

    The key shortcoming of Hunt’s analysis appears to be in the categoric conflation of (a) the rarity of functional folds in sequence space, and (b) the ability to optimise those functional folds. But it was the purpose of Axe’s 2004 JMB paper to provide an estimate for the former, and not the latter.

    I am having a hard time understanding how you can conclude this. In my essay, I thought the following statement was pretty clear:

    Put pictorially, the issue that ID proponents are arguing about is the relative structure, or shape, of the topography of functional sequences in all of sequence space. To illustrate, the issue becomes one of the parameters of the hill shown in this figure (we’ll call it Figure 1) (I’ll skip the figure – please refer to the essay):

    In this illustration, the base formed by the X and Y axes represents the sequence “space”, each hypothetical point or patch would depict a different sequence, and the Z-axis depicts some measure of activity. The “accessibility” of function, using this illustration, is a matter of the area of the base of the hill shown – the broader the base, the greater the number of related and functional sequences, and the greater the number of ways that function may be “found”. The idea that ID proponents push is that, if such a hill has a narrow enough base, then it is not likely that random processes can “find” even the base of the hill, let alone the peak.

    In case this isn’t plain enough, I’ll make it shorter – in my essay, I am in no uncertain terms speaking about the rarity of functional folds in sequence space.

    I’ll also assert that my criticisms of Axe’s conclusions remain valid – no one has begun to address my discussion of the technical issues that render the generality of the claims about said rarity moot.

    I would also note this statement, which remains as true today (Axe’s later review notwithstanding) as it was in 2007:

    Studies such as these involve what Axe calls a “reverse” approach – one starts with known, functional sequences, introduces semi-random mutants, and estimates the size of the functional sequence space from the numbers of “surviving” mutants. Studies involving the “forward” approach can and have been done as well. Briefly, this approach involves the synthesis of collections of random sequences and isolation of functional polymers (e.g., polypeptides or RNAs) from these collections. Historically, these studies have involved rather small oligomers (7-12 or so), owing to technical reasons (this is the size range that can be safely accommodated by the “tools” used). However, a relatively recent development, the so-called “mRNA display” technique, allows one to screen random sequences that are much larger (approaching 100 amino acids in length). What is interesting is that the forward approach typically yields a “success rate” in the 10^-10 to 10^-15 range – one usually need screen between 10^10 -> 10^15 random sequences to identify a functional polymer. This is true even for mRNA display. These numbers are a direct measurement of the proportion of functional sequences in a population of random polymers, and are estimates of the same parameter – density of sequences of minimal function in sequence space – that Axe is after.

    10^-10 -> 10^-63 (or thereabout): this is the range of estimates of the density of functional sequences in sequence space that can be found in the scientific literature. The caveats given in Section 2 notwithstanding, Axe’s work does not extend or narrow the range. To give the reader a sense of the higher end (10^-10) of this range, it helps to keep in mind that 1000 liters of a typical pond will likely contain some 10^12 bacterial cells of various sorts. If each cell gives rise to just one new protein-coding region or variant (by any of a number of processes) in the course of several thousands of generations, then the probability of occurrence of a function that occurs once in every 10^10 random sequences is going to be pretty nearly 1. In other words, 1 in 10^10 is a pretty large number when it comes to “probabilities” in the biosphere.
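
The pond arithmetic above can be checked with a short calculation. What follows is only a minimal sketch in Python, under assumptions that are not part of the essay: each of the 10^12 cells is treated as contributing exactly one independent, uniformly random sequence, and the function name `prob_at_least_one` is hypothetical. It compares the forward-approach densities quoted above with the 1 in 10^77 figure from Axe (2004) discussed in the post.

```python
import math


def prob_at_least_one(density: float, trials: float) -> float:
    """P(at least one hit) = 1 - (1 - density)**trials for independent trials.

    Assumption: every trial is an independent draw with the same hit
    probability `density`. log1p/expm1 are used because forming
    (1 - density) directly loses all precision when density is tiny.
    """
    return -math.expm1(trials * math.log1p(-density))


# 10^12 cells, one new random sequence each (figure taken from the text).
POND_TRIALS = 1e12

for label, density in [
    ("forward approach, 10^-10", 1e-10),
    ("forward approach, 10^-15", 1e-15),
    ("Axe (2004), 10^-77", 1e-77),
]:
    p = prob_at_least_one(density, POND_TRIALS)
    print(f"{label}: expected hits = {density * POND_TRIALS:.3g}, "
          f"P(at least one) = {p:.3g}")
```

Under these assumptions a density of 10^-10 gives about 100 expected hits and a probability very close to 1, while a density of 10^-77 gives a probability of roughly 10^-65 for the same pond. The calculation shows how sensitive the conclusion is to which density estimate is used; it does not settle which estimate is the right one.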

    I hope that clarifies and focuses things here.
