Proteins Fold As Darwin Crumbles

A review of "The Case Against a Darwinian Origin of Protein Folds" by Douglas Axe, BIO-Complexity, Issue 1, pp. 1-12

Proteins adopt higher-order structures (e.g., alpha helices and beta sheets) that define their functional domains.  Years ago Michael Denton and Craig Marshall reviewed this higher structural order in proteins and proposed that protein folding patterns could be classified into a finite number of discrete families whose construction might be constrained by a set of underlying natural laws (1).  In his latest critique, Biologic Institute molecular biologist Douglas Axe raises the ever-pertinent question of whether Darwinian evolution can adequately explain the origin of protein folds, given the vast search space of possible sequences for even a moderately large protein of, say, 300 amino acids.  To begin, Axe introduces his readers to the sampling problem: given the postulated maximum number of distinct physical events that could have occurred since the universe began (10^150), evolution cannot have had enough time to sample more than a vanishing fraction of the roughly 10^390 possible amino-acid combinations of a 300-amino-acid protein.
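
As a quick sanity check on these magnitudes, the arithmetic can be reproduced in a few lines of Python. This is only a sketch of the numbers quoted above (20 standard amino acids, a 300-residue chain, 10^150 physical events); it is not code from Axe's paper, and the variable names are illustrative.

```python
# Sketch of the sampling-problem arithmetic (assumed values, not from Axe's paper).
from math import log10

protein_length = 300        # residues in the example protein
alphabet = 20               # standard amino acids per position
max_events_log10 = 150      # postulated 10^150 distinct physical events

# log10 of the sequence space 20^300, computed without building the huge integer
sequence_space_log10 = protein_length * log10(alphabet)   # ~390.3

print(f"sequence space: 20^{protein_length} ~ 10^{sequence_space_log10:.0f}")
print(f"best-case fraction sampled: 10^{max_events_log10 - sequence_space_log10:.0f}")
```

Running this prints a sequence space of roughly 10^390, so even 10^150 events could sample at most about one part in 10^240 of it.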

The battle cry often heard in response to this apparently insurmountable barricade is that even though probabilistic resources would not allow a blind search to stumble upon any given protein sequence, the chances of finding a particular protein function might be considerably better.  Countering such a facile dismissal, we find that proteins must meet very stringent sequence requirements if a given function is to be attained.  And size is important: enzymes, for example, are large in comparison to their substrates, and protein structuralists have shown that this size is crucial for ensuring the stability of protein architecture.

Axe raises the bar of the discussion by pointing out that enzyme catalytic functions very often depend on more than just their core active sites.  In fact, enzymes almost invariably contain regions that prep, channel and orient their substrates, as well as a multiplicity of co-factors, in readiness for catalysis.  Carbamoyl phosphate synthetase (CPS) and the proton-translocating synthase (PTS) stand out as favorites amongst molecular biologists for showing how enzyme complexes are capable of simultaneously coordinating such processes.  Each of these complexes contains 1,400-2,000 amino acid residues distributed amongst several proteins, all of which are required for activity.

Axe employs a relatively straightforward mathematical rationale for assessing the plausibility of finding novel protein functions through a Darwinian search.  Using bacteria as his model system (chosen because of their relatively large population sizes), he shows how a culture of 10^10 bacteria passing through 10^4 generations per year over five billion years would produce a maximum of 5×10^23 novel genotypes.  This number represents the 'upper bound' on the number of new protein sequences, since many of the differences in genotype would not generate "distinctly new proteins".  Extending this further, for a blind search of that size to have a realistic chance of success, a novel function requiring a 300-amino-acid sequence (20^300 possible sequences) would have to be achievable in roughly 10^366 different ways (20^300/5×10^23).
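
The same kind of back-of-the-envelope check applies to this bacterial upper bound. The sketch below, under the figures quoted above (10^10 cells, 10^4 generations per year, five billion years), multiplies out the maximum number of new genotypes and divides it into the 20^300 sequence space; it is an illustration of the review's arithmetic, not Axe's own code.

```python
# Sketch of the upper-bound estimate described above (assumed values from the review).
from math import log10

population = 1e10             # bacteria in the model culture
generations_per_year = 1e4
years = 5e9                   # five billion years

max_genotypes = population * generations_per_year * years   # ~5e23 new genotypes

protein_length = 300
alphabet = 20
sequence_space_log10 = protein_length * log10(alphabet)     # log10(20^300) ~ 390.3

# For a blind search of ~5e23 samples to plausibly hit a function, sequences
# performing that function would need to number roughly 20^300 / 5e23.
required_abundance_log10 = sequence_space_log10 - log10(max_genotypes)

print(f"maximum new genotypes: {max_genotypes:.0e}")
print(f"functional sequences needed for a plausible blind search: ~10^{required_abundance_log10:.1f}")
```

The output (about 5e+23 genotypes and roughly 10^366 required functional sequences) matches the figures cited in the review.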

Ultimately we find that proteins do not tolerate this extraordinary level of "sequence indifference".  High-profile mutagenesis experiments on beta-lactamases and bacterial ribonucleases have shown that functionality is decisively abolished when a mere 10% of amino acids are substituted in conserved regions of these proteins.  A more in-depth breakdown of data from a beta-lactamase domain and the enzyme chorismate mutase has further reinforced the conclusion that very few protein sequences can actually perform a desired function; so few, in fact, that they are "far too rare to be found by random sampling".

But Axe's landslide evaluation does not end here.  He further considers the possibility that disparate protein functions might share similar amino-acid identities, and that the jump between functions in sequence space might therefore be realistically achievable through random searches.  Sequence alignment studies between different protein domains do not support such an escape from the sampling problem.  While the identification of a single amino-acid conformational switch has been heralded in the peer-reviewed literature as a convincing example of how changes in folding can occur with minimal adjustments to sequence, the resulting conformational variants turn out to be unstable at physiological temperatures.  Moreover, such a change has only been achieved in vitro and most probably does not meet the rigorous demands for functionality that play out in a true biological context.  We also find that 21 other amino-acid substitutions must be in place before the conformational switch is observed.

Axe closes his compendious dismantling of protein evolution by exposing the shortcomings of modular assembly models that purport to explain the origin of new protein folds.  The highly cooperative nature of structural folds in any given protein means that stable structures tend to form all at once at the domain (tertiary structure) level rather than at the fold (secondary structure) level of the protein.  Context is everything.  Indeed, experiments have borne out the assertion that binding interfaces between different forms of secondary structure are sequence dependent (i.e., non-generic).  Consequently, the much-anticipated "modular transportability of folds" between proteins is highly unlikely.

Metaphors are everything in scientific argumentation, and Axe's story of a random search for gemstones dispersed across a vast multi-level desert serves him well for illustrating the improbability of a Darwinian search for novel folds.  Axe's own experience has shown that reticence towards accepting his probabilistic argument stems not from some non-scientific point of departure in what he has to say, but from deeply held prejudices against the end point that naturally follows.  Rather than a house of cards crumbling on slippery foundations, the case against the neo-Darwinian explanation is an edifice built on a firm substratum of scientific authenticity; so much so that critics of those who, like Axe, have stood firm in promulgating their case had better take note.

Read Axe’s paper at: http://bio-complexity.org/ojs/index.php/main/article/view/BIO-C.2010.1

Further Reading

  1. Michael Denton and Craig Marshall (2001), "Laws of form revisited," Nature 410, p. 417
Comments
Alex73, you asked a very good question:
"Can an intelligent agent, confined fully within this known universe, design life (exactly as we know it) from scratch? The agent knows all physical and chemical laws and has practically unlimited government funding for this project. The ideas of DNA, amino acids, proteins, lipid walls, ATP, etc., however, are yet to be discovered. ... Are there searches that are capable of generating the amount of biological information around us within 10 billion years or so?"
In my very unqualified opinion: unless quantum computers greatly increase computing capacity past what is maximally possible for computers built of "particles", not only is a comprehensive search of all relevant biological sequences impossible for random processes, but the search is also impossible for "an intelligent agent, confined fully within this known universe". Hopefully one of the more qualified computer programmers on UD can comment on this to give us a better idea of just how hard such a search would be for a "confined" intelligent agent armed with an ideal supercomputer.

Notes:

Book Review - Meyer, Stephen C. Signature in the Cell. New York: HarperCollins, 2009. Excerpt: "As early as the 1960s, those who approached the problem of the origin of life from the standpoint of information theory and combinatorics observed that something was terribly amiss. Even if you grant the most generous assumptions: that every elementary particle in the observable universe is a chemical laboratory randomly splicing amino acids into proteins every Planck time for the entire history of the universe, there is a vanishingly small probability that even a single functionally folded protein of 150 amino acids would have been created. Now of course, elementary particles aren't chemical laboratories, nor does peptide synthesis take place where most of the baryonic mass of the universe resides: in stars or interstellar and intergalactic clouds. If you look at the chemistry, it gets even worse—almost indescribably so: the precursor molecules of many of these macromolecular structures cannot form under the same prebiotic conditions—they must be catalysed by enzymes created only by preexisting living cells, and the reactions required to assemble them into the molecules of biology will only go when mediated by other enzymes, assembled in the cell by precisely specified information in the genome. So, it comes down to this: Where did that information come from? The simplest known free-living organism (although you may quibble about this, given that it's a parasite) has a genome of 582,970 base pairs, or about one megabit (assuming two bits of information for each nucleotide, of which there are four possibilities). Now, if you go back to the universe of elementary-particle Planck-time chemical labs and work the numbers, you find that in the finite time our universe has existed, you could have produced about 500 bits of structured, functional information by random search. Yet here we have a minimal information string which is (if you understand combinatorics) so indescribably improbable to have originated by chance that adjectives fail." http://www.fourmilab.ch/documents/reading_list/indices/book_726.html

In the year 2000 IBM announced the development of a new supercomputer, called Blue Gene, which was 500 times faster than any supercomputer built up until that time. It took 4-5 years to build. Blue Gene stands about six feet high and occupies a floor space of 40 feet by 40 feet. It cost $100 million to build. It was built specifically to better enable computer simulations of molecular biology. The computer performs one quadrillion (one million billion) computations per second. Despite its speed, it was estimated to take one entire year for it to analyze the mechanism by which JUST ONE "simple" protein will fold onto itself from its one-dimensional starting point to its final three-dimensional shape.
As well, armed with all the known laws of protein folding, our search is slow:

"Blue Gene's final product, due in four or five years, will be able to 'fold' a protein made of 300 amino acids, but that job will take an entire year of full-time computing." - Paul Horn, senior vice president of IBM research, September 21, 2000 http://www.news.com/2100-1001-233954.html

"SimCell," anyone? "Unfortunately, Schulten's team won't be able to observe virtual protein synthesis in action. Even the fastest supercomputers can only depict such atomic complexity for a few dozen nanoseconds." - cool cellular animation videos on the site: http://whyfiles.org/shorties/230simcell/

Networking a few hundred thousand computers together has reduced the time to a few weeks for simulating the folding of a single protein molecule: A Few Hundred Thousand Computers vs. A Single Protein Molecule - video http://www.metacafe.com/watch/4018233

As a sidelight to this, the complexity of computing the actions of even simple atoms quickly exceeds the capacity of our supercomputers of today: Delayed time zero in photoemission: New record in time measurement accuracy - June 2010. Excerpt: "Although they could confirm the effect qualitatively using complicated computations, they came up with a time offset of only five attoseconds. The cause of this discrepancy may lie in the complexity of the neon atom, which consists, in addition to the nucleus, of ten electrons. 'The computational effort required to model such a many-electron system exceeds the computational capacity of today's supercomputers,' explains Yakovlev." http://www.physorg.com/news196606514.html
bornagain77
July 1, 2010, 04:57 AM PST

gpuccio, Thanks for your reply. I agree, human intelligence has an enormous advantage over blind search. But is it enough? How many experiments or floating-point operations in a quantum model would be required to design a bacterium from totally, absolutely, scratch? How many interactions have to be examined between the gazillion chemical components of a human body? Just think about how much time it takes for pharmaceutical companies to test the effects of a single molecule. What about an entire biosphere, where all inhabitants can potentially interact? To be more precise: can we give an estimate of the number of 1-bit (yes/no) decisions required to produce the functional information around us? Given the limitations of the physical world, was there enough time and computational resource to make all of these decisions?
Alex73
July 1, 2010, 04:43 AM PST

Alex73: Good question. And here is my answer: No, an intelligent search has a huge advantage over an unguided search, even if the intelligent agent does not know the final detailed answer in advance. First of all, a conscious intelligent agent is guided by his conscious representations of reality. IOW:
a) He is aware of purposes
b) He perceives reality and creates intelligent maps of it
c) He can build explanatory models
d) He can make reasonable inferences
e) He is guided by innate cognitive principles (logic, mathematics)
f) He can recognize function and measure it
And so on. None of these are assumptions: they are just considerations derived from the direct observation of how design is realized in us human conscious intelligent beings. That's why humans can easily output original CSI, and build machines, and elaborate complex cognitive theories like QM, and write software, and output poetry and dramas, and so on. None of that would be within the reach of simple unconscious processes. So, intelligent consciousness really makes the difference: it's the whole magic of knowledge and of guided action.
gpuccio
July 1, 2010, 03:44 AM PST

Dear All, We have seen it proven many times that there are insufficient physical events in the entire known universe to explore the sequence space of a single 300-amino-acid protein. Now I have a question: Can an intelligent agent, confined fully within this known universe, design life (exactly as we know it) from scratch? The agent knows all physical and chemical laws and has practically unlimited government funding for this project. The ideas of DNA, amino acids, proteins, lipid walls, ATP, etc., however, are yet to be discovered. Is there a research program that will terminate with a complete biosphere within a few billion years? Are there searches that are capable of generating the amount of biological information around us within 10 billion years or so? Or would even an intelligent search need more than 10^150 calculations, so that the biological information cannot have originated from within this universe?
Alex73
July 1, 2010, 01:57 AM PST

Robert Deyes: I have already referred to this very good paper in many recent posts. It really deserved a thread of its own, so thank you for drawing attention to it. The basic information contained in protein domains remains, IMO, the strongest and most detailed model for ID. I hope this paper can help us have some in-depth discussion about that.
gpuccio
July 1, 2010, 01:12 AM PST
