As few as 19 000 human protein-coding genes?

From Human Molecular Genetics:

Abstract: Determining the full complement of protein-coding genes is a key goal of genome annotation. The most powerful approach for confirming protein-coding potential is the detection of cellular protein expression through peptide mass spectrometry (MS) experiments. Here, we mapped peptides detected in seven large-scale proteomics studies to almost 60% of the protein-coding genes in the GENCODE annotation of the human genome. We found a strong relationship between detection in proteomics experiments and both gene family age and cross-species conservation. Most of the genes for which we detected peptides were highly conserved. We found peptides for >96% of genes that evolved before bilateria. At the opposite end of the scale, we identified almost no peptides for genes that have appeared since primates, for genes that did not have any protein-like features or for genes with poor cross-species conservation. These results motivated us to describe a set of 2001 potential non-coding genes based on features such as weak conservation, a lack of protein features, or ambiguous annotations from major databases, all of which correlated with low peptide detection across the seven experiments. We identified peptides for just 3% of these genes. We show that many of these genes behave more like non-coding genes than protein-coding genes and suggest that most are unlikely to code for proteins under normal circumstances. We believe that their inclusion in the human protein-coding gene catalogue should be revised as part of the ongoing human genome annotation effort. Open access
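To make the abstract's headline numbers concrete, here is a minimal sketch of how a per-category detection rate (for example, ">96% of genes that evolved before bilateria") could be tallied. It is not the authors' pipeline: the gene names, category labels, study names, and the function `detection_rate_by_category` are all hypothetical, and a gene counts as detected if any one study reports a peptide for it.

```python
from collections import defaultdict


def detection_rate_by_category(gene_category, detected_by_study):
    """Return {category: fraction of genes with a peptide in at least one study}.

    gene_category: dict mapping gene id -> age/conservation category (hypothetical labels)
    detected_by_study: dict mapping study name -> set of gene ids with mapped peptides
    """
    # A gene is "detected" if any study reports at least one peptide for it.
    detected = set()
    for genes in detected_by_study.values():
        detected |= genes

    totals = defaultdict(int)
    hits = defaultdict(int)
    for gene, category in gene_category.items():
        totals[category] += 1
        if gene in detected:
            hits[category] += 1
    return {category: hits[category] / totals[category] for category in totals}


# Hypothetical toy data, for illustration only.
gene_category = {
    "geneA": "pre-bilateria",
    "geneB": "pre-bilateria",
    "geneC": "primate-specific",
    "geneD": "primate-specific",
}
detected_by_study = {
    "study1": {"geneA"},
    "study2": {"geneA", "geneB"},
}

rates = detection_rate_by_category(gene_category, detected_by_study)
print(rates)
```

Note that a rate of zero for a category only means no peptide was seen in the pooled studies. As the paper itself cautions, failing to detect a peptide does not prove the gene is non-coding.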

See also: The Science Fictions series at your fingertips (human evolution)

50 Replies to “As few as 19 000 human protein-coding genes?”

  1. 1
    wd400 says:

    Well, this should be interesting. Does anyone who has made a big deal out of ORFans wish to comment?

  2. 2
    JoeCoder says:


    In the paper they say they only looked at genes that had similar sequence in other organisms:

    To be a candidate de novo originated gene, in addition to having a potentially translatable open reading frame in the human genome, the gene must have been present, and disrupted (i.e., non-translatable), in both the chimpanzee and orangutan genomes, e.g., the chimpanzee and orangutan sequences must lack an ATG start codon or have frameshift-inducing indels or nucleotide differences that result in a premature stop codon.

    These are the genes that many evolutionists cited as evidence of genes evolving from non-genes. Meanwhile ID proponents have been making a deal out of “true orphans”, that lack any sequence identity at all in other organisms. So if anything this seems to be evidence against de novo proteins evolving?

    I’m not a biologist so please correct me if I’ve understood something incorrectly.

  3. 3
    wd400 says:

    I think you are reading a different paper?

  4. 4
    bornagain77 says:

    As to this comment from the abstract:

    “We found peptides for >96% of genes that evolved before bilateria.”

    So there has been almost no ‘evolution’ of proteins since before the Cambrian? But from whence are all the differences arising if there are practically no new proteins since before the Cambrian? And why are they allowed to go all the way back to before bilateria to establish similarity? As Dr. Axe has said before, it’s not the similarities that need explaining but the differences:

    Indeed, instead of assuming evolution to be true in the first place and then forcing the evidence to fit, it would be nice if Darwinists would actually try to prove Darwinism to be feasible before assuming it to be true:

    Peacefulness, in a Grown Man, That is Not a Good Sign – Cornelius Hunter – August 2011
    Excerpt: Evolution cannot even explain how a single protein first evolved, let alone the massive biological world that ensued. From biosonar to redwood trees, evolution is left with only just-so stories motivated by the dogma that evolution must be true. That dogma comes from metaphysics,

    Thou Shalt Not Put Evolutionary Theory to a Test – Douglas Axe – July 18, 2012
    Excerpt: “For example, McBride criticizes me for not mentioning genetic drift in my discussion of human origins, apparently without realizing that the result of Durrett and Schmidt rules drift out. Each and every specific genetic change needed to produce humans from apes would have to have conferred a significant selective advantage in order for humans to have appeared in the available time (i.e. the mutations cannot be ‘neutral’). Any aspect of the transition that requires two or more mutations to act in combination in order to increase fitness would take way too long (>100 million years).
    My challenge to McBride, and everyone else who believes the evolutionary story of human origins, is not to provide the list of mutations that did the trick, but rather a list of mutations that can do it. Otherwise they’re in the position of insisting that something is a scientific fact without having the faintest idea how it even could be.” Doug Axe PhD.

    Accounting for Variations – Dr. David Berlinski: – video

  5. 5
    bornagain77 says:

    as to this comment from the abstract:

    “we identified almost no peptides for genes that have appeared since primates”

    Although I find the preceding comment to be suspect, because of,,

    An integrated encyclopedia of DNA elements in the human genome – Sept. 6, 2012
    Excerpt: Analysis,,, yielded 57 confidently identified unique peptide sequences in intergenic regions relative to GENCODE annotation. Taken together with evidence of pervasive genome transcription, these data indicate that additional protein-coding genes remain to be found.


    Nonetheless, even if it is true that there are ‘almost no peptides for genes that have appeared since primates’, this would make the claim that humans evolved from apes even more remarkable than it already is, since the anatomical differences between chimps and humans are far greater than many people have realized,,,

    In “Science,” 1975, M-C King and A.C. Wilson were the first to publish a paper estimating the degree of similarity between the human and the chimpanzee genome.
    But…in the second section of their paper King and Wilson honestly describe the deficiencies of such reasoning:

    “The molecular similarity between chimpanzees and humans is extraordinary because they differ far more than sibling species in anatomy and way of life. Although humans and chimpanzees are rather similar in the structure of the thorax and arms, they differ substantially not only in brain size but also in the anatomy of the pelvis, foot, and jaws, as well as in relative lengths of limbs and digits (38).
    Humans and chimpanzees also differ significantly in many other anatomical respects, to the extent that nearly every bone in the body of a chimpanzee is readily distinguishable in shape or size from its human counterpart (38).
    Associated with these anatomical differences there are, of course, major differences in posture (see cover picture), mode of locomotion, methods of procuring food, and means of communication. Because of these major differences in anatomy and way of life, biologists place the two species not just in separate genera but in separate families (39). So it appears that molecular and organismal methods of evaluating the chimpanzee human difference yield quite different conclusions (40).”

    King and Wilson went on to suggest that the morphological and behavioral differences between humans and apes,, must be due to variations in their genomic regulatory systems.
    David Berlinski – The Devil’s Delusion – Page 162&163
    Evolution at Two Levels in Humans and Chimpanzees Mary-Claire King; A. C. Wilson – 1975

    And indeed it is in the genomic regulatory regions where the differences are found to be ‘orders of magnitude’ different between chimps and humans:

    “Where (chimps and humans) really differ, and they differ by orders of magnitude, is in the genomic architecture outside the protein coding regions. They are vastly, vastly, different.,, The structural, the organization, the regulatory sequences, the hierarchy for how things are organized and used are vastly different between a chimpanzee and a human being in their genomes.”
    Raymond Bohlin (per Richard Sternberg) – 9:29 minute mark of video

    Evolution by Splicing – Comparing gene transcripts from different species reveals surprising splicing diversity. – Ruth Williams – December 20, 2012
    Excerpt: A major question in vertebrate evolutionary biology is “how do physical and behavioral differences arise if we have a very similar set of genes to that of the mouse, chicken, or frog?”,,,
    A commonly discussed mechanism was variable levels of gene expression, but both Blencowe and Chris Burge,,, found that gene expression is relatively conserved among species.
    On the other hand, the papers show that most alternative splicing events differ widely between even closely related species. “The alternative splicing patterns are very different even between humans and chimpanzees,” said Blencowe.,,,

    Yet, it is in these regulatory regions (dGRNS) where random mutations are least likely to be tolerated:

    A Listener’s Guide to the Meyer-Marshall Debate: Focus on the Origin of Information Question -Casey Luskin – December 4, 2013
    Excerpt: “There is always an observable consequence if a dGRN (developmental gene regulatory network) subcircuit is interrupted. Since these consequences are always catastrophically bad, flexibility is minimal, and since the subcircuits are all interconnected, the whole network partakes of the quality that there is only one way for things to work. And indeed the embryos of each species develop in only one way.” –
    Eric Davidson

    Darwin or Design? – Paul Nelson at Saddleback Church – Nov. 2012 – ontogenetic depth (excellent update) – video
    Text from one of the Saddleback slides:
    1. Animal body plans are built in each generation by a stepwise process, from the fertilized egg to the many cells of the adult. The earliest stages in this process determine what follows.
    2. Thus, to change — that is, to evolve — any body plan, mutations expressed early in development must occur, be viable, and be stably transmitted to offspring.
    3. But such early-acting mutations of global effect are those least likely to be tolerated by the embryo.
    Losses of structures are the only exception to this otherwise universal generalization about animal development and evolution. Many species will tolerate phenotypic losses if their local (environmental) circumstances are favorable. Hence island or cave fauna often lose (for instance) wings or eyes.

    Thus where Darwinian theory most needs flexibility in order to be viable as a hypothesis, i.e. in developmental gene regulatory networks, is the place where it is found to be least flexible. Yet, it is in these gene regulatory networks where the greatest differences are found!

    Here are a few more notes on the dramatic anatomical differences between chimps and humans that need to be honestly addressed by Darwinists

    The Red Ape – Cornelius Hunter – August 2009
    Excerpt: “There remains, however, a paradoxical problem lurking within the wealth of DNA data: our morphology and physiology have very little, if anything, uniquely in common with chimpanzees to corroborate a unique common ancestor. Most of the characters we do share with chimpanzees also occur in other primates, and in sexual biology and reproduction we could hardly be more different. It would be an understatement to think of this as an evolutionary puzzle.”

    In fact so great are the anatomical differences between humans and chimps that a Darwinist actually proposed that a chimp and pig mated with each other and that is what ultimately gave rise to humans:

    A chimp-pig hybrid origin for humans? – July 3, 2013
    Excerpt: Dr. Eugene McCarthy,, has amassed an impressive body of evidence suggesting that human origins can be best explained by hybridization between pigs and chimpanzees. Extraordinary theories require extraordinary evidence and McCarthy does not disappoint. Rather than relying on genetic sequence comparisons, he instead offers extensive anatomical comparisons, each of which may be individually assailable, but startling when taken together.,,,
    The list of anatomical specializations we may have gained from porcine philandering is too long to detail here. Suffice it to say, similarities in the face, skin and organ microstructure alone is hard to explain away. A short list of differential features, for example, would include, multipyramidal kidney structure, presence of dermal melanocytes, melanoma, absence of a primate baculum (penis bone), surface lipid and carbohydrate composition of cell membranes, vocal cord structure, laryngeal sacs, diverticuli of the fetal stomach, intestinal “valves of Kerkring,” heart chamber symmetry, skin and cranial vasculature and method of cooling, and tooth structure. Other features occasionally seen in humans, like bicornuate uteruses and supernumerary nipples, would also be difficult to incorporate into a purely primate tree.

    Human hybrids: a closer look at the theory and evidence – July 25, 2013
    Excerpt: There was considerable fallout, both positive and negative, from our first story covering the radical pig-chimp hybrid theory put forth by Dr. Eugene McCarthy,,,By and large, those coming out against the theory had surprisingly little science to offer in their sometimes personal attacks against McCarthy.
    ,,,Under the alternative hypothesis (humans are not pig-chimp hybrids), the assumption is that humans and chimpanzees are equally distant from pigs. You would therefore expect chimp traits not seen in humans to be present in pigs at about the same rate as are human traits not found in chimps. However, when he searched the literature for traits that distinguish humans and chimps, and compiled a lengthy list of such traits, he found that it was always humans who were similar to pigs with respect to these traits. This finding is inconsistent with the possibility that humans are not pig-chimp hybrids, that is, it rejects that hypothesis.,,,

    The obvious question for me is, of course, since Darwinists are having such a hard time proving that we did not come from pig-chimp hybrids, what makes Darwinists so sure that we evolved from apes or anything else in the first place? Any reasonable person would realize that if a theory as dubious as the pig-chimp hybrid theory can cause such havoc to the empirical basis of what was supposed to be such well-established science, then perhaps the Darwinian story for human origins is not nearly as strong as Darwinists have dogmatically held it to be. Some might even hold that such ‘flimsiness’ suggests that the original theory was rubbish as hard science.

  6. 6
    bornagain77 says:

    of related interest to this claim,,,

    “We found peptides for >96% of genes that evolved before bilateria.”

    ,,,Is the fact that yeast and bacteria are both found to be very uncooperative to assumptions of common descent,,,

    Here Are Those Incongruent Trees From the Yeast Genome – Case Study – Cornelius Hunter – June 2013
    Excerpt: We recently reported on a study of 1,070 genes and how they contradicted each other in a couple dozen yeast species. Specifically, evolutionists computed the evolutionary tree, using all 1,070 genes, showing how the different yeast species are related. This tree that uses all 1,070 genes is called the concatenation tree. They then repeated the computation 1,070 times, for each gene taken individually. Not only did none of the 1,070 trees match the concatenation tree, they also failed to show even a single match between themselves. In other words, out of the 1,071 trees, there were zero matches. Yet one of the fundamental predictions of evolution is that different features should generally agree. It was “a bit shocking” for evolutionists, as one explained: “We are trying to figure out the phylogenetic relationships of 1.8 million species and can’t even sort out 20 yeast.”
    In fact, as the figure above shows, the individual gene trees did not converge toward the concatenation tree. Evolutionary theory does not expect all the trees to be identical, but it does expect them to be consistently similar. They should mostly be identical or close to the concatenation tree, with a few at farther distances from the concatenation tree. Evolutionists have clearly and consistently claimed this consilience as an essential prediction.
    But instead, on a normalized scale from zero to one (where zero means the trees are identical), the gene trees were mostly around 0.4 from the concatenation tree with a huge gap in between. There were no trees anywhere close to the concatenation tree. This figure is a statistically significant, stark falsification of a highly acclaimed evolutionary prediction.

    Widespread ORFan Genes Challenge Common Descent – Paul Nelson – video with references

    Estimating the size of the bacterial pan-genome – Pascal Lapierre and J. Peter Gogarten – 2008
    Excerpt: We have found greater than 139 000 rare (ORFan) gene families scattered throughout the bacterial genomes included in this study. The finding that the fitted exponential function approaches a plateau indicates an open pan-genome (i.e. the bacterial protein universe is of infinite size); a finding supported through extrapolation using a Kezdy-Swinbourne plot (Figure S3). This does not exclude the possibility that, with many more sampled genomes, the number of novel genes per additional genome might ultimately decline; however, our analyses and those presented in Ref. [11] do not provide any indication for such a decline and confirm earlier observations that many new protein families with few members remain to be discovered.

    At the 12:40 minute mark of the following ‘The Dictionary of Life’ video, Dr. Nelson describes the breaking point for Darwinian scenarios from the genetic evidence of Bacteria:

    The Dictionary of Life | Origins with Dr. Paul A. Nelson – video

    The essential genome of a bacterium – 2011
    Figure (C): Venn diagram of overlap between Caulobacter and E. coli ORFs (outer circles) as well as their subsets of essential ORFs (inner circles). Less than 38% of essential Caulobacter ORFs are conserved and essential in E. coli. Only essential Caulobacter ORFs present in the STING database were considered, leading to a small disparity in the total number of essential Caulobacter ORFs.

  7. 7
    PaV says:


    My initial perusal tells me that their results are perfectly conformable to ID expectations.

    I don’t see the problem. What does this have to do with ORFan genes?

  8. 8
    JoeCoder says:


    You’re right. I had two papers open at the same time and the text I copied was from the wrong one. Sorry about that.

  9. 9
    ppolish says:

    “The number of new genes that separate humans from mice [those genes that have evolved since the split from primates] may even be fewer than ten,” study co-author David Juan said in the press release.

    Seems genes are not really that important in differentiating? Gosh, which book has more errors – Origin of Species or The Selfish Gene? Tough one.

  10. 10
    bornagain77 says:

    So instead of 21,000 types of the billion trillion proteins that make up a human, we only have 19,000 types of the billion trillion proteins that make up a human? Well I guess that simplifies everything immensely. 🙂 Now if a Darwinist would be so kind as to tell us how these 19,000 types of the billion trillion proteins actually self assemble themselves into a human from a fertilized egg I guess we can all call it a day and go home:

    Excerpt: “If you think air traffic controllers have a tough job guiding planes into major airports or across a crowded continental airspace, consider the challenge facing a human cell trying to position its proteins”. A given cell, he notes, may make more than 10,000 different proteins, and typically contains more than a billion protein molecules at any one time. “Somehow a cell must get all its proteins to their correct destinations — and equally important, keep these molecules out of the wrong places”. And further: “It’s almost as if every mRNA [an intermediate between a gene and a corresponding protein] coming out of the nucleus knows where it’s going” (Travis 2011),,,
    Further, the billion protein molecules in a cell are virtually all capable of interacting with each other to one degree or another; they are subject to getting misfolded or “all balled up with one another”; they are critically modified through the attachment or detachment of molecular subunits, often in rapid order and with immediate implications for changing function; they can wind up inside large-capacity “transport vehicles” headed in any number of directions; they can be sidetracked by diverse processes of degradation and recycling… and so on without end. Yet the coherence of the whole is maintained.
    The question is indeed, then, “How does the organism meaningfully dispose of all its molecules, getting them to the right places and into the right interactions?”
    The same sort of question can be asked of cells, for example in the growing embryo, where literal streams of cells are flowing to their appointed places, differentiating themselves into different types as they go, and adjusting themselves to all sorts of unpredictable perturbations — even to the degree of responding appropriately when a lab technician excises a clump of them from one location in a young embryo and puts them in another, where they may proceed to adapt themselves in an entirely different and proper way to the new environment. It is hard to quibble with the immediate impression that form (which is more idea-like than thing-like) is primary, and the material particulars subsidiary.
    Two systems biologists, one from the Max Delbrück Center for Molecular Medicine in Germany and one from Harvard Medical School, frame one part of the problem this way:
    “The human body is formed by trillions of individual cells. These cells work together with remarkable precision, first forming an adult organism out of a single fertilized egg, and then keeping the organism alive and functional for decades. To achieve this precision, one would assume that each individual cell reacts in a reliable, reproducible way to a given input, faithfully executing the required task. However, a growing number of studies investigating cellular processes on the level of single cells revealed large heterogeneity even among genetically identical cells of the same cell type. (Loewer and Lahav 2011)”,,,
    And then we hear that all this meaningful activity is, somehow, meaningless or a product of meaninglessness. This, I believe, is the real issue troubling the majority of the American populace when they are asked about their belief in evolution. They see one thing and then are told, more or less directly, that they are really seeing its denial. Yet no one has ever explained to them how you get meaning from meaninglessness — a difficult enough task once you realize that we cannot articulate any knowledge of the world at all except in the language of meaning.,,,

    Stephen Meyer – Functional Proteins And Information For Body Plans – video

    Dr. Stephen Meyer comments at the end of the preceding video,,,

    ‘Now one more problem as far as the generation of information. It turns out that you don’t only need information to build genes and proteins, it turns out to build Body-Plans you need higher levels of information; Higher order assembly instructions. DNA codes for the building of proteins, but proteins must be arranged into distinctive circuitry to form distinctive cell types. Cell types have to be arranged into tissues. Tissues have to be arranged into organs. Organs and tissues must be specifically arranged to generate whole new Body-Plans, distinctive arrangements of those body parts. We now know that DNA alone is not responsible for those higher orders of organization. DNA codes for proteins, but by itself it does not insure that proteins, cell types, tissues, organs, will all be arranged in the body. And what that means is that the Body-Plan morphogenesis, as it is called, depends upon information that is not encoded on DNA. Which means you can mutate DNA indefinitely. 80 million years, 100 million years, til the cows come home. It doesn’t matter, because in the best case you are just going to find a new protein some place out there in that vast combinatorial sequence space. You are not, by mutating DNA alone, going to generate higher order structures that are necessary to building a body plan. So what we can conclude from that is that the neo-Darwinian mechanism is grossly inadequate to explain the origin of information necessary to build new genes and proteins, and it is also grossly inadequate to explain the origination of novel biological form.’
    Stephen Meyer – (excerpt taken from Meyer/Sternberg vs. Shermer/Prothero debate – 2009)

    Darwin’s Doubt narrated by Paul Giem – The Origin of Body Plans – video

  11. 11
    wd400 says:


    They find no evidence for proteins being made from most of the lineage-specific sequences annotated as protein-coding. I think the relevance is pretty self-explanatory from there?

  12. 12
    phoodoo says:

    bornagain77 re post 10:

    That is a pretty devastating takedown of evolution. Great source of information.

  13. 13
    PaV says:


    I suppose you’re trying to say that the lineage-specific sequences (ORFans) found in humans are “non-coding.” But this is exactly as IDists would have predicted.

  14. 14
    wd400 says:

    ORF stands for Open Reading Frame — you can’t have a non-coding ORF.

  15. 15
    PaV says:


    Then why did you bring up the whole issue of ORFan genes?

    Again, if we’re dealing with annotated lineage-specific sequences that are ‘not’ protein-coding (i.e., are non-coding) then this is just what ID has been expecting to find for years.

  16. 16
    wd400 says:

    ORFans are lineage-specific genes that are annotated as being protein coding. This paper suggests that many lineage-specific sequences annotated as being protein coding are in fact not making proteins.

    Since folks here have made a big deal about ORFans this seems like an interesting result, no?

  17. 17
    Dr JDD says:

    As I have said before when discussions have arisen about such a peptide approach, we must remember that it relies on the sensitivity of your Mass Spec (MS) capabilities. As described even by the authors of this paper in the introduction:

    However, while MS evidence can be used to verify protein-coding potential, the low coverage of proteomics experiments implies that the reverse is not true. Not detecting peptides does not prove that the corresponding gene is non-coding because it may be a consequence of the protein being expressed in few tissues, having very low abundance, or being degraded quickly.

    I would also add a point of my own about sensitivity. Some proteins may be expressed, but only very transiently, under varying stresses and other conditions (there is a lot of good evidence for this), and also at very low levels. MS requires enough signal to rise above the noise and to be confidently distinguished as a real event, i.e. a peptide it can identify. If the signal is too low, it will be lost in the noise. Of course, instrument sensitivity has improved in the last few years, but the principle still remains.
    What the authors here are doing is taking already published work, setting criteria for inclusion of peptides in their analysis, and drawing their conclusions from that. This is a fair approach, of course, provided all of the caveats that come with this type of work are taken into account. Let us look first at the source of these peptides:

    We collected peptides from seven separate MS sources. Two came from large-scale proteomics databases, PeptideAtlas (26) and NIST. Another four, referred to as ‘Geiger’, ‘Muñoz’, ‘Nagaraj’ and ‘Neuhauser’ throughout the paper, were recently published large-scale MS experiments (20,22–24). For all six datasets, the starting point was the list of peptides provided by the authors or databases. We generated the final set of peptides (referred to as ‘CNIO’) in house from an X!Tandem (27) search against spectra from the GPM (28) and PeptideAtlas databases, following the protocol set out in Ezkurdia et al. (18) with a false discovery rate of 0.1%. These seven studies cover a wide range of search engines, tissues and cell types.

    So they used a mixture of already described sources, and also used confidence levels to determine what to include.
    – Peptide Atlas – a public domain repository of MS peptide data
    – NIST – as per peptide atlas, linked in some way I think
    – 4 others from published datasets: Geiger was from 11 “common” cell lines (which are really poor representatives as they will be immortalised cells heavily cultured and not representative of the real scenario for sure – are they human??), Munoz was from embryonic and induced pluripotent stem cells alone, Nagaraj was from a single cancer cell line (very unrepresentative) and Neuhauser I cannot figure out what they did as it is not clear (see abstract here: with note to the final sentence

    We apply our high performance platform to investigate incremental coverage of the human proteome by high resolution MS data originating from in-depth cell line and cancer tissue proteome measurements.

    where again, I highlight cell line and cancer tissue – no mention of different developmental stages or of normal primary cells from different tissues).
    – Finally, their own approach appears to me to be a modified search algorithm of their own, applied to databases already described
    They make the statement:

    These seven studies cover a wide range of search engines, tissues and cell types.

    However, to me, two of the papers look at just 1-2 cell types, and a third looks at 11 cell lines (cell lines, as said, are highly unrepresentative of the real in vivo situation; they are also 2-d models, which we already know show different gene expression from 3-d models, as anyone currently working in cell biology in pharmaceuticals will know).
    Now let’s look at what they state in the conclusion, related to the quoted point in the introduction:

    Of course, the absence of peptides in proteomics analyses does not imply that a protein is not expressed. There are many reasons why peptides are not detected in proteomics experiments, for example, the proteins may be present in limited tissues or developmental stages, they may be expressed in very low quantities or, like the HOX genes, have a very short half-lives. Some may be only activated by certain stresses (25), and still other proteins will have features, such as multiple trans-membrane helices, that make them difficult to detect for technical reasons.
    However, the seven proteomics studies covered a wide range of cell types, making it less likely that one of the main reasons for not detecting a protein, i.e. that it is expressed in limited tissues or developmental stages. The PeptideAtlas database alone is a compendium of experiments carried out on 51 different tissue and cell types, and the PeptideAtlas database forms just a part of the CNIO study and the NIST database. Six of the seven studies were carried out on a range of tissues, and together these studies cover considerably more cell types than UniGene. Although the Human Proteome Project (25) has reported that early developmental stages are still under represented in proteomics experiments, the Muñoz analysis used in this paper (22) interrogated embryo and pluripotent stem cells and found relatively few previously undetected proteins. However, despite the variety of tissues interrogated in our analysis, it is to be expected that some proteins will remain undetected because they are tissue specific.

    So they seem to dismiss the idea that it is tissue restriction, yet much of what these databases and papers they cite, have come from analysis of cultured cells that are immortalised. This is not representative at all. Cells that are in 2-d growth, immortalised, in DMEM or RPMI supplemented with fetal calf serum, antibiotics and l-glutamine have very different characteristics and look very different to real, normal primary cells. From what I can see, most of the data they have taken comes from cell lines that will have been treated in this way.
    Now this is not a criticism – the work is good and sound, but its limitations must be taken into account, and I find it odd that this is not commented on more. Also, there is little discussion of some of the more pertinent points and limitations of MS, and too much weight is given to the apparent “breadth” of the databases and publications they have used.
    What the paper seems to suggest to me, is that when you look at the common datasets that the most conserved proteins generate the most reliable data and are most commonly found. That is not a surprise. We also know that it is NOT simply tissue expression where differential gene expression occurs – it is also environmental stresses. This has recently been shown with many 100s-1000s of new genes identified in Drosophila due to various stresses, such as alcohol, temperature, etc, etc. This analysis is very much unlikely to be able to even start to address this question.
    Finally, we need to place this paper into the context of other recent findings, using NEW and more sensitive and more directed specific approaches, to overcome the limitations of these existing datasets used here. One big one came out very recently, and found some proteins that came from genetic regions predicted to have no genes present. That is a testament in itself to how more modern analyses (rather than older datasets) may be able to better address these questions about rarer more novel/specifically expressed proteins:
    Finally, I point you to another recent publication here:
    Note this in the abstract:

    Chromosome-centric human proteome project (C-HPP) is a global initiative to comprehensively characterize proteins encoded by genes across all human chromosomes by teams focusing on individual chromosomes. Here, we report mass spectrometry-based identification and characterization of proteins encoded by genes on chromosome 12. Our study is based on proteomic profiling of 30 different histologically normal human tissues and cell types using high-resolution mass spectrometry. In our analysis, we identified 1,535 proteins encoded by 836 genes on human chromosome 12. This includes 89 genes that are designated as “missing proteins” by “neXtProt” as they did not have any prior evidence either by mass spectrometry or by antibody-based detection methods. We identified several variant peptides that reflected coding SNPs annotated in dbSNP database. We also confirmed the start sites of ~200 proteins by identifying protein N-terminal acetylated peptides. We also identified alternative start sites for 11 proteins that were not annotated in public databases until now. Most importantly, we identified 12 novel protein coding regions on chromosome 12 using our proteogenomics strategy. All of the 12 regions have been annotated as pseudogenes in public databases. This study demonstrates that there is scope for significantly improving annotation of protein coding genes in the human genome using mass-spectrometry-derived data. Individual efforts as part of C-HPP initiative should significantly contribute toward enriching human protein annotation. The data have been deposited to ProteomeXchange with identifier PXD000561.

    Emphasis in bold is mine. Please note that >10% of the genes were designated “missing proteins”, that annotated pseudogenes had real protein products, and, most relevant to this discussion, that these 10% of genes designated as missing proteins were not previously found by MS.
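
    The ">10%" figure can be checked against the two counts given in the quoted abstract (836 genes identified on chromosome 12, 89 of them previously listed as "missing proteins"). A minimal Python sketch follows; the variable names are illustrative assumptions and do not come from the paper.

```python
# Counts taken from the quoted C-HPP chromosome 12 abstract.
genes_identified = 836        # genes on chromosome 12 with identified proteins
missing_protein_genes = 89    # of those, previously designated "missing proteins"

# Share of identified genes that had no prior MS or antibody evidence.
fraction_missing = missing_protein_genes / genes_identified
print(f"{fraction_missing:.1%} of identified genes were 'missing proteins'")
```

    This only restates the abstract's own numbers; it says nothing about how representative chromosome 12 is of the rest of the genome.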
    So while the original study posted is interesting, it must be taken with all the caveats and other information we have at hand. I think the case for novel proteins is still pretty strong, and this does not defeat ORFans. If anything, as others have said, it is all the more amazing that we are SO different to other organisms yet with relatively few proteins.
    Which leads onto a final point. Clearly, the regulation by non-proteinaceous components of the cell is crucial to defining the characteristics of an organism. Much of our genome is being found to encode, or is likely to encode, regulatory components such as regulatory RNA molecules, or simply to act as a regulatory moiety itself. Evolutionists have always focused on the evolution of proteins through mutation in the DNA to give new amino acid changes and new protein functions. However it seems that this is not a shrinking problem for them as there is a whole other random mutation “space” for evolution to account for and that is the plethora of regulatory molecules that are quite different to the simple DNA–>RNA–>protein central dogma.

  18. 18
    bornagain77 says:

    Dr JDD, thank you for taking the time to do, IMHO, such a fair and balanced analysis of the study. I found your explanation easy to follow, i.e. not over my head, and I really appreciated that! 🙂

  19. 19
    Mung says:

    There must be a lot of non-human in humans.

  20. 20
    Mapou says:

    It makes sense to me as a programmer that regulatory genes should outnumber protein coding genes by a very large margin.

  21. 21
    Moose Dr says:

    PaV (13) I suppose you’re trying to say that the lineage-specific sequences (ORFans) found in humans is “non-coding.” But this is exactly as IDists would have predicted.

    How is this predicted by ID? It would appear to me that ORFans that produce genes playing a critical role would produce a much better case for ID.

  22. 22
    JoeCoder says:

    Thanks for the detailed analysis, Dr JDD.

  23. 23
    gpuccio says:

    Dr JDD:

    Very good work! Thank you.

  24. 24
    Dr JDD says:

    Thanks for the positive comments, and I am glad what I said has made some sense.

    I should clarify that I am not dismissing the study. However, it has to be seen for what it is: Firstly, the data it used and the limitations of the technology; Secondly, in light of other recent data that demonstrates evidence for peptides from novel proteins and alludes to the fact these are difficult to detect. Again I will say the best data from these studies comes from using primary normal tissue, as opposed to the use of cultured cell lines and even cultured normal primary cells. Much of the public domain data and even publications use such cells and these are very poorly representative of reality (i.e. in vivo).

    Additionally, we must remember that we need to follow the trail of evidence and we need not to get hung up on one particular strand of an argument. Wd400 raised this issue as a blow perhaps for the ORFan fans. Indeed, it appeared that way and perhaps I have tried to argue that due to the limitations and design of the study, and taking all things into account that this is not a blow for ORFans, at least from my understanding of the study.

    However, what often fails to be acknowledged, in my opinion, is that while little arguments like ORFans can support the increased complexity and variability between species, and hence can be argued as support for ID, they are not the crux of the ID argument. They are a small fragment. As I said at the end of my earlier post, when we step back and look at the big picture, what we see is that in 6-7 million years, what man is compared to counterparts that have apparently evolved from chimpanzees (other primates) is quite staggering, in particular with regard to our intelligence, our processing abilities and all our other various skills, talents and abilities (complex reasoning, typing out this post…etc). If we generally use the majority of the same proteins – and not just as primates, but also as much simpler organisms – then we must conclude that the separation of what makes species different is far beyond the genetic code. That code is what is the supposed holy grail for evolution to do its work on. That code provides evolution with the mechanism by which simple random mutation can lead to a selected benefit, so that over time new function can occur. Yet the biggest differences are likely to be not simply the presence of a gene or not, but more likely timing, regulation, and other events we have not yet encountered or fully described/understood. Therefore, evolution has to act out on much more than simply bringing about a new gene randomly, it has to do so in an appropriate spatiotemporal positioning that is tightly regulated in different tissue types in response to different other cellular cues.

    Then when we consider the real crux of the evolution vs ID argument, which is the mathematical modelling of the sequence search space that random mutations must cover to generate functionality (apart from any regulation of such a gene in timing and amounts and turn-over), we are only adding to that problem, as we have a whole host of regulatory molecules and events to randomly produce, alongside our randomly produced novel genes/proteins, in parallel to allow proper functionality. This is where evolution is perceived by us in the ID camp as falling incredibly short of the mark, and that does not in any way hinge upon whether ORFans are real or not.

    This is why I am pleased to see an article like this posted here on UD. It allows us to examine our own understanding of the arguments used for and against, and follow what the evidence seems to be saying. I don’t want to use a very poor argument for ID just for the sake of supporting ID, as then I am no better than the evolutionist who, when confronted with a complex, incredible mechanism, simply says, “Evolution is amazing, look at what it has done!” which is not really a valid explanation if the process truly is naturalistic.

    In this case however, I quite strongly believe that this publication does not supply sufficient evidence to question the presence of ORFan proteins or similar issues that it apparently may raise.

  25. 25
    bornagain77 says:

    OT: here is a look at that recent ribosome paper:

    Imagine How It Happened! “Evolution Presents” the Ribosome, “Nature’s Masterpiece” – July 9, 2014
    Excerpt: There are even more reasons to reject the evolutionary hypothesis in the PNAS paper on which the film was based. The authors provide no evidence that the “common core” (Phase 1 in the film) of the large ribosomal subunit (LSU) was able to do anything on its own. There is a small ribosomal subunit (SSU) that has to match it. Even more important, a ribosome is useless without a genome! How do they handle that? “In our model, the LSU has evolved in distinct phases,” the paper speculates. “This process started with the formation of the P site, possibly in an RNA world, and continues today in eukaryotes.” So they lean on the RNA world scenario, which we have shown many times is untenable. This is recognized even by evolutionists, such as Niles Lehman, whom Casey Luskin quoted as saying, “The odds of suddenly having a self-replicating RNA pop out of a prebiotic soup are vanishingly low.” This stops the tale before it even starts.
    The authors try to make the “common core” look small and simple, but the LSU of the simplest bacterium contains on the order of 3,000 nucleotides. The small rRNA subunit (SSU) contains another 1,500 more. These are much larger (and more complex) than anything that origin-of-life researchers could ever hope for in an RNA world.
    Even more problematic for evolution, both ribosomal subunits for the simplest bacterium contain dozens of protein parts integrated with the RNA parts. But the proteins had to be translated by the very ribosome the evolutionists are trying to explain! It’s a profound chicken-and-egg problem that Williams and his co-authors gloss over

  26. 26
    wd400 says:

    Re the study,

    Of course proteomic studies have less power than genomic ones at present, and it’s likely that there are at least some human-specific protein coding genes. It remains the case, however, that anyone trying to make something from “ORFans” has to deal with the fact that many of the apparent-ORFan genes may in fact not produce proteins (and analyses beyond the proteomic ones support this).

    I don’t know what to make of the claim

    [the genetic] code is what is the supposed holy grail for evolution to do its work on

    Care to offer a citation for this view? Somewhere in all that link spam BA mentioned King and Wilson, who made the point that much of the evolutionary difference between human and chimp would be explained by gene regulation, and that was 1975!

    Same goes for this

    Evolutionists have always focused on the evolution of proteins through mutation in the DNA to give new amino acid changes and new protein functions. However it seems that this is not a shrinking problem for them as there is a whole other random mutation “space” for evolution to account for and that is the plethora of regulatory molecules that are quite different to the simple DNA–>RNA–>protein central dogma.

  27. 27
    Dr JDD says:

    Of course I accept many of the apparent ORFan genes may not produce proteins but I am also saying this study is not sufficient evidence to suggest ORFans aren’t a thing as was your original implication.

    In fact, let’s play the game you are playing here.
    1) IDists harp on about ORFan genes as evidence for ID etc
    2) This paper shows virtually no/very low evidence of ORFan genes giving protein products
    3) Therefore, IDists are wrong in using ORFan genes as support of ID

    So here is my turn:
    1) Evolutionists harp on about pseudogenes being remnants of dead genes, byproducts of wasteful evolution
    2) The paper I cite above (last one) finds 12 genes annotated on Chr12 as being pseudogenes actually are real genes giving protein products
    3) Therefore, evolutionists are wrong to use Pseudogenes as a support of evolution

    Now the same arguments will ensue from you why this is not valid – “it’s only 12 genes” and I say “well it is a MS peptide approach which has sensitivity issues and it also found 10% of genes not previously identified as proteins” blah blah blah.

    The point is, there may well be ORFans that do not express protein products. But again, the point is do they need to be? Maybe they are regulatory RNAs or other functional sequences. The other point is maybe many pseudogenes are duplicated genes or broken genes. So? Does that refute ID? No, of course not. Neither of these papers in themselves refute or prove one view or the other. That is the point I am making. In summary:

    – There ARE ORFan genes that are translated to give protein products
    – There MAY be ORFan genes that are not translated to give proteins
    – There ARE pseudogenes that are actually functional translatable genes
    – There MAY WELL be pseudogenes which are broken genes with little/no function

    These papers do not prove either view nor disprove the other. Which is why I say it is interesting, but the mathematical probability behind all of this information is the real question, ORFan or no ORFan, pseudogene or no pseudogene.

    Finally, of course people have been saying for many years that regulation may make the difference, but my point is we are now more certain than ever (why do you think they estimated originally over 100,000 genes in humans prior to the human genome project?) that it is more about spatiotemporal regulation. And my extension of that point is that the regulation is a lot more complex than was thought in 1975, when the focus would have been more on other proteins such as transcription factors, as they did not have knowledge of the plethora of regulatory molecules and processes now known in the genome, such as siRNA, modification of DNA and histones/epigenetic control (e.g. methylation), lncRNA, lincRNA, and the many more regulatory molecules involved in the genome that we are just scratching the surface of. A lot of these things have only come to light in the last few years, and they require a further layer of complexity for random generation through naturalistic processes, which is not as straightforward as simply generating a novel protein.

  28. 28
    scordova says:

    Dopey, speculative paper just rehashing existing databases of known experimental results with their own spin.

    They only made their inferences on what proteins have been detected. It’s like those folks who said no black swans existed because they haven’t seen any!

    There are in the range of 1 to 10 billion physical protein instances in a human cell and 200 trillion adult cells and even more from all stages of development. If some protein classes are rare, they could easily escape detection.

    Not to mention, there are proteins specific to stages of the cell cycle!


    No structural or sequence inferences can be based on the failure to observe a specific peptide; such a peptide may arise from proteolysis but simply not be detected, it may have an unexpected mass due to one or more modifications, or the residues may indeed be present in the sequence but the proteolysis may not proceed as anticipated. It is equally important to note that the absence of evidence for the presence of a particular protein does not definitively establish the absence of the protein.

    Why wasn’t this paper published in a Proteomics journal? Instead it was in a genetics journal, and it wasn’t new experimental data, just spin of known databases. One could have just as easily said,

    “the absence of detection indicates the need for improved detection methods.” But noooooo, they just make a grand pronouncement that there are fewer and fewer proteins in the human development cycle because we haven’t yet found them.

  29. 29
    bornagain77 says:

    Dr JDD, nice overview of the overall state of evidence,

    and to save wd400 the inevitable comeback,,,

    ‘You just don’t understand evolution’


  30. 30
    Dr JDD says:

    BA77, I’m used to that response, as I am sure you are. However, I actually don’t understand evolution, just not in the same way that evolutionists use that term!

    ID is full of people who don’t understand evolution – we don’t understand how 19,000 functional proteins could randomly be generated through blind chance, given that we know what probability it takes to make just 2 coordinated amino acid changes!!!

  31. 31
    scordova says:

    Reproducibility. Proteomics experiments conducted in one laboratory are not easily reproduced in another. For instance, Peng et al.[12] have identified 1504 yeast proteins in a proteomics experiment of which only 858 were found in a similar previous study.[13] Further, the previous study identified 607 proteins that were not found by Peng et al. This translates to a reproducibility of 57% (Peng vs. Washburn) to 59% (Washburn vs. Peng).
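
    The two reproducibility percentages in the passage above follow directly from the three counts it quotes. A minimal Python sketch of that arithmetic follows; the variable names are illustrative assumptions, and the total for the earlier study is derived here as shared plus unshared proteins.

```python
# Counts quoted in the passage above.
peng_total = 1504      # yeast proteins identified by Peng et al.
shared = 858           # of those, also found in the earlier (Washburn) study
washburn_only = 607    # found in the earlier study but not by Peng et al.

# Total for the earlier study, derived from the two quoted counts.
washburn_total = shared + washburn_only

overlap_vs_peng = shared / peng_total          # share of Peng's list that was reproduced
overlap_vs_washburn = shared / washburn_total  # share of the earlier list that was reproduced

print(f"Peng vs. Washburn: {overlap_vs_peng:.0%}")
print(f"Washburn vs. Peng: {overlap_vs_washburn:.0%}")
```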

    Moral of the story:

    It is premature to say there are only 19,000 proteins.

  32. 32
    wd400 says:

    I don’t recognise the argument you present in the start of your comment, so won’t defend it. All I’ve said is that people who have made a great deal out of the supposedly high number of ORFan genes should consider the evidence that many of them don’t make proteins.

    For those that don’t, there is no particular reason to think they are genes at all (they are annotated due to their genomic sequence or presence in transcripts, but there is no reason to think everything that is transcribed is functional), and good reason to think a lot of transcribed sequences aren’t.

    The “it’s all very complex so evolution didn’t do it” refrain is very common, but I don’t really understand it. All the classes of sequence you mention are genes (or are controlled by gene products). It’s going to take a long time to understand all the details, and extract the signal from the noise, but the idea that (in your own words) protein coding genes by themselves were a “holy grail” for evolutionary biology, and that functional RNAs make it all wrong, is very strange.

    Finally, it’s certainly true that “2 coordinated amino acid changes” is a shibboleth that reveals the speaker has a shaky understanding of evolutionary biology. It usually comes from Behe’s mistakes or a misreading of Durret and Schmidt – though you don’t provide enough details in this case to make clear what you are driving at.

  33. 33
    wd400 says:

    I forgot to reply to this comment

    (why do you think they estimated originally over 100,000 genes in humans prior to the human genome project?

    This is one of those fact-like ideas that gets circulated but isn’t quite true. The human genome project itself estimated 100,000, but this wasn’t the consensus view of biologists. Most molecular biologists predicted fewer (the mean of the Ensembl sweepstake was about 60,000), and evolutionary biologists did even better (Ohno predicted 30,000 genes in 1972; King & Jukes and Crow also made similar predictions).

  34. 34
    rhampton7 says:

    Therefore, evolution has to act out on much more than simply bringing about a new gene randomly, it has to do so in an appropriate spatiotemporal positioning that is tightly regulated in different tissue types in response to different other cellular cues.

    Dr JDD, if the differences between closely related species are ‘just’ the variations, regulations and timings of existing genes, then the empirical measure of ID (500 bits of specified information, a.k.a new genes) may represent only a fraction of detectable ID events. Have you given this any thought, and if so, what other empirical methods might be used?

  35. 35
    willh says:

    “It usually comes from Behe’s mistakes or a misreading of Durret and Schmidt”

    If you can, briefly what is the misreading of Durret and Schmidt?

  36. 36
    bornagain77 says:

    As to wd400’s claim that,,,

    “you just don’t understand Durret and Schmidt” 🙂

    ,,it is interesting to note Dr. Behe’s gripe about Durrett and Schmidt’s mathematical model:

    Waiting Longer for Two Mutations – Michael J. Behe
    Excerpt: Citing malaria literature sources (White 2004) I had noted that the de novo appearance of chloroquine resistance in Plasmodium falciparum was an event of probability of 1 in 10^20. I then wrote that ‘for humans to achieve a mutation like this by chance, we would have to wait 100 million times 10 million years’ (1 quadrillion years)(Behe 2007) (because that is the extrapolated time that it would take to produce 10^20 humans). Durrett and Schmidt (2008, p. 1507) retort that my number ‘is 5 million times larger than the calculation we have just given’ using their model (which nonetheless “using their model” gives a prohibitively long waiting time of 216 million years). Their criticism compares apples to oranges. My figure of 10^20 is an empirical statistic from the literature; it is not, as their calculation is, a theoretical estimate from a population genetics model.
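
    The figures in the excerpt above can be put side by side with a little arithmetic. This is a minimal Python sketch using only numbers quoted in the excerpt; the variable names are illustrative, and treating the 216 million year figure as the calculation that the "5 million times" remark compares against is an assumption made here, not something the excerpt states outright.

```python
# Figures quoted in the excerpt above.
behe_years = 100_000_000 * 10_000_000   # "100 million times 10 million years"
durrett_schmidt_years = 216_000_000     # waiting time attributed to their model

# 10^15 years is one quadrillion years, as the excerpt states.
print(f"Behe's extrapolation: {behe_years:.0e} years")

# Ratio of the two waiting times (assumed to be what "5 million times" rounds).
ratio = behe_years / durrett_schmidt_years
print(f"Ratio: about {ratio / 1e6:.1f} million")
```

    On these numbers the ratio comes out a little under 5 million, which is at least consistent with the rounded figure quoted from Durrett and Schmidt.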

    I agree with Dr. Behe’s gripe against Durrett and Schmidt’s mathematical model:

    The Scientific Method – Richard Feynman – video
    Quote: ‘If it disagrees with experiment, it’s wrong. In that simple statement is the key to science. It doesn’t make any difference how beautiful your guess is, it doesn’t matter how smart you are who made the guess, or what his name is… If it disagrees with experiment, it’s wrong. That’s all there is to it.’

  37. 37
    bornagain77 says:

    “The difficulty with models such as Durrett and Schmidt’s is that their biological relevance is often uncertain, and unknown factors that are quite important to cellular evolution may be unintentionally left out of the model. That is why experimental or observational data on the evolution of microbes such as P. falciparum are invaluable,,,”
    – Behe

  38. 38
    Starbuck says:

    So they seem to dismiss the idea that it is tissue restriction, yet much of what these databases and papers they cite, have come from analysis of cultured cells that are immortalised. This is not representative at all. Cells that are in 2-d growth, immortalised, in DMEM or RPMI supplemented with fetal calf serum, antibiotics and l-glutamine have very different characteristics and look very different to real, normal primary cells. From what I can see, most of the data they have taken comes from cell lines that will have been treated in this way…

    What the paper seems to suggest to me, is that when you look at the common datasets that the most conserved proteins generate the most reliable data and are most commonly found. That is not a surprise. We also know that it is NOT simply tissue expression where differential gene expression occurs – it is also environmental stresses. This has recently been shown with many 100s-1000s of new genes identified in Drosophila due to various stresses, such as alcohol, temperature, etc, etc. This analysis is very much unlikely to be able to even start to address this question.

    As far as I understand, the idea would be that we are finding more ancient, conserved genes with protein features because these are the proteins that tend to be expressed by the cancer cell cultures that are used for proteomics experiments?

    I don’t believe that at all. First of all, I would have thought that it was equally likely that the non-biological conditions that the immortalised cells were in would have led cell cultures to express a range of unusual proteins. Is there any data at all that shows that immortalised cells cease to produce any proteins other than housekeeping proteins?

    Second, the data comes from a wide range of sources. While the large-scale experiments from the Matthias Mann groups do fit this profile, the CNIO, PeptideAtlas and NIST analyses come from a huge range of different experiments.

    Third, there is little reliable evidence that carrying out tissue-based experiments expands the number of identified genes by much. See the HPP paper by Lane in JPR this year (ref 25). Clearly there have been two recent very high profile papers based on experiments on multiple tissues that have made incredible claims for the number of genes they have identified, but in these experiments it definitely wasn’t the number of tissues used that inflated the gene numbers. Probably varying the digestion enzymes would be the most efficient way of detecting new proteins.

    Ultimately the identification (or not) of a gene in the proteomics analyses is a bit of a red herring. The proteomics evidence can only be used as a means of positive identification. It can’t be used to confirm non-expression. There are approximately 8,500 genes that we don’t find peptides for and the majority of these genes will be either multiple trans-membrane helix proteins, have short half-lives or are expressed in certain tissues or certain cellular conditions. If you look at each gene it is usually quite clear which category it fits into, but 8,500 are rather a lot to investigate one by one.

    The annotation of the human genome is in constant flux, and although much work has been done by human annotators, they are still working based on the initial automatic annotation from 10 years ago. There are still many loci defined as protein coding genes in the Ensembl database that were put there by automatic prediction programs that have no basis for being there, and that will almost certainly be removed over the next couple of years.

    So, tissue restriction will be PART of the reason that we don’t find evidence for the 6,500 genes that we believe are likely to be protein coding genes, but will play little role for the 2,001 genes in the potential non-coding set, many of which we believe will not code for proteins under any circumstances.

    In any case, it is not us who have the final say in this, it will be the genome annotators in GENCODE and Ensembl.

  39. 39
    bornagain77 says:

    podcast – Michael Behe: The Limit in the Evolution of Proteins (Thorton 2014 paper)

  40. 40
    wd400 says:

    Hi Willh,

    The most common mistakes people make with D&S relate to skipping over the assumption in the model and concluding

    * It would take a prohibitively long time for two mutations to fix in the same gene at all
    * Evolutionary pathways requiring more than one mutation are unreachable
    * Evolutionary pathways requiring more than one mutation are unreachable when the first mutation is selectively neutral
    * Because some pre-defined evolutionary events would take a long time to occur in a pre-defined gene, they won’t happen at all in the genome

    BA’s quotes show another sort of confusion: the idea that Behe’s calculation doesn’t have a model of popgen underlying it. In fact, as is often the case when people make these claims, there is an implicit popgen model in Behe’s claim. Having a formal model rather than the implicit one Behe worked from is an advantage; as a pretty great statistician once said, “The great advantage of the model-based over the ad hoc approach, it seems to me, is that at any given time we know what we are doing.”

  41. 41
    wd400 says:

    (and now I see D&S summarized these mis-readings themselves rather better than I did above:

  42. 42
    bornagain77 says:

    Contrary to what wd400 ‘professes’ to believe, (I have a hard time thinking he ACTUALLY believes it), In science, experimental evidence trumps models/theories all the time. Models/Theories do not trump empirical evidence. It is simply insane to insist that a model is impervious to empirical falsification. To repeat:

    The Scientific Method – Richard Feynman – video
    Quote: ‘If it disagrees with experiment, it’s wrong. In that simple statement is the key to science. It doesn’t make any difference how beautiful your guess is, it doesn’t matter how smart you are who made the guess, or what his name is… If it disagrees with experiment, it’s wrong. That’s all there is to it.”

    But I can see where someone such as wd400, who earns his bread and butter as a professional evolutionist, would want to switch the evidential priority completely around for Darwinism,,,, Darwinism, Neo-Darwinism in particular, simply has no empirical support in which to verify its grand claims.

    “On the other hand, I disagree that Darwin’s theory is as ‘solid as any explanation in science.’ Disagree? I regard the claim as preposterous. Quantum electrodynamics is accurate to thirteen or so decimal places; so, too, general relativity. A leaf trembling in the wrong way would suffice to shatter either theory. What can Darwinian theory offer in comparison?”
    (Berlinski, D., “A Scientific Scandal?: David Berlinski & Critics,” Commentary, July 8, 2003)

    Lynn Margulis Criticizes Neo-Darwinism in Discover Magazine (Updated) – Casey Luskin April 12, 2011
    Excerpt: Population geneticist Richard Lewontin gave a talk here at UMass Amherst about six years ago, and he mathematized all of it–changes in the population, random mutation, sexual selection, cost and benefit. At the end of his talk he said, “You know, we’ve tried to test these ideas in the field and the lab, and there are really no measurements that match the quantities I’ve told you about.” This just appalled me. So I said, “Richard Lewontin, you are a great lecturer to have the courage to say it’s gotten you nowhere. But then why do you continue to do this work?” And he looked around and said, “It’s the only thing I know how to do, and if I don’t do it I won’t get grant money.” –
    Lynn Margulis – biologist

    Further notes as to the sheer poverty of evidence for Darwinism:

    Multiple Overlapping Genetic Codes Profoundly Reduce the Probability of Beneficial Mutation George Montañez 1, Robert J. Marks II 2, Jorge Fernandez 3 and John C. Sanford 4 – May 2013
    Excerpt: It is almost universally acknowledged that beneficial mutations are rare compared to deleterious mutations [1–10].,, It appears that beneficial mutations may be too rare to actually allow the accurate measurement of how rare they are [11].
    1. Kibota T, Lynch M (1996) Estimate of the genomic mutation rate deleterious to overall fitness in E. coli . Nature 381:694–696.
    2. Charlesworth B, Charlesworth D (1998) Some evolutionary consequences of deleterious mutations. Genetica 103: 3–19.
    3. Elena S, et al (1998) Distribution of fitness effects caused by random insertion mutations in Escherichia coli. Genetica 102/103: 349–358.
    4. Gerrish P, Lenski R N (1998) The fate of competing beneficial mutations in an asexual population. Genetica 102/103:127–144.
    5. Crow J (2000) The origins, patterns, and implications of human spontaneous mutation. Nature Reviews 1:40–47.
    6. Bataillon T (2000) Estimation of spontaneous genome-wide mutation rate parameters: whither beneficial mutations? Heredity 84:497–501.
    7. Imhof M, Schlotterer C (2001) Fitness effects of advantageous mutations in evolving Escherichia coli populations. Proc Natl Acad Sci USA 98:1113–1117.
    8. Orr H (2003) The distribution of fitness effects among beneficial mutations. Genetics 163: 1519–1526.
    9. Keightley P, Lynch M (2003) Toward a realistic model of mutations affecting fitness. Evolution 57:683–685.
    10. Barrett R, et al (2006) The distribution of beneficial mutation effects under strong selection. Genetics 174:2071–2079.
    11. Bataillon T (2000) Estimation of spontaneous genome-wide mutation rate parameters: whither beneficial mutations? Heredity 84:497–501.

    “The First Rule of Adaptive Evolution”: Break or blunt any functional coded element whose loss would yield a net fitness gain – Michael Behe – December 2010
    Excerpt: In its most recent issue The Quarterly Review of Biology has published a review by myself of laboratory evolution experiments of microbes going back four decades.,,, The gist of the paper is that so far the overwhelming number of adaptive (that is, helpful) mutations seen in laboratory evolution experiments are either loss or modification of function. Of course we had already known that the great majority of mutations that have a visible effect on an organism are deleterious. Now, surprisingly, it seems that even the great majority of helpful mutations degrade the genome to a greater or lesser extent.,,, I dub it “The First Rule of Adaptive Evolution”: Break or blunt any functional coded element whose loss would yield a net fitness gain.

    Michael Behe talks about the devastating implications for Darwinism highlighted by the preceding paper in this following podcast:

    Michael Behe: Challenging Darwin, One Peer-Reviewed Paper at a Time – December 2010

    How about the oft cited example for neo-Darwinism of antibiotic resistance?

    List Of Degraded Molecular Abilities Of Antibiotic Resistant Bacteria:
    Excerpt: Resistance to antibiotics and other antimicrobials is often claimed to be a clear demonstration of “evolution in a Petri dish.” ,,, all known examples of antibiotic resistance via mutation are inconsistent with the genetic requirements of evolution. These mutations result in the loss of pre-existing cellular systems/activities, such as porins and other transport systems, regulatory systems, enzyme activity, and protein binding.

    That doesn’t seem to be helping! How about we look really, really, close at very sensitive growth rates and see if we can catch almighty evolution in action???

    Unexpectedly small effects of mutations in bacteria bring new perspectives – November 2010
    Excerpt: Most mutations in the genes of the Salmonella bacterium have a surprisingly small negative impact on bacterial fitness. And this is the case regardless whether they lead to changes in the bacterial proteins or not.,,, using extremely sensitive growth measurements, doctoral candidate Peter Lind showed that most mutations reduced the rate of growth of bacteria by only 0.500 percent. No mutations completely disabled the function of the proteins, and very few had no impact at all. Even more surprising was the fact that mutations that do not change the protein sequence had negative effects similar to those of mutations that led to substitution of amino acids. A possible explanation is that most mutations may have their negative effect by altering mRNA structure, not proteins, as is commonly assumed.

    Shoot that doesn’t seem to be helping either! How about if we just try to fix a ‘beneficial’ mutation in a multi-cellular creature:

    Experimental Evolution in Fruit Flies (35 years of trying to force fruit flies to evolve in the laboratory fails, spectacularly) – October 2010
    Excerpt: “Despite decades of sustained selection in relatively small, sexually reproducing laboratory populations, selection did not lead to the fixation of newly arising unconditionally advantageous alleles.,,, “This research really upends the dominant paradigm about how species evolve,” said ecology and evolutionary biology professor Anthony Long, the primary investigator.

  43. 43
    bornagain77 says:

    Well that certainly didn’t help. How about if we just try to help evolution out a little and saturate entire genomes with mutations until we can finally see some ‘evolution’ in action?

    Response to John Wise – October 2010
    Excerpt: A technique called “saturation mutagenesis”1,2 has been used to produce every possible developmental mutation in fruit flies (Drosophila melanogaster),3,4,5 roundworms (Caenorhabditis elegans),6,7 and zebrafish (Danio rerio),8,9,10 and the same technique is now being applied to mice (Mus musculus).11,12 None of the evidence from these and numerous other studies of developmental mutations supports the neo-Darwinian dogma that DNA mutations can lead to new organs or body plans–because none of the observed developmental mutations benefit the organism.

    Shoot that doesn’t seem to be helping either! Perhaps we just have to give the almighty power of neo-Darwinism a little ‘room to breathe’ so as to work its magic? How about we ‘open the floodgates’ to the almighty power of Darwinian Evolution and look at Lenski’s Long Term Evolution Experiment and see what we can find after 50,000 generations, which is equivalent to somewhere around 1,000,000 years of human evolution???

    Richard Lenski’s Long-Term Evolution Experiments with E. coli and the Origin of New Biological Information – September 2011
    Excerpt: The results of future work aside, so far, during the course of the longest, most open-ended, and most extensive laboratory investigation of bacterial evolution, a number of adaptive mutations have been identified that endow the bacterial strain with greater fitness compared to that of the ancestral strain in the particular growth medium. The goal of Lenski’s research was not to analyze adaptive mutations in terms of gain or loss of function, as is the focus here, but rather to address other longstanding evolutionary questions. Nonetheless, all of the mutations identified to date can readily be classified as either modification-of-function or loss-of-FCT.
    (Michael J. Behe, “Experimental Evolution, Loss-of-Function Mutations and ‘The First Rule of Adaptive Evolution’,” Quarterly Review of Biology, Vol. 85(4) (December, 2010).)

    Lenski’s Long-Term Evolution Experiment: 25 Years and Counting – Michael Behe – November 21, 2013
    Excerpt: Twenty-five years later the culture — a cumulative total of trillions of cells — has been going for an astounding 58,000 generations and counting. As the article points out, that’s equivalent to a million years in the lineage of a large animal such as humans. Combined with an ability to track down the exact identities of bacterial mutations at the DNA level, that makes Lenski’s project the best, most detailed source of information on evolutionary processes available anywhere,,,
    ,,,for proponents of intelligent design the bottom line is that the great majority of even beneficial mutations have turned out to be due to the breaking, degrading, or minor tweaking of pre-existing genes or regulatory regions (Behe 2010). There have been no mutations or series of mutations identified that appear to be on their way to constructing elegant new molecular machinery of the kind that fills every cell. For example, the genes making the bacterial flagellum are consistently turned off by a beneficial mutation (apparently it saves cells energy used in constructing flagella). The suite of genes used to make the sugar ribose is the uniform target of a destructive mutation, which somehow helps the bacterium grow more quickly in the laboratory. Degrading a host of other genes leads to beneficial effects, too.,,, –

    Now that just can’t be right!! Man we should really start to be seeing some neo-Darwinian fireworks by 50,000 generations!?! Hey I know what we can do! How about we see what happened when the ‘top five’ mutations from Lenski’s experiment were combined??? Surely the Darwinian magic will start flowing now???!!!

    Mutations : when benefits level off – June 2011 – (Lenski’s e-coli after 50,000 generations)
    Excerpt: After having identified the first five beneficial mutations combined successively and spontaneously in the bacterial population, the scientists generated, from the ancestral bacterial strain, 32 mutant strains exhibiting all of the possible combinations of each of these five mutations. They then noted that the benefit linked to the simultaneous presence of five mutations was less than the sum of the individual benefits conferred by each mutation individually.
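
    The count of 32 strains in the excerpt follows from treating each of the five mutations as either present or absent. A minimal Python sketch of that enumeration, assuming placeholder labels `m1` to `m5` (the excerpt does not name the mutations, so these are hypothetical):

```python
from itertools import product

# Hypothetical labels: the excerpt does not name the five beneficial mutations.
MUTATIONS = ["m1", "m2", "m3", "m4", "m5"]


def all_genotypes(mutations):
    """Return every present/absent combination of the given mutations."""
    return [
        tuple(m for m, present in zip(mutations, flags) if present)
        for flags in product([False, True], repeat=len(mutations))
    ]


genotypes = all_genotypes(MUTATIONS)
print(len(genotypes))  # 2**5 = 32 combinations
```

    The empty tuple stands for the ancestral strain and the five-element tuple for the strain carrying all five mutations. The excerpt's finding concerns the measured fitness of these strains, which this sketch does not model.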

    Now something is going terribly wrong here!!! Tell you what, let’s just forget trying to observe evolution in the lab, I mean it really is kind of cramped in the lab you know, and now let’s REALLY open the floodgates and let’s see what the almighty power of neo-Darwinian evolution can do with the ENTIRE WORLD at its disposal??? Surely now almighty neo-Darwinian evolution will flex its awesomely powerful muscles and forever make those IDiots, who believe in Intelligent Design, cower in terror!!!

    A review of The Edge of Evolution: The Search for the Limits of Darwinism
    The numbers of Plasmodium and HIV in the last 50 years greatly exceeds the total number of mammals since their supposed evolutionary origin (several hundred million years ago), yet little has been achieved by evolution. This suggests that mammals could have “invented” little in their time frame. Behe: ‘Our experience with HIV gives good reason to think that Darwinism doesn’t do much—even with billions of years and all the cells in that world at its disposal’ (p. 155).

    “The immediate, most important implication is that complexes with more than two different binding sites-ones that require three or more proteins-are beyond the edge of evolution, past what is biologically reasonable to expect Darwinian evolution to have accomplished in all of life in all of the billion-year history of the world. The reasoning is straightforward. The odds of getting two independent things right are the multiple of the odds of getting each right by itself. So, other things being equal, the likelihood of developing two binding sites in a protein complex would be the square of the probability for getting one: a double CCC, 10^20 times 10^20, which is 10^40. There have likely been fewer than 10^40 cells in the world in the last 4 billion years, so the odds are against a single event of this variety in the history of life. It is biologically unreasonable.”
    – Michael Behe – The Edge of Evolution – page 146
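
    The arithmetic in the quoted passage can be checked directly. This sketch only takes the passage's own figures at face value (odds of 1 in 10^20 per binding site, independence between the two sites, and 10^40 cells as an upper bound). The variable names are illustrative, and nothing here tests whether those premises hold:

```python
from fractions import Fraction

# Figure quoted in the passage: odds of a single "CCC" event.
p_single = Fraction(1, 10**20)

# Premise stated in the passage: the two events are independent,
# so their probabilities multiply.
p_double = p_single * p_single

# Upper bound on the number of cells, as quoted in the passage.
cells_upper_bound = 10**40

# Expected number of double events if every one of those cells were a trial.
expected_events = p_double * cells_upper_bound

print(p_double == Fraction(1, 10**40))
print(expected_events)
```

    Exact fractions are used instead of floats so that the powers of ten multiply without rounding.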

    Michael Behe, The Edge of Evolution, pg. 162 Swine Flu, Viruses, and the Edge of Evolution
    “Indeed, the work on malaria and AIDS demonstrates that after all possible unintelligent processes in the cell–both ones we’ve discovered so far and ones we haven’t–at best extremely limited benefit, since no such process was able to do much of anything. It’s critical to notice that no artificial limitations were placed on the kinds of mutations or processes the microorganisms could undergo in nature. Nothing–neither point mutation, deletion, insertion, gene duplication, transposition, genome duplication, self-organization nor any other process yet undiscovered–was of much use.”

    Now, there is something terribly wrong here! After looking high and low and everywhere in between, we can’t seem to find the almighty power of neo-Darwinism anywhere!! Shoot we can’t even find ANY power of neo-Darwinism whatsoever!!! It is as if the whole neo-Darwinian theory, relentlessly sold to the general public as if it were the gospel truth, is nothing but a big fat lie!!!

  44. 44
    willh says:


    Much appreciate your considered response; thank you for answering directly. As you may have concluded from my limited contributions, I favour the ‘other’ explanation in the relevant discussions; but do wish to understand clearly what each is saying. If I may, given your second point:

    “Evolutionary pathways requiring more than one mutation are unreachable”.

    Perhaps unreachable would have to be assessed over the aggregate? Even if you discount the conclusions wrought in “Edge of Evolution”, and Behe’s criticism of Durrett and Schmidt’s paper, is not the time factor that Durrett and Schmidt proposed suggesting a much bigger dimension across the board for Darwinian evolution’s current dating of the fossil record? Or do you perhaps see problems with Durrett and Schmidt’s conclusions?

  45. 45
    bornagain77 says:

    Here Dr. Behe responds to Durrett and Schmidt’s attempted rebuttal in a 5 part essay:

    summary at the end of part 5 is here:

    as I show above, when simple mistakes in the application of their model to malaria are corrected, it agrees closely with empirical results reported from the field that I cited. This is very strong support that the central contention of The Edge of Evolution is correct: that it is an extremely difficult evolutionary task for multiple required mutations to occur through Darwinian means, especially if one of the mutations is deleterious. And, as I argue in the book, reasonable application of this point to the protein machinery of the cell makes it very unlikely that life developed through a Darwinian mechanism.

  46. 46
    bornagain77 says:

    Don’t Mess With ID (Overview of Behe’s ‘Edge’ and Durrett and Schmidt’s paper at the 20:00 minute mark) – Paul Giem – video

  47. 47
    wd400 says:

    Hi willh,

    I guess the point is that the numbers they calculate are under some assumptions. Namely,

    (1) there is only one way to get from one phenotype to the next
    (2) That path requires two mutations in a single short section of DNA
    (3) The first mutation is selectively neutral

    So, they can show there has only been time for a few processes like this in the evolution of humans from our common ancestor with chimps. But it doesn’t mean you can’t fix multiple mutations in genes (indeed the expectation is that you will get multiple mutations in genes without selection), or that such specified pathways are common (indeed – there is good reason to think they don’t apply even to the malaria parasite Behe cites).

  48. 48
    willh says:


    “(1) there is only one way to get from one phenotype to the next”

    This, along with your comment on the malaria parasite question, seems to hint at larger considerations then. Too narrow a focus for ultimate conclusions? Thanks again for your answers.

  49. 49
    bornagain77 says:

    Of related interest to wd400’s claim that,,,

    But it doesn’t mean you can’t fix multiple mutations in genes (indeed the expectation is that you will get multiple mutations in genes without selection),,,

    is this ‘unexpected’ fact:

    A hidden genetic code: Researchers identify key differences in seemingly synonymous parts of the structure – January 21, 2013
    Excerpt: (In the Genetic Code) there are 64 possible ways to combine four bases into groups of three, called codons, the translation process uses only 20 amino acids. To account for the difference, multiple codons translate to the same amino acid. Leucine, for example, can be encoded in six ways. Scientists, however, have long speculated whether those seemingly synonymous codons truly produced the same amino acids, or whether they represented a second, hidden genetic code. Harvard researchers have deciphered that second code,,,
    Under some stressful conditions, the researchers found, certain sequences manufacture proteins efficiently, while others—which are ostensibly identical—produce almost none. “It’s really quite remarkable, because it’s a very simple mechanism,” Subramaniam said. “Many researchers have tried to determine whether using different codons affects protein levels, but no one had thought that maybe you need to look at it under the right conditions to see this.”,,,
    While the system helps cells to make certain proteins efficiently under stressful conditions, it also acts as a biological failsafe, allowing the near-complete shutdown in the production of other proteins as a way to preserve limited resources.
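
    The counts quoted in the excerpt (64 codons, 20 amino acids, six codons for leucine) can be reproduced from the standard genetic code. A minimal sketch, assuming the standard code (NCBI translation table 1) written as a 64-letter string in T, C, A, G order; the variable names are illustrative:

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in BASES order; '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Every way to combine four bases into groups of three.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
codon_table = dict(zip(codons, AMINO_ACIDS))

# How many codons translate to each amino acid (or to stop).
codons_per_residue = Counter(codon_table.values())
amino_acids = sorted(aa for aa in codons_per_residue if aa != "*")
leucine_codons = sorted(c for c, aa in codon_table.items() if aa == "L")

print(len(codons))         # 64
print(len(amino_acids))    # 20
print(leucine_codons)
```

    The table only captures which amino acid each codon specifies. The excerpt's point, that synonymous codons can differ in how efficiently they are translated under stress, is not something this lookup can show.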

    Sounds of silence: synonymous nucleotides as a key to biological regulation and complexity. – Jan 2013
    Excerpt: Silent or synonymous codon positions, which do not determine amino acid sequences of the encoded proteins, define mRNA secondary structure and stability and affect the rate of translation, folding and post-translational modifications of nascent polypeptides.,,,
    Synonymous positions of the coding regions have a higher level of hybridization potential relative to non-synonymous positions, and are multifunctional in their regulatory and structural roles.

    ‘Snooze Button’ On Biological Clocks Improves Cell Adaptability – Feb. 17, 2013
    Excerpt: Like many written languages, the genetic code is filled with synonyms: differently spelled “words” that have the same or very similar meanings. For a long time, biologists thought that these synonyms, called synonymous codons, were in fact interchangeable. Recently, they have realized that this is not the case and that differences in synonymous codon usage have a significant impact on cellular processes,,

    Also of interest to wd400’s claim is this:

    Proteins with cruise control provide new perspective:
    Excerpt: “A mathematical analysis of the experiments showed that the proteins themselves acted to correct any imbalance imposed on them through artificial mutations and restored the chain to working order.”

    Thus, in contrast to the reductive materialism (central dogma) that undergirds Darwinian thought, it is found that mutations are secondary to the overall structure of a protein. Moreover, that an entire protein would be involved in maintaining the primary shape of a protein is evidence that information must be shared along the entirety of the protein structure so that it can maintain its basic functional shape despite deleterious mutations. And when we look closely at proteins, we find that quantum information/entanglement is indeed shared along the entirety of the protein structure:

    Coherent Intrachain energy migration at room temperature – Elisabetta Collini and Gregory Scholes – University of Toronto – Science, 323, (2009), pp. 369-73
    Excerpt: The authors conducted an experiment to observe quantum coherence dynamics in relation to energy transfer. The experiment, conducted at room temperature, examined chain conformations, such as those found in the proteins of living cells. Neighbouring molecules along the backbone of a protein chain were seen to have coherent energy transfer. Where this happens quantum decoherence (the underlying tendency to loss of coherence due to interaction with the environment) is able to be resisted, and the evolution of the system remains entangled as a single quantum state.

    Quantum information performing quantum computation in proteins would go a long way towards explaining the protein folding enigma:

    Physicists Discover Quantum Law of Protein Folding – February 22, 2011
    Quantum mechanics finally explains why protein folding depends on temperature in such a strange way.
    Excerpt: First, a little background on protein folding. Proteins are long chains of amino acids that become biologically active only when they fold into specific, highly complex shapes. The puzzle is how proteins do this so quickly when they have so many possible configurations to choose from.
    To put this in perspective, a relatively small protein of only 100 amino acids can take some 10^100 different configurations. If it tried these shapes at the rate of 100 billion a second, it would take longer than the age of the universe to find the correct one. Just how these molecules do the job in nanoseconds, nobody knows.,,,
    Their astonishing result is that this quantum transition model fits the folding curves of 15 different proteins and even explains the difference in folding and unfolding rates of the same proteins.
    That’s a significant breakthrough. Luo and Lo’s equations amount to the first universal laws of protein folding. That’s the equivalent in biology to something like the thermodynamic laws in physics.
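
    The excerpt's comparison with the age of the universe is a short calculation. A rough sketch, using the excerpt's figures for the number of configurations and the search rate; the age of the universe (about 13.8 billion years) is an assumption added here, not a number from the excerpt:

```python
# Figures quoted in the excerpt.
CONFIGURATIONS = 10**100            # shapes available to a 100-residue protein
RATE_PER_SECOND = 100 * 10**9       # "100 billion a second"

# Assumption, not from the excerpt: age of the universe of about 13.8 billion years.
AGE_OF_UNIVERSE_YEARS = 13.8e9
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Time to try every configuration once at the quoted rate.
search_seconds = CONFIGURATIONS // RATE_PER_SECOND

age_of_universe_seconds = AGE_OF_UNIVERSE_YEARS * SECONDS_PER_YEAR
ratio = search_seconds / age_of_universe_seconds

print(search_seconds == 10**89)
print(f"{ratio:.1e} times the age of the universe")
```

    Under these figures an exhaustive search would take about 10^89 seconds, so "longer than the age of the universe" understates the gap by many orders of magnitude.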

    Speed Test of Quantum Versus Conventional Computing: Quantum Computer Wins – May 8, 2013
    Excerpt: quantum computing is, “in some cases, really, really fast.”
    McGeoch says the calculations the D-Wave excels at involve a specific combinatorial optimization problem, comparable in difficulty to the more famous “travelling salesperson” problem that’s been a foundation of theoretical computing for decades.,,,
    “This type of computer is not intended for surfing the internet, but it does solve this narrow but important type of problem really, really fast,” McGeoch says. “There are degrees of what it can do. If you want it to solve the exact problem it’s built to solve, at the problem sizes I tested, it’s thousands of times faster than anything I’m aware of. If you want it to solve more general problems of that size, I would say it competes — it does as well as some of the best things I’ve looked at. At this point it’s merely above average but shows a promising scaling trajectory.”

  50. 50
    scordova says:

    There are a few issues, and I have to correct some of my own conflations — apologies to the readers.

    1. the number of “genes” is not the number of proteins

    2. low numbers of genes are no more an indication of biological simplicity than the fact that there are only 4 DNA bases

    The paper conflated the number of proteins and protein isoforms with genes. I unfortunately went for the bait. That is a major strike against the paper. There are 50,000 to 500,000 protein isoforms (no one knows the number) for humans, and I seriously doubt Mass Spec has catalogued every one of them, so how then can we even say how many genes code for proteins when we don’t even know the protein count?

    That said, I’m not averse to thinking there might be a limited number of fundamental protein types but millions of protein isoforms. In fact, for the immunoglobulin proteins, it is relatively well established that there are around 10 million protein isoforms.

    There would be great organizational reasons for having a relatively small number of proteins (say 20,000) and millions of isoforms as this would enable beautiful hierarchical organization.

    This researcher believes there are millions of protein isoforms (probably in addition to immunoglobulin forms).
