What are the limits of Random Variation? A simple evaluation of the probabilistic resources of our biological world

Coming from a long and detailed discussion about the limits of Natural Selection, here:

What are the limits of Natural Selection? An interesting open discussion with Gordon Davisson

I realized that some attention could be given to the other great protagonist of the neo-darwinian algorithm: Random Variation (RV).

For the sake of clarity, as usual, I will try to give explicit definitions in advance.

Let’s call an RV event any random event that, in the course of Natural History, acts on an existing organism at the genetic level, so that the genome of that individual organism changes in its descendants.

That’s more or less the same as the neo-darwinian concept of descent with modifications.

A few important clarifications:

a) I use the term variation instead of mutation because I want to include in the definition all possible kinds of variation, not only single point mutations.

b) Random here means essentially that the mechanisms that cause the variation are in no way related to function, whatever it is: IOWs, the function that may arise or not arise as a result of the variation is in no way related to the mechanism that effects the change, but only to the specific configuration which arises randomly from that mechanism.

In all the present discussion we will not consider how NS can change the RV scenario: I have discussed that in great detail in the previous thread quoted above, and those who are interested in that aspect can refer to it. In brief, I will recall here that NS does not act on the sequences themselves (IOWs, the functional information) but, if, when and to the extent that it can act, it acts by modifying the probabilistic resources.

So, an important concept is that:

All new functional information that may arise by the neo-darwinian mechanism is the result of RV.

Examining the Summers paper about chloroquine resistance:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4035986/

I have argued in the old thread that the whole process of generation of the resistance in natural strains can be divided into two steps:

a) The appearance of an initial new state which confers the initial resistance. In our example, that corresponds to the appearance of one of two possible resistant states, both of which require two neutral mutations. IOWs, this initial step is the result of mere RV, and NS has no role in that. Of course, the initial resistant state, once reached, can be selected. We have also seen that the initial state of two mutations is probably the critical step in the whole process, in terms of time required.

b) From that point on, a few individual steps of one single mutation, each of them conferring greater resistance, can optimize the function rather easily.

Now, point a) is exactly what we are discussing in this new thread.

So, what are the realistic powers of mere RV in the biological world, in terms of functional information? What can it really achieve?

Another way to ask the same question is: how functionally complex can the initial state that for the first time implements a new function be, arising from mere RV?

And now, let’s define the probabilistic resources.

Let’s call probabilistic resources, in a system where random events take place, the total number of different states that can be reached by RV events in a certain window of time.

In a system where two dice are tossed each minute, and the numbers deriving from each toss are the states we are interested in, the probabilistic resources of the system in one day amount to 1440 states.

The greater the probabilistic resources, the easier it is to find some specific state, which has some specific probability of being found in one random attempt.
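
The relationship between the size of the probabilistic resources and the chance of finding a specific state can be sketched as follows. This is only a minimal illustration, assuming independent attempts; the function name `prob_at_least_one` and the choice of a double six as the target state are assumptions made for the example, not part of the argument above.

```python
import math

def prob_at_least_one(p: float, n: int) -> float:
    """Probability of hitting a target of per-attempt probability p
    at least once in n independent attempts: 1 - (1 - p)^n."""
    # log1p/expm1 keep precision when p is very small
    return -math.expm1(n * math.log1p(-p))

tosses_per_day = 24 * 60      # one toss of two dice per minute
p_double_six = 1 / 36         # assumed target state: two sixes

print(tosses_per_day)                                   # 1440
print(prob_at_least_one(p_double_six, tosses_per_day))  # very close to 1
```

With a much rarer target, the same function shows how quickly the chance of success drops when the resources stay fixed.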

So, what are the states generated by RV? They are, very simply, all different genomes that arise in any individual of any species by RV events, or if you prefer by descent with modification.

Please note that we are referring here to heritable variation only: we are not interested in somatic genetic variation, which is not transmitted to descendants.

So, what are the probabilistic resources in our biological world? How can they be estimated?

I will use here a top-down method. So, I will not rely on empirical data like those from Summers or Behe or others, but only on what is known about the biological world and natural history.

The biological probabilistic resources derive from reproduction: each reproduction event is a new state reached, if its genetic information is different from the previous state. So, the total number of states reached in a system in a certain window of time is simply the total number of reproduction events where the genetic information changes. IOWs, where some RV event takes place.

Those resources depend essentially on three main components:

  1. The population size
  2. The number of reproductions of each individual (the reproduction rate) in a certain time
  3. The time window

So, I have tried to compute the total probabilistic resources (total number of different states) for some different biological populations, in different time windows, appropriate for the specific population (IOWs, for each population, from the approximate time of its appearance up to now). As usual, I have expressed the final results in bits (log2 of the total number).

Here are the results:

 

Population  Size      Reproduction rate (per day)  Mutation rate  Time window        Time (in days)  Number of states  Bits   Bits + 5 sigma  Specific AAs
Bacteria    5.00E+30  24                           0.003          4 billion years    1.46E+12        5.26E+41          138.6  160.3           37
Fungi       1.00E+27  24                           0.003          2 billion years    7.3E+11         5.26E+37          125.3  147.0           34
Insects     1.00E+19  0.2                          0.06           500 million years  1.825E+11       2.19E+28          94.1   115.8           27
Fish        4E+12     0.1                          5              400 million years  1.46E+11        2.92E+23          78.0   99.7            23
Hominidae   5.00E+09  0.000136986                  100            15 million years   5.48E+09        3.75E+17          58.4   80.1            19

The mutation rate is expressed as mutations per genome per reproduction.
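
For readers who want to check the arithmetic, the "Number of states" and "Bits" columns can be reproduced with a short script. This is a sketch, assuming the number of states is simply the product of the four inputs and that a year counts as 365 days (which matches the "Time (in days)" column); the names `ROWS`, `total_states` and `bits` are introduced here for illustration only.

```python
import math

DAYS_PER_YEAR = 365  # assumption: consistent with the "Time (in days)" column

# (population, size, reproductions per day,
#  mutations per genome per reproduction, time window in years)
ROWS = [
    ("Bacteria",  5.0e30, 24,          0.003, 4.0e9),
    ("Fungi",     1.0e27, 24,          0.003, 2.0e9),
    ("Insects",   1.0e19, 0.2,         0.06,  5.0e8),
    ("Fish",      4.0e12, 0.1,         5,     4.0e8),
    ("Hominidae", 5.0e9,  0.000136986, 100,   1.5e7),
]

def total_states(size, repro_per_day, mutation_rate, years):
    """Total number of reproduction events carrying a new variation."""
    return size * repro_per_day * mutation_rate * years * DAYS_PER_YEAR

def bits(states):
    """Probabilistic resources expressed as log2 of the number of states."""
    return math.log2(states)

for name, *params in ROWS:
    states = total_states(*params)
    print(f"{name:10s} {states:.2e} {bits(states):.1f}")
```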

This is only a tentative estimate, and of course a gross one. I have tried to get the best reasonable values from the sources I could find, but of course many values could be somewhat different, and sometimes it was really difficult to find any good reference, and I just had to make an educated guess. Of course, I will be happy to acknowledge any suggestion or correction based on good sources.

But, even if we consider all those uncertainties, I would say that these numbers do tell us something very interesting.

First of all, the highest probabilistic resources are found in bacteria, as expected: this is due mainly to the huge population size and high reproduction rate. The numbers for fungi are almost comparable, although significantly lower.

So, the first important conclusion is that, in these two basic classes of organisms, the probabilistic resources, with this hugely optimistic estimate, are still under 140 bits.

The penultimate column just adds 21.7 bits (the margin for 5 sigma safety for inferences about fundamental issues in physics). What does that mean?

It means, for example, that any sequence with 160 bits of functional information is, by far, beyond any reasonable probability of being the result of RV in the system of all bacteria in 4 billion years of natural history, even with the most optimistic assumptions.

The last column gives the number of specific AAs that correspond to the bit value in the penultimate column (based on a maximum information value of 4.32 bits per AA).

For bacteria, that corresponds to 37 specific AAs.

IOWs, a sequence of 37 specific AAs is already well beyond the probabilistic resources of the whole population of bacteria in the whole world reproducing for 4 billion years!
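
The last two columns can be checked in the same way. The sketch below assumes that the 21.7-bit margin is the one-sided Gaussian tail probability beyond 5 sigma (about 2.9 × 10^-7) expressed in bits, and that 4.32 bits per AA is log2(20); the post states the two figures but not their derivation, so both readings are assumptions, as are the function names.

```python
import math

BITS_PER_AA = math.log2(20)   # about 4.32 bits for one fully specified AA

def five_sigma_bits(sigma: float = 5.0) -> float:
    """Bits corresponding to a one-sided Gaussian tail beyond `sigma`."""
    p_tail = 0.5 * math.erfc(sigma / math.sqrt(2))
    return -math.log2(p_tail)

def specific_aas(resource_bits: float) -> float:
    """Number of fully specified AAs matching the resources plus the margin."""
    return (resource_bits + five_sigma_bits()) / BITS_PER_AA

print(round(five_sigma_bits(), 1))   # 21.7
print(round(specific_aas(138.6)))    # 37 (bacteria)
```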

For fungi, 147 bits and 34 AAs are the upper limit.

Of course, values become lower for the other classes. Insects still perform reasonably well, with 116 bits and 27 AAs. Fish and Hominidae have even lower values.

We can notice that Hominidae gain something in the mutation rate, which is known to be higher, and which I have set here at 100 new mutations per genome per reproduction (a reasonable estimate for Homo sapiens). Moreover, I have considered here a very generous population of 5 billion individuals, again taking a recent value for Homo sapiens. These are not realistic choices, but again generous ones, just to make my darwinist friends happy.

Another consideration: I have given here total populations (or at least generous estimates for them), and not effective population sizes. Again, the idea is to give the highest chances to the neo-darwinian algorithm.

So, these are very simple numbers, and they should give an idea of what I would call the upper threshold of what mere RV can do, estimated by a top down reasoning, and with extremely generous assumptions.

Another important conclusion is the following:

All the components of the probabilistic resources have a linear relationship with the total number of states.

That is true for population size, for reproduction rate, mutation rate and time.

For example, everyone can see that the different time windows, ranging from 4 billion years to 15 million years, which seems a very big difference, correspond to fewer than 3 orders of magnitude in the total number of states. Indeed, the largest variations are probably in population size.

However, the number of states needed to find a sequence grows exponentially with its complexity in terms of necessary AA sites: a range from 19 to 37 AAs (only 18 AAs) corresponds to a range of 24 orders of magnitude in the distribution of probabilistic resources.
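
The contrast between the linear and the exponential relationship can be made concrete with the values from the table. This is a small sketch using only the bacteria and Hominidae rows; the variable names are illustrative.

```python
import math

# Values taken from the table (bacteria vs. Hominidae)
time_ratio = 4.0e9 / 1.5e7          # time windows, in years
states_ratio = 5.26e41 / 3.75e17    # total numbers of states

print(round(math.log10(time_ratio), 1))    # 2.4 orders of magnitude
print(round(math.log10(states_ratio), 1))  # 24.1 orders of magnitude

# Each additional fully specified AA multiplies the search space by 20
extra_aas = 37 - 19
print(round(extra_aas * math.log10(20), 1))  # 23.4 orders of magnitude
```

The 18 extra AAs account for roughly the same 24 orders of magnitude that separate the two populations.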

Let me briefly recall here, without any further comment, that in my OP here:

The amazing level of engineering in the transition to the vertebrate proteome: a global analysis

I have analyzed the informational jump in human conserved information at the appearance of vertebrates. One important result is that 10% of all human proteins (about 2000) have an information jump from pre-vertebrates to vertebrates of at least (about) 500 bits (corresponding to about 116 AAs)!

Now, some important final considerations:

  1. I am making no special inferences here, and I am drawing no special conclusions. I don’t think it is really necessary. The numbers speak for themselves.
  2. I will be happy to receive any suggestion, correction, or comment. Especially if based on facts or reasonable arguments. The discussion is open.
  3. Again, this is about mere RV. This is about the neutral case. NS has nothing to do with these numbers.
  4. For those interested in a discussion about the possible role of NS, I can suggest the thread linked at the beginning of this OP.
  5. I will be happy to answer any question about NS too, of course, but I would be even happier if someone tried to answer my two-question challenge, given at post #103 of the other thread, which nobody has answered yet. I paste it here for the convenience of all:

Will anyone on the other side answer the following two simple questions?

1) Is there any conceptual reason why we should believe that complex protein functions can be deconstructed into simpler, naturally selectable steps? That such a ladder exists, in general, or even in specific cases?

2) Is there any evidence from facts that supports the hypothesis that complex protein functions can be deconstructed into simpler, naturally selectable steps? That such a ladder exists, in general, or even in specific cases?

275 Replies to “What are the limits of Random Variation? A simple evaluation of the probabilistic resources of our biological world”

  1. 1
    Anaxagoras says:

    Please, do not treat life as a probabilistic outcome.
    This is a gene-centrist view, totally false. This is what darwinists and all naturalistic evolutionists want us to accept.
    Life “is” not the sequences of the genome. The genome is just the material used by the living organism to perform functions.
    Life is agency, goal directed behaviour, purpose and intention. Formal and final causes. Life is “order” and order can only come from an intelligent source. Treating life as a probabilistic outcome and accepting a likelihood formulation of the design argument implies accepting that “in principle” the contrary could have happened, that is, that “in principle”, life could have emerged spontaneously in an inanimate world. But it can’t. It is not a problem of enough probabilistic resources, it is a problem of causal adequacy.
    It is not a question of an army of monkeys typing on a machine and producing by chance a meaningful text given enough time. It is a question of a monkey not being able to prove the Poincaré conjecture.

  2. 2
    Dionisio says:

    This is unfair… you open this new discussion thread while your previous discussion thread –less than a month old– is at the top of the hit parade:

    Popular Posts (Last 30 Days)

    What are the limits of Natural Selection? An interesting… (2,679)

    Violence is Inherent in Atheist Politics (2,032)

    Of course: Mathematics perpetuates white privilege (1,168)

    Sweeping the Origin of Life Under the Rug (1,013)

    Is social media killing Wikipedia? (979)

    🙂 🙂 🙂 🙂 🙂 🙂 🙂

  3. 3
    Dionisio says:

    Now we got the whole Neo-Darwinian RV+NS enchilada served on the table!
    🙂

  4. 4
    Dionisio says:

    Anaxagoras @1:

    I think gpuccio is treating the Neo-Darwinian folks with their own medicine, so they see how bitter it tastes and how ineffective it is.

    That’s all.

    As one can see by the poor defense presented by the few Neo-Darwinian advocates who have dared to debate, gpuccio seems to touch a sensitive spot, while playing in their own court by their own rules. This seems to be an interesting approach to pull out the rug from under the Neo-Darwinian house of cards and discredit their magician’s tricks.

    Currently gpuccio’s latest two discussion threads seem like the most technically scientific in this forum, as far as I can see. I think ID should have a substantial proportion of scientific discussions here too.

    Actually, without pressuring gpuccio, because he’s a busy doctor, I’m looking forward with much anticipation to reading his future OPs on other interesting biology-related topics he has mentioned here.

  5. 5
    gpuccio says:

    Anaxagoras:

    Thank you for opening the discussion. The simple fact is that I fully agree with all that you say.

    But the point is: I am not treating life as a probabilistic outcome. Not at all.

    I am just discussing the functional information that we observe in the biological world, which is connected to the expression of life and is necessary (but probably not sufficient) for life as we observe it in the natural world.

    That information is there, and it is there in the form of those “sequences of the genome” which we are discussing (and maybe also in other forms, beyond the genome itself).

    Now, the point of ID is simply that the information in those sequences can only be explained by design.

    Neo-darwinist theory instead suggests that the information in those sequences can be explained by RV+NS.

    I strongly believe that neo-darwinism is wrong, and that ID theory is right. And I try to explain why.

    That’s all. It is a scientific problem about the origin of functional information in living beings and in the sequences they use. Not about life.

    About life, I agree with you: it cannot be reduced to information.

    So, I believe that there are two important orders of reasons why “life could not have emerged spontaneously in an inanimate world”.

    The first is that, as you correctly say, there is a basic “problem of causal adequacy”.

    The second is that the functional information necessary for the expression of life, even at the simplest levels we know of, absolutely requires design to exist.

    You may wonder why I discuss only the second point here, and not the first.

    The reason is simple enough: I try to stick to scientific discussions here, and only to them.

    While the problem of life is certainly of paramount importance and interest, I think that at present it cannot really be discussed satisfactorily from a scientific point of view for a basic reason: we have no idea, from a scientific perspective, of what life is.

    That makes any discussion about that, necessarily, mainly philosophical.

    I have nothing against philosophical discussions: they are important and extremely useful.

    But they are not what I am trying to do here. 🙂

  6. 6
    gpuccio says:

    Dionisio:

    “This is unfair… you open this new discussion thread while your previous discussion thread –less than a month old– is at the top of the hit parade”

    Success is a drug! 🙂

  7. 7
    gpuccio says:

    Dionisio:

    “Now we got the whole Neo-Darwinian RV+NS enchilada served on the table!”

    Ah, I love Mexican cuisine! Unfortunately, I am very bad at cooking! 🙂

  8. 8
    Dionisio says:

    Anaxagoras @1:

    As a continuation of my comment @4, note the sentences at the beginning of the OP:

    I realized that some attention could be given to the other great protagonist of the neo-darwinian algorithm: Random Variation (RV).

    For the sake of clarity, as usual, I will try to give explicit definitions in advance.

    Let’s call RV event any random event that, in the course of Natural History, acts on an existing organism at the genetic level, so that the genome of that individual organism changes in its descendants.

    That’s more or less the same as the neo-darwinian concept of descent with modifications.

    Note that gpuccio is carefully giving the Neo-Darwinian folks their own medicine, while playing in their own terrain under their own rules, so they see that their concepts really don’t work as they loudly proclaim everywhere.

  9. 9
    Dionisio says:

    gpuccio @7:

    Unfortunately, I am very bad at cooking!

    The Neo-Darwinian folks apparently agree with that, because they’ve found the whole RV+NS enchilada you have cooked very disgusting. 🙂

    But sometimes taste is a very relative thing. Some folks here, including myself, have found both OP dishes very well prepared and tasteful.

    It’s very difficult to please everybody.

    🙂

  10. 10
    Dionisio says:

    The more I read this two-volume RV+NS dissertation by gpuccio, the more I think he understands evolution better than many Neo-Darwinian folks.
    At least he explains it more precisely and clearly.

    🙂

  11. 11
    gpuccio says:

    Dionisio:

    If they like it hot, perhaps they should try the challenge.

    It’s the spiciest part! 🙂

  12. 12
    Origenes says:

    Andreas Wagner:

    The first vertebrates to use crystallins in lenses did so more than five hundred million years ago, and the opsins that enable the falcon’s vision are some seven hundred million years old. They originated some three billion years after life first appeared on earth. That sounds like a helpfully long amount of time to come up with these molecular innovations. But each one of those opsin and crystallin proteins is a chain of hundreds of amino acids, highly specific sequences of molecules written in an alphabet of twenty amino acid letters. If only one such sequence could sense light or help form a transparent cameralike lens, how many different hundred-amino-acid-long protein strings would we have to sift through? The first amino acid of such a string could be any one of the twenty kinds of amino acids, and the same holds for the second amino acid. Because 20 × 20 = 400, there are 400 possible strings of two amino acids. Consider also the third amino acid, and you have arrived at 20 × 20 × 20, or 8,000, possibilities. At four amino acids we already have 160,000 possibilities. For a protein with a hundred amino acids (crystallins and opsins are much longer), the numbers multiply to a 1 with more than 130 trailing zeroes, or more than 10^130 possible amino acid strings. To get a sense of this number’s magnitude, consider that most atoms in the universe are hydrogen atoms, and physicists have estimated the number of these atoms as 10^90, or 1,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000. This is “only” a 1 with 90 zeroes. The number of potential proteins is not merely astronomical, it is hyperastronomical, much greater than the number of hydrogen atoms in the universe. To find a specific sequence like that is not just less likely than winning the jackpot in the lottery, it is less likely than winning a jackpot every year since the Big Bang. In fact, it’s countless billions of times less likely. If a trillion different organisms had tried an amino acid string every second since life began, they might have tried a tiny fraction of the 10^130 potential ones. They would never have found the one opsin string. There are a lot of different ways to arrange molecules. And not nearly enough time.

    The power of natural selection is beyond dispute, but this power has limits. Natural selection can preserve innovations, but it cannot create them. And calling the change that creates them random is just another way of admitting our ignorance about it.

    That sounded profoundly reasonable, right? Sadly, Wagner goes into full-fledged fantasy-mode about a library housed in a 5000 dimensional hypercube, in order to facilitate the search ….
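
The orders of magnitude in the quoted passage are easy to check. The sketch below assumes that "since life began" means about 4 billion years and uses 365.25-day years; both figures are assumptions made for this example and are not given in the quote.

```python
import math

sequences = 20 ** 100                    # exact integer: strings of 100 AAs
print(round(math.log10(sequences), 1))   # 130.1

# Assumptions: 4 billion years of life, 365.25-day years
seconds = 4e9 * 365.25 * 24 * 3600
trials = 1e12 * seconds                  # a trillion organisms, one try per second
print(round(math.log10(trials), 1))      # 29.1

# Fraction of the sequence space explored, as a power of ten
print(round(math.log10(trials) - math.log10(sequences), 1))  # -101.0
```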

  13. 13
    Dionisio says:

    The biological probabilstic resources derive from reproduction

    The biological probabilistic resources derive from reproduction

  14. 14
    gpuccio says:

    Origenes:

    “That sounded profoundly reasonable, right?”

    Yes, it is.

    “Sadly, Wagner goes into full-fledged fantasy-mode about a library housed in a 5000 dimensional hypercube, in order to facilitate the search ….”

    What else could he do? Admitting design? 🙂

    Frankly, I prefer that he remains on the other side, he and his 5000 dimensional hypercubes…

  15. 15
    Dionisio says:

    “What else could he do? Admitting design?”

    Is there another game in town?

    🙂

  16. 16
    Dionisio says:

    Never mind. You may disregard #15.

    There’s another game in town: the third way!!!

    🙂

  17. 17
    EugeneS says:

    Anaxagoras

    A fair point. However, everything can be viewed probabilistically on condition that we should return from the world of mathematics to reality when we formulate the outcomes. E.g. for all intents and purposes, a probability of 10^-300 is a practical zero, which means practical impossibility.

  18. 18
    gpuccio says:

    Dionisio:

    “The biological probabilstic resources derive from reproduction”

    Yes. That is an important concept.

    In a sense, the whole RV+NS algorithm can be considered as a side effect of the functional information already present in the organisms, and that allows its reproduction.

    So, it is functional information modifying itself through its existing information.

    That’s why the scope of RV+NS is so limited. In any case, it cannot really or significantly go beyond the limits implicit in the already existing information, and the computational powers implicit in that information.

  19. 19
    forexhr says:

    Biological structures, just like everything in nature, are made of large number of various different kinds of particles. In evolutionary sense, that means that in order for a population to adapt to a particular environment, the mutation process must extract adaptive arrangement of particles(nucleotides in this case) from a pool of all possible arrangements(adaptive and non-adaptive).

    For e.g. since the DNA of first self replicating organism didn’t contain DNA sequences(nucleotide arrangements) for visual function, than the only way for such sequences to appear is by extracting them from a pool of all possible DNA sequences of some duplicated gene. But here’s where the problem comes in for the theory of evolution(ToE). The radio between sequences for non-visual and visual function, is so large that the total numbers of mutations in the history of life – 10^43(1), fails by many orders of magnitude to succeed in this extraction process.
    With the absurd assumption that only one average eukaryotic gene codes for some simple proto visual function, and that 10^500 different sequences from its pool of 10^810(2) possible sequences are functional proto visual sequences, it follows that it is mathematically impossible for evolution to extract proto visual function, because the ratio of non-functional to functional sequences (about 10^310) is 267 orders of magnitude greater than the total number of mutations in the history of life. In other words, due to the lack of mutational resources it is impossible for a proto visual adaptation to enter the gene pool of a population and increase its frequency through natural selection.

    This mathematical problem is completely ignored by the ToE.

    Whenever I present this argument, the majority of evolutionists would immediately make an appeal to natural selection(NS).
    For example, we all know that humans are unable to breathe under water. We also know that there are tons of mutations in the human gene pool in every generation. Thus, large amounts of mutations have been spent in the last 5 million years and no trait for breathing underwater has entered the human gene pool. It can be shown mathematically that this will not happen even if we spent all the mutations that occurred in the history of life. When such calculations are presented, an average evolutionist would completely ignore them and instantly make an appeal to NS. But obviously, NS cannot change the fact that no trait for breathing under water exists in the human gene pool. NS can act only once such a trait has entered the gene pool, by spreading this trait in the population.

    So, NS is completely unrelated to the question of mutational resources required to find adaptive traits, but evolutionists have repeatedly shown that they cannot comprehend the difference between these two instances.

    (1) http://rsif.royalsocietypublis.....5/953.full
    (2) The length of an average eukaryotic gene is 1346 bp. A gene consists of four different bases. Any base can assume one of four values (ATCG). A sequence of L bases can therefore assume one out of 4^L values, which gives 4^1346 or 10^810 potential sequences.
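
The figures in this comment can be retraced in log10 units. The sketch below takes the comment's own inputs at face value (10^500 functional sequences and 10^43 total mutations) and reads the 267 orders of magnitude as the gap between the non-functional to functional ratio and the number of mutations; that reading, and the variable names, are assumptions.

```python
import math

gene_length_bp = 1346
log10_all = gene_length_bp * math.log10(4)   # log10 of 4^1346
log10_functional = 500                       # assumed in the comment
log10_mutations = 43                         # figure cited in the comment

log10_ratio = log10_all - log10_functional   # non-functional per functional
shortfall = log10_ratio - log10_mutations    # gap to the mutational resources

print(round(log10_all))     # 810
print(round(log10_ratio))   # 310
print(round(shortfall))     # 267
```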

  20. 20
    daveS says:

    gpuccio,

    Your post is certainly very clear, but I am one of those who needs a “for dummies” presentation.

    Could I try to restate some of your points in terms of a toy example to make sure I have it right?

    We could consider the system where 200 fair, distinguishable, 6-sided dice are repeatedly rolled, let’s say once per second.

    Then the probabilistic resources of this system over a period of 4 billion years would be the number of distinct states we would expect to be reached in that time period.

    I think the chance of getting a repeated state is small, so it’s likely about 1.26 × 10^17 states would be reached.

    Hence the probabilistic resources in this example, would be a bit less than 57 bits.

    This is far less than the log_2 of the total possible number of states, which is about 517 bits.

    I take it the conclusion is that these probabilistic resources have had time to “search” only a minuscule portion of the state space (in fact, about 3 × 10^−139 of it).

    Does that sound about right?
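
The numbers in this toy example can be verified with a few lines. The sketch assumes 365-day years and no repeated states, as in the comment; the variable names are illustrative.

```python
import math

N_DICE = 200
seconds = 4e9 * 365 * 24 * 3600   # one roll per second; assumes 365-day years

resource_bits = math.log2(seconds)        # log2 of the states reached
space_bits = N_DICE * math.log2(6)        # log2 of all possible states
fraction_log10 = math.log10(seconds) - N_DICE * math.log10(6)

print(f"{seconds:.2e}")          # 1.26e+17
print(round(resource_bits, 1))   # 56.8
print(round(space_bits))         # 517
print(round(fraction_log10, 1))  # -138.5
```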

  21. 21
    gpuccio says:

    forexhr:

    You have provided a very good and clear summary of the main arguments I have given in my last two OPs! Thank you. 🙂

  22. 22
    gpuccio says:

    daveS:

    Yes, it is right.

    And repeated states can indeed be factored, using for example the binomial distribution to get the probability of having at least one success of probability p in n attempts.

    The only formal difference between your example and the biological scenario is that the biological setting is better modeled as a random walk, usually a Markov chain. However, the general concepts remain comparable.

    I have used the concept of total number of states that can be reached because it is simple, intuitive and effective.

  23. 23
    gpuccio says:

    forexhr:

    By the way, the Dryden paper you quote is simply ridiculous. I would really like to see the authors produce a working version of ATP synthase with only two amino acids!

  24. 24
    daveS says:

    Thanks, gpuccio.

    And I take it that under a Markov chain model, the probabilistic resources would actually decrease compared to my assumption of independent trials. I think that’s correct, anyway.

    In any case, I thought it would be interesting to use a physical illustration for the 200-dice example. Suppose we wanted to visualize the fraction of the state space explored by rolling the 200 dice for 4 billion years in terms of pixels on a large high-definition screen (say with 200 pixels per inch).

    Well, it seems that in order to construct this HD display so that even a single pixel corresponded to the searched part of the space, the display would have to be vastly larger than the known universe!

  25. 25
    ET says:

    I said it before and this looks like a good place to say it again:

    Why is it that only in the field of biology are we to accept that random hits to an existing functioning system do not degrade it? Heck, they not only don’t degrade it, they made it! And all without evidentiary support? Really?

  26. 26
    Dionisio says:

    Here’s an off topic question:

    At the bottom of this OP it reads:

    “(Visited 265 times, 278 visits today)”

    What do those numbers stand for?

    Thanks

  27. 27
    Dionisio says:

    forexhr @19:

    Interesting comment. Thanks.
    PS.
    “For e.g. since the DNA of first self replicating organism didn’t contain DNA sequences(nucleotide arrangements) for visual function, than…”
    then?

    “The radio between sequences for non-visual and visual function,…”
    ratio?

  28. 28
    Mung says:

    Sadly, Wagner goes into full-fledged fantasy-mode about a library housed in a 5000 dimensional hypercube, in order to facilitate the search ….

    Sounds to me like the library, the hypercube and the paths to get from one shelf to another must have been designed.

    🙂

    Does Wagner calculate the probabilities of his library?

  29. 29
    gpuccio says:

    daveS:

    “And I take it that under a Markov chain model, the probabilistic resources would actually decrease compared to my assumption of independent trials.”

    I think they should be comparable.

    The main difference is that in a random walk you can easily reach a nearby point. For example, to compare with your dice example, if you start from a state which is 199 sixes and one five, it’s rather likely to reach 200 sixes, while in the tossing dice scenario each result is independent from the previous one.

    In a Markov chain, each result depends on the previous state, but not on the history which led to the previous state. In a tossing dice scenario, each result is completely independent.

    However, when you start from any unrelated state, a random walk should be more or less equivalent to a random search. Indeed, from a few simulations I have done, it seems to perform worse than a random search, but I am not really sure. Maybe someone who knows the mathematical theory better could give us some confirmation of that.
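
A comparison of this kind can be set up in a few lines. The sketch below is not the simulation mentioned in the comment: it is a toy model under stated assumptions, with 3 dice, "all sixes" as the target, a random search that re-rolls every die at each step, and a random walk that re-rolls one randomly chosen die per step from a random starting state. All function names are hypothetical, and no particular outcome is claimed here.

```python
import random

def random_search_steps(n_dice, rng):
    """Re-roll all dice at each step; count steps until all show six."""
    steps = 0
    while True:
        steps += 1
        if all(rng.randint(1, 6) == 6 for _ in range(n_dice)):
            return steps

def random_walk_steps(n_dice, rng):
    """Re-roll one randomly chosen die per step, from a random start."""
    state = [rng.randint(1, 6) for _ in range(n_dice)]
    steps = 0
    while any(face != 6 for face in state):
        state[rng.randrange(n_dice)] = rng.randint(1, 6)
        steps += 1
    return steps

def mean_steps(fn, n_dice=3, runs=1000, seed=0):
    """Average number of steps to reach the target over several runs."""
    rng = random.Random(seed)
    return sum(fn(n_dice, rng) for _ in range(runs)) / runs

print("random search:", mean_steps(random_search_steps))
print("random walk:  ", mean_steps(random_walk_steps))
```

Changing `n_dice`, the step rule of the walk, or the starting state makes it possible to explore how sensitive the comparison is to the modeling choices.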

  30. 30
    Anaxagoras says:

    Probabilistic arguments have been around for decades. But they don’t seem to have impressed evolutionists too much. Elliott Sober addresses the likelihood formulation of the design argument in his book “Evidence and Evolution”. He admits that the outcomes of evolution have “extremely small probabilities” but (he says) “they are not impossible”. In principle, “monkeys pounding at random on typewriters CAN produce the works of Shakespeare, and a hurricane whirling through a junkyard CAN produce a functioning airplane”. As a corollary, he insists, evolution CAN produce adaptive features that are irreducibly complex.

    Gpuccio has presented his arguments in two separate posts and that is a tricky thing, according to Sober. The argument on RV alone can beat Epicureanism, that is , a purely random process, but nos darwinism because darwinism implies a combination of RV plus NS. Basically he summarizes his criticism concluding that ID arguments don´t hold because they are not testable against the evolutionary hypothesis that RV plus NS did the work.

    A key point in Sober´s discourse is that he misrepresents what ID arguments are from the beginning of his explanation. That is, he starts by assuming that ID is a probabilistic argument on the grounds of an unimportant sentence he quotes from Paley´s Natural Theology. He purports that this sentence implies that Paley, “in principle” admits that random processes could have done the work, and therefore his argument is not a deductive formulation but a probabilistic argument.
    True, Paley´s argument like all ID arguments presented, starting from the philosophers of antiquity to the more modern proponents of the ID movement, shouldn´t be understood as an apodictic conclusion of a deductive syllogism. But they are not probabilistic, they are “the inference to the best explanation” (that includes to the only possible explanation) according to Peirce´s abductive logical method. And the fact that they are not presented as apodictic conclusions doesn´t imply that the contrary is accepted in principle as a possible reasonable explanation.

  31. 31
    gpuccio says:

    Mung:

    “Does Wagner calculate the probabilities of his library?”

    I don’t know, but his library seems to exist only in his head, certainly not out there!

    I am all for Borges’ truly random “Library of Babel”!

  32. 32
    daveS says:

    gpuccio,

    I think they should be comparable.

    The main difference is that in a random walk you can easily reach a nearby point. For example, to compare with your dice example, if you start from a state which is 199 sixes and one five, it’s rather likely to reach 200 sixes, while in the dice-tossing scenario each result is independent of the previous one.

    In a Markov chain, each result depends on the previous state, but not on the history which led to the previous state. In a dice-tossing scenario, each result is completely independent.

    However, when you start from any unrelated state, a random walk should be more or less equivalent to a random search. Indeed, from a few simulations I have done, it seems to perform worse than a random search, but I am not really sure. Maybe someone who knows the mathematical theory better could give us some confirmation of that.

    Interesting.

    My reasoning was that, referring to my toy example, using “realistic” parameters, most of the dice would remain fixed from one stage to the next—that is, assuming “mutations” are fairly rare.

    In that scenario, it’s not absurdly unlikely that in two consecutive transitions, the first die switches from 1 to 2, and then back to 1 again, with all other dice fixed (for example). Hence repetitions of states might be a little more likely.

    I do think you are right in that whatever happens, the two scenarios would be comparable and perhaps very close.

  33. 33
    gpuccio says:

    Anaxagoras:

    Again, I fully agree with you.

    Two important points, on which we seem to agree:

    a) In empirical sciences, a good theory is not a deduction, but an inference to the best explanation.

    I am surprised that Sober admits that the outcomes of evolution have “extremely small probabilities” but (he says) “they are not impossible”. That is not a scientific argument at all. There are a lot of things “not impossible” that have no relevance in science.

    b) ID theory is much more than simply rejecting the neo-darwinian theory. It is about positive reasons for considering complex functional information as a reliable and safe marker of design. I have discussed those positive aspects of ID many times, in my OPs and in my discussions on other’s threads.

    But ID theory, indeed, does also reject the neo-darwinian theory of RV + NS: the RV part by probabilistic reasoning (as is appropriate, the RV part being a probabilistic argument), and the NS part by conceptual, methodological and empirical considerations.

    I have tried to sum up, in my 2 OPs, the best arguments for both aspects of that rejection. I am sorry that I had to use two separate OPs, which “according to Sober” should be, for some reason that I certainly miss, “a tricky thing”. Maybe next time I will write a single long post! 🙂

    However, now the discussion is open on both aspects.

  34. 34
    gpuccio says:

    daveS:

    I have made simple simulations of Markov chains, both including synonymous events (no variation) and excluding them.

    They always seem to perform worse than the mere probability in a random search.

    But I could have made some errors, I am not completely sure.

  35. 35
    Axel says:

    Anaxagoras,

    Your post #1 made AL (not Laugh Out Loud, but Audible Laughter) reading for me, from start to finish, simply because those things should not even need to be articulated. Though I’m a little surprised that GP of that ilk, (though he might be a hospital doctor, rather than a GP) had to explain to you that he was on your side, and explain his rationale.

    But then again, I get impatient when you boffins on here patiently explain the most basic and most commonsensical truths to our materialist friends. There is a certain sense in which this very forum is a surreal venture. How do people with a tertiary education in scientific fields manage to harbour crazy assumptions and conjectures that need to be eliminated from consideration?

    But to revert to the humour, these perfectly measured and courteous analyses of their madness remind me of the devastating analyses of the arguments of the rigorists and legalistic pedants by Papa Francesco, and indeed other cardinals of similar acuity; the humour residing in large part from their being no more than accurate statements of the truth, while they sound like mean satires, authorship of which Evelyn Waugh or Joseph Heller might have coveted. Of course, the fact that they read as if deliberately satirical, might, on occasions not be unwelcome by the authors such as Pope Francis and prelate cohorts.

  36. 36
    daveS says:

    gpuccio,

    Yes, just to be clear, that’s what I would expect based on my simple reasoning above.

    Using independent trials in my toy example, it’s virtually certain that no states would occur more than once. However, in a Markov chain model, I think it could be more likely, depending on the transition probabilities, that states occur two or more times, hence “wasting” some of the trials.

  37. 37
    gpuccio says:

    daveS:

    My son, who is a physicist, confirms that a random walk performs worse than a random search when you start from an unrelated state. Of course, it performs better if you start from a related state.

    That seems to make sense.

    In a very big search space, where most states are unrelated to the target, I would expect the two things to be comparable.
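
    The comparison discussed here can be tried on a small toy system. The sketch below is not the simulation mentioned in the thread; it is a minimal illustration under stated assumptions: 8 dice with 6 faces, 20000 trials, and a walk in which each step re-rolls one randomly chosen die (possibly to the same face, which plays the role of a "synonymous" event). All function names and parameters are hypothetical. It counts how many distinct states each procedure visits, which is one way to see the "wasted" trials daveS describes.

```python
import random

def distinct_states_random_search(n_dice, n_faces, n_trials, rng):
    """Independent trials: every trial re-rolls all dice from scratch."""
    seen = set()
    for _ in range(n_trials):
        seen.add(tuple(rng.randrange(n_faces) for _ in range(n_dice)))
    return len(seen)

def distinct_states_random_walk(n_dice, n_faces, n_trials, rng):
    """Markov chain: each step re-rolls a single randomly chosen die.

    The new face may equal the old one (assumption: this models a
    'synonymous' event that leaves the state unchanged).
    """
    state = [rng.randrange(n_faces) for _ in range(n_dice)]
    seen = {tuple(state)}
    for _ in range(n_trials - 1):
        state[rng.randrange(n_dice)] = rng.randrange(n_faces)
        seen.add(tuple(state))
    return len(seen)

rng = random.Random(0)  # fixed seed so the run is repeatable
search = distinct_states_random_search(8, 6, 20000, rng)
walk = distinct_states_random_walk(8, 6, 20000, rng)
print("distinct states, random search:", search)
print("distinct states, random walk:  ", walk)
```

    In this toy setting the walk revisits states much more often than independent sampling does, so it covers fewer distinct states for the same number of trials. How far that carries over to realistic sequence spaces depends on the transition probabilities, as noted above.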

  38. 38
    Axel says:

    ‘Why is it that only in the field of biology are we to accept that random hits to an existing functioning system does not degrade it? Heck it not only doesn’t degrade it, it made it! And all without evidentiary support? Really?’

    ET, I feel I must interject here on the side of your opponents, and adapt a saying my stepfather was fond of: theists, deists, IDists et al should be seen and not heard. There! I trust that answers your question.

  39. 39
    gpuccio says:

    Axel:

    Funny! 🙂

  40. 40
    gpuccio says:

    ET:

    “Why is it that only in the field of biology are we to accept that random hits to an existing functioning system does not degrade it? Heck it not only doesn’t degrade it, it made it! And all without evidentiary support? Really?”

    True. But absurd reasonings can be found in many other fields, too! 🙂

  41. 41
    mike1962 says:

    Axel @38,

    “Shut up,” he explained. 🙂

  42. 42

    Good post and discussion. ET @ 25 hits the bullseye.

  43. 43
    groovamos says:

    gpuccio: “True. But absurd reasonings can be found in many other fields, too!”

    Not in the field of yours truly, electrical engineering. There is a branch of EE, stochastic processes, under the subfield of signals and systems. I am much more adept at signals and systems than I am with stochastic processes but here goes.

    Random events are carefully analysed with the tools of mathematical statistics and classical signal theory. The way we do this requires that we can characterize the source of the events and processes and actually study them. I don’t see this happening in “evolutionary biology” with so-called “random mutations”

    Examples from my field: Johnson or thermal noise. This is the VLF spectral portion of blackbody radiation which follows Wien’s law, and the Gaussian probability function.

    Shot noise – This is the transition across a semiconductor junction of a charge carrier and is largely of a Poisson statistical nature.

    Electromagnetic interference – this should be familiar to you guys but in specific environments it can be studied and characterized.

    Alpha particle or neutron bombardment – this is the one type of stochastic process that actually degrades system function irreversibly and is studied carefully at the university level including at my alma mater Vanderbilt; of interest for the aerospace sector.

    What I don’t see in evolutionary biology is great understanding of the stochastic nature of the so-called random mutations that we are assured are of primary interest. I keep saying it again and again: I have never seen anyone on the blogs take an interest in the question or who can even point me to the proof of just one case of statistical non-correlation among the “random mutations” of a sequence responsible for novel form or function.

  44. 44
    gpuccio says:

    groovamos:

    “What I don’t see in evolutionary biology is great understanding of the stochastic nature of the so-called random mutations that we are assured are of primary interest. I keep saying it again and again: I have never seen anyone on the blogs take an interest in the question or who can even point me to the proof of just one case of statistical non-correlation among the “random mutations” of a sequence responsible for novel form or function.”

    Exactly. Of course, if even a little of the scientific rigour applied in your field were found in evolutionary biology, then neo-darwinism would have already been left behind as a wrong and useless theory.

    Of course, there are some serious attempts at modeling what can really happen. But they are often biased by assumptions in favor of the classic dogma, at least in the conclusions.

    I would recommend, about the two mutations problem, the following papers:

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2581952/

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2286568/

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2253472/

    All of them are serious attempts at building models for the two mutations case. Beyond the differences in approach and results, they demonstrate that it is possible to build models and try to verify them.

    They also demonstrate a very important thing: this war between models for two mutations is the proof that if we increase the number of AAs necessary for the initial function, neo-darwinism is completely out of the game.

    The numbers I give here in my OP are hopeless for the classic dogma. And they are, of course, an extremely optimistic top down higher threshold.

    The bottom up results, including all the papers I have quoted, tell all another story. If we consider realistic effective population sizes, time windows and evolutionary effects, including the effects of fixation time and many other factors, the 37 AAs that I give as extreme threshold for the bacterial world (including the 5 AAs that ensure the 5 sigma margin) seem really a myth.

    Two neutral mutations to establish a first selectable function are certainly a possible target, but they are already a very exacting target, definitely out of range, or just at the limits of the range, for the less performing scenarios.

    However, 3, or 4, or maybe even 5 AAs could be accepted in the most favourable scenarios. That is also, more or less, Axe’s conclusion from his analyses.

    I would gladly concede 10 AAs as an extreme possibility, from a bottom-up perspective.

    However, for the moment I will stick to the top down limits I have given in my OP.

    I still have to see any naturally selectable new function that is in the range of the 19-37 specific AAs. Maybe there are some.

    But what about the thousands and thousands of proteins (practically almost all of them) which have functional information well beyond that?

    What about the 2000 human proteins that have an information jump of 500+ bits in the 30 million years of the transition to vertebrates?

    What about ATP synthase? Dynein? Or even the histones, or ubiquitin, or other extremely conserved sequences?

    How did they originate?

  45. 45
    Origenes says:

    Mung: Sounds to me like the library, the hypercube and the paths to get from one shelf to another must have been designed.

    Does Wagner calculate the probabilities of his library?

    No, he does not. However he does say several things about the library.

    More than that, the mathematics of biology allowed us to see that these libraries self-organize with a simple principle, as simple as the gravitation that helps mold diffuse matter into enormous galaxies. This principle—that organisms are robust, a consequence of the complexity that helps them survive in a changing world—brings forth the intricate organization of these vast libraries.

    So, let me get this straight: because organisms are robust (homeostasis) the library self-organizes.

    Okay got it. And we have all this complexity in life due to the libraries, right? Right.

    But why are organisms robust? Wagner tells us that robustness is a “consequence of the complexity that helps them to survive”. But that is circular reasoning, isn’t it?

    Complexity –> robustness –> libraries –> complexity

    And then I read this …

    When we begin to study nature’s libraries we aren’t just investigating life’s innovability or that of technology. We are shedding new light on one of the most durable and fascinating subjects in all of philosophy. And we learn that life’s creativity draws from a source that is older than life, and perhaps older than time.

    [A. Wagner, ‘Arrival of the Fittest’, ‘EPILOGUE Plato’s Cave’]

    What to make of that final sentence of his book? Rather a strange way to end a ‘naturalistic’ book I would say.

  46. 46
    gpuccio says:

    To all interested:

    One important clarification about the mutation rate: it is a mutation rate per genome per replication.

    So, the related number of states is computed for the whole genome.

    Let’s take the case of bacteria. The mutation rate I reported is 0.003. That means that, out of 1000 replications, 3 genomes will have one mutation.

    But, of course, that is referred to the whole genome. For example, for E. coli the genome length is about 4.6 Mbp.

    Now, the mutation rate per nucleotide site per replication is usually about 10^-8 – 10^-9, for almost all organisms. 10^-9 x 4.6 * 10^6 is 0.0046, which is very near to the 0.003 that I have given as average mutation rate in bacteria.

    OK, but the point is that those mutations will be spread throughout the genome of 4.6 Mbp!

    Now, let’s imagine that the specific functional sequence we want to find is related to one specific gene location, for example a sequence of 100 AAs, IOWs 300 nucleotides.

    In this case, the only mutations that will help find the sequence will be those that happen in that nucleotide sequence. By a simple computation, we can see that the mutation rate for that sequence will be:

    10^-9 * 300 = 0.0000003

    So, the mutation rate that can be used for that particular target will be 4 orders of magnitude lower than the total per genome mutation rate. That means only about 125 bits instead of 138: we lose about 13 bits, IOWs about 3 amino acids, in our margin.

    Always considering a maximum population of 10^30, and 4 billion years of evolutionary time, which is of course another big exaggeration.
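
    The arithmetic in this comment can be checked in a few lines. This is only a sketch of the computation as stated above; the variable names are made up for illustration, and the inputs (10^-9 per site, 4.6 Mbp, 0.003 per genome, a 300-nucleotide target, 138 bits) are the figures quoted in the comment, not independently verified values.

```python
import math

per_site_rate = 1e-9           # mutations per nucleotide per replication (as quoted)
genome_length = 4.6e6          # E. coli genome length in base pairs (as quoted)
per_genome_rate_used = 0.003   # average bacterial per-genome rate used in the OP
target_length = 100 * 3        # 100 AAs = 300 nucleotides
total_bits = 138.0             # per-genome figure quoted in the comment

# Per-genome rate implied by the per-site rate: about 0.0046
per_genome_rate = per_site_rate * genome_length

# Rate restricted to the 300-nucleotide target: 3e-7
per_target_rate = per_site_rate * target_length

# Going from 0.003 to 3e-7 is a factor of 10^4, i.e. about 13.3 bits
bits_lost = math.log2(per_genome_rate_used / per_target_rate)
bits_left = total_bits - bits_lost

# One fully specified amino acid is log2(20), about 4.32 bits
aa_lost = bits_lost / math.log2(20)

print(f"per-genome rate from per-site rate: {per_genome_rate:.4f}")
print(f"per-target rate: {per_target_rate:.1e}")
print(f"bits lost: {bits_lost:.1f}, bits left: {bits_left:.1f}")
print(f"amino acids lost from the margin: {aa_lost:.1f}")
```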

  47. 47
    gpuccio says:

    Origenes:

    Wagner is beyond any hope. You say that is circular reasoning, and I obviously agree.

    But, frankly, I would not even call it reasoning at all.

    I am sorry to say that: I usually try to look for what is interesting in others’ thoughts. With Wagner, I have an immediate sense of foolishness and uselessness.

    If anyone really believes that there is something in Wagner’s weird ideas, that person is cordially invited to explain, in clear and simple words, appropriate for people of medium intelligence as we are, what that “something” is. I really can’t see it.

  48. 48
    Origenes says:

    M. Denton, “Evolution: Still a Theory in Crisis”:

    If gradual natural selection is powerless to generate the most important biological features in the history of life, as I have argued in previous chapters, then what about relying on chance saltations as an alternative mechanism? …[T]he sheer complexity of biological life renders such a proposal incredible. Chance cannot resuscitate the corpse of Darwinian evolution. [p. 225]

    The complexity of living systems is so great that there is now an almost universal consensus, as we saw in the discussion of ORFan genes, that the simplest of all biological novelties — a single functional gene sequence — cannot come about by chance mutations in a DNA sequence. And if an individual gene sequence is far too complex to be produced by chance, then the sudden origination of a morphological novelty like a feather, a limb, or even such a comparatively simple novelty as an enucleate red cell — all novelties vastly more complex than an individual functional gene sequence — is by any common-sense judgment far beyond the reach of any sort of undirected “chance” saltation. [p. 226]

  49. 49
    gpuccio says:

    Origenes:

    It seems that I am not alone in my convictions! 🙂

  50. 50
    Origenes says:

    GPuccio

    Everybody agrees with you, including those who don’t want to admit it.

    I have decided not to write about Wagner any more. However, for those who want to know more about him and his libraries, here is a link to an elucidating article by ID skeptic and philosophical materialist Massimo Pigliucci.

  51. 51
    Dionisio says:

    gpuccio @44:

    What about the 2000 human proteins that have an information jump of 500+ bits in the 30 million years of the transition to vertebrates?

    What about ATP synthase? Dynein? Or even the histones, or ubiquitin, or other extremely conserved sequences?

    How did they originate?

    Well, the surrounding conditions at the time were conducive to their appearance. Somehow the predominant forces cooperated to make their simpler predecessors, which were lately co-opted into their current versions.

    Did this answer your question?

    🙂

  52. 52
    gpuccio says:

    Origenes:

    “Everybody agrees with you, including those who don’t want to admit it.”

    How can you say that, when I am overwhelmed by so many brilliant interventions by dissenting interlocutors? And I have to spend all my time answering the hundreds of people who have taken my repeated challenge? 🙂

    “I have decided not to write about Wagner any more.”

    Wise decision. I don’t think he deserves your attention.

    I will read Pigliucci’s article, however!

  53. 53
    gpuccio says:

    Dionisio:

    “Well, the surrounding conditions at the time were conducive to their appearance. Somehow the predominant forces cooperated to make their simpler predecessors, which were lately co-opted into their current versions.

    Did this answer your question?”

    Of course! You are definitely the best neo-darwinist interlocutor in this thread!

    Ehm, probably the only one. 🙂

  54. 54
    Dionisio says:

    gpuccio,

    I’m glad to read that my thorough explanation ‘somehow’ answered your question. That means we both finally understand evolution!
    Maybe we should contact professor James Tour to get us the free dinner* he promised in his challenge, in exchange for explaining evolution to him too?
    🙂

    Maybe we should suggest meeting at a nice restaurant by the Palermo littoral? 🙂

    Since Dr. Tour would be paying, let’s try to reserve a large table at the most expensive restaurant that you know there, then we bring our entire families too. Dr. Tour didn’t mention the number of people he would treat, and a promise is a promise, right? 🙂

    My only personal requirement is that it must be authentic Sicilian cuisine. I’m sure Dr. Tour would agree. Who wouldn’t?

    (*) suggested dinner, because someone -forgot who- wrote that there’s no free lunch
    🙂

    PS. BTW, maybe we should include the distinguished Canadian biochemistry professor who could explain to Dr. Tour -and to us- how exactly morphogen gradients are formed and interpreted. 🙂

  55. 55
    gpuccio says:

    Dionisio:

    That would be some dinner! 🙂

    “suggested dinner, because someone -forgot who- wrote that there’s no free lunch”

    🙂 🙂 🙂

  56. 56
    J-Mac says:

    “There are only a limited number of genes, which, upon mutation, can produce a restricted number of alleles” Micke and Donini (1993)

  57. 57
    gpuccio says:

    J-Mac:

    “There are only a limited number of genes, which, upon mutation, can produce a restricted number of alleles” Micke and Donini (1993)

    True! 🙂

    While I could not find the paper you quote (have you a more complete reference?), I searched the whole statement, and I found it in this interesting paper by Lönnig:

    http://www.unser-auge.de/law-o.....ation.html

    I quote another interesting conclusion from it:

    In accord with the law of recurrent variation, mutants in every species thoroughly examined (from pea to man), whether naturally occurring, experimentally induced, or accidentally brought about, happen in a large, but nevertheless limited spectrum of phenotypes with either losses of functions or neutral deviations. Yet, in the absence of the generation of new genes and novel gene reaction chains with entirely new functions, mutations cannot transform an original species into an entirely new one. This conclusion agrees with all the experiences and results of mutation research of the 20th century taken together as well as with the laws of probability. Thus, the law of recurrent variation implies that genetically properly defined species have real boundaries that cannot be abolished or transgressed by accidental mutations.

    In contrast to the authors quoted in the introduction, yet in accord with the group of researchers referred to under REPERCUSSIONS above, the origin of the world of living organisms must be explained on a basis different from that given by the synthetic theory of evolution.

  58. 58
    EugeneS says:

    The criticism I came across when getting the same ballpark figures is, ok, evolution only had time to forage a tiny fraction of the search space and yet, it could build up all the observed biocomplexity!

    What I want to say, is that the estimate of the (minuscule) fraction of the search space that can be visited by evolutionary random walk is not enough by itself. It must be supported by an estimate of rarity of functional states in the search space (of the order of 1 functional polypeptide in every 10^77 as experimentally assessed by D. Axe). These two estimates together present a statistical argument against the “grand show of evolution”, as R. Dawkins put it describing the R. Lenski experiment.

  59. 59
    daveS says:

    gpuccio,

    I was going to ask a question related to the point raised by EugeneS.

    Given what is known about the structure of this state space, what fraction of it would have to have been explored in order for RV + NS to be a plausible explanation?

    And looking at this from the other end, are there some modest examples of evolution that you believe could have occurred without the intervention of a designer? For example, could any of the speciation events on this phylogenetic tree have occurred “naturally”?

  60. 60
    Mung says:

    daveS:

    For example, could any of the speciation events on this phylogenetic tree have occurred “naturally”?

    Naturally it happened “naturally.”

  61. 61
    Mung says:

    EugeneS:

    The criticism I came across when getting the same ballpark figures is, ok, evolution only had time to forage a tiny fraction of the search space and yet, it could build up all the observed biocomplexity!

    That doesn’t obviate the need for design.

    Ask what must the nature and structure of the search space be like given the hypothesis that evolution only had time to forage a tiny fraction of the search space and yet, it could build up all the observed biocomplexity.

    The search space itself must be very special and unique.

  62. 62
    Mung says:

    Origenes:

    What to make of that final sentence of his book?

    I know, right? Over at TSZ they claim his book is “an ID killer.”

    I quote that sentence to them and ask how so?

  63. 63
    ET says:

    Wagner’s book cannot be an ID killer because he never says how proteins originated nor how life originated. He does say that there are many different DNA sequences that can produce the same protein. But it takes specific changes to do so starting from some given DNA sequence. And we already know that many specific changes are out of the reach of unguided evolution.

    So no, the book is far from an ID killer.

  64. 64
    Origenes says:

    Mung @62 ET @63

    Breaking my promise in #50 …
    Only yesterday I found out that Wagner’s 5000-dimensional hypercube housed library is a “platonic concept” …?
    Do read this article http://nautil.us/blog/the-neo_.....more-wrong

  65. 65
    ET says:

    A random walk that changes the DNA sequence but doesn’t change the protein is like walking in place. I don’t see how this helps evolutionism.

    And I don’t see how comparing DNA sequences that produce that same protein can be used to form a tree alleging Common Descent. That reeks of desperation.

  66. 66
    gpuccio says:

    EugeneS and daveS:

    Good points. They deserve a detailed discussion.

    The important point is that we are always discussing functional information here, which is by definition derived from the ratio of the target space to the search space. So, it is absolutely important to understand what is the target space, and what is the search space.

    Now, when I say that bacteria can reach, at most, 138.6 bits, the meaning is simply that the maximum space they can search, in relation to their whole genome, is 2^138.6. That is the space of the potential search, the highest fraction of the search space that can be explored in the global system we have defined.

    Now, what is the search space? It can be simply defined as the sum total of all possible sequences of the same length as the bacterial genome (which is, however, very variable).

    For E. coli, for example, whose genome is 4.6 Mbp, the search space would be 4^4600000, IOWs 9.2 million bits.

    So we have:

    Potentially searched space = 138.6 bits

    Search space = 9.2 million bits

    Do you realize how insignificant is the searched space in relation to the search space? For a bacterial genome?

    OK, but what about the target space?

    Nobody knows for certain how big target spaces are in general, but we have some good clues to understand how big they are in particular cases.

    Let’s take the case of weak ATP binding, a ridiculously simple function, and by far not selectable.

    There should be no problem to get such a simple function: just a string of AAs that can bind ATP, even at very low levels, so that we can separate the corresponding protein by ATP columns in our lab.

    That’s what Szostak has done in his famous paper.

    And, while his results have nothing to do with NS and neo-darwinism, as discussed many times, they can still give us some idea about the functional space for an extremely simple function.

    Let’s see: I have already computed the functional information for the basic simple function selected by the authors: about 40 bits.

    That means that, for that function, while the search space (for 80 AAs) is huge (1.2E+104), the target space is very big too: 8.06E+91. The ratio: 6.66667E-13, 40 bits, is the functional complexity for that function in a 80 AAs long sequence.

    Now, we can easily see that such a simple function is well in the range of what can be found by the total bacterial system, which has a capacity of finding about 138 functional bits per genome, and in particular 123.4 bits for a 80 AAs sequence. So, 40 bits are a piece of cake.

    For Hominidae, on the other hand, the total space that can be explored for a 80 AAs sequence is 32.5 bits.

    So, the system of Hominidae, in 15 million years, should not be able to find such a simple function by RV alone.

    However, the probability of finding it is 0.003994003: not exactly 5 sigma, but significant enough to consider rejecting the null hypothesis of a random origin in that context.

    So, we can see that 40 bits of functional information, a very trivial generic function, is in the range of bacteria, but probably not in the range of Hominidae.

    But remember: this is a completely useless function. It is so simple, that it cannot be selected in any natural context.

    Of course, there is “potential for function” in it, as Corey Delvine has said. And Szostak has amplified that potential for function by rounds of mutation and artificial selection. Which means that he has certainly increased the functional complexity of the result, even if it is not easy to say how much.

    And what has he obtained? Another useless protein, but with a strong ATP binding ability: strong enough to be deleterious in most contexts.

    Well, that should give an idea, but I will make some further considerations as soon as I have time! 🙂
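
    The numbers in this comment can be reproduced with a short script. This is a sketch under the comment’s own assumptions: each trial is treated as an independent draw, the functional fraction is the 4 in 6×10^12 from Szostak’s library, and the explored spaces (123.4 bits for bacteria, 32.5 bits for Hominidae, for an 80 AAs sequence) are taken as given from the OP. The helper name `p_at_least_one_hit` is hypothetical.

```python
import math

# Search space for a 4.6 Mbp genome: 4^4600000 states, 2 bits per nucleotide
genome_bits = 4_600_000 * math.log2(4)

# Search space for an 80 AAs sequence: 20^80, about 1.2E+104
search_space_80aa_log10 = 80 * math.log10(20)

# Functional fraction for weak ATP binding: 4 hits in 6E12 random sequences
functional_fraction = 4 / 6e12
functional_bits = -math.log2(functional_fraction)

# Target space = search space x functional fraction, about 8.06E+91
target_space_log10 = search_space_80aa_log10 + math.log10(functional_fraction)

def p_at_least_one_hit(explored_bits, fraction):
    """P(at least one success) in 2^explored_bits independent trials."""
    attempts = 2.0 ** explored_bits
    # 1 - (1 - fraction)^attempts, written to stay accurate for tiny fractions
    return -math.expm1(attempts * math.log1p(-fraction))

p_hominidae = p_at_least_one_hit(32.5, functional_fraction)
p_bacteria = p_at_least_one_hit(123.4, functional_fraction)

print(f"genome search space: {genome_bits:,.0f} bits")
print(f"80 AAs search space: 10^{search_space_80aa_log10:.2f}")
print(f"functional information: {functional_bits:.1f} bits")
print(f"target space: 10^{target_space_log10:.2f}")
print(f"P(hit), Hominidae: {p_hominidae:.4f}")
print(f"P(hit), bacteria:  {p_bacteria:.4f}")
```

    With 32.5 bits as input the Hominidae probability comes out at about 0.004, in line with the 0.003994 quoted above; the small difference presumably reflects rounding of the bit value.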

  67. 67
    gpuccio says:

    ET:

    If Wagner’s arguments are an “ID killer”, then ID is certainly in very good health conditions! 🙂

  68. 68
    gpuccio says:

    Mung:

    “The search space itself must be very special and unique.”

    Correct. If it were as Wagner imagines. But it is not.

    If it were, that would be some evidence for the position of theistic evolution. (Maybe Wagner is a theistic evolutionist, camouflaged as a neo-platonic evolutionist! 🙂 )

    But it is not, and that is absolute evidence for design, in space and time, in the course of natural history, and not at the beginning of creation.

  69. 69
    daveS says:

    gpuccio,

    Now, what is the search space? It can be simply defined as the sum total of all possible sequences of the same length as the bacterial genome (which is, however, very variable).

    For E. coli, for example, whose genome is 4.6 Mbp, the search space would be 4^4600000, IOWs 9.2 million bits.

    So we have:

    Potentially searched space = 138.6 bits

    Search space = 9.2 million bits

    Do you realize how insignificant is the searched space in relation to the search space? For a bacterial genome?

    Unimaginably small!

    My only question here is, what is the status of the entire “search space”, this collection of 4^4600000 logically possible states?

    I don’t know anything about biology, but is it not the case that the vast majority of the states in the search space do not correspond to viable organisms, and in fact are not even close to states which do? If that were true, then it would be practically impossible for many of these states to occur in a search. Please correct me if I’m wrong about that.

    The picture I have in mind is something like a population density map of Egypt, where the states that have some practical possibility of being searched make up only a tiny fraction of the space of logically possible states.

    If that’s the case, is the 4^4600000 number even relevant?

  70. 70
    gpuccio says:

    EugeneS and daveS:

    Now, a very important point.

    We have seen that the functional information for Szostak’s initial sequences is very low: 40 bits.

    That number derives very simply from the fact that he has found 4 such sequences in a random library of 6×10^12 random sequences, that, as the authors say, “randomly samples the whole of sequence space, rather than the vicinity of a known protein”.

    So, the functional complexity of the function is:

    4:6E12 = 6.67E-13 = 40 bits.
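    A minimal sketch of that arithmetic, assuming functional information is taken as -log2 of the fraction of functional sequences in the library (the function name is only illustrative):

```python
import math


def functional_bits(hits: int, library_size: float) -> float:
    """Functional information in bits: -log2(hits / library_size)."""
    return -math.log2(hits / library_size)


# 4 ATP-binding sequences found in a random library of 6x10^12 sequences
szostak_bits = functional_bits(4, 6e12)
print(f"{szostak_bits:.1f} bits")  # about 40 bits
```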

    Now, when I give the functional complexity of a protein, or of a transition, in my OPs, I derive it from the conservation scores in BLAST comparisons. That bit score is a very good measure of functional complexity, IMO, provided that the homology is found between proteins that are separated by a vast evolutionary time.

    For example, all my analyses of functional gain at the vertebrate transition are based on conservation between cartilaginous fish and humans, a 400+ million years time window, more than enough to ensure that conservation is due to functionality, and is not passive.

    But let’s go back to an old friend, ATP synthase.

    As I have said many times, the beta chain of the F1 complex of the molecule has an astonishing conservation between bacteria and humans.

    Just to remind the numbers (humans and E. coli):

    ATP synthase beta chain (P06576, 553 AAs):

    334 identities, 383 positives, 663 bits

    Now, that is amazing.

    Consider that this result in bits is already a measure of the target space/search space ratio.

    Indeed, the search space for this protein is 2390 bits, about 10^719 states.

    Therefore, when we have 663 bits of functional information from the bitscore, that is already a very conservative value, because it is setting the target space at 1727 bits, IOWs a target space of 10^519 states!
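    The numbers above can be reproduced with a short sketch, assuming a 20-letter amino acid alphabet with each position counted as log2(20) bits (variable names are only illustrative):

```python
import math

AA_ALPHABET = 20   # standard amino acids
length_aa = 553    # length given above for the beta chain
bitscore = 663     # human vs E. coli BLAST bitscore quoted above

# Search space: 20^553, expressed in bits and as a power of ten
search_bits = length_aa * math.log2(AA_ALPHABET)
search_log10 = length_aa * math.log10(AA_ALPHABET)

# Target space implied by taking the bitscore as functional information
target_bits = search_bits - bitscore
target_log10 = target_bits * math.log10(2)

print(f"search space: {search_bits:.0f} bits, about 10^{search_log10:.1f} states")
print(f"target space: {target_bits:.0f} bits, about 10^{target_log10:.1f} states")
```

    The exponents come out slightly above 719 and 519, so the figures in the text are rounded down.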

    IMO, the BLAST bitscore definitely underestimates functional information. For example, it gives perfect identity a bitscore of about 2.2 per position, which is half the potential information in one AA position (4.3 bits). That derives in part from how the bitscore is computed, but I still believe that it underestimates functional information.

    However, as it is a measure that is very easy to obtain, and is universally considered a valid metric of homology, I have used that score in all my analyses.

    But my point is that the bitscore is already corrected, maybe hypercorrected, for the target space.

    So, going back to ATP synthase beta chain, where does it stand in my table of probabilistic resources?

    Absolutely nowhere! The highest information value that can be found, in the most performant system, is 138.6 bits, for a whole genome. Certainly something less for a 553 AAs (1659 nucleotides) sequence.

    Here we have 663 bits of functional information.

    There is no game for neo-darwinism. No game at all.

  71. 71
    gpuccio says:

    daveS:

    If a mutation is not compatible with life, the new organism will die. But the mutation still occurs, and it is one of the states searched.

    So, the number is absolutely relevant.

    Non functional states can absolutely be reached by mutation. They simply will not survive. That does not make functional states more likely.

    Remember, we are discussing the total number of states that can be reached by variation in all available reproductions. Non functional states are part of that set.

  72. 72
    daveS says:

    gpuccio,

    If a mutation is not compatible with life, the new organism will die. But the mutation still occurs, and it is one of the states searched.

    So, the number is absolutely relevant.

    Non functional states can absolutely be reached by mutation. They simply will not survive. That does not make functional states more likely.

    Remember, we are discussing the total number of states that can be reached by variation in all available reproductions. Non functional states are part of that set.

    Yes, but I’m suggesting (again, correct me if I’m wrong) that many of these nonfunctional states are so remote from any viable state that it is practically impossible for them to occur.

    If I cobble together a 4.6 Mbp genome at random, what is the probability that it will correspond to a viable organism [or be “reachable” realistically from a viable organism]? I have no idea, but isn’t it very very small?

  73. 73
    gpuccio says:

    daveS:

    Moreover, if the effect of negative selection is strong, as happens for functional protein coding genes, then any random walk towards distant unrelated targets will be extremely unlikely.

    Negative selection tends to preserve what already exists, as it exists.

    That’s why even darwinists tend to believe now that many genes arise from non coding, non functional DNA. And in non coding, non functional sequences the random walk is completely free to go anywhere.

  74. 74
    gpuccio says:

    daveS:

    I wrote #73 before reading your #72. I think it is a partial answer to your question.

    The fact is that you have a functional genome of 4.6 Mbp (for example E. coli), and if it remains as it is there is no problem: non functional mutations will be erased by negative selection, neutral mutations will be tolerated, but will not change anything, and maybe occasionally the extremely rare mutations that can optimize an existing function under severe environmental pressure (like in antibiotic resistance) will be selected.

    That’s all.

    But the problem is, how do you get something really new from that?

    How do you get an eukaryotic cell?

    How do you get a new bacterial species, with functions that were not present in the original species?

    How do you get new protein networks?

    Or, just to be simple, how do you get a new protein with a new function?

    A protein that is completely distant, at sequence level, from what already exists?

    A protein whose function was not present before?

    That’s what biological evolution is about: getting the new.

    Those extreme random walks, that have to reach new long and complex and newly functional results, are practically impossible from highly functional genes, which are limited in their possibility of change by negative selection.

    But they are completely possible starting from non functional sequences. Even a duplicated and inactivated gene will do.

    So, for real evolution to take place, the search space must be completely explorable. And therefore, the probabilities that I give in my table are the real probabilities of reaching a subset of it.

    So, the conclusion is: either the search space cannot be explored in all its parts, and therefore no real evolution is possible and organisms should always remain more or less the same, or the search space can be explored in all its parts, and real evolution is possible (and it is, because it happened!), but it is empirically completely impossible by the RV+NS algorithm.

    Of course, design is all another story. 🙂

  75. 75
    daveS says:

    gpuccio,

    [posted before reading your #74]

    What would a “population density” map of this 4^4600000 state space look like?

    Would it look like the one of Egypt, where all viable states (and their close neighbors, reachable realistically from viable states) cover only a tiny portion of the map?

    Or would it look like this one of Italy, where the states are more evenly distributed, and most states are at or near viable ones?

  76. 76
    Origenes says:

    DaveS: … many of these nonfunctional states are so remote from any viable state that it is practically impossible for them to occur.

    I agree. One has to wonder: how searchable is the search space? Are islands of function surrounded by poisonous seas?
    Of course the junk-dna hypothesis plays a role here.

  77. 77
    daveS says:

    PS: I don’t know enough about this to address your comments in #74—sorry about that. My only point is that (perhaps) calling this 4^4600000 element set the “search space” is a bit like saying that the search space for flight MH 370 includes the Indian Ocean, as well as the surface of the Moon. Logically, the plane could be on the Moon, but it’s practically impossible.

  78. 78
    gpuccio says:

    daveS:

    In my examples about real proteins like ATP synthase, or even the Szostak sequences, I have restricted the search space to the corresponding length of sequence. The problem remains the same: ATP synthase beta chain, with its 660 bits of functional information, is beyond the probabilistic resources of the mutations in a 553 AAs long sequence, well beyond.

    Egypt and Italy have nothing to do with that.

    That sequence could never be found, either from a functional but completely different sequence, or from a non functional stretch of nucleotides.

  79. 79
    gpuccio says:

    daveS:

    The 2000 protein superfamilies are completely unrelated at sequence level. They are, without any doubt, isolated islands in the sequence space. They are as different one from the other as any other groups of sequences can be.

    So, how could evolutionary random walks traverse all the functional space to reach 2000 completely isolated islands of sequence and functionality?

    The sequence space is the sequence space. It is not an unknown, mystical reality.

    Random variation can change sequences in many ways. Single mutations must follow some gradual walks, but other types of variation, like frameshift mutations, can jump to any part of the sequence space.

    Ohno believed that nylonase arose by such a jump. He was wrong, of course, but many darwinists still believe that absurd hypothesis.

    The number of states that can be reached must be computed for a definite sequence of definite length, because mutations are about 10^-9 per nucleotide in most organisms.

    If the genome of an organism is 4.6 Mbp, we must compute the probabilistic resources for the whole genome, because any part of it is subject to variation, of all kinds.

    If we are interested only in a specific part of it, we have to adjust the mutation rate for that sequence length.

    That’s what I have done when I have computed the numbers in my comments above.

  80. 80
    Mung says:

    “Of course, there is “potential for function” in it, as Corey Delvine has said.”

    Who?

  81. 81
    Dionisio says:

    Now the whole RV+NS enchilada is in the hit parade!

    Popular Posts (Last 30 Days)

    What are the limits of Natural Selection? An interesting… (2,733)

    Violence is Inherent in Atheist Politics (2,035)

    Of course: Mathematics perpetuates white privilege (1,179)

    Is social media killing Wikipedia? (1,038)

    What are the limits of Random Variation? A simple evaluation (927)

  82. 82
  83. 83
    gpuccio says:

    Mung:

    Corey, one of my best sources of inspiration! 🙂

  84. 84
    gpuccio says:

    Dionisio:

    Yes, two hits at the same time!

    I should definitely go on with this “What are the limits…” series. 🙂

  85. 85
    gpuccio says:

    daveS:

    Always about the functional space.

    Let’s say that the beta chain of ATP synthase is a functional island, small or big as it may be.

    Now, I cannot see any other sequences which are “near” it in the sequence space.

    Blasting it against all human proteins, it has high homology (1061 bits) only with itself.

    A low homology is present with the H+-ATPase B subunit (maximum 114 bits). And one even lower (94.7 bit) is present with the alpha subunit of ATP synthase.

    The alpha subunit of ATP synthase, again blasted against all human proteins, has the same behaviour: high homology with itself only (1117 bits), and the usual low homologies with H+-ATPase B subunit and with the beta subunit of ATP synthase.

    We have the same situation if we blast those two chains of E. coli against all other proteins of E. coli.

    And yet, both the beta and the alpha chain exhibit an amazing homology between the human and the E. coli form (663 bits for the beta chain, 561 bits for the alpha chain).

    IOWs, both the alpha and beta chain are really similar only to themselves, at least for the bulk of their information content, and remain similar to themselves for billions of years.

    These are islands. Isolated islands. Even very similar molecules, like H+-ATPase B subunit, which implements a very similar function, share only a fraction of the sequence information content of the ATP synthase two chains.

    Where is the connected functional space that can generate these amazing functional sequences through a random walk starting from other islands of the search space? It is nowhere to be seen.

  86. 86
    daveS says:

    gpuccio,

    Isn’t that completely consistent with the proposition I asked about?

    My question is essentially whether a large portion of sequence space is “inaccessible” to search via life forms—perhaps consisting of large, sterile voids, if you will.

    I’m not asserting there is much if any connectivity between these “islands” of function. Perhaps the map of Egypt was misleading in that way; Australia might be a better illustration.

  87. 87
    gpuccio says:

    daveS:

    The space between islands is not inaccessible to random walk, if the random walk is free from the restraints of negative selection, IOWs, if the variation acts on non functional sequences.

    That’s why neo-darwinists love the scenario of gene duplication and inactivation, or of variation acting on non functional, non coding sequences.

    Of course, that kind of random walk is completely neutral, and therefore cannot get any help from NS.

    That’s why the immensity of the search space is the critical element: it demonstrates that the random walk has no hope of finding the functional islands.

    The functional islands are islands of great complexity. They are not the two mutations that confer chloroquine resistance, or any other “beneficial loss of function” seen in the few microevolutionary scenarios. They are true, new, original, complex functions. We have thousands of them in the universal proteome.

    A new complex function requires often hundreds of specific AA sites to work, exactly as a complex machine requires the perfect arrangement of multiple parts, in a very specific configuration.

    If you want to get a proper ATP binding site plus an ATPase activity which generates energy from the hydrolysis of ATP and uses it to implement some complex and useful function, like for example in dynein, you cannot have that function by arranging five or ten or even thirty AAs: you need much more, even to just start with the function in some not yet optimized form.

    So yes, the functional islands which have such specific and complex configurations are really rare and isolated: they are not like the weak ATP binding of Szostak, which is so “easy” to find in random repertoires. They are not 40 bits complex: they are hundreds of bits complex. The difference is exponential, and it sets those functional islands beyond the powers of RV.

    So, as we all know that those islands have been found in natural history, because those complex functional proteins exist in thousands, and that they cannot have been found by a random walk or by NS, or by the two elements together, the only answer is that they were either directly designed, or that they were found by a designed search, which certainly has the power to overcome the probabilistic barriers, because it is guided by intelligence and purpose.

  88. 88
    daveS says:

    gpuccio,

    The space between islands is not inaccessible to random walk, if the random walk is free from the restraints of negative selection, IOWs, if the variation acts on non functional sequences.

    That’s why neo-darwinists love the scenario of gene duplication and inactivation, or of variation acting on non functional, non coding sequences.

    Of course, that kind of random walk is completely neutral, and therefore cannot get any help from NS.

    Thanks, I think this answers my question.

    Am I correct to say then that if you choose an arbitrary sequence S from this 4^4600000-element set of sequences, there’s a fair chance that it would be within a realistic number of mutations of the genome of some viable organism? (by “realistic”, I mean these mutations could all reasonably be expected to occur in 1 generation).

  89. 89
    gpuccio says:

    daveS:

    “Am I correct to say then that if you choose an arbitrary sequence S from this 4^4600000-element set of sequences, there’s a fair chance that it would be within a realistic number of mutations of the genome of some viable organism? (by “realistic”, I mean these mutations could all reasonably be expected to occur in 1 generation).”

    No. If I understand correctly what you mean, you are not correct at all.

    Let’s say that you “choose an arbitrary sequence S” from the “4^4600000-element set” that represents the search space for the E. coli genome. Is that what you mean?

    Now, let’s say that we choose an arbitrary S that has a well defined specification. To make that specification new, we can do the following:

    We choose some form of coding from nucleotides to the English alphabet, and then we code Shakespeare’s dramas by it. If my computations are correct, about ten of them should be in the range of our sequence of 4.6 million nucleotides. That’s the same as saying that a sequence in base 4 as long as the E. coli genome can bear the information that is in 10 Shakespeare’s dramas.

    OK? That would certainly be “an arbitrary sequence S” from our search space. Arbitrary and specified.
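    A rough check of the “about ten dramas” estimate can be sketched as follows. The 32-symbol alphabet (5 bits per character) and the 150,000 characters per play are assumptions made only for illustration; they are not figures from this discussion.

```python
import math

GENOME_NT = 4_600_000     # E. coli genome length in nucleotides
BITS_PER_NT = 2           # 4 possible bases -> log2(4) = 2 bits

ALPHABET_SYMBOLS = 32     # ASSUMPTION: letters plus a few punctuation marks
CHARS_PER_PLAY = 150_000  # ASSUMPTION: rough size of one play in characters

genome_bits = GENOME_NT * BITS_PER_NT          # 9.2 million bits
bits_per_char = math.log2(ALPHABET_SYMBOLS)    # 5 bits per character
chars_capacity = genome_bits / bits_per_char   # characters that fit in the genome
plays = chars_capacity / CHARS_PER_PLAY

print(f"capacity: {chars_capacity:,.0f} characters, roughly {plays:.0f} plays")
```

    With these assumed values the capacity comes out at somewhat more than ten plays, the same order of magnitude as the estimate above.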

    Now, if I understand well your statement, you are saying that there should be “fair chance” that it will be reached from “the genome of some viable organism”? In 1 generation? Is that your idea?

    Of course you are not correct.

    That arbitrary sequence S will never be reached from the genome of some viable organism, not even in a very generous multiverse.

    You could not be less correct.

    Of course, the same would be true for any individual random sequence S, chosen randomly from the search space (IOWs, pre-specified). The search space is simply too big, exceptionally too big, for any specified and unrelated sequence to be reached by a random walk with “fair chance”. Not in 1 generation. Not in all the generations available on our planet, or in our universe. Not in all the generations available in some reasonably finite multiverse.

    If you meant something different, please explain better.

  90. 90
    J-Mac says:

    “Mutations are a reality and while most of them are of no consequence or detrimental, one cannot deny that on occasion a beneficial mutation might occur (in relation to a certain environment, but usually not for a gene’s function per se).

    However, to invoke strings of beneficial mutations that suffice to reshape one animal into the shape of another is not merely unreasonable, it is not science.”

    Christian Schwabe

  91. 91
    gpuccio says:

    J-Mac:

    “However, to invoke strings of beneficial mutations that suffice to reshape one animal into the shape of another is not merely unreasonable, it is not science.”

    Absolutely!

    The secret is in the length and specificity of the string. IOWs, its functional information content.

    I want to restate here what I said in the OP.

    All the components of the probabilistic resources have a linear relationship with the total number of states.

    However, the complexity of a sequence, in terms of necessary AA sites, has an exponential relationship with the functional information in bits.

    That’s why, as the functional sequence gets longer and/or more specific, the divergence between functional information and probabilistic resources increases dramatically!

    Indeed, only sequences made of really few AAs, or with extremely low functional specificity, can be considered in the range of the total probabilistic resources on our planet.
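    The linear versus exponential contrast can be illustrated with a small sketch. The number of generations (3.5e13) is an assumption, chosen so that the product matches the 2.10E+38 states quoted in this thread for a 1200-nucleotide sequence; the function names are only illustrative.

```python
import math


def resource_bits(population: float, generations: float, mutation_rate: float) -> float:
    """Probabilistic resources in bits: log2 of the total number of states tested."""
    return math.log2(population * generations * mutation_rate)


def max_site_bits(n_sites: int) -> float:
    """Upper bound on functional information for n fully specified AA sites."""
    return n_sites * math.log2(20)


# ASSUMPTION: 3.5e13 generations; 1.2e-6 = 1e-9 per nucleotide x 1200 nucleotides
base = resource_bits(5e30, 3.5e13, 1.2e-6)
doubled = resource_bits(2 * 5e30, 3.5e13, 1.2e-6)
print(f"resources: {base:.1f} bits; doubling the population: {doubled:.1f} bits")

for sites in (10, 30, 100):
    print(f"{sites} specified AA sites: up to {max_site_bits(sites):.0f} bits")
```

    Doubling any component of the resources adds one bit, while each additional fully specified AA site adds up to about 4.3 bits.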

  92. 92
    Origenes says:

    GPuccio: Indeed, only sequences made of really few AAs, or with extremely low functional specificity, can be considered in the range of the total probabilistic resources on our planet.

    Axe confirms your assessment:

    We can imagine a different world where, for example, the planetary surface has rich deposits of abiotic amino acids, and cells indiscriminately incorporate these amino acids into long polypeptide chains, and these chains somehow benefit the cells without performing complex functions. In that world the problem we address here would not exist. But in our world things are strikingly different. Here we see a planet with amino acids of strictly biological origin and we see cells going to extraordinary lengths to manufacture, use, recycle, and scavenge all twenty of them. We see elaborate error-checking mechanisms that minimize the chances of confusing any one amino acid for any other during protein synthesis, and (as already noted) we see that the products of this tightly controlled process are long proteins. Lastly, we see that these long proteins perform an impressive variety of functions with equally impressive specificity and efficiency.

  93. 93
    gpuccio says:

    Origenes:

    Yes. I do believe that Axe is very correct in his arguments and ideas.

    We definitely need more people who can approach biology from an ID point of view. 🙂

  94. 94
    daveS says:

    gpuccio,

    No. If I understand correctly what you mean, you are not correct at all.

    That’s what I suspected; however I still think we are talking about different things. I’m not asking above whether S will be reached in one step from a viable organism, but whether it could be reached in principle from some hypothetical viable organism.

    ***

    Let me step back and ask a more basic question that might be at the root of my misunderstanding.

    Does it make sense to speak of whether an arbitrary element S of the sequence space corresponds to a viable organism?

    For example, take one of the ten sequences derived from Shakespeare; does the information in that genome (in principle) completely determine whether a real, living organism could have a genome with that sequence?

    ***

    The reason I’m asking is because it seems to me that it’s practically impossible for an organism with the Shakespeare genome to actually arise. That’s because this genome is rubbish from a biological point of view, and it would require an enormous number of mutations/changes to an actual viable organism to reach the Shakespeare genome, which is extremely unlikely assuming reasonable mutation rates.

  95. 95
    Origenes says:

    DaveS: I’m not asking above whether S will be reached in one step from a viable organism, but whether it could be reached in principle from some hypothetical viable organism.

    Are you saying that the search by organisms is confined to viable paths? If so, while this is obviously true, there is the complication of junk-DNA.
    If there is a large enough portion of junk-DNA available to explore random sequences without any repercussion for the viability of organisms, then, in principle, any sequence can be explored.

    Note that we graciously assume that, at the moment a functional sequence is formed in junk-dna, somehow, this particular sequence (and not any other) is activated and translated into proteins. Why or how this is done, no one knows.

  96. 96
    daveS says:

    Origenes,

    Are you saying that the search by organisms is confined to viable paths? If so, while this is obviously true, there is the complication of junk-DNA.

    Yes, viable paths, or at least within one step in a random walk from viable. Non-viable sequences “close” to viable could also be searched, although the resulting organism would not survive to reproduce.

    And yes, junk DNA is an issue here, although presumably there has to be some coding DNA present in order for the genome to be “viable”

  97. 97
    Origenes says:

    DaveS: And yes, junk DNA is an issue here, although presumably there has to be some coding DNA present in order for the genome to be “viable”.

    Absolutely right, Venter’s minimal genome (473 genes) comes to mind. The search of an organism must respect certain boundaries; which certainly doesn’t help the evolutionary search. However, given junk-dna, I have no idea if this constraint can be translated into a number. Would be very nice of course.

  98. 98
    daveS says:

    gpuccio,

    PS to my post #94: My question really boils down to whether there are well-defined “islands of viability” in the sequence space, and if so, what is the approximate total size of this set of islands.

  99. 99
    gpuccio says:

    daveS:

    Let’s try to understand each other.

    You had asked:

    “Am I correct to say then that if you choose an arbitrary sequence S from this 4^4600000-element set of sequences, there’s a fair chance that it would be within a realistic number of mutations of the genome of some viable organism? (by “realistic”, I mean these mutations could all reasonably be expected to occur in 1 generation).”

    Emphasis mine.

    So, your question was about “a fair chance”, and I have answered it.

    But again, if I understand well, your point seems to be that a whole genome cannot undergo states that are incompatible with life, and therefore many states that are completely distant from the viable state cannot be reached by a whole genome.

    That’s obviously right.

    The Shakespeare organism would not be viable, neither in its final state, nor in most nearby states at sequence level.

    But that has no relevance with the discussion I am making here.

    I have been rather clear:

    1) In my table, I have computed the probabilistic resources using average mutation rates per genome per replication, as I found them in available sources.

    2) I could have used the more consistent value of 10^-8 or 10^-9 mutations per nucleotide site, but that would require adjustment per genome size if we want to have an idea of how many states a class of organisms can reach.

    3) The upper threshold of the number of states that organisms can reach depends, as said, on the population size, the reproduction rate and the mutation rate.

    4) Of these three variables, the population size and the reproduction rate can be considered rather constant for each class of organisms (at least in such a general estimate).

    5) The mutation rate, instead, is rather constant per nucleotide site, and therefore depends linearly on the length of the sequence we consider.

    6) However, as the relationship is linear, and the range of genome length is not huge, the differences in the total number of states, in different classes of organisms, are not influenced as much by the mutation rate per genome, but rather by the population size.

    For example, the difference in population size in my table between the two extremes, bacteria and hominidae, is 21 orders of magnitude, while the difference in genome size between the same two classes is about 3 orders of magnitude. That’s why the differences in total number of states, which are of 24 orders of magnitude, can be related essentially to the difference in population size.

    7) Of course, as my numbers are expressed for the whole genome, when you asked what fraction of the search space can be reached with those probabilistic resources, I computed that value for the average bacterial genome, and the total number of states that can be reached by the bacterial system. That is absolutely correct, and it is in no way modified by the fact that, of course, many of the states in the search space are incompatible with life, if the search is done on a whole genome.

    8) Of course, no realistic search can be done on a whole genome, because a great (or maybe small in some cases, for those who believe in junk DNA) part of a whole genome cannot change. For example, protein coding genes are strongly constrained by negative selection, and in many cases, like ATP synthase alpha and beta chain, histones, ubiquitin, dynein, those constraints are huge, and the sequence is very much fixed for great part.

    9)As I have said, the only realistic conclusion is that functional sequences can change very little, and mostly in a neutral way, which does not change the functional state of the organism. Or maybe sometimes they may undergo minor changes in the direction of some optimization of their existing function.

    10) Therefore, practically all the search for really new functional information must take place in non functional sequences. That allows space for free random walks, which can reach any part of the search space, but in accord to the probabilistic rules.

    11) Of course, as we have said, a whole genome cannot be non functional: therefore a whole genome will never take part in that kind of random walk. But some definite subset of it can certainly do that.

    12) So, let’s go again to our E. coli example. Let’s not compute any more the relationship between the search space that can be reached and the total search space for the whole genome. Let’s compute it for a subset of it.

    13) Let’s assume, just for the sake of discussion, that 10% of the E. coli genome could be non functional, and take part in a neutral random walk. It’s not really important how much it is, I just want to show how the computation must be made.

    14) Now, we have 460 Kbp, instead of 4.6 Mbp. OK? The mutation rate is now 4.60E-04 (1E-9 * 460000). The number of states that can be reached is now 8.06E+40, one order of magnitude lower, as expected (we have reduced the sequence length by one order of magnitude). And now all these states can be reached, because we are discussing a purely neutral random walk in non functional DNA.

    15) Of course, the search space is much smaller than for the whole genome: it is now “only” 4^460000, about 1E276947!

    16) So, the ratio of the search space that can be reached to the whole search space is now 8.06E+40 : 1E276947. IOWs, only a fraction of:

    1 : 10^276907

    can be reached by the whole bacterial system in 4 billion years.

    Is that tiny enough? 🙂

    17) Now, let’s do that again for a shorter sequence, and a more defined scenario. Let’s say that one gene for a 400 AAs protein (a perfectly medium size protein) is duplicated and inactivated in E. coli. This is a very frequently invoked scenario in the neo-darwinian field. Let’s say, for the sake of discussion, that the duplicated and inactivated gene has been miraculously fixed by genetic drift, and is now, at the beginning of our 4 billion years window, present in the whole population of 5E30 bacteria, ready to start its random walk.

    18) Now the sequence that can mutate is only 1200 nucleotides. So, let’s do the computations.

    Total states that can be reached by the whole bacterial system: 2.10E+38 (127 bits)

    Search space: 4^1200 = 10^722

    Fraction of the search space for that sequence that can be reached by the whole bacterial system in 4 billion years for one gene for some new protein of about 400 AAs is:

    about 1 : 10^684.

    Is that tiny enough?

    18) Finally, let’s remember that the ration between the total number of states that cab be search is a measure of the probabilistic resources, not a measure of functional information.

    But it can immediately be related to measures of functional information.

    For example, we have seen that, in a maximal bacterial system, the highest munber of total states that can be reache by a neutral sequence of 1200 nucleotides is:

    127 bits

    That means, very simply, that any sequence which is definitely more complex than that will not be realistically reached by that stretch of nucleotides (our duplicated and inactivated gene).

    If you want five sigma certainty, you can just add 22 bits, for a total of:

    149 bits.

    So, no protein with more than 149 bits of functional information is in the range of a duplication inactivation process in the super maximal bacterial system we have considered, in 4 billion years.
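One way to see where the extra 22 bits can come from is sketched below. It assumes that "five sigma" means the one-sided tail of a normal distribution, which is an assumption on my part and not something stated in the comment.

```python
import math

# One-sided tail probability of a standard normal beyond 5 sigma
# (assumed convention; a two-sided tail would be twice as large).
p_five_sigma = 0.5 * math.erfc(5 / math.sqrt(2))

# Extra bits needed so that the chance of success falls below that tail.
extra_bits = math.ceil(-math.log2(p_five_sigma))

resource_bits = 127                      # from the comment: log2(2.10E+38), rounded
total_bits = resource_bits + extra_bits

print(f"five sigma tail: {p_five_sigma:.3e}")
print(f"extra bits: {extra_bits}, threshold: {total_bits} bits")
```

With a two-sided convention the margin would come out one bit smaller, so the exact threshold depends on that choice.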

    ATP synthase beta chain has at least 663 bits of functional information.

    QED.

  100. 100
    gpuccio says:

    Origenes:

    “Note that we graciously assume that, at the moment a functional sequence is formed in junk-dna, somehow, this particular sequence (and not any other) is activated and translated into proteins. Why or how this is done, no one knows.”

    Yes, we graciously assume a lot of things. We are definitely very kind to our darwinist interlocutors! 🙂

  101. 101
    daveS says:

    gpuccio,

    Thanks, this answers exactly the questions I meant to ask.

  102. 102
    kurx78 says:

    This post is amazing, I’ve learned a lot.
    But now I remember, where are the politely dissenting interlocutors?
    Like the person who “demonstrated” that literally everything is possible with random mutation if you give it enough time and cast a spell on it.

  103. 103
    forexhr says:

    If a new market niche opens up for a stone sculpture of Michael Jackson, is it possible for this niche to be closed up by an erosion process? The answer is – no, because given the poly-3D enumeration mathematics, only a thousand particles can be arranged into approximately 10^3,271 different states (2^(n-7)n^(n-9) (n-4)(8n^8-128n^7+828n^6-2930n^5+7404n^4-17523n^3+41527n^2-114302n+204960)/6).

    So, the ratio between ‘non-Michael Jackson stone shapes’ and ‘Michael Jackson stone shapes’ is so large that even if every proton in the observable universe were a stone under erosion process, eroding extremely fast from the Big Bang until the end of the universe, when protons might no longer exist, we would still need trillions and trillions orders of magnitude more time to have even a 1 in trillions and trillions chance of success.

    This question in evolutionary language is as follows. If a new environmental niche opens up for a male reproductive system, is it possible for this niche to be closed up by mutating DNA sequences? The answer is – no, because if we suppose that some simple male reproductive system is represented by only 10,000 nucleotides, which can be arranged into approximately 10^6,020 different states, the ratio between DNA sequences for ‘non-male reproductive system’ and ‘male reproductive system’ is so large that even if every proton in the observable universe were a DNA sequence mutating extremely fast from the Big Bang until the end of the universe, information for a male reproductive system would not be found.

    The probability of evolution is therefore zero in any operational sense of an event, and the belief that the process of RV+NS must eventually succeed in producing complex molecular machines, organs and organ systems is delusional.

  104. 104
    gpuccio says:

    daveS:

    “Thanks, this answers exactly the questions I meant to ask.”

    I am happy of that! 🙂

  105. 105
    gpuccio says:

    kurx78:

    “This post is amazing, I’ve learned a lot.”

    Thank you, I am happy you like it! 🙂

    “But now I remember, where are the politely dissenting interlocutors?”

    Yes, where are they?

  106. 106
    gpuccio says:

    forexhr:

    Exactly!

    An interesting aspect is also that, as seen in the discussion with daveS, NS seems cut out from the game when the random walk takes place in some non functional part of the genome.

    Indeed, that seems the only reasonable scenario, because, as daveS has pointed out, a random walk starting from a functional sequence, towards some unrelated sequence with different function, seems really impossible because of the severe restraints posed by negative selection.

    Of course the short optimizing walks for the same function, or slight variations of it, are perfectly possible and can be driven by NS, as we have seen with the scenario of chloroquine resistance optimization.

    But for the really new information, only a neutral walk seems feasible.

    But a neutral walk implies all the probabilistic barriers that we have discussed, with no help from NS.

    And NS could come to the rescue only when the new function is already there, even if not yet optimized. But we have seen that the new function cannot be there by RV alone.

    And of course, at some point in the neutral random walk, we have to go back to the functional scenario, so that NS can start the fixation and optimization task!

    IOWs as Origenes says:

    “Note that we graciously assume that, at the moment a functional sequence is formed in junk-dna, somehow, this particular sequence (and not any other) is activated and translated into proteins. Why or how this is done, no one knows.”

  107. 107
    forexhr says:

    gpuccio:

    Regarding the search for new functional sequences, the average evolutionist would respond in the following way: “new function comes out of pre-existing functions. Nothing starts out from scratch. Your probability calculations reflect deliberate assumptions of everything starting out de novo, in order to ensure that the resulting probabilities come out as low as possible.”

    But of course, such an alibi response is completely flawed. Here is why. If these DNA sequences: ATT, CGC and ACA are something that is functional in the environment A, while DNA sequences: TAC, AAA and CCC are required for adaptation to environment B, then the first sequences are just as much junk as all ‘non-TAC, AAA and CCC’ sequences. In other words, in the environment which requires visual function, DNA that codes for a fully functional heart (pre-existing function) is just as much junk as any random sequence of nucleotides. Hence, every new adaptation starts out from scratch.

  108. 108
    gpuccio says:

    forexhr:

    That’s absolutely right! 🙂

    New proteins or protein domain superfamilies indeed “start from scratch”, because superfamilies are completely unrelated at the level of sequence and structure.

    And it is not necessary that the whole protein be completely new: it is enough to show that there is a transition of high functional complexity, that new complex and specific parts of the molecule are acquired from scratch.

    For example, in my OP here:

    https://uncommondescent.com/intelligent-design/the-amazing-level-of-engineering-in-the-transition-to-the-vertebrate-proteome-a-global-analysis/

    I have shown that more than 1.7 million bits of functional information are acquired by the 20000 proteins that we find in the human genome in the transition from pre-vertebrates to vertebrates, which happens in about 30 million years. And the population is certainly not the 5.00E+30 bacteria of my table. It is rather some bunch of lancelets, with probabilistic resources at most comparable to those of fish (78 bits), but probably much less.

    So, those 78 bits of variation should explain the acquisition of 1.7 million bits of functional information in 30 million years, information so specific that it will be conserved for 400+ million years, up to humans!

    Moreover, the distribution of the informational jump in the 20000 proteins of the human proteome shows that the 90th percentile of the jump is 486 bits, and the 95th percentile is 733 bits. IOWs, of the 20000 human proteins, 10% (about 2000) have a specific information jump of 486 bits or more, and 5% (about 1000) have a specific information jump of 733 bits or more, in that quick evolutionary transition from lancelet-like organisms to cartilaginous fish, in about 30 million years. All that starting from at most 78 bits of probabilistic resources!

  109. 109
    Dionisio says:

    gpuccio,

    The Neo-Darwinian RV+NS enchilada you have “cooked” and served here lately keeps attracting many readers:

    Popular Posts (Last 30 Days)

    What are the limits of Natural Selection? An interesting… (2,744)
    Violence is Inherent in Atheist Politics (2,036)
    What are the limits of Random Variation? A simple evaluation (1,213)
    Of course: Mathematics perpetuates white privilege (1,184)
    Is social media killing Wikipedia? (1,068)

    Well done!

  110. 110
    J-Mac says:

    gpuccio,

    There is enough evidence that mutations are non-random…when one considers quantum coherence…

    If you add to the pot of evolution that there is evidence for a strong element of randomness in natural selection, then you have a clear picture what Darwinian evolution is facing today…

    All Darwinists can do is deny the evidence and hope for the best… that their faith holds up as long as they are alive…

  111. 111
    Dionisio says:

    Can this thread be somehow associated with or related to the following link?

    https://uncommondescent.com/intelligent-design/evolutionary-predictions-of-protein-structure-is-iffy/

  112. 112
    EugeneS says:

    Mung # 61

    Of course not! RV+NS is a secondary (induced) phenomenon. Whatever its capabilities in reality, it must have started from a population of self-reproducing organisms. How did nature get there?! To answer that question in earnest (without shameless intellectual tricks like the multiverse) one inevitably needs to bring ID to the table, at least in the form of parameter fine tuning (weak ID).

  113. 113
    gpuccio says:

    EugeneS:

    I am definitely for strong ID. Very strong! 🙂

    While there is no doubt that the basic parameters are fine tuned to allow life, I don’t believe that any special setting could, by itself, ever generate the tons of specific functional information that are necessary for life to exist.

    There is no empirical support for the idea that our environment, or even our world, however fine tuned it may be, has the specific understanding of biochemical laws, and the computational resources, to guide the generation of thousands of amazing biochemical machines like the ones that we observe in all living beings.

    Moreover, I insist that the same evidence for design that we find for OOL is also present for life evolution: each new wonderful protein that appears in natural history is evidence of design, not only OOL.

    OOL just requires probably some more powerful design! 🙂

    But we have seen that many other events, like the appearance of eukaryotes, and of metazoa, and, just to remain in a field that I have discussed in detail with empirical evidence of all kinds, the transition to vertebrates, all require the input of tons of new functional information.

    So, what we need is strong, very very strong ID. 🙂

    Otherwise we cannot really explain anything of what we observe, and we are therefore no better than neo-darwinists, just imagining things that are not supported by facts.

  114. 114
    gpuccio says:

    To all:

    This must be almost a record: if I am not wrong, not even one single comment from the other side, in this thread.

    Maybe RV is of no interest to them! 🙂

  115. 115
    forexhr says:

    gpuccio:”OOL just requires probably some more powerful design!”

    I must disagree with the above statement of yours because origin of higher life forms(OHLF) is an even greater problem than origin of life(OOL). Here is why. Since everything in nature is made of large number of particles that constantly change their spatial positions, the idea that life arose from nonlife basically boils down to finding a functional arrangement of particles(FAOP) (which has the ability to reproduce), through the process of particle rearrangements.

    The problem for OOL lies in the ratio between non-FAOP and FAOP, which is so huge that the quantity of particle rearrangements in the entire lifespan of the Universe is insufficient to find only one instance of FAOP.

    OHLF faces the same problem, but this problem is masked by a linguistic construct called natural selection (NS). Basically, NS just pre-specifies FAOP through fitness or the ability of an organism to survive in a certain environment. E.g., given an aquatic environment, if AOP within an organism gives it the ability to breathe under water, such AOP is by definition FAOP. Given an intron-exon environment, if AOP within an organism gives it the ability to remove introns from pre mRNA, then such AOP is by definition FAOP. Hence, higher life forms are actually numerous FAOPs pre-specified by various environments.

    Meaning, the same as with OOL, origin of a particular biological function boils down to finding FAOP in a nearly infinite pool of non-FAOPs. But, since finding one FAOP – ability to reproduce (OOL) is easier than finding many different FAOPs – ability to reproduce sexually, remove introns, think, see, talk, jump,… OHLF is an even greater problem than OOL.

    Thus, the truth is that OHLF requires more powerful design than OOL.

  116. 116
    Origenes says:

    Larry Moran, wd400, CR, MatSpirit, Seversky, Goodusername, rvb8, Gordon Davisson and many others, Where Art Thou?

  117. 117
    gpuccio says:

    forexhr:

    I can agree with you, but the point is that when I speak of OOL I am really thinking of Origin of LUCA, or if you want Origin of Prokaryotes.

    Indeed, I do not believe that simpler forms of life ever existed. My idea is that life started probably with something very similar to a prokaryote (IOWs LUCA).

    I don’t believe that the simple “ability to reproduce” has anything to do with life. Chemical systems where simple molecules can self-reproduce are not IMO living systems, nor can they evolve into living systems.

    A living organism requires much more:

    a) Separation and differentiation of an inner environment vs an outer environment.

    b) Metabolism to derive energy which allows the generation of very low entropy structures.

    c) Ability to generate and maintain far from equilibrium systems.

    And probably many other things.

    So, in this sense, maybe OOL required even more design than the following steps. Or at least a comparable amount.

    However, I suppose that we agree that nothing of all that could have happened without a lot of design! 🙂

  118. 118
    Origenes says:

    //About NS — off-topic//

    GPuccio,

    I would very much appreciate your opinion on the following line of reasoning:

    Natural selection slows evolution down.

    Natural selection (NS) culls variety — eliminates organisms with certain traits from the population. In effect, NS enhances probability for the remaining variation. IOWs NS intensifies a particular search and thereby enhances the probability of its success.

    Assuming that functional islands are not necessarily connected, and assuming constant population size, the overall chance of finding evolutionary novelty remains the same, with or without NS: the intensified search compensates for the loss of variety. IOWs the presence of NS does not affect the chance of success for the overall evolutionary search.

    However, the waiting time for the selected variation to restore the original population size is slowing down evolution.

    If some disease kills off all of human race, except for the Japanese, then the probability of finding evolutionary novelties is restored at the moment that the Japanese have a population size of 7 billion. My point? The waiting time needs to be factored in.

  119. 119
    J-Mac says:

    @118

    “Out of 120,000 fertilized eggs of the green frog only two individuals survive. Are we to conclude that these two frogs out of 120,000 were selected by nature because they were the fittest ones; or rather – as Cuenot said – that natural selection is nothing but blind mortality which selects nothing at all?”

    Natural selection builds things only in Darwinists’ wishful mind and not in real life…

  120. 120
    J-Mac says:

    @116

    Moran lives in the wishful world of his impotent god, random genetic drift, which, just like natural selection, can build nothing, create nothing, and only eliminate whatever beneficial mutations happen to arise.

    “Even a new mutation that is slightly favorable will usually be lost in the first few generations after it appears in the population, a victim of genetic drift. If a new mutation has a selective advantage of S in the heterozygote in which it appears, then the chance is only 2S that the mutation will ever succeed in taking over the population. So a mutation that is 1 percent better in fitness than the standard allele in the population will be lost 98 percent of the time by genetic drift.”

    Griffith and colleagues(1999, p. 564):

  121. 121
    forexhr says:

    gpuccio:

    Whether we define life as the ability to reproduce or the ability to generate and maintain far from equilibrium systems, this is merely a human convention, an instance of human language, and it is completely unrelated to the problem of explaining how the organisms that we observe around us came to be. Viewing organisms through the lens of linguistic constructs is a necessity for evolutionists because, given their assumption – ‘once life appears and begins to reproduce, the emergence of higher life forms becomes possible’, it’s convenient for them to say that chemical systems where simple molecules can self-reproduce are living systems. Since we know that such systems can emerge by natural means, it follows that the emergence of higher life forms by natural means is a trivial occurrence. The problem is of course in their assumption. Ability to reproduce is not some magical thing, but merely a mechanism for changing spatial positions of particles. E.g., when bacteria reproduce to provide the raw material for evolution – mutations, all that is happening is rearrangement of the particles that comprise the bacteria. That’s all. But exactly the same processes occur in non-living matter – particles change their spatial positions. That is why the above assumption is like saying – due to particle motion the emergence of higher life forms becomes possible, which is of course a nonsensical statement. So, the problem is not definition or linguistic distinction of living and non-living matter, but the ratio between non-FAOP and FAOP.

  122. 122
    gpuccio says:

    Origenes:

    Of course the waiting time is a critical factor. And it depends strictly on the population size.

    I would like to clarify a few concepts here.

    1) In this OP, I have considered only the probabilistic resources inherent in the biological scenario on our planet: IOWs the total number of different states that can be really reached by some random walk in the whole natural history, by different types of biological populations.

    2) The numbers in my table are only an evaluation of a higher threshold, computed with extremely generous, and essentially unrealistic, assumptions in favor of neo-darwinism. For example, population sizes are by far too big, and the reproduction rates, for bacteria for example, are those of an expanding population, and not those of a steady state. And so on.

    3) The meaning, therefore, is the following: even with the best and unrealistic assumptions for neo-darwinism, there is no chance that more than such a number of states could be reached in natural history.

    4) As the total number of states is ridiculously small, in comparison with the probabilistic resources needed by even simple functional proteins, the idea is that neo-darwinism is completely out of game.

    5) In this reasoning, I have not considered the effects of NS. Those I have discussed in the previous OP. Therefore, the reasoning in this OP is about the powers of RV to generate the starting function on which NS could eventually act, or to generate any new function from scratch in random walks that take place in non functional sequences, and are therefore by definition neutral.

    6) Moreover, I have not considered in this OP the possible effects of neutral genetic drift because, as I have discussed many times, it does not act on the probabilistic resources. Given a number of states n that can be reached by the system, drift events cannot in any way favour functional states over non functional states. Therefore, nothing changes.

    7) The important point is: if one includes NS or genetic drift in the model, the time to fixation becomes a critical factor. Indeed, time to fixation can be extremely long, especially for drift, while it can be shorter for NS, but significantly so only if the selection coefficient is very high (like in antibiotic resistance).

    8) The time to fixation for neutral drift is, according to population genetics, 4Ne generations, where Ne is the effective population size. Therefore, if the population size is very big, the time to fixation is very big too.

    Just as an extreme example, if my number of 5E30 given in my table for bacteria were a real effective population size (which, of course, it is not), the time to fixation for a neutral trait would be 2E+31 generations which, considering one generation per hour, as I have done in my table, would give us a time to fixation of 2.28E+27 years, which is some big time from all points of view.

    Of course, effective population sizes are much smaller. Let’s say that we have an effective population size for bacteria of 10^9. That would give us a time to fixation, for one neutral trait, of only 456621 years, about 0.5 million years. OK, but with an effective population size of 1E09, the total number of states that can be reached in 4 billion years in the bacterial scenario declines dramatically, from 5.26E+41 (the number in my table) to 2.10E+19!

    So, as I said, adding genetic drift of neutral traits to the scenario does not improve the situation at all! 🙂
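The two fixation times in point 8 can be reproduced with a few lines of Python. This is only a sketch of the 4Ne rule as stated above, with one generation per hour as in the comment; the function name is hypothetical.

```python
HOURS_PER_YEAR = 24 * 365   # one generation per hour, as assumed in the comment

def neutral_fixation_years(effective_pop_size: float,
                           generations_per_year: float = HOURS_PER_YEAR) -> float:
    """Expected time to fixation of a neutral trait: 4 * Ne generations, in years."""
    generations = 4 * effective_pop_size
    return generations / generations_per_year

for ne in (5e30, 1e9):
    years = neutral_fixation_years(ne)
    print(f"Ne = {ne:.2e}: 4Ne = {4 * ne:.2e} generations, about {years:.3e} years")
```

Changing `generations_per_year` shows how strongly the result depends on the assumed reproduction rate.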

    9) Of course, NS will be more efficient, and its time to fixation will be shorter, according to the selection coefficient. Low values of s will not help much, but higher values, like in antibiotic resistance, can do the trick of greatly facilitating small steps (usually of one aminoacid), as we have seen in the scenario of chloroquine resistance (see the Summers paper).

    However, NS too has the problem of time to fixation, and that can be a real stopper in many situations, as you correctly say.

    10) Finally, all those who really want to consider realistically the role of NS as some help to the disastrous numbers in my OP can certainly try to do that, but again they are cordially invited, as a first step, to take my challenge, repeated at the end of this OP, and answer my two simple questions which are critically important for any role that NS is supposed to have.

    Good luck to everybody! 🙂

  123. 123
    gpuccio says:

    J-Mac:

    “Even a new mutation that is slightly favorable will usually be lost in the first few generations after it appears in the population, a victim of genetic drift. If a new mutation has a selective advantage of S in the heterozygote in which it appears, then the chance is only 2S that the mutation will ever succeed in taking over the population. So a mutation that is 1 percent better in fitness than the standard allele in the population will be lost 98 percent of the time by genetic drift.”

    That’s a very important point, thank you! 🙂

    See also my previous post (#122).
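The 2s rule in the quote can be checked numerically. The sketch below compares it with a simple branching process model in which each carrier leaves a Poisson distributed number of offspring with mean 1 + s. That model, and the function names, are assumptions added here for illustration; they are not taken from the quoted text.

```python
import math

def haldane_fixation_probability(s: float) -> float:
    """Small-s approximation quoted in the comment: P(fixation) ~ 2s."""
    return 2 * s

def branching_survival_probability(s: float, iterations: int = 20_000) -> float:
    """Survival probability p of a lineage with Poisson(1 + s) offspring,
    found by iterating p = 1 - exp(-(1 + s) * p) starting from p = 1."""
    p = 1.0
    for _ in range(iterations):
        p = 1.0 - math.exp(-(1.0 + s) * p)
    return p

s = 0.01
haldane = haldane_fixation_probability(s)
survival = branching_survival_probability(s)

print(f"2s approximation:   {haldane:.4f}  (lost {1 - haldane:.1%} of the time)")
print(f"branching process:  {survival:.4f}  (lost {1 - survival:.1%} of the time)")
```

Under these assumptions both numbers agree to within a fraction of a percentage point, which is the sense in which a mutation with a 1 percent advantage is lost about 98 percent of the time.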

  124. 124
    jerry says:

    Gpuccio,

    I apologize for not reading all that has been written. It looks very interesting. Two things

    First, how does what you have been showing differ if at all from Behe’s Edge of evolution hypothesis? Your approach seems more mathematical than anything Behe has done but seems to have the same conclusions.

    Second, I happened by chance on a TV nature documentary this morning on Madagascar and how much of its flora and fauna differs from the rest of Africa. Are you aware of just how far these species/variations on Madagascar differ from the mainland of Africa?

    That is probably an unfair question since my guess is that few know anything about Madagascar. But geographical variation of species formed an important part of Darwin’s thinking, so is there any analysis of just how different the various species are by geographic separation? And how could they have developed by geographic isolation and be within your limits of what is possible?

    Actually, a third thing: an RV for Dummies approach would make it more accessible to those not mathematically inclined enough to follow all the bits and their implications (namely me). Somebody should do it here at UD so it can be used for those who are educated but not familiar with the technical arguments. I deal with these types of people frequently in other places and use the improbability of generating new coding sequences/proteins as the basis for undermining naturalistic evolution.

    It would be nice to have a layman’s version to use with these people, especially when they bring a biologist to support them. I once tried to explain this to an author of a non-technical article on evolution and the person got Kenneth Miller to support him.

  125. 125
    Mung says:

    It is plainly irrational to believe in Darwinian evolution. What’s an atheist to do?

  126. 126
    gpuccio says:

    jerry:

    Thank you for your questions.

    1) Of course, my point of view is essentially similar to Behe’s. My approach is probably a little different in the details. However, the general idea is the same.

    As I have said, my approach here is top down, and is aimed essentially to give some higher threshold of what RV, even in an extremely optimistic scenario, can really do.

    Bottom up approaches, like Behe’s and Axe’s, try to establish a lower threshold based on empirical observations or actual experiments.

    Those data are very interesting, but again I don’t think that the problem is really to establish if the edge is at 2, 3, 5, or 10 AAs.

    Even 10 AAs are simply ridiculous, when we consider what is necessary to build up functional proteins.

    Even the 37 that I have given for a completely unrealistic bacterial system are completely useless.

    A few examples:

    ATP synthase beta chain: 334 identities between E. coli and humans.

    Dynein: 1137 identities between Saccharomyces cerevisiae and humans.

    SMC3: 1133 identities (93%) between shark and humans.

    And so on, and so on.

    1.7 million specific functional bits generated in 30 million years from pre-vertebrates to vertebrates.

    All that with probabilistic resources that could be enough, at most, for 10 – 20 specific AAs!

    And no way that NS can help explain all that! (see my challenge).

    2) I don’t know much about Madagascar. In general, I never discuss phenotypic differences unless the molecular basis is well known (which is really rare).

    There are many adaptational mechanisms that can explain much. Only if we know the functional information that separates species or organisms can we discuss what can be attributed to design with reasonable certainty.

    The relationship between genotype and phenotype is often mysterious and not well understood.

    3) RV for dummies?

    Let’s try:

    a) Our planet, with all its biological resources, cannot generate enough different genomic states to find anything that has a functional information higher than 19 – 37 specific AAs at most by RV (with a hyper-optimistic estimate).

    b) Almost all proteins have higher functional information, ranging often in the hundreds, or even thousands, of specific AAs per protein, as proved by absolute conservation throughout hundreds of millions of years of natural history.

    c) The role of NS is absolutely negligible in those scenarios.

    Is that simple enough? 🙂
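For readers who want to connect bits and "specific AAs", here is a hedged sketch. It assumes that one fully specified amino acid is worth log2(20) bits and that the upper figure is derived from the 5.26E+41 states quoted earlier in the thread plus the 22 bit five sigma margin. Both are my reading of the numbers, not something spelled out in the comment.

```python
import math

BITS_PER_AA = math.log2(20)   # about 4.32 bits for one fully specified amino acid

def bits_to_specific_aas(bits: float) -> float:
    """Convert functional information in bits to an equivalent count of specific AAs."""
    return bits / BITS_PER_AA

states = 5.26e41                          # bacterial figure quoted in the thread
resource_bits = math.log2(states)         # about 138.6 bits
threshold_bits = resource_bits + 22       # assumed five sigma margin
max_aas = bits_to_specific_aas(threshold_bits)

print(f"{resource_bits:.1f} bits + 22 = {threshold_bits:.1f} bits ~ {max_aas:.1f} specific AAs")
print(f"78 bits ~ {bits_to_specific_aas(78):.1f} specific AAs")
```

Under these assumptions the bacterial system comes out at roughly 37 specific AAs, matching the upper end of the range in point a).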

    By the way, if anyone on the other side doubts these numbers, please invite them here to explain why.

  127. 127
    gpuccio says:

    jerry:

    ” the person got Kenneth Miller to support him”

    What a help, really! 🙂

  128. 128
    Origenes says:

    @122

    GPuccio: Of course the waiting time is a critical factor. And it depends strictly on the population size.

    Thank you for taking the time to answer my question. It seems that we agree. Or do we not?

    GPuccio: Of course, NS will be more efficient, and its time to fixation will be shorter, according to the selection coefficient.

    I agree. When we have a global deadly virus outbreak and only the Japanese survive, then, obviously, fixation time is very short — the frequency of the ‘Japan-allele’ is at 100% in a matter of days.

    GPuccio: However, NS too has the problem of time to fixation, and that can be a real stopper in many situations, as you correctly say.

    But, that’s not what I was saying. My point was not about fixation, but, instead, about the time necessary for restoring the population size to 7 billion. Is there a term for this period? My point is that during that period, evolution is performing worse than without NS — worse than a blind search. Only when the ‘pre-deadly-virus-outbreak’ population size is restored are probabilities for an evolutionary search back to normal — a blind search.
    Just to be clear: fixation does not take into account the original population size before NS intervention.

  129. 129
    J-Mac says:

    Good stuff gpuccio!

    Are you familiar with the paper on the origin of carnivorous plants by W.-E. Lonnig?

    http://onlinelibrary.wiley.com.....2/abstract

    Mind boggling evidence that Darwinists have access to and yet still refuse to accept it…

    I pity them…Maybe I shouldn’t? Dunno…

  130. 130
    gpuccio says:

    Origenes:

    “My point was not about fixation, but, instead, about the time necessary for restoring the population size to 7 billion. Is there a term for this period?”

    I am not sure. My impression is that it is a special form of time to fixation by NS. But I could be wrong.

    It also reminds me, even if it’s not exactly the same, of the idea of the “cost of Natural Selection”, according to Haldane.

    That is more an effect of the competition between different selectable genes, but there too the problem is mainly the loss of reproductive resources because of the intervention of NS.

    Indeed, those are two ways in which NS can be an obstacle to evolution, because of its intrinsic behaviour as a destructive principle: both the antagonistic effect against what is new, effected by negative NS, and the reduction of reproductive resources in the course of fixation and expansion, effected by positive NS (see for example Haldane’s dilemma and your Japanese example).

  131. 131
    gpuccio says:

    J-Mac:

    Thank you! 🙂

    Lonnig’s paper seems very interesting indeed.

  132. 132
    EugeneS says:

    GP

    I totally agree on the strong ID. It is not correct to attribute all biological information to what is basically noise. Natural selection in practice is not really a magic wand but rather a magic hammer 😉 NS can only reduce information (not produce). Consequently, all Darwinism can rely on in terms of information production is RV + drift. I may be wrong but I do not think that drift can do anything substantial statistically. They have nothing else.

    forexhr

    An interesting observation indeed! However, this may be more difficult than it looks. In order to confidently say that the origin of higher life forms is a greater problem than the origin of life, we need to demonstrate that the additional constraints do not make the problem easier (because they can in theory). In combinatorial search, constraints can provide valuable information to narrow down the feasibility space.

    If you have a set of instances of graph coloring problems that you generate by varying the ratio of the number of edges to the number of vertices, you will get a spectrum of problems with different solubility:

    – no edges (too few constraints or none at all) easily soluble;
    – phase transition; the hardest problems to solve in practice;
    – too many edges (constraints) easily insoluble.
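    The spectrum described above can be sketched in Python. This is only an illustration under assumed toy settings: the graph size (12 vertices), the number of colors (3), the ratios tried and the helper names are all hypothetical choices, and plain backtracking stands in for a real solver.

```python
import itertools
import random


def is_colorable(n_vertices, edges, n_colors=3):
    """Backtracking test: can the graph be properly colored with n_colors?"""
    neighbors = [[] for _ in range(n_vertices)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    colors = [None] * n_vertices

    def assign(vertex):
        if vertex == n_vertices:
            return True  # every vertex has a color
        for color in range(n_colors):
            # A color is allowed if no already-colored neighbor uses it.
            if all(colors[nb] != color for nb in neighbors[vertex]):
                colors[vertex] = color
                if assign(vertex + 1):
                    return True
        colors[vertex] = None  # undo and backtrack
        return False

    return assign(0)


def random_graph(n_vertices, n_edges, rng):
    """Pick n_edges distinct edges uniformly at random."""
    all_edges = list(itertools.combinations(range(n_vertices), 2))
    return rng.sample(all_edges, n_edges)


rng = random.Random(0)
n = 12  # assumed toy size
for ratio in (0, 1, 2, 3, 4):  # edges per vertex, illustrative values
    graphs = [random_graph(n, int(ratio * n), rng) for _ in range(20)]
    solved = sum(is_colorable(n, g) for g in graphs)
    print(f"edges/vertices = {ratio}: {solved}/20 instances colorable")
```

    With no edges every instance is trivially soluble, and with many edges almost none is; the instances in between are where a solver has to work hardest.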

  133. 133
    gpuccio says:

    EugeneS:

    “I may be wrong but I do not think that drift can do anything substantial statistically.”

    It can’t. It’s easy to understand that if we reason very simply as follows:

    a) If the total number of states that can be generated in natural history is n, can drift change that number? The answer of course is no. Drift does not act on the number of states that can be generated: that depends only on the population size, reproduction rate, mutation rate and time window.

    b) Drift can certainly modify the distribution of states at various times. That said, the only pertinent question is: can drift in some way favor the target space vs the rest of the search space? IOWs, can it favor functional states?

    And the answer, of course, is no. For neutral drift, all states are equivalent. Functional states are not recognized. Therefore, the probabilities of reaching a functional state remain absolutely the same.

    c) Only NS can recognize functional states, and therefore change the probabilistic distribution. But we have seen its severe limits, and its negative aspects too.

    However, NS can certainly contribute to optimize existing functions a little bit. That it can do, and that we can observe. Nothing else.

    Indeed, for the generation of new functional sequences, unrelated to existing sequences and functions, NS is a game stopper. Only neutral walks have any hope.

    And we know how likely that hope really is! (see the table in the OP) 🙂
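    Point b) can be illustrated with a small simulation. This is a minimal sketch, assuming a neutral Wright-Fisher model with toy numbers that do not come from the OP (20 individuals, 4 of them carrying the variant we choose to call "functional"). The label never enters the sampling step, so drift fixes the variant at a rate close to its starting frequency, whatever we call it.

```python
import random


def fixation_fraction(pop_size, start_count, trials, rng):
    """Fraction of neutral Wright-Fisher runs in which the variant fixes."""
    fixed = 0
    for _ in range(trials):
        count = start_count
        # Resample the whole population each generation until the
        # variant is either lost (0) or fixed (pop_size).
        while 0 < count < pop_size:
            p = count / pop_size
            count = sum(1 for _ in range(pop_size) if rng.random() < p)
        if count == pop_size:
            fixed += 1
    return fixed / trials


rng = random.Random(0)
# Assumed toy values: 4 carriers out of 20, so starting frequency 0.2.
frac = fixation_fraction(pop_size=20, start_count=4, trials=2000, rng=rng)
print(f"fixation fraction: {frac:.3f} (starting frequency 0.200)")
```

    The sketch only shows that neutral sampling is blind to function; it says nothing about the total number of states, which depends on the resources listed in point a).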

  134. 134
    jerry says:

    Gpuccio,

    Is that simple enough?

    Too simple. My observation of discussions of complicated topics is that the discussion nearly always starts out at the kindergarten level and just a few paragraphs later is at discussions appropriate for graduate school.

    Your discussions are rightly at the graduate school level but what is needed is a discussion that is somewhat at the high school level in order to make it clear to the average educated person who is not schooled in the technological terms used here but would understand the logic when presented at that level.

    I am not asking you to do this because your time is obviously needed at these graduate level discussions and what you have produced is extremely valuable. But it would be a great service if someone would do this. 10 years ago I would have tried to do it but I am not aware of all the technical arguments you are using and don’t have the time to think them all through.

    Keep up the great work. But somewhere there is a 100 page or less discussion of this at the “RV for Dummies” level.

  135. 135
    forexhr says:

    EugeneS:

    I was speaking strictly from a numerical perspective. Higher life forms are composed of more particles than lower life forms and as the number of particles increases the ratio between non-functional and functional states for any given specifier increases also. The higher the ratio, the more resources are required to find functional states.

    Of course, from a physicochemical perspective, life cannot originate from non-living matter due to one simple fact: processes of non-living matter are heading towards physicochemical equilibrium or a state of minimum total potential energy and not towards a state where some physicochemical system can use surrounding matter to reproduce and maintain its structure. In that regard, any isolated and momentary instances of self-reproduction would instantly be destroyed by equilibrium flows of matter and energy.

  136. 136
    gpuccio says:

    To all:

    Well, the record gets better: after 1600+ views and 135 comments, not one single intervention from the other side. 😉

  137. 137
    Origenes says:

    Larry Moran, wd400, CR, MatSpirit, Seversky, Goodusername, rvb8, Gordon Davisson and many others, Where Art Thou?

    Maybe this time one of them will answer 😉

  138. 138
    EugeneS says:

    GP

    Usually Darwinists prefer this argument. At least, I have seen or heard this more than once.

    The probability of something complex and specific is very low, but we must consider an ensemble, not a given individual. The probability of Mr X being born (with all his distinctive traits such as hair color, stature, weight, size of liver etc) is vanishingly small, but Mr X’s do get born.

    An analogy is language: it may have started from very simple ideas (crude semantic islands in the ocean of possible sequences of letters of the same length). Then gradually the semantic islands were getting more specific and consequently relatively smaller in size.

    With life, the situation is analogous: it may have started from something considerably simpler than the specific peptide sequences observed today (and consequently with much greater probabilities), which then gradually became more specific leading to what appears to be very low probabilities.

    The most important observations that in my opinion invalidate this claim are:

    1. complex function is not, and cannot be, a (re)combination of relatively simple functions. We have discussed this at length in the other thread.
    2. the language analogy is flawed because it does not take into account the physical/chemical constraints protein molecules must satisfy in order to be functional. Words can be added to words without constraints, to specify meaning; whereas physical constraints do not permit arbitrary functional adjustments in the protein world.

    GP, I wonder if you have any more comments on this. Thanks.

  139. 139
    EugeneS says:

    Forexhr 135

    Yes, of course. I understand this.

  140. 140
    Mung says:

    gpuccio:

    Well, the record gets better: after 1600+ views and 135 comments, not one single intervention from the other side.

    Patience grasshopper. We are in the waiting time. The random mutations are accumulating but not yet selectable. You will see the results and it will be a wonderfully engineered design. By accident. Just you wait and see.

  141. 141
    DATCG says:

    Gpuccio,
    Great reading and discussions once again. Thank you for your time and effort in answering questions and fleshing out details.

    Origenes, your statement about graciously accepting JUNK DNA and Gpuccio’s response should not be lost on readers…

    Origenes:

    “Note that we graciously assume that, at the moment a functional sequence is formed in junk-dna, somehow, this particular sequence (and not any other) is activated and translated into proteins. Why or how this is done, no one knows.”

    Yes, we graciously assume a lot of things. We are definitely very kind to our darwinist interlocutors!

    This is important.

    It brings to memory Dan Graur’s angry rant against ENCODE and function found in JUNK DNA.

    His last paper insisting the Genome must be at least 75% Junk.

    Is 75% JUNK DNA enough to solve the problem for neo-Darwinist? And save materialist assumptions?

    Don’t think so.

  142. 142
    jerry says:

    at the moment a functional sequence is formed in junk-dna, somehow, this particular sequence (and not any other) is activated and translated into proteins. Why or how this is done, no one knows.

    It has been awhile since I have read about this but aren’t there lots of expressed sequences in the cell that may be due to just this, part of the junk DNA being expressed with no apparent function. Someone once said that these apparently useless proteins may be some of the so called Orphan proteins that have been discovered in the cell but which have no known function.

    This is part of the punctuated equilibrium hypothesis, that sequences mutate away due to various means of random variation until they become functional. I believe it relies on the concept that a lot of the junk DNA gets expressed but has no function but that some eventually will and the function will appear suddenly. This seems to be part of the ideas that Jurgen Brosius has written about. Brosius is a very strident atheist who attacks anybody suggesting something looking like design.

    Brosius wrote the key article in the issue of Paleobiology (31(sp5):1-16. 2005) dedicated to Stephen Gould. I have mentioned this several times before and gpuccio has looked at his work. This issue was republished as a book by Vrba called Macroevolution.

    So novel new proteins can arise according to these ideas and Brosius and his colleague’s publications discuss some of them. However, it seems they are not able to overcome the statistical boundaries and hurdles that gpuccio has listed in order to form complex new systems.

  143. 143
    gpuccio says:

    EugeneS:

    Let’s see.

    “The probability of something complex and specific is very low, but we must consider an ensemble, not a given individual. The probability of Mr X being born (with all his distinctive traits such as hair color, stature, weight, size of liver etc) is vanishingly small, but Mr X’s do get born.”

    This seems just the classical wrong argument about unlikely things happening. One of the most senseless arguments ever made!

    It goes like that: let’s say we have a total number of possible random states of one sequence. For example a sequence of 30 characters, each drawn from 30 symbols: the 25 letters in the English alphabet, plus space and 4 punctuation marks. That gives 30^30, or about 2E+44, possible sequences, so each sequence has a probability of about 5E-45 to be found. Then we generate a sequence of letters of that length, and of course one of the “unlikely” sequences has been found! Miracle, miracle…

    Of course this is only bad reasoning about probabilities.

    The probability of finding one specific sequence among all is, correctly, about 5E-45. Therefore, if we specify in advance one of the possible sequences, and then we get that same sequence from a random event, we can really call that a miracle!

    But the probability of getting one generic sequence out of those 2×10^44 with one random event is exactly 1. Therefore, we have to find one of them by our random search.
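    The two probabilities can be checked with a few lines of Python. This is a minimal sketch, assuming (as in the example) 30 symbols and a sequence length of 30; the variable names are only illustrative.

```python
from fractions import Fraction

ALPHABET_SIZE = 30  # letters, space and punctuation marks, as in the example
LENGTH = 30         # assumed sequence length

search_space = ALPHABET_SIZE ** LENGTH

# One sequence specified in advance: a single favorable outcome.
p_prespecified = Fraction(1, search_space)

# "Some sequence or other": every outcome is favorable.
p_any = Fraction(search_space, search_space)

print(f"search space:            {search_space:.2e}")           # 2.06e+44
print(f"pre-specified sequence:  {float(p_prespecified):.2e}")  # 4.86e-45
print(f"any sequence at all:     {float(p_any):.0f}")           # 1
```

    The whole difference between the two cases is the size of the set of favorable outcomes, which is the point of the argument.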

    In functional specification, or other types of specification, the definition of the target space is not made by some arbitrary pre-specification, like in the above example, but rather by some objective property that implicitly defines a subset of the search space, and therefore generates a binary partition. Then the probability of finding the target space can be computed.

    For example, let’s say that we generate by one random event a sequence of the 25 letters in correct alphabetical order, followed by space and the 4 punctuation marks in some specific order: that would really be astounding. But we are not pre-defining the target space, because the alphabetical order exists independently from our little experiment.

    So would any sequence made of 30 identical letters be extremely unlikely, and here again because of a specification which is independent and which is a form of intrinsic order.

    Finally, if we get a phrase which has good meaning in English, like:

    It is easy to understand that.

    again we touch an amazingly small target space.

    Now, one objection could be: but what if we get one result which is in one of the possible target spaces which have some good objective definition?

    OK, we must sum those target spaces.

    For example, just in our little example, the target space of the first case (ordered sequence, assuming that space and punctuation marks have some alphabetical order) is:

    2 sequences (considering both possible orderings).

    The target space for sequences made of one symbol is:

    30 sequences (because we have 30 symbols)

    The target space for “a phrase which has good meaning in English” is more difficult to compute.

    However, here:

    https://uncommondescent.com/intelligent-design/an-attempt-at-computing-dfsci-for-english-language/

    I have given a simple way to approximate the target space for the function “A phrase made of English words”, which certainly includes the subset of “a phrase which has good meaning in English”.

    Using my computations given in that OP, for a 30 characters phrase, the set of phrases made of English words is about:

    3.2e+26

    Now, let’s say that the subset of phrases having good meaning in English is 3 orders of magnitude smaller (IOWs that 1:1000 of the phrases made of English words has good meaning in English, which is probably a very optimistic assumption).

    So, the target space for our function of “having good meaning in English” would be of about:

    3.2e+23

    Now, let’s imagine that we have 1000 main languages on our planet, and that we want the target space for all of them together. It will be:

    3.2e+26

    If we add the 32 sequences corresponding to the first two functions, we have practically the same number.

    With a search space of 2e+44, the probability of finding one sequence which has one of the three functions we have defined:

    a) Being fully ordered

    b) Being made of only one symbol

    c) Having good meaning in one of the 1000 main languages on earth

    remains about 1:10^18.

    This, for a very short sequence of 30 letters.
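    The arithmetic of this example can be reproduced as follows. This is only a sketch of the computation above: the figure of 3.2e+26 is taken as given from the linked OP, the 1:1000 fraction and the 1000 languages are the assumptions stated in the text, and expressing the result in bits (minus log2 of the ratio) is an extra convenience added here.

```python
import math

search_space = 30 ** 30          # about 2e+44

ordered = 2                      # fully ordered, both directions
single_symbol = 30               # 30 identical symbols
english_word_phrases = 3.2e26    # taken as given from the linked OP
meaningful_fraction = 1e-3       # assumed: 1 in 1000 phrases has good meaning
languages = 1000                 # assumed number of main languages

meaningful_all_languages = english_word_phrases * meaningful_fraction * languages

# Independent target spaces are simply summed.
target_space = ordered + single_symbol + meaningful_all_languages

probability = target_space / search_space
bits = -math.log2(probability)

print(f"target space: {target_space:.2e}")
print(f"probability:  {probability:.2e}")
print(f"bits:         {bits:.1f}")
```

    The 32 highly specific sequences are invisible next to the 3.2e+26 term: summing islands changes the total very little unless one of them is larger by orders of magnitude.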

    In my OP cited above I have also demonstrated that, for language, the ratio between search space and target space is bound to increase with the length of the string.

    So, what can we learn from this example?

    a) Even in very simple digital sequences, complex functions are only a small part of the search space, even if large functional islands are defined.

    b) The longer the sequence, the smaller is the ratio between target space and search space.

    c) Considering many different functional islands usually does not increase much the target space, because functional islands are simply summed. In this kind of computation, it’s mainly the effect of exponential components that can really change things.

    d) In particular, functional islands which have great specificity contribute very little to a general target space, as we have seen for the “ordered” and “same character” functions.

    e) So, if we find a very specific result, like a sequence completely ordered, that result remains absolutely unlikely, even if other functions, like “having good meaning in English”, have a bigger target space.

    IOWs, we have to consider the functional information in our specific result, and not just a generic target space that can include many types of other functions.

    In particular, in the case of proteins and evolution, the only target space we can consider is the function:

    “any variation that, in this specific biological context, confers a detectable reproductive advantage”.

    Because that is the only function that can be “offered” to NS.

    More in next post.

  144. 144
    gpuccio says:

    jerry:

    “I believe it relies on the concept that a lot of the junk DNA gets expressed but has no function but that some eventually will and the function will appear suddenly”

    I think that a lot of non coding DNA is more or less transcribed, but not translated.

    Otherwise, cells should be packed with useless proteins, which would certainly be an extreme burden to cell life.

    Moreover, a sequence can be translated only if it has reached the state of ORF, with a starting codon and no stop codons.

    So, I believe that all the search that happens in non coding DNA cannot be helped by NS, at all. Only if and when an ORF is transcribed and translated it can be detected by NS, and only if it confers some detectable reproductive advantage.

    IOWs, the walk to a new basic selectable function is a mere random walk. And therefore it can use only the probabilistic resources that I have highlighted in my OP.
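    The ORF condition mentioned above (a starting codon and no stop codons) can be written as a simple check. This is a minimal sketch under simplifying assumptions: the function name is hypothetical, only the standard start codon ATG and the three standard stop codons are considered, the sequence is read in a single frame from its first base, and a stop codon at the very end is tolerated as the normal terminator.

```python
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def is_open_reading_frame(dna):
    """True if dna starts with ATG and has no stop codon before its end."""
    dna = dna.upper()
    if len(dna) < 3 or len(dna) % 3 != 0:
        return False
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    if codons[0] != START_CODON:
        return False
    body = codons[1:]
    # A terminal stop codon closes the frame and is not counted against it.
    if body and body[-1] in STOP_CODONS:
        body = body[:-1]
    return not any(codon in STOP_CODONS for codon in body)


print(is_open_reading_frame("ATGGCCTAA"))  # True
print(is_open_reading_frame("ATGTAAGCC"))  # False: premature stop
print(is_open_reading_frame("GCCATGGCC"))  # False: no start codon at position 0
```

    A random non coding sequence has to pass this check, and also be transcribed and translated, before NS has anything to act on.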

  145. 145
    gpuccio says:

    EugeneS:

    “An analogy is language: it may have started from very simple ideas (crude semantic islands in the ocean of possible sequences of letters of the same length). Then gradually the semantic islands were getting more specific and consequently relatively smaller in size.”

    I am not sure I understand.

    Let’s go back to our example of a 30 characters sequence.

    If we choose not from the 200000 words of the English language, but from, say, 100 “crude semantic islands”, always of about 5 letters, our target space is much smaller. It becomes “only”:

    1.1e+11

    The search space remains the same. The probability of finding semantic islands goes down, a lot down.

    The probability goes up only if the functional islands are extremely big. For example, if in some language any word of five letters which contains two “a” can represent, say, a dog, then that result is easier to find, because there are many words with two “a”.

    IOWs, a very generic function is easy to find. But almost always useless in most realistic scenarios.

    The idea is: important functions require a lot of specific information, even in their basic form, which can then be optimized.

    A basic petrol engine requires a lot of specific configuration, much more than a cart. Even if it is a simple petrol engine. And it cannot be derived from a cart by extremely simple variations, each improving the original function of the cart.

    More in next post.

  146. 146
    gpuccio says:

    EugeneS:

    “With life, the situation is analogous: it may have started from something considerably simpler than the specific peptide sequences observed today (and consequently with much greater probabilities), which then gradually became more specific leading to what appears to be very low probabilities.”

    Again, there are two problems:

    a) We know no independent form of life simpler than a prokaryote. All the rest is imagination.

    b) Whatever was simpler, must have been a starting point for the specific peptide sequences that we observe today. For example, the ATP synthase beta chain, which is not simply something we observe today, but something that was already there, very similar to what we observe today, shortly after OOL.

    Now, how did the “simpler” (and never observed) thing serve as starting point to that specific sequence?

    Was it some partially homologous protein?

    Did it already have the function of synthesizing ATP from a proton gradient? Did it already contribute to the hexameric structure of F1, together with the alpha chain?

    How simple was it, then?

    IOWs we are again at the problem of the cart and the petrol engine, but thousands of times more complex.

    With the little problem that we have no cart! 🙂

    In the end, all leads again to my challenge, which nobody has answered yet. At the cost of being repetitive, I paste it again here:

    Will anyone on the other side answer the following two simple questions?

    1) Is there any conceptual reason why we should believe that complex protein functions can be deconstructed into simpler, naturally selectable steps? That such a ladder exists, in general, or even in specific cases?

    2) Is there any evidence from facts that supports the hypothesis that complex protein functions can be deconstructed into simpler, naturally selectable steps? That such a ladder exists, in general, or even in specific cases?

  147. 147
    jerry says:

    Gpuccio,

    I think that a lot of non coding DNA is more or less transcripted, but not translated.

    Otherwise, cells should be packed of useless proteins, which would certainly be an extreme burden to cell life.

    Moreover, a sequence can be translated only if it has reached the state of ORF, with a starting codon and no stop codons.

    Thank you.

    But some day you should take on the specific claims of Brosius, that is post graduate work and appropriate for your time.

  148. 148
    gpuccio says:

    jerry:

    “But some day you should take on the specific claims of Brosius, that is post graduate work and appropriate for your time.”

    Just to help, could you please confirm if this is the paper you refer to:

    “Disparity, adaptation, exaptation, bookkeeping, and contingency at the genome level”

    and, if possible, what is the part which makes the pertinent claims?

    Thank you. 🙂

  149. 149
    jerry says:

    Gpuccio,

    That is the first article I read and is in the journal article I referred to in Paleobiology. It gives you a flavor of the issues and how he thinks. As I said he is a strident atheist.

    He has a website for his group at the University of Muenster

    https://campus.uni-muenster.de/en/zmbe/the-institutes/inst-of-exp-pathology/staff/

    https://campus.uni-muenster.de/en/zmbe/the-institutes/inst-of-exp-pathology/publications/

    I believe some of these papers discuss the origin of specific proteins and their coding sequences.

    He has a wikipedia page

    https://en.wikipedia.org/wiki/Jürgen_Brosius

    Jürgen Brosius (born 1948 in Saarbrücken) is a German molecular geneticist and evolutionary biologist. He is a professor at the University of Münster where he is the director of the Institute of Experimental Pathology. Some of his scientific contributions involve the first genetic sequencing of a ribosomal RNA operon, the design of plasmids for studying gene expression, expression vectors for high-level production of recombinant proteins and RNA, RNA biology, RNomics as well as the significance of retroposition for plasticity and evolution of genomes, genes and gene modules including regulatory sequences or elements.

    Here is another article he wrote at the same time

    Waste not, want not – transcript excess in multicellular eukaryotes

    There is growing evidence that mammalian genomes produce thousands of transcripts that do not encode proteins, and this RNA class might even rival the complexity of mRNAs. There is no doubt that a number of these non-protein-coding RNAs have important regulatory functions in the cell. However, do all transcripts have a function or are many of them products of fortuitous transcription with no function? The second scenario is mirrored by numerous alternative-splicing events that lead to truncated proteins. Nevertheless, analogous to ‘superfluous’ genomic DNA, aberrant transcripts or processing products embody evolutionary potential and provide novel RNAs that natural selection can act on.

    Also more recently

    The Persistent Contributions of RNA to Eukaryotic Gen(om)e Architecture and Cellular Function
    Jürgen Brosius

    Abstract

    Currently, the best scenario for earliest forms of life is based on RNA molecules as they have the proven ability to catalyze enzymatic reactions and harbor genetic information. Evolutionary principles valid today become apparent in such models already. Furthermore, many features of eukaryotic genome architecture might have their origins in an RNA or RNA/protein (RNP) world, including the onset of a further transition, when DNA replaced RNA as the genetic bookkeeper of the cell. Chromosome maintenance, splicing, and regulatory function via RNA may be deeply rooted in the RNA/RNP worlds. Mostly in eukaryotes, conversion from RNA to DNA is still ongoing, which greatly impacts the plasticity of extant genomes. Raw material for novel genes encoding protein or RNA, or parts of genes including regulatory elements that selection can act on, continues to enter the evolutionary lottery.

    There are other more specific articles as these are review articles.

  150. 150
    jerry says:

    Gpuccio,

    Here is the full text of the previous article I mentioned

    The Persistent Contributions of RNA to Eukaryotic Gen(om)e Architecture and Cellular Function

    Brosius, J. 2014. The persistent contributions of RNA to eukaryotic gen(om)e architecture and cellular function. Cold Spring Harb. Perspect. Biol. 6. pii: a016089. doi: 10.1101/cshperspect.a016089.

    http://cshperspectives.cshlp.o.....16089.full

    You might want to look at this too which references Brosius’ paper


    Life is physics and chemistry and communication
    Guenther Witzany

    Manfred Eigen extended Erwin Schroedinger’s concept of “life is physics and chemistry” through the introduction of information theory and cybernetic systems theory into “life is physics and chemistry and information.” Based on this assumption, Eigen developed the concepts of quasispecies and hypercycles, which have been dominant in molecular biology and virology ever since. He insisted that the genetic code is not just used metaphorically: it represents a real natural language. However, the basics of scientific knowledge changed dramatically within the second half of the 20th century. Unfortunately, Eigen ignored the results of the philosophy of science discourse on essential features of natural languages and codes: a natural language or code emerges from populations of living agents that communicate. This contribution will look at some of the highlights of this historical development and the results relevant for biological theories about life.

  151. 151
    gpuccio says:

    jerry:

    “As I said he is a strident atheist.”

    I am not interested in his world-view. I am only interested in his scientific ideas.

    Now, excuse me, but I have no special reasons to read everything he has written. I have already read that paper in Paleobiology you apparently referenced at #142. Frankly, I did not find it really interesting, except maybe for an early understanding (in 2004) of the important role of transposons in evolution, an idea that has been confirmed by the more recent literature, and that I will happily support, because as said many times I believe that transposons are a very good tool of design.

    Unfortunately, I found nothing in that paper regarding the point you mentioned:

    It has been awhile since I have read about this but aren’t there lots of expressed sequences in the cell that may be due to just this, part of the junk DNA being expressed with no apparent function. Someone once said that these apparently useless proteins may be some of the so called Orphan proteins that have been discovered in the cell but which have no known function.

    My answer was:

    “I think that a lot of non coding DNA is more or less transcribed, but not translated.

    Otherwise, cells should be packed of useless proteins, which would certainly be an extreme burden to cell life.

    Moreover, a sequence can be translated only if it has reached the state of ORF, with a starting codon and no stop codons.

    So, I believe that all the search that happens in non coding DNA cannot be helped by NS, at all. Only if and when an ORF is transcribed and translated it can be detected by NS, and only if it confers some detectable reproductive advantage.

    IOWs, the walk to a new basic selectable function is a mere random walk. And therefore it can use only the probabilistic resources that I have highlighted in my OP.”

    Again, in all the citations you make of Brosius, or about him, I can find nothing about systematic translation of useless proteins.

    So, for the moment I will assume that there is no evidence about such an idea.

    I am interested in any argument about a different view, but please give me exact references for it, if you have them.

  152. 152
    gpuccio says:

    jerry:

    By the way, I don’t believe at all that transposons are evidence for an RNA world, as Brosius seems to suggest.

    Indeed, I don’t believe at all in the RNA world theory. There is no evidence for it. It is only a necessary imaginary tool to try to answer questions about OOL that cannot be answered by neo-darwinism.

    And that cannot even be answered by the imaginary RNA world.

    However, I will not discuss in detail here the many irrational ideas in Brosius’ paper, because this is not really the object of discussion in this thread.

    Suffice it to say that not only is Brosius, as you say, “not able to overcome the statistical boundaries and hurdles that gpuccio has listed in order to form complex new systems”: in the paper I have read, he does not even try to address them in any way.

    A whole paper about how imaginary evolutionary events should have happened, at least according to him, and not even one simple attempt at analyzing if any of those events is even empirically credible, let alone probable!

  153. 153
    Dionisio says:

    gpuccio,

    Perhaps you’ve seen this paper before.

    The link below points to a relatively old* paper (it appeared 3 years ago) that seems like a game changer, because it explains in detail the jump from prokaryotes to eukaryotes, doesn’t it?

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4210606/pdf/12915_2014_Article_76.pdf

    (*) BTW, what would be considered ‘old’ for biology research papers?

  154. 154
    Dionisio says:

    @153 follow-up

    The following papers seem to add supporting evidence for the paper referenced @153.

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5557255/pdf/13062_2017_Article_190.pdf

    http://mmbr.asm.org/content/81/3/e00008-17

  155. 155
    gpuccio says:

    Dionisio:

    “The below link points to a relatively old* paper (it appeared 3 years ago) that seems like a game changer, because it explains in details the jump from prokaryotes to eukaryotes, doesn’t it?”

    🙂 🙂 🙂

    Well, the 3 papers you quote are, in a sense, honest enough, because they clearly show how little we understand of the issue! 🙂

    They are all of the type: we understand practically nothing of how this happened, but let me suggest some new revealing idea!

    Of course, after the revealing idea is given, we still understand practically nothing.

    However, the transition to eukaryotes remains one of the big issues in biology, especially from an informational point of view. The amount of new information generated in that transition is simply staggering!

    Unfortunately, there is still a lot of uncertainty about the basics of the event: for example, when did it happen.

    These huge transitions (OOL, eukaryotes, metazoa, vertebrates, and certainly many others) are really beyond any hope of explanation, even vague, if design is not factored in.

  156. 156
    jerry says:

    Gpuccio,

    A couple things.

    I pointed to Brosius because he is considered important in the evolution science community. He was the person given the honor of introducing the tribute to Gould. He is a serious researcher as opposed to Dawkins who is a polemicist.

    He runs a very respected research lab on evolution. Whether he or his colleagues have produced anything that challenges the ID point of view is what is of interest. I have said for about 8 years, since he was first introduced to UD by Allan MacNeil, that I could not see any real evidence for his overall worldview.

    But there are some interesting questions which his colleagues are investigating. One recent one is,

    Speciation network in Laurasiatheria: retrophylogenomic signals

    Liliya Doronina, Gennady Churakov, Andrej Kuritzin, Jingjing Shi, Robert Baertsch, Hiram Clawson, and Jürgen Schmitz
    Genome Res. June 2017 27: 997-1003

    Based on my limited understanding of the abstract this is about what are the possible causes for the differences between various mammals and what are the origins of these causes?

    Certainly an interesting question and one that looks at the edge of evolution.

    I don’t pretend to understand the science for these differences because the layman’s description has not been written.

    Before you say this issue is not part of your thesis which I agree, it points to how various morphological and adaptive elements arise by apparently possible natural events. And understanding them correctly can only contribute to the ID worldview by showing what is possible and what is not. They don’t seem to rise to new complex, functional systems. They are actually nails in the coffin by showing just what the limits are.

    ————

    On what is selectable in NS, Brosius maintains that it is more than proteins but includes RNA sequences and control mechanisms on the DNA. You said

    So, I believe that all the search that happens in non coding DNA cannot be helped by NS, at all. Only if and when an ORF is transcribed and translated it can be detected by NS, and only if it confers some detectable reproductive advantage.

    From the abstract above

    Raw material for novel genes encoding protein or RNA, or parts of genes including regulatory elements that selection can act on, continues to enter the evolutionary lottery.

    So he is looking at a lot more than coded genes. Again, there is no evidence that any of this challenges the ID position.

  157. 157
    Mung says:

    gpuccio:

    These huge transitions (OOL, eukaryotes, metazoa, vertebrates, and certainly many others) are really beyond any hope of explanation, even vague, if design is not factored.

    Don’t we have some very fine and detailed examples of how evolution can bring about minor transitions?

    Say, from one protein to another. Or a slightly different size of beak in a population of birds.

  158. 158
    gpuccio says:

    jerry:

    OK, I will look with interest at the paper about mammal speciation that you suggested, and will give you my feedback.

    Regarding the second point, of course variations in regulatory elements in non coding DNA can be selected, if they give some reproductive advantage. But those are sequences of non coding but absolutely functional DNA (or RNA).

    The issue I was discussing was, instead, the possibility of getting to some functional protein coding sequence from some non functional DNA sequence (like a duplicated and inactivated gene).

    In this case:

    a) The starting sequence must be non functional, which is important to allow a non restricted random walk, one that can reach completely new sequences in the search space.

    b) The intermediate states of the random walk are not functional, too.

    c) Only when we reach the final state of a basic functional protein coding gene, whose coded protein can give some reproductive advantage, can the result be selected by NS.

    d) The point raised by Origenes, and supported by me, was that a further difficulty is that the intermediate states are not transcribed, or at least not translated (because the sequence is not yet an ORF, or because it is an ORF that codes for some useless protein). But, as soon as a potentially functional sequence is reached, it must be fully transcribed and translated (and, I would add, correctly regulated) to be able to confer the reproductive advantage that will expose it to NS. Otherwise, any new neutral variation can easily degrade the potential function so preciously (and improbably) reached.

    e) In the light of the above argument, possible selections of non coding DNA sequences for some regulatory function can have no role at all, or probably a negative role. Indeed, if some part of the sequence which should be a step to the future protein is selected for a regulatory role as DNA or RNA, that would be a final stop to its further “evolution” in the direction of a protein coding sequence, because indeed negative selection would act to preserve the sequence as it is. And a regulatory function linked to the biochemical structure of the nucleotide sequence has nothing to do with the potential symbolic function of those same nucleotides to code for functional proteins.

    So, the point remains valid: the best perspective for neo-darwinism, as far as a new functional protein coding sequence must be found, which is sequence unrelated to what already exists, is a random walk on some non coding and non functional DNA sequence, where no intervention of NS can help the walk (and therefore all the probabilistic limits highlighted in this OP definitely apply), and where a lot of lucky events must also happen to allow the (im)possibly found sequence to be translated and selected at the right time.

  159. 159
    gpuccio says:

    To all:

    By the way, this statement in my answer to jerry at #158:

    A regulatory function linked to the biochemical structure of the nucleotide sequence has nothing to do with the potential symbolic function of those same nucleotides to code for functional proteins.

    is also the main reason (but not the only one) why the imagined transition from some hypothetical RNA world to a DNA-RNA-protein world (the only one we know to exist) is simply impossible.

  160. 160
    Origenes says:

    GPuccio: But, as soon as a potentially functional sequence is reached, it must be fully transcribed and translated (and, I would add, correctly regulated) to be able to confer the reproductive advantage that will expose it to NS.

    Another minor detail …

  161. 161
    gpuccio says:

    Origenes:

    “Another minor detail …”

    Yes. A lot of “minor” details, indeed, often “graciously” underestimated in the debate.

    Are we really too kind to our interlocutors? 🙂

  162. 162
    gpuccio says:

    Mung:

    OK, you have me! 🙂

    The “size of beak” argument is really a game stopper! 🙂

  163. 163
    jerry says:

    So, the point remains valid: the best perspective for neo-darwinism, as far as a new functional protein coding sequence must be found, which is sequence unrelated to what already exists, is a random walk on some non coding and non functional DNA sequence, where no intervention of NS can help the walk (and therefore all the probabilistic limits highlighted in this OP definitely apply), and where a lot of lucky events must also happen to allow the (im)possibly found sequence to be translated and selected at the right time.

    This is the point I have been making for years. I have said that shortly the evidence for what is possible will be available to either support or destroy the Neo Darwinian point of view. My guess is that it will reveal some very interesting changes that have happened in species due to natural events but support the edge of evolution thesis.

    If a new protein arises or a new regulatory element arises through some form of random process there should be evidence for it in related species where a similar sequence is present in an incomplete form.

    Dr Gauger replied to a comment that I made back in May that this is happening on a small scale and pointed to an analysis of some plant genomes and their common proteins and unique proteins. What was amazing about the study she pointed to was the number of unique proteins in these plant species. Or the taxonomically restricted genes that encode for these proteins.

    https://uncommondescent.com/intelligent-design/do-nylon-eating-bacteria-show-that-new-functional-information-is-easy-to-evolve/

  164. 164
    gpuccio says:

    jerry:

    “This is the point I have been making for years. I have said that shortly the evidence for what is possible will be available to either support or destroy the Neo Darwinian point of view. My guess is that it will reveal some very interesting changes that have happened in species due to natural events but support the edge of evolution thesis.”

    I absolutely agree with you. 🙂

  165. 165
    Dionisio says:

    gpuccio @159:

    “A regulatory function linked to the biochemical structure of the nucleotide sequence has nothing to do with the potential symbolic function of those same nucleotides to code for functional proteins.”

    is also the main reason (but not the only one) why the imagined transition from some hypothetical RNA world to a DNA-RNA-protein world (the only one we know to exist) is simply impossible.

    The OOL debate is over.

  166. 166
    Dionisio says:

    gpuccio @155:

    That seems like an accurate diagnosis of the current hopeless condition of the terminally-ill Neo-Darwinian story.

    Thanks.

  167. 167
    gpuccio says:

    Dionisio (and whoever is interested):

    I would like to be more detailed on this important point.

    Let’s say that we have a sequence of nucleotides A.

    Let’s say that, in the fabulous RNA world, sequence A is functional: it has some specific biochemical activity, for which it has been selected and preserved.

    The functional information in sequence A is also passed to new generations because, of course, a sequence of nucleotides can be copied by some RNA polymerase activity, effected by the RNA itself.

    This is, I suppose, the central idea for the RNA world: RNA can be both an effector molecule and an information storing molecule.

    Now, let’s say that we make the transition to an RNA-protein world.

    Let’s say that the function which was effected by A in the RNA world should now be inherited by a protein Ap.

    Well, there is absolutely no way that the information in A (the nucleotide sequence) can be “transferred” to some other nucleotide sequence (let’s call it A1) which can code for the protein Ap.

    Why?

    Because the information in A1 for Ap must of course be coded according to the symbolic genetic code.

    IOWs A1 (which should code for Ap) has absolutely no relationship with A (the nucleotide sequence which effected the function in the RNA world).

    They are two completely different types of information, because they have information for the same function, but in two completely different ways:

    a) A has information for an RNA molecule, whose 3d structure and biochemical activity depend directly on the nucleotide sequence, according to objective biochemical laws (IOWs, in the same way that a protein structure and function depend on its AA sequence).

    b) A1 has information for a sequence of AAs, corresponding to a functional protein, but that sequence of AAs, and therefore the structure and function of the protein, depend on the nucleotide sequence only symbolically, through an arbitrary code.

    So, there is absolutely no way that the information in A can generate the information in A1 by non design mechanisms.

    IOWs, the RNA world is no “precursor” to the protein world: all the information in the RNA world will be lost in the supposed transition to the protein world. IOWs, the protein world could as well arise from scratch, as far as functional information is concerned.
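    A minimal sketch of the "symbolic" relationship described above: in translation, the amino acid sequence is obtained from the nucleotide sequence by a table lookup, codon by codon. The table below is only a small subset of the standard genetic code, and the example sequences are made up for illustration; the sketch shows the lookup itself and nothing more.

```python
# Small subset of the standard genetic code (RNA codons -> one-letter amino acids).
# "*" marks a stop codon. Only the codons used in the examples are included.
CODON_TABLE = {
    "AUG": "M",  # methionine (start)
    "GCU": "A",  # alanine
    "GCC": "A",  # alanine (synonymous with GCU)
    "UUU": "F",  # phenylalanine
    "GGU": "G",  # glycine
    "AAA": "K",  # lysine
    "UAA": "*",  # stop
}

def translate(rna: str) -> str:
    """Translate an RNA string codon by codon until a stop codon is met."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Two hypothetical coding sequences that differ at one nucleotide (GCU vs GCC).
peptide_1 = translate("AUGGCUUUUGGUAAAUAA")
peptide_2 = translate("AUGGCCUUUGGUAAAUAA")
print(peptide_1, peptide_2)
```

    The two nucleotide sequences differ, yet the lookup gives the same peptide for both, because the mapping goes through the table rather than through the chemistry of the nucleotides themselves.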

  168. 168
    Dionisio says:

    gpuccio @167:

    […] that sequence of AAs, and therefore the structure and function of the protein, depend on the nucleotide sequence only symbolically, through an arbitrary code.

    The essence of the problem.

  169. 169
    Dionisio says:

    The OOL + Neo-Darwinian evolution stories are starting to look like a bunch of ‘chicken-egg’ conundrums mixed with many ‘humpty-dumpty’ issues. And it seems like getting worse with every discovery.
    But maybe we just don’t understand it.
    🙂

  170. 170
    Corey Delvine says:

    Oh brother Gpuccio, you’re still passing out this kool-aid?

  171. 171
    Corey Delvine says:

    Your claim is that evolution could not arrive at these proteins by a random walk through amino acid sequence space, but nowhere in any literature does anyone claim that this is how protein evolution occurs.

    So your argument is a strawman argument.

    High homology across species obviously suggests function, but function does not require high homology.
    Proteins with the same function can have very different amino acid sequences and experiments have swapped amino acids in proteins, heck they’ve even stripped all 20 amino acids away and rebuilt proteins using only 4 amino acids and the protein was still functional.

    So again, your argument is a strawman.

    You claim that in order for evolution to be possible, the search space must be explored in all its parts.
    But there is no evidence that suggests this to be true, and the fact that function can be happened upon easily, as Szostak shows, disagrees with this.
    (again this is thinking of function in a biological context, and not using Gpuccio’s personal definition of function).

    Again. strawman.

    “Let’s take the case of weak ATP binding, a ridiculously simple function, and by far not selectable.”
    Who are you to be defining function and to say what’s selectable and what’s not?

    Do you see everything in black and white or is that just how you view biology?

  172. 172
    Mung says:

    Let’s say that, in the fabulous RNA world, sequence A is functional: it has some specific biochemical activity

    Let’s say that its function is to fold in on itself, because that’s what RNA does. Which makes it sort of hard to copy.

  173. 173
    Mung says:

    Corey Delvine:

    Your claim is that evolution could not arrive at these proteins by a random walk through amino acid sequence space, but nowhere in any literature does anyone claim that this is how protein evolution occurs.

    Thank you for telling us how protein evolution does not occur.

    Given your obvious familiarity with the subject matter and your review of all the literature, how do proteins evolve?

    You left that out.

  174. 174
    kairosfocus says:

    CD, on very long and good track record, GP is not on trial. You are. The score does not look so good so far, given the sort of rhetoric you chose above. FYI, your relevant choice on cause is: blind chance and/or mechanical necessity, or intentionally directed configuration. Rule the latter out and you are stuck at the former, where protein sequences (and the codes for them) are patently highly contingent. So, the choice is hitting a needle in a beyond astronomical scale haystack by blind chance — aka repeated statistical miracles in a huge chain — or having a small step increment between functional forms; something Dawkins pointed out when he put up his rather misleading Weasel case back in the 80’s. GP is pointing out issues with the small step approach. And if you think the genetic code and proteins were written into the laws of our cosmos making emergence of life inevitable, you are looking at a version of cosmological fine tuning that is way beyond what has been claimed. The issue is not oh unless you can show reason that blind chance and mechanical necessity working through evolutionary mechanisms in chemistry of a warm little pond or the like and then onward through whatever flavour of macro-evo you favour then we can pose the magic word evolution and game over, but instead, showing empirically supported credible means. The absence of Nobel prizes for having shown those means speaks for itself as appeal to repeated statistical miracles is utterly self defeating, and appealing to quasi-infinite multiverses to open up possibility explorations is in the end a resort to bald ad hoc unsupported speculation. KF

  175. 175
    kairosfocus says:

    GP, not just coded but executed, and that implies a need for corresponding execution machinery, taking the issue to another whole level, as such machines are also coded for in the system. This is part of why I always highlight that we must start at the root of the tree of life type icon, OoL. At that level, differential reproductive success is not on the cards as the origin of relevant mechanisms for metabolism and for self-replication is what is on the cards. Design is present at the root of the tree of life as the best explanation, once methodological naturalist blinkers are taken off. If such is there at the root, it is there all along. And in the end, I am astonished at how readily ever so many would account for such systems on blind chance and/or mechanical necessity that is readily shown to be overwhelmed by the challenge of forming 500 – 1,000 bits worth of functionally specific complex organisation and associated information. Appeal to statistical miracle after statistical miracle. KF

  176. 176
    gpuccio says:

    Corey Delvine:

    Hi Corey, welcome to the discussion! 🙂

    I was feeling rather lonely: I had never had a thread where nobody from the other side made any intervention. I was beginning to beg for some name calling, at least! 🙂

    But here you are, at your best. So, thank you.

    Answers to your points:

    a) You say:

    Your claim is that evolution could not arrive at these proteins by a random walk through amino acid sequence space, but nowhere in any literature does anyone claim that this is how protein evolution occurs.

    No.

    As you should well know, I have debated in great detail the role of NS and its limits in the previous thread:

    https://uncommondescent.com/intelligent-design/what-are-the-limits-of-natural-selection-an-interesting-open-discussion-with-gordon-davisson/

    to which you have brilliantly contributed.

    What’s happening to your memory? Age?

    My first statement in this OP is:

    Coming from a long and detailed discussion about the limits of Natural Selection, I realized that some attention could be given to the other great protagonist of the neo-darwinian algorithm: Random Variation (RV).

    So no, I am not simply “claiming that evolution could not arrive at these proteins by a random walk through amino acid sequence space”.

    Look again (if you have already looked) at my OP here. You will find this clear (I suppose) statement:

    In all the present discussion we will not consider how NS can change the RV scenario: I have discussed that in great detail in the quoted previous thread, and those who are interested in that aspect can refer to it. In brief, I will remind here that NS does not act on the sequences themselves (IOWs the functional information), but, if and when and in the measure that it can act, it acts by modifying the probabilistic resources.

    So, an important concept is that:

    All new functional information that may arise by the neo-darwinian mechanism is the result of RV.

    Now, I will try to explain it better for those who have some difficulty in understanding:

    NS must act on some already existing new function in its initial form, which must therefore be, of course, naturally selectable. That initial new function must be generated by RV only.

    Is it clear?

    So, what I am claiming is that:

    “evolution could not arrive at these new initial functions by a random walk through amino acid sequence space”.

    Is it clear?

    More in next post.

  177. 177
    gpuccio says:

    Corey Delvine:

    b) You say:

    High homology across species obviously suggests function, but function does not require high homology.
    Proteins with the same function can have very different amino acid sequences and experiments have swapped amino acids in proteins, heck they’ve even stripped all 20 amino acids away and rebuilt proteins using only 4 amino acids and the protein was still functional.

    Give the reference, please. Then I will answer.

    More in next post.

  178. 178
    gpuccio says:

    kairosfocus:

    “GP, not just coded but executed, and that implies a need for corresponding execution machinery, taking the issue to another whole level, as such machines are also coded for in the system.”

    Of course you are right.

    I suppose I was again “graciously” understating some aspects. 🙂

  179. 179
    gpuccio says:

    Corey Delvine:

    c) You say:

    You claim that in order for evolution to be possible, the search space must be explored in all its parts.

    No.

    I claim that in order for a specific target to be found, enough probabilistic resources are needed to find it, given its probability to be found. It’s a completely different idea.

    But there is no evidence that suggests this to be true, and the fact that function can be happened upon easily, as Szostak shows, disagrees with this.

    As I have discussed, Szostak found a simple and useless function using appropriate probabilistic resources:

    Let’s take the case of weak ATP binding, a ridiculously simple function, and by far not selectable.

    There should be no problem to get such a simple function: just a string of AAs that can bind ATP, even at very low levels, so that we can separate the corresponding protein by ATP columns in our lab.

    That’s what Szostak has done in his famous paper.

    And, while his results have nothing to do with NS and neo-darwinism, as discussed many times, they can still give us some idea about the functional space for an extremely simple function.

    Let’s see: I have already computed the functional information for the basic simple function selected by the authors: about 40 bits.

    That means that, for that function, while the search space (for 80 AAs) is huge (1.2E+104), the target space is very big too: 8.06E+91. The ratio, 6.67E-13 (about 40 bits), is the functional complexity for that function in an 80 AAs long sequence.

    Now, we can easily see that such a simple function is well in the range of what can be found by the total bacterial system, which has a capacity of finding about 138 functional bits per genome, and in particular 123.4 bits for an 80 AAs long sequence. So, 40 bits are a piece of cake.

    This is completely consistent with my reasoning. Indeed, it is an integral part of it.
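    The arithmetic in this comment can be checked with a short sketch. It assumes only the figures quoted in the thread (4 hits in a library of 6×10^12 random sequences, 80 residues, 20 amino acids, and the 123.4-bit figure for the bacterial system); the variable names are illustrative.

```python
import math

AMINO_ACIDS = 20             # size of the amino acid alphabet
LENGTH = 80                  # residues per sequence in the library
HITS = 4                     # ATP-binding sequences found (figure quoted in the thread)
LIBRARY_SIZE = 6e12          # random sequences screened (figure quoted in the thread)
BACTERIAL_BITS_80AA = 123.4  # threshold quoted in the comment above, taken as given

search_space = AMINO_ACIDS ** LENGTH   # all possible 80-residue sequences
ratio = HITS / LIBRARY_SIZE            # estimated target/search-space ratio
functional_bits = -math.log2(ratio)    # functional information in bits
target_space = search_space * ratio    # estimated number of functional sequences

print(f"search space   : {float(search_space):.2e}")
print(f"ratio          : {ratio:.3e}")
print(f"functional bits: {functional_bits:.1f}")
print(f"target space   : {target_space:.2e}")
print("within quoted bacterial capacity:", functional_bits < BACTERIAL_BITS_80AA)
```

    The ratio is estimated from the sample (hits divided by library size) and then extrapolated to the whole search space, which is how the target-space figure above is obtained.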

    Then you say:

    (again this is thinking of function in a biological context, and not using Gpuccio’s personal definition of function).

    What do you mean?

    I have no “personal definition of function”. I accept all functions that can be defined explicitly, as clearly stated here:

    https://uncommondescent.com/intelligent-design/functional-information-defined/

    What I say about what you call the “biological context”, but should be rather called the “neo-darwinian scenario”, is that:

    “a function should be naturally selectable to be naturally selected”.

    Which is, I suppose, a tautology. Can you object to that?

    You say:

    “Let’s take the case of weak ATP binding, a ridiculously simple function, and by far not selectable.”
    Who are you to be defining function and to say what’s selectable and what’s not?

    I am not defining function.

    I am saying that the function found by Szostak is relatively simple (40 bits) for the context we are discussing (biological evolution).

    I am saying that such a function is not naturally selectable, because of course it cannot confer any reproductive advantage in any known biological context. That point is rather intuitive, but it is also proven by the following experimental article:

    “A Man-Made ATP-Binding Protein Evolved Independent of Nature Causes Abnormal Growth in Bacterial Cells”

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2754611/

    So, as far as we can tell, Szostak’s ATP binding protein, in one of its most evolved forms, is not naturally selectable, indeed it is deleterious.

    If you have reasons to think differently, please state those reasons. I will not object:

    “Who is Corey Delvine to be defining function and to say what’s selectable and what’s not?”

    I will simply consider your arguments, for what they are worth.

  180. 180
    Mung says:

    Is it clear?

    Wait. You mean new proteins don’t just appear de novo?

  181. 181
    ET says:

    Corey:

    Your claim is that evolution could not arrive at these proteins by a random walk through amino acid sequence space, but nowhere in any literature does anyone claim that this is how protein evolution occurs.

    Ummm, no one knows how evolution by means of blind and mindless processes works. That is why the need for the dogma.

    Look, Corey, if you and yours actually had some experimental evidence of blind and mindless processes producing proteins from scratch you would have presented it. As it is you and yours don’t have a clue.

    Can Corey even support his claims of “strawman”? I doubt it.

  182. 182
    Dionisio says:

    KF @174 & @175:

    Good points.

    For example, the fascinating morphogenesis is associated with 4D signaling profiles that form and are interpreted following beautiful spatiotemporally choreographed procedures with multilevel coding beyond anything ever dreamed by the best design engineers.
    Both the spatial concentration of the morphogens and the duration of their presence may affect the fate determination of the surrounding cells.
    This is beyond any science fiction ideas ever imagined by the most prolific writers of the genre.
    But sadly many people are unaware of that.

  183. 183
    forexhr says:

    Corey Delvine: “Who are you to be defining function and to say what’s selectable and what’s not?”

    If the theory of evolution teaches that the process of mutations and natural selection creates and preserves traits that are fitted to the selection pressures of specific environments, then it is a no-brainer to define ‘function’ and to say what’s ‘selectable’ and what’s not: function is such an arrangement of nucleotides that codes for traits which are beneficial in a specific environment, while selectable is that which is functional.

    Thus, nobody here needs to define function or selection – the theory itself implicitly defines these terms precisely and it is really funny how those who preach evolution as their gospel are ignorant of what the evolution theory actually teaches.

    Such a paradoxical situation can be explained by the fact that the term ‘function’ implies something that is too uncomfortable for preachers to accept and as a result they are forced to a priori reject all definitions of ‘function’. Of course, they will never provide their own definition because any such attempt would result in anxiety and unacceptable impulses that are experienced when a person simultaneously holds two contradictory beliefs or ideas. In our case, the first belief arises from the dogmatic acceptance of the evolution theory. The second belief arises from the empirical observation of the huge search space (10^810 for an average gene size), in which it is impossible to find any evolutionarily selectable (functional) instance with only the 10^43 mutations that have occurred during the entire evolutionary process.
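    To make the two figures in this paragraph comparable, here is a small sketch that works in base-10 logarithms. Both numbers (10^810 and 10^43) are taken from the comment as given; the reading of 10^810 as 4^N for a gene of N nucleotides is an assumption made here for illustration, not something the comment states.

```python
import math

SEARCH_SPACE_LOG10 = 810   # log10 of the search space quoted in the comment
MUTATIONS_LOG10 = 43       # log10 of the number of mutations quoted in the comment

# Upper bound on the fraction of the space that could be sampled,
# assuming every mutation visits a distinct sequence.
fraction_log10 = MUTATIONS_LOG10 - SEARCH_SPACE_LOG10

# Assumption: the search space is 4**N for a gene of N nucleotides.
implied_length_nt = SEARCH_SPACE_LOG10 / math.log10(4)

print(f"fraction sampled  <= 10^{fraction_log10}")
print(f"implied gene length ~ {implied_length_nt:.0f} nucleotides")
```

    The sampled fraction is only an upper bound, and it says nothing by itself about how many sequences in the space are functional, which is the quantity the rest of the thread argues about.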

    That is why it is impossible to have a rational discussion with an evolution preacher. His belief that evolution must be true is so deep that it trumps any logic or reason.

  184. 184
    gpuccio says:

    forexhr:

    True! 🙂

    “Function” is a very vague concept, if it is not defined with precision.

    In my old OP here:

    https://uncommondescent.com/intelligent-design/functional-information-defined/

    I have tried to give a precise definition of function, which includes all possible aspects and is very basic. I paste it here:

    I define a function for an object as follows:

    a) If a conscious observer connects some observed object to some possible desired result which can be obtained using the object in a context, then we say that the conscious observer conceives of a function for that object.

    b) If an object can objectively be used by a conscious observer to obtain some specific desired result in a certain context, according to the conceived function, then we say that the object has objective functionality, referred to the specific conceived function.

    In that sense, any function can be freely defined, because it is not the concept of function which is important in itself.

    The important concept is not function, but the amount of information that is necessary to implement the defined function.

    Simple functions can arise, and do arise, all the time in non designed objects: as I often say, a stone can be used for many purposes, and therefore many functions can be defined for it. But all those functions are simple, because no special configuration is necessary to implement them, beyond the few requirements that a lot of stones present (some generic dimensions, weight, and so on).

    Neo-darwinists love to equivocate on the concept of function, never trying to quantify the functional information linked to a function.

    That’s the case, for example, in Szostak’s famous paper, where the generic concept of “functional” sequences is used to make very ambiguous statements.

    Indeed, as many times said in this thread, the original function found in Szostak’s library:

    a) “Any protein sequence 80 AAs long that can weakly bind ATP, and therefore be selected by ATP columns in the lab”

    is rather simple, about 40 bits, as demonstrated by the fact that he found 4 such sequences out of 6×10^12 random sequences 80 AAs long.

    That is the complexity linked to that specific definition.

    But, as often said here, and as you very correctly remind, there is only one definition of function which is pertinent for the neo-darwinian algorithm. And that definition is:

    b) “Any function that confers a detectable reproductive advantage in a specific biological context, so that it can be the object of NS”

    That definition of function is very strict.

    Now, the correct question is: can we find functions that satisfy that definition and are still simple?

    The answer, of course, is yes. Antibiotic resistance, of the simple type, is a good example. It can arise by a single amino acid substitution (about 4 bits) or by two AA substitutions (about 8 bits), like in the case of chloroquine resistance.
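    The "about 4 bits" per substitution can be reproduced with a one-line calculation, under the simplifying assumption (made here, not stated in the comment) that each required position must hold one specific amino acid out of 20 equally likely ones.

```python
import math

def bits_for_fixed_residues(count: int, alphabet_size: int = 20) -> float:
    """Bits needed to specify `count` positions, one allowed residue each."""
    return count * math.log2(alphabet_size)

one_substitution = bits_for_fixed_residues(1)
two_substitutions = bits_for_fixed_residues(2)
print(f"{one_substitution:.2f} bits, {two_substitutions:.2f} bits")
```

    Under that assumption the values come out slightly above 4 and 8 bits, which matches the rounded figures used in the comment.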

    All the known cases of microevolution are cases where a simple new function is found by RV.

    But, of course, no complex function, satisfying the basic definition given in b), can be found that way!

    Now, neo-darwinists will argue, at that point, that complex functions can arise as a gradual sum of simpler, naturally selectable functions. That is the lie.

    Because it is simply not true.

    And if anyone on the other side thinks differently, he is again invited to take my challenge, and answer the following two questions, that nobody at all, up to now, has even tried to answer:

    Will anyone on the other side answer the following two simple questions?

    1) Is there any conceptual reason why we should believe that complex protein functions can be deconstructed into simpler, naturally selectable steps? That such a ladder exists, in general, or even in specific cases?

    2) Is there any evidence from facts that supports the hypothesis that complex protein functions can be deconstructed into simpler, naturally selectable steps? That such a ladder exists, in general, or even in specific cases?

  185. 185
    gpuccio says:

    To all interested:

    Now, as a follow-up to my previous comment to forexhr, I will give here a clear example of what I mean when I say that complex functions cannot be deconstructed into simpler, naturally selectable steps. I will use a well known scenario, that we have discussed in depth here and in the previous OP about NS, and that is very well documented: chloroquine resistance.

    Chloroquine resistance is a “relatively” complex function (about 8 bits).

    And, as we know, it cannot be deconstructed into two simpler naturally selectable steps of 4 bits each.

    Why?

    Because we know well (from Summers’ paper) that two individual substitutions are necessary to confer basic chloroquine resistance:

    either 75E and 76T or 76T and 326D

    IOWs, no single substitution can confer any resistance, and therefore be naturally selected.

    IOW, the ladder:

    75E -> NS -> 76T -> NS

    or any other equivalent, simply does not exist.

    What exists is the sequence:

    75E -> 76T -> NS

    where NS comes into play only after the two mutations occur by RV.

    IOWs, the function:

    chloroquine resistance

    has 8 bits of complexity, and cannot be deconstructed into two 4 bit steps.
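    The structure of this argument can be written as a toy model. It encodes only what the comment states about the Summers paper (resistance requires either 75E with 76T, or 76T with 326D) and treats "selectable" as simply "confers resistance"; it is not a model of real fitness values.

```python
from itertools import combinations

# Pairs of substitutions that confer basic resistance, as stated in the comment.
REQUIRED_PAIRS = [{"75E", "76T"}, {"76T", "326D"}]
CANDIDATES = ["75E", "76T", "326D"]

def is_resistant(mutations) -> bool:
    """True if the genotype contains at least one complete required pair."""
    present = set(mutations)
    return any(pair <= present for pair in REQUIRED_PAIRS)

# Single substitutions that selection could act on in this toy model.
selectable_singles = [m for m in CANDIDATES if is_resistant([m])]

# Double mutants that are resistant and therefore visible to selection.
selectable_doubles = [set(c) for c in combinations(CANDIDATES, 2) if is_resistant(c)]

print("selectable single steps:", selectable_singles)
print("selectable double mutants:", selectable_doubles)
```

    In this model no single substitution is selectable, so selection can only act once both members of a pair are present, which is the 75E -> 76T -> NS sequence described above.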

    Now, ATP synthase beta chain, just to use an old friend, has a functional complexity of about 600 bits, as proven by its evolutionary conservation.

    If even such a simple function as chloroquine resistance cannot be deconstructed into simpler naturally selectable steps, how can anyone imagine that a complex, specific function like the function of ATP synthase beta chain can be deconstructed that way?

    Such an explicit scenario should help my kind interlocutors to try to answer my challenge, and explain if there is any reason to believe such a strange idea, except for blind faith in a wrong theory.

  186. 186
    Eugene S says:

    GP

    “Of course this is only bad reasoning about probabilities.”

    Exactly my thoughts on this! Thank you very much for the detailed answer. I wanted to synchronize the watches, so to speak. Second, I feel we need to give our interlocutors’ arguments a fair bit of attention (since they are not speaking up for themselves) 🙂 I hate the phrase “the advocate of the devil” even though I know that its connotation is not as dramatic as it may sound to somebody, like myself, whose native language is not English 🙂

    I have been extremely busy lately. I will eventually get through your other comments.

    I also have a couple of very concrete and probably very simple questions regarding the bioinformatics algorithms and software you are using. Could you write a post on the bioinformatics basics, the metrics and a little more detail about how you produced those graphs, for the benefit of the general audience?

    This is really valuable work.

  187. 187
    Eugene S says:

    GP, Barry and other contributors!

    Please could you organize your contributions on the blog by author, so we can quickly find a publication. Otherwise the interested reader has to create multiple bookmarks in the browser. I suggested creating an index by author way back but was not heard 🙂

  188. 188
    gpuccio says:

    EugeneS:

    “Second, I feel we need to give our interlocutors’ arguments a fair bit of attention (since they are not speaking up for themselves)”

    Yes, that’s fair. And I must say that you friends who occasionally play the role of “devil’s advocate” (which, after all, is a term from the Catholic tradition) are much better at it than our polite dissenters: you know the issues better, and you can formulate better counter-arguments, even if only for the sake of discussion! 🙂

    “I also have a couple of very concrete and probably very simple questions regarding the bioinformatics algorithms and software you are using. Could you write a post on the bioinformatics basics, the metrics and a little more detail about how you produced those graphs, for the benefit of the general audience?”

    That’s a very good idea. I’ll do it.

    “This is really valuable work.”

    Thank you! 🙂

    “Please could you organize your contributions on the blog by author, so we can quickly find a publication.”

    That’s a very good idea, too. I support it, even if I don’t know how easy it would be to implement it. Maybe Barry can give us some feedback about that.

  189. 189
    EugeneS says:

    GPuccio

    “That’s a very good idea. I’ll do it.”

    Thank you very much.

    “devil’s advocate … (which, after all, is a term from the Catholic tradition)”

    I meant I would not like to be called one. I should have used a less emotionally loaded verb 😉

    Seriously, I consider your OPs as ones that belong to the golden fund of this blog!

  190. 190
    Corey Delvine says:

    “on very long and good track record, GP is not on trial.”
    Sorry, but Gpuccio’s track record pales in comparison to the body of work that supports evolution.
    The two are not even in the same ballpark.
    And yes, he is on trial; if Gpuccio is going to make “scientific” claims, someone should peer-review them.

    Anyways…
    Gpuccio, you should take a look at the work David Baker’s lab has done on protein sequence/structure/function.
    The fact that you have to ask for references demonstrates what little knowledge you have in the field which you are attempting to contribute to.

    “in order for a specific target to be found”
    More strawmen.
    Evolution does not look for a specific target.
    Certainly not in sequence space.

    ATP-binding could serve a number of functions, especially if we are considering the time period it originated (extremely early in life’s history).
    Binding ATP will stabilize the phosphate groups and limit auto-hydrolysis (preserving energy).
    Binding ATP can also act as a mechanism to increase its local concentrations.
    Binding ATP can be a basic energy storage mechanism by sequestering free ATP
    Binding ATP will also drive chemical reactions by affecting reactant/product ratios.

    There’s a few right off the top of my head.
    Of course the poster-child of confirmation bias will find a way to sweep them under the rug along with everything else I say.

  191. 191
    gpuccio says:

    Corey Delvine:

    Sorry, but Gpuccio’s track record pales in comparison to the body of work that supports evolution.
    The two are not even in the same ballpark.

    True. And I am very happy and proud of not being in that ballpark.

    And yes, he is on trial; if Gpuccio is going to make “scientific” claims, someone should peer-review them.

    I am making my claims here, at Uncommon Descent. I did not know that peer review was required.

    My claims can be scientific or not, anyone can judge. Certainly they are not “scientific”, a category whose epistemological nature eludes me! 🙂

    Peer review is a requirement for publication in journals. I did not know that it was a requirement for statements to be scientific. Philosophy of science must have changed in the last few days, and I probably did not realize it. 🙂

    Gpuccio, you should take a look at the work David Baker’s lab has done on protein sequence/structure/function.

    You mean the lab at the “Institute for Protein Design“? (emphasis mine)

    Whose last paper in Nature is about “building 20,000 new drug candidates”? I quote from their site:

    This new method of drug development relies on the integration of computer modeling and laboratory testing to rapidly generate and evaluate tens of thousands of potential mini-protein binders with varying shapes. This unprecedented scale of de novo protein design was made possible by recent advances in both the Rosetta software suite and DNA manufacturing. Using this method, mini-protein binders can be rapidly programmed to target a range of proteins, including other viruses, toxins, or even tumor-specific markers.

    (emphasis mine)

    Perhaps you should explain better how work about protein engineering, where the word “design” recurs almost at every paragraph, is relevant to our discussion. I certainly can’t see it.

    The fact that you have to ask for references demonstrates what little knowledge you have in the field which you are attempting to contribute to.

    Strange statement indeed! Asking for precise, explicit and relevant references is the first thing that all peer reviewers do when you submit any paper for publication. Are all peer reviewers, that you seem to love so much, completely ignorant “in the field they are attempting to contribute to”?

    In your comment at #171 you boldly stated:

    “heck they’ve even stripped all 20 amino acids away and rebuilt proteins using only 4 amino acids and the protein was still functional.”

    I asked for a precise reference, because I don’t believe that it is true, but of course, as I respect what you say and I cannot know everything that has been published, I would never say so without first giving you a chance to show that it is true, and that you are not just imagining things.

    Also, a precise reference would allow a discussion about specific statements.

    You seem to answer by referring vaguely to a lab that does, explicitly, protein engineering.

    If it was the Baker lab that did what you say at #171, could you please specify the pertinent paper?

    I am not making any assumptions about the reasons why you apparently don’t want to give any references about what you state, because that is anyone’s guess.

    More strawmen. Evolution does not look for a specific target. Certainly not in sequence space.

    More strawmen on your part. Very specific targets have been found by evolution. And they have certainly been found in sequence space. I am discussing how evolution could find them, IOWs I am looking for some explanation for what we observe (the facts), which is what science is supposed to do.

    Your theory is that evolution was not looking for those targets, and yet in some way it found them in sequence space. That is exactly what I believe to be impossible, and I try to explain why.

    My theory is that evolution was guided by design to find those targets.

    So, no strawman at all. I am criticizing exactly what you believe.

    ATP-binding could serve a number of functions, especially if we are considering the time period it originated (extremely early in life’s history).
    Binding ATP will stabilize the phosphate groups and limit auto-hydrolysis (preserving energy).
    Binding ATP can also act as a mechanism to increase its local concentrations.
    Binding ATP can be a basic energy storage mechanism by sequestering free ATP
    Binding ATP will also drive chemical reactions by affecting reactant/product ratios.

    There’s a few right off the top of my head.

    So, I suppose that if I ask you for any reference that shows with facts that Szostak’s ATP binding proteins, implementing even one of those functions, can give a reproductive advantage in any real biological context, and therefore be naturally selectable, you will promptly give it? Or will you just accuse me of harassing you?

    Until then, let’s say that you are just imagining things “off the top of your head”.

    Of course the poster-child of confirmation bias will find a way to sweep them under the rug along with everything else I say.

    If I really swept all stupid and wrong statements under my rug, I could never use my rug again.

    Let’s say that I am satisfied with showing that they are stupid and/or wrong.

  192. 192
  193. 193
    kairosfocus says:

    CD, case proved as you can see for yourself. KF

  194. 194
    forexhr says:

    Corey Delvine: “…Evolution does not look for a specific target. Certainly not in sequence space.”

    That’s really an odd statement to make. Imagine that we have an ecological or environmental area that is inhabited by some organisms. Sources of food in this area are drying up and the population of organisms is introduced to a new environment. In this new environment there are plenty of energy rich substances. But the problem is that genes for a metabolic pathway to convert these substances into usable energy do not exist in the gene pool of that population. A metabolic pathway with such ability consists of two enzymes. Thus, the information that codes for these enzymes is not present in the DNA, just like information that codes for eyes, heart or wings was not present in the DNA of the first self-replicating organism. Question: if evolution does not look for a specific target (specific DNA sequences) in a sequence space of some junk DNA, then how do the adaptive sequences enter the gene pool of the population for natural selection to act on them and drive evolution? Do they appear, emerge, arise or burst onto the scene with the wave of a magic wand?

  195. 195
    Dionisio says:

    gpuccio @159:

    “A regulatory function linked to the biochemical structure of the nucleotide sequence has nothing to do with the potential symbolic function of those same nucleotides to code for functional proteins.”

    is also the main reason (but not the only one) why the imagined transition from some hypothetical RNA world to a DNA-RNA-protein world (the only one we know to exist) is simply impossible.

    gpuccio @167:

    Let’s say that we have a sequence of nucleotides A.

    Let’s say that, in the fabulous RNA world, sequence A is functional: it has some specific biochemical activity, for which it has been selected and preserved.

    The functional information in sequence A is also passed to new generations because, of course, a sequence of nucleotides can be copied by some RNA polymerase activity, effected by the RNA itself.

    This is, I suppose, the central idea for the RNA world: RNA can be both an effector molecule and an information storing molecule.

    Now, let’s say that we make the transition to an RNA-protein world.

    Let’s say that the function which was effected by A in the RNA world should now be inherited by a protein Ap.

    Well, there is absolutely no way that the information in A (the nucleotide sequence) can be “transferred” to some other nucleotide sequence (let’s call it A1) which can code for the protein Ap.

    Why?

    Because the information in A1 for Ap must of course be coded according to the symbolic genetic code.

    IOWs A1 (which should code for Ap) has absolutely no relationship with A (the nucleotide sequence which effected the function in the RNA world).

    They are two completely different types of information, because they have information for the same function, but in two completely different ways:

    a) A has information for an RNA molecule, whose 3d structure and biochemical activity depend directly on the nucleotide sequence, according to objective biochemical laws (IOWs, in the same way that a protein structure and function depend on its AA sequence).

    b) A1 has information for a sequence of AAs, corresponding to a functional protein, but that sequence of AAs, and therefore the structure and function of the protein, depend on the nucleotide sequence only symbolically, through an arbitrary code.

    So, there is absolutely no way that the information in A can generate the information in A1 by non design mechanisms.

    IOWs, the RNA world is no “precursor” to the protein world: all the information in the RNA world will be lost in the supposed transition to the protein world. IOWs, the protein world could as well arise from scratch, as far as functional information is concerned.

    What about this?

    Frozen Accident Pushing 50: Stereochemistry, Expansion, and Chance in the Evolution of the Genetic Code
    Eugene V. Koonin
    Life (Basel). 7(2): 22.
    doi: 10.3390/life7020022
    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5492144/pdf/life-07-00022.pdf

  196. 196
    gpuccio says:

    Dionisio:

    I can easily quote myself (post #155, answering you):

    “Well, the 3 papers you quote are, in a sense, honest enough, because they clearly show how little we understand of the issue!

    They are all of the type: we understand practically nothing of how this happened, but let me suggest some new revealing idea!

    Of course, after the revealing idea is given, we still understand practically nothing.

    That could well apply to this recent paper by Koonin. While it is interesting and detailed in explaining why all existing theories about the origin of the genetic code are trash, it is completely vague and unconvincing in trying to suggest some new form of trash.

    However, my point was not so much about how the genetic code originated (which remains an absolute mystery): my point was rather that, anyway, all the existing information in RNA molecules cannot be transmitted to a new symbolic code of information.

  197. 197
    ET says:

    CD:

    Sorry, but Gpuccio’s track record pales in comparison to the body of work that supports evolution.

    What body of work supports evolution by means of blind, mindless processes?

    Evolution does not look for a specific target.
    Certainly not in sequence space.

    And yet you expect us to believe it found it many, many, many times over.

    How much sheer dumb luck does your position require?

  198. 198
    Dionisio says:

    gpuccio @196:

    Thanks for the insightful commentary.

  199. 199
    Dionisio says:

    gpuccio @191:
    “If I really swept all stupid and wrong statements under my rug, I could never use my rug again.

    Let’s say that I am satisfied with showing that they are stupid and/or wrong.”

    That rug would have to be thrown into a special container for highly contaminated trash.

  200. 200
    Dionisio says:

    gpuccio @196:

    Your assessment of the bottom line message of the paper referenced @195 is right on target. The author’s disclaimer confirms it:

    “As a disclaimer, I should note that no attempt is made here on anything close to a comprehensive review of the research on the code origin and evolution let alone the origin of life. The goal is to place the frozen accident concept into the context of latest efforts in the field and briefly discuss some new ideas.”

    BTW, they have more on that topic -including another paper by Koonin here:
    http://www.annualreviews.org/d.....116-024713

  201. 201
    gpuccio says:

    EugeneS:

    OK, I have just published an OP about bioinformatics tools. I hope you may enjoy it! 🙂

  202. 202
    Corey Delvine says:

    I’m sure you’re happy about that, Gpuccio, because you wouldn’t survive in the ballpark of actual scientific research.

    Peer-review is the standard metric for today’s scientific research.
    Without some sort of peer-review, your claims are not scientific, but mere Gpuccio pipe-dreams; and hence you will remain in the realm of “scientific.”
    That is unless I can win a Nobel for just claiming to have cured cancer.

    Ah, I see, you judge the merit of scientific research based on the number of times the word “design” pops up.
    Got it.

    You should know this stuff already gpucc, if you’re going to try to talk about protein evolution.
    “Functional rapidly folding proteins from simplified amino acid sequences”
    It was five amino acids, not four though (excuse me!).
    Baker’s work has also shown that typically 95% of residues tolerate amino acid substitutions.

    Look, the complex cells of today are extremely different from the first living protocells, we’re going to have to speak somewhat hypothetically about ATP-binding/function while also grounding ourselves in basic biochemical processes.
    If your lack of knowledge about molecular biology keeps you from doing this, then I apologize.

    Also, just FYI, you can do all this “science” within R, but maybe it’s an “old dog, new tricks” issue.

  203. 203
    Mung says:

    I’m sure you’re happy about that, Gpuccio, because you wouldn’t survive in the ballpark of actual scientific research.

    Could have predicted the insults were coming.

  204. 204
    Mung says:

    Corey Delvine:

    Look, the complex cells of today are extremely different from the first living protocells, we’re going to have to speak somewhat hypothetically about ATP-binding/function while also grounding ourselves in basic biochemical processes.

    I have some of the first living protocells right here next to me and they are at least as complex as any modern cell and very much the same.

    Of course, you would just look at them and compare them to modern cells and declare ex cathedra that they could not possibly be ancient protocells because they are far too much like modern cells.

    Of course, that is just begging the question. But hey, if that’s all you have!

  205. 205
    gpuccio says:

    Corey Delvine:

    OK, we are at name calling at last! Good. 🙂

    However, in the midst of name calling, it seems that you have found your lost reference. Good. 🙂

    “Functional rapidly folding proteins from simplified amino acid sequences”

    Now, I can answer in detail to your statements. But I have not the time now, because it requires some work, and you certainly deserve it. I hope later today!

    In the meantime, as you have again “not-quoted” another paper:

    “Baker’s work has also shown that typically 95% of residues tolerate amino acid substitutions.”

    which seems to be on a similar subject, could you please find the reference for it too, so that I can answer about that too?

    Sorry to ask you for so much work! 🙂

  206. 206
    gpuccio says:

    Corey Delvine:

    Here are the answers about your statement and the paper you quoted to support it.

    Your statement was (at #171):

    experiments have swapped amino acids in proteins, heck they’ve even stripped all 20 amino acids away and rebuilt proteins using only 4 amino acids and the protein was still functional

    And the paper you quote in support of that statement is the following:

    “Functional rapidly folding proteins from simplified amino acid sequences”

    https://www.nature.com/articles/nsb1097-805

    The article is from 1997, not exactly recent, but that’s not a problem.

    Now, the question is: does that paper support your statement?

    The answer, of course is: no!

    Let’s see why.

    Here is the abstract:

    “Early protein synthesis is thought to have involved a reduced amino acid alphabet. What is the minimum number of amino acids that would have been needed to encode complex protein folds similar to those found in nature today? Here we show that a small β-sheet protein, the SH3 domain, can be largely encoded by a five letter amino acid alphabet but not by a three letter alphabet. Furthermore, despite the dramatic changes in sequence, the folding rates of the reduced alphabet proteins are very close to that of the naturally occurring SH3 domain. This finding suggests that despite the vast size of the search space, the rapid folding of biological sequences to their native states is not the result of extensive evolutionary optimization. Instead, the results support the idea that the interactions which stabilize the native state induce a funnel shape to the free energy landscape sufficient to guide the folding polypeptide chain to the proper structure.”

    A little bit vague, I would say. But we will see more details in our discussion.

    First of all, what did they experiment with?

    Not “a protein”, as you incorrectly state, but a protein domain.

    Moreover, a very small protein domain: the SH3 domain.

    Now, this domain is not only small (in the paper, it is given as 57 AAs long), but represents also a big protein superfamily in SCOP: 1shf. This superfamily includes many different proteins, which share a similar folding and structure, a beta-sheet based structure, but have rather different sequences and functions. See for example the Wikipedia page on the domain, here:

    https://en.wikipedia.org/wiki/SH3_domain

    We will come back to these points later.

    So, what particular SH3 domain sequence was used in our paper?

    Well, it is not completely clear (they just call it “src SH3 domain” in the Methods section). But, luckily, they give the full sequence. Here it is:

    TFVALYDYESRTETDLSFKKGERLQIVNNTEGDWWLAHSLSTGRTGYIPSNYVAPSD

    Now, that’s strange, because blasting that sequence I could find no identical sequence in the Blast database. However, the sequence is almost identical to the SH3 domain in one human protein (and in other vertebrate proteins):

    Proto-oncogene tyrosine-protein kinase Src, P12931, 536 AAs long.

    Almost identical: 56 identities + 1 positive (the R in position 44). As I could find no better hit I will consider the sequence as the domain from that human protein, for all practical purposes.
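
    The numbers quoted here can be checked directly against the sequence as given above. What follows is only a small sketch: the variable names are mine, and the count of 56 identities is simply taken from the Blast result reported in this comment, not recomputed.

```python
# Sequence of the src SH3 domain as given in the paper's Methods section.
SRC_SH3 = "TFVALYDYESRTETDLSFKKGERLQIVNNTEGDWWLAHSLSTGRTGYIPSNYVAPSD"

length = len(SRC_SH3)              # domain length in amino acids
residue_44 = SRC_SH3[44 - 1]       # position 44, counting from 1

identities = 56                    # taken from the reported Blast alignment
percent_identity = 100 * identities / length

print(f"length: {length} AAs")
print(f"residue at position 44: {residue_44}")
print(f"identity to the human Src SH3 domain: {percent_identity:.1f}%")
```

    So the sequence is indeed 57 AAs long, position 44 is the R, and 56 identities out of 57 correspond to about 98% identity.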

    That this domain is a short and rather simple domain can be verified, for example, looking at this paper:

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2143930/pdf/9521098.pdf

    Fig. 6 C. The paper gives domain length as ranging from 36 to 692 AAs, and peaking around 100. With 57, we are here certainly in the lower tail of the distribution, probably about 10th – 20th percentile. This is important, because shorter domains are of course, on average, less complex.

    The function of the SH3 domain is, in general, to bind something, usually (but not always) peptides. From Wikipedia:

    The classical SH3 domain is usually found in proteins that interact with other proteins and mediate assembly of specific protein complexes, typically via binding to proline-rich peptides in their respective binding partner.

    OK, no more time now. More in next post, later.

  207. 207
    forexhr says:

    Corey Delvine:

    It is funny to see how you attack the character of gpuccio just because he is demonstrating mathematically and empirically that there hasn’t been enough fuel for evolution to occur.

    Evolution is a nice theory on paper – we have a population and its gene pool which is steadily changed due to mutations and new sexual recombinations of genes. Natural selection then sorts out the useful changes in the gene pool and population evolves. Beneficial new genes quickly spread through the population because members who carry them have a greater reproductive success or evolutionary fitness.

    But, what is ignored in this theoretical approach is whether changes that entered the gene pool during some evolutionary period are sufficient to explore the sequence space of some duplicated genes and find ones that are beneficial. Empirical science and mathematics answered a very definite ‘no’ to this question since gene sequence space is extremely sparsely populated by genes that are beneficial in the environment to which the population is exposed. For example, the human population has been exposed to aquatic environments for a hundred thousand generations and tons and tons of mutations have entered its gene pool, but no beneficial genes (for breathing under water) have been found. This is because the ratio between non-beneficial and beneficial genes for this specific environment is so large that even if all mutations which have occurred in the evolutionary history are spent, no genes for breathing under water would enter the gene pool of the human population.

    Simply put, there hasn’t been enough fuel for evolution to run. And this is what gpuccio is trying to tell you. But instead of accepting or at least exploring this simple and obvious scientific truth, you attack the person who posted it. Though, what can you expect from a Darwinist?

  208. 208
    Corey Delvine says:

    Gpucc, if you are complaining that they only altered a domain of a protein then you are ignorant of molecular biology techniques.
    If you had actually read the paper instead of just the abstract, you’d see that even for this short amino acid segment, they had to break it up into 3 sections to generate variant libraries.

    So once again, much like your demand that researchers build an entire ecosystem to demonstrate selectability and therefore function, you are simply being ridiculous.

    For you to say that the paper doesn’t support what I have said because it’s a domain and not an entire protein, means you are either clueless or a liar.
    So which is it?

  209. 209
    kurx78 says:

    I just see more Ad Hominems… and that’s quite sad.
    It’s sad because even if gpuccio is wrong with his argumentation, you are totally unable to sustain an educated discussion, you just want to humiliate, to denigrate, to destroy and that speaks very poorly of you as a professional and a scientist. (In case you are one)
    As an engineer and scientist I always try to keep my discussions educated and polite. Every dialog with someone else is an opportunity to learn new things and examine my own findings. That’s what good science is about, to question, to examine and improve our understanding of the natural world.
    You are not contributing to this conversation, you are just rambling and barking up trees. I may disagree with people like Critical Rationalist or rvb8, but their contributions to the dialog are sometimes amazing and very thoughtful.
    What’s exactly the point of arguing with you anyway?

  210. 210
    Corey Delvine says:

    Things are more interesting and fun when you do it my way Kurx, and I just like to push people’s buttons.
    And I couldn’t care less about how I am perceived here.

    I do find it disgusting though, how the “science” behind the ID movement preys on the lack of knowledge most people have in the field of biology.
    They construct houses of cards much like gpuccio has done here, that are based on strawman arguments (as I have continually pointed out) and simply baseless/untrue claims about the current evolutionary theory.

    I’m just doing my part to push back, no matter how futile.

  211. 211
    gpuccio says:

    Corey Delvine:

    You should at least leave me the time to complete my reasoning. I was still at the very first premises.

    However, it is absolutely true that a first important error in your statement has been demonstrated.

    You said:

    “experiments have swapped amino acids in proteins, heck they’ve even stripped all 20 amino acids away and rebuilt proteins using only 4 amino acids and the protein was still functional”

    Your statement refers explicitly to proteins, and then in particular to a protein. But the paper you quoted is about a protein domain, a short protein domain which is part of much longer proteins. This is an error, in my world.

    I am not complaining about the paper (not yet, at least). I am pointing to an explicit error in your statement, an important error that could certainly mislead those who read your comment and have great faith in what you say (I am not part of the lot) to believe that entire proteins, probably a lot of them, or at least one entire protein, had been, as you say, “stripped of all 20 amino acids and rebuilt using only 4 amino acids, and the protein was still functional”.

    That is simply false. Your statement is wrong about that, and can only generate false convictions.

    I am simply pointing to that undeniable error. Am I “clueless or a liar” for that?

    However, that was only a premise to my criticism of your statement, and in part of the paper itself. Please, be patient and give me the time to write other clueless lies.

  212. 212
    gpuccio says:

    Corey Delvine:

    “I do find it disgusting though, how the “science” behind the ID movement preys on the lack of knowledge most people have in the field of biology.
    They construct houses of cards much like gpuccio has done here, that are based on strawman arguments (as I have continually pointed out) and simply baseless/untrue claims about the current evolutionary theory.

    I’m just doing my part to push back, no matter how futile.”

    Don’t be humble. You are a hero!

  213. 213
    gpuccio says:

    kurx78:

    “What’s exactly the point of arguing with you anyway?”

    There is certainly no point in discussing with Corey. His bad faith is obvious.

    However, it is often interesting to discuss some of the things that Corey says. He has been a good source of inspiration for good debates. His quote of that paper (when he finally gave it) is interesting, because it allows a detailed discussion on some serious errors made by darwinists (both Corey and the authors of the paper).

    If I can find the time to complete that discussion! 🙂

  214. 214
    Corey Delvine says:

    “This is an error, in my world.”
    With every word you type, it seems that Gpuccio’s world is more and more different from the real world.

    In my world, the word protein refers to any segment of amino acids over 50 residues.
    This is also the typically accepted minimum length requirement to be called a protein among biologists.
    So once again, your argument is entirely dependent on your own personal definitions of biological terms, which is in opposition to common definitions.

    So no, no error on my part.

    You say that because the 57 amino acid segment is “just” a protein domain, and not a full-length protein, it doesn’t support my claim.
    And for that you are either clueless, or a liar.

  215. 215
    forexhr says:

    Corey Delvine: “I do find it disgusting though, how the “science” behind the ID movement preys on the lack of knowledge most people have in the field of biology.”

    There is no such thing as lack of knowledge in determining either the probabilistic resources of biological world or the ratio between target space and search space in a specific biological context. Given our biological knowledge these numbers are pretty easy to determine. The problem is that most researchers in the field of evolutionary biology deny them and resist them because it is imperative for them to continue to secure funding and employment. It always boils down to ‘follow the money’. Getting paid for empty storytelling about unseen past events is easy money for them.

  216. 216
    gpuccio says:

    Corey Delvine:

    “With every word you type, it seems that Gpuucio’s world is more and more different from the real world.”

    It is certainly different from your world. Luckily.

    “In my world, the word protein refers to any segment of amino acids over 50 residues.”

    From Wikipedia:

    “Proteins (/ˈproʊˌtiːnz/ or /ˈproʊti.ɪnz/) are large biomolecules, or macromolecules, consisting of one or more long chains of amino acid residues”.

    “Many proteins are composed of several protein domains, i.e. segments of a protein that fold into distinct structural units.”

    IOWs, a protein is a molecule, a domain is a segment of a protein, and is not an individual molecule.

    Your definition for protein:

    “any segment of amino acids over 50 residues”

    is simply wrong.

    Length is useful to distinguish between proteins and peptides, which are both complete molecules:

    Always from Wikipedia:

    “The words protein, polypeptide, and peptide are a little ambiguous and can overlap in meaning. Protein is generally used to refer to the complete biological molecule in a stable conformation, whereas peptide is generally reserved for a short amino acid oligomers often lacking a stable three-dimensional structure. However, the boundary between the two is not well defined and usually lies near 20–30 residues.[5] Polypeptide can refer to any single linear chain of amino acids, usually regardless of length, but often implies an absence of a defined conformation.”

    So, peptides are shorter molecules, while proteins are longer molecules, but both are whole molecules.

    Domains are by definition functional subunits of proteins, they are not whole molecules.

    In particular, the SH3 domain is part of a lot of much longer proteins. It is not a protein.

    Always from Wikipedia:

    “The SRC Homology 3 Domain (or SH3 domain) is a small protein domain of about 60 amino acid residues. Initially, SH3 was described as a conserved sequence in the viral adaptor protein v-Crk. This domain is also present in the molecules of phospholipase and several cytoplasmic tyrosine kinases such as Abl and Src.[1][2] It has also been identified in several other protein families such as: PI3 Kinase, Ras GTPase-activating protein, CDC24 and cdc25.[3][4][5] SH3 domains are found in proteins of signaling pathways regulating the cytoskeleton, the Ras protein, and the Src kinase and many others. ”

    Your world is full of gross errors, but it is also a world where you never admit one.

    Just a piece of friendly advice: next time, ask your biology teacher first, if you have one.

  217. 217
    gpuccio says:

    Corey Delvine:

    “You say that because the 57 amino acid segment is “just” a protein domain, and not a full-length protein, it doesn’t support my claim.”

    It certainly does not support your claim about proteins. Try to formulate again your claim, making it about one short protein domain, and at least that part will be correct.

    The rest, I still have to deal with.

  218. 218
    Corey Delvine says:

    The basic premise of Gpuccio’s argument is based on a lack of knowledge, forexhr.

    His entire argument depends on this: function = sequence homology
    But this is false and I have pointed to numerous lines of experimental evidence that show this.
    This also makes his sequence space calculations completely worthless.

    This is all evident in his apparent belief that evolution has to find a “specific” sequence in order to get function,
    which is, again, completely wrong.

  219. 219
    Origenes says:

    Corey Delvin @208

    CD: So once again, much like your demand that researchers build an entire ecosystem to demonstrate selectability and therefore function, you are simply being ridiculous.

    Here you repeat the blunder you previously made in the thread What are the limits of Natural Selection? A blunder based on a complete misunderstanding of GPuccio’s argument concerning artificial selection.
    Here is my comment again:

    … Corey Delvin is a guy who believes that natural selection in the lab requires building “an entire ecosystem and monitor every single aspect for millions of years” (post #228, #232 and #239). This is not at all a joke by Corey, he seriously holds that when you [GPuccio] are talking about artificial selection that you are objecting to the absence of such an ecosystem. That is his genuine understanding of the discussion….
    No one has commented on this, because it is simply stupid beyond belief.

  220. 220
    gpuccio says:

    Corey Delvine:

    “His entire argument depends on this: function = sequence homology”

    As usual, you have it all wrong.

    My entire argument depends on this:

    sequence homology (conserved for long evolutionary times) = function

    Not the other way round.

    Sometimes I really wonder if you really don’t understand things, or if you are just doing it for the hell of it!

    OK, it’s not important. You are an occasion to explain important points, and I am grateful to you for the role you play, either unwillingly or intentionally.

    You say:

    “This is all evident in his apparent belief that evolution has to find a “specific” sequence in order to get function, which is, again, completely wrong.”

    Evolution certainly has to find a specific sequence that can implement some specific function, for that function to be present. Even a child would understand that.

    Of course, there is often more than one specific sequence that can implement a function. It’s called the target space, and it is one of the basic concepts in ID.

  221. 221
    gpuccio says:

    Origenes:

    Thank you. I have the impression that our “friend” is simply trying to prevent me from completing my argument about “his” paper, because I am spending the little time I have answering his blunders.

  222. 222
    Corey Delvine says:

    Ah yes, wikipedia trumps all.
    Now I know with 100% certainty that we are doing “science”

    And just for you Gpucc, I will restate:
    Baker rebuilt a protein DOMAIN using just 5 amino acids and it folded properly and remained functional.

    Protein or protein domain, it does not affect my claims in any way.
    But the fact that the function is retained with this highly simplified amino acid repertoire does shred your personal beliefs about sequence space.

    Of course you will find a way to remain ignorant though.

  223. 223
    Origenes says:

    GPuccio @221

    There is this saying:

    “Never start a fight with an idiot, he’ll only pull you down to his level and beat you with his stupidity.”

  224. 224
    Corey Delvine says:

    “sequence homology (conserved for long evolutionary times) = function”

    but

    function =/= sequence homology (conserved for long evolutionary times)

    This should tell you there is something inherently (and glaringly) wrong with your approach.

    Also, that’s not how (correct) equations work.

  225. 225
    ET says:

    Corey D:

    I do find it disgusting though, how the “science” behind the ID movement preys on the lack of knowledge most people have in the field of biology.

    Totally clueless. Evolutionism is alive and well due to lack of knowledge. Nice own goal

  226. 226
    ET says:

    Corey D:

    In my world, the word protein refers to any segment of amino acids over 50 residues.

    Fine. Now demonstrate that blind and mindless processes can produce one. Or admit that your position is for losers.

  227. 227
    Corey Delvine says:

    ‘Corey Delvin is a guy who believes that natural selection in the lab requires building “an entire ecosystem and monitor every single aspect for millions of years”’

    No, that is what Gpuccio requires.

    Any experiment on the evolution of protein function, that doesn’t support Gpuccio’s worldview, he lumps into the category of “artificial selection.”
    His reasoning is that (in his opinion) the experiment did not do a good enough job demonstrating biological relevance.
    Which of course would be understandable, except for the fact that his requirements for biological relevance are absurd:
    It has to be in a living organism, you have to see it increase fitness, etc.
    Hence, why the only thing he seems to admit is natural selection is a study in viruses.
    These are all ridiculous requirements when taking into account the complexities of the molecular biology techniques that would be required.
    To do these experiments on bacterial or eukaryotic proteins, things must be simplified.

  228. 228
    Dionisio says:

    Mung @203:

    “Could have predicted the insults were coming.”

    It should have been obvious given the lack of serious arguments by the dissenting commenter.

    Running out of arguments leads some folks -unwilling to search for the truth- to insulting their opponents.

    Nothing new. It’s been that way since the beginning of history.

  229. 229
    Corey Delvine says:

    “My entire argument depends on this:
    sequence homology (conserved for long evolutionary times) = function
    Not the other way round.”

    You don’t even understand your own argument.

    The first part of your argument depends on
    sequence homology (conserved for long evolutionary times) = function.
    When you blast a human protein to another organism and infer that very high homology suggests the same function.
    And I’m fine with that.

    But the second you start making calculations about sequence space based on conservation of sequence homology and conclude that the jump in homology coincides with a rapid evolution of function, you are implying that function = conserved homology.
    Which, it does not.

  230. 230
    Origenes says:

    Corey Delvin @227

    Origenes: Corey Delvin is a guy who believes that natural selection in the lab requires building “an entire ecosystem and monitor every single aspect for millions of years. … it is simply stupid beyond belief.

    Corey Delvin: No, that is what Gpuccio requires.

    You are absolutely wrong. 100%. You cannot provide one single quote by GPuccio that supports your absurd claim.

  231. 231
    forexhr says:

    Corey Delvine: “This is all evident in his apparent belief that evolution has to find a “specific” sequence in order to get function, which is, again, completely wrong.”

    Whether a specific function is coded with homologous or non-homologous sequences is completely unrelated to the ratio between target space and search space for the environment where this function is beneficial. If these non-homologous sequences (TTT, CGC and ACA) are functional in a specific environment (target space of size 3), the ratio (3/4^3) would not change if those sequences were homologous, e.g. AAA, AAC, AAT. Your assertion is therefore beside the point.
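
    The ratio in the example above can be checked with a minimal sketch. The function name `target_ratio` is only illustrative; the two sets of triplets are the ones given in the comment.

```python
ALPHABET = "ACGT"  # the four nucleotides

def target_ratio(targets, length):
    """Fraction of all sequences of the given length that belong to the target set."""
    search_space = len(ALPHABET) ** length   # 4^3 = 64 for triplets
    return len(set(targets)) / search_space

non_homologous = ["TTT", "CGC", "ACA"]
homologous = ["AAA", "AAC", "AAT"]

ratio_a = target_ratio(non_homologous, 3)
ratio_b = target_ratio(homologous, 3)
print(ratio_a, ratio_b)  # both equal 3/64
```

    The ratio depends only on how many sequences are in the target set, not on how similar they are to each other.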

  232. 232
    Corey Delvine says:

    Did you not read the rest of #227, Origenes?

  233. 233
    gpuccio says:

    Corey Delvine:

    p–>q does not imply that q–>p.

    And now, let’s go to the paper.

  234. 234
    Corey Delvine says:

    p = q
    implies that
    q = p

    don’t obfuscate

  235. 235
    Origenes says:

    Corey Delvin @232

    Origenes: You cannot provide one single quote by GPuccio that supports your absurd claim.

    Corey Delvin: Did you not read the rest of #227, Origenes?

    I did. You are not quoting GPuccio. All you offer is your warped view of his argument.

  236. 236
    Corey Delvine says:

    Warped?
    It is an exact summary of how the previous conversation went.
    You’re now in the same troll-boat as ET, Dio, and Mungy

  237. 237
    Origenes says:

    Corey Delvin @229

    GPuccio: “My entire argument depends on this:
    sequence homology (conserved for long evolutionary times) = function
    Not the other way round.”

    Corey Delvin: The first part of your argument depends on sequence homology (conserved for long evolutionary times) = function.

    Indeed, when a sequence is conserved for long evolutionary times, we can safely infer that this particular sequence has biological function. BTW this means we do not have to know which specific function, in order to know that it has function.

    Corey: When you blast a human protein to another organism and infer that very high homology suggests the same function. And I’m fine with that.

    I don’t think that GPuccio is very concerned with the question if the function is the same or not. Please GPuccio, correct me if I am wrong. But, why would that be relevant? It is enough to know that it is functional, based on being conserved for long evolutionary times. The main point of blasting, as I understand it, is that we can see the evolutionary history of the functional sequence.

    Corey: But the second you start making calculations about sequence space based on conservation of sequence homology and conclude that the jump in homology coincides with a rapid evolution of function …

    If we see the sudden addition of thousands of bits followed by extreme conservation of the new sequence, what other conclusion can be drawn?

    Corey: … you are implying that function = conserved homology.

    Not at all. We see that the new sequence is conserved for long evolutionary times and from this we infer that it is a (new) functional sequence.
    There is no reason whatsoever to have it backwards.

    – – –
    Corey Delvin @236

    You cannot provide one single quote by GPuccio that supports your idiotic claim.

  238. 238
    gpuccio says:

    To all interested:

    Let’s recall Corey’s statement:

    “experiments have swapped amino acids in proteins, heck they’ve even stripped all 20 amino acids away and rebuilt proteins using only 4 amino acids and the protein was still functional”

    1) We have already found a first important error: exchanging a protein domain for a protein.

    2) Corey has graciously anticipated another small and obvious error: it’s not 4 AAs, but 5. Not so important.

    3) Let’s go to the statement that the protein, I beg your pardon the protein domain, was “stripped away of all 20 amino acids and rebuilt”. Is that true?

    No.

    The aim of the experiment is to simplify the sequence according to a 5 AAs alphabet. The chosen AAs are I,K,E,A and G.

    But, as the authors say, simplification was attempted only in the “residues not implied in binding”, IOWs in the residues not directly involved in the function of the domain. IOWs, the most important residues from the point of view of function were not touched at all.

    How many are the residues where simplification was simply not attempted?

    They are 12.

    Let’s go to Fig. 2b, which shows the sequence of the two “functional” (more on that later) simplified domains that were the final result of the whole process, as compared to the sequence of the wildtype.

    residues (from left): 6,7,8,15,33,34,35,47,49,50,51,52

    12 AAs (21% of the molecule) were left as in the wild type, IOWs as they were from an alphabet of 20 AAs!

    But there is more.

    Of the 45 AAs where the simplification was attempted, it was achieved (in the final “functional” two sequences) only in 40 out of 45 residues, the authors say.

    Indeed, they say something slightly different, and we will see the reason for that. It’s a real trick they are trying, so be very careful.

    What they say is:

    “In the more simplified variant, FP2” (IOWs their best result) “40 of the 45 residues at which simplification was attempted are I,K,E,A or G.”

    Strange way to say it. But there is a reason for that. A reason that is a subtle deception.

    But for the moment, let’s acknowledge this simple truth: the simplification failed in 5 residues where it was attempted.

    But in the Figure I can find only 4 of them.

    Residues 2,16,17,42

    The fifth residue in black, the S at 50, is indeed one of the 12 AAs mentioned before. So, there must be some error in the paper. OK, not important.

    Let’s say that in FP2 simplification failed in 4 residues.

    That makes: 12 + 4 = 16 AAs, 28% of the sequence which has not been simplified at all.

    But there is more. Here comes the trick, the deception.

    The authors say:

    “In the more simplified variant, FP2” (IOWs their best result) “40 of the 45 residues at which simplification was attempted are I,K,E,A or G.”

    So, it would seem that 40 (or 41) of the residues have been “simplified”, doesn’t it?

    But that’s not the case.

    If you look at the sequence of FP2, you will see that 13 residues:

    Residues 4,19,20,21,22,26,31,32,37,43,46,48,54

    were already I,K,E,A or G in the wild type, and that they have not changed in FP2. Not at all. IOWs, they have not been “simplified” at all, they have remained the same, and they were already among the 5 chosen AAs from the beginning. Coming, of course, from an alphabet of 20 AAs.

    Can you see the subtle deception?

    So, let’s see. We have:

    12 + 4 + 13 = 29 AAs (50.9% of the whole sequence) where no simplification has been accomplished at all.
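
    The tally can be reproduced with a short sketch, using the residue positions listed in this comment and the 57-residue length mentioned earlier in the thread. The variable names are only illustrative.

```python
DOMAIN_LENGTH = 57  # length of the SH3 segment discussed in the thread

# Residue positions as listed in the comment (read from Fig. 2b of the paper)
not_attempted = [6, 7, 8, 15, 33, 34, 35, 47, 49, 50, 51, 52]
failed = [2, 16, 17, 42]
already_ikeag = [4, 19, 20, 21, 22, 26, 31, 32, 37, 43, 46, 48, 54]

# The union guards against counting a position twice
unchanged = set(not_attempted) | set(failed) | set(already_ikeag)
simplified = DOMAIN_LENGTH - len(unchanged)

print(f"unchanged: {len(unchanged)} of {DOMAIN_LENGTH} = {len(unchanged) / DOMAIN_LENGTH:.1%}")
print(f"simplified: {simplified} of {DOMAIN_LENGTH} = {simplified / DOMAIN_LENGTH:.1%}")
```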

    So, is Corey’s statement that:

    “heck they’ve even stripped all 20 amino acids away and rebuilt proteins using only 4 amino acids”

    accurate? Is that even close to truth?

    Of course not. Another big, big error.

    But I must admit that here it’s not all Corey’s fault.

    He should have noticed that in 12 important AAs no simplification was even attempted, because that is clearly stated by the authors. So, he’s responsible for that error.

    He should have noticed that in 4 or 5 AAs the simplification simply did not work, because that is clearly stated by the authors. So, he’s responsible for that error.

    But he is not fully responsible for not noticing that 13 AAs had not changed from the wildtype, because they were already AAs included in the chosen 5 from the beginning. The authors do not say that clearly. Indeed, it seems that they try to hide that important fact.

    So, third big error. They are already a lot, aren’t they, Corey?

    I have still much to say about SH3 and the paper, and other papers on the subject. Maybe I will not find other big errors from Corey, but probably some imprecision that deserves clarification.

    But that will have to wait until tomorrow. 🙂

  239. 239
    Mung says:

    Corey Delvine:

    I’m just doing my part to push back, no matter how futile.

    Perhaps change your username to Sisyphus.

  240. 240
    Corey Delvine says:

    You cannot draw the conclusion that the function also “jumped” into existence solely based on the appearance of a certain amino acid sequence.

    This is where gpuccio assumes that only this conserved sequence represents the only functional sequence of said protein.
    Do I have to spell it out for you?
    This is where Gpuccio is implying that function = conserved sequence homology.
    And this is completely incorrect.

  241. 241
    Mung says:

    gpuccio:

    The chosen AAs are IKEA and G.

    Must be a Swedish team.

  242. 242
    gpuccio says:

    Origenes:

    “I don’t think that GPuccio is very concerned with the question if the function is the same or not. Please GPuccio, correct me if I am wrong. But, why would that be relevant? It is enough to know that it is functional, based on being conserved for long evolutionary times. The main point of blasting, as I understand it, is that we can see the evolutionary history of the functional sequence.”

    Correct. But, of course, in the case of known proteins or known domains, it is easy to see that the function is more or less the same. That is not easy when we find conservation of sequences whose function is not known, as I have often done.

  243. 243
    gpuccio says:

    Mung:

    Yes, I was afraid that I was making undue advertising! 🙂

  244. 244
    gpuccio says:

    Corey Delvine:

    “This is where gpuccio assumes that only this conserved sequence represents the only functional sequence of said protein. Do I have to spell it out for you?
    This is where Gpuccio is implying that function = conserved sequence homology.”

    Not at all. I only assume that conservation is a very good way of measuring how much that sequence cannot change, how much it is functionally constrained.

    I am not saying that it is the only functional sequence. I am saying that the Blast bitscore is a good measure (probably an underestimate) of the functional information in the protein.

    Please, read my post #70. I paste here the relevant part, for your convenience:

    But let’s go back to an old friend, ATP synthase.

    As I have said many times, the beta chain in the F1 complex of the molecule has an astonishing conservation between bacteria and humans.

    Just to remind the numbers (humans and E. coli):

    ATP synthase beta chain (P06576, 553 AAs):

    334 identities, 383 positives, 663 bits

    Now, that is amazing.

    Consider that this result in bits is already a measure of the target space/search space ratio.

    Indeed, the search space for this protein is 2390 bits, about 10^719 states.

    Therefore, when we have 663 bits of functional information from the bitscore, that is already a very conservative value, because it is setting the target space at 1727 bits, IOWs a target space of 10^519 states!
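
    As a rough check of these figures, here is a minimal sketch that assumes, as the comment does, that the search space is 20^553 and that the target space is obtained by subtracting the bitscore from the search space in bits. The helper `bits_to_log10` is only illustrative.

```python
import math

LENGTH = 553     # residues in the ATP synthase beta chain (P06576)
BITSCORE = 663   # human vs E. coli BLAST bitscore quoted above

bits_per_position = math.log2(20)          # about 4.32 bits per AA position
search_bits = LENGTH * bits_per_position   # about 2390 bits
target_bits = search_bits - BITSCORE       # about 1727 bits

def bits_to_log10(bits):
    """Convert a size in bits to the exponent of the equivalent power of 10."""
    return bits * math.log10(2)

print(f"search space: {search_bits:.0f} bits, about 10^{int(bits_to_log10(search_bits))}")
print(f"target space: {target_bits:.0f} bits, about 10^{int(bits_to_log10(target_bits))}")
```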

    IMO, the blast bitscore is definitely underevaluating functional information. For example, it gives perfect identity a bitscore of about 2.2, which is half the potential information in one AA position (4.3 bits). That derives in part from how the bitscore is computed, but I still believe that it underevaluates functional information.

    However, as it is a measure that is very easy to obtain, and is universally considered a valid metric of homology, I have used that score in all my analyses.

    But my point is that the bitscore is already corrected, maybe hypercorrected, for the target space.

    I would not say that estimating a target space of 10^519 is the same as saying that “this conserved sequence represents the only functional sequence of said protein”. There are 519 orders of magnitude of difference between the two statements!

  245. 245
    gpuccio says:

    Corey Delvine:

    “p = q
    implies that
    q = p”

    Are you serious? Are you confounding logical implication with identity?

    Sometimes you amaze me! 🙂

    So, for you:

    “If p then q”

    is the same as

    p = q

    Good to know…
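
    The difference between the two readings can be made explicit with a small truth-table sketch. The helper `implies` is only illustrative: it encodes "if p then q" as material implication.

```python
from itertools import product

def implies(p, q):
    """Material implication: 'if p then q' is false only when p is true and q is false."""
    return (not p) or q

rows = list(product([False, True], repeat=2))

# Cases where p -> q holds but q -> p does not
counterexamples = [(p, q) for p, q in rows if implies(p, q) and not implies(q, p)]

# Equality, by contrast, is symmetric in every row
equality_symmetric = all((p == q) == (q == p) for p, q in rows)

print(counterexamples)      # [(False, True)]
print(equality_symmetric)   # True
```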

  246. 246
    gpuccio says:

    To all:

    While Corey goes on accumulating errors beyond any possible conception, I must say that I am happy that the discussion is still alive.

    We have still so many things to say: the failure of neo-darwinism is such a fascinating argument. 🙂

  247. 247
    Origenes says:

    Corey @240

    Corey: You cannot draw the conclusion that the function also “jumped” into existence solely based on the appearance of a certain amino acid sequence.

    Everyone agrees, but no one has suggested otherwise. Functionality of a sequence is simply inferred from the fact that it is conserved for long evolutionary times.

    Corey: This is where gpuccio assumes that only this conserved sequence represents the only functional sequence of said protein.

    Why would he? That is not at all required for his argument. Quote please.

    Corey: This is where Gpuccio is implying that function = conserved sequence homology. And this is completely incorrect.

    Frankly, I don’t even know what it is supposed to mean. How can biological function be ‘conserved sequence homology?’ This is all in your warped mind. It doesn’t make sense; gibberish.

  248. 248
    Dionisio says:

    Origenes, ET and Mung,

    According to the comment @236, I’ve been given the undeserved honor of sharing a boat with you.
    Can you tell me any details about that boat?
    Do you plan a trip soon?

  249. 249
    Joe Sixpack says:

    Gpuccio, I just stumbled on to this site and skimmed through the OP and thread. I haven’t given it enough thought to comment yet but I just wanted to commend you on a lively discussion. Too many of these devolve into insults puked on insults.

  250. 250
    ET says:

    You’re now in the same troll-boat as ET, Dio, and Mungy

    Which makes Corey the troll we are fishing for. The bait is good and I believe we have him hooked.

    Nicely done guys.

  251. 251
    Dionisio says:

    ET @250:

    LOL!

  252. 252
    gpuccio says:

    Joe Sixpack:

    Thank you! 🙂

    Any comment from you will be certainly appreciated.

  253. 253
    gpuccio says:

    To all:

    Now, a few more thoughts about the paper Corey kindly quoted. I have to cover two more big issues:

    a) How they did it, and what it means.

    b) The function.

    So, let’s start with the first.

    Here, Corey can be of some help. In his comment #208, he quickly summarizes the procedure:

    “If you had actually read the paper instead of just the abstract, you’d see that even for this short amino acid segment, they had to break it up into 3 sections to generate variant libraries.”

    Well, of course I have read the paper, and very carefully. So, I can say that Corey’s summary is perfectly correct: they had to break it into 3 sections, and do the search and artificial selection for each of them.

    IOWs, they simplified each third of the sequence using random libraries, restricted to the 5 chosen AAs, whose individual complexity “averaged 5×10^7” (see Methods).

    Then, for each third, they artificially selected the sequences that showed function according to the method of artificial selection they chose (which I will discuss better when I deal with the “function” problem): IOWs, they selected the “functional” sequences “by biopanning with proline-rich peptide covered paramagnetic beads”. In particular, the beads exhibited the short proline rich peptide RALPPLPRY, which is a ligand for our SH3 domain (see Methods).

    So, for each of the three subsections, simplified sequences that retained the ability to bind that ligand were selected. Those sequences are shown in Fig. 1 a,b,c. Of course, these are very partial simplifications, because for each simplified third, the remaining two thirds of the molecule have not changed at all.

    Here again the authors make their subtle deception rather explicit. For example, they state, about the first third:

    “For example, in the first third of the protein (Fig. 1a), 16 of the 19 residues not involved in binding were converted to I,K,E,A, or G in the most simplified sequences.”

    (emphasis mine)

    But, as already said, that’s absolutely not true. If you look at Fig. 1a, you will see that 3 of those 16 residues (4, 21 and 22) have never been converted at all, because in all shown sequences they are absolutely the same as in the wild type. And other residues have conserved the same value as in the wildtype in almost all the variants (for example, 19 and 20).

    But that’s not enough, of course. The authors had to join these partial results to get their final “simplified” sequences. They did that by “randomly splicing together the simplified segments of a number of the partially simplified variants”, and then artificially selecting the results, as previously described. Unfortunately, this step of “random splicing” is not detailed in the Methods section. However, we can safely assume that this further step of random search (the random splicing) and artificial selection implements a further exploration of the potential search space. This is an important point.

    In the end, they found two sequences that could be artificially selected after the whole “simplification” procedure. They are shown in Fig. 2b, and I have already discussed how, in the most “simplified” sequence (FP2), only 28 AAs (less than 50%) have been “simplified” (see my post #238).

    However, let’s take this very limited result for good, for the moment, and try to understand how it was attained. And, of course, make some quantitative assessments.

    It should be very clear that the procedure used here is, as usual, RV + artificial selection. IOWs, like in Szostak’s paper (random phage libraries for variation, beads exhibiting the ligand for selection).

    But here the RV was limited to 5 AAs, because that is the aim of the study.

    So, for 28 AAs (those where simplification was achieved) the search space here is:

    5^28 = 3.7E+19 = 65 bits
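
    A minimal sketch of this arithmetic, assuming 28 positions that can each take one of 5 amino acids:

```python
import math

ALPHABET_SIZE = 5   # I, K, E, A, G
POSITIONS = 28      # residues where simplification was achieved

search_space = ALPHABET_SIZE ** POSITIONS   # exact integer
search_bits = math.log2(search_space)

print(f"{search_space:.1e} sequences, about {search_bits:.0f} bits")
```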

    How much of that search space was traversed in the RV procedures?

    It’s easy. Each library was about 5×10^7. They used 3 independent libraries, selecting each time the results, before the last recombination step.

    Therefore they explored the equivalent of a space of about 1.25E+23, plus the component linked to the final recombination step, which we cannot evaluate because the size of the random procedure is not detailed.

    Let’s say, conservatively, that they have explored a search space of about 10^25.

    But, as we said before, the real search space is only of the order of 10^19.

    Therefore, we can safely assume that they have explored the search space rather completely, by their procedure of RV + artificial selection steps.
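
    The comparison can be sketched as follows. It takes at face value the assumption made in this comment that the three libraries combine multiplicatively; the variable names are only illustrative.

```python
LIBRARY_SIZE = 5 * 10**7   # average complexity of each library (from Methods, as quoted)
N_LIBRARIES = 3            # one library per third of the domain

# Assumption from the comment: the three selection rounds multiply
explored = LIBRARY_SIZE ** N_LIBRARIES   # 1.25e23
search_space = 5 ** 28                   # about 3.7e19

coverage = explored / search_space
print(f"explored about {explored:.2e}, search space about {search_space:.1e}")
print(f"explored / search space is roughly {coverage:.0f}")
```

    Under that assumption the explored space exceeds the 5-letter search space by three to four orders of magnitude, even before the final recombination step is counted.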

    Therefore, their two “functional” results are probably the only functional sequences in that search space.

    Well, maybe there are a few others, but certainly not many more.

    What does that mean?

    First of all, it means that the two functional sequences that were found for those 28 AAs in the search space of 5 AAs have, as far as we can know, a functional complexity (for the function as defined by the authors) of:

    2:3.7E+19 = 5.41E-20 = 64 bits

    That’s very big functional information for such a short sequence, about 2.29 bits per amino acid (baa). And it is computed top-down!
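
    A minimal sketch of this computation, assuming a target space of exactly 2 sequences in a search space of 5^28, with functional information taken as minus the base-2 logarithm of their ratio:

```python
import math

TARGET_SPACE = 2          # the two "functional" sequences found
SEARCH_SPACE = 5 ** 28    # 5-letter alphabet over 28 positions
POSITIONS = 28

ratio = TARGET_SPACE / SEARCH_SPACE
functional_bits = -math.log2(ratio)
bits_per_aa = functional_bits / POSITIONS

print(f"ratio about {ratio:.2e}, about {functional_bits:.0f} bits, {bits_per_aa:.2f} bits per AA")
```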

    Why is it so?

    Let’s understand.

    The idea of simplification should, of course, be of making things simpler.

    So, a simplified alphabet would be really effective if it simplified the functional information.

    That should work this way:

    I have an alphabet of 20 AAs, but they are definitely too many. I believe that I can achieve similar functional results using only five of them, because AAs are often of the same kind, and they can be easily substituted without consequences for function.

    So, let’s say that I find a substitution scheme that works well enough to preserve function: for example, I will write I every time that an AA in the wildtype is one of maybe 4 explicitly defined AAs, K for another group of maybe 4, and so on.

    IOWs I should be able to use a translation code from the 20 AAs code to the 5 AAs code.

    In that case, there would be no need to make all the work that is described in the paper. Once I find the right translation code, I can simply synthesize de novo the correct translated sequence, using only my 5 AAs, deriving it directly from the wildtype sequence.
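
    To make the idea of a translation code concrete, here is a purely illustrative sketch. The grouping in `HYPOTHETICAL_GROUPS` and the input sequence are invented for the example: they do not come from the paper, and nothing here implies that such a mapping preserves function.

```python
# Hypothetical grouping of the 20 AAs into 5 classes, each represented by one letter.
# This grouping is an assumption made only to illustrate what a translation code would be.
HYPOTHETICAL_GROUPS = {
    "I": "ILVMFWYC",
    "K": "KRH",
    "E": "EDQN",
    "A": "ASTP",
    "G": "G",
}

# Invert the grouping into a lookup table: original AA -> representative AA
TRANSLATION = {aa: rep for rep, members in HYPOTHETICAL_GROUPS.items() for aa in members}

def simplify(sequence):
    """Rewrite a 20-letter sequence deterministically in the 5-letter alphabet."""
    return "".join(TRANSLATION[aa] for aa in sequence)

example = "MKTLLDRS"          # made-up sequence, not the SH3 domain
print(simplify(example))      # IKAIIEKA
```

    With such a code the simplified sequence follows directly from the wildtype, with no library and no selection step.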

    But that’s not what happened here. Not at all.

    Because there is no translation code here.

    There is only a random search helped by many steps of artificial selection, aimed at selecting those 5 AAs sequences that retain function, and whose result is two mere sequences.

    Why?

    The answer is simple: because it is almost impossible to retain function if we use 5 AAs only.

    But, you will say, in two cases they did it.

    True. Because they found the only sequences that can retain the defined function, at some level (more on that in the future discussion about function), because they had some favorable effect from epistasis.

    In brief, here is a very simple explanation of the concept:

    “A protein’s biological functions emerge from its chemical and physical properties, which in turn are determined by the interactions between its amino acid residues in three-dimensional space. It is therefore not surprising that the functional effect of changing an amino acid often depends on the specific sequence of the protein into which the mutation is introduced. This dependency on genetic context has long been called epistasis by geneticists.[1] Epistasis is invoked when the combined effect of two or more mutations deviates from that predicted by adding their individual effects.”

    From:

    “Epistasis in protein evolution”

    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4918427/

    So, this is what we observe: there is no translation code from a 20 AAs alphabet to a 5 AAs alphabet, because it is not possible to generate the same functional information that is present in 20 AAs using only 5 AAs. IOWs, reducing the number of AAs quickly degrades functional information.

    However, in extremely rare cases, substitutions that in general could not be tolerated can still be compatible with some function, because the general epistatic interaction of many different deleterious substitutions can still salvage some function.

    That is the case in our two “functional” sequences.

    There would be more to say on this fascinating issue, but I have not the time now, and I still have to deal with the “function” problem!

    However, we can at least ask ourselves a simple question: does further literature support the idea that we can reduce the number of AAs to 5?

    No.

    Take, for example, this paper of 2000:

    “Simplified amino acid alphabets for protein fold recognition and implications for folding”

    Abstract
    Protein design experiments have shown that the use of specific subsets of amino acids can produce foldable proteins. This prompts the question of whether there is a minimal amino acid alphabet which could be used to fold all proteins. In this work we make an analogy between sequence patterns which produce foldable sequences and those which make it possible to detect structural homologs by aligning sequences, and use it to suggest the possible size of such a reduced alphabet. We estimate that reduced alphabets containing 10–12 letters can be used to design foldable sequences for a large number of protein families. This estimate is based on the observation that there is little loss of the information necessary to pick out structural homologs in a clustered protein sequence database when a suitable reduction of the amino acid alphabet from 20 to 10 letters is made, but that this information is rapidly degraded when further reductions in the alphabet are made.

    And this paper is about “picking out structural homologs in a clustered protein sequence database”, not about building true functional proteins!

    And it seems that reducing the alphabet to less than 10 rapidly degrades even that possibility!

    So, what is it? 5 or 10-12?

    With the available evidence, I would say neither!

    As far as we know, proteins are made of 20 AAs, and the genetic code works through 20 aa-tRNA synthetases, and all the rest.

    Those who believe that a reduction in the number of AAs is feasible should really try to make whole proteins with that number, and verify their function in a true biological context.

    More on that when I will have the time to discuss the “function” aspect of our paper.

    In the meantime, Corey will of course go on with his “gibberish”! 🙂

  254. 254
    gpuccio says:

    Dionisio:

    “Origenes, ET and Mung,

    According to the comment @236, I’ve been given the undeserved honor of sharing a boat with you.
    Can you tell me any details about that boat?
    Do you plan a trip soon?”

    Any room left in the boat? 🙂

  255. 255
    Dionisio says:

    @253:
    Very interesting comparative analysis of potential protein functionality based on the known 20 AAs set (including both essential and nonessential AAs) vs. a hypothetical 5 AAs set.
    Very insightful indeed.
    With at least two interesting papers referenced.
    The heat is still up in this thread.
    GP promises more of this interesting analysis ahead.
    Thanks!

  256. 256
    Dionisio says:

    gpuccio @254:

    “Any room left in the boat?”

    Sorry, but you were not mentioned by your [not so] politely dissenting interlocutor @236. Only ET, Origenes, Mung and I were given such an honor. 🙂

    However, since you have provided excellent bait for this fishing adventure, perhaps that qualifies you automatically for a distinguished place in the boat?

    Also, given that this kind of fish could secrete toxic material, it might help to have a medical doctor on board. Just in case somebody gets in contact with the fish. 🙂

  257. 257
    Dionisio says:

    @253:

    “Simplified amino acid alphabets for protein fold recognition and implications for folding”

    Cited in:

    “Basic units of protein structure, folding, and function”
    Igor N. Berezovsky, Enrico Guarnera, Zejun Zheng

    Related papers here:

    http://www.sciencedirect.com/s.....0716300864

    Full text here:

    https://www.researchgate.net/profile/Igor_Berezovsky/publication/308912133_Basic_units_of_protein_structure_folding_and_function/links/597efbdfaca272d56817fa17/Basic-units-of-protein-structure-folding-and-function.pdf

  258. 258
    Dionisio says:

    @253:

    The 2000 paper “Simplified amino acid alphabets for protein fold recognition and implications for folding”

    is directly cited in at least 158 papers referenced in the following link:

    https://www.researchgate.net/publication/12540406_Simplified_amino_acid_alphabets_for_protein_fold_recognition_and_implications_for_folding

    The paper referenced @257 is one of the 158 papers mentioned in the above link.

  259. 259
  260. 260
    gpuccio says:

    To all:

    Now I would like to address the problem of function in relation to the SH3 domain.

    To do that, I will introduce another interesting paper:

    “SH3-like Fold Proteins are Structurally Conserved and Functionally Divergent”

    http://www.eurekaselect.com/79432/article

    The abstract says:

    The folding space for all the protein sequences is limited. Therefore it was observed that many proteins, whose sequences are not related, have similar fold characteristics. The fold databases like SCOP and CATH have classified various protein folds. However, in-depth analysis of the functional features of these folds was not done. We analyzed about twenty unique SH3-like folded proteins in their structural environment and functional characteristics. From our analysis it is apparent that the SH3-like folds could carry out various functions by modulation of loops and the functional region is restricted to one side of a particular sheet helped by two or three loops. The functions vary from oligonucleotide-binding to peptide-binding and other ligand binding. Although certain degree of sequence similarity was observed among the SH3-fold proteins, the similarity was restricted to the β-strand regions of the proteins.

    The paper is about the different proteins which include an SH3 domain (called simply, in the article, SH3-fold proteins). I quote:

    “Here, we review the structural and functional features of many different SH3-fold proteins and our analysis show that the functional region of the fold, to a large extent is conserved to bind to a variety of ligands.”

    Emphasis mine.

    So, the first important point is: the SH3 domain can bind to a variety of ligands.

    And:

    “All the proteins considered in this study were superposed to align them structurally by taking the chicken spectrin SH3 as reference molecule. The proteins considered in this study are listed in Table. I.”

    Now, let’s see the SH3 domain in chicken spectrin. Here it is:

    TGKELVLALYDYQEKSPREVTMKKGDILTLLNSTNKDWWKVEVNDRQGFVPAAYVKKLDP

    It is located more or less at the center of the spectrin chain, a 2477 AAs protein (the alpha chain).

    What is the function of spectrin?

    Here is what Uniprot says:

    “Morphologically, spectrin-like proteins appear to be related to spectrin, showing a flexible rod-like structure. They can bind actin but seem to differ in their calmodulin-binding activity. In nonerythroid tissues, spectrins, in association with some other proteins, may play an important role in membrane organization.”

    Now, let’s see again the SH3 sequence in the paper we have discussed, the sequence from Proto-oncogene tyrosine-protein kinase Src, P12931, 536 AAs long:

    TFVALYDYESRTETDLSFKKGERLQIVNNTEGDWWLAHSLSTGRTGYIPSNYVAPSD

    In this protein, the domain is placed near the N terminal part of the protein.

    What is the function of Proto-oncogene tyrosine-protein kinase Src?

    Here is what Uniprot says:

    “Non-receptor protein tyrosine kinase which is activated following engagement of many different classes of cellular receptors including immune response receptors, integrins and other adhesion receptors, receptor protein tyrosine kinases, G protein-coupled receptors as well as cytokine receptors. Participates in signaling pathways that control a diverse spectrum of biological activities including gene transcription, immune response, cell adhesion, cell cycle progression, apoptosis, migration, and transformation. Due to functional redundancy between members of the SRC kinase family, identification of the specific role of each SRC kinase is very difficult. ”

    And much more.

    So, two very different proteins, it seems.

    But what does the same domain (SH3) do in such different proteins?

    We can certainly imagine that it binds some ligand.

    But first of all, are these two SH3 domains the same thing?

    We know that, as far as we know, all SH3 domains share a similar folding and structure. In this sense, they are certainly similar.

    But I have blasted the two short sequences. Here is the result:

    Score: 47.0 bits
    Expect: 1e-14
    Identities: 20/56(36%)
    Positives: 34/56(60%)
    Gaps: 2/56(3%)

    VLALYDYQEKSPREVTMKKGDILTLLNSTNKDWWKVE--VNDRQGFVPAAYVKKLD
     +ALYDY+ ++  +++ KKG+ L ++N+T  DWW        R G++P+ YV   D
    FVALYDYESRTETDLSFKKGERLQIVNNTEGDWWLAHSLSTGRTGYIPSNYVAPSD

    Well, there is no doubt that the two sequences are related: an Expect of 1e-14 is more than enough for that.

    But they are also so different! Only 20 identities (36% of the aligned part)!
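
    The identity count can be checked directly from the two aligned rows. The sketch below is only a recount of identical columns, not a reimplementation of BLAST scoring; it assumes that the dash in the first row stands for two gap columns, written here as "--", which agrees with the reported Gaps: 2/56.

```python
def identity(query, subject):
    """Return (identical columns, alignment length) for two aligned rows."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    # A column counts as an identity only if both rows carry the same residue.
    matches = sum(1 for q, s in zip(query, subject) if q == s and q != "-")
    return matches, len(query)


def match_line(query, subject):
    """Middle line showing identities only (no '+' marks for positives)."""
    return "".join(q if q == s and q != "-" else " "
                   for q, s in zip(query, subject))


# Aligned rows as reported above; "--" is the assumed two-column gap.
spectrin_sh3 = "VLALYDYQEKSPREVTMKKGDILTLLNSTNKDWWKVE--VNDRQGFVPAAYVKKLD"
src_sh3 = "FVALYDYESRTETDLSFKKGERLQIVNNTEGDWWLAHSLSTGRTGYIPSNYVAPSD"

matches, length = identity(spectrin_sh3, src_sh3)
print(spectrin_sh3)
print(match_line(spectrin_sh3, src_sh3))
print(src_sh3)
print(f"Identities: {matches}/{length} ({round(100 * matches / length)}%)")
```

    The same two functions can be applied to any other pair of aligned rows of equal length.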

    OK, I can already hear Corey shouting triumphantly: “I told you! These two sequences have low homology and still they implement the same function!”

    Maybe. We have examples of that, certainly.

    But here, are we really sure that they “implement the same function”?

    We have two important points:

    a) The SH3 domain binds to a variety of ligands.

    b) The two proteins where the domain is included have very different sequences and functions.

    So, what if those two domains, while sharing a gross folding and structure, have some different function, because they:

    a) Bind different ligands

    and:

    b) Interact differently with the whole protein and its specific function?

    That would certainly explain the difference in sequence. The functionally required difference in sequence.

    Is it so? Let’s see.

    We learn from the Kishan paper that SH3 can bind both peptides and DNA, although peptide binding is certainly the most common scenario.

    The short peptide motifs to which the domain binds can vary much. Here is another paper of 2012:

    “SH3 domain ligand binding: What’s the consensus and where’s the specificity?”

    http://www.sciencedirect.com/s.....9312003316

    Abstract

    An increasing number of SH3 domain–ligand interactions continue to be described that involve the conserved peptide-binding surface of SH3, but structurally deviate substantially from canonical docking of consensus motif-containing SH3 ligands. Indeed, it appears that the relative frequency and importance of these types of interactions may have been underestimated. Instead of atypical, we propose referring to such peptides as type I or II (depending on the binding orientation) non-consensus ligands. Here we discuss the structural basis of non-consensus SH3 ligand binding and the dominant role of the SH3 domain specificity zone in selective target recognition, and review some of the best-characterized examples of such interactions.

    So, there can be no doubt about the variety of ligands and interactions of which the SH3 domain is capable.

    But let’s see another important aspect. Let’s consider again the two SH3 sequences we have already seen. The question is, as usual: are those sequences evolutionarily conserved?

    To answer that, let’s ask for the help of our usual friends, the sharks: 🙂

    1) Sequence from the spectrin alpha chain.

    Blast: human – cartilaginous fish

    Well, the human sequence is identical to the chicken sequence: 60/60 identities.

    So, we can just look at the homology with sharks:

    The best hit is with Callorhinchus milii:

    60/60 identities!

    The sequence is identical, after 400+ million years of evolutionary history!

    2) Now, the sequence from human Proto-oncogene tyrosine-protein kinase, the one in the Riddle / Baker paper:

    Again, the best hit is with Callorhinchus milii:

    Bitscore: 110 bits
    Expect value: 3e-30
    Identities: 51/57(89%)
    Positives: 54/57(94%)
    Gaps: 0/57(0%)

    TFVALYDYESRTETDLSFKKGERLQIVNNTEGDWWLAHSLSTGRTGYIPSNYVAPSD
    TFVALYDYESRT +DLSFKKGERLQIVNNTEGDWWLA SL+TG +GYIPSNYVAPSD
    TFVALYDYESRTASDLSFKKGERLQIVNNTEGDWWLARSLNTGSSGYIPSNYVAPSD

    Again, an amazing conservation.

    But then the question is:

    How is it that these two versions of the SH3 domain, which have such low homology with each other, and still share the basic folding and structure, are then so conserved in evolutionary history, in the same protein?

    The obvious answer is: because the additional information is necessary to fine-tune the specific function of the domain in each protein.

    No more time. More in next post.

  261. 261
    gpuccio says:

    To all:

    I am sorry for the bad formatting of the alignments. The middle line seems not to stay where it should!

    OK, I hope the meaning is clear just the same. 🙂

  262. 262
    Dionisio says:

    The clarification about the text alignment issue helps. Thanks.
    I’m still digesting the whole explanation, very technical as usual, but so far it seems clear.

  263. 263
    gpuccio says:

    To all:

    So, to sum up:

    There are different variants of the SH3 domain, often rather distant at sequence level, which probably share the same basic folding and structure. However, while a low level of homology seems enough to implement that basic similarity, individual variants of the domain have extreme conservation of their sequence through very long evolutionary times, which definitely points to their functional nature (whatever Corey can say).

    IOWs, we have two levels of functional information in the domain: a basic level, shared by all variants, which is implied in the basic folding, and an additional level which serves to “fine tune” the function in each protein scenario.

    So, while the function of the domain can be described, in general, as “binding some ligand”, usually, but not always, a proline rich peptide with some specific AA sequence, the specific nature of the ligand, of the interaction with it, and of the interaction with the host protein can vary a lot from protein to protein, and therefore from domain variant to domain variant. That “fine tuning” of the generic function seems to require more functional information than the basic folding itself.

    And the amazing evolutionary conservation of that specific sequence information in the same domain variant contrasts strikingly with the low level homology between different variants, even in the same organism.

    OK, that said, let’s go back to the Riddle/Baker paper, and to Corey’s initial statement about it, which was the beginning of everything.

    Let’s read it again:

    “experiments have swapped amino acids in proteins, heck they’ve even stripped all 20 amino acids away and rebuilt proteins using only 4 amino acids and the protein was still functional”

    Now, we already know that there is here a basic confusion between the protein and the protein domain. In the light of what we have said about the function, that confusion becomes extremely important.

    Indeed, we must distinguish here between two different functions:

    a) The function of the domain, which we have already discussed, and which is similar in all domains at a gross level (binding some ligand), but finely individualized in each domain variant.

    b) The function of the host protein, which is completely different from case to case, but which of course depends in some measure on the function of the domain.

    So, a first important question is: what does the paper say about the function of the protein?

    And the answer is: absolutely nothing. The authors do not even mention it. They are only interested in the function of the domain, as we will see.

    So, Corey’s statement that “the protein was still functional” (after the simplification), is completely wrong if we consider it as regarding the protein.

    The authors have not verified in any way if the protein hosting the domain could remain functional if a simplified domain is substituted to the wildtype domain. We have no information about that. And yet, that is the really important point, as we will see.

    However, let’s assume that Corey can admit his basic error (he has half done that) and so reformulate the statement as:

    “and the domain was still functional”.

    Can we accept that?

    No. Not even that is supported by the paper.

    However, there is some work in the paper about the domain function. Let’s see.

    They have tested the function of the domain in a very specific way: as binding affinity to a specific peptide, in particular the RALPPLPRY sequence that was used for the artificial selection.

    So, the two final sequences were able to bind that proline rich sequence. Like the wildtype.

    They have even evaluated the folding of the different sequences, and they state that FP2 “refolds at almost exactly the same rate as the WT, while FP1 refolds even faster”. They even state that this result proves that “the sequence of the src SH3 domain has not been highly optimized by natural selection for rapid folding”. A rather heavy statement, IMO. Are they sure that rapid folding is what is needed for the function?

    However, let’s take that for good, for the moment: their two sequences fold as well as, or even better than, the WT (at least in terms of rapidity!). However, there are differences, as we can see in Tab. 2, where the first three columns of data are relative to folding.

    The fourth column, instead, is about the affinity for the specific ligand used in the experiment. And there are differences here too.

    Let’s not consider S1, S2 and S3, which are the partially simplified variants. Let’s go directly to FP1 and FP2, the final variants.

    To help understand the fourth column, I will remind that it contains the values of the equilibrium dissociation constant (KD) of each sequence in relation to the above mentioned ligand.

    I paste here a brief explanation of what that means, taken from the following web site:

    https://www.malvern.com/en/products/measurement-type/binding-affinity

    Binding affinity is the strength of the binding interaction between a single biomolecule (e.g. protein or DNA) to its ligand/binding partner (e.g. drug or inhibitor). Binding affinity is typically measured and reported by the equilibrium dissociation constant (KD), which is used to evaluate and rank order strengths of bimolecular interactions. The smaller the KD value, the greater the binding affinity of the ligand for its target.

    Emphasis mine.

    So, what are the Kd values in the Table?

    wt SH3: 7.5 micromoles

    FP1: 150 micromoles

    FP2: 38 micromoles

    Remember: The smaller the KD value, the greater the binding affinity of the ligand for its target.

    So, we can see that both final sequences have a significantly lower affinity for the substrate than the WT.
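
    To put a number on “significantly lower”, the ratio of each KD to the wildtype KD can be computed from the three values quoted above. This is only a sketch of the arithmetic on those reported figures; the dictionary and function names are mine.

```python
# Reported KD values (in the units used in the table above) for the
# wildtype domain and the two final simplified variants.
KD = {"wt SH3": 7.5, "FP1": 150.0, "FP2": 38.0}


def fold_weaker(variant, reference="wt SH3"):
    """How many times larger the variant's KD is than the reference KD.

    A larger KD means weaker binding, so a ratio of 20 means the variant
    binds the test peptide about 20 times more weakly than the reference.
    """
    return KD[variant] / KD[reference]


for name in ("FP1", "FP2"):
    print(f"{name}: KD is {fold_weaker(name):.1f}x the wildtype value")
```

    On these figures, FP1 has a KD 20 times the wildtype value and FP2 about 5 times the wildtype value, for the very ligand that was used in the selection.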

    Now, of course, we don’t know what the best affinity is for the domain function. Even a higher affinity could be deleterious.

    But we know that the WT is functional. While we have no evidence that the simplified versions are functional. In the context of the whole protein.

    So, can a significant difference in affinity to a specific substrate, in particular a lower affinity for it, be deleterious to that final function?

    Of course it can.

    So, have we any evidence that the simplified versions can really be functional in a real protein, in a real biological context?

    Of course not.

    We only know that the two simplified sequences fold almost as rapidly or more rapidly than the WT (which could already be a problem), and that both sequences have lower affinity for one specific substrate (which could almost certainly be a problem).

    But there is more. Do we know that the chosen substrate represents well what the domain and the protein do in real life?

    No, we have no evidence for that. Some other ligand, even slightly different, could have some basic importance in the function of the domain and of the protein. And the differences in affinity for those other ligands could be greater than the differences for the chosen ligand, which after all is the ligand for which the final sequences were selected, and therefore the one with which they should reasonably perform better!

    So, what remains of Corey’s initial statement that:

    “experiments have swapped amino acids in proteins, heck they’ve even stripped all 20 amino acids away and rebuilt proteins using only 4 amino acids and the protein was still functional”

    Practically nothing!

    The experiment was about a short domain, and not a protein.

    The simplification was done with 5 AAs, and not 4.

    The simplification was effected in less than 50% of the sequence, and not in all of it.

    We know nothing about the functionality of the protein, with the simplified sequences.

    We know very little about the functionality of the domain with the simplified sequences: that it folds rapidly, and that it retains affinity for the specific ligand with which the artificial selection was made, but at a definitely lower level.

    We know nothing about other possible ligands, or about the interaction of the domain with its host protein.

    So, let’s say, graciously, that Corey’s statement was, at best, preposterous.

    But we have learnt a lot of interesting things in understanding why. 🙂

  264. 264
    ET says:

    ENOUGH- Back on the boat, both of you! 🙂

    There are more trolls flailing about.

  265. 265
    forexhr says:

    gpuccio:

    Why don’t we completely reverse the story of functional information and let the evolutionists eat their own numbers.

    Here is what I mean. Evolution theory is based on the fundamental premise that genes which code for new structures that provide new biological functions, arise through duplication and modification of pre-existing genes. But, given the high level of mutational neutrality, where tons of mutations in the gene might not alter the structure it codes for, even if all the mutations in the history of life (10^43) are spent, this might be insufficient to alter the underlying structure which provides some biological function.

    For example, let’s look at this paper: Functional Proteins from a random sequence library (1), which comes up with an estimate of 10^91 different structures having ATP binding function.

    Such an enormous structural landscape clearly shows that even with all evolutionary mutations spent, the evolutionary process is stuck at the current structural landscape and it cannot proceed towards new structures, let alone specific or adaptive structures which are beneficial in the environment where the population currently exists. This also explains the observation of evolutionary stasis. If we add to that a mutation rate of about 10^-8 mutations/bp/generation, where 100,000 mutations must be spent just to produce one mutation in a specific 1000 bp DNA region (where some new gene ‘evolves’), it is more than obvious that the evolution theory is a complete hoax.
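
    One way to read the arithmetic behind the 100,000 figure is sketched below. It uses only the two numbers quoted above (the per-bp rate and the region length); equating “replications” with “mutations spent” additionally assumes about one mutation per genome per replication, which is not stated in the comment.

```python
from fractions import Fraction

# Figures quoted in the comment above.
RATE_PER_BP = Fraction(1, 10**8)  # mutations per bp per generation
REGION_BP = 1000                  # length of the specific DNA region

# Expected number of mutations landing in the region per replication.
expected_hits = RATE_PER_BP * REGION_BP

# Average number of replications needed for one mutation in the region.
replications_per_hit = 1 / expected_hits

print(f"expected hits per replication: {float(expected_hits):.0e}")
print(f"replications per expected hit: {int(replications_per_hit):,}")
```

    Exact fractions are used instead of floats so that the reciprocal comes out as a whole number.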

    So, instead of you spending your time and mental energy in writing extensive posts, let evolutionists prove that evolutionary processes can leave the current structural landscape (of an organ for e.g.) and climb another one.

    (1) https://www.researchgate.net/publication/12045894_Functional_proteins_from_a_random-sequence_library

  266. 266
    gpuccio says:

    forexhr:

    “So, instead of you spending your time and mental energy in writing extensive posts, let evolutionists prove that evolutionary processes can leave the current structural landscape (of an organ for e.g.) and climb another one.”

    Frankly, I have no real confidence in evolutionists proving anything that could be in favor of ID. Maybe I am a pessimist, or a cynic! 🙂

    So, I will probably go on spending my time and mental energy.

    I know very well the Keefe and Szostak paper. It’s probably the paper I have commented more frequently upon. It is essentially a paper about protein engineering, and the conclusions that evolutionists think can be drawn from it are not the conclusions that should be drawn by any thinking person. 🙂

    I have discussed it even in this thread, but for a more detailed discussion about it look at my previous OP here:

    https://uncommondescent.com/intelligent-design/what-are-the-limits-of-natural-selection-an-interesting-open-discussion-with-gordon-davisson/

    comments #62, 229, 237, 238, 263, 277, 284, 301, 303, 320.

    As you can see, I have spent a lot of my time and mental energy dealing with that paper.

    “But, given the high level of mutational neutrality, where the tons of mutations in the gene might not alter the structure it codes for,”

    OK, but that is true only for mutations that happen in functional coding sequences. But, if mutations happen in non functional, non coding sequences (such as duplicated and inactivated genes, or just non functional non coding DNA sequences), then any mutation is neutral, and can at the same time go in any direction, towards any possible state. That’s why the evaluation of probabilistic resources for that type of random walk is so important.

  267. 267
    Corey Delvine says:

    Gpuccio:
    “residues not directly implied in the function of the domain. IOWs, the most important residues from the point of view of function.”
    12 AAs (21% of the molecule)
    So, let me get this straight, you are admitting that 79% is not essential to function and tolerate substitutions to some degree (many of them to apparently a large degree)?
    Doesn’t this fly in the face of your “functional sequence space is a tiny bit of the search space” belief?

    “subtle deception.”
    Ah yes, those scientists, always trying to deceive us!

    “The fifth residue in black, the S at 50, is indeed one of the 12 AAs mentioned before, So, there must be some error in the paper. OK, not important.”
    Wrong, there are 40 IKEAG and 5 non-IKEAG in FP2. The S->A was caused by the PCR splicing which can cause mutation, check table 1.

    “So, it would seem that 40 (or 41) of the residues have been “simplified”, isn’t it”
    But that is not what it says. Nobody but you is to blame for your own misunderstandings or miscomprehension.
    The study was about building a protein from a “simplified amino acid alphabet,” which is largely what they did.

    “He should have noticed that in 12 important AAs no simplification was even attempted, because that is clearly stated by the authors. So, he’s responsible for that error.”
    Wrong, the paper mentions that they were able to change half of those 12 amino acids individually to alanine with no effect on function.
    You conveniently left that part out.

    Scientists:
    “Here we show that a small beta-sheet protein, the SH3 domain”
    Gpuccio:
    “It’s a domain, not a protein”
    Seems like a bunch of scientists don’t have an issue with using the word protein here, but Gpucc doesn’t like it.
    Hmm who do we listen to?

    Gpuccio:
    “If you look at Fig. 1a, you will see that 3 of those 16 residues 4,21 and 22, have never been converted at all”
    Wrong. There are 19 residues they attempted to change in 1a.
    Of those 19, I repeat:19, 3 were not changed (4,21,22) to get the 16/19 (not 13/16 as you said).

    Gpuccio:
    “Unfortunately, this step of “random splicing” is not detailed in the Methods section”
    Wrong, it’s called PCR splicing

    Gpuccio:
    “Therefore, their two “functional” results are probably the only functional sequences in that search space.”
    Pucci, you are completely and utterly wrong. How can you say those are the only two functional sequences, when they’ve put numerous other functional sequences right in your face?
    Oh right, because it’s you.
    Scientists:
    “The biophysical properties of a number of functional variants were assessed…studies indicated that each of the variants was folded and stable”
    And the thermodynamics/peptide binding results show that these variants are functional.
    Scientists:
    “Simplification was successful at 38 of the 40 positions varied”
    Scientists:
    “The protein scaffold that supports the binding site is 95% IKEAG”

    Gpuccio:
    “But, you will say, in two cases they did it.”
    The fact that they end up with just these two variants is a product of experimental limitations, not biological limitations.
    You try all the cloning, PCRing, digestion/ligations, transforming, bead binding/eluting, colony screening and let me know what you end up with.

    Gpuccio:
    “So, what is it? 5 or 10-12?”
    That depends what you are trying to do.
    Demonstrate amino acid tolerances in simple proteins, or reproduce folding of a large number of today’s known protein families?
    The two are very different.

    Gpuccio:
    “With the available evidence, I would say neither!”
    Of course, because when you have two papers sitting in front of you that explicitly demonstrate function with 5 or 10-12 amino acid alphabets Gpuccio will simply deny their existence.

    Gpuccio:
    “wt SH3: 7.5 micromoles”
    M is molar not moles. =)

    We’ve both made mistakes, but whereas mine are in glossing over and maybe embellishing things a little too much, your mistakes are egregious misunderstandings of the underlying science and biology.

  268. 268
    gpuccio says:

    Corey Delvine:

    Hey, I thought we had lost you, instead you have done a lot of work! That’s good, I really appreciate it. 🙂

    And you even admit that you have made some mistakes, even if of course implying me in much greater ones!

    “We’ve both made mistakes, but whereas mine are in glossing over and maybe embellishing things a little too much, your mistakes are egregious misunderstandings of the underlying science and biology.”

    That’s progress, I would say.

    So, let’s see your arguments in detail, and what my huge mistakes are, even if it means spending further time and attention on a paper that certainly does not deserve it! 🙂

    I said:

    ““residues not directly implied in the function of the domain. IOWs, the most important residues from the point of view of function.
    12 AAs (21% of the molecule)”

    You comment:

    So, let me get this straight, you are admitting that 79% is not essential to function and tolerate substitutions to some degree (many of them to apparently a large degree)?
    Doesn’t this fly in the face of your “functional sequence space is a tiny bit of the search space” belief?

    Excuse me, where did I say such a thing? I said that those 12 AAs are “the most important residues from the point of view of function.” I never said that the others are “not essential to function and tolerate substitutions to some degree”. You are inventing things and putting words in my mouth. OK, that is not the first time.

    Ah yes, those scientists, always trying to deceive us!

    Not “those scientists”. Some scientists. And not “always”. Sometimes.

    This is one of those times.

    I said:

    “But for the moment, let’s acknowledge this simple truth: the simplification failed in 5 residues where it was attempted.

    But in the Figure I can find only 4 of them.

    Residues 2,16,17,42

    The fifth residue in black, the S at 50, is indeed one of the 12 AAs mentioned before. So, there must be some error in the paper. OK, not important.”

    You comment:

    Wrong, there are 40 IKEAG and 5 non-IKEAG in FP2. The S->A was caused by the PCR splicing which can cause mutation, check table 1.

    Excuse me, in Fig. 2b it says: “The colour scheme is as in Fig. 1”. In Fig 1 it says: “black, residues which did not tolerate simplification”.

    So, that means that in Fig 2, too, black residues are those which “did not tolerate simplification”.

    But in Fig. 2, in FP2, I can count only 5 black residues, number 2, 16, 17, 42 and 50. Can you confirm, or are my old eyes deceiving me?

    So, according to the legend, those 5 residues in FP2, and only them, must be residues which “did not tolerate simplification”.

    But the S in residue 50, in Fig 1, is shown in blue, “residues where simplification was not attempted”.

    So, there is an error. Not important, as I said. But an error just the same.

    What’s your problem?

    You say:

    “The S->A was caused by the PCR splicing which can cause mutation, check table 1”

    OK, and what has that to do with what I am saying? The error is in Fig. 2, in the color code attributed to the 50th residue.

    I said:

    ““So, it would seem that 40 (or 41) of the residues have been “simplified”, isn’t it””

    You comment:

    But that is not what it says. Nobody but you is to blame for your own misunderstandings or miscomprehension.
    The study was about building a protein from a “simplified amino acid alphabet,” which is largely what they did.

    There is no miscomprehension here.

    One thing is to say that the FP2 protein contains 40 IKEAG, a thing which is true and that I have never denied.

    Another thing is saying:

    “In the more simplified variant, FP2” (IOWs their best result) “40 of the 45 residues at which simplification was attempted are I,K,E,A or G.”

    which is at best ambiguous and deceiving, as I have said.

    Another thing is saying:

    “For example, in the first third of the protein (Fig. 1a), 16 of the 19 residues not involved in binding were converted to I,K,E,A, or G in the most simplified sequences.”

    which is simply false.

    About this last point, you say:

    Gpuccio:
    “If you look at Fig. 1a, you will see that 3 of those 16 residues 4,21 and 22, have never been converted at all”
    Wrong. There are 19 residues they attempted to change in 1a.
    Of those 19, I repeat:19, 3 were not changed (4,21,22) to get the 16/19 (not 13/16 as you said).

    Let’s see:

    “There are 19 residues they attempted to change in 1a.”

    Correct.

    “Of those 19, I repeat:19, 3 were not changed (4,21,22)”

    Yes.

    “to get the 16/19 (not 13/16 as you said).”

    No. You are confused.

    19 residues: those they attempted to change.

    3 residues: those which did not change, because they are the same as in the WT (4, 21, 22)

    + the residues where simplification was not tolerated (those in black), whose number is different in each sequence. In the most simplified sequences they are 3, in the others they are more.

    The most simplified sequences are number 5, 8, 9, 12, 14, 15, 18.

    In each of them there are 3 black residues + at least 3 residues which have not changed. Take number 5, for example:

    3 black residues: 2, 16, 17

    3 residues which have not changed in any sequence: 4, 21, 22

    + residues which have not changed in this particular sequence: 9, 19, 20

    In each of the most simplified sequences, there are at least 6 residues that have not been converted to IKEAG (the 3 black ones + the 3 which have never changed).

    But the authors state:

    “For example, in the first third of the protein (Fig. 1a), 16 of the 19 residues not involved in binding were converted to I,K,E,A, or G in the most simplified sequences.”

    But that is not true.

    19 – 6 = 13, not 16
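
    The tally above can be checked with a short sketch. It uses only the residue numbers listed in this comment for sequence 5; the variable names are illustrative and are not taken from the paper.

```python
# Residue tallies for the first third of the protein (Fig. 1a), sequence 5.
# All numbers come from the comment above; names are illustrative only.
attempted = 19                    # residues where simplification was attempted
black = {2, 16, 17}               # did not tolerate simplification
never_changed = {4, 21, 22}       # same as WT in every sequence
unchanged_in_seq5 = {9, 19, 20}   # listed above as unchanged in sequence 5

# Minimum set of residues not converted in any of the most simplified sequences
not_converted = black | never_changed
converted_at_most = attempted - len(not_converted)

# Count for sequence 5 if its three extra unchanged residues are also excluded
converted_seq5 = attempted - len(not_converted | unchanged_in_seq5)

print(len(not_converted), converted_at_most, converted_seq5)  # 6 13 10
```

    Under these numbers the upper bound on converted residues is 13, which is the figure argued for here rather than the 16 stated in the paper.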

    And yet, you support their statement. Why?

    you say:

    “He should have noticed that in 12 important AAs no simplification was even attempted, because that is clearly stated by the authors. So, he’s responsible for that error.”
    Wrong, the paper mentions that they were able to change half of those 12 amino acids individually to alanine and not effect on function.
    You conveniently left that part out.

    Wrong. The authors simply say:

    “For example, of the 12 positions held fixed in this study, half were shown to tolerate alanine substitutions in the Sem5 SH3 domain; judging from the effects on expression levels, only one of these mutants appeared to have significantly decreased stability 9.”

    They are only quoting another paper. It’s not something they did at all.

    And yet you say:

    “the paper mentions that they were able to change half of those 12 amino acids individually to alanine and not effect on function”

    Wrong.

    The paper they quote is the following:

    “Critical residues in an SH3 domain from Sem-5 suggest a mechanism for proline-rich peptide recognition.”

    by Lim WA, Richards FM.

    Another paper, other authors, other aims of the study, different SH3 domain (SH3 domain from the Caenorhabditis elegans protein Sem-5).

    So, it’s not true that “they were able to change half of those 12 amino acids individually to alanine and not effect on function”, as you say. In their study, the 12 AAs were simply “held fixed”, as clearly stated by the authors.

    So, I did not leave out any part of their study, and what you say is simply wrong.

    More in next post.

  269. 269
    gpuccio says:

    Corey Delvine:

    you say:

    Scientists:
    “Here we show that a small beta-sheet protein, the SH3 domain”
    Gpuccio:
    “It’s a domain, not a protein”
    Seems like a bunch of scientists don’t have an issue with using the word protein here, but Gpucc doesn’t like it.
    Hmm who do we listen to?

    Maybe the serious authors of the study they quote, who very correctly say:

    Critical residues in an SH3 domain from Sem-5 suggest a mechanism for proline-rich peptide recognition.

    Lim WA, Richards FM.

    Abstract

    Src homology 3 (SH3) domains bind specific proline-rich peptide motifs. To identify interactions involved in peptide recognition, we have mutated residues on the putative binding surface of an SH3 domain from the Caenorhabditis elegans protein Sem-5. Among the most critical positions are three adjacent aromatic residues, which appear to participate in highly stereospecific packing interactions with the ligand. The co-planar arrangement of two of these residues closely matches the periodicity of a poly-proline II (PPII) helix. Thus, a model for recognition has the peptide adopting a PPII helix, with the pyrrolidine rings on one helical face interlocking with the aromatic SH3 residues.

    (Emphasis mine)

    This is the correct way to say things. It’s not my fault if the authors of your beloved paper are sloppy and imprecise in their wording (and not only in that).

    You say:

    Gpuccio:
    “Unfortunately, this step of “random splicing” is not detailed in the Methods section”
    Wrong, it’s called PCR splicing

    It is not detailed. It’s simply mentioned. There is not even a reference. Again, sloppy. What’s wrong in my statement?

    You say:

    Gpuccio:
    “Therefore, their two “functional” results are probably the only functional sequences in that search space.”
    Pucci, you are completely and utterly wrong. How can you say those are the only two functional sequences, when they’ve put numerous other functional sequences right in your face?
    Oh right, because it’s you.
    Scientists:
    “The biophysical properties of a number of functional variants were assessed…studies indicated that each of the variants was folded and stable”
    And the thermodynamics/peptide binding results show that these variants are functional.
    Scientists:
    “Simplification was successful at 38 of the 40 positions varied”
    Scientists:
    “The protein scaffold that supports the binding site is 95% IKEAG”

    “when they’ve put numerous other functional sequences right in your face?”

    But those are of course the partially simplified sequences! The statement you quote:

    “The biophysical properties of a number of functional variants were assessed…studies indicated that each of the variants was folded and stable”

    is about the partially simplified sequences!

    The fully simplified sequences are two, and only two. And all the conclusions of the study are about the two fully simplified sequences, not about the partially simplified ones.

    Again, you misrepresent things, probably intentionally.

    I have already discussed the last two points in my previous post.

    You say:

    Gpuccio:
    “But, you will say, in two cases they did it.”
    The fact that they end up with just these two variants is a product of experimental limitations, not biological limitations.
    You try all the cloning, PCRing, digestion/ligations, transforming, bead binding/eluting, colony screening and let me know what you end up with.

    Ah, then you understand that they “end up with just two variants”. So, the misrepresentation was intentional, and not due to ignorance.

    But my point is very explicit. They used a library of about 10^7 sequences for each third of the molecule.

    Now, we agree, I hope, that the successful simplifications were in about 50% of the molecule. That means 28 residues.

    So, each third which was simplified was about 9 residues.

    The search space for 9 residues in a 5-letter alphabet is:

    5^9 = 1,953,125, or about 2 x 10^6

    So, with a library of about 10^7 sequences, which is larger than that search space, they explored essentially all of it.

    And they found only two final sequences.

    Which was my point.
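
    The arithmetic behind this argument can be sketched as follows. The alphabet size, residues per third and library size are the figures stated in this comment; the coverage estimate additionally assumes that the library samples the space uniformly at random with replacement, which is an assumption of this sketch and not something reported in the paper.

```python
import math

# Figures taken from the comment above; names are illustrative only.
alphabet_size = 5          # I, K, E, A, G
residues_per_third = 9     # about 28 simplified residues split into thirds
library_size = 10**7       # approximate library size per third

search_space = alphabet_size ** residues_per_third
oversampling = library_size / search_space

# ASSUMPTION: uniform random sampling with replacement. Under it, the
# expected fraction of distinct sequences seen at least once is 1 - e^(-ratio).
expected_coverage = 1 - math.exp(-oversampling)

print(search_space)   # 1953125
print(oversampling)   # 5.12
print(expected_coverage)
```

    With a roughly fivefold oversampling, the expected coverage under that assumption is above 99%, which is consistent with the claim that the library covered practically the whole search space for each third.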

    Gpuccio:
    “So, what is it? 5 or 10-12?”
    That depends what you are trying to do.
    Demonstrate amino acid tolerances in simple proteins, or reproduce folding of a large number of today’s known protein families?
    The two are very different.

    No. What everybody is trying to do is to show how many AAs could have been present in the original alphabet at OOL. That is very clear in the first statement of your paper, in the abstract:

    “Early protein synthesis is thought to have involved a reduced amino acid alphabet. What is the minimum number of amino acids that would have been needed to encode complex protein folds similar to those found in nature today?”

    So, they are all debating the same thing.

    Gpuccio:
    “With the available evidence, I would say neither!”
    Of course, because when you have two papers sitting in front of you that explicitly demonstrate function with 5 or 10-12 amino acid alphabets Gpuccio will simply deny their existence.

    After having argued in great detail why I think that way, and why the paper you presented as evidence is evidence of nothing relevant to the discussion.

    You say:

    Gpuccio:
    “wt SH3: 7.5 micromoles”
    M is molar not moles. =)

    Well, at least you put a smiley 🙂

    OK, I should have written micromoles/liter.

    So, one big mistake on my part! 🙂

    Let’s close with part of your final statement:

    “We’ve both made mistakes”

    Yes, we are mistake buddies! 🙂

  270. 270
    Dionisio says:

    gpuccio,

    It seems like your politely dissenting interlocutor should have spent more time reading your comments carefully before coming back?

    Thus all the barking up the wrong trees could have been avoided?

    Well, at least the loud barking attracted more readers back to this thread.

    🙂

  271. 271
    Dionisio says:

    Critical residues in an SH3 domain from Sem-5 suggest a mechanism for proline-rich peptide recognition.
    Lim WA, Richards FM.

    That’s a 1994 paper.

    Here’s a 2016 paper referring to the same domain:

    http://journals.plos.org/ploso.....ne.0145872

  272. 272
    ET says:

    Corey:

    We’ve both made mistakes, but whereas mine are in glossing over and maybe embellishing things a little too much, your mistakes are egregious misunderstandings of the underlying science and biology.

    Except for the fact that YOU don’t have any science or biology to support the claim that blind and mindless processes can produce proteins 50 AAs long or longer. So it looks like you lose because of that failure.

  273. 273
    Dionisio says:

    ET,
    You may have scared the polite dissenter away.
    🙂

  274. 274
    Nonlin.org says:

    Gpuccio:
    Anaxagoras @1 is right (didn’t read all). The analysis might be fine, but it accepts Darwinist false assumptions. I would add that it’s too geeky, and this kind of message never wins. Contrast this with the “selfish gene” soundbite. Yes, it’s wrong, but effective. Btw, DNA is overrated: http://nonlin.org/dna-not-essence-of-life/

  275. 275
    gpuccio says:

    Nonlin.org:

    I have answered Anaxagoras at #5.

    I do not accept Darwinist false assumptions. I accept those assumptions that are supported by facts and have scientific credibility. Most Darwinist assumptions do not meet those requirements, and that’s why I reject them.

    But I completely accept the idea of sequence conservation by negative natural selection, for example, because it is completely supported by facts.

    Are you suggesting that we should reject any good idea only because Darwinists share it?

    Moreover, I can agree that DNA is “overrated”. But that is no reason to deny what it really does.

    Are you suggesting that protein coding genes do not bear the information for their proteins? Are you suggesting that the sequence of a protein is not important for its function? Are you suggesting that the conservation of a sequence is not an indicator of functional complexity? Or are you just suggesting that RV does not occur at all?

    Could you specify which “Darwinist false assumptions” I supposedly accept?

    Just to understand.
