Uncommon Descent Serving The Intelligent Design Community

What are the limits of Random Variation? A simple evaluation of the probabilistic resources of our biological world


Coming from a long and detailed discussion about the limits of Natural Selection, here:

I realized that some attention could be given to the other great protagonist of the neo-darwinian algorithm: Random Variation (RV).

For the sake of clarity, as usual, I will try to give explicit definitions in advance.

Let’s call RV event any random event that, in the course of Natural History, acts on an existing organism at the genetic level, so that the genome of that individual organism changes in its descendants.

That’s more or less the same as the neo-darwinian concept of descent with modifications.

A few important clarifications:

a) I use the term variation instead of mutation because I want to include in the definition all possible kinds of variation, not only single point mutations.

b) Random here means essentially that the mechanisms causing the variation are in no way related to function, whatever that function may be: IOWs, the function that may or may not arise as a result of the variation is in no way related to the mechanism that effects the change, but only to the specific configuration that arises randomly from that mechanism.

In the present discussion we will not consider how NS can change the RV scenario: I have discussed that in great detail in the previous thread quoted above, and those interested in that aspect can refer to it. In brief, I will just remind readers here that NS does not act on the sequences themselves (IOWs, the functional information) but, if, when, and to the extent that it can act at all, it acts by modifying the probabilistic resources.

So, an important concept is that:

All new functional information that may arise by the neo-darwinian mechanism is the result of RV.

Examining the Summers paper about chloroquine resistance:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4035986/

I have argued in the old thread that the whole process of generation of the resistance in natural strains can be divided into two steps:

a) The appearance of an initial new state which confers the initial resistance. In our example, that corresponds to the appearance of one of two possible resistant states, both of which require two neutral mutations. IOWs, this initial step is the result of mere RV, and NS has no role in that. Of course, the initial resistant state, once reached, can be selected. We have also seen that the initial state of two mutations is probably the critical step in the whole process, in terms of time required.

b) From that point on, a few individual steps of one single mutation, each of them conferring greater resistance, can optimize the function rather easily.

Now, point a) is exactly what we are discussing in this new thread.

So, what are the realistic powers of mere RV in the biological world, in terms of functional information? What can it really achieve?

Another way to ask the same question is: how functionally complex can the initial state that first implements a new function be, if it arises from mere RV?

And now, let’s define the probabilistic resources.

Let’s call probabilistic resources, in a system where random events take place, the total number of different states that can be reached by RV events in a certain window of time.

In a system where two dice are tossed each minute, and the numbers resulting from each toss are the states we are interested in, the probabilistic resources of the system in one day amount to 1,440 states.
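The dice system above can be written out in a few lines of Python; the 1/36 target (double six) is just an arbitrary example state, chosen here for illustration:

```python
# Toy model: two dice are tossed once per minute; each toss of the
# pair yields one observed state.
minutes_per_day = 60 * 24

# Probabilistic resources of the system in one day: one state per toss.
states_per_day = minutes_per_day
print(states_per_day)  # 1440

# Chance of hitting one arbitrary target state (say, double six,
# probability 1/36) at least once within those resources:
p_target = 1 / 36
p_hit = 1 - (1 - p_target) ** states_per_day
print(p_hit > 0.999)  # True: 1440 tries overwhelm a 1/36 target
```

As the target probability shrinks, the resources needed to make a hit likely grow in direct proportion, which is the point developed below.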

The greater the probabilistic resources, the easier it is to find some specific state, one which has some specific probability of being found in a single random attempt.

So, what are the states generated by RV? They are, very simply, all the different genomes that arise in any individual of any species by RV events, or, if you prefer, by descent with modification.

Please note that we are referring here to heritable variation only; we are not interested in somatic genetic variation, which is not transmitted to descendants.

So, what are the probabilistic resources in our biological world? How can they be estimated?

I will use here a top-down method. So, I will not rely on empirical data like those from Summers or Behe or others, but only on what is known about the biological world and natural history.

The biological probabilistic resources derive from reproduction: each reproduction event is a new state reached, if its genetic information differs from the previous state. So, the total number of states reached in a system in a certain window of time is simply the total number of reproduction events in which the genetic information changes, IOWs, in which some RV event takes place.

Those resources depend essentially on four main components:

  1. The population size
  2. The number of reproductions of each individual (the reproduction rate) in a certain time
  3. The mutation rate (RV events per genome per reproduction)
  4. The time window

So, I have tried to compute the total probabilistic resources (total number of different states) for some different biological populations, in different time windows, appropriate for the specific population (IOWs, for each population, from the approximate time of its appearance up to now). As usual, I have expressed the final results in bits (log2 of the total number).

Here are the results:

 

| Population | Size | Reproduction rate (per day) | Mutation rate | Time window | Time (in days) | Number of states | Bits | + 5 sigma | Specific AAs |
|---|---|---|---|---|---|---|---|---|---|
| Bacteria | 5.00E+30 | 24 | 0.003 | 4 billion years | 1.46E+12 | 5.26E+41 | 138.6 | 160.3 | 37 |
| Fungi | 1.00E+27 | 24 | 0.003 | 2 billion years | 7.30E+11 | 5.26E+37 | 125.3 | 147.0 | 34 |
| Insects | 1.00E+19 | 0.2 | 0.06 | 500 million years | 1.825E+11 | 2.19E+28 | 94.1 | 115.8 | 27 |
| Fish | 4.00E+12 | 0.1 | 5 | 400 million years | 1.46E+11 | 2.92E+23 | 78.0 | 99.7 | 23 |
| Hominidae | 5.00E+09 | 0.000136986 | 100 | 15 million years | 5.48E+09 | 3.75E+17 | 58.4 | 80.1 | 19 |

The mutation rate is expressed as mutations per genome per reproduction.
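The table's arithmetic is simply the product of the four factors, converted to bits. A minimal sketch, using the table's own input estimates (the Hominidae reproduction rate is written here as one reproduction per 20 years, i.e. 1/7300 per day):

```python
import math

def rv_states(pop_size, reprod_per_day, mut_rate, days):
    # Total states = reproduction events in which some RV occurs.
    return pop_size * reprod_per_day * mut_rate * days

populations = {
    # name: (size, reproductions/day, mutations/genome/reproduction, days)
    "Bacteria":  (5.00e30, 24, 0.003, 1.46e12),
    "Fungi":     (1.00e27, 24, 0.003, 7.30e11),
    "Insects":   (1.00e19, 0.2, 0.06, 1.825e11),
    "Fish":      (4.00e12, 0.1, 5, 1.46e11),
    "Hominidae": (5.00e9, 1 / 7300, 100, 5.48e9),
}

SIGMA5_BITS = 21.7           # 5-sigma safety margin, in bits
BITS_PER_AA = math.log2(20)  # ~4.32 bits per fully specified AA

for name, params in populations.items():
    n = rv_states(*params)
    bits = math.log2(n)
    aas = round((bits + SIGMA5_BITS) / BITS_PER_AA)
    print(f"{name:10s} {n:9.2e} {bits:6.1f} {bits + SIGMA5_BITS:6.1f} {aas:3d}")
```

Running this reproduces the last four columns of the table (e.g. 5.26e+41 states and 138.6 bits for bacteria).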

This is only a tentative estimate, and of course a gross one. I have tried to get the best reasonable values from the sources I could find, but of course many values could be somewhat different, and sometimes it was really difficult to find any good reference, and I just had to make an educated guess. Of course, I will be happy to acknowledge any suggestion or correction based on good sources.

But, even if we consider all those uncertainties, I would say that these numbers do tell us something very interesting.

First of all, the highest probabilistic resources are found in bacteria, as expected: this is due mainly to the huge population size and high reproduction rate. The numbers for fungi are almost comparable, although significantly lower.

So, the first important conclusion is that, in these two basic classes of organisms, the probabilistic resources, with this hugely optimistic estimate, are still under 140 bits.

The penultimate column just adds 21.7 bits (the margin corresponding to the 5-sigma safety threshold used for inferences about fundamental issues in physics). What does that mean?

It means, for example, that any sequence with 160 bits of functional information is, by far, beyond any reasonable probability of being the result of RV in the system of all bacteria in 4 billion years of natural history, even with the most optimistic assumptions.
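The 21.7-bit margin corresponds to a 5-sigma one-tailed probability, and can be checked in a couple of lines using Python's standard normal distribution:

```python
import math
from statistics import NormalDist

# One-tailed probability of a 5-sigma fluctuation under a normal model:
p_5sigma = 1 - NormalDist().cdf(5.0)   # ~2.87e-7
margin_bits = -math.log2(p_5sigma)
print(round(margin_bits, 1))  # 21.7
```

IOWs, demanding 5-sigma certainty is the same as demanding roughly 21.7 extra bits of improbability beyond the available resources.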

The last column gives the number of specific AAs that correspond to the bit value in the penultimate column (based on a maximum information value of about 4.32 bits per AA, i.e. log2 of the 20 possible amino acids).

For bacteria, that corresponds to 37 specific AAs.

IOWs, a sequence of 37 specific AAs is already well beyond the probabilistic resources of the whole population of bacteria in the whole world reproducing for 4 billion years!

For fungi, 147 bits and 34 AAs are the upper limit.

Of course, values become lower for the other classes. Insects still perform reasonably well, with 116 bits and 27 AAs. Fish and Hominidae have even lower values.

We can notice that Hominidae gain something from the mutation rate, which, as is known, is higher, and which I have set here at 100 new mutations per genome per reproduction (a reasonable estimate for Homo sapiens). Moreover, I have assumed a very generous population of 5 billion individuals, again taking a recent value for Homo sapiens. These are not realistic choices, but deliberately generous ones, just to make my darwinist friends happy.

Another consideration: I have given here total populations (or at least generous estimates for them), and not effective population sizes. Again, the idea is to give the highest chances to the neo-darwinian algorithm.

So, these are very simple numbers, and they should give an idea of what I would call the upper threshold of what mere RV can do, estimated by a top down reasoning, and with extremely generous assumptions.

Another important conclusion is the following:

All the components of the probabilistic resources have a linear relationship with the total number of states.

That is true for population size, for reproduction rate, mutation rate and time.

For example, everyone can see that the different time windows, ranging from 4 billion years to 15 million years, which seems a very big difference, span less than three orders of magnitude (a factor of about 267), and therefore contribute less than three orders of magnitude to the total number of states. Indeed, the largest variations are probably those in population size.

However, while the functional information in bits is linear in the number of necessary AA sites, the required probabilistic resources grow exponentially with it: each additional specific AA multiplies the target space by about 20. That is why a range from 19 to 37 AAs (a difference of only 18 AAs) corresponds to a range of about 24 orders of magnitude in the distribution of probabilistic resources.
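The contrast between linear resources and exponential sequence cost can be made concrete in a short sketch (the 19 and 37 AA endpoints are the table's own):

```python
import math

# Doubling any one linear resource (population, rate, time) adds 1 bit:
states = 5.26e41
print(round(math.log2(2 * states) - math.log2(states), 6))  # 1.0

# Each additional fully specified AA multiplies the target space by 20,
# i.e. costs log2(20) ~ 4.32 bits:
extra_aas = 37 - 19                       # the range in the table
extra_bits = extra_aas * math.log2(20)    # ~77.8 bits
orders_of_magnitude = extra_aas * math.log10(20)
print(round(extra_bits, 1), round(orders_of_magnitude))  # 77.8 23
```

Those roughly 23 orders of magnitude from 18 extra AAs match the roughly 24-order spread of probabilistic resources across the table.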

Can I remind here briefly, without any further comments, that in my OP here:

I have analyzed the informational jump in human conserved information at the appearance of vertebrates? One important result is that 10% of all human proteins (about 2,000) show an information jump from pre-vertebrates to vertebrates of at least (about) 500 bits (corresponding to about 116 AAs)!

Now, some important final considerations:

  1. I am making no special inferences here, and I am drawing no special conclusions. I don’t think it is really necessary. The numbers speak for themselves.
  2. I will be happy to receive any suggestion, correction, or comment, especially if based on facts or reasonable arguments. The discussion is open.
  3. Again, this is about mere RV. This is about the neutral case. NS has nothing to do with these numbers.
  4. For those interested in a discussion about the possible role of NS, I can suggest the thread linked at the beginning of this OP.
  5. I will be happy to answer any question about NS too, of course, but I would be even happier if someone tried to answer my two-question challenge, given at post #103 of the other thread, which nobody has answered yet. I paste it here for the convenience of all:

Will anyone on the other side answer the following two simple questions?

1) Is there any conceptual reason why we should believe that complex protein functions can be deconstructed into simpler, naturally selectable steps? That such a ladder exists, in general, or even in specific cases?

2) Is there any evidence from facts that supports the hypothesis that complex protein functions can be deconstructed into simpler, naturally selectable steps? That such a ladder exists, in general, or even in specific cases?

Comments
It is plainly irrational to believe in Darwinian evolution. What's an atheist to do?

Mung
November 5, 2017, 01:23 PM PDT
Gpuccio, I apologize for not reading all that has been written. It looks very interesting. Two things.

First, how does what you have been showing differ, if at all, from Behe's Edge of Evolution hypothesis? Your approach seems more mathematical than anything Behe has done but seems to reach the same conclusions.

Second, I happened by chance on a TV nature documentary this morning on Madagascar and how much of its flora and fauna differs from the rest of Africa. Are you aware of just how far these species/variations on Madagascar differ from the mainland of Africa? That is probably an unfair question, since my guess is that few know anything about Madagascar. But geographical variation of species formed an important part of Darwin's thinking, and is there any analysis of just how different the various species are by geographic separation? And of how they could have developed under geographic isolation and still be within your limits of what is possible?

Actually, a third thing: an "RV for Dummies" approach, to make this more accessible to those not mathematically inclined to follow all the bits and their implications (namely me). Somebody should do it here at UD so it can be used for those who are educated but not familiar with the technical arguments. I deal with these types of people frequently in other places and use the improbability of generating new coding sequences/proteins as the basis for undermining naturalistic evolution. It would be nice to have a layman's version to use with these people, especially when they bring a biologist to support them. I once tried to explain this to the author of a non-technical article on evolution and the person got Kenneth Miller to support him.

jerry
November 5, 2017, 01:19 PM PDT
J-Mac: "Even a new mutation that is slightly favorable will usually be lost in the first few generations after it appears in the population, a victim of genetic drift. If a new mutation has a selective advantage of S in the heterozygote in which it appears, then the chance is only 2S that the mutation will ever succeed in taking over the population. So a mutation that is 1 percent better in fitness than the standard allele in the population will be lost 98 percent of the time by genetic drift." That's a very important point, thank you! :) See also my previous post (#122).

gpuccio
November 5, 2017, 08:39 AM PDT
Origenes: Of course the waiting time is a critical factor. And it depends strictly on the population size. I would like to clarify a few concepts here.

1) In this OP, I have considered only the probabilistic resources inherent in the biological scenario on our planet: IOWs, the total number of different states that can really be reached by some random walk in the whole of natural history, by different types of biological populations.

2) The numbers in my table are only an evaluation of an upper threshold, computed with extremely generous, and essentially unrealistic, assumptions in favor of neo-darwinism. For example, the population sizes are by far too big, and the reproduction rates, for bacteria for example, are those of an expanding population, and not those of a steady state. And so on.

3) The meaning, therefore, is the following: even with the best and most unrealistic assumptions for neo-darwinism, there is no chance that more than such a number of states could be reached in natural history.

4) As the total number of states is ridiculously small, in comparison with the probabilistic resources needed by even simple functional proteins, the idea is that neo-darwinism is completely out of the game.

5) In this reasoning, I have not considered the effects of NS. Those I have discussed in the previous OP. Therefore, the reasoning in this OP is about the power of RV to generate the starting function on which NS could eventually act, or to generate any new function from scratch in random walks that take place in non-functional sequences, and are therefore by definition neutral.

6) Moreover, I have not considered in this OP the possible effects of neutral genetic drift because, as I have discussed many times, it does not act on the probabilistic resources. Given a number of states n that can be reached by the system, drift events cannot in any way favour functional states over non-functional states. Therefore, nothing changes.

7) The important point is: if one includes NS or genetic drift in the model, the time to fixation becomes a critical factor. Indeed, time to fixation can be extremely long, especially for drift, while it can be shorter for NS, but significantly so only if the selection coefficient is very high (as in antibiotic resistance).

8) The time to fixation for neutral drift is, according to population genetics, 4Ne generations, where Ne is the effective population size. Therefore, if the population size is very big, the time to fixation is very big too. Just as an extreme example, if my number of 5E30 given in my table for bacteria were a real effective population size (which, of course, it is not), the time to fixation for a neutral trait would be 2E+31 generations, which, considering one generation per hour, as I have done in my table, would give us a time to fixation of 2.28E+27 years, which is some big time from all points of view. Of course, effective population sizes are much smaller. Let's say that we have an effective population size for bacteria of 10^9. That would give us a time to fixation, for one neutral trait, of only 456,621 years, about 0.5 million years. OK, but with an effective population size of 1E+09, the total number of states that can be reached in 4 billion years in the bacterial scenario declines dramatically, from 5.26E+41 (the number in my table) to 2.10E+19! So, as I said, adding genetic drift of neutral traits to the scenario does not improve the situation at all! :)

9) Of course, NS will be more efficient, and its time to fixation will be shorter, according to the selection coefficient. Low values of s will not help much, but higher values, as in antibiotic resistance, can do the trick of greatly facilitating small steps (usually of one amino acid), as we have seen in the scenario of chloroquine resistance optimization (see the Summers paper). However, NS too has the problem of time to fixation, and that can be a real stopper in many situations, as you correctly say.

10) Finally, all those who really want to consider realistically the role of NS as some help against the disastrous numbers in my OP can certainly try to do that, but again they are cordially invited, as a first step, to take my challenge, repeated at the end of this OP, and answer my two simple questions, which are critically important for any role that NS is supposed to have. Good luck to everybody! :)

gpuccio
November 5, 2017, 08:31 AM PDT
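The fixation-time arithmetic in point 8 of the comment above (4Ne generations, one generation per hour) can be reproduced in a few lines:

```python
# Neutral drift: expected time to fixation ~ 4 * Ne generations.
GENERATIONS_PER_YEAR = 24 * 365   # one generation per hour, as in the table

def neutral_fixation_years(ne):
    return 4 * ne / GENERATIONS_PER_YEAR

# The (unrealistic) 5E30 from the table, taken as an effective size:
print(f"{neutral_fixation_years(5e30):.2e}")   # 2.28e+27 years

# A more plausible bacterial effective population size:
print(round(neutral_fixation_years(1e9)))     # 456621 years
```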
gpuccio: Whether we define life as the ability to reproduce or the ability to generate and maintain far from equilibrium systems, this is merely a human convention, an instance of human language, and it is completely unrelated to the problem of explaining how the organisms that we observe around us came to be.

Viewing organisms through the lens of linguistic constructs is a necessity for evolutionists because, given their assumption ('once life appears and begins to reproduce, the emergence of higher life forms becomes possible'), it is convenient for them to say that chemical systems where simple molecules can self-reproduce are living systems. Since we know that such systems can emerge by natural means, it follows that the emergence of higher life forms by natural means is a trivial occurrence. The problem is of course in their assumption.

The ability to reproduce is not some magical thing, but merely a mechanism for changing the spatial positions of particles. For example, when a bacterium reproduces to provide the raw material for evolution (mutations), all that is happening is a rearrangement of the particles that comprise the bacterium. That's all. But exactly the same processes occur in non-living matter: particles change their spatial positions. That is why the above assumption is like saying that, due to particle motion, the emergence of higher life forms becomes possible, which is of course a nonsensical statement. So, the problem is not the definition or linguistic distinction of living and non-living matter, but the ratio between non-FAOP and FAOP.

forexhr
November 5, 2017, 05:46 AM PDT
@116 Moran lives in the wishful world of his impotent god, random genetic drift, which, just like natural selection, can build nothing, create nothing, and eliminates whatever benefits mutations happen to achieve. As Griffith and colleagues write (1999, p. 564): "Even a new mutation that is slightly favorable will usually be lost in the first few generations after it appears in the population, a victim of genetic drift. If a new mutation has a selective advantage of S in the heterozygote in which it appears, then the chance is only 2S that the mutation will ever succeed in taking over the population. So a mutation that is 1 percent better in fitness than the standard allele in the population will be lost 98 percent of the time by genetic drift."

J-Mac
November 5, 2017, 05:40 AM PDT
@118 "Out of 120,000 fertilized eggs of the green frog only two individuals survive. Are we to conclude that these two frogs out of 120,000 were selected by nature because they were the fittest ones; or rather - as Cuenot said - that natural selection is nothing but blind mortality which selects nothing at all?" Natural selection builds things only in Darwinists' wishful minds, and not in real life...

J-Mac
November 5, 2017, 05:35 AM PDT
//About NS, off-topic// GPuccio, I would very much appreciate your opinion on the following line of reasoning: natural selection slows evolution down.

Natural selection (NS) culls variety, eliminating organisms with certain traits from the population. In effect, NS enhances the probability for the remaining variation: IOWs, NS intensifies a particular search and thereby enhances the probability of its success. Assuming that functional islands are not necessarily connected, and assuming constant population size, the overall chance of finding evolutionary novelty remains the same with or without NS: the intensified search compensates for the loss of variety. IOWs, the presence of NS does not affect the chance of success of the overall evolutionary search. However, the waiting time for the selected variation to restore the original population size slows evolution down. If some disease kills off all of the human race except the Japanese, then the probability of finding evolutionary novelties is restored only at the moment that the Japanese reach a population size of 7 billion. My point? The waiting time needs to be factored in.

Origenes
November 5, 2017, 05:08 AM PDT
forexhr: I can agree with you, but the point is that when I speak of OOL I am really thinking of the Origin of LUCA or, if you want, the Origin of Prokaryotes. Indeed, I do not believe that simpler forms of life ever existed. My idea is that life probably started with something very similar to a prokaryote (IOWs, LUCA). I don't believe that the simple "ability to reproduce" has anything to do with life. Chemical systems where simple molecules can self-reproduce are not, IMO, living systems, nor can they evolve into living systems. A living organism requires much more:

a) Separation and differentiation of an inner environment vs. an outer environment.
b) Metabolism to derive energy, which allows the generation of very low entropy structures.
c) The ability to generate and maintain far from equilibrium systems.

And probably many other things. So, in this sense, maybe OOL required even more design than the following steps. Or at least a comparable amount. However, I suppose that we agree that nothing of all that could have happened without a lot of design! :)

gpuccio
November 5, 2017, 03:32 AM PDT
Larry Moran, wd400, CR, MatSpirit, Seversky, Goodusername, rvb8, Gordon Davidson and many others, Where Art Thou?

Origenes
November 5, 2017, 03:21 AM PDT
gpuccio: "OOL just requires probably some more powerful design!" I must disagree with the above statement of yours, because the origin of higher life forms (OHLF) is an even greater problem than the origin of life (OOL). Here is why.

Since everything in nature is made of a large number of particles that constantly change their spatial positions, the idea that life arose from non-life basically boils down to finding a functional arrangement of particles (FAOP), one which has the ability to reproduce, through the process of particle rearrangements. The problem for OOL lies in the ratio between non-FAOP and FAOP, which is so huge that the quantity of particle rearrangements in the entire lifespan of the Universe is insufficient to find even one instance of FAOP.

OHLF faces the same problem, but there this problem is masked by the linguistic construct called natural selection (NS). Basically, NS just pre-specifies FAOP through fitness, or the ability of an organism to survive in a certain environment. For example, given an aquatic environment, if an AOP within an organism gives it the ability to breathe under water, such an AOP is by definition a FAOP. Given an intron-exon environment, if an AOP within an organism gives it the ability to remove introns from pre-mRNA, then such an AOP is by definition a FAOP. Hence, higher life forms are actually numerous FAOPs pre-specified by various environments. Meaning, just as with OOL, the origin of a particular biological function boils down to finding a FAOP in a nearly infinite pool of non-FAOPs.

But, since finding one FAOP (the ability to reproduce: OOL) is easier than finding many different FAOPs (the ability to reproduce sexually, remove introns, think, see, talk, jump, ...), OHLF is an even greater problem than OOL. Thus, the truth is that OHLF requires more powerful design than OOL.

forexhr
November 5, 2017, 03:19 AM PDT
To all: This must be almost a record: if I am not wrong, not even one single comment from the other side in this thread. Maybe RV is of no interest to them! :)

gpuccio
November 4, 2017, 10:47 PM PDT
EugeneS: I am definitely for strong ID. Very strong! :) While there is no doubt that the basic parameters are fine-tuned to allow life, I don't believe that any special setting could, by itself, ever generate the tons of specific functional information that are necessary for life to exist. There is no empirical support for the idea that our environment, or even our world, however fine-tuned it may be, has the specific understanding of biochemical laws, and the computational resources, to guide the generation of thousands of amazing biochemical machines like the ones that we observe in all living beings.

Moreover, I insist that the same evidence for design that we find for OOL is also present for life evolution: each new wonderful protein that appears in natural history is evidence of design, not only OOL. OOL probably just requires some more powerful design! :) But we have seen that many other events, like the appearance of eukaryotes, and of metazoa, and, just to remain in a field that I have discussed in detail with empirical evidence of all kinds, the transition to vertebrates, all require the input of tons of new functional information.

So, what we need is strong, very very strong ID. :) Otherwise we cannot really explain anything of what we observe, and we are therefore no better than neo-darwinists, just imagining things that are not supported by facts.

gpuccio
November 4, 2017, 12:13 PM PDT
Mung #61: Of course not! RV+NS is a secondary (induced) phenomenon. Whatever its capabilities in reality, it must have started from a population of self-reproducing organisms. How did nature get there?! To answer that question in earnest (without shameless intellectual tricks like the multiverse), one inevitably needs to bring ID to the table, at least in the form of parameter fine-tuning (weak ID).

EugeneS
November 4, 2017, 10:16 AM PDT
Can this thread be somehow associated with or related to the following link? https://uncommondescent.com/intelligent-design/evolutionary-predictions-of-protein-structure-is-iffy/

Dionisio
November 3, 2017, 09:01 PM PDT
gpuccio, There is enough evidence that mutations are non-random... when one considers quantum coherence... If you add to the pot of evolution that there is evidence for a strong element of randomness in natural selection, then you have a clear picture of what Darwinian evolution is facing today... All Darwinists can do is deny the evidence and hope for the best... that their faith holds up as long as they are alive...

J-Mac
November 3, 2017, 01:35 PM PDT
gpuccio, The Neo-Darwinian RV+NS enchilada you have "cooked" and served here lately keeps attracting many readers:

Popular Posts (Last 30 Days)
What are the limits of Natural Selection? An interesting… (2,744)
Violence is Inherent in Atheist Politics (2,036)
What are the limits of Random Variation? A simple evaluation (1,213)
Of course: Mathematics perpetuates white privilege (1,184)
Is social media killing Wikipedia? (1,068)

Well done!

Dionisio
November 3, 2017, 01:09 PM PDT
forexhr: That's absolutely right! :) New proteins or protein domain superfamilies indeed "start from scratch", because superfamilies are completely unrelated at the level of sequence and structure. And it is not necessary that the whole protein be completely new: it is enough to show that there is a transition of high functional complexity, that new complex and specific parts of the molecule are acquired from scratch.

For example, in my OP here:

https://uncommondescent.com/intelligent-design/the-amazing-level-of-engineering-in-the-transition-to-the-vertebrate-proteome-a-global-analysis/

I have shown that more than 1.7 million bits of functional information are acquired by the 20,000 proteins that we find in the human genome in the transition from pre-vertebrates to vertebrates, which happens in about 30 million years. And the population is certainly not the 5.00E+30 bacteria of my table. It is rather some bunch of lancelets, with probabilistic resources at most comparable to those of fish (78 bits), but probably much less. So, those 78 bits of variation should explain the acquisition of 1.7 million bits of functional information in 30 million years, information so specific that it will be conserved for 400+ million years, up to humans!

Moreover, the distribution of the informational jump in the 20,000 proteins of the human proteome shows that the 90th percentile of the jump is 486 bits, and the 95th percentile is 733 bits. IOWs, of the 20,000 human proteins, 10% (about 2,000) have a specific information jump of 486 bits or more, and 5% (about 1,000) have a specific information jump of 733 bits or more, in that quick evolutionary transition from lancelet-like organisms to cartilaginous fish, in about 30 million years. All that starting from at most 78 bits of probabilistic resources!

gpuccio
November 3, 2017, 11:29 AM PDT
gpuccio: Regarding the search for new functional sequences, the average evolutionist would respond in the following way: "New function comes out of pre-existing functions. Nothing starts out from scratch. Your probability calculations reflect deliberate assumptions of everything starting out de novo, in order to ensure that the resulting probabilities come out as low as possible." But of course, such an alibi response is completely flawed. Here is why. If the DNA sequences ATT, CGC and ACA are something that is functional in environment A, while the DNA sequences TAC, AAA and CCC are required for adaptation to environment B, then the first sequences are just as much junk as all the 'non-TAC, AAA and CCC' sequences. In other words, in an environment which requires visual function, DNA that codes for a fully functional heart (a pre-existing function) is just as much junk as any random sequence of nucleotides. Hence, every new adaptation starts out from scratch.

forexhr
November 3, 2017, 09:52 AM PDT
forexhr: Exactly! An interesting aspect is also that, as seen in the discussion with daveS, NS seems cut out of the game when the random walk takes place in some non-functional part of the genome.

Indeed, that seems the only reasonable scenario, because, as daveS has pointed out, a random walk starting from a functional sequence, towards some unrelated sequence with a different function, seems really impossible because of the severe constraints posed by negative selection. Of course, short optimizing walks for the same function, or slight variations of it, are perfectly possible and can be driven by NS, as we have seen with the scenario of chloroquine resistance optimization. But for really new information, only a neutral walk seems feasible.

But a neutral walk implies all the probabilistic barriers that we have discussed, with no help from NS. NS could come to the rescue only when the new function is already there, even if not yet optimized. But we have seen that the new function cannot be there by RV alone. And of course, at some point in the neutral random walk, we have to go back to the functional scenario, so that NS can start the fixation and optimization task!

IOWs, as Origenes says: "Note that we graciously assume that, at the moment a functional sequence is formed in junk-dna, somehow, this particular sequence (and not any other) is activated and translated into proteins. Why or how this is done, no one knows."gpuccio
November 3, 2017, 09:34 AM PDT
kurx78: "This post is amazing, I’ve learned a lot." Thank you, I am happy you like it! :) "But now I remember, where are the politely dissenting interlocutors?" Yes, where are they?gpuccio
November 3, 2017, 09:25 AM PDT
daveS: "Thanks, this answers exactly the questions I meant to ask." I am happy of that! :)gpuccio
November 3, 2017, 09:24 AM PDT
If a new market niche opens up for a stone sculpture of Michael Jackson, is it possible for this niche to be closed up by an erosion process? The answer is no, because, given the poly-3D enumeration mathematics, a mere thousand particles can be arranged into approximately 10^3,271 different states (2^(n-7) n^(n-9) (n-4)(8n^8-128n^7+828n^6-2930n^5+7404n^4-17523n^3+41527n^2-114302n+204960)/6). So, the ratio between 'non-Michael Jackson stone shapes' and 'Michael Jackson stone shapes' is so large that even if every proton in the observable universe were a stone under an erosion process, eroding extremely fast from the Big Bang until the end of the universe, when protons might no longer exist, we would still need trillions and trillions of orders of magnitude more time to have even a one-in-trillions-and-trillions chance of success.

This question in evolutionary language is as follows. If a new environmental niche opens up for a male reproductive system, is it possible for this niche to be closed up by mutating DNA sequences? The answer is no, because if we suppose that some simple male reproductive system is represented by only 10,000 nucleotides, which can be arranged into approximately 10^6,020 different states, the ratio between DNA sequences for a 'non-male reproductive system' and a 'male reproductive system' is so large that even if every proton in the observable universe were a DNA sequence mutating extremely fast from the Big Bang until the end of the universe, the information for a male reproductive system would not be found.

The probability of evolution is therefore zero in any operational sense of an event, and the belief that the process of RV+NS must eventually succeed in producing complex molecular machines, organs and organ systems is delusional.forexhr
November 3, 2017, 09:15 AM PDT
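The sequence-space arithmetic in the comment above is easy to check. A DNA sequence of length n has 4^n possible states; the sketch below (plain Python, no assumptions beyond the quoted length of 10,000 nucleotides) confirms the 10^6,020 figure.

```python
import math

def log10_dna_states(n_nucleotides):
    """log10 of 4^n, the number of distinct DNA sequences of length n."""
    return n_nucleotides * math.log10(4)

# 10,000 nucleotides, as in the comment:
print(math.floor(log10_dna_states(10_000)))  # -> 6020, i.e. ~10^6,020 states
```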
This post is amazing, I've learned a lot. But now I remember, where are the politely dissenting interlocutors? Like the person who "demonstrated" that literally everything is possible with random mutation if you give it enough time and cast a spell on it.kurx78
November 3, 2017, 07:38 AM PDT
gpuccio, Thanks, this answers exactly the questions I meant to ask.daveS
November 3, 2017, 06:39 AM PDT
Origenes: "Note that we graciously assume that, at the moment a functional sequence is formed in junk-dna, somehow, this particular sequence (and not any other) is activated and translated into proteins. Why or how this is done, no one knows." Yes, we graciously assume a lot of things. We are definitely very kind to our darwinist interlocutors! :)gpuccio
November 3, 2017, 06:30 AM PDT
daveS: Let's try to understand each other. You had asked: "Am I correct to say then that if you choose an arbitrary sequence S from this 4^4600000-element set of sequences, there's a fair chance that it would be within a realistic number of mutations of the genome of some viable organism? (by "realistic", I mean these mutations could all reasonably be expected to occur in 1 generation)." Emphasis mine.

So, your question was about "a fair chance", and I have answered it. But again, if I understand well, your point seems to be that a whole genome cannot undergo states that are incompatible with life, and therefore many states that are completely distant from the viable state cannot be reached by a whole genome. That's obviously right. The Shakespeare organism would not be viable, neither in its final state, nor in most nearby states at the sequence level. But that has no relevance to the discussion I am making here. I have been rather clear:

1) In my table, I have computed the probabilistic resources using average mutation rates per genome per replication, as I found them in available sources.

2) I could have used the more consistent value of 10^-8 or 10^-9 mutations per nucleotide site, but that would require an adjustment for genome size if we want to have an idea of how many states a class of organisms can reach.

3) The upper threshold of the number of states that organisms can reach depends, as said, on the population size, the reproduction rate and the mutation rate.

4) Of these three variables, the population size and the reproduction rate can be considered rather constant for each class of organisms (at least in such a general estimate).

5) The mutation rate, instead, is rather constant per nucleotide site, and therefore depends linearly on the length of the sequence we consider.
6) However, as the relationship is linear, and the range of genome lengths is not huge, the differences in the total number of states, in different classes of organisms, are not influenced as much by the mutation rate per genome as by the population size. For example, the difference in population size in my table between the two extremes, bacteria and hominidae, is 21 orders of magnitude, while the difference in genome size between the same two classes is about 3 orders of magnitude. That's why the differences in the total number of states, which span 24 orders of magnitude, can be related essentially to the difference in population size.

7) Of course, as my numbers are expressed for the whole genome, when you asked what fraction of the search space can be reached with those probabilistic resources, I computed that value for the average bacterial genome, and the total number of states that can be reached by the bacterial system. That is absolutely correct, and it is in no way modified by the fact that, of course, many of the states in the search space are incompatible with life, if the search is done on a whole genome.

8) Of course, no realistic search can be done on a whole genome, because a great (or maybe small in some cases, for those who believe in junk DNA) part of a whole genome cannot change. For example, protein coding genes are strongly constrained by negative selection, and in many cases, like the ATP synthase alpha and beta chains, histones, ubiquitin, dynein, those constraints are huge, and the sequence is very much fixed in great part.

9) As I have said, the only realistic conclusion is that functional sequences can change very little, and mostly in a neutral way, which does not change the functional state of the organism. Or maybe sometimes they may undergo minor changes in the direction of some optimization of their existing function.
10) Therefore, practically all the search for really new functional information must take place in non-functional sequences. That allows space for free random walks, which can reach any part of the search space, but in accordance with the probabilistic rules.

11) Of course, as we have said, a whole genome cannot be non-functional: therefore a whole genome will never take part in that kind of random walk. But some definite subset of it certainly can.

12) So, let's go again to our E. coli example. Let's no longer compute the relationship between the search space that can be reached and the total search space for the whole genome. Let's compute it for a subset of it.

13) Let's assume, just for the sake of discussion, that 10% of the E. coli genome could be non-functional, and take part in a neutral random walk. It's not really important how much it is; I just want to show how the computation must be made.

14) Now, we have 460 Kbp, instead of 4.6 Mbp. OK? The mutation rate is now 4.60E-04 (1E-9 * 460000). The number of states that can be reached is now 8.06E+40, one order of magnitude lower, as expected (we have reduced the sequence length by one order of magnitude). And now all these states can be reached, because we are discussing a purely neutral random walk in non-functional DNA.

15) Of course, the search space is much smaller than for the whole genome: it is now "only" 4^460000, about 1E276947!

16) So, the ratio of the search space that can be reached to the whole search space is now 8.06E+40 : 1E276947. IOWs, only a fraction of 1 : 10^276907 can be reached by the whole bacterial system in 4 billion years. Is that tiny enough? :)

17) Now, let's do that again for a shorter sequence, and a more defined scenario. Let's say that one gene for a 400-AA protein (a perfectly medium-size protein) is duplicated and inactivated in E. coli. This is a very frequently invoked scenario in the neo-darwinian field.
Let's say, for the sake of discussion, that the duplicated and inactivated gene has been miraculously fixed by genetic drift, and is now, at the beginning of our 4-billion-year window, present in the whole population of 5E30 bacteria, ready to start its random walk.

18) Now the sequence that can mutate is only 1200 nucleotides. So, let's do the computations. Total states that can be reached by the whole bacterial system: 2.10E+38 (127 bits). Search space: 4^1200 = 10^722. The fraction of the search space for that sequence that can be reached by the whole bacterial system in 4 billion years, for one gene for some new protein of about 400 AAs, is 1 : 2.1x10^684. Is that tiny enough?

19) Finally, let's remember that the total number of states that can be searched is a measure of the probabilistic resources, not a measure of functional information. But it can immediately be related to measures of functional information. For example, we have seen that, in a maximal bacterial system, the highest number of total states that can be reached by a neutral sequence of 1200 nucleotides is 127 bits. That means, very simply, that any sequence which is definitely more complex than that will not realistically be reached by that stretch of nucleotides (our duplicated and inactivated gene). If you want five-sigma certainty, you can just add 22 bits, for a total of 149 bits. So, no protein with more than 149 bits of functional information is in the range of a duplication-inactivation process in the super-maximal bacterial system we have considered, in 4 billion years. ATP synthase beta chain has at least 663 bits of functional information. QED.gpuccio
November 3, 2017, 06:29 AM PDT
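The numbers in points 13-19 of the comment above can be reproduced with a short script. One caveat: the total number of replication events in the 4-billion-year bacterial system is not stated explicitly in this comment, so the value below (about 1.752E+44) is an assumption back-solved from the quoted figures (8.06E+40 states at a per-genome rate of 4.60E-04); the per-site mutation rate of 1E-9 is stated in point 2.

```python
import math

# Assumption: total replication events of the maximal bacterial system
# over 4 billion years, back-solved from the comment's own figures
# (8.06e40 / 4.6e-4). Not stated directly in the source.
TOTAL_REPLICATIONS = 1.752e44
MUT_RATE_PER_SITE = 1e-9   # mutations per nucleotide per replication

def reachable_states(seq_len_nt):
    """New sequence states the whole system can visit in the time window."""
    return TOTAL_REPLICATIONS * MUT_RATE_PER_SITE * seq_len_nt

def log10_search_space(seq_len_nt):
    """log10 of 4^L, the full sequence space of L nucleotides."""
    return seq_len_nt * math.log10(4)

# Point 14: 10% of the E. coli genome, 460 kbp of neutral DNA
print(f"states reached: {reachable_states(460_000):.2e}")            # ~8.06e+40
print(f"search space: ~10^{math.floor(log10_search_space(460_000))}")

# Points 18-19: a duplicated, inactivated gene of 1200 nucleotides
states = reachable_states(1_200)
print(f"states reached: {states:.2e}")                               # ~2.10e+38
print(f"resources: {math.log2(states):.0f} bits")                    # ~127 bits
print(f"five-sigma threshold: {math.log2(states) + 22:.0f} bits")    # ~149 bits
```

The 22-bit margin corresponds to a probability of about 2^-22 ≈ 2.4E-7, close to the one-tailed five-sigma level; any protein whose functional information exceeds the printed threshold is, on this model, out of reach of the duplication-inactivation scenario.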
gpuccio, PS to my post #94: My question really boils down to whether there are well-defined "islands of viability" in the sequence space, and if so, what is the approximate total size of this set of islands.daveS
November 3, 2017, 06:18 AM PDT
DaveS: And yes, junk DNA is an issue here, although presumably there has to be some coding DNA present in order for the genome to be “viable”.
Absolutely right, Venter's minimal genome (473 genes) comes to mind. The search by an organism must respect certain boundaries, which certainly doesn't help the evolutionary search. However, given junk-dna, I have no idea if this constraint can be translated into a number. That would be very nice, of course.Origenes
November 3, 2017, 05:44 AM PDT
Origenes,
Are you saying that the search by organisms is confined to viable paths? If so, while this is obviously true, there is the complication of junk-DNA.
Yes, viable paths, or at least within one step in a random walk from viable. Non-viable sequences "close" to viable could also be searched, although the resulting organism would not survive to reproduce. And yes, junk DNA is an issue here, although presumably there has to be some coding DNA present in order for the genome to be "viable".daveS
November 3, 2017, 05:26 AM PDT