
Defending Intelligent Design theory: Why targets are real targets, probabilities real probabilities, and the Texas Sharp Shooter fallacy does not apply at all.


The aim of this OP is to discuss, in an orderly and reasonably complete way, a few related objections to ID theory, all connected to the argument that goes under the name of the Texas Sharp Shooter fallacy, which is sometimes used as a criticism of ID.

The argument that the TSS fallacy is a valid objection against ID has been presented many times by DNA_Jock, a very good discussant from the other side. So, I will refer in some detail to his arguments, as I understand and remember them. Of course, if DNA_Jock thinks that I am misrepresenting his ideas, I am ready to acknowledge any correction. He can post here, if he can and likes, or at TSZ, where he is a contributor.

However, I think that the issues discussed in this OP are of general interest, and that they touch some fundamental aspects of the debate.

As a help to those who read this, I will sum up the general structure of this OP, which will probably be rather long. I will discuss three different, somewhat related arguments. They are:

a) The application of the Texas Sharp Shooter fallacy to ID, and why that application is completely wrong.

b) The objection of the different possible levels of function definition.

c) The objection of the possible alternative solutions, and of the incomplete exploration of the search space.

Of course, the issue debated here is, as usual, the design inference, and in particular its application to biological objects.

So, let’s go.

a) The Texas Sharp Shooter fallacy and its wrong application to ID.

 

What’s the Texas Sharp Shooter fallacy (TSS)?

It is a logical fallacy. I quote here a brief description of the basic metaphor, from RationalWiki:

The fallacy’s name comes from a parable in which a Texan fires his gun at the side of a barn, paints a bullseye around the bullet hole, and claims to be a sharpshooter. Though the shot may have been totally random, he makes it appear as though he has performed a highly non-random act. In normal target practice, the bullseye defines a region of significance, and there’s a low probability of hitting it by firing in a random direction. However, when the region of significance is determined after the event has occurred, any outcome at all can be made to appear spectacularly improbable.

For our purposes, we will use a scenario where specific targets are apparently shot by a shooter. This is the scenario that best resembles what we see in biological objects, where we can observe a great number of functional structures, in particular proteins, and we try to understand the causes of their origin.

In ID, as is well known, we use functional information as a measure of the improbability of an outcome. The general idea is similar to Paley’s argument for a watch: a very high level of specific functional information in an object is a very reliable marker of design.

But to evaluate functional information in any object, we must first define a function, because the measure of functional information depends on the function defined. The observer must be free to define any possible function, and then measure the linked functional information. Given these premises, the idea is that if we observe any object that exhibits complex functional information (for example, more than 500 bits of functional information) for an explicitly defined function (whatever it is), we can safely infer design.
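
Just to make the measure explicit, here is a minimal sketch in Python of the computation implied above (functional information taken, as usual in this framework, as the negative log2 of the ratio between target space and search space). The numbers in the example are purely illustrative, not taken from any real protein.

import math

def functional_information(target_space_size, search_space_size):
    # Functional information (in bits) for an explicitly defined function:
    # -log2 of the fraction of the search space that implements the function.
    return -math.log2(target_space_size / search_space_size)

# Illustrative only: if 1 sequence out of 2^600 implements the defined function,
# the functional information is 600 bits, well above the 500-bit threshold
# mentioned above.
print(functional_information(1, 2**600))  # 600.0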

Now, the objection that we are discussing here is that, according to some people (for example DNA_Jock), by defining the function after we have observed the object, as we do in ID theory, we are committing the TSS fallacy. I will show why that is not the case using an example, because examples are clearer than abstract words.

So, in our example, we have a shooter, a wall which is the target of the shooting, and the shooting itself. And we are the observers.

We know nothing of the shooter. But we know that a shooting takes place.

Our problem is:

  1. Is the shooting a random shooting? This is the null hypothesis

or:

  2. Is the shooter aiming at something? This is the “aiming” hypothesis

So, here I will use “aiming” instead of design, because my neo-darwinist readers will probably stay more relaxed. But, of course, aiming is a form of design (a conscious representation outputted to a material system).

Now I will describe three different scenarios, and I will deal in detail with the third.

  1. First scenario: no fallacy.

In this case, we can look at the wall before the shooting. We see that there are 100 targets painted in different parts of the wall, rather randomly, with their beautiful colors (let’s say red and white). By the way, the wall is very big, so the targets are really a small part of the whole wall, even if taken together.

Then, we witness the shooting: 100 shots.

We go again to the wall, and we find that all 100 shots have hit the targets, one per target, and just at the center.

Without any worries, we infer aiming.

I will not compute the probabilities here, because we are not really interested in this scenario.

This is a good example of pre-definition of the function (the targets to be hit). I believe that neither DNA_Jock nor any other discussant will have problems here. This is not a TSS fallacy.

  2. Second scenario: the fallacy.

The same setting as above. However, we cannot look at the wall before the shooting. No pre-specification.

After the shooting, we go to the wall and paint a target around each of the different shots, for a total of 100. Then we infer aiming.

Of course, this is exactly the TSS fallacy.

There is a post-hoc definition of the function. Moreover, the function is obviously built (painted) to correspond to the information in the shots (their location). More on this later.

Again, I will not deal in detail with this scenario because I suppose that we all agree: this is an example of TSS fallacy, and the aiming inference is wrong.

  3. Third scenario: no fallacy.

The same setting as above. Again, we cannot look at the wall before the shooting. No pre-specification.

After the shooting, we go to the wall. This time, however, we don’t paint anything.

But we observe that the wall is made of bricks, small bricks. Almost all the bricks are brown. But there are a few that are green. Just a few. And they are randomly distributed in the wall.

We also observe that all the 100 shots have hit green bricks. No brown brick has been hit.

Then we infer aiming.

Of course, the inference is correct. No TSS fallacy here.

And yet, we are using a post-hoc definition of function: shooting the green bricks.

What’s the difference with the second scenario?

The difference is that the existence of the green bricks is not something we “paint”: it is an objective property of the wall. And, even if we do use something that we observe post-hoc (the fact that only the green bricks have been shot) to recognize the function post-hoc, we are not using in any way the information about the specific location of each shot to define the function. The function is defined objectively and independently from the contingent information about the shots.

IOWs, we are not saying: well, the shooter was probably aiming at point x1 (coordinates of the first shot) and point x2 (coordinates of the second shot), and so on. We just recognize that the shooter was aiming at the green bricks, an objective property of the wall.

IOWs (I use many IOWs, because I know that this simple concept will meet great resistance in the minds of our neo-darwinist friends), we are not “painting” the function, we are simply “recognizing” it, and using that recognition to define it.

Well, this third scenario is a good model of the design inference in ID. It corresponds very well to what we do in ID when we make a design inference for functional proteins. Therefore, the procedure we use in ID is no TSS fallacy. Not at all.

Given the importance of this model for our discussion, I will try to make it more quantitative.

Let’s say that the wall is made of 10,000 bricks in total.

Let’s say that there are only 100 green bricks, randomly distributed in the wall.

Let’s say that all the green bricks have been hit, and no brown brick.

What are the probabilities of that result if the null hypothesis is true (IOWs, if the shooter was not aiming at anything) ?

The probability of one successful hit (where success means hitting a green brick) is of course 0.01 (100/10000).

The probability of having 100 successes in 100 shots can be computed using the binomial distribution. It is:

10^-200

IOWs, the system exhibits 664 bits of functional information. More or less like the TRIM62 protein, an E3 ligase discussed in my previous OP about the Ubiquitin system, which exhibits an increase of 681 bits of human-conserved functional information at the transition to vertebrates.
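
For readers who want to check the arithmetic, here is a minimal sketch in Python (it assumes scipy is available, but the same number can be obtained directly as 0.01^100):

import math
from scipy.stats import binom

p_green = 100 / 10000      # probability of hitting a green brick with a random shot
n_shots = 100

p_all_green = binom.pmf(100, n_shots, p_green)   # P(100 successes in 100 shots) = 0.01**100
bits = -math.log2(p_all_green)

print(p_all_green)   # ~1e-200
print(bits)          # ~664 bits of functional information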

Now, let’s stop for a moment for a very important step. I am asking all neo-darwinists who are reading this OP a very simple question:

In the above situation, do you infer aiming?

It’s very important, so I will ask it a second time, a little louder:

In the above situation, do you infer aiming? 

Because if your answer is no, if you still think that the above scenario is a case of TSS fallacy, if you still believe that the observed result is not unlikely, that it is perfectly reasonable under the assumption of a random shooting, then you can stop here: you can stop reading this OP, you can stop discussing ID, at least with me. I will go on with the discussion with the reasonable people who are left.

So, at the end of this section, let’s state once more the truth about post-hoc definitions:

  1. A post-hoc definition of the function that “paints” the function using the information from the specific details of what is observed is never correct. Such definitions are clear examples of the TSS fallacy.
  2. On the contrary, any post-hoc definition that simply recognizes a function which is related to an objectively existing property of the system, and makes no special use of the specific details of what is observed to “paint” the function, is perfectly correct. It is not a case of TSS fallacy.

 

b) The objection of the different possible levels of function definition.

DNA_Jock summed up this specific objection in the course of a long discussion in the thread about the English language:

Well, I have yet to see an IDist come up with a post-specification that wasn’t a fallacy. Let’s just say that you have to be really, really, really cautious if you are applying a post-facto specification to an event that you have already observed, and then trying to calculate how unlikely that specific event was. You can make the probability arbitrarily small by making the specification arbitrarily precise.

OK, I have just discussed why post-specifications are not in themselves a fallacy. Let’s say that DNA_Jock apparently admits it, because he just says that we have to be very cautious in applying them. I agree with that, and I have explained what the caution should be about.

Of course, I don’t agree that ID’s post-hoc specifications are a fallacy. They are not, not at all.

And I absolutely don’t agree with his argument that one of the reasons why ID’s post-hoc specifications would be a fallacy is that “You can make the probability arbitrarily small by making the specification arbitrarily precise.”

Let’s try to understand why.

So, let’s go back to our example 3), the wall with the green bricks and the aiming inference.

Let’s make our shooter a little less precise: let’s say that, out of 100 shots, only 50 hit green bricks.

Now, the math becomes:

The probability of one successful hit (where success means hitting a green brick) is still 0.01 (100/10000).

The probability of having 50 successes or more in 100 shots can be computed using the binomial distribution. It is:

6.165016e-72

Now, the system exhibits “only” 236 bits of functional information. Much less than in the previous example, but still more than enough, IMO, to infer aiming.

Consider that five sigma, which is often used as the standard in physics to reject the null hypothesis, corresponds to a probability of about 3×10^-7, less than 22 bits.
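
Again, a minimal Python sketch of this second computation, the upper-tail probability under the null hypothesis (assuming scipy is available); the five-sigma figure is included only for comparison:

import math
from scipy.stats import binom

p_green = 0.01
n_shots = 100

p_tail = binom.sf(49, n_shots, p_green)   # P(X >= 50) = P(X > 49)
print(p_tail)                  # ~6.17e-72
print(-math.log2(p_tail))      # ~236 bits

five_sigma = 3e-7              # the usual five-sigma rejection threshold in physics
print(-math.log2(five_sigma))  # ~21.7 bits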

Now, DNA_Jock’s objection would be that our post-hoc specification is not valid because “we can make the probability arbitrarily small by making the specification arbitrarily precise”.

But is that true? Of course not.

Let’s say that, in this case, we try to “make the specification arbitrarily more precise”, defining the function of sharp aiming as “hitting only green bricks with all 100 shots”.

Well, we are definitely “making the probability arbitrarily small by making the specification arbitrarily precise”. Indeed, we are making the specification more precise by about 128 orders of magnitude! How smart we are, aren’t we?

But if we do that, what happens?

A very simple thing: the facts that we are observing do not meet the specification anymore!

Because, of course, the shooter hit only 50 green bricks out of 100. He is smart, but not that smart.

Neither are we smart if we do such a foolish thing, defining a function that is not met by observed facts!

The simple truth is: in our post-hoc specification we cannot at all “make the probability arbitrarily small by making the specification arbitrarily precise”, as DNA_Jock argues, because otherwise our facts would no longer meet our specification, and the specification would be completely useless and irrelevant.

What we can and must do is exactly what is always done in all cases where hypothesis testing is applied in science (and believe me, that happens very often).

We compute the probabilities of observing the effect that we are indeed observing, or a higher one, if we assume the null hypothesis.

That’s why I have said that the probability of “having 50 successes or more in 100 shots” is 6.165016e-72.

This is called a tail probability, in particular the probability of the upper tail. And it’s exactly what is done in science, in most scenarios.

Therefore, DNA_Jock’s argument is completely wrong.

c) The objection of the possible alternative solutions, and of the incomplete exploration of the search space.

c1) The premise

This is certainly the most complex point, because it depends critically on our understanding of protein functional space, which is far from complete.

For the discussion to be in some way complete, I have to present first a very general premise. Neo-darwinists, or at least the best of them, when they understand that they have nothing better to say, usually resort in desperation to a set of arguments related to the functional space of proteins. The reason is simple enough: as the nature and structure of that space is still not well known or understood, it’s easier to equivocate with false reasoning.

Their purpose, in the end, is always to suggest that functional sequences can be much more frequent than we believe. Or at least, that they are much more frequent than IDists believe. Because, if functional sequences are frequent, it’s certainly easier for RV to find them.

The arguments for this imaginary frequency of biological function are essentially of five kinds:

  1. The definition of biological function.
  2. The idea that there are a lot of functional islands.
  3. The idea that functional islands are big.
  4. The idea that functional islands are connected. The extreme form of this argument is that functional islands simply don’t exist.
  5. The idea that the proteins we are observing are only optimized forms that derive from simpler implementations through some naturally selectable ladder of simple steps.

Of course, different mixtures of the above arguments are also frequently used.

OK, let’s get rid of the first, which is rather easy. Of course, if we define extremely simple biological functions, they will be relatively frequent.

For example, the famous Szostak experiment shows that a weak affinity for ATP is relatively common in a random library: about 1 in 10^11 sequences 80 AAs long.

A weak affinity for ATP is certainly a valid definition for a biological function. But it is a function which is at the same time irrelevant and non naturally selectable. Only naturally selectable functions have any interest for the neo-darwinian theory.

Moreover, most biological functions that we observe in proteins are extremely complex. A lot of them have a functional complexity beyond 500 bits.

So, we are only interested in functions in the protein space which are naturally selectable, and we are specially interested in functions that are complex, because those are the ones about which we make a design inference.

The other three points are subtler.

  2. The idea that there are a lot of functional islands.

Of course, we don’t know exactly how many functional islands exist in the protein space, even restricting the concept of function to what was said above. Neo-darwinists hope that there are a lot of them. I think there are many, but not so many.

But the problem, again, is drastically reduced if we consider that not all functional islands will do. Going back to point 1, we need naturally selectable islands. And what can be naturally selected is much less than what can potentially be functional. A naturally selectable island of function must be able to give a reproductive advantage. In a system that already has high complexity, like any living cell, the number of functions that can be immediately integrated into what already exists is certainly strongly constrained.

This point is also strictly connected to the other two points, so I will go on with them and then try some synthesis.

  3. The idea that functional islands are big.

Of course, functional islands can be of very different sizes. That depends on how many sequences, related at sequence level (IOWs, that are part of the same island), can implement the function.

Measuring functional information in a sequence by conservation, as in the Durston method or in my own procedure, described many times, is an indirect way of measuring the size of a functional island. The greater the functional complexity of an island, the smaller its size in the search space.

Now, we must remember a few things. Let’s take as an example an extremely conserved but not too long sequence, our friend ubiquitin. It’s 76 AAs long. Therefore, the associated search space is 20^76: 328 bits.

Of course, even the ubiquitin sequence can tolerate some variation, but it is still one of the most conserved sequences in evolutionary history. Let’s say, for simplicity, that at least 70 AAs are strictly conserved, and that 6 can vary freely (of course, that’s not exact, just an approximation for the sake of our discussion).

Therefore, using the absolute information potential of 4.3 bits per amino acid, we have:

Functional information in the sequence = 303 bits

Size of the functional island = 328 – 303 = 25 bits

Now, a functional island of 25 bits is not exactly small: it corresponds to about 33.5 million sequences.

But it is infinitely tiny if compared to the search space of 328 bits:  7.5 x 10^98 sequences!
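
Here is the same ubiquitin arithmetic as a small Python sketch, rounding to whole bits as the OP does (which is where the ~33.5 million figure, 2^25, comes from):

import math

bits_per_aa = math.log2(20)    # ~4.32 bits of information per amino acid site

length = 76                    # ubiquitin
conserved = 70                 # strictly conserved sites (simplifying assumption)

search_space_bits = round(length * bits_per_aa)        # 328 bits
functional_info_bits = round(conserved * bits_per_aa)  # 303 bits
island_bits = search_space_bits - functional_info_bits # 25 bits

print(search_space_bits, functional_info_bits, island_bits)
print(2 ** island_bits)        # ~33.5 million sequences in the island
print(20.0 ** length)          # ~7.5e98 sequences in the whole search space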

If the sequence is longer, the relationship between island space and search space (the ocean where the island is placed) becomes much worse.

The beta chain of ATP synthase (529 AAs), another old friend, exhibits 334 identities between E. coli and humans. Again, for the sake of simplicity, let’s consider that about 300 AAs are strictly conserved, and let’s ignore the functional constraint on all the other AA sites. That gives us:

Search space = 20^529 = 2286 bits

Functional information in the sequence = 1297 bits

Size of the functional island =  2286 – 1297 = 989 bits

So, with this computation, there could be about 10^297 sequences that can implement the function of the beta chain of ATP synthase. That seems a huge number (indeed, it’s definitely an overestimate, but I always try to be generous, especially when discussing a very general principle). However, now the functional island is 10^390 times smaller than the ocean, while in the case of ubiquitin it was “just”  10^91 times smaller.

IOWs, the search space (the ocean) increases exponentially much more quickly than the target space (the functional island) as the length of the functional sequence increases, provided of course that the sequences always retain high functional information.

The important point is not the absolute size of the island, but its ratio to the vastness of the ocean.

So, the beta chain of ATP synthase is really a tiny, tiny island, much smaller than ubiquitin.
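
The same simplified bookkeeping, generalized, reproduces the ocean-to-island ratios quoted above for both proteins. This is only a sketch under the OP’s simplifying assumption that conserved sites are fully constrained and all other sites are free:

import math

bits_per_aa = math.log2(20)

def island_vs_ocean(length, conserved_sites):
    # Simplified model: search space = 20^length, functional information =
    # bits_per_aa per strictly conserved site, island = whatever is left over.
    search_bits = round(length * bits_per_aa)       # ocean size, in bits
    fi_bits = round(conserved_sites * bits_per_aa)  # functional information, in bits
    island_bits = search_bits - fi_bits             # island size, in bits
    ratio_orders = round(fi_bits * math.log10(2))   # orders of magnitude by which the ocean exceeds the island
    return search_bits, fi_bits, island_bits, ratio_orders

print(island_vs_ocean(76, 70))     # ubiquitin: 328, 303, 25 bits; ocean ~10^91 times bigger
print(island_vs_ocean(529, 300))   # ATP synthase beta chain: 2286, 1297, 989 bits; ocean ~10^390 times bigger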

Now, what would be a big island? It’s simple: a functional island which can implement the same function at the same level, but with low functional information. The lower the functional information, the bigger the island.

Are there big islands? For simple functions, certainly yes. Behe quotes the antifreeze protein as an example. It has rather low FI.

But are there big islands for complex functions, like that of the ATP synthase beta chain? It’s absolutely reasonable to believe that there are none. Because the function here is very complex, and it cannot be implemented by a simple sequence, exactly as functional spreadsheet software cannot be written with a few bits of source code. Neo-darwinists will say that we don’t know that for certain. It’s true, we don’t know it for certain. We know it almost for certain.

The simple fact remains: the only example of the beta chain of the F1 complex of ATP synthase that we know of is extremely complex.

Let’s go, for the moment, to the 4th argument.

  4. The idea that functional islands are connected. The extreme form of this argument is that functional islands simply don’t exist.

This is easier. We have a lot of evidence that functional islands are not connected, and that they are indeed islands, widely isolated in the search space of possible sequences. I will mention the two best evidences:

4a) All the functional proteins that we know of, those that exist in all the proteomes we have examined, are grouped in about 2000 superfamilies. By definition, a protein superfamily is a cluster of sequences that have:

  • no sequence similarity
  • no structure similarity
  • no function similarity

with all the other groups.

IOWs, islands in the sequence space.

4b) The best (and probably the only) good paper that reports an experiment where Natural Selection is really tested by an appropriate simulation is the rugged landscape paper:

Experimental Rugged Fitness Landscape in Protein Sequence Space

http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0000096

Here, NS is correctly simulated in a phage system, because what is measured is infectivity, which in phages is of course strictly related to fitness.

The function studied is the retrieval of a partially damaged infectivity due to a partial random substitution in a protein linked to infectivity.

In brief, the results show a rugged landscape of protein function, where random variation and NS can rather easily find some low-level peaks of function, while the original wild-type, optimal peak of function cannot realistically be found, not only in the lab simulation, but in any realistic natural setting. I quote from the conclusions:

The question remains regarding how large a population is required to reach the fitness of the wild-type phage. The relative fitness of the wild-type phage, or rather the native D2 domain, is almost equivalent to the global peak of the fitness landscape. By extrapolation, we estimated that adaptive walking requires a library size of 10^70 with 35 substitutions to reach comparable fitness.

I would recommend having a look at Fig. 5 in the paper to get an idea of what a rugged landscape is.

However, I will happily accept a suggestion from DNA_Jock, made in one of his recent comments at TSZ about my Ubiquitin thread, and with which I fully agree. I quote him:

To understand exploration one, we have to rely on in vitro evolution experiments such as Hayashi et al 2006 and Keefe & Szostak, 2001. The former also demonstrates that explorations one and two are quite different. Gpuccio is aware of this: in fact it was he who provided me with the link to Hayashi – see here.
You may have heard of hill-climbing algorithms. Personally, I prefer my landscapes inverted, for the simple reason that, absent a barrier, a population will inexorably roll downhill to greater fitness. So when you ask:

How did it get into this optimized condition which shows a highly specified AA sequence?

I reply
It fell there. And now it is stuck in a crevice that tells you nothing about the surface whence it came. Your design inference is unsupported.

Of course, I don’t agree with the last phrase. But I fully agree that we should think of local optima as “holes”, and not as “peaks”. That is the correct way.

So, the protein landscape is more like a ball-and-holes game, but without a guiding labyrinth: as long as the ball is on the flat plane (non functional sequences), it can go in any direction, freely. However, when it falls into a hole, it will quickly go to the bottom, and most likely it will remain there.

But:

  • The holes are rare, and they are of different sizes
  • They are distant from one another
  • The same function can be implemented by different, distant holes, of different sizes

What does the rugged landscape paper tell us?

  • That the wildtype function that we observe in nature is an extremely small hole. To find it by RV and NS, according to the authors, we should start with a library of 10^70 sequences.
  • That there are other bigger holes which can partially implement some function retrieval, and that are in the range of reasonable RV + NS
  • That those simpler solutions are not bridges to the optimal solution observed in the wildtype. IOWs, they are different, and there is no “ladder” that NS can use to reach the optimal solution.

Indeed, falling into a bigger hole (a much bigger hole, indeed) is rather a severe obstacle to finding the tiny hole of the wildtype. Finding it is already almost impossible because it is so tiny, but it becomes even more impossible if the ball falls into a big hole, because it will be trapped there by NS.

Therefore, to sum up, both the existence of 2000 isolated protein superfamilies and the evidence from the rugged landscape paper demonstrate that functional islands exist, and that they are isolated in the sequence space.

Let’s go now to the 5th argument:

  5. The idea that the proteins we are observing are only optimized forms that derive from simpler implementations by a naturally selectable ladder.

This is derived from the previous argument. If bigger functional holes do exist for a function (IOWs, simpler implementations), and they are definitely easier to find than the optimal solution we observe, why not believe that the simpler solutions were found first, and then opened the way to the optimal solution by a process of gradual optimization and natural selection of the steps? IOWs, a naturally selectable ladder?

And the answer is: because that is impossible, and all the evidence we have is against that idea.

First of all, even if we know that simpler implementations do exist in some cases (see the rugged landscape paper), it is not at all obvious that they exist as a general rule.

Indeed, the rugged landscape experiment is a very special case, because it is about retrieval of a function that has been only partially impaired, by substituting a random sequence for part of an already existing, functional protein.

The reason for that is that, if they had completely knocked out the protein, infectivity, and therefore survival itself, would have been lost, and NS could not have acted at all.

In function retrieval cases, where the function is kept even if at a reduced level, the role of NS is greatly helped: the function is already there, and can be optimized with a few naturally selectable steps.

And that is what happens in the case of the Hayashi paper. But the function is retrieved only very partially, and, as the authors say, there is no reasonable way to find the wildtype sequence, the optimal sequence, in that way. Because the optimal sequence would require, according to the authors, 35 AA substitutions, and a starting library of 10^70 random sequences.

What is equally important is that the holes found in the experiment are not connected to the optimal solution (the wildtype). They are different from it at sequence level.

IOWs, these bigger holes do not lead to the optimal solution. Not at all.

So, we have a strange situation: 2000 protein superfamilies, and thousands and thousands of proteins in them, that appear to be, in most cases, extremely functional, probably absolutely optimal. But we have absolutely no evidence that they have been “optimized”. They are optimal, but not necessarily optimized.

Now, I am not excluding that some optimization can take place in non design systems: we have good examples of that in the few known microevolutionary cases. But that optimization is always extremely short, just a few AA substitutions once the starting functional island has been found, and the function must already be there.

So, let’s say that if the extremely tiny functional island where our optimal solution lies, for example the wildtype island in the rugged landscape experiment, can be found in some way, then some small optimization inside that functional island could certainly take place.

But first, we have to find that island: and for that we need 35 specific AA substitutions (about 180 bits), and 10^70 starting sequences, if we go by RV + NS. Practically impossible.

But there is more. Do those simpler solutions always exist? I will argue that it is not so in the general case.

For example, in the case of the alpha and beta chains of the F1 subunit of ATP synthase, there is no evidence at all that simpler solutions exist. More on that later.

So, to sum it up:

The ocean of the search space, according to the reasoning of neo-darwinists, should be overflowing with potential naturally selectable functions. This is not true, but let’s assume for a moment, for the sake of discussion, that it is.

But, as we have seen, simpler functions or solutions, when they exist, are much bigger functional islands than the extremely tiny functional islands corresponding to solutions with high functional complexity.

And yet, we have seen that there is absolutely no evidence that simpler solutions, when they exist, are bridges, or ladders, to highly complex solutions. Indeed, there is good evidence of the contrary.

Given those premises, what would you expect if the neo-darwinian scenario were true? It’s rather simple: a universal proteome overflowing with simple functional solutions.

Instead, what do we observe? It’s rather simple: a universal proteome overflowing with highly functional, probably optimal, solutions.

IOWs, we find in the existing proteome almost exclusively highly complex solutions, and not simple solutions.

The obvious conclusion? The neo-darwinist scenario is false. The highly functional, optimal solutions that we observe can only be the result of intentional and intelligent design.

c2) DNA_Jock’s arguments

Now I will consider in more detail DNA_Jock’s two arguments about alternative solutions and the partial exploration of the protein space, and explain why they are only variants of what I have already discussed, and therefore not valid.

The first argument, that we can call “the existence of alternative solutions”, can be traced to this statement by DNA_Jock:

Every time an IDist comes along and claims that THIS protein, with THIS degree of constraint, is the ONLY way to achieve [function of interest], subsequent events prove them wrong. OMagain enjoys laughing about “the” bacterial flagellum; John Walker and Praveen Nina laugh about “the” ATPase; Anthony Keefe and Jack Szostak laugh about ATP-binding; now Corneel and I are laughing about ubiquitin ligase: multiple ligases can ubiquinate a given target, therefore the IDist assumption is false. The different ligases that share targets ARE “other peaks”.
This is Texas Sharp Shooter.

We will debate the laugh later. For the moment, let’s see what the argument states.

It says: the solution we are observing is not the only one. There can be others, and in some cases we know there are others. Therefore, your computation of probabilities, and therefore of functional information, is wrong.

Another way to put it is to ask the question: “how many needles are there in the haystack?”

Alan Fox seems to prefer this metaphor:

This is what is wrong with “Islands-of-function” arguments. We don’t know how many needles are in the haystack. G Puccio doesn’t know how many needles are in the haystack. Evolution doesn’t need to search exhaustively, just stumble on a useful needle.

They both seem to agree about the “stumbling”. DNA_Jock says:

So when you ask:

How did it get into this optimized condition which shows a highly specified AA sequence?

I reply
It fell there. And now it is stuck in a crevice that tells you nothing about the surface whence it came.

OK, I think the idea is clear enough. It is essentially the same idea as in point 2 of my general premise. There are many functional islands. In particular, in this form, many functional islands for the same function.

I will answer it in two parts:

  • Is it true that the existence of alternative solutions, if they exist, makes the computation of functional complexity wrong?
  • Have we really evidence that alternative solutions exist, and of how frequent they can really be?

I will discuss the first part here, and say something about the second part later in the OP.

Let’s read again the essence of the argument, as summed up by me above:

” The solution we are observing is not the only one. There can be others, in some cases we know there are others. Therefore, your computation of probabilities, and therefore of functional information, is wrong.”

As it happens with smart arguments (and DNA_Jock is usually smart), it contains some truth, but is essentially wrong.

The truth could be stated as follows:

” The solution we are observing is not the only one. There can be others, in some cases we know there are others. Therefore, our computation of probabilities, and therefore of functional information, is not completely precise, but it is essentially correct”.

To see why that is the case, let’s use again a very good metaphor: Paley’s old watch. That will help to clarify my argument, and then I will discuss how it applies to proteins in particular.

So, we have a watch, whose function is to measure time. And, in general, let’s assume that we infer design for the watch, because its functional information is high enough to exclude that it could appear spontaneously in any non design system. I am confident that all reasonable people will agree with that. Anyway, we are assuming it for the present discussion.

Now, after having made a design inference (a perfectly correct inference, I would say) for this object, we have a sudden doubt. We ask ourselves: what if DNA_Jock is right?

So, we wonder: are there other solutions to measure time? Are there other functional islands in the search space of material objects?

Of course there are.

I will just mention four clear examples: a sundial, an hourglass, a digital clock,  an atomic clock.

The sundial uses the position of the sun. The hourglass uses a trickle of sand. The digital clock uses an electronic oscillator that is regulated by a quartz crystal to keep time. An atomic clock uses an electron transition frequency in the microwave, optical, or ultraviolet region.

None of them uses gears or springs.

Now, two important points:

  • Even if the functional complexity of the five above-mentioned solutions is probably rather different (the sundial and the hourglass are probably much simpler, and the atomic clock is probably the most complex), they are all rather complex. None of them would be easily explained without a design inference. IOWs, they are small functional islands, each of them. Some are bigger, some are really tiny, but none of them is big enough to allow a random origin in a non design system.
  • None of the four additional solutions mentioned would be, in any way, a starting point to get to the traditional watch by small functional modifications. Why? Because they are completely different solutions, based on different ideas and plans.

If someone believes differently, he can try to explain in some detail how we can get to a traditional watch starting from an hourglass.

Now, an important question:

Does the existence of the four mentioned alternative solutions, or maybe of other possible similar solutions, make the design inference for the traditional watch less correct?

The answer, of course, is no.

But why?

It’s simple. Let’s say, just for the sake of discussion, that the traditional watch has a functional complexity of 600 bits. There are at least 4 additional solutions. Let’s say that each of them has, again, a functional complexity of 500 bits.

How much does that change the probability of getting the watch?

The answer is: 2 bits (because we have 4 solutions instead of one). So, now the functional complexity is 598 bits.

But, of course, there can be many more solutions. Let’s say 1000. Now the functional complexity would be about 590 bits. Let’s say one million different complex solutions (this is becoming generous, I would say): 580 bits. One billion? 570 bits.

Shall I go on?

When the search space is really huge, the number of really complex solutions is empirically irrelevant to the design inference. One observed complex solution is more than enough to infer design. Correctly.
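
The arithmetic behind this “how many needles” point can be sketched in a few lines of Python; the 600-bit figure and the counts of alternative solutions are the same hypothetical numbers used above:

import math

observed_solution_bits = 600      # hypothetical functional complexity of the watch

def effective_bits(bits_per_solution, n_solutions):
    # If n_solutions comparable islands exist, the chance of hitting any of them
    # improves only by a factor of n_solutions, i.e. by log2(n_solutions) bits.
    return bits_per_solution - math.log2(n_solutions)

for n in (4, 1000, 10**6, 10**9):
    print(n, round(effective_bits(observed_solution_bits, n)))
# 4            -> 598 bits
# 1000         -> 590 bits
# one million  -> 580 bits
# one billion  -> 570 bits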

We could call this argument: “How many needles do you need to transform a haystack into a needlestack?” And the answer is: really a lot of them.

Our poor 4 alternative solutions will not do the trick.

But what if there are a number of functional islands that are much bigger, much more likely? Let’s say 50-bit functional islands. Much simpler solutions. Let’s say 4 of them. That would make the scenario more credible. Not much, probably, but certainly it would work better than the 4 complex solutions.

OK, I have already discussed that above, but let’s say it again. Let’s say that you have 4 (or more) 50-bit solutions, and one (or more) 500-bit solution. But what you observe as a fact is the 500-bit solution, and none of the 50-bit solutions. Is that credible?

No, it isn’t. Do you know how much smaller a 500-bit island is compared to a 50-bit island? It’s 2^450 times smaller: about 10^135 times smaller. We are dealing with exponential values here.

So, if much simpler solutions existed, we would expect to observe one of them, and certainly not a solution that is 10^135 times more unlikely. The design inference for the highly complex solution is not disturbed in any way by the existence of much simpler solutions.
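
And the last comparison, again as simple arithmetic (a sketch using the same hypothetical 500-bit and 50-bit figures):

import math

complex_island_bits = 500
simple_island_bits = 50

# How much smaller (and thus less likely to be hit at random) the 500-bit island is:
difference = complex_island_bits - simple_island_bits   # 450 bits
print(2 ** difference)                                   # 2^450
print(difference * math.log10(2))                        # ~135 orders of magnitude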

OK, I think that the idea is clear enough.

c3) The laughs

As already mentioned, the issue of alternative solutions and uncounted needles seems to be a special source of hilarity for DNA_Jock.  Good for him (a laugh is always a good thing for physical and mental health). But are the laughs justified?

I quote here again his comment about the laughs, that I will use to analyze the issues.

Every time an IDist comes along and claims that THIS protein, with THIS degree of constraint, is the ONLY way to achieve [function of interest], subsequent events prove them wrong. OMagain enjoys laughing about “the” bacterial flagellum; John Walker and Praveen Nina laugh about “the” ATPase; Anthony Keefe and Jack Szostak laugh about ATP-binding; now Corneel and I are laughing about ubiquitin ligase: multiple ligases can ubiquinate a given target, therefore the IDist assumption is false. The different ligases that share targets ARE “other peaks”.

I will not consider the bacterial flagellum, that has no direct relevance to the discussion here. I will analyze, instead, the other three laughable issues:

  • Szostak and Keefe’s ATP binding protein
  • ATP synthase (rather than ATPase)
  • E3 ligases

Szostak and Keefe should not laugh at all, if they ever did. I have already discussed their paper many times. It’s a paper about directed evolution, which generates a strongly ATP-binding protein from a weakly ATP-binding protein present in a random library. It is directed evolution by mutation and artificial selection. The important point is that both the original weakly binding protein and the final strongly binding protein are not naturally selectable.

Indeed, a protein that just binds ATP is of course of no utility in a cellular context. Evidence of this obvious fact can be found here:

A Man-Made ATP-Binding Protein Evolved Independent of Nature Causes Abnormal Growth in Bacterial Cells

http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0007385

There is nothing to laugh about here: the protein is a designed protein, and anyway it is no functional peak/hole at all in the sequence space, because it cannot be naturally selected.

Let’s go to ATP synthase.

DNA_Jock had already remarked:

They make a second error (as Entropy noted) when they fail to consider non-traditional ATPases (Nina et al).

And he gives the following link:

Highly Divergent Mitochondrial ATP Synthase Complexes in Tetrahymena thermophila

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2903591/

And, of course, he laughs with Nina (supposedly).

OK. I have already discussed that the existence of one or more highly functional, but different, solutions for building ATP would not change the ID inference at all. But is it really true that there are these other solutions?

Yes and no.

As far as my personal argument is concerned, the answer is definitely no (or at least, there is no evidence of them). Why?

Because my argument, repeated for years, has always been based (everyone can check) on the alpha and beta chains of ATP synthase, the main constituents of the F1 subunit, where the true catalytic function is implemented.

To be clear, ATP synthase is a very complex molecule, made of many different chains and of two main multiprotein subunits. I have always discussed only the alpha and beta chains, because those are the chains that are really highly conserved, from prokaryotes to humans.

The other chains are rather conserved too, but much less. So, I have never used them for my argument. I have never presented blast values regarding the other chains, or made any inference about them. This can be checked by everyone.

Now, the Nina paper is about a different solution for ATP synthase that can be found in some single-celled eukaryotes.

I quote here the first part of the abstract:

The F-type ATP synthase complex is a rotary nano-motor driven by proton motive force to synthesize ATP. Its F1 sector catalyzes ATP synthesis, whereas the Fo sector conducts the protons and provides a stator for the rotary action of the complex. Components of both F1 and Fo sectors are highly conserved across prokaryotes and eukaryotes. Therefore, it was a surprise that genes encoding the a and b subunits as well as other components of the Fo sector were undetectable in the sequenced genomes of a variety of apicomplexan parasites. While the parasitic existence of these organisms could explain the apparent incomplete nature of ATP synthase in Apicomplexa, genes for these essential components were absent even in Tetrahymena thermophila, a free-living ciliate belonging to a sister clade of Apicomplexa, which demonstrates robust oxidative phosphorylation. This observation raises the possibility that the entire clade of Alveolata may have invented novel means to operate ATP synthase complexes.

Emphasis mine.

As everyone can see, it is absolutely true that these protists have a different, alternative form of ATP synthase: it is based on a similar, but certainly divergent, architecture, and it uses some completely different chains. Which is certainly very interesting.

But this difference does not involve the sequence of the alpha and beta chains in the F1 subunit.

Beware, the a and b subunits mentioned above by the paper are not the alpha and beta chains.

From the paper:

The results revealed that Spot 1, and to a lesser extent, spot 3 contained conventional ATP synthase subunits including α, β, γ, OSCP, and c (ATP9)

IOWs, the “different” ATP synthase uses the same “conventional” forms of alpha and beta chain.

To be sure of that, I have, as usual, blasted them against the human forms. Here are the results:

ATP synthase subunit alpha, Tetrahymena thermophila, (546 AAs) Uniprot Q24HY8, vs  ATP synthase subunit alpha, Homo sapiens, 553 AAs (P25705)

Bitscore: 558 bits     Identities: 285    Positives: 371

ATP synthase subunit beta, Tetrahymena thermophila, (497 AAs) Uniprot I7LZV1, vs  ATP synthase subunit beta, Homo sapiens, 529 AAs (P06576)

Bitscore: 729 bits     Identities: 357     Positives: 408

These are the same, old, conventional sequences that we find in all organisms, the only sequences that I have ever used for my argument.

Therefore, for these two fundamental sequences, we have no evidence at all of any alternative peaks/holes. Which, if they existed, would however be irrelevant, as already discussed.

Not much to laugh about.

Finally, E3 ligases. DNA_Jock is ready to laugh about them because of this very good paper:

Systematic approaches to identify E3 ligase substrates

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5103871/

His idea, shared with other TSZ guys, is that the paper demonstrates that E3 ligases are not specific proteins, because a same substrate can bind to more than one E3 ligase.

The paper says:

Significant degrees of redundancy and multiplicity. Any particular substrate may be targeted by multiple E3 ligases at different sites, and a single E3 ligase may target multiple substrates under different conditions or in different cellular compartments. This drives a huge diversity in spatial and temporal control of ubiquitylation (reviewed by ref. [61]). Cellular context is an important consideration, as substrate–ligase pairs identified by biochemical methods may not be expressed or interact in the same sub-cellular compartment.

I have already commented elsewhere (in the Ubiquitin thread) that the fact that a substrate can be targeted by multiple E3 ligases at different sites, or in different sub-cellular compartments, is clear evidence of complex specificity. IOWs, it’s not that two or more E3 ligases bind the same target just to do the same thing: they bind the same target in different ways and in different contexts to do different things. The paper, even if very interesting, is only about detecting affinities, not function.

That should be enough to stop the laughs. However, I will add another simple concept. If E3 ligases were really redundant in the sense suggested by DNA_Jock and friends, their loss of function should not be a serious problem for us. OK, I will just quote a few papers (not many, because this OP is already long enough):

The multifaceted role of the E3 ubiquitin ligase HOIL-1: beyond linear ubiquitination.

https://www.ncbi.nlm.nih.gov/pubmed/26085217

HOIL-1 has been linked with antiviral signaling, iron and xenobiotic metabolism, cell death, and cancer. HOIL-1 deficiency in humans leads to myopathy, amylopectinosis, auto-inflammation, and immunodeficiency associated with an increased frequency of bacterial infections.

WWP1: a versatile ubiquitin E3 ligase in signaling and diseases.

https://www.ncbi.nlm.nih.gov/pubmed/22051607

WWP1 has been implicated in several diseases, such as cancers, infectious diseases, neurological diseases, and aging.

RING domain E3 ubiquitin ligases.

https://www.ncbi.nlm.nih.gov/pubmed/19489725

RING-based E3s are specified by over 600 human genes, surpassing the 518 protein kinase genes. Accordingly, RING E3s have been linked to the control of many cellular processes and to multiple human diseases. Despite their critical importance, our knowledge of the physiological partners, biological functions, substrates, and mechanism of action for most RING E3s remains at a rudimentary stage.

HECT-type E3 ubiquitin ligases in nerve cell development and synapse physiology.

https://www.ncbi.nlm.nih.gov/pubmed/25979171

The development of neurons is precisely controlled. Nerve cells are born from progenitor cells, migrate to their future target sites, extend dendrites and an axon to form synapses, and thus establish neural networks. All these processes are governed by multiple intracellular signaling cascades, among which ubiquitylation has emerged as a potent regulatory principle that determines protein function and turnover. Dysfunctions of E3 ubiquitin ligases or aberrant ubiquitin signaling contribute to a variety of brain disorders like X-linked mental retardation, schizophrenia, autism or Parkinson’s disease. In this review, we summarize recent findings about molecular pathways that involve E3 ligases of the Homologous to E6-AP C-terminus (HECT) family and that control neuritogenesis, neuronal polarity formation, and synaptic transmission.

Finally I would highly recommend the following recent paper to all who want to approach seriously the problem of specificity in the ubiquitin system:

Specificity and disease in the ubiquitin system

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5264512/

Abstract

Post-translational modification (PTM) of proteins by ubiquitination is an essential cellular regulatory process. Such regulation drives the cell cycle and cell division, signalling and secretory pathways, DNA replication and repair processes and protein quality control and degradation pathways. A huge range of ubiquitin signals can be generated depending on the specificity and catalytic activity of the enzymes required for attachment of ubiquitin to a given target. As a consequence of its importance to eukaryotic life, dysfunction in the ubiquitin system leads to many disease states, including cancers and neurodegeneration. This review takes a retrospective look at our progress in understanding the molecular mechanisms that govern the specificity of ubiquitin conjugation.

Concluding remarks

Our studies show that achieving specificity within a given pathway can be established by specific interactions between the enzymatic components of the conjugation machinery, as seen in the exclusive FANCL–Ube2T interaction. By contrast, where a broad spectrum of modifications is required, this can be achieved through association of the conjugation machinery with the common denominator, ubiquitin, as seen in the case of Parkin. There are many outstanding questions to understanding the mechanisms governing substrate selection and lysine targeting. Importantly, we do not yet understand what makes a particular lysine and/or a particular substrate a good target for ubiquitination. Subunits and co-activators of the APC/C multi-subunit E3 ligase complex recognize short, conserved motifs (D [221] and KEN [222] boxes) on substrates leading to their ubiquitination [223–225]. Interactions between the RING and E2 subunits reduce the available radius for substrate lysines in the case of a disordered substrate [226]. Rbx1, a RING protein integral to cullin-RING ligases, supports neddylation of Cullin-1 via a substrate-driven optimization of the catalytic machinery [227], whereas in the case of HECT E3 ligases, conformational changes within the E3 itself determine lysine selection [97]. However, when it comes to specific targets such as FANCI and FANCD2, how the essential lysine is targeted is unclear. Does this specificity rely on interactions between FA proteins? Are there inhibitory interactions that prevent modification of nearby lysines? One notable absence in our understanding of ubiquitin signalling is a ‘consensus’ ubiquitination motif. Large-scale proteomic analyses of ubiquitination sites have revealed the extent of this challenge, with seemingly no lysine discrimination at the primary sequence level in the case of the CRLs [228]. Furthermore, the apparent promiscuity of Parkin suggests the possibility that ubiquitinated proteins are the primary target of Parkin activity. It is likely that multiple structures of specific and promiscuous ligases in action will be required to understand substrate specificity in full.

To conclude, a few words about the issue of the sequence space not entirely traversed.

We have 2000  protein superfamilies that are completely unrelated at sequence level. That is  evidence that functional protein sequences are not bound to any particular region of the sequence space.

Moreover, neutral variation in non coding and non functional sequences can go in any direction, without any specific functional constraints. I suppose that neo-darwinists would recognize that part of the genome is non functional, wouldn’t they? And we have already seen elsewhere (in the ubiquitin thread discussion) that many new genes arise from non coding sequences.

So, there is no reason to believe that the functional space has not been traversed. But, of course, neutral variation can traverse it only at very low resolution.

IOWs, there is no reason that any specific part of the sequence space is hidden from RV. But of course, the low probabilistic resources of RV can only traverse different parts of the sequence space occasionally.

It’s like having a few balls that can move freely on a plane, and occasionally fall into a hole. If the balls are really few and the plane is extremely big, the balls will be able to  potentially traverse all the regions of the plane, but they will pass only through a very limited number of possible trajectories. That’s why finding a very small hole will be almost impossible, wherever it is. And there is no reason to believe that small functional holes are not scattered in the sequence space, as protein superfamilies clearly show.

So, it’s not true that highly functional proteins are hidden in some unexplored treasure trove in the sequence space. They are there for anyone to find, in different and distant parts of the sequence space, but it is almost impossible to find them through a random walk, because they are so small.

And yet, 2000 highly functional superfamilies are there.

Moreover, the rate of appearance of new superfamilies is highest at the beginning of natural history (for example in LUCA), when a smaller part of the sequence space is likely to have been traversed, and decreases constantly, becoming extremely low in the last hundreds of millions of years. That’s not what you would expect if the problem of finding new functional islands were due to how much sequence space has been traversed, and if the sequence space were really as overflowing with potential naturally selectable functions as neo-darwinists like to believe.

OK, that’s enough. As expected, this OP is very long. However, I think that it  was important to discuss all these partially related issues in the same context.

 

Comments

bill cole, April 16, 2018, 10:43 AM (PDT):

gpuccio, a comment from Rumraket that adds value:

“Rumraket, April 16, 2018 at 6:49 pm: The alpha and beta subunits from F-type ATP synthases that gpuccio is obsessing about belong to a big family of hexameric helicases. They are WILDLY divergent in sequence over the diversity of life, and many of them are involved in other processes and functions that have nothing to do with ATP synthase/ATPase. Besides the structural similarities, they all seem to be involved in many different forms of DNA or RNA nucleotide/ribonucleotide processing (such as unwinding of double stranded DNA or RNA), of which NTP hydrolysis or synthesis as observed in ATP synthase, is just one among these many different functions. So not only are they divergent in sequence in ATP synthase machines, versions of the structure is part of many other functions besides ATP hydrolysis and synthesis. Which evolved from which, or do they all derive from a common ancestral function different from any present one? We don’t know. But we know that both the sequence and functional space of hexameric helicases goes well beyond the ATP synthase machinery. Their capacity to function as an RNA helicase could be hinting at an RNA world role.”
uncommon avles- It isn't just the right proteins. You need them in the correct concentrations, at the right time and gathered at the right place. The assembly of any flagellum is also IC. Then there is command and control without which the newly evolved flagellum is useless.ET
April 16, 2018 at 09:38 AM PDT
GP, I love your posts and you make great points. I have long concluded that the opposition to ID is not based on science and reason but extreme emotion.tribune7
April 16, 2018 at 07:58 AM PDT
To all: I have just posted this comment in the Ubiquitin thread. I think it is relevant to the discussion here, too, because E3 ligases are one of the examples proposed by DNA_Jock. So, I copy it here too: This recent paper is really thorough, long and detailed. It is an extremely good summary of what is known about the role of ubiquitin in the regulation of the critical pathway of NF-κB signaling, of which we have said a lot during this discussion: The Many Roles of Ubiquitin in NF-κB Signaling http://www.mdpi.com/2227-9059/6/2/43/htm I quote just a few parts:
Abstract: The nuclear factor kB (NF-kB) signaling pathway ubiquitously controls cell growth and survival in basic conditions as well as rapid resetting of cellular functions following environment changes or pathogenic insults. Moreover, its deregulation is frequently observed during cell transformation, chronic inflammation or autoimmunity. Understanding how it is properly regulated therefore is a prerequisite to managing these adverse situations. Over the last years evidence has accumulated showing that ubiquitination is a key process in NF-kB activation and its resolution. Here, we examine the various functions of ubiquitin in NF-kB signaling and more specifically, how it controls signal transduction at the molecular level and impacts in vivo on NF-kB regulated cellular processes. — Importantly, the number of E3 Ligases or DUBs mutations found to be associated with human pathologies such as inflammatory diseases, rare diseases, cancers and neurodegenerative disorders is rapidly increasing [22,23,24]. There is now clear evidence that many E3s and DUBs play critical roles in NF-kB signaling, as will be discussed in the next sections, and therefore represent attractive pharmacological targets in the field of cancers and inflammation or rare diseases. — 3.3. Ubiquitin Binding Domains in NF-kB Signaling Interpretation of the “ubiquitin code” is achieved through the recognition of different kinds of ubiquitin moieties by specific UBD-containing proteins [34]. UBDs are quite diverse, belonging to more than twenty families, and their main characteristics can be summarized as follows: (1) They vary widely in size, amino acid sequences and three-dimensional structure; (2) The majority of them recognize the same hydrophobic patch on the beta-sheet surface of ubiquitin, that includes Ile44, Leu8 and Val70; (3) Their affinity for ubiquitin is low (in the higher µM to lower mM range) but can be increased following polyubiquitination or through their repeated occurrence within a protein; (4) Using the topology of the ubiquitin chains, they discriminate between modified substrates to allow specific interactions or enzymatic processes. For instance, K11- and K48-linked chains adopt a rather closed conformation, whereas K63- or M1-linked chains are more elongated. In the NF-kB signaling pathway, several key players such as TAB2/3, NEMO and LUBAC are UBD-containing proteins whose ability to recognize ubiquitin chains is at the heart of their functions. — 9. In Vivo Relevance of Ubiquitin-Dependent NF-kB Processes NF-kB-related ubiquitination/ubiquitin recognition processes described above at the protein level, regulate many important cellular/organismal functions impacting on human health. Indeed, several inherited pathologies recently identified are due to mutations on proteins involved in NF-kB signaling that impair ubiquitin-related processes [305]. Not surprisingly, given the close relationship existing between NF-kB and receptors participating in innate and acquired immunity, these diseases are associated with immunodeficiency and/or deregulated inflammation. 10. Conclusions Over the last fifteen years a wealth of studies has confirmed the critical function of ubiquitin in regulating essential processes such as signal transduction, DNA transcription, endocytosis or cell cycle. 
Focusing on the ubiquitin-dependent mechanisms of signal regulation and regulation of NF-kB pathways, as done here, illustrates the amazing versatility of ubiquitination in controlling the fate of protein, building of macromolecular protein complexes and fine-tuning regulation of signal transmission. All these molecular events are dependent on the existence of an intricate ubiquitin code that allows the scanning and proper translation of the various status of a given protein. Actually, this covalent addition of a polypeptide to a protein, a reaction that may seem to be a particularly energy consuming process, allows a crucial degree of flexibility and the occurrence of almost unlimited new layers of regulation. This latter point is particularly evident with ubiquitination/deubiquitination events regulating the fate and activity of primary targets often modulated themselves by ubiquitination/deubiquitination events regulating the fate and activity of ubiquitination effectors and so on. — To the best of our knowledge the amazingly broad and intricate dependency of NF-kB signaling on ubiquitin has not been observed in any other major signaling pathways. It remains to be seen whether this is a unique property of the NF-kB signaling pathway or only due to a lack of exhaustive characterization of players involved in those other pathways. Finally, supporting the crucial function of ubiquitin-related processes in NF-kB signaling is their strong evolutionary conservation.
Emphasis mine. The whole paper is amazingly full of fascinating information. I highly recommend it to all, and especially to those who have expressed doubts and simplistic judgments about the intricacy and specificity of the ubiquitin system, in particular the E3 ligases. But what’s the point? They will never change their mind.gpuccio
April 16, 2018 at 07:56 AM PDT
tribune7: Of course. The key concept is always the complexity that is necessary to implement the function. A very interesting example for better understanding the importance of the functional complexity of a sequence, and why complexity is not additive, can be found in the Ubiquitin thread, in my discussion with Joe Felsenstein, from whom we are awaiting a more detailed answer. It's the thief scenario. See here: The Ubiquitin System: Functional Complexity and Semiosis joined together. https://uncommondescent.com/intelligent-design/the-ubiquitin-system-functional-complexity-and-semiosis-joined-together/#comment-656365 #823, #831, #859, #882, #919 I paste here, for convenience, the final summary of the mental experiment, from comment #919 (to Joe Felsenstein):
The thief mental experiment can be found as a first draft at my comment #823, quoted again at #831, and then repeated at #847 (to Allan Keith) in a more articulated form. In essence, we compare two systems. One is made of one single object (a big safe), the other of 150 smaller safes. The sum in the big safe is the same as the sums in the 150 smaller safes put together. That ensures that both systems, if solved, increase the fitness of the thief in the same measure. Let’s say that our functional objects, in each system, are: a) a single piece of card with the 150 figures of the key to the big safe b) 150 pieces of card, each containing the one-figure key to one of the small safes (correctly labeled, so that the thief can use them directly). Now, if the thief owns the functional objects, he can easily get the sum, both in the big safe and in the small safes. But our model is that the keys are not known to the thief, so we want to compute the probability of getting to them in the two different scenarios by a random search. So, in the first scenario, the thief tries the 10^150 possible solutions, until he finds the right one. In the second scenario, he tries the ten possible solutions for the first safe, opens it, then passes to the second, and so on. A more detailed analysis of the time needed in each scenario can be found in my comment #847. So, I would really appreciate it if you could answer this simple question: Do you think that the two scenarios are equivalent? What should the thief do, according to your views? This is meant as an explicit answer to your statement mentioned before: “That counts up changes anywhere in the genome, as long as they contribute to the fitness, and it counts up whatever successive changes occur.” The system with the 150 safes corresponds to the idea of a function that includes changes “anywhere in the genome, as long as they contribute to the fitness”. The system with one big safe corresponds to my idea of one single object (or IC system of objects) where the function (opening the safe) is not present unless 500 specific bits are present.
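As an editorial aside, the arithmetic behind the two scenarios can be spelled out in a few lines of Python (a minimal sketch: the expected-tries figures assume blind enumeration of each keyspace, which is my own simplification of the mental experiment above):

```python
from math import log2

DIGITS, FIGURES = 10, 150

# Scenario A: one big safe with a single 150-figure key.
# Blind enumeration must, on average, try about half of the whole keyspace.
keyspace_big = DIGITS ** FIGURES
expected_tries_big = keyspace_big / 2

# Scenario B: 150 small safes, each with an independent one-figure key,
# opened one after the other (each safe gives immediate feedback).
expected_tries_small = FIGURES * DIGITS / 2

print(f"one big safe   : ~{expected_tries_big:.1e} expected tries "
      f"({log2(keyspace_big):.0f} bits in a single indivisible step)")
print(f"150 small safes: ~{expected_tries_small:.0f} expected tries "
      f"({log2(DIGITS):.1f} bits per selectable step)")
```

The prize is the same in both systems, but the big safe demands about 498 bits of specification in one indivisible step, while the 150 small safes never demand more than about 3.3 selectable bits at a time.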
gpuccio
April 16, 2018 at 07:23 AM PDT
O & GP If so, Miller needs to explain his position, since Miller assigns the same probability to pre-specified and post-specified events. He lumps those two categories together. If a specification is valid, only the complexity of the specification matters for a design inference. If the complexity is the same, there is absolutely no difference between a pre-specification and a valid post-specification. Suppose after dealing a deck all day, one particular sequence occurs which causes the lights to come on, music to start playing and a clown to come in with a cake. Could that reasonably be considered a chance event? Any arrangement of the chemistry of the genetic code has an equally minuscule probability, but only if it is done in a specific way does something happen.tribune7
April 16, 2018 at 05:48 AM PDT
Tribune7 and Origenes: As I have argued at #36, pre-specifications are always valid, and they can use any contingent information, because of course that contingent information does not derive from any random event that has already happened. Post-specifications, instead, are valid only if they are about objective properties and if they don't use any already existing contingent information. If a specification is valid, only the complexity of the specification matters for a design inference. If the complexity is the same, there is absolutely no difference between a pre-specification and a valid post-specification.gpuccio
April 16, 2018 at 04:53 AM PDT
Tribune7 @57
T7: Now, imagine the exact sequence had been predicted beforehand. Would they still say it was by chance?
I do like your 'simple' question. Indeed, suppose a card dealer successfully specifies beforehand which cards Miller will get, would Miller accept this as a chance event? What if the card dealer gets it right every time all day long? What would Miller say? That, without design, this is impossible, perhaps? If so, Miller needs to explain his position, since Miller assigns the same probability to pre-specified and post-specified events. He lumps those two categories together. And here lies Miller's obvious mistake. In his example (see quote in #33) he smuggles in the specification.Origenes
April 16, 2018 at 04:27 AM PDT
To all: Not much at TSZ. Entropy continues to confound the problem of the TSS fallacy with the problem of alternative solutions. I have discussed them both in the OP, but he seems not to be aware of that. Just to help him understand: a) The problem of the TSS fallacy is: is the post-hoc specification valid, and when? I have answered that problem very clearly: any post-hoc specification is valid if the two requisites I have described are satisfied. In that case, there is no TSS fallacy. My two requisites are always satisfied in the ID inferences, therefore the TSS fallacy does not apply to the ID inference. b) Then there is the problem of how to compute the probability of the observed function. Entropy thinks that this too is part of the TSS fallacy, because he follows the wrong reasoning of DNA_Jock. But that has nothing to do with the fallacy itself. At most, it is a minor problem of how to compute probabilities. I have clearly argued that with huge search spaces, and with highly complex solutions, that problem is irrelevant. We can very well compute the specificity of the observed solution, and ignore other possible complex solutions, which would not change the result in any significant way for our purposes. DNA_Jock and Entropy can disagree, but I have discussed the issue, and given my reasons. Everyone can judge for himself. Then there is the issue of the level at which the function must be defined. I have clearly stated that the only correct scientific approach is to define as rejection region the upper tail of the observed effect, as everybody does in hypothesis testing. DNA_Jock does not like my answer, but he has not explained why. He also gives cryptic allusions to some different argument that I could have used, but of course he does not say what it is. And, of course, I suppose that he laughs. Good for him. Finally, Entropy, like DNA_Jock, seems not to have understood the simple fact that the alpha and beta chains of ATP synthase have the same conserved sequence in Alveolata as in all other organisms. Could someone please explain to these people that I have discussed that issue in the OP, with precise references from the literature that they had linked? If they think that I am wrong, I am ready to listen to their reasons.gpuccio
April 16, 2018 at 04:09 AM PDT
bill cole: Thank you for the kind words! :) I am looking forward to Joe Felsenstein's clarifications. He seems to be one of the last people there willing to discuss reasonably.gpuccio
April 15, 2018 at 11:38 PM PDT
Origenes at #54: "Yes that is exactly what he does. It is Ken Miller’s mistake all over again." Yes, sometimes our kind interlocutors really help us. Seriously, I am really amazed that they are still using the infamous deck of cards fallacy! What's wrong in their minds? At least DNA_Jock has avoided that intellectual degradation. At least up to now... :)gpuccio
April 15, 2018 at 11:36 PM PDT
uncommon_avles: Thank you for your comment. What you say is not really connected to the discussion here, but I will answer your points.
I don’t think analogy of bricks and bullets works.
As already discussed with Origenes at #31, #36 and #50, the bricks analogy in the OP has only one purpose: to show that there is a class of systems and events to which the TSS fallacy does not apply. The wall with the green bricks and the protein functions both belong to that class, for exactly the same reasons, which I have explicitly discussed (my two requirements), as detailed in the OP and at #36:
1) The function is recognized after the random shooting (whatever it is), and certainly its explicit definition, including the definition of the levels observed, depends on what we observe. In this sense, our definition is not “independent” from the results. But the first important requisite is that the function we observe and define must be “related to an objectively existing property of the system”. IOWs, the bricks were green before the shooting (we are not considering here the weird proposal about moss made by uncommon_avles at #32). In the case of protein functions, the connection with objectively existing properties of the system is even more clear. Indeed, while bricks could theoretically be painted after the shooting, biochemical laws are not supposed to come into existence after the proteins themselves. At least, I hope that nobody, even at TSZ, is suggesting that. So, our first requisite is completely satisfied. 2) The second important requisite is that we must “make no use of the specific details of what is observed to “paint” the function”. This is a little less intuitive, so I will try to explain it well. For “specific details” I mean here the contingent information in the result: IOWs, the coordinates of the shots in the case of the wall, or the specific AAs in the sequence in the case of proteins. The rule is simple, and universally applicable: if I need to know and use those specific contingent values to explicitly define the function, then, and only then, I am committing the TSS fallacy.
This is the purpose of the analogy. For that purpose, it is perfectly appropriate. For the rest, of course, it is not a model of a biological system. Then you say:
The point is, in biological process, you need to take time factor and incremental probability into account.
I don't know what you mean by "incremental probability"; I suppose you mean Natural Selection. Of course I take into account the time factor and NS in all my discussions about biological systems. You can find my arguments here: What are the limits of Natural Selection? An interesting open discussion with Gordon Davisson https://uncommondescent.com/intelligent-design/what-are-the-limits-of-natural-selection-an-interesting-open-discussion-with-gordon-davisson/ and here: What are the limits of Random Variation? A simple evaluation of the probabilistic resources of our biological world https://uncommondescent.com/intelligent-design/what-are-the-limits-of-random-variation-a-simple-evaluation-of-the-probabilistic-resources-of-our-biological-world/ In the second OP, right at the beginning, you can find a table with the computation of the probabilistic resources of biological systems on our planet, for its full lifetime. All those points are of course extremely important, and I have discussed them in great detail. But they have nothing to do with the TSS fallacy argument, which is the issue debated here. So, we have discovered two great truths: a) Any analogy has its limitations b) I cannot discuss everything at the same time
The green colour of brick might not be paint. It might be moss formed over a few months’ time.
As already said, your point is wrong and not pertinent. There are of course ways to distinguish between a painted brick and moss. The important question is: is the property I am using to recognize the function post-hoc an objective property of the system, or am I inventing it now? For protein function, there is absolutely no doubt: the function of a protein is the strict consequence of biochemical laws. I am inventing neither the laws nor the observed function. They are objective properties of the system of our observable universe. So, while you can still have some rather unreasonable doubt about the green brick (the "moss" alternative), there can be no doubt about the protein function. Your idea is probably that the protein could have acquired the function gradually, by a process of RV and NS. But that is not an argument about the TSS fallacy, as I have already explained. The problem here is: can we reject the null hypothesis of a random origin using a post-hoc specification? And the clear answer is: yes, of course, but we have to respect these two requirements (see above). The probabilistic analysis has only one purpose: to reject a random origin. Mechanisms like NS must be evaluated in other ways, considering what they can do and what they cannot do in the observed system. As I have done both in my previous OPs and here.
If you see a property of a biological system which seems improbable at first glance, you should consider the fact that the property might have evolved over time from other dissimilar properties.
Of course, and I have done that a lot of times. But, as said, that has nothing to do with the TSS fallacy. The problem in the TSS fallacy is: is the property I am using in my reasoning an objective property, for which I can build a probabilistic analysis of the hypothesis of a random origin (of course also considering, if appropriate, the role of necessity factors, like NS), or is it a "painted" property, one that did not exist before observing what I am observing? You are conflating different arguments here. I have discussed all of them, but, as said before, not all at the same time. Finally you say:
Thus in the flagellum of the E. coli bacterium, there are around 40 different kinds of proteins but only 23 of these proteins are common to all the other bacterial flagella . Of these 23 proteins just two are unique to flagella. The others all closely resemble proteins that carry out other functions in the cell. This means that the vast majority of the components needed to make a flagellum might already have been present in bacteria before this structure appeared
This is the old (and wrong) argument against Irreducible Complexity. Again, it's another argument, and it has nothing to do with the TSS fallacy. Moreover, I have not used IC in this OP and in this discussion as a relevant argument. My examples are essentially about the functional complexity of single proteins, for example the alpha and beta chains of ATP synthase. But of course the system made by those two proteins together is certainly irreducibly complex. Each of the two proteins is powerless without the other. But each of the two proteins is also functionally complex on its own merit. However, the discussion here is not about IC. Again, you conflate different arguments without any reason to do that.gpuccio
April 15, 2018 at 11:31 PM PDT
Origenes, great point: Given some accuracy recording the outcome, everyone can perform the following cycle all day long: 1. deal cards. 2. make a “specification” based on the outcome. 3. see that outcome and specification match and express puzzlement. Now, imagine the exact sequence had been predicted beforehand. Would they still say it was by chance? What if someone took that deck and, rather than dealing them, just built a house of cards? Would they claim that as within the realm of chance?tribune7
April 15, 2018 at 09:58 PM PDT
gpuccio @ 34, I don’t think analogy of bricks and bullets works. The point is, in biological process, you need to take time factor and incremental probability into account. The green colour of brick might not be paint. It might be moss formed over a few months’ time. If you see a property of a biological system which seems improbable at first glance, you should consider the fact that the property might have evolved over time from other dissimilar properties. Thus in the flagellum of the E. coli bacterium, there are around 40 different kinds of proteins but only 23 of these proteins are common to all the other bacterial flagella. Of these 23 proteins just two are unique to flagella. The others all closely resemble proteins that carry out other functions in the cell. This means that the vast majority of the components needed to make a flagellum might already have been present in bacteria before this structure appeared.uncommon_avles
April 15, 2018 at 08:07 PM PDT
gpuccio
Moreover, he points to our old exchanges instead of dealing with my arguments here. Again, his choice. But I will not go back to re-read the past. I have worked a lot to present my arguments together, and in a new form, and I will answer only to those who deal with the things I have said here.
I looked over the old exchanges and his use of the TSS was fallacious. You are comparing protein sequence data across different species, which seems to have nothing to do with the TSS fallacy. I am grateful for his challenge, which got you to write this excellent OP; it was very educational for me, especially the highlights you made of the Hayashi paper. Rumraket usually backs up his claims. I agree that his argument was based on a straw-man fallacy, but honestly I think that's the best he can do. The data here are very problematic for the Neo-Darwinian position. The TSS claim was also a fallacy and a clever argument by Jock, but again it misrepresented your claims. Joe Felsenstein said he would not comment on the TSS OP but would write an OP addressing your definition of information. I look forward to his OP and hope that it generates a more productive discussion between UD and TSZ. From his lecture I do believe that he understands the challenge that genetic information brings to understanding the cause of the diversity of living organisms. Again, thank you so much for this clearly written OP. :-)bill cole
April 15, 2018 at 05:49 PM PDT
GPuccio @53
GP: Rumraket is doing exactly that: he is using the specific contingent values in a post-hoc specification. So, he is committing a fallacy that ID never commits.
Yes, that is exactly what he does. It is Ken Miller's mistake all over again. "What is the likelihood of that particular collection of mutations?", Rumraket asks. In return I would like to ask him: "What probability are you attempting to compute?" And as a follow-up question: “Are we talking about the probability that the outcome matches a specification informed by the outcome? If so, then the chance is 100%."Origenes
April 15, 2018 at 05:33 PM PDT
Origenes: Please notice how Rumraket at TSZ has given us a full example of the fallacy I have described:
But that’s silly, because all sufficiently long historical developments will look unbelievably unlikely after the fact. To pick an example, take one of the lineages in the Long Term Evolution experiment with E coli. In this lineage, over 600 particular mutations have accumulated in the E coli genome over the last 25 years. What is the likelihood of that particular collection of mutations?
Emphasis mine. He is clearly violating my second fundamental requisite to avoid the TSS fallacy, as explained both in the OP and in my discussion with you at #36: "2) The second important requisite is that we must “make no use of the specific details of what is observed to “paint” the function”. This is a little less intuitive, so I will try to explain it well. For “specific details” I mean here the contingent information in the result: IOWs, the coordinates of the shots in the case of the wall, or the specific AAs in the sequence in the case of proteins. The rule is simple, and universally applicable: if I need to know and use those specific contingent values to explicitly define the function, then, and only then, I am committing the TSS fallacy." Rumraket is doing exactly that: he is using the specific contingent values in a post-hoc specification. So, he is committing a fallacy that ID never commits. A good demonstration of my point. Should I thank him? :)gpuccio
April 15, 2018 at 04:27 PM PDT
bill cole: As you have probably noticed, Rumraket: April 15, 2018 at 9:31 pm is just reciting again the infamous deck of cards fallacy. I will not waste my time with him, repeating what I have already said (see #35 here and #859 in the Ubiquitin thread).gpuccio
April 15, 2018 at 04:12 PM PDT
bill cole: I have read the comment by DNA_Jock (April 15, 2018 at 9:12 pm). What a disappointment. Seriously. He does not want to discuss "over the fence". OK, his choice. Therefore, I will not address him directly, either. I can discuss over the fence, and I have done exactly that, but I don't like to shout over the fence to someone who has already declared that he will not respond. Moreover, he points to our old exchanges instead of dealing with my arguments here. Again, his choice. But I will not go back to re-read the past. I have worked a lot to present my arguments together, and in a new form, and I will answer only those who deal with the things I have said here. He seems offended that I have added the "ATP synthase (rather than ATPase)" clarification. Of course he will not believe it, but I have done that only to avoid equivocations. All the discussions here have been about ATP synthase, which I have always called by that name. The official name of the beta chain that I discuss (P06576) is, at UniProt: "ATP synthase subunit beta, mitochondrial". Of course ATP synthase is also an ATPase, because it can work in both directions. But the term ATPase is less specific, because there are a lot of ATPases that are in no way ATP synthases. See Wikipedia for a very simple reference: ATPase https://en.wikipedia.org/wiki/ATPase So, it was important to clarify that I was of course speaking of ATP synthase, rather than intentionally generating confusion, as he has tried to do. He does not answer my criticism of his level-of-definition argument (the things he says are no answer at all, as anyone can check). Again, his choice. But it is really shameful that he has not even mentioned my point that his objection about the alpha and beta chains of ATP synthase is completely wrong. As I have said, the alpha and beta chains of ATP synthase are the same in Alveolata as in all other organisms. So he is wrong, I have clearly said why, quoting the same paper that he linked, and he does not even mention the fact. He is simply ridiculous about my argument regarding time-measuring systems: "omits the water clock and the candle clock". I cannot believe that he says that! Just for the record, this is from the OP:
So, we wonder: are there other solutions to measure time? Are there other functional islands in the search space of material objects? Of course there are. I will just mention four clear examples: a sundial, an hourglass, a digital clock, an atomic clock.
Emphasis added. Is this "whining"? Is this "ignorance or lack of attention" that is "leading me to underestimate the number of other possible ways of achieving any function"? You judge. Again, I quote from my OP:
Does the existence of the four mentioned alternative solutions, or maybe of other possible similar solutions, make the design inference for the traditional watch less correct? The answer, of course, is no. But why? It’s simple. Let’s say, just for the sake of discussion, that the traditional watch has a functional complexity of 600 bits. There are at least 4 additional solutions. Let’s say that each of them has, again, a functional complexity of 500 bits. How much does that change the probability of getting the watch? The answer is: 2 bits (because we have 4 solutions instead of one). So, now the probability is 598 bits. But, of course, there can be many more solutions. Let’s say 1000. Now the probability would be about 590 bits. Let’s say one million different complex solutions (this is becoming generous, I would say). 580 bits. One billion? 570 bits. Shall I go on? When the search space is really huge, the number of really complex solutions is empirically irrelevant to the design inference. One observed complex solution is more than enough to infer design. Correctly. We could call this argument: “How many needles do you need to transform a haystack into a needlestack?” And the answer is: really a lot of them. Our poor 4 alternative solutions will not do the trick.
That said, I am really happy that he does not want to "shout over the fence". This is very bad shouting, arrogant evasion, and certainly not acceptable behaviour from someone who is certainly not stupid. Just to be polite, goodbye to him.gpuccio
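As an aside, the “needlestack” arithmetic quoted above reduces to subtracting log2 of the number of alternative complex solutions from the functional complexity of the observed one. Here is a minimal Python sketch of that bookkeeping, using the same hypothetical 600-bit example and solution counts as the quote:

```python
from math import log2

observed_bits = 600  # hypothetical functional complexity of the observed solution

# Each batch of alternative complex solutions only shaves log2(N) bits
# off the improbability of hitting some complex solution.
for n_solutions in (1, 4, 1_000, 1_000_000, 1_000_000_000):
    adjusted = observed_bits - log2(n_solutions)
    print(f"{n_solutions:>13,} solution(s): ~{adjusted:.0f} bits left")
```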
April 15, 2018 at 04:02 PM PDT
Origenes: No, that was not what I meant. NS acts only on the shot that has already found a functional island, because it needs an existing, naturally selectable function to act. Going back to the wall metaphor, it's as if the shots that hit a green brick, and only those that hit a green brick, become in some way "centered" after the hit: so, even if they hit the green brick, say, in a corner, there is a mechanism that moves the bullet to the center. That does not happen to the shots that hit the brown bricks. So, NS is a mechanism that works in the protein space, but not as a rule in the wall (I am aware of no mechanism that centers the bullet after the shot). As you said yourself, every analogy has its limitations! One important difference is that the wall model is a random search, while the search in protein space is a random walk. That does not change much in terms of probabilities, but the models are different. So, the model of the ball and holes corresponds better to what happens in protein space. The ball is some sequence, possibly non-coding, that changes through neutral variation (it can go in any direction, on the flat plane). As we have said many times, this is the best scenario for finding a new functional island, because already existing functional sequences are already in a hole, and it is extremely difficult for them to move away from it. So, the ball can potentially explore all the search space by neutral variation, but of course it does not have the resources to explore all possible trajectories. The movement of the ball is the random walk. We can think of each new state tested as a discrete movement. Most movements (amino acid substitutions) make the ball move gradually through the protein space, by small shifts, but some types of variation (indels, frameshifts, and so on) can make it move suddenly to different parts of the space. However, each new state is a new try that can potentially find a hole, but only according to the probabilities of finding it. If a hole is found, and a naturally selectable function appears, then the ball falls in the hole, and most likely its movement will be confined to the functional island itself, until optimization is reached. The higher the optimization, the more difficult it will be for the ball to go out of the hole and start again a neutral walk. A random search and a random walk are two different kinds of random systems, which have many things in common but differ in some aspects. However, essentially the probabilistic computation is not really different: if a target is extremely improbable in a random search (the shooting) it is also extremely improbable in a random walk (the ball), provided of course that the walk does not start from a position near the target: all that is necessary is that the starting position must be unrelated at sequence level, as is the case, for example, for all the 2000 protein superfamilies. Even in the case where an already functional protein undergoes a sudden functional transition which is in itself complex, like for example in the transition to vertebrates, there is no difference. The fact that the whole protein already had part of the functional information that will be conserved up to humans before the transition does not help to explain the appearance of huge new amounts of specific sequence homology to the human form.
Again, the random walk is from an unrelated sequence (the part of the molecule that had no homology with the human form) to a new functional hole (the new functional part of the sequence that appears in vertebrates and has high homology to the human form, and that will be conserved from then on). The important point is that the functional transition must be complex: as I have said many times, there is no difference, probabilistically, if we build a completely new protein which has 500 bits of human conserved functional information, or if we add 500 bits of human conserved functional information to a protein that already exhibited 300 bits of it, and then goes to 800 bits in the transition. In both cases, we are generating 500 new and functional bits of human conserved information that did not exist before, starting from an unrelated sequence, or part of sequence.gpuccio
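As an illustration of the claim that a random search and a random walk give hit rates of the same order when the target is a small isolated island and the walk starts from an unrelated sequence, here is a toy Python sketch (the four-letter alphabet, eight-position sequences and tiny one-mutation island are arbitrary assumptions, astronomically smaller than any real protein space):

```python
import itertools
import random

ALPHABET = "ACDE"      # toy 4-letter alphabet, not the 20 amino acids
LENGTH = 8             # toy sequence length
WILDTYPE = "AACCDDEE"  # arbitrary "functional" sequence

# The "functional island": the wildtype plus all its single-letter variants.
TARGET = {WILDTYPE}
for i, letter in itertools.product(range(LENGTH), ALPHABET):
    TARGET.add(WILDTYPE[:i] + letter + WILDTYPE[i + 1:])

def random_sequence():
    return "".join(random.choice(ALPHABET) for _ in range(LENGTH))

def hits_random_search(tries):
    """Independent uniform draws from the whole sequence space (the shooting)."""
    return sum(random_sequence() in TARGET for _ in range(tries))

def hits_random_walk(steps):
    """One substitution at a time, starting from an unrelated sequence (the ball)."""
    seq, hits = list(random_sequence()), 0
    for _ in range(steps):
        pos = random.randrange(LENGTH)
        seq[pos] = random.choice(ALPHABET)
        hits += "".join(seq) in TARGET
    return hits

random.seed(1)
N = 200_000
print("island fraction of the space:", len(TARGET) / len(ALPHABET) ** LENGTH)
print("hits by random search:", hits_random_search(N), "in", N, "tries")
print("hits by random walk  :", hits_random_walk(N), "in", N, "steps")
```

Both strategies hit the island at roughly the rate set by its share of the space; the walk's hits simply arrive in clusters once it stumbles onto the island, which is where a mechanism like NS could then act.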
April 15, 2018 at 03:29 PM PDT
gpuccio There are a couple of responses in the TSZ Ubiquitin thread. I responded to DNA_Jock briefly, but your comments would be greatly appreciated.bill cole
April 15, 2018 at 03:20 PM PDT
gpuccio
However, the wildtype has an infectivity of about: e^22.4 = 5,348,061,523 which is about 2000 times greater (from 2.6 million to 5.3 billion). So, they are still far away from the function of the wildtype, and they have already reached stagnation. Moreover, if you look at the sequences at the bottom of the same Figure, you can see that the best result obtained has no homology to the sequence of the wildtype. As the authors say: “More than one such mountain exists in the fitness landscape of the function for the D2 domain in phage infectivity. The sequence selected finally at the 20th generation has a fitness of 0.52 but showed no homology to the wild-type D2 domain, which was located around the fitness of the global peak. The two sequences would show significant homology around 52% if they were located on the same mountain. Therefore, they seem to have climbed up different mountains.”
Amazing and helpful. Solid evidence of a separate hole that "traps" the protein away from the wild type.bill cole
April 15, 2018 at 02:54 PM PDT
GPuccio @37
GP: NS has absolutely no role in the process of shooting the functional islands. The ball falls into the hole (however big or deep it is), and rather quickly reaches the bottom. And stays there. This is the role of NS. Once a functional island has been shot (found), NS can begin to act. And it can, at least in some cases, optimize the existing function, usually by a short ladder of one AA steps. Until the bottom is reached (the function is optimized for that specific functional island). So, NS acts in its two characteristic ways, but only after the functional island has been found ...
This is not immediately clear to me. At the moment that NS optimizes a function, can it be argued that NS has some influence on these "optimizing shots"? Assuming that each new configuration is a new shot, perhaps, one can argue that NS indirectly, by fixating the ball in the hole and steering it towards the lowest point, induces more shots to be fired in the area of the hole, rather than somewhere else? IOWs is there a secondary role for NS in relation to the shots fired during the optimization process? As in, NS never fires the first shot, but, instead, induces some 'follow-up-shots'.Origenes
April 15, 2018 at 02:47 PM PDT
jdk: Thank you for the link. It seems that I did not take part in that discussion. At present I cannot read that long thread, because as you can see I am rather busy. Is there any specific argument that you would like to propose?gpuccio
April 15, 2018 at 02:28 PM PDT
mike1962: Thank you very much. Your appreciation is much appreciated! :)gpuccio
April 15, 2018 at 02:25 PM PDT
Origenes at #41: OK, I would say that we agree perfectly. :)gpuccio
April 15, 2018 at 02:24 PM PDT
bill cole at #38 and 42: The Hayashi paper is about function retrieval. So, it is not about a completely new function. They replaced one domain of the g3p protein of the phage, a 424 AA long protein necessary for infectivity, with a random sequence of 139 AAs. The protein remained barely functional, and that's what allows them to test RV and NS: the function is still there, even if greatly reduced. The phage can still survive and infect. An important point is that fitness is measured here as the natural logarithm of infectivity, therefore those are exponential values. If you look at Fig. 2, you can see that the initial infectivity is about: e^5 = 148 Their best result is about: e^14.8 = 2,676,445 That's why they say that they had an increase in infectivity of about 17,000-fold. (The numbers are not precise, I am deriving them from the Figure.) However, the wildtype has an infectivity of about: e^22.4 = 5,348,061,523 which is about 2000 times greater (from 2.6 million to 5.3 billion). So, they are still far away from the function of the wildtype, and they have already reached stagnation. Moreover, if you look at the sequences at the bottom of the same Figure, you can see that the best result obtained has no homology to the sequence of the wildtype. As the authors say: "More than one such mountain exists in the fitness landscape of the function for the D2 domain in phage infectivity. The sequence selected finally at the 20th generation has a fitness of 0.52 but showed no homology to the wild-type D2 domain, which was located around the fitness of the global peak. The two sequences would show significant homology around 52% if they were located on the same mountain. Therefore, they seem to have climbed up different mountains."gpuccio
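For convenience, the arithmetic above is just the exponentials of the reported log-fitness values; a minimal Python check (5, 14.8 and 22.4 are the approximate readings from Fig. 2 quoted in the comment, not exact figures from the paper):

```python
from math import exp

# Approximate log(infectivity) values read off Fig. 2 of the Hayashi paper.
ln_initial, ln_best, ln_wildtype = 5.0, 14.8, 22.4

initial, best, wildtype = exp(ln_initial), exp(ln_best), exp(ln_wildtype)
print(f"initial infectivity: ~{initial:,.0f}")
print(f"best evolved       : ~{best:,.0f} (~{best / initial:,.0f}-fold gain)")
print(f"wildtype           : ~{wildtype:,.0f} (~{wildtype / best:,.0f}-fold above the best evolved)")
```

Small discrepancies with the figures quoted above simply reflect the imprecision of reading values off the published figure.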
April 15, 2018 at 02:22 PM PDT
gpuccio
Indeed, falling into a bigger hole (a much bigger hole, indeed) is rather a severe obstacle to finding the tiny hole of the wildtype. Finding it is already almost impossible because it is so tiny, but it becomes even more impossible if the ball falls into a big hole, because it will be trapped there by NS. Therefore, to sum up, both the existence of 2000 isolated protein superfamilies and the evidence from the rugged landscape paper demonstrate that functional islands exist, and that they are isolated in the sequence space.
After my re-read, I see you have answered my question.bill cole
April 15, 2018 at 01:12 PM PDT
GPuccio @35, @36
GP: Is that Allan Miller at TSZ?
No, I quoted biochemist Ken Miller from Brown University. He presented this argument successfully at the Dover trial.
GP: We have one event: the random generation of a 150 figures number. What is the probability of that event? It depends on how you define the probability. In all probability problems, you need a clear definition of what probability you are computing.
You make a very important point. What is falsely suggested, by Ken Miller and others, is that an independent specification is matched.
GP: So, if you define the problem as follows: “What is the probability of having exactly this result? … (and here you must give the exact sequence for which you are computing the probability)”
Exactly right. Ken Miller, tell us the exact sequence you refer to when you talk about probability, and do NOT use the outcome to produce this specification.
GP: … then the probability is 10^-150. But you have to define the result by the exact contingent information of the result you have already got. IOWs the outcome informed your specification. IOWs, what you are asking is the probability of a result that is what it is.
The ‘specification’ informed by the outcome matches the outcome. Accurately done Ken Miller, but no cigar.
GP: That probability in one try is 1 (100%). Because all results are what they are. All results have a probability of 10^-150. That property is common to all the 10^150 results. Therefore, the probability of having one generic result whose probability is 10^-150 is 1, because we have 10^150 potential results with that property, and none that does not have it. So, should we be surprised that we got one specific result, that is what it is?
Kenny Miller acted very surprised, like this:
Miller: We can then look back and say ‘my goodness, how improbable this is. We can play cards for the rest of our lives and we would never ever deal the cards out in this exact same fashion.’ You know what; that’s absolutely correct.
My goodness!
GP: Not at all. That is the only possible result. The probability is 1. No miracle, of course. Not even any special luck. Just necessity (a probability of one is necessity).
I agree completely. I have attempted to make the exact same point in #33.
GP: A few comments on what you say, and about the word “independent”. Pre-specifications are in a sense “independent” by definition. There is never any problem with them.
I agree. However, unfortunately, obviously, no human can produce pre-specifications of e.g. functional proteins.
GP: The problem arises with post-specifications. You say that they must be “independent”, and I agree. But perhaps the word “independent” can lead to some confusion. So, it’s better to clarify what it means.
In #33 I offered the following clarification:
O: To be clear, here by “independent” is meant independent from the outcome. Such an independent specification can be produced before, during or after the outcome, the only demand which must be met is that it is not informed by the outcome.
GP: But if I see that the protein is a very efficient enzyme for a specific biochemical reaction, and using that observation only, and not the specific sequence of the protein (and I can be completely ignorant of it), I define my function as: “a protein that can implement the following enzymatic reaction” (observed function) at at least the following level (upper tail based on the observed function efficiency)” then my post-specification is completely valid. I am not committing any TSS fallacy. My target is a real target, and my probabilities are real probabilities.
I agree. My only comment is that I still prefer the term “independent specification”. Calling it “post-specification” is confusing and also less accurate. The specification is not based on the outcome and it is therefore irrelevant if it happens before, during or after the outcome.Origenes
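As a toy illustration of this distinction (added here as an aside; the "specification" below is bare sequence identity, far cruder than any functional specification): a specification copied from the outcome is matched with probability 1, while a specification fixed independently of the deal is essentially never matched, since one full deal carries about 226 bits.

```python
import random
from math import factorial, log2

deck = list(range(52))

# A post-hoc "specification" written down by looking at the outcome: it always matches.
random.shuffle(deck)
outcome = tuple(deck)
post_hoc_spec = outcome
print("post-hoc specification matched:", outcome == post_hoc_spec)  # True, probability 1

# An independent specification, fixed without looking at the deal.
independent_spec = tuple(range(52))
matches = sum(tuple(random.sample(deck, 52)) == independent_spec for _ in range(100_000))
print("independent specification matched in 100,000 deals:", matches)  # essentially always 0

print(f"probability of any one specific deal: 1/52! (~{log2(factorial(52)):.0f} bits)")
```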
April 15, 2018 at 01:05 PM PDT
It's going to take a few reads to fully digest this, but nicely done. Much appreciation.mike1962
April 15, 2018 at 12:25 PM PDT
