Uncommon Descent Serving The Intelligent Design Community

Defending Intelligent Design theory: Why targets are real targets, probabilities real probabilities, and the Texas Sharp Shooter fallacy does not apply at all.

The aim of this OP is to discuss in some order and with some completeness a few related objections to ID theory which are in a way connected to the argument that goes under the name of Texas Sharp Shooter Fallacy, sometimes used as a criticism of ID.

The argument that the TSS fallacy is a valid objection against ID has been presented many times by DNA_Jock, a very good discussant from the other side. So, I will refer in some detail to his arguments, as I understand them and remember them. Of course, if DNA_Jock thinks that I am misrepresenting his ideas, I am ready to acknowledge any correction about that. He can post here, if he can or likes, or at TSZ, where he is a contributor.

However, I think that the issues discussed in this OP are of general interest, and that they touch some fundamental aspects of the debate.

As a help to those who read this, I will sum up the general structure of this OP, which will probably be rather long. I will discuss three different arguments, somewhat related. They are:

a) The application of the Texas Sharp Shooter fallacy to ID, and why that application is completely wrong.

b) The objection of the different possible levels of function definition.

c) The objection of the possible alternative solutions, and of the incomplete exploration of the search space.

Of course, the issue debated here is, as usual, the design inference, and in particular its application to biological objects.

So, let’s go.

a) The Texas Sharp Shooter fallacy and its wrong application to ID.

 

What’s the Texas Sharp Shooter fallacy (TSS)?

It is a logical fallacy. I quote here a brief description of the basic metaphor, from RationalWiki:

The fallacy’s name comes from a parable in which a Texan fires his gun at the side of a barn, paints a bullseye around the bullet hole, and claims to be a sharpshooter. Though the shot may have been totally random, he makes it appear as though he has performed a highly non-random act. In normal target practice, the bullseye defines a region of significance, and there’s a low probability of hitting it by firing in a random direction. However, when the region of significance is determined after the event has occurred, any outcome at all can be made to appear spectacularly improbable.

For our purposes, we will use a scenario where specific targets are apparently shot by a shooter. This is the scenario that best resembles what we see in biological objects, where we can observe a great number of functional structures, in particular proteins, and we try to understand the causes of their origin.

In ID, as is well known, we use functional information as a measure of the improbability of an outcome. The general idea is similar to Paley’s argument for a watch: a very high level of specific functional information in an object is a very reliable marker of design.

But to evaluate functional information in any object, we must first define a function, because the measure of functional information depends on the function defined. And the observer must be free to define any possible function, and then measure the linked functional information. Respecting these premises, the idea is that if we observe any object that exhibits complex functional information (for example, more than 500 bits of functional information) for an explicitly defined function (whatever it is), we can safely infer design.

Now, the objection that we are discussing here is that, according to some people (for example DNA_Jock), by defining the function after we have observed the object as we do in ID theory we are committing the TSS fallacy. I will show why that is not the case using an example, because examples are clearer than abstract words.

So, in our example, we have a shooter, a wall which is the target of the shooting, and the shooting itself. And we are the observers.

We know nothing of the shooter. But we know that a shooting takes place.

Our problem is:

  1. Is the shooting a random shooting? This is the null hypothesis

or:

  2. Is the shooter aiming at something? This is the “aiming” hypothesis

So, here I will use “aiming” instead of design, because my neo-darwinist readers will probably stay more relaxed. But, of course, aiming is a form of design (a conscious representation outputted to a material system).

Now I will describe three different scenarios, and I will deal in detail with the third.

  1. First scenario: no fallacy.

In this case, we can look at the wall before the shooting. We see that there are 100 targets painted in different parts of the wall, rather randomly, with their beautiful colors (let’s say red and white). By the way, the wall is very big, so the targets are really a small part of the whole wall, even if taken together.

Then, we witness the shooting: 100 shots.

We go again to the wall, and we find that all 100 shots have hit the targets, one per target, and just at the center.

Without any worries, we infer aiming.

I will not compute the probabilities here, because we are not really interested in this scenario.

This is a good example of pre-definition of the function (the targets to be hit). I believe that neither DNA_Jock nor any other discussant will have problems here. This is not a TSS fallacy.

  2. Second scenario: the fallacy.

The same setting as above. However, we cannot look at the wall before the shooting. No pre-specification.

After the shooting, we go to the wall and paint a target around each of the different shots, for a total of 100. Then we infer aiming.

Of course, this is exactly the TSS fallacy.

There is a post-hoc definition of the function. Moreover, the function is obviously built (painted) to correspond to the information in the shots (their location). More on this later.

Again, I will not deal in detail with this scenario because I suppose that we all agree: this is an example of TSS fallacy, and the aiming inference is wrong.

  3. Third scenario: no fallacy.

The same setting as above. Again, we cannot look at the wall before the shooting. No pre-specification.

After the shooting, we go to the wall. This time, however, we don’t paint anything.

But we observe that the wall is made of bricks, small bricks. Almost all the bricks are brown. But there are a few that are green. Just a few. And they are randomly distributed in the wall.

 

 

We also observe that all the 100 shots have hit green bricks. No brown brick has been hit.

Then we infer aiming.

Of course, the inference is correct. No TSS fallacy here.

And yet, we are using a post-hoc definition of function: shooting the green bricks.

What’s the difference with the second scenario?

The difference is that the existence of the green bricks is not something we “paint”: it is an objective property of the wall. And, even if we do use something that we observe post-hoc (the fact that only the green bricks have been shot) to recognize the function post-hoc, we are not using in any way the information about the specific location of each shot to define the function. The function is defined objectively and independently from the contingent information about the shots.

IOWs, we are not saying: well, the shooter was probably aiming at point x1 (coordinates of the first shot) and point x2 (coordinates of the second shot), and so on. We just recognize that the shooter was aiming at the green bricks. An objective property of the wall.

IOWs (I use many IOWs, because I know that this simple concept will meet a great resistance in the minds of our neo-darwinist friends) we are not “painting” the function, we are simply “recognizing” it, and using that recognition to define it.

Well, this third scenario is a good model of the design inference in ID. It corresponds very well to what we do in ID when we make a design inference for functional proteins. Therefore, the procedure we use in ID is no TSS fallacy. Not at all.

Given the importance of this model for our discussion, I will try to make it more quantitative.

Let’s say that the wall is made of 10,000 bricks in total.

Let’s say that there are only 100 green bricks, randomly distributed in the wall.

Let’s say that all the green bricks have been hit, and no brown brick.

What are the probabilities of that result if the null hypothesis is true (IOWs, if the shooter was not aiming at anything) ?

The probability of one successful hit (where success means hitting a green brick) is of course 0.01 (100/10000).

The probability of having 100 successes in 100 shots can be computed using the binomial distribution. It is:

1e-200

IOWs, the system exhibits 664 bits of functional information. More or less like the TRIM62 protein, an E3 ligase discussed in my previous OP about the Ubiquitin system, which exhibits an increase of 681 bits of human conserved functional information at the transition to vertebrates.
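
For readers who want to check the arithmetic, here is a minimal Python sketch of the computation above. It assumes that the shots are independent and that, under the null hypothesis, each shot lands on any one of the 10,000 bricks with equal probability (so a brick may be hit more than once). The variable names are only illustrative.

```python
import math

TOTAL_BRICKS = 10_000
GREEN_BRICKS = 100
SHOTS = 100

# Probability that a single unaimed shot lands on a green brick.
p_green = GREEN_BRICKS / TOTAL_BRICKS

# Work with logarithms, so that we never depend on very small floats.
log10_p_all_green = SHOTS * math.log10(p_green)   # log10 of 0.01^100
bits = -SHOTS * math.log2(p_green)                # -log2 of the same probability

print(f"log10(probability) = {log10_p_all_green:.1f}")
print(f"functional information = {bits:.1f} bits")
```

With 100 green hits out of 100 shots, the binomial formula reduces to a single term, 0.01^100, which is why no summation is needed here.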

Now, let’s stop for a moment for a very important step. I am asking all neo-darwinists who are reading this OP a very simple question:

In the above situation, do you infer aiming?

It’s very important, so I will ask it a second time, a little louder:

In the above situation, do you infer aiming? 

Because if your answer is no, if you still think that the above scenario is a case of TSS fallacy, if you still believe that the observed result is not unlikely, that it is perfectly reasonable under the assumption of a random shooting, then you can stop here: you can stop reading this OP, you can stop discussing ID, at least with me. I will go on with the discussion with the reasonable people who are left.

So, at the end of this section, let me restate once more the truth about post-hoc definitions:

  1. No post-hoc definition of the function that “paints” the function using the information from the specific details of what is observed is correct. Those definitions are clear examples of TSS fallacy.
  2. On the contrary, any post-hoc definition that simply recognizes a function which is related to an objectively existing property of the system, and makes no special use of the specific details of what is observed to “paint” the function, is perfectly correct. It is not a case of TSS fallacy.

 

b) The objection of the different possible levels of function definition.

DNA_Jock summed up this specific objection in the course of a long discussion in the thread about the English language:

Well, I have yet to see an IDist come up with a post-specification that wasn’t a fallacy. Let’s just say that you have to be really, really, really cautious if you are applying a post-facto specification to an event that you have already observed, and then trying to calculate how unlikely that specific event was. You can make the probability arbitrarily small by making the specification arbitrarily precise.

OK, I have just discussed why post-specifications are not in themselves a fallacy. Let’s say that DNA_Jock apparently admits it, because he just says that we have to be very cautious in applying them. I agree with that, and I have explained what the caution should be about.

Of course, I don’t agree that ID’s post-hoc specifications are a fallacy. They are not, not at all.

And I absolutely don’t agree with his argument that one of the reasons why ID’s post-hoc specifications are a fallacy would be that “You can make the probability arbitrarily small by making the specification arbitrarily precise.”

Let’s try to understand why.

So, let’s go back to our example 3), the wall with the green bricks and the aiming inference.

Let’s make our shooter a little less precise: let’s say that, out of 100 shots, only 50 hits are green bricks.

Now, the math becomes:

The probability of one successful hit (where success means hitting a green brick) is still 0.01 (100/10000).

The probability of having 50 successes or more in 100 shots can be computed using the binomial distribution. It is:

6.165016e-72

Now, the system exhibits “only” 236 bits of functional information. Much less than in the previous example, but still more than enough, IMO, to infer aiming.

Consider that five sigma, which is often used as a standard in physics to reject the null hypothesis, is just 3×10^-7, less than 22 bits.
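
The two figures above can be reproduced with a short Python sketch. It assumes independent shots with a success probability of 0.01 each, uses exact rational arithmetic for the binomial tail, and takes five sigma to mean the one-sided tail of a normal distribution. The function name `binomial_upper_tail` is just an illustrative choice.

```python
import math
from fractions import Fraction


def binomial_upper_tail(n, k_min, p):
    """Exact P(X >= k_min) for X ~ Binomial(n, p), with p given as a Fraction."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )


p_green = Fraction(100, 10_000)

# Probability of 50 or more green hits in 100 unaimed shots.
tail = float(binomial_upper_tail(100, 50, p_green))
tail_bits = -math.log2(tail)

# One-sided five sigma threshold for a normal distribution (assumption: one-sided).
five_sigma = 0.5 * math.erfc(5 / math.sqrt(2))
five_sigma_bits = -math.log2(five_sigma)

print(f"P(50 or more green hits) = {tail:.6e}  ({tail_bits:.1f} bits)")
print(f"five sigma               = {five_sigma:.2e}  ({five_sigma_bits:.1f} bits)")
```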

Now, DNA_Jock’s objection would be that our post-hoc specification is not valid because “we can make the probability arbitrarily small by making the specification arbitrarily precise”.

But is that true? Of course not.

Let’s say that, in this case, we try to “make the specification arbitrarily more precise”, defining the function of sharp aiming as “hitting only green bricks with all 100 shots”.

Well, we are definitely “making the probability arbitrarily small by making the specification arbitrarily precise”. Indeed, we are making the specification more precise by about 128 orders of magnitude! How smart we are, aren’t we?

But if we do that, what happens?

A very simple thing: the facts that we are observing do not meet the specification anymore!

Because, of course, the shooter hit only 50 green bricks out of 100. He is smart, but not that smart.

Neither are we smart if we do such a foolish thing, defining a function that is not met by observed facts!

The simple truth is: we cannot at all “make the probability arbitrarily small by making the specification arbitrarily precise”, as DNA_Jock argues, in our post-hoc specification, because otherwise our facts would not meet our specification anymore, and that would be completely useless and irrelevant.

What we can and must do is exactly what is always done in all cases where hypothesis testing is applied in science (and believe me, that happens very often).

We compute the probabilities of observing the effect that we are indeed observing, or a higher one, if we assume the null hypothesis.

That’s why I have said that the probability of “having 50 successes or more in 100 shots” is 6.165016e-72.

This is called a tail probability, in particular the probability of the upper tail. And it’s exactly what is done in science, in most scenarios.
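
To make the point concrete, the following Python sketch evaluates three possible specifications of the form “at least N green bricks hit” against the observed result of 50 green hits. It is only an illustration under the same assumptions as before (independent shots, success probability 0.01); the names `upper_tail` and `evaluate_specification` are hypothetical.

```python
import math
from fractions import Fraction

SHOTS = 100
P_GREEN = Fraction(100, 10_000)
OBSERVED_GREEN_HITS = 50  # the less precise shooter of this section


def upper_tail(n, k_min, p):
    """Exact P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )


def evaluate_specification(min_green_hits):
    """Return (met_by_observation, bits) for 'at least min_green_hits green hits'."""
    met = OBSERVED_GREEN_HITS >= min_green_hits
    bits = -math.log2(float(upper_tail(SHOTS, min_green_hits, P_GREEN)))
    return met, bits


for threshold in (50, 75, 100):
    met, bits = evaluate_specification(threshold)
    print(f"at least {threshold:3d} green hits: {bits:6.1f} bits, met by the data: {met}")
```

The stricter specifications do give smaller probabilities, but the observed 50 hits do not satisfy them, so they cannot be used. The tightest specification that the data still meet is the one given by the observed result itself, which is the upper tail probability.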

Therefore, DNA_Jock’s argument is completely wrong.

c) The objection of the possible alternative solutions, and of the incomplete exploration of the search space.

c1) The premise

This is certainly the most complex point, because it depends critically on our understanding of protein functional space, which is far from complete.

For the discussion to be in some way complete, I have to present first a very general premise. Neo-darwinists, or at least the best of them, when they understand that they have nothing better to say, usually desperately resort to a set of arguments related to the functional space of proteins. The reason is simple enough: as the nature and structure of that space is still not well known or understood, it’s easier to equivocate with false reasoning.

Their purpose, in the end, is always to suggest that functional sequences can be much more frequent than we believe. Or at least, that they are much more frequent than IDists believe. Because, if functional sequences are frequent, it’s certainly easier for RV to find them.

The arguments for this imaginary frequency of biological function are essentially of five kinds:

  1. The definition of biological function.
  2. The idea that there are a lot of functional islands.
  3. The idea that functional islands are big.
  4. The idea that functional islands are connected. The extreme form of this argument is that functional islands simply don’t exist.
  5. The idea that the proteins we are observing are only optimized forms that derive from simpler implementations through some naturally selectable ladder of simple steps.

Of course, different mixtures of the above arguments are also frequently used.

OK, let’s get rid of the first, which is rather easy. Of course, if we define extremely simple biological functions, they will be relatively frequent.

For example, the famous Szostak experiment shows that a weak affinity for ATP is relatively common in a random library; about 1 in 10^11 sequences 80 AAs long.

A weak affinity for ATP is certainly a valid definition for a biological function. But it is a function which is at the same time irrelevant and non naturally selectable. Only naturally selectable functions have any interest for the neo-darwinian theory.

Moreover, most biological functions that we observe in proteins are extremely complex. A lot of them have a functional complexity beyond 500 bits.

So, we are only interested in functions in the protein space which are naturally selectable, and we are specially interested in functions that are complex, because those are the ones about which we make a design inference.

The other four points are subtler.

  2. The idea that there are a lot of functional islands.

Of course, we don’t know exactly how many functional islands exist in the protein space, even restricting the concept of function to what was said above. Neo-darwinists hope that there are a lot of them. I think there are many, but not so many.

But the problem, again, is drastically reduced if we consider that not all functional islands will do. Going back to point 1, we need naturally selectable islands. And what can be naturally selected is much less than what can potentially be functional. A naturally selectable island of function must be able to give a reproductive advantage. In a system that already has some high complexity, like any living cell, the number of functions that can be immediately integrated in what already exists is certainly strongly constrained.

This point is also strictly connected to the other two points, so I will go on with them and then try some synthesis.

  3. The idea that functional islands are big.

Of course, functional islands can be of very different sizes. That depends on how many sequences, related at sequence level (IOWs, that are part of the same island), can implement the function.

Measuring functional information in a sequence by conservation, as in the Durston method or in my procedure, described many times, is an indirect way of measuring the size of a functional island. The greater the functional complexity of an island, the smaller its size in the search space.

Now, we must remember a few things. Let’s take as an example an extremely conserved but not too long sequence, our friend ubiquitin. It’s 76 AAs long. Therefore, the associated search space is 20^76: 328 bits.

Of course, even the ubiquitin sequence can tolerate some variation, but it is still one of the most conserved sequences in evolutionary history. Let’s say, for simplicity, that at least 70 AAs are strictly conserved, and that 6 can vary freely (of course, that’s not exact, just an approximation for the sake of our discussion).

Therefore, using the absolute information potential of 4.3 bits per amino acid, we have:

Functional information in the sequence = 303 bits

Size of the functional island = 328 – 303 = 25 bits

Now, a functional island of 25 bits is not exactly small: it corresponds to about 33.5 million sequences.

But it is infinitely tiny if compared to the search space of 328 bits: 7.5 x 10^98 sequences!
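
The ubiquitin figures can be checked with the short Python sketch below. It follows the simplification used above (70 strictly conserved sites, 6 freely varying sites) and rounds each quantity to whole bits before subtracting, as in the text. The variable names are illustrative only.

```python
import math

BITS_PER_AA = math.log2(20)  # information potential of one site, about 4.32 bits

LENGTH = 76      # ubiquitin length in AAs
CONSERVED = 70   # simplifying assumption: strictly conserved sites

search_bits = round(LENGTH * BITS_PER_AA)         # size of the search space
functional_bits = round(CONSERVED * BITS_PER_AA)  # functional information
island_bits = search_bits - functional_bits       # size of the functional island

print(f"search space:           {search_bits} bits")
print(f"functional information: {functional_bits} bits")
print(f"functional island:      {island_bits} bits = {2**island_bits:,} sequences")

# Exact size of the search space as an integer, shown in scientific notation.
print(f"20^76 = {float(20**LENGTH):.2e} sequences")
```

Without the intermediate rounding, six freely varying sites correspond to 20^6 = 64 million sequences (about 25.9 bits), which is the same order of magnitude as the 33.5 million quoted above.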

If the sequence is longer, the relationship between island space and search space (the ocean where the island is placed) becomes much worse.

The beta chain of ATP synthase (529 AAs), another old friend, exhibits 334 identities between E. coli and humans. Again for the sake of simplicity, let’s consider that about 300 AAs are strictly conserved, and let’s ignore the functional constraint on all the other AA sites. That gives us:

Search space = 20^529 = 2286 bits

Functional information in the sequence = 1297 bits

Size of the functional island = 2286 – 1297 = 989 bits

So, with this computation, there could be about 10^297 sequences that can implement the function of the beta chain of ATP synthase. That seems a huge number (indeed, it’s definitely an overestimate, but I always try to be generous, especially when discussing a very general principle). However, now the functional island is 10^390 times smaller than the ocean, while in the case of ubiquitin it was “just” 10^91 times smaller.

IOWs, the search space (the ocean) increases exponentially much more quickly than the target space (the functional island) as the length of the functional sequence increases, provided of course that the sequences always retain high functional information.

The important point is not the absolute size of the island, but its ratio to the vastness of the ocean.
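
The comparison between the two proteins can be sketched in Python as follows. Under the simplification used here (conserved sites admit exactly one amino acid, all other sites vary freely), the island contains 20^(length − conserved) sequences and the ocean 20^length, so their ratio is 20^(−conserved). The function name is a hypothetical helper, not part of any library.

```python
import math


def island_to_ocean_log10(length, conserved):
    """log10 of (island size / search space size) under the simplified model.

    Island: 20**(length - conserved) sequences; ocean: 20**length sequences.
    """
    island_log10 = (length - conserved) * math.log10(20)
    ocean_log10 = length * math.log10(20)
    return island_log10 - ocean_log10


examples = {
    "ubiquitin": (76, 70),
    "ATP synthase beta chain": (529, 300),
}

for name, (length, conserved) in examples.items():
    ratio = island_to_ocean_log10(length, conserved)
    print(f"{name}: island is about 10^{-ratio:.0f} times smaller than the ocean")
```

In this simplified model the ratio depends only on the number of strictly conserved sites, which is just another way of saying that it is fixed by the functional information of the sequence.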

So, the beta chain of ATP synthase is really a tiny, tiny island, much smaller than ubiquitin.

Now, what would be a big island? It’s simple: a functional island which can implement the same function at the same level, but with low functional information. The lower the functional information, the bigger the island.

Are there big islands? For simple functions, certainly yes. Behe quotes the antifreeze protein as an example. It has rather low FI.

But are there big islands for complex functions, like that of ATP synthase beta chain? It’s absolutely reasonable to believe that there are none. Because the function here is very complex, and it cannot be implemented by a simple sequence, exactly like a functional spreadsheet software cannot be written by a few bits of source code. Neo-darwinists will say that we don’t know that for certain. It’s true, we don’t know it for certain. We know it almost for certain.

The simple fact remains: the only example of the beta chain of the F1 complex of ATP synthase that we know of is extremely complex.

Let’s go, for the moment, to the 4th argument.

  4. The idea that functional islands are connected. The extreme form of this argument is that functional islands simply don’t exist.

This is easier. We have a lot of evidence that functional islands are not connected, and that they are indeed islands, widely isolated in the search space of possible sequences. I will mention the two best pieces of evidence:

4a) All the functional proteins that we know of, those that exist in all the proteomes we have examined, are grouped in about 2000 superfamilies. By definition, a protein superfamily is a cluster of sequences that have:

  • no sequence similarity
  • no structure similarity
  • no function similarity

with all the other groups.

IOWs, islands in the sequence space.

4b) The best (and probably the only) good paper that relates an experiment where Natural Selection is really tested by an appropriate simulation is the rugged landscape paper:

Experimental Rugged Fitness Landscape in Protein Sequence Space

http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0000096

Here, NS is correctly simulated in a phage system, because what is measured is infectivity, which in phages is of course strictly related to fitness.

The function studied is the retrieval of a partially damaged infectivity due to a partial random substitution in a protein linked to infectivity.

In brief, the results show a rugged landscape of protein function, where random variation and NS can rather easily find some low-level peaks of function, while the original wild-type, optimal peak of function cannot realistically be found, not only in the lab simulation, but in any realistic natural setting. I quote from the conclusions:

The question remains regarding how large a population is required to reach the fitness of the wild-type phage. The relative fitness of the wild-type phage, or rather the native D2 domain, is almost equivalent to the global peak of the fitness landscape. By extrapolation, we estimated that adaptive walking requires a library size of 10^70 with 35 substitutions to reach comparable fitness.

I recommend having a look at Fig. 5 in the paper to get an idea of what a rugged landscape is.

However, I will happily accept a suggestion from DNA_Jock, made in one of his recent comments at TSZ about my Ubiquitin thread, and with which I fully agree. I quote him:

To understand exploration one, we have to rely on in vitro evolution experiments such as Hayashi et al 2006 and Keefe & Szostak, 2001. The former also demonstrates that explorations one and two are quite different. Gpuccio is aware of this: in fact it was he who provided me with the link to Hayashi – see here.
You may have heard of hill-climbing algorithms. Personally, I prefer my landscapes inverted, for the simple reason that, absent a barrier, a population will inexorably roll downhill to greater fitness. So when you ask:

How did it get into this optimized condition which shows a highly specified AA sequence?

I reply
It fell there. And now it is stuck in a crevice that tells you nothing about the surface whence it came. Your design inference is unsupported.

Of course, I don’t agree with the last phrase. But I fully agree that we should think of local optima as “holes”, and not as “peaks”. That is the correct way.

So, the protein landscape is more like a ball and holes game, but without a guiding labyrinth: as long as the ball is on the flat plane (non functional sequences), it can go in any direction, freely. However, when it falls into a hole, it will quickly go to the bottom, and most likely it will remain there.

 

 

But:

  • The holes are rare, and they are of different sizes
  • They are distant from one another
  • A same function can be implemented by different, distant holes, of different size

What does the rugged landscape paper tell us?

  • That the wildtype function that we observe in nature is an extremely small hole. To find it by RV and NS, according to the authors, we should start with a library of 10^70 sequences.
  • That there are other bigger holes which can partially implement some function retrieval, and that are in the range of reasonable RV + NS
  • That those simpler solutions are not bridges to the optimal solution observed in the wildtype. IOWs, they are different, and there is no “ladder” that NS can use to reach the optimal solution.

Indeed, falling into a bigger hole (a much bigger hole, indeed) is rather a severe obstacle to finding the tiny hole of the wildtype. Finding it is already almost impossible because it is so tiny, but it becomes even more impossible if the ball falls into a big hole, because it will be trapped there by NS.

Therefore, to sum up, both the existence of 2000 isolated protein superfamilies and the evidence from the rugged landscape paper demonstrate that functional islands exist, and that they are isolated in the sequence space.

Let’s go now to the 5th argument:

  5. The idea that the proteins we are observing are only optimized forms that derive from simpler implementations by a naturally selectable ladder.

This is derived from the previous argument. If bigger functional holes do exist for a function (IOWs, simpler implementations), and they are definitely easier to find than the optimal solution we observe, why not believe that the simpler solutions were found first, and then opened the way to the optimal solution by a process of gradual optimization and natural selection of the steps? IOWs, a naturally selectable ladder?

And the answer is: because that is impossible, and all the evidence we have is against that idea.

First of all, even if we know that simpler implementations do exist in some cases (see the rugged landscape paper), it is not at all obvious that they exist as a general rule.

Indeed, the rugged landscape experiment is a very special case, because it is about retrieval of a function that has been only partially impaired by substituting a random sequence to part of an already existing, functional protein.

The reason for that is that, if they had completely knocked out the protein, infectivity, and therefore survival itself, would have been lost, and NS could not have acted at all.

In function retrieval cases, where the function is however kept even if at a reduced level, the role of NS is greatly helped: the function is already there, and can be optimized with a few naturally selectable steps.

And that is what happens in the case of the Hayashi paper. But the function is retrieved only very partially, and, as the authors say, there is no reasonable way to find the wildtype sequence, the optimal sequence, in that way. Because the optimal sequence would require, according to the authors, 35 AA substitutions, and a starting library of 10^70 random sequences.

What is equally important is that the holes found in the experiment are not connected to the optimal solution (the wildtype). They are different from it at sequence level.

IOWs, these bigger holes do not lead to the optimal solution. Not at all.

So, we have a strange situation: 2000 protein superfamilies, and thousands and thousands of proteins in them, that appear to be, in most cases, extremely functional, probably absolutely optimal. But we have absolutely no evidence that they have been “optimized”. They are optimal, but not necessarily optimized.

Now, I am not excluding that some optimization can take place in non design systems: we have good examples of that in the few known microevolutionary cases. But that optimization is always extremely short, just a few AA substitutions once the starting functional island has been found, and the function must already be there.

So, let’s say that if the extremely tiny functional island where our optimal solution lies, for example the wildtype island in the rugged landscape experiment, can be found in some way, then some small optimization inside that functional island could certainly take place.

But first, we have to find that island: and for that we need 35 specific AA substitutions (about 180 bits), and 10^70 starting sequences, if we go by RV + NS. Practically impossible.

But there is more. Do those simpler solutions always exist? I will argue that it is not so in the general case.

For example, in the case of the alpha and beta chains of the F1 subunit of ATP synthase, there is no evidence at all that simpler solutions exist. More on that later.

So, to sum it up:

The ocean of the search space, according to the reasonings of neo-darwinists, should be overflowing with potential naturally selectable functions. This is not true, but let’s assume for a moment, for the sake of discussion, that it is.

But, as we have seen, simpler functions or solutions, when they exist, are much bigger functional islands than the extremely tiny functional islands corresponding to solutions with high functional complexity.

And yet, we have seen that there is absolutely no evidence that simpler solutions, when they exist, are bridges, or ladders, to highly complex solutions. Indeed, there is good evidence to the contrary.

Given those premises, what would you expect if the neo-darwinian scenario were true? It’s rather simple: a universal proteome overflowing with simple functional solutions.

Instead, what do we observe? It’s rather simple: a universal proteome overflowing with highly functional, probably optimal, solutions.

IOWs, we find in the existing proteome almost exclusively highly complex solutions, and not simple solutions.

The obvious conclusion? The neo-darwinist scenario is false. The highly functional, optimal solutions that we observe can only be the result of intentional and intelligent design.

c2) DNA_Jock’s arguments

Now I will take up in more detail DNA_Jock’s two arguments about alternative solutions and the partial exploration of the protein space, and will explain why they are only variants of what I have already discussed, and therefore not valid.

The first argument, that we can call “the existence of alternative solutions”, can be traced to this statement by DNA_Jock:

Every time an IDist comes along and claims that THIS protein, with THIS degree of constraint, is the ONLY way to achieve [function of interest], subsequent events prove them wrong. OMagain enjoys laughing about “the” bacterial flagellum; John Walker and Praveen Nina laugh about “the” ATPase; Anthony Keefe and Jack Szostak laugh about ATP-binding; now Corneel and I are laughing about ubiquitin ligase: multiple ligases can ubiquinate a given target, therefore the IDist assumption is false. The different ligases that share targets ARE “other peaks”.
This is Texas Sharp Shooter.

We will debate the laugh later. For the moment, let’s see what the argument states.

It says: the solution we are observing is not the only one. There can be others, in some cases we know there are others. Therefore, your computation of probabilities, and therefore of functional information, is wrong.

Another way to put it is to ask the question: “how many needles are there in the haystack?”

Alan Fox seems to prefer this metaphor:

This is what is wrong with “Islands-of-function” arguments. We don’t know how many needles are in the haystack. G Puccio doesn’t know how many needles are in the haystack. Evolution doesn’t need to search exhaustively, just stumble on a useful needle.

They both seem to agree about the “stumbling”. DNA_Jock says:

So when you ask:

How did it get into this optimized condition which shows a highly specified AA sequence?

I reply
It fell there. And now it is stuck in a crevice that tells you nothing about the surface whence it came.

OK, I think the idea is clear enough. It is essentially the same idea as in point 2 of my general premise. There are many functional islands. In particular, in this form, many functional islands for the same function.

I will answer it in two parts:

  • Is it true that the existence of alternative solutions, if they exist, makes the computation of functional complexity wrong?
  • Have we really evidence that alternative solutions exist, and of how frequent they can really be?

I will discuss the first part here, and say something about the second part later in the OP.

Let’s read again the essence of the argument, as summed up by me above:

” The solution we are observing is not the only one. There can be others, in some cases we know there are others. Therefore, your computation of probabilities, and therefore of functional information, is wrong.”

As it happens with smart arguments (and DNA_Jock is usually smart), it contains some truth, but is essentially wrong.

The truth could be stated as follows:

” The solution we are observing is not the only one. There can be others, in some cases we know there are others. Therefore, our computation of probabilities, and therefore of functional information, is not completely precise, but it is essentially correct”.

To see why that is the case, let’s use again a very good metaphor: Paley’s old watch. That will help to clarify my argument, and then I will discuss how it relates to proteins, in particular.

So, we have a watch. Whose function is to measure time. And, in general, let’s assume that we infer design for the watch, because its functional information is high enough to exclude that it could appear in any non design system spontaneously. I am confident that all reasonable people will agree with that. Anyway, we are assuming it for the present discussion.

 

 

Now, after having made a design inference (a perfectly correct inference, I would say) for this object, we have a sudden doubt. We ask ourselves: what if DNA_Jock is right?

So, we wonder: are there other solutions to measure time? Are there other functional islands in the search space of material objects?

Of course there are.

I will just mention four clear examples: a sundial, an hourglass, a digital clock,  an atomic clock.

The sundial uses the position of the sun. The hourglass uses a trickle of sand. The digital clock uses an electronic oscillator that is regulated by a quartz crystal to keep time. An atomic clock uses an electron transition frequency in the microwave, optical, or ultraviolet region.

None of them uses gears or springs.

Now, two important points:

  • Even if the functional complexity of the five above mentioned solutions is probably rather different (the sundial and the hourglass are probably much simpler, and the atomic clock is probably the most complex), they are all rather complex. None of them would be easily explained without a design inference. IOWs, they are small functional islands, each of them. Some are bigger, some are really tiny, but none of them is big enough to allow a random origin in a non design system.
  • None of the four additional solutions mentioned would be, in any way, a starting point to get to the traditional watch by small functional modifications. Why? Because they are completely different solutions, based on different ideas and plans.

If someone believes differently, he can try to explain in some detail how we can get to a traditional watch starting from an hourglass.

 

 

Now, an important question:

Does the existence of the four mentioned alternative solutions, or maybe of other possible similar solutions, make the design inference for the traditional watch less correct?

The answer, of course, is no.

But why?

It’s simple. Let’s say, just for the sake of discussion, that the traditional watch has a functional complexity of 600 bits. There are at least 4 additional solutions. Let’s say that each of them has, again, a functional complexity of 600 bits.

How much does that change the probability of getting the watch?

The answer is: 2 bits (because we have 4 solutions instead of one). So, now the functional information is 598 bits.

But, of course, there can be many more solutions. Let’s say 1000. Now it would be about 590 bits. Let’s say one million different complex solutions (this is becoming generous, I would say). 580 bits. One billion? 570 bits.

Shall I go on?

When the search space is really huge, the number of really complex solutions is empirically irrelevant to the design inference. One observed complex solution is more than enough to infer design. Correctly.
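The subtraction used above can be sketched in a few lines of Python. This is only an illustration, assuming (as the running totals in the text imply) that every solution is a separate, non-overlapping island of 600 bits; the function name `effective_bits` is hypothetical.

```python
import math

def effective_bits(bits_per_solution: float, n_solutions: int) -> float:
    """Bits for hitting ANY of n equally sized, non-overlapping islands.

    The joint target is n times bigger than a single island, so the
    functional information drops by log2(n).
    """
    return bits_per_solution - math.log2(n_solutions)

for n in (1, 4, 1_000, 1_000_000, 1_000_000_000):
    print(f"{n:>13,} solutions -> {effective_bits(600, n):.1f} bits")
```

Each thousandfold increase in the number of solutions removes only about 10 bits, which is why the count of complex alternatives matters so little at this scale.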

We could call this argument: “How many needles do you need to transform a haystack into a needlestack?” And the answer is: really a lot of them.

Our poor 4 alternative solutions will not do the trick.

But what if there are a number of functional islands that are much bigger, much more likely? Let’s say 50 bits functional islands. Much simpler solutions. Let’s say 4 of them. That would make the scenario more credible. Not so much, probably, but certainly it would work better than the 4 complex solutions.

OK, I have already discussed that above, but let’s say it again. Let’s say that you have 4 (or more) 50 bits solution, and one (or more) 500 bits solution. But what you observe as a fact is the 500 bits solution, and none of the 50 bits solutions. Is that credible?

No, it isn’t. Do you know how much smaller a 500 bits solution is if compared to a 50 bits solution? It’s 2^450 times smaller: about 10^135 times smaller. We are dealing with exponential values here.

So, if much simpler solutions existed, we would expect to observe one of them, and certainly not a solution that is 10^135 times more unlikely. The design inference for the highly complex solution is not disturbed in any way by the existence of much simpler solutions.
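The conversion from a difference in bits to a power of ten can be checked with a short Python sketch. The function name `log10_size_ratio` is hypothetical, and the only inputs are the 500 and 50 bit values used in the text.

```python
import math

def log10_size_ratio(bits_complex: float, bits_simple: float) -> float:
    """log10 of how many times smaller the complex island is than the simple one.

    A difference of d bits is a factor of 2**d, i.e. 10**(d * log10(2)).
    """
    return (bits_complex - bits_simple) * math.log10(2)

exponent = log10_size_ratio(500, 50)
print(f"2^450 is roughly 10^{exponent:.1f}")
```

The exponent comes out between 135 and 136, in line with the order of magnitude quoted above.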

OK, I think that the idea is clear enough.

c3) The laughs

As already mentioned, the issue of alternative solutions and uncounted needles seems to be a special source of hilarity for DNA_Jock.  Good for him (a laugh is always a good thing for physical and mental health). But are the laughs justified?

I quote here again his comment about the laughs, that I will use to analyze the issues.

Every time an IDist comes along and claims that THIS protein, with THIS degree of constraint, is the ONLY way to achieve [function of interest], subsequent events prove them wrong. OMagain enjoys laughing about “the” bacterial flagellum; John Walker and Praveen Nina laugh about “the” ATPase; Anthony Keefe and Jack Szostak laugh about ATP-binding; now Corneel and I are laughing about ubiquitin ligase: multiple ligases can ubiquinate a given target, therefore the IDist assumption is false. The different ligases that share targets ARE “other peaks”.

I will not consider the bacterial flagellum, that has no direct relevance to the discussion here. I will analyze, instead, the other three laughable issues:

  • Szostak and Keefe’s ATP binding protein
  • ATP synthase (rather than ATPase)
  • E3 ligases

Szostak and Keefe should not laugh at all, if they ever did. I have already discussed their paper a lot of times. It’s a paper about directed evolution which generates a strongly ATP binding protein from a weakly ATP binding protein present in a random library. It is directed evolution by mutation and artificial selection. The important point is that both the original weakly binding protein and the final strongly binding protein are not naturally selectable.

Indeed, a protein that just binds ATP is of course of no utility in a cellular context. Evidence of this obvious fact can be found here:

A Man-Made ATP-Binding Protein Evolved Independent of Nature Causes Abnormal Growth in Bacterial Cells

http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0007385

There is nothing to laugh about here: the protein is a designed protein, and anyway it is no functional peak/hole at all in the sequence space, because it cannot be naturally selected.

Let’s go to ATP synthase.

DNA_Jock had already remarked:

They make a second error (as Entropy noted) when they fail to consider non-traditional ATPases (Nina et al).

And he gives the following link:

Highly Divergent Mitochondrial ATP Synthase Complexes in Tetrahymena thermophila

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2903591/

And, of course, he laughs with Nina (supposedly).

OK. I have already discussed that the existence of one or more highly functional, but different, solutions to ATP building would not change the ID inference at all. But is it really true that there are these other solutions?

Yes and no.

As far as my personal argument is concerned, the answer is definitely no (or at least, there is no evidence of them). Why?

Because my argument, repeated for years, has always been based (everyone can check) on the alpha and beta chains of ATP synthase, the main constituents of the F1 subunit, where the true catalytic function is implemented.

To be clear, ATP synthase is a very complex molecule, made of many different chains and of two main multiprotein subunits. I have always discussed only the alpha and beta chains, because those are the chains that are really highly conserved, from prokaryotes to humans.

The other chains are rather conserved too, but much less. So, I have never used them for my argument. I have never presented blast values regarding the other chains, or made any inference about them. This can be checked by everyone.

Now, the Nina paper is about a different solution for ATP synthase that can be found in some single celled eukaryotes.

I quote here the first part of the abstract:

The F-type ATP synthase complex is a rotary nano-motor driven by proton motive force to synthesize ATP. Its F1 sector catalyzes ATP synthesis, whereas the Fo sector conducts the protons and provides a stator for the rotary action of the complex. Components of both F1 and Fo sectors are highly conserved across prokaryotes and eukaryotes. Therefore, it was a surprise that genes encoding the a and b subunits as well as other components of the Fo sector were undetectable in the sequenced genomes of a variety of apicomplexan parasites. While the parasitic existence of these organisms could explain the apparent incomplete nature of ATP synthase in Apicomplexa, genes for these essential components were absent even in Tetrahymena thermophila, a free-living ciliate belonging to a sister clade of Apicomplexa, which demonstrates robust oxidative phosphorylation. This observation raises the possibility that the entire clade of Alveolata may have invented novel means to operate ATP synthase complexes.

Emphasis mine.

As everyone can see, it is absolutely true that these protists have a different, alternative form of ATP synthase: it is based on a similar, but certainly divergent, architecture, and it uses some completely different chains. Which is certainly very interesting.

But this difference does not involve the sequence of the alpha and beta chains in the F1 subunit.

Beware, the a and b subunits mentioned above by the paper are not the alpha and beta chains.

From the paper:

The results revealed that Spot 1, and to a lesser extent, spot 3 contained conventional ATP synthase subunits including α, β, γ, OSCP, and c (ATP9)

IOWs, the “different” ATP synthase uses the same “conventional” forms of alpha and beta chain.

To be sure of that, I have, as usual, blasted them against the human forms. Here are the results:

ATP synthase subunit alpha, Tetrahymena thermophila, (546 AAs) Uniprot Q24HY8, vs  ATP synthase subunit alpha, Homo sapiens, 553 AAs (P25705)

Bitscore: 558 bits     Identities: 285    Positives: 371

ATP synthase subunit beta, Tetrahymena thermophila, (497 AAs) Uniprot I7LZV1, vs  ATP synthase subunit beta, Homo sapiens, 529 AAs (P06576)

Bitscore: 729 bits     Identities: 357     Positives: 408

These are the same, old, conventional sequences that we find in all organisms, the only sequences that I have ever used for my argument.
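To put the BLAST figures above on a more familiar scale, here is a rough Python sketch that turns them into fractions. It rests on an assumption: the alignment length is not given in the text, so the counts are divided by the length of the Tetrahymena sequence instead. BLAST itself reports percentages over the alignment length, so these values are only approximate, and the name `fraction_of_sequence` is hypothetical.

```python
# Identities and positives as reported in the text, with the length of the
# Tetrahymena thermophila sequence used as the denominator (an assumption,
# since the alignment length is not given).
hits = {
    "alpha": {"tetrahymena_len": 546, "identities": 285, "positives": 371},
    "beta": {"tetrahymena_len": 497, "identities": 357, "positives": 408},
}

def fraction_of_sequence(count: int, sequence_len: int) -> float:
    """Share of the sequence covered by identical (or positive) positions."""
    return count / sequence_len

for chain, h in hits.items():
    ident = fraction_of_sequence(h["identities"], h["tetrahymena_len"])
    pos = fraction_of_sequence(h["positives"], h["tetrahymena_len"])
    print(f"{chain}: identities ~{ident:.0%}, positives ~{pos:.0%}")
```

Under that assumption, roughly half of the alpha chain and more than two thirds of the beta chain are identical between the ciliate and the human sequences.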

Therefore, for these two fundamental sequences, we have no evidence at all of any alternative peaks/holes. Which, if they existed, would however be irrelevant, as already discussed.

Not much to laugh about.

Finally, E3 ligases. DNA_Jock is ready to laugh about them because of this very good paper:

Systematic approaches to identify E3 ligase substrates

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5103871/

His idea, shared with other TSZ guys, is that the paper demonstrates that E3 ligases are not specific proteins, because the same substrate can bind to more than one E3 ligase.

The paper says:

Significant degrees of redundancy and multiplicity. Any particular substrate may be targeted by multiple E3 ligases at different sites, and a single E3 ligase may target multiple substrates under different conditions or in different cellular compartments. This drives a huge diversity in spatial and temporal control of ubiquitylation (reviewed by ref. [61]). Cellular context is an important consideration, as substrate–ligase pairs identified by biochemical methods may not be expressed or interact in the same sub-cellular compartment.

I have already commented elsewhere (in the Ubiquitin thread) that the fact that a substrate can be targeted by multiple E3 ligases at different sites, or in different sub-cellular compartments, is clear evidence of complex specificity. IOWs, it’s not that two or more E3 ligases bind the same target just to do the same thing: they bind the same target in different ways and in different contexts to do different things. The paper, even if very interesting, is only about detecting affinities, not function.

That should be enough to stop the laughs. However, I will add another simple concept. If E3 ligases were really redundant in the sense suggested by DNA_Jock and friends, their loss of function should not be a serious problem for us. OK, I will just quote a few papers (not many, because this OP is already long enough):

The multifaceted role of the E3 ubiquitin ligase HOIL-1: beyond linear ubiquitination.

https://www.ncbi.nlm.nih.gov/pubmed/26085217

HOIL-1 has been linked with antiviral signaling, iron and xenobiotic metabolism, cell death, and cancer. HOIL-1 deficiency in humans leads to myopathy, amylopectinosis, auto-inflammation, and immunodeficiency associated with an increased frequency of bacterial infections.

WWP1: a versatile ubiquitin E3 ligase in signaling and diseases.

https://www.ncbi.nlm.nih.gov/pubmed/22051607

WWP1 has been implicated in several diseases, such as cancers, infectious diseases, neurological diseases, and aging.

RING domain E3 ubiquitin ligases.

https://www.ncbi.nlm.nih.gov/pubmed/19489725

RING-based E3s are specified by over 600 human genes, surpassing the 518 protein kinase genes. Accordingly, RING E3s have been linked to the control of many cellular processes and to multiple human diseases. Despite their critical importance, our knowledge of the physiological partners, biological functions, substrates, and mechanism of action for most RING E3s remains at a rudimentary stage.

HECT-type E3 ubiquitin ligases in nerve cell development and synapse physiology.

https://www.ncbi.nlm.nih.gov/pubmed/25979171

The development of neurons is precisely controlled. Nerve cells are born from progenitor cells, migrate to their future target sites, extend dendrites and an axon to form synapses, and thus establish neural networks. All these processes are governed by multiple intracellular signaling cascades, among which ubiquitylation has emerged as a potent regulatory principle that determines protein function and turnover. Dysfunctions of E3 ubiquitin ligases or aberrant ubiquitin signaling contribute to a variety of brain disorders like X-linked mental retardation, schizophrenia, autism or Parkinson’s disease. In this review, we summarize recent findings about molecular pathways that involve E3 ligases of the Homologous to E6-AP C-terminus (HECT) family and that control neuritogenesis, neuronal polarity formation, and synaptic transmission.

Finally I would highly recommend the following recent paper to all who want to approach seriously the problem of specificity in the ubiquitin system:

Specificity and disease in the ubiquitin system

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5264512/

Abstract

Post-translational modification (PTM) of proteins by ubiquitination is an essential cellular regulatory process. Such regulation drives the cell cycle and cell division, signalling and secretory pathways, DNA replication and repair processes and protein quality control and degradation pathways. A huge range of ubiquitin signals can be generated depending on the specificity and catalytic activity of the enzymes required for attachment of ubiquitin to a given target. As a consequence of its importance to eukaryotic life, dysfunction in the ubiquitin system leads to many disease states, including cancers and neurodegeneration. This review takes a retrospective look at our progress in understanding the molecular mechanisms that govern the specificity of ubiquitin conjugation.

Concluding remarks

Our studies show that achieving specificity within a given pathway can be established by specific interactions between the enzymatic components of the conjugation machinery, as seen in the exclusive FANCL–Ube2T interaction. By contrast, where a broad spectrum of modifications is required, this can be achieved through association of the conjugation machinery with the common denominator, ubiquitin, as seen in the case of Parkin. There are many outstanding questions to understanding the mechanisms governing substrate selection and lysine targeting. Importantly, we do not yet understand what makes a particular lysine and/or a particular substrate a good target for ubiquitination. Subunits and co-activators of the APC/C multi-subunit E3 ligase complex recognize short, conserved motifs (D [221] and KEN [222] boxes) on substrates leading to their ubiquitination [223–225]. Interactions between the RING and E2 subunits reduce the available radius for substrate lysines in the case of a disordered substrate [226]. Rbx1, a RING protein integral to cullin-RING ligases, supports neddylation of Cullin-1 via a substrate-driven optimization of the catalytic machinery [227], whereas in the case of HECT E3 ligases, conformational changes within the E3 itself determine lysine selection [97]. However, when it comes to specific targets such as FANCI and FANCD2, how the essential lysine is targeted is unclear. Does this specificity rely on interactions between FA proteins? Are there inhibitory interactions that prevent modification of nearby lysines? One notable absence in our understanding of ubiquitin signalling is a ‘consensus’ ubiquitination motif. Large-scale proteomic analyses of ubiquitination sites have revealed the extent of this challenge, with seemingly no lysine discrimination at the primary sequence level in the case of the CRLs [228]. 
Furthermore, the apparent promiscuity of Parkin suggests the possibility that ubiquitinated proteins are the primary target of Parkin activity. It is likely that multiple structures of specific and promiscuous ligases in action will be required to understand substrate specificity in full.

To conclude, a few words about the issue of the sequence space not entirely traversed.

We have 2000 protein superfamilies that are completely unrelated at sequence level. That is evidence that functional protein sequences are not bound to any particular region of the sequence space.

Moreover, neutral variation in non coding and non functional sequences can go in any direction, without any specific functional constraints. I suppose that neo-darwinists would recognize that part of the genome is non functional, wouldn’t they? And we have already seen elsewhere (in the ubiquitin thread discussion) that many new genes arise from non coding sequences.

So, there is no reason to believe that the functional space has not been traversed. But, of course, neutral variation can traverse it only at very low resolution.

IOWs, there is no reason to think that any specific part of the sequence space is hidden from RV. But of course, the low probabilistic resources of RV can only traverse different parts of the sequence space occasionally.

It’s like having a few balls that can move freely on a plane, and occasionally fall into a hole. If the balls are really few and the plane is extremely big, the balls will be able to  potentially traverse all the regions of the plane, but they will pass only through a very limited number of possible trajectories. That’s why finding a very small hole will be almost impossible, wherever it is. And there is no reason to believe that small functional holes are not scattered in the sequence space, as protein superfamilies clearly show.
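The balls and holes picture can be turned into a back-of-the-envelope calculation. The Python sketch below is a simplification, not the random walk itself: it treats each ball position as an independent random draw, and the figure of 2^140 draws is a purely illustrative assumption, as is the function name `p_at_least_one_hit`.

```python
import math

def p_at_least_one_hit(target_bits: float, trials_log2: float) -> float:
    """Chance that at least one of 2**trials_log2 independent random draws
    lands in a hole covering a fraction 2**-target_bits of the plane."""
    p = 2.0 ** -target_bits   # size of the hole relative to the plane
    n = 2.0 ** trials_log2    # number of draws (illustrative assumption)
    # 1 - (1 - p)**n, written so that it stays accurate for tiny p
    return -math.expm1(n * math.log1p(-p))

for bits in (50, 500):
    print(f"{bits}-bit hole: P(at least one hit) = {p_at_least_one_hit(bits, 140):.3g}")
```

With these assumed numbers the big hole is found with near certainty, while the small one remains vanishingly unlikely, which is the contrast the metaphor is meant to convey.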

So, it’s not true that highly functional proteins are hidden in some unexplored treasure trove in the sequence space. They are there for anyone to find them, in different and distant parts of the sequence space, but it is almost impossible to find them through a random walk, because they are so small.

And yet, 2000 highly functional superfamilies are there.

Moreover, the rate of appearance of new superfamilies is highest at the beginning of natural history (for example in LUCA), when a smaller part of the sequence space is likely to have been traversed, and decreases constantly, becoming extremely low in the last hundreds of millions of years. That’s not what you would expect if the problem of finding new functional islands were due to how much sequence space has been traversed, and if the sequence space were really so overflowing with potential naturally selectable functions, as neo-darwinists like to believe.

OK, that’s enough. As expected, this OP is very long. However, I think that it  was important to discuss all these partially related issues in the same context.

 

Comments
Neil Rickert at TSZ gives us this pearl of thought:
If 500 bits of information reliably indicates design, but 499 bits doesn’t, then it must follow that 1 bit makes all the difference. (This is, roughly, the heap paradox).
So, apparently we should never categorize a continuous variable using a threshold. Good to know. He should probably share that concept with all the scientists who do exactly that, day after day.
gpuccio
May 25, 2018, 09:00 AM PDT
gpuccio
I wonder how Joe Felsenstein would try to explain that. Maybe just restating that he is not “very knowledgeable about biochemistry”?
I think Joe believes you have a legitimate argument or he would not take the time to challenge you. He is very interested in the subject of genetic information. The case Corneel is making is that you are bypassing natural selection. I have interpreted your hypothesis to mean that if there are selectable sequences they are part of the bit calculation. I also made the point to Corneel that he needs to show evidence that a protein family that has a unique sequence and function has a selectable path. I am glad you agree with my conclusion that evidence of precise preserved sequences is evidence for design. RMNS has no viable explanation of how these sequences formed in nature.
bill cole
May 24, 2018, 04:08 PM PDT
bill cole:
“Finding highly optimized sequences in nature reinforces the design argument IMO.”
Yes, of course. And UBR5 is about something like 2500+ AAs optimization! I wonder how Joe Felsenstein would try to explain that. Maybe just restating that he is not “very knowledgeable about biochemistry”?
gpuccio
May 24, 2018, 03:23 PM PDT
gpuccio
As you can see, I am trying to answer Joe Felsenstein directly.
That’s great, thank you :-)
Have you seen my latest results about UBR5?
Yes, and it clearly reinforces your argument. If I think back to the Hayashi paper, a question in my mind is how did the protein arrive at a configuration that Hayashi estimates 10^70 trials to achieve? This is less than 500 bits but it is more than the estimated number of evolutionary trials at 10^43 or around 120 bits. Finding highly optimized sequences in nature reinforces the design argument IMO.
bill cole
May 24, 2018, 10:39 AM PDT
bill cole at #393:
As you can see, I am trying to answer Joe Felsenstein directly.
You say: “I think you may be selling your self short that there is no mathematical confirmation as there are clearly too many trials and too few resources if a system really contains 500 bits of information.”
I am not selling myself at all. I am not a mathematician, and I cannot provide a mathematical demonstration that complex functional information cannot be generated out of design because it is mathematically impossible. I think it will probably be proved sometime. I think Dembski is really trying to do that, but I am not sure if he has succeeded. But that is not my concern.
Of course, proving that it is empirically impossible in all known cases is another thing entirely. That is my concern, definitely.
You say: “The only question in my mind at this point is how accurate is your measurement based on your method of choosing preserved sequences.”
It is absolutely reliable. “Accurate” is not the right word, because I am not pretending that it is absolutely precise. But it does measure what it is intended to measure, and with great reliability. Indeed, I am rather sure that my method underestimates functional information.
Have you seen my latest results about UBR5? Here:
Isolated complex functional islands in the ocean of sequences: a model from English language, again.
https://uncommondescent.com/intelligent-design/isolated-complex-functional-islands-in-the-ocean-of-sequences-a-model-from-english-language-again/
at comments #85, 86, 108, 110, 119 and 124.
gpuccio
May 24, 2018, 09:29 AM PDT
Joe Felsenstein at TSZ:
OK, I have finally read your article at TSZ. So I will try to answer your points.
First of all, thank you for considering my arguments with serious attention. I appreciate that.
I would like to solve immediately one problem which is rather simple: yes, you are right in thinking that I do not rely on Dembski’s Law of Conservation of Complex Specified Information in my reasonings. And, as I have already said in my comments #382 and #383, I don’t want to give a mathematical theorem that demonstrates that complex functional information can only be generated by design. My reasonings are completely empirical.
So, I think that answers your point 1 and Possibility 1.
Of course, I agree with many of Dembski’s ideas on other important points. And I am not saying that I disagree with his Law of Conservation of Complex Specified Information: I am simply saying that I do not rely on it for my arguments. OK?
So, now I will give brief answers to your final questions, and then add some reflections in more detail. Just to start the discussion.
Your questions:
1. Is your “functional information” the same as Szostak’s?
2. Or does it add the requirement that there be no function in sequences that are outside of the target set?
3. Does it also require us to compute the probability that the sequence arises as a result of normal evolutionary processes?
My answers:
1. Yes, I think so. The fact that Szostak does not use it to infer design does not mean that the concept is not the same.
2. It is computed for one explicit definition of a function, including a definite level of it. Therefore, all the sequences that do not satisfy the definition are not in the target set. I think that, too, is the same as what Szostak suggests.
3. It only requires that there is no evidence that an evolutionary process can do it. Such evidence would falsify the theory and the procedure of design inference, as I have said many times.
Maybe the third point requires some more detail.
As you certainly know, Dembski’s explanatory filter has the explicit requirement that no known necessity mechanism can be responsible for what we observe. Of course, that is especially important for results based on order and regularity, and the problem is not really relevant for the type of information that we observe in language, software, machines and proteins.
However, neo-darwinism has been claiming for decades that a special type of mechanism based on RV and NS, where NS is the necessity part, can explain that kind of functional information. If that were true, it would of course be a falsification of the design inference, at least for biological objects. That’s why ID has to deal with the neo-darwinian model: to show that it is no credible explanation for the complex functional information in biological objects.
You say that you are not “very knowledgeable about biochemistry”, and that you will “happily leave that argument to others”. But it’s an important part of the discussion. Why?
Because my statement, the statement upon which ID is essentially founded, is that no object with 500 bits of functional information can be generated in a non design system. It is a strong statement, one that invites all to provide even one single counter-example.
The argument is completely empirical: no such object has ever been observed to arise without a design intervention. You will not find any exception, anywhere.
I have also explained that, to avoid wrong interpretations, we must refer to new and original complex functional information. And I have explained in detail what it means:
new = the sequence information must be unrelated to what already exists in the system
original = the functional specification must be a new function, and not only a tweaking of an existing function
I have also explained that order coming from necessity laws cannot be considered complex functional information (for example, an ordered sequence of heads which can be explained by an unfair coin).
But these are all minor clarifications, just to avoid the usual misinterpretations of the concept.
You may also want to read what I say at #382 about the computation of pi, just to have some other information about my position.
I think that my position is better represented in your Possibility 2. But with some important clarifications.
I never, never use “function” as a generic word. Everything has some function. But that’s not what is discussed here. I discuss complex functional information, and it is always computed for one explicitly defined function, including a minimal level of it.
You say: “gpuccio does not rule out that the region could be defined by a high level of function, with lower levels of function in sequences outside of the region, so that there could be paths allowing evolution to reach the target region of sequences.”
And then you quote some reflections of mine about that. But you seem not to understand my point.
My point is that for complex functions there is no path that leads to them. It is true that we have to set a minimal level of function to define it and to compute the related functional information. But that is not to isolate a peak of high function from gradual lower levels. It’s to define any relevant level of function, and distinguish it from what is essentially irrelevant function.
For example, look at my recent OP:
Isolated complex functional islands in the ocean of sequences: a model from English language, again.
https://uncommondescent.com/intelligent-design/isolated-complex-functional-islands-in-the-ocean-of-sequences-a-model-from-english-language-again/
Consider my example of paragraph P. Of course, we can define a broader function for it, like for example “being made of English words”. That gives us a bigger functional island, of course. But, in our context, the important thing is that paragraph P must convey some specific and correct information about the issue that is debated there. It’s of no use to have a paragraph that is made of English words, but does not mean anything. Or that just conveys information about a soccer game.
That’s why we define function as an upper tail, as Szostak correctly suggests. Because we are making empirical science, not philosophy or mathematics. We are interested in the real thing, in results, not in abstract discussions. So, we define what is really functional in the context, and give a way to measure its minimal useful level.
In the case of a neo-darwinian model, of course the only useful function is:
a) That the variation can be naturally selected
AND:
b) That the variation is building the final functional sequence.
There is not one single example of such a pathway that can lead to a new and original complex protein. Those pathways simply do not exist. They do not exist for language, as they do not exist for software. And they do not exist for proteins.
This is not a theorem. It is an observed fact. Falsifiable, of course. Please, falsify it.
In science, we base our inferences on facts. Not on theorems. Facts rule.
That’s just to start. I will go on as soon as I have time.
By the way, I had asked you to comment on my model (the thief and the safes), which was an explicit criticism of a very important point proposed by you. I see no answer to that in your article. Why? And yet, it is absolutely relevant to the discussion here.
gpuccio
May 24, 2018, 09:04 AM PDT
gpuccio
I have not restricted anything. A 500 bits function requires at least 500 specific bits to be implemented. So, the function is not there if those bits are not there. If someone can show that the function can be implemented with, say, 100 bits, then the functional complexity of the function is 100 bits, and not 500 bits.
I think this is the key point. Selectable steps are just additional sequences that have function. You would subtract these from the total sequence space in order to get the functional sequence space. If that number is 500 bits then you can infer design. I think you may be selling yourself short in saying that there is no mathematical confirmation, as there are clearly too many trials and too few resources if a system really contains 500 bits of information. The only question in my mind at this point is how accurate your measurement is, given your method of choosing preserved sequences.
bill cole
May 24, 2018, 08:55 AM PDT
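For readers who want to see the arithmetic behind the "500 bits" figure discussed in the comments above, here is a minimal sketch. It assumes the Hazen/Szostak-style definition of functional information, i.e. minus log2 of the fraction of sequences that meet the function threshold. The function names and the toy numbers are illustrative assumptions, not values computed by anyone in this thread.

```python
import math


def functional_information_bits(n_functional, n_total):
    """Return -log2(n_functional / n_total), in bits.

    n_functional: number of sequences meeting the defined function threshold
    n_total: size of the whole sequence space
    (Both are hypothetical inputs; estimating them is the hard part.)
    """
    if not 0 < n_functional <= n_total:
        raise ValueError("need 0 < n_functional <= n_total")
    # Subtracting logs avoids forming a tiny float ratio.
    return math.log2(n_total) - math.log2(n_functional)


def sequence_space_bits(length, alphabet_size=20):
    """Bits needed to specify one sequence of the given length exactly."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    # Toy case: one functional sequence in a space of 2**500.
    print(functional_information_bits(1, 2**500))
    # Toy case: a 150-residue protein over 20 amino acids.
    print(round(sequence_space_bits(150), 1))
```

If someone showed that the functional set is larger than assumed, `n_functional` goes up and the bit count goes down, which is the point made in the quoted passage about a function implementable with 100 bits instead of 500.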
Bob O'H @ 384: It seems this post would be better suited to gpuccio's "Islands of Function" OP. Anyway:
There is so little empirical support that it was reviewed 4 years ago. More “no empirical support” has accumulated since then.
Per the abstract:
The genotype–fitness map (that is, the fitness landscape) is a key determinant of evolution, yet it has mostly been used as a superficial metaphor because we know little about its structure.
It sounds like your citation is confirming that which you object to, admitting that at the time of this "review", they knew practically nothing.
This is now changing, as real fitness landscapes are being analysed by constructing genotypes with all possible combinations of small sets of mutations observed in phylogenies or in evolution experiments. In turn, these first glimpses of empirical fitness landscapes inspire theoretical analyses of the predictability of evolution. Here, we review these recent empirical and theoretical developments, identify methodological issues and organizing principles, and discuss possibilities to develop more realistic fitness landscape models.
Ok, so they're finally getting around to hammering it out. That's excellent. So, are they finding handy ladders? Or are they finding islands of function?
LocalMinimum
May 24, 2018, 07:34 AM PDT
Bob O'H:
1. ID is a theory of intelligent design, not evolution.
And yet ID is OK with evolution by design being able to produce IC
What is the clear rationale, supported by known facts, that say that an intelligent designer can’t mimic evolution?
What does that even mean? Clearly after all of these years you still don't know what is being debated here. Again, if the intelligent designer mimicked unguided evolution then there wouldn't be any evidence for ID. And if unguided evolution can produce what ID claims required a designer, the design inference is falsified due to Newton's four rules, i.e. science 101.
ET
May 24, 2018, 06:18 AM PDT
Bob, I only read the last few posts, but don't you have it backward? You seem to be asking how we can know when something is designed, even though we make design inferences all the time and ID merely attempts to quantify the qualities of design according to scientific standards. Shouldn't the question you are asking be how nature can turn chaos into high complexity, and why we should believe it can?
tribune7
May 24, 2018, 06:05 AM PDT
Bob O’H at #384 and #386. You say: "What is the clear rationale, supported by known facts, that say that an intelligent designer can’t mimic evolution?" Of course an intelligent designer can mimic unguided evolution, if he so decides. In that case, he will design only simple microevolutionary events, so that his design is not detectable. And so?
Then you say: "unless ID specifically claims that an intelligent designer doesn’t mimic unguided evolution (and I’ve been told repeatedly that ID says nothing about the designer), this can’t be a falsification of ID." As said many times, a falsification of ID is to show that some non design system can generate an object exhibiting complex functional information, IOWs an object which would be considered designed according to the ID procedure. IOWs, a false positive.
I really cannot believe that you still stick to such blatant errors of reasoning. I believe you are in good faith, and I think you are intelligent, so I really cannot understand why it happens.
gpuccio
May 24, 2018, 04:36 AM PDT
Bob O'H at #384: Sometimes it seems that you don't even try to understand what we are discussing. If you read (and understand) my comments #382 and #383, it should be easy to see that I am discussing the 500 bits rule, as quoted by Joe Felsenstein. I quote the relevant part:
So, if I observe a function that requires 500 bits to be there, there is no gradual way of implementing it. As, in the thief example, there is no gradual way to find the key to the big safe by step by step attempts. The same is true of a new protein, unrelated to existing ones at the sequence level, and with a new function: if the transition to the new functional protein requires at least 500 new bits of functional information, it cannot be achieved by gradual increasingly functional steps. Like the key for the big safe. Of course, darwinists can imagine that some pathway (ladder) exists, maybe passing through completely unrelated functions. But there is no reason to believe such a weird idea, and it has never been observed. IOWs, it is a myth, with no rationale and no empirical support.
IOWs, we are talking of a selectable pathway to a 500 bit new function. In answer to that, you quote a paper which has nothing at all to do with that question. I quote from the paper:
Weinreich and collaborators demonstrated the implications of sign epistasis by constructing and analysing a fitness landscape that involved five mutations in the beta-lactamase TEM, which collectively gave rise to bacterial resistance to a novel antibiotic26. Only 18 of the 120 possible 5-step mutational trajectories from wild type to high-resistance enzyme were accessible under strong selection, and the single most likely trajectory would be used in almost half of the cases (discussed in detail below).
This is the kind of "landscapes" that are discussed in the paper: microevolutionary landscapes, where a few simple transitions tweak a simple starting function. In the case of the beta-lactamase, a single starting mutation confers the function, and 4-5 selectable mutations tweak it. This is the exact scenario that I have discussed in detail in my OP:
What are the limits of Natural Selection? An interesting open discussion with Gordon Davisson
https://uncommondescent.com/intelligent-design/what-are-the-limits-of-natural-selection-an-interesting-open-discussion-with-gordon-davisson/
Again from the paper you quote:
The second approach involves the systematic analysis of all possible combinations of a small, predefined set of mutations (FIG. 2). This approach explores a tiny part of genotypic space, but the information obtained is complete and allows the probability of mutational trajectories to be quantified and compared. Below, we focus on systematic studies that adopt the second approach and their use in analyses of evolutionary predictability. Currently, there are <20 systematic studies of empirical fitness landscapes, but this number is rapidly growing10,27. These studies analyse interactions among three17 to a maximum of nine mutations28, which occur either in a single gene17,26,28–38 or operon39, or across genes in a bacterial40,41, fungal22,42 or fly genome43.
As you can see, all those things have no relevance at all to what I was discussing.
gpuccio
May 24, 2018, 04:14 AM PDT
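The figure of 120 trajectories in the passage quoted in the comment above is simply 5!, the number of orders in which five mutations can be acquired one at a time. As a minimal sketch, the Python below enumerates those orders and counts how many are "accessible" in the sense that every single step increases fitness. The two fitness functions are toy assumptions made up for illustration. They are not the measured TEM beta-lactamase data, so the counts they give say nothing about the 18 reported in the paper.

```python
from itertools import permutations


def trajectories(n_mutations):
    """All orders in which n mutations can be added one at a time."""
    return list(permutations(range(n_mutations)))


def is_accessible(order, fitness):
    """True if fitness strictly increases at every step of the order."""
    genotype = frozenset()
    current = fitness(genotype)
    for mutation in order:
        genotype = genotype | {mutation}
        following = fitness(genotype)
        if following <= current:
            return False
        current = following
    return True


def count_accessible(n_mutations, fitness):
    return sum(is_accessible(o, fitness) for o in trajectories(n_mutations))


def additive_fitness(genotype):
    """Toy landscape: every mutation helps, no epistasis (hypothetical)."""
    return len(genotype)


def epistatic_fitness(genotype):
    """Toy sign epistasis: mutation 0 is harmful unless mutation 1 is present."""
    score = len(genotype)
    if 0 in genotype and 1 not in genotype:
        score -= 2
    return score


if __name__ == "__main__":
    print(len(trajectories(5)))                    # 5! orders
    print(count_accessible(5, additive_fitness))   # no epistasis
    print(count_accessible(5, epistatic_fitness))  # one sign-epistatic pair
```

In this toy setup a single sign-epistatic pair already blocks every order in which mutation 0 precedes mutation 1, which is the general kind of constraint the quoted paper measures on a five-mutation landscape.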
Bob O'H @ Bob, you are mistaken, ID does make claims about unguided evolution. Perhaps you have not read the following section of the uncommondescent website: ID Defined. For your convenience, the most relevant part:
"The theory of intelligent design (ID) holds that certain features of the universe and of living things are best explained by an intelligent cause rather than an undirected process such as natural selection."
Origenes
May 24, 2018, 03:43 AM PDT
Origenes -
1. ID is a theory of intelligent design, not evolution.
2. Unless ID specifically claims that an intelligent designer doesn't mimic unguided evolution (and I've been told repeatedly that ID says nothing about the designer), this can't be a falsification of ID.
Bob O'H
May 24, 2018, 03:13 AM PDT
Bob O'H @384
GPuccio: We have no reasons at all, either rational or empirical, to believe that NS can do that [create a 500 bits function]. Of course, the fans of NS can try to show that it can do it. That would be a falsification of ID.
Bob O'H: What is the clear rationale, supported by known facts, that say that an intelligent designer can’t mimic evolution?
A misguided question for two reasons:
1. ID claims that unguided evolution cannot produce a 500 bits function, so, according to ID, there is nothing to "mimic."
2. ID does not claim that an intelligent designer mimics unguided evolution.
Origenes
May 24, 2018, 02:50 AM PDT
Of course, darwinists can imagine that some pathway (ladder) exists, maybe passing through completely unrelated functions. But there is no reason to believe such a weird idea, and it has never been observed. IOWs, it is a myth, with no rationale and no empirical support.
There is so little empirical support that it was reviewed 4 years ago. More "no empirical support" has accumulated since then.
Not at all. We have no reasons at all, either rational or empirical, to believe that NS can do that. Of course, the fans of NS can try to show that it can do it. That would be a falsification of ID. As discussed recently with Bob O’H, that’s the reason why ID is absolutely falsifiable.
What is the clear rationale, supported by known facts, that say that an intelligent designer can't mimic evolution?
Bob O'H
May 24, 2018, 01:43 AM PDT
bill cole: A few more reflections.
Joe Felsenstein says: "Or has gpuccio restricted the 500-Bits-Rule somehow, such as requiring that all sequences outside of the target set have no function at all?"
I have not restricted anything. A 500 bits function requires at least 500 specific bits to be implemented. So, the function is not there if those bits are not there. If someone can show that the function can be implemented with, say, 100 bits, then the functional complexity of the function is 100 bits, and not 500 bits.
So, if I observe a function that requires 500 bits to be there, there is no gradual way of implementing it. As, in the thief example, there is no gradual way to find the key to the big safe by step by step attempts. The same is true of a new protein, unrelated to existing ones at the sequence level, and with a new function: if the transition to the new functional protein requires at least 500 new bits of functional information, it cannot be achieved by gradual increasingly functional steps. Like the key for the big safe.
Of course, darwinists can imagine that some pathway (ladder) exists, maybe passing through completely unrelated functions. But there is no reason to believe such a weird idea, and it has never been observed. IOWs, it is a myth, with no rationale and no empirical support.
Joe Felsenstein also says: "Or has he dodged the whole issue by only defining CFI to be present if we already know that natural selection cannot reach the set?"
Not at all. We have no reasons at all, either rational or empirical, to believe that NS can do that. Of course, the fans of NS can try to show that it can do it. That would be a falsification of ID. As discussed recently with Bob O'H, that's the reason why ID is absolutely falsifiable. But there is no reason that we have to demonstrate (mathematically) that NS cannot do it. As already said, we don't need a mathematical falsification to ignore a myth which is not supported either by reason or by facts.
"Arguing one case, as you do, does not address the issue of whether the 500-Bits Rule is valid in all cases."
I am not "arguing one case". I am arguing that the 500-Bits Rule is valid in all known cases. I am afraid that Joe Felsenstein is again confused about the nature of empirical science: empirical science is not mathematics. In empirical science, an explanation is not interesting just because it has not been mathematically proven impossible. An explanation is interesting only if it has explanatory power, IOWs if it is suggested by a clear rationale, and if it is supported by known facts. Neither thing is true for NS as an explanation of complex functional information. Both things are true for design as an explanation of complex functional information.
gpuccio
May 24, 2018, 01:20 AM PDT
bill cole: The answer is rather easy: The 500 bit rule is an empirical observation. The connection between functional complexity and design is an empirical observation. There is not one single known counter-example where 500 bits of new and original functional information can arise without any conscious design intervention. The explanation is simple: new and original complex functions can never be reached by step by step increases of function. Being an empirical observation, there is no need of any "mathematical proof". It just works in all known cases.
The only case where functional complexity can increase without any new conscious intervention is a computational system which has already been designed. For example, as I often say, a software that can compute the figures of pi will output, in time, increasingly complex outcomes (a greater number of figures of pi). But there is no increase of the functional information in the system, because the Kolmogorov complexity of the system remains the same. In that example:
a) The functional specification has already been set: it is not "original".
b) The increase of complexity in the outcome is computationally achieved, and the computation method is already embedded in the system (designed).
The old procedure outlined in Dembski's explanatory filter (excluding cases where the result can be achieved by a necessity mechanism operating in the system) is more than enough to eliminate those cases. But a software that has been designed to compute the figures of pi can only do what it has been designed to do. It cannot program a spreadsheet, or demonstrate a theorem, or anything else. IOWs, a new and original function cannot arise from an existing complex function, completely different in specification and implementation. There is not a mathematical proof of that (at least, I cannot provide one). But it is empirically true.
The idea that 500 bits of new and original complex information can be generated by a step by step ladder of increasingly functional states is simply a myth: something that has never been observed (fact), and never will be (prediction). We don't need a mathematical proof to ignore a myth: a myth is simply irrelevant in empirical science.
By the way, has Joe Felsenstein answered my argument about the thief? Has he shown how complex functional information can increase gradually in a genome? Or does he think that we need a mathematical proof that my thief will never find the key to the big safe, and that he should rather stick to working on the many smaller safes? Just to know. (For those who have not followed the thief discussion, please look at my comment #65 here, and the quoted discussion in the Ubiquitin thread).
gpuccio
May 24, 2018, 12:16 AM PDT
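The pi example in the comment above can be made concrete. What follows is a minimal sketch using Gibbons' unbounded spigot algorithm, chosen here purely for illustration (the comment does not name any particular method). It is a fixed, short program whose output keeps growing for as long as it is left to run. The helper name `first_pi_digits` is an assumption of this sketch.

```python
from itertools import islice


def pi_digits():
    """Yield the decimal digits of pi one at a time (Gibbons' spigot)."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is settled: emit it and rescale the state.
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            # Not enough precision yet: absorb one more term of the series.
            q, r, t, k, n, l = (
                q * k,
                (2 * q + r) * l,
                t * l,
                k + 1,
                (q * (7 * k + 2) + r * l) // (t * l),
                l + 2,
            )


def first_pi_digits(count):
    """Return the first `count` digits of pi as a list of ints."""
    return list(islice(pi_digits(), count))


if __name__ == "__main__":
    print(first_pi_digits(10))
    # The output length is whatever we ask for; the program text is unchanged.
    print(len(first_pi_digits(200)))
```

Asking for more digits lengthens the output while the generating code stays exactly the same, which is the distinction the comment draws between the growing outcome and the information already embedded in the system.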
gpuccio Here is Joe's response to me. Very interesting discussion.
Joe Felsenstein May 24, 2018 at 1:10 am colewd, You are arguing one case, in a Michael Behe style argument. But the issue I am raising is whether there is some mathematical proof that all cases where we can have a set of sequences that have functional information greater than 500 bits cannot be reached by natural selection acting on less-functional sequences that are outside the set. Is there a mathematical proof? Something like William Dembski’s Law of Conservation of Complex Specified Information? (Like his, but not the same — his does not do the job). Or has gpuccio restricted the 500-Bits-Rule somehow, such as requiring that all sequences outside of the target set have no function at all? Or has he dodged the whole issue by only defining CFI to be present if we already know that natural selection cannot reach the set? Arguing one case, as you do, does not address the issue of whether the 500-Bits Rule is valid in all cases.
bill cole
May 23, 2018, 05:54 PM PDT
gpuccio Here is a piece from an OP by Joe Felsenstein that I commented on. Any thoughts would be appreciated.
Joe Felsenstein: We are asking here whether, in general, observation of more than 500 bits of functional information is “a reliable indicator of design”. And gpuccio’s definition of functional information is not confined to cases of islands of function, but also includes cases where there would be a path along which function increases. In such cases, seeing 500 bits of functional information, we cannot conclude from this that it is extremely unlikely to have arisen by normal evolutionary processes. So the general rule that gpuccio gives fails, as it is not reliable.
Bill: In the cases that gpuccio supplied the proteins were part of a multi protein complex. They bind to other proteins and support the function or they don’t. Their sequence specificity is dependent on the proteins they bind with. If function here is either working or not working how would you argue there is any hill to climb?
bill cole
May 23, 2018, 04:08 PM PDT
DATCG (377): Very interesting. Thanks.
OLV
May 6, 2018, 12:40 AM PDT
gpuccio (373): That’s fascinating indeed. Thanks.
OLV
May 6, 2018, 12:38 AM PDT
OLV,
Re: lncRNA and Epigenetics, you might enjoy the PDF link at bottom to a presentation or collection of slides by Professor John Mattick. He's been ahead of the curve and out front on failures of the Central Dogma and gene-centric views. This covers a bit of history and important progress. If interested, he reviews paper highlights, for example on lncRNA and other epigenetic factors, RNA editing related to brain function, cancer, etc., as well as ALUs, a really interesting finding about halfway down. Search on ALU in the presentation. Most papers are several years old, but his interpretations are interesting in highlights. Nothing to do with Central Dogma, and you can see why the title states that past assumptions were wrong. But what do we often read today, still, from the neo-Darwinist faithful? They're still relying on antiquated beliefs and assumptions that were/are wrong.
Slides are from 2013, and the title is golden: Most Assumptions in Molecular Biology are Wrong. Clear cut and to the point. How refreshing!
What Darwinists dismissed for so long as "Junk" DNA will become a more important driver of new medical treatments, especially for individual Genomes/Epigenomes. Why? Because these "non-coded" regions are abundantly involved in regulatory functions and often cause disease if mutated. And our individual epigenomes differ in key areas of erroneously categorized "JUNK" DNA.
What follows is a frank assessment in history of wrong assumptions:
The Central Dogma (Crick, 1958) refers to the flow of genetic information from DNA > RNA > protein. The assumption, based on studies of the lac operon in E. coli, has been that genes are synonymous with proteins and that most genetic information, including regulatory information, is transacted by proteins. This protein-centric view reflects a mechanical orientation and has led to several subsidiary assumptions, despite a number of subsequent surprises that should have given pause for thought.
Agreed, this is why many see a need for Extended Evolutionary Synthesis at least, or better, Modern Synthesis replacement! As Denis Noble has argued.
Surprise #1: Genes in humans and other complex eukaryotes are mosaics. Interpretation: Introns, despite the fact that they are transcribed, are ‘junk’.
Golden! I went looking for function in Introns in Gpuccio's Spliceosome OP on Alternative Splicing and found it. Surprise? So what - Darwinist answer? "JUNK" Ooof that hurts!
Surprise #2: Eukaryote genomes are full of transposon-derived sequences. Interpretation: These sequences are mainly non-functional ‘selfish’ DNA. (!)
Again, non-functional answer? Yes, and "selfish" oy!
Surprise #3: Gene number does not scale with developmental complexity. Interpretation: Combinatorial control of transcription, alternative splicing etc. can explain ….?
By ignoring non-coded regions as "JUNK" DNA, they missed the bigger picture of complexity and regulatory control systems.
The genetic basis of human development
- Humans (and other vertebrates) have approximately the same number of protein-coding genes (~20,000) as C. elegans
- Most of the proteins have similar functions from nematodes to humans, and many are common with brewer's yeast
- Where is the information that programs our complexity?
That last one is a good question! Gee, what about "JUNK" DNA regions? On Slide 4(page 4).
- The biggest surprise of the genome projects was the discovery that the number of orthodox (protein-coding) genes does not scale strongly or consistently with complexity: The proportion of noncoding DNA broadly increases with developmental complexity (See Graphic Scale of Non-coded regions increasing up thru Vertebrates)
Hmmmm... seems like a good place to look for function! Also, check out Slide/Page 9 for intergenic regions of "gene deserts," then Table 2 page 11, Functionality of ncRNAs. He ends with the following future scenario..
Within a decade or two, individual genome sequences will be part of everyone’s medical record, and be integrated with other data in mobile electronic records that are both personal and part of larger databases that are used to inform health economics, insurance/underwriting, strategies for reducing disease burdens and costs, and deployment of resources.
The PDF Link: Most Assumptions in Molecular Biology are Wrong.
DATCG
May 5, 2018, 03:14 PM PDT
Local Minimum @372, Nice :) Electrostatic came to mind for conformational forces involved in protein folding, but I didn't bother to pursue it. Which explains why I was always replacing my starter in high school. My friend would often jump the starter, or was it the solenoid? Ah, solenoid. Talk about "Junk" - my old cars during high school! And yet, they still had function ;-) Usually the function was to deplete the holdings in my wallet!
DATCG
May 5, 2018, 03:03 PM PDT
LocalMinimum: "Sounds like a combination of a keyhole and a transport solenoid, using electrostatic tumblers instead of a magnetic field to carry the key and an attached package straight through the hole." Well, that's a metaphor! :) Thank you!
gpuccio
May 5, 2018, 02:03 PM PDT
DATCG: Extremely cool! :) The Nuclear Pore Complex is one of the wonders of the eukaryotic cells. But I was not aware of the “transport paradox”, and of the possible role of IDPs in explaining it. Great stuff. I will read it very carefully. :)
gpuccio
May 5, 2018, 02:01 PM PDT
OLV at #369: Very interesting paper. The study of RNA modifications is really still in its infancy, and I am sure that we will see great things in this field in the next few years! :)
gpuccio
May 5, 2018, 01:39 PM PDT
DATCG @ 370: Sounds like a combination of a keyhole and a transport solenoid, using electrostatic tumblers instead of a magnetic field to carry the key and an attached package straight through the hole.
LocalMinimum
May 5, 2018, 01:28 PM PDT
OLV @369, Cool, more regulatory control! Ha! :) Say it ain't so. Frontiers is fun stuff. Thanks, I'll file it away for a look when I have time. I noticed they mentioned lncRNA. CircRNA is another interesting RNA in brain function, also associated with disease if mutated, that most likely will be wrapped up in the Epitranscriptome. Along with many others.
It's funny that the authors quote Darwin at the end in an attempt to tie it to neo-Darwinist story-telling. They based this not upon knowledge, but upon "Darwin-of-the-Gaps" story-telling. It could as easily be common design techniques. The Epigenome and regulatory code, and now the Epitranscriptome, scream design. Regulatory control systems have a need to Know - up front - for decision processing. It's not a system designed for friendly mutation. This is so far from anything related to Darwinism or neo-darwinian blind, mutational events.
There's a reason some Darwinists are so upset with the ENCODE project and Epigenetics, and with function within regions formerly declared "JUNK". The more function, the more Editing, Splicing, multi-functional, overlapping regions there are, the less likely there's any room for fudge factors of random mutations being an innovator of novel forms.
DATCG
May 5, 2018, 12:35 PM PDT
Gpuccio, Apologies for going off topic on this comment, but hopefully a bit on Target ;-) You originally linked to a paper in your Ubiquitin OP, comment #10, on Design Principles of Protein "Disorder" facilitating specificity for substrates:
Design Principles Involving Protein Disorder Facilitate Specific Substrate Selection and Degradation by the Ubiquitin-Proteasome System*
The resulting discussions and additional research papers on IDR/IDPs convinced me they are a good example of Design features in the cell. IDPs allow a flexible folding solution that is conditional with specificity. Curious, I wondered how else these Design Principles of IDPs could be utilized in cellular processing. Search results turned up the Nuclear Pore Complex (NPC): a controlled entry gateway using IDPs that verify cargo transport into and out of the nucleus. The NPC defends entry against viruses while allowing approved cargo to be transported to the interior of the nucleus...
https://www.sciencedaily.com/releases/2018/03/180327132011.htm
Summary:
Cells can avoid 'data breaches' when letting signaling proteins into their nuclei thanks to a quirky biophysical mechanism involving a blur of spaghetti-like proteins, researchers have shown.
The "quirky" mechanisms are Intrinsically Disordered (badly named) Proteins. Conditional and Flexible Folding Proteins. Continued...
In every human cell, all of the body's blueprints and instructions are stored in the form of DNA inside the nucleus. Molecules that need to travel in and out of the nucleus -- to turn genes on or off or retrieve information -- do so through passageways called nuclear pore complexes (NPCs). Traffic through these NPCs must be tightly controlled in order to prevent DNA hijacking by viruses or faulty functioning as in cancer. To travel through NPCs, many molecules must be attached to proteins called transport factors (TFs), which act as shuttles that the NPC recognizes. But the NPC faces a challenge: It must accurately recognize and bind to TFs to let them through without admitting unwanted traffic, but it must let them through quickly -- in a matter of milliseconds -- in order for the cell to be able to do its duties. Proteins known to accurately bind to specific molecules, like antibodies, normally stay stuck to their targets for periods of up to months. "How on Earth do you have the kind of specificity that we see in protein-protein interactions like antibodies, and yet have the kind of speed that we see with water off a Teflon pan?" asked Michael Rout, professor at Rockefeller University who was one of the co-lead authors of the work.
Voila - IDPs - Conditional and Flexible Folding Proteins. IDP - does not capture the functional aspect of these wonderfully designed proteins. continued...
They found that the key to this interaction being so specific, yet fleeting, was in many quick, transient contacts between transport factors and FG Nups. Similarly to the threads and hooks of Velcro, each amino acid pair of the FG Nup region only attached to the transport factor very weakly, with an overall result of affinity between the two partners; but unlike Velcro, the partners were not stuck together longer than necessary for the transport factor to travel through the nuclear pore. "I can't think of any analogy in normal life that does what this does," Rout said. "You've got this blur of (amino acids) coming on and off (the transport factor) with extraordinary speed."
Another analogy might be a Password/ID entry system? Like an ID Card for employees or hotels that must gain entry with the slide of a card. Flexibly programmable, quick, but specified, and it can be pre-programmed for different access levels to different departments or floors (cells), etc., keeping out intruders.
If Flexible IDPs were not available, what would happen? Rigid Protein folding might shut the system down? Like inefficient hard-coding, rigid folds for dynamic interactions would be inefficient and a pain to maintain, hurting resources and inhibiting ease of modular and functional expansion. It appears to be a designed system for modular functionality. Flexible, yet specific. IDPs give greater efficiency in the Code. From a Design heuristic, it makes sense, as in the NPC, the UPS and other functional processing systems for signal recognition. A one-to-many or many-to-one relationship is a requirement for a fast, flexible approach. IDRs and IDPs meet that criterion. I think looking for Design in cellular systems pays off.
The paper link... http://www.jbc.org/content/293/12/4555.full
Abstract:
Intrinsically disordered proteins (IDPs) play important roles in many biological systems. Given the vast conformational space that IDPs can explore, the thermodynamics of the interactions with their partners is closely linked to their biological functions. Intrinsically disordered regions of Phe–Gly nucleoporins (FG Nups) that contain multiple phenylalanine–glycine repeats are of particular interest, as their interactions with transport factors (TFs) underlie the paradoxically rapid yet also highly selective transport of macromolecules mediated by the nuclear pore complex. Here, we used NMR and isothermal titration calorimetry to thermodynamically characterize these multivalent interactions. These analyses revealed that a combination of low per-FG motif affinity and the enthalpy–entropy balance prevents high-avidity interaction between FG Nups and TFs, whereas the large number of FG motifs promotes frequent FG–TF contacts, resulting in enhanced selectivity. Our thermodynamic model underlines the importance of functional disorder(flexibility) of FG Nups. It helps explain the rapid and selective translocation of TFs through the nuclear pore complex and further expands our understanding of the mechanisms of “fuzzy” interactions involving IDPs.
( ) emphasis mine. Not so "fuzzy" per se from a Design perspective. Fuzzy due to current technology and the unknown. But it's clearly a highly flexible structure for a purpose. It must be able to identify quickly what is coming through the entry. NPC and IDPs - really cool research to find based upon Design Principles.
DATCG
May 5, 2018, 12:07 PM PDT