Uncommon Descent Serving The Intelligent Design Community

Evolutionist: You’re Misrepresenting Natural Selection


How could the most complex designs in the universe arise all by themselves? How could biology’s myriad wonders be fueled by random events such as mutations?

Comments
Oops, from "As a first “activity” in the new year..." to "... because folded proteins are the only kind of molecules that can do those things" are gpuccio's words, not mine. Sorry for the confusion (if a mod could fix the tags that would be awesome :))
Elizabeth Liddle
January 1, 2012 at 04:14 AM PDT
Elizabeth: First of all, happy New Year to you and all!
It's looking good so far! I wish the same to you.
As a first “activity” in the new year, I would like to make some comments about function and folding in proteins, to add to the interesting discussion you are having with MI. I see things this way: You are right to say that there are many functions that do not depend on protein folding. First of all, it is perfectly true that many functions do not depend on proteins at all. A lot of other molecules are highly functional in the biological context. Some of them are much simpler than proteins. Among others, the regulatory importance of small peptides and small DNA or RNA sequences has been proved. But that’s not the point. The point is not that function requires proteins, or folded proteins. The point is that a lot of specific biochemical functions, absolutely fundamental for any living beings, do require folded proteins, because folded proteins are the only kind of molecules that can do those things.
But that is circular. It's like saying that I require two legs because without two legs I couldn't do the things I need two legs to do. But actually, it's perfectly possible to do lots of things without two legs, just not those things specific to two legs. This is an example of "drawing the target round the arrow". You see a folded protein, and you say: this protein is essential, because without that protein, this function couldn't be performed. Sure. But there may be many useful functions that could be performed by some protein we don't happen to have, yet we get on fine. For example, we can't synthesise Vitamin C. Most mammals can. Is it required? No. Would it be useful? Yes. Moreover, if a population evolves to rely on something (being able to synthesise vitamin C, for instance) and then loses it, then it could be disastrous. However, if it evolves to do without, then it won't be. So you can't remove a part, watch the system fail, then say: "therefore this part is essential to all such systems". That is what is wrong with Behe's argument. If you remove the keystone of an arch, the arch fails. That does not mean that arches are unbuildable.
Think of all the enzymatic activities on which life is based: metabolism, DNA duplication, transcription, translation, cell cycle, membrane receptors, membrane transport, movement, and so on. All these things are realized by the contribution of thousands of individual proteins, each of them very complex, each of them efficiently folded. In no other way could those things be attained.
Again, you are working backwards from a functioning system. My first car was extremely simple, and if it stopped working it was easy to fix. If the starter motor jammed, as it frequently did, I just dunted it with a hammer I kept in the door compartment for the purpose. Now I have a car that is hugely complex. It has automatic gears, and decides itself whether to run on battery or petrol (it's a Toyota Prius). It tells me if there is something behind me when I back up, and won't go further if there is. And if any one thing goes wrong with it, the whole thing stops (which fortunately hasn't happened to me yet) and has to be towed in for specialist repair. It is easy to damage, fatally, a complex system by removing a single part. That does not mean that any simpler system will not function. This is the heart of Behe's fallacy.
So, we need not equate function with protein folding. But we can certainly say that a special subset, a very big subset, of all biological functions, and especially non trivial biochemical activities, require some well folded protein to be achieved. The living world as we know it could never exist without folded, biochemically active proteins.
And this, as I said, is fallacious. It's one of several fundamental mistakes made by ID proponents, that render the argument untenable.
Again I must refuse here your attempts to define function in relation to reproduction. I do define function as a well recognizable, non trivial biochemical activity that can allow us, even in the lab, even in the absence of any life, least of all reproduction, to obtain important biochemical results that could never occur spontaneously, like the enzymatic acceleration of biochemical reactions.
You can define it how you like as long as we are clear :) But if you are going to call the activity of an enzyme a "function" when it is sitting in a test-tube, then you are going to have to define "spontaneously" as well. Are you saying that an enzyme in a test-tube has to be asked politely in order to perform its function, unlike, say, an inorganic catalyst? I don't think so :) So would you also say an inorganic catalyst has a "function"? If not, why not? If so, then how does "function" in your sense differ from "physical/chemical properties"?
So, these objectively definable biochemical functions do need complex folded proteins. We know no other way in the world to do those things. So, the subset of folding and biochemically functional proteins is certainly an important subset of the protein space, without which no life as we know it could exist.
Life as we know it, sure. But we aren't talking about "life as we know it". We are talking about life before we knew it. Life as we have to infer it was in the past, based on fossil evidence and evidence from biochemistry and geology and other sciences. It is, as I have said, completely fallacious to assume that only "life as we know it" is viable. It would, as I've said elsewhere, be assuming your conclusion.
It is true that folding does not ensure function. A folded protein, to be functional, must have further requisites, like one or more active sites, and in most cases, the ability to undergo specific, useful, conformational modifications as the result of active site interaction. Many proteins that fold have no non trivial biochemical activity.
And many biological catalysts are not proteins at all, yet by your definition have a "function". And some proteins with "trivial biochemical activity" nonetheless have a "function" (in terms of effect on the phenotype).
So, there is no doubt that we can define a subset of protein space that is the subset of well folding proteins, and a subset of that subset, that is the space of folding proteins with non trivial biochemical function. And we can certainly affirm that this last subset is absolutely indispensable to life as we know it. Indeed, it is certainly the most important subset of biological molecules, as far as biological biochemical activity is considered.
By defining "function" so idiosyncratically, you are, IMO, tying yourself in knots. Either "function", as you define it, applies to any compound with any biochemical effect or you need to unpack "non trivial", which, it seems to me, lands you back in phenotypic effects. You have talked about these proteins being "necessary for life" and yet in the same breath you say that the effect on reproductive success is irrelevant. How can a protein that has no effect on reproductive success have a function that is anything other than "trivial"? And how can a compound, regardless of the complexity of its biochemical activity or lack of it, that has an effect on the reproductive success of the phenotype, not be "functional"? If something is "necessary for life" it must have a positive effect on reproductive success, no? And if it doesn't, then it's neither necessary nor "functional" is it? Do you see the problem?
Elizabeth Liddle
January 1, 2012 at 04:12 AM PDT
Elizabeth: First of all, happy New Year to you and all! As a first "activity" in the new year, I would like to make some comments about function and folding in proteins, to add to the interesting discussion you are having with MI. I see things this way: You are right to say that there are many functions that do not depend on protein folding. First of all, it is perfectly true that many functions do not depend on proteins at all. A lot of other molecules are highly functional in the biological context. Some of them are much simpler than proteins. Among others, the regulatory importance of small peptides and small DNA or RNA sequences has been proved. But that's not the point. The point is not that function requires proteins, or folded proteins. The point is that a lot of specific biochemical functions, absolutely fundamental for any living beings, do require folded proteins, because folded proteins are the only kind of molecules that can do those things. Think of all the enzymatic activities on which life is based: metabolism, DNA duplication, transcription, translation, cell cycle, membrane receptors, membrane transport, movement, and so on. All these things are realized by the contribution of thousands of individual proteins, each of them very complex, each of them efficiently folded. In no other way could those things be attained. So, we need not equate function with protein folding. But we can certainly say that a special subset, a very big subset, of all biological functions, and especially non trivial biochemical activities, require some well folded protein to be achieved. The living world as we know it could never exist without folded, biochemically active proteins. Again I must refuse here your attempts to define function in relation to reproduction. I do define function as a well recognizable, non trivial biochemical activity that can allow us, even in the lab, even in the absence of any life, least of all reproduction, to obtain important biochemical results that could never occur spontaneously, like the enzymatic acceleration of biochemical reactions. So, these objectively definable biochemical functions do need complex folded proteins. We know no other way in the world to do those things. So, the subset of folding and biochemically functional proteins is certainly an important subset of the protein space, without which no life as we know it could exist. It is true that folding does not ensure function. A folded protein, to be functional, must have further requisites, like one or more active sites, and in most cases, the ability to undergo specific, useful, conformational modifications as the result of active site interaction. Many proteins that fold have no non trivial biochemical activity. So, there is no doubt that we can define a subset of protein space that is the subset of well folding proteins, and a subset of that subset, that is the space of folding proteins with non trivial biochemical function. And we can certainly affirm that this last subset is absolutely indispensable to life as we know it. Indeed, it is certainly the most important subset of biological molecules, as far as biological biochemical activity is considered.
gpuccio
January 1, 2012 at 12:55 AM PDT
"Anyway, about to take my glass of champagne of to bed!"
Felicitations! =D And thanks for a generous footnote.
material.infantacy
December 31, 2011 at 05:08 PM PDT
Well, I'm certainly not trying to "smuggle" anything! I'm trying to be as obvious as I possibly can be! I think these things matter! But IF we accept, for the sake of argument, that non-folding proteins are non-functional, and that only a subset of folding proteins can be functional, and that no other sequences are functional THEN, I accept that the subset of functional sequences is probably a very small subset of total sequences. But that is a huge IF, and I don't accept any of those premises! Nor do I accept Axe's estimate. But I will concede that Miller's point is crude, and not applicable to yours, although a legitimate one to make against Dembski's argument: even though Dembski claims to have sorted it by his "specification" concept, I do not accept that he has, but that's a slightly different issue. Also a legitimate one to make against Axe. And, indeed a legitimate one against many ID arguments I have seen (those "islands of function" for instance :)) Thanks for the link! The greater-than symbol is the biggest problem as not only does it not render, it is parsed as tag code, and sometimes you lose a great chunk of post! Anyway, about to take my glass of champagne off to bed!
Elizabeth Liddle
December 31, 2011 at 04:38 PM PDT
Happy new year to you as well, Lizzie. I still don't think we're on the same page, and I believe you're smuggling in some points which have no real bearing on the issue at hand, so I'll get back to those points later if I get more time. It appears to me that, no matter how you slice it, there's a target in S which remains (at least reasonably) fixed. It's narrow, and consists of peptides which successfully fold. Within that set the probability of function is significantly greater than it is given S. Even with some rather generous considerations, function is not arbitrary with respect to S, AFAICT. If we could only put Miller's cards analogy six feet under (and those like it) I'd be happy to give it a rest.
"PS: had to get page info to find the code for < thanks for providing it!"
Yeah that one has caused me a fair amount of trouble without the code. I found this page for mathy unicode symbols (enjoy). I only wish that the tags for superscript and subscript functioned properly in the comments here. We can dream.
material.infantacy
December 31, 2011 at 04:30 PM PDT
This conversation has split into at least two parts: whether an objective target for functional sequences exists in sequence space; and whether the folding set is improbable. These are distinct issues.
Yes. Good thinking :)
Elizabeth wrote: ”Sure, a “target” exists = we know that the set of sequences that code for foldable proteins isn’t an empty set. That is trivially true, and not in dispute. If you want to add the adjective “objective” to that clear fact, then feel free, but I don’t see what it adds.” That it’s trivially true was apparent to me as well, which makes this whole conversation a wonder. It should be noted though, that the folding set is certainly not empty, but it’s also not the same as the entire sequence space. 0 < n(F) < n(S) S = {sequence space} F = {folding sequences} F1 = {functional sequences} F1 ⊆ F ⊂ S I add “objective” because it’s clear that set F exists. Not all sequences fold, and of those that do, a subset can have function. The implication is that when we observe a function, the space in which it exists is determined by physics, not by the ad hoc rationalization of the observer. It should be clear that I chose to focus on the folding set because it obviates the need to debate about what may or may not be functional in some context. Functional proteins are determined (in this sense) by naturalistic laws, because they exist as a subset of proteins that fold. One may object that not all folded proteins are functional; good. But one cannot argue that functional proteins are not folded. So we have a partition on S, consisting of F’ and F. F1 ⊆ F and F1 ⊈ F’
Yes to most of the above, but a couple of caveats: first you cannot assume that unfolded proteins or short peptides never serve any function in an organism; second, and it is related, function has to be considered in terms of the reproductive success of the phenotype in a given context. You can’t look at a protein, or a peptide, or an RNA molecule, and say; this one is functional; this one isn’t. Just because one of those things are coded by DNA and produced in a cell doesn’t mean they are functional in the only sense that matters – the reproductive success of the phenotype. So it is misleading, in my view, to characterise sequences, as you have done, simply as “folding” or “folding and functional”. There will be non-folding functional sequences and non-functional folding sequences, and non-coding functional sequences, and sequences that are folding but only functional if some non-coding functional sequence is present, and some of all the above that will only be functional if expressed in some tissue, or in some tissue in certain environments.
This means that the coins/cards analogies — that each sequence is as improbable as the next — is irrelevant, because F occurring changes the probability of F1 having occurred. None of this reasoning requires an explicit calculation, only the bare knowledge that some proteins fold, but not all do. P(F1) < P(F1|F) The above is also trivially true. Proteins exist in F, and proteins which can add function in a biological context exist in F1. If we observe F1, then we know that F occurred. If we know that F has occurred, the probability of F1 having occurred is greater. Again, this is trivially true, which is why it can’t be true that what constitutes function is arbitrary. If we observe a functional protein, we know that it exists in a narrow subset of S, because F is a narrow subset of S. (You may reason that it’s less narrow than suggested, but it doesn’t change the core argument.)
OK, but with the caveats aforementioned :)
P(F) ≠ 1/n(S) That is, if F occurs, it’s not the same as any sequence in S occurring, because n(F) > 1 and n(F) < n(F’), which is perhaps something we can agree on, unless there’s good reason to think that more sequences will fold properly than sequences which won’t.
I don’t know of a reason, but I am still concerned about the assumption that only sequences that “fold properly” can have a function. I think this is a serious flaw. If we are talking about the emergence of protein coding sequences very early in the emergence of life, even simple non-folding peptides may have had a function in promoting successful reproduction, even if later such peptide sequences turn out to be invariably non-functional. And the reason it’s important is that early functional sequences can bootstrap later more complex ones, as we see in genetic algorithms. Early genetic patterns that confer an advantage form the basis for later more complex ones which confer greater advantage, after which, the re-emergence of those early ones is disadvantageous, not advantageous.
As to the improbability of F, I’m sticking with 10^-74, which is a researched estimate. You may have reason for thinking that this estimate is flawed, and if you would like to make a case for that I’d be happy to read it; however you can revise that number down by many orders of magnitude and it doesn’t change the argument. If we say that 1 in 10^50 will potentially fold, then we still have a narrow subset F, which is highly improbable to find with a random trial.
I have a great many reasons for thinking the estimate is not only flawed, but not possible to estimate, unless we take your assumption that only foldable proteins can ever be functional, and even then, we don’t currently have the ability to estimate the proportion of foldable sequences out of all possible sequences because we don’t know how to predict folding from the sequence alone.
”We simply don’t know how improbable it is because a) we don’t know how many elements of set n(S) are also members of set n(F), and nor do we know how many draws there were.” First, this isn’t about draws/trials because I’m not talking about what evolution may or may not accomplish. I’m painstakingly stating the obvious, that there exists an objective subset of S in which functional proteins must reside. I agree that we don’t know how improbable. I’m willing to withdraw “objectively improbable” and state instead that it is “likely improbable” or something of the sort. Again, even if we take the 10^-74 estimate down to 10^-50, we’re still looking at something highly improbable.
But then your conclusion that folding proteins are “highly improbable” hangs overwhelmingly on that estimate! Which, for a great many reasons, given above, is highly flawed.
Regardless, a target space for function exists and is not ad hoc nor post hoc — because it exists in a narrow and unchanging subset of the sequence space.
No. The “target space” is not unchanging at all, and sequence space itself is constantly changing. In fact I think the problem here is that you are not placing the problem in fitness space at all, and that is critical. Whether a sequence is “functional” or not depends on whether it affects the phenotype’s position in fitness space. And fitness space is constantly changing. So, to take a toy model: if we start with a very small genome, as in a GA, we have a small sequence space, and we start with just the minimum functional sequences present to keep the population renewing itself. Now, let’s say that (as I seem to recall) DNA sequences with predominantly GC bonds are more stable than ones with predominantly AT bonds. At that point GC bonds are “functional” if organisms with more GC bonds are more likely to reproduce successfully than others. So “GC” or GG or even GAC are “functional” sequences. Later, certain sequences may result in RNA molecules that for some reason or another (perhaps they catalyse some beneficial reaction) improve reproductive efficiency, and so they become “functional” and the predominance or otherwise of GC bonds ceases to matter. Already the “target” has changed.
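To make the "moving target" idea concrete, here is a minimal toy genetic algorithm, not part of the original comment: the motif, parameter values, and selection rules are invented purely for illustration. Selection first rewards GC content; once high GC content is common, the criterion switches to rewarding a specific motif instead, so the "functional" set changes mid-run.

```python
import random

random.seed(1)
BASES = "ACGT"
MOTIF = "GACGT"                 # a hypothetical later "function"; purely illustrative
POP, LEN, GENS = 200, 30, 300   # population size, sequence length, generations (arbitrary)

def gc(seq):                    # early "function": fraction of G/C bases
    return sum(b in "GC" for b in seq) / len(seq)

def fitness(seq, phase):
    # Phase 1: GC content is what matters. Phase 2: a specific motif matters instead.
    return gc(seq) if phase == 1 else float(MOTIF in seq)

def mutate(seq, rate=0.02):
    return "".join(random.choice(BASES) if random.random() < rate else b for b in seq)

pop = ["".join(random.choice(BASES) for _ in range(LEN)) for _ in range(POP)]
phase = 1
for gen in range(GENS):
    if phase == 1 and sum(gc(s) for s in pop) / POP > 0.7:
        phase = 2               # the "target" changes once the early function is common
        print(f"generation {gen}: selection criterion switches")
    weights = [fitness(s, phase) + 0.01 for s in pop]                     # differential reproduction
    pop = [mutate(random.choices(pop, weights)[0]) for _ in range(POP)]   # plus variation

print("mean GC content:", round(sum(gc(s) for s in pop) / POP, 2),
      "| fraction carrying the motif:", sum(MOTIF in s for s in pop) / POP)
```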
If your position is that the existence of F is trivially true, hence “objective,” then we agree, and can cease straining gnats; and we can put to rest the notion that the argument “equally improbable as any other sequence” has any bearing on the observation of function. That was Miller’s argument and entirely misses the point; it is irrelevant.
No, it isn’t irrelevant! In fact it’s exactly the error Axe makes, and on which basis you concluded that folding proteins are “highly improbable”. The fact that there exist a finite number of potentially functional folding proteins (which I dispute, for reasons I have given) and that therefore constitutes an “objective” target is no use unless you know how large that set is. To be precise in the card analogy: it is as though you assert (correctly) that in a deck of unknown but finite size there are an unknown but finite number of possible hands that could win an unknown but finite number of card games. You are then dealt a hand from that deck and told that you have won one of the games. You then erroneously assume that because that number of winning hands is finite, and less than the total number of possible hands, that being dealt any one of them is highly improbable, and that therefore you have been extraordinarily lucky. You have mistaken, I suggest, the knowledge that the winning set is finite for the knowledge that it is a very tiny subset of all possible hands, and the reason you have done so is that someone (Axe in this case) has inferred from the fact that the hand you were dealt was a winner in a game he now knows exists (because you won it) that it was the only game in town. In other words he drew the target far too tightly round an observed hit.
I’m arguing against the notion that any sequence is as good as any other with respect to function. If a protein has a function, then it is a subset of F, which is a subset of S, which makes it anything but arbitrary with respect to sequence space, and likely highly improbable.
And my point is that this oversimplifies the problem to the point of falsification! The target is not unchanging, and indeed, the hitting of a target itself changes the target space. The system is, literally, chaotic, full of feedback loops. Think Markov chains if you like, or at least Bayesian probabilities, but it is really important, I suggest, to consider the system dynamically, not statically, and that where you are at any given point not only constrains where you can go next, but opens up new functional possibilities. Anyway, it’s nearly 2012 now, so a happy new year to you, and nice to talk to you! I do enjoy getting down to brass tacks! Thanks! Lizzie PS: had to get page info to find the code for < thanks for providing it!
Elizabeth Liddle
December 31, 2011 at 03:20 PM PDT
This conversation has split into at least two parts: whether an objective target for functional sequences exists in sequence space; and whether the folding set is improbable. These are distinct issues. Elizabeth wrote:
”Sure, a “target” exists = we know that the set of sequences that code for foldable proteins isn’t an empty set. That is trivially true, and not in dispute. If you want to add the adjective “objective” to that clear fact, then feel free, but I don’t see what it adds.”
That it’s trivially true was apparent to me as well, which makes this whole conversation a wonder. It should be noted though, that the folding set is certainly not empty, but it’s also not the same as the entire sequence space. 0 < n(F) < n(S) S = {sequence space} F = {folding sequences} F1 = {functional sequences} F1 ⊆ F ⊂ S I add “objective” because it’s clear that set F exists. Not all sequences fold, and of those that do, a subset can have function. The implication is that when we observe a function, the space in which it exists is determined by physics, not by the ad hoc rationalization of the observer. It should be clear that I chose to focus on the folding set because it obviates the need to debate about what may or may not be functional in some context. Functional proteins are determined (in this sense) by naturalistic laws, because they exist as a subset of proteins that fold. One may object that not all folded proteins are functional; good. But one cannot argue that functional proteins are not folded. So we have a partition on S, consisting of F’ and F. F1 ⊆ F and F1 ⊈ F’ This means that the coins/cards analogies -- that each sequence is as improbable as the next -- is irrelevant, because F occurring changes the probability of F1 having occurred. None of this reasoning requires an explicit calculation, only the bare knowledge that some proteins fold, but not all do. P(F1) < P(F1|F) The above is also trivially true. Proteins exist in F, and proteins which can add function in a biological context exist in F1. If we observe F1, then we know that F occurred. If we know that F has occurred, the probability of F1 having occurred is greater. Again, this is trivially true, which is why it can’t be true that what constitutes function is arbitrary. If we observe a functional protein, we know that it exists in a narrow subset of S, because F is a narrow subset of S. (You may reason that it’s less narrow than suggested, but it doesn’t change the core argument.) P(F) ≠ 1/n(S) That is, if F occurs, it’s not the same as any sequence in S occurring, because n(F) > 1 and n(F) < n(F’), which is perhaps something we can agree on, unless there’s good reason to think that more sequences will fold properly than sequences which won’t. As to the improbability of F, I’m sticking with 10^-74, which is a researched estimate. You may have reason for thinking that this estimate is flawed, and if you would like to make a case for that I’d be happy to read it; however you can revise that number down by many orders of magnitude and it doesn’t change the argument. If we say that 1 in 10^50 will potentially fold, then we still have a narrow subset F, which is highly improbable to find with a random trial.
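The set relations and the improbability claim above can be made concrete in a few lines. This is a minimal sketch of the commenter's own framing, using a made-up toy universe of 1,000 "sequences" rather than real proteins; the membership rules, target densities, and trial count are arbitrary stand-ins chosen only for illustration.

```python
from fractions import Fraction

# Toy universe: "sequences" are the integers 0..999; the membership rules below are
# arbitrary, chosen only so that F1 ⊆ F ⊂ S holds by construction.
S  = set(range(1000))                       # sequence space
F  = {x for x in S if x % 25 == 0}          # pretend "folding" subset (40 of 1000)
F1 = {x for x in F if x % 100 == 0}         # pretend "functional" subset of F (10 of 1000)

assert F1 <= F < S                          # F1 ⊆ F ⊂ S

p_f1         = Fraction(len(F1), len(S))    # P(F1)
p_f1_given_f = Fraction(len(F1), len(F))    # P(F1 | F)
print(p_f1, "<", p_f1_given_f, "->", p_f1 < p_f1_given_f)   # conditioning on F raises it

# The separate claim: a target occupying a tiny fraction of S is effectively out of reach
# of blind sampling. With target density p and N independent draws, the expected number
# of hits is N*p (and P(at least one hit) is no larger than that).
for p in (1e-50, 1e-74):                    # the densities discussed in the comment
    N = 1e40                                # a deliberately generous trial count, purely illustrative
    print(f"target density {p:g}, {N:g} draws -> expected hits {N * p:g}")
```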
”We simply don’t know how improbable it is because a) we don’t know how many elements of set n(S) are also members of set n(F), and nor do we know how many draws there were.”
First, this isn’t about draws/trials because I’m not talking about what evolution may or may not accomplish. I’m painstakingly stating the obvious, that there exists an objective subset of S in which functional proteins must reside. I agree that we don’t know how improbable. I’m willing to withdraw “objectively improbable” and state instead that it is “likely improbable” or something of the sort. Again, even if we take the 10^-74 estimate down to 10^-50, we’re still looking at something highly improbable. Regardless, a target space for function exists and is not ad hoc nor post hoc -- because it exists in a narrow and unchanging subset of the sequence space. If your position is that the existence of F is trivially true, hence “objective,” then we agree, and can cease straining gnats; and we can put to rest the notion that the argument “equally improbable as any other sequence” has any bearing on the observation of function. That was Miller’s argument and entirely misses the point; it is irrelevant. I’m arguing against the notion that any sequence is as good as any other with respect to function. If a protein has a function, then it is a subset of F, which is a subset of S, which makes it anything but arbitrary with respect to sequence space, and likely highly improbable.
material.infantacy
December 31, 2011 at 01:31 PM PDT
@material.infantacy
By the laws of physics. I’m standing by the notion that if something is necessitated, then its nature can be determined. Aren’t there, in principle, ways to determine the proportion of folding sequences to not? I seem to remember something.
Not that I'm aware of. That's the problem with biology. Things that seem in principle to be "knowable" turn out not to be, essentially because once things get a bit complicated, with feedback effects, the maths becomes horrendous and the only thing you can do is model - or empirically observe. Like weather, really. We understand weather extremely well, but we still can't make good forecasts for more than a couple of days at a time. That's not because we can't in principle model it very well, it's because large effects depend on tiny differences in starting conditions, and we cannot possibly know the precise starting conditions. Maths is becoming an empirical science :) It's also, incidentally, why, despite being a neuroscientist, I have no real worries that neuroscientists will ever be able to make accurate predictions about what people will do or think, even though I do hold the position that what we do or think is a function of our brains and their inputs. Simply telling each other what we think and what we intend to do will remain the best method indefinitely, IMO, in other words the model that we are autonomous decision-makers is the best model we will ever have :)
Elizabeth Liddle
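The weather point, that large effects hinge on tiny differences in starting conditions, can be illustrated with the standard logistic-map toy example. A minimal sketch, not tied to any particular biological system; the parameter and starting values are arbitrary.

```python
# Logistic map x -> r*x*(1-x): a fully deterministic rule whose trajectory still cannot
# be forecast far ahead without knowing the starting value to absurd precision.
r = 3.9
x1, x2 = 0.200000, 0.200001      # two starting points differing by one part in a million
for step in range(1, 51):
    x1, x2 = r * x1 * (1 - x1), r * x2 * (1 - x2)
    if step % 10 == 0:
        print(f"step {step:2d}: {x1:.6f} vs {x2:.6f}   gap {abs(x1 - x2):.6f}")
```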
December 31, 2011 at 08:57 AM PDT
You say: You put your finger on it IMO when you say “as we cannot know all the variables implied”. Yes indeed. And we never do. Even if at some fundamental level the universe proves to have no quantum uncertainty (as at least one eminent theoretical physicist has proposed) all our models (and we are talking about models here) must be stochastic. Now physicists work to find tolerances and insist on 5 sigma confidence, while life scientists are often content with two. But every single effect we observe comes with stochastic variance. We can never know all the variables. What we can do is estimate probability distributions, and also the extent to which those distributions are orthogonal. This is an extremely serious misrepresentation of scientific epistemology.
I don’t agree.
An explanatory model based on necessity is a set of logic relations and mathematical rules that connect events, according to rigid determinism, to give a causal explanation of observed facts.
Well, I accept this as your definition of “an explanatory model based on necessity”. It doesn’t resemble most explanatory models.
Newton’s law of gravity gives us a definite mathematical relationship between mass and gravitational force. That relation is not in any way random. It is a necessity relation. A causal relation, where mass is the cause of gravitational force.
Again, that word “random”. You really need to give a tight definition of it, because it has no accepted tight definition in English. And we simply do not know whether gravity is subject to quantum uncertainty or not. Newton’s law is indeed deterministic, but it is only a law, not an explanation – not a theory. And all a law is is a mathematical description that holds broadly true. That doesn’t make it a causal relation. We do not know whether mass is the cause of gravitational force. It may be that gravitational force is the cause of mass. Or it could be that “cause” is itself merely a model we use to denote temporal sequence, and ceases to make sense when considering space-time. But I’m no physicist, so I can’t comment further except to say that you are making huge and unwarranted assumptions here.
The evolution of a wave function in quantum mechanics is strictly deterministic, and has nothing to do with probability.
Depends what you mean by “nothing to do with probability”. I am talking about scientific (explanatory) models. There is always unmodelled variance, if only experimental error. It is often possible to use deterministic models, even when the underlying processes are indeterminate. Similarly we often have to use stochastic models even when the underlying processes are determinate. I think a big problem (and I find it repeatedly in ID conversations) concerns the word “probability” itself, which is almost as problematic as “random”. Sometimes people use it as a substitute for “frequency” (as in probability distributions which are based on observed frequency distributions) . At other times they use it to mean something closer to “likelihood”. And at yet other times they use it as a measure of certainty. We need to be clear as to which sense we are using the word, and not equivocate between mathematically very different usages, especially if the foundation of your argument is probabilistic (as ID arguments generally are).
All science works that way. Even biological sciences work largely by necessity models.
No. Pretty well all sciences, and certainly life sciences, use models in which the error term is extremely important. And biology is full of stochastic models. In fact I simply couldn’t do my job without stochastic models (and I work in life-science, but in close collaboration with physicists).
Probabilistic models, obviously, are very useful where a necessity model cannot be explicitly built, and still a probability function can well describe, to some extent, what we observe.
I think you are confusing a law with a model. A law, generally, is an equation that seems to be highly predictive in certain circumstances, although there are always residuals – always data points that don’t lie on the line given by the equation, and these are not always measurement error. We often come up with mathematical laws, even in life sciences, but that doesn’t mean that the laws represent some fundamental “law of necessity”. It just means that, usually within a certain data range (as with Newton, and Einstein, whose laws break down beyond certain data limits) relationships can be summarised by a mathematical function fairly reliably – perhaps very reliably sometimes.
So, the repeated tossing of a coin is well described by a simple probabilistic model, while a single toss of a coin is certainly a deterministic problem that cannot easily be solved by a necessity model, even if in theory it can be solved.
This is a false distinction in my opinion. You can describe the results of a single coin toss by a probabilistic model just as well as you can describe the results of repeated tosses. But if you want to predict the results of an individual toss, as opposed to the aggregate results of many tosses, you need to build a more elaborate model that takes into account all kinds of extra data, including the velocity, spin, distance and angle etc of the coin. And you cannot possibly know all the factors, so there will still be an error term in your equation. In other words, predictive models always have error terms; sometimes these can be practically ignored; at other times, you need to characterise the distribution of the error terms and build a full stochastic model.
The two ways to describe reality are deeply different: they have different forms, and different rules. They are so different that the whole methodology of science is aimed at distinguishing them.
I agree that characterising uncertainty is fundamental to scientific methodology. I disagree that stochastic and non-stochastic models are “deeply different”. In fact I’d say that a non-stochastic model is just a special case of a stochastic model where the error term is assumed to be zero.
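A minimal sketch of that "special case" point: generate data from a simple law plus unmodelled influences, fit the deterministic part, and note that setting the noise level to zero recovers the purely deterministic model. The slope and noise values below are invented for illustration, not taken from the thread.

```python
import random

random.seed(42)
a_true, sigma = 2.0, 0.5          # the "law" y = a*x, plus unmodelled influences (invented values)
xs = [i / 10 for i in range(1, 101)]
ys = [a_true * x + random.gauss(0, sigma) for x in xs]   # sigma = 0 gives the purely deterministic case

# Least-squares estimate of the slope, and the residuals the deterministic part leaves behind
a_hat = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
residuals = [y - a_hat * x for x, y in zip(xs, ys)]
rms = (sum(e * e for e in residuals) / len(residuals)) ** 0.5
print(f"estimated slope {a_hat:.3f}, RMS residual {rms:.3f}")
# The fitted law is deterministic, but the model is only complete once the residual
# distribution (here roughly Normal(0, 0.5)) is characterised as well.
```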
Take Fisher’s hypothesis testing, for instance, which is widely used as a research methodology in biological sciences. As you certainly know, the purpose of the test is to affirm a causal necessity, or to deny it.
No. That is not the purpose of Fisher’s test, which has nothing to do with “causal necessity” per se (although it can be used to support a causal hypothesis). Nor can you use Fisher’s test to “deny” a “causal necessity”. If students attempt to do that in their essays for me they lose marks! You can use Fisher’s test to support a hypothesis, in which case you can conclude that the observed data are unlikely to be observed under the null hypothesis (the hypothesis that your study hypothesis is false). However, if Fisher’s test tells you that your observed data are quite likely to be observed under the null, you cannot conclude that your hypothesis is false, merely that you have no warrant for claiming that it is true.
Observed data are analyzed, and observed differences, or effects, are statistically tested to compute the probability of those effects if only random noise were the explanation: that is the null hypothesis. If the null hypothesis, that is, an explanation out of random effects, is considered too improbable, it is rejected, and the necessity explanation that was the initial hypothesis of the researchers is usually assumed.
Right. Except that a good scientist will then attempt to devise an alternative hypothesis that could also account for the observed data. But if you “retain the null”, by Fisher’s test, you cannot conclude that your hypothesis is false. Fisher’s test cannot be used to falsify any hypothesis except the null. It cannot be used in the Popperian sense of falsification in other words.
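The point that "retaining the null" is not falsification can be shown by simulation: draw small samples from two populations that really do differ and count how often the test fails to reach significance. A sketch only; the effect size, sample size, and the choice of a t test are illustrative assumptions, not anything specified in the thread (requires SciPy).

```python
import random
from scipy import stats

random.seed(0)

def sample(mu, n=10):
    return [random.gauss(mu, 1.0) for _ in range(n)]

runs, misses = 2000, 0
for _ in range(runs):
    t, p = stats.ttest_ind(sample(0.0), sample(0.5))   # a real difference of 0.5 SD exists
    if p >= 0.05:
        misses += 1        # "retain the null" even though the research hypothesis is true
print(f"null retained in {misses / runs:.0%} of runs despite a real effect")
```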
As you can see, that is completely different from your bold statement that “all our models (and we are talking about models here) must be stochastic”. That is a huge mistake. Most scientific models are not stochastic at all. Some of them are, like part of thermodynamics, and the collapse of the wave function in quantum physics.
It is entirely compatible with my statement. Typically, we use Fisher’s test to test the difference between two summary statistics. Those summary statistics are models, and they come with associated error terms. Without those error terms we could not compute Fisher’s test statistics, which actually incorporate error terms.
So, where does your confusion come from? I believe you make a huge mistake, confounding the model, which is often not stochastic at all, with the observed data, which always have a component of random noise.
I am certainly not confounding the two! Scientific methodology involves fitting models to data. Yes, you can build a non-stochastic model, but you still have to deal with the error term, in other words the residuals, aka the stuff impacting on your data that you haven’t modelled. And you can either make assumptions about the distributions of your residuals (assume a Gaussian, for instance) or you can actually include specified distributions for the uncertain factors in your model. If you don’t – you report a non-stochastic model with the assumption that the residuals are normally distributed, and they aren’t, you will be making a serious error, and your model will be unreliable.
Random noise is an empirical reality. Random sampling is a cause of random error in most biological contexts. In physics, as in all science, measurement always comes with some error. The presence of random error in empirical data is exactly the reason why those data are often analyzed statistically to distinguish between the effects of random noise and the assumed effect of necessity. But that does not mean, in any way, that the model is stochastic. The explanatory model is usually based on necessity; it assumes specific cause and effect relations, where the probability of an event, given the cause, is 1 or 0.
In my view you are making some serious statistical errors here, I think at a conceptual level. I think it all derives from your unpacked concept “random”. “Random noise” is simply unmodelled variance (as you’ve said yourself). For instance, we can often reduce the residuals in our models by including a covariate that models some of that “noise”, so it ceases to be noise – we’ve found, in effect, a systematic relationship between some of the previously unmodelled variance and a simply modelled factor. Age, for instance, in my field, or “working memory capacity” is a useful one, as measured by digit span. Furthermore, sampling error, is not a “cause of random error”, but represents the variability in summary statistics of samples resulting from variance in the population that is not included in your model. Certainly some variance is due to measurement error. But it would be very foolish to assume that you have modelled every variable impacting on your data apart from measurement error. You are also making a false distinction between “the effects of random noise and the assumed effect of necessity”. Apart from measurement error, all the rest of your “random noise” may well be “effects of necessity”. What makes them “noise” is simply the fact that you haven’t modelled them. Model them, and they become “effects of necessity”. And, as I said, you can model them as covariates, or you can model them as stochastic terms. Either way, you aren’t going to get away without a stochastic term in your model, even if it just appears as the error term.
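A toy regression illustrates the point about unmodelled variance: adding a covariate (age, in this invented example) soaks up variance that previously sat in the error term. A minimal sketch with synthetic data; all coefficients and distributions are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
treatment = rng.integers(0, 2, n)              # the factor of interest
age = rng.normal(50, 10, n)                    # a systematic influence we may or may not model
score = 2.0 * treatment - 0.3 * age + rng.normal(0, 1.0, n)

def rms_residual(design):
    beta, *_ = np.linalg.lstsq(design, score, rcond=None)
    return float(np.sqrt(np.mean((score - design @ beta) ** 2)))

X_without = np.column_stack([np.ones(n), treatment])        # the age effect sits in the "noise"
X_with    = np.column_stack([np.ones(n), treatment, age])   # age modelled as a covariate
print("RMS residual without covariate:", round(rms_residual(X_without), 2))
print("RMS residual with covariate:   ", round(rms_residual(X_with), 2))
```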
So, you are calling the model stochastic, while the only contribution of statistical analysis in the procedure is to rule out random causes as an explanation. That is a big cognitive mistake.
I disagree. I think yours is the cognitive mistake, and I think it is the mistake of inadequately considering what “random causes” means. “Random causes” are not “explanations”. They are the opposite of “explanations”. They are the unexplained, aka unmodelled, variance in your data. Thinking that “random” is an “explanation” is a really big cognitive mistake! But you are not the only person on this board to make it.
You cite 5 sigma, presumably referring to its use in physics. Well, I quote here what 5 sigma means, and what it is used for (emphasis is mine): “Word of the Week: Five Sigma September 23, 2011 by Lori Ann White The realm of particle physics is the quantum realm, and physicists who venture there must play by the rules of probability. Major discoveries don’t come from a single repetition of a single experiment. Instead, researchers looking for new particles or strange processes repeat an experiment, such as smacking protons together at nearly the speed of light, over and over again, millions or billions or trillions of times – the more times the better. Even then, as the researchers sort through the results, interesting lumps and bumps in the data don’t automatically translate into, “We found it!” Interesting lumps and bumps can, and do, happen by chance. That’s why statistical analysis is so important to particle physics research. The statistical significance of a particular lump or bump – the probability that it did not appear by chance alone – must be determined. If chance could have supplied it, no dice, and no discovery. The yardstick by which this significance is measured is called standard deviation (generally denoted by the lowercase Greek letter sigma). In statistics, the standard deviation is a measure of the spread of data in a normal probability distribution – the well-known bell curve that determines the grades of many students, to their dismay. On that bell curve, researchers plot the probability that their interesting lump or bump is due to chance alone. If that point is more than five sigma – five standard deviations – from the center of the bell curve, the probability of it being random is smaller than one in one million. Only then can particle physicists shout, “Eureka!” (a la Archimedes) … but without running naked through the streets of Athens.” Again, as you can see, physicists use the 5 sigma threshold to exclude “that their interesting lump or bump is due to chance alone”. IOWs, to affirm that it is evidence of their necessity model. QED. That is exactly how ID uses the concept of dFSCI: a bump that is too unlikely to be due to chance alone. IOWs, a result that cannot emerge in a random system, but that requires a cause. In that case, the cause is design.
And that is exactly what is wrong with ID. Lori Ann White has done what I always reprimand students for doing – saying that their alpha criterion allows them to rule out effects that are “due to chance alone”. It does no such thing. All it does is to allow them to say, as you said yourself, that the observed results are unlikely to be observed if the null is true. Chance doesn’t “cause” anything. But lots of unmodelled factors do. However, under the null hypothesis, only rarely will those unmodelled factors combine to give you results like those you have observed. And ID simply does not attempt to model those unmodelled factors. It simply assumes that under the null (no design) the observed data will be very rare. In other words, it assumes what it sets out to demonstrate, which is fallacious.
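For reference, the five-sigma threshold quoted above corresponds to a one-sided normal tail probability of roughly 3 × 10^-7, comfortably under one in a million, while two sigma corresponds to roughly 1 in 40. A quick check, assuming SciPy is available:

```python
from scipy.stats import norm

for k in (2, 5):                   # the "two sigma" and "five sigma" thresholds mentioned above
    p = norm.sf(k)                 # one-sided tail probability beyond k standard deviations
    print(f"{k} sigma: one-sided p = {p:.2e}  (about 1 in {1/p:,.0f})")
```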
Again, if we cannot agree on those basic epistemological concepts, there is no hope in discussing. All my reasonings are based on my epistemology, in which I very much believe. How can I discuss anything with you, if you believe completely different things (I really cannot understand what and why) at the epistemological level?
I agree we can make no progress without agreeing on basic statistical principles. My position is that yours are erroneous. I hope the above assists you in understanding why I think so. Cheers, and a happy new year to you! Lizzie
Elizabeth Liddle
December 31, 2011 at 08:40 AM PDT
Elizabeth: Now, you cannot retract what you have stated. I quote another statement of yours (emphasis mine): I agree. That was my point. All systems are stochastic. What we need are the relevant probability distributions, not a division into stochastic and non-stochastic processes. They are all stochastic. Some merely have more variance than others. This is, btw, why I avoid the term “RM+NS” and get cross when people say that mutation is random and natural selection isn’t. Both are variance generation (“RM”) and differential reproduction (“NS”) stochastic processes. And both have probability distributions that are biased in favour of reproductive success. The confusion here is very clear. You refuse the epistemological difference between RV and NS, which is extremely serious, and then you strangely state that "both have probability distributions that are biased in favour of reproductive success." What does that mean? RV is a random system (yes, I do prefer that term to "stochastic"). It includes all the events of variation that happen in the genome, and that are caused by a great number of variables, none of which has "probability distributions that are biased in favour of reproductive success". Please, name one cause of random variation that has that type of "probability distribution", and explain why. In what sense are single point mutations "biased in favour of reproductive success"? Or chromosomal deletions? Or anything else? You say such strange things exactly because you don't want to admit the difference between RV and NS. Let's take NS. Is it a random principle? No. It is a necessity relation. It expresses the very trivial concept that, in a reproducing population, if some variation has a negative effect on reproduction, the variated genome will probably be less represented, and the other way round if the variation has a positive effect on reproduction. Now, it is true that, due to existing random noise in the reproducing population, positive variations can sometimes be lost, and vice versa. In that sense, the necessity effect of the principle of NS can be diluted, and we need to take that into account. But still, the relationship between some specific variation and reproduction is a causal relation, a relation of necessity. Its effect will be modulated by other effects, but still it is a necessity relation. Let's make an example. If some very fundamental protein that is required for cell division mutates in a way that it loses all function (let's say a frameshift mutation), reproduction becomes impossible. That effect has no probability distribution at all. It cannot even be mitigated by other factors. Negative NS is often a very strong necessity principle. You cannot treat those things with "probability distributions", or just play games saying that they are stochastic. Function is a question of necessity. A machine works because it is made to work. Its working is the necessary result of its structure. Probability does not help here. We have to understand the function, the structure, and the relationship between the two. That is a work of necessity, of understanding logical and causal relations. You cannot "avoid the term “RM+NS”". People who say that "mutation is random and natural selection isn’t" are simply right.
Variation in the genome is random, unless it is designed. NS is a necessity principle that depends critically on how the variation modifies function. Its effect is not to modify the genome, but simply to change the percentage representation of what exists in the population. There is a big difference. The only "engine of variation", the only thing that modifies the genome, is a series of random modifications, none of them "biased in favour of reproductive success". Whatever they are, single point mutations, deletions, inversions, sexual recombination, whatever, they are random, because there is no explicit cause and effect relationship between the cause of variation and the effect on genomic function. A single point mutation happens randomly in the genome. Even if the probabilities are not the same for all mutations, nothing in that probability distribution is "biased in favour of reproductive success". The only mechanism "biased in favour of reproductive success", except for design, is NS, the necessity relationship between the type of variation (positive, negative, neutral) and the reproductive function. That's why RV and NS must be recognized as different, and treated separately.
gpuccio
December 31, 2011 at 08:24 AM PDT
We only need to “count” the elements to determine the specific probability
Based on actual experiment, the probability of finding any arbitrary functional sequence that is one step removed from an existing sequence is one, because populations buy all the lottery tickets.
Petrushka
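The "lottery tickets" claim is about how small a one-step mutational neighbourhood is relative to the number of new mutants a large population produces. A back-of-envelope sketch; every number below is an illustrative placeholder, not a measurement from any study mentioned in the thread.

```python
# Size of the single-point-mutation neighbourhood of a DNA sequence of length L:
# each of the L sites can change to 3 other bases.
L = 1000                          # length of a hypothetical gene (illustrative)
neighbourhood = 3 * L

# How many new point mutants does a large population generate per generation?
population  = 1e9                 # e.g. a modest bacterial culture (illustrative)
mu_per_site = 1e-9                # assumed per-site, per-generation mutation rate (illustrative)
mutants_per_generation = population * mu_per_site * L

generations_to_cover = neighbourhood / mutants_per_generation
print(f"{neighbourhood} one-step neighbours; ~{mutants_per_generation:.0f} new point mutants per generation")
print(f"so the whole neighbourhood is sampled in roughly {generations_to_cover:.0f} generations (ignoring repeats)")
```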
December 31, 2011 at 08:21 AM PDT
Function is objective, and so is folding. Since not all sequences fold, the assessment is not arbitrary.
Folding ceased to be the most important criterion for utility several hundred million years ago. Most evolution in multi-celled organisms takes place in regulatory genes. Of course utility is not arbitrary. It is constrained by selection. My point is you have no way to calculate probability, because -- independent of selection -- you have no way of determining which sequences are useful and which are not. If you wish to argue that design is even possible, demonstrate a method of determining utility that is independent of selection. The problem for both evolution and design is determining utility. Evolution solves the problem by trying all variations in the sequence neighborhood. That is the point of the Lenski experiment -- demonstrating that this is what actually happens. In one sense it makes no difference whether mutation is random. If everything is tried it makes no difference what the order of trials is. It could just as effectively be done starting at one end of a sequence and progressing to the other. Now ID proponents could go the Thornton route and do the painstaking research to see if cousin sequences really can be connected. That would be interesting.
Petrushka
December 31, 2011 at 08:09 AM PDT
Elizabeth:
I will discuss just one point: your refusal to admit the fundamental difference between necessity and a random system. If we can’t agree on that, it’s completely useless to go on discussing anything.
gpuccio, I've already said that I find the term "random system" hopelessly ambiguous and imprecise. I am happy with the term "stochastic", which all systems are, to some extent, but can be regarded as non-stochastic at certain scales. Please don't characterise my position as a "refusal to admit" anything. I am not refusing to "admit" anything. I am simply pointing out something that you yourself pointed out, which is that all models include unknowns, error terms, if you like, representing either unmodelled variables that are orthogonal (with luck) to modelled factors or even inbuilt uncertainties (such as quantum uncertainty). The degree to which unmodelled variables impact the effects of interest is the degree to which we have to consider the system stochastic when modelling it. That's all. I'll read the rest of your post now.
Elizabeth Liddle
December 31, 2011 at 07:27 AM PDT
It’s what I said all along. Just because I proposed that 0 < P(F) < 1 (or other inequalities) doesn’t mean that I was calculating the probability of a function. You’re going a long way to avoid admission that an objective target exists. Nothing of what you say has refuted it. You seem to be operating under the assumption that because I can’t compute an exact probability for a given function, that no objective target exists.
What's "objective" about a target that you can't define? Sure, a "target" exists = we know that the set of sequences that code for foldable proteins isn't an empty set. That is trivially true, and not in dispute. If you want to add the adjective "objective" to that clear fact, then feel free, but I don't see what it adds. But if you want then to calculate the probability that a randomly drawn sequence will include one of those foldable proteins then you can't do it. So you can't claim then that "The set F is objectively improbable". We simply don't know how improbable it is because a) we don't know how many elements of set n(S) are also members of set n(F), and nor do we know how many draws there were. All you can say is that set n(F) is smaller than set n(S) where n(F) is the set of sequences that result in foldable proteins.
Regarding the probability of F, isn’t there an estimate of 10^-74 for a 150 length chain? If you don’t like that number, do you have one which makes a difference? What value do you think would be reasonable?
We don't know, which is Petrushka's point. As I understand it (and I assume Petrushka does too) Axe "drew his target round an arrow". There's a good article about it on Panda's Thumb, with nice illustrations: http://pandasthumb.org/archives/2007/01/92-second-st-fa.html As Petrushka said:
My point is and always has been that in the absence of the ability to distinguish a functional sequence independently of trial and error, design is impossible.
And as I added:
Not only is design impossible (except by an omnipotent deity I guess), but unless you know what proportion of sequences result in potentially useful proteins (or, indeed, are potentially useful regulatory sequences) there’s no way of computing the probability that any one will arise “by chance” nor how closely clustered the useful stuff is.
For the heck of it, I'm just writing a quick simulation to get some kind of feel for how often a randomly generated genome of length N will contain a sequence starting with a start codon and ending with a stop codon that consists of a whole number of triplets (i.e. code for an amino acid chain with no introns). Not sure it will help though, because I still can't get from there to whether the resulting protein chains would fold stably or not. If you are really interested in the answer, you could join folding@home: http://folding.stanford.edu/ It's my New Year's Resolution :)
Elizabeth Liddle
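The quick simulation described above might look something like the sketch below. This is not Dr Liddle's script; the genome length, the number of trials, and the choice to report the longest open reading frame are assumptions made purely for illustration.

```python
import random

random.seed(2012)
STOPS = ("TAA", "TAG", "TGA")

def longest_orf(genome):
    """Longest stretch from an ATG to an in-frame stop codon, in codons (0 if none found)."""
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(genome) - 2, 3):
            codon = genome[i:i + 3]
            if codon == "ATG" and start is None:
                start = i
            elif codon in STOPS and start is not None:
                best = max(best, (i - start) // 3)
                start = None
    return best

N, trials = 3000, 200             # genome length and number of random genomes (arbitrary)
lengths = []
for _ in range(trials):
    genome = "".join(random.choice("ACGT") for _ in range(N))
    lengths.append(longest_orf(genome))

print(f"random genomes of length {N}: mean longest ORF = {sum(lengths)/trials:.0f} codons, max = {max(lengths)}")
# Finding reading frames is the easy part; as noted in the comment, this says nothing
# about whether the encoded peptides would fold stably, let alone do anything useful.
```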
December 31, 2011, 07:22 AM PDT
Elizabeth: I will discuss just one point: your refusal to admit the fundamental difference between necessity and a random system. If we can't agree on that, it's completely useless to go on discussing anything. You say:

You put your finger on it IMO when you say "as we cannot know all the variables implied". Yes indeed. And we never do. Even if at some fundamental level the universe proves to have no quantum uncertainty (as at least one eminent theoretical physicist has proposed) all our models (and we are talking about models here) must be stochastic. Now physicists work to find tolerances and insist on 5 sigma confidence, while life scientists are often content with two. But every single effect we observe comes with stochastic variance. We can never know all the variables. What we can do is estimate probability distributions, and also the extent to which those distributions are orthogonal.

This is an extremely serious misrepresentation of scientific epistemology. An explanatory model based on necessity is a set of logical relations and mathematical rules that connect events, according to rigid determinism, to give a causal explanation of observed facts. Newton's law of gravity gives us a definite mathematical relationship between mass and gravitational force. That relation is not in any way random. It is a necessity relation: a causal relation, where mass is the cause of gravitational force. The evolution of a wave function in quantum mechanics is strictly deterministic, and has nothing to do with probability. All science works that way. Even the biological sciences work largely by necessity models.

Probabilistic models, obviously, are very useful where a necessity model cannot be explicitly built, and yet a probability function can still describe, to some extent, what we observe. So, the repeated tossing of a coin is well described by a simple probabilistic model, while a single toss of a coin is a deterministic problem that cannot easily be solved by a necessity model, even if in theory it can be.

The two ways of describing reality are deeply different: they have different forms, and different rules. They are so different that the whole methodology of science is aimed at distinguishing them. Take Fisher's hypothesis testing, for instance, which is widely used as a research methodology in the biological sciences. As you certainly know, the purpose of the test is to affirm a causal necessity, or to deny it. Observed data are analyzed, and observed differences, or effects, are statistically tested to compute the probability of those effects if only random noise were the explanation: that is the null hypothesis. If the null hypothesis, that is, an explanation in terms of random effects, is considered too improbable, it is rejected, and the necessity explanation that was the initial hypothesis of the researchers is usually assumed.

As you can see, that is completely different from your bold statement that "all our models (and we are talking about models here) must be stochastic". That is a huge mistake. Most scientific models are not stochastic at all. Some of them are, like parts of thermodynamics and the collapse of the wave function in quantum physics.

So, where does your confusion come from? I believe you make a huge mistake, conflating the model, which is often not stochastic at all, with the observed data, which always have a component of random noise. Random noise is an empirical reality. Random sampling is a cause of random error in most biological contexts.
In physics, as in all science, measurement always comes with some error. The presence of random error in empirical data is exactly the reason why those data are often analyzed statistically, to distinguish between the effects of random noise and the assumed effect of necessity. But that does not mean, in any way, that the model is stochastic. The explanatory model is usually based on necessity: it assumes specific cause-and-effect relations, where the probability of an event, given the cause, is 1 or 0. So, you are calling the model stochastic, while the only contribution of statistical analysis in the procedure is to rule out random causes as an explanation. That is a big cognitive mistake.

You cite 5 sigma, presumably referring to its use in physics. Well, I quote here what 5 sigma means, and what it is used for (emphasis is mine):

"Word of the Week: Five Sigma. September 23, 2011, by Lori Ann White. The realm of particle physics is the quantum realm, and physicists who venture there must play by the rules of probability. Major discoveries don't come from a single repetition of a single experiment. Instead, researchers looking for new particles or strange processes repeat an experiment, such as smacking protons together at nearly the speed of light, over and over again, millions or billions or trillions of times – the more times the better. Even then, as the researchers sort through the results, interesting lumps and bumps in the data don't automatically translate into, "We found it!" Interesting lumps and bumps can, and do, happen by chance. That's why statistical analysis is so important to particle physics research. The statistical significance of a particular lump or bump – the probability that it did not appear by chance alone – must be determined. If chance could have supplied it, no dice, and no discovery. The yardstick by which this significance is measured is called standard deviation (generally denoted by the lowercase Greek letter sigma). In statistics, the standard deviation is a measure of the spread of data in a normal probability distribution – the well-known bell curve that determines the grades of many students, to their dismay. On that bell curve, researchers plot the probability that their interesting lump or bump is due to chance alone. If that point is more than five sigma – five standard deviations – from the center of the bell curve, the probability of it being random is smaller than one in one million. Only then can particle physicists shout, "Eureka!" (a la Archimedes) ... but without running naked through the streets of Athens."

Again, as you can see, physicists use the 5 sigma threshold to exclude "that their interesting lump or bump is due to chance alone". IOWs, to affirm that it is evidence for their necessity model. QED.

That is exactly how ID uses the concept of dFSCI: a bump that is too unlikely to be due to chance alone. IOWs, a result that cannot emerge in a random system, but that requires a cause. In that case, the cause is design.

Again, if we cannot agree on those basic epistemological concepts, there is no hope in discussing. All my reasoning is based on my epistemology, in which I very much believe. How can I discuss anything with you, if you believe completely different things (I really cannot understand what and why) at the epistemological level?gpuccio
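For a concrete sense of what the two-sigma and five-sigma thresholds discussed above amount to, here is a small sketch (it assumes SciPy is available; the figures are the standard one-sided tail areas of a normal distribution, not anything specific to a particular experiment):

```python
from scipy.stats import norm

# One-sided tail probability: the chance of seeing a result at least
# k standard deviations from the null expectation if chance alone is at work.
for k in (2, 5):
    p = norm.sf(k)  # survival function = 1 - CDF
    print(f"{k} sigma: p = {p:.2e}  (about 1 in {1 / p:,.0f})")

# A "bump" is declared significant when this tail probability falls below
# the chosen threshold, i.e. the null hypothesis (chance alone) is rejected.
```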
December 31, 2011, 07:01 AM PDT
"Now you say:
This isn’t about the probability of function, but whether function is a subset of sequence space — yes, it is."
It's what I said all along. Just because I proposed that 0 < P(F) < 1 (or other inequalities) doesn't mean that I was calculating the probability of a function. You're going a long way to avoid admitting that an objective target exists. Nothing you say has refuted it. You seem to be operating under the assumption that because I can't compute an exact probability for a given function, no objective target exists. You say that n(F) may be only slightly smaller than n(S). Regardless, n(F) < n(S), and so for any trial there is either F or not F, and 0 < P(F) < 1. That's a target. That's the claim. That's the demonstration. It objectively exists, and that's what you won't acknowledge. You suggest that it might be more probable than I think. Fine. It's still a target in S. The target is F, so regardless of how improbable it is (and yes, I think it's improbable) it is one of only two possible outcomes: P(F) + P(F') = 1. This is true regardless of whether I can determine the size of F.
"And in any case, “function” is not synonymous with “coding for a foldable protein”."
Indeed, but "function" is a subset of "folding", which is why I used folding in the first place. Regarding the probability of F, isn't there an estimate of 10^-74 for a 150-amino-acid chain? If you don't like that number, do you have one which makes a difference? What value do you think would be reasonable?
Since proteins can’t perform functions unless they first fold into stable structures, Axe’s measure of the frequency of folded sequences within sequence space also provided a measure of the frequency of functional proteins—any functional proteins—within that space of possibilities. Indeed, by taking what he knew about protein folding into account, Axe estimated the ratio of (a) the number of 150-amino-acid sequences that produce any functional protein whatsoever to (b) the whole set of possible amino-acid sequences of that length. Axe’s estimated ratio of 1 to 10^74 implied that the probability of producing any properly sequenced 150-amino-acid protein at random is also about 1 in 10^74. In other words, a random process producing amino-acid chains of this length would stumble onto a functional protein only about once in every 10^74 attempts. Meyer, Stephen C. (2009-06-06). Signature in the Cell (pp. 210-211). Harper Collins, Inc. Kindle Edition.
material.infantacy
December 31, 2011, 06:51 AM PDT
You wrote:
Not exactly. We are talking about whether there is a probability — whether an objective, predetermined target exists as a space F (folded) as a subset of S (all sequences), for which n(F) < n(S); that is, F is a proper subset of S. And that functional protein sequences are a subset of F. I vote yes.
Now you say:
This isn’t about the probability of function, but whether function is a subset of sequence space — yes, it is.
So yes, we were talking about probability. Specifically, you took Petrushka to task as follows:
For a set S, which is a universal set consisting of every possible sequence of length n, there is a subset F which consists of every sequence that results in a folded protein. P(F) = n(F) / n(S) That is, the probability of F occurring is equal to the number of elements in F divided by the number of elements in S. The set F is objectively improbable, unchanging, and contains the subset of all potentially functional proteins. (That is, the size of the subset of functional proteins is bounded by the size of the subset F.)
No, the set F is not "objectively improbable", because you haven't calculated it. It's not calculable. n(F) may be only slightly smaller than n(S). And how, in any case, are you computing sequence lengths, and for what maximum length of genome? And even if you were able to calculate P(F), you'd still have to compute 1-(1-P(F))^N, where N is the number of opportunities for a new sequence, in order to get the probability that one of those F sequences would turn up. And those opportunities themselves will be constrained by the sequences already generated. So the thing is not "objectively improbable". We simply do not know, and cannot compute, the probability. And in any case, "function" is not synonymous with "coding for a foldable protein". There are plenty of functional but non-coding regions of the genome. And not all genes that code for a foldable protein perform a function. Some genes are never expressed; some are expressed but don't do anything that contributes to the life and reproductive capacity of the phenotype.Elizabeth Liddle
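A quick sketch of the calculation referred to above, 1-(1-P(F))^N, may make the point concrete. The p and N values below are placeholders chosen purely for illustration, not estimates of anything:

```python
import math

def prob_at_least_one(p, n_trials):
    """P(at least one success in n_trials independent draws),
    i.e. 1 - (1 - p)**n_trials, computed so that very small p
    does not underflow to zero."""
    return -math.expm1(n_trials * math.log1p(-p))

# Placeholder values only: the answer depends jointly on P(F) and on
# the number of opportunities N, neither of which is known here.
for p in (1e-6, 1e-20, 1e-74):
    for n in (1e9, 1e20, 1e40):
        print(f"p = {p:.0e}, N = {n:.0e} -> {prob_at_least_one(p, n):.3e}")
```

Without defensible values for either quantity, the result is simply unknown rather than merely "small".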
December 31, 2011, 06:11 AM PDT
Here are some questions to ponder. Let S = {sequence space} and F = {folding sequences}.
Is F a subset of S?
Is the size of F less than the size of S?
Is the probability of F greater than zero and less than one?
Do we need to know the probability value before knowing that 0 < P(F) < 1?
For any trial in S, are there two possible outcomes, F and F'?
Is F determined by the laws of physics?
Does F change from time to time?
Answering those questions will be useful in determining whether folding is deterministically specific to a subset of sequences.material.infantacy
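The set-theoretic points in the questions above can be made concrete with a toy enumeration. The "folding" rule below is invented purely for illustration (real folding obviously cannot be decided by counting letters); the point is only that once a deterministic rule partitions S into F and its complement, the ratio n(F)/n(S) exists whether or not anyone has measured it:

```python
from itertools import product

ALPHABET = "ACGT"
LENGTH = 6  # small enough to enumerate the whole sequence space S

def toy_folds(seq):
    """Invented stand-in for 'folds': True if G+C outnumber A+T."""
    return sum(seq.count(b) for b in "GC") > LENGTH / 2

S = ["".join(t) for t in product(ALPHABET, repeat=LENGTH)]
F = [s for s in S if toy_folds(s)]
F_complement = [s for s in S if not toy_folds(s)]

assert set(F) | set(F_complement) == set(S)  # F union F' = S
assert not set(F) & set(F_complement)        # F intersect F' is empty

print(f"n(S) = {len(S)}, n(F) = {len(F)}, P(F) = {len(F) / len(S):.4f}")
```

The dispute in the thread is not over whether such a partition exists, but over whether n(F) can be estimated for real sequence spaces without enumerating or sampling them.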
December 31, 2011, 05:45 AM PDT
Do all sequences fold? Yes or no. If no, then there is a target in S (sequence space). Whether there is an objective target. That is the issue. There either is or there isn't. Which is it? This isn't about the probability of function, but whether function is a subset of sequence space -- yes, it is.material.infantacy
December 31, 2011, 05:21 AM PDT
But you haven't told us how you "predetermine it"! So how can you compute the probability?Elizabeth Liddle
December 31, 2011, 05:17 AM PDT
"We were talking about the probability of a mutated sequence coding for a potentially functionally useful folding protein...."
Not exactly. We are talking about whether there is a probability -- whether an objective, predetermined target exists as a space F (folded) as a subset of S (all sequences), for which n(F) < n(S); that is, F is a proper subset of S. And that functional protein sequences are a subset of F. I vote yes.material.infantacy
December 31, 2011, 05:07 AM PDT
"Tell me how you would assess, objectively the proportion of DNA sequences that would result in a folding protein. "
By the laws of physics. I'm standing by the notion that if something is necessitated, then its nature can be determined. Aren't there, in principle, ways to determine the proportion of folding to non-folding sequences? I seem to remember something. Regardless, it is non-arbitrary, and there is a target for functional protein sequences -- the folded set. Unless you are claiming that folding is arbitrary rather than deterministic, and that proteins can happily exist without folding, there is an objective target for protein function in sequence space.
I don’t need to list the elements to objectively assess, with sufficient clarity, that function is objective.
You need to count the elements to know what proportion of each set are also members of the larger sets. And unless you know what they are you can’t do that. Or any of the other stuff I mentioned that you would have to do.
We only need to "count" the elements to determine the specific probability, not whether one exists. We know that not every sequence will fold; indeed, fewer fold than not. This gives us the basis for determining that a probability exists: that in a sample space S, there is a set F of sequences that fold. There is also a set F' (the complement) such that F ∪ F' = S and F ∩ F' = {}. This is a partition of S. That's a target, the set F. We can negotiate specifics, such as the value of n(F)/n(F'), or whether all sequences in F can be functional, but we need not do so to determine that a target exists. Miller's card analogy
"But if you want to compute the probability of a sequence being a folding sequence then you have to have a way of determining what proportion of possible sequences are folding sequences. If not, why not?"
Explained above.material.infantacy
December 31, 2011, 05:01 AM PDT
With respect, kf, you are moving the goalposts. We were talking about the probability of a mutated sequence coding for a potentially functionally useful folding protein, not the probability of a mutated sequence coding for a specific complex function (which requires a great many stretches of code). Nobody suggests that complex functions arose ex nihilo by some unselected series of mutations. That's why natural selection (the subject of the OP) is important.Elizabeth Liddle
December 31, 2011, 04:40 AM PDT
P: Pardon, but there you go again with your inappropriate "painted target" metaphor. Kindly tell us how seeing the ATP synthase as what it is, a motor -- a two-port device converting energy into shaft power -- is painting a target after the fact, instead of recognising a case of something we do know something about: a motor, only using molecular nanotech and in the context of living organisms. Likewise -- and here we deliberately move to an analogy -- tell us how seeing kinesin with vesicles in tow along the microtubules as a miniature walking truck is an after-the-fact, dismissible subjective evaluation. In short, you are playing at distractive, label-and-dismiss rhetoric, not serious discussion on the merits. That is really sad. And, that WE do not yet know how to design working proteins etc. does not mean that no-one out there does. That's like a tribe in the deep Amazon seeing an aeroplane flying over and saying: we do not know how to do it, so we cannot say that this is credibly a designed object on the evident signs. So, it "must" be a natural object that was always there for all we can say. Please, think again. The proper issue is: what are the candidate adequate causal factors that can produce such items with FSCO/I, and what empirical warrant can allow us to decide which is the best explanation? Pardon, but I get the strong impression that you know that motors etc. are made by designers, and are only credibly explained on design. So, you are desperately straining every talking-point device to shift focus. Red herrings and strawmen, in short. Which are fallacious. So, kindly get back on track and deal with the issue on the table: if you wish to suggest that the objects in view are explicable on blind chance plus mechanical necessity in some plausible initial environment, then show us how that environment leads to these objects, and substantiate with observations. Failing such, we will have every right to conclude that you are trying to distract attention from what you cannot answer cogently on the merits of observed fact, and then dismiss what is not convenient for your preferred view. So, kindly fill in the blanks:
a: Major candidate causal explanations for _______________, _______________, _______________, . . . and _______________, are ________________ .
b: Evidence, per observations, for the first candidate is __________________
c: Evidence, per observations, for the second candidate is __________________
d: Repeat as often as required.
e: Of these, on factual adequacy, coherence, and explanatory power and elegance, candidate ____ is the best explanation, because ________________ .
Thanks in advance. GEM of TKI. PS: Having a metric that tells us when something specific, complex and functional is beyond the reasonable reach of chance and necessity is a relevant factor in the above. That is exactly what the log-reduced chi metric you seem to want to dismiss does: Chi_500 = I*S - 500, in functionally specific bits beyond the solar-system atomic-state resources threshold.kairosfocus
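For what it may be worth, a minimal sketch of how the log-reduced metric quoted in the PS, Chi_500 = I*S - 500, could be evaluated is below. Treating S as a simple 0/1 specificity flag and estimating I as log2 of the number of possible sequences are assumptions made here for illustration, not kairosfocus's own procedure:

```python
import math

def chi_500(info_bits, functionally_specific):
    """Chi_500 = I*S - 500, with S treated (as an assumption of this
    sketch) as a 0/1 flag for functional specificity and I in bits."""
    S = 1 if functionally_specific else 0
    return info_bits * S - 500

# Illustrative upper bound only: a 150-amino-acid sequence, if every
# residue carried the full log2(20) bits of information.
I = 150 * math.log2(20)                         # about 648 bits
print(chi_500(I, functionally_specific=True))   # ~148: past the 500-bit threshold
print(chi_500(I, functionally_specific=False))  # -500: not specific, no inference
```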
December 31, 2011, 04:28 AM PDT
I don’t need to list the elements to objectively assess, with sufficient clarity, that function is objective.
You need to count the elements to know what proportion of each set are also members of the larger sets. And unless you know what they are you can't do that. Or any of the other stuff I mentioned that you would have to do.
“In other words you know not one single relevant datum with which to compute your probability.” Irrelevant. The issue at hand is whether function is some sort of post hoc imposition, or whether it’s objectively real. Function is objective, and so is folding. Since not all sequences fold, the assessment is not arbitrary.
It certainly isn't irrelevant! We are talking about probability, right? And it's also incorrect. Tell me how you would assess, objectively, the proportion of DNA sequences that would result in a folding protein. Then tell me how you would assess, objectively, which of those are "functional".
Whether we can tell if a sequence will be functional, or even fold, is not the issue. Folding sequences fold, regardless of an observer’s judgment as to function.
Sure. But how do you tell whether a sequence is a folding sequence?
Therefore, in a sample space consisting of all sequences of length n, there is a set of folding sequences — a predetermined target.
Yes. But if you want to compute the probability of a sequence being a folding sequence then you have to have a way of determining what proportion of possible sequences are folding sequences. If not, why not?Elizabeth Liddle
December 31, 2011, 04:21 AM PDT
"Well, yes, it does. It only starts to get “objective” when you can list the contents of your functional set."
I don't need to list the elements to objectively assess, with sufficient clarity, that function is objective.
"In other words you know not one single relevant datum with which to compute your probability."
Irrelevant. The issue at hand is whether function is some sort of post hoc imposition, or whether it's objectively real. Function is objective, and so is folding. Since not all sequences fold, the assessment is not arbitrary. Whether we can tell if a sequence will be functional, or even fold, is not the issue. Folding sequences fold, regardless of an observer's judgment as to function. Therefore, in a sample space consisting of all sequences of length n, there is a set of folding sequences -- a predetermined target.material.infantacy
December 31, 2011, 04:05 AM PDT
Clearly this is not the case, as has been shown. Proteins which fold do so regardless of whether anyone’s “painted a bullseye” around them. Your claim has been refuted. Venn diagram: Draw a rectangle which represents the sample space. In the center of that, draw a circle which represents the set of folding sequences. In the center of that circle, draw another which represents the functional set. There’s the target, and it’s objectively real. It cannot reasonably be denied. Plain and simple, neat and clean. The target which constitutes folding, functional proteins is carved upon the face of reality by the laws of physics. It doesn’t get any more objective than that.
Well, yes, it does. It only starts to get "objective" when you can list the contents of your functional set. Until you've done that, firstly you have no way of knowing whether the "functional set" is coterminous with the folded set, nor the size of the folded set, nor do you know how similar the elements of the functional set are, nor in what contexts they prove functional, nor how close those contexts are. In other words you know not one single relevant datum with which to compute your probability. Unless you assert, without foundation, that the contents of the subset are the observed functional folds. Which would, as Petrushka says, be drawing your target round the arrow.Elizabeth Liddle
December 31, 2011, 03:47 AM PDT
should read: "Which is one (of many) reasons...."Elizabeth Liddle
December 31, 2011, 03:42 AM PDT
Not only is design impossible (except by an omnipotent deity I guess), but unless you know what proportion of sequences result in potentially useful proteins (or, indeed, are potentially useful regulatory sequences) there's no way of computing the probability that any one will arise "by chance" nor how closely clustered the useful stuff is. Which is (of many) reasons why the entire probability calculation approach is misconceived.Elizabeth Liddle
December 31, 2011, 03:41 AM PDT