Uncommon Descent Serving The Intelligent Design Community

On The Calculation Of CSI

My thanks to Jonathan M. for passing my suggestion for a CSI thread on and a very special thanks to Denyse O’Leary for inviting me to offer a guest post.

[This post has been advanced to enable a continued discussion on a vital issue. Other newer stories are posted below. – O’Leary ]

In the abstract of Specification: The Pattern That Signifies Intelligence, William Dembski asks “Can objects, even if nothing is known about how they arose, exhibit features that reliably signal the action of an intelligent cause?” Many ID proponents answer this question emphatically in the affirmative, claiming that Complex Specified Information is a metric that clearly indicates intelligent agency.

As someone with a strong interest in computational biology, evolutionary algorithms, and genetic programming, this strikes me as the most readily testable claim made by ID proponents. For some time I’ve been trying to learn enough about CSI to be able to measure it objectively and to determine whether or not known evolutionary mechanisms are capable of generating it. Unfortunately, what I’ve found is quite a bit of confusion about the details of CSI, even among its strongest advocates.

My first detailed discussion was with UD regular gpuccio, in a series of four threads hosted by Mark Frank. While we didn’t come to any resolution, we did cover a number of details that might be of interest to others following the topic.

CSI came up again in a recent thread here on UD. I asked the participants there to assist me in better understanding CSI by providing a rigorous mathematical definition and showing how to calculate it for four scenarios:

  1. A simple gene duplication, without subsequent modification, that increases production of a particular protein from less than X to greater than X. The specification of this scenario is “Produces at least X amount of protein Y.”
  2. Using only simplified forms of known, observed evolutionary mechanisms, Tom Schneider’s ev evolves genomes that meet the specification of “A nucleotide that binds to exactly N sites within the genome.” The genome required to meet this specification can be quite long, depending on the value of N. (ev is particularly interesting because it is based directly on Schneider’s PhD work with real biological organisms.)
  3. Tom Ray’s Tierra routinely results in digital organisms with a number of specifications. One I find interesting is “Acts as a parasite on other digital organisms in the simulation.” The shortest parasite is at least 22 bytes long, but it takes thousands of generations to evolve.
  4. The various Steiner Problem solutions from a programming challenge a few years ago have genomes that can easily be hundreds of bits. The specification for these genomes is “Computes a close approximation to the shortest connected path between a set of points.”

vjtorley very kindly and forthrightly addressed the first scenario in detail. His conclusion is:

I therefore conclude that CSI is not a useful way to compare the complexity of a genome containing a duplicated gene to the original genome, because the extra bases are added in a single copying event, which is governed by a process (duplication) which takes place in an orderly fashion, when it occurs.

In that same thread, at least one other ID proponent agrees that known evolutionary mechanisms can generate CSI. At least two others disagree.

I hope we can resolve the issues in this thread. My goal is still to understand CSI in sufficient detail to be able to objectively measure it in both biological systems and digital models of those systems. To that end, I hope some ID proponents will be willing to answer some questions and provide some information:

  1. Do you agree with vjtorley’s calculation of CSI?
  2. Do you agree with his conclusion that CSI can be generated by known evolutionary mechanisms (gene duplication, in this case)?
  3. If you disagree with either, please show an equally detailed calculation so that I can understand how you compute CSI in that scenario.
  4. If your definition of CSI is different from that used by vjtorley, please provide a mathematically rigorous definition of your version of CSI.
  5. In addition to the gene duplication example, please show how to calculate CSI using your definition for the other three scenarios I’ve described.

Discussion of the general topic of CSI is, of course, interesting, but calculations at least as detailed as those provided by vjtorley are essential to eliminating ambiguity. Please show your work supporting any claims.

Thank you in advance for helping me understand CSI. Let’s do some math!

Comments
And if you cannot provide a mathematically rigorous definition of a computer program then you don't know what you are talking about and are a waste of time.
Joseph
March 24, 2011, 03:33 PM PDT
MathGrrl:
"Unless you have a mathematically rigorous definition of CSI,"
500 bits of specified information as demonstrated by the math in "No Free Lunch". All the rigor for determining that is in that book. That said, all that has to be done is for someone to come along and demonstrate that a bacterial flagellum can evolve via an accumulation of genetic accidents, and that will be that for the design inference for the BF. But anyway, seeing that you ignore most of what I post, dealing with you is a waste of time.
Joseph
March 24, 2011, 03:31 PM PDT
"I know, for a fact, that the bacterial flagellum is a specified functional biological system. Not any ole sequence will produce one. I also know a BF contains thousands of parts- and that is before breaking it down into bits. Last I checked thousands is greater than 500 and 500 is the number I am looking to meet or break."
Unless you have a mathematically rigorous definition of CSI, you don't actually know that a bacterial flagellum is "full of it". Without such a definition, you don't even know what "it" is. If you can't define your terms, claiming that CSI is an indicator of intelligent agency is, to put it as politely as possible, premature. All I'm asking for in my original post and throughout this thread is a rigorous definition and some example calculations so that I can test the claims that ID proponents make with respect to CSI. I hope you'll consider providing that information.
MathGrrl
March 24, 2011, 03:20 PM PDT
Joseph, Sorry for making fun. I think your point is actually very good.
Collin
March 24, 2011, 03:18 PM PDT
Joseph, you are so dramatic. I noticed that you copied the same error in your 158 message as you did in your message at 145. That must be a posting-duplication mutation. I think that the CSI did not translate. :)
Collin
March 24, 2011, 03:16 PM PDT
MathGrrl, Please provide the rigorous mathematical definition for a computer program and you will how wrong you are.
Joseph
March 24, 2011, 03:04 PM PDT
Mathgrrl, I'm not sure he will succeed. It's probably like quantifying consciousness or intelligence. Psychologists say (tongue in cheek) that IQ tests measure intelligence and intelligence is what IQ tests measure. They know that it is tautological, but they've kind of given up on justifying it. There is just no way of making IQ correspond perfectly to anything in the real world. But it is useful in making predictions (even if they are inexact).
Collin
March 24, 2011, 03:03 PM PDT
MathGrrl:
"You’ve already told me that bacterial flagella are “full of CSI”. Surely you wouldn’t make a claim like that without being able to support it? Please show me the math that caused you to reach your stated conclusion."
I know, for a fact, that the bacterial flagellum is a specified functional biological system. Not any ole sequence will produce one. I also know a BF contains thousands of parts- and that is before breaking it down into bits. Last I checked thousands is greater than 500 and 500 is the number I am looking to meet or break.
Joseph
March 24, 2011, 03:02 PM PDT
MathGrrl:
"Does that calculation hold even if we find that the gene evolved via known evolutionary mechanisms from precursors that coded for a less useful, but still workable, protein?"
Do you read my posts? "Evolutionary mechanisms" is meaningless.
"Since you seem to understand how to calculate CSI in some detail, could you please do so for the four scenarios I described?"
No, they are bogus.
"Please explain how the paper you referenced aligns with Dembski’s definition of CSI in Specification…. As I’ve tried to make very clear, I am interested in understanding the specific metric used by ID proponents."
You mean explain it AGAIN? I told you by quoting Dembski- in biology CSI refers to biological function. And the paper deals with the information pertaining to biological function.
Joseph
March 24, 2011, 02:59 PM PDT
niwrad,
"About the “detailed calculations” I think that formulas are useful tools that one can apply after some basic principles are stated. In other words, formulas necessarily come after principles. For now we disagree on principles, so it is useless to put formulas on the table."
We're not disagreeing on any principle. I'm just asking for clarification on a core ID concept. Given that happy state, could you please provide a rigorous mathematical definition of CSI based on Dembski's discussion in Specification... and show how to calculate it for the four scenarios I described in my original post?
MathGrrl
March 24, 2011, 02:55 PM PDT
Joseph,
"Do the research and find out how many proteins are used- how many amino acids in each protein."
You've already told me that bacterial flagella are "full of CSI". Surely you wouldn't make a claim like that without being able to support it? Please show me the math that caused you to reach your stated conclusion.
MathGrrl
March 24, 2011, 02:53 PM PDT
MathGrrl #133
"You’re just restating your original claim that duplication does not increase CSI. I explained how duplication in biological systems can result in significant biochemical changes. If you maintain that this does not increase CSI, please show detailed calculations for the scenario I described."
Let’s suppose that a gene has CSI X. Let’s suppose that an organism has CSI Y. I think we both agree that Y is far greater than X (a gene is an infinitesimal part of an organism). Can the difference Y-X be caused by simple duplications? In software engineering duplication doesn’t work. You say that this analogy is flawed, but after all informatics/robotics is the technological field most similar to biology. It would be very strange if what doesn’t work at all in the former worked so well in the latter. You say that duplication in biological systems can result in significant biochemical changes, but you are very far from demonstrating that duplication is the cause of Darwinian evolution. About the "detailed calculations": I think that formulas are useful tools that one can apply after some basic principles are stated. In other words, formulas necessarily come after principles. For now we disagree on principles, so it is useless to put formulas on the table. Organization (and biology eminently shows organization) is quality. In the end, the principle on which we disagree is that, given X quality, by doubling X we obtain more quality. This is simply impossible. I repeat that this way we increase quantity only, while you repeat that quality (what are your "significant biochemical changes" but quality?) increases too, and this is absurd.
niwrad
March 24, 2011, 02:52 PM PDT
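The quantity-versus-quality point above can be made concrete in one narrow sense. The sketch below is an illustration only: it uses zlib-compressed size as a rough stand-in for how much non-redundant content a byte string carries. That stand-in is an assumption of this example; it is not CSI or any measure proposed in the thread. The random 1000-byte `sequence` is a hypothetical placeholder, not real gene data.

```python
import random
import zlib

# Fixed seed so the run is repeatable.
rng = random.Random(0)

# Hypothetical stand-in for a "gene": 1000 random bytes (not real sequence data).
sequence = bytes(rng.randrange(256) for _ in range(1000))

# A duplication without subsequent modification: the same bytes, twice.
duplicated = sequence + sequence

# Compressed size is used here only as a crude proxy for non-redundant content.
single_size = len(zlib.compress(sequence, 9))
doubled_size = len(zlib.compress(duplicated, 9))

print("raw lengths:       ", len(sequence), len(duplicated))
print("compressed lengths:", single_size, doubled_size)
```

Under these assumptions the doubled string is twice as long but compresses to nearly the same size, because the second copy is encoded as references to the first. The sketch says nothing about whether the duplication changes what the sequence does in a cell, which is the point MathGrrl and Noesis raise.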
Joseph,
"One “easy” example of doing so is taking a gene that cannot tolerate any variation- for example say it codes for a protein that has 200 amino acids- all have to be in that specific order. 6 bits per amino acid (2^6 = 64) x 200 amino acids = 1200 bits of specified information. And that means CSI is present."
Does that calculation hold even if we find that the gene evolved via known evolutionary mechanisms from precursors that coded for a less useful, but still workable, protein? Since you seem to understand how to calculate CSI in some detail, could you please do so for the four scenarios I described? I would very much like to understand it well enough to compute it myself.
"(and that mathgrrl cannot see the connection between CSI and that linked paper tells me she needs to do a lot of reading before asking her questions)"
Please explain how the paper you referenced aligns with Dembski's definition of CSI in Specification.... As I've tried to make very clear, I am interested in understanding the specific metric used by ID proponents.
MathGrrl
March 24, 2011, 02:50 PM PDT
MathGrrl:
"Excellent! Please provide a detailed calculation to show me how to objectively determine exactly how much CSI is present in “the” bacterial flagellum (pick whichever flagellum you prefer)."
Do the research and find out how many proteins are used- how many amino acids in each protein. Then find out how much variation each can tolerate. Then follow the instructions in the paper I linked to in comment 12- heck everything I just typed is paraphrased from that.
Joseph
March 24, 2011, 02:45 PM PDT
Collin,
"Do you think that in order to calculate CSI it must be quantified?"
While I don't understand Dembski's description well enough to calculate CSI myself, it is clear that he measures it in bits.
MathGrrl
March 24, 2011, 02:44 PM PDT
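For reference, the bit-valued quantity in Specification: The Pattern That Signifies Intelligence is stated there as χ = –log2[10^120 · φ_S(T) · P(T|H)], with design inferred when χ > 1. A minimal Python sketch of that arithmetic follows. The function and argument names are assumptions of this example, and the inputs are hypothetical placeholders: how to obtain φ_S(T) and P(T|H) for a real biological or digital system is exactly what this thread is asking.

```python
import math

# log2 of the 10^120 factor that appears in the formula.
LOG2_RESOURCES = 120 * math.log2(10)


def specified_complexity(log2_phi, log2_p, log2_resources=LOG2_RESOURCES):
    """Return chi = -log2(10^120 * phi_S(T) * P(T|H)), computed in log space.

    log2_phi: log2 of phi_S(T), the number of patterns at least as simple as T
    log2_p:   log2 of P(T|H), the probability of T under chance hypothesis H
    """
    return -(log2_resources + log2_phi + log2_p)


# Hypothetical inputs, chosen only to exercise the formula:
# phi_S(T) = 10^20 and P(T|H) = 2^-1000.
chi = specified_complexity(log2_phi=20 * math.log2(10), log2_p=-1000.0)
print(round(chi, 2), chi > 1)
```

Working with logarithms avoids underflow, since probabilities such as 2^-1000 are too small to multiply directly as floating-point numbers.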
OK, I think I have a way through this. Information is quantitative, yes? Specification, however, is not (it's certainly not in Orgel). So we can measure the information, but not the specification, which we can only say is either there or not. Does that seem reasonable? I have a bit of difficulty with this, from PaV above:
"“Specifications” are DISCOVERED: I see something. It suggests a pattern to me. I uncover the pattern (which means that it is translatable, or functional)."
The first part (suggests a pattern to me) sounds overly subjective, until the second part (uncover the pattern), which suggests some possibility of measuring the pattern.
QuiteID
March 24, 2011, 02:42 PM PDT
Joseph,
"And even then the BF is evidence for design as it is full of CSI."
Excellent! Please provide a detailed calculation to show me how to objectively determine exactly how much CSI is present in "the" bacterial flagellum (pick whichever flagellum you prefer).
MathGrrl
March 24, 2011, 02:42 PM PDT
MathGrrl- from comment 117: The point of CSI is that its existence is a sign of a designing agency. CSI is defined as X number of bits of specified information. In “No Free Lunch” X = 500, which the math shows equals a probability greater than the upper probability bound. With respect to biology specified information equates to biological function. To see if CSI is present we need to determine if there is > 500 bits of specified information. One “easy” example of doing so is taking a gene that cannot tolerate any variation- for example say it codes for a protein that has 200 amino acids- all have to be in that specific order. 6 bits per amino acid (2^6 = 64) x 200 amino acids = 1200 bits of specified information. And that means CSI is present. That said if there can be some variation you have to figure that in, which brings us back to the paper I linked to in comment 12. (and that mathgrrl cannot see the connection between CSI and that linked paper tells me she needs to do a lot of reading before asking her questions)
Joseph
March 24, 2011, 02:42 PM PDT
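The arithmetic in the comment above can be sketched in a few lines of Python. This only reproduces the stated calculation (bits per position times number of positions, compared with the 500-bit threshold). It assumes, as the comment does, that every symbol is equally likely and that no variation is tolerated. The function name and the 20-symbol alternative are assumptions added for illustration: 2^6 = 64 is the number of codons, whereas there are 20 standard amino acids.

```python
import math

# Threshold cited in the comment above.
CSI_THRESHOLD_BITS = 500


def specified_bits(length, alphabet_size):
    """Bits needed to single out one exact sequence of `length` symbols,
    assuming equally likely symbols and zero tolerated variation."""
    return length * math.log2(alphabet_size)


# The comment's figures: 200 positions at 6 bits each (2^6 = 64).
bits_as_stated = specified_bits(200, 64)

# Alternative assumption (not in the comment): an alphabet of 20 amino acids.
bits_amino = specified_bits(200, 20)

print(bits_as_stated, bits_as_stated > CSI_THRESHOLD_BITS)
print(round(bits_amino, 1), bits_amino > CSI_THRESHOLD_BITS)

# The 500-bit threshold expressed as a probability.
print(2.0 ** -CSI_THRESHOLD_BITS)
```

Both variants exceed 500 bits for a 200-position sequence under these assumptions. A 500-bit threshold corresponds to a probability of 2^-500, roughly 3 × 10^-151. Accounting for tolerated variation, as the comment notes, would lower the bit count.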
MathGrrl:
"Please provide the rigorous mathematical definition of CSI that shows that it cannot be calculated for the scenarios I describe."
Please provide the rigorous mathematical definition for a computer program and you will how wrong you are.
Joseph
March 24, 2011, 02:38 PM PDT
Mathgrrl, Do you think that in order to calculate CSI it must be quantified? In other words, would I have to be able to look at a code, genome or sentence and say, "This has 7 CSIs" or 1 "Dembski" (like one Watt or one Calorie or one Newton)? I don't know if it can be quantified. In economics, attempts have been made to measure "utility" (similar to "value" or "benefit"). They call it 1 util. But as I understand it, those attempts have not led to "rigorous mathematical definitions" of utility. But that does not mean that "utility" does not exist or that it is not helpful as a concept in increasing understanding of an economy. I guess I am not optimistic that you will find a rigorous mathematical definition of CSI. Again, I would hope that Mr. Dembski would weigh in on this issue.
Collin
March 24, 2011, 02:37 PM PDT
Mathgrrl, At this point, the whale poo at the bottom of the ocean has succeeded in realizing that my argument with you is not over the math, it is over the completeness of your conclusions (post #31). If you wish to have a discussion with me, you can take the opportunity to finally address the issue that I have now grown tired of bringing up to you. “Does the output of any evolutionary algorithm being modeled establish the semiosis required for information to exist, or does it take it for granted as an already existing quality”.
Upright BiPed
March 24, 2011, 02:37 PM PDT
MathGrrl:
"The only point I would like to add is that CSI is claimed by ID proponents, including Dembski, as an unambiguous indicator of intelligent agency for biological artifacts such as the bacterial flagella. If CSI were really only about origins, such claims would be ridiculous on their face."
Why would it be ridiculous on their face? You can't have a bacterial flagellum without the bacteria. And even then the BF is evidence for design as it is full of CSI.
"The fact is that ID proponents do claim to be able to measure CSI in biological systems without reference to their origins."
That has been explained for you also. It's as if you ignore more than half of what is posted. If that is how you are going to be then fine I suggest everyone leave you alone.
Joseph
March 24, 2011, 02:36 PM PDT
Noesis, My organic view of information has been well noted. It presents no problem, and results in much the same conclusion. Personally, I blame Shannon for conflating noise with data. :)
Upright BiPed
March 24, 2011, 02:34 PM PDT
Mark, I find it interesting that you see different opinions among IDists as “important”. I wonder what exactly makes the grade of importance in your estimation. Obviously, if I drink the Kool-Aid down at the NCSE, then ID is an insignificant group of incorrigible religious fanatics that have been repeatedly discredited by scientific observation. One would think the number of scientific facts that their ideas contradict must be an amazing document to behold (if it only existed). On the other hand, Darwinism and its corollaries represent no less than the modern scientific unification of all acquired human knowledge, collected together into the Absolute Truth of our reality. So powerful is it, that it should be legislated on one front, and used as the billy club of enforcement on another. Given this, I am interested. If it is true that any differences of opinion are marked as “important” among the forever discredited, how much more important are these same differences among those that command the obedience of everyone? It would seem to me that if differences of opinion carry the importance you suggest, then such importance would signify one thing among the insignificant (e.g. “who cares”), but must certainly mean something entirely different among those who cannot be denied (e.g. “no questions allowed”). For instance what if one proponent should say that natural selection is the ultimate creative force in the cosmos, while another says it is a weak force which is not even dominant? What if one should say that the theory not only predicts but demands gradualism, while another suggests rampant cladogenesis? What if one should say that the tremendous improbability of Life comes from a single ancestor, while another suggests that Life may have had several beginnings? Are these important, or should we genuflect in their presence and whisper to ourselves the comforting slogans of the authority?
And what happens when their contradictions occur not only among themselves, but against other factual observations? We are told that DNA is at the center of Life; a digital code that can accommodate any amount of information. We are also told with absolute certainty that it cannot be the result of anything other than a natural process. Yet, where is there any actual evidence whatsoever that a natural process can create a digital code? I suppose what constitutes “importance” will be forever subjective, no? In any case, you skipped my question with the same aplomb as Mathgrrl: “Does the output of any evolutionary algorithm being modeled establish the semiosis required for information to exist, or does it take it for granted as an already existing quality”.
Upright BiPed
March 24, 2011, 02:33 PM PDT
Noesis #122
"Suppose that duplication of a gene doubles the amount of a protein produced by a cell. This can have a huge impact on phenotype. MathGrrl seems to have measure of CSI on the phenotype, and not the genotype, in mind. Take my behavior as the phenotype. If you send me an email message, I may put off dealing with it. If you repeat the transmission, I will generally respond promptly. What I’m describing here is a nonlinear response to a redundant transmission."
The increase of a protein (eventually produced by gene duplication) is by definition a quantitative effect with no CSI per se. CSI implies quality, and in principle quantity doesn’t entail quality. Such quantitative effects can trigger some cellular events only if the cell allows it by design. Anyway, the "huge impacts on phenotype" caused by the increase of a protein have nothing to do with the immense functional hierarchies found in organisms that Darwinian macroevolution pretends to explain naturalistically. About the "nonlinear response to a redundant transmission" by a human, I repeat somewhat what I said previously: it is not the redundancy of the message that carries real new information, but rather the intelligence of the receiver, which applies additional meanings/decisions/interpretations fundamentally not contained inside the message (even if duplicated).
niwrad
March 24, 2011, 02:12 PM PDT
Joseph,
markf:
"All that Mathgrrl is asking is how do you calculate CSI in some specific cases."
"Right and at least some IDists are saying those cases are bogus."
Please provide the rigorous mathematical definition of CSI that shows that it cannot be calculated for the scenarios I describe. I see nothing in Dembski's work that suggests this is the case.
MathGrrl
March 24, 2011, 02:05 PM PDT
Noesis,
"Where is the rigorous definition of the sample space Ω [Omega]? If the sample space is ill-defined, then so is CSI."
I'm afraid you're demonstrating the same misunderstanding as Upright BiPed. I don't know how to calculate CSI, despite having read the relevant material by Dembski. I'm asking the ID proponents here to help me out. Could you please provide a rigorous mathematical definition of CSI based on Dembski's discussion in Specification... and show how to calculate it for the four scenarios I described in my original post?
MathGrrl
March 24, 2011, 02:04 PM PDT
PaV,
"I think it is the height of hubris to come to this blog and DEMAND that someone demonstrate to you in a mathematically rigorous fashion that CSI was NOT generated by Tierra and its ilk."
I'm not demanding, I'm asking as politely as I can. I'm also not asking for what you claim I am; I'm simply requesting a few example calculations so that I can understand how to measure CSI objectively myself.
"That is the stuff of PhD work."
It shouldn't be. While I don't find Dembski's explanation in Specification... to be clear enough to allow me to calculate CSI, none of the math there is beyond an average high school student.
"Shallit has tried to prove Dembski wrong, and it turned out he was wrong."
Could you provide a reference? I would like to learn from Shallit's mistakes. And I'll ask, politely, again: Could you please provide a rigorous mathematical definition of CSI based on Dembski's discussion in Specification... and show how to calculate it for the four scenarios I described in my original post?
MathGrrl
March 24, 2011, 02:03 PM PDT
scordova,
"The question is whether evolutionary algorithms can spontaneously generate CSI without authorship of intelligent agency."
That is indeed the question I am trying to answer. Can you provide a rigorous mathematical definition of CSI and example calculations for the four scenarios I described?
"That is the subject of No Free Lunch discussions."
The NFL theorems do not apply to a situation in which there is only one fitness landscape, nor to the situation where the fitness landscape is dynamic. That makes it doubly inapplicable to the world we observe.
MathGrrl
March 24, 2011, 02:03 PM PDT
Mark Frank,
"I agree that CSI is supposed to be a method for making deductions about origins. However, the whole point of Dembski’s paper (and indeed his other work) is that he suggests that CSI is a property of an object that you can assess without knowing anything about its origins. All that Mathgrrl is asking is how do you calculate CSI in some specific cases. According to Dembski it should be possible to do this without knowing anything about the origins of the object. So Joseph’s objection that the situations she puts forward are not about origins is irrelevant."
Exactly, Mark, thank you for stating this so clearly. Can anyone provide the detailed calculations I requested in the original post of this thread?
MathGrrl
March 24, 2011, 02:02 PM PDT