Uncommon Descent Serving The Intelligent Design Community

On The Calculation Of CSI


My thanks to Jonathan M. for passing my suggestion for a CSI thread on and a very special thanks to Denyse O’Leary for inviting me to offer a guest post.

[This post has been advanced to enable a continued discussion on a vital issue. Other newer stories are posted below. – O’Leary ]

In the abstract of Specification: The Pattern That Signifies Intelligence, William Dembski asks “Can objects, even if nothing is known about how they arose, exhibit features that reliably signal the action of an intelligent cause?” Many ID proponents answer this question emphatically in the affirmative, claiming that Complex Specified Information is a metric that clearly indicates intelligent agency.

As someone with a strong interest in computational biology, evolutionary algorithms, and genetic programming, this strikes me as the most readily testable claim made by ID proponents. For some time I’ve been trying to learn enough about CSI to be able to measure it objectively and to determine whether or not known evolutionary mechanisms are capable of generating it. Unfortunately, what I’ve found is quite a bit of confusion about the details of CSI, even among its strongest advocates.

My first detailed discussion was with UD regular gpuccio, in a series of four threads hosted by Mark Frank. While we didn’t come to any resolution, we did cover a number of details that might be of interest to others following the topic.

CSI came up again in a recent thread here on UD. I asked the participants there to assist me in better understanding CSI by providing a rigorous mathematical definition and showing how to calculate it for four scenarios:

  1. A simple gene duplication, without subsequent modification, that increases production of a particular protein from less than X to greater than X. The specification of this scenario is “Produces at least X amount of protein Y.”
  2. Tom Schneider’s ev evolves genomes that meet the specification “A nucleotide that binds to exactly N sites within the genome,” using only simplified forms of known, observed evolutionary mechanisms. The length of the genome required to meet this specification can be quite long, depending on the value of N. (ev is particularly interesting because it is based directly on Schneider’s PhD work with real biological organisms.)
  3. Tom Ray’s Tierra routinely results in digital organisms with a number of specifications. One I find interesting is “Acts as a parasite on other digital organisms in the simulation.” The shortest evolved parasite is at least 22 bytes long, and takes thousands of generations to arise.
  4. The various Steiner Problem solutions from a programming challenge a few years ago have genomes that can easily be hundreds of bits. The specification for these genomes is “Computes a close approximation to the shortest connected path between a set of points.”
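The raw quantity underlying most of these discussions is the improbability of an outcome under a chance hypothesis, expressed in bits. As a point of reference for the scenarios above, here is a minimal sketch of that calculation for nucleotide sequences, assuming only a uniform-chance hypothesis and Dembski's 500-bit universal probability bound. This is the starting point of a CSI calculation, not the full metric, which also requires a specification:

```python
import math

# Dembski's universal probability bound, 10^-150, expressed in bits:
# -log2(10^-150) ~ 498.3, commonly rounded up to 500 bits.
UNIVERSAL_BOUND_BITS = 500

def specified_information_bits(sequence_length: int, alphabet_size: int = 4) -> float:
    """Bits of information in one exact sequence of the given length, under a
    uniform-chance hypothesis (each symbol equiprobable and independent)."""
    return sequence_length * math.log2(alphabet_size)

def exceeds_universal_bound(sequence_length: int, alphabet_size: int = 4) -> bool:
    return specified_information_bits(sequence_length, alphabet_size) > UNIVERSAL_BOUND_BITS

# A 128-nucleotide sequence, the scale of the ev genomes in scenario 2:
print(specified_information_bits(128))  # 256.0 bits
print(exceeds_universal_bound(128))     # False: 256 bits is below the 500-bit bound
print(exceeds_universal_bound(251))     # True: 251 * 2 = 502 bits
```

Note that this computes only the improbability of one exact sequence; how to factor in the specification is exactly where the proposed definitions of CSI diverge, and is the question the scenarios above are meant to pin down.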

vjtorley very kindly and forthrightly addressed the first scenario in detail. His conclusion is:

I therefore conclude that CSI is not a useful way to compare the complexity of a genome containing a duplicated gene to the original genome, because the extra bases are added in a single copying event, which is governed by a process (duplication) which takes place in an orderly fashion, when it occurs.

In that same thread, at least one other ID proponent agrees that known evolutionary mechanisms can generate CSI. At least two others disagree.

I hope we can resolve the issues in this thread. My goal is still to understand CSI in sufficient detail to be able to objectively measure it in both biological systems and digital models of those systems. To that end, I hope some ID proponents will be willing to answer some questions and provide some information:

  1. Do you agree with vjtorley’s calculation of CSI?
  2. Do you agree with his conclusion that CSI can be generated by known evolutionary mechanisms (gene duplication, in this case)?
  3. If you disagree with either, please show an equally detailed calculation so that I can understand how you compute CSI in that scenario.
  4. If your definition of CSI is different from that used by vjtorley, please provide a mathematically rigorous definition of your version of CSI.
  5. In addition to the gene duplication example, please show how to calculate CSI using your definition for the other three scenarios I’ve described.

Discussion of the general topic of CSI is, of course, interesting, but calculations at least as detailed as those provided by vjtorley are essential to eliminating ambiguity. Please show your work supporting any claims.

Thank you in advance for helping me understand CSI. Let’s do some math!

Comments
This is completely opposite of the Law of Thermodynamics, which states that neither energy nor matter can be created or destroyed, and that information must be passed along from its host, not created from nothing or on its own. So in this experiment, they were able to create a gene with a new DNA information structure, different from what should've existed or been allowed.

PatHarris334
August 13, 2018 at 10:30 AM PDT
It has been more than two years since this post began. We've learned, subsequently, just as I suspected, that MathGrrl is really not a "girl", and that it was a person whose sole purpose was to work to undermine the ID position. This was obvious from the beginning, and was the reason I thought Denyse was wrong in allowing this post, and the reason I told so-called "MathGrrl" to just "go away."

Since that time, I've had a chance to look at Schneider's ev program. I haven't looked at it for quite some time now, but remember some few specifics. In response to so-called "MathGrrl", the following can be said: the ev program does not have a truly defined "specification." What substitutes for this "specification" is a digital field representing what are supposed to be 128 nucleotide bases. The ev program is designed to 'evolve' a 'protein binding site', and a nucleotide sequence that matches up to the site. Each time the program is run, and further depending on the selection of various program variables, different 'specifications', i.e., different "protein binding sites," with matching protein sequence, are arrived at.

[N.B. The ev program is sort of rigged in several ways to produce a successful 'protein binding site.' One of the ways you might say that it is "rigged" is that after each 'replication' (program run-through), the ev program selects the top 64 sequences for duplication. So, it sweeps away one half of the produced sequences based on how high the 'score' is for each of these sequences. Well, how is this score arrived at? It's done by seeing how well the sequences match up based on a kind of grading system. This grading system amounts to a fitness function, and is a way of bringing in information that would not otherwise be available to a truly random process. The net effect of this one-half elimination is that within two more 'runs' of the program the highest-scoring sequence is now found in all 128 available sequences. You have the option of turning off this replacement process. When you do, the program runs on ceaselessly: i.e., the 'protein binding site' is NEVER arrived at. IOW, to consider this a truly random operation is to give it a very generous interpretation. But we go on . . . ]

Given these circumstances, to vastly oversimplify the situation, one can simply say that the ev program ultimately provides a "specified" nucleotide sequence that is 128 positions long. Taking this 128-long nucleotide sequence as being equivalent to an actual 128-long sequence in protein-coding DNA, we then know that the probability associated with each nucleotide is 1 in 4. [Now, let's understand that with what I've written above, the "actual" probabilities are much lower, almost equal to 1 in 1, since what Schneider has done is essentially to write a set of mathematical equations for which he is trying to find a solution. Given the value matrix he applies in calculating the 'value' of each sequence (i.e., its nearness to a given calculated value for the computer-produced 'binding site'), the actual odds of finding the right 'solution' are not that high, and crunching numbers will get you there. But, for the sake of simplification, we're overlooking these real flaws in the design of the program (flaws in the sense of not really tracking with what NS does in nature).]

Hence, the "rejection region" that I long spoke of can be calculated. It's quite simple, and it reflects the basic answer I originally gave to MathGrrl. The 'rejection region' is 1 in 4^128; or, equivalently, 1 in 2^256. While a very small 'rejection region,' it is not what CSI requires, which is a 'rejection region' of 1 in 2^500. Hence, the output of the ev program DOES NOT PRODUCE "CSI." But I said this all along, didn't I? This is why the interaction with MathGrrl was fruitless and foolish.

If the man who presented himself as MathGrrl could understand these matters, what I said to him, continually, should have been straightforward. But it wasn't. Which was only a further tip-off that we were being had. The only critic of ID, and of Dembski's NFL/CSI, who had a good understanding of a "specification" was Mark Perakh. However, even he made fundamental errors at times, errors which caused him to draw false conclusions about what ID, and Dembski in particular, had to say. With this said, the underpinnings of ID, as found in NFL, still, in my estimation, stand scrutiny. There is no evidence that it has been overturned by any computer program produced so far, or any other kind of counterexample.

PaV
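The arithmetic in the comment above can be checked directly with exact integer math. This short sketch verifies only the claimed equivalence 4^128 = 2^256 and the comparison against the 2^500 threshold the comment uses; it takes no position on whether that threshold is the right criterion:

```python
# Configurations of a 128-position sequence over a 4-letter alphabet:
configs = 4 ** 128

# 4^128 is exactly 2^256, as stated in the comment:
assert configs == 2 ** 256

# Compared with the 2^500 threshold cited above, 2^256 falls short
# by a factor of exactly 2^244:
print(configs < 2 ** 500)                 # True
print((2 ** 500) // configs == 2 ** 244)  # True
```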
August 5, 2013 at 12:23 PM PDT
So in Dr. Tom Schneider's Nucleic Acids Research paper on the evolution of biological information, it seems that through his research he was able to perform a frame shift, normally a highly destructive mutation, but in this instance beneficial, and as such new information. This is completely opposite of the Law of Thermodynamics, which states that neither energy nor matter can be created or destroyed, and that information must be passed along from its host, not created from nothing or on its own. So in this experiment, they were able to create a gene with a new DNA information structure, different from what should've existed or been allowed.

Although there have been some researchers who tried to debunk Dr. Schneider's findings, like Batten, Behe, Bracht, Dembski, Gitt, Meyer, Strachan, Joseph, Truman, and Williams, all of whom seem to have been debunked themselves. Does anyone here know if Dr. Tom Schneider's Nucleic Acids Research paper on the evolution of biological information has been debunked, or is it valid research that points to evolution's ability to create new information on its own? Or is there genuine reference material that suggests this theory is flawed and nature is not able to create gene material containing new information of its own accord? Please let me know.

greghar
February 29, 2012 at 2:27 PM PDT
MathGrrl @ 342:
I’m going to address a number of the comments, but certainly not close to all. If you think I’ve overlooked an important point, please call my attention to it.
Is it "an important point" that claims by critics of ID that evolution can generate CSI undermine your entire case? Yes, I think you've overlooked that point. I also think it's an important point. I raised it way back in post #248. You know, back when I was trying to figure out whether you were even worth my time?

Mung
April 9, 2011 at 7:52 PM PDT
We are now over the 400 comment mark and I haven’t seen any reason to change the provisional conclusions that I reached in my post...
Amazing. I had no clue that you'd already reached any conclusions, especially as early as the OP. Either we're all idiots, or... I came across another thread on a different forum accusing you of being a fraud. If you had bothered to respond to my own request in this thread that you establish your bona fides... But you didn't. So I am left with: could you please point me to any post in this thread in which you display competence in any of the following: 1. computational biology, 2. evolutionary algorithms, 3. genetic programming? You can't. You don't.

Mung
April 6, 2011 at 6:32 PM PDT
I know that specific DNA sequences cause certain amino acids to appear. I asked what those DNA sequences symbolise. Do those DNA sequences symbolise the corresponding amino acids? After all causes are not typically regarded to be symbols of their effects. High pressure causes clear skies – but it doesn’t symbolise clear skies.
Mark, you are quite correct that (on its face) high atmospheric pressure does not symbolize clear skies. To what entity would this symbol have any meaning? If one were to actually say that high pressure symbolizes forthcoming clear skies, it would only do so for a living observer who happens to pay attention and creates a mapping of the discrete symbol (high pressure) to the discrete object (clear skies). But even then, that mapping (the rule that high pressure equates to clear skies) would only exist in the mind of the observer.

This has nothing whatsoever to do with the translation of a recorded code within the cell. Firstly, in cellular translation there is no third-party observer assigning meaning after the fact. In place of an observer, the mapping of the symbol has been physically instantiated within the system of translation. This is exactly what takes place within computer programming, as well as in the entire history of machine-code-operated machinery. A system has been organized so that the presence of an input is purposefully mapped to an output. The input and output are linked by the context of the organization; without that context, one would be literally meaningless to the other.

As an example, adenine has no particular physical relationship to asparagine. However, if you present adenine to the cellular translation machinery in the correct order (one that matches the physically instantiated mapping of adenine to asparagine), then asparagine will be added during protein synthesis. Needless to say, this is significantly different from an observer witnessing a cause-and-effect relationship in weather patterns and then assigning order to what he or she observes. In one instance there is a direct step-by-step chain of physical causes leading to the effect. In the other, that chain comes to a certain point and then it stops. The point where it stops is at the presence of the symbol.

"The existence of a genome and the genetic code divides the living organisms from nonliving matter. There is nothing in the physico-chemical world that remotely resembles reactions being determined by a sequence and codes between sequences." -- Hubert P. Yockey, Information Theory, Evolution, and the Origin of Life

Upright BiPed
March 31, 2011 at 2:05 PM PDT
KF @ 431 said:

> For the X-metric, the approach is simple: 125 bytes has sufficient possible configs that the observed cosmos as a whole across its thermodynamic lifespan would not be able to sample as much as 1 in 10^150 of the configs. That is, the search is so small a fraction that it rounds down to no effective or credible search. This is patent.

The fewer states ("possible configs") a search has to visit to achieve a goal (whether specified as a unique target or as satisfying a function without a target), the more efficient it is. A binary search, for example, is far more efficient than a brute-force search and visits only a fraction of the total possible states. It does this by making use of properties of the space (e.g., the space is sorted). The number of states visited is much smaller, but it is not zero. The small number of states visited is not, by itself, an indication that the search is not credible. That is nonsensical.

However, I think your intended point is that the space being searched doesn't have the necessary properties to allow it to be searched in so brief a time, e.g., a binary search being done on an unsorted space with no order that can be exploited by the search. A binary search would not give valid results on such a space. So the real question is: does a biological space have order or redundancy that can be exploited? And how exactly does evolution (or a GA model of it) search such spaces? Do they need to visit a sizable number of states to be effective? Or is that a red herring from misunderstanding both biology and search algorithms?

So what do we notice about biological spaces? For one thing, only some changes matter: most mutations are neutral in functional effect, and even when there is a change in a protein, if the change doesn't affect a binding site it can also be neutral in effect. In other words, a working individual is not a single point in the worst-case maximum of "possible configs," but rather spans a volume of points at least as large as every permutation of neutral changes that can be made to that individual. And that isn't even touching on the much larger volume of changes around an individual which actually have some minimal functional phenotypic effect, even useful ones, such as the wide variety of dog breeds.

In other words, the actual search space that must be visited to reach a working individual is greatly reduced by the properties of the biological search space. If the proper algorithm is used to exploit this, you can get away with searching a fraction of it. A brute-force search would still be ineffective even given this kind of search space. Fortunately, evolution and GAs are not brute-force algorithms. Like a binary search, they make use of the properties of the search space.

Consider how useful mutations get fixed in populations. What is the effect of fixation on the search space? It means that whole swaths of the search space are no longer visited to any significance, and do not need to be. The size of the search space has just been dynamically reduced as a direct result. That reduction happens for every fixation, and those reductions in the "possible configs" multiply.

The reverse also happens. When a genome duplicates in full, the search space dramatically expands. It is as if our 128x128 smiley face has been resampled as a 256x256 smiley (initially blurry). The space now has more possible states, some of which will get "fixed" and reduce the search space again once some of these more precise "pixels" are visited and tested by selection. But how did we get to a 128x128 smiley by an evolutionary search? By starting with a 2x2, a 4x4, a 16x16 smiley, with previous fixations at every stage eliminating whole swaths of the search space even as the search space expands.

This kind of search only has to visit a tiny fraction of the possible states. The number of total "possible configs" is a red herring. What matters are the properties of the search space and how an algorithm exploits those properties. In the right combination, you get highly efficient results without having to search the entire space. Bravo to MathGrrl and others for working on actual tests of the behavior of such algorithms.

smgr78
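The search-efficiency point above can be made concrete with a standard, non-biological example. In the sketch below, both searches find the same target in the same space, but the binary search exploits the sorted structure and visits a vanishingly small fraction of the states; the small visit count by itself says nothing about the search's credibility:

```python
def linear_search_steps(space, target):
    """Brute-force scan; returns the number of states visited."""
    for steps, value in enumerate(space, start=1):
        if value == target:
            return steps
    return len(space)

def binary_search_steps(space, target):
    """Binary search on a sorted space; returns the number of states visited.
    It works only by exploiting the ordering of the space -- exactly the kind
    of structure at issue in the comment above."""
    lo, hi, steps = 0, len(space) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        steps += 1
        if space[mid] == target:
            return steps
        elif space[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps

space = list(range(1_000_000))
print(linear_search_steps(space, 999_999))  # 1000000 states visited
print(binary_search_steps(space, 999_999))  # at most 20 states visited
```

Whether biological fitness landscapes have exploitable structure of this kind is, of course, the substantive question; the sketch only shows that "visits few states" and "not a credible search" are independent properties.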
March 31, 2011 at 12:15 PM PDT
Here's your example. Casey Luskin has rather strongly implied that random strings of letters don't have CSI, whereas sentences do. So here's a very simple test for CSI. I here present two strings in a simple letter-to-number substitution code. One of them spells a coherent sentence composed by myself, and one is a string of randomly generated letters. Run the CSI calculations, SHOW ME the CSI calculations, and based on that determine which is which:

1,4,5,12,14,3,6,2,4,6,26,26,17,19,14,12,28,20,6,9,2,7,17,13,7,25,11,17,1,22,17,30,7,11,10,11,18,22,20,6,16,5,2,10,2,27,18,12,1,20,28

11,6,10,18,9,12,27,18,11,9,14,6,12,27,7,6,23,9,19,27,16,1,14,1,17,11,15,10,19,15,14,6,17,1,8,2,9,17,1,15,19,9,14,21,15,17,13,1,23,9,30

Remember: it is very important that this is done on the basis of ACTUAL CSI CALCULATIONS and not 'cheating' of some sort. Thus, the number-substitution code. We're testing your claim that you can detect design through calculating complex specified information, not your ability to paste text into Google Translate, run expectation algorithms to determine real-language consonant and vowel usage, look for repeating patterns, or even laboriously transliterate everything into the Latin alphabet and read it aloud to see if it sounds like language.

extremities
March 31, 2011 at 12:43 AM PDT
Kairosfocus,
Complex, functionally specific information is used in a great many contexts and is a characteristic sign of intelligence. Take the posts in this blog thread for a start, then go over to the software used in the ICT’s industry, and the whole Internet.
That information is used and is useful is not in question. As you say, take the posts in this blog and calculate the CSI (you made some mention of this above, I recall). Now we can determine that these posts are not randomly generated, which is great, but we already knew that even before putting numbers on it. Not so useful. However, I wasn't actually attempting to create a useful example, just an example of the simplest kind. It need not prove anything at all. Must sleep now. Maybe more later?

Tomato Addict
March 30, 2011 at 8:49 PM PDT
TA: Please open your eyes. Complex, functionally specific information is used in a great many contexts and is a characteristic sign of intelligence. Take the posts in this blog thread for a start, then go over to the software used in the ICT industry, and the whole Internet. Orgel and Wicken were correct to highlight CSI and FSCI in the 1970s. The concept is meaningful at an intuitive, common-sense level, and poses a serious challenge to origin-of-life research and the origin of body plans.

Quantifications and models, ranging from the simple X-metric to Durston et al. and the FITS metric published in 2007, or Dembski's CSI metric model, may be debated on specifics, and are subject to development, as is so with all scientific work; but that does not change the basic fact that CSI and FSCI are real and plainly matter. They matter so much, in fact, that there is now an active push to deny and to dismiss. The difference here is that in the design theory investigations, a threshold estimate is to be identified for how much will be sufficiently complex and specific that it is not credible to infer that lawlike natural forces and processes and/or chance contingency have by good luck given rise to such information.

In the case of the X-metric, a simple, brute-force approach has been used. For the X-metric, the approach is simple: 125 bytes has sufficient possible configs that the observed cosmos as a whole, across its thermodynamic lifespan, would not be able to sample as much as 1 in 10^150 of the configs. That is, the search is so small a fraction that it rounds down to no effective or credible search. This is patent. In that context, given that we have an empirically, routinely observed -- and only observed -- source of such FSCI, intelligence is a superior explanation.

The duplication objection raised above fails to address the problem of how the duplication arises -- notice the "simple" in the original post -- as I have already pointed out repeatedly above. Such duplication, however, implies more than "it arises by magic." That is, it implies a regulated, stage-by-stage process that duplicates. Such a process, in relevant examples, will normally involve more than 125 bytes' worth of working information, on the simple grounds that 125 bytes is not enough space to do anything of significance with software: at, say, 6 letters per keyword on average, we are talking of 20 or 21 words. Not a lot of working space.

In the genome, the process of replication of the chromosome is itself quite involved, and has sub-processes called up if errors are detected. It can fail, especially for things like cancer, but the failure is a demonstration of what happens as a rule when something that is functionally specific breaks down under the impact of a chance factor. If we are dealing with a PC, an algorithm to duplicate a given string will similarly be quite functionally specific and complex, and will likely be irreducibly complex as well. It will require setting up start, sequence, decisions, looping, termination of loops and halting, with symbolic elements implemented according to definite rules and conventions more broadly. That means that a genetic or evolutionary algorithm based on duplicate, modify, and improve is fatally dependent on intelligent programming. Indeed, my consistent objection to the proposed genetic-algorithm models of evolution is that they start on a highly functionally specific and complex island of function, set up by intelligence, and operate according to that already being on the target zone.

As for the X = S*C*B simple metric, that is intentionally very simple and based on familiar things such as data measured in bits, and no resort to anything more than the usual judging semiotic agent, aka observer [who is involved in estimating how much solution is in a measuring cylinder, or the length of a string as compared with a metre stick, etc.], is needed. So, pardon: it would help if you would actually address what is on the table, not a strawman:

(i) Is the item functionally or otherwise specific, so that it is isolated in the config space relative to the non-functional ones? (E.g., what is the effect of injected noise on function? Try out your PC application programs, or an image file, etc., to see this.) 1/0 for the obvious alternatives.
(ii) Is it sufficiently complex, as measured by number of bits, to pass the 1,000-bit scope that leads to a config space of at least 10^301 possibilities? 1/0 again.
(iii) How many bits are actually explicitly used or implied to store the info involved in the function?

When a function is specific [e.g., we write posts using conventions of English], complex [at least 143 ASCII characters], and involves a certain number of bits, e.g. the 4,469 7-bit characters [= 31,283 bits] in the original post, we can deduce the value 31,283 functionally specific and complex bits at work. The explanatory filter would point to the best explanation for this being design, and in fact we have good independent reason to accept that this is true.

In the case of more complex metrics, I have repeatedly pointed onlookers to -- and even excerpted -- the case by Durston et al., where the Shannon H-metric, integrated with functionality as assessed for observed sequences for proteins in 35 families, was used to give a table of FSC in FITS, published in the peer-reviewed literature. I find it highly interesting that, consistently, this vital data point has been ignored by objectors, as if it does not exist in the peer-reviewed literature. (Notice, it is a more sophisticated form of the X-metric, targeting functional as opposed to random or orderly sequences of symbols.) Similarly, when I see commenters -- just scroll up -- looking at the genetic code and, in 2011, trying to dismiss what most patently is a digital code, that is telling. Let me cite Crick, in his letter to his son Michael on his discovery, March 19, 1953:
Now we believe that the DNA is a code. That is, the order of bases (the letters) makes one gene different from another gene (just as one page of print is different from another)
So, who should I believe: Crick, or the dismissive objector above? As a further example of what is going on, let's look at the CSI scanner thread at no. 8, where JemimaRacktouey artfully clips off only a part of the UD corrective at 27, to suggest that the X-metric and the overall comment in that remark are simplistic and incorrect. Where did she clip off? Just before the more serious level of analysis was introduced, from Durston and Dembski, with links. In short, we are seeing selective hyperskepticism, strawman tactics, willful refusal to accept patent facts and reasonable findings [often tracing to the successful work of famous scientists such as Crick, Orgel and Wicken], and the like. Surely, we can do better.

GEM of TKI

kairosfocus
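For readers following along, the X = S*C*B metric described in the comment above can be transcribed literally into code. Everything here is taken from the comment itself: S (functional specificity) is a 1/0 judgment supplied by the observer, not computed; C is a 1/0 flag for passing the 1,000-bit scope; B is the bit count at 7 bits per ASCII character. This is a transcription of the comment's description, not an endorsed reference implementation:

```python
THRESHOLD_BITS = 1000  # the 1,000-bit scope (config space of at least ~10^301 states)

def x_metric(text: str, judged_functionally_specific: bool) -> int:
    """X = S*C*B, transcribed from the comment above."""
    B = 7 * len(text)                              # bits used (7-bit ASCII)
    S = 1 if judged_functionally_specific else 0   # observer's 1/0 specificity call
    C = 1 if B >= THRESHOLD_BITS else 0            # 1/0: passes the 1,000-bit scope?
    return S * C * B

# The original post is described above as 4,469 seven-bit characters:
print(x_metric("x" * 4469, judged_functionally_specific=True))  # 31283
# 143 ASCII characters is the smallest length that clears the threshold:
print(x_metric("x" * 143, judged_functionally_specific=True))   # 1001
print(x_metric("x" * 142, judged_functionally_specific=True))   # 0
```

Written this way, the transcription makes explicit that the entire burden of the design inference rests on the S judgment, which enters as an observer-supplied input rather than a computed quantity.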
March 30, 2011 at 2:28 PM PDT
Joseph: I did read your example (thank you), along with most (not all) of the previous comments. I do not have time to chase back through all that again to find what I need, so I was asking for assistance. Fortunately, Kairosfocus has recently provided:

> X = S*C*B

which I believe is the equation I was looking for. (This seems to be a likelihood ratio?)

Back to my calculations: in my duplication example I doubled the string from "CG" to "CGCG", and S will be the increase in information, which is either zero, or something very small (there has been some disagreement). If zero, CSI will be undefined when I take the log, so let's say it is one bit (S=1) to carry through the example. This seems **wrong** somehow, and a little digging confirms it. S, or perhaps S*C, ought to be a likelihood (apologies, it has been a while since I last worked this sort of problem by hand). I am out of time for today, and perhaps out of energy to pursue this as well. Maybe tomorrow.

I still think there should be a trivially simple demonstration of the calculation, and the difficulty supports MathGrrl's contention that CSI is not rigorously defined. If CSI were a useful concept, there ought to be examples of it being used as such in many fields. Likelihood ratio tests are certainly useful (I use them all the time), but there the definitions are clear. The lack of a clear definition for CSI appears to make it a not-useful concept.

Tomato Addict
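One way to make the duplication intuition in this comment concrete is with off-the-shelf compression as a crude stand-in for information content. To be clear, this is purely an illustration under that assumption, not a CSI metric anyone in the thread has endorsed: duplicating a string doubles its raw bit count but adds very little compressible content, while the same number of fresh random symbols adds much more:

```python
import random
import zlib

def raw_bits(seq: str) -> int:
    """Raw storage cost at 2 bits per nucleotide."""
    return 2 * len(seq)

def compressed_bytes(seq: str) -> int:
    """Compressed size as a crude proxy for information content."""
    return len(zlib.compress(seq.encode(), 9))

random.seed(0)
original   = "".join(random.choice("ACGT") for _ in range(1000))
duplicated = original * 2  # whole-string duplication, as in the "CG" -> "CGCG" example
fresh      = "".join(random.choice("ACGT") for _ in range(2000))

print(raw_bits(duplicated) - raw_bits(original))  # 2000: the raw bit count doubles
# But the duplicate adds far fewer compressed bytes than fresh random sequence does:
print(compressed_bytes(duplicated) - compressed_bytes(original))
print(compressed_bytes(fresh) - compressed_bytes(original))
```

On this proxy, the duplication event contributes almost no new information even though the genome length doubles, which is one way of restating the disagreement over whether S should be zero or "something very small."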
March 30, 2011 at 12:59 PM PDT
News flash: there isn't anything any IDist can say or do to satisfy MathGrrl. I have provided a definition of CSI, one with mathematical rigor. She choked on it. I told her why CSI is a strong indicator of a designing agency; she choked on that too. Not only that, she continues her equivocation by using "evolutionary mechanisms", i.e., she blindly accepts that all evolutionary processes are blind-watchmaker processes. So we have MathGrrl dismissing the efforts of IDists, but when it comes to gene duplications she just blindly accepts that they are blind-watchmaker processes. And in the end all she had to do was read "No Free Lunch", but she choked on that too. A lot of choking, equivocating, and strawman-erecting. That is what MathGrrl has provided. Oh well...

Joseph
March 30, 2011 at 9:09 AM PDT
QuiteID, so what is your point? In non-technical usage, a "cipher" is the same thing as a "code"; however, the concepts are distinct in cryptography. http://en.wikipedia.org/wiki/Cipher Whereas a metaphor is a figure of speech that constructs an analogy between two things or ideas; the analogy is conveyed by the use of a metaphorical word in place of some other word. For example: "Her eyes were glistening jewels." http://en.wikipedia.org/wiki/Metaphor

Yet clearly the DNA code is well beyond the metaphor category; in fact, your own reference suggested the stricter usage of the term 'cipher' be used instead of 'code'. Thus, QuiteID, you and your references, which suggest that the use of the term 'code' for DNA is merely metaphorical, are shown to be completely out of context in making your point. I.e., DNA is 100% a code in meaning, intent and purpose! Perhaps you should use a metaphor for your metaphor so as to make this twisted logic work for you! :)

bornagain77
March 30, 2011 at 9:06 AM PDT
Mrs. O'Leary, I would like to again thank you for giving me the opportunity to make this guest post and for your time in policing the comments. I would also like to thank Jonathan M. for raising the possibility with you. Warm regards, MathGrrl

MathGrrl
March 30, 2011 at 08:54 AM PDT
Everyone, We are now over the 400 comment mark and I haven't seen any reason to change the provisional conclusions that I reached in my post now numbered 201, namely:
1) There is no agreed definition of CSI. I have asked from the original post onward for a rigorous mathematical definition of CSI and have yet to see one. Worse, the comments here show that a number of ID proponents have definitions that are not consistent with each other or with Dembski’s published work.
2) There is no agreement on the usefulness of CSI. This may be related to the lack of an agreed definition, but several variants that are incompatible with Dembski’s description, as well as alternative metrics, have been proposed in this thread alone.
3) There are no calculations of CSI that provide enough detail to allow it to be objectively calculated for other systems. The only example of a calculation for a biological system is Dembski’s estimate for a bacterial flagellum, but no one has managed to apply the same technique to other systems.
4) There is no proof that CSI is a reliable indicator of intelligent agency. This is not surprising, given the lack of a rigorous mathematical definition and examples of how to calculate it, but it does mean that the claims of many ID proponents are unfounded.
Even after all of the effort expended by numerous participants, no one has directly addressed the five straightforward questions I asked, no one has provided a rigorous mathematical definition of CSI, and no one has provided detailed examples of how to objectively calculate it. I will continue to monitor this thread on the chance that someone chooses to address my original post, but I'm going to step back from addressing the majority of the comments that do not do so.
Despite my disappointment and occasional frustration that I have not come away from this exercise with a sufficient understanding of CSI to be able to test the assertion that it cannot be generated by evolutionary mechanisms, I do believe that this has been a valuable discussion. It certainly provides a good reference for future threads here at UD. Thank you all for your participation. It's been interesting.MathGrrl
March 30, 2011 at 08:54 AM PDT
QuiteID The call of "metaphor" is nothing more than damage control. Whenever you take one type of input and create an output of a different type, there has to be a code to do so. Nucleotides in, amino acid chain out. That said, I agree that DNA is not some type of program. But for DNA to be of any use there has to be a code to change from nucleotides to proteins. I also agree that DNA is not a "blueprint"- all it does is carry out its instructions.Joseph
March 30, 2011 at 07:50 AM PDT
bornagain77, I can cite literature too, and not just from blogs and books. Consider "Genes and Causation" by Denis Noble (Phil. Trans. R. Soc. A 13 September 2008 vol. 366 no. 1878 3001-3015), available at http://rsta.royalsocietypublishing.org/content/366/1878/3001.long "The coding step in the case of the relationship between DNA and proteins is what leads us to regard the information as digital. This is what enables us to give a precise number to the base pairs (3 billion in the case of the human genome). Moreover, the CGAT code could be completely represented by binary code of the kind we use in computers. (Note that the code here is metaphorical in a biological context—no one has determined that this should be a code in the usual sense. For that reason, some people have suggested that the word ‘cipher’ would be better.)" And: "Another analogy that has come from comparison between biological systems and computers is the idea of the DNA code being a kind of program. This idea was originally introduced by Monod & Jacob (1961) and a whole panoply of metaphors has now grown up around their idea. We talk of gene networks, master genes and gene switches. These metaphors have also fuelled the idea of genetic (DNA) determinism. But there are no purely gene networks! Even the simplest example of such a network—that discovered to underlie circadian rhythm—is not a gene network, nor is there a gene for circadian rhythm. Or, if there is, then there are also proteins, lipids and other cellular machinery for circadian rhythm." And: "The metaphors that served us well during the molecular biological phase of recent decades have limited or even misleading impacts in the multilevel world of systems biology. New paradigms are needed if we are to succeed in unravelling multifactorial genetic causation at higher levels of physiological function and so to explain the phenomena that genetics was originally about." 
Also see Stephen Strauss, "Beyond the double helix: as genetics becomes ever more complex, we badly need a way of describing what DNA does. 'Blueprint' just won't cut it." New Scientist 201.2696 (2009): 22 Also see Sergi Cortiñas Rovira, "Metaphors of DNA: a review of the popularisation processes," Journal of Science Communication 7(1), March 2008. I could go on.QuiteID
March 30, 2011 at 07:34 AM PDT
QuiteID, The DNA code is not 'like a code', as you are trying to insinuate; the DNA code is 100% a code in every meaning, intent, and purpose:
The DNA Code - Solid Scientific Proof Of Intelligent Design - Perry Marshall - video http://www.metacafe.com/watch/4060532/
Moreover there are multiple overlapping codes that are completely inexplicable to Darwinian mechanisms:
"In the last ten years, at least 20 different natural information codes were discovered in life, each operating to arbitrary conventions (not determined by law or physicality). Examples include protein address codes [Ber08B], acetylation codes [Kni06], RNA codes [Fai07], metabolic codes [Bru07], cytoskeleton codes [Gim08], histone codes [Jen01], and alternative splicing codes [Bar10]." Donald E. Johnson – Programming of Life – pg. 51 - 2010
Histone Inspectors: Codes and More Codes - Cornelius Hunter - March 2010 Excerpt: By now most people know about the DNA code. A DNA strand consists of a sequence of molecules, or letters, that encodes for proteins. Many people do not realize, however, that there are additional, more nuanced, codes associated with the DNA. http://darwins-god.blogspot.com/2010/03/histone-inspectors-codes-and-more-codes.html
------------ Ilya Prigogine (Nobel laureate in chemistry) once wrote, "let us have no illusions we are unable to grasp the extreme complexity of the simplest of organisms." The DNA of a bacterium contains an encyclopedic amount of pure digitally encoded information that directs the highly sophisticated molecular machinery within the cell membrane. DNA characters are copied with an accuracy that rivals anything that modern engineers can do. -------------------bornagain77
March 30, 2011 at 06:57 AM PDT
kairosfocus, whatever Wikipedia says (and I don't know why you say it is "testifying against interest"), it's widely recognized among scientists that terms like "code," "instructions," "language," etc. are metaphors. That's not to say they're "false" or "wrong" but only that they're limited. We understand the world in metaphorical terms all the time (see Lakoff and Johnson). These metaphors in particular tend to lead us to look at biology in anthropomorphic terms. There's a fairly developed literature on the misunderstandings that develop when the code metaphor is taken too literally.QuiteID
March 30, 2011 at 06:29 AM PDT
F/N: Now that I have a moment to pause, let's take a look at MG's four posed challenges at the head of this thread. It will emerge in short order that the questions are misdirected, and that consistently, the processes used start within islands of function in much wider and predominantly non-functional configuration spaces, based on intelligent design. Inadvertently, they show how FSCI is routinely the product of design. I will comment on points:
_______________
>>A simple gene duplication, without subsequent modification, that increases production of a particular protein from less than X to greater than X. The specification of this scenario is “Produces at least X amount of protein Y.”
a --> Gene duplication, as shown above, is not simple, and implies a problem with a regulatory network
b --> The existence and structure of that network expresses a complex, integrated functional organisation that points to design, as it will go well beyond the FSCI threshold [125 bytes worth of code].
c --> The duplication itself, in the context cited, does not create novel function; it would simply replicate existing information.
d --> The question is mis-directed and falls under the fallacy of the complex question, also failing to distinguish mere information-carrying capacity [complexity] from functional, meaningful specific information.
Tom Schneider’s ev evolves genomes using only simplified forms of known, observed evolutionary mechanisms,
e --> Ev starts within an island of function, i.e. it begins within a target zone already.
f --> That functionality owes a lot to the intelligent direction of Schneider
that meet the specification of “A nucleotide that binds to exactly N sites within the genome.” The length of the genome required to meet this specification can be quite long, depending on the value of N. (ev is particularly interesting because it is based directly on Schneider’s PhD work with real biological organisms.)
g --> The key question being addressed by Design theory is being begged, and it seems from above that the quantum of increment of information being claimed is well below the relevant threshold for FSCI, i.e. if you show that chance plus trial and error can generate successful changes within the reach of the search resources of the cosmos, you have not addressed the real question: to generate functional information on the scale required to be relevant to the CSI filter.
h --> So, the same errors are at work. The design inference does not assert that no increments in information are possible on chance plus trial and error etc., but that such run into a limit: the search resources of the cosmos.
i --> This is brought out in the simple X-metric, X = S*C*B.
Tom Ray’s Tierra routinely results in digital organisms with a number of specifications.
j --> Again, this starts within an island of function, i.e. the key question is being begged.
One I find interesting is “Acts as a parasite on other digital organisms in the simulation.” The length of the shortest parasite is at least 22 bytes, but takes thousands of generations to evolve.
k --> 22 bytes [and the like], of course, is well within the FSCI limit of interest. But already, we see how hard it is to search a space.
The various Steiner Problem solutions
l --> Followed the link, and saw this:
In this post, I will present my research on a Genetic Algorithm I developed a few years ago, for the specific purpose of addressing the question Can Genetic Algorithms Succeed Without Precise “Targets”?
m --> This already begs the same question, as has been repeatedly pointed out for some weeks: the program itself is already on an intelligently arrived at island of function. That dominates all else.
n --> Now of course things have progressed since the days of Weasel, where non-functional symbol strings were directly rewarded on mere proximity to target; which was presented -- rhetorically quite successfully -- as a demonstration of how evolution works [never mind the weasel word disclaimers in the fine print].
o --> Until there is an open admission, repudiation and correction of that bit of manipulation, I have no confidence in further, more sophisticated versions of the same basic trick.
p --> This, I do not find here, only a pretence that Philip Johnson and others did not have a legitimate point of protest.
from a programming challenge a few years ago have genomes that can easily be hundreds of bits.
q --> You are able to generate bit strings that, within an island of function, hill-climb to a target specified by a fitness function.
r --> Where did that fitness function come from? Intelligent design, and it already codes the relevant target.
The specification for these genomes is “Computes a close approximation to the shortest connected path between a set of points.”
s --> And so, you have shown that an intelligently designed algorithm can use controlled trial and error to hill-climb to a shortest-path-between-points solution.
t --> Congratulations, you have shown that functionally specific, complex organisation and associated information come from intelligence and can do the tasks they were set up to do. >>
_______________
In short, MG, the problem is that the questions are complex and question-begging, so the set tasks become caught up in the loops of pointless circles of argument.
The real challenge for the evolutionary materialist paradigm is to EMPIRICALLY show that life can and does self-assemble from reasonable chemicals in a reasonable pre-life environment, then give rise to novel body plans, in so doing crossing the observed thresholds of complexity of order 100+ k bits and 10+ mn bits. So far, we are seeing the a priori Lewontinian presumption of evolutionary materialism, and attempted [often irrelevant] illustrations that are working in that circle of reasoning. Of course, there are no specific CSI calculations above; such would be pointless in a context of a prior question-begging, complex and loaded question error. GEM of TKI PS: The follow-up thread posted by VJT here is significant and responsive to the themes in this thread. I thank him for taking the time and making the effort to do the detailed calculations and analysis on WD's CSI metric that I just don't have time to even look at attempting. I only add that there are several possible metrics of the CSI in various forms.kairosfocus
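[Editor's note: the simple X-metric, X = S*C*B, invoked in the comment above is not spelled out there. The sketch below is one hedged reading of it, assuming S and C are 0/1 dummy variables (S = 1 if the item is functionally specific, C = 1 if its capacity reaches the 125-byte / 1,000-bit FSCI threshold mentioned in the comment) and B is the information-carrying capacity in bits. Those assumptions are mine, not the commenter's.]

```python
def x_metric(bits: int, is_specific: bool, threshold: int = 1000) -> int:
    """Hedged sketch of the X = S*C*B metric described above.

    Assumptions (not confirmed by the source): S and C are 0/1 dummy
    variables -- S = 1 if the string is functionally specific, C = 1 if
    its capacity in bits reaches the FSCI threshold (125 bytes = 1,000
    bits) -- and B is the capacity in bits.
    """
    s = 1 if is_specific else 0
    c = 1 if bits >= threshold else 0
    return s * c * bits

# A functionally specific 1,000-bit (125-byte) program reaches the threshold:
print(x_metric(1000, True))   # 1000
# Tierra's 22-byte (176-bit) parasite, though functional, does not:
print(x_metric(176, True))    # 0
```

On this reading, X is zero unless an item is both specific and past the complexity threshold, which matches the comment's claim that short functional strings fall "within the FSCI limit of interest."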
March 30, 2011 at 05:46 AM PDT
A follow-up on my previous "cryptic" post. The problem is to define "specified". It seems that computer programs, as well as proteins and other functional systems, are more or less specified. Hence, "specified" is a set. Proteins gain their function by interacting in a context with other biomolecules. Hence, the set of specified proteins is the range of amino acid combinations that carry out a specific function within a specific context. The method would be to vary the amino acid combinations and count the combinations that carried out the function in question (hard to test, but it could be modelled by methods given by Axe and Durston et al., e.g. “Measuring the functional sequence complexity of proteins”). Calculating CSI is possible when you have a finite set of configurable switches, but the approach must be different from system to system.Albert Voie
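[Editor's note: the counting procedure Albert Voie describes, varying the amino acid combinations and counting those that carry out the function, corresponds to the functional-information measure of Hazen et al. that Durston et al. build on. A minimal sketch (my illustration, not code from either paper; the example counts are invented):]

```python
import math

def functional_information(n_functional: int, n_total: int) -> float:
    """Functional information in bits: -log2 of the fraction of all
    possible sequences that perform the function in question."""
    if not 0 < n_functional <= n_total:
        raise ValueError("need 0 < n_functional <= n_total")
    return -math.log2(n_functional / n_total)

# Toy example: a 4-residue peptide has 20**4 = 160,000 possible sequences.
# If a hypothetical 16 of them perform the function:
print(f"{functional_information(16, 20**4):.2f} bits")  # 13.29 bits
```

The hard part, as the comment notes, is estimating n_functional for real proteins; the formula itself is trivial once that count (or an estimate of it) is in hand.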
March 30, 2011 at 05:12 AM PDT
Tomato Addict, I provided MathGrrl with a simple definition and a simple example. Nothing will ever be good enough for her. Whatever any IDist says, she will just come back with a BS response. And the main problem is she didn't even read the book that describes and defines CSI. IOW she is a pathetic waste of time and it is sad to see all the time IDists have wasted on her. 1- She refuses to back down from her strawman 2- She refuses to engage by providing the requested information 3- She refuses to read "No Free Lunch" 4- She is either purposely obtuse or just on a mission to see how long of a thread she can get and how much confusion she can generate.Joseph
March 30, 2011 at 04:42 AM PDT
MathGrrl:
In any case, my focus on this thread is to get a rigorous mathematical definition of CSI and some examples of how to calculate it.
Explain why it has to be "a rigorous mathematical definition of CSI" and then explain why my definition doesn't fit. As for equivocation, that is all you do with your use of "evolutionary mechanisms". IOW, stop telling others that they are equivocating when that is what you have been doing.Joseph
March 30, 2011 at 04:34 AM PDT
Onlookers (and MF): Re MF:
I know that specific DNA sequences cause certain amino acids to appear. I asked what those DNA sequences symbolise . . . After all causes are not typically regarded to be symbols of their effects. High pressure causes clear skies – but it doesn’t symbolise clear skies.
This all too tellingly captures the degree to which evolutionary materialism is driven to incoherence in the face of the well-established fact that DNA bases in the chromosome SYMBOLISE, using a DIGITAL CODE (Wikipedia here testifying against interest), the amino acid string to be coded for once the regulatory network triggers that transcription and protein synthesis process. Let's cite Wiki:
The genetic code is the set of rules by which information encoded in genetic material (DNA or mRNA sequences) is translated into proteins (amino acid sequences) by living cells. The code defines a mapping between tri-nucleotide sequences, called codons, and amino acids . . . . Not all genetic information is stored using the genetic code. All organisms' DNA contains regulatory sequences, intergenic segments, and chromosomal structural areas that can contribute greatly to phenotype. Those elements operate under sets of rules that are distinct from the codon-to-amino acid paradigm underlying the genetic code . . . . Each protein-coding gene is transcribed into a template molecule of the related polymer RNA, known as messenger RNA or mRNA. This, in turn, is translated on the ribosome into an amino acid chain or polypeptide.[8]:Chp 12 The process of translation requires transfer RNAs specific for individual amino acids with the amino acids covalently attached to them, guanosine triphosphate as an energy source, and a number of translation factors. tRNAs have anticodons complementary to the codons in mRNA and can be "charged" covalently with amino acids at their 3' terminal CCA ends. Individual tRNAs are charged with specific amino acids by enzymes known as aminoacyl tRNA synthetases, which have high specificity for both their cognate amino acids and tRNAs. The high specificity of these enzymes is a major reason why the fidelity of protein translation is maintained.[8]:464–469 There are 4³ = 64 different codon combinations possible with a triplet codon of three nucleotides; all 64 codons are assigned for either amino acids or stop signals during translation.
Do the computer instructions on your PC cause the computer to execute steps specified by a given program? Of course they do, as a component of the cause of the computer's processing and output, based precisely on their digital, symbolic nature. High pressure, by sharpest contrast, is a causal factor for clear skies, but that is not by a step by step symbolic algorithmic process, such as is happening in the cell when DNA is transcribed to RNA, which is sent out to the ribosomes and then used to step by step string a protein chain, as is discussed here, with video. Amazing . . . and not a little sad. And of course if one struggles to see that the genetic code is exactly that, one will then have great difficulties in addressing how it reflects functionally specific, complex information, or how such FSCI could be an empirically reliable sign pointing to intelligent cause. GEM of TKIkairosfocus
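[Editor's note: the 4³ = 64 codon arithmetic and the codon-to-amino-acid mapping in the Wikipedia passage quoted above can be made concrete in a few lines. The table below is a tiny illustrative subset of the standard genetic code, not the full 64-entry table:]

```python
from itertools import product

# All 64 possible codons over the 4-letter RNA alphabet (4**3 = 64):
codons = ["".join(c) for c in product("ACGU", repeat=3)]
print(len(codons))  # 64

# A small subset of the standard genetic code (mRNA codon -> amino acid):
CODON_TABLE = {"AUG": "Met", "UUU": "Phe", "GGC": "Gly", "UAA": "STOP"}

def translate(mrna):
    """Map an mRNA string to amino acids codon by codon, halting at a
    stop codon. Only the subset above is handled in this sketch."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "STOP":
            break
        peptide.append(aa)
    return peptide

print(translate("AUGUUUGGCUAA"))  # ['Met', 'Phe', 'Gly']
```

The lookup-table structure is the point both sides of this exchange are arguing over: the mapping from codon to amino acid is a rule-based assignment, which is why the "code" vs. "metaphor" dispute arises at all.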
March 30, 2011 at 03:54 AM PDT
#408 UB I know that specific DNA sequences cause certain amino acids to appear. I asked what those DNA sequences symbolise. Do those DNA sequences symbolise the corresponding amino acids? After all causes are not typically regarded to be symbols of their effects. High pressure causes clear skies - but it doesn't symbolise clear skies.markf
March 29, 2011 at 11:12 PM PDT
tgpeeler @ 403: "You can restart it by communicating something without using a language. Good luck with that." There are a number of dogs, bees, and ants that would beg to differ.paragwinn
March 29, 2011 at 11:00 PM PDT
Others have stated that CSI calculations are "hard work". Why is CSI so hard to calculate? Can it not be simplified to the point where a demonstration is clear? Even a trivial example ought to be sufficient for this purpose. MathGrrl's scenarios do not appear to require that a calculation be complicated, just that there should be a calculation. I often find that simplifying a problem down to the absolute basics is an excellent way to gain understanding. If CSI is a useful concept, then its meaning ought to be even more clear in a trivial example. Now if you will pardon my naivety, I'm going to give this a whack myself with MG's first scenario: Suppose the gene is simply "CG" and it duplicates to "CGCG" (I did say trivial!). If I understand correctly there needs to be some statement of the probability of this happening by chance, so how about 10%, or some arbitrary constant probability C if you prefer. So what comes next? Walk me through this.Tomato Addict
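[Editor's note: one hedged way to put numbers on Tomato Addict's toy example. The 10% probability is the arbitrary figure from the comment; whether surprisal and raw capacity of this kind add up to CSI in Dembski's sense is precisely what the thread disputes.]

```python
import math

# Surprisal of the duplication event itself, given the assumed 10% chance:
p_duplication = 0.10
event_bits = -math.log2(p_duplication)
print(f"{event_bits:.3f} bits")  # 3.322 bits

# Information-carrying capacity of the strings, assuming a uniform
# 4-letter nucleotide alphabet (log2(4) = 2 bits per base):
def capacity_bits(seq):
    return len(seq) * math.log2(4)

print(capacity_bits("CG"))    # 4.0
print(capacity_bits("CGCG"))  # 8.0
# The duplication adds 4 bits of raw capacity but repeats an existing
# pattern, so whether any *specified* information was gained is exactly
# MathGrrl's scenario 1 question.
```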
March 29, 2011 at 08:07 PM PDT
Mark at 381, Hello Mark. If someone asks me if I think the arrangement of nucleic acids in DNA are mapped to amino acids during protein synthesis, I probably won’t take the question all that seriously, particularly on a thread where I’ve already made several comments to that effect. It seems from your own comments that you have evidence of some naturally occurring symbols, representations, abstractions, etc, you’d like to share. So again, if you have a case to make, then please by all means, make it. I will check back in as soon as I can. Cheers…Upright BiPed
March 29, 2011 at 04:12 PM PDT
It's radical that life is life? I'm sure I'm missing something here. There are various kinds of life, no? I, for example, the last time I checked, am moderately different from a mushroom. Although at work I feel like one sometimes. :-) If your statement is true (I don't think it is), then all strings of symbols in English symbolize the same thing and are equivalent to every other one. Not true.tgpeeler
March 29, 2011 at 03:11 PM PDT
tgpeeler,
mg @ 392 “There are a number of information theorists who would beg to differ.” I’m sure there are. They’d be wrong, too. I will say again, boldly, if I may, that it is IMPOSSIBLE to account for the phenomenon of information in terms of the laws of physics. Why do I say this? Because symbols, rules (language), rationality (logic), free will (freely assembling symbols according to the aforementioned language specific rules and laws of rational thought), and intentionality (for a reason, to communicate a message) are ALL necessary for human information/communication. Materialism or naturalism or physicalism, whatever ilk your particular version is, all fail to account for human language and information....
There's your equivocation. You start by creating an analogy between some aspects of biological systems and the concept of a language, and now you're talking about "human language and information". You seem to be confusing your use of symbols with some kind of Platonic symbol inherent in what you are modeling. In any case, my focus on this thread is to get a rigorous mathematical definition of CSI and some examples of how to calculate it. Can you provide either or both of those?MathGrrl
March 29, 2011 at 03:09 PM PDT