Uncommon Descent Serving The Intelligent Design Community

Is the CSI concept well-founded mathematically, and can it be applied to the real world, giving real and useful numbers?


Those who have been following the recently heated up exchanges on the theory of intelligent design and the key design inference on tested, empirically reliable signs, through the ID explanatory filter, will know that a key move in recent months was the meteoric rise of the mysterious internet persona MathGrrl (who is evidently NOT the Calculus Prof who has long used the same handle).

MG, as the handle is abbreviated, is well known for “her” confident-manner assertion — now commonly stated as if it were established fact in the Darwin Zealot fever swamps that are backing the current cyberbullying tactics that have tried to hold my family hostage — that:

without a rigorous mathematical definition and examples of how to calculate [CSI], the metric is literally meaningless. Without such a definition and examples, it isn’t possible even in principle to associate the term with a real world referent.

As the strike-through emphasises, every one of these claims has long been exploded.

You doubt me?

Well, let us cut down the clip from the CSI Newsflash thread of April 18, 2011, which was again further discussed in a footnote thread of 10th May (H’mm, anniversary of the German Attack in France in 1940), which was again clipped yesterday at fair length.

( BREAK IN TRANSMISSION: BTW, antidotes to the intoxicating Darwin Zealot fever swamp “MG dunit” talking points were collected here — Graham, why did you ask the question but never stopped by to discuss the answer? And the “rigour” question was answered step by step at length here.  In a nutshell, as the real MathGrrl will doubtless be able to tell you, the Calculus itself, historically, was founded on sound mathematical intuitive insights on limits and infinitesimals, leading to the warrant of astonishing insights and empirically warranted success, for 200 years. And when Math was finally advanced enough to provide an axiomatic basis — at the cost of the sanity of a mathematician or two [doff caps for a minute in memory of Cantor] — it became plain that such a basis was so difficult that it could not have been developed in C17. Had there been an undue insistence on absolute rigour as opposed to reasonable warrant, the great breakthroughs of physics and other fields that crucially depended on the power of Calculus, would not have happened.  For real world work, what we need is reasonable warrant and empirical validation of models and metrics, so that we know them to be sufficiently reliable to be used.  The design inference is backed up by the infinite monkeys analysis tracing to statistical thermodynamics, and is strongly empirically validated on billions of test cases, the whole Internet and the collection of libraries across the world being just a sample of the point that the only credibly known source for functionally specific complex information and associated organisation [FSCO/I]  is design.  )

After all, a bit of careful citation always helps:

_________________

>>1 –> 10^120 ~ 2^398

2 –> Taking the standard negative log probability information measure: I = – log(p) . . . eqn n2
3 –> So, we can re-present the Chi-metric:
[where, from Dembski, Specification 2005, χ = – log2[10^120 ·ϕS(T)·P(T|H)] . . . eqn n1]
Chi = – log2(2^398 * D2 * p) . . . eqn n3 [writing D2 for ϕS(T)]
Chi = Ip – (398 + K2), where K2 = log2(D2) . . . eqn n4
4 –> That is, the Dembski CSI Chi-metric is a measure of Information for samples from a target zone T on the presumption of a chance-dominated process, beyond a threshold of at least 398 bits, covering 10^120 possibilities.
5 –> Where also, K2 is a further increment to the threshold that naturally peaks at about 100 further bits . . . .
6 –> So, the idea of the Dembski metric in the end — debates about peculiarities in derivation notwithstanding — is that if the Hartley-Shannon-derived information measure for items from a hot or target zone in a field of possibilities is beyond 398 – 500 or so bits, the zone is so deeply isolated that a chance-dominated process is maximally unlikely to find it; but of course intelligent agents routinely produce information beyond such a threshold.

7 –> In addition, the only observed cause of information beyond such a threshold is the work of the now proverbial intelligent semiotic agents.
8 –> Even at 398 bits that makes sense as the total number of Planck-time quantum states for the atoms of the solar system [most of which are in the Sun] since its formation does not exceed ~ 10^102, as Abel showed in his 2009 Universal Plausibility Metric paper. The search resources in our solar system just are not there.
9 –> So, we now clearly have a simple but fairly sound context to understand the Dembski result, conceptually and mathematically [cf. more details here]; tracing back to Orgel and onward to Shannon and Hartley . . . .
As in (using Chi_500 for VJT’s CSI_lite [UPDATE, July 3: and S for a dummy variable that is 1/0 accordingly as the information in I is empirically or otherwise shown to be specific, i.e. from a narrow target zone T, strongly UNREPRESENTATIVE of the bulk of the distribution of possible configurations, W]):
Chi_500 = Ip*S – 500,  bits beyond the [solar system resources] threshold  . . . eqn n5
Chi_1000 = Ip*S – 1000, bits beyond the observable cosmos, 125 byte/ 143 ASCII character threshold . . . eqn n6
Chi_1024 = Ip*S – 1024, bits beyond a 2^10, 128 byte/147 ASCII character version of the threshold in n6, with a config space of 1.80*10^308 possibilities, not 1.07*10^301 . . . eqn n6a
[UPDATE, July 3: So, if we have a string of 1,000 fair coins, and toss at random, we will by overwhelming probability expect to get a near 50-50 distribution typical of the bulk of the 2^1,000 possibilities W. On the Chi_500 metric, I would be high, 1,000 bits, but S would be 0, so the value for Chi_500 would be – 500, i.e. well within the possibilities of chance. However, if we came to the same string later and saw that the coins somehow now had the bit pattern of the ASCII codes for the first 143 or so characters of this post, we would have excellent reason to infer that an intelligent designer, using choice contingency, had intelligently reconfigured the coins. That is because, using the same I = 1,000 capacity value, S is now 1, and so Chi_500 = 500 bits beyond the solar system threshold. If the 10^57 or so atoms of our solar system, for its lifespan, were to be converted into coins and tables etc, and tossed at an impossibly fast rate, it would be impossible to sample enough of the space of possibilities W to have confidence that something from so unrepresentative a zone T could reasonably be explained by chance. So, as long as an intelligent agent capable of choice is possible, choice — i.e. design — would be the rational, best explanation of the sign observed: functionally specific, complex information.]
10 –> Similarly, the work of Durston and colleagues, published in 2007, fits this same general framework . . . .
We use the formula log (20) – H(Xf) to calculate the functional information at a site specified by the variable Xf such that Xf corresponds to the aligned amino acids of each sequence with the same molecular function f. The measured FSC for the whole protein is then calculated as the summation of that for all aligned sites. The number of Fits quantifies the degree of algorithmic challenge, in terms of probability [info and probability are closely related], in achieving needed metabolic function. For example, if we find that the Ribosomal S12 protein family has a Fit value of 379, we can use the equations presented thus far to predict that there are about 10^49 different 121-residue sequences that could fall into the Ribosomal S12 family of proteins, resulting in an evolutionary search target of approximately 10^-106 percent of 121-residue sequence space. In general, the higher the Fit value, the more functional information is required to encode the particular function in order to find it in sequence space . . . .
11 –> So, Durston et al are targeting the same goal, but have chosen a different path from the start-point of the Shannon-Hartley log probability metric for information. That is, they use Shannon’s H, the average information per symbol, and address shifts in it from a ground to a functional state on investigation of protein family amino acid sequences. They also do not identify an explicit threshold for degree of complexity. [Added, Apr 18, from comment 11 below:] However, their information values can be integrated with the reduced Chi metric:
Using Durston’s Fits from his Table 1, in the Dembski style metric of bits beyond the threshold, and simply setting the threshold at 500 bits:
RecA: 242 AA, 832 fits, Chi: 332 bits beyond
SecY: 342 AA, 688 fits, Chi: 188 bits beyond
Corona S2: 445 AA, 1285 fits, Chi: 785 bits beyond  . . . results n7
The two metrics are clearly consistent . . .  (Think about the cumulative fits metric for the proteins for a cell . . . )
In short, one may use the Durston metric as a good measure of the target zone’s actual encoded information content, which Table 1 also conveniently reduces to bits per symbol, so we can see how the redundancy affects the information used across the domains of life to achieve a given protein’s function; not just the raw capacity in storage-unit bits [= no. of AA’s * 4.32 bits/AA on 20 possibilities, as the chain is not particularly constrained.]>>

_________________
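To make the reduced metric in the clip concrete, here is a minimal Python sketch (an editorial illustration, not part of the quoted material; the function name and the handling of the S dummy variable are mine, and the fits values are those quoted from Durston’s Table 1 above):

```python
# Reduced Chi metric sketch: Chi_500 = Ip * S - 500 (bits beyond the
# solar system threshold). Ip is the information/fits value in bits and
# S is 1 only if the configuration is from a narrow, independently
# specifiable zone T, else 0.

def chi_500(ip_bits, s, threshold=500):
    """Bits beyond the threshold; negative values are within chance reach."""
    return ip_bits * s - threshold

# 1,000 fair coins tossed at random: high capacity, no specification (S = 0).
print(chi_500(1000, 0))   # -500 -> comfortably within the reach of chance

# The same coins later found spelling out ~143 ASCII characters of text (S = 1).
print(chi_500(1000, 1))   # 500 -> beyond the threshold

# Durston et al. (2007) fits values treated as Ip, with S = 1:
for name, fits in [("RecA", 832), ("SecY", 688), ("Corona S2", 1285)]:
    print(name, chi_500(fits, 1))   # 332, 188, 785 bits beyond
```

The point of the S dummy variable is simply that raw capacity alone never crosses the threshold; only configurations from an independently specified, narrow zone T do.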

So, there we have it folks:

I: Dembski’s CSI metric is closely related to standard and widely used work in Information theory, starting with I = – log p

II: It is reducible, on taking the appropriate logs, to a measure of information beyond a threshold value

III: The threshold is reasonably set by referring to the accessible search resources of a relevant system, i.e. our solar system or the observed cosmos as a whole.

IV: Where, once an observed configuration — event E, per NFL — that bears or implies information is from a separately and “simply” describable narrow zone T that is strongly unrepresentative — that’s key — of the space of possible configurations, W, then

V: since any feasible search samples only a very small fraction of W, it is unreasonable to expect chance to account for E in T, rather than the far more typical possibilities in W, which have, in aggregate, overwhelming statistical weight.

(For instance the 10^57 or so atoms of our solar system will go through about 10^102 Planck-time Quantum states in the time since its founding on the usual timeline. 10^150 possibilities [500 bits worth of possibilities] is 48 orders of magnitude beyond that reach, where it takes 10^30 P-time states to execute the fastest chemical reactions.  1,000 bits worth of possibilities is 150 orders of magnitude beyond the 10^150 P-time Q-states of the about 10^80 atoms of our observed cosmos. When you are looking for needles in haystacks, you don’t expect to find them on relatively tiny and superficial searches.)
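As a cross-check on the orders of magnitude just quoted, a short Python calculation (an editorial sketch; the ~10^102 solar system and ~10^150 cosmic state counts are taken from the post as given, not re-derived here):

```python
from math import log10

def log10_configs(bits):
    """Base-10 exponent of the number of configurations for a given bit count."""
    return bits * log10(2)

print(log10_configs(500))          # ~150.5 -> 500 bits is ~10^150 possibilities
print(log10_configs(500) - 102)    # ~48.5  -> ~48 orders of magnitude past the
                                   #           ~10^102 solar system state count
print(log10_configs(1000) - 150)   # ~151   -> 150+ orders of magnitude past the
                                   #           ~10^150 cosmic state count
```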

VI: Where also, in empirical investigations we observe that an aspect of an object, system, process or phenomenon that is controlled by mechanical necessity will show itself in low contingency. A dropped, heavy object falls reliably at g. We can make up a set of differential equations and model how events will play out on a given starting condition, i.e. we identify an empirically reliable natural law.

VII: By contrast, highly contingent outcomes — those that vary significantly on similar initial conditions — reliably trace to chance factors and/or choice; e.g. we may drop a fair die and it will tumble to a value essentially by chance. (This is in part an ostensive definition, by key example and family resemblance.) Or, I may choose to compose a text string, writing it this way or the next. Or, as the 1,000 coins in a string example above shows, coins may be strung by chance or by choice.

VIII: Choice and chance can be reliably empirically distinguished, as we routinely do in day to day life, decision-making, the court room, and fields of science like forensics.  FSCO/I is one of the key signs for that and the Dembski-style CSI metric helps us quantify that, as was shown.

IX:  Shown, based on a reasonable reduction from standard approaches, and shown by application to real world cases, including biologically relevant ones.

We can safely bet, though, that you would not have known that this was done months ago — over and over again — in response to MG’s challenge, if you were going by the intoxicant fulminations billowing up from the fever swamps of the Darwin zealots.

Let that be a guide to evaluating their credibility — and, since this was repeatedly drawn to their attention and just as repeatedly brushed aside in the haste to go on beating the even more intoxicating talking point drums,  sadly, this also raises serious questions on the motives and attitudes of the chief ones responsible for those drumbeat talking points and for the fever swamps that give off the poisonous, burning strawman rhetorical fumes that make the talking points seem stronger than they are.  (If that is offensive to you, try to understand: this is coming from a man whose argument as summarised above has repeatedly been replied to by drumbeat dismissals without serious consideration, led on to the most outrageous abuses by the more extreme Darwin zealots (who were too often tolerated by host sites advocating alleged “uncensored commenting,” until it was too late), culminating now in a patent threat to his family by obviously unhinged bigots.)

And, now also you know the most likely why of TWT’s attempt to hold my family hostage by making the mafioso style threat: we know you, we know where you are and we know those you care about. END

Comments
Indium: Why is it you "never" follow the links or pointers to the places where I do deal with biosystems, e.g. the citations on Durston et al and related calcs in the OP above? Why do you erect and knock over strawmen arguments, in other words? I give cases from text to illustrate patterns that coded text faces whether it is biological or technological or human in origin. The relative rarity of meaningful or functional complex coded strings is a capital case in point; as compared to the space of all possible configs. Remember, for a conceptual 100 k base genome, we are looking at 4^100,000 = 9.98*10^60,205 possibilities. the Planck time resources of the observed cosmos -- 10^150 states -- could not sample more than 1 in 10^60,000+ of that. So, unless functional states are absolutely overwhelmingly abundant to the point where they are practically falling off the tree into our hands, we have a zero scope search for a needle in a haystack problem. And, we know from the decades of observation of coded digital strings that meaningful strings are credibly going to be quite rare. I think you will see that if you start with say 3-letter clusters you can easily go like: rat -- cat -- bat -- mat -- eat -- ear -- car, etc. (I have given this or similar examples many times. The fatal errors in Zachriel's example are that he is plainly intelligently directing the process and is relying on short words where the function is easy to bridge; so the analogy breaks down very rapidly. When you have to do a real world system control, you are not going to get it to fit into 70 or 130 bytes or so, not to control a serious system, never mind one that is going to be self-replicating. I am sick of strawman arguments.) But just you try the same with 73 ASCII letter strings that need to be meaningful every time. Try changing the first 73 letters of this post into say the last 73, preserving meaningfulness every step of the way. In short as string length -- a measure of complexity -- is increased, and functional specificity is required, the functional strings become naturally far more isolated. GEM of TKIkairosfocus
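To make the arithmetic in the comment above easy to check, a small Python sketch (editorial illustration; the 100,000-base genome and the ~10^150 Planck-time state count are the figures used in the comment):

```python
from math import log10

bases = 100_000                      # notional genome length, 4-letter alphabet
log10_configs = bases * log10(4)     # base-10 exponent of 4**100_000

print(round(log10_configs, 3))               # ~60205.999 -> 4^100,000 ~ 10^60,206
print(round(10 ** (log10_configs % 1), 2))   # ~9.98 -> i.e. ~9.98 * 10^60,205
print(round(log10_configs - 150))            # ~60056 -> at most ~1 in 10^60,056
                                             #   of the space reachable with 10^150 states
```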
July 20, 2011 at 11:28 AM PDT
Indium: Briefly: 1) Again, look at the Durston paper. 2) The Durston method, applying the Shannon reduction of uncertainty to single amino acid positions in large and old protein families, is a reasonable method to approximate the target space, provided we assume that in those cases the functional space has been almost completely traversed by neutral evolution, which is a very reasonable assumption, supported by all existing data. IMO, the Durston method tends rather to overestimate the target space, and if you look at the estimated target spaces, they are very big. 3) The concept of target is not misleading at all. We do observe functional proteins, thousands of different superfamilies of them, and highly integrated in living beings. "Target" just means a specific function, the function that is needed in each specific context to implement a real novelty. That we have to explain. We see the results, and we don't know how they were reached. With a dumb theory reigning in the academic world, which attributes to random variation the generation of biological information, and to NS its expansion, we do need to calculate if what is attributed to random variation is really in its reach. And it is not. Darwinists are not troubled by that, but sincere scientists certainly should be. 4) You say: "The total ratio of target to search space is mostly irrelevant if there is a path of viable organisms from some starting point to the target." That's exactly the point. That path does not exist. It is a complete fairy tale, unsupported by logic and unsupported by facts. Can you show me a path to any of the basic protein superfamilies? Starting from whatever you like. And specifying the supposed naturally selected intermediates, each of which within the range of a microevolutionary variation from the previous, and each of which conferring a reproductive advantage. I am waiting.gpuccio
July 20, 2011 at 11:25 AM PDT
KF: Obviously. Inferences by analogy are the best, and are the basis of all our knowledge.gpuccio
July 20, 2011 at 11:13 AM PDT
kf: Why do you always talk about letters and not about real biological objects? Go ahead and demonstrate how gpuccio's dFSCI definition can be put to work. Or maybe you can share your thoughts regarding my questions? In any case, you can generate some quite interesting results with words... Word MutagenationIndium
July 20, 2011 at 11:10 AM PDT
Indium: Kindly look at the original post, to see what is feasible and achieved. The analytical context leads to an empirically valid procedure, per standard techniques commonly used in info theory. Are you willing to argue that a typical at-random 73 or, worse, 143 ASCII character string will be reasonably likely to be a valid text in English? Similarly, when you look at the constraints to be met by protein sequences to fold and function in a key-lock fit context, it is quite reasonable that these will come from narrow and unrepresentative zones in AA sequence space. Why don't you show us a few cases of, say, 300 AA biofunctional proteins formed successfully through random AA chaining, or through converting one protein, one or a few AA's at a time, at random then filtered for function, into a completely different protein? (That is a good analogy to the job of converting, say, this paragraph at random steps filtered for function, into a completely different one.) GEM of TKIkairosfocus
July 20, 2011 at 10:05 AM PDT
GP: Analogy, only in the sense that inductions are rooted in analogies. The argument is strictly an inference to best explanation, on empirical evidence. An abduction in the sense of Peirce. GEM of TKIkairosfocus
July 20, 2011 at 09:59 AM PDT
gpuccio: You said
2) Compute as well as possible the target space for that function and the search space, and calculate the ratio, expressing it as -log P (in base 2)
I have a few questions/remarks: 1. Can you give an example calculation of the target and search space for a real biological object? 2. In your calculation, how do you know you have exhaustively described the target space? 3. The concept of a target is highly misleading anyway. Whatever biological structure you look at, it was never a target that had to be reached. 4. Evolution does not have to reach a specific target at once. The total ratio of target to search space is mostly irrelevant if there is a path of viable organisms from some starting point to the target. Evolution never "searches" in the whole search space, it always just looks in a very small "shell" around the existing sphere of genomes (stretching high-order geometry here a bit, but anyway).Indium
July 20, 2011 at 09:28 AM PDT
Elizabeth: Now, briefly, the last step, and then I can rest. f) A final inference, by analogy, of a design process as the cause of dFSCI in living beings. That should be simple now. But please note: 1) The design inference is an inference, not a logical deduction. 2) It is an inference by analogy. As the design process by a conscious intelligent being is the only process connected, as far as we know, to the emergence of dFSCI, and as we observe dFSCI in abundance in biological beings, of which we have no definite observed experience of the origin, it is perfectly natural to hypothesize a process of design by a conscious intelligent being as the explanation for a very important part of reality that, otherwise, we can in no way explain. Inferences are not dogmas. Nobody must necessarily accept them. That is true for any scientific inference, that is for all empirical science. But those who do not accept the design inference have the duty to try to really explain the origin of dFSCI in biological information. Dogmatic prejudices are no answer ("only humans are conscious intelligent beings, and there were no humans there", or "the design inference implies a god, and science cannot accept that", and so on). Nothing of that is true. The only thing the design inference implies is a conscious intelligent designer as the origin of biological information. Denying the possibility that conscious intelligent agents may be involved in the origin of something we observe and cannot explain differently is simply dogma. Conscious intelligent agents generate dFSCI all the time. Whoever is certain that humans are the only conscious intelligent agents in reality is simply expressing his own religion, not making science.gpuccio
July 20, 2011 at 08:58 AM PDT
Elizabeth: The following point is: e) An appreciation of the existence of vast quantities of objects exhibiting dFSCI in the biological world, and nowhere else (excluding designed objects). I will not go into detail about that now. We have in part already discussed that. Let's say that we in ID believe that there is already a lot of evidence that most basic biological information, and certainly all basic protein coding genes, and especially basic protein superfamilies, abundantly exhibit dFSCI. Darwinists do not agree. Confrontation on this point is certainly useful and will go on as new data come from research. The only "reasonable" objections, that you have already embraced, are IMO two: 1) Biological information does not exhibit dFSCI because the ID ideas about the target space are wrong, and the target space is much bigger than we in ID believe, either because the single functional islands are much bigger than we think, or because there are many more functional islands than we think, and many more naturally selectable functions than we think. 2) Biological information does not exhibit dFSCI, because there is a specific necessity mechanism that can explain it, that is the neo-Darwinian mechanism: macroevolution can be deconstructed into microevolutionary steps, each of them visible to NS. Well, I believe that all these arguments are deeply flawed, and that there is really no empirical support in their favour. But, obviously, that is the field where healthy scientific confrontation should take place. If you are aware of other fundamental objections, please let me know.gpuccio
July 20, 2011 at 08:35 AM PDT
Elizabeth: So, the next point: d) An empirical appreciation of the correlation between dFSCI and design in all known cases. Here I would like to be very clear about where we stand. We have an explicit definition of dFSCI. You have objected that the value of complexity is usually very difficult to compute, and I agree. But that does not mean that the concept is not completely defined, and that we cannot attempt calculations in specific contexts, even if approximate. Now, we have defined dFSCI so that its presence can reasonably exclude a random origin of functional information, and at the same time a necessity explanation must not be available. At this point, dFSCI could well, in principle, not exist in the universe. But we know that it exists. We can easily observe it around us. A lot of objects in our world certainly exhibit dFSCI: practically all writings longer than one page, practically all software programs, can easily be shown to exhibit dFSCI, even using an extreme threshold like the UPB. The simple fact is: all these kinds of objects are human artifacts, and all of them are designed (according to our initial definition). Please note that even for language and computer programs, the computation of the target space is difficult. And yet, by a simple approximation, we can easily become convinced that they certainly exhibit dFSCI. For instance, I have discussed once dFSCI as related to Hamlet, defining the function as the ability of a text of similar length to give a reader a complete understanding of the story, the characters, the meaning and themes of the drama, and so on. I believe that there cannot be any doubt that Hamlet exhibits dFSCI even in relation to the UPB. Hamlet is about 140,000 characters long. Taking the alphabet at 26 values, the search space is 26^140000, that is, if I am not wrong, about 658,000 bits. So, unless you believe that about 2^657,500 texts of that length could fully convey the plot and meaning of Hamlet, then you have to admit that Hamlet exhibits dFSCI, tons of it. The same argument could be made for some simple computer program, let's say a simple spreadsheet working in Windows. Let's say its length is 1 Mbyte, more or less 8,000,000 bits. How many sequences do you believe will work as spreadsheets in Windows? So, we can well say that language and computer programs are objects that very easily exhibit dFSCI. So, at this point we can simply make an empirical evaluation of where dFSCI can be observed. Let's start with human artifacts. They are designed by definition (observed design). Well, do they all exhibit dFSCI? Absolutely not. Many of them are simple, even those in digital form. If I write a message: "I am here", it is certainly functional (transmits a specific meaning), but its maximum complexity (about 40 bits if considered for one functional sequence, just for simplicity) does not qualify it as dFSCI in any relevant context. So, designed things are of two kinds: simple and complex, and if we choose some specific threshold, we can try to separate the simple ones from the complex ones. What about non-designed objects? Well, I affirm here that no non-designed object found in nature, and of which we can understand the origin well enough to be sure that it is not a human artifact, exhibits dFSCI, not even with the lower threshold of 150 bits I have proposed. With one exception, and only one: biological information. This point we can obviously discuss. I am ready to take into consideration any counter-example you may want to present.
So, if we exclude biological information (which is exactly the object of our research), all things existing in the known universe (and which can be read as digital sequences) can be roughly categorized in three empirical classes: 1) They are designed (observed design) and they do exhibit dFSCI 2) They are designed (observed design) and they do not exhibit dFSCI 3) They are not designed (observed origin which does not imply a conscious intervention) and do not exhibit dFSCI. So we can empirically say that there is a correlation between the presence of dFSCI and designed things. If the threshold is high enough, the correlation will be as follows: a) All things exhibiting dFSCI are designed. No false positives. b) Many designed things do not exhibit dFSCI. Many false negatives. The absence of false positives, and the presence of a lot of false negatives, is the consequence of having chosen an extreme threshold (even at the 150-bit level). If we lowered the threshold, we would certainly have fewer false negatives, but we could start to observe false positives. Please note that the correlation between dFSCI and design can be empirically verified for human artifacts. We can well prepare a range of digital sequences, some of them designed and complex, others designed and simple, and others non-designed (generated in a random system) or derived from any natural, non-biological system. We know in advance which are designed and which are not. Then we ask some observer to assess dFSCI in them. I affirm that the observer can have as many false negatives as we can imagine, but if he rightly assesses dFSCI in some cases, none of them will be a false positive. They will all be designed sequences.gpuccio
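A rough Python version of the Hamlet estimate in the comment above (an editorial sketch; the 140,000-character length, the 26-letter alphabet and the 500-bit threshold are the comment's own assumptions, and the variable names are illustrative):

```python
from math import log2

chars = 140_000
alphabet = 26
threshold_bits = 500

search_space_bits = chars * log2(alphabet)
print(round(search_space_bits))      # ~658,062 bits of raw sequence capacity

# For Hamlet NOT to qualify at this threshold, the target space (texts of that
# length conveying the same plot, characters and themes) would have to be on
# the order of this many bits' worth of sequences:
print(round(search_space_bits - threshold_bits))   # ~657,562 -> roughly 2^657,500 texts
```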
July 20, 2011 at 08:23 AM PDT
Elizabeth: I have been too busy the last two days! I have not checked the previous discussion, but first of all I would like to go on with the points I had outlined (I hate to leave things unfinished). So, to complete the point of complexity and the dFSCI definition, I have to discuss the threshold of functional complexity. Well, in principle we could just compute the complexity, and say that a certain object has digital Functionally Specified Information, for a certain function, of, say, complexity amounting to 132 bits. That's perfectly correct. What it means is that you need 132 bits of specific information to code for that function. Usually, for the discussions related to design inference, it is useful to fix a threshold of complexity. Now, an important point is that the threshold is arbitrary, or rather conventional: it must be chosen so that, for functional information higher than that value, generation in a random system, in a specific context, is empirically impossible. It is important to emphasize that the threshold must be appropriate for the context. That's because the probability of a certain result emerging randomly depends not only on the absolute probability of the result (the functional complexity), but also on what Dembski calls the probabilistic resources. So, a certain result can have a probability of, say, 1:10^5, but if my system can try 10^8 times to get the result, it is almost certain that it will emerge. That's the real meaning of the threshold: it must be high enough that the result remains virtually impossible given the probabilistic resources of the system we are considering. Now, as you probably know, Dembski has often referred to the UPB (about 500 bits of complexity) as the level of specified complexity which guarantees that a result remains completely unlikely even given all the probabilistic resources available in the whole universe and in its whole span of existence. That's fine if we want to define an event as absolutely unlikely, but for our biological reasoning it's certainly too much. That's why I have suggested a lower threshold that gives reliable improbability in any biological system on our planet, given its span of existence, and referring to the replicators with the highest population number and the highest reproduction rate (bacteria). I have even tried to calculate a reasonable threshold for that. I don't remember now the details of those calculations, but if I remember well the result, I would suggest a biological threshold of complexity of about 150 bits (more or less 35 AAs). I can obviously accept any other value, if a better computation of the maximum probabilistic resources in a biological setting on our planet is done. So, if we want to affirm that dFSCI is present in an object for a design inference in a biological context, we have to do the following things: 1) define a function 2) Compute as well as possible the target space for that function and the search space, and calculate the ratio, expressing it as -log P (in base 2) 3) Verify that the digital information is scarcely compressible, and that no explicit necessity algorithm is known that can act in alternative to random generation. 4) If all the above is satisfied, and the specified complexity is higher than our threshold, we say that the object exhibits dFSCI. If our context is different, for instance cosmological, then it will probably be more appropriate to use the UPB as a threshold to assess dFSCI.gpuccio
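A schematic Python rendering of the four-step procedure in the comment above (an editorial sketch; the function, its arguments and the toy numbers are illustrative, and estimating the target and search spaces remains the hard empirical step, here simply passed in as inputs):

```python
from math import log2

def exhibits_dfsci(target_space, search_space, compressible=False,
                   necessity_mechanism_known=False, threshold_bits=150):
    """Return (functional_bits, verdict) for a defined digital function."""
    # Step 2: functional complexity as -log2 of the target/search ratio.
    functional_bits = -log2(target_space / search_space)
    # Step 3: rule out high compressibility and known necessity mechanisms.
    if compressible or necessity_mechanism_known:
        return functional_bits, False
    # Step 4: compare with the chosen threshold (150 bits suggested above).
    return functional_bits, functional_bits > threshold_bits

# Toy example: one functional sequence in a space of 2^160 possibilities.
bits, verdict = exhibits_dfsci(target_space=1, search_space=2**160)
print(bits, verdict)   # 160.0 True -> above the suggested biological threshold
```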
July 20, 2011 at 06:46 AM PDT
Ah - gap is only visible at preview. Seems to be OK in the post. Ignore my PS :)Elizabeth Liddle
July 20, 2011 at 05:05 AM PDT
Chris, I think we have different mental images here. I am visualising of a peptide chain that can split in two, and each half then attract the monomers that will complete, it and result in two chains where first there was one. Now the sequence of units in that chain may be completely irrelevant to its ability to self-replicate - what makes it a self-replicator is not a specific sequence, but it's doubleness. Let's say A mates to C and B mates to D. And we have two chains: AC CA CA DB DB BD and CA CA AC AC DB BD Both are equally capable of self-replication because what gives them their self-replicating property is their ability to split down the middle, and for both halves to then attract to each now-unmated unit, the corresponding unit. So the first chain splits into: A C C D D B and C A A B B D Each of these then "mates" with the appropriate A, B, C and D monomers in the environment, resulting in tow chains that are identical to the parent chain. But this self-replicative property does not derive from the sequence of units, but from the pairing-properties of the units. So the second chain will be just as capable of self-replication as the first, and so will any daughter, no matter how many copying "error" take place in the sequence, because the self-replicative properties are not derived from the sequence but from the doubleness of the chain! So as it stands, these peptides do not meet the minimum criterion for Darwinian evolution - there is no phenotypic consequence of the transmitted sequence. Got to go and collect some data now! See you later. Cheers Lizzie PS: there seems to be a stray gap before the last term in each of my sequences - I can't seem to get rid of it - please ignore it! Seems to be a replication error....Elizabeth Liddle
July 20, 2011 at 05:03 AM PDT
But if the daughter molecules are only “more like the parent molecule than a randomly selected molecule would be” then why would the daughter molecules necessarily have the ability to self-replicate? Given that the first self-replicating molecule would have done virtually nothing else but self-replicate, any ‘copying errors’ in the daughters would almost certainly lead to the impairment or loss of the ability to self-replicate. How can it be otherwise? You must also agree that perfect cloning of the parent molecule is all that we need for perfect, eternal self-replication. So where does the variance come into it? And if variance does come into it, how can you be so sure that it wouldn’t lead to impairment or loss of self-replication: particularly given the fact that the first self-replicating molecule is doing little else apart from self-replicating in the first place! I can certainly see how you can ‘imagine’ overcoming these insurmountable obstacles, but we need to be realistic otherwise there is no truth value to be found here.Chris Doyle
July 20, 2011 at 04:33 AM PDT
NFL is accessible at Google Books, in a good slice preview.kairosfocus
July 20, 2011 at 04:20 AM PDT
Chris:
Hiya Lizzie, To my mind, the first successful self-replicating molecule need to be: 1. Stable enough at all times to survive for long enough to self-replicate correctly. That means protected from detrimental chemical reactions.
Well, I'd say that it simply needs to replicate with sufficient fidelity that the daughter molecules are more like the parent molecule than a randomly selected molecule would be. Then we can say that some minimal "self-replication" has occurred.
2. Involve a form of self-replication that perfectly cloned the original. That means true self-replication or else the ability to self-replicate would be impaired or lost within a few generations.
No, I don't think that is a necessary requirement for natural selection to occur. What is necessary, as I said above, is that the sequence itself has to affect replication probability. This is unlikely, I think, unless the molecules are enclosed in some way (although I could be wrong). In other words, the genotype needs to affect the self-replicative capacity of a phenotype. Until we have a phenotype, as well as a genotype, we don't have the necessary conditions for Darwinian evolution. Perfect fidelity of self-replication is not a condition however - indeed Darwinian evolution presupposes self-replication with variance, and that variance has to include phenotypic effects that affect the probability of further self-replication.Elizabeth Liddle
July 20, 2011 at 04:19 AM PDT
Hiya Lizzie, To my mind, the first successful self-replicating molecule need to be: 1. Stable enough at all times to survive for long enough to self-replicate correctly. That means protected from detrimental chemical reactions. 2. Involve a form of self-replication that perfectly cloned the original. That means true self-replication or else the ability to self-replicate would be impaired or lost within a few generations. Do you agree that these are necessary conditions?Chris Doyle
July 20, 2011 at 03:52 AM PDT
Well, there's plenty of scope for copying error, but I think what you may be getting at is: "what scope is there for copying error that will make a difference to the ability to self replicate?" If a double strand of a sequence of mating base pairs splits, and each single strand attractes loose monomers to its now-unmated bases, so that we have two identical strands in place of the original, then it may well be that the sequence itself is sometimes disrupted (perhaps the end of the strand detaches itself; perhaps it joins to another floating strand), but the sequence my not be important to the self-replicating properties; the critical property may simply be the double strandedness, and the mating properties fo the bases. And your question would then be: how could it ever improve its self-replicating properties? Well, alone, it probably couldn't. But recall that selection operates at the level of the "phenotype" not the level of the genome. What we have is here is a phenotype-free genome, essentially. But if those self-replicating peptides find themselves captured by a lipid vesicle, moreover one that tends to grow to a critical size then subdivide, then we have a potential "phenotype". If the sequence of bases in the peptide, for instance, are such that, say, the odd rRNA molecule tends to form with properties that, for example, increase the critical size at which the vesicle divides, allowing more replications of the peptide, and a great chance that the daughter vesicles will both contain copies of the peptide, then we do have a selectable phenotype - a population of vesicles in which some grow larger before subdividing than others, and therefore stand a greater probability of forming two daughter vesicles containing matching daughter peptides. Does that make more sense?Elizabeth Liddle
July 20, 2011 at 03:36 AM PDT
But hang on, Lizzie. We're just talking about the first self-replicating molecule. This must have had the ability to replicate (along with its clone descendants) without a "lipid vesicle". And the only way the first ever self-replicating molecule could self-replicate was the way it self-replicated: that's axiomatic. Where is the scope for copying error? How could copying error not result in impairment or loss of the ability to self-replicate given that that is the only method available to the first self-replicating molecule?Chris Doyle
July 20, 2011 at 03:20 AM PDT
Because self-replication is the only capability of the first self-replicating molecule and, as far as we can tell, there is only one way to self-replicate.
Well, firstly, I'd say that there are lots of ways that a thing, or assemblage of things, might self-replicate. Secondly, I'd say that even if they all did it the same way, they wouldn't necessarily be identical in terms of their likelihood of surviving to do it time and time again. For instance a self-replicating peptide loose in the soup might be far more vulnerable to disintegration than one that found itself inside a lipid vesicle. And if that vesicle itself tended to subdivide (as vesicles do) you've now got a peptide-vesicle combo that reproduces more efficiently than a peptide alone.Elizabeth Liddle
July 20, 2011 at 03:06 AM PDT
Because self-replication is the only capability of the first self-replicating molecule and, as far as we can tell, there is only one way to self-replicate.Chris Doyle
July 19, 2011 at 11:12 AM PDT
Chris:
Any deviation from the clone of this first self-replicant is only going to impair or remove the ability to self-replicate, surely?
Why?Elizabeth Liddle
July 19, 2011 at 10:53 AM PDT
Hiya Lizzie, I'm glad that RSS is proving useful. I'm wary of rushing into paper chemistry when some big questions remain unanswered and we're completely lacking an empirical basis to proceed. Let's come back to that first self-replicating molecule: the descendants of which are clones. Now, I'm still not clear on this point: given that the urge to reproduce has been satisfied by this first self-replicating molecule, which must have been able to survive long enough to replicate itself (as must its descendants) why should it have evolved at all? The simpler the better, surely? On what grounds can we believe that the first self-replicating molecule actually had any scope for copying error? Again, eternal, perfect cloning is sufficient. Any deviation from the clone of this first self-replicant is only going to impair or remove the ability to self-replicate, surely? Even if a mutant could still self-replicate, if there is no real competition for resources (because the original strain flourishes along with the mutant strain), then why should the original strain die out?Chris Doyle
July 19, 2011 at 10:42 AM PDT
Chris:
Don’t worry, I’ll remind you of the unanswered posts if the conversation returns to those territories!
Thanks! And I'm finding the RSS feed a great boon.
Given that we’ve already got successful survival of clones (ie. of the first self-replicating molecule) then why would mutants arise and then be selected? The monkey god is being served nicely, thank-you very much. Given that the most sophisticated part of a self-replicating molecule is the ability to self-replicate, surely any mutation of that molecule will result in the impairment or even loss of the ability to self-replicate? Whatever resources the first self-replicating molecule depended on (what were they?) must have been absolutely abundant in order for it to arise in the first place. If they were scarce, then it would never have lasted long enough to evolve. So, you have not yet established if there was competition for resources at all. Also, Lenski’s LTEE showed that when a previously untapped resource enters into the equation then both strains thrive (because of the reduced competition for the original resource), not just the new one.
Well, that last point is a good answer to your question about why diversity occurs, I would have thought. As for your other points - yes I would agree that for the limits of evolution to expanded, there has to be room, as it were, for improvement. However, there is (as you point out, wrt to Lenski's evidence) more than one dimension along which things can be improved. So "the ability to self-replicate" isn't just the physical ability to do so (which in some ways is quite simple, once you have an entity with two halves that can split) but the ability to survive long enough to do so again and again. And that may depend on the specific properties of the self-replicator. Which is why, I think, OOL researchers have looked at the combination of self-replicating peptides AND lipid vesicles that could enclose them. If I ever get round to my simulation, that's what I hope will happen. Then, the longevity and self-replication capacity of the whole combo - vesicle+peptide has more chance of depending on the specific properties of the peptide.
So, let’s not go into just-so stories until we have an empirical foundation to build them on. Especially when there’s an enormous difference between “swallowing something” and “metabolising something”.
Sure. But it's also good to get back to basics - the reason a protobiont needs "resources" in the first place is to build copies of itself. It may have all the self-replicating power you could shake a stick at, but it can't put it into practice unless it has access to the bits it needs to build it's replica with. The most fundamental thing about organisms is that you end up with more of the stuff than you started. That stuff has to come from somewhere. And in the early days, simply "swallowing" it - useful monomers, maybe bits of old organisms, or just more bits of the original "soup" - may be all it needed - stuff to be chemically attracted to the naked bases of a split peptide to make two whole ones. That's what I hoped would happen in my sim (and still do!) - from a "primordial soup" of monomers with certain "chemical" properties, I hope that vesicles will form that will enclose polymers that will tend to replicate themselves, given a rich enough supply of monomers, then the whole thing will get so big it divides in two, leaving us with two vesicles containing copies of the parent peptides. And, after a while, I hope that vesicles with peptides that for some reason help the vesicle to maintain its integrity for longer, maybe by acting as a template for the generation some kind of "protein" liner product from more floating stuff, will produce more copies of themselves, will come to dominate the population, leaving me with a population of self-replicating peptide-containing vesicles in which the particular peptides have the best vesicle-preserving properties. It'll be a challenge though :)Elizabeth Liddle
July 19, 2011 at 09:25 AM PDT
Hiya Lizzie, Don’t worry, I’ll remind you of the unanswered posts if the conversation returns to those territories! Given that we’ve already got successful survival of clones (ie. of the first self-replicating molecule) then why would mutants arise and then be selected? The monkey god is being served nicely, thank-you very much. Given that the most sophisticated part of a self-replicating molecule is the ability to self-replicate, surely any mutation of that molecule will result in the impairment or even loss of the ability to self-replicate? Whatever resources the first self-replicating molecule depended on (what were they?) must have been absolutely abundant in order for it to arise in the first place. If they were scarce, then it would never have lasted long enough to evolve. So, you have not yet established if there was competition for resources at all. Also, Lenski’s LTEE showed that when a previously untapped resource enters into the equation then both strains thrive (because of the reduced competition for the original resource), not just the new one. So, let’s not go into just-so stories until we have an empirical foundation to build them on. Especially when there’s an enormous difference between “swallowing something” and “metabolising something”.Chris Doyle
July 19, 2011 at 06:13 AM PDT
Chris: (oops, I owe you another response as well....)
Good Afternoon Lizzie, If the monkey god (wow, things are getting weird around here!) only cares about monkeys typing, then, coming back to the real world for a minute, why did we ever evolve beyond the first self-replicating molecule in the first place (or even evolve beyond unicellular organisms)? Clearly the first self-replicating molecule (and unicellular organisms) can just go on typing forever and ever and ever. Things which complicate that (such as multicellularity and sexual reproduction) should have been weeded out by the monkey god from the very beginning.
Good question - let's abandon the monkeys though! Firstly, the unicellular organisms will tend to form a family tree - a mutant that survives will start a lineage of descendents bearing that mutation. So after a short time there will be many strains of unicellular forms all competing for resources. Any strain (any population of one lineage) that finds itself able to exploit a new resource will do better than strains competing for the same resource. So we have the beginnings of, if not, technically "species" (because they do not interbreed) at least of differentiated populations, adapted to utilise different resources. Now, let's say that one population (this is purely just-so story, of course, I just made it up, but it might work) utilises another population as a resource. Eats it. And that other population mutates away, until a sub-strain happens to emerge where the offspring tend to stick together, making them harder for the predator strain to eat them. The outer ones will be picked off, of course, but the inner ones will be protected, and as long as they keep breeding as fast as the outer ones are being picked off, the whole colony will survive. Now, let's say one of these colonies comes up with a mutation that means that some protein that aids utilisation of some resource is only made if the cell in question is surrounded on all sides by other cells, enabling it to absorb enough of some important enzyme from its neighbours. Now, such a mutation would obviously be highly deleterious in a non-colony, however, in a colony, it's only going to harm the cells near the edge. And that won't matter as long as the cells in the middle keep dividing as fast as the ones at the edge keep dying off. However, an unexpected advantage shows up- the dead cells at the edge of the colony prove to be inedible to the predator. So the edge of the colony now not only protects the inner members by being sacrificial victims, giving them time to replicate before the predator population reaches them, but actually keeps the predator population off. The colony has, in fact, become a single organism, with two organs - a skin, consisting of dead cells, and a middle, consisting of live ones. And these two organs are differentiated by a primitive kind of cell signalling - the leakage of enzymes from one cell to another. If enough enzymes get through, the cell goes on metabolising and dividing; if not, the cell dies and protects the colony, and this happens only at the edge, or, as we can now call it "skin". This enables the colony to keep growing, indefinitely, "skin" forming whenever the inside grows so big it tears the outside. However, there is probably an upper limit to the size - beyond a certain size, the thing will tend to break up, just because of basic structural forces including bending moments. At which point, the pieces rapidly grow a "skin", and continue to grow, and self replicate in turn. Now, selection may kick in at the level of the colony. Colonies that happen to have mutations in which other proteins are dependent on the concentration of enzymes leaching across from neighbouring cells may start to differentiate still more - perhaps cavities appear in the the structure, to increase the exposure if the centre to nutrients. We now have the beginning of something like a sponge. And so on. OK, I just made all that up. But then that's the beginning of the scientific process: Just So story -> speculation -> explanatory theory -> hypothesis -> prediction -> new data conforms to/violates prediction. 
I guess my point is simply that speculation is at least possible, and that there are actual theories out there, including sponges. As for "why are there still sponges" - well I guess some other colony found a new niche :) Be back later, gotta run.Elizabeth Liddle
July 19, 2011 at 05:54 AM PDT
Joseph: That we don't know yet (although we have some ideas). But first, let's get clear: given the monkeys, and the typewriters etc, i.e. given the minimal entity capable of replication with variance in the ability to self-replicate, people - including Shakespeare - are one of many possible outcomes. Then (or even simultaneously) we can discuss where that first entity came from.Elizabeth Liddle
July 19, 2011 at 05:22 AM PDT
Where did the monkeys get the typewriters, the ribbons and the paper?Joseph
July 19, 2011 at 04:46 AM PDT
Good Afternoon Lizzie, If the monkey god (wow, things are getting weird around here!) only cares about monkeys typing, then, coming back to the real world for a minute, why did we ever evolve beyond the first self-replicating molecule in the first place (or even evolve beyond unicellular organisms)? Clearly the first self-replicating molecule (and unicellular organisms) can just go on typing forever and ever and ever. Things which complicate that (such as multicellularity and sexual reproduction) should have been weeded out by the monkey god from the very beginning.Chris Doyle
July 19, 2011 at 04:44 AM PDT
No, I ran out of coffee. No problem though: the monkey god doesn't have to know Shakespeare. You've underestimated the target space again. All the monkey god cares about is that the monkeys go on typing, and anything they type that increases the probability that they will go on typing, will of course, be typed more often. Differential repetition, if you like. As you say - that's what the monkey god is.Elizabeth Liddle
July 19, 2011 at 04:36 AM PDT