Dembski’s CSI concept has come under serious question, dispute and suspicion in recent weeks here at UD.
After diligent patrolling, the cops announce a bust: acting on some tips from unnamed sources, they have caught the miscreants in the act!
>>NFL as just linked, pp. 144 & 148:
144: “. . . since a universal probability bound of 1 in 10^150 corresponds to a universal complexity bound of 500 bits of information, (T, E) constitutes CSI because T [i.e. “conceptual information,” effectively the target hot zone in the field of possibilities] subsumes E [i.e. “physical information,” effectively the observed event from that field], T is detachable from E, and T measures at least 500 bits of information . . . ”
148: “The great myth of contemporary evolutionary biology is that the information needed to explain complex biological structures can be purchased without intelligence. My aim throughout this book is to dispel that myth . . . . Eigen and his colleagues must have something else in mind besides information simpliciter when they describe the origin of information as the central problem of biology.
I submit that what they have in mind is specified complexity, or what equivalently we have been calling in this Chapter Complex Specified information or CSI . . . .
Biological specification always refers to function . . . In virtue of their function [a living organism’s subsystems] embody patterns that are objectively given and can be identified independently of the systems that embody them. Hence these systems are specified in the sense required by the complexity-specificity criterion . . . the specification can be cashed out in any number of ways . . . ”
Here we see all the suspects together caught in the very act.
Let us line up our suspects:
2: events from target zones in wider config spaces,
3: joint complexity-specification criteria,
4: 500-bit thresholds of complexity,
5: functionality as a possible objective specification,
6: biofunction as specification,
7: origin of CSI as the key problem of both origin of life [Eigen’s focus] and Evolution, origin of body plans and species etc.
8: equivalence of CSI and complex specification.
Rap, rap, rap!
“How do you all plead?”
“Guilty as charged, with explanation your honour. We were all busy trying to address the scientific origin of biological information, on the characteristic of complex functional specificity. We were not trying to impose a right wing theocratic tyranny nor to smuggle creationism in the back door of the schoolroom your honour.”
“Throw the book at them!”
So, now we have heard from the horse’s mouth.
What are we to make of it, in light of Orgel’s conceptual definition from 1973 and the recent challenges to CSI raised by MG and others?
. . . In brief, living organisms are distinguished by their specified complexity. Crystals are usually taken as the prototypes of simple well-specified structures, because they consist of a very large number of identical molecules packed together in a uniform way. Lumps of granite or random mixtures of polymers are examples of structures that are complex but not specified. The crystals fail to qualify as living because they lack complexity; the mixtures of polymers fail to qualify because they lack specificity. [The Origins of Life (John Wiley, 1973), p. 189.]
And, what about the more complex definition in the 2005 Specification paper by Dembski?
define ϕS as . . . the number of patterns for which [agent] S’s semiotic description of them is at least as simple as S’s semiotic description of [a pattern or target zone] T.  . . . . where M is the number of semiotic agents [S’s] that within a context of inquiry might also be witnessing events and N is the number of opportunities for such events to happen . . . . [where also] computer scientist Seth Lloyd has shown that 10^120 constitutes the maximal number of bit operations that the known, observable universe could have performed throughout its entire multi-billion year history. . . . [Then] for any context of inquiry in which S might be endeavoring to determine whether an event that conforms to a pattern T happened by chance, M·N will be bounded above by 10^120. We thus define the specified complexity [χ] of T given [chance hypothesis] H [in bits] . . . as [the negative base-2 log of the conditional probability P(T|H) multiplied by the number of similar cases ϕS(T) and also by the maximum number of binary search-events in our observed universe 10^120]
χ = – log2[10^120 ·ϕS(T)·P(T|H)] . . . eqn n1
How about this (we are now embarking on an exercise in “open notebook” science):
1 –> 10^120 ~ 2^398
2 –> Following Hartley, we can define Information on a probability metric:
I = – log(p) . . . eqn n2
3 –> So, we can re-present the Chi-metric:
Chi = – log2(2^398 * D2 * p), where D2 = ϕS(T) and p = P(T|H) . . . eqn n3
Chi = Ip – (398 + K2), where Ip = – log2(p) and K2 = log2(D2) . . . eqn n4
4 –> That is, the Dembski CSI Chi-metric is a measure of Information for samples from a target zone T on the presumption of a chance-dominated process, beyond a threshold of at least 398 bits, covering 10^120 possibilities.
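The reduction in points 1 to 4 can be checked numerically. What follows is only a minimal Python sketch, not anything from Dembski's paper: the function names and the sample values for ϕS(T) and P(T|H) are assumptions chosen purely to exercise the arithmetic.

```python
import math

# log2(10^120): the fixed part of the threshold in eqn n4 (about 398.6 bits)
THRESHOLD_BITS = 120 * math.log2(10)

def chi_direct(phi_s, p):
    """Eqn n1 evaluated as written: -log2(10^120 * phi_S(T) * P(T|H))."""
    return -math.log2(1e120 * phi_s * p)

def chi_reduced(phi_s, p):
    """Eqn n4: Chi = Ip - (398.6 + K2)."""
    ip = -math.log2(p)        # Hartley-style information, eqn n2
    k2 = math.log2(phi_s)     # extra threshold increment from phi_S(T)
    return ip - (THRESHOLD_BITS + k2)

# Hypothetical inputs: a 600-bit event and phi_S(T) = 10^20
p_example = 2.0 ** -600
phi_example = 1e20
print(round(THRESHOLD_BITS, 2))
print(chi_direct(phi_example, p_example), chi_reduced(phi_example, p_example))
```

Both forms give the same value, which is all that eqn n4 claims: the log of a product is the sum of the logs, so the 10^120 and ϕS(T) factors become a subtracted threshold.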
5 –> Where also, K2 is a further increment to the threshold that naturally peaks at about 100 further bits. In short VJT’s CSI-lite is an extension and simplification of the Chi-metric. He explains in the just linked (and building on the further linked):
The CSI-lite calculation I’m proposing here doesn’t require any semiotic descriptions, and it’s based on purely physical and quantifiable parameters which are found in natural systems. That should please ID critics. These physical parameters should have known probability distributions. A probability distribution is associated with each and every quantifiable physical parameter that can be used to describe each and every kind of natural system – be it a mica crystal, a piece of granite containing that crystal, a bucket of water, a bacterial flagellum, a flower, or a solar system . . . .
Two conditions need to be met before some feature of a system can be unambiguously ascribed to an intelligent agent: first, the physical parameter being measured has to have a value corresponding to a probability of 10^(-150) or less, and second, the system itself should also be capable of being described very briefly (low Kolmogorov complexity), in a way that either explicitly mentions or implicitly entails the surprisingly improbable value (or range of values) of the physical parameter being measured . . . .
my definition of CSI-lite removes Phi_s(T) from the actual formula and replaces it with a constant figure of 10^30. The requirement for low descriptive complexity still remains, but as an extra condition that must be satisfied before a system can be described as a specification. So Professor Dembski’s formula now becomes:
CSI-lite=-log2[10^120.10^30.P(T|H)]=-log2[10^150.P(T|H)] . . . eqn n1a
. . . .the overall effect of including Phi_s(T) in Professor Dembski’s formulas for a pattern T’s specificity, sigma, and its complex specified information, Chi, is to reduce both of them by a certain number of bits. For the bacterial flagellum, Phi_s(T) is 10^20, which is approximately 2^66, so sigma and Chi are both reduced by 66 bits. My formula makes that 100 bits (as 10^30 is approximately 2^100), so my CSI-lite computation represents a very conservative figure indeed.
Readers should note that although I have removed Dembski’s specification factor Phi_s(T) from my formula for CSI-lite, I have retained it as an additional requirement: in order for a system to be described as a specification, it is not enough for CSI-lite to exceed 1; the system itself must also be capable of being described briefly (low Kolmogorov complexity) in some common language, in a way that either explicitly mentions pattern T, or entails the occurrence of pattern T. (The “common language” requirement is intended to exclude the use of artificial predicates like grue.) . . . .
[As MF has pointed out] the probability p of pattern T occurring at a particular time and place as a result of some unintelligent (so-called “chance”) process should not be multiplied by the total number of trials n during the entire history of the universe. Instead one should use the formula (1–(1-p)^n), where in this case p is P(T|H) and n=10^120. Of course, my CSI-lite formula uses Dembski’s original conservative figure of 10^150, so my corrected formula for CSI-lite now reads as follows:
CSI-lite=-log2(1-(1-P(T|H))^(10^150)) . . . eqn n1b
If P(T|H) is very low, then this formula will be very closely approximated [HT: Giem] by the formula:
CSI-lite=-log2[10^150.P(T|H)] . . . eqn n1c
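As a hedged numerical check on the step from eqn n1b to eqn n1c, here is a small Python sketch. The function names and the sample probability are assumptions for illustration only. The exact form is evaluated with log1p and expm1, because a naive (1 - p) ** n rounds to 1.0 in floating point for such tiny p and the logarithm then fails.

```python
import math

def csi_lite_exact(p, n=1e150):
    """Eqn n1b: -log2(1 - (1 - p)^n), computed stably via log1p/expm1."""
    prob_at_least_once = -math.expm1(n * math.log1p(-p))
    return -math.log2(prob_at_least_once)

def csi_lite_approx(p, n=1e150):
    """Eqn n1c: -log2(n * p), reasonable only when n * p is far below 1."""
    return -(math.log2(n) + math.log2(p))

p_small = 1e-200  # hypothetical P(T|H)
print(csi_lite_exact(p_small), csi_lite_approx(p_small))

# At n * p = 1 the approximation no longer tracks the exact form
p_edge = 1e-150
print(csi_lite_exact(p_edge), csi_lite_approx(p_edge))
```

For the small probability the two values agree closely; at the edge case the exact form stays positive while the approximation drops to about zero, which is why the approximation is stated only for very low P(T|H).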
6 –> So, the idea of the Dembski metric in the end (debates about peculiarities in derivation notwithstanding) is that if the Hartley-Shannon-derived information measure for items from a hot or target zone in a field of possibilities is beyond 398 – 500 or so bits, it is so deeply isolated that a chance-dominated process is maximally unlikely to find it, but of course intelligent agents routinely produce information beyond such a threshold.
7 –> In addition, the only observed cause of information beyond such a threshold is the now proverbial intelligent semiotic agents.
8 –> Even at 398 bits that makes sense as the total number of Planck-time quantum states for the atoms of the solar system [most of which are in the Sun] since its formation does not exceed ~ 10^102, as Abel showed in his 2009 Universal Plausibility Metric paper. The search resources in our solar system just are not there.
9 –> So, we now clearly have a simple but fairly sound context to understand the Dembski result, conceptually and mathematically [cf. more details here]; tracing back to Orgel and onward to Shannon and Hartley. Let’s augment here [Apr 17], on a comment in the MG progress thread:
Shannon measured info-carrying capacity, towards one of his goals: metrics of the carrying capacity of comms channels — as in who was he working for, again?
CSI extended this to meaningfulness/function of info.
And in so doing, observed that this — due to the required specificity — naturally constricts the zone of the space of possibilities actually used, to island[s] of function.
That specificity-complexity criterion links:
I: an explosion of the scope of the config space to accommodate the complexity (as every added bit DOUBLES the set of possible configurations), to
II: a restriction of the zone, T, of the space used to accommodate the specificity (often to function/be meaningfully structured).
In turn that suggests that we have zones of function that are ever harder for chance based random walks [CBRW’s] to pick up. But intelligence does so much more easily.
Thence, we see that if you have a metric for the information involved that surpasses a threshold beyond which a CBRW is no longer a plausible explanation, then we can confidently infer to design as best explanation.
Voila, we need an info-beyond-the-threshold metric. And, once we have a reasonable estimate of the direct or implied specific and/or functionally specific (especially code based) information in an entity of interest, we have an estimate of or credible substitute for the value of – log2(p(T|H)). In particular, if the value of information comes from direct inspection of storage capacity and code symbol patterns of use leading to an estimate of relative frequency, we may evaluate average [functionally or otherwise] specific information per symbol used. This is a version of Shannon’s weighted average information per symbol H-metric, H = – Σ pi * log(pi), which is also known as informational entropy [there is an arguable link to thermodynamic entropy, cf here] or uncertainty.
As in (using Chi_500 for VJT’s CSI_lite [UPDATE, July 3: and S for a dummy variable that is 1/0 accordingly as the information in I is empirically or otherwise shown to be specific, i.e. from a narrow target zone T, strongly UNREPRESENTATIVE of the bulk of the distribution of possible configurations, W]):
Chi_500 = Ip*S – 500, bits beyond the [solar system resources] threshold . . . eqn n5
Chi_1000 = Ip*S – 1000, bits beyond the observable cosmos, 125 byte/ 143 ASCII character threshold . . . eqn n6
Chi_1024 = Ip*S – 1024, bits beyond a 2^10, 128 byte/147 ASCII character version of the threshold in n6, with a config space of 1.80*10^308 possibilities, not 1.07*10^301 . . . eqn n6a
[UPDATE, July 3: So, if we have a string of 1,000 fair coins, and toss at random, we will by overwhelming probability expect to get a near 50-50 distribution typical of the bulk of the 2^1,000 possibilities W. On the Chi_500 metric, I would be high, 1,000 bits, but S would be 0, so the value for Chi_500 would be – 500, i.e. well within the possibilities of chance. However, if we came to the same string later and saw that the coins somehow now had the bit pattern of the ASCII codes for the first 143 or so characters of this post, we would have excellent reason to infer that an intelligent designer, using choice contingency, had intelligently reconfigured the coins. That is because, using the same I = 1,000 capacity value, S is now 1, and so Chi_500 = 500 bits beyond the solar system threshold. If the 10^57 or so atoms of our solar system, for its lifespan, were to be converted into coins and tables etc, and tossed at an impossibly fast rate, it would be impossible to sample enough of the possibilities space W to have confidence that something from so unrepresentative a zone T could reasonably be explained on chance. So, as long as an intelligent agent capable of choice is possible, choice, i.e. design, would be the rational, best explanation on the sign observed: functionally specific, complex information.]
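The coin example can be written out as a few lines of Python. This is only a sketch of eqns n5 to n6a as stated above; the function names are mine, and the 7-bits-per-character figure is the assumption behind the 143 and 147 ASCII character counts.

```python
import math

def chi_beyond(capacity_bits, specific, threshold_bits=500):
    """Eqns n5, n6, n6a: Chi = Ip * S - threshold, with S equal to 0 or 1."""
    if specific not in (0, 1):
        raise ValueError("S must be 0 or 1")
    return capacity_bits * specific - threshold_bits

def ascii_chars(bits, bits_per_char=7):
    """7-bit ASCII characters needed to hold the given number of bits."""
    return math.ceil(bits / bits_per_char)

random_toss = chi_beyond(1000, 0)   # 1,000 coins tossed at random: S = 0
ascii_text = chi_beyond(1000, 1)    # same coins spelling out ASCII text: S = 1
print(random_toss, ascii_text)
print(ascii_chars(1000), ascii_chars(1024))
```

Note that S is an input here: the code does not decide whether a pattern is specific; it only applies the threshold once that judgment has been made on other grounds.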
10 –> Similarly, the work of Durston and colleagues, published in 2007, fits this same general framework. Excerpting:
Consider that there are usually only 20 different amino acids possible per site for proteins, Eqn. (6) can be used to calculate a maximum Fit value/protein amino acid site of 4.32 Fits/site [NB: Log2 (20) = 4.32]. We use the formula log (20) – H(Xf) to calculate the functional information at a site specified by the variable Xf such that Xf corresponds to the aligned amino acids of each sequence with the same molecular function f. The measured FSC for the whole protein is then calculated as the summation of that for all aligned sites. The number of Fits quantifies the degree of algorithmic challenge, in terms of probability [info and probability are closely related], in achieving needed metabolic function. For example, if we find that the Ribosomal S12 protein family has a Fit value of 379, we can use the equations presented thus far to predict that there are about 10^49 different 121-residue sequences that could fall into the Ribosomal S12 family of proteins, resulting in an evolutionary search target of approximately 10^-106 percent of 121-residue sequence space. In general, the higher the Fit value, the more functional information is required to encode the particular function in order to find it in sequence space. A high Fit value for individual sites within a protein indicates sites that require a high degree of functional information. High Fit values may also point to the key structural or binding sites within the overall 3-D structure.
11 –> So, Durston et al are targeting the same goal, but have chosen a different path from the start-point of the Shannon-Hartley log probability metric for information. That is, they use Shannon’s H, the average information per symbol, and address shifts in it from a ground to a functional state on investigation of protein family amino acid sequences. They also do not identify an explicit threshold for degree of complexity. [Added, Apr 18, from comment 11 below:] However, their information values can be integrated with the reduced Chi metric:
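The per-site calculation described in this excerpt, log2(20) – H(Xf) summed over aligned sites, can be sketched in Python as below. This is not Durston et al's software: the function names are mine, the three-residue "alignment" is a made-up toy, and the sketch uses raw observed frequencies with no handling of gaps or small-sample effects.

```python
import math
from collections import Counter

def shannon_h(symbols):
    """Shannon H in bits per symbol, from observed symbol frequencies."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def site_fits(column, alphabet_size=20):
    """Functional information at one aligned site: log2(20) - H(Xf)."""
    return math.log2(alphabet_size) - shannon_h(column)

def protein_fits(aligned_sequences):
    """Sum of per-site Fits over all aligned sites."""
    return sum(site_fits(col) for col in zip(*aligned_sequences))

# Hypothetical toy alignment: four sequences, three aligned sites
toy_alignment = ["MKV", "MRV", "MKV", "MRI"]
print(site_fits("MMMM"), site_fits("KRKR"))
print(protein_fits(toy_alignment))
```

A fully conserved site scores the maximum of about 4.32 Fits, and every bit of observed variability at a site lowers its score by one bit.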
Using Durston’s Fits from his Table 1, in the Dembski style metric of bits beyond the threshold, and simply setting the threshold at 500 bits:
RecA: 242 AA, 832 fits, Chi: 332 bits beyond
SecY: 342 AA, 688 fits, Chi: 188 bits beyond
Corona S2: 445 AA, 1285 fits, Chi: 785 bits beyond . . . results n7
The two metrics are clearly consistent, and Corona S2 would also pass the X metric’s far more stringent threshold right off as a single protein. (Think about the cumulative fits metric for the proteins for a cell . . . )
In short one may use the Durston metric as a good measure of the target zone’s actual encoded information content, which Table 1 also conveniently reduces to bits per symbol so we can see how the redundancy affects the information used across the domains of life to achieve a given protein’s function; not just the raw capacity in storage unit bits [= no. of AA’s * 4.32 bits/AA on 20 possibilities, as the chain is not particularly constrained.]
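The arithmetic behind results n7 is simple enough to lay out in Python. The sketch below uses only the three lengths and Fit values listed above; the dictionary keys and function name are assumptions for illustration.

```python
THRESHOLD_BITS = 500
BITS_PER_AA = 4.32  # log2(20), as quoted from Durston et al

# (protein, length in amino acids, Fits) as listed in results n7
proteins = [("RecA", 242, 832), ("SecY", 342, 688), ("Corona S2", 445, 1285)]

def summarize(name, length_aa, fits):
    """Bits beyond the 500-bit threshold, plus Fits per site and raw capacity."""
    return {
        "name": name,
        "chi_500": fits - THRESHOLD_BITS,
        "fits_per_site": fits / length_aa,
        "raw_capacity_bits": length_aa * BITS_PER_AA,
        "passes_1000": fits > 1000,
    }

rows = [summarize(*p) for p in proteins]
for row in rows:
    print(row)
```

The fits_per_site column comes out below 4.32 in each case, which is the redundancy point made above: the functional information per residue is less than the raw storage capacity per residue.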
12 –> I guess I should not leave off the simple, brute force X-metric that has been knocking around UD for years.
13 –> The idea is that we can judge information in or reducible to bits, as to whether it is or is not contingent and complex beyond 1,000 bits. If so, C = 1 (and if not C = 0). Similarly, functional specificity can be judged by seeing the effect of disturbing the information by random noise [where codes will be an “obvious” case, as will be key-lock fitting components in a Wicken wiring diagram functionally organised entity based on nodes, arcs and interfaces in a network], to see if we are on an “island of function.” If so, S = 1 (and if not, S = 0).
14 –> We then look at the number of bits used, B — more or less the number of basic yes/no questions needed to specify the configuration [or, to store the data], perhaps adjusted for coding symbol relative frequencies — and form a simple product, X:
X = C * S * B, in functionally specific bits . . . eqn n8.
15 –> This is of course a direct application of the per aspect explanatory filter, (cf. discussion of the rationale for the filter here in the context of Dembski’s “dispensed with” remark) and the value in bits for a large file is the familiar number we commonly see such as a Word Doc of 384 k bits. So, more or less the X-metric is actually quite commonly used with the files we toss around all the time. That also means that on billions of test examples, FSCI in functional bits beyond 1,000 as a threshold of complexity is an empirically reliable sign of intelligent design.
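For completeness, eqn n8 as a Python sketch. The function name is mine, and two simplifying assumptions are built in: contingency is taken as given, so C depends only on the bit count, and S is supplied by the user as the outcome of the noise-perturbation judgment described in point 13 rather than computed. The 384,000-bit figure reads "384 k bits" with k = 1,000, which is also an assumption.

```python
def x_metric(bits_used, functionally_specific, threshold_bits=1000):
    """Eqn n8: X = C * S * B, in functionally specific bits."""
    c = 1 if bits_used > threshold_bits else 0   # complex beyond the threshold?
    s = 1 if functionally_specific else 0        # judged to sit on an island of function?
    return c * s * bits_used

word_doc = x_metric(384_000, True)     # large functional file
short_note = x_metric(800, True)       # functional but under 1,000 bits
noise_file = x_metric(384_000, False)  # large but not functionally specific
print(word_doc, short_note, noise_file)
```

Either flag being 0 zeroes the product, so X only reports a bit count when both the complexity and the specificity judgments come out positive.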
All of this adds up to a conclusion.
Namely, that there is excellent reason to see that:
i: CSI and FSCI are conceptually well defined (and are certainly not “meaningless”),
ii: trace to the work of leading OOL researchers in the 1970’s,
iii: have credible metrics developed on these concepts by inter alia Dembski and Durston, Chiu, Abel and Trevors, metrics that are based on very familiar mathematics for information and related fields, and
iv: are in fact — though this is hotly denied and fought tooth and nail — quite reliable indicators of intelligent cause where we can do a direct cross-check.
In short, the set of challenges raised by MG over the past several weeks has collapsed. END