Uncommon Descent Serving The Intelligent Design Community

NEWS FLASH: Dembski’s CSI caught in the act


Dembski’s CSI concept has come under serious question, dispute and suspicion in recent weeks here at UD.

After diligent patrolling, the cops announce a bust: acting on tips from unnamed sources, they have caught the miscreants in the act!

From a comment in the MG smart thread, courtesy Dembski’s  NFL (2007 edn):

___________________

>>NFL as just linked, pp. 144 & 148:

144: “. . . since a universal probability bound of 1 in 10^150 corresponds to a universal complexity bound of 500 bits of information, (T, E) constitutes CSI because T [i.e. “conceptual information,” effectively the target hot zone in the field of possibilities] subsumes E [i.e. “physical information,” effectively the observed event from that field], T is detachable from E, and T measures at least 500 bits of information . . . ”

148: “The great myth of contemporary evolutionary biology is that the information needed to explain complex biological structures can be purchased without intelligence. My aim throughout this book is to dispel that myth . . . . Eigen and his colleagues must have something else in mind besides information simpliciter when they describe the origin of information as the central problem of biology.

I submit that what they have in mind is specified complexity, or what equivalently we have been calling in this Chapter Complex Specified information or CSI . . . .

Biological specification always refers to function . . . In virtue of their function [a living organism’s subsystems] embody patterns that are objectively given and can be identified independently of the systems that embody them. Hence these systems are specified in the sense required by the complexity-specificity criterion . . . the specification can be cashed out in any number of ways . . . “

Here we see all the suspects together caught in the very act.

Let us line up our suspects:

1: CSI,

2: events from target zones in wider config spaces,

3: joint complexity-specification criteria,

4: 500-bit thresholds of complexity,

5: functionality as a possible objective specification

6: biofunction as specification,

7: origin of CSI as the key problem of both origin of life [Eigen’s focus] and Evolution, origin of body plans and species etc.

8: equivalence of CSI and complex specification.

Rap, rap, rap!

“How do you all plead?”

“Guilty as charged, with explanation your honour. We were all busy trying to address the scientific origin of biological information, on the characteristic of complex functional specificity. We were not trying to impose a right wing theocratic tyranny nor to smuggle creationism in the back door of the schoolroom your honour.”

“Guilty!”

“Throw the book at them!”

CRASH! >>

___________________

So, now we have heard from the horse’s mouth.

What are we to make of it, in light of Orgel’s conceptual definition from 1973 and the recent challenges to CSI raised by MG and others?

That is:

. . . In brief, living organisms are distinguished by their specified complexity. Crystals are usually taken as the prototypes of simple well-specified structures, because they consist of a very large number of identical molecules packed together in a uniform way. Lumps of granite or random mixtures of polymers are examples of structures that are complex but not specified. The crystals fail to qualify as living because they lack complexity; the mixtures of polymers fail to qualify because they lack specificity. [The Origins of Life (John Wiley, 1973), p. 189.]

And, what about the more complex definition in the 2005 Specification paper by Dembski?

Namely:

define ϕS as . . . the number of patterns for which [agent] S’s semiotic description of them is at least as simple as S’s semiotic description of [a pattern or target zone] T. [26] . . . . where M is the number of semiotic agents [S’s] that within a context of inquiry might also be witnessing events and N is the number of opportunities for such events to happen . . . . [where also] computer scientist Seth Lloyd has shown that 10^120 constitutes the maximal number of bit operations that the known, observable universe could have performed throughout its entire multi-billion year history.[31] . . . [Then] for any context of inquiry in which S might be endeavoring to determine whether an event that conforms to a pattern T happened by chance, M·N will be bounded above by 10^120. We thus define the specified complexity [χ] of T given [chance hypothesis] H [in bits] . . . as [the negative base-2 log of the conditional probability P(T|H) multiplied by the number of similar cases ϕS(T) and also by the maximum number of binary search-events in our observed universe 10^120]

χ = – log2[10^120 ·ϕS(T)·P(T|H)]  . . . eqn n1

How about this (we are now embarking on an exercise in “open notebook” science):

1 –> 10^120 ~ 2^398

2 –> Following Hartley, we can define information on a probability metric:

I = – log2(p), in bits  . . .  eqn n2

3 –> So, writing D2 for ϕS(T), we can re-present the Chi-metric:

Chi = – log2(2^398 * D2 * p)  . . .  eqn n3

Chi = Ip – (398 + K2), where Ip = – log2(p) and K2 = log2(D2)  . . .  eqn n4

4 –> That is, the Dembski CSI Chi-metric is a measure of Information for samples from a target zone T on the presumption of a chance-dominated process, beyond a threshold of at least 398 bits, covering 10^120 possibilities.
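For illustration, here is a minimal Python sketch of that reduction. The ϕS(T) and P(T|H) values used are made-up placeholders, chosen only to show that eqn n1 and the reduced eqn n4 return the same figure:

```python
import math

LOG2_10_120 = 120 * math.log2(10)   # ~ 398.6 bits, the 10^120 "universal" factor

def chi_original(phi_s, p):
    # eqn n1: Chi = -log2[ 10^120 * phi_S(T) * P(T|H) ]
    return -(LOG2_10_120 + math.log2(phi_s) + math.log2(p))

def chi_reduced(phi_s, p):
    # eqn n4: Chi = Ip - (398.6 + K2), with Ip = -log2(p) and K2 = log2(phi_S(T))
    Ip = -math.log2(p)
    K2 = math.log2(phi_s)
    return Ip - (LOG2_10_120 + K2)

phi_s = 10**20   # illustrative phi_S(T) value only (cf. VJT's flagellum figure cited below)
p = 1e-250       # illustrative P(T|H) value only
print(round(chi_original(phi_s, p), 1))  # ~ 365.4 bits
print(round(chi_reduced(phi_s, p), 1))   # same value, by construction
```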

5 –> Where also, K2 is a further increment to the threshold that naturally peaks at about 100 further bits. In short, VJT’s CSI-lite is an extension and simplification of the Chi-metric. He explains in the just linked (and building on the further linked):

The CSI-lite calculation I’m proposing here doesn’t require any semiotic descriptions, and it’s based on purely physical and quantifiable parameters which are found in natural systems. That should please ID critics. These physical parameters should have known probability distributions. A probability distribution is associated with each and every quantifiable physical parameter that can be used to describe each and every kind of natural system – be it a mica crystal, a piece of granite containing that crystal, a bucket of water, a bacterial flagellum, a flower, or a solar system . . . .

Two conditions need to be met before some feature of a system can be unambiguously ascribed to an intelligent agent: first, the physical parameter being measured has to have a value corresponding to a probability of 10^(-150) or less, and second, the system itself should also be capable of being described very briefly (low Kolmogorov complexity), in a way that either explicitly mentions or implicitly entails the surprisingly improbable value (or range of values) of the physical parameter being measured . . . .

my definition of CSI-lite removes Phi_s(T) from the actual formula and replaces it with a constant figure of 10^30. The requirement for low descriptive complexity still remains, but as an extra condition that must be satisfied before a system can be described as a specification. So Professor Dembski’s formula now becomes:

CSI-lite = – log2[10^120 · 10^30 · P(T|H)] = – log2[10^150 · P(T|H)]  . . .  eqn n1a

. . . .the overall effect of including Phi_s(T) in Professor Dembski’s formulas for a pattern T’s specificity, sigma, and its complex specified information, Chi, is to reduce both of them by a certain number of bits. For the bacterial flagellum, Phi_s(T) is 10^20, which is approximately 2^66, so sigma and Chi are both reduced by 66 bits. My formula makes that 100 bits (as 10^30 is approximately 2^100), so my CSI-lite computation represents a very conservative figure indeed.

Readers should note that although I have removed Dembski’s specification factor Phi_s(T) from my formula for CSI-lite, I have retained it as an additional requirement: in order for a system to be described as a specification, it is not enough for CSI-lite to exceed 1; the system itself must also be capable of being described briefly (low Kolmogorov complexity) in some common language, in a way that either explicitly mentions pattern T, or entails the occurrence of pattern T. (The “common language” requirement is intended to exclude the use of artificial predicates like grue.) . . . .

[As MF has pointed out] the probability p of pattern T occurring at a particular time and place as a result of some unintelligent (so-called “chance”) process should not be multiplied by the total number of trials n during the entire history of the universe. Instead one should use the formula (1–(1-p)^n), where in this case p is P(T|H) and n=10^120. Of course, my CSI-lite formula uses Dembski’s original conservative figure of 10^150, so my corrected formula for CSI-lite now reads as follows:

CSI-lite = – log2(1 – (1 – P(T|H))^(10^150))  . . .  eqn n1b

If P(T|H) is very low, then this formula will be very closely approximated [HT: Giem] by the formula:

CSI-lite = – log2[10^150 · P(T|H)]  . . .  eqn n1c
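As a quick numeric sanity check, here is a short Python sketch comparing eqn n1b with the n1c approximation. The P(T|H) value is a made-up placeholder, and log1p/expm1 are used because a naive evaluation of n1b underflows at such tiny probabilities:

```python
import math

N = 1e150   # Dembski's conservative 10^150 figure used in CSI-lite

def csi_lite_exact(p):
    # eqn n1b: -log2( 1 - (1 - p)^N ), via 1 - (1-p)^N = -expm1(N * ln(1 - p))
    log_term = N * math.log1p(-p)
    return -math.log2(-math.expm1(log_term))

def csi_lite_approx(p):
    # eqn n1c: -log2( 10^150 * p )
    return -(math.log2(N) + math.log2(p))

p = 1e-200   # illustrative, very small P(T|H)
print(round(csi_lite_exact(p), 3))    # ~ 166.096
print(round(csi_lite_approx(p), 3))   # ~ 166.096 -- the approximation is excellent when p is tiny
```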

6 –> So, the idea of the Dembski metric in the end — debates about peculiarities in derivation notwithstanding — is that if the Hartley–Shannon-derived information measure for items from a hot or target zone in a field of possibilities is beyond 398 – 500 or so bits, it is so deeply isolated that a chance-dominated process is maximally unlikely to find it, but of course intelligent agents routinely produce information beyond such a threshold.

7 –> In addition, the only observed cause of information beyond such a threshold is the action of the now proverbial intelligent semiotic agents.

8 –> Even at 398 bits that makes sense as the total number of Planck-time quantum states for the atoms of the solar system [most of which are in the Sun] since its formation does not exceed ~ 10^102, as Abel showed in his 2009 Universal Plausibility Metric paper. The search resources in our solar system just are not there.

9 –> So, we now clearly have a simple but fairly sound context to understand the Dembski result, conceptually and mathematically [cf. more details here]; tracing back to Orgel and onward to Shannon and Hartley. Let’s augment here [Apr 17], on a comment in the MG progress thread:

Shannon measured info-carrying capacity, towards one of his goals: metrics of the carrying capacity of comms channels — as in who was he working for, again?

CSI extended this to meaningfulness/function of info.

And in so doing, observed that this — due to the required specificity — naturally constricts the zone of the space of possibilities actually used, to island[s] of function.

That specificity-complexity criterion links:

I: an explosion of the scope of the config space to accommodate the complexity (as every added bit DOUBLES the set of possible configurations),  to

II: a restriction of the zone, T, of the space used to accommodate the specificity (often to function/be meaningfully structured).

In turn that suggests that we have zones of function that are ever harder for chance based random walks [CBRW’s] to pick up. But intelligence does so much more easily.

Thence, we see that if you have a metric for the information involved that surpasses a threshold beyond which a CBRW is not a plausible explanation, then we can confidently infer design as the best explanation.

Voila, we need an info-beyond-the-threshold metric. Once we have a reasonable estimate of the direct or implied specific and/or functionally specific (especially code-based) information in an entity of interest, we have an estimate of, or a credible substitute for, the value of – log2(P(T|H)). In particular, if the value of information comes from direct inspection of storage capacity and of the patterns of use of code symbols, leading to an estimate of relative frequency, we may evaluate the average [functionally or otherwise] specific information per symbol used. This is a version of Shannon’s weighted average information per symbol H-metric, H = – Σ pi * log2(pi), which is also known as informational entropy [there is an arguable link to thermodynamic entropy, cf here] or uncertainty.
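For illustration, a minimal Python sketch of estimating that per-symbol H from observed symbol frequencies (the sample string is arbitrary; this is only the frequency-weighted capacity-style estimate, not a judgment of specificity):

```python
from collections import Counter
from math import log2

def shannon_H(text):
    """Average information per symbol, H = -sum(p_i * log2(p_i)), in bits/symbol,
    estimated from the observed relative frequencies of the symbols in `text`."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * log2(n / total) for n in counts.values())

msg = "the quick brown fox jumps over the lazy dog"   # illustrative string only
H = shannon_H(msg)
print(round(H, 3), "bits/symbol")        # well under the 7-bit ASCII carrying capacity
print(round(H * len(msg), 1), "bits")    # frequency-weighted info estimate for the whole string
```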

As in (using Chi_500 for VJT’s CSI_lite [UPDATE, July 3: and S for a dummy variable that is 1/0 accordingly as the information in I is empirically or otherwise shown to be specific, i.e. from a narrow target zone T, strongly UNREPRESENTATIVE of the bulk of the distribution of possible configurations, W]):

Chi_500 = Ip*S – 500,  bits beyond the [solar system resources] threshold  . . . eqn n5

Chi_1000 = Ip*S – 1000, bits beyond the observable cosmos, 125 byte/ 143 ASCII character threshold . . . eqn n6

Chi_1024 = Ip*S – 1024, bits beyond a 2^10, 128 byte/147 ASCII character version of the threshold in n6, with a config space of 1.80*10^308 possibilities, not 1.07*10^301 . . . eqn n6a

[UPDATE, July 3: So, if we have a string of 1,000 fair coins, and toss at random, we will by overwhelming probability expect to get a near 50-50 distribution typical of the bulk of the 2^1,000 possibilities W. On the Chi_500 metric, I would be high, 1,000 bits, but S would be 0, so the value for Chi_500 would be – 500, i.e. well within the possibilities of chance. However, if we came to the same string later and saw that the coins somehow now had the bit pattern of the ASCII codes for the first 143 or so characters of this post, we would have excellent reason to infer that an intelligent designer, using choice contingency, had intelligently reconfigured the coins. That is because, using the same I = 1,000 capacity value, S is now 1, and so Chi_500 = 500 bits beyond the solar system threshold. If the 10^57 or so atoms of our solar system, for its lifespan, were to be converted into coins and tables etc., and tossed at an impossibly fast rate, it would still be impossible to sample enough of the possibilities space W to have any confidence that something from so unrepresentative a zone T could reasonably be explained by chance. So, as long as an intelligent agent capable of choice is possible, choice — i.e. design — would be the rational, best explanation on the sign observed: functionally specific, complex information.]
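A minimal Python sketch of eqn n5 applied to the coin illustration just given. Note that the S = 0/1 judgment is supplied by the observer on inspection, not computed by the formula:

```python
def chi_500(I_bits, S):
    """Eqn n5: Chi_500 = Ip*S - 500, bits beyond the solar-system threshold.
    I_bits: information (or carrying capacity) in bits.
    S: 1 if the configuration is independently judged specific (from a narrow
    target zone T), else 0."""
    return I_bits * S - 500

# 1,000 coins tossed at random: high capacity, but not specific (S = 0)
print(chi_500(1000, 0))   # -500 -> comfortably within the reach of chance

# Same 1,000 coins later found spelling ~143 ASCII characters of text (S = 1)
print(chi_500(1000, 1))   # +500 -> 500 bits beyond the threshold
```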

10 –> Similarly, the work of Durston and colleagues, published in 2007, fits this same general framework. Excerpting:

Consider that there are usually only 20 different amino acids possible per site for proteins, Eqn. (6) can be used to calculate a maximum Fit value/protein amino acid site of 4.32 Fits/site [NB: Log2 (20) = 4.32]. We use the formula log (20) – H(Xf) to calculate the functional information at a site specified by the variable Xf such that Xf corresponds to the aligned amino acids of each sequence with the same molecular function f. The measured FSC for the whole protein is then calculated as the summation of that for all aligned sites. The number of Fits quantifies the degree of algorithmic challenge, in terms of probability [info and probability are closely related], in achieving needed metabolic function. For example, if we find that the Ribosomal S12 protein family has a Fit value of 379, we can use the equations presented thus far to predict that there are about 10^49 different 121-residue sequences that could fall into the Ribosomal S12 family of proteins, resulting in an evolutionary search target of approximately 10^-106 percent of 121-residue sequence space. In general, the higher the Fit value, the more functional information is required to encode the particular function in order to find it in sequence space. A high Fit value for individual sites within a protein indicates sites that require a high degree of functional information. High Fit values may also point to the key structural or binding sites within the overall 3-D structure.
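For illustration, a minimal Python sketch of the per-site fits calculation on a made-up toy "alignment", followed by the same 500-bit reduction applied to Durston's published Table 1 values, anticipating the integration discussed in point 11 below:

```python
from collections import Counter
from math import log2

def fits(aligned_seqs):
    """Functional bits ("fits") for an aligned protein family:
    sum over sites of log2(20) - H(X_f), with H estimated from the
    observed amino-acid frequencies at each aligned site."""
    total = 0.0
    for site in zip(*aligned_seqs):                 # walk the alignment column by column
        counts = Counter(site)
        n = len(site)
        H = -sum((c / n) * log2(c / n) for c in counts.values())
        total += log2(20) - H
    return total

# Toy alignment (made up, 4 sequences x 5 sites) just to show the mechanics
family = ["MKVLA", "MKVIA", "MRVLA", "MKVLG"]
print(round(fits(family), 2), "fits")   # ~ 19.2 fits for this tiny toy case

# Durston's Table 1 values, fed into the reduced Chi metric with a 500-bit threshold
for name, f in [("RecA", 832), ("SecY", 688), ("Corona S2", 1285)]:
    print(name, f - 500, "bits beyond the threshold")
```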

11 –> So, Durston et al are targeting the same goal, but have chosen a different path from the start-point of the Shannon–Hartley log-probability metric for information. That is, they use Shannon’s H, the average information per symbol, and address shifts in it from a ground to a functional state on investigation of protein family amino acid sequences. They also do not identify an explicit threshold for degree of complexity. [Added, Apr 18, from comment 11 below:] However, their information values can be integrated with the reduced Chi metric:

Using Durston’s Fits from his Table 1, in the Dembski style metric of bits beyond the threshold, and simply setting the threshold at 500 bits:

RecA: 242 AA, 832 fits, Chi: 332 bits beyond

SecY: 342 AA, 688 fits, Chi: 188 bits beyond

Corona S2: 445 AA, 1285 fits, Chi: 785 bits beyond  . . . results n7

The two metrics are clearly consistent, and Corona S2 would also pass the X metric’s far more stringent threshold right off as a single protein. (Think about the cumulative fits metric for the proteins for a cell . . . )

In short one may use the Durston metric as a good measure of the target zone’s actual encoded information content, which Table 1 also conveniently reduces to bits per symbol so we can see how the redundancy affects the information used across the domains of life to achieve a given protein’s function; not just the raw capacity in storage unit bits [= no.  of  AA’s * 4.32 bits/AA on 20 possibilities, as the chain is not particularly constrained.]

12 –> I guess I should not leave off the simple, brute force X-metric that has been knocking around UD for years.

13 –> The idea is that we can judge information in or reducible to bits, as to whether it is or is not contingent and complex beyond 1,000 bits. If so, C = 1 (and if not C = 0). Similarly, functional specificity can be judged by seeing the effect of disturbing the information by random noise [where codes will be an “obvious” case, as will be key-lock fitting components in a Wicken wiring diagram functionally organised entity based on nodes, arcs and interfaces in a network], to see if we are on an “island of function.” If so, S = 1 (and if not, S = 0).

14 –> We then look at the number of bits used, B — more or less the number of basic yes/no questions needed to specify the configuration [or, to store the data], perhaps adjusted for coding symbol relative frequencies — and form a simple product, X:

X = C * S * B, in functionally specific bits . . . eqn n8.

15 –> This is of course a direct application of the per aspect explanatory filter, (cf. discussion of the rationale for the filter here in the context of Dembski’s “dispensed with” remark) and the value in bits for a large file is the familiar number we commonly see such as a Word Doc of 384 k bits. So, more or less the X-metric is actually quite commonly used with the files we toss around all the time. That also means that on billions of test examples, FSCI in functional bits beyond 1,000 as a threshold of complexity is an empirically reliable sign of intelligent design.
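A minimal sketch of the X-metric in eqn n8. The C and S judgments are supplied by the investigator on inspection; the 384 kbit figure is just the Word-document illustration above:

```python
def x_metric(C, S, B):
    """Eqn n8: X = C * S * B, in functionally specific bits.
    C = 1 if the item is contingent and complex beyond the threshold, else 0.
    S = 1 if perturbation by random noise shows it sits on an island of function, else 0.
    B = number of (frequency-adjusted) storage bits used."""
    return C * S * B

# A 384 kbit Word document: contingent, complex, and functionally specific
print(x_metric(1, 1, 384_000))   # 384,000 functionally specific bits, beyond the 1,000-bit threshold

# A 384 kbit stream of random noise: complex but not specific (S = 0)
print(x_metric(1, 0, 384_000))   # 0 -> no design inference on this metric
```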

______________

All of this adds up to a conclusion.

Namely, that there is excellent reason to see that:

i: CSI and FSCI are conceptually well defined (and are certainly not “meaningless”),

ii: trace to the work of leading OOL researchers in the 1970s,

iii: have credible metrics developed on these concepts by inter alia Dembski and Durston, Chiu, Abel and Trevors, metrics that are based on very familiar mathematics for information and related fields, and

iv: are in fact — though this is hotly denied and fought tooth and nail — quite reliable indicators of intelligent cause where we can do a direct cross-check.

In short, the set of challenges raised by MG over the past several weeks has collapsed. END

Comments
Talking point: ID is indeed creationism in a cheap tuxedo, promoted by right wing ideologues intending to impose a theocratic tyranny in their war against science.
--> Answered in the four linked points in the paragraph in the original post above, which is being cited as though the denial is actually a confession of guilt rather than a plea to stop a destructive and dangerous slander.
--> Let's clip: >> . . . . We were all busy trying to address the scientific origin of biological information, on the characteristic of complex functional specificity. We were not trying to impose a right wing theocratic tyranny nor to smuggle creationism in the back door of the schoolroom your honour." . . . . >>
kairosfocus
April 16, 2011 at 01:46 PM PDT
F/N 2: In the last linked, read esp pp 9 - 14, to see a discussion of the impact of starting on an island of function, and the implications of how we got there.
kairosfocus
April 16, 2011 at 01:17 PM PDT
F/N: Dissecting evolutionary algorithms, here. Specifically addressing Weasel, Avida, Ev and Tierra, here.
kairosfocus
April 16, 2011 at 12:40 PM PDT
Has anyone considered that there is an underlying logic behind the four questions? Is there a common thread behind them all, other than the "I just want to see CSI in action"? I will say that the first challenge appears to be a thinly veiled attempt to get an admission that information in the genome can increase through a 'simple' gene duplication event. IOW, it wasn't really about CSI at all. But then, I'm pretty cynical.
Mung
April 16, 2011 at 08:42 AM PDT
Just to keep a reminder going; above I have clipped and responded to MG's four questions in her guest post, linking that thread so onlookers can see that similar responses were developed at that time as well; just, they were ignored or brushed aside without serious response.
kairosfocus
April 16, 2011 at 07:18 AM PDT
OOPS: - 478, typo.
kairosfocus
April 16, 2011 at 03:04 AM PDT
14 --> You have provided an information increment value. Fed into the transformed Dembski Chi-metric:
Chi_tierra = 22 [BYTES!] - 500 = - 479 [oops]
Chi_Tierra = 22*8 - 500 = 176 - 500 = - 324
Again, irrelevant to the design inference.
4] The various Steiner Problem solutions from a programming challenge a few years ago have genomes that can easily be hundreds of bits. The specification for these genomes
15 --> A long time ago, someone once called a country "Greenland." Green-land is of course mostly White-land. Calling a bit string in an intelligently designed software entity a "genome" does not make it so.
16 --> Hundreds of bits produced by an intelligently designed search algorithm is immediately known to be a designed output. This can be seen from say this clip from the PT page MG links:
In the Genetic Algorithm itself, the DNA strings for a population of around 2000 random solutions are tested, mutated, and bred [left off: By the coded program running on a machine acting as proxy for its intelligent programmer] over a few hundred generations. Mutation is easily performed, and requires no knowledge of actual coordinates or connections [a generic warmer/colder is good enough]. If a mutation is to affect one of the first 26 digits of an organism’s DNA [oops: you even pick mutation sites], that digit will be replaced by a random digit from 0 to 9, ensuring the mutated DNA can still be “read.” [points to a set data structure that is intelligently designed and on an island of function] (The first two digits are limited to just 00-04.) [again, island of function constraints] Likewise, mutations in the 36-bit connections map part of the DNA string [island of function constraints] will be replaced by a random new bit (T or F) . . . As is common in Genetic Algorithms, some “elite” members of the population are usually retained as-is (i.e. asexual reproduction instead of sexual reproduction). [an algorithmic choice that ensures that things move towards the target faster than otherwise, perhaps decisively faster] All that remains is getting the whole thing started, and implementing a “Fitness Test” to see which organisms have a better chance of contributing to the next generation’s gene pool. The first generation of 2000 strings is generated randomly, under constraints that the first two digits can only be 00-04, the next 24 digits can be any digits 0-9, and the last 36 bits T or F. [notice the island of function data structure constraints specified by the intelligently designed algorithm; the "body plan" is fixed in advance] Each generation is tested for “fitness,” and the individuals with higher fitness are assigned a higher probability of making it into the next generation. While mating is biased toward organisms with better fitness, because it is a stochastic process, even low-fitness individuals can overcome the odds and breed once in a while.
17 --> In short, hill climbing, within the search resources of the Dembski threshold, starting inside an island of intelligently arrived at function.
19 --> A case of 62 bits is given [added: these calcs are illustrative of how the Chi-500 metric works based on identified cases of info given], which is of course Chi_tierra = 62 - 500 = - 438, well within the threshold where chance plus selection are credible sources.
[The specification for these genomes] is "Computes a close approximation to the shortest connected path between a set of points."
18 --> Computes is a give-away term, again: intelligently designed, within a designed-in island of function. Irrelevant to the real problem to be tackled. (ADDED: That is, even if we can make info in excess of threshold, the info is coming from an intelligent source, not blind chance and necessity; hence the significance of actual cases of random text generation, which peak out at 24 ASCII characters [ = 168 bits, very close to the results for Tierra] so far, per the tests reported by WIKI. This provides evidence that config spaces of 10^50 or so possibilities are a reasonable upper bound on the practical scope of search under realistic in-lab conditions on earth. Just as thermodynamicists had suggested decades ago. The Dembski type 500 bit threshold is best seen in light of Abel on universal plausibility as a solar system upper limit, and the 1,000 bits as an observed cosmos limit. These expanded limits give wiggle room for fairly large islands of function by shifting from a strictly probability base to one on the upper limit of the number of events at quantum level available to be used in searches. If a scope of possibilities significantly exceeds that number for the scope in which a search should be done, it is unreasonable that the needle in that big of a haystack would be expected to be found on a chance based random walk plus trial and error with locking-in of success. One's Blind Watchmaker, chance plus necessity theories of origins should not depend too much on statistical miracles.)
19 --> In addition, the program needs to answer to the challenge that the degree of complexity of the targets has to be so constrained that the search within the island of function does not find itself looking at truly isolated functions. That is, there is a search resources issue that is being ducked by using fairly simple toy scale cases. Real genomes start north of 100,000 bits, and go on up from there to the billions. >>
__________________
The first case overlooks the key issue of where do copies come from, which is anything but simple. If a copy originates and is seen, that pushes the pattern STRING + STRING COPY over the 500 bit threshold, where the copying is not credibly a coincidence but credibly an output of a mechanism that reads and replicates, that is itself an index of sophisticated organised complexity. The Chi-metric's implied conclusion of design is warranted. In short, this is actually a success for the Chi-metric.
We see that the other three cases are all computer simulation exercises, pre-loaded with designed algorithms that start on islands of function. The scope of search is also controlled so that it does not have to look for targets in spaces beyond search resource limits. But that is precisely the root problem being highlighted by the design inference analysis. So, we observe:
1: the part of the search that is in a vast config space: specifying the code for the algorithm, is intelligently designed. This puts you on an island of function to begin with.
2: Where a random search feature is incorporated, the scope of the implied secondary config space is restricted, so that the existing search resources are not overwhelmed, i.e. the lottery is designed to be winnable based on restricted scope.
The examples inadvertently demonstrate the point of the Chi-metric's threshold! Namely, to operate on the far side of the threshold, you need intelligence to narrow down the scope of trial and error based search. And so it is no surprise to see the results of such calculations as were possible:
a: gene duplication beyond the threshold [the duplicate needs to be over 500 bits], points to a complex machinery of duplication, and this can best be assigned to design.
b: ev -- a search within an island of function, with all sorts of built-in search enablers -- peaks out at under 300 bits, well within the threshold, as was seen by plugging in the number 300 into the transformed Chi-metric equation.
c: Tierra, again within a designed island of function, on the 62 bit case cited, on plugging in 62 bits and subtracting 500, is well within the threshold of the Chi-metric.
d: Steiner is again within an island of function, and the scope of intelligently controlled search for solutions is well within the search resources threshold.
In all cases, the transformed and simplified Chi-metric simplifies the calculation problem, as we only need to find the relevant information, in bits.
GEM of TKI
kairosfocus
April 16, 2011 at 03:03 AM PDT
MG's four "CSI challenge" scenarios, addressed: These come from the OP, On the Calc of CSI guest thread. Observe, first, how only one of the four is an actual genetic situation, and the other three have to do with the operation of designed computer programs that were artificially set up [after, doubtless, considerable troubleshooting] to be specifically functional. In one case, the structure of the digit string termed the "genome" was explicitly explained as very constrained, showing just how much these exercises are about moving around on islands of specific function, set up by known intelligent programmers. The case also shows how, where random search elements are introduced, the degree of scope is very carefully constrained in ways that do not overwhelm search resources, i.e. the search is within the 500 - 1,000 bit space threshold, directly or by cutting down qualified "random" variation. The large config space part of the challenge is tackled by intelligence, and randomness is held within search resources limits by that design. On this alone, the simulations are of micro evolution, not body plan level macro evolution. And the remark that Schneider's ev is based on investigation of evo in observed organisms underscores this: observed evo cases are all micro. Macro evolution at body plan level is an extrapolation and inference, not an observation. But it is the latter that is under question; the key question is being (unconsciously) begged. I shall interleave comments:
__________________
>> 1] A simple gene duplication, without subsequent modification,
1 --> As I said originally, there is nothing simple about this act. The additionality of the function of copying implies a facility capable of that, which is a complex and specific function in itself. If the difference in bit-count is sufficient that it is not credible that this duplication has somehow occurred by chance, the implication of a mechanism is consistent with the known root of such consistency. The inference to design in that context is inductively valid on the empirical pattern of cause of such.
2 --> The copy without modification is of course no increment to actual information, or function; as is pointed out above and as has been pointed out previously when this was raised. No explicit calculation is needed. The issue of FSCI is not how many copies of the information exist but how the specific functional organisation of data elements common to the copies first occurred; i.e. its origin and cause.
that increases production of a particular protein from less than X to greater than X.
3 --> This is about increase in EXPRESSION of a gene's effect, not increase in functional information.
The specification of this scenario is "Produces at least X amount of protein Y."
4 --> After the fact target painting; [Added, 04:22:] cf. PAV's remarks at 50 - 51 below for more. In effect the EVENT that lands one in the target zone is the critical issue, and it is such that the chance hyp for this per our knowledge of copying yields a value of novel info of zero; but of course this points to the issue of a copying mechanism, so one may infer that if G_copy is > 500 bits, then the existence of two copies entails a complex copying mechanism, which will all but certainly be both irreducibly complex to function and beyond the 500-bit threshold itself, so the observed result, of a duplication of an original, implies design on FSCO/I as a reliable sign.
2] Tom Schneider's ev evolves genomes
5 --> Question-begging description of the action of ev.
Ev --[Added 04:22:] as Mung shows in 119 & 126 (cf also 129) below -- is an intelligently designed program that starts out at an island of function in a config space, and is designed to produce the "genomes" that result. Dembski et al aptly remark, on dissecting ev:
The success of ev is largely due to active information introduced by the Hamming oracle and from the perceptron structure. It is not due to the evolutionary algorithm used to perform the search. Indeed, other algorithms are shown to mine active information more efficiently from the knowledge sources provided by ev. Schneider claims that ev demonstrates that naturally occurring genetic systems gain information by evolutionary processes and that "information gain can occur by punctuated equilibrium". Our results show that, contrary to these claims, ev does not demonstrate "that biological information...can rapidly appear in genetic control systems subjected to replication, mutation, and selection". We show this by demonstrating that there are at least five sources of active information [intelligently injected info that increases performancs of a search type program above the random trial and error case] in ev. 1. The perceptron structure. The perceptron structure is predisposed to generating strings of ones sprinkled by zeros or strings of zeros sprinkled by ones. Since the binding site target is mostly zeros with a few ones, there is a greater predisposition to generate the target than if it were, for example, a set of ones and zeros produced by the flipping of a fair coin. 2. The Hamming Oracle. When some offspring are correctly announced as more fit than others, external knowledge is being applied to the search and active information is introduced. As with the child's game, we are being told with respect to the solution whether we are getting "colder" or "warmer". 3. Repeated Queries. Two queries contain more information than one. Repeated queries can contribute active information. 4. Optimization by Mutation. This process discards mutations with low fitness and propagates those with high fitness. When the mutation rate is small, this process resembles a simple Markov birth process that converges to the target. 5. Degree of Mutation. As seen in Figure 3, the degree of mutation for ev must be tuned to a band of workable values. (George Montañez, Winston Ewert, William A. Dembski, and Robert J. Marks II, "A Vivisection of the ev Computer Organism: Identifying Sources of Active Information," Bio-Complexity, Vol. 2010(3) (internal citations removed).)
using only simplified forms of known, observed evolutionary mechanisms, that meet the specification of "A nucleotide that binds to exactly N sites within the genome."
6 --> Cf. cite just above.
The length of the genome required to meet this specification can be quite long, depending on the value of N. (ev is particularly interesting because it is based directly on Schneider's PhD work with real biological organisms.)
7 --> PAV also pointed out -- cf. 50 - 51 below -- that the level of resulting info "produced" by ev is within the threshold for CSI; i.e. it is within what is plausible for chance driven change to do, not beyond the search resources threshold:
Less than 300 bits of functional information appear in the output (IIRC, in a previous thread PAV put 266; below he argues for much less); no particular calculation is needed to show that -- as was already shown -- 300 bits [to be fed into the transformed Chi-metric] is within a 500 bit threshold: Chi_ev = 300 - 500 = -200, within the reach of chance and necessity. Irrelevant to the inference to design.
8 --> And, we must not forget, this all starts inside an island of function arrived at by intelligent design. (Cf Mung 126 below) That is, at best this models micro-evolution, not the relevant body plan level macro evolution.
9 --> Such is actually implicit in the remark: "it is based directly on Schneider's PhD work with real biological organisms." Indeed, where WE HAVE NEVER OBSERVED BODY PLAN LEVEL MACROEVOLUTION, ONLY SMALL VARIATION AKA MICROEVOLUTION.
11 --> Ev transforms underlying functional information, according to programs, and achieves generic as opposed to specific pre-programmed targets [the perceptron pushes towards sparse 0 outcomes -- BTW this filters out most of the config space -- and the warmer/colder performance filter selects for those that are closer to the generic target].
12 --> Subtler than WEASEL, but in the same general class. Information is being processed and expressed by intelligently designed software, not appearing as a free lunch from non-foresighted phenomena such as chance variation and pure trial and error picking for any function.
3] Tom Ray's Tierra routinely results in digital organisms with a number of specifications.
13 --> Again, a question-begging description, here as "organisms." These are no more organisms than the SIMS characters; they are programmed virtual actors on a virtual stage, based on the output of an intelligently designed algorithm, inputted data structures and contents, and controlled execution on a machine that was equally intelligently designed.
One I find interesting is "Acts as a parasite on other digital organisms in the simulation." The length of the shortest parasite is at least 22 bytes, [added 13a --> Notice, the 22 byte figure comes from MG] but takes thousands of generations to evolve. [ . . . ]
kairosfocus
April 16, 2011 at 03:03 AM PDT
Hi kairosfocus, I'm not sure whether this paper has been mentioned:
Complex emergent systems of many interacting components, including complex biological systems, have the potential to perform quantifiable functions. Accordingly, we define "functional information," I(Ex), as a measure of system complexity. For a given system and function, x (e.g., a folded RNA sequence that binds to GTP), and degree of function, Ex (e.g., the RNA–GTP binding energy), I(Ex) = −log2[F(Ex)], where F(Ex) is the fraction of all possible configurations of the system that possess a degree of function ≥ Ex. Functional information, which we illustrate with letter sequences, artificial life, and biopolymers, thus represents the probability that an arbitrary configuration of a system will achieve a specific function to a specified degree. In each case we observe evidence for several distinct solutions with different maximum degrees of function, features that lead to steps in plots of information versus degree of function.
http://www.pnas.org/content/104/suppl.1/8574.full
Mung
April 15, 2011 at 06:59 PM PDT
I've posted the original challenges:
1. here
2. here
3. here
4. and here
Mung
April 15, 2011 at 05:47 PM PDT
PS: If a copy of an earlier informational entity is made, and then the copy is transformed into something that does a different function, the increment in CSI for the changes may be relevant, and the process of transformation may also be VERY relevant. For instance, Venter and other genetic engineers.
kairosfocus
April 15, 2011 at 07:00 AM PDT
MG attempted rebuttal: 1: you still can’t provide a rigorous mathematical definition of CSI As by now you know or should know, CSI is not a primarily mathematical concept or construct, like say a Riemann integral or a limit. It takes reality from being an observed characteristic of systems of interest, being described. As in: science seeks to describe, explain, predict then control or at least influence on the realm of observable phenomena, objects, processes etc. The definition of CSI is actually in its name, after all the term is a description: "complex, specified information," as is seen in text in English, or in computer programs, or in DNA's code, and indeed in the complex functional organisation of the cell. Such CSI -- and its functionally specified subset FSCI [note Dembski in NFL as cited in the OP] -- as has been shown, is capable of being modelled mathematically, and it is reducible to metrics in various ways, as can be seen above for Dembski and Durston. If these do not satisfy you as ADEQUATE mathematical models, that relate to observable realities and produce useful results, then that is because you are playing the self-refuting game of selective hyperskepticism, in Sagan's evidentialist form. Here is the key epistemological blunder, corrected:
extraordinary [to me] claims require extraordinary [ADEQUATE] evidence
2: you still can't show examples of how to calculate it for the four scenarios I described in my guest post
Your guest post thread had in it significant responses that provided relevant calculations, as well as relevant corrections, and counter challenges to you to address same. Observe in particular Dr Torley's responses, and onward I have reduced the matters as above. Once we can produce a reasonable estimate of the information content of an entity, we can derive a Dembski style bits beyond a threshold value. In the case of the example that strikes my mind just now, duplicated genes, the information is just that, a duplicate; its incremental information content is ZERO, as was repeatedly pointed out, but brushed aside. The copy itself does not say any more about the information's origin than was already in the original. However, as I pointed out in that thread, a copy does not come from nowhere, so if there is a process that is able to replicate info like that, it is functional, and the presence of a copy like that shows that we have a further function; and so if the copy and the original are both present in a system, that points to an increment of FSCI, not in the copy but in the system capable of copying -- indeed maybe a huge increment given the implicated cellular machinery. That too was studiously ignored.
Similarly, PAV has pointed out that for ev [apparently best of breed], the increment in info is well within 300 bits, i.e. it is within the range where the CSI threshold does not apply. Worse, as I repeatedly pointed out, ev and similar programs all start within an island of function that was arrived at intelligently, and have in them co-tuning of matched parts that leads to expression of existing FSCI. This too was ignored. I forget for the moment the other cases, do remind me.
However, the track record from my viewpoint is that you ducked cogent answers again and again, as you have also ducked on the conceptual links between Dembski and Orgel-Wicken. The cite from NFL -- which you claim to have read -- at the top of the thread exposes the hollowness of the assertion that the CSI concept is "meaningless," or cannot be mathematically coherently expressed, or that what Orgel et al meant and what Dembski means are materially in conflict. Why not clip your four specific challenges that you claim have not been addressed, and let us see if there is not an adequate answer in UD? In particular, what is your answer to my reduction of the Dembski metric on a basis of its literal statement, to an information value in bits beyond a threshold, and the use of this with Durston's Fits as a reasonable estimate of the information content of relevant target zones?
3: you admit that Durston's metric will not generate the same values as Dembski's, therefore you claim victory.
This is an outright strawman, set up in total caricature of the real issue, and pummelled. Durston is addressing in effect one term in the Dembski metric as I reduced it to information. Recall, for convenience, from the OP here and previous threads where you conspicuously did not address this:
2 –> Following Hartley, we can define Information on a probability metric: I = – log(p) 3 –> So, we can re-present the Chi-metric: Chi = – log2(2^398 * D2 * p) Chi = Ip – (398 + K2) 4 –> That is, the Dembski CSI Chi-metric is a measure of Information for samples from a target zone T on the presumption of a chance-dominated process, beyond a threshold of at least 398 bits, covering 10^120 possibilities. 5 –> Where also, K2 is a further increment to the threshold that naturally peaks at about 100 further bits. (In short VJT’s CSI-lite is an extension and simplification of the Chi-metric.) 6 –> So, the idea of the Dembski metric in the end — debates about peculiarities in derivation notwithstanding — is that if the Hartley-Shannon- derived information measure for items from a hot or target zone in a field of possibilities is beyond 398 – 500 or so bits, it is so deeply isolated that a chance dominated process is maximally unlikely to find it, but of course intelligent agents routinely produce information beyond such a threshold. 7 –> In addition, the only observed cause of information beyond such a threshold is the now proverbial intelligent semiotic agents. 8 –> Even at 398 bits that makes sense as the total number of Planck-time quantum states for the atoms of the solar system [most of which are in the Sun] since its formation does not exceed ~ 10^102, as Abel showed in his 2009 Universal Plausibility Metric paper. The search resources in our solar system just are not there.
I have now updated an earlier comment, taking the 500 bit threshold and deducing specific Dembski Chi-values using Durston Ip values in fits. Here are my results, on Durston's Table 1:
Using Durston’s Fits from his Table 1, in the Dembski style metric of bits beyond the threshold, and simply setting the threshold at 500 bits: RecA: 242 AA, 832 fits, Chi: 332 bits beyond SecY: 342 AA, 688 fits, Chi: 188 bits beyond Corona S2 445 AA, 1285 fits, Chi: 785 bits beyond. –> The two metrics are clearly consistent, and Corona S2 would also pass the X metric’s far more stringent threshold right off as a single protein. (Think about the cumulative fits metric for the proteins for a cell . . . ) –> In short I am here using the Durston metric as a good measure of the target zone’s information content, which Table 1 also conveniently reduces to bits per symbol so we can see how the redundancy affects the information used across the domains of life to achieve a given protein’s function; not just the capacity in storage unit bits [= no AA's * 2]
That looks quite coherent, useful and relevant to me.
GEM of TKI
kairosfocus
April 15, 2011 at 06:53 AM PDT
Mathgrrl (#11) You write to Kairosfocus: "you still can't provide a rigorous mathematical definition of CSI." Surely you jest, Mathgrrl. You have already been given a few rigorous mathematical definitions of CSI. If these do not satisfy you, I'm convinced that nothing will. I might add that I have yet to see you perform any rigorous mathematical calculations on this blog, despite several invitations to do so. When one is engaging in open dialogue with an intellectual opponent, one should always be prepared to make concessions. I made several in my post on CSI-lite. I have yet to see any concessions on your part. You've been given an opportunity to voice your opinions on this blog. It would be a pity if it were wasted. Think about it, please.
vjtorley
April 15, 2011 at 06:34 AM PDT
MathGrrl, Your 4 examples are bogus for the reasons already provided. Also you have been provided with a definition of CSI that has rigorous mathematical components. So stop blaming us for your obvious inadequacies. Ya see, that is why there are posts dedicated to you. Now grow up...
Joseph
April 15, 2011 at 06:03 AM PDT
kairosfocus, So, to summarize, you still can't provide a rigorous mathematical definition of CSI, you still can't show examples of how to calculate it for the four scenarios I described in my guest post, and you admit that Durston's metric will not generate the same values as Dembski's, therefore you claim victory. Did I miss anything?
MathGrrl
April 15, 2011 at 05:48 AM PDT
MG attempted rebuttal: If Durston's metric is actually the same as Dembski's (or a "subset" thereof, whatever that means), you should be able to demonstrate that mathematically. Either compare the two and demonstrate a transformation from one to the other or show how applying Durston's metric and Dembski's metric to the same systems results in the same answer, consistently.
This is a strawman misrepresentation of what was said, and it is very handy for pummelling. I have not argued that Durston and Dembski will create the same values, as the metrics are distinct. Durston does not give a specific threshold, but argues that the metric indicates difficulty of finding a given island of function in a space of possibilities, i.e., it is a metric of complex specified info, where specificity is on function and the scope of the island is on the observed variation of AA sequences for protein families. Info is based on an estimate of the increase in H on moving from a ground to a functional state. This is conceptually consistent -- something that on Orgel you plainly have a challenge with -- with Dembski, and the results will give generally consistent results once thresholds of improbability are applied, such as 398 - 500 bits or the like. The point, as the cite from Durston that you cleverly duck (cf. OP above, and cites in the other thread too, also ducked) shows, is that:
Consider that there are usually only 20 different amino acids possible per site for proteins, Eqn. (6) can be used to calculate a maximum Fit value/protein amino acid site of 4.32 Fits/site [NB: Log2 (20) = 4.32]. We use the formula log (20) – H(Xf) to calculate the functional information at a site specified by the variable Xf such that Xf corresponds to the aligned amino acids of each sequence with the same molecular function f. The measured FSC for the whole protein is then calculated as the summation of that for all aligned sites. The number of Fits quantifies the degree of algorithmic challenge, in terms of probability [info and probability are closely related], in achieving needed metabolic function. For example, if we find that the Ribosomal S12 protein family has a Fit value of 379, we can use the equations presented thus far to predict that there are about 10^49 different 121-residue sequences that could fall into the Ribsomal S12 family of proteins, resulting in an evolutionary search target of approximately 10^-106 percent of 121-residue sequence space. In general, the higher the Fit value, the more functional information is required to encode the particular function in order to find it in sequence space. A high Fit value for individual sites within a protein indicates sites that require a high degree of functional information. High Fit values may also point to the key structural or binding sites within the overall 3-D structure.
Note the telling highlights. Kindly cf. the OP above, point 10. So, the whole focus of the rebuttal point is misdirected and conceptually erroneous. Durston is measuring functionally specific complex information, which is -- as has been pointed out in the UD WACs 27 - 28 for years -- a subset of CSI. Indeed, the cites in the OP from Dembski's NFL that you claim to have read make this crystal clear. And they also show that biological origins and biological function are contexts for CSI and ways to cash out specification, respectively. Perhaps you should at least use the Google books link and read NFL ch 3's excerpts?
GEM of TKI
PS: Using Durston's Fits from his Table 1, in the Dembski style metric of bits beyond the threshold, and simply setting the threshold at 500 bits:
RecA: 242 AA, 832 fits, Chi: 332 bits beyond
SecY: 342 AA, 688 fits, Chi: 188 bits beyond
Corona S2: 445 AA, 1285 fits, Chi: 785 bits beyond.
--> The two metrics are clearly consistent, and Corona S2 would also pass the X metric's far more stringent threshold right off as a single protein. (Think about the cumulative fits metric for the proteins for a cell . . . )
--> In short, I am here using the Durston metric as a good measure of the target zone's information content, which Table 1 also conveniently reduces to bits per symbol so we can see how the redundancy affects the information used across the domains of life to achieve a given protein's function; not just the capacity in storage unit bits [= no. of AA's * 2 -- rather, 4.32 (oops, I obviously had DNA bases in mind)]
kairosfocus
April 15, 2011 at 05:28 AM PDT
Talking point: islands of function don't exist [for biological systems] . . .
Answered in this thread, no 4, point 2 above.
kairosfocus
April 15, 2011 at 05:09 AM PDT
Talking point: But the fitness landscape is varying with time and your argument does not address that, it is useless . . .
Answered in the same two challenges thread here, as well as in 2 - 3 above. This one was actually anticipated by Dembski in NFL, ch 4, where he pointed out that fitness landscapes will most likely be dynamic across time. (One of the ways these talking points work is that they are based on our likely ignorance of the prior moves and discussions and the history of the debates over ID, much less our failure to read key classics. Inexcusable in a world where a lot of key books are excerpted on Google.)
kairosfocus
April 15, 2011 at 05:03 AM PDT
Talking point: You're just making a tornado in a junkyard [builds a jumbo jet by chance] argument . . .
Answered in the two challenges thread here, as well as above.
kairosfocus
April 15, 2011 at 04:58 AM PDT
Talking point: The multiverse swamps out calculations of odds etc. and gives enough resources, and there could be 10^150 evolutionary paths
Eugene S gives an answer in the two challenges thread here. Nullasalus gives another helpful answer here in the same thread, that focuses on numbers of evolutionary paths.
kairosfocus
April 15, 2011 at 04:56 AM PDT
Talking point: the dog-wolf kind as an example of evolution in action.
Answers in the MG smart thread here. Joseph has a great zinger here.
kairosfocus
April 15, 2011 at 04:52 AM PDT
Onlookers: Monitoring and rebutting the latest waves of talking points spreading out in swarms from the evo mat fever swamps . . .
1: But maybe evolution went down 10^150 paths so probability calculations are useless . . .
This is empty metaphysics, as Null pointed out. (Are we talking empirically supported science or unbridled metaphysical speculation where one may build castles in the air to suit one's materialistic fancy?) We have observational evidence -- a key criterion for science -- of precisely one cosmos. (And a multiverse has in it a sub-cosmos bakery which is fine tuned to produce a life habitable sub-cosmos, leading onwards to design anyway. Cf. the discussion of cosmological fine tuning, including multiverses, here.) If one changes the subject to philosophical speculation like this, then ALL live option alternatives must be on the table, and ALL evidence supporting them is on the table too.
2: Islands of function don't exist, life is on a smoothly varying tree with neat fine gradations from the LUCA to us.
Where is the evidence for this? As I pointed out in comment 3, points 24 ff above, the evidence as Gould and Loennig highlight is that life is in distinct islands of function, due to the irreducible complexity of complex functions and the difficulty of searching out the resulting islands of function. In particular, the key workhorse molecules in the cell, proteins, come in very distinct and deeply isolated fold domains, and have to do a key-lock fit to work, especially as enzymes or in agglomerated cluster proteins. (Think: sickle cell anaemia, haemoglobin mutations, and prions giving rise to mad cow disease off misfolding, here, to see what misfolding does.) Then, think about the Cambrian life revolution, which precisely shows the sudden appearance, with no clear ancestry, of the major animal life body plans. The tree of life icon is one of the most fundamentally misleading icons in the evo mat gallery of miseducation.
4: "xxx-bit calculations are only relevant for tornado in the junkyard scenarios."
This -- as has repeatedly been pointed out but willfully, even stubbornly closed-mindedly ignored in the agenda to spread persuasive but flawed rebuttal points -- is itself a strawmannising of Sir Fred Hoyle's point. The principle is that complexity swamps out trial and error on chance variation, very quickly. All the way back to Cicero -- the quote at the top of my always linked note -- we see:
Is it possible for any man to behold these things, and yet imagine that certain solid and individual bodies move by their natural force and gravitation, and that a world so beautifully adorned was made by their fortuitous concourse? He who believes this may as well believe that if a great quantity of the one-and-twenty letters, composed either of gold or any other matter, were thrown upon the ground, they would fall into such order as legibly to form the Annals of Ennius. I doubt whether fortune could make a single verse of them. How, therefore, can these people assert that the world was made by the fortuitous concourse of atoms, which have no color, no quality—which the Greeks call [poiotes], no sense? [Cicero, THE NATURE OF THE GODS BK II Ch XXXVII, C1 BC, as trans Yonge (Harper & Bros., 1877), pp. 289 - 90.]
To get to Hoyle's jumbo jet by a tornado in a junkyard is a probabilistic miracle: scanning multi-gigabit config spaces by lucky noise. But so is getting to a "simple" functional gauge on its dashboard by the same means, probably a kilobit-level exercise. Or, for that matter, producing one full length 143 ASCII character tweet by successful chance. Or, one 200 AA typical protein from the hundreds used in even the simplest cells.

Remember, you don't get to successful reproducing cell based life on the only observed architecture until you have a metabolic entity joined to a von Neumann self replicator. Where, this last means you have to have symbolic code, algorithms, effector arms and other execution machines, all together and properly organised. An irreducibly complex system. Without this, natural selection on differential reproductive success cannot even exist for cell based life. On evidence this needs 100+ k bits of DNA stored information. The config space for that, at the lower end, is ~ 2^100,000 ~ 10^30,000 possibilities. Hopelessly beyond our observed -- talking point 1 above shows why I consistently use this word -- cosmos' search capacity. (A short sketch of this config space arithmetic is appended at the end of this comment.)

Going on, to get to a new body plan, you have to generate mutations that change development paths in co-ordinated ways, and that are expressed very early in embryological development, as body plans form quite early. But, chance changes at this stage are far more likely to be fatal in utero or the equivalent. The islands of function Indium and others would wish away just re-appeared, on steroids.

5: "you assume that evolution *had* to proceed along a *certain* path, for example from ape-like creatures to humans. Of course you can calculate some amazingly low probability values then. But evolution could have went along 10^150 different paths"

Excuse me, but isn't it you evolutionary materialists who are advocating that we descend from apes, and who trot out the 98% similarity of our proteins to chimp proteins, at least as you count the similarity? Well, you then need to account for the origin of the human body plan, including the origin of the physical capacity to use verbal language, itself crucial to the explanation of the origin of the credible mind, another -- and self referentially fatal -- gap. Brain, voice box, co-ordination with hearing and more. Credibly millions of base pairs worth. And, the solar system's quantum state resources are not 10^150 possibilities, they are more like 10^102. And, by far most of the atoms in question are in the sun. You don't begin to have the required time and reproductive capacity to generate populations to compete and allow mutations to get fixed. And since deleterious mutations that are marginal in effect and cumulative dominate, you are looking at genetic decay and deterioration on aggregate, not the creation of something so complex as language ability.

More broadly, you are twisting the actual points on the challenge of navigating to islands of function. There is absolutely no discussion of specific paths involved. All random walks from arbitrary initial points are allowed, but these have got to bridge the sea of non-functioning configs to get to the shores of islands of co-ordinated complex function without exhausting the available search resources. 1,000 bits, my preferred threshold, swamps not only solar system or galactic level resources, but turns the scope of the cosmos into a 1 in 10^150 fraction of the set of possibilities. Not for 10 Mbits or even 100 k bits worth of functional info, but for just 1,000 bits.
If you have a significant controller to program, you can't sneeze in 1,000 bits, i.e. just 125 bytes! As a matter of fact, the other day I set up a blank Word 97 doc and took a look at the internal file: 155 k bits, with no serious content. Loads and loads of apparently repetitive y-double dots and other seemingly meaningless repeats. But, I bet these are all quite functional. The living cell is capable of doing a vNSR-metabolic entity in roughly the same space! The desperation that led to this strawman, with an even worse suggestion on apes and men, is telling.

6: all your calculations are useless! This is of course quite a step back from "they don't exist" and/or "are about something that is meaningless." THE LINE IS BENDING BACK IN RETREAT AND IS ABOUT TO BREAK. The blunders in point 5 just above show that, on the contrary, the calculations are useful indeed.

_____________

Onlookers, I suspect the tactic here is to try to bury this thread in obscurity, raising challenges elsewhere. (Hundreds of looks but one objecting comment, and that studiously vague, sounds just a bit suspicious to me.) So, I announce a policy: in the onward addressing of the MG challenge, I will answer the fever swamp talking points here, and then link from wherever to this thread, so that this thread will become the reference on the fever swamp talking points.

So, the adventure continues as the CSI gang appeals the kangaroo court decision from behind the bars of HM's cross-bar hotel. (Hey, the food ain't half bad . . . and the company and conversation are first class . . . ) GEM of TKI
kairosfocus
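As referenced above, here is a minimal back-of-envelope sketch, in Python, of the config-space arithmetic. The round numbers used (~10^150 Planck-time quantum states for the observed cosmos, ~10^102 for the solar system since the big bang) are the figures quoted in this thread, taken as assumptions purely for illustration:

```python
# Back-of-envelope comparison of configuration-space sizes vs. search resources.
# The resource figures below are the round numbers used in the discussion above
# (assumptions, not precise physical constants): orders of magnitude only.

import math

def log10_configs(bits):
    """log10 of the number of configurations for a string of `bits` bits (2^bits)."""
    return bits * math.log10(2)

LOG10_COSMOS_STATES = 150        # ~10^150 Planck-time quantum states, observed cosmos
LOG10_SOLAR_SYSTEM_STATES = 102  # ~10^102 for our solar system since the big bang

for bits in (500, 1_000, 100_000):
    exp = log10_configs(bits)
    print(f"{bits:>7} bits -> ~10^{exp:,.0f} configurations "
          f"(cosmos ~10^{LOG10_COSMOS_STATES}, solar system ~10^{LOG10_SOLAR_SYSTEM_STATES})")
```

On these round figures, 500 bits already matches the cosmos-level state count, and 100 k bits of functional information exceeds it by tens of thousands of orders of magnitude.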
April 15, 2011, 04:33 AM PDT
18: In this context, gene duplication events -- i.e. duplication, not de novo creation, of existing information, based on a failure or mutation of a regulatory network -- are already within an island of function, and are going to be within the micro-evolutionary threshold.

19: For such a duplication event to have evolutionary significance, it has to fit in with a regulatory network, and possibly vary to form up a new functional protein or something like that.

20: This runs right back into the search space challenges as pointed out ever since, and will not rise to the level of creation of novel body plans; on the observable evidence, first life needs 100 k+ bits of de novo FSCI, and novel multicellular body plans need 10 mn+ bits.

21: This, to get to the shores of islands of function BEFORE hill-climbing in a dynamic landscape can even begin.

22: Gene duplication, gene exchange and the like can explain microevolutionary adaptations -- and notice, the ability to transfer genes between organisms has to be a complex, functional, organised capacity itself -- but they do not explain novel body plans, starting with the first one.

23: Similarly, point mutations and splicing in of strings of bases can explain microevo but will not rise to the level of body plan macroevolution, for want of search resources.

24: Fundamentally, we must reckon with the overwhelming record of the fossils, as Loennig points out -- and the only really clean explanation out there is that this record reflects tightly integrated, irreducibly and functionally complex organisation required for life and for successful fit to environmental niches.

25: So, as Loennig published in the literature in 2004, in his Dynamic Genomes paper (I cite from my always linked, which has been one click away for every comment post I have ever made at UD):
the basic genetical questions should be addressed in the face of all the dynamic features of ever reshuffling and rearranging, shifting genomes, (a) why are these characters stable at all and (b) how is it possible to derive stable features from any given plant or animal species by mutations in their genomes? . . . . granted that there are indeed many systems and/or correlated subsystems in biology, which have to be classified as irreducibly complex and that such systems are essentially involved in the formation of morphological characters of organisms, this would explain both, the regular abrupt appearance of new forms in the fossil record as well as their constancy over enormous periods of time. For, if "several well-matched, interacting parts that contribute to the basic function" are necessary for biochemical and/or anatomical systems to exist as functioning systems at all (because "the removal of any one of the parts causes the system to effectively cease functioning") such systems have to (1) originate in a non-gradual manner and (2) must remain constant as long as they are reproduced and exist. And this could mean no less than the enormous time periods mentioned for all the living fossils hinted at above. Moreover, an additional phenomenon would also be explained: (3) the equally abrupt disappearance of so many life forms in earth history . . . The reason why irreducibly complex systems would also behave in accord with point (3) is also nearly self-evident: if environmental conditions deteriorate so much for certain life forms (defined and specified by systems and/or subsystems of irreducible complexity), so that their very existence be in question, they could only adapt by integrating further correspondingly specified and useful parts into their overall organization, which prima facie could be an improbable process -- or perish . . . . According to Behe and several other authors [5-7, 21-23, 53-60, 68, 86] the only adequate hypothesis so far known for the origin of irreducibly complex systems is intelligent design (ID) . . .
26: In short, the irreducibly complex, functionally specific organisation and associated information in cell based life fits well with the only actual record we have of life from the remote past of origins, the fossil record. Namely: sudden appearance, stasis, disappearance and/or continuity into the modern era. Gould documents this:
. . . long term stasis following geologically abrupt origin of most fossil morphospecies, has always been recognized by professional paleontologists. [The Structure of Evolutionary Theory (2002), p. 752.] . . . . The great majority of species do not show any appreciable evolutionary change at all. These species appear in the section [[first occurrence] without obvious ancestors in the underlying beds, are stable once established and disappear higher up without leaving any descendants." [p. 753.] . . . . proclamations for the supposed ‘truth’ of gradualism - asserted against every working paleontologist’s knowledge of its rarity - emerged largely from such a restriction of attention to exceedingly rare cases under the false belief that they alone provided a record of evolution at all! The falsification of most ‘textbook classics’ upon restudy only accentuates the fallacy of the ‘case study’ method and its root in prior expectation rather than objective reading of the fossil record. [p. 773.]
27: In short, the two main design theory explanatory constructs --
(i) CSI/FSCI and resulting search space challenges, and (ii) irreducible complexity of major multipart functional systems and networks based on the need for mutual fitting of parts and provision of all required parts for function --
. . . provide a direct and cogent explanation for the predominant features of the real fossil record, as opposed to the one highlighted in headlines, museum displays and textbooks.

28: A careful examination will show that there is not one empirically well warranted case where body-plan-creating macroevolution has been documented as observed fact, and/or where a mechanism that would credibly cause such body plan innovation [beyond the search threshold challenge] has been empirically demonstrated and observed.

29: With one possible exception: DESIGN, by intelligent designers. Venter's recent results, and the field of genetic engineering, are highly illuminating as to what design is already known to be capable of, and what it promises to be capable of; indeed there are already discussions on novel body plan beasts.

_________________

I trust this provides at least a context for answering the various questions you and others may have. GEM of TKI
kairosfocus
April 15, 2011, 01:04 AM PDT
Graham: First, observe that for several days I have been bringing this set of considerations to MG's attention, and she has consistently ducked out, starting with the analysis of what Dembski's metric boils down to and how it works on the ground, in a manner that is reasonable. That evasiveness, to my mind, is revealing. I also notice that you just vaguely allude to the questions MG has asked. She has asked many, many loaded questions in recent weeks, so which of them are you specifically interested in? [NB: Onlookers, there is no end of possible loaded questions and further questions on such questions -- just ask the shade of Torquemada -- so I have taken a different approach, undermining and blowing up the foundation for MG's claims.]

Now, my point in the above post is to show how MG's entire underlying case collapses (and her main questions have long since been answered elsewhere -- including in NFL, which I doubt she has seriously and carefully read; cf the Google online link provided above).

1: CSI -- from the horse's mouth [note the NFL cite] -- is meaningful, and conceptually defined by Orgel and applied by him to life vs distinct things that are not functionally organised but ordered or random, using a concept familiar from the world of technological design.

2: Even before it is quantified, it is meaningful.

3: The usage by Orgel and that by Dembski are conceptually consistent, not contradictory, as was alleged or implied.

4: It is measurable, on several metrics, tracing to Dembski (and as modified by others, including say VJT), to Durston et al [who, recall, published values of FSC for 35 protein families], and the simple brute force X-metric.

5: The Durston and Dembski metrics are consistent and closely related, though they address the measurement challenge in somewhat different ways.

6: Both end up providing a context for examining how, once something is specific and sufficiently complex, it becomes an algorithmic challenge to find it by, in effect, chance-dominated processes.

7: The Dembski metric boils down [regardless of debates on how it got there] to finding a probability-based negative-log information value and comparing it to a threshold of 398 to about 500 bits, as an index of being sufficiently complex (while being specific to a "small" target zone) to be too isolated to be found by chance-dominated processes. (A short numeric sketch of this reduction is appended after this comment.)

8: The simple, easily understood X-metric also does much the same, though it uses a far harder threshold, one that would overwhelm not the Planck time quantum state resources of our solar system but those of the cosmos as a whole. (NB: In the time for the fastest nuclear reactions, there would be ~ 10^20 Planck time quantum states.)

9: At this point, the foundation for what MG was trying to put forward is gone. The edifice naturally collapses of its own weight.

10: As Dembski pointed out in NFL [cf ch 4, which is far broader than a narrowish view of Weasel based on the evidence of the published results, i.e. apparently latched runs] -- this is part of why I think she has never seriously read it -- and also ever since, the basic problem with evolutionary search algorithms is that they are specific to, and begin their search process in, an identified island of function with a rising fitness landscape (which may vary across time, just as the topology of a real island does).

11: By the time that has been done by the programmers who write the software, the main job has been done, and done by intelligent design.
12: For, the underlying no free lunch principle points out that searches are tuned to fitness landscapes, or at least clusters of landscapes [i.e. as the islands of function vary across time], and the number of fitness functions that may be fitted onto a given config space is exponential in the scope of that space, or actually unlimited. So, the search for a tuned search is exponentially more difficult than the direct search for the islands and then peaks of function.

13: So, a successful evolutionary search algorithm is strongly dependent -- as I pointed out -- on being started from within an island of function [the working algorithm, its data structures, coding and underlying machine], and is thereby preloaded with functionally specific, complex information.

14: It may take a grand tour of such an island of function and produce quite a spectacular show, but -- like any good exercise in prestidigitation -- the effectiveness of the show depends on our willing suspension of knowledge of the curtains, the strings going up above the stage, and the trap doors under the stage.

15: PAV has also pointed out that ev, the best of the breed per Schneider, peaks out at less than 300 bits of search, on a computer -- which is far faster and more effective at searching than real world generations on the ground would be; i.e. well within the thresholds that the various metrics acknowledge as reachable by chance dominated search.

16: Evolutionary search algorithms, in short, may well explain microevolution, but that such is possible and is empirically supported is accepted by all, including young earth creationists, who see it as a designed means of adapting kinds of creatures to environments (and for the benefit of God's steward on earth, man; e.g. the dog-wolf kind).

17: Boiled down, the algorithms are based on built-in FSCO/I, and do not de novo create novel functional information beyond the FSCI/CSI threshold, which makes sense once you see that the search capacity of the solar system is less than 398 bits worth of states. [ . . . ]
kairosfocus
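As referenced in point 7, a minimal illustrative sketch of that log reduction, in Python; the function name and the example probability are assumptions for illustration, not taken from any published code:

```python
import math

def chi_bits_beyond(p_target, threshold_bits=500):
    """Toy form of the reduced metric: information I = -log2(p) for hitting
    a specified target zone, reported as bits beyond the chosen threshold.
    A positive result flags 'beyond the stated search resources'."""
    info_bits = -math.log2(p_target)
    return info_bits - threshold_bits

# Illustrative example: a target zone occupying 1 in 2^832 of its config space
# (832 bits is the Durston-style fits value quoted for RecA later in the thread).
print(chi_bits_beyond(2.0 ** -832))   # -> 332.0 bits beyond the 500-bit threshold
```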
April 15, 2011, 01:03 AM PDT
So, can you now answer MathGrrl's questions?

MathGrrl did not have any questions.
Mung
April 14, 2011, 08:57 PM PDT
So, can you now answer MathGrrl's questions?

__________________

ED: This thread is pivotal on the meaning of CSI, its Chi metric, and applications, and answers MG's "CSI is not rigorously defined" challenge; with a significant side issue on the real significance of evolutionary algorithms, especially Schneider's ev. (Note Mung's bottom-line description of how ev works at 180. Also, the response to the "simple genome" talking point at 206. Axe's response here to four common misconceptions on his work concerning the isolation of protein fold domains in protein config space -- recall, coded for by genes -- is also material. Given the irresponsible and divisive tactics that are at work, I ask you to look at the onward post here, and at the notes here, here and here on the subtle temptation and trap of willful deception by insistently repeated misrepresentation, maintained by drumbeat repetition and unresponsiveness in the teeth of adequate correction.) Axe's remarks help set our own in context, so let us clip a key cite:
let’s be clear what the question is. We know that living cells depend on the functions of thousands of proteins, and that these proteins have a great variety of distinct structural forms. These distinct forms are referred to as folds, and there are well over a thousand of them known today, with more being discovered all the time. The big question is: Does the Darwinian mechanism explain the origin of these folds? One way to approach this question is to reframe it slightly by viewing the Darwinian mechanism as a simple search algorithm. If Darwin’s theory is correct, then natural populations find solutions to difficult problems having to do with survival and reproduction by conducting successful searches. Of course we use the words search and find here figuratively, because nothing is intentionally looking for solutions. Rather, the solutions are thought to be the inevitable consequence of the Darwinian mechanism in operation. If we view Darwinism in this way—as a natural search mechanism—we can restate the big question as follows: Are new protein folds discoverable by Darwinian searches? Recasting the question in this way turns out to be helpful in that it moves us from a subjective form of explanation to an objective one, where an affirmative answer needs to be backed up by a convincing probabilistic analysis. . . . . Yes, the Darwinian mechanism requires that the different protein folds and functions not be isolated, and yes the rarity of functional sequences has a great deal to do with whether they are isolated . . . . [Using the comparison of a text string in English] as you’ve probably guessed, meaningful 42-character [text] combinations are far more rare than 1 in 1000, which explains why the islands of meaning are always isolated—mere dots in an enormously vast ocean of nonsense. Billions of sensible things can be said in 42 characters, and almost all of them can be said in many different ways, but none of that amounts to anything compared to the quadrillion quadrillion quadrillion quadrillion possible character combinations of that length (27^42 = 10^60). That is the sense in which functional protein sequences appear to be rare, and it has everything to do with their isolation.
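As a quick arithmetic cross-check (a sketch only, not part of Axe's text) on the 27^42 figure in the clip:

```python
import math

# 26 letters plus the space character = 27 symbols; 42-character strings.
combos = 27 ** 42
print(f"27^42 ~ 10^{math.log10(combos):.1f}")   # ~ 10^60.1, i.e. about 10^60
```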
That is the context in which the Dembski-type info-beyond-a-threshold metric is an index of how hard it is to search for isolated islands of function in seas of non-function -- for protein fold domains, and for other Wicken wiring diagram based functional entities. Let's highlight the key results from above:

I: The Chi metric of CSI, in log-reduced form, and on thresholds of complexity:
Chi_500 = Ip – 500, bits beyond the [solar system resources] threshold . . . eqn n5
Chi_1000 = Ip – 1000, bits beyond the observable cosmos, 125 byte / 143 ASCII character threshold . . . eqn n6
Chi_1024 = Ip – 1024, bits beyond a 2^10 = 128 byte / 147 ASCII character version of the threshold in n6, with a config space of 1.80*10^308 possibilities, not 1.07*10^301 . . . eqn n6a
Sample results of applying the Chi_500 version to the Durston et al values of information in 35 protein families:
RecA: 242 AA, 832 fits, Chi: 332 bits beyond
SecY: 342 AA, 688 fits, Chi: 188 bits beyond
Corona S2: 445 AA, 1285 fits, Chi: 785 bits beyond . . . results n7
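For concreteness, a minimal sketch reproducing the arithmetic of results n7 from the quoted Durston et al. fits values (just the subtraction in eqn n5; the values in the dictionary are those listed above):

```python
# Chi_500 = Ip - 500 (eqn n5), applied to the Durston et al. fits values quoted above.
durston_fits = {
    "RecA":      (242, 832),   # (amino acids, fits)
    "SecY":      (342, 688),
    "Corona S2": (445, 1285),
}

THRESHOLD = 500  # bits; solar-system-level resources threshold

for name, (aa, fits) in durston_fits.items():
    chi = fits - THRESHOLD
    print(f"{name}: {aa} AA, {fits} fits -> Chi_500 = {chi} bits beyond the threshold")
```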
That is, we see a measure of information beyond the threshold that points to design of the relevant protein families, and thus of the living cell.

II: After such results, we can fairly regard the dismissive claim made by MG, and highlighted in a previous thread by BA, as overturned. To wit:
MG: My conclusion is that, without a rigorous mathematical definition and examples of how to calculate [CSI], the metric is literally meaningless. Without such a definition and examples, it isn’t possible even in principle to associate the term with a real world referent. [NB: Observe my for the record comment at MF's blog on this talking point as repeated by Flint, here. Also, this onward discussion in the follow-on thread, on the rigour question -- which MG has (predictably) raised again, ignoring or outright dismissing all correction and brushing aside the pivotal issue of configuration spaces and search for islands of function.]
III: Instead we note:
a: CSI is an observational, descriptive term, whose meaning is in the very words: complex + specified + information. That should have been clear from the outset.

b: As the just linked UD WAC 26 has pointed out for years, the concept CSI does not trace to Dembski but to Orgel, Wicken and others, and more broadly it describes a common feature of a technological world: complicated functional things that require a lot of careful arrangement or organisation -- thus functionally specific and complex information [FSCI] -- to work right.

c: It turns out that cell based life is replete with the same sort of complex, specific organisation and information.

d: This raises the reasonable question as to whether cell based life is a technology that has implemented, in effect, a self-replicating machine built on high-tech, nano-tech machines based on informational polymer molecules, especially proteins, RNA and DNA.

e: So, we need to identify whether there are signs that can distinguish such design from what blind chance and mechanical forces can do.

f: Once we see that we are dealing with islands of specific function in large spaces of possible arrangements, by far and away most of which will not function, we can apply the mathematics of searching such large spaces at random -- through a random walk -- on a trial and error basis vs on an intelligent basis.

g: The decisive issue is that once spaces are large enough, the material resources -- numbers of atoms -- of our solar system or the observed cosmos eventually are inadequate. 398 - 500 bits of info specifies more than enough for the first threshold, and 1,000 for the second.

h: 500 bits corresponds to 10^150 possibilities, more or less the number of Planck time quantum states for the 10^80 atoms of our observed cosmos, across its estimated lifespan. (The number of P-time Q-states of the atoms of our solar system since the big bang, on the commonly used 13.7 BY timeline, amounts to some 10^102, i.e. 48 orders of magnitude below the number of configurations for 500 bits.) 1,000 bits is the SQUARE of that number. (A fast, strong force nuclear interaction takes about 10^20 Planck-times, and the fastest chemical reactions about 10^30; with organic reactions being much slower than that. That is why the P-time is regarded as the shortest duration that makes physical sense.) In short, yet another dismissive talking point bites the dust: we know that there is so much haystack that the feasible scope of search boils down to effectively zero, without need for specific probability calculations, in a context where the routinely observed, and only empirically known adequate, cause of FSCI/CSI is design.

i: And yet, to get something fairly complicated functionally organised in less than 125 bytes of information [1,000 bits] is not feasible. Indeed, just the DNA in the simplest cell based life is of order 100,000 - 1,000,000 bits of storage. 18 - 20 typical English words take up 1,000 bits.

j: That short of a computer program will not do much either, unless it is calling on other software elsewhere to do the real work, offstage so to speak. (That is what the so-called genomes of "evolutionary/genetic algorithms" effectively do.)

k: So, the reduced Dembski metric is meaningful and allows us to identify a relevant threshold of complexity beyond which the only known, observed source of things that are functionally specific and complex and work is intelligent design: Chi = I - 500, bits beyond a threshold.

l: And straightaway, we can apply this to the Durston results from 2007.
m: So, we can see that the above objection is swept off the table once we see the above reduction and application of the Dembski CSI metric.

n: And, the simpler to understand X-metric is just as applicable: X = C*S*B (a brief sketch follows).
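A minimal sketch of that brute-force X-metric, reading C and S as 1/0 flags (complex beyond the 1,000-bit, cosmos-level threshold; functionally specified) and B as the number of functional bits -- an interpretation of the formula above offered for illustration only, not an official implementation:

```python
def x_metric(functional_bits, is_specified, threshold_bits=1_000):
    """Brute-force X = C*S*B on the reading above: C = 1 if the functional info
    exceeds the (cosmos-level) threshold, S = 1 if functionally specified,
    B = number of functional bits. A non-zero X flags 'beyond the reach of
    chance' on this simple rubric."""
    c = 1 if functional_bits >= threshold_bits else 0
    s = 1 if is_specified else 0
    return c * s * functional_bits

# A 100 k+ bit genome for a minimal cell, taken as functionally specified:
print(x_metric(100_000, True))   # -> 100000 (flagged)
# A 500-bit specified string falls below the 1,000-bit threshold on this metric:
print(x_metric(500, True))       # -> 0 (not flagged at this threshold)
```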
IV: The main matter is now settled. So we can turn to the secondary questions and points MG raised in recent weeks.

- - - - - -

MG's four main questions are initially replied to at 3 below; the "now" presumes that MG's questions have not hitherto been cogently answered, which is not so, as can be seen from the discussion in her guest post thread. Note too how the reduction of Dembski's Chi in the OP is explained in 47 below as a log reduction. Mung has helpfully collected MG's four "major" questions/challenges here at 17 - 18 below. The thread's highlights (numbers relate to the time of addition, some have shifted, but links should work) are:

1: Cf 11 -- which integrates Durston and Dembski, showing biological values of Chi (which MG seems to think are not possible) -- and 19 - 20 below, for point by point answers to the four main questions; then

2: 44 ff below replies to MG's rebuttals. In addition, PAV responds in 50 - 51, discussing the genetic duplication scenario on an event basis in 50, and highlighting the question of how much actual functionally specific info is in Schneider's reported case in 51.

3: MG's predictable further main list of questions/demands is answered at 76.

4: A discussion of the meaning, quantification and definition of Chi is in 57 below.

5: The Mandelbrot set is introduced at 59 to illustrate how genetic algorithms process, rather than de novo create, functionally specific and complex info.

6: This is backed up by a further talking point response at what is now 87.

7: Talking point rebuttals (in general) begin at 5 below and continue throughout the thread.

8: At 128 below, and in light of reciprocity for his own efforts to clarify CSI for her, VJT again pleads for MG to give him a jargon-free [intelligent 12 year old level] summary of what ev and the other programs she cited in her four main examples actually do, in response to another refusal to do so.

9: Mung dissects ev starting at 126, and continuing with his linked discussion at ARN. Observe my own remarks in response to Schneider's dissection of the vivisection page, as well.

10: Note that Schneider actually says: "the information gain [of ev] depends on selection and it not blind and unguided. The selection is based on biologically sensible criteria: having functional DNA binding sites and not having extra ones. So Ev models the natural situation." OOPS. Methinks the biologists will want to have a word with Dr Schneider about whether or not natural selection is a GUIDED process. Starting with Dawkins, when he wrote of his own Weasel that:
the monkey/Shakespeare model is useful for explaining the distinction between single-step selection and cumulative selection, it is misleading in important ways. One of these is that, in each generation of selective ‘breeding’, the mutant ‘progeny’ phrases were judged according to the criterion of resemblance to a distant ideal target . . . Life isn’t like that. Evolution has no long-term goal. There is no long-distance target, no final perfection to serve as a criterion for selection, although human vanity cherishes the absurd notion that our species is the final goal of evolution. In real life, the criterion for selection is always short-term, either simple survival or, more generally, reproductive success . . .
__________

The sad bottom line, here. Next, Mung puts in the coup de grace, here.

++++++

On a happier note, we can explore some potential break-out and exploitation opportunities now that the schwerpunkt has broken through:
1: The observed fine-tuned nature of known designed GA's leads to the inference that, if GA's model the adaptation of living organisms, that is a mark that the capacity is an in-built design in living organisms.

2: In short, we seem to have been looking through the telescope from the wrong end, and may be missing what it can show us.

3: In addition, the way that exploration and discovery algorithms and approaches can often reveal the unexpected but already existing and implicit, reminds us that discovery is not origination.

4: Finally, the case of the Mandelbrot set, where a simple function and exploratory algorithm revealed infinite complexity, points to the inference that the elegant, irreducibly complex [per Godel] and undeniably highly functional and precisely specific system of mathematical-logical reality, a key component of the cosmos, is itself arguably designed. An unexpected result.

5: And, a BA 77 point: this specifically includes how the Euler equation pins down the five key numbers in Mathematics, tying them together in one tight, elegantly beautiful complex plane -- yup, the plane with the Mandelbrot set in it to reveal infinite complexity [hence infinite lurking information unfolded from such a simple specification] -- expression:
e^(i*pi) + 1 = 0
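A small numerical aside (a sketch only, in Python): checking the Euler identity in floating point, plus the bare z -> z^2 + c iteration behind the Mandelbrot set's endless detail mentioned in points 4 - 5:

```python
import cmath

# Euler's identity: e^(i*pi) + 1 should be 0 (up to floating-point rounding).
print(cmath.exp(1j * cmath.pi) + 1)   # ~ 1.22e-16j, i.e. zero to machine precision

# The whole Mandelbrot set comes from iterating z -> z^2 + c and asking whether
# |z| stays bounded: a very short specification, endless unfolding detail.
def in_mandelbrot(c, max_iter=100):
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return False
    return True

print(in_mandelbrot(-1 + 0j))   # True: c = -1 lies inside the set
print(in_mandelbrot(1 + 1j))    # False: this point escapes quickly
```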
So, let us explore together . . .
Graham
April 14, 2011, 07:52 PM PDT