Uncommon Descent Serving The Intelligent Design Community

NEWS FLASH: Dembski’s CSI caught in the act

Share
Facebook
Twitter
LinkedIn
Flipboard
Print
Email

Dembski’s CSI concept has come under serious question, dispute and suspicion in recent weeks here at UD.

After diligent patrolling the cops announce a bust: acting on some tips from un-named sources,  they have caught the miscreants in the act!

From a comment in the MG smart thread, courtesy Dembski’s  NFL (2007 edn):

___________________

>>NFL as just linked, pp. 144 & 148:

144: “. . . since a universal probability bound of 1 in 10^150 corresponds to a universal complexity bound of 500 bits of information, (T, E) constitutes CSI because T [i.e. “conceptual information,” effectively the target hot zone in the field of possibilities] subsumes E [i.e. “physical information,” effectively the observed event from that field], T is detachable from E, and and T measures at least 500 bits of information . . . ”

148: “The great myth of contemporary evolutionary biology is that the information needed to explain complex biological structures can be purchased without intelligence. My aim throughout this book is to dispel that myth . . . . Eigen and his colleagues must have something else in mind besides information simpliciter when they describe the origin of information as the central problem of biology.

I submit that what they have in mind is specified complexity, or what equivalently we have been calling in this Chapter Complex Specified information or CSI . . . .

Biological specification always refers to function . . . In virtue of their function [a living organism’s subsystems] embody patterns that are objectively given and can be identified independently of the systems that embody them. Hence these systems are specified in the sense required by the complexity-specificity criterion . . . the specification can be cashed out in any number of ways . . . “

Here we see all the suspects together caught in the very act.

Let us line up our suspects:

1: CSI,

2: events from target zones in wider config spaces,

3: joint complexity-specification criteria,

4: 500-bit thresholds of complexity,

5: functionality as a possible objective specification

6: biofunction as specification,

7: origin of CSI as the key problem of both origin of life [Eigen’s focus] and Evolution, origin of body plans and species etc.

8: equivalence of CSI and complex specification.

Rap, rap, rap!

“How do you all plead?”

“Guilty as charged, with explanation your honour. We were all busy trying to address the scientific origin of biological information, on the characteristic of complex functional specificity. We were not trying to impose a right wing theocratic tyranny nor to smuggle creationism in the back door of the schoolroom your honour.”

“Guilty!”

“Throw the book at them!”

CRASH! >>

___________________

So, now we have heard from the horse’s mouth.

What are we to make of it, in light of Orgel’s conceptual definition from 1973 and the recent challenges to CSI raised by MG and others.

That is:

. . . In brief, living organisms are distinguished by their specified complexity. Crystals are usually taken as the prototypes of simple well-specified structures, because they consist of a very large number of identical molecules packed together in a uniform way. Lumps of granite or random mixtures of polymers are examples of structures that are complex but not specified. The crystals fail to qualify as living because they lack complexity; the mixtures of polymers fail to qualify because they lack specificity. [[The Origins of Life (John Wiley, 1973), p. 189.]

And, what about the more complex definition in the 2005 Specification paper by Dembski?

Namely:

define ϕS as . . . the number of patterns for which [agent] S’s semiotic description of them is at least as simple as S’s semiotic description of [a pattern or target zone] T. [26] . . . . where M is the number of semiotic agents [S’s] that within a context of inquiry might also be witnessing events and N is the number of opportunities for such events to happen . . . . [where also] computer scientist Seth Lloyd has shown that 10^120 constitutes the maximal number of bit operations that the known, observable universe could have performed throughout its entire multi-billion year history.[31] . . . [Then] for any context of inquiry in which S might be endeavoring to determine whether an event that conforms to a pattern T happened by chance, M·N will be bounded above by 10^120. We thus define the specified complexity [χ] of T given [chance hypothesis] H [in bits] . . . as  [the negative base-2 log of the conditional probability P(T|H) multiplied by the number of similar cases ϕS(t) and also by the maximum number of binary search-events in our observed universe 10^120]

χ = – log2[10^120 ·ϕS(T)·P(T|H)]  . . . eqn n1

How about this (we are now embarking on an exercise in “open notebook” science):

1 –> 10^120 ~ 2^398

2 –> Following Hartley, we can define Information on a probability metric:

I = – log(p) . . .  eqn n2

3 –> So, we can re-present the Chi-metric:

Chi = – log2(2^398 * D2 * p)  . . .  eqn n3

Chi = Ip – (398 + K2) . . .  eqn n4

4 –> That is, the Dembski CSI Chi-metric is a measure of Information for samples from a target zone T on the presumption of a chance-dominated process, beyond a threshold of at least 398 bits, covering 10^120 possibilities.

5 –> Where also, K2 is a further increment to the threshold that naturally peaks at about 100 further bits. In short VJT’s CSI-lite is an extension and simplification of the Chi-metric. He explains in the just linked (and building on the further linked):

The CSI-lite calculation I’m proposing here doesn’t require any semiotic descriptions, and it’s based on purely physical and quantifiable parameters which are found in natural systems. That should please ID critics. These physical parameters should have known probability distributions. A probability distribution is associated with each and every quantifiable physical parameter that can be used to describe each and every kind of natural system – be it a mica crystal, a piece of granite containing that crystal, a bucket of water, a bacterial flagellum, a flower, or a solar system . . . .

Two conditions need to be met before some feature of a system can be unambiguously ascribed to an intelligent agent: first, the physical parameter being measured has to have a value corresponding to a probability of 10^(-150) or less, and second, the system itself should also be capable of being described very briefly (low Kolmogorov complexity), in a way that either explicitly mentions or implicitly entails the surprisingly improbable value (or range of values) of the physical parameter being measured . . . .

my definition of CSI-lite removes Phi_s(T) from the actual formula and replaces it with a constant figure of 10^30. The requirement for low descriptive complexity still remains, but as an extra condition that must be satisfied before a system can be described as a specification. So Professor Dembski’s formula now becomes:

CSI-lite=-log2[10^120.10^30.P(T|H)]=-log2[10^150.P(T|H)] . . . eqn n1a

. . . .the overall effect of including Phi_s(T) in Professor Dembski’s formulas for a pattern T’s specificity, sigma, and its complex specified information, Chi, is to reduce both of them by a certain number of bits. For the bacterial flagellum, Phi_s(T) is 10^20, which is approximately 2^66, so sigma and Chi are both reduced by 66 bits. My formula makes that 100 bits (as 10^30 is approximately 2^100), so my CSI-lite computation represents a very conservative figure indeed.

Readers should note that although I have removed Dembski’s specification factor Phi_s(T) from my formula for CSI-lite, I have retained it as an additional requirement: in order for a system to be described as a specification, it is not enough for CSI-lite to exceed 1; the system itself must also be capable of being described briefly (low Kolmogorov complexity) in some common language, in a way that either explicitly mentions pattern T, or entails the occurrence of pattern T. (The “common language” requirement is intended to exclude the use of artificial predicates like grue.) . . . .

[As MF has pointed out] the probability p of pattern T occurring at a particular time and place as a result of some unintelligent (so-called “chance”) process should not be multiplied by the total number of trials n during the entire history of the universe. Instead one should use the formula (1–(1-p)^n), where in this case p is P(T|H) and n=10^120. Of course, my CSI-lite formula uses Dembski’s original conservative figure of 10^150, so my corrected formula for CSI-lite now reads as follows:

CSI-lite=-log2(1-(1-P(T|H))^(10^150)) . . . eqn n1b

If P(T|H) is very low, then this formula will be very closely approximated [HT: Giem] by the formula:

CSI-lite=-log2[10^150.P(T|H)]  . . . eqn n1c

6 –> So, the idea of the Dembski metric in the end — debates about peculiarities in derivation notwithstanding — is that if the Hartley-Shannon- derived information measure for items from a hot or target zone in a field of possibilities is beyond 398 – 500 or so bits, it is so deeply isolated that a chance dominated process is maximally unlikely to find it, but of course intelligent agents routinely produce information beyond such a threshold.

7 –> In addition, the only observed cause of information beyond such a threshold is the now proverbial intelligent semiotic agents.

8 –> Even at 398 bits that makes sense as the total number of Planck-time quantum states for the atoms of the solar system [most of which are in the Sun] since its formation does not exceed ~ 10^102, as Abel showed in his 2009 Universal Plausibility Metric paper. The search resources in our solar system just are not there.

9 –> So, we now clearly have a simple but fairly sound context to understand the Dembski result, conceptually and mathematically [cf. more details here]; tracing back to Orgel and onward to Shannon and Hartley. Let’s augment here [Apr 17], on a comment in the MG progress thread:

Shannon measured info-carrying capacity, towards one of his goals: metrics of the carrying capacity of comms channels — as in who was he working for, again?

CSI extended this to meaningfulness/function of info.

And in so doing, observed that this — due to the required specificity — naturally constricts the zone of the space of possibilities actually used, to island[s] of function.

That specificity-complexity criterion links:

I: an explosion of the scope of the config space to accommodate the complexity (as every added bit DOUBLES the set of possible configurations),  to

II: a restriction of the zone, T, of the space used to accommodate the specificity (often to function/be meaningfully structured).

In turn that suggests that we have zones of function that are ever harder for chance based random walks [CBRW’s] to pick up. But intelligence does so much more easily.

Thence, we see that if you have a metric for the information involved that surpasses a threshold beyond which a CBRW is a plausible explanation, then we can confidently infer to design as best explanation.

Voila, we need an info beyond the threshold metric. And, once we have a reasonable estimate of the direct or implied specific and/or functionally specific (especially code based) information in an entity of interest, we have an estimate of or credible substitute for the value of – log2(p(T|H)); especially if the value of information comes from direct inspection of storage capacity and code symbol patterns of use leading to an estimate of relative frequency, we may evaluate average [functionally or otherwise] specific information per symbol used. This is a version of Shannon’s weighted average information per symbol H-metric, H = –  Σ pi * log(pi), which is also known as informational  entropy [there is an arguable link to thermodynamic entropy, cf here)  or uncertainty.

As in (using Chi_500 for VJT’s CSI_lite [UPDATE, July 3: and S for a dummy variable that is 1/0 accordingly as the information in I is empirically or otherwise shown to be specific, i.e. from a narrow target zone T, strongly UNREPRESENTATIVE of the bulk of the distribution of possible configurations, W]):

Chi_500 = Ip*S – 500,  bits beyond the [solar system resources] threshold  . . . eqn n5

Chi_1000 = Ip*S – 1000, bits beyond the observable cosmos, 125 byte/ 143 ASCII character threshold . . . eqn n6

Chi_1024 = Ip*S – 1024, bits beyond a 2^10, 128 byte/147 ASCII character version of the threshold in n6, with a config space of 1.80*10^308 possibilities, not 1.07*10^301 . . . eqn n6a

[UPDATE, July 3: So, if we have a string of 1,000 fair coins, and toss at random, we will by overwhelming probability expect to get a near 50-50 distribution typical of the bulk of the 2^1,000 possibilities W. On the Chi-500 metric, I would be high, 1,000 bits, but S would be 0, so the value for Chi_500 would be – 500, i.e. well within the possibilities of chance.  However, if we came to the same string later and saw that the coins somehow now had the bit pattern of the ASCII codes for the first 143 or so characters of this post, we would have excellent reason to infer that an intelligent designer, using choice contingency, had intelligently reconfigured the coins. that is because, using the same I = 1,000 capacity value, S is now 1, and so Chi_500 = 500 bits beyond the solar system threshold. If the 10^57 or so atoms of our solar system, for its lifespan, were to be converted into coins and tables etc, and tossed at an impossibly fast rate, it would be impossible to sample enough of the possibilities space W to have confidence that something from so unrepresentative a zone T,  could reasonably be explained on chance. So, as long as an intelligent agent capable of choice is possible, choice — i.e. design — would be the rational, best explanation on the sign observed, functionally specific, complex information.]

10 –> Similarly, the work of Durston and colleagues, published in 2007, fits this same general framework. Excerpting:

Consider that there are usually only 20 different amino acids possible per site for proteins, Eqn. (6) can be used to calculate a maximum Fit value/protein amino acid site of 4.32 Fits/site [NB: Log2 (20) = 4.32]. We use the formula log (20) – H(Xf) to calculate the functional information at a site specified by the variable Xf such that Xf corresponds to the aligned amino acids of each sequence with the same molecular function f. The measured FSC for the whole protein is then calculated as the summation of that for all aligned sites. The number of Fits quantifies the degree of algorithmic challenge, in terms of probability [info and probability are closely related], in achieving needed metabolic function. For example, if we find that the Ribosomal S12 protein family has a Fit value of 379, we can use the equations presented thus far to predict that there are about 10^49 different 121-residue sequences that could fall into the Ribsomal S12 family of proteins, resulting in an evolutionary search target of approximately 10^-106 percent of 121-residue sequence space. In general, the higher the Fit value, the more functional information is required to encode the particular function in order to find it in sequence space. A high Fit value for individual sites within a protein indicates sites that require a high degree of functional information. High Fit values may also point to the key structural or binding sites within the overall 3-D structure.

11 –> So, Durston et al are targetting the same goal, but have chosen a different path from the start-point of the Shannon-Hartley log probability metric for information. That is, they use Shannon’s H, the average information per symbol, and address shifts in it from a ground to a functional state on investigation of protein family amino acid sequences. They also do not identify an explicit threshold for degree of complexity. [Added, Apr 18, from comment 11 below:] However, their information values can be integrated with the reduced Chi metric:

Using Durston’s Fits from his Table 1, in the Dembski style metric of bits beyond the threshold, and simply setting the threshold at 500 bits:

RecA: 242 AA, 832 fits, Chi: 332 bits beyond

SecY: 342 AA, 688 fits, Chi: 188 bits beyond

Corona S2: 445 AA, 1285 fits, Chi: 785 bits beyond  . . . results n7

The two metrics are clearly consistent, and Corona S2 would also pass the X metric’s far more stringent threshold right off as a single protein. (Think about the cumulative fits metric for the proteins for a cell . . . )

In short one may use the Durston metric as a good measure of the target zone’s actual encoded information content, which Table 1 also conveniently reduces to bits per symbol so we can see how the redundancy affects the information used across the domains of life to achieve a given protein’s function; not just the raw capacity in storage unit bits [= no.  of  AA’s * 4.32 bits/AA on 20 possibilities, as the chain is not particularly constrained.]

12 –> I guess I should not leave off the simple, brute force X-metric that has been knocking around UD for years.

13 –> The idea is that we can judge information in or reducible to bits, as to whether it is or is not contingent and complex beyond 1,000 bits. If so, C = 1 (and if not C = 0). Similarly, functional specificity can be judged by seeing the effect of disturbing the information by random noise [where codes will be an “obvious” case, as will be key-lock fitting components in a Wicken wiring diagram functionally organised entity based on nodes, arcs and interfaces in a network], to see if we are on an “island of function.” If so, S = 1 (and if not, S = 0).

14 –> We then look at the number of bits used, B — more or less the number of basic yes/no questions needed to specify the configuration [or, to store the data], perhaps adjusted for coding symbol relative frequencies — and form a simple product, X:

X = C * S * B, in functionally specific bits . . . eqn n8.

15 –> This is of course a direct application of the per aspect explanatory filter, (cf. discussion of the rationale for the filter here in the context of Dembski’s “dispensed with” remark) and the value in bits for a large file is the familiar number we commonly see such as a Word Doc of 384 k bits. So, more or less the X-metric is actually quite commonly used with the files we toss around all the time. That also means that on billions of test examples, FSCI in functional bits beyond 1,000 as a threshold of complexity is an empirically reliable sign of intelligent design.

______________

All of this adds up to a conclusion.

Namely, that there is excellent reason to see that:

i: CSI and FSCI are conceptually well defined (and are certainly not “meaningless”),

ii: trace to the work of leading OOL researchers in the 1970’s,

iii: have credible metrics developed on these concepts by inter alia Dembski and Durston, Chiu, Abel and Trevors, metrics that are based on very familiar mathematics for information and related fields, and

iv: are in fact — though this is hotly denied and fought tooth and nail — quite reliable indicators of intelligent cause where we can do a direct cross-check.

In short, the set of challenges recently raised by MG over the past several weeks has collapsed. END

Comments
F/N: Onlookers, it seems I need to show why I said what I said at 194, again, by way of correction. So, let me clip the substance of that comment: ____________ >> [KF:] Pardon, again; are you aware of the size of actual genomes? When you [Dr Bot] say:
[Dr Bot, 192:] With the Golem encoding the complexity of the organism is directly related to the genotype. If you want 100 legs and a segmented body (like a millipede) you need to encode each segment and each leg explicitly. The size of the genome and the resulting search space becomes impossibly large and evolution can hit a barrier but when you have biological like indirect encodings and development you can build complex structures like that with very simple genomes
[KF, answering:] Real world genomes start at 100+ k to 1 mn bases, and for multicellular body plans we are looking at — dozens of times over — 10mn+ new base pairs. Genomes then run up to billions of bases in a well organised reference library. Just 100 k base pairs is a config space of 4^100,000 ~ 9.98 * 10^60,205 possibilities. The P[lanck]-time Q[uantum]-states of the observed cosmos across its lifespan, would amount to no more than 10^150, a mere drop in that bucket. “Simple genome” is a grossly simplistic strawman. And, the hinted-at suggestion that by using in effect a lookup table as a genome you have got rid of the need to code the information and the regulatory organisation, is another misdirection. You have simply displaced the need to code the algorithms that do the technical work. Notice, genomes are known to have protein etc coding segments, AND regulatory elements that control expression, all in the context of a living cell that has the machinery to make it work. The origin of the cell [metabolising and von Neumann self replicator], and its elaboration through embryogenesis into varied functional body plans have to be explained on chance plus necessity and confirmed by observation if the evo mat view is to have any reasonable foundation in empirically based warrant. >> _____________ I trust the point, and its context are now sufficiently clear. Notice, especially, the highlighted concession in 192:
The size of the genome and the resulting search space becomes impossibly large and evolution can hit a barrier
Let us ask: is 100 - 1,000+ kbits worth of genetic info to get to 1st life "impossibly large," and is 10 mn + dozens of times over to get to novel body plans "impossibly large"? I think the question answers itself, once we realise that just 100 k bits worth of stored info codes for up to 9.98 * 10^60,205 possible configs [the 10^80 atoms of our observed cosmos across its thermodynamic lifespan would only undergo 10^150 P-time states, where ~ 10^30 states are needed for the fastest chemical reactions], and shows the material point. That is what you are not being told in the HS or College classroom, what you are not reading in your textbooks, it is what museum displays will not tell you, it is what Nat Geog or Sci Am or Discover Mag will not tell you in print or on web or on TV, and it is what the NCSE and now BCSE spin doctors are doing their level best to make sure you never hear in public. GEM of TKI ++++++++++ Pardon: auto-termination. kairosfocus
Dr Bot: I am sorry, this is a major thread on a key issue regarding the CSI metric. I do not think a late tangential debate on another subject will help much, especially when my pointing out a problem that seems to be recurring is turning into an occasion for turnabout rhetorical tactics. Sufficient has been said to underscore that the Golem case shows -- inadvertently -- how hard it is to try to get to complex systems by random walk searches with trial and error on success. That is enough to underscore that it shows the significance of the islands of function problem. Adaptation of a body plan is a different kettle of fish from getting to the functional plan in the first place. There is no informational free lunch. Good day GEM of TKI kairosfocus
I said:
The point is specific – Other explanations exist for the issues raised by their particular experiment so it cannot be considered “an empirical confirmation of the barriers posed by functionally specific complex information beyond a threshold”.
I addressed your specific claim regarding Golem and the issues they encountered. You said:
Can you show that the sort of mechanisms that may successfully modify an already functioning body plan, can generate significantly different ones, including the different control mechanisms? [Recall my note on 2 vs 3 vs 6 or 8 legged walking gaits.]
Can you? My point is specific to Golem and their implementation. It is valid regardless of how or if biological evolution works. You are not actually addressing my comment, just deflecting to a different issue. My point is specific to Golem and their implementation. If you want to get into detail then there is plenty of research into this - perhaps you should take a look! As I already outlined, an indirect encoding or developmental mapping scheme can generate repeating structures like limbs, including control systems. Indeed some promising work is in adaptive controllers that configure themselves to provide effective control. These systems (often based on neural nets, designed with GA's) do not need specific architectures tuned to a specific morphology, they can be fairly generic but adapt during development - all inspired by biology! In the context of evolutionary robotics (which was basically what Pollack et al. were doing) an indirect encoding scheme, development and adaptive controllers can overcome the complexity barrier they encountered, and they can mean that small genotype changes translate into major prototypic differences (or if you prefer - radically different body plans) and major increases in complexity. But of course we are talking about designed experiments, evolutionary algorithms as design tools working from a designed starting point - crude approximations of living systems. They are not OOL experiments, they are IIC experiments (Increase In Complexity) so they already start on an island of function (or perhaps a continent - it depends on the encoding and development scheme!) It is easy to look at Pollack et.al's statement that they encountered a complexity barrier and claim that it proves something fundamental about biology. A proper scientific approach is to try and understand why they encountered this barrier, and if it even applies to biology. Improving the Golem encoding scheme might jump the complexity barrier, but there may be other barriers, and even with indirect mappings Golem may still be too far removed from biology to make direct comparisons.
The point is specific – Other explanations exist for the issues raised by their particular experiment so it cannot be considered “an empirical confirmation of the barriers posed by functionally specific complex information beyond a threshold”.
You keep deploying your default and repetitive argument which seems to amount to "if you can't explain the origin of life then you can't explain anything" I wish you would limit yourself to dealing with specific arguments on their merits. Going back a few posts:
You still have not cogently responded to the evidence that the issue is to get TO isolated islands of function in configuration space, rather than relatively minor adaptations within islands of function. And it is the former challenge that the Golem project underscores.
The issue I addressed was about the encoding scheme used in Golem. If you change the encoding scheme you turn a small island of function into a large continent. I have not addressed the issue of OOL because the issue of OOL is not the issue I was addressing, nor was it a goal of the Golem project. Golem starts with minimal function and looks at what descent with modification can do, it was not an OOL research project, it was not concerned with getting to an island of function so the problems they encountered do not underscore the problem of getting to an island of function.
... immaterial distractors lend themselves to strawman caricatures and ad hominems, thence atmosphere-poisoning. As you have already been through with me to a point where you had to half-apologise on trying to quit smoking if I recall correctly.
Ah, an attack on my person instead of just my arguments - there's a word for that ... If you recall, I over-reacted to a comment of yours which I took as a personal attack, and then apologised after some reflection. Perhaps you feel unable to forgive me? Don't worry, I forgive you ;) Your comments and demands are immaterial to my point about Golem, they are a distraction. The issue I addressed was about the encoding scheme used in Golem. And yes – some genomes are quite complex. The issue I addressed was about the encoding scheme used in Golem. DrBot
PS: And, real genomes are quite quite complex. kairosfocus
Dr Bot: The following excerpt aptly captures why I pointed out the significance of the getting to islands of function -- the macro evo not the micro evo -- material problem:
Why should I?
Because, that was the material point; and because immaterial distractors lend themselves to strawman caricatures and ad hominems, thence atmosphere-poisoning. As you have already been through with me to a point where you had to half-apologise on trying to quit smoking if I recall correctly. The Golem project illustrated -- as a case in point, not as the proof of all proofs -- the empirically observable challenge of getting to islands of function for chance and necessity. That is what I highlighted. In short, your objection that in effect I am not addressing the [limited]power of micro evo to adapt an already functioning body plan, is irrelevant, and of course feeds into the strawman and ad hominem problem you have already gone through with me. Please, let us not go down that fruitless path. Can you show that the sort of mechanisms that may successfully modify an already functioning body plan, can generate significantly different ones, including the different control mechanisms? [Recall my note on 2 vs 3 vs 6 or 8 legged walking gaits.] What about the first body plan? What about embryological feasibility? GEM of TKI kairosfocus
You still have not cogently responded to the evidence that the issue is to get TO isolated islands of function in configuration space, rather than relatively minor adaptations within islands of function.
Why should I? I was commenting on your claim:
In short, this is an empirical confirmation of the barriers posed by functionally specific complex information beyond a threshold, and irreducible complexity.
By pointing out that their encoding scheme and the genotype to phenotype mapping they used may be one reason why their particular experiment got stuck at a local maxima with regards the complexity of the agents that were produced. The point is specific - Other explanations exist for the issues raised by their particular experiment so it cannot be considered "an empirical confirmation of the barriers posed by functionally specific complex information beyond a threshold". You could have simply accepted this valid observation but instead you, as always it seems, shifted the goal posts and demanded that I proove something else, that was not part of the point I was making. Perhaps I should re-state my general position - which I have made many times on this website but which you always seem to ignore: As a theist I believe that we are the product of design, as a scientist I am agnostic about the method of creation and skeptical about claims surrounding abiogenesis - from both sides. Naturalistic OOL is compatible with my theistic beliefs but I have no ideology that demands it, or anything else. When you ask me to account for OOL without design you are asking me to explain how something that I don't believe happened, happened. This gets very tiresome! DrBot
PS: It should be clear from the metric for H, that once we have contingency, a string WILL have a non-zero value for Shannon information. In the case where one symbol has probability 1 and the rest have probability zero, H reduces to - log (1) = 0. There are no biologically relevant strings of AA's or nucleic acid bases that have zero values for the H metric. Similarly, it should be clear that the H metric standing by itself is not a good measure of what is involved in functionality or meaning. Hence the significance of Dembski's zones of interest T in a wider config space, and metrics that allow us to infer from degree of isolation, criteria for accepting some strings as most likely designed. This is without loss of generality, as more complex cases can be reduced to strings. kairosfocus
Mung: Perhaps then, we can work together to work out how to express what needs to be said in a way that will communicate clearly enough to those who do not have a specific background in communication systems. Okay, let's try a beginning: 1: Information, conceptually, is:
1. Facts, data, or instructions in any medium or form. 2. The meaning that a human assigns to data by means of the known conventions used in their representation. [Dictionary of Military and Associated Terms. US Department of Defense 2005.] Information in its most restricted technical sense is an ordered sequence of symbols that record or transmit a message. It can be recorded as signs, or conveyed as signals by waves. Information is any kind of event that affects the state of a dynamic system. As a concept, however, information has numerous meanings.[1] Moreover, the concept of information is closely related to notions of constraint, communication, control, data, form, instruction, knowledge, meaning, mental stimulus, pattern, perception, representation, and especially entropy . . . [Wiki art., Information]
2: In addition, we could see that the specific organisation of a functional, dynamic entity [e.g. a car engine, a computer, a match with head and stick] is implicitly information-rich, and may be reduced to strings of symbols according to rules for describing the associated network of components. [Think of how a blueprint is represented in a CAD program.] this is obviously relevant to the coded symbol strings in DNA, the resulting AA sequences in protein chains, and the wider complex functional organisation of the living cell and the organism with a complex body plan. 3: When Hartley and others investigated information towards quantifying it in the 1920's - 40's, they found that the easiest way to do so would be to exploit the contingency and pattern of appearance of symbols in messages: there is a statistical distribution in messages of sufficient length in aggregate, e.g. about 1 in 8 letters in typical English text will be an E. 4: Accordingly, one may investigate the message as a statistical phenomenon, and isolate the observed frequency distribution of symbols. 5: From this, we may see that we are dealing with probabilities, as the likelihood of a letter e is 1 in 8 is similar to the odds of a die being tossed and coming up 3 is 1 in 6. So, the probability of the first is about 0.12, and the latter is about 0.17. 6: Letters like X or Z are far less likely, and so it is intuitively clear that they give much more information: info rises as probability falls. So, an inverse probability measure is close to what we want for a metric of information. 7: We also want an additive measure, so that we can add up information in successive symbols. 8: The best reasonably simple metric for this is a logarithmic one, and a log of 1/p, will be a negative log probability that will add up (as the already referenced always linked online note discusses). 9: That is what Hartley advised, and it is what Shannon took up. So the basic information metric in use in the field is based on a negative log of the frequency of occurrence of a given symbol in messages. 10: Citing the fairly short discussion in Taub and Schilling again:
Let us consider a communication system in which the allowable messages [think: e.g. ASCII text alphanumerical symbols] are m1, m2, . . ., with probabilities of occurrence p1, p2, . . . . Of course p1 + p2 + . . . = 1. Let the transmitter select message mk of probability pk; let us further assume that the receiver has correctly identified the message [My nb: i.e. the a posteriori probability in my online discussion is 1]. Then we shall say, by way of definition of the term information, that the system has communicated an amount of information Ik given by Ik = (def) log2 1/pk (13.2-1) [Princs of Comm Systems, 2nd edn, Taub and Schilling (McGraw Hill, 1986), p. 512, Sect. 13.2.]
11: Unpacking, the quantity of information in a message element k, in bits, is logarithm [to base 2 here] of the inverse of probability, which is equal to: Ik = (def) log2 1/pk Ik = log2 1 - log2 pk Ik = 0 - log2 pk Ik = - log2 pk, in bits 12: We note that any positive number can be seen as a particular number [here 2, sometimes 10, sometimes e] raised to a power, called the log: 10^3 = 1,000, so log10 (1,000) = 3 13: Likewise the information represented by an E in typical English, is Ie = - log2 (0.12) = 3.06 bits. 14: There are usually 20 amino acids in a protein chain, and since they are more or less not chemically constrained in chaining, a simple value for info per AA would be, on 5% odds per AA: Iaa = - log 2 (0.05) = 4.32 bits per AA 15: While that basic chemical fact allows that to be a baseline measure, in fact in functioning protein families, the AA's are not equally likely [here, the issue is not just the chemistry of chaining, but what is needed to get proper folding and function in protein space], and that is what Durston et al turned into their more complex measures. 16: For more complex cases, it is useful to make an average information per symbol measure across the set of symbols used, using a weighted average: H = - [SUM on i] pi * log pi 17: This measure is what is often called Shannon information, and it is related to the average info per symbol that is in messages sent down a channel [think, TCP/IP strings sent down a phone line to your DSL modem]. 18: Now, going further, such symbol strings are to be found in communication systems. Such may be explicitly organised around the sort of block diagram you have seen, e.g. a radio network, or how your PC is hooked up to the Internet using the TCP/IP protocol which is tied to the "layercake" approach. (That is why we talk of Bridges, Routers, and Gateways, they have to do with levels of the coding and protocol layercake.] 19: But that does not have to be so. To move from source to sink, info conceptually needs to be encoded and/or modulated, transmitted across a channel, and received, then demodulated and/or decoded, before it is in a form useful in the sink or destination. These are conceptual stages, not so much physical blocks. 20: In biological systems, that sort of process is going on all the time, and there are many implicit communication systems in such organisms. 21: Notice, so far we have simply measured probability of symbols in strings, and have not bothered about the functionality of meaningful messages. By this standard, it can be mathematically shown that a flat random distribution of symbols [similar to tossing a fair die] would give the highest possible value of H for a string. But such would be meaningless. An oddity of the metric. 22: In real world, functional messages, symbols are not flat random equiprobable, and for us to be able to communicate, we must be able to select and issue symbols, i.e. a string that must only be AAAAAA . . . has no contingency and though orderly is equally uninformative. We have no surprise to see an A as A is the forced value. - log2 (1) = 0. 23: So we see the significance of the sort of modelling Dembski et al have done: they recognise that symbol strings come from an in principle field of possible strings, the config space. 24: But only configs from a zone of interest will be functional. So, if we can describe what that zone of interest T is like, and we observe a given event E from it, we know we are in a special zone. 25: if such islands of function in large config spaces that are dominated by a sea of non-function, are sufficiently rare, it becomes unreasonable to think you could get there by chance. 26: The odds of two dice coming up 6-6 are 1 in 36, which is within reason. The odds of 400 dice all reading 6 by chance, are 1 in 6^400, far less likely, in fact beyond the reasonable reach of chance on the gamut of our observed cosmos. If you see this, the best explanation is that someone organised the dice to read 6. 27: Similarly, odds of DNA strings or AA strings being functional by chance can be worked out, and information values assigned. [Or, we can simply look at the way the strings are set up [4 states per symbol 20 states per symbol] and see that there is a given storage capacity, and/or modify for the observed patterns of symbols frequencies.] 28: We can then deduce metrics for the info stored in such strings. 29: We can then look at the degree of isolation by applying the sort of threshold metric that Dembski et al use, for strings from islands of function, and if we see we are beyond certain thresholds of complexity, it is reasonable to infer that we are looking at something that is best explained on intelligence. Just like with 400 dice all reading 6. __________ Does this help? GEM of TKI kairosfocus
Dr Bot: You still have not cogently responded to the evidence that the issue is to get TO isolated islands of function in configuration space, rather than relatively minor adaptations within islands of function. And it is the former challenge that the Golem project underscores. It is becoming increasingly evident that darwinism supporters do not appreciate what is involved in putting together a complex, multipart, integrated functional entity, where the individual parts have to be right, have to fir with their neighbours, and have to be parts in a much broader integrated whole; whether within the cell or in the larger multicellular organism based on embryogensis of a zygote that then transforms itself into a body plan. In turn, all of this is based on the technology of life, whereby we have cells that integrate metabolic machines and a von Neumann, stored code based self-replicating facility. Expressed in a Wicken wiring diagram, complex and functional, information-rich organised entity. GEM of TKI kairosfocus
PS: Were my remarks on info helpful?
I'm going to be honest here and say not really, though I think that's my fault and not yours. :) I graduated high school in three years and never got much beyond basic algebra and geometry. I planned to be a doctor, not an engineer. Funny thing is, I think it was my math competency that got me into the Navy as an engineer rather than as a corpsman. Isn't life funny? I just don't have a lot of the tools in my mental toolbox yet to understand a lot of this. So I'm trying to start simply and build up. So for a start I wanted to know in what way Shannon Information is applicable to biology. Does Shannon Information require a communications system? If you have a nucleotide sequence, in what way is it legitimate to claim that sequence has no Shannon Information? )or that it does contain Shannon Information.) How would you tell if there was an increase in the Shannon Information contained in the sequence? Mung
Kairos thanks for pointing how genome is not a simple entity. Quite the opposite is true in the light of recent findings: 1.Cell needs the whole DNA (98% is "junk") otherwise it wouldn't spend tremendous resources to copy-replicate it. 2. Scientists from Harvard found the DNA fills certain volume inside nucleus in shape of Peano curve. That provides for well organized structure instead of chaotic tangle. 3. DNA Skittle visualization tool ( free download) clearly shows repetitive patterns interchanging with randomly distributed nucleotides in non-coding DNA. Also, interference and modulation type patterns are visible. 4. One dimensional string could be periodically marked for bending and assembly of two dimensional matrix (like QR code). Next it is possible to layer (stack) multiple two dimensional data matrices to fill volume. 5. Combining (2),(3) and (4) it is possible to envision form of data storage as a purpose for non-coding DNA. It is possible we are dealing with three dimensional chemical data storage system. 6. Similar to holographic recording I would expect huge capacity and inherent information redundancy. Smaller, broken off section of holographic recording will show the whole picture but with lower resolution. I would also expect powerful dynamic encryption as the basic information should be kept away from irresponsible users( humans). Mung thanks for detailed analysis of ev Eugen
KF, I responded to your claim about the complexity barrier described by Pollack et. al.:
In short, this is an empirical confirmation of the barriers posed by functionally specific complex information beyond a threshold, and irreducible complexity.
I pointed out the differences between the encoding used by Golem, and that found in biology, and how this might account for the barrier they encountered. I did not claim that other barriers do not exist, or that genomes are simple, or that indirect mappings can solve all the problems of complexity. I merely offered an alternative explanation for the barrier they describe. DrBot
Dr Bot: Pardon, again; are you aware of the size of actual genomes? When you say:
With the Golem encoding the complexity of the organism is directly related to the genotype. If you want 100 legs and a segmented body (like a millipede) you need to encode each segment and each leg explicitly. The size of the genome and the resulting search space becomes impossibly large and evolution can hit a barrier but when you have biological like indirect encodings and development you can build complex structures like that with very simple genomes
Real world genomes start at 100+ k to 1 mn bases, and for multicellular body plans we are looking at -- dozens of times over -- 10mn+ new base pairs. Genomes then run up to billions of bases in a well organised reference library. Just 100 k base pairs is a config space of 4^100,000 ~ 9.98 * 10^60,205 possibilities. The P-time Q-states of the observed cosmos across its lifespan, would amount to no more than 10^150, a mere drop in that bucket. "Simple genome" is a grossly simplistic strawman. And, the hinted-at suggestion that by using in effect a lookup table as a genome you have got rid of the need to code the information and the regulatory organisation, is another misdirection. You have simply displaced the need to code the algorithms that do the technical work. Notice, genomes are known to have protein etc coding segments, AND regulatory elements that control expression, all in the context of a living cell that has the machinery to make it work. The origin of the cell [metabolising and von Neumann self replicator], and its elaboration through embryogenesis into varied functional body plans have to be explained on chance plus necessity and confirmed by observation if the evo mat view is to have any reasonable foundation in empirically based warrant. GEM of TKI kairosfocus
Dr Bot: Pardon: Do you see the key non-sequitur in your argument? Let me highlight:
In their system a multi legged robot requires a full description for each limb, but with a developmental system and an indirect mapping you can have one generic leg description in the genes that is repeated n times during development [and where do all these conveniently come from? THAT is the key, unanswered question . . . ] – in other words you can jump from a four legged to an eight legged morphology by changing the value of n from 4 to 8 (one mutation) . . .
Someone or something, somewhere has to put down the info to get the complex functional organisation, in detail, on the Wicken wiring diagram. Duplicating and modifying an existing functional structure is one thing, creating it de novo out of commonly available components not conveniently organised, is not. And BTW, the associated controls to move successfully on 2 vs 4 legs vs 6 vs 8 are very different. It is not just a matter of produce n legs. A 2 legged gait is very different from a 4 or a 6 or 8. (And, since 3 legs gives a stable tripod [though that in turn actually requires 6 controlled points for real stability, ask a camera tripod designer for why], it is the 6 or 8 that are simpler to use physically: stand on 3+, move 3+, repeat.) So, your counter-argument boils down to the same error made by the author of ev: ASSUMING an already functional body plan, we can modify it on a suitably nice fitness landscape, through a hill-climbing algorithm with a nice trend metric. But, you have no right to that missing step, It does not follow. The Golem project was obviously trying to get to that initial functioning body plan, and that is precisely why the experiment ran into the challenge of islands of isolated functional organisation in exponentially growing config spaces. Recall, every additional YES/NO decision node in the Wicken wiring diagram DOUBLES the number of possible configs. There is abundant evidence that once one has a nice smooth fitness pattern on an existing island of function, one may move about in it. But the real problem is to get to the shores of such an island of function. In short, we are seeing a massively begged question here. GEM of TKI kairosfocus
I find the Golem project’s conclusion as at Sept 3, 2001 [no updates since then] highly interesting:
The evolutionary process appears to be hitting a complexity barrier that is not traversable using direct mutation-selection processes, due to the exponential nature of the problem. We are now developing new theories about additional mechanisms that are necessary for the synthetic evolution of complex life forms.
The other factor may be that they use direct mappings with no development - the genotype explicitly specifies the morphology of the agent. In their system a multi legged robot requires a full description for each limb, but with a developmental system and an indirect mapping you can have one generic leg discription in the genes that is repeated n times during development - in other words you can jump from a four legged to an eight legged morphology by changing the value of n from 4 to 8 (one mutation) - you can even encode the number of joints in a limb in a similar fashion so a single bit mutation can give you an extra joint in each limb. Part of the complexity barrier they encountered may be due to tese differences between their system and biology - which uses indirect mappings and development. With the Golem encoding the complexity of the organism is directly related to the genotype. If you want 100 legs and a segmented body (like a millipede) you need to encode each segment and each leg explicitly. The size of the genome and the resulting search space becomes impossibly large and evolution can hit a barrier but when you have biological like indirect encodings and development you can build complex structures like that with very simple genomes, and make drastic changes to the phenotype through minimal changes to the genotype. DrBot
Mung: I find the Golem project's conclusion as at Sept 3, 2001 [no updates since then] highly interesting:
The evolutionary process appears to be hitting a complexity barrier that is not traversable using direct mutation-selection processes, due to the exponential nature of the problem. We are now developing new theories about additional mechanisms that are necessary for the synthetic evolution of complex life forms.
In short, this is an empirical confirmation of the barriers posed by functionally specific complex information beyond a threshold, and irreducible complexity. The only known mechanism to routinely surmount such an exponential isolation barrier is intelligence. This supports the point that there are isolated islands of specific function in large config spaces, and that only specific organised Wicken wiring diagram arrangements of particular components will work. Ev of course works by being WITHIN such an island of function. Meyer's remark as cited by ENV is apt:
[Robert] Marks shows that despite claims to the contrary by their sometimes overly enthusiastic creators, algorithms such as Ev do not produce large amounts of functionally specified information "from scratch." Marks shows that, instead, such algorithms succeed in generating the information they seek by providing information about the desired outcome (the target) from the outset, or by adding information incrementally during the computer program's search for the target. ... In his critique of Ev as well as other evolutionary algorithms, Marks shows that each of these putatively successful simulations of undirected mutation and selection depends on several sources of active information. The Ev program, for example, uses active information by applying a filter to favor sequences with the general profile of a nucleotide binding site. And it uses active information in each iteration of its evaluation algorithm or fitness function. (Stephen C. Meyer, Signature in the Cell, pp. 284-285 (HarperOne, 2009).)
A fitness function -- if it is a continuous function -- is an expression with an implicit infinity of values, much as the Mandelbrot set I have used above shows: there is infinite detail lurking in a seemingly simple function and algorithm to test for degree of proximity to the set proper. Once you write the function and feed in coordinates to the algorithm -- intelligently designed I must add -- you can probe to infinite depth. If one then ads a warmer-colder hill climbing routine, he can create the illusion of information emerging from nothing. But all that is happening is that one is climbing the trends on a particular type of smoothly trendy landscape. The information to do that was fed into the fitness function and the associated algorithms to map and to do a warmer-colder climb. You will recall my part B thought exercise, to plug in the set proper as a black-hole of non-function, turning the fitness landscape into an atoll with a fractal border. Now, there are infinitely many fine grained points where one following a hill-climb will without warning drop off into non-function. The predictable result: once one has a functional pattern, one will incrementally improve, up to a point, then there will be a locked in peak enforced by what happens if one goes one step too far in an unpredictable direction, i.e one is locked into a highly specialised niche. Stasis and brittleness opening up room for sudden disappearance. Which sound very familiar. Ev is not creating information out of nowhere and nothing, and it is dependent on the particular pattern of a fitness function within an island, aided by targetting via hill climbing on a distance to target metric. At most ev models some aspects of micro-evo, which is not in dispute. The real issue is not adaptation of an already functional body plan, but origin of such body plans in the face of the complexity challenge the Golem project underscored. GEM of TKI PS: Were my remarks on info helpful? kairosfocus
EV Ware Dissection of a Digital Organism Answering Critics Who Promote Tom Schneider's ev Simulation Lee Spetner responds to Tom Schneider Mung
Darwinism holds that new genes can evolve blindly out of old genes by gene duplication, mutation and recombination under the pressure of natural selection. The strong version of panspermia holds that they cannot arise this way or any other way in a closed system. If a computer model could mimic the creation of new genes by the Darwinian method, it would establish that the process works in principle and strengthen the case for Darwinism in biology. Here we briefly discuss some evidence and arguments for the Darwinian mechanism and some for panspermia. Then we consider three well-known computer programs that undergo evolution, and one other proposal. None of them appears to create new genes. The question remains unanswered.
Can Computers Mimic Evolution? Mung
The golem@Home project has concluded. After accumulating several Million CPU hours on this project and reviewing many evolved creatures we have concluded that merely more CPU is not sufficient to evolve complexity: The evolutionary process appears to be hitting a complexity barrier that is not traversable using direct mutation-selection processes, due to the exponential nature of the problem. We are now developing new theories about additional mechanisms that are necessary for the synthetic evolution of complex life forms.
The Golem Project Mung
PS: In another sense, information reduces uncertainty about a situation, as in it tells you what is [or is likely] the case instead of what may be the case. kairosfocus
Mung: The issue is, what is information, not what is Shannon Info. If you look here [scroll down a tad to Fig A.1], you will see my favoured version of the comm system architecture (which is amenable to the layercake type model now so often used, e.g. for the Internet). Information comes from a source and is transferred to a sink, through encoding and/or modulation, transmitter, a channel and a receiver demod and/or decoder. Noise affects the process, and there is usually a feedback path of the same general architecture. Now, the key metric is suggested by Hartley, as I showed above, taking a neg log of the probability of signal elements from a set of possible elements; based on observed relative frequency as a measure of probability. That gives additivity to the measure [most easily seen in contexts where channel corruption is not an issue so what is sent is what is received]. So, for symbol mi, of probability of occurrence pi, Ii = - log pi (Schenider confuses himself here, by failing to understand that this is probably the dominant usage, and using a synonym to "correct" Dembski's usage.) It turns out that usually, some symbols are more likely than others, e.g. about 1 in 8 letters in typical English passages is an e. E therefore communicates less information than other, rarer letters. Since Shannon was especially interested in average flow rate of info in a channel, he developed the weighted average info per symbol metric, H: H = - [SUM on i] pi * log Pi This has several names, one of which is Shannon info. Another is uncertainty. We see why if we understand that for a flat random distribution of states for symbols in a string, the uncertainty of the value of each member is a maximum. A third is informational entropy, which turns out -- there were decades of debate on this but it seems to be settling out now -- is conceptually linked to thermodynamic entropy, and not just in the form of the math. It is "simply" the average info per symbol for a given set of symbols. It peaks when the symbols are equiprobable, and oddly -- but on the way the metric was defined -- that means that the average information per symbol of a flat random set of glyphs [fair die or coin] is maximum. Never mind it is meaningless. And if you move away from a flat random distribution, your average info per symbol (notice my use of this clearest term) will fall. Dembski et al say that the ev setup is a sparse 1 setup on the targets, so that will be the case if you move from a flat random initial value to a more biased final one. Uncertainty, AKA average info per symbol, will fall as you move to such a target, simply because the symbols are no longer equiprobable. On the other hand, if we have a message where a given symbol [say, s] MUST be present always and the other symbols are not possible, gives ZERO info, as there is no "surprise" or "uncertainty" on its reception. No meaningful info, and the neg log metric is also zero. Order [in this sense] is as free of info -- in the functional sense -- as is the meaninglessness of an absolutely random string. Our concern is the communication of meaningful symbol strings, that function in some relevant context. They are constrained to be in certain configs, from the set of all in principle possible configs of strings of the same length. But these configs are aperiodic, whilst not being random. Just like strings of ASCII characters in this paragraph. That is, they are functionally specific and exhibit neither randomness nor forced meaninglessly repetitive order but functional, complex, specific aperiodic, meaningful organisation. Thus, we easily arrive at the constraint that these messages will come from defined and confined zones in the space of possible configs. Islands of function, in light of rules of function and the purpose that constrains how the rules are used. Now, near as I can follow, Schneider's search strings start with arbitrary values then move in successive constrained random walk steps that are rewarded on warmer/colder, towards the targets. Where the targets are allowed to move around a bit too (but obviously not too much or warmer/colder trends would be meaningless). He is defining degree of function on degree of matching, and he is moving one set of strings towards the other by exploiting a nicely set up Hamming distance metric based trend and the warmer/colder hill-climbing principle, with a target that may move a bit but is in the same zone. I think he is speaking somewhat loosely of a gain of info as in effect moving to match. If he is actually starting with a flat random initial string, the average info per symbol value would be at a max already: sum pi log pi is at a peak for that. What he is apparently talking about is moving to a matched condition, not a gain in Shannon metric as such. (Are the target strings also set at random?) Shannon info is not the right metric for this, it is only part way there. Functional specificity of complex strings that have to be in target zones to work right, is what has to be captured. And the metrics of CSI do that, one way or another. This side-issue is so far from getting a DNA string to code for AA chains that will fold and function in the context of the cell that it is ridiculous. Here, the constraints are those of real world chemistry and physics, and the context of the nanomachines and molecules in the living cell. Finding the correct cluster of such to get a functioning cell is a real technological tour de force. And metrics that identify how hard it is to get to islands of function in large config spaces on unintelligent mechanisms tracing to chance and necessity, are a help in seeing that. But, if you are locked into the idea that regardless of odds, we have nice smooth fitness functions that tell us warmer colder from arbitrary initial configs in prebiotic soups [or the equivalent] we will be blind to this issue of islands of function. Similarly, if we fail to understand how much meaning has to be programmed in to get a novel body plan to unfold from a zygote through embryogensis. The real issue is that the evidence of our experience with technology, and the fossil record are both telling us that we are dealing with islands of function, but the darwinian narrative demands that there MUST have been a smoothly varying trend to life and from first lie to us and what we see around us. So far, so much the worse for inconvenient empirical evidence and analysis. That is why models on computers that work within islands of function are allowed to pass themselves off as more than a model of what is not in dispute, micro-evo. But, the narrative is falling apart, bit by bit. GEM of TKI kairosfocus
Thanks kairosfocus, I also think there's something hinky about Schneider's use of Shannon Information. He uses Shannon Information as his measure, almost bragging about it being the only valid measure. He apparently thinks his way is the only correct way. I noticed you have some coverage of Shannon Information on your web site, but I still don't have a real clear grasp. So what are some of the fundamentals of Shannon Information? Does it require a sender and a receiver? What else does it require? Is it even possible to have a "gain" in Shannon Information? Schneider assumes that the binding site starts with no information content because he starts with a randomly generated sequence of bases at the binding site. After a binding site has "evolved" to the point that it can be recognized, he then measures the information content (at the binding site - as the reduction in uncertainty) and subtracts his "before" and "after" to calculate his information "gain." But again, that's not how Shannon Information works, imo. With Shannon Information you can't get a gain in information. (And do you get a gain in information by a reduction in uncertainty?) Am I just way off base? I'll try to provide some relevant links and quotes in a followup posting. Mung
Mung: Thank you for the work you have done to document the actual way ev works. It has been important for us to all see that ev actually contains language that directly implies that there is targetted search involved, using a Hamming type distance metric in a warmer/colder oracle hill-climbing routine. GEM of TKI kairosfocus
btw, has anyone else noticed that MathGrrl has, in citing work such as Schneider's, conceded the argument? 1. Information can be mathematically defined. 2. The concept can be and has been applied to biological systems. Mung
Again, in Schneider's own words:
Repressors, polymerases, ribosomes and other macromolecules bind to specific nucleic acid sequences. They can find a binding site only if the sequence has a recognizable pattern. We define a measure of the information (Rsequence) in the sequence patterns at binding sites.
The Information Content of Binding Sites on Nucleotide Sequences
Recognizer a macromolecule which locates specific sites on nucleic acids. [includes repressors, activators, polymerases and ribosomes]
We present here a method for evaluating the information content of sites recognized by one kind of macromolecule.
No targets?
These measurements show that there is a subtle connection between the pattern at binding sites and the size of the genome and number of sites.
...the number of sites is approximately fixed by the physiological functions that have to be controlled by the recognizer.
Then we need to specify a set of locations that a recognizer protein has to bind to. That fixes the number of sites, again as in nature. We need to code the recognizer into the genome so that it can co-evolve with the binding sites. Then we need to apply random mutations and selection for finding the sites and against finding non-sites.
INTRODUCTION So earlier in this thread I accused MathGrrl of not having actually read the papers she cites. I think the case has sufficiently been made that that is in fact a real possibility. I suppose it's also possible that she reads but doesn't understand. MathGrrl, having dispensed with the question of targets in ev, can we now move on the the question of CSI in ev? Mung
Mung: Quite a bottomline on ev:
ev [is] a glorified version of Dawkins’ Weasel program . . . . instead of having a single final target string which each individual in the population is measured against (compared to), each individual in the population in ev has multiple targets (which are called binding sites), and the target strings at each “binding site” for each individual are independent of each other and independent of the target strings in the other individuals. So each individual in ev has more targets, but each one is shorter in length than the target in Weasel. It also has a shorter “alphabet” (ACGT). In addition, while the location of the target sites on each individual in the population are fixed, the actual target “letters” may be changed by a mutation . . . . we not only have multiple target “strings” per individual, each of which is capable of changing due to mutation, we also have the “receptor.” This is the “string” that we’re trying to get to match one of the target string. It also is different for each individual in the population . . . . The final twist is that the “receptor” also has a chance to be changed due to mutation. But it’s still string being compared to another string until we find a match, even with all the other fancy goings on. And oh, yeah, there’s that function that let’s us tell which strings are closer to the targets (fewer mistakes) and which ones are further away (more mistakes). Wipe out half the population each generation by replacing those with more mistakes by copies of those with fewer mistakes. Set up the right initial conditions and you’re bound to succeed.
MG et al have some further explaining to do. GEM of TKI kairosfocus
Elsewhere I described ev as a glorified version of Dawkins' Weasel program. Weasel has an initial population of strings which are mutated until one is found to match a single final target phrase. ev has an initial population of what are in effect strings. (Is the genotype the same as the phenotype, as in Weasel?) But instead of having a single final target string which each individual in the population is measured against (compared to), each individual in the population in ev has multiple targets (which are called binding sites), and the target strings at each "binding site" for each individual are independent of each other and independent of the target strings in the other individuals. So each individual in ev has more targets, but each one is shorter in length than the target in Weasel. It also has a shorter "alphabet" (ACGT). In addition, while the location of the target sites on each individual in the population are fixed, the actual target "letters" may be changed by a mutation. [Note that when a "good" individual is copied to replaces a "bad" individual the location of the binding sites in the "offspring" are not changed. So you now have an additional member in the population with the exact same binding site locations.] So now we not only have multiple target "strings" per individual, each of which is capable of changing due to mutation, we also have the "receptor." This is the "string" that we're trying to get to match one of the target string. It also is different for each individual in the population. [At least until we go through the first round of selection, at which time we again get an additional member in the population with the same "receptor" as another member.] The final twist is that the "receptor" also has a chance to be changed due to mutation. But it's still string being compared to another string until we find a match, even with all the other fancy goings on. And oh, yeah, there's that function that let's us tell which strings are closer to the targets (fewer mistakes) and which ones are further away (more mistakes). Wipe out half the population each generation by replacing those with more mistakes by copies of those with fewer mistakes. Set up the right initial conditions and you're bound to succeed. Mung
Mung: A measure of number of mistakes is a Hamming distance metric. Even without SHOWING Wiki the ducking stool for waterboarding, it coughs up:
In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. Put another way, it measures the minimum number of substitutions required to change one string into the other, or the number of errors [aka mistakes] that transformed one string into the other.
GEM of TKI kairosfocus
Mung: Let's get one of those Wiki admissions against interest:
In mathematics and computer science, an algorithm is an effective method expressed as a finite list[1] of well-defined instructions[2] for calculating a function.[3] Algorithms are used for calculation, data processing, and automated reasoning. Starting from an initial state and initial input (perhaps null),[4] the instructions describe a computation that, when executed, will proceed through a finite [5] number of well-defined successive states, eventually producing "output"[6] and terminating at a final ending state. [BTW, to stop an infinite loop we can force a termination under certain conditions, again a goal setting exercise.] The transition from one state to the next is not necessarily deterministic; some algorithms, known as randomized algorithms, incorporate random input.[7] A partial formalization of the concept began with attempts to solve the Entscheidungsproblem (the "decision problem") posed by David Hilbert in 1928. Subsequent formalizations were framed as attempts to define "effective calculability"[8] or "effective method";[9] those formalizations included the Gödel–Herbrand–Kleene recursive functions of 1930, 1934 and 1935, Alonzo Church's lambda calculus of 1936, Emil Post's "Formulation 1" of 1936, and Alan Turing's Turing machines of 1936–7 and 1939.
And, regarding GA's, the same confesses -- we never even had to show the thumbscrews [as in , MG you still have some serious 'splaining to do on your outrageous citation of eppur si muove above . . . ] -- as follows:
In a genetic algorithm, a population of strings (called chromosomes or the genotype of the genome), which encode candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem, evolves toward better solutions. Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible. The evolution usually starts from a population of randomly generated individuals and happens in generations. In each generation, the fitness of every individual in the population is evaluated, multiple individuals are stochastically selected from the current population (based on their fitness [and how is that set up and measured and controlled 5to have a nice trendy pattern, based on the coding? By monkeys at keyboards?]), and modified (recombined and possibly randomly mutated) to form a new population. The new population is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum number of generations has been produced, or a satisfactory fitness level has been reached for the population. If the algorithm has terminated due to a maximum number of generations, a satisfactory solution may or may not have been reached.
Digging in yet deeper -- and nope, we did not demonstrate the rack to see this:
In mathematics, computer science and economics, optimization, or mathematical programming, refers to choosing the best element from some set of available alternatives. In the simplest case, this means solving problems in which one seeks to minimize or maximize a real function by systematically choosing the values of real or integer variables from within an allowed set. This formulation, using a scalar, real-valued objective function, is probably the simplest example; the generalization of optimization theory and techniques to other formulations comprises a large area of applied mathematics. More generally, it means finding "best available" values of some objective function given a defined domain, including a variety of different types of objective functions and different types of domains.
The above is fairly riddled with constrained, goal-seeking behaviour, set up by an intelligent designer. GEM of TKI PS: The picture of a nice, trendy objective function here by Wiki is illustrative. Especially on what "hill-climbing" is about. kairosfocus
Let's look at some of the things ev can display:
Display control: the first 7 characters on the line control the kind of data printed to the list file: a = display average number of mistakes and the standard deviation for the population. c = display changes in the number of mistakes. The current Rsequence is given if r (below) is turned on. This allows graphs of Rsequence vs mistakes to be made. g = display genomic uncertainty, Hg. If this deviates much from 2.0, then the model is probably bad. i = display individuals' mistakes o = display orbits: information of individual sites is shown r = display information (Rsequence, bits) s = current status (range of mistakes) is printed to the output file. m = current status (range of mistakes) is printed to the list file.
Why this obsession with "mistakes"?
haltoncondition: char. This parameter (introduced [2006 June 24]) causes ev to halt when a given condition has occured. If the first character on the line is: - none - no halting condition r Rs>=Rf - the best creature has Rs at least equal to Rf m mistakes 0 - the best creature makes no mistakes b both r and m
Don't tell me this program has no targets. Mung
MathGrrl @169
If the paper you are touting claims that ev is a targeted search then it is wrong.
Hi MathGrrl, welcome back. Let's start with this claim about ev, which I find strange indeed. But perhaps I am just not understanding what you mean. GA's are by definition targeted searches. If a GA was not a targeted search it would perform no better than a random search, and thus there would be no point in using a GA. If ev is (or uses) a GA, ev uses a targeted search. Why is my argument not valid? Note: It doesn't matter what the specific target or targets are in ev, and not identifying them does not impact the validity of the argument I present.
A detailed description of how to solve a problem by first specifying the precise starting conditions and then how to follow a set of simple steps that lead to the final solution is known as an algorithm. An algorithm is characterized by: - a precise statement of the starting conditions, which are the inputs to the algorithm; - a specification of the final state of the algorithm, which is used to decide when the algorithm will terminate; - a detailed description of the individual steps, each of which is a simple and straightforward operation that will help move the algorithm towards its final state. Explorations in Computing
Frankly, as a "MathGrrl" I'd expect you to know this.
The Genetic Algorithm is an Adaptive Strategy and a Global Optimization technique. It is an Evolutionary Algorithm and belongs to the broader study of Evolutionary Computation.
The objective of the Genetic Algorithm is to maximize the payoff of candidate solutions in the population against a cost function from the problem domain. The strategy for the Genetic Algorithm is to repeatedly employ surrogates for the recombination and mutation genetic mechanisms on the population of candidate solutions, where the cost function (also known as objective or fitness function) applied to a decoded representation of a candidate governs the probabilistic contributions a given candidate solution can make to the subsequent generation of candidate solutions.
Listing (below) provides an example of the Genetic Algorithm implemented in the Ruby Programming Language. The demonstration problem is a maximizing binary optimization problem called OneMax that seeks a binary string of unity (all '1' bits). The objective function provides only an indication of the number of correct bits in a candidate string, not the positions of the correct bits.
Genetic Algorithm If you've read Schneider's ev paper you'll know why the text above is in bold. Mung
MathGrrl:
If the paper you are touting claims that ev is a targeted search then it is wrong.
Unfortunately you have to do more than just say it. Perhaps you can write a letter to the journal informing them of your objections. But all that is moot because you have been given a rigorously defined concept of CSI and you just refuse to grasp it. I say shame on you for that and shame on us for continuing to respond to an obvious waste of bandwidth... Joseph
F/N: For convenience, I excerpt the clip from 151 above, on how Information is quantitatively defined: ___________ >> 2 –> I turn to my trusty Princs of Comm Systems, 2nd edn, Taub and Schilling (McGraw Hill, 1986), p. 512, Sect. 13.2 (which follows my good old Connor, as cited and used in my always linked; cf as well Harry Robertson’s related development of thermodynamic Entropy in Statistical Thermophysics (PH, 1998), pp. 3 – 6, 7, 36, etc, as also cited and used):
Let us consider a communication system in which the allowable messages are m1, m2, . . ., with probabilities of occurrence p1, p2, . . . . Of course p1 + p2 + . . . = 1. Let the transmitter select message mk of probability pk; let us further assume that the receiver has correctly identified the message [My nb: i.e. the a posteriori probability in my online discussion is 1]. Then we shall say, by way of definition of the term information, that the system has communicated an amount of information Ik given by Ik = (def) log2 1/pk (13.2-1)
3 –> In short, Dembski’s use of the term “information” is precisely correct, though it differs from the terminology used by others. 4 –> Schneider should have checked (or should be more familiar with the field) before so dismissively correcting. >> ___________ Schneider had sought to "correct" Dembski for using this commonplace definition and quantification of information. He thereby inadvertently reveals his want of familiarity with common, well accepted and effective usage. And once we understand that the common metric of info in bits we commonly see is based on the usage cited from T & S [and could be from many other sources], the objection on want of adequate definition collapses. For, the log reduction of the Dembski metric shows that it is about identifying info on configs in identifiable target zones, that are sufficiently isolated that it is maximally implausible for them to be arrived at by chance plus necessity. Whilst, posts in this thread for a handy example [one that has been repeatedly used in responding to MG and which she has consistently ducked or brushed aside to make her favourite talking points], are demonstrations of how routinely intelligence is able to get to such islands of function in vast config spaces. Let us have done with this notion and talking point that CSI is ill defined and meaningless. GEM of TKI kairosfocus
MG:
There have been no such answers . . .
Pardon, but that is now plainly an empty declaration in light of the above in this thread and the OP. CSI has been adequately conceptually understood and described from the 1970's, and it has been adequately mathematically modelled for a decade or more. In the log reduction on basic rules of logs, the connexion from the Dembski chi metric to the observable world of information content of functioning systems is made, and the threshold type approach is justified on a needle in a haystack search challenge, on the gamut of our solar system or our observed cosmos. With the solar system scope metric in hand:
Chi_500 = Ip - 500, bits beyond the threshold
. . . we may directly insert the Durston et al results for information content of 35 protein families, as is now shown in point 11 of the OP:
RecA: 242 AA, 832 fits, Chi: 332 bits beyond SecY: 342 AA, 688 fits, Chi: 188 bits beyond Corona S2: 445 AA, 1285 fits, Chi: 785 bits beyond . . . results n7
Other applications once we have a reasonable estimate for information content, are plainly possible on the lines of this paradigmatic case. The link to the simple brute force X-metric, is also plain. Let's do a direct test: we know that we have cases of randomly generated valid text up to 20 - 25 ASCII characters [spaces of up to 10^50 or so possibilities]. Can you provide a case for at least 72 characters? 143? Let's ask:
a: What does that want of cases in point have to do with the Planck-time quantum state resources of our solar system and our observed cosmos, per the discussion in Abel's 2009 plausibility metric paper? b: How does this tie in with the statistical foundations for the second law of thermodynamics? c: And, BTW, do you observe the cited definition of INFORMATION as a metric from Taub and Schilling? d: Do you recognise that this is therefore a valid and sufficiently rigorous definition as is commonly used in telecomms work? e: Do you see that in the log reduced form that is precisely the definition of information used?
Before your remarks can have any further weight, you need to cogently respond to the issues summarised here, and especially here and here above (with a glance here also helpful). The sub thread on Schneider's ev from here at 126 would also require serious attention. Schneider's sense of vagueness regarding the CSI concept is self-induced. He plainly has not addressed the concept that a set of bits or the like can specify a config space of possibilities. In such a space, we may define a zone of interest T which, if sufficiently isolated, will be maximally hard to find on undirected chance plus necessity. Indeed, that is the obvious error in ev itself. For Schneider seems to have failed to realise that he has started within an island of function, and that he is proceeding on a nice trend and an evident metric of distance to target that yields a warmer/colder signal. Indeed, his graph of a ramp up to the target with hunting oscillations, is diagnostic [at least to one familiar with the behaviour of closed loop control systems]. What Schneider has provided is a model of what is not in dispute -- not even with modern Young Earth Creationists, i.e. relatively minor variations within a functioning body plan. He does not have a model that accounts for the origin of such body plans on undirected chance plus necessity. (His confusion of artificial selection with natural selection was telling. He has not realised that HE is the source of the crucial active information that explains better than random walk based search performance.) Indeed, he inadvertently provides a demonstration of the best, empirically warranted, explanation for arriving on such an island of function. Namely, design. GEM of TKI PS: Your attempt to push the ball back into Dr Torley's court after he has more than abundantly explained why he asks you to at least provide a summary, is telling. Especially for one who above confused a log reduction with a probability calculation. kairosfocus
kairosfocus and Mung, With respect to your discussions of ev, I think there are three points that you haven't addressed. The first is that Schneider, like myself and others, finds Dembski's concept of CSI mathematically vague. He has to make some assumptions about what it means in order to even begin to calculate it. This is why a rigorous mathematicall definition and detailed examples are so important. The second point is the discussion of Schneider's "horserace" to beat the UPB. You both make a big issue about Schneider tweaking the parameters of the simulation, population size and mutation rate in particular, but you don't discuss the fact that, once the parameters are set, a small subset of known evolutionary mechanisms does generate Shannon information. This goes back to my discussion with gpuccio on Mark Frank's blog where we touched on the ability of evolutionary mechanisms to result in populations that are better suited to their environment than were their parent populations. That, in turn, suggests that, while it might be possible to make a case for cosmological ID, there is no need to posit the involvement of intelligent agency in biology. The third point is that, despite a lot of discussion about ev, neither of you have provided a detailed calculation of CSI for my ev scenario. This was a particularly interesting topic in my discussion with gpuccio in its impact on his calculations. I would be very interested in reading what you think of that part of the thread. MathGrrl
Mung,
I have my doubts as to whether ev even qualifies as a genetic algorithm, I’ll need to do some more reading. So what’s missing? Crossover.
ev implements a very simple subset of known evolutionary mechanisms, not including crossover. That doesn't mean it's not a GA. Interestingly, even using such a small subset of known evolutionary mechanisms, ev still demonstrates the same behavior that Schneider researched for his PhD thesis. MathGrrl
vjtorley,
If you want me to answer your questions, I’m afraid you’ll have to (a) explain the cases you’re describing, in non-technical language (i.e. something a bright 12-year-old could grasp), and (b) explain why you think they pose a threat to Professor Dembski’s concept of complex specified information, in a summary of two or three pages at most.
I'm happy to explain any of the scenarios that you feel are too underspecified. Rather than guessing at your points of confusion, could you please explain where you find the jargon too thick? I would also note that I'm not claiming that my scenarios "pose a threat" to Dembski's CSI metric, I'm saying that I have not seen a mathematically rigorous definition of that metric nor any examples of how to calculate it for scenarios such as those I present. MathGrrl
Joseph,
You forgot one main point R0bb- ev is a targeted search, which means it is an irrelevant example. MathGrrl:
This is not correct.
Yes it is and I have provided the peer-reviewed paper that exposes it as such.
The nice thing about science is that it is objective. People can look at the empirical evidence and determine whether or not it supports a claim. If the paper you are touting claims that ev is a targeted search then it is wrong. If you disagree, please refer to the ev paper to identify the target of the search. You may want to read Schneider's PhD thesis for background information. MathGrrl
Mung,
You are boringly repetitious.
I assure you that when an ID proponent presents a rigorous mathematical definition of CSI and demonstrates how to calculate it for my four scenarios, I'll stop repeating my request. MathGrrl
kairosfocus,
No one there [in my guest thread] was able to present a rigorous mathematical definition of CSI based on Dembski’s description. If you can, please do so and demonstrate how to calculate it for the four scenarios I describe there.
Pardon directness: this is the proverbial stuck record, repeating long since reasonably and credibly answered demands, without any responsiveness to the fact that they have been answered, again and again.
There have been no such answers. Not even ID proponents can provide a rigorous mathematical definition that is consistent with Dembski's description of CSI nor has anyone here thus far provided detailed example calculations for my four scenarios. As noted previously, you provide no basis for the numbers used in your calculations. If you are willing and able to demonstrate exactly how you arrived at your numbers, in the context of a rigorous mathematical definition of CSI, I would be delighted to then apply that objective metric to other systems. MathGrrl
kairosfocus,
The issue at the heart of the CSI/FSCI challenge is to arrive at the shores of such islands of function from arbitrary initial points in config spaces.
No, the issue is that ID proponents make the claim that CSI is a clear indicator of the involvement of intelligent agency but cannot define the metric rigorously or show how to objectively calculate it for real world scenarios. That means that their claims are unsupported. MathGrrl
kairosfocus,
Please provide references to where I have done so . . .
Kindly cf 44 ff above, for my earlier responses; which relate to your 36 ff.
I still see no mathematically rigorous definition of CSI nor any detailed calculations for my four scenarios. As previously noted, you do not explain how you arrive at your numbers used in post 44. The determination of how many bits of CSI are present is exactly the question being discussed. I would be interested in seeing your more detailed explanation. MathGrrl
My dear interlocutors, I apologize for disappearing for the past week; real world responsibilities intervened. I am attempting to continue the discussion in the two most active child threads of my original guest post. I hope you'll continue as well. MathGrrl
PS: How negative feedback control works. kairosfocus
Mung: I find Schneider's web pages -- pardon* --often very hard to wade through; too busy and disorganised. (That is why I tend to use clips.) ___________________
*Pardon, again: I suggest instead layout as an article, with an index near the top of the page, and perhaps use of text boxes. Or, use of unequal width columns similar to a blog page.
Let's slice up his highlight:
Replication, mutation and selection are necessary and sufficient for information gain to occur. This process is called evolution.
Replication in ev requires a huge, fine tuned, multi-component background algorithmic process that is not only intelligently designed but controlled and protected from chance variation, If not, the process would break down rapidly. Thus, we see tha the variation in view is tightly controlled within an island of fine tuned, complex and specific function. Which is: intelligently designed. "Evolution" on intelligent design . . . and as a part of the design. (As in Wallace's Intelligent Evolution.) Is that what Schneider really wishes to demonstrate or acknowledges/claims demonstrating? Next, the variation and selection are plainly within an island of defined function and are crucially dependent on nice trends of performance and a warmer-colder metric, as can be seen from what happens in the graph at the top of his page when the selection filter is turned off, i.e the system wanders away from the target. That sort of ramp and hold vs wander away is a characteristic signature of a negative feedback controlled process: set target point, adjust plant, test o/p relative to target, adjust process towards target, compare fed back performance, adjust plant on differential, hold as differential falls to zero; or, at least hold within the hunting oscillations. That is, there is a targetting here on a warmer-colder metric. A feature of feedback control. And judging by noisiness, ev lacks damping and so is prone to oscillations and breakout of control. Which is what the tweaking in 126 is speaking of. In effect a Hamming digital distance to target oracle of some form is at work, just as Dembski et al have pointed out. And yes, there is a measure of performance that is trending as the ramp part of the graph shows, i.e there is a fitness function of some type, or better, a FIT-to target function. So, the fundamental problem with the genome variation and we started with a random genome claim is what it conceals (probably inadvertently, I think there is a sincerity here but one that is blind what it is not inclined to see):
1: In each instance, you are searching well within a match of 500 bits, which is within the search limit that the design inference accepts. 2: That you are able to search and match implies there are designed algors offstage doing the real work. 3: You have a perceptron that biases you to sparse 1 codes, in a context where that is what you need, i.e you are loading the dice heavily. (If you need 6's and the dice are loaded so you get 6's 80% of the time, you have shifted the uncertainties of dice tossing dramatically.) 4: You start within an island of function, as in effect every "random" index value will have a measurable function. In reality by far and away most of the in principle possible configs of genomes -- that START at 100+ k bits (notice how we never see GA's that start at that sort of level!) are decidedly non-functional. And in fact your program is a large one and most of the bits in its informational strings are NOT allowed to vary at random. 5: Your fitness metric implies a conveniently nice trend-y response on the underlying config possibilities, so you can build in the algor that says climb uphill (rewarding increments in warmer/colder) and get where you want. [Contrast my black hole variant on using the Mandelbrot set as a fitness function: if trends are not nice,then the whole model fails. What gives you a right to assume/construct a nice trend?) 5: As your documentation at 126 shows, several parameters have to be set just right -- fine tuned -- for the components to work together to get the desired results. 6: To get that fine tuning, Schneider, obviously, was in effect running life over and over again, until things worked out. As in, where do you have the planets and sub-cosmi to run life over and over to get the one that is just right? And, how did you know how to get the range of variation and probability distribution that would set up a population to hit just the right peak?
In that context, if one narrows focus to the genome and its move from random to matched, well it looks like you have got functional info on a free lunch. Indeed, that's why he was crowing on how NATURAL selection was beating the UPB in an afternoon. But, once one pulls back the tight focus, one sees that a lot of intelligently directed things were going on offstage that were critical to getting that performance. Intelligent things that amount to serious dice loading and capture of good results, discarding the bad ones. At most, ev gives a picture of some of how micro evo -- which is not in dispute -- allows variations to fit niches. This is comparable to how bacteria put in nutrient mixes with unusual sugars or the like, may well adapt to eat the new stuff. But these pay a fitness cost because as a rule something got broken to get the enzyme[s] required to break up and use the new sugar. This has nothing to do with explaining where the bacterium with all its integrated functionality, came from. Including the set of enzymes that were so set up that fiddling a bit would allow the organism to survive on unusual nutrients. There is no free lunch, and Schneider is evidently distracting himself from his own contributions to his results that -- to ft his preconceptions -- he ascribes to "natural" selection. GEM of TKI kairosfocus
Does ev qualify as an EA? Mung
Hi kairosfocus, What do you think of Schneider's use of Shannon Information? http://www.ccrnp.ncifcrf.gov/~toms/paper/ev/ I'm suspicious of how he decides that there's been a reduction in uncertainty at the binding site and therefore an increase in Shannon information. It also occurred to me today that he creates 64 organisms and each is generated with a random genome. So while he should be starting off on an island, he really isn't! He's maximizing his chances to search different locations in the space. Mung
F/N: The onward discussion by No-Man here, and my suggestions here in that thread, seem to be relevant. I clip the latter: _________ >> F/N: Applying a modified Chi-metric: I nominate a modded, log-reduced Chi metric for plausible thresholds of inferring sufficient complexity AND specificity for inferring to design as best explanation on a relevant gamut:
(a) Chi’_500 = Ip*S – 500, bits beyond the solar system threshold (b) Chi’_1000 = Ip*S – 1,000, bits beyond the observed cosmos threshold
. . . where Ip is a measure of explicitly or implicitly stored information in the entity and S is a dummy variable taking 1/0 according as [functional] specificity is plausibly inferred on relevant data. [This blends in the trick used in the simplistic, brute force X-metric mentioned in the just linked.] 500 and 1,000 bits are swamping thresholds for solar system and cosmological scales. For the latter, we are looking at the number of Planck time quantum states of the observed cosmos being 1 in 10^150 of the implied config space of 1,000 bits. For a solar system with ours as a yardstick, 10^102 Q-states would be an upper limit, and 10^150 or so possibilities for 500 bits would swamp it by 48 orders of magnitude. (Remember, the fastest chemical interactions take about 10^30 Planck time states and organic reactions tend to be much, much slower than that.) So, the reduced Dembski metric can be further modified to incorporate the judgement of specificity, and non-specificity would lock out being able to surpass the threshold of complex specificity. I submit that a code-based function beyond 1,000 bits, where codes are reasonably specific, would classify. Protein functional fold-ability constraints would classify on the sort of evidence often seen. Functionality based on Wicken wiring diagram organised parts that would be vulnerable to perturbation would also qualify, once the description list of nodes, arcs and interfaces would exceed the the relevant thresholds. [In short, I am here alluding to how we reduce and represent a circuit or system drawing or process logic flowchart in a set of suitably structured strings.] So, some quantification is perhaps not so far away as might at first be thought. Your thoughts? >> __________ GEM of TKI kairosfocus
MG (et al): Still waiting . . . GEM of TKI kairosfocus
MG (et al and Graham): If you are still monitoring this thread on the significance and credibility of CSI as a properly scientifically grounded metric pointing to design as the best explanation for what is sufficiently complex and specific, it is open for a response. Please note the guide to the thread (including answers to your main 4 q's and the second string of q's, a response to your meaningless claim, and a response to Schneider's ev and claims on CSI) at 1 above. G'day GEM of TKI kairosfocus
Mung: The random walk backed up by a lawlike filter is not incapable of CSI by DEFINITION, but by being overwhelmed by the needle in the haystack challenge; an analysis backed up by observations. That is why Dembski went out of his way to identify and define a lower limit to number of bits whereby beyond this level, zones of interest are so isolated that it is not credible to land on the zone within the available search resources, UNLESS one is using active information. Such active information includes oracles that attract through things like warmer/colder signals, and the like. Intelligence routinely arrives at such zones of interest beyond such isolation thresholds, but in so doing, it is precisely not using a blind search backed by trial and error tests. E.g. consider posts in this thread. In the OP this thread, we saw that the Dembski type Chi metric can be reduced to exactly the bits beyond a reasonable threshold metric deduced: Chi_500 = Ip - 500, bits beyond Let us observe how MG, plainly coming from Schneider's perspective, has been unable to address its significance. She even managed to confuse a log reduction with a probability calculation. And if she has been taught that it is INCORRECT to use the Hartely-suggested log metric for information, then the confusion is maximised. (And of course only dumb IDiots and Creationists do that . . . ) Here is Wiki on self-information, as a simple clarification:
the self-information I(wn) associated with outcome ?n with probability P(wn) is: I(wn) = log (1/P(wn)) = - log (P(wn)) . . . . This measure has also been called surprisal, as it represents the "surprise" of seeing the outcome (a highly improbable outcome is very surprising). This term was coined by Myron Tribus in his 1961 book Thermostatics and Thermodynamics. The information entropy of a random event is the expected value of its self-information.
That should help clarify. GEM of TKI kairosfocus
In NFL, Dembski begins discussion of ev in Chapter 4 Section 9, "Following the Information Trail." But prior to that, in the same chapter, he writes:
HI MOM!
Wait, that's not what he wrote. How did that get there? Here's the actual quotes:
Technically, Dawkins's target sequence is not long enough fir its probability to fall below the 1 in 10^150 universal probability bound or correspondingly for its complexity to to surpass the 500-bit universal complexity bound. Dawkins's target sequence therefore does not qualify as complex specified information in the strict sense - see sections 2.8 and 3.9. Nonetheless, for practical purposes the complexity is sufficient to illustrate specified complexity.)
In general, then, evolutionary algorithms generate not true specified complexity but at best the appearance of specified complexity.
So if Dembski has anything to say about the ability of ev to generate CSI it should certainly be understood in the context of what he wrote earlier in the chapter. It looks like Schneider went straight to the section on ev and thus failed to understand it in context. That said, Dembski has in one fell swoop dispensed with 75% of MathGrrl's scenarios, unless she wants to argue that they are not EA's. That said, with one of four possible bases, what is the minimum number of sites required to encode information the UPB and the universal complexity bound? iirc, the genome length of an ev organism was 256 sites, but each individual binding site is only 16 (16x16=256). Mung
Instead of driving down to my storage unit to pull out my copy of NFL, I just bought the Kindle version. Go Kindle!
Also, in No Free Lunch, Dembski asked where the “CSI” came from in Ev runs (p. 212 and following). So Ev creates “CSI”.
Dembski writes the following:
To see that the Darwinian mechanism is incapable of generating specified complexity, it is necessary to consider the mathematical underpinnings of that mechanism, to wit, evolutionary algorithms. By an evolutionary algorithm I mean any well-defined mathematical procedure that generates contingency via some chance process and then sifts it via some law-like process.
So it's pretty clear where, if there is any CSI, it does not come from. Mung
Schneider should have checked (or should be more familiar with the field) before so dismissively correcting.
The opinion I'm developing of Schneider from reading his online postings is that he just expects to be believed because he is, after all, just exposing the creationists as the frauds they are. He certainly doesn't seem to take Dembski seriously or even consider that he might be doing original work, even though he has what, three doctorates? ( How his wife stands him I don't know ;) ) Mung
28 --> Were these Schneider's dismissed "subjective" specifications, useless in guiding the selection of available components and the creation of a reasonably successfully configured system? 29 --> Or, later on when I spotted how to take juice bottle caps, cellulose sponges, and Al mains cable lying on the ground after a hurricane (Hugo) to make soldering iron stands for effectively zero cost, was that subjectivity pretty useless, or the specificity of configuration meaningless? 30 --> Or is my son's current exercise to convert some card lying around into a version of Kreigspeil, useless? Later on he tangles it up with specificity, which is a terrible term: 31 --> More inappropriately denigratory dismissal Biological specification always refers to function. An organism is a functional system comprising many functional subsystems. In virtue of their function these systems embody patterns that are objectively given and can be identified independently of the systems that embody them. Hence these systems are specified ... (page 148) 32 --> Compare Schneider's clip with mine in the OP, to see how this has been twisted by extraction from context:
148: “The great myth of contemporary evolutionary biology is that the information needed to explain complex biological structures can be purchased without intelligence. My aim throughout this book is to dispel that myth . . . . Eigen and his colleagues must have something else in mind besides information simpliciter when they describe the origin of information as the central problem of biology. I submit that what they have in mind is specified complexity, or what equivalently we have been calling in this Chapter Complex Specified information or CSI . . . . Biological specification always refers to function . . . In virtue of their function [a living organism's subsystems] embody patterns that are objectively given and can be identified independently of the systems that embody them. Hence these systems are specified in the sense required by the complexity-specificity criterion . . . the specification can be cashed out in any number of ways . . ."
33 --> In other words, the issue is the Wicken wiring diagram, whereby specific items are organised in specific ways to fulfill a function. And Wicken has been on record since 1979:
‘Organized’ systems are to be carefully distinguished from ‘ordered’ systems. Neither kind of system is ‘random,’ but whereas ordered systems are generated according to simple algorithms [[i.e. “simple” force laws acting on objects starting from arbitrary and common- place initial conditions] and therefore lack complexity, organized systems must be assembled element by element according to an [[originally . . . ] external ‘wiring diagram’ with a high information content . . . Organization, then, is functional complexity and carries information. It is non-random by design or by selection, rather than by the a priori necessity of crystallographic ‘order.’ [[“The Generation of Complexity in Evolution: A Thermodynamic and Information-Theoretical Discussion,” Journal of Theoretical Biology, 77 (April 1979): p. 353, of pp. 349-65.]
So I'll take it that if one can make a significant sequence logo from a set of sequences, then that pattern is 'specified'. Clearly Dembski would want these to fall under his roof because they represent the natural binding patterns of proteins on DNA. 34 --> this of course points straight back to the problem already dealt with whereby Schneider, having designed the system and how it operates, imagines that the ev program is a faithful representation of undirected natural selection, so he csn infer form his intelligently designed system that undirected natural processes can find isolated islands of function [he starts on an island of function] and so spontaneously create CSI out of lucky noise. (Note: the concept of "specified" is the point where Dembski injects the intelligent agent that he later "discovers" to be design! 35 --> Willful strawman misrepresentation. Schneider knows, or -- on easily accessible corrective materials -- should know better. 36 --> The analytical issue Schneider simply will not engage is the attempt to land on zones/islands of interest that are deeply isolated in large config spaces by undirected chance and necessity. 37 --> That search space challenge is central to Dembski's work, and it is central to the statistical foundation of the second law of thermodynamics: statistical miracles are not to be relied on, and some things are so remote that they are not credibly observable on the gamut of our observed cosmos, by undirected chance and necessity. 38 --> that is why Dembski keeps on highlighting threshold metrics as have been elaborated int eh OP, and reduced to the thresholds above. Those thresholds are reasonable scales for config spaces where blind chance random walks and trial and error become utterly implausible as explanations. 39 --> For reasons Abel explained in his 2009 paper. This makes the whole argument circular. 40 --> Wrong, and a denigratory dismissal Dembski wants "CSI" rather than a precise measure such as Shannon information because that gets the intelligent agent in. 41 --> Pummelling the strawman. Instead look at the reason why zones of interest in config spaces are significant, and seek to understand why the needle in a haystack challenge is a challenge. If he detects "CSI", then by his definition he automatically gets an intelligent agent. 42 --> How rhetorically convenient it is to dismiss an inference to best empirically anchored explanation across known causal factors -- chance, necessity, art -- as a circular a priori assumption. 43 --> This is also an atmosphere-clouding and poisoning turnabout false accusation as there is documented proof of a priori materialism censoring origins science. The error is in presuming a priori that the information must be generated by an intelligent agent.) 44 --> In fact the real error is in your presumption of materialism a la Lewontin, and projecting unto Dembski the same error. In fact an inference to best explanation on the known facts of cause and patterns of empirical signs is routinely used to distinguish chance, necessity and agency as causes.>> __________________ Professor Schneider, plainly, needs to re-assess his approach and analysis. GEM of TKI kairosfocus
Mung: On following your link to Schneider, I clipped this, which I am going to mark up on points, as rhetorical games of a very familiar ilk -- guess where MG got her talking points -- are being played: _______________ >>On page 127 Dembski introduces the information as I(E) = def - log2 P(E), where P(E) is the probability of event E. 1 --> This is actually following a standard and well-accepted professional Eng'g usage of the term and its quantification, following Hartley's suggestion and Shannon's work. This needs to be established, to expose what follows. 2 --> I turn to my trusty Princs of Comm Systems, 2nd edn, Taub and Schilling (McGraw Hill, 1986), p. 512, Sect. 13.2 (which follows my good old Connor, as cited and used in my always linked; cf as well Harry Robertson's related development of thermodynamic Entropy in Statistical Thermophysics (PH, 1998), pp. 3 - 6, 7, 36, etc, as also cited and used):
Let us consider a communication system in which the allowable messages are m1, m2, . . ., with probabilities of occurrence p1, p2, . . . . Of course p1 + p2 + . . . = 1. Let the transmitter select message mk of probability pk; let us further assume that the receiver has correctly identified the message [My nb: i.e. the a posteriori probability in my online discussion is 1]. Then we shall say, by way of definition of the term information, that the system has communicated an amount of information Ik given by Ik = (def) log2 1/pk (13.2-1)
3 --> In short, Dembski's use of the term "information" is precisely correct, though it differs from the terminology used by others. 4 --> Schneider should have checked (or should be more familiar with the field) before so dismissively correcting. Actually this is the surprisal introduced by Tribus in his 1961 book on thermodynamics. 5 --> Improperly dismissive Be that as it may, it is only a short step from this to Shannon's uncertainty (which he cites on page 131) 6 --> H = - [SUM on i] pi* log pi, i.e the weighted average information per symbol and from there to Shannon's information measure, so it is reasonable to use the significance of Shannon's information for determining complexity . . . 7 --> Here we see that he is ducking the key point in Shannon's work, that Shannon did not address the content, definiteness or meaningfulness of the information, as his interest was in things like the carrying capacity of lines for teletype or the like. 8 --> Dembski is precisely concerned with such, and so are Durston et al as excerpted in the OP, point 10. Thus their distinguishing of the ground state from the functional state and their addressing of a metric of increment on so moving based on Shannon's H. 9 --> Nor is this distinction immaterial or erroneous. If one is interested in the functionality of specific configurations in a message [text vs gibberish for instance], it is a reasonable step to set out to identify it and how it may be measured. What is "Specified"? This seems to be that the 'event' has a specific pattern. 10 --> As can be seen in the clip from pp 144 and 148 in NFL, Dembski has a very specific distinction in mind,namely that he particular event in question comes from a zone of interest in a config space of the possible states of the relevant info-storing entity. 11 --> So, if one observes a particular config E but any of a set of configs including E, T would do just as well, we need to address not the likelihood of getting E but of landing anywhere in T. 12 --> T being the zone of interest in the config space, the relative proportion of the state being taken up by T being a pretty good first indicator of how specific it is relative to other plausibly possible configs. 13 --> It is precisely at this point that Schneider goes off the rails, as he plainly (for years) cannot or will not see the significance of a restricted zone of interest in a wider config space. In the book [NFL] he runs around with a lot of vague definitions. 14 --> Resort to sophomoric belittling characterisation of what one does not understand or accept. 15 --> For instance in the case that Dembski discusses in his 2005 paper, he picks the case from Dan Brown's da Vinci code, where the protagonists must enter the correct code for a bank vault the first time or they will be locked out. This is a singleton set, and the specification is two fold: (i) the right code, and (ii) no mistakes and re-tries permitted. 16 --> In the event, a clue had to be de-scrambled to form the first several members of the Fibonacci sequence, and we see the functional config E being specific to a one shot correct code for the vault. 17 --> this explanatory discussion to help those with difficulties understanding that specifications relate to tight zones of interest in large config spaces, has been online at Dembski's site since 2005 or so. So, no critic in 2011 should be psoting informaiton that does not reckon with this, if s/he is at all concerned to be fair minded. 18 --> But in fact, such clustering of configurations in zones of interest is quite familiar, and fairly easy to understand. So familiar that we have a common phrase for the point that is to be made: searching for a needle in a haystack. (A term that Dembski uses.) "Specification depends on the knowledge of subjects. 19 --> But of course, design specs set out the desired configs of a working system, before it is implemented on the ground. 20 --> In fact, as any experienced designer will confirm, getting the design specs right [and acceptable to the relevant stakeholders] is a key part of a good design job, and it is on the right specs that identification of components/items and their configurations to achieve a target performance are undertaken. Is specification therefore subjective? Yes." Page 66 21 --> Subjective has two meanings, and if one equivocates, then one may twist the proper sense. It is subjects who set designs or who describe the characteristics of zones of interest in possible config spaces. No meaning-understanding or expressing subject and no specification of significance. That means that if only if Dembski says something is specified, it is. 22 --> Off the rails. And demeaningly abusive. 23 --> On the contrary, specifications are very useful indeed, and can be pretty objective. Just look at the drawings for a house, or a car engine or a circuit diagram etc. That's pretty useless of course. 24 --> Contemptuously abusive and dismissive, where in fact it is Schneider who plainly is either unfamiliar with the business of design -- has he ever built a house? -- or is being willfully deceptive of those who look to him for intellectual leadership. 25 --> In either case, his whole argument is dead at this point. 26 --> FYI professor Schneider and MG, there is an excellent reason why software like Auto-cad is so much in demand for engineers. And, when I used to have to troubleshoot [and build] scientific instrumentation for a living, almost the first thing I wanted was the system and circuit etc drawings. 27 --> When I had to design and build them, I needed specifications to guide what was to be done, based on what was technically possible and reasonably affordable.
(A 0.1 deg C resolving thermometer of less than 1 - 2 mm dameter for inserting into the shells of living oysters comes to mind, for the implication was we could not use thermocouple probes; which would poison the critters, wild oysters being studied towards oyster farming. I ended up using glass encapsulated thermistor probes sealed into a plastic pen body, and using a calibration curve to linearise a VCO output used as a single ramp A/D. Precision turned out to be of order 0.01 C. Did the job, within available resources at a time of austerity. It could detect a drop of ice cold water dropped into a beaker.)
[ . . . ] kairosfocus
My own CSI Challenge Schneider presents Two cases of "Complex Specific Information" If the binding sites produced by ev happened to match the binding sites from the Hengen paper, and the sequence logos matched each other, would he have been surprised? If so, why? Would we be warranted in inferring that someone had fiddled with the results? Why? Mung
F/N 2: To get an idea of how hard it is to land in the border zone on non-blue and non-black space -- growing space -- call up the M-brot applet here, head for sea horse valley [between body and head] with your mouse cross-hairs, and try to pick the border zone by hand. You will find it much easier to end in the sea or the black hole. And, that is where you are close! (Notice, too how the sea in this implementation gives a gradient towards the functional edge zone. Think of how a slope sensor algorithm will be able to hill climb towards the border zone regardless of the fact that in the sea there is no function to reward.) I am beginning to think that a bit of experimentation with an algorithm like this will give a clearer idea of what is going on than essays of many words. So, have some fun . . . kairosfocus
F/N, re 145:
Sch: Also, in No Free Lunch, Dembski asked where the “CSI” came from in Ev runs (p. 212 and following). So Ev creates “CSI”.
Schneider is here playing at strawmannising Dembski and knocking over the convenient strawman. This comes out in how we just saw how the ev actually "creates" information. That is it is the on-stage proxy for the designer of the program. GIGO. The pumelling of the strawman:
Was it, as Dembski suggests in No Free Lunch (page 217), that I programmed Ev to make selections based on mistakes? Again, no. Ev makes those selections independently of me. Fortunately I do not need to sit there providing the selection at every step! The decisions are made given the current state, which in turn depends on the mutations which come from the random number generator. I personally never even see those decisions.
So, who designed the algorithm, set up the decision nodes and branch programs? The loops? Try two options:
A: ev created itself out of lucky noise? B: Schneider wrote ev?
No prizes for guessing which is correct. So, even though Schneider was not specifically present to decide, he programmed the decision for his mechanical proxy, the computer. An idiot robot that -- notoriously -- will do exactly what you tell it regardless of consequences. So, when he caricatures and pummels Dembski as though Dembski suggested that Schneider MANUALLY made every uphill climbing decision form zero function, that is something Schneider knew -- or full well SHOULD have known -- was a demeaning and disrespectful strawman misrepresentation. That mean-spirited joke at Dembski's expense also admirably serves to distract attention from the very real behind the scenes strings leading from the idiot robot with the GIGO problem responsible for the existence of the software industry, to the creator of the program who set up the sort of hill-climbing from a zero base we have seen. FYI, every strawman argument is automatically a distractive red herring. Just it adds the entertainment factor of a strawman about to be beaten up or soaked in ad hominems and burned after being beaten up. And, the notion that ev is actually making DECISIONS is the worst kind of anthropomorphising of that idiot robot, the computer. GEM of TKI kairosfocus
Mung and Ilion: Ah, but you see in Schneider's thinking, ev only simulates what he thinks real evolution does -- it is all based on his work with real organisms [on MICRO-evo of already functioning organisms!] -- so the work to get the program set up right is all of no account. Don't watch that curtain, the strings going up from the dancing ventriloquist's puppets, the smoke, the mirrors the trap doors under the stage etc! It's magic [natural selection . . . ]. That is why Schneider on his racehorse page thought that the tweaked program showed how natural selection beat the Dembski bound on an afternoon. It never dawned on him that THE LIMIT IS A PHYSICAL ONE -- NUMBER OF PLANCK TIME QUANTUM STATES OF 10^80 ATOMS ACROSS THE THERMODYNAMIC LIFESPAN OF THE COSMOS OR THE SOLAR SYSTEM. Where, 10^20 P-times are needed to carry out the fastest nuclear, strong force interactions. So, if you are beating the performance of random search before such a limit, it is because you have intelligently injected active information. But if Dembski et al are IDiots, their opinions are of no weight. Never mind, that the same basic concepts are foundational to the statistical basis for the second law of thermodynamics, and for decades have pointed to 1 in 10^50 as more or less an upper limit to plausibly observable events in a lab scale setting. (A limit that not so coincidentally is what the random text search strings are running up to, and the tierra program has hit.) Config spaces double for every additional bit, and if you take 200 bits as a reasonable upper lab observability threshold, 398 bits or 500 bits are 200 - 300 doublings beyond that. My own preferred 1,000 bit limit is 800 doublings beyond that. If you are finding needles in haystacks that big, it is because you know where to look. No ifs, ands or buts about it. And, in the real case of cell based life you are beyond about 200 k bits of functional information. that is two hundred TIMES the number of doublings to get to 1,000 bits. When you look at the integrated functional specificity involved, to get a metabolising entity with a stored data von Neumann self-replicator, with step by step regulatory controls and complex organised molecular nanomachines, the conclusions are pretty obvious. Save, to those utterly determined -- never mind the absurdities we just saw exposed -- to NOT see them. I guess I was about 5 or 6 when my mom taught me that there is none so blind as s/he who refuses to -- WILL not -- see. Let's spell out a few points: 1 --> Ev is a software program run on a computer, using computer languages to specify code in execution of a designed program that takes in and stores inputs, manipulates according to a process logic [an algorithm] and generates outputs. 2 --> In that process, there are step by step sequences taken, there are decision points, there are branches. All, fed in and tweaked to work by the designer. 3 --> Opportunity and means and motive to get it "right." 4 --> GA's are based on setting a so-called fitness landscape that is like the heights of the points on a contour map, sitting on the underlying base plane of some config space or other of possibilities. 5 --> That map is based on a function that tends to have nice trends of slope pointing conveniently uphill. (There are many, many functions that do not have reliable trends like that -- think of cliffs that drop you into sinkholes too deep and broad to climb back out of, sort of like a ring island with a huge sunken volcano caldera in the middle. So, to specify such a nice function is a major intelligent intervention.) 6 --> Then, you have to have controlled random change in a genome that conveniently does not rapidly accumulate to damage the key functions that allow you to climb the hill. 7 --> Which hill-climbing algorithms are themselves very intelligently designed sand programmed to push upslope to the heights that you "know" are there. 8 --> Let's take back up the Mandelbrot set example looked at above. Let the central black space be the caldera of doom. (Cf here for a discussion that discusses the zone of the plane in which the set is found, including equations for the head-bulb and the main part of the cardioid body. The diag here will show that he familiar head runs out towards - 2 on the reals, the double lobe body is between 1 and - 1 on the imaginary axis, and does not quite go to + 0.5 on the reals, linking to the head at about - 0.7 on the real axis. This amazing infinity of points is in a box 3 wide by 2 high, with zero being about 1/2 a unit in from the RH end. The vastly wider sea of points beyond this zone is all flat blue on the usual rendering.) 9 --> Now, run your hill-climbing, noting that there are infinitely many points that will drop you dead, and there is a vast sea of non-function around. 10 --> Start out in that sea. Oops, you are a non starter. Try again. Oops, by overwhelming probability you keep on landing in the sea of non-function. And since the ring of possibilities there is going to be smallish, a random walk there is not going to allow your hill climbing to get started. 11 --> You are stuck and cannot climb. (Unless your hill-climber somehow knows the direction to the island of function and can pull you in, i.e there is a Hamming distance oracle that gives a warmer-colder signal. This was the tactic used by Weasel.) 12 --> See why it is highly significant to observe that these evolutionary algorithms start and work within an island of intelligently input function? 13 --> And/or that they use a Hamming oracle to pull you in towards the island of function? 14 --> Q: What do I mean by that?
A: Well, look at Schneider's graph for how ev hill climbs, noting that HE SAYS THAT THE INITIAL RUNS ARE AT EFFECTIVELY ZERO FUNCTION OR ZERO INFORMATION, more or less the same thing. So, how is something with minimal or flat zero function reproducing itself on differential success, enabling a hill-climb?
15 --> The logic of ev and similar EA's is already falling apart. 16 --> The answer is that there is a lot of implicit function that is providing the context in which ev can hill climb on minor differences in the targetting metric, aka the genome. 17 --> There is a Hamming oracle on distance to target, or something that is set up to pick up slight increments in proximity to target [on the in-built assumption that heading uphill moves you closer to home . . . ]. 18 --> The inclusion of such a targetting routine and assumption are doubtless excused as being how real evolution works, so you are simply modelling reality. 19 --> Now, let us remember our black hole in the heart of the Mandelbreot island, with infinitely many extensions to drop you off into non-performance in the hill-climbing zone. 20 --> Climb away from the shoreline. The variants that go out to sea are eliminated. Those that climb tend to do so until they hit a black hole. 21 --> But, while they are running,the variability part of the entity is a small fraction. So, we have sudden appearance, stasis with minor adaptive variations, and eventual sudden disappearance. 22 --> A model that is about micro-evo of already existing functional forms, and which explains minor variation to fit niches then disappearance. 23 --> Sounds familiar:
. . . long term stasis following geologically abrupt origin of most fossil morphospecies, has always been recognized by professional paleontologists. [[Gould, S J, The Structure of Evolutionary Theory (2002), p. 752.] . . . . The great majority of species do not show any appreciable evolutionary change at all. These species appear in the section [[first occurrence] without obvious ancestors in the underlying beds, are stable once established and disappear higher up without leaving any descendants." [[p. 753.] . . . . proclamations for the supposed ‘truth’ of gradualism - asserted against every working paleontologist’s knowledge of its rarity - emerged largely from such a restriction of attention to exceedingly rare cases under the false belief that they alone provided a record of evolution at all! The falsification of most ‘textbook classics’ upon restudy only accentuates the fallacy of the ‘case study’ method and its root in prior expectation rather than objective reading of the fossil record. [[p. 773.]
24 --> We do have a model of evolution: one of micro-evo on existing, functional body plans with a proneness to drop off into non-performance. 25 --> Add in an unpredictable gradual or sudden variability in the island's topography of function, and voila, we have a model that fits the fossil record pretty well. 26 --> Only, it shows that small scale random changes rewarded by hill-climbing do not explain what was needed: MACRO-evo. 27 --> BTW, notice in the diagram, too, how when the hill climbing filter is cut off, we see a rapid descent downslope. That is the island topography is confirmed. _______ Schneider's ev confirms that it is active information that is the source of the new information in the output, and that it takes design to get you to and keep you on an island of function. So, the system is targetted and tuned, and it depends on a nice topography for the islands of function it works within. In short, the functional info that appears in the o/p was built in form the intelligent actions of the designer of the program. Ev does not overturn design theory, but illustrates it's point about specified complexity and its credible cause. GEM of TKI kairosfocus
Bizarre, indeed. It would be like me asserting that if I wrote the following function(s) -- private int GenerateAnOutput(int input) { return (int)(input * 10.11); } or even -- private int GenerateAnOutput(int input) { Math.Random random = new Math.Random(); return (int)((input * random.Next(0, 100)) * 10.11); } -- That the result returned is computed "independently of me" since I myself didn't explicitly chose the value of the parameter 'input' in an execution of either version of the function, nor explicitly chose the precise value returned by the "random" number generator in an execution of the second version of the function. Mr Schneider is doing several false/dishonest things, among which are: 1) he is pretending that there is such a number as "random number" -- when, in fact, there are only specific numbers, one of which may have been chosen randomly (or pseudo-randomly) as an input to a routine or function; 2) he is pretending that a "random number" used in some routine is not as much an input to the routine or function as are any explicit arguments passed to an execution of it; 3) he is pretending that one may sensibly speak of the result or output of a routine or function without reference to a specific execution of the routine or function -- that is, without specifying both the inputs and the operations the routine or function performs upon those inputs; 4) he is pretending that given a specific routine or function and given a specific set of inputs -- that is, given a specific execution of the routine or function -- the resut or output may vary; when, in fact, it will and must always be the same -- nor we could not use couputers at all were this not the case; Ilion
Earlier in this thread I asked whether there may not be some common underlying thread to MathGrrl's four scenarios. Well, I recently came across a page by Tom Schneider, creator of Ev, in which he claims that Dembski claims that ev generates CSI. For the record, Schneider comes to the same conclusion. So was MathGrrl perhaps just hoping to play a gain me of "gotcha"?
So, though I find the "CSI" measure too vague to be sure, for the sake of argument, I tentatively conclude that the Ev program generates what Dembski calls "complex specified information".
Dissecting Dembski's "Complex Specified Information" Now Schneider doesn't actually provide a quote by Dembski on that page to the effect that ev generates CSI, but he does provide the following reference:
Also, in No Free Lunch, Dembski asked where the "CSI" came from in Ev runs (p. 212 and following). So Ev creates "CSI".
So it looks like I have some reading to do. Schneider claims that his simulation starts with zero information, a claim I find rather odd, but his reasoning is that the "genomes" are created randomly and therefore contain no information to begin with.
The size of the genome and number of sites determines the information Rfrequency! But wait. At the first generation the sequence is RANDOM and the information content Rsequence is ZERO.
Also:
It is the selections made by the Ev program that separates organisms with lower information content from those that have higher information content.
Now if there is no information content, how is it that there are organisms with "higher information content" and "lower information content" that can be operated on by selection? More on this later I think. In the meantime I'll just scratch my head. But let's look at a bit of Schneider's other reasoning:
Was it, as Dembski suggests in No Free Lunch (page 217), that I programmed Ev to make selections based on mistakes? Again, no. Ev makes those selections independently of me. Fortunately I do not need to sit there providing the selection at every step! The decisions are made given the current state, which in turn depends on the mutations which come from the random number generator. I personally never even see those decisions.
He never sees the decisions the program makes, therefore he had no hand in the decisions the program makes, therefore ev makes those selections independently of his input, therefore he did not program ev to make the selections. Bizarre. Mung
U/D re MG, S and co: 1] Schneider's ev seems to SUPPORT the Chi metric as valid, on key admissions in Schneider's attempted defence against the vivisection paper. 2] Similarly, the reduced Chi metric: Chi_500 = Ip - 500, bits beyond a complexity threshold . . . is based on standard information theory and standard log reduction techniques, is applied to the Durston measures for 35 protein families, and is plainly meaningful, conceptually and mathematically. Cf ongoing commentary here, including answers to MG's 1st 4 qns, and the list of further questions. GEM of TKI kairosfocus
Onlookers: Schneider's attempted dissection of the vivisection of ev, ends up CONFIRMING the power of the reduced Chi metric to accurately detect a case of intelligent design. Reminder: Chi_500 - Ip - 500, specific bits beyond a threshold of sufficient complexity For, we may first see from Schneider's attempted dissection of the "vivisection":
Chris Adami has pointed out that the genetic information in biological systems comes from the environment. In the case of Ev, the information comes from the size of the genome (G) and the number of sites (?), as stated clearly in the paper. From G and ? one computes Rfrequency = log2 G / ? bits per site, the information needed to locate the sites in the genome. The information measured in the sites (Rsequence, bits per site) starts at zero and Rsequence converges on Rfrequency. In the figure to the right, Rfrequency is shown by the dashed line and the evolving Rsequence is the green curve. At 1000 generations the population was duplicated and in one case selection continued (horizontal noisy green curve) and in the other case selection was turned off (exponentially decaying noisy red curve). Thus the information gain depends on selection and it not blind and unguided. The selection is based on biologically sensible criteria: having functional DNA binding sites and not having extra ones. So Ev models the natural situation.
First, compare the admissions on tweaking, fine-tuning and targetting by Schneider in his Horse Race page, as documented by Mung at 126 above, then note his inadvertently revealing triumphant shout-out on the Horse Race page:
3 sites to go, 26,300 generations, Rsequence is now at 4.2 bits!! So we have 4.2 bits × 128 sites = 537 bits. We’ve beaten the so-called “Universal Probability Bound” in an afternoon using natural selection!
NOT Unless, Schneider's definition of "natural selection" includes tweaking, tuning and (implicit but nevertheless real) targetting; and this definition is a generally accepted one. Which last, is not the case. What seems to be happening here is that Schneider is unconsciously -- and, doubtlessly, in all sincerity -- assigning his intelligent inputting of crucial active information that makes ev work, as mere debugging to get things right so the model captures what he thinks is reality: blind watchmaker macro-evo on chance variation and natural selection. This perspective blinds him to just how much tweaking, tuning and the like are involved in how the program tracks and targets the desired outputs. So, he fails to see the incredible ironies in the opening words of the key clipped remarks from his dissection page:
Chris Adami has pointed out that the genetic information in biological systems comes from the environment. In the case of Ev, the information comes from the size of the genome (G) and the number of sites (?), as stated clearly in the paper. From G and ? one computes Rfrequency = log2 G / ? bits per site, the information needed to locate the sites in the genome.
1 --> What does "one computes . . . " imply? 2 --> Other than, intelligent direction, which as he documented in his horse race page, clearly involves significant tweaking to fine tune to get the desired outcome? 3 --> Similarly, what significant factor exists in "the environment" of ev that is busily computing, coding, tweaking and tuning? Chance variation? NATURAL selection on differential reproductive success of actual biological organisms? 4 --> The rather impressive looking graph is also significantly and inadvertently revealing. For it shows how:
At 1000 generations the population was duplicated and in one case selection continued (horizontal noisy green curve) and in the other case selection was turned off (exponentially decaying noisy red curve [cf below, this is an unrecognised signature of a negative feedback controlled targetting process, familiar to control engineers: turn off the feedback path and the plant drifts out of control . . . ]). Thus the information gain depends on selection and it [is] not blind and unguided.
5 --> But of course, absent a guide uphill, there is no constraint that stops wandering all over the map. the mere fact that that guide can be turned on or off tells us its source: the will and intellect of an intelligent designer. ARTIFICIAL, not natural selection. 6 --> Of course, the claim/assumption is that this case of art imitates nature aptly. 7 --> But in fact this gives the game away: "the information gain depends on selection and it [is] not blind and unguided." 8 --> If selection is not blind and unguided, who or what is providing the fore-sight to guide? Surely, not NATURAL selection, which is by definition non-foresighted. 9 --> it is appropriate at this stage to cite Dawkins' telling admission in Ch 3 of the well known 1986 book, The Blind Watchmaker, on the defects of his Weasel program:
Although the monkey/Shakespeare model is useful for explaining the distinction between single-step selection and cumulative selection, it is misleading in important ways. One of these is that, in each generation of selective 'breeding', the mutant 'progeny' phrases were judged according to the criterion of resemblance to a distant ideal target [i.e. there is a selection on progress towards a goal . . .], the phrase METHINKS IT IS LIKE A WEASEL [in ev's case, the target sequence, with hill-climbing on a selection filter that sends out warmer/colder messages that boil down to a version on reduced Hamming distance from target . . . notice how further variation dies off as the target is approached in the graph, that is a signature of successful targetting with feedback control to keep on target in the face of disturbances]. Life isn't like that. Evolution has no long-term goal. There is no long-distance target, no final perfection to serve as a criterion for selection, although human vanity cherishes the absurd notion that our species is the final goal of evolution. In real life, the criterion for selection is always short-term, either simple survival or, more generally, reproductive success. [as in, were the immediate ev results truly functional relative to the target; by the graph, no. So they should all have been killed off, not rewarded for movements towards the island of defined function. End of simulation.]
10 --> Translation: ev is a subtler Weasel (with IMPLICIT rather than explicit targetting and tuning) that serves as Weasel did, to reinforce and make plausible a fundamentally flawed argument and approach. 11 --> Note, especially, that we see smuggled in the idea that there is no isolation of an island of actual function in a vast sea of non-function, so the slightest increments to target can be rewarded, just as was the telling flaw of Weasel. 12 --> That means that at best, we have here a model of MICRO-evolution within an already functioning body plan. And, a model that is flawed by having artificial selection, tuning and tweaking to get it to perform as desired. 13 --> Despite the advertising, this is plainly not a model of how such body plans emerge from a sea of non-functional configs, where the statistical weight of the non-functioning clusters of microstates are vastly larger than those of functioning states. 14 --> How do we know that? 15 --> Simple: observe what happens as soon as the artificial selection filter is turned off: the system IMMEDIATELY wanders away from its progress towards function into the sea of non-function. 16 --> Finally, we must never ever forget: the REASON why computer simulations are on the table at all, is that there is no -- repeat, NO -- empirical observational base for the claim of body plan level macro-evolution. So a simulation of what might have happened in the remote, unobserved past, is allowed to stand in for the systematic gaps in the fossil record. 17 --> A pattern of gaps that was highlighted by Gould in his The Structure of Evolutionary Theory (2002), a technical work published just two months before his death; as a "constructive critique" of contemporary Darwinian thought:
. . . long term stasis following geologically abrupt origin of most fossil morphospecies, has always been recognized by professional paleontologists. [[p. 752.] . . . . The great majority of species do not show any appreciable evolutionary change at all. These species appear in the section [[first occurrence] without obvious ancestors in the underlying beds, are stable once established and disappear higher up without leaving any descendants." [[p. 753.] . . . . proclamations for the supposed ‘truth’ of gradualism - asserted against every working paleontologist’s knowledge of its rarity - emerged largely from such a restriction of attention to exceedingly rare cases under the false belief that they alone provided a record of evolution at all! The falsification of most ‘textbook classics’ upon restudy only accentuates the fallacy of the ‘case study’ method and its root in prior expectation rather than objective reading of the fossil record. [[p. 773.]
_____________ Ev only succeeds in demonstrating that intelligently designed capacity to use small random variation and [artificial,but we can plausibly include natural] selection, can foster micro-evolution within the island of function established by an existing successful body plan. Ev -- despite strenuous assertions to the contrary -- provides no sound warrant for body-plan originating macro-evolution on blind watchmaker thesis, chance variation and natural selection. And so, we are left with the force of the implication of the Chi metric: Chi_500 = Ip - 500, bits beyond a sufficient threshold of complexity that the only credible source is art. So, once an item has in it Ip significantly beyond 500 bits, it is warranted to infer to design as its most plausible cause. Ironically, for the Schneider horse race case -- once we dig into the tweaking and tuning as clipped by Mung in 126 above -- we can see that we have an inadvertent CONFIRMATION of the inference to design on a reasonable Chi threshold being surpassed. MG et al need to address this result. GEM of TKI kairosfocus
That was not a takeover, it was the coup de grace.
Thank you for the kind words. I am but silicon, and I am coming for you. http://www.amazon.com/gp/product/0521346827 A Vivisection of the ev Computer Organism Dissection of "A Vivisection of the ev Computer Organism" My devastating analysis of Schneider's "dissection" starts here Mung
PS: Those hoping to play strawman games should note that the key issue is not whether or not ev actually uses a Hamming distance metric or explicitly ratchets, but whether there is a nice trend-y slope and a warmer colder detector that rewards warmer on controlled random search. Since we have abundant evidence that this is so and that the function of such was very carefully tuned indeed, from 126, we see all we need to conclude there was implicit targetting based on carefully designed processes that were tuned for success. kairosfocus
Mung: You provided some key and very welcome specific facts on ev from your base as a programmer. That was not a takeover, it was the coup de grace. The analysis of ev at evo info (here) raises some very serious issues about the power of a nice trend in the fitness metric on the config space, and on the use of warmer/colder signals to speed convergence. I raised these in general terms but the facts you supplied demonstrated that the general issues apply very directly to ev. It's one thing to dismiss an in theory analysis -- as MG did over and over over several weeks -- it is another to answer to specific facts as you documented in say 126. Sadly, for years to come, the dismissive talking points will circulate in the evo mat fever swamps, where many will tank up and spread them out to those who will not know of this thread and the results that you have helped clinch over. GEM of TKI kairosfocus
hi vjt/kf, I also ran the following Google search which turned up some reasonable hits: Google I've decided to hold off posting more on ev, give time for MathGrrl to catch up. I may post elsewhere and just provide links here. I didn't mean to take over the thread and would love to see it get back on topic re: specifications and your quite reasonable request (which I noticed she tried to just turn right back on you.) It's getting more and more difficult to take her seriously, but we'll see. I've also ordered some books that will help me learn how to code GA's. I'm quite weak on the math involved so if I can learn to code them it will really help. I'd love to one day write a program to test GA's, lol! This may have been posted before: http://www.evoinfo.org/ Follow the links to ev Ware, Weasel Ware and Minivida. Mung
Mung Thanks very much for the links on genetic algorithms. I'll have a look at them. vjtorley
MG (& ilk): Given the issues now on the table at the CSI news flash thread [cf here, esp on the maths issues and the revelations on fine tuning and targetting in Schneider's ev from Mung at 126 on . . . ], perhaps a bit of explanation is in order. GEM of TKI kairosfocus
Mung: The declaration by Schneider in 134 contrasts not only with the general principles in the tutorial, but with his already excerpted remarks on tweaking, in your 126. I suspect, given his declaration on the race horse page that the UPB was beaten by NATURAL selection, that he does not realise just how much active information he has intelligently fed into the program, to tune it exactly as the GA manual says:
Genetic Algorithms are a family of computational models inspired by evolution. These algorithms encode a potential solution to a specific problem on a simple chromosome like data structure and apply recombination operators to these structures so as to preserve critical information.
What is really telling about the impact of these facts and the expose of the critical blunder by MG, however is the telling silence over the past few days. The shift in pattern of objections to CSI is all too revealing. I repeat: MG or someone else needs to do a fair bit of explaining for what has been so confidently declared in recent weeks here at UD. And, given the gap between the declared mathematical prowess and the actual performance, and the sort of underlying hostility and dismissive contempt -- remember the insinuations of dishonesty on our part -- that were plainly on the table, the explanation needs to be a very good one, with maybe a wee bit of an apology or retraction mixed in. GEM of TKI kairosfocus
The comparisons between R_sequence and R_frequency suggest that the information at binding sites is just sufficient for the sites to be distinguished from the rest of the genome.
The Information Content of Binding Sites on Nucleotide Sequences Looks like a grand design! Mung
The simulation begins with zero information and, as in naturally occurring genetic systems...
ev Abstract
Genetic Algorithms are a family of computational models inspired by evolution. These algorithms encode a potential solution to a specific problem on a simple chromosome like data structure and apply recombination operators to these structures so as to preserve critical information.
A Genetic Algorithm Tutorial Mung
Tutorial and papers on GA's: http://www.cs.colostate.edu/~genitor/Pubs.html Mung
So earlier in this thread I posted a link to A Field Guide to Genetic Programming. Upon closer inspection (and as advertised) this is a guilde to genetic programming which is not the same as genetic algorithms. In genetic programming, the "genomes" are actual programs. This is quite unlike what is going on in ev. But as one author writes in a book I was reading last night:
A genetic algorithm doesn't simulate biological evolution; it merely takes inspiration from it. The algorithmic definition of fitness underscores this distinction.
Sometimes people who implement a program using a genetic algorithm loose sight of this very basic truth. Mung
Onlookers (& MG and ilk): Fair comment: On developments over the past few days MG and ilk have a fair amount of explaining to do. Now, too, I have done some cleaning up and augmentation, especially putting in links. Note in particular that the 22-BYTE remark by MG was misread as 22 bits in an early calc. OOPS. Since 22 bytes is 176 bits, there is no material difference in the correction in 20. And -- given that she has challenged the sources of the numbers fed into the reduced Chi metric, in this case, MG herself is the source. Save, it turns out this is close to the upper limit of random text searches so far, which have shown empirically what is well known from decades ago from thermodynamicists. Namely a lab scale exploration on chance variation and trial and error is going to max out on a search of a space of about 10^50 or so possibilities. Mung's expose of the tweaking and tuning in ev during troubleshooting to get it to put out the desired horse race o/ps, also is very revealing on how Schneider apparently does not realise that his intelligent efforts are feeding in a lot of active info into the search he is carrying out. In short, we have a very simple explanation for why his horse race was won over the UPB: the search was intelligently directed and tuned to the particular circumstances. Posts in this thread are standing proof that intelligence routinely exceeds the UPB,based on knowledge, purpose, skill and if necessary tweaking (that spellcheck is very helpful to inexpert typists -- do typists still exist as a profession?). Happy Easter weekend. GEM of TKI kairosfocus
Mung: Weasel did not have "breeding" like that either. But in such a later generation algor, it raises the question as to whether such a move would destabilise the progress, since ev seems to be so sensitive per the documentation you have given. (My guess is it would most likely have been tried. So if it is missing there is probably a reason why Schneider went asexual.) GEM of TKI kairosfocus
More on ev: I have my doubts as to whether ev even qualifies as a genetic algorithm, I'll need to do some more reading. So what's missing? Crossover. So how does ev work? It works by replacing the "bad" individuals" with copies of the "good" individuals.
All creatures are ranked by their number of mistakes. The half of the population with the most mistakes dies; the other half reproduces to take over the empty positions. Then all creatures are mutated and the process repeats.
By "reproduces" he means:
Reproduce the bugs that made fewer mistakes by copying them on top of the ones that made more mistakes.
SPECIAL RULE: if the bugs have the same number of mistakes, reproduction does not take place. This ensures that the quicksort algorithm does not affect who takes over the population, and it also preserves the diversity of the population. Without this, the population is quickly taken over and evolution is extremely slow!
The reproduce methods: line 2823 of ev.p procedure reproduce(var e: everything); line 3369 of ev.c Static Void reproduce(e) line 436 of Simulator.java private void reproduce () Any questions? Mung
Mathgrrl (#99) You write:
What do you find confusing about my scenarios? What’s wrong with referring to other papers?
I'm a philosopher, not a biologist. I know very little about evolutionary algorithms, and I detest jargon - I can't keep all the technical terms in my head at once. I have absolutely no inclination to wade through hundreds of pages of well-nigh-incomprehensible scientific papers in order to understand what you're getting at. If you want me to answer your questions, I'm afraid you'll have to (a) explain the cases you're describing, in non-technical language (i.e. something a bright 12-year-old could grasp), and (b) explain why you think they pose a threat to Professor Dembski's concept of complex specified information, in a summary of two or three pages at most. You're quite familiar with these cases, so I'm sure you can explain them in simple terms to a layman like me. I've spent dozens of hours trying to explicate the concept of CSI to you; now it's your turn to reciprocate. Otherwise, I simply can't help you. Sorry. vjtorley
Point, Mung! kairosfocus
So let's take a closer look at Schneider's Horse Race page and do a little quote mining.
A 25 bit site is more information than needed to find one site in all of E. coli (4.7 million base pairs). So it's better to have fewer bits per site and more sites. How about 60 sites of 10 bits each?
Tweak.
We are sweating towards the first finishing line at 9000 generations ... will it make it under 10,000? 1 mistake to go ... nope. It took to about 12679 generations. Revise the parameters:
Tweak.
It's having a hard time. Mistakes get down to about 61 and then go up again. Mutation rate is too high. Set it to 3 per generation.
Tweak.
Still having a hard time. Mistakes get down to about 50 and then go up again. Mutation rate is too high. Set it to 1 per generation.
Tweak.
3 sites to go, 26,300 generations, Rsequence is now at 4.2 bits!! So we have 4.2 bits × 128 sites = 537 bits. We've beaten the so-called "Universal Probability Bound" in an afternoon using natural selection!
And just a tad bit of intelligent intervention.
Dembski's so-called "Universal Probability Bound" was beaten in an afternoon using natural selection!
And a completely blind, purposeless, unguided, non-teleological computer program! Does Schneider even understand the UPB? Does he think it means that an event that improbable can just simply never happen?
Evj 1.25 limits me to genomes of 4096. But that makes a lot of empty space where mutations won't help. So let's make the site width as big as possible to capture the mutations. ... no that takes too long to run. Make the site width back to 6 and max out the number of sites at 200.
Tweak.
The probability of obtaining an 871 bit pattern from random mutation (without selection of course) is 10-262, which beats Dembski's protein calculation of 10-234 by 28 orders of magnitude. This was done in perhaps an hour of computation with around 100,000 generations.
HUH? With or without selection?
It took a little while to pick parameters that give enough information to beat the bound, and some time was wasted with mutation rates so high that the system could not evolve. But after that it was a piece of cake.
You don't say. MathGrrl @105
There is no target and nothing limits changes in the simulation.
There aare both targets and limits. Mung
CSI Newsflash Progress Report: In post no 1 above, there is a tracking summary of major progress to date on this thread, including the overnight -- unfortunate -- developments with MG's credibility, and -- on a happier note -- the potentials for a breakthrough view of the GA as evidence of intelligently designed micro evolution. There is a special treat for BA 77 too, on maths. GEM of TKI kairosfocus
F/N: Genetic [Exploration and Discovery] Algorithms This remark by Poli et al clipped above, has set me to thinking:
One of the awkward realities of many widely appli-cable tools is that they typically have numerous tunable parameters. Evo-lutionary algorithms such as GP are no exception . . . Some parameter changes, however, can produce more dramatic effects . . . .Differences as small as an ‘>’ in place of a ‘>/=’ in an if statement can have an important effect . . . [that] may influence the winners of tournaments [Side-note: . . . I hate ligatures!]
1 --> The algorithms, as reported, are finely tuned, functionally specific and complex. 2 --> So, if we see such an algorithm of adaptation to niches in a variable environment, it is a mark of --- DESIGNED-in adaptability, probably as built-in robustness. 3 --> Notice, too, just how sensitive they can be to slight variations in strategy: small changes giving rise to big differences. FINE-TUNING! 4 --> Therefore if such algorithms on computers are mimicking the performance of already functioning life forms on chance variations and environmental selection, then the known design of the algorithms points to . . . 5 --> You guessed it: designed in micro-level evolvability to fit niches and shifts in environments for life-forms. 6 --> In short, the evo mat view has been leading us to all -- including us design thinkers! -- look at exploration and discovery algors the wrong way around. 7 --> They are evidence that capacity of organisms to adapt to environments and changes (within reasonable limits) is yet another MARK OF DESIGN. (Yet another misreading of the significance of FSCI! And, we design thinkers were caught by it. Youch!) 8 --> In addition, we should note that as the Mandelbrot set thought exercise above showed,the peaks are not in the hill-climbing algor, they lie in the way that the performance characteristics are mapped to the variable configur-er stored pattern. 9 --> There have to be peaks, and there have to be nice, trend-y slopes pointing to the peaks in the islands of function mapped to the "configuration codes" for in this case the von Neumann self replicator facilities. [This is itself another mark of design.] 10 --> What the hill-climber algors do is explore and pick promising slopes in a context that presumes that nice trends don't usually lie. 11 --> This may discover and express well-adapted forms, but it did not create the peaks. In the design or random variation based hill climbing exploration algorithms, that may show us something we did not know was there before, but that is a matter of uncovering the hidden but already implied. 12 --> Sort of like how the exploration of a seemingly simple point-wise function on the complex plane by a visualisation algor shocked and amazed us by showing that behind Julia sets lurked a beautiful, astonishing figure of infinite complexity, the Mandelbrot set. 13 --> H'mm. A specific and complex entity with a simple description. Design, again. IN THE STRUCTURE OF MATHEMATICS. That is, of logical reality. 14 --> There may be something in the suggestion that the M-brot set is God's thumbprint in Mathematics, just like the Euler equation is a signature of the coherent elegant beauty of the cosmos, the ordered system of reality:
e^(i*p1) + 1 = 0
15 --> BA 77's gonna love this one: The ultimate simple specification of a system of infinite complexity and wonderful functionality!!!! 16 --> We may be opening up a new front in the design thought world here: a designed mathematical-logical order for the logic of structure of reality itself. GEM of TKI kairosfocus
F/N: Dembski in Specification, 2005, on chance hyps: __________ p. 26: >> Probabilistic arguments are inherently fallible in the sense that our assumptions about relevant probability distributions might always be in error. Thus, it is always a possibility that {Hi}_iEI omits some crucial chance hypothesis that might be operating in the world and account for the event E in question. But are we to take this possibility seriously in the absence of good evidence for the operation of such a chance hypothesis in the production of E? Indeed, the mere possibility that we might have missed some chance hypothesis is hardly reason to think that such a hypothesis was operating. Nor is it reason to be skeptical of a design inference based on specified complexity. Appealing to the unknown to undercut what we do know is never sound epistemological practice. Sure, we may be wrong. But unknown chance hypotheses (and the unknown material mechanisms that supposedly induce them) have no epistemic force in showing that we are wrong. [remember, a brain in a vat world or a Matrix game world, or a Russellian world created in an instant in current state five minutes ago are possible hypotheses and would be empirically indistinguishable to the world we think we inhabit] Inquiry can throw things into question only by taking other things to be fixed. The unknown is not a fixed point. It cannot play such a role. >> ___________ Provisionality -- inescapable in science per Newton in Opticks, Query 31, 1704 -- should not be allowed to block an inference to best current explanation, if we are committed to scientific progress. As at now,
i: on reasonable assignment of a functionally specific [or otherwise target zoned] information value to Ip in the reduced form of the Chi metric, through ii: probability exercises per physically relevant probability estimation or hypothesis or iii: through observation of storage of information or iv: use of code strings or even v: decomposition of a Wicken wiring diagram of a functional object into a network list of nodes, arcs and interfaces, etc -- allows vi: substitution into the reduced Chi metric, and thence vii: comparison with a threshold serving as a limit beyond which viii: the relevant information is credibly too isolated in the Config space to credibly be ix: there by chance plus necessity without x: purposeful (though perhaps subtle and not consciously aware [think of the case of Clever Hans the horse]) injection of intelligent active information.
Thus, from Orgel and Wicken, we have a descriptive concept, Specified, often functional organised complexity. through Dembsky, Durston, Abel et al, we can deduce metrics. the Chi metric can be reduced to an information beyond a threshold of sufficient complexity form, which can be rationalised in different ways. This can then work with the explanatory filter to deliver an inference to best current explanation that warrants the provisional -- scientific conclusions are always provisional -- view that the relevant FSCI/CSI-bearing entity was most likely designed. On this, Durston's 35 families of proteins have several specific proteins that seem to be designed, and thus the genome and wider architecture of cell based life seems designed. This empirically and analytically grounded conclusion is plainly controversial, but it is inferred on best current explanation per warrant on empirical data and reasonable models. The CSI gang wins, on appeal. GEM of TKI kairosfocus
F/N: The Field Guide to GP's that Mung linked has some interesting clips: _____________ c1, p. 147 of 250 [I will follow the pdf not the official p numbers] >> One of the awkward realities of many widely appli-cable tools is that they typically have numerous tunable parameters. Evo-lutionary algorithms such as GP are no exception . . . Some parameter changes, however, can produce more dramatic e?ects . . . . there are many small differences in GP implementations that are rarely considered important or even reported. However, our experience is that they may produce significant changes in the behaviour of a GP system. Di?erences as small as an ‘>’ in place of a ‘?’ in an if statement can have an important e?ect. For example, the substitution ‘>’ ? ‘?’ may in?uence the winners of tournaments, the designation of the best-of-run individual, the choice of which elements are cloned when elitism is used, or the o?spring produced by operators which accept the o?spring only if it is better or not worse than a parent.>> p. 16 of 250: >> GP ?nds out how well a program works by running it, and then comparing its behaviour to some ideal (line 3). We might be interested, for example, in how well a program predicts a time series or controls an industrial process. This com-parison is quantified to give a numeric value called fitness. Those programs that do well are chosen to breed (line 4) and produce new programs for the next generation (line 5). >> p. 17 of 250: Algorithm 1.1: Genetic Programming>> 1: Randomly create an initial population of programs from the available primitives (more on this in Section 2.2). 2: repeat 3: Execute each program and ascertain its fitness. 4: Select one or two program(s) from the population with a probability based on fitness to participate in genetic operations (Section 2.3). 5: Create new individual program(s) by applying genetic operations with specified probabilities (Section 2.4). 6: until an acceptable solution is found or some other stopping condition is met (e.g., a maximum number of generations is reached). 7: return the best-so-far individual.>> pp. 28 - 9 of 250: >> The most commonly employed method for selecting individuals in GP is tournament selection, which is discussed below, followed by ?tness- proportionate selection, but any standard evolutionary algorithm selection mechanism can be used. In tournament selection a number of individuals are chosen at random from the population. These are compared with each other and the best of them is chosen to be the parent. When doing crossover, two parents are needed and, so, two selection tournaments are made. Note that tourna-ment selection only looks at which program is better than another. It does not need to know how much better. This e?ectively automatically rescales ?tness, so that the selection pressure 4 on the population remains constant . . . tournament selection ampli?es small di?erences in ?tness to prefer the bet-ter program even if it is only marginally superior to the other individuals in a tournament. >> ________________ In short the programs work WITHIN an island of established, function, use controlled random changes to explore which way lieth the steepest ascent to a hill top, and employ artificial selection. Notice,t eh creitical dependence on smoothly trending fitness functions and the sensitivity to tuning of parameters. Let's translate this last: the GP itself is on an island of function, and small disruptions can make big differences in performance. They inherently model at most microevo, not the origin of body plan level macroevo that is required to explain the real challenge of darwinian type blind watchmaker evolution. GEM of TKI PS: Notice where the phrase the Blind Watchmaker comes from, for next time someone rhetorically pretends not to know what it means or where it comes from: Dawkins, in a book bearing that phrase as a key part of its title, and presenting Weasel at a key point in his argument. kairosfocus
Mung: I come from an island that has a mountain range down its middle, with branch ranges and side hills. I now live in another, that has three main volcanic edifices and some side-hills. So, I am quite aware of having multiple peaks reachable by hill-climbing. (Oddly, I have never been to Blue Mountain peak, nor have I ever taken the local Centre Hills tour! I am much more inclined to head for a beach, rod in hand . . . ) Hill-climbing algorithms of course can explain multiple niches within an island of function, especially if the populations are allowed to wander off and head for different hills. The root problem from my point of view, is that the whole procedure starts on such an island. That was evident to me from the very first when I saw Weasel's "nonsense phrases" being rewarded on mere proximity to target. Might as well quote Dawkins, from the notorious TBW [cf App 7, my always linked], which will make all clear: _________________ >> I don't know who it was first pointed out that, given enough time, a monkey bashing away at random on a typewriter could produce all the works of Shakespeare. [NB: cf here and this discussion on chance, necessity and intelligence.] The operative phrase is, of course, given enough time. Let us limit the task facing our monkey somewhat. Suppose that he has to produce, not the complete works of Shakespeare but just the short sentence 'Methinks it is like a weasel', and we shall make it relatively easy by giving him a typewriter with a restricted keyboard, one with just the 26 (capital) letters, and a space bar. How long will he take to write this one little sentence? . . . . It . . . begins by choosing a random sequence of 28 letters ... it duplicates it repeatedly, but with a certain chance of random error – 'mutation' – in the copying. The computer examines the mutant nonsense phrases [= non-functional], the 'progeny' of the original phrase, and chooses the one which, however slightly, most resembles the target phrase [notice, explicit targetted search, more modern programs generate implicit targetting on nice trend-y slopes,through topography-"fitness" defining functions and associated hill-climbing algorithms], METHINKS IT IS LIKE A WEASEL . . . . What matters is the difference between the time taken by cumulative selection, and the time which the same computer, working flat out at the same rate, would take to reach the target phrase if it were forced to use the other procedure of single-step selection [the problem is that the real challenge of body-plan origination is credibly much bigger than the single steps that Dawkins would dismiss, starting with that needed to account for the joint metabolic action and von Neumann self replicator in origin of life, also cf here above on the Mandelbrot set "fitness function" thought exercise on the weaknesses of other more subtle GA's that load the target implicitly, and this follow up, on what is really going on in the evolutionary computing models. Cf. here at 89 above for a comment on the place where Schneider mistakenly discusses his artificial selection algorithm as though it were natural selection]: about a million million million million million years. This is more than a million million million times as long as the universe has so far existed . . . . Although the monkey/Shakespeare model is useful for explaining the distinction between single-step selection and cumulative selection, it is misleading in important ways. One of these is that, in each generation of selective 'breeding', the mutant 'progeny' phrases were judged according to the criterion of resemblance to a distant ideal target, the phrase METHINKS IT IS LIKE A WEASEL. Life isn't like that. Evolution has no long-term goal. There is no long-distance target, no final perfection to serve as a criterion for selection, although human vanity cherishes the absurd notion that our species is the final goal of evolution. In real life, the criterion for selection is always short-term, either simple survival or, more generally, reproductive success. [TBW, Ch 3, as cited by Wikipedia, various emphases added.] >> __________________ Weasel is of course rather like a Model T Ford, an old technology, long since replaced by more sophisticated versions. But despite Dawkins' weasel words, it primarily served to misleadingly persuade the naive that the question of the origin of functionally specific, complex information, had been "scientifically" answered through evolution by chance variation and natural selection. Indeed, I have had people present Weasel to me in that guise in recent times. GEM of TKI kairosfocus
The file evjava.zip can be downloaded from this page. The Pascal and C code can be downloaded from this page. Mung
MathGrrl @41
My participation here is solely so that I can understand CSI well enough to be able to test whether or not known evolutionary mechanisms can create it.
Mung @76
Schneider claims to have created it. Do you doubt him?
Yes? No? You don't know just what ev does so you don't have an answer? MathGrrl @66
Again, I don’t see where you’re getting your 266 bit value, but Schneider shows how to generate arbitrary amounts of Shannon information via ev.
How? MathGrrl @68
As noted above, Schneider shows how to generate arbitrary amounts of Shannon information via ev.
Mung @85
Now apart from the fact that this is a vague and muddled statement (couldn’t a random number generator just as well generate arbitrary amounts of Shannon information?) Schneider himself actually makes no such claim.
MathGrrl @100
If you think I’ve misrepresented Schneider, please explain exactly how, with reference to the page to which I linked.
Schneider never makes the claim that ev can "generate arbitrary amounts of Shannon information." It's hard for me to link to something he never said. MathGrrl @101
It seems that you haven’t understood Schneider’s summary.
Well, unfortunately for you, I did more than read the summary. I read the entire web page, which detailed all the hoops Schneider had to go through just to get his ev program to "win" the horse race. Did you happen to notice the failed attempts, each followed by intelligent intervention? I also read his paper on ev, and have looked at the source code (Pascal, C and Java) and the source code comments. I think I know enough about it to have some idea whether or not you know enough about it to be making the claims that you are. For example, you say:
Unless you’re claiming that it is impossible in principle to model evolutionary mechanisms, these GAs support the idea that known evolutionary mechanisms are capable of changing the allele frequency in a population such that subsequent generations are better able to reproduce in a particular environment.
Guess what's missing from ev? A model of population genetics, that's what. IOW, no alleles.
GAs model known evolutionary mechanisms. These mechanisms work without intelligent intervention...
Read the horserace page again. Schneider intervened.
That is not an accurate description of ev. There is no target and nothing limits changes in the simulation.
There are multiple targets. So in some bizarre sense someone could argue you got that one right. And there are a number of limits to changes in the simulation.
That’s what makes the results particularly interesting. I strongly recommend reading Schneider’s paper.
You really should. You don't know what you're talking about. Mung
MathGrrl @99
What do you find confusing about my scenarios? What’s wrong with referring to other papers?
Let's review (please bear with me): MathGrrl @41
My participation here is solely so that I can understand CSI well enough to be able to test whether or not known evolutionary mechanisms can create it.
PaV @50
First, why do you think “Produces at least X amount of protein Y” is a “specification”.
vjtorley @52
I think that kairosfocus’ posts at #44, #45 and #47 above meet your requirements for a CSI calculation for the four scenarios you described. But if you aren’t satisfied with those answers, then here’s a challenge I shall issue to you. Please provide us with a two- or three-page, detailed but completely jargon-free description of the four scenarios you are describing and post it up on UD. No references to other papers by biologists, please. Describe the problems in your own words, as you would to a non-biologist (which is what I am).
MathGrrl @67
As noted in my guest thread, specification is one of the more vague aspects of CSI. Some ID proponents don’t seem to have a problem with the specification I suggested (see a couple of the comments above in this thread, for example). Others, like you, seem to have a different concept. Why do you think that “Produces at least X amount of protein Y” is not a specification in Dembski’s sense? Please reference his published descriptions of CSI that support your view.
vjtorley @72
You have yet to respond to my challenge regarding the four scenarios you describe: Please provide us with a two- or three-page, detailed but completely jargon-free description of the four scenarios you are describing and post it up on UD. No references to other papers by biologists, please. Describe the problems in your own words, as you would to a non-biologist (which is what I am). Then I might be able to help you. I’m still waiting.
MathGrrl @99
What do you find confusing about my scenarios? What’s wrong with referring to other papers?
Nuff said? MathGrrl, please tell us what you think a specification is. Then please tell us why you think what you provided in your "challenges" qualify as specifications. Mung
I hate it when I purchase a book and then find out it's available free online, lol. A Field Guide to Genetic Programming Mung
F/N: The further response to Schneider's horse race page is at what is now 87. Sorry. Will correct above shortly. kairosfocus
Observe MG, 106:
The issue isn’t simple probability calculations, but how those probabilities are determined in the first place. Please explain, in detail and with examples, how you arrive at the numbers you used for my four scenarios. [Of course, there are no details beyond that the reduction was used and the numbers come from her cites, links and the like (with the exception of the cite from PAV, who has come in thread and explained himself]; as cited above . . . so, did MG actually read what she posted and linked so she would know where the typical numbers clipped and plugged in were sourced? This, unfortunately is a direct foreshadowing of what is to follow in this post . . . ]
This, in response to my request of MG that she: "Kindly MATHEMATICALLY address the reduction of the Dembski type metric to the form in the OP" This convinces me that either MG has not bothered to read carefully before firing off dismissive talking points and even insinuations of dishonesty, or else that she is mathematically incompetent. For, the relevant calculations in the reduction are NOT "probability" calculations but direct application of the Hartley-Shannon information theory DEFINITIONS, and standard log of product logarithm calculations (of High School Algebra standard). Let us cite again, from OP:
Chi = – log2[10^120 ·phi_S(T)·P(T|H)] . . . eqn n1 How about this: 1 –> 10^120 ~ 2^398 2 –> Following Hartley, we can define Information on a probability metric: I = – log(p) . . . eqn n2 3 –> So, we can re-present the Chi-metric: Chi = – log2(2^398 * D2 * p) . . . eqn n3 Chi = Ip – (398 + K2) . . . eqn n4 4 –> That is, the Dembski CSI Chi-metric is a measure of Information for samples from a target zone T on the presumption of a chance-dominated process, beyond a threshold of at least 398 bits, covering 10^120 possibilities. 5 –> Where also, K2 is a further increment to the threshold that naturally peaks at about 100 further bits.
a: At step 1, I simply divided lg(10^120) = 120 by lg 2 = 0.3010, to derive the exponent of two more or less equal to 10^120. b: This is based on the rule that logA/logB = log_b (A), in turn derived from the laws of indices and definition of logarithms. c: In step 2, I applied the Harley -- as Shannon did and as has become standard in info theory -- suggestion that we measure information in log units, as is discussed as a 101 intro here in my always linked. d: This is NOT a "simple" probability calculation, it is a conversion of probabilities of symbols into the standard log metric of information, suggested by Demsbki's - log2 lead to the Chi eqn n1. INDEED, THIS IS REALLY JUST THE CITATION OF THE DEFINITION IN THE CONTEXT WHERE THE A POSTERIORI PROBABILITY IS 1, a common enough reduction. e: In step 3, I first re-present eqn n1, in a form more amenable to logarithmic reduction: Chi = – log2(2^398 * D2 * p) . . . eqn n3 f: Then, we use the well known rule derived from the laws of indices [I think Americans call these exponents], where Log(p*q) = log p + log q. So: Chi = Ip – (398 + K2) . . . eqn n4 g: We next use VJT's reduction,that K peaks out in praxis at about 100 bits: Chi = Ip - 500, in bits beyond a reasonable threshold of complexity h: So, once we have a reasonable measure of functionally specific information, Ip, whether probabilistically or by the classic assessment of storage used and/or storage used and code patterns that lead to the actual info content being less than the used capacity [As Durston et al did], we can identify whether or not it is beyond a threshold with a simple subtraction. i: For instance, it is known that there are 20 possible values for each AA in a polypeptide chain, and as Durston observe that shows a raw capacity of 4.32 bits/AA position. Similarly, notoriously DNA bases take 4 values and so are reduced to 2 bits per position. j: In praxis the functioning codes used or the sequences used to effect working proteins show somewhat uneven distributions, and that is what the Durston metric on H ground vs H functional addresses. k: Using his values of information stored in protein families, we deduce the results shown in point 11 the OP:
RecA: 242 AA, 832 fits, Chi: 332 bits beyond SecY: 342 AA, 688 fits, Chi: 188 bits beyond Corona S2: 445 AA, 1285 fits, Chi: 785 bits beyond . . . results n7
l: This both shows that the reduced Chi metric can be applied to biosystems and gives cases where the Chi metric indicates that these protein families are best explained as the result of design. m: I need not elaborate again the search space and isolation of islands of function [NB: proteins come in fold domains that are isolated in AA sequence space, due to very complex functional constraints] reason behind that conclusion. n: Let it suffice that the Chi metric as reduced will prove to be very reliable on empirical tests, where we directly know the cause. this, because this has already been abundantly shown. o: QED ______ But, it is time to render a verdict. A sad one. If MG has failed to read before dismissively remarking so blunderingly as above, she is in no credible position to responsibly make the contemptuous dismissals she has made over and over and over. And that is the better case. If instead she has been posing as deeply understanding the Shannon theory of information, when she cannot correctly interpret the basic reduction above, then she is incompetent and she is speaking as though the has a degree of intellectual authority that she does not have. That is worse, much, much worse than the first case. But, unfortunately, it does not stop there, in 105 she has capped off her remarks by quoting he traditionally reported sotto voce retort of Galileo when he had been forced to recant his theory by the Inquisition:
Eppur si muove. [= It still moves]
This is a snide insinuation of the worst order, of in effect a religiously motivated persecution against established scientific fact. That, in an immediate context where she would IMMEDIATELY go on to demonstrate that she has either failed to attend to basic facts of a simple derivation in information theory, or else that she is incompetent to follow such a derivation. Such irresponsible atmosphere poisoning behaviour is utterly reprehensible,a nd shouldbe stopped and apologised for. No religious inquisition is imposing a kangaroo court or threatening scholars with thumbscrews to get them to recant facts. Instead, we have carried on a SCIENTFIC discussion on well established facts, theory and evidence. We have deduced reasonable results and we have shown them in action. In addition, we have explained, over and over, step by step, just why the results of so-called evolutionary algorithms are perhaps somewhat analogous to micro-evolution but not to body plan origination macroevolution. Only to meet with brusque dismissals and now rude and disrespectful, slanderous insinuations. On the evidence of the simple reduction of eqn n1, we have reason to believe that MG is in no position to render a reasonable judgement on the challenge made to the Schneider type algorithms. For we have reason to believe that she has ether not paid enough attention to see what is being pointed out, or is incompetent to do so. Whichever way it goes, the matter is over on the merits. MG is mouthing dismissive hyperskeptical talking points, not responding responsibly and with due care and attention to a serious matter. So, we again see the force of the point at the foot of the OP:
In short, the set of challenges recently raised by MG over the past several weeks has collapsed.
QED Let us hope that MG will pause, look at what she has done, reflect, and do better in future. Good day, GEM of TKI kairosfocus
MathGrrl:
I’m interested in learning what ID proponents consider to be the definition of CSI, as described by Dembski.
Shannon information with meaning/ function of a certain complexity. Shannon information- you figure out the number of possibilities for each position and that gives you the number of bits. for example- 4 possible nucleotides = 2^2 = 2 bits per nucleotide 64 coding codons = 2^6 = 6 bits per amino acid. Then you count and figure out the specificity. Or you keep acting like a little baby- your choice. Joseph
You forgot one main point R0bb- ev is a targeted search, which means it is an irrelevant example. MathGrrl:
This is not correct.
Yes it is and I have provided the peer-reviewed paper that exposes it as such. Joseph
Hi kairosfocus. Your link to IOSE above reminded me of: http://www.amazon.com/gp/product/0964130408 http://www.genome.com/life-Orgin.htm Mung
MG, You are boringly repetitious. If you'd like to have a discussion, by all means let's do so. Maybe one of us or even both of us can learn something. If you'd care to do something other than repeat the same two phrases over and over here's what I'd like to discuss: 1. Why you think your "specifications" qualify as a specification as that term is used by Dembski. 2. Just what it is you think the ev program actually does. (It's not what you think.) Mung
MG, 102:
No one there [in my guest thread] was able to present a rigorous mathematical definition of CSI based on Dembski’s description. If you can, please do so and demonstrate how to calculate it for the four scenarios I describe there.
Pardon directness: this is the proverbial stuck record, repeating long since reasonably and credibly answered demands, without any responsiveness to the fact that they have been answered, again and again. Kindly cf the post at the head of this thread and the comments attached to comment no 1. That will give adequate answers to all real questions that reasonable onlookers will have. The onward links from the editorial comments at no 1 will I believe answer to all reasonable questions for all reasonable people. By contrast, no quantity of explanation, analysis or definition will ever suffice for the sufficiently hyperskeptical. There only remains the reductio ad absurdum of exposing that hyperskepticism as unreasonableness itself, in this case backed up with insinuations -- thus subtle accusations -- of dishonesty. And now, attempted denials. This is simply repetition of talking points that were long since answered. In the guest thread, there were reasonable analyses and calculations relative to the set scenarios. there was discussion of limitations on available information to answer others. And since there has been onward analysis culminating in the reduction of the Dembski chi at the head of this thread in the OP. And the four main challenges were answered point by point from my perspective at 19 - 20 above and onward objections were answered at 44 ff above. The first basic problem, though, is that MG refuses to acknowledge that a gene doubling even creates no novel FSCI. Thus the direct novel FSCI in this copied gene is zero. However the mechanism highlighted or implied by the process of copying will almost certainly be well beyond the FSCI threshold, and exhibits a rather specific function. The second, is that MG is unwilling to accept that the evolutionary algorithms she cites start on islands of function and proceeds to an implicit target on the structure of a so-called fitness function that has nicely trend-y behaviour and an artificially set up hill climbing algorithm. This I have dealt with twice above in a couple of days, starting with the comparison of the Mandelbrot set thought exercise, and again here in more elaborate detail on how computers work and how the relevant algorithms work. The real problem is to get to shores of such islands of function in the chemical pre biological world, and again to get to such islands for novel body plans in the biological world, crossing information origination gaps beyond the credible reach of blind chance and mechanical necessity. To cap off, when the cited challenges were scanned for stated values of claimed originated info, and these were plugged into he reduced Dembski metric to show illustrative cases from her challenges, they were dismissed as made up numbers. That is an unworthy false accusation that should be apologised for. Going yet further, the real case of the Durston measures of FSC were fed into the reduced Dembski metric and produced reasonable values of CSI in bits beyond 500 as a threshold of design as most credible explanation. These -- promoted to point 11 in the OP -- have been studiously ignored and/or dismissed. MG also needs to attend to the responses in what is now 82 to the onward set of questions/demands -- there is no end of inquisitorial questions once one is permitted to sit in the seat of a Torquemada who only demands and is not himself or herself accountable. It is to be noted that after several weeks and some direct invitations or even challenges, MG has yet to provide a single serious piece of mathematical analysis here at UD. What has consistently happened instead is the drumbeat repetition of hyperskeptical dismissals without reason, and repetition of demands. Finally, when VJT reasonably requested some details on the evolutionary algorithm cases that would show what was needed to work things out to a further level, he has been brusquely, even rudely dismissed. That, after he has been the consummate gentleman and has provided considerable efforts that reflect many voluntary hours of work. Some of which led to the reduced Chi metric above. Pardon my sharpness, but as of now MG -- sadly -- comes across as simply pushing hyperskeptical talking points and worse accusations, not as a serious partner for discussion. I hope she will turn over a new leaf. GEM of TKI kairosfocus
MG, 100:
Schneider documents how a simple subset of evolutionary mechanisms can generate arbitrary amounts of Shannon information . . .
Strictly Shannon information is a metric on a signal of average info per symbol: H = - [SUM on i] pi log pi. More loosely, this relates to the Hartley suggested log metric I = - log p. The problem, as already highlighted is that a string of fair coin tosses produces Shannon info on such metrics [actually the highest info per symbol]. This is because such a random string would have in it no redundancy. But if your coin suddenly started to spit out the ASCII text codes for Hamlet, that would be a very different kettle of fish. The issue on complex, meaningfully or functionally specified info is that the ability to make a functional difference to a system or object on being in a particular set of configs is often important. Hence as Joseph keeps on pointing out we must reckon with such meaningfulness or functionality, as both Dembski, Abel et al and Durston et al among others, discuss. More fundamentally, as was pointed out in detail earlier, Schneider's whole approach misses the injection of active, intelligently created information in the whole exercise, which means that the exercise STARTS in an island of function already. From a quote given above, Schneider evidently did not even realise that he was carrying out an artificial not a natural selection process. The issue at the heart of the CSI/FSCI challenge is to arrive at the shores of such islands of function from arbitrary initial points in config spaces. As has been said over and over and over, only to be brushed aside or ignored. I am now of the opinion that many objectors to the FSCI/CSI concept -- astonishingly -- do not really understand the difference between:
1. [Class 1:] An ordered (periodic) and therefore specified arrangement: THE END THE END THE END THE END Example: Nylon, or a crystal . . . . 2. [Class 2:] A complex (aperiodic) unspecified arrangement: AGDCBFE GBCAFED ACEDFBG Example: Random polymers (polypeptides). 3. [Class 3:] A complex (aperiodic) specified arrangement: THIS SEQUENCE OF LETTERS CONTAINS A MESSAGE! Example: DNA, protein.
(For those who came in late, I just clipped an extract from the very first design theory technical work, Thaxton et al in TMLO, ch 8, 1984.) GEM of TKI kairosfocus
MF: Re 98:
Please provide references to where I have done so . . .
Kindly cf 44 ff above, for my earlier responses; which relate to your 36 ff. A civil person does not raise suggestions of dishonesty without cast iron proof, for -- as common good sense and what in this region is called basic broughtupcy will tell us -- the mere suggestion is itself an accusation that demands a response. As was pointed out in my response from the first. The evasiveness of your response just above is sadly revealing. Please, correct your behaviour. GEM of TKI kairosfocus
Joseph,
You forgot one main point R0bb- ev is a targeted search, which means it is an irrelevant example.
This is not correct. I strongly recommend that you read Schneider's PhD thesis as well as the ev paper to learn what ev really shows. MathGrrl
kairosfocus,
Kindly MATHEMATICALLY address the reduction of the Dembski type metric to the form in the OP
The issue isn't simple probability calculations, but how those probabilities are determined in the first place. Please explain, in detail and with examples, how you arrive at the numbers you used for my four scenarios. MathGrrl
kairosfocus,
If something is on an island of function deeply enough isolated that 10^102 quantum-time states cannot reasonably reach it on a random walk that works with trial and error selection, then if that thing is actually working, THAT IS BECAUSE IT IS NOT TRULY USING SUCH RW + TAE.
Eppur si muove. If your model of reality says something is impossible and someone like Schneider demonstrates that it is possible, the rational option is not to declare that reality is wrong and your model is correct.
More sophisticated GA’s do not load the target(s) EXPLICITLY, but do so implicitly. They have an intelligently designed well-behaved “fitness function” or objective function — one that has to have helpful trends pointing towards desired target zones — that is relevant to whatever you are trying to target that is mapped to the config space for the “genetic code” string or the equivalent; INTELLIGENTLY mapped, too.
You are confusing the simulator with that which is being simulated. GAs like Schneider's ev model (typically a subset of) known evolutionary mechanisms and a simplified version of the natural world. ev is particularly interesting in that it demonstrates the same results that Schneider observed in real biological organisms. That supports the theory that the modeled evolutionary mechanisms are responsible for the observed results. Unless you're claiming that it is impossible in principle to model evolutionary mechanisms, these GAs support the idea that known evolutionary mechanisms are capable of changing the allele frequency in a population such that subsequent generations are better able to reproduce in a particular environment.
Eventually, we find somewhere where changes don’t make for improvement. We are at a target and voila, information out of the thin air of random variation and selection.
That is not an accurate description of ev. There is no target and nothing limits changes in the simulation. That's what makes the results particularly interesting. I strongly recommend reading Schneider's paper.
GA’s overcome the physical limitations of atoms blindly scanning states through chance and blind necessity, because we are able to intelligently organise machines and program them to solve problems, including by hill-climbing within an island of function that after much painstaking effort, we have set up.
GAs model known evolutionary mechanisms. These mechanisms work without intelligent intervention, just as evolutionary mechanisms are observed to work in the real world. Are you suggesting that it is impossible, even in principle, to model evolutionary mechanisms? MathGrrl
kairosfocus,
Dembski and others have produced quantitative models of what CSI is about, and have in so doing made more or less good enough for govt work metrics.
If that is the case, please demonstrate how to calculate CSI, as described by Dembski, for my four scenarios.
12.2 Provide real evidence for CSI claims 7 –> Willfully vague and hyperskeptical. If you mean that the only empirically known, observed source of CSI and especially FSCI is intelligence, look around you and ask where the web posts, library books, etc came from. Then ask if you have ever seen the like produced by chance plus blind mechanical necessity.
Thus far, I have not seen a rigorous definition of CSI, so your statement is literally meaningless. Please provide a rigorous mathematical definition of CSI and show how to objectively calculate it. The rest of your comment is equally devoid of definitions or calculations. MathGrrl
kairosfocus,
Now, you know by now that MG has been simply recirculating objections and dismissals, even implying that the numbers I clipped — by way of illustrative example — from the excerpts she gave or that were otherwise accessible wee taken out of the air.
I'm trying to understand how to calculate CSI, as described by Dembski. Nowhere in your voluminous comments have you demonstrated how to do so. You provide no basis for the numbers you use above. Please provide your mathematically rigorous definition of CSI and explain exactly how you arrived at the numbers you used. MathGrrl
Mung,
Specification: The Pattern That Signifies Intelligence This is the paper referenced in MathGrrl’s original OP. Look at the title. My gosh. I wonder if there’s anything in it about specification as that term is understood and used by Wm. Dembski himself.
That paper was discussed on my guest thread. No one there was able to present a rigorous mathematical definition of CSI based on Dembski's description. If you can, please do so and demonstrate how to calculate it for the four scenarios I describe there. MathGrrl
Mung,
My participation here is solely so that I can understand CSI well enough to be able to test whether or not known evolutionary mechanisms can create it.
Schneider claims to have created it. Do you doubt him?
It seems that you haven't understood Schneider's summary. Schneider documents how a simple subset of evolutionary mechanisms can generate arbitrary amounts of Shannon information. ID proponents don't seem to accept that as being equivalent to CSI in this case. I'm interested in learning what ID proponents consider to be the definition of CSI, as described by Dembski. MathGrrl
Mung,
Again, I don’t see where you’re getting your 266 bit value, but Schneider shows how to generate arbitrary amounts of Shannon information via ev.
MG, you have a bad habit of citing things without having read them.
That's an offensive and baseless assertion. If you think I've misrepresented Schneider, please explain exactly how, with reference to the page to which I linked. MathGrrl
vjtorley,
You have yet to respond to my challenge regarding the four scenarios you describe:
Please provide us with a two- or three-page, detailed but completely jargon-free description of the four scenarios you are describing and post it up on UD. No references to other papers by biologists, please. Describe the problems in your own words, as you would to a non-biologist (which is what I am). Then I might be able to help you.
I’m still waiting.
What do you find confusing about my scenarios? What's wrong with referring to other papers? MathGrrl
kairosfocus,
You have called people, for no good reason, dishonest
Please provide references to where I have done so. MathGrrl
Joseph: Dead right. GEM of TKI kairosfocus
Robb: 1: PaV’s calculation does not measure the same information that Schneider, Dembski, and Marks measure Please follow context. As can be seen from the freshly numbered eqn n5, Dembski's eqn n1 can be reduced to an expression that first measures Hartley-Shannon information [-log p] -- which can be equivalent to the suitable value of Shannon information proper H average bits per symbol or string as required -- is one term in a more complex expression and context. 500 bits taken off (in effect cf discussion) then gives the Chi_500 metric in bits beyond the complexity bound. In this context, the information has to be cross checked for specificity i.e. isolation to a zone of interest in a config space. Marks and Dembski then point out that once we are beyond such a threshold, the best explanation for event E being in zone T is that a quantum of intelligently directed active information has been a part of the cause of event E. Schneider in speaking of Shannon information is speaking to either Ip or at most H. PAV and I have addresses the numbers reported by Schneider as being on the I term, and he in particular has -- in my view quite correctly -- pointed out that the reported performance is below the threshold, especially when we apply a strict interpretation of what being in zone T means. In short, your dismissive remark is imprecise and off target in light of the analysis above. 2: PaV’s repeated objection that ev is irrelevant because it generates at most 96 bits of information doesn’t hold water, because we can run ev multiple times First, PaV is pointing out that the reported result is less impressive than a first glance will show. Second, as I showed above this morning in 87, step by step, every time Schneider runs his horse race, he shows that his whole exercise is a case of intelligently designed, implicitly targetted search. Schneider is evidently unaware that he 500 - 1,000 bit thresholds are not arbitrary, and so he is misled about the performance of his programs that start in an already narrow target zone of a functional program pre-loaded with a hill climbing filter and algorithm working on a fitness map that -- by the very act of constructing it -- has been loaded in with targetting information. 3: Your long response addressed neither of these points Actually, I spoke to the really material point, the one that you have unfortunately still missed as of the post cluster above. 4: You repeatedly argue that random walks, monkeys on typewriters, etc. are impotent, as if anyone is challenging that fact. Every time you address claims that nobody is making, you create more words for your readers to sift through in search of something responsive. Actually, the key point is that everyone nods assent to the search challenge issue then miss the implication: if you are outperforming such, it is because you have found a way to inject active information. The point that we are dealing with isolated islands of function and have to explain first and foremost getting to shores of function has plainly not got through the many shields of deflection and dismissal. And yet, time after time I see the equivalent of : assume you have already arrived at the island of function with a nice trendy fitness metric and this handy hill-climbing algorithm that can ride the trend like an escalator to the top; all duly intelligently designed. Voila, we run it for a few dozen or thousand cycles and hit a peak. There, in an afternoon we have shown that the search resource barrier is leaped in one swoop by magic powers of evolution on natural selection and chance variation. There is actually a clip in 89 above, HT Mung, from Schneider. I am not making this up. 5: Darwinian evolution, whether or not you believe that it actually occurs in nature, is a far cry from a random walk. The question is whether it occurs and to what extent, not the efficacy of random walks. Excuse me, please do not put words in my mouth that do not belong there, I am not the strawman you are projecting. Ecveryone, starting with Blythe, a Creationist, accepts that here is a natural selection effect of differential reproductive success. Everyone, including the much mocked and smeared young earth creationists, accepts that this accounts for certain small scale changes. However, in other cases the issue is more isolation on founder effects and selection that produces breeds or varieties through Mendelian inheritance. A fair amount of micro-evo can be explained on the two. Even the YECs commonly suggest this can go up to genera and families. In recent days I have cited the relationship between the red deer and the US elk, which are cross-breeding [they are interfertile] across species lines -- elk were recently classified as a species -- in New Zealand. But all of this is well within the major body plan level, and is within islands of function. If Schneider et al were content to say they have a model for evo up to that approximate level and maybe a bit beyond, that would not be of any great controversy, with empirical data to back up some and plausible arguments beyond. But the issue is that his is then turned into a claim that we have a theory that on selection of chance non-foresighted variations explains the history of life on the famed iconic branching tree of life model. the ACTUAL evidence, on study of fossil forms is that body plans are quite stable once they appear -- suddenly, and then they disappear or continue to the modern age. This fits with islands of function on irreducibly complex core based Wicken wiring diagrams, backed up by need to code for proteins and regulatory controls to express the plan on the fertilised ovum through embryonic development. It is those body plans which are credibly isolated that raise the body plan level macroevolutionary challenge: the need to use darwinian chance variation random walks and trial and error to hit shores of islands of function that are unexplained and lack empirical support. (If there were the classic tree of life, there would be overwhelming fossil support for it.There is not, as Gould bluntly testified.) If we take the geochronological timeline as more or less giving a broad -- but provisional -- picture [it more or less fits into the scale of the more observationally supportable cosmological timeline, e.g on apparent age of clusters on HR main sequence turnoffs . . . ], natural history may arguably fit an evolutionary pattern. But the darwinian mechanism is not enough to explain origin of body plans. The required information to explain such is only empirically known to come from design. And as for a sufficient designer, any nano-tech lab a few generations beyond Venter could do it. Front loading and adaptation to niches would work. Deliberate viral insertions would work, etc etc etc. Venter and genetic engineering more broadly have provided proof of concept. 6: I asked how H is defined in the chi formula. You didn’t answer directly, but you seem to be saying that we can just define it as a uniform distribution when we’re calculating the specified complexity of biological structures. But I wasn’t asking about H in general, not just for biology. Nope, I am saying something far more radical. H may be of analytical interest but it is operationally irrelevant. We have a world of information technology that allows us to arrive at bit values from many directions, and direct observation of code symbol patterns as can be documented for both proteins [Durston et al] and DNA [others], can be used to refine the flat-random case where necessary. Not that it makes much difference to the practical outcome. So it makes reasonable sense to use the carrying capacity and go with the bigger threshold of 1,000 bits to be conservative. When you load up a USB stick or a CD you do not worry to find out how the codes balance the symbols so you can report the communication bits [IIRC someone calls them shannons, I have seen binits too] vs the simple carrying capacity bits. In the relevant cases we are looking at genomes with 100,000- 300,000+ base pairs or 200 + kbits of carrying capacity; 1,000 bits is 1/2% where for every extra bit you double your config space. Experiments show that when they cut down bacterial genomes below 300 k bases, they tend to get disintegration of life function. For 200 k bits, config space is 9.98*10^60,205. 7: Given a hypothesis of a random walk, it’s extremely unlikely that the earth’s orbit would be elliptical around the sun. An ellipse is easy to describe, thus specified. And to boot, the earth’s trajectory is functional — a random walk trajectory would result in a dead planet. So the earth’s orbit is specified and complex. Therefore designed? Strawman. Astonishing one too, for someone who has been around at UD for years. Earth's orbit as a conical section modified by perturbations, has long been explained on mechanical necessity as in the first node of the filter. This is the equivalent of the simple order of the crystal as Orgel described. Where things do get complex is the structure and location of he solar system and having a planet like earth in it. From extrasolar systems, the commonest thing is to have disruptions from hot jupiters, some of which are on very eccentric orbits. A nice neat undisturbed system is rare. A moon like ours relative to our planet is a rarity too it seems. And a lot more. Please don't caricature the actual observations on the privileged nature of our solar system. Please view here. 8: you can come up with ad hoc reasons for denying that the earth’s orbit is complex, or specified, or both. Thus the call for an established rigorous definition, so we can come to agreement. As you well know the trichotomy chance necessity and art is NOT ad hoc. It is in fact foundational to how we design and implement experimental exercises, especially when we have to look at plots, treatments, control groups etc and address as well the inevitable statistical scatter. I suggest you take some time and read and reflect on this ID foundations series post. _________ Shakin' me head . . . GEM of TKI kairosfocus
You forgot one main point R0bb- ev is a targeted search, which means it is an irrelevant example. Joseph
kairosfocus, WRT random walks: You repeatedly argue that random walks, monkeys on typewriters, etc. are impotent, as if anyone is challenging that fact. Every time you address claims that nobody is making, you create more words for your readers to sift through in search of something responsive. Darwinian evolution, whether or not you believe that it actually occurs in nature, is a far cry from a random walk. The question is whether it occurs and to what extent, not the efficacy of random walks. I asked how H is defined in the chi formula. You didn't answer directly, but you seem to be saying that we can just define it as a uniform distribution when we're calculating the specified complexity of biological structures. But I wasn't asking about H in general, not just for biology. Given a hypothesis of a random walk, it's extremely unlikely that the earth's orbit would be elliptical around the sun. An ellipse is easy to describe, thus specified. And to boot, the earth's trajectory is functional -- a random walk trajectory would result in a dead planet. So the earth's orbit is specified and complex. Therefore designed? Of course, you can come up with ad hoc reasons for denying that the earth's orbit is complex, or specified, or both. Thus the call for an established rigorous definition, so we can come to agreement. R0bb
kairosfocus, WRT random walks: You repeatedly argue that random walks, monkeys on typewriters, etc. are impotent, as if anyone is challenging that fact. Every time you address claims that nobody is making, you create more words for your readers to sift through in search of something responsive. Darwinian evolution, whether or not you believe that it actually occurs in nature, is a far cry from a random walk. The question is whether it occurs and to what extent, not the efficacy of random walks. I asked how H is defined in the chi formula. You didn't answer directly, but you seem to be saying that we can just define it as a uniform distribution when we're calculating the specified complexity of biological structures. But I wasn't asking about H in general, not just for biology. Given a hypothesis of a random walk, it's extremely unlikely that the earth's orbit would be elliptical around the sun. An ellipse is easy to describe, thus specified. And to boot, the earth's trajectory is functional -- a random walk trajectory would result in a dead planet. So the earth's orbit is specified and complex. Therefore designed? Of course, you can come up with ad hoc reasons for denying that the earth's orbit is complex, or specified, or both. Thus the call for an objective rigorous R0bb
kairosfocus @ 61:
Pardon, but you seem to be falling into the same question-begging trap that MG has. (I was going to reply to her challenge here, but saw that you provided a live, fresh case in point.) The basic problem with programs like Ev etc...
I made only two points regarding ev. 1) PaV's calculation does not measure the same information that Schneider, Dembski, and Marks measure. 2) PaV's repeated objection that ev is irrelevant because it generates at most 96 bits of information doesn't hold water, because we can run ev multiple times. Your long response addressed neither of these points. Apparently you think I was making some other point. R0bb
Further counter-challenges to MG and ilk: 4: Kindly MATHEMATICALLY address the reduction of the Dembski type metric to the form in the OP: Chi = - log 2( 10^120* D2* p) Chi = [- log2(p)] - [398 + K2] Ip = - log2(p) Chi_500 = Ip - 500, bits beyond the threshold of sufficient complexity. Chi_1000 = Ip - 1,000 bits beyond. Where it is held that once a reasonable estimate of information in a string or a nodes arcs and interfaces network and/or any codes found therein can be evaluated on ordinary ways to deduce such, Ip can be assigned such a value for relevant purposes. Where it is also held that 500 bits or 10^150 possibilities exceeds the quantum state capacity of the solar system since the big bang, and that similarly 1,000 bits exceeds the capacity of the observed cosmos of some 10^80 atoms. If this derived metric is "meaningless" on being undefined mathematically, kindly show why. It is taken that information is a neg log probability metric, on a flat distribution or a non-flat distribution. the VJT upper limit for phi_S(T) is taken.(It is taken that the Seth Lloyd and quantum state calculations for the solar system and cosmos can be done, as by Demsbki et al or by Abel et al.) 5: Kindly show that the use of the Durston metric values as shown in point 11 OP, in the reduced Dembski metric is meaningless if objected to, where on using these values and Chi_500, we may see for three samples from Durston's table of fits values for protein families carrying out various functions across the domain of cell based life:
Using Durston’s Fits from his Table 1, in the Dembski style metric of bits beyond the threshold, and simply setting the threshold at 500 bits:
RecA: 242 AA, 832 fits, Chi: 332 bits beyond SecY: 342 AA, 688 fits, Chi: 188 bits beyond Corona S2: 445 AA, 1285 fits, Chi: 785 bits beyond.
The two metrics are clearly consistent, and Corona S2 would also pass the X metric’s far more stringent threshold right off as a single protein. (Think about the cumulative fits metric for the proteins for a cell . . . )
6: If you disagree with the claims in the OP and similar places, kindly show why ad how the claims advanced are in material error to the point of being meaningless. 7: Further to this, on the assumption you can SHOW error to the point of meaninglessness [which normally implies a reductio ad absurdum], kindly then demonstrate why putting them forward constitutes DISHONESTY, rather than simple error as is a commonplace in scientific, mathematical and other work. 8: Also, show that it is UNREASONABLE and perhaps dishonest (not just an error, if you can show that) to infer that:
i: CSI as mathematically modelled by Dembski -- starting with NFL p 144 and 148 as cited in OP -- and ii: as reduced to metrics by him and/or others, is iii: materially similar in meaning and context of use to iv: the specified complexity discussed by Orgel [1973] and others using similar terms, and/or v: the complex functional organisation described by Wicken [1979] using terms like wiring diagrams and functional complexity that is information rich.
GEM of TKI kairosfocus
Counter-Challenge to MG & ilk: 1:Kindly, explain why Schneider (as just shown [HT: Mung], on the issues raised above) was not able to perceive that his process of selection was patently artificial, and that the context of being locked down to an island of intelligently set up function, was also artificial, i.e intelligent. 2: Bonus: kindly explain why Schneider seems to think the plausibility limits discussed by Dembski, Abel and others [all the way back to the thermodynamicists -- cf the Infinite monkeys discussion here] is arbitrary and not rooted in a physical limit to random walks and chance variation in a wide config space that has isolated islands of function in it. 3: Bonus 2: If you disagree with the characterisation of islands of function, show how the general pattern of Wicken wiring diagram functionality based on codes and related nodes, arcs and interfaces organisation, is not like that, i.e that functionally specific complex organisation is not specific. (Remember, text is a string wiring network: s-t-r-i-n-g [where we have codes for meaningful symbols], and that this includes flowcharts, reaction networks, organisation of petroleum refineries, computer motherboards and similar circuits, the exploded view breakdown of multipart mechanical assemblies like a car engine, etc etc.) GEM of TKI PS: MG, I hope you can find it in yourself to face and withdraw the unwarranted accusations/insinuations of deception or dishonesty you have made for some days now in this thread. kairosfocus
PPS: Schneider inadvertently reveals the depth of his misunderstanding here in his announcement of the "victory" in his horse race:
3 sites to go, 26,300 generations, Rsequence is now at 4.2 bits!! So we have 4.2 bits × 128 sites = 537 bits. We've beaten the so-called "Universal Probability Bound" in an afternoon using natural selection!
NOT The selection and the selection context were both plainly art-ificial, i.e. intelligent. kairosfocus
PS: Mung's link to the PN discussion of ontogenetic depth is useful. The point Nelson brings out is that the cell types are important but so also is the body plan, on a Wicken wiring diagram that has somehow got to be set up. That is, the path from fertilised ovum to embryo then to developed body plan in the adult [for dozens of major plans and myriads of relatively minor variations on such], is dependent on not only coding for proteins and regulating their expression to release them, but on spatial organisation into a functional whole. The FSCO/I threshold already crossed when we look at coding for proteins, is only the beginning. The development of an embryologically viable individual and then a functioning adult and population that reproduces successfully, multiply the origin of functionally specific complexity challenge posed by life forms. kairosfocus
Onlookers (& Mung): Mung links the Schneider page where he presents a horse race between his program -- notice, the intelligent designer is here presented [before he ducks behind the curtain to manipulate the puppet strings by the proxy of his algorithms loaded into the computer] -- and the Dembski style threshold for random walk based search. Of course, his program wins the race! Voila, Dembski -- that IDiot -- is exposed as an imbecile or fraud yet again! NOT . . . What Schneider is stubbornly (sorry, but only so strong a word will do, after this much opportunity and pleading to get things right have been willfully dismissed) overlooking -- despite being warned again and again and despite this warning going back as far as Dembski's NFL and even Thaxton et al in TMLO -- in the first instance is the physical basis for the Dembski or Abel type plausibility limit. (He seems to think the limit is arbitrary and can easily be exceeded by blind random chance feeding into an automatic selection process. This is the equivalent -- sorry but the comparison is well merited -- to clinging to a scheme to create a perpetual motion machine after its flaws have repeatedly been exposed.) For a 500-bit limit, we can show that the atoms in our solar system [by far and away most being in the sun], across its timeline since the usual date for the big bang, will have gone though at most 10^102 Planck-time quantum states. Where to get to just a strong force nuclear type interaction we need about 10^20 -- a hundred billion billion Planck times; i.e. we are well beyond the nanosecond or so times for computer cycles. Similarly, the atoms of the solar system could not have undergone as much as 10^120 bit operations in the available time. If something is on an island of function deeply enough isolated that 10^102 quantum-time states cannot reasonably reach it on a random walk that works with trial and error selection, then if that thing is actually working, THAT IS BECAUSE IT IS NOT TRULY USING SUCH RW + TAE. [OOPS, the comment inadvertently posted.] Why am I confident of that? Because I am highly confident in the physics and the underlying statistical thermodynamics reasoning that drives this sort of conclusion. If your search for a needle in a haystack is cursory, you are utterly unlikely to find it except by extremely good luck. And we cannot build a theory of what happens or is held to have happened fairly often within the ambit of our solar system -- namely body plan origination -- on such extraordinary good luck. What then is really happening with ev etc? The clue lies in the fact that ev etc are intelligently designed programs, and in effect that they seek a target iteratively by starting within a island of complex function, i.e an algorithm that was set up to work, and is coded with oodles of information. A second clue is the behaviour of the defining equations for the Mandelbrot set: an apparently simple expression, on being scanned across the complex plane [sort of a special x-y graph where the y values are multiples of the imaginary square root of -1, called i], and fed into an algorithm that detects how long before a given point escapes a bounded zone and colours accordingly, produces INFINITELY DEEP COMPLEXITY. Remember, there is an infinity of points in any line segment or area. We can relatively easily specify an infinitely deep degree of complexity to be brought out by that patient robot proxy for our intelligence, a computer. the real problem here is to tell it where to stop digging in further, and we have to be fairly clever to do so, or it will try to carry out a supertask and get nowhere, thrashing around in endless loops. (That, in effect is what happens when a PC freezes. You have to force a halt by ESC or reset.) This is what genetic algorithms -- that is a sales name -- like ev etc are really doing: ______________ >> When something like Weasel comes along and claims that information is being generated out of the thin air of random number sequences filtered by a selection process, we need to take a closer look. Weasel in fact has a “map” of a space of possible configs, where the total is about 10^40. This is of course within trial and error search capacity of modern PCs, and certainly that of the raw processing power of the cosmos. But what is crucial is that there is a way of measuring distance to target, basically off difference from the target value. This allowed Dawkins to convert the whole space of 10^40 possibilities into a warmer/colder map, without reference to whether or not the “nonsense phrases” had any meaning in themselves. At each stage, a ring of random changes in a seed phrase was made, and this was then compared with the seed itself and the warmest was picked to move to the next step. And so on until in about 40 – 60 generations in the published runs, the phrase sequence converged on the target. VOILA! Information created out of chance variation and selection! NOT. The target was already preloaded all along. In this case, quite explicitly. More sophisticated GA’s do not load the target(s) EXPLICITLY, but do so implicitly. They have an intelligently designed well-behaved “fitness function” or objective function — one that has to have helpful trends pointing towards desired target zones — that is relevant to whatever you are trying to target that is mapped to the config space for the “genetic code” string or the equivalent; INTELLIGENTLY mapped, too. Notice, all the neat little sales words that suggest a parallel to the biological world that is less real than appears. Then, when the seed genome value or config is fed in, it is tested for some figure of merit that looks at closeness to target or to desired performance. A ring of controlled random/pseudo-random samples is tossed out. The one or ones that trend warmest are picked and the process repeats. Eventually, we find somewhere where changes don’t make for improvement. We are at a target and voila, information out of the thin air of random variation and selection. NOT. Again, look at how much intelligently designed work was put in to set up the island of function to wander around in on a nice slope and detect warmer/colder, to move towards a local peak of performance or “fitness.” No nice fitness landscape and no effective progress. No CONTROL on the degree of randomness in the overall system, and chaos would soon dominate. No warmer/colder metric introduced at just the right times on that nice slope function, and you would wander around blindly. In short, the performance is impressive but the prestidigitation’s impact requires us to be distracted from the wires, curtains, smoke puffs, and trap doors under the stage. So, when we see claims that programs like avida, ev, tierra etc produce arbitrary quantities of shannon or even functionally specific information, we need to highlight the intelligently designed apparatus that makes this impressive performance possible. >> ______________ Maybe you doubt me, after all, like Demski I am an "IDiot" and maybe even a suspect closet -- shudder! -- Creationist. (Don't ask my students or clients what they think about whether I know what I am talking about as a rule.) So let us call wiki as an inadvertent witness against interest:
In a genetic algorithm, a population of strings (called chromosomes or the genotype of the genome), which encode candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem, evolves toward better solutions. Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible. The evolution usually starts from a population of randomly generated individuals and happens in generations. In each generation, the fitness of every individual in the population is evaluated, multiple individuals are stochastically selected from the current population (based on their fitness), and modified (recombined and possibly randomly mutated) to form a new population. The new population is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum number of generations has been produced, or a satisfactory fitness level has been reached for the population . . . . A typical genetic algorithm requires:
1] a genetic representation of the solution domain, 2] a fitness function to evaluate the solution domain.
A standard representation of the solution is as an array of bits. Arrays of other types and structures can be used in essentially the same way. The main property that makes these genetic representations convenient is that their parts are easily aligned due to their fixed size, which facilitates simple crossover operations . . . . The fitness function is defined over the genetic representation and measures the quality of the represented solution. The fitness function is always problem dependent. For instance, in the knapsack problem one wants to maximize the total value of objects that can be put in a knapsack of some fixed capacity. A representation of a solution might be an array of bits, where each bit represents a different object, and the value of the bit (0 or 1) represents whether or not the object is in the knapsack. Not every such representation is valid, as the size of objects may exceed the capacity of the knapsack. The fitness of the solution is the sum of values of all objects in the knapsack if the representation is valid, or 0 otherwise . . . . Once we have the genetic representation and the fitness function defined, GA proceeds to initialize a population of solutions randomly, then improve it through repetitive application of [operations].
The highlighted words and sentences are revealing. The GA is an intelligently constructed artifact that is critically dependent on being able to map a solution domain and evaluate fitness, based on a structured representation of the domain that can be CODED in a string or similar [extended] data structure. Solutions and representations are specific to the given problem, and so the search for a search exponentiation of search challenge issue Dembski highlights becomes an issue. In short, this is what such GA's end up doing: _______________ >> Too often, in the excitement over how wonderfully evolution has been modelled and how successful it is, this background of intelligent design and the way it drives the success of such exercises is forgotten. We tend to see what we expect to see or want to see . . . . GA’s model how already functioning genomes and life forms with effective body plans may adapt to niches in their environments, and to shifts in their environment. Given the way that there is so much closely matched co-adaptation of components . . . — i.e the dreaded irreducible complexity appears, if there is a drastic shift in environment, this explains why there is then a tendency to vanish from the annals of life. In short, we see something that fits the actual dominant feature of the fossil record: sudden appearance as a new body plan is introduced, stasis with a measure of adaptation and relatively minor variations within the overall body plan — that are explained on relatively minor mutations [especially on regulatory features such as size and proportions], disappearance or continuity into the modern world. And, Ilion’s point is still standing: in our observation, even through a machine that “cans” it, information transformation comes from mind. >> ________________ GA's overcome the physical limitations of atoms blindly scanning states through chance and blind necessity, because we are able to intelligently organise machines and program them to solve problems, including by hill-climbing within an island of function that after much painstaking effort, we have set up. So, since the search has now been narrowed down intelligently to the near vicinity of a relevant target, it becomes feasible to search on a restricted use of chance variation to detect the slope that points upward. Then, just as the advice to a lost hiker is to find a stream and go downhill as that way lies civilisation, hill climbing on a known nice slope will head for the peaks of performance. But that is critically dependent on and expresses the information we have loaded in. As the Mandelbrot set maps and video zooms show, that information can in principle be infinite, only limited by the resources to express it through that canned proxy and tireless mechanical calculator that does just what we tell it to no more and no less [hence GIGO], the computer. GEM of TKI kairosfocus
I've just completed the first essay by Bruce Gordon in "The Nature of Nature" in which he puts forth eight reasons to prefer an intelligent design explanation over a neo-Darwinian gradualism explanation. In four of the eight (1, 3, 7, and 8) he appeals to complex specified information. Could MathGrrl have a valid point? But then, is anyone here really denying that ID theorists claim that CSI is a reliable indicator of design? The Gordon essay has an appendix in which he writes: [quote]How does one distinguish the product of intelligence from the product of chance? One way is to give a rigorous mathematical characterization of design in terms of conformity of an event to a pattern of very small probability that is constructible on the basis of knowledge that is independent of the occurrence of the event itself. [/quote] (Wow. Is it possible that's what Dembski means by a specification?) The mathematics involved look very similar to that which appears in the paper by Dembski which MathGrrl claims to have read and which she references in her OP. If someone who has the "Nature of Nature" book could compare the mathematics between the two I'd like to hear their comments. Mung
Of particular interest to me is the recent claim by MathGrrl that:
Schneider shows how to generate arbitrary amounts of Shannon information via ev.
Now apart from the fact that this is a vague and muddled statement (couldn't a random number generator just as well generate arbitrary amounts of Shannon information?) Schneider himself actually makes no such claim. So what is MathGrrl really trying to say? Well, trying to reconstruct from the context, MathGrrl in post #66 is responding to kairosfocus @44. MathGrrl:
I don’t see where you’re getting your 266 bit value
From the calculation done by PaV. kairosfocus @3
15: PAV has also pointed out that ev, the best of the breed per Schneider, peaks out at less than 300 bits of search, on a computer — which is far faster and more effective at searching than real world generations on the ground would be; i.e. well within the thresholds that the various metrics acknowledges as reachable by chance dominated search.
So MathGrrl appears to be arguing that ev can easily exceed the 500 bit threshold. It really seems to me that she's arguing over something that's not in dispute. Intelligently designed and manipulated (and there's no doubt that Schneider had to intelligently intervene to get his program to meet the new criteria) computer programs can do things that a random search cannot. Horse Race How much information did he have to inject in the new search for a search in order for the program to succeed? Mung
It's taking more than 24 hours for my posts to pass through moderation, so I've started reproducing them over at ARN starting HERE. I don't expect anyone to respond to posts there, so I'll try to be careful about what I say there. That said, I am providing links there to my posts here. Please check there occasionally to see if I've posted anything that hasn't shown up here yet. Mung
Talking point: The fact that [controlled, limited] random change + [artificial, algorithmic] selection [matched to a specified fitness metric on the space of possibilities in an island of function set up by designers of the relevant GA program] can produce complex designs [when run on equally intelligently designed and programmed computers]* doesn’t mean it’s the fastest or most efficient way of doing so . . . _________ * Has anyone actually OBSERVED a case of known chance plus blind necessity without intelligent guidance producing novel functionally specific complex genetic information beyond, say, 1,000 bits -- 500 bases -- of complexity? (Duplications of existing functional info don't count for reasons identified in 19 - 20 above.) More details in response to a request for explanation, here. +++++ Oops, I'se be a very bad boy . . . spoiling the rhetorical force of the objection by inserting the material parts that are usually omitted when it is made. "Well, I couldn't resist those hot oatmeal and raisin cookies, mama . . . " WHACK! GEM of TKI kairosfocus
Onlookers: Now that the unpleasant piece of housecleaning is out of the way, let us look on MG's playbook of objections, point by point: _________________ >> 12.1 Publish a mathematically rigorous definition of CSI 1 --> CSI is primarily a description of the real world pattern of complex functionally organised things that work based on some sort of Wicken wiring diagram. So this is wrong headed, as has been repeatedly highlighted. 2 --> Dembski and others have produced quantitative models of what CSI is about, and have in so doing made more or less good enough for govt work metrics. 3 --> The reduced form of the Chi metric is useful, whether in Torley's 500-bit threshold form or the 398 + K2 threshold form. So is the simple brute force X-metric. And, no number of drumbeat repetitions of dismissive hyperskeptical objections will change that. Utility will defeat cavails everytime. 4 --> The significance of the relevant thresholds for searches on random walk driven or dominated process, is plain. 5 --> if you reject these, either you believe in statistical miracles as commonplace [for which we find no cogent evidence], or you believe the cosmos was programmed to produce life and the sort of variety we see. Which boils down to you believe in a form of design model. 6 --> And of course point 11 in the OP now shows the application of the reduced Demsbki metric to the biological world, with success. 12.2 Provide real evidence for CSI claims 7 --> Willfully vague and hyperskeptical. If you mean that the only empirically known, observed source of CSI and especially FSCI is intelligence, look around you and ask where the web posts, library books, etc came from. Then ask if you have ever seen the like produced by chance plus blind mechanical necessity. 12.3 Apply CSI to identify human agency where it is currently not known 8 --> A strawman: the inference to design is not an inference to the designer. That twerdun, not whodunit. 12.4 Distinguish between chance and design in archaeoastronomy 9 --> the idea here is whether certain alignments of site layouts, stones etc in ancient sites were by accident or by design. 10 --> But in fact this test was long since met by archaeologists, e.g. when it was detected that the best explanation for certain features of Stonehenge were aligned with solstices, and would work for day counting etc. 11 --> The layout can be specified as a nodes, arcs and interfaces wireframe network, and reduced to a net list, The scale of the net list can be specified in turn and tested for whether it was beyond a threshold. 12 --> The inferred function can then be tested for variability to see if it fits in an island of function; for alignment of solstices and equinoxes or to specific prominent stars that is not hard to do and has been long done. Accuracy counts, and in particular accuracy against proper motion where that is relevant or against precession of the equinoxes. (This can even help date things.) 13 -->there will be marginal cases, where there is not a clearly identifiable and specific function, e.g any lines on the ground can probably be aligned with some star or other at some time of the year. 14 --> in these cases, as designed, the FSCI test will rule conservatively: chance contingency not choice contingency. A false NEGATIVE. 12.5 Apply CSI to archaeology 15 --> Done, as just seen. 12.6 Provide a more detailed account of CSI in biology 16 --> Notice how vague this is? DNA is rich in 4-state, digitally coded information beyond the 500 or 100 bit thresholds, many times over as repeatedly documented. RNA templates off this and puts it to work. 17 --> Proteins are coded for and work on DNA through RNA. 18 --> The complex functional organisation of the cell can be reduced to node and arc diagrams, starting with say the network of cellular metabolic reactions and the regulatory networks. Nodes, arcs and interfaces can be reduced to netlists, and counted up in bits. We already know the answer: design, emphatically; on being very functionally specific and well past the 1,000 bit threshold. 19 --> In addition, we have the infinite monkey analysis to tell us that it is utterly implausible that something so complexly and specifically organised will be feasible of blind random walks and mechanical necessity on the gamut of the observable cosmos. 20 --> this is not rejected for want of empirical or analytical support, but for want of fit with the prevailing evolutionary materialistic agenda in science as exposed by Lewontin, Sagan the US NAS, etc etc. Indeed, the cite form the paper at this point is all too inadvertently revealing of the Lewontin materialist a priori at work:
It is our expectation that application of the "explanatory filter" to a wide range of biological examples will, in fact, demonstrate that "design" will be invoked for all but a small fraction of phenomena [what is the evidence trying to tell you?], and that most biologists would find that many of these classifications are "false positive" attributions of "design."[In short a naked appeal to the evo mat consensus of the new magisterium]
12.7 Use CSI to classify the complexity of animal communication 21 --> Vague and overly broad. Animals in general do not communicate using abstract,functionally specific complex symbol sets or strings. 22 --> In the case that leaps to mind, bee dances, this is obviously genetically programmed, and it would trace to the FSCI in DNA, which is designed on the inference from FSCI. 23 --> It is astonishing how many times the challenge cases are tossed out by the authors, but he cases that have obvious answers that show that the FSCI metrics and the explanatory filter have reasonable answers even if you differ with the conclusions, are passed over in a telling silence. 24 --> If bird songs are symbolic and functional with complexity that can be discerned beyond the 1,00 bits then that would point to design as the source. The real issue would be where the design rests, e.g are the birds giving evidence of verbal communication, and same for the dolphins or whales. 25 --> Show the function and the complex specificity then we can look at what the design filter says about type of source. 26 --> If whales do have personal signatures that are evidently deliberately constructed on an individual basis then that is a sign that the whales are intelligent enough to do that. Which would be great news, and would compound our guilt over our wanton slaughter of these wonderful creatures. 12.8 Animal cognition 27 --> The pattern of a rat traversing a maze and learning the pattern shows some measure of deliberation on the part of the rat, i.e some measure of design. 28 --> Oddly while MG casts this up as a challenge, the authors give a grudging concession:
We note the use of examples in Dembski's work involving a laboratory rat traversing a maze as an indication of the applicability of CSI to animal cognition [16, 17, 19].
29 --> in other words, a success by the EF on FSCI!>> _________________ Plainly, the list of objections is by no means so formidable as it is made out to be. And, the design approach with tools such as FSCI, and IC, is promising; or in fact it is routinely used on an informal basis, we do qualitatively recognise functionally specific and complex organisation and associated information all the time and instinctively or intuitively infer from it to design, e.g the very posts on this thread. GEM of TKI kairosfocus
Onlookers: It is a pity that I have to start this by speaking to a serious challenge MG needs to address before she can sit to the table of reasonable discourse. Let's get it out of the way then move on to the substance of the burst of comments she put up today. Now, you know by now that MG has been simply recirculating objections and dismissals, even implying that the numbers I clipped -- by way of illustrative example -- from the excerpts she gave or that were otherwise accessible wee taken out of the air. Maybe she did not bother to read, maybe she is willfully accusing me of wrongdoing, but in any case she is out of contact with the observable facts. Someone who accuses others of being "dishonest" -- not merely in error but willfully deceitful -- has a stiff burden of proof to meet, which MG simply has not done. That is sad, and it is a further demonstration of just how completely her challenge over the past weeks has collapsed. Now too, I don't know if I have overlooked something, but somehow it looks like the case where I have used Durston metrics of FSC to give info estimated to fit into the reduced Chi-metric somehow has gone without comment by MG. If I have seen correctly, that may be the most telling thing of all, as the case is the one with indisputably bioloogical info reduced through the Demsbki metric. If I have not seen correctly could someone kindly draw that to my attention? GEM of TKI kairosfocus
Onlookers: Joseph made a good catch above in pointing out that "Shannon Info" is not funcitonal info or complex specified info more specifically. A flipped coin can easily generate unlimited amounts of Shannon info. But if your flipped coin starts counting up in binary or outputs a text say from Shakespeare and does so beyond 1,000 bits of info, you better sit up and take notice of that coin. GEM of TKI kairosfocus
Robb: Random walks do not create 500+ or 1,000+ bits worth of functionally specific, complex information. (Notice the limitations of the infinite monkeys random text generators as already discussed above. Notice, too the reduction of Dembski's Chi metric to a bits beyond a threshold measure. Once you can by some reasonable basis estimate information to create specific function in bits, and do the comparison.) GEM of TKI kairosfocus
Robb: Now, above you spoke much about the need to reckon with all possible chance hypotheses before one can evaluate an information metric. Not really. That may be analytically so in principle for a deductive type proof, but in fact in engineering work, information metrics look at the constraints on the information-bearing units in the string statistically. The simplest model is the equiprobable model, and then we move away from that by studying actual symbol patterns, e.g the famous E is far more common in English than say X, so it conveys a lot less "surprise" thence information when it shows up. When we look at the DNA chain or the AA amine-carboxylic acid backbone of a protein, we find there are few if any physical constraints on chaining, so it is a reasonable first rough model to use a flat random distribution; as PAV did. Beyond that, the solution is to do what Durston et al did, observe statistical frequencies of occurrence within the messages, which gives the probability patterns to feed into the pi log pi summation. Facts of real distributions trump speculative models of possible distributions, every time. In their case this was in the context of protein chains to give biofunction. Empirically quite adequate (and BTW, above I linked a similar example from Bradley that has sat in my notes for years). The actual sampling of a big enough cross section of a population can tell you a lot about the population, never mind perverse cases like the novel deliberately written in the 1930's without a single letter E. (How they got along without the, he, she etc and so forth escapes me. That sentence just past has eight e's in it.) Going beyond, what you are really debating is thresholds of reasonable searchability. We know that since search algor apart from random search has to be -- in our observation intelligently -- tuned to the environment to produce superior results on average relative to random walk [and since the non-foresighted, non-informed search for a tuned search is exponentially more difficult than the simple random search], random walk is a good enough on average metric.A detuned hill climber will in effect send you astray. Same message again. Unless, you are willing to argue that there are hidden cosmic laws that rig the thermodynamics of those warm little ponds to come up with the right clusters of pre life molecules and to organise them in the right networks to do metabolic work and to have associated von Neumann replicator facilities. That is tantamount to saying that the cosmos is rigged to produce life in environments in Galactic and circumstellar habitable zones of spiral galaxies. Which is a design view in all but acknowledged name. Now, let's look at what a 500 bit threshold (the evident upper tendency of the 398 bits + K2) does. The solar system has in it about 10^102 possible quantum states -- Planck time states, 10^20 times faster than strong force nuclear interactions -- in the time since formation. 398 bits corresponds to 10^120 states, which swamps the search resources, especially the CHEMICAL interaction search resources. (Fast, strongly ionic inorganic reactions may go in like 10^-15 s, but organic reactions are often millisecond to second or worse affairs. There is a reason why the ribosome is running at about 100 AA/s.) That is why 500 bits is a reasonable threshold for search complexity. Notice, we are here counting by comparison with atomic level interaction times. That is going to swamp any non-foresighted process. As you know, my choice of threshold in my X-metric and in the comparable Chi-1000 metric is 1,000 bits. I set that up by squaring the number of relevant states for the cosmos across its lifespan. The 10^80 atoms of the observed cosmos, running in Planck time steps, for the thermodynamic lifespan of the cosmos -- ~ 50 million times the timeline since the usual date for the big bang -- cannot sample 1 in 10^150th part of the configs for 1,000 bits. That is not a probability barrier, that is a search resources barrier. It does not matter what hypothesis -- apart from the miracle of life being written into the laws and initial circumstances of physics in a way presently hidden to us -- you pick, blind chance is going to be simply impotent to sample enough of such a pace to give a plausible expectation that islands of function in it will be found, given that such are fairly specific, in order to carry out code or algorithm or language related functions, or even specifying nodes, arcs and interfaces in a multiple component system. Brute force works every time. At the same time, 1,000 bits is 125 bytes. Or 143 ASCII characters. From the programmer's view, you are not going to put together a significant all-up control system in that span. Oh, you could store some sort of access-key password, or the like, but you have simply displaced the real processing work elsewhere. BTW, that is the common core problem with avida, ev, tierra etc, they are effectively doing searchable range access key passwords and calling them "genes" or some nice sales name like that. (But make those access keys long enough and make the relevant configs that work realistically rare in the field of possible configs and see what will happen; like say 125 bytes worth of looking, for passwords that are like 1 in 10^150 of the space in aggregate: infinite monkeys, on steroids. As I keep on saying, if you see a lottery that keeps getting won, it is designed to be winnable.) The heavy lifting is going on elsewhere and they are saying this pattern triggers option A or option B. That is why I keep pointing out that you cannot start within an island of function and do more than model microevo, which is not in dispute. Not even by modern young earth creationists. In the real world, any TRULY successful macro evo theory that starts with a warm little pond has to credibly get to cells with metabolism and code based self-replicating facilities, writing the codes along the way. Then, it has to get to tissue-and enzyme etc protein codes and regulatory networks to implement complex body plans, with 10 mn+ bases worth of DNA for each major plan, dozens of times over. On Earth (or Earth plus Mars)! That requires so many statistical miracles that the chance variation and natural selection model is a non starter. Full stop. Darwin's theory was a pre information age theory, and it has had to be force-fitted over the information technology findings that have come out over the past 60 or so years. The wheels are wobbling and coming off . . . GEM of TKI kairosfocus
Specification: The Pattern That Signifies Intelligence This is the paper referenced in MathGrrl's original OP. Look at the title. My gosh. I wonder if there's anything in it about specification as that term is understood and used by Wm. Dembski himself. Mung
MG @40
If, as seems increasingly likely, you can’t provide a rigorous mathematical definition of CSI as described by Dembski and show how to apply it to my four scenarios, please just say so and we can move on.
In case you haven't noticed, I'm not playing your game. MG @41
My participation here is solely so that I can understand CSI well enough to be able to test whether or not known evolutionary mechanisms can create it.
Schneider claims to have created it. Do you doubt him? MG @67
As noted in my guest thread, specification is one of the more vague aspects of CSI. Some ID proponents don’t seem to have a problem with the specification[s] I suggested...
I have a problem with them. As I wrote in @59 above:
Then I’d try to find out if MG even had a clue what Dembski means by a specification, or if she even read the paper that she was quoting from in her OP.
Unfortunately, that post is still in the moderation queue, so you probably haven't seen it yet. [Note to mods: Please take me out of the moderation queue. It makes it near impossible to carry on a conversation with such long delays. Thank you.]
Why do you think that “Produces at least X amount of protein Y” is not a specification in Dembski’s sense? Please reference his published descriptions of CSI that support your view.
Please reference the Dembski paper you referenced in your original OP. You did read it, didn't you? Mung
Onlookers: You will note that I explicitly previously cited actual output numbers given by the claimed evolutionary algorithms, to illustrate how the Chi metric turns information numbers into bits beyond a threshold value and answers the question of whether it is reasonable that the output could have come from chance, if it is functionally specific or the like. But that is not the real elephant in the middle of the room. At this time MG is refusing to observe the underlying and inescapable problem with the "arbitrary" quantities of information "produced" by her favourite programs. As I pointed out -- now, above, and previously and as others have pointed out al the way back to the NFL ch 4 -- the core problem can be seen with the example of using the Mandelbrot set used as a fitness function. Namely, the random walk does not create functional information de novo, it only samples the output of a built in function that defines fitness, and it progresses on the instructions of a hill-climbing algorithm. As is a commonplace, a processor takes in an input, processes it and transforms based on its instructions, to give an output. Where do the actual sources of information -- inputs, algorithms, fitness functions and the like -- come from? INTELLIGENCE. Worse, in fact, PAV is quite right. When such a program searches in random walk space, it has strict limits on what it can explore absent being already confined to an island of function. If we have a space of 1,000 bits worth of possibilities, the search capacity of the cosmos is fruitlessly exhausted without sampling even 1 in 10^150th of the config space. So, the performance and the actual capacity to generate info by real random walks is going to be appreciably less than 500 bits. Absent the sort of helpers I highlighted in 44 ff, these program would be facing exactly the limits given by the random text genrators. Spaces of order 10^50 or so possibilities are searchable for islands of function within present computing resources. those within 10^500 possibilites are searchable within the resources of the whole cosmos. But those of 1,000 and beyond are beyond reach. So, if someone presents a program that starts in an island of function and does a hill-climbing tour, you know where the information really came from: active information injected (without realising that) by the programmers. GEM of TKI kairosfocus
MG @66
Again, I don’t see where you’re getting your 266 bit value, but Schneider shows how to generate arbitrary amounts of Shannon information via ev.
MG, you have a bad habit of citing things without having read them. I just loved bit:
We've beaten the so-called "Universal Probability Bound" in an afternoon using natural selection! (emphasis mine)
Mung
Loved that first Mandelbrot Set youtube vid, lol. Also, if youtube is to be believed, tornadoes actually avoid junkyards. Mung
Mathgrrl: Thank you for your post. As for specifics about the calculation of CSI for the test cases discussed by Elsberry, I have already written a post outlining the methodology by which his questions could be resolved: https://uncommondescent.com/intelligent-design/a-test-case-for-csi/ All we really need are two things: (i) hard data relating to the probability distributions of various patterns in nature, and (ii) a detailed inventory of the mode of origin of each of the various patterns that are observed to arise in the natural world. (If we can't do this for large and/or complex patterns, we can at least do it for small and/or simple ones.) Given these, Elsberry's questions become tractable: even large, complex patterns can be decomposed into smaller parts, on which we can perform the requisite mathematical calculations. Compiling the relevant data as well as the inventory of origins is a formidable task that will take some decades, however, even with millions of volunteers co-operating. Your put-downs of Professor Dembski's CSI metric relate to just one factor in his equation: the number of other patterns that are as easily describable as the pattern we are investigating. In any case, as I pointed out, this is the least significant factor in the equation for CSI. Replacement of this factor by 10^30 renders the calculation both objective and (given the requisite data and probability distribution - see above) computable, and yields a figure very close to Professor Dembski's CSI. You have yet to respond to my challenge regarding the four scenarios you describe:
Please provide us with a two- or three-page, detailed but completely jargon-free description of the four scenarios you are describing and post it up on UD. No references to other papers by biologists, please. Describe the problems in your own words, as you would to a non-biologist (which is what I am). Then I might be able to help you.
I'm still waiting. vjtorley
MG: I am sorry to have to be direct, but your further action in the teeth of well-merited correction now demands such:
your endlessly repeated selectively hyperskeptical talking points and supercilious dismissals are now both plainly delusional and willfuly slanderously dismissive.
You have called people, for no good reason, dishonest - and please don't hide behind subterfuges such as no, I have only asked questions. To raise such suggestions is itself an accusation. One that any fair minded and informed reader will at once see is groundless. As a matter of fact, adequate definitions have been provided, in great details and for many weeks now. In addition, responses to your challenges have been given and the reduced form of the Demski metric, with the Durston case, has provided us with a handy list that shows how the approach can indeed be applied fairly simply to biological systems. (In this context, Robb's recycled objections on possibilities for chance hyps are irrelevant: we know the patterns and the ranges from the test of life itself in the range of protein family variations.) Please, MG, change your ways. GEM of TKI kairosfocus
MathGrrl:
As noted above, Schneider shows how to generate arbitrary amounts of Shannon information via ev.
Specified informatin is a specified subst of Shannon Information-> ie it is Shannon information with meaning/ functionality. Joseph
vjtorley,
I’d like to clarify. In my original post on the CSI scanner, I argued that Dembski’s CSI was calculable, but not computable.
Yes, I read that. While I appreciate your efforts and look forward to further discussion with you on that topic, in this context yours is a distinction without a difference. You seem to recognize that CSI as described by Dembski cannot be used to calculate an objective, numerical metric. Claims by ID proponents that rely on such a metric are therefore clearly unsupported.
In a subsequent post, I then provided you with a simplified version of CSI, which I christened CSI-lite. CSI-lite is both calculable and computable.
That's fine, but it's not CSI as described by Dembski. Dembski's metric is the one generally accepted by ID proponents. If you can demonstrate that your CSI-lite is an unambiguous indicator of the involvement of intelligent agency, I'll be happy to spend some time testing those claims. Wesley Elsberry and Jeffrey Shallit have documented several excellent tests for your metric:
12.1 Publish a mathematically rigorous definition of CSI 12.2 Provide real evidence for CSI claims 12.3 Apply CSI to identify human agency where it is currently not known 12.4 Distinguish between chance and design in archaeoastronomy 12.5 Apply CSI to archaeology 12.6 Provide a more detailed account of CSI in biology 12.7 Use CSI to classify the complexity of animal communication 12.8 Animal cognition
MathGrrl
PaV,
when the ev program produces less than 96 bits of actual information
As noted above, Schneider shows how to generate arbitrary amounts of Shannon information via ev. Does this constitute CSI? If not, why not? MathGrrl
PaV,
Because something is difficult to demonstrate, doesn’t mean it doesn’t exist.
While you are correct in general, in the case of quantitative metrics, as Dembski and other ID proponents claim CSI to be, lack of a rigorous mathematical definition does, in fact, mean that it doesn't exist.
I’ve already walked you through an example of how CSI works. Do you dispute that I had done that?
Yes. I have yet to see anyone provide a rigorous mathematical definition of CSI that is consistent with Dembski's published descriptions, nor have I seen anyone demonstrate how to calculate CSI objectively according to such a definition.
Instead you continue with your DEMAND that these four scenarios be analyzed…….or else!!!
I had no idea that my words had such force. Here I was thinking I was just asking for an explanation from the people who claim to understand the concept. I'll type more softly when replying to you in the future.
So, how about your first scenario: “A simple gene duplication, without subsequent modification, that increases production of a particular protein from less than X to greater than X. The specification of this scenario is “Produces at least X amount of protein Y.” First, why do you think “Produces at least X amount of protein Y” is a “specification”. CSI deals with events. So, please tell us, what is the event.
As noted in my guest thread, specification is one of the more vague aspects of CSI. Some ID proponents don't seem to have a problem with the specification I suggested (see a couple of the comments above in this thread, for example). Others, like you, seem to have a different concept. Why do you think that "Produces at least X amount of protein Y" is not a specification in Dembski's sense? Please reference his published descriptions of CSI that support your view. MathGrrl
kairosfocus,
1: the fact that no ID proponent can calculate CSI for my scenarios You plainly died not look at the posts at 19 above.
I did. You tossed out a few numbers but certainly didn't provide a rigorous mathematical definition of CSI that is compatible with Dembski's published descriptions.
Scenario 1, the doubling of a DNA string produced no additional FSCI, but the act of duplication implies a degree of complexity that might have to be hunted down, but would be most likely well beyond 500 bits or 73 bytes of info.
I'm not entirely sure what you mean by that last sentence, but Dembski clearly states that CSI should be able to identify the features of intelligent agency in an object "even if nothing is known about how they arose". Do you disagree with that? If you're basing your calculation solely on the length of the genome, gene duplications easily exceed your 500 bit limit.
Scenarios 2 – 4 were computer sims, and as PAV long since noted Ev was within 300 bits, far below the significance threshold. 266 bits – 500 bits = – 234 bits lacking.
Again, I don't see where you're getting your 266 bit value, but Schneider shows how to generate arbitrary amounts of Shannon information via ev.
Scenario 3, top of no 20, point 14, on the cited bit number I found, and corrected: Chi_tierra = 22 – 500 = – 478
More numbers pulled out of thin air, with no CSI calculation to be seen. You seem to be assuming that the 22 instruction parasite appears de novo. In fact it never appears earlier than a few thousand generations into a run. One reason I included this scenario is to understand how CSI calculations take known evolutionary mechanisms into consideration. As near as I can tell from your vague description, your version of CSI doesn't consider them at all.
Scenario 4, Steiner has numbers up to 36 bits I find just now: Chi_steiner = 36 – 500 = – 464
And you finish with yet more arbitrary numbers and no rigorous mathematical definition of CSI. It is very easy to specify Steiner problems that require more than 500 bit genomes to compute. The bottom line is that you haven't provided a rigorous mathematical definition of CSI. Your "calculations" are completely arbitrary and seem consist of little more than raising two to the power of whatever number you arbitrarily select for each of my scenarios. That's neither convincing nor descriptive enough to show how to calculate CSI objectively. MathGrrl
kairosfocus, I'm not a math educator, but thanks for the compliment. Again, Dembski defines specified complexity in terms of all relevant chance hypotheses. Can you tell us how to determine what chance hypotheses are relevant? Do you understand that if we consider only the chance hypothesis of random noise, we get a plethora of false positives? Would you like some examples? R0bb
MG: Please be reminded of the corrections to your assertions of yesterday, starting from 44 above. They include cases of unwarranted and demonstrably false accusations against the characters of others on your part, so please pay close attention. Attention is also particularly called to 11 above, which integrated Durston with Dembski and provides three examples from the 35 that are directly possible, of Chi metrics of biological, real life systems. GEM of TKI kairosfocus
PPPS: Also, on OOL. In that context, Robb, you are looking at getting from some warm little pond to first life, a metabolising entity with an in-built von Neumann self-replicating facility. The evidence from observed life -- the only relevant actual observations in hand -- is that we are looking at about 100+ k bits worth of info in the blueprint storage section or tape of the vNSR. This is vastly beyond 500 - 1,000 bits. And, there is no observational evidence to support a hypothesis of a nest staircase of simpler components forming an OBSERVED stepwise sequence to that first living cell. Not even a staircase with some treads missing, it is all but wholly missing apart form some speculative just so stores and some autocatalysis under irrelevant chemical circumstances. I submit that on the evidence in hand the best explanation for OOL, just as for the origin of a fine tuned cosmos that is supportive of C-chemistry cell based life, is design. And if design is already on the table seriously for these two cases, there is no reason not to entertain it for the emergence of body plans, including our own. kairosfocus
PPS: On the issue of non-uniform probability distributions. Robb, first and foremost, the reduction of the Chi-metric to a threshold renders the old fashioned "how do you set up probability distributions" objections moot. If you accept that info theory is an existing, scientific, established discipline, then you will know that information is as a rule measured on a standard metric -- and yes, this is to the short discussion of info th 101 in my always linked briefing -- tracing to Hartley and Shannon: Ii = - log(pi) p(T|H) is a relevant form of the probability in question, as the probabilities are in the context of an assumed distribution of the likelihood of given symbols in a given code. (In the OOL context, Bradley builds on Yockey as clipped here in my always linked APP 1, point 9, this has been a couple of clicks away for years, from every post I have made as a comment at UD, Robb.] If you look at no 11 above, you will see that Durston et al give one very relevant practical way to define the chance hyp alternative; by studying sequence distributions of observed protein families, AKA sampling the population. In general, you should be aware that the definition of the probability distribution of symbols is an integral a part of standard info theory, that is why Shannon defined a metric of average information per symbol, based on a weighted sum: H = - [SUM on i] pi* log(pi) That H-metric is exactly what Durston et al build on, and it then allows us to use the reduced, threshold version of the Demsbki Chi metric to deduce, as was posted above in comment 11 and now appears in the OP as revised:
RecA: 242 AA, 832 fits, Chi: 332 bits beyond SecY: 342 AA, 688 fits, Chi: 188 bits beyond Corona S2: 445 AA, 1285 fits, Chi: 785 bits beyond.
Remember, by accepting p as given in Info theory, we may then proceed to reduce Demsbki's Chi metric:
From: Chi = – log2(10^120*phi_S(T)*p(T|H) To: Chi_500 = Ip – (500), in bits beyond a threshold of complexity
In this form, what is going on is far plainer to see, i.e we are measuring in bits beyond a threshold. Debate how you set that threshold all you want, there is no good reason to see that 500 - 1,000 bits will not be adequate for all practical purposes. Do you care to suggest that we can easily and routinely observe random walks finding islands of function in spaces of at least 10^150 or 10^301 possible configs? Also, given how the Durston case fits in, we can see that the objections raised above are moot -- they say in effect that you are disputing standard info theory. Fine, that's how science progresses; your task now is: provide an alternative metric for I than the ubiquitous bit, and persuade the world that he bit is inadequate as a metric of information because you can raise exotic probability distribution objections. And, your case is . . . ? kairosfocus
Robb: Pardon, but you seem to be falling into the same question-begging trap that MG has. (I was going to reply to her challenge here, but saw that you provided a live, fresh case in point.) The basic problem with programs like Ev etc, as was --AGAIN -- pointed out in 19 - 20 above and -- AGAIN -- brushed aside, is that they start from within defined islands of function where the very act of setting up the working program feeds in oodles of functional specificity in the form of matching the search to the space and the so-called fitness or objective function metrics involved. That is why the better approach is the sort of program that we see in the infinite monkeys type tests; which show us that we can reasonably search spaces of about 170 bits worth of possibilities, but there is a clear challenge to search a space of 500 - 1,000 or more bits worth of possibilities. Notice the cited challenges of pre-loaded active info in ev as cited in 19 - 20. All of this rests on the basis of the NFL results that lead to the conclusion that the non-foresighted search for a search is exponentially harder than the random walk driven direct search for zones of interest. It is intelligently inputted, purposeful information that matches search to config space and objective function, that feeds in warmer/colder metrics that allow hill climbing within islands of function, and more. In short, there is a serious problem of question-begging in the assumption that intelligently designed, complex algorithms with loads of intelligently input information involved in their function are creating new information, rather than simply transforming -- usefully -- already existing information added by their creators. For instance there is nothing in the wonderful output of a Mandelbrot set program that is not already pre-loaded in the inputted algorithm and start-up information, as well as of course the routines that set up the lovely and admittedly quite complex displays. To see what I am driving at, consider a Mandelbrot set program that selects points to draw out and display based on a random walk across the domain of the pre-built functions. As background, Wiki sums up:
The Mandelbrot set is a particular mathematical set of points, whose boundary generates a distinctive and easily recognisable two-dimensional fractal shape . . . . More technically, the Mandelbrot set is the set of values of c in the complex plane for which the orbit of 0 under iteration of the complex quadratic polynomial zn+1 = zn2 + c remains bounded.[1] That is, a complex number, c, is part of the Mandelbrot set if, when starting with z0 = 0 and applying the iteration repeatedly, the absolute value of zn never exceeds a certain number (that number depends on c) however large n gets. For example, letting c = 1 gives the sequence 0, 1, 2, 5, 26,…, which tends to infinity. As this sequence is unbounded, 1 is not an element of the Mandelbrot set. On the other hand, c = i (where i is defined as i2 = ?1) gives the sequence 0, i, (?1 + i), ?i, (?1 + i), ?i, ..., which is bounded and so i belongs to the Mandelbrot set. Images of the Mandelbrot set display an elaborate boundary that reveals progressively ever-finer recursive detail at increasing magnifications. The "style" of this repeating detail depends on the region of the set being examined. The set's boundary also incorporates smaller versions of the main shape, so the fractal property of self-similarity applies to the whole set, and not just to its parts.
The zone of the boundary is often rendered in colours depending on how many iterations it takes for a point to run away, and we can often see a lovely pattern as a result. (Video: here, a second vid with explanation is here -- this last illustrates the sampling approach I will discuss below . . . ) But the colour pattern is wholly determined by the mathematics of the set and the algorithm that colours points depending on how they behave. So, now, let us consider a random walk sample of the points in the complex plane, in the context of the M-brot set serving as in effect the stand-in for a traditional fitness function:
1: At the first, the output of such a program would seem to be a random pattern of colours, scattered all over the place, but 2: After a long enough time of running, that random walk based M-brot set display will look much like that of the traditional versions [cf here [overview] and here [border zone showing beautifully rich and complex details of the seahorse valley between the "head" and the "body" of what I like to call the M-brot bug . . . yes, I am a M-brot set enthusiast]], 3: The only difference of consequence being that the test points were picked at random across time, on a "dart-throw" sampling basis, or its equivalent, a random walk across the complex plane. 4: Sampling theory tells us that if a field of possibilities is sampled at random, a representative picture of the overall population will gradually be built up. 5: Compare the two programs, the traditional and the random walk versions. Is there any material difference introduced by the ransom walk? Patently, no. 6: Now, introduce a bit of hill climbing -- the colour bands in an M-brot program are usually based on number of cycles until something begins to run away -- and let the random walk wander in towards the traditional black part, the zone of fixed solutions. 7: That the walking population now migrates from the far field towards the black peak zones is a built in design. 8: Has that wandering in and picking up values where there are nice solutions to the problem ADDED fresh information that was not implicit in the built in program? 9: Not at all. 10: Just so, when you build a hill-climbing algorithm that starts within an island of function and wanders in towards peak zones on hill climbing origins, the results may look surprising, but the results are making explicit by transforming what was built in as functional capacity, they are not creating previously non-existing information out of thin air. (And if you make the fitness landscape dynamic that makes no material difference, apart from making the wandering permanent. Indeed,that points to a reason for a designer to build in evolvability: the need to keep adapted successfully to a varying fitness landscape in an island of function.) 11:In short, we have serious reason -- I here exploit the increasingly acknowledged link between energy, entropy and information in thermodynamics -- to believe that an informational "free lunch machine" equivalent of a classic perpetual motion machine is no more credible than the latter. 12: That is not to say we can simply close our minds to the possibilities of such; just, those who propose a free lunch inforamtion machine need to show that the machine is not simply transforming in-built information. 13: The soundest way to do that is to do what the random text generator programs above have done, with a sufficient threshold of real function that we are indeed modelling macro evolution of body plans, not extrapolating from a model of microevo, which is not warranted on search space reasons. 14: However the dismissive sniffing we have seen on "tornado in a junkyard [builds a jumbo jet]" strongly suggests that the threshold of complexity problem is real and resented, so dismissed not cogently addressed. 15: So, let me note: the threshold of challenge does not begin from Sir Fred Hoyle's multi-megsbit issue of a tornado spontaneously assembling a flyable jumbo jet; it starts with maybe trying to build an instrument panel gauge by that same tornado method; or even, 16: trying to see if a tornado passing through the nuts and bolts aisle of a hardware store would spontaneously match the right nut tot he right bolt and screw it in to hold a shelf together. 17: I beg to submit that, as a matter of routine common sense if one sees the right sized nut and bolt in the proper holes, lined up and screwed down to the right torque, one infers to design, not a tornado passing through the nuts and bolts aisle of your local hardware shop. 18 --> What the reduced Demsbki metric and related metrics are doing is that they are giving a rationale for that inference on information behind the required organisation, and the threshold of complexity that makes it reasonable to infer to design. 19: Let us again remind ourselves: From: Chi = - log2(10^120*phi_S(T)*p(T|H) To: Chi_500 = Ip - (500), in bits beyond a threshold of complexity
This little thought exercise therefore means that so-called evolutionary algorithms can so far only credibly model MICRO-evolution, not body plan origination macroevo. Micro-evo as described is not in dispute, not even by modern young earth creationists. So, to avoid suspicions of bait and switch tactics, it is incumbent on developers and promoters of evolutionary algorithms that they address the problem of first needing to get to the shores of islands of function before they proceed to hill-climb on such islands. Such considerations also strongly suggest something that is emerging as it is beginning to be observed that mutations are often not simply at random but follow patterns: micro-level evolvability and adaptability -- up to probably about the level of the genus to the family -- are in-built into the design of living systems, probably to confer robustness and ability to fit with niches. [Cf discussion that starts here, esp the videos on the whale and on cichlids. Also, the one on thought provokers on what mutations are.] Where the serious barrier to evolutionary mechanisms comes in is the origin of major body plans, and that includes the very first one. So far, I have not see anything that suggests to me that a solution to the macro-level problem is feasible on the blind watchmaker type approach. Extrapolations form what is equivalent to maicroevo to macro evo, do not help matters. GEM of TKI PS: Robb, you are a math educator. I think you need to address the reduction of the Dembski metric shown in the OP above and again commented on just above. kairosfocus
PaV:
This is stupidity of the highest order. Look, sweetheart, when the ev program produces less than 96 bits of actual information (that’s right, we’re dealing with 16 sites each six bases long, and perfect positioning doesn’t take place). Per Dembski’s definition, this doesn’t rise to the level of CSI. To then go on and determine the actual “chance hypothesis” serves no usefulness whatsoever. It would be an exercise in masochism, and no more.
It depends on what information you're measuring. Schneider measures the information in the locations of the binding sites, as do Dembski & Marks et al (although they measure it differently than Schneider, coming up with 90 bits as opposed to Schneider's 64 bits). You're measuring the information in the particular bases in the binding sites, not the locations. Regardless of what you're measuring, all you need to do is run ev multiple times to generate 500+ bits, so the amount generated by a single run isn't relevant.
Then, knowing that DNA nucleotide bases are equiprobable given known chemistry/quantum numbers, then any nucleotide base has a 0.25 chance of being at any position along the DNA strand. For a sequence of length, 2X = -log_2{(10^150)}, or,log_2 (10^149), CSI would be present. Q.E.D.
At best, you've eliminated a hypothesis consisting of random mutation and nothing else. Has anyone proposed such a hypothesis? According to Dembski's definition, specified complexity is multivalent, with one value for each relevant chance hypothesis. And for Dembski, all material processes, stochastic or deterministic, are chance hypotheses. So if you come up with a single number when you're measuring the CSI of something, you need to explain why there's only one relevant chance hypothesis. Can you point me to a definition of CSI that explains how we determine which chance hypotheses are relevant? In order to define CSI as a property or measure of real-world things and events, it's not enough to formulate it as a function of H, eg –log2(10^120*φ_S(T)*P(T|H)). You also have to define H. Dembski's repeated warning that specified complexity needs to be based on all relevant chance hypotheses is for good reason. Nature is full of phenomena that are improbable under a random noise hypothesis, but very probable under the laws of nature. And many can be described simply, and are therefore specified. So restricting CSI calculations to a random noise hypothesis produces a plethora of false positives. And yet all attempted CSI calculations that I've seen have been based solely on the random noise hypothesis, including all of the attempts I've seen on this board. And that goes for Dembski's attempts, and well as Durston's FSC calculations (he assumes that ground state = null state), and Marks & Dembski's CoI (they assume that the search-for-a-search is blind), and even Sewell's SLoT math. Random noise is the ubiquitous model that pervades ID. So, under the rigorous definition of CSI, how is H defined? How do I determine what chance hypotheses are relevant? R0bb
VJT @52
Please provide us with a two- or three-page, detailed but completely jargon-free description of the four scenarios you are describing and post it up on UD. No references to other papers by biologists, please. Describe the problems in your own words, as you would to a non-biologist (which is what I am). Then I might be able to help you.
Heck, I'd be happy to see them one at a time. Then I'd try to find out if MG even had a clue what Dembski means by a specification, or if she even read the paper that she was quoting from in her OP.
I have no intention of undertaking a course in evolutionary algorithms in order to answer your question; I’m afraid I simply don’t have the time.
I've ordered a couple books which I hope will show me how to code some in my favorite language. Mung
MG Talking point: I want to learn enough about CSI to be able to test whether or not evolutionary mechanisms are capable of generating it. Thus far it is not sufficiently well defined for me to do so. Based on some ID proponents’ personal definitions of CSI, it appears that evolutionary mechanisms can generate it, but those aren’t the same as Dembski’s CSI. Nicely vague of course so trying to nail down the "personal definitions" will be like trying to nail down a shingle on a fog bank. Indeed, the plain -- and very post modernist -- intent is that there is no coherent, objectively real observable pattern being described and no resulting objectively correct definition of what CSI is [and by extension its functional subset FSCI], so such is to be taken as whatever one picks to make of it. Just as, these days marriage itself is being taken as a wax nose to be bent how one wills, on grounds that the opposite sexes do not form a natural, objective complementarity. (In short, this perspective is yet another manifestation of the radical, amoral relativism rooted in evolutionary materialism that Plato warned against in the Laws Bk X, 2,350 years ago.) Makes a pretty handy strawman. Probably -- sadly -- the underlying intent of the whole rhetorical exercise. In corrective steps: 1--> CSI was not defined by Demsbski. Yes, not. It is fundamentally a description of an observable characteristic of many things, first in the world of technology then also in the world of cell based life. For instance, compare the "Wicken wiring diagrams" of a petroleum plant and the biochemical reaction pathways of the living cell, in Fig. I.2 here, also the layout of a computer motherboard here or the regulatory networks of DNA activation and control in Figs. G.8 (a) - (d) here 2 --> The objectively controlled description responding to real-world phenomena and objects is in the very words itself: jointly complex and specified information [and related organisation], an aspect of objects, phenomena and processes that exhibit the cognate: specified complexity. 3 --> Thus we come to Orgel, describing the way (a) the strikingly complex organisation life forms differs from (b) randomness AND from (c) simply ordered entities, in the explicit context of the origin of life [note the title of the work]; thus, the context of unicellular organisms:
. . . In brief, living organisms are distinguished by their specified complexity. Crystals are usually taken as the prototypes of simple well-specified structures, because they consist of a very large number of identical molecules packed together in a uniform way. Lumps of granite or random mixtures of polymers are examples of structures that are complex but not specified. The crystals fail to qualify as living because they lack complexity; the mixtures of polymers fail to qualify because they lack specificity. [[The Origins of Life (John Wiley, 1973), p. 189.]
4 --> Thus on contrasted concrete exemplars, we may properly and OBJECTIVELY observe and distinguish simple order [not complex], specified complexity [= complex organisation], and randomness [complex but nor correlated with a principle of organisation or order]. 5 --> Noticing suchobjective, observable material differences and expressing them in words is a first step to understanding and modelling; indeed, it has been aptly said that the agenda of science is to describe, explain, predict and control (or, at least influence). MG's evasiveness when she has been pressed on whether or not Orgel is meaningful in the above quote, is therefore sadly revealing of an underlying inadvertent anti-scientific spirit. Utterly telling. 6 --> As of this point, though, we have an ostensive definition clarified by pointing out examples and counter examples, and giving rise to a trichotomy of complexity: order, organisation, randomness. 7 --> This will be picked up, not only by Demsbski et al, but by Trevors, Abel and co, who define and distinguish orderly, functional [function is one principle of organisation . . . ] and random sequence -- string: s-t-r-i-n-g -- complexity. 8 --> Since, complex networked structures can be reduced to network lists of related and structured strings, per nodes, arcs and interfaces, this focus on strings is without loss of generality. 9 --> The issue of function brings to bear the closely related remark of Wicken:
Organized’ systems are to be carefully distinguished from ‘ordered’ systems. Neither kind of system is ‘random,’ but whereas ordered systems are generated according to simple algorithms [[i.e. “simple” force laws acting on objects starting from arbitrary and common- place initial conditions] and therefore lack complexity, organized systems must be assembled element by element according to an [[originally . . . ] external ‘wiring diagram’ with a high information content . . . Organization, then, is functional complexity and carries information. It is non-random by design or by selection [Wicken plainly hoped natural selection would be adequate . . . ], rather than by the a priori necessity of crystallographic ‘order.’ [“The Generation of Complexity in Evolution: A Thermodynamic and Information-Theoretical Discussion,” Journal of Theoretical Biology, 77 (April 1979): p. 353, of pp. 349-65.]
10 --> Observe, in discussing the issue of CSI, MG NEVER responsibly or cogently addresses these key conceptual discussions; she only tries to drive a dismissive rhetorical wedge between Orgel-Wicken and design thinkers. (Observe, also how she tries to drive a similar rhetorical magic wedge between Abel, Trevors, Chiu and Durston and Dembski et al. FYI, MG, Joseph is right: Dembski's quantification of CSI, as well as that of Durston et al, is in a context of using the classic Shannon-Hartley negative log probability metric for information [as an index of complexity and improbability of access by a random walk driven search algorithm or natural process, and integrating into improbability/surprise in that sense, specific functionality and/or meaningfulness. In addition, Durston et al use an extension of Shannon's average information per symbol metric, H, to assess jump in degree of function as one moves from ground state to functional state, this last being an island or zone of function in a wider space of possible configurations, the overwhelming majority of which are non-functional. That is why the Durston metric can easily be incorporated into the reduced Dembski metric, yielding the values of Chi for the 35 protein families, as may be seen in the revised point 11 of the original post above.) 11 --> This is a crucial error and is responsible for her onward blunders. She apparently cannot bring herself to conceive or acknowledge that Dembski et al could be trying to do -- or even, succeeding in doing! --just what we read in the OP as cited from NFL pp 144, 148: building on the thinkers who went before. 12 --> As for the "ignorant, stupid, insane or wicked/dishonest" who hang around UD and try to think along the lines laid out above, producing "personal definitions" . . . 13 --> Now in fact, what Dembski explicitly did was to try to quantify what CSI is about. As a first pass, we may see his statement in NFL, p. 144 that MG never addresses on the merits -- much less, in context. Let's break it up into points to see what it is doing:
“. . . since a universal probability bound of 1 in 10^150 corresponds to a universal complexity bound of 500 bits of information,
a: (T, E) constitutes CSI because b: T [i.e. "conceptual information," effectively the target hot zone in the field of possibilities] subsumes c: E [i.e. "physical information," effectively the observed event from that field], d: T is detachable from E, and e: T measures at least 500 bits of information . . . ”
14 --> In short the observed event E that carries information comes from an independently describable set, T, where membership in T involves 500 or more bits of information per the standard negative log probability metric. 15 --> Dembski is therefore giving a metric, with 500 bits as a threshold where the odds of getting to E by a chance driven random walk are 1 in 10^150 or worse. 16 --> In the 2005 elaboration, he gives the more complex expression that we have reduced: Chi = - log2 (10^120 * phi_S(T)* p(T|H)), or Chi = Ip - (398 + K2), bits beyond a threshold 17 --> That threshold tends (unsurprisingly) to max out at 500 bits, as VJT has deduced. 18 --> A metric of information in bits beyond a threshold of sufficient complexity that available random walk driven search resources would be all but certainly fruitlessly exhausted on the relevant real world gamut of search, is plainly not meaningless. 18a --> It is also quite well supported empirically. The best -- no mechanisms that transmute inputted information into output information that may mislead us (Dawkins' Weasel is notorious in this regard . . . ) that it is coming up as a free lunch -- random walk tests to date are probably the 'monkeys at keyboards" tests, and to date the capital examples run like this one from Wikipedia:
One computer program run by Dan Oliver of Scottsdale, Arizona, according to an article in The New Yorker, came up with a result on August 4, 2004: After the group had worked for 42,162,500,000 billion billion monkey-years, one of the "monkeys" typed, “VALENTINE. Cease toIdor:eFLP0FRjWK78aXzVOwm)-‘;8.t" The first 19 letters of this sequence can be found in "The Two Gentlemen of Verona". Other teams have reproduced 18 characters from "Timon of Athens", 17 from "Troilus and Cressida", and 16 from "Richard II".[20] A website entitled The Monkey Shakespeare Simulator, launched on July 1, 2003, contained a Java applet that simulates a large population of monkeys typing randomly, with the stated intention of seeing how long it takes the virtual monkeys to produce a complete Shakespearean play from beginning to end. For example, it produced this partial line from Henry IV, Part 2, reporting that it took "2,737,850 million billion billion billion monkey-years" to reach 24 matching characters: RUMOUR. Open your ears; 9r"5j5&?OWTY Z0d...
18b --> The best case search results are of order 24 ASCII characters, or spaces of 128^24 = 3.74*10^50, taking up less than 170 bits; well within the 500 bit threshold. It has been observed that trial and error can find islands of function in spaces of 10^50 or so possibilities, corresponding to 170 or so bits. 19 --> In short, trial and error on random walks are strictly limited in what they can achieve. And, if the threshold of function T is of order 500 or more bits, then we have good reason to believe that such exercises will never of their own accord find such zones of function. 20 --> For the only observed cell-based living systems, the DNA complement starts north of 100,000 4-value bases, or 200 k bits. (In fact the estimate for the minimally complex independent living cell is about 300 k bases, or 600 k bits.) 21 --> There is no empirical evidence of a ladder of pre-life entities that mounted up stepwise to this. And, given the evidence that the living cell comprises a complex metabolic system integrated with a code based von Neumann self-replicator, its minimal threshold of functional complexity is certainly well past 500 or 1,000 bits. 22 --> The latter is 125 bytes or 143 ASCII characters, wholly inadequate to construct any control software system of consequence. And yet, the number of possible configs for 1,000 bits is 1.07*10^301, over ten times the square of the number of Planck time states of the 10^80 or so atoms of the observed cosmos, across the estimated thermodynamic lifespan of some 50 million times the usual timeline from the big bang. (A Planck time is so short that the fastest, strong force nuclear interactions take about 10^20 -- a hundred billion billion -- Planck times.) 22 --> In short, the blind chance plus mechanical necessity based search resources of the cosmos could not credibly find an island of function in a config space corresponding to 1,000 bits, much less 100,000. 23 --> And when it comes to origin of main body plans, we are looking at 10+ mn bases of novel DNA information, dozens of times over. 24 --> The only observed, known cause of such degrees of functional complexity -- e.g. as in the posts in this thread -- is intelligent design. That observation is backed up by the sort of analysis of search space challenges we ave just seen, a challenge that is only known to be overcome by the injection of active information by intelligence. 25 --> Now of course there have been hot debates on hos probabilities are assigned to configs and how the scopes of islands of function can be estimated. 26 --> On one side, if there are currently hidden laws of physics that steer warm little ponds to form life and then shape life into body plans, then that is tantamount to saying nature is carrying out a complex program, and is front loaded to produce life. This is of course a form of design view. 27 --> On another side, the speculation is that there is a vast number of unobserved sub cosmi and ours jut happened to get lucky i that vastly larger pool of resources. This is of course a convenient and empirically unwarranted speculation. Metaphysics, not physics. (And, as the discussion of cosmological fine tuning here points out, it points straight back to an intelligent necessary being as the root of the multiverse capable of producing a sub-cosmos like ours.) 28 --> The simple brute force X-metric was developed against that backdrop. 29 --> It uses a complexity threshold of 1,000 bits so that cosmos scope search -- the only empirically warranted maximum scope -- is utterly swamped by the scale of the config space. 30 --> Since we can directly observe functional specificity, it uses that judgement to set the value of S = 1/0. 31 --> Similarly we can directly observe contingency and complexity beyond 1,000 bits, giving C - 1/0. 32 --> We can easily convert information measures [remember the nodes and arcs diagram and net list technique] to bits or directly observe them in bits, as we see all around us, so we use the number of bits, B. 33 --> This brings up and warrants the only "personal" definition of FSCI shown at UD: X = C *S* B 34 --> But by direct comparison, this is essentially comparable to the transformed version of the Chi metric, if we were to use a 1,000 bit threshold [and maybe we should now begin to subscript Chi to indicate the thresholds being used]: Chi_1000 = Ip - 1,000, in bits beyond the threshold. 35 --> the only gap is that for B we usually simply use the metric of physical information string capacity, without bothering to look at how much redundancy is in the code leading to some ability to compress. (That usually does not push us much beyond ~ 50 % loss-less compression for typical file sizes, so the metric is -- on the intended rough and ready basis -- comparable to the Dembski one.) ______________ In short, we have excellent reason to see that CSI and FSCI are meaningful concepts, are capable of being turned into quantitative models and metrics, and directly apply to technological systems and biological ones. Indeed, since this has been specifically challenged then denied, we must note -- point 11 OP and comment 11 -- that the Durston FSC metric and the reduced Dembski Chi-metric can easily be integrated to show 35 values of Chi for protein families. This talking point also collapses. GEM of TKI kairosfocus
kairosfocus: Congratulations on you excellent posts from 44-48.Well Done! StephenB
KF:
For the particular computer sim cases of interest, I contend that we are looking at information transformation not novel creation. A Mandelbrot set has in it no more information than was specified by the inputs put in to start it up and the algorithm.
The very same thought has passed my mind. PaV
As to biological reality, functional proteins are specified. How do we know? Because if they weren't specified then random protein sequences would exist, and the kinds of shapes and bonding needed for transcription and translation of proteins could not take place. So this is basically a 'given'. Then, knowing that DNA nucleotide bases are equiprobable given known chemistry/quantum numbers, then any nucleotide base has a 0.25 chance of being at any position along the DNA strand. For a sequence of length, 2X = -log_2{(10^150)}, or,log_2 (10^149), CSI would be present. Q.E.D. That is, the event, E, is the specified sequence; the looked for pattern = target, is the very same specified sequence. There is only one way to specify it (this isn't strictly true since some a.a. are not critical, and can be substituted for [with the effect that X must be longer for CSI to be present---but this is nothing for biological systems]), and the chance hypothesis is that the bases are drawn at random, there being 4 such bases to select from. See how easy it is for biological systems?!?!? PaV
VJT: I contend that the case already in view, with Durston, shows that the material issue is long since answered. Chi can credibly be calculated for biological systems, as we have done so. For the particular computer sim cases of interest, I contend that we are looking at information transformation not novel creation. A Mandelbrot set has in it no more information than was specified by the inputs put in to start it up and the algorithm. But, it would be interesting to see how much info is being output, how fast, on the relevant algorithms. As shown above, we have a quick way to compare info values to a threshold value. GEM of TKI kairosfocus
PAV: Interesting approach. Of course once we see that Chi = Ip - (398 + K2) . . . we can easily enough show how the results fall short of the threshold of sufficient complexity to be best explained by design not chance and/or blind necessity. GEM of TKI kairosfocus
Mathgrrl (#37) In response to my earlier post, you write:
You have certainly made the greatest effort and come up with the most interesting and detailed discussions (which I look forward to you continuing in future threads), but even you have had to recognize that Dembski’s formulation of CSI is not calculable.
I'd like to clarify. In my original post on the CSI scanner, I argued that Dembski's CSI was calculable, but not computable. In a subsequent post, I then provided you with a simplified version of CSI, which I christened CSI-lite. CSI-lite is both calculable and computable. I think that kairosfocus' posts at #44, #45 and #47 above meet your requirements for a CSI calculation for the four scenarios you described. But if you aren't satisfied with those answers, then here's a challenge I shall issue to you. Please provide us with a two- or three-page, detailed but completely jargon-free description of the four scenarios you are describing and post it up on UD. No references to other papers by biologists, please. Describe the problems in your own words, as you would to a non-biologist (which is what I am). Then I might be able to help you. I have no intention of undertaking a course in evolutionary algorithms in order to answer your question; I'm afraid I simply don't have the time. vjtorley
MathGrrl [38]:
What seems clear now is that Dembski’s CSI has not been and cannot be calculated for real biological or digital systems. That means that the claims of ID proponents that CSI, being specified complexity of more than a certain number of bits, is an indicator of intelligent agency are unsupported.
This is stupidity of the highest order. Look, sweetheart, when the ev program produces less than 96 bits of actual information (that's right, we're dealing with 16 sites each six bases long, and perfect positioning doesn't take place). Per Dembski's definition, this doesn't rise to the level of CSI. To then go on and determine the actual "chance hypothesis" serves no usefulness whatsoever. It would be an exercise in masochism, and no more. IOW, all we have to do is to simply ASSUME that all 96 positions have been perfectly specified; i.e., 2 bits of information for each base, then the "chance hypothesis" is that each position is capable of being held by any of four nucleotide bases, with equal probability for each. Then, for a length of 96 bases, P(T|E) [i.e., of a particular "specified" set of sites = T, given a set of sixteen sequences each composed of six bases = E] = 4^96 = 2^-192. The specificity of this complex ensemble is: -log_2{P(T|E)} = -log_2(2^-192) = 192.......which is, of course, well below the standard for CSI. Hence, we can conclude, using Dembski's definition of CSI, that whatever the ev program produced can be explained invoking chance alone. Now, do you want to dispute the above statement? Do you want to assert that what the ev program did---this digital phenomenon---it did using intelligence? Is this your assertion you wish to make?
Intellectual integrity dictates, therefore, that ID proponents stop making those claims.
You say you want to "learn" how to use CSI. Why not be intellectually honest and admit your true purposes? You're the one who, in the face of legitimate answers given to you over and over, keep insisting that we have it wrong here at UD. It's time for you to show some integrity, my dear, and not just talk about it. PaV
MathGrrl: Because something is difficult to demonstrate, doesn't mean it doesn't exist. I already compared your 'demand' to that of a undergrad telling his professor that unless he can demonstrate that the Law of Conservation of Angular Momentum, did, indeed, apply to the collapse of World Trade Center Buildings, then he would consider the 'Law' unsubstantiated. Would you agree that to "demonstrate this in a rigorous mathematical fashion" would be incredibly difficult, if not downright impossible? Does this invalidate the 'Law'? Yet there are many instance where the 'Law' can be demonstrated. Believe it or not, professors usually use simplifying assumptions. Can you believe that? (a touch of sarcasm here) I've already walked you through an example of how CSI works. Do you dispute that I had done that? No. Instead you continue with your DEMAND that these four scenarios be analyzed.......or else!!! Take the last sentence. Let's analyze the CSI. What's the event? It's: the word pattern give by: "Instead you DEMAND that your four scenarios be analyzed.......or else!!!" Since the 'alphabet' making up this pattern involves capital letters, exclamation marks and periods, it's composed of at least 54 elements. (I could have chosen to use capital letters all the way through; but I didn't for 'effect'). The sentence contains 71 characters (including spaces). So, the chance of getting this particular combination of characters is 54^71, well above the UPB of 10^150. Thus, it is of intelligent origin. [In this case, highly intelligent origin ;-)] Do you dispute this? But, this isn't enough for you, is it? So, how about your first scenario: "A simple gene duplication, without subsequent modification, that increases production of a particular protein from less than X to greater than X. The specification of this scenario is “Produces at least X amount of protein Y.” First, why do you think "Produces at least X amount of protein Y" is a "specification". CSI deals with events. So, please tell us, what is the event. The obvious answer: the event E actually is: "An increase of protein Y above X". Well, we need to know how this "event" came about. It came about because of a duplicated gene (per you scenario). Thus, event E_1 is: "A gene is duplicated.". Obviously, E=E_1. So, let's substitute E_1 as the actual event. Now we have to ask the question: what's involved in gene duplication? Obviously, cellular machinery. How, then, does this cellular machinery work? It takes a sequence of nucleotide bases A,C,G and T, of a given length, L, and it produces an identical sequence of length L. This sequence is then inserted somewhere in the chromosome/s. Now, of course, for this kind of work to be done, intelligence is certainly operative; but this would require an analysis of all of the cellular machinery involved and the regulatory mechanisms by which they function in tandem. But that is not your real concern. Your real concern is determining whether this newly produced sequence represents CSI or not. Well, we know what the event is: E_1. We know that it involves a "specific" nucleotide sequence of length L. Now we ask, per Dembski's CSI, what is the pattern? Well, the pattern is this same "specific" sequence. So this is our target, T, given E_1. Now, what is our chance hypothesis? Since we're dealing with nucleotide bases, and since there are four of them, and because there is no chemical/quantum reasons for any preferred sequencing of these bases, the "chance" of randomly producing such a sequence is 4^L. If L is sufficiently large, then this would put us above the UPB. However, note this: the sequence ISN'T produced randomly. It's produced using particular protein machines that are able to take nucleotide base X, at position Y, and reproduce (faithfully) nucleotide base X at position Y. Therefore, there is a one-to-one mapping taking place between the original, and the duplicated sequence. So, since there is only ONE possibility for a particular base to be at a particular location/position, the probability of this occurring for a sequence of length L, is: P(T|E_1) = 1^L, which is equal to 1. And, the "complexity" of this "chance hypothesis" is, per Dembski's definition of CSI, -log_2{P(T|E_1)}, = -log_2(1)= 0. There is no information gained. Thus, obviously, no CSI. So, please MathGrrl, let's get down to at least 3 scenarios that you're now "demanding". ______________________________________ For onlookers, note this: MathGrrl, if she had diligently tried to understand CSI as presented in Dembski's "No Free Lunch" book, could have worked this all out herself. But, she's disinclined. Why? Because her whole point is to make the assertion that CSI is ill-defined, and hence useless. This is the strange position we're placed in. MathGrrl claims she wants to "learn" how it's used. And then recites scenarios that patently don't contain CSI, e.g., the ev program, and which can easily be worked out using Dembski's formulism. She won't attempt them. And then she demands that we do. Where will this silliness end? And when? PaV
k@34
I prefer 1,000 bits
I like nice round numbers, so I vote for 1024 bits. Mung
Joseph:
Then you count and check the specification- i.e. how much variation can it tolerate.
That is pretty close to what Durston et al did. Hence their Fits values are for islands of function. And as we can show from 11 above, that can then be plugged into the Chi metric, in reduced form. Which of course means we have in hand 35 specific indubitably biological values of Chi, by combining Dembski and Durston. GEM of TKI _________________ F/N: For ease of reference, I have promoted the Durston-Dembski Chi values to the original post, point 11 in sequence, by happy coincidence. kairosfocus
Joseph: MG is so far simply repeating the talking points she has been making all along. I have yet to see any responsiveness to the cogent information and mathematics provided by ID proponents in response to her challenges, with a partial exception being VJT. Who was reduced to exasperation several times, including above. UNTIL AND UNLESS MG IS ABLE TO SHOW WHY THE LOG-REDUCTION DERIVED EQN FOR CHI ABOVE IS WRONG, WE NEED NOT TAKE SUCH TALKING POINTS PUSHED IN THE TEETH OF MANIFEST EVIDENCE OF THEIR FALSITY SERIOUSLY. And, since:
1: By common definition, in an information context - log (pi) = Ii 2: Similarly log (p*q) = log p + log q, and 3: 10^120 ~ 2^398
then we may reduce Chi = - log2(10^120*D2*p) to: Chi = Ip - (398 + K2) where on reasonable grounds, (398 + K2) tends to max out at about 500, as VJT showed. So, the challenge MG has put collapses, as there is nothing exotic in what was just again done. Chi is simply a metric of info beyond a threshold of complexity. We may debate the threshold but the rest is standard high school algebra applied to info theory. (Here is my always linked note's briefing on basic info theory.) Okay, GEM of TKI kairosfocus
4 possible nucleotides = 2^2 = 2 bits of informtion per nucleotide 64 possible coding codons = 2^6 = 6 bits of information per amino acid (including STOP) Then you count and check the specification- ie how much variation can it tolerate. Joseph
8: This was followed by a lot of words, but nothing that actually addresses my four scenarios. This in the teeth of what was done in 19 - 20 above. It is so grotesquely false that this is verging on delusional. It seems to me you did not actually look at what was done but simply repeated your favourite dismissive talking points. That is being out of responsible contact with reality on a serious matter where you have gone on to impugn people by making slanderous assertions. 9: If, as seems increasingly likely, you can’t provide a rigorous mathematical definition of CSI as described by Dembski and show how to apply it to my four scenarios, please just say so and we can move on. that AND operator does ineresting work, as both halves have to be separately true for an AND to be so. But it turns out that as has been shown in the OP, there is a "rigorous" definition of Dembski's Chi metric, by dint of carrying out the log, converting - log (p) to Information explicitly (per Shannon-Hartley) and noting that the subtracted values define a threshold of reasonable complexity. That having been done, it is relatively easy to find where novel FSCI has been claimed to have been originated.
i: gene dup -- copying is not origin of info, but the scale of the copy may make chance duplication implausible, so the implied copier entity is a manifestation of FSCI. ii: Ev -- faces the longstanding problem of starting within an island of function and working on processes that transform but do not de novo create info. In any case the search seems to max out at 266 bits was it, and so it is well within the threshold. An exemplary value was calculated. iii: tierra -- same problem in essence. Where a bit value was given, a demonstration calculation was done, but the issue is that this transforms information as opposed to originating it, within a pre-targtetted island of function. iv: steiner -- same problem with logic, and a demo bit value was also calculated on the for argument assumption that this was a genuinely new bit value. Well within the CSI threshold.
So, both sides of the AND are satisfied and the answer you think does not exist is there. 10: some ID proponents seem to consider CSI to be proportional to the length of the genome. Overall, the existence of COPIES implies a mechanism that will carry forth this. In that context it makes sense to take the length including such copies as may exist in a simple brute force calculation. You will also see that the lengths in question in the relevant calculations are usually for protein coding zones, and so whichever copy is used, the FSCI is there. As in, the repeated case 300 AA --> 900 bases --> 1,800 bits, beyond the 1,000 bits threshold. 11: Why would you be suspicious about the gene duplication scenario? It seems like an obvious known mechanism that must be addressed by any CSI definition. Perhaps, because of your consistent brushing aside of cogent answers on the point, as just happened again. ____________ In short, MG's response shows that she is more or less repeating certain talking points regardless of evidence, response and calculations or derivations and associated or resulting definitions in front of her. In particular, she has shown no responsiveness to the derivation that has been on the table for a week -- which, BTW, makes the exchanges with GP moot -- and which is in the OP:
CHI = - log2[10^120 ·phi_S(T)·P(T|H)]. How about this: 1 –> 10^120 ~ 2^398 2 –> Following Hartley, we can define Information on a probability metric: I = – log(p) 3 –> So, we can re-present the Chi-metric: Chi = – log2(2^398 * D2 * p) Chi = Ip – (398 + K2) 4 –> That is, the Dembski CSI Chi-metric is a measure of Information for samples from a target zone T on the presumption of a chance-dominated process, beyond a threshold of at least 398 bits, covering 10^120 possibilities. 5 –> Where also, K2 is a further increment to the threshold that naturally peaks at about 100 further bits. (In short VJT’s CSI-lite is an extension and simplification of the Chi-metric.) 6 –> So, the idea of the Dembski metric in the end — debates about peculiarities in derivation notwithstanding — is that if the Hartley-Shannon- derived information measure for items from a hot or target zone in a field of possibilities is beyond 398 – 500 or so bits, it is so deeply isolated that a chance dominated process is maximally unlikely to find it, but of course intelligent agents routinely produce information beyond such a threshold. 7 –> In addition, the only observed cause of information beyond such a threshold is the now proverbial intelligent semiotic agents. 8 –> Even at 398 bits that makes sense as the total number of Planck-time quantum states for the atoms of the solar system [most of which are in the Sun] since its formation does not exceed ~ 10^102, as Abel showed in his 2009 Universal Plausibility Metric paper. The search resources in our solar system just are not there. 9 –> So, we now clearly have a simple but fairly sound context to understand the Dembski result, conceptually and mathematically [cf. more details here]; tracing back to Orgel and onward to Shannon and Hartley . . .
Sadly revealing and deeply disappointing. GEM of TKI kairosfocus
MG: I see you have now replied, but -- given the reduction Chi - Ip - 500 bits beyond a threshold -- I find the unresponsiveness of that reply disappointing. It is noteworthy that you do not seem to have registered the significance of the reduction of the Chi metric to a threshold measure in bits beyond a threshold. I will comment on points clipped in order: 1: the fact that no ID proponent can calculate CSI for my scenarios You plainly did not look at the posts at 19 above. Scenario 1, the doubling of a DNA string produced no additional FSCI, but the act of duplication implies a degree of complexity that might have to be hunted down, but would be most likely well beyond 500 bits or 73 bytes of info. Scenarios 2 - 4 were computer sims, and as PAV long since noted Ev was within 300 bits, far below the significance threshold. 266 bits - 500 bits = - 234 bits lacking. Scenario 3, top of no 20, point 14, on the cited bit number I found, and corrected: Chi_tierra = 22 [oops bytes, cf 20 above] – 500 = – 478 Chi_tierra = 22*8 - 500 = -324, below threshold Scenario 4, Steiner has numbers up to 36 bits I find just now: Chi_steiner = 36 - 500 = - 464 IN SHORT NONE OF THE SCENARIOS MEASURES UP TO A LEVEL OF SIGNIFICANCE. This is in addition to the logical problem of having started within an island of function instead of finding it. Since numbers were provided per a calculation on the mathematical meaning of Chi, your declaration to Joseph is false. 2: Claiming that you have, when all of these threads are available to show that you cannot, is not intellectually honest. making false accusations in the teeth of easily accessible evidence to the contrary raises questions about YOUR honesty and civility, MG. 3: you have had to recognize that Dembski’s formulation of CSI is not calculable. In fact, it is easily transformable (once one thinks of of moving forwards instead of backwards) into a form that is calculable once we can have a bits measure on the usual negative log probability metric used since Hartley and Shannon. 4: What I find particularly interesting is that so many ID proponents have blithely claimed that CSI is an unambiguous indicator of the involvement of intelligent agency, despite obviously never having calculated it for any biological system. Another falsehood. Let me simply clip from 11 above, where I converted the Durston metrics of information in fits, to the Dembaki Chi values in bits beyond the threshold for three cases from the table for 35 protein families published in 2007:
PS: Using Durston’s Fits from his Table 1, in the Dembski style metric of bits beyond the threshold, and simply setting the threshold at 500 bits: RecA: 242 AA, 832 fits, Chi: 332 bits beyond SecY: 342 AA, 688 fits, Chi: 188 bits beyond Corona S2 445 AA, 1285 fits, Chi: 785 bits beyond. –> The two metrics are clearly consistent, and Corona S2 would also pass the X metric’s far more stringent threshold right off as a single protein. (Think about the cumulative fits metric for the proteins for a cell . . . ) –> In short I am here using the Durston metric as a good measure of the target zone’s information content, which Table 1 also conveniently reduces to bits per symbol so we can see how the redundancy affects the information used across the domains of life to achieve a given protein’s function; not just the capacity in storage unit bits [= no AA's * 2 4.32 (oops, I plainly had bases in mind there)]
5: What seems clear now is that Dembski’s CSI has not been and cannot be calculated for real biological or digital systems. That means that the claims of ID proponents that CSI, being specified complexity of more than a certain number of bits, is an indicator of intelligent agency are unsupported. Please see the just above, which has ben there in post no 11 which is specifically addressed to you. I take it that 35 protein families are biological systems. Going beyond that, the Dembski metric in transformed form is easily applied to DNA etc. And, all along, the simple brute force X-metric of FSCI was applied, routinely, to biological systems and has been for years; as can be seen from my always linked as just again linked. You were notified of this repeatedly and plainly willfully refused to acknowledge it. 6: Intellectual integrity dictates, therefore, that ID proponents stop making those claims. On the evidence of this post, I think you need to look very seriously in the mirror before casting such false accusations that are without factual foundation again. I repeat: the Durston metric has been integrated with the Chi metric to produce CSI values as just clipped. The Dembski Chi metric in transformed form is -- as just shown -- readily applied to biological systems. The X-metric did all of this on a brute force basis years ago. 7: If you can provide a riogorous mathematical definition of CSI as described by Dembski and show how to apply it to my four scenarios, please do so. Please see the Op and no 19 ff above, with side-lights on no 11 too. I repeat, CSI a la Demsbki is -- by simply applying the log function and the log of a product rule -- Information in standard bit measures beyond a threshold of at least 398 bits, and in praxis up to above 500. None of this is exotic. We may debate how Demsbski got to the threshold values he chose, but those values are reasonable, and have long been shown to be reasonable. The infinite monkeys analysis as extended by Abel shows that indeed something that is that isolated would not be reasonably discoverable by a random walk dominated search on the gamut of our solar system. I prefer a larger threshold, 1,000 bits, as that shows that the whole universe could not sample more than 1 in 10^150 of the config space in question, removing all reasonable doubt. [ . . . ] kairosfocus
MathGrrl:
The reasons already provided appear to boil down to the fact that no ID proponent can calculate CSI for my scenarios.
MathGrrl, I have provided the references to support my claims about CSI. OTOH you have presented nothing to refute those claims. Also you have been provided with a defnition of CSI that has rigorous mathematical components.
If that were so, you could simply point to the rigorous mathematical definition and show how to apply it to my four scenarios.
I have and I have applied it to one of them. Joseph
MathGrrl:
The first scenario, gene duplication, is included because some ID proponents seem to consider CSI to be proportional to the length of the genome.
Perhaps to the length of the minimal genome required for a living organism to do what living organisms do. But even then it depends on variational tolerances- the specificity.
The other three come from my academic interests, combined with a discussion with gpuccio on Mark Frank’s blog. In that thread we touched on the importance of historical contingency in CSI calculations, and I want to make sure we cover that here.
And I provided an answer to one of your scenarios.
If you can provide a riogorous mathematical definition of CSI as described by Dembski and show how to apply it to my four scenarios, please do so.
CSI is a specifed subset of Shannon Information. Shannon information has mathematical rigor. Specified information is Shannon information with meaning/ function. Complexity also has a rigorous mathematical component. So I have applied it one of your scenarios. You can do the rest or you can pay me-fund your project- to look into the rest. Joseph
Mung,
Has anyone considered that there is an underlying logic behind the four questions? Is there a common thread behind them all other than the (I just want to see CSI in action)?
Me! Me! I know the answer! The first scenario, gene duplication, is included because some ID proponents seem to consider CSI to be proportional to the length of the genome. I wanted to find out if I was misunderstanding that position. The other three come from my academic interests, combined with a discussion with gpuccio on Mark Frank's blog. In that thread we touched on the importance of historical contingency in CSI calculations, and I want to make sure we cover that here.
I will say that the first challenge appears to be a thinly veiled attempt to get an admission that information in the genome can increase through a ‘simple’ gene duplication even. IOW, it wasn’t really about CSI at all.
My participation here is solely so that I can understand CSI well enough to be able to test whether or not known evolutionary mechanisms can create it. Why would you be suspicious about the gene duplication scenario? It seems like an obvious known mechanism that must be addressed by any CSI definition. MathGrrl
kairosfocus,
MG’s four “CSI challenge” scenarios, addressed:
This was followed by a lot of words, but nothing that actually addresses my four scenarios. If, as seems increasingly likely, you can't provide a rigorous mathematical definition of CSI as described by Dembski and show how to apply it to my four scenarios, please just say so and we can move on. MathGrrl
Mung, I am not interested in jumping to yet another site to discuss this. It's already difficult enough to follow the half-dozen threads that have been started. If you can provide a riogorous mathematical definition of CSI as described by Dembski and show how to apply it to my four scenarios, please do so. MathGrrl
kairosfocus,
1: you still can’t provide a rigorous mathematical definition of CSI As by now you know or should know, CSI is not a primarily mathematical concept or construct, like say a Riemann integral or a limit.
What seems clear now is that Dembski's CSI has not been and cannot be calculated for real biological or digital systems. That means that the claims of ID proponents that CSI, being specified complexity of more than a certain number of bits, is an indicator of intelligent agency are unsupported. Intellectual integrity dictates, therefore, that ID proponents stop making those claims. MathGrrl
vjtorley,
“you still can’t provide a rigorous mathematical definition of CSI.” Surely you jest, Mathgrrl. You have already been given a few rigorous mathematical definitions of CSI.
No, I haven't. You have certainly made the greatest effort and come up with the most interesting and detailed discussions (which I look forward to you continuing in future threads), but even you have had to recognize that Dembski's formulation of CSI is not calculable. What I find particularly interesting is that so many ID proponents have blithely claimed that CSI is an unambiguous indicator of the involvement of intelligent agency, despite obviously never having calculated it for any biological system. MathGrrl
Joseph,
Your 4 examples are bogus for the reasons already provided.
The reasons already provided appear to boil down to the fact that no ID proponent can calculate CSI for my scenarios.
Also you have been provided with a defnition of CSI that has rigorous mathematical components.
If that were so, you could simply point to the rigorous mathematical definition and show how to apply it to my four scenarios. Neither you nor any other ID proponent has been able to do so. Claiming that you have, when all of these threads are available to show that you cannot, is not intellectually honest.
So stop blaming us for your obvious inadequacies. Ya see that is why there are posts dedicated to you. Now grow up…
Ah, more of the civility expected on UD, I see. MathGrrl
Reminder to MG (and co): This CSI News Flash thread presents: 1: In the OP, a transformation of the Dembski Chi metric that shows that -- per analysis first presented Sunday last in VJT's LGM thread, one week ago and counting . . . -- it in effect measures information in bits beyond a threshold:
Chi = – log2 (10^120 * phi_s(T) * p (T|H) becomes, on transformation and rounding up: Chi = Ip – 500, in bits beyond a threshold of complexity
2: From 19 above, another response to her four challenges, in light of the just above. Given the vigour with which the challenges were issued, and the additional declarations of "meaninglessness" etc, I think a reasonable response is warranted, and silence instead would be quite telling. GEM of TKI kairosfocus
10 --> So, we observe how N does not cogently address the explanation in UD WAC 30, but acts as though it does not exist; even using language that suggests that the clinging to it is a mere matter of stubbornly refusing to concede defeat. Remember, the WAC 30 has been there, for years, just a click or two away. 11 --> Let's cite, remembering that this has sat there on the record for years:
30] William Dembski “dispensed with” the Explanatory Filter (EF) and thus Intelligent Design cannot work This quote by Dembski is probably what you are referring to:
I’ve pretty much dispensed with the EF. It suggests that chance, necessity, and design are mutually exclusive. They are not. Straight CSI is clearer as a criterion for design detection.
In a nutshell: Bill made a quick off-the-cuff remark using an unfortunately ambiguous phrase that was immediately latched-on to and grossly distorted by Darwinists, who claimed that the “EF does not work” and that “it is a zombie still being pushed by ID proponents despite Bill disavowing it years ago.” But in fact, as the context makes clear – i.e. we are dealing with a real case of “quote-mining” [cf. here vs. here] — the CSI concept is in part based on the properly understood logic of the EF. Just, having gone though the logic, it is easier and “clearer” to then use “straight CSI” as an empirically well-supported, reliable sign of design. In greater detail: The above is the point of Dembski’s clarifying remarks that: “. . . what gets you to the design node in the EF is SC (specified complexity). So working with the EF or SC end up being interchangeable.”[For illustrative instance, contextually responsive ASCII text in English of at least 143 characters is a “reasonably good example” of CSI. How many cases of such text can you cite that were wholly produced by chance and/or necessity without design (which includes the design of Genetic Algorithms and their search targets and/or oracles that broadcast “warmer/cooler”)?] Dembski responded to such latching-on as follows, first acknowledging that he had spoken “off-hand” and then clarifying his position in light of the unfortunate ambiguity of the phrasal verb dispensed with:
In an off-hand comment in a thread on this blog I remarked that I was dispensing with the Explanatory Filter in favor of just going with straight-up specified complexity. On further reflection, I think the Explanatory Filter ranks among the most brilliant inventions of all time (right up there with sliced bread). I’m herewith reinstating it — it will appear, without reservation or hesitation, in all my future work on design detection. [. . . .] I came up with the EF on observing example after example in which people were trying to sift among necessity, chance, and design to come up with the right explanation. The EF is what philosophers of science call a “rational reconstruction” — it takes pre-theoretic ordinary reasoning and attempts to give it logical precision. But what gets you to the design node in the EF is SC (specified complexity). So working with the EF or SC end up being interchangeable. In THE DESIGN OF LIFE (published 2007), I simply go with SC. In UNDERSTANDING INTELLIGENT DESIGN (published 2008), I go back to the EF. I was thinking of just sticking with SC in the future, but with critics crowing about the demise of the EF, I’ll make sure it stays in circulation.
Underlying issue: Now, too, the “rational reconstruction” basis for the EF as it is presented (especially in flowcharts circa 1998) implies that there are facets in the EF that are contextual, intuitive and/or implicit. For instance, even so simple a case as a tumbling die that then settles has necessity (gravity), chance (rolling and tumbling) and design (tossing a die to play a game, and/or the die may be loaded) as possible inputs. So, in applying the EF, we must first isolate relevant aspects of the situation, object or system under study, and apply the EF to each key aspect in turn. Then, we can draw up an overall picture that will show the roles played by chance, necessity and agency. To do that, we may summarize the “in-practice EF” a bit more precisely as: 1] Observe an object, system, event or situation, identifying key aspects. 2] For each such aspect, identify if there is high/low contingency. (If low, seek to identify and characterize the relevant law(s) at work.) 3] For high contingency, identify if there is complexity + specification. (If there is no recognizable independent specification and/or the aspect is insufficiently complex relative to the universal probability bound, chance cannot be ruled out as the dominant factor; and it is the default explanation for high contingency. [Also, one may then try to characterize the relevant probability distribution.]) 4] Where CSI is present, design is inferred as the best current explanation for the relevant aspect; as there is abundant empirical support for that inference. (One may then try to infer the possible purposes, identify candidate designers, and may even reverse-engineer the design (e.g. using TRIZ), etc. [This is one reason why inferring design does not “stop” either scientific investigation or creative invention. Indeed, given their motto “thinking God's thoughts after him,” the founders of modern science were trying to reverse-engineer what they understood to be God's creation.]) 5] On completing the exercise for the set of key aspects, compose an overall explanatory narrative for the object, event, system or situation that incorporates aspects dominated by law-like necessity, chance and design. (Such may include recommendations for onward investigations and/or applications.)
12 --> So, a fairer and truer view of the EF is that it was conceived in response to a pattern of how people infer to design in real life, seeking to model their decision process, which highlightes the role of CSI as a sign of design. On further reflection, it is necessary to emphasise the per aspects view implicitly involved, So, he filter was improved, by elaborating it somewhat, to bring out that per aspects view and onward synthesis, cf here. 13 --> This actually shows the provisional, progressive process of science at work, in this case, design science. Fundamentally correct work is refined, and leads to a deeper understanding and a more sophisticated tool. 14 --> This thread's OP presents a similar case, where Dembski's analysis of the CSI criterion began with a simple statement on complexity beyond a threshold as we see from p. 144 of NFL, then in 2005 was elaborated into a Chi-metric, and on debates over its challenges it has been deduced that the metric is actually a measure of information in bits beyond a threshold of sufficient complexity that RW based processes are not a credible explanation for being found in a highly specific zone of interest:
Chi = - log2 (10^120 * phi_s(T) * p (T|H) becomes, on transformation and rounding up: Chi = Ip - 500, in bits beyond a threshold of complexity
15 --> We may debate the best threshold [I prefer 1,000 bits as that so vastly overwhelms the resources of the cosmos that the point is plain . . . ], or how Dembski estimated his threshold, or the approximations he made in so doing, but the idea of measuring info as negative log probability is a longstanding one, and the further aspect of looking beyond a threshold is premised on a reasonable point that to be in a zone of interest when on a simple RW you would overwhelmingly be expected to be elsewhere, is plainly reasonable. 16 --> H'mm that reminds me of the laws against loitering, that in effect infer that if you are hanging around in a zone of interest, you are not likely to be there by chance, but are "lying in wait" or are "casing the joint." A design inference in law, based on being found in an unusual and significant location, instead of where you would be expected to be "at random." 17 --> So, we can see that what is going on rhetorically is that one can caricature scientific progress on clarifying and improving or elaborating a fundamentally sound approach across time as though it constitutes successive admissions of failure and abandonment of one's core view. 18 --> But, a fair reading of NFL, e.g as cited above will show the core coherence and continuity in the design view, and the process of progressive improvement and elaboration, not abandonment. GEM of TKI kairosfocus
Talking point: Dembski has abandoned CSI, apart from rehashing earlier claims to get some verbiage for a new book, and has turned to active information . . . . Folks here are such “true believers” [link added, this is a serious piece of accusatory namecalling] that they generated a piteous outcry when Dembski admitted that the explanatory filter was dead . . . CSI did not pan out. Dembski has moved on, but there is a strong disincentive for him to admit that he has. 1 --> Declaring victory and walking away is one way to get out of a sticky situation. So, when one sees the sort of agenda-serving, spin-filled dismissive reconstruction of history as just cited, one's caution-flags should go up. And in this case, for very good reason: a strawman caricature is being presented to claim victory in the face of the OP's demonstration of the collapse of MG's attempt to project the view that CSI was mathematically [and conceptually] meaningless, contradictory tot he earlier use by Orgel etc, inapplicable to the real world, and incoherent. 2 --> Active information, of course, is BASED on the thinking that underlies CSI and addresses how to exceed the performance of random-walk driven search:
a: it was first shown that there is no free informational lunch. b: That is, when an evolutionary algorithm search strategy in a particular situation succeeds above RW, there is a match to the setting that is information rich, or on average the algorithms will be just as bad as RW search. c: Often, this injected active info is lurking in the implications of an intelligently constructed map of the fitness function sitting on the config space and/or in metrics of success that tell searches warmer/colder; allowing approach to oracles. (All of this, BTW, is foreshadowed in the analysis of Dawkins' Weasel and kin in Ch 4 of NFL, accessible at Google Books here.) d: It turns out that the number of possible fitness maps etc cause an exponentiation of the difficulty of search, and it is much harder to randomly find a good fit of fitness map to the config space than to simply do a RW search of the space. Unless, you are intelligent. As Dembski and Marks note:
"Needle-in-the-haystack problems look for small targets in large spaces. In such cases, blind search stands no hope of success. Conservation of information dictates any search technique will work, on average, as well as blind search. Success requires an assisted search. But whence the assistance required for a search to be successful? To pose the question this way suggests that successful searches do not emerge spontaneously but need themselves to be discovered via a search. The question then naturally arises whether such a higher-level “search for a search” is any easier than the original search. We prove two results: (1) The Horizontal No Free Lunch Theorem, which shows that average relative performance of searches never exceeds unassisted or blind searches, and (2) The Vertical No Free Lunch Theorem, which shows that the difficulty of searching for a successful search increases exponentially with respect to the minimum allowable active information being sought." . . . where, active information is defined and contextualised here: "Conservation of information theorems indicate that any search algorithm performs on average as well as random search without replacement unless it takes advantage of problem-specific information about the search target or the search-space structure. [notice, how D & M are building on the conserv of info principles suggested in NFL] Combinatorics shows that even a moderately sized search requires problem-specific information to be successful. Three measures to characterize the information required for successful search are (1) endogenous information, which measures the difficulty of finding a target using random search; (2) exogenous information, which measures the difficulty that remains in finding a target once a search takes advantage of problem-specific information; and (3) active information, which, as the difference between endogenous and exogenous information, measures the contribution of problem-specific information for successfully finding a target. This paper develops a methodology based on these information measures to gauge the effectiveness with which problem-specific information facilitates successful search. It then applies this methodology to various search tools widely used in evolutionary search."
e: Accordingly, as just cited from abstracts [the links point onwards to the full papers] the Dembski-Marks metrics of active info work by measuring the performance improvement over RW based search; on the demonstrated premise that there is no free lunch when looking for a needle in a very large haystack indeed [one that is beyond the RW search capacity of the observed cosmos] and that the cost in effort of searching for a search is proved to be much stiffer than the cost of a direct search. f: So, the talking point that CSI has been abandoned in favour of active information is a complete misrepresentation of the truth. g: unfortunately, it is a clever one, as the math is hard to follow for those not having the relevant background, and it is easy to point to recent papers on active info, and say there you have it, CSI is dead.
3 --> However, N has provided a good parallel of this tactic of spinning to claim a victory, i.e. the case where WmD said indeed that he had "dispensed with" the use of the explanatory filter, and was using straight CSI as the equivalent. 4 --> Underlying this is the explanatory cluster, where cause traces to chance and/or necessity and/or design, each leaving empirically observable signs on various aspects of a process, phenomenon or object -- causes seldom act alone but we can isolate effects of different factors on aspects of what happens or what we see:
a: Mechanical necessity leads to law-like regularities [e.g. dropped heavy objects fall], b: chance leads to statistically distributed contingencies that "map" a probability distribution [a fair die tumbles to read from 1 to 6 with each face uppermost about 1/6 of the time] c: design often leaves the trace of functionally specific, complex information [e.g. text in posts in this thread] d: to identify the differential impact, it is wise to analyse objects and phenomena in terms of aspects, then bring together the overall picture by synthesising an overall causal account. e: to illustrate, this post on your screen is not wholly explicable on the mechanics of LCD or CRT screens, though that is one aspect. f: Similarly, when the post is pulled form the hard drive where this blog post is stored and transferred over the Internet, noise will inevitably impose a statistical scatter on the signal pules. The Internet has mechanisms for recovering from noise corrupted pules. (That scatter is explained on the noise in communication systems) g: But, to explain the text in English, one has to infer to design. h: so all three aspects are involved, and play a role in the overall explanation for what you see on your screen.
5 --> Once you know that sort of step-by-step analysis on aspects of an object or phenomenon, you can often directly take the short-cut of looking for the signs, instead of explicitly ruling out regularities on high contingency, then noting that we are in a peculiar and specific zone of interest that we should not be in if this were just a statistically controlled random walk. (There are some outcomes that are suspiciously unusual . . .) 6 --> As explained here in UD WAC 30, that is a big part of what Dembski meant to say, though in part he was highlighting the problem of the way that simply speaking of the three causal factors can lead to missing the complementary way they work together in a real object or process. 7 --> The explicit introduction of the per aspect view and the following synthesis of the overall causal account addressed this genuine problem wit the explanatory filter as originally presented. 8 --> A balanced view of what happened would acknowledge the above: by focussing on aspects for analysis and then synthesising an overall causal account, the explanatory filter remains a useful context, and it identifies the significance of CSI as a sign of design. 9 --> But,unfortunately, it can be hard to resist the temptation to pounce on words that look like a concession and triumphantly trot them out to claim victory. [ . . . ] kairosfocus
Mung: It's a pity you are in mod. I Do clip one point from your Kuppers quote:
we must take leave of the idea of being able, one day, to construct intelligent machines that spontaneously generate meaningful information de novo and continually raise its complexity . . .
Since we are plainly derivative info processors and seemingly do innovate info, I think I differ. It is manifestly plausible that something like us is possible. The issue is to figure out how to build a self-moved, truly intelligent entity. Such an entity would be capable of imagination, motivation, common sense, decision and action, not mere algorithm execution. I think the trick to that is a sufficiently broad "common sense" knowledge base that allows for self- and world- modelling and projections with dynamics that reflect chance, necessity and a model of the decisions of others. Such an imaginative capacity then can be fed into actual decision-making and planning. That will require massive parallel-processing power, probably on neural net architectures. I am of course speculating here. How to do it is a challenge but I think one key observation is the point made by Eng Derek Smith in his model: the MIMO loop must have a two-tier controller, one tier handing the processing homework and I/O management, the other supplying a supervisory level. As to getting self-awareness, I have my doubts. I don't think that simply having processing loops and memory are enough, as some seem to be suggesting: spontaneous emergence of consciousness. GEM of TKI kairosfocus
That may need to be hunted down, but the inference to design on observing a big enough duplicate, is enough to support an inference to design.
Quoting Bernd-Olaf Kuppers from Information and the Nature of Reality:
But if there are no meaning-generating algorithms, then no information can arise de novo. Therefore, to understand a piece of information of a certain complexity, one always requires background information that is at least of the same complexity. This is the sought-after answer to the question of how much information is needed to understand some other information. Ultimately, it implies that there are no "informational perpetual motion machines" that can generate meaningful information out of nothing. At least it is certain that we must take leave of the idea of being able, one day, to construct intelligent machines that spontaneously generate meaningful information de novo and continually raise its complexity.
Mung
Unfortunately it appears from what others have told me that ARN is not accepting registration for new members. I've tried to keep it updated with the relevant links to posts here at UD but I've probably failed miserably, lol. It doesn't help that I'm under moderation here and have to wait forever for my posts to show up. Maybe it's time for me to open my own forums :). Mung
Mung: Thanks a million! Your links in 17 to the original challenges are very welcome. Your immediately following response is helpful, but I don't see how to link within the thread at ARN. (Here is a printoff link for the first response, onlookers.) Your objection is quite cogent:
my first observation about these four "challenges" is that none of them has anything to do with a real biological system, and only the first challenge, involving a gene duplication scenario, even comes close. So all the direct and implied claims that the inability to calculate CSI for any real biological system is what the "MathGrrl debate" is about are off base . . .
You then quote PAV:
Patently, none of those four scenarios rises to the level of actual CSI. The case of gene duplication is crystal-clear nonsense, easily dispensed with via recourse to Chaitin-Kolmogorov complexity, which is clearly incorporated into Dembski’s description of “specified complexity”. The case of ev, as I have pointed out to you at least five times, is a bit string 265 bits long, which has two sections “evolving” together, which means, roughly, that if the two sections matched entirely, the real “specified” length would by 265/2, and which produces “specificity” at only a handful of locations, with these locations moving around “randomly” (i.e., noise) once some minimum “specificity” is arrived at (which makes it then “maximum” specificity”). IOW, only about one third of the 265 bits at most are “specified”, and so you have a complexity of 2^88, or 10^30. This is inconsequential nonsense compared to one average sized paragraph in English. Even if Bill Dembski himself did the calculation for any of those scenarios, he would conclude that CSI is not present. So, how does that make ANY of those FOUR SCENARIOS worth five minutes worth of anyone’s attention?
PAV's rebuttal was well merited. As to whether our MG is the one whose CV you put up, I could be wrong, but that does not seem to be borne out by her response when mathematical specifics have been put up. And, the result of a simple transformation of Dembski's expression belie MG's confident assertions that the expression are mathematically ill defined and meaningless. A metric in bits beyond a threshold is perhaps unusual but it is not meaningless and it is arguably a powerful way to identify and measure what is being got at in the concept CSI. And in 23, you are right to identify an agenda. The questions, as I have said from the outset -- notice my for the record remarks, were heavily loaded with question begging agendas of the ilk, "have you stopped beating your wife." I clip:
I will say that the first challenge appears to be a thinly veiled attempt to get an admission that information in the genome can increase through a ‘simple’ gene duplication even. IOW, it wasn’t really about CSI at all.
Indeed. The storage capacity count obviously goes up, but the functionality and specificity of information does not, no more than when you print a copy of a book or get a copy of a program from aDownloads site. You may allow that functionality to be expressed in another way or location, even within the cell, but you have not created or originated new functionally specific complex information that did not previously exist, the material question at stake. There is no search space challenge to do that. So this is a strawman on a red herring as well. But, there is a self-defeating point, as I highlighted in the MG guest post thread and above. Copies don't appear by magic. Book implies printer or at least photocopier, or even scribe. Just so, novel copies of a gene imply a gene replicating mechanism. So, if the size of the copy is beyond the chance could easily or plausibly explain it threshold, its existence implies a copying mechanism, which will be enormously complex. So, the detection of a big enough copy implies the existence of considerable FSCO/I -- and likely it will be irreducibly complex -- to effect the copying process. That may need to be hunted down, but the inference to design on observing a big enough duplicate, is enough to support an inference to design. The computer science cases were addressed above starting at 19. And, indeed, the Hazen et al paper is significant, and the clipped ideas were part of the context of Durston's work. I particularly focus on:
Complex emergent systems of many interacting components, including complex biological systems, have the potential to perform quantifiable functions. Accordingly, we define “functional information,” I(Ex ), as a measure of system complexity . . . Functional information, which we illustrate with letter sequences, artificial life, and biopolymers, thus represents the probability that an arbitrary configuration of a system will achieve a specific function to a specified degree.
Thanks again GEM of TKI kairosfocus
Talking point: I [N] cannot see any practical benefit in concluding that some phenomenon is due to supernatural intervention, even if it actually is. Science seeks always to explain nature in terms of nature. It does not tell the Truth This is of course based on the mistake of the false contrast: natural vs supernatural, joined to the imposition of evolutionary materialism as a censoring a priori on science. In actual fact, science can and does routinely study the empirical signs that distinguish the natural [= blind chance + mechanical necessity] from the ART-ificial (or intelligent). Just check your food labels in your kitchen for cases in point. Further to this, it is a blunder to impose naturalistic [= materialistic] explanations on science, censoring it from being able to fearlessly pursue the truth based on observed evidence. Instead, science should seek to be:
an unfettered (but ethically, epistemologially and intellectually responsible) progressive pursuit of the provisionally but well warranted knowledge of the truth about our world based on empirical evidence, observations, experiment, logical-mathematical analysis, explanatory modelling and theorising, and uncensored but mutually respectful discussion among the informed.
A more detailed discussion is in the same thread as N's post, at number 279. kairosfocus
Talking point: ID is trying to reduce science to forensics, i.e. more or less to a cheap detective story whodunit. In fact, the design inference explanatory filter in large part is about the empirically reliable signs that point to causal patterns, and how we may properly infer on best explanation from sign to warranted cause: I: [si] --> O, on W That is, I (an observer) encounter and observe a pattern of signs, and infer an underlying object or objective state of affairs, on a credible warrant. In particular, in this thread, the warrant most in view is that for the empirically supported cause of CSI. On the point that specificity-complexity narrows possible configurations to independently describable, specific and hard- to- find- by- chance- dominated- search target zones in large config spaces, the Dembski inference identifies that we are warranted to conclude design if the embedded or expressed information in an object of investigation is such that:
Chi = Ip - (398 + K2) > 1, in bits beyond a threshold of sufficient complexity. Where, 398 + K2 will range up to 500 bits or so.
This is empirically well warranted [just the global collection of libraries and Internet and the ICTs industry alone constitute billions of successful tests without an exception], and it is supported by the implications of the infinite monkeys analysis. Notice, the inference it NOT to "whodunit" -- a subtle, veiled allusion to the "ID is creationism in a cheap tuxedo" smear dealt with just above -- but instead to that tweredun. To process, not to agent and identity. To use a corrective forensic example: after we have good reason to conclude arson, then we can go hunt us some possible miscreants. But, if there is no good -- empirically well-warranted -- reason to infer to such deliberate and intelligent cause, whodunit is irrelevant, and perhaps even dangerous. Repeat: the demonstrable focus of design theory -- as can easily be seen from the extended definition of ID hosted here at UD -- is on causal patterns and empirical evidence that warrants the conclusion that particular aspects of objects or states of affairs trace to a causal PROCESS involving factors of chance and/or mechanical necessity and/or design. Again: that tweredun, not whodunit. Going beyond that basic corrective, the above talking point is actually a case of the classic playground complaint: "But, he hit BACK first . . ." For, we are here dealing with a subtle turnabout false accusation. It needs to be spelled out, for this tactic works by confusing the onlookers and getting them to focus blame on the one trying to defend himself or correct a problem, instead of the one who started the fight or who hopes to benefit from the continuation of the error:
a: Those who have carried out a silent coup in science and so also b: have succeeded in imposing and institutionalising a censoring, question-begging a priori -- evolutionary materialism -- on science are now c: twisting the demonstrable but generally poorly understood facts d: to suggest that those who are exposing the coup and the need for reform to restore science to a sound basis, e: are coup-plotters and subversive trouble-makers, to be stopped.
Sadly, the tactic often works on those who do not know the real timeline of what happened and who don't know who really threw the first fist. But, in this case, the situation to be corrected is not in doubt: a priori imposition of materialist censorship on science obviously blocks science from seeking the truth about our world, in light of all the facts and all the relevant possible explanations. It is worth pausing to again hear what Lewontin had to say on this, yet again, as it needs to soak in until we fully understand what has been done to us:
To Sagan, as to all but a few other scientists, it is self-evident that the practices of science provide the surest method of putting us in contact with physical reality, and that, in contrast, the demon-haunted world rests on a set of beliefs and behaviors that fail every reasonable test . . . . It is not that the methods and institutions of science somehow compel us to accept a material explanation of the phenomenal world, but, on the contrary, that we are forced by our a priori adherence to material causes to create an apparatus of investigation and a set of concepts that produce material explanations, no matter how counter-intuitive, no matter how mystifying to the uninitiated. Moreover, that materialism is absolute, for we cannot allow a Divine Foot in the door. [[From: “Billions and Billions of Demons,” NYRB, January 9, 1997. ]
That plainly undermines the integrity of science and in the long run could lead to a collapse of the credibility of science. So the rot needs to be stopped, now. Before it is too late. GEM of TKI PS: The rebuttal made to the above talking point, here, brings out more specifics on how science can be restored, in light of the classic definition of the main methods of science by Newton, in his Opticks, Query 31. kairosfocus
Talking point: ID is indeed creationism in a cheap tuxedo, promoted by right wing ideologues intending to impose a theocratic tyranny in their war against science. --> Answered in the four linked points in the paragraph in the original post above which is being cited as though the denial is actually a confession of guilt rather than a plea to stop a destructive and dangerous slander. --> Let's clip: >> . . . . We were all busy trying to address the scientific origin of biological information, on the characteristic of complex functional specificity. We were not trying to impose a right wing theocratic tyranny nor to smuggle creationism in the back door of the schoolroom your honour.” . . . . >> kairosfocus
F/N 2: In the last linked, read esp pp 9 - 14, to see a discussion of the impact of starting on an island of function, and the implications of how we got there. kairosfocus
F/N: Dissecting evolutionary algorithms, here. Specifically addressing Weasel, Avida, Ev and Tierra, here. kairosfocus
Has anyone considered that there is an underlying logic behind the four questions? Is there a common thread behind them all other than the (I just want to see CSI in action)? I will say that the first challenge appears to be a thinly veiled attempt to get an admission that information in the genome can increase through a 'simple' gene duplication even. IOW, it wasn't really about CSI at all. But then, I'm pretty cynical. Mung
Just to keep a reminder going; above I have clipped and responded to MG's four questions in her guest post, linking that thread so onlookers can see that similar responses were developed at that time as well; just, they were ignored or brushed aside without serious response. kairosfocus
OOPS: - 478, typo. kairosfocus
14 --> You have provided an information increment value. Fed into the transformed Dembski Chi-metric:
Chi_tierra = 22 [BYTES!] - 500 = - 479 [oops] Chi_Tierra = 22*8 - 500 = 176 - 500 = -324 Again, irrelevant to the design inference
4] The various Steiner Problem solutions from a programming challenge a few years ago have genomes that can easily be hundreds of bits. The specification for these genomes 15 --> A long time ago, someone once called a country " Greenland." Green-land is of course mostly White-land. Calling a bit string in an intelligently designed software entity a "genome" does not make it so. 16 --> Hundreds of bits produced by an intelligently designed search algorithm is immediately known to be a designed output. This can be seen from say this clip from the PT page MG links:
In the Genetic Algorithm itself, the DNA strings for a population of around 2000 random solutions are tested, mutated, and bred [left off: By the coded program running on a machine acting as proxy for its intelligent programmer] over a few hundred generations. Mutation is easily performed, and requires no knowledge of actual coordinates or connections [a generic warmer/colder is good enough]. If a mutation is to affect one of the first 26 digits of an organism’s DNA [oops: you even pick mutation sites], that digit will be replaced by a random digit from 0 to 9, ensuring the mutated DNA can still be “read.” [points to a set data structure that is intelligently designed and on an island of function] (The first two digits are limited to just 00-04.) [again, island of function constraints] Likewise, mutations in the 36-bit connections map part of the DNA string [island of function constraints] will be replaced by a random new bit (T or F) . . . As is common in Genetic Algorithms, some “elite” members of the population are usually retained as-is (i.e. asexual reproduction instead of sexual reproduction). [an algorithmic choice that ensures that things move towards the target faster than otherwise, perhaps decisively faster] All that remains is getting the whole thing started, and implementing a “Fitness Test” to see which organisms have a better chance of contributing to the next generation’s gene pool. The first generation of 2000 strings is generated randomly, under constraints that the first two digits can only be 00-04, the next 24 digits can be any digits 0-9, and the last 36 bits T or F. [notice the island of function data structure constraints specified by the intelligently designed algorithm; the "body plan" is fixed in advance] Each generation is tested for “fitness,” and the individuals with higher fitness are assigned a higher probability of making it into the next generation. While mating is biased toward organisms with better fitness, because it is a stochastic process, even low-fitness individuals can overcome the odds and breed once in a while.
17 --> In short, hill climbing, within the search resources of the Dembski threshold, starting inside an island of intelligently arrived at function. 19 --> A case of 62 bits is given [added: these calcs are illustrative of how the Chi-500 metric works based on identified cases of info given], which is of course Chi_tierra = 62 - 500 = - 438, well within the threshold where chance plus selection are credible sources. is “Computes a close approximation to the shortest connected path between a set of points.” 18 --> Computes is a give-away term, again: intelligently designed, within a designed in island of function. Irrelevant to the real problem to be tackled. (ADDED: That is, even if we can make info in excess of threshold, the info is coming from an intelligent source, not blind chance and necessity; hence the significance of actual cases of random text generation, that peak out at 24 ASCII characters [ = 168 bits, very close to the results for Tierra] so far, per the tests reported by WIKI. This provides evidence that config spaces of 10^50 or so possibilities are a reasonable upper bounded practical scope of search under realistic in-lab conditions on earth. Just as thermodynamicists had suggested decades ago. The Dembski type 500 bit threshold is best seen in light of Abel on universal plausibility as a solar system upper limit, and the 1,000 bits as an observed cosmos limit. These expanded limits give wiggle room for fairly large islands of function by shifting from a strictly probability base to one on upper limit of number of events at quantum level available to be used in searches. If a scope of possibilities significantly exceeds that number for the scope in which a search should be done, it is unreasonable that the needle in that big of a haystack would be expected to be found on a chance based random walk plus trial and error with locking-in of success. One's Blind Watchmaker, chance plus necessity theories of origins should not depend too much on statistical miracles.) 19 --> In addition, the program needs to answer to the challenge that the degree of complexity of the targets has to be so constrained that the search within the island of function does not find itself looking at truly isolated functions. That is there is a search resources issue that is being ducked by using fairly simple toy scale cases. Real genomes start north of 100,000 bits, and go on up from there to the billions. >> __________________ The first case, overlooks the key issue of where do copies come from, which is anything but simple. If a copy originates and is seen, that pushes the pattern STRING + STRING COPY over the 500 bit threshold, where the copying is not credibly a coincidence but credibly an output of a mechanism that reads and replicates, that is itself an index of sophisticated organised complexity. The Chi-metric's implied conclusion of design is warranted. In short, this is actually a success for the Chi-metric. We see that the other three cases are all computer simulation exercises, pre-loaded with designed algorithms that start on islands of function. The scope of search is also controlled so that it does n0t have to look for targets in spaces beyond search resource limits. But hat is precisely the root problem being highlighted by the design inference analysis. So, we observe:
1: the part of the search that is in a vast config space: specifying the code for the algorithm, is intelligently designed. This puts you on an island of function to begin with. 2: Where a random search feature is incorporated, the scope of the implied secondary config space is restricted, so that the existing search resources are not overwhelmed, i.e the lottery is designed to be winnable based on restricted scope.
The examples inadvertently demonstrate the point of the Chi-metric's threshold! Namely, to operate on the far side of the threshold, you need intelligence to narrow down scope of trial and error based search. And so it is no surprise to see the results of such calculations as were possible: a: gene duplication beyond the threshold [the duplicate needs to be over 500 bits], points to a complex machinery of duplication, and this can best be assigned to design. b: ev -- a search within an island of function, with all sorts of built-in search enablers -- peaks out at under 300 bits, well within the threshold, as was seen by plugging in the number 300 into the transformed Chi-metric equation. c: Tierra, again within a designed island of function, on the 62 bit case cited, on plugging in 62 bits and subtracting 500, is well within the threshold of the Chi_metric. d: Steiner, is again within an island of function, and the scope of intelligently controlled search for solutions is well within the search resources threshold. In all cases, the transformed and simplified Chi-metric simplifies the calculation problem, as we only need to find the relevant information, in bits. GEM of TKI kairosfocus
MG's four "CSI challenge" scenarios, addressed: These come from the OP, On the Calc of CSI guest thread. Observe, first, how only one of the four is an actual genetic situation, and the other three have to do with the operation of designed computer programs that were artificially set up [after, doubtless, considerable troubleshooting] to be specifically functional. In one case, the structure of the digit string termed the "genome" was explicitly explained as very constrained, showing just how much these exercises are about moving around on islands of specific function, set up by known intelligent programmers. The case also shows how, where random search elements are introduced, the degree of scope is very carefully constrained in ways that do not overwhelm search resources, i.e. the search is within the 500 - 1,000 bit space threshold, directly or by cutting down qualified "random" variation. The large config space part of the challenge is tacked by intelligence, and randomness is held within search resources limits by that design. On this alone, the simulations are of micro evolution, not body plan level macro evolution. And the remark that Schneider's ev is based on investigation of evo in observed organisms, underscores this: observed evo cases are all micro. Macro evolution at body plan level is an extrapolation and inference, not an observation. But it is the later that is under question; the key question is being (unconsciously) begged. I shall interleave comments: __________________ >> 1] A simple gene duplication, without subsequent modification, 1 --> As I said originally, there is nothing simple about this act. The additionality of the function of copying implies a facility capable of that, which is a complex and specific function in itself. If the difference in bit-count is sufficient that it is not credible that this duplication has somehow occurred by chance, the implication of a mechanism is consistent with the known root of such consistency. The inference to design in that context is inductively valid on the empirical pattern of cause of such. 2 --> The copy without modification is of course no increment to actual information, or function; as is pointed out above and as has been pointed out previously when this was raised. No explicit calculation is needed. The issue of FSCI is not how many copies of the information exist but how the specific functional organisation of data elements common to the copies first occurred; i.e. its origin and cause. that increases production of a particular protein from less than X to greater than X. 3 --> This is about increase in EXPRESSION of a gene's effect, not increase in functional information The specification of this scenario is “Produces at least X amount of protein Y.” 4 --> After the fact target painting; [Added, 04:22:] cf. PAV's remarks at 50 - 51 below for more. In effect the EVENT that lands one in the target zone is the critical issue, and it is such that the chance hyp for this per our knowledge of copying yields a value of novel info of zero; but of course this points to the issue of a copying mechanism, so one may infer that if G_copy is > 500 bits, then the existence of two copies entails a complex copying mechanism, which will all but certainly be both irreducibly complex to function and beyond the 500-bit threshold itself, so the observed result, of a duplication of an original, implies design on FSCO/I as a reliable sign. 2] Tom Schneider’s ev evolves genomes 5 --> Question-begging description of the action of ev. Ev --[Added 04:22:] as Mung shows in 119 & 126 (cf also 129) below -- is an intelligently designed program that starts out at an island of function in a config space, and is designed to produce the "genomes" that result. Dembski et al aptly remark, on dissecting ev:
The success of ev is largely due to active information introduced by the Hamming oracle and from the perceptron structure. It is not due to the evolutionary algorithm used to perform the search. Indeed, other algorithms are shown to mine active information more efficiently from the knowledge sources provided by ev. Schneider claims that ev demonstrates that naturally occurring genetic systems gain information by evolutionary processes and that "information gain can occur by punctuated equilibrium". Our results show that, contrary to these claims, ev does not demonstrate "that biological information...can rapidly appear in genetic control systems subjected to replication, mutation, and selection". We show this by demonstrating that there are at least five sources of active information [intelligently injected info that increases performancs of a search type program above the random trial and error case] in ev. 1. The perceptron structure. The perceptron structure is predisposed to generating strings of ones sprinkled by zeros or strings of zeros sprinkled by ones. Since the binding site target is mostly zeros with a few ones, there is a greater predisposition to generate the target than if it were, for example, a set of ones and zeros produced by the flipping of a fair coin. 2. The Hamming Oracle. When some offspring are correctly announced as more fit than others, external knowledge is being applied to the search and active information is introduced. As with the child's game, we are being told with respect to the solution whether we are getting "colder" or "warmer". 3. Repeated Queries. Two queries contain more information than one. Repeated queries can contribute active information. 4. Optimization by Mutation. This process discards mutations with low fitness and propagates those with high fitness. When the mutation rate is small, this process resembles a simple Markov birth process that converges to the target. 5. Degree of Mutation. As seen in Figure 3, the degree of mutation for ev must be tuned to a band of workable values. (George Montañez, Winston Ewert, William A. Dembski, and Robert J. Marks II, "A Vivisection of the ev Computer Organism: Identifying Sources of Active Information," Bio-Complexity, Vol. 2010(3) (internal citations removed).)
using only simplified forms of known, observed evolutionary mechanisms, that meet the specification of “A nucleotide that binds to exactly N sites within the genome.” 6 --> Cf cite just above. The length of the genome required to meet this specification can be quite long, depending on the value of N. (ev is particularly interesting because it is based directly on Schneider’s PhD work with real biological organisms.) 7 --> PAV also pointed out -- cf 50 - 51 below -- that the level of resulting info "produced" by Ev, is within the threshold for CSI; i.e it is within what is plausible for chance driven change to do, not beyond the search resources threshold:
Less than 300 bits of functional information appear in the output (IIRC, in a previous thread PAV put 266; below he argues for much less); no particular calculation is needed to show that -- as was already shown -- 300 bits [to be fed into the transformed Chi-metric] is within a 500 bit threshold: Chi_ev = 300 - 500 = -200, within the reach of chance and necessity. Irrelevant to the inference to design.
8 --> And, we must not forget, this all starts inside an island of function arrived at by intelligent design. (Cf Mung 126 below) That is at best this models micro-evolution, not the relevant body plan level macro evolution. 9 --> such is actually implicit in the remark: "it is based directly on Schneider’s PhD work with real biological organisms." Indeed, where WE HAVE NEVER OBSERVED BODY PLAN LEVEL MACROEVOLUTION, ONLY SMALL VARIATION AKA MICROEVOLUTION. 11 --> Ev transforms underlying functional information, according to progams, and achieves generic as opposed to specific pre-programmed targets [the perceptron pushes towards sparse 0 outcomes -- BTW this filters out most of the config space -- and the warmer/colder performance filter selects for those that are closer to the generic target. 12 --> Subtler than WEASEL, but in the same general class. Information is being processed and expressed by intelligently designed software, not appearing as a free lunch from non-foresighted phenomena such as chance variation and pure trial and error picking for any function. 3] Tom Ray’s Tierra routinely results in digital organisms with a number of specifications. 13 --> Again, a question-begging description, here as "organisms." These are no more organisms than the SIMS characters, they are programmed virtual actors on a virtual stage, based on the output of an intelligently designed algorithm, inputted data structures and contents, and controlled execution on a machine that was equally intelligently designed. One I find interesting is “Acts as a parasite on other digital organisms in the simulation.” The length of the shortest parasite is at least 22 bytes, added 13a --> Notice, the 22 byte figure comes from MG but takes thousands of generations to evolve. [ . . . ] kairosfocus
Hi kairosfocus, I'm not sure whether this paper has been mentioned:
Complex emergent systems of many interacting components, including complex biological systems, have the potential to perform quantifiable functions. Accordingly, we define “functional information,” I(Ex), as a measure of system complexity. For a given system and function, x (e.g., a folded RNA sequence that binds to GTP), and degree of function, Ex (e.g., the RNA–GTP binding energy), I(Ex) = ?log2[F(Ex)], where F(Ex) is the fraction of all possible configurations of the system that possess a degree of function ? Ex. Functional information, which we illustrate with letter sequences, artificial life, and biopolymers, thus represents the probability that an arbitrary configuration of a system will achieve a specific function to a specified degree. In each case we observe evidence for several distinct solutions with different maximum degrees of function, features that lead to steps in plots of information versus degree of function.
http://www.pnas.org/content/104/suppl.1/8574.full Mung
I've posted the original challenges: 1. here 2. 1. here 3. here 4. and here Mung
PS: if a copy of an earlier informational entity is made, then the copy is transformed into something that does a different function, the increment in CSI for the changes, may be relevant, and the process of transformation may also be VERY relevant. For instance, Venter and other genetic engineers. kairosfocus
MG attempted rebuttal: 1: you still can’t provide a rigorous mathematical definition of CSI As by now you know or should know, CSI is not a primarily mathematical concept or construct, like say a Riemann integral or a limit. It takes reality from being an observed characteristic of systems of interest, being described. As in: science seeks to describe, explain, predict then control or at least influence on the realm of observable phenomena, objects, processes etc. The definition of CSI is actually in its name, after all the term is a description: "complex, specified information," as is seen in text in English, or in computer programs, or in DNA's code, and indeed in the complex functional organisation of the cell. Such CSI -- and its functionally specified subset FSCI [note Dembski in NFL as cited in the OP] -- as has been shown, is capable of being modelled mathematically, and it is reducible to metrics in various ways, as can be seen above for Dembski and Durston. If these do not satisfy you as ADEQUATE mathematical models, that relate to observable realities and produce useful results, then that is because you are playing the self-refuting game of selective hyperskepticism, in Sagan's evidentialist form. Here is the key epistemological blunder, corrected:
extraordinary [to me] claims require extraordinary [ADEQUATE] evidence
2: you still can’t show examples of how to calculate it for the four scenarios I described in my guest post Your guest post thread had in it significant responses that provided relevant calculations, as well as relevant corrections, and counter challenges to you to address same. Observe in particular Dr Torley's responses, and onward I have reduced the matters as above. Once we can produce a reasonable estimate of the information content of an entity, we can derive a Dembski style bits beyond a threshold value. In the case of the example that strikes my mind just now, duplicated genes, the information is just that, a duplicate, its incremental information content is ZERO, as was repeatedly pointed out; but brushed aside. The copy itself does not say any more about the information's origin than was already in the original. However, as I pointed out in that thread, a copy does not come from nowhere, so if there is a process that is able to replicate info like that, it is functional, and the presence of a copy like that shows that we have a further function and so if the copy and the original are both present in a system, that points to an increment of FSCI, not in the copy but in the system capable of copying -- indeed maybe a huge increment given the implicated cellular machinery. That too was studiosly ignored. Similarly, PAV has pointed out that for ev [apparently best of breed], the increment in info is well within 300 bits, i.e it is within the range where the CSI threshold does not apply. Worse, as I repeatedly pointed out, ev and similar programs all start within an island of function that was arrived at intelligently, and have in them co-tuning of matched parts that leads to expression of existing FSCI. this too was ignored. I forget for the moment the other cases, do remind me. However, the track record from my viewpoint is that you ducked cogent answers again and again, as you have also ducked on the conceptual links between Dembski and Orgel-Wicken. the cite from NFL -- which you claim to have read -- at the top of the thread exposes the hollowness of the assertion that the CSI concept is "meaningless," or cannot be mathematically coherently expressed, or that what Orgel et al meant and what Dembski means are materially in conflict. Why not clip your four specific challenges that you claim have not been addressed, and let us see if there is not an adequate answer in UD? In particular, what is your answer to my reduction of the Dembski metric on a basis of its literal statement, to an information value in bits beyond a threshold, and the use of this with Durston's Fits as a reasonable estimate of the information content of relevant target zones? 3: you admit that Durston’s metric will not generate the same values as Dembski’s, therefore you claim victory. This is an outright strawman, st up in total caricature of the real issue, and pummelled. Durston is addressing in effect one term in the Dembski metric as I reduced it to information. recall, for convenience, from OP here and previous threads where you conspicuously did not address this:
2 –> Following Hartley, we can define Information on a probability metric: I = – log(p) 3 –> So, we can re-present the Chi-metric: Chi = – log2(2^398 * D2 * p) Chi = Ip – (398 + K2) 4 –> That is, the Dembski CSI Chi-metric is a measure of Information for samples from a target zone T on the presumption of a chance-dominated process, beyond a threshold of at least 398 bits, covering 10^120 possibilities. 5 –> Where also, K2 is a further increment to the threshold that naturally peaks at about 100 further bits. (In short VJT’s CSI-lite is an extension and simplification of the Chi-metric.) 6 –> So, the idea of the Dembski metric in the end — debates about peculiarities in derivation notwithstanding — is that if the Hartley-Shannon- derived information measure for items from a hot or target zone in a field of possibilities is beyond 398 – 500 or so bits, it is so deeply isolated that a chance dominated process is maximally unlikely to find it, but of course intelligent agents routinely produce information beyond such a threshold. 7 –> In addition, the only observed cause of information beyond such a threshold is the now proverbial intelligent semiotic agents. 8 –> Even at 398 bits that makes sense as the total number of Planck-time quantum states for the atoms of the solar system [most of which are in the Sun] since its formation does not exceed ~ 10^102, as Abel showed in his 2009 Universal Plausibility Metric paper. The search resources in our solar system just are not there.
I have now updated an earlier comment, taking the 500 bit threshold and deducing specific Dembski Chi-values using Durston Ip values in fits. Here are my results, on Durston's Table 1:
Using Durston’s Fits from his Table 1, in the Dembski style metric of bits beyond the threshold, and simply setting the threshold at 500 bits: RecA: 242 AA, 832 fits, Chi: 332 bits beyond SecY: 342 AA, 688 fits, Chi: 188 bits beyond Corona S2 445 AA, 1285 fits, Chi: 785 bits beyond. –> The two metrics are clearly consistent, and Corona S2 would also pass the X metric’s far more stringent threshold right off as a single protein. (Think about the cumulative fits metric for the proteins for a cell . . . ) –> In short I am here using the Durston metric as a good measure of the target zone’s information content, which Table 1 also conveniently reduces to bits per symbol so we can see how the redundancy affects the information used across the domains of life to achieve a given protein’s function; not just the capacity in storage unit bits [= no AA's * 2]
That looks quite coherent, useful and relevant to me. GEM of TKI kairosfocus
Mathgrrl (#11) You write to Kairosfocus: "you still can't provide a rigorous mathematical definition of CSI." Surely you jest, Mathgrrl. You have already been given a few rigorous mathematical definitions of CSI. If these do not satisfy you, I'm convinced that nothing will. I might add that I have yet to see any you perform any rigorous mathematical calculations on this blog, despite several invitations to do so. When one is engaging in open dialogue with an intellectual opponent, one should always be prepared to make concessions. I made several in my post on CSI-lite. I have yet to see any concessions on your part. You've been given an opportunity to voice your opinions on this blog. It would be a pity if it were wasted. Think about it, please. vjtorley
MathGrrl, Your 4 examples are bogus for the reasons already provided. Also you have been provided with a defnition of CSI that has rigorous mathematical components. So stop blaming us for your obvious inadequacies. Ya see that is why there are posts dedicated to you. Now grow up... Joseph
kairosfocus, So, to summarize, you still can't provide a rigorous mathematical definition of CSI, you still can't show examples of how to calculate it for the four scenarios I described in my guest post, and you admit that Durston's metric will not generate the same values as Dembski's, therefore you claim victory. Did I miss anything? MathGrrl
MG attempted rebuttal: If Durston’s metric is actually the same as Dembski’s (or a “subset” thereof, whatever that means), you should be able to demonstrate that mathematically. Either compare the two and demonstrate a transformation from one to the other or show how applying Durston’s metric and Dembski’s metric to the same systems results in the same answer, consistently. This is a strawman misrepresenatation of what was said, and it is very handy for pummelling. I have not argued that Durston and Dembski will create teh same values, as the metrics are distinct. Durston does not give a specific threshold but argues that the metric indicates difficulty of finding a given island of function in a space of possibilities, i.e, it is a metric of complex specified info, were specificity is on function and scope of island is on the observed variation of AA sequences for protein families. Info is based on an estimate of increase in H on moving from a ground to a functional state. This is conceptually consistent -- something that on Orgel you plainly have a challenge with -- with Dembski and the results will give generally consistent results once thresholds of improbability are applied, such as 398 - 500 bits or the like. The point as the cite from Durston that you cleverly duck -- cf OP above and cites in the other thread too, also ducked -- shows that:
Consider that there are usually only 20 different amino acids possible per site for proteins, Eqn. (6) can be used to calculate a maximum Fit value/protein amino acid site of 4.32 Fits/site [NB: Log2 (20) = 4.32]. We use the formula log (20) – H(Xf) to calculate the functional information at a site specified by the variable Xf such that Xf corresponds to the aligned amino acids of each sequence with the same molecular function f. The measured FSC for the whole protein is then calculated as the summation of that for all aligned sites. The number of Fits quantifies the degree of algorithmic challenge, in terms of probability [info and probability are closely related], in achieving needed metabolic function. For example, if we find that the Ribosomal S12 protein family has a Fit value of 379, we can use the equations presented thus far to predict that there are about 10^49 different 121-residue sequences that could fall into the Ribsomal S12 family of proteins, resulting in an evolutionary search target of approximately 10^-106 percent of 121-residue sequence space. In general, the higher the Fit value, the more functional information is required to encode the particular function in order to find it in sequence space. A high Fit value for individual sites within a protein indicates sites that require a high degree of functional information. High Fit values may also point to the key structural or binding sites within the overall 3-D structure.
Note the telling highlights. Kindly cf the OP above point 10. So, the whole focus of the rebuttal point is misdirected and conceptually erroneous. Durston is measuring functionally specific complex information, which is -- as has been pointed out in the UD WACs 27 - 28 for years -- a sub set of CSI. Indeed,the cites in the OP from Dembski's NFL that you claim to have read, make this crystal clear. And they also show that biological origins and biological function are contexts for CSI and ways to cash out specification, respectively. Perhaps, you should at least use the Google books link and read NFL ch 3's excerpts? GEM of TKI PS: Using Durston's Fits from his Table 1, in the Dembski style metric of bits beyond the threshold, and simply setting the threshold at 500 bits: RecA: 242 AA, 832 fits, Chi: 332 bits beyond SecY: 342 AA, 688 fits, Chi: 188 bits beyond Corona S2 445 AA, 1285 fits, Chi: 785 bits beyond. --> The two metrics are clearly consistent, and Corona S2 would also pass the X metric's far more stringent threshold right off as a single protein. (Think about the cumulative fits metric for the proteins for a cell . . . ) --> In short I am here using the Durston metric as a good measure of the target zone's information content, which Table 1 also conveniently reduces to bits per symbol so we can see how the redundancy affects the information used across the domains of life to achieve a given protein's function; not just the capacity in storage unit bits [= no AA's * 2 4.32 (oops, I obviously had DNA bases in mind)] kairosfocus
Talking point: islands of function don't exist [for biological systems] . . . Answered in this thread, no 4, point 2 above. kairosfocus
Talking point: But the fitness landscape is varying with time and your argument does not address that, it is useless . . . Answered in the same two challenges thread here, as well as in 2- 3 above. This one was actually anticipated by Dembski in NFL, ch 4, where he pointed out that fitness landscapes will most likely be dynamic across time. (One of the ways these talking points work, is that they are based on our likely ignorance of the prior moves and discussions and the history of the debates over ID, much less our failure to read key classics. Inexcusable in w world where a lot of key books are excerpted on Google.) kairosfocus
Talking point: You're just making a tornado in a junkyard [builds a jumbo jet by chance] argument . . . Answered in the two challenges thread here, as well as above. kairosfocus
Talking point: The multiverse swamps out calculations of odds etc and gives enough resources, and there coulr be 10^150 evolutionary paths Eugene S gives an answer in the two challenges thread here. Nullasalus gives another helpful answer here in the same thread, that focuses on numbers of evolutionary paths. kairosfocus
Talking point: the dog-wolf kind as an example of evolution in action. Answers in the MG smart thread here. Joseph has a great zinger here kairosfocus
Onlookers: Monitoring and rebutting the latest waves of talking points spreading out in swarms from the evo mat fever swamps . . . 1: But maybe evolution went down 10^150 paths so probability calculations are useless . . . This is empty metaphysics, as Null pointed out. (Are we talking empirically supported science or unbridled metaphysical speculation where one may build castles in the air to suit one's materialistic fancy?) We have observational evidence -- a key criterion for science -- of precisely one cosmos. (And a multiverse has in it a sub-cosmos bakery which is fine tuned to produce life habitable sub-cosmos, leading onwards to design anyway. Cf the discussion of cosmological fine tuning including multiverses here.) If one changes the subject to philosophical speculation like this, then ALL live option alternatives must be on the table, and ALL evidence supporting them is on the table too. 2: Islands of function don't exist, life is on a smoothly varying tree with neat fine gradations from the LUCA to us. Where is the evidence for this? As I pointed out in comment 3, points 24 ff above, the evidence as Gould and Loennig highlight is that life is in distinct islands of function, due to the irreducible complexity of complex functions and the difficulty of searching out the resulting islands of function. In particular, the key workhorse molecules in the cell, proteins, come in very distinct and deeply isolated fold domains, and have to do a key-lock fit to work, especially as enzymes or in agglomersted cluster proteins. (Think: sickle cell anaemia, haemoglobin mutations, and prions giving rise to mad cow disease off misfolding here to see what misfolding does.) Then, think about the Cambrianm life revolution which precisely shows the sudden appearance with no clear ancestry for the major animal life body plans. The tree of life icon is one of the most fundamentally misleading icons in the evo mat gallery of miseducation. 4: "xxx-bit calculations are only relevant for tornado in the junkyard scenarios." This -- as has repeatedly been pointed out but willfully, even stubbornly closed-mindedly ignored in the agenda to spread persuasive but flawed rebuttal points -- is itself a strawmannising of Sir Fred Hoyle's point. The principle is that complexity swamps out trial and error on chance variation, very quickly. All the way back to Cicero -- the quote at the top of my always linked note, we see:
Is it possible for any man to behold these things, and yet imagine that certain solid and individual bodies move by their natural force and gravitation, and that a world so beautifully adorned was made by their fortuitous concourse? He who believes this may as well believe that if a great quantity of the one-and-twenty letters, composed either of gold or any other matter, were thrown upon the ground, they would fall into such order as legibly to form the Annals of Ennius. I doubt whether fortune could make a single verse of them. How, therefore, can these people assert that the world was made by the fortuitous concourse of atoms, which have no color, no quality—which the Greeks call [poiotes], no sense? [Cicero, THE NATURE OF THE GODS BK II Ch XXXVII, C1 BC, as trans Yonge (Harper & Bros., 1877), pp. 289 - 90.]
To get to Hoyle's jumbo jet by a tornado in a junkyard is a probabilistic miracle: scanning multi Gigabit config spaces by lucky noise. But so is to get to a "simple" functional gauge on its dashboard by the same means, probably a kilobit level exercise. Or, for that matter, to produce one full length 143 ASCII character tweet by successful chance. Or, one 200 AA typical protein fro the hundreds used in event the simplest cells. Remember, you don't get to successful reproducing cell based life on the only observed architecture until you have a metabolic entity joined to a von Neumann self replicator. Where, this last means you have to have symbolic code, algorithms, effector arms and other execution machines, all together and properly organised. An irreducibly complex system. Without this, natural selection on differential reproductive success does not possibly exist for cell based life. On evidence this needs 100+ k bits of DNA stored information. The config space for that at the lower end is ~ possibilities. Hopelessly beyond our observed -- talking point 1 above shows why I consistently use this word -- cosmos' search capacity. Going on, to get to a new body plan, you have to generate mutations that change development paths in co-ordinated ways, that are expressed very early in embryological development, as body plans form quite early. But, chance changes at this stage are far more likely to be fatal in utero or the equivalent. The islands of function Indium and others would wish away just re-appeared, on steroids. 5: "you assume that evolution *had* to proceed along a *certain* path, for example from ape-like creatures to humans. Of course you can calculate some amazingly low probability values then. But evolution could have went along 10^150 different paths" Excuse me, but isn't it you evolutionary materialists who are advocating that we descend from apes and trot out the 98% similarity of our proteins to chimp proteins at least as you count the similarity? Well, you then need to account for the origin of the human body plan, including the origin of the physical capacity to use verbal language, itself crucial to the explanation of the origin of the credible mind, another -- and self referentially fatal -- gap. Brain, voice box, co-ordination with hearing and more. Credibly millions of base pairs worth. And, the solar system's quantum state resources are not 10^150 possibilities, they are like 10^102. And, by far most of the atoms in question are in the sun. You don't begin to have the required time, reproductive capacity to generate populations to compete and allow mutations to get fixed. And since deleterious muts that are marginal in effects and cumulative dominate, you are looking at genetic decay and deterioration on aggregate, not creation of something so complex as language ability. More broadly, you are twisting the actual points on the challenge of navigating to islands of function. There is absolutely no discussion of specific paths involved. All random walks from arbitrary initial points are allowed, but these have got to bridge the sea of non-functioning configs to get to shores of islands of co-ordinated complex function without exhausting the available search resources. 1,000 bits, my preferred threshold, swamps not only solar system or galactic level resources, but turns the scope of the cosmos to a 1 in 10^150 fraction of the set of possibilities. Not, for 10 Mbits or even 100 k bits worth of functional info, but for just 1,000 bits. If you have a significant controller to program, you can't sneeze in 1,000 bits or just 125 bytes! Matter of fact, the other day I set up a blank Word 97 doc, and took a look at the internal file. 155 k bits, with no serious content. Loads and loads of apparently repetitive y-double dots and other seemingly meaningless repeats. But, I bet these are all quite functional. The living cell is capable of doing a vNSR-metabolic entity in roughly the same space! The desperation that led to this strawman with an even worse suggestion on apes and men, is telling. 6: all your calculations are useless! This is of course quite a step back from they don't exist and/or are about something that is meaningless. THE LINE IS BENDING BACK IN RETREAT AND IS ABOUT TO BREAK. The blunders in point 5 just above show that on the contrary, the calculations are useful indeed. _____________ Onlookers, I suspect the tactic here is to try to bury this thread in obscurity, raising challenges elsewhere. (Hundreds of looks but one objecting comment and that studiously vague, sounds just a bit suspicious to me.) So, I announce a policy: in the onward addressing of the MG challenge, I will answer the fever swamp talking points here, and then link from wherever to this thread, so that this thread will become the reference on the fever swamp talking points. So, the adventure continues as the CSI gang appeals the kangaroo court decision from behind the bars of HM's cross-bar hotel. (Hey, the food ain't half bad . . . and the company and conversation are first class . . . ) GEM of TKI kairosfocus
18: In this context, gene duplication events, i.e. duplication -- not de novo creation -- of existing information based on a failure or mutation of a regulatory network, is already within an island of function, and is going to be within the micro-evolutionary threshold. 19: For such a du0plicaiton event to have evolutionary significance, it has to fit in with a regulatory network, and possibly vary to form up a new functional protein or something like that. 20: This runs right back into the search space challenges as pointed out ever since, and will not rise tot the level of creation of novel body plans; on the evidence first life needs 100 k+ bits of de novo FSCI, and novel multicellular body plans need 10 mn+ bits; on the observable evidence. 21: This, to get to the shores of islands of function BEFORE hill-climbing in a dynamic landscape can even begin. 22: gene duplication, gene exchange and the like can explain microevolutionary adaptations -- and notice the ability to transfer genes between organisms has to be a complex functional, organised capacity itself -- but they do not explain novel body plans, starting with the first one. 23: Similarly, point mutations and splicing in of strings of bases can explain microevo but will not rise to the level of body plan macroevolution, for want of search resources. 24: Fundamentally, we must reckon with the overwhelming record of the fossils, as Loennig points out -- and the only really clean explanation out there is that such reflect tightly integrated, irreducibly and functionally complex organisation required for life and for successful fit to environmental niches. 25: So, as Loennig published in the literature in 2004, in his Dynamic Genomes paper (I cite from my always linked, which has been one click away for every comment post I have ever made at UD):
the basic genetical questions should be addressed in the face of all the dynamic features of ever reshuffling and rearranging, shifting genomes, (a) why are these characters stable at all and (b) how is it possible to derive stable features from any given plant or animal species by mutations in their genomes? . . . . granted that there are indeed many systems and/or correlated subsystems in biology, which have to be classified as irreducibly complex and that such systems are essentially involved in the formation of morphological characters of organisms, this would explain both, the regular abrupt appearance of new forms in the fossil record as well as their constancy over enormous periods of time. For, if "several well-matched, interacting parts that contribute to the basic function" are necessary for biochemical and/or anatomical systems to exist as functioning systems at all (because "the removal of any one of the parts causes the system to effectively cease functioning") such systems have to (1) originate in a non-gradual manner and (2) must remain constant as long as they are reproduced and exist. And this could mean no less than the enormous time periods mentioned for all the living fossils hinted at above. Moreover, an additional phenomenon would also be explained: (3) the equally abrupt disappearance of so many life forms in earth history . . . The reason why irreducibly complex systems would also behave in accord with point (3) is also nearly self-evident: if environmental conditions deteriorate so much for certain life forms (defined and specified by systems and/or subsystems of irreducible complexity), so that their very existence be in question, they could only adapt by integrating further correspondingly specified and useful parts into their overall organization, which prima facie could be an improbable process -- or perish . . . . According to Behe and several other authors [5-7, 21-23, 53-60, 68, 86] the only adequate hypothesis so far known for the origin of irreducibly complex systems is intelligent design (ID) . . .
26: In short the irreducibly complex functionally specific organisation and associated information in cell based life fits well with the only actual record we have of life from the remote past of origins, the fossil record. Namely, sudden appearance, stasis, disappearance and/or continuity into the modern era. Gould documents this:
. . . long term stasis following geologically abrupt origin of most fossil morphospecies, has always been recognized by professional paleontologists. [The Structure of Evolutionary Theory (2002), p. 752.] . . . . The great majority of species do not show any appreciable evolutionary change at all. These species appear in the section [[first occurrence] without obvious ancestors in the underlying beds, are stable once established and disappear higher up without leaving any descendants." [p. 753.] . . . . proclamations for the supposed ‘truth’ of gradualism - asserted against every working paleontologist’s knowledge of its rarity - emerged largely from such a restriction of attention to exceedingly rare cases under the false belief that they alone provided a record of evolution at all! The falsification of most ‘textbook classics’ upon restudy only accentuates the fallacy of the ‘case study’ method and its root in prior expectation rather than objective reading of the fossil record. [p. 773.]
27: In short, the two main design theory explanatory constructs --
(i) CSI/FSCI and resulting search space challenges, and (ii) irreducible complexity of major multipart functional systems and networks based on the need for mutual fitting of parts and provision of all required parts for function --
. . . provide a direct and cogent explanation for the predominant features of the real fossil record, as opposed to the one highlighted in headlines, museum displays and textbooks. 28: A careful examination will show that there is not one empirically well warranted case where body-plan creating macroevolution has been documented as observed fact, and/or a mechanism that would credibly cause such body plan innovation [beyond the search threshold challenge] has been empirically demonstrated and observed. 29: With one possible exception. DESIGN, by intelligent designers. Venter's recent results, and the field of genetic engineering, are highly illuminating as to what design is already known to be capable of, and what it promises to be capable of; indeed there are already discussions on novel body plan beasts. _________________ I trust this provides at least a context for answering to the various questions you and others may have. GEM of TKI kairosfocus
Graham: First, observe that for several days I have been bringing this set of considerations to MG's attention, and she has consistently ducked out, starting with the analysis of what Dembski's metric boils down to and how it works on the ground, in a manner that is reasonable. That evasiveness, to my mind, is revealing. I also notice that you just vaguely allude to the questions MG has asked. She has asked many, many loaded questions in recent weeks, so which of them are you specifically interested in? [NB: Onlookers, there is no end of possible loaded questions and further questions on such questions -- just ask the shade of Torquemada -- so I have taken a different approach, undermining and blowing up the foundation for MG's claims.] Now, my point in the above post is to show how MG's entire underlying case collapses (and her main questions have been long since answered elsewhere -- including in NFL; which I doubt she has seriously and carefully read; cf the Google online link provided above). 1: CSI -- from the horse's mouth [note the NFL cite] -- is meaningful, and conceptually defined by Orgel and applied by him to life vs distinct things that are not functionally organised but ordered or random, using a concept familiar from the world of technological design. 2: Even before it is quantified, it is meaningful. 3: The usage by Orgel and that by Dembski are conceptually consistent not contradictory, as was alleged or implied. 4: It is measurable, on several metrics, tracing to Dembski (and as modified by others, including say VJT), to Durston et al [who, recall, published values of FSC for 35 protein families], and the simple brute force X-metric. 5: The Durston and Dembski metrics are consistent and closely related, though they address the measurement challenge in somewhat different ways. 6: Both end up in the situation of providing a context for examining how, once something is specific and sufficiently complex, the algorithmic challenge to find it by in effect chance-dominated processes. 7: The Dembski metric boils down [regardless of debates on how it got there] to finding a probability based negative log information value and comparing it to a threshold of 398 to about 500 bits, as an index of being sufficiently complex (while being specific to a "small" target zone) to be too isolated to be found by chance-dominated processes. 8: The simple, easily understood X-metric also does much the same, though it uses a far harder threshold, one that would overwhelm not the Planck time quantum state resources of our solar system but the cosmos as a whole. (NB: In the time for the fastest nuclear reactions, there would be ~ 10^20 Planck time quantum states.) 9: At this point, the foundation for what MG was trying to put forward is gone. the edifice naturally collapses of its own weight. 10: As Dembski pointed out in NFL [cf ch 4, which is far broader than a narrowish view of Weasel based on the evidence of the published results; i.e. apparently latched runs] -- this is part of why I think she has never seriously read it -- and also ever since, the basic problem with evolutionary search algorithms is that they are specific to and begin their search process in an identified island of function with a rising fitness landscape (which may vary across time, just as the topology of a real island does). 11: By the time that has been done by the programmers who write the software, the main job has been done, and done by intelligent design. 12: For, the underlying no free lunch principle points out that searches are tuned to fitness landscapes or a t least clusters of landscapes [i.e. as the islands of function vary across time], and the number of fitness functions that may be fitted onto a given config space is exponential on the scope of that space or actually unlimited. So, the search for a tuned search is exponentially more difficult than the direct search for the islands and then peaks of function. 13: So, a successful evolutionary search algorithm is strongly dependent -- as I pointed out -- on being started from within an island of function [the working algorithm, its data structures, coding and underlying machine], and is based on being preloaded with functionally specific, complex information thereby. 14: It may take a grand tour of such an island of function and produce quite a spectacular show, but -- like any good exercise in prestidigitation -- the effectiveness of the show depends on our willing suspension of knowledge of the curtains, the strings going up above the stage, and the trap doors under the stage. 15: PAV has also pointed out that ev, the best of the breed per Schneider, peaks out at less than 300 bits of search, on a computer -- which is far faster and more effective at searching than real world generations on the ground would be; i.e. well within the thresholds that the various metrics acknowledges as reachable by chance dominated search. 16: Evolutionary search algorithms, in short, may well explain microevolution, but that such is possible and is empirically supported is accepted by all, including young earth creationists, who see it as a designed means of adapting kinds of creatures to environments (and for the benefit of God's steward on earth, man, e.g. the dog-wolf kind). 17: Boiled down, the algorithms are based on built-in FSCO/I, and do not de novo create novel functional information beyond the FSCI/CSI threshold, which makes sense once you see that the search capacity of the solar system is less than 398 bits worth of states. [ . . . ] kairosfocus
So, can you now answer Mathgrrls questions ?
MathGrrl did not have any questions. Mung
So, can you now answer Mathgrrls questions ? __________________ ED: This thread is pivotal on the meaning of CSI, its Chi metric, and applications, and answers MG's "CSI is not rigorously defined" challenge; with a significant side issue on the real significance of evolutionary algorithms, especially Schneider's ev. (Note Mung's bottom-line description of how ev works at 180. Also, the response to the "simple genome" talking point at 206. Axe's response here to four common misconceptions on his work concerning isolation of protein fold domains in protein config space -- recall, coded for by genes -- is also material. Given the irresponsible and divisive tactics that are at work, I ask you to look at the onward post here, and at the notes here, here and here on the subtle temptation and trap of willful deception by insistently repeated misrepresentation maintained by drumbeat repetition and unresponsiveness in the teeth of adequate correction.) Axe's remarks help set our own in context, so let us clip a key cite:
let’s be clear what the question is. We know that living cells depend on the functions of thousands of proteins, and that these proteins have a great variety of distinct structural forms. These distinct forms are referred to as folds, and there are well over a thousand of them known today, with more being discovered all the time. The big question is: Does the Darwinian mechanism explain the origin of these folds? One way to approach this question is to reframe it slightly by viewing the Darwinian mechanism as a simple search algorithm. If Darwin’s theory is correct, then natural populations find solutions to difficult problems having to do with survival and reproduction by conducting successful searches. Of course we use the words search and find here figuratively, because nothing is intentionally looking for solutions. Rather, the solutions are thought to be the inevitable consequence of the Darwinian mechanism in operation. If we view Darwinism in this way—as a natural search mechanism—we can restate the big question as follows: Are new protein folds discoverable by Darwinian searches? Recasting the question in this way turns out to be helpful in that it moves us from a subjective form of explanation to an objective one, where an affirmative answer needs to be backed up by a convincing probabilistic analysis. . . . . Yes, the Darwinian mechanism requires that the different protein folds and functions not be isolated, and yes the rarity of functional sequences has a great deal to do with whether they are isolated . . . . [Using the comparison of a text string in English] as you’ve probably guessed, meaningful 42-character [text] combinations are far more rare than 1 in 1000, which explains why the islands of meaning are always isolated—mere dots in an enormously vast ocean of nonsense. Billions of sensible things can be said in 42 characters, and almost all of them can be said in many different ways, but none of that amounts to anything compared to the quadrillion quadrillion quadrillion quadrillion possible character combinations of that length (27^42 = 10^60). That is the sense in which functional protein sequences appear to be rare, and it has everything to do with their isolation.
That is the context in which the Dembski type info beyond a threshold metric is an index of how hard it is to search for isolated islands of function in seas of non-function. For protein fold domains, and for other Wicken wiring diagram based functional entities. Let's highlight the key results from above: I: The Chi metric of CSI, in log-reduced form, and on thresholds of complexity:
Chi_500 = Ip – 500, bits beyond the [solar system resources] threshold . . . eqn n5 Chi_1000 = Ip – 1000, bits beyond the observable cosmos, 125 byte/ 143 ASCII character threshold . . . eqn n6 Chi_1024 = Ip – 1024, bits beyond a 2^10, 128 byte/147 ASCII character version of the threshold in n6, with a config space of 1.80*10^308 possibilities, not 1.07*10^301 . . . eqn n6a
Sample results of applying the Chi_500 version to the Durston et al values of information in 35 protein families:
RecA: 242 AA, 832 fits, Chi: 332 bits beyond SecY: 342 AA, 688 fits, Chi: 188 bits beyond Corona S2: 445 AA, 1285 fits, Chi: 785 bits beyond . . . results n7
That is we see a measure of information beyond the threshold that points to design of the relevant protein families, thus the living cell. II: After such results, we can fairly regard the dismissive claim made by MG and highlighted in a previous thread by BA, as overturned. To wit:
MG: My conclusion is that, without a rigorous mathematical definition and examples of how to calculate [CSI], the metric is literally meaningless. Without such a definition and examples, it isn’t possible even in principle to associate the term with a real world referent. [NB: Observe my for the record comment at MF's blog on this talking point as repeated by Flint, here. Also, this onward discussion in the follow-on thread, on the rigour question -- which MG has (predictably) raised again, ignoring or outright dismissing all correction and brushing aside the pivotal issue of configuration spaces and search for islands of function.]
III: Instead we note:
a: CSI is an observational, descriptive term, whose meaning is in the very words: complex + specified + information. That should have been clear from the outset. b: As the just linked UD WAC 26 has pointed out for years, the concept CSI does not trace to Dembski but to Orgel, Wicken and others, and more broadly it describes a common feature of a technological world: complicated functional things that require a lot of careful arrangement or organisation -- thus functionally specific and complex information [FSCI] -- to work right. c: It turns out that cell based life is replete with the same sort of complex specific organisation and information. d: this raises the reasonable question as to whether cell based life is a technology that has implemented in effect a self-replicating machine based on high-tech nano-tech machines based on informational polymer molecules, especially proteins, RNA and DNA. e: So, we need to identify whether there are signs that can distinguish such design from what blind chance and mechanical forces can do. f: Once we see that we are dealing with islands of specific function in large spaces of possible arrangements, by far and away most of which will not function, we can apply the mathematics of searching such large spaces at random -- through a random walk -- on a trial and error basis vs on an intelligent basis. g: The decisive issue is that once spaces are large enough, the material resources -- numbers of atoms -- of our solar system or the observed cosmos eventually are inadequate. 398 - 500 bits of info specifies more than enough for the first threshold, and 1,000 for the second. h: 500 bits corresponds to 10^150 possibilities, the number more or less of Planck time quantum states for 10^80 atoms of our observed cosmos, across its estimated lifespan. (The number of P-time Q-states of the atoms of our solar system since the big bang on the commonly used 13.7 BY timeline amount to some 10^102, i.e. 48 orders of magnitude below the number of configurations for 500 bits.) 1,000 bits is the SQUARE of that number. (A fast, strong force nuclear interaction takes about 10^20 Planck-times, and the fastest chemical reactions about 10^30; with organic reactions being much slower than that. That is why the P-time is regarded as the shortest duration that makes physical sense.) In short, yet another dismissive talking point bites the dust: we know that here is so much haystack that the feasible scope of search boils down to effectively zero, without need for specific probability calculations, in a context where the routinely observed, and only empirically known adequate cause of FSCI/CSI is design. i: And yet, to get something fairly complicated functionally organised in less than 125 bytes of information [1,000 bits] is not feasible. Indeed, just the DNA in simplest cell based life is of order 100,000 - 1,000,000 bits of storage. 18 - 20 typical English words takes up 1,000 bits. j: That short of a computer program will not do much either, unless it is calling on other software elsewhere to do the real work, offstage so to speak. (That is what the so-called genomes of "evolutionary/genetic algorithms" effectively do.) k: So, the reduced Dembski metric is meaningful and allows us to identify a relevant threshold of complexity beyond which the only known, observed source of things that are functionally specific and complex and work, is intelligent design: Chi = I - 500, bits beyond a threshold l: And straightaway, we can apply this to the Durston results from 2007. m: So, we can see that the above objection is swept off the table once we see the above reduction and application of the Dembski CSI metric. n: And, the simpler to understand X-metric is just as applicable: X = C*S*B
IV: The main matter is now settled. So we can turn to the secondary questions and points MG raised in recent weeks. - - - - - - MG's four main questions are initially replied to at 3 below; the "now" presumes that MG's questions have not hitherto been cogently answered, which is not so, as can be seen from the discussion in her guest post thread. Note too how the reduction of Dembski's Chi in the OP is explained in 47 below as a log reduction. Mung has helpfully collected MG's four "major" questions/challenges here at 17 -18 below. The thread's highlights (Numbers relate to the time of addition, some have shifted, but links should work) are: 1: Cf 11 -- integrates Durston and Dembski showing biological values of Chi (which MG seems to think are not possible) -- and 19 - 20 below -- for point by point answers to the four main questions, then 2: 44 ff below replies to MG's rebuttals. In addition, PAV responds in 50 - 51, discussing the genetic duplication scenario on an event basis in 50, and highlighting the question of how much actual functionally specific info is in Schneider's reported case in 51. 3: MG's predictable further main list of questions/demands is answered at 76. 4: A discussion of the meaning, quantification and definition of Chi is in 57 below. 5: The Mandelbrot set is introduced at 59 to illustrate how genetic algorithms process rather than de novo create functionally specific and complex info. 6: This is backed up by a further talking point response at what is now 87. 7: Talking point rebuttals (in general) begin at 5 below and continue throughout the thread. 8: At 128 below, and in light of reciprocity for his own efforts to clarify CSI for her, VJT again pleads for MG to give him a jargon-free [intelligent 12 year old level] summary of what ev and the other programs she cited in her four main examples, in response to another refusal to do so. 9: Mung dissects ev starting at 126, and continuing with his linked discussion at ARN. Observe my own remarks in response to Schneider's dissection of the vivisection page, as well. 10: Note that Schneider actually says: "the information gain [of ev] depends on selection and it not blind and unguided. The selection is based on biologically sensible criteria: having functional DNA binding sites and not having extra ones. So Ev models the natural situation." OOPS. Methinks the biologists will want to have a word with Dr Schneider about whether or no natural selection is a GUIDED process. Starting with Dawkins, when he wrote of his own Weasel that:
the monkey/Shakespeare model is useful for explaining the distinction between single-step selection and cumulative selection, it is misleading in important ways. One of these is that, in each generation of selective ‘breeding’, the mutant ‘progeny’ phrases were judged according to the criterion of resemblance to a distant ideal target . . . Life isn’t like that. Evolution has no long-term goal. There is no long-distance target, no final perfection to serve as a criterion for selection, although human vanity cherishes the absurd notion that our species is the final goal of evolution. In real life, the criterion for selection is always short-term, either simple survival or, more generally, reproductive success . . .
__________ The sad bottomline, here. Next, Mung puts in the coup de grace, here. ++++++ On a happier note, we can explore some potential break-out and exploitation opportunities now that the schwerpunkt has broken through:
1: the observed fine-tuned nature of known designed GA's leads to the inference that if GA's model adaptation of living organisms, that is a mark that the capacity is an in-built design in living organisms. 2: In short, we seem to have been looking through the telescope from the wrong end, and may be missing what it can show us. 3: In addition, the way that exploration and discovery algors and approaches can often reveal the unexpected but already existing and implicit, reminds us that discovery is not origination. 4: Finally, the case of the Mandelbrot set where a simple function and exploratory algorithm revealed infinite complexity, points to the inference that the elegant, irreducibly complex [per Godel] and undeniably highly functional and precisely specific system of mathematical-logical reality, a key component of the cosmos, is itself arguably designed. An unexpected result. 5: And, a BA 77 point: this specifically includes how the Euler equation pins down the five key numbers in Mathematics, tying them together in one tight, elegantly beautiful complex plane -- yup, the plane with the Mandelbrot set in it to reveal infinite complexity [hence infinite lurking information unfolded from such a simple specification] -- expression:
e^(i*pi) + 1 = 0
So, let us explore together . . .
Graham

Leave a Reply