Signal to Noise: A Critical Analysis of Active Information
April 23, 2015 | Posted by johnnyb under Conservation of Information, Evolution, Intelligent Design, UD Guest Posts
The following is a guest post by Aurelio Smith. I have invited him to present a critique of Active Information in a more prominent place at UD so we can have a good discussion of Active Information’s strengths and weaknesses. The rest of this post is his.
My thanks to johnnyb for offering to host a post from me on the subject of ‘active information’. I’ve been following the fortunes of the ID community for some time now and I was a little disappointed that the recent publications of the ‘triumvirate’ of William Dembski, Robert Marks and their newly promoted postgrad Doctor Ewert have received less attention here than their efforts deserve. The thrust of their assault on Darwinian evolution has developed from earlier concepts such as “complex specified information” and “conservation of information” and they now introduce “Algorithmic Specified Complexity” and “Active information”.
William Dembski gives an account of the birth of his ideas here:
…in the summer of 1992, I had spent several weeks with Stephen Meyer and Paul Nelson in Cambridge, England, to explore how to revive design as a scientific concept, using it to elucidate biological origins as well as to refute the dominant materialistic understanding of evolution (i.e., neo-Darwinism). Such a project, if it were to be successful, clearly could not merely give a facelift to existing design arguments for the existence of God. Indeed, any designer that would be the conclusion of such statistical reasoning would have to be far more generic than any God of ethical monotheism. At the same time, the actual logic for dealing with small probabilities seemed less to directly implicate a designing intelligence than to sweep the field clear of chance alternatives. The underlying logic therefore was not a direct argument for design but an indirect circumstantial argument that implicated design by eliminating what it was not.*
Dembski published The Design Inference in 1998, where the ‘explanatory filter’ was proposed as a tool to separate ‘design’ from ‘law’ and ‘chance’. The weakness in this method is that ‘design’ is assumed as the default after eliminating all other possible causes. Wesley Elsberry’s review points out the failure to include unknown causation as a possibility. Dembski acknowledges the problem in a comment on the Uncommon Descent thread “Some Thanks for Professor Olofsson”:
I wish I had time to respond adequately to this thread, but I’ve got a book to deliver to my publisher January 1 — so I don’t. Briefly: (1) I’ve pretty much dispensed with the EF. It suggests that chance, necessity, and design are mutually exclusive. They are not. Straight CSI [Complex Specified Information] is clearer as a criterion for design detection.* (2) The challenge for determining whether a biological structure exhibits CSI is to find one that’s simple enough on which the probability calculation can be convincingly performed but complex enough so that it does indeed exhibit CSI. The example in NFL ch. 5 doesn’t fit the bill. The example from Doug Axe in ch. 7 of THE DESIGN OF LIFE (www.thedesignoflife.net) is much stronger. (3) As for the applicability of CSI to biology, see the chapter on “assertibility” in my book THE DESIGN REVOLUTION. (4) For my most up-to-date treatment of CSI, see “Specification: The Pattern That Signifies Intelligence” at http://www.designinference.com. (5) There’s a paper Bob Marks and I just got accepted which shows that evolutionary search can never escape the CSI problem (even if, say, the flagellum was built by a selection-variation mechanism, CSI still had to be fed in).
Dr Dembski has posted some background to his association with Professor Robert Marks and the Evolutionary Informatics Lab, which has resulted in the publication of several papers with active information as an important theme. A notable collaborator is Winston Ewert, Ph.D., whose master’s thesis was entitled Studies of Active Information in Search. In chapter four, he criticizes Lenski et al., 2003, saying:
[quoting Lenski et al., 2003]“Some readers might suggest that we stacked the deck by studying the evolution of a complex feature that could be built on simpler functions that were also useful.”
This, indeed, is what the writers of Avida software do when using stair step active information.
What is active information?
In A General Theory of Information Cost Incurred by Successful Search, Dembski, Ewert and Marks (henceforth DEM) give their definition of “active information” as follows:
In comparing null and alternative searches, it is convenient to convert probabilities to information measures (note that all logarithms in the sequel are to the base 2). We therefore define the endogenous information $I_\Omega$ as $-\log(p)$, which measures the inherent difficulty of a blind or null search in exploring the underlying search space $\Omega$ to locate the target $T$. We then define the exogenous information $I_S$ as $-\log(q)$, which measures the difficulty of the alternative search $S$ in locating the target $T$. And finally, we define the active information $I_+$ as the difference between the endogenous and exogenous information: $I_+ = I_\Omega - I_S = \log(q/p)$. Active information therefore measures the information that must be added (hence the plus sign in $I_+$) on top of a null search to raise an alternative search’s probability of success by a factor of $q/p$.
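As a rough illustration of the definitions quoted above, here is a minimal Python sketch that converts a pair of success probabilities into the three information measures. The function names and the example values of p and q are my own assumptions for illustration; they do not come from the DEM paper.

```python
import math

def endogenous_information(p: float) -> float:
    """I_Omega = -log2(p): difficulty of the blind (null) search."""
    return -math.log2(p)

def exogenous_information(q: float) -> float:
    """I_S = -log2(q): difficulty of the alternative search."""
    return -math.log2(q)

def active_information(p: float, q: float) -> float:
    """I_+ = I_Omega - I_S = log2(q / p)."""
    return math.log2(q / p)

# Hypothetical numbers, chosen only to make the arithmetic easy to follow:
# a blind search that succeeds 1 time in 1024, and an alternative search
# that succeeds 1 time in 4.
p = 1 / 1024
q = 1 / 4

i_omega = endogenous_information(p)
i_s = exogenous_information(q)
i_plus = active_information(p, q)

print(f"endogenous: {i_omega} bits, exogenous: {i_s} bits, active: {i_plus} bits")
```

With these made-up values the alternative search is 256 times more likely to succeed than the blind one, so the active information is the base-2 logarithm of 256. Note that the measure says nothing about where the improvement comes from; it only records the ratio q/p.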
They conclude with an analogy from the financial world, saying:
Conservation of information shows that active information, like money, obeys strict accounting principles. Just as banks need money to power their financial instruments, so searches need active information to power their success in locating targets. Moreover, just as banks must balance their books, so searches, in successfully locating targets, must balance their books — they cannot output more information than was inputted.
In an article at the Panda’s Thumb website Professor Joe Felsenstein, in collaboration with Tom English, presents some criticism of the quoted DEM paper. Felsenstein helpfully posts an “abstract” in the comments, saying:
Dembski, Ewert and Marks have presented a general theory of “search” that has a theorem that, averaged over all possible searches, one does not do better than uninformed guessing (choosing a genotype at random, say). The implication is that one needs a Designer who chooses a search in order to have an evolutionary process that succeeds in finding genotypes of improved fitness. But there are two things wrong with that argument: 1. Their space of “searches” includes all sorts of crazy searches that do not prefer to go to genotypes of higher fitness – most of them may prefer genotypes of lower fitness or just ignore fitness when searching. Once you require that there be genotypes that have different fitnesses, so that fitness affects their reproduction, you have narrowed down their “searches” to ones that have a much higher probability of finding genotypes that have higher fitness. 2. In addition, the laws of physics will mandate that small changes in genotype will usually not cause huge changes in fitness. This is true because the weakness of action at a distance means that many genes will not interact strongly with each other. So the fitness surface is smoother than a random assignment of fitnesses to genotypes. That makes it much more possible to find genotypes that have higher fitness. Taking these two considerations into account – that an evolutionary search has genotypes whose fitnesses affect their reproduction, and that the laws of physics militate against strong interactions being typical – we see that Dembski, Ewert, and Marks’s argument does not show that Design is needed to have an evolutionary system that can improve fitness.
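Felsenstein’s second point, that a smooth fitness surface is easier to climb than a random assignment of fitnesses, can be sketched with a toy model. Everything in the block below is my own assumption for illustration and not Felsenstein’s model: genotypes are 10-bit tuples, the “smooth” landscape is purely additive (no interactions between bits), the “rugged” landscape is a random lookup table, and the search is a simple steepest-ascent hill climber.

```python
import random
from itertools import product

N = 10  # genotype length in bits (illustrative assumption)

def neighbours(g):
    """All genotypes exactly one bit-flip away from g."""
    return [g[:i] + (1 - g[i],) + g[i + 1:] for i in range(len(g))]

def hill_climb(start, fitness):
    """Steepest ascent: move to the best neighbour until none is better."""
    current = start
    while True:
        best = max(neighbours(current), key=fitness)
        if fitness(best) <= fitness(current):
            return current
        current = best

def smooth_fitness(g):
    """Additive landscape: each bit contributes independently."""
    return sum(g)

# Rugged landscape: an independent random fitness for every genotype.
rng = random.Random(0)
random_table = {g: rng.random() for g in product((0, 1), repeat=N)}

def rugged_fitness(g):
    return random_table[g]

def success_rate(fitness):
    """Fraction of starting genotypes whose climb ends on the global peak."""
    genotypes = list(product((0, 1), repeat=N))
    peak = max(genotypes, key=fitness)
    hits = sum(hill_climb(g, fitness) == peak for g in genotypes)
    return hits / len(genotypes)

smooth_rate = success_rate(smooth_fitness)
rugged_rate = success_rate(rugged_fitness)
print(f"smooth landscape: {smooth_rate:.3f}, random landscape: {rugged_rate:.3f}")
```

On the additive landscape every starting genotype climbs to the single peak, because flipping any 0 to a 1 always improves fitness. On the random table the climber can stall on a local peak, so the success rate depends on the particular table drawn. The sketch is far too simple to stand in for real biology; it only shows why the structure of the landscape matters to the probability of success.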
I note that there is an acknowledgement in the DEM paper as follows:
The authors thank Peter Olofsson and Dietmar Eben for helpful feedback on previous work of the Evolutionary Informatics Lab, feedback that has found its way into this paper.
I’m not qualified to criticize the mathematics but I see no need to doubt that it is sound. However, what I do query is whether the model is relevant to biology. The search for a solution to a problem is not a model of biological evolution and the concept of “active information” makes no sense in a biological context. Individual organisms or populations are not searching for optimal solutions to the task of survival. Organisms are passive in the process, merely availing themselves of the opportunity that existing and new niche environments provide. If anything is designing, it is the environment. I could suggest an anthropomorphism: the environment and its effects on the change in allele frequency are “a voice in the sky” whispering “warmer” or “colder”. There is the source of the active information.
I was recently made aware that this classic paper by Sewall Wright, The Roles of Mutation, Inbreeding, Crossbreeding and Selection in Evolution, is available online. Rather than demonstrating the “active information” in Dawkins’ Weasel program, which Dawkins freely confirmed is a poor model for evolution with its targeted search, would DEM like to look at Wright’s paper for a more realistic evolutionary model?
Perhaps, in conclusion, I should emphasize two things. Firstly, I am utterly opposed to censorship and suppression. I strongly support the free exchange of ideas and information. I strongly support any genuine efforts to develop “Intelligent Design” into a formal scientific endeavor. Jon Bartlett sees advantages in the field of computer science and I say good luck to him. Secondly, “fitness landscape” models are not accurate representations of the chaotic, fluid, interactive nature of the real environment. The environment is a kaleidoscope of constant change. Fitness peaks can erode and erupt. Had Sewall Wright been developing his ideas in the computer age, his laboriously hand-crafted diagrams would, I’m sure, have evolved (deliberate pun) into exquisite computer models.
History: William Dembski, The Design Inference, 1998: explanatory filter (Elsberry criticizes the book for using a definition of “design” as what is left over after chance and regularity have been eliminated)
Wikipedia, upper probability bound, complex specified information, conservation of information, meaningful information.
Theft over Toil John S. Wilkins, Wesley R. Elsberry 2001
Computational capacity of the universe Seth Lloyd 2001
Information Theory, Evolutionary Computation, and Dembski’s “Complex Specified Information” Elsberry and Shallit 2003
Specification: The Pattern That Signifies Intelligence by William A. Dembski August 15, 2005
Evaluation of Evolutionary and Genetic Optimizers: No Free Lunch Tom English 1996
Conservation of Information Made Simple William Dembski 2012
…evolutionary biologists possessing the mathematical tools to understand search are typically happy to characterize evolution as a form of search. And even those with minimal knowledge of the relevant mathematics fall into this way of thinking.
Take Brown University’s Kenneth Miller, a cell biologist whose knowledge of the relevant mathematics I don’t know. Miller, in attempting to refute ID, regularly describes examples of experiments in which some biological structure is knocked out along with its function, and then, under selection pressure, a replacement structure is evolved that recovers the function. What makes these experiments significant for Miller is that they are readily replicable, which means that the same systems with the same knockouts will undergo the same recovery under the same suitable selection regime. In our characterization of search, we would say the search for structures that recover function in these knockout experiments achieves success with high probability.
Suppose, to be a bit more concrete, we imagine a bacterium capable of producing a particular enzyme that allows it to live off a given food source. Next, we disable that enzyme, not by removing it entirely but by, say, changing a DNA base in the coding region for this protein, thus changing an amino acid in the enzyme and thereby drastically lowering its catalytic activity in processing the food source. Granted, this example is a bit stylized, but it captures the type of experiment Miller regularly cites.
So, taking these modified bacteria, the experimenter now subjects them to a selection regime that starts them off on a food source for which they don’t need the enzyme that’s been disabled. But, over time, they get more and more of the food source for which the enzyme is required and less and less of other food sources for which they don’t need it. Under such a selection regime, the bacterium must either evolve the capability of processing the food for which previously it needed the enzyme, presumably by mutating the damaged DNA that originally coded for the enzyme and thereby recovering the enzyme, or starve and die.
So where’s the problem for evolution in all this? Granted, the selection regime here is a case of artificial selection — the experimenter is carefully controlling the bacterial environment, deciding which bacteria get to live or die*. [(* My emphasis) Not correct – confirmed by Richard Lenski – AF] But nature seems quite capable of doing something similar. Nylon, for instance, is a synthetic product invented by humans in 1935, and thus was absent from bacteria for most of their history. And yet, bacteria have evolved the ability to digest nylon by developing the enzyme nylonase. Yes, these bacteria are gaining new information, but they are gaining it from their environments, environments that, presumably, need not be subject to intelligent guidance. No experimenter, applying artificial selection, for instance, set out to produce nylonase.
To see that there remains a problem for evolution in all this, we need to look more closely at the connection between search and information and how these concepts figure into a precise formulation of conservation of information. Once we have done this, we’ll return to the Miller-type examples of evolution to see why evolutionary processes do not, and indeed cannot, create the information needed by biological systems. Most biological configuration spaces are so large and the targets they present are so small that blind search (which ultimately, on materialist principles, reduces to the jostling of life’s molecular constituents through forces of attraction and repulsion) is highly unlikely to succeed. As a consequence, some alternative search is required if the target is to stand a reasonable chance of being located. Evolutionary processes driven by natural selection constitute such an alternative search. Yes, they do a much better job than blind search. But at a cost — an informational cost, a cost these processes have to pay but which they are incapable of earning on their own.
Meaningful Information Paul Vitányi 2004
The question arises whether it is possible to separate meaningful information from accidental information, and if so, how.
Evolutionary Informatics Publications
Conservation of Information in Relative Search Performance Dembski, Ewert, Marks 2013
Algorithmic Specified Complexity in the Game of Life Ewert, Dembski, Marks 2015
On the Improbability of Algorithmic Specified Complexity Dembski, Ewert, Marks 2013
A General Theory of Information Cost Incurred by Successful Search Dembski, Ewert, Marks 2013
Actually, in my talk, I work off of three papers, the last of which Felsenstein fails to cite and which is the most general, avoiding the assumption of uniform probability to which Felsenstein objects.
Conservation of Information in Search: Measuring the Cost of Success Dembski, Marks 2009
The Search for a Search: Measuring the Information Cost of Higher Level Search Dembski, Marks 2009
Has Natural Selection Been Refuted? The Arguments of William Dembski Joe Felsenstein 2007
Dembski argues that there are theorems that prevent natural selection from explaining the adaptations that we see. His arguments do not work. There can be no theorem saying that adaptive information is conserved and cannot be increased by natural selection. Gene frequency changes caused by natural selection can be shown to generate specified information. The No Free Lunch theorem is mathematically correct, but it is inapplicable to real biology. Specified information, including complex specified information, can be generated by natural selection without needing to be “smuggled in”. When we see adaptation, we are not looking at positive evidence of billions and trillions of interventions by a designer. Dembski has not refuted natural selection as an explanation for adaptation.
On Dembski’s Law of Conservation of Information Erik Tellgren 2002