Uncommon Descent Serving The Intelligent Design Community

“Conservation of Information Made Simple” at ENV


Evolution News & Views just posted a long article I wrote on conservation of information.

EXCERPT: “In this article, I’m going to follow the example of these books, laying out as simply and clearly as I can what conservation of information is and why it poses a challenge to conventional evolutionary thinking. I’ll break this concept down so that it seems natural and straightforward. Right now, it’s too easy for critics of intelligent design to say, ‘Oh, that conservation of information stuff is just mumbo-jumbo. It’s part of the ID agenda to make a gullible public think there’s some science backing ID when it’s really all smoke and mirrors.’ Conservation of information is not a difficult concept and once it is understood, it becomes clear that evolutionary processes cannot create the information required to power biological evolution.” MORE

TEASER: The article quotes some interesting email correspondence that I had with Richard Dawkins and with Simon Conway Morris, now going back about a decade, but still highly relevant.

Comments
Joe, I don't think it's a problem. If we can know whether or not we obtained item 6, then we can know whether or not we obtained the 'six' machine. In Dembski's example, we're only interested in the outcome when the 'six' machine is secured, because we're not looking for a 'five' machine or any other. We throw out bad picks because they don't provide the boost in outcome that we're looking for, namely the increased probability of obtaining item 6. So in this regard we pay the probabilistic cost, 1/6, of obtaining the 'six' machine. Imagine a little man -- not little in stature, but little in mind, because he believes he can outmaneuver probability with shenanigans. His lottery is for item 6, and he wishes to secure a machine that will increase his chances of a jackpot from 1/6 to 1/2. So he searches among the various machines until he finds the 'six' machine. If he at first finds a 'four' machine, or any other, he tosses it out, and proceeds looking until the desired machine turns up. So he incurs the 1/6 penalty in order to increase his chances to 1/2. When he finds the 'six' machine, the complete cost of success is 1/12. And this is what Dembski was talking about -- the probabilistic cost of both finding the correct machine and then winning the lottery is indeed 1/12. He wasn't examining the undesired outcomes because they were not relevant. Under Dembski's scenario, if one wishes to find item 6 with a 1/2 probability, one must first secure a 'six' machine at a cost of 1/6. If instead our man were to disregard the machine type, and just take whichever machine he found at first, he would do no better, but no worse, than the original lottery, because each of the other five machines still offer a chance of scoring item 6, albeit at reduced rate of success for each individual machine. However this is not an example that is irrespective of machine choice, it is explicitly one in which the 'six' machine is first found, and then the lottery is conducted. 
So there's nothing confusing or erroneous about his example -- he lays out the probability for events A and B occurring, where A is the event for securing the correct machine, and B is the event for obtaining item 6. For both outcomes to occur, P(A∩B), there is a certain 1/12 probability of success. Dembski's explanation matches his math. If one wishes to know the probability of both A and B occurring, one must take the intersection of the two events, not the total probability. His statement is clear that, "The probability of finding item 6 using this machine, once we factor in the probabilistic cost of securing the machine, therefore ends up being 1/6 x 1/2 = 1/12." If he were referring to the total probability of finding the 'six' machine, he wouldn't have used language that indicates a more specific outcome. He made no error, and he is not confused about the math with regard to his statement. They match up perfectly. P(B∩A) = P(B|A)*P(A) = 1/2 * 1/6 = 1/12.
Chance Ratcliff
September 2, 2012, 01:31 PM PDT
Dr. Dembski, for what it's worth, in your example:
To see how this works, let's consider a toy problem. Imagine that your search space consists of only six items, labeled 1 through 6. Let's say your target is item 6 and that you're going to search this space by rolling a fair die once. If it lands on 6, your search is successful; otherwise, it's unsuccessful. So your probability of success is 1/6. Now let's say you want to increase the probability of success to 1/2. You therefore find a machine that flips a fair coin and delivers item 6 to you if it lands heads and delivers some other item in the search space if it land tails. What a great machine, you think. It significantly boosts the probability of obtaining item 6 (from 1/6 to 1/2). But then a troubling question crosses your mind: Where did this machine that raises your probability of success come from? A machine that tosses a fair coin and that delivers item 6 if the coin lands heads and some other item in the search space if it lands tails is easily reconfigured. It can just as easily deliver item 5 if it lands heads and some other item if it lands tails. Likewise for all the remaining items in the search space: a machine such as the one described can privilege any one of the six items in the search space, delivering it with probability 1/2 at the expense of the others. So how did you get the machine that privileges item 6? Well, you had to search among all those machines that flip coins and with probability 1/2 deliver a given item, selecting the one that delivers item 6 when it lands heads. And what's the probability of finding such a machine? To keep things simple, let's imagine that our machine delivers item 6 with probability 1/2 and each of items 1 through 5 with equal probability, that is, with probability 1/10. Accordingly, this machine is one of six possible machines configured in essentially the same way. 
There's another machine that flips a coin, delivers item 1 from the original search space if it lands heads, and delivers any one of 2 through 6 with probability 1/10 each if the coin lands tails. And so on. Thus, of these six machines, one delivers item 6 with probability 1/2 and the remaining five machines deliver item 6 with probability 1/10. Since there are six machines, only one of which delivers item 6 (our target) with high probability, and since only labels and no intrinsic property distinguishes one machine from any other in this setup (the machines are, as mathematicians would say, isomorphic), the principle of indifference applies to these machines and prescribes that the probability of getting the machine that delivers item 6 with probability 1/2 is the same as that of getting any other machine, and is therefore 1/6. But a probability of 1/6 to find a machine that delivers item 6 with probability 1/2 is no better than our original probability of 1/6 of finding the target simply by tossing a die. In fact, once we have this machine, we still have only a 50-50 chance of locating item 6. Finding this machine incurs a probability cost of 1/6, and once this cost is incurred we still have a probability cost of 1/2 of finding item 6. Since probability costs increase as probabilities decrease, we're actually worse off than we were at the start, where we simply had to roll a die that, with probability 1/6, locates item 6. The probability of finding item 6 using this machine, once we factor in the probabilistic cost of securing the machine, therefore ends up being 1/6 x 1/2 = 1/12. So our attempt to increase the probability of finding item 6 by locating a more effective search for that item has actually backfired, making it in the end even more improbable that we'll find item 6. 
Conservation of information says that this is always a danger when we try to increase the probability of success of a search -- that the search, instead of becoming easier, remains as difficult as before or may even, as in this example, become more difficult once additional underlying information costs, associated with improving the search and often hidden, as in this case by finding a suitable machine, are factored in.
You should have had the other 5 machines give a zero probability of hitting a 6 -- no output at all. That way there is definitely a suitable machine that you are searching for, and ignoring the other 5 is valid in a real-world scenario. Otherwise, as it stands, there isn't any difference, probability-wise, in making your pick of a machine and then getting a 6. Yes, searching for the machines is far more difficult than rolling a die, and you have to factor in the probabilities of finding one (or six), but once you have the machines you specified, the probabilities of success are (about) the same regardless of the machine you choose. The search is more difficult, physically, to be sure. But that appears to be only part of your point.
Joe
September 2, 2012, 12:04 PM PDT
R0bb? Is that it then?
Joe
September 2, 2012, 05:54 AM PDT
I don’t know what you mean by “part of the equation”, but those 3 are still part of Ω.
Ω changes.
If you read the example, |Ω| is always 16.
A different 16. A 16 that does NOT include those 3:
• A is the uniform distribution over the rightmost four squares in the search space.
• B is the uniform distribution over the bottom twelve squares in the search space.
The three in the top row starting in the left corner are no longer part of the search grid. Now that that has been exposed, let's move on to the question: "Is it easier to roll a die to get your 6, a 1/6 probability, than it is to also have to search for a machine that will allow you to find your 6 by, say, flipping a coin? And if there are 6 machines out there that will help you find your 6, but only one is the 50/50, is that less or more difficult than just rolling a die?"
You’re the one who claimed that an intelligent being could, without any prior information, create a metal detector.
Nope. My point was that someone without any prior knowledge of where the needle is located can use a metal detector to enhance his/her chances of having a successful search, i.e. giving it a much better chance of success than a blind search.
Joe
September 1, 2012, 10:36 AM PDT
@Chance Ratcliff
1) Do you agree that the probability of finding the target "6" using the two-layered system of first choosing a machine at random and then letting the machine choose the target is 1/6?
2) W. Dembski says: "So our attempt to increase the probability of finding item 6 by locating a more effective search for that item has actually backfired, making it in the end even more improbable that we'll find item 6." How do you square this with your answer to 1)?
DiEb
September 1, 2012, 10:28 AM PDT
onlooker, I know, the irony. ;-)
"Either Dembski is talking about the probability of finding the target after the right machine is found, a true conditional probability, in which case the answer is .5 or he is talking about the “final cost”, which is the total probability of the two step process of choosing a machine and then asking it for a value, in which case the answer is 1/6."
Or he is talking about the probability of both finding the correct machine (1/6) and finding the correct item after finding the correct machine (1/2), which is actually what he said. You can disagree with the claim by insisting that the total probability applies, or you can disagree by insisting that the conditional probability must be taken solely. But what you cannot refute is that Dembski is specifically referring to the intersection of two events, for which the probability is 1/12. And he is correct. P(A∩B) = 1/12, unless your math comes out differently. I've quoted Dembski, and I've shown why the math matches his claim as stated. So the idea that he made some elementary math error is without merit. If you lot would just acknowledge this simple little fact, I'll stop reminding everyone that P(A∩B) really does equal 1/12, and that Dembski really was talking about P(A∩B). Maybe then it would be appropriate to discuss whether the probability of the intersection is less relevant than total probability. However in either case, the LCI holds, so I'm not sure what the fuss is about, unless it was to show that Dembski made some elementary math error -- which he didn't.
Chance Ratcliff
September 1, 2012, 10:08 AM PDT
Chance, Discussing probability with someone named Chance amuses me. ;-)
Again, he’s claiming that the final cost of B after securing the correct machine is 1/12, and he’s right. It’s the probability of both B and A occurring.
You can't have it both ways. Either Dembski is talking about the probability of finding the target after the right machine is found, a true conditional probability, in which case the answer is .5, or he is talking about the "final cost", which is the total probability of the two-step process of choosing a machine and then asking it for a value, in which case the answer is 1/6. The core error that you and Dembski are making is overlooking that there is not one single correct machine. All of the machines have a non-zero probability of returning the target. That cannot be just ignored.
onlooker
September 1, 2012, 09:11 AM PDT
Joe:
Those 3 are no longer part of the equation. That was the whole point, the probabilities just got lower because the search space was narrowed.
I don't know what you mean by "part of the equation", but those 3 are still part of Ω. If you read the example, |Ω| is always 16. And the probabilities get higher, not lower, when the searchable area narrows (assuming that the area still contains the target).
Can an intelligent being, without any prior information, exist? Would such a being even care or know about searching?
You're the one who claimed that an intelligent being could, without any prior information, create a metal detector. Have you changed your mind?
That doesn’t say we add the two conditional probabilities together to get the total probability.
It says that the total probability is the weighted sum of the conditional probabilities. That's what Dieben did.
There should only be one final state-> the target. Otherwise you keep shifting until then, right? And if you are still shifting then you aren’t in a final state, which would be stable.
No, the model consists of only one transition. If you look at the state diagram, you'll see that there are no loops. In Dembski's model, a search consists of a single event.
R0bb
September 1, 2012, 09:01 AM PDT
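For readers who want to check the weighted sum that R0bb describes in the comment above, here is a minimal Python sketch using exact fractions. It assumes the setup of the toy example as quoted on this page: six machines, each picked with probability 1/6, where machine k delivers item k with probability 1/2 and each other item with probability 1/10. The function names are illustrative only and do not come from the article.

```python
from fractions import Fraction

ITEMS = range(1, 7)

# Assumption from the quoted example: each machine is picked with probability 1/6.
P_MACHINE = {k: Fraction(1, 6) for k in ITEMS}


def p_item_given_machine(item, machine):
    """Conditional probability that `machine` delivers `item`."""
    return Fraction(1, 2) if item == machine else Fraction(1, 10)


def joint(item, machine):
    """P(this machine is picked AND it delivers this item)."""
    return P_MACHINE[machine] * p_item_given_machine(item, machine)


def total(item):
    """Law of total probability: sum the joint terms over all machines."""
    return sum(joint(item, m) for m in ITEMS)


joint_six_machine = joint(6, 6)  # 1/6 * 1/2
total_six = total(6)             # 1/12 + 5 * (1/6 * 1/10)
print(joint_six_machine, total_six)  # prints: 1/12 1/6
```

The sketch reports both numbers under dispute in this thread: 1/12 is the probability of picking the 'six' machine and then getting item 6 from it, while 1/6 is the probability of ending up with item 6 by any route.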
The law of total probability doesn't seem to apply, DiEb. Ya see, there are TWO targets, not one. The first target is picking the machine that gives you a 1/2 chance of getting a 6. And the final target is getting the 6. The first target has a 1/6 chance and the second has a 1/2 chance. You multiply those together to get a 1/12 chance. One more time, the law of total probability from Wikipedia:
In probability theory, the law (or formula) of total probability is a fundamental rule relating marginal probabilities to conditional probabilities.
That doesn’t say we add the two conditional probabilities together to get the total probability.
Joe
September 1, 2012, 06:54 AM PDT
W. Dembski:
The probability of finding item 6 using this machine, once we factor in the probabilistic cost of securing the machine, therefore ends up being 1/6 x 1/2 = 1/12. So our attempt to increase the probability of finding item 6 by locating a more effective search for that item has actually backfired, making it in the end even more improbable that we'll find item 6.
Yet often, as in this example, we may actually do worse by trying to improve the probability of a successful search.
The part in bold font isn't true: it isn't more improbable to find item 6, it is exactly as improbable as before, and we don't actually do worse.
@Chance Ratcliff: IMO the final cost of B after securing the correct machine would be P(B|A) and not P(B∩A). "I found the target" - "Sure, but you didn't use the right machine, so we can't give you any points for the answer" - that's just not how it works...
DiEb
August 31, 2012, 11:22 PM PDT
onlooker, I'm simply pointing out that Dembski's claim, as stated, is that P(B∩A) = 1/12, and he's correct.
Event A: getting the correct machine.
Event B: getting the correct item (#6).
P(A) = 1/6
P(B|A) = 1/2
P(B∩A) = P(B|A)*P(A) = 1/2 * 1/6 = 1/12
The total probability for event B is 1/6 (and will not go higher in the given example) but Dembski is not referring to the total probability. If you don't agree with his claim, fine -- but the math, as it corresponds to the claim, is correct. Again, he's claiming that the final cost of B after securing the correct machine is 1/12, and he's right. It's the probability of both B and A occurring.
Chance Ratcliff
August 31, 2012, 06:58 PM PDT
onlooker:
That means that you can’t simply use 1/6 as the probability of getting the right machine because the other machines each also have a probability of .1 to return the target.
How many machines are there? 6. And how many do we get to choose? 1. That gives 1/6. Every machine has a 1/6 chance of being chosen.
Joe
August 31, 2012, 06:47 PM PDT
Given the two-tiered search (level 1 for the machine and level 2 for the 6), each of the six pathways (a pathway being via one of the six machines) to 6 starts with a probability of 1/12. And that is half as good as the original. If you get the correct machine at level one then your odds jump to 1/2, which is 3 times better than the original. And if you get one of the other machines your odds step to 1/10, which is 2/3 worse than the original. And the odds of getting a path that is worse than the original is 5/6. So you got to ask yourself, do you feel lucky, punk? :)
Joe
August 31, 2012, 06:41 PM PDT
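The two-tiered search discussed in this exchange can also be checked empirically. Below is a rough Monte Carlo sketch in Python, assuming the machines behave as in the quoted toy example (uniform choice of machine, then heads delivers the privileged item and tails delivers one of the other five with equal probability). The function names, trial count, and seed are arbitrary choices made for illustration.

```python
import random


def two_tier_search(rng):
    """Pick a machine uniformly, then let it deliver an item.

    Returns a (machine, item) pair.
    """
    machine = rng.randint(1, 6)
    if rng.random() < 0.5:
        item = machine  # heads: the machine's privileged item
    else:
        others = [i for i in range(1, 7) if i != machine]
        item = rng.choice(others)  # tails: one of the other five items
    return machine, item


def estimate(trials=200_000, seed=0):
    """Estimate P(item is 6) and P(machine is 'six' AND item is 6)."""
    rng = random.Random(seed)
    hits = 0
    hits_via_six_machine = 0
    for _ in range(trials):
        machine, item = two_tier_search(rng)
        if item == 6:
            hits += 1
            if machine == 6:
                hits_via_six_machine += 1
    return hits / trials, hits_via_six_machine / trials


p_six, p_six_via_six_machine = estimate()
print(f"P(item 6) ~ {p_six:.4f}")
print(f"P('six' machine and item 6) ~ {p_six_via_six_machine:.4f}")
```

With enough trials the first estimate should settle near 1/6 and the second near 1/12, which are the two figures the commenters are arguing about.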
Chance,
Dembski is referring to the total cost of finding item 6 after having secured the correct machine.
"The probability of finding item 6 using this machine, once we factor in the probabilistic cost of securing the machine, therefore ends up being 1/6 x 1/2 = 1/12."
Using cost as a metaphor (or worse, reifying it) is causing confusion. You are interpreting Dembski as changing the problem from finding a target using six available machines to finding a machine that finds the target with probability .5 and ignoring all the others. If you want to do that, you still have to take the probability that the other machines will find the target into account when computing the "probabilistic cost". That means that you can't simply use 1/6 as the probability of getting the right machine because the other machines each also have a probability of .1 to return the target. If "probabilistic cost" has any real meaning, based on my brute force approach above I predict you'll find that the actual value you get when you take those other probabilities into account will be 1/3 rather than 1/6. You'll have to define the concept more clearly first, though.
onlooker
August 31, 2012, 06:32 PM PDT
As mentioned by Joe, LCI holds even for P(B). We can add searches for searches for searches, and the probability does not increase beyond the original discrete probability of 1/6. But if we're paying the cost of having found those searches, the probability drops significantly.
Chance Ratcliff
August 31, 2012, 06:17 PM PDT
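The claim above that the LCI holds for the die example can be restated in bits. The following Python sketch rests on definitions that are not spelled out anywhere on this page, so treat them as assumptions: active information is taken to be log2(q/p) for a search that raises the success probability from p to q, and the cost of finding that search is taken to be -log2 of the probability of finding it. The helper names are hypothetical.

```python
import math


def active_information(p, q):
    """Assumed definition: bits gained when success probability rises from p to q."""
    return math.log2(q / p)


def information_cost(prob):
    """Assumed definition: bits 'paid' for an event of probability prob."""
    return -math.log2(prob)


p_blind = 1 / 6         # rolling the fair die once
q_machine = 1 / 2       # using the 'six' machine
p_find_machine = 1 / 6  # picking the 'six' machine out of six

gain = active_information(p_blind, q_machine)  # log2(3), about 1.585 bits
cost = information_cost(p_find_machine)        # log2(6), about 2.585 bits
lci_holds = cost >= gain

print(f"gain = {gain:.3f} bits, cost = {cost:.3f} bits, LCI holds: {lci_holds}")
```

Under these assumed definitions the cost of finding the machine exceeds the gain from using it, so the inequality holds for the die example. With the figures R0bb reports further down the page for a different example (2 bits of active information against 1 bit to find the search), the same comparison would come out false.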
Dembski is referring to the total cost of finding item 6 after having secured the correct machine.
“The probability of finding item 6 using this machine, once we factor in the probabilistic cost of securing the machine, therefore ends up being 1/6 x 1/2 = 1/12.”
Taking into account what he's actually saying:
P(B|A) = 1/2
P(A) = 1/6
P(B∩A) = P(B|A)*P(A) = 1/2 * 1/6 = 1/12
He is not referencing the total probability of event B, as in P(B) = 1/6. If we wish to secure a machine that increases our chances of finding item 6 to 1/2, we must first pay 1/6 to do so. The cost of finding the desired item then is 1/12. He's referencing the probability of events A and B occurring. Dembski's math supports his claim as stated.
Chance Ratcliff
August 31, 2012, 06:05 PM PDT
onlooker, Chance Ratcliff explained it above. Also, what Dembski used is called conditional probabilities, and it works: you multiply the odds of each level to get the final odds. Also 1/6 = 1/6, so even if you and DiEb are right, Dembski is still correct, as we didn't find a more effective search. You do understand that a more effective search would have a higher probability of success than the original...
Joe
August 31, 2012, 05:12 PM PDT
R0bb writes:
Joe:
Please explain why your “+” should not be a “*”.
See Law of Total Probability.
Dembski's error is simple enough to demonstrate by brute force. One of his machines produces a particular value with probability .5 and each of the other values with probability .1. This is the same as picking one value at random from a set of integers like this:
11111 23456
There are six of these machines, so the full collection of sets of integers is:
11111 23456
22222 13456
33333 12456
44444 12356
55555 12346
66666 12345
Now, there are two simple ways to see the problem. The first is that when Dembski says "The probability of finding item 6 using this machine, once we factor in the probabilistic cost of securing the machine, therefore ends up being 1/6 x 1/2 = 1/12" it is clear that he is ignoring half the 6s, namely those listed on the right. An even easier way is to recognize that picking a machine at random and getting a value at random from that machine is equivalent to simply picking one integer at random from all the integers listed. There are 60 integers, 10 each of 1, 2, 3, 4, 5, and 6. The probability of finding the target value is therefore 1/6, as calculated by DiEb. Dembski made an error. Hey, it happens to the best of us. Unfortunately for his thesis, that error means that his conclusion that "So our attempt to increase the probability of finding item 6 by locating a more effective search for that item has actually backfired, making it in the end even more improbable that we'll find item 6." is incorrect. The impact of this on his broader claims about "conservation of information" may or may not be significant.
onlooker
August 31, 2012, 05:02 PM PDT
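The brute-force listing in the comment above is easy to reproduce. Here is a small Python sketch that builds the same 60 integers and counts them; it assumes, as the comment does, that each machine can be modelled as ten equally likely outputs (five copies of its privileged item plus one of each other item). The name `machine_pool` is made up for this illustration.

```python
from collections import Counter
from fractions import Fraction


def machine_pool(k, items=range(1, 7)):
    """Ten equally likely outputs for machine k: five copies of k, one of each other item."""
    return [k] * 5 + [i for i in items if i != k]


# Because every pool has the same size, picking a machine uniformly and then
# an output uniformly is the same as picking uniformly from the combined list.
all_outputs = [value for k in range(1, 7) for value in machine_pool(k)]
counts = Counter(all_outputs)

p_six = Fraction(counts[6], len(all_outputs))
sixes_from_six_machine = machine_pool(6).count(6)
p_six_via_six_machine = Fraction(sixes_from_six_machine, len(all_outputs))

print(len(all_outputs), dict(counts))
print(p_six, p_six_via_six_machine)  # prints: 1/6 1/12
```

The count shows where the two competing numbers come from: 10 of the 60 outputs are 6s, but only 5 of those 10 sit in the pool of the 'six' machine.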
Indeed, Rob, you made the point before. And W. Dembski could argue that, assuming that an additional step is costly, the combination of the meta-search and the search is more costly than performing the search alone, without gaining a better expectation of success. But he says "Yet often, as in this example, we may actually do worse by trying to improve the probability of a successful search." And that's just not true: the probability of a successful search hasn't changed.
DiEb
August 31, 2012, 03:13 PM PDT
Ω Ω Thanks Chance, I can see the symbols in the preview screen.
Joe
August 31, 2012, 02:41 PM PDT
In #53 you said that my examples were mathematically valid.
It is mathematically valid to say that 2 unicorns plus 2 unicorns = 4 unicorns. However in the real world there aren't any unicorns.
Joe
August 31, 2012, 12:58 PM PDT
And I never said that there are 5 possible final states for a given initial state. I said there are 5 final states.
There should only be one final state: the target. Otherwise you keep shifting until then, right? And if you are still shifting then you aren't in a final state, which would be stable.
Joe
August 31, 2012, 12:56 PM PDT
In probability theory, the law (or formula) of total probability is a fundamental rule relating marginal probabilities to conditional probabilities.
That doesn't say we add the two conditional probabilities together to get the total probability.
Joe
August 31, 2012, 12:53 PM PDT
R0bb, can an intelligent being, without any prior information, exist? Would such a being even care or know about searching?
Joe
August 31, 2012, 12:48 PM PDT
In that case, Bernoulli’s PrOIR would have the search for T characterized by a probability measure Θ2 (∈ M2(Ω)) that assigns probability 1/2 to both A and B.
The search confers a probability of 1/2 on subset A and a probability of 1/2 on subset B, which means it confers a probability of zero on the 3 squares that are in neither A nor B.
Those 3 are no longer part of the equation. That was the whole point, the probabilities just got lower because the search space was narrowed.
Joe
August 31, 2012, 12:44 PM PDT
Also Joe, I forgot to ask, how does a person invent a metal detector without any prior information, which would mean no information about the facts of science and engineering?
R0bb
August 31, 2012, 12:43 PM PDT
Joe:
Please explain why your “+” should not be a “*”.
See Law of Total Probability.
R0bb
August 31, 2012, 12:34 PM PDT
Dieben:
I see a tiny problem at Dr. Dembski’s toy example
Thanks, Dieben. I noticed that too, and brought it up in comment #20. It seems that when Dembski says that the probability of finding a target decreases when we factor in the cost of the higher-level search, he really means that the probability of finding the higher-level target AND finding the lower-level target is smaller than the probability of directly finding the lower-level target.
R0bb
August 31, 2012, 12:31 PM PDT
BTW Joe, if you don't like my example of an LCI violation, you can look at Dembski's own such example. On rereading the example in section 1.1.4 of the S4S paper, I see an LCI violation that I hadn't noticed before. Search B has 2 bits of active information, and the endogenous information of finding search B is 1 bit. So instead of arguing about the validity of my example, let's just use Dembski's. And BTW, how does Dembski's example apply to the real world? How do his three CoI theorems apply to the real world? (He claims that his Bora Bora example is a special case of his function-theoretic CoI theorem, but this is in fact an error.) If any of Dembski's examples or theorems don't apply to the real world, is he being uncivil by bringing them up?
R0bb
August 31, 2012, 12:28 PM PDT
Joe:
There are 5 POSSIBLE final states if and only if the ONE item can be in all three of the initial starting points at the same time.
And I never said that there are 5 possible final states for a given initial state. I said there are 5 final states. If you inferred from that statement that all 5 states are accessible to a given initial state, I'm sorry -- that wasn't my intention.
Please quote the part that says that. I cannot find anything that comes close to saying that.
Here is the quote:
In that case, Bernoulli’s PrOIR would have the search for T characterized by a probability measure Θ2 (∈ M2(Ω)) that assigns probability 1/2 to both A and B.
The search confers a probability of 1/2 on subset A and a probability of 1/2 on subset B, which means it confers a probability of zero on the 3 squares that are in neither A nor B.
Can something be mathematically valid when applied to something that is invalid, such as your mangled sets?
In #53 you said that my examples were mathematically valid. Have you changed your mind? And you still haven't answered the question: If the LCI fails in mathematically valid cases, is it a true law?
R0bb
August 31, 2012, 12:21 PM PDT
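The disagreement above about the three squares can be made concrete with a small Python sketch. It assumes a 4x4 layout for the 16-square search space (the thread gives |Ω| = 16, the rightmost four squares, and the bottom twelve squares, but does not state the layout explicitly) and reads "assigns probability 1/2 to both A and B" as an equal-weight mixture of the two uniform distributions. Variable names are illustrative.

```python
from fractions import Fraction

SIZE = 4  # assumed 4x4 grid, so the search space has 16 squares
omega = [(row, col) for row in range(SIZE) for col in range(SIZE)]

# Row 0 is the top row; column SIZE - 1 is the rightmost column.
support_a = {(r, c) for (r, c) in omega if c == SIZE - 1}  # rightmost four squares
support_b = {(r, c) for (r, c) in omega if r >= 1}         # bottom twelve squares


def uniform(support):
    """Uniform distribution on `support`, written as a measure on all of omega."""
    weight = Fraction(1, len(support))
    return {sq: (weight if sq in support else Fraction(0)) for sq in omega}


dist_a = uniform(support_a)
dist_b = uniform(support_b)
half = Fraction(1, 2)
mixture = {sq: half * dist_a[sq] + half * dist_b[sq] for sq in omega}

zero_squares = sorted(sq for sq in omega if mixture[sq] == 0)
print(len(mixture), sum(mixture.values()), zero_squares)
# prints: 16 1 [(0, 0), (0, 1), (0, 2)]
```

Under these assumptions the measure is still defined on all 16 squares and sums to 1, while the three leftmost squares of the top row receive probability zero.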