
# A Tutorial on Specified Complexity


I’ve found that a lot of people who are interested in Intelligent Design are nonetheless unaware of the mathematics behind it. Therefore, I decided to do some videos teaching the basic ideas.

I would love to hear any feedback you have on the videos, or anything that you would like to see covered.

Originally, I was going to redo this video into a series of shorter videos, but time has prevented me. So, I’ll start with this one and we’ll see where we go from here.

I would also like to hear any criticisms of the math itself: not of its applications elsewhere (we can debate those in another thread), nor of other people's use of the mathematics (that can also be debated elsewhere), but of the basic ideas presented here.

DS, actually, it is a lump of background information (there is a world of possibilities otherwise, starting with loaded dice as a near alternative world, and going on from there, perhaps endlessly, a la "animal, vegetable or mineral"!) that is important: understanding that you have an alphabet of possible outcomes, with varying likelihoods, tracing to a particular dynamic process, a pair of fair dice and linked a priori or frequentist probabilities. This in effect defines a frame, a model world, and a description language. A set of possibilities and probabilities like this is informational. Then, in that context, 7 is less informative than 4, as there are more ways the former can be had than the latter.

Things get interesting if you then take a long string of dice and treat each as a six-state code element (~2.585 bits per character, i.e. yes/no choices), grouping the dice into clusters to code more interesting results; e.g. 3-letter codons with 6 states would give us 216 possibilities, a whole lot more than 64. (BTW, I think someone out there is doing that, extending the 4-base genome world to 6 bases.) Now we can code in dice, and the baseline probabilities of individual dice and of triplets of dice allow us to identify information-carrying capacity. Likely, we will throw some away by having don't-care (for now) states. So we see a world in which we can have codes based on dice, exploiting the high contingency of the 6-state elements in an organised way, as opposed to the more common random dice-toss approach.

Talking about strings of dice is also less loaded in the e-YES vs EYE-s sense (try the experiment in the joke that is likely still making the rounds; you will be amazed at how people struggle with this one in the real world), so maybe it will help throw some lateral illumination on the more polarised matters. KF

PS: JB, the only thing worse than sitting exams is setting and marking them!

kairosfocus
December 6, 2016, 10:54 PM PDT
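To put numbers on the carrying-capacity claims in the comment above, here is a minimal Ruby sketch (an editorial illustration, not part of the original comment):

```ruby
# Information capacity of a 6-state (die-face) code element, and the codon counts
# mentioned above. These follow directly from log2 and simple counting.
Math.log2(6)        # => 2.5849625007211562 bits per die, the ~2.585 figure quoted
6 ** 3              # => 216 possible 3-die "codons"
4 ** 3              # => 64 possible 3-base codons, for comparison
3 * Math.log2(6)    # => ~7.755 bits of raw capacity per 3-die codon
```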
What's really interesting, though, is using Active Information and the No Free Lunch theorem. NFL gives an expected average value for search success rates. Therefore, having search success rates significantly above the NFL average is itself an indicator that something is skewed positively. Even if it turns out that natural selection itself is somehow able to do the things it is said to do (not likely, for reasons shown here), then that would indicate that there was design within either the universe or natural selection to facilitate this operation. In fact, NFL and Active Information allow us to do really interesting things, like measure the amount of information that a cell applies to its own evolution. I'll probably be presenting my paper on that at the AM-Nat Biology meeting in February.

Anyway, I don't think I explained that last one well, but it's midnight and I still have to write finals for my students. Hopefully another day. Thanks for taking the time to watch the video!

johnnyb
December 6, 2016, 10:01 PM PDT
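As a rough illustration of the kind of bookkeeping described above, here is a minimal sketch using the Dembski/Marks definition of active information, I_+ = log2(q/p), where p is the success probability expected of blind search and q the observed success rate. The definition is standard in that literature, but the numbers below are placeholders, not measurements from any model.

```ruby
# Active information: bits of advantage a search shows over the blind (NFL-average)
# baseline. Both probabilities below are assumed values, for illustration only.
p = 2.0 ** -100                  # assumed blind-search success probability
q = 2.0 ** -60                   # assumed observed success probability
active_info = Math.log2(q / p)   # => 40.0 bits attributable to the search's structure
```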
First of all, I would agree with the statement that specified complexity as it is outlined here is hard to apply to biology, especially if it is to do all the things people want it to do. I think that, for instance, finding the specified complexity of the flagellum is problematic for a number of reasons, including the ones you state. Now, technically, I think it is at least in theory possible to factor natural selection into P(T|H), but I don't think anyone has done it explicitly (we will see some ways around this later).

However, if one is calculating the specified complexity of the first life, then one can indeed ignore natural selection for P(T|H). In fact, this has been done in the peer-reviewed literature, though under another name and with some other differences (all of which actually assign less probability to the origin of life than Dembski's approach would). It was allowed in at the time because the criticism was aimed specifically at the prebiotic-soup theory, despite the fact that it applies to any non-design theory of abiogenesis.

Estimating P(T|H) for the origin of life is actually much simpler, as it can be done on a stochastic model for the smallest possible organism. While it is true that we don't know for sure what the smallest possible organism is, I think that we can make reasonable estimates from known data, based on the minimal genome size of modern organisms, the requirements of self-replication, etc. If you have a defensible minimum size (I would probably be willing to go with any size you had in mind, provided you could reasonably justify it), then you can pretty much find an upper bound on the probability using information theory alone (the physics would actually make the true probability *much* lower, because the size of the molecule would increase the probability that it would break up in the absence of an existing organism).

That leaves finding phi_S(T). Some think that self-replication is not detachable from biology (I think it is, but whatever). Nonetheless, it is not hard to describe self-replication in other terms, such as "a sequence of amino acids such that it is able to produce a new copy of the same set of amino acids for multiple generations with at least 90% accuracy and seek building blocks for the same". Using the dictionary method, that's 709 bits. So, if you think that a food-seeking self-replicator can arise in less than 1,209 bits (the 709-bit specification plus the 500-bit threshold), then it is possible that such a self-replicator might have arisen sometime in the history of the universe. Personally, I think that 500 bits is a bit much (I prefer Yockey's calculation), but I won't make a big deal about it here.

johnnyb
December 6, 2016, 09:56 PM PDT
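The comment does not spell out the "dictionary method" arithmetic, so the sketch below is a reconstruction under an assumed dictionary size of roughly 600,000 entries, chosen only because it lands near the quoted 709 bits. The method shown (word count times log2 of dictionary size) is one plausible reading, not necessarily johnnyb's exact procedure.

```ruby
# Reconstructed "dictionary method": cost of a specification in bits is roughly
# (number of words) * log2(dictionary size). The dictionary size is an assumption.
description = "a sequence of amino acids such that it is able to produce a new copy " \
              "of the same set of amino acids for multiple generations with at least " \
              "90% accuracy and seek building blocks for the same"
words     = description.split.size          # => 37
spec_bits = words * Math.log2(600_000)      # => ~710 bits, near the quoted 709
threshold = spec_bits + 500                 # => ~1,210 bits, near the quoted 1,209
```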
steveh - Thanks for watching, and for watching all the way through! I'm going to start in the middle, because I think you misunderstood an important point. You said,
And then you make matters worse by casually stating that you can compute CSI for biological organisms, which as I understand it has never been done
I did not state this; in fact, I purposely stayed away from this question precisely because I wanted to focus on the mathematics rather than the applications. If I accidentally slipped and put something like this in, please let me know the timecode so I can cut that part out.

Now, as to the difference between phi_S(T) and K(T), there are a few things to consider. As I noted earlier in the video, you can use any defensible method (i.e., one satisfying the conditions of detachment, etc.) for generating the size of the specification space. Also note that these are all upper-bounding methods. The shortest expression that I can think of to generate a sequence of 1,000 characters in Ruby is 64 bits, so saying the specification is 64 bits is probably a massively overstated upper bound. I don't remember Dembski's own method for generating specificational complexity outside of Kolmogorov complexity, but both he and most others agree that KC is much more defensible for establishing properties such as detachability. Most of his current work uses KC precisely for this reason.

I don't know whether Dembski's original computation of phi_S(T) as 1 bit is correct or not, but, as I said, it isn't unreasonable, because (a) both are upper bounds, and (b) the smallest program to generate the right number of characters is 64 bits. So it actually seems to be on the right scale. Technically, KC carries an additive constant, but, in general, I think it works well, and in fact it grossly overestimates phi_S(T) because of the redundancies in the language, in the alphabet, and the dead space in the syntax of the language.

So that deals with your questions regarding what was actually in the video. My next post will cover the rest, but I want to make sure we separate what was in the video from the other topics we are talking about.

johnnyb
December 6, 2016, 09:37 PM PDT
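For concreteness, here is one 8-character (64-bit) Ruby expression that matches the figure cited above; whether it is the exact expression used in the video is an editorial assumption.

```ruby
# An 8-character Ruby expression (8 bytes = 64 bits) that yields a 1,000-character
# string, i.e. a candidate for the 64-bit upper bound on the specification discussed.
"H"*1000              # => "HHHH...H"
('"H"*1000').bytesize # => 8 characters of program text, i.e. 64 bits
("H"*1000).length     # => 1000 characters produced
```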
I found your video quite useful for the most part - very carefully explained in great detail until the last seconds, where you brought up the relationship to Dembski's work - and then things got very sketchy indeed. At around 42:50 you explained that the Specified Complexity formula is
C(T|H) - K(T|H) - 500
and then stated
The formula given by Dembski is the same, just using different notation: -log_2[2^500 * phi_S(T) * P(T|H)], where phi_S(T) = 2^K(T|H)
If I have understood what you said earlier in the video correctly, I think you are pulling a fast one here. K(T|H) is (paraphrased)
the length of the shortest program that will produce target T under hypothesis H.
and phi_S(T) is (from Dembski) [https://billdembski.com/documents/2005.06.Specification.pdf]
the number of patterns for which S’s semiotic description of them is at least as simple as S’s semiotic description of T
This appears to me to be much more than a change in notation. You and Dembski appear to be discussing completely different concepts: i.e., your phi_S(T) is 2^(length in bits of the smallest program that will reproduce the pattern), and Dembski's is a count of a number of patterns. Maybe the two concepts are related in some mind-boggling way, but I don't think it's valid to gloss this over as a notation change, because the numbers don't match in the simple 1000-heads example. For you, K(T|H) is 64 bits [@13:45 in the video] and phi_S(T) would be the largest number that can be written with 64 bits (approx EIGHTEEN QUINTILLION, about 1.8 x 10^19); for Dembski, phi_S(T) for 1000 coins is TWO (his example used 100 coins, but the same logic also gives two for 1000) [page 17]. And then you make matters worse by casually stating that you can compute CSI for biological organisms, which as I understand it has never been done (regardless of anything KF says). Dembski described what factors the calculations should consider:
P(T|H) as the probability for the chance formation of the bacterial flagellum. T, here, is conceived not as a pattern but as the evolutionary event/pathway that brings about that pattern (i.e., the bacterial flagellar structure). Moreover, H, here, is the relevant chance hypothesis that takes into account Darwinian and other material mechanisms.
In the only attempts I have seen at calculating CSI, P(T|H) is essentially calculated as the probability of forming a protein / DNA strand entirely by chance by adding bases at random and then getting the required result on the first attempt (i.e., 1 in 4^(number of bases)). That's not modelling the Darwinian mechanism. The only attempts at calculating phi_S(T) I have seen have been by choosing four concepts at random from a dictionary of 100,000 concepts (different from both formulations discussed above), or by using a number of alternative patterns that could produce the same result (then why not just factor that into P(T|H)?), or IIUIC by using a constant of ONE (functional) or ZERO (non-functional).

steveh
December 6, 2016, 09:57 AM PDT
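A quick numeric sketch of the gap steveh is pointing at, using the two readings of phi_S(T) exactly as stated in the comment above (the count of two patterns follows steveh's reading of page 17 of Dembski's paper):

```ruby
# phi_S(T) for the 1,000-heads example under the two readings described above.
k_bits = 64                 # K(T|H) cited from the video (@13:45)
2 ** k_bits                 # => 18446744073709551616, roughly 1.8 * 10^19
dembski_pattern_count = 2   # steveh's reading of Dembski's pattern count (page 17)
```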
UB,
... it seems to me that you are being given 1 of 11 possible answers to a question, corresponding to the 11 possible sums of two dice. The additional information regarding probabilities is information you already had.
Yes, I would have to agree with that. But then we can evaluate the information content of such a "message", I take it. So far, my understanding is that messages of the form "the sum of the dice is n" (or, more generally, some event E has occurred) convey to me some amount of information which we can quantify as in the OP. For another example, if someone says "the sum of the dice was between 2 and 12 inclusive", ultimately they have conveyed 0 bits of information to me (so they have in essence told me nothing).

daveS
December 4, 2016, 06:12 AM PDT
Then if you tell me E has occurred, you have given me -log_2(1/12) = 3.58 bits of information. If you tell me F has occurred, you have given me -log_2(1/6) = 2.58 bits of information.
Just a minor point about "being given" different amounts of information depending on whether you are given an E or an F. Perhaps I am woefully misreading your question (quite likely), but it seems to me that you are being given 1 of 11 possible answers to a question, corresponding to the 11 possible sums of two dice. The additional information regarding probabilities is information you already had.

Upright BiPed
December 4, 2016, 05:48 AM PDT
Thanks johnnyb and KF.

daveS
December 4, 2016, 05:40 AM PDT
PS: Pardon, I had 7 and 4 reversed. The 4 is the more surprising cluster of outcomes; in stat mech terms, the 7 has the higher statistical weight. And yes, all of these areas of thought converge.

kairosfocus
December 4, 2016, 01:14 AM PDT
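A quick enumeration backs up the correction above: the sum 4 is the rarer, more surprising outcome, while 7 carries the higher statistical weight (an editorial illustration):

```ruby
# Count the number of equally likely ways to roll each sum with two fair dice.
ways = Hash.new(0)
(1..6).each { |a| (1..6).each { |b| ways[a + b] += 1 } }
ways[4]   # => 3 ways (1+3, 2+2, 3+1), probability 3/36 = 1/12
ways[7]   # => 6 ways (1+6, 2+5, 3+4, 4+3, 5+2, 6+1), probability 6/36 = 1/6
```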
DS & JB: Yes, the 7 is more surprising than the 4 and is more informational; where, no surprise, no information. Perhaps this from my always linked note will help those digging in deeper. KF

kairosfocus
December 3, 2016, 08:35 PM PDT
daveS - that is correct.

johnnyb
December 3, 2016, 07:59 PM PDT
johnnyb, One very basic question, if you don't mind, concerning the part around 4:00 where you discuss converting probabilities to bits.

Let's say we are working with the experiment where you roll two fair dice and add the two numbers. E is the event of getting a sum of 4, while F is the event of getting a sum of 7. Then if you tell me E has occurred, you have given me -log_2(1/12) = 3.58 bits of information. If you tell me F has occurred, you have given me -log_2(1/6) = 2.58 bits of information.

Since E has lower probability than F, if you tell me E has happened, then you have given me more "specific" information than you would have by telling me F happened. By that I mean that in the case of E occurring, I can narrow down the exact outcome to a smaller (in probability) region of the sample space than with F. Is that right?

daveS
December 3, 2016, 01:11 PM PDT
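The two figures in the question above can be checked directly; this is a minimal sketch using the fair two-dice model stated in the comment:

```ruby
# Surprisal (bits of information) of the two events, given 36 equally likely rolls.
p_e = 3.0 / 36      # P(E): sum of 4, three of the 36 outcomes
p_f = 6.0 / 36      # P(F): sum of 7, six of the 36 outcomes
-Math.log2(p_e)     # => ~3.585 bits for E
-Math.log2(p_f)     # => ~2.585 bits for F
```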
Very nice video, johnnyb. I think it lays out the concepts quite clearly.

daveS
December 3, 2016, 08:44 AM PDT
Ditto. Excellent video! Totally worth watching all the way through to the end.

rigby
December 3, 2016, 06:11 AM PDT
Excellent video! Thank you so much; it really helped me understand specified complexity and the warrant for design on a mathematical level.

anthropic
December 2, 2016, 11:00 PM PDT