Uncommon Descent Serving The Intelligent Design Community

An attempt at computing dFSCI for English language


In a recent post, I was challenged to offer examples of computation of dFSCI for a list of 4 objects for which I had inferred design.

One of the objects was a Shakespeare sonnet.

My answer was the following:

A Shakespeare sonnet. Alan’s comments about that are out of order. I don’t infer design because I know of Shakespeare, or because I am fascinated by the poetry (although I am). I infer design simply because this is a piece of language with perfect meaning in English (OK, ancient English).
Now, a Shakespeare sonnet is about 600 characters long. That corresponds to a search space of about 3000 bits. Now, I cannot really compute the target space for language, but I am assuming here that the number of 600-character sequences which make good sense in English is lower than 2^2500, and therefore the functional complexity of a Shakespeare sonnet is higher than 500 bits, Dembski’s UPB. As I am aware of no simple algorithm which can generate English sonnets from single characters, I infer design. I am certain that this is not a false positive.

In the discussion, I admitted however that I had not really computed the target space in this case:

The only point is that I do not have a simple way to measure the target space for the English language, so I have taken a shortcut by choosing a long enough sequence, so that I am quite sure that the target space/search space ratio corresponds to more than 500 bits, as I have clearly explained in my post #400.
For proteins, I have methods to approximate a lower threshold for the target space. For language I have never tried, because it is not my field, but I am sure it can be done. We need a linguist (Piotr, where are you?).
That’s why I have chosen an over-generous length. Am I wrong? Well, just offer a false positive.
For language, it is easy to show that the functional complexity is bound to increase with the length of the sequence. That is IMO true also for proteins, but it is less intuitive.

That remains true. But I have reflected, and I thought that perhaps, even though I am not a linguist and not even a mathematician, I could try to define the target space in this case more precisely in quantitative terms, or at least find a reasonable upper bound for it.

So, here is the result of my reasoning. Again, I am neither a linguist nor a mathematician, and I will be happy to consider any comment, criticism or suggestion. If I have made errors in my computations, I am ready to apologize.

Let’s start from my functional definition: any text of 600 characters which has good meaning in English.

The search space for a random search where every character has the same probability, assuming an alphabet of 30 characters (letters, space, elementary punctuation), is 30^600, that is, about 2^2944. IOWs, 2944 bits.
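As a quick check of that figure, here is a minimal Python sketch. The function name `search_space_bits` is only illustrative and is not part of the original computation; the sketch assumes, as above, that every character is equally probable.

```python
import math


def search_space_bits(length: int, alphabet_size: int) -> float:
    """Bits needed to index every string of `length` symbols
    drawn uniformly from an alphabet of `alphabet_size` symbols."""
    # log2(alphabet_size ** length) == length * log2(alphabet_size)
    return length * math.log2(alphabet_size)


bits = search_space_bits(600, 30)
print(f"30^600 is about 2^{bits:.1f}")
```

Working with logarithms avoids building the huge integer 30^600, although Python could handle it directly as well.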

OK.

Now, I make the following assumptions (more or less derived from a quick Internet search):

a) There are about 200,000 words in English

b) The average length of an English word is 5 characters.

I also make the easy assumption that a text which has good meaning in English is made of English words.

For a 600 character text, we can therefore assume an average number of words of 120 (600/5).

Now, we compute the possible combinations (with repetition) of 120 words from a pool of 200,000. The result, if I am right, is about 2^1453. IOWs, 1453 bits.
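The count of combinations with repetition can be verified with the standard formula C(n + k - 1, k). The following is a small sketch under the assumptions stated above (a pool of 200,000 words, 120 words per text); the helper name `multiset_bits` is hypothetical.

```python
import math


def multiset_bits(pool_size: int, k: int) -> float:
    """log2 of the number of combinations with repetition
    (multisets) of k items chosen from pool_size items."""
    # Combinations with repetition: C(pool_size + k - 1, k)
    count = math.comb(pool_size + k - 1, k)
    # math.log2 accepts arbitrarily large Python integers
    return math.log2(count)


print(f"{multiset_bits(200_000, 120):.1f} bits")
```

`math.comb` requires Python 3.8 or later.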

Now, obviously each combination of n words can have up to n! permutations, therefore each of them has up to 120! different permutations, that is, about 2^660. IOWs, 660 bits.
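A short sketch of the permutation count follows. It also illustrates a caveat: n! is the exact number of orderings only when all n words are different, and a combination with repeated words has fewer distinct orderings, so using 120! for every combination can only overestimate the total. The function names are illustrative assumptions.

```python
import math
from collections import Counter


def permutation_bits(n: int) -> float:
    """log2 of n!, the number of orderings of n distinct items."""
    return math.log2(math.factorial(n))


def distinct_orderings(words: list[str]) -> int:
    """Exact number of distinct orderings of a word list that may
    contain repeats: n! divided by the factorial of each repeat count."""
    total = math.factorial(len(words))
    for repeat_count in Counter(words).values():
        total //= math.factorial(repeat_count)
    return total


print(f"log2(120!) = {permutation_bits(120):.1f} bits")
# Six words with two repeated pairs: 6! / (2! * 2!) orderings, not 6!
print(distinct_orderings(["to", "be", "or", "not", "to", "be"]))
```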

So, multiplying the total number of word combinations with repetitions by the total number of permutations for each combination, we have:

2^1453 * 2^660 = 2^2113

IOWs, 2113 bits.

What is this number? It is the total number of sequences of 120 words that we can derive from a pool of 200,000 English words. Or at least, a good approximation of that number.
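One way to see why this is a good approximation: the number of ordered sequences of 120 words from a pool of 200,000, with repetition allowed, is exactly 200,000^120, and the product computed above slightly exceeds it. The sketch below compares the two in bits; the constant and variable names are illustrative.

```python
import math

POOL = 200_000   # assumed number of English words
N_WORDS = 120    # assumed words in a 600-character text

# Route taken in the post: multisets times permutations
combos_bits = math.log2(math.comb(POOL + N_WORDS - 1, N_WORDS))
perms_bits = math.log2(math.factorial(N_WORDS))
product_bits = combos_bits + perms_bits

# Direct count of ordered word sequences: POOL ** N_WORDS
ordered_bits = N_WORDS * math.log2(POOL)

print(f"combinations x permutations: {product_bits:.2f} bits")
print(f"ordered sequences:           {ordered_bits:.2f} bits")
```

The two values differ by a small fraction of a bit, so either can serve as the upper bound.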

It’s a big number.

Now, the important concept: in that number are certainly included all the sequences of 600 characters which have good meaning in English. Indeed, it is difficult to imagine sequences that have good meaning in English and are not made of correct English words.

And the important question: how many of those sequences have good meaning in English? I have no idea. But anyone will agree that it must be only a small subset.

So, I believe that we can say that 2^2113 is an upper bound for our target space of sequences of 600 characters which have a good meaning in English. And, certainly, a very generous upper bound.

Well, if we take that number as a measure of our target space, what is the functional information in a sequence of 600 characters which has good meaning in English?

It’s easy: the ratio between target space and search space:

2^2113 / 2^2944 = 2^-831. IOWs, taking -log2, 831 bits of functional information. (Thank you to drc466 for the kind correction here)
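The same arithmetic as a tiny sketch, using the rounded bit counts from the text and the 500-bit UPB mentioned earlier as the comparison threshold. The constant names are illustrative assumptions.

```python
import math

SEARCH_BITS = 2944   # log2 of the search space, 30^600
TARGET_BITS = 2113   # log2 of the upper bound on the target space
UPB_BITS = 500       # Dembski's UPB, as used in the post

# Ratio of target space to search space: 2^2113 / 2^2944 = 2^-831
ratio = 2.0 ** (TARGET_BITS - SEARCH_BITS)

# Functional information is -log2 of that ratio
functional_bits = -math.log2(ratio)
exceeds_upb = functional_bits > UPB_BITS

print(functional_bits, exceeds_upb)
```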

So, if we take as a measure of our functional space a number which is certainly an extremely overestimated upper bound for the real value, our dFSI is still over 800 bits.

Let’s go back to my initial statement:

Now, a Shakespeare sonnet is about 600 characters long. That corresponds to a search space of about 3000 bits. Now, I cannot really compute the target space for language, but I am assuming here that the number of 600-character sequences which make good sense in English is lower than 2^2500, and therefore the functional complexity of a Shakespeare sonnet is higher than 500 bits, Dembski’s UPB. As I am aware of no simple algorithm which can generate English sonnets from single characters, I infer design. I am certain that this is not a false positive.

Was I wrong? You decide.

By the way, another important result is that if I make the same computation for a 300 character string, the dFSI value is about 416 bits. That is a very clear demonstration that, in language, dFSI is bound to increase with the length of the string.
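The dependence on length can be sketched as a function of the string length. This is a simplified version of the computation, assuming the target space is bounded by the number of ordered word sequences (200,000 raised to the number of words), which is within a fraction of a bit of the combinations-times-permutations figure. The function name `dfsi_bits` and its default parameters are illustrative.

```python
import math


def dfsi_bits(n_chars: int,
              alphabet: int = 30,
              vocabulary: int = 200_000,
              avg_word_len: int = 5) -> float:
    """Lower bound on functional information, in bits, for a
    meaningful text of n_chars characters under the post's assumptions."""
    n_words = n_chars / avg_word_len
    search_bits = n_chars * math.log2(alphabet)
    target_bits = n_words * math.log2(vocabulary)  # generous upper bound
    return search_bits - target_bits


for n in (150, 300, 600, 1200):
    print(n, round(dfsi_bits(n), 1))
```

Under these assumptions the bound grows linearly, at roughly 1.38 bits per character, so doubling the length doubles the value.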

Comments
Zac said,
That’s actually a slightly different question, which has to do with choosing an algorithm which computes the string. While such an algorithm exists, you won’t be able to tell which one.
I say,
Thank you God!!! some common ground
You say,
That’s true of a Shakespearean sonnet, but it’s also true of a random sequence, or any string for that matter.
I say,
Agreed. see my response to gpuccio below
fifthmonarchyman
November 29, 2014, 07:13 AM PDT
Gary S. Gaulin writes:
This is the only known Theory of Intelligent Design that provides scientifically testable predictions and models to explain the origin of intelligence and how intelligent cause works.
What is the purpose of the word "known" in the above sentence? Are there "unknown" theories of Intelligent Design that provide scientifically testable predictions? If so, how would we know? Are ID proponents impressed with Mr. Gaulin's 40 pages of somewhat difficult prose and his impressive claim that his theory has explanatory power that produces testable entailments? Some ID proponent should be fetching this man a chair. Mr Arrington! Get Mr. Gaulin to write an OP. It's what we've been waiting for!
Alicia Renard
November 29, 2014, 06:52 AM PDT
Gary S. Gaulin @ 894
Your childish answers indicate that you still don’t even know what is explained by the Theory of Intelligent Design.
Are you a famous IDer who has published papers and books and expounded your theories across the world? When even a Math and Philosophy PhD guy is struggling to make his work recognized, why do you think your own theory and schematics in your own blog will be known, much less read and understood by anyone at all?
Me_Think
November 29, 2014, 06:46 AM PDT
Gary S. Gaulin: Your ... answers indicate that you still don’t even know what is explained by the Theory of Intelligent Design.
We were responding to your comment that we were somehow "arguing that this planet suddenly appeared, at the very start of the Cambrian Explosion." That's obviously not the case. The Earth formed long before the Cambrian Explosion, and life appeared soon after that. As for your link, the problem with pseudo-science is that, because it's not constrained by observation, it fractures into as many pieces as there are advocates (adaptive radiation).
Zachriel
November 29, 2014, 06:41 AM PDT
Gary S. Gaulin: But good luck arguing that this planet suddenly appeared, at the very start of the Cambrian Explosion.
You do realize there is ample evidence of life before the Cambrian Explosion, including multicellular life?
Your childish answers indicate that you still don't even know what is explained by the Theory of Intelligent Design.
Gary S. Gaulin
November 29, 2014, 06:36 AM PDT
fifthmonarchyman: Do you believe that finite strings are computable even if their specification is unknown?
Glad you gave up on Kolmogorov Complexity. All finite strings are computable, and there are an infinite number of algorithms which can compute each one.
fifthmonarchyman: You can’t compute a string if you don’t already know it’s specification.
That's actually a slightly different question, which has to do with choosing an algorithm which computes the string. While such an algorithm exists, you won't be able to tell which one. That's true of a Shakespearean sonnet, but it's also true of a random sequence, or any string for that matter.
Zachriel
November 29, 2014, 06:12 AM PDT
Zac, let's cut to the chase.
Do you believe that finite strings are computable even if their specification is unknown? If so we are at a metaphysical impasse.
You can't compute a string if you don't already know its specification. This is to me a self-evident truth. If you deny this obvious truth, discussion is futile as far as I can tell, and the only way forward that I can see is science.
fifthmonarchyman
November 29, 2014, 05:40 AM PDT
fifthmonarchyman: I ask again are you claiming that ALL finite strings are computable?
Yes.
fifthmonarchyman: If you say yes you have ruled out ID a-priori and there is really no reason to discuss further.
It has nothing to do with ID. It's a simple fact about a specific measure called Kolmogorov complexity.
fifthmonarchyman: If you somehow think that that program has anything to do with my argument then There is no point in discussing this further with you,
You're the one who introduced Kolmogorov complexity without apparently understanding it.
fifthmonarchyman: I know you realize that all universal Turing machines can be considered equivalent so increased technical ability will not make the challenge any easier
Your claim is that such an algorithm is impossible, but that's what you have yet to show.
Gary S. Gaulin: That’s another brush-off.
When a new niche becomes available, there are a multitude of opportunities for adaptation, hence we will usually see a spurt of variation, followed by a winnowing process. That doesn't resolve the specifics of the Cambrian Explosion, but provides other examples of the general pattern.
Gary S. Gaulin: But good luck arguing that this planet suddenly appeared, at the very start of the Cambrian Explosion.
You do realize there is ample evidence of life before the Cambrian Explosion, including multicellular life?
Zachriel
November 29, 2014, 05:12 AM PDT
Gary S. Gaulin: And what did you explain by spouting a smart sounding name for something?
Adaptive radiation occurs when a new niche becomes available. The Cambrian Explosion is a case of adaptive radiation on a large scale.
That's another brush-off. But good luck arguing that this planet suddenly appeared, at the very start of the Cambrian Explosion.
Gary S. Gaulin
November 28, 2014, 08:44 PM PDT
FMM, You wrote:
since the computability resources needed to specify an IC object is infinite the Kolmogorov complexity of said object is infinite by definition.
That claim is incorrect, and I have explained why. The simple program I cited proves it, by the very definition of Kolmogorov complexity. To knowledgeable observers, you look very foolish: a guy who doesn't know what he's talking about, lashing out at the people who are trying to explain it to him. Why not crack a book or two? Does everything have to be spoon-fed to you by your critics? Show some initiative and learn about irreducible complexity, computability and Kolmogorov complexity. They're interesting topics!
keith s
November 28, 2014, 07:19 PM PDT
I know you realize that all universal Turing machines can be considered equivalent
Not really. Every universal Turing machine has different states. For example, the smallest universal Turing machines put forth by Yurii Rogozhin have state and color sets of (24, 2), (10, 3), (7, 4), (5, 5), (4, 6), (3, 10), and (2, 18). Wolfram in 2002 discovered the Four 2-state 4-color universal Turing machines.
Me_Think
November 28, 2014, 07:04 PM PDT
Zac said,
Not being able to produce such an algorithm may mean nothing more than a lack of technical ability.
I say,
I know you realize that all universal Turing machines can be considered equivalent so increased technical ability will not make the challenge any easier
peace
fifthmonarchyman
November 28, 2014, 06:17 PM PDT
Keiths said,
Any finite string can be produced by a program like the one I gave you above:
I say,
NO offense but I'm not in the mood to play games. If you somehow think that that program has anything to do with my argument then There is no point in discussing this further with you, There are other critics who while not agreeing with me at least have a clue of what I'm talking about. Why don't you go back to your ONH argument. It doesn't require you to do much actual discussion.
peace
fifthmonarchyman
November 28, 2014, 06:12 PM PDT
Any finite string can be produced by a program like the one I gave you above:
What about producing an irreducibly complex structure like a bacterial flagellum?
Joe
November 28, 2014, 05:22 PM PDT
The problem with your original post is the claim that the scarcity of the targets is a problem for evolutionary search.
Evolutionary search is an oxymoron wrt biology and the mainstream version of evolution.
Joe
November 28, 2014, 05:20 PM PDT
FMM:
I make no claims to being particularly well read it’s possible I’ve misunderstood something
That's fine, but why not work on remedying that? Learn what irreducible complexity, computability, and Kolmogorov complexity actually are, and then come back and make your argument. Any finite string can be produced by a program like the one I gave you above:
string = “<insert specified string here>”; output(string);
That is a finite program. Therefore, the Kolmogorov complexity of any finite string is finite.
keith s
November 28, 2014, 04:56 PM PDT
KeithS,
I make no claims to being particularly well read, and it's possible I've misunderstood something, but I've seen nothing I've read here or elsewhere to contradict my understanding. Can you provide a link?
You say,
The Kolmogorov complexity of any finite string is finite.
I say,
Kolmogorov complexity is a measure of the computability resources needed to specify an object. I ask again are you claiming that ALL finite strings are computable? This is a simple straightforward question, please answer yes or no.
If you say yes you have ruled out ID a-priori and there is really no reason to discuss further.
If you say no then our disagreement is semantics. You are apparently saying that a string can be not computable yet its computation only requires finite resources. And I say that such a statement is illogical and incoherent.
peace
fifthmonarchyman
November 28, 2014, 04:26 PM PDT
fifthmonarchyman,
The Kolmogorov complexity of any finite string is finite. This is a basic and well-known fact about Kolmogorov complexity. Why not crack a book now and then?
keith s
November 28, 2014, 03:40 PM PDT
fifthmonarchyman: Agreed, if you think my experiment is not well-devised please provide constructive criticism.
The problem is that your experiment merely tests human technical limitations. Not sure if there is any way to salvage it.
fifthmonarchyman: an Algorithm that infallibly fools the observer would be a very specific observation would it not?
If you could find an algorithm that produced quality poetry, it would show that poetry is not beyond the capabilities of an algorithm. It used to be that chess was the ultimate test of human intelligence. Not finding such an algorithm is simply not significant evidence of anything other than the limitations of human capabilities.
fifthmonarchyman: Again we are not trying to support my hypothesis we are trying to falsify it.
Your hypothesis is that humans can't make such an algorithm, not that such an algorithm isn't possible.
fifthmonarchyman: Are you implying that only positive prediction is valid in scientific inquiry?
In the case of the bending of sunlight, a negative result would have been just as decisive. What a good hypothesis does is cleave the possible world into two, with a very distinct boundary. Your experiment will almost certainly fail, showing only the limitations of the experimenter.
gpuccio: What a pity that a suitable oracle is usually provided only by intelligent engineering, and that highly connected spaces are so hard to find, especially when the search space becomes really huge
The oracle could actually be the physical environment, as with some newer robot programs. Huge spaces are not a problem as long as they exhibit locality.
Gary S. Gaulin: And what did you explain by spouting a smart sounding name for something?
Adaptive radiation occurs when a new niche becomes available. The Cambrian Explosion is a case of adaptive radiation on a large scale.
Zachriel
November 28, 2014, 02:13 PM PDT
And Zachriel, most science defenders use the phrase "punctuated equilibrium" to get out of having to give an appropriate scientific answer to that one.
Gary S. Gaulin
November 28, 2014, 01:33 PM PDT
Gary S. Gaulin: Not having beforehand predicted a sudden proliferation of multicellular intelligence is one of the very serious weaknesses of Darwinian theory.
It’s called adaptive radiation.
And what did you explain by spouting a smart sounding name for something? The only thing I see in your statement is a brush-off of what I said.
Gary S. Gaulin
November 28, 2014, 01:19 PM PDT
Zachriel:
Thank you for your last comments to my statements. I find them rather balanced.
I agree that the fundamental role of conscious representations in cognition is still an open problem. There are, IMO, many arguments in favor of its essential role (including Gödel-derived arguments, and some basic intuitions about cognition itself). And I would say that there is no evidence that the opposite view, let's call it strong AI theory, has any rationale or any empirical support. But I agree, it is an open problem, and IMO a very important one for the whole scientific paradigm.
I have definitely much less faith than you have in evolutionary searches. Sometimes I am surprised at how much faith, IMO unsupported, my "skeptical" interlocutors can harbor for things that help maintain their worldview. :)
However, I have no problems in admitting that an "evolutionary search" can work rather well, given a suitable oracle and a highly connected functional space. What a pity that a suitable oracle is usually provided only by intelligent engineering, and that highly connected spaces are so hard to find, especially when the search space becomes really huge... :)
gpuccio
November 28, 2014, 09:22 AM PDT
Zac says,
No, but you can support a hypothesis with a well-devised experiment.
I say,
Agreed, if you think my experiment is not well-devised please provide constructive criticism. I'm open to any modifications whatsoever as long as the "key" remains concealed from the programmer.
You say,
A good prediction will have a very specific observation that if found to be false will contradict the hypothesis.
I say,
an Algorithm that infallibly fools the observer would be a very specific observation would it not?
You say,
You are making a negative prediction. NOT producing an algorithm does nothing to support your hypothesis.
I say,
Again we are not trying to support my hypothesis we are trying to falsify it. The failure to falsify provides a sort of indirect support I suppose but not proof by any means.
You say,
Consider a famous example, such as the degree of curvature of light around the Sun predicted by General Relativity. You take the observation. If it fails, then the theory is falsified.
I say,
Are you implying that only positive prediction is valid in scientific inquiry? Don't tell that to my boss ;-) I use negative prediction all the time in my work. It's often not as desirable as a positive prediction but lots of real world knowledge is built upon it. A negative result in a cancer screening, for example, can provide valuable information, as can a negative result in an E. coli test in a mountain stream.
peace
fifthmonarchyman
November 28, 2014, 09:19 AM PDT
fifthmonarchyman: Not being able to produce an algorithm does not “prove” my Hypothesis Science can’t “prove” anything.
No, but you can support a hypothesis with a well-devised experiment.
fifthmonarchyman: Producing an algorithm falsifies my Hypothesis
A good prediction will have a very specific observation that if found to be false will contradict the hypothesis. You are making a negative prediction. NOT producing an algorithm does nothing to support your hypothesis. Consider a famous example, such as the degree of curvature of light around the Sun predicted by General Relativity. You take the observation. If it fails, then the theory is falsified.
Zachriel
November 28, 2014, 08:41 AM PDT
Zac says,
Add the scientific method to things you don’t understand. Not being able to produce such an algorithm may mean nothing more than a lack of technical ability.
I say,
Not being able to produce an algorithm does not "prove" my Hypothesis. Science can't "prove" anything.
Producing an algorithm falsifies my Hypothesis.
Falsifiability is the classic demarcation between science and non-science.
How is that a misunderstanding of the scientific method?
peace
fifthmonarchyman
November 28, 2014, 07:50 AM PDT
gpuccio: I would simply say that such an algorithm cannot exist, and that a conscious agent who understands meaning and has complex conscious representations is necessary to do that.
Z: That’s your opinion, but not something you’ve shown. That’s a good description of the problem, though.
The problem with your original post is the claim that the scarcity of the targets is a problem for evolutionary search. It's simply not. As long as there are selectable pathways, evolution will find them. And we can show this by creating a landscape of meaningful phrases, even if only a tiny subset of what you point out is already scarce, to see if an evolutionary algorithm can navigate the landscape. This doesn't solve the problem of meaning, though.
Zachriel
November 28, 2014, 07:28 AM PDT
fifthmonarchyman: since a Shakespearean sonnet is by definition a sonnet composed by Shakespeare the “overhead required” must specify all that is Shakespeare. Clearly that is a lot of information
No. The longest possible shortest description is the string itself. The overhead is just what the program requires to call the identity function.
Gary S. Gaulin: Instead of the discovery of (what later became known as) the Cambrian Explosion having been predicted by Charles Darwin ...
Darwin was aware of the Cambrian Explosion.
Gary S. Gaulin: Not having beforehand predicted a sudden proliferation of multicellular intelligence is one of the very serious weaknesses of Darwinian theory.
It's called adaptive radiation.
fifthmonarchyman: Shakespeare is important because we are using Shakespearean sonnets as a typical test case to illustrate what Irreducible complexity is and how the game works.
Note to Me_Think: he's using a non-standard definition of irreducible complexity, as well as information, computable, Kolmogorov Complexity, and the scientific method.
gpuccio: but are you saying that we can output a string by a simple program if we already know it?
Yes, that is the longest shortest program which can output a given string in Kolmogorov Complexity.
gpuccio: I am not an expert in Kolmogorov complexity, but is it possible that the real utility of it is to know if we can compute a string which we don’t know in advance by an algorithm simpler than the string itself, and not if we can output a string which is already in the algorithm?
Think in terms of compression. What is the shortest possible representation of the string? It can be shorter than the original string, but can't be longer (other than calling the function).
gpuccio: Now, even if the term “Kolmogorov complexity” is perhaps describing both cases, I think that we are dealing with two different concepts here.
It's fifthmonarchyman's confusion.
gpuccio: The interesting point is: how big must an algorithm be to compute a Shakespeare sonnet (or something equivalent) without previously knowing it?
Don't know. How big was Shakespeare's mind when he wrote them? Any algorithmic solution is going to have to have access to all the very same information Shakespeare did, spelling and grammar, rhyme and rhythm, the history of England, tales told by countless others that he had heard, and wisecracks he heard at the local pub.
gpuccio: Maybe fifthmonarchyman’s point is that such an algorithm would have infinite complexity.
That seems to be his claim. He would do better not to redefine terminology. It confuses his readers, and leaves his own thinking muddled.
gpuccio: I would simply say that such an algorithm cannot exist, and that a conscious agent who understands meaning and has complex conscious representations is necessary to do that.
That's your opinion, but not something you've shown. That's a good description of the problem, though.
gpuccio: How complex should an algorithm be to recognize all possible contexts of that kind?
Very complex no doubt, and quite possibly non-computable, but that just isn't something you've shown.
fifthmonarchyman: Prove that you can produce an algorithm that will fool an observer infallibly with out cheating when it comes to IC configurations and my claim that they are non-computable will be falsified. If you can’t do that my hypothesis stands
Add the scientific method to things you don't understand. Not being able to produce such an algorithm may mean nothing more than a lack of technical ability.
Zachriel
November 28, 2014, 07:16 AM PDT
Gpuccio @ 863,
Your text-detector algorithm works for texts in languages known to the detector. Congratulations. Not that I doubted this for a minute, of course.
However, you did mention (@728) the specification “having good meaning in any pre-existing language that we may discover in the future, on other planets, everywhere”. Now THAT would actually be useful, telling us something that we didn’t already know. THAT would be analogous to a protein-design-detector. Presumably, it would also be able to detect encrypted messages. Could you provide a working example of this design-detector? The NSA would be very interested in a steganography detector.
My responses at 720 & 745 stand.
DNA_Jock
November 28, 2014, 07:12 AM PDT
gpuccio said,
So, has the algorithm computed Shakespeare’s sonnet? The answer is: no. The algorithm has computed the whole list of sequences made of English words from a list of all English words.
I say,
exactly, I think it is important to understand the concept of the y-axis here. The meaning of the sonnet goes far beyond the words themselves. The x-axis is indeed finite but the y-axis is infinite.
peace
fifthmonarchyman
November 28, 2014, 04:18 AM PDT
Everyone,
I am more than willing to share my Game with anyone if they want to take a crack at converting it from Excel to a shareable app. Just let me know how to contact you.
peace
fifthmonarchyman
November 28, 2014, 04:01 AM PDT