A recent blogger has announced that a few million simulated monkeys really could reproduce Shakespeare. This is such a hoary chestnut that, of course, everyone had to go and read exactly what the fellow actually did, if only to ridicule it. Here’s how he describes his project:
Instead of having real monkeys typing on keyboards, I have virtual, computerized monkeys that output random gibberish. This is supposed to mimic a monkey randomly mashing the keys on a keyboard. The computer program I wrote compares that monkey’s gibberish to every work of Shakespeare to see if it actually matches a small portion of what Shakespeare wrote…
For this project, I used Hadoop, Amazon EC2, and Ubuntu Linux. Since I don’t have real monkeys, I have to create fake Amazonian Map Monkeys. The Map Monkeys create random data in ASCII between a and z. It uses Sean Luke’s Mersenne Twister to make sure I have fast, random, well behaved monkeys. Once the monkey’s output is mapped, it is passed to the reducer which runs the characters through a Bloom Field membership test. If the monkey output passes the membership test, the Shakespearean works are checked using a string comparison. If that passes, a genius monkey has written 9 characters of Shakespeare. The source material is all of Shakespeare’s works as taken from Project Gutenberg.
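The pipeline described in the quote (random lowercase strings, a cheap membership pre-check, then an exact comparison) can be sketched roughly as follows. This is only an illustration, not the blogger's code, which is not shown here: the "Bloom Field" of the quote is read as a Bloom filter, the names `BloomFilter`, `fragments`, and `run_monkeys` are hypothetical, and stripping the source text down to the letters a-z is an assumption made so that it matches the monkeys' alphabet. Python's `random` module happens to be a Mersenne Twister, so it stands in for the generator the quote mentions.

```python
import hashlib
import random
import string

FRAGMENT_LEN = 9  # the quoted post checks 9-character groups


class BloomFilter:
    """Tiny Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, item):
        # Derive several bit positions by salting a single hash function.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def fragments(text, n=FRAGMENT_LEN):
    """All n-letter windows of the text, keeping only a-z (an assumption)."""
    letters = "".join(ch for ch in text.lower() if ch in string.ascii_lowercase)
    return {letters[i:i + n] for i in range(len(letters) - n + 1)}


def run_monkeys(text, attempts, seed=0, n=FRAGMENT_LEN):
    """Return the set of n-letter fragments of the text hit by random typing."""
    exact = fragments(text, n)
    bloom = BloomFilter()
    for frag in exact:
        bloom.add(frag)
    rng = random.Random(seed)  # CPython's generator is a Mersenne Twister
    hits = set()
    for _ in range(attempts):
        guess = "".join(rng.choices(string.ascii_lowercase, k=n))
        # Cheap probabilistic check first, exact comparison only if it passes.
        if guess in bloom and guess in exact:
            hits.add(guess)
    return hits
```

Note that the target text is loaded before the first monkey types anything, and that each hit is kept once found. That is the design choice the comments below argue about.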
35 Replies to “Just how many monkeys = Shakespeare?”
Oh, brother. A fun bit of amateur programming, but nothing more. A slightly more sophisticated version of Dawkins’ logically-flawed embarrassment, the “Methinks it is like a weasel” program.
In what way is it flawed?
Read the rest of the blog
LOL. Back to the “latching” claim. A durable myth.
But I’m not defending the monkey program, because I haven’t seen exactly how it works. It may latch.
A good way to figure out that this is fairly silly is to ask the question, “why 9 letters?” Or, an even better question, “what would happen if there were only one-letter strings?” or “what would happen if he matched 50-letter strings?”
In the first case, the computer would be done in a few seconds, probably. In the second one, the computer would never get done. So, the choice of 9-letter fragments is a ruse – it makes it look like he’s achieving something, when actually he’s just weaseling. It gives the computer something to do to make it *look* like it is generating stuff randomly, but not enough to do that it would actually be a hard problem.
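The effect of fragment length can be put in rough numbers. The sketch below assumes uniformly random lowercase letters, as in the quoted description; the typing rate of one billion strings per second is purely hypothetical, and `expected_tries` and `years_at` are illustrative names. With p the chance that a single random string matches, the mean number of tries until the first match is 1/p.

```python
ALPHABET_SIZE = 26  # lowercase a-z only, as in the quoted description


def expected_tries(length, distinct_targets=1):
    """Mean number of random strings typed before one matches any target.

    Each string matches with probability distinct_targets / 26**length,
    so the mean of the resulting geometric distribution is the reciprocal.
    """
    return ALPHABET_SIZE ** length / distinct_targets


def years_at(rate_per_second, tries):
    """Convert a number of tries to years at a given typing rate."""
    return tries / rate_per_second / (365.25 * 24 * 3600)


if __name__ == "__main__":
    RATE = 1e9  # hypothetical: one billion strings per second
    for length in (1, 9, 50):
        tries = expected_tries(length)
        print(f"{length:>2} letters: {tries:.3g} expected tries, "
              f"{years_at(RATE, tries):.3g} years at {RATE:.0e} strings/s")
```

The `distinct_targets` parameter matters for the project as described: any 9-letter window anywhere in the source text counts as a hit, so the more distinct windows the text contains, the sooner some match turns up.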
Latching is not required since the mutation generation is biased toward successful matching of the fitness function.
I was speaking of the WEASEL program. I haven’t seen the source code for this guy’s program. Only an explanation by him, and it is not analogous to the typing monkeys scenario. Pointless, IMO.
It probably shows (yet again!) that if an evolutionary algorithm is designed and constrained towards a pre-conceived highly complex target, the algorithm can reach it. Basic programming. Demonstrates nothing except the human ability to design even pointless stuff.
The sad thing is that the project will be stopped at 100% completion of Shakespeare’s works. Personally, I would be interested in a new sonnet or perhaps an extra scene in Hamlet. I think we should have a petition for the continuation…
Nice suggestion Alex73.
Might also help anyone who sees this as some form of validation of Darwinism that it is in fact much ado about nothing.
I would love to see what the program would churn out following your suggestion. Does the sonnet merely need to exist or does it need to be able to be published? I would see being published as a requirement/analogy to breeding in the animal family. Has it enough selective advantage to survive?
Again here is someone who continues not to see the obvious: the programme is working toward a planned endpoint. This is NOT Darwinian evolution … they should know better!
“Oftentimes excusing of a fault
Doth make the fault the worse by the excuse.”
Shakespeare’s King John
It might be deemed nitpicking, but it should be obvious that one monkey with sufficient time (some ridiculously enormous amount) or infinite monkeys with exactly enough time will be able to reproduce the works of Shakespeare by chance. That is just the nature of infinities and very, very large numbers.
And besides, if it is just that the monkeys need to type all the fragments, all that is really required is that the monkeys type all of the individual letters and spaces at least once; these are tiny fragments, after all.
I always assumed it was the whole text end to end.
“It was the best of times it was the blurst of times … Stupid Monkey!” – M. Burns
Absolutely correct. I could write a book this way. All I need is the precise text I’m trying to achieve (meaning, first I write the book). Then I program a computer to spit out random strings. Each time I get a good string I save it. In a very short time I have my book. Voila!
(Of course I might run the risk that someone would notice that I had to write my book in the first place before I set my ‘puter loose looking for the strings that match. The astute person might further question whether my ‘puter had in fact come up with the book, as I claimed, or whether I was trying to pull the wool over everyone’s eyes . . .)
Oh, I’m not suggesting there is anything wrong with the code (has anyone ever actually seen it?). The problem is the substance and the logical fail. I think the term my kids would use these days is: Epic Fail. Dawkins’ program is supposed to show how random changes, coupled with selection, can lead to a meaningful, information-rich sequence. Of course, coming up with the sequence in the first place and then carefully programming the computer to inevitably converge on that target doesn’t demonstrate anything of the kind. Dawkins fooled himself (as do more sophisticated algorithms, like Avida, though in a more nuanced way), because he saw lots of discrete steps leading to the target, which looks (kinda, sorta, if you hold your head to the side just right and stick your tongue out) like some kind of natural selection process. Of course it is nothing of the kind and is intelligently guided from start to finish.
Dawkins’ program was primarily an exercise in self-delusion.
Experimental results here.
Jason, it’s not obvious at all.
wiki has a “direct proof”.
Which contains the following:
“If we assume that the keys are pressed randomly (i.e., with equal probability) and independently”
But why should we assume this?
Random means having no definite aim or purpose. The monkeys might just as well produce no output, or sit down on the middle of the keyboard, as a nice neat sequence of random output.
So we substitute for the monkeys something that really does produce a constant, unpredictable output. But what?
(maybe someone should start searching for Shakespeare in PI?)
And even if an infinite random sequence did contain Shakespeare, would it not take an infinite search to find it?
That sounds meaningless to me. Unless we have a finite sequence that can be searched in a finite time we’re dealing with the impossible.
Monkeys plus Typewriters equals Shakespeare???
Doesn’t sound right, but hey, science is not about whether it sounds weird or not, but about whether it is actually true or not. Thus let’s conduct an experiment to see if Monkeys plus Typewriters may actually equal Shakespeare.
Here’s the link I posted again. It’s the same experiment
Oh, this is hilarious — Thanks for the link to the vivaria.net “book”! Looks like those monkeys are well on their way! 🙂
The other problem with recurring to the “infinite” resources argument, is that it is essentially equivalent to saying that everything happens. It is just as likely that I am a brain in a bottle as it is for Shakespeare’s works to get produced; or that, to borrow an example from others, we and our environment have just popped into existence from the multiverse, rather than our universe really being 13+B years old, etc. Everything goes; nothing can be relied upon. The whole infinite resources path, including its child, the multiverse hypothesis, is useless because it takes away all objectivity and makes the fantastical inevitable and the outrageous definite. There can be no basis in observed reality, and no objective scientific enterprise, once we admit to an existence where absolutely everything happens.
In any case, even if this succeeds in reproducing the works of Shakespeare, it proves nothing because monkeys are designed. Right?
“but it should be obvious that one monkey with sufficient time (some ridiculously enormous amount) or infinite monkeys with exactly enough time will be able to reproduce the works of Shakespeare by chance.”
Not to me.
Without a designer there is no Shakespeare or anything, ever. If you place a candle next to a pack of matches and allow for only that, but given infinity, will a match ever be struck and used to light the candle?
Add natural undirected forces to the experiment. Does that increase the likelihood of the candle being lit by the matches? The wind blows, the earth shakes and materials decay. No lit candle, ever, even with infinity.
CORRECTION: Before you further embarrass yourself, it may help you to see what the discussion was about on that subject, latching or ratcheting, here. Just remember that Alinsky’s a priori evolutionary materialist disciples and their fever-swamps specialise in strawmannishly caricaturing, ridiculing and demonising those who dare differ with their shibboleths.
Can someone kindly point to evidence, per observation, of an observed infinity [other than something like a definition of mathematical points . . . which is inferred not observed]?
I find this one cynically dismissive; especially on the unacknowledged implication that the proponents of blind chance plus necessity producing FSCI plainly do not have credible observed facts in support of their claims that non-trivial amounts of complex functionally specific info can be produced absent intelligence.
An actual credible demonstration of blind chance + mechanical necessity –> FSCI would immediately destroy both CSI and IC as empirically reliable signs of design. And you know or should know that.
What happens is that there are entire industries in support of the claim that FSCI is a reliable sign of design, and no credible OBSERVATIONAL counter evidence. That’s why objectors resort to so many rhetorical games, obfuscations, misleading claimed counter examples — Weasel being the most infamous — and philosophical a prioris such as gerrymandering the definition of science.
In the real world, it is blatantly obvious that the only empirically reliable source of functionally specific complex — especially coded info — is intelligence.
What you are hinting at but dare not try to address cogently on the merits is that (a) life has in it coded, digital, specifically functional info, and (b) the observed cosmos is sitting at a fine-tuned operating point that supports such C-chemistry, aqueous medium cell based life.
That points to design of life, and to design of the cosmos in which the life is found. That may be resisted, but only by willful refusal to attend to the actual evidence and credible principles of inference on evidence.
GEM of TKI
One could also read up on the Travelling salesman problem and how GAs can solve the case with 10^150 possible routes — without having a target.
In fact, the GA is the only practical way of solving the problem.
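For readers who have not seen one, here is a minimal sketch of the kind of search being referred to. It is a mutation-only evolutionary search with elitist selection, which is a simplification of a full GA (there is no crossover), and all names and parameter values are illustrative. The fitness function is simply tour length: the code stores no target tour and does not know the best route in advance. Whether that settles anything in the argument above is left to the commenters.

```python
import math
import random


def tour_length(tour, cities):
    """Total length of the closed tour visiting cities in the given order."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def mutate(tour, rng):
    """Reverse a randomly chosen segment of the tour (a 2-opt style move)."""
    i, j = sorted(rng.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def evolve(cities, generations=200, population=30, seed=0):
    """Keep the shorter half of the population and refill it with mutants."""
    rng = random.Random(seed)
    base = list(range(len(cities)))
    pop = [rng.sample(base, len(base)) for _ in range(population)]
    for _ in range(generations):
        pop.sort(key=lambda t: tour_length(t, cities))
        survivors = pop[: population // 2]
        mutants = [
            mutate(rng.choice(survivors), rng)
            for _ in range(population - len(survivors))
        ]
        pop = survivors + mutants
    return min(pop, key=lambda t: tour_length(t, cities))


if __name__ == "__main__":
    rng = random.Random(42)
    cities = [(rng.random(), rng.random()) for _ in range(15)]
    best = evolve(cities)
    print(best, round(tour_length(best, cities), 3))
```

Because the shorter half always survives, the best tour never gets worse from one generation to the next. Nothing guarantees that it is the shortest possible tour.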
There’s nothing about GAs that make them inherently capable or incapable of traversing a landscape. It’s the nature of the landscape that matters.
I have no idea how this particular monkey program works or what it really does, but words in any language are deeply connected by the rules of pronunciation. The dictionary landscape can be traversed by a GA that knows nothing about valid words, and only knows the most common letter pairs and combinations.
This argument is heard again and again.
The problem is how to come up with something that can mutate in the first place and where the limits within which mutations occur in practice are.
What takes a lot of extraneous intelligence is not so much the ability to adapt/optimise behaviour but how you get to a neighbourhood of states around which the system state can oscillate.
GAs have taken a lot of intelligence on the part of researchers. It would have been another matter altogether if you could prove GAs can spontaneously come about starting from a program that prints “Hello, world”.
Very good point, K.F. And it is an easy bet that nobody will ever be able to observe an infinity. It is simply impossible to observe or measure something that is infinite with the strictly finite means that we have at our disposal in our world. It is a genuine philosophical constraint, so that any reference to the infinite in science is an act of faith.
What do you think it proves?
GAs of course are intelligently designed and targeted to achieve what they do. They have plenty of built-in FSCI, and it comes from the usual known source. That the travelling salesman problem can be solved by such an algorithm has nothing to do with the spontaneous (blind chance + necessity only) origin of functional self-replicating cell-based life forms, or the [want of] observational evidence to substantiate such claims. Similarly for the spontaneous origin of major body plans as claimed. And, besides, all of this is distractive from the point I responded to: the misleading nature of Weasel and the observed reality of implicit latching, as defined and described in discussions going back several years now. That sort of predictable subject switching is yet more evidence of where the weight on the merits actually lies.
In addition, the problem is not how to hill-climb within an island of function, but how to get to its beaches, in a vast ocean of non-functional configs, without intelligent direction.
Now, apply to the actual observationally justifiable resources [10^57 atoms, 10^17 s, or 10^80 atoms, 10^25 s at 10^45 Planck time quantum states per second] and see how the FSCI based design inference pops out.
As in: if Chi_500 = I*S - 500, in bits beyond the solar system threshold, then once we go positive, we have good reason to accept design.
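Read literally, the expression is a simple threshold test. The sketch below assumes that I is an information measure in bits and that S is a 0/1 flag marking whether the item is judged functionally specific; the comment itself defines neither, so both readings are assumptions, and `chi_500` is just an illustrative name.

```python
def chi_500(info_bits, specific):
    """Evaluate Chi_500 = I*S - 500 under the assumptions stated above.

    info_bits: I, taken here to be information content in bits (assumed).
    specific:  S, taken here to be a True/False judgment mapped to 1/0 (assumed).
    """
    s = 1 if specific else 0
    return info_bits * s - 500


if __name__ == "__main__":
    for bits, flag in ((504, True), (504, False), (100, True)):
        print(bits, flag, chi_500(bits, flag))
```

On this reading the value is positive only when S is 1 and I exceeds 500 bits. How I and S are to be assigned in a real case is exactly what is in dispute, and the code says nothing about that.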
The experiment shows that monkeys are not even reasonable subjects for the random typing experiment.
However, this exercise is a reasonable substitute for the one of searching a more elaborate config space for zones of interest. A familiar type of model situation such as is commonly used in thermodynamics — which is where the example came from.
Its message is simple and direct.
Trial and error/success backed up by chance and necessity is not a feasible search algorithm for zones of interest when we have a quite modest degree of complexity — 72 or 143 ASCII characters worth of info is not a lot.
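The figures of 72 and 143 characters can be related to bit counts with a line of arithmetic. The sketch assumes 7 bits per ASCII character, which the comment does not state, and `config_space` is an illustrative name.

```python
ASCII_BITS = 7  # assumption: 7-bit ASCII, 128 possible values per character


def config_space(num_chars):
    """Return (bits, number of distinct strings) for a given length."""
    bits = ASCII_BITS * num_chars
    return bits, 2 ** bits


if __name__ == "__main__":
    for chars in (72, 143):
        bits, size = config_space(chars)
        # len(str(size)) - 1 is the power of ten just below the exact count.
        print(f"{chars} characters = {bits} bits, "
              f"about 10^{len(str(size)) - 1} configurations")
```

Under that assumption, 72 characters is the shortest string carrying more than 500 bits, and 143 the shortest carrying more than 1000.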
Large quantities of functionally specific complex coded info or the like — what we see in say posts here — are best explained on design. So much so that FSCI is diagnostic of design.
Save where an ideological a priori dressed up in a lab coat intervenes, i.e. the case of DNA.
Given an infinite amount of time, an immortal, tireless monkey, an indestructible typewriter and a limitless supply of paper, then eventually even the complete works of Shakespeare will be produced. It can’t be prevented.
Knowing monkeys, are they really able to type in any order?
Is it factored in that they would eat the paper every other keypunch?
They are monkeys.
Do they space? Does a non space series of letters count as words?
Something wrong with the monkey thing and literature from thinking people.
How many monkeys, and how long, would it take to reproduce a President Obama speech?!
They should pick a short one and calculate it.