In this post I want to consider another aspect of information. Specifically, I want to consider the concept of “Shannon information.”
First of all, I admit to having ruffled a few feathers when I mentioned in passing in a prior post that “Shannon information is not really information.” As I have also written before in comments on UD, I don’t begrudge anyone referring to the Shannon metric as “information.” That terminology has penetrated the English language and has become regularly used in information theory. So, no, I am not going to police everyone who puts the words “Shannon” and “information” next to each other.
However, no small amount of misunderstanding has resulted from the unfortunate term “Shannon information.” In particular, as it relates to intelligent design, some critics have seized on the idea of Shannon information and have argued that because this or that computer program or natural process can produce a complex string or sequence, such a program or process is therefore producing new complex “information.” This proves, the argument goes, that purely natural processes can produce new and large amounts of information, contra the claims of intelligent design.
Such thinking demonstrates a lack of understanding of CSI (complex specified information) – in particular the need for specification. However, a large part of the problem results from the use of the word “information” in reference to the Shannon metric. As I have stated before, somewhat provocatively, we would all have been better off if instead of “Shannon information” the concept were referred to as the “Shannon measurement” or the “Shannon metric.”
Claude Shannon published a paper entitled “A Mathematical Theory of Communication” in the July 1948 volume of The Bell System Technical Journal. This paper is available online and is considered foundational, not only for Shannon’s subsequent research on the topic but for information theory generally. To be sure, there are many other aspects of information theory and many other individuals worthy of acclaim in the field, but Shannon is perhaps justifiably referred to as the father of information theory.
But before delving into other details in subsequent posts, time permitting, I want to relate a short experience and then a parable. Consider this a primer, a teaser, if you will.
When I was a teenager in high school, one of my part-time jobs was working in a warehouse that housed and sold equipment and materials for the construction industry. On a regular weekly schedule we would load a truck with supplies at the main warehouse and drive the truck to a smaller warehouse in a different city to supply the needs in that locale. The day of the week was fixed (if memory serves, it was generally a Friday) and the sending warehouse foreman made sure that there were enough people on hand in the morning to pull inventory and load the truck, while the receiving warehouse foreman in turn ensured that there were enough people on hand in the afternoon to unload the truck and stock the inventory.
Due to the inevitable uneven customer demand in the receiving city, the needs of the receiving warehouse would vary. With good inventory management, a large portion of the receiving warehouse’s needs could be anticipated up front. However, it was not uncommon for the receiving warehouse to have a special order at the last minute that would necessitate removing a large crate or some boxes from the truck that had already been loaded in order to make room for the special order. At other times when no large orders had been made, we would finish loading all the supplies and find that we still had room on the truck. In this latter case, the sending foreman would often decide to send some additional supplies – usually a high turnover item that he knew the receiving warehouse would likely need shortly anyway.
In either case, the goal was to make the most efficient use of the time and expense of the truck and driver that were already slated to head to the other town – taking the best possible advantage of the sunk costs, if you will. Ensuring that the shipment container (in this case a truck) made the best use of its available capacity was a key to efficient operations.
I want to now take this experience and turn it into a parable that relates to Shannon information.
The Parable of the Fruit Truck
Let’s assume that instead of heating and cooling equipment and supplies, the warehouse sells fruit directly to customers. Let’s further assume that the various kinds of fruit are shipped in different-sized boxes – the watermelons in one size of box, the pineapples in another, the apples in another, and the strawberries in yet another.
Now, for simplicity, let’s suppose that customers purchase the fruit on a long-term contract with a pre-set price, so the primary variable expense of the warehouse is the expense of operating the truck. The warehouse would thus be highly incentivized to maximize the efficiency of the truck – sending it out on the road only as often as needed, and maximizing the carrying capacity of the truck.
The dock workers in our parable, however, are not particularly sharp. As the fruit comes in from the farms, the dock workers, without confirming the contents, simply start packing the boxes at the front of the truck, working their way to the back. Invariably, there are gaps and open spaces as the various-sized boxes do not precisely conform to the internal capacity of the truck. Some days are better than others by dint of luck, but the owner quickly realizes that the packing of the truck is inefficient. Worse still, customers regularly complain that (i) the truck is arriving only partly filled, (ii) boxes contain the wrong kind of fruit, or (iii) in particularly egregious cases, the boxes contain rotten fruit or no fruit at all.
As a result, the warehouse owner decides to hire a sharp young man fresh from the university whose sole job it is to figure out the best way to pack the truck, to create the most efficient and time-saving way to deliver as much fruit as possible given the carrying capacity of the truck.
Let’s say this young man’s name is, oh, I don’t know, perhaps “Shannon.”
Now our hero of the parable, Shannon, works in the office, not the loading dock, and is unable to confirm the actual contents of the boxes that are loaded on the truck. Further, he quite reasonably assumes the dock workers should be doing that part of the job. Notwithstanding those limitations, Shannon is a sharp fellow and quickly comes up with a formula that gives the owner a precise calculation of the truck’s carrying capacity and the exact number of each type of fruit box that can be loaded on the truck to ensure that every cubic inch of the truck is filled.
Elated with the prospect of putting all the customer complaints behind him, the warehouse owner hands down the instruction to the dock workers: henceforth the truck will be packed with so many watermelon boxes, so many pineapple boxes, so many apple boxes and so on. Furthermore, they will be packed according to Shannon’s carefully worked out order and placement of the boxes.
After the next week’s shipments, the owner is surprised to receive a number of customer complaints. Although not a single customer complains that the truck was only partly full (it was packed tightly to the brim in all cases), several customers still complain that (i) boxes contain the wrong kind of fruit, or (ii) in particularly egregious cases, the boxes contain rotten fruit or no fruit at all.
Furious, the owner marches to Shannon’s desk and threatens to fire him on the spot. “I hired you to figure out the best way to pack the truck to create the most efficient approach to delivering as much fruit as possible! But I am still swamped by customer complaints,” he fumes as he throws down the list of customer complaints on Shannon’s desk. Unfazed, Shannon calmly looks at the customer complaints and says, “I understand you used to get complaints that the truck was only partially filled, but I notice that not a single customer has complained about that problem this week. You hired me to find the most efficient delivery method, to ensure that the truck was maximizing its carrying capacity of boxes. I did that. And that is all I have ever claimed to be able to do.”
“But some of the customers got the wrong fruit or got no fruit at all,” sputters the owner. “Based on your work we told them they would be receiving a specific quantity of specific types of fruit each week.”
“I’m sorry to hear that,” retorts Shannon, “but you should not have promised any specific fruit or any particular quantity of fruit based on my formula alone. From my desk I have no way of knowing what is actually in the boxes. The supplier farms and dock workers can answer for that. What is in the boxes – what is actually delivered to the customer – has nothing to do with me. I have no ability from where I am sitting to guarantee the contents of the boxes, nor frankly any interest in doing so. My only task, the only thing I have ever claimed to be able to do, is calculate the maximum carrying capacity of the truck with the given boxes.”
The fruit truck is obviously but a simple and fun analogy. However, it does, I believe, help newcomers get a feel for what Shannon can do (analyze maximum carrying capacity of a delivery channel) and what Shannon cannot do (analyze, confirm, understand or quantify the underlying substance). We’ll get into more details later, but let’s kick it off with this analogy.
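For readers who like to see the point in numbers, here is a minimal sketch of the same idea. It assumes the simplest possible model, in which each character of a string is treated as an independent symbol and only the character frequencies matter; that modeling choice, the example sentence, and the function name `entropy_bits_per_symbol` are my own illustration, not anything taken from Shannon's paper. Under that assumption, a readable sentence and a scrambled copy of the very same characters receive the same score, much as Shannon in the parable counts boxes without ever looking inside them.

```python
import math
import random
from collections import Counter


def entropy_bits_per_symbol(text: str) -> float:
    """Shannon entropy, in bits per character, of the character
    frequencies in `text` (a simple frequency-only model)."""
    counts = Counter(text)
    total = len(text)
    # Sort the counts so the sum is taken in the same order for any
    # two strings that share the same character frequencies.
    return -sum((n / total) * math.log2(n / total)
                for n in sorted(counts.values()))


# A meaningful message and a scrambled copy of the same characters.
message = "send two boxes of apples and one box of pineapples"
chars = list(message)
random.Random(0).shuffle(chars)  # fixed seed, so the run is repeatable
scrambled = "".join(chars)

h_message = entropy_bits_per_symbol(message)
h_scrambled = entropy_bits_per_symbol(scrambled)

print("message:  ", message)
print("scrambled:", scrambled)
print("bits per symbol (message):  ", round(h_message, 4))
print("bits per symbol (scrambled):", round(h_scrambled, 4))
```

The two printed values match because the calculation sees only how often each character occurs, not what the characters say. More elaborate models that account for context between symbols would score the two strings differently, but they too measure statistical structure rather than meaning.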
What similarities and differences are there between our parable of the fruit truck and Shannon information? What other analogies are you familiar with or perhaps have yourself used to help bring these rather intangible concepts down to earth in a concrete way for people to understand?