Uncommon Descent Serving The Intelligent Design Community

Intelligent Design Basics – Information – Part IV – Shannon II

The concept of information is central to intelligent design.  In previous discussions, we have examined the basic concept of information, we have considered the question of when information arises, and we have briefly dipped our toes into the waters of Shannon information.  In the present post, I offer an additional discussion of the latter, both so that the resource is readily available and to counter some of the ambiguity and potential confusion surrounding the Shannon metric.

As I have previously suggested, much of the confusion regarding “Shannon information” arises from the unfortunate twin facts that (i) the Shannon measurement has come to be referred to by the word “information,” and (ii) many people fail to distinguish between information about an object and information contained in or represented by an object.  Unfortunately, due to linguistic inertia, there is little to be done about the former.  Let us consider today an example that I hope will help address the latter.

At least one poster on these pages takes regular opportunity to remind us that information must, by definition, be meaningful – it must inform.  This is reasonable, particularly as we consider the etymology of the word and its standard dictionary definitions, one of which is: “the act or fact of informing.”

Why, then, the occasional disagreement about whether Shannon information is “meaningful”?

The key is to distinguish between information about an object and information contained in or represented by an object.

A Little String Theory

Consider the following two strings:

String 1:

kqsifpsbeiiserglabetpoebarrspmsnagraytetfs

String 2:

[Image: String 2]

The first string is an essentially-random string of English letters.  (Let’s leave aside for a moment the irrelevant question of whether anything is truly “random.”  In addition, let’s assume that the string does not contain any kind of hidden code or message.  Treat it for what it is intended to be: a random string of English letters.)

The second string is, well, a string.  (Assume it is a real string, dear reader, not just a photograph – we’re dealing with an Internet blog; if we were in a classroom setting I would present students with a real physical string.)

There are a number of instructive similarities between these two strings.  Let’s examine them in detail.

Information about a String

It is possible for us to examine a string and learn something about the string.

String of Letters

Regarding String 1 we can quickly determine some characteristics about the string and can make the following affirmative statements:

1. This string consists of forty-two English letters.

2. This string has only lower-case characters.

3. This string has no spaces, numerals, or special characters.

It is possible for us to determine the foregoing based on our understanding of English characters, and given certain parameters (for example, I have provided as a given in this case that we are dealing with English characters, rather than random squiggles on the page, etc.).  It is also possible to generate these affirmative statements about the string because the creator of the statements has a framework within which to create such statements to convey those three pieces of information, namely, the English language.

In addition to the above quickly-ascertainable characteristics of the string, we could think of additional characteristics if we were to try.

For example, let’s assume that some enterprising fellow (we’ll call him Shannon) were to come up with an algorithm that allowed us to determine how much information could – in theory – be contained in a string consisting of those 3 characteristics: a string with forty-two English letters, using only lower-case characters, and with no spaces, numerals, or special characters.  Let’s even assume that Shannon’s algorithm required some additional given parameters in this particular case, such as the assumption that all possible letters occurred at least once, that all letters could occur with the relative frequency at which they show up in the string and so forth.  Shannon has also, purely as a convenience for discussing the results of his algorithm, given us a name for the unit of measurement resulting from his algorithm: the “bit.”

In sum, what Shannon has come up with is a series of parameters, a system for identifying and analyzing a particular characteristic of the string of letters.  And within the confines of that system – given the parameters of that system and the particular algorithm put forward by Shannon – we can now plug in our string and create another affirmative statement about that characteristic of the string.  In this case, we plug in the string, and Shannon’s algorithm spits out “one hundred sixty-eight bits.”  As a result, based on Shannon’s system and based on our ability in the English language to describe characteristics of things, we can now write a fourth affirmative statement about how many bits are required to convey the string:

4. This string requires one hundred sixty-eight bits.
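
As a rough illustration of what "plugging in the string" can look like, here is a minimal Python sketch.  The function names and the two probability models are assumptions made for illustration; they are not parameters stated in this post.  The sketch shows that the number of bits depends on the agreed-upon model: treating all 26 letters as equally likely gives one figure, using the letter frequencies observed in the string gives another, and a flat 4 bits per letter (a 16-symbol alphabet) gives exactly 168.

```python
import math
from collections import Counter

STRING_1 = "kqsifpsbeiiserglabetpoebarrspmsnagraytetfs"


def uniform_bits(length: int, alphabet_size: int) -> float:
    """Bits needed if every symbol is equally likely: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)


def empirical_bits(text: str) -> float:
    """Bits under the assumption that each symbol's probability equals
    its relative frequency in the text itself."""
    counts = Counter(text)
    total = len(text)
    bits_per_symbol = -sum(
        (c / total) * math.log2(c / total) for c in counts.values()
    )
    return bits_per_symbol * total


print(len(STRING_1))                    # 42 letters
print(uniform_bits(len(STRING_1), 26))  # about 197.4 bits (26 equally likely letters)
print(uniform_bits(len(STRING_1), 16))  # 168.0 bits (exactly 4 bits per letter)
print(empirical_bits(STRING_1))         # about 160.3 bits (observed letter frequencies)
```

In every case the number is a product of the string plus the chosen parameters; change the parameters and the same string yields a different count.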

Please note that the above 4 affirmative pieces of information about the string are by no means comprehensive.  We could think of another half dozen characteristics of the string without trying too hard.  For example, we could measure the string by looking at the number of characters of a certain height, or those that use only straight lines, or those that have an enclosed circle, or those that use a certain amount of ink, and on and on.  This is not an idle example.  Font makers right up to the present day still take into consideration these kinds of characteristics when designing fonts, and, indeed, publishers can be notoriously picky about which font they publish in.  As long as we lay out with reasonable detail the particular parameters of our analysis and agree upon how we are going to measure them, then we can plug in our string, generate a numerical answer and generate additional affirmative statements about the string in question.  And – note this well – it is every bit as meaningful to say “the string requires X amount of ink” as to say “the string requires X bits.”
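
The same point can be sketched in code: once the parameters of a measurement are laid out, any of them can be applied to the string to produce a number.  In the minimal Python sketch below, the letter classes (which letters have an enclosed loop, an ascender, or a descender) are assumptions for a typical lower-case typeface; a different font could call for different sets.

```python
STRING_1 = "kqsifpsbeiiserglabetpoebarrspmsnagraytetfs"

# Assumed letter classes for a typical lower-case typeface (hypothetical;
# the exact membership depends on the font being measured).
ENCLOSED = set("abdegopq")   # letters drawn with an enclosed loop
ASCENDERS = set("bdfhklt")   # letters that rise above the x-height
DESCENDERS = set("gjpqy")    # letters that drop below the baseline


def count_in(text: str, letters: set) -> int:
    """Count how many characters of the text belong to the given class."""
    return sum(1 for ch in text if ch in letters)


# Each measurement system turns the same string into a different number.
measurements = {
    "letters with an enclosed loop": count_in(STRING_1, ENCLOSED),
    "letters with an ascender": count_in(STRING_1, ASCENDERS),
    "letters with a descender": count_in(STRING_1, DESCENDERS),
}

for name, value in measurements.items():
    print(f"{name}: {value}")
```

Each printed line is another affirmative statement about the string, generated by applying an agreed-upon rule from the outside.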

Now, let us take a deep breath and press on by looking back to our statement #4 about the number of bits.  Where did that statement #4 come from?  Was it contained in or represented by the string?  No.  It is a statement that was (i) generated by an intelligent agent, (ii) using rules of the English language, and (iii) based on an agreed-upon measurement system created by and adopted by intelligent agents.  Statement #4 “The string requires one hundred sixty-eight bits” is information – information in the full, complete, meaningful, true sense of the word.  But, and this is key, it was not contained in the artifact itself; rather, it was created by an intelligent agent, using the tools of analysis and discovery, and articulated using a system of encoded communication.

Much of the confusion arises in discussions of “Shannon information” because people reflexively assume that, by running a string through the Shannon algorithm and then creating (by use of that algorithm and agreed-upon communication conventions) an affirmative, meaningful, information-bearing statement about the string, we have somehow measured meaningful information contained in the string.  We haven’t.

Some might argue that while this is all well and good, we should still say that the string contains “Shannon information” because, after all, that is the wording of convention.  Fair enough.  As I said at the outset, we can hardly hope to correct an unfortunate use of terminology and decades of linguistic inertia.  But we need to be very clear that the so-called Shannon “information” is in fact not contained in the string.  The only meaning we have anywhere here is the meaning Shannon has attached to the description of one particular characteristic of the string.  It is meaning, in other words, created by an intelligent agent upon observation of the string and using the conventions of communication, not in the string itself.

Lest anyone remain unconvinced, let us hear from Shannon himself:

“The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point.  Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities.  These semantic aspects of communication are irrelevant to the engineering problem.”* (bold added)

Furthermore, in contrast to the string we have been reviewing, let us look at the following string:

“thisstringrequiresonehundredsixtyeightbits”

What makes this string different from our first string?  If we plug this new string into the Shannon algorithm, we come back with a similar result: 168 bits.  The difference is that in the first case we were simply ascertaining a characteristic about the string.  In this new case the string itself contains or represents meaningful information.
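
One way to see that the measurement is indifferent to meaning is to scramble the meaningful string and measure it again.  The Python sketch below is illustrative only: it assumes a frequency-based model (each letter's probability is taken from the string itself), and the function name and the fixed random seed are hypothetical choices.  Under that assumption, shuffling the letters destroys the sentence but leaves the bit count unchanged.

```python
import math
import random
from collections import Counter

MESSAGE = "thisstringrequiresonehundredsixtyeightbits"


def empirical_bits(text: str) -> float:
    """Bits under the assumption that each symbol's probability equals
    its relative frequency in the text itself."""
    counts = Counter(text)
    total = len(text)
    bits_per_symbol = -sum(
        (c / total) * math.log2(c / total) for c in counts.values()
    )
    return bits_per_symbol * total


# Shuffle the letters with a fixed seed so the run is repeatable.
rng = random.Random(0)
letters = list(MESSAGE)
rng.shuffle(letters)
scrambled = "".join(letters)

print(scrambled)                            # same letters, no readable sentence
print(round(empirical_bits(MESSAGE), 3))    # bit count of the meaningful string
print(round(empirical_bits(scrambled), 3))  # same bit count after scrambling
```

The count depends only on how often each letter appears, not on the order that makes the sentence readable.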

String of Cotton

Now let us consider String 2.  Again, we can create affirmative statements about this string, such as:

1. This string consists of multiple smaller threads.

2. This string is white.

3. This string is made of cotton.

Now, critically, we can – just as we did with our string of letters – come up with other characteristics.  Let’s suppose, for example, that some enterprising individual decides that it might be useful to know how long the string is.  So we come up with a system that uses some agreed-upon parameters and a named unit of measurement.  Hypothetically, let’s call it, say, a “millimeter.”  So now, based on that agreed-upon system we can plug in our string and come up with another affirmative statement:

4. This string is one hundred sixty-eight millimeters long.

This is a genuine piece of information – useful, informative, meaningful.  And it was not contained in the string itself.  Rather, it was information about the string, created by an intelligent agent, using tools of analysis and discovery, and articulated in an agreed-upon communications convention.

It would not make sense to say that String 2 contains “Length information.”  Rather, I assign a length value to String 2 after I measure it with agreed-upon tools and an agreed-upon system of measurement.  That length number is now a piece of information which I, as an intelligent being, have created and which can be encoded and transmitted just like any other piece of information and communicated to describe String 2.

After all, where does the concept of “millimeter” come from?  How is it defined?  How is it spelled?  What meaning does it convey?  The concept of “millimeter” was not learned by examining String 2; it was not inherent in String 2.  Indeed, everything about this “millimeter” concept was created by intelligent beings, by agreement and convention, and by using rules of encoding and transmission.  Again, nothing about “millimeter” was derived by or has anything inherent to do with String 2.  Even the very number assigned to the “millimeter” measurement has meaning only because we have imposed it from the outside.

One might be tempted to protest: “But String 2 still has a length, we just need to measure it!”

Of course, if by having a “length” we simply mean that it occupies a region of space.  Yes, it has a physical property that we define as length, which, when understood at its most basic, simply means that we are dealing with a three-dimensional object existing in real space.  That is, after all, what a physical object is.  That is to say: the string exists.  And that is about all we can say about the string unless and until we start to impose – from the outside – some system of measurement or comparison or evaluation.  In other words, we can use information that we create to describe the object that exists before us.

Systems of Measurement

There is no special magic or meaning or anything inherently more substantive in the Shannon measurement than in any other system of measurement.  It is no more substantive to say that String 1 contains “Shannon information” than to say String 2 contains “Length information.”  This is true notwithstanding the unfortunate popularity of the former term and the blessed absence in our language of the latter term.

This may seem rather esoteric, but it is a critical point and one that, once grasped, will help us avoid no small number of rhetorical traps, semantic games, and logical missteps:

Information can be created by an intelligent being about an object or to describe an object; but information is not inherently contained in an object by its mere existence.

We need to avoid the intellectual trap of thinking that, just because a particular measurement system calls its units “bits” and has unfortunately come to be known in common parlance as Shannon “information,” such a system is any more substantive or meaningful or informative, or inherently contains more “information,” than a measurement system that uses units like “points” or “gallons” or “kilograms” or “millimeters.”

To be sure, if a particular measurement system gains traction amongst practitioners as an agreed-upon system, it can then prove useful to help us describe and compare and contrast objects.  Indeed, the Shannon metric has proven very useful in the communications industry; so too, the particular size and shape and style of the characters in String 1 (i.e., the “font”) is very useful in the publishing industry.

The Bottom String Line

Intelligent beings have the known ability to generate new information by using tools of discovery and analysis, with the results being contained in or represented in a code, language, or other form of communication system.  That information arises as a result of, and upon application of, those tools of discovery and can then be encoded.  And that information is information in the straightforward, ordinary understanding of the word: that which informs and is meaningful.  In contrast, an object by its mere existence, whether a string of letters or a string of cotton, does not contain information in and of itself.

So if we say that Shannon information is “meaningful,” what we are really saying is that the statement we made – as intelligent agents, based on Shannon’s system and using our English language conventions – the statement that we made to describe a particular characteristic of the string, is meaningful.  That is of course true, but not because the string somehow contains that information, but rather because the statement we created is itself information – information created by us as intelligent agents and encoded and conveyed in the English language.

This is just as Shannon himself affirmed.  Namely, the stuff in String 1 has, in and of itself, no inherent meaning.  And the stuff that has the meaning (the statement we created about the number of bits) is meaningful precisely because it informs, because it contains information, encoded in the conventions of the English language, and precisely because it is not just “Shannon information.”

—–

* Remember that Shannon’s primary concern was communication and, more narrowly, the technology of communication systems.  The act of communication and its practical requirements are usually related to information, but they are not the same thing as information.  Remembering this can help keep things straight.

Comments
Eric and Mung: I would reflect again on Shannon's point: “The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem.” Shannon's theory is obviously about communication of a string which is a message. It is true that he does not enter into the debate of why that string is a message. But he assumes that it is the message that we want to communicate. So, the string always has the meaning of being the message we want to communicate. IOWs, it is functional in conveying a desired message. Shannon's theory is about the answer to a simple question: how many bits are needed to communicate that specific message? So, he is measuring functional information (in respect to a communication problem). ID is measuring functional information in respect to a generation problem. It answers the simple question: how many bits are necessary in an object to implement this function? So, in a sense, there is a strict formal similarity between Shannon's theory and ID. That's why Durston has been able to use so naturally Shannon's metric in an ID context. The problem of generating a string which conveys a meaning or implements a function is not essentially different from the problem of communicating a message. In a sense, a designer can be conceived as a conscious being who is trying to communicate a message to another conscious being. The message can be a meaning or a function, but in all cases only another conscious being will be able to recognize it for what it is. Now, the important point that I want to suggest is that neither Shannon's theory of communicating the message nor ID theory of generating the message are really "qualitative". Both are "quantitative" theories.
In a sense, neither deals with the problem of "what is meaning" and "what is function". I will try to be more clear. Shannon assumes that a string is the message, and for him that's all. OK. ID specifies the string as meaningful or functional. For example, in my personal version of the procedure to assess dFSCI, I require an explicit definition of the function for which we measure dFSCI in an object. OK. Shannon generates a partition in the space of all possible communicated strings: this string is the message, all other strings are not. ID generates a partition in the space of all possible generated strings: this set of strings is functional (the target space); all the others are not. At that point, both reasonings are interested only in quantitative measurements: how many bits are necessary to convey the message, how many bits are necessary to implement the defined function. Neither theory really measures the meaning or the function. They only measure the quantitative bits necessary to deal with that meaning/function in a specific context (communicate it/generate it). OK? Let's see an example with English language.

a) One two three four five six seven eight nine ten eleven twelve

b) This is a table in some room in some building in a city street

c) Pi means the ratio of a circle's circumference to its diameter

These three strings have the same length (62 characters) and all are correct statements in English. If we don't consider possible differences in compressibility, we can measure the number of bits which is necessary to communicate or generate each particular "message", and it should be more or less similar. Does that quantitative consideration tell us anything specific about the three different "messages"? No. They are very different, not only because they convey different meanings, but also because those meanings are of very different type and quality. But our "quantitative" theories (both Shannon's theory and ID) are not really dealing with that aspect.
Let's apply the same reasoning to a context that I always use: the origin of functional information in proteins. So, let's say that we have two different protein families, with two different, well defined, biochemical functions (for example, two different enzymes). They may have different length in AAs, but we apply the Durston method and we come up with a similar value for their functional complexity: let's say 200 bits. OK, that's what ID can tell us. But let's say that one protein is essential for survival of the biological being (whatever it is), so that a "knockout" experiment is incompatible with life, while the second protein is inserted in some redundant system, and the consequences of its "knockout" are much less obvious. At the level of the basic informational analysis, ID tells us nothing about that "difference" between the two functions. (Of course, we could apply the analysis to wider systems, but that's another story). Another way of saying that is that the same "informational" complexity can be computed for a Shakespeare sonnet, for a passage from a treatise of mathematics, or for a shopping list. Both Shannon's theory and ID are not dealing with the "quality" of the information (for example, the beauty in the sonnet), but only with the quantitative aspect of communicating or implementing each message, whatever it is, in a physical medium (the string).
gpuccio
October 18, 2014, 07:32 PM PDT
Overall I think I get the point of the OP, which urges caution when speaking about information. However, for the sake of being argumentative and general pita, I will advance the claim that Shannon Information is "in" the string. Where else would the information be derived from? Shannon's theory is about an actual message from a set of possible messages, or put another way, about an actual string from a set of possible strings. And knowing the actual message or string we can say we have received an amount of information (because the receipt of the actual message or string has ruled out the alternatives). So in some way it is the actual string which enables us to calculate the "information content" of the string, but that content is always relative to the non-actualized possibilities. But what about the case where each string proceeding from the source is equally likely? If each string proceeding from a source is equally likely, how can any particular string be informative? It is informative in the sense of being an actualization from a set of the merely potential. Knowing the potential strings does not give us any information about the actual string. Therefore the information must reside in the actual string. Or not. ;)
Mung
October 18, 2014, 06:14 PM PDT
Mung @2: Thanks for the clarification. I agree it need not inform "us". I've updated the OP.
Eric Anderson
October 18, 2014, 05:43 PM PDT
Hi Eric, I'd appreciate your thoughts on the following. The length of a string of cotton could be seen as a property of the string. The units of measurement for this property might be millimeters. The length of the string in millimeter units might be 168. Now take a string of characters. The string might be 168 characters long. But 'character' is not a unit of measurement. So when we say that the string is 168 characters long we are not saying the same thing about the first string as we are saying about the second string when we say it has a certain length. Further, that a string is 168 characters in length doesn't mean it "contains" 168 bits of Shannon information. So we are talking two very fundamentally different things.
Mung
October 18, 2014, 05:43 PM PDT
An instance of information informs a system capable of producing a functional effect.
Upright BiPed
October 18, 2014, 05:30 PM PDT
Eric:
At least one poster on these pages takes regular opportunity to remind us that information must, by definition, be meaningful – it must inform us.
If you're thinking of me I don't mind being called out by name as holding the position that "meaningless information" is an oxymoron and for agreeing with Jan Kahre:
A fundamental, but a somehow forgotten fact, is that information is always information about something.
Perhaps I don't articulate my position as well as I ought, but I think it is an eminently reasonable one. :) I don't know that I would go so far as to claim that information must inform "us."
Mung
October 18, 2014, 05:21 PM PDT
I think that information is the wrong metric. I think that meaning is the point. As long as we keep talking about information, rather than meaning, I think we'll be playing loop-de-loop with the loopies forever. Consider the following tale: A man flips a coin, it lands tails, tails, heads, tails. He waits a bit and flips again, this time he flips tails, tails. After another pause he flips tails, head, tails. After a last pause he flips tails. Another man yells in a loud voice, "Everybody please clear the building!" They asked him why he "yelled fire in a crowded theater." He responded, "I am a ham." This is a tale of meaning.
Moose Dr
October 18, 2014, 04:52 PM PDT