Monday, March 23, 2009

Darwin Bicentenary 15: A critique of the mystique caused by the information conflation

This post was going to be a discussion of William Dembski on “Active Information and Intelligent Design” (which I’ll now have to postpone). It has turned into a digression on the intellectual hazards of the concept of information. Before I get back to active information and ID I am going to look at this concept and try to flag some of the pitfalls.

Shannon’s technical definition of information certainly helps to clarify the situation. However, whether we are talking about Shannon information or a more informal concept, in general information is an extrinsic property of a system; that is, a system possesses information by virtue of its relation to another system capable of reading and responding to that information. For example, a digitized computer program on a CD is information only inasmuch as it contains patterns that can be used by a computer as instructions. Likewise, a book only contains information by virtue of its relation to human beings who have the power to correctly interpret the black and white patterns in the book. In both cases, without the potential to invoke the appropriate reactions from computer and human beings respectively, the black and white patterns of CD and book are devoid of any informative content; they remain just patterns. The general lesson here is that the informational property of a pattern is not intrinsic to the pattern itself, but exists extrinsically by virtue of the pattern’s relation to other complex systems capable of interpreting the patterning, thus bestowing upon it the property of being information. Without this potential relation to computers and humans respectively, the CD and book contain nothing at all apart from patterning. In a word, information is not a standalone property.

It is this extrinsic property of information that causes a real headache for the analyst who is trying to get to the bottom of the evolution/ID debate. ID theorists, in particular, are very fond of the idea of information, and this I think sometimes clouds the issue. Why? Because if an ID theorist is talking about the information in a structure, there is the implicit suggestion of relatedness either within the structure itself or, and this is where it gets tricky, to some system beyond the structure that reads and reacts to the patterns of the structure in some way. This latter case is an open-ended system, and for the analyst who likes to isolate systems under study, open systems leave him with inconvenient wildcards. This situation can be resolved, however, if the analyst is then allowed to include the system that bestows the property of information: for example, if the analyst is handed a digitised program along with the computer that uses it, he then has a self-contained system and can get to work. But the real problems come when the analyst is just told “This structure contains information” but is given no indication of what kind of external system it is information for. In particular, if there is a hint that the destination of the information is something as complex as a human-level reader, then the analyst has his work cut out trying to crack the code. In fact he may not be able to proceed at all, as there may be too many cultural unknowns.

Shannon’s formal definition of information does help alleviate the problems somewhat, but his definition was contrived, needless to say, with communication in mind, and therefore the extrinsic properties derived from relatedness are built into it implicitly. But even Shannon’s definition, if we forget that it was contrived with a third party in mind, can come up with some surprising results. Firstly, Shannon’s definition makes use of the probability of an outcome. This use of probability should itself be treated with caution because of the philosophical contentions surrounding the nature of probability; namely, the question of whether probability is an objective propensity or a subjective property bound up with knowledge (see my paper here). Thus, for example, if probability is an objective propensity it really does look from Shannon’s definition as though “information” is a property intrinsic to a system regardless of its relatedness. However, I hope the following illustrations of the use of Shannon information will settle the question in favour of information being an extrinsic property bound up with an observer’s knowledge.
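To make the definition concrete, here is a minimal sketch (my own illustration, not a quotation of Shannon): the self-information of an outcome with probability p is −log₂ p bits, and the entropy of a distribution is the expected self-information over its outcomes.

```python
import math

def self_information(p):
    """Shannon self-information of an outcome with probability p, in bits."""
    return -math.log2(p)

def entropy(probs):
    """Expected information (entropy) of a distribution, in bits."""
    return sum(p * self_information(p) for p in probs if p > 0)

# A fair coin toss: for an observer who does not yet know the result,
# each outcome has probability 1/2 and so carries exactly one bit.
print(self_information(0.5))   # 1.0
print(entropy([0.5, 0.5]))     # 1.0
```

Notice that p here is the probability *as assigned by some observer*, which is precisely where the relational character sneaks in.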

Example 1: Tossing a coin. Informally speaking, a sequence of coin tosses carries no information because we don’t think that it encodes any secret messages. But in Shannon’s technical sense the tosses of a coin do carry a lot of information, because at each throw an attentive observer passes from complete ignorance about the outcome to full knowledge of the outcome. Once the outcome has happened, however, it loses its information “content” because the attentive observer is no longer ignorant about the outcome, and therefore the outcome loses its power to inform him. Thus in prospect a sequence of heads and tails is packed with information, but in retrospect, for the attentive observer, it has no information. Now, let’s imagine that our first observer subsequently transmits his record of the sequence of throws to a second observer over a transmission line. The second observer is effectively observing the sequence of heads and tails via the first observer’s signal. Before the second observer receives the signal he is in the same position of ignorance about the sequence of heads and tails that the first observer was in before he observed the sequence. Thus prior to receiving the communication from the first observer, the second observer is faced with a sequence that is still information rich; but only up until he receives and registers the signal from the first observer. This illustration shows how the same pattern of throws can pass from a sequence full of information to one of zero information, back to one full of information, depending on the state of knowledge of the observer. “Information” is an observer-relative quantity.
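The observer-relativity can be expressed numerically in a small sketch (my own illustration): for an observer who has not yet seen the tosses, a run of n fair throws has probability (1/2)ⁿ and so carries n bits; for an observer who has already registered the outcome, its probability is 1 and it carries nothing.

```python
import math

def sequence_information(n_tosses, already_known=False):
    """Information (in bits) that a run of fair coin tosses holds for a
    given observer: n bits while the outcome is still in prospect, 0 bits
    once it is known (its probability, for that observer, is then 1)."""
    p = 1.0 if already_known else 0.5 ** n_tosses
    return -math.log2(p)

print(sequence_information(10))                      # 10.0 bits for the ignorant observer
print(sequence_information(10, already_known=True))  # 0.0 bits after observation
```

The same physical pattern of throws yields different numbers depending only on which observer we plug in, which is the whole point of the example.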

Example 2: Crystal lattice. I have heard an ID theorist claim that crystal lattices contain no information. The response to that is “yes and no”. “Yes” because in the Shannon sense, which is probably the sense intended by the ID theorist, the crystal contains very little information if the observer knows the simple periodic algorithm that can be used to generate the lattice points. Under these circumstances the observer can’t learn any more from observation of the crystal itself; ergo a crystal lattice, under these circumstances, contains no information. But there are other circumstances where the lattice can “contain” information. If the observer knew that he was dealing with some kind of crystal lattice, that in itself would be very informative: he would at least know that the structure was highly organised, and that would eliminate an overwhelmingly large number of disordered configurations. However, that still leaves a large number of possible regular lattices, and the only way our observer is going to find out which one he has on his hands is by interrogating the actual lattice in order to discover which particular periodic algorithm can be used to generate the lattice points; while he is trying to do this the lattice contains the information he is after, but once he has worked out the algorithm then, and only then, would the lattice contain no information. We conclude that a crystal lattice can contain some information, depending on the state of knowledge, or lack of knowledge, of the observer. (However, I stress the word “some” here because it wouldn’t take long for the observer to work out the pattern and thus for it to cease being a learning experience.) Note what’s happened here: the information “in” the lattice is not actually in the lattice itself; the information is, as I have said, an extrinsic property bestowed upon it by the observer’s state of knowledge.
The extrinsic nature of information means that the information “content” of a lattice is a relative and variable property.
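A rough way to put a number on that “some” (a hypothetical illustration of my own, not a claim about any particular crystal): if the observer already knows he faces one of a finite set of candidate lattice types, say the 14 Bravais lattice types of crystallography, then identifying which one he has yields only log₂ of the number of candidates.

```python
import math

# Hypothetical scenario: the observer knows he has one of the 14 Bravais
# lattice types but not which. Interrogating the lattice can tell him at
# most log2(14) bits -- a little under 4 bits, quickly exhausted.
candidates = 14
bits = math.log2(candidates)
print(round(bits, 2))  # 3.81
```

Once those few bits have been learned, the lattice has nothing further to say to that observer.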

Example 3: The Mandelbrot Set. This set is a very rich pattern of variation that at first sight one might think clearly contains an enormous amount of information. But the answer to that is both “yes and no”. An observer armed with the Mandelbrot algorithm effectively knows the state of all the pixels; hence he can predict the state of each pixel just as he could for a simple lattice. Thus, on these terms the Mandelbrot set contains no information! But think again: what if the analyst threw the algorithm away and forgot about it, or what if he were culturally medieval? The Mandelbrot set would then be a source of endless surprise to him and would in Shannon’s terms be information rich. So even though the pattern has a degree of regularity, meaning that from inspection the analyst may be able to better his guesses as to what each pixel contains, without the Mandelbrot algorithm the set would remain a source of information.
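To see how short the generating rule really is, here is a minimal sketch of the standard escape-time test (a generic illustration, not from the original post). An observer holding these few lines can, in principle, predict every “pixel”, which is why the set carries no Shannon information for him:

```python
def in_mandelbrot(c, max_iter=100):
    """Escape-time test: True if c appears to lie in the Mandelbrot set,
    i.e. iterating z -> z*z + c from z = 0 has not escaped |z| > 2."""
    z = 0
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return False
    return True

# The whole intricate pattern is determined by this tiny rule:
print(in_mandelbrot(0))   # True  (0 stays at 0 forever)
print(in_mandelbrot(1))   # False (1 -> 2 -> 5 -> ... escapes)
```

The contrast between the tiny rule and the apparently endless detail it generates is exactly the “yes and no” of the example.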

Example 4: Random sequence generator. Random number generators can generate very disordered patterns of 1s and 0s that are random to a reasonable approximation. Such a pattern of 1s and 0s, statistically speaking, will be all but indistinguishable from a sequence generated by the tosses of a coin. So does such a sequence contain information? Once again it depends on the observer. If the observer knows the algorithm then the pattern contains as little information as a crystal lattice. But someone coming to the sequence without knowledge of the algorithm would find it difficult to distinguish from the tosses of a coin, and it would then contain a lot of information.
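A quick sketch makes the point (my own illustration, using Python’s seeded `random.Random` as a stand-in for any pseudo-random generator): the output looks like coin tosses, yet an observer who holds the algorithm and the seed can reproduce it exactly, so for him it carries no information.

```python
import random

def prng_bits(seed, n):
    """Generate n pseudo-random bits from a deterministically seeded generator."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in range(n)]

# To an outsider the bits look like coin tosses; to an observer with the
# seed and algorithm the whole sequence is fully predictable:
print(prng_bits(42, 8) == prng_bits(42, 8))  # True -- zero surprise for him
```

The sequence itself hasn’t changed between the two observers; only their states of knowledge differ.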

Example 5: DNA. It seems indisputable that DNA clearly contains a lot of information. But the more we get to know about, say, the human genome, the less informative it becomes and therefore the less information it contains. Of course, that doesn’t apply to the ribosomes that construct proteins: they know nothing at all about the DNA until they get a message from the nucleus where the DNA resides. Thus as far as the ribosomes are concerned DNA always contains a full quota of information.

Unless we are aware that information is an extrinsic property of a pattern, a property defined by the pattern’s relation to an information user, then we are liable to fall for the mystique of information: it will appear to us as some mysterious, almost vitalistic property of a system that we can’t quite put our finger on; a ghostly lurking presence of unknown power that makes the analyst recoil in fear of the unknown.

In truth information arises out of the epistemic relation an observer has with the system under study, and thus the observer, via his measure of knowledge, effectively becomes a variable in the system that we are trying to express quantitatively using the concept of information. Unless we are aware of the extrinsic nature of information, the state of the observer can become inconveniently entangled with the system under study. When we discuss a system in information terms it is easy to conflate observer and observed, and thus strange, counter-intuitive statements can be made about a system if we don’t realize that we are actually talking about a joint system (e.g. “DNA contains no information”). This, I suspect, will have a bearing when we go on to consider the work of William Dembski who, to quote one of his detractors, is “some big shot intellectual”.

Characters of the Wild Web Number 3: Billy the DembskID – Intellectual sharp shooter or a peddler of disinformation?
