Friday, August 14, 2009

Stephen Wolfram and Cellular Automata Part 2

At the constructive suggestion of a Paul Abbots who commented on my first post on Wolfram’s cellular automata, I have pushed the boat out a little further by reading some of Wolfram’s NKS online, viz. the sections on universality and computational equivalence. Paul was responding to this comment in my post:

On the subject of computational equivalence Wolfram is less compelling. Simple processes can produce the high complexity of disorder, but simple processes, which are relatively small in number, merely sample a very small part of the huge class of complex and disordered configurations. The only way of reaching this huge class is to start with complexity. Accordingly I am not sure I know what Wolfram means when he talks about a “maxing out” of complexity. Complexity of process seems to have no upper limit. Perhaps I do not understand Wolfram aright but it seems to me that there is a huge platonic realm out there of highly complex algorithms, whose rules are beyond our ken, algorithms whose computational resources are too great to be humanly manageable.

So how does the above compare with Wolfram’s notion of Computational Equivalence?

Having read bits of NKS online I think I now have a much clearer idea of Wolfram’s notion of computational equivalence. This is how I currently see it: according to Wolfram there is a class of computations based on simple rules that generate complex “baroque” patterns; see for example those simple cellular automata that generate complex patterns like the one illustrated. The members of this class, Wolfram conjectures, are by and large computationally equivalent in that, at some point in the sequence of patterns generated by these computations, each and every possible computation is carried out. That is, provided you wait long enough for one of these simple rule systems to do its work, it will generate the conditions that, in conjunction with further operations of the rule system, carry out any identified computation. If Wolfram’s conjecture is correct then there is an obvious sense in which any selected baroque computation can’t do better than any other; for all of these baroque computations (it is conjectured) eventually create the conditions which, when worked on by further application of the rules, effectively carry out any identified computation. Hence I now see what Wolfram means by a “maxing out” of complexity: each and every baroque computation potentially embodies all computations and therefore can’t do any better.
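To make the kind of system being talked about concrete, here is a minimal sketch (my own, in Python) of an elementary cellular automaton of the sort Wolfram studies. Rule 30 is my choice purely for illustration, as a simple rule whose output looks complex and “baroque” when started from a single black cell.

```python
# Minimal elementary cellular automaton of the kind Wolfram studies.
# Rule 30 is used here purely as an illustration of a simple rule that
# produces a complex-looking pattern from a single black cell.

def step(cells, rule=30):
    """Apply an elementary CA rule once to a row of 0/1 cells (wrap-around edges)."""
    n = len(cells)
    new = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right   # neighbourhood as a 3-bit number
        new.append((rule >> index) & 1)               # look up that bit of the rule number
    return new

def run(width=79, steps=40, rule=30):
    row = [0] * width
    row[width // 2] = 1                               # single black cell in the middle
    for _ in range(steps):
        print("".join("#" if c else "." for c in row))
        row = step(row, rule)

if __name__ == "__main__":
    run()
```

Running it prints a growing triangle of cells much of whose interior looks disordered, which is the kind of pattern the paragraph above has in mind.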

The sting in the tail here is that Wolfram’s conjecture may never be proved, for the simple reason that the computation needed to prove it may be too long. However, thinking about it, computational equivalence does have some support in intuition. Chaotic computations, like random number generators that are “baroque” enough not to contain periodicities, will forever generate novelty of pattern. In effect they are (unsystematically) ringing the changes, working their way through the entire space of possibilities, and so one expects that, given enough time, they will visit every computation that is possible. The same even seems to be true of a simple counting procedure: given time it will generate a string that, appropriately decoded, represents some computation. Put like this, though, Wolfram’s equivalence conjecture seems almost trivial. However, according to Wolfram computational equivalence is always a little bigger and more profound than our articulations of it: “Computational equivalence is truer than any proof of it or illustration of it”.
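The counting intuition can be made a little more tangible with a toy enumeration of my own (nothing from NKS): simply counting through all finite binary strings in length order will, given enough time, emit any string you care to name, including strings that decode to descriptions of computations.

```python
from itertools import count, product

# A simple counting procedure: enumerate every finite binary string in
# length order.  Read as program descriptions, every possible description
# appears somewhere in this stream -- the "given enough time" intuition.

def all_binary_strings():
    for length in count(1):
        for bits in product("01", repeat=length):
            yield "".join(bits)

# Print the first few strings of the enumeration.
gen = all_binary_strings()
for _ in range(10):
    print(next(gen))
```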

According to Wiki’s NKS page some critics have suggested that Wolfram’s conjecture is equivalent to the Church-Turing thesis. This I don’t see; the Church-Turing thesis affirms the universality of certain models of computation; that is, it affirms that certain models of computation can carry out any computation if appropriately programmed. But Wolfram seems to be saying something that at first sight is surprising; namely that a particular computation (and not a particular computational model), provided it is baroque enough, is potentially universal. That surprising result appears not to follow from the Church-Turing thesis, a thesis which concerns the universality of computational models and not the universality of particular computations.

Now we come to my own comment quoted above. Below I explain the intended content of this comment, which I can now see doesn’t challenge Wolfram’s thesis of computational equivalence. The following really represents my own informal grasp of the kind of work being carried out formally by people like Gregory Chaitin. (Caution: my understanding here is still at the hand-waving stage.)

A computation is the product of two resources: the starting program string (which encodes data and the rules of operation) and time (which equates to computational steps). Hence, a given computational result can be expressed as the product of these two resources, program and time, thus:

Program + Time = Result.

This equation is not rigorous or literal mathematics but rather a metaphor, one that illustrates the complementary relation of program string and time: if we want to shorten the time needed to arrive at a given result we can do so by increasing the length of the program string, effectively giving the computation more information. Conversely, if we shorten the program string, depleting it of information, this has the effect of increasing the time needed to generate the designated result. Together, program string and time must “sum” to the “value” designated as “result” in the above “equation”. (It is assumed here that “memory space” is unlimited.) There is a loose analogy here with data compression: the less compressed the data, the less time needed to decompress it, whereas the more compressed the data, the greater the time needed to decompress it. The size of the combined resources of program string and time can be thought of as a measure of the computational complexity of the result.
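A toy illustration of this trade-off, entirely my own and not meant rigorously: the same result can be obtained from a long “program” that simply stores the answer and takes almost no steps, or from a short program that reconstructs it step by step.

```python
# A toy illustration (my own) of the program/time trade-off.  Both "programs"
# produce the same result, the first 20 Fibonacci numbers, but one spends its
# resources on program length and the other on computation steps.

# Long program, short time: the result is written into the program string itself.
def fib_table():
    return [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144,
            233, 377, 610, 987, 1597, 2584, 4181]

# Short program, longer time: the result is reconstructed step by step.
def fib_compute(n=20):
    seq, a, b = [], 0, 1
    for _ in range(n):
        seq.append(a)
        a, b = b, a + b
    return seq

assert fib_table() == fib_compute()
```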

Of particular interest to human beings are results that are a product of manageable resources; that is, program strings and computation times of humanly convenient length. In this connection it is clear from such things as random number generators and Wolfram’s cellular automata that configurational complexity need not be particularly computationally complex; that is, configurational complexity can be the output of computational simplicity, simple in terms of manageable program strings and computation times. Thus, there is a class of configurationally complex structures that can be “compressed” into elementary computing resources.
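As a rough, informal check of this point (my own, using zlib’s compressed size as a crude stand-in for configurational complexity), a few lines of code can emit output that a general-purpose compressor treats as essentially incompressible, even though the generating program is tiny:

```python
import zlib
import random

# Crude check: a very short program can emit output that a general-purpose
# compressor cannot shrink.  Compressed size is only a rough proxy for
# configurational complexity.

periodic = ("01" * 5000).encode()              # an obviously simple, periodic pattern
random.seed(0)                                 # a tiny, fully deterministic program...
baroque = bytes(random.getrandbits(8) for _ in range(10000))  # ...with disorderly output

print(len(zlib.compress(periodic)))   # compresses down to a few dozen bytes
print(len(zlib.compress(baroque)))    # stays close to its original 10000 bytes
```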

Given a particular programming model it is clear from combinatorial considerations alone that the number of short programs is very small compared to the enormous size of result space. Thus, if we stipulate that short programs are only allowed to map to this space via links that consist of a realistic number of computation steps, then it follows that the very limited supply of short programs reaches only a very small sample of results. However, if we put no limit on the number of computation steps, switching our programs into nonstop production mode and allowing them to generate very long sequences of results, we move into a realm where Wolfram’s computational equivalence may apply. Thus, from computational equivalence we deduce that any result can be reached from most short program strings that have a “baroque” output, but the colossal size of result space ensures that the majority of results will only be arrived at after a prohibitively long time.
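A back-of-envelope count makes the scale of the gap explicit; the particular lengths here (100-bit programs, 1000-bit results) are arbitrary choices of mine.

```python
# The supply of short programs is dwarfed by the space of possible results.
# Lengths are arbitrary, chosen only to show the scale of the gap.

program_bits = 100
result_bits = 1000

short_programs = 2 ** (program_bits + 1) - 2   # all non-empty programs of at most 100 bits
results = 2 ** result_bits                     # all possible 1000-bit result configurations

print(f"short programs     : {short_programs:.3e}")
print(f"possible results   : about 10^{len(str(results)) - 1}")
print(f"best-case coverage : {short_programs / results:.3e}")
```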

As Gregory Chaitin points out, a theory is a successful means of computing the complexity we see around us, but such theories only have human utility if they use humanly tractable resources; namely, relatively short program strings and computation times. This is a form of data compression, and a theory’s ability to compress data into simple computing resources is in part the human rationale for theorizing. A good theory is a way of describing the complexity we see around us using minimal computational resources. On the other hand, if we apply computational equivalence then trivially any observed configuration could in principle be the product of a simple computation, but it is unlikely to be a computation with a tractable number of steps. In fact, if computational equivalence holds then any observed complexity could be the product of any of a large class of “baroque” computations that have been given enough time to do their work. Thus, computational equivalence leaves us with an ontological ambiguity: when we allow long computation times, any of a large class of long computations could do the job of describing the cosmos.
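In this spirit, here is a deliberately simple caricature, my own and not Chaitin’s formalism, of a theory as a short program that regenerates the observations it explains:

```python
# Caricature of a theory as a program that reproduces the observations.
# The "observations" are a list of numbers; the "theory" n**2 regenerates
# them from far fewer symbols than it takes to write the data out verbatim.

observations = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100,
                121, 144, 169, 196, 225, 256, 289, 324, 361]

theory = lambda n: n ** 2                  # the whole "theory", a few characters long

assert observations == [theory(n) for n in range(len(observations))]
print("theory length :", len("lambda n: n ** 2"), "characters")
print("data length   :", len(str(observations)), "characters")
```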

On the other hand, this ontological ambiguity does not apply to low resource computations (that is, short program strings and computation times): if we succeed in describing the complexity we see around us using a low resource computation, it follows that we have found a very rare case indeed; that is, a rare mapping from a low resource computation to the specific configurational complexity found in the cosmic setting. In terms of real descriptive power and succinctness of resource, not all computations are equivalent.

…to be continued.
