Here I continue with my analysis of, and comments on, this post by Joe Felsenstein and Tom English (FE for short) at Panda's Thumb. Part 1 of this series can be seen here.

A priori sophisticated targeting methods and a distant object: Is this all-American paradigm for hitting targets the right one for evolution?
In their paper here, North American Intelligent Design (ID) theorists Dembski, Ewert and Marks (DEM for short) use the idea of an algorithmic search looking for targets. The target configurations at the back of their minds are, no doubt, the configurations of life. However, according to FE, DEM use a model in which the actual target sought by the algorithm is immaterial:
DEM have a “target” for which the search is searching.
Except that they don’t actually require that the “search” actually search for
something that makes sense. The target can be any set of points. If each point
is a genotype and each of them has a fitness, the target can be genotypes with
unusually high fitnesses, with unusually low fitnesses, mediocre fitnesses, or
any mixture of them. They do not have to be points that are “specified” by
fitness or by any other criterion. DEM do not require that the “search” even
consider the fitnesses. They calculate the fraction of all N points that are in the target. If |T| is the size of the target, dividing it by the number of points in the space, N, we get p = |T|/N. This of course is also the probability that a random point drawn uniformly from the space hits the target.
What I take DEM to be saying, then, is that their result is very general: whatever the target, whether it be the configurations of life or something else, in terms of computational complexity the conclusion is the same for small targets (i.e. relatively small |T|); namely, that they are extremely hard to find, whatever they may be. Jumping ahead a bit, we might therefore conclude that if an algorithm is to find a specified target in reasonable time and with reasonable probability, that algorithm must be suitably provisioned with the right up-front information to do so. Right? Well, that's one solution to the problem. (More about that question another day.)
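To put a rough number on "extremely hard to find": under blind uniform sampling, the chance of hitting the target at least once in k draws is 1 - (1 - p)^k, which for small p is approximately kp. The Python sketch below is my own illustration (not DEM's code), and the space size, target size and number of draws are arbitrary assumptions chosen only to make the point:

```python
import math

# Toy illustration (my own, not DEM's code): how often does blind uniform
# sampling hit a small target? All the numbers below are arbitrary assumptions.
N = 4 ** 50      # size of the configuration space, e.g. genotypes of 50 bases
T = 10 ** 6      # size of the target set |T|
k = 10 ** 12     # number of uniform random draws we can afford

p = T / N        # probability that one draw hits the target: p = |T|/N

# P(at least one hit in k draws) = 1 - (1 - p)^k, computed stably for tiny p
p_hit = -math.expm1(k * math.log1p(-p))

print(f"p per draw                   = {p:.3e}")      # ~7.9e-25
print(f"P(some hit in {k:.0e} draws) = {p_hit:.3e}")  # ~ k*p, still ~8e-13
```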
One of DEM's important intermediate conclusions is expressed by FE as follows (my emphasis):
DEM assume that there is a baseline “search” that does
not favor any particular “target”. For our space of genotypes, the baseline
search generates all outcomes with equal probability. DEM in fact note that on average over all possible searches, the
probability of success is the same as if we simply drew randomly (uniformly)
from the space of genotypes.
That is, with respect to the class of all possible searches, the average search will do no better than random chance. This, it seems, is a consequence of the search concept DEM are using. FE describe this concept as follows:
Searches as
distributions on the space of points
DEM consider the probability distribution of all
outcomes of a search. Different instances of the search can find different
results, either because they choose different starting points, or because of
random processes later during the search. They assume very little about the
machinery of the search – they simply identify the search with the distribution
of results that it gets. Suppose that two searches lead to the same
distribution of outcomes, say a probability 0.6 of coming up with point x1,
probability 0.4 of coming up with x12, and probability 0 of everything
else. They consider these two processes to be the same identical search. They don’t
consider what intermediate steps the searches go through. Correspondingly, two
searches that lead to different probability distributions of outcomes are
considered to be different searches. All distributions that you can make can
apparently be found by one or another of DEM’s search processes. From this
point on they talk about the set of possible distributions, which to them
represent the set of possible searches.
Note that this means that they are including
“searches” that might either fail to be influenced by the fitnesses of the genotypes, or even deliberately move away from highly fit genotypes and seek out worse ones. Anything that gets results is a “search”, no matter
how badly it performs.
On DEM's definition, then, a search is identified with the probability distribution it returns over the possible outcomes of the search. If we take the ith outcome, then for a particular search distribution the probability of the ith outcome, Pi, will have a value between 0 and 1. Given that the ensemble of distributions is the set of all possible distributions, selecting a distribution at random effectively makes Pi a random variable.
But this random variable is subject to the constraint:

Σ Pi = 1
Given this constraint it follows that if DEM-type searches are selected at random, with no bias toward any particular outcome, then the varying values of Pi must average out to a value ai which, by symmetry, is the same for every outcome i. We therefore require:

Σ ai = N ai = 1

….where N is the total number of possible outcomes. From this relation it follows that ai, the average value of the distribution for the ith outcome, is 1/N.
This value 1/N is identical to the probability of selecting an outcome uniformly at random. Hence, provided there is no particular weighting or bias placed on the way the searches are selected, it follows, as FE state, that on average over all possible searches, the probability of success is the same as if we simply drew randomly (uniformly) from the space of genotypes. So this result of DEM looks to be intuitively agreeable.
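This 1/N average can be checked numerically. The Python sketch below is my own illustration, and it assumes one natural reading of "selecting a search at random", namely that searches are drawn uniformly from the simplex of all distributions (a flat Dirichlet distribution); with that assumption the average probability of hitting a target of size |T| comes out at |T|/N:

```python
import numpy as np

rng = np.random.default_rng(0)

N = 100            # number of possible outcomes (kept small for speed)
T = 5              # target size |T|; take the target to be outcomes 0..T-1
num_searches = 100_000

# Draw searches uniformly from the simplex of all probability distributions
# over N outcomes (flat Dirichlet) -- an assumption about what "selecting
# a search at random" means.
searches = rng.dirichlet(np.ones(N), size=num_searches)

# Each search's probability of success is the mass it places on the target.
success_probs = searches[:, :T].sum(axis=1)

print(f"average success probability over random searches = {success_probs.mean():.4f}")
print(f"uniform-draw baseline |T|/N                      = {T / N:.4f}")
# The two agree: on average, a random search does no better than a single
# uniform random draw from the space.
```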
Although, as FE tell us, DEM make few assumptions about the mechanics of the search, typically algorithmic searches walk step by step through configuration space from one configuration to the next, where adjacent configurations in the walk are separated by small differences. This kind of configuration space can be modelled as a network of connected nodes, where the walk metric between configurations is the smallest possible change between two configurations. FE approach this walk concept of search by way of a concrete example, namely a network formed from the possible configurations of a genotype of 1000 bases. On this model a particular genotype has as its immediate neighbors all those configurations that can be formed simply by changing any one base of the genotype to any of the three other bases available from the set of four possible bases, A, G, C and T. In this specific example, then, each genotype is directly connected to 3 × 1000 = 3000 near neighbors.
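As a quick sanity check on that count, here is a small Python sketch (my own, not FE's code) that enumerates the one-base mutational neighbors of a genotype and confirms there are three per position:

```python
# Enumerate the immediate (one-base-change) neighbors of a genotype.
# Illustration only: a random 1000-base genotype stands in for FE's example.
import random

BASES = "AGCT"
random.seed(0)
genotype = "".join(random.choice(BASES) for _ in range(1000))

def one_base_neighbors(g):
    """All genotypes differing from g at exactly one position."""
    neighbors = []
    for i, base in enumerate(g):
        for b in BASES:
            if b != base:                        # 3 alternatives per position
                neighbors.append(g[:i] + b + g[i + 1:])
    return neighbors

print(len(one_base_neighbors(genotype)))   # 3 * 1000 = 3000 near neighbors
```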
With this concrete example in mind, a search will consist of a starting genotype plus the subsequent walk through configuration space. If we take a starting genotype and then select a walk at random, the likelihood is that the walk will conform to random-walk statistics and generate a roughly Gaussian probability envelope across the configuration space. It then follows, of course, that targets nearest the starting configuration will likely be found first. However, when searching for a target we must take into account the chance of starting at a configuration in the near vicinity of the target: if starting configurations are selected at random, starting points near the target are less likely to be picked than wider misses. It turns out that the increased chance of finding a target from a nearby starting configuration is exactly offset by the reduced chance of selecting a starting configuration in the near vicinity of the target.
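That exact offset is easy to see in a symmetric toy model. The Python sketch below is my own construction (not DEM's or FE's): it runs a fixed-length lazy random walk on a ring of nodes. Starting next to the target makes a hit far more likely than the uniform baseline, but once the starting node is itself drawn uniformly at random, the overall hit rate collapses back to 1/N:

```python
import random

random.seed(1)

N = 30            # nodes on the ring (an arbitrary, illustrative space)
STEPS = 10        # fixed walk length
TARGET = 0
TRIALS = 200_000

def walk_endpoint(start):
    """Endpoint of a lazy random walk (+1, -1 or stay) of STEPS steps."""
    pos = start
    for _ in range(STEPS):
        pos = (pos + random.choice((-1, 0, 1))) % N
    return pos

# Hit rate starting right next to the target: much better than 1/N...
near = sum(walk_endpoint(1) == TARGET for _ in range(TRIALS)) / TRIALS

# ...but with the start itself drawn uniformly, the advantage washes out:
# by symmetry the endpoint is then uniform over the whole ring.
uniform = sum(walk_endpoint(random.randrange(N)) == TARGET
              for _ in range(TRIALS)) / TRIALS

print(f"hit rate from adjacent start: {near:.4f}")
print(f"hit rate from uniform start : {uniform:.4f}")
print(f"uniform baseline 1/N        : {1 / N:.4f}")
```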
This tradeoff between the chance of finding a search and the chance of the search finding a target is the subject of a general theorem by DEM. This theorem can be expressed as follows:

Probability of finding a search ≤ p/q (p < q)

....where p is the probability of finding the target completely at random and q is the probability of finding the target given the search method in question. It can be seen from this relationship that if we attempt to increase the chance of finding a target by increasing q, we simply decrease the probability of finding the search algorithm itself. I give my 'dime store' proof of this theorem in the first of my Dembski and Felsenstein posts - see here.
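The p/q tradeoff can also be probed numerically. The sketch below is again my own illustration, and it again assumes that "finding a search at random" means drawing searches uniformly from the simplex of distributions (only one way to make that notion precise). It estimates the fraction of random searches that achieve success probability at least q and compares it with p/q:

```python
import numpy as np

rng = np.random.default_rng(42)

N = 100                  # outcomes in the space (illustrative assumption)
T = 5                    # target size; target = outcomes 0..T-1
p = T / N                # success probability of a single uniform draw
num_searches = 200_000

# Searches drawn uniformly from the simplex of distributions (assumption).
searches = rng.dirichlet(np.ones(N), size=num_searches)
success = searches[:, :T].sum(axis=1)   # each search's chance of hitting T

for q in (0.10, 0.25, 0.50):
    frac = np.mean(success >= q)        # chance of "finding" such a search
    print(f"q = {q:.2f}: fraction of searches with success >= q "
          f"= {frac:.4f}  (bound p/q = {p / q:.4f})")
# Raising q (demanding a better search) shrinks the fraction of searches
# that qualify -- the tradeoff the theorem expresses.
```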
***
FE don’t attack
DEM’s mathematical model; I don’t think anyone is doing that. What is at stake
is the significance of DEM’s work and how they use this work to support their
philosophy. In fact the mere
content of DEM’s theorem is intuitively
compelling; systems where organization and constraints are at a minimum - that
is, where there is no a priori bias - will not seek out small targets (such as life) in realistic times. It
follows, then, that if a system is capable of generating life in realistic times it must be provisioned to do so. One (and I repeat, one) way of provisioning the system is to 'front load' it – that is, to give the system what DEM call 'active information' from the outset. In fact, I think you will find that FE would
agree with this conclusion, although they are unlikely to favor the term ‘active information’. But it is when the argument passes on to the
realm of biased systems that FE come into their own as we shall
see. However, we will also see that the
battleground is philosophical dualism. As good dualists they all see it as a choice between natural forces and God. It is that debate which I will be considering next. In due time I also hope to show that 'active information' isn't the only way forward.