What does it mean to compute?
"Birds do it, Bees do it, even educated fleas do it..." - Cole Porter
Most people have a very concrete and very limited idea of what a computer
is. They envision some metal-encased electrical machine with a disk
and mouse attached.
The fact is that a computer need not be electrical in nature.
Computers can be mechanical-a computer was built at MIT completely out
of tinkertoys which could play tic-tac-toe. Computers can be
optically based, relying on light combinations for their innermost
circuits. Computers can be based on quantum waves.
While all of the above types of computers use different types of "life
blood" for their computations, they are all pretty similar in that
they all use logic gates. The fact is that logic gates are not a
necessary building block of a computer. Many AI researchers are
building connectionist computers which use large networks of simple
"neuron-like" objects to do computation. Some researchers are
experimenting with chemical and biological computers where the
computation occurs as a result of a chemical reaction or the
proliferation of a certain strain of unicellular organism.
In some sense, an entire biological system can be seen as a computer. What
does it compute? It computes an(approximate) solution to the "fitness
function". It computes organisms which are able to prosper in a
particular environment. It may seem strange to think of a biological
system as a computer but for about a decade now many AI researchers
and people in industry have been solving problems using this model to
"evolve" algorithms to solve a particular problem.
In a later section on Entropy and Information Theory, we'll see that
we can view the entire physical world as a computer - view the laws
of nature as algorithms, our observations as output and use this to
prove the efficacy of Occam's razor in formulating hypothesis(along
with some other interesting theological results).
A Mathematical Definition of Computation
The fact is the notion of computing is a generalized concept which is
essentially divorced from any physical or biological mechanism. There
are two models of computation which are widely used today-both are
exactly equivalent in power. One is a purely function and logic based
formalism invented by the mathematician Alonzo Church called the
Lambda Calculus. The other is an abstract device formalism invented
by the mathematician Alan Turing called the Turing Machine.
Both of these models of computation were discovered in the first half
of the twentieth century when many mathematicians were concerned with
the question "Can the discovery of mathematical truth be
automated?". Goedel made the amazing discovery that in a
sufficiently complex mathematical system, there exist true statements
which can not be proven. This basically destroys any hope
mathematicians cherished of systematically finding and proving all
true statements.
A similar result is paralleled in the domain of the Turing machine
which I'll go over here. First let me give the description of a
Turing Machine. It has an infinite tape for memory and input and a
single head reading this tape. The head can read the number under it
and write a 0 or a 1. It also has a simple system of control consisting
of a finite number of states. When in a particular state, it can move
the head right or left, write a 0 or a 1 and make a transition to
another state based on the number under the head.
No one has ever been able to build, conceptualize or even recognize in
nature(e.g. the human brain) a computing mechanism which is provably
more powerful than a Turing Machine. When I say more powerful, I
don't mean faster or more efficient, I mean able to solve problems
which the Turing Machine can not. Under this definition, if the
Turing Machine takes 10 million times as long to solve a problem as
some other computer but can still solve the same set of problems, the
two computers are equally powerful.
The label Turing Complete is used to describe a computer or computing
system which is as powerful but no more powerful than a Turing
Machine. It's easy to prove that all modern computers are Turing
complete. This means not only that you can use a Turing machine to
do simple arithmetic or calculate pi to n places but that you could
simulate a word processor, spreadsheet or video game on a Turing
machine. The input would need to be converted to 0's and 1's on the
tape and the output would need to be converted to letters or graphics
but the "computational essence" of the problem could be done by the
Turing Machine.
Now modern computers can be simulated on a Turing machine and neural
networks and biological models can be simulated on modern computers.
Further, various complicated proofs have shown that neural networks and
some biological models can simulate a Turing machine. Hence we know
that neural networks and biological models(e.g. genetic algorithms)
are also Turing complete. Many people believe that all biological and
physical computing systems from the human brain to the entire universe
are at most Turing complete but this is not known for sure. This
speculation is known as Church's thesis.
In a result that paralleled Church's, Turing discovered that there are
certain problems that can not be solved on a Turing Machine. In
particular, the halting problem which is "Given a Turing Machine and
an input, determine if the Turing Machine ever stops when fed that
input". It's not difficult to assign a unique number to each Turing
Machine so there are no practical problems in formatting this problem
but the fact remains that it can not be solved in the general case.
The repurcussions are immediate-it means that on a modern computer as
well we can not write a program which will in general determine if another
program ever stops.
The proof of this is quite simple. Assume there exists a TM call it
M1 which can solve the halting problem. Use this TM to construct a
machine called M2 in the following way: When M2 is given an input x,
it feeds the input (x,x) (i.e. the machine described by x and the
input x) into M1. If M1 answers that Machine x on input x halts, M2
goes into an infinite loop. If it answers that it does not halt, M2
halts. Now consider what happens when we feed the number describing
M2 into M2. If M1 answers "yes" for (M2,M2), M2 loops forever. If M1
answers "no" for (M2,M2), M2 halts. In either case we have a
contradiction and so our original assumption that M1 exists is false.
Because this problem can not be solved in general on the most powerful
known model of computation, we say that it is undecidable.
A more complicated mathematical approach(diagnolization) shows that
the vast majority of all possible problems are undecidable. Lest the
reader find this result depressing, let me say that limit proofs are
often established in computer science to suggest ways of overcoming
the limits. We know that we can not solve these general undecidable
problems but there are at least two possible ways of overcoming these
difficulties. First if we limit the size of the problem to all
instances less than k bits, the problem becomes decidable. Hence
deciding if a program halts on a computer with a set finite memory is
decidable. Second we can relax the constraint that the computer
always answer correctly with 100% probability(as long as it's correct
say 99.9% of the time) and we can give the computer extra information
such as random numbers. There do exist undecidable problems which are
solvable when randomization is allowed.
Before ending this section, I want to mention briefly that there is a
whole heirarchy of abstract machines and corresponding grammars that
increase in computational power up to and including the Turing
Machine. In addition, many, many classes of problems have been
delineated that are solvable for example in polynomial or exponential
time and those whose answers are checkable in polynomial time. One of
the most important open questions in CS today is does P=NP - is the
class of problems solvable in polynomial time equal to the class whose
answers are checkable in polynomial time. We suspect the answer is
no. If the question is answered in the affirmative, it would mean
among many other things that we could derive mathematical proofs in
polynomial time on a TM(in a sense giving computers a profound
mathematical ability that humans seem to lack).