Theoretical Computer Science and You

What does it mean to compute?


"Birds do it, Bees do it, even educated fleas do it..." - Cole Porter

Most people have a very concrete and very limited idea of what a computer is. They envision some metal-encased electrical machine with a disk and mouse attached.

The fact is that a computer need not be electrical in nature. Computers can be mechanical-a computer was built at MIT completely out of tinkertoys which could play tic-tac-toe. Computers can be optically based, relying on light combinations for their innermost circuits. Computers can be based on quantum waves.

While all of the above types of computers use different types of "life blood" for their computations, they are all pretty similar in that they all use logic gates. The fact is that logic gates are not a necessary building block of a computer. Many AI researchers are building connectionist computers which use large networks of simple "neuron-like" objects to do computation. Some researchers are experimenting with chemical and biological computers where the computation occurs as a result of a chemical reaction or the proliferation of a certain strain of unicellular organism.

In some sense, an entire biological system can be seen as a computer. What does it compute? It computes an(approximate) solution to the "fitness function". It computes organisms which are able to prosper in a particular environment. It may seem strange to think of a biological system as a computer but for about a decade now many AI researchers and people in industry have been solving problems using this model to "evolve" algorithms to solve a particular problem.

In a later section on Entropy and Information Theory, we'll see that we can view the entire physical world as a computer - view the laws of nature as algorithms, our observations as output and use this to prove the efficacy of Occam's razor in formulating hypothesis(along with some other interesting theological results).

A Mathematical Definition of Computation


The fact is the notion of computing is a generalized concept which is essentially divorced from any physical or biological mechanism. There are two models of computation which are widely used today-both are exactly equivalent in power. One is a purely function and logic based formalism invented by the mathematician Alonzo Church called the Lambda Calculus. The other is an abstract device formalism invented by the mathematician Alan Turing called the Turing Machine.

Both of these models of computation were discovered in the first half of the twentieth century when many mathematicians were concerned with the question "Can the discovery of mathematical truth be automated?". Goedel made the amazing discovery that in a sufficiently complex mathematical system, there exist true statements which can not be proven. This basically destroys any hope mathematicians cherished of systematically finding and proving all true statements.

A similar result is paralleled in the domain of the Turing machine which I'll go over here. First let me give the description of a Turing Machine. It has an infinite tape for memory and input and a single head reading this tape. The head can read the number under it and write a 0 or a 1. It also has a simple system of control consisting of a finite number of states. When in a particular state, it can move the head right or left, write a 0 or a 1 and make a transition to another state based on the number under the head.

No one has ever been able to build, conceptualize or even recognize in nature(e.g. the human brain) a computing mechanism which is provably more powerful than a Turing Machine. When I say more powerful, I don't mean faster or more efficient, I mean able to solve problems which the Turing Machine can not. Under this definition, if the Turing Machine takes 10 million times as long to solve a problem as some other computer but can still solve the same set of problems, the two computers are equally powerful.

The label Turing Complete is used to describe a computer or computing system which is as powerful but no more powerful than a Turing Machine. It's easy to prove that all modern computers are Turing complete. This means not only that you can use a Turing machine to do simple arithmetic or calculate pi to n places but that you could simulate a word processor, spreadsheet or video game on a Turing machine. The input would need to be converted to 0's and 1's on the tape and the output would need to be converted to letters or graphics but the "computational essence" of the problem could be done by the Turing Machine.

Now modern computers can be simulated on a Turing machine and neural networks and biological models can be simulated on modern computers. Further, various complicated proofs have shown that neural networks and some biological models can simulate a Turing machine. Hence we know that neural networks and biological models(e.g. genetic algorithms) are also Turing complete. Many people believe that all biological and physical computing systems from the human brain to the entire universe are at most Turing complete but this is not known for sure. This speculation is known as Church's thesis.

In a result that paralleled Church's, Turing discovered that there are certain problems that can not be solved on a Turing Machine. In particular, the halting problem which is "Given a Turing Machine and an input, determine if the Turing Machine ever stops when fed that input". It's not difficult to assign a unique number to each Turing Machine so there are no practical problems in formatting this problem but the fact remains that it can not be solved in the general case. The repurcussions are immediate-it means that on a modern computer as well we can not write a program which will in general determine if another program ever stops.

The proof of this is quite simple. Assume there exists a TM call it M1 which can solve the halting problem. Use this TM to construct a machine called M2 in the following way: When M2 is given an input x, it feeds the input (x,x) (i.e. the machine described by x and the input x) into M1. If M1 answers that Machine x on input x halts, M2 goes into an infinite loop. If it answers that it does not halt, M2 halts. Now consider what happens when we feed the number describing M2 into M2. If M1 answers "yes" for (M2,M2), M2 loops forever. If M1 answers "no" for (M2,M2), M2 halts. In either case we have a contradiction and so our original assumption that M1 exists is false. Because this problem can not be solved in general on the most powerful known model of computation, we say that it is undecidable.

A more complicated mathematical approach(diagnolization) shows that the vast majority of all possible problems are undecidable. Lest the reader find this result depressing, let me say that limit proofs are often established in computer science to suggest ways of overcoming the limits. We know that we can not solve these general undecidable problems but there are at least two possible ways of overcoming these difficulties. First if we limit the size of the problem to all instances less than k bits, the problem becomes decidable. Hence deciding if a program halts on a computer with a set finite memory is decidable. Second we can relax the constraint that the computer always answer correctly with 100% probability(as long as it's correct say 99.9% of the time) and we can give the computer extra information such as random numbers. There do exist undecidable problems which are solvable when randomization is allowed.

Before ending this section, I want to mention briefly that there is a whole heirarchy of abstract machines and corresponding grammars that increase in computational power up to and including the Turing Machine. In addition, many, many classes of problems have been delineated that are solvable for example in polynomial or exponential time and those whose answers are checkable in polynomial time. One of the most important open questions in CS today is does P=NP - is the class of problems solvable in polynomial time equal to the class whose answers are checkable in polynomial time. We suspect the answer is no. If the question is answered in the affirmative, it would mean among many other things that we could derive mathematical proofs in polynomial time on a TM(in a sense giving computers a profound mathematical ability that humans seem to lack).