The nature of computing and infeasible machines

If asked what they do, some developers might say "I write code" or "I program computers". But more fundamentally, what is computing really about? After all, Euclid's algorithm for computing the greatest common divisor of two numbers in polynomial time was designed - and executed by hand - thousands of years ago, long before even early mechanical computing devices were available (besides the abacus). For centuries, elaborate bureaucracies have set up processing units and communication channels to spread and delegate work, not unlike modern server farms. Even today, complexity theorists regularly study mythical machines such as nondeterministic machines, alternating machines, nonuniform circuits, oracle machines, and hypercomputers, none of which are practical to construct or even to simulate in general. So why design algorithms for machines we can't build?
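To make that point concrete, here is a minimal sketch (in Python, purely for illustration) of the process Euclid described; the number of steps grows only polynomially in the number of digits of the inputs.

```python
# Euclid's algorithm for the greatest common divisor: repeatedly
# replace the larger number by the remainder of dividing it by the
# smaller one, until the remainder reaches zero.
def gcd(a: int, b: int) -> int:
    while b != 0:
        a, b = b, a % b
    return a

assert gcd(252, 105) == 21
```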

In reality, computing has nothing to do with computers. Computing is fundamentally about solving problems using strictly defined processes under resource constraints. The theory of computing, by design, has little regard for what device you use to carry those processes out.

For example, suppose ten monks in medieval Europe are copying a sacred book of 100 pages, numbered 0 through 99, and just as they finish one of them accidentally knocks the pages all over the floor. How can they rapidly put the pages back in order? One simple process is this: each monk picks up about ten pages and places each page in one of ten piles according to the tens digit of its page number. Then each monk takes one of the piles and sorts it by himself. Finally the piles are stacked in order and they are done - all in just a couple of minutes, far faster than a single monk could sort the whole book alone. In computing terms, this is a parallelized version of radix sort, switching to selection sort or insertion sort for small sublists.
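A minimal, single-threaded sketch of the monks' procedure (the function and variable names are made up for illustration); in the story, each pile would of course be sorted by a different monk in parallel:

```python
import random

def insertion_sort(pages):
    # Sort a small pile in place, the way a monk would: take each page
    # and slide it left until it sits after a smaller page number.
    for i in range(1, len(pages)):
        page = pages[i]
        j = i - 1
        while j >= 0 and pages[j] > page:
            pages[j + 1] = pages[j]
            j -= 1
        pages[j + 1] = page
    return pages

def monk_sort(pages):
    # Bucket by tens digit: one pass of most-significant-digit radix sort.
    piles = [[] for _ in range(10)]
    for page in pages:                 # pages are numbered 0..99
        piles[page // 10].append(page)
    # "Each monk sorts one pile"; concatenating the piles in order of
    # tens digit restores the whole book.
    book = []
    for pile in piles:
        book.extend(insertion_sort(pile))
    return book

scattered = list(range(100))
random.shuffle(scattered)
assert monk_sort(scattered) == list(range(100))
```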

Indeed, good algorithms are often more important than the device you use; for example, a computer sorting a list of a billion integers using bubble sort would probably take more time than a room full of people using the parallelized radix sort described above. On the other hand, some devices can solve problems that others cannot; for example, the very weak deterministic finite automaton (DFA) can't even tell whether its input string contains the same number of zeros and ones, which is a trivial task for a Turing machine, an abstract machine modelling practical computers. This tells us something important: DFAs aren't good enough. In other words, a computer that at its best can only simulate a DFA is fundamentally incapable of solving certain problems. Similarly, theoretical hypercomputers can solve problems that Turing machines cannot.
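To see why counting defeats a DFA but not a machine with memory: deciding whether a string has equally many zeros and ones needs an unbounded counter, which a fixed, finite set of states cannot supply but which a single integer variable provides for free. A quick sketch:

```python
# "Same number of zeros and ones" requires an unbounded counter: no
# finite set of DFA states can track it for all input lengths, but any
# machine with unbounded memory handles it trivially.
def equal_zeros_and_ones(s: str) -> bool:
    balance = 0
    for ch in s:
        balance += 1 if ch == "1" else -1
    return balance == 0

assert equal_zeros_and_ones("0101")
assert not equal_zeros_and_ones("0111")
```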

Unfortunately, once machines become powerful enough, they quickly all become equivalent with respect to the problems they can solve. For example, deterministic, multitape, random-access, probabilistic, nondeterministic, and quantum Turing machines all solve exactly the same set of problems. Why go to the bother of defining all these different machines, then? There are two reasons:

  1. Certain devices make solutions to certain problems much easier to describe. For example, problems that are easy to solve by exhaustive search are also easy to express as algorithms that run on nondeterministic machines. Problems like primality testing and undirected graph connectivity are much easier to solve with known algorithms when we have the ability to generate random numbers (see the Miller-Rabin sketch after this list).
  2. Some devices can solve more problems than others using the same amount of resources, or solve the same problems using fewer resources. As a simple example, a machine with random access can search an ordered list in logarithmic time using binary search, whereas an ordinary Turing machine with only sequential access would require linear time (see the second sketch after this list). Shor's algorithm can factor integers on quantum computers in polynomial time, something we don't know how to do on classical machines.
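As an illustration of the first point, the Miller-Rabin test decides primality with high confidence using nothing more exotic than random bases and modular exponentiation; deterministic polynomial-time primality testing (the AKS algorithm) is considerably more involved. A minimal sketch:

```python
import random

def is_probable_prime(n: int, rounds: int = 20) -> bool:
    # Miller-Rabin: a randomized primality test. If n is composite,
    # each random base exposes it with probability at least 3/4, so
    # 20 rounds leave an error probability below 4**-20.
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # this base proves that n is composite
    return True

assert is_probable_prime(2**61 - 1)       # a Mersenne prime
assert not is_probable_prime(2**61 + 1)   # composite
```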
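And for the second point, binary search achieves its logarithmic bound only because jumping straight to the middle element costs no more than reading the next one; a purely sequential machine would spend linear time just walking to that position. A sketch:

```python
def binary_search(sorted_list, target):
    # Each probe jumps directly to the midpoint, which a random-access
    # machine does in constant time; the search range halves on every
    # step, so at most O(log n) probes are needed.
    lo, hi = 0, len(sorted_list) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_list[mid] == target:
            return mid
        elif sorted_list[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1  # not present

assert binary_search([2, 3, 5, 7, 11, 13], 7) == 3
assert binary_search([2, 3, 5, 7, 11, 13], 6) == -1
```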

We know that the "features" offered by new theoretical devices are useful in describing new algorithms for solving problems, since we designed them for that purpose, but what we'd really like to know is whether they let us solve problems more efficiently than we possibly could without them. This question, about the "power" of devices, is the major stumbling block of modern complexity theory. Even the great unsolved question of whether P=NP can be phrased as a question of power: does nondeterminism allow us to solve problems in polynomial time that we could not solve in polynomial time otherwise? Another question, whether P=RP, asks whether the ability to generate random numbers lets us solve more problems in polynomial time than we could without it. Although also unsolved, many researchers believe that these classes are equal and that randomness adds no real new power. The unsolved question of whether L=P asks whether everything solvable in polynomial time (and hence polynomial space) can also be solved using only logarithmic space. Occasionally we show that two dissimilar devices are equally powerful, as in the famous results that PSPACE=NPSPACE, IP=PSPACE, and NL=coNL, but we have rarely shown unconditionally that one machine is truly more powerful than another (notable exceptions are the separations L ≠ PSPACE and P ≠ EXPTIME, which follow from the space and time hierarchy theorems).

To come back to my original point, why do we design algorithms for machines we can't build? There are many reasons, but some of the most important are:

  1. The algorithm may be more natural to describe on an abstract machine, which we can then simulate on a real machine. By focusing our effort on that one simulation program, we gain the ability to solve any problem the simulated machine can solve (see the sketch after this list).
  2. Understanding what problems a machine can solve helps us learn about the machine and how powerful it is. This helps us determine whether we should invest money and effort in software and hardware simulating those features or whether we can adapt existing machines to solve the same problems just as efficiently.
  3. Like the proof that the halting problem cannot be solved, results about what these machines can and cannot do help us understand which problems are truly unsolvable or intractable, so that we can focus our effort on problems that we're more likely to crack successfully.
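A small example of the first point (the automaton below is made up for illustration): a nondeterministic finite automaton is often the most natural way to describe a pattern, and although we can't run it directly, a deterministic program simulates it simply by tracking the set of states it might be in.

```python
# An NFA accepting binary strings whose second-to-last symbol is 1.
# state -> {symbol: set of possible next states}
NFA = {
    "start": {"0": {"start"}, "1": {"start", "seen1"}},
    "seen1": {"0": {"accept"}, "1": {"accept"}},
    "accept": {},
}
ACCEPTING = {"accept"}

def nfa_accepts(s: str) -> bool:
    # Deterministic simulation: keep the set of all states the NFA
    # could currently be in, and advance the whole set on each symbol.
    current = {"start"}
    for symbol in s:
        current = {nxt
                   for state in current
                   for nxt in NFA[state].get(symbol, set())}
    return bool(current & ACCEPTING)

assert nfa_accepts("0110")      # second-to-last symbol is 1
assert not nfa_accepts("0101")  # second-to-last symbol is 0
```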

More philosophically, a popular saying among mathematicians is that if you were to wake up tomorrow and the universe were gone, you could still practice mathematics - abstract concepts do not depend on any property of our physical universe, even if the specific representations and notations we use do. The same applies to computing, which at its core is a mathematical discipline: even when modern SDKs and classical architectures have become obsolete, the same abstract problem solving we now perform in C++, C#, or Java, we will perform in whatever new programming languages replace them. And if you're ever stuck in purgatory for eternity, you can still quicksort a list of a billion numbers in your head.