next up previous contents
Next: Stack Machines Up: Interpreters Previous: Interpreters

Interpretation Techniques

  Interpretation has several desirable characteristics. Interactive systems benefit from a quick turn-around time and the extensibility of the interpreters themselves. For example, interpretation of commands can begin as soon as the user starts typing. There is no edit, compile, test cycle. Some interpreters can be extended by the user. Forth, for example, allows the definition of new keywords by the user. These new keywords are integrated into the running interpreter and can be used like any other keyword predefined by Forth.

Interpreters are relatively easy to write (when compared to a compiler) and are easy to port. Furthermore, intermediate code representation, bytecode for example, makes ``precompiled'' executables very small. Often much smaller than the corresponding source and even a compiled binary.

The only real drawback usually is execution speed. Simple interpreters are often ten to hundred times slower than the same program written in a language such as C or Pascal and executed as a native binary. Several techniques exist to make interpretation fast. One of the earliest techniques is Bell's threaded code technique [6]. The idea is to preprocess the code and convert language statements into subroutine calls. In principle, this reduces the execution overhead of an interpreter to one or two assembly instructions per interpreted statement.

Indirect threaded code [21] adds a level of indirection and makes the technique more portable. Kogge [49] reviews threaded code techniques and compares them. He finds that performance penalties for threaded code versus direct assembly coding are about 1.2:1 for minicomputers. A paper by Paul Klint [48] compares interpretation techniques. He finds that instruction fetch time of direct threaded code is faster than indirect threaded code. Both threaded code techniques are much faster than a traditional interpreter that uses opcode tables. As the complexity of the interpreted instructions increases; i.e. the subroutines that implement the instruction become larger and more complex, the less relevant the instruction fetch time becomes.

Cint [18], a C language interpreter, shuns threaded code techniques, claiming they are machine dependent. Instead, Cint relies on a minimal (RISC) virtual machine for speed. The instruction set consists of 49 executable instructions and 14 pseudo-operations. The overhead of executing a C program under Cint compared to the execution of a compiled binary, is considerable: On average 29 times slower on a VAX-11/780 and 36 times slower on a Sun-3/75. To avoid the machine dependency problem, Ertl [31] uses a feature of GNU gcc [90] to generate a threaded code interpreter from a C language source. GNU gcc extends the C language and allows labels to be treated as values. Together with GNU gcc's computed goto statement it is possible to write a threaded code interpreter without resorting to machine language. The interpreter is portable to any system that supports GNU gcc.

For our research we will use threaded code techniques. If possible, we use the C language extensions of GNU gcc to remain as portable as possible. At the same time we will adapt and modify R-code so that an interpreter for it can be small and simple (like Cint), and keep the amount of computation done per interpreted instruction high, so the instruction fetch time can be amortized [78].


next up previous contents
Next: Stack Machines Up: Interpreters Previous: Interpreters

Rolf Riesen
Wed Jan 22 22:24:20 MST 1997