RISE: Randomized Instruction Set Emulation

Introduction

From a security point of view, one of the problems with current computer systems is that they are very uniform. When a vulnerability is found, and an attack is devised to take advantage of it, the attacker will have a good probability of succeeding on a large number of systems with almost identical configurations. This uniformity is the result of careful design to make systems more compatible and easier to use.

Given those considerations, the idea of automated diversity has been proposed to increase security. With automated diversity, some amount of randomness is added to a standard system in ways that minimally affect external interfaces, but interfere with attacks designed for "known" system configurations.

Many interfaces could be diversified: system call numbers and addresses, layout of stack, heap, data and text in processes, file systems and so on. The research we present here is concerned with language randomization at the level of the machine code instruction set.

A computer with a "personalized" instruction set will have some resistance to code injection. The amount of protection will depend on how the diversification is performed and how easy it would be for the intruder to discover or imitate a particular set.

Clearly, to change the instruction set of a computer, some access to the hardware is necessary. However, we can demonstrate the usefulness of the idea entirely in software using an emulator. Our choice was Valgrind, an x86 emulator originally intended for memory debugging.

Like most emulators, RISE/Valgrind runs slowly, so at this stage it is mostly a proof of concept. However, we believe that if RISE is used in a more optimized emulator, the performance penalties will not be that high. And of course, there is always the possibility of eventually porting it entirely to hardware.

RISE implementation

Once we settled on the part of the interface we wanted to diversify, several important design decisions had to be made:

How to diversify the language?
When to diversify?
How to adapt the system to the diversified interface?

Diversification mechanism

Diversifying a language can be as complex as creating new representations for data structures, operators, and so on, or as "simple" as encrypting the language and decrypting it at a "safe" moment. We chose an intermediate solution. We XOR the original binary code with a large key, and then modify the "processor" (emulator) to use the correct sub-key when reading the binary input for interpretation.

The key we generate is at least as long as the combined lengths of the legitimately executable pieces of the ELF file. However, it is NOT a one-time pad in the strict sense, as we use a smaller truly random seed to generate the rest of the key (with /dev/urandom).
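The idea of a long key derived from a shorter seed can be sketched as follows. This is only an illustration, not the RISE source: the function names make_mask and xor_bytes are hypothetical, SHAKE-256 is used here as a stand-in for whatever expansion the system's random device performs, and the sample bytes are an arbitrary x86 fragment.

```python
import hashlib
import os


def make_mask(length: int, seed: bytes) -> bytes:
    """Expand a short seed into `length` pseudo-random mask bytes.

    Assumption: SHAKE-256 stands in for the key generator; RISE itself
    draws its key material from /dev/urandom.
    """
    return hashlib.shake_256(seed).digest(length)


def xor_bytes(data: bytes, mask: bytes) -> bytes:
    """XOR each data byte with the mask byte at the same offset."""
    if len(mask) < len(data):
        raise ValueError("mask must be at least as long as the data")
    return bytes(d ^ m for d, m in zip(data, mask))


# Arbitrary sample: push ebp; mov ebp,esp; mov eax,1; pop ebp; ret
code = bytes.fromhex("5589e5b8010000005dc3")

seed = os.urandom(16)                # short, unpredictable seed
mask = make_mask(len(code), seed)    # key as long as the code it covers
scrambled = xor_bytes(code, mask)    # what would sit in memory
restored = xor_bytes(scrambled, mask)  # XOR with the same mask undoes it
```

Because XOR is its own inverse, the same routine serves for both randomizing and de-randomizing, provided the byte at a given offset is always paired with the same mask byte.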

Diversification time

The diversification can be achieved by modifying the source programs, at compilation, at link time, at load time, or at combinations of these and other arbitrary times during the lifetime of a program. Each has its benefits and problems. We decided to randomize (combine with a large "key" using XOR) the ELF binary at load time. This has several advantages: we don't need to store randomized versions, we can use a new "key" at each execution, and we don't need to have access to the original sources.
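A minimal sketch of load-time randomization, under simplifying assumptions: the Segment class and load_randomized function are hypothetical, a binary is modeled as a list of named segments rather than a real ELF file, and only segments flagged as executable are masked.

```python
import os
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    executable: bool
    data: bytes


def load_randomized(segments):
    """Return (in-memory segments, masks) for a single execution.

    A fresh mask is drawn on every call, so nothing randomized is ever
    stored on disk and each run sees a different key.
    """
    loaded, masks = [], {}
    for seg in segments:
        if seg.executable:
            mask = os.urandom(len(seg.data))  # new key for this run only
            data = bytes(b ^ m for b, m in zip(seg.data, mask))
            masks[seg.name] = mask
        else:
            data = seg.data  # non-executable data is left as-is
        loaded.append(Segment(seg.name, seg.executable, data))
    return loaded, masks


on_disk = [
    Segment(".text", True, bytes.fromhex("5589e5c3")),
    Segment(".data", False, b"hello"),
]
in_memory, masks = load_randomized(on_disk)
```

The on-disk segments are never modified, which reflects the advantages listed above: no stored randomized copies, and no need for the original sources.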

System adaptation

An emulator such as Valgrind (as opposed to a pure interpreter) works in passes when executing a binary. If an entry point (an address where a block of instructions starts) has been seen previously and it is still in the cache, no further interpretation is done, and the cached fragment is directly executed on the processor. Otherwise, the emulator reads and stores (maybe in an intermediate representation) several instructions forward, until it can determine a full block, writes the block to the cache (after maybe optimizing and transforming back to native binary), and then sends the block for execution to the processor.

Given that the code is stored in memory already randomized, the best moment to de-randomize it is during the first emulator pass, when the emulator is trying to figure out the shape of the block, which is what we implemented. So we convert back to a representation understandable by our system just at that time, and never modify the randomized code sitting in memory. This "emulator read" time is just before execution on the first pass, and will not happen again unless the cache block gets evicted.
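The interaction between the block cache and de-randomization might be pictured like this. It is a toy model, not Valgrind's actual structure: the Emulator class and fetch_block method are hypothetical, and the block length is passed in directly, whereas a real emulator finds the end of a block by decoding instructions.

```python
class Emulator:
    def __init__(self, memory: bytes, mask: bytes):
        self.memory = memory   # randomized code; never rewritten
        self.mask = mask       # one mask byte per memory byte
        self.cache = {}        # entry address -> de-randomized block
        self.decodes = 0       # how many times a block was de-randomized

    def fetch_block(self, entry: int, length: int) -> bytes:
        """Return the block at `entry`, de-randomizing only on a cache miss."""
        block = self.cache.get(entry)
        if block is None:
            raw = self.memory[entry:entry + length]
            sub_key = self.mask[entry:entry + length]  # sub-key for these addresses
            block = bytes(r ^ k for r, k in zip(raw, sub_key))
            self.cache[entry] = block
            self.decodes += 1
        return block


original = bytes(range(16))                  # placeholder "code"
mask = bytes([0xAA] * 16)                    # fixed mask, for illustration only
memory = bytes(c ^ m for c, m in zip(original, mask))

emu = Emulator(memory, mask)
first = emu.fetch_block(4, 8)    # cache miss: block is de-randomized
second = emu.fetch_block(4, 8)   # cache hit: no further work
```

In this model, evicting an entry from the cache is the only thing that causes the same block to be de-randomized again, and the contents of memory stay randomized throughout.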

How to use RISE

Unfortunately, RISE could not be implemented as an additional Valgrind skin, because too many details had to be hardcoded in the main program. So RISE code is interleaved with Valgrind code. Valgrind provides the user with several really interesting skins and other functionalities, which for the moment we are not explicitly supporting. Therefore, the RISE distribution you will find on this site is very similar to a Valgrind distribution, but there are no skins except for the default (rskin) one, and the main executable and libraries have been renamed to "rise".

To build RISE:

Extract it from the tarball: e.g. tar -xjvf rise-xxx.tar.bz2
Change into the rise subdirectory: e.g. cd rise-xxx
Do the typical build steps:
- ./configure
- make
- make install

More detailed instructions can be found in the README and RISE-README files.

Test it with a simple application:

rise ls -l

To test it with an attack, run a vulnerable application (let's call it vulnapp) with RISE:

rise vulnapp

and then execute the code injection attack.

If the vulnerable application is really vulnerable, if the attack really works (try it without RISE first), and if the attack is a real binary code injection (as opposed to, say, a privilege escalation or a macro attack), then the attack will not be able to carry out its task and the vulnerable application will terminate abnormally.

In our experience, what you will observe is one of the following possibilities:

vulnapp will crash with either an "invalid opcode" or "Segmentation Fault" error: this is the most common result when the attack can avoid the emulator checks and reach the vulnerability.
vulnapp will hang because sometimes randomly formed instructions enter an infinite loop.
vulnapp will not crash, will not hang and will not respond to the attack either. This happens because the emulator does address checks and rearrangements which sometimes make the vulnerability no longer exploitable (without adapting the attack, that is).
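The outcomes above follow from what the emulator actually decodes when control reaches injected bytes. The sketch below is illustrative only: emulator_read is a hypothetical helper, the mask is assumed to cover the whole memory image, and the "shellcode" is a harmless three-instruction fragment.

```python
import os


def emulator_read(memory: bytes, mask: bytes, start: int, length: int) -> bytes:
    """What the emulator decodes: memory XORed with the matching mask bytes."""
    raw = memory[start:start + length]
    key = mask[start:start + length]
    return bytes(r ^ k for r, k in zip(raw, key))


# Legitimate code: push ebp; mov ebp,esp; mov eax,1; pop ebp; ret
legit = bytes.fromhex("5589e5b8010000005dc3")
# Injected bytes: xor eax,eax; mov al,11; int 0x80
shellcode = bytes.fromhex("31c0b00bcd80")

mask = os.urandom(len(legit) + len(shellcode))

# The loader stores legitimate code XORed with the mask. The attacker does
# not know the mask, so the injected bytes land in memory unmodified.
memory = bytes(c ^ m for c, m in zip(legit, mask)) + shellcode

seen_legit = emulator_read(memory, mask, 0, len(legit))
seen_injected = emulator_read(memory, mask, len(legit), len(shellcode))
```

Here seen_legit equals the original code, while seen_injected is the shellcode XORed with random mask bytes, that is, an essentially random byte sequence. Decoding random bytes as x86 instructions is what tends to produce the invalid opcode, segmentation fault, or hang described above.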

Available software and documentation

RISE, as a single tarball, based on a standard Valgrind 2.0.0 distribution.
- I have compiled it on:
  - RedHat 7, 7.1, 7.2, 7.3, 8, and 9. However, on RH 7* it requires a slightly modified version to work, even though it compiles fine.
  - Debian stable and testing (as of September of 2004).
- I have older versions of RISE (available upon request) that work with older RedHat and Mandrake distributions.
- This version has the flag --mask-size=[number in bytes], which allows the use of a smaller mask (the default is the size of the process).
- The Valgrind --trace-children flag has been removed as a command-line option, and it is permanently enabled as all children of a process must be protected.
- The Valgrind --skin flag will only accept "rskin", which is enabled by default as well.
A paper on code diversification, presented at the 10th ACM CCS conference, that reports the results of running RISE.

Acknowledgements

We gratefully acknowledge the partial support of the National Science Foundation (grants ANIR-9986555, CCR-0219587, CCR-0085792, CCR-0311686, EIA-0218262, EIA-0238027, and EIA-0324845).

Contact information

If you decide to give RISE a try, I would really appreciate any feedback. Please write to gbarrantes@ecci.ucr.ac.cr. Thank you!

Gabriela Barrantes

Last modified: April 25, 2006