Natural immune systems provide a rich source of inspiration for computer security in the age of the Internet. Immune systems have many features that are desirable for the imperfect, uncontrolled, and open environments in which most computers currently exist. These include distributability, diversity, disposability, adaptability, autonomy, dynamic coverage, anomaly detection, multiple layers, identity via behavior, no trusted components, and imperfect detection. These principles suggest a wide variety of architectures for a computer immune system.
Automated intrusion response is an important unsolved problem in computer security. A system called pH (for process homeostasis) is described which can success fully detect and stop intrusions before the target system is compromised. In its current form, pH monitors ev ery executing process on a computer at the systemcall level, and responds to anomalies by either delaying or aborting system calls. The paper presents the rationale for pH, its design and implementation, and a set of initial experimental results.
Diversity is an important source of robustness in biological systems. Computers, by contrast, are notable for their lack of diversity. Although homogeneous systems have many advantages, the beneficial effects of diversity in computing systems have been overlooked, specifically in the area of computer security. Several methods of achieving software diversity are discussed based on randomizations that respect the specified behavior of the program. Such randomization could potentially increase the robustness of software systems with minimal impact on convenience, usability, and efficiency. Randomization of the amount of memory allocated on a stack frame is shown to disrupt a simple buffer overflow attack.
An artificial immune system (ARTIS) is described which incorporates many properties of natural immune systems, including diversity, distributed computation, error toler- ance, dynamic learning and adaptation and self-monitoring. ARTIS is a general frame- work for a distributed adaptive system and could, in principle, be applied to many do- mains. In this paper, ARTIS is applied to computer security, in the form of a network intrusion detection system called LISYS. LISYS is described and shown to be effective at detecting intrusions, while maintaining low false positive rates. Finally, similarities and differences between ARTIS and Holland's classifier systems are discussed.
A method for anomaly detection is introduced in which "normal" is defined by short- range correlations in a process' system calls. Initial experiments suggest that the definition is stable during normal behavior for standard UNIX programs. Further, it is able to detect several common intrusions involving sendmail and lpr. This work is part of a research program aimed at building computer security systems that incorporate the
We describe a method for intrusion detection at the level of privileged processes. We present evidence that short sequences of system calls executed by running processes are a good discriminator between normal and abnormal operating characteristics of several common UNIX programs. Normal behavior is collected in two ways: Synthetically, by exercising as many normal modes of usage of a program as possible, and in a live user environment by tracing the actual execution of the program. In the former case we study several types of intrusive behavior; in the latter case, we analyze our results for false positives.
We are designing and testing a prototype distributed intrusion detection system (IDS) that monitors TCP/IP network traffic. Each network packet is characterized by the triple (source host, destination host, network service). The IDS monitors the network for the occurrence of uncommon triples, which represent unusual traffic patterns within the network. This approach was introduced by researchers at the University of California, Davis, who developed the Network Security Monitor (NSM), which monitors traffic patterns on a broadcast LAN. NSM was effective because most machines communicated with few (3 to 5) other machines, so any intrusion was highly likely to create an unusual triple and thus trigger an alarm.
Although successful, NSM has serious limitations. It is computationally expensive, requiring its own dedicated machine, and even then only being able to monitor existing connections every five minutes. Further, the architecture of NSM does not scale: The computational complexity increases as the square of the number of machines communicating. Finally, NSM is a single point of failure in the system because it runs on a single machine. These limitations can be overcome by distributing the IDS over all machines in the network. Distribution will make the IDS robust by eliminating the single point of failure and will make it more flexible and efficient; computation can vary from machine to machine, fully utilizing idle cycles.
The architecture of NSM is not easily distributable. Distributing NSM would require either excessive resource consumption on every machine upon which it was run, or communication between machines. The immune system has interesting solutions to a similar problem of distributed detection. We have designed a distributed IDS based on the architecture of the immune system. This allows the IDS to function efficiently on all machines on the LAN, without any form of centralized control, data fusion or communication between machines. The architecture is scalable, flexible and tunable.
Our IDS depends on several "immunological" features, the most salient being negative detection with censoring, and partial matching with permutation masks. With negative detection, the system retains a set of negative detectors, that match occurrences of abnormal or unusual patterns (in this case, the patterns are binary string representations of network packet triples). The detectors are generated randomly and censored (deleted) if they match normal patterns. Partial matching is implemented through a matching rule, which allows a negative detector to match a subset of abnormal patterns. Partial matching reduces the number of detectors needed, but can result in undetectable abnormal patterns called holes, which limit detection rates. We eliminate holes by using permutation masks to re-map the triple representation seen by different detectors.
We have conducted controlled experiments on a simulation that uses real network traffic as normal, and synthetically generated intrusive traffic as abnormal. Using a system of 25 detectors per machine on a network of 200 machines, one out of every 250 nonself patterns goes undetected, which is a false-negative error rate of 0.004. This number is conservative because intrusions will almost always generate more than one anomalous pattern. The computational impact of 25 detectors per machine is negligible, so performance can be improved by using more detectors per machine: If the number of detectors is doubled to 50 per machine, the error rate reduces by an order of magnitude. These results indicate that holes can be effectively eliminated using permutation masks, and that consequently the IDS can provide comprehensive coverage in a robust, fully distributed manner.
Previously, in the 1996 IEEE Symposium on Security and Privacy, we reported a technique for intrusion detection using sequences of system calls. Although the vision here is the same, this current research differs in the domain of application (network traffic), and draws far more strongly on the immune analogy.
Intrusion detection systems rely on a wide variety of observable data to distinguish between legitimate and illegitimate activities. In this paper we study one such observable -- sequences of system calls in the kernel of an operating system. Using system-call data sets generated by several different programs, we compare the ability of different data-modeling methods to represent normal behavior accurately and to recognize intrusions. We compare the following methods: Simple enumeration of observed sequences, comparison of relative frequencies of different sequences, a rule induction technique, and Hidden Markov Models (HMMs). We discuss the factors affecting the performance of each method, and conclude that for this particular problem, weaker methods than HMMs are likely sufficient.
This paper introduces a method of distributed network intrusion detection which scales with the number of computers on a network and is tunable (the probability of detection can be traded off against overhead). Experiments with real network traffic show that the system has high detection rates for several common routes of intrusion, and low false-positive rates user normal behavior. The method can easily be extended to accommodate dynamically changing definitions of normal behavior (for example, adding a host to the network) and to remember known patterns of intrusion.
The natural immune system has evolved many interesting mechanisms to solve the problem of self-nonself discrimination. An anomaly detection system based upon principles derived from the immune system was introduced in Self-nonself discrimination in a computer. Its main advantages are that it is distributable, local, and tunable. This paper provides an overview of the theoretical, algorithmic and practical developments extending the original proposal. In particular, we present information theoretic results on the detection method, show the possibility of strings that cannot be detected for a given combination of self set and matching rule, present efficient algorithms to generate the detector set, and provide rules of thumb for setting the parameters to apply this method to a real data set.
The problem of protecting computer systems can be viewed generally as the problem of learning to distinguish self from other. We describe a method for change detection which is based on the generation of T cells in the immune system. Mathematical analysis reveals computational costs of the system, and preliminary experiments illustrate how the method might be applied to the problem of computer viruses.
We present new results on a distributable change-detection method inspired by the natural immune system. A weakness in the original algorithm was the exponential cost of generating detectors. Two detector-generating algorithms are introduced which run in linear time. The algorithms are analyzed, heuristics are given for setting parameters based on the analysis, and the presence of holes in detector space is examined. The analysis provides a basis for assessing the practicality of the algorithms in specific settings, and some of the implications are discussed.
This paper examines some of the theoretical foundations of the distributable change detection method introduced in Self-nonself discrimination in a computer, including fundamental bounds on some of its parameters. A short overview is given of the reasoning behind this method, its immunological counterpart and its computer implementation. The amount of information that is lost by splitting a data stream into unordered strings can be estimated, and this estimate can be used to guide the choice of string length. A lower bound on the size of the detector set is derived, based on information- theoretic grounds. The principle of holes (undetectable nonself strings) is illustrated, along with a proof of their existence for a large class of matching rules. The influence of holes on the achievable failure rate is discussed, along with guidelines on how to avoid them.