This chapter describes the behavior of the immune system from an information processing perspective. It reviews a series of projects conducted at the University of New Mexico and the Santa Fe Institute, which have developed and explored the theme "immunology as information processing". The projects cover the spectrum from serious modeling of real immunological phenomena, such as crossreactive responses in animals and the generation of diversity, to computer science applications, especially the attempt to develop an immune system for computers to protect them against viruses, intrusions, and other malicious activities.
We describe an artificial immune system (ARTIS) that incorporates many properties of natural immune systems, including diversity, distributed computation, error tolerance, dynamic learning and adaptation, and self-monitoring. ARTIS is a general framework for a distributed adaptive system and could be applied to many domains. In this paper we apply ARTIS to computer security, in the form of a network intrusion detection system called LISYS. We demonstrate that LISYS is effective at detecting intrusions while maintaining low false positive rates. Finally, we note several similarities between ARTIS and Holland's classifier systems.
This high-level overview discusses the architecture of the immune system, with an emphasis on how various key parts fit together. These key parts include recognition through binding, generation of receptor diversity, adaptation and immune memory, self tolerance, detection and elimination of intra-cellular pathogens, MHC and diversity, and effector selection.
This dissertation explores an immunological model of distributed detection, called negative detection, and studies its performance in the domain of intrusion detection on computer networks. The goal of the detection system is to distinguish between illegitimate behaviour (nonself), and legitimate behaviour (self). The detection system consists of sets of negative detectors that detect instances of nonself; these detectors are distributed across multiple locations. The negative detection model was developed previously; this research extends that previous work in several ways.
Firstly, analyses are derived for the negative detection model. In particular, a framework for explicitly incorporating distribution is developed, and is used to demonstrate that negative detection is both scalable and robust. Furthermore, it is shown that any scalable distributed detection system that requires communication (memory sharing) is always less robust than a system that does not require communication (such as negative detection). In addition to exploring the framework, algorithms are developed for determining whether a nonself instance is an undetectable hole, and for predicting performance when the system is trained on non-random data sets. Finally, theory is derived for predicting false positives in the case when the training set does not include all of self.
Secondly, several extensions to the model of distributed detection are described and analysed. These extensions include: multiple representations to overcome holes; activation thresholds and sensitivity levels to reduce false positive rates; costimulation by a human operator to eliminate autoreactive detectors; distributed detector generation to adapt to changing self sets; dynamic detectors to avoid consistent gaps in detection coverage; and memory, to implement signature-based detection.
Thirdly, the model is applied to network intrusion detection. The system monitors TCP traffic in a broadcast local area network. The results of empirical testing of the model demonstrate that the system detects real intrusions, with false positive rates of less than one per day, using at most five kilobytes per computer. The system is tunable, so detection rates can be traded off against false positives and resource usage. The system detects new intrusive behaviours (anomaly detection), and exploits knowledge of past intrusions to improve subsequent detection (signature-based detection).
We describe an artificial immune system (AIS) that is distributed, robust, dynamic, diverse and adaptive. It captures many features of the vertebrate immune system and places them in the context of the problem of protecting a network of computers from illegal intrusions. The AIS resembles a classifier system in many important ways. Similarities and differences are discussed.
Analogies with immunology represent an important step toward the vision of robust, distributed protection for computers.
Computer use leaves trails of activity that can reveal signatures of misuse as well as of legitimate activity. Depending on the audit method used, one can record a user's keystrokes, the system resources used, or the system calls made by some collection of processes. Preliminary work is presented on the analysis of system call traces, particularly their structure during normal and anomalous behavior, and anomalies are found to be temporally localized. These techniques could eventually lead to an effective, automatic analysis and monitoring system, and might even be extensible to handle other kinds of anomalous behavior.
Natural immune systems provide a rich source of inspiration for computer security in the age of the Internet. Immune systems have many features that are desirable for the imperfect, uncontrolled, and open environments in which most computers currently exist. These include distributability, diversity, disposability, adaptability, autonomy, dynamic coverage, anomaly detection, multiple layers, identity via behavior, no trusted components, and imperfect detection. These principles suggest a wide variety of architectures for a computer immune system.
A method for anomaly detection is introduced in which "normal" is defined by short-range correlations in a process's system calls. Initial experiments suggest that the definition is stable during normal behavior for standard UNIX programs. Further, it is able to detect several common intrusions involving sendmail and lpr. This work is part of a research program aimed at building computer security systems that incorporate the mechanisms and algorithms used by natural immune systems.
A method is introduced for detecting intrusions at the level of privileged processes. Evidence is given that short sequences of system calls executed by running processes are a good discriminator between normal and abnormal operating characteristics of several common UNIX programs. Normal behavior is collected in two ways: synthetically, by exercising as many normal modes of usage of a program as possible, and in a live user environment, by tracing the actual execution of the program. In the former case several types of intrusive behavior were studied; in the latter case, results were analyzed for false positives.
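The idea of profiling a program by its short system call sequences can be illustrated with a minimal sketch. It assumes fixed-length contiguous windows stored in a plain set; the window length `k`, the function names, and the toy traces are hypothetical choices for illustration and are not taken from the papers summarized above.

```python
from typing import Iterable, List, Set, Tuple

Window = Tuple[str, ...]


def windows(trace: List[str], k: int) -> List[Window]:
    """Return every contiguous length-k window of a system call trace."""
    return [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]


def build_normal_profile(traces: Iterable[List[str]], k: int) -> Set[Window]:
    """Collect all length-k windows observed during normal behavior."""
    profile: Set[Window] = set()
    for trace in traces:
        profile.update(windows(trace, k))
    return profile


def mismatch_rate(trace: List[str], profile: Set[Window], k: int) -> float:
    """Fraction of windows in a trace that never occurred in the normal profile."""
    seen = windows(trace, k)
    if not seen:
        return 0.0  # trace shorter than k: nothing to compare
    misses = sum(1 for w in seen if w not in profile)
    return misses / len(seen)


# Toy data (hypothetical): one normal trace and one trace with an unexpected call.
normal = [["open", "read", "mmap", "mmap", "open", "read", "close"]]
profile = build_normal_profile(normal, k=3)

suspect = ["open", "read", "mmap", "execve", "open", "read", "close"]
rate = mismatch_rate(suspect, profile, k=3)
print(rate)  # 0.6: three of the five windows contain the unexpected call
```

A single unexpected call disturbs every window that overlaps it, so mismatches tend to cluster in time, which is one reason a short window can be a sensitive discriminator in a sketch like this.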
We are designing and testing a prototype distributed intrusion detection system (IDS) that monitors TCP/IP network traffic. Each network packet is characterized by the triple (source host, destination host, network service). The IDS monitors the network for the occurrence of uncommon triples, which represent unusual traffic patterns within the network. This approach was introduced by researchers at the University of California, Davis, who developed the Network Security Monitor (NSM), which monitors traffic patterns on a broadcast LAN. NSM was effective because most machines communicated with few (3 to 5) other machines, so any intrusion was highly likely to create an unusual triple and thus trigger an alarm.
Although successful, NSM has serious limitations. It is computationally expensive, requiring its own dedicated machine, and even then it can monitor existing connections only every five minutes. Further, the architecture of NSM does not scale: the computational complexity increases as the square of the number of machines communicating. Finally, NSM is a single point of failure in the system because it runs on a single machine. These limitations can be overcome by distributing the IDS over all machines in the network. Distribution will make the IDS robust by eliminating the single point of failure and will make it more flexible and efficient; computation can vary from machine to machine, fully utilizing idle cycles.
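The triple-based monitoring described above can be sketched as follows. This is a centralized toy version in the spirit of watching for uncommon triples, not a reconstruction of NSM or of the distributed IDS; the host names, the `min_count` cutoff, and the function names are all hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# (source host, destination host, network service)
Triple = Tuple[str, str, str]


def learn_normal(traffic: Iterable[Triple]) -> Counter:
    """Count how often each triple appears in traffic assumed to be normal."""
    return Counter(traffic)


def unusual(triples: Iterable[Triple], normal: Counter, min_count: int = 1) -> List[Triple]:
    """Return the triples seen fewer than min_count times during training."""
    # A Counter returns 0 for keys it has never seen, so unknown triples are flagged.
    return [t for t in triples if normal[t] < min_count]


# Toy data (hypothetical hosts and services).
training = [("alice", "fileserver", "nfs")] * 3 + [("bob", "mailhost", "smtp")]
normal = learn_normal(training)

observed = [("alice", "fileserver", "nfs"), ("bob", "fileserver", "telnet")]
alarms = unusual(observed, normal)
print(alarms)  # [('bob', 'fileserver', 'telnet')]
```

Because every triple is looked up in one shared table, this sketch has exactly the single point of failure that the abstract attributes to a centralized monitor.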
The architecture of NSM is not easily distributable. Distributing NSM would require either excessive resource consumption on every machine upon which it was run, or communication between machines. The immune system has interesting solutions to a similar problem of distributed detection. We have designed a distributed IDS based on the architecture of the immune system. This allows the IDS to function efficiently on all machines on the LAN, without any form of centralized control, data fusion or communication between machines. The architecture is scalable, flexible and tunable.
Our IDS depends on several "immunological" features, the most salient being negative detection with censoring, and partial matching with permutation masks. With negative detection, the system retains a set of negative detectors that match occurrences of abnormal or unusual patterns (in this case, the patterns are binary string representations of network packet triples). The detectors are generated randomly and censored (deleted) if they match normal patterns. Partial matching is implemented through a matching rule, which allows a negative detector to match a subset of abnormal patterns. Partial matching reduces the number of detectors needed, but can result in undetectable abnormal patterns called holes, which limit detection rates. We eliminate holes by using permutation masks to re-map the triple representation seen by different detectors.
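Negative detection with censoring, partial matching, and permutation masks can be sketched together in a few lines. The abstract does not name the matching rule, so this sketch assumes an r-contiguous-bits rule (two strings match if they agree in at least `r` adjacent positions); the string length, `r`, the detector count, and all function names are hypothetical.

```python
import random
from typing import Iterable, List, Sequence


def r_contiguous_match(detector: str, pattern: str, r: int) -> bool:
    """Assumed matching rule: equal-length strings agree in >= r adjacent positions."""
    run = 0
    for d, p in zip(detector, pattern):
        run = run + 1 if d == p else 0
        if run >= r:
            return True
    return False


def permute(pattern: str, mask: Sequence[int]) -> str:
    """Re-map a pattern's bit positions according to a permutation mask."""
    return "".join(pattern[i] for i in mask)


def generate_detectors(self_set: Iterable[str], length: int, r: int, count: int,
                       mask: Sequence[int], rng: random.Random,
                       max_attempts: int = 10_000) -> List[str]:
    """Generate random detectors and censor (discard) any that match re-mapped self."""
    remapped_self = [permute(s, mask) for s in self_set]
    detectors: List[str] = []
    for _ in range(max_attempts):
        if len(detectors) == count:
            break
        candidate = "".join(rng.choice("01") for _ in range(length))
        if not any(r_contiguous_match(candidate, s, r) for s in remapped_self):
            detectors.append(candidate)
    return detectors


def is_anomalous(pattern: str, detectors: Iterable[str], r: int,
                 mask: Sequence[int]) -> bool:
    """A pattern is flagged if any detector matches its re-mapped form."""
    seen = permute(pattern, mask)
    return any(r_contiguous_match(d, seen, r) for d in detectors)


# Toy setup (hypothetical): 12-bit patterns, r = 6, two self strings.
LENGTH, R = 12, 6
rng = random.Random(0)
self_set = ["000000000000", "000011110000"]
mask = rng.sample(range(LENGTH), LENGTH)  # one random permutation of bit positions
detectors = generate_detectors(self_set, LENGTH, R, count=10, mask=mask, rng=rng)

# Censoring guarantees that no self pattern is ever flagged by these detectors.
print(any(is_anomalous(s, detectors, R, mask) for s in self_set))  # False
```

In this sketch, each detector set would use its own mask, so a pattern that is a hole under one re-mapping may still be matched under another.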
We have conducted controlled experiments on a simulation that uses real network traffic as normal, and synthetically generated intrusive traffic as abnormal. Using a system of 25 detectors per machine on a network of 200 machines, one out of every 250 nonself patterns goes undetected, which is a false-negative error rate of 0.004. This number is conservative because intrusions will almost always generate more than one anomalous pattern. The computational impact of 25 detectors per machine is negligible, so performance can be improved by using more detectors per machine: if the number of detectors is doubled to 50 per machine, the error rate falls by an order of magnitude. These results indicate that holes can be effectively eliminated using permutation masks, and that consequently the IDS can provide comprehensive coverage in a robust, fully distributed manner.
Previously, in the 1996 IEEE Symposium on Security and Privacy, we reported a technique for intrusion detection using sequences of system calls. Although the vision here is the same, this current research differs in the domain of application (network traffic), and draws far more strongly on the immune analogy.