On the Trail of Intrusions Into Information Systems
Schemes for intrusion detection yield clues to making data safer in the future
The importance of information system security, particularly as it applies to the Internet, is obvious. Each day the news media report yet another security breach--sometimes a localized single crime or prank; at other times, a denial-of-service attack affecting millions of people. As electronic commerce becomes increasingly pervasive, the subject can only become more critical.
One of the more interesting techniques for enhancing information system security is detecting that an intrusion has taken place. Although intrusion-detection systems have been a part of the information security landscape for over 25 years, their proper role in the overall security picture is often misunderstood. As their very name implies, they are not preventative security measures. Most often, they are used as active security mechanisms in conjunction with other (passive) information assurance processes like firewalls, smart cards, and virtual private networks.
In practice, an intrusion-detection system (IDS) attempts to detect attacks or attack preparations by monitoring either the traffic on a computer network or the application or operating system activities within a computer. Once such behavior is detected, the IDS may alert a security administrator or it may invoke an automated response (such as closing down external communication paths or initiating a mechanism to trace the source of an attack), in which case it would more properly be called an intrusion detection and response system. If an IDS detects attack behavior soon enough, it might be able to invoke a response to thwart the attack. But since most IDSs do not react fast enough or are not reliable enough to be used in this fashion, how do they aid information system security?
There are several answers. In many instances, the information they provide can help a system security administrator learn what systems were attacked and exactly how the attacks were made. With this information, damage control can often be performed on the affected systems. For example, it may be possible to remove software planted by the attacker to facilitate later access to the system.
Analysis of the attack method may also enable an administrator to fix the security problems that allowed the attacks to happen. That may sound like closing the barn door after the horse has run away, but there may be more horses in the barn--and the farmer may have other barns.
Often, the data collected by an IDS aids in tracing the source of the attack, which may prove helpful in identifying the attacker. IDS logs, for instance, could provide forensic evidence if legal action is taken against an attacker. So, even if an IDS does not detect an attack early enough to help prevent it, it may be of considerable value.
A multipronged approach
Experience has shown that it is difficult to prevent many kinds of attacks. So, during the latter half of the 1990s, security-conscious organizations like the U.S. Department of Defense (DOD) began putting more emphasis on detection, with the intent that responses triggered by detection would ultimately result in more secure systems, even in the absence of better preventative security measures. The DOD adopted a three-word security mantra that emphasized the role of intrusion-detection systems: "Prevent, Detect, Respond."
Unfortunately, some have interpreted the "detect" role in the triad to imply that an IDS will be capable of catching all attacks that are not thwarted by a system's preventative security measures. This is certainly not the case today, and it is not clear that it ever will be. The effectiveness of most of these systems is not as good as might be imagined, given the reliance placed upon them.
Installing an IDS often provides a security administrator with an immediate sense of accomplishment because installation is usually followed by a voluminous stream of data indicating possible attacks against a variety of target systems. Unfortunately, not all the indicated attacks are real. Moreover, not all actual attacks generate alarms. Over time, in fact, the security administrator may come to view the stream of alarms generated by the IDS more as a burden than as a revelation. (As will be shown, it is no minor feat to distinguish an attack from many types of normal behavior.)
First, some background on IDSs may be helpful. An IDS is an important element--but only an element--of a comprehensive, defense-in-depth, security architecture. The defense-in-depth paradigm currently in vogue calls for using multiple defensive technologies to thwart attacks [see Trust in Cyberspace in To Probe Further]. Although each defensive barrier is understood to be imperfect, the strategy assumes that all will not be equally vulnerable to the same sort of attacks. Therefore, the reasoning goes, an attacker will either be unable to overcome all of the barriers or, at least, will take longer to overcome them, and will therefore be more readily detectable by an IDS.
While the defense-in-depth strategy sounds good, it is hard to implement. No engineering methodology for designing a system to execute this strategy exists today. Moreover, as the capabilities of attackers increase, deployed security architectures must be reevaluated to determine whether they remain effective.
In principle, using one or more IDSs can help in the continuing evaluation of system security. Detecting new attacks might alert system security administrators to the need for upgrading selected defenses, for instance. But many IDSs are especially poor at detecting new attacks, so this potential is rarely realized in practice.
Active: requiring action by the user. This peculiar usage is common in safety- and security-related contexts--for example, automobile seat belts are called active restraints, whereas air bags are called passive.
Anomaly detection: a kind of detection that infers a hacker attack is taking place by recognizing deviations from the normal behavior of a computer or network. Contrast with signature detection.
Host-based intrusion detection: intrusion detection in which the examined parameters are computer operations data; also called computer-based intrusion detection.
IDS: intrusion-detection system.
Network-based intrusion detection: intrusion detection in which the examined parameters are network data.
Passive: not requiring action on the part of the user; see active.
Port scan: the process of sending data packets to potential target computers to see what network services each one offers.
Signature detection: a kind of detection that recognizes an attack on the basis of known attack characteristics or signatures; also called attack detection. Contrast with anomaly detection.
TCP: transmission control protocol, the robust protocol used by most Internet applications.
Historically, IDSs have been characterized as either signature-detection systems or anomaly-detection systems. Today more and more commercial products include features from both types of systems. The signature system detects attacks by matching observed parameters of network traffic or of computer operations against a database of known attack characteristics, called signatures. The anomaly system compares such parameters against normal network traffic or computer behavior patterns, looking for deviations from the norm. Each class of system has its pros and cons.
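The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the signature names and byte patterns are invented for the example, and the anomaly test (a simple distance from the mean of previously observed values) stands in for whatever statistical model a real product might use.

```python
from statistics import mean, stdev

# Hypothetical signature database: byte patterns assumed to mark known attacks.
SIGNATURES = {
    "example-overflow": b"\x90\x90\x90\x90",
    "example-traversal": b"../../",
}

def signature_match(payload):
    """Signature detection: return names of known patterns found in the payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]

def is_anomalous(observed, baseline, k=3.0):
    """Anomaly detection: flag a value more than k standard deviations
    away from the mean of previously observed 'normal' values."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    return abs(observed - mu) > k * sigma

# Illustrative data: requests per minute seen during a training period.
normal_rates = [100, 110, 90, 105, 95]

known_attack = signature_match(b"GET /../../etc/passwd")
unknown_payload = signature_match(b"GET /index.html")
typical_load = is_anomalous(104, normal_rates)
unusual_load = is_anomalous(400, normal_rates)
```

The signature check can only recognize what is already in its database, while the anomaly check knows nothing about attacks and reacts only to departures from the baseline it was given.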
An IDS can also be characterized as either network-based or host-based. The distinction depends on whether the set of parameters the IDS examines is network data or computer operations data. Here, too, a product may incorporate both host- and network-based components. Most of the IDSs deployed today are signature-detection systems, and many are network-based.
In general, a signature-detection IDS will do well in detecting attacks in its signature database, although it may miss some mounted by sophisticated attackers who have taken steps to conceal them, as described below. The database for these systems is compiled--and periodically updated--by experts who attempt to extract the essence of an attack from the set of parameters monitored by the system. A good signature database allows an IDS to minimize the likelihood of false alarms and to perform detection quickly--in real time or nearly so.
The database is general, not site-dependent, which increases its value. But where there is a database, there is the possibility of gaining access to it. It is wise to assume that capable adversaries will eventually do so and thus will be able to create attack tools and test them against the database. That upfront testing will, of course, give them the opportunity to refine their methods, thereby reducing the likelihood of detection during a real attack.
Examples of network-based, signature-driven IDS products include Network Flight Recorder's eponymous product, Haystack's NetStalker, Harris Corp.'s StakeOut, and CyberCop from Network Associates. The ISS RealSecure product incorporates both network- and host-based signature components.
Suppose someone on the Internet carried out an attack by transmitting malicious code as part of a very large argument, or input, to an application. Processing the Internet traffic carrying this over-large argument triggers a buffer overflow in the application, which allows the malicious code to be executed on the targeted computer. Aware of this attack scenario, a network-based signature-driven IDS might examine packets addressed to the targeted application and search for the malicious code in those packets.
As a countermeasure, the attacker might purposely employ a modified transmission control protocol (TCP) implementation to break the malicious code into several overlapping TCP packets before transmission, and then to transmit the packets out of order. The attacker can rely upon TCP (in the target system) to reorder the packets and to discard the overlapping data when it reassembles the packets.
However, an IDS may omit part of the TCP processing in an effort to keep up better with LAN traffic, and that may allow malicious code to get through undetected. Aware of this possibility, the creator of the IDS may choose to look for smaller parts of the malicious code. But that strategy may cause the IDS to match innocuous packets incorrectly.
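A toy model makes the evasion concrete. The sketch below is not a real TCP implementation: segments are simplified to (offset, data) pairs, the signature string is made up, and the overlap policy (the first byte received for a position is kept) is an assumption, since real stacks differ on this point.

```python
def reassemble(segments):
    """Rebuild a byte stream from (offset, data) segments in arrival order.

    Assumption: where segments overlap, the first byte received for a
    position is kept and later copies are discarded. Gaps in the stream
    are not handled in this sketch.
    """
    stream = {}
    for offset, data in segments:
        for i, byte in enumerate(data):
            stream.setdefault(offset + i, byte)
    return bytes(stream[pos] for pos in sorted(stream))

def per_packet_match(segments, signature):
    """An IDS that skips reassembly and inspects each segment on its own."""
    return any(signature in data for _, data in segments)

def reassembled_match(segments, signature):
    """An IDS that reorders and merges segments before matching."""
    return signature in reassemble(segments)

SIGNATURE = b"EVILCODE"  # hypothetical stand-in for the malicious code

# The attacker splits the code into overlapping pieces and sends them out of order.
attack_segments = [(5, b"ODE"), (0, b"EVI"), (2, b"ILC")]

missed = per_packet_match(attack_segments, SIGNATURE)
caught = reassembled_match(attack_segments, SIGNATURE)

# Matching on a shorter fragment catches more, but also hits innocent traffic.
false_alarm = per_packet_match([(0, b"DECODE THIS")], b"ODE")
```

In this model the per-packet check misses the attack, the reassembling check finds it, and shortening the signature to a three-byte fragment produces a match on a harmless packet.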
The preceding scenario illustrates a fundamental problem for an IDS: strategies for detecting attacks usually involve a tradeoff between false negatives (not detecting real attacks) and false positives (generating false alarms). Each false negative is an undetected attack, while each false positive creates a burden for a security administrator, who must devote time to analyzing each alarm. If an IDS cries "Wolf!" too often, its output will be ignored--with predictable results.
Even for a good signature-detection IDS, operational reality leaves much to be desired. For example, a network-based IDS may be unable to observe all the traffic on a local-area network because of the use of a switched (as opposed to a shared, or broadcast) network architecture. As noted, an attacker may fragment an attack data stream, overlap the fragments, and intentionally transmit the fragments out of order, to make the pattern recognition process harder. Thus some of the best network-based IDSs manage to detect only about 80 percent of known attacks, according to annual tests conducted by the Massachusetts Institute of Technology's Lincoln Laboratory, Lexington, Mass.
Host-based IDSs that employ signature-detection techniques can fare much better; some detect essentially all attacks known to them. But both network- and host-based signature IDSs exhibit high false-negative detection rates in the face of attacks with which they are not familiar. Moreover, any IDS on a computer that is the target of an attack is in a race against the attacker. If the attack succeeds before detection and alarm, a sophisticated attacker will disable the IDS as a first step after breaking in.
Examples of host-based, signature-driven IDS products include AT&T's CompWatch, Intruder Alert by Axent Technologies, and Science Applications International's Computer Misuse Detection System, which also incorporates anomaly-detection features.
Anomaly-detection systems hold greater promise of detecting novel attacks because they are not constrained by an attack signature database. Such systems, however, must first be trained to develop a normal behavior database, and their success is dependent, in large part, on how well normal behavior can be characterized, and on how closely attack behavior mimics normal behavior. What's more, the database used by such systems tends to be site-specific. That makes it harder to deploy them because local (not central) administrators must manage the process of normal database construction (training).
The positive side to this site dependency is that it makes it harder for an adversary to test prospective attacks, to determine whether they will be detected by the IDS. An example of a network-based, anomaly-detection system is the Intouch INSA Security Agent by Touch Technologies Inc., San Diego, Calif., which also embodies signature-driven features.
If the database for an anomaly detection IDS is static, based on sampling of behavior at some prior, fixed time, it will not track usage trends that may cause normal traffic to appear anomalous, which sounds like a disadvantage. But, if the IDS tracks behavior and "learns" from continuous monitoring, an attacker may use that feature to gradually shift the database, to "train" it to accept attack (or pre-attack) behavior. So that bug may turn out to be a feature.
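The difference between a static and a continuously learning baseline can be illustrated with a small simulation. Everything here is assumed for the sake of the example: the monitored quantity is a single number, "normal" means within 20 percent of the baseline, and learning is modeled as a simple weighted average of the old baseline and each accepted observation.

```python
def count_alarms(values, baseline, tolerance=0.2, learning_rate=0.0):
    """Count observations that exceed the baseline by more than the tolerance.

    With learning_rate=0.0 the baseline is static. With a positive
    learning_rate, each accepted observation pulls the baseline toward it.
    """
    alarms = 0
    for value in values:
        if value > baseline * (1 + tolerance):
            alarms += 1
        else:
            baseline = (1 - learning_rate) * baseline + learning_rate * value
    return alarms

# The attacker raises activity in small steps, from 100 up to 300.
gradual_ramp = list(range(100, 301, 5))

static_alarms = count_alarms(gradual_ramp, baseline=100.0)
adaptive_alarms = count_alarms(gradual_ramp, baseline=100.0, learning_rate=0.5)
```

Under these assumptions the static detector raises alarms once the activity climbs past its fixed threshold, while the learning detector follows the ramp upward and never raises one.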
Experience in this regard varies considerably, depending on the IDS in question. Some have been successful at detecting a range of attacks, both known and novel, in certain contexts. In general, though, host-based anomaly-detection systems seem to be more successful than their network-based counterparts, as is also the case with signature-detection IDSs.
As mentioned earlier, both classes of IDSs struggle with the problem of striking a balance between false positives and false negatives (Type I and Type II errors, in the parlance of statistics). Many systems have the equivalent of a tuning knob that allows a system administrator to adjust the sensitivity of the IDS. Unfortunately, the false positive rate usually rises as one increases detection sensitivity, which can overwhelm the administrator with too much data from which to extract real attacks.
If a second level of automated analysis is employed to reduce the false positive rate, the delay between detection and reporting will increase, making it tougher to invoke a response fast enough to thwart the attack. Conversely, reducing sensitivity, to avoid information overload, often allows even more genuine attacks to go undetected, which runs counter to the motivation for deploying an IDS.
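The effect of the tuning knob can be shown with a handful of made-up numbers. In this sketch each event has already been given a suspicion score between 0 and 1 by some unspecified detector; the scores and the threshold values are purely illustrative.

```python
# Hypothetical suspicion scores assigned to events whose true nature is known.
benign_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.55]
attack_scores = [0.40, 0.50, 0.70, 0.90]

def error_counts(threshold):
    """Return (false positives, false negatives) when every event scoring
    at or above the threshold raises an alarm."""
    false_positives = sum(score >= threshold for score in benign_scores)
    false_negatives = sum(score < threshold for score in attack_scores)
    return false_positives, false_negatives

sensitive = error_counts(0.30)   # low threshold: high sensitivity
moderate = error_counts(0.45)
relaxed = error_counts(0.60)     # high threshold: low sensitivity
```

With these scores, the sensitive setting misses no attacks but raises four false alarms, the relaxed setting raises none but misses two attacks, and no threshold eliminates both kinds of error because the benign and attack scores overlap.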
Detecting a port scan
Consider the following example. Before launching an attack, it is common for an adversary to conduct a port scan, a process in which data packets are sent to potential target computers to determine which network services are offered on each computer. The least sophisticated hackers conduct port scans in an obvious fashion: they send traffic from a single source, targeting all potential host addresses in a local-area network (LAN) in numeric order, probing each port in numerical order. The port scan is conducted as fast as the network and the source and target computers will allow. This sort of behavior is readily detected by an IDS.
In contrast, a more sophisticated attacker might send probes to host addresses and ports that are randomly selected from the target address space. This more devious scan might be carried out not in a matter of minutes, but over a period of days, weeks, or months. The source of the port scan may also be obscured by launching the probes from several sources. A "low and slow" port scan of this sort is hard to detect, since a sensitivity setting capable of detecting it would tend to generate an unacceptably large number of false positives.
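A simple detector shows why the obvious scan is easy to catch and the patient one is not. The sketch below is one possible heuristic, not a description of any product: it flags a source that probes more than a chosen number of distinct host and port pairs within a sliding time window. The window length, the limit, and all addresses are assumptions made for the example.

```python
from collections import defaultdict

def scan_sources(events, window=60.0, max_targets=20):
    """Return the sources that probed more than max_targets distinct
    (host, port) pairs within any window-second span.

    events: (timestamp, source, host, port) tuples, assumed sorted by time.
    """
    recent = defaultdict(list)  # source -> [(timestamp, (host, port)), ...]
    flagged = set()
    for timestamp, source, host, port in events:
        history = recent[source]
        history.append((timestamp, (host, port)))
        # Drop probes that have fallen out of the sliding window.
        while history and history[0][0] < timestamp - window:
            history.pop(0)
        if len({target for _, target in history}) > max_targets:
            flagged.add(source)
    return flagged

# Obvious scan: one source, 100 ports on one host, ten probes per second.
fast_scan = [(i * 0.1, "10.0.0.99", "192.0.2.1", port)
             for i, port in enumerate(range(1, 101))]

# "Low and slow" scan: one probe per hour, rotated across five sources.
slow_scan = [(i * 3600.0, f"198.51.100.{i % 5}", f"192.0.2.{i % 10}", 1000 + i)
             for i in range(100)]

fast_result = scan_sources(fast_scan)
slow_result = scan_sources(slow_scan)
```

The fast scan trips the detector, while the slow, distributed scan never does. Widening the window or lowering the limit far enough to catch it would also flag busy but legitimate hosts, which is the false positive problem described above.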
Other factors may further inhibit the ability of a network-based IDS to function. The increasing use of high-speed (100-Mb/s and 1-Gb/s) switched LANs makes it more difficult to monitor all of a network's traffic from a single point, as does the use of end-to-end encryption. This is a serious concern as network-based systems have been the easiest and least expensive to deploy: they do not require installation and maintenance of software on every computer being protected in an enterprise environment.
Clues to tomorrow
So, what is the future of intrusion detection? Host-based systems, while more difficult to install and manage, appear to offer the best hope for detection of both known and novel attacks. They are not subject to some of the more daunting challenges that face network-based systems. But network-based intrusion detection may be appropriate for monitoring traffic directed to specific systems, where encryption is not used and where interfaces can be targeted (to avoid LAN monitoring problems).
At this stage in their evolution, both attack-detection and anomaly-detection systems appear to have merit. Using multiple, distinct styles of intrusion detection as part of a defense-in-depth strategy still seems promising, even though no methodology exists for deciding how best to construct such composite systems.
As for IDS research, a current focus is on the correlation of inputs from multiple IDS sensors. This work is an effort to uncover patterns that might otherwise escape detection by an individual IDS. At this point, however, the advantage is with the adversary.
And what of intrusion detection systems today? Despite its shortcomings, an IDS is an important part of a comprehensive information system security suite. It is not an alternative to effective protective mechanisms, but it is a useful adjunct and a good source for forensic data if an attacker is caught and prosecuted.
About the Author
Stephen Kent is the chief scientist, information security, for BBN Technologies, in Cambridge, Mass., a unit of Verizon Communications, where he has led information security R&D projects for over 20 years. He is currently developing high-assurance public-key infrastructure technology, security for Internet routing, and multi-gigabit encryption devices. He is a fellow of the Association for Computing Machinery, and has authored several Internet security standards.
To Probe Further
For a thorough discussion of the main aspects of intrusion detection, see Intrusion Detection: An Introduction to Internet Surveillance, Correlation, Traps, Trace Back, and Response, 1999, by Edward G. Amoroso, chief technical officer, Information Security Center, AT&T Laboratories. The book was developed from lectures the author gave to AT&T customers and Stevens Institute graduate students over several years. See the Web site of Intrusion.Net Books at https://www.intrusion.net/.
For an analysis of why defense-in-depth is an attractive alternative to previous approaches to achieving system security, see the 1998 National Research Council report, Trust in Cyberspace.
An excellent discussion on system security is offered in "Detecting Computer and Network Misuse Through the Production-Based Expert System Toolset (P-BEST)," by U. Lindqvist and P. Porras, in the Proceedings of the 1999 IEEE Symposium on Security and Privacy, Oakland, Calif., 9-12 May, 1999.
Also relevant is "Data Mining Approaches for Intrusion Detection," by W. Lee and S. Stolfo, Proceedings of the 7th Usenix Security Symposium, 1998.