Crying wolf: False alarms hide attacks

Eight IDSs fail to impress during the monthlong test on a production network.

By David Newman, Joel Snyder and Rodney Thayer
Network World, 06/24/02


One thing that can be said with certainty about network-based intrusion-detection systems is that they're guaranteed to detect and consume all your available bandwidth. Whether they also detect network intrusions is less of a sure thing.

Those are the major conclusions of our first-ever IDS product comparison conducted "in the wild." Unlike previous tests run in lab settings, we put seven commercial IDS products and one open-source offering on a live ISP segment to see what they'd catch.

What we found wasn't encouraging:

Several IDSs crashed repeatedly under the burden of the false alarms they churned out.
When real attacks came along, some products didn't catch them and others buried the reports so deep in false alarms that they were easy to miss.
Overly complex interfaces made tuning out false alarms a challenge.

Because no product distinguished itself, we are not naming a winner (see "No cigar"). The eight products we tested - from Cisco, Intrusion, Lancope, Network Flight Recorder (NFR), Nokia (running an OEM version of Internet Security Systems RealSecure 6.5), OneSecure, Recourse Technologies and the open-source Snort package - all ask too much of their users in terms of time and expertise to be described as security must-haves.

That's not to say IDSs have no place in corporate networks. They can be valuable tools for learning about network security and can validate that other security devices are doing their jobs. But setting up the current generation of IDSs requires a substantial time investment to ensure they'll flag only suspicious traffic and leave everything else alone.

We used the production network of Opus One, an ISP in Tucson, Ariz., as our testbed. Opus One offers Web hosting and leased-line, DSL and dial-up Internet access services to 50 small to midsize businesses. The backbone infrastructure includes nine T-1 circuits with an average utilization in the range of 9M to 12M bit/sec.

To spice things up a bit, we deployed four "sacrificial lambs," systems running old, unpatched versions of Windows 2000 Server and NT 4.0 Server, Red Hat Linux 6.2 and Sun Solaris 2.6. Putting plain-vanilla versions of these operating systems on the Internet is just asking to be attacked. Past studies have shown that unpatched systems get owned in a matter of minutes, thanks to automated scripts that find and exploit well-known vulnerabilities. We figured the IDS sensors couldn't miss seeing these attacks.

All IDSs consist of at least one sensor that monitors traffic and sends alarms whenever suspicious behavior occurs. There are two major methods of detecting problems: signature detection and anomaly detection. Signature detection, used by all products in this review except Lancope's StealthWatch, will generate an alarm whenever traffic matches a known attack pattern. With anomaly detection, the IDS compares current behavior against a baseline of "normal" traffic on that network and flags anything out of profile as an alarm.

For signature detection, the size of the attack library is key and vendors boast about spotting large numbers of attacks. On the flip side, signature-based IDSs only report on attacks they know about. With new attacks appearing daily, keeping the library current is a must.

Anomaly-based IDSs don't need to know about specific attacks, only exceptions. This makes them easier to maintain. At the same time, alarms from anomaly-based systems are only as useful as the baseline with which they're compared. An anomaly-based system might characterize a network already rife with attacks as "normal" and thus miss future intrusions.
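To make the distinction concrete, here is a minimal Python sketch of the two approaches. It is not how any product in this review works internally: the two byte patterns, the function names, the three-standard-deviation threshold and the traffic figures are all illustrative assumptions.

```python
from statistics import mean, stdev

# Hypothetical signature library: byte patterns tied to known attacks.
# Real libraries hold far more entries and use far richer matching.
SIGNATURES = {
    "code-red": b"/default.ida?",
    "wu-ftpd-site-exec": b"SITE EXEC",
}


def signature_alarms(payload: bytes) -> list[str]:
    """Return the name of every known attack pattern found in the payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]


def anomaly_alarm(baseline: list[float], observed: float, threshold: float = 3.0) -> bool:
    """Flag a reading that sits more than `threshold` standard deviations
    away from the mean of the baseline readings."""
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        return observed != mu
    return abs(observed - mu) / sigma > threshold


# Signature detection only fires on patterns already in the library.
print(signature_alarms(b"GET /default.ida?NNNN HTTP/1.0"))
print(signature_alarms(b"GET /index.html HTTP/1.0"))

# Anomaly detection depends entirely on the baseline (packets/sec, made up).
quiet_baseline = [100, 102, 98, 101, 99]
attacked_baseline = [100, 500, 100, 500, 100, 500]
print(anomaly_alarm(quiet_baseline, 500))
print(anomaly_alarm(attacked_baseline, 500))
```

With the quiet baseline, a burst of 500 packets per second stands out. With a baseline recorded while bursts were already under way, the same reading looks normal, which is the weakness described above.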

We wanted our test to model corporate use of IDSs, especially when it came to management. Most IDSs offer a management hierarchy with two tiers (or more), so sensors on one network can report to a management console and/or database in another location. To model this distributed approach, we set up an IP Security tunnel from the sensors at Opus One to Network Test's offices in Los Angeles, where the management stations were located.

Staying alive

Initially, we planned to assess the IDSs on accuracy and ease of use. As it turned out, we needed to add a third metric: uptime.

All the products we tested - save one - suffered at least some downtime during our approximately 30 days of testing (see "Uptime table," page 60).

Even before we turned loose the sacrificial lamb hosts, we experienced numerous crashes as IDS sensors struggled to keep up with traffic. In some cases, this occurred because the sensor simply fell over. An early version of the NFR software, for example, caused the vendor's NID200 sensor to use all available memory and CPU resources. A software patch fixed that problem.

A more common problem lay with the management stations. Most wouldn't stay up for more than a few days because of database overload.

Three products - Cisco's Secure Intrusion Detection System 4235, Intrusion's SecureNet 7145C and Nokia's IP530 - were especially shaky on availability. Cisco's sensor never locked up, but its management software was another story.

The vendor initially supplied Version 2.3.3i of its Cisco Secure Policy Manager (CSPM). CSPM is a powerful application with tons of useful features and one very significant downside: Whenever its database grows too large, the application ceases to function.

CSPM hit this threshold daily. Cisco suggested we create and run a batch file twice daily that would automatically prune CSPM's logs. This fix kept the application going, but it also excised the database of previous entries that CSPM could have used for its event correlation and reporting functions.

Cisco says it will announce a new version of CSPM next month that runs atop a more robust database. We hope so. Although CSPM is intended to collate input from large numbers of sensors, in our test it took just one to kill it.

We also used a beta version of Cisco's free management tool - Integrated Device Manager (IDM) with Integrated Event Viewer. IDM Version 3.1 didn't crash once.

Intrusion's SecureNet Provider (SNP) software uses a multitiered approach in which different machines can be used as sensors; consoles (for configuring the sensor); databases (for storing alarms from multiple sensors); and clients.

In our experience, it was the SNP client that locked up repeatedly. We'd see CPU utilization rise above 90% and stay there. In that state, it was impossible to tell what events Intrusion's client was and wasn't seeing.

The vendor's fix was twofold: First, Intrusion tuned its database not to store any alarms for what it deemed low-severity events. Second, the company configured the client to store only a day's worth of alarms. These measures kept the client running but limited the amount of data stored locally. Intrusion's database continued to log all medium- and high-severity events it received. However, the lack of local information at the management client could be irksome for a network professional trying to piece together an incident after, say, a long weekend.

The most troublesome performer of all was Nokia's IP530. What the vendor touts as a "high-performance security platform" locked up 13 out of the approximately 30 days we used it.

On the sensor side, Nokia's hardware-based security appliance runs RealSecure 6.5 from Internet Security Systems (ISS). The volume of traffic on the Opus One network caused the IP530's RealSecure process to terminate roughly once a day until Nokia supplied a patch.

Volume was the root problem for Nokia's management application. The first of several problems was Nokia's decision to supply Microsoft Database Engine (MSDE) as its data store. MSDE works well on a very quiet network, but it was unable to keep up with the feed from Opus One. It would fill up and stop running less than 24 hours after each reinstallation (which also wiped out all previous data). At our request Nokia supplied another management machine running Microsoft's SQL Server, but this too locked up.

Most of the other sensors and/or consoles also suffered from at least some downtime in our tests. The OneSecure Intrusion Detection and Prevention (IDP) sensor crashed once and wouldn't reboot; installing a software upgrade also left the sensor unusable because it didn't set a default route.

The bigger problem with OneSecure was glacial screen updates on its management console. Although the Java-based application logged a large number of events, scrolling through the log entries took so long that the application often seemed to be hung. Only exiting the application would bring it back to life.

Recourse's ManHunt application, also Java-based, was nearly as sluggish at times. It hung twice during our tests.

The only commercial product that didn't crash at all was Lancope's StealthWatch. This Web-based appliance didn't have a separate management application to crash. Further, the StealthWatch sensor never once locked up.

Who goes there?

Our main metric for this project was accuracy. What attacks would the IDSs see, and how clearly would they identify those attacks?

We considered an attack to be any compromise of any computing resource on the "protected" network. That resource could be bandwidth, disk space, a printer, a password file - basically, anything for which access is not explicitly authorized. This is not the same as an attempted attack; if there was no compromise, then the IDS is essentially reporting on a vulnerability that doesn't exist. During the test, most of the IDSs generated so many false positives that it was difficult to spot reports of real attacks.

We expected the IDS sensors to report any behavior outside these bounds and only such behavior. The major challenge when it comes to IDS reporting is reducing the number of false positives, while at the same time avoiding false negatives (see online IDS glossary, page 62).

Sensors take different approaches to reporting accurately. A sensor that sends an alarm every time a packet goes by would be very accurate because it would flag every attack packet. But this sensor also would flag everything else, making it difficult to distinguish real attacks from background noise. Open-source Snort came closest to this model, with NFR's NID200 a close second.

Who goes there?
Attackers launched thousands of penetration attempts on the network monitored by the IDSs we tested. As shown in these three representative incidents, not all IDSs recognized all attacks.

IDS	SYN flood	Code Red worm	wu-ftpd exploit
Cisco	Database frozen	Database frozen	Saw attack
Intrusion	Client frozen	Client hung	Saw attack
Lancope	No report	No report	No report
NFR	Saw attack	Saw attack	No report
Nokia/ISS	No report	Sensor hung	Saw attack
OneSecure	No report	No report	Saw attack ⁽¹⁾
Recourse	Saw attack	Saw attack	No report
Snort	No report ⁽²⁾	Saw attack	Saw attack

⁽¹⁾ Only saw attack in in-line mode; failed to detect attack in passive mode.
⁽²⁾ Offline because of configuration error.

At the other end of the spectrum, an IDS could be configured to send alarms for only a narrowly defined set of criteria. For example, it could flag only FTP sessions to host X from user Y at time Z.

The big danger here is false negatives: if the IDS listens only for a few specific events, it will miss everything else. We were especially concerned about this when tuning the Cisco, Intrusion and Nokia products because they had to reduce their reporting load to stay operational. In addition, OneSecure's device had relatively few alarms enabled by default.
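The two extremes described above can be put in numbers. The following Python sketch scores a sensor that alarms on every packet against one that alarms on a single narrowly defined event. The packet counts and attack positions are hypothetical and are not taken from our test data; the `score` function is an illustrative helper, not part of any product.

```python
def score(alarmed: set[int], attacks: set[int]) -> dict[str, float]:
    """Compare the packets a sensor flagged against the packets that were attacks."""
    true_pos = len(alarmed & attacks)    # real attacks that raised an alarm
    false_pos = len(alarmed - attacks)   # alarms on harmless packets
    false_neg = len(attacks - alarmed)   # real attacks that raised no alarm
    return {
        "true_pos": true_pos,
        "false_pos": false_pos,
        "false_neg": false_neg,
        # Share of real attacks the sensor flagged.
        "detection_rate": true_pos / len(attacks) if attacks else 1.0,
        # Share of alarms that pointed at a real attack.
        "precision": true_pos / len(alarmed) if alarmed else 1.0,
    }


# Hypothetical traffic: 10,000 packets, five of which are attacks.
packets = set(range(10_000))
attacks = {17, 450, 3_200, 7_777, 9_001}

noisy = score(alarmed=packets, attacks=attacks)   # alarms on every packet
narrow = score(alarmed={450}, attacks=attacks)    # alarms on one specific event

print("noisy :", noisy)
print("narrow:", narrow)
```

Under these assumptions the noisy sensor catches all five attacks but hides them among 9,995 false positives, while the narrow sensor raises no false alarms and misses four of the five attacks.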

We found most IDSs reported far too much rather than too little, making it difficult to pick out actual attacks from all the noise.

Our first indication of trouble came before we'd even powered up the sacrificial lambs. Two Macintoshes attached to Opus One's network froze at the same instant, possibly indicating a denial-of-service attack.

Checking the IDS clients, we found thousands of alarms from all sensors - but only NFR's NID200 and Recourse's ManHunt actually reported a SYN flood attack at the instant the Macs froze (see attack chart, above).

The Cisco, Intrusion and Nokia systems were unavailable because their databases had frozen as a result of the huge volume of alarms they handled, almost all of which were false positives. Lancope's StealthWatch and the OneSecure sensor didn't see an attack. Snort was off the air at the time of the attack because of misconfiguration on our part.

Determining which sensors did and didn't see the attack was a chore. Using NFR's Administration Interface management tool, we could query for all incidents. However, the version of Administration Interface we tested only returned a maximum of 4,096 records per query, or around 17 minutes' worth of traffic on the network. A registry edit on a later version of Administration Interface lets a query return more responses.
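The query cap gives a rough sense of the alarm rate. This short Python calculation uses only the two figures quoted above; the per-day number assumes, purely for illustration, that the rate stayed constant around the clock.

```python
RECORDS_PER_QUERY = 4_096   # maximum records returned per query
MINUTES_COVERED = 17        # "around 17 minutes' worth" of traffic

per_minute = RECORDS_PER_QUERY / MINUTES_COVERED
per_second = per_minute / 60
per_day = per_minute * 60 * 24   # assumption: the same rate holds all day

print(f"{per_minute:.0f} records/min")
print(f"{per_second:.1f} records/sec")
print(f"{per_day:,.0f} records/day")
```

That works out to roughly 240 records a minute, or about four every second, from a single sensor.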

Positively negative

So what was it that kept the sensors so busy they couldn't report on actual incidents? By far the biggest problem was a huge number of false positives, with sensors sending alarms for insignificant events - or even worse, for vulnerabilities that didn't exist.

The most egregious example of the latter was the massive number of reports of the Code Red and Code Blue attacks commonly launched against Microsoft's Internet Information Server (IIS) Web servers. NFR also sent many "successful Nimda attack" reports, alerting us to another way in which IIS can be compromised. To be sure, such attacks are a real problem - provided the vulnerability also is real.

But Opus One's servers run OpenVMS, not Windows. Even though it is trivially easy to figure out what operating system a Web server uses, not one of the IDSs did so. Instead, they collectively generated literally millions of alarms about attacks that never happened.
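None of the products we tested made this check, but the idea is simple enough to sketch in Python. Everything here is a hypothetical illustration: the host inventory, the addresses (taken from documentation-only ranges), the attack-to-platform mapping and the `is_relevant` helper are assumptions, not features of any IDS in this review.

```python
from dataclasses import dataclass

# Hypothetical inventory of protected hosts and their operating systems.
HOST_OS = {
    "192.0.2.10": "openvms",
    "192.0.2.20": "windows-nt",
}

# Hypothetical mapping from attack name to the platforms it can affect.
VULNERABLE_OS = {
    "code-red": {"windows-nt", "windows-2000"},
    "nimda": {"windows-nt", "windows-2000"},
}


@dataclass(frozen=True)
class Alarm:
    attack: str
    target: str


def is_relevant(alarm: Alarm) -> bool:
    """Keep an alarm unless the target is known to run an unaffected OS."""
    target_os = HOST_OS.get(alarm.target)
    affected = VULNERABLE_OS.get(alarm.attack)
    if target_os is None or affected is None:
        # Unknown host or unknown attack: err on the side of reporting.
        return True
    return target_os in affected


alarms = [
    Alarm("code-red", "192.0.2.10"),   # IIS worm aimed at an OpenVMS server
    Alarm("code-red", "192.0.2.20"),   # same worm aimed at a Windows NT box
    Alarm("code-red", "192.0.2.99"),   # host missing from the inventory
]
kept = [a for a in alarms if is_relevant(a)]
print(kept)
```

The sketch deliberately keeps alarms it cannot classify, since dropping them would trade false positives for false negatives.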

An even greater source of noise was reporting on benign events on the network. The Cisco, Intrusion, Lancope, Nokia, OneSecure and Recourse products prioritize alarms by severity, tagging events with labels such as high, medium or low severity.

In most cases the sensors spewed vast quantities of "low" or "informational" alarms. The Nokia and Intrusion devices sent low-priority alarms every time an end user requested a Web page. This might be desirable in paranoid network configurations where Web access is forbidden, but on an ISP's network where Web traffic is the norm, it's not only annoying but also dangerous to the well-being of the IDS sensor and its manager.

Things went from bad to worse once we attached the sacrificial lamb machines to the network. Attackers compromised these hosts soon after we deployed them - but in some cases it was the host's own message logs, and not the IDSs, that offered proof positive.

The easiest target was our Windows NT Server box. It became a launching pad for the Code Red and Code Blue worms within an hour of deployment. We soon began receiving complaints from other ISPs advising us we had a compromised machine.

Unfortunately, the IDS sensors weren't as clear in their reports. Code Red and Code Blue worms involve lots of traffic, and this blinded some of the sensors. NFR saw the attacks, but these alarms were buried inside thousands of other reports of attempted attacks against other machines that weren't running IIS. It was a similar story for Recourse's ManHunt and the open source Snort program.

Drowning under the huge volume of traffic, most systems either buckled or simply missed the attack outright. The Cisco, Intrusion and Nokia systems stopped logging. Each required a database purge to get going again but, in fairness, we must say that all three recognized this attack after we resuscitated them. The Lancope StealthWatch and OneSecure systems didn't spot the attack the first time, but both companies' engineers offered guidance in reconfiguring the systems so they would see subsequent attacks.

Vision test

With such an abysmal record of flagging attacks in the wild, we began to wonder whether the IDSs could catch any attack. As a sort of vision test, we decided to launch a controlled attack of our own. We picked a well-known, 3-year-old attack that exploited the Red Hat Linux FTP server.

The good news is that Cisco, Intrusion, Nokia and Snort products spotted the compromise immediately. However, the Lancope, Recourse and NFR systems failed to report a compromise.

The final vendor, OneSecure, also missed seeing the FTP exploit the first time around. OneSecure's vision is apparently related to the way it's deployed in networks.

For this test, we deployed OneSecure in passive mode, meaning it was attached to the same hub as the sacrificial lamb hosts. But OneSecure also works in so-called in-line mode, meaning it can sit in front of a protected network and actively block suspicious traffic. When we reran the attack with the device in-line, OneSecure reported and blocked the attempted FTP exploit. We're puzzled why OneSecure's IDP didn't see the attack in passive mode; we made no configuration changes to the sensor or management client when making the switch. OneSecure acknowledged the false negative as a bug and said a corrected signature would be available by press time.

We also should note that OneSecure's in-line mode performance wasn't perfect. Not long after we changed it from passive to in-line mode, the company's own management software notified us that one of the sacrificial lambs was sending outbound trivial FTP traffic to a bogus address. On any of the sacrificial lambs, outbound traffic of any kind is a sure sign that the machine has been compromised.

A little help?

By now, readers with security expertise probably are asking why we didn't tune the IDSs to reduce the chatter and improve our chances of seeing real attacks. The short answer is that we did, or at least we tried to. Including setup time, this project stretched over three months, and during that period we worked on these systems almost every day.

In the last major area of our test, ease of use, we found that IDSs don't offer users enough help in the way of improving the signal-to-noise ratio. All the products we tested assume that before any tuning begins, the user already knows what attacks exist on the network.

That assumption is shaky on two counts. First, security isn't a full-time job for most network professionals. As such, they're unlikely to know every attack they might encounter.

Second, the management software for the products we tested offers cryptic error messages, hard-to-use graphical user interfaces and limited assistance in identifying what is and isn't a real vulnerability. These products don't offer anything like expert systems, instead leaving the user to puzzle out what actually happened.

Consider, for example, a DNS alarm reported by NFR's NID200 sensor. The sensor reported a huge number of alarms marked as "Non-Internet Query class:error(user error), name es\x00." followed by a hexadecimal string.

It took our DNS expert a couple of hours to identify the "attack" as harmless attempts to find the root name server for Spain. The NFR documentation did not explain why this alarm occurred; why it was classified as non-Internet; or where false positives might occur.

Cisco's Integrated Event Viewer also reported these DNS "attacks" and compounded the confusion by burying specific alarms two levels deep in its interface.

On the plus side, all the commercial IDSs offer at least a capsule description of each attack. This is actually the single best feature of all the products we tested, as it helps educate users about what the IDS sees. Some products also suggest resolutions to the alarms, information on events that could lead to false positives, and links to fixes and additional data.

The Nokia management software offered all these types of information. The Intrusion product's suggested resolution for an IIS attack is a bit more terse: It simply asks users to "verify that the latest patches are installed." Even so, any description of an attack can be helpful, especially for a user seeing an attack for the first time.

We also noted a couple of minor usability annoyances with the IDSs that made troubleshooting more difficult. First, with the commendable exception of OneSecure, all IDSs display IP addresses and not host names. With only an address and not a domain name to work with, it's harder to figure out where an attack might be coming from.

StealthWatch was especially irksome in this regard; it makes prominent use of addresses and only displays host names deep inside a second-level menu.

Second, most IDSs don't offer a means of grouping hosts or networks together under some easily remembered nickname. The exception is NFR, which lets users define groups with its N-code programming language.

Other devices allowed us to query groups of hosts, but not as a single object. Intrusion's SNP product let us run queries for a given host or a given subnet, but it wouldn't run a query on a group of noncontiguous addresses, even if all were on the same subnet.
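The kind of grouping we were looking for can be sketched in a few lines of Python using the standard `ipaddress` module. This is not N-code and does not reflect any product's query interface; the group names, addresses and alarm records are hypothetical.

```python
import ipaddress

# Hypothetical nicknames for sets of hosts and networks.
# Members may be single addresses or whole subnets, contiguous or not.
GROUPS = {
    "web-farm": ["192.0.2.10", "192.0.2.37", "192.0.2.200"],
    "dsl-pool": ["198.51.100.0/25"],
}


def in_group(addr: str, group: str) -> bool:
    """Return True if the address belongs to any member of the named group."""
    ip = ipaddress.ip_address(addr)
    # A bare address parses as a single-host network, so one test covers both cases.
    return any(ip in ipaddress.ip_network(member) for member in GROUPS[group])


def query(alarms: list[dict], group: str) -> list[dict]:
    """Select the alarms whose source address falls inside the group."""
    return [alarm for alarm in alarms if in_group(alarm["src"], group)]


alarms = [
    {"src": "192.0.2.37", "event": "port scan"},
    {"src": "192.0.2.38", "event": "port scan"},
    {"src": "198.51.100.77", "event": "ftp login failure"},
]
print(query(alarms, "web-farm"))
print(query(alarms, "dsl-pool"))
```

Treating the group as a single named object means one query covers three scattered addresses that would otherwise need three separate lookups.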

Wrapping up

Don't expect IDSs to be plug-and-play devices. To be effective, they require a lot of tuning, and a fair amount of security expertise. They'll also require willingness to spend a lot of time sifting through reports, at least until the configuration is tuned properly. Even then, IDSs will require constant updating as new attacks appear. IDSs can be lifesavers and invaluable educational tools - but only for those with a lot of patience and a willingness to learn.