Security information management tools refine the deluge of raw data into
actionable intelligence.
BY JOEL SNYDER
Information security, November 2004
Security devices overwhelm us with information: Firewalls log permits and denies;
routers supply traffic information; servers note break-in attempts and user
activity; and IDSes strafe us with alerts. All accumulate voluminous logs that
are difficult and time-consuming to interpret, and offer too little benefit
for the effort.
Security information management (SIM) systems give enterprises control over this swirl of data. They simplify and normalize information from disparate security and network devices, reducing noise to a relative hum of useful alerts and presenting trend and event reporting that an enterprise can access through a unified console. Feed a SIM 10,000 events and let it pick out those that matter: the router that failed to reboot or someone on the inside very slowly trying to guess passwords all over the network.
Do they work as advertised? We put that question to the test in our lab, feeding data from a variety of security and network devices to five leading enterprise-class SIM products: Mitigation and Response System (MARS) 2.5 by Protego Networks, Security Threat Manager 2.0 by OpenService, Security Manager 5.0 by NetIQ, ArcSight 3.0 by ArcSight, and enVision 2.0 on a Network Intelligence Engine HA series appliance by Network Intelligence.
We evaluated each based on the common components all SIMs share: data collection; analysis, alerting and responding; forensics and reporting; and storage, scalability and archiving.
We learned that a SIM product's value depends on how much you put into it; that is, the business-specific rules that tell the engine what information is important to you. Accordingly, the degree to which the product allows you to determine those rules is absolutely critical. The SIMs we tested differ significantly in this respect.
Information Central
You have to collect data before you can analyze it.
SIMs collect most raw data via syslog, a format available in virtually every network device. The most significant exceptions are Windows, Check Point Software Technologies' firewalls and Cisco Systems' IDS sensors. For example, a Windows system stores information in its event log as well as other parts of the operating system, such as the performance monitor counters. Non-syslog devices and platforms generally require an agent for data collection.
We expected that every product would support collection methods for these "big four" of the security world, and we found some surprises.
Protego, ArcSight and Network Intelligence sailed through syslog testing with ease, but OpenService and NetIQ stumbled. NetIQ is a great Windows SIM, but its data collection architecture isn't designed for heterogeneous networks; its syslog server can only handle a single device type at a time. If you have a multivendor network, you have to drop in a syslog server for each device type.
OpenService doesn't incorporate its own syslog server in Security Threat Manager, a curious choice. By relying on third-party software to collect syslog data, it's given up control over information gathering, a critical part of the SIM process. Not helping matters, the middleware that OpenService suggested stopped working several times during our testing.
Vendors vary in their approach for obtaining Windows information, which can be gathered from Windows Event Logs or via agents on the Windows system. The approach you choose depends on what kind of information is important to you. Looking for potentially complex problems, such as memory, disk or network usage, and which processes are running requires a local agent; scanning through the event logs can be done remotely or locally. The burden of deploying an agent to every Windows system, whether you're covering just servers or every desktop, will also affect your choice.
Some vendors let you choose. OpenService, for example, will let you either pull most event logs with agentless technology or install their agent and look for things such as CPU usage or disk space.
The benefits of the agent approach are clear in NetIQ's Windows-centric architecture. In addition to capturing logs, NetIQ agents tell you what processes are running and actually enforce policy by killing proscribed processes. If your business rules state that no one can run Solitaire, the NetIQ agent can kill the process. This is a major differentiator for NetIQ, though not a core function of a conventional SIM.
Just getting the data is only half the battle. You must also parse and normalize it before it makes sense.
Vendors have come up with a wide variety of techniques for pulling data out of devices and systems that don't support syslog. Network Intelligence and ArcSight can gather data by reading a file on a disk somewhere, making an ODBC call, reading an XML transaction or accepting an SNMP trap.
While all the vendors sport long laundry lists of supported devices, you can anticipate some customization if your specific devices aren't supported. Whether the SIM product allows you to custom code your own device support or charges you to add nonstandard devices is an important differentiator. When you're shopping for a SIM, consider what's on your network and what you may have to pay to get the product to talk to your devices.
For example, the H-P switch in our test lab uses SNMP, not syslog. That doesn't mean a SIM will support it simply because it accepts SNMP traps. We needed to customize the SNMP agent for each product that supported SNMP to load the H-P MIB and map it to the normalized schema.
The big question is whether you can do it yourself. All the SIMs parse data into appropriate fields, depending on their database schema. But it's up to the vendor whether it keeps the process secret. NetIQ, OpenService and Network Intelligence have the most open architectures. If you want to add a new device, you simply open up a text editor and write your own parser, using the existing data files for other devices as a template.
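To make the idea of a hand-written parser concrete, here is a minimal sketch in Python. It is not any vendor's parser format: the raw log layout, the regular expression and the normalized field names are all assumptions invented for illustration. The only standard piece is the syslog priority value, which encodes facility and severity as facility * 8 + severity.

```python
import re

# Hypothetical raw format for a firewall syslog line, for illustration only:
#   <PRI>hostname fw: permit|deny PROTO src:sport -> dst:dport
LINE_RE = re.compile(
    r"<(?P<pri>\d+)>(?P<host>\S+) fw: (?P<action>permit|deny) "
    r"(?P<proto>\w+) (?P<src>[\d.]+):(?P<sport>\d+) -> "
    r"(?P<dst>[\d.]+):(?P<dport>\d+)"
)


def parse_line(line):
    """Map one raw line onto a normalized event dict, or None if it doesn't match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    pri = int(m.group("pri"))
    return {
        # Field names below are an assumed schema, not a real product's.
        "device": m.group("host"),
        "facility": pri // 8,   # syslog PRI = facility * 8 + severity
        "severity": pri % 8,
        "action": m.group("action"),
        "protocol": m.group("proto").lower(),
        "src_ip": m.group("src"),
        "src_port": int(m.group("sport")),
        "dst_ip": m.group("dst"),
        "dst_port": int(m.group("dport")),
    }


event = parse_line("<134>fw01 fw: deny TCP 10.0.0.5:4312 -> 192.168.1.10:21")
```

Lines that don't match return None rather than raising an error, so an unsupported device shows up as unparsed data instead of stopping collection.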
ArcSight lets you write your own parsers, but no longer provides the templates. The company told us that it was tired of competitors using its parsers.
Even if you don't anticipate writing your own device support, you still want access to those files because they all have bugs. This is where Protego is lacking: it actually writes code for each device, so you can't modify it. It's impossible to add your own device support, and you can't repair any bugs in the parsers; Protego has to do it.
Business Intelligence
Analysis and alerting are what separate SIM products from a log server with a lot of disk space. This is the secret sauce that gives these products value, and this is where SIMs have radically different tool sets and architectures.
To get value out of a SIM, you have to build in "business intelligence" in the form of correlation rules and alerting behaviors. SIMs aren't like IDSes, which have thousands of signatures. Out of the box, SIMs are pretty useless. One of the most important evaluation criteria for a SIM is the extent to which it can support unique business intelligence rules.
Business intelligence comes in many forms. For example, we all occasionally forget or mistype our passwords, but if someone tries 10 times in a minute, that's a break-in attempt. When you decide what you want to know about and what you want to ignore, you've created business intelligence, and that's what has to go into your SIM.
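The password example can be written as a sliding-window threshold rule. The sketch below is one possible way to express it in Python, not how any of the tested products implements correlation. The class name, the per-account grouping and the assumption that events arrive in time order are ours.

```python
from collections import defaultdict, deque


class FailedLoginRule:
    """Flag an account when it sees `threshold` failures within `window_seconds`.

    Assumes timestamps (in seconds) arrive in non-decreasing order.
    """

    def __init__(self, threshold=10, window_seconds=60):
        self.threshold = threshold
        self.window = window_seconds
        self.failures = defaultdict(deque)  # account -> recent failure times

    def observe(self, timestamp, account):
        recent = self.failures[account]
        recent.append(timestamp)
        # Drop failures that have aged out of the window.
        while recent and timestamp - recent[0] >= self.window:
            recent.popleft()
        return len(recent) >= self.threshold


rule = FailedLoginRule()
# Ten failures in ten seconds: only the tenth one trips the rule.
burst = [rule.observe(t, "alice") for t in range(10)]
# One failure every ten seconds never reaches the threshold.
slow = [rule.observe(t * 10, "bob") for t in range(20)]
```

The threshold and window are the business intelligence here. The slow password guesser described earlier would need a much longer window, which means keeping more state per account.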
Building business intelligence isn't easy. All of these products require a significant amount of customization. Whether you add the business intelligence yourself or use the vendor's professional services, you'll invest a lot of time and money. We tuned and tweaked the products for a full week, but we never felt that we had done all we could do to get the full value out of them.
Our testing was based on 20 tasks representing our hypothetical business intelligence. For example, anonymous FTP connections reported by the IDS to one set of servers were mundane and could be ignored. Connections to a different set of systems, which were prohibited by both policy and the firewall, indicated a major breach. We had a task of looking at IDS alerts for the prohibited system; we were concerned about firewall connection logs, traffic load for FTP and the IP addresses. One rule, one alert, but triggering based on any of a number of possible scenarios.
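A rule of this shape, one alert fed by several possible scenarios, can be sketched as a set of named predicates over normalized events. Everything specific below is hypothetical: the server addresses, the event field names and the signature label are placeholders, not values from our test lab or from any product.

```python
# Hypothetical addresses of the servers where FTP is prohibited.
RESTRICTED = {"10.1.1.10", "10.1.1.11"}


def ids_reports_ftp(event):
    """The IDS saw anonymous FTP to a restricted server."""
    return (
        event.get("source") == "ids"
        and event.get("signature") == "anonymous-ftp"
        and event.get("dst_ip") in RESTRICTED
    )


def firewall_passed_ftp(event):
    """The firewall logged a permitted FTP connection to a restricted server."""
    return (
        event.get("source") == "firewall"
        and event.get("action") == "permit"
        and event.get("dst_port") == 21
        and event.get("dst_ip") in RESTRICTED
    )


SCENARIOS = {
    "ids_reports_ftp": ids_reports_ftp,
    "firewall_passed_ftp": firewall_passed_ftp,
}


def matched_scenarios(event):
    """Return the names of the scenarios this event satisfies (empty = ignore)."""
    return [name for name, test in SCENARIOS.items() if test(event)]


mundane = {"source": "ids", "signature": "anonymous-ftp", "dst_ip": "10.2.2.20"}
breach = {"source": "ids", "signature": "anonymous-ftp", "dst_ip": "10.1.1.10"}
passed = {"source": "firewall", "action": "permit", "dst_port": 21,
          "dst_ip": "10.1.1.11"}
results = [matched_scenarios(e) for e in (mundane, breach, passed)]
```

The same IDS signature is ignored for ordinary servers and raised for restricted ones. A single alert would fire whenever the returned list is non-empty.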
ArcSight provided the best toolkit for analyzing data. The product comes with about 100 correlation and analysis rules, which serve largely as examples. ArcSight was the only SIM to successfully handle our hardest test: to identify worm-infested sites by tracking DSL lines going up and down repeatedly.
Network Intelligence also provides a powerful toolkit for creating business intelligence rules. It does a remarkable job of taking event data and putting it into a series of databases very specific to each application. But in the version we tested (2.0), you can't actually access the data. This means, for example, that even though the product learns from your VA tool how important a particular system is, you can't necessarily use all that data when writing your correlation rules.
Network Intelligence addressed this issue in its new version-released in October but unavailable during our test period-with significantly increased capabilities for analysis and alerting.
Protego's MARS is ambitious, but flawed. We were frustrated by bugs and a poorly designed GUI. The built-in correlation rules are fairly powerful, but are hidden from the user. This means that you can't use Protego's own intelligence to build rules.
Protego attempts to go beyond simply receiving and analyzing log entries and generating alerts. It's constantly mapping, probing and tracing traffic to pinpoint the network topology and structure, and does its own vulnerability analysis. But its built-in VA tool caused numerous devices to hang and swamped our WAN links; its mapping is limited to specific systems and discovered very little of our network topology. (OpenService has a similar capability, graphically showing interactive attack topologies, which also proved nearly useless for our specific network.)
OpenService says its architecture is "ruleless," but it actually has nine basic rule types. You can't create your own rules, just set tolerances for those it provides. We got numerous very high priority alerts for obscure problems related to unusual software that we weren't running. It also did a poor job filtering the voluminous data from our IDSes. It produced the poorest alerting and analysis results among the tested SIMs.
NetIQ's strength is in analyzing and alerting on host-based events, and it blew us away with literally hundreds upon hundreds of Windows-based rules and actions. On the other hand, its correlation and rule-writing capabilities are ill-suited for analyzing network information.
Forensics and Reporting
Sometimes you want a specific message or report summarizing what's on your network. SIMs promise security intelligence through a single interface, but we found different emphases in different products. Although each SIM's reporting capabilities were nicely done, the forensics tools varied considerably.
Let's be clear about what forensics tools are: they slice and dice information to discern deeper intelligence. Such capabilities allow users to manipulate correlated data and drill down for quick summaries.
This is where OpenService made up for its poor analytical capabilities. Its powerful viewing and summarization capabilities enable in-depth analysis of security data. Similarly, NetIQ does a superior job laying out all security data and letting you explore it in depth.
ArcSight's dashboards and channels give you a dizzying array of ways to look at data. Unfortunately, its drill-down capabilities aren't strong.
Network Intelligence's nicely designed Log Smart Viewer suffers some of the same weaknesses when it comes to drilling down or summarizing data. Users could use the database query tool, but fashioning the right query against 80 assorted database tables is daunting.
Protego's highly functional reporting and forensics interface was offset by horrible performance at the end of our weeklong review. As the system filled with data, reporting speed became intolerable, even though we were running at less than 1 percent of rated load.
We didn't test specifically for performance, preferring to focus our effort on SIM capabilities. Further, the varying complexity from enterprise to enterprise made meaningful performance comparisons difficult. Nevertheless, anecdotal observations gave us pause.
In our network, we only generated about 50 events per second; most of these products claim processing capabilities in the thousands. But even at our modest levels, we ran into some serious performance problems with Protego and others. For example, when we tried to use OpenService's tuning interface while data was feeding in, it took four minutes to change screens. We found ArcSight's Manager dragged at painfully slow speeds of a minute or more per screen update even under fairly modest load, running on two dual-CPU systems (one for database, one for management). (ArcSight says that they have identified and fixed this update issue.)
Our experiences suggest that the performance numbers these three vendors offer may be open to question, or they might be a sign of the fragility of these systems when not tightly tuned for specific environments and performance. On the other hand, Network Intelligence and NetIQ handled our load with snappy response time for forensics and report writing, each on a single dual-CPU system.
Storage and Archiving
The growing demands of regulatory requirements for recordkeeping make SIM products a natural part of the storage and archiving process.
NetIQ and ArcSight can directly integrate with enterprises' SQL-based data warehouses and archives, providing straightforward access to data through the management console or SQL query tools.
Network Intelligence includes data warehousing and some capability to look at the data from outside of the management console, but it doesn't keep the information in an off-the-shelf database. However, Network Intelligence makes it easy for you to grab what you want out of its proprietary format warehouse and archive it.
Protego and OpenService store data in SQL databases, but keep a limited number of days' information. For example, OpenService completely purges data after 30 days, making long-term trending impossible.
Meeting Your Expectations
Security information management systems are expensive, time-consuming to install, and hard to customize. A successful deployment will require days, if not weeks, of professional services assistance from the vendor.
If you're just looking for a single place to put all your data and generate reports, any of these products should meet your need. However, the strength of a SIM is in its ability to tell you things about your network that you didn't know but need to know, and often need to act on. In that case, how you intend to use your SIM will strongly influence your choice.
If your main interest is actively protecting your Windows systems, NetIQ is well-suited for the job.
Similarly, if you are looking for a tool that does some correlation but has greater strengths in reporting and forensics, consider OpenService.
At this point, Protego's MARS is hard to recommend. Its elegant set of tools is designed for a very specific network environment, and it's handicapped by a buggy GUI and poor performance. Protego will be one to watch, if it works the bugs out.
Network Intelligence and ArcSight are the most robust SIM products we tested. While Network Intelligence has its own advantages and strengths, ArcSight is clearly more mature and has both broader and deeper functionality.
There's added incentive for considering these tools. At the core, a SIM can be little more than an IDS and log "super console," aggregating and correlating alerts and trying to prioritize and manage security from a policy violation point of view.
However, you can use these tools to gain additional insight into network performance and reliability. For example, data from our firewall traffic logs reported the top talkers and listeners on the network. That's not security information per se, but it is a benefit that comes from seeing all the logs.
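A top-talkers report is a simple aggregation once firewall logs are normalized. The Python sketch below assumes each event carries src_ip, dst_ip and bytes fields; those names and the sample records are invented for illustration and don't reflect any product's schema or our lab data.

```python
from collections import Counter


def top_talkers(events, n=5):
    """Return (top senders, top receivers) ranked by total bytes."""
    sent = Counter()
    received = Counter()
    for event in events:
        sent[event["src_ip"]] += event["bytes"]
        received[event["dst_ip"]] += event["bytes"]
    return sent.most_common(n), received.most_common(n)


# Made-up sample records standing in for normalized firewall traffic logs.
sample = [
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9", "bytes": 1200},
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.7", "bytes": 800},
    {"src_ip": "10.0.0.6", "dst_ip": "10.0.0.9", "bytes": 500},
]
talkers, listeners = top_talkers(sample, n=2)
```

Counting connections instead of bytes would answer a slightly different question, such as which hosts are scanning rather than which are moving the most data.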
The same is true of SIMs that have Windows agents. In addition to collecting event logs, these products can do all manner of inventory and status checking. If a Windows agent reports vital stats and the SIM sends an alert when the disks on the Windows system fill up, is that a security issue or a system management issue?
Enterprises buying SIM products at these prices have a right to expect them to deliver more than just security alerts. If we're going to the trouble of feeding all of our logs to this box, it should be smart enough to look at the entire log data stream and come to conclusions beyond a subset directly related to security. We expect the most successful products will be those that leverage all the data they collect and provide the most help to the entire network and security team.