[Discussion] Features - Designing for Scalability

Tue Nov 18 18:47:19 UTC 2008

Team,

My apologies if this has been covered previously, or if the Bro
framework already takes some of these things into consideration.

I'd like to suggest that we take very large deployments into
consideration when designing our solution. The kinds of problems you
encounter when managing an infrastructure with a handful or even a
dozen IDS sensors are very different from those you face when trying
to efficiently and consistently manage infrastructures with larger
footprints (e.g. 100+ sensors).

A few suggestions to help our design address these potential
deployment and management scenarios:

1) Centralized sensor and policy management platform (or framework)
--> Such a solution could run on a single centralized server or be
spread across multiple servers.
--> With multiple servers, there might be a parent -> child
relationship among configuration servers to segregate business units,
or simply replication among servers for disaster recovery / business
continuity purposes.
--> An efficient, repeatable, and auditable methodology for making
changes to both individual sensors and pre-defined groups of sensors
(e.g. DMZ sensors, DNS sensors, development lab sensors, etc.).
--> My experience to date has been performing these kinds of tasks
with home-grown scripts, ssh, scp, audit logs, etc. However, it would
be nice to find something a little more mature for this project.

I have not used it, but there is a project called "Puppet" that looks
like it might be a good candidate for trying to develop a framework
along these lines:
http://reductivelabs.com/trac/puppet/wiki/PuppetIntroduction
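
To make the "groups of sensors" and "auditable" points above a little
more concrete, here is a minimal Python sketch. It is only an
illustration, not a proposed design: the group names, hostnames,
operator name and policy text are all invented, and it only plans a
change and builds an audit record rather than pushing anything to a
sensor.

```python
import hashlib
import json
from datetime import datetime, timezone

# Hypothetical inventory: group name -> sensor hostnames (all invented).
SENSOR_GROUPS = {
    "dmz": ["dmz-ids-01", "dmz-ids-02"],
    "dns": ["dns-ids-01"],
    "devlab": ["lab-ids-01", "lab-ids-02"],
}


def resolve_targets(groups=(), sensors=()):
    """Expand groups and individual sensors into a sorted, de-duplicated list."""
    targets = set(sensors)
    for group in groups:
        # An unknown group raises KeyError on purpose: fail before any change.
        targets.update(SENSOR_GROUPS[group])
    return sorted(targets)


def plan_change(policy_text, operator, groups=(), sensors=()):
    """Build an audit record describing a policy push, without performing it."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "operator": operator,
        "policy_sha256": hashlib.sha256(policy_text.encode("utf-8")).hexdigest(),
        "targets": resolve_targets(groups, sensors),
    }


if __name__ == "__main__":
    # Placeholder policy text; a real deployment would read the policy file.
    record = plan_change("# example policy\n", "operator1", groups=["dmz", "dns"])
    print(json.dumps(record, indent=2))
```

Recording the policy checksum alongside the resolved target list means
the same record can later be compared against what each sensor actually
has installed.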

2) Centralized sensor and policy monitoring platform (or framework)
--> This is similar to the "management framework" concept, but the
focus is more on device health monitoring, and possibly policy
integrity monitoring.
--> For this piece, I'm thinking of something that reports on memory
utilization, CPU utilization, disk space, and network interface
stats, plus other "bits" such as modification dates and checksums for
critical configuration files (e.g. detection policies, tuning
policies, variable definitions, etc.).

There are a number of open-source enterprise monitoring utilities out
there. Here's one I've been playing with recently:
http://www.zabbix.com/
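
As a rough sketch of the policy integrity idea above, the following
Python uses only the standard library to record free disk space plus a
checksum and modification time for each watched configuration file,
and then reports which files differ from a baseline. The function
names and report layout are assumptions made for illustration. CPU,
memory and network interface stats are left out because the standard
library has no portable way to collect them.

```python
import hashlib
import os
import shutil


def file_fingerprint(path):
    """Return the SHA-256 digest and modification time of one file."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large policy files are not loaded all at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return {"sha256": digest.hexdigest(), "mtime": os.stat(path).st_mtime}


def health_snapshot(config_paths, disk_path="."):
    """Collect disk usage and a fingerprint for each watched config file."""
    usage = shutil.disk_usage(disk_path)
    return {
        "disk_free_fraction": usage.free / usage.total,
        "configs": {path: file_fingerprint(path) for path in config_paths},
    }


def changed_files(baseline, current):
    """List paths whose checksum differs, or that were added or removed."""
    old, new = baseline["configs"], current["configs"]
    return sorted(
        path
        for path in set(old) | set(new)
        if old.get(path, {}).get("sha256") != new.get(path, {}).get("sha256")
    )
```

A sensor could take a snapshot after each approved change and a
central monitor could compare later snapshots against it, so that any
unexpected entry from changed_files() becomes an alert.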

3) Distributed Data Repositories
I know we briefly touched on the database stuff when talking about
schema design. I just wanted to add a plug for a couple of things
here:
--> Encrypted communication channels (sensor -> DB or sensor -> sensor)
--> Ability to log simultaneously to two or more data repositories
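
One possible shape for the "log to two or more repositories" idea is a
small fan-out wrapper, sketched here in Python. The class name and the
sink interface (a callable that accepts one event) are assumptions for
illustration. In a real deployment each sink would presumably wrap an
encrypted connection to a database or to another sensor; that part is
not shown.

```python
class FanOutLogger:
    """Send every event to all configured repositories, isolating failures."""

    def __init__(self, sinks):
        # sinks: mapping of repository name -> callable accepting one event.
        self.sinks = dict(sinks)

    def log(self, event):
        """Deliver the event to each sink; return {name: error} for failures."""
        failures = {}
        for name, write in self.sinks.items():
            try:
                write(event)
            except Exception as exc:
                # One unreachable repository must not block the others.
                failures[name] = str(exc)
        return failures


if __name__ == "__main__":
    primary, secondary = [], []  # in-memory stand-ins for real repositories
    logger = FanOutLogger({"primary": primary.append, "dr": secondary.append})
    failed = logger.log({"sensor": "dmz-ids-01", "msg": "example event"})
    print(failed, len(primary), len(secondary))
```

Returning the failures instead of raising lets the caller decide
whether to retry, queue the event locally, or raise an alert.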

I strongly agree with the concept of designing modular solutions, so
that these kinds of features can be "bolted on" if they are required.

Look forward to everyone's thoughts on how we can most effectively
tackle problems of scale for large deployments.

Cheers, John