[Discussion] Features - Designing for Scaleability
Matt Jonkman
jonkman at jonkmans.com
Wed Dec 3 22:40:58 UTC 2008
So you're thinking the tool should have its own management protocol to
a central hub built in, allowing sensors to be relatively autonomous,
pulling config and reporting data through a single protocol?
Matt
Martin Holste wrote:
> John,
>
> I think you've hit on some key points here. I manage about a dozen
> sensors, and there is a definite line that is crossed somewhere around a
> half-dozen servers where managing the configuration and health of the
> boxes requires some power tools. I prefer Nagios for general system
> health monitoring, so that has never been an issue. Conversely, it has
> been an incredible challenge to manage rules and granular Snort
> performance on many boxes with multiple Snort instances. I've taken the
> route you have with many home-grown scripts, etc., but there's always
> something new that comes out with Snort making the management
> difficult. Specifically, dealing with SO rules throws a huge wrench in
> the works when dealing with mixed architectures.
>
> My strategy thus far has been to have my sensors completely self
> contained and manage them centrally using a home-grown system written in
> Perl to make generic queries and issue commands. The databases which
> Snort writes to are on the box (performance really isn't impacted at all
> by doing this). The Perl agent receives queries like "get all alerts in
> last X minutes," or "add this rule to the running configuration," or
> "calculate stats for Mbps and alerts per second for last 1000 ticks."
> The key is that since all the data is on the sensor, it scales very
> well. The central Perl client can then parallelize queries so all
> sensors can search at the same time much faster than if all the alerts
> were located in one central database. Since everything goes through the
> central management client, it can easily log all the actions (and even
> queries) which are run for audit/change management purposes.
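The parallel fan-out described here could be sketched roughly like this. Everything in the snippet is illustrative: the sensor names and the query_sensor helper stand in for whatever real RPC/SSH call the agent would make.

```python
# Sketch of the fan-out query pattern: each sensor keeps its own
# alert database, and the central client queries all of them at
# the same time instead of pulling everything into one big DB.
from concurrent.futures import ThreadPoolExecutor

SENSORS = ["dmz-1", "dmz-2", "lab-1"]  # hypothetical sensor names

def query_sensor(sensor, query):
    # Placeholder for the real call to the agent on the sensor;
    # here we just fabricate a per-sensor result.
    return {"sensor": sensor, "alerts": [f"{sensor}: {query}"]}

def query_all(query):
    # Send the same query to every sensor in parallel and collect
    # the answers in sensor order.
    with ThreadPoolExecutor(max_workers=len(SENSORS)) as pool:
        return list(pool.map(lambda s: query_sensor(s, query), SENSORS))

results = query_all("alerts in last 5 minutes")
```

The central client stays thin; it only merges results, so adding sensors doesn't grow any single database.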
>
> For encryption, I run everything through OpenVPN. This works really
> well, especially for debugging, since it is much easier to tcpdump a
> tunnel interface than get a hook into a TLS socket.
>
> I'm working on a 2.0 version of this architecture which will be entirely
> asynchronous, so that the sensors can alert the central management hub
> on predetermined criteria.
>
> For truly large deployments, I think you're right that a parent-child
> setup might be necessary. Exploring peer-to-peer techniques
> might be interesting, but I think that, for simplicity, a tree
> hierarchy would make the most sense. That is, a "hub" may have X number
> of servers assigned to it, and a hub higher up the tree would be able to
> ask hubs to delegate queries down the hierarchy. It would be
> interesting to see if there would be performance gains versus attempting
> parallelized queries over thousands of sensors from just one hub. My
> thinking is that some of the sensors could also function as hubs.
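The hub/sensor tree could be sketched roughly as below. The class names and the merge strategy are hypothetical; the point is just that a hub delegates the same query down to its children, which may themselves be hubs.

```python
# Illustrative tree hierarchy: a query enters at the root hub and
# is delegated recursively until it reaches leaf sensors.
class Sensor:
    def __init__(self, name):
        self.name = name

    def query(self, q):
        # A leaf answers for itself.
        return [f"{self.name}: {q}"]

class Hub:
    def __init__(self, children):
        self.children = children  # mix of hubs and sensors

    def query(self, q):
        # Delegate down the tree and merge the answers.
        results = []
        for child in self.children:
            results.extend(child.query(q))
        return results

root = Hub([Hub([Sensor("dmz-1"), Sensor("dmz-2")]), Sensor("lab-1")])
answers = root.query("alerts/sec")
```

A sensor doubling as a hub is then just a node that both answers for itself and delegates to children.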
>
> I think that we need to remember that the vast majority of users do not
> deploy more than a few sensors, so we need to guard against spending too
> much devel time on features that will only serve a small percentage of
> the community. That said, audit, change management, archiving, and
> other management features are things that benefit everyone. As long as
> we keep it all modular, users can mix and match to get the features they
> require.
>
> --Martin
>
> On Tue, Nov 18, 2008 at 12:47 PM, John Pritchard
> <john.r.pritchard at gmail.com> wrote:
>
> Team,
>
> My apologies if I've missed this being covered previously. Or, if the
> Bro framework already takes some of these things into consideration.
>
> I'd like to suggest that we take very large deployments into
> consideration when designing our solution. The kinds of problems you
> encounter when managing an infrastructure with a handful or even a
> dozen IDS sensors are very different from those involved in
> efficiently and consistently managing infrastructures with larger
> footprints (e.g. 100+ sensors).
>
> A couple of suggestions to help our design address these potential
> deployment and management scenarios:
>
> 1) Centralized sensor and policy management platform (or framework)
> --> Such a solution may be restricted to a single centralized server
> or multiple servers.
> --> Might be a parent -> child relationship among configuration
> servers to segregate business units, or simply replication among
> servers for disaster recovery / business continuity purposes
> --> efficient, repeatable, and auditable methodology for making
> changes to both individual sensors as well as pre-defined groups of
> sensors (e.g. dmz sensors, dns sensors, development lab sensors,
> etc...)
> --> My experience to date has been performing these kinds of tasks with
> home-grown scripts, ssh, scp, audit logs, etc... However, it would be
> nice to find something a little more mature for this project.
>
> I have not used it, but there is a project called "Puppet" that looks
> like it might be a good candidate for trying to develop a framework
> along these lines:
> http://reductivelabs.com/trac/puppet/wiki/PuppetIntroduction
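To give a feel for the idea, a sensor manifest along these lines is roughly what Puppet would look like here. I haven't verified this against a live install, and all the paths, module, and service names are made up:

```puppet
# Hypothetical: push a shared rules file to every sensor and
# restart the sensor service whenever the file changes.
class ids_sensor {
  file { '/etc/ids/rules/local.rules':
    source => 'puppet:///modules/ids_sensor/local.rules',
    notify => Service['ids'],
  }

  service { 'ids':
    ensure => running,
  }
}
```

Grouping (dmz sensors, lab sensors, etc.) would then fall out of which classes get assigned to which nodes.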
>
>
> 2) Centralized sensor and policy monitoring platform (or framework)
> --> This is similar to the "management framework" concept, but the
> focus is more on device health monitoring... and possibly policy
> integrity monitoring...
> --> For this piece, I'm thinking of something that provides functions
> such as looking at memory utilization, cpu utilization, hard-drive
> space, network interface stats... and other "bits" such as dates and
> checksums for critical configuration files changed (e.g. detection
> policies, tuning policies, variable definitions, etc)...
>
> There are a number of open-source enterprise monitoring utilities out
> there. Here's one I've been playing with recently:
> http://www.zabbix.com/
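The "policy integrity" piece is, at minimum, checksumming the critical files and diffing snapshots over time. A rough sketch of that idea follows; the file paths and function names are placeholders, not part of any existing tool:

```python
# Record digests of critical config files so a monitor can flag
# unexpected changes to detection/tuning policies.
import hashlib
import os

def file_digest(path):
    # SHA-256 of a file's contents, read in chunks.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def snapshot(paths):
    # Map each existing config file to its current digest.
    return {p: file_digest(p) for p in paths if os.path.exists(p)}

def changed(old, new):
    # Files that appeared, disappeared, or whose digest differs.
    diff = set(old) ^ set(new)
    diff |= {p for p in old if p in new and old[p] != new[p]}
    return sorted(diff)
```

A monitor like Zabbix could then alert on a non-empty diff between the last-known-good snapshot and the current one.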
>
>
> 3) Distributed Data Repositories
> I know we briefly touched on the database stuff when talking about
> schema design. I just wanted to add a plug for a couple of things
> here:
> --> encrypted communication channels (sensor -> DB or sensor -> sensor)
> --> ability to simultaneously log to 2 or more data repositories
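Logging to two or more repositories at once can be as simple as fanning each event out to every configured sink, tolerating individual failures so one downed repository doesn't block the others. A minimal sketch, where the sink classes are stand-ins for real database writers:

```python
# Fan each event out to every configured sink; a failing sink is
# recorded but does not stop delivery to the others.
class ListSink:
    def __init__(self):
        self.events = []

    def write(self, event):
        self.events.append(event)

class FanoutLogger:
    def __init__(self, sinks):
        self.sinks = sinks

    def log(self, event):
        errors = []
        for sink in self.sinks:
            try:
                sink.write(event)
            except Exception as e:  # keep going if one repo is down
                errors.append((sink, e))
        return errors

primary, backup = ListSink(), ListSink()
logger = FanoutLogger([primary, backup])
failures = logger.log({"sig": "ET SCAN", "src": "10.0.0.1"})
```

Wrapping the sink connections in an encrypted channel (TLS, or a VPN as Martin does) keeps the sensor -> DB leg covered too.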
>
> I strongly agree with the concept of designing modular solutions, so
> that these kinds of features can be "bolted on" where required.
>
> Look forward to everyone's thoughts on how we can most effectively
> tackle problems of scale for large deployments.
>
> Cheers, John
> _______________________________________________
> Discussion mailing list
> Discussion at openinfosecfoundation.org
> http://lists.openinfosecfoundation.org/mailman/listinfo/discussion
>
--
--------------------------------------------
Matthew Jonkman
Emerging Threats
Phone 765-429-0398
Fax 312-264-0205
http://www.emergingthreats.net
--------------------------------------------
PGP: http://www.jonkmans.com/mattjonkman.asc