[Discussion] Non-tokenized preprocessor parameter lines

Martin Holste mcholste at gmail.com
Wed Feb 11 20:48:33 UTC 2009


I've done enough work with rule profiling to realize that I need help from
someone who understands Aho-Corasick better than I do, so that I can more
accurately figure out what the load on the detection engine would be.
(I posted something to this effect on the Emerging Threats list last week.)
One real performance factor, as far as I can tell, is whether a rule's
content match already appears in another rule.  If it does, then the cost of
adding that rule could be negligible, since the detection engine's effective
load doesn't actually increase.  So a tool that takes the entire ruleset
into account would be very helpful.  I know Sourcefire already does
something like this in Snort in a few places, but I think the number of
people who understand the state transition table reports could be counted on
one hand (judging from the lack of comments on the ET list).  That
information needs to be wrapped into a larger profiler.
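
To make that concrete, below is a rough sketch of the kind of ruleset-wide
check I have in mind (hypothetical code, not an existing tool): pull the
content: strings out of every rule and report which ones are already covered
by another rule, since those add little to the Aho-Corasick pattern set.
The parsing is deliberately simplified.

#!/usr/bin/env python
# Hypothetical sketch: report content: patterns shared across a Snort-style
# ruleset, as a rough proxy for rules that add little to the Aho-Corasick
# pattern set.  It ignores modifiers, PCRE, flow, etc., so treat the output
# as a hint, not a measurement.
import re
import sys
from collections import defaultdict

CONTENT_RE = re.compile(r'content:\s*"([^"]+)"', re.IGNORECASE)
SID_RE = re.compile(r'\bsid:\s*(\d+)')

def index_contents(rule_file):
    """Map each content string to the set of sids that use it."""
    contents = defaultdict(set)
    with open(rule_file) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#'):
                continue
            sid_match = SID_RE.search(line)
            sid = sid_match.group(1) if sid_match else line[:40]
            for pattern in CONTENT_RE.findall(line):
                contents[pattern].add(sid)
    return contents

if __name__ == '__main__':
    for pattern, sids in sorted(index_contents(sys.argv[1]).items(),
                                key=lambda kv: -len(kv[1])):
        if len(sids) > 1:
            print('%3d rules share content "%s"' % (len(sids), pattern))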

Regarding libpcap stats, to put it simply, they lie.  I'm speaking from
Snort experience here, but when I've used router byte counters to audit how
much traffic is going through an interface and then asked Snort how many
MB/sec it processed, the numbers are very, very different until I reduce the
load on the box with a subnet-based BPF.  The other problem is that libpcap
drop numbers are completely useless if you're using an Endace DAG card or
(correct me if this is not true) running through iptables.  Undetected drops
have been bad enough in my environment that I've resorted to creating
specific heartbeat signatures and testing for the absence of those alerts to
detect when a sensor is failing.  That's still far from perfect, though, as
there's plenty of room for drops in the middle.  In any case, it tells a
very different story than asking libpcap how many packets it's dropping.
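
For what it's worth, the heartbeat check is nothing fancy; something along
these lines run from cron is roughly what I mean (the paths, the sid, and
the log format are assumptions for illustration, not what I actually run):

#!/usr/bin/env python
# Hypothetical sketch of a heartbeat-absence check: count how many times the
# heartbeat sid appears in the alert log, compare with the count from the
# previous run, and complain if it hasn't grown.  Log rotation and multi-log
# setups are ignored to keep the sketch short.
import sys

ALERT_LOG = '/var/log/snort/alert'       # assumed alert log location
STATE_FILE = '/var/tmp/heartbeat.count'  # remembers the count between runs
HEARTBEAT_SID = '1:1000001:'             # assumed gid:sid:rev of the heartbeat rule

def count_heartbeats(path, sid):
    count = 0
    with open(path) as f:
        for line in f:
            if sid in line:
                count += 1
    return count

def previous_count(path):
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (IOError, ValueError):
        return -1

if __name__ == '__main__':
    now = count_heartbeats(ALERT_LOG, HEARTBEAT_SID)
    before = previous_count(STATE_FILE)
    with open(STATE_FILE, 'w') as f:
        f.write(str(now))
    # If the counter did not move since the last run, the sensor is either
    # down or silently dropping the heartbeat traffic.
    if before >= 0 and now <= before:
        print('WARNING: no new heartbeat alerts since last check')
        sys.exit(1)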

So here's an idea from left field: what if there were multiple, overlapping
detection engines running that were capable of auditing each other?  It
would be tough to get perfect, but one engine should have about the same TCP
state info as another engine, for instance, so a periodic comparison of
those states could shed light on how bad things are.  If the two engines
agree on 99.9 percent of the traffic, it's a safe bet that they are able to
read everything, because if they were both overloaded, there's little chance
that they would randomly agree on specific TCP states.  That's just a
brainstorm, and I realize that doubling the load on a box for audit purposes
may be a bit much, but it's that kind of approach I'm looking for, as
opposed to hoping that libpcap is truthful in its reports.
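
As a toy illustration of the comparison step only (the hard part, of course,
would be getting each engine to export its session table at the same
instant; the dump format below is made up):

#!/usr/bin/env python
# Toy illustration of the cross-audit idea: given two dumps of
# "flow -> TCP state", one per engine, report how often the engines agree.
import sys

def load_states(path):
    """Parse lines like '10.0.0.1:1234-10.0.0.2:80 ESTABLISHED' into a dict."""
    states = {}
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) == 2:
                states[parts[0]] = parts[1]
    return states

if __name__ == '__main__':
    a = load_states(sys.argv[1])
    b = load_states(sys.argv[2])
    common = set(a) & set(b)
    if not common:
        sys.exit('no overlapping flows to compare')
    agree = sum(1 for flow in common if a[flow] == b[flow])
    print('flows seen by both engines : %d' % len(common))
    print('agreement on TCP state     : %.1f%%' % (100.0 * agree / len(common)))
    print('flows seen by only one side: %d / %d'
          % (len(a) - len(common), len(b) - len(common)))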

On Wed, Feb 11, 2009 at 3:27 AM, Victor Julien <lists at inliniac.net> wrote:

> Martin Holste wrote:
> > I think a fair amount of auto-configuration for the super-techy details
> > would really help.  To complement that, I'd also really like to see a
> > focus on performance metrics.  Too often we are in situations where we
> > have to try to infer something based on rules that _didn't_ fire.  When
> > you're not confident in a sensor, that's basically impossible.  Some
> > sort of real-time non-libpcap-based drop statistic or load-shedding
> > would be a huge leap forward.  For bonus points, a system for providing
> > a 100% objective performance baseline of a given signature or module
> > would also really help.  I know that each rule performs differently
> > depending on the traffic at hand, but a metric detailing
> > worst-case/best-case scenario performance would provide a really nice
> > guideline to aid in making decisions about which rules should make the
> > cut into the ruleset.  This could be crudely calculated by, say, the
> > number of PCREs used, length of content searches, etc.
>
> Great suggestion. Matt and I have been talking about doing something
> like this for ET sigs for a while already, just never got to actually
> building something.
>
> You mentioned that you would like non-libpcap stats. What's wrong with
> them, and what is it you want instead?
>
> Regards,
> Victor
>
>
> > On Tue, Feb 10, 2009 at 10:12 AM, Matt Jonkman <jonkman at jonkmans.com> wrote:
> >
> >     I agree, I'm not enamored with the snort-style config. I'd much rather
> >     it be more dynamic, and possibly even real-time adjustable by the engine
> >     to suit its resources.
> >
> >     Or even better, one that would build a baseline of the box's
> >     capabilities and then config itself to suit. Such as choosing search
> >     methods that fit the RAM available, # of threads based on the CPUs
> >     available, etc. Take more of this out of black magic guesswork and into
> >     a more scientific method...
> >
> >     Matt
> >
> >     Victor Julien wrote:
> >     > Martin Fong wrote:
> >     >> Matt Jonkman wrote:
> >     >
> >     >>> Non-tokenized preprocessor parameter lines
> >     >> Let me rephrase this into what I'd like (versus definition by
> >     >> negation): It would be great if preprocessor arguments could
> >     >> (optionally) _include_ newlines to permit line-oriented parameter
> >     >> definition.  For example, this would allow
> >     >
> >     >>     allow newlines
> >     >
> >     >>     preprocessor myPreprocessor:            \
> >     >>     threshold = 1.0        # a description        \
> >     >>     max_count = 10        # another description
> >     >
> >     >>     disallow newlines
> >     >
> >     >> where "[dis]allow newlines" would dictate the parameter token
> scanner
> >     >> behavior.
> >     >
> >     >>      As a side issue, I'd also like more functionality in the
> >     >> mSplit() replacement.  Specifically, it would be nice if it accepted
> >     >> 0 (zero) for max_strs and then dynamically allocated the requisite
> >     >> members, particularly when the input is user-specified and thus the
> >     >> maximum is relatively unpredictable (e.g., IP blacklists).
> >     >
> >     > I think we need to work out a rules syntax and configuration scheme
> >     > first. I'm not convinced we should have a snort-compatible
> >     > configuration scheme... I haven't thought of alternatives though.
> >     >
> >     > Regards,
> >     > Victor
> >     >
> >
> >     --
> >     --------------------------------------------
> >     Matthew Jonkman
> >     Emerging Threats
> >     Phone 765-429-0398
> >     Fax 312-264-0205
> >     http://www.emergingthreats.net
> >     --------------------------------------------
> >
> >     PGP: http://www.jonkmans.com/mattjonkman.asc
> >
> >
> >     _______________________________________________
> >     Discussion mailing list
> >     Discussion at openinfosecfoundation.org
> >     http://lists.openinfosecfoundation.org/mailman/listinfo/discussion
> >
> >
>
>
> --
> ---------------------------------------------
> Victor Julien
> http://www.inliniac.net/
> PGP: http://www.inliniac.net/victorjulien.asc
> ---------------------------------------------
>