[Discussion] Features and Design Strategy

Fri Oct 17 18:53:08 UTC 2008

First off, many congrats, Matt!

When you look at the threat landscape, it's clear that the most imminent
threats are against web-facing clients and servers, so I think we should
focus on client-side attacks and attacks against web applications.
Detecting SMB protocol attacks, etc. is still important, but most attacks
against non-web protocols need a foothold on the LAN first.  Most IDS are
historically built for defending servers from clients, and I think that idea
is becoming a bit antiquated, though certainly far from obsolete.

My group has had great success doing what John at Berkeley is/will be doing
with file carving.  We've been auto-extracting packed files from the network
stream and auto-submitting to VirusTotal for over a year now.  John's point
about processing the layer-7 data inside the streams is dead-on.  We need to
move away from answering "what strings did the traffic contain" to answering
"what did the host actually do?"  We also need to keep in mind that
canonically defining what "bad" behavior is can be incredibly difficult even
if you're just trying to put it in words.

The design strategy that is important to me is plugin extensibility.  One
of the reasons I love perl so much is that there is a CPAN module for just
about anything you could ever dream of needing.  The extreme ease with which
you can incorporate others' code into your own project is what has kept it
thriving for so long.  The differences between individual organizations'
security needs and preferences mandate that a successful project will have a
long-term goal of being extremely modular, while also having a core
functionality that is superb.

Since Snort is what I'm familiar with, I will try to make my point by
comparison:  Snort has excellent core functionality, but despite Marty's
best efforts, extending it is not a trivial task because the code is in C.
When Jason Brvenik posted his SnortUnified.pm module which gives perl
access to realtime Snort alerts, I realized that there was suddenly an
avenue for decoupling the pure network-grepping core functionality of Snort
from the action component.  Moreover, I now had a hook for writing extremely
customized and complicated event handling in my programming language of
choice instead of trying to debug segfaults in preprocessors all day.  So,
while perl would be far too slow to do real-time content matching, it is
certainly not too slow to receive an alert, check to see if this source IP
has alerted recently, create an incident with some priority, see who needs
to be notified of the incident, and then send an alert.  So, the point is,
we only need the really fast component to do a limited number of tasks, and
if we decouple the real-time, per-packet tasks from the high level "what do
we do with this" tasks, anyone is free to trivially write their own hooks
into how they want things handled, which is the part that varies the most
from org to org.  Marty recognized all of this when he wrote the unified
output module, but Sourcefire never had a commercial reason for extending it
much since it would encroach on their own product offerings.  OISF has the
advantage of not being for-profit, and so we can make extensibility a
priority.
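
To make the alert-handling flow above concrete, here is a minimal sketch
of the "slow path" in Python (chosen only for brevity; perl would do just
as well).  Everything in it is an assumption for illustration: the alert
fields (src_ip, signature, sensor), the 300-second "recently" window, the
two priority levels, and the notify callback.  It is not the
SnortUnified.pm interface.

```python
import time

RECENT_WINDOW = 300  # seconds; assumed meaning of "alerted recently"


class AlertHandler:
    """Hypothetical high-level handler fed by a fast alerting core."""

    def __init__(self, contacts, notify, clock=time.time):
        self.contacts = contacts  # sensor name -> list of recipients
        self.notify = notify      # callable(recipient, incident)
        self.clock = clock        # injectable so the logic is testable
        self.last_seen = {}       # source IP -> time of its last alert
        self.incidents = []

    def handle(self, alert):
        now = self.clock()
        src = alert["src_ip"]
        previous = self.last_seen.get(src)
        # Has this source IP alerted within the recent window?
        repeat = previous is not None and now - previous <= RECENT_WINDOW
        self.last_seen[src] = now
        # Create an incident with some priority.
        incident = {
            "src_ip": src,
            "signature": alert["signature"],
            "priority": "high" if repeat else "low",
            "time": now,
        }
        self.incidents.append(incident)
        # Work out who needs to know, then tell them.
        for recipient in self.contacts.get(alert["sensor"], []):
            self.notify(recipient, incident)
        return incident
```

None of this runs per packet, so it can be as slow and as site-specific
as an org likes without costing the sniffer anything.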

What I would like to see is a brand-new program with functions similar to
Snort's HTTP preprocessor which has the sole purpose in life of sniffing the
network and attaching HTTP header fields to the traffic, most importantly
the host, content-type, and content-length fields.  This daemon would read a
config file which specifies rules more like scripts.  The rules would have
variables populated from a policy server, and it would update these
variables very regularly (without a program restart of any kind).  This
would de-couple the work of sniffing from the work of policy setting.  The
policy server would get its variables from all kinds of sources, like remote
XML files.  Armed with this, very simple signatures which identify web
applications could be written.  Here's a pseudo-code example of what I think
the configs would look like:

## On the policy server ##

# Declare vars

DECLARE $attacked = ARRAY;

DECLARE $attackers = ARRAY;

DECLARE $bad_hosts = { type=dns, url=blacklists.example.org/bad_dns.xml,
refresh_list=30 };

DECLARE $bad_ips = { type=ip, url=more_blacklists.example.org/bad_ips.xml,
refresh_list=30 };

DECLARE $our_windows_hosts = { type=ip,
url=http://mycompany.example.org/our_windows_hosts.xml, refresh_list=900 };

## On the HTTP sniffer ##

# Detect our web apps

ID $google AS [http,https ] FROM [ host='*.google.com' ];

ID $google_search_result AS [ $google ] MATCHING [ uri =~ /search/ ];

ID $our_web_clients AS [http,https] TO [ $our_windows_hosts ];

ID $us_sending_email AS [smtp] FROM [ $our_web_clients ];

ID $malware_webapp AS [ http, https ] FROM [ $bad_hosts OR $bad_ips ];

# Define our actions

# Create an event and record a client as attacked if it is referred to a
# malware page by a Google search

FORWARD EVENT IF { $google_search_result REFERS $malware_webapp } AND [ ADD
$_CLIENT TO $attacked, ADD $_SERVER TO $attackers ];
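
For anyone who wants to see how the FORWARD EVENT rule above might be
evaluated, here is a rough Python sketch.  It rests on assumptions that
the pseudo-code does not spell out: I read REFERS as "the request goes to
a $malware_webapp and its Referer header is a $google_search_result", the
request is a plain dict with hypothetical keys (host, server_ip,
client_ip, referer), and the blacklist contents are placeholder values.

```python
import re
from fnmatch import fnmatch
from urllib.parse import urlsplit

bad_hosts = {"malware.example.net"}  # stand-in for $bad_hosts
bad_ips = {"203.0.113.9"}            # stand-in for $bad_ips
attacked, attackers = set(), set()   # $attacked and $attackers


def is_google_search_result(url):
    """$google_search_result: http(s), host *.google.com, uri =~ /search/."""
    parts = urlsplit(url)
    return (parts.scheme in ("http", "https")
            and fnmatch(parts.hostname or "", "*.google.com")
            and re.search(r"search", parts.path) is not None)


def is_malware_webapp(request):
    """$malware_webapp: traffic from a blacklisted host name or IP."""
    return request["host"] in bad_hosts or request["server_ip"] in bad_ips


def process(request, forward):
    """Apply the one FORWARD EVENT rule; forward() hands the event off."""
    referer = request.get("referer", "")
    if is_malware_webapp(request) and is_google_search_result(referer):
        attacked.add(request["client_ip"])   # ADD $_CLIENT TO $attacked
        attackers.add(request["server_ip"])  # ADD $_SERVER TO $attackers
        forward({"rule": "google_search_refers_malware",
                 "client": request["client_ip"],
                 "server": request["server_ip"]})
        return True
    return False
```

Note that process() never decides what the event means; it only records
the match and passes it along.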

The important thing is that our daemon doing the HTTP sniffing is agnostic:
it has no idea if things are bad or good, it just lets somebody know when a
predefined set of criteria has been met.  It is up to an event policy server
to receive these events and do something intelligent (and highly
configurable) with them.  So, we've decoupled any decision making from the
component that has the most time-sensitive work to do.  All of the other
components can buffer their work, but the network sniffer needs as much work
lifted from it as possible so that it can cope with high loads.  The policy
server components would be the only parts that would require much
customization, and they would be written in a high-level language like perl
for maximum extensibility.

Let's also not forget the plethora of other apps out there that we can
leverage.  Of particular importance would be SANCP and PADS.

SANCP's newest version contains a feature that John Curry added for me which
writes out an index of the file position within a bulk pcap where a packet
starts and stops.  This means that you can do instantaneous bulk packet
retrievals using the index.  In my tests, I can pull an arbitrary host's
traffic from a 30 GB pcap in under 5 seconds.  This technique could be used
for file carving, which would provide the ability to send to sandboxes, send
to VirusTotal, Symantec, etc.
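
The reason an index makes retrieval so fast can be shown with a toy
Python example.  This is not SANCP's index format or the pcap format; the
"capture" here is just payloads written back to back into an in-memory
file, and the index layout (host -> list of start/stop byte offsets) is
made up for illustration.

```python
import io
from collections import defaultdict


def build_capture(packets):
    """Write (host, payload) pairs back to back and index their offsets."""
    capture = io.BytesIO()
    index = defaultdict(list)  # host -> [(start, stop), ...]
    for host, payload in packets:
        start = capture.tell()
        capture.write(payload)
        index[host].append((start, capture.tell()))
    return capture, index


def fetch_host(capture, index, host):
    """Seek straight to each indexed span; the rest is never read."""
    chunks = []
    for start, stop in index.get(host, []):
        capture.seek(start)
        chunks.append(capture.read(stop - start))
    return chunks
```

The cost of fetch_host() depends on how much traffic the host generated,
not on the size of the capture, which is why a lookup against a very
large file can still come back in seconds.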

PADS provides a ready-made daemon for identifying the protocols used between
hosts.  If PADS fed a policy server, we could automagically update the
policy server when new Windows hosts come online, or when an SMTP session
starts.  Since the HTTP sniffer would be reading from the policy server on a
regular basis, PADS would be able to keep it informed as to the state of the
network.  P0f could work as well.
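
A minimal Python sketch of a policy variable that the sniffer re-reads on
a timer, so that new hosts show up without any restart.  The class name,
the fetch callback (standing in for "ask the policy server"), and the
injectable clock are all hypothetical; refresh_seconds plays the role of
refresh_list in the pseudo-config.

```python
import time


class PolicyVariable:
    """Set of values that reloads itself once it is older than its TTL."""

    def __init__(self, fetch, refresh_seconds, clock=time.monotonic):
        self.fetch = fetch                  # callable returning an iterable
        self.refresh_seconds = refresh_seconds
        self.clock = clock                  # injectable for testing
        self.values = set()
        self.loaded_at = None               # None means "never loaded"

    def current(self):
        now = self.clock()
        stale = (self.loaded_at is None
                 or now - self.loaded_at >= self.refresh_seconds)
        if stale:
            self.values = set(self.fetch())
            self.loaded_at = now
        return self.values

    def __contains__(self, item):
        return item in self.current()
```

Usage would look something like
`windows_hosts = PolicyVariable(fetch_windows_hosts, 900)` followed by
`if ip in windows_hosts: ...` in the sniffer, where fetch_windows_hosts
is whatever call returns the list that PADS or p0f has been feeding.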

So, the main idea would be to have a very lightweight daemon with a single,
specific task that forwards events to an event policy server written in a
higher-level language, and to remove almost all decision making from the
sniffer.

Comments?

Thanks,

Martin