[Discussion] Rules Syntax

Martin Holste mcholste at gmail.com
Sun Jan 4 00:53:49 UTC 2009


I think that Claudio's English signatures are good examples to start with.
Even a repository of paragraphs like that would be valuable to me.

A port-tracking database wouldn't be that big (as in, only a few gigs) if it
stored just the last remote port seen on a per-local-IP basis.  With decent
indexing, it would be extremely fast.
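
For what it's worth, here's the kind of thing I have in mind: a minimal
sketch in Python/SQLite (the table and column names are just placeholders I
made up, not taken from any existing tool):

    import sqlite3
    import time

    db = sqlite3.connect("port_tracking.db")
    db.execute("""CREATE TABLE IF NOT EXISTS last_seen (
                      local_ip    TEXT NOT NULL,
                      remote_port INTEGER NOT NULL,
                      last_ts     INTEGER NOT NULL,
                      PRIMARY KEY (local_ip, remote_port))""")

    def record_flow(local_ip, remote_port):
        # Keep only the latest time this local IP was seen using this port.
        db.execute("INSERT OR REPLACE INTO last_seen VALUES (?, ?, ?)",
                   (local_ip, remote_port, int(time.time())))

    def is_new_behavior(local_ip, remote_port, window_days=90):
        # The "never started any ftp traffic in the last three months" test.
        cutoff = int(time.time()) - window_days * 86400
        row = db.execute("SELECT last_ts FROM last_seen "
                         "WHERE local_ip = ? AND remote_port = ?",
                         (local_ip, remote_port)).fetchone()
        return row is None or row[0] < cutoff

The primary key doubles as the index, so the lookup stays fast even with
millions of (IP, port) pairs.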

Regarding statistical analysis, I highly encourage the group to check out
the NFSen project (nfsen.sourceforge.net) for inspiration.  They have an
entire framework for statistical alerting on Netflow contained in a small
set of very pluggable/extensible Perl packages.  I'm extremely impressed
with its efficiency and the overall software architecture.  I've recently
implemented it in my org and it took almost zero effort.  In particular,
they have a plugin which applies a Holt-Winters exponential smoothing
algorithm to flows and alerts on the resulting statistical anomalies.  Email
alerts are standard, but there is a plugin for MySQL alert logging as well.
The quality of the code is exceptional (I was very impressed to see so much
effort spent on input validation!) and I plan on bolting it onto my
existing SIM.  I would think the Holt-Winters plugin code could be modified
to analyze a lot of input types.
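
To give a flavor of what that plugin is doing (this is not NFSen's Perl,
just a toy Python illustration of level/trend smoothing with a deviation
band; real Holt-Winters also carries a seasonal term):

    def holt_winters_alerts(series, alpha=0.5, beta=0.1, d_w=0.1, k=3.0):
        # series: e.g. flows-per-interval counts from the collector
        level, trend, dev = float(series[0]), 0.0, 0.0
        alerts = []
        for t, y in enumerate(series[1:], start=1):
            predicted = level + trend
            err = y - predicted
            # Alert when the observation falls outside the smoothed band.
            if dev > 0 and abs(err) > k * dev:
                alerts.append((t, y, predicted))
            # Update the smoothed level, trend, and deviation estimates.
            last_level = level
            level = alpha * y + (1 - alpha) * (level + trend)
            trend = beta * (level - last_level) + (1 - beta) * trend
            dev = d_w * abs(err) + (1 - d_w) * dev
        return alerts

    # Usage: holt_winters_alerts(flows_per_interval) returns a list of
    # (index, observed, predicted) tuples for points outside the band.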

I also want to point out that Netflow (or equivalent) is a great way to
disambiguate the inside IP addresses of a NAT point without having to
install a separate sensor.
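
The correlation itself is simple; something along these lines (the field
names are assumptions on my part, not any particular collector's schema):

    # Given a post-NAT flow seen at the border, find candidate inside hosts
    # by matching destination IP/port and start time against flows exported
    # from a router on the inside of the NAT.
    def inside_candidates(outside_flow, inside_flows, slack_secs=5):
        matches = []
        for f in inside_flows:
            if (f["dst_ip"] == outside_flow["dst_ip"]
                    and f["dst_port"] == outside_flow["dst_port"]
                    and abs(f["start"] - outside_flow["start"]) <= slack_secs):
                matches.append(f["src_ip"])  # the real pre-NAT address
        return matches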

As for rotating tcpdumps, I recommend checking out the methods both Time
Machine (http://www.net.t-labs.tu-berlin.de/research/tm/) and SANCP use to
decide whether to record traffic or not.  Setting it to ignore RTSP and
other types of bulky but useless network traffic can save a ton of disk
space.  By default, Time Machine will store only the first N bytes of
traffic per connection, so you can get a really great profile of a ton of
connections with a reasonable amount of disk.  For those of you with any
kind of budget at all, I would point out that you can build a 24 TB RAID
enclosure for under $4K (complete with hardware RAID5).  Disk is cheap!
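
The cutoff idea itself is easy to picture; here is a toy sketch (not Time
Machine's actual logic or configuration, and the numbers are made up):

    CUTOFF_BYTES = 15 * 1024    # keep roughly the first 15 KB per connection
    SKIP_PORTS = {554}          # e.g. drop RTSP entirely

    bytes_kept = {}             # connection 5-tuple -> bytes stored so far

    def should_store(conn_key, dst_port, pkt_len):
        if dst_port in SKIP_PORTS:
            return False        # bulky-but-useless traffic never hits disk
        kept = bytes_kept.get(conn_key, 0)
        if kept >= CUTOFF_BYTES:
            return False        # past the cutoff; stop recording this flow
        bytes_kept[conn_key] = kept + pkt_len
        return True

Since most connections are short and the long ones are mostly bulk transfer,
the cutoff throws away the bytes you were least likely to look at anyway.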

That said, you can often get 80-90% of the intel you need with today's
malware by simply recording URLs via a web proxy or something like tshark,
and getting that into a database.  I am finding that more and more, sifting
through pcaps just ends up with me extracting the trail of URLs, which
tells the whole story.  This trail is especially helpful when you use Anubis
from iseclab.org to sandbox-analyze a given URL.
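
The database side of that is trivial; a minimal sketch (the log format here
is made up; substitute whatever your proxy or tshark run actually emits):

    import sqlite3

    db = sqlite3.connect("url_trail.db")
    db.execute("CREATE TABLE IF NOT EXISTS urls "
               "(ts TEXT, client_ip TEXT, url TEXT)")
    db.execute("CREATE INDEX IF NOT EXISTS urls_client ON urls (client_ip, ts)")

    def load(logfile):
        with open(logfile) as fh:
            for line in fh:
                # Assume whitespace-separated "timestamp client_ip url" records.
                parts = line.split()
                if len(parts) >= 3:
                    db.execute("INSERT INTO urls VALUES (?, ?, ?)", parts[:3])
        db.commit()

    # The "trail" for a suspect host is then just:
    #   SELECT ts, url FROM urls WHERE client_ip = '10.1.2.3' ORDER BY ts;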

--Martin

On Sun, Dec 21, 2008 at 10:11 AM, Matt Jonkman <jonkman at jonkmans.com> wrote:

>
> Claudio Criscione wrote:
> >
> > Well, what about:
> >
> > "if someone in my organization has never started any ftp traffic in the
> last
> > three months starts an ftp connection, notify me and start watching more
> > carefully that person. "
>
> I like this too. How do we store that kind of data for the long term? Even
> if we were to store just the last timestamp we saw that port in use on a
> given IP, that'd be a significant amount of data on the average net to go
> back months, no?
>
> How do we go after that then? (Call to the db experts out there)
>
> Use a sliding scale, so that as the user-defined storage allocation fills
> up, older data drops out?
>
> Use a limited range of ports, and/or group together high port ranges?
>
>
> >
> >  - Someone vs some machine
> > Using the IP address is still the only way to go in most cases, but we
> > need more sophisticated means to identify who's who as networks evolve
> > (think about whole cities behind a NAT)
>
> I think we should think more inside the firewall for these issues, no?
>
> There are ways, and several commercial products, that track a user to an
> IP in real time. Cisco does, I believe, and surely others do too: LDAP/AD
> integration, netbios login monitoring, etc. It's possible, but it's a big
> thing to tackle, and we'd likely have patent conflicts. We can explore it,
> though, if there's a large enough driver to get it. Thoughts?
>
>
> > - "in the last three months" can actually be translated to "is not used
> > to" or "does not usually"
> > The issue with statistical approaches is that you really have to develop
> > custom models. What about "signature-based statistical models"?
>
> Yes, statistical approaches are tough. I'd like to see what is available
> out there in this area these days as far as open research. As I
> mentioned, I think it'll be a good use of some of our grant money to
> contract or grant fund a real statistician or group of such. Maybe we
> could get it made into a class project at a university somewhere under
> the guidance of an experienced statistician.
>
> >
> > - "watching more carefully"
> > I'm not sure we always want the same "resolution" on network traffic, and
> > I feel it would be great to be able to zoom in on suspicious activity
> > automatically without having to carry the burden of logging everything
> > every time
> >
>
> Another good point. Most folks these days do that with rotating
> tcpdumps, but you're time-limited there. If you don't get to that alert
> before the pcap rotates out, you've lost it. Are there better approaches
> out there?
>
> Matt
>
>
> --
> --------------------------------------------
> Matthew Jonkman
> Emerging Threats
> Phone 765-429-0398
> Fax 312-264-0205
> http://www.emergingthreats.net
> --------------------------------------------
>
> PGP: http://www.jonkmans.com/mattjonkman.asc
>