[Discussion] Features suggestion

Jeremy Hewlett jh at dok.org
Fri Nov 7 05:29:36 UTC 2008


Thanks for your response - comments inline.

On Thu, Nov 06, 2008 at 19:56:06 -0600, Martin Holste wrote:
>    not subjective in any way.  A system which can spew out info like
>    "these 4 hosts downloaded files which have an average .src entropy
>    higher than 7" would be more valuable than a system which tried to
>    guess at which 4 hosts were not acting like the other hosts on the
>    network.  Obviously, if your system can pinpoint interesting things,
>    you are close to identifying odd traffic patterns.  The key
>    difference I am trying to point out is that I think it's less useful
>    to spend time identifying how many or what percentage entropy is

I'm not sure I can agree completely on this point. I still believe that
having an accurate model of how your network behaves is the best way to
detect novel and modified attacks, as well as abnormal traffic patterns.
It's not so much "guessing which 4 hosts aren't acting like the other
hosts" as comparing a host to itself over time and observing deviations
from the norm.
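
To make "comparing a host to itself over time" concrete, here is a
minimal sketch, not a real IDS component. The class name HostBaseline,
the choice of metric (any per-host number, such as bytes per interval),
and the threshold of 3 standard deviations are all assumptions made for
illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


class HostBaseline:
    """Keeps a per-host history of one numeric metric (hypothetical helper)."""

    def __init__(self, min_samples=5, threshold=3.0):
        self.history = defaultdict(list)  # host -> list of past values
        self.min_samples = min_samples    # observations needed before judging
        self.threshold = threshold        # allowed deviation, in std deviations

    def observe(self, host, value):
        """Return True if value deviates from this host's own past, then record it."""
        past = self.history[host]
        anomalous = False
        if len(past) >= self.min_samples:
            mu = mean(past)
            sigma = pstdev(past)
            if sigma == 0:
                # A perfectly constant history: any change is a deviation.
                anomalous = value != mu
            else:
                anomalous = abs(value - mu) / sigma > self.threshold
        past.append(value)
        return anomalous


baseline = HostBaseline()
for sample in [100, 110, 90, 105, 95]:
    baseline.observe("10.0.0.5", sample)   # learning phase, never flags
print(baseline.observe("10.0.0.5", 5000))  # far outside this host's norm
```

Note that the host is only ever compared against its own history, never
against the other hosts on the network.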

In your example, there's no guarantee that after an attack the victim
will download a packed (high entropy) binary - or download anything at
all. With an anomaly-based system, that wouldn't matter. The mere fact
that the host is receiving abnormal traffic (say, traffic outside the
observed protocol spec, or an unusually large packet) is enough to raise
an alert. In either case, if that particular victim has never talked to
.cn hosts and is now doing so, that is another deviation worth
investigating.
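
The ".cn" case can be sketched in a few lines. This is only an
illustration: the DestinationTracker class and its learn/is_new methods
are hypothetical names, and it assumes destination hostnames are already
available (for example from DNS or HTTP logs).

```python
from collections import defaultdict


class DestinationTracker:
    """Remembers which top-level domains each host has talked to (hypothetical)."""

    def __init__(self):
        self.seen = defaultdict(set)  # host -> set of TLDs seen so far

    @staticmethod
    def _tld(hostname):
        # Crude TLD extraction: the last dot-separated label, lowercased.
        return hostname.rstrip(".").rsplit(".", 1)[-1].lower()

    def learn(self, host, dest_hostname):
        """Record a destination seen during the baseline period."""
        self.seen[host].add(self._tld(dest_hostname))

    def is_new(self, host, dest_hostname):
        """True if this host has never contacted this TLD before."""
        return self._tld(dest_hostname) not in self.seen[host]


tracker = DestinationTracker()
tracker.learn("10.0.0.5", "www.example.com")
tracker.learn("10.0.0.5", "mail.example.org")
print(tracker.is_new("10.0.0.5", "update.example.cn"))  # never seen for this host
```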

Anyway, I digress; I understand the point you are making. So, on to the
reason I replied in the first place... :)

Picking out traffic (files) with high entropy is a great idea. Have you
looked at http://code.google.com/p/pefile/ at all? I've been using that
from time to time to pinpoint files of interest on compromised boxes.
Something similar to pefile would be easy enough to implement in an IDS.
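
For reference, the entropy measure itself is small. The sketch below
computes Shannon entropy in bits per byte over a raw buffer using only
the standard library; it does not use pefile. The function names are
made up for illustration, and the threshold of 7 is taken from the
example quoted above rather than from any tuned value.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_packed(data: bytes, threshold: float = 7.0) -> bool:
    """Flag buffers whose entropy exceeds the (assumed) threshold."""
    return shannon_entropy(data) > threshold


print(shannon_entropy(b"A" * 1000))            # a single repeated byte
print(shannon_entropy(bytes(range(256)) * 4))  # every byte value equally likely
print(looks_packed(b"hello world " * 100))     # plain text stays well below 7
```

Packed or encrypted payloads tend toward the upper end of the range,
while text and ordinary code sections sit noticeably lower.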



