[Oisf-devel] filemd5?

Martin Holste mcholste at gmail.com
Thu Feb 16 15:06:25 UTC 2012

Ok, so to really make this thing pay off, and without too much effort,
I bet this could be done:

1. Add a simple config option to suricata.yaml for md5list: some_file_name.txt
2. Read the file into memory at startup
3. Generate an alert if any of the runtime-detected md5's match the list.
4. Re-read the file periodically (or on Linux, asynchronously when the inode changes)
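
To make the four steps above concrete, here is a rough sketch in Python (not Suricata's C, and not an actual implementation). The names load_md5_list and Md5List are made up for illustration, and the file format is an assumption: one hex md5 per line, with blank lines and "#" comments ignored. The reload check simply polls os.stat() and compares inode, mtime and size; an inotify-based variant on Linux would replace that polling.

```python
import os
import re

# Assumed file format: one 32-character hex md5 per line.
MD5_RE = re.compile(r"[0-9a-fA-F]{32}")


def load_md5_list(path):
    """Step 2: read the list into memory as a set of lowercase hex digests."""
    digests = set()
    with open(path, "r", encoding="ascii", errors="replace") as fh:
        for line in fh:
            token = line.strip()
            if not token or token.startswith("#"):
                continue  # skip blank lines and comments
            if MD5_RE.fullmatch(token):
                digests.add(token.lower())
            # malformed lines are silently ignored in this sketch
    return digests


class Md5List:
    """Hypothetical holder for the list named by an 'md5list' config option."""

    def __init__(self, path):
        self.path = path
        self._stamp = None
        self.digests = set()
        self.reload_if_changed()

    def _current_stamp(self):
        st = os.stat(self.path)
        # A change in inode, mtime or size is treated as "file changed".
        return (st.st_ino, st.st_mtime_ns, st.st_size)

    def reload_if_changed(self):
        """Step 4: call periodically; returns True if the list was re-read."""
        stamp = self._current_stamp()
        if stamp == self._stamp:
            return False
        self.digests = load_md5_list(self.path)
        self._stamp = stamp
        return True

    def matches(self, md5_hex):
        """Step 3: True if a runtime-computed md5 is on the list."""
        return md5_hex.lower() in self.digests
```

A caller would build Md5List once at startup, call matches() with each file's md5 as it is computed, and call reload_if_changed() from a timer. A set lookup is constant time on average, so the per-file cost stays small even for fairly long lists.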


Of course, some lists of md5's may be too big for memory, but
I say too bad for those folks; just start out with this much and
everyone will win.  A proper memory/disk trade-off mechanism could be
concocted later to deal with mega-lists of md5's.

On Thu, Feb 16, 2012 at 8:59 AM, Nikolay Denev <ndenev at gmail.com> wrote:
> On Feb 16, 2012, at 4:21 PM, Victor Julien wrote:
>> So I guess the best development happens when you're actually doing
>> boring stuff and you allow yourself to spend 30 minutes on a hunch. Of
>> course the 30 minutes becomes a couple of hours, but who cares :)
>> Anyway, the hunch here was integrating libnss' md5 calculation code into
>> the Suricata file inspection/extraction code, calculating the md5
>> checksum of files on the fly.
>> Turns out it works and at decent speeds too. In a test pcap I extract
>> 8393 files in 16.9 seconds. With md5 on the fly it's 17.6 seconds.
>> Sounds acceptable, no?
>> Right now all I have is writing the md5 to the .meta file, like so:
>> TIME:              10/02/2009-21:35:10.556990
>> PCAP PKT NUM:      6225
>> SRC IP:  
>> DST IP:  
>> PROTO:             6
>> SRC PORT:          80
>> DST PORT:          1091
>> FILENAME:          /ww/aa7.exe
>> MAGIC:             PE32 executable for MS Windows (GUI) Intel 80386 32-bit
>> STATE:             CLOSED
>> MD5:               e148eaaadceecb2e3e25fd25809cb5db
>> SIZE:              25712
>> But obviously this needs to be made available to the rule language. I
>> was thinking a simple filemd5 keyword to start, allowing matching on
>> single md5's. But the real value is probably in a keyword that allows
>> you to check an entire db of md5's all at once. I'm sure there are ppl
>> sitting on large collections of known bad md5.
>> Does this all make sense? Any other ideas?
>> --
>> ---------------------------------------------
>> Victor Julien
>> http://www.inliniac.net/
>> PGP: http://www.inliniac.net/victorjulien.asc
>> ---------------------------------------------
> Very cool!
> It seems this could also be used to provide DLP functionality, i.e. keep a database of md5 checksums of files with sensitive data and
> alert if the data is leaked (regardless of filename). I've heard of at least one DLP vendor that uses a similar method to detect unauthorized data leaks.
