[Oisf-users] [Oisf-devel] file extraction -- Re: [COMMIT] OISF branch, master, updated. a556338936ad3cd2b0379a6985fb62084368d99e

Peter Manev petermanev at gmail.com
Thu Dec 1 09:38:28 UTC 2011

This is absolutely phenomenal - great work, dev team!
It makes it a lot easier to find, learn, teach, and look into PDF and other attachments.

On Thu, Dec 1, 2011 at 10:26 AM, Kevin Ross <kevross33 at googlemail.com> wrote:

> Oh happy day.... :-) This will be great for getting binaries off the network
> for things like exploit kits and so on. I think having the file is
> essential: to confirm exploits, and to see whether you detect something as
> malware and, if not, how you could fix that when something has landed on a
> machine. Grabbing files off the network and analysing them to find out what
> is bad is good in itself, but being able to, say, extract the files dropped
> after a Java exploit using those sigs, or a file sent from an exploit kit,
> will be very useful.
> Great work.
> On 29 November 2011 16:54, Victor Julien <victor at inliniac.net> wrote:
>> From my blog:
>> http://www.inliniac.net/blog/2011/11/29/file-extraction-in-suricata.html
>> File extraction in Suricata
>> Today I pushed out a new feature in Suricata I’m very excited about. It
>> has been long in the making and with over 6000 new lines of code it’s a
>> significant effort. It’s available in the current git master. I’d
>> consider it alpha quality, so handle with care.
>> So what is this all about? Simply put, we can now extract files from
>> HTTP streams in Suricata. Both uploads and downloads. Fully controlled
>> by the rule language. But that’s not all. I’ve added a touch of magic. By
>> utilizing libmagic (this powers the “file” command), we know the file
>> type of files as well. Lots of interesting stuff that can be done there.
>> Rule keywords
>> Four new rule keywords were added: filename, fileext, filemagic and
>> filestore.
>> Filename and fileext are pretty trivial: match on the full name or file
>> extension of a file.
>>    alert http any any -> any any (filename:"secret.xls";)
>>    alert http any any -> any any (fileext:"pdf";)
>> More interesting is the filemagic keyword. It runs on the magic output
>> of inspecting the (start of) a file. This value is for example:
>>    GIF image data, version 89a, 1 x 1
>>    PE32 executable for MS Windows (GUI) Intel 80386 32-bit
>>    HTML document text
>>    Macromedia Flash data (compressed), version 9
>>    MS Windows icon resource - 2 icons, 16x16, 256-colors
>>    PNG image data, 70 x 53, 8-bit/color RGBA, non-interlaced
>>    JPEG image data, JFIF standard 1.01
>>    PDF document, version 1.6
>> Matching on this with the filemagic keyword is pretty simple:
>>    alert http any any -> any any (filemagic:"PDF document";)
>>    alert http any any -> any any (filemagic:"PDF document, version 1.6";)
>> Pretty cool, eh? You can match both very specifically and loosely. For
>> example:
>>    alert http any any -> any any (filemagic:"executable for MS Windows";)
>> Will match on (among others) these types:
>>    PE32 executable for MS Windows (DLL) (GUI) Intel 80386 32-bit
>>    PE32 executable for MS Windows (GUI) Intel 80386 32-bit
>>    PE32+ executable for MS Windows (GUI) Mono/.Net assembly
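To get a feel for the loose versus specific matching described above, here is a rough Python sketch. It assumes filemagic behaves like a plain substring test on the magic string, which is a simplification and not taken from the Suricata source. The helper name magic_matches is made up for illustration, and the sample strings are copied from the post.

```python
# Magic strings copied from the examples in the post.
MAGIC_SAMPLES = [
    "GIF image data, version 89a, 1 x 1",
    "PE32 executable for MS Windows (DLL) (GUI) Intel 80386 32-bit",
    "PE32 executable for MS Windows (GUI) Intel 80386 32-bit",
    "PE32+ executable for MS Windows (GUI) Mono/.Net assembly",
    "PDF document, version 1.6",
]


def magic_matches(pattern, magic, negate=False):
    """Hypothetical helper: substring test on a magic string.

    negate=True mimics the filemagic:!"..." form. Assumption: the real
    keyword may differ in details such as case handling.
    """
    hit = pattern in magic
    return not hit if negate else hit


# Loose pattern: selects every Windows executable variant in the list.
loose = [m for m in MAGIC_SAMPLES if magic_matches("executable for MS Windows", m)]

# Specific pattern: selects only the exact PDF version.
specific = [m for m in MAGIC_SAMPLES if magic_matches("PDF document, version 1.6", m)]

for m in loose:
    print("loose match:", m)
for m in specific:
    print("specific match:", m)
```

The shorter the pattern, the more file types it covers, so a short pattern combined with filestore can write a lot of files to disk.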
>> Finally there is the filestore keyword. It is the simplest of all: if
>> the rule matches, the files will be written to disk.
>> Naturally you can combine the file keywords with the regular HTTP
>> keywords, limiting to POSTs for example:
>>    alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"pdf upload
>> claimed, but not pdf"; flow:established,to_server; content:"POST";
>> http_method; fileext:"pdf"; filemagic:!"PDF document"; filestore; sid:1;
>> rev:1;)
>> This will alert on and store all files that are uploaded using a POST
>> request that have a filename extension of pdf, but the actual file is
>> not pdf.
>> Storage
>> The storage to disk is handled by a new output module called “file”.
>> Its config looks like this:
>> enabled: yes # set to yes to enable
>> log-dir: files # directory to store the files
>> force-magic: no # force logging magic on all stored files
>> It needs to be enabled for file storing to work.
>> The files are stored to disk as “file.1”, “file.2”, etc. For each of the
>> files a meta file is created containing the flow information, file name,
>> size, etc. Example:
>> TIME: 01/27/2010-17:41:11.579196
>> PCAP PKT NUM: 2847035
>> SRC IP:
>> DST IP:
>> PROTO: 6
>> SRC PORT: 80
>> DST PORT: 56207
>> /msdownload/update/software/defu/2010/01/mpas-fe_7af9217bac55e4a6f71c989231e424a9e3d9055b.exe
>> MAGIC: PE32+ executable for MS Windows (GUI) Mono/.Net assembly
>> SIZE: 5204
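If you want to post-process these meta files, something like the following Python sketch could be a starting point. It assumes the layout is exactly the "KEY: value" lines shown in the example above, which is an alpha-quality format and may well change. The function name parse_meta is made up, and lines without a colon (such as the path line in the example) are collected separately instead of being guessed at.

```python
# Sample copied from the meta file example in the post (IP fields are empty there).
SAMPLE_META = """\
TIME: 01/27/2010-17:41:11.579196
PCAP PKT NUM: 2847035
SRC IP:
DST IP:
PROTO: 6
SRC PORT: 80
DST PORT: 56207
/msdownload/update/software/defu/2010/01/mpas-fe_7af9217bac55e4a6f71c989231e424a9e3d9055b.exe
MAGIC: PE32+ executable for MS Windows (GUI) Mono/.Net assembly
SIZE: 5204
"""


def parse_meta(text):
    """Hypothetical parser: split 'KEY: value' lines into a dict.

    Returns (fields, unlabeled), where unlabeled holds lines without a colon.
    """
    fields = {}
    unlabeled = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Split on the first colon only, so the colons in the time stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
        else:
            unlabeled.append(line)
    return fields, unlabeled


fields, unlabeled = parse_meta(SAMPLE_META)
print(fields["MAGIC"], int(fields["SIZE"]))
print(unlabeled)
```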
>> Configuration
>> The file extraction is for HTTP only currently, and works on top of our
>> HTTP parser. As the HTTP parser runs on top of the stream reassembly
>> engine, configuration parameters of both these parts of Suricata affect
>> handling of files.
>> The stream engine option “stream.reassembly.depth” (default 1 MB)
>> controls the depth into a stream in which we look. Set to 0 for no limit.
>> The libhtp options request-body-limit and response-body-limit control
>> how far into an HTTP request or response body we look. Again set to 0 for
>> no limit. This can be controlled per HTTP server.
>> Performance
>> The file handling is fully streaming, so it’s very efficient.
>> Nonetheless there will be an overhead for the extra parsing,
>> bookkeeping, writing to disk, etc. Memory requirements appear to be limited
>> as well. Suricata shouldn’t keep more than a few kb per flow in memory.
>> Limitations
>> Lack of limits is a limitation. For file storage no limits have been
>> implemented yet. So it’s easy to clutter your disk up with files.
>> Example: a 118 GB enterprise pcap storing just JPGs extracted 400,000
>> files. Better use a separate partition if you’re on a live link.
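Since no storage limits exist yet, one way to keep an eye on the store directory from the outside is a small watchdog script. The Python sketch below is only an illustration under assumptions: the function names and the byte budget are made up, the "files" path is the log-dir value from the config example and will differ per setup, and nothing here is part of Suricata itself.

```python
import os


def dir_usage(path):
    """Hypothetical helper: return (file_count, total_bytes) for a directory tree."""
    total_bytes = 0
    file_count = 0
    for root, _dirs, names in os.walk(path):
        for name in names:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total_bytes += os.path.getsize(full)
                file_count += 1
    return file_count, total_bytes


def over_budget(path, max_bytes):
    """True when the directory holds more than max_bytes of files."""
    _count, total = dir_usage(path)
    return total > max_bytes


if __name__ == "__main__":
    store_dir = "files"            # assumption: log-dir from the config example
    budget = 10 * 1024 ** 3        # assumption: arbitrary 10 GiB budget
    if os.path.isdir(store_dir):
        count, total = dir_usage(store_dir)
        print(f"{count} files, {total} bytes in {store_dir}")
        if over_budget(store_dir, budget):
            print("warning: file store is over budget")
```

Run from cron or a loop, this only reports; deciding what to delete is left to the operator.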
>> Future work
>> Apart from stabilizing this code and performance optimizing it, the next
>> step will be SMTP file extraction. Possibly other protocols, although
>> nothing is set in stone there yet.
>> --
>> ---------------------------------------------
>> Victor Julien
>> http://www.inliniac.net/
>> PGP: http://www.inliniac.net/victorjulien.asc
>> ---------------------------------------------
>> _______________________________________________
>> Oisf-users mailing list
>> Oisf-users at openinfosecfoundation.org
>> http://lists.openinfosecfoundation.org/mailman/listinfo/oisf-users

Peter Manev