[Oisf-devel] extracted to filestore may not always be original file

Kyle Creyts kyle.creyts at gmail.com
Thu Oct 11 21:35:21 UTC 2012

Has anyone else noticed that some percentage of the time[1] when a
rule with filestore in it triggers, a file will either not be
written to filestore (bug1), or be written in a jumbled and
sometimes incomplete fashion (bug2)?

I have had bug1 happen to me repeatedly, but I can't reliably
reproduce the circumstances; when it does happen, it happens many
times in a row: suricata[2] writes out only about 1 in 3 of the
files which should have been extracted due to filestore rules[3].
The binaries it does output all seem to be in order.
When it runs like this, I have noticed that many of the suricata
workers jump to reading at about 15MB/s from disk for the duration of
the run, and the run takes about 20s to complete on the attached pcap.
Otherwise, it takes about 5s, and I don't see any major disk hit.

In the other case (logs, files, and input pcap attached[4]) it outputs 1
binary for every binary that triggered the filestore rules, but some
small percentage of these binaries may be missing chunks, may have extra
chunks, or may be written in a jumbled order. This is something I have
been able to reliably reproduce, and have attached extensive debug
logs for.
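
One way to tell the three kinds of damage apart (missing, extra, or
reordered chunks) is to compare a damaged file against an intact copy
chunk by chunk. The following is only a rough sketch, not anything
suricata provides: the function names and the 1460-byte chunk size are
assumptions for illustration, and fixed-size chunking only lines up if
the damage happens to fall on chunk boundaries.

```python
from bisect import bisect_right
from collections import defaultdict
import hashlib


def chunk_digests(data, size):
    """MD5 digest of each fixed-size chunk (the last chunk may be shorter)."""
    return [hashlib.md5(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]


def compare_chunks(reference, extracted, size=1460):
    """Classify chunk-level differences between an intact and a damaged file.

    `size` is a hypothetical chunk size, not a value taken from suricata.
    """
    ref = chunk_digests(reference, size)
    got = chunk_digests(extracted, size)

    # Every position at which each digest occurs in the reference (ascending).
    positions = defaultdict(list)
    for index, digest in enumerate(ref):
        positions[digest].append(index)

    # Reference chunks that never show up in the extracted file.
    seen = set(got)
    missing = [i for i, d in enumerate(ref) if d not in seen]

    extra, out_of_order = [], []
    last = -1  # reference position of the most recently matched chunk
    for j, digest in enumerate(got):
        if digest not in positions:
            extra.append(j)  # content that does not exist in the reference
            continue
        candidates = positions[digest]
        k = bisect_right(candidates, last)
        if k == len(candidates):
            # Only occurs at or before the last match: written out of order
            # (a chunk that was written twice is also reported here).
            out_of_order.append(j)
        else:
            last = candidates[k]
    return {"missing": missing, "extra": extra, "out_of_order": out_of_order}
```

The reference would be any extracted file carrying the majority md5sum,
and the indices returned are chunk numbers, not byte offsets.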

I am pretty sure that these are two separate bugs.

Is this a thread-scheduling problem, or some other weird race
condition? Or is it something more easily fixed?

My test ran with just 2 rules enabled, to detect win32 binaries. The
pcap being processed is included, and consists of 251 wgets of the
same binary. The results of this testing are attached.
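
Because all 251 downloads are the same binary, a run can be summarized
by counting how many files were stored and how many of them share the
most common digest. A minimal sketch of such a check follows; the
function name is hypothetical, and it assumes the extracted files are
named file.<N> and that the most common digest belongs to the intact
binary.

```python
import hashlib
import re
from collections import Counter
from pathlib import Path


def summarize_filestore(directory, expected=251):
    """Tally MD5 digests of extracted files and compare against `expected`."""
    tally = Counter()
    for path in Path(directory).iterdir():
        # Assumption: extracted files are named file.<N>; skip anything else.
        if path.is_file() and re.fullmatch(r"file\.\d+", path.name):
            tally[hashlib.md5(path.read_bytes()).hexdigest()] += 1

    stored = sum(tally.values())
    # Assumption: the most common digest is the undamaged binary.
    intact = tally.most_common(1)[0][1] if tally else 0
    return {
        "stored": stored,                          # files written at all
        "not_stored": max(0, expected - stored),   # candidates for bug1
        "intact": intact,
        "damaged": stored - intact,                # candidates for bug2
        "digests": tally,
    }
```

For example, summarize_filestore("log/suricata/files") would report the
same per-digest counts as an md5sum | sort | uniq -c pipeline, plus the
number of files that were never written.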

[1] (I've had it vary between 0% and 66%, most frequently ~1% )
[2] (tested on versions 1.2.1, 1.3.1, 1.3.2)
[3] (out of my test set of 251 file-downloads, it would save only
80-130, but on average, about 90)
[4] I apologize for the weird multiple layers of compression, but the
raw logs are ~300MB, and GMail wouldn't take the archive without an
encrypted layer because there was a corrupt binary (! that's why I'm
mailing the list!). The password is: logs

Kyle Creyts

Information Assurance Professional
-------------- next part --------------
  md5sum log/suricata/files/file.{1..251}|cut -d " " -f1 |sort|uniq -c
      1 1aa0a615e05ecbe2c45ab2ce4f085935
      1 811a7062d3f29d3973d45a80cd12aa7d
    249 829e4805b0e12b383ee09abdc9e2dc3c

  md5sum log/suricata/files/file.{1..251}|cut -d " " -f1 |sort|uniq -c
      1 5cac0d77af1b8c5b28c9bd2b4bb9a6d1
      1 5e300f8207cc41624c64389c24da56f8
      1 811a7062d3f29d3973d45a80cd12aa7d
    248 829e4805b0e12b383ee09abdc9e2dc3c

  md5sum log/suricata/files/file.{1..251}|cut -d " " -f1 |sort|uniq -c
      1 0d34851794b0ac0c0487f3a433a2f158
      1 36550b53aef67af982bd8b49116514f7
      1 811a7062d3f29d3973d45a80cd12aa7d
    245 829e4805b0e12b383ee09abdc9e2dc3c
      1 93ef359e379a8751992c895586c997e4
      1 a3e809844ffa29ae013819f0c1679ead
      1 f72c0b96a8bce986edaf514385a88c3b

  md5sum log/suricata/files/file.{1..251}|cut -d " " -f1 |sort|uniq -c
      1 811a7062d3f29d3973d45a80cd12aa7d
    248 829e4805b0e12b383ee09abdc9e2dc3c
      1 c25b36aee35661abc98f09b2455a4917
      1 e989c92342e553db9594b361aeb5a3a4

  md5sum log/suricata/files/file.{1..251}|cut -d " " -f1 |sort|uniq -c
      1 811a7062d3f29d3973d45a80cd12aa7d
    249 829e4805b0e12b383ee09abdc9e2dc3c
      1 faaed181e00c50c96d9ef43ebff3b0f6
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.7z.zip
Type: application/zip
Size: 5510090 bytes
Desc: not available
URL: <http://lists.openinfosecfoundation.org/pipermail/oisf-devel/attachments/20121011/52b7e010/attachment-0001.zip>
