[Oisf-users] Suricata pegs a detect thread and drops packets

Ivan Ristić ivanr at webkreator.com
Wed Jun 25 16:14:35 UTC 2014


David,

It's possible that LibHTP is the performance bottleneck there. We have a
couple of places where we track issues similar to the one you reported:

  https://redmine.openinfosecfoundation.org/issues/822

  https://github.com/ironbee/libhtp/issues/56
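
For anyone reading the gdb output quoted below: the list holds 8512
elements (current_size = 8512) and the thread was stopped at
"r = l->elements[i];" with idx=3446. One plausible reading, which this
thread does not confirm, is that get() walks the ring slot by slot, so
reading every element once costs time quadratic in the list size. A
minimal C sketch of that assumption follows. The names ring_list_t,
ring_get_walk, ring_get_index and ring_scan_cost are hypothetical; only
the field names are taken from the gdb output, and this is not libhtp's
source.

```c
#include <stddef.h>

/* Hypothetical ring-buffer list. Field names mirror the gdb "print *l"
 * output in this thread; the code itself is an illustration only. */
typedef struct {
    size_t first;        /* slot of the oldest element */
    size_t last;         /* slot where the next element would be stored */
    size_t max_size;     /* number of allocated slots */
    size_t current_size; /* number of stored elements */
    void **elements;     /* the slots themselves */
} ring_list_t;

/* Assumed slow lookup: start at the first slot and step forward idx times.
 * If steps is not NULL, it is incremented once per slot visited. */
void *ring_get_walk(const ring_list_t *l, size_t idx, size_t *steps) {
    if (l == NULL || idx >= l->current_size) return NULL;

    size_t i = l->first;
    void *r = l->elements[i];

    while (idx--) {
        if (++i == l->max_size) i = 0; /* wrap around the ring */
        r = l->elements[i];
        if (steps != NULL) (*steps)++;
    }

    return r;
}

/* Constant-time alternative: compute the slot directly. */
void *ring_get_index(const ring_list_t *l, size_t idx) {
    if (l == NULL || idx >= l->current_size) return NULL;
    return l->elements[(l->first + idx) % l->max_size];
}

/* Slot visits needed to read every element once with ring_get_walk(). */
size_t ring_scan_cost(const ring_list_t *l) {
    size_t steps = 0;
    for (size_t idx = 0; idx < l->current_size; idx++) {
        (void) ring_get_walk(l, idx, &steps);
    }
    return steps;
}
```

Under this assumption, one full pass over 8512 elements costs
8512 * 8511 / 2 = 36,222,816 slot visits with ring_get_walk(), against
8512 direct lookups with ring_get_index().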

I'll look into it.

Thanks for your reports!


On 23/06/2014 15:58, David Vasil wrote:
> Anything I should do to tune the list size downward? If so, what
> implications does that have for inspection? Thanks!
> 
> -dave
> 
> On Jun 23, 2014 10:56 AM, "Anoop Saldanha" <anoopsaldanha at gmail.com
> <mailto:anoopsaldanha at gmail.com>> wrote:
> 
>     Dave,
> 
>     That's a huge list and that's pretty much the issue without doubt.
> 
>     On Mon, Jun 23, 2014 at 8:21 PM, David Vasil <davidvasil at gmail.com
>     <mailto:davidvasil at gmail.com>> wrote:
>     > Sorry about that:
>     >
>     > (gdb) info threads
>     >   Id   Target Id         Frame
>     >   14   Thread 0x7fcdf5621700 (LWP 32689) "RxPFR1" 0x00007fcdf998589c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>     >   13   Thread 0x7fcdf4e20700 (LWP 32690) "RxPFR2" 0x00007fcdf9241a43 in poll () from /lib/x86_64-linux-gnu/libc.so.6
>     >   12   Thread 0x7fcdd7fff700 (LWP 32691) "Detect1" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>     >   11   Thread 0x7fcdd77fe700 (LWP 32692) "Detect2" 0x00007fcdfa692cdf in htp_list_array_get (l=0x7fcdd0c53190, idx=3446) at htp_list.c:98
>     >   10   Thread 0x7fcdd6ffd700 (LWP 32693) "Detect3" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>     >   9    Thread 0x7fcdcffff700 (LWP 32694) "Detect4" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>     >   8    Thread 0x7fcdd67fc700 (LWP 32695) "Detect5" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>     >   7    Thread 0x7fcdd5ffb700 (LWP 32696) "Detect6" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>     >   6    Thread 0x7fcdd57fa700 (LWP 32697) "Detect7" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>     >   5    Thread 0x7fcdd4ff9700 (LWP 32698) "Detect8" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>     >   4    Thread 0x7fcdcf7fe700 (LWP 32699) "FlowManagerThre" 0x00007fcdf99830fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>     >   3    Thread 0x7fcdceffd700 (LWP 32700) "SCPerfWakeupThr" 0x00007fcdf99830fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>     >   2    Thread 0x7fcdce7fc700 (LWP 32701) "SCPerfMgmtThrea" 0x00007fcdf99830fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>     > * 1    Thread 0x7fcdfaab2840 (LWP 32668) "Suricata-Main" 0x00007fcdf921908d in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
>     > (gdb) t 11
>     > [Switching to thread 11 (Thread 0x7fcdd77fe700 (LWP 32692))]
>     > #0  0x00007fcdfa692cdf in htp_list_array_get (l=0x7fcdd0c53190, idx=3446) at htp_list.c:98
>     > 98              r = l->elements[i];
>     > (gdb) print *l
>     > $1 = {first = 0, last = 8512, max_size = 16384, current_size = 8512, elements = 0x7fcdd2658c50}
>     >
>     > Cooper,
>     >   I'll try the dev branch as well.
>     >
>     > Thanks,
>     > -dave
>     >
>     >
>     > On Mon, Jun 23, 2014 at 9:32 AM, Duarte Silva
>     > <duarte.silva at serializing.me <mailto:duarte.silva at serializing.me>> wrote:
>     >>
>     >> On Monday 23 June 2014 09:08:45 David Vasil wrote:
>     >> > Here is the result after removing all of the -O2's from the libhtp
>     >> > makefile and waiting for a detect thread to hit 100%.
>     >> >
>     >> > http://pastebin.com/aHp415HV
>     >>
>     >> Hi Dave,
>     >>
>     >> the print doesn't say much other than the memory address of the
>     >> variable :P
>     >> Instead, could you do a:
>     >>
>     >> (gdb) print *l
>     >>
>     >> It should print all of the members of the htp_list_array_t
>     >> structure. If not, do this:
>     >>
>     >> (gdb) print l->max_size
>     >> (gdb) print l->current_size
>     >>
>     >> Cheers,
>     >> Duarte
>     >>
>     >> >
>     >> > Also, setting transparent hugepages to always did not seem to
>     >> > prevent this from occurring.
>     >> >
>     >> > -dave
>     >> >
>     >> >
>     >> > On Fri, Jun 20, 2014 at 10:36 PM, Anoop Saldanha
>     >> > <anoopsaldanha at gmail.com <mailto:anoopsaldanha at gmail.com>> wrote:
>     >> > > Dave,
>     >> > >
>     >> > > Nice!  You did manage to catch that thread inside the get() function.
>     >> > > The perf top output indicates that it's the same issue, although it
>     >> > > would be nice if I could look at the contents of the variable "l"
>     >> > > inside that get() function.  Unfortunately, from your gdb output it
>     >> > > looks like you forgot to compile libhtp without optimization.  (I
>     >> > > typed it wrongly as "with" in my previous mail, instead of
>     >> > > "without", although I did specify -g3 later on.)
>     >> > >
>     >> > > Can you get the output of "print l" inside the get() function, but
>     >> > > with optimization disabled in the libhtp library?
>     >> > >
>     >> > > You will have to rebuild libhtp without optimization.  Go to your
>     >> > > libhtp build directory.  Run the "configure" command like you have
>     >> > > before.  Before you type "make", open the "Makefile" in the libhtp
>     >> > > root/base directory and in the "htp" subdirectory, and replace
>     >> > > "-g -O2" with just "-g".  Now run the make command.  Confirm on the
>     >> > > console during the build stage that none of the gcc commands have
>     >> > > "-g -O2" and that they have just "-g" instead.
>     >> > >
>     >> > > You can then do a "make install", OR manually copy the .so files
>     >> > > from the "htp/.libs/" directory to replace the libhtp libraries
>     >> > > that you are linking with suricata.
>     >> > >
>     >> > > If you still can't see the symbols inside libhtp's functions when you
>     >> > > run suricata, your suricata binary is pointing to the wrong/old libhtp
>     >> > > library.
>     >> > >
>     >> > >
>     >> > > On Sat, Jun 21, 2014 at 12:42 AM, David Vasil
>     >> > > <davidvasil at gmail.com <mailto:davidvasil at gmail.com>> wrote:
>     >> > > > I was able to do this after Detect5 hit 100% and stayed there.  I
>     >> > > > reverted back to my originally compiled suricata 2.0.1 deb package
>     >> > > > (without --enable-debug) as that flag created a ton of overhead - as
>     >> > > > you mentioned, probably due to not being compiled with optimization -
>     >> > > > and it also ended up core dumping several times.  I copied the
>     >> > > > unstripped libhtp lib and suricata binary (again, without
>     >> > > > --enable-debug) to the appropriate destinations and was able to see
>     >> > > > the debugging symbols as expected.  Attached is a 'perf top' drilling
>     >> > > > into the annotated code within htp_list_array_get showing where the
>     >> > > > time is being spent (I assume).  9d99, not in the screenshot, shows
>     >> > > > the following:
>     >> > > >
>     >> > > >     0.08 :            9d99:       repz retq
>     >> > > >
>     >> > > >          :            free(l->elements);
>     >> > > >          :            free(l);
>     >> > > >          :
>     >> > > >          :        }
>     >> > > >
>     >> > > > GDB output from this thread is here: http://pastebin.com/3tfjTsL0
>     >> > > >
>     >> > > > Thanks!
>     >> > > > -dave
>     >> > > >
>     >> > > >
>     >> > > > On Fri, Jun 20, 2014 at 9:41 AM, Anoop Saldanha
>     >> > > > <anoopsaldanha at gmail.com <mailto:anoopsaldanha at gmail.com>> wrote:
>     >> > > >> I don't think --enable-debug compiles it with optimization.
>     >> > > >> Instead compile it without optimization, i.e. either -g -O0 or -g3.
>     >> > > >> Copy the new binaries over, like you previously did.  You will also
>     >> > > >> have to compile libhtp the same way.  You can either specify this in
>     >> > > >> the CFLAGS environment variable passed to configure or manually edit
>     >> > > >> the configure script and the makefiles, replacing all "-g -O2" with
>     >> > > >> just "-g".
>     >> > > >>
>     >> > > >> 1. Start suricata and wait for one of the detect threads to hit
>     >> > > >> the 100% cpu utilization mark (make a note of the detect thread
>     >> > > >> name).
>     >> > > >> 2. Once you see that, attach gdb to the running process, and print
>     >> > > >> the threads using "info threads".  If you see the offending thread
>     >> > > >> stuck in the libhtp get() function call, switch over to that thread
>     >> > > >> using "t <thread_number>" and do a "print l".  The symbol "l" is
>     >> > > >> inside the libhtp get() function call.  Unless the detect thread is
>     >> > > >> inside the libhtp get() function scope that we are trying to trace,
>     >> > > >> you won't have the symbol available for printing.
>     >> > > >> 3. If "info threads" doesn't show any of the threads currently
>     >> > > >> inside the htp get() function (it has gone out of scope at that
>     >> > > >> instant), continue the process in gdb, and keep a tab on the
>     >> > > >> threads with top/htop until you see the detect thread(s) hit the
>     >> > > >> 100% cpu mark again, after which you can interrupt the process
>     >> > > >> inside gdb again and hopefully find the detect thread still inside
>     >> > > >> the libhtp get() function context.
>     >> > > >>
>     >> > > >> With the issue at hand, once the thread gets pegged, you should be
>     >> > > >> able to zero in on the thread pretty quickly.  In case you can't,
>     >> > > >> I'll provide a debug patch to corner the issue.
>     >> > >
>     >> > > --
>     >> > > -------------------------------
>     >> > > Anoop Saldanha
>     >> > > http://www.poona.me
>     >> > > -------------------------------
>     >>
>     >
> 
> 
> 
>     --
>     -------------------------------
>     Anoop Saldanha
>     http://www.poona.me
>     -------------------------------
> 


-- 
Ivan


