[Oisf-users] Suricata pegs a detect thread and drops packets
David Vasil
davidvasil at gmail.com
Mon Jun 23 14:58:56 UTC 2014
Is there anything I should do to tune the list size downward? If so, what
implications does that have for inspection? Thanks!
-dave
On Jun 23, 2014 10:56 AM, "Anoop Saldanha" <anoopsaldanha at gmail.com> wrote:
> Dave,
>
> That's a huge list, and it is almost certainly the cause of the issue.
>
> On Mon, Jun 23, 2014 at 8:21 PM, David Vasil <davidvasil at gmail.com> wrote:
> > Sorry about that:
> >
> > (gdb) info threads
> > Id Target Id Frame
> > 14 Thread 0x7fcdf5621700 (LWP 32689) "RxPFR1" 0x00007fcdf998589c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
> > 13 Thread 0x7fcdf4e20700 (LWP 32690) "RxPFR2" 0x00007fcdf9241a43 in poll () from /lib/x86_64-linux-gnu/libc.so.6
> > 12 Thread 0x7fcdd7fff700 (LWP 32691) "Detect1" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
> > 11 Thread 0x7fcdd77fe700 (LWP 32692) "Detect2" 0x00007fcdfa692cdf in htp_list_array_get (l=0x7fcdd0c53190, idx=3446) at htp_list.c:98
> > 10 Thread 0x7fcdd6ffd700 (LWP 32693) "Detect3" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
> > 9 Thread 0x7fcdcffff700 (LWP 32694) "Detect4" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
> > 8 Thread 0x7fcdd67fc700 (LWP 32695) "Detect5" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
> > 7 Thread 0x7fcdd5ffb700 (LWP 32696) "Detect6" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
> > 6 Thread 0x7fcdd57fa700 (LWP 32697) "Detect7" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
> > 5 Thread 0x7fcdd4ff9700 (LWP 32698) "Detect8" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
> > 4 Thread 0x7fcdcf7fe700 (LWP 32699) "FlowManagerThre" 0x00007fcdf99830fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
> > 3 Thread 0x7fcdceffd700 (LWP 32700) "SCPerfWakeupThr" 0x00007fcdf99830fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
> > 2 Thread 0x7fcdce7fc700 (LWP 32701) "SCPerfMgmtThrea" 0x00007fcdf99830fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
> > * 1 Thread 0x7fcdfaab2840 (LWP 32668) "Suricata-Main" 0x00007fcdf921908d in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
> > (gdb) t 11
> > [Switching to thread 11 (Thread 0x7fcdd77fe700 (LWP 32692))]
> > #0 0x00007fcdfa692cdf in htp_list_array_get (l=0x7fcdd0c53190, idx=3446) at htp_list.c:98
> > 98 r = l->elements[i];
> > (gdb) print *l
> > $1 = {first = 0, last = 8512, max_size = 16384, current_size = 8512, elements = 0x7fcdd2658c50}
> >
> > Cooper,
> > I'll try the dev branch as well.
> >
> > Thanks,
> > -dave
> >
> >
> > On Mon, Jun 23, 2014 at 9:32 AM, Duarte Silva <duarte.silva at serializing.me> wrote:
> >>
> >> On Monday 23 June 2014 09:08:45 David Vasil wrote:
> >> > Here is the result after removing all of the -O2's from the libhtp
> >> > makefile and waiting for a detect thread to hit 100%.
> >> >
> >> > http://pastebin.com/aHp415HV
> >>
> >> Hi Dave,
> >>
> >> The print doesn't say much other than the memory address of the
> >> variable :P Instead, could you do a:
> >>
> >> (gdb) print *l
> >>
> >> It should print all of the members of the htp_list_array_t structure.
> >> If not, do this:
> >>
> >> (gdb) print l->max_size
> >> (gdb) print l->current_size
> >>
> >> Cheers,
> >> Duarte
> >>
> >> >
> >> > Also, setting transparent hugepages to always did not seem to prevent
> >> > this from occurring.
> >> >
> >> > -dave
> >> >
> >> >
> >> > On Fri, Jun 20, 2014 at 10:36 PM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
> >> > > Dave,
> >> > >
> >> > > Nice! You did manage to catch that thread inside the get() function.
> >> > > The perf-top output indicates that it's the same issue, although it
> >> > > would be nice if I could look at the contents of the variable "l"
> >> > > inside that get() function. Unfortunately, from your gdb output it
> >> > > looks like libhtp was not compiled without optimization. (I wrongly
> >> > > typed "with" instead of "without" in my previous mail, although I did
> >> > > specify -g3 later on.)
> >> > >
> >> > > Can you get the output of "print l" inside the get() function, but
> >> > > with optimization disabled in the libhtp library?
> >> > >
> >> > > You will have to rebuild libhtp without optimization. Go to your
> >> > > libhtp build directory and run the "configure" command as you did
> >> > > before. Before you type "make", find the "Makefile" in the libhtp
> >> > > root/base directory and in the "htp" subdirectory, and replace
> >> > > "-g -O2" with just "-g". Now run the make command. Confirm on the
> >> > > console during the build that none of the gcc commands have
> >> > > "-g -O2" and that they have just "-g" instead.
> >> > >
> >> > > You can then do a "make install", OR manually copy the .so files
> >> > > from the "htp/.libs/" directory to replace the libhtp libraries
> >> > > that you are linking with suricata.
> >> > >
> >> > > If you still can't see the symbols inside libhtp's functions when you
> >> > > run suricata, your suricata binary is pointing to the wrong/old libhtp
> >> > > library.
> >> > >
> >> > >
> >> > > On Sat, Jun 21, 2014 at 12:42 AM, David Vasil <davidvasil at gmail.com> wrote:
> >> > > > I was able to do this after Detect5 hit 100% and stayed there. I
> >> > > > reverted back to my originally compiled suricata 2.0.1 deb package
> >> > > > (without --enable-debug) as that flag created a ton of overhead - as
> >> > > > you mentioned, probably due to not being compiled with optimization -
> >> > > > and it also ended up core dumping several times. I copied the
> >> > > > unstripped libhtp lib and suricata binary (again, without
> >> > > > --enable-debug) to the appropriate destinations and was able to see
> >> > > > the debugging symbols as expected. Attached is a 'perf top' drilling
> >> > > > into the annotated code within htp_list_array_get showing where the
> >> > > > time is being spent (I assume). 9d99, not in the screenshot, shows
> >> > > > the following:
> >> > > > 0.08 : 9d99: repz retq
> >> > > >
> >> > > > : free(l->elements);
> >> > > > : free(l);
> >> > > > :
> >> > > > : }
> >> > > >
> >> > > > The GDB output from this thread is here: http://pastebin.com/3tfjTsL0
> >> > > >
> >> > > > Thanks!
> >> > > > -dave
> >> > > >
> >> > > >
> >> > > > On Fri, Jun 20, 2014 at 9:41 AM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
> >> > > >> I don't think --enable-debug compiles it with optimization.
> >> > > >> Instead compile it without optimization, i.e. either -g -O0 or -g -03.
> >> > > >> Copy the new binaries over, like you previously did. You will also
> >> > > >> have to compile libhtp the same way. You can either specify this in
> >> > > >> the environment variable with configure or manually edit the
> >> > > >> configure script and the makefiles, replacing all "-g -O2" with just
> >> > > >> "-g".
> >> > > >> 1. Start suricata, and wait for one of the detect threads to hit
> >> > > >> the 100% CPU utilization mark (make a note of the detect thread
> >> > > >> name).
> >> > > >> 2. Once you see that, attach gdb to the running process, and print
> >> > > >> the threads using "info threads". If you see the offending thread
> >> > > >> stuck in the libhtp get() function call, switch over to that thread
> >> > > >> using "t <thread_number>" and do a "print l". The symbol "l" is
> >> > > >> inside the libhtp get() function call. Unless the detect thread is
> >> > > >> inside the libhtp get() function scope that we are trying to trace,
> >> > > >> you won't have the symbol available for printing.
> >> > > >> 3. If, when you do an "info threads", you don't see any of the
> >> > > >> threads currently inside the htp get() function (it has gone out of
> >> > > >> scope at that instant), continue the process in gdb, and keep tabs
> >> > > >> on the threads with top/htop until you see the detect thread(s) hit
> >> > > >> the 100% CPU mark again. You can then interrupt the process inside
> >> > > >> gdb again and hopefully find the detect thread still inside the
> >> > > >> libhtp get() function context.
> >> > > >>
> >> > > >> With the issue at hand, once the thread gets pegged, you should be
> >> > > >> able to zero in on the thread pretty quickly. In case you can't,
> >> > > >> I'll provide a debug patch to corner the issue.
> >> > >
> >> > > --
> >> > > -------------------------------
> >> > > Anoop Saldanha
> >> > > http://www.poona.me
> >> > > -------------------------------
> >>
> >
>
>
>
> --
> -------------------------------
> Anoop Saldanha
> http://www.poona.me
> -------------------------------
>
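To make the "huge list" point concrete, here is a rough sketch of why a list of 8512 entries could keep a detect thread busy. It is not libhtp's code. It assumes that get() reaches the requested entry by stepping through the array one slot at a time, which the loop variable "i" at htp_list.c:98 (separate from "idx") suggests but does not prove, and it assumes a caller that fetches every entry by index in turn. The type ring_list_t and the functions get_by_walking(), get_by_index() and full_pass_steps() are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in with the same fields gdb printed for *l. */
typedef struct {
    size_t first, last, max_size, current_size;
    void **elements;
} ring_list_t;

/* Assumed behaviour: step from `first` to the requested index one slot
 * at a time, so a single lookup costs `idx` steps. */
static void *get_by_walking(const ring_list_t *l, size_t idx, size_t *steps) {
    if (l == NULL || idx >= l->current_size) return NULL;
    size_t i = l->first;
    void *r = l->elements[i];
    while (idx--) {
        if (++i == l->max_size) i = 0;  /* wrap around the ring */
        r = l->elements[i];
        (*steps)++;
    }
    return r;
}

/* Constant-time alternative: compute the slot directly. */
static void *get_by_index(const ring_list_t *l, size_t idx) {
    if (l == NULL || idx >= l->current_size) return NULL;
    return l->elements[(l->first + idx) % l->max_size];
}

/* Fetch every element of an n-entry list by index and return how many slot
 * steps the walking version needed (0 on bad sizes or allocation failure). */
static size_t full_pass_steps(size_t n, size_t max_size) {
    if (n == 0 || n > max_size) return 0;
    ring_list_t l = { 0, n, max_size, n, NULL };
    l.elements = calloc(max_size, sizeof *l.elements);
    if (l.elements == NULL) return 0;
    /* Give each used slot a distinct, non-NULL value (its own address). */
    for (size_t k = 0; k < n; k++) l.elements[k] = &l.elements[k];

    size_t steps = 0;
    for (size_t k = 0; k < n; k++) {
        void *walked = get_by_walking(&l, k, &steps);
        assert(walked == get_by_index(&l, k));  /* both lookups must agree */
    }
    free(l.elements);
    return steps;
}
```

With the sizes from the gdb output, full_pass_steps(8512, 16384) counts about 36 million slot steps for a single pass over the list, compared with 8512 direct lookups. Under these assumptions the cost grows with the square of the list length, so a smaller list would help far more than proportionally.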