[Oisf-users] Suricata pegs a detect thread and drops packets

Anoop Saldanha anoopsaldanha at gmail.com
Mon Jun 23 14:56:47 UTC 2014


Dave,

That's a huge list (current_size = 8512), and it is almost certainly the
cause of the issue.
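
To illustrate why a list of that size could peg a thread, here is a minimal C sketch. It assumes, based only on the frame at htp_list.c:98 (`r = l->elements[i];`) and the fields shown by "print *l" in the quoted gdb output, that get() reaches an index by walking the ring buffer one slot at a time from `first`. The names `ring_list_t`, `ring_get_walk`, `ring_get_direct` and `ring_scan_steps` are hypothetical, and this is not the libhtp source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ring-buffer list; field names mirror the gdb "print *l" output. */
typedef struct {
    size_t first;        /* index of the oldest element */
    size_t last;         /* index one past the newest element */
    size_t max_size;     /* number of allocated slots */
    size_t current_size; /* number of slots in use */
    void **elements;
} ring_list_t;

/* Assumed behaviour: walk idx slots from first, wrapping at max_size.
 * Cost is O(idx); *steps (if non-NULL) is incremented once per loop step. */
static void *ring_get_walk(const ring_list_t *l, size_t idx, size_t *steps) {
    if (l == NULL || idx >= l->current_size) return NULL;
    size_t i = l->first;
    void *r = l->elements[i];
    while (idx--) {
        if (++i == l->max_size) i = 0;
        r = l->elements[i];
        if (steps != NULL) (*steps)++;
    }
    return r;
}

/* Same element computed directly from the index. Cost is O(1). */
static void *ring_get_direct(const ring_list_t *l, size_t idx) {
    if (l == NULL || idx >= l->current_size) return NULL;
    return l->elements[(l->first + idx) % l->max_size];
}

/* Count the loop steps needed to visit every element once via ring_get_walk(). */
static size_t ring_scan_steps(const ring_list_t *l) {
    size_t steps = 0;
    for (size_t idx = 0; idx < l->current_size; idx++) {
        void *p = ring_get_walk(l, idx, &steps);
        assert(p == ring_get_direct(l, idx));
        (void) p;
    }
    return steps;
}
```

Under that assumption, a single get() at idx=3446 costs 3446 loop steps, and one full pass over 8512 entries costs 8512 * 8511 / 2 = 36,222,816 steps, so any caller that scans the whole list repeatedly would keep a detect thread busy. The direct-index variant returns the same element in constant time.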

On Mon, Jun 23, 2014 at 8:21 PM, David Vasil <davidvasil at gmail.com> wrote:
> Sorry about that:
>
> (gdb) info threads
>   Id   Target Id         Frame
>   14   Thread 0x7fcdf5621700 (LWP 32689) "RxPFR1" 0x00007fcdf998589c in
> __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>   13   Thread 0x7fcdf4e20700 (LWP 32690) "RxPFR2" 0x00007fcdf9241a43 in poll
> () from /lib/x86_64-linux-gnu/libc.so.6
>   12   Thread 0x7fcdd7fff700 (LWP 32691) "Detect1" 0x00007fcdf9982d84 in
> pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>   11   Thread 0x7fcdd77fe700 (LWP 32692) "Detect2" 0x00007fcdfa692cdf in
> htp_list_array_get (l=0x7fcdd0c53190, idx=3446) at htp_list.c:98
>   10   Thread 0x7fcdd6ffd700 (LWP 32693) "Detect3" 0x00007fcdf9982d84 in
> pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>   9    Thread 0x7fcdcffff700 (LWP 32694) "Detect4" 0x00007fcdf9982d84 in
> pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>   8    Thread 0x7fcdd67fc700 (LWP 32695) "Detect5" 0x00007fcdf9982d84 in
> pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>   7    Thread 0x7fcdd5ffb700 (LWP 32696) "Detect6" 0x00007fcdf9982d84 in
> pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>   6    Thread 0x7fcdd57fa700 (LWP 32697) "Detect7" 0x00007fcdf9982d84 in
> pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>   5    Thread 0x7fcdd4ff9700 (LWP 32698) "Detect8" 0x00007fcdf9982d84 in
> pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>   4    Thread 0x7fcdcf7fe700 (LWP 32699) "FlowManagerThre"
> 0x00007fcdf99830fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from
> /lib/x86_64-linux-gnu/libpthread.so.0
>   3    Thread 0x7fcdceffd700 (LWP 32700) "SCPerfWakeupThr"
> 0x00007fcdf99830fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from
> /lib/x86_64-linux-gnu/libpthread.so.0
>   2    Thread 0x7fcdce7fc700 (LWP 32701) "SCPerfMgmtThrea"
> 0x00007fcdf99830fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from
> /lib/x86_64-linux-gnu/libpthread.so.0
> * 1    Thread 0x7fcdfaab2840 (LWP 32668) "Suricata-Main" 0x00007fcdf921908d
> in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
> (gdb) t 11
> [Switching to thread 11 (Thread 0x7fcdd77fe700 (LWP 32692))]
> #0  0x00007fcdfa692cdf in htp_list_array_get (l=0x7fcdd0c53190, idx=3446) at
> htp_list.c:98
> 98              r = l->elements[i];
> (gdb) print *l
> $1 = {first = 0, last = 8512, max_size = 16384, current_size = 8512,
> elements = 0x7fcdd2658c50}
>
> Cooper,
>   I'll try the dev branch as well.
>
> Thanks,
> -dave
>
>
> On Mon, Jun 23, 2014 at 9:32 AM, Duarte Silva <duarte.silva at serializing.me>
> wrote:
>>
>> On Monday 23 June 2014 09:08:45 David Vasil wrote:
>> > Here is the result after removing all of the -O2's from the libhtp
>> > makefile and waiting for a detect thread to hit 100%.
>> >
>> > http://pastebin.com/aHp415HV
>>
>> Hi Dave,
>>
>> the print doesn't say much other than the memory address of the variable :P
>> Instead, could you do a:
>>
>> (gdb) print *l
>>
>> It should print all of the members of the htp_list_array_t structure. If
>> not, do this:
>>
>> (gdb) print l->max_size
>> (gdb) print l->current_size
>>
>> Cheers,
>> Duarte
>>
>> >
>> > Also, setting transparent hugepages to always did not seem to prevent
>> > this from occurring.
>> >
>> > -dave
>> >
>> >
>> > On Fri, Jun 20, 2014 at 10:36 PM, Anoop Saldanha
>> > <anoopsaldanha at gmail.com> wrote:
>> > > Dave,
>> > >
>> > > Nice!  You did manage to catch that thread inside the get() function.
>> > > The perf-top output indicates that it's the same issue, although it
>> > > would be nice if I could look at the contents of the variable "l"
>> > > inside that get() function.  Unfortunately, from your gdb output it
>> > > looks like libhtp was not compiled without optimization (I wrongly
>> > > typed "with" instead of "without" in my previous mail, although I did
>> > > specify -g3 later on).
>> > >
>> > > Can you get the output of "print l" inside the get() function, but
>> > > with optimization disabled in the libhtp library?
>> > >
>> > > You will have to rebuild libhtp without optimization.  Go to your
>> > > libhtp build directory.  Run the "configure" command like you have
>> > > before.  Before you type "make", open the "Makefile" in the libhtp
>> > > root/base directory and the one in the "htp" subdirectory, and replace
>> > > "-g -O2" with just "-g".  Now run the make command.  Confirm during the
>> > > build stage on the console that none of the gcc commands has "-g -O2"
>> > > and that they have just "-g" instead.
>> > >
>> > > You can then do a "make install", OR manually copy the .so files
>> > > from the "htp/.libs/" directory to replace the libhtp libraries
>> > > that you are linking with suricata.
>> > >
>> > > If you still can't see the symbols inside libhtp's functions when you
>> > > run suricata, your suricata binary is pointing to the wrong/old libhtp
>> > > library.
>> > >
>> > >
>> > > On Sat, Jun 21, 2014 at 12:42 AM, David Vasil <davidvasil at gmail.com>
>> > > wrote:
>> > > > I was able to do this after Detect5 hit 100% and stayed there.  I
>> > > > reverted back to my originally compiled suricata 2.0.1 deb package
>> > > > (without --enable-debug) as that flag created a ton of overhead - as
>> > > > you mentioned, probably due to not being compiled with optimization -
>> > > > and it also ended up core dumping several times.  I copied the
>> > > > unstripped libhtp lib and suricata binary (again, without
>> > > > --enable-debug) to the appropriate destinations and was able to see
>> > > > the debugging symbols as expected.  Attached is a 'perf top' drilling
>> > > > into the annotated code within htp_list_array_get showing where the
>> > > > time is being spent (I assume).  9d99, not in the screenshot, shows
>> > > > the following:
>> > > >     0.08 :            9d99:       repz retq
>> > > >
>> > > >          :            free(l->elements);
>> > > >          :            free(l);
>> > > >          :
>> > > >          :        }
>> > > >
>> > > > GDB output from this thread is here: http://pastebin.com/3tfjTsL0
>> > > >
>> > > > Thanks!
>> > > > -dave
>> > > >
>> > > >
>> > > > On Fri, Jun 20, 2014 at 9:41 AM, Anoop Saldanha
>> > > > <anoopsaldanha at gmail.com> wrote:
>> > > >> I don't think --enable-debug compiles it with optimization.
>> > > >> Instead compile it without optimization, i.e. either -g -O0 or -g3.
>> > > >> Copy the new binaries over, like you previously did.  You will also
>> > > >> have to compile libhtp the same way.  You can either specify this in
>> > > >> the CFLAGS environment variable with configure or manually edit the
>> > > >> configure script and the makefiles, replacing all "-g -O2" with just
>> > > >> "-g".
>> > > >>
>> > > >> 1. Start suricata and wait for one of the detect threads to hit
>> > > >> the 100% cpu utilization mark (make a note of the detect thread
>> > > >> name).
>> > > >> 2. Once you see that, attach gdb to the running process, and print
>> > > >> the threads using "info threads".  If you see the offending thread
>> > > >> stuck in the libhtp get() function call, switch over to that thread
>> > > >> using "t <thread_number>" and do a "print l".  The symbol "l" is
>> > > >> inside the libhtp get() function call.  Unless the detect thread is
>> > > >> inside the libhtp get() function scope that we are trying to trace,
>> > > >> you won't have the symbol available for printing.
>> > > >> 3. If "info threads" doesn't show any of the threads currently
>> > > >> inside the htp get() function (it has gone out of scope at that
>> > > >> instant), continue the process in gdb and keep tabs on the threads
>> > > >> with top/htop until you see the detect thread(s) hit the 100% cpu
>> > > >> mark again, after which you can interrupt the process inside gdb
>> > > >> again and hopefully find the detect thread still inside the libhtp
>> > > >> get() function context.
>> > > >>
>> > > >> With the issue at hand, once the thread gets pegged, you should be
>> > > >> able to zero in on the thread pretty quickly.  In case you can't,
>> > > >> I'll provide a debug patch to corner the issue.
>> > >
>> > > --
>> > > -------------------------------
>> > > Anoop Saldanha
>> > > http://www.poona.me
>> > > -------------------------------
>>
>



-- 
-------------------------------
Anoop Saldanha
http://www.poona.me
-------------------------------


