[Oisf-users] Suricata pegs a detect thread and drops packets
Victor Julien
lists at inliniac.net
Tue Jul 8 14:45:54 UTC 2014
On 06/25/2014 06:14 PM, Ivan Ristić wrote:
> It's possible that LibHTP is the performance bottleneck there. We have a
> couple of places where we track issues similar to the one you reported:
>
> https://redmine.openinfosecfoundation.org/issues/822
>
> https://github.com/ironbee/libhtp/issues/56
>
> I'll look into it.
>
> Thanks for your reports!
Ivan has updated the libhtp 0.5.x branch to address this issue. Can you
test it?
It's in this branch: https://github.com/ironbee/libhtp/tree/0.5.x
Cheers,
Victor
>
>
> On 23/06/2014 15:58, David Vasil wrote:
>> Anything I should do to tune the list size downward? If so, what
>> implications does that have for inspection? Thanks!
>>
>> -dave
>>
>> On Jun 23, 2014 10:56 AM, "Anoop Saldanha" <anoopsaldanha at gmail.com
>> <mailto:anoopsaldanha at gmail.com>> wrote:
>>
>> Dave,
>>
>> That's a huge list and that's pretty much the issue without doubt.
>>
>> On Mon, Jun 23, 2014 at 8:21 PM, David Vasil <davidvasil at gmail.com
>> <mailto:davidvasil at gmail.com>> wrote:
>> > Sorry about that:
>> >
>> > (gdb) info threads
>> > Id Target Id Frame
>> > 14 Thread 0x7fcdf5621700 (LWP 32689) "RxPFR1" 0x00007fcdf998589c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>> > 13 Thread 0x7fcdf4e20700 (LWP 32690) "RxPFR2" 0x00007fcdf9241a43 in poll () from /lib/x86_64-linux-gnu/libc.so.6
>> > 12 Thread 0x7fcdd7fff700 (LWP 32691) "Detect1" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>> > 11 Thread 0x7fcdd77fe700 (LWP 32692) "Detect2" 0x00007fcdfa692cdf in htp_list_array_get (l=0x7fcdd0c53190, idx=3446) at htp_list.c:98
>> > 10 Thread 0x7fcdd6ffd700 (LWP 32693) "Detect3" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>> > 9 Thread 0x7fcdcffff700 (LWP 32694) "Detect4" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>> > 8 Thread 0x7fcdd67fc700 (LWP 32695) "Detect5" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>> > 7 Thread 0x7fcdd5ffb700 (LWP 32696) "Detect6" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>> > 6 Thread 0x7fcdd57fa700 (LWP 32697) "Detect7" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>> > 5 Thread 0x7fcdd4ff9700 (LWP 32698) "Detect8" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>> > 4 Thread 0x7fcdcf7fe700 (LWP 32699) "FlowManagerThre" 0x00007fcdf99830fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>> > 3 Thread 0x7fcdceffd700 (LWP 32700) "SCPerfWakeupThr" 0x00007fcdf99830fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>> > 2 Thread 0x7fcdce7fc700 (LWP 32701) "SCPerfMgmtThrea" 0x00007fcdf99830fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
>> > * 1 Thread 0x7fcdfaab2840 (LWP 32668) "Suricata-Main" 0x00007fcdf921908d in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
>> > (gdb) t 11
>> > [Switching to thread 11 (Thread 0x7fcdd77fe700 (LWP 32692))]
>> > #0 0x00007fcdfa692cdf in htp_list_array_get (l=0x7fcdd0c53190, idx=3446) at htp_list.c:98
>> > 98 r = l->elements[i];
>> > (gdb) print *l
>> > $1 = {first = 0, last = 8512, max_size = 16384, current_size = 8512, elements = 0x7fcdd2658c50}
>> >
>> > Cooper,
>> > I'll try the dev branch as well.
>> >
>> > Thanks,
>> > -dave
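
To illustrate why a list of this size can peg a thread: the frame in this message is stopped at `r = l->elements[i];` with idx=3446, which suggests that get() walks from the first element up to the requested index. The sketch below is a toy Python model of that assumption, not libhtp's code. `RingList`, `steps` and `full_scan_steps` are hypothetical names; only the field names are taken from the `print *l` output. If get(idx) costs idx steps, reading every element by index costs n(n-1)/2 steps per pass.

```python
class RingList:
    """Toy model of an array-backed list. This is NOT libhtp's implementation."""

    def __init__(self, max_size):
        self.first = 0
        self.last = 0
        self.max_size = max_size
        self.current_size = 0
        self.elements = [None] * max_size
        self.steps = 0  # loop iterations spent inside get()

    def push(self, item):
        if self.current_size == self.max_size:
            raise OverflowError("toy model does not grow")
        self.elements[self.last] = item
        self.last = (self.last + 1) % self.max_size
        self.current_size += 1

    def get(self, idx):
        """Assumed behaviour: walk from `first`, one slot per step."""
        if idx >= self.current_size:
            return None
        i = self.first
        for _ in range(idx):
            i += 1
            if i == self.max_size:  # wrap around the end of the backing array
                i = 0
            self.steps += 1
        return self.elements[i]


def full_scan_steps(n):
    """Fill a list with n items, read each one by index, return total steps."""
    lst = RingList(max_size=2 * n)
    for value in range(n):
        lst.push(value)
    for idx in range(n):
        lst.get(idx)
    return lst.steps
```

Under that assumption, one indexed pass over the 8512 entries reported by gdb would cost 8512 * 8511 / 2 = 36,222,816 steps, and the cost grows quadratically with the list.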
>> >
>> >
>> > On Mon, Jun 23, 2014 at 9:32 AM, Duarte Silva
>> <duarte.silva at serializing.me <mailto:duarte.silva at serializing.me>>
>> > wrote:
>> >>
>> >> On Monday 23 June 2014 09:08:45 David Vasil wrote:
>> >> > Here is the result after removing all of the -O2's from the libhtp
>> >> > makefile and waiting for a detect thread to hit 100%.
>> >> >
>> >> > http://pastebin.com/aHp415HV
>> >>
>> >> Hi Dave,
>> >>
>> >> the print doesn't say much other than the memory address of the
>> >> variable :P
>> >> Instead, could you do a:
>> >>
>> >> (gdb) print *l
>> >>
>> >> It should print all of the members of the htp_list_array_t structure.
>> >> If not, do this:
>> >>
>> >> (gdb) print l->max_size
>> >> (gdb) print l->current_size
>> >>
>> >> Cheers,
>> >> Duarte
>> >>
>> >> >
>> >> > Also, setting transparent hugepages to always did not seem to
>> >> > prevent this from occurring.
>> >> >
>> >> > -dave
>> >> >
>> >> >
>> >> > On Fri, Jun 20, 2014 at 10:36 PM, Anoop Saldanha
>> >> > <anoopsaldanha at gmail.com <mailto:anoopsaldanha at gmail.com>> wrote:
>> >> > > Dave,
>> >> > >
>> >> > > Nice! You did manage to catch that thread inside the get() function.
>> >> > > The perf-top output indicates that it's the same issue, although it
>> >> > > would be nice if I could look at the contents of the variable "l"
>> >> > > inside that get() function. Unfortunately, from your gdb output it
>> >> > > looks like you forgot to compile libhtp without optimization (I
>> >> > > typed it wrongly as "with" in my previous mail, instead of
>> >> > > "without", although I did specify -g3 later on).
>> >> > >
>> >> > > Can you get the output of "print l" inside the get() function, but
>> >> > > without optimization enabled in the libhtp library?
>> >> > >
>> >> > > You will have to rebuild libhtp without optimization. Go to your
>> >> > > libhtp build directory. Run the "configure" command as you have
>> >> > > before. Before you type "make", find the "Makefile" in the libhtp
>> >> > > root/base directory and in the "htp" subdirectory, and replace
>> >> > > "-g -O2" with just "-g". Now run the make command. Confirm on the
>> >> > > console during the build stage that none of the gcc commands have
>> >> > > "-g -O2" and that they have just "-g" instead.
>> >> > >
>> >> > > You can then do a "make install", OR manually copy the .so files
>> >> > > directly from the "htp/.libs/" directory, to replace the libhtp
>> >> > > libraries that you are linking with suricata.
>> >> > >
>> >> > > If you still can't see the symbols inside libhtp's functions when
>> >> > > you run suricata, your suricata binary is pointing to the wrong/old
>> >> > > libhtp library.
>> >> > >
>> >> > >
>> >> > > On Sat, Jun 21, 2014 at 12:42 AM, David Vasil
>> >> > > <davidvasil at gmail.com <mailto:davidvasil at gmail.com>> wrote:
>> >> > > > I was able to do this after Detect5 hit 100% and stayed there. I
>> >> > > > reverted back to my originally compiled suricata 2.0.1 deb package
>> >> > > > (without --enable-debug) as that flag created a ton of overhead - as
>> >> > > > you mentioned, probably due to not being compiled with optimization -
>> >> > > > and it also ended up core dumping several times. I copied the
>> >> > > > unstripped libhtp lib and suricata binary (again, without
>> >> > > > --enable-debug) to the appropriate destinations and was able to see
>> >> > > > the debugging symbols as expected. Attached is a 'perf top' drilling
>> >> > > > into the annotated code within htp_list_array_get showing where the
>> >> > > > time is being spent (I assume). 9d99, not in the screenshot, shows
>> >> > > > the following:
>> >> > > >
>> >> > > > 0.08 : 9d99: repz retq
>> >> > > >
>> >> > > > : free(l->elements);
>> >> > > > : free(l);
>> >> > > > :
>> >> > > > : }
>> >> > > >
>> >> > > > GDB output from this thread is here: http://pastebin.com/3tfjTsL0
>> >> > > >
>> >> > > > Thanks!
>> >> > > > -dave
>> >> > > >
>> >> > > >
>> >> > > > On Fri, Jun 20, 2014 at 9:41 AM, Anoop Saldanha
>> >> > > > <anoopsaldanha at gmail.com <mailto:anoopsaldanha at gmail.com>> wrote:
>> >> > > >> I don't think --enable-debug compiles it with optimization.
>> >> > > >> Instead compile it without optimization, i.e. either -g -O0 or
>> >> > > >> -g3. Copy the new binaries over, like you previously did. You will
>> >> > > >> also have to compile libhtp the same way. You can either specify
>> >> > > >> this in the environment variable with configure or manually edit
>> >> > > >> the configure script and the makefiles, replacing all "-g -O2" with
>> >> > > >> just "-g".
>> >> > > >>
>> >> > > >> 1. Start suricata, and wait for one of the detect threads to hit
>> >> > > >>    the 100% cpu utilization mark (make a note of the detect thread
>> >> > > >>    name).
>> >> > > >> 2. Once you see that, attach gdb to the running process, and print
>> >> > > >>    the threads using "info threads". If you see the offending thread
>> >> > > >>    stuck in the libhtp get() function call, switch over to that
>> >> > > >>    thread using "t <thread_number>" and do a "print l". The symbol
>> >> > > >>    "l" is inside the libhtp get() function call. Unless the detect
>> >> > > >>    thread is inside the libhtp get() function scope that we are
>> >> > > >>    trying to trace, you won't have the symbol available for printing.
>> >> > > >> 3. If "info threads" doesn't show any of the threads currently
>> >> > > >>    inside the htp get() function (gone out of scope at that instant),
>> >> > > >>    continue the process in gdb, and keep tabs on the threads with
>> >> > > >>    top/htop till you see the detect thread(s) hit the 100% cpu mark
>> >> > > >>    again, after which you can interrupt the process inside gdb again
>> >> > > >>    and hopefully find the detect thread still inside the libhtp
>> >> > > >>    get() function context.
>> >> > > >>
>> >> > > >> With the issue at hand, once the thread gets pegged, you should be
>> >> > > >> able to zero in on the thread pretty quickly. In case you can't,
>> >> > > >> I'll provide a debug patch to corner the issue.
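
Step 2 of the procedure (spotting which thread is inside the libhtp get() call) can be sketched as a small filter over saved "info threads" output. This is only an illustration: `threads_in`, `THREAD_RE` and `SAMPLE` are hypothetical names, the regular expression assumes each thread sits on one unwrapped line in the layout gdb printed elsewhere in this thread, and the sample lines are copied from that output.

```python
import re

# Matches one unwrapped "info threads" row, e.g.
#   11 Thread 0x7fcdd77fe700 (LWP 32692) "Detect2" 0x00007fcdfa692cdf in htp_list_array_get (...)
# Assumption: the frame column has the form "<address> in <function> ...".
THREAD_RE = re.compile(
    r'^\*?\s*(\d+)\s+Thread\s+0x[0-9a-f]+\s+\(LWP\s+(\d+)\)\s+"([^"]+)"\s+'
    r'0x[0-9a-f]+\s+in\s+(\S+)'
)


def threads_in(prefix, info_threads_text):
    """Return (gdb thread id, thread name, function) for every thread whose
    current function name starts with `prefix`."""
    hits = []
    for line in info_threads_text.splitlines():
        match = THREAD_RE.match(line.strip())
        if match and match.group(4).startswith(prefix):
            hits.append((int(match.group(1)), match.group(3), match.group(4)))
    return hits


SAMPLE = '''\
12 Thread 0x7fcdd7fff700 (LWP 32691) "Detect1" 0x00007fcdf9982d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
11 Thread 0x7fcdd77fe700 (LWP 32692) "Detect2" 0x00007fcdfa692cdf in htp_list_array_get (l=0x7fcdd0c53190, idx=3446) at htp_list.c:98
* 1 Thread 0x7fcdfaab2840 (LWP 32668) "Suricata-Main" 0x00007fcdf921908d in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
'''
```

Calling `threads_in("htp_", SAMPLE)` picks out thread 11 ("Detect2"), which is the number to pass to "t <thread_number>" before running "print l". Rows that gdb cannot resolve to a function name are simply skipped by this sketch.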
>> >> > >
>> >> > > --
>> >> > > -------------------------------
>> >> > > Anoop Saldanha
>> >> > > http://www.poona.me
>> >> > > -------------------------------
>> >>
>> >
>>
>>
>>
>> --
>> -------------------------------
>> Anoop Saldanha
>> http://www.poona.me
>> -------------------------------
>>
>
>
--
---------------------------------------------
Victor Julien
http://www.inliniac.net/
PGP: http://www.inliniac.net/victorjulien.asc
---------------------------------------------