[Oisf-users] Help with 99% CPU usage

Anoop Saldanha anoopsaldanha at gmail.com
Thu Jun 6 04:03:43 UTC 2013


On Wed, Jun 5, 2013 at 9:00 PM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
> On Wed, Jun 5, 2013 at 5:04 PM, Victor Julien <lists at inliniac.net> wrote:
>> On 06/05/2013 12:34 PM, Anoop Saldanha wrote:
>>> On Wed, Jun 5, 2013 at 2:45 PM, Duarte Silva
>>> <duarte.silva at serializing.me> wrote:
>>>> On Thursday 16 May 2013 10:01:26 Duarte Silva wrote:
>>>>> On Wednesday 15 May 2013 19:54:21 Anoop Saldanha wrote:
>>>>>> On Wed, May 15, 2013 at 3:55 PM, Duarte Silva
>>>>>> <duarte.silva at serializing.me> wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I'm currently facing a problem with Suricata. After running for a while,
>>>>>>> there is always an AF_PACKET thread (workers mode) that hogs the CPU to
>>>>>>> which it is bound, creating an awful amount of packet loss. I have
>>>>>>> ruled out the following factors:
>>>>>>>  - Number of rules, it has also happened without rules;
>>>>>>>  - Amount of network traffic, I have seen Suricata handle ~18 MB/s
>>>>>>>    (150 Mbps) of traffic without problems with the current
>>>>>>>    configuration, and it has also happened with only ~2 MB/s of traffic;
>>>>>>>
>>>>>>>  - Memory, Suricata was only using ~500 MB of it when the CPU usage
>>>>>>>    was pegged at 100%;
>>>>>>>
>>>>>>> This happens repeatedly and, after it happens, Suricata takes a long time
>>>>>>> to stop. Could someone tell me what I can do to debug this issue?
>>>>>>>
>>>>>>> Suricata is executed with the following command line:
>>>>>>>
>>>>>>> suricata -D -c /etc/suricata/suricata.yaml --pidfile
>>>>>>> /var/lock/subsys/suricata --af-packet=eth1 --user=suri --group=suri
>>>>>>>
>>>>>>> I have also attached any files that can help out in debugging.
>>>>>>
>>>>>> While this thread hogs the CPU, can you attach gdb to the suricata
>>>>>> process and get a bt for the specified thread, and also for all the
>>>>>> threads?
>>>>>
>>>>> Attached are the traces for the hogging thread (I had to wait almost
>>>>> eight hours for it to happen). I took three traces at different times
>>>>> while AFPacketeth12 was hogging the CPU; all of them end up in
>>>>> list_array_get in dslib.c.
>>>>>
>>>>> I will investigate what is happening by looking at the code. When it
>>>>> happens again I will also take traces for the other threads.
>>>>
>>>> Hi,
>>>>
>>>> I have taken two more traces when it happened again. Could you please give a
>>>> little help on this? I think it has something to do with HTTP processing.
>>>>
>>>
>>> @Duarte
>>>
>>> What version of suricata are you running?
>>>
>>> @Victor.
>>>
>>> From the last bt that Duarte sent, it looks like the list has grown in
>>> size.  The size is around 4k.  Probably that's the reason for the
>>> slowdown?  Every time we inspect state we will end up looping through
>>> the whole array.
>>
>> Looks like it, yeah. The array approach doesn't really seem to scale that
>> well. Maybe an optimization in the short run would be to walk it
>> backwards? If we have this many TXs we probably still only have the last
>> few active.
>>
>
> Yeah, that should be the short term solution I think.  Let me look into it.
>

Giving this issue some more thought: the current dev master handles
it much better, since it doesn't walk through the older items
repeatedly like the current stable does, although the actual list is
still huge and filled with lots of NULL entries.  Should we still
cull/compress the NULL items in the list?
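
To make the two ideas concrete (walking the list backwards, and
culling the NULL entries), here is a rough sketch in Python rather
than C.  It is not the libhtp or Suricata code; the names TxList,
active_backwards and compact are hypothetical and only illustrate the
approach, assuming finished transactions are set to NULL (None) and
that we keep a count of the ones still active.

```python
from typing import List, Optional


class TxList:
    """Hypothetical array-backed transaction list.

    Finished entries are set to None instead of being removed,
    which is how the list ends up large and mostly empty.
    """

    def __init__(self) -> None:
        self.items: List[Optional[str]] = []
        self.active = 0  # number of non-None entries

    def add(self, tx: str) -> int:
        """Append a transaction and return its index."""
        self.items.append(tx)
        self.active += 1
        return len(self.items) - 1

    def finish(self, index: int) -> None:
        """Mark a transaction as done by replacing it with None."""
        if self.items[index] is not None:
            self.items[index] = None
            self.active -= 1

    def active_backwards(self) -> List[str]:
        """Collect active entries from the tail, stopping early
        once every active entry has been found."""
        found: List[str] = []
        for tx in reversed(self.items):
            if len(found) == self.active:
                break
            if tx is not None:
                found.append(tx)
        return found

    def compact(self) -> int:
        """Drop the None entries and return how many were removed.

        Caveat: this renumbers the remaining entries, so any code
        holding on to old indices would need to be updated.
        """
        before = len(self.items)
        self.items = [tx for tx in self.items if tx is not None]
        return before - len(self.items)


# Roughly the situation from the backtrace: ~4k entries, few active.
txs = TxList()
for i in range(4000):
    txs.add(f"tx{i}")
for i in range(3997):
    txs.finish(i)

print(txs.active_backwards())  # only the last three slots are visited
print(txs.compact(), len(txs.items))
```

Walking backwards only helps if we can stop early, hence the active
counter; compaction shrinks the list itself but shifts indices, which
is the part that would need care in the real code.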

-- 
-------------------------------
Anoop Saldanha
http://www.poona.me
-------------------------------


