[Oisf-users] number of alerts versus performance

Thu Jun 30 16:51:10 UTC 2016

top -H -p suricata_pid gets me this:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4692 suricata  20   0 11.2g  10g 396m R 100.2 69.5  69:24.27 W#03-bond0
 4693 suricata  20   0 11.2g  10g 396m S 52.6 69.5  55:26.68 W#04-bond0
 4690 suricata  20   0 11.2g  10g 396m R 45.6 69.5  67:32.78 W#01-bond0
 4691 suricata  20   0 11.2g  10g 396m S 30.6 69.5  55:52.85 W#02-bond0
 4694 suricata  20   0 11.2g  10g 396m S  1.3 69.5   1:20.04 FM#01
 4695 suricata  20   0 11.2g  10g 396m S  0.7 69.5   0:32.82 FM#02
 4442 suricata  20   0 11.2g  10g 396m S  0.3 69.5   3:45.30 Suricata-Main
 4697 suricata  20   0 11.2g  10g 396m S  0.0 69.5   0:08.25 FR#01
 4698 suricata  20   0 11.2g  10g 396m S  0.0 69.5   0:00.05 CW
 4699 suricata  20   0 11.2g  10g 396m S  0.0 69.5   0:00.15 CS

The W#04 detection thread also shows higher CPU utilization, though not 100%. The load seems to be spread unevenly across the worker threads.

The threads are named bond0 because I've bonded two NICs into one interface: they are attached to taps, and each sees one direction of traffic.

Looking at cat /proc/interrupts | grep eth[02]:

 56: 1499740917          0          1   12313392   PCI-MSI-edge      eth0
 57:          0          0          0          0   PCI-MSI-edge      eth0:1
 58:          0          0          0          0   PCI-MSI-edge      eth0:2
 59:          0          0          0          0   PCI-MSI-edge      eth0:3
 64:          0          0          2 3682766018   PCI-MSI-edge      eth2
 65:          0          0          0          0   PCI-MSI-edge      eth2:1
 66:          0          0          0          0   PCI-MSI-edge      eth2:2
 67:          0          0          0          0   PCI-MSI-edge      eth2:3

It seems that each NIC is only utilizing one queue and one CPU. Could this explain the high CPU usage on two detection threads?
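To put a number on how lopsided the interrupt distribution is, here is a minimal Python sketch that parses the eth lines shown above and totals the interrupts per CPU. The SAMPLE text is copied from this post; the function names (parse_interrupts, cpu_totals) are hypothetical, chosen only for illustration.

```python
# Sketch: how unevenly are NIC interrupts spread across CPUs and queues?
# SAMPLE is the /proc/interrupts excerpt quoted in this post.
SAMPLE = """\
 56: 1499740917          0          1   12313392   PCI-MSI-edge      eth0
 57:          0          0          0          0   PCI-MSI-edge      eth0:1
 58:          0          0          0          0   PCI-MSI-edge      eth0:2
 59:          0          0          0          0   PCI-MSI-edge      eth0:3
 64:          0          0          2 3682766018   PCI-MSI-edge      eth2
 65:          0          0          0          0   PCI-MSI-edge      eth2:1
 66:          0          0          0          0   PCI-MSI-edge      eth2:2
 67:          0          0          0          0   PCI-MSI-edge      eth2:3
"""


def parse_interrupts(text, prefix="eth"):
    """Return {queue_name: [count_cpu0, count_cpu1, ...]} for matching lines."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the CPU header row and anything that is not an "NN:" style line.
        if len(fields) < 3 or not fields[0].endswith(":"):
            continue
        name = fields[-1]
        if not name.startswith(prefix):
            continue
        counts = []
        for field in fields[1:]:
            if not field.isdigit():
                break  # reached the interrupt type column
            counts.append(int(field))
        result[name] = counts
    return result


def cpu_totals(per_irq):
    """Sum the per-queue counters into one total per CPU."""
    ncpu = max((len(c) for c in per_irq.values()), default=0)
    totals = [0] * ncpu
    for counts in per_irq.values():
        for cpu, n in enumerate(counts):
            totals[cpu] += n
    return totals


per_irq = parse_interrupts(SAMPLE)
active = [name for name, counts in per_irq.items() if sum(counts) > 0]
totals = cpu_totals(per_irq)
grand_total = sum(totals) or 1  # avoid dividing by zero on an idle box

print("queues that received interrupts:", active)
for cpu, n in enumerate(totals):
    print(f"CPU{cpu}: {n} ({100 * n / grand_total:.1f}%)")
```

On a live sensor, the same functions could be fed open("/proc/interrupts").read() instead of SAMPLE.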

I've tried assigning SMP affinity for all NIC queues across all CPUs, but it didn't seem to change the stats above much. And yes, I killed irqbalance while testing that.
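For reference, a minimal sketch of how the affinity masks could be computed, assuming the usual /proc/irq/<irq>/smp_affinity interface, where the value is a hexadecimal bitmask and bit N allows CPU N. The helper names (affinity_mask, plan) and the round-robin assignment are illustrative assumptions, not necessarily what was done here. The script only prints the commands; it does not write anything.

```python
# Sketch: build smp_affinity bitmasks that pin each IRQ to one CPU, round-robin.
# IRQ numbers 56-59 are the eth0 queues listed in this post.

def affinity_mask(cpus):
    """Hex bitmask for smp_affinity: bit N set means CPU N may service the IRQ."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def plan(irqs, ncpu):
    """Assign IRQs to CPUs round-robin; return a list of (path, mask) pairs."""
    return [
        (f"/proc/irq/{irq}/smp_affinity", affinity_mask([i % ncpu]))
        for i, irq in enumerate(irqs)
    ]


# Print the shell commands instead of writing, so nothing changes by accident.
for path, mask in plan([56, 57, 58, 59], ncpu=4):
    print(f"echo {mask} > {path}")
```

The plain hex string is fine for a quad core; systems with more than 32 CPUs use comma-separated 32-bit groups. Also note that pinning queues that never receive interrupts would not be expected to change anything, which may be consistent with the counters staying the same.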

Running perf top (without the -c option), the highest usage is:

 6.15%  libc-2.12.so              [.] __memset_x86_64

Running perf top -c 0, I get the same results.

Running perf top -c 1 (or 2, or 3), I get only one line (sometimes two):

 100.00%  [kernel]  [k] native_write_msr_safe

or

 77.00%  [kernel]      [k] native_write_msr_safe
  23.00%  libc-2.12.so  [.] memcpy

BTW, I get weird numbers in the symbol column for the Suricata processes in perf top. I found an article that says to compile with GCC's -fno-omit-frame-pointer flag, but that didn't help.

Thanks.

________________________________
From: Peter Manev <petermanev at gmail.com>
Sent: Thursday, June 30, 2016 4:27 PM
To: Yasha Zislin
Cc: oisf-users at lists.openinfosecfoundation.org
Subject: Re: [Oisf-users] number of alerts versus performance

On Thu, 2016-06-30 at 15:54 +0000, Yasha Zislin wrote:
> Peter,
>
>
> I found one signature that was causing the high alert count. After I
> disabled it, the count went down, but packet loss is still around 20%.
>
>
> my stats.log does not contain anything useful such as flow emergency
> mode, or ssn memcap drop. The only thing that is off is kernel drops,
> and tcp reassembly gaps.
> From my understanding kernel drops have nothing to do with Suricata
> and point to OS problems.
>
>
> I do see one of the CPUs peak at 100% when packet loss increases. One
> thing to note. Two other CPUs are working on capturing traffic with
> high IRQs. My guess would be flow manager or detection engine.
>

You can see if you get more info from:
top -H -p `pidof suricata`
and
perf top -C cpu_number_here
example: perf top -C 0

> I dunno.
>
>
> Thanks
>
>
>
>
> ______________________________________________________________________
> From: Peter Manev <petermanev at gmail.com>
> Sent: Thursday, June 30, 2016 3:00 PM
> To: Yasha Zislin
> Cc: oisf-users at lists.openinfosecfoundation.org
> Subject: Re: [Oisf-users] number of alerts versus performance
>
> On Thu, 2016-06-30 at 14:41 +0000, Yasha Zislin wrote:
> > I have been trying to figure out a packet loss on one of my sensors
> > and I am puzzled.
> >
> > It has 16 gigs of RAM, one quad core AMD CPU, and the NIC sees about
> > 3 million packets per minute. Nothing special in my mind. I am using
> > PFRING 6.5.0 and Suricata 3.1.
> >
> > I get about 20% to 40% packet loss.  I have another identical server
> > which sees the same amount of traffic and maybe some of the same
> > traffic as well.
> >
> > I've been messing around with NIC settings, IRQs, PFRING settings,
> > Suricata settings trying to figure out why such a high packet loss.
> >
> >
> > I have just realized one big difference between these two sensors.
> > The problematic one gets 2k to 4k alerts per minute, which sounds
> > huge.
> >
>
> Any particular sig that is alerting in excess ?
>
> > Second one gets like 80 alerts per minute. Both have the same
> > rulesets.
> >
> >
> > The difference of course is the home_net variable.
> >
> >
> > Can the fact that Suricata processes more rules due to HOME_NET
> > definition cause high performance strain on the server?
> >
>
> Yes, HOME_NET size has an effect on performance as well (among other
> things). For example -
> HOME_NET: "any"
> EXTERNAL_NET: "any"
> will certainly degrade your performance.
>
> >
> > If a packet does not match HOME_NET, it will be discarded before
> > being processed by the rules. Correct?
> >
> > Whereas if a packet passes the HOME_NET check, it has to go through
> > all of the rules, hence causing higher CPU utilization.
> >
> >
> > Thank you for the clarification.
> >
> >

--
Regards,
Peter Manev
