[Oisf-users] Unbalanced load on AFpacket threads
Victor Julien
lists at inliniac.net
Mon Jun 3 17:56:06 UTC 2013
On 06/03/2013 07:52 PM, Fernando Sclavo wrote:
> We set "cluster-type: cluster_cpu" as suggested and CPU load dropped
> from 30% (average) to 5%! But the imbalance is still there. Also, the
> UDP traffic is balanced now (sudo ethtool -N eth7 rx-flow-hash udp4 sdfn).
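A minimal sketch of the corresponding af-packet section in suricata.yaml (the cluster-ids and thread counts here are illustrative assumptions, not taken from your actual config):

```yaml
af-packet:
  - interface: eth5
    threads: 16            # one capture thread per RSS queue
    cluster-id: 98
    # cluster_cpu hashes packets to threads by the CPU that received
    # them, so RSS queue -> IRQ core -> capture thread stay aligned
    cluster-type: cluster_cpu
    defrag: yes
  - interface: eth7
    threads: 16
    cluster-id: 97
    cluster-type: cluster_cpu
    defrag: yes
```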
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 2299 root 22 2 55.8g 53g 51g R 80.1 28.6 5:12.04 AFPacketeth51
> 2331 root 20 0 55.8g 53g 51g R 19.9 28.6 1:19.97 FlowManagerThre
> 2324 root 18 -2 55.8g 53g 51g S 16.4 28.6 1:13.06 AFPacketeth710
> 2315 root 18 -2 55.8g 53g 51g S 11.9 28.6 0:49.75 AFPacketeth71
> 2328 root 18 -2 55.8g 53g 51g S 11.9 28.6 0:55.22 AFPacketeth714
> 2316 root 18 -2 55.8g 53g 51g S 10.9 28.6 0:54.53 AFPacketeth72
> 2326 root 18 -2 55.8g 53g 51g S 10.9 28.6 0:45.33 AFPacketeth712
> 2317 root 18 -2 55.8g 53g 51g S 10.4 28.6 0:38.21 AFPacketeth73
> 2323 root 18 -2 55.8g 53g 51g S 9.9 28.6 0:44.72 AFPacketeth79
>
>
> Dropped kernel packets:
>
> capture.kernel_drops | AFPacketeth51 | 449774742
> capture.kernel_drops | AFPacketeth52 | 48573
> capture.kernel_drops | AFPacketeth53 | 104763
> capture.kernel_drops | AFPacketeth54 | 108080
> capture.kernel_drops | AFPacketeth55 | 95763
> capture.kernel_drops | AFPacketeth56 | 105133
> capture.kernel_drops | AFPacketeth57 | 103984
> capture.kernel_drops | AFPacketeth58 | 100208
> capture.kernel_drops | AFPacketeth59 | 86704
> capture.kernel_drops | AFPacketeth510 | 95995
> capture.kernel_drops | AFPacketeth511 | 89633
> capture.kernel_drops | AFPacketeth512 | 94029
> capture.kernel_drops | AFPacketeth513 | 95192
> capture.kernel_drops | AFPacketeth514 | 106460
> capture.kernel_drops | AFPacketeth515 | 109770
> capture.kernel_drops | AFPacketeth516 | 108373
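The per-thread drop counters above can be pulled out of stats.log with a short awk pass; a sketch (two sample lines are inlined so the command runs standalone; in practice pipe in /var/log/suricata/stats.log, whose default location is an assumption here):

```shell
# Keep the most recent capture.kernel_drops value per capture thread
# and print one line per thread, sorted by thread name.
printf '%s\n' \
  'capture.kernel_drops      | AFPacketeth51  | 449774742' \
  'capture.kernel_drops      | AFPacketeth52  | 48573' |
awk -F'|' '/capture\.kernel_drops/ {
    gsub(/ /, "", $2); gsub(/ /, "", $3)   # strip padding around fields
    drops[$2] = $3                         # later dumps overwrite earlier ones
}
END { for (t in drops) print t, drops[t] }' | sort
```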
Can you share a full record from the stats.log?
Cheers,
Victor
>
> idsuser at suricata:/var/log/suricata$ cat /etc/rc.local
> #!/bin/sh -e
> #
> # rc.local
> #
> # This script is executed at the end of each multiuser runlevel.
> # Make sure that the script will "exit 0" on success or any other
> # value on error.
> #
> # In order to enable or disable this script just change the execution
> # bits.
> #
> # By default this script does nothing.
>
> sudo sysctl -w net.core.rmem_max=536870912
> sudo sysctl -w net.core.wmem_max=67108864
> sudo sysctl -w net.ipv4.tcp_window_scaling=1
> sudo sysctl -w net.core.netdev_max_backlog=1000000
>
> # Set MMRBC size on the bus to 4K
> sudo setpci -d 8086:10fb e6.b=2e
>
> # sudo sysctl -w net.ipv4.tcp_rmem="4096 87380 67108864"
> # sudo sysctl -w net.ipv4.tcp_wmem="4096 87380 67108864"
>
> sleep 2
> sudo rmmod ixgbe
> sleep 2
> sudo insmod /lib/modules/3.2.0-45-generic/kernel/drivers/net/ethernet/intel/ixgbe/ixgbe.ko FdirPballoc=3,3,3,3 RSS=16,16,16,16 DCA=2,2,2,2
> sleep 2
>
> # Set ring size
> # sudo ethtool -G eth4 rx 4096
> sudo ethtool -G eth5 rx 4096
> # sudo ethtool -G eth6 rx 4096
> sudo ethtool -G eth7 rx 4096
>
> # Load-balance UDP flows
> # sudo ethtool -N eth4 rx-flow-hash udp4 sdfn
> sudo ethtool -N eth5 rx-flow-hash udp4 sdfn
> # sudo ethtool -N eth6 rx-flow-hash udp4 sdfn
> sudo ethtool -N eth7 rx-flow-hash udp4 sdfn
>
> sleep 2
> sudo ksh /home/idsuser/ixgbe-3.14.5/scripts/set_irq_affinity eth4 eth5 eth6 eth7
> sleep 2
> # sudo ifconfig eth4 up && sleep 1
> sudo ifconfig eth5 up && sleep 1
> # sudo ifconfig eth6 up && sleep 1
> sudo ifconfig eth7 up && sleep 1
> sleep 5
> sudo suricata -D -c /etc/suricata/suricata.yaml --af-packet
> sleep 10
> sudo barnyard2 -c /etc/suricata/barnyard2.conf -d /var/log/suricata -f
> unified2.alert -w /var/log/suricata/suricata.waldo -D
> exit 0
>
>
>
> 2013/6/3 Fernando Sclavo <fsclavo at gmail.com>
>
> Correction to previous email: runmode IS set to workers
>
>
> 2013/6/3 Fernando Sclavo <fsclavo at gmail.com>
>
> Hi Peter/Eric, I will try "flow per cpu" and mail the results.
> Same for "workers", but if I'm not mistaken we already tried it
> and CPU usage was very high.
>
>
> Queues and IRQ affinity: each NIC has 16 queues, each with its IRQ
> assigned to its own core (Intel driver script), and Suricata has
> CPU affinity enabled; we confirmed that each thread stays on its
> own core.
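A pinning setup like the one described corresponds to a threading section roughly along these lines (a sketch only; the CPU ranges are assumptions for a 2x16-core box, not the actual config):

```yaml
threading:
  set-cpu-affinity: yes
  cpu-affinity:
    - management-cpu-set:
        cpu: [ 0 ]          # flow manager and friends on core 0
    - worker-cpu-set:
        cpu: [ "1-31" ]     # one worker pinned per remaining core
        mode: "exclusive"
        prio:
          default: "high"
```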
>
>
>
> 2013/6/3 Eric Leblond <eric at regit.org>
>
> Hi,
>
> On Monday, June 3, 2013 at 15:54 +0200, Peter Manev wrote:
> >
> >
> >
> > On Mon, Jun 3, 2013 at 3:34 PM, Fernando Sclavo
> > <fsclavo at gmail.com> wrote:
> > Hi all!
> > We are running Suricata 1.4.2 with two Intel x520 cards, each
> > connected to a core switch on our datacenter network. The average
> > traffic is about 1~2 Gbps per port.
> > As you can see in the following top output, some threads are
> > significantly more loaded than others (AFPacketeth54 for example):
> > these threads are continuously dropping kernel packets. We raised
> > kernel parameters (buffers, rmem, etc.) and lowered Suricata flow
> > timeouts to just a few seconds, but we can't keep the drops counter
> > static when CPU goes to 99.9% for a specific thread.
> > How can we balance the load better across all threads to prevent
> > this issue?
> >
> > The server is a Dell R715 with two 16-core AMD Opteron(tm) 6284
> > processors and 192 GB RAM.
> >
> > idsuser at suricata:~$ top -d2
> >
> > top - 10:24:05 up 1 min, 2 users, load average: 4.49, 1.14, 0.38
> > Tasks: 287 total, 15 running, 272 sleeping, 0 stopped, 0 zombie
> > Cpu(s): 30.3%us, 1.3%sy, 0.0%ni, 65.3%id, 0.0%wa, 0.0%hi, 3.1%si, 0.0%st
> > Mem: 198002932k total, 59619020k used, 138383912k free, 25644k buffers
> > Swap: 15624188k total, 0k used, 15624188k free, 161068k cached
> >
> > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> > 2309 root 18 -2 55.8g 54g 51g R 99.9 28.6 0:20.96 AFPacketeth54
> > 2314 root 18 -2 55.8g 54g 51g R 99.9 28.6 0:18.29 AFPacketeth59
> > 2318 root 18 -2 55.8g 54g 51g R 99.9 28.6 0:12.90 AFPacketeth513
> > 2319 root 18 -2 55.8g 54g 51g R 77.6 28.6 0:12.78 AFPacketeth514
> > 2307 root 20 0 55.8g 54g 51g S 66.6 28.6 0:21.25 AFPacketeth52
> > 2338 root 20 0 55.8g 54g 51g R 58.2 28.6 0:09.94 FlowManagerThre
> > 2310 root 18 -2 55.8g 54g 51g S 51.2 28.6 0:15.35 AFPacketeth55
> > 2320 root 18 -2 55.8g 54g 51g R 50.2 28.6 0:07.83 AFPacketeth515
> > 2313 root 18 -2 55.8g 54g 51g S 48.7 28.6 0:11.66 AFPacketeth58
> > 2321 root 18 -2 55.8g 54g 51g S 47.7 28.6 0:07.75 AFPacketeth516
> > 2315 root 18 -2 55.8g 54g 51g R 45.2 28.6 0:12.18 AFPacketeth510
> > 2306 root 22 2 55.8g 54g 51g R 37.3 28.6 0:12.32 AFPacketeth51
> > 2312 root 18 -2 55.8g 54g 51g S 35.8 28.6 0:11.90 AFPacketeth57
> > 2308 root 20 0 55.8g 54g 51g R 34.8 28.6 0:16.69 AFPacketeth53
> > 2317 root 18 -2 55.8g 54g 51g R 33.3 28.6 0:07.93 AFPacketeth512
> > 2316 root 18 -2 55.8g 54g 51g S 28.8 28.6 0:08.03 AFPacketeth511
> > 2311 root 18 -2 55.8g 54g 51g S 24.9 28.6 0:10.51 AFPacketeth56
> > 2331 root 18 -2 55.8g 54g 51g R 19.9 28.6 0:02.41 AFPacketeth710
> > 2323 root 18 -2 55.8g 54g 51g S 17.9 28.6 0:03.60 AFPacketeth72
> > 2336 root 18 -2 55.8g 54g 51g S 16.9 28.6 0:01.50 AFPacketeth715
> > 2333 root 18 -2 55.8g 54g 51g S 14.9 28.6 0:02.14 AFPacketeth712
> > 2330 root 18 -2 55.8g 54g 51g S 13.9 28.6 0:02.12 AFPacketeth79
> > 2324 root 18 -2 55.8g 54g 51g R 11.9 28.6 0:02.96 AFPacketeth73
> > 2329 root 18 -2 55.8g 54g 51g S 11.9 28.6 0:01.90 AFPacketeth78
> > 2335 root 18 -2 55.8g 54g 51g S 11.9 28.6 0:01.44 AFPacketeth714
> > 2334 root 18 -2 55.8g 54g 51g R 10.9 28.6 0:01.68 AFPacketeth713
> > 2325 root 18 -2 55.8g 54g 51g S 9.4 28.6 0:02.38 AFPacketeth74
> > 2326 root 18 -2 55.8g 54g 51g S 8.9 28.6 0:02.71 AFPacketeth75
> > 2327 root 18 -2 55.8g 54g 51g S 7.5 28.6 0:01.98 AFPacketeth76
> > 2332 root 18 -2 55.8g 54g 51g S 7.5 28.6 0:01.53 AFPacketeth711
> > 2337 root 18 -2 55.8g 54g 51g S 7.0 28.6 0:01.09 AFPacketeth716
> > 2328 root 18 -2 55.8g 54g 51g S 6.0 28.6 0:02.11 AFPacketeth77
> > 2322 root 18 -2 55.8g 54g 51g R 5.5 28.6 0:03.78 AFPacketeth71
> > 3 root 20 0 0 0 0 S 4.5 0.0 0:01.25 ksoftirqd/0
> > 11 root 20 0 0 0 0 S 0.5 0.0 0:00.14 kworker/0:1
> >
> > Regards
> >
> >
> > _______________________________________________
> > Suricata IDS Users mailing list:
> > oisf-users at openinfosecfoundation.org
> > Site: http://suricata-ids.org | Support: http://suricata-ids.org/support/
> > List: https://lists.openinfosecfoundation.org/mailman/listinfo/oisf-users
> > OISF: http://www.openinfosecfoundation.org/
> >
> >
> > Hi,
> >
> >
> > You could try "runmode: workers".
>
> From the thread names it seems that is already the case.
>
> >
> >
> > What is your flow balance method?
> >
> > Can you try "flow per cpu" in the yaml section of afpacket?
> > ("cluster-type: cluster_cpu")
>
> It could help indeed.
>
> A few questions:
>
> Are your IRQ affinity settings correct? (Meaning: is multiqueue used
> on the NICs, and are the queues well balanced across CPUs?)
>
> If you have a lot of UDP on your network, use ethtool to load-balance
> it, as that is not done by default.
>
> BR,
> >
> >
> >
> >
> >
> > Thank you
> >
> >
> > --
> > Regards,
> > Peter Manev
>
>
>
>
>
>
>
--
---------------------------------------------
Victor Julien
http://www.inliniac.net/
PGP: http://www.inliniac.net/victorjulien.asc
---------------------------------------------