[Oisf-users] Unbalanced load on AFpacket threads

Fernando Sclavo fsclavo at gmail.com
Mon Jun 3 17:52:07 UTC 2013


We set "cluster-type: cluster_cpu" as suggested and the average CPU load dropped from
30% to 5%!! But the imbalance is still there. The UDP traffic is now
balanced as well (sudo ethtool -N eth7 rx-flow-hash udp4 sdfn).

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2299 root      22   2 55.8g  53g  51g R 80.1 28.6   5:12.04 AFPacketeth51
 2331 root      20   0 55.8g  53g  51g R 19.9 28.6   1:19.97 FlowManagerThre
 2324 root      18  -2 55.8g  53g  51g S 16.4 28.6   1:13.06 AFPacketeth710
 2315 root      18  -2 55.8g  53g  51g S 11.9 28.6   0:49.75 AFPacketeth71
 2328 root      18  -2 55.8g  53g  51g S 11.9 28.6   0:55.22 AFPacketeth714
 2316 root      18  -2 55.8g  53g  51g S 10.9 28.6   0:54.53 AFPacketeth72
 2326 root      18  -2 55.8g  53g  51g S 10.9 28.6   0:45.33 AFPacketeth712
 2317 root      18  -2 55.8g  53g  51g S 10.4 28.6   0:38.21 AFPacketeth73
 2323 root      18  -2 55.8g  53g  51g S  9.9 28.6   0:44.72 AFPacketeth79


Dropped kernel packets:

capture.kernel_drops      | AFPacketeth51             | 449774742
capture.kernel_drops      | AFPacketeth52             | 48573
capture.kernel_drops      | AFPacketeth53             | 104763
capture.kernel_drops      | AFPacketeth54             | 108080
capture.kernel_drops      | AFPacketeth55             | 95763
capture.kernel_drops      | AFPacketeth56             | 105133
capture.kernel_drops      | AFPacketeth57             | 103984
capture.kernel_drops      | AFPacketeth58             | 100208
capture.kernel_drops      | AFPacketeth59             | 86704
capture.kernel_drops      | AFPacketeth510            | 95995
capture.kernel_drops      | AFPacketeth511            | 89633
capture.kernel_drops      | AFPacketeth512            | 94029
capture.kernel_drops      | AFPacketeth513            | 95192
capture.kernel_drops      | AFPacketeth514            | 106460
capture.kernel_drops      | AFPacketeth515            | 109770
capture.kernel_drops      | AFPacketeth516            | 108373
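
The skew is easy to quantify from those counters. A quick sketch (the numbers below are the per-thread drop counts pasted inline from the stats above):

```shell
#!/bin/sh
# Sketch: what share of all kernel drops lands on the busiest thread?
# Values are the capture.kernel_drops counters from the stats above.
drops='449774742 48573 104763 108080 95763 105133 103984 100208
86704 95995 89633 94029 95192 106460 109770 108373'

# One number per line, then track the max and the total in awk.
skew=$(printf '%s\n' $drops | awk '
  { total += $1; if ($1 > max) max = $1 }
  END { printf "%.1f", 100 * max / total }')
echo "$skew% of drops land on a single thread"
```

So eth5's queue 1 alone accounts for essentially all of the drops, which matches AFPacketeth51 being the only thread pegged in top.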


idsuser at suricata:/var/log/suricata$ cat /etc/rc.local
#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.

sudo sysctl -w net.core.rmem_max=536870912
sudo sysctl -w net.core.wmem_max=67108864
sudo sysctl -w net.ipv4.tcp_window_scaling=1
sudo sysctl -w net.core.netdev_max_backlog=1000000

# Set MMRBC (Max Memory Read Byte Count) on the PCI bus to 4KB
sudo setpci -d 8086:10fb e6.b=2e

# sudo sysctl -w net.ipv4.tcp_rmem="4096 87380 67108864"
# sudo sysctl -w net.ipv4.tcp_wmem="4096 87380 67108864"

sleep 2
sudo rmmod ixgbe
sleep 2
sudo insmod /lib/modules/3.2.0-45-generic/kernel/drivers/net/ethernet/intel/ixgbe/ixgbe.ko \
    FdirPballoc=3,3,3,3 RSS=16,16,16,16 DCA=2,2,2,2
sleep 2

# Set RX ring size
# sudo ethtool -G eth4 rx 4096
sudo ethtool -G eth5 rx 4096
# sudo ethtool -G eth6 rx 4096
sudo ethtool -G eth7 rx 4096

# Load-balance UDP flows across RX queues
# sudo ethtool -N eth4 rx-flow-hash udp4 sdfn
sudo ethtool -N eth5 rx-flow-hash udp4 sdfn
# sudo ethtool -N eth6 rx-flow-hash udp4 sdfn
sudo ethtool -N eth7 rx-flow-hash udp4 sdfn

sleep 2
sudo ksh /home/idsuser/ixgbe-3.14.5/scripts/set_irq_affinity eth4 eth5 eth6 eth7
sleep 2
# sudo ifconfig eth4 up && sleep 1
sudo ifconfig eth5 up && sleep 1
# sudo ifconfig eth6 up && sleep 1
sudo ifconfig eth7 up && sleep 1
sleep 5
sudo suricata -D -c /etc/suricata/suricata.yaml --af-packet
sleep 10
sudo barnyard2 -c /etc/suricata/barnyard2.conf -d /var/log/suricata -f unified2.alert \
    -w /var/log/suricata/suricata.waldo -D
exit 0
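
To check whether the RSS hash itself is spreading packets evenly across the 16 queues (as opposed to one elephant flow pinning a single queue), the per-queue counters from `ethtool -S` can be compared. A sketch; it runs against canned output here since the real command needs the hardware, and the `rx_queue_N_packets` naming assumes the ixgbe driver's counter convention:

```shell
#!/bin/sh
# Sketch: compute the busiest queue's share of packets from
# `ethtool -S ethX`-style output. A canned sample stands in for
# the real command output.
stats='rx_queue_0_packets: 251000
rx_queue_1_packets: 248500
rx_queue_2_packets: 1900000
rx_queue_3_packets: 249700'

# Keep only the per-queue packet counters, track max and total.
share=$(printf '%s\n' "$stats" | awk -F': ' '
  /rx_queue_[0-9]+_packets/ { total += $2; if ($2 > max) max = $2 }
  END { printf "%d", 100 * max / total }')
echo "busiest queue carries ${share}% of packets"
```

With an even spread across 16 queues the busiest queue should sit near 6-7%; a number far above that points at a single flow (or a hash problem) rather than at Suricata.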



2013/6/3 Fernando Sclavo <fsclavo at gmail.com>

> Correction to previous email: runmode IS set to workers
>
>
> 2013/6/3 Fernando Sclavo <fsclavo at gmail.com>
>
>> Hi Peter/Eric, I will try "flow per cpu" and mail the results. Same for
>> "workers", but if I'm not mistaken we already tried it and CPU usage was very high.
>>
>>
>> Queues and IRQ affinity: each NIC has 16 queues, each queue's IRQ pinned to
>> its own core (via the Intel driver script), and Suricata has CPU affinity enabled;
>> we confirmed that each thread stays on its own core.
>>
>>
>>
>> 2013/6/3 Eric Leblond <eric at regit.org>
>>
>>> Hi,
>>>
>>> Le lundi 03 juin 2013 à 15:54 +0200, Peter Manev a écrit :
>>> >
>>> >
>>> >
>>> > On Mon, Jun 3, 2013 at 3:34 PM, Fernando Sclavo <fsclavo at gmail.com>
>>> > wrote:
>>> >         Hi all!
>>> >         We are running Suricata 1.4.2 with two Intel x520 cards,
>>> >         connected each one to the core switches on our datacenter
>>> >         network. The average traffic is about 1~2Gbps per port.
>>> >         As you can see on the following top output, there are some
>>> >         threads significantly more loaded than others (AFPacketeth54
>>> >         for example): these threads are continuously dropping kernel
>>> >         packets. We raised kernel parameters (buffers, rmem, etc.)
>>> >         and lowered Suricata flow timeouts to just a few seconds, but
>>> >         we can't stop the drop counter from growing when CPU goes to
>>> >         99.9% for a specific thread.
>>> >         What can we do to better balance the load across all threads
>>> >         and prevent this issue?
>>> >
>>> >         The server is a Dell R715 2x16 core AMD Opteron(tm) Processor
>>> >         6284, 192Gb RAM.
>>> >
>>> >         idsuser at suricata:~$ top -d2
>>> >
>>> >         top - 10:24:05 up 1 min,  2 users,  load average: 4.49, 1.14, 0.38
>>> >         Tasks: 287 total,  15 running, 272 sleeping,   0 stopped,   0 zombie
>>> >         Cpu(s): 30.3%us,  1.3%sy,  0.0%ni, 65.3%id,  0.0%wa,  0.0%hi,  3.1%si,  0.0%st
>>> >         Mem:  198002932k total, 59619020k used, 138383912k free,    25644k buffers
>>> >         Swap: 15624188k total,        0k used, 15624188k free,   161068k cached
>>> >
>>> >           PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>> >          2309 root      18  -2 55.8g  54g  51g R 99.9 28.6   0:20.96 AFPacketeth54
>>> >          2314 root      18  -2 55.8g  54g  51g R 99.9 28.6   0:18.29 AFPacketeth59
>>> >          2318 root      18  -2 55.8g  54g  51g R 99.9 28.6   0:12.90 AFPacketeth513
>>> >          2319 root      18  -2 55.8g  54g  51g R 77.6 28.6   0:12.78 AFPacketeth514
>>> >          2307 root      20   0 55.8g  54g  51g S 66.6 28.6   0:21.25 AFPacketeth52
>>> >          2338 root      20   0 55.8g  54g  51g R 58.2 28.6   0:09.94 FlowManagerThre
>>> >          2310 root      18  -2 55.8g  54g  51g S 51.2 28.6   0:15.35 AFPacketeth55
>>> >          2320 root      18  -2 55.8g  54g  51g R 50.2 28.6   0:07.83 AFPacketeth515
>>> >          2313 root      18  -2 55.8g  54g  51g S 48.7 28.6   0:11.66 AFPacketeth58
>>> >          2321 root      18  -2 55.8g  54g  51g S 47.7 28.6   0:07.75 AFPacketeth516
>>> >          2315 root      18  -2 55.8g  54g  51g R 45.2 28.6   0:12.18 AFPacketeth510
>>> >          2306 root      22   2 55.8g  54g  51g R 37.3 28.6   0:12.32 AFPacketeth51
>>> >          2312 root      18  -2 55.8g  54g  51g S 35.8 28.6   0:11.90 AFPacketeth57
>>> >          2308 root      20   0 55.8g  54g  51g R 34.8 28.6   0:16.69 AFPacketeth53
>>> >          2317 root      18  -2 55.8g  54g  51g R 33.3 28.6   0:07.93 AFPacketeth512
>>> >          2316 root      18  -2 55.8g  54g  51g S 28.8 28.6   0:08.03 AFPacketeth511
>>> >          2311 root      18  -2 55.8g  54g  51g S 24.9 28.6   0:10.51 AFPacketeth56
>>> >          2331 root      18  -2 55.8g  54g  51g R 19.9 28.6   0:02.41 AFPacketeth710
>>> >          2323 root      18  -2 55.8g  54g  51g S 17.9 28.6   0:03.60 AFPacketeth72
>>> >          2336 root      18  -2 55.8g  54g  51g S 16.9 28.6   0:01.50 AFPacketeth715
>>> >          2333 root      18  -2 55.8g  54g  51g S 14.9 28.6   0:02.14 AFPacketeth712
>>> >          2330 root      18  -2 55.8g  54g  51g S 13.9 28.6   0:02.12 AFPacketeth79
>>> >          2324 root      18  -2 55.8g  54g  51g R 11.9 28.6   0:02.96 AFPacketeth73
>>> >          2329 root      18  -2 55.8g  54g  51g S 11.9 28.6   0:01.90 AFPacketeth78
>>> >          2335 root      18  -2 55.8g  54g  51g S 11.9 28.6   0:01.44 AFPacketeth714
>>> >          2334 root      18  -2 55.8g  54g  51g R 10.9 28.6   0:01.68 AFPacketeth713
>>> >          2325 root      18  -2 55.8g  54g  51g S  9.4 28.6   0:02.38 AFPacketeth74
>>> >          2326 root      18  -2 55.8g  54g  51g S  8.9 28.6   0:02.71 AFPacketeth75
>>> >          2327 root      18  -2 55.8g  54g  51g S  7.5 28.6   0:01.98 AFPacketeth76
>>> >          2332 root      18  -2 55.8g  54g  51g S  7.5 28.6   0:01.53 AFPacketeth711
>>> >          2337 root      18  -2 55.8g  54g  51g S  7.0 28.6   0:01.09 AFPacketeth716
>>> >          2328 root      18  -2 55.8g  54g  51g S  6.0 28.6   0:02.11 AFPacketeth77
>>> >          2322 root      18  -2 55.8g  54g  51g R  5.5 28.6   0:03.78 AFPacketeth71
>>> >             3 root      20   0     0    0    0 S  4.5  0.0   0:01.25 ksoftirqd/0
>>> >            11 root      20   0     0    0    0 S  0.5  0.0   0:00.14 kworker/0:1
>>> >
>>> >         Regards
>>> >
>>> >
>>> >         _______________________________________________
>>> >         Suricata IDS Users mailing list:
>>> >         oisf-users at openinfosecfoundation.org
>>> >         Site: http://suricata-ids.org | Support:
>>> >         http://suricata-ids.org/support/
>>> >         List:
>>> >
>>> https://lists.openinfosecfoundation.org/mailman/listinfo/oisf-users
>>> >         OISF: http://www.openinfosecfoundation.org/
>>> >
>>> >
>>> > Hi,
>>> >
>>> >
>>> > You could try "runmode: workers".
>>>
>>> From thread name it seems it is already the case.
>>>
>>> >
>>> >
>>> > What is your flow balance method?
>>> >
>>> > Can you try "flow per cpu" in the yaml section of afpacket?
>>> > ("cluster-type: cluster_cpu")
>>>
>>> It could help indeed.
>>>
>>> A few questions:
>>>
>>> Are your IRQ affinity settings correct? (i.e. is multiqueue used on the
>>> NICs and well balanced across CPUs?)
>>>
>>> If you have a lot of UDP on your network, use ethtool to load-balance it,
>>> as this is not done by default.
>>>
>>> BR,
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > Thank you
>>> >
>>> >
>>> > --
>>> > Regards,
>>> > Peter Manev
>>>
>>>
>>
>

