[Oisf-users] Unbalanced load on AFpacket threads

Victor Julien lists at inliniac.net
Mon Jun 3 17:56:06 UTC 2013


On 06/03/2013 07:52 PM, Fernando Sclavo wrote:
> We set "cluster-type: cluster_cpu" as suggested and the CPU load dropped
> from 30% (average) to 5%! But the imbalance is still there. Also, the
> UDP traffic is now balanced (sudo ethtool -N eth7 rx-flow-hash udp4 sdfn).
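For reference, the "cluster-type" change discussed here lives in the af-packet section of suricata.yaml. A minimal sketch (interface name and thread count taken from this thread; cluster-id and the remaining values are assumed, not from the posters' config):

```yaml
af-packet:
  - interface: eth5
    threads: 16                 # one capture thread per NIC queue
    cluster-id: 99              # arbitrary id, unique per interface
    cluster-type: cluster_cpu   # keep each flow on the CPU that received it
    defrag: yes
```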
> 
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 
>  2299 root      22   2 55.8g  53g  51g R 80.1 28.6   5:12.04 AFPacketeth51
>  2331 root      20   0 55.8g  53g  51g R 19.9 28.6   1:19.97 FlowManagerThre
>  2324 root      18  -2 55.8g  53g  51g S 16.4 28.6   1:13.06 AFPacketeth710
>  2315 root      18  -2 55.8g  53g  51g S 11.9 28.6   0:49.75 AFPacketeth71
>  2328 root      18  -2 55.8g  53g  51g S 11.9 28.6   0:55.22 AFPacketeth714
>  2316 root      18  -2 55.8g  53g  51g S 10.9 28.6   0:54.53 AFPacketeth72
>  2326 root      18  -2 55.8g  53g  51g S 10.9 28.6   0:45.33 AFPacketeth712
>  2317 root      18  -2 55.8g  53g  51g S 10.4 28.6   0:38.21 AFPacketeth73
>  2323 root      18  -2 55.8g  53g  51g S  9.9 28.6   0:44.72 AFPacketeth79
> 
> 
> Dropped kernel packets:
> 
> capture.kernel_drops      | AFPacketeth51             | 449774742
> capture.kernel_drops      | AFPacketeth52             | 48573
> capture.kernel_drops      | AFPacketeth53             | 104763
> capture.kernel_drops      | AFPacketeth54             | 108080
> capture.kernel_drops      | AFPacketeth55             | 95763
> capture.kernel_drops      | AFPacketeth56             | 105133
> capture.kernel_drops      | AFPacketeth57             | 103984
> capture.kernel_drops      | AFPacketeth58             | 100208
> capture.kernel_drops      | AFPacketeth59             | 86704
> capture.kernel_drops      | AFPacketeth510            | 95995
> capture.kernel_drops      | AFPacketeth511            | 89633
> capture.kernel_drops      | AFPacketeth512            | 94029
> capture.kernel_drops      | AFPacketeth513            | 95192
> capture.kernel_drops      | AFPacketeth514            | 106460
> capture.kernel_drops      | AFPacketeth515            | 109770
> capture.kernel_drops      | AFPacketeth516            | 108373
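These per-thread counters come from Suricata's periodic stats.log dump. A quick way to pull the most recent kernel-drop lines (a sketch; the log path is the one used elsewhere in this thread, and 32 covers the 16 queues on each of the two NICs):

```shell
# Print the most recent capture.kernel_drops lines, one per AF_PACKET
# capture thread, from Suricata's stats.log.
grep 'capture.kernel_drops' /var/log/suricata/stats.log | tail -n 32
```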

Can you share a full record from the stats.log?

Cheers,
Victor

> 
> idsuser at suricata:/var/log/suricata$ cat /etc/rc.local
> #!/bin/sh -e
> #
> # rc.local
> #
> # This script is executed at the end of each multiuser runlevel.
> # Make sure that the script will "exit 0" on success or any other
> # value on error.
> #
> # In order to enable or disable this script just change the execution
> # bits.
> #
> # By default this script does nothing.
> 
> sudo sysctl -w net.core.rmem_max=536870912
> sudo sysctl -w net.core.wmem_max=67108864
> sudo sysctl -w net.ipv4.tcp_window_scaling=1
> sudo sysctl -w net.core.netdev_max_backlog=1000000
> 
> # Set MMRBC size on the bus to 4K
> sudo setpci -d 8086:10fb e6.b=2e
> 
> # sudo sysctl -w net.ipv4.tcp_rmem="4096 87380 67108864"
> # sudo sysctl -w net.ipv4.tcp_wmem="4096 87380 67108864"
> 
> sleep 2
> sudo rmmod ixgbe
> sleep 2
> sudo insmod /lib/modules/3.2.0-45-generic/kernel/drivers/net/ethernet/intel/ixgbe/ixgbe.ko \
>     FdirPballoc=3,3,3,3 RSS=16,16,16,16 DCA=2,2,2,2
> sleep 2
> 
> # Set ring size
> # sudo ethtool -G eth4 rx 4096
> sudo ethtool -G eth5 rx 4096
> # sudo ethtool -G eth6 rx 4096
> sudo ethtool -G eth7 rx 4096
> 
> # Load-balance UDP flows
> # sudo ethtool -N eth4 rx-flow-hash udp4 sdfn
> sudo ethtool -N eth5 rx-flow-hash udp4 sdfn
> # sudo ethtool -N eth6 rx-flow-hash udp4 sdfn
> sudo ethtool -N eth7 rx-flow-hash udp4 sdfn
> 
> sleep 2
> sudo ksh /home/idsuser/ixgbe-3.14.5/scripts/set_irq_affinity eth4 eth5 eth6 eth7
> sleep 2
> # sudo ifconfig eth4 up && sleep 1
> sudo ifconfig eth5 up && sleep 1
> # sudo ifconfig eth6 up && sleep 1
> sudo ifconfig eth7 up && sleep 1
> sleep 5
> sudo suricata -D -c /etc/suricata/suricata.yaml --af-packet
> sleep 10
> sudo barnyard2 -c /etc/suricata/barnyard2.conf -d /var/log/suricata \
>     -f unified2.alert -w /var/log/suricata/suricata.waldo -D
> exit 0
> 
> 
> 
> 2013/6/3 Fernando Sclavo <fsclavo at gmail.com <mailto:fsclavo at gmail.com>>
> 
>     Correction to previous email: runmode IS set to workers
> 
> 
>     2013/6/3 Fernando Sclavo <fsclavo at gmail.com <mailto:fsclavo at gmail.com>>
> 
>         Hi Peter/Eric, I will try "flow per cpu" and mail the results.
>         Same for "workers", though if I'm not mistaken we already tried
>         that and CPU usage was very high.
> 
> 
>         Queues and IRQ affinity: each NIC has 16 queues, each with its
>         IRQ assigned to one core (via the Intel driver script), and
>         Suricata has CPU affinity enabled; we confirmed that each
>         thread stays on its own core.
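One way to double-check that balance at runtime is to sum each queue's interrupt counts across the per-CPU columns of /proc/interrupts (a sketch; it assumes the usual ixgbe queue naming such as eth5-TxRx-0):

```shell
# For every eth5 queue line in /proc/interrupts, add up the numeric
# per-CPU columns and print "<queue-name> <total>"; roughly equal
# totals mean the hardware hash is spreading flows evenly.
grep 'eth5' /proc/interrupts |
  awk '{ s = 0; for (i = 2; i <= NF; i++) if ($i ~ /^[0-9]+$/) s += $i
         print $NF, s }'
```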
> 
> 
> 
>         2013/6/3 Eric Leblond <eric at regit.org <mailto:eric at regit.org>>
> 
>             Hi,
> 
>             On Monday, June 3, 2013 at 15:54 +0200, Peter Manev wrote:
>             >
>             >
>             >
>             > On Mon, Jun 3, 2013 at 3:34 PM, Fernando Sclavo
>             <fsclavo at gmail.com <mailto:fsclavo at gmail.com>>
>             > wrote:
>             >         Hi all!
>             >         We are running Suricata 1.4.2 with two Intel x520
>             >         cards, each connected to a core switch in our
>             >         datacenter network. The average traffic is about
>             >         1~2 Gbps per port.
>             >         As you can see in the following top output, some
>             >         threads are significantly more loaded than others
>             >         (AFPacketeth54 for example): these threads are
>             >         continuously dropping kernel packets. We raised
>             >         kernel parameters (buffers, rmem, etc.) and lowered
>             >         Suricata flow timeouts to just a few seconds, but
>             >         we can't keep the drop counter static when the CPU
>             >         goes to 99.9% for a specific thread.
>             >         What can we do to balance the load better across
>             >         all threads and prevent this issue?
>             >
>             >         The server is a Dell R715 with 2x16-core AMD
>             >         Opteron(tm) Processor 6284 and 192 GB RAM.
>             >
>             >         idsuser at suricata:~$ top -d2
>             >
>             >         top - 10:24:05 up 1 min,  2 users,  load average: 4.49, 1.14, 0.38
>             >         Tasks: 287 total,  15 running, 272 sleeping,   0 stopped,   0 zombie
>             >         Cpu(s): 30.3%us,  1.3%sy,  0.0%ni, 65.3%id,  0.0%wa,  0.0%hi,  3.1%si,  0.0%st
>             >         Mem:  198002932k total, 59619020k used, 138383912k free,    25644k buffers
>             >         Swap: 15624188k total,        0k used, 15624188k free,   161068k cached
>             >
>             >           PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>             >          2309 root      18  -2 55.8g  54g  51g R 99.9 28.6   0:20.96 AFPacketeth54
>             >          2314 root      18  -2 55.8g  54g  51g R 99.9 28.6   0:18.29 AFPacketeth59
>             >          2318 root      18  -2 55.8g  54g  51g R 99.9 28.6   0:12.90 AFPacketeth513
>             >          2319 root      18  -2 55.8g  54g  51g R 77.6 28.6   0:12.78 AFPacketeth514
>             >          2307 root      20   0 55.8g  54g  51g S 66.6 28.6   0:21.25 AFPacketeth52
>             >          2338 root      20   0 55.8g  54g  51g R 58.2 28.6   0:09.94 FlowManagerThre
>             >          2310 root      18  -2 55.8g  54g  51g S 51.2 28.6   0:15.35 AFPacketeth55
>             >          2320 root      18  -2 55.8g  54g  51g R 50.2 28.6   0:07.83 AFPacketeth515
>             >          2313 root      18  -2 55.8g  54g  51g S 48.7 28.6   0:11.66 AFPacketeth58
>             >          2321 root      18  -2 55.8g  54g  51g S 47.7 28.6   0:07.75 AFPacketeth516
>             >          2315 root      18  -2 55.8g  54g  51g R 45.2 28.6   0:12.18 AFPacketeth510
>             >          2306 root      22   2 55.8g  54g  51g R 37.3 28.6   0:12.32 AFPacketeth51
>             >          2312 root      18  -2 55.8g  54g  51g S 35.8 28.6   0:11.90 AFPacketeth57
>             >          2308 root      20   0 55.8g  54g  51g R 34.8 28.6   0:16.69 AFPacketeth53
>             >          2317 root      18  -2 55.8g  54g  51g R 33.3 28.6   0:07.93 AFPacketeth512
>             >          2316 root      18  -2 55.8g  54g  51g S 28.8 28.6   0:08.03 AFPacketeth511
>             >          2311 root      18  -2 55.8g  54g  51g S 24.9 28.6   0:10.51 AFPacketeth56
>             >          2331 root      18  -2 55.8g  54g  51g R 19.9 28.6   0:02.41 AFPacketeth710
>             >          2323 root      18  -2 55.8g  54g  51g S 17.9 28.6   0:03.60 AFPacketeth72
>             >          2336 root      18  -2 55.8g  54g  51g S 16.9 28.6   0:01.50 AFPacketeth715
>             >          2333 root      18  -2 55.8g  54g  51g S 14.9 28.6   0:02.14 AFPacketeth712
>             >          2330 root      18  -2 55.8g  54g  51g S 13.9 28.6   0:02.12 AFPacketeth79
>             >          2324 root      18  -2 55.8g  54g  51g R 11.9 28.6   0:02.96 AFPacketeth73
>             >          2329 root      18  -2 55.8g  54g  51g S 11.9 28.6   0:01.90 AFPacketeth78
>             >          2335 root      18  -2 55.8g  54g  51g S 11.9 28.6   0:01.44 AFPacketeth714
>             >          2334 root      18  -2 55.8g  54g  51g R 10.9 28.6   0:01.68 AFPacketeth713
>             >          2325 root      18  -2 55.8g  54g  51g S  9.4 28.6   0:02.38 AFPacketeth74
>             >          2326 root      18  -2 55.8g  54g  51g S  8.9 28.6   0:02.71 AFPacketeth75
>             >          2327 root      18  -2 55.8g  54g  51g S  7.5 28.6   0:01.98 AFPacketeth76
>             >          2332 root      18  -2 55.8g  54g  51g S  7.5 28.6   0:01.53 AFPacketeth711
>             >          2337 root      18  -2 55.8g  54g  51g S  7.0 28.6   0:01.09 AFPacketeth716
>             >          2328 root      18  -2 55.8g  54g  51g S  6.0 28.6   0:02.11 AFPacketeth77
>             >          2322 root      18  -2 55.8g  54g  51g R  5.5 28.6   0:03.78 AFPacketeth71
>             >             3 root      20   0     0    0    0 S  4.5  0.0   0:01.25 ksoftirqd/0
>             >            11 root      20   0     0    0    0 S  0.5  0.0   0:00.14 kworker/0:1
>             >
>             >         Regards
>             >
>             >
>             >         _______________________________________________
>             >         Suricata IDS Users mailing list:
>             >         oisf-users at openinfosecfoundation.org
>             <mailto:oisf-users at openinfosecfoundation.org>
>             >         Site: http://suricata-ids.org | Support:
>             >         http://suricata-ids.org/support/
>             >         List:
>             >        
>             https://lists.openinfosecfoundation.org/mailman/listinfo/oisf-users
>             >         OISF: http://www.openinfosecfoundation.org/
>             >
>             >
>             > Hi,
>             >
>             >
>             > You could try "runmode: workers".
> 
>             From the thread names it seems that is already the case.
> 
>             >
>             >
>             > What is your flow balance method?
>             >
>             > Can you try "flow per cpu" in the af-packet section of the
>             > yaml? ("cluster-type: cluster_cpu")
> 
>             It could help indeed.
> 
>             A few questions:
> 
>             Are your IRQ affinity settings correct? (meaning: is
>             multiqueue used on the NICs, and are the queues well
>             balanced across CPUs?)
> 
>             If you have a lot of UDP on your network, use ethtool to
>             load-balance it, as that is not done by default.
> 
>             BR,
>             >
>             >
>             >
>             >
>             >
>             > Thank you
>             >
>             >
>             > --
>             > Regards,
>             > Peter Manev
> 
> 
> 
> 
> 
> 
> 


-- 
---------------------------------------------
Victor Julien
http://www.inliniac.net/
PGP: http://www.inliniac.net/victorjulien.asc
---------------------------------------------



