[Oisf-users] packet on wrong thread x710 cards

Mon Jul 6 17:05:53 UTC 2020

Hi

I am aware of https://redmine.openinfosecfoundation.org/issues/2725, where
the conclusion seems to be that cluster_qm with symmetric hashing solves the
pkt_on_wrong_thread issue. Unfortunately, this is not the case for my
setup.
I am using two X710 10G cards on two NUMA nodes, with two Intel 5218 CPUs
and HT enabled.
This is going to be a production Suricata setup, and I am getting around 3-5
Gbps on one interface. I have enabled only around 6000 rules for testing.
The only way I avoid pkt_on_wrong_thread entirely is by using autofp, but
then CPU usage becomes very high, so I don't think that is sustainable.

I am testing with cluster_qm and symmetric hashing.

My setup is:

ethtool -i ens3f0

driver: i40e
version: 2.3.2-k
firmware-version: 7.10 0x800075df 19.5.12
expansion-rom-version:
bus-info: 0000:3b:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes

Suricata from the Debian repo

suricata -V

This is Suricata version 4.1.2 RELEASE

kernel : uname -r

4.19.0-9-amd64

cat /sys/devices/system/node/node0/cpulist

0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62

cat /sys/devices/system/node/node1/cpulist

1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,63

ethtool -L ens3f0 combined 32

ethtool -K ens3f0 rxhash on

ethtool -K ens3f0 ntuple on

same with ens4f0

./set_irq_affinity 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62 ens3f0

./set_irq_affinity 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,63 ens4f0
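
As a cross-check of the affinity lists above, here is a minimal sketch that parses a kernel cpulist string and reports any IRQ CPUs that fall outside a given NUMA node. The helper names (parse_cpulist, cpus_off_node) are hypothetical, and the node0/node1 strings are rebuilt from the cpulist output shown earlier. Which node a NIC actually sits on is not checked here; on Linux that is normally readable from /sys/class/net/<interface>/device/numa_node.

```python
from typing import List


def parse_cpulist(text: str) -> List[int]:
    """Parse a kernel cpulist such as '0,2,4' or '0-3,8' into a list of CPU ids."""
    cpus: List[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def cpus_off_node(irq_cpus: str, node_cpus: str) -> List[int]:
    """Return the CPUs from irq_cpus that are not part of node_cpus."""
    node = set(parse_cpulist(node_cpus))
    return [cpu for cpu in parse_cpulist(irq_cpus) if cpu not in node]


# Same values as the node0/node1 cpulist output in this post.
node0 = ",".join(str(c) for c in range(0, 64, 2))
node1 = ",".join(str(c) for c in range(1, 64, 2))

# An empty list means every IRQ CPU is local to that node.
print("ens3f0 IRQ CPUs outside node0:", cpus_off_node(node0, node0))
print("ens4f0 IRQ CPUs outside node1:", cpus_off_node(node1, node1))
```

An empty result for both interfaces only confirms that the lists passed to set_irq_affinity match the node cpulists; it says nothing about how the driver actually distributed the interrupts.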

ethtool -X ens3f0 hkey
6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A
equal 32

same with ens4f0
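
For context on why the repeated 6D:5A key is used: with a Toeplitz RSS hash, a key that repeats every 16 bits gives the same hash when source and destination addresses and ports are swapped, so both directions of a flow should land on the same queue. The sketch below is a plain software model of the standard Toeplitz hash, not the NIC's implementation. The 52-byte key length, the IPv4/TCP input layout (source IP, destination IP, source port, destination port) and the sample addresses are assumptions for illustration.

```python
import socket
import struct

# Assumption: 52-byte key built from the repeating 6D:5A pattern.
KEY = bytes.fromhex("6d5a") * 26


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Toeplitz hash: XOR the 32-bit key window at each set input bit."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i in range(len(data) * 8):
        byte_index, bit_index = divmod(i, 8)
        if data[byte_index] & (0x80 >> bit_index):
            # 32-bit window of the key starting at bit offset i (MSB first).
            window = (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
            result ^= window
    return result


def rss_input_v4(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Build the 12-byte hash input for an IPv4 flow with ports."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!HH", src_port, dst_port))


# Hypothetical flow, hashed in both directions.
forward = toeplitz_hash(KEY, rss_input_v4("10.0.0.1", "192.168.1.20", 51234, 443))
reverse = toeplitz_hash(KEY, rss_input_v4("192.168.1.20", "10.0.0.1", 443, 51234))
print(hex(forward), hex(reverse), forward == reverse)
```

The symmetry follows from the key period: swapping the IPs moves each bit by 32 positions and swapping the ports moves each bit by 16, and both shifts are multiples of the 16-bit key period, so every set bit selects the same key window as before.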

for both interfaces

for i in rx tx tso gso gro rxhash ntuple sg txvlan rxvlan ; do ethtool -K ens4f0 $i off ; echo $i ; done

for proto in tcp4 udp4 tcp6 udp6; do /sbin/ethtool -N ens4f0 rx-flow-hash $proto sdfn ; done

suricata --dump-config | grep af-packet

af-packet = (null)
af-packet.0 = interface
af-packet.0.interface = ens3f0
af-packet.0.threads = 32
af-packet.0.cluster-id = 99
af-packet.0.cluster-type = cluster_qm
af-packet.0.defrag = yes
af-packet.0.use-mmap = yes
af-packet.0.mmap-locked = yes
af-packet.0.tpacket-v3 = yes
af-packet.0.ring-size = 200000
af-packet.0.block-size = 1048576
af-packet.1 = interface
af-packet.1.interface = ens4f0
af-packet.1.threads = 32
af-packet.1.cluster-id = 98
af-packet.1.cluster-type = cluster_qm
af-packet.1.defrag = yes
af-packet.1.tpacket-v3 = yes
af-packet.1.use-mmap = yes
af-packet.1.mmap-locked = yes
af-packet.1.ring-size = 200000
af-packet.1.block-size = 1048576
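
Since cluster_qm ties each capture socket to one RSS queue, the thread count per interface is usually kept equal to the number of combined queues (32 here, from ethtool -L). The following is a minimal sketch that parses the dumped af-packet settings and flags any cluster_qm interface whose thread count differs from the queue count. The function names are hypothetical, and the sample text is a subset of the dump above.

```python
import re
from typing import Dict, List

# Subset of the `suricata --dump-config | grep af-packet` output above.
DUMP = """\
af-packet = (null)
af-packet.0 = interface
af-packet.0.interface = ens3f0
af-packet.0.threads = 32
af-packet.0.cluster-type = cluster_qm
af-packet.1 = interface
af-packet.1.interface = ens4f0
af-packet.1.threads = 32
af-packet.1.cluster-type = cluster_qm
"""

SETTING = re.compile(r"^af-packet\.(\d+)\.([\w-]+)\s*=\s*(.*)$")


def afpacket_sections(dump_text: str) -> Dict[int, Dict[str, str]]:
    """Group 'af-packet.N.key = value' lines by interface index N."""
    sections: Dict[int, Dict[str, str]] = {}
    for line in dump_text.splitlines():
        match = SETTING.match(line.strip())
        if match:
            index, key, value = match.groups()
            sections.setdefault(int(index), {})[key] = value.strip()
    return sections


def queue_mismatches(sections: Dict[int, Dict[str, str]], rss_queues: int) -> List[str]:
    """Return cluster_qm interfaces whose thread count differs from rss_queues."""
    bad = []
    for settings in sections.values():
        if settings.get("cluster-type") != "cluster_qm":
            continue
        if settings.get("threads") != str(rss_queues):
            bad.append(settings.get("interface", "?"))
    return bad


sections = afpacket_sections(DUMP)
print(queue_mismatches(sections, rss_queues=32))
```

With the values from this post the list is empty, so the thread count and the queue count agree on both interfaces.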

When I start Suricata, pkt_on_wrong_thread is around 20% of
capture.kernel_packets. Over a few hours the ratio gradually comes down to
1-2%, but the counter itself keeps increasing.
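
Both counters are cumulative, so a ratio taken hours after startup mixes the early burst with the current behaviour. A minimal sketch of comparing the cumulative ratio with the ratio over an interval between two samples follows. The sample numbers are hypothetical and chosen only to mirror the 20% and roughly 2% figures described above; they are not measurements from this system.

```python
from typing import Tuple

# A sample is (capture.kernel_packets, pkt_on_wrong_thread), both cumulative.
Sample = Tuple[int, int]


def pct(part: int, total: int) -> float:
    """Percentage of part in total, or 0.0 when total is zero."""
    return 100.0 * part / total if total else 0.0


def cumulative_pct(sample: Sample) -> float:
    """Wrong-thread percentage since startup."""
    kernel_packets, wrong_thread = sample
    return pct(wrong_thread, kernel_packets)


def interval_pct(previous: Sample, current: Sample) -> float:
    """Wrong-thread percentage for packets seen between two samples."""
    delta_packets = current[0] - previous[0]
    delta_wrong = current[1] - previous[1]
    return pct(delta_wrong, delta_packets)


# Hypothetical samples: shortly after start, and a few hours later.
early = (1_000_000, 200_000)
later = (11_000_000, 220_000)

print("cumulative at start:", cumulative_pct(early))
print("cumulative later:", cumulative_pct(later))
print("interval only:", interval_pct(early, later))
```

Looking at the interval figure separates a one-time burst at startup from a steady rate of misdirected packets, which the cumulative ratio alone cannot show.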

I can see from pidstat that only even-numbered CPUs are being used for
ens3f0 and odd-numbered ones for ens4f0, as expected given the NUMA layout.

mpstat shows all CPUs being used, but the usage is really low, 2-3%.

I haven't enabled cpu_affinity in the config file, as I do not see load as
an issue here.

Even if I use a single interface, it still shows pkt_on_wrong_thread.

Any suggestions would be really appreciated, as no config change so far has
removed pkt_on_wrong_thread.

Regards

Kashif