[Oisf-users] suricata 3.2.0 for 10Gb performance

Maxim hittlle at 163.com
Fri Jan 20 01:08:27 UTC 2017

Hi Cooper,
Thanks very much. I could not open http://marc.info/?l=linux-netdev&m=148181173415107&w=2 to get the patch for my ixgbe driver. I am using ixgbe-4.4.6, the latest version available from Intel's official site. Do I need to patch it?

Could you please share your experience optimizing suricata performance? Could you please send me a list? Currently I use multiple queues with RSS, plus RFS, and my setup can process nearly 5 gigabits of traffic per second. I want to try your approach, that is, a single receive queue plus RFS.

Another question: what size should I set for the RX queue? Does this size have something to do with the CPU's L3 cache size? I used perf to record my cache misses, and my cache miss rate is nearly 50%; maybe I can reduce this. Many thanks.
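For reference, here is a rough Python sketch of how the RFS flow tables might be sized for the two layouts mentioned above (a single receive queue, or 16 queues). It only builds a dictionary of paths and values and writes nothing. The device name `eth0` and the function name are hypothetical. The 32768 figure and the "divide by the number of queues" rule follow the suggestion in the kernel's network scaling documentation, and may not be the best values for a 10Gb sensor.

```python
def rfs_settings(dev, n_rx_queues, sock_flow_entries=32768):
    """Return {path: value} for the RFS flow tables; nothing is written.

    Assumptions: `dev` is a placeholder interface name, and 32768 is the
    starting value suggested by the kernel scaling documentation.
    """
    # Global socket flow table, shared by all queues.
    settings = {
        "/proc/sys/net/core/rps_sock_flow_entries": str(sock_flow_entries),
    }
    # Per-queue flow table: the global size split across the RX queues.
    per_queue = max(1, sock_flow_entries // n_rx_queues)
    for q in range(n_rx_queues):
        path = "/sys/class/net/%s/queues/rx-%d/rps_flow_cnt" % (dev, q)
        settings[path] = str(per_queue)
    return settings


single_queue = rfs_settings("eth0", 1)
sixteen_queues = rfs_settings("eth0", 16)

for path, value in sorted(single_queue.items()):
    print(path, "=", value)
```

With one queue the per-queue table is the same size as the global one; with 16 queues each queue gets 2048 entries. Note that RFS builds on RPS, so each queue's `rps_cpus` mask would also have to be set before any steering happens.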


At 2017-01-20 06:53:30, "Cooper F. Nelson" <cnelson at ucsd.edu> wrote:
>Hardware RSS has problems because often the flow load balancing is not
>symmetric.  This causes problems with suricata as different cores handle
>each side of the flow and creates timing issues.
>I'm assuming you are using the ixgbe driver; if so, you probably need to
>patch it.
>> http://marc.info/?l=linux-netdev&m=148181173415107&w=2
>... and then set a special hash key to force symmetric flows.
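To illustrate what a symmetric hash key buys you, here is a minimal Python sketch of the Toeplitz hash that RSS uses, applied to both directions of one TCP flow. The message above does not say which key is meant; the key below (the 16-bit pattern 0x6D5A repeated to 40 bytes) is an assumption, taken from the commonly cited "symmetric RSS" trick. The addresses and ports are made-up documentation values.

```python
import socket
import struct

# Assumption: a key with a 16-bit period, here 0x6D5A repeated to 40 bytes.
# The original message does not state the actual key.
SYMMETRIC_KEY = bytes([0x6D, 0x5A]) * 20


def toeplitz_hash(key, data):
    """Toeplitz hash: XOR the 32-bit key window at every set input bit."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    if key_bits < len(data) * 8 + 31:
        raise ValueError("key too short for this input")
    result = 0
    for i in range(len(data) * 8):
        # Test input bit i, most significant bit first.
        if data[i // 8] & (0x80 >> (i % 8)):
            # 32-bit window of the key starting at bit offset i.
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


def tuple_bytes(src_ip, dst_ip, src_port, dst_port):
    """IPv4/TCP hash input: source address, destination address, ports."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!HH", src_port, dst_port))


forward = toeplitz_hash(
    SYMMETRIC_KEY, tuple_bytes("192.0.2.10", "198.51.100.20", 40000, 443))
reverse = toeplitz_hash(
    SYMMETRIC_KEY, tuple_bytes("198.51.100.20", "192.0.2.10", 443, 40000))

print(hex(forward), hex(reverse), forward == reverse)
```

Because the key repeats every 16 bits, swapping the two 32-bit addresses and the two 16-bit ports leaves every key window unchanged, so both directions hash to the same value and would land on the same queue. Whether a given driver lets you load such a key is a separate question.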
>I have a special experimental 3.2 build based around full hardware
>RSS/offloading using the new AF_PACKET tpacket-v3 mode which is showing
>some pretty spectacular performance improvements over the standard
>build.  If you are interested I can work with you off list to get it
>set up on your hardware, but I'll warn you there are lots of moving parts
>to get everything working correctly.
>The most important thing first is to make sure you are on a Linux
>distribution with a relatively 'fresh' kernel.  I'm on 4.8.7 currently
>and at least 4.7+ is recommended.  You also need to be able to install the
>source for the kernel and then patch the ixgbe module, or download the
>driver and then patch it.
>On 1/18/2017 6:58 PM, Maxim wrote:
>> Thanks all for your guidance. I've read this tutorial. Currently there
>> are two approaches to suricata performance tuning. One is to use
>> multiple queues and bind each queue's IRQ to a separate core; the
>> other, as this tutorial shows, is to use a single queue and let
>> Linux RFS (Receive Flow Steering) do what NIC RSS would do.
>> I have no idea which is better. I prefer the multiple-queue approach
>> because I think hardware is better at doing the calculation than RFS,
>> since the latter is implemented in software. What do you think? In my
>> case, I used 16 RX queues and bound them to 16 separate cores. When I
>> tried to simulate 10 gigabits of traffic per second, all 16 cores
>> were fully occupied, but I still had another 8 cores idling. I wanted
>> to use RFS to distribute the busy softirqs to the 8 idle cores, but it
>> turned out there was no significant improvement. I have hyperthreading
>> turned on, and my CPU runs at 2.1 GHz. Is my CPU too slow? Many thanks.
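As a small illustration of the 16-queue layout described in the quoted message, the sketch below computes the hexadecimal CPU masks such a setup would use: one mask per queue for pinning its IRQ to a single core, and one mask covering the 8 idle cores for steering. It assumes queue N is pinned to logical CPU N and that the idle cores are CPUs 16 to 23; the real numbering depends on the machine's topology, especially with hyperthreading on. The function name is hypothetical.

```python
def cpu_mask(cpus):
    """Hex bitmask with one bit set per logical CPU number.

    Note: on systems with more than 32 CPUs the kernel files expect
    comma-separated 32-bit groups; this sketch does not handle that.
    """
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


# Assumption: RX queue N has its IRQ pinned to logical CPU N (0..15).
irq_masks = {queue: cpu_mask([queue]) for queue in range(16)}

# Assumption: the 8 idle cores are logical CPUs 16..23.
idle_core_mask = cpu_mask(range(16, 24))

for queue in sorted(irq_masks):
    print("rx-%d IRQ affinity mask: %s" % (queue, irq_masks[queue]))
print("mask for the idle cores:", idle_core_mask)
```

The per-queue values are the kind written to `/proc/irq/<irq>/smp_affinity`, and the idle-core mask is the kind written to a queue's `rps_cpus` file. The IRQ numbers themselves have to be looked up in `/proc/interrupts`.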
>Cooper Nelson
>Network Security Analyst
>UCSD ITS Security Team
>cnelson at ucsd.edu x41042