[Oisf-users] Performance with 3x600Mbps and up to 10Gbps

Jozef Mlích Jozef.Mlich at trustport.com
Fri Oct 16 11:34:57 UTC 2015

Dear Suricata users,

we have done performance testing at 3x600 Mbit/s on SELKS. It works
well with runmode: workers, but not with autofp. We see about 0.6%
packet drop in workers mode, but about 54% packet drop in autofp.

For some reasons, we need to use autofp mode, but we are also
interested in the other modes.

We were using Suricata from SELKS (which is somewhere between 2.0.9 and
current master). Additionally, we ran the tests on CentOS 7 with
kernel 3.10+ and upstream Suricata (master branch) compiled last week,
with the same results.

Our configuration is almost the default one for SELKS except:
  - af-packet cluster_type=cluster_round_robin
  - af-packet threads 16
  - detect-engine profile
  - we have about 26k rules
  - we use just eve-log (json) as output, unified is disabled


runmode: workers  # 1st scenario
- packets replayed at 600 Mbit/s on each of 3 interfaces using
tcpreplay (i.e. 3x600 Mbit/s)
- duration: 11m 58s
- threads: 48+4
- avg load: 12.76 11.22 6.87
- packet drop: 0.57%, packets: 305908678, drops: 1748851

runmode: autofp  # 2nd scenario
- packets replayed at 600 Mbit/s on each of 3 interfaces using
tcpreplay (i.e. 3x600 Mbit/s)
- duration: 11m 57s
- threads: 60+4
- avg load: 40.85 30.08 23.66
- packet drop: 53.91%, packets: 302916555, drops: 163309837
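
The drop percentages above can be re-derived from the raw counters. The
following is only a small sketch of that arithmetic; it assumes, as the
figures above do, that the percentage is drops divided by the reported
packet count. The function name is made up for illustration.

```python
def drop_percent(packets: int, drops: int) -> float:
    """Return drops as a percentage of the reported packet count."""
    if packets <= 0:
        raise ValueError("packets must be positive")
    return 100.0 * drops / packets


# Counters taken from the two scenarios above.
scenarios = {
    "workers": (305908678, 1748851),
    "autofp": (302916555, 163309837),
}

for mode, (packets, drops) in scenarios.items():
    print(f"{mode}: {drop_percent(packets, drops):.2f}% dropped")
```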

Hardware:
- Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz (12 cores with HT enabled =
24 logical cores)
- Memory: 4x 16 GB DIMM @ 2133 MHz = 64 GB
- NICs: 4x Broadcom BCM5720, 1 Gbit, PCI Express (each dual-port card
should be able to handle 17 threads)

We have tried tuning the following settings, based on the links below [1-4]:
  - detect-engine, high profile and custom values
  - max-pending-packets
  - mpm-algo: ac (almost every test)
    - but no acceleration (cuda, napatech)
  - af-packet
    - cluster_type: cluster_flow/cluster_round_robin
    - ring_size
    - buffer_size
    - thread count
    - mtu
    - rollover [4]
  - sysctl 
    - rmem_max 
    - tcp_timestamps
    - tcp_sack
    - tcp_no_metrics_save
    - tcp_window_scaling
    - netdev_max_backlog
  - NIC offloading disabled (via ethtool) 
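
To make runs comparable, it can help to record the current sysctl values
before each test. The sketch below reads them from /proc/sys. The full
key names (net.core.*, net.ipv4.*) are my assumption about which keys
the short names in the list refer to, and the helper names are
hypothetical.

```python
from pathlib import Path

# Assumed full names for the sysctl keys listed above.
SYSCTLS = [
    "net.core.rmem_max",
    "net.ipv4.tcp_timestamps",
    "net.ipv4.tcp_sack",
    "net.ipv4.tcp_no_metrics_save",
    "net.ipv4.tcp_window_scaling",
    "net.core.netdev_max_backlog",
]


def sysctl_path(name: str, root: str = "/proc/sys") -> Path:
    """Map a dotted sysctl name to its file under /proc/sys."""
    return Path(root) / name.replace(".", "/")


def read_sysctl(name: str, root: str = "/proc/sys"):
    """Return the current value as a string, or None if it cannot be read."""
    try:
        return sysctl_path(name, root).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for name in SYSCTLS:
        print(f"{name} = {read_sysctl(name)}")
```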

What hardware do you recommend for our configuration in the following
scenarios?
- 1.5Gbps traffic with 30k rules
- 10Gbps traffic with 30k rules

Also, I am not sure about memory requirements. I read some information
in links [3] and [5], and I would like to know more about setting an
upper memory limit.

It seems the memory usage is computed as follows:
  [af-packet memory allocations] + [detect-engine memory allocation] + 

The af-packet memory allocation is given by:
af-packet = threads * mtu * ring_size
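
As a rough illustration, the formula above can be evaluated like this.
It only restates my formula and is not confirmed to match how Suricata
really accounts for ring memory. threads=16 is from our configuration;
the mtu and ring_size values are placeholders, not our real settings.

```python
def af_packet_bytes(threads: int, mtu: int, ring_size: int) -> int:
    """Estimate af-packet memory as threads * mtu * ring_size (in bytes)."""
    return threads * mtu * ring_size


# threads=16 is from the configuration above; mtu and ring_size are
# placeholder values chosen only to show the arithmetic.
estimate = af_packet_bytes(threads=16, mtu=1500, ring_size=2048)
print(f"{estimate} bytes = {estimate / 2**20:.1f} MiB")
```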

max-pending-packets should probably also appear in this formula, but I
am not sure where.

The detect-engine memory usage is given by multiplying and adding
8 values, which can be configured for a custom profile.
detect-engine.profile=high == ?
detect-engine.profile=low == ?

[1] https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Hi
[2] https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Tu
[3] https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Su
[4] https://lists.openinfosecfoundation.org/pipermail/oisf-users/2015-O

Jozef Mlich <Jozef.Mlich at trustport.com>
