[Oisf-users] 1Gbps NIDS performance tuning
Darren Spruell
phatbuckett at gmail.com
Tue May 20 18:00:33 UTC 2014
Hi,
Hoping for some performance tuning guidance on the following system:
Suricata 2.0 RELEASE
CentOS 6.5 amd64
Linux 2.6.32-431.17.1.el6.x86_64
libpcre 8.35
luajit 2.0.3
libpcap-1.4.0-1.20130826git2dbcaa1.el6.x86_64
12 core (2x6) Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz (24 cpu w/HT)
64GB RAM
For the time being we're held to this 2.6.32 kernel, and we're wondering
whether we can achieve suitable performance (minimal packet drop) with
AF_PACKET. PF_RING may be an option, but likely only vanilla PF_RING
rather than DNA/zero-copy, due to licensing. Can we get close to 0% drop
with the hardware/kernel/ruleset described below? Is an upgraded kernel a
major factor in achieving better performance for this configuration and
traffic profile?
I find pretty good information about 10Gbps Suricata tuning (Intel
82599 adapters, typically) but I'm not certain what pieces of network
adapter setup would apply to a 1Gbps adapter:
43:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network
Connection (rev 01)
43:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network
Connection (rev 01)
44:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network
Connection (rev 01)
44:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network
Connection (rev 01)
Updated igb driver to latest:
driver: igb
version: 5.2.5
firmware-version: 1.2.1
bus-info: 0000:43:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
Is it proper to disable all NIC offloading features?
$ sudo ethtool -k eth6
Features for eth6:
rx-checksumming: off
tx-checksumming: off
scatter-gather: off
tcp-segmentation-offload: off
udp-fragmentation-offload: off
generic-segmentation-offload: off
generic-receive-offload: off
large-receive-offload: off
ntuple-filters: off
receive-hashing: off
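(For reference, the offloads above were turned off with roughly the
following, run per capture interface; eth6 shown, and the feature
keywords are the legacy ethtool -K set -- drop any keyword the driver
rejects:)

$ sudo ethtool -K eth6 rx off tx off sg off tso off ufo off gso off gro off lro off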
$ sudo ethtool -g eth6
Ring parameters for eth6:
Pre-set maximums:
RX: 4096
RX Mini: 0
RX Jumbo: 0
TX: 4096
Current hardware settings:
RX: 256
RX Mini: 0
RX Jumbo: 0
TX: 256
$ sudo ethtool -n eth6
1 RX rings available
rxclass: Cannot get RX class rule count: Operation not supported
RX classification rule retrieval failed
Traffic throughput is around 500-800 Mbps, composed mostly of TCP
streams (forward proxy requests, clients <-> proxies).
Packet size brackets for interface bond1:

Packet Size (bytes)     Count      Packet Size (bytes)     Count
1 to 75:             31628146      751 to 825:             458232
76 to 150:            2284335      826 to 900:             361877
151 to 225:            651222      901 to 975:             672172
226 to 300:            524288      976 to 1050:            501744
301 to 375:            613914      1051 to 1125:           323661
376 to 450:           1391229      1126 to 1200:           362991
451 to 525:           1032754      1201 to 1275:          1312685
526 to 600:            818482      1276 to 1350:           341888
601 to 675:           7107770      1351 to 1425:           636282
676 to 750:            718379      1426 to 1500+:        34102516
Proto/Port   Pkts      Bytes      PktsTo    BytesTo    PktsFrom  BytesFrom
TCP/3128     3788780   2948M      1397043   219783k    2391737   2728M
TCP/443      40160     11690875   21846     2569412    18314     9121463
TCP/80       13948     10720193   5295      342935     8653      10377258
UDP/53       5749      756317     2925      204953     2824      551364
UDP/161      3260      527866     1710      219259     1550      308607
TCP/43       4969      373030     2493      154131     2476      218899
UDP/514      281       67116      267       66004      14        1112
TCP/22       129       27147      64        7581       65        19566
UDP/123      8         608        4         304        4         304

Protocol data rates: 555192.46 kbps total, 36979.02 kbps in, 518213.43 kbps out
Some flows are sizable (top 10 over a monitoring period of several minutes):
123715221 bytes
43233291 bytes
25762925 bytes
23353052 bytes
18263680 bytes
16888624 bytes
15858329 bytes
14250494 bytes
14081114 bytes
13980641 bytes
...many are quite small (smallest -> 46 bytes)
I intend to run a lightly tuned Emerging Threats ruleset with
something around 12K-13K rules enabled (current untuned rule
breakout):
20/5/2014 -- 09:39:42 - <Info> - 14341 signatures processed. 750 are
IP-only rules, 4046 are inspecting packet payload, 10997 inspect
application layer, 85 are decoder event only
The current configuration attempt has about a 50% drop rate; stats.log
is at http://dpaste.com/246MJP9.txt. Changes from the default config
(roughly how these sit in suricata.yaml is sketched after the list):
- max-pending-packets: 3000
- eve-log and http-log disabled
- af-packet.ring-size: 524288
- af-packet.buffer-size: 65536
- detect-engine.profile: high
- defrag.memcap: 8gb
- flow.memcap: 16gb
- flow.prealloc: 1000000
- stream.memcap: 32gb
- stream.prealloc-sessions: 1024000
- reassembly.memcap: 16gb
- stream.depth: 6mb
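For concreteness, roughly how those settings sit in suricata.yaml; the
af-packet interface/cluster-id/cluster-type lines below are illustrative
placeholders to make the fragment self-contained, not a verbatim copy of
my file:

max-pending-packets: 3000

af-packet:
  - interface: bond1            # the capture bond described above
    cluster-id: 99              # placeholder
    cluster-type: cluster_flow  # placeholder (flow-based load balancing)
    defrag: yes                 # placeholder
    ring-size: 524288
    buffer-size: 65536

detect-engine:
  - profile: high

defrag:
  memcap: 8gb

flow:
  memcap: 16gb
  prealloc: 1000000

stream:
  memcap: 32gb
  prealloc-sessions: 1024000
  reassembly:
    memcap: 16gb
    depth: 6mb                  # the "stream.depth" item above; lives under reassembly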
The rules that do fire are probably an artifact of the packet drops
breaking session reassembly (my guess): http://dpaste.com/1B0RZC2.txt

For reference, the suricata binary links against:
linux-vdso.so.1 => (0x00007fff64dff000)
libhtp-0.5.10.so.1 => /usr/local/lib/libhtp-0.5.10.so.1
(0x00007f55ad857000)
libluajit-5.1.so.2 => /usr/local/lib/libluajit-5.1.so.2
(0x00007f55ad5e7000)
libmagic.so.1 => /usr/lib64/libmagic.so.1 (0x00000034f8600000)
libcap-ng.so.0 => /usr/local/lib/libcap-ng.so.0 (0x00007f55ad3e2000)
libpcap.so.1 => /usr/lib64/libpcap.so.1 (0x00000034f4e00000)
libnet.so.1 => /lib64/libnet.so.1 (0x00007f55ad1c8000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00000034f3600000)
libyaml-0.so.2 => /usr/lib64/libyaml-0.so.2 (0x00007f55acfa8000)
libpcre.so.1 => /opt/pcre-8.35/lib/libpcre.so.1 (0x00007f55acd40000)
libc.so.6 => /lib64/libc.so.6 (0x00000034f2e00000)
libz.so.1 => /lib64/libz.so.1 (0x00000034f3a00000)
libm.so.6 => /lib64/libm.so.6 (0x00000034f3e00000)
libdl.so.2 => /lib64/libdl.so.2 (0x00000034f3200000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00000034f4a00000)
/lib64/ld-linux-x86-64.so.2 (0x00000034f2a00000)
Other questions:
- This sensor combines two half-duplex traffic feeds using a bonding
interface as the capture interface (bond1). If offload features are
disabled on each physical slave interface tied to the bond master, does
one also have to disable them on the bond interface itself? Attempting
to disable some features on the bond pseudo-interface fails, so I'm
guessing it's the physical interfaces that really matter (roughly what
I'm doing is sketched below).
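(Roughly what I'm doing, as a sketch; it reads the slave list from the
bonding sysfs node on this kernel:)

for dev in $(cat /sys/class/net/bond1/bonding/slaves); do
    for f in rx tx sg tso ufo gso gro lro; do
        sudo ethtool -K "$dev" "$f" off
    done
done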
- If a monitored network segment carries jumbo frames (say 9K frames),
do the monitor interfaces on the sensor need their MTU raised
accordingly in order to capture the full 9K of payload? Or is the MTU
only relevant when transmitting, or simply irrelevant for a NIDS sensor
that isn't an endpoint station? (If it does matter, I assume the fix
looks something like the sketch below.)
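(If the answer is yes, I assume it amounts to something like the
following on the slaves and the bond; 9000 is only a guess at the right
value for 9K frames:)

sudo ip link set dev eth6 mtu 9000     # repeat for the other slave
sudo ip link set dev bond1 mtu 9000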
- What is the correct practice regarding the irqbalance daemon under
RHEL-type distributions? Some guidance says to disable it entirely. Is
that only relevant when a system is tuned with CPU affinity settings?
And does it apply to 1Gbps installations, or only to 10Gbps ones? (The
manual alternative I have in mind is sketched below.)
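(The manual alternative, sketched; the IRQ number and CPU mask are
placeholders for this box:)

sudo service irqbalance stop
sudo chkconfig irqbalance off
grep eth6 /proc/interrupts                       # find the IRQs for the NIC queues
echo 2 | sudo tee /proc/irq/130/smp_affinity     # 130 = placeholder IRQ; mask 0x2 = CPU1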
- Some guides suggest raising the interface ring parameters and the
network stack limits (sysctl). How important are these settings? For
example:
ethtool -G eth6 rx 4096
sysctl -w net.core.netdev_max_backlog=250000
sysctl -w net.core.rmem_max=16777216
sysctl -w net.core.rmem_default=16777216
sysctl -w net.core.optmem_max=16777216
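(If these do matter, I assume the persistent form is just the
equivalent lines in /etc/sysctl.conf, e.g.:)

net.core.netdev_max_backlog = 250000
net.core.rmem_max = 16777216
net.core.rmem_default = 16777216
net.core.optmem_max = 16777216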
Thanks for any assistance. There's a lot of tuning guidance published
already but I'm having difficulty determining just what is useful in a
given situation.
--
Darren Spruell
phatbuckett at gmail.com