[Oisf-users] Suricata load/latency spikes
robert.jamison at bt.com
Mon Jun 29 20:31:39 UTC 2015
If I'm understanding correctly, you're running 'live' traffic through a Suricata that was compiled without --enable-debug. This means that the low-level output from the dns.c function calls would only be available in 'replay' mode, which hasn't exhibited the same latency.
Something contradicts DNS being the issue: your pings are taking longer than 100ms even without name resolution (e.g. ping 184.108.40.206), yet in the stats dns.memcap is the only metric that moves between the before and after. If dns.memcap is holding onto stub records and waiting for responses (which might cause the 3X increase in alloc), could the thread be blocking re-entry into the kernel libnet primitives, causing delays to propagate to every process queuing to use them? I'm really grasping here...
From: oisf-users-bounces at lists.openinfosecfoundation.org [mailto:oisf-users-bounces at lists.openinfosecfoundation.org] On Behalf Of Oliver Humpage
Sent: Monday, June 29, 2015 3:54 PM
To: oisf-users at lists.openinfosecfoundation.org
Subject: Re: [Oisf-users] Suricata load/latency spikes
On 29 Jun 2015, at 18:29, <robert.jamison at bt.com> <robert.jamison at bt.com> wrote:
> My suspicion is, (and this is really just instinct) that there is a latency caused by something in the DNS side of things,
Would a good way to test that be to disable the DNS rules and decoder in suricata.yaml and see what happens? I'm assuming suricata doesn't do DNS lookups itself (just GeoIP possibly).
Thinking about it, this router is on a new connection, and all DNS traffic should still be going to our old DNS servers on the old connection, so there should be roughly no DNS traffic at all. I'll check the pcap for port 53 tomorrow.
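The port 53 check mentioned above could be sketched roughly as follows. This is only an illustration, assuming a classic (non-pcapng) capture file with Ethernet framing and IPv4 traffic; the function names count_port53 and is_port53 are hypothetical, and VLAN tags, IPv6 and IP fragments are not handled.

```python
import struct
from typing import Tuple


def is_port53(frame: bytes) -> bool:
    """Return True if an Ethernet/IPv4 frame carries TCP or UDP port 53."""
    # Ethernet header is 14 bytes; EtherType 0x0800 means IPv4.
    if len(frame) < 14 + 20 or frame[12:14] != b"\x08\x00":
        return False
    ihl = (frame[14] & 0x0F) * 4      # IPv4 header length in bytes
    proto = frame[14 + 9]             # 6 = TCP, 17 = UDP
    if proto not in (6, 17):
        return False
    l4 = 14 + ihl
    if len(frame) < l4 + 4:
        return False
    sport, dport = struct.unpack_from("!HH", frame, l4)
    return 53 in (sport, dport)


def count_port53(pcap_bytes: bytes) -> Tuple[int, int]:
    """Return (total_packets, port53_packets) for a classic pcap file."""
    magic = pcap_bytes[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"                  # little-endian file
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"                  # big-endian file
    else:
        raise ValueError("not a classic pcap file")
    (linktype,) = struct.unpack_from(endian + "I", pcap_bytes, 20)
    if linktype != 1:
        raise ValueError("only Ethernet (linktype 1) is handled here")

    total = dns = 0
    offset = 24                       # skip the 24-byte global header
    while offset + 16 <= len(pcap_bytes):
        # Record header: ts_sec, ts_usec, incl_len, orig_len (4 bytes each).
        (incl_len,) = struct.unpack_from(endian + "I", pcap_bytes, offset + 8)
        frame = pcap_bytes[offset + 16: offset + 16 + incl_len]
        offset += 16 + incl_len
        total += 1
        if is_port53(frame):
            dns += 1
    return total, dns
```

Reading the file with open(path, "rb").read() and passing the bytes in gives the two counts; a result near zero for port 53 would support the expectation that almost no DNS traffic crosses this link.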
Also, on either side of this IPS are instances of pf that are scrubbing/cleaning/reassembling the traffic, so suricata should be getting a pretty clean feed. I can't see any evidence of SYN floods, but I might switch on pf's synproxy to make sure.
In my investigations today I've found that although the worst/longest-lived problems only happen twice a day, short bursts of high latency are happening many times an hour. I've got a script running that says if 3 pings in a row to the neighbouring router take more than 100ms each, take a pcap of the next few thousand packets, and it's triggering on average every ~10 minutes. I suppose that might miss the actual traffic that causes the problem, so maybe I'll take constant pcap snapshots and tag the ones that occur as a spike happens...
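The trigger described above (three slow pings in a row, then capture) might look something like this minimal sketch. It is not the actual script: the names spike_triggers and capture_command, the interface name, and the output path are hypothetical, and treating a lost ping as slow is an added assumption.

```python
from typing import Iterable, Iterator, List, Optional


def spike_triggers(rtts_ms: Iterable[Optional[float]],
                   threshold_ms: float = 100.0,
                   run_length: int = 3) -> Iterator[int]:
    """Yield the index of each ping that completes a run of slow pings.

    A ping counts as slow when its round-trip time exceeds threshold_ms.
    Assumption: a lost ping (None) is also counted as slow.
    """
    slow_in_a_row = 0
    for i, rtt in enumerate(rtts_ms):
        if rtt is None or rtt > threshold_ms:
            slow_in_a_row += 1
        else:
            slow_in_a_row = 0
        if slow_in_a_row == run_length:
            yield i
            slow_in_a_row = 0  # start counting afresh after each trigger


def capture_command(interface: str, out_path: str,
                    packet_count: int = 5000) -> List[str]:
    """Build a tcpdump command line that saves the next packet_count packets."""
    return ["tcpdump", "-i", interface, "-c", str(packet_count),
            "-w", out_path]
```

For each index yielded, the command list could be handed to subprocess.run to start the capture. Because the capture only begins after the third slow ping, it records what follows the spike rather than what caused it, which is the limitation noted above.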
Suricata IDS Users mailing list: oisf-users at openinfosecfoundation.org
Site: http://suricata-ids.org | Support: http://suricata-ids.org/support/
Suricata User Conference November 4 & 5 in Barcelona: http://oisfevents.net