[Oisf-users] [EXT] Re: Packet loss and increased resource consumption after upgrade to 4.1.2 with Rust support

Tue Feb 26 17:00:36 UTC 2019

Thanks - that is a handy way to find top talkers.  In the past I've also used Darkstat, which is a bit old but works: https://unix4lyfe.org/darkstat/

In my case, though, the drops don't usually happen during peak traffic times, and often there are much higher peaks within a day or two when no packets are dropped.  The Zabbix graphs I use rarely show a sloped line; instead they show a flat line at zero for days before and after a drop event.  The chart below is a perfect example.  Looking at the graph on the far right, at some point on Feb. 6 packet drops went from zero to 15 million within a 10-minute span, and then there were no further drops.  The chart on the far left shows the normal rhythm of daily traffic, with no abnormal spike and one peak over 2Gbps.  The following week shows a significant increase in traffic, including a spike over 3Gbps, but no additional drops during that time.  That is why I don't think volume is causing this.  If it were, I would increase the rx buffers on the NIC from the 512 I have them set to right now.

[cid:image001.png at 01D4CDBD.B87F23C0]

I did check into the monitoring potential, but it seems that nobody is aware of jumbo packets in use on the monitored LAN segments or in testing tools.  If that were happening, would it show up in the stats under decoder.max_pkt_size?  Maybe I need to add that stat and monitor it.  To date the largest packet size has been 6520.

-----Original Message-----
From: Nelson, Cooper <cnelson at ucsd.edu>
Sent: Friday, February 22, 2019 2:02 PM
To: Peter Manev <petermanev at gmail.com>
Cc: Cloherty, Sean E <scloherty at mitre.org>; Open Information Security Foundation <oisf-users at lists.openinfosecfoundation.org>
Subject: RE: [Oisf-users] [EXT] Re: Packet loss and increased resource consumption after upgrade to 4.1.2 with Rust support

What I ended up doing was just putting some windows on my desktop showing htop (all the threads would get pegged) and the packet drops from stats.log , since this was happening several times a day for extended periods.

I then wrote a script to just grab a batch of packets off the wire (100,000 as written below) and show the top talkers.  Found it pretty easily.

$ cat bin/top_flows.sh

#!/bin/bash

sudo tcpdump -tnn -c 100000 -i any 2>/dev/null | awk '{print $2,$3,$4,$5}' | sort | uniq -c | sort -nr | head

You could automate something like this to watch for packet drops in stats.log and then run the above script (I would recommend running it several times).
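
A rough sketch of that automation, in case it is useful. It is not something anyone in this thread has run. It assumes stats.log reports drops on lines of the form "capture.kernel_drops | Total | <value>" and that the value is a cumulative counter. The function names latest_drops and watch_drops are made up for illustration, and the log path, polling interval and output file are whatever you pass in.

```sh
#!/bin/sh
# Sketch only: watch Suricata's stats.log and capture top talkers when the
# kernel drop counter increases. Assumes lines shaped like
#   capture.kernel_drops | Total | 15000000
# latest_drops and watch_drops are hypothetical helper names.

# Print the most recent capture.kernel_drops value found in file $1,
# or 0 if the counter has not been logged yet.
latest_drops() {
    awk -F'|' '
        $1 ~ /capture\.kernel_drops/ { gsub(/[ \t\r]/, "", $3); v = $3 }
        END { print (v == "" ? 0 : v) }
    ' "$1"
}

# Poll log $1 every $2 seconds. When the counter grows, run the top-talkers
# script $3 three times and append the timestamped output to file $4.
watch_drops() {
    prev=$(latest_drops "$1")
    while sleep "$2"; do
        cur=$(latest_drops "$1")
        if [ "$cur" -gt "$prev" ]; then
            for i in 1 2 3; do
                { date; "$3"; } >> "$4" 2>&1
            done
        fi
        # Also handles a counter reset after a Suricata restart.
        prev=$cur
    done
}
```

After sourcing the file, something like "watch_drops /var/log/suricata/stats.log 10 bin/top_flows.sh /tmp/top_flows.out" would start the loop (paths here are assumptions). Since top_flows.sh calls sudo, it needs to run from an account that can sudo without a password prompt.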

I think I've suggested in the past if suri could dump the ring buffer to a file when the 'emergency flush' condition is triggered you could just run the above script on the resulting pcap to find what caused it.

-Coop

-----Original Message-----
From: Peter Manev <petermanev at gmail.com>
Sent: Friday, February 22, 2019 9:43 AM
To: Nelson, Cooper <cnelson at ucsd.edu>
Cc: Cloherty, Sean E <scloherty at mitre.org>; Open Information Security Foundation <oisf-users at lists.openinfosecfoundation.org>
Subject: Re: [Oisf-users] [EXT] Re: Packet loss and increased resource consumption after upgrade to 4.1.2 with Rust support

Could be related indeed.

@Sean Could you try the following and give me some feedback please.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 276566 bytes
Desc: image001.png
URL: <http://lists.openinfosecfoundation.org/pipermail/oisf-users/attachments/20190226/1a4c0572/attachment-0001.png>