[Oisf-users] Drops: From none to gigantic in the blink of an eye

Cloherty, Sean E scloherty at mitre.org
Wed Mar 23 16:12:18 UTC 2016


Our Suricata installation went from normal to completely haywire overnight Tuesday.  It was cruising along with very low packet loss (0.002%) when suddenly, between 2:24 and 2:29 AM, the drop count began to grow extremely rapidly.



So far I've checked the following:

- NIC stats show very few errors or drops (at bottom of email).
- There were no changes to the server Tuesday AM to account for this.
- Network traffic just before and after showed no major change in volume.
- Nothing out of the ordinary is visible in the messages file or the Suricata logs.
- Since that time RAM usage and CPU utilization have been much higher (no surprise).



The most pertinent data is below or attached. Any input at all would be helpful, to say the least.









Host info -

CentOS 7.2.1511 64bit, 128G RAM, 32 cores Xeon E5-2640 v3 @ 2.60GHz, hundreds of gigs of free space



Suricata - 3.0 -

/usr/bin/suricata -c /etc/suricata/suricata.yaml --user=suri --group=suri -v --af-packet=ens1f1 --runmode=workers -D





top - 11:24:38 up 28 days, 21:16,  1 user,  load average: 7.63, 9.83, 9.49

Tasks: 396 total,   1 running, 395 sleeping,   0 stopped,   0 zombie

%Cpu(s): 19.3 us,  0.1 sy,  0.0 ni, 80.1 id,  0.0 wa,  0.0 hi,  0.5 si,  0.0 st

KiB Mem : 13175409+total, 11530432+free, 14969692 used,  1480088 buff/cache

KiB Swap:  4194300 total,  4194300 free,        0 used. 11620499+avail Mem



  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND

25685 suri      20   0 14.600g 0.013t  36420 S 631.8 10.6  21749:31 Suricata-Main





Entries from stats.log just before and after the start of the problem. The first is the last "normal" entry.  The attached spreadsheet has more detail: it shows the percentage change in each counter, which varies from 90% to 20,000% over a short period of time.



PRIOR TO JUMP:

capture.kernel_drops                              2,042

flow.memcap                                                   0

tcp.no_flow                                                      0



5 mins AFTER JUMP:

capture.kernel_drops                          420,209

flow.memcap                                            22,445

tcp.no_flow                                               19,584



10 mins AFTER JUMP:

capture.kernel_drops                    16,651,570

flow.memcap                                          308,876

tcp.no_flow                                             273,039



CURRENT:

capture.kernel_drops            8,610,389,158

flow.memcap                              240,736,337

tcp.no_flow                                 182,273,322
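
The percentage changes between snapshots can be recomputed with a few lines of Python. This is only a minimal sketch: the counter values are copied from the first three snapshots above, while the snapshot labels and the helper names (`pct_change`, `deltas`) are made up for illustration and are not part of Suricata or of the attached spreadsheet. A change from a starting value of 0 is reported as undefined rather than as a percentage.

```python
# Counter values copied from the first three stats.log snapshots above.
# The labels are illustrative only.
snapshots = {
    "prior": {"capture.kernel_drops": 2042, "flow.memcap": 0, "tcp.no_flow": 0},
    "5 min after": {"capture.kernel_drops": 420209, "flow.memcap": 22445, "tcp.no_flow": 19584},
    "10 min after": {"capture.kernel_drops": 16651570, "flow.memcap": 308876, "tcp.no_flow": 273039},
}


def pct_change(old, new):
    """Percentage change from old to new; None when old is 0 (undefined)."""
    if old == 0:
        return None
    return (new - old) / old * 100.0


def deltas(snaps):
    """Return {(from_label, to_label): {counter: pct_change}} for consecutive snapshots."""
    labels = list(snaps)
    out = {}
    for a, b in zip(labels, labels[1:]):
        out[(a, b)] = {k: pct_change(snaps[a][k], snaps[b][k]) for k in snaps[a]}
    return out


if __name__ == "__main__":
    for (a, b), row in deltas(snapshots).items():
        for counter, pct in row.items():
            shown = "n/a (from 0)" if pct is None else f"{pct:,.0f}%"
            print(f"{a} -> {b}: {counter:<22} {shown}")
```

The jump in capture.kernel_drops from the "prior" entry to the one five minutes later comes out in the 20,000% range, which is the upper end of the deltas quoted above.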





CURRENT (full detail):



-------------------------------------------------------------------

Date: 3/23/2016 -- 08:47:55 (uptime: 5d, 17h 02m 59s)

-------------------------------------------------------------------

Counter                   | TM Name                   | Value

-------------------------------------------------------------------

capture.kernel_packets    | Total                     | 36169877508

capture.kernel_drops      | Total                     | 8610389158

decoder.pkts              | Total                     | 27103400573

decoder.bytes             | Total                     | 17260791141325

decoder.invalid           | Total                     | 201

decoder.ipv4              | Total                     | 27113873640

decoder.ipv6              | Total                     | 450013

decoder.ethernet          | Total                     | 27103400573

decoder.tcp               | Total                     | 17460031447

decoder.udp               | Total                     | 9390824301

decoder.sctp              | Total                     | 572

decoder.icmpv4            | Total                     | 14544707

decoder.icmpv6            | Total                     | 54284

decoder.gre               | Total                     | 58888

decoder.teredo            | Total                     | 278019

decoder.avg_pkt_size      | Total                     | 636

decoder.max_pkt_size      | Total                     | 1514

flow.memcap               | Total                     | 240736337

defrag.ipv4.fragments     | Total                     | 23421551

defrag.ipv4.reassembled   | Total                     | 11686108

defrag.ipv6.fragments     | Total                     | 686

tcp.sessions              | Total                     | 91335846

tcp.pseudo                | Total                     | 15993656

tcp.invalid_checksum      | Total                     | 5377

tcp.no_flow               | Total                     | 182273322

tcp.syn                   | Total                     | 156088653

tcp.synack                | Total                     | 60140655

tcp.rst                   | Total                     | 93730134

tcp.stream_depth_reached  | Total                     | 138960

tcp.reassembly_gap        | Total                     | 5070871

detect.alert              | Total                     | 391

flow_mgr.closed_pruned    | Total                     | 64297464

flow_mgr.new_pruned       | Total                     | 11662246

flow_mgr.est_pruned       | Total                     | 36603185

flow.spare                | Total                     | 20626

flow.tcp_reuse            | Total                     | 6278669

tcp.memuse                | Total                     | 280650800

tcp.reassembly_memuse     | Total                     | 1539438177

dns.memuse                | Total                     | 25028081

dns.memcap_state          | Total                     | 53614

dns.memcap_global         | Total                     | 56782651

http.memuse               | Total                     | 1175360548

flow.memuse               | Total                     | 536870752
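
To turn the raw counters above into a drop rate, a small parser for the stats.log table is enough. The following is a hedged sketch rather than an official tool: the regular expression assumes the three-column `Counter | TM Name | Value` layout shown above, only rows with TM Name `Total` are kept, and the function names (`parse_stats`, `drop_percent`) are hypothetical.

```python
import re

# Matches stats.log rows such as:
#   capture.kernel_drops      | Total                     | 8610389158
ROW = re.compile(r"^\s*([\w.]+)\s*\|\s*([^|]+?)\s*\|\s*(\d+)\s*$")


def parse_stats(text):
    """Return {counter_name: value} for rows whose TM Name is 'Total'."""
    counters = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m and m.group(2) == "Total":
            counters[m.group(1)] = int(m.group(3))
    return counters


def drop_percent(counters):
    """Kernel drops as a percentage of packets seen by the kernel."""
    packets = counters["capture.kernel_packets"]
    drops = counters["capture.kernel_drops"]
    return 100.0 * drops / packets if packets else 0.0


# A few rows copied from the table above.
SAMPLE = """\
Counter                   | TM Name                   | Value
capture.kernel_packets    | Total                     | 36169877508
capture.kernel_drops      | Total                     | 8610389158
decoder.pkts              | Total                     | 27103400573
flow.memuse               | Total                     | 536870752
"""

if __name__ == "__main__":
    c = parse_stats(SAMPLE)
    print(f"kernel drop rate: {drop_percent(c):.2f}%")
    print(f"flow.memuse: {c['flow.memuse'] / 2**20:.1f} MiB")
```

Because these counters are cumulative since Suricata started, the resulting percentage averages the quiet period before the jump together with the period after it; the rate since the jump alone would be higher.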







NIC STATS  -

Ethtool stats from the NIC

    rx_errors: 141

     tx_errors: 0

     rx_dropped: 0

     tx_dropped: 0

     rx_over_errors: 0

     rx_crc_errors: 0

     rx_frame_errors: 0

     rx_fifo_errors: 0

     rx_missed_errors: 0

     tx_aborted_errors: 0

     tx_carrier_errors: 0

     tx_fifo_errors: 0

     tx_heartbeat_errors: 0

     rx_long_length_errors: 0

     rx_short_length_errors: 0

     rx_csum_offload_errors: 249

     rx_fcoe_dropped: 0
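
With this many counters it is easy to overlook a non-zero one, so a short filter can help. The sketch below assumes the `name: value` line format printed by `ethtool -S` (as in the excerpt above) and works on pasted text rather than calling ethtool itself; the function name `nonzero_counters` is made up for illustration.

```python
def nonzero_counters(ethtool_text):
    """Parse 'name: value' lines as printed by `ethtool -S <iface>`; keep non-zero ones."""
    result = {}
    for line in ethtool_text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Header lines such as "NIC statistics:" have no numeric value and are skipped.
        if sep and value.isdigit() and int(value) != 0:
            result[name.strip()] = int(value)
    return result


# A few lines copied from the excerpt above.
SAMPLE = """\
NIC statistics:
     rx_errors: 141
     tx_errors: 0
     rx_dropped: 0
     rx_missed_errors: 0
     rx_csum_offload_errors: 249
"""

if __name__ == "__main__":
    for name, value in nonzero_counters(SAMPLE).items():
        print(f"{name}: {value}")
```

Run against the excerpt above, only rx_errors and rx_csum_offload_errors remain, which is consistent with the NIC itself not being the source of the drops.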




-------------- next part --------------
A non-text attachment was scrubbed...
Name: DROP SPIKE.xlsx
Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Size: 15726 bytes
Desc: DROP SPIKE.xlsx
URL: <http://lists.openinfosecfoundation.org/pipermail/oisf-users/attachments/20160323/4775b5fc/attachment-0001.xlsx>

