[Oisf-users] Battling segfaults on 3.2.1

Thu Apr 13 11:11:07 UTC 2017

Hi Sean,

Can you also provide the output of suricata --build-info?

barnyard appears to be segfaulting as well, which is curious.

As Duarte mentioned, gdb is going to be the best bet.

More than likely there is a shared library in there somewhere causing
the problem, but gdb will help point in the right direction.

JT

On Thu, 2017-04-13 at 06:52 +0200, Duarte Silva wrote:
> Hi Sean,
>  
> To debug such situations what I do is:
> - install Suricata debug symbols
> - install gdb
> - launch Suricata and attach gdb
> - when the error occurs, I look at the call trace and stack to
> determine where the problem is and, if possible, why.
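
For the attach step above, here is a minimal Python sketch that only
builds the gdb command line. It assumes gdb and the Suricata debug
symbols are already installed, and the PID 1234 is a placeholder. The
-p, -batch and -ex options are standard gdb options; the idea is to let
the process keep running under gdb and print every thread's backtrace
once it stops on the fault.

```python
import shlex


def gdb_attach_command(pid):
    """Build an argv list that attaches gdb to a running process.

    gdb runs the -ex commands in order: resume the process, then, once it
    stops again (for example on SIGSEGV), print a full backtrace of every
    thread. -batch makes gdb exit after the last command.
    """
    return [
        "gdb", "-p", str(pid), "-batch",
        "-ex", "set pagination off",
        "-ex", "continue",
        "-ex", "thread apply all bt full",
    ]


if __name__ == "__main__":
    # 1234 is a placeholder; use the real PID, e.g. from `pidof suricata`.
    print(" ".join(shlex.quote(arg) for arg in gdb_attach_command(1234)))
```

Redirect the output to a file so the trace survives. Note that gdb may
also stop on signals other than SIGSEGV, so the first stop is not
guaranteed to be the crash.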
>  
> To help reproduce the error, I would have tcpdump creating a network
> packet dump so that I could replay traffic to Suricata.
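
A rough Python sketch of the capture-and-replay idea, again only
building command lines. The interface name ens1f1 is taken from the
worker thread names in the original report; the file paths, ring size
and file count are made-up placeholders. -C/-W (rotating files) and -w
are standard tcpdump options, and -r is Suricata's pcap file mode.

```python
def tcpdump_ring_command(interface, prefix, file_mb=1000, file_count=20):
    """tcpdump argv for a rotating capture: file_count files of about
    file_mb million bytes each, overwritten in turn (-C with -W)."""
    return [
        "tcpdump", "-i", interface, "-n", "-s", "0",
        "-C", str(file_mb), "-W", str(file_count), "-w", prefix,
    ]


def ring_seconds(file_mb, file_count, gbit_per_s):
    """Roughly how many seconds of traffic the ring holds at a given rate
    (ignores pcap per-packet overhead)."""
    total_bits = file_mb * 1_000_000 * 8 * file_count
    return total_bits / (gbit_per_s * 1_000_000_000)


def suricata_replay_command(pcap_path, config, log_dir):
    """Suricata argv that reads a pcap file (-r) instead of a live NIC."""
    return ["suricata", "-c", config, "-r", pcap_path, "-l", log_dir]


if __name__ == "__main__":
    # All paths below are hypothetical examples.
    print(" ".join(tcpdump_ring_command("ens1f1", "/data/ring.pcap")))
    print("ring covers about %.0f s at 1.1 Gb/s" % ring_seconds(1000, 20, 1.1))
    print(" ".join(suricata_replay_command(
        "/data/ring.pcap0", "/etc/suricata/suricata.yaml", "/tmp/replay")))
```

At the 1.1 Gb/s peak mentioned in the report, a 20 x 1000 MB ring only
holds a couple of minutes of traffic, so the ring has to be sized (and
stopped soon after a crash) with that in mind. tcpdump itself may drop
packets at that rate.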
>  
> Cheers,
> Duarte
>  
> From: Cloherty, Sean E
> Sent: 12 April 2017 18:18
> To: oisf-users at lists.openinfosecfoundation.org
> Subject: [Oisf-users] Battling segfaults on 3.2.1
>  
> I am running 3.2.1 on 4 identical servers.  Two of them started
> having segfaults and traps. 
>  
> Troubleshooting - Compared the yamls and found an extra 0 (making the
> tracker 10x larger) in the SMTP mime section for inspected-tracker
> for file data keyword.  Also, one system had 2gb vs. 4gb for the http
> memcap in the app layer protocol config.  I changed the yamls to
> match the less problematic server.  I also took the opportunity to
> recompile Suricata with Hyperscan (Thank you Derek Spransy and Justin
> Viiret!).
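
Comparing the yamls across all four servers can be scripted so that a
stray extra 0 stands out. A minimal Python sketch using only the
standard library follows; it compares the text line by line rather
than parsing YAML, and the two snippets in the example are
illustrative, not taken from the attached configs.

```python
import difflib


def config_diff(text_a, text_b, name_a="a.yaml", name_b="b.yaml"):
    """Return unified-diff lines for two config texts, ignoring blank
    lines and whole-line '#' comments so only real settings show up."""
    def significant(text):
        return [line.rstrip() for line in text.splitlines()
                if line.strip() and not line.lstrip().startswith("#")]

    return list(difflib.unified_diff(
        significant(text_a), significant(text_b),
        name_a, name_b, lineterm="", n=0))


if __name__ == "__main__":
    # Illustrative fragments only; real files would be read from disk.
    server1 = "http:\n  # app layer limit\n  memcap: 2gb\n"
    server2 = "http:\n  memcap: 4gb\n"
    for line in config_diff(server1, server2, "server1.yaml", "server2.yaml"):
        print(line)
```

Because this is a plain text comparison, reordered but otherwise
identical sections will still show up as differences.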
>  
> On one box I’ve had no segfaults since April 7th (following the
> changes). The other one continues to have the problem 2-3 times a day
> at random hours: mid-morning, early evening, sometimes after
> midnight. The system log contains only the fault message itself and
> nothing else. The fault always points to a worker thread, and the
> thread number varies: W#01-ensf1 or W#15-ens1f1, etc. Segfaults
> produce two types of errors:
>  
> error 4 in suricata[400000+242000] or
> error 5 in suricata[400000+242000]
>  
> Trap messages seem to have stopped on April 7th (following the
> changes), but they also carried error messages with the same info in
> the brackets:
>  
> error:0 in suricata[400000+242000]
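
For what it's worth, the number after "error" on a kernel segfault
line is the x86 page-fault error code (printed in hex), and the
bracket is the mapping's start address plus its size. A small Python
sketch that decodes both is below. The bit meanings are the standard
x86 ones; the function names are mine. This applies to the segfault
lines only: the error:0 on the trap lines is not a page-fault code.

```python
import re


def decode_page_fault(code):
    """Decode the low bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if code & 0x1 else "page not present",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
        "instruction_fetch": bool(code & 0x10),
    }


# Matches the trailing "in name[start+size]" part; both numbers are hex.
MAPPING_RE = re.compile(r"in ([^\s\[]+)\[([0-9a-f]+)\+([0-9a-f]+)\]")


def parse_mapping(text):
    """Return (name, start, size) from a fault line, or None if absent."""
    match = MAPPING_RE.search(text)
    if match is None:
        return None
    return match.group(1), int(match.group(2), 16), int(match.group(3), 16)


if __name__ == "__main__":
    for line in ("error 4 in suricata[400000+242000]",
                 "error 5 in suricata[400000+242000]"):
        code = int(line.split()[1], 16)
        print(line, "->", decode_page_fault(code), parse_mapping(line))
```

So error 4 is a user-mode read of an unmapped page and error 5 is a
user-mode read that hit a protection violation. The bracket says the
faulting instruction is inside the suricata binary itself (mapped at
0x400000, size 0x242000), which is where the ip value from the full
log line, together with debug symbols, becomes useful.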
>  
>  
> I’ve attached a zip file of the startup script, suricata.yaml,
> suricata.log, stats.log, a copy of the faults listed in
> /var/log/messages, and a text file with the time and date of the
> crashes.  The server details follow:
>  
> GENERAL SERVER INFO :
>  
> - CentOS Linux release 7.3.1611 (Core) 3.10.0-514.10.2.el7.x86_64 #1
> SMP Fri Mar 3 00:04:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> - Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz - 16 cores / 32 threads
> - 128GB of RAM
> - Capture NIC is a dual port Intel Corporation 82599ES 10-Gigabit
> SFI/SFP+ Network Connection (rev 01)
> - NIC Driver is Intel(R) 10GbE PCI Express Linux Network Driver -
> version 4.6.4
> - Max traffic seen on the interface in the last 4 months has been 1.2
> Gb/s, but usually mid-day peaks are around 1.1 Gb/s
>  
>  
> Any suggestions of what to check next?
>  
> Sean
>  
>  