[Oisf-users] Suffering Simultaneous Suricata Segfaults

Cloherty, Sean E scloherty at mitre.org
Thu Sep 27 13:51:23 UTC 2018


Hello Victor - 

I am not sure if the actual fault messages came across in my previous email. Below is what I've got from syslog (apologies if the tabs and spaces mess up the faux table). No core dump, so I've gone back and reverted the two test servers to the settings that they had when they faulted, enabled. Now I need to puzzle through enabling core dumps on CentOS 7.

TIME             HOST                SURICATA  SEGFAULT
9/25/2018 18:26  production host #1  4.04      kernel: W#14-ens1f1[29348]: segfault at 0 ip 0000000000597207 sp 00007f918b7fbef0 error 4 in suricata[400000+256000]
9/25/2018 18:26  test-host #1        4.1rc1    kernel: W#03-ens1f1[24471]: segfault at 0 ip 00000000005b7787 sp 00007f6650b27cb0 error 4 in suricata[400000+28c000]
9/25/2018 18:26  production host #3  4.04      kernel: W#06-ens1f1[24268]: segfault at 0 ip 0000000000597207 sp 00007f3a077fbef0 error 4 in suricata[400000+256000]
9/25/2018 18:26  test-host #2        4.05      kernel: W#01-ens1f1[4720]: segfault at 0 ip 000000000059b557 sp 00007efc6e69cde0 error 4 in suricata[400000+265000]
9/25/2018 18:27  test-host #2        4.05      kernel: W#07-ens1f1[4406]: segfault at 0 ip 000000000059b557 sp 00007fc4c2504de0 error 4 in suricata[400000+265000]

-----Original Message-----
From: Oisf-users <oisf-users-bounces at lists.openinfosecfoundation.org> On Behalf Of Victor Julien
Sent: Thursday, September 27, 2018 1:23 AM
To: oisf-users at lists.openinfosecfoundation.org
Subject: Re: [Oisf-users] Suffering Simultaneous Suricata Segfaults

On 26-09-18 18:55, Cloherty, Sean E wrote:
> I was troubleshooting instances of Suricata being down on multiple 
> hosts and I found that 2 production hosts running 4.04 and 2 test 
> hosts running 4.05 and 4.1rc1 faulted at roughly the same time.  
> Strangely,  2 additional production hosts running 4.04 on duplicate 
> hardware have not had any issues to date.  Below is the outline of 
> what I’ve been able to put together this morning.
> 
>  

Did any of the instances dump a core file you can inspect?
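
Whether a core file gets written depends in part on the core-size limit of the running process. A minimal sketch for checking it, assuming a Linux host where /proc/<pid>/limits has the usual "Max core file size" row; the helper name parse_core_limit is made up for illustration and is not part of Suricata:

```python
import sys
from pathlib import Path


def parse_core_limit(limits_text):
    """Return the (soft, hard) core-size limits from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith("Max core file size"):
            fields = line.split()
            # fields: Max, core, file, size, <soft>, <hard>, <units>
            return fields[4], fields[5]
    raise ValueError("no 'Max core file size' row found")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the PID of the running suricata process as the only argument.
    pid = sys.argv[1]
    soft, hard = parse_core_limit(Path(f"/proc/{pid}/limits").read_text())
    print(f"core size limit: soft={soft} hard={hard}")
    if soft == "0":
        print("soft limit is 0; core files are likely suppressed")
```

A soft limit of 0 is a common reason no core file appears when kernel.core_pattern names a plain file. If core_pattern starts with a pipe symbol, the core goes to a handler program instead and has to be looked for there.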

Another way to get more info based on the lines you posted is described
here:
https://stackoverflow.com/questions/2549214/interpreting-segfault-messages
Could you try to see if you can get more info about where in the code the crash happens?
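
A rough sketch of the decoding that link describes, assuming the standard x86 kernel segfault line layout seen in the syslog lines in this thread. The function name decode_segfault and the binary path /usr/bin/suricata are placeholders for illustration, not anything Suricata ships:

```python
import re

# Layout of an x86 kernel segfault line, as in the syslog lines quoted in this thread.
SEGFAULT_RE = re.compile(
    r"(?P<thread>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<error>[0-9a-f]+) "
    r"in (?P<binary>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Split a kernel segfault line into its fields and decode the error bits."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    error = int(m["error"], 16)  # the kernel prints the error code in hex
    return {
        "thread": m["thread"],
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),
        "ip": ip,
        "offset": ip - base,  # distance of the faulting instruction from the mapping base
        "binary": m["binary"],
        "cause": "protection violation" if error & 1 else "page not present",
        "access": "write" if error & 2 else "read",
        "mode": "user" if error & 4 else "kernel",
    }


if __name__ == "__main__":
    line = ("kernel: W#14-ens1f1[29348]: segfault at 0 ip 0000000000597207 "
            "sp 00007f918b7fbef0 error 4 in suricata[400000+256000]")
    info = decode_segfault(line)
    print(f"{info['mode']}-mode {info['access']} of address "
          f"{info['fault_addr']:#x} ({info['cause']})")
    print(f"ip {info['ip']:#x}, offset {info['offset']:#x} into {info['binary']}")
    # Assumed path; point -e at the exact binary that crashed.
    print(f"addr2line -f -e /usr/bin/suricata {info['ip']:#x}")
```

"segfault at 0" with "error 4" decodes to a user-mode read of an unmapped page at address 0, i.e. a NULL pointer dereference. A mapping base of 400000 is the traditional load address of a non-position-independent x86-64 executable, so the ip value can probably be given to addr2line as-is; for a position-independent binary or a shared library the offset would be the value to use. Either way the lookup only yields function names and line numbers if the binary has debug symbols and is the same build that crashed.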


> 
> What is the same across all platforms faulting or not:
> 
>  
> 
> All use tpacket v3 & AF-PACKET
> 
> All use workers mode
> 
> All are in IDS mode
> 
> All ingest traffic from Gigamon taps
> 
> All are running CentOS 7.5 64bit
> 
> All use Intel(R) 10GbE PCI Express Linux Network Driver 5.3.7
> 
> All use Intel Corporation 82599ES 10-Gigabit SFI/SFP+
> 
> What is different:
> 
>  
> 
> NO FAULT:          #zero-copy-size: 128
> 
> FAULT:                  zero-copy-size: 128

This option is no longer used by any of the versions you are using.


> NO FAULT:
> 
>         prio:
> 
> #          low: [ 0 ]
> 
> #          medium: [ "1-2" ]
> 
> #          high: [ 3 ]
> 
>           default: "high"
> 
>  
> 
> FAULT:
> 
>         prio:
> 
>           low: [ 0 ]
> 
>           medium: [ "1-2" ]
> 
>           high: [ 3 ]
> 
>           default: "high"
Would be weird if this did anything.

--
---------------------------------------------
Victor Julien
http://www.inliniac.net/
PGP: http://www.inliniac.net/victorjulien.asc
---------------------------------------------

_______________________________________________
Suricata IDS Users mailing list: oisf-users at openinfosecfoundation.org
Site: http://suricata-ids.org | Support: http://suricata-ids.org/support/
List: https://lists.openinfosecfoundation.org/mailman/listinfo/oisf-users

Conference: https://suricon.net
Trainings: https://suricata-ids.org/training/

