[Oisf-users] Suricata 4.0.3 with Napatech problems

Peter Manev petermanev at gmail.com
Sun Jan 21 20:49:22 UTC 2018



> On 18 Jan 2018, at 19:21, Steve Castellarin <steve.castellarin at gmail.com> wrote:
> 
> And also, the bandwidth utilization was just over 800Mbps.
> 

Can you try the same run again, but this time with no rules loaded? I would like to see whether that makes a difference over the same amount of time.
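
To compare the two runs, something along these lines could pull out the time at which each host buffer first starts dropping. It is only a rough sketch: it assumes the usual stats.log layout (a "Date: ... -- HH:MM:SS" header per dump followed by "counter | thread | value" rows), the function name first_drop_times is made up, and the sample values are invented for illustration, not taken from the attached logs.

```python
import re
from typing import Dict, Iterable, Optional

# Assumed stats.log layout: one "Date:" header per dump, then pipe-separated rows.
DATE_RE = re.compile(r"^Date:\s*(\S+)\s*--\s*(\d{2}:\d{2}:\d{2})")
ROW_RE = re.compile(r"^(nt\d+\.drop)\s*\|\s*[^|]*\|\s*(\d+)\s*$")


def first_drop_times(lines: Iterable[str]) -> Dict[str, str]:
    """Map each ntN.drop counter to the time of the first dump where it rose.

    A counter that is absent from earlier dumps is treated as 0, so a counter
    that is already non-zero in the first dump is reported at that dump's time.
    """
    current_time: Optional[str] = None
    last_value: Dict[str, int] = {}
    first_rise: Dict[str, str] = {}
    for line in lines:
        date_match = DATE_RE.match(line)
        if date_match:
            current_time = date_match.group(2)
            continue
        row_match = ROW_RE.match(line)
        if row_match is None or current_time is None:
            continue
        name, value = row_match.group(1), int(row_match.group(2))
        if value > last_value.get(name, 0) and name not in first_rise:
            first_rise[name] = current_time
        last_value[name] = value
    return first_rise


# Invented sample data, for illustration only.
SAMPLE = """\
Date: 1/18/2018 -- 10:30:51 (uptime: 0d, 00h 13m 42s)
nt1.drop                                   | Total                     | 0
Date: 1/18/2018 -- 10:31:11 (uptime: 0d, 00h 14m 02s)
nt1.drop                                   | Total                     | 5120
nt6.drop                                   | Total                     | 0
Date: 1/18/2018 -- 10:31:51 (uptime: 0d, 00h 14m 42s)
nt1.drop                                   | Total                     | 90000
nt6.drop                                   | Total                     | 777
"""

if __name__ == "__main__":
    for counter, when in sorted(first_drop_times(SAMPLE.splitlines()).items()):
        print(counter, "first rose at", when)
```

Running it against the stats.log from the run with rules and the one without would show whether the first drops appear at roughly the same point in both.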


>> On Thu, Jan 18, 2018 at 1:16 PM, Steve Castellarin <steve.castellarin at gmail.com> wrote:
>> Hey Peter,
>> 
>> Those changes didn't help.  Around 23+ minutes into the run one worker CPU (#30) stayed at 100% while buffer NT11 dropped packets and would not recover.  I'm attaching a zip file that has the stats.log for that run, the suricata.log file as well as the information seen at the command line after issuing "/usr/bin/suricata -vvv -c /etc/suricata/suricata.yaml --napatech --runmode workers -D".
>> 
>> Steve
>> 
>> 
>>> On Thu, Jan 18, 2018 at 11:30 AM, Steve Castellarin <steve.castellarin at gmail.com> wrote:
>>> We never see above 2Gbps.  When the issue occurred a little bit ago I was running the Napatech "monitoring" tool and it was saying we were between 650-900Mbps.  I'll note the bandwidth utilization when the next issue occurs.
>>> 
>>>> On Thu, Jan 18, 2018 at 11:28 AM, Peter Manev <petermanev at gmail.com> wrote:
>>>> On Thu, Jan 18, 2018 at 5:27 PM, Steve Castellarin
>>>> <steve.castellarin at gmail.com> wrote:
>>>> > When you mean the "size of the traffic", are you asking what the bandwidth
>>>> > utilization is at the time the issue begins?
>>>> 
>>>> Sorry - I mean the traffic you sniff: 1/5/10... Gbps?
>>>> 
>>>> >
>>>> > I will set things up and send you any/all output after the issue starts.
>>>> >
>>>> > On Thu, Jan 18, 2018 at 11:17 AM, Peter Manev <petermanev at gmail.com> wrote:
>>>> >>
>>>> >> On Thu, Jan 18, 2018 at 4:43 PM, Steve Castellarin
>>>> >> <steve.castellarin at gmail.com> wrote:
>>>> >> > Hey Peter,
>>>> >> >
>>>> >> > I tried as you asked.  Less than 15 minutes after I restarted Suricata I
>>>> >> > saw my first CPU hitting 100% and one host buffer dropping all packets.
>>>> >> > Shortly after that the second CPU hit 100% and a second host buffer began
>>>> >> > dropping all packets.  I'm attaching the stats.log where you'll see at
>>>> >> > 10:31:11 the first host buffer (nt1.drop) starts to register dropped
>>>> >> > packets, then at 10:31:51 you'll see host buffer nt6.drop begin to
>>>> >> > register dropped packets.  At that point I issued the kill.
>>>> >> >
>>>> >>
>>>> >> What is the size of the traffic?
>>>> >> Can you also try
>>>> >> detect:
>>>> >>   - profile: high
>>>> >>
>>>> >> (as opposed to "custom")
>>>> >>
>>>> >> Also, if you can, run it in verbose mode (-vvv) and send me the complete
>>>> >> output after you start having the issues.
>>>> >>
>>>> >> Thanks
>>>> >>
>>>> >>
>>>> >>
>>>> >> > Steve
>>>> >> >
>>>> >> > On Thu, Jan 18, 2018 at 10:05 AM, Peter Manev <petermanev at gmail.com>
>>>> >> > wrote:
>>>> >> >>
>>>> >> >> On Wed, Jan 17, 2018 at 1:29 PM, Steve Castellarin
>>>> >> >> <steve.castellarin at gmail.com> wrote:
>>>> >> >> > Hey Pete,
>>>> >> >> >
>>>> >> >> > Here's the YAML file from the last time I attempted to run 4.0.3 - with
>>>> >> >> > the network information removed.  Let me know if you need anything else
>>>> >> >> > from our configuration.  I'll also go to the redmine site to open a bug
>>>> >> >> > report.
>>>> >> >> >
>>>> >> >> > Steve
>>>> >> >>
>>>> >> >> Hi Steve,
>>>> >> >>
>>>> >> >> Can you try without -
>>>> >> >>
>>>> >> >>   midstream: true
>>>> >> >>   async-oneside: true
>>>> >> >> so
>>>> >> >>   #midstream: true
>>>> >> >>   #async-oneside: true
>>>> >> >>
>>>> >> >> and lower the "prealloc-sessions: 1000000" to 100 000 for example
>>>> >> >>
>>>> >> >>
>>>> >> >> Thank you.
>>>> >> >>
>>>> >> >> >
>>>> >> >> > On Wed, Jan 17, 2018 at 6:36 AM, Peter Manev <petermanev at gmail.com>
>>>> >> >> > wrote:
>>>> >> >> >>
>>>> >> >> >> On Tue, Jan 16, 2018 at 4:12 PM, Steve Castellarin
>>>> >> >> >> <steve.castellarin at gmail.com> wrote:
>>>> >> >> >> > Hey Peter, I didn't know if you had a chance to look at the stats log
>>>> >> >> >> > and configuration file I sent.  So far, running 3.1.1 with the updated
>>>> >> >> >> > Napatech drivers my system is running without any issues.
>>>> >> >> >> >
>>>> >> >> >>
>>>> >> >> >> The toughest part of the troubleshooting is that I don't have the setup
>>>> >> >> >> to reproduce this.
>>>> >> >> >> I didn't see anything in the stats log that could lead me to a
>>>> >> >> >> definitive conclusion.
>>>> >> >> >> Can you please open a bug report on our redmine with the details from
>>>> >> >> >> this mail thread?
>>>> >> >> >>
>>>> >> >> >> Would it be possible to share the suricata.yaml (privately works too,
>>>> >> >> >> if you prefer; remove all networks)?
>>>> >> >> >>
>>>> >> >> >> Thank you
>>>> >> >> >>
>>>> >> >> >> > On Thu, Jan 11, 2018 at 12:54 PM, Steve Castellarin
>>>> >> >> >> > <steve.castellarin at gmail.com> wrote:
>>>> >> >> >> >>
>>>> >> >> >> >> Here is the zipped stats.log.  I restarted the Napatech drivers before
>>>> >> >> >> >> running Suricata 4.0.3 to clear out any previous drop counters, etc.
>>>> >> >> >> >>
>>>> >> >> >> >> The first time I saw a packet drop was at the 12:20:51 mark, and you'll
>>>> >> >> >> >> see "nt12.drop" increment.  During this time one of the CPUs acting as a
>>>> >> >> >> >> "worker" was at 100%.  But these drops recovered at the 12:20:58 mark,
>>>> >> >> >> >> where "nt12.drop" stays constant at 13803.  The big issue triggered at
>>>> >> >> >> >> the 12:27:05 mark in the file - where one worker CPU was stuck at 100%
>>>> >> >> >> >> followed by packet drops in host buffer "nt3.drop".  Then came a second
>>>> >> >> >> >> CPU at 100% (another "worker" CPU) and packet drops in buffer "nt2.drop"
>>>> >> >> >> >> at 12:27:33.  I finally killed Suricata just before 12:27:54, where you
>>>> >> >> >> >> see all host buffers beginning to drop packets.
>>>> >> >> >> >>
>>>> >> >> >> >> I'm also including the output from the "suricata --dump-config" command.
>>>> >> >> >> >>
>>>> >> >> >> >> On Thu, Jan 11, 2018 at 11:40 AM, Peter Manev
>>>> >> >> >> >> <petermanev at gmail.com>
>>>> >> >> >> >> wrote:
>>>> >> >> >> >>>
>>>> >> >> >> >>> On Thu, Jan 11, 2018 at 8:02 AM, Steve Castellarin
>>>> >> >> >> >>> <steve.castellarin at gmail.com> wrote:
>>>> >> >> >> >>> > Peter, yes that is correct.  I worked for almost a couple weeks with
>>>> >> >> >> >>> > Napatech support and they believed the Napatech setup (ntservice.ini
>>>> >> >> >> >>> > and custom NTPL script) are working as they should.
>>>> >> >> >> >>> >
>>>> >> >> >> >>>
>>>> >> >> >> >>> Ok.
>>>> >> >> >> >>>
>>>> >> >> >> >>> One major difference between Suricata 3.x and 4.0.x in terms of
>>>> >> >> >> >>> Napatech is that they updated the code, with some fixes and updated
>>>> >> >> >> >>> counters.
>>>> >> >> >> >>> There were a bunch of upgrades in Suricata too.
>>>> >> >> >> >>> Is it possible to send over a stats.log from when the issue starts
>>>> >> >> >> >>> occurring?
>>>> >> >> >> >>>
>>>> >> >> >> >>>
>>>> >> >> >> >>> > On Thu, Jan 11, 2018 at 9:52 AM, Peter Manev
>>>> >> >> >> >>> > <petermanev at gmail.com>
>>>> >> >> >> >>> > wrote:
>>>> >> >> >> >>> >>
>>>> >> >> >> >>> >> On 11 Jan 2018, at 07:19, Steve Castellarin
>>>> >> >> >> >>> >> <steve.castellarin at gmail.com>
>>>> >> >> >> >>> >> wrote:
>>>> >> >> >> >>> >>
>>>> >> >> >> >>> >> After my last email yesterday I decided to go back to our 3.1.1
>>>> >> >> >> >>> >> install of Suricata, with the upgraded Napatech version.  Since then
>>>> >> >> >> >>> >> I've seen no packets dropped with sustained bandwidth of between 1 and
>>>> >> >> >> >>> >> 1.7Gbps.  So I'm not sure what is going on with my
>>>> >> >> >> >>> >> configuration/setup of Suricata 4.0.3.
>>>> >> >> >> >>> >>
>>>> >> >> >> >>> >>
>>>> >> >> >> >>> >>
>>>> >> >> >> >>> >> So the only thing that you changed is the upgrade of the Napatech
>>>> >> >> >> >>> >> drivers?
>>>> >> >> >> >>> >> The Suricata config stayed the same - you just upgraded to 4.0.3 (from
>>>> >> >> >> >>> >> 3.1.1) and the observed effect was - after a while all (or most) CPUs
>>>> >> >> >> >>> >> get pegged at 100% - is that correct?
>>>> >> >> >> >>> >>
>>>> >> >> >> >>> >>
>>>> >> >> >> >>> >> On Wed, Jan 10, 2018 at 4:46 PM, Steve Castellarin
>>>> >> >> >> >>> >> <steve.castellarin at gmail.com> wrote:
>>>> >> >> >> >>> >>>
>>>> >> >> >> >>> >>> Hey Peter, no, there are no error messages.
>>>> >> >> >> >>> >>>
>>>> >> >> >> >>> >>> On Jan 10, 2018 4:37 PM, "Peter Manev"
>>>> >> >> >> >>> >>> <petermanev at gmail.com>
>>>> >> >> >> >>> >>> wrote:
>>>> >> >> >> >>> >>>
>>>> >> >> >> >>> >>> On Wed, Jan 10, 2018 at 11:29 AM, Steve Castellarin
>>>> >> >> >> >>> >>> <steve.castellarin at gmail.com> wrote:
>>>> >> >> >> >>> >>> > Hey Peter,
>>>> >> >> >> >>> >>>
>>>> >> >> >> >>> >>> Are there any error messages in suricata.log when that happens?
>>>> >> >> >> >>> >>>
>>>> >> >> >> >>> >>> Thank you
>>>> >> >> >> >>> >>>
>>>> >> >> >> >>> >>>
>>>> >> >> >> >>> >>>
>>>> >> >> >> >>> >>> --
>>>> >> >> >> >>> >>> Regards,
>>>> >> >> >> >>> >>> Peter Manev
>>>> >> >> >> >>> >>>
>>>> >> >> >> >>> >>>
>>>> >> >> >> >>> >>
>>>> >> >> >> >>> >
>>>> >> >> >> >>>
>>>> >> >> >> >>>
>>>> >> >> >> >>>
>>>> >> >> >> >>> --
>>>> >> >> >> >>> Regards,
>>>> >> >> >> >>> Peter Manev
>>>> >> >> >> >>
>>>> >> >> >> >>
>>>> >> >> >> >
>>>> >> >> >>
>>>> >> >> >>
>>>> >> >> >>
>>>> >> >> >> --
>>>> >> >> >> Regards,
>>>> >> >> >> Peter Manev
>>>> >> >> >
>>>> >> >> >
>>>> >> >>
>>>> >> >>
>>>> >> >>
>>>> >> >> --
>>>> >> >> Regards,
>>>> >> >> Peter Manev
>>>> >> >
>>>> >> >
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >> Regards,
>>>> >> Peter Manev
>>>> >
>>>> >
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Regards,
>>>> Peter Manev
>>> 
>> 
> 