<div dir="ltr">Peter Manev has been helping off-list with this issue. And as of now it looks like my Suricata 4.0.3 instance is stable and running well. Here is a summary of changes we've made to the YAML file. Thank you again Peter for all your help with this.<div><br></div><div><div>1) Change in "libhtp" section, from:</div><div> default-config:</div><div> request-body-limit: 12mb</div><div> response-body-limit: 12mb</div><div>TO:</div><div> request-body-limit: 2mb</div><div> response-body-limit: 2mb</div><div><br></div><div>2) Changed "max-pending-packets: 10000" to "max-pending-packets: 65000"</div><div><br></div><div>3) Changed "default-packet-size: 9018" to "default-packet-size: 1572"</div><div><br></div><div>4) Change in "flow" section, from:</div><div> managers: 10</div><div> #recyclers: 1 # default to one flow recycler thread</div><div>TO:</div><div> managers: 2</div><div> recyclers: 2 # default to one flow recycler thread</div><div><br></div><div>5) Changed "flow-timeouts" section from:</div><div> default:</div><div> new: 3</div><div> established: 300</div><div> closed: 0</div><div> emergency-new: 10</div><div> emergency-established: 10</div><div> emergency-closed: 0</div><div> tcp:</div><div> new: 6</div><div> established: 100</div><div> closed: 12</div><div> emergency-new: 1</div><div> emergency-established: 5</div><div> emergency-closed: 2</div><div> udp:</div><div> new: 3</div><div> established: 30</div><div> emergency-new: 3</div><div> emergency-established: 10</div><div> icmp:</div><div> new: 3</div><div> established: 30</div><div> emergency-new: 1</div><div> emergency-established: 10</div><div>TO:</div><div> default:</div><div> new: 2</div><div> established: 30</div><div> closed: 0</div><div> bypassed: 10</div><div> emergency-new: 1</div><div> emergency-established: 10</div><div> emergency-closed: 0</div><div> emergency-bypassed: 5</div><div> tcp:</div><div> new: 2</div><div> established: 60</div><div> closed: 2</div><div> bypassed: 20</div><div> emergency-new: 1</div><div> emergency-established: 10</div><div> emergency-closed: 0</div><div> emergency-bypassed: 5</div><div> udp:</div><div> new: 2</div><div> established: 15</div><div> bypassed: 10</div><div> emergency-new: 1</div><div> emergency-established: 10</div><div> emergency-bypassed: 5</div><div> icmp:</div><div> new: 3</div><div> established: 30</div><div> bypassed: 10</div><div> emergency-new: 1</div><div> emergency-established: 10</div><div> emergency-bypassed: 5</div><div><span style="white-space:pre"> </span></div><div>6) Changed "stream" section from:</div><div> memcap: 12gb</div><div> checksum-validation: no</div><div> prealloc-session: 100000</div><div> inline: no </div><div> bypass: yes</div><div> reassembly:</div><div> memcap: 20gb </div><div> depth: 12mb </div><div> toserver-chunk-size: 2560</div><div> toclient-chunk-size: 2560</div><div> randomize-chunk-size: yes</div><div> chunk-prealloc: 303360 </div><div><br></div><div>TO:</div><div> memcap: 12gb</div><div> checksum-validation: no </div><div> prealloc-session: 100000</div><div> inline: auto </div><div> reassembly:</div><div> memcap: 20gb</div><div> depth: 2mb </div><div> toserver-chunk-size: 2560</div><div> toclient-chunk-size: 2560</div><div> randomize-chunk-size: yes</div><div> #randomize-chunk-range: 10</div><div> #raw: yes</div><div> segment-prealloc: 40000</div><div><br></div><div>7) Changed "detect" section from:</div><div> profile: high</div><div> custom-values:</div><div> toclient-sp-groups: 200</div><div> toclient-dp-groups: 300</div><div> toserver-src-groups: 200</div><div> toserver-dst-groups: 400</div><div> toserver-sp-groups: 200</div><div> toserver-dp-groups: 250</div><div> sgh-mpm-context: auto</div><div> inspection-recursion-limit: 3000</div><div> prefilter:</div><div> default: mpm</div><div><br></div><div>TO:</div><div> profile: high</div><div> custom-values:</div><div> toclient-groups: 3</div><div> toserver-groups: 25</div><div> sgh-mpm-context: auto</div><div> inspection-recursion-limit: 3000</div><div><br></div><div> prefilter:</div><div> default: auto</div><div> grouping:</div><div> tcp-whitelist: 53, 80, 139, 443, 445, 1433, 3306, 3389, 6666, 6667, 8080</div><div> udp-whitelist: 53, 135, 5060</div><div><br></div><div> profiling:</div><div> grouping:</div><div> dump-to-disk: false</div><div> include-rules: false # very verbose</div><div> include-mpm-stats: false</div><div><br></div><div>8) Changed "mpm-algo" from 'ac-ks' to 'auto'</div><div><br></div><div>9) Changed "spm-algo" from 'bm' to 'auto'</div><div><br></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jan 31, 2018 at 2:25 AM, Peter Manev <span dir="ltr"><<a href="mailto:petermanev@gmail.com" target="_blank">petermanev@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Tue, Jan 30, 2018 at 10:07 PM, Steve Castellarin<br>
<<a href="mailto:steve.castellarin@gmail.com">steve.castellarin@gmail.com</a>> wrote:<br>
> Oh sorry. In one instance it took 20-25 minutes. Another took an hour. In<br>
> both cases the bandwidth utilization was under 1Gbps.<br>
><br>
<br>
</span>In this case I would suggest to try to narrow it down if possible (and<br>
see if that is the real cause actually) - to a rule file/rule<br>
So maybe if you can take the config that took 1 hr and start from there.<br>
<div class="HOEnZb"><div class="h5"><br>
<br>
> On Tue, Jan 30, 2018 at 4:06 PM, Peter Manev <<a href="mailto:petermanev@gmail.com">petermanev@gmail.com</a>> wrote:<br>
>><br>
>> On Tue, Jan 30, 2018 at 9:46 PM, Steve Castellarin<br>
>> <<a href="mailto:steve.castellarin@gmail.com">steve.castellarin@gmail.com</a>> wrote:<br>
>> > It will stay 100% for minutes, etc - until I kill Suricata. The same<br>
>> > goes<br>
>> > with the associated host buffer - it will continually drop packets. If<br>
>> > I do<br>
>> > not stop Suricata, eventually a second CPU/host buffer pair will hit<br>
>> > that<br>
>> > 100% mark, and so on. I've had instances where I've let it go to 8 or 9<br>
>> > CPU/buffers at 100% before I killed it - hoping that the original CPU(s)<br>
>> > would recover but they don't.<br>
>> ><br>
>><br>
>> I meant something else.<br>
>> In previous runs you mentioned that one or more buffers start hitting<br>
>> 100% right after 15 min.<br>
>> In the two previous test runs - that you tried with 1/2 the ruleset -<br>
>> how long did it take before you started seeing any buffer hitting 100%<br>
>> ?<br>
>><br>
>> > On Tue, Jan 30, 2018 at 3:34 PM, Peter Manev <<a href="mailto:petermanev@gmail.com">petermanev@gmail.com</a>><br>
>> > wrote:<br>
>> >><br>
>> >> On Tue, Jan 30, 2018 at 8:49 PM, Steve Castellarin<br>
>> >> <<a href="mailto:steve.castellarin@gmail.com">steve.castellarin@gmail.com</a>> wrote:<br>
>> >> > Hey Peter,<br>
>> >> ><br>
>> >> > Unfortunately I continue to have the same issues with a buffer<br>
>> >> > overflowing<br>
>> >> > and a CPU staying at 100%, repeating over multiple buffers and CPUs<br>
>> >> > until I<br>
>> >> > kill the Suricata process.<br>
>> >><br>
>> >> For what period of time o you get to the 100% ?<br>
>> >><br>
>> >> ><br>
>> >> > On Thu, Jan 25, 2018 at 9:14 AM, Steve Castellarin<br>
>> >> > <<a href="mailto:steve.castellarin@gmail.com">steve.castellarin@gmail.com</a>> wrote:<br>
>> >> >><br>
>> >> >> OK I'll create a separate bug tracker on Redmine.<br>
>> >> >><br>
>> >> >> I was able to run 4.0.3 with a smaller ruleset (13,971 versus<br>
>> >> >> 29,110)<br>
>> >> >> for<br>
>> >> >> 90 minutes yesterday, without issue, before I had to leave. I'm<br>
>> >> >> getting<br>
>> >> >> ready to run 4.0.3 again to see how it runs and for how long. I'll<br>
>> >> >> update<br>
>> >> >> with results.<br>
>> >> >><br>
>> >> >> On Thu, Jan 25, 2018 at 9:00 AM, Peter Manev <<a href="mailto:petermanev@gmail.com">petermanev@gmail.com</a>><br>
>> >> >> wrote:<br>
>> >> >>><br>
>> >> >>> On Wed, Jan 24, 2018 at 6:27 PM, Steve Castellarin<br>
>> >> >>> <<a href="mailto:steve.castellarin@gmail.com">steve.castellarin@gmail.com</a>> wrote:<br>
>> >> >>> > If a bug/feature report is needed - would that fall into Bug<br>
>> >> >>> > #2423<br>
>> >> >>> > that<br>
>> >> >>> > I<br>
>> >> >>> > opened on Redmine last week?<br>
>> >> >>> ><br>
>> >> >>><br>
>> >> >>> Separate is probably better.<br>
>> >> >>><br>
>> >> >>> > As for splitting the rules, I'll test that out and let you know<br>
>> >> >>> > what<br>
>> >> >>> > happens.<br>
>> >> >>> ><br>
>> >> >>><br>
>> >> >>><br>
>> >> >>> --<br>
>> >> >>> Regards,<br>
>> >> >>> Peter Manev<br>
>> >> >><br>
>> >> >><br>
>> >> ><br>
>> >><br>
>> >><br>
>> >><br>
>> >> --<br>
>> >> Regards,<br>
>> >> Peter Manev<br>
>> ><br>
>> ><br>
>><br>
>><br>
>><br>
>> --<br>
>> Regards,<br>
>> Peter Manev<br>
><br>
><br>
<br>
<br>
<br>
</div></div><span class="HOEnZb"><font color="#888888">--<br>
Regards,<br>
Peter Manev<br>
</font></span></blockquote></div><br></div>