<div dir="ltr"><div dir="ltr">Dear Peter,<div><br></div><div>I'm sorry for the late reply.</div><div>I have started suricata without any rules on the problematic node2, and indeed the drops were almost 0. </div><div>By default we have around 20k rules loaded.</div><div>On the other hand, with that little traffic we would still expect having very little/no drops.</div><div><br></div><div>The other node (node1) has exactly same set of rules loaded.</div><div><br></div><div>Thank you a lot for you help.</div><div><br></div><div>Best,</div><div>magmi</div></div></div><br><div class="gmail_quote"><div dir="ltr">On Fri, 14 Sep 2018 at 11:13, Peter Manev <<a href="mailto:petermanev@gmail.com">petermanev@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Thu, Sep 13, 2018 at 8:48 AM Magmi A <<a href="mailto:magmi.sec@gmail.com" target="_blank">magmi.sec@gmail.com</a>> wrote:<br>
><br>
> Hi Peter,<br>
><br>
> Yes, of course.<br>
> Here is the stats.log for Node2:<br>
><br>
> Date:9/13/2018--06:37:42(uptime:6d,17h05m59s)<br>
> ------------------------------------------------------------------------------------<br>
> Counter |TMName |Value<br>
> ------------------------------------------------------------------------------------<br>
> capture.kernel_packets |Total |135374454<br>
> capture.kernel_drops |Total |11430384<br>
> decoder.pkts |Total |123946335<br>
> decoder.bytes |Total |122943123694<br>
> decoder.ipv4 |Total |123767936<br>
> decoder.ipv6 |Total |5036<br>
> decoder.ethernet |Total |123946335<br>
> decoder.tcp |Total |120082602<br>
> decoder.udp |Total |683567<br>
> decoder.icmpv4 |Total |114883<br>
> decoder.icmpv6 |Total |189<br>
> decoder.teredo |Total |5<br>
> decoder.avg_pkt_size |Total |991<br>
> decoder.max_pkt_size |Total |1514<br>
> flow.tcp |Total |780742<br>
> flow.udp |Total |305951<br>
> flow.icmpv6 |Total |162<br>
> tcp.sessions |Total |727356<br>
> tcp.syn |Total |771112<br>
> tcp.synack |Total |720764<br>
> tcp.rst |Total |549359<br>
> tcp.stream_depth_reached |Total |454<br>
> tcp.reassembly_gap |Total |7722<br>
> tcp.overlap |Total |483624<br>
> detect.alert |Total |4<br>
> app_layer.flow.http |Total |108080<br>
> app_layer.tx.http |Total |262748<br>
> app_layer.flow.smtp |Total |7<br>
> app_layer.tx.smtp |Total |7<br>
> app_layer.flow.tls |Total |6612<br>
> app_layer.flow.ssh |Total |13<br>
> app_layer.flow.smb |Total |55361<br>
> app_layer.flow.dcerpc_tcp |Total |202204<br>
> app_layer.flow.dns_tcp |Total |10<br>
> app_layer.tx.dns_tcp |Total |10<br>
> app_layer.flow.failed_tcp |Total |274419<br>
> app_layer.flow.dcerpc_udp |Total |4<br>
> app_layer.flow.dns_udp |Total |139325<br>
> app_layer.tx.dns_udp |Total |239577<br>
> app_layer.flow.failed_udp |Total |166622<br>
> flow_mgr.closed_pruned |Total |684943<br>
> flow_mgr.new_pruned |Total |290500<br>
> flow_mgr.est_pruned |Total |110950<br>
> flow.spare |Total |10000<br>
> flow.tcp_reuse |Total |6055<br>
> flow_mgr.flows_checked |Total |12<br>
> flow_mgr.flows_notimeout |Total |5<br>
> flow_mgr.flows_timeout |Total |7<br>
> flow_mgr.flows_timeout_inuse |Total |1<br>
> flow_mgr.flows_removed |Total |6<br>
> flow_mgr.rows_checked |Total |65536<br>
> flow_mgr.rows_skipped |Total |65522<br>
> flow_mgr.rows_empty |Total |2<br>
> flow_mgr.rows_maxlen |Total |1<br>
> tcp.memuse |Total |4587520<br>
> tcp.reassembly_memuse |Total |6650304<br>
> dns.memuse |Total |16386<br>
> http.memuse |Total |12142178<br>
> flow.memuse |Total |7207360<br>
><br>
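As a quick sanity check, that drop percentage can be derived from the two capture counters above. A rough one-liner sketch (the stats.log path is an assumption); for the Node2 numbers it comes out to roughly 8.4%:<br>
<br>
awk -F'|' '/capture.kernel_packets/ {p=$3} /capture.kernel_drops/ {d=$3} END {if (p) printf "%.2f%% dropped\n", 100*d/p}' /var/log/suricata/stats.log<br>
<br>
Since stats.log is appended to over time, the one-liner ends up using the values from the most recent snapshot.<br>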
<br>
Can you do a zero test - run it for a bit with 0 rules loaded<br>
(-S /dev/null) - and see if you get the same percentage of drops?<br>
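<br>
For reference, a minimal sketch of such a zero-rules run (the interface name and config path here are assumptions, adjust them to your setup):<br>
<br>
suricata -c /etc/suricata/suricata.yaml -S /dev/null -i eth0<br>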
<br>
<br>
<br>
> And just for reference, the stats.log for Node1:<br>
><br>
> Date:9/13/2018--06:35:39(uptime:6d,16h39m54s)<br>
> ------------------------------------------------------------------------------------<br>
> Counter |TMName |Value<br>
> ------------------------------------------------------------------------------------<br>
> capture.kernel_packets |Total |3577965800<br>
> capture.kernel_drops |Total |3416155<br>
> decoder.pkts |Total |3574589712<br>
> decoder.bytes |Total |3536139875210<br>
> decoder.invalid |Total |10070<br>
> decoder.ipv4 |Total |3571132083<br>
> decoder.ipv6 |Total |143756<br>
> decoder.ethernet |Total |3574589712<br>
> decoder.tcp |Total |3522243739<br>
> decoder.udp |Total |34827953<br>
> decoder.icmpv4 |Total |963831<br>
> decoder.icmpv6 |Total |33551<br>
> decoder.teredo |Total |1399<br>
> decoder.avg_pkt_size |Total |989<br>
> decoder.max_pkt_size |Total |1534<br>
> flow.tcp |Total |1144524<br>
> flow.udp |Total |202960<br>
> flow.icmpv6 |Total |439<br>
> decoder.ipv4.trunc_pkt |Total |10070<br>
> tcp.sessions |Total |341278<br>
> tcp.ssn_memcap_drop |Total |4446979<br>
> tcp.pseudo |Total |84<br>
> tcp.invalid_checksum |Total |4<br>
> tcp.syn |Total |6653717<br>
> tcp.synack |Total |2572744<br>
> tcp.rst |Total |1857715<br>
> tcp.segment_memcap_drop |Total |10<br>
> tcp.stream_depth_reached |Total |303<br>
> tcp.reassembly_gap |Total |95648<br>
> tcp.overlap |Total |3889304<br>
> tcp.insert_data_normal_fail |Total |3314483<br>
> detect.alert |Total |518<br>
> app_layer.flow.http |Total |34820<br>
> app_layer.tx.http |Total |60759<br>
> app_layer.flow.ftp |Total |20<br>
> app_layer.flow.smtp |Total |140<br>
> app_layer.tx.smtp |Total |177<br>
> app_layer.flow.tls |Total |43356<br>
> app_layer.flow.smb |Total |3430<br>
> app_layer.flow.dcerpc_tcp |Total |8894<br>
> app_layer.flow.dns_tcp |Total |48<br>
> app_layer.tx.dns_tcp |Total |46<br>
> app_layer.flow.failed_tcp |Total |107518<br>
> app_layer.flow.dcerpc_udp |Total |5<br>
> app_layer.flow.dns_udp |Total |114888<br>
> app_layer.tx.dns_udp |Total |482904<br>
> app_layer.flow.failed_udp |Total |88067<br>
> flow_mgr.closed_pruned |Total |259368<br>
> flow_mgr.new_pruned |Total |981024<br>
> flow_mgr.est_pruned |Total |107531<br>
> flow.spare |Total |10000<br>
> flow.tcp_reuse |Total |29932<br>
> flow_mgr.rows_checked |Total |65536<br>
> flow_mgr.rows_skipped |Total |65536<br>
> tcp.memuse |Total |4587520<br>
> tcp.reassembly_memuse |Total |655360<br>
> dns.memcap_global |Total |1086836<br>
> flow.memuse |Total |7074304<br>
><br>
> Thank you for any suggestions.<br>
><br>
> Best,<br>
> magmi<br>
><br>
> On Wed, 12 Sep 2018 at 16:16, Peter Manev <<a href="mailto:petermanev@gmail.com" target="_blank">petermanev@gmail.com</a>> wrote:<br>
>><br>
>> On Wed, Sep 12, 2018 at 10:57 AM Magmi A <<a href="mailto:magmi.sec@gmail.com" target="_blank">magmi.sec@gmail.com</a>> wrote:<br>
>> ><br>
>> ><br>
>> >> > * Node1 receives ~ 500Mbps of traffic (it's a 1Gbps interface) and gets on average 1-2% kernel packet drops,<br>
>> >> > while<br>
>> >> > * Node2 receives ~ 500kbps of traffic and gets on average 10% kernel packet drops<br>
>> >><br>
>> >> What is different between node 1 and node 2? (same config/same<br>
>> >> suricata/same HW/same rules...?)<br>
>> ><br>
>> ><br>
>> > The nodes have the same HW, run the same config/Suricata version, and have the same set of rules.<br>
>> ><br>
>> > The only difference is that they are exposed to different sources of traffic.<br>
>> > From Wireshark analysis, the protocol hierarchies in both cases look similar - there is no striking difference.<br>
>> ><br>
>> > So really the only difference is the captured traffic itself (MACs, IPs, partly the protocols, data, etc.).<br>
>> ><br>
>> > That is why we are struggling with how to approach and troubleshoot the problem.<br>
>><br>
>><br>
>> Can you share a full update of the latest stats.log?<br>
>><br>
>><br>
>><br>
>> --<br>
>> Regards,<br>
>> Peter Manev<br>
<br>
<br>
<br>
-- <br>
Regards,<br>
Peter Manev<br>
</blockquote></div>