[Oisf-users] [EXT] Re: Packet loss and increased resource consumption after upgrade to 4.1.2 with Rust support

Michał Purzyński michalpurzynski1 at gmail.com
Wed May 29 19:51:31 UTC 2019


How about ignoring layers above 3 and just going with ip src + ip dst? I'm
pretty sure I can do that on i40e.
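
Something along these lines, I believe (the interface name is just a
placeholder, and I have not verified the exact syntax on i40e):

  ethtool -N <iface> rx-flow-hash tcp4 sd
  ethtool -N <iface> rx-flow-hash udp4 sd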

On Wed, May 29, 2019 at 12:45 PM Nelson, Cooper <cnelson at ucsd.edu> wrote:

> Hi all,
>
> I've been investigating the 'tcp.pkt_on_wrong_thread' issue on my system;
> the counters are currently very high.
>
> I'm fairly certain I know what the issue is (at least on Intel cards).
> See this blog post:
>
>
> http://adrianchadd.blogspot.com/2014/08/receive-side-scaling-figuring-out-how.html
>
> Pay close attention to this bit....
>
> >The Intel and Chelsio NICs will hash on all packets that are fragmented
> by only hashing on the IPv4 details. So, if it's a fragmented TCP or UDP
> frame, it will hash the first fragment the same as the others - it'll
> ignore the TCP/UDP details and only hash on the IPv4 frame. This means that
> all the fragments in a given IP datagram will hash to the same value and
> thus the same queue.
>
> >But if there are a mix of fragmented and non-fragmented packets in a
> given flow - for example, small versus larger UDP frames - then some may be
> hashed via the IPv4+TCP or IPv4+UDP details and some will just be hashed
> via the IPv4 details. This means that packets in the same flow will end up
> being received in different receive queues and thus highly likely be
> processed out of order.
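>
> (As a quick check, something like "ethtool -n <iface> rx-flow-hash tcp4"
> and the same for udp4 should show which fields the NIC is currently
> hashing on - the interface name is a placeholder.)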
>
> The edge case described above is actually very, very common when
> monitoring live traffic on a big, busy network.  Large HTTP downloads will
> begin with small, unfragmented TCP packets.  However, as the receive window
> increases over time, the TCP packets eventually become fragmented and end up
> on the wrong thread.  You also won't see this on test/simulated
> traffic unless you deliberately create these packets.
>
> The easiest fix for this would be to simply force a trivial 'sd'
> (src->dst) hash on the Intel NIC or within the ixgbe driver.  However,
> ethtool does not seem to allow this for TCP traffic.  I'm thinking it might
> be possible by modifying the driver source code.
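>
> (The command I would expect to do that is something along the lines of
> "ethtool -N <iface> rx-flow-hash tcp4 sd" - interface name is a
> placeholder - but ixgbe appears to reject anything other than its default
> field set for TCP.)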
>
> If anyone has any ideas I would appreciate it.
>
> -Coop
>
> -----Original Message-----
> From: Oisf-users <oisf-users-bounces at lists.openinfosecfoundation.org> On
> Behalf Of Cloherty, Sean E
> Sent: Thursday, February 14, 2019 2:00 PM
> To: Peter Manev <petermanev at gmail.com>; Eric Urban <eurban at umn.edu>
> Cc: Open Information Security Foundation <
> oisf-users at lists.openinfosecfoundation.org>
> Subject: Re: [Oisf-users] [EXT] Re: Packet loss and increased resource
> consumption after upgrade to 4.1.2 with Rust support
>
> That also seems to be the case with me regarding high counts on
> tcp.pkt_on_wrong_thread.  I've reverted to 4.0.6 using the same setup and
> YAML and the stats look much better with no packet loss.  I will forward
> the data.
>
> Thanks.
>
> -----Original Message-----
> From: Peter Manev <petermanev at gmail.com>
> Sent: Wednesday, February 13, 2019 3:52 PM
> To: Eric Urban <eurban at umn.edu>
> Cc: Cloherty, Sean E <scloherty at mitre.org>; Open Information Security
> Foundation <oisf-users at lists.openinfosecfoundation.org>
> Subject: Re: [EXT] Re: [Oisf-users] Packet loss and increased resource
> consumption after upgrade to 4.1.2 with Rust support
>
> On Fri, Feb 8, 2019 at 6:34 PM Eric Urban <eurban at umn.edu> wrote:
> >
> > Peter, I emailed our config to you directly.  I mentioned in my original
> email that we did test 4.1.2 built with Rust support but with the Rust
> parsers explicitly disabled in the config, and we still experienced
> significant packet loss.  In that case I added the following config under
> app-layer.protocols but left the rest of the config the same:
> >
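> (The exact overrides aren't shown in the quote above; purely as an
> illustration, disabling Rust-backed parsers under app-layer.protocols
> would look roughly like this - the parser names here are examples only:
>
>   app-layer:
>     protocols:
>       nfs:
>         enabled: no
>       smb:
>         enabled: no
>       ntp:
>         enabled: no
> )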
>
>
> Thank you for sharing all the requested information.
> Please find below my observations and some suggestions.
>
> The good news with 4.1.2:
> tcp.pkt_on_wrong_thread                    | Total                     | 100
>
> This is very low (the lowest I have seen) for the "tcp.pkt_on_wrong_thread"
> counter, especially for a big run like the shared stats - over 20 days.
> Do you mind sharing a bit more info on your NIC (Myricom, if I am not
> mistaken) - driver/version/any specific setup? We are trying to keep a
> record of that here -
> https://redmine.openinfosecfoundation.org/issues/2725#note-13
>
>
> Observations:
> with 4.1.2 these counters seem odd -
> capture.kernel_packets                     | Total                     | 16345348068
> capture.kernel_drops                       | Total                     | 33492892572
> i.e. you have more kernel_drops than kernel_packets, which seems odd and
> makes me think it may be a "counter" bug of some sort. Are the NIC driver
> versions the same on both boxes / same NIC config, etc.?
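>
> (Something like "ethtool -i <iface>" on each box should show the driver
> name, driver version and firmware version for comparison - the interface
> name is a placeholder.)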
>
>
> Suggestions for the 4.1.2 setup:
> Try a run where you disable these two settings (set them to false) and see
> if it makes any difference:
>   midstream: true            # allow midstream session pickups
>   async-oneside: true        # enable async stream handling
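>
> i.e. in suricata.yaml, under the stream section, roughly:
>
>   stream:
>     midstream: false
>     async-oneside: false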
>
> Thank you
> _______________________________________________
> Suricata IDS Users mailing list: oisf-users at openinfosecfoundation.org
> Site: http://suricata-ids.org | Support: http://suricata-ids.org/support/
> List: https://lists.openinfosecfoundation.org/mailman/listinfo/oisf-users
>
> Conference: https://suricon.net
> Trainings: https://suricata-ids.org/training/