[Oisf-devel] improve flow table in workers runmode

Vito Piserchia vpiserchia at gmail.com
Thu Apr 4 12:44:17 UTC 2013


Hi all,

IMHO the key to success is having a symmetric RSS hash function. Some
experiments/studies have already been done on this, e.g.:
http://www.ndsl.kaist.edu/~shinae/papers/TR-symRSS.pdf
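
For reference, the gist of that paper: the Toeplitz hash used for RSS
becomes symmetric when the 40-byte secret key is just the 16-bit pattern
0x6d5a repeated, because the source and destination fields of the tuple
sit at offsets that differ by multiples of 16 bits. A minimal software
sketch of the Toeplitz hash (my own illustration, not code from the
paper) showing that both directions of a flow hash identically under
that key:

#include <stdint.h>
#include <stdio.h>

/* Symmetric RSS key from the TR-symRSS paper: the 16-bit pattern
 * 0x6d5a repeated over the whole 40-byte key. */
static const uint8_t sym_key[40] = {
    0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
    0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
    0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
    0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
    0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
};

/* Textbook Toeplitz: for every set bit of the input, XOR into the
 * result the 32-bit key window starting at that bit position. A
 * 40-byte key covers inputs up to 36 bytes. */
static uint32_t toeplitz_hash(const uint8_t key[40],
                              const uint8_t *in, int len)
{
    uint32_t hash = 0;
    uint64_t window = 0;
    int i, b, next = 8;

    for (i = 0; i < 8; i++)               /* preload key bits 0..63 */
        window = (window << 8) | key[i];

    for (i = 0; i < len; i++) {
        for (b = 7; b >= 0; b--) {
            if (in[i] & (1 << b))
                hash ^= (uint32_t)(window >> 32);
            window <<= 1;
        }
        if (next < 40)
            window |= key[next++];        /* refill the low byte */
    }
    return hash;
}

int main(void)
{
    /* IPv4 RSS tuple layout: src IP, dst IP, src port, dst port */
    uint8_t fwd[12] = { 10,0,0,1, 10,0,0,2, 0x04,0xd2, 0x00,0x50 };
    uint8_t rev[12] = { 10,0,0,2, 10,0,0,1, 0x00,0x50, 0x04,0xd2 };
    printf("fwd=%08x rev=%08x\n",         /* prints the same value twice */
           toeplitz_hash(sym_key, fwd, 12),
           toeplitz_hash(sym_key, rev, 12));
    return 0;
}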

Obviously this could lead to unbalanced flow queues; think about
long-standing flows which remain alive for a long period of time. To take
this kind of situation into account, one could assign a group of
processing CPU threads to the packets that arrive from the same RSS
queue, losing, of course, the cache (and interrupt) affinity benefits in
that case.
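
To make that trade-off concrete, a toy sketch (hypothetical names, not
Suricata code) of spreading one RSS queue over a small group of workers,
picking the worker by a trivially symmetric software hash so a flow
still sticks to one thread:

#include <stdint.h>

/* Hypothetical sketch: each RSS queue is served by a small group of
 * worker threads instead of a single one. The worker inside the group
 * is chosen by a symmetric software flow hash, so both directions of a
 * flow still meet on one thread, at the price of cache/interrupt
 * affinity. */
#define WORKERS_PER_QUEUE 2

static int pick_worker(int rss_queue, uint32_t src_ip, uint32_t dst_ip)
{
    uint32_t flow_hash = src_ip ^ dst_ip;   /* XOR is symmetric */
    return rss_queue * WORKERS_PER_QUEUE
         + (int)(flow_hash % WORKERS_PER_QUEUE);
}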

regards
vito


On Thu, Apr 4, 2013 at 12:10 PM, Victor Julien <victor at inliniac.net> wrote:

> On 04/04/2013 11:53 AM, Eric Leblond wrote:
> > On Thu, 2013-04-04 at 11:05 +0200, Victor Julien wrote:
> >> On 04/04/2013 10:25 AM, Eric Leblond wrote:
> >>> Hello,
> >>>
> >>> On Wed, 2013-04-03 at 15:40 +0200, Victor Julien wrote:
> >>>> On 04/03/2013 10:59 AM, Chris Wakelin wrote:
> >>>>> On 03/04/13 09:19, Victor Julien wrote:
> >>>>>>> On 04/03/2013 02:31 AM, Song liu wrote:
> >>>>>>>>>
> >>>>>>>>> Right now, all workers share one big flow table, and there
> >>>>>>>>> will be contention for it.
> >>>>>>>>> Supposing that the network interface is flow-affine, each
> >>>>>>>>> worker will handle individual flows.
> >>>>>>>>> In this way, I think it makes more sense for each worker to
> >>>>>>>>> have its own flow table rather than one big table, to reduce
> >>>>>>>>> contention.
> >>>>>>>>>
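
(A minimal sketch of the proposal, with made-up types rather than
Suricata's real Flow/FlowHash structures: if the capture method already
guarantees that all packets of a flow reach the same worker, each worker
can own a private bucket array and lookups need no locking at all.)

#include <stddef.h>
#include <stdint.h>

#define FLOW_BUCKETS 65536

/* Hypothetical flow record, illustration only. */
typedef struct Flow_ {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    struct Flow_ *next;                   /* hash-bucket chain */
} Flow;

/* One table per worker thread: no other thread ever touches it, so no
 * mutex is taken anywhere on the lookup path. */
typedef struct FlowTable_ {
    Flow *buckets[FLOW_BUCKETS];
} FlowTable;

static Flow *flow_lookup(FlowTable *ft, uint32_t src_ip, uint32_t dst_ip,
                         uint16_t sp, uint16_t dp)
{
    /* Symmetric hash so both directions land in the same bucket. */
    uint32_t h = (src_ip ^ dst_ip ^ sp ^ dp) & (FLOW_BUCKETS - 1);
    for (Flow *f = ft->buckets[h]; f != NULL; f = f->next) {
        if ((f->src_ip == src_ip && f->dst_ip == dst_ip &&
             f->src_port == sp && f->dst_port == dp) ||
            (f->src_ip == dst_ip && f->dst_ip == src_ip &&
             f->src_port == dp && f->dst_port == sp))
            return f;
    }
    return NULL;
}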
> >>>>>>>
> >>>>>>> We've discussed this before and I think it would make sense. It
> >>>>>>> does require quite a bit of refactoring though, especially since
> >>>>>>> we'd have to support the current setup as well for the
> >>>>>>> non-workers runmodes.
> >>>>>>>
> >>>>> It sounds like a good idea when things like PF_RING are supposed to
> >>>>> handle the flow affinity onto virtual interfaces for us (PF_RING
> >>>>> DNA + libzero clusters do, and there's the
> >>>>> PF_RING_DNA_SYMMETRIC_RSS flag for PF_RING DNA without libzero and
> >>>>> interfaces that support RSS).
> >>>>
> >>>> Actually, all workers implementations currently share the same
> >>>> assumption: flow-based load balancing in pf_ring, af_packet, nfq,
> >>>> etc. So I think it makes sense to have a flow engine per worker in
> >>>> all these cases.
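
(For af_packet that assumption comes from the kernel's
PACKET_FANOUT_HASH mode, available since Linux 3.1. A bare-bones sketch
of how each worker's capture socket joins a fanout group so the kernel
keeps every flow pinned to one socket:)

#include <arpa/inet.h>        /* htons */
#include <linux/if_ether.h>   /* ETH_P_ALL */
#include <linux/if_packet.h>  /* PACKET_FANOUT, PACKET_FANOUT_HASH */
#include <stdio.h>
#include <sys/socket.h>

/* Called once per worker; all workers pass the same group id. The
 * kernel hashes the flow tuple and always delivers packets of one flow
 * to the same member socket of the group. */
int open_fanout_socket(int fanout_group_id)
{
    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0) {
        perror("socket");
        return -1;
    }
    int val = fanout_group_id | (PACKET_FANOUT_HASH << 16);
    if (setsockopt(fd, SOL_PACKET, PACKET_FANOUT, &val, sizeof(val)) < 0) {
        perror("setsockopt(PACKET_FANOUT)");
        return -1;
    }
    return fd;
}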
> >>>
> >>> IPS mode may be a special case. For example, NFQ will soon provide
> >>> a cpu fanout mode where the worker is selected based on CPU. The
> >>> idea is to have the NIC do the flow balancing. But this implies that
> >>> the return packet may arrive on a different CPU, depending on the
> >>> flow hash function used by the NIC.
> >>> We have the same behavior in af_packet IPS mode...
> >>
> >> I think this can lead to some weird packet order problems. T1 inspects
> >> toserver, T2 toclient. If the T1 worker is held up for whatever reason,
> >> we may for example process ACKs in T2 for packets we've not processed in
> >> T1 yet. I'm pretty sure this won't work correctly.
> >
> > In the case of IPS mode, does inline streaming depend on ACKed packets?
>
> No, but the stream engine is written with the assumption that what we
> see is the order of packets on the wire. TCP packets may still be out of
> order of course, but in this case the end-host has to deal with it as well.
>
> In cases like window checks, sequence validation, SACK checks, etc., I can
> imagine problems. We'd possibly reject/accept packets in the stream
> handling that the end host will treat differently.
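
(A contrived sketch of that failure mode, with invented field names
rather than Suricata's actual stream-engine state: if the toclient
thread applies an ACK before the toserver thread has processed the data
it acknowledges, a simple last_ack-based check starts dropping segments
that the real end host accepted.)

#include <stdbool.h>
#include <stdint.h>

/* Invented, simplified per-direction stream state. */
typedef struct StreamState_ {
    uint32_t last_ack;   /* highest ACK seen from the peer */
    uint32_t window;     /* receive window advertised by the peer */
} StreamState;

/* Simplified in-window check, with TCP sequence-number wraparound
 * handled via signed 32-bit differences. */
static bool segment_in_window(const StreamState *s,
                              uint32_t seq, uint32_t len)
{
    /* If the other thread raced ahead and applied a later ACK first,
     * last_ack may already be past this (perfectly valid) segment, and
     * we wrongly reject data the real end host accepted. */
    if ((int32_t)(seq + len - s->last_ack) <= 0)
        return false;                     /* "already acked": drop */
    return (int32_t)(seq - (s->last_ack + s->window)) < 0;
}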
>
> >
> >> This isn't limited to workers btw, in autofp when using multiple capture
> >> threads we can have the same issue. One side of a connection getting
> >> ahead of the other.
> >
> > Yes, I've observed this lead to strange behavior...
> >
> >> Don't think we can solve this in Suricata itself, as the OS has a lot of
> >> liberty in scheduling threads. A full packet reordering module might
> >> work, but its performance effect would probably completely nix all
> >> gains from said capture methods.
> >
> > Sure
> >
> >>> In this case, we may want to disable the per-worker flow engine,
> >>> which is a really good idea for the other runmodes.
> >>
> >> Don't think it would be sufficient. The ordering problem won't be solved
> >> by it.
> >
> > Yes, it may be interesting to study the hash functions used by NICs a
> > bit, to see if they behave symmetrically. If so, that should fix the
> > issue (at least for NFQ). I will look into it.
>
> --
> ---------------------------------------------
> Victor Julien
> http://www.inliniac.net/
> PGP: http://www.inliniac.net/victorjulien.asc
> ---------------------------------------------
>