[Oisf-devel] improve flow table in workers runmode

Eric Leblond eric at regit.org
Thu Apr 4 09:53:54 UTC 2013


On Thu, 2013-04-04 at 11:05 +0200, Victor Julien wrote:
> On 04/04/2013 10:25 AM, Eric Leblond wrote:
> > Hello,
> > 
> > On Wed, 2013-04-03 at 15:40 +0200, Victor Julien wrote:
> >> On 04/03/2013 10:59 AM, Chris Wakelin wrote:
> >>> On 03/04/13 09:19, Victor Julien wrote:
> >>>>> On 04/03/2013 02:31 AM, Song liu wrote:
> >>>>>>>
> >>>>>>> Right now, all workers share one big flow table, and there will be
> >>>>>>> contention for it.
> >>>>>>> Supposing that the network interface provides flow affinity, each
> >>>>>>> worker will handle individual flows.
> >>>>>>> In this way, I think it makes more sense for each worker to have its
> >>>>>>> own flow table rather than one big table, to reduce contention.
> >>>>>>>
> >>>>>
> >>>>> We've discussed this before, and I think it would make sense. It
> >>>>> does require quite a bit of refactoring though, especially since we'd
> >>>>> have to support the current setup as well for the non-workers runmodes.
> >>>>>
> >>> It sounds like a good idea, given that things like PF_RING are supposed
> >>> to handle the flow affinity onto virtual interfaces for us (PF_RING DNA +
> >>> libzero clusters do, and there's the PF_RING_DNA_SYMMETRIC_RSS flag for
> >>> PF_RING DNA without libzero on interfaces that support RSS).
> >>
> >> Actually, all workers-runmode implementations currently share the same
> >> assumption: flow-based load balancing in pf_ring, af_packet, nfq, etc. So
> >> I think it makes sense to have a flow engine per worker in all these cases.
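
Just to make the idea concrete, here is a very rough sketch of what a
per-worker flow table could look like next to today's shared, locked one.
This is not the actual flow engine code; all names and sizes are invented
for illustration:

/* Sketch only: NOT Suricata's flow engine, all names invented.
 * Idea: one private flow table per worker thread, so lookups need no
 * locking, versus one shared table whose buckets must be locked. This
 * only works if the capture method does flow-based load balancing, so
 * every packet of a flow reaches the same worker. */
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define FLOW_BUCKETS 65536

typedef struct FlowKey {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
    /* keys must be memset() to zero before filling, so that memcmp()
     * below is safe despite struct padding */
} FlowKey;

typedef struct Flow {
    FlowKey key;
    struct Flow *next;
    /* ... per-flow state ... */
} Flow;

/* Shared design: every lookup has to take a bucket lock. */
typedef struct SharedFlowTable {
    struct { pthread_mutex_t lock; Flow *head; } buckets[FLOW_BUCKETS];
} SharedFlowTable;

/* Per-worker design: one private table per worker, no locks at all. */
typedef struct WorkerFlowTable {
    Flow *buckets[FLOW_BUCKETS];
} WorkerFlowTable;

static uint32_t FlowKeyHash(const FlowKey *k)
{
    /* trivial hash, for illustration only */
    return (k->src_ip ^ k->dst_ip ^ k->src_port ^ k->dst_port ^ k->proto)
           & (FLOW_BUCKETS - 1);
}

static Flow *WorkerFlowLookup(WorkerFlowTable *t, const FlowKey *k)
{
    for (Flow *f = t->buckets[FlowKeyHash(k)]; f != NULL; f = f->next) {
        if (memcmp(&f->key, k, sizeof(*k)) == 0)
            return f;  /* no lock taken: the table belongs to this worker */
    }
    return NULL;
}

The interesting part is that the lookup never touches a mutex; the
refactoring cost you mention is in keeping the shared, locked variant
around for the non-workers runmodes.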
> > 
> > There may be a special case in IPS mode. For example, NFQ will soon
> > provide a CPU fanout mode where the worker is selected based on the
> > CPU. The idea is to have the NIC do the flow balancing. But this implies
> > that the return packet may arrive on a different CPU, depending on the
> > flow hash function used by the NIC.
> > We have the same behavior in af_packet IPS mode...
> 
> I think this can lead to some weird packet order problems. T1 inspects
> toserver, T2 toclient. If the T1 worker is held up for whatever reason,
> we may for example process ACKs in T2 for packets we've not processed in
> T1 yet. I'm pretty sure this won't work correctly.

In the case of IPS mode, does the inline stream engine depend on ACKed packets?

> This isn't limited to workers, btw; in autofp, when using multiple capture
> threads, we can have the same issue: one side of a connection getting
> ahead of the other.

Yes, I've observed this leading to strange behavior...

> Don't think we can solve this in Suricata itself, as the OS has a lot of
> liberty in scheduling threads. A full packet reordering module might
> work, but its performance impact would probably completely nix
> the gains from these capture methods.

Sure

> > In this case, we may want to disable the per-worker flow engine, which is
> > a really good idea for the other running modes.
> 
> Don't think it would be sufficient. The ordering problem won't be solved
> by it.

Yes, it may be interesting to study the hash functions used by NICs to see
whether they behave symmetrically. If they do, this should fix the issue
(at least for NFQ). I will have a look into it.
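
Just to illustrate what I mean by "symmetric": both directions of a flow
must hash to the same value, so request and reply packets land on the same
queue/worker. A toy sketch (plain C, not what a NIC actually implements;
hardware RSS typically uses a Toeplitz hash, which is not symmetric unless
the key is chosen for it):

#include <stdint.h>
#include <stdio.h>

/* Direction-independent flow hash: swapping (ip1, port1) with (ip2, port2)
 * yields the same value, because every term is XORed with its mirror. */
static uint32_t symmetric_flow_hash(uint32_t ip1, uint32_t ip2,
                                    uint16_t port1, uint16_t port2,
                                    uint8_t proto)
{
    return ip1 ^ ip2
         ^ (((uint32_t)port1 << 16) | port2)
         ^ (((uint32_t)port2 << 16) | port1)
         ^ proto;
}

int main(void)
{
    /* 10.0.0.1:1234 -> 10.0.0.2:80 and the reply direction */
    uint32_t a = symmetric_flow_hash(0x0a000001, 0x0a000002, 1234, 80, 6);
    uint32_t b = symmetric_flow_hash(0x0a000002, 0x0a000001, 80, 1234, 6);
    printf("%08x %08x -> %s\n", a, b,
           a == b ? "same queue" : "different queue");
    return 0;
}

If the NIC hash is not symmetric in this sense, the return packets of a flow
can be steered to a different CPU, which is exactly the problem described
above.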

BR,
-- 
Eric Leblond <eric at regit.org>
Blog: https://home.regit.org/
