[Oisf-devel] improve flow table in workers runmode
Victor Julien
victor at inliniac.net
Fri Apr 5 09:09:21 UTC 2013
(please don't top-post in discussions like this and also don't use HTML)
On 04/04/2013 02:44 PM, Vito Piserchia wrote:
> On Thu, Apr 4, 2013 at 12:10 PM, Victor Julien
> <victor at inliniac.net> wrote:
>
> On 04/04/2013 11:53 AM, Eric Leblond wrote:
> > On Thu, 2013-04-04 at 11:05 +0200, Victor Julien wrote:
> >> On 04/04/2013 10:25 AM, Eric Leblond wrote:
> >>> Hello,
> >>>
> >>> On Wed, 2013-04-03 at 15:40 +0200, Victor Julien wrote:
> >>>> On 04/03/2013 10:59 AM, Chris Wakelin wrote:
> >>>>> On 03/04/13 09:19, Victor Julien wrote:
> >>>>>>> On 04/03/2013 02:31 AM, Song liu wrote:
> >>>>>>>>>
> >>>>>>>>> Right now, all workers share one big flow table, and there
> >>>>>>>>> will be contention for it.
> >>>>>>>>> Assuming the network interface does flow-affine load
> >>>>>>>>> balancing, each worker will handle its own set of flows.
> >>>>>>>>> In that case, I think it makes more sense for each worker to
> >>>>>>>>> have its own flow table rather than one big table, to reduce
> >>>>>>>>> contention.
> >>>>>>>
> >>>>>>> We've discussed this before and I think it would make sense.
> >>>>>>> It does require quite a bit of refactoring though, especially
> >>>>>>> since we'd have to support the current setup as well for the
> >>>>>>> non-workers runmodes.
> >>>>>>>
> >>>>> It sounds like a good idea when things like PF_RING are supposed
> >>>>> to handle the flow affinity onto virtual interfaces for us
> >>>>> (PF_RING DNA + libzero clusters do, and there's the
> >>>>> PF_RING_DNA_SYMMETRIC_RSS flag for PF_RING DNA without libzero
> >>>>> and interfaces that support RSS).
> >>>>
> >>>> Actually, all workers implementations currently share the same
> >>>> assumption: flow-based load balancing in pf_ring, af_packet, nfq,
> >>>> etc. So I think it makes sense to have a flow engine per worker in
> >>>> all these cases.
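
To illustrate what a per-worker flow engine buys: a minimal sketch in C
with hypothetical names (this is not Suricata's actual flow code). With
flow-affine capture, a worker owns its table outright, so lookups need
no locks:

    #include <stddef.h>
    #include <stdint.h>

    #define FLOW_BUCKETS 65536

    /* Hypothetical flow record; real flows carry much more state. */
    typedef struct Flow_ {
        uint32_t hash;          /* flow hash of the 5-tuple */
        struct Flow_ *next;     /* bucket chain */
    } Flow;

    /* One table per worker thread: no other thread ever touches it,
     * so no bucket locks or atomics are needed. */
    typedef struct WorkerFlowTable_ {
        Flow *buckets[FLOW_BUCKETS];
    } WorkerFlowTable;

    static Flow *WorkerFlowLookup(WorkerFlowTable *ft, uint32_t hash)
    {
        Flow *f = ft->buckets[hash % FLOW_BUCKETS];
        while (f != NULL && f->hash != hash)
            f = f->next;
        return f;
    }

The shared-table variant would need at least a per-bucket lock around
that walk; dropping it is where the contention win comes from.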
> >>>
> >>> IPS mode may be a special case. For example, NFQ will soon provide
> >>> a CPU fanout mode where the worker is selected based on the CPU.
> >>> The idea is to have the NIC do the flow balancing. But this implies
> >>> that the return packet may arrive on a different CPU, depending on
> >>> the flow hash function used by the NIC.
> >>> We have the same behavior in af_packet IPS mode...
> >>
> >> I think this can lead to some weird packet order problems. T1
> >> inspects toserver, T2 toclient. If the T1 worker is held up for
> >> whatever reason, we may for example process ACKs in T2 for packets
> >> we've not processed in T1 yet. I'm pretty sure this won't work
> >> correctly.
> >
> > In the case of IPS mode, does inline streaming depend on ACKed packets?
>
> No, but the stream engine is written with the assumption that what we
> see is the order of the packets on the wire. TCP packets may still be
> out of order of course, but in that case the end host has to deal with
> it as well.
>
> In cases like window checks, sequence validation, SACK checks, etc., I
> can imagine problems. We'd possibly reject/accept packets in the stream
> handling that the end host will treat differently.
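
A hypothetical, heavily simplified illustration of the hazard (not
Suricata's actual stream code): a toserver in-window check whose verdict
silently depends on how far the thread handling toclient ACKs has
already advanced last_ack:

    #include <stdint.h>

    /* last_ack is advanced by toclient ACKs; win is the receive window.
     * If the toclient thread runs ahead and ACKs data the toserver
     * thread has not processed yet, a fresh segment can end up with
     * seq + len <= last_ack and be misjudged as already-ACKed old
     * data. */
    static int StreamSegInWindow(uint32_t seq, uint32_t len,
                                 uint32_t last_ack, uint32_t win)
    {
        /* signed sequence-space compares, safe across wraparound */
        return (int32_t)(seq + len - last_ack) > 0 &&
               (int32_t)(seq - (last_ack + win)) < 0;
    }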
>
> >
> >> This isn't limited to workers btw; in autofp, when using multiple
> >> capture threads, we can have the same issue: one side of a
> >> connection getting ahead of the other.
> >
> > Yes, I've observed this lead to strange behavior...
> >
> >> Don't think we can solve this in Suricata itself, as the OS has a
> >> lot of liberty in scheduling threads. A full packet reordering
> >> module would maybe work, but its performance impact would probably
> >> completely nix all gains from said capture methods.
> >
> > Sure
> >
> >>> In this case, we may want to disable the per-worker flow engine,
> >>> which is a really good idea for the other runmodes.
> >>
> >> Don't think it would be sufficient. The ordering problem won't be
> >> solved by it.
> >
> > Yes, it may be interesting to study a bit the hash functions used by
> > NICs to see if they behave symmetrically. In that case, this should
> > fix the issue (at least for NFQ). I will have a look into it.
>
> IMHO the key to success is having a symmetric RSS hash function. There
> are already experiments/studies about this, e.g.:
> http://www.ndsl.kaist.edu/~shinae/papers/TR-symRSS.pdf
Interesting, thanks.
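
The paper's core trick is a 40-byte Toeplitz key with a 16-bit period
(the byte pair 0x6d5a repeated). It can be checked with a small
standalone program; the Toeplitz loop follows the Microsoft RSS
pseudocode, and the tuple layout and addresses are just an example:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Symmetric RSS key from the TR-symRSS paper: 0x6d5a repeated. */
    static const uint8_t sym_key[40] = {
        0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
        0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
        0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
        0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
        0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a, 0x6d, 0x5a,
    };

    /* Standard Toeplitz hash: for each set input bit (MSB first),
     * XOR in the current 32-bit window of the key, then slide the
     * window one bit. */
    static uint32_t toeplitz(const uint8_t *key, const uint8_t *data,
                             size_t len)
    {
        uint32_t hash = 0;
        uint32_t window = ((uint32_t)key[0] << 24) |
                          ((uint32_t)key[1] << 16) |
                          ((uint32_t)key[2] << 8) | (uint32_t)key[3];
        size_t next_bit = 32;

        for (size_t i = 0; i < len; i++) {
            for (int b = 7; b >= 0; b--) {
                if (data[i] & (1u << b))
                    hash ^= window;
                window <<= 1;
                if (key[next_bit / 8] & (0x80u >> (next_bit % 8)))
                    window |= 1;
                next_bit++;
            }
        }
        return hash;
    }

    int main(void)
    {
        /* IPv4 RSS input: src ip, dst ip, src port, dst port. */
        uint8_t fwd[12] = { 10,0,0,1,  192,168,1,1,
                            0x1f,0x90, 0xc3,0x50 };
        uint8_t rev[12] = { 192,168,1,1,  10,0,0,1,
                            0xc3,0x50, 0x1f,0x90 };

        printf("fwd: %08" PRIx32 "\nrev: %08" PRIx32 "\n",
               toeplitz(sym_key, fwd, sizeof(fwd)),
               toeplitz(sym_key, rev, sizeof(rev)));
        return 0;
    }

Both directions print the same hash: swapping fields that sit a
multiple of 16 bits apart (32 bits for the IPs, 16 for the ports)
leaves the key window each input bit sees unchanged.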
> Obviously this could lead to unbalanced flow queues; think of
> long-standing flows which remain alive for a long time period... To
> account for this kind of situation, one could assign a group of
> processing threads to packets that arrive from the same RSS queue,
> losing, of course, the cache (and interrupt) affinity benefits in that
> case.
With our autofp mode this could be done. We could also consider a more
advanced autofp mode where, instead of one global load balancer over all
CPUs/threads, we'd have autofp-style load balancing over a select group
of threads that run on the same CPU.
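
A minimal sketch of that selection step, with illustrative names and a
hypothetical fixed group size (none of this is existing Suricata code):

    #include <stdint.h>

    #define DETECT_THREADS_PER_CPU 4   /* hypothetical group size */

    /* Balance a flow only across the detect threads pinned to the CPU
     * of the capture thread that received the packet: load is still
     * spread, but packets never leave their CPU's cache domain. */
    static int AutofpPickThread(uint32_t flow_hash, int capture_cpu)
    {
        int group_base = capture_cpu * DETECT_THREADS_PER_CPU;
        return group_base + (int)(flow_hash % DETECT_THREADS_PER_CPU);
    }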
--
---------------------------------------------
Victor Julien
http://www.inliniac.net/
PGP: http://www.inliniac.net/victorjulien.asc
---------------------------------------------