[Oisf-users] Performance on multiple CPUs

Gene Albin gene.albin at gmail.com
Mon Aug 15 16:00:05 UTC 2011


Anoop,
  Indeed.  With 48 CPUs, in both runmodes and in each max-pending-packets
category, I average the following across all of my runs:

Runmode: Auto
MPP     Avg PPS   StDev
50:     27160     590
500:    29969     1629
5000:   31267     356
50000:  31608     358

Runmode: AutoFP
MPP     Avg PPS   StDev
50:     16924     106
500:    56572     405
5000:   86683     1577
50000:  132936    5548


  Just reading over my email, I don't think I mentioned the variables that
I'm adjusting.  There are three: runmode, detect thread ratio (DTR), and max
pending packets (MPP).  Each run mentioned above is at a different DTR,
stepping through 0.1-1.0 and then 1.2, 1.5, 1.7, and 2.0.  I was expecting to
see something along the lines of Eric Leblond's results on his blog post:
http://home.regit.org/2011/02/more-about-suricata-multithread-performance/
but it doesn't look like changing the DTR gave me the significant performance
increase that he reported (most likely due to other differences in our .yaml
files, e.g. cpu_affinity).

  Thank you for the clarification on the relationship between MPP and the
cache.  That does clear things up a bit.  So you think I should be seeing
better performance with 48 CPUs than I'm currently getting?  Where do you
think I can make the improvements?  My first guess would be cpu_affinity,
but that's just a guess.
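
  In case it helps, this is the sort of cpu_affinity pinning I'd experiment
with (a sketch only, following the layout of the default yaml in my version;
the exact set names and syntax may differ in other versions):

    threading:
      set_cpu_affinity: yes
      cpu_affinity:
        management_cpu_set:
          cpu: [ 0 ]
        detect_cpu_set:
          cpu: [ "all" ]
          mode: "exclusive"
          prio:
            default: "high"
      detect_thread_ratio: 1.0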

  I don't mind filling in the table; however, I don't think the attachment
made it to my inbox.  Would you mind resending it?

Thanks,
Gene


On Mon, Aug 15, 2011 at 12:20 AM, Anoop Saldanha <poonaatsoc at gmail.com> wrote:

> On Sun, Aug 14, 2011 at 11:43 AM, Gene Albin <gene.albin at gmail.com> wrote:
> >
> > Anoop,
> >   With max-pending-packets set to 50,000 and 48 CPUs I get performance
> > around 135,000 packets/sec.  With MPP at 50,000 and only 4 CPUs I get
> > performance around 31,343 packets/sec.  Both of these are with
> > --runmode=autofp enabled.
> >
> > Interestingly enough, when I run 4 CPUs in autofp mode I get 31,343 pps,
> > and when I run 48 CPUs in auto mode I also get 31,343 pps.
> >
>
> You get this for all runs of auto + 48 CPUs?
>
> >
> >   I have to admit that I don't quite follow your explanation about the
> > thread usage below.  In layman's terms, how will this affect the
> > performance of suricata?
>
> They probably meant this: whenever a thread processes a packet, whatever
> data the thread needs to process that packet gets cached.  That is one
> thread, one packet.  Now say you have more packets.  At that packet
> processing rate you would have threads trying to load data for too many
> packets into the cache, which might lead to other threads overwriting the
> cache with their data.
>
> Either way, I really wouldn't worry about cache behaviour when increasing
> max pending packets.  The gain in consumption/processing rate from a higher
> max-pending-packets value is too large to be countered by any cache
> performance degradation.
>
> All this doesn't mean you can't obtain performance gains from cache usage.
> A lot of our performance improvements are based on writing cache-friendly
> code (more on locality of reference).  If you write code that understands
> cache usage, the benefits are tenfold.
>
> >
> >   In my case I seem to be getting great performance increases, but I
> can't see what downside there might be with the cache.
> >
>
> Yes, but with 48 cores we can extract even more performance out of the
> engine than what you are currently seeing, and cache may or may not have
> anything to do with it.  If there are cache performance issues, they reduce
> the maximum performance obtainable on 48 cores, and that reduced figure is
> what you are currently seeing as the throughput.  Even so, this lowered
> throughput is far greater than what you would otherwise have achieved using
> just 50 max pending packets.
>
> I hope that clears it up.
>
> I believe you are running some tests on suricata.  Whenever you run
> suricata in a particular config, can you fill in this table (I have
> attached it)?  When you are done filling it in, you can mail it.
>
>
> > Thanks,
> > Gene
> >
> > On Sat, Aug 13, 2011 at 10:05 PM, Anoop Saldanha <poonaatsoc at gmail.com>
> wrote:
> >>
> >>
> >> On Thu, Aug 11, 2011 at 12:24 AM, Gene Albin <gene.albin at gmail.com>
> wrote:
> >>>
> >>> So I'm running in autofp mode and I increased the max-pending-packets
> >>> from 50 to 500, then 5000, then 50000.  I saw a dramatic increase from
> >>> 50 to 500 (17,000 packets/sec at 450 sec to 57,000 pps at 140 sec),
> >>> not quite as dramatic from
> >>> 500 to 5000 (to 85,000 pps at 90 sec),
> >>> and about the same from
> >>> 5000 to 50000 (to 135,000 pps at 60 sec).
> >>> My question now is about the tradeoff mentioned in the config file.
> >>> It mentions negatively impacting caching.  How does it impact caching?
> >>> Will I see this when running pcaps or in live mode?
> >>> Thanks,
> >>> Gene
> >>
> >> Probably polluting the cache / breaking cache coherency for the data
> >> used by other packets.  Either way, I wouldn't second-guess the effects
> >> of cache usage when it comes to multiple threads ruining data loaded by
> >> some other thread.  I would just be interested in locality of reference
> >> with respect to the data used by one thread for whatever time slice it
> >> is on the CPU.
> >>
> >> ** I see that you have tested with max-pending-packets set to 50,000.
> Can you check how Suricata scales from 4 cpu cores to 32 cpu cores, with
> these 50,000 max-pending-packets, and post the results here?
> >>
> >>>
> >>> On Thu, Aug 4, 2011 at 1:07 PM, saldanha <poonaatsoc at gmail.com> wrote:
> >>>>
> >>>> On 08/03/2011 08:50 AM, Gene Albin wrote:
> >>>>
> >>>> So I just installed Suricata on one of our research computers with
> >>>> lots of cores available.  I'm looking to see what kind of performance
> >>>> boost I get as I bump up the CPUs.  After my first run I was surprised
> >>>> to see that I didn't get much of a boost when going from 8 to 32 CPUs.
> >>>> I was running a 6GB pcap file with about 17k rules loaded.  The first
> >>>> run on 8 cores took 190 sec.  The second run on 32 cores took 170 sec.
> >>>> Looks like something other than CPU is the bottleneck.
> >>>>
> >>>> My first guess is Disk IO.  Any recommendations on how I could
> check/verify that guess?
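> >>>>
> >>>> (A rough way to check, assuming the sysstat tools are available, would
> >>>> be to watch the disks while the pcap run is going, e.g.:
> >>>>
> >>>>   iostat -x 5    # %util near 100 on the pcap's disk suggests IO-bound
> >>>>   vmstat 5       # a high 'wa' (iowait) column also points at disk
> >>>>
> >>>> but I'd welcome better suggestions.)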
> >>>>
> >>>> Gene
> >>>>
> >>>> --
> >>>> Gene Albin
> >>>> gene.albin at gmail.com
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> * forgot to reply to the list previously
> >>>>
> >>>> Hey Gene.
> >>>>
> >>>> Can you test by increasing the max-pending-packets in the
> >>>> suricata.yaml file to a higher value?  You can try one run with a
> >>>> value of 500 and then try higher values (2000+ suggested; the more the
> >>>> better, as long as you don't hit swap).
> >>>>
> >>>> Once you have set a higher max-pending-packets you can try running
> >>>> suricata in the autofp runmode, which runs suricata in flow-pinned
> >>>> mode.  To do this, add the option "--runmode=autofp" to your suricata
> >>>> command line:
> >>>>
> >>>> sudo suricata -c ./suricata.yaml -r your_pcap.pcap --runmode=autofp
> >>>>
> >>>> With max-pending-packets set to a higher value and with
> --runmode=autofp, you can test how suricata scales from 4 to 32 cores.
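> >>>>
> >>>> One way to run that scaling test (just a sketch; the core lists below
> >>>> are only examples) is to pin the suricata process to N cores with
> >>>> taskset and time each run:
> >>>>
> >>>>   sudo taskset -c 0-3  suricata -c ./suricata.yaml -r your_pcap.pcap --runmode=autofp   # 4 cores
> >>>>   sudo taskset -c 0-31 suricata -c ./suricata.yaml -r your_pcap.pcap --runmode=autofp   # 32 cores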
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Gene Albin
> >>> gene.albin at gmail.com
> >>>
> >>>
> >>>
> >>
> >>
> >>
> >> --
> >> Anoop Saldanha
> >>
> >
> >
> >
> > --
> > Gene Albin
> > gene.albin at gmail.com
> >
>
>
>
> --
> Anoop Saldanha
>



-- 
Gene Albin
gene.albin at gmail.com