Anoop,

Indeed. With 48 CPUs, in both runmodes and in each max-pending-packets category, I average the following across all of my runs:

Runmode: Auto

MPP      Avg PPS   StDev
50       27160     590
500      29969     1629
5000     31267     356
50000    31608     358

Runmode: AutoFP

MPP      Avg PPS   StDev
50       16924     106
500      56572     405
5000     86683     1577
50000    132936    5548

Reading back over my email, I don't think I mentioned the variables I'm adjusting. There are three: Runmode, Detect Thread Ratio (DTR), and Max Pending Packets (MPP). Each run mentioned above is at a different DTR, from 0.1 to 1.0, then 1.2, 1.5, 1.7, and 2.0. I was expecting to see something along the lines of Eric Leblond's results in his blog post, http://home.regit.org/2011/02/more-about-suricata-multithread-performance/, but it doesn't look like changing the DTR gave me the significant performance increase he reported (most likely due to other differences in our .yaml files, e.g. cpu_affinity).
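
Roughly, the sweep looks like the sketch below. The yaml key names (max-pending-packets, detect-thread-ratio, which may be spelled with underscores in older versions), the file paths, and the 0.1 steps through the low DTR values are my assumptions, so treat it as an outline rather than a drop-in script:

    #!/bin/sh
    # Rough outline of the sweep: for each runmode / detect-thread-ratio /
    # max-pending-packets combination, patch a working copy of suricata.yaml,
    # replay the pcap, and append the wall time to a results file.
    PCAP=your_pcap.pcap          # placeholder
    YAML=./suricata.yaml         # working copy, not the installed config

    for RUNMODE in auto autofp; do
      for DTR in 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.2 1.5 1.7 2.0; do
        for MPP in 50 500 5000 50000; do
          sed -i "s/^max-pending-packets:.*/max-pending-packets: $MPP/" "$YAML"
          sed -i "s/detect-thread-ratio:.*/detect-thread-ratio: $DTR/" "$YAML"
          /usr/bin/time -a -o sweep-results.txt -f "$RUNMODE dtr=$DTR mpp=$MPP elapsed=%es" \
            suricata -c "$YAML" -r "$PCAP" --runmode="$RUNMODE" > /dev/null 2>&1
        done
      done
    done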

Thank you for the clarification on the relationship between MPP and the cache. That does clear things up a bit. So you think I should be seeing better performance with 48 CPUs than I'm currently getting? Where do you think I can make the improvements? My first guess would be cpu_affinity, but that's just a guess.

I don't mind filling in the table, but I don't think the attachment made it to my inbox. Would you mind resending?

Thanks,
Gene


On Mon, Aug 15, 2011 at 12:20 AM, Anoop Saldanha <poonaatsoc@gmail.com> wrote:
On Sun, Aug 14, 2011 at 11:43 AM, Gene Albin <gene.albin@gmail.com> wrote:
>
> Anoop,
> With max-pending-packets set to 50,000 and 48 CPUs I get performance around 135,000 packets/sec. With MPP at 50,000 and only 4 CPUs I get performance around 31,343 packets/sec. Both of these are with --runmode=autofp enabled.
>
> Interestingly enough, when I run 4 CPUs in autofp mode I get 31,343 pps, and when I run 48 CPUs in auto mode I also get 31,343 pps.
>

You get this for all runs of auto + 48 CPUs?

>
> I have to admit that I don't quite follow your explanation about the thread usage below. In layman's terms, how will this affect the performance of Suricata?
They probably meant this: whenever a thread processes a packet, whatever data the thread needs to process that packet gets pulled into the cache. Now, that is one thread, one packet. Let's say you have many more packets in flight. At that packet processing rate you would have threads trying to load data for too many packets into the cache, which might lead to other threads overwriting the cache with their data.

Either way, I really wouldn't worry about cache behaviour when increasing max-pending-packets. The gain in consumption/processing rate with a greater max-pending-packets is too high to be countered by any cache performance degradation.

All this doesn't mean you can't obtain performance through cache usage. A lot of our performance improvements are based on writing cache-friendly code (more on locality of reference). If you write code that understands cache usage, the benefit is tenfold.
<div class="im"><br>
><br>
> In my case I seem to be getting great performance increases, but I can't see what downside there might be with the cache.<br>
><br>
<br>
Yes, but with 48 cores we can extract even more performance out of the engine than what you are currently seeing, and cache may or may not have anything to do with it. So if there are any cache performance issues, they reduce the maximum performance obtainable on 48 cores, and that reduced performance is what you are currently seeing as your throughput; but even this lowered throughput is far greater than what you would otherwise have achieved using just 50 max-pending-packets.

I hope that clears it up.

I believe you are running some tests on Suricata. Whenever you run Suricata in a particular config, can you fill in this table (I have attached it)? When you are done filling it in, you can mail it back.

> Thanks,
> Gene
>
> On Sat, Aug 13, 2011 at 10:05 PM, Anoop Saldanha <poonaatsoc@gmail.com> wrote:
>>
>> On Thu, Aug 11, 2011 at 12:24 AM, Gene Albin <gene.albin@gmail.com> wrote:
>>>
>>> So I'm running in autofp mode and I increased max-pending-packets from 50 to 500, then 5000, then 50000. I saw a dramatic increase from:
>>> 50 to 500 (17000 packets/sec @ 450s to 57000 pps @ 140s)
>>> not quite as dramatic from:
>>> 500 to 5000 (to 85000 pps @ 90s)
>>> and about the same from:
>>> 5000 to 50000 (to 135000 pps @ 60s)
>>> My question now is about the tradeoff mentioned in the config file, which mentions negatively impacting caching. How does it impact caching? Will I see this when running pcaps or in live mode?
>>> Thanks,
>>> Gene
>>
>> Probably polluting the cache / breaking cache coherency for the data used by other packets. Either way, I wouldn't second-guess the effects of cache usage when it comes to multiple threads possibly ruining data loaded by some other thread. I would just be interested in locality of reference with respect to the data used by one thread for whatever time slice it is on the CPU.
>>
>> ** I see that you have tested with max-pending-packets set to 50,000. Can you check how Suricata scales from 4 CPU cores to 32 CPU cores, with these 50,000 max-pending-packets, and post the results here?
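
One way such a scaling run could be sketched, assuming it is acceptable to cap the CPU set with taskset (from util-linux) rather than booting with fewer cores; note that Suricata will likely still size its thread pools from the machine's full CPU count, so this caps where the threads can run rather than how many are created. Paths and core numbering are placeholders:

    # Hypothetical sketch: replay the same pcap while confining suricata to
    # progressively larger CPU sets (cores assumed to be numbered 0..47).
    for CORES in 0-3 0-7 0-15 0-31; do
        /usr/bin/time -f "cpus=$CORES elapsed=%es" \
            taskset -c "$CORES" suricata -c ./suricata.yaml -r your_pcap.pcap --runmode=autofp
    done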
>>
>>>
>>> On Thu, Aug 4, 2011 at 1:07 PM, saldanha <poonaatsoc@gmail.com> wrote:
>>>>
>>>> On 08/03/2011 08:50 AM, Gene Albin wrote:
>>>>
>>>> So I just installed Suricata on one of our research computers with lots of cores available. I'm looking to see what kind of performance boost I get as I bump up the CPUs. After my first run I was surprised to see that I didn't get much of a boost when going from 8 to 32 CPUs. I was running a 6GB pcap file with about 17k rules loaded. The first run on 8 cores took 190 seconds. The second run on 32 cores took 170 seconds. Looks like something other than CPU is the bottleneck.
>>>>
>>>> My first guess is disk I/O. Any recommendations on how I could check/verify that guess?
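
A few standard ways to sanity-check the disk I/O guess while a run is in progress, as a sketch only (iostat and pidstat come from the sysstat package; exact options and column names vary by version, and the pcap path is a placeholder):

    # Whole-device view: %util pegged near 100 while suricata runs points at
    # the disk rather than the CPUs.
    iostat -xm 5

    # The 'wa' (iowait) and 'b' (blocked processes) columns climbing tell a
    # similar story.
    vmstat 5

    # Read throughput of the suricata process itself.
    pidstat -d -p "$(pidof suricata)" 5

    # Cheap cross-check: if a plain sequential read of the pcap finishes far
    # faster than the suricata run does, raw disk read speed is unlikely to be
    # the bottleneck.
    dd if=your_pcap.pcap of=/dev/null bs=1M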
>>>>
>>>> Gene
>>>>
>>>> --
>>>> Gene Albin
>>>> gene.albin@gmail.com
>>>>
>>>> * forgot to reply to the list previously
>>>>
>>>> Hey Gene.
>>>>
>>>> Can you test by increasing the max-pending-packets in the suricata.yaml file to a higher value? You can try one run with a value of 500 and then try higher values (2000+ suggested; the more the better, as long as you don't hit swap).
>>>>
>>>> Once you have set a higher max-pending-packets, you can try running Suricata in the autofp runmode. autofp runs Suricata in flow-pinned mode. To do this, add the option "--runmode=autofp" to your suricata command line:
>>>>
>>>> sudo suricata -c ./suricata.yaml -r your_pcap.pcap --runmode=autofp
>>>>
>>>> With max-pending-packets set to a higher value and with --runmode=autofp, you can test how Suricata scales from 4 to 32 cores.
>>>>
>>>
>>> --
>>> Gene Albin
>>> gene.albin@gmail.com
>>>
>>
>> --
>> Anoop Saldanha
>>
>
> --
> Gene Albin
> gene.albin@gmail.com
>

--
Anoop Saldanha


--
Gene Albin
gene.albin@gmail.com