Anoop,

Indeed. With 48 CPUs, in both runmodes and in each max-pending-packets category, I average the following across all of my runs:

Runmode: Auto

MPP      Avg PPS   StDev
50       27160     590
500      29969     1629
5000     31267     356
50000    31608     358

Runmode: AutoFP

MPP      Avg PPS   StDev
50       16924     106
500      56572     405
5000     86683     1577
50000    132936    5548

Reading back over my email, I don't think I mentioned the variables I'm adjusting. There are three: Runmode, Detect Thread Ratio (DTR), and Max Pending Packets (MPP). Each run mentioned above is at a different DTR, from 0.1 to 1.0, then 1.2, 1.5, 1.7, and 2.0. I was expecting to see something along the lines of Eric Leblond's results in his blog post, http://home.regit.org/2011/02/more-about-suricata-multithread-performance/, but it doesn't look like changing the DTR gave me the significant performance increase he reported (most likely due to other differences in our .yaml files, e.g. cpu_affinity).
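
Roughly, the sweep looks like the sketch below. The yaml key names (max-pending-packets, detect-thread-ratio, which may be spelled with underscores in older versions), the file paths, and the 0.1 steps through the low DTR values are my assumptions, so treat it as an outline rather than a drop-in script:

    #!/bin/sh
    # Rough outline of the sweep: for each runmode / detect-thread-ratio /
    # max-pending-packets combination, patch a working copy of suricata.yaml,
    # replay the pcap, and append the wall time to a results file.
    PCAP=your_pcap.pcap          # placeholder
    YAML=./suricata.yaml         # working copy, not the installed config

    for RUNMODE in auto autofp; do
      for DTR in 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.2 1.5 1.7 2.0; do
        for MPP in 50 500 5000 50000; do
          sed -i "s/^max-pending-packets:.*/max-pending-packets: $MPP/" "$YAML"
          sed -i "s/detect-thread-ratio:.*/detect-thread-ratio: $DTR/" "$YAML"
          /usr/bin/time -a -o sweep-results.txt -f "$RUNMODE dtr=$DTR mpp=$MPP elapsed=%es" \
            suricata -c "$YAML" -r "$PCAP" --runmode="$RUNMODE" > /dev/null 2>&1
        done
      done
    done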

Thank you for the clarification on the relationship between MPP and the cache. That does clear things up a bit. So you think I should be seeing better performance with 48 CPUs than I'm currently getting? Where do you think I can make the improvements? My first guess would be cpu_affinity, but that's just a guess.

I don't mind filling in the table, but I don't think the attachment made it to my inbox. Would you mind resending?

Thanks,
Gene


On Mon, Aug 15, 2011 at 12:20 AM, Anoop Saldanha <poonaatsoc@gmail.com> wrote:
On Sun, Aug 14, 2011 at 11:43 AM, Gene Albin <gene.albin@gmail.com> wrote:
>
> Anoop,
> With max-pending-packets set to 50,000 and 48 CPUs I get performance around 135,000 packets/sec. With MPP at 50,000 and only 4 CPUs I get performance around 31,343 packets/sec. Both of these are with --runmode=autofp enabled.
>
> Interestingly enough, when I run 4 CPUs in autofp mode I get 31,343 pps, and when I run 48 CPUs in auto mode I also get 31,343 pps.
>

You get this for all runs of auto + 48 CPUs?

>
> I have to admit that I don't quite follow your explanation about the thread usage below. In layman's terms, how will this affect the performance of Suricata?
They probably meant this: whenever a thread processes a packet, whatever data the thread needs to process that packet gets pulled into the cache. Now, that is one thread, one packet. Let's say you have many more packets in flight. At that packet processing rate you would have threads trying to load data for too many packets into the cache, which might lead to other threads overwriting the cache with their data.

Either way, I really wouldn't worry about cache behaviour when increasing max-pending-packets. The gain in consumption/processing rate with a greater max-pending-packets is too high to be countered by any cache performance degradation.

All this doesn't mean you can't obtain performance through cache usage. A lot of our performance improvements are based on writing cache-friendly code (more on locality of reference). If you write code that understands cache usage, the benefit is tenfold.
<div class="im"><br>
><br>
> In my case I seem to be getting great performance increases, but I can't see what downside there might be with the cache.<br>
><br>
<br>
Yes, but with 48 cores we can extract even more performance out of the engine than what you are currently seeing, and cache may or may not have anything to do with it. So if there are any cache performance issues, they reduce the maximum performance obtainable on 48 cores, and that reduced performance is what you are currently seeing as your throughput; but even this lowered throughput is far greater than what you would otherwise have achieved using just 50 max-pending-packets.

I hope that clears it up.

I believe you are running some tests on Suricata. Whenever you run Suricata in a particular config, can you fill in this table (I have attached it)? When you are done filling it in, you can mail it back.

> Thanks,
> Gene
>
> On Sat, Aug 13, 2011 at 10:05 PM, Anoop Saldanha <poonaatsoc@gmail.com> wrote:
>>
>> On Thu, Aug 11, 2011 at 12:24 AM, Gene Albin <gene.albin@gmail.com> wrote:
>>>
>>> So I'm running in autofp mode and I increased max-pending-packets from 50 to 500, then 5000, then 50000. I saw a dramatic increase from:
>>> 50 to 500 (17000 packets/sec @ 450s to 57000 pps @ 140s)
>>> not quite as dramatic from:
>>> 500 to 5000 (to 85000 pps @ 90s)
>>> and about the same from:
>>> 5000 to 50000 (to 135000 pps @ 60s)
>>> My question now is about the tradeoff mentioned in the config file, which mentions negatively impacting caching. How does it impact caching? Will I see this when running pcaps or in live mode?
>>> Thanks,
>>> Gene
>>
>> Probably polluting the cache / breaking cache coherency for the data used by other packets. Either way, I wouldn't second-guess the effects of cache usage when it comes to multiple threads possibly ruining data loaded by some other thread. I would just be interested in locality of reference with respect to the data used by one thread for whatever time slice it is on the CPU.
>>
>> ** I see that you have tested with max-pending-packets set to 50,000. Can you check how Suricata scales from 4 CPU cores to 32 CPU cores, with these 50,000 max-pending-packets, and post the results here?
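
One way such a scaling run could be sketched, assuming it is acceptable to cap the CPU set with taskset (from util-linux) rather than booting with fewer cores; note that Suricata will likely still size its thread pools from the machine's full CPU count, so this caps where the threads can run rather than how many are created. Paths and core numbering are placeholders:

    # Hypothetical sketch: replay the same pcap while confining suricata to
    # progressively larger CPU sets (cores assumed to be numbered 0..47).
    for CORES in 0-3 0-7 0-15 0-31; do
        /usr/bin/time -f "cpus=$CORES elapsed=%es" \
            taskset -c "$CORES" suricata -c ./suricata.yaml -r your_pcap.pcap --runmode=autofp
    done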
>>
>>>
>>> On Thu, Aug 4, 2011 at 1:07 PM, saldanha <poonaatsoc@gmail.com> wrote:
>>>>
>>>> On 08/03/2011 08:50 AM, Gene Albin wrote:
>>>>
>>>> So I just installed Suricata on one of our research computers with lots of cores available. I'm looking to see what kind of performance boost I get as I bump up the CPUs. After my first run I was surprised to see that I didn't get much of a boost when going from 8 to 32 CPUs. I was running a 6GB pcap file with about 17k rules loaded. The first run on 8 cores took 190 seconds. The second run on 32 cores took 170 seconds. Looks like something other than CPU is the bottleneck.
>>>>
>>>> My first guess is disk I/O. Any recommendations on how I could check/verify that guess?
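
A few standard ways to sanity-check the disk I/O guess while a run is in progress, as a sketch only (iostat and pidstat come from the sysstat package; exact options and column names vary by version, and the pcap path is a placeholder):

    # Whole-device view: %util pegged near 100 while suricata runs points at
    # the disk rather than the CPUs.
    iostat -xm 5

    # The 'wa' (iowait) and 'b' (blocked processes) columns climbing tell a
    # similar story.
    vmstat 5

    # Read throughput of the suricata process itself.
    pidstat -d -p "$(pidof suricata)" 5

    # Cheap cross-check: if a plain sequential read of the pcap finishes far
    # faster than the suricata run does, raw disk read speed is unlikely to be
    # the bottleneck.
    dd if=your_pcap.pcap of=/dev/null bs=1M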
>>>>
>>>> Gene
>>>>
>>>> --
>>>> Gene Albin
>>>> gene.albin@gmail.com
>>>>
>>>> * forgot to reply to the list previously
>>>>
>>>> Hey Gene.
>>>>
>>>> Can you test by increasing the max-pending-packets in the suricata.yaml file to a higher value? You can try one run with a value of 500 and then try higher values (2000+ suggested; the more the better, as long as you don't hit swap).
>>>>
>>>> Once you have set a higher max-pending-packets, you can try running Suricata in the autofp runmode. autofp runs Suricata in flow-pinned mode. To do this, add the option "--runmode=autofp" to your suricata command line:
>>>>
>>>> sudo suricata -c ./suricata.yaml -r your_pcap.pcap --runmode=autofp
>>>>
>>>> With max-pending-packets set to a higher value and with --runmode=autofp, you can test how Suricata scales from 4 to 32 cores.
>>>>
>>>
>>> --
>>> Gene Albin
>>> gene.albin@gmail.com
>>>
>>
>> --
>> Anoop Saldanha
>>
>
> --
> Gene Albin
> gene.albin@gmail.com
>

--
Anoop Saldanha


--
Gene Albin
gene.albin@gmail.com