<html>

  <head>

    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    What memory settings are folks using in the config file with this

    branch, and what was the resulting total memory usage relative to

    the number of workers configured?<br>

    <br>

    On 12/9/2015 2:11 PM, Cooper F. Nelson wrote:<br>

    <blockquote type="cite">On 12/9/2015 5:36 AM, Victor Julien wrote:<br>

      > Our main performance hit is in the multi pattern matching (mpm)

      stage.<br>

      > We've used a skip based algorithm in the past (b2g is still

      in our<br>

      > tree), but performance with AC is quite a lot better.

      Generally the<br>

      > problem for IDS patterns is that they are of poor quality,

      many 1 and<br>

      > 2 byte patterns. These defeat the skip based algo's. Another

      issue<br>

      > that is important to us is the worst-case performance. The

      skip based<br>

      > algo's seem to have a worse worst case profile.<br>

      <br>
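      To make the short-pattern problem concrete, here is a minimal sketch<br>
      of a set-wise Horspool scanner. It is an illustration only, not<br>
      Suricata's b2g code, and the names build_shift and horspool_search<br>
      are made up for this example. The point it shows: the scan window can<br>
      be no longer than the shortest pattern, so a single 1-byte pattern<br>
      caps every skip at one byte.<br>
      <pre>
```python
def build_shift(patterns):
    """Build a bad-character shift table for a set of byte patterns.

    The window length m is the length of the shortest pattern, so no
    shift can ever exceed m.
    """
    m = min(len(p) for p in patterns)
    shift = {}
    for p in patterns:
        # Only the first m - 1 bytes of each pattern contribute shifts.
        for i in range(m - 1):
            shift[p[i]] = min(shift.get(p[i], m), m - 1 - i)
    return m, shift


def horspool_search(text, patterns):
    """Return (matches, windows): match positions and window checks made."""
    m, shift = build_shift(patterns)
    matches, windows, pos = [], 0, 0
    while len(text) - pos >= m:
        windows += 1
        # Naive verification of every pattern at the current window start.
        for p in patterns:
            if text.startswith(p, pos):
                matches.append((pos, p))
        # Skip ahead based on the last byte of the window (default: m).
        pos += shift.get(text[pos + m - 1], m)
    return matches, windows
```
      </pre>
      With 64 filler bytes followed by "abcdefgh" as input, two 8-byte<br>
      patterns need 9 window checks in this sketch. Add one 1-byte pattern<br>
      to the set and the same input needs 72, one per byte.<br>
      <br>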

      As I was walking to the pub last night I remembered that Suricata<br>

      migrated to AC some time ago!  Thanks for the details regardless,

      it's<br>

      very interesting.<br>

      <br>
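      For comparison, a minimal Aho-Corasick sketch follows. It is a<br>
      textbook version written for this thread, not Suricata's AC<br>
      implementation, and build_automaton and ac_search are hypothetical<br>
      names. It shows why short patterns do not hurt AC: each input byte<br>
      is consumed once, whatever the pattern lengths are.<br>
      <pre>
```python
from collections import deque


def build_automaton(patterns):
    """Build goto, failure and output tables for a set of byte patterns."""
    goto, fail, out = [{}], [0], [[]]
    # Phase 1: insert every pattern into a trie.
    for p in patterns:
        s = 0
        for b in p:
            if b not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][b] = len(goto) - 1
            s = goto[s][b]
        out[s].append(p)
    # Phase 2: breadth-first pass to fill in failure links.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for b, t in goto[s].items():
            f = fail[s]
            while f and b not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(b, 0)
            # Inherit patterns that end at the failure state.
            out[t] = out[t] + out[fail[t]]
            queue.append(t)
    return goto, fail, out


def ac_search(text, automaton):
    """Return a list of (start_offset, pattern) for every match in text."""
    goto, fail, out = automaton
    s, matches = 0, []
    for i, b in enumerate(text):
        # Follow failure links until a transition on b exists (or root).
        while s and b not in goto[s]:
            s = fail[s]
        s = goto[s].get(b, 0)
        for p in out[s]:
            matches.append((i - len(p) + 1, p))
    return matches
```
      </pre>
      The 1-byte pattern "s" in a set such as he/she/his/hers/s is just<br>
      one more trie node here; it does not change how far the scanner<br>
      advances per step.<br>
      <br>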

      > Btw, I recently saw a new paper on a mix of AC and skip based

      approach<br>

      > that I still have to take a deeper look at:<br>

      >

      <a class="moz-txt-link-freetext" href="http://halcyon.usc.edu/~pk/prasannawebsite/papers/HeadBody_camera.pdf">http://halcyon.usc.edu/~pk/prasannawebsite/papers/HeadBody_camera.pdf</a><br>

      <br>

      From the paper:<br>

      <br>

      > In [14], a throughput of 7.5 Gbps was achieved<br>

      > using 32 processors in a Cray XMT supercomputer. There<br>

      > is yet a cost-efficient DBSM solution capable of matching<br>

      > 10 Gbps traffic against several thousand strings on a

      multicore<br>

      > platform.<br>

      <br>

      I've been doing this for years with suricata and a small bpf

      filter, on<br>

      a 16 core (actually 8 w/hyper-threading) Xeon server.  Over 20k<br>

      signatures, too.  I will admit the EmergingThreats guys do a

      fabulous<br>

      job of optimizing their signatures for efficiency.<br>

      <br>

      As I mentioned previously the actual suricata process is only

      processing<br>

      a fraction of the original packets, but if you are primarily

      interested<br>

      in matching against HTTP headers I don't particularly see the

      value of a<br>

      full DPI solution.  Particularly when you allow services like

      Netflix<br>

      and Youtube on your network.<br>

      <br>

      The real bottleneck on all modern multi-core von Neumann-style<br>

      architectures is memory (particularly cache memory) I/O.  So this

      is<br>

      less a problem with the performance of the pattern-matching

      engine than<br>

      an issue with the memory pressure put on the various core

      sub-systems<br>

      by attempting to match against full TCP flows.  The authors allude

      to<br>

      this at points, however I think if they ran better performance

      counters<br>

      this would be more obvious.<br>

      <br>

      The tl;dr is that what they are discussing *is* possible if you<br>

      pre-process your IP traffic via an efficient byte-based pattern

      matcher<br>

      like bpf.  This is why packet filters were invented, in fact.<br>

      <br>

      I guess it's possible that they are already working with sampled

      traffic,<br>

      but I doubt it.<br>

      <br>

      > Finally, we should start experimenting with Intel's Hyperscan

      soon.<br>

      > They claim much better perf, so we will see :)<br>

      <br>

      Ok now this is interesting and a new thing for me.  My next

      question for<br>

      you was if you were still looking at using SSE for

      pattern-matching.<br>

      Especially in the context of Aho-Corasick, as I would think it

      would be<br>

      possible to analyze multiple flows/packets/patterns in parallel

      via a<br>

      SIMD approach.  Great to see this is open-source, too.<br>

      <br>

      There is a concern that SSE breaks hyperthreading to an extent, in

      that<br>

      hyper-threaded cores share a single FP/SSE execution pipeline. 

      However,<br>

      I would think the performance benefits afforded by vectorizing the<br>

      regexp process would exceed any losses incurred by losing 1-2<br>

      traditional integer pipelines.<br>

      <br>

      Anyway, this is fabulously exciting and I would be willing and able

      to<br>

      help test this once available.<br>

      <br>

    </blockquote>

    <span style="white-space: pre;">> _______________________________________________

> Suricata IDS Users mailing list: <a class="moz-txt-link-abbreviated" href="mailto:oisf-users@openinfosecfoundation.org">oisf-users@openinfosecfoundation.org</a>

> Site: <a class="moz-txt-link-freetext" href="http://suricata-ids.org">http://suricata-ids.org</a> | Support: <a class="moz-txt-link-freetext" href="http://suricata-ids.org/support/">http://suricata-ids.org/support/</a>

> List: <a class="moz-txt-link-freetext" href="https://lists.openinfosecfoundation.org/mailman/listinfo/oisf-users">https://lists.openinfosecfoundation.org/mailman/listinfo/oisf-users</a>

> Suricata User Conference November 4 & 5 in Barcelona: <a class="moz-txt-link-freetext" href="http://oisfevents.net">http://oisfevents.net</a></span><br>

    <br>

    <br>

  </body>

</html>