[Oisf-devel] [Discussion] Suricata 1.0.0 released

Anoop Saldanha poonaatsoc at gmail.com
Sat Jul 3 16:05:46 UTC 2010

Hey Martin,

On Fri, Jul 2, 2010 at 7:39 PM, Martin Holste <mcholste at gmail.com> wrote:

> Congrats!
> Can you briefly describe the current state of the CUDA support?  Is it
> refined enough that performance is better when using CUDA?  I believe
> that in the early stages of development it was stated that non-CUDA
> was still faster than using CUDA.

We got the batching support in, but it isn't perfect.  It still needs a bit of
work to beat the CPU.  We know where our perf bottlenecks are now.  Here is
the agenda for upcoming CUDA work (looks like this patch didn't get in):

 * \todo
 *       - Make cuda parameters user configurable.
 *       - Implement a gpu version of aho-corasick.  That should get rid of a
 *         lot of post processing and pattern_chopping, and we don't have to
 *         deal with one or two byte patterns.
 *       - Currently a lot of packets (~17k) are getting stuck on the
 *         thread, which is a major bottleneck.  Introduce bypass detection
 *         threads for these 17k non buffered packets and check how the alerts
 *         are affected by this (out of sequence handling by detection threads).
 *       - Use texture/shared memory.  This should be handled along with AC.
 *       - Test the use of host-alloced page locked memory.
 *       - Test other optimizations like using the sgh held in the flow(if
 *         present in the flow), instead of retrieving the sgh inside the
 *         thread.
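
To make the aho-corasick item concrete, here is a minimal CPU-side sketch of the data structure in C.  It is not the Suricata code and not a GPU kernel; the names (AcCtx, AcAddPattern, AcCompile, AcSearch) and the size limits are made up for illustration.  What it shows is that once the automaton is compiled into a full transition table, a buffer is scanned in a single pass with one table lookup per byte, and patterns of any length (including one or two bytes) are handled the same way, so nothing has to be chopped or post-processed.

```c
#include <assert.h>
#include <string.h>

#define AC_MAX_STATES 64   /* hypothetical cap, enough for a small demo */
#define AC_ALPHABET   256  /* one column per byte value */

typedef struct AcCtx_ {
    int next[AC_MAX_STATES][AC_ALPHABET]; /* trie first, full DFA after AcCompile */
    int fail[AC_MAX_STATES];              /* failure links */
    unsigned out[AC_MAX_STATES];          /* bitmask of pattern ids ending in a state */
    int nstates;
} AcCtx;

static void AcInit(AcCtx *ac)
{
    memset(ac, 0, sizeof(*ac));
    memset(ac->next[0], -1, sizeof(ac->next[0])); /* -1 marks "no transition yet" */
    ac->nstates = 1;                              /* state 0 is the root */
}

/* Add one pattern with id 0..31.  Returns 0 on success, -1 if the table is full. */
static int AcAddPattern(AcCtx *ac, const unsigned char *pat, size_t len, unsigned id)
{
    int s = 0;
    assert(id < 32);
    for (size_t i = 0; i < len; i++) {
        if (ac->next[s][pat[i]] < 0) {
            if (ac->nstates == AC_MAX_STATES)
                return -1;
            memset(ac->next[ac->nstates], -1, sizeof(ac->next[0]));
            ac->next[s][pat[i]] = ac->nstates++;
        }
        s = ac->next[s][pat[i]];
    }
    ac->out[s] |= 1u << id;
    return 0;
}

/* Breadth-first pass: compute failure links and fill every missing
 * transition, so that the search loop never has to follow a failure link. */
static void AcCompile(AcCtx *ac)
{
    int queue[AC_MAX_STATES];
    int head = 0, tail = 0;

    for (int c = 0; c < AC_ALPHABET; c++) {
        int t = ac->next[0][c];
        if (t < 0) {
            ac->next[0][c] = 0;
        } else {
            ac->fail[t] = 0;
            queue[tail++] = t;
        }
    }
    while (head < tail) {
        int s = queue[head++];
        ac->out[s] |= ac->out[ac->fail[s]]; /* inherit matches of the longest suffix */
        for (int c = 0; c < AC_ALPHABET; c++) {
            int t = ac->next[s][c];
            if (t < 0) {
                ac->next[s][c] = ac->next[ac->fail[s]][c];
            } else {
                ac->fail[t] = ac->next[ac->fail[s]][c];
                queue[tail++] = t;
            }
        }
    }
}

/* Scan a buffer in one pass.  Returns the bitmask of all pattern ids seen. */
static unsigned AcSearch(const AcCtx *ac, const unsigned char *buf, size_t len)
{
    unsigned matched = 0;
    int s = 0;
    for (size_t i = 0; i < len; i++) {
        s = ac->next[s][buf[i]];
        matched |= ac->out[s];
    }
    return matched;
}
```

Usage is AcInit, then AcAddPattern for each pattern, then AcCompile once, then AcSearch per buffer.  The search loop reads only the next and out tables, which is presumably why the texture/shared memory item is grouped with AC in the list above.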

Another thing I missed in this list is buffering packets based on the
remaining available space in the buffer.  Currently we just keep a count of
the packets buffered and use a threshold on this count, after which
we transfer the packets to the GPU.
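
As an illustration of space-based buffering, here is a minimal sketch in C.  It is not the Suricata implementation; PacketBatch, BatchAdd, BatchFlush and both size constants are hypothetical names and values, and the flush function only resets the batch where a real one would copy it to the GPU.  The point is that the decision to transfer depends on whether the next payload still fits, not only on how many packets have been staged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sizes, chosen only for illustration. */
#define BATCH_BUF_SIZE 4096 /* bytes staged before a transfer */
#define BATCH_MAX_PKTS 64   /* slots in the offset table */

typedef struct PacketBatch_ {
    unsigned char data[BATCH_BUF_SIZE];
    size_t offsets[BATCH_MAX_PKTS]; /* start of each payload inside data[] */
    size_t used;                    /* bytes currently staged */
    size_t count;                   /* packets currently staged */
} PacketBatch;

/* Returns 1 if a payload of len bytes fits into the batch as it is now. */
static int BatchHasRoom(const PacketBatch *b, size_t len)
{
    return b->count < BATCH_MAX_PKTS && len <= BATCH_BUF_SIZE - b->used;
}

/* Stand-in for the host-to-device transfer: it only resets the batch.
 * Returns the number of packets that were in the batch. */
static size_t BatchFlush(PacketBatch *b)
{
    size_t flushed = b->count;
    b->used = 0;
    b->count = 0;
    return flushed;
}

/* Stage one payload, flushing first if the remaining space is too small.
 * Returns the number of packets flushed by this call (0 if none),
 * or -1 if the payload can never fit into an empty batch. */
static long BatchAdd(PacketBatch *b, const unsigned char *payload, size_t len)
{
    size_t flushed = 0;

    assert(b != NULL && payload != NULL);
    if (len > BATCH_BUF_SIZE)
        return -1;
    if (!BatchHasRoom(b, len))
        flushed = BatchFlush(b);

    b->offsets[b->count++] = b->used;
    memcpy(b->data + b->used, payload, len);
    b->used += len;
    return (long)flushed;
}
```

With a pure packet count, a batch of small packets is sent mostly empty and a batch of large packets can overrun the buffer.  Checking the remaining bytes keeps each transfer close to full regardless of packet size, while the slot limit still bounds the offset table.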

>  Also, has anyone tested it using
> SLI?  This is really exciting stuff as the count of on-board stream
> processors continues to grow by leaps and bounds on these video cards
> every month!

Afaik, CUDA doesn't recognize SLI.  It will recognize just one card (I haven't
checked it myself though).  If you want to use both cards, you
will have to disable SLI.  CUDA would then list all the cards.  Either way,
load balancing across multiple devices isn't supported currently.  We have
this on the agenda, but it isn't high priority.  We need to extract perf out of
one card first ;)
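
For illustration only, one simple policy for spreading batches over several cards is to send each batch to the device with the least work queued.  The C sketch below is an assumption about how that could look, not something Suricata does today; DeviceLoad, PickDevice, DispatchBatch, BatchDone and MAX_DEVICES are hypothetical, and the actual CUDA calls to select a device and copy data are left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEVICES 8 /* hypothetical cap on the number of cards */

typedef struct DeviceLoad_ {
    size_t pending_bytes[MAX_DEVICES]; /* bytes queued per device, not yet processed */
    int ndevices;                      /* number of devices actually in use */
} DeviceLoad;

/* Pick the device with the fewest pending bytes; ties go to the lowest index. */
static int PickDevice(const DeviceLoad *dl)
{
    int best = 0;
    assert(dl->ndevices > 0 && dl->ndevices <= MAX_DEVICES);
    for (int i = 1; i < dl->ndevices; i++) {
        if (dl->pending_bytes[i] < dl->pending_bytes[best])
            best = i;
    }
    return best;
}

/* Account for a new batch and return the device it should be sent to. */
static int DispatchBatch(DeviceLoad *dl, size_t batch_bytes)
{
    int dev = PickDevice(dl);
    dl->pending_bytes[dev] += batch_bytes;
    return dev;
}

/* Called when a device has finished a batch and its results are back. */
static void BatchDone(DeviceLoad *dl, int dev, size_t batch_bytes)
{
    assert(dl->pending_bytes[dev] >= batch_bytes);
    dl->pending_bytes[dev] -= batch_bytes;
}
```

Tracking bytes rather than batch counts keeps the policy reasonable when batches differ in size.  It does not account for cards of different speeds.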

Anoop Saldanha