[Oisf-users] New MPM available

Anoop Saldanha anoopsaldanha at gmail.com
Thu Feb 23 03:36:11 UTC 2012


On Thu, Feb 23, 2012 at 6:37 AM, Hariharan Thantry <thantry at gmail.com> wrote:
> Hi folks,
>
> Just curious. What exactly are the b2g, b3g Multi-pattern-match
> algorithms? I know Wu-Manber, Knuth-Morris-Pratt & Aho-Corasick, but
> couldn't figure what algorithm b2g (or b3g) implemented...
>

q-gram versions of bndmq

> Thanks,
> Hari
>
> On Mon, Feb 20, 2012 at 11:31 PM, Anoop Saldanha
> <anoopsaldanha at gmail.com> wrote:
>> On Tue, Feb 21, 2012 at 12:54 AM, Tom DeCanio <decanio.tom at gmail.com> wrote:
>>> How were you getting the byte counts?  I put back a bit of code to dump
>>> state counts an no more.
>>>
>>
>> diff --git a/src/util-mpm-ac-bs.c b/src/util-mpm-ac-bs.c
>> index 9e08a23..ef9f139 100644
>> --- a/src/util-mpm-ac-bs.c
>> +++ b/src/util-mpm-ac-bs.c
>> @@ -972,6 +972,7 @@ static inline void
>> SCACBSCreateModDeltaTable(MpmCtx *mpm_ctx)
>>             exit(EXIT_FAILURE);
>>         }
>>         memset(ctx->state_table_mod, 0, size);
>> +        printf("size: %d\n", size);
>>
>>         mpm_ctx->memory_cnt++;
>>         mpm_ctx->memory_size += size;
>>
>>
>>> ##############Delta Table (state count 970)##############
>>> ##############Delta Table (state count 540)##############
>>> ##############Delta Table (state count 1908)##############
>>> ##############Delta Table (state count 1908)##############
>>> ##############Delta Table (state count 302)##############
>>> ##############Delta Table (state count 15263)##############
>>> ##############Delta Table (state count 9)##############
>>> ##############Delta Table (state count 686)##############
>>> ##############Delta Table (state count 6002)##############
>>> ##############Delta Table (state count 469)##############
>>> ##############Delta Table (state count 45)##############
>>> ##############Delta Table (state count 17218)##############
>>> ##############Delta Table (state count 7285)##############
>>>
>>> Some testing indicates that "ac-bs" isn't as fast as the old "ac" on tie
>>> Tilera for the ruleset I've been using.
>>>
>>
>> What's the perf difference?
>>
>> Probably with some optimizations like the cache prefetching for next
>> buffer byte you had done earlier?
>>
>>> Tom
>>>
>>> On Mon, 2012-02-20 at 10:57 +0530, Anoop Saldanha wrote:
>>>> as a reference, these are the table sizes on my box with "ac-bs" for
>>>> all the mpm contexts used by the engine, for a 18k ruleset
>>>>
>>>> * in bytes
>>>>
>>>> "ac-bs"
>>>> 24348
>>>> 38486
>>>> 118900
>>>> 47736
>>>> 4716
>>>> 4648804
>>>> 558
>>>> 15874
>>>> 266202
>>>> 6838
>>>> 696
>>>> 692
>>>> 3982784
>>>> 10756976
>>>>
>>>> On Mon, Feb 20, 2012 at 4:19 AM, Tom DeCanio <decanio.tom at gmail.com> wrote:
>>>> > I just brought this up on the Tilera (tilegx).  Haven't benchmarked it yet,
>>>> > but the tables do look much smaller than those produced by ac.  Seems like
>>>> > this should improve performance here.  When I get my benchmarking setup back
>>>> > I'll gather some new numbers.
>>>> >
>>>> > Tom
>>>> >
>>>> > On Tue, Feb 14, 2012 at 1:22 AM, Anoop Saldanha <anoopsaldanha at gmail.com>
>>>> > wrote:
>>>> >>
>>>> >> Hello all,
>>>> >>
>>>> >> We have a new MPM available in our codebase - "ac-bs".  This provides
>>>> >> compression that's pretty close to ac-gfbs, while performing better
>>>> >> than ac-gfbs.
>>>> >>
>>>> >> To use this mpm, set
>>>> >>
>>>> >> "mpm-algo: ac-bs" in the conf file.
>>>> >>
>>>> >> Would appreciate performance numbers with both
>>>> >>
>>>> >> "sgh-mpm-context:full"
>>>> >> and
>>>> >> "sgh-mpm-context:single"
>>>> >>
>>>> >> To give an explanation on what "sgh-mpm-context" and the params "full"
>>>> >> and "single" mean, these refer to how we set up mpm contexts.
>>>> >> "single" indicates that we use a single context for all the patterns
>>>> >> in the engine.  "full" indicates that we split the patterns into many
>>>> >> mpm contexts, one mpm context per signature group head(sgh).
>>>> >>
>>>> >> To use "full" with a sufficiently decent ruleset(say > 10k rules with
>>>> >> a decent no of patterns) would require a lot of memory, running into a
>>>> >> couple of gigs for ac-gfbs or ac-bs or b2gc, or tens of gigs in case
>>>> >> of "ac".  "single" solves this with a single context and hence the
>>>> >> smaller memory footprint for the engine.
>>>> >>
>>>> >> If the machine has sufficient memory, "full" is suggested as it
>>>> >> provides much better performance than "single", albeit at the cost of
>>>> >> increased memory consumption.  More of a available_memory vs
>>>> >> performance scenario.
>>>> >>
>>>> >> Looking forward to some performance/memory feedback/benchmarks with
>>>> >> this mpm from the community.
>>>> >>
>>>> >> *mpm - multi pattern matcher
>>>> >> *sgh - signature group head
>>>> >>
>>>> >> --
>>>> >> Anoop Saldanha
>>>> >>
>>>> >> _______________________________________________
>>>> >> Oisf-users mailing list
>>>> >> Oisf-users at openinfosecfoundation.org
>>>> >> http://lists.openinfosecfoundation.org/mailman/listinfo/oisf-users
>>>> >
>>>> >
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>> --
>> Anoop Saldanha
>> _______________________________________________
>> Oisf-users mailing list
>> Oisf-users at openinfosecfoundation.org
>> http://lists.openinfosecfoundation.org/mailman/listinfo/oisf-users



-- 
Anoop Saldanha



More information about the Oisf-users mailing list