[Oisf-devel] [COMMIT] OISF branch, master, updated. suricata-2.0beta1-119-ge05034f

Fri Sep 13 10:09:56 UTC 2013

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "OISF".

The branch, master has been updated
       via  e05034f5ddb0da633f046d0f6d1067c67ef289d9 (commit)
      from  77b429c402e888c0325aed2f78cc8a8dba2902ab (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit e05034f5ddb0da633f046d0f6d1067c67ef289d9
Author: Ken Steele <ken at tilera.com>
Date:   Thu Sep 5 08:59:13 2013 -0400

    New Multi-pattern matcher, ac-tile, optimized for Tile architecture.

    Aho-Corasick mpm optimized for Tilera Tile-Gx architecture. Based on the
    util-mpm-ac.c code base. The primary optimizations are:
    1) Matching function used Tilera specific instructions.
    2) Alphabet compression to reduce delta table size to increase cache
       utilization  and performance.

    The basic observation is that not all 256 ASCII characters are used by
    the set of multiple patterns in a group for which a DFA is
    created. The first reason is that Suricata's pattern matching is
    case-insensitive, so all uppercase characters are converted to
    lowercase, leaving a hole of 26 characters in the
    alphabet. Previously, this hole was simply left in the middle of the
    alphabet and thus in the generated Next State (delta) tables.

    A new, smaller, alphabet is created using a translation table of 256
    bytes per mpm group. Previously, there was one global translation
    table for converting upper case to lowercase.

    Additional, unused characters are found by creating a histogram of all
    the characters in all the patterns. Then all the characters with zero
    counts are mapped to one character (0) in the new alphabet. Since
    These characters appear in no pattern, they can all be mapped to a
    single character and still result in the same matches being
    found. Zero was chosen for the value in the new alphabet since this
    "character" is more likely to appear in the input. The unused
    character always results in the next state being state zero, but that
    fact is not currently used by the code, since special casing takes
    additional instructions.

    The characters that do appear in some pattern are mapped to
    consecutive characters in the new alphabet, starting at 1. This
    results in a dense packing of next state values in the delta tables
    and additionally can allow for a smaller number of columns in that
    table, thus using less memory and better packing into the cache. The
    size of the new alphabet is the number of used characters plus 1 for
    the unused catch-all character.

    The alphabet size is rounded up to the next larger power-of-2 so that
    multiplication by the alphabet size can be done with a shift.  It
    might be possible to use a multiply instruction, so that the exact
    alphabet size could be used, which would further reduce the size of
    the delta tables, increase cache density and not require the
    specialized search functions. The multiply would likely add 1 cycle to
    the inner search loop.

    Since the multiply by alphabet-size is cleverly merged with a mask
    instruction (in the SINDEX macro), specialized versions of the
    SCACSearch function are generated for alphabet sizes 256, 128, 64, 32
    and 16.  This is done by including the file util-mpm-ac-small.c
    multiple times with a redefined SINDEX macro. A function pointer is
    then stored in the mpm context for the search function. For alpha bit
    sizes of 8 or smaller, the number of states usually small, so the DFA
    is already very small, so there is little difference using the 16
    state search function.

    The SCACSearch function is also specialized by the size of the value
    stored in the next state (delta) tables, either 16-bits or 32-bits.
    This removes a conditional inside the Search function. That
    conditional is only called once, but doesn't hurt to remove
    it. 16-bits are used for up to 32K states, with the sign bit set for
    states with matches.

    Future optimization:

    The state-has-match values is only needed per state, not per next
    state, so checking the next-state sign bit could be replaced with
    reading a different value, at the cost of an additional load, but
    increasing the 16-bit next state span to 64K.

    Since the order of the characters in the new alphabet doesn't matter,
    the new alphabet could be sorted by the frequency of the characters in
    the expected input stream for that multi-pattern matcher. This would
    group more frequent characters into the same cache lines, thus
    increasing the probability of reusing a cache-line.

    All the next state values for each state live in their own set of
    cache-lines. With power-of-two sizes alphabets, these don't overlap.
    So either 32 or 16 character's next states are loaded in each cache
    line load. If the alphabet size is not an exact power-of-2, then the
    last cache-line is not completely full and up to 31*2 bytes of that
    line could be wasted per state.

    The next state table could be transposed, so that all the next states
    for a specific character are stored sequentially, this could be better
    if some characters, for example the unused character, are much more
    frequent.

-----------------------------------------------------------------------

Summary of changes:
 src/Makefile.am                              |    1 +
 src/app-layer-smtp.c                         |    2 +-
 src/detect-csum.c                            |    4 +-
 src/detect-dns-query.c                       |   14 +-
 src/detect-engine-mpm.c                      |   18 +-
 src/detect-engine-payload.c                  |   26 +-
 src/detect-engine.c                          |    2 +-
 src/detect-http-header.c                     |    2 +-
 src/detect-ipproto.c                         |    6 +-
 src/util-mpm-ac-tile-small.c                 |   91 +
 src/util-mpm-ac-tile.c                       | 2568 ++++++++++++++++++++++++++
 src/{util-mpm-ac-bs.h => util-mpm-ac-tile.h} |   89 +-
 src/util-mpm.c                               |    4 +-
 src/util-mpm.h                               |    7 +
 14 files changed, 2768 insertions(+), 66 deletions(-)
 create mode 100644 src/util-mpm-ac-tile-small.c
 create mode 100644 src/util-mpm-ac-tile.c
 copy src/{util-mpm-ac-bs.h => util-mpm-ac-tile.h} (51%)

hooks/post-receive
-- 
OISF