[Oisf-devel] Some questions about module development

Fri Feb 1 14:07:00 UTC 2013

Once again I've got some questions:

I am trying to figure out how exactly reassembly works in suricata. My
goal is to divide the stream into messages: message boundaries are
defined by a change in stream direction, a timeout, or the end of the stream.
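
The boundary rule described above (direction change, timeout, end of stream) can be sketched independently of suricata. This is only a minimal illustration under stated assumptions: every name below (msg_splitter, msg_splitter_feed, and so on) is hypothetical and not part of the suricata API, timestamps are plain seconds, and only lengths are tracked instead of copying data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; none of this is suricata API. */
enum msg_dir { MSG_DIR_NONE = 0, MSG_DIR_TOSERVER, MSG_DIR_TOCLIENT };

struct msg_splitter {
    enum msg_dir dir;    /* direction of the message being collected */
    uint64_t last_ts;    /* timestamp (seconds) of the last chunk seen */
    uint64_t timeout;    /* idle seconds after which a message ends */
    size_t cur_len;      /* bytes collected in the current message */
    unsigned completed;  /* number of messages closed so far */
};

static void msg_splitter_init(struct msg_splitter *s, uint64_t timeout)
{
    assert(s != NULL);
    s->dir = MSG_DIR_NONE;
    s->last_ts = 0;
    s->timeout = timeout;
    s->cur_len = 0;
    s->completed = 0;
}

/* Close the current message if it holds data. Call this at end of stream. */
static void msg_splitter_flush(struct msg_splitter *s)
{
    if (s->cur_len > 0) {
        s->completed++;
        s->cur_len = 0;
    }
    s->dir = MSG_DIR_NONE;
}

/* Feed one reassembled chunk. Returns 1 if the chunk closed the previous
 * message (direction change or idle timeout) and started a new one. */
static int msg_splitter_feed(struct msg_splitter *s, enum msg_dir dir,
                             uint64_t ts, size_t len)
{
    int boundary = 0;
    int timed_out = ts > s->last_ts && ts - s->last_ts > s->timeout;

    if (s->cur_len > 0 && (dir != s->dir || timed_out)) {
        msg_splitter_flush(s);
        boundary = 1;
    }
    s->dir = dir;
    s->last_ts = ts;
    s->cur_len += len;
    return boundary;
}
```

Such a splitter would have to be fed from whatever hook ends up delivering the reassembled chunks; which hook that should be is exactly the open question below.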

As far as I've seen, reassembly is done only to enable the AppLayer
parsers. A possible hook point for my implementation would be
AppLayerHandleTCPData or an AppLayer module.
The problem I see there is that reassembly is stopped by some other
applayer modules, e.g. http when a bad header is received. Do you think
there is any other way to do this? Or do I just have to ensure (by
always clearing the flag) that reassembly is never stopped?

The data sent to AppLayerHandleTCPData also seems to be affected by the
applayer modules. Do the modules tell suricata what part of the payload
they have already parsed? E.g. if I send "GET / HTTP/1.0" and in another
packet "Host: example.com", both packets are delivered individually. Some
other packets are delivered together (e.g. sending "abc" and "def" will
pass "abc" and then "abcdef" to AppLayerHandleTCPData).

I think what I really need is for reassembly and app layer handling to be
split up. They seem to be very tightly integrated at the moment...
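
For the "abc" then "abcdef" case described above, one possible workaround on the consumer side is to remember how far into the stream the data has already been processed and to skip that prefix on redelivery. The sketch below assumes the hook knows the absolute stream offset of the first byte of each buffer; whether suricata makes such an offset available at that point is not established in this thread, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical consumer-side state; not suricata API. */
struct dedup_state {
    uint64_t consumed;  /* absolute stream offset of the next unseen byte */
};

/* Given a buffer that starts at absolute stream offset buf_offset, return a
 * pointer to the bytes not seen before and store their count in *new_len.
 * If the buffer starts beyond what was consumed (a gap), all of it is new. */
static const uint8_t *dedup_new_bytes(struct dedup_state *st,
                                      uint64_t buf_offset,
                                      const uint8_t *buf, size_t len,
                                      size_t *new_len)
{
    assert(st != NULL && buf != NULL && new_len != NULL);

    uint64_t end = buf_offset + len;
    if (end <= st->consumed) {
        /* Everything in this buffer was already processed. */
        *new_len = 0;
        return buf + len;
    }

    size_t skip = 0;
    if (buf_offset < st->consumed)
        skip = (size_t)(st->consumed - buf_offset);

    *new_len = len - skip;
    st->consumed = end;
    return buf + skip;
}
```

With this, the first call sees "abc" and the second call sees only "def", even though the second buffer contains "abcdef".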

Jörg

On 18.01.2013 10:46, Victor Julien wrote:
> On 01/16/2013 01:51 PM, Jörg Vehlow wrote:
>> Hello again.
>>
>> Thanks for your answers so far.
>>
>> I took a deeper look into suricata and I've got some more questions now.
>>
>> I was trying to understand the tcp stream, reassembly and message
>> terminology.
>>
>> Can someone elaborate on the use of Flow, TcpSession and StreamMsg?
> Flow is an object that we use to track everything related to the ...
> flow :) A flow is shared between all packets with the same 5tuple
> (proto,src,dst,sp,dp) until it times out.
>
> TcpSession is a per flow object specifically for TCP stream tracking and
> reassembly.
>
>> What are messages in suricata terms? As far as I've seen everything sent
>> from the client ends up in ONE message. The same for the server. Is
>> there any way that more than one message exists for a single direction?
>> I'm thinking about something like "a new message starts when the
>> transmission direction changes".
> The StreamMsg is a bit of a misnomer. It's actually chunks of
> reassembled stream data in the normal case. Such chunks are always in
> one direction. It can also be more of a message: telling the application
> layer modules that a stream gap was encountered.
>
> Cheers,
> Victor
>
>>
>> PS: I'm subscribed to the list, I just used the wrong sender address.
>>
>> Jörg
>>
>>
>> On 28.11.2012 08:52, Victor Julien wrote:
>>> On 11/27/2012 04:50 PM, Jörg Vehlow wrote:
>>>> Hi,
>>>>  
>>>> I am currently trying to develop some detection modules for my master's
>>>> thesis, in order to detect malware cnc traffic that cannot be caught by
>>>> simple signatures.
>>>> The first module I implemented can calculate the entropy of the payload.
>>>> And while doing that some questions (and possible answers) came up:
>>>>
>>>> 1. What is the intended purpose of Setup, Match and Free functions of a
>>>> detection module?
>>>> I think Setup is for parsing the parameters, Match does the actual
>>>> calculating and matching and Free is for freeing the memory allocated by
>>>> Setup.
>>>> 2. When are Setup, Match and Free called?
>>>> From what I saw, I suspect that Setup gets called ONCE for EVERY rule
>>>> that uses the registered keyword at initialization, and Free analogously
>>>> at the end. Is Match called on every rule? Or does suricata match the
>>>> keywords in order and, if the first one fails, stop matching the other
>>>> ones?
>>> Setup is called per keyword occurrence. So if you have 2 rules with each
>>> one "foo:bar;", Setup will be called twice.
>>>
>>> Match is called on a per packet basis, but only if the signature is actually
>>> inspected (so not prefiltered based on flow dir, addresses, mpm and a
>>> bunch of other conditions) and if the other conditions that are checked
>>> prior to your keyword matched.
>>>
>>> Free is called when we clean up our detection engine ctx. Normally at
>>> shut down, but in case of rule parsing errors we can do it at init as
>>> well. With live reload support it's also done at runtime, when the old
>>> detect engine ctx is cleaned up.
>>>
>>>> 3. Where am I supposed to store my calculated values for reuse? I.e. if
>>>> the entropy is used in more than one rule it would be cheaper to store
>>>> the value somewhere.
>>>> Do I have to add a member to the Packet structure?
>>> That is an option. If you need it only for a single packet during one
>>> inspection round you could also just use a thread local var, ie __thread
>>> int yourint; or add something to the DetectEngineThreadCtx.
>>>
>>>> 4. What exactly is it I'm getting in Packet.payload?
>>>> As far as I've seen it is the payload of the tcp / udp packet. Is it
>>>> always the payload for the Application layer (app layer header and app
>>>> layer payload)?
>>> A packet payload, so the layer above the tcp/udp/sctp/icmp.
>>>
>>>> 5. Is there a way to decrypt a packet on the fly but only if certain
>>>> criteria are matched, e.g. entropy greater than x (I'd like to be able to
>>>> do other matches like content on the decrypted data afterwards)?
>>>> Can I just change Packet.payload to decrypt the packet and then go on
>>>> matching on it? Or do I have to implement another app layer?
>>> This is more tricky. Our content inspection engine is currently not
>>> trivial to extend, it's something we will change. It's not very hard
>>> either, but it requires a lot of small additions in a lot of places.
>>>
>>> If you need the normal "content" inspection to work on your decrypted
>>> payloads you will need to do your decryption before the detection engine
>>> is invoked, so at the decode or applayer stages (or you can add your own
>>> stage and hook it in before detect). You will have to store your
>>> decrypted buffer in the packet or the flow, depending on the protocol,
>>> then adapt the detect content inspection engine to know about it. Also,
>>> for this you'd have to add a content modifier similar to http_uri,
>>> http_client_body, etc.
>>>
>>>> 6. This is about app layers: Are they always "executed"? What I mean is:
>>>> Are all packets checked for possibly being http or smb even if there
>>>> are no rules that need this kind of matching?
>>> Currently yes. We do some things on demand in the http parsers, but
>>> otherwise yes.
>>>
>>>> What I would like here is to have some match like entropy greater than x
>>>> and only then execute an app layer decoder.
>>> I think the best way currently would be to bail out early in the app
>>> layer module if the entropy is below your threshold.
>>>
>>>> 7. One last (not so important) question: Why doesn't suricata use a
>>>> real plugin system with dynamically linked plugins or something?
>>>> Is it for performance reasons?
>>> No, it's actually on our (mental) roadmap. No ETA or anything yet :)
>>>
>>> Cheers,
>>> Victor
>>>
>>> Btw, looks like you're not a member of this list, please subscribe.
>
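
On question 3 in the quoted thread (where to keep a computed value so several rules can reuse it), the thread local variable suggested in the reply could look roughly like the sketch below. It is only an illustration under assumptions: pkt_id stands for any value that uniquely identifies the packet within one inspection round, the distinct byte count is a cheap stand-in for the real entropy calculation, and none of the names are suricata API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread cache holding the metric of one packet. */
struct metric_cache {
    uint64_t pkt_id;        /* packet the cached value belongs to */
    int valid;              /* 0 until the first value is stored */
    unsigned value;         /* cached metric */
    unsigned computations;  /* how often the metric was really computed */
};

/* One instance per thread, as with "__thread int yourint;". */
static __thread struct metric_cache tl_cache;

/* Stand-in metric: number of distinct byte values in the payload. */
static unsigned distinct_bytes(const uint8_t *p, size_t len)
{
    uint8_t seen[256] = { 0 };
    unsigned n = 0;

    for (size_t i = 0; i < len; i++) {
        if (!seen[p[i]]) {
            seen[p[i]] = 1;
            n++;
        }
    }
    return n;
}

/* Return the metric for this packet, computing it only on the first call
 * made for a given pkt_id on the current thread. */
static unsigned cached_metric(uint64_t pkt_id, const uint8_t *p, size_t len)
{
    assert(p != NULL || len == 0);

    if (!tl_cache.valid || tl_cache.pkt_id != pkt_id) {
        tl_cache.value = distinct_bytes(p, len);
        tl_cache.pkt_id = pkt_id;
        tl_cache.valid = 1;
        tl_cache.computations++;
    }
    return tl_cache.value;
}
```

A Match function in each rule using the keyword would then call cached_metric() instead of recomputing the value; the second and later calls for the same packet only read the cache.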