[Oisf-devel] Some questions about module development

Fri Feb 1 14:30:00 UTC 2013

What I'm trying to do is divide the stream into messages. A message
is a contiguous part of the stream (stream means the bidirectional stream
here) in one direction.
For an HTTP session with one request there will be two messages, the
request and the response.
If there is a second request in the HTTP session, it will have four
messages (request, response, request, response).
The order in which the messages appear is important. Together with the
payload of a message, the time of its first frame and its direction should be
saved.

Something like:
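struct Message {
    char *data;             /* payload of this message */
    unsigned int length;    /* payload length in bytes */
    int direction;          /* transmission direction */

    struct Message *next;   /* next message in stream order */
};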

The benefit of this representation of the stream is that you can do
statistical matches on the length of the individual messages or the time
between two messages. I hope to be able to detect some malware C&C
traffic with this. Often you have limited message lengths or somewhat
fixed intervals between messages.
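
As a rough illustration of the "fixed intervals" idea, here is a minimal sketch that computes the mean and variance of the gaps between message start times. A variance close to zero would hint at regular, beacon-like traffic. The names IntervalStats and interval_stats are made up for this example and are not part of Suricata.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of the gaps between consecutive message start times. */
struct IntervalStats {
    double mean;      /* average gap in seconds */
    double variance;  /* population variance of the gaps */
};

/* start_times: message start times in seconds, in stream order.
 * count: number of messages; at least two are needed to form one gap.
 * Returns 0 on success, -1 if there are too few messages. */
int interval_stats(const double *start_times, size_t count,
                   struct IntervalStats *out)
{
    assert(start_times != NULL && out != NULL);
    if (count < 2)
        return -1;

    size_t gaps = count - 1;
    double sum = 0.0;
    for (size_t i = 1; i < count; i++)
        sum += start_times[i] - start_times[i - 1];
    double mean = sum / (double)gaps;

    double sq = 0.0;
    for (size_t i = 1; i < count; i++) {
        double d = (start_times[i] - start_times[i - 1]) - mean;
        sq += d * d;
    }

    out->mean = mean;
    out->variance = sq / (double)gaps;
    return 0;
}
```

The same kind of summary could be computed over message lengths instead of start times.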

The rules to divide a bidirectional TCP stream into messages are as follows:
 - beginning of stream -> create new message
 - change of transmission direction -> create new message
 - timeout after the last received packet (say, if no packet is received within
2 seconds) -> create new message
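
The three rules above could be sketched roughly as follows. This is only an illustration under a few assumptions: the packets are already in the correct order, only packets that carry payload are passed in, and the types PacketInfo and MessageInfo as well as split_messages are invented for this example (they are not Suricata structures).

```c
#include <assert.h>
#include <stddef.h>

enum Direction { TO_SERVER = 0, TO_CLIENT = 1 };

/* Hypothetical per-packet input, already in stream order. */
struct PacketInfo {
    enum Direction dir;
    double ts;    /* arrival time in seconds */
    size_t len;   /* payload bytes */
};

/* Hypothetical per-message summary. */
struct MessageInfo {
    enum Direction dir;
    double first_ts;  /* time of the first packet of the message */
    size_t len;       /* total payload bytes of the message */
};

#define MESSAGE_TIMEOUT 2.0  /* seconds, the example value from the text */

/* Splits packets into messages; returns the number of messages written
 * (at most max_out). */
size_t split_messages(const struct PacketInfo *pkts, size_t npkts,
                      struct MessageInfo *out, size_t max_out)
{
    assert(out != NULL);
    assert(pkts != NULL || npkts == 0);

    size_t n = 0;
    double last_ts = 0.0;
    for (size_t i = 0; i < npkts; i++) {
        const struct PacketInfo *p = &pkts[i];
        /* Rule 1: begin of stream. Rule 2: direction change.
         * Rule 3: more than MESSAGE_TIMEOUT since the last packet. */
        int start_new = (n == 0)
            || p->dir != out[n - 1].dir
            || p->ts - last_ts > MESSAGE_TIMEOUT;
        if (start_new) {
            if (n == max_out)
                break;  /* output array is full */
            out[n].dir = p->dir;
            out[n].first_ts = p->ts;
            out[n].len = 0;
            n++;
        }
        out[n - 1].len += p->len;
        last_ts = p->ts;
    }
    return n;
}
```

For an HTTP session with two requests this yields four messages, plus a further one whenever a pause of more than two seconds occurs inside a response.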

The only thing I need at the moment is the packets of a TCP stream in the
correct order. This is essentially what the reassembler should do, I think...

On 01.02.2013 15:16, Victor Julien wrote:
> On 02/01/2013 03:07 PM, Jörg Vehlow wrote:
>> Once again I've got some questions:
>>
>> I am trying to figure out how exactly reassembly works in Suricata. My
>> goal is to divide the stream into messages. Message boundaries are
>> defined by a change in stream direction, a timeout, or the end of the stream.
>>
>> As far as I've seen, the reassembly is done just to enable the AppLayer
>> Parsers. A possible hook point for my implementation would be
>> AppLayerHandleTCPData or an AppLayer module.
> It's used for raw stream matching as well.
>
>> The problem I'm seeing there is that reassembly is stopped by some other
>> applayer modules, e.g. http when a bad header is received. Do you think
>> there is any other way to do this? Or do I just have to ensure (by
>> always clearing the flag) that reassembly is never stopped?
>>
>> The data sent to AppLayerHandleTCPData also seems to be affected by the
>> applayer modules. Do the modules tell Suricata what part of the payload
>> they have parsed already? E.g. if I send "GET / HTTP/1.0" and in another
>> packet "Host: example.com", both packets are sent individually. Some other
>> packets are sent together (e.g. "abc" and "def" will send "abc" and
>> "abcdef" to AppLayerHandleTCPData).
> What you're describing is due to the integration with protocol
> detection. While the protocol isn't detected yet, we send the same data
> + new data to the protocol detection code. Once the protocol is
> detected, the app layer module gets data in in-order chunks, without
> duplication or anything.
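
If a hook does see such cumulative buffers ("abc", then "abcdef") while the protocol is still undetected, one possible way to recover only the new bytes is to remember how much was already handled. The sketch below is an assumption for illustration only: SeenState and unseen_part are invented names, not Suricata API, and the state would have to be kept per direction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-direction state. */
struct SeenState {
    size_t consumed;  /* bytes of the cumulative buffer already handled */
};

/* Given a cumulative buffer, returns a pointer to the part that has not
 * been handled yet and stores its length in *new_len. */
const unsigned char *unseen_part(struct SeenState *st,
                                 const unsigned char *buf, size_t len,
                                 size_t *new_len)
{
    assert(st != NULL && buf != NULL && new_len != NULL);
    if (len <= st->consumed) {
        *new_len = 0;          /* nothing new in this call */
        return buf + len;
    }
    const unsigned char *start = buf + st->consumed;
    *new_len = len - st->consumed;
    st->consumed = len;
    return start;
}
```

Once the protocol is detected and data arrives as in-order chunks without duplication, this bookkeeping must no longer be applied, otherwise new data would be skipped.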
>
>> I think what I really need is reassembly and AppLayerHandling split up.
>> They seem to be very tightly integrated at the moment...
> Maybe you can explain in some more detail what you are trying to do?
>
> Cheers,
> Victor
>
>> Jörg
>>
>> On 18.01.2013 10:46, Victor Julien wrote:
>>> On 01/16/2013 01:51 PM, Jörg Vehlow wrote:
>>>> Hello again.
>>>>
>>>> Thanks for your answers so far.
>>>>
>>>> I took a deeper look into Suricata and I've got some more questions now.
>>>>
>>>> I was trying to understand the tcp stream, reassembly and message
>>>> terminology.
>>>>
>>>> Can someone elaborate on the use of Flow, TcpSession and StreamMsg?
>>> Flow is an object that we use to track everything related to the ...
>>> flow :) A flow is shared between all packets with the same 5tuple
>>> (proto,src,dst,sp,dp) until it times out.
>>>
>>> TcpSession is a per flow object specifically for TCP stream tracking and
>>> reassembly.
>>>
>>>> What are messages in Suricata terms? As far as I've seen, everything sent
>>>> from the client ends up in ONE message. The same goes for the server. Is
>>>> there any way that more than one message exists for a single direction?
>>>> I'm thinking about something like "a new message starts when the
>>>> transmission direction changes".
>>> The StreamMsg is a bit of a misnomer. It's actually chunks of
>>> reassembled stream data in the normal case. Such chunks are always in
>>> one direction. It can also be more of a message: telling the application
>>> layer modules that a stream gap was encountered.
>>>
>>> Cheers,
>>> Victor
>>>
>>>> PS: I'm subscribed to the list, I just used the wrong sender address.
>>>>
>>>> Jörg
>>>>
>>>>
>>>> On 28.11.2012 08:52, Victor Julien wrote:
>>>>> On 11/27/2012 04:50 PM, Jörg Vehlow wrote:
>>>>>> Hi,
>>>>>>  
>>>>>> I am currently trying to develop some detection modules for my master's
>>>>>> thesis, in order to detect malware C&C traffic that simple signatures
>>>>>> cannot catch.
>>>>>> The first module I implemented can calculate the entropy of the payload.
>>>>>> And while doing that some questions (and possible answers) came up:
>>>>>>
>>>>>> 1. What is the intended purpose of Setup, Match and Free functions of a
>>>>>> detection module?
>>>>>> I think Setup is for parsing the parameters, match does the actual
>>>>>> calculating and matching and free is for freeing the memory allocated by
>>>>>> Setup.
>>>>>> 2. When are Setup, Match and Free called?
>>>>>> From what I saw, I suspect that Setup gets called ONCE for EVERY rule
>>>>>> that uses the registered keyword at initialization, and Free analogously
>>>>>> at the end. Is Match called on every rule? Or does Suricata match the
>>>>>> keywords in order and, if the first one fails, stop matching the other
>>>>>> ones?
>>>>> Setup is called per keyword occurrence. So if you have 2 rules with each
>>>>> one "foo:bar;", Setup will be called twice.
>>>>>
>>>>> Match is called on a per packet basis, but only if the signature is actually
>>>>> inspected (so not prefiltered based on flow dir, addresses, mpm and a
>>>>> bunch of other conditions) and if the other conditions that are checked
>>>>> prior to your keyword matched.
>>>>>
>>>>> Free is called when we clean up our detection engine ctx. Normally at
>>>>> shut down, but in case of rule parsing errors we can do it at init as
>>>>> well. With live reload support it's also done at runtime, when the old
>>>>> detect engine ctx is cleaned up.
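
The division of labour between the three callbacks can be illustrated with a toy keyword that matches on a minimum payload length. This is a self-contained mock, not the real Suricata registration API: ThresholdCtx and the threshold_* functions, including their signatures, are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context created by Setup for ONE keyword occurrence. */
struct ThresholdCtx {
    long min_len;
};

/* Setup: parse the option string of one keyword occurrence.
 * Returns NULL on a parsing error (or if allocation fails). */
struct ThresholdCtx *threshold_setup(const char *arg)
{
    assert(arg != NULL);
    char *end = NULL;
    long v = strtol(arg, &end, 10);
    if (end == arg || *end != '\0' || v < 0)
        return NULL;
    struct ThresholdCtx *ctx = malloc(sizeof(*ctx));
    if (ctx != NULL)
        ctx->min_len = v;
    return ctx;
}

/* Match: called per packet; it only reads the context built by Setup. */
int threshold_match(const struct ThresholdCtx *ctx, long payload_len)
{
    assert(ctx != NULL);
    return payload_len >= ctx->min_len;
}

/* Free: release what Setup allocated for this occurrence. */
void threshold_free(struct ThresholdCtx *ctx)
{
    free(ctx);
}
```

Two rules that each use the keyword once would lead to two Setup calls and therefore two independent contexts, each released by its own Free call.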
>>>>>
>>>>>> 3. Where am I supposed to store my calculated values for reuse? I.e. if
>>>>>> the entropy is used in more than one rule it would be cheaper to store
>>>>>> the value somewhere.
>>>>>> Do I have to add a member to the Packet structure?
>>>>> That is an option. If you need it only for a single packet during one
>>>>> inspection round you could also just use a thread local var, ie __thread
>>>>> int yourint; or add something to the DetectEngineThreadCtx.
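
A thread local cache of this kind might look roughly like the sketch below. To keep it free of dependencies, a placeholder metric (the number of distinct byte values) stands in for the entropy calculation. The packet_id parameter is a hypothetical per-packet identifier, payload_metric is an invented name, and __thread is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Thread-local cache, meant to be valid for one packet during one
 * inspection round. */
static __thread uint64_t cached_packet_id;
static __thread int cached_valid;
static __thread unsigned cached_value;
static __thread unsigned compute_calls;  /* demonstration counter only */

/* Placeholder for the expensive calculation: counts distinct byte values. */
static unsigned distinct_bytes(const uint8_t *data, size_t len)
{
    uint8_t seen[256] = {0};
    unsigned count = 0;
    for (size_t i = 0; i < len; i++) {
        if (!seen[data[i]]) {
            seen[data[i]] = 1;
            count++;
        }
    }
    return count;
}

/* Returns the metric for the packet, recomputing it only when a different
 * packet_id (hypothetical identifier) is seen on this thread. */
unsigned payload_metric(uint64_t packet_id, const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    if (!cached_valid || cached_packet_id != packet_id) {
        cached_value = distinct_bytes(data, len);
        cached_packet_id = packet_id;
        cached_valid = 1;
        compute_calls++;
    }
    return cached_value;
}
```

If several rules use the keyword, only the first Match call for a packet pays for the calculation; later calls on the same thread reuse the stored value.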
>>>>>
>>>>>> 4. What exactly is it I'm getting in Packet.payload?
>>>>>> As far as I've seen it is the payload of the tcp / udp packet. Is it
>>>>>> always the payload for the Application layer (app layer header and app
>>>>>> layer payload)?
>>>>> A packet payload, so the layer above the tcp/udp/sctp/icmp.
>>>>>
>>>>>> 5. Is there a way to decrypt a packet on the fly, but only if certain
>>>>>> criteria are matched, e.g. entropy greater than x (I'd like to be able to
>>>>>> do other matches like content on the decrypted code afterwards)?
>>>>>> Can I just change Packet.payload to decrypt the packet and then go on
>>>>>> matching on it? Or do I have to implement another app layer?
>>>>> This is more tricky. Our content inspection engine is currently not
>>>>> trivial to extend, it's something we will change. It's not very hard
>>>>> either, but it requires a lot of small additions in a lot of places.
>>>>>
>>>>> If you need the normal "content" inspection to work on your decrypted
>>>>> payloads you will need to do your decryption before the detection engine
>>>>> is invoked, so at the decode or applayer stages (or you can add your own
>>>>> stage and hook it in before detect). You will have to store your
>>>>> decrypted buffer in the packet or the flow, depending on the protocol,
>>>>> then adapt the detect content inspection engine to know about it. Also,
>>>>> for this you'd have to add a content modifier similar to http_uri,
>>>>> http_client_body, etc.
>>>>>
>>>>>> 6. This is about app layers: Are they always "executed"? What I mean is:
>>>>>> Are all packets checked for whether they might be http or smb even if there
>>>>>> are no rules that need this kind of matching?
>>>>> Currently yes. We do some things on demand in the http parsers, but
>>>>> otherwise yes.
>>>>>
>>>>>> What I would like here is to have some match like entropy greater x and
>>>>>> then execute an app layer decoder.
>>>>> I think the best way currently would be to bail out early in the app
>>>>> layer module if the entropy is below your threshold.
>>>>>
>>>>>> 7. One last (not so important) question: Why doesn't Suricata use a
>>>>>> real plugin system with dynamically linked plugins or something?
>>>>>> Is it for performance reasons?
>>>>> No, it's actually on our (mental) roadmap. No ETA or anything yet :)
>>>>>
>>>>> Cheers,
>>>>> Victor
>>>>>
>>>>> Btw, looks like you're not a member of this list, please subscribe.
>