[Oisf-devel] Some questions about module development

Fri Feb 1 14:16:45 UTC 2013

On 02/01/2013 03:07 PM, Jörg Vehlow wrote:
> Once again I've got some questions:
> 
> I am trying to figure out how exactly reassembly works in Suricata. My
> goal is to divide the stream into messages: message boundaries are
> defined by a change in stream direction, a timeout or the end of the stream.
> 
> As far as I've seen, the reassembly is done just to enable the AppLayer
> Parsers. A possible hook point for my implementation would be
> AppLayerHandleTCPData or an AppLayer module.

It's used for raw stream matching as well.

> The problem I'm seeing there is that reassembly is stopped by some other
> applayer modules, e.g. http when a bad header is received. Do you think
> there is any other way to do this? Or do I just have to ensure (by
> always clearing the flag) that reassembly is never stopped?
> 
> The data sent to AppLayerHandleTCPData also seems to be affected by the
> applayer modules. Do the modules tell Suricata what part of the payload
> they have parsed already? E.g. if I send "GET / HTTP/1.0" and in another
> packet "Host: example.com", both packets are sent individually. Some other
> packets are sent together (e.g. "abc" and "def" will send "abc" and
> "abcdef" to AppLayerHandleTCPData).

What you're describing is due to the integration with protocol
detection. While the protocol isn't detected yet, we send the previously
seen data plus the new data to the protocol detection code. Once the
protocol is detected, the app layer module gets the data in in-order
chunks, without duplication.
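
The behaviour described above can be illustrated with a small toy model. This
is only a sketch under stated assumptions: ToyStream, toy_detect and toy_feed
are hypothetical names invented for illustration, not Suricata structures or
functions, and the "detector" simply succeeds once a minimum number of bytes
has been seen.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_MAX 256

/* Hypothetical toy model, NOT Suricata's real stream structures. */
typedef struct {
    char   pending[BUF_MAX]; /* data kept while the protocol is unknown */
    size_t pending_len;
    int    detected;         /* 0 until the toy detector succeeds */
} ToyStream;

/* Toy detector (assumption): succeeds once min_len bytes are available. */
static int toy_detect(const char *data, size_t len, size_t min_len)
{
    (void)data;
    return len >= min_len;
}

/* Feed one in-order segment. Copies what the consumer would see into
 * out and returns the number of bytes copied. */
static size_t toy_feed(ToyStream *s, const char *seg, size_t seg_len,
                       size_t min_len, char *out, size_t out_cap)
{
    if (s->detected) {
        /* Protocol known: only the new bytes are passed on. */
        assert(seg_len <= out_cap);
        memcpy(out, seg, seg_len);
        return seg_len;
    }
    /* Protocol unknown: accumulate, then pass old data + new data. */
    assert(s->pending_len + seg_len <= BUF_MAX);
    memcpy(s->pending + s->pending_len, seg, seg_len);
    s->pending_len += seg_len;
    assert(s->pending_len <= out_cap);
    memcpy(out, s->pending, s->pending_len);
    if (toy_detect(s->pending, s->pending_len, min_len))
        s->detected = 1;
    return s->pending_len;
}
```

With min_len set to 6, feeding "abc", "def" and "ghi" in that order hands the
consumer "abc", then "abcdef", then only "ghi", which matches the duplication
seen before detection and the plain in-order chunks afterwards.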

> I think what I really need is reassembly and AppLayerHandling split up.
> They seem to be very tightly integrated at the moment...

Maybe you can explain in some more detail what you are trying to do?

Cheers,
Victor

> Jörg
> 
> On 18.01.2013 10:46, Victor Julien wrote:
>> On 01/16/2013 01:51 PM, Jörg Vehlow wrote:
>>> Hello again.
>>>
>>> Thanks for your answers so far.
>>>
>>> I took a deeper look into Suricata and I've got some more questions now.
>>>
>>> I was trying to understand the tcp stream, reassembly and message
>>> terminology.
>>>
>>> Can someone elaborate on the use of Flow, TcpSession and StreamMsg?
>> Flow is an object that we use to track everything related to the ...
>> flow :) A flow is shared between all packets with the same 5tuple
>> (proto,src,dst,sp,dp) until it times out.
>>
>> TcpSession is a per flow object specifically for TCP stream tracking and
>> reassembly.
>>
>>> What are messages in Suricata terms? As far as I've seen, everything sent
>>> from the client ends up in ONE message. The same for the server. Is
>>> there any way that more than one message exists for a single direction?
>>> I'm thinking about something like "a new message starts when the
>>> transmission direction changes".
>> The StreamMsg is a bit of a misnomer. It's actually chunks of
>> reassembled stream data in the normal case. Such chunks are always in
>> one direction. It can also be more of a message: telling the application
>> layer modules that a stream gap was encountered.
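
The "new message starts when the transmission direction changes" idea can be
sketched on top of per-direction chunks like this. It is a minimal
illustration only: ToyDir, ToyChunk and toy_split_messages are hypothetical
names, not Suricata types, and the timeout and end-of-stream boundaries are
not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration, NOT Suricata's StreamMsg. */
typedef enum { TOY_TO_SERVER = 0, TOY_TO_CLIENT = 1 } ToyDir;

typedef struct {
    ToyDir dir; /* direction of this reassembled chunk */
    size_t len; /* number of payload bytes in the chunk */
} ToyChunk;

/* Group consecutive chunks with the same direction into one message.
 * Writes the byte length of each message into msg_len and returns the
 * number of messages found. */
static size_t toy_split_messages(const ToyChunk *chunks, size_t n,
                                 size_t *msg_len, size_t msg_cap)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        /* A direction change (or the first chunk) opens a new message. */
        if (i == 0 || chunks[i].dir != chunks[i - 1].dir) {
            assert(count < msg_cap);
            msg_len[count++] = 0;
        }
        msg_len[count - 1] += chunks[i].len;
    }
    return count;
}
```

For example, two to-server chunks of 14 and 17 bytes, one to-client chunk of
100 bytes and a final to-server chunk of 5 bytes yield three messages of 31,
100 and 5 bytes.
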
>>
>> Cheers,
>> Victor
>>
>>>
>>> PS: I'm subscribed to the list, I just used the wrong sender address.
>>>
>>> Jörg
>>>
>>>
>>> On 28.11.2012 08:52, Victor Julien wrote:
>>>> On 11/27/2012 04:50 PM, Jörg Vehlow wrote:
>>>>> Hi,
>>>>>  
>>>>> I am currently trying to develop some detection modules in order to
>>>>> detect malware C&C traffic that simple signatures do not catch, for my
>>>>> master's thesis.
>>>>> The first module I implemented can calculate the entropy of the payload.
>>>>> And while doing that some questions (and possible answers) came up:
>>>>>
>>>>> 1. What is the intended purpose of Setup, Match and Free functions of a
>>>>> detection module?
>>>>> I think Setup is for parsing the parameters, match does the actual
>>>>> calculating and matching and free is for freeing the memory allocated by
>>>>> Setup.
>>>>> 2. When are Setup, Match and Free called?
>>>>> From what I saw, I suspect that: Setup gets called ONCE for EVERY rule
>>>>> that uses the registered keyword at initialization, and Free analogously at
>>>>> the end. Is Match called on every rule? Or does Suricata match the
>>>>> keywords in order and if the first one fails it stops matching the other
>>>>> ones?
>>>> Setup is called per keyword occurrence. So if you have 2 rules that each
>>>> contain "foo:bar;", Setup will be called twice.
>>>>
>>>> Match is called on a per packet basis, but only if the signature is
>>>> actually inspected (so not prefiltered based on flow dir, addresses, mpm
>>>> and a bunch of other conditions) and if the other conditions that are
>>>> checked prior to your keyword matched.
>>>>
>>>> Free is called when we clean up our detection engine ctx. Normally at
>>>> shut down, but in case of rule parsing errors we can do it at init as
>>>> well. With live reload support it's also done at runtime, when the old
>>>> detect engine ctx is cleaned up.
>>>>
>>>>> 3. Where am I supposed to store my calculated values for reuse? I.e. if
>>>>> the entropy is used in more than one rule it would be cheaper to store
>>>>> the value somewhere.
>>>>> Do I have to add a member to the Packet structure?
>>>> That is an option. If you need it only for a single packet during one
>>>> inspection round you could also just use a thread local var, ie __thread
>>>> int yourint; or add something to the DetectEngineThreadCtx.
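
The thread-local option mentioned above could look roughly like the sketch
below. Everything in it is an assumption made for illustration: the toy_*
names and the pkt_id parameter are hypothetical and not part of Suricata, and
toy_score (a count of distinct byte values) is only a cheap stand-in for an
expensive computation such as entropy. __thread is the GCC/Clang storage
class.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread cache, NOT part of Suricata. */
static __thread uint64_t toy_cached_pkt_id;
static __thread int      toy_cache_valid;
static __thread unsigned toy_cached_score;
static __thread unsigned toy_compute_calls; /* demo counter only */

/* Stand-in for an expensive per-packet computation such as entropy:
 * counts the distinct byte values in the payload. */
static unsigned toy_score(const uint8_t *payload, size_t len)
{
    uint8_t seen[256] = {0};
    unsigned distinct = 0;
    for (size_t i = 0; i < len; i++) {
        if (!seen[payload[i]]) {
            seen[payload[i]] = 1;
            distinct++;
        }
    }
    toy_compute_calls++;
    return distinct;
}

/* Return the score for the packet identified by pkt_id (an assumed
 * unique id), recomputing only when a different packet is seen. */
static unsigned toy_score_cached(uint64_t pkt_id,
                                 const uint8_t *payload, size_t len)
{
    assert(payload != NULL || len == 0);
    if (!toy_cache_valid || toy_cached_pkt_id != pkt_id) {
        toy_cached_score = toy_score(payload, len);
        toy_cached_pkt_id = pkt_id;
        toy_cache_valid = 1;
    }
    return toy_cached_score;
}
```

If several rules ask for the value while the same packet is being inspected
on one thread, only the first call pays for the computation. How a packet is
identified across keyword invocations is left open here; the id is purely an
assumption of this sketch.
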
>>>>
>>>>> 4. What exactly is it I'm getting in Packet.payload?
>>>>> As far as I've seen it is the payload of the tcp / udp packet. Is it
>>>>> always the payload for the Application layer (app layer header and app
>>>>> layer payload)?
>>>> A packet payload, so the layer above the tcp/udp/sctp/icmp.
>>>>
>>>>> 5. Is there a way to decrypt a packet on the fly but only if certain
>>>>> criteria are matched, e.g. entropy greater than x (I'd like to be able to
>>>>> do other matches like content on the decrypted code afterwards)?
>>>>> Can I just change Packet.payload to decrypt the packet and then go on
>>>>> matching on it? Or do I have to implement another app layer?
>>>> This is more tricky. Our content inspection engine is currently not
>>>> trivial to extend, it's something we will change. It's not very hard
>>>> either, but it requires a lot of small additions in a lot of places.
>>>>
>>>> If you need the normal "content" inspection to work on your decrypted
>>>> payloads you will need to do your decryption before the detection engine
>>>> is invoked, so at the decode or applayer stages (or you can add your own
>>>> stage and hook it in before detect). You will have to store your
>>>> decrypted buffer in the packet or the flow, depending on the protocol,
>>>> then adapt the detect content inspection engine to know about it. Also,
>>>> for this you'd have to add a content modifier similar to http_uri,
>>>> http_client_body, etc.
>>>>
>>>>> 6. This is about app layers: Are they always "executed"? What I mean is:
>>>>> Are all packets checked for whether they could be http or smb, even if
>>>>> there are no rules that need this kind of matching?
>>>> Currently yes. We do some things on demand in the http parsers, but
>>>> otherwise yes.
>>>>
>>>>> What I would like here is to have some match like entropy greater x and
>>>>> then execute an app layer decoder.
>>>> I think the best way currently would be to bail out early in the app
>>>> layer module if the entropy is below your threshold.
>>>>
>>>>> 7. One last (not so important) question: Why doesn't Suricata use a
>>>>> real plugin system with dynamically linked plugins or something?
>>>>> Is it for performance reasons?
>>>> No, it's actually on our (mental) roadmap. No ETA or anything yet :)
>>>>
>>>> Cheers,
>>>> Victor
>>>>
>>>> Btw, looks like you're not a member of this list, please subscribe.
>>
> 

-- 
---------------------------------------------
Victor Julien
http://www.inliniac.net/
PGP: http://www.inliniac.net/victorjulien.asc
---------------------------------------------