[Oisf-devel] libhtp 0.5.x integration - bug 775
Anoop Saldanha
anoopsaldanha at gmail.com
Tue Apr 9 12:06:49 UTC 2013
On Tue, Apr 9, 2013 at 2:36 PM, Victor Julien <victor at inliniac.net> wrote:
> (bad juju to brian and ivan for top posting and/or html emails! :)
>
> On 04/09/2013 10:21 AM, Ivan Ristic wrote:
>> On Mon, Apr 8, 2013 at 3:54 PM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
>>
>> On Mon, Apr 8, 2013 at 7:50 PM, Brian Rectanus <brectanu at gmail.com> wrote:
>> >
>> > On Mon, Apr 8, 2013 at 9:16 AM, Brian Rectanus <brectanu at gmail.com> wrote:
>> >>
>> >>
>> >> On Mon, Apr 8, 2013 at 8:47 AM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
>> >>>
>> >>> On Mon, Apr 8, 2013 at 3:42 PM, Victor Julien <victor at inliniac.net> wrote:
>> >>> > (moving to oisf-devel)
>> >>> >
>> >>> > On 04/08/2013 06:17 AM, Anoop Saldanha wrote:
>> >>> >>>> I recollect we introduced path and query double decoding through
>> >>> >>>> configurable params, and also we had this thing with query
>> >>> >>>> decoding (single level). Can you explain a bit what the status was
>> >>> >>>> previously? Seeing related failed uts.
>> >>> >>>>
>> >>> >>>
>> >>> >>> We run the path normalization on the query through our
>> >>> >>> HTPCallbackRequestUriNormalizeQuery callback. Previously we used
>> >>> >>> htp_decode_path_inplace to normalize the query (e.g. for
>> >>> >>> uridecoding). However, this was causing issues (remember that pcre
>> >>> >>> "bug" we discussed a while back, where http:// turned into http:/).
>> >>> >>>
>> >>> >>> In libhtp I copied htp_decode_path_inplace to
>> >>> >>> htp_decode_query_inplace and also copied the config params and cfg
>> >>> >>> funcs:
>> >>> >>>
>> >>> >>> https://github.com/inliniac/suricata/commit/d41c762689a08e6814dc93e8bfebeceab97175c3
>> >>> >>>
>> >>> >>> Hack of the 1st order, which is wrong in many ways. But it basically
>> >>> >>> allowed me to make sure we don't normalize the query as if it's path,
>> >>> >>> esp with turning ftp:// into ftp:/ and such.
>> >>> >>>
>> >>> >>> For 0.5 integration I think we need a proper solution. The only
>> >>> >>> reason I pushed my hack like this was that I knew in 0.5 we would
>> >>> >>> make things right.
>> >>> >>>
>> >>> >>
>> >>> >> I think if we still want to double decode, we still require all of
>> >>> >> these above things from our bundled htp.
>> >>> >>
>> >>> >> -----
>> >>> >>
>> >>> >> In 0.5.x, tx->request_uri_normalized has been removed, and we'd now
>> >>> >> have to use the REQUEST_URI hook. We'll have to carry out the
>> >>> >> reconstruction ourselves, and store it ourselves in our HTPState.
>> >>> >>
>> >>>
>> >>> What are your thoughts on this?
>> >>>
>> >>> >
>> >>> > IIRC there is some function in libhtp that does just the decoding of
>> >>> > uriencoding and unicode. We should probably just use that on the query
>> >>> > and do the full normalization on the path.
>> >>> >
>> >>> > As a side thought: I think it would be nice to store path and query
>> >>> > separately so that we can add http_path and http_query keywords later
>> >>> > on.
>> >>> >
>> >>>
>> >>> We'd pretty much extract it directly from parsed_uri. Will have to
>> >>> check if we need the extra double decode phase we have currently in
>> >>> our bundled htp, in which case we'd need to store them separately.
>> >>>
>> >>
>> >> Yes, all the normalized components are in tx->parsed_uri. This is what is
>> >> used in ironbee to expose all the various parts like tx->parsed_uri->path
>> >> and tx->parsed_uri->query.
>> >>
>> >> Also note that the hostname should now be obtained from
>> >> tx->request_hostname in 0.5.
>> >>
>> >> -B
>> >
>> >
>> > FYI, for an example using libhtp 0.5 see ironbee code. This was all
>> > recently updated for 0.5.
>> >
>> > https://github.com/ironbee/ironbee/blob/0.7.x/modules/modhtp.c
>> >
>>
>> Will have a look. Thanks.
>>
>> Previously we would use tx->connp->conn->transactions to access txs
>> in the state. Now that htp_connp_t is an opaque pointer how do I
>> access the txs? Tried locating helper functions to retrieve it, but I
>> didn't find any.
>>
>> It's an oversight that there isn't a helper function to retrieve
>> transactions on a connection. I will add one tomorrow.
>>
>> Having said that, what is your use case that you require to retrieve
>> transactions? I thought your code was driven by the callbacks, which all
>> come with a tx instance (via connp)? For my education, can you explain
>> how you process connection data?
>>
>>
>
> One of the things that we don't do out of the callbacks is logging the
> requests. This is one of the things we need access to the TX store for.
>
And to add to that: since libhtp already keeps the txs in a list,
re-buffering them on our side would be redundant, as I see it.
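
For reference, here is a minimal sketch of the query-decoding issue quoted
earlier in this thread (http:// turning into http:/ when the query is
normalized as if it were a path). It is not libhtp code: the function names
sketch_decode_query_inplace and sketch_collapse_slashes_inplace are
hypothetical, and the assumption is that the query should only get
percent-decoding while slash collapsing is reserved for the path. Unicode
(%u) decoding and '+' handling are deliberately left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not one. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = (unsigned char)tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/*
 * Hypothetical query decoder: decodes %XX escapes in place and does
 * nothing else (no slash collapsing, no dot-segment removal).
 * Invalid or truncated escapes are copied through unchanged.
 * buf must be a NUL-terminated string of length len.
 * Returns the new length.
 */
static size_t sketch_decode_query_inplace(char *buf, size_t len)
{
    size_t r = 0, w = 0;

    assert(buf != NULL);
    while (r < len) {
        if (buf[r] == '%' && r + 2 < len) {
            int hi = hex_value((unsigned char)buf[r + 1]);
            int lo = hex_value((unsigned char)buf[r + 2]);
            if (hi >= 0 && lo >= 0) {
                buf[w++] = (char)(hi * 16 + lo);
                r += 3;
                continue;
            }
        }
        buf[w++] = buf[r++];
    }
    buf[w] = '\0';
    return w;
}

/*
 * Hypothetical path-style step: collapses runs of '/' into a single '/'.
 * Applying this to a query is what mangles embedded URLs.
 * Returns the new length.
 */
static size_t sketch_collapse_slashes_inplace(char *buf, size_t len)
{
    size_t r = 0, w = 0;

    assert(buf != NULL);
    while (r < len) {
        if (buf[r] == '/' && w > 0 && buf[w - 1] == '/') {
            r++; /* skip the repeated slash */
            continue;
        }
        buf[w++] = buf[r++];
    }
    buf[w] = '\0';
    return w;
}
```

With a query such as url=http%3A%2F%2Fexample.com%2Fa, the first function
alone keeps the embedded http:// intact, while running the slash-collapsing
step on top of it reduces it to http:/, which is the behaviour we want to
avoid for the query.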
--
Anoop Saldanha