[Oisf-devel] libhtp 0.5.x integration - bug 775

Victor Julien victor at inliniac.net
Tue Apr 9 09:06:07 UTC 2013


(bad juju to brian and ivan for top posting and/or html emails! :)

On 04/09/2013 10:21 AM, Ivan Ristic wrote:
> On Mon, Apr 8, 2013 at 3:54 PM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
> 
>     On Mon, Apr 8, 2013 at 7:50 PM, Brian Rectanus <brectanu at gmail.com> wrote:
>     >
>     > On Mon, Apr 8, 2013 at 9:16 AM, Brian Rectanus <brectanu at gmail.com> wrote:
>     >>
>     >>
>     >> On Mon, Apr 8, 2013 at 8:47 AM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
>     >>>
>     >>> On Mon, Apr 8, 2013 at 3:42 PM, Victor Julien <victor at inliniac.net> wrote:
>     >>> > (moving to oisf-devel)
>     >>> >
>     >>> > On 04/08/2013 06:17 AM, Anoop Saldanha wrote:
>     >>> >>>> I recollect we introduced path and query double decoding through
>     >>> >>>> configurable params, and also we had this thing with query
>     >>> >>>> decoding (single level).  Can you explain a bit what the status was
>     >>> >>>> previously?  Seeing related failed unit tests.
>     >>> >>>>
>     >>> >>>
>     >>> >>> We run the path normalization on the query through our
>     >>> >>> HTPCallbackRequestUriNormalizeQuery callback. Previously we used
>     >>> >>> htp_decode_path_inplace to normalize the query (e.g. for
>     >>> >>> uridecoding). However, this was causing issues (remember that pcre
>     >>> >>> "bug" we discussed a while back, where http:// turned into http:/).
>     >>> >>>
>     >>> >>> In libhtp I copied htp_decode_path_inplace to
>     >>> >>> htp_decode_query_inplace and also copied the config params and
>     >>> >>> cfg funcs:
>     >>> >>>
>     >>> >>> https://github.com/inliniac/suricata/commit/d41c762689a08e6814dc93e8bfebeceab97175c3
>     >>> >>>
>     >>> >>> Hack of the 1st order, which is wrong in many ways. But it basically
>     >>> >>> allowed me to make sure we don't normalize the query as if it's a
>     >>> >>> path, esp. with turning ftp:// into ftp:/ and such.
>     >>> >>>
>     >>> >>> For 0.5 integration I think we need a proper solution. The only
>     >>> >>> reason I pushed my hack like this was that I knew in 0.5 we would
>     >>> >>> make things right.
>     >>> >>>
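
To make the http:// vs http:/ problem above concrete, here is a minimal
standalone sketch. It is not libhtp code: both helper names are made up for
illustration, and the "path" variant only models slash compression, which I
am assuming here is the step that mangles URLs embedded in a query.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return the value of a hex digit, or -1 if c is not a hex digit. */
static int sketch_hex_val(int c) {
    if (c >= '0' && c <= '9') return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical helper: percent-decode in place and do nothing else.
 * This is the kind of treatment a query string can take safely. */
static void sketch_urldecode_inplace(char *s) {
    assert(s != NULL);
    char *out = s;
    while (*s != '\0') {
        int hi = (s[0] == '%') ? sketch_hex_val((unsigned char)s[1]) : -1;
        int lo = (hi >= 0) ? sketch_hex_val((unsigned char)s[2]) : -1;
        if (hi >= 0 && lo >= 0) {
            *out++ = (char)(hi * 16 + lo);   /* valid %XX escape */
            s += 3;
        } else {
            *out++ = *s++;                   /* copy anything else as-is */
        }
    }
    *out = '\0';
}

/* Hypothetical helper: path-style normalization, modelled here as
 * percent-decoding followed by collapsing runs of '/' into one. */
static void sketch_normalize_path_inplace(char *s) {
    sketch_urldecode_inplace(s);
    char *out = s;
    char prev = '\0';
    for (const char *in = s; *in != '\0'; in++) {
        if (*in == '/' && prev == '/') continue;  /* drop repeated slash */
        prev = *in;
        *out++ = *in;
    }
    *out = '\0';
}
```

Feeding "url=http%3A%2F%2Fexample.com" to the decode-only helper keeps the
"//" intact, while the path-style helper collapses it, which is the
behaviour we do not want on the query.
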
>     >>> >>
>     >>> >> I think if we still want to double decode, we still require all of
>     >>> >> these above things from our bundled htp.
>     >>> >>
>     >>> >> -----
>     >>> >>
>     >>> >> In 0.5.x, tx->request_uri_normalized has been removed, and we'd now
>     >>> >> have to use the REQUEST_URI hook.  We'll have to carry out the
>     >>> >> reconstruction ourselves, and store it ourselves in our HTPState.
>     >>> >>
>     >>>
>     >>> What are your thoughts on this?
>     >>>
>     >>> >
>     >>> > IIRC there is some function in libhtp that does just the decoding of
>     >>> > uriencoding and unicode. We should probably just use that on the query
>     >>> > and do the full normalization on the path.
>     >>> >
>     >>> > As a side thought: I think it would be nice to store path and query
>     >>> > separately so that we can add http_path and http_query keywords later
>     >>> > on.
>     >>> >
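
A rough sketch of what "store path and query separately" could look like.
The struct and function names are hypothetical, not existing Suricata or
libhtp API, and I am assuming a plain split at the first '?' with no
fragment handling:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container: views into the original URI buffer. */
typedef struct {
    const char *path;    /* start of the path */
    size_t path_len;     /* length of the path, excluding '?' */
    const char *query;   /* start of the query, or NULL if there is none */
    size_t query_len;    /* length of the query */
} SketchUriParts;

/* Split a NUL-terminated request URI at the first '?'. Nothing is copied
 * or modified, so each part can be normalized by its own rules later. */
static SketchUriParts sketch_split_uri(const char *uri) {
    assert(uri != NULL);
    SketchUriParts parts = { uri, strlen(uri), NULL, 0 };
    const char *qmark = strchr(uri, '?');
    if (qmark != NULL) {
        parts.path_len = (size_t)(qmark - uri);
        parts.query = qmark + 1;
        parts.query_len = strlen(qmark + 1);
    }
    return parts;
}
```

With the two parts kept apart like this, the path could get the full
normalization and the query just the decoding, and separate keywords would
have something to match against.
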
>     >>>
>     >>> We'd pretty much extract it directly from parsed_uri.  Will have to
>     >>> check if we need the extra double decode phase we have currently in
>     >>> our bundled htp, in which case we'd need to store them separately.
>     >>>
>     >>
>     >> Yes, all the normalized components are in tx->parsed_uri.  This is what is
>     >> used in ironbee to expose all the various parts like tx->parsed_uri->path
>     >> and tx->parsed_uri->query.
>     >>
>     >> Also note that the hostname should now be obtained from
>     >> tx->request_hostname in 0.5.
>     >>
>     >> -B
>     >
>     >
>     > FYI, for an example using libhtp 0.5 see ironbee code.  This was all
>     > recently updated for 0.5.
>     >
>     > https://github.com/ironbee/ironbee/blob/0.7.x/modules/modhtp.c
>     >
> 
>     Will have a look.  Thanks.
> 
>     Previously we would use tx->connp->conn->transactions to access txs
>     in the state.  Now that htp_connp_t is an opaque pointer how do I
>     access the txs? Tried locating helper functions to retrieve it, but I
>     didn't find any.
> 
> It's an oversight that there isn't a helper function to retrieve
> transactions on a connection. I will add one tomorrow.
>
> Having said that, what is your use case that you require to retrieve
> transactions? I thought your code was driven by the callbacks, which all
> come with a tx instance (via connp)? For my education, can you explain
> how you process connection data?
>
>

One of the things that we don't do out of the callbacks is logging the
requests. This is one of the things we need access to the TX store for.
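
Very roughly, and only as a standalone sketch: the types and the function
below are made up for illustration and are neither the Suricata structures
nor the libhtp API. I am assuming a logger that runs outside the parser
callbacks, walks the stored transactions, and logs each completed one once:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SKETCH_MAX_TXS 16

/* Hypothetical, simplified transaction record. */
typedef struct {
    const char *method;
    const char *uri;
    int complete;   /* non-zero once the request has been fully parsed */
    int logged;     /* non-zero once the logger has written it out */
} SketchTx;

/* Hypothetical per-connection TX store. */
typedef struct {
    SketchTx txs[SKETCH_MAX_TXS];
    size_t tx_count;
} SketchTxStore;

/* Walk the store and log every completed, not yet logged transaction.
 * Returns the number of transactions logged by this call. */
static size_t sketch_log_completed(SketchTxStore *store, FILE *out) {
    assert(store != NULL && out != NULL);
    size_t logged_now = 0;
    for (size_t i = 0; i < store->tx_count; i++) {
        SketchTx *tx = &store->txs[i];
        if (!tx->complete || tx->logged) continue;
        fprintf(out, "%s %s\n", tx->method, tx->uri);
        tx->logged = 1;
        logged_now++;
    }
    return logged_now;
}
```

The point is just that the loop needs to get at the list of transactions
from the state, without a callback handing it a tx.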

-- 
---------------------------------------------
Victor Julien
http://www.inliniac.net/
PGP: http://www.inliniac.net/victorjulien.asc
---------------------------------------------



