[Oisf-devel] libhtp 0.5.x integration - bug 775

Ivan Ristic ivan.ristic at gmail.com
Thu Apr 11 15:40:42 UTC 2013


I wouldn't advise you to do any buffering anyhow. But I am curious whether
you're deleting transactions once you're done with them. If you're not, you
may be holding on to a lot of memory (every tx instance) on long-lived
HTTP connections.


On Tue, Apr 9, 2013 at 1:06 PM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:

> On Tue, Apr 9, 2013 at 2:36 PM, Victor Julien <victor at inliniac.net> wrote:
> > (bad juju to brian and ivan for top posting and/or html emails! :)
> >
> > On 04/09/2013 10:21 AM, Ivan Ristic wrote:
> >> On Mon, Apr 8, 2013 at 3:54 PM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
> >>
> >>     On Mon, Apr 8, 2013 at 7:50 PM, Brian Rectanus <brectanu at gmail.com> wrote:
> >>     >
> >>     > On Mon, Apr 8, 2013 at 9:16 AM, Brian Rectanus <brectanu at gmail.com> wrote:
> >>     >>
> >>     >>
> >>     >> On Mon, Apr 8, 2013 at 8:47 AM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
> >>     >>>
> >>     >>> On Mon, Apr 8, 2013 at 3:42 PM, Victor Julien <victor at inliniac.net> wrote:
> >>     >>> > (moving to oisf-devel)
> >>     >>> >
> >>     >>> > On 04/08/2013 06:17 AM, Anoop Saldanha wrote:
> >>     >>> >>>> I recollect we introduced path and query double decoding through
> >>     >>> >>>> configurable params, and also we had this thing with query
> >>     >>> >>>> decoding (single level).  Can you explain a bit what the status was
> >>     >>> >>>> previously?  Seeing related failed uts.
> >>     >>> >>>>
> >>     >>> >>>
> >>     >>> >>> We run the path normalization on the query through our
> >>     >>> >>> HTPCallbackRequestUriNormalizeQuery callback. Previously we used
> >>     >>> >>> htp_decode_path_inplace to normalize the query (e.g. for
> >>     >>> >>> uridecoding).
> >>     >>> >>> However, this was causing issues (remember that pcre "bug" we
> >>     >>> >>> discussed a while back, where http:// turned into http:/).
> >>     >>> >>>
> >>     >>> >>> In libhtp I copied htp_decode_path_inplace to
> >>     >>> >>> htp_decode_query_inplace
> >>     >>> >>> and also copied the config params and cfg funcs:
> >>     >>> >>>
> >>     >>> >>> https://github.com/inliniac/suricata/commit/d41c762689a08e6814dc93e8bfebeceab97175c3
> >>     >>> >>>
> >>     >>> >>> Hack of the 1st order, which is wrong in many ways. But it basically
> >>     >>> >>> allowed me to make sure we don't normalize the query as if it's path,
> >>     >>> >>> esp with turning ftp:// into ftp:/ and such.
> >>     >>> >>>
> >>     >>> >>> For 0.5 integration I think we need a proper solution. The only
> >>     >>> >>> reason I pushed my hack like this was that I knew in 0.5 we would
> >>     >>> >>> make things right.
> >>     >>> >>>
> >>     >>> >>
> >>     >>> >> I think if we still want to double decode, we still require all of
> >>     >>> >> these above things from our bundled htp.
> >>     >>> >>
> >>     >>> >> -----
> >>     >>> >>
> >>     >>> >> In 0.5.x, tx->request_uri_normalized has been removed, and we'd now
> >>     >>> >> have to use the REQUEST_URI hook.  We'll have to carry out the
> >>     >>> >> reconstruction ourselves, and store it ourselves in our HTPState.
> >>     >>> >>
> >>     >>>
> >>     >>> What are your thoughts on this?
> >>     >>>
> >>     >>> >
> >>     >>> > IIRC there is some function in libhtp that does just the decoding of
> >>     >>> > uriencoding and unicode. We should probably just use that on the query
> >>     >>> > and do the full normalization on the path.
> >>     >>> >
> >>     >>> > As a side thought: I think it would be nice to store path and query
> >>     >>> > separately so that we can add http_path and http_query keywords later
> >>     >>> > on.
> >>     >>> >
> >>     >>>
> >>     >>> We'd pretty much extract it directly from parsed_uri.  Will have to
> >>     >>> check if we need the extra double decode phase we have currently in
> >>     >>> our bundled htp, in which case we'd need to store them separately.
> >>     >>>
> >>     >>
> >>     >> Yes, all the normalized components are in tx->parsed_uri.  This is what is
> >>     >> used in ironbee to expose all the various parts like tx->parsed_uri->path
> >>     >> and tx->parsed_uri->query.
> >>     >>
> >>     >> Also note that the hostname should now be obtained from
> >>     >> tx->request_hostname in 0.5.
> >>     >>
> >>     >> -B
> >>     >
> >>     >
> >>     > FYI, for an example using libhtp 0.5 see ironbee code.  This was all
> >>     > recently updated for 0.5.
> >>     >
> >>     > https://github.com/ironbee/ironbee/blob/0.7.x/modules/modhtp.c
> >>     >
> >>
> >>     Will have a look.  Thanks.
> >>
> >>     Previously we would use tx->connp->conn->transactions to access txs
> >>     in the state.  Now that htp_connp_t is an opaque pointer how do I
> >>     access the txs? Tried locating helper functions to retrieve it, but I
> >>     didn't find any.
> >>
> >> It's an oversight that there isn't a helper function to retrieve
> >> transactions on a connection. I will add one tomorrow.
> >>
> >> Having said that, what is your use case that requires you to retrieve
> >> transactions? I thought your code was driven by the callbacks, which all
> >> come with a tx instance (via connp)? For my education, can you explain
> >> how you process connection data?
> >>
> >>
> >
> > One of the things that we don't do out of the callbacks is logging the
> > requests. This is one of the things we need access to the TX store for.
> >
>
> And to add to it, since we already have the txs stored in a list
> inside libhtp, re-buffering the txs would come as a redundant task,
> from where I see it.
>
> --
> Anoop Saldanha



-- 
Ivan Ristić