[Oisf-devel] libhtp 0.5.x integration - bug 775
Anoop Saldanha
anoopsaldanha at gmail.com
Wed May 15 06:37:54 UTC 2013
Ivan,
I see that the following functions have been introduced:
htp_tx_t *htp_connp_get_in_tx(const htp_connp_t *connp);
htp_tx_t *htp_connp_get_out_tx(const htp_connp_t *connp);
Does this mean I won't be able to retrieve individual txs? If I receive
5 pipelined requests, 5 txs would be created. How do I retrieve each of
those txs?
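
To make the question concrete, here is a minimal, self-contained sketch of the kind of access being asked for: one tx per pipelined request, each retrievable by its index on the connection, and each freed once it has been handled. All names here (demo_tx, demo_conn, demo_conn_add_tx, demo_conn_get_tx, demo_conn_free_tx, demo_conn_destroy) are hypothetical stand-ins invented for illustration. They are not libhtp types or functions, and this is not a claim about how libhtp stores its transactions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration only; NOT libhtp types. */
typedef struct demo_tx {
    int id;       /* position of the request on the connection */
    int logged;   /* set by the consumer once the tx has been logged */
} demo_tx;

typedef struct demo_conn {
    demo_tx **txs;  /* slot i holds tx i, or NULL once it has been freed */
    size_t count;   /* number of slots in use */
    size_t cap;     /* number of slots allocated */
} demo_conn;

/* Append a new tx, e.g. one per pipelined request. Returns NULL on OOM. */
static demo_tx *demo_conn_add_tx(demo_conn *c) {
    if (c->count == c->cap) {
        size_t ncap = c->cap ? c->cap * 2 : 4;
        demo_tx **n = realloc(c->txs, ncap * sizeof(*n));
        if (n == NULL) return NULL;
        c->txs = n;
        c->cap = ncap;
    }
    demo_tx *tx = calloc(1, sizeof(*tx));
    if (tx == NULL) return NULL;
    tx->id = (int)c->count;
    c->txs[c->count++] = tx;
    return tx;
}

/* Retrieve a tx by index; NULL if out of range or already freed. */
static demo_tx *demo_conn_get_tx(const demo_conn *c, size_t idx) {
    return idx < c->count ? c->txs[idx] : NULL;
}

/* Free a finished tx but keep its slot, so later indexes stay stable. */
static void demo_conn_free_tx(demo_conn *c, size_t idx) {
    if (idx < c->count) {
        free(c->txs[idx]);
        c->txs[idx] = NULL;
    }
}

/* Release every remaining tx and the slot array itself. */
static void demo_conn_destroy(demo_conn *c) {
    for (size_t i = 0; i < c->count; i++) free(c->txs[i]);
    free(c->txs);
    c->txs = NULL;
    c->count = c->cap = 0;
}
```

With something of this shape, a logger running outside the callbacks could walk indexes 0..count-1, skip NULL slots, and free each tx after logging it, which keeps memory bounded on long-lived connections.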
On Thu, Apr 11, 2013 at 10:16 PM, Anoop Saldanha
<anoopsaldanha at gmail.com> wrote:
> On Thu, Apr 11, 2013 at 9:10 PM, Ivan Ristic <ivan.ristic at gmail.com> wrote:
>> I wouldn't advise you to do any buffering anyhow. But I am curious if
>> you're deleting transactions once you're done with them. Because, if
>> you're not, you may be allocating a lot of memory (all tx instances) on
>> long-lived HTTP connections.
>>
>
> We do delete them, once we're done.
>
>>
>> On Tue, Apr 9, 2013 at 1:06 PM, Anoop Saldanha <anoopsaldanha at gmail.com>
>> wrote:
>>>
>>> On Tue, Apr 9, 2013 at 2:36 PM, Victor Julien <victor at inliniac.net> wrote:
>>> > (bad juju to brian and ivan for top posting and/or html emails! :)
>>> >
>>> > On 04/09/2013 10:21 AM, Ivan Ristic wrote:
>>> >> On Mon, Apr 8, 2013 at 3:54 PM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
>>> >>
>>> >> On Mon, Apr 8, 2013 at 7:50 PM, Brian Rectanus <brectanu at gmail.com> wrote:
>>> >> >
>>> >> > On Mon, Apr 8, 2013 at 9:16 AM, Brian Rectanus <brectanu at gmail.com> wrote:
>>> >> >>
>>> >> >>
>>> >> >> On Mon, Apr 8, 2013 at 8:47 AM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
>>> >> >>>
>>> >> >>> On Mon, Apr 8, 2013 at 3:42 PM, Victor Julien <victor at inliniac.net> wrote:
>>> >> >>> > (moving to oisf-devel)
>>> >> >>> >
>>> >> >>> > On 04/08/2013 06:17 AM, Anoop Saldanha wrote:
>>> >> >>> >>>> I recollect we introduced path and query double decoding through
>>> >> >>> >>>> configurable params, and we also had this thing with query
>>> >> >>> >>>> decoding (single level). Can you explain a bit what the status was
>>> >> >>> >>>> previously? I'm seeing related failed unit tests.
>>> >> >>> >>>>
>>> >> >>> >>>
>>> >> >>> >>> We run the path normalization on the query through our
>>> >> >>> >>> HTPCallbackRequestUriNormalizeQuery callback. Previously we used
>>> >> >>> >>> htp_decode_path_inplace to normalize the query (e.g. for
>>> >> >>> >>> uridecoding). However, this was causing issues (remember that
>>> >> >>> >>> pcre "bug" we discussed a while back, where http:// turned into
>>> >> >>> >>> http:/).
>>> >> >>> >>>
>>> >> >>> >>> In libhtp I copied htp_decode_path_inplace to
>>> >> >>> >>> htp_decode_query_inplace
>>> >> >>> >>> and also copied the config params and cfg funcs:
>>> >> >>> >>>
>>> >> >>> >>>
>>> >> >>> >>> https://github.com/inliniac/suricata/commit/d41c762689a08e6814dc93e8bfebeceab97175c3
>>> >> >>> >>>
>>> >> >>> >>> Hack of the 1st order, which is wrong in many ways. But it basically
>>> >> >>> >>> allowed me to make sure we don't normalize the query as if it's path,
>>> >> >>> >>> esp with turning ftp:// into ftp:/ and such.
>>> >> >>> >>>
>>> >> >>> >>> For 0.5 integration I think we need a proper solution. The only
>>> >> >>> >>> reason I pushed my hack like this was that I knew in 0.5 we would
>>> >> >>> >>> make things right.
>>> >> >>> >>>
>>> >> >>> >>
>>> >> >>> >> I think if we still want to double decode, we still require all of
>>> >> >>> >> these above things from our bundled htp.
>>> >> >>> >>
>>> >> >>> >> -----
>>> >> >>> >>
>>> >> >>> >> In 0.5.x, tx->request_uri_normalized has been removed, and we'd now
>>> >> >>> >> have to use the REQUEST_URI hook. We'll have to carry out the
>>> >> >>> >> reconstruction ourselves, and store it ourselves in our HTPState.
>>> >> >>> >>
>>> >> >>>
>>> >> >>> What are your thoughts on this?
>>> >> >>>
>>> >> >>> >
>>> >> >>> > IIRC there is some function in libhtp that does just the decoding of
>>> >> >>> > uriencoding and unicode. We should probably just use that on the query
>>> >> >>> > and do the full normalization on the path.
>>> >> >>> >
>>> >> >>> > As a side thought: I think it would be nice to store path and query
>>> >> >>> > separately so that we can add http_path and http_query keywords later
>>> >> >>> > on.
>>> >> >>> >
>>> >> >>>
>>> >> >>> We'd pretty much extract it directly from parsed_uri. Will have to
>>> >> >>> check if we need the extra double decode phase we have currently in
>>> >> >>> our bundled htp, in which case we'd need to store them separately.
>>> >> >>>
>>> >> >>
>>> >> >> Yes, all the normalized components are in tx->parsed_uri. This is what is
>>> >> >> used in ironbee to expose all the various parts like tx->parsed_uri->path
>>> >> >> and tx->parsed_uri->query.
>>> >> >>
>>> >> >> Also note that the hostname should now be obtained from
>>> >> >> tx->request_hostname in 0.5.
>>> >> >>
>>> >> >> -B
>>> >> >
>>> >> >
>>> >> > FYI, for an example using libhtp 0.5 see ironbee code. This was all
>>> >> > recently updated for 0.5.
>>> >> >
>>> >> > https://github.com/ironbee/ironbee/blob/0.7.x/modules/modhtp.c
>>> >> >
>>> >>
>>> >> Will have a look. Thanks.
>>> >>
>>> >> Previously we would use tx->connp->conn->transactions to access txs
>>> >> in the state. Now that htp_connp_t is an opaque pointer how do I
>>> >> access the txs? Tried locating helper functions to retrieve it, but I
>>> >> didn't find any.
>>> >>
>>> >> It's an oversight that there isn't a helper function to retrieve
>>> >> transactions on a connection. I will add one tomorrow.
>>> >>
>>> >> Having said that, what is your use case that you require to retrieve
>>> >> transactions? I thought your code was driven by the callbacks, which all
>>> >> come with a tx instance (via connp)? For my education, can you explain
>>> >> how you process connection data?
>>> >>
>>> >>
>>> >
>>> > One of the things that we don't do out of the callbacks is logging the
>>> > requests. This is one of the things we need access to the TX store for.
>>> >
>>>
>>> And to add to it, since we already have the txs stored in a list
>>> inside libhtp, re-buffering the txs would come as a redundant task,
>>> from where I see it.
>>>
>>> --
>>> Anoop Saldanha
>>
>>
>>
>>
>> --
>> Ivan Ristić
>
>
>
> --
> Anoop Saldanha
--
Anoop Saldanha
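
For reference, the query-decoding problem quoted above (path normalization turning http:// into http:/ when it is applied to the query) can be reproduced with a small, self-contained sketch. It assumes the damage comes from a path-style step that collapses repeated slashes, and it shows that a plain %XX decode on its own leaves the query value intact. The functions demo_urldecode_inplace and demo_compress_slashes_inplace are hypothetical names written for this illustration. They are not libhtp functions and do not reproduce htp_decode_path_inplace.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if ch is not a hex digit. */
static int hex_val(unsigned char ch) {
    if (ch >= '0' && ch <= '9') return ch - '0';
    ch = (unsigned char)tolower(ch);
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    return -1;
}

/* Hypothetical helper: single-pass, in-place %XX decoding.
 * Invalid or truncated escapes are copied through unchanged.
 * Returns the new length. Note that a decoded %00 would embed a
 * NUL byte; real code should carry an explicit length instead. */
static size_t demo_urldecode_inplace(char *s) {
    size_t r = 0, w = 0;
    while (s[r] != '\0') {
        int hi, lo;
        if (s[r] == '%' &&
            (hi = hex_val((unsigned char)s[r + 1])) >= 0 &&
            (lo = hex_val((unsigned char)s[r + 2])) >= 0) {
            s[w++] = (char)(hi * 16 + lo);
            r += 3;
        } else {
            s[w++] = s[r++];
        }
    }
    s[w] = '\0';
    return w;
}

/* Hypothetical helper: path-style step that collapses runs of '/'
 * into a single '/'. Reasonable for a path, harmful for a query. */
static size_t demo_compress_slashes_inplace(char *s) {
    size_t r = 0, w = 0;
    while (s[r] != '\0') {
        if (s[r] == '/' && w > 0 && s[w - 1] == '/') {
            r++;            /* drop the repeated separator */
            continue;
        }
        s[w++] = s[r++];
    }
    s[w] = '\0';
    return w;
}
```

Decoding "url=http%3A%2F%2Fexample.com%2Fa" with only the first helper keeps the embedded "http://", while running the second helper afterwards reduces it to "http:/", which is the symptom described in the quoted messages.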