[Oisf-devel] libhtp 0.5.x integration - bug 775
Ivan Ristic
ivan.ristic at gmail.com
Wed May 15 06:48:24 UTC 2013
On Wed, May 15, 2013 at 7:37 AM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
> Ivan,
>
> I see the introduction of
>
> htp_tx_t *htp_connp_get_in_tx(const htp_connp_t *connp);
> htp_tx_t *htp_connp_get_out_tx(const htp_connp_t *connp);
>
> Which means I won't be able to retrieve individual txs?
Those 2 functions will give you only the currently active request and
response, respectively. There can be one of each at any given time.
With recent changes, callbacks are sent the correct tx, so the above
functions will rarely be needed when you're processing one transaction
at a time.
> I receive 5
> pipelined requests, so that would be 5 txs created. How do I retrieve
> the individual txs?
The transactions are in htp_conn_t::transactions, which is a list. How
to access the htp_conn_t pointer depends on your setup. You probably
keep a pointer to connp somewhere in your context, and from there you
can get a connection using htp_connp_get_connection().
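
To make the walk over the list concrete, here is a rough sketch. Everything prefixed demo_ is a stand-in invented for illustration so that the snippet compiles without libhtp; none of it is the real library definition. In real code you would start from htp_connp_get_connection() and use the list accessors from htp_list.h (I believe htp_list_size() and htp_list_get() in 0.5.x, but please check the header). The NULL check assumes that a deleted transaction leaves an empty slot in the list, which you should also verify against the version you build with.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in types for illustration only, NOT the real libhtp structures. */
typedef struct { const char *request_line; } demo_tx_t;
typedef struct { demo_tx_t **items; size_t size; } demo_list_t;
typedef struct { demo_list_t *transactions; } demo_conn_t;
typedef struct { demo_conn_t *conn; } demo_connp_t;

/* Stand-in for htp_connp_get_connection(). */
static demo_conn_t *demo_connp_get_connection(const demo_connp_t *connp) {
    return connp != NULL ? connp->conn : NULL;
}

/* Stand-ins for the list accessors (hypothetical names). */
static size_t demo_list_size(const demo_list_t *l) { return l->size; }
static demo_tx_t *demo_list_get(const demo_list_t *l, size_t idx) {
    return idx < l->size ? l->items[idx] : NULL;
}

/* Log every transaction still present on the connection.
 * Returns the number of transactions logged. */
static size_t demo_log_txs(const demo_connp_t *connp, FILE *out) {
    assert(out != NULL);
    demo_conn_t *conn = demo_connp_get_connection(connp);
    if (conn == NULL || conn->transactions == NULL) return 0;

    size_t logged = 0;
    size_t n = demo_list_size(conn->transactions);
    for (size_t i = 0; i < n; i++) {
        demo_tx_t *tx = demo_list_get(conn->transactions, i);
        /* Assumption: a tx that was already deleted leaves a NULL slot. */
        if (tx == NULL) continue;
        fprintf(out, "tx %zu: %s\n", i, tx->request_line);
        logged++;
    }
    return logged;
}

/* Five pipelined requests, the second one already deleted by the caller. */
static size_t demo_pipelined_example(void) {
    demo_tx_t t[5] = {
        {"GET /a HTTP/1.1"}, {"GET /b HTTP/1.1"}, {"GET /c HTTP/1.1"},
        {"GET /d HTTP/1.1"}, {"GET /e HTTP/1.1"}
    };
    demo_tx_t *slots[5] = {&t[0], NULL, &t[2], &t[3], &t[4]};
    demo_list_t list = {slots, 5};
    demo_conn_t conn = {&list};
    demo_connp_t connp = {&conn};
    return demo_log_txs(&connp, stdout);
}
```

Indexing by position rather than keeping your own copy of the tx pointers means there is nothing to re-buffer on your side; the list inside the connection stays the single source of truth.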
> On Thu, Apr 11, 2013 at 10:16 PM, Anoop Saldanha
> <anoopsaldanha at gmail.com> wrote:
>> On Thu, Apr 11, 2013 at 9:10 PM, Ivan Ristic <ivan.ristic at gmail.com> wrote:
>>> I wouldn't advise you to do any buffering anyhow. But I am curious if
>>> you're deleting transactions once you're done with them. Because, if
>>> you're not, you may be allocating a lot of memory (all tx instances) on
>>> long-lived HTTP connections.
>>>
>>
>> We do delete them, once we're done.
>>
>>>
>>> On Tue, Apr 9, 2013 at 1:06 PM, Anoop Saldanha <anoopsaldanha at gmail.com>
>>> wrote:
>>>>
>>>> On Tue, Apr 9, 2013 at 2:36 PM, Victor Julien <victor at inliniac.net> wrote:
>>>> > (bad juju to brian and ivan for top posting and/or html emails! :)
>>>> >
>>>> > On 04/09/2013 10:21 AM, Ivan Ristic wrote:
>>>> >> On Mon, Apr 8, 2013 at 3:54 PM, Anoop Saldanha <anoopsaldanha at gmail.com
>>>> >> <mailto:anoopsaldanha at gmail.com>> wrote:
>>>> >>
>>>> >> On Mon, Apr 8, 2013 at 7:50 PM, Brian Rectanus <brectanu at gmail.com
>>>> >> <mailto:brectanu at gmail.com>> wrote:
>>>> >> >
>>>> >> > On Mon, Apr 8, 2013 at 9:16 AM, Brian Rectanus
>>>> >> <brectanu at gmail.com
>>>> >> <mailto:brectanu at gmail.com>> wrote:
>>>> >> >>
>>>> >> >>
>>>> >> >> On Mon, Apr 8, 2013 at 8:47 AM, Anoop Saldanha
>>>> >> <anoopsaldanha at gmail.com <mailto:anoopsaldanha at gmail.com>>
>>>> >> >> wrote:
>>>> >> >>>
>>>> >> >>> On Mon, Apr 8, 2013 at 3:42 PM, Victor Julien
>>>> >> <victor at inliniac.net <mailto:victor at inliniac.net>>
>>>> >> >>> wrote:
>>>> >> >>> > (moving to oisf-devel)
>>>> >> >>> >
>>>> >> >>> > On 04/08/2013 06:17 AM, Anoop Saldanha wrote:
>>>> >> >>> >>>> I recollect we introduced path and query double decoding
>>>> >> >>> >>>> through configurable params, and also we had this thing with
>>>> >> >>> >>>> query decoding (single level). Can you explain a bit what the
>>>> >> >>> >>>> status was previously. Seeing related failed uts.
>>>> >> >>> >>>>
>>>> >> >>> >>>
>>>> >> >>> >>> We run the path normalization on the query through our
>>>> >> >>> >>> HTPCallbackRequestUriNormalizeQuery callback. Previously we
>>>> >> >>> >>> used htp_decode_path_inplace to normalize the query (e.g. for
>>>> >> >>> >>> uridecoding). However, this was causing issues (remember that
>>>> >> >>> >>> pcre "bug" we discussed a while back, where http:// turned
>>>> >> >>> >>> into http:/).
>>>> >> >>> >>>
>>>> >> >>> >>> In libhtp I copied htp_decode_path_inplace to
>>>> >> >>> >>> htp_decode_query_inplace and also copied the config params and
>>>> >> >>> >>> cfg funcs:
>>>> >> >>> >>>
>>>> >> >>> >>> https://github.com/inliniac/suricata/commit/d41c762689a08e6814dc93e8bfebeceab97175c3
>>>> >> >>> >>>
>>>> >> >>> >>> Hack of the 1st order, which is wrong in many ways. But it
>>>> >> >>> >>> basically allowed me to make sure we don't normalize the query
>>>> >> >>> >>> as if it's path, esp with turning ftp:// into ftp:/ and such.
>>>> >> >>> >>>
>>>> >> >>> >>> For 0.5 integration I think we need a proper solution. The
>>>> >> >>> >>> only reason I pushed my hack like this was that I knew in 0.5
>>>> >> >>> >>> we would make things right.
>>>> >> >>> >>>
>>>> >> >>> >>
>>>> >> >>> >> I think if we still want to double decode, we still require
>>>> >> >>> >> all of these above things from our bundled htp.
>>>> >> >>> >>
>>>> >> >>> >> -----
>>>> >> >>> >>
>>>> >> >>> >> In 0.5.x, tx->request_uri_normalized has been removed, and
>>>> >> >>> >> we'd now have to use the REQUEST_URI hook. We'll have to carry
>>>> >> >>> >> out the reconstruction ourselves, and store it ourselves in
>>>> >> >>> >> our HTPState.
>>>> >> >>> >>
>>>> >> >>>
>>>> >> >>> What are your thoughts on this?
>>>> >> >>>
>>>> >> >>> >
>>>> >> >>> > IIRC there is some function in libhtp that does just the
>>>> >> >>> > decoding of uriencoding and unicode. We should probably just
>>>> >> >>> > use that on the query and do the full normalization on the
>>>> >> >>> > path.
>>>> >> >>> >
>>>> >> >>> > As a side thought: I think it would be nice to store path and
>>>> >> >>> > query separately so that we can add http_path and http_query
>>>> >> >>> > keywords later on.
>>>> >> >>> >
>>>> >> >>>
>>>> >> >>> We'd pretty much extract it directly from parsed_uri. Will have
>>>> >> >>> to check if we need the extra double decode phase we have
>>>> >> >>> currently in our bundled htp, in which case we'd need to store
>>>> >> >>> them separately.
>>>> >> >>>
>>>> >> >>
>>>> >> >> Yes, all the normalized components are in tx->parsed_uri. This
>>>> >> >> is what is used in ironbee to expose all the various parts like
>>>> >> >> tx->parsed_uri->path and tx->parsed_uri->query.
>>>> >> >>
>>>> >> >> Also note that the hostname should now be obtained from
>>>> >> >> tx->request_hostname in 0.5.
>>>> >> >>
>>>> >> >> -B
>>>> >> >
>>>> >> >
>>>> >> > FYI, for an example using libhtp 0.5 see ironbee code. This was
>>>> >> > all recently updated for 0.5.
>>>> >> >
>>>> >> > https://github.com/ironbee/ironbee/blob/0.7.x/modules/modhtp.c
>>>> >> >
>>>> >>
>>>> >> Will have a look. Thanks.
>>>> >>
>>>> >> Previously we would use tx->connp->conn->transactions to access txs
>>>> >> in the state. Now that htp_connp_t is an opaque pointer how do I
>>>> >> access the txs? Tried locating helper functions to retrieve it, but
>>>> >> I didn't find any.
>>>> >>
>>>> >> It's an oversight that there isn't a helper function to retrieve
>>>> >> transactions on a connection. I will add one tomorrow.
>>>> >>
>>>> >> Having said that, what is your use case that you require to retrieve
>>>> >> transactions? I thought your code was driven by the callbacks, which
>>>> >> all come with a tx instance (via connp)? For my education, can you
>>>> >> explain how you process connection data?
>>>> >>
>>>> >>
>>>> >
>>>> > One of the things that we don't do out of the callbacks is logging the
>>>> > requests. This is one of the things we need access to the TX store for.
>>>> >
>>>>
>>>> And to add to it, since we already have the txs stored in a list
>>>> inside libhtp, re-buffering the txs would come as a redundant task,
>>>> from where I see it.
>>>>
>>>> --
>>>> Anoop Saldanha
>>>> _______________________________________________
>>>> Suricata IDS Devel mailing list: oisf-devel at openinfosecfoundation.org
>>>> Site: http://suricata-ids.org | Participate:
>>>> http://suricata-ids.org/participate/
>>>> List: https://lists.openinfosecfoundation.org/mailman/listinfo/oisf-devel
>>>> Redmine: https://redmine.openinfosecfoundation.org/
>>>
>>>
>>>
>>>
>>> --
>>> Ivan Ristić
>>
>>
>>
>> --
>> Anoop Saldanha
>
>
>
> --
> Anoop Saldanha
--
Ivan Ristić