[Oisf-devel] libhtp 0.5.x integration - bug 775

Anoop Saldanha anoopsaldanha at gmail.com
Mon Jun 3 16:07:21 UTC 2013


@Victor

Since we need to store the normalized request uri in our htp_state, we
can probably figure out a solution that we can also reuse in dcerpc
for storing transactions.

Probably a linked list that stores the tx_id (the tx id for the related
data) of its head?
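
A minimal sketch of one way to read that idea, assuming tx ids are
assigned sequentially and per-tx data is freed from the front of the
list. Only the head's tx id is stored, so the node N steps after the
head belongs to tx id head_tx_id + N. All names here (TxData,
TxDataList, TxDataListAppend, TxDataListGet, TxDataListPopHead) are
hypothetical and are not part of libhtp or Suricata.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-transaction record; not a libhtp or Suricata type. */
typedef struct TxData_ {
    char *uri_normalized;   /* owned copy of the normalized request URI */
    struct TxData_ *next;
} TxData;

/* Hypothetical list: only the tx id of the head is stored. The node
 * N steps after the head belongs to tx id head_tx_id + N. */
typedef struct TxDataList_ {
    uint64_t head_tx_id;
    TxData *head;
    TxData *tail;
} TxDataList;

/* Append data for the next tx id. Returns 0 on success, -1 if an
 * allocation fails. */
static int TxDataListAppend(TxDataList *l, const char *uri)
{
    TxData *d = calloc(1, sizeof(*d));
    if (d == NULL)
        return -1;
    size_t len = strlen(uri) + 1;
    d->uri_normalized = malloc(len);
    if (d->uri_normalized == NULL) {
        free(d);
        return -1;
    }
    memcpy(d->uri_normalized, uri, len);
    if (l->tail == NULL)
        l->head = d;
    else
        l->tail->next = d;
    l->tail = d;
    return 0;
}

/* Look up the data for a tx id. Returns NULL if that tx was already
 * freed or has not been stored yet. */
static TxData *TxDataListGet(const TxDataList *l, uint64_t tx_id)
{
    if (tx_id < l->head_tx_id)
        return NULL;
    uint64_t skip = tx_id - l->head_tx_id;
    TxData *d = l->head;
    while (d != NULL && skip > 0) {
        d = d->next;
        skip--;
    }
    return d;
}

/* Free the head node once its tx is done and advance head_tx_id. */
static void TxDataListPopHead(TxDataList *l)
{
    TxData *d = l->head;
    if (d == NULL)
        return;
    l->head = d->next;
    if (l->head == NULL)
        l->tail = NULL;
    l->head_tx_id++;
    free(d->uri_normalized);
    free(d);
}
```

Lookup is a walk of (tx_id - head_tx_id) nodes, which stays short as
long as finished txs are popped from the head. The same shape could in
principle hold dcerpc transaction data instead of a URI.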

On Wed, May 15, 2013 at 12:38 PM, Anoop Saldanha
<anoopsaldanha at gmail.com> wrote:
> Right.  Thanks.
>
> On Wed, May 15, 2013 at 12:18 PM, Ivan Ristic <ivan.ristic at gmail.com> wrote:
>> On Wed, May 15, 2013 at 7:37 AM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
>>> Ivan,
>>>
>>> I see the introduction of
>>>
>>> htp_tx_t *htp_connp_get_in_tx(const htp_connp_t *connp);
>>> htp_tx_t *htp_connp_get_out_tx(const htp_connp_t *connp);
>>>
>>> Which means I won't be able to retrieve individual txs?
>>
>> Those 2 functions will give you only the currently active request and
>> response, respectively. There can be one of each at any given time.
>>
>> With recent changes, callbacks are sent the correct tx, so the above
>> functions will rarely be needed when you're processing one transaction
>> at a time.
>>
>>
>>> I receive 5
>>> pipelined requests, so that would be 5 txs created.  How do I retrieve
>>> the individual txs?
>>
>> The transactions are in htp_conn_t::transactions, which is a list. How
>> to access the htp_conn_t pointer depends on your setup. You probably
>> keep a pointer to connp somewhere in your context, and from there you
>> can get a connection using htp_connp_get_connection().
>>
>>
>>> On Thu, Apr 11, 2013 at 10:16 PM, Anoop Saldanha
>>> <anoopsaldanha at gmail.com> wrote:
>>>> On Thu, Apr 11, 2013 at 9:10 PM, Ivan Ristic <ivan.ristic at gmail.com> wrote:
>>>>> I wouldn't advise you to do any buffering anyhow.
>>>>> But I am curious if you're
>>>>> deleting transactions once you're done with them. Because, if you're not,
>>>>> you may be allocating a lot of memory (all tx instances) on long-lived HTTP
>>>>> connections.
>>>>>
>>>>
>>>> We do delete them, once we're done.
>>>>
>>>>>
>>>>> On Tue, Apr 9, 2013 at 1:06 PM, Anoop Saldanha <anoopsaldanha at gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> On Tue, Apr 9, 2013 at 2:36 PM, Victor Julien <victor at inliniac.net> wrote:
>>>>>> > (bad juju to brian and ivan for top posting and/or html emails! :)
>>>>>> >
>>>>>> > On 04/09/2013 10:21 AM, Ivan Ristic wrote:
>>>>>> >> On Mon, Apr 8, 2013 at 3:54 PM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
>>>>>> >>
>>>>>> >>     On Mon, Apr 8, 2013 at 7:50 PM, Brian Rectanus <brectanu at gmail.com> wrote:
>>>>>> >>     >
>>>>>> >>     > On Mon, Apr 8, 2013 at 9:16 AM, Brian Rectanus <brectanu at gmail.com> wrote:
>>>>>> >>     >>
>>>>>> >>     >>
>>>>>> >>     >> On Mon, Apr 8, 2013 at 8:47 AM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
>>>>>> >>     >>>
>>>>>> >>     >>> On Mon, Apr 8, 2013 at 3:42 PM, Victor Julien <victor at inliniac.net> wrote:
>>>>>> >>     >>> > (moving to oisf-devel)
>>>>>> >>     >>> >
>>>>>> >>     >>> > On 04/08/2013 06:17 AM, Anoop Saldanha wrote:
>>>>>> >>     >>> >>>> I recollect we introduced path and query double decoding through
>>>>>> >>     >>> >>>> configurable params, and also we had this thing with query
>>>>>> >>     >>> >>>> decoding(single level).  Can you explain a bit what the status was
>>>>>> >>     >>> >>>> previously.  Seeing related failed uts.
>>>>>> >>     >>> >>>>
>>>>>> >>     >>> >>>
>>>>>> >>     >>> >>> We run the path normalization on the query through our
>>>>>> >>     >>> >>> HTPCallbackRequestUriNormalizeQuery callback. Previously we used
>>>>>> >>     >>> >>> htp_decode_path_inplace to normalize the query (e.g. for
>>>>>> >>     >>> >>> uridecoding).
>>>>>> >>     >>> >>> However, this was causing issues (remember that pcre "bug" we
>>>>>> >>     >>> >>> discussed a while back, where http:// turned into http:/).
>>>>>> >>     >>> >>>
>>>>>> >>     >>> >>> In libhtp I copied htp_decode_path_inplace to
>>>>>> >>     >>> >>> htp_decode_query_inplace
>>>>>> >>     >>> >>> and also copied the config params and cfg funcs:
>>>>>> >>     >>> >>>
>>>>>> >>     >>> >>> https://github.com/inliniac/suricata/commit/d41c762689a08e6814dc93e8bfebeceab97175c3
>>>>>> >>     >>> >>>
>>>>>> >>     >>> >>> Hack of the 1st order, which is wrong in many ways. But it basically
>>>>>> >>     >>> >>> allowed me to make sure we don't normalize the query as if it's path,
>>>>>> >>     >>> >>> esp with turning ftp:// into ftp:/ and such.
>>>>>> >>     >>> >>>
>>>>>> >>     >>> >>> For 0.5 integration I think we need a proper solution. The only
>>>>>> >>     >>> >>> reason I pushed my hack like this was that I knew in 0.5 we would
>>>>>> >>     >>> >>> make things right.
>>>>>> >>     >>> >>>
>>>>>> >>     >>> >>
>>>>>> >>     >>> >> I think if we still want to double decode, we still require all of
>>>>>> >>     >>> >> these above things from our bundled htp.
>>>>>> >>     >>> >>
>>>>>> >>     >>> >> -----
>>>>>> >>     >>> >>
>>>>>> >>     >>> >> In 0.5.x, tx->request_uri_normalized has been removed, and we'd now
>>>>>> >>     >>> >> have to use the REQUEST_URI hook.  We'll have to carry out the
>>>>>> >>     >>> >> reconstruction ourselves, and store it ourselves in our HTPState.
>>>>>> >>     >>> >>
>>>>>> >>     >>>
>>>>>> >>     >>> What are your thoughts on this?
>>>>>> >>     >>>
>>>>>> >>     >>> >
>>>>>> >>     >>> > IIRC there is some function in libhtp that does just the decoding of
>>>>>> >>     >>> > uriencoding and unicode. We should probably just use that on the query
>>>>>> >>     >>> > and do the full normalization on the path.
>>>>>> >>     >>> >
>>>>>> >>     >>> > As a side thought: I think it would be nice to store path and query
>>>>>> >>     >>> > separately so that we can add http_path and http_query keywords later
>>>>>> >>     >>> > on.
>>>>>> >>     >>> >
>>>>>> >>     >>>
>>>>>> >>     >>> We'd pretty much extract it directly from parsed_uri.  Will have to
>>>>>> >>     >>> check if we need the extra double decode phase we have currently in
>>>>>> >>     >>> our bundled htp, in which case we'd need to store them separately.
>>>>>> >>     >>>
>>>>>> >>     >>
>>>>>> >>     >> Yes, all the normalized components are in tx->parsed_uri.  This is what is
>>>>>> >>     >> used in ironbee to expose all the various parts like tx->parsed_uri->path
>>>>>> >>     >> and tx->parsed_uri->query.
>>>>>> >>     >>
>>>>>> >>     >> Also note that the hostname should now be obtained from
>>>>>> >>     >> tx->request_hostname in 0.5.
>>>>>> >>     >>
>>>>>> >>     >> -B
>>>>>> >>     >
>>>>>> >>     >
>>>>>> >>     > FYI, for an example using libhtp 0.5 see ironbee code.  This was all
>>>>>> >>     > recently updated for 0.5.
>>>>>> >>     >
>>>>>> >>     > https://github.com/ironbee/ironbee/blob/0.7.x/modules/modhtp.c
>>>>>> >>     >
>>>>>> >>
>>>>>> >>     Will have a look.  Thanks.
>>>>>> >>
>>>>>> >>     Previously we would use tx->connp->conn->transactions to access txs
>>>>>> >>     in the state.  Now that htp_connp_t is an opaque pointer how do I
>>>>>> >>     access the txs? Tried locating helper functions to retrieve it, but I
>>>>>> >>     didn't find any.
>>>>>> >>
>>>>>> >> It's an oversight that there isn't a helper function to retrieve
>>>>>> >> transactions on a connection. I will add one tomorrow.
>>>>>> >>
>>>>>> >> Having said that, what is your use case that you require to retrieve
>>>>>> >> transactions? I thought your code was driven by the callbacks, which all
>>>>>> >> come with a tx instance (via connp)? For my education, can you explain
>>>>>> >> how you process connection data?
>>>>>> >>
>>>>>> >>
>>>>>> >
>>>>>> > One of the things that we don't do out of the callbacks is logging the
>>>>>> > requests. This is one of the things we need access to the TX store for.
>>>>>> >
>>>>>>
>>>>>> And to add to it, since we already have the txs stored in a list
>>>>>> inside libhtp, re-buffering the txs would come as a redundant task,
>>>>>> from where I see it.
>>>>>>
>>>>>> --
>>>>>> Anoop Saldanha
>>>>>> _______________________________________________
>>>>>> Suricata IDS Devel mailing list: oisf-devel at openinfosecfoundation.org
>>>>>> Site: http://suricata-ids.org | Participate:
>>>>>> http://suricata-ids.org/participate/
>>>>>> List: https://lists.openinfosecfoundation.org/mailman/listinfo/oisf-devel
>>>>>> Redmine: https://redmine.openinfosecfoundation.org/
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Ivan Ristić
>>>>
>>>>
>>>>
>>>> --
>>>> Anoop Saldanha
>>>
>>>
>>>
>>> --
>>> Anoop Saldanha
>>
>>
>>
>> --
>> Ivan Ristić
>
>
>
> --
> Anoop Saldanha



-- 
-------------------------------
Anoop Saldanha
http://www.poona.me
-------------------------------