[Oisf-devel] libhtp 0.5.x integration - bug 775

Anoop Saldanha anoopsaldanha at gmail.com
Tue Jun 11 15:01:45 UTC 2013


Ivan,

Or maybe it shouldn't check for NULL as well.  Rather set an invalid
basic authorization failure event and proceed parsing normally the
rest of the request?

Same when it checks for the user value

htp_parsers.c:132

      bstr *decoded = htp_base64_decode_mem(data + pos, len - pos);
      if (decoded == NULL) return HTP_ERROR;

On Tue, Jun 11, 2013 at 8:15 PM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
> Ivan,
>
> htp_parsers.c:149
>
>       connp->in_tx->request_auth_password = bstr_dup_ex(decoded, i +
> 1, bstr_len(decoded) - i - 1);
>       if (connp->in_tx->request_auth_password) {
>           bstr_free(decoded);
>           bstr_free(connp->in_tx->request_auth_username);
>           return HTP_ERROR;
>       }
>
> If we have a non-null auth password we error out on the stream.  Is
> this right or should that have been a NULL check?
>
> On Tue, Jun 11, 2013 at 5:11 PM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
>> Ivan,
>>
>> Updating my point (2) from the previous mail, since
>> htp_unparse_uri_noencode() takes a htp_uri_t argument, I'd probaby
>> have to create a custom function that utilizes
>> htp_unparse_uri_noencode() + use tx->request_params while producing
>> the normalized uri.  Is there any other way to do this, while using
>> the normalized query to create the normalized uri?
>>
>> On Mon, Jun 10, 2013 at 7:36 PM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
>>> @Ivan
>>>
>>> I am using htp_unparse_uri_noencode() to retrieve the reconstructed
>>> normalized uri.
>>>
>>> 1. Should we enable url query normalization using
>>> htp_config_register_urlencode_parser()?
>>> 2. If (1) is true, shouldn't the value stored by (1) in
>>> tx->request_params be used by htp_unparse_uri_noencode() while
>>> reconstructing the normalized uri?
>>> 3. Suricata provides conf settings to enable double decoding for path
>>> and query.  We do this by hooking into the request line callback and
>>> calling htp_decode_path_inplace() and htp_decode_query_inplace().  Do
>>> we have any internal support from libhtp for decoding double encoded
>>> characters?
>>>
>>> @ Victor.
>>>
>>> We enabled some features here to decode path in query here -
>>> https://github.com/inliniac/suricata/commit/d41c762689a08e6814dc93e8bfebeceab97175c3
>>>
>>> If a query parameter is an uri, should be decode the uri?  It's an
>>> argument and from the user perspective we don't know how the user
>>> treats the query argument internally .  So some of the settings wrt
>>> decoding a path in a query may not make sense?
>>>
>>> On Mon, Jun 10, 2013 at 5:52 PM, Ivan Ristic <ivan.ristic at gmail.com> wrote:
>>>> Yes, thanks for noticing. I've removed it now.
>>>>
>>>>
>>>> On Mon, Jun 10, 2013 at 12:48 PM, Anoop Saldanha
>>>> <anoopsaldanha at gmail.com> wrote:
>>>>> Ivan,
>>>>>
>>>>> Now that libhtp 0.5.x doesn't generate the normalized request uri
>>>>> anymore, htp_config_set_generate_request_uri_normalized() should
>>>>> probably be removed?
>>>>>
>>>>> On Fri, Jun 7, 2013 at 8:21 PM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
>>>>>> Ivan,
>>>>>>
>>>>>> When a request such as "HELLO\r\n", libhtp would have the
>>>>>> "request_protocol_number" set as HTP_PROTOCOL_0_9.  Is that right or
>>>>>> should it be HTP_PROTOCOL_UNKNOWN?
>>>>>>
>>>>>> On Tue, Jun 4, 2013 at 12:02 AM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
>>>>>>> On Mon, Jun 3, 2013 at 10:48 PM, Victor Julien <victor at inliniac.net> wrote:
>>>>>>>> On 06/03/2013 06:07 PM, Anoop Saldanha wrote:
>>>>>>>>> @Victor
>>>>>>>>>
>>>>>>>>> Since we need to store the normalized request uri in our htp_state, we
>>>>>>>>> can probably figure out a solution that we can also reuse in dcerpc
>>>>>>>>> for storing transactions.
>>>>>>>>>
>>>>>>>>> Probably a linked_list that stores the tx_id(tx id for the related
>>>>>>>>> data) of it's head?
>>>>>>>>
>>>>>>>> Would it be an option to use the per tx HtpUserData that we use in the
>>>>>>>> 0.2.x implementation for body tracking?
>>>>>>>
>>>>>>> Yes, we can use it store all generated buffers.
>>>>>>>
>>>>>>>>
>>>>>>>>> On Wed, May 15, 2013 at 12:38 PM, Anoop Saldanha
>>>>>>>>> <anoopsaldanha at gmail.com> wrote:
>>>>>>>>>> Right.  Thanks.
>>>>>>>>>>
>>>>>>>>>> On Wed, May 15, 2013 at 12:18 PM, Ivan Ristic <ivan.ristic at gmail.com> wrote:
>>>>>>>>>>> On Wed, May 15, 2013 at 7:37 AM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
>>>>>>>>>>>> Ivan,
>>>>>>>>>>>>
>>>>>>>>>>>> I see the introduction of
>>>>>>>>>>>>
>>>>>>>>>>>> htp_tx_t *htp_connp_get_in_tx(const htp_connp_t *connp);
>>>>>>>>>>>> htp_tx_t *htp_connp_get_out_tx(const htp_connp_t *connp);
>>>>>>>>>>>>
>>>>>>>>>>>> Which means I won't be able to retrieve individual txs?
>>>>>>>>>>>
>>>>>>>>>>> Those 2 functions will give you only the currently active request and
>>>>>>>>>>> response, respectively. There can be one of each at any given time.
>>>>>>>>>>>
>>>>>>>>>>> With recent changes, callbacks are sent the correct tx, so the above
>>>>>>>>>>> functions will rarely be needed when you're processing one transaction
>>>>>>>>>>> at a time.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> I receive 5
>>>>>>>>>>>> pipelined requests, so that would be 5 txs created.  How do I retrieve
>>>>>>>>>>>> the individual txs?
>>>>>>>>>>>
>>>>>>>>>>> The transactions are in htp_conn_t::transactions, which is a list. How
>>>>>>>>>>> to access the htp_conn_t pointer depends on your setup. You probably
>>>>>>>>>>> keep a pointer to connp somewhere in your context, and from there you
>>>>>>>>>>> can get a connection using htp_connp_get_connection().
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Apr 11, 2013 at 10:16 PM, Anoop Saldanha
>>>>>>>>>>>> <anoopsaldanha at gmail.com> wrote:
>>>>>>>>>>>>> On Thu, Apr 11, 2013 at 9:10 PM, Ivan Ristic <ivan.ristic at gmail.com> wrote:
>>>>>>>>>>>>>> I wouldn't advise you to do any buffering anyhow.
>>>>>>>>>>>>>> But I am curious if you're
>>>>>>>>>>>>>> deleting transactions once you're done with them. Because, if you're not,
>>>>>>>>>>>>>> you may be allocating a lot of memory (all tx instances) on long-lived HTTP
>>>>>>>>>>>>>> connections.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> We do delete them, once we're done.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Apr 9, 2013 at 1:06 PM, Anoop Saldanha <anoopsaldanha at gmail.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Apr 9, 2013 at 2:36 PM, Victor Julien <victor at inliniac.net> wrote:
>>>>>>>>>>>>>>>> (bad juju to brian and ivan for top posting and/or html emails! :)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 04/09/2013 10:21 AM, Ivan Ristic wrote:
>>>>>>>>>>>>>>>>> On Mon, Apr 8, 2013 at 3:54 PM, Anoop Saldanha <anoopsaldanha at gmail.com
>>>>>>>>>>>>>>>>> <mailto:anoopsaldanha at gmail.com>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     On Mon, Apr 8, 2013 at 7:50 PM, Brian Rectanus <brectanu at gmail.com
>>>>>>>>>>>>>>>>>     <mailto:brectanu at gmail.com>> wrote:
>>>>>>>>>>>>>>>>>     >
>>>>>>>>>>>>>>>>>     > On Mon, Apr 8, 2013 at 9:16 AM, Brian Rectanus
>>>>>>>>>>>>>>>>> <brectanu at gmail.com
>>>>>>>>>>>>>>>>>     <mailto:brectanu at gmail.com>> wrote:
>>>>>>>>>>>>>>>>>     >>
>>>>>>>>>>>>>>>>>     >>
>>>>>>>>>>>>>>>>>     >> On Mon, Apr 8, 2013 at 8:47 AM, Anoop Saldanha
>>>>>>>>>>>>>>>>>     <anoopsaldanha at gmail.com <mailto:anoopsaldanha at gmail.com>>
>>>>>>>>>>>>>>>>>     >> wrote:
>>>>>>>>>>>>>>>>>     >>>
>>>>>>>>>>>>>>>>>     >>> On Mon, Apr 8, 2013 at 3:42 PM, Victor Julien
>>>>>>>>>>>>>>>>>     <victor at inliniac.net <mailto:victor at inliniac.net>>
>>>>>>>>>>>>>>>>>     >>> wrote:
>>>>>>>>>>>>>>>>>     >>> > (moving to oisf-devel)
>>>>>>>>>>>>>>>>>     >>> >
>>>>>>>>>>>>>>>>>     >>> > On 04/08/2013 06:17 AM, Anoop Saldanha wrote:
>>>>>>>>>>>>>>>>>     >>> >>>> I recollect we introduced path and query double decoding
>>>>>>>>>>>>>>>>>     through
>>>>>>>>>>>>>>>>>     >>> >>>> configurable params, and also we had this thing with query
>>>>>>>>>>>>>>>>>     >>> >>>> decoding(single level).  Can you explain a bit what the
>>>>>>>>>>>>>>>>>     status was
>>>>>>>>>>>>>>>>>     >>> >>>> previously.  Seeing related failed uts.
>>>>>>>>>>>>>>>>>     >>> >>>>
>>>>>>>>>>>>>>>>>     >>> >>>
>>>>>>>>>>>>>>>>>     >>> >>> We run the path normalization on the query through our
>>>>>>>>>>>>>>>>>     >>> >>> HTPCallbackRequestUriNormalizeQuery callback. Previously we
>>>>>>>>>>>>>>>>> used
>>>>>>>>>>>>>>>>>     >>> >>> htp_decode_path_inplace to normalize the query (e.g. for
>>>>>>>>>>>>>>>>>     >>> >>> uridecoding).
>>>>>>>>>>>>>>>>>     >>> >>> However, this was causing issues (remember that pcre "bug"
>>>>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>>     >>> >>> discussed
>>>>>>>>>>>>>>>>>     >>> >>> a while back, where http:// turned into http:/).
>>>>>>>>>>>>>>>>>     >>> >>>
>>>>>>>>>>>>>>>>>     >>> >>> In libhtp I copied htp_decode_path_inplace to
>>>>>>>>>>>>>>>>>     >>> >>> htp_decode_query_inplace
>>>>>>>>>>>>>>>>>     >>> >>> and also copied the config params and cfg funcs:
>>>>>>>>>>>>>>>>>     >>> >>>
>>>>>>>>>>>>>>>>>     >>> >>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> https://github.com/inliniac/suricata/commit/d41c762689a08e6814dc93e8bfebeceab97175c3
>>>>>>>>>>>>>>>>>     >>> >>>
>>>>>>>>>>>>>>>>>     >>> >>> Hack of the 1st order, which is wrong in many ways. But it
>>>>>>>>>>>>>>>>>     basically
>>>>>>>>>>>>>>>>>     >>> >>> allowed me to make sure we don't normalize the query as if
>>>>>>>>>>>>>>>>>     it's path,
>>>>>>>>>>>>>>>>>     >>> >>> esp with turning ftp:// into ftp:/ and such.
>>>>>>>>>>>>>>>>>     >>> >>>
>>>>>>>>>>>>>>>>>     >>> >>> For 0.5 integration I think we need a proper solution. The
>>>>>>>>>>>>>>>>> only
>>>>>>>>>>>>>>>>>     >>> >>> reason I
>>>>>>>>>>>>>>>>>     >>> >>> pushed my hack like this was that I knew in 0.5 we would
>>>>>>>>>>>>>>>>>     make things
>>>>>>>>>>>>>>>>>     >>> >>> right.
>>>>>>>>>>>>>>>>>     >>> >>>
>>>>>>>>>>>>>>>>>     >>> >>
>>>>>>>>>>>>>>>>>     >>> >> I think if we still want to double decode, we still require
>>>>>>>>>>>>>>>>>     all of
>>>>>>>>>>>>>>>>>     >>> >> these above things from our bundled htp.
>>>>>>>>>>>>>>>>>     >>> >>
>>>>>>>>>>>>>>>>>     >>> >> -----
>>>>>>>>>>>>>>>>>     >>> >>
>>>>>>>>>>>>>>>>>     >>> >> In 0.5.x, tx->request_uri_normalized has been removed, and
>>>>>>>>>>>>>>>>>     we'd now
>>>>>>>>>>>>>>>>>     >>> >> have to use the REQUEST_URI hook.  We'll have to carry out
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>     >>> >> reconstruction ourselves, and store it ourselves in our
>>>>>>>>>>>>>>>>> HTPState.
>>>>>>>>>>>>>>>>>     >>> >>
>>>>>>>>>>>>>>>>>     >>>
>>>>>>>>>>>>>>>>>     >>> What are your thoughts on this?
>>>>>>>>>>>>>>>>>     >>>
>>>>>>>>>>>>>>>>>     >>> >
>>>>>>>>>>>>>>>>>     >>> > IIRC there is some function in libhtp that does just the
>>>>>>>>>>>>>>>>>     decoding of
>>>>>>>>>>>>>>>>>     >>> > uriencoding and unicode. We should probably just use that on
>>>>>>>>>>>>>>>>>     the query
>>>>>>>>>>>>>>>>>     >>> > and do the full normalization on the path.
>>>>>>>>>>>>>>>>>     >>> >
>>>>>>>>>>>>>>>>>     >>> > As a side thought: I think it would be nice to store path and
>>>>>>>>>>>>>>>>>     query
>>>>>>>>>>>>>>>>>     >>> > separately so that we can add http_path and http_query
>>>>>>>>>>>>>>>>>     keywords later
>>>>>>>>>>>>>>>>>     >>> > on.
>>>>>>>>>>>>>>>>>     >>> >
>>>>>>>>>>>>>>>>>     >>>
>>>>>>>>>>>>>>>>>     >>> We'd pretty much extract it directly from parsed_uri.  Will
>>>>>>>>>>>>>>>>> have to
>>>>>>>>>>>>>>>>>     >>> check if we need the extract double decode phase we have
>>>>>>>>>>>>>>>>>     currently in
>>>>>>>>>>>>>>>>>     >>> our bundled htp, in which case we'd need to store them
>>>>>>>>>>>>>>>>> separately.
>>>>>>>>>>>>>>>>>     >>>
>>>>>>>>>>>>>>>>>     >>
>>>>>>>>>>>>>>>>>     >> Yes, all the normalized components are in tx->parsed_uri.  This
>>>>>>>>>>>>>>>>>     is what is
>>>>>>>>>>>>>>>>>     >> used in ironbee to expose all the various parts like
>>>>>>>>>>>>>>>>>     tx->parsed_uri->path
>>>>>>>>>>>>>>>>>     >> and tx->parsed_uri->query.
>>>>>>>>>>>>>>>>>     >>
>>>>>>>>>>>>>>>>>     >> Also note that the hostname should now be obtained from
>>>>>>>>>>>>>>>>>     >> tx->request_hostname in 0.5.
>>>>>>>>>>>>>>>>>     >>
>>>>>>>>>>>>>>>>>     >> -B
>>>>>>>>>>>>>>>>>     >
>>>>>>>>>>>>>>>>>     >
>>>>>>>>>>>>>>>>>     > FYI, for an example using libhtp 0.5 see ironbee code.  This was
>>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>>     > recently updated for 0.5.
>>>>>>>>>>>>>>>>>     >
>>>>>>>>>>>>>>>>>     > https://github.com/ironbee/ironbee/blob/0.7.x/modules/modhtp.c
>>>>>>>>>>>>>>>>>     >
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     Will have a look.  Thanks.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     Previously we would use tx->connp->conn->transactions to access txs
>>>>>>>>>>>>>>>>>     in the state.  Now that htp_connp_t is an opaque pointer how do I
>>>>>>>>>>>>>>>>>     access the txs? Tried locating helper functions to retrieve it, but
>>>>>>>>>>>>>>>>> I
>>>>>>>>>>>>>>>>>     didn't find any.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It's an oversight that there isn't a helper function to retrieve
>>>>>>>>>>>>>>>>> transactions on a connections. I will add one tomorrow.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Having said that, what is your use case that you require to retrieve
>>>>>>>>>>>>>>>>> transactions? I thought your code was driven by the callbacks, which >
>>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>> come with a tx instance (via connp)? For my education, can you explain
>>>>>>>>>>>>>>>>> how you process connection data?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> One of the things that we don't do out of the callbacks is logging the
>>>>>>>>>>>>>>>> requests. This is one of the things we need access to the TX store for.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> And to add to it, since we already have the txs stored in a list
>>>>>>>>>>>>>>> inside libhtp, re-buffering the txs would come as a redundant task,
>>>>>>>>>>>>>>> from where I see it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Anoop Saldanha
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> Suricata IDS Devel mailing list: oisf-devel at openinfosecfoundation.org
>>>>>>>>>>>>>>> Site: http://suricata-ids.org | Participate:
>>>>>>>>>>>>>>> http://suricata-ids.org/participate/
>>>>>>>>>>>>>>> List: https://lists.openinfosecfoundation.org/mailman/listinfo/oisf-devel
>>>>>>>>>>>>>>> Redmine: https://redmine.openinfosecfoundation.org/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Ivan Ristić
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Anoop Saldanha
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Anoop Saldanha
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Ivan Ristić
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Anoop Saldanha
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> ---------------------------------------------
>>>>>>>> Victor Julien
>>>>>>>> http://www.inliniac.net/
>>>>>>>> PGP: http://www.inliniac.net/victorjulien.asc
>>>>>>>> ---------------------------------------------
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> -------------------------------
>>>>>>> Anoop Saldanha
>>>>>>> http://www.poona.me
>>>>>>> -------------------------------
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> -------------------------------
>>>>>> Anoop Saldanha
>>>>>> http://www.poona.me
>>>>>> -------------------------------
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> -------------------------------
>>>>> Anoop Saldanha
>>>>> http://www.poona.me
>>>>> -------------------------------
>>>>
>>>>
>>>>
>>>> --
>>>> Ivan Ristić
>>>
>>>
>>>
>>> --
>>> -------------------------------
>>> Anoop Saldanha
>>> http://www.poona.me
>>> -------------------------------
>>
>>
>>
>> --
>> -------------------------------
>> Anoop Saldanha
>> http://www.poona.me
>> -------------------------------
>
>
>
> --
> -------------------------------
> Anoop Saldanha
> http://www.poona.me
> -------------------------------



-- 
-------------------------------
Anoop Saldanha
http://www.poona.me
-------------------------------



More information about the Oisf-devel mailing list