[Oisf-devel] libhtp 0.5.x integration - bug 775
Anoop Saldanha
anoopsaldanha at gmail.com
Tue Jun 11 14:45:25 UTC 2013
Ivan,
htp_parsers.c:149
connp->in_tx->request_auth_password = bstr_dup_ex(decoded, i +
1, bstr_len(decoded) - i - 1);
if (connp->in_tx->request_auth_password) {
bstr_free(decoded);
bstr_free(connp->in_tx->request_auth_username);
return HTP_ERROR;
}
If we have a non-null auth password we error out on the stream. Is
this right or should that have been a NULL check?
On Tue, Jun 11, 2013 at 5:11 PM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
> Ivan,
>
> Updating my point (2) from the previous mail, since
> htp_unparse_uri_noencode() takes a htp_uri_t argument, I'd probaby
> have to create a custom function that utilizes
> htp_unparse_uri_noencode() + use tx->request_params while producing
> the normalized uri. Is there any other way to do this, while using
> the normalized query to create the normalized uri?
>
> On Mon, Jun 10, 2013 at 7:36 PM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
>> @Ivan
>>
>> I am using htp_unparse_uri_noencode() to retrieve the reconstructed
>> normalized uri.
>>
>> 1. Should we enable url query normalization using
>> htp_config_register_urlencode_parser()?
>> 2. If (1) is true, shouldn't the value stored by (1) in
>> tx->request_params be used by htp_unparse_uri_noencode() while
>> reconstructing the normalized uri?
>> 3. Suricata provides conf settings to enable double decoding for path
>> and query. We do this by hooking into the request line callback and
>> calling htp_decode_path_inplace() and htp_decode_query_inplace(). Do
>> we have any internal support from libhtp for decoding double encoded
>> characters?
>>
>> @ Victor.
>>
>> We enabled some features here to decode path in query here -
>> https://github.com/inliniac/suricata/commit/d41c762689a08e6814dc93e8bfebeceab97175c3
>>
>> If a query parameter is an uri, should be decode the uri? It's an
>> argument and from the user perspective we don't know how the user
>> treats the query argument internally . So some of the settings wrt
>> decoding a path in a query may not make sense?
>>
>> On Mon, Jun 10, 2013 at 5:52 PM, Ivan Ristic <ivan.ristic at gmail.com> wrote:
>>> Yes, thanks for noticing. I've removed it now.
>>>
>>>
>>> On Mon, Jun 10, 2013 at 12:48 PM, Anoop Saldanha
>>> <anoopsaldanha at gmail.com> wrote:
>>>> Ivan,
>>>>
>>>> Now that libhtp 0.5.x doesn't generate the normalized request uri
>>>> anymore, htp_config_set_generate_request_uri_normalized() should
>>>> probably be removed?
>>>>
>>>> On Fri, Jun 7, 2013 at 8:21 PM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
>>>>> Ivan,
>>>>>
>>>>> When a request such as "HELLO\r\n", libhtp would have the
>>>>> "request_protocol_number" set as HTP_PROTOCOL_0_9. Is that right or
>>>>> should it be HTP_PROTOCOL_UNKNOWN?
>>>>>
>>>>> On Tue, Jun 4, 2013 at 12:02 AM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
>>>>>> On Mon, Jun 3, 2013 at 10:48 PM, Victor Julien <victor at inliniac.net> wrote:
>>>>>>> On 06/03/2013 06:07 PM, Anoop Saldanha wrote:
>>>>>>>> @Victor
>>>>>>>>
>>>>>>>> Since we need to store the normalized request uri in our htp_state, we
>>>>>>>> can probably figure out a solution that we can also reuse in dcerpc
>>>>>>>> for storing transactions.
>>>>>>>>
>>>>>>>> Probably a linked_list that stores the tx_id(tx id for the related
>>>>>>>> data) of it's head?
>>>>>>>
>>>>>>> Would it be an option to use the per tx HtpUserData that we use in the
>>>>>>> 0.2.x implementation for body tracking?
>>>>>>
>>>>>> Yes, we can use it store all generated buffers.
>>>>>>
>>>>>>>
>>>>>>>> On Wed, May 15, 2013 at 12:38 PM, Anoop Saldanha
>>>>>>>> <anoopsaldanha at gmail.com> wrote:
>>>>>>>>> Right. Thanks.
>>>>>>>>>
>>>>>>>>> On Wed, May 15, 2013 at 12:18 PM, Ivan Ristic <ivan.ristic at gmail.com> wrote:
>>>>>>>>>> On Wed, May 15, 2013 at 7:37 AM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
>>>>>>>>>>> Ivan,
>>>>>>>>>>>
>>>>>>>>>>> I see the introduction of
>>>>>>>>>>>
>>>>>>>>>>> htp_tx_t *htp_connp_get_in_tx(const htp_connp_t *connp);
>>>>>>>>>>> htp_tx_t *htp_connp_get_out_tx(const htp_connp_t *connp);
>>>>>>>>>>>
>>>>>>>>>>> Which means I won't be able to retrieve individual txs?
>>>>>>>>>>
>>>>>>>>>> Those 2 functions will give you only the currently active request and
>>>>>>>>>> response, respectively. There can be one of each at any given time.
>>>>>>>>>>
>>>>>>>>>> With recent changes, callbacks are sent the correct tx, so the above
>>>>>>>>>> functions will rarely be needed when you're processing one transaction
>>>>>>>>>> at a time.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> I receive 5
>>>>>>>>>>> pipelined requests, so that would be 5 txs created. How do I retrieve
>>>>>>>>>>> the individual txs?
>>>>>>>>>>
>>>>>>>>>> The transactions are in htp_conn_t::transactions, which is a list. How
>>>>>>>>>> to access the htp_conn_t pointer depends on your setup. You probably
>>>>>>>>>> keep a pointer to connp somewhere in your context, and from there you
>>>>>>>>>> can get a connection using htp_connp_get_connection().
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On Thu, Apr 11, 2013 at 10:16 PM, Anoop Saldanha
>>>>>>>>>>> <anoopsaldanha at gmail.com> wrote:
>>>>>>>>>>>> On Thu, Apr 11, 2013 at 9:10 PM, Ivan Ristic <ivan.ristic at gmail.com> wrote:
>>>>>>>>>>>>> I wouldn't advise you to do any buffering anyhow.
>>>>>>>>>>>>> But I am curious if you're
>>>>>>>>>>>>> deleting transactions once you're done with them. Because, if you're not,
>>>>>>>>>>>>> you may be allocating a lot of memory (all tx instances) on long-lived HTTP
>>>>>>>>>>>>> connections.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> We do delete them, once we're done.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Apr 9, 2013 at 1:06 PM, Anoop Saldanha <anoopsaldanha at gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Apr 9, 2013 at 2:36 PM, Victor Julien <victor at inliniac.net> wrote:
>>>>>>>>>>>>>>> (bad juju to brian and ivan for top posting and/or html emails! :)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 04/09/2013 10:21 AM, Ivan Ristic wrote:
>>>>>>>>>>>>>>>> On Mon, Apr 8, 2013 at 3:54 PM, Anoop Saldanha <anoopsaldanha at gmail.com
>>>>>>>>>>>>>>>> <mailto:anoopsaldanha at gmail.com>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Apr 8, 2013 at 7:50 PM, Brian Rectanus <brectanu at gmail.com
>>>>>>>>>>>>>>>> <mailto:brectanu at gmail.com>> wrote:
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > On Mon, Apr 8, 2013 at 9:16 AM, Brian Rectanus
>>>>>>>>>>>>>>>> <brectanu at gmail.com
>>>>>>>>>>>>>>>> <mailto:brectanu at gmail.com>> wrote:
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> On Mon, Apr 8, 2013 at 8:47 AM, Anoop Saldanha
>>>>>>>>>>>>>>>> <anoopsaldanha at gmail.com <mailto:anoopsaldanha at gmail.com>>
>>>>>>>>>>>>>>>> >> wrote:
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>> On Mon, Apr 8, 2013 at 3:42 PM, Victor Julien
>>>>>>>>>>>>>>>> <victor at inliniac.net <mailto:victor at inliniac.net>>
>>>>>>>>>>>>>>>> >>> wrote:
>>>>>>>>>>>>>>>> >>> > (moving to oisf-devel)
>>>>>>>>>>>>>>>> >>> >
>>>>>>>>>>>>>>>> >>> > On 04/08/2013 06:17 AM, Anoop Saldanha wrote:
>>>>>>>>>>>>>>>> >>> >>>> I recollect we introduced path and query double decoding
>>>>>>>>>>>>>>>> through
>>>>>>>>>>>>>>>> >>> >>>> configurable params, and also we had this thing with query
>>>>>>>>>>>>>>>> >>> >>>> decoding(single level). Can you explain a bit what the
>>>>>>>>>>>>>>>> status was
>>>>>>>>>>>>>>>> >>> >>>> previously. Seeing related failed uts.
>>>>>>>>>>>>>>>> >>> >>>>
>>>>>>>>>>>>>>>> >>> >>>
>>>>>>>>>>>>>>>> >>> >>> We run the path normalization on the query through our
>>>>>>>>>>>>>>>> >>> >>> HTPCallbackRequestUriNormalizeQuery callback. Previously we
>>>>>>>>>>>>>>>> used
>>>>>>>>>>>>>>>> >>> >>> htp_decode_path_inplace to normalize the query (e.g. for
>>>>>>>>>>>>>>>> >>> >>> uridecoding).
>>>>>>>>>>>>>>>> >>> >>> However, this was causing issues (remember that pcre "bug"
>>>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>> >>> >>> discussed
>>>>>>>>>>>>>>>> >>> >>> a while back, where http:// turned into http:/).
>>>>>>>>>>>>>>>> >>> >>>
>>>>>>>>>>>>>>>> >>> >>> In libhtp I copied htp_decode_path_inplace to
>>>>>>>>>>>>>>>> >>> >>> htp_decode_query_inplace
>>>>>>>>>>>>>>>> >>> >>> and also copied the config params and cfg funcs:
>>>>>>>>>>>>>>>> >>> >>>
>>>>>>>>>>>>>>>> >>> >>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> https://github.com/inliniac/suricata/commit/d41c762689a08e6814dc93e8bfebeceab97175c3
>>>>>>>>>>>>>>>> >>> >>>
>>>>>>>>>>>>>>>> >>> >>> Hack of the 1st order, which is wrong in many ways. But it
>>>>>>>>>>>>>>>> basically
>>>>>>>>>>>>>>>> >>> >>> allowed me to make sure we don't normalize the query as if
>>>>>>>>>>>>>>>> it's path,
>>>>>>>>>>>>>>>> >>> >>> esp with turning ftp:// into ftp:/ and such.
>>>>>>>>>>>>>>>> >>> >>>
>>>>>>>>>>>>>>>> >>> >>> For 0.5 integration I think we need a proper solution. The
>>>>>>>>>>>>>>>> only
>>>>>>>>>>>>>>>> >>> >>> reason I
>>>>>>>>>>>>>>>> >>> >>> pushed my hack like this was that I knew in 0.5 we would
>>>>>>>>>>>>>>>> make things
>>>>>>>>>>>>>>>> >>> >>> right.
>>>>>>>>>>>>>>>> >>> >>>
>>>>>>>>>>>>>>>> >>> >>
>>>>>>>>>>>>>>>> >>> >> I think if we still want to double decode, we still require
>>>>>>>>>>>>>>>> all of
>>>>>>>>>>>>>>>> >>> >> these above things from our bundled htp.
>>>>>>>>>>>>>>>> >>> >>
>>>>>>>>>>>>>>>> >>> >> -----
>>>>>>>>>>>>>>>> >>> >>
>>>>>>>>>>>>>>>> >>> >> In 0.5.x, tx->request_uri_normalized has been removed, and
>>>>>>>>>>>>>>>> we'd now
>>>>>>>>>>>>>>>> >>> >> have to use the REQUEST_URI hook. We'll have to carry out
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> >>> >> reconstruction ourselves, and store it ourselves in our
>>>>>>>>>>>>>>>> HTPState.
>>>>>>>>>>>>>>>> >>> >>
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>> What are your thoughts on this?
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>> >
>>>>>>>>>>>>>>>> >>> > IIRC there is some function in libhtp that does just the
>>>>>>>>>>>>>>>> decoding of
>>>>>>>>>>>>>>>> >>> > uriencoding and unicode. We should probably just use that on
>>>>>>>>>>>>>>>> the query
>>>>>>>>>>>>>>>> >>> > and do the full normalization on the path.
>>>>>>>>>>>>>>>> >>> >
>>>>>>>>>>>>>>>> >>> > As a side thought: I think it would be nice to store path and
>>>>>>>>>>>>>>>> query
>>>>>>>>>>>>>>>> >>> > separately so that we can add http_path and http_query
>>>>>>>>>>>>>>>> keywords later
>>>>>>>>>>>>>>>> >>> > on.
>>>>>>>>>>>>>>>> >>> >
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>> We'd pretty much extract it directly from parsed_uri. Will
>>>>>>>>>>>>>>>> have to
>>>>>>>>>>>>>>>> >>> check if we need the extract double decode phase we have
>>>>>>>>>>>>>>>> currently in
>>>>>>>>>>>>>>>> >>> our bundled htp, in which case we'd need to store them
>>>>>>>>>>>>>>>> separately.
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> Yes, all the normalized components are in tx->parsed_uri. This
>>>>>>>>>>>>>>>> is what is
>>>>>>>>>>>>>>>> >> used in ironbee to expose all the various parts like
>>>>>>>>>>>>>>>> tx->parsed_uri->path
>>>>>>>>>>>>>>>> >> and tx->parsed_uri->query.
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> Also note that the hostname should now be obtained from
>>>>>>>>>>>>>>>> >> tx->request_hostname in 0.5.
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> -B
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > FYI, for an example using libhtp 0.5 see ironbee code. This was
>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>> > recently updated for 0.5.
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > https://github.com/ironbee/ironbee/blob/0.7.x/modules/modhtp.c
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Will have a look. Thanks.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Previously we would use tx->connp->conn->transactions to access txs
>>>>>>>>>>>>>>>> in the state. Now that htp_connp_t is an opaque pointer how do I
>>>>>>>>>>>>>>>> access the txs? Tried locating helper functions to retrieve it, but
>>>>>>>>>>>>>>>> I
>>>>>>>>>>>>>>>> didn't find any.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It's an oversight that there isn't a helper function to retrieve
>>>>>>>>>>>>>>>> transactions on a connections. I will add one tomorrow.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Having said that, what is your use case that you require to retrieve
>>>>>>>>>>>>>>>> transactions? I thought your code was driven by the callbacks, which >
>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>> come with a tx instance (via connp)? For my education, can you explain
>>>>>>>>>>>>>>>> how you process connection data?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> One of the things that we don't do out of the callbacks is logging the
>>>>>>>>>>>>>>> requests. This is one of the things we need access to the TX store for.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> And to add to it, since we already have the txs stored in a list
>>>>>>>>>>>>>> inside libhtp, re-buffering the txs would come as a redundant task,
>>>>>>>>>>>>>> from where I see it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Anoop Saldanha
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> Suricata IDS Devel mailing list: oisf-devel at openinfosecfoundation.org
>>>>>>>>>>>>>> Site: http://suricata-ids.org | Participate:
>>>>>>>>>>>>>> http://suricata-ids.org/participate/
>>>>>>>>>>>>>> List: https://lists.openinfosecfoundation.org/mailman/listinfo/oisf-devel
>>>>>>>>>>>>>> Redmine: https://redmine.openinfosecfoundation.org/
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Ivan Ristić
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Anoop Saldanha
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Anoop Saldanha
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Ivan Ristić
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Anoop Saldanha
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> ---------------------------------------------
>>>>>>> Victor Julien
>>>>>>> http://www.inliniac.net/
>>>>>>> PGP: http://www.inliniac.net/victorjulien.asc
>>>>>>> ---------------------------------------------
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> -------------------------------
>>>>>> Anoop Saldanha
>>>>>> http://www.poona.me
>>>>>> -------------------------------
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> -------------------------------
>>>>> Anoop Saldanha
>>>>> http://www.poona.me
>>>>> -------------------------------
>>>>
>>>>
>>>>
>>>> --
>>>> -------------------------------
>>>> Anoop Saldanha
>>>> http://www.poona.me
>>>> -------------------------------
>>>
>>>
>>>
>>> --
>>> Ivan Ristić
>>
>>
>>
>> --
>> -------------------------------
>> Anoop Saldanha
>> http://www.poona.me
>> -------------------------------
>
>
>
> --
> -------------------------------
> Anoop Saldanha
> http://www.poona.me
> -------------------------------
--
-------------------------------
Anoop Saldanha
http://www.poona.me
-------------------------------
More information about the Oisf-devel
mailing list