[Oisf-devel] libhtp - Normalization of query string

Ivan Ristic ivan.ristic at gmail.com
Wed Jun 19 10:15:22 UTC 2013


On Tue, Jun 18, 2013 at 2:12 PM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
> On Mon, Jun 17, 2013 at 6:40 PM, Ivan Ristic <ivan.ristic at gmail.com> wrote:
>> On Mon, Jun 17, 2013 at 9:18 AM, Anoop Saldanha <anoopsaldanha at gmail.com> wrote:
>>> While producing the normalized uri, what is the right way to
>>> generate the normalized query string? Can see 2 solutions -
>>>
>>>     1. Duplicate this code section from htp_unparse_uri_noencode( ) -
>>>
>>>         if (uri->query != NULL) {
>>>             bstr *query = bstr_dup(uri->query);
>>>             htp_uriencoding_normalize_inplace(query);
>>>             bstr_add_c_noex(r, "?");
>>>             bstr_add_noex(r, query);
>>>             bstr_free(query);
>>>         }
>>
>> I think this one is a better approach, although it may depend on
>> exactly how you define normalization.
>
> With htp_uriencoding_normalize_inplace( ) if it sees a %2d it would
> translate it as a '-'(hypen) using x2c, and then checks if it's a
> reserved character and post confirmation leaves it undecoded.  Is this
> the right behaviour?

It depends. It's ambiguous in the spec, and some argue one way, some
another. Unfortunately, I didn't document my reasoning and so I will
need to go back and double-check.


> I would have preferred to use htp_decode_urlencoded_inplace(), but
> it's private and duplication would be a nuisance with all the
> reference to cfg.

I don't think you can avoid the reference to cfg, because there are
many settings that control exactly how the decoding is done. There
isn't any one true way. I could create a public function removing the
reference to tx -- would you like that?


> Btw the cfg associated with HTP_DECODER_URL_PATH applicable to both
> the path and the query part of the uri?

HTP_DECODER_URL_PATH will be used for the path only,
HTP_DECODER_URLENCODED for parameters, either in the query string or
in the body. (I've just found one place when the incorrect cfg is
used. Will fix it now.)


>>
>>
>>>     2. Register htp_config_register_urlencoded_parser( ), and then
>>>           use the below code -
>>>
>>>         if (uri->query != NULL) {
>>>             bstr_add_c_noex(r, "?");
>>>             size_t tsize = htp_table_size(tx->request_params);
>>>             size_t i;
>>>             for (i = 0; i < tsize; i++) {
>>>                 htp_param_t *p =
>>>                     htp_table_get_index(tx->request_params, i, NULL);
>>>                 if (p == NULL || p->source != HTP_SOURCE_QUERY_STRING)
>>>                     continue;
>>>                 bstr_add_noex(r, p->name);
>>>                 if (bstr_len(p->value) != 0) {
>>>                     bstr_add_c_noex(r, "=");
>>>                     bstr_add_noex(r, p->value);
>>>                 }
>>>                 if (i != (tsize - 1))
>>>                     bstr_add_c_noex(r, "&");
>>>         }
>>>
>>>     Which of these 2 is the right solution?
>>>
>
> --
> -------------------------------
> Anoop Saldanha
> http://www.poona.me
> -------------------------------



-- 
Ivan Ristić



More information about the Oisf-devel mailing list