[Oisf-devel] http_host & http_raw_host

Tue Mar 19 12:19:08 UTC 2013

On Tue, Mar 19, 2013 at 5:43 PM, Victor Julien <victor at inliniac.net> wrote:
> On 03/19/2013 01:05 PM, Anoop Saldanha wrote:
>> On Tue, Mar 19, 2013 at 5:26 PM, Victor Julien <victor at inliniac.net> wrote:
>>> On 03/19/2013 12:22 PM, Anoop Saldanha wrote:
>>>> On Tue, Mar 19, 2013 at 4:35 PM, Victor Julien <victor at inliniac.net> wrote:
>>>>> On 03/19/2013 12:03 PM, Anoop Saldanha wrote:
>>>>>> On Tue, Mar 19, 2013 at 4:23 PM, Victor Julien <victor at inliniac.net> wrote:
>>>>>>> In the new http_host, which host is selected if we have:
>>>>>>>
>>>>>>> GET http://one/ HTTP/1.0
>>>>>>> Host: two
>>>>>>>
>>>>>>> One or two?
>>>>>>
>>>>>> One.  The uri value gets priority over the header value.
>>>>>>
>>>>>>>
>>>>>>> I know "alert http any any -> any any (msg:"SURICATA HTTP Host header
>>>>>>> ambiguous"; flow:established,to_server;
>>>>>>> app-layer-event:http.host_header_ambiguous;
>>>>>>> flowint:http.anomaly.count,+,1; classtype:protocol-command-decode;
>>>>>>> sid:2221015; rev:1;)" will fire in this case, but I assume the http_host
>>>>>>> keyword will fire on something as well.
>>>>>>>
>>>>>>> Also, what does http_raw_host match on specifically?
>>>>>>>
>>>>>>
>>>>>> Same logic as above.
>>>>>>
>>>>>
>>>>> Thanks.
>>>>>
>>>>> What is the overall difference between http_host and http_raw_host? I
>>>>> don't think we do normalization of the host, do we?
>>>>>
>>>>
>>>> Case difference, iirc.  http_host is lowercase.  Will need to check
>>>> with libhtp, though.
>>>>
>>>
>>> Cool, please let me know.
>>>
>>> Was wondering about something related. http_host is normalized to
>>> lowercase. Yet it seems the rules are forced to set nocase, which is odd
>>> I think.
>>
>> I forced the nocase so that users don't have the misconception that
>> they can use an uppercase pattern without a nocase, and it still
>> matches against a lowercase uri.
>
> Yeah, understand.
>
>>> Adding nocase makes sure we take a slower code path while we
>>> have no case to consider at all.  We should only consider it at the rule
>>> parsing stage.
>>>
>>
>> We don't take a slower code path.  If anything it is faster, or am I
>> missing something?  It's considered at the rule parsing stage.
>
> In case of nocase (and pcre's /i) we take slower paths in (some) mpm,
> pcre matching, spm matching, etc. Might not be a major difference, but
> it's completely unnecessary in this case.
>
> Compare BoyerMoore vs BoyerMooreNocase for example. We "tolower" each
> char before matching. Unnecessary if we know everything is lowercase
> already. Something similar will happen in pcre. I see in AC it won't
> make a difference.
>

I see your point about spm.  With AC it's the other way round, though:
nocase is faster there.  Point taken for the other matchers.
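
To make the spm point concrete, here is a minimal sketch in Python.  It
is not the Suricata code and not Boyer-Moore; it is a plain linear scan,
and the names find_exact, find_nocase and _lower are made up for this
illustration.  It only shows where the extra per-byte work comes from
when nocase is set.

```python
def _lower(b: int) -> int:
    """Lowercase a single ASCII byte value (hypothetical helper)."""
    return b + 32 if 65 <= b <= 90 else b


def find_exact(haystack: bytes, needle: bytes) -> int:
    """Case-sensitive scan: bytes are compared as they are."""
    n, m = len(haystack), len(needle)
    for i in range(n - m + 1):
        if haystack[i:i + m] == needle:
            return i
    return -1


def find_nocase(haystack: bytes, needle: bytes) -> int:
    """Case-insensitive scan: each haystack byte is lowercased first."""
    needle = needle.lower()
    n, m = len(haystack), len(needle)
    for i in range(n - m + 1):
        # The _lower() call per byte is the extra work under discussion.
        if all(_lower(haystack[i + j]) == needle[j] for j in range(m)):
            return i
    return -1
```

If the buffer is already known to be lowercase, find_exact with a
lowercase pattern finds the same match as find_nocase without the
per-byte conversion.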

>>
>>> So I think we need a different sort of warning:
>>>
>>> e.g.: content:"Google.com"; http_host;
>>>
>>> should warn "uppercase pattern against lowercase buffer, use lowercase
>>> pattern or "nocase" if you're stupid" :)
>>>
>>> Make sense?
>>>
>>
>> Makes sense, although I feel we should error out on such sigs, rather
>> than warn.  In the first place, such a sig is technically wrong, and
>> when we are reading such sigs quickly, we tend to not notice the
>> difference immediately.
>>
>
> Sure.
>

Will make this change.
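
Roughly what I have in mind for the parse-stage check, as a Python
sketch only.  The names RuleParseError and validate_http_host_content
are hypothetical and nothing here is the actual Suricata parser.  It
assumes a pattern with an explicit nocase is still accepted, which we
have not settled in this thread.

```python
class RuleParseError(ValueError):
    """Raised when a signature is rejected at parse time (hypothetical)."""


def validate_http_host_content(pattern: str, nocase: bool = False) -> None:
    """Reject an uppercase pattern aimed at the lowercase http_host buffer.

    Assumption: a pattern that carries nocase is let through unchanged.
    """
    if not nocase and any(c.isupper() for c in pattern):
        raise RuleParseError(
            "uppercase pattern against lowercase buffer: %r; "
            "use a lowercase pattern or add nocase" % pattern
        )


# A lowercase pattern passes silently.
validate_http_host_content("google.com")

# An uppercase pattern without nocase is rejected.
try:
    validate_http_host_content("Google.com")
except RuleParseError as err:
    print("rejected:", err)
```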

-- 
Anoop Saldanha