[Oisf-devel] http_host & http_raw_host

Tue Mar 19 12:13:11 UTC 2013

On 03/19/2013 01:05 PM, Anoop Saldanha wrote:
> On Tue, Mar 19, 2013 at 5:26 PM, Victor Julien <victor at inliniac.net> wrote:
>> On 03/19/2013 12:22 PM, Anoop Saldanha wrote:
>>> On Tue, Mar 19, 2013 at 4:35 PM, Victor Julien <victor at inliniac.net> wrote:
>>>> On 03/19/2013 12:03 PM, Anoop Saldanha wrote:
>>>>> On Tue, Mar 19, 2013 at 4:23 PM, Victor Julien <victor at inliniac.net> wrote:
>>>>>> In the new http_host, which host is selected if we have:
>>>>>>
>>>>>> GET http://one/ HTTP/1.0
>>>>>> Host: two
>>>>>>
>>>>>> One or two?
>>>>>
>>>>> One.  The uri value gets priority over the header value.
>>>>>
>>>>>>
>>>>>> I know "alert http any any -> any any (msg:"SURICATA HTTP Host header
>>>>>> ambiguous"; flow:established,to_server;
>>>>>> app-layer-event:http.host_header_ambiguous;
>>>>>> flowint:http.anomaly.count,+,1; classtype:protocol-command-decode;
>>>>>> sid:2221015; rev:1;)" will fire in this case, but I assume the http_host
>>>>>> keyword will fire on something as well.
>>>>>>
>>>>>> Also, what does http_raw_host match on specifically?
>>>>>>
>>>>>
>>>>> Same logic as above.
>>>>>
>>>>
>>>> Thanks.
>>>>
>>>> What is the overall difference between http_host and http_raw_host? I
>>>> don't think we do normalization of the host, do we?
>>>>
>>>
>>> Case difference, iirc.  http_host is lowercase.  Will need to check
>>> with libhtp, though.
>>>
>>
>> Cool, please let me know.
>>
>> Was wondering about something related. http_host is normalized to
>> lowercase. Yet it seems the rules are forced to set nocase, which is odd
>> I think.
> 
> I forced the nocase so that users don't have the misconception that
> they can use an uppercase pattern without a nocase, and it still
> matches against a lowercase uri.

Yeah, understood.

>> Adding nocase makes sure we take a slower code path while we
>> have no case to consider at all.  We should only consider it at the rule
>> parsing stage.
>>
> 
> We don't take a slower code path.  If anything it is faster or am I
> missing something?  It's considered at rule parsing stage.

In case of nocase (and pcre's /i) we take slower paths in (some) mpm,
pcre matching, spm matching, etc. Might not be a major difference, but
it's completely unnecessary in this case.

Compare BoyerMoore vs BoyerMooreNocase for example. We "tolower" each
char before matching. Unnecessary if we know everything is lowercase
already. Something similar will happen in pcre. I see in AC it won't
make a difference.
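
To make the cost concrete, here is a minimal sketch of the two kinds of
comparison. This is not the actual BoyerMoore/BoyerMooreNocase code; it
is a naive search, and the names find_exact and find_nocase are made up
for illustration. The point is only that the nocase variant has to
"tolower" every byte it looks at, which is wasted work when the buffer
is known to be lowercase already.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: naive search, bytes compared as-is.
 * Returns the offset of the first match, or -1 if there is none. */
static long find_exact(const unsigned char *hay, size_t hlen,
                       const unsigned char *pat, size_t plen)
{
    assert(hay != NULL && pat != NULL);
    if (plen == 0 || plen > hlen)
        return -1;
    for (size_t i = 0; i + plen <= hlen; i++) {
        if (memcmp(hay + i, pat, plen) == 0)
            return (long)i;
    }
    return -1;
}

/* Hypothetical helper: same search, but every byte on both sides goes
 * through tolower() before it is compared. */
static long find_nocase(const unsigned char *hay, size_t hlen,
                        const unsigned char *pat, size_t plen)
{
    assert(hay != NULL && pat != NULL);
    if (plen == 0 || plen > hlen)
        return -1;
    for (size_t i = 0; i + plen <= hlen; i++) {
        size_t j = 0;
        while (j < plen && tolower(hay[i + j]) == tolower(pat[j]))
            j++;
        if (j == plen)
            return (long)i;
    }
    return -1;
}
```

With a lowercase buffer such as "www.google.com" and a lowercase
pattern, both functions find the same offset, so the extra tolower()
calls buy nothing.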

> 
>> So I think we need a different sort of warning:
>>
>> e.g.: content:"Google.com"; http_host;
>>
>> should warn "uppercase pattern against lowercase buffer, use lowercase
>> pattern or "nocase" if you're stupid" :)
>>
>> Make sense?
>>
> 
> Makes sense, although I feel we should error out on such sigs, rather
> than warn.  In the first place, such a sig is technically wrong, and
> when we are reading such sigs quickly, we tend to not notice the
> difference immediately.
> 

Sure.
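
A rough sketch of what such a check at rule parsing time could look
like. The function names and the return convention are assumptions for
illustration only, not existing parser code, and whether "nocase" should
still be accepted as an escape hatch is an open choice; here it is.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the pattern contains any uppercase
 * ASCII letter, 0 otherwise. */
static int pattern_has_uppercase(const unsigned char *pat, size_t len)
{
    assert(pat != NULL);
    for (size_t i = 0; i < len; i++) {
        if (isupper(pat[i]))
            return 1;
    }
    return 0;
}

/* Hypothetical parse-time check for a content pattern that targets a
 * buffer normalized to lowercase (such as http_host).
 * Returns 0 to accept the pattern, -1 to reject the signature.
 * Assumption: an uppercase pattern is only tolerated with nocase set. */
static int validate_lowercase_buffer_pattern(const unsigned char *pat,
                                             size_t len, int nocase)
{
    if (pattern_has_uppercase(pat, len) && !nocase)
        return -1; /* uppercase pattern can never match a lowercase buffer */
    return 0;
}
```

So content:"Google.com"; http_host; would be rejected when the rule is
loaded, while content:"google.com"; http_host; passes and can use the
plain case-sensitive matching path.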

-- 
---------------------------------------------
Victor Julien
http://www.inliniac.net/
PGP: http://www.inliniac.net/victorjulien.asc
---------------------------------------------