[Oisf-users] http.log log format
Martin Holste
mcholste at gmail.com
Fri Apr 13 13:34:15 UTC 2012
Hm, looking at it again, I don't think it will cause any parsing
problems since the URI could be any string anyway, so never mind. Log
normalizers will just have to use spaces as delimiters since the field
can't be counted on to start with a slash.
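
For anyone writing such a normalizer, here is a rough Python sketch of
the approach discussed in this thread. It is not taken from Suricata
itself. It assumes the line layout Geert describes below
(TIMESTAMP HOSTNAME [**] URL [**]), that the hostname is the last
space-separated token before the first [**], and that "http" is an
acceptable scheme when the request line did not carry one. The function
names and the sample line are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def parse_line(line):
    """Split an http.log line into (timestamp, host, uri).

    Assumes the layout 'TIMESTAMP HOSTNAME [**] URL [**] ...' and that
    HOSTNAME is the last space-separated token before the first [**].
    """
    fields = [f.strip() for f in line.split("[**]")]
    timestamp, host = fields[0].rsplit(" ", 1)
    return timestamp, host, fields[1]


def normalize_url(host, uri, default_scheme="http"):
    """Return scheme://host/path with query and fragment removed."""
    parts = urlsplit(uri)
    if parts.scheme and parts.netloc:
        # Case 1: the request line carried a complete URL (typical for proxies).
        return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    # Case 2: only the path was sent, so prepend the assumed scheme and the
    # hostname field, then cut at the first '?' or '#'.
    path = uri.split("#", 1)[0].split("?", 1)[0]
    return f"{default_scheme}://{host}{path}"


def host_mismatch(host, uri):
    """True if a complete URL names a different host than the hostname field."""
    parts = urlsplit(uri)
    return bool(parts.scheme and parts.netloc) and parts.netloc.lower() != host.lower()


# Made-up sample line, not real log output.
sample = "04/05/2012-10:14:00.000000 example.com [**] /index.html?x=1#top [**]"
ts, host, uri = parse_line(sample)
print(normalize_url(host, uri))
```

The mismatch check is kept separate so that a normalized URL can be
indexed for searching while a differing Host value can still be flagged
rather than silently dropped.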
On Fri, Apr 13, 2012 at 3:41 AM, Victor Julien <victor at inliniac.net> wrote:
> On 04/05/2012 03:45 PM, Martin Holste wrote:
>> Yes, but we also lose info if the log doesn't get parsed right by
>> whatever log solution is reading it in. I understand the desire to
>> keep as much forensic information available as possible, but I think
>> that's better suited for packets. For the log, normalization is
>> important because misparsed logs can mean missed log searches.
>
> I see the point, but I don't want to lose the extra information it
> gives. Maybe we can add that to the log in a different way?
>
> Victor
>
>> On Thu, Apr 5, 2012 at 3:23 AM, Victor Julien <victor at inliniac.net> wrote:
>>> On 04/05/2012 10:14 AM, Geert Alberghs wrote:
>>>> Hello,
>>>>
>>>> http logging has been enabled in our environment. The purpose is to
>>>> parse these logs for URLs up to and including the path. (so no query
>>>> and/or fragment part) The problem is that in http.log I encounter 2 log
>>>> formats:
>>>>
>>>> 1. TIMESTAMP HOSTNAME [**] COMPLETE URL [**]
>>>> 2. TIMESTAMP HOSTNAME [**] URL without SCHEME&HOSTNAME [**]
>>>>
>>>> In case 1 I only need COMPLETE URL and strip off query and/or fragment
>>>> In case 2 I need to concat "SCHEME", "HOSTNAME" and "URL without
>>>> SCHEME&HOSTNAME" and then strip off query and/or fragment.
>>>>
>>>> Is there any logic in why there are 2 different cases? Personally I
>>>> think log format 1 is preferable.
>>>
>>> The URL is expressed as it appears in the request. These are both valid:
>>>
>>> GET / HTTP/1.1
>>>
>>> GET http://somehost/ HTTP/1.1
>>>
>>> The host name is taken from the Host header.
>>>
>>> The 2nd URL format is usually used for proxy requests, but is also
>>> valid for "normal" requests per the RFC.
>>>
>>> If we leave it out we miss some info, especially if the host part of the
>>> URL would not match the value of the Host header.
>>>
>>> --
>>> ---------------------------------------------
>>> Victor Julien
>>> http://www.inliniac.net/
>>> PGP: http://www.inliniac.net/victorjulien.asc
>>> ---------------------------------------------