[Oisf-users] Correlating http transactions and alert logs

Fri Jul 11 17:49:58 UTC 2014

On Fri, Jul 11, 2014 at 2:19 AM, Victor Julien <lists at inliniac.net> wrote:
> On 07/11/2014 11:09 AM, Darren Spruell wrote:
>> Request in same vein as
>> https://lists.openinfosecfoundation.org/pipermail/oisf-users/2014-January/003281.html
>>
>> I'm embarking on a task with Suricata where I'd need to specifically
>> correlate alerts to HTTP requests - namely to associate the set of
>> alerts that fire in connection with the HTTP transaction under
>> inspection. Currently Suricata logging is the only data source
>> available to do this for the environment, and I'd prefer to use EVE
>> logging.
>>
>> If I understand correctly Suricata can generate debug alerting with
>> knowledge of the transaction being handled at the moment of alert.
>> I've peeked at the debug log output and it is not appropriate for our
>> use case. Is it possible to have Suricata generate http logs (maybe
>> other service logs too?) that include data about any alerts that were
>> generated inspecting that transaction? I want to make sure we have a
>> strong correlator on these events, so attempting to match logs up by
>> timestamp or similar is potentially impractical or at the least
>> awkward and potentially fraught with #fail.
>>
>> Alternatively, is it possible to tag the transaction ID at logging
>> time into both the alert log and the http log that would allow
>> correlation on the two in a datastore?
>
> In my flow-log branch I have also added a simple 'flow id', which is a
> tag added to each alert, http, etc record that is connected to a flow.
> So this would be a step in the right direction. If you need a tx
> specific tag... I guess it would be enough to log the tx number where
> available in both http and alert. This is just a simple incrementing
> counter that starts at 0 for the first tx. The flow_id+tx_cnt would be
> quite unique.
>
> This is all about the json output btw.
>
> Would this be helpful?

I think so. A flow ID sounds sufficient in most cases with most
protocols, but HTTP pipelining is a case that would require the tx_cnt
to make sure the request is uniquely identified within the flow. The
per-request focus is important in our case because each request we
process will contain a unique custom resource ID header passed by the
client, and we'll need to correlate resource to request to alert(s)
exactly when processing the logs.
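
As a rough sketch of the correlation described above, assuming each
JSON record carried the proposed flow id and tx counter: the field
names "flow_id" and "tx_cnt" follow the naming in this thread and
"resource_id" is a hypothetical stand-in for the custom header, so none
of these should be read as actual output fields. "event_type" is
likewise assumed to distinguish http records from alert records.

```python
import json
from collections import defaultdict

# Assumed record shape: "flow_id" and "tx_cnt" follow the proposal in this
# thread; "resource_id" is a hypothetical stand-in for the custom header.
SAMPLE_EVE = [
    '{"event_type": "http", "flow_id": 1001, "tx_cnt": 0, "resource_id": "res-A"}',
    '{"event_type": "http", "flow_id": 1001, "tx_cnt": 1, "resource_id": "res-B"}',
    '{"event_type": "alert", "flow_id": 1001, "tx_cnt": 1, "signature": "example rule 1"}',
    '{"event_type": "alert", "flow_id": 1001, "tx_cnt": 1, "signature": "example rule 2"}',
]


def correlate(eve_lines):
    """Map (flow_id, tx_cnt) to (http_record, [alert_records])."""
    http_by_key = {}
    alerts_by_key = defaultdict(list)
    for line in eve_lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        # Records without both tags cannot be tied to a transaction.
        if "flow_id" not in rec or "tx_cnt" not in rec:
            continue
        key = (rec["flow_id"], rec["tx_cnt"])
        if rec.get("event_type") == "http":
            http_by_key[key] = rec
        elif rec.get("event_type") == "alert":
            alerts_by_key[key].append(rec)
    # Every http record appears, with an empty list when nothing fired.
    return {key: (http, alerts_by_key.get(key, []))
            for key, http in http_by_key.items()}


if __name__ == "__main__":
    for key, (http, alerts) in sorted(correlate(SAMPLE_EVE).items()):
        print(key, http["resource_id"], [a["signature"] for a in alerts])
```

The two pipelined requests share a flow id, so only the tx counter
keeps the alerts attached to the second request and its resource ID.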

>> Or is there another way of accomplishing this?
>
> Approx timestamp + 5-tuple is usually good, but it requires a bit of
> scripting.

That does seem like it would work in most cases. In our case, though,
we'll be using a lot of proxy pipelining, which weakens the utility of
the 5-tuple as a strong identifier. I'd definitely prefer something
internal to the flow; a short key that can be used in DB queries and
the like is probably ideal.
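
To illustrate the ambiguity, here is a minimal sketch of the
approximate timestamp + 5-tuple approach. The record fields
("timestamp", "proto", "src_ip", "src_port", "dest_ip", "dest_port"),
the timestamp format, the sample addresses, and the one-second window
are all assumptions chosen for the example.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout, e.g. 2014-07-11T17:49:58.100000+0000
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def five_tuple(rec):
    """Return the (proto, src, sport, dst, dport) key for a record."""
    return (rec["proto"], rec["src_ip"], rec["src_port"],
            rec["dest_ip"], rec["dest_port"])


def candidates(alert, http_records, window=timedelta(seconds=1)):
    """Return http records on the alert's 5-tuple within the time window."""
    alert_ts = datetime.strptime(alert["timestamp"], TS_FORMAT)
    matches = []
    for http in http_records:
        if five_tuple(http) != five_tuple(alert):
            continue
        http_ts = datetime.strptime(http["timestamp"], TS_FORMAT)
        if abs(http_ts - alert_ts) <= window:
            matches.append(http)
    return matches


# Hypothetical proxy connection carrying two pipelined requests.
CONN = {"proto": "TCP", "src_ip": "10.0.0.5", "src_port": 40000,
        "dest_ip": "10.0.0.9", "dest_port": 3128}
HTTP_RECORDS = [
    dict(CONN, timestamp="2014-07-11T17:49:58.100000+0000", url="/a"),
    dict(CONN, timestamp="2014-07-11T17:49:58.300000+0000", url="/b"),
    dict(CONN, src_port=40001,
         timestamp="2014-07-11T17:49:58.320000+0000", url="/c"),
]
ALERT = dict(CONN, timestamp="2014-07-11T17:49:58.350000+0000")

if __name__ == "__main__":
    for http in candidates(ALERT, HTTP_RECORDS):
        print(http["url"])
```

Both requests on the shared connection come back as candidates for
the one alert, which is the case where a per-transaction key is needed
to pick the right one.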

-- 
Darren Spruell
phatbuckett at gmail.com