[Oisf-users] Correlating http transactions and alert logs

Darren Spruell phatbuckett at gmail.com
Sun Mar 22 05:43:36 UTC 2015

On Tue, Jan 27, 2015 at 10:40 AM, Darren Spruell <phatbuckett at gmail.com> wrote:
> On Sat, Jul 12, 2014 at 1:39 AM, Victor Julien <lists at inliniac.net> wrote:
>> On 07/11/2014 07:49 PM, Darren Spruell wrote:
>>> On Fri, Jul 11, 2014 at 2:19 AM, Victor Julien <lists at inliniac.net> wrote:
>>>> On 07/11/2014 11:09 AM, Darren Spruell wrote:
>>>>> Request in same vein as
>>>>> https://lists.openinfosecfoundation.org/pipermail/oisf-users/2014-January/003281.html
>>>>> I'm embarking on a task with Suricata where I'd need to specifically
>>>>> correlate alerts to HTTP requests - namely to associate the set of
>>>>> alerts that fire in connection with the HTTP transaction under
>>>>> inspection. Currently Suricata logging is the only data source
>>>>> available to do this for the environment, and I'd prefer to use EVE
>>>>> logging.
>>>>> If I understand correctly Suricata can generate debug alerting with
>>>>> knowledge of the transaction being handled at the moment of alert.
>>>>> I've peeked at the debug log output and it is not appropriate for our
>>>>> use case. Is it possible to have Suricata generate http logs (maybe
>>>>> other service logs too?) that include data about any alerts that were
>>>>> generated inspecting that transaction? I want to make sure we have a
>>>>> strong correlator on these events, so attempting to match logs up by
>>>>> timestamp or similar is potentially impractical or at the least
>>>>> awkward and potentially fraught with #fail.
>>>>> Alternatively, is it possible to tag the transaction ID at logging
>>>>> time into both the alert log and the http log that would allow
>>>>> correlation on the two in a datastore?
>>>> In my flow-log branch I have also added a simple 'flow id', which is a
>>>> tag added to each alert, http, etc record that is connected to a flow.
>>>> So this would be a step in the right direction. If you need a tx
>>>> specific tag... I guess it would be enough to log the tx number where
>>>> available in both http and alert. This is just a simple incrementing
>>>> counter that starts at 0 for the first tx. The flow_id+tx_cnt would be
>>>> quite unique.
>>>> This is all about the json output btw.
>>>> Would this be helpful?
>>> I think so. Flows sound appropriate in most cases with most protocols
>>> but I guess HTTP pipelining seems to give a case that would require
>>> the tx_cnt to make sure the request is uniquely identified in the
>>> flow. The per-request focus is important in our case because each
>>> request we process will contain a unique custom resource ID header
>>> passed by the client, and we'll need to perfectly correlate this
>>> resource to request to alert(s) when processing logging.
>>>>> Or is there another way of accomplishing this?
>>>> Approx timestamp + 5tuple is usually good, but it requires a bit of
>>>> scripting.
>>> Does seem like that would work in most cases. I think in our case
>>> we'll be using a lot of proxy pipelining which weakens the utility of
>>> the 5tuple as a strong identifier. Definitely prefer something
>>> internal to the flow and a short key that can be used in DB queries and
>>> the like is probably ideal.
>> This PR implements the flow_id
>> https://github.com/inliniac/suricata/pull/1032
>> And this the tx_id: https://github.com/inliniac/suricata/pull/1031
>> Both will go into 2.1, but if you're interested in testing them it'd be
>> great.
> About time for some feedback (1/2 a year!). Sorry for that.
> We've been using this codebase (Suricata version 2.1dev (rev fdfa184))
> specifically for the flow ID/transaction ID functionality and ability
> to correlate alerts to HTTP requests, and it's been working usefully.
> We output json logging for alert and http events; at log rotation
> time, we call a post-processing correlation script to output a CSV of
> HTTP URLs corresponding to logged alerts along with some other useful
> fields. Example:
> 2015-01-26T14:01:55.608385,ET CURRENT_EVENTS Unknown Banking PHISH -
> Login.php?LOB=RBG,2015938,A Network Trojan was
> detected,1,2,1,GET,hXXp://www.apartcar.com[.]ar/online/Home/multimedia/189199_paperless_marquee_572x150.swf,hXXp://www.apartcar.com[.]ar/site/img/novedades/c/Logon.php?LOB=RBGLogon&_pageLabel=page_logonform&amp=&amphttp://www.apartmani-krstanovic.com/k_krhttp://www.apartcar.com.ar/site/img/novedades/c/Logon.php?LOB=RBGLogon&_pageLabel=page_logonform&amp=&amphttp:,text/html,20,www.apartcar.com[.]ar
> From this point the CSVs are consumed in various ways, e.g. loaded
> into ElasticSearch for analysis, processed for alerting, etc.
> Over the course of doing this, we've observed some URLs logged that
> correspond by flow/xaction id to alerts they don't seem to match up
> to. It may be because the flow ID rolls over more quickly than our
> logging or something; we haven't really dug into this.

I wanted to ask the status of the flow ID/transaction ID line of
development and what more specific testing/feedback would be helpful
in relation to it, if any. We've found the functionality to be useful
in practice.
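
As a rough illustration of the join on flow ID plus transaction ID
discussed above (a sketch only, not the post-processing script mentioned
earlier): it assumes each EVE JSON record carries top-level `event_type`,
`flow_id` and `tx_id` fields plus an `alert` or `http` sub-object, and
the `correlate` helper and the sample records are hypothetical.

```python
import json
from collections import defaultdict


def correlate(lines):
    """Join alert and http EVE records on (flow_id, tx_id).

    Field names are assumptions about the EVE JSON layout; adjust them
    to whatever the running Suricata build actually emits.
    """
    http_by_key = {}
    alerts_by_key = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        # Records lacking either identifier cannot be joined reliably.
        if "flow_id" not in rec or "tx_id" not in rec:
            continue
        key = (rec["flow_id"], rec["tx_id"])
        if rec.get("event_type") == "http":
            http_by_key[key] = rec.get("http", {})
        elif rec.get("event_type") == "alert":
            alerts_by_key[key].append(rec.get("alert", {}))

    rows = []
    for key, alerts in alerts_by_key.items():
        http = http_by_key.get(key)
        if http is None:
            continue  # alert without a matching http record
        for alert in alerts:
            rows.append((key[0], key[1], alert.get("signature"),
                         http.get("hostname"), http.get("url")))
    return rows


# Hypothetical sample: two pipelined requests on one flow, one alert.
sample = [
    '{"event_type": "http", "flow_id": 1, "tx_id": 0, '
    '"http": {"hostname": "example.com", "url": "/a"}}',
    '{"event_type": "http", "flow_id": 1, "tx_id": 1, '
    '"http": {"hostname": "example.com", "url": "/b"}}',
    '{"event_type": "alert", "flow_id": 1, "tx_id": 1, '
    '"alert": {"signature": "TEST sig"}}',
]
print(correlate(sample))
# -> [(1, 1, 'TEST sig', 'example.com', '/b')]
```

Keying on the pair rather than on flow ID alone is what keeps pipelined
requests apart: both sample requests share a flow, but only the one with
the same transaction ID is attached to the alert. If a flow ID were
reused within one log file, a later http record would overwrite an
earlier one under the same key, so comparing the timestamps of joined
records would be one way to spot suspicious matches.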

Darren Spruell
phatbuckett at gmail.com
