[Oisf-users] Correlating http transactions and alert logs

Darren Spruell phatbuckett at gmail.com
Tue Jan 27 18:40:26 UTC 2015


On Sat, Jul 12, 2014 at 1:39 AM, Victor Julien <lists at inliniac.net> wrote:
> On 07/11/2014 07:49 PM, Darren Spruell wrote:
>> On Fri, Jul 11, 2014 at 2:19 AM, Victor Julien <lists at inliniac.net> wrote:
>>> On 07/11/2014 11:09 AM, Darren Spruell wrote:
>>>> Request in same vein as
>>>> https://lists.openinfosecfoundation.org/pipermail/oisf-users/2014-January/003281.html
>>>>
>>>> I'm embarking on a task with Suricata where I'd need to specifically
>>>> correlate alerts to HTTP requests - namely to associate the set of
>>>> alerts that fire in connection with the HTTP transaction under
>>>> inspection. Currently Suricata logging is the only data source
>>>> available to do this for the environment, and I'd prefer to use EVE
>>>> logging.
>>>>
>>>> If I understand correctly Suricata can generate debug alerting with
>>>> knowledge of the transaction being handled at the moment of alert.
>>>> I've peeked at the debug log output and it is not appropriate for our
>>>> use case. Is it possible to have Suricata generate http logs (maybe
>>>> other service logs too?) that include data about any alerts that were
>>>> generated inspecting that transaction? I want to make sure we have a
>>>> strong correlator on these events, so attempting to match logs up by
>>>> timestamp or similar is potentially impractical or at the least
>>>> awkward and potentially fraught with #fail.
>>>>
>>>> Alternatively, is it possible to tag the transaction ID at logging
>>>> time into both the alert log and the http log that would allow
>>>> correlation on the two in a datastore?
>>>
>>> In my flow-log branch I have also added a simple 'flow id', which is a
>>> tag added to each alert, http, etc record that is connected to a flow.
>>> So this would be a step in the right direction. If you need a tx
>>> specific tag... I guess it would be enough to log the tx number where
>>> available in both http and alert. This is just a simple incrementing
>>> counter that starts at 0 for the first tx. The flow_id+tx_cnt would be
>>> quite unique.
>>>
>>> This is all about the json output btw.
>>>
>>> Would this be helpful?
>>
>> I think so. Flows sound appropriate in most cases with most protocols
>> but I guess HTTP pipelining seems to give a case that would require
>> the tx_cnt to make sure the request is uniquely identified in the
>> flow. The per-request focus is important in our case because each
>> request we process will contain a unique custom resource ID header
>> passed by the client, and we'll need to perfectly correlate this
>> resource to request to alert(s) when processing logging.
>>
>>>> Or is there another way of accomplishing this?
>>>
>>> Approx timestamp + 5tuple is usually good, but it requires a bit of
>>> scripting.
>>
>> That does seem like it would work in most cases. I think in our case
>> we'll be using a lot of proxy pipelining, which weakens the utility of
>> the 5tuple as a strong identifier. We'd definitely prefer something
>> internal to the flow, and a short key that can be used in DB queries
>> and the like is probably ideal.
>>
>
> This PR implements the flow_id
> https://github.com/inliniac/suricata/pull/1032
> And this the tx_id: https://github.com/inliniac/suricata/pull/1031
>
> Both will go into 2.1, but if you're interested in testing them it'd be
> great.

About time for some feedback (1/2 a year!). Sorry for that.

We've been using this codebase (Suricata version 2.1dev (rev fdfa184))
specifically for the flow ID/transaction ID functionality and the
ability to correlate alerts to HTTP requests, and it has been working
well.

We output JSON logging for alert and http events; at log rotation
time, we call a post-processing correlation script that outputs a CSV
of the HTTP URLs corresponding to logged alerts, along with some other
useful fields. Example:

2015-01-26T14:01:55.608385,ET CURRENT_EVENTS Unknown Banking PHISH -
Login.php?LOB=RBG,2015938,A Network Trojan was
detected,1,2,1,GET,hXXp://www.apartcar.com[.]ar/online/Home/multimedia/189199_paperless_marquee_572x150.swf,hXXp://www.apartcar.com[.]ar/site/img/novedades/c/Logon.php?LOB=RBGLogon&_pageLabel=page_logonform&amp=&amphttp://www.apartmani-krstanovic.com/k_krhttp://www.apartcar.com.ar/site/img/novedades/c/Logon.php?LOB=RBGLogon&_pageLabel=page_logonform&amp=&amphttp:,text/html,20,www.apartcar.com[.]ar

From this point the CSVs are consumed in various ways, e.g. loaded
into Elasticsearch for analysis, processed for alerting, etc.
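
For readers who want to try the same kind of correlation, here is a
minimal sketch of a join on flow ID plus transaction ID. It is not the
script described above. It assumes the EVE JSON records carry the
fields event_type, flow_id, tx_id, alert.signature,
alert.signature_id, http.http_method, http.hostname and http.url;
check these names against your own logs. The function name and the
column selection are made up for illustration.

```python
import json
from collections import defaultdict


def correlate(eve_lines):
    """Pair each alert with the HTTP record sharing its (flow_id, tx_id).

    Field names are assumptions about the EVE JSON layout, not taken
    from the original post.
    """
    http_by_key = {}
    alerts_by_key = defaultdict(list)

    for line in eve_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Records lacking either ID cannot be tied to one transaction.
        if "flow_id" not in event or "tx_id" not in event:
            continue
        key = (event["flow_id"], event["tx_id"])
        if event.get("event_type") == "http":
            http_by_key[key] = event
        elif event.get("event_type") == "alert":
            alerts_by_key[key].append(event)

    rows = []
    for key, alerts in alerts_by_key.items():
        http_event = http_by_key.get(key)
        if http_event is None:
            continue  # alert with no matching HTTP transaction logged
        http = http_event.get("http", {})
        for alert_event in alerts:
            alert = alert_event.get("alert", {})
            rows.append([
                alert_event.get("timestamp"),
                alert.get("signature"),
                alert.get("signature_id"),
                http.get("http_method"),
                http.get("hostname"),
                http.get("url"),
            ])
    return rows
```

The rows can then be written out with csv.writer from the standard
library. Keying on the (flow_id, tx_id) pair rather than on flow_id
alone is what keeps pipelined requests on one connection apart.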

Over the course of doing this, we've observed some logged URLs that
share a flow/transaction ID with alerts they don't appear to match. It
may be because the flow ID rolls over more quickly than our logs
rotate, or something similar; we haven't really dug into this.
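
If ID reuse within one log file does turn out to be the cause (this is
unconfirmed), one possible guard is to accept a flow/transaction ID
match only when the two timestamps are also close together. The sketch
below is an illustration under that assumption; the function names and
the 60-second default window are arbitrary choices, and the timestamp
formats handled are the one shown in the CSV example above, with or
without a trailing UTC offset such as +0000.

```python
from datetime import datetime, timedelta


def parse_ts(text):
    """Parse a timestamp like 2015-01-26T14:01:55.608385[+0000]."""
    for fmt in ("%Y-%m-%dT%H:%M:%S.%f%z", "%Y-%m-%dT%H:%M:%S.%f"):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError("unrecognised timestamp: %r" % text)


def within_window(alert_ts, http_ts, max_skew=timedelta(seconds=60)):
    """Return True if the two records are at most max_skew apart.

    Both timestamps must use the same format (both with an offset or
    both without), since aware and naive datetimes cannot be compared.
    """
    return abs(parse_ts(alert_ts) - parse_ts(http_ts)) <= max_skew
```

A correlation script could call within_window() on each candidate pair
and drop pairs that fail, at the cost of missing alerts on very
long-lived transactions.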

-- 
Darren Spruell
phatbuckett at gmail.com

