[Oisf-users] practical use of dns log

Wed Nov 27 10:30:10 UTC 2013

Hi all,

Thank you for the long replies.
@Kevin: gamelinux passivedns is indeed great, I've been using it for a while.
@Peter and Coop: Elasticsearch, logstash and elsa are comparable to
splunk: tools to index data and search it.

What I'm more specifically looking for are practical uses of the
DNS logging format of Suricata.
The reason is that it's a LOT different from the output I'm used to
from gamelinux passivedns (and from some other pdns web interfaces I
have access to), and I'm currently evaluating i) whether it is worth
switching to the suricata pdns now or whether I should wait for the
json output, and the more philosophical ii) is

A pdns database is great because it can be used while wearing multiple hats:
1/ an incident response analyst finding infected systems knowing the CC server
2/ an incident response analyst searching for relations with other dns
names and CCs
3/ an analyst searching for unknown badness based on some concepts

Let me explain by comparing the output of gamelinux passivedns and
suricata. For example, with gamelinux passivedns we have a database
(or file) containing these fields:
1385545938||10.0.1.2||10.1.1.10||IN||stores.ebay.fr.||CNAME||stores.shop.ebay.fr.||900
1385545938||10.0.1.2||10.1.1.10||IN||stores.ebay.fr.||CNAME||stores.shop.intl.ebay.com.||900

Now, as Kevin explains, you can find badness by searching for NXDOMAINs
and other weird things. The interesting thing with the gamelinux
format is that everything is on one line: class, request, type,
response, ttl.
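
To illustrate why the one-line format is convenient, here is a minimal
Python sketch (not part of passivedns itself). It assumes the fields in
the two sample lines above are, in order: timestamp, client, server,
class, query, type, answer, ttl. The names PdnsRecord, parse_line,
clients_for and names_sharing_answer are made up for this example.

```python
from typing import NamedTuple


class PdnsRecord(NamedTuple):
    # Field names and order are assumptions based on the sample lines.
    timestamp: int
    client: str
    server: str
    rr_class: str
    query: str
    rr_type: str
    answer: str
    ttl: int


def parse_line(line: str) -> PdnsRecord:
    """Split one '||'-separated line into a record (first 8 fields only)."""
    ts, client, server, rr_class, query, rr_type, answer, ttl = (
        line.strip().split("||")[:8]
    )
    return PdnsRecord(int(ts), client, server, rr_class, query, rr_type,
                      answer, int(ttl))


def clients_for(records, name):
    """Hat 1: which clients asked for a given (e.g. known CC) name?"""
    return sorted({r.client for r in records if r.query == name})


def names_sharing_answer(records, answer):
    """Hat 2: which names were seen with the same answer?"""
    return sorted({r.query for r in records if r.answer == answer})


SAMPLE = [
    "1385545938||10.0.1.2||10.1.1.10||IN||stores.ebay.fr.||CNAME||stores.shop.ebay.fr.||900",
    "1385545938||10.0.1.2||10.1.1.10||IN||stores.ebay.fr.||CNAME||stores.shop.intl.ebay.com.||900",
]
records = [parse_line(line) for line in SAMPLE]
print(clients_for(records, "stores.ebay.fr."))
print(names_sharing_answer(records, "stores.shop.ebay.fr."))
```

Because request and response sit in the same record, both lookups are a
single pass over the data with no correlation step.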

Suricata on the other hand outputs data in multiple lines:

11/27/2013-11:06:49.845594 [**] Query TX 82f3 [**] td.twitter.com [**] A [**] x.x.x.x:15937 -> 208.78.70.34:53
11/27/2013-11:06:49.845594 [**] Response TX 82f3 [**] td.twitter.com [**] A [**] TTL 30 [**] 199.59.150.8 [**] 208.78.70.34:53 -> x.x.x.x:15937
11/27/2013-11:06:49.845594 [**] Response TX 82f3 [**] td.twitter.com [**] A [**] TTL 30 [**] 199.59.149.231 [**] 208.78.70.34:53 -> x.x.x.x:15937
11/27/2013-11:06:49.845594 [**] Response TX 82f3 [**] td.twitter.com [**] A [**] TTL 30 [**] 199.59.148.92 [**] 208.78.70.34:53 -> x.x.x.x:15937
11/27/2013-11:06:49.845594 [**] Response TX 82f3 [**] twitter.com [**] NS [**] TTL 20864 [**] ns3.p34.dynect.net [**] 208.78.70.34:53 -> x.x.x.x:15937
11/27/2013-11:06:49.845594 [**] Response TX 82f3 [**] twitter.com [**] NS [**] TTL 20864 [**] ns1.p34.dynect.net [**] 208.78.70.34:53 -> x.x.x.x:15937
11/27/2013-11:06:49.845594 [**] Response TX 82f3 [**] twitter.com [**] NS [**] TTL 20864 [**] ns2.p34.dynect.net [**] 208.78.70.34:53 -> x.x.x.x:15937
11/27/2013-11:06:49.845594 [**] Response TX 82f3 [**] twitter.com [**] NS [**] TTL 20864 [**] ns4.p34.dynect.net [**] 208.78.70.34:53 -> x.x.x.x:15937

The output is a LOT more verbose than gamelinux pdns, which is great as
I expect to be able to see more "weird" things.
However, do notice that you need additional lookups to correlate the ID
(82f3) between the query and the response. If your data is indexed you
will need to search for the ID, and because the ID is so short (16 bits)
you will get lots of duplicates from unrelated transactions.
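
A rough Python sketch of that correlation step follows. It assumes each
log record is a single line once the mail wrapping is undone, and it
only handles the two line shapes shown above (other response shapes are
skipped). To reduce ID collisions it groups on the TX ID plus the client
and server endpoints, not on the ID alone. The function names
parse_dns_log_line and group_transactions are made up for this example.

```python
from collections import defaultdict

SEP = " [**] "


def parse_dns_log_line(line):
    """Parse one unwrapped log line; return None for shapes not handled."""
    parts = line.strip().split(SEP)
    if len(parts) not in (5, 7):
        return None
    kind, _, tx_id = parts[1].split()  # e.g. "Query TX 82f3"
    src, dst = parts[-1].split(" -> ")
    entry = {"time": parts[0], "kind": kind, "tx": tx_id,
             "rrname": parts[2], "rrtype": parts[3]}
    if kind == "Query" and len(parts) == 5:
        entry["client"], entry["server"] = src, dst
    elif kind == "Response" and len(parts) == 7:
        # Responses flow server -> client, so swap the endpoints.
        entry["client"], entry["server"] = dst, src
        entry["ttl"] = int(parts[4].split()[1])  # "TTL 30" -> 30
        entry["rdata"] = parts[5]
    else:
        return None
    return entry


def group_transactions(lines):
    """Fold per-record lines into one bucket per (tx, client, server)."""
    groups = defaultdict(lambda: {"queries": [], "answers": []})
    for line in lines:
        entry = parse_dns_log_line(line)
        if entry is None:
            continue
        key = (entry["tx"], entry["client"], entry["server"])
        bucket = "queries" if entry["kind"] == "Query" else "answers"
        groups[key][bucket].append(entry)
    return dict(groups)


SAMPLE_LOG = [
    "11/27/2013-11:06:49.845594 [**] Query TX 82f3 [**] td.twitter.com [**] A [**] x.x.x.x:15937 -> 208.78.70.34:53",
    "11/27/2013-11:06:49.845594 [**] Response TX 82f3 [**] td.twitter.com [**] A [**] TTL 30 [**] 199.59.150.8 [**] 208.78.70.34:53 -> x.x.x.x:15937",
    "11/27/2013-11:06:49.845594 [**] Response TX 82f3 [**] twitter.com [**] NS [**] TTL 20864 [**] ns3.p34.dynect.net [**] 208.78.70.34:53 -> x.x.x.x:15937",
]
groups = group_transactions(SAMPLE_LOG)
for key, tx in groups.items():
    print(key, [a["rdata"] for a in tx["answers"]])
```

Over a long capture the same (ID, client, server) triple can still
repeat, so a real version would probably also bound the grouping by
time.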

In the end I think I'll just wait for the json output, as it might be
a lot easier.

@Peter: do you think it'd be possible to also log the time between
query/response? I'm wondering about the things that might come out of
it.

Cheers
Christophe

On Wed, Nov 27, 2013 at 10:16 AM, Kevin Ross <kevross33 at googlemail.com> wrote:
> With DNS if you haven't already I would also suggest using this:
>
> https://github.com/gamelinux/passivedns
> www.alienvault.com/open-threat-exchange/blog/identifying-suspicious-domains-using-dns-records
> http://www.net-security.org/article.php?id=1844&p=1.
>
> I use it and I find it useful to be able to query quickly via the web
> interface, and to have a mysql database on which I can run other queries
> and try out ideas (such as newly seen suspicious domains etc). It also has
> blacklist & regex support and, as described in the last link provided, I
> use it in combination with a SIEM to identify domain generation
> algorithms; it has proven reliable against Zeus and other DGA malware.
>
> While it may not be as verbose in tracing down where queries are coming from
> (especially if you can get a sensor in between your clients and DNS servers
> to get the source) I find it excellent for large scale analysis and queries.
> Also with the web interface I have Virustotal (which is excellent for
> passiveDNS as it links malware from or speaking to the address and other
> files), BFK and also my local PDNS as Snorby lookup sources. I find the
> local one exceptionally useful: when you have a query you can see all the
> domain names that were actually queried inside your network for it and
> can then quickly determine other linked IPs, domain names etc seen in your
> organisation. Also, the database size in a large network with 30,000 users
> is relatively small (500MB for what is now nearly 2 months of data), so
> keeping large sets of historic data to find malicious domains later is
> possible.
>
> So far using this I have looked into using the data reliably to detect these
> things:
> - DGA malware
> - Other malware domains
> - Malicious domains (exploit kits, fake AVs etc).
> - Now looking into fast flux detection and also other ways of detecting
> malicious hosting infrastructure.
>
> I think once I have some kind of decent data reduction from the basic
> queries on the data already available, I am hoping to output that data
> from the database and do other automatic analysis on it to reduce that set
> further with other features (such as geoIP, creation dates etc).
>
> Oh and reading the academic papers from Damballa
> https://www.damballa.com/damballa-labs/publications.php and openDNS/Umbrella
> Labs http://labs.umbrella.com/blog/ may help give you other ideas of using
> your DNS data to detect badness however you choose to collect your DNS data.
>
> Hope that helps you a bit,
> Kevin
>
>
>
> On 26 November 2013 08:49, Christophe Vandeplas <christophe at vandeplas.com>
> wrote:
>>
>> Hi list,
>>
>>
>> In the past I've been using another tool to do DNS logging, and now
>> I'd like to use Suricata for this. The format of the file is
>> completely different, and also a part of the interpretation (Suricata
>> is a LOT more verbose and complete)
>>
>> DNS logging of Suricata is multiple lines per DNS request (and
>> response). So searching for things requires multiple greps and
>> filtering out duplicate ids.
>>
>> I'm wondering how others use this DNS logging.
>> All stories (on or off-list) and practical use-cases are welcome.
>> I'll do my best to document these on the wiki so that others can
>> benefit from this info.
>>
>> As far as I understand there seem to be plans to transform the logging
>> into json, is there already an idea about when that's to be expected?
>>
>>
>> Thanks
>> Kind regards
>> Christophe