[Discussion] RFC: Proposal for Analysis Framework

Sun Oct 19 23:37:53 UTC 2008

This is my first draft for describing the analysis framework that I think we
need to combat current and future threats.  Identifying the flaws in this
first draft will allow us to more clearly state what we do and do not want
to accomplish and what will and will not work to that end.  Please comment
liberally as to what you agree and disagree with.  I will begin with the
overall goals, and then describe an outline for a general technical
architecture which aims to complete those goals.  Then I will propose a
specific technical implementation of the general technical architecture with
an example of how the implementation would execute, followed by a high-level
list of the sub-projects required to construct the framework.

Goal One: Identify Contexts of Network Traffic

Determining behavior means determining action within a context.  Therefore,
the prerequisite to identifying behavior is establishing a context for all
actions.  This means converting packets into streams, streams into
applications, and applications into contexts.  Contexts can be simple like
"an HTTP download has occurred" or more specific like "a Google search" or
"an update from Microsoft."
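
A minimal sketch of the last step of that chain (stream to application to
context), in Python purely for illustration.  The names Stream, Context, and
identify_context are hypothetical, and the two matching rules are toy
stand-ins for whatever the inspection system would actually provide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Stream:
    """A reassembled stream (hypothetical shape, for illustration only)."""
    src: str
    dst: str
    dst_port: int
    payload: bytes


@dataclass
class Context:
    application: str  # e.g. "HTTP"
    label: str        # e.g. "Google search"


def identify_context(stream: Stream) -> Optional[Context]:
    """Map a stream to a context using toy layer 7 rules."""
    if not stream.payload.startswith((b"GET ", b"POST ")):
        return None  # not recognized as HTTP by this toy rule

    lines = stream.payload.split(b"\r\n")
    parts = lines[0].split(b" ")
    path = parts[1].decode("ascii", "replace") if len(parts) >= 2 else ""

    # Pull the Host header, if any, to refine the generic HTTP context.
    host = ""
    for line in lines[1:]:
        if line.lower().startswith(b"host:"):
            host = line.split(b":", 1)[1].strip().decode("ascii", "replace").lower()
            break

    if host.endswith("google.com") and path.startswith("/search"):
        return Context("HTTP", "Google search")
    return Context("HTTP", "HTTP request")
```

A stream whose payload is a GET for /search with a Host of www.google.com
would come back labeled "Google search"; anything that does not look like
HTTP yields no context at all.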

Goal Two: Analyze Contexts to Classify Behavior

The identified contexts are passed to a higher level framework for
classifying the behavior of traffic based on its context.  This requires
applying information available from a variety of sources to given contexts
including performing heuristic-based analysis possibly using sandboxing
technology, applying reputations from external sources, applying long-term
contextual analysis with historical information, and consulting data
describing the current environment.  If the resulting classification of the
context is considered malicious, it is forwarded on to the result management
portion of the framework.
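
One way to picture this step is a set of independent scoring sources whose
results are combined into a verdict.  The sketch below is in Python and is
only an illustration: the source functions, the scores, the 0.5 threshold,
and the "take the worst score" rule are all assumptions, not part of the
proposal.

```python
from typing import Callable, Dict, Tuple

# Each source looks at a context and returns a suspicion score in [0, 1].
Source = Callable[[dict], float]

# Hypothetical reputation data; 203.0.113.7 is a documentation-range address.
KNOWN_BAD_IPS = {"203.0.113.7"}


def reputation(context: dict) -> float:
    """External reputation: full score if the remote IP is already listed."""
    return 1.0 if context["remote_ip"] in KNOWN_BAD_IPS else 0.0


def history(context: dict) -> float:
    """Historical analysis: mild suspicion for a never-before-seen IP."""
    return 0.3 if context.get("first_seen", False) else 0.0


def classify(context: dict, sources: Dict[str, Source],
             threshold: float = 0.5) -> Tuple[str, Dict[str, float]]:
    """Score the context with every source and return (verdict, scores)."""
    scores = {name: source(context) for name, source in sources.items()}
    worst = max(scores.values(), default=0.0)
    verdict = "malicious" if worst >= threshold else "benign"
    return verdict, scores
```

Keeping the per-source scores alongside the verdict means the result
management side can show why a context was flagged, not just that it was.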

Goal Three: Manage and Communicate Results

Once traffic is classified as malicious, the information needs to be
communicated in a way that can be acted upon.  This means storing the result
and delivering the result to the framework's administrators running the
local implementation.  This also includes automatically formatting the
results and delivering them to a community repository.  The mechanism for
doing this must be extremely flexible and extensible as it will require many
organization-specific customizations.  The framework must allow for
configuration of the components, monitoring of the components' performance,
archival of results, an extensive query interface, communication of results
in a sanitized way, and extensive reporting capabilities.  This needs to be
performed in a scalable way which achieves as close to realtime performance
as possible.

Goal Four: Act on Results

Managing results includes defining which results require further action.
The framework must provide actions which can be taken on the results.  This
includes preventing intrusions by blocking network traffic and initiating
incident handling procedures.  The implementations of these actions will
vary so greatly between organizations that the framework must provide a
plugin architecture for interfacing with it.  Example plugins
providing the most common actions should be included with the framework, but
the actual creation of plugins should be considered out of scope of the
project proper and instead be considered a separate (but important)
side-project.
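
A plugin architecture of this kind can be as small as a name-to-function
registry.  The following Python sketch is only an illustration under that
assumption; the registry, the decorator, and both example plugins are
hypothetical, and the plugins merely describe what they would do.

```python
from typing import Callable, Dict, List

# Registry of action plugins, keyed by the name used in configuration.
ACTIONS: Dict[str, Callable[[dict], str]] = {}


def action(name: str):
    """Decorator that registers a function as an action plugin."""
    def register(fn: Callable[[dict], str]) -> Callable[[dict], str]:
        ACTIONS[name] = fn
        return fn
    return register


@action("blackhole")
def blackhole(result: dict) -> str:
    # A real plugin would talk to the organization's router here.
    return f"would add blackhole route for {result['ip']}"


@action("notify")
def notify(result: dict) -> str:
    # A real plugin would render and send an email template here.
    return f"would email admins about {result['ip']}"


def act(result: dict, wanted: List[str]) -> List[str]:
    """Run the configured actions for a result, skipping unknown names."""
    return [ACTIONS[name](result) for name in wanted if name in ACTIONS]
```

An organization would then ship its own plugins by registering more
functions, without touching the framework's dispatch code.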

General Architecture

To establish network contexts and trigger on specific traffic, the basic
component will be a network inspection system capable of full-content
capture, session capture, and event generation from predefined criteria.
The inspection system will use these predefined criteria to parse and
contextualize traffic according to its layer 7 attributes.  Predefined
triggers will forward noteworthy contexts to the analysis framework where
correlation and high level analysis will take place.

This framework will be built from high level, modular code which focuses on
moving information asynchronously so that any given step in the analysis
will not block the analysis of other contexts.  This framework will act as
the clutch between the low-level, packet-bound inspection system and the
high-level user interaction.
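
To illustrate what "will not block the analysis of other contexts" means in
practice, here is a small Python sketch using asyncio.  It only stands in
for the idea; the function names are hypothetical and the sleeps take the
place of slow steps such as a sandbox run or a remote lookup.

```python
import asyncio
from typing import List, Tuple


async def analyze(name: str, delay: float, finished: List[str]) -> None:
    """Pretend to analyze one context; the sleep stands in for slow I/O."""
    await asyncio.sleep(delay)
    finished.append(name)


async def run_all(jobs: List[Tuple[str, float]]) -> List[str]:
    """Start every analysis at once and record the order they complete in."""
    finished: List[str] = []
    await asyncio.gather(*(analyze(name, delay, finished) for name, delay in jobs))
    return finished


def completion_order(jobs: List[Tuple[str, float]]) -> List[str]:
    return asyncio.run(run_all(jobs))
```

Submitting a slow job ahead of a quick one, for example
completion_order([("slow", 0.05), ("fast", 0.01)]), has the quick one
finish first, because the slow analysis never holds up the event loop.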

User interaction will occur primarily via a web console extensive enough to
provide a way to configure, monitor, and manage the system as it runs, as
well as a way of viewing, querying, and acting upon its results.  The
analysis framework will be the bridge between the web console and the
inspection system, updating the web console in realtime, and allowing the
administrator to issue actions for results.

Result actions will include active response (terminating offending
connections by sending them reset packets), routing table changes to employ
blackhole route filtering, configurable email notifications, and list
submission mechanisms.

Specific Implementation

Network Inspection:

Full content capture: Time Machine
<http://www.net.t-labs.tu-berlin.de/research/tm/TM_HOWTO_tm-20080814-0>
Inspection: Bro <http://bro-ids.org>
Flow data: Netflow (via routers or something like nProbe)
<http://www.ntop.org/nprobe.html>
Communications: Broccoli plus a newly created Perl plugin similar to the
existing Python plugin

Context Analysis and Result Management:

I propose a framework for content analysis based on Perl using the POE
framework.  Perl is the right "weight" of language in that it can perform
low-level tasks like sending raw network packets while providing access to a
rich code base of high-level tasks like analyzing pcaps, checking client
patch levels and asset information in a CMDB, and performing web queries.
The POE event framework for Perl <http://poe.perl.org> is a robust, mature,
formal framework for asynchronous event handling.  It specializes in
non-blocking input/output operations with a built-in framework for
communicating natively between nodes either on the same host or across a
network.  It has an extensive code base created by a large user community
and is extremely scalable and flexible.  Proof of the compatibility of its
design goals can be found in the fact that the project's creator, Rocco
Caputo, made a module specifically designed to tail Snort log files and take
actions on events.

POE also has the great strength of being the perfect server-side partner for
AJAX requests from a web browser.  This would allow for a thin layer of
separation between what occurs on the backend traffic inspection nodes and
what is communicated to the end user via a web console, unifying the two
technologies.  This advantage is critical because it reduces the amount of
overall code and complexity of the project and encourages a modular
framework in which functionality for inspection and functionality for
management can be created using the same technology and with a minimal
amount of separation.

Platform: Perl + POE
Blacklist sharing: centralized XML submission to Emerging Threats

Result Actions:
Forged-packet active response by sending RST/ICMP prohibited packets via
Perl
Routing table alteration via active router configuration via Perl or an
on-board routing daemon like Zebra
Email templates for notification/incident creation

Example Execution Path

Here is an example of how the proposed technical implementation would detect
zero-day malware from an unknown IP:

   1. A user inside the org is clickjacked and downloads new malware from a
   previously unknown IP.
   2. This matches predefined criteria instructing Bro to begin extracting
   files from the network stream of any HTTP download with an executable
   header and to forward the extracted file and related netflow to the event
   analysis framework via the Perl-Broccoli connection.
   3. The Perl event analysis framework receives this new event complete
   with flows and files extracted from HTTP.
   4. The extracted files are submitted to CWSandbox via a framework plugin
   for CWSandbox, which in turn automatically submits to VirusTotal.
   5. CWSandbox emails the results back to the framework, which parses the
   email and stores the results.  The framework checks the CWSandbox network
   activity results and observes that an outbound connection to a separate IP
   was reported.
   6. The CWSandbox output and VirusTotal output show that the download was
   malicious and detected by several AV vendors.  Both IP addresses are marked
   as malicious.
   7. The framework issues a query to Time Machine via the Time Machine
   remote query console for all packets to and from the second malicious IP
   address.
   8. The framework uses the Net::Analysis Perl module to analyze the pcap
   from Time Machine and determines that there were packets present and that
   they were not all unanswered SYN packets, indicating that this host was
   successfully compromised and that there was possible data leakage.
   9. The framework creates a report of the event and issues a high-priority
   notification to the organization's network security admin staff.  It then
   submits the malware sample to Symantec's web submit page, which will create
   rapid response definitions to remove the malware.
   10. The framework updates the organization's blackhole router to include
   both malicious IPs, preventing any further data leakage and shielding the
   other clients.
   11. The framework downloads and installs the new rapid response
   definitions, mounts the infected client's C$ share via Samba, initiates a
   scan of the network share, and either cleans the malware or sends an email
   to the help desk system requesting a re-imaging of the client.
   12. The framework submits both of the malicious IPs to Emerging Threats'
   blacklist.
   13. ET adds the IPs to the malware blacklist.
   14. Another organization's IDS framework checks in and updates its lists
   of bad IPs from ET.
   15. The other org's framework notices the newly added IPs and updates its
   blackhole routes accordingly, protecting its org.
   16. Fifteen minutes later, the network security staff at the original org
   return from lunch and read the report of all that occurred while they were
   at Chotchkies.
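
The check in step 8 can be sketched as follows.  This is a Python
illustration only: it works on simplified packet summaries (dicts with src,
dst, and a set of TCP flags) rather than a real pcap, and the function name
and record layout are assumptions, not the Net::Analysis interface.

```python
from typing import Dict, List


def had_real_conversation(packets: List[Dict], suspect_ip: str) -> bool:
    """True if traffic involving suspect_ip went beyond unanswered SYNs.

    Any packet other than a bare SYN sent toward the suspect IP counts,
    including any reply coming back from it.
    """
    relevant = [p for p in packets if suspect_ip in (p["src"], p["dst"])]
    for p in relevant:
        bare_syn = p["flags"] == {"SYN"}
        outbound = p["dst"] == suspect_ip
        if not (bare_syn and outbound):
            return True
    # Either no packets at all, or nothing but unanswered outbound SYNs.
    return False
```

A capture holding only repeated outbound SYNs to the second IP would return
False (the connection never got going), while a capture with a SYN-ACK or
any data segment would return True and justify the "possible data leakage"
conclusion.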

Tasks:

The tasks for this implementation can be broken down into sub-projects.

   - Create a Perl interface for Bro's Broccoli
   - Create an ET repository for Bro scripts
   - Create the Perl analysis framework
   - Create analysis framework plugins for various tasks
   - Create a standard format for blacklists
   - Create a web console frontend for the framework

Some of these are huge tasks, but there is such a large amount of code
already available for Perl on CPAN that the framework really consists of
knitting together different modules to make a coherent whole.  Also, a lot
of it can be done in parallel by separate working groups.
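
For the blacklist-format task, an XML submission might look something like
what the Python sketch below produces.  The element and attribute names
(blacklist, entry, address, reason, submitter) are invented here for
illustration; no such format has been agreed on.

```python
import xml.etree.ElementTree as etree
from typing import Dict, List


def build_submission(org: str, entries: List[Dict[str, str]]) -> str:
    """Serialize blacklist entries to a (hypothetical) XML submission."""
    root = etree.Element("blacklist", attrib={"submitter": org})
    for e in entries:
        item = etree.SubElement(root, "entry", attrib={"type": "ip"})
        etree.SubElement(item, "address").text = e["ip"]
        etree.SubElement(item, "reason").text = e["reason"]
    return etree.tostring(root, encoding="unicode")


def parse_submission(xml_text: str) -> List[Dict[str, str]]:
    """Read the entries back out, as a receiving repository might."""
    root = etree.fromstring(xml_text)
    return [
        {"ip": item.findtext("address"), "reason": item.findtext("reason")}
        for item in root.findall("entry")
    ]
```

Whatever the final format is, being able to round-trip it like this on both
the submitting and the receiving side is the property that matters.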

A note about other technologies: there are lots of ways to implement what
I've discussed.  The technologies I listed for implementation are ones which
already do what I've discussed or for which I have working prototypes.  I
hope this proposal starts a discussion on both the design and implementation
choices I've laid out here.