auscert finding needles in haystacks (the size of countries)
Michael Baker @cloudjunky
AusCERT - May 2013
Finding Needles in Haystacks (the size of countries)
Me
Michael Baker
CTO and Co-Founder of Packetloop.
Pioneering Big Data Security Analytics.
Spoken at Black Hat and Ruxcon.
http://bitly.com/bundles/packetloop/1
“We build toys. Some of those toys change the world”
- Nassim Nicholas Taleb
Uncertainty
Risk you can’t measure.
Castles, Casinos, War Zones.
Let’s get Bayesian (H|E).
Industry suffering from overfit.
Signatures, limited data, work once.
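The (H|E) notation above reads as the probability of a hypothesis H given evidence E. A minimal sketch of that update follows. The hypothesis, the evidence and all three input probabilities are made-up values for illustration and do not come from the talk.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E)."""
    # Total probability of seeing the evidence, with or without H.
    p_e = p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
    return p_e_given_h * prior / p_e

# Hypothetical numbers, chosen only for illustration:
# H = "this session is an attack", E = "the sensor raised an alert".
p_attack_given_alert = posterior(
    prior=0.001,            # assumed base rate of attacks
    p_e_given_h=0.99,       # assumed chance of an alert on a real attack
    p_e_given_not_h=0.01,   # assumed chance of an alert on benign traffic
)
print(f"P(H|E) = {p_attack_given_alert:.3f}")
```

With these assumed inputs the posterior stays low even though the sensor looks accurate, because the base rate dominates. That is one way to read the warning about limited data.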
Exhibit A
CVE-2011-3192 - “Apache Killer”
auxiliary/dos/http/apache_range_dos 2011-08-19 normal Apache Range header DoS (Apache Killer)
Snort 1:19825
/Range\s*\x3A\s*bytes=([\d\x2D]+\x2C){50}/Hsmi
/Range\s*\x3A\s*bytes=([\d\x2D]+[\x2C\s]*){50}/Hsmi
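The difference between the two patterns can be checked directly. The sketch below assumes they are an earlier and a later revision of the same rule, which the slide does not state. The test headers are hypothetical, and Snort's H modifier (match against the normalised HTTP header buffer) has no Python equivalent, so it is simply dropped.

```python
import re

# Map the PCRE modifiers s, m and i to Python flags; "H" is Snort-specific.
FLAGS = re.S | re.M | re.I
STRICT = re.compile(r"Range\s*\x3A\s*bytes=([\d\x2D]+\x2C){50}", FLAGS)
LOOSE = re.compile(r"Range\s*\x3A\s*bytes=([\d\x2D]+[\x2C\s]*){50}", FLAGS)

def range_header(separator, count=60):
    """Build a hypothetical Range header with `count` overlapping ranges."""
    return "Range: bytes=" + separator.join(f"5-{i}" for i in range(count))

plain = range_header(",")    # ranges separated by a bare comma
spaced = range_header(", ")  # same ranges, with a space after each comma

results = {
    name: (bool(STRICT.search(header)), bool(LOOSE.search(header)))
    for name, header in (("plain", plain), ("spaced", spaced))
}
for name, (strict_hit, loose_hit) in results.items():
    print(f"{name}: first pattern={strict_hit}, second pattern={loose_hit}")
```

In this sketch the first pattern misses the header once a space follows each comma, while the second still matches. A signature fitted to one observed exploit can fail on a trivial variation in exactly this way.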
Unknown Unknowns.
Prevention Fails.
Detection is the key.
Prevention is the goal.
The Big Data Promise*
Full fidelity, higher accuracy, no aggregation, size and scale.
Model complexity.
Apply real science to the problem.
“There are more chess games than the number of atoms in the universe” - Diego Rasskin Gutman
Induction and the Turkey Problem.
Kill Chains
Reconnaissance
Weaponisation
Delivery
Exploitation
Installation
Command and Control
Actions and Objectives
APT1 Kill Chain
Malware link or executable sent to target (spear phishing or watering hole).
Malware executed.
Establish Command and Control.
Lateral movement through privilege escalation.
Data Compressed and Exfiltrated.
Invasion Games
Attackers vs Defenders.
Attackers look to stretch, avoid and challenge defensive lines to achieve their goal.
Security is a contact sport.
Manipulate Time and Space.
Win collisions.
Invasion Games
Detect
Deny
Disrupt
Degrade
Deceive
Destroy
Big Data Security Analytics
Size and Scale
Visualization
Fidelity
Interaction
Outlier Detection
Attacker Profiling
Enrichment
Transform
Prediction and Probability
Intelligence sharing
Statistical Analysis
Feature Extraction
Machine Learning
Kill Chain Disruption
Size and Scale
Network Streams
Complete record of all network data.
Provides the highest fidelity to analysts.
Only way to really understand subtle, targeted attacks.
Play, pause and rewind your network.
No need to have a specific logging setup.
Dense feature space.
“The difficulty shifts from traffic collection to traffic analysis. If you can store hundreds of gigabytes of traffic per day, how do you make sense of it?” - Richard Bejtlich
Map Reduce
Packetpig
http://bit.ly/105AYxc
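As a rough illustration of the map and reduce steps applied to packet data, the sketch below sums payload bytes per conversation. It is plain Python, not Packetpig's own interface. The record layout and the addresses are assumptions made up for the example.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical packet records: (source IP, destination IP, payload bytes).
packets = [
    ("10.0.0.5", "192.0.2.10", 1200),
    ("10.0.0.5", "192.0.2.10", 800),
    ("10.0.0.7", "198.51.100.3", 60),
    ("10.0.0.5", "198.51.100.3", 40),
]

def map_packet(packet):
    """Map step: emit a (key, value) pair keyed by the conversation."""
    src, dst, size = packet
    return (src, dst), size

def reduce_pairs(totals, pair):
    """Reduce step: sum the values that share a key."""
    key, size = pair
    totals[key] += size
    return totals

bytes_per_conversation = reduce(
    reduce_pairs, map(map_packet, packets), defaultdict(int)
)
for conversation, total in sorted(bytes_per_conversation.items()):
    print(conversation, total)
```

On a cluster the map step would run in parallel across stored captures and the reduce step would merge the partial sums. Here both run in one process to keep the example self-contained.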
It’s all about Context.
Context
Enriched information, not just IP Addresses.
Additional intelligence on attackers.
Allows you to perform detective work.
What if? Branch analysis and exploring data.
Providing full fidelity and full context quickly.
It’s really about feature space.
Hindsight is 20/20
Realtime
Streaming
Visualisation
Anscombe’s Quartet
        I               II              III             IV
     x       y       x       y       x       y       x       y
  10.0    8.04    10.0    9.14    10.0    7.46     8.0    6.58
   8.0    6.95     8.0    8.14     8.0    6.77     8.0    5.76
  13.0    7.58    13.0    8.74    13.0   12.74     8.0    7.71
   9.0    8.81     9.0    8.77     9.0    7.11     8.0    8.84
  11.0    8.33    11.0    9.26    11.0    7.81     8.0    8.47
  14.0    9.96    14.0    8.10    14.0    8.84     8.0    7.04
   6.0    7.24     6.0    6.13     6.0    6.08     8.0    5.25
   4.0    4.26     4.0    3.10     4.0    5.39    19.0   12.50
  12.0   10.84    12.0    9.13    12.0    8.15     8.0    5.56
   7.0    4.82     7.0    7.26     7.0    6.42     8.0    7.91
   5.0    5.68     5.0    4.74     5.0    5.73     8.0    6.89
Source: Wikipedia http://bit.ly/110Se5y
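The point of the quartet is that four very different datasets share nearly identical summary statistics, so the numbers alone hide what a plot reveals. The sketch below recomputes those statistics from the published values. The variable and function names are illustrative only.

```python
from math import sqrt
from statistics import mean, variance

X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_SHARED,
          [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED,
           [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED,
            [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return cov / spread

# (mean x, sample variance x, mean y, sample variance y, correlation)
summary = {
    name: (mean(xs), variance(xs), mean(ys), variance(ys), pearson(xs, ys))
    for name, (xs, ys) in QUARTET.items()
}
for name, (mx, vx, my, vy, r) in summary.items():
    print(f"{name:>3}: mean x={mx:.2f} var x={vx:.2f} "
          f"mean y={my:.2f} var y={vy:.2f} r={r:.3f}")
```

All four rows should agree to about two decimal places, which is why the deck treats visualisation as a requirement and not decoration.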
Anscombe’s Quartet
Source: visual.ly - http://bit.ly/105BcEI
Full HD
Play, Pause, Rewind
Deep Packet Inspection
Finding Zero Days
Attacker Information
File Extraction
Bias Collisions
Producing information as it arrives in the stream.
Yaraprocessor
Chopshop
Enrich as much information as possible.
What’s the probability of the event?
ssdeep comparison
VirusShare_fe8ff84a23feb673a59d8571575fee0b
ssdeep comparison
Machine Learning
High dimensional feature space.
Models instead of signatures.
Classification (class prediction).
Operating system detection.
Protocol detection.
Finding novelty and outliers.
Trained models, real time predictions.
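As a small illustration of finding outliers without a signature, the sketch below flags values that sit far from the median, using the median absolute deviation (MAD). It is a simple statistical stand-in, not a trained model and not the method used in the talk. The flow sizes and the 3.5 threshold are assumptions.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Return values whose modified z-score exceeds `threshold`."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median: nothing can be scored.
        return []
    # 0.6745 scales the MAD so scores are comparable to z-scores
    # for normally distributed data.
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical bytes-per-flow observations for one host.
flow_bytes = [500, 520, 480, 510, 495, 505, 9000]
print(mad_outliers(flow_bytes))
```

The median-based score is used here because a single extreme value inflates the mean and standard deviation enough to hide itself in a small sample.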
Related ML Work
Frank Denis @jedisct1
Malware vs Big Data
Jason Trost @jason_trost and John Munro
Large Scale Malicious Domain Classification
Entropy and Covert Channels
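Entropy can serve as one feature for spotting encrypted or encoded payloads inside a protocol that normally carries text. A minimal sketch of byte-level Shannon entropy follows. The sample inputs are made up, and the talk does not say which entropy measure it used.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical payloads: a plain-text request line and a uniform byte spread.
text_like = b"GET /index.html HTTP/1.1"
uniform = bytes(range(256))

print(f"text-like: {shannon_entropy(text_like):.2f} bits/byte")
print(f"uniform:   {shannon_entropy(uniform):.2f} bits/byte")
```

Payloads close to 8 bits per byte look random, as compressed or encrypted data does, so an unusually high value on a text protocol is worth a closer look.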
Tor in HTTPS
Tor/HTTPS PCA
Meterpreter in HTTP
Meterpreter (HTTP)
Meterpreter Needle
Geocoding
Tor Endpoints
Torrent Triangulation
Torrent
1M attacks over 12 days.
17 attackers were also downloading torrents.
Tor and torrent use are generally mutually exclusive.
Good entropy on larger files for changing IPs.
Torrent client + UA + OS Classification
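The triangulation described above amounts to joining attack records with torrent peer records on source IP. A minimal sketch under that assumption follows. The field names, addresses and record contents are all hypothetical.

```python
# Hypothetical attack records (addresses from documentation ranges).
attacks = [
    {"ip": "203.0.113.9", "signature": "sql-injection"},
    {"ip": "198.51.100.20", "signature": "range-dos"},
    {"ip": "203.0.113.9", "signature": "dir-traversal"},
]

# Hypothetical torrent peer sightings.
torrent_peers = [
    {"ip": "203.0.113.9", "client": "ExampleTorrent/1.0", "title": "example-file"},
    {"ip": "192.0.2.77", "client": "ExampleTorrent/2.1", "title": "other-file"},
]

# Index peer sightings by IP so each attacker can be looked up directly.
peers_by_ip = {}
for peer in torrent_peers:
    peers_by_ip.setdefault(peer["ip"], []).append(peer)

# Keep only attackers that were also seen as torrent peers.
attacker_ips = {attack["ip"] for attack in attacks}
overlap = {ip: peers_by_ip[ip] for ip in attacker_ips if ip in peers_by_ip}

for ip, sightings in overlap.items():
    print(ip, [s["client"] for s in sightings])
```

The torrent client string found this way can then be compared with the user agent and operating system fingerprint seen in the attack traffic.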
Really?
7 Weeks to 100 Push-Ups: Strengthen and Sculpt Your Arms, Abs, C
1000 Photoshop Tips and Tricks (Dec 2010)-Mantesh
Footloose.2011.DVDRip.XviD- PADDO
Decision Making
Half Life of Data
Incredibly valuable just after creation.
What is the half life of security data?
Need to accommodate post hoc delivery of information.
Probabilistic models making real time decisions.
Full fidelity and long histories for Tactical, Operational and Strategic decisions.
Source: Nucleus Research - http://bit.ly/10BRAeZ
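Taking the half-life metaphor literally gives an exponential decay curve for the value of a piece of data. The sketch below assumes that model. The slides do not give a formula, and the 30-minute half-life is a made-up figure.

```python
def remaining_value(initial_value, age, half_life):
    """Value left after `age`, assuming it halves every `half_life` units."""
    return initial_value * 0.5 ** (age / half_life)

# Hypothetical: an alert worth 1.0 at creation, with a 30-minute half-life.
HALF_LIFE_MINUTES = 30
for age in (0, 30, 60, 240):
    value = remaining_value(1.0, age, HALF_LIFE_MINUTES)
    print(f"after {age:>3} min: {value:.4f}")
```

Under this assumption most of the value is gone within a few half-lives, which is the argument for real-time decisions. Long histories still matter when information arrives after the fact.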
This is not SIEM.
!SIEM
Real time
Full Fidelity
Explore and explain the data (evidence).
Play, Pause and Rewind.
Blink-and-you-miss-it technology.
No aggregation. No parsers. Frictionless.
Clear intelligence.
Decision Making Platform.
One thing we can count on
Changing Tactics
Kill Chains will change.
Commit, shift, delay defenders.
Commit to triaging an event that is not the real event.
Shift defenders to locations or targets.
Create doubt in defenders to keep them stationary.
@packetloop @packetpig
Questions?