autograph toward automated, distributed worm signature detection (hyang-ah kim, brad karp)
DESCRIPTION
Autograph Toward Automated, Distributed Worm Signature Detection (Hyang-Ah Kim, Brad Karp). Yunhai & Justin. Agenda. Autograph Overview Introduction & Motivation Design Goals Autograph System Design Selecting Suspicious Traffic Content-Based Signature Generation Evaluation - PowerPoint PPT PresentationTRANSCRIPT
AutographToward Automated, Distributed Worm Signature Detection (Hyang-Ah Kim, Brad Karp)
Yunhai & Justin
2
Agenda Autograph Overview Introduction & Motivation Design Goals Autograph System Design
Selecting Suspicious Traffic Content-Based Signature Generation
Evaluation Local Signature Detection Distributed Signature Detection
Attacks & Limitations Conclusion & Future Work Questions & Answers
3
Autograph Overview What is Autograph
A system that automatically generates signatures for novel Internet worms that propagate using TCP transport
How does it work Generates signatures by analyzing the
prevalence of portions of flow payloads What is the main feature
Designed to produce signatures that exhibit high sensitivity and high specificity
4
Introduction Self-replicating Worm
Remotely exploits a software vulnerability Severity goes far beyond mere inconvenience
High cost in lost productivity Code Red: $2.6 billion
Researches Worm Quarantine Worm Origin Identification Using Random Moonwalks
Assumes attack flows do not use spoofed source IP address Assumes worm propagation occurs in a tree-like structure
5
Methods for Containing Internet Worm Quarantine Techniques
Destination port blocking Discovering ports on which worms appear to be
spreading, and filtering all traffic destined for those ports
Infected source host IP blocking Discovering source addresses of infected hosts and
filtering all traffic from those source addresses Content-based blocking
Discovering the payload content string that a worm uses in its infection attempts, and filtering all flows contain it
6
Motivation Motivation
How should one obtain worm content signatures for use in content-based filtering accurately and quickly?
Worm Signature
05:45:31.912454 90.196.22.196.1716 > 209.78.235.128.80: . 0:1460(1460) ack 1 win 8760 (DF)0x0000 4500 05dc 84af 4000 6f06 5315 5ac4 16c4 [email protected] d14e eb80 06b4 0050 5e86 fe57 440b 7c3b .N.....P^..WD.|;0x0020 5010 2238 6c8f 0000 4745 5420 2f64 6566 P."8l...GET./def0x0030 6175 6c74 2e69 6461 3f58 5858 5858 5858 ault.ida?XXXXXXX0x0040 5858 5858 5858 5858 5858 5858 5858 5858 XXXXXXXXXXXXXXXX . . . . .0x00e0 5858 5858 5858 5858 5858 5858 5858 5858 XXXXXXXXXXXXXXXX0x00f0 5858 5858 5858 5858 5858 5858 5858 5858 XXXXXXXXXXXXXXXX0x0100 5858 5858 5858 5858 5858 5858 5858 5858 XXXXXXXXXXXXXXXX0x0110 5858 5858 5858 5858 5825 7539 3039 3025 XXXXXXXXX%u9090%0x01a0 303d 6120 4854 5450 2f31 2e30 0d0a 436f 0=a.HTTP/1.0..Co .
Signature for CodeRed II
Signature: A Payload Content String Specific To A Worm
7
Content-based Blocking
Our networkX
Traffic Filtering
Internet
Signature for CodeRed II
Can be used by Bro, Snort, Cisco’s NBAR, ...
8
Signature derivation is too slow Current Signature Derivation Process
New worm outbreak Report of anomalies from people via phone/email/newsg
roup Worm trace is captured Manual analysis by security experts Signature generation
Labor-intensive, Human-mediated
9
Design Assumptions Focus on TCP worms that propagate via
scanning
Actually, any transport in which spoofed sources cannot communicate
successfully in which transport framing is known to monitor
Worm’s payloads share a common substring Vulnerability exploit part is not easily mutable
Not polymorphic
10
Agenda Autograph Overview Introduction & Motivation Design Goals Autograph System Design
Selecting Suspicious Traffic Content-Based Signature Generation
Evaluation Local Signature Detection Distributed Signature Detection
Attacks & Limitations Conclusion & Future Work Questions & Answers
11
Design Goals Automation: Minimal manual intervention Signature quality: Sensitive & specific
Sensitive: match all worms low false negative rate
Specific: match only worms low false positive rate
Timeliness: Early detection Application neutrality
Broad applicability Bandwidth Efficiency Signature Quantity & Length
12
Agenda Autograph Overview Introduction & Motivation Design Goals Autograph System Design
Selecting Suspicious Traffic Content-Based Signature Generation
Evaluation Local Signature Detection Distributed Signature Detection
Attacks & Limitations Conclusion & Future Work Questions & Answers
13
Automated Signature Generation
Step 1: Select suspicious flows using heuristics Step 2: Generate signature using content-
prevalence analysis
Our network
Traffic Filtering
Internet Autograph Monitor
Signature
X
SignatureSignature
14
Heuristic: Flows from scanners are suspicious Focus on the successful flows from IPs who made unsuccessful con
nections to more than s destinations for last 24hours Suitable heuristic for TCP worm that scans network
Suspicious Flow Pool Holds reassembled, suspicious flows captured during the last time p
eriod t Triggers signature generation if there are more than flows
S1: Suspicious Flow SelectionReduce the work by filtering out vast amount of innocuous flows
Autograph (s = 2)
Non-existent
Non-existentThis flow will be
selected
15
S1: Suspicious Flow Selection
Heuristic: Flows from scanners are suspicious Focus on the successful flows from IPs who made unsuccessful con
nections to more than s destinations for last 24hours Suitable heuristic for TCP worm that scans network
Suspicious Flow Pool Holds reassembled, suspicious flows captured during the last time p
eriod t Triggers signature generation if there are more than flows
Reduce the work by filtering out vast amount of innocuous flows
16
S2: Signature Generation
All instances of a worm have a common byte pattern specific to the worm
Rationale Worms propagate by duplicating themselves Worms propagate using vulnerability of a service
Use the most frequent byte sequences across suspicious flows as signatures
How to find the most frequent byte sequences?
17
Divide Payloads into Content Block
Use the entire payload Brittle to byte insertion, deletion, reordering
GARBAGEEABCDEFGHIJKABCDXXXXFlow 1
Flow 2 GARBAGEABCDEFGHIJKABCDXXXXX
18
Divide Payloads into Content Block
Partition flows into non-overlapping small blocks and count the number of occurrences
Fixed-length Partition Still brittle to byte insertion, deletion, reordering
GARBAGEEABCDEFGHIJKABCDXXXXFlow 1
Flow 2 GARBAGEABCDEFGHIJKABCDXXXXX
19
Divide Payloads into Content Block
Content-based Payload Partitioning (COPP) Partition if Rabin fingerprint of a sliding window matches Breakmark Configurable parameters: content block size (minimum, average, ma
ximum), breakmark, sliding window Content Blocks
Breakmark = last 8 bits of fingerprint (ABCD)
GARBAGEEABCDEFGHIJKABCDXXXXFlow 1
Flow 2 GARBAGEABCDEFGHIJKABCDXXXXX
20
Definition Prevalence
The number of suspicious flows in which each content block occurs
21
Why Prevalence?
Worm flows dominate in the suspicious flow pool Content-blocks from worms are highly ranked
Nimda
CodeRed2
Nimda (16 different payloads)
WebDAV exploit
Innocuous, misclassified
Prevalence Distribution in Suspicious Flow Pool - From 24-hr http traffic trace
22
Select Most Frequent Content Block
A B D
A B E
A C E
A D
C F
C D G
B
f0
f1
f2
f3
f4
f5
H I Jf6
I H Jf7
G I Jf8
23
A
A
A
E
E
A
FC
C
C
D
D
DB
B
B H
H
G
G
I
I
I
J
J
J
Select Most Frequent Content Block
D
C
E
E
A
A
A
A D
FC
C D G
B
B
B
H
H
G
I
I
I
J
J
J
f0
f1
f2
f3
f4
f5
f6
f7
f8
f0 C F
f1 C D G
f2 A B D
f3 A C E
f4 A B E
f5 A B D
f6 H I J
f7 I H J
f8 G I J
24
Select Most Frequent Content Block
A
B
D
A
B E
A
C
E
A
D
C
F
C
D
GB H
I J
I
H
J
GI J
f0 C F
f1 C D G
f2 A B D
f3 A C E
f4 A B E
f5 A B D
f6 H I J
f7 I H J
f8 G I JP≥3
W≥90%Signature:
W: target coverage in suspicious flow poolP: minimum occurrence to be selected
25
Signature: A
Select Most Frequent Content Block
A
B
D
A
B E
A
C
E
A
D
C
F
C
D
GB H
I J
I
H
J
GI J
f0 C F
f1 C D G
f2 A B D
f3 A C E
f4 A B E
f5 A B D
f6 H I J
f7 I H J
f8 G I JP≥3
W≥90%
W: target coverage in suspicious flow poolP: minimum occurrence to be selected
26
Select Most Frequent Content Block
B
DBA
A
A
C E
E
A
D
F
C
C
D
GB H
I J
I
H
J
GI J
P≥3
W≥90%Signature: A
f0 C F
f1 C D G
f2 A B D
f3 A C E
f4 A B E
f5 A B D
f6 H I J
f7 I H J
f8 G I J
W: target coverage in suspicious flow poolP: minimum occurrence to be selected
27
Select Most Frequent Content Block
F
C
C D
G H
I J
I
H
J
GI J
P≥3
W≥90%Signature: A
f0 C F
f1 C D G
f2 A B D
f3 A C E
f4 A B E
f5 A B D
f6 H I J
f7 I H J
f8 G I J
I
W: target coverage in suspicious flow poolP: minimum occurrence to be selected
28
Select Most Frequent Content Block
F
C
C DG
P≥3
W≥90%Signature: A
f0 C F
f1 C D G
f2 A B D
f3 A C E
f4 A B E
f5 A B D
f6 H I J
f7 I H J
f8 G I J
ISignature:
W: target coverage in suspicious flow poolP: minimum occurrence to be selected
29
Agenda Autograph Overview Introduction & Motivation Design Goals Autograph System Design
Selecting Suspicious Traffic Content-Based Signature Generation
Evaluation Local Signature Detection Distributed Signature Detection
Attacks & Limitations Conclusion & Future Work Questions & Answers
30
Evaluation - Local Objectives
Effect of COPP parameters on signature quality Metrics
Sensitivity = # of true alarms / total # of worm flows false negatives
Efficiency = # of true alarms / # of alarms false positives
Trace Contains 24-hour http traffic Includes 17 different types of worm payloads
31
Signature Quality
Larger block sizes generate more specific signatures A range of w (90-95%, workload dependent)
produces a good signature
32
Number of Signatures
Smaller block sizes generate small # of signatures
33
Agenda Autograph Overview Introduction & Motivation Design Goals Autograph System Design
Selecting Suspicious Traffic Content-Based Signature Generation
Evaluation Local Signature Detection Distributed Signature Detection
Attacks & Limitations Conclusion & Future Work Questions & Answers
34
Signature Generation Speed Bounded by worm payload accumulation speed
Aggressiveness of scanner detection heuristics: # of failed connection peers to detect a scanner
# of payloads enough for content analysis: suspicious flow pool size to trigger signature generation
Single Autograph Worm payload accumulation is slow
InternetInternet
A
AA
A
A A
A
tattler
Distributed Autograph Share scanner IP list Tattler: limit bandwidth
consumption within a predefined cap
35
Benefit from tattler Worm payload accumulation (time to catch 5 worms)
Signature generation More aggressive scanner detection (s) and signature
generation trigger () faster signature generation, more false positives
With s=2 and =15, Autograph generates the good worm signature before < 2% hosts get infected
Info Sharing
Autograph Monitor
Fraction of Infected Hosts
Aggressive(s = 1)
Conservative (s = 4)
NoneLuckiest 2% 60%Median 25% --
Tattler All <1% 15%
Many innocuous misclassified flows
36
tattler
A modified RTCP (RTP Control Protocol) Limit the total bandwidth of announcements sent to
the group within a predetermined cap
37
Agenda Autograph Overview Introduction & Motivation Design Goals Autograph System Design
Selecting Suspicious Traffic Content-Based Signature Generation
Evaluation Local Signature Detection Distributed Signature Detection
Attacks & Limitations Conclusion & Future Work Questions & Answers
38
Attacks & Limitation Overload due to flow reassembly
Solutions Multiple instances of Autograph on separate HW (port-disjoint) Suspicious flow sampling under heavy load
Abuse Autograph for DoS: pollute suspicious flow pool
Port scan and then send innocuous trafficSolution Distributed verification of signatures at many monitors
Source-address-spoofed port scanSolution Reply with SYN/ACK on behalf of non-existent hosts/services
39
Agenda Autograph Overview Introduction & Motivation Design Goals Autograph System Design
Selecting Suspicious Traffic Content-Based Signature Generation
Evaluation Local Signature Detection Distributed Signature Detection
Attacks & Limitations Conclusion & Future Work Questions & Answers
40
Future Work Attacks Online evaluation with diverse traces & deployment
on distributed sites Broader set of suspicious flow selection heuristics
Non-scanning worms (ex. hit-list worms, topological worms, email worms)
UDP worms Egress detection Distributed agreement for signature quality testing
Trusted aggregation
41
Conclusion Stopping spread of novel worms requires
early generation of signatures Autograph: automated signature detection
system Automated suspicious flow selection→ Automated
content prevalence analysis COPP: robustness against payload variability Distributed monitoring: faster signature
generation Autograph finds sensitive & specific
signatures early in real network traces
42
Questions & Answers