NetShield: Matching a Large Vulnerability Signature Ruleset for High Performance Network Defense
NetShield: Matching a Large Vulnerability Signature
Ruleset for High Performance Network Defense
1
Yan Chen
Department of Electrical Engineering and Computer Science
Northwestern University
Lab for Internet & Security Technology (LIST)
http://list.cs.northwestern.edu
2
Background: NIDS/NIPS (Network Intrusion Detection/Prevention System) operation

[Figure: packets enter the NIDS/NIPS, which matches them against the signature DB and raises security alerts]

• Accuracy
• Speed
• Attack coverage
IDS/IPS Overview
4
State Of The Art
Regular expression (regex) based approaches
Used by: Cisco IPS, Juniper IPS, open source Snort
Example: .*Abc.*\x90+de[^\r\n]{30}

Pros
• Can efficiently match multiple sigs simultaneously, through a DFA
• Can describe the syntactic context

Cons
• Limited expressive power
• Cannot describe the semantic context
• Inaccurate
• Cannot combat Conficker!
5
State Of The Art
Vulnerability Signature [Wang et al. 04]
Vulnerability: design flaws that enable bad inputs to lead the program to a bad state

[Figure: a bad input drives the program from a good state to a bad state; the vulnerability signature characterizes such inputs]

Pros
• Directly describes the semantic context
• Very expressive; can express the vulnerability condition exactly
• Accurate

Cons
• Slow!
• Existing approaches all use sequential matching
• Require protocol parsing

Blaster Worm (WINRPC) example:
BIND: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && context[0].abstract_syntax.uuid==UUID_RemoteActivation
BIND-ACK: rpc_vers==5 && rpc_vers_minor==1
CALL: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && stub.RemoteActivationBody.actual_length>=40 && matchRE(stub.buffer, /^\x5c\x00\x5c\x00/)
6
Motivation of NetShield
Theoretical accuracy limitation of regexes

[Figure: accuracy vs. speed — state-of-the-art regex signature IDSes achieve high speed but low accuracy; existing vulnerability signature IDSes achieve high accuracy but low speed; NetShield targets both high accuracy and high speed]
7
Motivation
• Desired features for signature-based NIDS/NIPS
  – Accuracy (especially for IPS)
  – Speed
  – Coverage: large rulesets

             Regular Expression    Vulnerability (Shield [SIGCOMM'04])
Accuracy     Relatively poor       Much better
Speed        Good                  ??
Memory       OK                    ??
Coverage     Good                  ??

Regexes cannot capture the vulnerability condition well! The ?? entries are the focus of this work.
8
Vulnerability Signature Studies
• Use protocol semantics to express vulnerabilities
• Defined on a sequence of PDUs, with one predicate for each PDU
  – Example: ver==1 && method=="put" && len(buf)>300
• Data representations
  – For all the vulnerability signatures we studied, we only need numbers and strings
  – Number operators: ==, >, <, >=, <=
  – String operators: ==, match_re(.,.), len(.)

Blaster Worm (WINRPC) example:
BIND: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && context[0].abstract_syntax.uuid==UUID_RemoteActivation
BIND-ACK: rpc_vers==5 && rpc_vers_minor==1
CALL: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && stub.RemoteActivationBody.actual_length>=40 && matchRE(stub.buffer, /^\x5c\x00\x5c\x00/)
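Because the data representation above needs only numbers and strings with a handful of operators, a signature can be held as plain data rather than code. The sketch below is a hedged illustration: the field names follow the Blaster BIND example, but the evaluator and its helper names are mine, not NetShield's actual engine.

```python
import operator
import re

# Number/string operators from the slide; match_re is handled separately.
OPS = {"==": operator.eq, ">": operator.gt, "<": operator.lt,
       ">=": operator.ge, "<=": operator.le}

def match_pdu(predicates, pdu):
    """predicates: list of (field, op, value) terms ANDed together;
    pdu: dict mapping parsed field names to their values."""
    for field, op, value in predicates:
        if field not in pdu:
            return False
        if op == "match_re":
            if not re.search(value, pdu[field]):
                return False
        elif not OPS[op](pdu[field], value):
            return False
    return True

# The BIND predicate of the Blaster signature, expressed as data.
bind_sig = [
    ("rpc_vers", "==", 5),
    ("rpc_vers_minor", "==", 1),
    ("packed_drep", "==", b"\x10\x00\x00\x00"),
    ("context[0].abstract_syntax.uuid", "==", "UUID_RemoteActivation"),
]
```

Keeping signatures as data is what makes the table formulation in the later slides possible: each (field, op) pair becomes a column that can be matched once for all rules.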
9
Research Challenges
• Matching thousands of vulnerability signatures simultaneously
  – Regex rules can be merged into a single DFA, but vulnerability signature rules cannot be easily combined
  – Go from sequential matching to matching multiple sigs. simultaneously
• Need high-speed protocol parsing
10
Outline
• Motivation and NetShield Overview
• High Speed Matching for Large Rulesets
• High Speed Parsing
• Evaluation
• Research Contributions
NetShield Overview
11
12
Matching Problem Formulation
• Suppose we have n signatures, defined on k matching dimensions (matchers)
  – A matcher is a two-tuple (field, operation), or a four-tuple for associative array elements
  – Translate the n signatures into an n-by-k table
  – This translation unlocks the potential of matching multiple signatures simultaneously

Rule 4: URI.Filename=="fp40reg.dll" && len(Headers["host"])>300

RuleID  Method ==  Filename ==    Header ==; LEN
1       DELETE     *              *
2       POST       Header.php     *
3       *          awstats.pl     *
4       *          fp40reg.dll    name=="host"; len(value)>300
5       *          *              name=="User-Agent"; len(value)>544
13
Matching Problem Formulation
• Challenges for the single PDU matching problem (SPM)
  – Large number of signatures n
  – Large number of matchers k
  – Large number of "don't cares"
  – Cannot reorder matchers arbitrarily -- buffering constraint
  – Field dependency
    • Arrays, associative arrays
    • Mutually exclusive fields
14
Matching Algorithms
Candidate Selection Algorithm
1. Pre-computation: decide the rule order and matcher order
2. Decomposition: match each matcher separately and iteratively combine the results efficiently
• Integer range checking → balanced binary search tree
• String exact matching → trie
• Regex → DFA (XFA)
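To make the integer range-checking matcher above concrete, here is a hedged Python stand-in. The slides use a balanced binary search tree; this sketch precomputes elementary intervals and binary-searches them, which is equivalent for lookup purposes. The class name and the example ruleset are illustrative.

```python
import bisect

class RangeMatcher:
    """Return the rule ids whose [lo, hi] integer range contains a value."""
    def __init__(self, ranges):  # ranges: {rule_id: (lo, hi)}
        # Elementary interval boundaries: every lo and every hi+1.
        self.points = sorted({p for lo, hi in ranges.values()
                              for p in (lo, hi + 1)})
        # The matching set is constant within each elementary interval,
        # so precompute it once per interval start.
        self.hits = [frozenset(rid for rid, (lo, hi) in ranges.items()
                               if lo <= x <= hi)
                     for x in self.points]

    def match(self, value):
        i = bisect.bisect_right(self.points, value) - 1
        return self.hits[i] if i >= 0 else frozenset()

# Rules 4 and 5 from the HTTP table: len(value) > 300 and len(value) > 544.
m = RangeMatcher({4: (301, 2**32), 5: (545, 2**32)})
```

One lookup returns the whole set of rules satisfied by the field value, which is exactly what the candidate selection step consumes.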
15
Step 1: Pre-Computation
• Optimize the matcher order based on the buffering constraint & field arrival order
• Rule reorder:

[Figure: rules 1..n reordered so that rules requiring matcher 1 come first, then rules that don't care about matcher 1 but require matcher 2, then rules that don't care about matchers 1 & 2, and so on]
16
Step 2: Iterative Matching

RuleID  Method ==  Filename ==    Header ==; LEN
1       DELETE     *              *
2       POST       Header.php     *
3       *          awstats.pl     *
4       *          fp40reg.dll    name=="host"; len(value)>300
5       *          *              name=="User-Agent"; len(value)>544

PDU = {Method=POST, Filename=fp40reg.dll, Header: name="host", len(value)=450}

S1 = {2}  (candidates after matching column 1, method==)
S2 = (S1 ∩ A2) ∪ B2 = ({2} ∩ {}) ∪ {4} = {} ∪ {4} = {4}
S3 = (S2 ∩ A3) ∪ B3 = ({4} ∩ {4}) ∪ {} = {4} ∪ {} = {4}

[Figure: merge step S_{i+1} = (S_i ∩ A_{i+1}) ∪ B_{i+1} over rules R1..R3 — a candidate in S_i that doesn't care about matcher i+1 is carried along, while one that requires matcher i+1 survives only if it is in A_{i+1}]
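The iteration above can be sketched end to end in Python. This is a simplified reading of the candidate selection merge: the helper names and the wildcard encoding (None standing for "*") are mine, and A/B are computed on the fly rather than through the precomputed per-matcher data structures the real system uses.

```python
WILDCARD = None  # stands in for '*' in the rule table

def cs_match(rules, pdu, predicates):
    """rules: {rule_id: [condition-or-None per matcher]};
    pdu: one parsed field value per matcher;
    predicates[i]: (condition, value) -> bool for matcher i."""
    def matches(rid, i):
        cond = rules[rid][i]
        return cond is not WILDCARD and predicates[i](cond, pdu[i])
    def dont_care(rid, i):
        return rules[rid][i] is WILDCARD

    # S1: rules whose first matcher is required and matches
    s = {rid for rid in rules if matches(rid, 0)}
    for i in range(1, len(pdu)):
        # A_{i+1}: current candidates that survive matcher i+1
        a = {rid for rid in s if dont_care(rid, i) or matches(rid, i)}
        # B_{i+1}: newly admitted rules that didn't care about earlier matchers
        b = {rid for rid in rules
             if matches(rid, i) and all(dont_care(rid, j) for j in range(i))}
        s = (s & a) | b  # S_{i+1} = (S_i ∩ A_{i+1}) ∪ B_{i+1}
    return s

# The five-rule HTTP table and PDU from the slide.
rules = {
    1: ["DELETE", None, None],
    2: ["POST", "Header.php", None],
    3: [None, "awstats.pl", None],
    4: [None, "fp40reg.dll", ("host", 300)],
    5: [None, None, ("User-Agent", 544)],
}
predicates = [
    lambda cond, v: cond == v,                           # Method ==
    lambda cond, v: cond == v,                           # Filename ==
    lambda cond, v: cond[0] == v[0] and v[1] > cond[1],  # Header name; LEN
]
pdu = ["POST", "fp40reg.dll", ("host", 450)]
```

Running cs_match on the slide's PDU reproduces the walk-through: only rule 4 remains a candidate after all three matchers.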
17
Complexity Analysis
• Merging complexity
  – Need k-1 merging iterations
  – For each iteration:
    • Merge complexity is O(n) in the worst case, since Si can have O(n) candidates for worst-case rulesets
    • For real-world rulesets, the number of candidates is a small constant; therefore O(1)
  – For real-world rulesets: O(k), which is the optimal we can get

Three HTTP traces: avg(|Si|) < 0.04; two WINRPC traces: avg(|Si|) < 1.5
18
Refinement and Extension
• SPM improvements
  – Allow negative conditions
  – Handle array cases
  – Handle associative array cases
  – Handle mutually exclusive cases
• Extend to multiple PDU matching (MPM)
  – Allow checkpoints
19
Outline
• Motivation
• High Speed Matching for Large Rulesets
• High Speed Parsing
• Evaluation
• Research Contributions
20
Observations

[Figure: a PDU parse tree; some internal nodes are arrays]

• Leaf nodes are numbers or strings
• Observation 1: only the fields related to signatures (mostly leaf nodes) need to be parsed
• Observation 2: traditional recursive descent parsers, which need one function call per node, are too expensive
21
Efficient Parsing with State Machines
• Studied eight protocols (HTTP, FTP, SMTP, eMule, BitTorrent, WINRPC, SNMP and DNS) as well as their vulnerability signatures
• Common relationships among leaf nodes
• Pre-construct parsing state machines based on parse trees and vulnerability signatures
• Design UltraPAC, an automated fast parser generator

[Figure: four common leaf-node relationships: (a) sequential, (b) branch, (c) loop, (d) derive, via a parsing variable]
22
Example for WINRPC
• Rectangles are states
• Parsing variables: R0 .. R4
• 0.61 instructions/byte for BIND PDU

[Figure: parsing state machine for the WINRPC header and Bind/Bind-ACK PDUs. Header fields (size in bytes): rpc_vers (1), rpc_vers_minor (1), ptype (1), pfc_flags (1), frag_length (2), packed_drep (4), plus a merged run of uninteresting bytes (merge1, 6). The machine branches on ptype (kept in R0) into Bind or Bind-ACK, with frag_length tracked via R1. Bind parsing loops over the context list (R2 ← 0, R3 ← ncontext; repeat with R2++ while R2 ≤ R3), reading per-entry fields n_tran_syn (1), ID (2), UUID (16), padding (1), UUID_ver (4), ncontext (1) and merged runs (merge2, 8; padding, 3; merge3), and skipping 20*R4 bytes as needed]
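The idea behind the state machine — decode only the signature-relevant leaf fields and step over merged runs of uninteresting bytes, with no per-node function calls — can be sketched with a flat field table. The layout below is a simplified stand-in, not the full WINRPC grammar, and the offsets and field set are illustrative only.

```python
# (name, size_in_bytes, needed): consecutive unneeded fields would be
# merged offline into a single skip entry, as in merge1/merge2/merge3.
LAYOUT = [
    ("rpc_vers",       1, True),
    ("rpc_vers_minor", 1, True),
    ("ptype",          1, True),
    ("skip1",          1, False),  # e.g. pfc_flags, unused by our rules
    ("packed_drep",    4, True),
    ("frag_length",    2, True),
]

def parse(buf):
    """Extract only the needed fields from a fixed-layout header."""
    fields, off = {}, 0
    for name, size, needed in LAYOUT:
        if needed:
            raw = buf[off:off + size]
            # keep multi-byte blobs raw; decode 1-2 byte ints little-endian
            fields[name] = raw if size > 2 else int.from_bytes(raw, "little")
        off += size  # unneeded bytes are skipped, never decoded
    return fields
```

A real generated parser would also branch on ptype and loop over variable-length arrays, but the cost model is the same: one table walk per PDU instead of one function call per parse-tree node.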
23
Outline
• Motivation
• High Speed Matching for Large Rulesets
• High Speed Parsing
• Evaluation
• Research Contributions
24
Evaluation Methodology
• 26 GB+ traces from Tsinghua Univ. (TH), Northwestern (NU) and DARPA
• Run on a P4 3.8 GHz single-core PC w/ 4 GB memory
• After TCP reassembly; the PDUs are preloaded in memory
• For HTTP we have 794 vulnerability signatures which cover 973 Snort rules
• For WINRPC we have 45 vulnerability signatures which cover 3,519 Snort rules

Fully implemented prototype
• 12,000 lines of C++ and 3,000 lines of Python
• Can run on both Linux and Windows
• Deployed at a university DC with up to 106 Mbps
25
Parsing Results

Throughput (Gbps):
Trace               TH DNS   TH WINRPC   NU WINRPC   TH HTTP   NU HTTP   DARPA HTTP
Binpac              0.31     1.41        1.11        2.10      14.2      1.69
Our parser          3.43     16.2        12.9        7.46      44.4      6.67
Speedup ratio       11.2     11.5        11.6        3.6       3.1       3.9

Max. memory per
connection (bytes)  15       15          15          14        14        14
26
Matching Results

Throughput (Gbps):
Trace                  TH WINRPC   NU WINRPC   TH HTTP   NU HTTP   DARPA HTTP
Sequential             10.68       9.23        0.34      2.37      0.28
CS Matching            14.37       10.61       2.63      17.63     1.85
Matching-only time
speedup ratio          4           1.8         11.3      11.7      8.8

Avg # of candidates    1.16        1.48        0.033     0.038     0.0023
Max. memory per
connection (bytes)     27          27          20        20        20
27
Scalability and Accuracy Results
• Created two polymorphic WINRPC exploits which bypass the original Snort rules but are detected accurately by our scheme
• Accuracy: for a 10-minute "clean" HTTP trace, Snort reported 42 alerts while NetShield reported 0; we manually verified that the 42 alerts are false positives

[Figure: rule scaling results, throughput (Gbps, 0 to 4) vs. number of rules used (0 to 800); performance decreases gracefully]
28
Research Contribution
             Regular Expression   Existing Vul. IDS   NetShield
Accuracy     Poor                 Good                Good
Speed        Good                 Poor                Good
Memory       Good                 ??                  Good
Coverage     Good                 ??                  Good

Build a better Snort alternative!
• Multiple sig. matching → candidate selection algorithm
• Parsing → parsing state machine
• Achieves high speed with much better accuracy

Make vulnerability signatures a practical solution for NIDS/NIPS
29
Q & A
Thanks!
30
Comparing With Regex
• Memory for 973 Snort rules: DFA 5.29 GB; XFA (863 rules) 1.08 MB; NetShield 2.3 MB
• Per-flow memory: XFA 36 bytes, NetShield 20 bytes
• Throughput: XFA 756 Mbps, NetShield 1.9+ Gbps
(*XFA [SIGCOMM'08][Oakland'08])
31
Measure Snort Rules
• Semi-manually classify the rules:
  1. Group by CVE-ID
  2. Manually look at each vulnerability
• Results
  – 86.7% of rules can be improved by protocol-semantic vulnerability signatures
  – Most of the remaining rules (9.9%) are related to web DHTML and scripts, which are not suitable for a signature-based approach
  – On average, 4.5 Snort rules are reduced to one vulnerability signature
  – For binary protocols the reduction ratio is much higher than for text-based ones
    • For netbios.rules the ratio is 67.6
32
Matcher Order

S_{i+1} = (S_i ∩ A_{i+1}) ∪ B_{i+1}

• |A_{i+1} ∪ B_{i+1}| is fixed, so putting the matcher later reduces B_{i+1} and thus reduces (rather than enlarges) S_{i+1}
• Merging overhead is |Si| (membership in A_{i+1} is tested with a hash table, O(1))
33
Matcher Order Optimization
• Worth buffering only if estmaxB(Mj) <= MaxB
• For each Mi in AllMatchers:
  – Try to clear all Mj in the buffer for which estmaxB(Mj) <= MaxB
  – Buffer Mi if estmaxB(Mi) > MaxB
  – When len(Buf) > Buflen, remove the Mj with minimum estmaxB(Mj)
34
35
Backup Slides
36
Experiences
• Work in progress
  – In collaboration with MSR, applying the semantics-rich analysis to cloud web service profiling, to understand why services are slow and how to improve them
• Interdisciplinary research
• Student mentoring (three undergraduates, six junior graduate students)
37
Future Work
• Near term
  – Web security (browser security, web server security)
  – Data center security
  – High speed network intrusion prevention systems with hardware support
• Long term research interests
  – Combating professional profit-driven attackers will be a continuous arms race
  – Online applications (including Web 2.0 applications) are becoming more complex and vulnerable
  – Network speed keeps increasing, which demands highly scalable approaches
38
Research Contributions
• Demonstrate vulnerability signatures can be applied to NIDS/NIPS, which can significantly improve the accuracy of current NIDS/NIPS
• Propose the candidate selection algorithm for matching a large number of vulnerability signatures efficiently
• Propose parsing state machine for fast protocol parsing
• Implement the NetShield prototype
39
Motivation
• Network security has been recognized as the single most important attribute of a network, according to a survey of 395 senior executives conducted by AT&T
• Many new emerging threats make the situation even worse
40
Candidate Merge Operation

S_{i+1} = (S_i ∩ A_{i+1}) ∪ B_{i+1}

[Figure: a candidate in Si that doesn't care about matcher i+1 is carried into S_{i+1}; a candidate that requires matcher i+1 survives only if it is in A_{i+1}]
41
A Vulnerability Signature Example
• Data representations
  – For all the vulnerability signatures we studied, we only need numbers and strings
  – Number operators: ==, >, <, >=, <=
  – String operators: ==, match_re(.,.), len(.)
• Example signature for the Blaster worm:
  BIND: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && context[0].abstract_syntax.uuid==UUID_RemoteActivation
  BIND-ACK: rpc_vers==5 && rpc_vers_minor==1
  CALL: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && stub.RemoteActivationBody.actual_length>=40 && matchRE(stub.buffer, /^\x5c\x00\x5c\x00/)
42
System Framework
[Figure: overall system framework. On the critical path (data path), streaming packet data passes through Part I, sketch-based monitoring & detection (reversible k-ary sketch monitoring and sketch-based statistical anomaly detection, SSAD; local sketch records are sent out for aggregation, and remote aggregated sketch records feed back in), built for scalability, and Part III, signature matching engines (content-based signature matching and protocol semantic signature matching), built for accuracy, scalability & coverage. On the non-critical path (control path), traffic to unused IP blocks reaches honeynets/honeyfarms, which drive Part II, polymorphic worm signature generation (token-based signature generation, TOSG, and length-based signature generation, LESG), built for accuracy & fast adaptation, and Part IV, network situational awareness]
43
Example of Vulnerability Signatures
• At least 75% of vulnerabilities are due to buffer overflows
• Sample vulnerability signature: the length of the field corresponding to the vulnerable buffer exceeds a certain threshold
• Intrinsic to the buffer overflow vulnerability and hard to evade

[Figure: a field in a protocol message overflows the vulnerable buffer it is copied into]
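Once the field feeding the vulnerable buffer has been parsed, the length-based check above is a one-line predicate. A minimal sketch, with a made-up field name and threshold:

```python
def overflow_check(pdu, field="filename", threshold=256):
    """Flag a PDU whose field feeding a vulnerable buffer is overlong.
    The field name and 256-byte threshold are illustrative only."""
    return len(pdu.get(field, "")) > threshold
```

Because the check keys off the vulnerability condition (buffer size) rather than exploit bytes, padding or re-encoding the payload does not evade it.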