Haigh & Harp, Learning from Limited Experience
Improving Self-Defense by Learning from Limited Experience
Karen Haigh (BBN Technologies)
Steven Harp (Adventium Labs)
Overview
• Goal: Systems that autonomously improve their defenses with experience.
• Several ways to do this...
• Examples discussed:
– Learning to recognize anomalies
– Self-immunizing against observed exploits
– Acquiring multistage attack concepts
– Learning effective responses
Learning in Cyber Security
• What is (machine) learning?
– Automatically using prior experience to improve performance over time
• Problems addressable by learning?
– Detection: distinguish problem from non-problem
– Immunity:
  • Good: “an exploit should succeed at most once”
  • Better: “a vulnerability should be exploitable at most once”
– Response: how best to actively counter an attack?
Long Term Goal: Cognitive Immunity
Opportunities & Techniques
Passive Observation
– Detecting attacks: Relatively well explored. Example: anomaly detection. Work remains to be done to detect attacks extended over multiple hosts and steps.
– Responding to attacks: Not well explored. CSISM innovation: Situation-dependent utility of responses.
Experiment (e.g. in sandbox / taster / laboratory)
– Detecting attacks: Cortex innovation: Use experiments to generalize from instances of attacks to classes of attacks. CSISM innovation: Use experiments to identify necessary & sufficient elements of multi-step attacks.
– Responding to attacks: Not well explored. CSISM innovation: Variations on responses.
Modelling Defended Systems
• Expert Rules
• Offline Learning
• Online Learning
• Experimental Sandbox
Expert Heuristics
+ Good data
- Complex environment
- Dynamic system
Offline Training
+ Good data
+ Complex environment
- Dynamic system
Online Training
- Unknown data
+ Complex environment
+ Dynamic system
Experimental Sandbox
+ Good data (self-labeled)
+ Complex environment
+ Dynamic system
Very hard for adversary to “train” the learner!!!
Complex Domain: Human Rules are Incomplete
[Figure: Registration Time by Quad, DPASA (DARPA OASIS)]
Quads 0 & 1 are slower than Quads 2 & 3.
Complex domain: human calibration (incorrectly) claimed that Quad 1 was slowest, missing Quad 0
Complex Domain (2)
[Figure: Registration Time by Client Type, DPASA (DARPA OASIS)]
caf_plan, chem_haz and maf_plan are slower than other clients
Complex domain: human calibration (incorrectly) claimed that caf_plan & maf_plan were slowest because of hand-typed password, missing chem_haz
Learning for Calibration
• Calibrate the parameters of rules for normal operating conditions
– Important first step because it learns how to respond to normal conditions
– For example: learn timing parameters for rapid response controller, e.g.
  • Client Registration, PSQ server local probes, SELinux enforcement, SELinux flapping, File integrity checks
– Need to handle multi-modal data
CSISM / BBN
Results for all Registration times
[Figure annotation: Beta=0.0005]
These two “shoulder” points indicate upper and lower limits.
As more observations are collected, the estimates become more confident of the range of expected values (i.e. tighter estimates to observations)
Algorithm of Last & Kandel, 2001
CSISM / BBN
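The calibration idea can be illustrated with a minimal sketch. It assumes a plain empirical-quantile estimate, not the Last & Kandel algorithm cited on the slide, and it handles multi-modal data simply by keeping separate limits per group (for example per quad or per client type). The function names, the `tail` parameter and the sample timings are hypothetical.

```python
from collections import defaultdict


def estimate_limits(samples, tail=0.05):
    """Return (lower, upper) limits taken from the empirical quantiles."""
    if not samples:
        raise ValueError("need at least one observation")
    ordered = sorted(samples)
    n = len(ordered)
    lower = ordered[int(tail * (n - 1))]
    upper = ordered[int(round((1 - tail) * (n - 1)))]
    return lower, upper


def calibrate_by_group(observations, tail=0.05):
    """observations: iterable of (group, seconds) pairs, e.g. ("quad0", 12.3)."""
    groups = defaultdict(list)
    for group, seconds in observations:
        groups[group].append(seconds)
    return {g: estimate_limits(v, tail) for g, v in groups.items()}


def is_anomalous(limits, group, seconds):
    """True if a timing falls outside the learned limits for its group."""
    lower, upper = limits[group]
    return not (lower <= seconds <= upper)


# Hypothetical registration times (seconds) for a slow and a fast quad.
observations = [("quad0", t) for t in (10, 11, 12, 13, 14)]
observations += [("quad2", t) for t in (1, 2, 3, 4, 5)]
limits = calibrate_by_group(observations, tail=0.0)
print(limits["quad0"], limits["quad2"])
print(is_anomalous(limits, "quad2", 12), is_anomalous(limits, "quad0", 12))
```

Keeping one pair of limits per group means a 12-second registration is normal for a slow quad but flagged for a fast one, which a single global threshold would miss.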
Generalization of Attack Signatures
Cortex Project
Generalization
• Goal: Learn a most general concept from instances of attacks and block all similar attacks against the vulnerability.
Dealing with Zero-day attacks...
• Payload Analysis Challenges
– How to automatically recognize which element(s) of an attack are essential?
– How to generalize them to their boundary conditions?
  • avoid the fragility of simple pattern matching rules
• Approach: Experimentation
– Validation of attack concepts gives 0 false positives
Cortex / Honeywell
Generalization by Experimentation
Model contains axes of vulnerability
• Payload content
– Binary machine instructions
– Unusual payload (e.g. unix commands, registry keys, database administrative commands)
– Length (# bytes/terms)
• Resource consumption patterns
• Probing (e.g. password guessing)
• Session-wide (multiple queries)
Experiment
1) Score suspicious elements
2) Replace with innocuous or generalized values
3) Validate in tester
[Diagram labels: Model of normal traffic; Taste Tester; normal / attack; Blocking Rules]
Cortex / Honeywell
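The three experiment steps can be sketched as a loop that swaps one element of a captured attack at a time for an innocuous value and asks a tester whether the attack still succeeds. This is an illustrative sketch only: the function name, the element names and the stub tester are hypothetical, and the scoring step is reduced to trying every element in turn.

```python
def find_essential_elements(attack, innocuous, attack_succeeds):
    """Return the names of elements the attack cannot do without.

    attack:          dict of element name -> value seen in the attack
    innocuous:       dict of element name -> harmless replacement value
    attack_succeeds: callable(dict) -> bool, standing in for the tester
    """
    essential = []
    for name in attack:
        variant = dict(attack)            # copy, then neutralize one element
        variant[name] = innocuous[name]
        if not attack_succeeds(variant):  # attack broke, so element mattered
            essential.append(name)
    return essential


# Stub tester (assumption): the target fails on passwords over 64 bytes.
def stub_tester(request):
    return len(request["password"]) > 64


attack = {"user": "admin", "password": "A" * 200, "db": "test"}
innocuous = {"user": "guest", "password": "pw", "db": "test"}
print(find_essential_elements(attack, innocuous, stub_tester))
```

Replacing one element at a time is the simplest design; it would miss attacks that only fail when two elements are neutralized together, which is where testing combinations would come in.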
Cortex Demo Architecture and Use Cases
[Architecture diagram. Use case labels: Normal Query, Mission Planning]
• Proxy (Dexter): Block known bad queries, Taste test, Log results
• Replicator: Create tasters, Delete tasters, Switch Tasters, Replicate queries, Heartbeat Status
• RTS: Replicate, Switch Tasters, Rebuild Tasters, Send to Learning
• Learner: Read Training Data, Experiment, Generate Rules
• Master DB: Query
• Tasters
• Other labels: AMP, CSM, Once per phase
Cortex / Honeywell
Cortex Demo Architecture and Use Cases
[Architecture diagram with the same components as on the previous slide. Use case labels: Attack gets through, Attack is blocked]
Cortex / Honeywell
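The proxy's three labelled duties (block known bad queries, taste test, log results) can be sketched as one request handler. The control flow is an assumption read off the diagram labels, and the class, its parameters and the stub taster are hypothetical, not the Cortex implementation.

```python
class Proxy:
    """Filters queries before they reach the master database."""

    def __init__(self, blocking_rules, taster, master, learner_queue):
        self.blocking_rules = list(blocking_rules)  # callables: query -> bool
        self.taster = taster                # callable: True if taster survived
        self.master = master                # callable: runs the real query
        self.learner_queue = learner_queue  # failed queries sent to learning
        self.log = []

    def handle(self, query):
        if any(rule(query) for rule in self.blocking_rules):
            self.log.append(("blocked", query))
            return None
        if not self.taster(query):          # taster died: do not forward
            self.log.append(("failed_taste_test", query))
            self.learner_queue.append(query)
            return None
        self.log.append(("forwarded", query))
        return self.master(query)


# Stub components (assumptions): long queries crash the taster.
queue = []
proxy = Proxy([], lambda q: len(q) < 50, lambda q: "ok:" + q, queue)
print(proxy.handle("SELECT 1"))          # normal query reaches the master
print(proxy.handle("X" * 100))           # attack caught by the taste test
proxy.blocking_rules.append(lambda q: len(q) >= 50)  # rule from the learner
print(proxy.handle("X" * 100), proxy.log[-1][0])
```

Checking the learned rules before the taste test means a known attack never costs a taster rebuild.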
Example Results: MySQL
Attacks and notes:
• String buffer overflow (password): Correctly generalized single attack to number of valid bytes.
• Integer overflow: Correctly generalized single attack to 0x7FFF max value
• MySQL DOS attack: Noted that hex bytes were suspicious, so generalized bytes and correctly blocked integer overflow!
Project was tested with a red-team model
Cortex / Honeywell
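Generalizing a single attack to a boundary value, such as the 0x7FFF maximum above, can be illustrated as a bisection between a known-safe value and the observed attack value. This sketch assumes the failure is monotone in the parameter and uses a stub in place of a real tester; the function name and sample values are hypothetical.

```python
def find_boundary(safe, failing, attack_succeeds):
    """Smallest value in (safe, failing] for which the attack succeeds.

    Assumes attack_succeeds is monotone: once a value triggers the
    failure, every larger value triggers it too.
    """
    if attack_succeeds(safe) or not attack_succeeds(failing):
        raise ValueError("need a safe lower value and a failing upper value")
    while failing - safe > 1:
        mid = (safe + failing) // 2
        if attack_succeeds(mid):
            failing = mid   # boundary is at mid or below
        else:
            safe = mid      # boundary is above mid
    return failing


# Stub tester (assumption): the target overflows above 0x7FFF.
boundary = find_boundary(0, 100000, lambda n: n > 0x7FFF)
print(hex(boundary - 1))  # largest value that is still safe
```

Each probe costs one tester run, so the number of experiments grows with the logarithm of the range rather than with its size.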
Identification of Multistage Attacks
CSISM Project
MultiStage Attacks: Challenges
• Detect and generalize multi-step attacks across time and space.
– Multistage attacks involve a sequence of actions that span multiple hosts and take multiple steps to succeed.
• Challenges:
– Which observations are necessary & sufficient?
  • Incidental observations that are either
    – side effects of normal operations, or
    – chaff explicitly added by an attacker to divert the defender.
  • Concealment (e.g. to remove evidence)
  • Probabilistic actions (e.g. to improve probability of attack success)
– What are the most reliable observations?
– What are the parameter boundaries?
• Approach: Experimentation
– Allows validation of pruning
CSISM / BBN
Architectural Schema
[Diagram labels]
• CSISM Sensors (ILC, IDS)
• Observations ending in failure of protected system. Only some are essential.
• Attack Theory Experimenter
• “Sandbox”
• Viable Attack Theories
• Defense Measures Experimenter
• Viable Defense Strategies and Detection Rules
• Observations and Actions (example sequences: 1 2 3 4 5 6; 1 2 3 4 6; A B C; A B D)
CSISM / BBN
Multi-Stage Learner
• Do {
– Generate Theory according to heuristic (the hard part!)
  • Complete set of theories is Permutations( Powerset( observations ))
– Test Theory
– Incrementally update controller rulebases
• } while Theories remain
• For only 10 observations, there are nearly 10,000,000 possible theories (9,864,101 ordered subsets, not including variations on steps!)
CSISM / BBN
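The size of Permutations( Powerset( observations )) is the number of ordered subsets, the sum over k of n!/(n-k)!. For 10 observations this is 9,864,101 (counting the empty theory), on the order of the ten million quoted on the slide. The sketch below computes that count and runs the do/while loop over a lazily generated, shortest-first stream of theories. The function names and the stub tester are hypothetical.

```python
from itertools import permutations
from math import perm


def count_theories(n):
    """Number of ordered subsets (including the empty one) of n observations."""
    return sum(perm(n, k) for k in range(n + 1))


def theories_shortest_first(observations):
    """Lazily yield every non-empty ordered subset, shortest first."""
    for k in range(1, len(observations) + 1):
        yield from permutations(observations, k)


def learn(observations, test_theory):
    """Test theories until none remain and collect the viable ones."""
    viable = []
    for theory in theories_shortest_first(observations):
        if test_theory(theory):
            viable.append(theory)  # stands in for updating the rulebases
    return viable


print(count_theories(10))  # 9864101


# Stub tester (assumption): the attack needs step 1 before step 3.
def needs_1_then_3(theory):
    return 1 in theory and 3 in theory and theory.index(1) < theory.index(3)


print(learn([1, 2, 3], needs_1_then_3)[0])  # shortest viable theory: (1, 3)
```

The generator never materializes the full set, which matters because every theory that is actually tested costs a sandbox experiment.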
Hypothesis Generation
• Query learner generates attack hypotheses
– in heuristic order to acquire the concept rapidly
• Candidate Heuristics
– Look for shorter attacks first (adjustable prior)
– Suspect order of steps has an influence
– Suspect steps to interact positively (for the attacker)
– Prefer hypotheses with less common / more suspicious elements
CSISM / BBN
Project was tested with a red-team model
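Two of the candidate heuristics, shorter attacks first and a preference for less common elements, can be combined into a single score used to rank hypotheses. This is a sketch under assumptions: the additive scoring formula, the `frequency` table (how often each observation occurs in normal operation) and the observation names are all hypothetical, and the order and interaction heuristics are not modelled.

```python
from itertools import permutations


def score(theory, frequency, length_penalty=1.0):
    """Lower is better: short theories built from rare observations."""
    commonness = sum(frequency.get(step, 0.0) for step in theory)
    return length_penalty * len(theory) + commonness


def ranked_hypotheses(observations, frequency, max_len=3):
    """All ordered subsets up to max_len, best-scoring first."""
    candidates = [
        theory
        for k in range(1, max_len + 1)
        for theory in permutations(observations, k)
    ]
    return sorted(candidates, key=lambda t: score(t, frequency))


# Hypothetical observations with their frequency in normal traffic.
frequency = {"login": 0.9, "odd_jar_write": 0.01, "ping": 0.8}
ranking = ranked_hypotheses(list(frequency), frequency)
print(ranking[:3])
```

Raising `length_penalty` plays the role of the adjustable prior on short attacks: the larger it is, the later multi-step theories are tried.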
Response Learning
CSISM Project
Situation-dependent Action Utilities
• Learn tradeoffs among potential responses; when the context changes, the appropriateness of responses changes
– Context includes descriptions of users, attack elements, system performance, etc.
– Benefit is effectiveness of defense action
– Cost includes effort to mount response and impact on availability
• Challenges:
– Measuring the effect of responses is hard:
  • Complex domain, rarely identical situations, non-deterministic actions/effects
• Approach: Experimentation
– System “snapshots” get close to identical conditions
CSISM / BBN
Response Learning: Results Pending
• Bias toward results that worked in similar situations in the past
– Hybrid Reinforcement learning and Nearest-Neighbour approaches
• Given a set of hypotheses about the locus of an attack
– Search for true locus:
  • Hierarchical based on system architecture
  • Bias by historical attack patterns
– Select response based on similarity match to prior attacks:
  • Same response when quality was high
  • Alternate response when quality was low
Project will be tested by a red-team on 20 May 2008. Goal is to demonstrate “better” responses over time.
CSISM / BBN
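The similarity-match rule (same response when quality was high, alternate response when quality was low) can be sketched with a nearest-neighbour lookup over past episodes. This covers only the nearest-neighbour half of the hybrid and leaves out reinforcement learning. The context vectors, the quality threshold and the function names are hypothetical.

```python
import math


def nearest(history, context):
    """Past episode whose context vector is closest (Euclidean distance)."""
    return min(history, key=lambda ep: math.dist(ep["context"], context))


def choose_response(history, context, responses, quality_threshold=0.5):
    """Reuse the response of the most similar episode if it worked well."""
    if not history:
        return responses[0]                 # no experience yet: default
    episode = nearest(history, context)
    if episode["quality"] >= quality_threshold:
        return episode["response"]          # same response: quality was high
    alternatives = [r for r in responses if r != episode["response"]]
    return alternatives[0] if alternatives else episode["response"]


# Hypothetical history: two-feature contexts with observed response quality.
history = [
    {"context": (0.0, 1.0), "response": "restart_server", "quality": 0.9},
    {"context": (1.0, 0.0), "response": "block_source", "quality": 0.2},
]
responses = ["block_source", "restart_server", "block_query_type"]
print(choose_response(history, (0.1, 0.9), responses))  # reuse what worked
print(choose_response(history, (0.9, 0.1), responses))  # switch away
```

Recording the outcome of each chosen response back into `history` is what would let the choices improve over time.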
Conclusion
Learning Benefits
• Learning can improve the defensive posture
– better knowledge (about the attacks or attacker), better policies
• Learning can improve how the system responds to symptoms
– better connection between response actions and their triggers
• Active Learning
– A mechanism for recognizing Zero-day attacks
– No false positives: only validated attacks are added
• Learning techniques are enablers for the next level of enhancements in adaptive defense
Adaptation is the key to survival
From Proof-of-Concept to Production
Generalization
– Demonstrated: Able to generalize instances to classes.
– Future directions: More axes of vulnerability; More handling of joint probabilities; More domains; Meta learning to induce new axes
Multi-stage attack
– Demonstrated: Able to identify Chaff
– Future directions: Probabilistic actions; Concealment; Model of normal; Generalization
Responses
– Demonstrated: Able to map context to response
– Future directions: Richer context, richer responses; Automatic measurement of benefit; Scalable “snapshots”
Backup
Multistage Attacks
• Detect and then generalize multi-step attacks across time and space.
• Multistage attacks involve a sequence of actions that span multiple hosts and take multiple steps to succeed.
– A sequence of actions with causal relationships.
– An action A must occur to set up the initial conditions for action B. Action B would have no effect without previously executing action A.
– For example
  1. gain ability to execute commands on Box1 as unprivileged user by exploiting a buffer overflow in Service1
  2. gain root shell by running an exploit of a race condition
  3. disable protection mechanism, e.g. SELinux
  4. replace dpasa jar with attacker jar code
  5. run attacker code that sends bad refs to Box2, Box3, Box4.
Walk-Away-Message
Attacks (MySQL DoS-1)
• mysql-com_table-dump-memory-corruption
– Malformed request leaves MySQL unstable
• Countermeasures:
– Block the malformed com_table_dump command using learned pattern and proxy filter rules.
– Restart the server
– Block all requests from the offending sources
Attacks (MySQL DoS-2)
• mysql-password-handler-buffer-overflow
– Excessive password length can crash server
• Countermeasures:
– Block connections which proffer “abnormal” passwords (learned response or statistical anomaly).
– Restart the server.
– Block all requests from the offending sources.
Attacks (MySQL DoS-3)
• mysql-remote-fulltext-search-DoS
– Malformed request crashes server
• Countermeasures:
– Detect and block malformed queries
– Block all queries of this type (fulltext-search)
– Block all requests from the offending sources.
– Restart the server