
Haigh & Harp, Learning from Limited Experience

Improving Self-Defense by Learning from Limited Experience

Karen Haigh (BBN Technologies), Steven Harp (Adventium Labs)


Overview

• Goal: Systems that autonomously improve their defenses with experience.

• Several ways to do this...

• Examples discussed:
– Learning to recognize anomalies
– Self-immunizing against observed exploits
– Acquiring multistage attack concepts
– Learning effective responses


Learning in Cyber Security

• What is (machine) learning?
– Automatically using prior experience to improve performance over time

• Problems addressable by learning?
– Detection: distinguish problem from non-problem
– Immunity:
  • Good: “an exploit should succeed at most once”
  • Better: “a vulnerability should be exploitable at most once”
– Response: how best to actively counter an attack?

Long Term Goal: Cognitive Immunity


Opportunities & Techniques

Passive Observation
• Detecting attacks: Relatively well explored. Example: anomaly detection. Work remains to be done to detect attacks extended over multiple hosts and steps.
• Responding to attacks: Not well explored. CSISM innovation: Situation-dependent utility of responses.

Experiment (e.g. in sandbox / taster / laboratory)
• Detecting attacks:
– Cortex innovation: Use experiments to generalize from instances of attacks to classes of attacks.
– CSISM innovation: Use experiments to identify necessary & sufficient elements of multi-step attacks.
• Responding to attacks: Not well explored. CSISM innovation: Variations on responses.


Modelling Defended Systems

• Expert Rules
• Offline Learning
• Online Learning
• Experimental Sandbox

Expert Heuristics: + Good data, - Complex environment, - Dynamic system

Offline Training: + Good data, + Complex environment, - Dynamic system

Online Training: - Unknown data, + Complex environment, + Dynamic system

Experimental Sandbox: + Good data (self-labeled), + Complex environment, + Dynamic system

Very hard for adversary to “train” the learner!


Complex Domain: Human Rules are Incomplete

[Figure: Registration Time by Quad, DPASA (DARPA OASIS)]

• Quads 0 & 1 are slower than Quads 2 & 3.
• Complex domain: human calibration (incorrectly) claimed that Quad 1 was slowest, missing Quad 0.


Complex Domain (2)

[Figure: Registration Time by Client Type, DPASA (DARPA OASIS)]

• caf_plan, chem_haz and maf_plan are slower than other clients.
• Complex domain: human calibration (incorrectly) claimed that caf_plan & maf_plan were slowest because of hand-typed password, missing chem_haz.


Learning for Calibration

• Calibrate the parameters of rules for normal operating conditions
– Important first step because it learns how to respond to normal conditions
– For example: learn timing parameters for rapid response controller, e.g.
  • Client Registration, PSQ server local probes, SELinux enforcement, SELinux flapping, File integrity checks
– Need to handle multi-modal data

CSISM / BBN
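A minimal sketch of how timing limits could be calibrated from observations when the data is multi-modal. This is not the algorithm used in CSISM; it swaps in a simple gap-based split of the sorted values, and the function names, `gap_factor`, `margin`, and the sample timings are all illustrative assumptions.

```python
from statistics import median

def calibrate_modes(samples, gap_factor=5.0, margin=0.1):
    """Split 1-D timings into modes at unusually large gaps and return
    one (lower, upper) limit per mode, widened by `margin` of its span."""
    xs = sorted(samples)
    if len(xs) < 2:
        return [(x, x) for x in xs]
    gaps = [b - a for a, b in zip(xs, xs[1:])]
    threshold = gap_factor * median(gaps)  # "unusually large" = several typical gaps
    modes, current = [], [xs[0]]
    for prev, x in zip(xs, xs[1:]):
        if x - prev > threshold:           # large gap: start a new mode
            modes.append(current)
            current = []
        current.append(x)
    modes.append(current)
    limits = []
    for mode in modes:
        lo, hi = mode[0], mode[-1]
        pad = margin * (hi - lo)
        limits.append((lo - pad, hi + pad))
    return limits

def is_normal(value, limits):
    """A timing is normal if it falls inside the limits of any mode."""
    return any(lo <= value <= hi for lo, hi in limits)

# Hypothetical registration times (seconds) with a fast and a slow group.
timings = [1.0, 1.1, 1.2, 1.3, 9.0, 9.1, 9.2, 9.3]
limits = calibrate_modes(timings)
print(limits)
print(is_normal(1.15, limits), is_normal(5.0, limits))
```

Keeping one pair of limits per mode means a value between the two groups is flagged, which a single global minimum and maximum would miss.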


Results for all Registration times

[Figure: Results for all Registration times; Beta=0.0005; algorithm of Last & Kandel, 2001]

• These two “shoulder” points indicate upper and lower limits.
• As more observations are collected, the estimates become more confident of the range of expected values (i.e. tighter estimates to observations).

CSISM / BBN


Generalization of Attack Signatures

Cortex Project


Generalization

• Goal: Learn a most general concept from instances of attacks and block all similar attacks against the vulnerability.
– Dealing with Zero-day attacks...

• Payload Analysis Challenges
– How to automatically recognize which element(s) of an attack are essential?
– How to generalize them to their boundary conditions?
  • avoid the fragility of simple pattern matching rules

• Approach: Experimentation
– Validation of attack concepts: 0 false positives

Cortex / Honeywell


Generalization by Experimentation

Model contains axes of vulnerability:
• Payload content
– Binary machine instructions
– Unusual payload (e.g. unix commands, registry keys, database administrative commands)
– Length (# bytes/terms)
• Resource consumption patterns
• Probing (e.g. password guessing)
• Session-wide (multiple queries)

Experiment
1) Score suspicious elements
2) Replace with innocuous or generalized values
3) Validate in tester

[Diagram labels: Taste Tester, Model of normal traffic, normal / attack traffic, Blocking Rules]

Cortex / Honeywell


Cortex Demo Architecture and Use Cases

[Architecture diagram. Components and the functions listed next to them:]
• Tasters
• Replicator: delete tasters, create tasters, switch tasters, replicate queries, heartbeat status
• RTS: replicate, switch tasters, rebuild tasters, send to learning
• AMP, CSM
• Proxy (Dexter): block known bad queries, taste test, log results
• Master DB: query
• Learner: read training data, experiment, generate rules
• Annotation: once per phase

Use case: normal query (mission planning)

Cortex / Honeywell
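A rough sketch of the proxy's role as labelled in the diagram: block known bad queries, taste test the rest, log the results, and only then query the master DB. The class, its method names, the string queries, and the ordering of the steps are assumptions for illustration; the slides do not show the actual Dexter interfaces.

```python
class Proxy:
    """Toy proxy that guards a master database with rules and a taster."""

    def __init__(self, taster, master):
        self.rules = []       # blocking predicates, e.g. generated by a learner
        self.log = []         # (query, verdict) pairs
        self.taster = taster  # returns True if the taster survived the query
        self.master = master  # forwards the query to the real database

    def handle(self, query):
        if any(rule(query) for rule in self.rules):
            self.log.append((query, "blocked"))
            return None
        if not self.taster(query):
            # The taster was damaged: never forward this query to the master.
            self.log.append((query, "failed taste test"))
            return None
        self.log.append((query, "passed"))
        return self.master(query)

# Hypothetical stand-ins for a taster replica and the master database.
proxy = Proxy(taster=lambda q: "CRASH" not in q,
              master=lambda q: f"result of {q}")

print(proxy.handle("SELECT plan"))   # forwarded to the master
print(proxy.handle("CRASH me"))      # stopped by the taste test
proxy.rules.append(lambda q: q.startswith("CRASH"))  # rule learned afterwards
print(proxy.handle("CRASH again"))   # now blocked before tasting
print(proxy.log)
```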


Cortex Demo Architecture and Use Cases

[Architecture diagram with the same components: Tasters, Replicator, RTS, AMP, CSM, Proxy (Dexter), Master DB, Learner]

Use cases: attack gets through; attack is blocked

Cortex / Honeywell


Example Results: MySQL

Attacks and notes:
• MySQL DOS attack: Noted that hex bytes were suspicious, so generalized bytes and correctly blocked integer overflow!
• Integer overflow: Correctly generalized single attack to 0x7FFF max value
• String buffer overflow (password): Correctly generalized single attack to number of valid bytes.

Project was tested with a red-team model

Cortex / Honeywell
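One way to generalize a single observed attack to its boundary condition, as in the password overflow result above, is to search for the smallest value that still triggers the failure. The sketch below uses binary search and assumes the failure is monotonic in length; `crashes` stands in for a taster experiment and the threshold of 16 bytes is invented for the example.

```python
def smallest_failing_length(crashes, known_bad_length):
    """Binary-search the smallest length for which crashes(length) is True.

    Assumes monotonic failure: if one length crashes the service, every
    longer length does too. `known_bad_length` comes from the observed attack.
    """
    lo, hi = 0, known_bad_length      # invariant: lo is safe, hi crashes
    if crashes(lo):
        return lo
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if crashes(mid):
            hi = mid                  # boundary is at or below mid
        else:
            lo = mid                  # boundary is above mid
    return hi

# Hypothetical service that tolerates passwords of up to 16 bytes.
toy_crashes = lambda n: n > 16

boundary = smallest_failing_length(toy_crashes, known_bad_length=64)
print(f"block passwords of length >= {boundary}")
```

Each probe is one experiment in the taster, so the boundary is found in a logarithmic number of trials rather than one per candidate length.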


Identification of Multistage Attacks

CSISM Project


MultiStage Attacks: Challenges

• Detect and generalize multi-step attacks across time and space.
– Multistage attacks involve a sequence of actions that span multiple hosts and take multiple steps to succeed.

• Challenges:
– Which observations are necessary & sufficient?
  • Incidental observations that are either
    – side effects of normal operations, or
    – chaff explicitly added by an attacker to divert the defender.
  • Concealment (e.g. to remove evidence)
  • Probabilistic actions (e.g. to improve probability of attack success)
– What are the most reliable observations?
– What are the parameter boundaries?

• Approach: Experimentation
– Allows validation of pruning

CSISM / BBN
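Validating pruning by experiment can be sketched as a replay loop: drop one observed step, replay the rest in a sandbox, and keep the step out if the attack still succeeds. This is a simplified illustration, not the CSISM learner; `toy_sandbox`, the step names, and the assumption that step names are unique are all made up for the example.

```python
def prune_observations(observed, sandbox):
    """Greedily remove steps whose absence does not stop the attack.

    `sandbox(steps)` returns True if replaying `steps` still ends in
    failure of the protected system. Step names are assumed unique.
    """
    theory = list(observed)
    for step in observed:
        trial = [s for s in theory if s != step]
        if sandbox(trial):   # attack works without it: incidental or chaff
            theory = trial
    return theory

def toy_sandbox(steps):
    # Hypothetical attack: succeeds if these three steps occur in this order.
    remaining = iter(steps)
    # `req in remaining` advances the iterator, so order is enforced.
    return all(req in remaining for req in ("exploit", "escalate", "replace_jar"))

observed = ["port_scan", "exploit", "list_dir", "escalate", "replace_jar", "ping"]
print(prune_observations(observed, toy_sandbox))
```

Every accepted removal has been confirmed by a successful replay, so the surviving sequence is sufficient by construction and each remaining step was individually shown to be necessary.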


Architectural Schema

[Diagram. Recoverable labels:]
• CSISM Sensors (ILC, IDS)
• Observations (1 2 3 4 5 6) ending in failure of protected system. Only some are essential.
• Attack Theory Experimenter, using a “Sandbox” to test candidate sequences of observations and actions (e.g. A B C; A B D; A C; B C)
• Viable Attack Theories
• Defense Measures Experimenter
• Viable Defense Strategies and Detection Rules

CSISM / BBN


Multi-Stage Learner

• Do {
– Generate Theory according to heuristic (the hard part!)
  • Complete set of theories is Permutations( Powerset( observations ))
– Test Theory
– Incrementally update controller rulebases
• } while Theories remain

• For only 10 observations, there are nearly 10,000,000 possible theories (not including variations on steps!)

CSISM / BBN
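To make the size of the theory space concrete, the sketch below counts and enumerates Permutations(Powerset(observations)), read here as all ordered subsets of the observations. Under that reading, 10 observations give 9,864,101 theories, which is the order of magnitude quoted on the slide. The `learn` loop and its `test_theory` callback are illustrative stand-ins for the generate/test/update cycle, not the CSISM code.

```python
from itertools import permutations
from math import factorial

def count_theories(n):
    """Number of ordered subsets (including the empty one) of n observations."""
    return sum(factorial(n) // factorial(n - k) for k in range(n + 1))

def theories(observations):
    """Yield every non-empty ordered subset, shortest first."""
    for k in range(1, len(observations) + 1):
        yield from permutations(observations, k)

def learn(observations, test_theory):
    """Brute-force version of the loop: test every theory, keep the viable ones."""
    viable = []
    for theory in theories(observations):
        if test_theory(theory):
            viable.append(theory)  # stand-in for updating the controller rulebases
    return viable

print(count_theories(10))

# Hypothetical test: a theory is viable if it contains "A" before "B".
viable = learn("ABC", lambda t: "A" in t and "B" in t and t.index("A") < t.index("B"))
print(viable)
```

Exhaustive enumeration is only feasible for a handful of observations, which is why the order in which theories are generated matters so much.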


Hypothesis Generation

• Query learner generates attack hypotheses
– in heuristic order to acquire the concept rapidly

• Candidate Heuristics
– Look for shorter attacks first (adjustable prior)
– Suspect order of steps has an influence
– Suspect steps to interact positively (for the attacker)
– Prefer hypotheses with less common / more suspicious elements

CSISM / BBN

Project was tested with a red-team model
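Two of the candidate heuristics (shorter attacks first, and a preference for less common elements) can be combined into a single ordering, sketched below. The cost function, the `length_weight` parameter, and the `frequency` table of how often each step appears in normal operation are assumptions chosen for illustration; the slides do not specify how the heuristics are weighted.

```python
from itertools import permutations

def ranked_hypotheses(observations, frequency, length_weight=1.0):
    """Return all ordered subsets of `observations`, most promising first.

    Cost = length_weight * (number of steps) + sum of step frequencies,
    so short hypotheses built from rare (suspicious) steps come first.
    """
    candidates = []
    for k in range(1, len(observations) + 1):
        for hyp in permutations(observations, k):
            cost = length_weight * k + sum(frequency.get(s, 0.0) for s in hyp)
            candidates.append((cost, hyp))
    candidates.sort()
    return [hyp for _, hyp in candidates]

# Hypothetical observations with their frequency in normal traffic.
frequency = {"login": 0.9, "chmod": 0.2, "jar_swap": 0.01}
ranked = ranked_hypotheses(["login", "chmod", "jar_swap"], frequency)
print(ranked[:4])
```

Raising or lowering `length_weight` plays the role of the adjustable prior on attack length: a small weight lets a longer hypothesis of rare steps overtake a shorter one of common steps.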


Response Learning

CSISM Project


Situation-dependent Action Utilities

• Learn tradeoffs among potential responses; when the context changes, the appropriateness of responses changes
– Context includes descriptions of users, attack elements, system performance, etc.
– Benefit is effectiveness of defense action
– Cost includes effort to mount response and impact on availability

• Challenges:
– Measuring the effect of responses is hard:
  • Complex domain, rarely identical situations, non-deterministic actions/effects

• Approach: Experimentation
– System “snapshots” get close to identical conditions

CSISM / BBN
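A small sketch of what a situation-dependent utility could look like: the same response is scored as benefit minus cost, and the cost depends on the context. The response names, the numbers, and the `mission_critical` context flag are invented for the example and are not taken from CSISM.

```python
def best_response(context, responses, benefit, cost):
    """Pick the response with the highest utility (benefit - cost) in this context."""
    return max(responses, key=lambda r: benefit(r, context) - cost(r, context))

# Hypothetical effectiveness of each defense action.
BENEFIT = {"restart_server": 0.9, "block_source": 0.6}

def benefit(response, context):
    return BENEFIT[response]

def cost(response, context):
    # Restarting hurts availability far more while a mission is running.
    if response == "restart_server":
        return 0.8 if context["mission_critical"] else 0.1
    return 0.2

responses = ["restart_server", "block_source"]
print(best_response({"mission_critical": True}, responses, benefit, cost))
print(best_response({"mission_critical": False}, responses, benefit, cost))
```

In a learned version, the benefit and cost tables would be estimated from experiments on system snapshots rather than written by hand.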


Response Learning: Results Pending

• Bias toward results that worked in similar situations in the past
– Hybrid Reinforcement learning and Nearest-Neighbour approaches

• Given a set of hypotheses about the locus of an attack
– Search for true locus:
  • Hierarchical based on system architecture
  • Bias by historical attack patterns
– Select response based on similarity match to prior attacks:
  • Same response when quality was high
  • Alternate response when quality was low

Project will be tested by a red-team on 20 May 2008. Goal is to demonstrate “better” responses over time.

CSISM / BBN
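The similarity-based selection rule can be sketched with a plain nearest-neighbour lookup. This covers only the nearest-neighbour half of the hybrid approach; the Jaccard similarity, the feature sets, the `quality_threshold`, and the way an alternate response is chosen are all assumptions for illustration.

```python
def similarity(a, b):
    """Jaccard similarity between two sets of attack features."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def select_response(situation, history, responses, quality_threshold=0.5):
    """Reuse the response of the most similar prior attack if it worked well,
    otherwise try a different one.

    `history` holds (features, response, quality) tuples from earlier episodes.
    """
    if not history:
        return responses[0]
    _, response, quality = max(history, key=lambda h: similarity(situation, h[0]))
    if quality >= quality_threshold:
        return response                      # same response: quality was high
    alternates = [r for r in responses if r != response]
    return alternates[0] if alternates else response  # alternate: quality was low

# Hypothetical prior episodes.
history = [
    ({"mysql", "long_password"}, "block_source", 0.9),
    ({"mysql", "fulltext"}, "restart_server", 0.2),
]
responses = ["restart_server", "block_source", "block_query_type"]

print(select_response({"mysql", "long_password", "remote"}, history, responses))
print(select_response({"mysql", "fulltext"}, history, responses))
```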


Conclusion

Learning Benefits

• Learning can improve the defensive posture
– better knowledge (about the attacks or attacker), better policies

• Learning can improve how the system responds to symptoms
– better connection between response actions and their triggers

• Active Learning
– A mechanism for recognizing Zero-day attacks
– No false positives: only validated attacks are added

• Learning techniques are enablers for the next level of enhancements in adaptive defense

Adaptation is the key to survival


From Proof-of-Concept to Production

Generalization
• Demonstrated: Able to generalize instances to classes.
• Future directions: more axes of vulnerability; more handling of joint probabilities; more domains; meta learning to induce new axes

Multi-stage attack
• Demonstrated: Able to identify chaff
• Future directions: probabilistic actions; concealment; model of normal; generalization

Responses
• Demonstrated: Able to map context to response
• Future directions: richer context, richer responses; automatic measurement of benefit; scalable “snapshots”


Backup

Multistage Attacks

• Walk-away message: Detect and then generalize multi-step attacks across time and space.

• Multistage attacks involve a sequence of actions that span multiple hosts and take multiple steps to succeed.
– A sequence of actions with causal relationships.
– An action A must occur to set up the initial conditions for action B. Action B would have no effect without previously executing action A.
– For example:
  1. gain ability to execute commands on Box1 as unprivileged user by exploiting a buffer overflow in Service1
  2. gain root shell by running an exploit of a race condition
  3. disable protection mechanism, e.g. SELinux
  4. replace dpasa jar with attacker jar code
  5. run attacker code that sends bad refs to Box2, Box3, Box4.
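The causal dependency between steps can be modelled with preconditions and effects, as in the sketch below for the five-step example. The condition names and the exact preconditions attached to each step are assumptions made for illustration; the slide only states that each action sets up the conditions for the next.

```python
# Each step: (name, preconditions that must hold, effects it adds).
STEPS = [
    ("overflow_service1", set(),                        {"user_shell"}),
    ("race_condition",    {"user_shell"},               {"root_shell"}),
    ("disable_selinux",   {"root_shell"},               {"unprotected"}),
    ("replace_jar",       {"root_shell", "unprotected"}, {"attacker_jar"}),
    ("run_attacker_code", {"attacker_jar"},             {"bad_refs_sent"}),
]

def attack_succeeds(steps, goal="bad_refs_sent"):
    """Replay the steps in order; a step has no effect unless its
    preconditions already hold in the current state."""
    state = set()
    for _name, preconditions, effects in steps:
        if preconditions <= state:
            state |= effects
    return goal in state

print(attack_succeeds(STEPS))                 # full chain reaches the goal
print(attack_succeeds(STEPS[:1] + STEPS[2:])) # without the root exploit it does not
```

In this toy model, removing any single step breaks the chain, which is the property a defender can exploit: blocking one necessary step is enough to stop the whole attack.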


Attacks (MySQL DoS-1)

• mysql-com_table-dump-memory-corruption
– Malformed request leaves MySQL unstable

• Countermeasures:
– Block the malformed com_table_dump command using learned pattern and proxy filter rules.
– Restart the server.
– Block all requests from the offending sources.

• mysql-password-handler-buffer-overflow
– Excessive password length can crash server

• Countermeasures:
– Block connections which proffer “abnormal” passwords (learned response or statistical anomaly).
– Restart the server.
– Block all requests from the offending sources.

• mysql-remote-fulltext-search-DoS
– Malformed request crashes server

• Countermeasures:
– Detect and block malformed queries.
– Block all queries of this type (fulltext-search).
– Block all requests from the offending sources.
– Restart the server.
