Effective multimodel anomaly detection using cooperative negotiation

Alberto Volpatto, Federico Maggi, Stefano Zanero

DEI, Politecnico di Milano

Anomaly detection

Example: detecting malicious HTTP messages

Dynamic web page

Bad guy

Malicious HTTP Request

GET /login/id/<script>..</script>

Malicious HTTP Response

HTTP Redirect

www.iSpreadMalware.org Bad guy's page

Unlucky client

Malicious HTTP Response

/* Attack 3rd party plugin */

Anomaly detection

Modeling non-malicious messages to find malicious ones

Clients

Webserver

Millions of good HTTP messages

Learning phase

Client

Webserver

Models of good messages

M1 M2 M3 ... Mn

Example of models

• parameter string length

• numeric range

• probabilistic grammar of strings

• string character distribution

GET /page?uid=u44&p=14&do=delete

Webserver

Models of good sessions

M1 M2 M3 ... Mn

C2 C10 C7

GET /page?uid=u43&p=10&do=add

Webserver

M1 M2 M3 ... Mn

C2 C5 C10

GET /page?uid=s10&do=add

Detection phase

Client

Webserver

Detection of bad messages

M1 M2 M3 ... Mn

GET /page?do=<script>MaliciousCode();

Anomaly value aggregation

• Not always well formalized (exceptions exist)

• Each model gives a partial anomaly value

• Issue: combining partial anomaly values is not trivial

• simple average (too simple)

• weighted average (how to set weights?)

• Bayesian networks (how to tune them easily?)

• etc. (ample literature on the subject)
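
A minimal sketch of why the choice of aggregation matters: the same partial anomaly values give very different overall values under a simple and a weighted average. The values and weights below are hypothetical and chosen only for illustration; they are not taken from the evaluation.

```python
def simple_average(values):
    """Unweighted mean of the partial anomaly values."""
    return sum(values) / len(values)


def weighted_average(values, weights):
    """Mean of the partial anomaly values, weighted per model."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical case: one model flags the request, three do not.
partial_values = [0.9, 0.1, 0.1, 0.1]
# Hypothetical weights: the first model is considered far more reliable.
weights = [5.0, 1.0, 1.0, 1.0]

print(simple_average(partial_values))             # dominated by the three low values
print(weighted_average(partial_values, weights))  # pulled toward the heavily weighted model
```

The result depends entirely on the weights, which is the open question the slide raises.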

Proposed approach

• Models treated as autonomous agents

• Overall anomaly value negotiated iteratively through a mediator

New detection phase

Client

Webserver

Partial models (i.e., agents): M1 M2 ... Mn

Mediator

Partial anomaly values: $p_1^t, p_2^t, \dots, p_i^t, \dots, p_n^t$

Anomaly value returned to each agent: $a^t$

Partial anomaly values (iteration $t + 1$): $p_1^{t+1}, p_2^{t+1}, \dots, p_i^{t+1}, \dots, p_n^{t+1}$

Until agreement

All partial anomaly values are equal: $p_1^{t^\star} = p_2^{t^\star} = \dots = p_n^{t^\star}$

Overall anomaly value is selected: $a = a^{t^\star}$

Negotiation function

$p_i^{t+1} = F_i(p^t) = p_i^t + \alpha_i (a^t - p_i^t)$

• $p_i^{t+1}$: partial anomaly value of each agent at next iteration

• $a^t$: anomaly value at current iteration

• $\alpha_i$: agreement coefficient for each agent

Agreement coefficient

$\alpha_i = f_\alpha(w_i) = \frac{1}{1 + e^{h(w_i - k)}}$

• $w_i$: weights are the same trust levels of each agent (i.e., partial model)

• $h$, $k$: are tuning parameters

Agreement coefficient

• values close to one:

• change i-th offer

• values close to zero:

• preserve i-th offer
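
A minimal sketch of the agreement coefficient, assuming it is the logistic function $1 / (1 + e^{h(w_i - k)})$ and that trust levels lie in $[0, 1]$. The default values of `h` and `k` are hypothetical, picked only to make the behavior visible.

```python
import math


def agreement_coefficient(w, h=10.0, k=0.5):
    """Logistic mapping from trust level w to agreement coefficient alpha.

    h (steepness) and k (midpoint) are tuning parameters; the defaults
    here are illustrative assumptions, not recommended settings.
    """
    return 1.0 / (1.0 + math.exp(h * (w - k)))


# A highly trusted agent gets alpha close to zero and preserves its offer;
# a poorly trusted agent gets alpha close to one and changes its offer.
for trust in (0.1, 0.5, 0.9):
    print(trust, agreement_coefficient(trust))
```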

Cooperative negotiation

• The parameter $\alpha_i$ is called the agreement coefficient and indicates how willing agent $i$ is to adapt to the mediator's counter-offer

  • if close to 1, the agent is inclined to reach an agreement

  • if close to 0, the agent does not intend to change its offer

• It is computed from the trust level: $\alpha_i = \frac{1}{1 + e^{h(w_i - k)}}$

  • if the trust level is close to 1, the agent's evaluation is highly reliable and the agent does not want to change its offer

  • if the trust level is close to 0, the evaluation is not reliable and the offer can be given less consideration

• Possible problems:

  • dictatorial behavior

  • convergence

Agreement function

$a^t = f(p_i, w_i) = \frac{\sum_i p_i^t w_i}{\sum_i w_i}$

• $a^t$: anomaly value at current iteration

• $w_i$: weights are trust levels of each agent (i.e., partial model)

• at each iteration, the agreement is, e.g., the weighted average of the anomaly value $p$ across all the agents
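
The negotiation, agreement-coefficient, and agreement functions can be combined into a short loop. This is a sketch under stated assumptions, not the authors' implementation: the logistic form of the coefficient, the tolerance `tol` used in place of exact equality, the iteration cap `max_iter`, and the default `h` and `k` are all hypothetical choices.

```python
import math


def agreement_coefficient(w, h, k):
    """alpha_i = 1 / (1 + e^(h * (w_i - k))), assumed logistic form."""
    return 1.0 / (1.0 + math.exp(h * (w - k)))


def weighted_average(p, w):
    """Agreement function: trust-weighted average of the current offers."""
    return sum(pi * wi for pi, wi in zip(p, w)) / sum(w)


def negotiate(offers, trust, h=10.0, k=0.5, tol=1e-6, max_iter=10_000):
    """Run the mediator loop; return (overall anomaly value, iterations)."""
    p = list(offers)
    alpha = [agreement_coefficient(w, h, k) for w in trust]
    for t in range(max_iter):
        a = weighted_average(p, trust)
        # Agreement: all partial values equal, up to the tolerance.
        if max(p) - min(p) < tol:
            return a, t
        # Negotiation function: each agent moves toward the mediator's value.
        p = [pi + ai * (a - pi) for pi, ai in zip(p, alpha)]
    return weighted_average(p, trust), max_iter


# Hypothetical offers and trust levels for three agents.
value, iterations = negotiate([0.9, 0.2, 0.1], [0.9, 0.3, 0.2])
print(value, iterations)
```

In this example the most trusted agent barely moves, so the agreed value ends up close to its initial offer rather than at the plain weighted average of the initial offers.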

Modification of the learning phase

• A trust level for each model is needed

• Typically, modern tools compute it during learning [Criscione et al., EC2ND 2010]

• Examples: standard deviation of string length model, kurtosis measure of integer models

Trust levels constant over time

• Two-way communication between agents and mediator is not needed

• The mediator would just receive the initial offers (i.e., partial anomaly values) and run the iterative algorithm

$w_i^t = w_i^{t+1} = \dots = w_i$

Optimized learning

• Monitor the trust level of each model using a simple sliding window

• When the j-th learning sample comes in: $\delta_W(j) := \max_{j \in W} w_i^j - \min_{j \in W} w_i^j$

• Stop learning when “stability” is reached: $\delta_W(j^\star) < \varepsilon$

Stop learning when trust is “stable”
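
The sliding-window check can be sketched as follows. The class name, the window size, and the threshold are hypothetical; the sketch also assumes the window must be full before stability can be declared, which the slides do not specify.

```python
from collections import deque


class TrustStabilityMonitor:
    """Tracks one model's trust level over a sliding window of samples."""

    def __init__(self, window_size, epsilon):
        self.window = deque(maxlen=window_size)  # oldest values drop out
        self.epsilon = epsilon

    def update(self, trust):
        """Record the trust level after a learning sample.

        Returns True when max - min over the full window is below epsilon.
        """
        self.window.append(trust)
        if len(self.window) < self.window.maxlen:
            return False  # assumption: not enough samples to judge yet
        return max(self.window) - min(self.window) < self.epsilon


# Hypothetical trust levels observed while learning.
monitor = TrustStabilityMonitor(window_size=3, epsilon=0.05)
for sample_index, trust in enumerate([0.2, 0.5, 0.7, 0.72, 0.73, 0.73]):
    if monitor.update(trust):
        print("stable at sample", sample_index)
        break
```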

Self-calibration: stopping the learning phase

• The duration of learning determines detection effectiveness

  • if too long, it can cause overfitting

  • if too short, it may not allow an adequate model of normality to be built

• Introduction of an automatic mechanism for stopping the learning phase

  • A model m is considered stable if there are no more significant variations of its trust level T() over the last w observations

  • An Anomaly Engine is stable for an event class if every model it implements is stable

Evaluation

• Background traffic: clean UCSB iCTF 2008 data (22,051 HTTP requests-responses)

  • 14,961 attack-free training samples

  • 7,090 attack-free testing samples

• Attack data: instances of the most popular real-world attacks (SQL injections, JavaScript injections, command injections) inserted with random mutations

Impact of modifications

[Figure: weighted average vs. cooperative negotiation vs. cooperative negotiation and optimized learning; axis from 0 to 0.4]

Limitations (1)

i.e., future work

1. strict convergence not formally proven

• but parameters influence detection quality only minimally, and predictably

• Mitigation: choose h and k to guarantee convergence

Convergence in practice

[Figures: convergence for h = 2.5, 5.0, ..., 25.0 (axis from 0 to 0.8); “h, k versus FPR” and “h, k versus DR” for h = 2.5, 5.0, 7.5, 10.0 (axis from 0.1 to 0.9)]

2. trust level independent from input

• possible concept drift not taken into account

• Mitigation: already addressed by other approaches [Maggi et al., RAID 2009]

3. relax cooperative assumption

• important in case of distributed detection

• agents may cheat because:

• outdated training base (mitigation: [Robertson et al., NDSS 2010])

• intruders took over a detector

Conclusions

• very simple to implement

• distributed detection is “embedded” in the model

• meaning of weights is now defined

• easy to generalize to more complex schema

• within the scope of our preliminary evaluation, it works against real-world attacks

Questions?

Alberto Volpatto, Federico Maggi, Stefano Zanero

fmaggi@elet.polimi.it

http://home.dei.polimi.it/fmaggi/

http://www.vplab.elet.polimi.it
