hmm-web: a framework for the detection of attacks against web applications
June 17, 2009 ICC 2009 - HMMWeb - Davide Ariu 1
Pattern Recognition and Applications Group Department of Electrical and Electronic Engineering University of Cagliari, Italy
HMM-Web: a framework for the detection of attacks against Web Applications
I. Corona, D. Ariu, G. Giacinto
Presenter Davide Ariu
Outline
• Motivations
• HMM-Web vs. Web Application Firewalls
• Description of the IDS Scheme
• Noise inside the training set
• Sequences codification
• Experimental Setup
• Experimental Results
• Conclusions
Motivations
Why do we address the problem of securing Web Applications?
Motivations
Source: X-Force® 2008 Trend & Risk Report – January 2009
Motivations
Source: X-Force® 2008 Trend & Risk Report – January 2009
Protection of Web Applications
• Web Applications can be protected using a Web-Application Firewall (WAF)
  – A WAF filters the applications’ input using a set of rules.
• Writing rules for a Web-Application Firewall is a procedure that is:
  – Vulnerable to zero-day attacks
    • A WAF can’t stop an attack if it doesn’t have a rule against it
  – Time-consuming
    • Rules must be written by hand by the administrator
  – Prone to errors
    • It requires the administrator to have in-depth knowledge of the applications that reside on the Web-Server
HMM-Web
• HMM-Web addresses all of the weaknesses of Web-Application Firewalls because it is an Intrusion Detection System that is:
  – Anomaly-based
    • This means it is also able to cope with zero-day attacks
  – Fully automated as far as the training procedure is concerned
    • Time saving
    • It doesn’t require the administrator to know the applications that reside on the Web-Server
A usage scenario
Request URI Modelling
• As attacks like XSS and SQL-Injection exploit input validation flaws, we want to model the input provided by the user.
• User-provided data are passed by the browser to the Web-Server (then to the application) using a sequence of attribute-value pairs.
• Consequently, we want to model:
  – The sequence of attributes
  – The value of each attribute
Request URI Modelling
• From the example request URI
GET /search.php?cat=32&key=hmm HTTP/1.1
we extract:
  – The name of the application: “search.php”
  – The sequence of attributes: “cat-key”
  – The value of each attribute:
    • “32” for the attribute cat
    • “hmm” for the attribute key
• These are the elements that HMM-Web analyses
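The extraction described above can be sketched in Python using only the standard library. This is a minimal illustration, not the authors' implementation: the function name `parse_request_line` and the layout of the returned tuple are assumptions made for this example.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_request_line(request_line):
    """Split an HTTP request line into application name, attribute sequence, and pairs.

    Hypothetical helper for illustration; it is not part of HMM-Web itself.
    """
    # A request line has the form: METHOD SP request-URI SP HTTP-version
    method, uri, version = request_line.split(" ", 2)
    parts = urlsplit(uri)
    # Application name: last segment of the path, e.g. "search.php"
    application = parts.path.rsplit("/", 1)[-1]
    # parse_qsl percent-decodes names and values and keeps their order;
    # keep_blank_values=True preserves attributes sent with an empty value
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    attributes = [name for name, _ in pairs]
    return application, attributes, pairs


app, attrs, pairs = parse_request_line("GET /search.php?cat=32&key=hmm HTTP/1.1")
print(app)              # search.php
print("-".join(attrs))  # cat-key
print(pairs)            # [('cat', '32'), ('key', 'hmm')]
```

Keeping the pairs as an ordered list rather than a dictionary preserves the order of the attributes, which is what the sequence model needs.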
Classifier Ensemble
• HMM-Web is based on Hidden Markov Models
• For each application running on the Web Server, HMM-Web creates a module consisting of
  – An HMM-Ensemble to model the sequence of attributes
    • This makes it possible to detect request URIs modified by hand
  – An HMM-Ensemble for each of the attributes received by the Web Application
    • This makes it possible to detect whether an attribute is receiving an anomalous value.
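As a rough sketch of what such a module could look like, the Python below scores a sequence with the standard forward algorithm for a discrete HMM and groups the ensembles per application. The class `DiscreteHMM`, the function `ensemble_score`, the dictionary layout, the toy parameters, and the choice of the maximum as fusion rule are all assumptions made for illustration. The slides do not specify them, and training (for example with Baum-Welch) is omitted.

```python
class DiscreteHMM:
    """Discrete-symbol HMM with fixed parameters (illustrative helper)."""

    def __init__(self, start, trans, emit):
        self.start = start  # start[s]: probability of starting in state s
        self.trans = trans  # trans[p][s]: probability of moving from p to s
        self.emit = emit    # emit[s][symbol]: probability that s emits symbol

    def likelihood(self, sequence):
        """Probability of the sequence under the model (forward algorithm)."""
        if not sequence:
            return 1.0
        states = range(len(self.start))
        # Initialisation: start in s and emit the first symbol
        alpha = [self.start[s] * self.emit[s].get(sequence[0], 0.0) for s in states]
        # Induction: move to s from any previous state, then emit the symbol
        for symbol in sequence[1:]:
            alpha = [
                sum(alpha[p] * self.trans[p][s] for p in states)
                * self.emit[s].get(symbol, 0.0)
                for s in states
            ]
        return sum(alpha)


def ensemble_score(models, sequence):
    """Fuse the ensemble outputs; taking the maximum is an assumption."""
    return max(m.likelihood(sequence) for m in models)


# Toy two-state model over attribute names: state 0 emits "cat", state 1 emits "key".
hmm = DiscreteHMM(
    start=[0.5, 0.5],
    trans=[[0.5, 0.5], [0.5, 0.5]],
    emit=[{"cat": 1.0}, {"key": 1.0}],
)
# One module per application: an ensemble for the attribute sequence,
# plus one ensemble per attribute (left empty in this sketch).
module = {"sequence_ensemble": [hmm], "value_ensembles": {"cat": [], "key": []}}

known = ensemble_score(module["sequence_ensemble"], ["cat", "key"])
unseen = ensemble_score(module["sequence_ensemble"], ["cat", "debug"])
print(known, unseen)  # 0.25 0.0
```

A sequence containing an attribute never seen by the model receives probability zero here, which is the kind of signal that would flag a hand-modified request URI.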
IDS-Scheme
Noise in the training set
• HMM-Web is trained on a training set made of requests toward the Web-Server we want to protect.
• This means that this training set might contain both legitimate and attack requests.
• From a Pattern Recognition point of view, this is a problem of training on noisy data.
How does this noise affect the performance of HMM-Web?
Noise in the training set
• The assumption that most of the queries inside the training set are legitimate is not reasonable for applications that are rarely queried.
Noise in the training set: Countermeasure
• We propose to model the fraction of attacks inside the training set as:
$$\alpha = \frac{1}{N} \sum_{i=1}^{M} \alpha_i \cdot |q(w_i)|$$
• Where:
  – $M$ is the number of applications on the Web Server
  – $N$ is the number of queries in the training set
  – $|q(w_i)|$ is the number of queries on the i-th application
  – $\alpha_i$ is the fraction of attacks on the i-th application
How can we estimate $\alpha_i$ effectively for each application?
Noise in the training set: Countermeasure
• Experimental results show that even a rough estimate of the amount of attacks inside the training set improves the performance of the IDS.
• A good estimate of $\alpha_i$ is provided by the following formula:
$$\alpha_i = \frac{\alpha}{M \cdot \mathrm{freq}(w_i)}, \quad \forall i \in [1, M]$$
• $\mathrm{freq}(w_i)$ is simply the ratio between the number of queries toward the i-th application and the overall number of queries.
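The estimate can be checked numerically. The sketch below computes alpha_i = alpha / (M * freq(w_i)) for each application and verifies that the weighted sum (1/N) * sum_i alpha_i * |q(w_i)| gives back the overall fraction alpha. The function name, the application names, the query counts, and the value of alpha are made up for illustration and do not come from the experiments.

```python
def per_application_attack_fraction(alpha, query_counts):
    """Return alpha_i = alpha / (M * freq(w_i)) for every application w_i."""
    m = len(query_counts)           # M: number of applications
    n = sum(query_counts.values())  # N: number of queries in the training set
    # freq(w_i) = |q(w_i)| / N
    return {app: alpha / (m * (count / n)) for app, count in query_counts.items()}


# Hypothetical training set: query counts per application (made-up numbers)
query_counts = {"search.php": 900, "login.php": 90, "admin.php": 10}
alpha = 0.001  # assumed overall fraction of attacks in the training set

alpha_i = per_application_attack_fraction(alpha, query_counts)

# Consistency check against alpha = (1/N) * sum_i alpha_i * |q(w_i)|
n = sum(query_counts.values())
recovered = sum(alpha_i[app] * query_counts[app] for app in query_counts) / n

for app, value in alpha_i.items():
    print(app, value)
print(recovered)
```

Under this estimate, rarely queried applications are assigned a larger expected fraction of attacks than frequently queried ones.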
Attribute value codification
• The values passed to the attributes might contain digits, alphabetic letters or meta-characters.
• As it is not important to distinguish between elements within each of these categories, HMM-Web
  – Replaces all the digits with the symbol “N”
  – Replaces all the alphabetic letters with the symbol “A”
  – Leaves meta-characters unchanged
• E.g. the attribute value “/dir/sub/1,2” becomes “/AAA/AAA/N,N”
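The codification rules above can be sketched in a few lines of Python. The function name `encode_value` is an assumption, and so is the restriction to ASCII digits and letters, since the slides do not say how other characters are treated. The mapping is done in a single pass over the characters: applying two successive global replacements would turn the “N” produced for digits into “A”.

```python
import string


def encode_value(value):
    """Map digits to "N" and letters to "A"; leave every other character as is."""
    encoded = []
    for ch in value:
        if ch in string.digits:           # ASCII digits only (assumption)
            encoded.append("N")
        elif ch in string.ascii_letters:  # ASCII letters only (assumption)
            encoded.append("A")
        else:
            encoded.append(ch)            # meta-characters are kept unchanged
    return "".join(encoded)


print(encode_value("/dir/sub/1,2"))  # /AAA/AAA/N,N
print(encode_value("' OR 1=1--"))    # ' AA N=N--
```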
Experimental Setup
• We tested HMM-Web on a production Web-Server of our Academic Institution.
• The Web-Server hosts 52 Applications:
  – 24 provide services for registered users
  – 28 provide public services
• Dataset D: 150,000 queries toward the Web-Server
• Dataset A: 38 attacks against 18 applications
  – 19 Cross Site Scripting Attacks
  – 19 SQL Injection Attacks
Experimental Results: Effectiveness of attributes’ codification
The curve on the right has been obtained using the codification proposed by Kruegel et al. in “A multimodel approach to the detection of web-based attacks”, Computer Networks, 2005.
Experimental Results: Effectiveness of the MCS Approach
Conclusions
• In this work we propose an anomaly-based IDS for the protection of Web-Applications
• Compared with a traditional WAF, HMM-Web is able to cope with zero-day attacks and doesn’t require the administrator to have in-depth knowledge of the applications to be protected.
• We also suggest a solution for the codification of queries toward the web server and a strategy to take into account the noise in the training set.
• HMM-Web achieves excellent results in terms of detection rate versus false-positive rate, even against attacks that are similar to those inside the training set.
Questions?