behind an application firewall, are we safe from sql injection attacks?

Software Verification & Validation (SVV)

Behind an Application Firewall, Are We Safe from SQL Injection Attacks?

Dennis Appelt, Cu D. Nguyen, Lionel Briand

• A Web Application Firewall (WAF) is the first layer of defense

• Stops attacks before they reach (vulnerable) applications

Onion Defense Paradigm


Problem Statement

• Ensuring that a WAF can reliably identify attacks is critical for protecting IT infrastructures

• Configuring and maintaining a WAF is difficult and error-prone

•  False positives: By default, WAF rule sets are strict. This results in legitimate requests being classified as attacks.

•  False negatives: Tailoring a WAF rule set to a specific application and its attack types often relaxes the rule set too much, so attacks slip through.


Problem Statement


(?i:(?:\b(?:(?:s(?:ys\.(?:user_(?:(?:t(?:ab(?:_column|le)|rigger)|object|view)s|c(?:onstraints|atalog))|all_tables|tab)|elect\b.{0,40}\b(?:substring|users?|ascii))|m(?:sys(?:(?:queri|ac)e|relationship|column|object)s|ysql\.(db|user))|c(?:onstraint_type|harindex)|waitfor\b\W*?\bdelay|attnotnull)\b|(?:locate|instr)\W+\()|\@\@spid\b)|\b(?:(?:s(?:ys(?:(?:(?:process|tabl)e|filegroup|object)s|c(?:o(?:nstraint|lumn)s|at)|dba|ibm)|ubstr(?:ing)?)|user_(?:(?:(?:constrain|objec)t|tab(?:_column|le)|ind_column|user)s|password|group)|a(?:tt(?:rel|typ)id|ll_objects)|object_(?:(?:nam|typ)e|id)|pg_(?:attribute|class)|column_(?:name|id)|xtype\W+\bchar|mb_users|rownum)\b|t(?:able_name\b|extpos\W+\()))!


Bypass Testing of WAFs

Approach


Space of SQL Injection attacks

•  Large input space

•  Exhaustive search infeasible

•  Random search ineffective

•  Difficult to guide the search

How to guide the test generation towards bypassing the WAF?


Generate new test cases by learning from previously executed test cases!


Learning from test cases


Attack String                              Blocked?

“ Union Select 1 From all_tables           Yes

“ AND false #                              Yes

1 OR/**/”a”=“a” OR                         No

“ Union/**/Select 1 From all_tables        ?

“ AND false OR                             ?

SQL Injection Grammar

• Attacks are generated from a context-free grammar

• Each attack has a derivation tree

Fig. 1. The syntax diagram of the proposed grammar.
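As a rough illustration of generating attacks from a context-free grammar, the sketch below derives a random attack string together with its derivation tree. The productions in `GRAMMAR` are a small, made-up subset that only borrows rule names from the slides; they are an assumption for illustration, not the grammar of Fig. 1.

```python
import random

# Toy grammar: non-terminals map to a list of alternative productions.
# These productions are illustrative assumptions, not the authors' grammar.
GRAMMAR = {
    "<ATTACK>": [["<sQuoteContext>"], ["<dQuoteContext>"]],
    "<sQuoteContext>": [["'", "<wsp>", "<booleanAttack>", "<cmt>"]],
    "<dQuoteContext>": [['"', "<wsp>", "<booleanAttack>", "<cmt>"]],
    "<booleanAttack>": [["OR", "<wsp>", "<binaryTrue>"]],
    "<binaryTrue>": [['"a"="a"'], ["1=1"]],
    "<wsp>": [[" "], ["/**/"]],
    "<cmt>": [["#"], ["--"]],
}


def derive(symbol, rng):
    """Return a derivation tree as (symbol, children); terminals have no children."""
    if symbol not in GRAMMAR:
        return (symbol, [])
    production = rng.choice(GRAMMAR[symbol])
    return (symbol, [derive(s, rng) for s in production])


def to_string(tree):
    """Concatenate the terminal leaves of a derivation tree, left to right."""
    symbol, children = tree
    if not children:
        return symbol
    return "".join(to_string(child) for child in children)


rng = random.Random(0)  # fixed seed so repeated runs give the same attack
tree = derive("<ATTACK>", rng)
attack = to_string(tree)
print(attack)
```

Keeping the tree rather than only the string is what later makes it possible to cut an attack into slices.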

protected by a WAF and are labelled as “P” or “B” depending on whether they bypass or are blocked by the WAF, respectively. We encode and make use of these test results as initial training data to learn a model predicting the likelihood (f) with which tests can bypass the WAF. Using this measure we can rank, select, and mutate tests with high predicted f values to produce new tests, with hopefully even higher bypassing probabilities. These new tests are then executed and their results (“P” or “B”) are in turn used to feed a machine learning algorithm and improve the prediction model, which will in turn help generate more tests that bypass the WAF.

Our approach is inspired by genetic programming and search-based test generation [5], [12], [1]. We face the problem of efficiently choosing, from a large set of SQLi attacks, the ones that are more likely to reveal holes in the WAF. The problem is challenging because there is little information available to calculate how close a test comes to bypassing the WAF. When a test is executed, only one of two events can be observed: bypassing or blocked. This leaves the search with no guidance to assess how close a blocked attack is to bypassing the WAF. To tackle the problem, we use machine learning to model how the elements (features of attacks) of the tests are associated with high likelihoods of bypassing the WAF. In the search process, tests that are predicted to have such a high likelihood are considered to have a high fitness and are likely candidates for mutation.

In what follows, we discuss in detail how tests are decomposed and encoded for machine learning, and the mutation process that we use to generate new SQLi attacks. Finally, we describe our overall ML-driven test generation approach, which aims at iteratively finding new and effective attacks.

1) Test Decomposition: From the defined grammar, we can derive tests by recursively applying production rules, starting with the <ATTACK> rule. A derivation tree (also called a parse tree) of a test is a graphical representation of the derivation steps that are involved in producing the test. In a derivation tree, an intermediate node represents a non-terminal symbol, a leaf node represents a terminal one, and edges are derivations. Figure 2 depicts the derivation tree of the BOOLEAN attack test: ’ OR“a”=“a”--. In the course of generating this test, we first apply the <ATTACK> rule:

<START>
  <sQuoteContext>
    <squote>: ‘
    <wsp>: ␣
    <sqliAttack>
      <booleanAttack>
        <orAttack>
          <opOr>: OR
          <booleanTrueExpr>
            <binaryTrue>
              <dquote>: “   <char>: a   <dquote>: ”   <opEqual>: =   <dquote>: “   <char>: a   <dquote>: ”
    <cmt>: --

Fig. 2. The derivation tree of the “boolean” SQLi attack: ’ OR“a”=“a”--.

<ATTACK> ::= <numericContext> | <sQuoteContext> | <dQuoteContext> ;

and derive <sQuoteContext>. We then apply the third rule of the grammar to derive <squote>, <wsp>, <sqliAttack>, and <cmt>. This procedure is repeated until all the leaf nodes are terminal symbols.

We make use of derivation trees to identify which parts of a SQLi attack are likely to be responsible for the attack being blocked or passing. Specifically, for each test, we decompose its derivation tree into slices, which are defined as follows:

Definition 1 (Slice). A slice s of a derivation tree T is a sub-tree of T such that the root of s is a non-terminal node of T, except those that represent <ATTACK>, <numericContext>, <sQuoteContext>, and <dQuoteContext>.

We skip the start symbol and its children because sub-trees extracted from them will be (closely) equivalent to the original derivation tree. Such decompositions provide little or no information as to why a test is blocked or bypassing.
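Definition 1 can be sketched in a few lines. The example below hand-builds the derivation tree of Fig. 2 as nested `(symbol, children)` tuples and collects every sub-tree rooted at a non-terminal, skipping the start and context symbols. The tuple representation and the helper names (`node`, `text`, `slices`) are assumptions made for this sketch, and `<START>` is excluded alongside `<ATTACK>` because the figure uses that label for the root.

```python
# Roots that Definition 1 excludes (plus <START>, the root label used in Fig. 2).
EXCLUDED = {"<ATTACK>", "<START>", "<numericContext>",
            "<sQuoteContext>", "<dQuoteContext>"}


def node(symbol, *children):
    """Build a tree node; a node without children is a terminal leaf."""
    return (symbol, list(children))


def text(tree):
    """Concatenate the terminal leaves under a node."""
    symbol, children = tree
    return symbol if not children else "".join(text(c) for c in children)


def slices(tree):
    """Return (root symbol, yielded string) for every slice of the tree."""
    symbol, children = tree
    found = []
    if children and symbol not in EXCLUDED:  # non-terminal and not excluded
        found.append((symbol, text(tree)))
    for child in children:
        found.extend(slices(child))
    return found


def quoted_a():
    """Sub-trees for the three symbols <dquote> <char> <dquote> yielding "a"."""
    return [node("<dquote>", node('"')),
            node("<char>", node("a")),
            node("<dquote>", node('"'))]


binary_true = node("<binaryTrue>",
                   *quoted_a(), node("<opEqual>", node("=")), *quoted_a())

tree = node("<START>",
            node("<sQuoteContext>",
                 node("<squote>", node("'")),
                 node("<wsp>", node(" ")),
                 node("<sqliAttack>",
                      node("<booleanAttack>",
                           node("<orAttack>",
                                node("<opOr>", node("OR")),
                                node("<booleanTrueExpr>", binary_true)))),
                 node("<cmt>", node("--"))))

all_slices = slices(tree)
for root, fragment in all_slices:
    print(root, repr(fragment))
```

Several slices yield the same string (for example `<booleanTrueExpr>` and `<binaryTrue>`), so a practical encoding might deduplicate them; that choice is left open here.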

Slicing

[Figure: the derivation tree of the attack ‘ OR”a”=“a”# decomposed into slices s1, s2, s3, s4]

‘ OR”a”=“a”#    S = {‘, ␣, OR”a”=“a”, #}

Learning Attack Patterns


S1 S2 … Sn Class

A1 1 0 … 1 Blocked

A2 1 1 … 0 Passing

… … … … … …

Am 0 1 … 0 Blocked

•  Each attack becomes one observation in the training data

•  An observation indicates which slices an attack consists of

•  From the training data a decision tree is built

•  The decision tree predicts how close a test case comes to bypassing the WAF
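A minimal sketch of the encoding step: each attack is turned into a 0/1 row with one column per known slice, plus its class label. The slice names and labels below are made-up sample data, and `encode` is a hypothetical helper, not part of the authors' tooling.

```python
def encode(observations):
    """Turn (set of slices, label) pairs into binary feature rows.

    Returns the sorted list of slice names (the columns) and one
    (row, label) pair per attack, where row[i] is 1 if the attack
    contains the i-th slice and 0 otherwise.
    """
    features = sorted(set().union(*(s for s, _ in observations)))
    rows = [([1 if f in s else 0 for f in features], label)
            for s, label in observations]
    return features, rows


# Hypothetical sample data in the spirit of the table above.
observations = [
    ({"S1", "S3"}, "Blocked"),
    ({"S1", "S2"}, "Passing"),
    ({"S2"}, "Blocked"),
]

features, rows = encode(observations)
print(features)
for row, label in rows:
    print(row, label)
```

Rows in this shape can be handed to any decision-tree learner that accepts binary features.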

Decision Tree

•  Decision tree groups attacks based on the presence/absence of slices

•  Benefits of using a decision tree
    •  Interpretable
    •  Performance

[Figure: decision tree with internal nodes S1–S5, branches labelled 0/1, and leaves Blocked / Passed]

Guiding Test Generation

• Amongst the training data, select the attacks that are most likely to pass for mutation.

•  Breadth-first: Select many attacks and mutate each attack only a few times.

•  Depth-first: Select few attacks and mutate each attack many times.

• The structure of the decision tree is exploited to generate new attacks.
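The two selection strategies differ only in how a mutation budget is spread over the ranked attacks. The sketch below makes that explicit; `plan_mutations`, its parameters, and the sample probabilities are assumptions for illustration, since the slides do not give concrete numbers.

```python
def plan_mutations(ranked, n_select, n_mutations):
    """Pick the n_select attacks most likely to pass and assign each
    the same number of mutations.

    ranked: list of (attack, predicted probability of bypassing the WAF).
    Returns a list of (attack, number of mutations to apply).
    """
    best = sorted(ranked, key=lambda pair: pair[1], reverse=True)[:n_select]
    return [(attack, n_mutations) for attack, _ in best]


# Hypothetical predictions from the classifier.
ranked = [("A1", 0.9), ("A2", 0.2), ("A3", 0.7), ("A4", 0.5)]

# Breadth-first: many attacks, few mutations each.
breadth = plan_mutations(ranked, n_select=4, n_mutations=1)

# Depth-first: few attacks, many mutations each.
depth = plan_mutations(ranked, n_select=1, n_mutations=4)

print(breadth)
print(depth)
```

Both plans spend the same total budget of four mutations; only the distribution changes.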


Exploiting the Decision Tree

• Attack A = <S1, S3, S6>

•  Path Condition P = S3 ∧ ¬S5 ∧ S1

• Mutate A so that the mutants satisfy P

[Figure: decision tree with internal nodes S1–S5, branches labelled 0/1, and leaves Blocked / Passed]
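The path condition of an attack can be read off by walking the decision tree from the root and recording, at each node, whether the tested slice is present. The tree below is hypothetical: its exact shape is not recoverable from the slide, so it was chosen only to be consistent with the example A = <S1, S3, S6> and P = S3 ∧ ¬S5 ∧ S1.

```python
# Internal node: (slice, subtree if slice absent, subtree if slice present).
# Leaf: the predicted class as a string. This tree is an assumption.
TREE = ("S3",
        ("S2", "Blocked", ("S4", "Blocked", "Passed")),
        ("S5", ("S1", "Blocked", "Passed"), "Blocked"))


def path_condition(tree, attack):
    """Walk the tree for an attack (a collection of slice names).

    Returns the path as a list of (slice, must_be_present) pairs and
    the label of the leaf that the attack reaches.
    """
    path = []
    while not isinstance(tree, str):
        name, if_absent, if_present = tree
        present = name in attack
        path.append((name, present))
        tree = if_present if present else if_absent
    return path, tree


path, leaf = path_condition(TREE, {"S1", "S3", "S6"})
print(path, leaf)
```

Here the pair `("S5", False)` encodes the negated term ¬S5 of the path condition.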

Mutation

A = <S1, S3, S6>    P = S3 ∧ ¬S5 ∧ S1

M1 = <S1, S3, S11>

Production Rule for S6:  S7 → … | S6 | S11 | S9 | S5 | …
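The mutation step can be sketched as follows: slices required by the path condition are left alone, any other slice may be swapped for an alternative of its production rule, and mutants that violate the condition are discarded. `ALTERNATIVES` lists only the alternatives visible on the slide; the function names are hypothetical.

```python
# Alternatives that the production rule offers in place of S6
# (only those shown on the slide; the elided ones are left out).
ALTERNATIVES = {"S6": ["S11", "S9", "S5"]}


def satisfies(attack, condition):
    """True if every (slice, must_be_present) pair holds for the attack."""
    return all((name in attack) == required for name, required in condition)


def mutants(attack, condition):
    """Enumerate single-slice replacements that keep the path condition true."""
    protected = {name for name, required in condition if required}
    result = []
    for i, current in enumerate(attack):
        if current in protected:
            continue  # slices demanded by the path condition stay in place
        for alternative in ALTERNATIVES.get(current, []):
            mutant = attack[:i] + [alternative] + attack[i + 1:]
            if satisfies(mutant, condition):
                result.append(mutant)
    return result


A = ["S1", "S3", "S6"]
P = [("S3", True), ("S5", False), ("S1", True)]

for m in mutants(A, P):
    print(m)
```

Replacing S6 by S5 is rejected because it would violate ¬S5, which leaves <S1, S3, S11> and <S1, S3, S9> as valid mutants.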

Iterative Learning

Slice attacks → Prepare Training Data → Build Classifier → Mutate best attacks → (repeat)

Evaluation


Test subject


Research Question 1

How does the decision tree improve over training iterations in terms of F-measure?
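The slides do not spell out which F-measure is used. The sketch below assumes the common balanced F1 score, the harmonic mean of precision and recall, computed from counts of true positives, false positives, and false negatives for one class; the sample counts are made up.

```python
def f_measure(tp, fp, fn):
    """Balanced F-measure (F1) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 correct predictions of the class, 2 wrong, 2 missed.
score = f_measure(tp=8, fp=2, fn=2)
print(round(score, 3))
```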


Improvement of the Classifier

•  F-Measure improves constantly

•  In later iterations the improvement decreases


Research Question 2

Among ML-Driven breadth-first, ML-Driven depth-first and RAN, which one yields better performance in terms of the number of bypassing tests (Dt) and efficiency?

Number of Bypassing Tests

• ML-B and ML-D outperform RAN

• ML-D better in the beginning (< 75 Min.)

• ML-B better in the later stage (> 75 Min.)


Efficiency

•  Over time it becomes harder to find more bypassing tests.

•  Efficiency for RAN steadily decreases.

•  Efficiency for ML-D and ML-B increases in the first hour; then decreases.


Summary


Research Question 1

What is the best choice for parameter K for the RandomTree algorithm to generate a good classifier in terms of F-measure and Msize?

Evaluation of parameter K

Selecting K:

•  Trade-off: computation time ↔ F-measure/Msize

•  K ~ 40% reasonable compromise
