Towards an SMT-based approach for Quantitative Information Flow

Quoc-Sang Phan, Pasquale Malacaria
Queen Mary, University of London
November 29, 2012


Presentation at the Dagstuhl Seminar on Quantitative Security Analysis. Full paper is here: http://dl.acm.org/citation.cfm?id=2590328


Outline

1 INTRODUCTION

2 THE APPROACH
    QIF as a #SMT problem
    A #DPLL(T) for QIF
    Symbolic Execution as #DPLL(T)
    Soundness and Completeness
    Experiment

3 CONCLUSION


Contributions

1 Introduction of a new research problem, #SMT, and its applications to QIF and Symbolic Execution.

2 A framework, called #DPLL(T), to build a solver for #SMT-based QIF.

3 We show that Symbolic Execution analysis can be viewed as a #SMT solver.

4 Two prototype tools for QIF: sqifc, which employs CBMC, and jpf-qif, which is built on top of Symbolic Pathfinder.

5 Experiments with the tools on non-trivial case studies, with a dramatic improvement in performance compared with existing tools.


Quantitative Information Flow Analysis

Channel Capacity

$\Delta_F(H) = F(H) - F(H \mid L) \le \log_2(N)$

Lagrange multipliers and maximum information leakage in different observational models. Malacaria and Chen (PLAS 2008).

On the Foundations of Quantitative Information Flow. Smith (FOSSACS 2009).


Challenge

f : D → D_o

N = 0
for all v in D_o do
    if (assert O != v is violated) then
        N ← N + 1
    end if
end for
return N

Figure: Exhaustive counting of outputs of a program f
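The exhaustive loop in the figure can be sketched in Python. This is only an illustration: the model checker query "assert O != v is violated" is replaced by a brute-force search over inputs, and `toy_program`, the 8-bit input range and the function name are assumptions, not part of the slides.

```python
def count_outputs_exhaustive(f, inputs, output_domain):
    """Count the values v in output_domain that the program f can produce."""
    inputs = list(inputs)
    n = 0
    for v in output_domain:
        # Stand-in for the model checker: the assertion "O != v" is violated
        # exactly when some input makes the program output v.
        if any(f(h) == v for h in inputs):
            n += 1
    return n


def toy_program(h: int) -> int:
    # Hypothetical program: only the two low bits of the secret reach the output.
    return h & 0b11


# Assumed 8-bit secret and 8-bit output domain.
n = count_outputs_exhaustive(toy_program, range(256), range(256))
```

The loop issues one query per candidate output, so the number of queries grows with the size of D_o regardless of how few outputs are actually reachable.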


STATE OF THE ART

Existing techniques:

DisQuant: Backes et al., S&P 2009.

Employs model checking to compute an equivalence relation R. If R is expressed as linear integer inequalities Ax̄ > b̄ (a bounded integer polytope), the Barvinok algorithm is used to count.

selfcomp: Heusser and Malacaria, ACSAC 2010.

Exploits assume-guarantee reasoning to extend self-composition. Applied to programs in the Linux kernel.


The #SMT problem

The SMT problem

Satisfiability Modulo Theories (SMT) is a decision problem for logical formulas w.r.t. combinations of background theories T expressed in classical first-order logic with equality.

Boolean abstraction BA(ϕ): a bijective function that

- maps Boolean atoms into themselves.
- maps non-Boolean T-atoms into fresh Boolean atoms.


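$$\varphi := \{\neg(x + y > 1) \vee A_1\} \wedge \{(x + y > 1) \vee \neg A_2\} \wedge \{\neg A_3 \vee (y - z < 7)\}$$

$$BA(\varphi) := \{\neg B_1 \vee A_1\} \wedge \{B_1 \vee \neg A_2\} \wedge \{\neg A_3 \vee B_2\}$$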

The #SMT problem

Propositional abstract model counting, or #SMT, is the problem of computing the number of Boolean abstractions of models of a given logical formula.

- The number of Boolean abstractions of the models is always finite.
- #SMT solver: #SAT solver + T-solvers.
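To make the "#SAT solver + T-solvers" split concrete, here is a small Python sketch for the example formula. It follows one plausible reading of the definition: enumerate truth assignments to the atoms of BA(ϕ), keep those that satisfy BA(ϕ), and ask a theory check whether the values chosen for B1 and B2 can be realised by integers x, y, z. The brute-force search over a small integer box is an assumed stand-in for a real T-solver, and all function names are hypothetical.

```python
from itertools import product


def t_consistent(b1: bool, b2: bool, bound: int = 10) -> bool:
    """Toy T-solver: look for integers x, y, z in [-bound, bound] such that
    (x + y > 1) has truth value b1 and (y - z < 7) has truth value b2."""
    rng = range(-bound, bound + 1)
    return any((x + y > 1) == b1 and (y - z < 7) == b2
               for x, y, z in product(rng, repeat=3))


def ba_phi(a1: bool, a2: bool, a3: bool, b1: bool, b2: bool) -> bool:
    """BA(phi) = (not B1 or A1) and (B1 or not A2) and (not A3 or B2)."""
    return (not b1 or a1) and (b1 or not a2) and (not a3 or b2)


def count_abstract_models() -> int:
    count = 0
    for a1, a2, a3, b1, b2 in product([False, True], repeat=5):
        # Propositional part first, theory check second.
        if ba_phi(a1, a2, a3, b1, b2) and t_consistent(b1, b2):
            count += 1
    return count


total = count_abstract_models()
```

For this formula every combination of B1 and B2 has an integer witness, so the theory check never prunes anything and `total` is 12, the number of propositional models of BA(ϕ). With atoms that constrain each other, the T-solver would discard some assignments.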


QIF as a #SMT problem

A set of Boolean variables Φ := {p1, p2, .., pM}, in which each pi corresponds to a bit bi of the output O.

Without any constraints: Φ represents 2^M possible values.

With the constraints from program P: Φ represents N possible values (the possible outputs of the program).


P can be encoded into a logical formula ϕ w.r.t. theories T.

Each pi is a Boolean abstraction of the T-atom expressing the constraints on bit bi → QIF is a #SMT problem.

Program ←→ Logical formula

Model checker ←→ T-solver


An example

base = 8;
if (H < 16) then
    O = base + H
else
    O = base
end if

Figure: Data sanitization program

H is in [0..15].

O is in [8..23].
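As a quick check of these ranges, the program can be run on every secret and the distinct outputs counted; the channel capacity bound is then log2(N). The sketch below assumes an 8-bit unsigned H, which the slides do not state, and the name `sanitize` is made up for illustration.

```python
import math

BASE = 8


def sanitize(h: int) -> int:
    """The data sanitization program from the figure."""
    return BASE + h if h < 16 else BASE


# Assumption: H is an 8-bit unsigned value.
outputs = {sanitize(h) for h in range(256)}
n = len(outputs)
capacity_bits = math.log2(n)
```

Here `outputs` is the set 8..23, so N = 16 and the leakage is bounded by log2(16) = 4 bits, however wide H is.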


Symbolic Quantitative Information Flow

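[Figure: search tree over the output bits. Edges are labelled p1 to p5; the path condition grows from p1 to p1 ∧ p2, p1 ∧ p2 ∧ p3 and p1 ∧ p2 ∧ p3 ∧ p4, which then splits into p1 ∧ p2 ∧ p3 ∧ p4 ∧ p5 and p1 ∧ p2 ∧ p3 ∧ p4 ∧ ¬p5. One node is marked UNSAT.]

assert !(p1 && p2 && p3 && p4 && p5);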


A #DPLL(T) for QIF

function SymCount(Φ, Ψ, N, pc, i)
    Extract pi from Φ
    pc1 ← pc ∧ pi
    if (T-solver(pc1)) then
        if (i == M) then
            Ψ ← Ψ ∪ {pc1}
            N ← N + 1
        else
            SymCount(Φ, Ψ, N, pc1, i + 1)
        end if
    end if
    pc2 ← pc ∧ ¬pi
    ...
end function

Figure: Symbolic counting for QIF
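A runnable Python sketch of the counting procedure, applied to the data sanitization example. Two things are assumptions rather than taken from the slides: the elided ¬pi branch is treated symmetrically to the pi branch, and the T-solver is replaced by a brute-force search over an 8-bit secret. The names `program`, `bit`, `t_solver` and `sym_count` are hypothetical.

```python
M = 5  # number of output bits tracked; 5 bits are enough for outputs 8..23


def program(h: int) -> int:
    """Data sanitization example: O = 8 + H if H < 16, else 8."""
    return 8 + h if h < 16 else 8


def bit(value: int, i: int) -> bool:
    """p_i: the i-th of M bits of value, with i = 1 the most significant."""
    return bool((value >> (M - i)) & 1)


def t_solver(pc, secrets=range(256)) -> bool:
    """Toy T-solver: does some secret produce an output that agrees with
    every (bit index, truth value) literal in the path condition pc?"""
    return any(all(bit(program(h), i) == val for i, val in pc)
               for h in secrets)


def sym_count(pc=(), i=1, found=None):
    """Collect every full assignment to p_1..p_M consistent with the program."""
    found = [] if found is None else found
    for val in (True, False):            # p_i first, then (assumed) not p_i
        pc_next = pc + ((i, val),)
        if t_solver(pc_next):            # prune infeasible prefixes early
            if i == M:
                found.append(pc_next)    # plays the role of Psi
            else:
                sym_count(pc_next, i + 1, found)
    return found


models = sym_count()
n = len(models)  # plays the role of N
```

Because infeasible prefixes are cut as soon as the solver rejects them, the search visits only the part of the 2^M tree that the program can reach. On this example it finds the 16 outputs 8..23.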


Symbolic Execution as a #SMT solver

If a program is encoded as a logical formula, e.g. in Static Single Assignment form, then a Symbolic Execution tool is a #SMT solver for this formula.


if (x > 1) y = x < 5 ? x + 10 : x ; else y = 0 ;

C1 as (x > 1).

C2 as (x < 5).

A1 as (y1 = x + 10).

A2 as (y2 = x).

A3 as (y3 = 0).

C1 ∧ (C2 ∧ A1 ∨ ¬C2 ∧ A2) ∨ ¬C1 ∧ A3

There are 4 models:

{C1 ∧ C2, C1 ∧ ¬C2, ¬C1 ∧ C2, ¬C1 ∧ ¬C2}
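The four candidates are the truth assignments to the branch conditions C1 and C2. A theory check then decides which of them some integer x can realise. The Python sketch below does this by brute force over a small range of x, which is an assumed stand-in for a T-solver; `feasible` and `report` are illustrative names.

```python
from itertools import product


def feasible(c1: bool, c2: bool, domain=range(-50, 51)) -> bool:
    """Is there an x with (x > 1) == c1 and (x < 5) == c2?"""
    return any((x > 1) == c1 and (x < 5) == c2 for x in domain)


# Map each assignment (C1, C2) to whether an integer witness exists.
report = {(c1, c2): feasible(c1, c2)
          for c1, c2 in product([True, False], repeat=2)}

feasible_paths = [assignment for assignment, ok in report.items() if ok]
```

Under this check the assignment ¬C1 ∧ ¬C2 has no witness, since x ≤ 1 and x ≥ 5 cannot both hold. The remaining three assignments correspond to the three executable paths of the program: y = x + 10, y = x and y = 0.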


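$$O = \begin{cases} f_1(i_1, i_2, \ldots, i_M) & \text{if } pc_1 \\ f_2(i_1, i_2, \ldots, i_M) & \text{if } pc_2 \\ \ldots & \ldots \\ f_N(i_1, i_2, \ldots, i_M) & \text{if } pc_N \end{cases}$$

Where:

$$\forall i, j \in [1, N],\ i \neq j:\ pc_i \wedge pc_j = \bot$$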


Symbolic Execution as #DPLL(T )

pc ⊢ c: execute the then path → unit propagation

pc ⊢ ¬c: execute the else path → unit propagation

(pc ⊬ c) ∧ (pc ⊬ ¬c) → branching
    then path: pc1 = pc ∧ c
    else path: pc2 = pc ∧ ¬c
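The three cases can be illustrated with a small Python sketch in which entailment is decided by brute force over a finite domain. A symbolic executor would ask a solver instead; the 8-bit domain and the names `entails` and `step` are assumptions made for the example.

```python
def entails(pc, cond, domain) -> bool:
    """pc |- cond over a finite domain: every value satisfying pc satisfies cond."""
    return all(cond(h) for h in domain if pc(h))


def step(pc, cond, domain=range(256)) -> str:
    """Classify what happens when execution reaches `if (cond)` under pc."""
    if entails(pc, cond, domain):
        return "unit propagation: then path"
    if entails(pc, lambda h: not cond(h), domain):
        return "unit propagation: else path"
    return "branching"


pc = lambda h: h < 16
forced_then = step(pc, lambda h: h < 32)   # pc already implies the condition
forced_else = step(pc, lambda h: h >= 16)  # pc already implies its negation
both_open = step(pc, lambda h: h < 8)      # neither is implied
```

Only the last case creates two new path conditions, pc ∧ c and pc ∧ ¬c; in the first two the condition adds no choice, which is the analogue of unit propagation.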


SQIF-SE: SQIF by Symbolic Execution

base = 8;
if (H < 16) then
    O = base + H
else
    O = base
end if
for all element bi in vector bvo do
    if (bi == 1) then
        pi = True
    else
        pi = False
    end if
end for

Figure: Additional conditions
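Executed concretely, the instrumented program looks like the Python sketch below. The 5-bit width and the most-significant-bit-first order of the vector bvo are assumptions for illustration; under symbolic execution each `if (bi == 1)` test would become a branch on the symbolic output rather than a concrete check.

```python
M = 5  # assumed width of the bit vector bvo


def instrumented(h: int):
    """Run the sanitization program, then derive one Boolean p_i per output bit."""
    base = 8
    o = base + h if h < 16 else base

    # bvo: the bits of O, most significant first (assumed order).
    bvo = [(o >> (M - 1 - k)) & 1 for k in range(M)]

    p = []
    for b in bvo:
        if b == 1:
            p.append(True)
        else:
            p.append(False)
    return o, p


o, p = instrumented(3)
```

For H = 3 the output is 11, binary 01011, so `p` is [False, True, False, True, True].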


SQIF-SE: SQIF by Symbolic Execution

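[Figure: symbolic execution tree. From the root state s1, the program conditions H < 16 and H ≥ 16 lead to states s2 and s3, with pc := (H < 16) and pc := (H ≥ 16). Each state then branches on p1 and p2, extending the path condition to pc ∧ p1 and then to pc ∧ p1 ∧ p2 and pc ∧ p1 ∧ ¬p2.]

(H ≥ 16) and (H < 16): program conditions.

p1, p2, ..: additional conditions.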


Soundness and Completeness

Theoretically, the SQIF approach is both sound and complete.

1 In practice, SQIF is sound and complete for small leaks.

2 SQIF-SE is sound and complete for a bounded model of the program.

Does it leak more than k?

Quantifying information leaks in software. Heusser and Malacaria (ACSAC 2010).

With a user policy k, SQIF may not be complete, but the secure/insecure verdict is always sound.
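One way to read the policy check, offered as an assumption since the slides do not spell out the mechanism: counting can stop as soon as more than 2^k distinct outputs have been found, because log2(N) then already exceeds k. A Python sketch with hypothetical names:

```python
def sanitize(h: int) -> int:
    """Data sanitization example: O = 8 + H if H < 16, else 8."""
    return 8 + h if h < 16 else 8


def leaks_more_than(outputs, k: int) -> bool:
    """Return True once more than 2**k distinct outputs have been seen."""
    limit = 2 ** k
    seen = set()
    for o in outputs:
        seen.add(o)
        if len(seen) > limit:
            return True   # insecure: the remaining outputs need not be counted
    return False          # reached only after every output was examined


# Assumed 8-bit secret, as in the earlier example.
insecure_for_3_bits = leaks_more_than((sanitize(h) for h in range(256)), 3)
insecure_for_4_bits = leaks_more_than((sanitize(h) for h in range(256)), 4)
```

The early exit does not compute the exact count N, which matches the remark that completeness may be lost, but an "insecure" answer is still justified by the outputs already found.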


Experiment

Two prototype tools:

jpf-qif
- a tool for Java, also developed in Java.
- built on top of Symbolic Pathfinder (the Symbolic Execution extension of Java Pathfinder).

sqifc
- a tool for C, also developed in C.
- built on top of CBMC (a Bounded Model Checking tool for C).

Compared with selfcomp (Heusser and Malacaria, ACSAC 2010).


CVE-2011-2208

int osf_getdomainname(char __user *name, int namelen)
{
    unsigned len;
    int i, error;

    error = verify_area(VERIFY_WRITE, name, namelen);
    if (error)
        goto out;

    len = namelen;
    if (namelen > 32)
        len = 32;

    down_read(&uts_sem);
    for (i = 0; i < len; ++i) {
        __put_user(system_utsname.domainname[i], name + i);
        if (system_utsname.domainname[i] == '\0')
            break;
    }
    up_read(&uts_sem);
out:
    return error;
}

Figure: arch/alpha/kernel/osf_sys.c


CVE-2011-1078

static int sco_sock_getsockopt_old(struct socket *sock, int optname,
                                   char __user *optval, int __user *optlen)
{
    struct sock *sk = sock->sk;
    struct sco_conninfo cinfo;
    int len, err = 0;
    ...

    lock_sock(sk);

    switch (optname) {
    case SCO_OPTIONS:
        ...

    case SCO_CONNINFO:
        ...

        cinfo.hci_handle = sco_pi(sk)->conn->hcon->handle;
        memcpy(cinfo.dev_class, sco_pi(sk)->conn->hcon->dev_class, 3);

        len = min_t(unsigned int, len, sizeof(cinfo));
        if (copy_to_user(optval, (char *)&cinfo, len))
            err = -EFAULT;
        break;
    ...
    }

    release_sock(sk);
    return err;
}

Figure: net/bluetooth/sco.c


Cyclic Redundancy Check

unsigned char GetCRC8(unsigned char check,
                      unsigned char ch)
{
    int i, sft;  /* note: sft is never assigned in this excerpt */
    for (i = 0; i < 8; i++) {
        if (check & 0x80) {
            check <<= 1;
            if (ch & 0x80) {
                check = check | 0x01;
            } else {
                check = check & 0xfe;
            }
            check = check ^ 0x85;
        } else {
            check <<= 1;
            if (ch & 0x80) {
                check = check | 0x01;
            } else {
                check = check & 0xfe;
            }
        }
        ch <<= 1;
    }
    check >>= sft;
    return check;
}

Figure: Cyclic Redundancy Check


Tax Record

[Figure: UML class diagram of the tax program]

Classes: TaxRecord, TaxPayer, TaxChecker, Charity, TaxServer

TaxChecker
    checkTaxes(tr:TaxRecord4taxChecker): int

<<interface>> TaxRecord4taxPayer
    getTaxes(): int
    getAmountPayed(): int
    payTaxes(don:int, amnt:int)

<<interface>> TaxRecord4taxChecker
    verifyPayment(): int
    freeze(): int

<<interface>> TaxServer4charity
    getCharity(): int

Association role names: taxPayer, taxRecord, checker, server, taxRecords (multiplicities 1 and *)

Figure: The tax program

taxChecker1: income × F% + donation > payment

taxChecker2: income × F% + donation − payment

jpf-qif: channel capacity of 4.86 bits


DEMO


Case Study             LoC      Language   sqifc     jpf-qif   selfcomp
Data Sanitization      < 10     C/Java     28.179    20.695    timed out
CVE-2011-2208 (64)     > 200    C          22.759    ×         119.117
CVE-2011-2208 (256)             C          88.196    ×         timed out
CVE-2011-1078 (8)      > 200    C          10.380    ×         13.853
CVE-2011-1078 (64)              C          37.899    ×         timed out
CRC (8)                < 30     C/Java     1.209     8.386     0.498
CRC (32)                        C/Java     8.657     9.357     timed out
Tax Record             267      Java       ×         24.988    ×

Figure: Times in seconds for all case studies; the timeout is 30 minutes.


Conclusions

1 Introduction of a new research problem, #SMT, and its applications to QIF and Symbolic Execution.

2 A framework, called #DPLL(T), to build a solver for #SMT-based QIF.

3 The methodology of Symbolic Execution recast as #DPLL(T).

4 Two prototype tools for QIF: sqifc and jpf-qif.

5 Experiments with the tools on non-trivial case studies.


THANK YOU FOR YOUR ATTENTION!
