QUANTITATIVE INFORMATION FLOW AS NETWORK FLOW CAPACITY Stephen McCamant and Michael D. Ernst Reading Group 9/18/08 Slides by Michelle Goodstein



QUANTITATIVE INFORMATION FLOW AS NETWORK FLOW CAPACITY

Stephen McCamant and Michael D. Ernst

Reading Group 9/18/08
Slides by Michelle Goodstein


MOTIVATION
Subset of inputs are secret
Subset of outputs are public
Express confidentiality as a limit on the number of secret bits revealed in public outputs
  Quantitative information flow security
Goal: Develop a scheme for dynamic quantitative information flow analysis
  Dynamic: examine actual runs
Developed system for single-threaded apps only


MOTIVATION

Problem is similar to max flow
  Distinguished beginning, end
  Secret inputs “flow” from start
  Can take many routes to the end
  Want to know: how many secret bits reach the end?
Invent a “gadget” to convert a dynamic execution trace into a flow graph
Roughly: Max flow bounds the number of bits of secret information revealed
  Max-flow/min-cut theorem


OUTLINE

Dynamic Max-Flow Analysis
Soundness and Consistency
Implementation Details
Checking Flow Bounds
Case Studies
Conclusions


DYNAMIC MAX-FLOW ANALYSIS

Edges:
  Capacities are in # of (secret) bits an edge can hold
Nodes:
  Represent basic operations, memory locations, registers
  In-degree of node corresponds to arity of operation
Goal: Graph where max flow soundly bounds information dissemination


“GADGET” EXAMPLE

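c = a + b
  32-bit integers, a & b part of “secret input”
[Figure: flow-graph gadget for c = a + b. Nodes a and b each have a 32-bit edge into the “+” node, which has a 32-bit edge to c.]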


DYNAMIC MAX-FLOW ANALYSIS

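Output used multiple times: Modify gadget to properly bound flow
Example: d = c = a + b
[Figure: two flow graphs for d = c = a + b. In both, a and b each feed “+” over a 32-bit edge, and c and d each receive a 32-bit edge. In one graph the result of “+” passes through a single 32-bit edge before fanning out to c and d; in the other it does not.]
Without the shared 32-bit edge, max flow from a,b to c,d: 64 bits
With the shared 32-bit edge, max flow from a,b to c,d: 32 bits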
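The difference between the two gadgets can be checked with a small max-flow computation. The sketch below is illustrative only: the node names ("src", "sum", "sink") and the choice of Edmonds-Karp are assumptions of this example, not details taken from the paper or the tool.

```python
from collections import deque, defaultdict

def max_flow(edges, source, sink):
    """Edmonds-Karp max flow; edges is a list of (u, v, capacity_in_bits)."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, cap in edges:
        residual[u][v] += cap
        residual[v][u] += 0  # make sure the reverse edge exists
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Collect the path, find its bottleneck, and push flow along it.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        flow += push

inputs = [("src", "a", 32), ("src", "b", 32), ("a", "+", 32), ("b", "+", 32)]
outputs = [("c", "sink", 32), ("d", "sink", 32)]

# Naive gadget: "+" fans out directly to c and d.
naive = inputs + [("+", "c", 32), ("+", "d", 32)] + outputs
# Modified gadget: the 32-bit result crosses one shared edge first.
modified = inputs + [("+", "sum", 32), ("sum", "c", 32), ("sum", "d", 32)] + outputs

naive_bits = max_flow(naive, "src", "sink")
modified_bits = max_flow(modified, "src", "sink")
```

The shared edge is what stops the same 32-bit sum from being counted twice when it is copied to both c and d.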


IMPLICIT FLOWS

General programs are more complicated than circuits
  Branches, arrays, pointers indirectly affect computation
  Example: array[5] == 0 implies prior accesses did not touch the 5th location
  Call this implicit data flow
Need to fix graph to account for implicit flows
  Solution: add edges that represent all possible data flows


IMPLICIT FLOW SOLUTION

Implicit flows are contained in an enclosure with defined inputs/outputs
  Enclosures make code appear to be straight-line
Idea: Add edges from “implicit flows” to outputs of enclosure
Examples: Square root
  Special square root instruction: explicit flow
  Uses branches, loops: assume all can affect final value
One problem: how to represent an edge from a “flow” to an output


ENCLOSURE EXAMPLE

[Figure: enclosure example with labels Start, 1st Enclosure, 2nd Enclosure, End]


IMPLICIT FLOW SOLUTION

Assign edge capacity, in bits, equal to log2 of the number of possible different executions

2-way branch: add edge with a 1-bit capacity

Pointer op: add edge with capacity equal to number of secret bits in pointer value
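Since capacities are measured in bits, a choice among n possible executions needs about log2(n) bits of capacity. A minimal sketch of that arithmetic follows; the function name is hypothetical, and rounding up to a whole number of bits is an assumption of this example.

```python
def implicit_flow_capacity_bits(num_outcomes):
    """Bits needed to distinguish num_outcomes executions (ceiling of log2)."""
    if num_outcomes < 1:
        raise ValueError("need at least one possible outcome")
    # For n >= 1, (n - 1).bit_length() equals ceil(log2(n)).
    return (num_outcomes - 1).bit_length()

two_way_branch = implicit_flow_capacity_bits(2)
# A pointer with 8 secret bits can select among 2**8 targets.
pointer_with_8_secret_bits = implicit_flow_capacity_bits(2 ** 8)
```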


IMPLICIT FLOW SOLUTION

Enclosure regions are either
  Annotated explicitly by the programmer
    Used in most of the case studies
  Inferred using static analysis
    Pilot study conducted
“Additional edges” don’t actually go to enclosure outputs
  Instead, add a distinguished node
  All flows have an edge to the distinguished node, and the distinguished node has edges to outputs


DETERMINING EDGE CAPACITIES

In order to assign edge capacities, need to know how many bits in a data value are “secret”
  Can only leak as many secret bits as a value contains
Computed like Taintcheck, but at the bit level
  Create shadow bit vectors
  Track taint of each bit
  When creating an edge: the number of tainted bits provides the capacity bound
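The shadow-bit idea can be sketched as follows. This is an illustration under assumptions: the `Shadowed` class and the propagation rules for AND, shift, and add are simplified stand-ins chosen for this example, not the rules the actual tool uses.

```python
MASK32 = 0xFFFFFFFF

class Shadowed:
    """A 32-bit value paired with a shadow bit vector (1 = bit may be secret)."""
    def __init__(self, value, taint=0):
        self.value = value & MASK32
        self.taint = taint & MASK32

def and_const(x, const):
    # A result bit can depend on the secret only where the public mask bit is 1.
    return Shadowed(x.value & const, x.taint & const)

def shift_left(x, k):
    # Taint moves with the bits; bits shifted past bit 31 are dropped.
    return Shadowed(x.value << k, x.taint << k)

def add(x, y):
    # Conservative rule: carries may spread taint upward from the lowest tainted bit.
    t = x.taint | y.taint
    if t == 0:
        spread = 0
    else:
        lowest = t & -t
        spread = MASK32 & ~(lowest - 1)
    return Shadowed(x.value + y.value, spread)

def capacity_bits(x):
    """Edge capacity bound: the number of tainted bits in the value."""
    return bin(x.taint).count("1")

secret_byte = Shadowed(0xAB, taint=0xFF)     # 8 secret bits
low_nibble = and_const(secret_byte, 0x0F)    # only 4 secret bits survive
shifted = shift_left(low_nibble, 4)
summed = add(shifted, Shadowed(1))           # public constant has no taint
```

Masking with a public constant shrinks the capacity from 8 bits to 4, which is the kind of precision gain that bit-level tracking buys over byte-level tainting.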


EXAMPLE

Exposes 9 bits of secret input
  1 bit from branch
  8 bits from counter


ADVERSARIAL MODEL
A bound of k bits is sound if an adversary could have communicated the message directly using a k-bit code
Consider deterministic, public programs
  Public inputs fixed in advance
Alice and Bob agree on a set of messages ahead of time
Alice communicates to Bob via the program by manipulating secret inputs
Bob can only observe public inputs, public outputs and program code
Program is a channel for communication
  Bound is channel capacity


ADVERSARIAL MODEL: SOUND

Alice sends an input i ∈ I
  Tool reports bound k(i)
Tool is sound iff there is also a code c where, for each message i, Alice and Bob could have communicated i using exactly k(i) bits


ADVERSARIAL EXAMPLE

Assume Divide(a,b) returns c = a/b
  Alice controls inputs a, b
  Bob sees public output c
Alice sets a=2, b=0 for “Attack”
  Bob observes “exception”
Alice sets a=4, b=1 for “Don’t attack”
  Bob observes “4”
Code c: 1 → Attack, 0 → Don’t attack
  1-bit bound is sound
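The Divide channel can be acted out in a few lines. This is only a sketch of the example above: the helper names `send` and `receive`, the use of integer division, and reporting the exception as the string "exception" are assumptions of this illustration.

```python
def divide(a, b):
    """The public program: Bob sees either the quotient or an exception."""
    try:
        return str(a // b)
    except ZeroDivisionError:
        return "exception"

# Alice's agreed-upon choice of secret inputs for each message.
ENCODE = {"Attack": (2, 0), "Don't attack": (4, 1)}
# Bob's agreed-upon reading of the public output.
DECODE = {"exception": "Attack", "4": "Don't attack"}

def send(message):
    a, b = ENCODE[message]
    return divide(a, b)

def receive(public_output):
    return DECODE[public_output]

observed = send("Attack")
decoded = receive(observed)
```

Only two outputs are ever distinguished, so the same information could have been sent directly with a 1-bit code.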


IMPLICATIONS OF ADVERSARIAL MODEL

Impossible for many distinct outputs to all have bounds saying they reveal (almost) nothing about the secret input
  Kraft’s inequality: $\sum_{i \in I} 2^{-k(i)} \le 1$
Bound of 0 bits
  Public output does not depend on secret inputs
  Fixed public inputs determine public output
2^k possibilities
  Bound of ≥ k bits
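Kraft's inequality says the reported bounds k(i) must satisfy: the sum over all messages i of 2^(-k(i)) is at most 1. A quick way to see its consequences is to evaluate that sum for candidate sets of bounds. The function names below are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def kraft_sum(bounds):
    """Sum of 2**(-k) over the reported bounds k (one per distinguishable message)."""
    return sum(Fraction(1, 2 ** k) for k in bounds)

def satisfies_kraft(bounds):
    """False means no code with these lengths exists, so the bounds cannot all be sound."""
    return kraft_sum(bounds) <= 1

one_output_zero_bits = satisfies_kraft([0])        # a single possible output
two_outputs_zero_bits = satisfies_kraft([0, 0])    # two outputs cannot both cost 0 bits
four_outputs_two_bits = satisfies_kraft([2, 2, 2, 2])
five_outputs_two_bits = satisfies_kraft([2, 2, 2, 2, 2])
```

This matches the two cases on the slide: a bound of 0 bits is possible only when there is a single public output, and 2^k distinguishable outputs cannot all have bounds below k bits.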


SOUNDNESS AND CONSISTENCY

Soundness is measured over multiple runs
  Graph, as defined, only operates over one dynamic run
Without getting into detail…
  Can “merge” these graphs using union-find
  Takes almost-linear time
  Bound for merged graph is sound for original graphs
  Any cut in the combined graph is also valid in the original graphs
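A rough sketch of merging per-run graphs with union-find is shown below. Several things here are assumptions made for illustration and are not stated on the slide: nodes are identified by a static label, nodes with the same label in different runs are coalesced, and capacities of coalesced edges are added.

```python
class UnionFind:
    """Disjoint sets with path compression and union by size."""
    def __init__(self):
        self.parent = {}
        self.size = {}

    def find(self, x):
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return ra

def merge_runs(runs):
    """runs: list of edge lists [(u_label, v_label, capacity_bits)], one per run."""
    uf = UnionFind()
    first_seen = {}
    # Nodes are (run_index, label); coalesce nodes that share a label.
    for r, edges in enumerate(runs):
        for u, v, _ in edges:
            for label in (u, v):
                node = (r, label)
                uf.union(first_seen.setdefault(label, node), node)
    merged = {}
    for r, edges in enumerate(runs):
        for u, v, cap in edges:
            # Representatives keep the shared label in position 1.
            key = (uf.find((r, u))[1], uf.find((r, v))[1])
            merged[key] = merged.get(key, 0) + cap  # assumed: capacities add
    return merged

run1 = [("secret", "+", 8), ("+", "out", 8)]
run2 = [("secret", "+", 8), ("+", "out", 1)]
merged = merge_runs([run1, run2])
```

With path compression and union by size, a sequence of finds and unions takes almost-linear time, which is consistent with the running time mentioned above.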


IMPLEMENTATION
Valgrind, for Linux/x86
Associate positive integer tags with any values that could contain secret information
  Registers, each byte in memory gets a tag
  Tag == 0: no secret information, not necessary to include in graph
For each operation, if at least one input has a nonzero tag, generate graph nodes and edges appropriately
Some optimizations for arrays possible
  Descriptions for large memory regions, along with exception lists


MAXIMUM FLOW

Solving for maximum flow takes O(VE)
  V = # of vertices
  E = # of edges
  For n nodes, potentially O(n^3)
Want: Linear in actual program runtime
Soln: Collapse edges, nodes to shrink graph size
  Sacrificing some precision


USING THE MAX-FLOW IN LATER RUNS

So far: Can observe ≥ 1 run, calculate a max flow for all observed runs

But later runs may have different inputs and different outputs

Question: how can the flow be used again without rerunning the entire algorithm?

Answer: Use Max-flow/Min-cut theorem


MAX-FLOW/MIN-CUT

Once a max flow is found…
…use it to find a min-cut (DFS)

Cut edges show where information flows from secret inputs to public outputs

If no other flows occur, the bound is sound

Commentary: if other flows do occur, no guarantee…
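The step from a max flow to a min cut can be sketched as follows: after the flow is maximal, a depth-first search over edges with leftover residual capacity marks everything still reachable from the source, and the cut consists of the original edges that leave that reachable set. The graph and node names below are assumptions of this example (the 32-bit adder gadget with a shared result edge), and Edmonds-Karp is used only because it is short.

```python
from collections import deque, defaultdict

def min_cut(edges, source, sink):
    """Return the list of (u, v, capacity) edges in a minimum source-sink cut."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, cap in edges:
        residual[u][v] += cap
        residual[v][u] += 0  # make sure the reverse edge exists

    def bfs_parents():
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    # Phase 1: push flow along shortest augmenting paths until none remain.
    while True:
        parent = bfs_parents()
        if sink not in parent:
            break
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push

    # Phase 2: DFS over edges with leftover residual capacity.
    reachable, stack = {source}, [source]
    while stack:
        u = stack.pop()
        for v, cap in residual[u].items():
            if cap > 0 and v not in reachable:
                reachable.add(v)
                stack.append(v)

    # Cut edges go from the reachable side to the unreachable side.
    return [(u, v, cap) for u, v, cap in edges
            if u in reachable and v not in reachable]

graph = [("src", "a", 32), ("src", "b", 32), ("a", "+", 32), ("b", "+", 32),
         ("+", "sum", 32), ("sum", "c", 32), ("sum", "d", 32),
         ("c", "sink", 32), ("d", "sink", 32)]
cut = min_cut(graph, "src", "sink")
cut_capacity = sum(cap for _, _, cap in cut)
```

The total capacity of the cut equals the max flow, so the cut edges name the program points through which all counted secret bits must pass.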


FIXING THE SOUNDNESS

On future runs, use Taintcheck
  When encountering an operation corresponding to an edge in the cut: clear all taint bits
  If anything is tainted at the end: followed a new flow path
  Nothing tainted at the end: bound was sound
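The check can be sketched on a toy execution trace. Everything about the trace format is an assumption of this example: each step is a hypothetical (location, destination, sources) triple, and taint is tracked per variable rather than per bit to keep the sketch short.

```python
def tainted_outputs(trace, secret_inputs, cut_locations, public_outputs):
    """Return the public outputs still tainted after clearing taint at the cut.

    trace: list of (location, dest, sources) steps in execution order.
    """
    tainted = set(secret_inputs)
    for location, dest, sources in trace:
        if location in cut_locations:
            # Bits crossing the cut are already counted in the bound.
            tainted.discard(dest)
        elif any(s in tainted for s in sources):
            tainted.add(dest)
        else:
            # Destination overwritten with untainted data.
            tainted.discard(dest)
    return tainted & set(public_outputs)

# Run that only leaks through the known cut at location "L1".
known_path = [("L1", "h", ["key"]), ("L2", "out", ["h"])]
# Run that also copies the key to another output without crossing the cut.
new_path = known_path + [("L3", "out2", ["key"])]

ok = tainted_outputs(known_path, {"key"}, {"L1"}, {"out"})
leak = tainted_outputs(new_path, {"key"}, {"L1"}, {"out", "out2"})
```

An empty result means every secret-dependent output passed through the cut, so the earlier bound still applies to this run; a nonempty result flags a flow path the cut did not cover.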


FIXING THE SOUNDNESS

Can also run two versions of the app in lockstep
  E1 gets secret input, E2 gets “fake” secret input
  Reach nodes corresponding to the cut, E1 sends values to E2
  If they have the same outputs at the end, then no new flow path was followed.
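The two-execution idea can be imitated in miniature. In this sketch the "program" is a Python function that calls a hypothetical `cut` hook wherever a value crosses the cut; E1 is run first and its cut values are replayed into E2, which is a sequential simplification of true lockstep execution. None of these names come from the paper.

```python
def run(program, secret, cut_log=None):
    """Run program; record cut values if cut_log is None, otherwise replay them."""
    recorded = []
    replay = iter(cut_log) if cut_log is not None else None

    def cut(value):
        if replay is None:
            recorded.append(value)
            return value
        return next(replay)  # E2 receives the value E1 had at this cut point

    return program(secret, cut), recorded

def same_outputs(program, real_secret, fake_secret):
    out1, log = run(program, real_secret)                 # E1: real secret
    out2, _ = run(program, fake_secret, cut_log=log)      # E2: fake secret
    return out1 == out2

def parity_only(secret, cut):
    # All secret-dependent data crosses the cut.
    return cut(secret % 2)

def leaky(secret, cut):
    # The comparison reaches the output without crossing the cut.
    return (cut(secret % 2), secret > 100)

parity_ok = same_outputs(parity_only, 7, 0)
leak_detected = not same_outputs(leaky, 200, 0)
```

Matching outputs only show that this particular run followed no new flow path; a run of `leaky` with a small secret would also match.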


WHY THIS MIGHT BE USEFUL…

Can use to limit the number of bits of information leaked in any particular execution
Results from one execution do not necessarily transfer to another
  Unless deterministic programs and equivalent inputs
Main use in debugging/testing


CASE STUDIES

Paper presented 5 case studies & pilot study for inferring enclosure regions
I’ll go over 3 case studies:
  Battleship
  OpenSSH
  ImageMagick


KBATTLESHIP

http://games.kde.org/game.php?game=kbattleship


KBATTLESHIP

Network messages between players call method shipTypeAt() to determine what type of ship is at location (x,y), if any
shipTypeAt() returns an integer length
  Nonzero: ship is there
  Particular value: indicates which ship is there
Information can be used to write a modified program that infers extra information
Tool shows patched version leaks at most 2 bits
  Whether or not a ship was hit
  If hit, whether the hit was fatal


OPENSSH

Marked private key as secret during authentication

Tool finds 128 bits of information about the secret key are revealed

Cut location corresponds to an MD5 checksum


IMAGEMAGICK

Suppose you wish to obscure part of an image

Images from http://people.csail.mit.edu/smcc/projects/secret-flow/


IMAGEMAGICK

Which technique does the best job?


IMAGEMAGICK
In fact…
[Figure: image panels labeled Swirled, Unswirled, Original, Original]


CONCLUSIONS

Interesting application of Max-flow/Min-cut to network security
Can be applied to a variety of programs
Uses dynamic analysis instead of static
  Can sometimes make inferences static couldn’t
  Bounds don’t necessarily hold across multiple runs
Framework designed for single-threaded applications


BACKUP SLIDES


IMPLEMENTATION

Memory utilization
  Not merging graphs: can write graph immediately to file
  Merging graphs:
    Merging can be done piecewise
    Only current graph + info about nodes that still correspond to values in registers/memory


INFERRING ENCLOSURE REGIONS

Static analysis for C code
  Based on CIL framework
  No alias analysis

72% of regions found by hand are discovered by simple pilot program

Adding array and introprocedural aliasing necessary to infer full set of enclosures

bzip was an outlier


MAX FLOW RUNTIME
Ran on bzip2, example of worst case
  Computationally intensive, relies extensively on input, uses large arrays
Used inputs of various sizes
  Inputs: digits of π in words (“Three point one four one five nine”)
  Highly compressible input
Estimated bound of flow to be the portion of output that depends on the input