Page 1

Merlin: Inferring Specifications for Explicit Information Flow Problems

Ben Livshits, Aditya Nori, Sriram Rajamani, Anindya Banerjee

Page 2

Web application vulnerabilities are a serious threat addressed by static analysis tools

Microsoft CAT.NET

Page 3

MOTIVATION & PROJECT GOALS

Page 4

When it comes to static analysis tools, specification quality affects result quality

More specification means more bugs found

Better specification means fewer false positives

Page 5

A typical specification includes dozens of sources, sinks, and sanitizers

• Specification
  Sources: start taint
  Sinks: taint not allowed
  Sanitizers: untaint data

Type          Count   Revisions
Sources          27          11
Sanitizers        7           2
Sinks            77          10
Total           111          23

Page 6

Tools are only as good as the specification, and a good specification is hard to come by

• This example
  ReadData1, ReadData2 – source?
  Cleanse – sanitizer?
  WriteData – sink?
• Large scale
  Libraries with their own APIs
  Specification particular to application

void ProcessRequest()
{
    string s1 = ReadData1("name");
    string s2 = ReadData2("encoding");

    string s3 = Cleanse(s1);

    WriteData("Parameter " + s1);
    WriteData("Header " + s2);
}
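To make the question marks above concrete, here is a minimal Python sketch of explicit-flow taint tracking over the ProcessRequest routine. The role assignment in SPEC is an assumption made for illustration (the slide leaves the roles open), and STEPS and find_violations are hypothetical names, not part of CAT.NET or Merlin.

```python
# Hypothetical specification: the roles below are assumed, not stated in the slides.
SPEC = {
    "sources": {"ReadData1", "ReadData2"},
    "sanitizers": {"Cleanse"},
    "sinks": {"WriteData"},
}

# ProcessRequest flattened into (result variable, callee, argument variables).
STEPS = [
    ("s1", "ReadData1", []),
    ("s2", "ReadData2", []),
    ("s3", "Cleanse", ["s1"]),
    (None, "WriteData", ["s1"]),
    (None, "WriteData", ["s2"]),
]


def find_violations(steps, spec):
    """Report every call to a sink that receives a tainted argument."""
    tainted = set()
    violations = []
    for index, (result, callee, args) in enumerate(steps):
        tainted_args = [a for a in args if a in tainted]
        if callee in spec["sinks"] and tainted_args:
            violations.append((index, callee, tainted_args))
        if result is None:
            continue
        if callee in spec["sources"]:
            tainted.add(result)        # sources start taint
        elif callee in spec["sanitizers"]:
            tainted.discard(result)    # sanitizers return untainted data
        elif tainted_args:
            tainted.add(result)        # any other call propagates taint
        else:
            tainted.discard(result)
    return violations


if __name__ == "__main__":
    for index, callee, args in find_violations(STEPS, SPEC):
        print(f"step {index}: tainted {args} reaches sink {callee}")
```

Dropping an entry from SPEC changes what is reported: with ReadData2 removed from the sources, the second WriteData call is no longer flagged, which is the sense in which the analysis is only as good as its specification.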

Page 7

ALGORITHMS

Page 8

Merlin Processing

[Figure: pipeline diagram with boxes Initial specification, Program, Merlin inference, Final specification, Static analysis, Vulnerabilities]

Merlin inference steps:
1. Propagation graph construction
2. Factor graph construction
3. Probabilistic inference

Page 9

We convert the propagation graph we get from CAT.NET to a reduced propagation graph

[Figure: propagation graph with nodes ReadData1, ReadData2, Prop1, Prop2, Cleanse, WriteData]
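A propagation graph of this kind can be sketched as a plain adjacency map. The node names come from the slide, but the edge set below is an assumption (the transcript lost the arrows of the figure), and build_graph and acyclic_paths are hypothetical helper names. The sketch does not attempt to reproduce the reduction step, which the slide does not describe.

```python
# Assumed edges: the figure's arrows are not recoverable from the transcript.
ASSUMED_EDGES = [
    ("ReadData1", "Prop1"),
    ("Prop1", "Cleanse"),
    ("Cleanse", "WriteData"),
    ("ReadData2", "Prop2"),
    ("Prop2", "WriteData"),
]


def build_graph(edges):
    """Build an adjacency map in which every node appears as a key."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])
    return graph


def acyclic_paths(graph, start, goal, prefix=()):
    """Yield every path from start to goal that visits no node twice."""
    path = prefix + (start,)
    if start == goal:
        yield list(path)
        return
    for nxt in graph.get(start, []):
        if nxt not in path:
            yield from acyclic_paths(graph, nxt, goal, path)


if __name__ == "__main__":
    graph = build_graph(ASSUMED_EDGES)
    for source in ("ReadData1", "ReadData2"):
        for path in acyclic_paths(graph, source, "WriteData"):
            print(" -> ".join(path))
```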

Page 10

ReadData1, Cleanse, WriteData

ReadData2

[Figure: propagation graph with nodes ReadData1, ReadData2, Prop1, Prop2, Cleanse, WriteData]

Page 11

ReadData1, ReadData2, WriteData

Cleanse

[Figure: propagation graph with nodes ReadData1, ReadData2, Prop1, Prop2, Cleanse, WriteData]

Page 12

ReadData1, ReadData2, Cleanse

WriteData

[Figure: propagation graph with nodes ReadData1, ReadData2, Prop1, Prop2, Cleanse, WriteData]

Page 13

Avoid source wrappers: Prop1 is not a source

[Figure: propagation graph with nodes ReadData1, ReadData2, Prop1, Prop2, Cleanse, WriteData]

Page 14

Avoid sink wrappers: Cleanse is not a sink

[Figure: propagation graph with nodes ReadData1, ReadData2, Prop1, Prop2, Cleanse, WriteData]

Page 15

Avoid double sanitizers: Prop1 is not a sanitizer

[Figure: propagation graph with nodes ReadData1, ReadData2, Prop1, Prop2, Cleanse, WriteData]

Page 16

We derive probabilistic constraints from the reduced propagation graph

Page 17

We approximate path constraints with triple constraints

B1: For every acyclic path m_1, m_2, …, m_{k-1}, m_k, where m_1 is a potential source and m_k is a potential sink, the joint probability of classifying m_1 as a source, m_k as a sink, and all of m_2, …, m_{k-1} as regular nodes is low.

C1: For every triple of nodes ⟨m_1, m_2, m_3⟩, where m_1 is a potential source, m_3 is a potential sink, and m_1 and m_3 are connected by a path through m_2 in the propagation graph, the joint probability that m_1 is a source, m_2 is not a sanitizer, and m_3 is a sink is low.

Constraint counts: 2^N for path constraints (B1), N^3 for triple constraints (C1)
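The triple constraint C1 can be enumerated with plain reachability instead of listing paths. The following Python sketch assumes an acyclic example graph whose edges are invented for illustration, and reachable and triples are hypothetical helper names. It shows that the number of triples is bounded by N^3 for N nodes.

```python
# Assumed example graph; the edges are illustrative, not taken from the slides.
GRAPH = {
    "ReadData1": ["Prop1"],
    "Prop1": ["Cleanse"],
    "Cleanse": ["WriteData"],
    "ReadData2": ["Prop2"],
    "Prop2": ["WriteData"],
    "WriteData": [],
}


def reachable(graph, start):
    """Return all nodes reachable from start by one or more edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def triples(graph, potential_sources, potential_sinks):
    """Enumerate (m1, m2, m3) where a path leads from m1 through m2 to m3."""
    reach = {node: reachable(graph, node) for node in graph}
    result = []
    for m1 in sorted(potential_sources):
        for m2 in sorted(reach[m1]):
            for m3 in sorted(potential_sinks):
                if m3 not in (m1, m2) and m2 != m1 and m3 in reach[m2]:
                    result.append((m1, m2, m3))
    return result


if __name__ == "__main__":
    found = triples(GRAPH, {"ReadData1", "ReadData2"}, {"WriteData"})
    for triple in found:
        print(triple)
    print(len(found), "triples, bound", len(GRAPH) ** 3)
```

Each triple would contribute one constraint saying that "m1 is a source, m2 is not a sanitizer, m3 is a sink" is unlikely to hold jointly.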

Page 18

Probabilistic inference

Before inference:

             Source   Sanitizer   Sink
ReadData1    .95      .001        .001
ReadData2    .5       .5          .5
Cleanse      .5       .5          .5
WriteData    .5       .5          .85

After inference:

             Source   Sanitizer   Sink
ReadData1    .95      .001        .001
ReadData2    .5       .5          .5
Cleanse      .01      .997        .03
WriteData    .5       .5          .85

Page 19

Direct constraint representation is too big. Factor graphs to the rescue.

Variables:
x_ReadData1, x_ReadData2, x_Prop1, x_Prop2, x_Cleanse, x_WriteData

Factors:
f_C3(x_Prop1, x_Prop2), f_C4(x_Prop1)
f_C2(x_Prop1, x_Prop2), f_C4(x_Prop2)
f_C2(x_Prop2, x_Cleanse), f_C4(x_Cleanse)
f_C3(x_Prop2, x_Cleanse), f_C4(x_WriteData)
f_C1(x_ReadData1, x_Prop1, x_WriteData)
f_C1(x_ReadData1, x_Prop1, x_WriteData)
f_C1(x_ReadData2, x_Prop2, x_WriteData)
f_C1(x_ReadData1, x_Prop1, x_WriteData)
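To illustrate how a factor graph turns constraints into probabilities, the Python sketch below multiplies a few factors and computes a marginal by brute-force enumeration. This is not the inference procedure the slides refer to, which they do not detail. The three Boolean variables, the LOW and HIGH factor values, and the helper names are all assumptions, and the priors are borrowed from the probability table earlier in the deck.

```python
from itertools import product

# One Boolean variable per (node, role) pair; a deliberately tiny example.
VARIABLES = ["ReadData1_is_source", "Cleanse_is_sanitizer", "WriteData_is_sink"]

# Priors taken from the probability table earlier in the deck.
PRIORS = {
    "ReadData1_is_source": 0.95,
    "Cleanse_is_sanitizer": 0.5,
    "WriteData_is_sink": 0.85,
}

LOW, HIGH = 0.1, 0.9  # assumed factor values for violating / satisfying a constraint


def prior_factor(name):
    """Unary factor encoding the prior belief in one variable."""
    p = PRIORS[name]
    return lambda a: p if a[name] else 1.0 - p


def triple_factor(a):
    """C1-style factor: source, no sanitizer in between, and sink is unlikely."""
    violated = (a["ReadData1_is_source"]
                and not a["Cleanse_is_sanitizer"]
                and a["WriteData_is_sink"])
    return LOW if violated else HIGH


def marginal(name, factors):
    """P(name is true) under the normalized product of all factors."""
    total = hit = 0.0
    for values in product([False, True], repeat=len(VARIABLES)):
        assignment = dict(zip(VARIABLES, values))
        weight = 1.0
        for factor in factors:
            weight *= factor(assignment)
        total += weight
        if assignment[name]:
            hit += weight
    return hit / total


PRIOR_FACTORS = [prior_factor(v) for v in VARIABLES]
FACTORS = PRIOR_FACTORS + [triple_factor]

if __name__ == "__main__":
    print(marginal("Cleanse_is_sanitizer", PRIOR_FACTORS))
    print(marginal("Cleanse_is_sanitizer", FACTORS))
```

With only the priors, the belief that Cleanse is a sanitizer stays at its prior of 0.5. Adding the triple factor pushes it upward, because a likely source feeding a likely sink makes an unsanitized path improbable. Enumeration is exponential in the number of variables, so it only serves as a reference for small examples.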

Page 20

EXPERIMENTAL RESULTS

Page 21

We have chosen 10 line-of-business applications written in C# using ASP.NET.

Page 22

Summary of Discovered Specifications

[Figure: bar chart comparing Original and With Merlin specification counts for Sources, Sanitizers, and Sinks; vertical axis from 0 to 160]

Page 23

Summary of Discovered Vulnerabilities

[Figure: bar chart with bars Original, With Merlin, and Eliminated; data labels 89, 335, 13; horizontal axis from -50 to 400]

Page 24

Analyze This: a routine from one of our benchmarks that shows how Merlin affects vulnerabilities.

[Figure: routine from a benchmark, with an annotation marking a known sink]

Page 25

Starting with an initial specification really helps, but Merlin can work with no specification at all.

Page 26

Executive summary of experimental results.

• 10 large Web apps in .NET
• New specs: 167
• New vulnerabilities: 302
• False positives removed: 3
• Final false positive rate for CAT.NET after Merlin: <1%

Page 27

Related work

Explicit information flow
• Analysis of Web apps (WebSSARI, Griffin, etc.) and Fortify, CAT.NET
• Asbestos, HiStar
• Hardware support

Mining security specifications
• AutoISES – security-sensitive data structures in Linux kernel
• Ganapathy

Specification mining
• Kremenek (belief inference for malloc/free)
• Perracotta, DynaMine, Dyckon, Weimer