precise, dynamic information flow for database- backed

38
Precise, Dynamic Information Flow for Database- Backed Applications Jean Yang, Travis Hance, Thomas H. Austin, Armando Solar-Lezama, Cormac Flanagan, and Stephen Chong PLDI 2016 Jean Yang / PLDI 2016

Upload: others

Post on 17-Jan-2022

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Precise, Dynamic Information Flow for Database- Backed

Precise, Dynamic Information Flow for Database-Backed ApplicationsJean Yang, Travis Hance, Thomas H. Austin, Armando Solar-Lezama, Cormac Flanagan, and Stephen Chong

PLDI 2016 Jea

n Y

an

g / P

LD

I 2

01

6

Page 2: Precise, Dynamic Information Flow for Database- Backed

An oil skimming operation works in a heavy oil slick after the spill on April 1, 1989. (Photo from Huffington Post)

Jea

n Y

an

g / P

LD

I 2

01

6

Page 3: Precise, Dynamic Information Flow for Database- Backed

Jea

n Y

an

g / P

LD

I 2

01

6

Oil-covered otter. (Photo from the Human Impact Project)

Page 4: Precise, Dynamic Information Flow for Database- Backed

The Relationship Between Design and Accidents

Jea

n Y

an

g / P

LD

I 2

01

6

Crude oil

Single hull

Crude oil

Double hull

Required by the Oil

Pollution Act of 1990.

Page 5: Precise, Dynamic Information Flow for Database- Backed

Jea

n Y

an

g / P

LD

I 2

01

6

But what about

information

leaks?

Page 6: Precise, Dynamic Information Flow for Database- Backed

Wanted: Double Hull for Information Security

Jea

n Y

an

g / P

LD

I 2

01

6

Sensitive data

Single hull

Sensitive data

Double hull

Research in language-based security looks at designs

for double hulls [Sabelfeld and Myers, JSAC 2003].

Our goal: make double hulls that are

as easy to construct as possible!

Page 7: Precise, Dynamic Information Flow for Database- Backed

This Talk: Making It Easier to Secure Web Programs

1. Why it’s hard to prevent information leaks.

2. A programming model that makes writing secure web programs easier.

3. How we support that programming model in database-backed applications. J

ea

n Y

an

g / P

LD

I 2

01

6

Page 8: Precise, Dynamic Information Flow for Database- Backed

Social Calendar Example

Jea

n Y

an

g / J

eeves

Let’s say Arjun and I want to throw a surprise paper discussion party for Emery.

Page 9: Precise, Dynamic Information Flow for Database- Backed

Challenge: Different Viewers Should See Different Events

Jea

n Y

an

g / P

LD

I 2

01

6

Guests Emery

Strangers

Surprise

discussion for

Emery at

Chuck E.

Cheese.

Pizza with

Arjun/Jean.

Private event

at Chuck E.

Cheese.

Page 10: Precise, Dynamic Information Flow for Database- Backed

Policies May Depend on Sensitive Values

Jea

n Y

an

g / P

LD

I 2

01

6

Guest List

Finalized list

Must be on guest list.

Must be member of list and the

list must be finalized.

Leaky enforcement:

when the programmer

neglects dependencies

of policies on sensitive

values.

Policy for event

depends on policy

for guest list!

Page 11: Precise, Dynamic Information Flow for Database- Backed

A Story of Leaky Enforcement

Jea

n Y

an

g / P

LD

I 2

01

6

Guest List

Finalized list

We add Armando to

non-final guest list.1

We run out of space

and remove Armando.3

Armando figures out

he was uninvited.4

There was a

party on my

calendar…

Guest List

Finalized list

Armando sees the

event on his calendar.2

Page 12: Precise, Dynamic Information Flow for Database- Backed

A Story of Leaky Enforcement

Jea

n Y

an

g / P

LD

I 2

01

6

Guest List

Finalized list

We add Armando to

non-final guest list.1

We run out of space

and remove Armando.3

Armando figures out

he was uninvited.4

There was a

party on my

calendar…

Guest List

Finalized list

Armando sees the

event on his calendar.2

Problem: implementation for event policy neglected to take into account guest list policy.

This arises whenever we

trust programmers to get

policy checks right!

Page 13: Precise, Dynamic Information Flow for Database- Backed

Need to Track Policies and Viewers Across the Code

Jea

n Y

an

g / P

LD

I 2

01

6

“What is the

most popular

location among

friends 7pm

Tuesday?”

Update to

all

calendar

users

Need to track how information flows

through derived values and where

derived values flow!

Page 14: Precise, Dynamic Information Flow for Database- Backed

“Policy Spaghetti” in HotCRP

Jea

n Y

an

g / P

LD

I 2

01

6

Conditional permissions

checks everywhere!

Page 15: Precise, Dynamic Information Flow for Database- Backed

Jacqueline Web Framework to the Rescue!

Jea

n Y

an

g / P

LD

I 2

01

6

Enhanced runtime

encompasses applications

and databases, preventing

leaks between the two.

Runtime prevents information

leaks according to policy

annotations.

Sensitive data

Policy annotations

Database

Programmer specifies

information flow policies

separately from other

functionality.

1

2

3

Page 16: Precise, Dynamic Information Flow for Database- Backed

Contributions

• Policy-agnostic programming model for database-backed web applications.

• Semantics and proofs for policy-agnostic programming that encompasses SQL databases.

• Demonstration of practical feasibility with Python implementation and application case studies.

Jea

n Y

an

g / P

LD

I 2

01

6

Page 17: Precise, Dynamic Information Flow for Database- Backed

Enhanced runtime

Jacqueline Web Framework

Jea

n Y

an

g / P

LD

I 2

01

6

Policies

Framework attaches

policies based on

annotations.

Framework

shows

appropriate

values based

on viewer and

policies.

Object-relational

mapping propagates

policies and sensitive

values through

computations.

Page 18: Precise, Dynamic Information Flow for Database- Backed

@jacqueline

def has_host(self, host):

return EventHost.objects.get(

event=self, host=host) != None

@jacqueline

def has_guest(self, guest):

return EventGuest.objects.get(

event=self, host=host) != None

Jea

n Y

an

g / P

LD

I 2

01

6

Base schema

Policy helper

functions

class Event(JacquelineModel):

name = CharField(max_length=256)

location = CharField(max_length=512)

time = DateTimeField()

description = CharField(max_length)=1024)

@staticmethod

@label_for(‘location’)

def restrict_event(event, ctxt):

return event.has_host(ctxt) or event.has_guest(ctxt)

@staticmethod

def jacqueline_get_private_location(event):

return “Undisclosed location”

Public value for location field

Information flow policy for location field

Coding in Jacqueline

Page 19: Precise, Dynamic Information Flow for Database- Backed

Centralized Policies in Jacqueline

Jea

n Y

an

g / P

LD

I 2

01

6

Model View Controller

Centralized policies! No checks or

declassifications needed anywhere else!

Page 20: Precise, Dynamic Information Flow for Database- Backed

20

Jea

n Y

an

g / J

eeves

if == :

userCount += 1

return userCount

userCount = 0

print { } print { }

1 0

Closer Look at the Policy-Agnostic Runtime

Runtime

propagates

values and

policies.

Runtime

solves for

values to show

based on

policies and

viewer.

21

Jeeves [Yang et al 2012, Austin et al 2013] uses facets

[Austin et al 2012] to simulate simultaneous multiple

executions.

Page 21: Precise, Dynamic Information Flow for Database- Backed

Labels Track Sensitive Values to Prevent Leaks

Jea

n Y

an

g / J

eeves

21

if == :

c += 1

true falseif :

c += 1

c = cold+1 cold

Labels follow

values through

all computations,

including

conditionals and

assignments.Emery can’t see secret

party information or

results of computations

on those values!

guest

guest

guest

Page 22: Precise, Dynamic Information Flow for Database- Backed

Jea

n Y

an

g / P

LD

I 2

01

6

Page 23: Precise, Dynamic Information Flow for Database- Backed

Jea

n Y

an

g / P

LD

I 2

01

6

Page 24: Precise, Dynamic Information Flow for Database- Backed

The Dangers of Interacting with Vanilla Databases

Database queries can leak information!

Jea

n Y

an

g / P

LD

I 2

01

6

ApplicationQueries

select * from Users

where location =

Database

Application All data

Databaseselect * from Users

Impractical

and potentially

slow!

Challenge: Support faceted execution when interacting with an unmodified SQL database.

Need faceted queries!

Page 25: Precise, Dynamic Information Flow for Database- Backed

save( )

Semantics of a Faceted Database

Jea

n Y

an

g / P

LD

I 2

01

6

SQL

Database select * from Users

where location =

Too expensive! Too difficult to extend the

formal semantics!

Primary key Location

1

Conceptual rowStore facets

as strings?

New

database

for each

label?

Page 26: Precise, Dynamic Information Flow for Database- Backed

Solution: Use ORM to Map Facets onto Database Rows

Jeeves key Location Labels

1 {𝑎}

1 {¬𝑎}

Jea

n Y

an

g / P

LD

I 2

01

6

select * from Users

where location =

ORM refacets

Jeeves key Location Labels

1 {𝑎}

Jeeves key Location

1NULL

Primary key Location

1a

Conceptual row

a

Page 27: Precise, Dynamic Information Flow for Database- Backed

Supporting Queries in Jacqueline

Jacqueline

Supports

SQL

Implements

ORM Implements

get select refaceting

all select refaceting

filter select refaceting

sort order by refaceting

foreign keys join -

save delete, insert turning a faceted value into

multiple rows

delete delete keeping track of which

facets to delete

Jea

n Y

an

g / P

LD

I 2

01

6

Can use SQL

implementations

for many

queries!

Page 28: Precise, Dynamic Information Flow for Database- Backed

Jea

n Y

an

g / P

LD

I 2

01

6

Page 29: Precise, Dynamic Information Flow for Database- Backed

Early Pruning Optimization

Jea

n Y

an

g / P

LD

I 2

01

6

Observation:

Framework can

often (but not

always) track

viewer.

Enhanced runtime

Policies

Optimization: Can

often explore fewer

possible paths!

Page 30: Precise, Dynamic Information Flow for Database- Backed

Jea

n Y

an

g / P

LD

I 2

01

6

Page 31: Precise, Dynamic Information Flow for Database- Backed

Review: Traditional Non-Interference

Jea

n Y

an

g / J

eeves

if == :

userCount += 1

print { }

0

0 1

if == :

userCount += 1

Challenge:

Compute labels from

program—may have

dependencies on

secret values!

Secret values should not affect public output.

guest guest

guest

Page 32: Precise, Dynamic Information Flow for Database- Backed

Jea

n Y

an

g / J

eeves

if == :

userCount += 1

print { }

0

0 1

if == :

userCount += 1

Theorem:

All executions where

guest must be

public produce

equivalent outputs.

Can’t tell apart secret

values that require

guest to be public.

Policy-Agnostic Non-Interference

guest guest

guest

Page 33: Precise, Dynamic Information Flow for Database- Backed

Jea

n Y

an

g / P

LD

I 2

01

6

Page 34: Precise, Dynamic Information Flow for Database- Backed

Application Case Studies

Jea

n Y

an

g / P

LD

I 2

01

6

Course

manager

Health

record

manager

Conference

management

system

(deployed!)

Jacqueline reduces the number

of lines of policy code and has

reasonable overheads!

Page 35: Precise, Dynamic Information Flow for Database- Backed

Demo

Jea

n Y

an

g / P

LD

I 2

01

6

Page 36: Precise, Dynamic Information Flow for Database- Backed

Conference Management System Running Times

Jea

n Y

an

g / P

LD

I 2

01

6

Tests from Amazon AWS machine via HTTP requests from another machine.

0

0.05

0.1

0.15

0.2

0 500 1000

Tim

e t

o s

how

pa

ge (

s)

Papers in database

Single paper

Jacqueline Django

0

2

4

6

8

10

12

14

16

0 500 1000

Tim

e t

o s

how

all

pap

ers

(s)

Papers in database

All Papers*

Jacqueline Django

*Different from numbers in paper.

Page 37: Precise, Dynamic Information Flow for Database- Backed

Summary: Policy-Agnostic Web Programming with Jacqueline

Jea

n Y

an

g / P

LD

I 2

01

6

Enhanced runtime

encompasses applications

and databases, preventing

leaks between the two.

Runtime prevents

information leaks according

to policy annotations.

Sensitive data

Policy annotations

Database

Programmer specifies

information flow policies

separately from other

functionality.

1

2

3

We have strong

formal

guarantees and

evidence that

this can be

practical!

Page 38: Precise, Dynamic Information Flow for Database- Backed

Jea

n Y

an

g / P

LD

I 2

01

6

http://jeeveslang.org

http://github.com/jeanqasaur/jeeves

You can factor out

information flow policies

from other code to avoid

policy spaghetti!

You can enforce policies

across the application and

database by using a

carefully-crafted ORM!

You can build realistic

systems using this approach!