Accountable distributed systems and the accountable cloud


Page 1: Accountable distributed systems and the accountable cloud

Accountable distributed systems and the accountable cloud

Peter Druschel
Joint work with Andreas Haeberlen (1), Petr Kuznetsov (2), Rodrigo Rodrigues

(1) University of Pennsylvania
(2) TU Berlin/Deutsche Telekom Labs

Building and Programming the Cloud, Mysore, Jan 2010

Page 2: Accountable distributed systems and the accountable cloud

Outline

- Why accountability?
- A definition
- A practical implementation: PeerReview
- Accountability in the Cloud
- Technical Challenges
- Conclusion

Page 3: Accountable distributed systems and the accountable cloud

What is the problem?

- Multiple administrative domains (federated, p2p)
- Multiple stakeholders (hosting, Web)
  - different actors, somewhat different interests
  - lack of global visibility, control
- Complex faults
  - software faults, mis-configuration, negligence, disgruntled employees, outside attacks, manipulation
- Lack of transparency

Page 4: Accountable distributed systems and the accountable cloud

Learning from the 'offline' world

- Relies heavily on accountability to deal with faults, misbehavior
- Example: Banking

  Requirement             Solution
  Commitment              Signed receipts
  Tamper-evident record   Double-entry bookkeeping
  Inspections             Audits

- Record can be used to
  - (manually) detect problems
  - identify the responsible party
  - convince others that a problem does (not) exist

Page 5: Accountable distributed systems and the accountable cloud

What does accountability mean in distributed systems?

1. Tamper-evident record of each node's actions
2. (Automated) audit for fault detection, localization
3. Evidence to convince a third party that a fault has (not) occurred

Accountability provides
- transparency
- trust
- incentives to avoid faults
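The first ingredient, a tamper-evident record, can be illustrated with a hash chain. This is a minimal sketch under stated assumptions, not the mechanism used by any particular system. The names `TamperEvidentLog`, `entry_hash`, `verify`, and `GENESIS` are hypothetical, and SHA-256 over a JSON encoding is just one convenient choice.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed fixed starting hash for an empty log


def entry_hash(prev_hash: str, seq: int, action: str) -> str:
    """Hash of one entry, bound to the hash of the previous entry."""
    payload = json.dumps([prev_hash, seq, action]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class TamperEvidentLog:
    """Append-only log in which each entry commits to its predecessor."""

    def __init__(self):
        self.entries = []  # list of (seq, action, hash) tuples

    def top_hash(self) -> str:
        return self.entries[-1][2] if self.entries else GENESIS

    def append(self, action: str) -> str:
        seq = len(self.entries)
        h = entry_hash(self.top_hash(), seq, action)
        self.entries.append((seq, action, h))
        return h


def verify(entries, expected_top: str) -> bool:
    """Recompute the chain and compare it with a previously published hash."""
    prev = GENESIS
    for seq, action, h in entries:
        if entry_hash(prev, seq, action) != h:
            return False
        prev = h
    return prev == expected_top


log = TamperEvidentLog()
log.append("recv X from A")
log.append("send Y to C")
commitment = log.top_hash()  # what the node would hand to an auditor

# An attacker rewrites the first action but keeps the old hash.
forged = list(log.entries)
forged[0] = (0, "recv Z from A", forged[0][2])

print(verify(log.entries, commitment))
print(verify(forged, commitment))
```

Because every hash covers its predecessor, a node that has published its top hash cannot later rewrite an earlier entry without the mismatch showing up during an audit. A real deployment would also sign the published hash so that it can serve as evidence.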

Page 7: Accountable distributed systems and the accountable cloud

Ideal accountability

Fault := Node deviates from expected behavior

Our goal is to automatically
- detect faults
- identify the faulty nodes
- convince others that a node is (or is not) faulty

Can we build a system that provides the following guarantee?

  Whenever a node is faulty, the system generates a proof of misbehavior against that node

Page 8: Accountable distributed systems and the accountable cloud

Can we detect all faults?

- Problem: Faults that affect only a node's internal state
  - Would require online trusted probes at each node
- Focus on observable faults: Faults that affect a correct node
- Can detect observable faults without requiring trusted components

[Figure: nodes A and C with message X]

Page 9: Accountable distributed systems and the accountable cloud

Can we always get a proof?

- Problem: He-said-she-said
- Three possible causes:
  - A never sent X
  - B refuses to acknowledge X
  - X was delayed by the network
- Cannot get proof of misbehavior!
- Generalize to verifiable evidence:
  - a proof of misbehavior, or
  - a challenge that a faulty node cannot answer
- What if the challenged node does not respond?
  - Does not prove a fault, but node is suspected until it responds

[Figure: A tells C "I sent X!"; B tells C "I never received X!"]
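The distinction between a proof of misbehavior and an unanswered challenge can be sketched as a small bookkeeping structure kept by a correct node. This is an illustrative sketch only. The class `Witness`, the three `Status` labels, and the challenge identifier `"ack-X"` are assumptions made for the example, not an interface taken from the talk.

```python
from enum import Enum


class Status(Enum):
    TRUSTED = "trusted"      # no open challenge, no proof
    SUSPECTED = "suspected"  # at least one challenge is unanswered
    EXPOSED = "exposed"      # a proof of misbehavior exists


class Witness:
    """Tracks what a correct node can currently show about its peers."""

    def __init__(self):
        self.open_challenges = {}  # node -> set of unanswered challenge ids
        self.proofs = {}           # node -> list of proofs of misbehavior

    def challenge(self, node, challenge_id):
        self.open_challenges.setdefault(node, set()).add(challenge_id)

    def answer(self, node, challenge_id):
        # Answering removes the suspicion caused by this challenge.
        self.open_challenges.get(node, set()).discard(challenge_id)

    def add_proof(self, node, proof):
        self.proofs.setdefault(node, []).append(proof)

    def status(self, node):
        if self.proofs.get(node):
            return Status.EXPOSED
        if self.open_challenges.get(node):
            return Status.SUSPECTED
        return Status.TRUSTED


# A claims it sent X; the witness challenges B to acknowledge X.
w = Witness()
w.challenge("B", "ack-X")
print(w.status("B"))  # B is suspected while the challenge is open
w.answer("B", "ack-X")
print(w.status("B"))  # B is trusted again once it responds
```

The design point is that suspicion is reversible while exposure is not: a slow network can delay an answer, so an open challenge only marks the node as suspected, whereas a proof of misbehavior stays valid forever.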

Page 10: Accountable distributed systems and the accountable cloud

Practical accountability

- Requirement for an accountable distributed system:

  Whenever a fault is observed by a correct node, the system eventually generates verifiable evidence against a faulty node

- This is useful
  - Any (!) fault that affects a correct node is eventually detected and linked to a faulty node
- It can be implemented in practice

Page 12: Accountable distributed systems and the accountable cloud

PeerReview

- Adds accountability to a given system
  - Implemented as a library
  - Provides tamper-evident record
  - Detects faults via state-machine replay
- Assumptions:
  1. Nodes can be modeled as deterministic state machines
  2. There is a trusted reference implementation of the state machines
  3. Correct nodes can eventually communicate
  4. Nodes can sign messages
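Fault detection via state-machine replay relies on assumptions 1 and 2: an auditor feeds the logged inputs to the trusted reference implementation and compares its outputs with the logged ones. The following is a toy sketch of that idea, not PeerReview's actual interface. The running-sum machine `reference_step`, the log format of `(input, output)` pairs, and the function `audit` are all hypothetical.

```python
def reference_step(state, message):
    """Hypothetical reference implementation: a running-sum server."""
    new_state = state + message
    return new_state, f"total={new_state}"


def audit(log, initial_state, step=reference_step):
    """Replay logged inputs through the reference state machine.

    log: list of (input, logged_output) pairs in execution order.
    Returns the index of the first entry whose logged output differs
    from the reference output, or None if the log is consistent.
    """
    state = initial_state
    for i, (inp, logged_out) in enumerate(log):
        state, expected = step(state, inp)
        if expected != logged_out:
            return i
    return None


honest_log = [(5, "total=5"), (7, "total=12")]
faulty_log = [(5, "total=5"), (7, "total=10")]

print(audit(honest_log, 0))  # None: outputs match the reference
print(audit(faulty_log, 0))  # 1: the second output deviates
```

Determinism is what makes this check meaningful: if the same inputs could legitimately produce different outputs, a mismatch would not show that the node was faulty.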

Page 13: Accountable distributed systems and the accountable cloud

PeerReview is widely applicable

- App #1: NFS server in the Linux kernel
  - Many small, latency-sensitive requests
  - Example faults: tampering with files, lost updates
- App #2: Overlay multicast
  - Transfers large volume of data
  - Example faults: freeloading, tampering with content
- App #3: P2P email
  - Complex, large, decentralized
  - Example faults: denial of service, attacks on DHT routing
- Further example faults shown on the slide: metadata corruption, incorrect access control, censorship

Details in [Haeberlen et al., SOSP’07]
NetReview [Haeberlen et al. NSDI’08]

Page 14: Accountable distributed systems and the accountable cloud

How much does PeerReview cost?

- Log storage
  - 10 to 100 GByte per month, depending on application
- Message signatures
  - Message latency (e.g. 1.5ms RTT with RSA-1024)
  - CPU overhead (embarrassingly parallel)
- Log/authenticator transfer, replay overhead
  - Depends on # witnesses
  - Can be deferred to exploit bursty/diurnal load patterns
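To put the log storage figure in perspective, the monthly volume can be converted into an average write rate. The short calculation below is only a back-of-the-envelope sketch. It assumes a 30-day month and decimal units (1 GByte = 10^9 bytes), neither of which is stated in the talk, and the helper name `sustained_rate_kb_per_s` is hypothetical.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumes a 30-day month


def sustained_rate_kb_per_s(gbytes_per_month: float) -> float:
    """Average write rate in kB/s, using decimal units (1 GB = 1e9 bytes)."""
    return gbytes_per_month * 1e9 / SECONDS_PER_MONTH / 1e3


low = sustained_rate_kb_per_s(10)
high = sustained_rate_kb_per_s(100)
print(f"{low:.1f} kB/s to {high:.1f} kB/s")  # 3.9 kB/s to 38.6 kB/s
```

Under these assumptions the quoted range corresponds to an average of roughly 4 to 39 kB/s of log data per node. Actual load is bursty, so peak rates would be higher.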

Page 16: Accountable distributed systems and the accountable cloud

Split administration in the Cloud

[Figure: Alice, Bob, and Alice's customers]

What if there is a problem?

- Bug in Alice's software
- Subtle differences between Alice's and Bob's environments
- ...

- Bug in Bob's software
- Insufficient resource allocation
- Hacker attack
- ...

Page 17: Accountable distributed systems and the accountable cloud

Split administration: Alice's perspective

[Figure: Alice, Bob, and Alice's customers]

- If something is wrong, how will I know?
- How can I tell if it's my software or the cloud?
- If it's the cloud, how can I convince Bob?

Page 18: Accountable distributed systems and the accountable cloud

Split administration: Bob's perspective

[Figure: Alice, Bob, and Alice's customers]

- If something is wrong, how will I know?
- How can I tell if it's the cloud or Alice's software?
- If it's Alice's software, how can I convince Alice?

Page 19: Accountable distributed systems and the accountable cloud

An idealized solution

What if we had an oracle that Alice and Bob could ask about problems?

Completeness: If the cloud is faulty, the oracle will say so

Accuracy: If the cloud is not faulty, the oracle will say so

Verifiability: The oracle produces evidence that would convince a third party

[Figure: Alice, Bob, Alice's customers, and the oracle]

Page 20: Accountable distributed systems and the accountable cloud

The accountable cloud

- Idea: Make cloud accountable
  - Cloud records its actions in a tamper-evident log
  - Alice can audit the log and check for faults
  - Use log to construct evidence that a fault does (not) exist
  - Should work even if one party was compromised!

[Figure: Alice, Bob, Alice's customers, and the tamper-evident log]

Page 21: Accountable distributed systems and the accountable cloud

Discussion

- Is this too pessimistic? Cloud isn't malicious!
  - Hacker attacks, software bugs, operator error, malicious client, …
  - Difficult to come up with a more restrictive fault model
  - Without provable properties, evidence has little value
- Why would a provider want to deploy this?
  - Attractive to prospective customers (peace of mind)
  - Helps in handling customer complaints and resolving disputes

Page 23: Accountable distributed systems and the accountable cloud

Is the technology ready?

- Cloud accountability should
  - Have provable guarantees
  - Work for most cloud applications
  - Require no changes to application code
  - Cover a wide spectrum of properties
  - Have reasonable overhead
- Can existing techniques deliver this?
  - CATS, Repeat&Compare, AIP, PeerReview, NetReview, AudIt, ...
- More work is needed!

Page 24: Accountable distributed systems and the accountable cloud

Work in progress: AVM

- Goal: Provide accountability for arbitrary binary executables
- Idea: Accountable virtual machine (AVM)
  - Cloud records enough data to enable deterministic replay
  - Alice can replay log against a reference implementation
  - Can audit any part of the hosted execution

[Figure: Alice and Bob; virtual machine]
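Auditing "any part" of an execution suggests that the log can be cut into segments and only some of them replayed. The sketch below illustrates that idea under strong simplifying assumptions. It assumes the cloud records a state snapshot at each segment boundary, which the talk does not state, and it uses a toy accumulator in place of a virtual machine. The names `step`, `replay_segment`, and `spot_check` are hypothetical.

```python
import random


def step(state, inp):
    """Hypothetical deterministic reference machine: an accumulator."""
    return state + inp


def replay_segment(snapshot, inputs):
    """Re-execute one segment starting from its recorded snapshot."""
    state = snapshot
    for inp in inputs:
        state = step(state, inp)
    return state


def spot_check(segments, sample_size, rng):
    """Replay a random sample of segments.

    segments: list of (start_snapshot, inputs, end_snapshot) tuples.
    Returns the sorted indices of sampled segments whose replay
    disagrees with the recorded end snapshot.
    """
    k = min(sample_size, len(segments))
    chosen = rng.sample(range(len(segments)), k)
    return sorted(
        i for i in chosen
        if replay_segment(segments[i][0], segments[i][1]) != segments[i][2]
    )


# Three recorded segments; the last one claims 10 where replay yields 9.
segments = [(0, [1, 2], 3), (3, [4], 7), (7, [1, 1], 10)]
bad = spot_check(segments, sample_size=3, rng=random.Random(0))
print(bad)  # [2]
```

Sampling fewer segments lowers the audit cost but means a faulty segment is only caught with some probability. A fuller auditor would also check that each segment's end snapshot matches the next segment's start snapshot.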

Page 25: Accountable distributed systems and the accountable cloud

Challenges

- Complete state-machine replay expensive
  - limit to spot checks, investigation of suspected faults
  - multi-core replay is hard
  - replay log against an abstract model?
- Checking performance properties
- Checking information flow
- Lots of research opportunities

Page 26: Accountable distributed systems and the accountable cloud

Summary

- Accountability is a useful capability in distributed systems
  - tamper-evident record
  - fault detection and localization
  - evidence
- Proposal: the accountable cloud
  - Can verify correct operation, produce evidence
  - Provable guarantees provide a solid foundation for both players
- Challenges remain

Questions?