distributed transactions zachary g. ives university of pennsylvania cis 455 / 555 – internet and...

36
Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

Upload: peregrine-hart

Post on 14-Jan-2016

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

Distributed Transactions

Zachary G. IvesUniversity of Pennsylvania

CIS 455 / 555 – Internet and Web Systems

April 15, 2008

Page 2: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

Administrivia

We are down to 2 weeks in the semester

I would like a status report on your projects by this Friday

2

Page 3: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

Picking up: Vector Clocks, Briefly

Lamport clocks are logical clocks Each site maintains a clock value LC

Whenever an item publishes a timed item, use LC and postincrement it

Append clock on every published message When we receive a message with a bigger or equal LC timestamp,

say LC2, set our local LC := LC2 + 1 Causally ordered messages have bigger LC values than

their antecedents, but but a bigger LC value doesn’t mean that there’s any causal ordering…

Vector clocks: every node has a vector, representing the last Lamport clock value from each neighbor Key result: can compare two vector clocks to see if there

is a causal ordering between them

3

Page 4: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

4

We Need More Than Synchronization

What needs to happen when you… Click on “purchase” on Amazon?

Suppose you purchased by credit card?

Use online bill-paying services from your bank? Place a bid in an eBay-like auction system? Order music from iTunes?

What if your connection drops in the middle of downloading?

Is this more than a case of making a simple Web Service (-like) call?

Page 5: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

5

Transactions Are a Means of Handling Failures

There are many (especially, financial) applications where we want to create atomic operations that either commit or roll back

This is one of the most basic services provided by database management systems, but we want to do it in a broader sense

Part of “ACID” semantics…

Page 6: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

6

ACID Semantics

Atomicity: operations are atomic, either committing or aborting as a single entity

Consistency: the state of the data is internally consistent

Isolation: all operations act as if they were run by themselves

Durability: all writes stay persistent!

Page 7: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

7

A Problem Confronted by eBay

eBay wants to sell an item to: The highest bidder, once the auction is over, or The person who’s first to click “Buy It Now!”

But: What if the bidder doesn’t have the cash?

A solution: Record the item as sold Validate the PayPal or credit card info with a 3rd

party If not valid, discard this bidder and resume in prior

state

Page 8: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

8

“No Payment” Isn’t the Only Source of Failure

Suppose we start to transfer the money, but a server goes down…

Purchase:sb = Seller.balbb = Buyer.balWrite Buyer.bal= bb - $100

Write Item.sellTo = Buyer

Write Seller.bal= sb + $100

CRASH!

Page 9: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

9

Providing Atomicity and Consistency

Database systems provide transactions with the ability to abort a transaction upon some failure condition Based on transaction logging – record all operations and

undo them as necessary Database systems also use the log to perform

recovery from crashes Undo all of the steps in a partially-complete transaction Then redo them in their entirety This is part of a protocol called ARIES

These can be the basis of persistent storage, and we can use middleware like J2EE to build distributed transactions with the ability to abort database operations if necessary

Page 10: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

10

The Need for Isolation

Suppose eBay seller S has a bank account that we’re depositing money into, as people buy:

What if two purchases occur simultaneously, from two different servers on different continents?

S = Accounts.Get(1234)Write S.bal = S.bal + $50

Page 11: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

11

Concurrent Deposits

This update code is represented as a sequence of read and write operations on “data items” (which for now should be thought of as individual accounts):

where S is the data item representing the seller’s account # 1234

Deposit 1 Deposit 2Read(S.bal) Read(S.bal)S.bal := S.bal + $50 S.bal:= S.bal + €10Write(S.bal) Write(S.bal)

Page 12: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

12

A “Bad” Concurrent Execution

Only one action (e.g. a read or a write) can actually happen at a time for a given database, and we can interleave deposit operations in many ways:

Deposit 1 Deposit 2Read(S.bal) Read(S.bal)S.bal := S.bal + $50 S.bal:= S.bal + €10Write(S.bal) Write(S.bal)

time

BAD!

Page 13: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

13

A “Good” Execution

Previous execution would have been fine if the accounts were different (i.e. one were S and one were T), i.e., transactions were independent

The following execution is a serial execution, and executes one transaction after the other:

Deposit 1 Deposit 2Read(S.bal) S.bal := S.bal + $50 write(S.bal) Read(S.bal) S.bal:= S.bal + $10 Write(S.bal)

time

GOOD!

Page 14: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

14

Good Executions

An execution is “good” if it is serial (transactions are executed atomically and consecutively) or serializable (i.e. equivalent to some serial execution)

Equivalent to executing Deposit 1 then 3, or vice versa Why would we want to do this instead?

Deposit 1 Deposit 3read(S.bal) read(T.bal)S.bal := S.bal + $50 T.bal:= T.bal + €10write(S.bal) write(T.bal)

Page 15: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

15

Concurrency Control

A means of ensuring that transactions are serializable

There are many methods, of which we’ll see one Lock-based concurrency control (2-phase

locking) Optimistic concurrency control (no locks –

based on timestamps) Multiversion CC …

Page 16: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

Lock-Based Concurrency Control

Strict Two-phase Locking (Strict 2PL) Protocol: Each transaction must obtain:

a S (shared) lock on object before reading an X (exclusive) lock on object before writing An owner of an S lock can upgrade it to X if no one else is

holding the lock

All locks held by a transaction are released when the transaction completes Locks are handled in a “growing” phase, then a

“shrinking” phase (Non-strict) 2PL Variant:

Release locks anytime, but cannot acquire locks after releasing any lock.

Page 17: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

17

Benefits of Strict 2PL

Strict 2PL allows only serializable schedules. Additionally, it simplifies transaction aborts (Non-strict) 2PL also allows only serializable

schedules, but involves more complex abort processing

Page 18: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

Aborting a Transaction

If a transaction Ti is aborted, all its actions have to be undone Not only that, if Tj reads an object last written by Ti, Tj

must be aborted as well!

Most systems avoid such cascading aborts by releasing a transaction’s locks only at commit time If Ti writes an object, Tj can read this only after Ti

commits

Actions are undone by consulting the transaction log mentioned earlier

Page 19: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

The Transaction Log

The following actions are recorded in the log: Ti writes an object: the old value and the new value

Log record must go to disk before the changed page does!

Ti commits/aborts: a log record indicating this action

Log records are chained together by transaction id, so it’s easy to undo a specific transaction

Log is often mirrored and archived on stable storage

Page 20: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

Another Benefit of the Log:Recovering From a Crash

3 phases in the ARIES recovery algorithm: Analysis

Scan the log forward (from the most recent checkpoint) to identify all pending transactions, unwritten pages

Redo Redo all updates to unwritten pages in the buffer

pool, to ensure that all logged updates are in fact carried out and written to disk

Undo Undo all writes done by incomplete transactions by

working backwards in the log (Care must be taken to handle the case of a crash

occurring during the recovery process!)

Page 21: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

A Danger with Locks: Deadlocks

Deadlock: Cycle of transactions waiting for locks to be released by each other

Two ways of dealing with deadlocks: Deadlock prevention Deadlock detection

Page 22: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

Deadlock Prevention

Assign priorities based on timestamps (older = higher)

Assume Ti wants a lock that Tj holds Do one of:

Wait-Die: If Ti has higher priority, Ti waits for Tj; otherwise Ti aborts

Wound-wait: If Ti has higher priority, Tj aborts; otherwise Ti waits

Higher-priority transactions never wait for lower-priority

If a transaction re-starts, make sure it has its original timestamp Keeps it from always getting aborted!

Page 23: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

23

Database Transactions and Concurrency Control, Summarized

The basic goal was to guarantee ACID properties Transactions and logging provide Atomicity and

Consistency Locks ensure Isolation The transaction log (and RAID, backups, etc.) are

also used to ensure Durability

So far, we’ve been in the realm of databases – how does this extend to the distributed context?

Page 24: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

24

Distributed Transactions

We generally rely on a middleware layer called application servers, aka TP monitors, to provide transactions across systems Tuxedo, iPlanet, WebSphere,

etc. For atomicity, two-phase

commit protocol For isolation, need distributed

concurrency control DB DB

TransactServer

TransactServer

WorkflowController

MsgQueue

WebServer

App Server

Client

Page 25: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

Two-Phase Commit (2PC)

Site at which a transaction originates is the coordinator; other sites at which it executes are subordinates

Two rounds of communication, initiated by coordinator: Voting

Coordinator sends prepare messages, waits for yes or no votes

Then, decision or termination Coordinator sends commit or rollback messages, waits for

acks Any site can decide to abort a transaction!

Page 26: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

26

Steps in 2PC

When a transaction wants to commit: Coordinator sends prepare message to each subordinate Subordinate force-writes an abort or prepare log record

and then sends a no (abort) or yes (prepare) message to coordinator

Coordinator considers votes: If unanimous yes votes, force-writes a commit log record

and sends commit message to all subordinates Else, force-writes abort log rec, and sends abort message

Subordinates force-write abort/commit log records based on message they get, then send ack message to coordinator

Coordinator writes end log record after getting all acks

Page 27: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

27

Illustration of 2PC

Coordinator Subordinate 1Subordinate 2force-write begin log entry

force-write prepared log entry

force-writeprepared log entry

send “prepare”send “prepare”

send “yes”send “yes”

force-writecommit log entry

send “commit”send “commit”

force-writecommit log entry

force-writecommit log entry

send “ack”send “ack”

writeend log entry

Page 28: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

Comments on 2PC

Every message reflects a decision by the sender; to ensure that this decision survives failures, it is first recorded in the local log

All log records for a transaction contain its ID and the coordinator’s ID The coordinator’s abort/commit record also includes IDs

of all subordinates

Thm: there exists no distributed commit protocol that can recover without communicating with other processes, in the presence of multiple failures!

Page 29: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

What if a Site Fails in the Middle?

If we have a commit or abort log record for transaction T, but not an end record, we must redo/undo T If this site is the coordinator for T, keep sending commit/abort

msgs to subordinates until acks have been received

If we have a prepare log record for transaction T, but not commit/abort, this site is a subordinate for T Repeatedly contact the coordinator to find status of T, then write

commit/abort log record; redo/undo T; and write end log record

If we don’t have even a prepare log record for T, unilaterally abort and undo T This site may be coordinator! If so, subordinates may send

messages and need to also be undone

Page 30: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

Blocking for the Coordinator

If coordinator for transaction T fails, subordinates who have voted yes cannot decide whether to commit or abort T until coordinator recovers T is blocked Even if all subordinates know each other (extra

overhead in prepare msg) they are blocked unless one of them voted no

Page 31: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

Link and Remote Site Failures

If a remote site does not respond during the commit protocol for trnasaction T, either because the site failed or the link failed: If the current site is the coordinator for T,

should abort T If the current site is a subordinate, and has not

yet voted yes, it should abort T If the current site is a subordinate and has

voted yes, it is blocked until the coordinator responds!

Page 32: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

Observations on 2PC

Ack msgs used to let coordinator know when it’s done with a transaction; until it receives all acks, it must keep T in the transaction-pending table

If the coordinator fails after sending prepare msgs but before writing commit/abort log recs, when it comes back up it aborts the transaction

Page 33: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

33

From Distributed Commits toDistributed Concurrency Control

What we saw were the steps involved in preserving atomicity and consistency in a distributed fashion

Let’s briefly look at distributed isolation (locking)…

Page 34: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

Distributed Locking

How do we manage locks across many sites? Centralized: One site does all locking

Vulnerable to single site failure

Primary Copy: All locking for an object done at the primary copy site for this object Reading requires access to locking site as well as site

where the object is stored

Fully Distributed: Locking for a copy done at site where the copy is stored Locks at all sites holding the object being written

Page 35: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

Distributed Deadlock Detection

Each site maintains a local waits-for graph A global deadlock might exist even if the local

graphs contain no cycles:

T1 T1 T1T2 T2 T2

SITE A SITE B GLOBAL

Three solutions: Centralized (send all local graphs to one site) Hierarchical (organize sites into a hierarchy and

send local graphs to parent in the hierarchy) Timeout (abort transaction if it waits too long)

Page 36: Distributed Transactions Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems April 15, 2008

36

Summary of Transactions and Concurrency

There are many (especially monetary) transfers that need atomicity and isolation

Transactions and concurrency control provide these features In a distributed, 3-tier setting they run in an Application

Server Similar features are provided in a 2-tier setting for

applications that run directly in the DBMS

Two-phase locking ensures isolation Two-phase commit is a voting scheme for doing

distributed commit