transactions and concurrency control zachary g. ives university of pennsylvania cis 550 – database...
Post on 22-Dec-2015
215 views
TRANSCRIPT
![Page 1: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/1.jpg)
Transactions and Concurrency Control
Zachary G. IvesUniversity of Pennsylvania
CIS 550 – Database & Information Systems
December 2, 2003
Slide content courtesy of Susan Davidson, Raghu Ramakrishnan & Johannes Gehrke
![Page 2: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/2.jpg)
2
From Queries to Updates
We’ve spent a lot of time talking about querying data
Yet updates are a really major part of many DBMS applications Particularly important: ensuring ACID properties
Atomicity: each operation looks atomic to the user Consistency: each operation in isolation keeps the
database in a consistent state (this is the responsibility of the user)
Isolation: should be able to understand what’s going on by considering each separate transaction independently
Durability: updates stay in the DBMS!!!
![Page 3: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/3.jpg)
3
What is a Transaction?
A transaction is a sequence of read and write operations on data items that logically functions as one unit of work: should either be done entirely or not at all if it succeeds, the effects of write operations
persist (commit); if it fails, no effects of write operations persist (abort)
these guarantees are made despite concurrent activity in the system, and despite failures that may occur
![Page 4: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/4.jpg)
4
How Things Can Go Awry
Suppose we have a table of bank accounts which contains the balance of the account
An ATM deposit of $50 to account # 1234 would be written as:
This reads and writes the account’s balance What if two accountholders make deposits
simultaneously from two ATMs?
update Accountsset balance = balance + $50where account#= ‘1234’;
![Page 5: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/5.jpg)
5
Concurrent Deposits
This SQL update code is represented as a sequence of read and write operations on “data items” (which for now should be thought of as individual accounts):
where X is the data item representing the account with account# 1234.
Deposit 1 Deposit 2read(X.bal) read(X.bal)X.bal := X.bal + $50 X.bal:= X.bal + $10write(X.bal) write(X.bal)
![Page 6: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/6.jpg)
6
A “Bad” Concurrent Execution
Only one “action” (e.g. a read or a write) can actually happen at a time, and we can interleave deposit operations in many ways:
Deposit 1 Deposit 2read(X.bal) read(X.bal)X.bal := X.bal + $50 X.bal:= X.bal + $10write(X.bal) write(X.bal)
time
BAD!
![Page 7: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/7.jpg)
7
A “Good” Execution
Previous execution would have been fine if the accounts were different (i.e. one were X and one were Y), i.e., transactions were independent
The following execution is a serial execution, and executes one transaction after the other:
Deposit 1 Deposit 2read(X.bal) X.bal := X.bal + $50 write(X.bal) read(X.bal) X.bal:= X.bal + $10 write(X.bal)
time
GOOD!
![Page 8: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/8.jpg)
8
Good Executions
An execution is “good” if it is serial (transactions are executed atomically and consecutively) or serializable (i.e. equivalent to some serial execution)
Equivalent to executing Deposit 1 then 3, or vice versa Why would we want to do this instead?
Deposit 1 Deposit 3read(X.bal) read(Y.bal)X.bal := X.bal + $50 Y.bal:= Y.bal + $10write(X.bal) write(Y.bal)
![Page 9: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/9.jpg)
9
Atomicity
Problems can also occur if a crash occurs in the middle of executing a transaction:
Need to guarantee that the write to X does not persist (ABORT) Default assumption if a transaction doesn’t commit
Transferread(X.bal)read(Y.bal)X.bal= X.bal-$100
Y.bal= Y.bal+$100
CRASH
![Page 10: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/10.jpg)
10
Transactions in SQL
A transaction begins when any SQL statement that queries the db begins.
To end a transaction, the user issues a COMMIT or ROLLBACK statement.
TransferUPDATE Accounts SET balance = balance - $100 WHERE account#= ‘1234’;UPDATE Accounts SET balance = balance + $100 WHERE account#= ‘5678’;COMMIT;
![Page 11: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/11.jpg)
11
Read-Only Transactions
When a transaction only reads information, we have more freedom to let the transaction execute in parallel with other transactions.
We signal this to the system by stating:
SET TRANSACTION READ ONLY; SELECT * FROM Accounts WHERE account#=‘1234’;...
![Page 12: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/12.jpg)
12
Read-Write Transactions
If we state “read-only”, then the transaction cannot perform any updates.
Instead, we must specify that the transaction may update (the default):
SET TRANSACTION READ ONLY; UPDATE AccountsSET balance = balance - $100WHERE account#= ‘1234’; ...
SET TRANSACTION READ WRITE; update Accountsset balance = balance - $100where account#= ‘1234’; ...
ILLEGAL!
![Page 13: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/13.jpg)
13
Dirty Reads
Dirty data is data written by an uncommitted transaction; a dirty read is a read of dirty data (WR conflict)
Sometimes we can tolerate dirty reads; other times we cannot:
e.g., if we wished to ensure balances never went negative in the transfer example, we should test that there is enough money first!
![Page 14: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/14.jpg)
14
“Bad” Dirty ReadEXEC SQL select balance into :bal from Accounts where account#=‘1234’;if (bal > 100) {
EXEC SQL update Accounts set balance = balance - $100 where account#= ‘1234’;
EXEC SQL update Accounts set balance = balance + $100 where account#= ‘5678’;}EXEC SQL COMMIT;
If the initial read (italics) were dirty, the balance could become negative!
![Page 15: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/15.jpg)
15
Acceptable Dirty Read
If we are just checking availability of an airline seat, a dirty read might be fine! (Why is that?) Reservation transaction:EXEC SQL select occupied into :occ
from Flights where Num= ‘123’ and date=11-03-99 and seat=‘23f’;if (!occ) {EXEC SQL update Flights set occupied=true where Num= ‘123’ and date=11-03-99 and seat=‘23f’;}else {notify user that seat is unavailable}
![Page 16: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/16.jpg)
16
Other Undesirable Phenomena
Unrepeatable read: a transaction reads the same data item twice and gets different values (RW conflict)
Phantom problem: a transaction retrieves a collection of tuples twice and sees different results
![Page 17: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/17.jpg)
17
Phantom Problem Example T1: “find the students with best grades who Take
either cis550-f03 or cis570-f02” T2: “insert new entries for student #1234 in the Takes
relation, with grade A for cis570-f02 and cis550-f03”
Suppose that T1 consults all students in the Takes relation and finds the best grades for cis550-f03
Then T2 executes, inserting the new student at the end of the relation, perhaps on a page not seen by T1
T1 then completes, finding the students with best grades for cis570-f02 and now seeing student #1234
![Page 18: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/18.jpg)
18
Isolation
The problems we’ve seen are all related to isolation
General rules of thumb w.r.t. isolation: Fully serializable isolation is more expensive
than “no isolation” We can’t do as many things concurrently (or we have
to undo them frequently)
For performance, we generally want to specify the most relaxed isolation level that’s acceptable Note that we’re “slightly” violating a correctness
constraint to get performance!
![Page 19: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/19.jpg)
19
Specifying Acceptable Isolation Levels
The default isolation level is SERIALIZABLE (as for the transfer example).
To signal to the system that a dirty read is acceptable,
In addition, there are
SET TRANSACTION READ WRITEISOLATION LEVEL READ UNCOMMITTED;
SET TRANSACTIONISOLATION LEVEL READ COMMITTED;SET TRANSACTIONISOLATION LEVEL REPEATABLE READ;
![Page 20: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/20.jpg)
20
READ COMMITTED
Forbids the reading of dirty (uncommitted) data, but allows a transaction T to issue the same query several times and get different answers No value written by T can be modified until T
completes
For example, the Reservation example could also be READ COMMITTED; the transaction could repeatably poll to see if the seat was available, hoping for a cancellation
![Page 21: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/21.jpg)
21
REPEATABLE READ
What it is NOT: a guarantee that the same query will get the same answer!
However, if a tuple is retrieved once it will be retrieved again if the query is repeated For example, suppose Reservation were
modified to retrieve all available seats If a tuple were retrieved once, it would be
retrieved again (but additional seats may also become available)
![Page 22: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/22.jpg)
22
Implementing Isolation Levels
One approach – use locking at some level (tuple, page, table, etc.): each data item is either locked (in some mode, e.g.
shared or exclusive) or is available (no lock) an action on a data item can be executed if the
transaction holds an appropriate lock consider granularity of locks – how big of an item to lock
Larger granularity = fewer locking operations but more contention!
Appropriate locks: Before a read, a shared lock must be acquired Before a write, an exclusive lock must be acquired
![Page 23: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/23.jpg)
23
Lock Compatibility Matrix
Locks on a data item are granted based on a lock compatibility matrix:
When a transaction requests a lock, it must wait (block) until the lock is granted
Mode of Data Item None Shared ExclusiveShared Y Y NExclusive Y N NRequest mode {
![Page 24: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/24.jpg)
24
Locks Prevent “Bad” Execution If the system used locking, the first “bad”
execution could have been avoided: Deposit 1 Deposit 2xlock(X)read(X.bal) {xlock(X) is not granted}X.bal := X.bal + $50 write(X.bal) release(X) xlock(X) read(X.bal) X.bal:= X.bal + $10 write(X.bal) release(X)
![Page 25: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/25.jpg)
25
Lock Types and Read/Write Modes
When we specify “read-only”, the system only uses shared-mode locks Any transaction that attempts to update will be
illegal
When we specify “read-write”, the system may also acquire locks in exclusive mode Obviously, we can still query in this mode
![Page 26: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/26.jpg)
26
Isolation Levels and Locking
READ UNCOMMITTED allows queries in the transaction to read data without acquiring any lock For updates, exclusive locks must be obtained
and held to end of transaction
READ COMMITTED requires a read-lock to be obtained for all tuples touched by queries, but it releases the locks immediately after the read Exclusive locks must be obtained for updates
and held to end of transaction
![Page 27: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/27.jpg)
27
Isolation levels and locking, cont. REPEATABLE READ places shared locks on
tuples retrieved by queries, holds them until the end of the transaction Exclusive locks must be obtained for updates
and held to end of transaction SERIALIZABLE places shared locks on
tuples retrieved by queries as well as the index, holds them until the end of the transaction Exclusive locks must be obtained for updates
and held to end of transaction Holding locks to the end of a transaction is
called “strict”
![Page 28: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/28.jpg)
28
Summary of Isolation Levels
Level Dirty Read Unrepeatable Read Phantoms
READ UN- Maybe Maybe MaybeCOMMITTED
READ No Maybe MaybeCOMMITTED
REPEATABLE No No MaybeREAD
SERIALIZABLE No No No
![Page 29: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/29.jpg)
29
Theory of Serializability
A schedule of a set of transactions is a linear ordering of their actions e.g. for the simultaneous deposits example:
R1(X.bal) R2(X.bal) W1(X.bal) W2(X.bal) A serial schedule is one in which all the steps of
each transaction occur consecutively A serializable schedule is one which is equivalent
to some serial schedule (i.e. given any initial state, the final state is the same as one produced by some serial schedule) The example above is neither serial nor serializable
![Page 30: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/30.jpg)
30
Questions to Address
Given a schedule S, is it serializable? How can we "restrict" transactions in
progress to guarantee that only serializable schedules are produced?
![Page 31: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/31.jpg)
31
Conflicting Actions
Consider a schedule S in which there are two consecutive actions Ii and Ij of transactions Ti and Tj respectively
If Ii and Ij refer to different data items, then swapping Ii and Ij does not matter
If Ii and Ij refer to the same data item Q, then swapping Ii and Ij matters if and only if one of the actions is a write Ri(Q) Wj(Q) produces a different final value for Q than
Wj(Q) Ri(Q)
![Page 32: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/32.jpg)
32
Testing for Serializability
Given a schedule S, we can construct a di-graph G=(V,E) called a precedence graph V : all transactions in S E : Ti Tj whenever an action of Ti precedes
and conflicts with an action of Tj in S Theorem:
A schedule S is conflict serializable if and only if its precedence graph contains no cycles
Note that testing for a cycle in a digraph can be done in time O(|V|2)
![Page 33: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/33.jpg)
33
An Example
T1 T2 T3 R(X,Y,Z) R(X) W(X) R(Y) W(Y) R(Y) R(X) W(Z)
T1 T2 T3
Cyclic: Not serializable.
![Page 34: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/34.jpg)
34
Another Example
T1 T2 T3 R(X) W(X) R(X) W(X) R(Y) W(Y) R(Y) W(Y)
T1 T2 T3
Acyclic: serializable
![Page 35: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/35.jpg)
35
Producing the Equivalent Serial Schedule
If the precedence graph for a schedule is acyclic, then an equivalent serial schedule can be found by a topological sort of the graph
For the second example, the equivalent serial schedule is:
R1(Y)W1(Y)R2(X)W2(X) R2(Y)W2(Y) R3(X)W3(X)
![Page 36: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/36.jpg)
36
Locking and Serializability
We said that a transaction must hold all locks until it terminates (a condition called strict locking)
It turns out that this is crucial to guarantee serializability Note that the first (bad) example could have
been produced if transactions acquired and immediately released locks.
![Page 37: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/37.jpg)
37
Well-Formed, Two-Phased Transactions
A transaction is well-formed if it acquires at least a shared lock on Q before reading Q or an exclusive lock on Q before writing Q and doesn’t release the lock until the action is performed Locks are also released by the end of the transaction
A transaction is two-phased if it never acquires a lock after unlocking one i.e., there are two phases: a growing phase in which
the transaction acquires locks, and a shrinking phase in which locks are released
![Page 38: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/38.jpg)
38
Two-Phased Locking Theorem
If all transactions are well-formed and two-phase, then any schedule in which conflicting locks are never granted ensures serializability i.e., there is a very simple scheduler!
However, if some transaction is not well-formed or two-phase, then there is some schedule in which conflicting locks are never granted but which fails to be serializable i.e., one bad apple spoils the bunch.
![Page 39: Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content](https://reader034.vdocuments.us/reader034/viewer/2022042717/56649d775503460f94a59d2f/html5/thumbnails/39.jpg)
39
Summary
Transactions are all-or-nothing units of work guaranteed despite concurrency or failures in the system
Theoretically, the “correct” execution of transactions is serializable (i.e. equivalent to some serial execution)
Practically, this may adversely affect throughput isolation levels
With isolation levels, users can specify the level of “incorrectness” they are willing to tolerate