Concurrency Control in DBMS

UNIT II Concurrency Control

Upload: esmita-gupta

Post on 22-Nov-2014


DESCRIPTION

A presentation on concurrency control, showing the different locking protocols and how to avoid deadlock.

TRANSCRIPT

Page 1: Concurrency control in dbms

UNIT II: Concurrency Control

Page 2: Concurrency control in dbms

Purpose of Concurrency Control

• To enforce isolation (through mutual exclusion) among conflicting transactions.
• To preserve database consistency through consistency-preserving execution of transactions.
• To resolve read-write and write-write conflicts.

Example:
• In a concurrent execution environment, if T1 conflicts with T2 over a data item A, then the concurrency control decides which of T1 and T2 gets access to A, and whether the other transaction is rolled back or waits.

Page 3: Concurrency control in dbms

LOCKING
• Locking is an operation which secures (a) permission to read or (b) permission to write a data item for a transaction.
Example:
• lock(X): data item X is locked on behalf of the requesting transaction.
• Unlocking is an operation which removes these permissions from the data item.
Example:
• unlock(X): data item X is made available to all other transactions.
• Lock and unlock are atomic operations.

Page 4: Concurrency control in dbms

Lock-Based Protocols
• A lock is a mechanism to control concurrent access to a data item.
• Data items can be locked in two modes:
  1. Exclusive (X) mode: the data item can be both read and written. An X-lock is requested using the lock-X instruction.
  2. Shared (S) mode: the data item can only be read. An S-lock is requested using the lock-S instruction.
• Lock requests are made to the concurrency-control manager. A transaction can proceed only after its request is granted.

Page 5: Concurrency control in dbms

Lock-Based Protocols (Cont.)
• Lock-compatibility matrix:

           S       X
    S    true    false
    X    false   false

• A transaction may be granted a lock on an item if the requested lock is compatible with locks already held on the item by other transactions.
• Any number of transactions can hold shared locks on an item, but if any transaction holds an exclusive lock on the item, no other transaction may hold any lock on the item.
• If a lock cannot be granted, the requesting transaction is made to wait till all incompatible locks held by other transactions have been released. The lock is then granted.
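The compatibility matrix above can be sketched as a small lookup; this is an illustrative toy assuming only the two modes S and X, and the table and function names are ours, not part of any DBMS API:

```python
# S is compatible only with S; X is compatible with nothing.
COMPATIBLE = {
    ("S", "S"): True,   # any number of shared locks may coexist
    ("S", "X"): False,  # an exclusive request conflicts with a shared holder
    ("X", "S"): False,
    ("X", "X"): False,
}

def can_grant(requested_mode, held_modes):
    """Grant the request only if it is compatible with every lock
    already held on the item by other transactions."""
    return all(COMPATIBLE[(held, requested_mode)] for held in held_modes)
```

Note that a request on an unlocked item (empty `held_modes`) is always granted, matching the matrix trivially.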

Page 6: Concurrency control in dbms

Lock-Based Protocols (Cont.)
• Example of a transaction performing locking:
  T2: lock-S(A); read(A); unlock(A);
      lock-S(B); read(B); unlock(B);
      display(A+B)
• Locking as above is not sufficient to guarantee serializability: if A and B get updated in between the reads of A and B, the displayed sum would be wrong.
• A locking protocol is a set of rules followed by all transactions while requesting and releasing locks. Locking protocols restrict the set of possible schedules.

Page 7: Concurrency control in dbms

Pitfalls of Lock-Based Protocols
Consider the partial schedule (reconstructed from the description below):

  T3                    T4
  lock-X(B)
                        lock-S(A)
                        lock-S(B)   (waits)
  lock-X(A)  (waits)

• Neither T3 nor T4 can make progress: executing lock-S(B) causes T4 to wait for T3 to release its lock on B, while executing lock-X(A) causes T3 to wait for T4 to release its lock on A.
• Such a situation is called a deadlock.
• To handle a deadlock, one of T3 or T4 must be rolled back and its locks released.

Page 8: Concurrency control in dbms

Cont…

• The potential for deadlock exists in most locking protocols.

• Starvation is also possible if the concurrency-control manager is badly designed. For example:
• A transaction may be waiting for an X-lock on an item, while a sequence of other transactions request and are granted an S-lock on the same item.

• The same transaction is repeatedly rolled back due to deadlocks.

• Concurrency control manager can be designed to prevent starvation.

Page 9: Concurrency control in dbms

THE LOCKING SCHEDULER
• The locking scheduler:
• It is responsible for granting locks to the transactions.
• It keeps a list of all current locks and requests for locks.
• Key idea behind the locking scheduler:
• Each data item has a unique lock.
• Transactions must first acquire the lock before reading/writing the element.
• If the lock is taken by another transaction, then wait for it to be released.
• The transactions must release the locks.
• A locking scheduler generally comprises two parts:
Part I: This part takes the stream of requests from the transactions and inserts lock actions ahead of all database-access operations.
Part II: It executes the sequence of actions passed to it by Part I.
  • This part determines whether to delay any transaction T because a lock has not been granted. If so, it adds the action to the list of actions that must eventually be performed for transaction T.
  • If the action is a database access, it is transmitted to the database and executed; if the action is a lock request, it examines the lock table to see if the lock can be granted.

Page 10: Concurrency control in dbms

Functioning of the scheduler

• When a transaction T commits or aborts: Part I releases all locks held by T. If any transactions are waiting for any of these locks, Part I notifies Part II.

• When Part II is notified that a lock on some database item P is available: It determines the next transaction or transactions that can now be given a lock on P. Those transactions are allowed to execute their delayed actions until they either complete or reach another lock request that cannot be granted.

Page 11: Concurrency control in dbms

The Two-Phase Locking Protocol
• This is a protocol which ensures conflict-serializable schedules. It requires that each transaction issue its lock and unlock requests in two phases:
• Phase 1: Growing phase
  • the transaction may obtain locks
  • the transaction may not release locks
• Phase 2: Shrinking phase
  • the transaction may release locks
  • the transaction may not obtain locks
• The protocol assures serializability. It can be proved that the transactions can be serialized in the order of their lock points (i.e., the point where a transaction acquires its final lock).
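The growing/shrinking discipline can be sketched as a guard that rejects any lock request made after the first unlock. This is an illustrative toy, not a real lock manager; all names are assumptions:

```python
class TwoPhaseTransaction:
    """Enforce two-phase locking: once the first unlock happens,
    the transaction enters its shrinking phase and may not lock again."""
    def __init__(self):
        self.locks = set()
        self.shrinking = False   # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested in shrinking phase")
        self.locks.add(item)

    def unlock(self, item):
        self.shrinking = True    # lock point has passed
        self.locks.discard(item)
```

A transaction that locks A and B, unlocks A, and then tries to lock C would raise the error, because the new request falls in the shrinking phase.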

Page 12: Concurrency control in dbms

Example

  T1              T2              Result
  lock-S(Y);      lock-S(X);      Initial values: X=20, Y=30
  read(Y);        read(X);
  unlock(Y);      unlock(X);      Serial execution T1 then T2:
  lock-X(X);      lock-X(Y);      X=50, Y=80
  read(X);        read(Y);
  X:=X+Y;         Y:=X+Y;         Serial execution T2 then T1:
  write(X);       write(Y);       X=70, Y=50
  unlock(X);      unlock(Y);
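The Result column can be double-checked with a tiny simulation of the two serial orders (the helper function is hypothetical, written only to verify the arithmetic on this slide):

```python
def run_serial(order, x=20, y=30):
    """Run T1 (X := X + Y) and T2 (Y := X + Y) serially in the given order,
    starting from the slide's initial values X=20, Y=30."""
    for t in order:
        if t == "T1":
            x = x + y
        else:
            y = x + y
    return x, y

# T1 then T2 gives X=50, Y=80; T2 then T1 gives X=70, Y=50, as in the slide.
```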

Page 13: Concurrency control in dbms

Example 2: Is this schedule serializable or not?

  T1              T2
  lock-S(Y);                      Initial values: X=50, Y=50
  read(Y);
  unlock(Y);
                  lock-S(X);
                  read(X);
                  unlock(X);
                  lock-X(Y);
                  read(Y);
                  Y:=X+Y;
                  write(Y);
                  unlock(Y);
  lock-X(X);
  read(X);
  X:=X+Y;
  write(X);
  unlock(X);

Page 14: Concurrency control in dbms

2PL does not ensure freedom from deadlock

  T1                     T2
  read_lock(Y);                          T1 and T2 both follow the two-phase
  read_item(Y);                          policy, but they are deadlocked
                         read_lock(X);
                         read_item(X);
  write_lock(X);
  (waits for X)          write_lock(Y);
                         (waits for Y)

Page 15: Concurrency control in dbms

Cont…
• Two-phase locking does not ensure freedom from deadlocks.
• Cascading rollback is possible under two-phase locking. To avoid this, follow a modified protocol called strict two-phase locking: a transaction must hold all its exclusive locks till it commits/aborts, preventing any other transaction from reading the data it wrote.
• Rigorous two-phase locking is even stricter than the above: here all locks are held till the transaction commits/aborts. In this protocol transactions can be serialized in the order in which they commit.

Page 16: Concurrency control in dbms

Lock Conversions in 2PL
• Two-phase locking with lock conversions:
– First phase:
  • can acquire a lock-S on an item
  • can acquire a lock-X on an item
  • can convert a lock-S to a lock-X (upgrade)
– Second phase:
  • can release a lock-S
  • can release a lock-X
  • can convert a lock-X to a lock-S (downgrade)
• This protocol assures serializability, but still relies on the programmer to insert the various locking instructions.

Page 17: Concurrency control in dbms

Implementation of Locking
• A lock manager can be implemented as a separate process to which transactions send lock and unlock requests.
• The lock manager replies to a lock request by sending a lock-grant message (or a message asking the transaction to roll back, in case of a deadlock).
• The requesting transaction waits until its request is answered.
• The lock manager maintains a data structure called a lock table to record granted locks and pending requests.
• The lock table is usually implemented as an in-memory hash table indexed on the name of the data item being locked.

Page 18: Concurrency control in dbms

Lock Table
• Black rectangles indicate granted locks, white ones indicate waiting requests.
• The lock table also records the type of lock granted or requested.
• A new request is added to the end of the queue of requests for the data item, and granted if it is compatible with all earlier locks.
• Unlock requests result in the request being deleted, and later requests are checked to see if they can now be granted.
• If a transaction aborts, all waiting or granted requests of the transaction are deleted; the lock manager may keep a list of locks held by each transaction to implement this efficiently.
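The queue-per-item behaviour described above can be sketched as a toy in-memory table with only S/X modes; every class, method, and field name here is illustrative, not part of any real DBMS:

```python
from collections import defaultdict, deque

class LockTable:
    """Toy lock table: one FIFO queue of [txn, mode, granted] entries
    per data item, with only "S" and "X" modes."""
    def __init__(self):
        self.table = defaultdict(deque)   # item name -> queue of requests

    def request(self, item, txn, mode):
        queue = self.table[item]
        # grant only if compatible with every earlier request in the queue
        # (S is compatible only with S; an empty queue grants anything)
        granted = all(mode == "S" and m == "S" for _, m, _ in queue)
        queue.append([txn, mode, granted])
        return granted

    def release(self, item, txn):
        # delete this transaction's entries, then re-check waiters in order
        queue = deque(r for r in self.table[item] if r[0] != txn)
        self.table[item] = queue
        for i, req in enumerate(queue):
            if not req[2] and all(req[1] == "S" and queue[j][1] == "S"
                                  for j in range(i)):
                req[2] = True
```

Checking a new request against *all* earlier queue entries, not just the granted ones, gives the FIFO fairness the slide describes and avoids starving a waiting X-lock behind a stream of S-locks.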

Page 19: Concurrency control in dbms

Timestamp-Based Protocols
• Each transaction is issued a timestamp when it enters the system. If an old transaction Ti has timestamp TS(Ti), a new transaction Tj is assigned timestamp TS(Tj) such that TS(Ti) < TS(Tj). There are two simple methods for implementing this scheme:

1. Use the value of the system clock.

2. Use a logical counter that is incremented after a new timestamp has been assigned.

• The protocol manages concurrent execution such that the timestamps determine the serializability order.

• In order to assure such behavior, the protocol maintains for each data item Q two timestamp values:
• W-timestamp(Q) is the largest timestamp of any transaction that executed write(Q) successfully.
• R-timestamp(Q) is the largest timestamp of any transaction that executed read(Q) successfully.
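Method 2, the logical counter, fits in a few lines (the names here are illustrative):

```python
import itertools

# Hand out strictly increasing timestamps as transactions enter the system.
_next_ts = itertools.count(1)

def new_timestamp():
    """Assign the next timestamp; a later transaction always gets a
    larger value, so TS(Ti) < TS(Tj) whenever Ti entered before Tj."""
    return next(_next_ts)
```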

Page 20: Concurrency control in dbms

Timestamp-Based Protocols (Cont.)

• The timestamp-ordering protocol ensures that any conflicting read and write operations are executed in timestamp order. The protocol operates as follows:

• Suppose a transaction Ti issues a read(Q):

1. If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q that was already overwritten. Hence, the read operation is rejected, and Ti is rolled back.

2. If TS(Ti) ≥ W-timestamp(Q), then the read operation is executed, and R-timestamp(Q) is set to the maximum of R-timestamp(Q) and TS(Ti).
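The read rule can be sketched as a function over two dictionaries holding the per-item timestamps. A minimal sketch: the names and the 0 default (meaning "no timestamp recorded yet") are our assumptions:

```python
def to_read(ts_ti, item, W, R):
    """Timestamp-ordering read rule. W and R map item -> W-timestamp
    and R-timestamp. Returns False when Ti must be rolled back."""
    if ts_ti < W.get(item, 0):
        return False                          # Q already overwritten: reject
    R[item] = max(R.get(item, 0), ts_ti)      # record the successful read
    return True
```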

Page 21: Concurrency control in dbms

Timestamp-Based Protocols (Cont.)

• Suppose that transaction Ti issues write(Q).

1. If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is producing was needed previously, and the system assumed that that value would never be produced. Hence, the write operation is rejected, and Ti is rolled back.

2. If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q. Hence, this write operation is rejected, and Ti is rolled back.

3. Otherwise, the write operation is executed, and W-timestamp(Q) is set to TS(Ti).
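The three-case write rule can be sketched in the same style as the read rule above (illustrative names; 0 again stands for "no timestamp recorded yet"):

```python
def to_write(ts_ti, item, W, R):
    """Timestamp-ordering write rule (cases 1-3 above).
    Returns False when Ti must be rolled back."""
    if ts_ti < R.get(item, 0):
        return False          # case 1: a later transaction already read Q
    if ts_ti < W.get(item, 0):
        return False          # case 2: Ti's write is obsolete
    W[item] = ts_ti           # case 3: execute the write
    return True
```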

Page 22: Concurrency control in dbms

Correctness of Timestamp-Ordering Protocol

• The timestamp-ordering protocol guarantees serializability since all the arcs in the precedence graph are of the form:

  (transaction with smaller timestamp) → (transaction with larger timestamp)

  Thus, there will be no cycles in the precedence graph.
• The timestamp protocol ensures freedom from deadlock, as no transaction ever waits.
• But the schedule may not be cascade-free, and may not even be recoverable.

Page 23: Concurrency control in dbms

Recoverability and Cascade Freedom

• Problem with the timestamp-ordering protocol:
• Suppose Ti aborts, but Tj has read a data item written by Ti.
• Then Tj must abort; if Tj had been allowed to commit earlier, the schedule is not recoverable.
• Further, any transaction that has read a data item written by Tj must abort.
• This can lead to cascading rollback, that is, a chain of rollbacks.
• Solution 1: A transaction is structured such that its writes are all performed at the end of its processing. All writes of a transaction form an atomic action; no transaction may execute while a transaction is being written.
• Solution 2: A limited form of locking: wait for data to be committed before reading it.
• Solution 3: Tracking uncommitted writes: Ti is allowed to commit only after the commit of any transaction that wrote a value that Ti reads.

Page 24: Concurrency control in dbms

Thomas' Write Rule
• A modified version of the timestamp-ordering protocol in which obsolete write operations may be ignored under certain circumstances.
• Let us consider the following schedule:

  T1          T2
  read(Q)
              write(Q)
  write(Q)

• T1 starts before T2, therefore TS(T1) < TS(T2).
• The read(Q) operation of T1 succeeds, as does the write(Q) of T2.
• When T1 attempts its write(Q) operation, we find that TS(T1) < W-timestamp(Q), since W-timestamp(Q) = TS(T2).
• Thus write(Q) by T1 is rejected and T1 is rolled back.
• Under the timestamp-ordering protocol the transaction is rolled back, but this is not necessary, since Q was already overwritten by T2's later write.

Page 25: Concurrency control in dbms

• This observation leads to a modified version of the timestamp-ordering protocol in which obsolete write operations can sometimes be ignored.
• Thomas' write rule allows greater potential concurrency. Unlike the previous protocols, it allows some view-serializable schedules that are not conflict-serializable.
• Suppose that transaction T issues write(X):
1. If read_TS(X) > TS(T), then abort and roll back T and reject the operation.
2. If write_TS(X) > TS(T), then just ignore the write operation and continue execution. This is because only the most recent write counts in the case of two consecutive writes.
3. If the conditions given in 1 and 2 above do not occur, then execute write_item(X) of T and set write_TS(X) to TS(T).
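The three cases can be sketched as a function returning the action taken, which makes the difference from the plain timestamp-ordering write rule visible: case 2 ignores the obsolete write instead of aborting. An illustrative sketch with our own names:

```python
def thomas_write(ts_t, item, W, R):
    """Thomas' write rule for T issuing write(item).
    W and R map item -> write_TS and read_TS (default 0 = none yet)."""
    if ts_t < R.get(item, 0):
        return "abort"    # case 1: a later transaction read the old value
    if ts_t < W.get(item, 0):
        return "ignore"   # case 2: obsolete write, skip it and continue
    W[item] = ts_t
    return "write"        # case 3: perform the write
```

Running the slide's schedule through it: with TS(T1)=1 and TS(T2)=2, T2's write(Q) is performed, and T1's later write(Q) is ignored rather than causing a rollback.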

Page 26: Concurrency control in dbms

Deadlock
• A deadlock is a situation in which two or more competing actions are each waiting for the other to finish, and thus neither ever does.
• Deadlock is an operational problem.
• It does not violate the ACID properties.
• To detect deadlock, we first draw a directed dependency (wait-for) graph. If there is a cycle, then there is a deadlock; otherwise there is not.

  T1                     T2
  read_lock(Y);
  read_item(Y);
                         read_lock(X);
                         read_item(X);
  write_lock(X);
  (waits for X)          write_lock(Y);
                         (waits for Y)

Page 27: Concurrency control in dbms

Conditions for deadlocks
• Mutual exclusion: No resource can be shared by more than one process at a time.
• Hold and wait: There must exist a process that is holding at least one resource and is waiting to acquire additional resources that are currently being held by other processes.
• No preemption: The operating system must not de-allocate resources once they have been allocated; they must be released by the holding process voluntarily.
• Circular wait: A process must be waiting for a resource which is being held by another process, which in turn is waiting for the first process to release the resource. In general, there is a set of waiting processes, each waiting for a resource held by the next process in the set.

Page 28: Concurrency control in dbms

Strategies for handling deadlocks
• Deadlock prevention: Prevents deadlocks by restraining requests so as to ensure that at least one of the four deadlock conditions cannot occur.
• Deadlock avoidance: Dynamically grants a resource to a process if the resulting state is safe. A state is safe if there is at least one execution sequence that allows all processes to run to completion.
• Deadlock detection and recovery: Allows deadlocks to form, then finds and breaks them.

Page 29: Concurrency control in dbms

Deadlock Prevention:
Deadlock prevention works by preventing one of the four Coffman conditions from occurring:
• Removing the mutual exclusion condition means that no process will have exclusive access to a resource.
• The hold-and-wait condition may be removed by requiring processes to request all the resources they will need before starting up (or before embarking upon a particular set of operations).
• This advance knowledge is frequently difficult to satisfy and, in any case, is an inefficient use of resources.
• Another way is to require processes to request resources only when they hold none. Thus, they must first release all their currently held resources before requesting all the resources they will need from scratch. This too is often impractical, because resources may be allocated and remain unused for long periods. Also, a process requiring a popular resource may have to wait indefinitely, as such a resource may always be allocated to some process, resulting in resource starvation.

Page 30: Concurrency control in dbms

Deadlock Prevention (Cont…)
• To remove the no-preemption condition, use preemption and transaction rollback, i.e., when T2 requests a lock that T1 holds, the lock granted to T1 may be preempted by rolling back T1 and granting the lock to T2.
• To control the preemption, we assign a unique timestamp to each transaction, used to decide whether the transaction should wait or be rolled back.
• The final condition is the circular wait condition, which can be removed by imposing a total ordering on resources and requiring each transaction to request its locks in that order.

Page 31: Concurrency control in dbms

Deadlock detection:
• In this approach, deadlocks are allowed to happen. The scheduler maintains a wait-for graph for detecting cycles. If a cycle exists, then one transaction involved in the cycle is selected (the victim) and rolled back.
• Deadlocks can be described by a wait-for graph, which consists of a pair G = (V, E):
• V is a set of vertices (all the transactions in the system).
• E is a set of edges; each element is an ordered pair Ti → Tj.
• If Ti → Tj is in E, then there is a directed edge from Ti to Tj, implying that Ti is waiting for Tj to release a data item.
• When Ti requests a data item currently being held by Tj, the edge Ti → Tj is inserted in the wait-for graph. This edge is removed only when Tj is no longer holding a data item needed by Ti.
• The system is in a deadlock state if and only if the wait-for graph has a cycle.
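Detecting a cycle in the wait-for graph is a standard depth-first search. A minimal sketch, with the graph represented as a dictionary mapping each transaction to the transactions it waits for:

```python
def has_cycle(wait_for):
    """Return True iff the wait-for graph contains a cycle, i.e. the
    system is deadlocked. wait_for maps Ti -> iterable of Tj."""
    visiting, done = set(), set()

    def dfs(t):
        if t in done:
            return False
        if t in visiting:
            return True          # back edge: we returned to a node on the path
        visiting.add(t)
        found = any(dfs(u) for u in wait_for.get(t, ()))
        visiting.discard(t)
        done.add(t)
        return found

    return any(dfs(t) for t in list(wait_for))
```

For the two-transaction deadlock on the earlier slide, the graph is {T1: {T2}, T2: {T1}}, which the search reports as a cycle.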

Page 32: Concurrency control in dbms

Wait for graph

Wait-for graph without a cycle Wait-for graph with cycle

Page 33: Concurrency control in dbms

Deadlock Avoidance
• Deadlock can be avoided if certain information about processes is available to the operating system before allocation of resources, such as which resources a process will consume in its lifetime. For every resource request, the system checks whether granting the request would move the system into an unsafe state, meaning a state that could result in deadlock. The system then only grants requests that will lead to safe states.
• Algorithms used for deadlock avoidance:
• Wait/Die scheme
• Wound/Wait scheme

Page 34: Concurrency control in dbms

Deadlock Avoidance Algo 1
WAIT-DIE rule: If Ti requests a lock on a data item which is already locked by Tj, then Ti is permitted to wait iff ts(Ti) < ts(Tj). If ts(Ti) > ts(Tj), then Ti is aborted and restarted with the same timestamp.

• if ts(Ti) < ts(Tj) then Ti waits, else Ti dies

• non-preemptive: Ti never preempts Tj

• prefers older transactions: an older requester is allowed to wait, a younger one is rolled back

• Example: transactions T1, T2, T3 have timestamps 5, 10, 15. If T1 requests an item held by T2, T1 waits: WAIT(T1). If T3 requests an item held by T2, T3 is rolled back: Rollback(T3).
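The wait-die decision reduces to one timestamp comparison. A sketch (equal timestamps are assumed not to occur, since timestamps are unique):

```python
def wait_die(ts_requester, ts_holder):
    """Wait-die: an older requester waits; a younger requester dies
    and is restarted later with its original timestamp."""
    return "wait" if ts_requester < ts_holder else "die"
```

With timestamps 5, 10, 15 as in the example: T1 requesting from T2 waits, while T3 requesting from T2 dies.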

Page 35: Concurrency control in dbms

Deadlock Avoidance Algo 2

WOUND-WAIT rule: If Ti requests a lock on a data item which is already locked by Tj, then Ti is permitted to wait iff ts(Ti) > ts(Tj). If ts(Ti) < ts(Tj), then Tj is aborted and the lock is granted to Ti.

• if ts(Ti) < ts(Tj) then Tj is wounded, else Ti waits

• preemptive: Ti preempts Tj if Tj is younger
• prefers older transactions

• Example: transactions T1, T2, T3 have timestamps 5, 10, 15. If T1 requests an item held by T2, the data item is preempted from T2 and T2 is rolled back. If T3 requests an item held by T2, T3 waits.
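The wound-wait decision is the mirror image of wait-die (sketch, illustrative names):

```python
def wound_wait(ts_requester, ts_holder):
    """Wound-wait: an older requester wounds (preempts and rolls back)
    the younger holder; a younger requester waits."""
    return "wound holder" if ts_requester < ts_holder else "wait"
```

With timestamps 5, 10, 15: T1 requesting from T2 wounds T2, while T3 requesting from T2 waits.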

Page 36: Concurrency control in dbms

Deadlock detection and recovery
Once a deadlock has been detected, the system must recover from it using one of the approaches below:

PROCESS TERMINATION:
• Delete all the processes in the deadlock (expensive), or delete one at a time until the deadlock is broken (time-consuming).
• Select which process to terminate based on priority, time executed, time to completion, needs for completion, or depth of rollback.
• In general, it is easier to preempt the resource than to terminate the process.

RESOURCE PREEMPTION:
• Select a victim: which process and which resource to preempt.
• Roll back to a previously defined "safe" state.
• Prevent one process from always being the one preempted (starvation).