Transaction — exe5d_polimi_250116_braga.pdf


Transaction

● Primary unit of work
● Well formed: sequence of operations between begin and end (commit or abort)
● Concurrency needed in order to run multiple transactions in parallel.

1

Serial and serializable schedules

● Serial schedules: the actions of every transaction appear in blocks.

● A schedule is correct if it produces the same result as some serial schedule: such a schedule is called serializable.

2

Concurrency Control Techniques

● Theoretical:
  – View-Serializability (VSR)
  – Conflict-Serializability (CSR)

● Practical:
  – Two Phase Locking (2PL, 2PL Strict)
  – Timestamp (TS Single, TS Multi)

3

Concurrency Control Techniques

4

[Diagram: containment of schedule classes — Serial ⊂ 2PL, TS (mono) ⊂ CSR ⊂ VSR ⊂ All]

View-Serializability VSR

Same final writes and reads-from as one of the serial schedules.

w0(x) r2(x) r1(x) w2(x) w2(z)

w0(x) r1(x) r2(x) w2(x) w2(z)

5
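View equivalence can be checked mechanically from these two relations. A minimal Python sketch; the (op, transaction, resource) tuple encoding of a schedule is an assumption of mine, not notation from the slides:

```python
# Sketch: check view equivalence of two schedules.
# A schedule is a list of (op, transaction, resource) triples, e.g. ("w", 0, "x").

def reads_from(schedule):
    """Map each read to the transaction whose write it reads (None = initial value)."""
    last_writer = {}
    rf = []
    for op, txn, res in schedule:
        if op == "r":
            rf.append((txn, res, last_writer.get(res)))
        else:  # "w"
            last_writer[res] = txn
    return set(rf)

def final_writes(schedule):
    last_writer = {}
    for op, txn, res in schedule:
        if op == "w":
            last_writer[res] = txn
    return last_writer

def view_equivalent(s1, s2):
    return reads_from(s1) == reads_from(s2) and final_writes(s1) == final_writes(s2)

# The two schedules from the slide: w0(x) r2(x) r1(x) w2(x) w2(z) and its reordering
s1 = [("w",0,"x"), ("r",2,"x"), ("r",1,"x"), ("w",2,"x"), ("w",2,"z")]
s2 = [("w",0,"x"), ("r",1,"x"), ("r",2,"x"), ("w",2,"x"), ("w",2,"z")]
print(view_equivalent(s1, s2))  # True: same reads-from and same final writes
```

Swapping r1(x) past w2(x) instead would change the reads-from relation and break the equivalence.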

Conflict Graph Construction

● One node for every transaction
● A directed arc for every conflict:
  – read / write
  – write / write
  – write / read

CONFLICTS ARE GENERATED EVEN FOR NON-CONSECUTIVE OPERATIONS

6
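The construction above lends itself to a short sketch: build an arc for every (even non-consecutive) conflicting pair, then test acyclicity, which characterizes CSR. The tuple encoding of schedules is an assumption, not notation from the slides:

```python
# Sketch: build the conflict graph of a schedule and test conflict-serializability (CSR).
# A schedule is a list of (op, transaction, resource) triples; arcs also link
# non-consecutive operations, as the slide stresses.

def conflict_graph(schedule):
    nodes = {txn for _, txn, _ in schedule}
    arcs = set()
    for i, (op_i, ti, res_i) in enumerate(schedule):
        for op_j, tj, res_j in schedule[i + 1:]:
            if res_i == res_j and ti != tj and "w" in (op_i, op_j):
                arcs.add((ti, tj))  # r/w, w/w and w/r conflicts
    return nodes, arcs

def is_csr(schedule):
    """CSR iff the conflict graph is acyclic (checked via Kahn's topological sort)."""
    nodes, arcs = conflict_graph(schedule)
    indeg = {n: 0 for n in nodes}
    for _, t in arcs:
        indeg[t] += 1
    frontier = [n for n in nodes if indeg[n] == 0]
    seen = 0
    while frontier:
        n = frontier.pop()
        seen += 1
        for a, b in arcs:
            if a == n:
                indeg[b] -= 1
                if indeg[b] == 0:
                    frontier.append(b)
    return seen == len(nodes)  # all nodes sorted <=> no cycle

# The exercise schedule r1(x) w2(x) r1(z) w1(y) r3(x) r4(x) w3(z) w2(y) r3(y) w4(x) w4(y)
sched = [("r",1,"x"), ("w",2,"x"), ("r",1,"z"), ("w",1,"y"), ("r",3,"x"), ("r",4,"x"),
         ("w",3,"z"), ("w",2,"y"), ("r",3,"y"), ("w",4,"x"), ("w",4,"y")]
print(is_csr(sched))  # True: the conflict graph is acyclic
```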

• The following schedule is in VSR. Classify it wrt VSR, CSR

r1(x), w2(x), r1(z), w1(y), r3(x), r4(x), w3(z), w2(y), r3(y), w4(x), w4(y)

• Conflict graph:

r1(x), w2(x), r1(z), w1(y), r3(x), r4(x), w3(z), w2(y), r3(y), w4(x), w4(y)

X: r1, w2, r3, r4, w4
Y: w1, w2, r3, w4
Z: r1, w3

9

Definition: a write wi(X) is said to be blind if it is not the last action on resource X and the following action on X is a write wj(X).

Property: each schedule S ∈ VSR with S ∉ CSR has, in its conflict graph, cycles whose arcs are due to pairs of blind writes.

These writes can be swapped without modifying the reads-from and final-write relationships. Once the graph is acyclic, it is possible to find a serial schedule view-equivalent to the initial one.
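The definition of blind write above can be turned into a small checker; the (op, transaction, resource) encoding of a schedule is an assumption of mine:

```python
# Sketch: find the blind writes of a schedule, per the definition above:
# wi(X) is blind if it is not the last action on X and the next action on X is a write.

def blind_writes(schedule):
    by_res = {}
    for op, txn, res in schedule:
        by_res.setdefault(res, []).append((op, txn))
    blind = []
    for res, ops in by_res.items():
        for k, (op, txn) in enumerate(ops[:-1]):
            if op == "w" and ops[k + 1][0] == "w":
                blind.append((txn, res))
    return blind

# w1(x) w2(x) r3(x): w1(x) is blind, w2(x) is not (it is followed by a read)
print(blind_writes([("w",1,"x"), ("w",2,"x"), ("r",3,"x")]))  # [(1, 'x')]
```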

Concurrency Control – Hierarchical lock and Update lock

A 2PL system receives these requests from 3 transactions, each requesting the minimum level of lock in order to execute: r1(t1) r2(t1) r3(t1) w1(t1) r2(t5) w3(t1)

Show the behavior of the system (order of requests, order of actual execution, their result) in two different scenarios: (1) without update-lock on tuples (2) with update-lock on tuples

Also show the behavior with the following page-tuple hierarchy: P1(t1,t2,t3), P2(t4,t5,t6)

(Scenario 1) Without update lock, all transactions “eagerly” acquire their SL on t1: 

r1(t1) r2(t1) r3(t1) w1(t1) r2(t5) w3(t1)

SL1(t1) r1(t1)

SL2(t1) r2(t1)

SL3(t1) r3(t1)

[T3 and T1 cannot upgrade their locks and wait for the resource to be released]
SL2(t5) r2(t5)

rel(SL2(t5)) rel(SL2(t1))

[T2 commits, but T1 and T3 are in deadlock]

(Scenario 2) With update-lock on tuples, transactions T1 and T3 are forced to acquire update locks on t1, while T2 can request a read-lock (because its next write occurs on a different resource):

UL1(t1) r1(t1)
SL2(t1) r2(t1)
[T3 and T1 have to wait, as t1 is locked]
SL2(t5) r2(t5)
rel(SL2(t5)) rel(SL2(t1))
[now T1 can continue] XL1(t1) w1(t1) rel(XL1(t1))
[now T3 can “start”] UL3(t1) r3(t1) XL3(t1) w3(t1) rel(XL3(t1))

The acquisitions XL1(t1) and XL3(t1) are lock upgrades, and the corresponding rel(.) also releases the previously acquired “weaker” lock.

In order to use the Update Lock on hierarchical resources, we need to extend the notion of intention to the case of UL, introducing the notion of INTENTION OF UPDATE LOCK (IUL), regulated by the following compatibility table.

P1(t1,t2,t3), P2(t4,t5,t6)

r1(t1) r2(t1) r3(t1) w1(t1) r2(t5) w3(t1)

(Scenario 2) With update-lock on tuples, transactions T1 and T3 are forced to acquire update locks on t1, while T2 can request a read-lock (because its next write occurs on a different resource):

IUL1(P1) UL1(t1) r1(t1)
ISL2(P1) SL2(t1) r2(t1)
[T3 and T1 have to wait, as t1 is locked]
ISL2(P2) SL2(t5) r2(t5)
rel(SL2(t5)) rel(ISL2(P2)) rel(SL2(t1)) rel(ISL2(P1))
[now T1 can continue] IXL1(P1) XL1(t1) w1(t1) rel(XL1(t1)) rel(IXL1(P1))
[now T3 can “start”] IUL3(P1) UL3(t1) r3(t1) IXL3(P1) XL3(t1) w3(t1) rel(XL3(t1)) rel(IXL3(P1))

(Scenario 1) Without update lock, all transactions “eagerly” acquire their SL on t1:

ISL1(P1) SL1(t1) r1(t1)
ISL2(P1) SL2(t1) r2(t1)
ISL3(P1) SL3(t1) r3(t1)
[T3 and T1 cannot upgrade their locks]
ISL2(P2) SL2(t5) r2(t5)
rel(SL2(t5)) rel(ISL2(P2)) rel(SL2(t1)) rel(ISL2(P1))
[T2 commits, but T1 and T3 are in deadlock]

• The following schedule is in VSR. Classify it wrt CSR, TS mono, TS multi, 2PL Strict

r1(x), w2(x), r1(z), w1(y), r3(x), r4(x), w3(z), w2(y), r3(y), w4(x), w4(y)

• Conflict graph:

For 2PL Strict we assume that all transactions commit and release all their locks immediately after their last operation (this is the least restrictive option), and check whether every lock can be acquired when it is needed.

In detail...

• It is clear at once that T1 must release the SL on X before instant 2 (for T2 to acquire the XL on it), but this is not compatible with 2PL Strict, as the last operation of T1 occurs at instant 4. One may also observe that T2 must release the XL on X before instant 5 (for T3 to acquire an SL on it), but the last operation of T2 is at instant 8.

• From the conflict graph, which is acyclic, we know the schedule to be in CSR, and the only equivalent serial schedule is T1, T2, T3, T4.

• Finally, it is easy to observe that the schedule is in both TS mono-version and TS multi-version, even without simulating the evolution of the RTM and WTM counters: the transaction subscripts occur in strictly increasing order within the operations on each resource, so no operation can be rejected by the TS rules.

Timestamp Mono-version

• The scheduler has two counters, RTM(x) and WTM(x), for each object
• The scheduler receives read and write requests with timestamps:
  – read(x,ts):
    • If ts < WTM(x) the request is rejected and the transaction killed
    • Else, the request is granted and RTM(x) is set to max(RTM(x), ts)
  – write(x,ts):
    • If ts < WTM(x) or ts < RTM(x) the request is rejected and the transaction killed
    • Else, the request is granted and WTM(x) is set to ts
• Many transactions are killed
• To work w/o the commit-projection hypothesis, it needs to "buffer" write operations until commit, which introduces waits

19
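The rules above translate almost line by line into code. A minimal sketch, assuming the commit-projection hypothesis (no buffering of writes) and counters initialized to 0:

```python
# Sketch of the TS mono-version rules: one RTM and one WTM counter per object.

class TSMono:
    def __init__(self):
        self.rtm = {}  # RTM(x), initially 0
        self.wtm = {}  # WTM(x), initially 0

    def read(self, x, ts):
        if ts < self.wtm.get(x, 0):
            return False  # rejected: the transaction is killed
        self.rtm[x] = max(self.rtm.get(x, 0), ts)
        return True

    def write(self, x, ts):
        if ts < self.wtm.get(x, 0) or ts < self.rtm.get(x, 0):
            return False  # rejected: the transaction is killed
        self.wtm[x] = ts
        return True

# r2(x) w2(x) r1(x): the final read is rejected because ts = 1 < WTM(x) = 2
s = TSMono()
print(s.read("x", 2), s.write("x", 2), s.read("x", 1))  # True True False
```

The same schedule is accepted by a 2-version scheduler, which is exactly the separating example used later in the deck.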

Timestamp Multi-version

● Rules:
  – read(x,ts) is always accepted. A copy xk is selected for reading such that:
    • If ts > WTMN(x), then k = N
    • Else take k such that WTMk(x) < ts < WTMk+1(x)
  – write(x,ts):
    • If ts < RTM(x) the request is rejected
    • Else a new version is created (N is incremented) with WTMN(x) = ts

20
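A sketch of these rules, keeping for each object the list of write timestamps of its versions (version 0 stands for the initial value; the encoding is assumed, not from the slides):

```python
# Sketch of the TS multi-version rules: reads always succeed by picking the
# right version; writes are rejected only when ts < RTM(x).

class TSMulti:
    def __init__(self):
        self.rtm = {}
        self.versions = {}  # x -> sorted write timestamps WTMk(x); 0 = initial value

    def read(self, x, ts):
        self.rtm[x] = max(self.rtm.get(x, 0), ts)
        # select the copy xk with the largest WTMk(x) <= ts
        return max((w for w in self.versions.get(x, [0]) if w <= ts), default=0)

    def write(self, x, ts):
        if ts < self.rtm.get(x, 0):
            return False  # rejected
        self.versions.setdefault(x, [0]).append(ts)  # new version with WTM = ts
        self.versions[x].sort()
        return True

# w1(x) w2(x) r1(x): accepted, the read is served version x1 instead of being killed
s = TSMulti()
print(s.write("x", 1), s.write("x", 2), s.read("x", 1))  # True True 1
```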

(ctrl-z)n

The following schedule belongs to the class TS 2-versions but does not belong to TS-mono-version:

r2(X) w2(X) r1(X)

•Extend the diagram of the classes VSR, CSR, 2PL, TS-mono with the sets TS 2-versions, TS 3-versions.

•Show an example of schedule that belongs to TS-3 but not to TS-2.

21

(ctrl-z)n

If even a single version of the data prior to the most recent one is kept, we can “infringe” the reads-from relationships, so the classes TS-n (for n > 1) are not strictly included in the VSR set.

For example the schedule:

w1 w2 r1

is not serializable, but is accepted by a system that uses TS with more than one version. Moreover, as the number of available versions increases, the number of accepted schedules also grows.

The schedule

w1 w2 w3 r1

belongs to TS-3 but not to TS-2. In general, the schedule w1 w2 w3 ... wn-1 wn r1 belongs to TS-n but not to TS-(n-1).

22
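One way to make the TS-n intuition executable is to keep only the n most recent versions and reject a read whose required version has already been discarded — a semantics I am assuming here to match the examples, not one spelled out in the slides:

```python
# Sketch: TS with at most n stored versions. The only change with respect to the
# unbounded multi-version case is that a read fails when the version it needs
# is no longer available.

class TSNVersions:
    def __init__(self, n):
        self.n = n
        self.rtm = {}
        self.versions = {}  # x -> write timestamps of the versions still kept

    def read(self, x, ts):
        self.rtm[x] = max(self.rtm.get(x, 0), ts)
        usable = [w for w in self.versions.get(x, [0]) if w <= ts]
        return bool(usable)  # False: the needed version was discarded

    def write(self, x, ts):
        if ts < self.rtm.get(x, 0):
            return False
        vs = self.versions.setdefault(x, [0])
        vs.append(ts)
        vs.sort()
        del vs[:-self.n]  # keep only the n most recent versions
        return True

def accepts(n, schedule):
    s = TSNVersions(n)
    return all((s.read if op == "r" else s.write)(x, ts) for op, ts, x in schedule)

sched = [("w",1,"x"), ("w",2,"x"), ("w",3,"x"), ("r",1,"x")]  # w1 w2 w3 r1
print(accepts(3, sched), accepts(2, sched))  # True False
```

With n = 2 the version written by w1 is discarded when w3 arrives, so r1 has nothing to read — matching the claim that w1 w2 w3 r1 is in TS-3 but not in TS-2.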

[Diagram: Serial ⊂ 2PL, TS (mono) ⊂ CSR ⊂ VSR; TS-2 ⊂ TS-3 ⊂ … ⊂ TS-n extend beyond VSR]

23

ACID Transactions

Concurrency
● Serial Schedule
● VSR, CSR, 2PL, 2PL Strict, TS, TS-Multi
Guarantees:
● I: Isolation

Reliability
● System Log and Dump
● Warm/Cold restart
Guarantees:
● A: Atomicity
● D: Durability

24

C: Consistency is guaranteed at query compile time

Log Records

Transaction records

Actions performed by transactions.

Begin (T)

Insert (T, O, AS)

Delete (T, O, BS)

Update (T, O, BS, AS)

Commit (T)

Abort (T)

System records

Actions performed by the reliability controller.

Checkpoint (T1, T2, …)

Dump ()

25

T = Transaction, O = Object, BS = Before State, AS = After State

Reliability primitives

UNDO( Action )
● Update and Delete: O ← BS
● Insert: Remove O

REDO( Action )
● Update and Insert: O ← AS
● Delete: Remove O

26

Idempotence of undo and redo:
UNDO( UNDO( Action ) ) = UNDO( Action )
REDO( REDO( Action ) ) = REDO( Action )
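The primitives and their idempotence can be sketched on an in-memory database represented as a dict; the log-record encoding as tuples is an assumption of mine:

```python
# Sketch of UNDO/REDO on a dict database. Records follow the slide formats:
# ("insert", T, O, AS), ("delete", T, O, BS), ("update", T, O, BS, AS).

def undo(db, rec):
    kind, _, obj, *states = rec
    if kind in ("update", "delete"):
        db[obj] = states[0]   # O <- BS (for update, BS is the first state)
    elif kind == "insert":
        db.pop(obj, None)     # remove O
    return db

def redo(db, rec):
    kind, _, obj, *states = rec
    if kind == "update":
        db[obj] = states[1]   # O <- AS
    elif kind == "insert":
        db[obj] = states[0]   # O <- AS
    elif kind == "delete":
        db.pop(obj, None)     # remove O
    return db

# Idempotence: applying UNDO twice equals applying it once
rec = ("update", "t1", "o4", "b4", "a4")
print(undo(undo({"o4": "a4"}, rec), rec))  # {'o4': 'b4'}
```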

Warm Restart

1. UNDO = transactions active at the most recent checkpoint, REDO = {}

2. Scan the log forward from the most recent checkpoint:
   – Begin(T): UNDO += T
   – Commit(T): UNDO -= T, REDO += T

3. Trace back the log and undo all the actions of the transactions in the UNDO set.

4. Trace forward the log and redo all the actions of the transactions in the REDO set.

27

28

Example: describe the warm restart, indicating the progressive construction of the sets UNDO and REDO and the recovery actions, given the following situation on the log:

Dump, b(t1), b(t2), b(t3), i(t1,o1,a1), d(t2,o2,b2), b(t4), u(t4,o3,b3,a3), u(t1,o4,b4,a4), c(t2), ckpt(t1,t3,t4), b(t5), b(t6), u(t5,o5,b5,a5), a(t3), ckpt(t1,t4,t5,t6), b(t7), a(t4), u(t7,o6,b6,a6), u(t6,o3,b7,a7), b(t8), a(t7), failure

29
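Steps 1–2 of the warm restart can be sketched as follows, assuming (as is standard) that an abort record simply leaves the transaction in the UNDO set, since its actions must be undone anyway. The tuple encoding of the log is mine; only record kinds and transaction names matter here:

```python
# Sketch: build the UNDO and REDO sets from the log suffix that starts at the
# most recent checkpoint (steps 1-2 of the warm restart procedure).

def warm_restart_sets(log):
    last_ckpt = max(i for i, rec in enumerate(log) if rec[0] == "ckpt")
    undo = set(log[last_ckpt][1])  # transactions active at the checkpoint
    redo = set()
    for rec in log[last_ckpt + 1:]:
        if rec[0] == "b":          # Begin(T)
            undo.add(rec[1])
        elif rec[0] == "c":        # Commit(T)
            undo.discard(rec[1])
            redo.add(rec[1])
        # aborts ("a") leave the transaction in UNDO
    return undo, redo

# The exercise log (update/insert/delete arguments omitted: they play no role here)
log = [("dump",), ("b","t1"), ("b","t2"), ("b","t3"), ("i","t1"), ("d","t2"),
       ("b","t4"), ("u","t4"), ("u","t1"), ("c","t2"), ("ckpt", ("t1","t3","t4")),
       ("b","t5"), ("b","t6"), ("u","t5"), ("a","t3"), ("ckpt", ("t1","t4","t5","t6")),
       ("b","t7"), ("a","t4"), ("u","t7"), ("u","t6"), ("b","t8"), ("a","t7")]
undo, redo = warm_restart_sets(log)
print(sorted(undo), sorted(redo))  # ['t1', 't4', 't5', 't6', 't7', 't8'] []
```

No transaction commits after the last checkpoint, so REDO stays empty and all the listed transactions must be undone.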

Distributed deadlock

• Apply the distributed deadlock detection algorithm to the following wait-for conditions:

• Node 1: E2 -> t1; t1 -> t2; t2 -> E3; E4 -> t3; t3 -> t1
• Node 2: E3 -> t4; t4 -> t5; t5 -> t1; t1 -> E1; t6 -> E3
• Node 3: E2 -> t6; t6 -> t7; t7 -> E4; E1 -> t2; t2 -> t7; t4 -> E2
• Node 4: E3 -> t7; t7 -> t3; t3 -> E1

30

ti -> tj indicates a local wait (ti waits for the unlock of the resource locked by tj)

ti -> En indicates that ti waits for the termination of the execution of a sub-transaction t? on node n (synchronous invocation)

Em -> ti indicates that ti has been invoked synchronously by a transaction t? active on node m

31

A waiting condition arises whenever a sub-transaction ti, activated by a remote DBMS on node m, waits for a lock to be released by another transaction tj, which in turn waits for the termination of a remote sub-transaction invoked on a DBMS on node n. The situation is summarized by the following wait sequence:

Em -> ti -> tj -> En

32

The algorithm is distributed. Every instance (executed by a node) communicates to other instances of the same algorithm the waiting sequences:

Em -> ti -> tj -> En

Messages are sent only “ahead”, that is, towards the node where the sub-transaction for which tj is waiting was activated. Also, the message is sent only if (for example) i > j.

33

We simulate the execution – asynchronous and distributed – of the deadlock detection algorithm. Each node condenses each detected waiting condition (from remote to remote):

Em -> ti -> tm -> tn -> ... -> tp -> tj -> En

into the message:

Em -> ti -> tj -> En

to be sent to node n, if and only if i > j.

34
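As a centralized sanity check on the distributed protocol, one can merge the transaction-to-transaction waits of all nodes and look for a cycle; the deadlock that the message exchange below uncovers must show up here as well. The edge list is my reading of the exercise's wait-for conditions:

```python
# Sanity check: union the local transaction waits of all nodes and detect a cycle.

def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    def dfs(n, path):
        if n in path:
            return True
        return any(dfs(m, path | {n}) for m in graph.get(n, ()))
    return any(dfs(n, frozenset()) for n in graph)

# Transaction-to-transaction waits collected from the four nodes of the exercise
edges = [("t1","t2"), ("t3","t1"),   # node 1
         ("t4","t5"), ("t5","t1"),   # node 2
         ("t6","t7"), ("t2","t7"),   # node 3
         ("t7","t3")]                # node 4
print(has_cycle(edges))  # True: t2 -> t7 -> t3 -> t1 -> t2
```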

[Diagram: local wait-for graphs of nodes 1–4]

First round:
• Node 1: E4->t3->t2->E3, to be sent to node 3
• Node 2: E3->t4->t1->E1, to be sent to node 1
• Node 3: nothing to be sent
• Node 4: E3->t7->t3->E1, to be sent to node 1

35

[Diagram: wait-for graphs updated after the first round of messages]

Second round:
• Node 1 (having received E3->t4->t1->E1 and E3->t7->t3->E1): E3->t7->t2->E3, to be sent to node 3
• Nodes 2, 3 and 4: nothing to be sent

36

[Diagram: wait-for graph of node 3 after receiving E3->t7->t2->E3]

Third round:
• Node 3: the received condition, combined with the local wait t2 -> t7, closes the cycle t2 -> t7 -> t2: Deadlock

Z.1 Determine if the following schedule is compatible with 2PL Strict, 2PL or non-2PL.

r1(x) w1(x) r2(z) r1(y) w1(y) r2(x) w2(x) w2(z)

We SUPPOSE that the commits are performed immediately after each transaction's last operation:

r1(x) w1(x) r2(z) r1(y) w1(y) c1 r2(x) w2(x) w2(z) c2

    op     X    Y    Z    notes
1   r1(x)  S1
2   w1(x)  X1
3   r2(z)  X1        S2
4   r1(y)  X1   S1   S2
5   w1(y)  X1   X1   S2
6   c1     X1   X1   S2
7   r2(x)  S2        S2
8   w2(x)  X2        S2
9   w2(z)  X2        X2
10  c2     X2        X2

No incompatibilities: the schedule is 2PL Strict compatible (and therefore also 2PL).
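The table-based check can be automated with a small strict-2PL replayer: each operation must get its lock immediately, and commits release everything. This is a simplified sketch (no waiting queue, no deadlock analysis) that only decides whether some operation would have to wait; the schedule encoding is an assumption of mine:

```python
# Sketch: replay a schedule under strict 2PL (locks held until commit) and report
# whether every operation can proceed without waiting, as in the table above.

def strict_2pl_ok(schedule):
    locks = {}  # resource -> (mode, set of holding transactions)
    held = {}   # transaction -> set of resources it has locked
    for op, txn, res in schedule:
        if op == "c":  # commit: release all of txn's locks
            for r in held.pop(txn, set()):
                mode, owners = locks[r]
                owners.discard(txn)
                if not owners:
                    del locks[r]
            continue
        need = "S" if op == "r" else "X"
        mode, owners = locks.get(res, (None, set()))
        if mode is None or owners == {txn} or (mode == "S" and need == "S"):
            new_mode = "X" if need == "X" or mode == "X" else "S"
            owners.add(txn)
            locks[res] = (new_mode, owners)  # grant (possibly upgrading S -> X)
            held.setdefault(txn, set()).add(res)
        else:
            return False  # the operation would have to wait
    return True

# Z.1 with commits right after each transaction's last operation
z1 = [("r",1,"x"), ("w",1,"x"), ("r",2,"z"), ("r",1,"y"), ("w",1,"y"), ("c",1,None),
      ("r",2,"x"), ("w",2,"x"), ("w",2,"z"), ("c",2,None)]
print(strict_2pl_ok(z1))  # True: no operation waits, as found in the table
```

On Z.2 below the same replayer reports a wait already at w3(x), which is where the manual analysis also stops.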

Z.2 Determine if the following schedule is compatible with 2PL Strict, 2PL or non-2PL.

r1(x) w1(x) w3(x) r2(y) r3(y) w3(y) w1(y) r2(x)

We SUPPOSE that the commits are performed immediately after each transaction's last operation:

r1(x) w1(x) w3(x) r2(y) r3(y) w3(y) c3 w1(y) c1 r2(x) c2

    op     X    Y    notes
1   r1(x)  S1
2   w1(x)  X1
3   w3(x)            t3 has to wait for t1 on X
4   r2(y)  X1   S2
5   r3(y)            waiting
6   w3(y)            waiting
7   c3               waiting, can’t commit
8   w1(y)            t1 has to wait for t2 on Y
9   c1               waiting, can’t commit
10  r2(x)            t2 has to wait for t1 on X
11  c2               waiting, can’t commit

[Wait-for graph: t3 -> t1, t1 -> t2, t2 -> t1]

There is a DEADLOCK between t1 and t2.

The schedule is not compatible with 2PL Strict.

We can remove the strictness and check if it’s 2PL. (A transaction can unlock resources before the commit, but cannot acquire other locks after the first unlock).

    op     X    Y    notes
1   r1(x)  S1
2   w1(x)  X1
3   w3(x)
4   r2(y)
5   r3(y)
6   w3(y)
7   c3
8   w1(y)
9   c1
10  r2(x)
11  c2

In order to let t3 lock X, t1 should unlock it first. Because of the 2PL rule (after the first unlock a transaction cannot lock any other resource), we should try to anticipate all of t1's locks before its first unlock.

    op     X    Y    notes
1   r1(x)  S1
2   w1(x)  X1
3   w3(x)
4   r2(y)
5   r3(y)
6   w3(y)
7   c3
8   w1(y)
9   c1
10  r2(x)
11  c2

The only other action of t1 is w1(y). We should move its lock upwards to be able to unlock X. Without anticipating the lock, in fact, t1 would need to unlock X and then lock Y, which is not possible under the 2PL rule.

    op     X    Y    notes
1   r1(x)  S1
2   w1(x)  X1
           X1   X1
3   w3(x)  X3
4   r2(y)
5   r3(y)
6   w3(y)
7   c3
8   w1(y)
9   c1
10  r2(x)
11  c2

Now it is possible for t3 to lock X. But t1 cannot unlock Y until w1(y) (instant 8), and this would conflict with the actions r2(y), r3(y) and w3(y) (instants 4, 5, 6). So this solution is not feasible.

    op     X    Y    notes
1   r1(x)  S1
2   w1(x)  X1
           X1   X1
3   w3(x)  X3
4   r2(y)
5   r3(y)
6   w3(y)
7   c3
8   w1(y)
9   c1
10  r2(x)
11  c2

From this example we can derive a general rule: we can anticipate the locks/unlocks of an action A (at instant M) to instant N only if A is not in conflict with any other action between N and M. Otherwise we would generate more conflicts → more waiting states.

    op     X    Y    notes
1   r1(x)  S1
2   w1(x)  X1
3   w3(x)
4   r2(y)
5   r3(y)
6   w3(y)
7   c3
8   w1(y)
9   c1
10  r2(x)
11  c2

We could anticipate t1's lock of Y and unlock of X before instant 7, but not before instant 6 (it would generate a conflict). This does not solve the waiting state, so w3(x) still has to wait.

    op     X    Y    notes
1   r1(x)  S1
2   w1(x)  X1
3   w3(x)            t3 has to wait for t1 on X
4   r2(y)  X1   S2
5   r3(y)            waiting
6   w3(y)            waiting
7   c3               waiting, can’t commit
8   w1(y)
9   c1
10  r2(x)
11  c2

At instant 8, t1 needs an exclusive lock on Y; t2 does not need it anymore, but t2 would have to lock X before unlocking Y.

    op     X    Y    notes
1   r1(x)  S1
2   w1(x)  X1
3   w3(x)            t3 has to wait for t1 on X
4   r2(y)  X1   S2
5   r3(y)            waiting
6   w3(y)            waiting
7   c3               waiting, can’t commit
8   w1(y)
9   c1
10  r2(x)
11  c2

Unfortunately, resource X is locked by t1, so it cannot be locked by t2. And t1 cannot release X, otherwise it loses the ability to lock Y.

    op     X    Y    notes
1   r1(x)  S1
2   w1(x)  X1
3   w3(x)            t3 has to wait for t1 on X
4   r2(y)  X1   S2
5   r3(y)            waiting
6   w3(y)            waiting
7   c3               waiting, can’t commit
8   w1(y)            t1 has to wait for t2 on Y
9   c1               waiting, can’t commit
10  r2(x)            t2 has to wait for t1 on X
11  c2               waiting, can’t commit

There is still a deadlock, the schedule is not 2PL.

[Wait-for graph: t3 -> t1, t1 -> t2, t2 -> t1]