1 controlled concurrency now we start looking at what kind of concurrency we should allow we first...

Post on 18-Jan-2018


1

Controlled concurrency

• Now we start looking at what kind of concurrency we should allow.
• We first look at uncontrolled concurrency and see what happens.
  – We look at 3 bad examples.
  – We then look at how we can understand whether concurrency is OK or not.
• Then we look at how to control concurrency.

2

FIGURE 21.3 (a): The lost update problem.

This occurs when two transactions that access the same database items have their operations interleaved in a way that makes the value of some database item incorrect.

• Eg: X = 20, Y = 15, M = 2, N = 3
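The figure itself is not reproduced in this transcript, so the following is only a minimal Python sketch of one interleaving that loses an update. It assumes T1 moves N units from X to Y and T2 adds M to X; these transaction bodies and the function names are illustrative assumptions, not taken from the slides.

```python
def lost_update(x, y, m, n):
    """Interleave T1 and T2 so that T1's write to X is overwritten by T2."""
    db = {"X": x, "Y": y}
    t1_x = db["X"] - n   # T1: read X, compute X - N
    t2_x = db["X"] + m   # T2: read X (still the old value), compute X + M
    db["X"] = t1_x       # T1: write X
    t1_y = db["Y"] + n   # T1: read Y, compute Y + N
    db["X"] = t2_x       # T2: write X, which discards T1's update
    db["Y"] = t1_y       # T1: write Y
    return db


def serial_t1_then_t2(x, y, m, n):
    """Run all of T1, then all of T2 (assumed transaction bodies)."""
    db = {"X": x, "Y": y}
    db["X"] -= n         # T1: move N out of X ...
    db["Y"] += n         # ... and into Y
    db["X"] += m         # T2: add M to X
    return db


print(lost_update(20, 15, 2, 3))        # {'X': 22, 'Y': 18}
print(serial_t1_then_t2(20, 15, 2, 3))  # {'X': 19, 'Y': 18}
```

With the example values, the interleaved run leaves X = 22 while the serial run leaves X = 19, so the subtraction of N from X has been lost.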

3

FIGURE 21.3 (b): The temporary update (dirty read) problem.

This occurs when one transaction updates a database item and then fails: the updated item is accessed by another transaction before it is changed back to its original value.

• This problem involves both concurrency and recovery.
• Eg: X = 20, Y = 15, M = 2, N = 3

4

FIGURE 21.3 (c): The incorrect summary problem.

If one T is calculating an aggregate summary function on a number of records while another T is updating some of these records, the aggregate function may calculate some values before they are updated and others after they are updated.

• Eg: A = 2, N = 3, X = 10, Y = 8

5

Serial Schedules

• Serial schedule: A schedule S is serial if, for every transaction T in the schedule, all operations of T are executed consecutively in S.
  – i.e. all of one T has to finish before another T starts
  – Eg: T2 T1 T3 is serial
• Otherwise, the schedule is called a nonserial or interleaved schedule.
• S1 = r1(x), w1(x), r2(x), r2(y) : serial: T1 T2
• S2 = r1(x), r2(x), w1(x), r2(y) : interleaved
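The definition can be checked mechanically. Below is a small Python sketch, assuming a schedule is written as a list of strings such as "r1(x)"; this representation and the helper names are illustrative choices, not part of the slides.

```python
import re


def txn_ids(schedule):
    """Return the transaction number of each operation, e.g. 'r1(x)' -> 1."""
    return [int(re.match(r"[rw](\d+)\(", op).group(1)) for op in schedule]


def is_serial(schedule):
    """True if no transaction resumes after a different transaction has started."""
    finished = set()
    current = None
    for t in txn_ids(schedule):
        if t != current:
            if t in finished:
                return False      # t ran earlier, was interrupted, and is back
            if current is not None:
                finished.add(current)
            current = t
    return True


S1 = ["r1(x)", "w1(x)", "r2(x)", "r2(y)"]
S2 = ["r1(x)", "r2(x)", "w1(x)", "r2(y)"]
print(is_serial(S1))  # True
print(is_serial(S2))  # False
```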

6

Concurrency

• How to deal with problems of inconsistency of data because of concurrency?
  – Like in the 3 examples we saw earlier
• Only allow serial execution. Problem?
• Wasteful: while T1 is doing I/O, T2 is forced to wait
• Solution: Allow controlled concurrency
  – Allow when no conflict
  – Don’t allow when conflict
• Now we see how to do “controlled concurrency”

7

Concurrency Eg: Figure 21.5

• Which of C, D should be allowed?
• Eg: X = 50, M = 10, N = 5

8

Different serial schedules

• Will 2 different serial schedules always give the same results?
• No: different serial schedules can give different results. Eg:
  – T1 = r(x), r(y), x = x + y, w(x)
  – T2 = r(x), r(y), y = x + y, w(y)
  – x = 20, y = 30
  – Serial schedule T1T2 : final values of X, Y?
  – Serial schedule T2T1 : final values of X, Y?
• Any serial execution is OK: why?
• Otherwise we should not allow concurrency at all.
• Eg: Suppose T1T2 OK, but T2T1 not OK:
  – All of T1 has to happen before all of T2
  – Makes no sense to talk about T1 and T2 executing concurrently
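The two serial orders in the example above can be worked out with a short Python sketch. The dictionary standing in for the database and the function names are assumptions made for illustration.

```python
def t1(db):
    x, y = db["x"], db["y"]   # r(x), r(y)
    db["x"] = x + y           # x = x + y, w(x)


def t2(db):
    x, y = db["x"], db["y"]   # r(x), r(y)
    db["y"] = x + y           # y = x + y, w(y)


def run_serial(order, x=20, y=30):
    """Run the transactions one after another, with no interleaving."""
    db = {"x": x, "y": y}
    for txn in order:
        txn(db)
    return db


print(run_serial([t1, t2]))  # {'x': 50, 'y': 80}
print(run_serial([t2, t1]))  # {'x': 70, 'y': 50}
```

Both outcomes count as correct, since each comes from a serial execution; they simply differ from each other.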

9

Serializability

• Implication for concurrent execution?
• Want concurrent schedule equivalent to some serial schedule
• Serializable: A schedule S is serializable if it is equivalent to some serial schedule.
• Intuition behind serializability: since any serial execution is OK
  – allow interleaved execution as long as the result will be the same as some serial execution.
• Eg: Fig. 21.5: D OK (equivalent to A), C not OK

10

Serializability: Result Equivalency

• We said schedule S is serializable if it is equivalent to some serial schedule.
  – What does “equivalent” mean?
• Check if the concurrent schedule produces the same result as a serial schedule. How?
• First approach: pick some data values, try.
• Result equivalent: Two schedules are result equivalent if they produce the same final state on some data
  – Is this idea OK?
  – Saw it with the Fig. 21.5 Eg

11

Serializability: Result Equivalency

• Problem: could have happened by accident, i.e. on the data we happened to look at we get the same result, but this is not generally true
• Eg: Look at Fig. 21.5 again
  – Any values of X, M, N which will make C produce the same result as A (or B)?
• When M = 0
  – But C should not be allowed
• Want stronger guarantee. How?
• Important ops should be in the same order as in the serial schedule

12

Conflicting Operations

• Order of some pairs of ops is important to consider for concurrency/recovery, others not.
• Two operations are in conflict: When?
• 1. Belong to different transactions. Why?
• Within T1 can’t switch: Eg: w1(y), r1(x)
• 2. Access the same data item. Why?
• If different data, then doesn’t matter:
  – w1(x), w2(y) same as w2(y), w1(x)
• 3. At least one of them is a write op. Why?
• r1(x), r2(x) same as r2(x), r1(x): data unchanged
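The three conditions translate directly into a predicate. This is a minimal Python sketch, assuming operations are written as strings like "w1(x)"; the helper names `parse` and `in_conflict` are hypothetical.

```python
import re


def parse(op):
    """Split 'w1(x)' into (kind, transaction, item) = ('w', 1, 'x')."""
    kind, txn, item = re.fullmatch(r"([rw])(\d+)\((\w+)\)", op).groups()
    return kind, int(txn), item


def in_conflict(op_a, op_b):
    kind_a, txn_a, item_a = parse(op_a)
    kind_b, txn_b, item_b = parse(op_b)
    return (txn_a != txn_b                  # 1. different transactions
            and item_a == item_b            # 2. same data item
            and "w" in (kind_a, kind_b))    # 3. at least one write


print(in_conflict("w1(y)", "r1(x)"))  # False: same transaction
print(in_conflict("w1(x)", "w2(y)"))  # False: different data items
print(in_conflict("r1(x)", "r2(x)"))  # False: both are reads
print(in_conflict("r1(x)", "w2(x)"))  # True
```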

13

Complete Schedules

• Complete Schedule: S of T1, T2, … Tn

1. Exactly same ops in S and T1, T2, … Tn

2. Includes abort/commit for each Ti

3. If op1 before op2 in Ti then same order in S

4. For any pair of conflicting operations, one must occur before other in S

– We can leave out internal operations

14

Serializability: Conflict Equivalent

• Eg: S: r1(x), r2(y), w1(y), w1(x), w2(x)
• What are the conflict pairs?
• (r1(x), w2(x))
• (w1(x), w2(x))
• (r2(y), w1(y))
• Conflict Equivalent: Two schedules are conflict equivalent if the order of any two conflicting operations is the same in both schedules
  – i.e. have the same conflict pairs
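The conflict pairs of the example schedule S can be enumerated with a few lines of Python. This is a sketch that assumes the string notation "r1(x)" for operations; `conflict_pairs` is an illustrative name, and each pair is listed in the order the two operations occur in the schedule.

```python
import re
from itertools import combinations


def parse(op):
    """Split 'w1(x)' into (kind, transaction, item) = ('w', 1, 'x')."""
    kind, txn, item = re.fullmatch(r"([rw])(\d+)\((\w+)\)", op).groups()
    return kind, int(txn), item


def conflict_pairs(schedule):
    """Return every (earlier, later) pair of conflicting operations."""
    pairs = []
    for a, b in combinations(schedule, 2):   # a occurs before b in the schedule
        (ka, ta, ia), (kb, tb, ib) = parse(a), parse(b)
        if ta != tb and ia == ib and "w" in (ka, kb):
            pairs.append((a, b))
    return pairs


S = ["r1(x)", "r2(y)", "w1(y)", "w1(x)", "w2(x)"]
for pair in conflict_pairs(S):
    print(pair)
```

The three pairs printed are the ones listed on the slide.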

15

Serializability: Conflict Equivalent

• Eg: T1 = r1(x), w1(y), T2 = r2(y), w2(x)
  – S1 = r1(x), r2(y), w2(x), w1(y)
  – S2 = r2(y), w2(x), r1(x), w1(y)
  – Are S1, S2 conflict equivalent?
  – Are the conflict pairs the same?
• What are the conflict pairs of S1?
• (r1(x), w2(x)), (r2(y), w1(y))
• What are the conflict pairs of S2?
• (w2(x), r1(x)), (r2(y), w1(y))
• Different pairs: not conflict equivalent

16

Serializability: Conflict Equivalent

• Eg: S3 = r1(x), r2(y), w1(y), w2(x)
      S4 = r2(y), r1(x), w1(y), w2(x)
• Are S3, S4 conflict equivalent?
  – Are the conflict pairs the same?
• What are the conflict pairs of S3?
• (r1(x), w2(x)), (r2(y), w1(y))
• What are the conflict pairs of S4?
• (r1(x), w2(x)), (r2(y), w1(y))
• Same pairs: are conflict equivalent
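The comparisons for S1/S2 and S3/S4 can be reproduced with a short Python sketch. It assumes operations are written as strings like "r1(x)" and treats two schedules as conflict equivalent when they contain the same operations and the same set of ordered conflict pairs; the function names are illustrative.

```python
import re
from itertools import combinations


def parse(op):
    """Split 'w1(x)' into (kind, transaction, item) = ('w', 1, 'x')."""
    kind, txn, item = re.fullmatch(r"([rw])(\d+)\((\w+)\)", op).groups()
    return kind, int(txn), item


def conflict_pairs(schedule):
    """Set of (earlier, later) pairs of conflicting operations."""
    pairs = set()
    for a, b in combinations(schedule, 2):   # a occurs before b
        (ka, ta, ia), (kb, tb, ib) = parse(a), parse(b)
        if ta != tb and ia == ib and "w" in (ka, kb):
            pairs.add((a, b))
    return pairs


def conflict_equivalent(s, t):
    """Same operations, and every conflicting pair ordered the same way."""
    return sorted(s) == sorted(t) and conflict_pairs(s) == conflict_pairs(t)


S1 = ["r1(x)", "r2(y)", "w2(x)", "w1(y)"]
S2 = ["r2(y)", "w2(x)", "r1(x)", "w1(y)"]
S3 = ["r1(x)", "r2(y)", "w1(y)", "w2(x)"]
S4 = ["r2(y)", "r1(x)", "w1(y)", "w2(x)"]
print(conflict_equivalent(S1, S2))  # False
print(conflict_equivalent(S3, S4))  # True
```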

17

Serializability Eg: Figure 21.5

• Which of C, D should be allowed?

18

Serializability: Conflict Equivalency

• S is conflict serializable if it is conflict equivalent to some serial schedule S’
• Figure 21.5: A (T1T2) is serial, so is B (T2T1)
• Is D conflict serializable?
  – Are D’s conflict pairs the same as those of A or B?
• Conflict pairs of A, B, D?
• A: (r1(x), w2(x)), (w1(x), r2(x)), (w1(x), w2(x))
• B: (r2(x), w1(x)), (w2(x), r1(x)), (w2(x), w1(x))
• D: (r1(x), w2(x)), (w1(x), r2(x)), (w1(x), w2(x))
• D has the same conflict pairs as A, so D is conflict serializable
• Is C conflict serializable? Conflict pairs?
• C: (r1(x), w2(x)), (w1(x), w2(x)), (r2(x), w1(x))
• C not equivalent to A: r2(x) before w1(x)
• C not equivalent to B: w1(x) before w2(x)

19

Serializability

• Serializable is not the same as serial.
  – What is the difference?
• Serial means no interleaving: T1 T2 T3 etc
• Serializable allows interleaving, but has to be equivalent to a serial schedule
• Serializable schedule:
  – Will leave the database in a consistent state.
  – Interleaving is controlled and will result in the same state as if the transactions were serially executed.
  – Will achieve efficiency due to concurrent execution.

20

Testing For Conflict Serializability

Testing for conflict serializability, Algorithm 17.1:
1. Looks at only read_Item (X) and write_Item (X) operations: not the internal ops
2. Constructs a precedence graph (serialization graph): a graph with directed edges
3. An edge is created from Ti to Tj if one of the operations in Ti appears before a conflicting operation in Tj
4. The schedule is serializable if and only if the precedence graph has no cycles.
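The four steps above can be sketched in Python as follows. This is an illustrative implementation under the assumption that a schedule is a list of strings like "r1(x)"; it is not the textbook's pseudocode, and the function names are hypothetical. Cycle detection uses a plain depth-first search.

```python
import re
from itertools import combinations


def parse(op):
    """Split 'w1(x)' into (kind, transaction, item) = ('w', 1, 'x')."""
    kind, txn, item = re.fullmatch(r"([rw])(\d+)\((\w+)\)", op).groups()
    return kind, int(txn), item


def precedence_graph(schedule):
    """Edge (Ti, Tj) whenever an op of Ti precedes a conflicting op of Tj."""
    edges = set()
    for a, b in combinations(schedule, 2):   # a occurs before b
        (ka, ta, ia), (kb, tb, ib) = parse(a), parse(b)
        if ta != tb and ia == ib and "w" in (ka, kb):
            edges.add((ta, tb))
    return edges


def has_cycle(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    visiting, done = set(), set()

    def dfs(u):
        visiting.add(u)
        for v in graph[u]:
            if v in visiting or (v not in done and dfs(v)):
                return True                  # reached a node on the current path
        visiting.discard(u)
        done.add(u)
        return False

    return any(u not in done and dfs(u) for u in graph)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))


# Interleaved, but the only edge is T2 -> T1, so it is serializable.
print(is_conflict_serializable(["r1(x)", "r2(x)", "w1(x)", "r2(y)"]))           # True
# Edges T1 -> T2 and T2 -> T1 form a cycle.
print(is_conflict_serializable(["r1(x)", "r2(y)", "w1(y)", "w1(x)", "w2(x)"]))  # False
```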

21

Figure 21.5: draw precedence graphs

22

FIGURE 21.7: precedence graph for Figure 21.5

• Constructing precedence graphs for schedules from Figure 21.5 to test for conflict serializability. Precedence graphs for (a) serial schedule A. (b) serial schedule B. (c) schedule C (not serializable). (d) schedule D (serializable, equivalent to schedule A).
• How do we interpret the cycles?

23

FIGURE 21.8 (a).

• Another example of serializability testing. (a) The READ and WRITE operations of three transactions T1, T2, and T3.
• We will look at schedules in the next 2 slides
  – And draw the precedence graphs

24

FIGURE 21.8 (b).

• Schedule E.
  – Precedence graph? Serializable?

25

FIGURE 21.8 (c).

• Schedule F.
  – Precedence graph? Serializable?

26

Serializability

• Issue: OS controls how ops get interleaved:
  – Resulting schedule may or may not be serializable
  – Problem?
• If not serializable, then what?
• Have to rollback. Problem?
• Expensive, not practical! How to solve?
• Guarantee serializability. How?
• Locks:
  – Current approach used in most DBMSs: two-phase locking (will study)

27

View Serializability

• We have seen result equivalent and conflict equivalent.
• View equivalent: another condition. [RG] eg:
• Schedule S2 is serial
• Schedule S1: R1(A), W2(A), W1(A), W3(A). Is this conflict serializable?
• No: precedence graph has a cycle.
  – T1 → T2 → T1
• Do you think S1 should be allowed?

Schedule S1:
T1: R(A)        W(A)
T2:       W(A)
T3:                   W(A)

Schedule S2:
T1: R(A), W(A)
T2:             W(A)
T3:                   W(A)

28

View Serializability

• S1 is equivalent (in every situation) to serial S2, i.e. T1, T2, T3. Why?
• Because the final value of A is written by T3
  – This is a blind write, so it does not matter whether T1, T2 were in serial order or interleaved
• Stronger than result equivalent, weaker than conflict equivalent
• View equivalent: we won’t do the formal definition.
• View serializability is good enough
  – but expensive to test (NP-hard)
  – so use conflict serializability since it is easier to test

29

Other Notions of Serializability

Other Types of Equivalence of Schedules
• Under special semantic constraints
  – schedules that are otherwise not conflict serializable may work correctly.
  – [SKS Eg] in next slide

30

[SKS] Example

• A is checking account
• B is savings account
• T1 transferring $50 from A to B
• T5 transferring $10 from B to A
• Is this schedule conflict serializable?
• No. Also not view serializable
  – Though we have not studied the definition.
• Should this schedule be allowed?
• Yes: Eg: A = 100, B = 30. In general, OK. Why?
• D: debit, C: credit. D D C C same as D C D C
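The schedule itself is not reproduced in this transcript, so the sketch below assumes the interleaving is: T1 debits A, T5 debits B, T1 credits B, T5 credits A. It also treats each debit or credit as a single atomic increment, which is the semantic assumption that makes the reordering safe. The names are illustrative.

```python
def apply_ops(ops, a=100, b=30):
    """Apply (account, delta) updates in order; each update is treated as atomic."""
    balances = {"A": a, "B": b}
    for account, delta in ops:
        balances[account] += delta
    return balances


# D D C C: T1 debits A, T5 debits B, T1 credits B, T5 credits A (assumed order)
interleaved = [("A", -50), ("B", -10), ("B", +50), ("A", +10)]
# D C D C: all of T1, then all of T5
serial = [("A", -50), ("B", +50), ("B", -10), ("A", +10)]

print(apply_ops(interleaved))  # {'A': 60, 'B': 70}
print(apply_ops(serial))       # {'A': 60, 'B': 70}
```

Because additions and subtractions on an account commute, any ordering of these four updates gives the same final balances, which is why the schedule can be accepted even though it fails the conflict test.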

31

Recoverability vs Serializability

• Both affected by concurrent execution of transactions, but the two are quite different
• Recoverability: How to recover if a transaction aborts or the system crashes
• Serializability: Even if there are no system crashes and all transactions commit
  – Have to make sure we get correct results
  – Equivalent to serial schedule

32

Serializability Tests

• DBMS has to provide a mechanism to ensure that schedules are conflict serializable
• We have seen how to test a schedule to see if it is (was) serializable.
  – How can this be used?
• We could run the transactions without attempting to control concurrency. Then what?
• Test to see if the schedule which resulted was serializable. If serializable, then what?
• Everything OK. If not serializable, then what?
• Rollback. Problem?
• Expensive. Alternative?

33

Concurrency Control vs. Serializability Tests

• Develop concurrency control protocols that only allow the concurrent schedules we want
  – Serializable
  – Recoverable, cascadeless
• Connection between concurrency control protocols and serializability tests?
• Tests for serializability help us understand why a concurrency control protocol is correct
  – i.e. why the protocol guarantees serializability.
