transactions. what is it? transaction - a logical unit of database processing motivation - want...

26
Transactions

Upload: curtis-gilmore

Post on 05-Jan-2016

224 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Transactions

Page 2: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

What is it?

• Transaction - a logical unit of database processing

• Motivation - want consistent change of state in data

• Transactions developed in 1950's – banking activities

• Idea of transaction formalized in 1970's • Transactions in clouds in 2010’s?

Page 3: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Why?

Want to interleave operations-

• increases throughput of the system (number of transactions that can finish in any given period)

• interleave I/Os and CPU time

Page 4: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Problems

1) Inconsistent result if crash in middle of transaction

crash due to: hardware failure, system error, exception condition

2) Can have error if concurrent execution3) Uncertainty when changes permanent. -- Write to disk every time?

Concurrency control and Recovery from failures are needed to solve these problems

Page 5: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

ACID properties – Jim Gray

• ACID properties• Atomicity - transaction is indivisible,• Consistency - correct execution takes DB

from one consistent state to another• Isolation – transactions affect each other as

if not concurrent• Durability - once a transaction completes

(commits), changes made to data are permanent and transaction is recoverable

Page 6: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Systems guarantees 4 properties• ACID properties• Atomicity

– all or nothing, performs transaction in entirety – solves 1) Inconsistent result if crash in middle of transaction

• Consistency – a logical property based on some consistency rule, implied by isolation – If isolation is implemented properly – solves 2) correct execution takes DB from one consistent state to

another• Isolation

– equivalent to serial schedule (serializability) • T2 -> T1 or T1 -> T2

– takes care of 2) • Durability

– changes are never lost due to subsequent failures –– solves 3) Uncertainty when changes permanent

Page 7: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

ACID properties

• Atomicity ensures recovery

• Consistency ensures programmer and DBMS enforce consistency

• Isolation ensures concurrency control is utilized

• Durability ensures recovery

Page 8: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Operations of transactions

• Read and Write of data items • Granularity can be  rows of a table or a table

(Granularity can affect concurrency)• Each transaction has an identifier i assigned to it

(Ti) • Transaction commit statement - causes

transaction to end (commit) successfully (Ci) • Transaction abort (rollback) statement - all writes

undone (Ai) • System or User can specify commit and abort

Page 9: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

R’s and W’s

Notation:

R1(X) - transaction 1 reads data item X     W2(Y) - transaction 2 writes to data item Y     C1 - transaction 1 commits

BL1 – transaction 1 blocks     A2 - transaction 2 aborts, rollback

• A series of R's and W's is a schedule (history) • Allow multiple users to execute simultaneously to

access tables in a common DB • Concurrent access - concurrency

Page 10: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Lost Update Problem – Dirty Write

Dirty write - some value of DB is incorrect                     T1                T2     where A=10                     R1(A)                                          R2(A) (A=10)

A=A-10                     W1(A)                                         A=A+20                                         W2(A)  

R1(A)R2(A)W1(A)C1W2(A)C2

The value of A is 30 when T1 and T2 are done, but it should be 20                  

Page 11: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Dirty Read ProblemTransaction updates DB item, then fails

                   T1                T2     where A=10                     R1(A)                     A=A-10                     W1(A)                     R1(C)                                        R2(A)     (A=0, reads value T1 wrote)

                    A1 - T1 fails                                         R2(B)                                         B=B+A                                         W2(B)

R1(A)W1(A)R1(C)R2(A)W2(A)A1 R2(B)W2(B)C2

When T1 is aborted, A is reset to value of 10, values B                         written by T2 are incorrect  

Page 12: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Unrepeatable ReadTransaction reads data item twice, in between values change

                     T1                T2     where A=10                     R1(A)                     R1(B)                     B=B+A                     W1(B)                                        R2(A)                                        A=A+20                                        W2(A)                     R1(A)                     R1(C)                     C=C+A                     W1(C)  

   R1(A)R1(B)W1(B)R2(A)W2(A)C2R1(A)R1(C)W1(C)C1

When T1 first reads A it has a value of 10, then a value of 30                   

Page 13: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Degrees of Isolation

DBs try to achieve the following:

degree 0 - doesn't overwrite data updated (dirty data) by other transactions with degree at least 1

degree 1 - no lost updates (no dirty writes)

degree 2 - no lost updates and no dirty reads

degree 3 - degree 2 plus repeatable reads

degree 4 - serializability

Page 14: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Serializable

A transaction history is serializable if it is equivalent to some serial schedule         (does not mean it is a serial schedule)    The important word here is ‘some’

        Example of two serial schedules:

R1(X)W1(X)C1  R2(X)W2(X)R2(Y)W2(Y)C2      T1<< T2

R2(X)W2(X)R2(Y)W2(Y)C2  R1(X)W1(X)C1      T2 << T1

Page 15: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Equivalence

• Is a given schedule equivalent to some serial schedule? R2(X)W2(X)R1(X)W1(X)R2(Y)W2(Y)C1C2

• How do we answer this question?

• What is equivalence? – Need same number of operations

• How to define equivalence

 

Page 16: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Result equivalence

Result equivalent if produce same final state

R1(X) (X*2)W1(X)C1 R2(X)(X+Xmod10)W2(X)C2

      if X=10 result is X=20                           

R1(X)R2(X)(T2:X+Xmod10)W2(X)C2(T1: X*2)W1(X)

if X=10 result is X=20

Same results if X=10, but next try X= 7: (results are 18 and 14)

                                         

Page 17: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Conflict equivalence

  First define conflicting operations

R and W conflict if:    1.  from 2 different transactions    2.  reference same data item    3.  at least one is a write

Page 18: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Conflicting operations

The conflicting operations are:    Ri(A) << Wj(A)   read followed by a write    Wi(A) << Rj(A)   write followed by a read    Wi(A) << Wj(A)  write followed by a write

Ri(A) << Rj(A)  don't conflict    Ri(A) << Wj(B)  don't conflict

Page 19: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Conflict equivalence

Conflict equivalent if order of any 2 conflicting operations is the same in both schedules

R1(A)W1(A)C1  R2(A)W2(A)C2  or in reverse order

R1(A)<<W2(A)                R2(A)<<W1(A)  W1(A)<<R2(A)                W2(A)<<R1(A) W1(A)<<W2(A)               W2(A)<<W1(A)  

Page 20: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Conflict equivalence

R1(A)R2(A)W1(A)C1W2(A)C2

                    R1(A)<<W2(A)                     R2(A)<<W1(A)

W1(A)<<W2(A)                     NOT serializable - example of lost update

Page 21: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

ExampleIs this schedule serializable?          R2(X)W2(X)R1(X)W1(X)R2(Y)W2(Y)C1C2               R2(X)<<W1(X)

W2(X)<<R1(X)          W2(X)<<W1(X)                          

T2<<T1

Is this schedule serializable?    R2(X)R1(X)W2(X)R2(Y)W1(X)C1W2(Y)C2

R2(X)<<W1(X)          R1(X)<<W2(X)

W2(X)<<W1(X)                   

not equivalent

Page 22: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Strategy

• There is a simple algorithm for determining conflict serializability of a schedule

• What is the algorithm?

Page 23: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Testing for Equivalence

Construct a precedence graph (aka serialization graph)

     1.  For each Ti in S, create node Ti in graph   2.  For all Rj(X) after Wi(X) create an edge Ti->Tj   3.  For all Wj(X) after Ri(X) create an edge Ti->Tj   4.  For all Wj(X) after Wi(X) create an edge Ti-> Tj

   5.  Serializable iff no cycles

    If a cycle in the graph, no equivalent serial history

Page 24: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Example

Is this the following schedule serializable?

R1(A)R2(A)W1(A)W2(A)C1C2  

T1 T2

Page 25: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Serializability

• If a schedule is serializable, we can say it is correct • Serializable does not mean it is serial • Difficult to test for conflict serializability - interleaving of

operations is determined by the operating system scheduler

• Concurrency control methods don't test for it.

• Instead, protocols developed that guarantee a schedule is serializable.

Page 26: Transactions. What is it? Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed

Concurrency Control Techniques

Protocols - set of rules that guarantee serializability

1.locking

2.Timestamps

3.multiversion

4.validation or certification - optimistic