215-database-recovery.ppt

Upload: vinay-varma

Post on 14-Apr-2018

216 views

Category:

Documents


0 download

TRANSCRIPT

  • 7/27/2019 215-Database-Recovery.ppt

    1/38

    What is Concurrent Process (CP)? Multiple users access databases and use computer systems

    simultaneously.

    Example: Airline reservation system.

    An airline reservation system is used by hundreds of travel agentsand reservation clerks concurrently.

    Why Concurrent Process?

    Better transaction throughput and response time

    Better utilization of resource

    Introduction

  • 7/27/2019 215-Database-Recovery.ppt

    2/38

    Transaction

    What is Transaction? A sequence of many actions which are considered to be oneatomic unit of work.

    Basic operations a transaction can include actions: Reads, writes

    Special actions: commit, abort

    ACID properties of transaction Atomicity:Transaction is either performed in its entirety or not performed at all,

    this should be DBMS responsibility

    Consistency:Transaction must take the database from one consistent state toanother. It is users responsibility to insure consistency

    Isolation:Transaction should appear as though it is being executed in isolation fromother transactionss

    Durability:Changes applied to the database by a committed transaction must persist,even if the system fail before all changes reflected on disk

  • 7/27/2019 215-Database-Recovery.ppt

    3/38

    Concurrent Transactions

    interleaved processing parallel processing

    t1 t2 t1 t2

    time

    CPU1

    CPU2A

    B

    A

    B

    CPU1

  • 7/27/2019 215-Database-Recovery.ppt

    4/38

    Schedules

    What is Schedules A schedule S of n transactions T1,T2,Tn is an ordering of the operations of

    the transactions subject to the constraint that, for each transaction Ti thatparticipates in S, the operations of Ti in S must appear in the same order in

    which they occur in Ti. Example: Sa: r1(A),r2(A),w1(A),w2(A), a1,c2;

    T1 T2

    Read(A)

    Write(A)

    Abort T1

    Read(A)

    Write(A)

    Commit T2

  • 7/27/2019 215-Database-Recovery.ppt

    5/38

    Oops, somethings wrong

    Reserving a seat for a flight If concurrent access to data in DBMS, two users maytry to book the same seat simultaneously

    Agent 1 findsseat 35G empty

    Agent 2 finds

    seat 35G empty

    time

    Agent 1 sets

    seat 35G occupied

    Agent 2 sets

    seat 35G occupied

  • 7/27/2019 215-Database-Recovery.ppt

    6/38

    Another example Problems can occur when concurrent transactions execute in an uncontrolled

    manner.

    Examples of one problem. A original equals to 100, after execute T1 and T2, A is supposed to be 100+10-

    8=102

    Add 10 To A Minus 8 from A

    T1 T2

    Read(A)

    A=A+10

    Write(A)

    Read(A)

    A=A-8

    Write(A)

    Value of A onthe disk

    100

    100

    100

    100

    110

    92

  • 7/27/2019 215-Database-Recovery.ppt

    7/38

    What Can Go Wrong?

    Concurrent process may end up violating Isolation property oftransaction if not carefully scheduled

    Transaction may be aborted before committed- undo the uncommitted transactions

    - undo transactions that sees the uncommitted change before the crash

  • 7/27/2019 215-Database-Recovery.ppt

    8/38

    Conflict operations Two operations in a schedule are said to be conflictif they satisfy

    all three of the following conditions:

    (1) They belong to different transactions

    (2) They access the same item A;

    (3) at least one of the operations is a write(A)

    Example in Sa: r1(A),r2(A),w1(A),w2(A), a1,c2;

    r1(A),w2(A) conflict, so do r2(A),w1(A),

    r1(A), w1(A) do not conflict because they belong to the same transaction,

    r1(A),r2(A) do not conflict because they are both read operations.

  • 7/27/2019 215-Database-Recovery.ppt

    9/38

    Serializability of schedules

    Serial A schedule S is serial if, for every transaction T participating in the schedule,all the operations of T are executed consecutively in the schedule.( Nointerleaving occurs in a serial schedule)

    Serializable

    A schedule S of n transactions isserializable

    if it is equivalentto someserialschedule of the same n transactions.

    schedules are conflict equivalent if: they have the same sets of actions, and

    each pair of conflicting actions is ordered in the same way

    schedules are conflict equivalent if: they have the same sets of actions, and

    each pair of conflicting actions is ordered in the same way

    Conflict Serializable A schedule is said to be conflict serializable if it is conflict equivalent to a

    serial schedule

  • 7/27/2019 215-Database-Recovery.ppt

    10/38

    CharacterizingSchedules1. Avoid cascading abort(ACA)

    Aborting T1 requires aborting T2! Cascading Abort

    An ACA (avoids cascading abort) A X act only reads data from

    committed X acts.

    2.recoverable Aborting T1 requires aborting T2!

    But T2 has already committed! A recoverable schedule is one in which

    this cannot happen. i.e. a X act commits only after all the

    X acts it

    depends on

    (i.e. it reads from)commit. ACA implies recoverable (but notvice-versa!).

    3. strict schedule

    T1 T2

    Read(A)Write(A)

    Abort

    Read(A)

    Write(A)

    T1 T2

    Read(A)Write(A)

    commit

    Read(A)

    Write(A)

    T1 T2

    Read(A)

    Write(A)

    Abort

    Read(A)

    Write(A)

    Commit

    T1 T2

    Read(A)

    Write(A)

    Commit

    Read(A)

    Write(A)

    Commit

    T1 T2

    Read(A)

    Write(A)

    Commit

    Read(A)

    Write(A)

    Commit

    No Yes

  • 7/27/2019 215-Database-Recovery.ppt

    11/38

    Venn Diagram for Schedules

    Serial

    Strict

    ACA

    Recoverable

    View Serializable All Schedules

    Conflict Serializable

  • 7/27/2019 215-Database-Recovery.ppt

    12/38

    Example

    T1:W(X), T2:R(Y), T1:R(Y), T2:R(X), C2, C1

    serializable: Yes, equivalent to T1,T2

    conflict-serializable: Yes, conflict-equivalent to T1,T2

    recoverable: No. Yes, if C1 and C2 are

    switched.ACA: No. Yes, if T1 commits before T2writes X.

  • 7/27/2019 215-Database-Recovery.ppt

    13/38

    Sample Transaction (informal)

    Example: Move $40 from checking to savingsaccount

    To user, appears as one activity

    To database: Read balance of checking account: read( X)

    Read balance of savings account: read (Y)

    Subtract $40 from X

    Add $40 to Y Write new value of X back to disk

    Write new value of Y back to disk

  • 7/27/2019 215-Database-Recovery.ppt

    14/38

    Sample Transaction (Formal)

    T1

    read_item(X);

    read_item(Y);X:=X-40;

    Y:=Y+40;

    write _item(X);write_item(Y);

    t0

    tk

  • 7/27/2019 215-Database-Recovery.ppt

    15/38

    Focus on concurrency control

    Real DBMS does not test for serializability Very inefficient since transactions are continuously

    arriving Would require a lot of undoing

    Solution: concurrency protocols

    If followed by every transaction, and enforcedby transaction processing system, guarantee

    serializability of schedules

  • 7/27/2019 215-Database-Recovery.ppt

    16/38

    Concurrency Control Through Locks Lock: variable associated with each data

    item Describes status of item wrt operations that can

    be performed on it

    Binary locks: Locked/unlocked

    Multiple-mode locks: Read/write

    Three operations read_lock(X)

    write_lock(X)

    unlock(X)

    Each data item can be in one of three lockstates

  • 7/27/2019 215-Database-Recovery.ppt

    17/38

    Two TransactionsT1

    read_lock(Y);read_item(Y);

    unlock(Y);

    write_lock(X);

    read_item(X);

    X:=X+Y;

    write_item(X);

    unlock(X);

    T2

    read_lock(X);read_item(X);

    unlock(X);

    write_lock(Y);

    read_item(Y);

    Y:=X+Y;

    write_item(Y);

    unlock(Y);

    Lets assume serial schedule S1: T1;T2

    Initial values: X=20, Y=30 Result: X=50, Y=80

  • 7/27/2019 215-Database-Recovery.ppt

    18/38

    Locks Alone Dont Do the Trick!

    T1

    read_lock(Y);

    read_item(Y);unlock(Y);

    write_lock(X);read_item(X);X:=X+Y;write_item(X);

    unlock(X);

    T2

    read_lock(X);read_item(X);unlock(X);write_lock(Y);read_item(Y);

    Y:=X+Y;write_item(Y);unlock(Y);

    Non-serializable!

    Result: X=50, Y=50

    unlocked too early!

    Lets run T1 and T2 in interleafed fashionSchedule S

  • 7/27/2019 215-Database-Recovery.ppt

    19/38

    Two-Phase Locking (2PL)

    Def.: Transaction is said to follow the

    two-phase-locking protocolif all lockingoperations precede the firstunlockoperation

  • 7/27/2019 215-Database-Recovery.ppt

    20/38

    Example

    T1read_lock(Y);

    read_item(Y);

    write_lock(X);

    unlock(Y);read_item(X);

    X:=X+Y;

    write_item(X);

    unlock(X);

    T2

    read_lock(X);

    read_item(X);

    write_lock(Y);

    unlock(X);

    read_item(Y);

    Y:=X+Y;

    write_item(Y);

    unlock(Y);

    Both T1 and T2 follow the 2PL protocol

    Any schedule including T1 and T2 is guaranteed to be serializable

    Limits the amount of concurrency

  • 7/27/2019 215-Database-Recovery.ppt

    21/38

    Variations to the Basic Protocol

    Previous technique knows as basic 2PL

    Conservative2PL (static) 2PL: Lock allitems needed BEFORE execution begins

    by predeclaring its read and write set If any of the items in read or write set is

    already locked (by other transactions),transaction waits (does not acquire any

    locks) Deadlock free but not very realistic

  • 7/27/2019 215-Database-Recovery.ppt

    22/38

    Variations to the Basic Protocol

    Strict2PL: Transaction does notrelease its write locks until AFTER itaborts/commits

    Not deadlock free but guaranteesrecoverable schedules (strict schedule:transaction can neither read/write X until

    last transaction that wrote X hascommitted/aborted)

    Most popular variation of 2PL

  • 7/27/2019 215-Database-Recovery.ppt

    23/38

    Concluding Remarks

    Concurrency control subsystem is responsible forinserting locks at right places into yourtransaction

    Strict 2PL is widely used

    Requires use of waiting queue

    All 2PL locking protocols guarantee serializability

    Does not permit all possible serial schedules

  • 7/27/2019 215-Database-Recovery.ppt

    24/38

    Why Database Recovery Techniques?

    ACID properties of Transaction

    Time

    T1

    T2

    T3

    Crash

    Database system should guarantee

    -Durability : Applied changes by transactions

    must not be lost. ~ T3

    - Atomicity : Transactions can be aborted.

    ~ T1, T2

    System crashTransaction error

    System errorLocal errorDisk failureCatastrophe

  • 7/27/2019 215-Database-Recovery.ppt

    25/38

    Basic Idea : Logging

    Undo/Redo by the Log

    recover Non-catastrophic failure

    Time

    T1

    T2T3

    Crash

    System Log

    - keeps info of changesapplied by transactions

    CheckpointBackup

    Full DB Backup> Differential Backup

    > (Transaction) Log

    Catastrophic failure

  • 7/27/2019 215-Database-Recovery.ppt

    26/38

    Physical View - How they work - (1)

    Action :

    1)Checkthe directory whether in the cache2) If none, copy from disk pages to the cache

    Disk

    Memory

    DBMS cache(buffers)

    Disk pages/blocks(address:A,a,1)(address:B,b,0)

    A

    B a

    b

    Directory

    copy

    flush

    3) For the copy, old buffers needs to be flushed

    from the cache to the disk pages

    B

  • 7/27/2019 215-Database-Recovery.ppt

    27/38

    Physical View - How they work - (2)

    Disk

    Memory

    DBMS cache(buffers)

    Disk pages/blocks(address:A,a,1)(address:B,b,0)

    A

    B a

    b

    Directory

    copy

    flush

    4) Flush only ifa dirty bit is 1

    Dirty bit : (in the directory) whether there isa change after copy to the cache

    1 updated in the cache

    0

    not updated in the cache (no need to flush)

    update

    B

  • 7/27/2019 215-Database-Recovery.ppt

    28/38

    Physical View - How they work - (3)

    Disk

    Memory

    DBMS cache(buffers)

    Disk pages/blocks

    A

    B a

    b

    copy

    flush

    A-a : in-place updating

    - when flushing, overwrite at the same location- logging is required

    B-b : shadowing

    B

  • 7/27/2019 215-Database-Recovery.ppt

    29/38

    Physical View - How they work - (4)

    Disk

    Memory

    DBMS cache

    Data blocks

    Bb

    (1) copy (from the disk to the cache)

    (2) update the cached data, record it in the log

    (3) flush the log and the data

    (from the cache to the disk)

    copy

    flush

    Log blocks

    update

    B Logblocks

    Datablocks

    update

  • 7/27/2019 215-Database-Recovery.ppt

    30/38

    WAL : Write-Ahead Logging (1)

    in-place updating A log is necessary

    BFIM (BeFore IMage) overwrite AFIM (AFter)

    DiskMemory

    DBMS cache

    Data blocks

    copy

    2) flush

    Log blocks

    update

    Logblocks

    Datablocks

    update

    Aa

    BFIM

    BFIMAFIM

    BFIM

    WAL (Write-Ahead Logging)

    Log entries flushed before overwriting main data

    1) flush

    UNDO-type log record

  • 7/27/2019 215-Database-Recovery.ppt

    31/38

    WAL : Write-Ahead Logging (2)

    WAL protocol requires UNDO and REDO

    Time

    T

    commit

    Log

    - BFIM cannot be overwritten by AFIM on diskuntil all UNDO-type log have force-written to disk.

    - The commit operation cannot be completed

    until all UNDO/REDO-type log have force-written.UNDO

    REDO

  • 7/27/2019 215-Database-Recovery.ppt

    32/38

    Steal & No-Force (1)

    Typical DBemploysasteal/no-forcestrategy

    Time

    T1

    T2

    T3

    commit

    commit

    cache

    Updateddata by T2

    cacheCan be Used forother transactions

    (T3)

    before T2 commits

    Stealstrategy : a transaction can be written to diskbefore it commits

    Advantage :buffer space saving

  • 7/27/2019 215-Database-Recovery.ppt

    33/38

    Steal & No-Force (2)

    Time

    T1

    T2

    T3

    commit

    commit

    cache

    Updateddata by T2

    cache

    If T3 needs the same data,it must be copied again

    when T2 commits

    No-Forcestrategy : a transaction need not to bewritten to diskimmediately

    when it commits

    Advantage :I/O operations saving

    Force strategy

  • 7/27/2019 215-Database-Recovery.ppt

    34/38

    Checkpointing

    Checkpoint

    - All DMBS buffers modified are wrote out to disk.- A record is written into the log. ([checkpoint])- Periodically done

    (e.g. every n min. or every n transaction

    Time

    T1

    T2

    T3

    Crash

    Checkpoint

  • 7/27/2019 215-Database-Recovery.ppt

    35/38

    Transaction Rollback (1)

    Rollback / Roll foward

    Time

    T1

    T2

    T3

    Crash

    T4

    T5

    1 : Not necesary

    2 : Roll foward

    3 : Rollback

    4 : Roll forward

    5 : Roll back

    Checkpoint

    - Steal : transaction may be written on diskbefore it commits

    Recovery method

  • 7/27/2019 215-Database-Recovery.ppt

    36/38

    example :

    Name Account

    Mr.A $10Mr.B $2,000

    Mr.C $30,000

    Transaction Rollback (2)

    TimeT2

    Checkpoint Crash

    T1 : A company pays salary to employeesi) transfer $2,000 to Mr. As account

    ii) transfer $2,500 to Mr Bs account

    write(A)

    read(A)

    T1read(B)read(A) write(B)

    write(C)read(C)write(A)

    T2 : Mr.A pays the monthly rent.i) withdraw $1,500 from Mr.As accountii) transfer $1,500 to Mr.Cs account

  • 7/27/2019 215-Database-Recovery.ppt

    37/38

    Cascading Rollback

    Transaction Rollback (3)

    System Log

    [checkpoint][start_transaction, T1]

    [read_item, T1, A]

    [write_item, T1, A, 10, 2010]

    [start_transaction, T2]

    [read_item, T2, A][write_item, T2, A, 2010, 510]

    [read_item, T1, B]

    [read_item, T2, C]

    [write_item, T2, C, 1500, 31500]

    ~~~ CRASH ~~~~

    T2

    Checkpoint Crash

    w(A)

    r(A)

    T1r (B)r(A)

    w(C) r(C)w(A)

    A

    $10

    $10

    $2,010

    $2,010

    $510

    C

    $30,000

    $30,000

    $31,500

    -T1 is interrupted(needs rollback)

    -T2 uses value

    modified by T1(also needs

    rollback)

  • 7/27/2019 215-Database-Recovery.ppt

    38/38

    Categorization of Recovery Algorithm

    Deferred update the No-UNDO/REDO algorithm

    Immediate update the UNDO/REDO algorithm