Post on 18-Dec-2015

Transaction-Oriented Database Recovery

[Figure: system layers (Application; Query Processor; Indexes; Storage Subsystem; Concurrency Control; Recovery; Operating System; Hardware [processor(s), disk(s), memory]) and the roles that work with them: application programmer (e.g., business analyst, data architect), sophisticated application programmer (e.g., SAP admin), and DBA/tuner.]

Outline

• Principles of transaction-oriented database recovery

• Recovery tuning

Transaction-Oriented Database Recovery

• Transaction properties
  – A: Atomicity
  – C: Consistency
  – I: Isolation
  – D: Durability

• A database is transaction-consistent (logically consistent) iff it contains the results of successful transactions

Failures To Recover From

• Transaction failure
  – Self- or system-abort
  – To recover within the time of a normal transaction
  – 10-100 times per minute

• System failure
  – OS or DBMS crash
  – To recover in the same amount of time as required for all interrupted transactions
  – A few times per week

• Media failure
  – Disk crash
  – To recover in hours
  – A few times per year

Recovery Actions

• Transaction UNDO: roll back a specific active transaction
• Global UNDO: roll back all active transactions
• Partial REDO: re-instate some committed transactions
• Global REDO: re-instate all committed transactions

Failure Type    Recovery Action
Transaction     Transaction UNDO
System          Global UNDO, Partial REDO
Media           Global REDO

Log for UNDO/REDO

• Logical logging: operators and their arguments
  – Requires atomic actions from the physical layer
  – Not always possible/justifiable

• Physical state logging
  – Before and/or after image

• Physical transition logging
  – Use XOR: commutative and associative
  – Log XOR before image = after image
  – Log XOR after image = before image
  – Lower space consumption (1 entry/change; long strings of 0s compress well when the number of changes is small)
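The XOR-based transition log can be illustrated with a minimal sketch. It assumes pages are equal-length byte strings; the helper name `xor_bytes` and the sample page contents are hypothetical and not from the source.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length page images."""
    if len(a) != len(b):
        raise ValueError("page images must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical page contents before and after one update.
before = b"balance=0100;owner=ann"
after = b"balance=0250;owner=ann"

# One log entry per change: the XOR difference of the two images.
# Unchanged bytes become 0, so the entry is mostly zeros.
diff = xor_bytes(before, after)

# REDO: log XOR before image gives the after image.
redo = xor_bytes(before, diff)

# UNDO: log XOR after image gives the before image.
undo = xor_bytes(after, diff)
```

Because the same entry serves both directions, a single record per change is enough, and the long runs of zero bytes are what make the entry compressible.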

System Framework

Source: T. Haerder, A. Reuter

Log Timing

• UNDO entries must reach the log file before the changes are written out: the Write-Ahead Logging (WAL) principle
  – To enable roll-back if necessary

• REDO entries must reach the log file before End-Of-Transaction (EOT) is acknowledged
  – To enable re-instatement after failure
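The two timing rules can be sketched with a toy in-memory model. Everything here is an assumption made for illustration: the class `MiniLogManager`, its method names, and the use of plain Python lists and dicts to stand in for the stable log and the disk.

```python
class MiniLogManager:
    """Toy model of log timing; 'stable' structures stand in for disk."""

    def __init__(self):
        self.log_buffer = []    # volatile log tail, lost on a crash
        self.stable_log = []    # log records that have reached the log file
        self.stable_pages = {}  # database pages on disk

    def log_update(self, txn, page_id, before, after):
        # Record UNDO (before) and REDO (after) information for one change.
        self.log_buffer.append(("UPDATE", txn, page_id, before, after))

    def flush_log(self):
        # Move the volatile tail to the stable log.
        self.stable_log.extend(self.log_buffer)
        self.log_buffer.clear()

    def write_page(self, page_id, value):
        # WAL rule: UNDO entries reach the log before the page is overwritten.
        self.flush_log()
        self.stable_pages[page_id] = value

    def commit(self, txn):
        # REDO rule: all entries reach the log before EOT is acknowledged.
        self.log_buffer.append(("EOT", txn))
        self.flush_log()
        return True


mgr = MiniLogManager()
mgr.log_update("T1", "P7", before="old", after="new")
mgr.write_page("P7", "new")   # forces the log first
mgr.commit("T1")              # acknowledged only after the log is stable
```

The sketch flushes the whole log tail on every page write, which is simpler than what a real system would do but makes the ordering constraint easy to see.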

Dependency with Buffer Management

UNDO
• STEAL: Modified pages may be written anytime
• ~STEAL: Modified pages kept in buffer till after transaction commits
  – Large buffers required
  – No global UNDO
  – Transaction UNDO within memory
  – No logging required for UNDO

REDO
• FORCE: All modified pages written during EOT
  – No need to log for partial REDO
  – Need logging for global REDO
• ~FORCE: No propagation during EOT

At least one of global UNDO or partial REDO is always required. Why?
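The per-policy consequences listed above can be written down as a small lookup. This is a sketch that reads each flag in isolation, exactly as the two lists do; the function name and dictionary keys are hypothetical.

```python
def recovery_requirements(steal: bool, force: bool) -> dict:
    """Map a buffer policy to the recovery work it implies (per-flag reading)."""
    return {
        # STEAL lets uncommitted changes reach disk, so UNDO must be logged
        # and global UNDO is needed after a crash; ~STEAL avoids both.
        "undo_logging": steal,
        "global_undo": steal,
        # ~FORCE leaves committed changes in the buffer at EOT, so they
        # must be re-instated by partial REDO after a crash.
        "partial_redo": not force,
        # Global REDO after a media failure needs a log under either policy.
        "redo_logging_for_global_redo": True,
    }


for steal in (True, False):
    for force in (True, False):
        name = ("STEAL" if steal else "~STEAL") + "/" + ("FORCE" if force else "~FORCE")
        print(name, recovery_requirements(steal, force))
```

Read this way, ~STEAL combined with FORCE appears to need neither global UNDO nor partial REDO. The closing question above says that at least one is always required, so the sketch should not be taken as covering that combination.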

Checkpointing to Optimize Recovery

• Problem
  – With LRU buffer replacement, frequently used pages will remain in buffer
  – Partial REDO has to go back very far

• Checkpointing limits the amount of partial REDO

• Checkpoint
  – Write BEGIN-CHECKPOINT to temporary log
  – Write checkpoint data to log
  – Write END-CHECKPOINT to temporary log

Crash Recovery with Checkpoint

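[Figure: timeline of transactions T1 to T5 against three marked points (the checkpoint, the oldest page in buffer, and the crash). The recovery process analyzes the log and assigns each transaction one of: UNDO, REDO, or nothing.]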
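The analysis step of crash recovery can be sketched as a single pass over the log. The sketch makes two assumptions that are not from the source: the log is a list of `(event, transaction)` pairs, and the checkpoint writes out all modified pages, so transactions that committed before it need no work. The sample log is made up and is not a reading of the figure.

```python
def classify(log):
    """Assign UNDO, REDO, or NOTHING to each transaction seen in the log."""
    status = {}
    for event, txn in log:
        if event == "BOT":
            # Active until an EOT record proves otherwise.
            status[txn] = "UNDO"
        elif event == "EOT":
            # Committed since the last checkpoint: must be re-instated.
            status[txn] = "REDO"
        elif event == "CHECKPOINT":
            # Assumption: the checkpoint wrote out all modified pages, so
            # transactions already committed at this point need nothing.
            for t, s in status.items():
                if s == "REDO":
                    status[t] = "NOTHING"
    return status


# Hypothetical log, ending at the crash.
log = [
    ("BOT", "T1"), ("EOT", "T1"),
    ("BOT", "T2"), ("BOT", "T3"),
    ("CHECKPOINT", None),
    ("EOT", "T2"),
    ("BOT", "T4"), ("EOT", "T4"),
    ("BOT", "T5"),
]
result = classify(log)
```

For this log, T1 needs nothing, T2 and T4 are redone, and T3 and T5 are undone. With other checkpoint types the cut-off points for REDO and UNDO differ.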

Transaction-Oriented Checkpoint (TOC)

• FORCE implies TOC
• The EOT record serves as both BEGIN-CHECKPOINT and END-CHECKPOINT
• Frequently used pages need to be written out each time a transaction commits
• Not suitable for large applications

Source: T. Haerder, A. Reuter

Transaction-Consistent Checkpoint (TCC)

Source: T. Haerder, A. Reuter

• When checkpoint generation is triggered
  – All new update transactions are put on hold
  – All incomplete update transactions are completed
  – All modified pages are written out

• Both REDO and UNDO are bounded
  – REDO starts from the latest checkpoint
  – UNDO goes back to the latest checkpoint

• Drawbacks
  – Delays new update transactions; not suitable for large multi-user DBMS
  – High checkpointing costs

Action-Consistent Checkpoint (ACC)

Source: T. Haerder, A. Reuter

• When checkpoint generation is triggered
  – All new actions are put on hold
  – All incomplete actions are completed
  – All modified pages are written out

• Less disruptive than TCC
• Partial REDO only from the most recent checkpoint
• Global UNDO not bounded
• Still costly when buffers are large

Fuzzy ACC

• During checkpointing, the page numbers of all dirty pages in the buffer are written to the log

• If a modified page was already listed in the previous checkpoint and has not been written out since, write it out now

• Partial REDO starts from the penultimate checkpoint
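The fuzzy checkpoint rule can be sketched as follows. The class `FuzzyCheckpointer`, its attributes, and the use of integer page numbers are assumptions for illustration; the buffer is modeled only as a set of dirty page numbers.

```python
class FuzzyCheckpointer:
    """Toy model: tracks which pages are dirty and which writes happen."""

    def __init__(self):
        self.dirty = set()       # pages modified in buffer, not yet on disk
        self.prev_dirty = set()  # listed at the previous checkpoint, unwritten since
        self.log = []            # checkpoint records
        self.disk_writes = []    # order in which pages were written out

    def modify(self, page):
        self.dirty.add(page)

    def write_out(self, page):
        self.disk_writes.append(page)
        self.dirty.discard(page)
        self.prev_dirty.discard(page)

    def checkpoint(self):
        # Log the page numbers of all dirty pages; no bulk page writes here.
        self.log.append(("CHECKPOINT", sorted(self.dirty)))
        # Pages already listed last time and still unwritten go out now.
        for page in sorted(self.prev_dirty):
            self.write_out(page)
        # Whatever is still dirty becomes the reference set for next time.
        self.prev_dirty = set(self.dirty)


cp = FuzzyCheckpointer()
cp.modify(1)
cp.modify(2)
cp.checkpoint()   # logs [1, 2], writes nothing
cp.write_out(2)   # ordinary buffer replacement writes page 2
cp.modify(3)
cp.modify(2)      # page 2 is dirtied again
cp.checkpoint()   # logs [1, 2, 3], writes only page 1
```

In this model no page stays dirty across more than two consecutive checkpoints, which is why partial REDO can start at the penultimate checkpoint.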

Archive Recovery

Make sure the two paths are independent!!

Source: T. Haerder, A. Reuter

Multi-Generation Archive Copies

• Archive copies are accessed very infrequently
• Subject to magnetic decay
• Keep several generations

Source: T. Haerder, A. Reuter

Duplicate Archive Logs

Source: T. Haerder, A. Reuter

• Archive log must extend back to the oldest archive copy
• Log is susceptible to magnetic decay as well
• Duplicate the archive log
• Need to synchronize both archive logs with the temporary log at EOT
• Very expensive!

Decouple Archive Logs from EOT

Source: T. Haerder, A. Reuter

• Log entries written only to temporary log during EOT

• Asynchronous process copies REDO entries to archive log

• Need to replicate the temporary log
• Synchronize both temporary logs at EOT

Summary

• Failure types

Failure Type    Recovery Action
Transaction     Transaction UNDO
System          Global UNDO, Partial REDO
Media           Global REDO

• Crash recovery
  – TOC: Per transaction
  – TCC: Transaction boundary
  – ACC: Action boundary

• Archive recovery
  – Multi-generation archive copy
  – Duplicate archive logs
  – Decouple archive log from EOT
