transaction unit 1 topic 4

Crash Recovery
• The recovery manager is responsible for ensuring two important properties: atomicity and durability
• Atomicity is ensured by undoing the actions of transactions that do not commit
• Durability is ensured by making sure that all actions of committed transactions survive system crashes and media failures
• A system crash may be caused by, for example, a bus error or an OS failure
• A media failure occurs when, for example, a disk is corrupted

Logging

• Basic idea: logging
• Log: an ordered list of REDO/UNDO actions
• Record REDO and UNDO information, for every update, in a log
• Writes to the log are sequential (put the log on a separate disk)
• Minimal info (diff) is written to the log, so multiple updates fit in a single log page

• Log record contains: <, pageID, offset, length, old data, new data> and additional control info
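As a minimal sketch of how such a record carries both REDO and UNDO information, the Python below models an update record and applies it to a page held as a bytearray. The class name UpdateRecord and the functions redo and undo are hypothetical names chosen for illustration, and the additional control info is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateRecord:
    # Field names mirror the slide; control info is omitted (assumption).
    page_id: int
    offset: int
    length: int
    old_data: bytes  # before-image, used for UNDO
    new_data: bytes  # after-image, used for REDO


def redo(page: bytearray, rec: UpdateRecord) -> None:
    """Reapply the change by writing the after-image into the page."""
    page[rec.offset:rec.offset + rec.length] = rec.new_data


def undo(page: bytearray, rec: UpdateRecord) -> None:
    """Reverse the change by writing the before-image into the page."""
    page[rec.offset:rec.offset + rec.length] = rec.old_data


page = bytearray(b"hello world")
rec = UpdateRecord(page_id=7, offset=6, length=5,
                   old_data=b"world", new_data=b"there")
redo(page, rec)
after_redo = bytes(page)
undo(page, rec)
after_undo = bytes(page)
```

Only the changed bytes (the diff) are stored, which is why many such records fit in a single log page.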

Introduction to ARIES
• ARIES is a recovery algorithm designed to work with a steal, no-force approach (steal means that pages modified by an uncommitted transaction may be written to disk; no-force means that some changes of a committed transaction may not have been written to disk at the time of a subsequent crash)
• When the recovery manager is invoked after a crash, restart proceeds in 3 phases:
– Analysis: identifies the dirty pages in the buffer pool and the transactions active at the time of the crash
– Redo: repeats all actions, starting from an appropriate point in the log, and restores the db state to what it was at the time of the crash
– Undo: undoes the actions of transactions that did not commit, so that the db reflects only the actions of committed transactions

No force and Steal approach

ARIES contd.
• ARIES is a state-of-the-art recovery method
– Incorporates numerous optimizations to reduce overheads during normal processing and to speed up recovery
• ARIES uses:
1. Log sequence numbers (LSNs) to identify log records
• Stores LSNs in pages to identify what updates have already been applied to a database page
2. Physiological redo
3. A dirty page table to avoid unnecessary redos during recovery
4. Fuzzy checkpointing that only records information about dirty pages, and does not require dirty pages to be written out at checkpoint time

ARIES Recovery Algorithm
• Db transaction recovery focuses on the different methods used to recover a db from an inconsistent state to a consistent state by using the data in the transaction log
• Four important concepts that affect the recovery process:
– Write-ahead log protocol: transaction log records are written before any db data are actually updated. This protocol ensures that, in case of a failure, the db can later be recovered to a consistent state using the data in the transaction log
– Redundant transaction logs: most DBMSs keep several copies of the transaction log to ensure that a physical disk failure will not impair the DBMS's ability to recover data
– Db buffers: a buffer is a temporary storage area in primary memory used to speed up disk operations. To improve processing time, the DBMS software reads the data from the physical disk and stores a copy of it in a buffer in primary memory. When a transaction updates data, it is the copy in the buffer that gets updated
– Db checkpoints: a checkpoint is an operation in which the DBMS writes all of its updated buffers to disk. Checkpoints are automatically scheduled by the DBMS several times per hour, and they play an important role in transaction recovery
• The db recovery process involves bringing the db to a consistent state after a failure

Write-Ahead Logging (WAL)
• The Write-Ahead Logging protocol:
1. Must force the log record for an update before the corresponding data page gets to disk
2. Must write all log records for a transaction before it commits
• #1 guarantees atomicity
• #2 guarantees durability

WAL & the Log
• Each log record has a unique Log Sequence Number (LSN)
– LSNs are always increasing
• Each data page contains a pageLSN
– The LSN of the most recent log record for an update to that page
• The system keeps track of flushedLSN
– The max LSN flushed so far
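Taken together, pageLSN and flushedLSN give a simple test for the first WAL rule: a data page may go to disk only when pageLSN <= flushedLSN. The sketch below illustrates that check under simplifying assumptions: the log is an in-memory list, the LSN is the position of a record in that list, and LogManager, can_write_page and write_page are hypothetical names.

```python
class LogManager:
    def __init__(self):
        self.records = []      # in-memory log; list position + 1 is the LSN
        self.flushed_lsn = 0   # max LSN known to be on stable storage

    def append(self, payload):
        self.records.append(payload)
        return len(self.records)   # LSNs are handed out in increasing order

    def flush(self, up_to_lsn):
        # Stand-in for forcing the log to stable storage up to this LSN.
        self.flushed_lsn = max(self.flushed_lsn, up_to_lsn)


def can_write_page(log, page_lsn):
    """WAL check: every log record for the page must already be flushed."""
    return page_lsn <= log.flushed_lsn


def write_page(log, page_lsn):
    """Force the log first if needed; only then may the page go to disk."""
    if not can_write_page(log, page_lsn):
        log.flush(page_lsn)
    return can_write_page(log, page_lsn)


log = LogManager()
log.append("T1 updates P5")
page_lsn = log.append("T1 updates P5 again")   # pageLSN of P5 is now 2
allowed_before = can_write_page(log, page_lsn)
allowed_after = write_page(log, page_lsn)
```

Before the flush the page must stay in the buffer pool; after the log is forced up to LSN 2, writing it is safe.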

The log
• The log is also known as the trail or journal. It is a history of actions executed by the DBMS. The log is a file of records stored in stable storage, which is assumed to survive crashes
• This durability is achieved by maintaining two or more copies of the log on different disks, so that the chance of all copies being lost is negligibly small
• The most recent portion of the log, called the log tail, is kept in main memory and is periodically forced to stable storage
• Every log record is given a unique id called the log sequence number (LSN). As with any record id, we can fetch a log record with one disk access given the LSN
• Further, LSNs are assigned in monotonically increasing order; this is required by the ARIES recovery algorithm
• If the log is a sequential file, in principle growing indefinitely, the LSN can simply be the address of the first byte of the log record
• Various techniques are used to identify portions of the log that are "too old" to be needed again, in order to bound the amount of stable storage used for the log
• For recovery purposes, every page in the db contains the LSN of the most recent log record that describes a change to this page; this LSN is called the pageLSN
• Every log record has the fields prevLSN, transID (the id of the transaction generating the log record) and type
• The set of all log records for a given transaction is maintained as a linked list going back in time through the prevLSN field; this list is updated whenever a log record is added
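The per-transaction chain formed by prevLSN can be sketched as follows. This is an illustration only: the log is a Python dict keyed by LSN, LSNs are small integers rather than byte addresses, and the helper names append and history are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRecord:
    lsn: int
    prev_lsn: Optional[int]   # previous record of the same transaction
    trans_id: str
    type: str


log = {}        # lsn -> LogRecord
last_lsn = {}   # trans_id -> LSN of that transaction's latest record


def append(trans_id, rec_type):
    """Append a record and link it to the transaction's previous record."""
    lsn = len(log) + 1   # monotonically increasing
    log[lsn] = LogRecord(lsn, last_lsn.get(trans_id), trans_id, rec_type)
    last_lsn[trans_id] = lsn
    return lsn


def history(trans_id):
    """Walk the prevLSN chain backward, most recent record first."""
    chain = []
    lsn = last_lsn.get(trans_id)
    while lsn is not None:
        chain.append(lsn)
        lsn = log[lsn].prev_lsn
    return chain


append("T1", "update")
append("T2", "update")
append("T1", "update")
append("T1", "commit")
t1_chain = history("T1")
t2_chain = history("T2")
```

Following the chain needs one log record fetch per step, which is what makes rolling back a single transaction cheap even when its records are interleaved with those of other transactions.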

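Log records and the actions that write them
• Updating a page: after a page is modified, an update type record is appended to the log tail, and the pageLSN of the page is set to the LSN of that record
• Commit: when a transaction decides to commit, it force-writes a commit type log record containing the transaction id; that is, the record is appended to the log and the log tail is written to stable storage, up to and including the commit record
• Abort: when a transaction is aborted, an abort type log record containing the transaction id is appended to the log
• End: when a transaction has finished committing or aborting, an end type log record containing the transaction id is appended to the log
• Undoing an update: when a transaction is rolled back, its updates are undone. When the action described by an update log record is undone, a compensation log record (CLR) is written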

Update Log Record & Compensation Log Record (CLR)
• Update log record: the pageID field is the page id of the modified page; the length in bytes and the offset of the change are also included
• The before-image is the value of the changed bytes before the change
• The after-image is the value after the change
• An update log record that contains both the before-image and the after-image can be used to redo the change and to undo it
• A redo-only update log record contains just the after-image; an undo-only update log record contains just the before-image
• CLR: a compensation log record C is written just before the change recorded in an update log record U is undone
• Such an undo can happen during normal system execution when a transaction is aborted, or during recovery from a crash
• C describes the action taken to undo the action recorded in the corresponding update log record and is appended to the log tail just like any other log record
• C contains a field called undoNextLSN, which is the LSN of the next log record that is to be undone for the transaction that wrote update record U; this field in C is set to the value of prevLSN in U
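The relationship between an update record U and its CLR can be sketched as below. The classes Update and CLR and the function make_clr are hypothetical, and the CLR is simplified: it carries only the restored bytes and undoNextLSN, not the full set of fields of a real log record.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Update:
    lsn: int
    prev_lsn: Optional[int]
    trans_id: str
    page_id: str
    before: bytes   # before-image
    after: bytes    # after-image


@dataclass
class CLR:
    lsn: int
    trans_id: str
    page_id: str
    restored: bytes                # bytes written back by the undo
    undo_next_lsn: Optional[int]   # next record to undo for this transaction


def make_clr(u: Update, new_lsn: int) -> CLR:
    """Build the CLR for undoing u: undoNextLSN is copied from u.prevLSN."""
    return CLR(new_lsn, u.trans_id, u.page_id, u.before, u.prev_lsn)


u1 = Update(10, None, "T1", "P5", b"ABC", b"DEF")
u2 = Update(20, 10, "T1", "P6", b"HIJ", b"KLM")

# Rolling back T1 undoes its most recent update first.
c2 = make_clr(u2, 30)
c1 = make_clr(u1, 40)
```

Here c2.undo_next_lsn is 10, pointing at u1, and c1.undo_next_lsn is None, which signals that T1 has nothing left to undo. A CLR restores the before-image and is itself never undone.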

Other Recovery-Related Structures
• In addition to the log, the following two tables contain important recovery-related information:
– Transaction table: contains one entry for each active transaction. The entry contains the transaction id, the status, and a field called lastLSN, which is the LSN of the most recent log record for this transaction. The status indicates whether the transaction is in progress, committed or aborted
– Dirty page table: contains one entry for each dirty page in the buffer pool, i.e. each page with changes not yet reflected on disk. The entry contains a field recLSN, which is the LSN of the first log record that caused the page to become dirty. This LSN identifies the earliest log record that might have to be redone for this page during restart from a crash
• During normal operation, these tables are maintained by the transaction manager and the buffer manager, respectively
• During restart after a crash, these tables are reconstructed in the Analysis phase of restart
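A minimal sketch of how the two tables might be maintained during normal operation is given below. The event handlers (on_update, on_commit, on_end, on_page_flushed) are hypothetical names, and removing a transaction's entry when its end record is written is an assumption of this sketch.

```python
transaction_table = {}   # trans_id -> {"status": ..., "lastLSN": ...}
dirty_page_table = {}    # page_id -> recLSN


def on_update(lsn, trans_id, page_id):
    transaction_table[trans_id] = {"status": "in progress", "lastLSN": lsn}
    # recLSN is set only by the first record that dirties the page.
    dirty_page_table.setdefault(page_id, lsn)


def on_commit(lsn, trans_id):
    transaction_table[trans_id] = {"status": "committed", "lastLSN": lsn}


def on_end(trans_id):
    # Assumption: a finished transaction is no longer active, so drop it.
    transaction_table.pop(trans_id, None)


def on_page_flushed(page_id):
    # Once the page is written to disk it is clean again.
    dirty_page_table.pop(page_id, None)


on_update(1, "T1", "P5")
on_update(2, "T2", "P3")
on_update(3, "T1", "P5")   # P5 is already dirty: recLSN stays 1
on_commit(4, "T2")
on_end("T2")
```

After these events the transaction table holds only T1 with lastLSN 3, and the dirty page table maps P5 to recLSN 1 and P3 to recLSN 2.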

Write-ahead log protocol
• WAL is the fundamental rule that ensures that a record of every change to the db is available while attempting to recover from a crash
• When a transaction commits, the log tail is forced to stable storage, even if a no-force approach is being used (no-force means that some of the transaction's changes may not have been written to disk at the time of a subsequent crash)
• If a force approach is used, all the pages modified by the transaction, rather than the portion of the log that includes all its records, must be forced to disk when the transaction commits
• The set of all changed pages is typically much larger than the log tail, because the size of an update log record is close to twice the size of the changed bytes, which is likely to be much smaller than the page size
• The log is maintained as a sequential file, hence all writes to the log are sequential writes
• Consequently, the cost of forcing the log tail is much smaller than the cost of writing all changed pages to disk
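A rough back-of-the-envelope comparison makes the size argument concrete. The numbers below are assumptions chosen purely for illustration (an 8 KB page, a 32-byte record header, one change per page); they are not taken from any particular system.

```python
PAGE_SIZE = 8192     # bytes per data page (assumed)
RECORD_HEADER = 32   # bytes of control info per log record (assumed)


def log_bytes(changed_byte_counts):
    """Each update record stores a before- and an after-image (about 2x)."""
    return sum(RECORD_HEADER + 2 * n for n in changed_byte_counts)


def forced_page_bytes(changed_byte_counts):
    """With a force approach every modified page is written in full."""
    return PAGE_SIZE * len(changed_byte_counts)


# A transaction that changes 100 bytes on each of 10 different pages.
changes = [100] * 10
log_cost = log_bytes(changes)            # 10 * (32 + 200) bytes
page_cost = forced_page_bytes(changes)   # 10 * 8192 bytes
```

Under these assumptions the commit forces 2320 bytes of log instead of 81920 bytes of pages, and the log bytes are written sequentially while the pages may be scattered across the disk.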

Checkpointing
• A checkpoint is like a snapshot of the DBMS state; by taking checkpoints periodically, the DBMS can reduce the amount of work to be done during restart in the event of a subsequent crash
• Checkpointing in ARIES has 3 steps:
1. A begin_checkpoint record is written to indicate when the checkpoint starts
2. An end_checkpoint record is constructed, including in it the current contents of the transaction table and the dirty page table, and appended to the log
3. After the end_checkpoint record is written to stable storage, a special master record containing the LSN of the begin_checkpoint log record is written to a known place on stable storage
• While the end_checkpoint record is being constructed, the DBMS continues executing transactions and writing other log records; this is why this kind of checkpoint is called a fuzzy checkpoint
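The three steps can be sketched as follows. This is a single-threaded illustration, so the concurrent activity that makes the checkpoint fuzzy is not modelled; the log is a Python list whose positions serve as LSNs, and take_checkpoint and master_record are hypothetical names.

```python
import copy

log = []   # in-memory stand-in for the log; position + 1 is the LSN
master_record = {"begin_checkpoint_lsn": None}   # the "known place"
transaction_table = {"T1": {"status": "in progress", "lastLSN": 3}}
dirty_page_table = {"P5": 1}


def append(record):
    log.append(record)
    return len(log)


def take_checkpoint():
    # Step 1: mark where the checkpoint starts.
    begin_lsn = append({"type": "begin_checkpoint"})
    # Step 2: snapshot both tables into the end_checkpoint record.
    append({
        "type": "end_checkpoint",
        "transaction_table": copy.deepcopy(transaction_table),
        "dirty_page_table": dict(dirty_page_table),
    })
    # Step 3: only now point the master record at begin_checkpoint.
    master_record["begin_checkpoint_lsn"] = begin_lsn
    return begin_lsn


for _ in range(3):
    append({"type": "update"})   # earlier activity, LSNs 1 to 3
begin_lsn = take_checkpoint()
```

Updating the master record last means that a crash in the middle of a checkpoint leaves the master record pointing at the previous complete checkpoint. No dirty page is written to disk here; only the tables are recorded.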

Checkpointing

• Checkpointing is done as follows:
1. Output all log records in memory to stable storage
2. Output to disk all modified buffer blocks
3. Output to log on stable storage a <checkpoint L> record, where L is the list of transactions active at the time of the checkpoint
• Transactions are not allowed to perform any actions while checkpointing is in progress
• Fuzzy checkpointing allows transactions to progress while the most time-consuming parts of checkpointing are in progress
– Performed as described under "Fuzzy checkpointing" below

Fuzzy checkpointing
• Fuzzy checkpointing is done as follows:
1. Temporarily stop all updates by transactions
2. Write a <checkpoint L> log record and force the log to stable storage
3. Note the list M of modified buffer blocks
4. Now permit transactions to proceed with their actions
5. Output to disk all modified buffer blocks in list M
– Blocks should not be updated while being output
– Follow WAL: all log records pertaining to a block must be output before the block is output
6. Store a pointer to the checkpoint record in a fixed position last_checkpoint on disk
• When recovering using a fuzzy checkpoint, start the scan from the checkpoint record pointed to by last_checkpoint
– Log records before last_checkpoint have their updates reflected in the database on disk, and need not be redone
– Incomplete checkpoints, where the system crashed while performing a checkpoint, are handled safely

Some more notes on checkpointing
• Periodically, the DBMS creates a checkpoint in order to minimize the time taken to recover in the event of a system crash. It writes to the log:
– begin_checkpoint record: indicates when the checkpoint began
– end_checkpoint record: includes the current contents of the transaction table and dirty page table
• This is a "fuzzy checkpoint": other transactions continue to run, so these tables are accurate only as of the time of the begin_checkpoint record
• No attempt is made to force dirty pages to disk; the effectiveness of a checkpoint is limited by the oldest unwritten change to a dirty page (so it is a good idea to periodically flush dirty pages to disk)
• Store the LSN of the checkpoint record in a safe place (the master record)

Recovering from system crash
• When the system is restarted after a crash, the recovery manager proceeds in 3 phases:
– Analysis: begins by examining the most recent begin_checkpoint record, whose LSN is denoted by C, and proceeds forward in the log until the last log record
– Redo: follows Analysis and redoes all changes to any page that might have been dirty at the time of the crash; this set of pages and the starting point for Redo are determined during Analysis
– Undo: follows Redo and undoes the changes of all transactions active at the time of the crash; this set of transactions is identified during the Analysis phase
• Redo reapplies changes in the order in which they were originally carried out; Undo reverses changes in the opposite order, reversing the most recent change first

Analysis phase
• It performs 3 tasks:
– It determines the point in the log at which to start the Redo pass
– It determines (a conservative superset of) the pages in the buffer pool that were dirty at the time of the crash
– It identifies transactions that were active at the time of the crash and must be undone
• This phase begins by examining the most recent begin_checkpoint log record and initializing the dirty page table and transaction table to the copies of those structures in the next end_checkpoint record
• Thus, these tables are initialized to the set of dirty pages and active transactions at the time of the checkpoint
• Analysis then scans the log forward until it reaches the end of the log, updating both tables for each log record it encounters
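The forward scan can be sketched as below. It is a simplified illustration, not a complete implementation: log records are plain dicts, the end_checkpoint record is assumed to directly follow begin_checkpoint, and the function name analysis and the record layout are hypothetical.

```python
def analysis(log, begin_checkpoint_lsn):
    """Rebuild both tables by scanning forward from the checkpoint."""
    tt, dpt = {}, {}
    for rec in log:
        if rec["lsn"] <= begin_checkpoint_lsn:
            continue
        kind, tid = rec["type"], rec.get("trans_id")
        if kind == "end_checkpoint":
            # Start from the copies saved by the checkpoint.
            tt = {t: dict(e) for t, e in rec["transaction_table"].items()}
            dpt = dict(rec["dirty_page_table"])
        elif kind == "end":
            tt.pop(tid, None)   # transaction is finished
        else:
            status = "committed" if kind == "commit" else "in progress"
            tt[tid] = {"status": status, "lastLSN": rec["lsn"]}
            if kind in ("update", "CLR"):
                # First record seen for a page not yet listed sets recLSN.
                dpt.setdefault(rec["page_id"], rec["lsn"])
    redo_start = min(dpt.values(), default=None)
    losers = sorted(t for t, e in tt.items() if e["status"] != "committed")
    return tt, dpt, redo_start, losers


log = [
    {"lsn": 1, "type": "update", "trans_id": "T1", "page_id": "P5"},
    {"lsn": 2, "type": "update", "trans_id": "T2", "page_id": "P3"},
    {"lsn": 3, "type": "begin_checkpoint"},
    {"lsn": 4, "type": "end_checkpoint",
     "transaction_table": {"T1": {"status": "in progress", "lastLSN": 1},
                           "T2": {"status": "in progress", "lastLSN": 2}},
     "dirty_page_table": {"P5": 1, "P3": 2}},
    {"lsn": 5, "type": "commit", "trans_id": "T2"},
    {"lsn": 6, "type": "end", "trans_id": "T2"},
    {"lsn": 7, "type": "update", "trans_id": "T3", "page_id": "P1"},
    {"lsn": 8, "type": "update", "trans_id": "T1", "page_id": "P5"},
]
tt, dpt, redo_start, losers = analysis(log, begin_checkpoint_lsn=3)
```

In this example T2 finished before the crash, so the losers are T1 and T3, and Redo has to start at LSN 1, the smallest recLSN in the rebuilt dirty page table.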

Redo phase
• ARIES reapplies the updates of all transactions, committed or otherwise
• If a transaction was aborted before the crash and its updates were undone, as indicated by CLRs, the actions described in the CLRs are also reapplied
• This repeating-history approach distinguishes ARIES from other proposed WAL-based recovery algorithms and causes the db to be brought to the same state it was in at the time of the crash
• The Redo phase begins with the log record that has the smallest recLSN of all pages in the dirty page table constructed by the Analysis phase, because this log record identifies the oldest update that may not have been written to disk prior to the crash
• Starting from this log record, Redo scans forward until the end of the log
• For each redoable log record (update or CLR) encountered, Redo checks whether the logged action must be redone
• The action must be redone unless one of the following conditions holds:
– The affected page is not in the dirty page table
– The affected page is in the dirty page table, but the recLSN for the entry is greater than the LSN of the log record being checked
– The pageLSN (stored on the page, which must be retrieved to check this condition) is greater than or equal to the LSN of the log record being checked
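The three skip conditions translate directly into a small predicate. In the sketch below the on-disk pageLSN values are held in a dict instead of being read from pages, the change itself is not applied, and needs_redo and redo_pass are hypothetical names. Setting the pageLSN to the LSN of a redone record follows the usual ARIES description and is not stated in the notes above.

```python
def needs_redo(rec, dirty_page_table, page_lsns):
    page_id, lsn = rec["page_id"], rec["lsn"]
    if page_id not in dirty_page_table:
        return False   # condition 1: page was not dirty at the crash
    if dirty_page_table[page_id] > lsn:
        return False   # condition 2: change predates recLSN, already on disk
    if page_lsns[page_id] >= lsn:
        return False   # condition 3: the page already reflects this record
    return True


def redo_pass(log, dirty_page_table, page_lsns):
    if not dirty_page_table:
        return []
    start = min(dirty_page_table.values())   # smallest recLSN
    redone = []
    for rec in log:
        if rec["lsn"] < start or rec["type"] not in ("update", "CLR"):
            continue
        if needs_redo(rec, dirty_page_table, page_lsns):
            page_lsns[rec["page_id"]] = rec["lsn"]   # record the reapplication
            redone.append(rec["lsn"])
    return redone


log = [
    {"lsn": 1, "type": "update", "page_id": "P3"},
    {"lsn": 2, "type": "update", "page_id": "P5"},
    {"lsn": 3, "type": "update", "page_id": "P3"},
    {"lsn": 4, "type": "update", "page_id": "P9"},
    {"lsn": 5, "type": "update", "page_id": "P5"},
]
dirty_page_table = {"P3": 1, "P5": 5}
page_lsns = {"P3": 1, "P5": 2, "P9": 4}   # pageLSN values found on disk
redone = redo_pass(log, dirty_page_table, page_lsns)
```

Records 1, 2 and 4 are skipped by conditions 3, 2 and 1 respectively, so only the updates at LSN 3 and LSN 5 are reapplied. The first two conditions are checked before the third because they avoid fetching the page at all.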

Undo phase
• The Undo phase scans backward from the end of the log
• The goal is to undo the actions of all transactions active at the time of the crash, i.e., to effectively abort them
• Undo begins with the transaction table constructed by the Analysis phase, which identifies all transactions active at the time of the crash and includes the LSN of the most recent log record (the lastLSN field) for each such transaction
• Such transactions are called loser transactions
• All actions of losers must be undone, and they must be undone in the reverse of the order in which they appear in the log
• Consider the set of lastLSN values for all loser transactions; call this set ToUndo
• Undo repeatedly chooses the largest (i.e. most recent) LSN value in this set and processes it, until ToUndo is empty. To process a log record:
– If it is a CLR and its undoNextLSN value is not null, the undoNextLSN value is added to the set ToUndo. If the undoNextLSN is null, an end record is written for the transaction because it is completely undone, and the CLR is discarded
– If it is an update record, a CLR is written and the corresponding action is undone, and the prevLSN value in the update log record is added to the set ToUndo
• When the set ToUndo is empty, the Undo phase is complete. Restart is now complete, and the system can proceed with normal operations
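The ToUndo loop can be sketched as follows. This illustration only produces the log records that Undo would append and does not touch any pages; the record layout and the name undo_pass are hypothetical. One simplification is an assumption of the sketch: when an update record has no prevLSN, the end record is written right after its CLR.

```python
def undo_pass(log, loser_last_lsns):
    """Return the records Undo appends, processing the largest LSN first."""
    by_lsn = {r["lsn"]: r for r in log}
    next_lsn = max(by_lsn) + 1
    to_undo = set(loser_last_lsns.values())
    appended = []
    while to_undo:
        lsn = max(to_undo)   # most recent record still to be processed
        to_undo.remove(lsn)
        rec = by_lsn[lsn]
        if rec["type"] == "CLR":
            nxt = rec["undo_next_lsn"]   # skip what was already undone
        else:
            if rec["type"] == "update":
                appended.append({"lsn": next_lsn, "type": "CLR",
                                 "trans_id": rec["trans_id"], "undoes": lsn,
                                 "undo_next_lsn": rec["prev_lsn"]})
                next_lsn += 1
            nxt = rec["prev_lsn"]
        if nxt is not None:
            to_undo.add(nxt)
        else:
            # Nothing left to undo for this transaction.
            appended.append({"lsn": next_lsn, "type": "end",
                             "trans_id": rec["trans_id"]})
            next_lsn += 1
    return appended


log = [
    {"lsn": 1, "type": "update", "trans_id": "T1", "prev_lsn": None},
    {"lsn": 2, "type": "update", "trans_id": "T2", "prev_lsn": None},
    {"lsn": 3, "type": "update", "trans_id": "T1", "prev_lsn": 1},
    # T2 was already being rolled back before the crash:
    {"lsn": 4, "type": "CLR", "trans_id": "T2", "prev_lsn": 2,
     "undoes": 2, "undo_next_lsn": None},
]
appended = undo_pass(log, {"T1": 3, "T2": 4})
summary = [(r["lsn"], r["type"], r["trans_id"]) for r in appended]
```

Because the CLR at LSN 4 has a null undoNextLSN, T2 gets only an end record and the update at LSN 2 is not undone a second time. T1's two updates are undone in reverse order (LSN 3, then LSN 1), each producing a CLR.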
