12. Recovery


Upload: halen

Post on 19-Jan-2016


Page 1: 12. Recovery

12. Recovery

REVIEW: COMMIT is the successful end-of-transaction operation. Changes to data items are not made permanent until the COMMIT issued by the Transaction Manager (TM) is acknowledged by the Data Manager (DM).

| Transaction Manager(s) |
| SCHEDULER              |
| DATA MANAGER           |
| DATABASE ON DISK       |

Following that ack, the DBMS must guarantee that the updates will never be lost, no matter what happens! (DB must be recoverable).

The DBMS uses the techniques below to guarantee recoverability to a recent COMMITTED DB STATE (a state in which all data items show the value written by a committed transaction and which is consistent with the integrity constraints).

ROLLBACK (ABORT) is the unsuccessful end-of-transaction operation. All changes are undone using the LOG (or JOURNAL).
- The on-line LOG holds all updates as they are made.
- When the on-line log fills up, it is written to the off-line log (usually on tape).
- LOGs can grow to be as large as the database itself.

Write-Ahead Logging (WAL) Protocol: requires that a log record holding the "last committed value" of an item be physically written before that item is changed (overwritten).

The WAL protocol facilitates "UNDO by re-installing before-values" of all changes (which removes the effects of an aborted transaction).
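The WAL rule and the before-value UNDO can be sketched in a few lines of Python. This is a minimal illustration, not a real DBMS component: the class name WalStore, its methods, and the in-memory list standing in for the log are all assumptions made for the example.

```python
class WalStore:
    """Toy data store that logs a before-value ahead of every overwrite."""

    def __init__(self, initial):
        self.data = dict(initial)  # stands in for the database
        self.log = []              # stands in for the on-line log

    def write(self, txn, item, new_value):
        # WAL rule: append the log record holding the before-value
        # BEFORE the item itself is overwritten.
        self.log.append((txn, item, self.data.get(item)))
        self.data[item] = new_value

    def rollback(self, txn):
        # UNDO: re-install before-values, newest change first.
        for logged_txn, item, before in reversed(self.log):
            if logged_txn == txn:
                self.data[item] = before
        # Drop the undone transaction's records from the toy log.
        self.log = [rec for rec in self.log if rec[0] != txn]


store = WalStore({"x": 10, "y": 20})
store.write("T1", "x", 11)
store.write("T1", "x", 12)
store.write("T1", "y", 25)
store.rollback("T1")
print(store.data)  # {'x': 10, 'y': 20}
```

Undoing in reverse log order matters when a transaction overwrites the same item more than once: the oldest before-value is the one that must end up installed.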

Section 12 # 1

Page 2: 12. Recovery

Transaction Failure

TYPES OF FAILURES:

Transaction local (ABENDS, NSF check)

System failures (DBMS itself fails)

Media failure (disk crash)

TRANSACTION FAILURE (transactions themselves are responsible for action)

e.g., abnormal program ends (ABENDS), Non-Sufficient Funds (NSF)

Transaction code can trap these and specify a remedy (e.g., ROLLBACK). However, to facilitate proper transaction actions, the system must hold all output messages until COMMIT. Otherwise, this can happen:

[Figure: an ATM hands cash ($) to the customer; the transaction then hits NSF and issues ROLLBACK, to the BANKER's dismay.]

At an ATM cash machine, if the "message" (the cash) is given to the customer before commit, it is impossible to ROLLBACK the transaction.
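One way to picture "hold all output messages until COMMIT" is a per-transaction outbox that is released only at commit time. The sketch below is illustrative only; the Transaction class, the withdraw function, and the message text are hypothetical names introduced for the example.

```python
class Transaction:
    def __init__(self, outbox):
        self.outbox = outbox  # messages actually released to the user
        self.pending = []     # messages held back until COMMIT

    def send(self, message):
        self.pending.append(message)

    def commit(self):
        # Only now do the held messages reach the outside world.
        self.outbox.extend(self.pending)
        self.pending.clear()

    def rollback(self):
        # Nothing was released, so nothing has to be taken back.
        self.pending.clear()


def withdraw(balance, amount, outbox):
    txn = Transaction(outbox)
    txn.send(f"dispense ${amount}")
    if amount > balance:      # NSF detected after the message was queued
        txn.rollback()
        return balance
    txn.commit()
    return balance - amount


released = []
print(withdraw(100, 500, released), released)  # 100 []
print(withdraw(100, 40, released), released)   # 60 ['dispense $40']
```

In the NSF case the "dispense" message never leaves the pending list, so the ROLLBACK has nothing irreversible to undo.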


Page 3: 12. Recovery

System Failure

SYSTEM FAILURE: The DBMS itself fails and the memory contents are lost (including buffers), but the data on disk is undamaged.

The Data Manager is allowed to do its job any way it wants to (to optimize its activity). That is the reason for the component separation in the first place (instead of a monolithic system). So the DM can be implemented so that the disk(s) may contain some "uncommitted values" and/or may not contain all committed values.

The Disk(s) may contain uncommitted values if a STEAL policy is used.

STEAL policy: The Buffer Manager can replace a page that still holds uncommitted values, i.e., write a page containing uncommitted values to disk (in effect "stealing" a page from one transaction and giving it to another). This is necessary for very long-running transactions, e.g., payroll processing.

The Disk(s) may not contain all committed values if a NO-FORCE policy is used.

NO-FORCE policy: The Buffer Manager need not write a page with newly committed values at commit time; it may write it later. (e.g., a banking system may not be able to afford to force every write immediately.)

BUFFER POLICIES:

| FORCE | STEAL |
| YES   | YES   |
| YES   | NO    |
| NO    | YES   | <- the hardest to implement but the best!
| NO    | NO    |

Although there are systems that use either a NO-STEAL or a FORCE policy (or both), we discuss only STEAL, NO-FORCE (which requires the most demanding recovery system).
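Why STEAL, NO-FORCE is the most demanding combination can be read off the two policy definitions: STEAL lets uncommitted values reach disk (so UNDO is needed), and NO-FORCE lets committed values be missing from disk (so REDO is needed). The small Python sketch below just tabulates that reasoning; the function name recovery_needs is an assumption made for the example.

```python
def recovery_needs(force, steal):
    """Which recovery actions a buffer policy pair makes necessary."""
    return {
        "undo": steal,      # STEAL: uncommitted values may be on disk
        "redo": not force,  # NO-FORCE: committed values may be missing
    }


for force in (True, False):
    for steal in (True, False):
        needs = recovery_needs(force, steal)
        label = f"{'FORCE' if force else 'NO-FORCE'}, {'STEAL' if steal else 'NO-STEAL'}"
        print(f"{label}: undo={needs['undo']}, redo={needs['redo']}")
```

Only the NO-FORCE, STEAL row comes out with both undo and redo required.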


Page 4: 12. Recovery

Steal, No-force buffer policy

In a STEAL NO-FORCE system:

All transactions active at fail-time (BEGUN, not ENDed) must be UNDONE (because some of the changes they made may have been written to disk under the STEAL policy).

All transactions committed at fail-time must be idempotently REDONE (because the committed changes they made may not have been written to disk under the NO-FORCE policy). Idempotent means that repeating the REDO any number of times leaves the same result as doing it once.

One way is to UNDO all active transactions and then idempotently REDO all committed transactions.
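A REDO is idempotent when it re-installs values rather than re-applying operations. The sketch below assumes, purely for illustration, that each log record carries the after-value of the item it changed; the function name redo and the record layout are hypothetical.

```python
def redo(db, log_records):
    # Each record holds (item, after_value); installing the after-value
    # gives the same result no matter how many times it is repeated.
    for item, after_value in log_records:
        db[item] = after_value
    return db


db = {"balance": 100}
records = [("balance", 150)]

once = redo(dict(db), records)
twice = redo(redo(dict(db), records), records)
print(once == twice)  # True
```

By contrast, a record that said "add 50 to balance" would not be safe to replay, since a crash during recovery could cause it to be applied twice.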

Do we have to go all the way back to IPL (Initial Program Load) and REDO all committed transactions?

Can that be avoided? YES! Through checkpointing!

The System periodically takes a CHECKPOINT.

There are many, many checkpointing methods; the next slide shows a "Standard" CHECKPOINT:


Page 5: 12. Recovery

Steal, No-force checkpoint

[Figure: a transaction writes "change records" and a "COMMIT record" into the log buffer. At a checkpoint, the buffers are flushed first (log buffer to the transaction log on disk, database buffer to the disk copy of the database), then a "checkpoint record" is written to the log.]


There are many, many checkpointing methods; this slide shows a "Standard" CHECKPOINT. It is usually taken at a quiescent point in time (no activity going on), but not necessarily (i.e., there are "on-the-fly" checkpointing methods, but they are complex).

1. Force-writes all buffers to disk immediately (flushes buffers).

2. Force-writes a "checkpoint" record to the log. A CHECKPOINT record must have an "active list" containing all currently active transactions.
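The two checkpoint steps can be sketched as follows. This is a simplified model under stated assumptions: dictionaries stand in for the database buffer and the disk, a list stands in for the log, and the function name take_checkpoint is hypothetical.

```python
def take_checkpoint(db_buffer, disk, log, active_txns):
    # 1. Force-write (flush) all buffered pages to disk.
    disk.update(db_buffer)
    db_buffer.clear()
    # 2. Force-write a checkpoint record carrying the active list.
    log.append(("CHECKPOINT", sorted(active_txns)))


disk = {"x": 1}
db_buffer = {"x": 5, "y": 7}
log = [("BEGIN", "T2"), ("BEGIN", "T3")]

take_checkpoint(db_buffer, disk, log, {"T3", "T2"})
print(disk)     # {'x': 5, 'y': 7}
print(log[-1])  # ('CHECKPOINT', ['T2', 'T3'])
```

After the checkpoint, everything written before it is known to be on disk, which is what lets recovery start from the checkpoint record instead of the beginning of the log.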


Page 6: 12. Recovery

Steal, No-force checkpoint

With standard STEAL NO-FORCE (SNF) checkpointing (described above), which of the following transactions must be undone and which must be redone?

Legend: |------>| is a transaction that is active from BEGIN (left bar) to COMMIT (right bar); a line that reaches CRASH without a right bar never committed.

                     CHECKPOINT              CRASH
                          :                    :
T1  |------->|            :                    :
T2      |-----------------:------------>|      :
T3        |---------------:------------------->:
T4                        :   |--->|           :
T5                        :          |-------->:
T6    |-------------------:->|                 :

After the crash, the RECOVERY PROCESS would:

1. Start at the most recent Checkpoint record in the LOG, which contains ACTIVE-list = {T2, T3, T6}. Set UNDO-list = ACTIVE-list (e.g., UNDO = {T2, T3, T6}) and REDO-list = empty.

2. Scan forward in the LOG from the CHECKPOINT record. For each BEGIN encountered, put the transaction in the UNDO-list (e.g., add T4, T5). For each COMMIT encountered, move the transaction from UNDO to REDO (e.g., move T6, T4, T2).

3. When the LOG is exhausted, idempotently REDO the REDO-list in commit order (e.g., {T6, T4, T2}). UNDO all transactions in the UNDO-list (e.g., {T3, T5}).
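The three steps above can be sketched as a single forward scan of the log. The function name build_lists and the record layout are assumptions made for the example, and the exact interleaving of BEGIN and COMMIT records after the checkpoint is hypothetical; only the commit order T6, T4, T2 comes from the slide.

```python
def build_lists(log_after_checkpoint, active_list):
    undo = list(active_list)  # step 1: UNDO-list = ACTIVE-list
    redo = []                 #         REDO-list = empty
    for op, txn in log_after_checkpoint:  # step 2: scan forward
        if op == "BEGIN":
            undo.append(txn)
        elif op == "COMMIT":
            undo.remove(txn)
            redo.append(txn)  # appended in commit order
    return undo, redo


# Hypothetical log tail, consistent with commit order T6, T4, T2.
log = [
    ("COMMIT", "T6"),
    ("BEGIN", "T4"),
    ("COMMIT", "T4"),
    ("BEGIN", "T5"),
    ("COMMIT", "T2"),
]

undo, redo = build_lists(log, ["T2", "T3", "T6"])
print(redo)  # ['T6', 'T4', 'T2']
print(undo)  # ['T3', 'T5']
```

T1 never appears in either list: it committed before the checkpoint, so the checkpoint's buffer flush already put its changes on disk.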


Page 7: 12. Recovery

Steal, No-force checkpoint

Note: Since transactions are redone in commit order (= REDO order), it must be the case that the serial order to which the execution is equivalent is COMMIT order.

That is, if some other serial order is the one to which the execution is equivalent, the REDO must be done in that order.

In T2 and T4 above, messages may have gone back to the users that were based on an execution order equivalent to SOME serial order (the values reported to users were generated by the execution in that order).

Thus, RECOVERY must regenerate in the same order.

The only way that the RECOVERY process can know what serial order the original execution was equivalent to is for the initial execution to be equivalent to some serial order identifiable from the LOG.

One order identifiable from the LOG is COMMIT order. Therefore, it is common to demand that the order of execution be equivalent to the serial COMMIT-order.

(S2PL does that. Is that why it is so popular?)


Page 8: 12. Recovery

Media Failure

MEDIA FAILURE (from disk crash) RECOVERY

ARCHIVE: periodically dump the database (i.e., make an ARCHIVE copy, e.g., to off-line tape):

1. Shut down the DBMS (e.g., late at night or during "quiescent" period)

2. Copy the entire database to off-line storage (tape)

3. Bring up the DBMS again

4. Erase the LOG and restart logging

[Figure: disk copy of database - - - > tape]

Following a media failure (disk crash),

1. RESTORE the DB from the archive.

[Figure: tape - - - > disk copy of database]

2. REDO the transaction log from archive-time to as near to crash-time as possible, using both the off-line and the on-line log (the on-line log is kept on a separate disk from the database itself for durability). This is called ROLL-FORWARD:

[Figure: the LOG is read into the COMPUTER's log buffer, and each "redo transaction" is re-applied through the database buffer.]
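Restore followed by roll-forward can be sketched as below. This is a toy model under stated assumptions: the archive and database are dictionaries, the log is a list, WRITE records are assumed to carry after-values, and the function name recover_from_media_failure is hypothetical.

```python
def recover_from_media_failure(archive, log):
    # 1. RESTORE the database from the archive copy.
    db = dict(archive)
    # 2. ROLL-FORWARD: redo the writes of committed transactions,
    #    in log order, using the logged after-values.
    committed = {txn for op, txn, *_ in log if op == "COMMIT"}
    for op, txn, *rest in log:
        if op == "WRITE" and txn in committed:
            item, after_value = rest
            db[item] = after_value
    return db


archive = {"x": 1, "y": 2}
log = [
    ("WRITE", "T1", "x", 10),
    ("COMMIT", "T1"),
    ("WRITE", "T2", "y", 99),  # T2 never committed before the crash
]

print(recover_from_media_failure(archive, log))  # {'x': 10, 'y': 2}
```

T2's write is skipped because no COMMIT record for T2 made it into the log, so the recovered state contains only committed values.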


Page 9: 12. Recovery

Media Failure

MEDIA FAILURE (from disk crash) RECOVERY

There are many other methods. DUPLEXING = make two copies of every data item on separate disks (or at least on devices with separate failure modes). The amount of extra disk space used can be reduced by error-correcting methods such as Hamming coding to as low as 5% extra disk space. However, in this Age Of Infinite Storage, is it worth doing? Hamming coding is used in some RAID systems (Redundant Arrays of Independent Disks).

APPENDIX

Storage past, present and future: In 1956, IBM developed RAMAC, a refrigerator-sized disk system with 50 platters, each 2 ft in diameter. RAMAC had a capacity of 5 megabytes.

Since then:
1. The amount of data stored on a given area has increased 1,000,000-fold.
2. The transfer speed has increased 3,000-fold.
3. The cost per bit has decreased 500,000-fold (in comparable dollars).

This is due to breakthroughs in:
1. "areal density" (# bits/square inch).
2. revolution speeds.
3. read-write head technologies.

How much higher can disk capacity go? So far, the "upper limits" predicted by engineers have always been wrong (way wrong).

However, we are approaching a limit determined by fundamental physics, not engineering ingenuity. There comes a point beyond which the random jiggle of electron spins due to temperature is likely to cause the direction of a bit's magnetization to reverse spontaneously within the expected lifetime of the disk.

This is called the SUPERPARAMAGNETIC LIMIT, and it may limit the progress that can be achieved through miniaturizing or "scaling down" existing technologies.

Where is the superparamagnetic limit? Most agree it will be encountered at densities of ~120 Gbits/square inch. At 6.5 square inches per 3.5 inch surface, that gives ~800 Gigabits/surface, or ~100 GigaBytes/surface. Times 50 surfaces, we can conclude that a 3.5 inch hard drive may go to 5000 GigaBytes/disk = 5 TeraBytes/disk. Note that COMMODITY drives today have reached 500 GigaBytes/drive, so another 10-fold increase and we're there with commodity drives!!!

Indexing and providing reference paths and access paths to data stores of this size is nearly impossible!
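The capacity estimate can be checked with a few lines of arithmetic. The variable names are chosen for this example, the input figures are the ones quoted in the text, and decimal units are assumed (1 TeraByte = 1000 GigaBytes).

```python
density_gbit_per_sq_in = 120  # estimated superparamagnetic limit
area_sq_in = 6.5              # usable area of one 3.5 inch surface
surfaces = 50                 # surfaces per drive, as assumed in the text

gbit_per_surface = density_gbit_per_sq_in * area_sq_in   # 780.0
gbyte_per_surface = gbit_per_surface / 8                  # 97.5
tbyte_per_drive = gbyte_per_surface * surfaces / 1000     # 4.875

print(gbit_per_surface, gbyte_per_surface, tbyte_per_drive)
# 780.0 97.5 4.875
```

The unrounded result, about 4.9 TeraBytes per drive, matches the rounded figure of 5 TeraBytes.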


Page 10: 12. Recovery

Appendix continued

What are we going to do?? Holographic storage? (From: holographic storage 1, holographic storage 2)

"Storing data as holograms has intrigued scientists for decades. In the early 1960s, former Almaden Research Center scientist Glenn Sincerbox helped IBM develop the world's first working holographic data storage system, a write-once-read-many (WORM) technology using photographic film, for the US Air Force.

Today, IBM participates in two industry/university/government consortia that aim to demonstrate holographic storage technologies by the turn of the century.

A traditional hologram is produced when a beam of laser light, the reference beam, interferes with another beam reflected from the object to be recorded.

The pattern of interference is captured by photographic film, a light-sensitive crystal or some other optical material. Illuminating the pattern with the reference beam reproduces a 3-D image of the object. (The technology is called interferometry.)

Each viewing angle gives you a different view of the same object. Holographic data storage works in exactly the same way. But for every angle, instead of having another view of the object, we have a completely different page of information."

Up to 10,000 pages have been stored in a single cube of recording material 1 cm on a side. Each page contains one megabit of information, which means that the cube can store ~10 gigabits.

Since there are approximately 16.4 cubic cm in a cubic inch and approximately 43 cubic inches in a 3.5 inch cube (3.5 inch diskettes piled 3.5 inches high), a 3.5 inch holographic cube would hold ~7 terabits of data (about 700 cubic cm at ~10 gigabits each). Holographic recording has the advantage of being inherently non-linear (parallel).

It reads and stores an entire page at a time. The technology permits data rates of up to one gigabit (or 125 megabytes) per second, making it ideal for storing image data.

Another advantage of holographic storage, largely untapped, lies in its use as associative memory. Just as illuminating a hologram with a reference beam recovers the stored info, illuminating it with a pattern of info will reproduce the corresponding reference beam and angle, which immediately identifies the page on which the information is stored.


Page 11: 12. Recovery

Appendix continued

In other words, holographic memories can be searched extremely quickly for data patterns (associative memory).

This would allow database searches using physics rather than software. Note that holographic storage may make the current access path technologies (indexes) obsolete.

Why would anyone use indexes, hash functions, SQL, Relational Algebra, Relational Calculus... when you can simply pattern match in a holo cube?!?!?!?!

It should be interesting.

Spintronics is another solution: IBM and Stanford University (and the NDSU CNSE center in our Research Park) are putting their heads together on a new microelectronics technology dubbed "spintronics" that promises breakthroughs in computer processors and other electronics components while extending Moore's Law for chip design.

In setting up a spintronics lab, researchers at the two organizations plan to control the spin, or magnetic orientation, of electrons within nano-scale electronic structures comprising super-thin layers to produce devices for low-power switching and non-volatile information storage.

Magnetic Properties: Electron spin is a quantum property that has "up" or "down" states. Aligning spins in a material creates magnetism, and magnetic fields affect the passage of electrons differently. Understanding and controlling this property is central to creating a whole new breed of electronic applications.

Among the possibilities are reconfigurable logic devices, room-temperature superconductors and quantum computers. The 1st commercial products, ranging from digital cameras to instant-on computers, will not be available for at least 5 years.

Current chip technology relies on the charges of electrons in circuitry, explained Mike Ross, a spokesperson for IBM's Almaden Research Center. Spintronics uses the quantum "spin" property of electrons to create magnetism, just as an electron's negative charge property creates electricity.

MRAM In the Works: By designing and making stacks of different materials -- some with layers only two to three atoms thick -- researchers can create devices that have novel properties. The spintronic GMR head, for example, has boosted the disk-drive industry, Ross told NewsFactor.

"This sensitive magnetic sensor, introduced by IBM, has resulted in a 40-fold increase in data storage in the past seven years," he said.


Page 12: 12. Recovery

Appendix continued

Magnetic RAM (MRAM) is the next spintronic device in the works.

It has the potential to be a non-volatile memory that runs circles around the non-volatile Flash memory typically used in cell phones, memory cards and other products. Current fast memory (SRAM, SDRAM, etc.) technology is volatile, meaning that its contents are lost when power is removed, so devices must be booted up to reload data.

"We want to learn more about using this technology in the sensor realm, and we see big benefits to logic and other types of electronics circuits," said Ross.

The IBM-Stanford Spintronic Science and Applications Center (SpinAps) will involve about a half-dozen Stanford professors and a similar number of IBM scientists. Research projects are funded by the two partners and agencies, including the Defense Advanced Research Projects Agency, the U.S. Department of Energy, and the National Science Foundation.

RAM Revolution: Spintronics "has quickly revolutionized magnetic recording technology and is going to revolutionize random access memory (RAM)," University of Utah physics researcher Jing Shi told NewsFactor.

Compared with electronic computers, computers with spintronic memory should be able to store more data, process it faster, and consume less power.

Spintronics also may yield "instant-on" computers.

Aligned spins stay aligned until a magnetic field changes them -- even if a computer is shut off. Consequently, spin-based instant-on computers do not require booting to move data from the hard drive to the memory.

The data never left.


Page 13: 12. Recovery

Thank you.
