Raymond A. Clarke Enterprise Storage Specialist – Oracle Corporation Bit Rot: Myth or Way of Life? Digital Preservation Seminar University of Alberta Edmonton, Alberta March 5, 2010


Raymond A. Clarke
Enterprise Storage Specialist – Oracle Corporation

Bit Rot:

Myth or Way of Life?

Digital Preservation Seminar
University of Alberta
Edmonton, Alberta

March 5, 2010

Agenda

• A Little Bit about Digital Archiving
• Bit Rot – What is it?
• Bit Error Rate
• Validation and Experimentation
• What’s Being Done About This
• Q & A

Why is Archive So Important?... because The History of Data Growth is Exponential!

24 Words - Pythagorean Theorem

67 Words - Archimedes' Principle

179 Words - 10 Commandments

286 Words - Lincoln's Gettysburg Address

1300 Words - US Declaration of Independence

26911 Words - EU REGULATION ON THE SALE OF CABBAGES

Building a Cost Efficient Archive, Sensitive to Access where any Asset can be Requested at Any Time?

Long-term (to forever) storage & retrieval of digital assets

Source: Horizon Information Strategies

Tier 1 Tier 2 Tier 3

Fundamental Problem to be Resolved

● Perfect preservation not at any price
– Threats too prevalent, diverse, poorly understood
– Real systems are inevitably imperfect

● How imperfect is adequate?
– How much will it cost?

● How adequate is what we can afford now?
– Won't know unless we can measure performance

● Kaizen: Improve cost performance through time
– Need preservation benchmarks to drive market

What is Bit Rot?

Bit rot shows itself when a file or object has sat undisturbed and unaccessed for months, maybe even years. Finally a day comes when we go to the hard drive, or pull out the prized DVD or memory stick, and you can’t access the data!

What Causes Bit Rot?

• Lots of things:
• System (NICs, HBAs, etc.)/software interface changes in active code that is called from the dormant code
• Online aging databases may also suffer from data loss due to update errors, media failure, incomplete backup and restore operations, user error, changes in the database structure, and other related maintenance issues
• Cosmic rays and sun spots?
• Data transmission noise
• File system errors
• Insider attacks
• Natural disasters

BER & Storage Infrastructure Components

● NIC/Link/HBA: 10^-10 (1 bit in ~1.1 GB)
– Check-summed, retransmit if necessary

● Memory: 10^-12 (1 bit in ~116 GB)
– ECC

● Desktop Disk: 10^-14 (1 bit in ~11.3 TB)
– Various error correction codes

● Enterprise Disk: 10^-15 (1 bit in ~113 TB)
– Various error correction codes

● Tape: 10^-18 (1 bit in ~1.11 PB)
– Various error correction codes

● Quotes from standards/specifications

Note: Data may be encoded five or more times as it travels from memory to the physical disk!
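The "1 bit in ~N" figures above follow from simple arithmetic: a bit error rate of 10^-n means one expected bad bit per 10^n bits, or 10^n / 8 bytes. The sketch below redoes that conversion, assuming binary units (GiB, TiB, PiB), which is what the slide's rounded values appear to use. The function names are illustrative only. Note that for tape the arithmetic gives about 111 PiB at 10^-18, while the ~1.11 PB printed on the slide would correspond to roughly 10^-16; the slide does not make clear which figure is intended.

```python
def bytes_per_bit_error(exponent: int) -> float:
    """Expected bytes transferred per single bit error at BER = 10^-exponent."""
    bits = 10 ** exponent
    return bits / 8


def human(n_bytes: float) -> str:
    """Format a byte count using binary units (an assumption about the slide)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    i = 0
    while n_bytes >= 1024 and i < len(units) - 1:
        n_bytes /= 1024
        i += 1
    return f"{n_bytes:.3g} {units[i]}"


# BER exponents as quoted on the slide
RATES = {
    "NIC/Link/HBA": 10,
    "Memory": 12,
    "Desktop disk": 14,
    "Enterprise disk": 15,
    "Tape": 18,
}

for name, exponent in RATES.items():
    print(f"{name}: 10^-{exponent} -> 1 bit in ~{human(bytes_per_bit_error(exponent))}")
```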

BER & Types of Corruption – cont.

● Type 1 Corruption

– Usually persistent
– Bit(s) have flipped in a byte
– Single Bit Error (SBE)
– Double Bit Error (DBE)
– DBEs are 3x more common than SBEs
– 1→0 transition more frequent than 0→1
– Strong correlation with bad memory
– Happens with expensive ECC memory too

Source: Silent Corruptions, Peter Kelemen, CERN, After C5, June 1st, 2007

BER & Types of Corruption – cont.

● Type 2 Corruption

– Usually transient
– Small chunks of “random” looking data
– ...but can go up to 128K
– Sometimes identifiable user data


BER & Types of Corruption – cont.

● Type 3 Corruption

– Usually persistent, comes in bursts
– Strong correlation: I/O command timeouts
– Observed on plain SATA systems
– ...sometimes with failed READ commands!
– Appears to match RAID stripe size (64K)
– Observed on 16K chunk RAID arrays as well


BER & Types of Corruption – cont.

● Type 4 Corruption

– Usually persistent
– Still pretty much unexplained
– ...not sure yet this warrants another category


Experimentation - Studies

Study looked at the incidence of silent storage corruption in individual disks in RAID arrays.

• Data was collected over 41 months covering over 1.5 × 10^6 drives
• Over 4 × 10^5 incidents of silent corruption found
• More than 7.5% were not detected until RAID restoration and could thus have caused data loss and/or system downtime

NetApp Study (Bairavasundaram, et al. 2008)

Experimentation - Studies

• CERN study wrote large files into various state-of-the-art enterprise storage systems (mostly RAID arrays), and checked them over a period of 6 months
• A total of about 9.7 × 10^16 bytes were written and about 1.92 × 10^8 bytes were found to have suffered silent corruption
• 2/3 of which were persistent; re-reading did not return good data

CERN (Kelemen 2007)

Experimentation -

Conclusion - About 1.2 × 10^-9 of the data written to CERN’s storage was permanently corrupted within six months. Thus, to reach the petabyte-for-a-century requirement, we would need to improve the performance of current enterprise storage systems by a factor of at least 10^9.

CERN (Kelemen 2007)
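As a rough check on the order of magnitude, the permanently corrupted fraction can be recomputed from the byte counts quoted for the CERN study. The sketch below uses only those rounded figures, so it lands near 1.3 × 10^-9 rather than exactly on the quoted 1.2 × 10^-9; the variable and function names are illustrative, and the factor of 10^9 improvement is not derived here because it depends on the survival target assumed for the petabyte-for-a-century requirement.

```python
BYTES_WRITTEN = 9.7e16        # total bytes written, as quoted
BYTES_CORRUPTED = 1.92e8      # bytes found silently corrupted, as quoted
PERSISTENT_SHARE = 2 / 3      # share of corruption that re-reading did not fix


def corrupted_fraction(corrupted: float, written: float) -> float:
    """Fraction of written bytes that suffered silent corruption."""
    return corrupted / written


def permanent_fraction(corrupted: float, written: float, persistent: float) -> float:
    """Fraction of written bytes that stayed corrupted on re-read."""
    return corrupted_fraction(corrupted, written) * persistent


silent = corrupted_fraction(BYTES_CORRUPTED, BYTES_WRITTEN)
permanent = permanent_fraction(BYTES_CORRUPTED, BYTES_WRITTEN, PERSISTENT_SHARE)

print(f"silently corrupted fraction:    {silent:.2e}")
print(f"permanently corrupted fraction: {permanent:.2e}")
```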

What Are We Doing About This?

● Self-examining/healing hardware (SNIA)
● WRITE-READ cycles before ACK
● Check-summing? → not necessarily enough
● End-to-End Check-summing (OpenSolaris/ZFS)
● Store multiple copies (LOCKSS)
● Regular scrubbing of RAID arrays
● “Data refresh” re-read cycles on tapes
● ...generally accept and prepare for bit rot and data corruption

First off, accept that Bit Rot is a Way of Life. Secondly, prepare for it and implement technologies to mitigate it.
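Several of the measures listed above (check-summing, multiple copies, regular scrubbing) work together. The sketch below is a minimal illustration of that combination, not how LOCKSS or any RAID scrubber is actually implemented: it hashes every copy with SHA-256, picks the reference digest (a stored one if available, otherwise a strict majority vote), and rewrites any copy that disagrees. The `scrub` function and its parameters are hypothetical names chosen for this example.

```python
import hashlib
from collections import Counter
from typing import List, Optional, Tuple


def digest(data: bytes) -> str:
    """SHA-256 digest of a copy, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def scrub(copies: List[bytes], expected: Optional[str] = None) -> Tuple[List[bytes], List[int]]:
    """Verify every copy and repair damaged ones from an intact copy.

    Returns the repaired copies and the indices that had to be repaired.
    """
    digests = [digest(c) for c in copies]
    if expected is None:
        # No stored checksum: fall back to a strict majority vote.
        expected, votes = Counter(digests).most_common(1)[0]
        if votes <= len(copies) // 2:
            raise ValueError("no majority among copies; cannot pick a reference")
    good = next((c for c, d in zip(copies, digests) if d == expected), None)
    if good is None:
        raise ValueError("no intact copy left to repair from")
    bad = [i for i, d in enumerate(digests) if d != expected]
    repaired = [good if i in bad else c for i, c in enumerate(copies)]
    return repaired, bad


# Usage: three copies, one of which has suffered a single flipped bit.
original = b"digital asset kept for a century"
rotted = bytearray(original)
rotted[5] ^= 0x01
repaired, bad = scrub([original, bytes(rotted), original])
print("repaired copies at indices:", bad)
```

A scrub like this only helps if it runs regularly: the longer the interval, the greater the chance that enough copies decay between passes to lose the majority.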

Designed to study long-term digital information archive and retention requirements in the data center

Goal: Use these requirements to frame the definition of best practices and technology solutions

100 Year Archive Requirements Study


www.snia-dmf.org/100year

Key Findings

• Over 80% report a need to retain information over 50 years, and 68% report a need of over 100 years
• Long-term generally means longer than 10 to 15 years
• Over 40% of respondents are keeping email records over 10 years
• Database information was considered most at risk of loss
• 70% of respondents say they are ‘highly dissatisfied’ with their ability to read their retained information in 50 years
• Current practices are too manual, too prone to error and too costly
• Collaboration is recognized as necessary in order to define information retention requirements

“Remember that IT doesn't own the information. RIM, Legal, Business units and IT all have a part to play in the decisions applied to business records and should be sitting down at the table together.” (Source: Respondent)


Long-Term Retention Projects

Logical Migration
Launch a TWG to define “SIRF” (Self-contained Information Retention Format), a self-describing, self-contained data format standard

Conduct Market Education
Speaking, papers, web

Interact with international community working on retention and archive


Retention Reference Model

Requirements (done)

Glossary (done)

Best Practices for Storage (on-going)

Define Reference Architecture covering Migration, Security, etc.

Meta-data provided via XAM

The world's first 128-bit file system

With check-summing and copy-on-write transactions

A pooled storage model – no volume manager

Snapshots, Clones, Replication, Compression

End-to-End Data Integrity

Integrated Data Services

Software Developer

Easier Administration

Immense Data Capacity

ZFS/OpenSolaris: A new way to manage/protect data

Copy-on-Write: Never Overwrite Existing Data

Original Data

New Data

uber-block

System DRAM

Transactional: Always Consistent On Disk

Transaction Group

Buffered Writes

Flush to Disk Every 5 Seconds

Synchronous Writes

Go direct to the on-disk ZFS Intent Log (ZIL)

Checksums are separated from the data

How Do We Know What We Just Read Was What We Wrote? End-to-End Checksums

Entire I/O path is self-validating (uber-block)

Prevents:

> Silent data corruption

> Panics from corrupted metadata

> Bit Rot

> Phantom writes

> Misdirected reads and writes

> DMA parity errors

> Errors from driver bugs

> Accidental overwrites
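The idea of keeping checksums separate from the data can be sketched with a toy in-memory store. This is an illustration under simplifying assumptions, not ZFS code: `ToyStore` and `BlockPointer` are hypothetical names, the "disk" is a dictionary, and SHA-256 stands in for whichever checksum is configured. The point it shows is that the checksum travels with the pointer to a block, so a block that was silently altered, never written, or read from the wrong address fails verification on read. Writes always allocate a new address, in the spirit of copy-on-write.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class BlockPointer:
    address: int
    checksum: bytes  # checksum of the block it points to, held apart from the data


class ToyStore:
    """Toy block store: the dictionary stands in for a physical disk."""

    def __init__(self) -> None:
        self.disk: Dict[int, bytes] = {}
        self._next_address = 0

    def write(self, data: bytes) -> BlockPointer:
        # Copy-on-write flavour: never reuse an address, never overwrite a block.
        address = self._next_address
        self._next_address += 1
        self.disk[address] = data
        return BlockPointer(address, hashlib.sha256(data).digest())

    def read(self, ptr: BlockPointer) -> bytes:
        data = self.disk[ptr.address]
        if hashlib.sha256(data).digest() != ptr.checksum:
            raise IOError(f"checksum mismatch at block {ptr.address}")
        return data


# Usage: a good read, then a simulated silent corruption on "disk".
store = ToyStore()
ptr = store.write(b"archived record")
print(store.read(ptr))

store.disk[ptr.address] = b"archived recorc"  # bit rot behind the store's back
try:
    store.read(ptr)
except IOError as err:
    print("detected:", err)
```

A checksum stored inside the same block it protects cannot catch a phantom write or a misdirected read, because the stale or wrong block is still internally consistent; holding the checksum with the pointer closes that gap.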

Parting Thoughts…

● Bit Rot is a fact of life

● Early detection is the first step towards a solution

● Total elimination may be impossible

● Existing files/objects are at the mercy of Mr. Murphy

● Correction/Mitigation will cost time and money, and require planning

● Efforts have begun and need to be intensified

As always, help is needed!

Credits

• Peter Kelemen, 2007. Silent Corruptions. In 8th Annual Workshop on Linux Clusters for Super Computing.

• SNIA, Data Management Forum, 2007. 100 Year Archive Requirements Survey. http://www.snia.org/forums/dmf/knowledge/100YrATF_Archive-Requirements-Survey_20070619.pdf

• David Rosenthal, 2008. Bit Preservation: A Solved Problem?

• CERN, 2008. Worldwide LHC Computing Grid. http://lcg.web.cern.ch/LCG/

• Bernd Panzer-Steindel, CERN/IT, 2007. Data Integrity.

Thank You for Your Time and [email protected]

(212) 558-9321