Keeping Digital Data for Ever
OR
Digital Immortality
Dr David Holdsworth
http://www.leeds.ac.uk/cedars/
Obsolete(?) Data
• 1 Things that must be kept by law
• 2 Things that must be destroyed by law
• 3 Things that we choose to keep
• 4 Things that we are certain can be thrown away
Obsolete(?) Data
• 5 Things that we would like to keep if we have room
• 6 Things that we would like to throw away, but are not sure about
• 7 Things that we think we have kept but cannot find
• 8 Things that we have kept but now cannot decipher
• 9 Things that we have not kept but now wish that we had
What to Keep
• All of 1 and 3
– 1 Things that must be kept by law
– 3 Things that we choose to keep
• As much of 5 and 6 as is cost-effective
– 5 Things that we would like to keep if we have room
– 6 Things that we would like to throw away, but are not sure about
• Data discarded from 5 and 6 has the potential to be in 9 in the future
– 9 Things that we have not kept but now wish that we had
• Minimise cost per item
Some Pitfalls
• Errors are usually not correctable
• Failure to index adequately puts data into category 7
– 7 Things that we think we have kept but cannot find
• Failure to know the format puts data into category 8
– 8 Things that we have kept but now cannot decipher
Personal Involvement
CEDARS
• CURL Exemplars in Digital ARchiveS
• Collaborative project for libraries
• Funded by HEFCE/JISC
• Oxford, Cambridge and Leeds
Personal Involvement - contd.
CAMiLEON
• Creative Archiving at Michigan and Leeds: Emulating the Old on the New
• Collaborative project on emulation
• Funded by NSF/JISC
Challenges to digital preservation
• Deteriorating media
– Magnetic dropout
– Obsolete equipment
• Obsolete data formats
– EBCDIC
– UNICODE has established itself
– Machine code software is an extreme example
Philips LaserVision
Challenges to digital preservation - contd.
• Needles in haystacks
– ISBN
– Meta-data
• Deteriorating Institutions
– Where are the digital legal deposits?
– .. Or even Digital Equipment Corporation
• Proprietary systems become obsolete
– leaving data inaccessible
Compatibility - Friend or Foe
• e.g. z/OS evolves from OS/360
• Windows Vista evolves from 16-bit Windows 3.1
• Modern machines run old software
– … but faster
• Who keeps old versions?
– Computer Museum in California
– Microsoft -- ?
Times Change
• People don’t always want to process their old data using the tools of yesteryear
THIS IS GEORGE 3 MARK 8.67 ON 31DEC99
10.19.03_
TIMED OUT 10.19.33
THE SYSTEM HAS TEMPORARILY CLOSED DOWN
Times Change
• People don’t always want to process their old data using the tools of yesteryear
• Need to bridge the gap between data’s origins and the time of access
Use the Past to Illuminate the Future
• In 1987 EBCDIC was king
• In 2007 UNICODE is heir apparent
• In 2027 …….
• In 2038 UNIX time_t overflows 31 bits
• What has survived the decades?
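The 2038 date can be checked with a little arithmetic. The sketch below assumes a signed 32-bit time_t counting seconds from 1 January 1970 UTC, which leaves 31 bits for the value; the function name is illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: time_t is a signed 32-bit count of seconds since the Unix epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
MAX_TIME_T = 2**31 - 1  # largest value that fits in 31 bits


def last_representable_moment() -> datetime:
    """Return the last instant that such a time_t can represent."""
    return EPOCH + timedelta(seconds=MAX_TIME_T)


if __name__ == "__main__":
    print(last_representable_moment().isoformat())  # 2038-01-19T03:14:07+00:00
```

One second later the counter no longer fits in 31 bits.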
Survival of the Abstract
• Character sets
• Bytes
• Unstructured Files (stream of bytes)
• Hierarchical file tree
• Associative mappings
• Programming languages
All is not lost
• We can keep a byte-stream for ever
– The abstract data separated from the medium is technology-neutral
• i.e. files can be kept for ever
• Copies are perfect
• File formats do not last for ever
• ….. Remember WORDSTAR
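"Copies are perfect" is something that can be checked rather than assumed. A minimal sketch, assuming SHA-256 digests as the comparison method (the talk does not name one) and using hypothetical helper names:

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def digest(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def copy_and_verify(src: Path, dst: Path) -> bool:
    """Copy src to dst and confirm that the two byte-streams match."""
    shutil.copyfile(src, dst)
    return digest(src) == digest(dst)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "original.bin"
        original.write_bytes(bytes(range(256)) * 4)
        print(copy_and_verify(original, Path(tmp) / "copy.bin"))
```

The check says nothing about whether the format can still be read; it only confirms that the bytes survived the move to a new medium.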
Non-File Objects
• e.g. CDs, DVDs, magnetic tapes, web sites
• Map each digital object into a byte-stream and then preserve
• Multiple files (e.g. websites) can go in a ZIP or tar archive
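As an illustration of mapping a multi-file object to a single byte-stream, the sketch below packs a small set of named files into an uncompressed tar archive held in memory and recovers them again. The file names and helper functions are hypothetical.

```python
import io
import tarfile


def pack(files: dict[str, bytes]) -> bytes:
    """Pack named files into one uncompressed tar byte-stream."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unpack(stream: bytes) -> dict[str, bytes]:
    """Recover the named files from a tar byte-stream."""
    out = {}
    with tarfile.open(fileobj=io.BytesIO(stream), mode="r") as tar:
        for member in tar.getmembers():
            if member.isfile():
                out[member.name] = tar.extractfile(member).read()
    return out


if __name__ == "__main__":
    site = {"index.html": b"<h1>Home</h1>", "img/logo.gif": b"GIF89a"}
    print(unpack(pack(site)) == site)
```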
Abstraction
• Identify significant properties of the object
• represent them in a byte stream
Example -- magnetic tape
• Significant properties
– blocks of data
– tape marks
– start and end of tape
• Representation
– block -- raw bytes, preceded by 32-bit byte count
– tape mark -- 4 bytes all ones
– start & end -- ends of stream
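The representation above can be sketched in a few lines. The slide does not state the byte order of the 32-bit count, so big-endian is an assumption here, as are the function names and the in-memory sentinel used for a tape mark. A block whose length equalled 0xFFFFFFFF would be indistinguishable from a tape mark, so the sketch rejects it.

```python
import struct

TAPE_MARK = object()               # in-memory sentinel standing for a tape mark
MARK_BYTES = b"\xff\xff\xff\xff"   # 4 bytes all ones


def encode_tape(items) -> bytes:
    """Turn a sequence of blocks (bytes) and TAPE_MARK into one byte-stream."""
    out = bytearray()
    for item in items:
        if item is TAPE_MARK:
            out += MARK_BYTES
        else:
            if len(item) >= 0xFFFFFFFF:
                raise ValueError("block length would collide with the tape mark")
            # Assumption: the 32-bit byte count is stored big-endian.
            out += struct.pack(">I", len(item)) + item
    return bytes(out)


def decode_tape(stream: bytes) -> list:
    """Recover the blocks and tape marks; the stream ends are the tape ends."""
    items, pos = [], 0
    while pos < len(stream):
        header = stream[pos:pos + 4]
        if len(header) < 4:
            raise ValueError("truncated header")
        pos += 4
        if header == MARK_BYTES:
            items.append(TAPE_MARK)
            continue
        (count,) = struct.unpack(">I", header)
        block = stream[pos:pos + count]
        if len(block) != count:
            raise ValueError("truncated block")
        items.append(block)
        pos += count
    return items


if __name__ == "__main__":
    tape = [b"HDR1", TAPE_MARK, b"payload", TAPE_MARK, TAPE_MARK]
    print(decode_tape(encode_tape(tape)) == tape)
```

Nothing in the stream refers to reels, densities or drives, which is what makes it technology-neutral.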
When to convert
• Conversion is inevitable
• a) as soon as the format becomes obsolete
• b) only when we want to read the data
• c) never - emulate the original system
Convert as soon as Obsolete
• Copying to new technology is no longer trivial
• Any errors are cast in stone
• Digital signatures are lost
• Only viable when the number of different formats is small
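The point about signatures follows from the fact that a signature or checksum is computed over exact bytes. In the sketch below a SHA-256 digest stands in for a signature, and code page 037 stands in for EBCDIC; both are assumptions made for illustration.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Stand-in for a digital signature: a digest over the exact bytes."""
    return hashlib.sha256(data).hexdigest()


# Assumption: the original was EBCDIC text in code page 037.
original = "PAYROLL 1987".encode("cp037")
# Converting to a current encoding keeps the text but changes every byte.
converted = original.decode("cp037").encode("utf-8")

if __name__ == "__main__":
    print(original.decode("cp037") == converted.decode("utf-8"))
    print(fingerprint(original) == fingerprint(converted))
```

The first comparison shows that the text is unchanged; the second shows that anything computed over the original bytes no longer matches the converted copy.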
Convert when we want to read
• Preserve the original by simply copying onto current technology
• Record the format of each stored object
• Keep an index of all the formats held
• Maintain access to conversion software from the old to the current
• Treasure open-source conversion software
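A minimal sketch of the convert-on-read approach: the original bytes are stored untouched with a recorded format, and an index maps each format held to conversion software. The format identifiers and names below are hypothetical labels, not entries from any real registry.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class StoredObject:
    data: bytes       # the original byte-stream, never modified
    format_id: str    # format recorded when the object was stored


# Index of the formats held, each mapped to a converter into current text.
# Hypothetical identifiers; code page 037 stands in for EBCDIC.
CONVERTERS: Dict[str, Callable[[bytes], str]] = {
    "text/ebcdic-cp037": lambda b: b.decode("cp037"),
    "text/ascii": lambda b: b.decode("ascii"),
}


def read(obj: StoredObject) -> str:
    """Convert at access time, leaving the stored bytes as they were."""
    try:
        convert = CONVERTERS[obj.format_id]
    except KeyError:
        raise LookupError(f"no converter held for format {obj.format_id!r}") from None
    return convert(obj.data)


if __name__ == "__main__":
    census = StoredObject("CENSUS 1977".encode("cp037"), "text/ebcdic-cp037")
    print(read(census))
```

An object whose format is missing from the index fails loudly, which is the earlier pitfall of having kept data that can no longer be deciphered.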
Format Registries
• National Archives PRONOM
• Harvard Global Digital Format Registry
• OAIS ISO 14721:2003 Representation Information
Emulation of Yesteryear
• Today’s desktop machine far exceeds the mainframe of the 1970s or even 80s
• George3
– Emulate the George3 executive
– i.e. order code + system calls + peripherals
• BBC micro
– Publicly available emulation on WWW
Abstraction for Emulation of 1900 system
• George3 sits on 1900 instruction set plus executive calls
• Executive sits on 1900 instruction set plus Fancy I/O stuff
• George3 provides lots of embellishment of 1900 instruction set
• Emulate executive + 1900 instruction set
George3 demo
Malawi Census Data
• Data stored on ICL magnetic tapes
• Rescued by using emulated ICL 1900
Standards
• Open Archival Information System
– OAIS ISO 14721:2003
– Originated by Space Data Community
• Proprietary “standards”
– Big enough to be reverse engineered, e.g. MS Word
– XYZ Software Ltd
• Open standards, e.g. RFCs
Really Long-Term
• Look back 20 years to see how things have changed
• Today’s Vista is not the final scene
• Ensure that systems can accommodate new formats
• Even the standards are likely to change
Domesday 1986
• 900th anniversary of William the Conqueror’s version
• BBC collects data (inc pictures)
• Data written on 12" LaserVision discs
• Discs last 100 years, but not the drives
• Access is via BBC Master computer
• That won’t last 100 years either
• Can we preserve it until the 1000th anniversary?
Stewardship
• Copies of the discs are lodged with:
– BBC
– British Library
– National Archives (ex PRO)
• Abstract data held by:
– DH / Leeds University
– Longlife Data Ltd
Stewardship
• Current archival activity stresses retention of media
• Retention of digital media is useless
• Need digital safe deposits
Keeping Digital Data for Ever
OR
Digital Immortality
Dr David Holdsworth
http://www.leeds.ac.uk/cedars/