approaches to preserving digitized taxonomic data
DESCRIPTION
Sherborn Symposium. Natural History Museum, London. 28 October 2011.
TRANSCRIPT
![Page 1: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/1.jpg)
Approaches to preserving digitized taxonomic data:
Prints, manuscripts & specimens
Chris Freeland
Director, Center for Biodiversity Informatics
Technical Director, Biodiversity Heritage Library
28 October 2011
@chrisfreeland
![Page 2: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/2.jpg)
Prints / Manuscripts / Specimens
Different objects, similar management
http://www.flickr.com/photos/biodivlibrary/6257859557
http://www.flickr.com/photos/chrisfreeland/6018724034
http://www.biodiversitylibrary.org/page/34045915
![Page 3: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/3.jpg)
Overview of Talk
• Why worry about digital preservation?
• Considerations for preservation
  – Collaboration
  – File formats
  – Metadata standards
• Views to the future
Preservation Panic!
![Page 4: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/4.jpg)
WHY WORRY?
http://www.flickr.com/photos/biodivlibrary/6008902662
![Page 5: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/5.jpg)
Do it once, do it right
Costs more to get object to scanner than to scan
![Page 6: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/6.jpg)
Cautionary Tales
• Conversion / Compost / Corruption
• Longevity of digital objects
• File changes
• Media obsolescence
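Silent corruption and unintended file changes are usually caught with periodic fixity checks. The following is a minimal sketch, not something shown in the talk: it assumes SHA-256 as the checksum and a simple in-memory manifest, and the function names `sha256_of`, `build_manifest`, and `audit` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map every file under root (by relative path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def audit(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files under root with a previously recorded manifest."""
    current = build_manifest(root)
    return {
        # Recorded earlier, no longer present.
        "missing": sorted(set(manifest) - set(current)),
        # Present in both, but the bytes no longer match.
        "changed": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        # Present now, never recorded.
        "unexpected": sorted(set(current) - set(manifest)),
    }
```

A manifest recorded at ingest and re-audited on a schedule would flag the "corruption" and "file changes" cases above. It cannot repair anything, so it only helps if intact copies exist elsewhere.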
![Page 7: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/7.jpg)
CONSIDERATION: COLLABORATION
![Page 8: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/8.jpg)
LOCKSS
Lots Of Copies Keeps Stuff Safe
• LOCKSS is both a software platform & a concept
  – Software: http://www.lockss.org
![Page 9: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/9.jpg)
Rule of 3
Museum X / Library Y / Archive Z
1. Geographic Locations
2. Administrations
3. Technology Platforms
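One reading of the Rule of 3 is that the copies of an object should differ along all three axes: location, administration, and technology platform. The sketch below checks a set of replicas against that reading. The `Replica` class, the `rule_of_three_gaps` function, and the threshold of three distinct values per axis are assumptions made for illustration, not part of LOCKSS or of the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One stored copy of a digital object (hypothetical record shape)."""
    location: str
    administration: str
    platform: str


def rule_of_three_gaps(replicas: list[Replica], minimum: int = 3) -> list[str]:
    """Return one message per axis that has fewer than `minimum` distinct values."""
    gaps = []
    for axis in ("location", "administration", "platform"):
        distinct = {getattr(replica, axis) for replica in replicas}
        if len(distinct) < minimum:
            gaps.append(f"{axis}: {len(distinct)} distinct, need {minimum}")
    return gaps


# Illustrative values only; the institutions echo the slide's placeholders.
copies = [
    Replica("London", "Museum X", "disk array"),
    Replica("St. Louis", "Library Y", "tape"),
    Replica("Canberra", "Archive Z", "disk array"),
]
print(rule_of_three_gaps(copies))
```

Here the three copies sit in different places under different administrations, but two share a platform, so the check reports a gap on that axis only.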
![Page 10: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/10.jpg)
CONSIDERATION: FILE FORMATS
![Page 11: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/11.jpg)
JPEG2000
• Wavelet compression, with a lossless encoding option
• 12 Parts
• Of particular interest to documents & specimens:
  – Part 1: Core Coding System, ISO/IEC 15444-1
  – Part 6: Compound image file format
  – Part 10: JP3D, Volumetric images
http://www.jpeg.org/jpeg2000/
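When files are ingested for preservation, it helps to confirm what a file is from its bytes, not from its extension. The sketch below identifies JPEG 2000 content from the first bytes of a file: the 12-byte signature box that opens JP2-family files, the brand in the File Type box that follows it, and the SOC/SIZ markers that open a bare codestream. The function name `sniff_jpeg2000` and its return strings are hypothetical.

```python
from typing import Optional

# Signature box at the start of every JP2-family file.
JP2_SIGNATURE = bytes.fromhex("0000000c6a5020200d0a870a")
# SOC marker followed by SIZ marker: start of a raw JPEG 2000 codestream.
J2K_CODESTREAM = bytes.fromhex("ff4fff51")


def sniff_jpeg2000(header: bytes) -> Optional[str]:
    """Classify the leading bytes of a file; pass at least the first 24 bytes."""
    if header.startswith(JP2_SIGNATURE):
        # The File Type box follows: 4-byte length, b"ftyp", 4-byte brand
        # (for example b"jp2 " for Part 1 files).
        if header[16:20] == b"ftyp" and len(header) >= 24:
            brand = header[20:24].decode("ascii", "replace").strip()
            return f"jp2-family:{brand}"
        return "jp2-family:unknown"
    if header.startswith(J2K_CODESTREAM):
        return "raw-codestream"
    return None
```

This only identifies the container. It does not decode the image or show that the codestream is intact, so it complements a decoder-based validation step and does not replace one.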
![Page 12: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/12.jpg)
http://www.tropicos.org/ImageFullView.aspx?imageid=62182
![Page 13: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/13.jpg)
JPEG2000 (Hurrahs & Hisses)
• Advantages
  – Store a single file for access & preservation
  – Standards-based
  – Saves drive space (important at museum scale)
• Disadvantages
  – Doesn’t have wide native support in many apps
  – Requires an intermediary app to decode & serve
    • But, there’s an open source option: djatoka http://djatoka.sourceforge.net
  – Reports of data loss
![Page 14: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/14.jpg)
PDF/A
• ISO-standardized version of PDF suitable for long-term preservation
• Identifies a "profile" for electronic documents that ensures the documents can be reproduced exactly the same way in years to come.*
• Makes the file self-contained (and therefore larger)
  – Embeds fonts
  – Graphics
* http://en.wikipedia.org/wiki/PDF/A
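A PDF/A file declares its conformance level in its XMP metadata through the `pdfaid:part` and `pdfaid:conformance` properties. The sketch below reads that declaration by scanning the raw bytes. It assumes the XMP packet is stored unfiltered and is searchable as plain text, and the function name `claimed_pdfa_level` is hypothetical. It reports only what the file claims; checking that the claim is true needs a real PDF/A validator.

```python
import re
from typing import Optional

# Match both XMP serializations: element form (<pdfaid:part>1</pdfaid:part>)
# and attribute form (pdfaid:part="1"). \x22 and \x27 are the two quote marks.
_PART = re.compile(rb"pdfaid:part(?:>|=[\x22\x27])\s*(\d)")
_CONFORMANCE = re.compile(rb"pdfaid:conformance(?:>|=[\x22\x27])\s*([A-Za-z])")


def claimed_pdfa_level(data: bytes) -> Optional[str]:
    """Return the declared level, such as 'PDF/A-1b', or None if none is declared."""
    part = _PART.search(data)
    if part is None:
        return None
    conformance = _CONFORMANCE.search(data)
    level = conformance.group(1).decode("ascii").lower() if conformance else ""
    return f"PDF/A-{part.group(1).decode('ascii')}{level}"
```

In use, the bytes would come from something like `Path("scan.pdf").read_bytes()`, where the file name is a placeholder.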
![Page 15: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/15.jpg)
CONSIDERATION: METADATA
![Page 16: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/16.jpg)
The Great Thing About STANDARDS
Is That There Are SO MANY
To Choose From
![Page 17: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/17.jpg)
Metadata Preservation
• Descriptive information (metadata) provides content & context for indexing, reuse
• Can bundle metadata within files
  – EXIF: images, common in digital cameras
  – Adobe XMP: docs, images
• Should commit metadata to file system
  – Should not manage just in DB or other management system
[Diagram labels: Filesystem, <DwC> XML, JP2]
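Committing metadata to the file system can be as simple as writing an XML sidecar next to each image, so the description survives even if the database does not. The sketch below writes Darwin Core terms beside a JP2 file. The Darwin Core terms namespace is real, but the `record` wrapper element, the `write_sidecar` function, and the "same name, `.xml` extension" convention are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace for Darwin Core terms such as scientificName and catalogNumber.
DWC_NS = "http://rs.tdwg.org/dwc/terms/"


def write_sidecar(image_path: Path, terms: dict[str, str]) -> Path:
    """Write Darwin Core terms to an XML file beside the image and return its path."""
    ET.register_namespace("dwc", DWC_NS)
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in terms.items():
        ET.SubElement(record, f"{{{DWC_NS}}}{name}").text = value
    sidecar = image_path.with_suffix(".xml")
    ET.ElementTree(record).write(sidecar, encoding="utf-8", xml_declaration=True)
    return sidecar
```

Calling `write_sidecar(Path("sheet_0001.jp2"), {"scientificName": "Quercus alba"})` would produce `sheet_0001.xml` in the same directory; the file name and the values are illustrative. Because image and metadata share a directory, both are copied whenever the directory is replicated.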
![Page 18: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/18.jpg)
THE FUTURE
![Page 19: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/19.jpg)
Electronic Publications
• Happening now, has been for years
• Should take same care in ensuring heterogeneity & diversity in digital management systems as with printed, bound books
  – Monolithic libraries have failed over time
  – Monolithic electronic archives will, too
![Page 20: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/20.jpg)
http://www.biodiversitylibrary.org/page/22681143
Need a meadow…
![Page 21: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/21.jpg)
…not a monoculture.
![Page 22: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/22.jpg)
There is no silver bullet
• Make best decision today
• Stay up with technology changes & best practices
  – <insert library & archive professionals here>
• Evaluate, experiment, document, lead
• Move to stable new technologies when necessary
![Page 23: Approaches to preserving digitized taxonomic data](https://reader033.vdocuments.us/reader033/viewer/2022061203/546cfa6bb4af9f932c8b51a4/html5/thumbnails/23.jpg)
Questions?
Chris Freeland
Director, Center for Biodiversity Informatics
Technical Director, Biodiversity Heritage Library
28 October 2011
Email: [email protected]
Twitter: @chrisfreeland