Digital Collections: Storage and Access
Jon Dunn
Assistant Director for Technology
IU Digital Library Program
jwd@indiana.edu
October 2, 2003 ALI Digital Library Workshop

Storage
- Why is storage an issue?
  - Space requirements
  - Persistence
  - Accessibility
- Needs depend on purpose of storage
  - Capture/encoding
  - Access/delivery
  - Preservation

Storage: Working Space
- Space for storage of digital files during the capture/encoding/quality control process
- Possibilities
  - PC hard drive
  - File server / LAN
- Issues
  - Capacity, backup, speed, accessibility

Storage: Access/Delivery
- Storage of derivative files for web delivery
  - Image, audio, video, text files, etc.
- Possibilities
  - Local web server
  - Commercially-hosted web site
  - Consortial service provider
- Issues: capacity, backup, performance, software integration, maintenance/migration

Storage: Preservation
- Much harder problem
- Longer term
  - Issues of longevity of media, hardware, file format
  - “Where did we put the files?”
- Larger files
  - Hard disk storage, traditional backup methods not cost-effective
- Infrequency of access
  - Problems do not become immediately evident

Long-Term Storage Options
- Removable media stored offline
  - Optical
    - CD-R (CD-Recordable)
    - DVD-R (DVD-Recordable), DVD+R, DVD+RW, DVD-RW, …
  - Tape
    - DLT, 8mm, DAT, …
  - Pros: cheap, easy, produces tangible item
  - Cons: low capacity, physical space requirements, unknown longevity, migration, potential format obsolescence
- Online/nearline storage systems
  - HSM: Hierarchical Storage Management
    - Combine disk and automated tape storage with software to keep track of where files are located
    - Locally managed or remote provider
  - Pros: high capacity, migration can be handled by software
  - Cons: expensive, complex, network bandwidth issues, must trust service provider, potential single point of failure
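
The HSM idea above (disk plus tape, with software tracking where each file lives) can be illustrated with a toy model. This is only a sketch under stated assumptions: the class `ToyHSM`, its methods, the least-recently-used eviction policy, and the file names are all hypothetical and are not how HPSS or any real HSM product is implemented.

```python
from collections import OrderedDict


class ToyHSM:
    """Toy model of hierarchical storage: a small disk cache in front of tape."""

    def __init__(self, disk_capacity):
        self.disk_capacity = disk_capacity  # max number of files kept on disk
        self.disk = OrderedDict()           # name -> data, least recently used first
        self.tape = {}                      # name -> data, the complete archive copy

    def store(self, name, data):
        """Write a file to tape and keep a copy on disk for fast access."""
        self.tape[name] = data
        self._cache(name, data)

    def location(self, name):
        """Report which tier currently holds the fastest copy of the file."""
        if name in self.disk:
            return "disk"
        if name in self.tape:
            return "tape"
        raise KeyError(name)

    def read(self, name):
        """Return file contents, recalling from tape to disk if needed."""
        if name in self.disk:
            self.disk.move_to_end(name)     # mark as most recently used
            return self.disk[name]
        data = self.tape[name]              # slow path: simulated tape recall
        self._cache(name, data)
        return data

    def _cache(self, name, data):
        """Place a file on disk, evicting least recently used files if full."""
        self.disk[name] = data
        self.disk.move_to_end(name)
        while len(self.disk) > self.disk_capacity:
            self.disk.popitem(last=False)   # tape copy remains after eviction


# Hypothetical usage: three page images, room for only two on disk.
hsm = ToyHSM(disk_capacity=2)
for name in ["page1.tif", "page2.tif", "page3.tif"]:
    hsm.store(name, b"placeholder bytes")

before_read = hsm.location("page1.tif")     # evicted from disk, still on tape
hsm.read("page1.tif")                       # recall brings it back to disk
after_read = hsm.location("page1.tif")
print(before_read, after_read)
```

The point of the sketch is that users ask for a file by name and the software decides which tier serves it; the caller never needs to know which tape holds the data.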

HSM Example: IU’s Massive Data Storage Service (MDSS)
- HPSS (High Performance Storage System) software
  - Developed as collaboration of IBM and US national labs
- Four tape robots
  - 2 in Bloomington, 2 in Indianapolis
  - Data can be mirrored
- 540 terabytes (TB) total storage
  - ~75 TB used as of April 2001

A digital object is more than just a file!
- Hi-res page image files (TIFF)
- Delivery page image files (JPEG)
- Text file (TEI/XML)
- Metadata

A digital object is more than just a file!
- EAD Finding Aid

DL Objects
- Digital library “objects” have many parts
  - Metadata
  - Preservation/archival files
  - Delivery files
- How do we keep them connected?
  - Now: Good practice in file naming, directory organization, project documentation (not scalable!)
  - Future: Digital object repository
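
As an illustration of the "now" approach, the parts of an object can be reconnected purely from file names. The sketch below assumes a hypothetical convention in which every file name starts with an object ID followed by an optional `_sequence` suffix, and in which the extension signals the file's role; the IDs, directories, and the `ROLES` mapping are invented for the example.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to the role a file plays in an object.
ROLES = {".tif": "preservation", ".jpg": "delivery", ".xml": "text"}


def group_by_object(paths):
    """Group file paths into objects keyed by the ID before the first underscore."""
    objects = defaultdict(lambda: defaultdict(list))
    for path in paths:
        p = PurePosixPath(path)
        object_id = p.stem.split("_")[0]
        role = ROLES.get(p.suffix.lower(), "other")
        objects[object_id][role].append(path)
    # Convert to plain dicts with sorted file lists for predictable output.
    return {
        oid: {role: sorted(files) for role, files in roles.items()}
        for oid, roles in objects.items()
    }


# Hypothetical file listing spread across several directories.
files = [
    "archive/obj0001_001.tif",
    "archive/obj0001_002.tif",
    "web/obj0001_001.jpg",
    "web/obj0001_002.jpg",
    "text/obj0001.xml",
    "archive/obj0002_001.tif",
]
objects = group_by_object(files)
print(objects["obj0001"]["delivery"])
```

Everything here depends on people following the naming convention consistently, which is why the approach does not scale: a single misnamed file silently drops out of its object.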

Data Persistence
- Key is migration
- Keeping the bits alive
  - Physical media
  - Logical media format
- Keeping the bits understandable
  - File format
  - Metadata
- Small “pockets” of digital content pose a problem for migration
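
One common way to check that the bits stayed alive through a migration is to compare checksums before and after the copy. The slides do not mention checksums, so treat this as an assumption added for illustration; the function names, the choice of SHA-256, and the temporary directories standing in for old and new media are all hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(src, dst_dir):
    """Copy src into dst_dir and verify the copy is bit-for-bit identical."""
    src = Path(src)
    dst = Path(dst_dir) / src.name
    before = sha256_of(src)
    shutil.copy2(src, dst)
    after = sha256_of(dst)
    if before != after:
        raise IOError(f"checksum mismatch migrating {src}")
    return dst, after


# Demo: two temporary directories stand in for the old and new storage media.
with tempfile.TemporaryDirectory() as old, tempfile.TemporaryDirectory() as new:
    original = Path(old) / "page001.tif"
    original.write_bytes(b"example image bytes")
    copied, checksum = migrate(original, new)
    migrated_ok = copied.read_bytes() == original.read_bytes()

print(migrated_ok, checksum)
```

A check like this only covers the first half of the slide, keeping the bits alive. Keeping them understandable still depends on file formats and metadata.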

DL Object Repository
- Preservation version in HSM
- Delivery version(s) on web server
- Metadata records
- Repository System
- Users and applications

Web Delivery Functions
- Searching
  - Metadata
  - Full text
- Browsing
  - By subject, date, author, …
- Navigation
  - Page turning, image panning/zooming, …
- Streaming
  - For audio/video
- Reuse
  - Downloading, format conversion
  - Linking, persistent naming
- Access control
  - If necessary

Digital Collection Delivery Software
- Very complex systems
  - Need to integrate data from databases, full-text search engines, file systems, and other sources
  - Cross-collection searching
- Commercial
  - ContentDM, Luna Insight, various library management system add-ons
- Open source
  - UMich DLXS, Greenstone, Eprints, MIT DSpace, …
- Homegrown

Demonstration
- Hoagy Carmichael Collection, IU Digital Library Program
  - http://www.dlib.indiana.edu/collections/hoagy/

Exposing Digital Resources Broadly
- Pay services
  - RLG Cultural Materials, Archival Resources
- Free services
  - University of Michigan OAIster (www.oaister.org)
  - UIUC Digital Gateway to Cultural Heritage Materials (oai.grainger.uiuc.edu)
- OAI-PMH
  - Open Archives Initiative Protocol for Metadata Harvesting (www.openarchives.org)

OAI Metadata Harvesting
- Extract metadata from various sources
- Build services on local copies of metadata

[Diagram: a user sends a search for “Indiana” to the service provider, which holds a local copy of metadata harvested offline from each of the data providers; all searching, browsing, etc. is performed on that local copy.]
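
The harvesting model can be sketched from the service provider's side: build an OAI-PMH `ListRecords` request, parse the Dublin Core records in the response into a local copy, and run the search against that copy. The verb, the `oai_dc` metadata prefix, and the XML namespaces come from the OAI-PMH specification; the base URL, record identifiers, titles, and function names are hypothetical, and the response is embedded as a string so the example runs without a network. Resumption tokens and error handling, which a real harvester needs, are left out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces used by OAI-PMH responses carrying unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL a harvester would request from a data provider."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def parse_records(xml_text):
    """Extract identifier and titles from each record in a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        titles = [t.text for t in record.iterfind(".//dc:title", NS)]
        records.append({"identifier": identifier, "titles": titles})
    return records


def search(records, term):
    """Search the local copy of the metadata; no data provider is contacted."""
    term = term.lower()
    return [r for r in records if any(term in (t or "").lower() for t in r["titles"])]


# Hypothetical response, as a data provider might return it.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Indiana postcards</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sheet music collection</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

request_url = list_records_url("http://example.org/oai")
local_copy = parse_records(SAMPLE_RESPONSE)   # harvested offline, ahead of any search
hits = search(local_copy, "Indiana")
print(request_url)
print([r["identifier"] for r in hits])
```

Because the search runs only against `local_copy`, results are as fresh as the last harvest, which is the trade-off the diagram describes.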
More Information
Bibliography to be made available at: http://www.dlib.indiana.edu/workshops/alioct03/