![Page 1: Central Registry for Digitized Objects: Linking Production and Bibliographic Control](https://reader030.vdocuments.us/reader030/viewer/2022032709/568131ab550346895d981956/html5/thumbnails/1.jpg)
Central Registry for Digitized Objects:
Linking Production and Bibliographic Control
Ralf Stockmann
Göttinger Digitization Center
![Page 2: Central Registry for Digitized Objects: Linking Production and Bibliographic Control](https://reader030.vdocuments.us/reader030/viewer/2022032709/568131ab550346895d981956/html5/thumbnails/2.jpg)
As things are now
• Huge ventures in
  – Digitization
    • Google
    • Microsoft
    • National programs
    • Local centers
  – Accessibility
    • World Digital Library
    • European Digital Library
    • National portals
    • Google Book Search
![Page 3: Central Registry for Digitized Objects: Linking Production and Bibliographic Control](https://reader030.vdocuments.us/reader030/viewer/2022032709/568131ab550346895d981956/html5/thumbnails/3.jpg)
As things are now
• We are only at the dawn of mass digitization
  – Leaving behind the stage of manual production
  – Entering industrialization
  – Scanning robots
  – Accessible full text (OCR)
![Page 4: Central Registry for Digitized Objects: Linking Production and Bibliographic Control](https://reader030.vdocuments.us/reader030/viewer/2022032709/568131ab550346895d981956/html5/thumbnails/4.jpg)
Lack of …
• Coordination in digitization activities
  – Who scans what, where, when, in which quality, and how will it be accessible?
• How is “quality” defined?
• Do we agree on “what”?
![Page 5: Central Registry for Digitized Objects: Linking Production and Bibliographic Control](https://reader030.vdocuments.us/reader030/viewer/2022032709/568131ab550346895d981956/html5/thumbnails/5.jpg)
Facing the Consequences
[Figure: costs / value plotted against the number of digitized items per volume. Labels in the chart: Costs, Additional Benefit, Technical Improvements, Waste of Resources.]
![Page 6: Central Registry for Digitized Objects: Linking Production and Bibliographic Control](https://reader030.vdocuments.us/reader030/viewer/2022032709/568131ab550346895d981956/html5/thumbnails/6.jpg)
The Solution
• Central registry for digitized objects
• Focused on the production context (no user frontend)
• API driven
  – Application Programming Interface
  – Query / Ingest
  – Simple integration into existing workflow tools
• Batch mode (lists)
• Open Source / free service
• Matching on volume level
  – Score / probability
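
A minimal sketch of what a query / ingest API with batch mode and a per-volume score might look like. Everything here is an assumption made for illustration: the names `Registry`, `Volume`, `QueryResult` and `field_score`, the in-memory list, the threshold, and the deliberately crude score (share of exactly equal fields) are not part of the proposal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Volume:
    """One physical volume; the fields are illustrative only."""
    title: str
    author: str
    year: int
    institution: str = ""


@dataclass(frozen=True)
class QueryResult:
    candidate: Volume
    match: Optional[Volume]  # None if nothing reached the threshold
    score: float             # best score found, between 0.0 and 1.0


def field_score(a: Volume, b: Volume) -> float:
    """Placeholder score: share of title / author / year that are equal."""
    hits = [
        a.title.casefold() == b.title.casefold(),
        a.author.casefold() == b.author.casefold(),
        a.year == b.year,
    ]
    return sum(hits) / len(hits)


class Registry:
    """Hypothetical in-memory stand-in for the central registry."""

    def __init__(self) -> None:
        self._volumes: list[Volume] = []

    def ingest(self, volumes: list[Volume]) -> int:
        """Batch ingest: store a list of volumes, return the new total."""
        self._volumes.extend(volumes)
        return len(self._volumes)

    def query(self, candidates: list[Volume],
              threshold: float = 0.66) -> list[QueryResult]:
        """Batch query: report the best-scoring stored volume per candidate."""
        results = []
        for cand in candidates:
            best: Optional[Volume] = None
            best_score = 0.0
            for vol in self._volumes:
                score = field_score(cand, vol)
                if score > best_score:
                    best, best_score = vol, score
            if best_score < threshold:
                best = None
            results.append(QueryResult(cand, best, best_score))
        return results
```

A workflow tool could send its whole scan list through `query` before production starts and call `ingest` once a volume is finished; returning a score rather than a yes/no answer leaves the final decision to the operator.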
![Page 7: Central Registry for Digitized Objects: Linking Production and Bibliographic Control](https://reader030.vdocuments.us/reader030/viewer/2022032709/568131ab550346895d981956/html5/thumbnails/7.jpg)
Implementation
[Diagram: three kinds of input reach the registry through the API: present collections (ingest), running projects (query and ingest), and notices of intent (ingest). Behind the API sit the Aggregator / Normalizer / Mapping layer and the Registry / Meta Data Store, connected via ingest to backend services (EROMM / EDL / OCLC / …).]
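
The Aggregator / Normalizer / Mapping layer could be sketched roughly as follows. The source names `source_a` and `source_b` and all field names are hypothetical placeholders; they do not reflect the actual export formats of EROMM, EDL, OCLC or any other backend service.

```python
import unicodedata

# Hypothetical mappings: source field name -> registry field name.
FIELD_MAPS = {
    "source_a": {"tit": "title", "aut": "author", "yr": "date"},
    "source_b": {"Title": "title", "Creator": "author", "Issued": "date"},
}


def normalize_text(value: str) -> str:
    """Bring text to Unicode NFC and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFC", value).split())


def map_record(source: str, raw: dict) -> dict:
    """Map one raw source record onto the registry's field names.

    Fields without a mapping are dropped; values are normalized strings.
    """
    mapping = FIELD_MAPS[source]
    record = {}
    for src_field, value in raw.items():
        target = mapping.get(src_field)
        if target is not None:
            record[target] = normalize_text(str(value))
    record["source"] = source  # remember where the record came from
    return record


def aggregate(batches: dict) -> list:
    """Merge batches from several sources into one list of mapped records."""
    return [
        map_record(source, raw)
        for source, raws in batches.items()
        for raw in raws
    ]
```

Keeping the mapping tables as plain data means a new backend can be connected by adding one dictionary entry rather than new code.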
![Page 8: Central Registry for Digitized Objects: Linking Production and Bibliographic Control](https://reader030.vdocuments.us/reader030/viewer/2022032709/568131ab550346895d981956/html5/thumbnails/8.jpg)
Metadata Store
• Bibliographic
  – Title
  – Author
  – Date
  – Place of publication
  – Number of pages (?)
  – Language
  – Print / Format
  – Edition
• Technical
  – Resolution
  – Color depth
  – File type / compression
• Accessibility
  – Institution
  – Persistent identifier
  – Rights
  – URL
• Status
  – Digitized
  – In progress
  – Intended (timeline?)
  – Requested?

Matching / Score: “what”
Additional Judging: “who, where, which quality, how accessible”
Decisive Factor: “when”
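
One possible reading of this slide as a data model, given here only as a sketch. The class and field names, the `advise` helper, and both thresholds are assumptions. So is the way the three annotations are wired in: the match score answers "what", the technical data stands in for "which quality", and the status stands in for "when".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    DIGITIZED = "digitized"
    IN_PROGRESS = "in progress"
    INTENDED = "intended"
    REQUESTED = "requested"


@dataclass
class Bibliographic:
    title: str
    author: str
    date: str
    place: str = ""
    pages: Optional[int] = None
    language: str = ""
    print_format: str = ""
    edition: str = ""


@dataclass
class Technical:
    resolution_dpi: Optional[int] = None
    color_depth_bits: Optional[int] = None
    file_type: str = ""


@dataclass
class Accessibility:
    institution: str = ""
    persistent_id: str = ""
    rights: str = ""
    url: str = ""


@dataclass
class RegistryEntry:
    bibliographic: Bibliographic
    technical: Technical
    accessibility: Accessibility
    status: Status


def advise(entry: RegistryEntry, match_score: float,
           min_dpi: int = 300, threshold: float = 0.8) -> str:
    """Return 'scan', 'skip' or 'coordinate' for a candidate volume."""
    if match_score < threshold:
        return "scan"        # "what": probably a different volume
    dpi = entry.technical.resolution_dpi
    if dpi is not None and dpi < min_dpi:
        return "scan"        # "which quality": existing copy is too poor
    if entry.status is Status.DIGITIZED:
        return "skip"        # "when": already done elsewhere
    return "coordinate"      # in progress / intended / requested
```

The "coordinate" outcome covers entries that are only announced or under way, where contacting the other institution is likely more useful than an automatic yes or no.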
![Page 9: Central Registry for Digitized Objects: Linking Production and Bibliographic Control](https://reader030.vdocuments.us/reader030/viewer/2022032709/568131ab550346895d981956/html5/thumbnails/9.jpg)
Obstacles
• (Open source) tools for automated matching / scoring?
• Interface for manual comparison / decision making
• Multivolume works: low rate of uniformity (nearly 50% of the physical SUB stock from before 1900)
• Unicode
• Transliteration tables
• Randomly bound books
• Reliable identifier
  – ISBN for old books?
• Anticipated rate of accuracy: 50–70%
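
As a rough illustration of automated matching that copes with Unicode and transliteration differences, the sketch below uses only the Python standard library (`unicodedata` and `difflib.SequenceMatcher`). The tiny transliteration table, the field weights and the record layout are assumptions made for the example, not a recommendation of a specific tool.

```python
import unicodedata
from difflib import SequenceMatcher

# Tiny illustrative transliteration table; a real one would be far larger.
TRANSLIT = str.maketrans({"æ": "ae", "ø": "o", "ł": "l", "đ": "d"})


def fold(text: str) -> str:
    """Casefold, transliterate, strip diacritics and punctuation."""
    text = text.casefold().translate(TRANSLIT)
    decomposed = unicodedata.normalize("NFKD", text)
    kept = [c for c in decomposed if not unicodedata.combining(c)]
    cleaned = "".join(c if c.isalnum() else " " for c in kept)
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] after folding both inputs."""
    return SequenceMatcher(None, fold(a), fold(b)).ratio()


def match_score(cand: dict, rec: dict) -> float:
    """Weighted score: title 0.5, author 0.25, year 0.25 (assumed weights)."""
    title = similarity(cand["title"], rec["title"])
    author = similarity(cand["author"], rec["author"])
    year = 1.0 if cand.get("year") == rec.get("year") else 0.0
    return 0.5 * title + 0.25 * author + 0.25 * year
```

Scores in a middle band could be routed to the manual comparison interface instead of being decided automatically. Multivolume works and books bound together would still need handling beyond this kind of string comparison.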
![Page 10: Central Registry for Digitized Objects: Linking Production and Bibliographic Control](https://reader030.vdocuments.us/reader030/viewer/2022032709/568131ab550346895d981956/html5/thumbnails/10.jpg)
Appreciation of Values
• The goal is NOT to build a reliable database in terms of library standards
• But to prevent further waste of resources.
• If we manage to achieve just 50% precision,
• we will have saved at least 50% of the funding!
![Page 11: Central Registry for Digitized Objects: Linking Production and Bibliographic Control](https://reader030.vdocuments.us/reader030/viewer/2022032709/568131ab550346895d981956/html5/thumbnails/11.jpg)
Work Packages
• Define metadata model
• Set up database
• Implement mapping tools
• Define API calls
• Implement API
• Build some connectors to popular mass digitization workflow tools (e.g. “Goobi”)
• Establish ISBN workflow
• Harvest existing sources
• Start with a community of actual projects
• Get some (!) funding
• Estimated schedule: 6 months
![Page 12: Central Registry for Digitized Objects: Linking Production and Bibliographic Control](https://reader030.vdocuments.us/reader030/viewer/2022032709/568131ab550346895d981956/html5/thumbnails/12.jpg)
Thank You
([email protected])