![Page 1: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/1.jpg)
DRS 2 Metadata Migration
June 25, 2013
![Page 2: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/2.jpg)
Agenda
• Introduction
• Preliminary results - content analysis
• Metadata options
• Next steps
• Questions
![Page 3: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/3.jpg)
INTRODUCTION
![Page 4: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/4.jpg)
Reason for metadata migration
• Different data model
  – File -> Object (a coherent set of content that is considered a single intellectual unit for purposes of description, use and/or management: for example a particular book, web harvest, serial or photograph.)
• Different metadata schemas
  – Many locally-defined -> community-standard
• Different packaging of metadata
  – Use of METS in some cases -> consistent use of METS
![Page 5: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/5.jpg)
Path to metadata migration
Analysis
• Metadata
• Content
• Users
Prototype
• Proof-of-concept
• Time estimates
Migration plan
• Sequence
• Schedule
Develop tools
• Dashboard
• Object builders
Metadata migration
[Callout on diagram: We are here]
![Page 6: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/6.jpg)
Key feedback points
Analysis
• Metadata
• Content
• Users
Prototype
• Proof-of-concept
• Time estimates
Migration plan
• Sequence
• Schedule
Develop tools
• Dashboard
• Object builders
Metadata migration
[Callouts on diagram: Technical options; Process options]
![Page 7: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/7.jpg)
Timing
Analysis
• Metadata
• Content
• Users
Prototype
• Proof-of-concept
• Time estimates
Migration plan
• Sequence
• Schedule
Develop tools
• Dashboard
• Object builders
Metadata migration
[Callout on diagram: Next 3 months]
![Page 8: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/8.jpg)
What does it involve?
• Aggregate DRS1 files into objects
  – Different object types = content models
• Generate an object descriptor per object
![Page 9: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/9.jpg)
Document example
PDF file
![Page 10: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/10.jpg)
Document example
PDF file
New object (content model = DOCUMENT)
![Page 11: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/11.jpg)
Document example
PDF file
Descriptor file
New object (content model = DOCUMENT)
![Page 12: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/12.jpg)
Still image example
Archival master image file
![Page 13: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/13.jpg)
Still image example
Archival master image file
Production master image file
![Page 14: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/14.jpg)
Still image example
Archival master image file
Deliverable image file
Production master image file
![Page 15: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/15.jpg)
Still image example
Archival master image file
New object (content model = STILL IMAGE)
Deliverable image file
Production master image file
![Page 16: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/16.jpg)
Still image example
Archival master image file
Descriptor file
Deliverable image file
Production master image file
New object (content model = STILL IMAGE)
![Page 17: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/17.jpg)
Aggregate DRS1 files into objects
• One content file per object
  – Color profile
  – Document
  – Google document container 1
  – Google document container 2
  – Google document container 3
  – Opaque container
  – Text
![Page 18: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/18.jpg)
Aggregate DRS1 files into objects
• Multiple content files per object
  – Audio
  – Web harvest
  – Biomedical image
  – PDS document
  – Target image
  – MOA2
  – Still image
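As a rough illustration of the aggregation step, the sketch below groups DRS1 file records into candidate objects. The record fields (`file_id`, `content_model`, `group_key`), the upper-case content model names and the grouping rule are all assumptions made for illustration; the slides do not say how related DRS1 files are matched to one another.

```python
from dataclasses import dataclass, field

# Content models the slides list as "one content file per object".
# The exact spelling of these names is an assumption.
SINGLE_FILE_MODELS = {
    "COLOR PROFILE", "DOCUMENT", "GOOGLE DOCUMENT CONTAINER 1",
    "GOOGLE DOCUMENT CONTAINER 2", "GOOGLE DOCUMENT CONTAINER 3",
    "OPAQUE CONTAINER", "TEXT",
}

@dataclass
class Drs1File:
    file_id: str        # hypothetical DRS1 file identifier
    content_model: str  # target content model, e.g. "STILL IMAGE"
    group_key: str      # hypothetical key shared by related files

@dataclass
class Drs2Object:
    content_model: str
    file_ids: list = field(default_factory=list)

def aggregate(files):
    """Group DRS1 files into objects, one per file or one per group key."""
    objects = []
    grouped = {}
    for f in files:
        if f.content_model in SINGLE_FILE_MODELS:
            # Single-file models: every file becomes its own object.
            objects.append(Drs2Object(f.content_model, [f.file_id]))
            continue
        # Multi-file models: files sharing a key land in the same object.
        key = (f.content_model, f.group_key)
        if key not in grouped:
            grouped[key] = Drs2Object(f.content_model)
            objects.append(grouped[key])
        grouped[key].file_ids.append(f.file_id)
    return objects
```

In this sketch an archival master, production master and deliverable that share a `group_key` end up in one STILL IMAGE object, while each PDF becomes its own DOCUMENT object.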
![Page 19: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/19.jpg)
Generate object descriptors
• METS format
  – Embedded schemas (PREMIS, MODS, MIX, etc.)
• Metadata sources
  – DRS1 database
  – DRS1 METS files where they exist
  – Examining the content files
  – Catalog records?
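To make the descriptor idea concrete, here is a minimal sketch of a METS skeleton with one embedded MODS record, built with Python's standard library. It is not the DRS2 descriptor profile: the function name, the IDs and the choice of sections are assumptions, and the output is not validated against the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("mods", MODS_NS)

def q(ns, tag):
    """Return a namespace-qualified tag in ElementTree's {uri}tag form."""
    return f"{{{ns}}}{tag}"

def build_descriptor(label, title, file_ids):
    """Build a bare METS element tree for one object (illustrative only)."""
    mets = ET.Element(q(METS_NS, "mets"), {"LABEL": label})

    # Descriptive metadata: MODS wrapped inside a METS dmdSec.
    dmd = ET.SubElement(mets, q(METS_NS, "dmdSec"), {"ID": "dmd1"})
    wrap = ET.SubElement(dmd, q(METS_NS, "mdWrap"), {"MDTYPE": "MODS"})
    xml_data = ET.SubElement(wrap, q(METS_NS, "xmlData"))
    mods = ET.SubElement(xml_data, q(MODS_NS, "mods"))
    title_info = ET.SubElement(mods, q(MODS_NS, "titleInfo"))
    ET.SubElement(title_info, q(MODS_NS, "title")).text = title

    # File inventory and a flat structural map pointing at each file.
    file_sec = ET.SubElement(mets, q(METS_NS, "fileSec"))
    grp = ET.SubElement(file_sec, q(METS_NS, "fileGrp"))
    struct_map = ET.SubElement(mets, q(METS_NS, "structMap"))
    div = ET.SubElement(struct_map, q(METS_NS, "div"))
    for fid in file_ids:
        ET.SubElement(grp, q(METS_NS, "file"), {"ID": fid})
        ET.SubElement(div, q(METS_NS, "fptr"), {"FILEID": fid})
    return mets
```

Calling `ET.tostring(build_descriptor("Example", "Example title", ["f1"]), encoding="unicode")` serializes the tree. PREMIS and MIX sections would be wrapped the same way in additional metadata sections.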
![Page 20: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/20.jpg)
PRELIMINARY RESULTS: CONTENT ANALYSIS
![Page 21: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/21.jpg)
Preliminary content analysis
• Conceptually “built” objects for 13/14 content models (~36 million / 44 million files)
  – All but still image
  – Order helps!
Still Image
MOA2
Biomedical Image
PDS Document
![Page 22: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/22.jpg)
Preliminary content analysis
• 1,091,670 objects from 36,190,120 files
  – ~33 files per object
• Relatively few surprises but content analysis is not complete
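The ratios quoted on these slides can be checked with a few lines of arithmetic. The object and file counts come from the slides; the 44 million total is the rounded figure given earlier, so the share is approximate.

```python
files_grouped = 36_190_120   # files assigned to conceptual objects
objects_built = 1_091_670    # objects conceptually built
total_files = 44_000_000     # rounded total from the slides

files_per_object = files_grouped / objects_built
share_grouped = files_grouped / total_files

print(f"{files_per_object:.1f} files per object on average")
print(f"{share_grouped:.0%} of all files covered so far")
```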
![Page 23: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/23.jpg)
Content cleanup
• MOA2 files (8,024)
• Index maps (2,686)
• Entity files (1)
• Merged PDS descriptors (22,203)
![Page 24: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/24.jpg)
Content cleanup
• Orphaned target images (5), target description files (4)
• Orphaned audio files (71)
![Page 25: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/25.jpg)
METADATA OPTIONS
![Page 26: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/26.jpg)
[Diagram: where metadata lives in DRS1 versus DRS2]
DRS1
  FILE INFO: e.g., billingCode, ownerCode, accessFlag, tech metadata, owner-suppliedName, role, purpose, quality, usageClass
DRS2 (DESCRIPTOR)
  OBJECT INFO: e.g., billingCode, ownerCode, owner-suppliedName
  FILE INFO: e.g., accessFlag, tech metadata, owner-suppliedName, role, processing, quality, usageClass
![Page 28: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/28.jpg)
[Diagram: where metadata lives in DRS1 versus DRS2, including METS content]
DRS1
  FILE INFO: e.g., billingCode, ownerCode, accessFlag, tech metadata, owner-suppliedName, role, purpose, quality, usageClass
  METS: Object Label, MODS, PDS info, etc.
DRS2 (DESCRIPTOR)
  OBJECT INFO: billingCode, ownerCode, owner-suppliedName, caption unit name, view text
  FILE INFO: accessFlag, tech metadata, owner-suppliedName, role, processing, quality, usageClass
  Object Label, Object-level MODS
![Page 29: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/29.jpg)
Objects
• Owner supplied name is required
• Need to generate during migration
• Four cases
  – A METS file exists
  – New object will be built from a single content file
  – New object will be built from multiple content files
  – No OSN (potential case)
• Proposal for most cases:
  – add prefix or suffix to METS or content file owner supplied name
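The four cases could be handled along the following lines. This is only a sketch of the proposal: the `_OBJ` suffix is a made-up placeholder, and picking the alphabetically first file name when there are several content files is an assumption, since the slides do not say which file's name would be used.

```python
def object_osn(mets_osn=None, content_osns=(), prefix="", suffix="_OBJ"):
    """Derive an object owner supplied name (OSN), or None if none exists.

    mets_osn:     OSN of an existing METS file, if there is one
    content_osns: OSNs of the content files going into the object
    prefix/suffix: hypothetical markers; real values are undecided
    """
    if mets_osn:
        base = mets_osn                  # case 1: a METS file exists
    elif len(content_osns) == 1:
        base = content_osns[0]           # case 2: single content file
    elif len(content_osns) > 1:
        base = sorted(content_osns)[0]   # case 3: assumed tie-break rule
    else:
        return None                      # case 4: no OSN, needs a decision
    return f"{prefix}{base}{suffix}"
```

Returning `None` for the no-OSN case keeps that open question visible instead of silently inventing a name.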
![Page 30: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/30.jpg)
Objects
• Other required object elements
  – insertionDate
    • date of earliest file?
  – captionBehavior
    • for existing objects, set based on billing code
    • prospectively, set by depositor
  – viewText
    • available for all objects, not just PDS
    • default to off
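One way to read these defaults is sketched below. It assumes the "date of earliest file" answer is adopted for `insertionDate`, and it takes the billing-code lookup as a caller-supplied table, because the slides do not list the actual caption behavior values.

```python
from datetime import date

def required_object_elements(file_insertion_dates, billing_code, caption_rules):
    """Fill in required object elements for a migrated object.

    file_insertion_dates: insertion dates of the object's DRS1 files
    billing_code:         the object's billing code
    caption_rules:        hypothetical mapping of billing code -> caption behavior
    """
    return {
        # Assumes the open question is settled as "earliest file date".
        "insertionDate": min(file_insertion_dates),
        # None signals that no rule exists yet for this billing code.
        "captionBehavior": caption_rules.get(billing_code),
        # The slides propose viewText off by default.
        "viewText": False,
    }
```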
![Page 31: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/31.jpg)
Objects
• Descriptive metadata
  – Take MODS from existing METS as is or import new
    • From Aleph
    • From Finding Aid
  – If re-imported, update METS label or not?
  – Import from OLIVIA based on owner supplied name for the file?
![Page 32: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/32.jpg)
Objects from existing METS
• Identifiers for Harvard metadata
  – Identify finding aid identifiers
  – Convert “Old HOLLIS” numbers
  – Aleph IDs: include check digit or not?
  – Convert to URIs or actionable URNs from plain IDs
    • Could DRS format such URIs for new DRS2 input?
![Page 33: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/33.jpg)
Objects from existing METS
• PDS elements
  – PDF owner text becomes caption unit name
  – viewOcr function becomes viewText
  – goto function will be automatically determined by presence of structMap/div attributes
• Caption behavior
  – for existing objects, set by billing code
![Page 34: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/34.jpg)
Files
• Run automated processes to identify, validate and characterize file technical characteristics
• Extract technical metadata
![Page 35: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/35.jpg)
Files
• isFirstGenerationinDrs
  – Values: yes, no, unspecified
  – Should we supply “yes” for archival masters and/or top of derivation chain?
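The open question about `isFirstGenerationinDrs` can be framed as two switchable rules. The sketch below is one possible reading, not a decision: the role string `ARCHIVAL_MASTER` and the `derived_from` field are hypothetical, and anything not matched by a rule is left as "unspecified" instead of being guessed.

```python
def is_first_generation(role, derived_from=None, *, masters=True, chain_top=True):
    """Suggest a value for isFirstGenerationinDrs on a migrated file.

    role:         file role (hypothetical spelling, e.g. "ARCHIVAL_MASTER")
    derived_from: ID of the file this one was derived from, or None
    masters:      apply the "archival masters get yes" rule
    chain_top:    apply the "top of derivation chain gets yes" rule
    """
    if masters and role == "ARCHIVAL_MASTER":
        return "yes"
    if chain_top and derived_from is None:
        return "yes"
    return "unspecified"
```

Toggling `masters` and `chain_top` makes it easy to compare how many files each option would mark "yes" before committing to a rule.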
![Page 36: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/36.jpg)
Image Files
• Converting from local scheme to MIX
• Local field questions
  – Methodology
  – History
  – Source
  – Enhancements
![Page 37: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/37.jpg)
Text files
• Converting from local scheme to textMD
• Descriptor_type will be absorbed into different places in DRS2
• Extracted metadata can supply
  – markup_basis
  – markup_language for specific schemas
  – possibly other elements
![Page 38: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/38.jpg)
Audio files
• Moving from local schema to AES57-2011: Audio object structures for preservation and restoration
![Page 39: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/39.jpg)
Versioned metadata
• History will be tracked for key administrative elements:
  – Access flag
  – Admin flag (new)
  – Billing code
  – Owner code
• What values to assign for required creation date and agent for migrated content?
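A minimal data structure for this kind of history tracking might look like the following. The class and field names are illustrative assumptions, as are the camel-case element names and the sample values; the sketch mainly shows why each stored value needs a creation date and an agent, which is exactly what has to be decided for migrated content.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Elements whose history is tracked (names assumed from the slides).
TRACKED = ("accessFlag", "adminFlag", "billingCode", "ownerCode")

@dataclass
class Version:
    value: str
    created: datetime   # required creation date
    agent: str          # required agent, e.g. a person or a process

@dataclass
class VersionedRecord:
    history: dict = field(default_factory=lambda: {k: [] for k in TRACKED})

    def set(self, element, value, agent, created=None):
        """Append a new version; earlier versions are kept."""
        if element not in self.history:
            raise KeyError(f"{element} is not a tracked element")
        created = created or datetime.now(timezone.utc)
        self.history[element].append(Version(value, created, agent))

    def current(self, element):
        """Return the latest value, or None if never set."""
        versions = self.history[element]
        return versions[-1].value if versions else None
```

For migrated content, the first `set` call per element would carry whatever creation date and agent the project settles on, for example the original deposit date or the migration run itself.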
![Page 40: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/40.jpg)
NEXT STEPS
![Page 41: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/41.jpg)
Next steps
• Continue analysis and development of technical requirements
• Build prototype
• September check-in on progress
• Create metadata migration plan
• Open meeting to review plan
![Page 42: DRS 2 Metadata Migration](https://reader036.vdocuments.us/reader036/viewer/2022081507/5681674a550346895ddbf9de/html5/thumbnails/42.jpg)
OPEN FOR QUESTIONS