Implementing metadata standards for a digital audiovisual preservation repository
DESCRIPTION
This talk was created for discussion of PREMIS implementation in an audiovisual preservation repository environment.
TRANSCRIPT
Implementing metadata standards for a digital audiovisual preservation repository
Kara van Malssen
AudioVisual Preservation Solutions
2011-03-24
Case Study: NDIIPP Preserving Digital Public Television Project
[Slide graphic. Names shown: PBS, Library of Congress, NYU, WNET, WGBH. Labels: SIP site, Repository.]
Submission Workflow
[Diagram labels. Producing Stations: WGBH, WNET, Station A, Station B, Station C. PBS. Satellite. NYU PDPTV Prototype Repository. Transmitting Stations: WNET, WGBH, Stations A through J.]
NYU Goals:
• Create a prototype repository for long-term retention
• Aggregate content from partner stations + PBS for sample programs
• Populate records with metadata that already exists (in station databases, files, scheduling systems, etc.)
• Transform data and package content, while preserving relationships between items
Important Vocabulary
• The Repository: NYU prototype preservation repository
• OAIS: Open Archival Information System
• SIP: Submission Information Package
• AIP: Archival Information Package
OAIS Terms!
Challenge of managing diverse SIPs:
SIP Class 1: WNET National Broadcast (Nature)
SIP Class 2: WGBH National Broadcasts
SIP Class 3: WNET Local Broadcast (New York Voices)
SIP Class 4: Religion and Ethics
[Diagram labels. Essence files: Production Master (mov), Production Master (mxf), HD Broadcast Master (mov/data), SD Broadcast Master (mov/aiff/m2v), SD Broadcast Master (mpeg). Database exports: PODS, PROTRACK, TEAMS, INMAGIC. Additional items: scripts, etc.]
PDPTV metadata model
• METS (Metadata Encoding and Transmission Standard): structural and administrative
• PBCore (Public Broadcasting Metadata Dictionary): descriptive and technical
• PREMIS (Preservation Metadata Implementation Strategies): technical preservation metadata
METS: Metadata Encoding and Transmission Standard
• Provides a structure to bundle all content (essence + metadata) in one AIP
• Identifies types of metadata, but not the terms to define them (with a few exceptions)
METS
• dmdSec
• amdSec: techMD, rightsMD, sourceMD, digiprovMD
• fileSec
• structMap
• behaviorSec
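The section layout listed above can be sketched in a few lines of Python. This is only an illustration of how the METS sections nest, not the project's tooling: the function name `build_mets_skeleton`, the `OBJID` value, and the `ID` values are assumptions made for the example, and the empty sections would still need their metadata wrappers and a structural `div` before the result could validate against the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag):
    """Return a namespace-qualified METS tag name."""
    return f"{{{METS_NS}}}{tag}"


def build_mets_skeleton(object_id):
    """Build an empty METS shell with the sections named on the slide.

    The ID values are placeholders chosen for this sketch.
    """
    mets = ET.Element(q("mets"), {"OBJID": object_id})
    # Descriptive metadata section
    ET.SubElement(mets, q("dmdSec"), {"ID": "dmd-001"})
    # Administrative metadata section with its four sub-sections
    amd = ET.SubElement(mets, q("amdSec"), {"ID": "amd-001"})
    for name in ("techMD", "rightsMD", "sourceMD", "digiprovMD"):
        ET.SubElement(amd, q(name), {"ID": f"{name}-001"})
    # File inventory, structural map, and behaviors
    ET.SubElement(mets, q("fileSec"))
    ET.SubElement(mets, q("structMap"))
    ET.SubElement(mets, q("behaviorSec"))
    return mets


if __name__ == "__main__":
    skeleton = build_mets_skeleton("example-aip")  # hypothetical identifier
    print(ET.tostring(skeleton, encoding="unicode"))
```

The point of the sketch is that METS supplies the containers only; what goes inside each one (PBCore, PREMIS, and so on) is a separate decision.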
PBCore: What is it good for?
• Descriptive metadata elements that are specific to public broadcasting & AV
• Controlled vocabularies with broadcast terms
• Easy to map to from legacy station databases
• Granular technical metadata (PBCore 1.2+)
➡ Accurately represents the file-specific metadata
➡ Can be auto-populated using technical metadata extraction tools & stylesheets
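As a rough illustration of the auto-population idea, the sketch below maps a dictionary of extracted technical values onto instantiation-level elements. Everything here is an assumption for the example: the input keys, the sample values, and the `FIELD_MAP` table are hypothetical, and the element names are modeled on PBCore 1.x instantiation fields and should be checked against the schema version actually in use.

```python
import xml.etree.ElementTree as ET

# Hypothetical crosswalk: extractor key -> PBCore 1.x-style element name.
# Verify the element names against the PBCore schema version you target.
FIELD_MAP = {
    "file_size": "formatFileSize",
    "duration": "formatDuration",
    "overall_bit_rate": "formatDataRate",
    "format": "formatDigital",
}


def to_pbcore_instantiation(extracted):
    """Copy known technical fields into a pbcoreInstantiation element.

    Fields missing from the extractor output are skipped, not invented.
    """
    inst = ET.Element("pbcoreInstantiation")
    for source_key, pbcore_name in FIELD_MAP.items():
        value = extracted.get(source_key)
        if value is not None:
            ET.SubElement(inst, pbcore_name).text = str(value)
    return inst


# Made-up values standing in for the output of an extraction tool.
sample = {"file_size": 1048576, "duration": "00:28:30", "format": "video/quicktime"}
inst = to_pbcore_instantiation(sample)

if __name__ == "__main__":
    print(ET.tostring(inst, encoding="unicode"))
```

In practice a stylesheet would likely do the same job; the dictionary form just makes the crosswalk easy to read.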
PREMIS: Preservation Metadata Implementation Strategies
Intellectual Entity
Object
Rights
Agents
Events
Object Entity:
• Creating application info
• Playback environment (hardware and software)
“Given the wide range of institutional contexts, PREMIS cannot be an out-of-the box solution. Users have to decide how to model their specific application, which semantic units need to be captured to support them, and how to implement them.”
- ISQ Special Issue: Digital Preservation, Spring 2010, p.9
<?xml version="1.0" encoding="UTF-8"?>
<premis xmlns="info:lc/xmlns/premis-v2" version="2.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="info:lc/xmlns/premis-v2 http://www.loc.gov/standards/premis/premis.xsd">
  <!-- ================================================================== -->
  <object xsi:type="representation" xmlID="bmaster-sd-001">
    <objectIdentifier>
      <objectIdentifierType>NDIIPP:PDPTV repository naming scheme</objectIdentifierType>
      <objectIdentifierValue><!-- SD_BMASTER --></objectIdentifierValue>
    </objectIdentifier>
    <environment>
      <environmentPurpose>create</environmentPurpose>
      <environmentNote>The OMNEON server generated three files for the SD broadcast master: one QuickTime movie file (.mov), one video track (.m2v), and one audio track (.aiff). The .mov file contains fully-qualified pathname references to the .m2v and .aiff tracks that were only valid in the OMNEON server environment.</environmentNote>
      <environmentExtension>
        <creatingApplication>
          <creatingApplicationName>Avid Unity Workgroup</creatingApplicationName>
          <creatingApplicationVersion>4</creatingApplicationVersion>
        </creatingApplication>
        <creatingApplication>
          <creatingApplicationName>Avid Media Composer REG_SZ</creatingApplicationName>
          <creatingApplicationVersion>3.0.5</creatingApplicationVersion>
        </creatingApplication>
        <creatingApplication>
          <creatingApplicationName>Omneon</creatingApplicationName>
          <creatingApplicationVersion>4.3 sr2</creatingApplicationVersion>
        </creatingApplication>
      </environmentExtension>
    </environment>
    <environment>
      <environmentCharacteristic>known to work</environmentCharacteristic>
      <environmentPurpose>render</environmentPurpose>
      <environmentNote>To render the content in this environment the video track (.m2v) and audio track (.aiff) must be muxed using QTCoffee. The QuickTime movie file cannot be used to render the content because the .mov file refers to the .m2v and .aiff tracks by fully-qualified file names that were only valid in the creating environment.</environmentNote>
      <software>
        <swName>Apple Macintosh OS X version 10.5.5</swName>
        <swType>operating system</swType>
      </software>
      <software>
        <swName>Apple QuickTime Player version 7.5.5</swName>
        <swType>renderer</swType>
      </software>
      <software>
        <swName>QTCoffee 1.2.5</swName>
        <swType>muxer</swType>
      </software>
      <hardware>
        <hwName>Intel Core 2 Duo</hwName>
        <hwType>processor</hwType>
        <hwOtherInformation>2 GB RAM</hwOtherInformation>
      </hardware>
    </environment>
  </object>
  <!-- ================================================================== -->
  <object xsi:type="representation" xmlID="pmaster-001">
    <objectIdentifier>
      <objectIdentifierType>NDIIPP:PDPTV repository naming scheme</objectIdentifierType>
      <objectIdentifierValue><!-- PMASTER --></objectIdentifierValue>
    </objectIdentifier>
    <environment>
      <environmentPurpose>create</environmentPurpose>
      <environmentExtension>
        <creatingApplication>
          <creatingApplicationExtension>
            <hardware>
              <hwName>Sony eVTR MSW-M210</hwName>
              <hwType>video tape recorder that creates MXF file directly</hwType>
            </hardware>
          </creatingApplicationExtension>
        </creatingApplication>
        <creatingApplication>
          <creatingApplicationName>Sony MXF eVTR Manager</creatingApplicationName>
          <creatingApplicationVersion>0.1.0.4</creatingApplicationVersion>
        </creatingApplication>
      </environmentExtension>
    </environment>
    <environment>
      <environmentCharacteristic>unknown</environmentCharacteristic>
      <environmentPurpose>render</environmentPurpose>
      <environmentNote>No rendering environment was identified for the MXF-wrapped IMX-50 files created by the Sony eVTR.</environmentNote>
    </environment>
  </object>
  <!-- ================================================================== -->
</premis>
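One practical use of a record like the one above is to ask, per object, whether a rendering environment is known. The sketch below does that with Python's standard library. It is an illustration only: the `SAMPLE` string is a trimmed-down copy of the slide's record (the `xsi:type` attributes and most elements are left out for brevity), and the function name `render_environments` is an assumption of this example.

```python
import xml.etree.ElementTree as ET

NS = {"p": "info:lc/xmlns/premis-v2"}

# Abbreviated version of the record shown on the slide.
SAMPLE = """<premis xmlns="info:lc/xmlns/premis-v2" version="2.0">
  <object xmlID="bmaster-sd-001">
    <environment>
      <environmentCharacteristic>known to work</environmentCharacteristic>
      <environmentPurpose>render</environmentPurpose>
      <software><swName>Apple QuickTime Player version 7.5.5</swName><swType>renderer</swType></software>
      <software><swName>QTCoffee 1.2.5</swName><swType>muxer</swType></software>
    </environment>
  </object>
  <object xmlID="pmaster-001">
    <environment>
      <environmentCharacteristic>unknown</environmentCharacteristic>
      <environmentPurpose>render</environmentPurpose>
    </environment>
  </object>
</premis>"""


def render_environments(premis_xml):
    """Map each object's xmlID to its render environment summary."""
    root = ET.fromstring(premis_xml)
    report = {}
    for obj in root.findall("p:object", NS):
        for env in obj.findall("p:environment", NS):
            # Skip environments recorded for other purposes, e.g. "create".
            if env.findtext("p:environmentPurpose", namespaces=NS) != "render":
                continue
            report[obj.get("xmlID")] = {
                "characteristic": env.findtext("p:environmentCharacteristic", namespaces=NS),
                "software": [s.findtext("p:swName", namespaces=NS)
                             for s in env.findall("p:software", NS)],
            }
    return report


if __name__ == "__main__":
    for object_id, info in render_environments(SAMPLE).items():
        print(object_id, info["characteristic"], info["software"])
```

A report like this makes the gap visible at a glance: the SD broadcast master has a known-to-work environment, while the MXF production master has none.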
“When combining different metadata specifications or when embedding extension metadata, we often find that data models are mismatched or that semantic units overlap. In these cases, it is necessary to decide how to overcome the conflicts.”
- ISQ Special Issue: Digital Preservation, Spring 2010, p.7
[Diagram, shown twice: semantic units overlapping across PBCore, PREMIS, and METS. Units labeled: Agents, Checksums, Structure, File Size, Hardware, Software, Rights, Relationships, File Format, Title, Creator, Description.]
METSRights!
MODS
Descriptive elements only map to MODS
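A descriptive-only crosswalk of this kind might look like the sketch below. It is a hedged illustration, not the project's stylesheet: the sample record and its values are invented, the PBCore input is written without its namespace and uses PBCore 1.x-style element names as an assumption, and the three pairings (title to `titleInfo/title`, creator to `name/namePart`, description to `abstract`) are one plausible mapping among several.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

# Invented record using PBCore 1.x-style names, namespace omitted for brevity.
PBCORE_SAMPLE = """<PBCoreDescriptionDocument>
  <pbcoreTitle><title>Example episode</title></pbcoreTitle>
  <pbcoreDescription><description>Example summary.</description></pbcoreDescription>
  <pbcoreCreator><creator>Example Producer</creator></pbcoreCreator>
  <pbcoreInstantiation><formatFileSize>1048576</formatFileSize></pbcoreInstantiation>
</PBCoreDescriptionDocument>"""


def m(tag):
    """Return a namespace-qualified MODS tag name."""
    return f"{{{MODS_NS}}}{tag}"


def pbcore_to_mods(pbcore_xml):
    """Copy title, creator, and description into a MODS record.

    Technical (instantiation) metadata is deliberately left behind.
    """
    src = ET.fromstring(pbcore_xml)
    mods = ET.Element(m("mods"))
    for title in src.findall("pbcoreTitle/title"):
        info = ET.SubElement(mods, m("titleInfo"))
        ET.SubElement(info, m("title")).text = title.text
    for creator in src.findall("pbcoreCreator/creator"):
        name = ET.SubElement(mods, m("name"))
        ET.SubElement(name, m("namePart")).text = creator.text
    for desc in src.findall("pbcoreDescription/description"):
        ET.SubElement(mods, m("abstract")).text = desc.text
    return mods


if __name__ == "__main__":
    print(ET.tostring(pbcore_to_mods(PBCORE_SAMPLE), encoding="unicode"))
```

Note that the instantiation block in the sample has no counterpart in the output, which is the behavior the slide describes.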
METS
• dmdSec
• amdSec: techMD, rightsMD, sourceMD, digiprovMD
• fileSec
• structMap
• behaviorSec
METS
• dmdSec
• amdSec: techMD, rightsMD
• fileSec
• structMap
AIP creation simplified
1. Content submitted, verified
2. METS automatically generated (checksums into METS attributes)
3. Source database exports automatically converted to PBCore
4. Technical metadata extracted from files using MediaInfo, converted to PBCore
5. MODS created from completed PBCore
6. Rights metadata (METSRights), preservation metadata (PREMIS) created
7. AIP complete
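Step 2 can be pictured with a small sketch that computes a checksum and stores it in the `CHECKSUM`, `CHECKSUMTYPE`, and `SIZE` attributes that METS defines on its `file` element. The function names, the file ID, and the choice of MD5 as the default are assumptions for this example; the slides do not say which algorithm the project used.

```python
import hashlib
import os
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# hashlib algorithm name -> METS CHECKSUMTYPE label
CHECKSUM_LABELS = {"md5": "MD5", "sha1": "SHA-1", "sha256": "SHA-256"}


def file_checksum(path, algorithm="md5", chunk_size=1024 * 1024):
    """Hash a file in chunks so large essence files do not fill memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mets_file_element(file_id, path, algorithm="md5"):
    """Build a mets:file element carrying size and checksum attributes."""
    return ET.Element(
        f"{{{METS_NS}}}file",
        {
            "ID": file_id,
            "SIZE": str(os.path.getsize(path)),
            "CHECKSUM": file_checksum(path, algorithm),
            "CHECKSUMTYPE": CHECKSUM_LABELS[algorithm],
        },
    )
```

Calling `mets_file_element("file-001", "/path/to/essence.mov")` (both arguments hypothetical) would return an element ready to be placed inside a `fileGrp` in the `fileSec`.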
AIPs:
AIP Class 1: Nationally distributed content (Nature)
AIP Class 4: Religion and Ethics
[Diagram labels. Essence file types: Production Master (mov), Production Master (mxf), HD Broadcast Master (mov/data), SD Broadcast Master (mov/aiff/m2v), SD Broadcast Master (mpeg). Metadata: METS, PBCore, PREMIS, METSRights, MODS. Additional items: scripts, etc. Original database exports.]
“Some file formats enable the capture of technical, and other, metadata within their files, which has the advantage of keeping the files self-descriptive. However, by extracting and storing metadata explicitly we may also benefit.”
- ISQ Special Issue: Digital Preservation, Spring 2010, p.11