Using Metadata Standards in Digital Libraries: Implementing METS, MODS, PREMIS and MIX: Introduction Rebecca Guenther Library of Congress LITA Standards IG Program, ALA Annual 2007

Page 1

Using Metadata Standards in Digital Libraries:
Implementing METS, MODS, PREMIS and MIX:
Introduction

Rebecca Guenther
Library of Congress

LITA Standards IG Program, ALA Annual 2007

Page 2

Program overview

• Introduction to METS, MODS, PREMIS and MIX (Guenther)
• Using METS and MODS for presentations of LC content (Cundiff, Trail)
• Using METS in special collections at CDL (Tingle)
• Creating rich shareable metadata: the DLF Aquifer MODS implementation guidelines (Shreeves)
• METS, MODS and PREMIS, Oh My!: Integrating digital library standards for interoperability and preservation (Habing)
• MODS as metadata Hub (Olson)

Page 3

Metadata standards in digital libraries

• XML is the de facto standard for metadata descriptions on the Internet
• Interoperability and object exchange require the use of established standards
• Many digital objects are complex and consist of multiple files
• Complex digital objects require many more forms of metadata than analog objects for their management and use:
– Descriptive
– Technical
– Digital provenance/events
– Structural
– Rights/terms and conditions

Page 4

Descriptive metadata: MARCXML

• Millions of rich descriptive records in MARC systems can be reused in an XML environment using MARCXML
• MARCXML uses the MARC data element set in an XML syntax
• Allows interoperability with other XML schemes by taking advantage of free XML tools
• Allows for collaborative use of metadata for access (e.g. OAI)
• Provides continuity with current data and flexible transition options

Page 5

MARC 21 evolution to XML

Page 6

MARCXML

• MARCXML record
– XML exact equivalent of MARC (ISO 2709) record
– Lossless/roundtrip conversion to/from MARC 21 record
– Simple flexible XML schema, no need to change when MARC 21 changes
– Presentations using XML stylesheets
– LC provides converters (open source)
• http://www.loc.gov/standards/marcxml
• Music record in MARCXML
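The shape of a MARCXML record can be sketched with Python's standard library. This is a minimal illustration, not LC's converter: the leader, control number, name and title values are invented, and the helper names `q`, `build_record` and `title_of` are hypothetical. Only the namespace and the `record`/`leader`/`controlfield`/`datafield`/`subfield` structure come from the MARCXML schema.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARC_NS)


def q(tag: str) -> str:
    """Qualify a tag name with the MARCXML namespace."""
    return f"{{{MARC_NS}}}{tag}"


def build_record() -> ET.Element:
    """Build a tiny record; all field values are invented sample data."""
    record = ET.Element(q("record"))
    # The leader is a fixed 24-character string; this one is a placeholder.
    ET.SubElement(record, q("leader")).text = "00000ncm a2200000 a 4500"
    # Control fields carry a tag and text only.
    ET.SubElement(record, q("controlfield"), tag="001").text = "sample0001"
    # Data fields carry a tag, two indicators, and coded subfields.
    name = ET.SubElement(record, q("datafield"), tag="100", ind1="1", ind2=" ")
    ET.SubElement(name, q("subfield"), code="a").text = "Composer, Example."
    title = ET.SubElement(record, q("datafield"), tag="245", ind1="1", ind2="0")
    ET.SubElement(title, q("subfield"), code="a").text = "Sample sonata"
    return record


def title_of(record: ET.Element) -> str:
    """Read field 245 subfield a back out of the record."""
    path = f"{q('datafield')}[@tag='245']/{q('subfield')}[@code='a']"
    return record.findtext(path, default="")


if __name__ == "__main__":
    print(ET.tostring(build_record(), encoding="unicode"))
```

Because every MARC tag is carried as an attribute value rather than an element name, the schema itself does not need to change when MARC 21 adds a field.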

Page 7

What is MODS?

• Metadata Object Description Schema
• An XML descriptive metadata standard
• A derivative of MARC
– Uses language-based tags
– Contains a subset of MARC data elements
– Repackages elements to eliminate redundancies
• MODS does not assume the use of any specific rules for description
• Element set is particularly applicable to digital resources

Page 8

Uses of MODS

• Extension schema to METS
– Rich description works well with hierarchical METS objects
• To represent metadata for harvesting (OAI)
– Language-based tags are more user friendly
• As a specified XML format for SRU
• As a core element set for convergence between MARC and non-MARC XML descriptions
• For original resource description in XML syntax that is simpler than full MARC
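As a rough sketch of the harvesting and SRU uses listed above, the requests that ask a server for MODS can be built as plain URLs. The base URLs are placeholders and the function names are hypothetical. The value `mods` for `metadataPrefix` and `recordSchema` is an assumption, since each server advertises the names it actually supports.

```python
from urllib.parse import urlencode


def list_records_url(base_url: str, metadata_prefix: str = "mods") -> str:
    """Build an OAI-PMH ListRecords request for one metadata format."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def sru_search_url(base_url: str, cql_query: str, record_schema: str = "mods") -> str:
    """Build an SRU searchRetrieve request that asks for records in one schema."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "recordSchema": record_schema,
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder endpoints; no network request is made here.
    print(list_records_url("https://repository.example.org/oai"))
    print(sru_search_url("https://catalog.example.org/sru", "sonata"))
```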

Page 9

MODS high-level elements

• Title Info
• Name
• Type of resource
• Genre
• Origin Info
• Language
• Physical description
• Abstract
• Table of contents
• Target audience
• Note
• Subject
• Classification
• Related item
• Identifier
• Location
• Access conditions
• Part
• Extension
• Record Info

Music record in MODS
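A few of the high-level elements can be assembled into a small MODS record as follows. This is a sketch under stated assumptions: the sample values are invented, the helper names `m` and `build_mods` are hypothetical, and only four of the twenty top-level elements are shown.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)


def m(tag: str) -> str:
    """Qualify a tag name with the MODS namespace."""
    return f"{{{MODS_NS}}}{tag}"


def build_mods(title: str, composer: str, year: str) -> ET.Element:
    """Build a small MODS description using language-based tags."""
    mods = ET.Element(m("mods"))
    title_info = ET.SubElement(mods, m("titleInfo"))
    ET.SubElement(title_info, m("title")).text = title
    name = ET.SubElement(mods, m("name"), type="personal")
    ET.SubElement(name, m("namePart")).text = composer
    # "notated music" is one of the controlled typeOfResource values.
    ET.SubElement(mods, m("typeOfResource")).text = "notated music"
    origin = ET.SubElement(mods, m("originInfo"))
    ET.SubElement(origin, m("dateIssued")).text = year
    return mods


if __name__ == "__main__":
    record = build_mods("Sample sonata", "Composer, Example", "2007")
    print(ET.tostring(record, encoding="unicode"))
```

Compared with the numeric tags of MARCXML, the element names here can be read without a tag table, which is the point of the "language-based tags" design.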

Page 10

MODS Development

• Developed in 2002 through open listserv discussion among possible implementers (LC coordinated)
• Version 1 in late 2002; now in version 3.2 with 3.3 almost complete
• Companion for authority metadata (MADS) in version 1.0 (2005)
• Endorsed as METS extension schema for descriptive metadata section
• Registered with NISO
• Widely used in digital library projects
• MODS Implementation registry: http://www.loc.gov/mods/registry.php

Page 11

What is METS?

• METS records the (possibly hierarchical) structure of digital objects, the names and locations of the files that comprise those objects, and the associated metadata
• A container for metadata and file pointers
• A METS document may be a unit of storage or a transmission format
• METS is extensible and modular, using “wrappers” or “sockets” where elements from other schemas can be plugged in
• METS uses the XML Schema facility for combining vocabularies from different namespaces
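The container idea can be sketched as a small METS document: a MODS title plugged into an `mdWrap` socket, a file section with pointers, and a structural map that orders the files. The function and helper names, the sample title and the file URLs are hypothetical. A production document would normally also carry a header and administrative metadata.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
MODS_NS = "http://www.loc.gov/mods/v3"
XLINK_NS = "http://www.w3.org/1999/xlink"
for prefix, uri in (("mets", METS_NS), ("mods", MODS_NS), ("xlink", XLINK_NS)):
    ET.register_namespace(prefix, uri)


def mets_tag(tag: str) -> str:
    return f"{{{METS_NS}}}{tag}"


def mods_tag(tag: str) -> str:
    return f"{{{MODS_NS}}}{tag}"


def build_mets(title: str, file_urls: list[str]) -> ET.Element:
    mets = ET.Element(mets_tag("mets"))

    # Descriptive metadata: a MODS fragment plugged into the mdWrap socket.
    dmd = ET.SubElement(mets, mets_tag("dmdSec"), ID="DMD1")
    wrap = ET.SubElement(dmd, mets_tag("mdWrap"), MDTYPE="MODS")
    xml_data = ET.SubElement(wrap, mets_tag("xmlData"))
    mods = ET.SubElement(xml_data, mods_tag("mods"))
    title_info = ET.SubElement(mods, mods_tag("titleInfo"))
    ET.SubElement(title_info, mods_tag("title")).text = title

    # File section (names and locations) and structural map (hierarchy).
    file_sec = ET.SubElement(mets, mets_tag("fileSec"))
    file_grp = ET.SubElement(file_sec, mets_tag("fileGrp"))
    struct_map = ET.SubElement(mets, mets_tag("structMap"), TYPE="physical")
    top = ET.SubElement(struct_map, mets_tag("div"), TYPE="item", DMDID="DMD1")

    for order, url in enumerate(file_urls, start=1):
        file_id = f"FILE{order}"
        file_el = ET.SubElement(file_grp, mets_tag("file"), ID=file_id)
        ET.SubElement(
            file_el,
            mets_tag("FLocat"),
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": url},
        )
        # Each page div points back at its file by ID.
        page = ET.SubElement(top, mets_tag("div"), TYPE="page", ORDER=str(order))
        ET.SubElement(page, mets_tag("fptr"), FILEID=file_id)
    return mets


if __name__ == "__main__":
    doc = build_mets("Sample score", ["https://example.org/p1.tif"])
    print(ET.tostring(doc, encoding="unicode"))
```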

Page 12

What is PREMIS?

• A data dictionary for metadata to support the long-term preservation of digital objects
– A piece of the necessary infrastructure for implementing reliable, sustainable preservation programs
• A supporting set of XML schemas for implementation in a variety of contexts
• A maintenance activity hosted at LC including an Implementers’ Group and Editorial Committee

Page 13

What is preservation metadata?

• Provenance:
– Who has had custody/ownership of the digital object?
• Authenticity:
– Is the digital object what it purports to be?
• Preservation Activity:
– What has been done to preserve the digital object?
• Technical Environment:
– What is needed to render and use the digital object?
• Rights Management:
– What IPR must be observed?

Makes digital objects self-documenting across time

[Figure: content packaged with preservation metadata, shown 10 years on, 50 years on, and "Forever!"]

Page 14

Guiding principles and assumptions …

• “Implementable, core, preservation metadata”:
– “Preservation metadata”: maintain viability, renderability, understandability, authenticity, identity in a preservation context
– “Core”: What most preservation repositories need to know to preserve digital materials over the long-term
– “Implementable”: rigorously defined; supported by usage guidelines/recommendations; emphasis on automated workflows
• Implementation neutral:
– No assumptions on specific implementation
– Promote flexibility/interoperability
– Focus on semantic units: what you need to know (implementation-neutral) vs. metadata elements: how you record it (implementation-specific)
– Information that needs to be “recoverable” from the digital archiving system, independent of local implementation

Page 15

Scope

• What PREMIS is:
– Common data model for organizing/thinking about preservation metadata
– Guidance for local implementations
– Standard for exchanging information packages between repositories
• What PREMIS is not:
– Out-of-the-box solution: need to instantiate as metadata elements in repository system
– All needed metadata: excludes business rules, format-specific technical metadata, descriptive metadata for access, non-core preservation metadata
– Lifecycle management of objects outside repository
– Rights management: limited to permissions regarding actions taken within repository

Page 16

PREMIS data model

[Diagram of the five entity types: Intellectual Entities, Objects, Rights, Agents, Events]

Page 17

Semantic units pertaining to objects: technical metadata

• objectIdentifier
• preservationLevel
• objectCategory
• objectCharacteristics
• creatingApplication
• originalName
• storage
• environment
• signatureInformation
• relationship
• linkingEventIdentifier
• linkingIntellectualEntityIdentifier
• linkingPermissionStatementIdentifier

Page 18

Semantic units pertaining to Events: provenance and preservation activity

• eventIdentifier
• eventType
• eventDateTime
• eventDetail
• eventOutcome
• eventOutcomeDetail
• linkingAgentIdentifier
• linkingObjectIdentifier
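Since PREMIS defines semantic units rather than a required serialization, one possible local implementation is a plain record type. The sketch below is illustrative only: the field names follow the semantic units listed above, but the `Event` class, the `events_for_object` helper, and the sample event types and identifiers are all assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Event:
    """One preservation event, with fields named after PREMIS semantic units."""
    event_identifier: str
    event_type: str
    event_date_time: datetime
    event_detail: str = ""
    event_outcome: str = ""
    event_outcome_detail: str = ""
    linking_agent_identifiers: list[str] = field(default_factory=list)
    linking_object_identifiers: list[str] = field(default_factory=list)


def events_for_object(events: list[Event], object_id: str) -> list[Event]:
    """Return the digital provenance of one object, oldest event first."""
    linked = [e for e in events if object_id in e.linking_object_identifiers]
    return sorted(linked, key=lambda e: e.event_date_time)


if __name__ == "__main__":
    # Invented sample events for a hypothetical object "OBJ1".
    log = [
        Event("E2", "migration", datetime(2007, 6, 1, tzinfo=timezone.utc),
              event_outcome="success", linking_object_identifiers=["OBJ1"]),
        Event("E1", "ingestion", datetime(2007, 1, 15, tzinfo=timezone.utc),
              event_outcome="success", linking_object_identifiers=["OBJ1"]),
    ]
    for event in events_for_object(log, "OBJ1"):
        print(event.event_date_time.date(), event.event_type, event.event_outcome)
```

The linking identifiers are what tie the Events entity to Objects and Agents in the data model. The event itself stores no copy of the object.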

Page 19

Semantic units pertaining to Rights: terms and conditions

• permissionStatement
– permissionStatementIdentifier
– relatedObject
– grantingAgent
– grantingAgreement
– permissionGranted: act, restriction, termOfGrant, permissionNote

Page 20

Semantic units pertaining to Agents

• agentIdentifier
• agentName
• agentType

Page 21

PREMIS maintenance activities

• First revision of Data Dictionary (PREMIS 2.0)
– Documenting errata and proposed revisions to Data Dictionary (feedback through PIG list)
– http://www.loc.gov/standards/premis/changes.html
• PREMIS Implementers’ Registry
– http://www.loc.gov/standards/premis/premis-registry.html
• Consultancies (funded by Library of Congress):
– Rights issues for digital preservation (Karen Coyle)
– PREMIS implementation guidelines and recommendations (Deborah Woodyard-Robinson)
• PREMIS Tutorials:
– Glasgow, Boston, Stockholm, Albuquerque, Washington

Page 22

What is MIX?

• Metadata for Images in XML
• An XML Schema designed for expressing technical metadata for digital still images
• Based on the NISO Z39.87 Data Dictionary – Technical Metadata for Digital Still Images
• Used to express attributes of digital images such as file format, file size, dimensions, resolution, compression, etc.
• Version 1.0 (recently released) includes support for GIS images and JPEG 2000 images; data element names harmonized with PREMIS
• Can be used standalone or as an extension schema with METS
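The kind of technical attributes MIX records can be gathered automatically from the file itself. The sketch below reads them from a PNG header using only the standard library. The dictionary keys are illustrative names loosely modeled on the attributes listed above and should be checked against the MIX schema before use. The `tiny_png` helper exists only to produce sample input.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def describe_png(data: bytes) -> dict:
    """Extract basic technical metadata from the IHDR chunk of a PNG."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # The first chunk is IHDR: 4-byte length, 4-byte type, then the payload.
    _length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR":
        raise ValueError("missing IHDR chunk")
    width, height, bit_depth, _color_type = struct.unpack(">IIBB", data[16:26])
    return {
        "formatName": "image/png",   # illustrative key names, see lead-in
        "fileSize": len(data),
        "imageWidth": width,
        "imageHeight": height,
        "bitsPerSample": bit_depth,
    }


def _chunk(kind: bytes, payload: bytes) -> bytes:
    crc = zlib.crc32(kind + payload)
    return struct.pack(">I", len(payload)) + kind + payload + struct.pack(">I", crc)


def tiny_png(width: int, height: int) -> bytes:
    """Build a small all-black 8-bit grayscale PNG as sample input."""
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    rows = b"".join(b"\x00" + b"\x00" * width for _ in range(height))
    return (PNG_SIGNATURE + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", zlib.compress(rows)) + _chunk(b"IEND", b""))


if __name__ == "__main__":
    print(describe_png(tiny_png(3, 2)))
```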

Page 23

How do these standards work together for digital libraries?

• A container format such as METS allows for packaging together forms of metadata with objects or pointers to objects
• There are about 5 years of experience in experimenting with METS in combination with other standards for managing and using digital objects in digital libraries
• These standards are all freely available
• METS profiles detail how METS is used for particular object types or applications
• Best practices are needed (and being developed) for use of PREMIS with METS and MIX
• Using METS, MODS, PREMIS and MIX: http://www.loc.gov/premis/louis.xml
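To see which standards a given METS package combines, the metadata sections can be walked and their declared types listed. The sketch runs against a small inline sample rather than the document linked above. The sample content and the function name are hypothetical, and `NISOIMG` is the METS type label assumed here for MIX data.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Invented sample: empty xmlData sockets stand in for real MODS, MIX and PREMIS.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:dmdSec ID="DMD1">
    <mets:mdWrap MDTYPE="MODS"><mets:xmlData/></mets:mdWrap>
  </mets:dmdSec>
  <mets:amdSec ID="AMD1">
    <mets:techMD ID="TECH1">
      <mets:mdWrap MDTYPE="NISOIMG"><mets:xmlData/></mets:mdWrap>
    </mets:techMD>
    <mets:digiprovMD ID="DP1">
      <mets:mdWrap MDTYPE="PREMIS"><mets:xmlData/></mets:mdWrap>
    </mets:digiprovMD>
  </mets:amdSec>
</mets:mets>"""


def metadata_types(mets_root: ET.Element) -> dict:
    """Map each METS section name to the MDTYPE values wrapped directly in it."""
    found = {}
    for section in mets_root.iter():
        for wrap in section.findall(f"{{{METS_NS}}}mdWrap"):
            local_name = section.tag.split("}", 1)[-1]
            found.setdefault(local_name, []).append(wrap.get("MDTYPE"))
    return found


if __name__ == "__main__":
    for section, types in metadata_types(ET.fromstring(SAMPLE)).items():
        print(section, types)
```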
