Metadata for Digital Preservation: A Status Report on PREMIS Priscilla Caplan, FCLA Nancy Hoebelheinrich, Stanford University CNI Fall Task Force Meeting December 6-7, 2004


Metadata for Digital Preservation: A Status Report on PREMIS

Priscilla Caplan, FCLA
Nancy Hoebelheinrich, Stanford University

CNI Fall Task Force Meeting
December 6-7, 2004

CNI Fall 2004 Metadata for Digital Preservation

Preservation Metadata: Implementation Strategies

OCLC/RLG Preservation Metadata Framework Working Group

OCLC/RLG Preservation Metadata Working Group
• Convened March 2000
• Looked at CEDARS, NLA, NEDLIB, OCLC

Preservation metadata framework (June 2002)
• Synthesized elements from existing sets
• Based on OAIS information model
• Set of “prototype” preservation metadata elements

PREMIS

June 2003: OCLC/RLG sponsored new working group: PREMIS
• Preservation Metadata: Implementation Strategies

Objectives
• Define “core” set of preservation metadata elements, with supporting data dictionary, applicable to broad range of digital preservation activities
• Identify and evaluate alternative strategies for encoding, storing, managing, and exchanging preservation metadata

http://www.oclc.org/research/projects/pmwg/

Membership

Priscilla Caplan, FCLA (Chair)
Rebecca Guenther, LC (Chair)
Michael Alexander, British Library
George Barnum, GPO
Charles Blair, U. of Chicago
Olaf Brandt, U. of Gottingen
Adam Farquhar, British Library
David Gewirtz, Yale
Kevin Glavash, MIT/DSpace
Cathy Hartman, U. of N. Texas
Helen Hodgart, British Library
Nancy Hoebelheinrich, Stanford
Roger Howard/Sally Hubbard, Getty Museum
Pam Kircher, OCLC
John Kunze, Calif. Digital Library
Brian Lavoie, OCLC liaison
Robin Dale, RLG liaison
Vicky McCarger, LA Times
Jerry McDonough, NYU/METS
Evan Owens, JSTOR
Erin Rhodes, NARA
Madi Solomon, Walt Disney Co.
Angela Spinazze, ATSPIN
Stefan Strathmann, U. of Gottingen
Gunter Waibel, RLG
Lisa Weber, NARA
Robin Wendler, Harvard
Hilde van Wijngaarden, KB
Andrew Wilson, NAA

Advisory Committee

Howard Besser, UCLA
Liz Bishoff, OCLC (via Colorado Digitization Program)
Gerard Clifton, National Library of Australia
Gail Hodge, CENDI
Steve Knight, National Library of New Zealand
Maggie Jones, Digital Preservation Coalition
Nancy McGovern, Cornell
Cliff Morgan, Wiley UK
Richard Rinehart, U. of California, Berkeley

Implementation Survey Report

State of the art in Winter 2003/2004
28 libraries, 7 archives, 3 museums, and 11 other
13 different countries; 45% from U.S.
38% in planning; 33% development; 46% production

Core Elements

Mission: Define a core set of implementable preservation metadata elements.

• Information that supports and documents the digital preservation process;

• Information that supports the viability, renderability, understandability, identity and authenticity of digital objects over time.

• What most working preservation repositories are likely to need to know.

• As rigorous as possible
• As much explanation as possible
• Implementation neutral -- “This is what you have to know”
• Values can be automatically supplied and processed -- no lengthy textual descriptions

Core Elements: Data Model

Sample data dictionary entry

Semantic unit: size
Semantic components: None
Definition: The size of a file or bitstream in bytes.
Rationale: Size is useful for knowing whether you have retrieved the correct number of bytes from storage and whether an application has enough room to move or process files. It might also be used when billing for storage.
Data constraint: Integer

Level
• Representation: Not applicable
• File: Applicable; Not repeatable; Optional
• Bitstream: Applicable; Not repeatable; Optional

Examples: 2038927
Notes: May be repeated for embedded files.
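
The first use named in the rationale, checking that the correct number of bytes came back from storage, can be sketched in a few lines of Python. This is only an illustration: the function name retrieved_size_matches and the sample payload are hypothetical and are not part of the data dictionary.

```python
def retrieved_size_matches(data: bytes, recorded_size: int) -> bool:
    """Compare the number of bytes retrieved with the recorded size value."""
    # The data constraint for size is Integer; a byte count cannot be negative.
    if recorded_size < 0:
        raise ValueError("size must be a non-negative integer number of bytes")
    return len(data) == recorded_size


# Hypothetical payload standing in for a file read back from storage.
payload = b"hello"
print(retrieved_size_matches(payload, 5))        # sizes agree
print(retrieved_size_matches(payload, 2038927))  # sizes disagree
```

A size match is only a coarse check; it says nothing about whether the bytes themselves are unchanged.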

The evolution of a semantic unit: Format

What is a format?
What types of objects have format?
Is there a usable authority list of formats?
Is there a difference between a format and a profile?
Shouldn’t we plan for format registries?

First try

Format
• formatName
• formatScheme

Second try

formatName
• formatNameValue
• formatVersion

formatRegistry
• formatRegistryEntry
• formatRegistryKey

Third try

formatName
• formatNameValue
• formatVersion

formatRegistry
• formatRegistryIdentifier
  • formatRegistryIdentifierScheme
  • formatRegistryIdentifierValue
• formatRegistryName
• formatRegistryEntry

Fourth try

Format (Required, not repeatable)
• formatName (Optional, repeatable)
  • formatNameValue
  • formatNameVersion
  • formatNameRole
• formatRegistry (Optional, repeatable)
  • formatRegistryIdentifier
  • formatRegistryName
  • formatRegistryEntry
  • formatRegistryRole

Current draft

Format (Required, Not Repeatable)
• formatName (Optional, Not repeatable)
  • formatNameValue
  • formatVersion
• formatRegistry (Optional, Repeatable)
  • formatRegistryName
  • formatRegistryKey
  • formatRegistryRole
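
One way to picture the current draft is as a small set of Python dataclasses. This is a rough sketch rather than a PREMIS encoding: the class and field names are hypothetical snake_case renderings of the semantic units, the sample values (including the registry name and key) are invented, and the is_identified check is an assumption that goes beyond what the draft states.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FormatName:
    format_name_value: str
    format_version: Optional[str] = None


@dataclass
class FormatRegistry:
    format_registry_name: str
    format_registry_key: str
    format_registry_role: Optional[str] = None


@dataclass
class Format:
    # formatName: Optional, Not repeatable -> at most one value.
    format_name: Optional[FormatName] = None
    # formatRegistry: Optional, Repeatable -> zero or more values.
    format_registry: List[FormatRegistry] = field(default_factory=list)

    def is_identified(self) -> bool:
        """Assumption, not stated in the draft: a usable Format carries
        a name, a registry reference, or both."""
        return self.format_name is not None or bool(self.format_registry)


# Invented sample values for illustration only.
fmt = Format(
    format_name=FormatName("PDF", "1.4"),
    format_registry=[FormatRegistry("ExampleRegistry", "fmt-0001", "specification")],
)
print(fmt.is_identified())
print(Format().is_identified())
```

Modelling the repeatable unit as a list and the non-repeatable unit as a single optional value keeps the repeatability rules visible in the types.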

Semantic units pertaining to Objects

objectIdentifier
contentLocation
originalName
preservationLevel
objectCharacteristics
environment

objectCharacteristics

compositionLevel
fixity
size
format
inhibitors
significantProperties
creatingApplication

Semantic units pertaining to Events

eventIdentifier
• eventIdentifierScheme
• eventIdentifierValue

eventType
eventOutcome
eventOutcomeDetail
eventDetail
eventDateTime
relatedPermission
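
As an illustration of how these semantic units might travel together, the sketch below records a single event and flattens it to name/value pairs. Everything beyond the semantic unit names is an assumption: the slide does not give obligations, so which fields are required here is a guess, and the class names, helper function, and sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class EventIdentifier:
    scheme: str
    value: str


@dataclass
class Event:
    # Treated as required in this sketch (an assumption).
    event_identifier: EventIdentifier
    event_type: str
    event_date_time: str
    # Treated as optional in this sketch (also an assumption).
    event_outcome: Optional[str] = None
    event_outcome_detail: Optional[str] = None
    event_detail: Optional[str] = None
    related_permission: Optional[str] = None


def to_semantic_units(event: Event) -> Dict[str, str]:
    """Flatten an Event to semantic unit names, skipping unset values."""
    units = {
        "eventIdentifierScheme": event.event_identifier.scheme,
        "eventIdentifierValue": event.event_identifier.value,
        "eventType": event.event_type,
        "eventDateTime": event.event_date_time,
        "eventOutcome": event.event_outcome,
        "eventOutcomeDetail": event.event_outcome_detail,
        "eventDetail": event.event_detail,
        "relatedPermission": event.related_permission,
    }
    return {name: value for name, value in units.items() if value is not None}


# Invented sample event for illustration only.
event = Event(
    event_identifier=EventIdentifier("local", "event-0001"),
    event_type="migration",
    event_date_time="2004-12-06T10:00:00",
    event_outcome="success",
)
print(to_semantic_units(event))
```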

Semantic units pertaining to Agents

agentIdentifier
• agentIdentifierScheme
• agentIdentifierValue

agentName

Semantic units pertaining to Rights

permissionStatement
relatedObject
grantingAgent
grantingAgreement
permission
• act
• restriction
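
A repository could use these units to answer "may we do this, and under what conditions?" The sketch below is one possible reading, not the PREMIS model: it assumes that a permissionStatement groups relatedObject, grantingAgent, grantingAgreement and one or more permission entries, and that each permission pairs an act with its restrictions. The class names, method name, and sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Permission:
    act: str
    restriction: List[str] = field(default_factory=list)


@dataclass
class PermissionStatement:
    # Assumed grouping: the slide lists these units without showing nesting.
    related_object: str
    granting_agent: str
    granting_agreement: str
    permission: List[Permission] = field(default_factory=list)

    def restrictions_for(self, act: str) -> Optional[List[str]]:
        """Return the restrictions on a granted act, or None if not granted."""
        for granted in self.permission:
            if granted.act == act:
                return granted.restriction
        return None


# Invented sample statement for illustration only.
statement = PermissionStatement(
    related_object="object-0001",
    granting_agent="Example Depositor",
    granting_agreement="deposit-agreement-0001",
    permission=[
        Permission(act="copy", restriction=["no more than 3 copies"]),
        Permission(act="migrate"),
    ],
)
print(statement.restrictions_for("copy"))
print(statement.restrictions_for("delete"))
```

Returning None for an act that was never granted, and an empty list for an act granted without restrictions, keeps "not permitted" distinct from "permitted unconditionally".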

The context for rights statements

Two approaches addressed:

• (Preservation) rights metadata

• Formal agreements between depositors and digital archives / repositories

(Preservation) rights metadata

Documentary role by conveying
• What is allowed
• Record of changes made to content for preservation purposes

Predicated on getting the metadata in the first place
• From whom? (integrity of data)
• Business reason to provide?

At present, few satisfactory means for managing preservation, intellectual property (rightsholders), or authorized uses of content (DRM tools)

C. Ayre, “The right to preserve: the rights issues of digital preservation”, D-Lib Magazine, March 2004.

Focus upon permissions

C. Ayre article:
• Description of preservation strategies / copying requirements, e.g.,
  • From: Refreshing bits & media migration – for purposes of overcoming storage media deterioration or obsolescence – by periodic copying of bitstreams from one physical medium to another
  • To: Re-creation of content – for purposes of overcoming both hardware & software obsolescence – by re-keying data, reverse engineering original software & re-creating or creating a new software environment

Use Cases

• Use cases gathered from active preservation repositories
• Necessary to allow:
  • Various kinds of copying
  • Retention of copies
  • Modification of the original material
  • Adaptation to new technologies
  • Transfer of all these permissions to another party
  • Withdrawing / deleting content
• Provide for various kinds of restrictions
  • Time, number related
  • Attribution
  • Format or quality
  • Purpose for actions to be taken

Formal agreements between depositors and digital archives / repositories

Acquisition of a work for a digital archive
• Copies received through mandatory deposit (LC)
• Copies obtained by gift or purchase
• Copies obtained through subscription or license
• Copies made or received under agreements with copyright owners

J.M. Besek, “Copyright issues relevant to the creation of a digital archive: a preliminary assessment”, 2002.

Comparison of sample agreements

14 out of 49 respondents contributed agreements
Analyzed to determine how agreements treated:
• Rights granted, expressly
• Restrictions and conditions upon the user / uses of the materials submitted
• Repository permissions and actions expressly allowed (not covered herein)
• Warranties made

Rights granted, expressly to repository

Type of Library: Rights granted

Governmental agency
• Preserve and make accessible
• Archive, distribute and use

National library (by legal deposit)
• Retain in the archive and, subject to negotiated access conditions, provide public access to it in perpetuity
• Take preservation action necessary to keep publication accessible as hardware and software changes

State archive
• Non-exclusive, non-transferable and non-assignable right to make use of services of the Archive (not including access)

Open source repository system
• Non-exclusive right to reproduce, translate and/or distribute data worldwide in print & e-format & in any medium (incl abstract)

Rights granted, expressly to repository

Type of Library: Rights granted

Governmental agency
• Catalog, enhance, validate and document the data
• Distribute copies of the data in a variety of formats
• Incorporate metadata or documentation into public access catalogs

University Library
• Right to publish & continually an author’s work in digital format on the university website
• Transfer, in whole or in part, the rights & obligations included in the Agreement to a 3rd party

Restrictions upon the user / uses

Type of Library: Restrictions

Governmental agencies, private archive, open source repository system
• Non-commercial, research / educational purposes only

Governmental agencies, private archive
• Must be “authorised” user (e.g., subscribed, agreeing to license conditions, registered)

National Library (legal deposit)
• Upon commercial publications, access limited to physical premises of the Library on a single computer with copying & communication functions disabled (during time in which pub is commercially viable)

Fee for service archive
• Must not breach rightsholder’s copyright by selling all or part of data, or including in product which is sold

Conditions upon the user / uses

Type of Library: Conditions

Governmental agencies, private archive
• User must acknowledge or attribute rightsholder upon future publications based on use of the data
• State to users of the data that rightsholder is not responsible for the quality of the work users produce
• Deposit with the Archive copies of any published work based in whole or in part under negotiated conditions of use

Private archive
• Keep list of all persons to whom access of data has been given & supply to Archive Director when asked

Governmental agencies, National library / archive
• Adhere to privacy provisions, as applicable

Fee for service archive
• Attach notice of restrictions when making data available to end-users

Warranties made

Type of Library: Warranties

All types
• Depositor is copyright holder or authorized by

National library
• Will provide for the permanent storage and maintenance of data in a form that will provide security to data integrity and usability
• Will maintain the content of the data, not the format or functionality of the content
• Explicitly assumes the role of an official, non-exclusive archival agent

Governmental agencies, National library (federal deposit)
• Not warranted to be suitable for use
• Contributors to documents being archived have been notified of deposit into archive and agree

Range of approaches based on risk assessment

Implicit understanding: if you deposit it, we will preserve (safest for legal depository institutions?)

How dark is your archive?

It’s all stored in the agreement, see it here…

“Oh, sorry – we’ll just take it down” – asking for forgiveness (not permission)

Next steps:

PREMIS ACTIVITIES
• Complete data dictionary (January 2005)
• Write narrative report
• Develop XML schemas for exchanging metadata

FOLLOW-UP ACTIVITIES
• Community outreach
• Establish feedback/maintenance mechanism
• Testbeds for implementation and exchange

For More Information:

PREMIS Web Site
• www.oclc.org/research/projects/pmwg

“Implementing Metadata in Digital Preservation Systems: The PREMIS Activity” D-Lib (April ‘04)
• www.dlib.org/dlib/april04/lavoie/04lavoie.html

RLG DigiNews October 2004 and December 2004 issues
• www.rlg.org/en/page.php?Page_ID=12081

Priscilla Caplan: [email protected]

Rebecca Guenther: [email protected]