METS: implementing a metadata standard in the digital library
METS: Metadata Encoding and Transmission Standard
Richard Gartner, Oxford University Library Services
The digital library: a status report
Digitisation technology now well established and well-understood
Standards for digitisation processes have settled down and are widely recognised
Still a disparity in approaches to metadata - no MARC standard for digital library
Approaches to metadata - some examples from Oxford
Ad-hoc databases: e.g. Allegro
SGML, including: TEI alone; TEI + EAD; ad-hoc DTDs
Proprietary databases: e.g. Olive
The lack of a standard: what it means for the digital library
poor cross-searching
limited interchange facilities
metadata tied to proprietary packages
consequent obsolescence and costs of conversion
What is needed?
A standard for metadata content : analogous to AACR2
A standardised framework for holding and exchanging metadata : analogous to the MARC record
Three types of metadata (defined by DLF)
Descriptive: information about intellectual content (analogous to standard catalogue record)
Administrative: information for handling, maintenance and archiving of object
Structural: description of internal structure of object
METS: Metadata Encoding and Transmission Standard
Produced by the Library of Congress Standards Office and the Digital Library Federation
Provides a framework for holding all types of metadata for a digital object
Written in XML
Does not prescribe the content of metadata, but recommends a number of schemes for this
Why XML?
An ISO standard, not dependent on any given application
Interchangeability with other applications
Handles structural metadata easily
Easy to integrate cataloguing information with text transcription, images etc.
Features of a METS file
All metadata (descriptive, administrative and structural) is encoded in a single document
Each type is held in a separate section, linked by identifiers
All metadata and external data (e.g. images, text, video) is either referenced from the METS file or can be held internally
The structure of a METS file
METS
dmdSec: descriptive metadata
amdSec: administrative metadata
behaviorSec: behaviour metadata
structMap: structural map
fileSec: file inventory
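As a rough illustration of how these sections sit together in one document, the following is a minimal sketch rather than a complete record. All IDs, the OTHERMDTYPE value and the example.org URLs are hypothetical placeholders, the optional behaviorSec is omitted, and the sections are given in the order the METS schema expects (dmdSec, amdSec, fileSec, structMap).

```xml
<mets xmlns="http://www.loc.gov/METS/"
      xmlns:xlink="http://www.w3.org/1999/xlink">

  <!-- Descriptive metadata: here referenced in an external file (hypothetical URL) -->
  <dmdSec ID="example-dmd-1">
    <mdRef MDTYPE="DC" LOCTYPE="URL"
           xlink:href="http://example.org/metadata/dc.xml"/>
  </dmdSec>

  <!-- Administrative metadata: one technical metadata record (hypothetical scheme name) -->
  <amdSec ID="example-amd-1">
    <techMD ID="example-tmd-1">
      <mdRef MDTYPE="OTHER" OTHERMDTYPE="Example Technical Scheme" LOCTYPE="URL"
             xlink:href="http://example.org/metadata/tech.xml"/>
    </techMD>
  </amdSec>

  <!-- File inventory: ADMID links the file to its technical metadata -->
  <fileSec>
    <fileGrp ID="example-fgrp-1">
      <file ID="example-file-1" MIMETYPE="image/jpeg" ADMID="example-tmd-1">
        <FLocat LOCTYPE="URL" xlink:href="http://example.org/images/page1.jpg"/>
      </file>
    </fileGrp>
  </fileSec>

  <!-- Structural map: DMDID links the div to its description, FILEID to its file -->
  <structMap>
    <div LABEL="Page 1" DMDID="example-dmd-1">
      <fptr FILEID="example-file-1"/>
    </div>
  </structMap>

</mets>
```

The ID, ADMID, DMDID and FILEID attributes are what tie the separate sections together.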
Title Page: title page
Preface: page i, page ii
Chapter 1: page 1, page 2, page 3, page 4, page 5, page 6
Chapter 2: page 7, page 8, page 9
<div LABEL="Title Page">
<div LABEL="Preface">
<div LABEL="Chapter 1">
<div LABEL="Chapter 2">
<div LABEL="Page 1"> <div LABEL="Page 2"> <div LABEL="Page 3">
<fptr FILEID="xxx"/>
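The book hierarchy above might be written out as nested div elements along the following lines. This is a sketch under assumptions: the outer "Book" div is added only because a structMap takes a single root div, the FILEID values are hypothetical, and only the first pages of each part are shown.

```xml
<structMap xmlns="http://www.loc.gov/METS/">
  <!-- Hypothetical root div wrapping the whole item -->
  <div LABEL="Book">
    <div LABEL="Title Page">
      <div LABEL="title page"><fptr FILEID="example-file-tp"/></div>
    </div>
    <div LABEL="Preface">
      <div LABEL="Page i"><fptr FILEID="example-file-i"/></div>
      <div LABEL="Page ii"><fptr FILEID="example-file-ii"/></div>
    </div>
    <div LABEL="Chapter 1">
      <div LABEL="Page 1"><fptr FILEID="example-file-001"/></div>
      <div LABEL="Page 2"><fptr FILEID="example-file-002"/></div>
      <div LABEL="Page 3"><fptr FILEID="example-file-003"/></div>
      <!-- pages 4 to 6 would follow the same pattern -->
    </div>
    <div LABEL="Chapter 2">
      <div LABEL="Page 7"><fptr FILEID="example-file-007"/></div>
      <!-- pages 8 and 9 would follow the same pattern -->
    </div>
  </div>
</structMap>
```

Each fptr carries only an identifier; the location of the file itself is recorded once, in the file inventory.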
The structure of a METS file: structMap
structMap > div > div > fptr (each fptr points into the fileSec)
<structMap>
  <div ID="munahi010-aaa-div.1" LABEL="Section 1">
    <div ID="munahi010-aaa-div.1.1" LABEL="Plate 1: Panorama des glaciers du Mont Rose, partie orientale de la chaine">
      <fptr FILEID="munahi010-aaa-fgrp-0001"/>
    </div>
  </div>
</structMap>
The structure of a METS file: fileSec
Page 1: image 1 (thumbnail), image 1 (master), image 1 (delivery)
<fileGrp ID="xxx">
  <file GROUPID="6"> <FLocat href="...."/> </file>
  <file GROUPID="0"> <FLocat href="...."/> </file>
  <file GROUPID="3"> <FLocat href="...."/> </file>
</fileGrp>
The structure of a METS file: fileSec
fileSec > fileGrp > file, file, file
<fileGrp ID="munahi010-aaa-fgrp-0001">
  <file GROUPID="0" ID="munahi010-aaa-0001-0" MIMETYPE="image/tiff" ADMID="munahi010-aaa-tmd-0001-0">
    <FLocat LOCTYPE="URL" xlink:href="file://hfs.ox.ac.uk/data/odl/munahi010/digObjects/aaa/0/munahi010-aaa-0001.tiff"/>
  </file>
  <file GROUPID="6" ID="munahi010-aaa-0001-6" MIMETYPE="image/jpeg" ADMID="munahi010-aaa-tmd-0001-6">
    <FLocat LOCTYPE="URL" xlink:href="http:odl/munahi010/digObjects/aaa/6/munahi010-aaa-0001-6.jpg"/>
  </file>
  <file GROUPID="3" ID="munahi010-aaa-0001-3" MIMETYPE="image/jpeg" ADMID="munahi010-aaa-tmd-0001-3">
    <FLocat LOCTYPE="URL" xlink:href="http:odl/munahi010/digObjects/aaa/3/munahi010-aaa-0001-3.jpg"/>
  </file>
</fileGrp>
Descriptive and administrative metadata
Descriptive and administrative metadata may be handled in two ways:
embedded directly within the METS file, inside an <mdWrap> element (using any namespace)
held in an external file and referenced from the METS file using an <mdRef> element
The structure of a METS file: embedded metadata (mdWrap)
dmdSec or amdSec > mdWrap > any XML metadata
<dmdSec ID="munahi010-aaa-dmd-0001">
  <mdWrap MIMETYPE="text/xml" MDTYPE="DC" LABEL="Dublin Core Metadata">
    <xmlData>
      <odldc:title.main>Études sur les glaciers</odldc:title.main>
      <odldc:title.supplied>atlas</odldc:title.supplied>
      <odldc:creator.aut>Agassiz, Louis, 1807-1873 </odldc:creator.aut>
      <odldc:subject.lcsh>Glaciers</odldc:subject.lcsh>
      <odldc:description.summary>Plates accompanying a study of glaciers by 19th century glaciologist Louis Agassiz</odldc:description.summary>
      <odldc:description.note>Dessinés d'après nature et lithographiés par Jph. Bettannier 1840. Neuchâtel, Lithographie de H. Nicolet</odldc:description.note>
      <dc:publisher>Neuchâtel (Switzerland): Jent et Gassmann</dc:publisher>
      <odldc:contributor.ltg>Bettannier, Joseph</odldc:contributor.ltg>
      <odldc:date.issued>1840</odldc:date.issued>
      <dc:type>Image.Graphic.Map</dc:type>
      <odldc:format.dimensions>480 x 320</odldc:format.dimensions>
      <odldc:format.extent>18 plates: ill.</odldc:format.extent>
      <dc:source>OUM:E. 52</dc:source>
      <dc:language>fre</dc:language>
      <odldc:coverage.spatial>Alps</odldc:coverage.spatial>
      <odldc:relation.isRequiredBy>ODL:munahi010-aaa</odldc:relation.isRequiredBy>
    </xmlData>
  </mdWrap>
</dmdSec>
The structure of a METS file: referenced metadata (mdRef)
dmdSec or amdSec > mdRef > reference to external file containing metadata
<amdSec ID="munahi010-aaa-amd-0001">
  <techMD ID="munahi010-aaa-tmd-0001-0">
    <mdRef MDTYPE="OTHER" OTHERMDTYPE="ODL Admin Metadata Scheme" LOCTYPE="URL" xlink:href="file:/export/home/odl/munahi010/admMetadata/aaa/0/munahi010-aaa-0001-0.xml"/>
  </techMD>
  <techMD ID="munahi010-aaa-tmd-0001-3">
    <mdRef MDTYPE="OTHER" OTHERMDTYPE="ODL Admin Metadata Scheme" LOCTYPE="URL" xlink:href="file:/export/home/odl/munahi010/admMetadata/aaa/3/munahi010-aaa-0001-3.xml"/>
  </techMD>
  <techMD ID="munahi010-aaa-tmd-0001-6">
    <mdRef MDTYPE="OTHER" OTHERMDTYPE="ODL Admin Metadata Scheme" LOCTYPE="URL" xlink:href="file:/export/home/odl/munahi010/admMetadata/aaa/6/munahi010-aaa-0001-6.xml"/>
  </techMD>
</amdSec>
IDs and METS
All components of a METS file need to be identified with a logical (and easily generated) set of identifiers:
Project ID: munahi010
Item ID: munahi010-aaa
Technical metadata: munahi010-aaa-tmd-0001
File groups: munahi010-aaa-fgrp-0001
File IDs: munahi010-aaa-0001-3
divs: munahi010-aaa-div.1
What to put in a METS file?
METS does not prescribe the content (particularly the descriptive metadata) which it can contain
However, the METS board does endorse some schemas as recommended:
Descriptive metadata
Dublin Core
MODS (Metadata Object Description Schema)
MARCXML MARC 21 Schema (MARCXML)
Administrative metadata
Schema for Technical Metadata for Text (NYU)
Library of Congress Audio-Visual Prototyping Project
NISO Technical Metadata for Digital Still Images
Schema for Rights Declaration
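To give a sense of what one of the endorsed descriptive schemas looks like inside METS, the sketch below re-expresses a few fields of the Agassiz atlas record in MODS, embedded with mdWrap. The field mapping is an illustrative assumption, not the Oxford Digital Library's actual practice; the dmdSec ID is reused from the Dublin Core example only for continuity.

```xml
<dmdSec xmlns="http://www.loc.gov/METS/" ID="munahi010-aaa-dmd-0001">
  <mdWrap MIMETYPE="text/xml" MDTYPE="MODS" LABEL="MODS Metadata">
    <xmlData>
      <mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
        <mods:titleInfo>
          <mods:title>Études sur les glaciers</mods:title>
        </mods:titleInfo>
        <!-- Author, with the MARC relator code for "author" -->
        <mods:name type="personal">
          <mods:namePart>Agassiz, Louis, 1807-1873</mods:namePart>
          <mods:role>
            <mods:roleTerm type="code" authority="marcrelator">aut</mods:roleTerm>
          </mods:role>
        </mods:name>
        <mods:originInfo>
          <mods:dateIssued>1840</mods:dateIssued>
        </mods:originInfo>
        <mods:language>
          <mods:languageTerm type="code" authority="iso639-2b">fre</mods:languageTerm>
        </mods:language>
        <mods:subject authority="lcsh">
          <mods:topic>Glaciers</mods:topic>
        </mods:subject>
      </mods:mods>
    </xmlData>
  </mdWrap>
</dmdSec>
```

Only the MDTYPE value and the contents of xmlData change; the surrounding METS wrapper is the same whichever descriptive scheme is chosen.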
METS Profiles
METS is very flexible in its application: there are multiple ways of encoding everything
metadata and data can be embedded or referenced
any scheme can be used for this metadata
file inventory can be organised in multiple ways (by referenced object, by type of file etc.)
This all reduces interchangeability of METS records.
METS Profiles
This can be countered to some extent by METS Profiles: XML documents describing the application of METS in a given project or institution
Each profile follows the METS Profile schema and has to validate against it
Profiles are registered with a central repository at the Library of Congress
But profiles do not allow automated cross-mapping of METS files: this has to be explored
METS in action
Oxford Digital Library:
a collection of collections of material held in Oxford libraries
METS files generated by an automated webform-based cataloguing system
descriptive metadata is qualified Dublin Core following strict cataloguing guidelines (aimed to map to AACR2); a move to MODS is being investigated
METS files easily converted to formats of digital library systems (currently investigating Greenstone)
Conclusions
METS will undoubtedly be the central format for digital library metadata: the MARC of the DL world
Issues to be addressed:
standardization of metadata content and application (an AACR2 to METS's MARC)
take-up by vendors: lobbying by major bodies
more tools needed for creation and exploitation of METS records
critical mass
Further information
www.loc.gov/standards/mets
www.jisc.ac.uk/index.cfm?name=techwatch_report_0205