Metadata 101
Description: Lecture given to a digital libraries class on metadata.
Transcript
1
Metadata 101: An introduction to data about data
R. John Robertson, Digital Libraries class, 28th April 2005
2
Who? What? Why?
Project(s): Metadata Workflow Investigation, mandate
An introduction to metadata
Inform you about other settings for your skills:
  not just digital libraries – IRs; LORs; records management; webpages...
Bibliography at end, notes online
Ask questions; tell me if you can’t hear me
3
What is metadata?
Answers on a postcard…
  What do you know about metadata?
Simple definitions…
  Data about data; information about information
Getting trickier
  Lack of consensus
  “Metadata is machine understandable information for the web”
  “Metadata is data about data. The term refers to any data used to aid the identification, description and location of networked electronic resources.”
Discussing formal metadata
4
Overview
What does metadata have to do with digital libraries?
  What is metadata (1)?
  What does metadata do?
  What is metadata (2)?
  Why do you need it?
How do you create metadata?
  Workflow
  Quality
  Why bother with workflow or quality?
5
What I’m not doing
Technical systems stuff – RDF / XML bindings…
Limits to examples
Not how to create metadata, rather how and why to choose and manage metadata
6
Metadata and digital libraries: what is metadata?
Firefox plug-in [bottom right]
7
Metadata and digital libraries: what is metadata?
8
Metadata and digital libraries: what is metadata?
When metadata becomes important (sort of)
Who is this?
What do we know about them?
Why is this picture significant?
Where is this picture stored?
9
Metadata and digital libraries: what does metadata do?
What are you trying to do?
What do you need your metadata to do to achieve that? (discuss…)
  Bibliographic (e.g. a picture of Fred)
  Administrative (e.g. taken on 12/07/1997)
  Rights (e.g. all rights reserved / no re-use)
  Preservation (e.g. requirements to view JPEG; context: passport photo)
  Technical (e.g. JPEG; 85.8 KB)
  Education (e.g. illustration; UKEL 11)
How does this convey meaning?
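The functional categories on this slide can be made concrete by grouping the example values into a single record. The following is a minimal Python sketch; the key names and the nesting are illustrative assumptions, not part of any metadata standard.

```python
# Illustrative record for the "picture of Fred" example.
# All key names are assumptions chosen for readability, not a standard schema.
photo_metadata = {
    "bibliographic": {"description": "a picture of Fred"},
    "administrative": {"date_taken": "12/07/1997"},  # kept exactly as on the slide
    "rights": {"statement": "all rights reserved / no re-use"},
    "preservation": {
        "viewing_requirements": "requirements to view jpeg",
        "context": "passport photo",
    },
    "technical": {"format": "jpeg", "size_kb": 85.8},
    "educational": {"resource_type": "illustration", "classification": "UKEL 11"},
}


def values_for(record, function):
    """Return the metadata that a given function (e.g. 'rights') relies on."""
    return record.get(function, {})


print(values_for(photo_metadata, "technical"))
```

Grouping by function makes it easier to ask, for each task a service has to support, which part of the record it actually depends on.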
10
Metadata and digital libraries: what is metadata (part 2)?
Definition from Weibel (1998): how to think about data…
Structure: ‘a data model […] for specifying semantic schemas’
  e.g. Dublin Core
Semantic: ‘agreed content description standards’
  e.g. author name conventions; controlled vocabularies
Syntax: ‘syntax for expressing metadata’
  e.g. XML binding for Dublin Core
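One way to see the three layers side by side is to take a single description through each of them. The sketch below is only an illustration: the Dublin Core element names and namespace are real, but the `metadata` wrapper element, the `normalise_name` convention and the function names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Structure: the Dublin Core element set supplies the data model.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def normalise_name(name):
    """Semantics: apply a (hypothetical) 'Surname, Forename' name convention."""
    name = name.strip()
    if "," in name:
        return name  # already in inverted form
    parts = name.split()
    if len(parts) < 2:
        return name
    return f"{parts[-1]}, {' '.join(parts[:-1])}"


def to_xml(record):
    """Syntax: express the record as XML using the Dublin Core namespace."""
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for element, value in record.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


record = {
    "title": "Raccoons and ripe corn",
    "creator": normalise_name("Jim Arnosky"),
}
xml_text = to_xml(record)
print(xml_text)
```

The same structure and semantics could be expressed in a different syntax without changing what the record says, which is why the three layers are worth keeping apart.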
11
Metadata and digital libraries: what is metadata (part 2)?
Metadata structure: a data model
Standards:
  MARC 21
  IEEE LOM
  DC(MES)
‘Super’ standards
  METS
Application profiles
12
Metadata and digital libraries: what is metadata (part 2)?
Title
Creator
Subject
Description
Publisher
Contributor
Date
Type
Format
Identifier
Source
Language
Relation
Coverage
Rights
"SIGNPOSTS" DATA
100 1# $a Arnosky, Jim.
245 10 $a Raccoons and ripe corn /$c Jim Arnosky.
250 ## $a 1st ed.
260 ## $a New York :$b Lothrop, Lee & Shepard Books,$c c1987.
300 ## $a 25 p. :$b col. ill. ;$c 26 cm.
520 ## $a Hungry raccoons feast at night in a field of ripe corn.
650 #1 $a Raccoons.
900 ## $a 599.74 ARN
901 ## $a 8009
903 ## $a $15.00
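The record above is shown in a plain-text display of MARC, where the "signposts" are the three-digit tag, the two indicators (with # standing for a blank) and the $-prefixed subfield codes. The following Python sketch separates signposts from data for a few of those lines. It assumes this particular display layout (tag, space, indicators, space, then `$code` followed by a space), which is a convention of the example rather than of MARC itself.

```python
import re

# A few lines from the record on this slide, in its plain-text display form.
MARC_LINES = [
    "100 1# $a Arnosky, Jim.",
    "245 10 $a Raccoons and ripe corn /$c Jim Arnosky.",
    "260 ## $a New York :$b Lothrop, Lee & Shepard Books,$c c1987.",
    "903 ## $a $15.00",
]

# A subfield marker is "$", one code character, then a space.
# Requiring the space keeps the price "$15.00" from being read as a subfield.
SUBFIELD = re.compile(r"\$([a-z0-9]) ")


def parse_line(line):
    """Split one display line into tag, indicators and (code, value) pairs."""
    tag, indicators, data = line.split(" ", 2)
    parts = SUBFIELD.split(data)
    # parts[0] is whatever precedes the first marker; then code, value, code, value...
    codes = parts[1::2]
    values = [value.strip() for value in parts[2::2]]
    return {"tag": tag, "indicators": indicators, "subfields": list(zip(codes, values))}


for line in MARC_LINES:
    print(parse_line(line))
```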
METS header
descriptive metadata
administrative metadata
file section
structural map
structural links
behaviour
13
Metadata and digital libraries: what is metadata (part 2)?
Metadata semantics: content description
Where things get tricky
Tools for getting semantic metadata right
  Guidelines
  Controlled vocabularies
Challenge of interoperability
14
Metadata and digital libraries: what is metadata (part 2)?
Metadata syntax: metadata expression
Encoding information
  Formats (XML)
Technical infrastructure
  Protocols
    Z39.50
    OAI-PMH
  Software
    Manage
    Encode
    Crosswalk and map
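As an illustration of what "crosswalk and map" software does, the sketch below maps a few MARC fields from the earlier raccoon record onto Dublin Core elements. The mapping table is an assumption made for this example, not an official crosswalk, and the punctuation clean-up is deliberately crude.

```python
# Illustrative MARC -> Dublin Core mapping. This table is an assumption
# for the sketch, not an official or complete crosswalk.
CROSSWALK = {
    ("100", "a"): "creator",
    ("245", "a"): "title",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("520", "a"): "description",
    ("650", "a"): "subject",
}

# (tag, subfield code, value) triples taken from the example record.
marc_record = [
    ("100", "a", "Arnosky, Jim."),
    ("245", "a", "Raccoons and ripe corn /"),
    ("260", "b", "Lothrop, Lee & Shepard Books,"),
    ("260", "c", "c1987."),
    ("520", "a", "Hungry raccoons feast at night in a field of ripe corn."),
    ("650", "a", "Raccoons."),
    ("903", "a", "$15.00"),
]


def crosswalk(fields, mapping=CROSSWALK):
    """Return (dublin_core_dict, unmapped_fields) for a list of MARC triples."""
    dc = {}
    unmapped = []
    for tag, code, value in fields:
        element = mapping.get((tag, code))
        if element is None:
            unmapped.append((tag, code, value))  # nothing to map to: information is lost
        else:
            # Strip trailing ISBD-style separators such as " /" or ","
            dc.setdefault(element, []).append(value.rstrip(" /,:;"))
    return dc, unmapped


dc_record, lost = crosswalk(marc_record)
print(dc_record)
print(lost)
```

The `lost` list is the interesting part: local fields such as the price have no home in the target scheme, which is one face of the interoperability challenge.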
15
Metadata and digital libraries: why do you need metadata?
How else could you manage stuff?
  Browse limits
  Free text limits
  Other limits
16
Metadata generation: workflow and quality
How do you create metadata?
  Who creates?
  What do they create?
  Why should they?
  How much does it cost…
Why do you need good metadata and what does that mean anyway?
17
Metadata generation: workflow
What?
Who? (Actors / Agents / Roles)
  Automatic: good at / bad at
  Creator: good at / bad at
  LIS professional: good at / bad at
  Other professional: good at / bad at
Why?
How much?
18
Metadata generation: quality
Good enough metadata
  Fitness for purpose
Metadata metrics
  accuracy, reliability, verification, documentation, consistency, completeness, sufficiency, timeliness, persistence, etc.
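Of the metrics listed, completeness is the easiest to compute automatically. The sketch below scores a record against the fifteen Dublin Core elements and, separately, against a required subset. The `REQUIRED` list stands in for a hypothetical local policy, which is where fitness for purpose comes in: a different service would require different elements.

```python
# The fifteen Dublin Core elements, lower-cased.
DC_ELEMENTS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]


def is_filled(record, element):
    """An element counts as filled only if it has a non-blank value."""
    return bool(str(record.get(element) or "").strip())


def completeness(record, elements=DC_ELEMENTS):
    """Fraction of the given elements that carry a value."""
    filled = [e for e in elements if is_filled(record, e)]
    return len(filled) / len(elements)


def missing(record, required):
    """Required elements that are absent or blank."""
    return [e for e in required if not is_filled(record, e)]


# Hypothetical policy: what this particular service needs from every record.
REQUIRED = ["title", "creator", "rights"]

record = {
    "title": "Raccoons and ripe corn",
    "creator": "Arnosky, Jim",
    "date": "c1987",
    "rights": "",  # present but empty, so it does not count
}

print(completeness(record))
print(missing(record, REQUIRED))
```

A score like this says nothing about accuracy or consistency; a record can be complete and wrong, so it is only one input to a judgement about quality.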
19
Metadata quality: why bother?
Functional digital libraries
Interoperability
Migration
Exchange
Participation
Cost
20
Which leads to…
Amazon
Merlot
Jorum
Cross-searching library catalogues
e-journal access
CORDRA
Scotland’s Culture
…and hopefully, interesting jobs for all of us
21
Key references
National Information Standards Organization. (2004). Understanding Metadata. NISO Press. Available from: http://www.niso.org/standards/resources/UnderstandingMetadata.pdf. Last accessed 21st December 2004.
NISO Framework Advisory Group. (2004). A Framework of Guidance for Building Good Digital Collections. 2nd ed. Bethesda, MD: National Information Standards Organization. Available from: http://www.niso.org/framework/framework2.html. Last accessed 10th Nov 2004.
Weibel, S.L. (1998). The Metadata Landscape: conventions for semantics, syntax, and structure in the Internet Commons. In: Metadiversity. Proceedings of the Conference, Natural Bridge, VA. Available from: http://www.nfais.org/publications/metadiversity_preprints6.htm. Last accessed 20th January 2005.
Currier, S., Barton, J., O'Beirne, R. & Ryan, B. (2004). Quality assurance for digital learning object repositories: issues for the metadata creation process. ALT-J, 12(1), pp.5-20.
Dushay, N. & Hillmann, D.I. (2003). Analyzing metadata for effective use and re-use. DC-2003: 2003 Dublin Core Conference, Seattle.
Greenberg, J., Pattuelli, M. C., Parsia, B., & Robertson, W. D. (2001). Author-generated Dublin Core metadata for web resources: a baseline study in an organization. Journal of Digital Information, 2(2).