metadata standards and applications 4. metadata syntaxes and containers

26
Metadata Metadata Standards and Standards and Applications Applications 4. Metadata Syntaxes 4. Metadata Syntaxes and Containers and Containers

Upload: philomena-williams

Post on 24-Dec-2015

230 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

Metadata Standards Metadata Standards and Applicationsand Applications

4. Metadata Syntaxes and 4. Metadata Syntaxes and ContainersContainers

Page 2: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

Goals of SessionGoals of Session

Understand the origin of and Understand the origin of and differences between the various differences between the various syntaxes used for encoding syntaxes used for encoding information, including HTML, XML information, including HTML, XML and RDFand RDF

Discover how container formats are Discover how container formats are used for managing digital resources used for managing digital resources and their metadataand their metadata

22Metadata Standards & ApplicationsMetadata Standards & Applications

Page 3: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

Overview of SyntaxesOverview of Syntaxes

HTML, XHTML: Hypertext Markup HTML, XHTML: Hypertext Markup Language; eXtensible Hypertext Language; eXtensible Hypertext Markup LanguageMarkup Language

XML: Extensible Markup LanguageXML: Extensible Markup LanguageRDF: Resource Description RDF: Resource Description

FrameworkFramework

33Metadata Standards & ApplicationsMetadata Standards & Applications

Page 4: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

HTMLHTML

HyperText Markup LanguageHyperText Markup Language

HTML 4 is the current standardHTML 4 is the current standard

HTML is an SGML (Standard Generalized Markup HTML is an SGML (Standard Generalized Markup Language) application conforming to Language) application conforming to International Standard ISO 8879International Standard ISO 8879

Widely regarded as the standard publishing Widely regarded as the standard publishing language of the World Wide Weblanguage of the World Wide Web

HTML addressed the problem of SGML HTML addressed the problem of SGML complexity by specifying a small set of structural complexity by specifying a small set of structural and semantic tags suitable for authoring and semantic tags suitable for authoring relatively simple documentsrelatively simple documents

44Metadata Standards & ApplicationsMetadata Standards & Applications

Page 5: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

XHTMLXHTML

XML-ized version of HTML 4.0, tightens up HTML XML-ized version of HTML 4.0, tightens up HTML to match XML syntaxto match XML syntax

– Requires ending tags, quoted attributes, lower Requires ending tags, quoted attributes, lower case, etc., to conform to XML requirementscase, etc., to conform to XML requirements

XHTML is a W3C specification, redefining HTML XHTML is a W3C specification, redefining HTML as an XML implementation, rather than an SGML as an XML implementation, rather than an SGML implementationimplementation

Imposes requirements that are intended to lead Imposes requirements that are intended to lead to more well-formed, valid XML, easier for to more well-formed, valid XML, easier for browsers to handlebrowsers to handle

55Metadata Standards & ApplicationsMetadata Standards & Applications

Page 6: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" /> <link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" /> <meta name="DC.title" content="Using Dublin Core" /> <meta name="DC.creator" content="Diane Hillmann" /> <meta name="DC.subject" content="documents; Bibliography; Model; meta; Glossary; mark; matching; refinements; XHTML; Controlled; Qualifiers; Hillmann; mixing; encoding; Diane; Issues; Appendix; elements; Simple; Special; element; trademark/service; DCMI; Dublin; pages; Section; Resource; Grammatical; Qualified; XML; Using; Principles; Documents; licensing; OCLC; formal; Usageguide; Roles; Implementing; Contents; Guidelines; Expressing; Table; Syntax; Content; Element; DC.dot; Home; document; Metadata; RDF/XML; Website; metadata; privacy; schemes; liability; profiles; Elements; Copyright; Localization; schemas; HTML/XHTML; Core; Guide; registry; Research; contact; Scope; Projects; languages; Maintenance; Application; available; Internationalization; HTML; Recommended; link; Purpose; Abstract; AskDCMI; Vocabularies; software; Storage; Introduction" /> <meta name="DC.description" content="This document is intended as an entry point for users of Dublin Core. For non-specialists, it will assist them in creating simple descriptive records for information resources (for example, electronic documents). Specialists may find the document a useful point of reference to the documentation of Dublin Core, as it changes and grows." /> <meta name="DC.publisher" content="Dublin Core Metadata Initiative" /> <meta name="DC.type" scheme="DCTERMS.DCMIType" content="Text" /> <meta name="DC.format" content="text/html" /> <meta name="DC.format" content="31250 bytes" /> <meta name="DC.identifier" scheme="DCTERMS.URI" content="http://dublincore.org/documents/usageguide/" />

An XHTML Example

6Metadata Standards & Applications

Page 7: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

XMLXML

EExxtensible tensible MMarkup arkup LLanguageanguageA ‘A ‘metamarkup’metamarkup’ languagelanguage: has no : has no

fixed tags or elementsfixed tags or elementsStrict grammar imposes structure Strict grammar imposes structure

designed to be read by machinesdesigned to be read by machinesTwo levels of conformance:Two levels of conformance:

– well-formed--conforms to general grammar well-formed--conforms to general grammar rulesrules

– valid--conforms to particular XML schema or valid--conforms to particular XML schema or DTD (document type definition)DTD (document type definition)

77Metadata Standards & ApplicationsMetadata Standards & Applications

Page 8: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

XML is the XML is the lingua franca lingua franca of the Webof the Web

Web pages increasingly use at least XHTMLWeb pages increasingly use at least XHTML Business use for data exchange/messagingBusiness use for data exchange/messaging Family of technologies can be leveragedFamily of technologies can be leveraged

– XML Schema, XSLT, XPath, and XqueryXML Schema, XSLT, XPath, and Xquery Software tools widely available (many open Software tools widely available (many open

source)source)– Storage, editing, parsing, validating, transforming and Storage, editing, parsing, validating, transforming and

publishing XMLpublishing XML Microsoft Office 2003 supports XML as document Microsoft Office 2003 supports XML as document

format (WordML and ExcelML)format (WordML and ExcelML) Web 2.0 applications are based on XMLWeb 2.0 applications are based on XML

88Metadata Standards & ApplicationsMetadata Standards & Applications

Page 9: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

An XML Schema May Define:An XML Schema May Define:

What elements may be usedWhat elements may be used Of which typesOf which types Any attributesAny attributes In which orderIn which order Optional or compulsoryOptional or compulsory RepeatabilityRepeatability Sub-elementsSub-elements LogicLogic

99Metadata Standards & ApplicationsMetadata Standards & Applications

Page 10: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

Anatomy of an XML RecordAnatomy of an XML Record

XML declaration--prepares the processor to work XML declaration--prepares the processor to work with the documentwith the document

Namespaces (uses xmlns:Namespaces (uses xmlns:prefixprefix and a URI to and a URI to attach a prefix to each element and attribute)attach a prefix to each element and attribute)

– Distinguishes between elements and attributes from Distinguishes between elements and attributes from different vocabularies that might share a name (but different vocabularies that might share a name (but not necessarily a definition) using association with not necessarily a definition) using association with URIs URIs

– Groups all related elements from an application so Groups all related elements from an application so software can deal with themsoftware can deal with them

– The URIs are the standardized bit, not the prefix, The URIs are the standardized bit, not the prefix, and they don’t necessarily lead anywhere useful, and they don’t necessarily lead anywhere useful, even if they look like URLseven if they look like URLs

1010Metadata Standards & ApplicationsMetadata Standards & Applications

Page 11: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

XML Anatomy Lesson

<marc:subfield code="a">Metadata in practice /</marc:subfield>

AttributeAttribute

End TagEnd Tag

ContentContentNameName

Start TagStart Tag

11Metadata Standards & Applications

Page 12: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

Namespace Anatomy Lesson

xmlns:dc=”http://purl.org/dc/elements/1.1/”

XML NamespaceXML Namespace

Namespace Namespace PrefixPrefix

Namespace Namespace IdentifierIdentifier

12Metadata Standards & Applications

Page 13: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

RDFRDF

Resource Description Framework--A Resource Description Framework--A language for describing resources for the language for describing resources for the webweb

Structure based on “triples”Structure based on “triples”Focused on exchange of information Focused on exchange of information

between different kinds of organizations between different kinds of organizations and usagesand usages

Considered an essential part of the Considered an essential part of the Semantic WebSemantic Web

Can be expressed using XMLCan be expressed using XML1313Metadata Standards & ApplicationsMetadata Standards & Applications

Page 14: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

Some RDF ConceptsSome RDF Concepts

A A ResourceResource is anything that you want to is anything that you want to describe; it’s most often identified with a URI, describe; it’s most often identified with a URI, such as:such as: http://dublincore.org/documents/usageguide/http://dublincore.org/documents/usageguide/

A A ClassClass is a category; it is a set that comprises is a category; it is a set that comprises individualsindividuals

A A PropertyProperty is a Resource that has a name, such is a Resource that has a name, such as "creator" or "homepage"as "creator" or "homepage"

A A Property valueProperty value is the value of a Property, is the value of a Property, such as ”George Washington" or such as ”George Washington" or ""http://dublincore.orghttp://dublincore.org" (note that a property " (note that a property value can be another resource)value can be another resource)

1414Metadata Standards & ApplicationsMetadata Standards & Applications

Page 15: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

RDF StatementsRDF Statements

The combination of a The combination of a ResourceResource, a , a PropertyProperty, , and a and a Property valueProperty value forms a forms a StatementStatement (includes a subject, predicate and object)(includes a subject, predicate and object)

An example An example StatementStatement: "The editor of: "The editor of http://dublincore.org/documents/usageguidehttp://dublincore.org/documents/usageguide/ is / is Diane Hillmann"Diane Hillmann"

– The subject of the statement above is:The subject of the statement above is: http://dublincore.org/documents/usageguide/http://dublincore.org/documents/usageguide/

– The predicate is: editorThe predicate is: editor

– The object is: Diane HillmannThe object is: Diane Hillmann

1515Metadata Standards & ApplicationsMetadata Standards & Applications

Page 16: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

RDF and OWLRDF and OWL

RDF does not have the language to specify RDF does not have the language to specify all relationshipsall relationships

Web Ontology Language (OWL) can Web Ontology Language (OWL) can specify richer relationships, such as specify richer relationships, such as equivalence, inverse, uniqueequivalence, inverse, unique

RDF and OWL may be used togetherRDF and OWL may be used together Resource Description Framework Schema Resource Description Framework Schema

(RDFS): a syntax for expressing (RDFS): a syntax for expressing relationships between elementsrelationships between elements

1616Metadata Standards & ApplicationsMetadata Standards & Applications

Page 17: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/"> <rdf:Description rdf:about="http://www.dlib.org"> <dc:title>D-Lib Program - Research in Digital Libraries</dc:title> <dc:description>The D-Lib program supports the community of people with research interests in digital libraries and electronic publishing.</dc:description> <dc:publisher>Corporation For National Research Initiatives</dc:publisher> <dc:date>1995-01-07</dc:date> <dc:subject> <rdf:Bag> <rdf:li>Research; statistical methods</rdf:li> <rdf:li>Education, research, related topics</rdf:li> <rdf:li>Library use Studies</rdf:li> </rdf:Bag> </dc:subject> <dc:type>World Wide Web Home Page</dc:type> <dc:format>text/html</dc:format> <dc:language>en</dc:language> </rdf:Description></rdf:RDF>

Note Note unordereunordere

d list d list

An XML/RDF Example

1717Metadata Standards & ApplicationsMetadata Standards & Applications

Note Note rdf:aboutrdf:about

Page 18: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

Overview of Container FormatsOverview of Container Formats A container format is used to package A container format is used to package

together all forms of metadata and digital together all forms of metadata and digital contentcontent

Use of a container is compatible with, and Use of a container is compatible with, and an implementation of, the OAIS an implementation of, the OAIS information package conceptinformation package concept

METS: packages metadata with objects or METS: packages metadata with objects or links to objects and defines structural links to objects and defines structural relationshipsrelationships

MPEG 21 DID: represents digital objectsMPEG 21 DID: represents digital objects

1818Metadata Standards & ApplicationsMetadata Standards & Applications

Page 19: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

METSMETS

Metadata Encoding & Transmission StandardMetadata Encoding & Transmission Standard Developed by the Digital Library Federation, Developed by the Digital Library Federation,

maintained by the Library of Congressmaintained by the Library of Congress ““... an XML document format for encoding ... an XML document format for encoding

metadata necessary for both management of metadata necessary for both management of digital library objects within a repository and digital library objects within a repository and exchange of such objects between repositories exchange of such objects between repositories (or between repositories and their users).”(or between repositories and their users).”

METS is open source and developed by open METS is open source and developed by open discussiondiscussion

Cultural heritage community is the main Cultural heritage community is the main audienceaudience

1919Metadata Standards & ApplicationsMetadata Standards & Applications

Page 20: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

METS UsageMETS Usage

To package metadata with digital To package metadata with digital object in XML syntaxobject in XML syntax

For retrieving, storing, preserving, For retrieving, storing, preserving, serving resourceserving resource

For interchange of digital objects with For interchange of digital objects with their metadatatheir metadata

As an information package in a digital As an information package in a digital repository (may be a unit of storage or repository (may be a unit of storage or a transmission format)a transmission format)

Metadata Standards & ApplicationsMetadata Standards & Applications 2020

Page 21: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

METS SectionsMETS Sections

Defined in METS schema for navigation & Defined in METS schema for navigation & browsingbrowsing– 1. Header (XML Namespaces)1. Header (XML Namespaces)– 2. File inventory2. File inventory– 3. Structural Map & Links3. Structural Map & Links– 4. Descriptive Metadata (not part of METS but uses an 4. Descriptive Metadata (not part of METS but uses an

externally developed descriptive metadata standard, externally developed descriptive metadata standard, e.g. DC, MODS)e.g. DC, MODS)

– 5. Administrative Metadata (points to external schemas):5. Administrative Metadata (points to external schemas): 1. Technical, Source1. Technical, Source 2. Digital Provenance2. Digital Provenance 3. Rights3. Rights

Metadata Standards & ApplicationsMetadata Standards & Applications 2121

Page 22: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

Metadata Standards & ApplicationsMetadata Standards & Applications 2222

Page 23: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

METS Extension SchemasMETS Extension Schemas

““Wrappers” or “sockets” where elements Wrappers” or “sockets” where elements from other schemas can be plugged infrom other schemas can be plugged in– Uses the XML Schema facility for combining Uses the XML Schema facility for combining

vocabularies from different Namespacesvocabularies from different Namespaces Endorsed extension schemas:Endorsed extension schemas:

– Descriptive: DC, MODS, MARCXMLDescriptive: DC, MODS, MARCXML– Technical metadata: MIX (image); textMD Technical metadata: MIX (image); textMD

(text)(text)– Preservation related: PREMISPreservation related: PREMIS

Metadata Standards & ApplicationsMetadata Standards & Applications 2323

Page 24: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

MPEG-21 Digital Item Declaration MPEG-21 Digital Item Declaration (DID)(DID)

ISO/IEC 21000-2: Digital Item DeclarationISO/IEC 21000-2: Digital Item Declaration– An alternative to represent Digital ObjectsAn alternative to represent Digital Objects– Supported by some repositories, e.g., aDORe, Supported by some repositories, e.g., aDORe,

DSpace, FedoraDSpace, Fedora Model that represents compound objects Model that represents compound objects

(recursive “item”)(recursive “item”) MPEG DID is an ISO standard and has industry MPEG DID is an ISO standard and has industry

support, but it is often implemented in a support, but it is often implemented in a proprietary environment and the standards proprietary environment and the standards development is closed (as is ISO in general)development is closed (as is ISO in general)

Metadata Standards & ApplicationsMetadata Standards & Applications 2424

Page 25: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

MPEG 21 Abstract ModelMPEG 21 Abstract Model

Metadata Standards & ApplicationsMetadata Standards & Applications 2525

Page 26: Metadata Standards and Applications 4. Metadata Syntaxes and Containers

An ExerciseAn Exercise

Encode a simple resource in both Encode a simple resource in both DC and MARC using XMLDC and MARC using XML

Use the template forms providedUse the template forms provided

2626Metadata Standards & ApplicationsMetadata Standards & Applications