Linked data and the implications for library cataloguing: metadata models and structures in the Semantic Web. Gordon Dunsire. Presented at the Canadian Library Association Annual Conference, 26-29 May 2011, Halifax, Nova Scotia.
![Page 1: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/1.jpg)
Linked data and the implications for library cataloguing:
metadata models and structures in the Semantic Web
Gordon Dunsire
Presented at the Canadian Library Association Annual Conference, 26-29 May 2011, Halifax, Nova Scotia
![Page 2: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/2.jpg)
Outline
Context: evolution of the catalogue record
RDF 101
Library metadata models/schemas in RDF
FRBR, RDA, ISBD, DCT, BiBO, ...
From record to triples: worked example
![Page 3: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/3.jpg)
A short history of the evolution of the library catalogue record
![Page 4: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/4.jpg)
Lee, T. B.
Cataloguing has a future. - Audio disc (Spoken word). - Donated by the author.
1. Metadata
In the beginning ...
... the catalogue card
![Page 5: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/5.jpg)
From flat-file record ...
Author: Lee, T. B.
Title: Cataloguing has a future
Content type: Spoken word
Carrier type: Audio disc
Subject: Metadata
Provenance: Donated by the author
... to relational record
Bibliographic description
Name authority
Name:
Biography:
...
Subject authority
Term:
Definition:
...
![Page 6: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/6.jpg)
From flat-file description ...
Author: Lee, T. B.
Title: Cataloguing has a future
Content type: Spoken word
Carrier type: Audio disc
Subject: Metadata
Provenance: Donated by the author
... to FRBR record
Bibliographic description
Work
Expression
Manifestation
Item
Author:
Content type: Spoken word
Subject:
Name authority (Name:, Biography:, ...)
Subject authority (Term:, Definition:, ...)
![Page 7: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/7.jpg)
From FRBR record ...
... to extinction!
Work
Expression
Manifestation
Item
Author: Lee, T. B.
Subject: Metadata
Title: Cataloguing has a future
Content type: Spoken word
Carrier type: Audio disc
Provenance: Donated by the author
Name authority (Name:)
Subject authority (Term:)
RDA content type (Term:)
RDA carrier type (Term:)
Amazon/Publisher (Title:)
Donor:
![Page 8: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/8.jpg)
Where is the record?
Implicit, not explicit
Everywhere and nowhere
A semantic Web will allow machines to create the record just-in-time
We will not have to maintain records just-in-case
The user will have control over the presentation
I want to see an archive or library or museum or Amazon or Google or Flickr or ? display
And by avoiding duplication, we can all get on with describing new stuff ...
![Page 9: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/9.jpg)
The hyperdimensional (Tardis) card
Lee, T. B.
Cataloguing has a future. - Audio disc (Spoken word). - Donated by the author.
1. Metadata
Audio shop
Lee Museum
Spoken word archive
W3C Library
“TARDIS four port USB hub, for office-bound Time Lords: Open a time vortex on your desk” – Pocket-lint
![Page 10: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/10.jpg)
RDF 101
![Page 11: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/11.jpg)
Semantic Web
“machine-readable metadata”
Faster! 24/7/365! Global!
Metadata expressed as “atomic” statements
A simple, single, irreducible statement
The title of this book is “Treasure island”
In a standard machine-processable format
Resource Description Framework (RDF)
![Page 12: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/12.jpg)
Resource Description Framework
Metadata statement constructed in 3 parts
“Triple”
The title of this book is “Treasure island”
Subject of the statement = Subject: This book
Nature of the statement = Predicate: has title
Value of the statement = Object: “Treasure island”
This book – has title – “Treasure island”
subject – predicate – object
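As a rough illustration of the three-part statement above, a triple can be held as a small named tuple. This is only a sketch: the `Triple` class and its field names are hypothetical, and human labels stand in for the identifiers that RDF proper would require.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One atomic metadata statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str  # named "obj" to avoid shadowing Python's built-in "object"


# The slide's example statement, still using human-readable labels.
statement = Triple(subject="This book", predicate="has title", obj="Treasure island")

print(f"{statement.subject} | {statement.predicate} | {statement.obj}")
```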
![Page 13: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/13.jpg)
Identifiers
Need unambiguous way of identifying each part of the triple for efficient machine-processing
Human labels (“This book”, “has title”) no good
Same thing, different labels; different things, same label
Exploit the utility of the URL
Machine-readable, regular syntax, unambiguous
Uniform Resource Identifier (URI)
![Page 14: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/14.jpg)
Uniform Resource Identifier
Can be any unique combination of numbers and letters
No intrinsic meaning; it’s just an identifying label
Can look like a URL
http://iflastandards.info/ns/isbd/elements/P1001
But does not lead to a Web page (in principle ...)
RDF requires the subject and predicate of triple to be URIs
Object can be a URI, or a literal string (“Treasure island”)
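The rule on this slide (subject and predicate are URIs; the object is either a URI or a literal) can be sketched as a simple check. The `is_uri` test below is a deliberately crude assumption (a scheme plus a host), and `http://example.org/book/1` is a hypothetical subject URI, not one from the presentation.

```python
from urllib.parse import urlparse


def is_uri(value: str) -> bool:
    """Crude test: accept strings that parse with both a scheme and a host."""
    parts = urlparse(value)
    return bool(parts.scheme and parts.netloc)


def is_valid_triple(subject: str, predicate: str, obj: str) -> bool:
    """Subject and predicate must be URIs; the object may be a URI or a literal."""
    return is_uri(subject) and is_uri(predicate)


predicate = "http://iflastandards.info/ns/isbd/elements/P1001"
print(is_valid_triple("http://example.org/book/1", predicate, "Treasure island"))
print(is_valid_triple("This book", predicate, "Treasure island"))
```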
![Page 15: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/15.jpg)
Namespaces
URI can be constructed from a base plus a unique, identifying suffix
http://iflastandards.info/ns/isbd/elements/ + P1001
Base is known as a namespace
Can be abbreviated by human programmer
“isbd” = http://iflastandards.info/ns/isbd/elements/
isbd:P1001
Machine expands abbreviation for processing
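The expansion a machine performs on an abbreviated name can be sketched in a few lines. The `NAMESPACES` table and the `expand` function are hypothetical names; the prefix and base URI are the ones given on the slide.

```python
# Prefix-to-namespace table; only the "isbd" entry from the slide is included.
NAMESPACES = {
    "isbd": "http://iflastandards.info/ns/isbd/elements/",
}


def expand(qname: str, namespaces: dict = NAMESPACES) -> str:
    """Expand 'prefix:suffix' into the full URI (namespace base + suffix)."""
    prefix, _, suffix = qname.partition(":")
    return namespaces[prefix] + suffix


print(expand("isbd:P1001"))
```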
![Page 16: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/16.jpg)
Everything as triples in RDF
Every aspect of the metadata must be expressed in RDF to be machine-processable
Metadata about real-world objects (books, people, etc.)
Metadata about the predicates (definition, label, scope, etc.)
Common predicates apply to many types of thing (human-readable label, etc.)
High-level RDF namespaces (rdfs, owl)
RDF is expressed in RDF (“bootstrap”)
![Page 17: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/17.jpg)
Library namespaces
![Page 18: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/18.jpg)
Creating namespaces and URIs
FRBR/FRAD/FRSAD, ISBD, and RDA are using the Open Metadata Registry
Can assign a running “number” to the base to create a new URI
Set of properties for creating basic triples
Properties = predicates
rdfs:label for assigning a human-readable label to the subject
isbd:P1001 - rdfs:label - “has content form”
![Page 19: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/19.jpg)
![Page 20: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/20.jpg)
![Page 21: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/21.jpg)
![Page 22: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/22.jpg)
![Page 23: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/23.jpg)
Subject | Predicate | Object
isbd:P1001 | rdfs:label | “has content form”
![Page 24: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/24.jpg)
![Page 25: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/25.jpg)
![Page 26: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/26.jpg)
![Page 27: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/27.jpg)
![Page 28: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/28.jpg)
Subject | Predicate | Object
isbdcf:T1008 | skos:prefLabel | “spoken word”
![Page 29: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/29.jpg)
![Page 30: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/30.jpg)
![Page 31: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/31.jpg)
Application profile
Need a way to specify how a useful “record” can be constructed from RDF triples
Which triples are involved, and from which namespaces?
Sequence? Repeatable? Mandatory?
Sub-component aggregations
Publication statement = place + name + date
Content rules?
![Page 32: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/32.jpg)
Mandatory
Not repeatable
Aggregation of simpler elements
Syntax of aggregation (punctuation)
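One way to picture an application profile is as a table of rules that a cluster of triples is checked against, plus a recipe for aggregating sub-components. The sketch below is an assumption throughout: the `PROFILE` layout, the function names, the sample place, name and date, and the "Place : Name, Date" punctuation pattern are illustrative and are not taken from the presentation.

```python
# Hypothetical profile: one element, mandatory, not repeatable, built from parts.
PROFILE = {
    "publication_statement": {
        "mandatory": True,
        "repeatable": False,
        "parts": ["place", "name", "date"],
    },
}


def publication_statement(parts: dict) -> str:
    """Aggregate place + name + date using an assumed punctuation pattern."""
    return f"{parts['place']} : {parts['name']}, {parts['date']}"


def check(record: dict, profile: dict = PROFILE) -> list:
    """Return a list of rule violations; an empty list means the record conforms."""
    problems = []
    for element, rule in profile.items():
        values = record.get(element, [])
        if rule["mandatory"] and not values:
            problems.append(f"{element}: missing")
        if not rule["repeatable"] and len(values) > 1:
            problems.append(f"{element}: repeated")
    return problems


statement = publication_statement(
    {"place": "Chicago", "name": "Example Press", "date": "2004"}
)
print(statement)
print(check({"publication_statement": [statement]}))
```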
![Page 33: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/33.jpg)
Getting triples from records
![Page 34: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/34.jpg)
Linking Open Data cloud (LOD)
Diagram by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
![Page 35: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/35.jpg)
LOD: “Library” corner
![Page 36: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/36.jpg)
Why get involved?
To share our data
We work for “society”
To share our expertise and experience
150+ years
To promote the power of libraries (and archives and museums)
To survive
![Page 37: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/37.jpg)
From record to triples (in 9 stages)
Very large numbers of records
Catalogue records, finding aids, etc.
300 million; 1 billion?
High quality metadata
In comparison with other communities
Each record may generate many triples
200 “raw” triples (no inferences) per MARC record?
Very, very large numbers of triples
Billions? Trillions?
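The scale suggested here can be worked through directly from the slide's own figures, all of which are flagged on the slide as guesses. The function name `estimate_triples` is hypothetical.

```python
TRIPLES_PER_RECORD = 200  # the slide's tentative figure for "raw" triples per MARC record


def estimate_triples(records: int, per_record: int = TRIPLES_PER_RECORD) -> int:
    """Raw triple count, ignoring any triples added later by inference."""
    return records * per_record


for records in (300_000_000, 1_000_000_000):
    print(f"{records:,} records -> {estimate_triples(records):,} raw triples")
```

On these figures the raw total lands between 60 billion and 200 billion triples, before any inferred triples are counted.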
![Page 38: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/38.jpg)
1. Take a record
Field/attribute | Value
Record ID | 54321
Title | Museum archives: an introduction
Author | Wythe, Deborah
Date | 2004
LCSH | Museum archives
Media/GMD | Electronic
Content form | Text
![Page 39: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/39.jpg)
2. Disaggregate to single statements
Record | Attribute | Value
54321 | (has) title | Museum archives: an introduction
54321 | (has) author | Wythe, Deborah
54321 | (has) date | 2004
54321 | (has) LCSH | Museum archives
54321 | (has) media type | Electronic
54321 | (has) content form | Text
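Stages 1 and 2 amount to turning one multi-field record into one statement per field. A minimal sketch, assuming the record is held as a dictionary keyed by the field names from stage 1 (the `disaggregate` function is a hypothetical name):

```python
record = {
    "Record ID": "54321",
    "Title": "Museum archives: an introduction",
    "Author": "Wythe, Deborah",
    "Date": "2004",
    "LCSH": "Museum archives",
    "Media/GMD": "Electronic",
    "Content form": "Text",
}


def disaggregate(record: dict, id_field: str = "Record ID") -> list:
    """Return one (record id, attribute, value) statement per non-identifier field."""
    record_id = record[id_field]
    return [
        (record_id, field, value)
        for field, value in record.items()
        if field != id_field
    ]


statements = disaggregate(record)
for statement in statements:
    print(" | ".join(statement))
```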
![Page 40: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/40.jpg)
3. Create URI for record
Must be unique, so 54321 no good on its own
http URIs are a good thing (W3C)
So add record ID to a unique http domain
E.g. http://MyLibraryX.com (unique to the library) + 54321
http://MyLibraryX.com/54321
(or http://MyLibraryX.com#54321)
This is not a URL!
![Page 41: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/41.jpg)
4. Replace record ID with URI
URI | Attribute | Value
mlx:54321 | (has) title | Museum archives: an introduction
mlx:54321 | (has) author | Wythe, Deborah
mlx:54321 | (has) date | 2004
mlx:54321 | (has) LCSH | Museum archives
mlx:54321 | (has) media type | Electronic
mlx:54321 | (has) content form | Text
“mlx” = qname (xmlns) = shorthand for “http://MyLibraryX.com/”
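Stages 3 and 4 can be sketched as two small helpers: one that mints the record URI from the library's base, and one that abbreviates it back to the "mlx" shorthand. The function names are hypothetical; the base and prefix are the slide's own example values.

```python
BASE = "http://MyLibraryX.com/"  # the slide's example namespace for the library
PREFIX = "mlx"


def mint_uri(record_id, base: str = BASE) -> str:
    """Append the local record ID to the library's unique base."""
    return base + str(record_id)


def to_qname(uri: str, prefix: str = PREFIX, base: str = BASE) -> str:
    """Abbreviate a full URI in the library namespace to prefix:suffix form."""
    if not uri.startswith(base):
        raise ValueError(f"{uri} is not in namespace {base}")
    return f"{prefix}:{uri[len(base):]}"


uri = mint_uri(54321)
print(uri)
print(to_qname(uri))
```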
![Page 42: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/42.jpg)
5. Find URIs for attributes
Attributes are modelled as RDF properties (predicates) in “element set” namespaces
E.g. Dublin Core terms (dct); ISBD (isbd); FRBR (frbrer); RDA (rdaxxx); Bibliographic Ontology (bibo); etc.
Choose a namespace, find property with same (or closest) “meaning” (e.g. definition) as attribute
Nearest property minimises loss of information
Get URI for property
If no suitable property, choose another namespace
Properties do not have to come from single namespace
Match and mix!
![Page 43: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/43.jpg)
5 (cont). Find URI for title
http://purl.org/dc/terms/title (dct:title)
http://iflastandards.info/ns/isbd/elements/P1014 (isbd:P1014)
hasTitleProper
http://RDVocab.info/Elements/titleProper (rdaGR1:titleProper)
![Page 44: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/44.jpg)
5 (cont). Find URI for author
dct:creator
rdarole:author
(isbd does not cover “headings”)
![Page 45: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/45.jpg)
5 (cont). Find URI for date
dct:date
isbd:P1018
hasDateOfPublicationProductionDistribution
rdaGr1:dateOfPublication
![Page 46: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/46.jpg)
5 (cont). Find URI for LCSH
LCSH is a subject vocabulary
Controlled terms
So attribute is really “subject”
And the term itself is the value
dct:subject
![Page 47: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/47.jpg)
5 (cont). Find URI for media type
Assuming record uses new ISBD Area 0 ...
isbd:P1003
hasMediaType
![Page 48: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/48.jpg)
5 (cont). Find URI for content form
Assuming record uses new ISBD Area 0 ...
isbd:P1001
hasContentForm
![Page 49: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/49.jpg)
6. Replace attributes with URIs
URI | URI | Value
mlx:54321 | isbd:P1014 | Museum archives: an introduction
mlx:54321 | rdarole:author | Wythe, Deborah
mlx:54321 | isbd:P1018 | 2004
mlx:54321 | dct:subject | Museum archives
mlx:54321 | isbd:P1003 | Electronic
mlx:54321 | isbd:P1001 | Text
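Stage 6 is essentially a lookup from each local attribute name to the property chosen in stage 5. A minimal sketch, using the property choices shown in the table; the `PROPERTY_FOR` table and `replace_attributes` function are hypothetical names.

```python
# Attribute-to-property choices made in stage 5, mixing several namespaces.
PROPERTY_FOR = {
    "title": "isbd:P1014",
    "author": "rdarole:author",
    "date": "isbd:P1018",
    "LCSH": "dct:subject",
    "media type": "isbd:P1003",
    "content form": "isbd:P1001",
}

statements = [
    ("mlx:54321", "title", "Museum archives: an introduction"),
    ("mlx:54321", "author", "Wythe, Deborah"),
    ("mlx:54321", "date", "2004"),
    ("mlx:54321", "LCSH", "Museum archives"),
    ("mlx:54321", "media type", "Electronic"),
    ("mlx:54321", "content form", "Text"),
]


def replace_attributes(statements: list, mapping: dict = PROPERTY_FOR) -> list:
    """Swap each human-readable attribute for its property identifier."""
    return [(subject, mapping[attr], value) for subject, attr, value in statements]


for row in replace_attributes(statements):
    print(" | ".join(row))
```

An attribute with no entry in the table raises a `KeyError`, which matches the slide's advice to go and look in another namespace rather than guess.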
![Page 50: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/50.jpg)
7. Find URIs for values
If object of a triple is a URI, it can link to the subject of another triple with the same URI
Linked data!
Values from controlled vocabularies may have URIs
Possible vocabularies: author, subject, ISBD Area 0
NOT: title, date
For author: Virtual International Authority File (VIAF)
For LCSH: Library of Congress Authorities & Vocabularies
For ISBD Area 0: Open Metadata Registry
![Page 51: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/51.jpg)
7 (cont). Find URI for author
Author: Wythe, Deborah
VIAF: http://www.viaf.org/
viaf:31899419/#Wythe,+Deborah
![Page 52: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/52.jpg)
7 (cont). Find URI for subject (LCSH)
LCSH: Museum archives
LoC: http://id.loc.gov/authorities/
lcsh:/sh85088707#concept
![Page 53: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/53.jpg)
7 (cont). Find URIs for ISBD Area 0
Media type: Electronic
ISBD media type
isbdmt:T1002
Content form: Text
ISBD Content form
isbdcf:T1009
![Page 54: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/54.jpg)
8. Replace values with URIs
subject | predicate | object
mlx:54321 | isbd:P1014 | “Museum archives: an introduction”
mlx:54321 | rdarole:author | viaf:31899419/#Wythe,+Deborah
mlx:54321 | isbd:P1018 | “2004”
mlx:54321 | dct:subject | lcsh:/sh85088707#concept
mlx:54321 | isbd:P1003 | isbdmt:T1002
mlx:54321 | isbd:P1001 | isbdcf:T1009
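Stage 8 can be sketched as a second lookup, this time keyed on predicate and value, that leaves titles and dates as literals. The `VALUE_URIS` table stands in for real searches of VIAF, the Library of Congress service and the Open Metadata Registry; the identifiers are copied as they appear on the slides, and the extra boolean flag marking URI objects is an assumption of this sketch.

```python
# (predicate, value) -> identifier, as found in stage 7.
VALUE_URIS = {
    ("rdarole:author", "Wythe, Deborah"): "viaf:31899419/#Wythe,+Deborah",
    ("dct:subject", "Museum archives"): "lcsh:/sh85088707#concept",
    ("isbd:P1003", "Electronic"): "isbdmt:T1002",
    ("isbd:P1001", "Text"): "isbdcf:T1009",
}


def replace_values(triples: list, lookup: dict = VALUE_URIS) -> list:
    """Return (subject, predicate, object, object_is_uri) tuples."""
    result = []
    for subject, predicate, value in triples:
        uri = lookup.get((predicate, value))
        if uri is not None:
            result.append((subject, predicate, uri, True))
        else:
            # No controlled-vocabulary match: keep the value as a literal.
            result.append((subject, predicate, value, False))
    return result


sample = [
    ("mlx:54321", "isbd:P1018", "2004"),
    ("mlx:54321", "isbd:P1001", "Text"),
]
print(replace_values(sample))
```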
![Page 55: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/55.jpg)
9. Publish triples (linked data)
mlx:54321 | isbd:P1014 | “Museum archives: an introduction”
mlx:54321 | rdarole:author | viaf:31899419/#Wythe,+Deborah
mlx:54321 | isbd:P1018 | “2004”
mlx:54321 | dct:subject | lcsh:/sh85088707#concept
mlx:54321 | isbd:P1003 | isbdmt:T1002
mlx:54321 | isbd:P1001 | isbdcf:T1009
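The slides do not say which serialisation is used for publishing. As one possibility, the sketch below writes triples in the line-based N-Triples style, which requires full URIs rather than prefixes. Only the title and date triples are shown, because their namespaces ("mlx" and "isbd") are spelled out in the presentation; the function names are hypothetical.

```python
PREFIXES = {
    "mlx": "http://MyLibraryX.com/",
    "isbd": "http://iflastandards.info/ns/isbd/elements/",
}


def expand(qname: str) -> str:
    """Expand prefix:suffix to a full URI using the table above."""
    prefix, _, suffix = qname.partition(":")
    return PREFIXES[prefix] + suffix


def escape(text: str) -> str:
    """Escape backslashes and double quotes inside a literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(triples: list) -> str:
    """Serialise (subject, predicate, object, object_is_uri) tuples, one per line."""
    lines = []
    for subject, predicate, obj, obj_is_uri in triples:
        obj_text = f"<{expand(obj)}>" if obj_is_uri else f'"{escape(obj)}"'
        lines.append(f"<{expand(subject)}> <{expand(predicate)}> {obj_text} .")
    return "\n".join(lines)


triples = [
    ("mlx:54321", "isbd:P1014", "Museum archives: an introduction", False),
    ("mlx:54321", "isbd:P1018", "2004", False),
]
print(to_ntriples(triples))
```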
![Page 56: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/56.jpg)
Linked data chains
mlx:54321 | dct:subject | lcsh:/sh85088707#concept
lcsh:/sh85088707#concept | skos:related | rameau:XXX
rameau:XXX | frbrer:isSubjectOf | mly:98765
rameau:XXX | skos:prefLabel | “archives du musée”
mly:98765 | rda:titleOfTheWork | “Managing archives in museums”
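The chain above can be followed mechanically: the object of one triple is matched against the subject of the next. A minimal sketch over the five triples on this slide, with `rameau:XXX` kept as the slide's own placeholder and `follow` as a hypothetical helper name:

```python
triples = [
    ("mlx:54321", "dct:subject", "lcsh:/sh85088707#concept"),
    ("lcsh:/sh85088707#concept", "skos:related", "rameau:XXX"),
    ("rameau:XXX", "frbrer:isSubjectOf", "mly:98765"),
    ("rameau:XXX", "skos:prefLabel", "archives du musée"),
    ("mly:98765", "rda:titleOfTheWork", "Managing archives in museums"),
]


def follow(start: str, path: list, triples: list) -> set:
    """Walk a sequence of predicates from start; return every node reached."""
    current = {start}
    for predicate in path:
        current = {o for s, p, o in triples if s in current and p == predicate}
    return current


path = ["dct:subject", "skos:related", "frbrer:isSubjectOf", "rda:titleOfTheWork"]
print(follow("mlx:54321", path, triples))
```

Starting from one library's record, the walk ends at the title of a work held by a different library, which is the point of the chain.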
![Page 57: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/57.jpg)
Linked data cluster = “record”
mlx:54321 | isbd:P1014 | “Museum archives: an introduction”
mlx:54321 | rdarole:author | viaf:31899419/#Wythe,+Deborah
mlx:54321 | isbd:P1018 | “2004”
mlx:54321 | dct:subject | lcsh:/sh85088707#concept
mlx:54321 | isbd:P1003 | isbdmt:T1002
mlx:54321 | isbd:P1001 | isbdcf:T1009
![Page 58: Gordon Dunsire](https://reader035.vdocuments.us/reader035/viewer/2022081513/56813ec7550346895da92e66/html5/thumbnails/58.jpg)
Metadata focus
Shift of focus of metadata creation, maintenance, storage, preservation (by professionals, amateurs, machines)
From Record
To Statement(s) = triple(s)
But metadata display ...
... aggregates triples (from multiple sources) to create records on the fly
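Building a record on the fly can be sketched as gathering every triple about one subject from several sources and grouping the objects by predicate. The two source lists and the `build_record` name below are hypothetical; the second source repeats one statement to show that duplicates collapse.

```python
from collections import defaultdict


def build_record(subject: str, *sources) -> dict:
    """Group all objects stated about subject, by predicate, across the sources."""
    record = defaultdict(list)
    for source in sources:
        for s, p, o in source:
            if s == subject and o not in record[p]:
                record[p].append(o)
    return dict(record)


library_triples = [
    ("mlx:54321", "isbd:P1018", "2004"),
    ("mlx:54321", "dct:subject", "lcsh:/sh85088707#concept"),
]
other_triples = [
    ("mlx:54321", "isbd:P1018", "2004"),  # duplicate statement from another source
    ("mly:98765", "rda:titleOfTheWork", "Managing archives in museums"),
]

print(build_record("mlx:54321", library_triples, other_triples))
```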