cigs lod bnb_cd_20131118

Publishing the British National Bibliography as Linked Open Data Corine Deliot Metadata Standards Analyst British Library CIGS Linked Open Data Seminar Edinburgh, 18 November 2013

Upload: cigscotland

Post on 10-May-2015

DESCRIPTION

Publishing the British National Bibliography as Linked Open Data / Corine Deliot, British Library. Presented at Linked Open Data: current practice in libraries and archives (Cataloguing & Indexing Group in Scotland 3rd Linked Open Data Conference), Edinburgh, 18 Nov 2013

TRANSCRIPT

Page 1: Cigs lod bnb_cd_20131118

Publishing the British National Bibliography as Linked Open Data

Corine Deliot

Metadata Standards Analyst

British Library

CIGS Linked Open Data Seminar

Edinburgh, 18 November 2013

Page 2: Cigs lod bnb_cd_20131118

www.bl.uk 2

Presentation overview

• Motivations and approach

• The modelling process and the data model

• Technical process: from MARC 21 to RDF

• Linking to external datasets

• Outcomes – datasets/platform/access

• Plans for future developments

Page 3: Cigs lod bnb_cd_20131118

Motivations

• Publishing our data for others to re-use

• Looking beyond library audiences

• Taking part in the Linked Data conversation

Page 4: Cigs lod bnb_cd_20131118

How?

• Pragmatic, bottom-up approach

• Using existing staff

• Building on existing skills

• Using existing tools as much as possible

Page 5: Cigs lod bnb_cd_20131118

Why BNB?

• General bibliography - not a unique institutional catalogue

• Consistent format - over 60 years

• Size & range of content - 3 million records on all subjects in many languages

• Control of metadata – publishable as CC0.

Page 6: Cigs lod bnb_cd_20131118

The modelling process (I)

• Identify our objects of interest, i.e. what the MARC record says about “things in the world”

e.g. bibliographic resources, people, organizations, places, subjects, etc.

• Assign URIs to identify these objects of interest

URI pattern guidance from the UK Cabinet Office: “Designing URI Sets for the UK Public Sector”

Page 7: Cigs lod bnb_cd_20131118

URI patterns

• http://bnb.data.bl.uk/id/resource/{control-number}

• http://bnb.data.bl.uk/id/resource/{BNB-number}

• http://bnb.data.bl.uk/id/person/{person-name}

• http://bnb.data.bl.uk/id/organization/{organization-name}

• http://bnb.data.bl.uk/id/concept/lcsh/{topic}

• http://bnb.data.bl.uk/id/concept/ddc/{edition-number}/{dewey-number}
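A minimal sketch of how identifiers could be minted from the patterns above. The base URI and path segments come from the slide; the function names and the name-normalisation rule in `slugify` are assumptions made for illustration and are not the British Library's actual rules.

```python
import re
import unicodedata

BASE = "http://bnb.data.bl.uk/id"  # base URI taken from the slide


def slugify(name: str) -> str:
    """Hypothetical normalisation: fold to ASCII and drop non-alphanumerics."""
    folded = unicodedata.normalize("NFKD", name)
    ascii_only = folded.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^A-Za-z0-9]+", "", ascii_only)


def resource_uri(control_number: str) -> str:
    """Fill the /resource/{control-number} pattern."""
    return f"{BASE}/resource/{control_number}"


def person_uri(person_name: str) -> str:
    """Fill the /person/{person-name} pattern using the assumed slug rule."""
    return f"{BASE}/person/{slugify(person_name)}"


def ddc_uri(edition_number: str, dewey_number: str) -> str:
    """Fill the /concept/ddc/{edition-number}/{dewey-number} pattern."""
    return f"{BASE}/concept/ddc/{edition_number}/{dewey_number}"
```

Keeping each pattern in its own small function makes it easy to swap in whatever normalisation rule the real conversion uses.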

Page 8: Cigs lod bnb_cd_20131118

The modelling process (II)

Describe these objects of interest and how they relate to each other.

Use classes and properties from existing RDF vocabularies

Define our own classes and properties when required; documented in the British Library Terms RDF schema

Page 9: Cigs lod bnb_cd_20131118

RDF Vocabularies

• Bibliographic Ontology

• Bio: a Vocabulary for Biographical Information

• British Library Terms

• Dublin Core

• Event Ontology

• FOAF: Friend of a Friend

• ISBD

• Org: an Organisation Ontology

• OWL

• RDA

• RDF

• RDF Schema

• SKOS

• WGS84 Geo Positioning

Page 10: Cigs lod bnb_cd_20131118

The British Library Terms RDF Schema

blt=“http://www.bl.uk/schemas/bibliographic/blterms#”

• Existing property not quite right (e.g. not granular enough)

e.g. dcterms:identifier vs blt:bnb

Page 11: Cigs lod bnb_cd_20131118

The British Library Terms RDF Schema

blt=“http://www.bl.uk/schemas/bibliographic/blterms#”

Property or class required by specific feature of the model

e.g. blt:publication and blt:PublicationEvent (rdfs:subClassOf event:Event)

Page 12: Cigs lod bnb_cd_20131118

The British Library Terms RDF Schema

blt=“http://www.bl.uk/schemas/bibliographic/blterms#”

For pragmatic reasons, e.g. facilitate searching, inferencing and navigating through the graph

e.g. blt:TopicLCSH and blt:TopicDDC

e.g. blt:hasCreated owl:inverseOf dcterms:creator

Page 13: Cigs lod bnb_cd_20131118

The BNB data model - Books

http://www.bl.uk/bibliographic/pdfs/bldatamodelbook.pdf

Page 14: Cigs lod bnb_cd_20131118

Data Model Features (I): the Bibliographic Resource

Page 15: Cigs lod bnb_cd_20131118

Data Model Features (II): Publication as an event

Usual approach:

• <BibResource> dcterms:publisher <Publisher> .

<BibResource> dcterms:issued “Date” .

<BibResource> ? “Place” .

or

<BibResource> ? <Place> .

Event-based approach:

• <BibResource> blt:publication <PublicationEvent> .

<PublicationEvent> event:place <Place> .

<PublicationEvent> event:agent <Publisher> .

<PublicationEvent> event:time <Year> .
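The event-based pattern on this slide can be sketched as plain (subject, predicate, object) tuples serialised to N-Triples. This is an illustration only: the `blt` namespace is the one given in the slides, the Dublin Core and Event Ontology namespaces are the published ones, and the function names and every `example.org` URI are hypothetical.

```python
EVENT = "http://purl.org/NET/c4dm/event.owl#"  # Event Ontology namespace
BLT = "http://www.bl.uk/schemas/bibliographic/blterms#"  # from the slides


def publication_event_triples(resource, event, place, publisher, year):
    """Return the four triples of the event-based publication pattern."""
    return [
        (resource, BLT + "publication", event),
        (event, EVENT + "place", place),
        (event, EVENT + "agent", publisher),
        (event, EVENT + "time", year),
    ]


def to_ntriples(triples):
    """Serialise URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Hypothetical URIs, used only to show the shape of the output.
triples = publication_event_triples(
    "http://example.org/resource/1",
    "http://example.org/resource/1/publicationevent/a",
    "http://example.org/place/London",
    "http://example.org/organization/ExamplePress",
    "http://example.org/year/2013",
)
print(to_ntriples(triples))
```

Because place, publisher and year all hang off the event node, a resource with several publication statements keeps each place paired with the right publisher and date.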

Page 16: Cigs lod bnb_cd_20131118

Data model features (III)

• Birth and death are modelled as biographical events

• Extensive use of foaf:focus to relate “things in the world” (e.g. people, organizations, places) to their SKOS concepts.

e.g. “Paris”, the capital of France as a single “thing in the world” may be the “focus” of multiple concepts belonging to different concept schemes, e.g. thesauri (LCSH, Rameau, etc.)

<Concept> foaf:focus <Thing in the World>

http://efoundations.typepad.com/efoundations/2011/09/things-their-conceptualisations-skos-foaffocus-modelling-choices.html by Pete Johnston
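To illustrate the foaf:focus pattern, the sketch below groups concepts by the single "thing in the world" they point at. The property URI is the published FOAF one; the function name and the `example.org` concept and place URIs are hypothetical stand-ins, not identifiers from the BNB data.

```python
from collections import defaultdict

FOAF_FOCUS = "http://xmlns.com/foaf/0.1/focus"


def concepts_by_focus(triples):
    """Map each real-world thing to the concepts that have it as their focus."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        if predicate == FOAF_FOCUS:
            index[obj].append(subject)
    return dict(index)


# Hypothetical data: two concepts from different schemes, one shared focus.
paris = "http://example.org/place/Paris"
data = [
    ("http://example.org/concept/lcsh/Paris", FOAF_FOCUS, paris),
    ("http://example.org/concept/rameau/Paris", FOAF_FOCUS, paris),
]
grouped = concepts_by_focus(data)
```

Here `grouped[paris]` holds both concept URIs, which is the "one thing, many conceptualisations" relationship the slide describes.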

Page 17: Cigs lod bnb_cd_20131118

MARC to RDF Conversion Workflow

Workflow diagram:

• Full BNB MARC21 file

• MARC pre-processing: select records; convert to pre-composed UTF-8; normalise for improved matching & transforms; create BL URIs and add external URIs by matching

• Transform to RDF/XML using XSLT

• BNB RDF/XML file

• Load to Linked Data Platform

• Generate RDF triple dump

• Load to BL Downloads page

Process

• Selection

• Character set conversion

• Pre-processing

• URI generation

• Data transformation

• Create & load triples

Tools

• Catalogue Bridge Utilities

• MARC Global/MARC Report http://www.marcofquality.com/

• Jena Eyeball http://jena.sourceforge.net/Eyeball/
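The character set conversion and normalisation steps can be illustrated in Python, which is not the toolchain named on the slide. Converting to pre-composed form corresponds to Unicode NFC normalisation; the `match_key` rule (case-folding and collapsing punctuation) is an assumed example of "normalise for improved matching", not the rule actually applied to BNB.

```python
import unicodedata


def to_precomposed(text: str) -> str:
    """Combine base letters and combining marks into pre-composed characters (NFC)."""
    return unicodedata.normalize("NFC", text)


def match_key(heading: str) -> str:
    """Hypothetical matching key: NFC, case-folded, punctuation collapsed to spaces."""
    text = to_precomposed(heading).casefold()
    text = "".join(ch if ch.isalnum() else " " for ch in text)
    return " ".join(text.split())
```

Two headings that differ only in punctuation, spacing or decomposed diacritics then produce the same key, which is what makes text matching against external datasets more reliable.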

Page 18: Cigs lod bnb_cd_20131118

Linking to external sources (I)

To give our data broader context we linked to:

• General resources: GeoNames, Lexvo, RDF Book Mashup

• Library resources: LCSH, VIAF, Dewey.info, MARC language and country codes

Page 19: Cigs lod bnb_cd_20131118

Linking to external sources (II)

Techniques included:

• Automatic generation from record data

• Auto text match with linked data dumps

• Crosswalk matching for coded data
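As a sketch of the first and third techniques, the code below generates a link directly from a MARC language code and uses a small crosswalk table to reach a second dataset. The URI shapes follow the public id.loc.gov and Lexvo patterns; which target URIs the BNB conversion actually emits, the three-entry table and the function name are assumptions.

```python
# Tiny illustrative crosswalk from MARC language codes to ISO 639-3 codes.
MARC_TO_ISO639_3 = {"eng": "eng", "fre": "fra", "ger": "deu"}


def language_links(marc_code: str) -> dict:
    """Build external links for a MARC language code."""
    # Automatic generation: the code in the record is the final URI segment.
    links = {"marc": f"http://id.loc.gov/vocabulary/languages/{marc_code}"}
    # Crosswalk matching: translate the code before building the second URI.
    iso_code = MARC_TO_ISO639_3.get(marc_code)
    if iso_code is not None:
        links["lexvo"] = f"http://lexvo.org/id/iso639-3/{iso_code}"
    return links
```

Codes missing from the crosswalk simply get no Lexvo link, so an incomplete table degrades gracefully instead of producing wrong links.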

Page 20: Cigs lod bnb_cd_20131118

Outcomes

• Two datasets – Books and Serials, accessible at:

• BNB Linked data platform: http://bnb.data.bl.uk

• SPARQL endpoint: http://bnb.data.bl.uk/sparql

• SPARQL editor: http://bnb.data.bl.uk/flint

• Bulk downloads: http://www.bl.uk/bibliographic/download.html

Updated monthly

Serializations available: RDF/XML, N-Triples
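A hedged sketch of how a client might address the SPARQL endpoint: it only builds the request URL, using the standard SPARQL protocol `query` parameter, and does not send it. The endpoint address is the one on the slide (as of 2013) and may no longer respond; the function names and the example resource identifier are hypothetical.

```python
from urllib.parse import urlencode

ENDPOINT = "http://bnb.data.bl.uk/sparql"  # address from the slide


def describe_query(resource_uri: str, limit: int = 10) -> str:
    """Build a query listing the properties and values of one resource."""
    return f"SELECT ?p ?o WHERE {{ <{resource_uri}> ?p ?o }} LIMIT {limit}"


def request_url(query: str) -> str:
    """Encode a query as a SPARQL protocol GET request URL."""
    return ENDPOINT + "?" + urlencode({"query": query})


# Hypothetical control number, used only to show the URL shape.
url = request_url(describe_query("http://bnb.data.bl.uk/id/resource/000000000"))
```

The resulting URL could be fetched with any HTTP client; the query deliberately makes no assumption about which properties the dataset uses.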

Page 21: Cigs lod bnb_cd_20131118

http://bnb.data.bl.uk

Page 22: Cigs lod bnb_cd_20131118

Page 23: Cigs lod bnb_cd_20131118

http://bnb.data.bl.uk/flint

Page 24: Cigs lod bnb_cd_20131118

http://www.bl.uk/bibliographic/download.html

Page 25: Cigs lod bnb_cd_20131118

Platform change

• 2011 – initial Talis platform

• 2013 – data migration to TSO platform http://www.tso.co.uk/our-expertise/technology/openup-platform

Tendering process

Migration of data and services over a couple of months

Page 26: Cigs lod bnb_cd_20131118

Plans for Future Developments

• Refine and extend the model

• Investigate FRBR-ization

• Link to other external sources: GeoNames at city level; ISNI, LC/NACO, DBpedia; DNB bibliographic resources

• Expand scope beyond current BNB

• Improve developer support

Page 27: Cigs lod bnb_cd_20131118

For further information http://www.bl.uk/bibliographic/datafree.html

Thank you.

Questions?

[email protected]

http://twitter.com/#!/BLMetadata