
“Days of Knowledge Organization.” Oslo and Akershus University, Department of Archivistics, Library and Information Science. 30 May 2016

Seeding the Linked Data Cloud: The present and future of library identifiers

Carol Jean Godby, Senior Research Scientist, OCLC Research

Founded in 1967 as the Ohio College Library Center

16,957 members worldwide

1,200+ staff

18 offices in 10 countries

OCLC headquarters in Dublin, OH, USA

365 million records

2.3 billion holdings

46 million digital items

17 million eBooks

*As of February 26, 2016

A web of documents; a web of data

Albert Einstein (Person)

Relativity: The Special and General Theory (Work)

Physics (Concept)

Relationships: author, about

Entities and relationships

https://www.wikidata.org/wiki/Q937

http://viaf.org/viaf/75121530/

Wikidata and the Virtual International Authority File (VIAF)

http://experiment.worldcat.org/entity/work/data/369081611

WorldCat Works

http://id.loc.gov/authorities/subjects/sh85101653.html

Library of Congress Subject Headings

author

about

described in entity hubs and “linked”
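The entity-and-relationship picture above can be sketched as a small set of subject-predicate-object statements. This is only an illustration: the URIs are copied from the slides and paired with the entities as the slides appear to pair them, while the plain-string predicates "author" and "about" and the helper `objects_of` are assumptions made for the example, not part of any published vocabulary or API.

```python
# URIs are copied verbatim from the slides. The predicate names are the
# slide's relationship labels used as plain strings (assumption: a real
# dataset would use full predicate URIs from a published vocabulary).
EINSTEIN = "http://viaf.org/viaf/75121530/"
RELATIVITY = "http://experiment.worldcat.org/entity/work/data/369081611"
PHYSICS = "http://id.loc.gov/authorities/subjects/sh85101653.html"

# Each statement is a (subject, predicate, object) triple.
TRIPLES = [
    (RELATIVITY, "author", EINSTEIN),
    (RELATIVITY, "about", PHYSICS),
]


def objects_of(subject, predicate):
    """Return every object linked to `subject` by `predicate` (hypothetical helper)."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


print(objects_of(RELATIVITY, "author"))
print(objects_of(RELATIVITY, "about"))
```

Because every node is an identifier rather than a text string, a second dataset that uses the same URI for the person can be merged simply by adding its triples to the list.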

URI: persistent, globally unique, identifies a ‘Thing’

URL: Web accessible document

ID: Database record

The evolution of identifiers

Database record ID: Library of Congress control number. 78078534 is the source of the heading “Hemingway, Ernest, 1899-1961”

URL: Web document for the LC Name Authority record: https://lccn.loc.gov/n78078534

URI: http://id.loc.gov/authorities/names/n78078534/. Refers to the “concept” Ernest Hemingway.

URI: http://id.loc.gov/rwo/agents/n78078534. Refers to the “person” Ernest Hemingway.

An example: Ernest Hemingway
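The Hemingway example can be restated as a minimal sketch that derives the three identifier forms from one control number. The URL and URI patterns are copied from the strings shown on the slide; the class `IdentifierForms` and the function `identifier_forms` are hypothetical names introduced for illustration, and the sketch makes no claim that these patterns hold for every Library of Congress record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentifierForms:
    record_id: str    # database record ID
    url: str          # Web document describing the authority record
    concept_uri: str  # URI for the "concept"
    person_uri: str   # URI for the real-world "person"


def identifier_forms(lccn: str) -> IdentifierForms:
    """Build the forms shown on the slide (patterns assumed from that one example)."""
    return IdentifierForms(
        record_id=lccn,
        url=f"https://lccn.loc.gov/{lccn}",
        concept_uri=f"http://id.loc.gov/authorities/names/{lccn}/",
        person_uri=f"http://id.loc.gov/rwo/agents/{lccn}",
    )


hemingway = identifier_forms("n78078534")
print(hemingway.url)
print(hemingway.concept_uri)
print(hemingway.person_uri)
```

The point of the sketch is that the same record number yields identifiers for different things: a record, a document about the record, a concept, and a person.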

MORE CONTEXT: OCLC’S 2015 INTERNATIONAL LINKED DATA SURVEY

SOURCE: KAREN SMITH-YOSHIMURA

Linked Data Survey Respondents: geographic breakdown of 90 responding institutions

20 countries represented

[Bar chart, axis 0 to 45. Countries listed: USA, Spain, UK, The Netherlands, Norway, Canada, Australia, France, Germany, Italy, Switzerland, Austria, Czech Republic, Hungary, Ireland, Japan, Malaysia, Portugal, Singapore, Sweden.]

2015 responding institutions by type

[Pie chart. Legend: Academic library, National library, Network, Government, Scholarly, Public Library, Museum, Other. Slice labels as extracted: 31%, 20%, 14%, 10%, 8%, 7%, 4%, 6%.]

What is published as linked data

[Bar chart, axis 0 to 60. Categories: Authority files, Bibliographic data, Data about museum objects, Datasets, Descriptive metadata, Digital collections, Encoded archival descriptions, Geographic data, Ontologies/vocabularies, Other.]

Linked data resources most consumed

[Bar chart. Resources listed: VIAF, DBpedia, GeoNames, id.loc.gov, “Resources we convert to linked data ourselves”, Getty's Art and Architecture Thesaurus, FAST (Faceted Application of Subject Terminology), WorldCat.org, data.bnf.fr, Deutsche National Bib Linked Data Service, http://bnb.data.bl.uk.]

PUBLISHING LINKED DATA IDENTIFIERS: LESSONS FROM OCLC’S EXPERIENCE

Benefits for data publishers

[Diagram axes: “Conformance to linked data principles” (horizontal) and “Perceived benefits” (vertical).]

• Data is easier to manage.
• Data is broadly understandable.
• The cost of description can be shared.
• Data is easier to integrate.

CONVERTING LEGACY DESCRIPTIONS

Objective: publishing and maintaining persistent identifiers (URIs).

Goals

• Format conversion; one-to-one mapping.

Outcomes

• A low-cost start
• A technical proof of concept
• A test of current ontologies
• A [small] break from the past

[Diagram axes: “Conformance to linked data principles” (horizontal) and “Perceived benefits” (vertical).]

Authority record

A MARC record and three RDF descriptions

British Library Data Model    Schema.org             BibFrame
bibo:BibliographicResource    schema:CreativeWork    bf:Instance
dc:title                      schema:name            bf:Title
dcterms:language              schema:inLanguage      bf:language
dc:creator                    schema:creator         bf:creator
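The one-to-one mapping in the comparison above can be sketched as a lookup table. The term pairings are taken from the slide's rows as extracted; the names `VOCABULARIES`, `ROWS`, and `translate` are hypothetical and exist only for this illustration, which is not a complete or authoritative crosswalk between the three models.

```python
from typing import Optional

# Column order follows the slide: British Library Data Model, Schema.org, BibFrame.
VOCABULARIES = ("british_library", "schema_org", "bibframe")

# Each row lists the corresponding term in each vocabulary (from the slide).
ROWS = [
    ("bibo:BibliographicResource", "schema:CreativeWork", "bf:Instance"),
    ("dc:title", "schema:name", "bf:Title"),
    ("dcterms:language", "schema:inLanguage", "bf:language"),
    ("dc:creator", "schema:creator", "bf:creator"),
]


def translate(term: str, source: str, target: str) -> Optional[str]:
    """Map `term` from one vocabulary to another; None if the slide has no row for it."""
    s = VOCABULARIES.index(source)
    t = VOCABULARIES.index(target)
    for row in ROWS:
        if row[s] == term:
            return row[t]
    return None


print(translate("dc:title", "british_library", "schema_org"))
print(translate("bf:creator", "bibframe", "british_library"))
```

A table like this only converts formats. It does not decide whether a creator string such as a name heading refers to a known person, which is the harder problem the later slides address.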

Schema

BIBFRAME

AV model and ontology

Search engine discovery

OPAC discovery

Curation

The 2016 Library of Congress audiovisual study

MARC, FOAF, Product Types Ontology

MARC, FRBR, RDA, Schema, FOAF, Dublin Core

MARC, RDA, PREMIS, FOAF

What we have to get right

Defining the right Things

Mapping strings to “Things”

Breaking away from legacy

Solving the essential problem

BUILDING ENTITY HUBS

Objective: resolving URIs to the same entity.

Goals

• Aggregating evidence.
• Establishing real-world references.
• Evaluating quality and truthfulness of source data.

Outcomes

• A knowledge store or vault about important entities or ‘Things’
• A resource that can be integrated outside its original creation context
• A radical break with the past

[Diagram axes: “Conformance to linked data principles” (horizontal) and “Perceived benefits” (vertical).]
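Resolving several URIs to the same entity can be sketched with a union-find structure that merges identifiers whenever an equivalence is asserted. This is a minimal illustration and not a description of how VIAF or any OCLC service is built: the class `EntityHub` and its methods are hypothetical, the local URI under `example.org` is invented, and the equivalence between the Wikidata and VIAF URIs shown earlier is assumed for the example.

```python
from collections import defaultdict


class EntityHub:
    """Groups URIs that have been asserted to identify the same entity."""

    def __init__(self):
        self._parent = {}

    def _find(self, uri):
        # Register unseen URIs as their own cluster, then walk to the root,
        # shortening the path along the way (path halving).
        self._parent.setdefault(uri, uri)
        while self._parent[uri] != uri:
            self._parent[uri] = self._parent[self._parent[uri]]
            uri = self._parent[uri]
        return uri

    def assert_same(self, a, b):
        """Record evidence that URIs `a` and `b` refer to one entity."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_entity(self, a, b):
        return self._find(a) == self._find(b)

    def clusters(self):
        groups = defaultdict(set)
        for uri in list(self._parent):
            groups[self._find(uri)].add(uri)
        return list(groups.values())


WIKIDATA = "https://www.wikidata.org/wiki/Q937"
VIAF = "http://viaf.org/viaf/75121530/"
LOCAL = "http://example.org/local/person/42"  # hypothetical uncontrolled local URI
OTHER = "http://id.loc.gov/rwo/agents/n78078534"  # a different person

hub = EntityHub()
hub.assert_same(WIKIDATA, VIAF)
hub.assert_same(VIAF, LOCAL)

print(hub.same_entity(WIKIDATA, LOCAL))
print(hub.same_entity(WIKIDATA, OTHER))
```

The structure records only that identifiers were merged. Deciding whether an assertion deserves to be merged is the "evaluating quality and truthfulness of source data" step, which this sketch does not attempt.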

WorldCat Linked Data for “A Farewell to Arms Control”

http://experiment.worldcat.org/entity/nnnnnnn#Topic

United States

Anti-Missile Missiles

Nuclear Weapons

Military Defences

Nuclear Disarmament

Arms Limitation

VIAF: An entity hub

http://viaf.org/viaf/89803084

OCLC’s published identifiers

WorldCat Catalog

WorldCat Works

FAST

VIAF

ISNI

SOME LESSONS AND NEXT STEPS

Beyond legacies

[Diagram: a ‘Person’ entity hub linking name forms from several sources. Sources: Name Authority File 1, Name Authority File 2, uncontrolled local URIs, Wikidata, DBpedia, Oxford Biography Index. Name forms shown: Janet A. Smith, Janet B. A. Smith, Janet B. Adam Smith, Janet Adam Smith.]

Some obvious gaps

• Defining creator roles beyond the published monograph.
• Tracking creators throughout their careers.
• Respecting their privacy.
• Tracking pseudonyms, collective names, and personas.
• Linking to 3rd-party datasets.
• Connecting creators to works.
• Defining the model of “format” that users understand. Graphic Novel. BluRay. Virtual Reality.
• Delivering the objects that users ask for.
• Identifying the simplest possible model of ‘work’ that cuts across all formats and genres.

work, place, person, event, concept, organization

Aspirations

In the linked data paradigm, authority control is more important than ever.


Together we make breakthroughs possible.

Takk!


Carol Jean Godby, godby@oclc.org

References

• Godby, Carol Jean, Shenghui Wang and Jeffrey K. Mixter. 2015. Library Linked Data in the Cloud: OCLC’s Experiments with New Models of Resource Description. Morgan & Claypool.
• Lyons, B. and Van Malssen, K. 2016. “BIBFRAME AV Assessment: Technical, Structural, and Preservation Metadata.” https://www.loc.gov/bibframe/docs/pdf/bf-avtechstudy-01-04-2016.pdf
• Smith-Yoshimura, Karen. 2016. “Linked Data Implementations—Who, What and Why?” CNI Spring Membership Meeting, 4 April 2016, San Antonio, Texas (USA).
• Smith-Yoshimura, Karen, et al. 2016. “Addressing the Challenges with Organizational Identifiers and ISNI.” Dublin, Ohio: OCLC Research. http://www.oclc.org/content/dam/research/publications/2016/oclcresearch-organizational-identifiers-and-isni-2016.pdf
• Smith-Yoshimura, Karen, et al. 2014. “Registering Researchers in Authority Files.” Dublin, Ohio: OCLC Research. http://www.oclc.org/content/dam/research/publications/library/2014/oclcresearchregistering-researchers-2014.pdf
