
Information—Integration—Intelligence Solutions

Vocabulary management, SKOS, and the semantic web

Bob DuCharme

2011 Semantic Web Technology Conference

June 7, 2011

© Copyright 2007-2011 TopQuadrant Inc. Slide 2

Introductions

Presentation and all its URLs:

http://www.snee.com/semtech/2011

Me: Solutions Architect at TopQuadrant; formerly XML, SGML guy at RIA, Moody’s, LexisNexis, Innodata Isogen

Weblog: http://www.snee.com/bobdc.blog

Slide 3

Out soon

Slide 4

Simple Knowledge Organization System

SKOS: the W3C’s OWL ontology for creating thesauruses, taxonomies, and controlled vocabularies.

Slide 5

Terminology

Controlled vocabulary

Taxonomy

Thesaurus

Ontology

Slide 6

Controlled vocabularies

yes, no

Mr., Ms., Mrs., Miss, Dr.

AL, AK, AR, CA, CO, CT, ... WI, WY

Slide 7

Terminology

Controlled vocabulary

Taxonomy

Thesaurus

Ontology

Slide 8

Taxonomies

Mammal
    Dog
        Bulldog
        Collie
    Horse
    Cat

Above: subset-of relationship.

Alternatives: part-of, instance-of.

metadata!
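
The subset-of hierarchy on this slide can be sketched as a child-to-parent map. This is only an illustration, not part of the original deck; the names `BROADER`, `ancestors`, and `is_subset_of` are made up for the example.

```python
# Hypothetical child -> parent map for the taxonomy on this slide.
BROADER = {
    "Dog": "Mammal",
    "Horse": "Mammal",
    "Cat": "Mammal",
    "Bulldog": "Dog",
    "Collie": "Dog",
}


def ancestors(term):
    """Return the chain of broader terms, nearest first."""
    chain = []
    while term in BROADER:
        term = BROADER[term]
        chain.append(term)
    return chain


def is_subset_of(narrower, broader):
    """True if `broader` appears anywhere above `narrower`."""
    return broader in ancestors(narrower)


print(ancestors("Bulldog"))  # ['Dog', 'Mammal']
```

Storing only the parent link keeps the data small; everything else (all ancestors, subset tests) is derived by walking upward.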

Slide 9

Taxonomies: business value

Slide 10

Taxonomies: business value

Slide 11

Terminology

Controlled vocabulary

Taxonomy

Thesaurus

Ontology

Slide 12

Thesaurus

Mammal
    Dog (use for: mutt, cur)
        Bulldog
        Collie
    Horse
    Cat

Building
    House
    Residential
    Commercial
    Doghouse

Related term
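
A thesaurus adds two things to a taxonomy: "use for" entries that redirect non-preferred terms to a preferred one, and related-term links across branches. The sketch below is a rough illustration with invented names (`USE_FOR`, `RELATED`, `preferred_term`, `related_terms`). It also assumes that the slide's "related term" link connects Dog and Doghouse, which the extracted text does not state explicitly.

```python
# Hypothetical thesaurus entries; "use for" lists the non-preferred
# terms that should be redirected to the preferred term.
USE_FOR = {"Dog": ["mutt", "cur"]}

# Assumption: the "related term" link on the slide joins Dog and Doghouse.
RELATED = {("Dog", "Doghouse")}


def preferred_term(term):
    """Map an entry term to its preferred term (case-insensitive)."""
    wanted = term.lower()
    for preferred, variants in USE_FOR.items():
        if wanted == preferred.lower():
            return preferred
        if wanted in (v.lower() for v in variants):
            return preferred
    return None


def related_terms(term):
    """Related-term links are symmetric, so check both directions."""
    found = set()
    for a, b in RELATED:
        if a == term:
            found.add(b)
        elif b == term:
            found.add(a)
    return found


print(preferred_term("mutt"))     # Dog
print(related_terms("Doghouse"))  # {'Dog'}
```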

Slide 13

Terminology

Controlled vocabulary

Taxonomy

Thesaurus

Ontology

Slide 14

Ontology

Define terms and structures

And relationships: requires, contraindicates…

Classes of things to keep track of, potential properties they might have

OWL: W3C RDF-based standard for specifying ontologies

Slide 15

Simple Knowledge Organization System

Controlled vocabulary

Taxonomy

Thesaurus

Ontology

SKOS: the W3C’s OWL ontology for creating thesauri, taxonomies, and controlled vocabularies.

Slide 16

Managing terms?

mammal

dog cat gato

bulldog dachshund

perro

chien

mutt

chucho

Slide 17

No, managing concepts

preferred label (English): "dog"
preferred label (Spanish): "perro"
preferred label (French): "chien"
alternative label (English): "mutt"
alternative label (Spanish): "chucho"

preferred label (English): "mammal"

preferred label (English): "cat"
preferred label (Spanish): "gato"

preferred label (English): "dachshund"

preferred label (English): "bulldog"
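
The shift from managing terms to managing concepts can be sketched as one record per concept, with labels hanging off it by language. This is a minimal illustration and not the SKOS data model itself; the `Concept` class, its field names, and the example URI are assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One concept with per-language labels (illustrative only)."""
    uri: str
    pref_labels: dict = field(default_factory=dict)  # lang -> one label
    alt_labels: dict = field(default_factory=dict)   # lang -> list of labels

    def label(self, lang, fallback="en"):
        """Preferred label in `lang`, else the fallback language."""
        return self.pref_labels.get(lang, self.pref_labels.get(fallback))

    def matches(self, text):
        """True if `text` is any label of this concept, in any language."""
        everything = list(self.pref_labels.values())
        for labels in self.alt_labels.values():
            everything.extend(labels)
        return text in everything


# The URI is made up for illustration.
dog = Concept(
    uri="http://example.com/animals/dog",
    pref_labels={"en": "dog", "es": "perro", "fr": "chien"},
    alt_labels={"en": ["mutt"], "es": ["chucho"]},
)

print(dog.label("es"))        # perro
print(dog.matches("chucho"))  # True
```

Because every label points at the same concept, a search for "chucho" and a search for "dog" land on the same record.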

Slide 18

More metadata for concepts

Standardized

– Definition

– Related terms

– History note

– More…

Unstandardized

– Whatever you want!

Slide 19

More metadata: example

http://myCompany.com/animals/c43209101
preferred label (English): "dog"
preferred label (Spanish): "perro"
preferred label (French): "chien"
alternative label (English): "mutt"
alternative label (Spanish): "chucho"
history note: "Edited by Jack on 5/4/11"
related term: http://myCompany.com/shelters/c3048293
product: http://myCompany.com/vaccines/c2197503
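
In RDF terms, each line of this example is a triple whose subject is the concept's URI. The sketch below writes the example out as N-Triples using the standard SKOS properties (`skos:prefLabel`, `skos:altLabel`, `skos:historyNote`, `skos:related`). The `MY` namespace for the custom "product" property and the helper `to_ntriples` are assumptions for illustration, not something the slides define.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
MY = "http://myCompany.com/ns#"  # hypothetical namespace for custom properties

dog = "http://myCompany.com/animals/c43209101"

# (subject, predicate, object); a literal object is a (text, language) pair,
# a resource object is a plain URI string.
triples = [
    (dog, SKOS + "prefLabel", ("dog", "en")),
    (dog, SKOS + "prefLabel", ("perro", "es")),
    (dog, SKOS + "prefLabel", ("chien", "fr")),
    (dog, SKOS + "altLabel", ("mutt", "en")),
    (dog, SKOS + "altLabel", ("chucho", "es")),
    (dog, SKOS + "historyNote", ("Edited by Jack on 5/4/11", None)),
    (dog, SKOS + "related", "http://myCompany.com/shelters/c3048293"),
    (dog, MY + "product", "http://myCompany.com/vaccines/c2197503"),
]


def to_ntriples(s, p, o):
    """Serialize one triple as a line of N-Triples."""
    if isinstance(o, tuple):
        text, lang = o
        # Escape backslashes and double quotes inside the literal.
        text = text.replace("\\", "\\\\").replace('"', '\\"')
        obj = '"%s"' % text + ("@" + lang if lang else "")
    else:
        obj = "<%s>" % o
    return "<%s> <%s> %s ." % (s, p, obj)


for triple in triples:
    print(to_ntriples(*triple))
```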

Slide 20

Who is using SKOS?

AGROVOC

New York Times: People, Organizations, Locations, Subject Descriptors

Library of Congress subject headings

AGFA drug admin. forms

NASA: many categories

Slide 21

Other taxonomy standards

ANSI/NISO Z39.19-2005 Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies

ISO 2788:1986 Guidelines for the Establishment and Development of Monolingual Thesauri

ISO 5964:1985 Guidelines for the Establishment and Development of Multilingual Thesauri

ISO 25964 (combines 2788 and 5964)

Zthes

MADS/RDF (“Metadata Authority Description Schema in RDF”), Library of Congress

Slide 22

What is SKOS-XL?

Concept
http://myCompany.com/animals/c43209101
preferred label (English): "dog"
preferred label (Spanish): "perro"
preferred label (French): "chien"
alternative label (English): "mutt"
alternative label (Spanish): "chucho"
history note: "Edited by Jack on 5/4/11"
related term: http://myCompany.com/shelters/c3048293
product: http://myCompany.com/vaccines/c2197503

Date of last edit?

Who edited it?

Slide 23

SKOS-XL: SKOS Extension for Labels

Concept
http://myCompany.com/animals/c43209101
preferred label: http://myCompany.com/animals/c4329201
preferred label: http://myCompany.com/animals/c4329202
alternative label: http://myCompany.com/animals/c4329205
history note: "Edited by Jack on 5/4/11"
related term: http://myCompany.com/shelters/c3048293
product: http://myCompany.com/vaccinations/c2197503

Label
http://myCompany.com/animals/c4329201
literal form: "dog" (English)

Label
http://myCompany.com/animals/c4329205
literal form: "chucho" (Spanish)
last edited: "2011-05-13"
editor: "Mary Jones"
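
With SKOS-XL, a label becomes a resource with its own URI, so metadata such as an editor or edit date can attach to the label rather than to the concept. The sketch below models the two labels spelled out on this slide as triples, using the real SKOS-XL properties (`skosxl:prefLabel`, `skosxl:altLabel`, `skosxl:literalForm`). The `MY` namespace, the `lastEdited` and `editor` property names, and the helper functions are assumptions for illustration. The second preferred label (c4329202) is left out because the slide does not show its literal form.

```python
SKOSXL = "http://www.w3.org/2008/05/skos-xl#"
MY = "http://myCompany.com/ns#"  # hypothetical namespace for custom properties

concept = "http://myCompany.com/animals/c43209101"
label_dog = "http://myCompany.com/animals/c4329201"
label_chucho = "http://myCompany.com/animals/c4329205"

# Literal objects are (text, language) pairs; resources are URI strings.
triples = [
    (concept, SKOSXL + "prefLabel", label_dog),
    (concept, SKOSXL + "altLabel", label_chucho),
    (label_dog, SKOSXL + "literalForm", ("dog", "en")),
    (label_chucho, SKOSXL + "literalForm", ("chucho", "es")),
    (label_chucho, MY + "lastEdited", ("2011-05-13", None)),
    (label_chucho, MY + "editor", ("Mary Jones", None)),
]


def objects(subject, predicate):
    """All objects of triples matching the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def label_report(concept_uri):
    """Map each label's (text, language) to its editor, if recorded."""
    report = {}
    for prop in ("prefLabel", "altLabel"):
        for label_uri in objects(concept_uri, SKOSXL + prop):
            literal = objects(label_uri, SKOSXL + "literalForm")[0]
            editor = objects(label_uri, MY + "editor")
            report[literal] = editor[0][0] if editor else None
    return report


print(label_report(concept))
```

The extra hop from concept to label resource is the cost of SKOS-XL; the benefit is that "who edited this label, and when" has somewhere to live.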

Slide 24

Your own metadata

http://myCompany.com/animals/c43209101
preferred label (English): "dog"
preferred label (Spanish): "perro"
preferred label (French): "chien"
alternative label (English): "mutt"
alternative label (Spanish): "chucho"
history note: "Edited by Jack on 5/4/11"
related term: http://myCompany.com/shelters/c3048293
product: http://myCompany.com/vaccinations/c2197503
foo code: "5L-MN1-003"

Slide 25

Networked vocabularies: why

Centralized master vocabulary system?

Ad hoc, distributed, disconnected

Slide 26

Enterprise Vocabulary Management: dream vs. reality

When I use a word, it means

just what I choose it to mean —

neither more nor less.

- Humpty Dumpty in Lewis Carroll’s Through the Looking-Glass

Slide 27

Enterprise Vocabulary Management: dream vs. reality

Repair Department: “Customer”

Marketing Department: “Customer”

Slide 28

Enterprise Vocabulary Management: dream vs. reality

Repair Department: “Customer”

Marketing Department: “Customer”

?

Slide 29

Networking distributed vocabularies

Concept
http://myCompany.com/vocab/repairs/c8927432
preferred label: "customer"
broader term: http://myCompany.com/vocab/repairs/c93402934

Concept
http://myCompany.com/vocab/marketing/c93402934
preferred label: "customer"

Slide 30

Networking distributed vocabularies

Concept
http://myCompany.com/vocab/repairs/c8927432
preferred label: "customer"
broader term: http://myCompany.com/vocab/repairs/c93402934

Concept
http://myCompany.com/vocab/marketing/c93402934
preferred label: "customer"

Repairs department and marketing department vocabularies can be stored in the same datastore on the same server… or in different datastores on different servers, managed with different products.
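
Because every concept is identified by a URI, a link from one vocabulary can be followed into another no matter where each is stored. The sketch below uses two Python dictionaries as stand-ins for two separate datastores. It assumes the broader-term link is meant to point at the marketing department's concept (the slide text shows a `repairs` path with the same identifier), and the function names `resolve` and `broader_concept` are invented for the example.

```python
REPAIRS_CUSTOMER = "http://myCompany.com/vocab/repairs/c8927432"
MARKETING_CUSTOMER = "http://myCompany.com/vocab/marketing/c93402934"

# Two vocabularies, standing in for two separate datastores.
repairs = {
    REPAIRS_CUSTOMER: {
        "prefLabel": "customer",
        # Assumption: the broader term is the marketing concept.
        "broader": MARKETING_CUSTOMER,
    },
}
marketing = {
    MARKETING_CUSTOMER: {"prefLabel": "customer"},
}


def resolve(uri, *vocabularies):
    """Look a URI up in each vocabulary in turn."""
    for vocab in vocabularies:
        if uri in vocab:
            return vocab[uri]
    return None


def broader_concept(uri, *vocabularies):
    """Follow a concept's broader link, possibly into another vocabulary."""
    concept = resolve(uri, *vocabularies)
    if concept is None or "broader" not in concept:
        return None
    return resolve(concept["broader"], *vocabularies)


# With both vocabularies available the link resolves...
print(broader_concept(REPAIRS_CUSTOMER, repairs, marketing))
# ...with only the repairs vocabulary it does not.
print(broader_concept(REPAIRS_CUSTOMER, repairs))
```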

Slide 31

SKOS tools

TopQuadrant's Enterprise Vocabulary Net (EVN)

PoolParty

SKOSed Protégé plugin

iQvoc

TemaTres

Slide 32

Other semweb tech to use

Data conversion tools

Ontology editors

SPARQL

– Query, update, transform

– SPARQL Rules

Linked Data

OWL

Slide 33

SPARQL Rules and AGROVOC

SKOS spec lists six rules not implemented by the ontology

e.g. the same term can’t be both a preferred and an alternative term in the same language

Expressed with SPARQL Rules

Violated by AGROVOC over 1600 times

e.g. Slovak word for “Buds” (“púčiky”)
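
The example rule (a term cannot be both preferred and alternative for the same concept in the same language) is easy to state as a set intersection. The slides express it with SPARQL Rules; the sketch below is a plain Python stand-in for the same check, not the SPARQL used in the talk. The data is invented for illustration and is not taken from AGROVOC, and the function name `label_clashes` is an assumption.

```python
# Each concept maps to its preferred and alternative labels,
# written as (text, language) pairs. Invented data, not AGROVOC.
concepts = {
    "http://example.com/c1": {
        "pref": [("dog", "en"), ("perro", "es")],
        "alt": [("mutt", "en"), ("perro", "es")],  # "perro" clashes
    },
    "http://example.com/c2": {
        "pref": [("cat", "en")],
        "alt": [("cat", "es")],  # same text, different language: allowed
    },
}


def label_clashes(vocabulary):
    """List (uri, language, text) for labels that are both preferred
    and alternative on the same concept in the same language."""
    clashes = []
    for uri, labels in vocabulary.items():
        pref = set(labels.get("pref", []))
        alt = set(labels.get("alt", []))
        for text, lang in sorted(pref & alt):
            clashes.append((uri, lang, text))
    return clashes


print(label_clashes(concepts))
# [('http://example.com/c1', 'es', 'perro')]
```

Comparing (text, language) pairs rather than bare strings is what keeps the "cat" example from being flagged.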

Slide 34

Getting started

Look at existing SKOS thesauri

Small pilot project…

…or projects.

Existing thesaurus editor vendor SKOS support?

Free tools, commercial tools, trial software

Slide 35

Thank you

Questions?
