UN UNBIS-AGROVOC 2010-09-03
AGROVOC: FAO’s multilingual thesaurus as a building block for linked open data
Gudrun JOHANNSEN 1, Ahsan MORSHED 1, Sachit RAJBHANDARI 1, Armando STELLATO 3, Thomas BAKER 2, Margherita SINI 1 and Johannes KEIZER 1
1 FAO of the UN, Italy; 2 W3C SKOS working group; 3 Università di Roma “Tor Vergata”
Do we need Thesauri?
Yes! And why:
dr johannes keizer - FAO of the United Nations - knowledge and capacity for development
UN, DPI UNBIS Thesaurus Team, New York, 2010-09-03
Thesauri in the past and now
• Born as tools to ensure consistency in the indexing of library collections
• Thesauri were based on “terms”, but terms already represented concepts implicitly
• Hierarchical and associative relationships represented generic ontological domain knowledge
• Candidate building blocks for the Semantic Web
The Linked Data Universe: http://www.linkeddata.org
• The Semantic Web isn't just about putting data on the web. It is about making links, so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related data.
• Like the web of hypertext, the web of data is constructed with documents on the web. However, unlike the web of hypertext, where links are relationship anchors in hypertext documents written in HTML, for data they are links between arbitrary things described by RDF. The URIs identify any kind of object or concept. But for HTML or RDF, the same expectations apply to make the web grow:
• Use URIs as names for things.
• Use HTTP URIs so that people can look up those names.
• When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
• Include links to other URIs, so that they can discover more things.
• Simple. In fact, though, a surprising amount of data isn't linked in 2006, because of problems with one or more of the steps. This article discusses solutions to these problems, details of implementation, and factors affecting choices about how you publish your data.
http://www.w3.org/DesignIssues/LinkedData.html
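The four rules above can be sketched in a few lines of Python. This is only an illustration: the dictionary WEB stands in for HTTP lookups, and every URI, predicate name and function in it is hypothetical rather than part of AGROVOC or any W3C specification.

```python
from collections import deque

# Stand-in for the web (assumption): looking up a URI returns a list of
# (predicate, object) pairs describing that resource.
WEB = {
    "http://example.org/doc/1": [
        ("dct:subject", "http://example.org/concept/maize"),
    ],
    "http://example.org/concept/maize": [
        ("skos:prefLabel", "maize"),
        ("skos:broader", "http://example.org/concept/cereals"),
    ],
    "http://example.org/concept/cereals": [
        ("skos:prefLabel", "cereals"),
    ],
}


def look_up(uri):
    """Rule 3: looking up a URI yields useful information about it."""
    return WEB.get(uri, [])


def discover(start):
    """Rule 4: follow links to other URIs, breadth first, to find related things."""
    seen = [start]
    queue = deque([start])
    while queue:
        uri = queue.popleft()
        for _predicate, obj in look_up(uri):
            # Only objects that are themselves HTTP URIs can be followed.
            if obj.startswith("http://") and obj not in seen:
                seen.append(obj)
                queue.append(obj)
    return seen


if __name__ == "__main__":
    for uri in discover("http://example.org/doc/1"):
        print(uri)
```

Starting from a single document, the traversal reaches the concept it is indexed with and then that concept's broader concept, which is the "find other, related data" behaviour the quotation describes.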
• http://www.w3.org/2007/Talks/0221-Bangalore-IH/
RDF as a common format for merging data
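Why RDF works as a common format for merging can be sketched as follows, under simplifying assumptions: each dataset is modelled as a plain Python set of (subject, predicate, object) tuples, blank nodes are ignored, and all document and concept URIs are hypothetical.

```python
# Predicate URIs from Dublin Core Terms and SKOS; the data itself is made up.
DCT_SUBJECT = "http://purl.org/dc/terms/subject"
DCT_TITLE = "http://purl.org/dc/terms/title"
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"

# Hypothetical shared concept URI used by both datasets.
CONCEPT = "http://example.org/vocab/c_maize"

library_catalogue = {
    ("http://example.org/docs/report-17", DCT_SUBJECT, CONCEPT),
    ("http://example.org/docs/report-17", DCT_TITLE, "Maize yields in 2009"),
}

statistics_db = {
    ("http://example.org/stats/series-4", DCT_SUBJECT, CONCEPT),
    (CONCEPT, SKOS_PREF_LABEL, "maize"),
}


def merge(*graphs):
    """Merging graphs of triples is a set union; no schema mapping is needed."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def resources_about(graph, concept):
    """All subjects indexed with the given concept, whatever their origin."""
    return sorted(s for s, p, o in graph if p == DCT_SUBJECT and o == concept)


if __name__ == "__main__":
    merged = merge(library_catalogue, statistics_db)
    print(resources_about(merged, CONCEPT))
```

Because both datasets name the concept with the same URI, a single query over the merged set returns resources from both sources.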
Finding things related to “genes” across databases
Source: Joanne Luciano, Mitre, and the W3C HCLS IG
Linking DataSets
• Have you seen the Thomson Reuters OpenCalais service?
• Standard Vocabularies can become the glue between different data sets
• (everything linked through http://aims.fao.org/aos/agrovoc?c_2367#concept)
• Our goal: OpenAgro!
AgroTagger
• Name/identity determination in unstructured texts
• Uses AGROVOC as a controlled vocabulary
• Produces structured RDF files that can be used to link data
• Developed by IIT Kanpur for AgroPedia Indica
• Prototype under testing, with excellent results
http://agropedia.iitk.ac.in/auto_tagger/callable_auto_tagger.php
AGROVOC: From a traditional thesaurus to an Agricultural Concept Scheme
A long and Winding Road
…from thesaurus to ontologies…
AGROVOC since 2004
• We wanted to turn AGROVOC into a Semantic Web tool
• There was no standard model available; we were on our own
• We spent three years discussing the model to which we wanted to convert AGROVOC
…but in the meantime AGROVOC was used and translated into 20 languages
Semantic Relationships
Concept to Concept
isA (hierarchy), isPestOf, hasPest
Concept to Term
has_lexicalization (links concepts to their lexical realizations)
Term to Term
isSynonymOf, isTranslationOf, hasAcronym, hasAbbreviation
Term to String
hasSpellingVariant, hasSingular
The AGROVOC OWL model
More Semantics
Traditional thesaurus:
MAIZE
  UF corn
  NT flint maize
  NT popcorn
  NT sweet corn
MILK
  NT Milk Fat
  NT Colostrum
  NT Cow Milk
International Fund for Agricultural Development
  UF IFAD
With more semantics:
MAIZE
  synonym corn
  superclass-of flint maize
  used-to-make popcorn
  hybridized-into sweet corn
MILK
  ingredient Milk Fat
  ingredient Colostrum
  superclass-of Cow Milk
International Fund for Agricultural Development
  acronym IFAD
AGROVOC today
• Around 30,000 concepts
• 600,000 labels in around 20 languages
• A one-stop shop for terminological knowledge related to agriculture in general
• A knowledge base of related concepts organized in ontological relationships (hierarchical, associative, equivalence)
• A concept/term/string-based system
• Concepts may be organized in multiple categories
Evaluation of the Process
• The AGROVOC OWL model needs a revision
  • Ontological overcommitment in many class/subclass hierarchies
  • Unnecessary complications through the concept/term/string hierarchy
• Our push of AGROVOC to the Semantic Web had enormous positive effects, among others:
  • From 4 to 20 language versions
  • De facto standard for indexing in many areas
  • More than 2,000 downloads in 2009 alone
• SKOS incorporated all our requirements
http://www.w3.org/2004/02/skos/
The AGROVOC SKOS Model
[Figure: concepts 8171, 1474, 12332 and 6211 are instances (rdf:type) of SKOSConcept and are connected by skos:broader. They are tied to AgrovocConceptScheme, an instance (rdf:type) of SKOSConceptScheme, through skos:inScheme and skos:topConceptOf. Two instances of SKOS Label, :foo and :bar, are attached through skosxl:prefLabel and skosxl:altLabel and carry the literal forms “corn” and “maize” through skosxl:literalForm.]
SKOS and SKOS-XL presentation
[Figure: two presentations of the concept ex:FAO.
Ex: SKOS presentation: ex:FAO is linked by skos:prefLabel and skos:altLabel directly to the plain literals “FAO”@en and “Food and Agriculture Organization”@en.
Ex: SKOS-XL presentation: ex:FAO is linked by skosxl:prefLabel and skosxl:altLabel to two skosxl:Label resources (ex: full form, ex: acronym form), each of which carries its text, “Food and Agriculture Organization”@en or “FAO”@en, through skosxl:literalForm.]
SKOS-XL output
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/agrovocScheme">
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#ConceptScheme"/>
</rdf:Description>
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/c_330829">
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
  <skos:inScheme rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
  <skos:topConceptOf rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
</rdf:Description>
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/xl_en_1278479064610">
  <literalForm xmlns="http://www.w3.org/2008/05/skos-xl#" xml:lang="en">subjects</literalForm>
  <rdf:type rdf:resource="http://www.w3.org/2008/05/skos-xl#Label"/>
</rdf:Description>
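A fragment like the one above can be read with Python's standard library. The following is a minimal sketch, not the Concept Server's own export or import code: the enclosing rdf:RDF root element and its namespace declarations are assumed (the slide shows only the inner descriptions), and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOSXL = "http://www.w3.org/2008/05/skos-xl#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The slide's fragment wrapped in an assumed rdf:RDF root element.
DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/agrovocScheme">
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#ConceptScheme"/>
</rdf:Description>
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/c_330829">
  <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
  <skos:inScheme rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
  <skos:topConceptOf rdf:resource="http://aims.fao.org/aos/agrovoc/agrovocScheme"/>
</rdf:Description>
<rdf:Description rdf:about="http://aims.fao.org/aos/agrovoc/xl_en_1278479064610">
  <literalForm xmlns="http://www.w3.org/2008/05/skos-xl#" xml:lang="en">subjects</literalForm>
  <rdf:type rdf:resource="http://www.w3.org/2008/05/skos-xl#Label"/>
</rdf:Description>
</rdf:RDF>"""


def types_by_resource(xml_text):
    """Map each described resource URI to the list of its rdf:type URIs."""
    root = ET.fromstring(xml_text)
    result = {}
    for desc in root.findall("{%s}Description" % RDF):
        about = desc.get("{%s}about" % RDF)
        for type_el in desc.findall("{%s}type" % RDF):
            result.setdefault(about, []).append(type_el.get("{%s}resource" % RDF))
    return result


def literal_forms(xml_text):
    """Return (text, language) for every skosxl:literalForm in the document."""
    root = ET.fromstring(xml_text)
    return [(el.text, el.get(XML_LANG)) for el in root.iter("{%s}literalForm" % SKOSXL)]


if __name__ == "__main__":
    for uri, types in types_by_resource(DOC).items():
        print(uri, types)
    print(literal_forms(DOC))
```

For anything beyond a quick inspection, a dedicated RDF library is the more robust choice, since RDF/XML allows many equivalent serializations that this element-by-element reading does not cover.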
SKOS and SKOS-XL presentation
• The SKOS presentation suffices where no relationship between lexical units needs to be stated.
• The SKOS-XL presentation states the relationships between lexical units explicitly, so that they can be understood clearly. This property can be compared with an OWL inverse property.
• SKOS-XL defines labels as resources, as with concepts, schemes and collections. A label is a special type of lexical entity that is assigned a literal string, which can be repeated across several units. prefLabel and altLabel can therefore be distinguished easily in multilingual thesauri.
• SKOS-XL offers more flexible semantics than SKOS for modeling vocabularies, especially multilingual thesauri like AGROVOC.
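The difference between the two presentations can be sketched as a pair of data structures. This is an illustration under stated assumptions: the class names, the ex: identifiers and the links dictionary are hypothetical; the relation name hasAcronym and the IFAD example are taken from the earlier slides.

```python
from dataclasses import dataclass, field


@dataclass
class PlainConcept:
    """Plain SKOS: labels are literals, so nothing can be stated about them."""
    uri: str
    pref_label: str
    alt_labels: list


@dataclass
class XLLabel:
    """SKOS-XL: a label is a resource with its own URI and a literal form."""
    uri: str
    literal_form: str
    lang: str
    # Relation name -> list of URIs of related labels (hypothetical structure).
    links: dict = field(default_factory=dict)


@dataclass
class XLConcept:
    uri: str
    pref_label: XLLabel
    alt_labels: list


def acronyms_of(concept):
    """Literal forms of the alternative labels that the preferred label
    points to through the 'hasAcronym' relation."""
    targets = set(concept.pref_label.links.get("hasAcronym", []))
    return [label.literal_form for label in concept.alt_labels if label.uri in targets]


# Plain SKOS: both strings are attached, but their relationship is lost.
plain = PlainConcept(
    "ex:IFAD",
    "International Fund for Agricultural Development",
    ["IFAD"],
)

# SKOS-XL: the labels are resources, so one can point to the other.
full = XLLabel("ex:label_full", "International Fund for Agricultural Development", "en")
short = XLLabel("ex:label_acronym", "IFAD", "en")
full.links["hasAcronym"] = [short.uri]
xl = XLConcept("ex:IFAD", full, [short])

if __name__ == "__main__":
    print(acronyms_of(xl))
```

In the plain structure there is nowhere to record that "IFAD" is the acronym of the full name; in the XL structure that statement is just another link between two label resources.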
The Conceptserver Workbench
The workbench
• A web-based working environment for managing the AGROVOC Concept Server
• Facilitates the collaborative editing of multilingual terminology and semantic concept information
• Includes administration and group management features
• Includes workflows for maintenance, validation and quality assurance of the data pool
• The Concept Server (CS) is freely accessible to everybody, to facilitate collaborative editing
Architecture of the System
Concept/Term Management
Concept Relationship
• Concept-to-concept relationships can be created
• The inverse relationship is created automatically
• Ex: If we create “A affects B”, then the relationship “B is affected by A” is also created
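The automatic creation of inverse relationships can be sketched as follows. This is a minimal illustration, not the workbench's actual implementation: the ConceptGraph class and the INVERSE table are hypothetical, and the relation names are adapted from the examples in these slides (affects / is affected by, hasPest / isPestOf).

```python
# Hypothetical table of relationship names and their inverses.
INVERSE = {
    "affects": "isAffectedBy",
    "isAffectedBy": "affects",
    "hasPest": "isPestOf",
    "isPestOf": "hasPest",
}


class ConceptGraph:
    """Stores concept-to-concept relationships as (subject, relation, object)."""

    def __init__(self):
        self.triples = set()

    def relate(self, subject, relation, obj):
        """Add a relationship and, if one is defined, its inverse as well."""
        self.triples.add((subject, relation, obj))
        inverse = INVERSE.get(relation)
        if inverse is not None:
            self.triples.add((obj, inverse, subject))


if __name__ == "__main__":
    graph = ConceptGraph()
    graph.relate("A", "affects", "B")
    for triple in sorted(graph.triples):
        print(triple)
```

Keeping both directions in the store means an editor only ever enters one statement, while queries from either concept find the relationship.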
Concept Image
• Name of the image with description
• URL will point to the image which will open in an external
• Provide the source of the image
• Translations in other languages can be added
Concept Definition
• Add definition to the selected concept
• Add translation in different languages
• Provide the source of the definition
• Creation and modification dates are set automatically
Visualization of concepts and Relationships in the Workbench
AGROVOC Web Services
The road map
• The AGROVOC concept scheme in its proprietary OWL format was published on June 5, 2010 on version 1.0 of the Concept Server Workbench
• At the moment, a patch to version 1.0 of the workbench is being developed to make it possible to export AGROVOC as SKOS
• AGROVOC SKOS will be published as linked data
• With version 2.0 of the workbench, SKOS will become the native format for AGROVOC
Giving it a try…
• A demo version of the AWB, with all functionalities, available to users for testing purposes: http://202.73.13.50:55234/agrovocdevv10d/
• Latest stable release, version 1.0 (read/write): http://202.73.13.50:55381/agrovocv10i/
• Latest stable release, version 1.0 (read only, for visitors with view privileges only): http://202.73.13.50:55481/agrovocv10i/
Possible collaboration
• Getting UN vocabularies as building blocks for the Linked Data Universe
• The UNBIS Thesaurus covers a broad range of international policy and development issues
• AGROVOC is very specialized, but the two partly overlap
• Steps:
  • Promoting the UNBIS Thesaurus and AGROVOC together
  • Mapping project?
…and more: http://aims.fao.org
Thank You!