aligner automatiquement des ontologies avec [email protected] tuesday 23 rd of january, 2007...

20
Aligner automatiquement des ontologies avec [email protected] Tuesday 23 rd of January, 2007 Raphaël Troncy

Upload: harry-manning

Post on 14-Jan-2016

216 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

Aligner automatiquement des ontologies avec

[email protected]

Tuesday 23rd of January, 2007

Raphaël Troncy

Page 2: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

Motivation

Various SW repositories, using different vocabularies, distributed on the web

Already large amounts of data out thereSwoogle hits 107 – 109 unique Semantic

Web documents (16/11/2006) Problem:

How to search and retrieve information in such an environment?

How to map the various vocabularies used ?

Page 3: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

oMAP: Ontology Alignment Tool

TerminologicalClassifiers

Machine Learning-based Classifiers

Structural andSemantics-based

Classifiers

- Formal and open framework- Classifiers customization (parameter, chaining)

subsumption

equivalent

Page 4: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

oMAP: A Formal Framework

Sources of inspiration:Formal work in data exchange [Fagin et al.,

2003]GLUE: combining several specialized

components for finding the best set of mappings [Doan et al., 2003]

Notation:A mapping is a triple: M = (T, S, ∑)

S and T are the source and target ontologiesSi is an OWL entity (class, datatype property,

object property) of the ontology∑ is a set of mapping rules: αij Tj ← Si

Page 5: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

oMAP: Combining Classifiers

Weight of a mapping rule:αij = w (Si,Tj, ∑)

Using different classifiers:w (Si,Tj,CLk) is the classifier's

approximation of the rule Tj ← Si

Combining the approximations:Use of a priority list: CL1 CL2 … CLn Weighted average of the classifiers

prediction

Page 6: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

Terminological Classifiers

Same entity names (or URI)

Same entity name stems

otherwise 0

name, same have , if 1),,(

ji

Nji

TSCLTSw

otherwise 0

stem, same have , if 1),,(

ji

Sji

TSCLTSw

Page 7: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

Terminological Classifiers

String distance name

Iterative substring matching

See [Stoilos et al., ISWC'05]

))(),(max(

),(),,(

ji

jinLevenshteiLDji TlengthSlength

TSdistCLTSw

),(),(),(),,( jijijiISji TSwinklerTSDiffTSCommCLTSw

Page 8: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

Terminological Classifiers

WordNet distance name

lcs is the longest common substring between Si

and Tj

sim =

otherwise

)()(

*2,max

synonyms, are , if 1

),,(

ji

ji

WNji

TlengthSlength

lcssim

TS

CLTSw

)()(

)()(

ji

ji

TsynonymSsynonym

TsynonymSsynonym

Page 9: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

Machine Learning-Based Classifiers

Collecting bag of words:label for the named individualsdata value for the datatype propertiestype for the anonymous individuals and the

range of object properties…

Recursion on the OWL definition:depth parameter

Use statistical methods on the collected bag of words

Page 10: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

Machine Learning-Based Classifiers

ExampleIndividual (x1 type (Workshop)

value (label "DECOR") value (location x2) )

Individual (x2 type (Address)

value (city "Namur") value (country "Belgium") )

u1 = (" DECOR ", "Address")u2 = ("Address", "Namur", "Belgium")

Naïve Bayes text classifier

kNN text classifier

jTux um

iiNBji SmSCLTSw),(

)Pr()Pr(),,(

Page 11: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

Structural and Semantics-Based Classifier

∑ is a set of mapping rules: αij Tj ← Si

∑ sets are computed by taking the OWL definition of the entities to alignrecursively in the OWL structure... without looping thanks to cycles

detection

Page 12: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

Structural and Semantics-Based Classifier If Si and Tj are property names:

If Si and Tj are concept names1:

otherwise ),,('

if 0),,(

ji

ij

ji TSw

STTSw

otherwise ),,(max),,('1)Set(

1

and 0D if ),,('

if 0

),,(

),(

t

setDjCijisetji

ijji

ij

ji

DCwTSw

STTSw

ST

TSw

1 Where D = D(Si) * D(Tj) ; D(Si) represents the set of concepts directly parent of Si

Page 13: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

Structural and Semantics-Based ClassifierLet CS=(QR.C) and DT=(Q’R’.D), then1:

Let CS=(op C1…Cm) and DT=(op’ D1…Dm), then2:

),,(),',()',(),,( DCwRRwQQwDCw QTS

),min(

),,(max

)',(),,(),(

nm

DCw

opopwDCwsetDjCi

jiset

opTS

1 Where Q,Q’ are quantifiers, R,R’ are property names and C,D concept expressions

2 Where op, op’ are concept constructors and n,m ≥ 1

Page 14: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

Evaluation

OAEI Contests (2004, 2005, 2006): http://oaei.ontologymatching.org/Systematic benchmark tests on

bibliographic dataTests 2xx: aligning an ontology with

variations of itself where each OWL constructs are discarded or modified one per one

Tests 3xx: four real bibliographic ontologies Web categories alignmenthttp://oaei.ontologymatching.org/2005/re

sults/

Page 15: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

Benchmark Tests (2005)

dublin20 0.92

Falcon 0.91

FOAM 0.90

oMAP 0.85

CMS 0.81

OLA 0.80

ctxMatch 0.72

edna 0.45

Falcon 0.89

OLA 0.74

dublin20 0.72

FOAM 0.69

oMAP 0.68

edna 0.61

ctxMatch 0.20

CMS 0.18oMAP:

4th with the global F-Measure1st on 3xx tests (real ontologies to align)

Precision Recall

Page 16: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

Aligning Web Categories (2005)Aligning Google, Loksmart and Yahoo

web categories [Avesani et al., ISWC'05]

Blind tests: only recall results are available

ctxMatch

FOAM CMS Dublin20

Falcon OLA oMAP

9.4% 11.9% 14.1% 26.5% 31.2% 32.0% 34.4%

Page 17: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

Conclusion

oMAP: a formal framework for aligning automatically OWL ontologies

Combining several specific classifiersTerminological classifiersMachine learning-based classifiersStructural and semantics-based classifier

Page 18: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

Future Work

oMAPUsing additional classifiers:

KL-distance, other resources, background K, etc.

Straightforward theoretically but practically difficult!

Finding complex alignmentname = firstName + lastName

OWL and rule-based languages:Take into account this additional expressivity

Other KR languages: e.g. SKOS

Page 19: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

http://www.cwi.nl/~troncy/oMAP/

Any questions ?

Page 20: Aligner automatiquement des ontologies avec Raphael.Troncy@cwi.nl Tuesday 23 rd of January, 2007 Rapha ë l Troncy

Structural and Semantics-Based ClassifierPossible values for wop and wQ

weights

wop wQ⊓ ⊔ ¬

⊓ 1 1/4 0

⊔ 1 0

¬ 1

1 1/4

1

n n m

1 1/3

m

1