Xiaomeng Su & Jon Atle Gulla
Dept. of Computer and Information Science
Norwegian University of Science and Technology
Trondheim Norway
June 2004
Semantic Enrichment for Ontology Mapping
NLDB’04 Page 2
Semantic interoperability
The Semantic Web vision
Ontology heterogeneity problem
Comparison of ontologies should be aided by an automatic process
Ontology mapping typically involves identifying correspondences between the source ontologies
Prerequisite
Focusing on light-weight ontologies
The ontologies share the same domain
The same representation language is assumed
Our approach is based on the Referent Modelling Language (RML)
Idea illustration
Enrich each concept with its extension.
Textual documents that belong to a concept are considered its extension.
Information Retrieval techniques are used to compute a representative feature vector from the extension.
When no extension is available, text categorization is used to assign documents to concepts.
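A minimal sketch of turning a concept's documents into feature vectors. The slides only say "Information Retrieval techniques"; the TF-IDF weighting, the tiny stop word list, and the names `tokenize` and `tfidf_vectors` are assumptions made for illustration, not the authors' implementation.

```python
import math
import re
from collections import Counter

# Illustrative stop word list (assumption); a real list would be much longer.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for"}

def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    token_lists = [tokenize(d) for d in documents]
    n_docs = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

# Two hypothetical documents attached to tourism concepts.
docs = [
    "Hotel rooms and hotel prices in Trondheim",
    "Flight prices to Trondheim",
]
vectors = tfidf_vectors(docs)
```

Terms that occur in every document (here "prices" and "trondheim") get weight zero, so the vectors emphasize what distinguishes one concept's documents from another's.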
Architecture
[Figure: system architecture. Labels recoverable from the diagram:
Semantic enrichment: text categorization (Cns + Manual), FVC, WordNet, stop word list, categorization results (XML)
Mapping: Mapper (mapping), Enhancer (post-processing), WordNet
Presentation & Refinement: Presenter
Translation & Storage: Exporter, mapping assertions (XML)
Configuration profile (XML)]
Functional view of the system
Document assignment (optional)
Feature vector construction
  Pre-processing
  Document representation
  Concept vector construction
    leaf node: average of the document vectors
    non-leaf node: weighted sum of the document vectors, sub-concept vectors and related-concept vectors
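The concept vector rules above can be sketched as follows. The weights `w_doc`, `w_sub` and `w_rel`, the choice to average the sub-concept and related-concept vectors before weighting, and all function names are assumptions; the slides do not give the actual weighting scheme.

```python
from collections import defaultdict

def average(vectors):
    """Component-wise mean of sparse vectors (dict: term -> weight)."""
    total = defaultdict(float)
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    n = len(vectors)
    return {term: weight / n for term, weight in total.items()} if n else {}

def weighted_sum(parts):
    """Sum sparse vectors given as (weight, vector) pairs."""
    total = defaultdict(float)
    for factor, vec in parts:
        for term, weight in vec.items():
            total[term] += factor * weight
    return dict(total)

def concept_vector(doc_vectors, sub_vectors=(), related_vectors=(),
                   w_doc=0.6, w_sub=0.3, w_rel=0.1):
    """Leaf concept: mean of its document vectors.
    Non-leaf concept: weighted sum of document, sub-concept and
    related-concept contributions (weights are hypothetical)."""
    doc_part = average(list(doc_vectors))
    if not sub_vectors:  # treated as a leaf node
        return doc_part
    return weighted_sum([
        (w_doc, doc_part),
        (w_sub, average(list(sub_vectors))),
        (w_rel, average(list(related_vectors))),
    ])

# Hypothetical two-level example: "Hotel" is a leaf under "Accommodation".
hotel = concept_vector([{"hotel": 2.0}, {"hotel": 1.0, "room": 1.0}])
accommodation = concept_vector([{"stay": 1.0}], sub_vectors=[hotel])
```

Building vectors bottom-up in this way lets a parent concept inherit vocabulary from its children even when few documents are attached to the parent itself.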
Functional view of the system
Similarity calculation
  The similarity of two concepts: cosine measure
  The similarity of relations: based on domain and range
  The similarity of clusters and of the two ontologies: based on the above two
Post-processing: use WordNet to update the rank of the assertions
Mapping assertion generation and user feedback management
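The cosine measure for concept similarity can be illustrated with a short sketch. The concept names, the vectors, and the helper `rank_candidates` are hypothetical; only the use of the cosine measure comes from the slides.

```python
import math

def cosine(u, v):
    """Cosine similarity of two sparse vectors (dict: term -> weight)."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty vector is similar to nothing
    return dot / (norm_u * norm_v)

def rank_candidates(source, target):
    """Score every source/target concept pair and sort by descending similarity."""
    pairs = [
        (cosine(s_vec, t_vec), s_name, t_name)
        for s_name, s_vec in source.items()
        for t_name, t_vec in target.items()
    ]
    return sorted(pairs, reverse=True)

# Hypothetical concept vectors from two ontologies.
source = {
    "Hotel": {"room": 1.0, "night": 1.0},
    "Flight": {"airline": 1.0, "ticket": 1.0},
}
target = {
    "Lodging": {"room": 2.0, "night": 1.0},
    "AirTravel": {"ticket": 1.0},
}
ranked = rank_candidates(source, target)
```

The top-ranked pairs are the candidate mapping assertions; pairs with no shared terms score zero and fall to the bottom of the list.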
Post-processing
Update the ranks according to concept relatedness in WordNet
We use a simple path length measure
JWNL (Java WordNet Library)
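A sketch of path-length-based re-ranking. It uses a toy hypernym graph instead of real WordNet data, and the relatedness formula `1 / (1 + length)` and the `boost` parameter are assumptions; the slides do not state how the path length is turned into a rank update.

```python
from collections import deque

# Hypothetical is-a edges for illustration only; not real WordNet data.
TOY_HYPERNYMS = {
    "hotel": ["building"],
    "hostel": ["building"],
    "building": ["structure"],
    "ticket": ["document"],
}

def build_graph(hypernyms):
    """Turn child -> parents edges into an undirected adjacency map."""
    graph = {}
    for child, parents in hypernyms.items():
        for parent in parents:
            graph.setdefault(child, set()).add(parent)
            graph.setdefault(parent, set()).add(child)
    return graph

def path_length(graph, start, goal):
    """Shortest number of edges between two terms (BFS), or None if unconnected."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None

def rerank(assertions, graph, boost=0.2):
    """assertions: list of (score, source_term, target_term).
    Adds boost * relatedness to each score and re-sorts (assumed scheme)."""
    updated = []
    for score, src, tgt in assertions:
        dist = path_length(graph, src, tgt)
        relatedness = 0.0 if dist is None else 1.0 / (1 + dist)
        updated.append((score + boost * relatedness, src, tgt))
    return sorted(updated, reverse=True)

graph = build_graph(TOY_HYPERNYMS)
reranked = rerank([(0.50, "hotel", "ticket"), (0.45, "hotel", "hostel")], graph)
```

In this toy example the pair connected by a short path overtakes a pair with a slightly higher initial score, which is the kind of rank update the post-processing step aims for.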
A prototype: iMapper
Validation
To measure the accuracy of the mapping algorithm
Focusing on concepts
Using mappings manually identified by users as gold standards
Measures
  Precision (the fraction of automatically discovered mappings that are correct)
  Recall (the fraction of the correct mappings that have been discovered)
Experiments on two domains
  Product catalogue integration
  Tourism ontology comparison
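The two measures can be computed directly from the definitions above. The mapping pairs in the example are made up for illustration and are not taken from the experiments.

```python
def precision_recall(found, gold):
    """Return (precision, recall) for a set of discovered mappings
    against a manually built gold standard."""
    found, gold = set(found), set(gold)
    correct = found & gold
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical mappings as (source concept, target concept) pairs.
gold = {("Hotel", "Lodging"), ("Flight", "AirTravel"), ("Car", "Vehicle"),
        ("Guide", "TourGuide"), ("Museum", "Gallery")}
found = {("Hotel", "Lodging"), ("Flight", "AirTravel"), ("Car", "Vehicle"),
         ("Hotel", "Gallery")}
precision, recall = precision_recall(found, gold)  # 3 of 4 found are correct; 3 of 5 gold are found
```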
Preliminary results
Measure precision at 11 standard recall levels
Experiments in both domains
The algorithm identified most of the mappings and ranked them in the correct order.
[Figure: precision vs. recall curves for the two tasks (product catalogue domain, tourism domain); x-axis: recall, y-axis: precision]
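Precision at the 11 standard recall levels (0 %, 10 %, ..., 100 %) is commonly computed with interpolation: the precision reported at a level is the highest precision observed at any recall greater than or equal to that level. The sketch below assumes this standard IR convention; the slides do not say which variant was used, and the function name and data are hypothetical.

```python
def eleven_point_precision(ranked, gold):
    """Interpolated precision at recall levels 0.0, 0.1, ..., 1.0
    for a ranked list of mappings and a gold standard set."""
    gold = set(gold)
    points = []  # (number of correct mappings so far, precision at that rank)
    hits = 0
    for rank, item in enumerate(ranked, start=1):
        if item in gold:
            hits += 1
            points.append((hits, hits / rank))
    curve = []
    for level in range(11):  # level / 10 is the recall level
        # Integer comparison hits / len(gold) >= level / 10 avoids float rounding.
        reachable = [p for h, p in points if h * 10 >= level * len(gold)]
        curve.append(max(reachable, default=0.0))
    return curve

# Hypothetical ranked output: correct, incorrect, correct.
curve = eleven_point_precision(["a", "x", "b"], {"a", "b"})
```

Here precision stays at 1.0 up to 50 % recall and drops to 2/3 beyond it, because the second correct mapping is only found at rank 3.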
Preliminary results
Using WordNet to update the rank showed different effects on the two domains
No topic-related semantics in WordNet
[Figure: WordNet effect on tourism domain; precision vs. recall, with and without WordNet]
[Figure: WordNet effect on product domain; precision vs. recall, with and without WordNet]
Evaluation remarks
Encouraging results
Failure analysis
  Quality of the gold standards
  Quality of the feature vector
Further evaluation
  Gold standard
  Sensitivity tests
  Alternative measures
Summing up
An approach to ontology mapping based on the idea of semantic enrichment
The approach has been implemented and evaluated
Explored the possibility of incorporating WordNet into the system
The end...
Thank you!