An Adaptive Machine Learning Framework with User Interaction for Ontology Matching
Hoai-Viet To (1), Ryutaro Ichise (2), and Hoai-Bac Le (1)
(1) Ho Chi Minh University of Science, Vietnam
(2) National Institute of Informatics, Japan
04/10/23 1
Ontology Matching (OM) Problem
• An ontology is a hierarchical structure used to organize concepts.
• Ontologies play an important role in Semantic Web development.
• Ontology matching finds correspondences between concepts from two ontologies.
• Ontology matching is an important process when we want to integrate heterogeneous information sources in the new Semantic Web environment.
Machine Learning Framework for OM
• We introduced a machine learning framework for the ontology matching problem in [Ichise, 2008].
• Our hypothesis: using a semi-supervised learning method will reduce the manual annotation cost.
[Figure: two example ontologies with concepts Ca1, Ca2, Ca3 and Cb1, Cb2, Cb3]

Pre-alignment
• Correct mapping:
  • Ca1 Cb1
  • Ca2 Cb1 …
• Incorrect mapping:
  • Ca1 Cb2
  • Ca1 Cb3 …

ID        Sim1   …   Simn   Class
Ca1 Cb1   0.5    …   0.7    1
Ca1 Cb2   0.3    …   0.56   0
…         …      …   …      …
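
The table above can be read as one feature row per concept pair: the similarity scores are the features and the pre-alignment supplies the class. The following is a minimal sketch of that step, not the authors' implementation. The two measures (`edit_ratio`, `token_jaccard`) and the concept labels are assumptions made for illustration; the framework itself uses the 24 string-based measures from [Ichise, 2008].

```python
from difflib import SequenceMatcher


def edit_ratio(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], computed by difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of the lowercase word sets of two labels."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


# Hypothetical stand-ins for Sim1 ... Simn.
MEASURES = [edit_ratio, token_jaccard]


def build_rows(pre_alignment):
    """Turn (label_a, label_b, is_match) triples into (id, features, class) rows."""
    rows = []
    for label_a, label_b, is_match in pre_alignment:
        features = [measure(label_a, label_b) for measure in MEASURES]
        rows.append(((label_a, label_b), features, 1 if is_match else 0))
    return rows


# Hypothetical concept labels; the slides only name the concepts Ca1, Cb1, ...
pre_alignment = [
    ("Computer Science", "Computer Science", True),   # correct mapping
    ("Computer Science", "Sports", False),            # incorrect mapping
]
rows = build_rows(pre_alignment)
for pair_id, features, label in rows:
    print(pair_id, [round(f, 2) for f in features], label)
```

Each row has the same shape as a line of the table: an ID (the concept pair), one value per similarity measure, and a 0/1 class.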
Semi-supervised Learning with User Interaction
• Basic idea: propagate labels through the unlabeled data.
• Problem: with only a few labeled samples, predictions have low confidence.
[Figure: user interaction; an unlabeled sample marked "?" and the class labels Blue and Red]
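
One plausible reading of this slide is that confident predictions are propagated automatically while uncertain samples are handed to the user. The sketch below illustrates that split only. The function name `route_by_confidence` and the 0.8 threshold are assumptions, not values taken from the slides.

```python
def route_by_confidence(probabilities, threshold=0.8):
    """Split unlabeled samples into auto-labeled and ask-the-user groups.

    probabilities maps a sample id to the predicted probability of class 1.
    threshold is an assumed cutoff, not a value from the slides.
    """
    auto_labeled, ask_user = {}, []
    for sample_id, p_match in probabilities.items():
        confidence = max(p_match, 1.0 - p_match)
        if confidence >= threshold:
            # Confident prediction: propagate the label without asking.
            auto_labeled[sample_id] = 1 if p_match >= 0.5 else 0
        else:
            ask_user.append(sample_id)
    # Least confident samples first, so the user sees the hardest cases.
    ask_user.sort(key=lambda s: max(probabilities[s], 1.0 - probabilities[s]))
    return auto_labeled, ask_user


auto, ask = route_by_confidence({"a": 0.95, "b": 0.55, "c": 0.10, "d": 0.48})
print(auto)  # labels assigned without user input
print(ask)   # sample ids queued for the user
```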
Adaptive Machine Learning Framework
• Use multiple learning strategies + user interaction.
[Figure: framework diagram. Components: Ontology Storage, Ontology Parser, Similarity Calculator, Pre-alignment, LEARNER, User Interaction. Step labels: Initialize, training, labeling.]
Adaptive Machine Learning Framework
• Similarity measures are based on those used in the machine learning framework proposed in [Ichise, 2008], which:
  • include 24 string-based similarity measures;
  • calculate similarity between concept features, concept structure features, and concept hierarchical features.
• Our system: Machine Learning Framework for Ontology Matching with User Interaction (MalfomUI)
Experiments
• Purpose: compare the performance of our learning framework with other matching systems.
• General setting:
  • Dataset from the directory track of the OAEI 2008 campaign [Caracciolo et al., 2008].
  • The dataset is constructed from three Internet directories: Yahoo, Google, and Looksmart.
  • Simple equivalence relation.
  • The dataset includes 4487 labeled matching tasks: 2160 positive samples and 2327 negative samples.
  • Base learner: Naïve Bayes.
Experiments
• Pre-experiment: supervised learning method
  • Used as a baseline for comparison with the semi-supervised learning method.
  • Studies the effect of training-set size on the performance of the supervised learning method.
Experimental Results
• MalUI-5 to MalUI-4000:
[Figure: plot with x-axis "Training set size"]
Experiments
• Experiment: semi-supervised learning with user interaction
  • Studies the performance of the semi-supervised learning method with user interaction.
  • The user annotates 20 samples in the initialization phase and then labels 4 more samples in each of 2 feedback rounds.
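
The protocol above (20 initial labels, then 4 more in each of 2 feedback rounds) can be sketched as a small self-training loop. Everything in this block is an illustrative assumption rather than the MalfomUI implementation: the tiny `GaussianNB` class stands in for the Naïve Bayes base learner, the 0.9 pseudo-label threshold and the least-confident query rule are guesses, and the data is synthetic with a simulated user (`oracle`).

```python
import math
import random


class GaussianNB:
    """Tiny two-class Gaussian Naive Bayes; assumes both classes occur in y."""

    def fit(self, X, y):
        self.stats = {}
        for c in (0, 1):
            rows = [x for x, label in zip(X, y) if label == c]
            params = []
            for col in zip(*rows):
                mean = sum(col) / len(col)
                var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-3
                params.append((mean, var))
            self.stats[c] = (len(rows) / len(X), params)
        return self

    def prob_match(self, x):
        """Return P(class = 1 | x)."""
        log_p = {}
        for c, (prior, params) in self.stats.items():
            total = math.log(prior)
            for v, (mean, var) in zip(x, params):
                total += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
            log_p[c] = total
        top = max(log_p.values())
        e0, e1 = math.exp(log_p[0] - top), math.exp(log_p[1] - top)
        return e1 / (e0 + e1)


def train(X, user_labels, threshold=0.9):
    """Fit on user labels, then refit once with confident pseudo-labels added."""
    idx = sorted(user_labels)
    model = GaussianNB().fit([X[i] for i in idx], [user_labels[i] for i in idx])
    pseudo = {}
    for i, x in enumerate(X):
        if i not in user_labels:
            p = model.prob_match(x)
            if max(p, 1 - p) >= threshold:
                pseudo[i] = int(p >= 0.5)
    merged = {**pseudo, **user_labels}  # user labels override pseudo-labels
    idx = sorted(merged)
    return GaussianNB().fit([X[i] for i in idx], [merged[i] for i in idx])


def run(X, oracle, n_init=20, n_rounds=2, n_per_round=4):
    user_labels = {i: oracle(i) for i in range(n_init)}  # initialization phase
    model = train(X, user_labels)
    for _ in range(n_rounds):  # feedback rounds
        unlabeled = [i for i in range(len(X)) if i not in user_labels]
        # Ask about the samples whose prediction is closest to 0.5.
        unlabeled.sort(key=lambda i: abs(model.prob_match(X[i]) - 0.5))
        for i in unlabeled[:n_per_round]:
            user_labels[i] = oracle(i)
        model = train(X, user_labels)
    return model, user_labels


# Synthetic similarity vectors; labels alternate so both classes are seen early.
rng = random.Random(0)
truth = [i % 2 for i in range(200)]
X = [[rng.gauss(0.7 if t else 0.3, 0.1), rng.gauss(0.6 if t else 0.4, 0.15)] for t in truth]
model, user_labels = run(X, oracle=lambda i: truth[i])
print(len(user_labels))  # number of samples the simulated user labeled
```

With the default parameters the simulated user labels 20 + 4 × 2 = 28 samples, however the model behaves in between.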
Experimental Results
• MalUI-RF: comparison with other matching systems [Caracciolo et al., 2008]
Experimental Results

Measure     MalUI-30   MalUI-100   MalUI-500   MalUI-4000   MalUI-RF
Precision   0.56       0.62        0.68        0.68         0.61
Recall      0.59       0.63        0.74        0.75         0.73
F-Measure   0.58       0.63        0.71        0.71         0.67

• Semi-supervised learning with user feedback can reduce the cost of manual annotation.
* In the MalfomUI-RF experiment, users need to label 28 samples in total.
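
The F-Measure row in the table above can be related to the precision and recall rows with a short calculation. The sketch assumes the slides use the balanced F-measure (F1, the harmonic mean of precision and recall), which the source does not state. Because the table prints values rounded to two decimals, recomputing from them may differ from the reported F-Measure in the last digit.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) as printed in the table.
table = {
    "MalUI-30": (0.56, 0.59, 0.58),
    "MalUI-100": (0.62, 0.63, 0.63),
    "MalUI-500": (0.68, 0.74, 0.71),
    "MalUI-4000": (0.68, 0.75, 0.71),
    "MalUI-RF": (0.61, 0.73, 0.67),
}

recomputed = {name: round(f_measure(p, r), 2) for name, (p, r, _) in table.items()}
for name, (_, _, reported) in table.items():
    print(name, "reported:", reported, "recomputed:", recomputed[name])
```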
Conclusion
• Conclusions:
  • Our adaptive machine learning framework is effective: it lowers annotation cost while achieving performance close to that of the supervised baseline.
  • Machine learning approaches with user interaction are promising for ontology matching systems.
• Future work:
  • Integrate more similarity measures to cover real datasets.
  • Consider more complex semi-supervised models.