Learning Similarity Functions from Qualitative Feedback Weiwei Cheng and Eyke Hüllermeier University of Marburg, Germany


DESCRIPTION

The performance of a case-based reasoning system often depends on the suitability of an underlying similarity (distance) measure, and specifying such a measure by hand can be very difficult. In this paper, we therefore develop a machine learning approach to similarity assessment. More precisely, we propose a method that learns how to combine given local similarity measures into a global one. As training information, the method merely assumes qualitative feedback in the form of similarity comparisons, revealing which of two candidate cases is more similar to a reference case. Experimental results, focusing on the ranking performance of this approach, are very promising and show that good models can be obtained with a reasonable amount of training information. See more at www.chengweiwei.com

TRANSCRIPT

Page 1: Learning similarity functions from qualitative feedback

Learning Similarity Functions from Qualitative Feedback

Weiwei Cheng and Eyke Hüllermeier, University of Marburg, Germany

Page 2: Learning similarity functions from qualitative feedback

Introduction

Proper definition of similarity (distance) measures is crucial for CBR systems.

The specification of local similarity measures, pertaining to individual properties (attributes) of a case, is often less difficult than their combination into a global measure.

Goal of this work: using machine learning techniques to support the elicitation of similarity measures (the combination of local measures into a global one) on the basis of qualitative feedback.

7/31/2009, slide 1/13, Weiwei Cheng & Eyke Hüllermeier

Page 3: Learning similarity functions from qualitative feedback

Problem Setting

Local-global principle: the global distance is an aggregation of local distances.

For now, we focus on a linear model:

\Delta(a, b) = \sum_{i=1}^{d} w_i \, \delta_i(a_i, b_i)

with w_i \ge 0 for all i (monotonicity).

… easy to incorporate background knowledge
… amenable to an efficient learning scheme
… non-linear extension via kernelization
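As a plain illustration of this linear model (function and variable names are my own, not from the paper), the global distance is a non-negative weighted sum of precomputed local distances:

```python
# Minimal sketch of the linear local-global model:
# Delta(a, b) = sum_i w_i * delta_i(a_i, b_i), with w_i >= 0 (monotonicity).

def global_distance(weights, local_distances):
    """Combine precomputed local distances into a global distance."""
    assert all(w >= 0 for w in weights), "monotonicity requires w_i >= 0"
    return sum(w * d for w, d in zip(weights, local_distances))

# three attributes with local distances 0.2, 0.5, 0.1:
# 1.0*0.2 + 0.5*0.5 + 2.0*0.1 = 0.65
print(global_distance([1.0, 0.5, 2.0], [0.2, 0.5, 0.1]))
```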


Page 4: Learning similarity functions from qualitative feedback

Problem Setting cont.

Learning the weights from qualitative feedback: a triple (a, b, c) means "case a is more similar to b than to c", i.e., \Delta(a, b) < \Delta(a, c).

Given a query, a distance measure induces a linear order on the cases of the case base.

Notice: often the ordering of the cases is more important than the distances themselves, so it is sufficient to find a weight vector w such that the induced order is preserved.
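A hedged sketch of the induced ordering; absolute attribute differences stand in for the local measures here, which is an illustrative assumption, not the paper's choice:

```python
def weighted_distance(weights, a, b):
    """Weighted sum of absolute attribute differences (illustrative local measures)."""
    return sum(w * abs(x - y) for w, x, y in zip(weights, a, b))

def rank_cases(weights, query, cases):
    """Order the case base by increasing global distance to the query."""
    return sorted(cases, key=lambda c: weighted_distance(weights, query, c))

cases = [(1.0, 2.0), (0.0, 0.0), (0.9, 2.1)]
# nearest first: (1.0, 2.0), then (0.9, 2.1), then (0.0, 0.0)
print(rank_cases([1.0, 1.0], (1.0, 2.0), cases))
```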


Page 5: Learning similarity functions from qualitative feedback

The Learning Algorithm

Basic idea: From distance learning to classification

Extension 1: Incorporating monotonicity

Extension 2: Ensemble learning

Extension 3: Active learning


Page 6: Learning similarity functions from qualitative feedback

From Distance Learning to Classification


(Figure: feedback triples drawn from the CASE BASE are mapped to d-dimensional vectors of local-distance differences, which serve as classification examples.)
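The reduction can be sketched as follows (helper names are mine): a triple (a, b, c) with "a more similar to b than to c" demands ⟨w, z⟩ < 0 for the componentwise difference z = δ(a, b) − δ(a, c), so z becomes a negative example for a linear classifier through the origin. Absolute differences again stand in for the local measures:

```python
def local_vector(a, b):
    # illustrative assumption: absolute differences as local distances
    return [abs(x - y) for x, y in zip(a, b)]

def triple_to_example(a, b, c):
    """Turn the feedback '(a, b, c): a is closer to b than to c' into a
    binary classification example (z, -1), requiring <w, z> < 0."""
    d_ab, d_ac = local_vector(a, b), local_vector(a, c)
    z = [u - v for u, v in zip(d_ab, d_ac)]
    return z, -1

print(triple_to_example((0.0, 0.0), (1.0, 1.0), (2.0, 3.0)))  # ([-1.0, -2.0], -1)
```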

Page 7: Learning similarity functions from qualitative feedback

Monotonicity

Our model requires that when a local distance increases, the global distance cannot decrease: w_i \ge 0 for all i.

Our approach: (noise-tolerant) perceptron learning with a modified update rule that keeps the weight vector non-negative.

The modified algorithm provably converges after a finite number of iterations.
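The modified update rule is not spelled out in this transcript; a common way to preserve monotonicity, sketched below under that assumption, is to clip negative weights to zero after each ordinary perceptron step:

```python
# Sketch (not the paper's exact rule): perceptron learning on examples (z, y),
# y in {-1, +1}, followed by projection onto the non-negative orthant.

def train_perceptron(examples, dim, epochs=100, lr=1.0):
    w = [0.0] * dim
    for _ in range(epochs):
        mistakes = 0
        for z, y in examples:
            if y * sum(wi * zi for wi, zi in zip(w, z)) <= 0:  # misclassified
                # standard update, then clip to keep w_i >= 0 (monotonicity)
                w = [max(wi + lr * y * zi, 0.0) for wi, zi in zip(w, z)]
                mistakes += 1
        if mistakes == 0:  # all constraints satisfied
            break
    return w

examples = [([-1.0, -2.0], -1), ([2.0, -1.0], 1)]
print(train_perceptron(examples, 2))
```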

Page 8: Learning similarity functions from qualitative feedback

Ensemble Learning


(Figure: perceptrons are trained on permutations of the training data, yielding a committee of hypotheses in the hypothesis space; the centre of mass (CoM) of the version space, the Bayes point, is approximated by this ensemble of perceptrons.)
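The ensemble idea can be sketched as follows (names are illustrative): train several perceptrons on random permutations of the training data and average their weight vectors to approximate the Bayes point:

```python
import random

def ensemble_weights(train_fn, examples, dim, n_members=11, seed=0):
    """Average the solutions of `train_fn` over shuffled copies of the data."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        perm = examples[:]
        rng.shuffle(perm)  # each member sees a different ordering
        members.append(train_fn(perm, dim))
    # centre of mass of the committee
    return [sum(ws) / n_members for ws in zip(*members)]

# usage with a stub learner standing in for the perceptron of the previous slide
stub = lambda ex, d: [1.0] * d
print(ensemble_weights(stub, [([0.0, 0.0], -1)] * 3, 2))  # [1.0, 1.0]
```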

Page 9: Learning similarity functions from qualitative feedback

Active Learning

Goal: reducing the feedback effort of the user by choosing the most informative training data.

Our approach (a variation of QBC):
1. choose the 2 most conflicting models
2. generate 2 rankings with these 2 models
3. get the first conflict pair of these rankings

Example:

ranking 1: a b c d e

ranking 2: a b d e c
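Step 3 above can be sketched as follows (the helper name is mine); in the example, the first pair the two rankings disagree on is (c, d):

```python
def first_conflict(r1, r2):
    """Return the first pair of cases ordered differently by the two rankings."""
    pos2 = {case: i for i, case in enumerate(r2)}
    for i in range(len(r1)):
        for j in range(i + 1, len(r1)):
            if pos2[r1[i]] > pos2[r1[j]]:  # r2 reverses this pair
                return r1[i], r1[j]
    return None  # rankings agree

print(first_conflict(list("abcde"), list("abdec")))  # ('c', 'd')
```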

Page 10: Learning similarity functions from qualitative feedback

Experimental Setting


Goal: investigating the efficacy of our approach and the effectiveness of the extensions:

1. incorporating monotonicity
2. ensemble learning
3. active learning

Data sets:

               uni   iris   wine   yeast    nba
  #features      6      4     13      24     15
  #cases       200    150    178    2465   3924

Page 11: Learning similarity functions from qualitative feedback

Quality Measures


Kendall's tau (a common rank correlation measure) … defined via the number of rank inversions, normalized to [-1, +1]:

\tau = 1 - \frac{4\, d(\pi, \hat{\pi})}{n(n-1)}

where d(\pi, \hat{\pi}) is the number of pairs of cases put in reverse order by the true ranking \pi and the predicted ranking \hat{\pi}.

Recall (a common retrieval measure) … defined as the fraction of the true top-k cases among the predicted top-k (k = 10).

Position error … defined as the position of the true topmost case in the predicted ranking (minus 1).
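The three measures can be sketched as follows, for rankings given as lists of case ids, best first (function names are illustrative):

```python
def kendall_tau(true, pred):
    """1 - 4*inversions / (n*(n-1)), in [-1, +1]."""
    pos = {c: i for i, c in enumerate(pred)}
    n = len(true)
    inv = sum(1 for i in range(n) for j in range(i + 1, n)
              if pos[true[i]] > pos[true[j]])
    return 1.0 - 4.0 * inv / (n * (n - 1))

def recall_at_k(true, pred, k=10):
    """Fraction of the true top-k found among the predicted top-k."""
    return len(set(true[:k]) & set(pred[:k])) / k

def position_error(true, pred):
    """Position of the true topmost case in the prediction, minus 1."""
    return pred.index(true[0])

true, pred = list("abcde"), list("bacde")
print(kendall_tau(true, pred))     # one inversion: 1 - 4*1/20 = 0.8
print(recall_at_k(true, pred, 2))  # {a, b} vs {b, a} -> 1.0
print(position_error(true, pred))  # 'a' predicted at position 2 -> error 1
```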

Page 12: Learning similarity functions from qualitative feedback


Page 13: Learning similarity functions from qualitative feedback

Extension to Nonlinear Models


Actually, we only need linearity in the coefficients, not in the local distances; some generalizations are therefore easily possible, such as replacing the local distances by fixed nonlinear transformations of them.

More generally, with basis functions \phi_j mapping the vector of local distances to the reals:

\Delta(a, b) = \sum_{j=1}^{m} w_j \, \phi_j\big(\delta_1(a, b), \dots, \delta_d(a, b)\big)
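A sketch of the idea, with illustrative basis functions (squares and a pairwise product; these specific choices are my assumption, not the paper's): the model is nonlinear in the local distances but still linear in the coefficients, so the perceptron machinery applies unchanged.

```python
def features(deltas):
    """Fixed nonlinear basis functions phi_j of two local distances."""
    d1, d2 = deltas
    return [d1, d2, d1 * d1, d2 * d2, d1 * d2]

def nonlinear_distance(weights, deltas):
    """Still linear in the weights: Delta = sum_j w_j * phi_j(delta)."""
    return sum(w * f for w, f in zip(weights, features(deltas)))

# 1*0.2 + 1*0.4 + 0.5*0.04 + 0.5*0.16 + 0*0.08 = 0.7
print(nonlinear_distance([1, 1, 0.5, 0.5, 0.0], [0.2, 0.4]))
```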

Page 14: Learning similarity functions from qualitative feedback

Extensions


As a special case, choosing a kernel function leads to kernelization:

Nonlinear classification and sorting

(Figure: a triple (a, b, c) is passed to the classifier, which predicts the order of b and c; a pair (b, c) is passed to the distance model, which returns a numeric distance such as 0.7.)

Page 15: Learning similarity functions from qualitative feedback

Conclusions


Learning to combine local distance measures into a global measure.

Only assuming qualitative feedback of the type “a is more similar to b than to c”.

Reduction of distance learning to classification.