a new instance-based label ranking approach using the mallows model

22
Weiwei Cheng & Eyke Hüllermeier Knowledge Engineering & Bioinformatics Lab Department of Mathematics and Computer Science University of Marburg, Germany

Upload: roywwcheng

Post on 07-Jul-2015

743 views

Category:

Technology


2 download

DESCRIPTION

See more at www.chengweiwei.com

TRANSCRIPT

Page 1: A new instance-based label ranking approach using the Mallows model

Weiwei Cheng & Eyke HüllermeierKnowledge Engineering & Bioinformatics LabDepartment of Mathematics and Computer ScienceUniversity of Marburg, Germany

Page 2: A new instance-based label ranking approach using the Mallows model

Label Ranking (an example)

Learning customers’ preferences on cars:

where the customers are typically described by feature vectors, e.g., (gender, age, place of birth, has child, …)

label ranking  customer 1 MINI      Toyota    BMW 

customer 2 BMW      MINI      Toyota

customer 3 BMW Toyota      MINI

customer 4 Toyota      MINI      BMW

new customer ???

1/16

Page 3: A new instance-based label ranking approach using the Mallows model

Label Ranking (an example)

Learning customers’ preferences on cars:

π(i) = position of the i‐th label in the ranking1: MINI  2: Toyota  3: BMW

MINI Toyota BMWcustomer 1 1 2 3

customer 2 2 3 1

customer 3 3 2 1

customer 4 2 1  3

new customer ? ? ?

2/16

Page 4: A new instance-based label ranking approach using the Mallows model

Label Ranking (more formally)

Given:a set of training instances a set of labelsfor each training instance     : a set of pairwise preferencesof the form

Find:a ranking function (            mapping) that maps each           to a ranking      of L (permutation     )

3/16

Page 5: A new instance-based label ranking approach using the Mallows model

Model‐Based Approaches… essentially reduce label ranking to classification:

Ranking by pairwise comparison (RPC)Fürnkranz and Hüllermeier,  ECML‐03

Constraint classification (CC)Har‐Peled , Roth and Zimak,  NIPS‐03

Log linear models for label ranking (LL)Dekel, Manning and Singer,  NIPS‐03

4/16

Page 6: A new instance-based label ranking approach using the Mallows model

Instance‐Based Approach (this work)

Lazy learning: Instead of eagerly inducing a model from the data, simply store the observations.

Target functions are estimated on demand in a local way, no need to define the mapping explicitly.

Core part is the aggregation of preference (order) information from neighbored examples.

5/16

Page 7: A new instance-based label ranking approach using the Mallows model

Related WorkCase‐based Label Ranking (Brinker and Hüllermeier, ECML‐06)

Aggregation of complete rankings is done bymedian rankingBorda count

Our aggregation method is based on a probabilistic model and can handle both complete and incomplete rankings.

6/16

Page 8: A new instance-based label ranking approach using the Mallows model

Instance‐Based PredictionBasic assumption: Distribution of output is (approximately) constant in the neighborhood of the query; consider outputs of neighbors as an i.i.d. sample. 

Conventional classification: discrete distribution on class labelsestimate probabilities by relative class frequenciesclass prediction by majority vote 

7/16

Page 9: A new instance-based label ranking approach using the Mallows model

Probabilistic Model for Ranking Mallows model (Mallows, Biometrika, 1957)

with center rankingspread parameterand        is a right invariant metric on permutations

8/16

Page 10: A new instance-based label ranking approach using the Mallows model

Inference (full rankings)We have observed from the neighbors.

ML

monotone in θ

Page 11: A new instance-based label ranking approach using the Mallows model

Inference (incomplete ranking)

“marginal” distibution

where denotes all consistent extensions of    .

Example for label set {a,b,c}:

10/16

Observation σ Extensions E(σ)

a ba      b ca c bc      a b

Page 12: A new instance-based label ranking approach using the Mallows model

Inference (incomplete ranking)  cont.

The corresponding likelihood:

ML estimation                                         becomes more difficult.

11/16

Page 13: A new instance-based label ranking approach using the Mallows model

InferenceNot only the estimated ranking     is of interest …

… but also the spread parameter    , which is a measure of precision and, therefore, reflects the confidence/reliability of the prediction (just like the variance of an estimated mean). 

The bigger   , the more peaked the distribution around the center ranking.

12/16

Page 14: A new instance-based label ranking approach using the Mallows model

Experimental SettingData sets

1 UCI data sets. 2 Phylogenetic profiles and DNA microarray expression data.

Name #instances #features #labels

Iris1 150  4 3

Wine1 178  13 3

Glass1 214  9 6

Vehicle1 846  18 4

Dtt2 2465  24 4

Cold2 2465  24 4

13/16

Page 15: A new instance-based label ranking approach using the Mallows model

Accuracy (Kendall tau)

Main observation:  Our approach is quite competitive with the state‐of‐the‐art  model based approaches. 

A typical run:

missing rate of labels

14/16

Ken

dall

tau

Page 16: A new instance-based label ranking approach using the Mallows model

Accuracy‐Rejection Curveθ as a measure of the reliability of a prediction

Main observation:  Decreasing curve confirms that θ is a reasonable measure of confidence.

ratio of the data that predicted

15/16

Page 17: A new instance-based label ranking approach using the Mallows model

Take‐away MessagesAn instance‐based label ranking approach using a probabilistic model.Suitable for complete and incomplete rankings.Comes with a natural measure of the reliability of a prediction.

o More efficient inference for the incomplete case. o Generalization: distance‐weighted prediction.o Dealing with variants of the label ranking problem, such as calibrated label ranking and multi‐label classification.

16/16

Page 18: A new instance-based label ranking approach using the Mallows model
Page 19: A new instance-based label ranking approach using the Mallows model

Median rank

MINI Toyota BMWranking 1 1 3 2

ranking 2 2 3 1

ranking 3 3 2 1

median rank 2 3 1

‐ tends to optimize Spearman footrule

Page 20: A new instance-based label ranking approach using the Mallows model

Borda count

‐ tends to optimize Spearman rank correlation

ranking 1 MINI      Toyota      BMW 

ranking 2 BMW      MINI      Toyota

ranking 3 BMW Toyota      MINI

Borda count BMW: 4  MINI: 3  Toyota:2

Page 21: A new instance-based label ranking approach using the Mallows model

21

Page 22: A new instance-based label ranking approach using the Mallows model

22