Post on 24-Dec-2015

1

Efficiently Learning the Accuracy of Labeling Sources for Selective Sampling

by Pinar Donmez, Jaime Carbonell, Jeff Schneider

School of Computer Science, Carnegie Mellon University

KDD ’09

June 30th 2009

Paris, France

2

Problem Illustration

[Figure: instances and oracles; numeric values shown on the slide: 0.74, 0.55, 0.8, 0.9, 0.67, 0.83, 0.58, 0.69]

3

Interval Estimate Threshold (IEThresh)

Goal: find the labeler(s) with the highest expected accuracy.

Our work builds upon Interval Estimation [L. P. Kaelbling].

1. Estimate the reward of each labeler (more on the next slide)

2. Compute the upper confidence interval for each labeler

3. Select the labelers whose upper interval is higher than a threshold

4. Observe the output of the chosen oracles to estimate their reward

5. Return to step 1

This filters out unreliable labelers and reduces labeling cost.
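Steps 2 and 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the slide does not give the interval formula or the threshold rule, so the sketch assumes a normal-approximation upper bound (mean plus z times the standard error) and a cutoff set to a fraction `frac` of the largest upper bound. The names `upper_interval`, `select_labelers`, `frac`, and `alpha` are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def upper_interval(rewards, alpha=0.05):
    """Upper end of a (1 - alpha) confidence interval on the mean reward.

    Assumption: a normal approximation is used here for simplicity.
    Requires at least two observed rewards, because stdev() needs n >= 2.
    """
    n = len(rewards)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return mean(rewards) + z * stdev(rewards) / math.sqrt(n)


def select_labelers(history, frac=0.9, alpha=0.05):
    """Return the labelers whose upper interval reaches the cutoff.

    Assumption: the cutoff is frac times the largest upper interval.
    history maps a labeler name to its list of past 0/1 rewards.
    """
    bounds = {name: upper_interval(r, alpha) for name, r in history.items()}
    cutoff = frac * max(bounds.values())
    return sorted(name for name, b in bounds.items() if b >= cutoff)


# Hypothetical reward histories for three labelers.
history = {
    "a": [1, 1, 1, 0],
    "b": [0, 0, 1, 0],
    "c": [1, 1, 0, 1],
}
print(select_labelers(history))  # ['a', 'c']
```

A lower `frac` keeps more labelers in play (more exploration), while a value close to 1 keeps only those whose upper bound is near the best one.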

4

Reward of the labelers

The reward of each labeler is unknown, so it must be estimated.

Reward of a labeler: eliciting the true label

The true label is also unknown, so it is estimated by the majority vote.

We propose the following reward function:

reward = 1 if the labeler agrees with the majority label

reward = 0 otherwise
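The reward function above can be written as a short Python sketch. The function names are hypothetical, and the tie-breaking behaviour is an assumption the slide does not address: with `Counter.most_common`, a tie goes to the label that was encountered first.

```python
from collections import Counter


def majority_label(labels):
    """Estimate the true label as the most common label among the labelers.

    labels maps a labeler name to the label it gave for one instance.
    Assumption: ties go to the label seen first.
    """
    return Counter(labels.values()).most_common(1)[0][0]


def rewards(labels):
    """Reward is 1 if a labeler agrees with the majority label, else 0."""
    majority = majority_label(labels)
    return {name: 1 if y == majority else 0 for name, y in labels.items()}


# Hypothetical labels from three labelers on a single instance.
print(rewards({"a": 1, "b": 1, "c": 0}))  # {'a': 1, 'b': 1, 'c': 0}
```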

5

IEThresh at the Beginning

[Figure: expected reward per oracle; axes: Oracles, Expected reward (increases)]

6

IEThresh Oracle Selection

[Figure: expected reward per oracle with the selection threshold marked; axes: Oracles (1 to 5), Expected reward (increases)]

7

IE Learning Snapshot II

[Figure: expected reward per oracle with the selection threshold marked; axes: Oracles (1 to 5), Expected reward (increases)]

8

IEThresh Instance Selection

[Figure: diagram with elements numbered 1 to 5]

9

Uniform Expert Accuracy ∈ (0.5, 1]

Repeated Labeling [Sheng et al., 2008]: querying all experts for labeling

[Figure: classification error plot]
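The setting on this slide can be simulated roughly as follows. This is a sketch under assumptions, not the experimental code behind the slide: labels are taken to be binary (0 or 1), each expert's accuracy is drawn uniformly from (0.5, 1], and Repeated Labeling is read as a majority vote over all experts. All function names are hypothetical.

```python
import random
from collections import Counter


def draw_accuracies(k, rng):
    """Draw k expert accuracies uniformly from (0.5, 1].

    rng.random() lies in [0, 1), so 1 - 0.5 * rng.random() lies in (0.5, 1].
    """
    return [1.0 - 0.5 * rng.random() for _ in range(k)]


def noisy_label(true_label, accuracy, rng):
    """Return the true binary label with probability `accuracy`, else flip it."""
    return true_label if rng.random() < accuracy else 1 - true_label


def repeated_labeling(true_label, accuracies, rng):
    """Query every expert and take the majority vote.

    Returns the voted label and the number of oracle queries spent.
    """
    votes = [noisy_label(true_label, a, rng) for a in accuracies]
    return Counter(votes).most_common(1)[0][0], len(votes)


rng = random.Random(0)  # fixed seed so the run is repeatable
accuracies = draw_accuracies(5, rng)
label, queries = repeated_labeling(1, accuracies, rng)
```

Under this reading, Repeated Labeling always spends one query per expert on every instance, which is the cost that selecting only a subset of labelers aims to reduce.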

10

# Oracle Queries vs. Accuracy

[Figure legend: first 10 iterations; next 40 iterations; next 100 iterations]

11

# Oracle queries to reach a target accuracy

[Figure: annotations on the plot: "skew increases", "better"]

12

Results on AMT Data with Human Annotators

IEThresh reaches the best performance with similar effort to Repeated Labeling

The Repeated Labeling baseline needs 840 queries in total to reach 0.95 accuracy

Dataset at http://nlpannotations.googlepages.com/ made available by [Snow et al., 2008]

[Figure panels: 5 annotators; 6 annotators]

13

Conclusions and Future Work

Conclusions

IEThresh is effective in balancing the exploration vs. exploitation tradeoff

Early filtering of unreliable labelers boosts performance

Utilizing labeler accuracy estimates is more effective than asking all labelers or asking randomly

Future Work

from consistent to time-variant labeler quality

label noise conditioned on the data instance

correlated labeling errors

14

THANK YOU!
