Strategy-Proof Classification
Reshef Meir, School of Computer Science and Engineering, Hebrew University
Joint work with Ariel D. Procaccia and Jeffrey S. Rosenschein
Strategy-Proof Classification
• An Example of Strategic Labels in Classification
• Motivation
• Our Model
• Previous work (positive results)
• An impossibility theorem
• More results (if there is time)
(~12 minutes)
Introduction Motivation Model Results
Strategic labeling: an example
[Figure: the ERM classifier on the reported data makes 5 errors]
There is a better classifier! (for me…)
If I only change the labels…
[Figure: with the manipulated labels, the new ERM makes 2+4 = 6 errors]
Classification
The Supervised Classification problem:
– Input: a set of labeled data points (xi, yi), i = 1,…,m
– Output: a classifier c from some predefined concept class C (functions of the form f : X → {–,+})
– We usually want c not only to classify the sample correctly, but to generalize well, i.e. to minimize
R(c) ≡ E(x,y)~D[ c(x) ≠ y ],
the expected number of errors w.r.t. the distribution D
Classification (cont.)
• A common approach is to return the ERM, i.e. the concept in C that is best w.r.t. the given samples (has the lowest number of errors)
• Generalizes well under some assumptions on the concept class C
With multiple experts, we can’t trust our ERM!
Where do we find “experts” with incentives?
Example 1: A firm learning purchase patterns
– Information gathered from local retailers
– The resulting policy affects them
– “The best policy is the policy that fits my pattern”
[Diagram: Users → Reported Dataset → Classification Algorithm → Classifier]
Example 2: Internet polls / expert systems
Related work
• A study of SP mechanisms in regression learning
– O. Dekel, F. Fischer and A. D. Procaccia, Incentive Compatible Regression Learning, SODA 2008
• No SP mechanisms for clustering
– J. Perote-Peña and J. Perote, The Impossibility of Strategy-Proof Clustering, Economics Bulletin, 2003
A problem instance is defined by
• A set of agents I = {1,…,n}
• A partial dataset for each agent i ∈ I: Xi = {xi1,…,xi,m(i)} ⊆ X
• For each xik ∈ Xi, agent i has a label yik
– Each pair sik = ⟨xik, yik⟩ is an example
– All examples of a single agent compose the labeled dataset Si = {si1,…,si,m(i)}
• The joint dataset S = ⟨S1, S2,…, Sn⟩ is our input
– m = |S|
• We denote the dataset with the reported labels by S′
Input: Example
[Figure: three agents’ labeled point sets]
Xi ∈ X^mi, Yi ∈ {–,+}^mi for each agent i
S = ⟨S1, S2,…, Sn⟩ = ⟨(X1,Y1),…, (Xn,Yn)⟩
Incentives and Mechanisms
• A Mechanism M receives a labeled dataset S′ and outputs c ∈ C
• Private risk of i: Ri(c,S) = |{k : c(xik) ≠ yik}| / mi
• Global risk: R(c,S) = |{(i,k) : c(xik) ≠ yik}| / m
• We allow non-deterministic mechanisms
– The outcome is a random variable
– We measure the expected risk
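The two risk definitions above translate directly into code. A minimal sketch, assuming each agent's dataset is a list of (x, y) pairs with y ∈ {−1, +1} and a classifier is any function from x to {−1, +1} (the data layout is an assumption, not from the slides):

```python
def private_risk(c, S, i):
    """R_i(c, S): the fraction of agent i's own examples that c mislabels."""
    errors = sum(1 for x, y in S[i] if c(x) != y)
    return errors / len(S[i])

def global_risk(c, S):
    """R(c, S): the fraction of all m examples (across agents) that c mislabels."""
    examples = [ex for S_i in S for ex in S_i]
    errors = sum(1 for x, y in examples if c(x) != y)
    return errors / len(examples)
```

For example, with S = [[(0, -1), (1, 1)], [(2, 1)]] and the constant classifier c(x) = +1, the first agent's private risk is 1/2 while the global risk is 1/3.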
ERM
We compare the outcome of M to the ERM:
c* = ERM(S) = argmin_{c ∈ C} R(c, S)
r* = R(c*, S)
Can our mechanism simply compute and return the ERM?
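For a finite concept class, c* = argmin_{c ∈ C} R(c, S) can be computed by brute force. A sketch (the threshold class and the data below are hypothetical, only to make the snippet runnable):

```python
def erm(C, S):
    """Return the concept in C minimizing the global empirical risk on S."""
    examples = [ex for S_i in S for ex in S_i]
    return min(C, key=lambda c: sum(1 for x, y in examples if c(x) != y))

# Hypothetical finite class: threshold classifiers on the line, +1 iff x >= t.
def threshold(t):
    return lambda x: 1 if x >= t else -1

C = [threshold(t) for t in (0.0, 1.5, 3.0)]
S = [[(0.0, -1), (1.0, -1)], [(2.0, 1), (4.0, 1)]]
c_star = erm(C, S)  # the threshold at 1.5 separates this sample perfectly
```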
Requirements
1. Good approximation: for all S, R(M(S), S) ≤ β·r*
2. Strategy-proofness (SP): for all i, S, Si′: Ri(M(S–i, Si′), S) ≥ Ri(M(S), S)
• ERM(S) is 1-approximating but not SP
• ERM(S1) is SP but gives a bad approximation
Are there any mechanisms that guarantee both SP and good approximation?
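The claim that ERM(S) is not SP can be checked on a toy instance (the data below are made up for illustration). With two constant classifiers, an agent whose labels are mixed can "exaggerate" to flip the ERM outcome in its favor:

```python
ALL_PLUS = lambda x: 1    # |C| = 2: the two constant classifiers
ALL_MINUS = lambda x: -1
C = [ALL_PLUS, ALL_MINUS]

def erm(C, S):
    examples = [ex for S_i in S for ex in S_i]
    return min(C, key=lambda c: sum(1 for x, y in examples if c(x) != y))

def private_risk(c, S_i):
    return sum(1 for x, y in S_i if c(x) != y) / len(S_i)

# Agent 1's true data: two + points, one - point; agent 2 reports two - points.
truth_1 = [(0, 1), (1, 1), (2, -1)]
S_2 = [(3, -1), (4, -1)]

c_truth = erm(C, [truth_1, S_2])   # ALL_MINUS wins: 2 errors vs. 3
lie_1 = [(0, 1), (1, 1), (2, 1)]   # agent 1 exaggerates: flips its - label
c_lie = erm(C, [lie_1, S_2])       # now ALL_PLUS wins: 2 errors vs. 3

# Agent 1's TRUE private risk drops from 2/3 to 1/3 -- ERM is manipulable.
```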
MOST IMPORTANT SLIDE
Restricted settings
• A very small concept class: |C| = 2
– There is a deterministic SP mechanism that obtains a 3-approximation ratio
– This bound is tight
– Randomization can improve the bound to 2
R. Meir, A. D. Procaccia and J. S. Rosenschein, Incentive Compatible Classification under Constant Hypotheses: A Tale of Two Functions, AAAI 2008
Restricted settings (cont.)
• Agents with similar interests:
– There is a randomized SP 3-approximation mechanism (works for any class C)
R. Meir, A. D. Procaccia and J. S. Rosenschein, Incentive Compatible Classification with Shared Inputs, IJCAI 2009
But not everything shines
• Without restrictions on the input, we cannot guarantee a constant approximation ratio
Our main result:
Theorem: There is a concept class C for which there are no deterministic SP mechanisms with an o(m)-approximation ratio
Deterministic lower bound
Proof idea:
– First, construct a classification problem that is equivalent to a voting problem with 3 candidates
– Then, use the Gibbard–Satterthwaite theorem to prove that there must be a dictator
– Finally, the dictator’s opinion might be very far from the optimal classification
Proof (1)
Construction: We have X = {a, b}, and 3 classifiers: ca, cb, and cab
The dataset contains two types of agents, with samples distributed unevenly over a and b
We do not set the labels. Instead, we denote by Y all the possible labelings of an agent’s dataset.
Proof (2)
Let P be the set of all 6 orders over C. A voting rule is a function of the form f : P^n → C. But our mechanism is a function M : Y^n → C! (its input is labels, not orders)
Lemma 1: there is a valid mapping g : P^n → Y^n, s.t. M∘g is a voting rule
Proof (3)
Lemma 2: If M is SP and guarantees any bounded approximation ratio, then f = M∘g is dictatorial
Proof:
– (f is onto) any profile that c classifies perfectly must induce the selection of c
– (f is SP) suppose there is a manipulation; by mapping this profile to labels with g, we find a manipulation of M, in contradiction to its SP
– From the G–S theorem, f must be dictatorial
Proof (4)
Finally, f (and thus M) can only be dictatorial. We assume w.l.o.g. that the dictator is agent 1 of type Ia. We now label the data points as follows:
– The optimal classifier is cab, which makes 2 errors
– The dictator selects ca, which makes m/2 errors
Real concept classes
• We managed to show that there are no good (deterministic) SP mechanisms, but only for a synthetically constructed class.
• We are interested in more common classes that are actually used in machine learning. For example:
– Linear classifiers
– Boolean conjunctions
Linear classifiers
[Figure: a dataset over two regions, “a” and “b”, with classifiers ca, cb, cab; the optimal classifier makes only 2 errors, while the dictator’s choice makes Ω(√m) errors]
A lower bound for randomized SP mechanisms
• A lottery over dictatorships is still bad
– Ω(k) instead of Ω(m), where k is the size of the largest dataset controlled by an agent (m ≈ k·n)
• However, it is not clear how to eliminate other mechanisms
– G–S works only for deterministic mechanisms
– Another theorem by Gibbard [’79] can help
• But only under additional assumptions
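A lottery over dictatorships is easy to sketch: pick one agent uniformly at random and return the ERM on that agent's reported data alone. It is SP for the same reason ERM(S1) is: an agent's report only matters when it is chosen, and then it can do no better than reporting truthfully. (The function and data layout below are assumptions for illustration.)

```python
import random

def random_dictator(C, S, rng=None):
    """Pick one agent uniformly at random; return the ERM on its data alone."""
    rng = rng or random.Random()
    S_i = S[rng.randrange(len(S))]
    return min(C, key=lambda c: sum(1 for x, y in S_i if c(x) != y))
```

Per the slide above, the approximation guarantee of any such lottery degrades with k, the size of the largest single-agent dataset.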
Upper bounds
• So, our lower bounds do not leave much hope for good SP mechanisms
• We would still like to know whether they are tight
• A deterministic SP O(m)-approximation is easy:
– break ties iteratively according to dictators
• What about randomized SP O(k) mechanisms?
The iterative random dictator (IRD)
(example with linear classifiers on R1)
[Figure animation: Iteration 1: 2 errors; Iteration 2: 5 errors; Iteration 3: 0 errors; Iteration 4: 0 errors; Iteration 5: 1 error]
Theorem: The IRD is O(k²)-approximating for linear classifiers in R1
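The transcript does not spell out the IRD's definition, so the following is only one plausible reading of "break ties iteratively according to dictators", with a random agent order: each visited agent narrows the surviving candidate set to the classifiers that minimize its own private risk. The threshold class and data are hypothetical.

```python
import random

def iterative_random_dictator(C, S, rng=None):
    """Visit agents in random order; each narrows the candidate set to its
    privately optimal classifiers. Return any survivor."""
    rng = rng or random.Random()
    def errors(c, S_i):
        return sum(1 for x, y in S_i if c(x) != y)
    order = list(range(len(S)))
    rng.shuffle(order)
    candidates = list(C)
    for i in order:
        best = min(errors(c, S[i]) for c in candidates)
        candidates = [c for c in candidates if errors(c, S[i]) == best]
    return candidates[0]

def threshold(t):  # hypothetical class: +1 iff x >= t
    return lambda x: 1 if x >= t else -1

C = [threshold(t) for t in (0.0, 1.5, 3.0)]
S = [[(1.0, -1), (2.0, 1)], [(0.5, -1), (4.0, 1)]]
c = iterative_random_dictator(C, S)
```

Here the threshold at 1.5 is the unique private optimum of every agent, so it survives regardless of the random order.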
Future work
• Other concept classes
• Other loss functions
• Alternative assumptions on the structure of the data
• Other models of strategic behavior
• …