Mechanisms for Making Crowds Truthful
Andrew Mao, Sergiy Nesterko

Improving Peer Prediction
Weaknesses in the Miller et al. paper:
Honest reporting is not a unique equilibrium (or even a Pareto-optimal one)
Collusion is not limited to symmetric strategies or nontransferable utility
Does not give a minimum bound on the payoff difference between lying and truth-telling
Players may be indifferent if the difference in payoffs is less than ε
Scoring rules cannot be easily extended to accommodate new constraints

Overview
Address cases of collusion
Improve payment mechanism by creating unique NE, or at least Pareto-optimal NE
Use multiple reference raters (>= 4)
"...By giving a higher reward for matching all but one of the reference reports, it is possible to give a higher expected payoff to the truthful reporting equilibrium..."
Symmetric and asymmetric strategies
Transferable / non-transferable utility
Automated mechanism design approach
Payments computed by optimization, rather than closed form scoring rules

Some Features
Only pure strategies are considered
Mixed strategy Bayes-Nash equilibria are too complicated to compute
Initially, prove NE for truthful reporting, then extend to different collusive cases
Payments to players for good or bad reports determine best-response strategies

The Model
Many buyers experience the same product with varying levels of quality.
Define type as product quality, with a discrete distribution.
We'll use just two types - Good and Bad.
Buyers can rate what they get with either 1 (good) or 0 (bad).
They get some reward for reporting.
In sequential games, respondent rewards are computed in batches
Apply this model repeatedly to achieve sequential play

Model continued
Common prior shared by the players and the center
N respondents in each batch
Possible strategies: (0, n) and (1, n); for n = 0 … N-1
n is the number of other players that submit a positive report
Pr[n | o_i]: probability that n positive reports are submitted by the remaining N-1 reviewers, given my signal o_i

Example of Incentive-Compatible Payments
Plumber Bob's service quality has the following prior:
P(G) = 0.8, P(B) = 0.2
P(1|G) = 0.9, P(1|B) = 0.15
Suppose Alice (customer) has a job done well. Then P(G|1) = 0.96.
She is told: "the report is paid only if it matches the reference report. A negative report is paid $2.62, while a positive report is paid $1.54"
Then Alice expects the next customer to submit a positive report about Bob with probability P(1|1) = P(1|G)P(G|1) + P(1|B)P(B|1) = 0.87.

Example continued
Alice wants to match the expected report of the next customer
So, if she tells the truth, her expected payoff is 0.87 * 1.54 + 0.13 * 0 = 1.34; if she lies, it is 0.87 * 0 + 0.13 * 2.62 = 0.34. So, no incentive to lie.
Note that if we let P(G) = 0.001 and P(B) = 0.999, this is reversed!
It is important that payoffs correspond to the right prior!
But even with smart payoffs, everyone reporting 1 is still an equilibrium!
This is addressed in a later section
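The arithmetic in the plumber example can be checked with a short script. This is a sketch under the example's numbers; the function names are made up for illustration, and the reference reporter is assumed to be honest.

```python
def posterior_good(p_good, signal, p1_given_g=0.9, p1_given_b=0.15):
    """P(G | signal) by Bayes' rule; signal is 1 (good) or 0 (bad)."""
    like_g = p1_given_g if signal == 1 else 1 - p1_given_g
    like_b = p1_given_b if signal == 1 else 1 - p1_given_b
    joint_g = like_g * p_good
    joint_b = like_b * (1 - p_good)
    return joint_g / (joint_g + joint_b)


def prob_next_positive(p_good, signal, p1_given_g=0.9, p1_given_b=0.15):
    """P(next customer observes 1 | my signal)."""
    post_g = posterior_good(p_good, signal, p1_given_g, p1_given_b)
    return p1_given_g * post_g + p1_given_b * (1 - post_g)


def expected_payoffs(p_good, signal=1, pay_positive=1.54, pay_negative=2.62):
    """Expected payoff of reporting 1 and of reporting 0.

    A report is paid only if it matches the (assumed honest) reference report.
    """
    p_one = prob_next_positive(p_good, signal)
    return p_one * pay_positive, (1 - p_one) * pay_negative


# Alice saw a good job (signal 1) under the prior P(G) = 0.8.
truth, lie = expected_payoffs(0.8)
print(round(truth, 2), round(lie, 2))  # 1.34 0.34

# Same payments, but a prior of P(G) = 0.001: lying now pays more.
truth_skewed, lie_skewed = expected_payoffs(0.001)
print(truth_skewed < lie_skewed)  # True
```

The second call illustrates why the payments have to be computed for the prior that the reporters actually hold.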

Automated Mechanism Design
i.e., how did we magically compute payments to Alice?
First proposed by Conitzer and Sandholm (2003)
In general, mechanisms are computed to satisfy specified design goals, instead of deriving closed form rules
Allows variations within a class of mechanisms to be dynamically generated
Mechanism can make use of specific available information
In this case: Computing payments by solving optimization problems

Incentive-Compatible Payment Mechanisms
Payment mechanism is incentive-compatible if honest reporting is a Nash Equilibrium
How do you compute the payment scheme so as to satisfy this? Can you create a unique NE? Is it efficient?
We want:
minimize expected payment to each player
guarantee a reward margin between truthful and dishonest reports
all payments must be non-negative

Solving a Linear Program
Simple case: no collusion resistance
For this to make sense, everyone must have the same prior
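A sketch of what such a linear program might look like, pieced together from the stated design goals (minimum expected payment, a reward margin for truth-telling, payments that are never negative). The margin symbol Λ and the exact form of the objective are assumptions; τ(r, n) is the payment for report r when n of the other N-1 reports are positive, and Pr[n | o] is the probability of that count given signal o.

```latex
\begin{aligned}
\min_{\tau}\quad & \sum_{o \in \{0,1\}} \Pr[o] \sum_{n=0}^{N-1} \Pr[n \mid o]\, \tau(o, n) \\
\text{s.t.}\quad & \sum_{n=0}^{N-1} \Pr[n \mid 1]\,\bigl(\tau(1, n) - \tau(0, n)\bigr) \ge \Lambda \\
& \sum_{n=0}^{N-1} \Pr[n \mid 0]\,\bigl(\tau(0, n) - \tau(1, n)\bigr) \ge \Lambda \\
& \tau(r, n) \ge 0 \quad \text{for all } r \in \{0,1\},\ n \in \{0, \dots, N-1\}
\end{aligned}
```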

Analytical Solution to the LP
From constraints in the LP, we have two nonzero decision variables (payments)
Must be for two separate reports: τ(0, n1) and τ(1, n2)
Lemma: the ratio Pr[n|1]/Pr[n|0] is monotonically increasing in n
From the dual, expected payment depends on this ratio
Under cost minimization, incentive compatible payments are driven to n1 = 0 and n2 = N - 1, respectively
Result: only τ(0, 0) and τ(1, N-1) are positive payments
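With only τ(0, 0) and τ(1, N-1) nonzero, the payments can be recovered from a 2x2 linear system. The sketch below assumes that both incentive constraints hold with equality at a margin of 1, that signals are conditionally independent given the type, and that the other reporters are honest; none of this is stated on the slides, and the function names are illustrative. It uses the plumber prior as default values.

```python
from math import comb


def count_dist(n_others, signal, p_good=0.8, p1_g=0.9, p1_b=0.15):
    """Pr[n | signal] for n = 0..n_others positive reports among the others."""
    like_g = p1_g if signal == 1 else 1 - p1_g
    like_b = p1_b if signal == 1 else 1 - p1_b
    post_g = like_g * p_good / (like_g * p_good + like_b * (1 - p_good))

    def binom(n, p):
        return comb(n_others, n) * p**n * (1 - p) ** (n_others - n)

    return [post_g * binom(n, p1_g) + (1 - post_g) * binom(n, p1_b)
            for n in range(n_others + 1)]


def ratio_is_increasing(num_agents):
    """Check the lemma: Pr[n|1] / Pr[n|0] increases with n."""
    pr1 = count_dist(num_agents - 1, 1)
    pr0 = count_dist(num_agents - 1, 0)
    ratios = [a / b for a, b in zip(pr1, pr0)]
    return all(x < y for x, y in zip(ratios, ratios[1:]))


def two_payment_solution(num_agents, margin=1.0):
    """Return (tau(0, 0), tau(1, N-1)) with all other payments set to zero.

    Assumes both incentive constraints are tight:
        pr1[m] * t1 - pr1[0] * t0 = margin   (signal 1: truth beats lying)
        pr0[0] * t0 - pr0[m] * t1 = margin   (signal 0: truth beats lying)
    """
    m = num_agents - 1
    pr1, pr0 = count_dist(m, 1), count_dist(m, 0)
    det = pr1[m] * pr0[0] - pr1[0] * pr0[m]
    t1 = margin * (pr0[0] + pr1[0]) / det
    t0 = margin * (pr1[m] + pr0[m]) / det
    return t0, t1


tau_00, tau_11 = two_payment_solution(2)
# Close to the $2.62 and $1.54 quoted to Alice in the plumber example.
print(f"{tau_00:.3f} {tau_11:.3f}")
```

That a margin of 1 reproduces the quoted payments suggests, but does not prove, that this is how the example values were obtained.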

Satisfying Incentive Compatibility
Consider the conditions for incentive compatibility, with n1 = 0, n2 = N-1:
τ(0, 0) > τ(1, 0); τ(1, N-1) > τ(0, N-1)
In the 2-player case, this becomes τ(0, 0) > τ(1, 0); τ(1, 1) > τ(0, 1)
Obviously, this introduces the "all-report-high" and "all-report-low" equilibria
Now, how do we fix this?
Add more constraints to the optimization problem!
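To see the extra equilibria concretely, the 2-player case can be enumerated by brute force. This sketch reuses the plumber numbers and the payments quoted to Alice; it only considers symmetric pure strategies, and the strategy names are labels invented here.

```python
def next_signal_dist(signal, p_good=0.8, p1_g=0.9, p1_b=0.15):
    """Return (Pr[other sees 0 | signal], Pr[other sees 1 | signal])."""
    like_g = p1_g if signal == 1 else 1 - p1_g
    like_b = p1_b if signal == 1 else 1 - p1_b
    post_g = like_g * p_good / (like_g * p_good + like_b * (1 - p_good))
    p_one = p1_g * post_g + p1_b * (1 - post_g)
    return (1 - p_one, p_one)


# TAU[(my report, reference report)]: only matching reports are paid.
TAU = {(0, 0): 2.62, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.54}

# Each strategy maps an observed signal to a submitted report.
STRATEGIES = {
    "honest": {0: 0, 1: 1},
    "always_1": {0: 1, 1: 1},
    "always_0": {0: 0, 1: 0},
    "always_lie": {0: 1, 1: 0},
}


def expected_payoff(report, signal, strategy, tau=TAU):
    """Expected payment for `report` when the other player follows `strategy`."""
    dist = next_signal_dist(signal)
    return sum(dist[other] * tau[(report, strategy[other])] for other in (0, 1))


def is_symmetric_equilibrium(strategy):
    """True if following `strategy` is a best response to itself."""
    for signal in (0, 1):
        prescribed = strategy[signal]
        deviation = 1 - prescribed
        if (expected_payoff(prescribed, signal, strategy)
                < expected_payoff(deviation, signal, strategy)):
            return False
    return True


equilibria = [name for name, s in STRATEGIES.items()
              if is_symmetric_equilibrium(s)]
print(equilibria)  # ['honest', 'always_1', 'always_0']
```

Honest reporting is an equilibrium, but so are the two constant strategies, which is the problem the added constraints are meant to remove.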

Extensions
Coalition size (full coalition/fractional coalition)
Symmetric vs. asymmetric strategies
Transferable utility
Some combinations of these conditions are unreasonable
e.g., it doesn't make sense for colluders to be able to make side payments but not coordinate on asymmetric strategies
Achieving unique or Pareto-optimal Nash equilibria

Extension: Full Coalition, Symmetric Strategies, Nontransferable Utility
We want to get rid of the "all-report-X" Nash Equilibrium
Extending the plumber example to N = 4 agents, look at probabilities
Note the differences in distributions!
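The two distributions for N = 4 can be regenerated from the plumber prior. This is a sketch that assumes conditionally independent signals given the type and honest reporting by the other three agents; the function and constant names are illustrative.

```python
from math import comb

PRIOR_GOOD = 0.8
P1_GIVEN_TYPE = {"G": 0.9, "B": 0.15}  # P(signal 1 | type)


def positive_count_dist(num_agents, signal):
    """Pr[n | signal] for n = 0..N-1 positive reports among the other agents."""
    n_others = num_agents - 1
    prior = {"G": PRIOR_GOOD, "B": 1 - PRIOR_GOOD}

    # Posterior over types after observing `signal` (Bayes' rule).
    joint = {}
    for t, p1 in P1_GIVEN_TYPE.items():
        likelihood = p1 if signal == 1 else 1 - p1
        joint[t] = likelihood * prior[t]
    total = sum(joint.values())
    posterior = {t: joint[t] / total for t in joint}

    # Mixture of binomials, one per type, weighted by the posterior.
    dist = []
    for n in range(n_others + 1):
        p_n = sum(
            posterior[t] * comb(n_others, n)
            * P1_GIVEN_TYPE[t] ** n
            * (1 - P1_GIVEN_TYPE[t]) ** (n_others - n)
            for t in posterior
        )
        dist.append(p_n)
    return dist


dist_after_1 = positive_count_dist(4, 1)
dist_after_0 = positive_count_dist(4, 0)
for signal, dist in ((0, dist_after_0), (1, dist_after_1)):
    print(signal, [round(p, 4) for p in dist])
```

Under these assumptions, an agent who saw a good job expects all three others to report positively with probability of about 0.70, against about 0.24 for an agent who saw a bad job.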

Example continued
Optimal payment scheme:
Reporter is encouraged to "even out" the 0 distribution, but the prior compensates
This gives the incentive for one person to switch when everyone else is reporting the same
Implicit collusion resistance to symmetric strategies

Extension: Partial Collusion, Asymmetric Strategies, Nontransferable Utility
Theorem: When more than half of the agents collude, no incentive-compatible payment mechanism can make truth-telling a dominant strategy for the colluders
Cost of payments rises exponentially as the coalition fraction increases

Extension: Partial Collusion, Asymmetric Strategies, Transferable Utility
Note that the normalized cost rises much faster than before when participants can make side payments

Summary of Extensions
Some conditions lead to MILPs, which are harder to solve
Unique vs. Pareto-optimal NE
The latter is much cheaper
Partial collusion: payment cost increases dramatically beyond a threshold of colluders

Improvements
Extension to original peer prediction mechanism with automated mechanism design
Dynamically generated payments, so rules don't have to be in closed form
Expected payment from honest reporting is better than from lying by some guaranteed threshold
Different conditions can generate a unique, Pareto-optimal, or even dominant-strategy equilibrium, at correspondingly different costs

Drawbacks
Common prior still required for BNE
Report space is discrete (binary, in fact)
Sequential nature of report submission is not considered
Need a group of at least a certain size
Weird budget results if center has different prior from users
Not necessarily incentivizing players to spend effort to uncover information - why not just invent a report?