
Mechanisms for Making Crowds Truthful

Andrew Mao, Sergiy Nesterko

Improving Peer Prediction

Weaknesses in the Miller et al. paper:

Honest reporting is not a unique equilibrium (or even Pareto-optimal)

Collusion is not limited to symmetric strategies, nontransferable utility

Does not give a minimum bound on the payoff difference between lying and truth-telling

Players may be indifferent if the difference in payoffs is less than ε

Scoring rules cannot be easily extended to accommodate new constraints

Overview

Address cases of collusion

Improve payment mechanism by creating unique NE, or at least Pareto-optimal NE

Use multiple reference raters (>= 4)

"...By giving a higher reward for matching all but one of the reference reports, it is possible to give a higher expected payoff to the truthful reporting equilibrium..."

Symmetric and asymmetric strategies

Transferable / non-transferable utility

Automated mechanism design approach

Payments computed by optimization, rather than closed form scoring rules

Some Features

Only pure strategies are considered

Mixed strategy Bayes-Nash equilibria are too complicated to compute

Initially, prove NE for truthful reporting, then extend to different collusive cases

Payments to players for good or bad reports determine best-response strategies

The Model

Many buyers experience the same product with varying levels of quality.

Define type as product quality, with a discrete distribution. We'll use just two types - Good and Bad.

Buyers can rate what they get with either 1 (good) or 0 (bad). They get some reward for reporting.

In sequential games, respondent rewards are computed in batches

Apply this model repeatedly to achieve sequential play

Model continued

Common prior among players, center

N respondents in each batch

Possible strategies: (0, n) and (1, n), for n = 0 … N-1

n is the number of other players that submit a positive report

Pr[n | o_i]: probability that n positive reports are submitted by the remaining N-1 reviewers, given my signal o_i
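One way to write this probability, assuming (the slides do not state this) that the other reviewers report truthfully and that signals are conditionally independent given the product type θ ∈ {G, B}, with Pr[1 | θ] the chance of a positive signal for type θ:

```latex
\Pr[n \mid o_i] \;=\; \sum_{\theta \in \{G, B\}} \Pr[\theta \mid o_i]\,
  \binom{N-1}{n}\, \Pr[1 \mid \theta]^{\,n}\,
  \bigl(1 - \Pr[1 \mid \theta]\bigr)^{N-1-n}
```

Here Pr[θ | o_i] is the posterior over types obtained from the common prior by Bayes' rule.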

Example of Incentive-Compatible Payments

The prior about plumber Bob is:

P(G) = 0.8, P(B) = 0.2

P(1|G) = 0.9, P(1|B) = 0.15

Suppose Alice (customer) has a job done well. Then P(G|1) = 0.96.

She is told: "the report is paid only if it matches the reference report. A negative report is paid $2.62, while a positive report is paid $1.54"

Then Alice expects the next user to get good service from Bob with probability P(1|1) = P(1|G)P(G|1) + P(1|B)P(B|1) = 0.87.

Example continued

Alice wants to match the expected report of the next customer

If she tells the truth, her expected payoff is 0.87 * 1.54 + 0.13 * 0 = 1.34; if she lies, it is 0.87 * 0 + 0.13 * 2.62 = 0.34. So, no incentive to lie.

Note that if we let P(G) = 0.001 and P(B) = 0.999, this is reversed!

It is important that payoffs correspond to the right prior!

But even with smart payoffs, everyone reporting 1 is still an equilibrium!

This is addressed in a later section
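The arithmetic in the plumber example can be checked with a short sketch. The function names are illustrative, not from the slides; the numbers are the prior and payments quoted above, and the report is assumed to be paid only when it matches a single truthful reference report.

```python
def posterior_good(signal, p_good, p1_good, p1_bad):
    """P(G | signal) by Bayes' rule for a binary signal."""
    like_good = p1_good if signal == 1 else 1.0 - p1_good
    like_bad = p1_bad if signal == 1 else 1.0 - p1_bad
    joint_good = like_good * p_good
    return joint_good / (joint_good + like_bad * (1.0 - p_good))


def prob_next_positive(signal, p_good, p1_good, p1_bad):
    """P(next customer observes 1 | my signal)."""
    pg = posterior_good(signal, p_good, p1_good, p1_bad)
    return p1_good * pg + p1_bad * (1.0 - pg)


def expected_payoffs(signal, p_good, p1_good=0.9, p1_bad=0.15,
                     pay_positive=1.54, pay_negative=2.62):
    """Return (truthful, lying) expected payoffs when paid only on a match."""
    p_next_1 = prob_next_positive(signal, p_good, p1_good, p1_bad)
    pay_if_report_1 = p_next_1 * pay_positive
    pay_if_report_0 = (1.0 - p_next_1) * pay_negative
    if signal == 1:
        return pay_if_report_1, pay_if_report_0
    return pay_if_report_0, pay_if_report_1


truth, lie = expected_payoffs(signal=1, p_good=0.8)
print(round(truth, 2), round(lie, 2))  # 1.34 0.34

# Same payments, but a prior under which good service is very rare:
truth_rare, lie_rare = expected_payoffs(signal=1, p_good=0.001)
print(truth_rare < lie_rare)  # True: lying now pays more
```

With P(G) = 0.001 a positive signal barely moves the posterior, so the next report is most likely negative and the fixed payments reward the wrong report.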

Automated Mechanism Design

i.e., how did we magically compute payments to Alice?

First proposed by Conitzer and Sandholm (2003)

In general, mechanisms are computed to satisfy specified design goals, instead of deriving closed form rules

Allows variations within a class of mechanisms to be dynamically generated

Mechanism can make use of specific available information

In this case: Computing payments by solving optimization problems

Incentive-Compatible Payment Mechanisms

Payment mechanism is incentive-compatible if honest reporting is a Nash Equilibrium

How do you compute the payment scheme so as to satisfy this?

Can you create a unique NE? Is it efficient?

We want:

to minimize the expected payment to each player

a guaranteed reward margin between truthful and dishonest reports

all payments to be non-negative

Solving a Linear Program

Simple case: no collusion resistance

For this to make sense, everyone must have the same prior
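A sketch of the linear program for this simple case, written from the design goals listed earlier. The margin symbol Λ and the exact form of the objective are assumptions, not taken from the slides. Pr[n | o] is the probability that n of the other N-1 reports are positive given own signal o, and τ(r, n) is the payment for report r when n others report positively.

```latex
\begin{aligned}
\min_{\tau}\quad
  & \sum_{o \in \{0,1\}} \Pr[o] \sum_{n=0}^{N-1} \Pr[n \mid o]\, \tau(o, n) \\
\text{s.t.}\quad
  & \sum_{n=0}^{N-1} \Pr[n \mid 1]\,\bigl(\tau(1,n) - \tau(0,n)\bigr) \;\ge\; \Lambda \\
  & \sum_{n=0}^{N-1} \Pr[n \mid 0]\,\bigl(\tau(0,n) - \tau(1,n)\bigr) \;\ge\; \Lambda \\
  & \tau(r, n) \;\ge\; 0 \qquad \text{for all } r \in \{0,1\},\ n \in \{0, \dots, N-1\}
\end{aligned}
```

The objective is the expected payment to an honest reporter; the two inequalities say that, after either signal, reporting truthfully beats misreporting by at least Λ.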

Analytical Solution to the LP

From constraints in the LP, we have two nonzero decision variables (payments)

Must be for two separate reports: τ(0, n1) and τ(1, n2)

Lemma: ratio of Pr[n|1]/Pr[n|0] is monotonically increasing in n

From the dual, expected payment depends on this ratio

Under cost minimization, incentive compatible payments are driven to n1 = 0 and n2 = N - 1, respectively

Result: only τ(0, 0) and τ(1, N-1) are positive payments
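A minimal sketch of this result, assuming both incentive-compatibility constraints bind at a margin (called `margin` here, a hypothetical name) and that signals are conditionally independent given the type. With only τ(0, 0) and τ(1, N-1) nonzero, the payments solve a 2x2 linear system.

```python
from math import comb


def posterior_good(signal, p_good, p1_good, p1_bad):
    """P(G | signal) by Bayes' rule."""
    like_good = p1_good if signal == 1 else 1.0 - p1_good
    like_bad = p1_bad if signal == 1 else 1.0 - p1_bad
    joint_good = like_good * p_good
    return joint_good / (joint_good + like_bad * (1.0 - p_good))


def prob_n_positive(n, others, signal, p_good, p1_good, p1_bad):
    """Pr[n of the `others` truthful reports are positive | my signal]."""
    pg = posterior_good(signal, p_good, p1_good, p1_bad)

    def binom(q):
        return comb(others, n) * q**n * (1.0 - q) ** (others - n)

    return pg * binom(p1_good) + (1.0 - pg) * binom(p1_bad)


def two_payment_solution(num_agents, margin=1.0,
                         p_good=0.8, p1_good=0.9, p1_bad=0.15):
    """Return (tau(0, 0), tau(1, N-1)) with both constraints binding."""
    others = num_agents - 1
    prior = (p_good, p1_good, p1_bad)
    p_none_0 = prob_n_positive(0, others, 0, *prior)       # Pr[0 | 0]
    p_none_1 = prob_n_positive(0, others, 1, *prior)       # Pr[0 | 1]
    p_all_0 = prob_n_positive(others, others, 0, *prior)   # Pr[N-1 | 0]
    p_all_1 = prob_n_positive(others, others, 1, *prior)   # Pr[N-1 | 1]
    # Binding constraints:
    #   p_all_1 * tau_1 - p_none_1 * tau_0 = margin   (after signal 1)
    #   p_none_0 * tau_0 - p_all_0 * tau_1 = margin   (after signal 0)
    det = p_all_1 * p_none_0 - p_none_1 * p_all_0
    tau_0 = margin * (p_all_1 + p_all_0) / det
    tau_1 = margin * (p_none_0 + p_none_1) / det
    return tau_0, tau_1


tau_00, tau_11 = two_payment_solution(num_agents=2)
print(round(tau_00, 3), round(tau_11, 3))  # 2.625 1.542
```

For the plumber prior with N = 2 and a margin of 1, this gives values in line with the $2.62 and $1.54 quoted to Alice.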

Satisfying Incentive Compatibility

Consider the conditions for incentive compatibility, with n1 = 0, n2 = N-1:

τ(0, 0) > τ(1, 0); τ(1, N-1) > τ(0, N-1)

In the 2-player case, this becomes τ(0, 0) > τ(1, 0); τ(1, 1) > τ(0, 1)

Obviously, this introduces the "all-report-high" and "all-report-low" equilibria

Now, how do we fix this?

Add more constraints to the optimization problem!
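The extra equilibria can be seen by brute force in the 2-player plumber example. The sketch below (names are illustrative) checks each symmetric pure strategy, written as a pair (report after signal 0, report after signal 1), for a profitable unilateral deviation under the payments τ(0, 0) = 2.62 and τ(1, 1) = 1.54.

```python
P_GOOD, P1_GOOD, P1_BAD = 0.8, 0.9, 0.15

# TAU[(my_report, other_report)]: paid only when the two reports match.
TAU = {(0, 0): 2.62, (1, 1): 1.54, (0, 1): 0.0, (1, 0): 0.0}


def prob_other_signal_1(my_signal):
    """P(other player observes 1 | my signal), via the posterior over types."""
    like_good = P1_GOOD if my_signal == 1 else 1.0 - P1_GOOD
    like_bad = P1_BAD if my_signal == 1 else 1.0 - P1_BAD
    pg = like_good * P_GOOD / (like_good * P_GOOD + like_bad * (1.0 - P_GOOD))
    return P1_GOOD * pg + P1_BAD * (1.0 - pg)


def expected_payment(my_report, my_signal, other_strategy):
    """Expected payment when the other player maps signal -> report."""
    p1 = prob_other_signal_1(my_signal)
    return (p1 * TAU[(my_report, other_strategy[1])]
            + (1.0 - p1) * TAU[(my_report, other_strategy[0])])


def is_symmetric_equilibrium(strategy):
    """True if no signal gives a profitable deviation from `strategy`."""
    for signal in (0, 1):
        prescribed = expected_payment(strategy[signal], signal, strategy)
        deviation = expected_payment(1 - strategy[signal], signal, strategy)
        if deviation > prescribed:
            return False
    return True


STRATEGIES = {
    "truthful": (0, 1),
    "always 0": (0, 0),
    "always 1": (1, 1),
    "inverted": (1, 0),
}
equilibria = [name for name, s in STRATEGIES.items()
              if is_symmetric_equilibrium(s)]
print(equilibria)  # ['truthful', 'always 0', 'always 1']
```

Truthful reporting is an equilibrium, but so are the two constant strategies, which is what the additional constraints are meant to rule out.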

Extensions

Coalition size (full coalition/fractional coalition)

Symmetric vs. asymmetric strategies

Transferable utility

Some combinations of these conditions are unreasonable

e.g., it doesn't make sense for colluders to be able to make side payments but not coordinate on asymmetric strategies

Achieving unique or Pareto-optimal Nash equilibria

Extension: Full coalition, symmetric strategies, non-transferable utilities

We want to get rid of the "all-report-X" Nash Equilibrium

Extending the plumber example to N = 4 agents, look at probabilities

Note the differences in distributions!
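The two distributions for N = 4 can be recomputed from the plumber prior. This is a sketch assuming truthful, conditionally independent reports from the other three agents; the function name is illustrative.

```python
from math import comb


def report_count_distribution(signal, num_agents=4,
                              p_good=0.8, p1_good=0.9, p1_bad=0.15):
    """Return [Pr[n | signal] for n = 0 .. N-1] over the other agents' reports."""
    others = num_agents - 1
    like_good = p1_good if signal == 1 else 1.0 - p1_good
    like_bad = p1_bad if signal == 1 else 1.0 - p1_bad
    # Posterior that the plumber is Good given my own signal.
    pg = like_good * p_good / (like_good * p_good + like_bad * (1.0 - p_good))

    def binom(n, q):
        return comb(others, n) * q**n * (1.0 - q) ** (others - n)

    return [pg * binom(n, p1_good) + (1.0 - pg) * binom(n, p1_bad)
            for n in range(others + 1)]


dist_after_0 = report_count_distribution(signal=0)
dist_after_1 = report_count_distribution(signal=1)
for signal, dist in ((0, dist_after_0), (1, dist_after_1)):
    print(signal, [round(p, 4) for p in dist])
```

After a positive signal most of the mass sits on n = 3, while after a negative signal it is spread toward n = 0; the ratio Pr[n|1]/Pr[n|0] increases with n, consistent with the lemma used in the analytical solution.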

Example continued

Optimal payment scheme:

Reporter is encouraged to "even out" the 0 distribution, but the prior compensates

This gives the incentive for one person to switch when everyone else is reporting the same

Implicit collusion resistance to symmetric strategies

Extension: Partial Collusion, Asymmetric Strategies, Nontransferable Utility

Theorem: When more than half of the agents collude, no incentive-compatible payment mechanism can make truth-telling a dominant strategy for the colluders

Cost of payments rises exponentially as the coalition fraction increases

Extension: Partial Collusion, Asymmetric Strategies, Transferable Utility

Note that the normalized cost rises much faster than before when participants can make side payments

Summary of Extensions

Some conditions lead to MILPs, which are harder to solve

Unique vs. Pareto-optimal NE

The latter is much cheaper

Partial collusion: payment cost increases dramatically beyond a threshold of colluders

Improvements

Extension to original peer prediction mechanism with automated mechanism design

Dynamically generated payments, so rules don't have to be in closed form

Expected payment from honest reporting better than lying by some guaranteed threshold

Different conditions can generate Unique, Pareto-optimal, or even Dominant NE, with corresponding different costs

Drawbacks

Common prior still required for BNE

Report space is discrete (binary, in fact)

Sequential nature of report submission is not considered

Need a group of at least a certain size

Weird budget results if center has different prior from users

Not necessarily incentivizing players to spend effort to uncover information - why not just invent a report?

Discussion
