Analytic Methods for Optimizing Realtime Crowdsourcing
DESCRIPTION
Collective Intelligence 2012
Realtime crowdsourcing research has demonstrated that it is possible to recruit paid crowds within seconds by managing a small, fast-reacting worker pool. Realtime crowds enable crowd-powered systems that respond at interactive speeds: for example, cameras, robots and instant opinion polls. So far, these techniques have mainly been proof-of-concept prototypes: research has not yet attempted to understand how they might work at large scale or optimize their cost/performance trade-offs. In this paper, we use queueing theory to analyze the retainer model for realtime crowdsourcing, in particular its expected wait time and cost to requesters. We provide an algorithm that allows requesters to minimize their cost subject to performance requirements. We then propose and analyze three techniques to improve performance: push notifications, shared retainer pools, and precruitment, which involves recalling retainer workers before a task actually arrives. An experimental validation finds that precruited workers begin a task 500 milliseconds after it is posted, delivering results below the one-second cognitive threshold for an end-user to stay in flow.
TRANSCRIPT
MIT HUMAN-COMPUTER INTERACTION
Analytic Methods for Optimizing Realtime Crowdsourcing
Michael Bernstein, David Karger, Rob Miller, and Joel Brandt
MIT CSAIL and Adobe Systems
Tuesday, May 8, 12
Use queueing theory to understand and optimize performance of a paid, realtime crowdsourcing platform.
• Relationship between crowd size and response time
• Algorithm for optimizing crowd size & cost vs. response time
• Improvements to the platform: 500 millisecond feedback
Realtime Crowds
Answering visual questions for blind users [Bigham et al. 2010]
Crowd-assisted photography [Bernstein et al. 2011]
Paid Crowdsourcing
Pay small amounts of money for short tasks
Amazon Mechanical Turk: Roughly five million tasks completed per year at 1-5¢ each [Ipeirotis 2010]
Label an image
Requester: Matt C.
Reward: $0.01
Transcribe short audio clip
Requester: Gordon L.
Reward: $0.04
Retainer Recruitment
Workers sign up in advance
½¢ per minute to remain on call
Alert when the task is ready
Wait at most: 5 minutes
Task: Click on the verbs in the paragraph
He leapt the fence and dashed toward the door.
alert()
Start now! OK
50% of workers return in two seconds, and 75% of workers return in three seconds.
[Bernstein et al. 2011]
State of the Literature
Realtime Crowds
• Recruit crowds in two seconds, execute traditional tasks (e.g., votes) in five seconds
• Maintain continuous control of remote interfaces
• Opportunities in deployable, intelligently reactive software
[Bigham et al. 2010, Bernstein et al. 2011, Lasecki et al. 2011]
The Challenge
Running Out of Retainer Workers
Loss
Non-realtime response
The Tradeoff
Missed tasks, non-realtime results
Extra retainer workers, extra cost
The Goal
Optimize the tradeoff between recruiting too many workers and dropping too many tasks.
Budget-optimal crowdsourcing is possible in non-realtime scenarios [Dai, Mausam and Weld 2010; Kamar, Hacker and Horvitz 2012; Karger, Oh, and Shah 2011]
Outline
1 Model
2 Optimization
3 Platform
Queueing Theory
• Formal framework for stochastic arrival and service processes
• Basic idea: random task arrivals and random processing times for workers
• Quantify how long tasks will need to wait in line
Queueing theory for completion times: [Ipeirotis 2010]
Queueing Theory
M/M/1 queue
M: Markovian (Poisson process) task arrivals, rate λ
M: Markovian (Poisson process) server work time, rate µ
1: One server
Queueing Theory
M/M/c/c queue
M: Markovian (Poisson process) task arrivals, rate λ
M: Markovian (Poisson process) server work time, rate µ
c: c servers
c: c max tasks in servers and queue
All servers busy
Modeling Retainer Recruitment
Retainer Queue
M/M/c/c queue
c workers, no waiting queue
Task arrivals: Poisson process, rate λ
Worker recruitment time: Poisson process, rate µ
Retainer Queue
Loss
All servers busy
$P(i \text{ servers busy}) = \pi(i)$
$P(\text{all servers busy}) = \pi(c)$
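The loss behaviour of this retainer queue can be checked numerically. The following is a minimal simulation sketch, not code from the talk: the function name `simulate_loss_fraction` and all parameter values are illustrative assumptions. It draws exponential inter-arrival times (rate λ) and exponential recruitment times (rate µ), then counts the fraction of tasks that arrive while all c retainer slots are busy, which estimates π(c).

```python
import heapq
import random


def simulate_loss_fraction(arrival_rate, recruit_rate, c, n_tasks=100_000, seed=0):
    """Estimate the fraction of tasks that find all c servers busy.

    Assumes exponential inter-arrival times (rate arrival_rate) and
    exponential recruitment times (rate recruit_rate), as in the M/M/c/c model.
    """
    rng = random.Random(seed)
    now = 0.0
    busy_until = []  # min-heap of times at which busy slots become free again
    lost = 0
    for _ in range(n_tasks):
        now += rng.expovariate(arrival_rate)  # time of the next task arrival
        while busy_until and busy_until[0] <= now:
            heapq.heappop(busy_until)  # slot was refilled before this arrival
        if len(busy_until) < c:
            # A retainer worker takes the task; the slot stays busy until a
            # replacement worker has been recruited.
            heapq.heappush(busy_until, now + rng.expovariate(recruit_rate))
        else:
            lost += 1  # all c slots busy: the task is lost
    return lost / n_tasks


if __name__ == "__main__":
    # Illustrative values only: one task per second on average, one second
    # mean recruitment time, and a pool of two workers.
    print(simulate_loss_fraction(arrival_rate=1.0, recruit_rate=1.0, c=2))
```

Because task arrivals are Poisson, the fraction of arrivals that see a full system should approach the steady-state probability that all servers are busy as the number of simulated tasks grows.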
Model Predictions
1. Probability that all workers are busy: π(c)
→ the task has to wait for expected time 1/µ
2. Cost of keeping a retainer pool of size c
→ cost depends on number of idle servers
Probability of Loss
• Draw on Erlang’s Loss Formula from queueing theory: probability of a rejected request in an M/M/c/c queue
• Let ρ be the traffic intensity: $\rho = \lambda/\mu$
(roughly, the number of new tasks that will arrive in the time it takes to recruit a worker)
Probability of Loss
Erlang’s Loss Formula says:
$$\pi(c) = P(c \text{ servers busy}) = \frac{\rho^c / c!}{\sum_{i=0}^{c} \rho^i / i!}$$
Remarkably, this result makes no assumptions about the distribution of worker recruitment (service) times beyond its mean.
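Erlang's Loss Formula is straightforward to evaluate. The sketch below is illustrative rather than taken from the talk: the function names are assumptions. `erlang_b` uses the standard recurrence B(0) = 1, B(n) = ρ·B(n−1) / (n + ρ·B(n−1)), which is algebraically equal to the closed form but avoids large factorials. `erlang_b_direct` evaluates the closed form term by term for comparison.

```python
from math import factorial


def erlang_b(c, rho):
    """Loss probability pi(c) for c servers and traffic intensity rho."""
    b = 1.0  # pi(0): with no servers, every task is lost
    for n in range(1, c + 1):
        b = rho * b / (n + rho * b)
    return b


def erlang_b_direct(c, rho):
    """Closed form: (rho^c / c!) / sum_{i=0..c} rho^i / i!."""
    terms = [rho ** i / factorial(i) for i in range(c + 1)]
    return terms[c] / sum(terms)


if __name__ == "__main__":
    # Illustrative traffic intensity; the pool sizes match the plotted range.
    for c in (2, 4, 6, 8, 10):
        print(c, erlang_b(c, 1.0), erlang_b_direct(c, 1.0))
```

The recurrence form is usually preferable for large pools, since it never forms ρ^c or c! explicitly.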
Probability of Loss
[Figure: (a) Cost of retainer (payment units per unit time), (b) Probability of waiting, and (c) Expected wait time (seconds), each plotted against size of retainer pool for traffic intensity ρ = 0.1, 0.5, 1, 5, 10.]
Expected Waiting Time
$$P(c \text{ servers busy}) \times (\text{expected recruitment time}) = \pi(c)\,\frac{1}{\mu} = \frac{\rho^c / c!}{\sum_{i=0}^{c} \rho^i / i!} \cdot \frac{1}{\mu}$$
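As a worked instance of the expected waiting time, the following sketch multiplies the loss probability by the mean recruitment time 1/µ. The function names and the rates used in the example are assumptions chosen for illustration, not values reported in the talk.

```python
def erlang_b(c, rho):
    """Loss probability pi(c), via the standard Erlang B recurrence."""
    b = 1.0
    for n in range(1, c + 1):
        b = rho * b / (n + rho * b)
    return b


def expected_wait(c, arrival_rate, recruit_rate):
    """Expected wait = pi(c) * (1 / mu), in the time unit of the rates."""
    rho = arrival_rate / recruit_rate  # traffic intensity
    return erlang_b(c, rho) / recruit_rate


if __name__ == "__main__":
    # Hypothetical rates: 0.5 tasks per second, 0.5 recruitments per second
    # (a two-second mean recruitment time), so rho = 1.
    for c in range(1, 6):
        print(c, expected_wait(c, arrival_rate=0.5, recruit_rate=0.5))
```

With these assumed rates and c = 2, the loss probability is 0.2 and the mean recruitment time is 2 seconds, so the expected wait is 0.4 seconds.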
Expected Cost
How much do we pay in steady-state?
Depends on how many workers are usually waiting on retainer.
Expected Cost
Probability of i busy servers in an M/M/c/c queue is a more general version of Erlang’s Loss Formula:
$$\pi(i) = \frac{\rho^i / i!}{\sum_{j=0}^{c} \rho^j / j!}$$
Derive the expected number of busy workers:
$$E[i] = \rho\,[1 - \pi(c)]$$
Total cost is proportional to the expected number of idle workers (those waiting on retainer):
$$c - \rho\,[1 - \pi(c)]$$
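The cost derivation can be checked numerically. This is a minimal sketch under stated assumptions: the function names are hypothetical, only idle (on-call) workers are assumed to be paid the retainer wage, and the default wage of ½¢ per minute is the retainer rate quoted earlier in the talk. The script also confirms that the mean of the distribution π(i) equals ρ[1 − π(c)].

```python
from math import factorial


def busy_distribution(c, rho):
    """Return [pi(0), ..., pi(c)]: steady-state probability of i busy servers."""
    weights = [rho ** i / factorial(i) for i in range(c + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_idle_workers(c, rho):
    """Closed form from the slide: c - rho * (1 - pi(c))."""
    pi = busy_distribution(c, rho)
    return c - rho * (1.0 - pi[c])


def retainer_cost_per_minute(c, rho, wage_cents_per_minute=0.5):
    """Expected retainer spend, assuming only idle workers are paid."""
    return wage_cents_per_minute * expected_idle_workers(c, rho)


if __name__ == "__main__":
    c, rho = 5, 1.0  # illustrative values, not from the talk
    pi = busy_distribution(c, rho)
    # Cross-check: E[i] from the distribution against rho * (1 - pi(c)).
    mean_busy = sum(i * p for i, p in enumerate(pi))
    print(mean_busy, rho * (1.0 - pi[c]))
    print(retainer_cost_per_minute(c, rho))
```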
Expected Cost
[Figure: (a) Cost of retainer (payment units per unit time), (b) Probability of waiting, and (c) Expected wait time (seconds), each plotted against size of retainer pool for traffic intensity ρ = 0.1, 0.5, 1, 5, 10.]
Cost goes down when c < ρ, but performance suffers.
Optimal Retainer Size
• Size of retainer pool is typically the only value that requesters can manipulate
• Minimize costs by keeping the retainer pool small while keeping π(c) low
Optimal Retainer Size
Based on Maximum Miss Probability
Given a maximum desired probability of a miss $p_{\max}$:
Minimize c subject to $\pi(c) \le p_{\max}$
[Figure: Cost vs. probability of waiting. Probability of waiting plotted against expected payments per unit time for traffic intensity ρ = 0.1, 0.5, 1, 5, 10.]
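Because π(c) decreases as c grows, the constrained minimization can be solved by stepping c upward until the miss probability first falls to the target. The sketch below is one simple way to do that, not necessarily the algorithm from the paper. The function name `min_pool_size` and the `c_limit` safety bound are assumptions.

```python
def min_pool_size(rho, p_max, c_limit=10_000):
    """Smallest c with pi(c) <= p_max for traffic intensity rho.

    Steps the Erlang B recurrence upward from c = 0, so the first c that
    meets the target is also the minimum.
    """
    b = 1.0  # pi(0): with no workers, every task is missed
    c = 0
    while b > p_max:
        c += 1
        if c > c_limit:
            raise ValueError("no pool size up to c_limit meets p_max")
        b = rho * b / (c + rho * b)
    return c


if __name__ == "__main__":
    # Illustrative target: at most 1% of tasks miss a retainer worker.
    for rho in (0.1, 0.5, 1.0, 5.0, 10.0):
        print(rho, min_pool_size(rho, p_max=0.01))
```

The traffic intensities in the usage example mirror the plotted values. The 1% target is an arbitrary choice for illustration.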
Optimal Retainer Size
Based on Joint Cost
If the “pizza delivery” property holds: we can quantify the cost of loss
[Figure: Total cost of task and retainer. Combined cost of retainer and missed task plotted against size of retainer pool for cost of a missed task = 1, 10, 100, 1000.]
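To make the joint-cost idea concrete, the sketch below assumes one plausible objective. The slides do not state this formula: cost per unit time is taken as the retainer wage times the expected number of idle workers, plus a fixed penalty for each missed task times the rate at which tasks are missed (λ·π(c)). All names and parameter values are hypothetical.

```python
def erlang_b(c, rho):
    """Loss probability pi(c), via the standard Erlang B recurrence."""
    b = 1.0
    for n in range(1, c + 1):
        b = rho * b / (n + rho * b)
    return b


def joint_cost(c, arrival_rate, recruit_rate, wage, miss_cost):
    """Assumed objective: retainer spend plus penalty for missed tasks."""
    rho = arrival_rate / recruit_rate
    loss = erlang_b(c, rho)
    retainer = wage * (c - rho * (1.0 - loss))  # pay for idle workers
    missed = miss_cost * arrival_rate * loss    # penalty per missed task
    return retainer + missed


def best_pool_size(arrival_rate, recruit_rate, wage, miss_cost, c_max=50):
    """Pool size in 0..c_max that minimizes the assumed joint cost."""
    return min(
        range(c_max + 1),
        key=lambda c: joint_cost(c, arrival_rate, recruit_rate, wage, miss_cost),
    )


if __name__ == "__main__":
    # Miss costs mirror the plotted values; the other numbers are illustrative.
    for miss_cost in (1, 10, 100, 1000):
        print(miss_cost, best_pool_size(1.0, 1.0, wage=1.0, miss_cost=miss_cost))
```

Under this assumed objective, a higher cost per missed task never shrinks the optimal pool, which matches the intuition behind the trade-off.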
Improving the Retainer Model
1 Subscriptions
2 Shared Pools
3 Predictive Recruitment
Retainer Subscriptions
• Proposal: increase µ by allowing workers to subscribe to realtime tasks
• Instead of posting to the global task list, the platform sends a message to subscribers
• Change crowdsourcing from a pull model to a push model
Global Retainer Pools
• Sharing one global retainer pool across requesters improves performance
• Intuition: Most workers are padding for unlikely runs of arrivals
[Figure: timelines of task arrivals over time for Task 1, Task 2, and the combined stream.]
Global Retainer Pools
• Through approximation, individual pools:
$$\pi(c) \approx \frac{1}{\sqrt{2\pi c}} \left( e^{-\rho} (e\rho/c)^c \right)$$
• Shared pools across k requesters (kc workers, traffic intensity kρ):
$$\pi(kc) \approx \frac{1}{\sqrt{2\pi k c}} \left( e^{-\rho} (e\rho/c)^c \right)^k$$
• Loss rate declines exponentially with the number of bundled retainer pools
Global Retainer Pools
Cost dramatically decreases as you combine retainers: k dollars to log(k) dollars
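The benefit of sharing can be illustrated by counting workers. This is a sketch under simplifying assumptions that are not from the talk: k requesters each have the same traffic intensity ρ and the same miss-probability target, and a shared pool simply sees traffic intensity kρ. The function names are hypothetical.

```python
def min_pool_size(rho, p_max):
    """Smallest c with Erlang loss pi(c) <= p_max (requires p_max > 0)."""
    b, c = 1.0, 0
    while b > p_max:
        c += 1
        b = rho * b / (c + rho * b)
    return c


def compare_pooling(rho, k, p_max):
    """Total workers for k separate pools vs. one shared pool.

    Assumes k identical requesters, each with traffic intensity rho.
    """
    separate = k * min_pool_size(rho, p_max)
    shared = min_pool_size(k * rho, p_max)
    return separate, shared


if __name__ == "__main__":
    # Illustrative values: rho = 1 per requester, 1% miss target.
    for k in (1, 2, 4, 8):
        print(k, compare_pooling(1.0, k, p_max=0.01))
```

With these assumed values and k = 4, separate pools need 20 workers in total while a shared pool needs 10 to meet the same target.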
Global Retainer Routing
• Not every worker in a global retainer pool is good at every task
• If we assigned each worker to any task they could do, some tasks would starve
Global Retainer Routing
• We want to maintain a buffer of workers to respond to all kinds of tasks
• A linear programming technique can balance the traffic intensities across all tasks
Precruitment
• Predictive Recruitment: notify workers before the task arrives
• Recall workers in expectation of having a task by the time they arrive 2–3 seconds later
Precruitment
Formative Study, N=373 tasks
• 3¢ for 3-minute retainer task: whack-a-mole
• ‘Loading...’ screen for randomly-selected time [0, 20] seconds after worker returns
• Click on randomly-placed mole
Precruitment
Results
• Median time to mouse move: 0.50 seconds
• Standard retainer model (start timer @ alert): median mouse move in 1.36 seconds
[Figure: Response time to move mouse (sec) plotted against time waiting for task (sec).]
Discussion
• Empirics: Can deployed crowdsourcing platforms support lots of realtime tasks?
• Theory: Crowds as queueing systems
• Reputation: median response time, overall response rate
Use queueing theory to understand and optimize performance of a paid, realtime crowdsourcing platform.
• Relationship between crowd size and response time
• Algorithm for optimizing crowd size vs. response time
• Improvements to the platform: 500 millisecond feedback