TRANSCRIPT
Crowdsourcing
Markus Rokicki
FOR CERTAIN TASKS, THE CORTEX STILL BEATS THE CPU
“We’re going to consider the human brain as an extremely advanced processor that can solve problems that computers cannot yet solve. Even more, we’re going to consider all humanity as an extremely advanced and large scale distributed processing unit that can solve large scale problems that computers cannot yet solve. …. What I want to advocate ... is a symbiotic relationship, a symbiosis in which humans solve some problems, computers solve others, and together we work to create a better world.”
--- Luis von Ahn, 2006
Luis von Ahn - one of the pioneers of crowdsourcing
Definitions
● Human computation is a computer science technique in which a machine performs its function by outsourcing certain steps to humans (Wikipedia).
● Social computing relates to humans in a social role, where their communication is mediated by technology (Parameswaran and Whinston, 2007).
● Crowdsourcing is the act of having traditional human work carried out by ordinary people (Howe, 2008).
● Collective intelligence is a superset of social computing, human computation, and crowdsourcing; Malone et al. (2009) define it as groups of individuals doing things collectively that seem intelligent.
Romani, R. and Calani Baranauskas, M. "Exploring Human Computation and Social Computing to Inform the Design Process." ICEIS 2013.
Collective Intelligence
The crowd at a county fair accurately guessed the weight of an ox when their individual guesses were averaged: the average was closer to the ox's true butchered weight than the estimates of most individual crowd members.
Collective Intelligence
● Experts ($$$$): 4,000 experts, 80,000 articles, 200 years to develop, annual updates
● Masses ($): >100,000 amateurs, 1.6 million articles, 5 years to develop, real-time updates
Figure: utility vs. number of contributors (experts: ~10 contributors at high cost; masses: 10,000+ contributors at low cost)
Example: Product Categorization
Figure: Product categorization task on Amazon Mechanical Turk (https://www.mturk.com)
Human Computation Framework
Customer: Researcher, Online store, Manufacturer, etc.
Problem: Monitor animal population, Improve IR systems, Translations, Sentiment analysis, Video surveillance, Quality check, Audio surveillance
Crowd → Result
How do we obtain knowledge from the crowd?
Human Computation Framework
Customer: Researcher, Online store, Manufacturer, etc.
Problem: Monitor animal population, Improve IR systems, Translations, Sentiment analysis, Video surveillance, Quality check, Audio surveillance
Microtasks: Catch a fly, Judge relevance of a document, Classify item, Translate a sentence, Solve a CAPTCHA, Detect damage, Tag an image, Identify persons
Design: UI, Prevent spam, Engage, Facilitate, Platform, Bias, Motivation, Size, Give feedback, Pay
Pipeline: Recruit → Process → Aggregate (Filter, Judge, Weight) → Result
Questions to be answered
● How do we create meaningful tasks?
● How do we find a suitable crowd?
● How do we motivate workers to carry out the task?
● How do we aggregate the results?
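The simplest answer to the aggregation question is majority voting over redundant judgments: assign each task to several workers and keep the most frequent answer. A minimal sketch (all function and variable names are illustrative, not part of any platform API):

```python
from collections import Counter

def aggregate_majority(answers):
    """Aggregate redundant worker answers per task by majority vote.

    answers: dict mapping task_id -> list of worker labels.
    Returns dict mapping task_id -> (winning label, agreement ratio).
    """
    result = {}
    for task_id, labels in answers.items():
        label, votes = Counter(labels).most_common(1)[0]
        result[task_id] = (label, votes / len(labels))
    return result

# Example: three workers judge the relevance of two documents.
votes = {
    "doc1": ["relevant", "relevant", "not_relevant"],
    "doc2": ["not_relevant", "not_relevant", "not_relevant"],
}
print(aggregate_majority(votes))
```

The agreement ratio doubles as a cheap confidence signal: tasks with low agreement can be re-posted to more workers.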
● HIT: Human Intelligence Task
Crowdsourcing infrastructures
● Paid crowdsourcing
○ CrowdFlower
○ Mechanical Turk
● Virtual citizen science
○ Zooniverse
○ Crowdcrafting
● Hybrid systems
○ Crowd databases
Paid Microtask Crowdsourcing
Figure: Paid crowdsourcing scheme (source: http://dx.doi.org/10.1155/2014/135641)
Task design & quality control
Important for quality, worker retention, and throughput:
● Task pricing
● Worker competence/screening
● Requester reputation
● Task packaging
● Task framing & priming
● Task clarity (common problems: vague, blank, unclear, inconsistent, imprecise, or ambiguous wording; too many words; too high a standard of English; broken English; spelling)
● ...
Paper 1: Dynamics and task types on MTurk
Data-driven analysis of a microtask crowdsourcing platform, based on scraped platform data.
Djellel Eddine Difallah et al. "The dynamics of micro-task crowdsourcing: The case of Amazon MTurk". In: Proceedings of the 24th International Conference on World Wide Web. ACM, 2015, pp. 238–247.
Paper 1: Dynamics and task types on MTurk
Figure: Distributions of rewards per HIT between 2009 and 2013.
Quality control
Gold standard data
● Rely on questions with previously known answers to filter out low-quality workers
○ Gold standard questions are mixed in with "real" tasks
○ Workers usually receive feedback on their answers to gold standard questions
● Good gold standard data is:
○ Unambiguous
○ Automatically scorable
○ Not easily detectable
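The filtering idea above can be sketched as a simple accuracy threshold on hidden gold questions. This is only an illustrative implementation; function names, data layout, and the threshold value are assumptions, not from any platform:

```python
def filter_workers(responses, gold, min_accuracy=0.8):
    """Drop workers whose accuracy on hidden gold questions is too low.

    responses: dict worker_id -> dict of question_id -> answer
    gold:      dict question_id -> known correct answer
               (gold questions are a subset of all questions posted)
    Returns only the workers who pass the accuracy threshold.
    """
    kept = {}
    for worker, answers in responses.items():
        scored = [q for q in answers if q in gold]
        if not scored:                      # worker saw no gold questions
            kept[worker] = answers          # keep by default in this sketch
            continue
        correct = sum(answers[q] == gold[q] for q in scored)
        if correct / len(scored) >= min_accuracy:
            kept[worker] = answers
    return kept

gold = {"q1": "a", "q2": "b"}
responses = {
    "w1": {"q1": "a", "q2": "b", "t1": "x"},   # 2/2 on gold -> kept
    "w2": {"q1": "a", "q2": "a", "t1": "y"},   # 1/2 on gold -> dropped
}
print(sorted(filter_workers(responses, gold)))  # prints: ['w1']
```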
Qualification tests & pre-screening
Qualification tests
● Use a sample of the real task (or a simulation of it) as a qualification test.
Pre-screening
● Use specific criteria to select workers who are likely to provide high-quality responses (e.g., personality type, demographic traits).
● Reputation systems
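Reputation scores can also feed into aggregation downstream: instead of one worker, one vote, each vote is weighted by the worker's reputation. A minimal sketch (names and weight values are illustrative assumptions):

```python
def weighted_vote(votes, reputation):
    """Pick the label with the highest reputation-weighted support.

    votes:      dict worker_id -> label
    reputation: dict worker_id -> weight in (0, 1];
                unknown workers get a neutral default of 0.5.
    """
    totals = {}
    for worker, label in votes.items():
        totals[label] = totals.get(label, 0.0) + reputation.get(worker, 0.5)
    return max(totals, key=totals.get)

# One trusted worker outweighs two low-reputation workers.
print(weighted_vote({"w1": "cat", "w2": "dog", "w3": "dog"},
                    {"w1": 0.9, "w2": 0.3, "w3": 0.3}))  # prints: cat
```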
Paper 2: Postprocessing results for quality
● How to learn from noisy human (expert and crowd) labels?
● Estimate worker reliability and true labels using EM (expectation-maximization)
○ Each worker is assigned a reliability
○ Iterate:
■ E-step: estimate the hidden true labels given the current worker reliabilities
■ M-step: estimate worker reliabilities given the current label estimates
● Scenarios: binary classification, multi-class classification, ordinal regression, regression
Vikas C. Raykar et al. "Learning from crowds". In: Journal of Machine Learning Research 11 (2010), pp. 1297–1322.
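The two alternating steps can be sketched for the binary case as follows. This is a deliberately simplified, symmetric-reliability variant, not the full model of Raykar et al. (which also learns a classifier and separate per-worker sensitivity/specificity), and it assumes every worker labels every item:

```python
import numpy as np

def em_binary(labels, n_iter=50):
    """Simplified EM for binary crowd labels.

    labels: (n_items, n_workers) array of 0/1 votes.
    Returns (posterior probability that each item's true label is 1,
             estimated per-worker reliabilities).
    """
    mu = labels.mean(axis=1)  # init: soft majority-vote labels
    for _ in range(n_iter):
        # M-step: reliability = expected fraction of items each worker got right
        p = (mu[:, None] * labels + (1 - mu)[:, None] * (1 - labels)).mean(axis=0)
        p = np.clip(p, 1e-6, 1 - 1e-6)
        # E-step: posterior of the true label given votes and reliabilities
        log_one = (labels * np.log(p) + (1 - labels) * np.log(1 - p)).sum(axis=1)
        log_zero = ((1 - labels) * np.log(p) + labels * np.log(1 - p)).sum(axis=1)
        delta = np.clip(log_zero - log_one, -500, 500)  # avoid overflow
        mu = 1.0 / (1.0 + np.exp(delta))
    return mu, p

# Two reliable workers and one adversarial worker who always flips the label.
true = np.array([1, 1, 1, 0, 0, 0])
labels = np.stack([true, true, 1 - true], axis=1)
mu, p = em_binary(labels)
```

After a few iterations the adversarial worker's estimated reliability collapses toward zero, so their votes stop influencing the inferred labels, which plain majority voting cannot achieve.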
Motivation
Classic motivations:
● Paid crowdsourcing
● Gamification
● Contribution to society
Other aspects:
● Curiosity
● Team responsibility
● Social pressure
● etc.
Financial Methods of Motivation
● Employment regulation: Are the payment rules clear?
● Recruitment & retention: Is the salary competitive?
● Pay linked to performance: Is the salary fair?
● Individual vs. group incentives: What is the trade-off?
Financial Rewards: Crowdsourcing Apps
Influence of Payment
● 611 participants sorted 36,000 image sets of varying size, for varying payments.
Winter Mason and Duncan J. Watts. "Financial incentives and the performance of crowds". In: ACM SIGKDD Explorations Newsletter 11.2 (2010), pp. 100–108.
Financial Rewards: Influence of payment
Figure: Accuracy (left) and number of completed tasks (right) in relation to payment
Financial Rewards: Influence of payment
Figure: Post-hoc survey of perceived value
Paper 3: Influence of compensation and payment on quality
● Empirical study of influences on overall judgment quality
● Varying:
○ payment
○ worker qualifications
○ required effort
● Scenario: relevance assessments for information retrieval
Gabriella Kazai. "In search of quality in crowdsourcing for search engine evaluation". In: European Conference on Information Retrieval. Springer, 2011, pp. 165–176.
Complex Crowdsourcing and workflow design
1. Task decomposition: complex tasks can be decomposed into smaller independent or dependent tasks (multi-stage workflows)
2. Managing dependencies between sub-tasks: decomposed tasks can be set up and deployed in parallel or in sequence (different UIs, data, etc.)
3. Assembling results
Figure: a sequence of jobs (JOB 1 → JOB 2 → ... → JOB n)
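The three steps above can be sketched as a toy multi-stage workflow. Here a plain callable stands in for posting a microtask to a real platform and collecting worker answers; the stage names and the proofreading scenario are illustrative assumptions, not any platform's API:

```python
def run_workflow(paragraphs, post_task):
    """Three-stage crowd workflow sketch: decompose -> process -> assemble.

    post_task: callable (stage_name, payload) -> result, standing in for
    posting a microtask to a crowdsourcing platform.
    """
    # Stage 1: decomposition -- each paragraph is an independent microtask.
    flagged = [post_task("find_error", p) for p in paragraphs]
    # Stage 2: dependent sub-tasks -- fixes are requested only where
    # stage 1 flagged a problem (sequential dependency between stages).
    fixed = [post_task("fix", p) if bad else p
             for p, bad in zip(paragraphs, flagged)]
    # Stage 3: assemble the sub-results back into one document.
    return " ".join(fixed)

# Toy stand-in for the crowd: flags and fixes the typo "teh".
def fake_crowd(stage, payload):
    if stage == "find_error":
        return "teh" in payload
    return payload.replace("teh", "the")

print(run_workflow(["teh cat sat.", "All good here."], fake_crowd))
# prints: the cat sat. All good here.
```

The same shape (find a problem, fix it, then review) underlies the Find-Fix-Verify pattern discussed next; a verification stage would simply be a third `post_task` call on each fix.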
Paper 4: Collaborative workflows
● Word processing interface that integrates crowd workers to edit parts of documents on demand.
● Find-Fix-Verify: split tasks into generation and review stages.
Bernstein, Michael S., et al. "Soylent: A word processor with a crowd inside." Proceedings of the 23rd Annual ACM Symposium on User Interface Software and Technology. ACM, 2010.
Thank you!