TRANSCRIPT
Crowdsourcing
Markus Rokicki
FOR CERTAIN TASKS, THE CORTEX STILL BEATS THE CPU
“We’re going to consider the human brain as an extremely advanced processor that can solve problems that computers cannot yet solve. Even more, we’re going to consider all humanity as an extremely advanced and large scale distributed processing unit that can solve large scale problems that computers cannot yet solve. …. What I want to advocate ... is a symbiotic relationship, a symbiosis in which humans solve some problems, computers solve others, and together we work to create a better world.”
--- Luis von Ahn, 2006
Luis von Ahn - one of the pioneers of crowdsourcing
Definitions
● Human computation is a computer science technique in which a machine performs its function by outsourcing certain steps to humans (Wikipedia).
● Social computing relates to humans in a social role, where their communication is mediated by technology (Parameswaran and Whinston, 2007).
● Crowdsourcing is the act of having traditional human work carried out by ordinary people (Howe, 2008).
● Collective intelligence is a superset of social computing, human computation, and crowdsourcing; Malone et al. (2009) define it as groups of individuals doing things collectively that seem intelligent.
Romani, R. and Calani Baranauskas, M. "Exploring Human Computation and Social Computing to Inform the Design Process." ICEIS 2013.
Collective Intelligence
The crowd at a county fair accurately guessed the weight of an ox when their individual guesses were averaged: the average was closer to the ox's true butchered weight than the estimates of most individual crowd members.
Collective Intelligence
● Experts ($$$$): 4,000 experts, 80,000 articles, 200 years to develop, annual updates
● Masses ($): >100,000 amateurs, 1.6 million articles, 5 years to develop, real-time updates
Figure: utility vs. number of contributors (experts: ~10 contributors at high cost; masses: 10,000+ contributors at low cost)
Example: Product Categorization
Figure: Product categorization task on Amazon Mechanical Turk (https://www.mturk.com)
Human Computation Framework
Customer: Researcher, Online store, Manufacturer, etc.
Problem: Monitor animal population, Improve IR systems, Translations, Sentiment analysis, Video surveillance, Quality check, Audio surveillance
Crowd → Result
How do we obtain knowledge from the crowd?
Human Computation Framework
Customer: Researcher, Online store, Manufacturer, etc.
Problem: Monitor animal population, Improve IR systems, Translations, Sentiment analysis, Video surveillance, Quality check, Audio surveillance
Microtasks: Catch a fly, Judge relevance of a document, Classify item, Translate a sentence, Solve a CAPTCHA, Detect damage, Tag an image, Identify persons
Design: UI, Prevent spam, Engage, Facilitate, Platform, Bias, Motivation, Size, Give feedback, Pay
Pipeline: Recruit → Process → Aggregate (Filter, Judge, Weight) → Result
Questions to be answered
● How do we create meaningful tasks?
● How do we find a suitable crowd?
● How do we motivate workers to carry out the task?
● How do we aggregate the results?
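The simplest answer to the aggregation question is majority voting over redundant judgments: assign each task to several workers and keep the most frequent answer. A minimal sketch (all function and variable names are illustrative, not part of any platform API):

```python
from collections import Counter

def aggregate_majority(answers):
    """Aggregate redundant worker answers per task by majority vote.

    answers: dict mapping task_id -> list of worker labels.
    Returns dict mapping task_id -> (winning label, agreement ratio).
    """
    result = {}
    for task_id, labels in answers.items():
        label, votes = Counter(labels).most_common(1)[0]
        result[task_id] = (label, votes / len(labels))
    return result

# Example: three workers judge the relevance of two documents.
votes = {
    "doc1": ["relevant", "relevant", "not_relevant"],
    "doc2": ["not_relevant", "not_relevant", "not_relevant"],
}
print(aggregate_majority(votes))
```

The agreement ratio doubles as a cheap confidence signal: tasks with low agreement can be re-posted to more workers.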
● HIT: Human Intelligence Task
Crowdsourcing infrastructures
● Paid crowdsourcing
○ CrowdFlower
○ Mechanical Turk
● Virtual citizen science
○ Zooniverse
○ Crowdcrafting
● Hybrid systems
○ Crowd databases
Paid Microtask Crowdsourcing
Figure: Paid crowdsourcing scheme (source: http://dx.doi.org/10.1155/2014/135641)
Task design & quality control
Important for quality, worker retention, and throughput:
● Task pricing
● Worker competence/screening
● Requester reputation
● Task packaging
● Task framing & priming
● Task clarity (common problems: vague, blank, unclear, inconsistent, imprecise, or ambiguous wording; too many words; too high a standard of English; broken English; spelling)
● ...
Paper 1: Dynamics and task types on MTurk
Data-driven analysis of a microtask crowdsourcing platform, based on scraped platform data.
Djellel Eddine Difallah et al. "The dynamics of micro-task crowdsourcing: The case of Amazon MTurk". In: Proceedings of the 24th International Conference on World Wide Web. ACM, 2015, pp. 238–247.
Paper 1: Dynamics and task types on MTurk
Figure: Distributions of rewards per HIT between 2009 and 2013.
Quality control
Gold standard data
● Rely on questions with previously known answers to filter out low-quality workers
○ Gold standard questions are mixed in with "real" tasks
○ Workers usually receive feedback on their answers to gold standard questions
● Good gold standard data is:
○ Unambiguous
○ Automatically scorable
○ Not easily detectable
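The filtering idea above can be sketched as a simple accuracy threshold on hidden gold questions. This is only an illustrative implementation; function names, data layout, and the threshold value are assumptions, not from any platform:

```python
def filter_workers(responses, gold, min_accuracy=0.8):
    """Drop workers whose accuracy on hidden gold questions is too low.

    responses: dict worker_id -> dict of question_id -> answer
    gold:      dict question_id -> known correct answer
               (gold questions are a subset of all questions posted)
    Returns only the workers who pass the accuracy threshold.
    """
    kept = {}
    for worker, answers in responses.items():
        scored = [q for q in answers if q in gold]
        if not scored:                      # worker saw no gold questions
            kept[worker] = answers          # keep by default in this sketch
            continue
        correct = sum(answers[q] == gold[q] for q in scored)
        if correct / len(scored) >= min_accuracy:
            kept[worker] = answers
    return kept

gold = {"q1": "a", "q2": "b"}
responses = {
    "w1": {"q1": "a", "q2": "b", "t1": "x"},   # 2/2 on gold -> kept
    "w2": {"q1": "a", "q2": "a", "t1": "y"},   # 1/2 on gold -> dropped
}
print(sorted(filter_workers(responses, gold)))  # prints: ['w1']
```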
Qualification tests & pre-screening
Qualification tests
● Use a sample of the real task (or a simulation of it) as a qualification test.
Pre-screening
● Use specific criteria to select workers who are likely to provide high-quality responses (e.g., personality type, demographic traits).
● Reputation systems
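Reputation scores can also feed into aggregation downstream: instead of one worker, one vote, each vote is weighted by the worker's reputation. A minimal sketch (names and weight values are illustrative assumptions):

```python
def weighted_vote(votes, reputation):
    """Pick the label with the highest reputation-weighted support.

    votes:      dict worker_id -> label
    reputation: dict worker_id -> weight in (0, 1];
                unknown workers get a neutral default of 0.5.
    """
    totals = {}
    for worker, label in votes.items():
        totals[label] = totals.get(label, 0.0) + reputation.get(worker, 0.5)
    return max(totals, key=totals.get)

# One trusted worker outweighs two low-reputation workers.
print(weighted_vote({"w1": "cat", "w2": "dog", "w3": "dog"},
                    {"w1": 0.9, "w2": 0.3, "w3": 0.3}))  # prints: cat
```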
Paper 2: Postprocessing results for quality
● How to learn from noisy human (expert and crowd) labels?
● Estimate worker reliability and true labels using EM (expectation-maximization)
○ Each worker is assigned a reliability
○ Iterate:
■ E-step: estimate the hidden true labels given the current worker reliabilities
■ M-step: estimate worker reliabilities given the current label estimates
● Scenarios: binary classification, multi-class classification, ordinal regression, regression
Vikas C. Raykar et al. "Learning from crowds". In: Journal of Machine Learning Research 11 (2010), pp. 1297–1322.
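The two alternating steps can be sketched for the binary case as follows. This is a deliberately simplified, symmetric-reliability variant, not the full model of Raykar et al. (which also learns a classifier and separate per-worker sensitivity/specificity), and it assumes every worker labels every item:

```python
import numpy as np

def em_binary(labels, n_iter=50):
    """Simplified EM for binary crowd labels.

    labels: (n_items, n_workers) array of 0/1 votes.
    Returns (posterior probability that each item's true label is 1,
             estimated per-worker reliabilities).
    """
    mu = labels.mean(axis=1)  # init: soft majority-vote labels
    for _ in range(n_iter):
        # M-step: reliability = expected fraction of items each worker got right
        p = (mu[:, None] * labels + (1 - mu)[:, None] * (1 - labels)).mean(axis=0)
        p = np.clip(p, 1e-6, 1 - 1e-6)
        # E-step: posterior of the true label given votes and reliabilities
        log_one = (labels * np.log(p) + (1 - labels) * np.log(1 - p)).sum(axis=1)
        log_zero = ((1 - labels) * np.log(p) + labels * np.log(1 - p)).sum(axis=1)
        delta = np.clip(log_zero - log_one, -500, 500)  # avoid overflow
        mu = 1.0 / (1.0 + np.exp(delta))
    return mu, p

# Two reliable workers and one adversarial worker who always flips the label.
true = np.array([1, 1, 1, 0, 0, 0])
labels = np.stack([true, true, 1 - true], axis=1)
mu, p = em_binary(labels)
```

After a few iterations the adversarial worker's estimated reliability collapses toward zero, so their votes stop influencing the inferred labels, which plain majority voting cannot achieve.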
Motivation
Classic motivations:
● Paid crowdsourcing
● Gamification
● Contribution to society
Other aspects:
● Curiosity
● Team responsibility
● Social pressure
● etc.
Financial Methods of Motivation
● Employment regulation: Are the payment rules clear?
● Recruitment & retention: Is the salary competitive?
● Pay linked to performance: Is the salary fair?
● Individual vs. group incentives: What is the trade-off?
Financial Rewards: Crowdsourcing Apps
Influence of Payment
● 611 participants sorted 36,000 image sets of varying size, for varying payments.
Winter Mason and Duncan J. Watts. "Financial incentives and the performance of crowds". In: ACM SIGKDD Explorations Newsletter 11.2 (2010), pp. 100–108.
Financial Rewards: Influence of payment
Figure: Accuracy (left) and number of completed tasks (right) in relation to payment
Financial Rewards: Influence of payment
Figure: Post-hoc survey of perceived value
Paper 3: Influence of compensation and payment on quality
● Empirical study of influences on overall judgment quality
● Varying:
○ payment
○ worker qualifications
○ required effort
● Scenario: relevance assessments for information retrieval
Gabriella Kazai. "In search of quality in crowdsourcing for search engine evaluation". In: European Conference on Information Retrieval. Springer, 2011, pp. 165–176.
Complex Crowdsourcing and workflow design
1. Task decomposition: complex tasks can be decomposed into smaller independent or dependent tasks (multi-stage workflows)
2. Managing dependencies between sub-tasks: decomposed tasks can be set up and deployed in parallel or in sequence (different UIs, data, etc.)
3. Assembling results
Figure: a sequence of jobs (JOB 1 → JOB 2 → ... → JOB n)
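The three steps above can be sketched as a toy multi-stage workflow. Here a plain callable stands in for posting a microtask to a real platform and collecting worker answers; the stage names and the proofreading scenario are illustrative assumptions, not any platform's API:

```python
def run_workflow(paragraphs, post_task):
    """Three-stage crowd workflow sketch: decompose -> process -> assemble.

    post_task: callable (stage_name, payload) -> result, standing in for
    posting a microtask to a crowdsourcing platform.
    """
    # Stage 1: decomposition -- each paragraph is an independent microtask.
    flagged = [post_task("find_error", p) for p in paragraphs]
    # Stage 2: dependent sub-tasks -- fixes are requested only where
    # stage 1 flagged a problem (sequential dependency between stages).
    fixed = [post_task("fix", p) if bad else p
             for p, bad in zip(paragraphs, flagged)]
    # Stage 3: assemble the sub-results back into one document.
    return " ".join(fixed)

# Toy stand-in for the crowd: flags and fixes the typo "teh".
def fake_crowd(stage, payload):
    if stage == "find_error":
        return "teh" in payload
    return payload.replace("teh", "the")

print(run_workflow(["teh cat sat.", "All good here."], fake_crowd))
# prints: the cat sat. All good here.
```

The same shape (find a problem, fix it, then review) underlies the Find-Fix-Verify pattern discussed next; a verification stage would simply be a third `post_task` call on each fix.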
Paper 4: Collaborative workflows
● Word processing interface that integrates crowd workers to edit parts of documents on demand.
● Find-Fix-Verify: split tasks into generation and review stages.
Bernstein, Michael S., et al. "Soylent: A word processor with a crowd inside." Proceedings of the 23rd Annual ACM Symposium on User Interface Software and Technology. ACM, 2010.
Thank you!