
Page 1: Pick a Crowd

Pick-A-Crowd: Tell Me What You Like, and I’ll Tell You What to Do

A Crowdsourcing Platform for Personalized Human Intelligence Task Assignment Based on Social Networks

Djellel E. Difallah, Gianluca Demartini, Philippe Cudré-Mauroux
eXascale Infolab, University of Fribourg, Switzerland
15th May 2013, WWW 2013 - Rio de Janeiro, Brazil

Page 2: Pick a Crowd

Crowdsourcing

• Exploit human intelligence to solve tasks that are simple for humans but complex for machines
• Examples: Wikipedia, reCAPTCHA, Duolingo
• Incentives: financial, fun, visibility

Page 3: Pick a Crowd

Motivation

• The Pull methodology is suboptimal

[Figure: effective workers vs. actual workers; the goal is to maximize their overlap]

Page 4: Pick a Crowd

Motivation

• The Push methodology is a Task-to-Worker recommender system

Page 5: Pick a Crowd

Contribution and Claim

• Pick-A-Crowd: a system architecture that matches tasks to workers based on:
  – The worker’s social profile
  – The task context
• Claim: workers can provide higher-quality answers on tasks they relate to

Page 6: Pick a Crowd

Worker Social Profiling

“You Are What You Like”

Page 7: Pick a Crowd

Problem Definition (1) – The Human Intelligence Task (HIT)

Example task types: Image Tagging, Data Collection, Survey, Categorization

Batch of tasks:
- Title
- Batch instruction
- Specific task instruction*
- Task data: text, options, additional data (image, URL)
- List of categories*
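A minimal sketch of how the batch structure above could be represented in code (Python dataclasses; all field names are illustrative, not the paper's actual schema):

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Task:
    """One Human Intelligence Task inside a batch."""
    instruction: str                       # task-specific instruction
    text: str                              # textual content of the task
    options: List[str] = field(default_factory=list)  # candidate answers (empty for open-ended tasks)
    attachment_url: Optional[str] = None   # additional data, e.g. an image URL

@dataclass
class Batch:
    """A batch of tasks as published by the requester."""
    title: str
    instruction: str                       # batch-level instruction
    categories: List[str]                  # categories the requester assigns to the batch
    tasks: List[Task] = field(default_factory=list)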

Page 8: Pick a Crowd

Problem Definition (2) – The Worker

Worker profile on AMT: Completed HITs: 256, Approval Rate: 96%, Qualification Types, Generic Qualifications

Worker profile in Pick-A-Crowd: the Facebook pages the worker likes, each with a title, category, description, feed, etc.

Page 9: Pick a Crowd

Problem Definition (3) – Task-to-Worker Matching

Given a batch of tasks (title, instructions, task data, categories) and each worker’s liked pages (title, category, description, feed, etc.):
1. Task-to-Page matching function: category-based, expert finding, or semantic
2. Worker ranking

Page 10: Pick a Crowd

Matching Models (1/3) – Category Based

• The requester provides a list of categories related to the batch
• Build the subset of pages whose category is in the batch’s category list
• Rank the workers by the number of liked pages in that subset (sketched below)
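A minimal sketch of this ranking, assuming each worker is represented by the set of Facebook pages they like; the page and worker data in the example are hypothetical:

from typing import Dict, List, Set

def rank_workers_by_category(
    batch_categories: Set[str],
    page_category: Dict[str, str],        # page id -> page category
    worker_likes: Dict[str, Set[str]],    # worker id -> liked page ids
) -> List[str]:
    """Rank workers by how many of their liked pages fall into the batch categories."""
    # Subset of pages whose category is in the batch's category list
    relevant_pages = {p for p, c in page_category.items() if c in batch_categories}
    # Score each worker by the number of liked pages in that subset
    scores = {w: len(likes & relevant_pages) for w, likes in worker_likes.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical example
pages = {"fc_barcelona": "Sport", "mozart": "Musician", "messi": "Athlete"}
likes = {"alice": {"fc_barcelona", "messi"}, "bob": {"mozart"}}
print(rank_workers_by_category({"Sport", "Athlete"}, pages, likes))  # -> ['alice', 'bob']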

Page 11: Pick a Crowd

Matching Models (2/3) – Expert Finding

• Build an inverted index over the pages’ titles and descriptions
• Use the title/description of the tasks as a keyword query against the inverted index to obtain a subset of pages
• Rank the workers by the number of liked pages in that subset (sketched below)
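A minimal sketch of the same ranking driven by keyword matching; the toy inverted index below stands in for a real IR engine, and all identifiers are illustrative:

from collections import defaultdict
from typing import Dict, List, Set

def build_inverted_index(page_text: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each term to the set of pages whose title/description contains it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for page_id, text in page_text.items():
        for term in text.lower().split():
            index[term].add(page_id)
    return index

def rank_workers_by_query(
    query: str,
    index: Dict[str, Set[str]],
    worker_likes: Dict[str, Set[str]],    # worker id -> liked page ids
) -> List[str]:
    """Use the task title/instruction as a keyword query and rank workers
    by the number of matching pages they like."""
    matching_pages: Set[str] = set()
    for term in query.lower().split():
        matching_pages |= index.get(term, set())
    scores = {w: len(likes & matching_pages) for w, likes in worker_likes.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical example
index = build_inverted_index({"fc_barcelona": "FC Barcelona football club",
                              "mozart": "Wolfgang Amadeus Mozart composer"})
print(rank_workers_by_query("football players", index,
                            {"alice": {"fc_barcelona"}, "bob": {"mozart"}}))  # -> ['alice', 'bob']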

Page 12: Pick a Crowd

Matching Models (3/3) – Semantic Based

• Link the task context to an external knowledge base (e.g., DBpedia)
• Exploit the underlying graph structure to determine the similarity between HITs and pages
  – Assumption: a worker who likes a page can answer questions about related entities
  – Assumption: a worker who likes a page can answer questions about entities of the same type
• Rank the workers by the number of liked pages in the resulting subset (sketched below)
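A minimal sketch of the type-similarity variant, assuming the task entities and the pages have already been linked to knowledge-base types (the linking step is not shown). Jaccard overlap over type sets stands in here for the paper's graph-based relatedness and type-similarity measures, and the threshold is an illustrative parameter:

from typing import Dict, List, Set

def type_similarity(entity_types: Set[str], page_types: Set[str]) -> float:
    """Jaccard overlap between the KB types of the task entities and of a liked page."""
    if not entity_types or not page_types:
        return 0.0
    return len(entity_types & page_types) / len(entity_types | page_types)

def rank_workers_semantically(
    task_entity_types: Set[str],                 # KB types of the entities mentioned in the HIT
    page_types: Dict[str, Set[str]],             # page id -> KB types of the entity behind the page
    worker_likes: Dict[str, Set[str]],           # worker id -> liked page ids
    threshold: float = 0.2,
) -> List[str]:
    """Rank workers by the number of liked pages whose linked entity
    has types similar to the entities in the task."""
    relevant = {p for p, types in page_types.items()
                if type_similarity(task_entity_types, types) >= threshold}
    scores = {w: len(likes & relevant) for w, likes in worker_likes.items()}
    return sorted(scores, key=scores.get, reverse=True)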

[Diagram: a HIT is matched to Facebook pages via similarity, relatedness, and type-similarity]

Page 13: Pick a Crowd


Pick-A-Crowd Architecture

Page 14: Pick a Crowd

Experimental Evaluation

• The Facebook app OpenTurk implements part of the Pick-A-Crowd architecture:
  – More than 170 registered workers participated
  – Over 12k pages crawled
• Covered both multiple-choice and open-ended questions:
  – 50 images, each with a multiple-choice question and 5 candidate answers (Soccer, Actors, Music, Authors, Movies, Anime)
  – 20 open-ended questions related to one topic (Cricket)

Page 15: Pick a Crowd


OpenTurk app

Page 16: Pick a Crowd

Evaluation – Correlation between crowd accuracy and the number of relevant likes (category based)

[Plot: worker precision vs. number of relevant likes]

Page 17: Pick a Crowd

Evaluation (Baseline) – Amazon Mechanical Turk (AMT)

AMT 3 = majority vote of 3 workers
AMT 5 = majority vote of 5 workers
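The baselines aggregate redundant answers by majority voting. A minimal sketch of that aggregation step (generic Python, not tied to the AMT API; the example answers are hypothetical):

from collections import Counter
from typing import List

def majority_vote(answers: List[str]) -> str:
    """Return the most frequent answer among those collected (ties broken arbitrarily)."""
    return Counter(answers).most_common(1)[0][0]

# Hypothetical example: three workers answer the same multiple-choice question
print(majority_vote(["Messi", "Ronaldo", "Messi"]))  # -> "Messi"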

Page 18: Pick a Crowd

Evaluation – HIT Assignment Models

[Chart: answer quality of the category-based approach]

Page 19: Pick a Crowd

Evaluation – HIT Assignment Models

[Charts: answer quality of the expert-finding approach, using title vs. instruction content as the query]

Page 20: Pick a Crowd

Evaluation – HIT Assignment Models

[Charts: answer quality of the semantic approach, using type similarity vs. relatedness]

Page 21: Pick a Crowd

Evaluation – Comparison with Mechanical Turk

[Chart: answer quality of AMT vs. Pick-A-Crowd]

Page 22: Pick a Crowd

Conclusions and Future Work

• Pull vs. Push methodologies in crowdsourcing
• Pick-A-Crowd system architecture with Task-to-Worker recommendation
• Experimental comparison with AMT shows a consistent quality improvement: “Workers know what they like”
• Future work: exploit more of the social activity and handle content-less tasks

Page 23: Pick a Crowd

Next Step

• We are building a crowdsourcing platform for the research community

• Pre-register on:

www.openturk.com

Thank You!