crowdsourcing case study adsafe

Crowdsourcing Case Study AdSafe Panos Ipeirotis


Post on 25-Feb-2016



Page 1: Crowdsourcing Case Study AdSafe

Crowdsourcing Case Study: AdSafe

Panos Ipeirotis


Page 5: Crowdsourcing Case Study AdSafe

A few of the tasks in the past

• Detect pages that discuss swine flu
– A pharmaceutical firm had a drug “treating” (off-label) swine flu
– The FDA prohibited the pharmaceutical company from displaying the drug ad on pages about swine flu
– Two days to build and go live

• A big fast-food chain does not want its ad to appear:
– On pages that discuss the brand (99% negative sentiment)
– On pages discussing obesity
– Three days to build and go live

Page 6: Crowdsourcing Case Study AdSafe

Need to build models fast

• Traditionally, modeling teams have invested substantial internal resources in data formulation, information extraction, cleaning, and other preprocessing

No time for such things…

• However, we can now outsource preprocessing tasks such as labeling, feature extraction, verifying information extraction, etc.
– using Mechanical Turk, oDesk, etc.
– quality may be lower than expert labeling (much?)
– but low costs can allow massive scale
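The trade-off in the last two bullets (noisier labels, but cheap enough to collect many of them) is commonly handled by asking several workers to label each item and aggregating their answers. The sketch below uses plain majority voting as one simple way to do that. The function name `majority_vote`, the tuple layout, and the example URLs and labels are illustrative assumptions, not part of the AdSafe system.

```python
from collections import Counter, defaultdict

def majority_vote(assignments):
    """Aggregate redundant crowd labels into one label per item.

    `assignments` is an iterable of (worker_id, item_id, label) tuples.
    Returns {item_id: (winning_label, share_of_votes)}.
    Ties go to the label that was seen first for that item.
    """
    votes = defaultdict(Counter)
    for _worker, item, label in assignments:
        votes[item][label] += 1

    result = {}
    for item, counts in votes.items():
        label, n = counts.most_common(1)[0]
        result[item] = (label, n / sum(counts.values()))
    return result

# Hypothetical labels from three workers on two pages
assignments = [
    ("w1", "example.com/a", "swine_flu"),
    ("w2", "example.com/a", "swine_flu"),
    ("w3", "example.com/a", "other"),
    ("w1", "example.com/b", "other"),
    ("w2", "example.com/b", "other"),
]
consensus = majority_vote(assignments)
```

The vote share gives a rough confidence signal: items with a narrow majority could be sent out for additional labels before they are used for training.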

Page 7: Crowdsourcing Case Study AdSafe

AdSafe workflow

• Find URLs for a given topic (hate speech, gambling, alcohol abuse, guns, bombs, celebrity gossip, etc.)
http://url-collector.appspot.com/allTopics.jsp
• Classify URLs into appropriate categories
http://url-annotator.appspot.com/AdminFiles/Categories.jsp
• Measure quality of the labelers and remove spammers
http://qmturk.appspot.com/
• Get humans to “beat” the classifier by providing cases where the classifier fails
http://adsafe-beatthemachine.appspot.com/
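The "measure quality of the labelers and remove spammers" step can be illustrated with a small sketch. It scores each worker against a handful of items with known ("gold") answers and flags workers who fall below a threshold. This is one simple approach under stated assumptions, not a description of how the linked tool works; the function names, thresholds, and example data are all hypothetical.

```python
from collections import defaultdict

def worker_accuracy(assignments, gold):
    """Score workers on items that have a known (gold) label.

    `assignments` is an iterable of (worker_id, item_id, label) tuples.
    `gold` maps item_id -> correct label.
    Returns {worker_id: (accuracy, n_scored)}.
    """
    correct = defaultdict(int)
    scored = defaultdict(int)
    for worker, item, label in assignments:
        if item in gold:
            scored[worker] += 1
            correct[worker] += int(label == gold[item])
    return {w: (correct[w] / scored[w], scored[w]) for w in scored}

def flag_spammers(accuracy, min_accuracy=0.6, min_scored=3):
    """Flag workers below `min_accuracy`, ignoring those with too few scored items."""
    return sorted(
        w for w, (acc, n) in accuracy.items()
        if n >= min_scored and acc < min_accuracy
    )

# Hypothetical data: w2 answers "other" for everything
gold = {"u1": "gambling", "u2": "other", "u3": "gambling"}
assignments = [
    ("w1", "u1", "gambling"), ("w1", "u2", "other"), ("w1", "u3", "gambling"),
    ("w2", "u1", "other"),    ("w2", "u2", "other"), ("w2", "u3", "other"),
]
accuracy = worker_accuracy(assignments, gold)
spammers = flag_spammers(accuracy)
# Keep only labels from workers that were not flagged
kept = [a for a in assignments if a[0] not in spammers]
```

A fixed accuracy threshold is a deliberately crude choice. A worker who always gives the same answer can still score well when one category dominates, so the threshold and the mix of gold items would need tuning in practice.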