Crowdsourcing is for the tail
Gianluca Demartini
eXascale Infolab
University of Fribourg, Switzerland
gianlucademartini.net
exascale.info
Crowdsourced Data Curation
• Enforce quality and coverage in KBs
• Curate structured representations of tail entities
• Leverage the diversity of the crowd
• Targeted crowdsourcing
The long tail of entity popularity
Tail Entities
• Local restaurants
• Niche sports domains (chess, cricket)
• Emerging music bands
• Rare diseases
Improving Crowdsourcing Platforms
Push Crowdsourcing
• Pick-A-Crowd: a system architecture that uses task-to-worker matching based on:
– The worker’s social profile
– The task context
• Workers can provide higher quality answers on tasks they relate to
Djellel Eddine Difallah, Gianluca Demartini, and Philippe Cudré-Mauroux. Pick-A-Crowd: Tell Me What You Like, and I'll Tell You What to Do. In: 22nd International Conference on World Wide Web (WWW 2013), Rio de Janeiro, Brazil, May 2013.
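The task-to-worker matching idea above can be illustrated with a minimal sketch. It assumes a worker's social profile is a set of liked topics and a task's context is a set of keywords, and it ranks workers by Jaccard overlap between the two. The function names, the sample workers, and the Jaccard score are all assumptions made for illustration; they are not the matching model described in the Pick-A-Crowd paper.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of topic labels."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_workers(task_keywords, worker_profiles, top_k=2):
    """Return up to top_k worker ids whose profile overlaps the task context.

    worker_profiles maps a worker id to the set of topics the worker likes
    (a hypothetical stand-in for a social profile).
    Workers with zero overlap are never recommended.
    """
    scored = [
        (jaccard(task_keywords, likes), worker)
        for worker, likes in worker_profiles.items()
    ]
    # Highest score first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [worker for score, worker in scored[:top_k] if score > 0]


# Hypothetical workers and task, for illustration only.
workers = {
    "alice": {"chess", "cricket", "tennis"},
    "bob": {"indie rock", "jazz"},
    "carol": {"chess", "jazz"},
}
task = {"chess", "tournament"}

print(rank_workers(task, workers))
```

In a push setting, the platform would offer the task to the top-ranked workers instead of waiting for workers to pull it from a list.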
Pick-A-Crowd
Discussion
• Task-to-Worker recommendation / Matchmaking
• Experimental comparison with AMT shows a consistent quality improvement
“Workers Know what they Like”
OpenTurk
• Yet another platform? Build on top of MTurk!
• Chrome extension for push / notification
• 400+ users
• http://bit.ly/openturk-extension
• Open source: https://github.com/openturk/extension
Transactive Search
• Transactive memories
• Transactive search:
– Memory reconstructed by a group of people
– Need to target the right people
– A form of targeted crowdsourcing
• “Who attended the ISWC 2013 conference?”
Transactive Search
• Machines: harvest the Web + data mining
• Crowd: search Twitter, look at event pictures
• Transactive memories: remember who I met
Michele Catasta, Alberto Tonon, Djellel Eddine Difallah, Gianluca Demartini, Karl Aberer, and Philippe Cudré-Mauroux. Hippocampus: Answering Memory Queries using Transactive Search. In: 23rd International Conference on World Wide Web (WWW 2014), Web Science Track. Seoul, South Korea, April 2014.
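One way to picture how the three sources above could be combined for a memory query is a simple vote across sources. The sketch below assumes each source returns a list of candidate names and ranks candidates by how many sources agree on them. The source labels, the placeholder names, and the voting rule are assumptions for illustration, not the method of the Hippocampus paper.

```python
from collections import Counter


def merge_answers(sources):
    """Rank candidate answers by the number of sources that returned them.

    sources maps a source label (e.g. "machine", "crowd", "memory")
    to the list of candidate answers that source produced.
    """
    votes = Counter()
    for answers in sources.values():
        # A source counts at most once per candidate, even if it repeats it.
        for name in set(answers):
            votes[name] += 1
    # Most-supported first; ties are broken alphabetically for determinism.
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))


# Placeholder candidates for a query such as "Who attended the conference?"
sources = {
    "machine": ["person_a", "person_b"],
    "crowd": ["person_b", "person_c"],
    "memory": ["person_b", "person_a", "person_a"],
}

for name, support in merge_answers(sources):
    print(name, support)
```

Candidates backed by a single source, such as those recalled only from one participant's memory, are the ones where targeting the right people matters most.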
Who attended ISWC 2013?
Conclusions
• Crowdsourcing for tail entities
• Focusing on the difficult part of the KB
– The tail is long!
• Challenges
– Which tail entities are valuable?
– Who is the right worker?
– Focus on passion rather than monetary incentives