Exploiting disagreement through open-ended tasks for capturing interpretation spaces
Doctoral Consortium
By Benjamin Timmermans
Outline
Introduction
State of the Art
Problem Statement
Methodology
Preliminary Results
Conclusions
Introduction
How many dogs were in the picture?
There is no universal "truth"
For the training, testing and evaluation of machines we rely on a... ground "truth"
State of the Art
Crowdsourcing Approach
1-3 annotators
Evaluate workers
Inter-annotator agreement
Use test questions
Predefined answer choices
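The inter-annotator agreement this approach relies on is typically a chance-corrected statistic. As a minimal sketch (the example labels are hypothetical, not from the talk), Cohen's kappa for two annotators:

```python
# Sketch: inter-annotator agreement via Cohen's kappa for two annotators.
from collections import Counter

def cohens_kappa(a, b):
    """Chance-corrected agreement between two annotators' label sequences."""
    assert len(a) == len(b)
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Expected chance agreement, from each annotator's label frequencies.
    freq_a, freq_b = Counter(a), Counter(b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical answers to "How many dogs were in the picture?"
print(cohens_kappa([2, 2, 3, 2, 1], [2, 3, 3, 2, 1]))  # ≈ 0.69
```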
The CrowdTruth Approach
10-15 annotators
Evaluate the input, annotations and workers
Disagreement-based analytics
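A minimal sketch of what disagreement-based analytics can look like (my simplification, not the exact CrowdTruth metrics): each worker's annotations on a media unit form a vector over the answer space, and a worker's agreement on that unit is the cosine similarity with the aggregated vector of all other workers. Low agreement flags minority interpretations without declaring them wrong.

```python
# Simplified sketch of disagreement-based analytics; data is hypothetical.
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two sparse answer vectors (Counters)."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def worker_unit_agreement(annotations, worker):
    """One worker's answer vector vs. the aggregate of all other workers."""
    mine = Counter(annotations[worker])
    others = Counter()
    for w, answers in annotations.items():
        if w != worker:
            others.update(answers)
    return cosine(mine, others)

# Hypothetical annotations on one media unit (4 of 10-15 workers shown).
unit = {
    "w1": ["dog", "bark"],
    "w2": ["dog"],
    "w3": ["dog", "bark"],
    "w4": ["car"],  # minority interpretation: scores 0.0, not "wrong"
}
for w in unit:
    print(w, round(worker_unit_agreement(unit, w), 2))
```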
Problem Statement
Problems with multimedia annotations
Are sparse
Are homogeneous
Do not represent everything that can be heard or seen
Problems with crowdsourcing tasks
Are designed to stimulate agreement
Assume answers are right or wrong
Closed task
How many beams do you see?
1 2 3 4 5
Open-ended tasks
How many beams do you see?
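As a sketch of the difference (the answers below are hypothetical): an open-ended task keeps every interpretation and its weight, instead of forcing a single "correct" count from a predefined list.

```python
# Sketch: free-text answers to an open-ended task form a distribution,
# i.e. the interpretation space of the media unit.
from collections import Counter

answers = ["4", "5", "4", "6", "4", "5", "lots", "4", "5", "4"]  # hypothetical
space = Counter(answers)

total = sum(space.values())
for answer, count in space.most_common():
    print(f"{answer}: {count / total:.0%}")
# Instead of one right answer, each interpretation keeps its weight.
```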
Gathering the interpretation space of multimedia through open-ended crowdsourcing tasks
Goal
More efficient crowdsourcing
Higher quality ground truth data
Improved search and discovery of multimedia
Research Question
Are open-ended crowdsourcing tasks a feasible method for capturing the interpretation space of multimedia?
Methodology
1. Improving quality evaluation
Compare closed and open-ended tasks
Measure worker confidence
2. Improving open-ended task design
Combine constraints with open-ended designs
Show known annotations
Detect the distribution of answers (see the sketch after this list)
3. Applying the ground "truth"
Compare different contexts
Improve indexing of multimedia
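One way to detect the distribution of answers, as named under step 2 (my reading of the bullet, with hypothetical data): the normalized entropy of the answer distribution signals how ambiguous a unit is, and so whether it needs more annotators or is simply open to many interpretations.

```python
# Sketch: normalized entropy of an answer distribution as an ambiguity signal.
import math
from collections import Counter

def normalized_entropy(answers):
    """0.0 = full agreement, 1.0 = answers spread uniformly."""
    counts = Counter(answers)
    n = sum(counts.values())
    if len(counts) < 2:
        return 0.0
    h = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return h / math.log2(len(counts))

print(normalized_entropy(["4", "4", "4", "4", "4", "4", "4", "4", "4", "5"]))  # ≈ 0.47
print(normalized_entropy(["3", "4", "5", "6", "wood"]))  # 1.0: maximal disagreement
```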
Preliminary Results
Gathering training data for IBM Watson
Range of tasks
Passage Justification
Passage Alignment
Distributional Disambiguation
Sound Interpretations
2,133 short sounds
Top 5,000 search terms = 11 million searches
Sound tag overlap
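"Sound tag overlap" suggests comparing the tag sets that different workers give the same sound. A minimal sketch, assuming Jaccard similarity as the overlap measure (the tags below are hypothetical):

```python
# Sketch: tag overlap between two workers' tag sets for the same sound,
# measured as Jaccard similarity (an assumed choice of metric).
def jaccard(tags_a, tags_b):
    a, b = set(tags_a), set(tags_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical tags for one of the 2,133 short sounds.
print(jaccard(["dog", "bark", "outside"], ["dog", "barking"]))  # 0.25
```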
Conclusions
There is no ultimate "truth"
Do not stimulate agreement
Capture the interpretation space
Use open-ended crowdsourcing tasks
Evaluation is more difficult
Who we are
Lora Aroyo
Robert-Jan Sips
Chris Welty
Oana Inel
Anca Dumitrache
Benjamin Timmermans
Acknowledgements
Supervisor: Dr. Lora Aroyo
Mentor: Dr. Matteo Palmonari