My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone
TRANSCRIPT
![Page 1: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/1.jpg)
http://lora-aroyo.org @laroyo
Disrupting the Semantic Comfort Zone
Lora Aroyo
Web & Media Group
![Page 2: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/2.jpg)
Bulgaria
The Netherlands
Sofia
NYC
Personal Semantics
![Page 3: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/3.jpg)
Riva del Garda, Italy, 2014
Semantic Social Life
![Page 4: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/4.jpg)
To understand the value of Semantic Web for e-learning
you have to understand people, e.g. how they learn, interact &
consume information
![Page 5: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/5.jpg)
To understand the value of Semantic Web for e-learning
you have to understand people, e.g. how they interact &
consume information
![Page 6: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/6.jpg)
To understand the value of Semantic Web for cultural heritage
you have to understand people, e.g. how they interact & consume information
![Page 7: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/7.jpg)
![Page 8: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/8.jpg)
To understand the value of Semantic Web for digital humanities, you have to
understand people, e.g. how they interact & consume information
![Page 9: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/9.jpg)
people are in the center of everything
people & their semantics, i.e. their real-world behavior, online interactions, information needs, information consumption habits, personal preferences ...
![Page 10: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/10.jpg)
CrowdTruth team
![Page 11: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/11.jpg)
the evolution of the semantic web: great moments from the 1980s to ESWC 2017
![Page 12: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/12.jpg)
A long time ago in a galaxy far, far away …
50’s - AI more or less begins ...
80’s - expert systems
90’s - knowledge acquisition from experts
00’s - standards & interoperability
10’s - big data & large crowds
![Page 13: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/13.jpg)
80’s - empire of the experts
![Page 14: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/14.jpg)
80’s - empire of the experts
Advances in hardware and SDEs
PCs, workstations, Symbolics, Sun
New architectures like the Hypercube
LISP, Prolog, OPS
AI can now BUILD SYSTEMS
Primary focus on experts and rules
What is the knowledge of experts?
What is the form of this knowledge? Graphs, logic, rules, frames
How do experts reason? Deduction, induction
Work on form & process remained academic: what happened inside the system, to make the reasoning inside the system proper and as good as possible
Industry forged ahead with ad-hoc & proprietary systems and actually tried to build expert systems
Origins of uncertain KR: fuzzy, probabilistic
![Page 15: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/15.jpg)
Piero Bonissone and the DELTA/CATS expert system for locomotive repair with David Smith, a locomotive repair expert
Buchanan and Shortliffe’s MYCIN project at Stanford built a huge rule base for medical diagnosis, working with an extensive team of medical experts.
![Page 16: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/16.jpg)
90’s - knowledge acquisition from experts
![Page 18: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/18.jpg)
90’s - knowledge acquisition from experts
The 90’s brought [attention for] knowledge acquisition. With expert systems by then known to work functionally, the focus [in practice as well as in scientific research and technology development] shifted to the then-bigger challenge of how to acquire knowledge in real-world scenarios.
It seems natural that, after the look inside the systems, one then needed to pay attention to how to actually get the knowledge from the world outside and frame it into properly structured knowledge for inside the system.
Dream of the 90’s
![Page 19: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/19.jpg)
![Page 20: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/20.jpg)
00’s - interoperability & standards odyssey
![Page 21: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/21.jpg)
10’s - AI Awakens
• Machine Learning
• Neural networks
• Solving basic perceptual problems instead of high-expertise ones
• Ambiguity tolerant reasoning
• Non-taxonomic ordering → non-taxonomic reasoning
• folksonomies, clustering, diversity of perspectives, embeddings
![Page 22: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/22.jpg)
2011
![Page 23: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/23.jpg)
10’s – Big Data
![Page 24: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/24.jpg)
10’s – Crowds
Human Annotation: Central in Machine Learning
Training & Evaluation
![Page 25: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/25.jpg)
Team BellKor wins Netflix Prize
Timeline markers: 1998, 2006, 2007, 2009
![Page 26: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/26.jpg)
![Page 27: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/27.jpg)
the semantic comfort zone
![Page 28: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/28.jpg)
One truth: knowledge acquisition for the semantic web assumes one correct interpretation for every example
All examples are created equal: triples are triples, one is not more important than another, they are all either true or false
Disagreement bad: when people disagree, they don’t understand the problem
Experts rule: knowledge is captured from domain experts
One is enough: knowledge by a single expert is sufficient
Detailed explanations help: if examples cause disagreement - add instructions
Once done, forever valid: knowledge is not updated; new data not aligned with old
“Truth is a Lie: 7 Myths about Human Annotation”, AI Magazine 2014, L. Aroyo, C. Welty
![Page 29: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/29.jpg)
Use Case: video archive enrichment
Search Behavior of Media Professionals at an Audiovisual Archive: A Transaction Log Analysis (2009).
B. Huurnink, L. Hollink, W. van den Heuvel, M. de Rijke.
![Page 30: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/30.jpg)
Use Case: video archive enrichment
Goal: make the multimedia content of Dutch National Video Archive accessible to large audiences
Comfort Zone Solution: media professionals watch & annotate videos. Of course!
![Page 31: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/31.jpg)
but ...
Expensive: doesn’t scale
Time-consuming: 5 times the video duration
Professional vocabulary: experts use a specific vocabulary that is unknown to general audiences
![Page 32: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/32.jpg)
… and
people search for fragments: experts annotate full videos
not finding: 35% of search queries result in not found
![Page 33: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/33.jpg)
Use Case: real world QA for Watson
Crowdsourcing ground truth for Question Answering using CrowdTruth (2015). B. Timmermans, L. Aroyo, C. Welty
![Page 34: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/34.jpg)
Use Case: real world QA for Watson
Goal: gather questions that real people ask for training & evaluating Watson
Data: 30K Questions + Candidate Answers from Yahoo! Answers
Comfort Zone Solution: ask people if the passage answers the question (Y/N). Simple!
![Page 35: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/35.jpg)
YES or NO?
Contradicting evidence
Is Coral a plant?
• “Coral almost could be considered half-plant [..]”
• “[..] organism, such as a coral, resembling a stony plant.”
Unanswerable questions
• Can I take a pill if you don't have a child yet?
• Is the spelling for being drunk right?
• Is napster black?
Unclear answer type
Is paper animal plant or man made?
Multiple right answers to a question
What is the best university in NY? (subjective)
![Page 36: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/36.jpg)
Use Case: medical relation extraction for Watson
Crowdsourcing Ground Truth for Medical Relation Extraction (2017). A. Dumitrache, L. Aroyo, C. Welty
![Page 37: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/37.jpg)
Use Case: medical relation extraction for Watson
Goal: gather data to train Watson to read medical text & automatically extract a medical relations KB
Comfort Zone Solution: having medical experts read & annotate examples
![Page 38: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/38.jpg)
ANTIBIOTICS are the first line treatment for indications of TYPHUS. treats(ANTIBIOTICS, TYPHUS)? Expert: yes
Patients with TYPHUS who were given ANTIBIOTICS exhibited side-effects. treats(ANTIBIOTICS, TYPHUS)? Expert: yes
With ANTIBIOTICS in short supply, DDT was used during WWII to control the insect vectors of TYPHUS. treats(ANTIBIOTICS, TYPHUS)? Expert: yes.
Are these three really all the same???
![Page 39: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/39.jpg)
Use Case: map music to moods
![Page 40: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/40.jpg)
Use Case: map music to moods
Goal: annotate songs with emotional tags
Comfort Zone Solution: people assign the prevalent mood of a song
![Page 41: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/41.jpg)
Goal: Which is the mood most appropriate for each song? Choose one:
Cluster 1: passionate, rousing, confident, boisterous, rowdy
Cluster 2: rollicking, cheerful, fun, sweet, amiable, good-natured
Cluster 3: literate, poignant, wistful, bittersweet, autumnal, brooding
Cluster 4: humorous, silly, campy, quirky, whimsical, witty, wry
Cluster 5: aggressive, fiery, tense, anxious, intense, volatile, visceral
Other: does not fit into any of the 5 clusters
(Lee and Hu 2012)
1 song - 1 mood???
![Page 42: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/42.jpg)
![Page 43: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/43.jpg)
Semantic Comfort Zone
![Page 44: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/44.jpg)
Semantic Comfort Zone
disrupted
![Page 45: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/45.jpg)
![Page 46: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/46.jpg)
interestingly …
![Page 47: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/47.jpg)
“wisdom of crowds”
1785 Marquis de Condorcet
• collective decisions of large groups of people
• a group of error-prone decision-makers can be surprisingly good at picking the best choice
• for a thumbs-up or thumbs-down decision, each person's chance of picking the right answer needs to be > 50%
• the odds that most of them will pick the right answer are greater than the odds that any one of them will pick it on their own
• performance gets better as the group size grows
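The Condorcet argument can be checked numerically. The sketch below is not from the talk: it assumes independent voters who all share the same probability of being right, and the function name `majority_correct_probability` is made up for illustration.

```python
from math import comb


def majority_correct_probability(n_voters: int, p_correct: float) -> float:
    """Probability that a strict majority of n independent voters is right.

    Assumes every voter is correct with the same probability p_correct
    and that votes are independent (the classic jury-theorem setting).
    """
    if n_voters % 2 == 0:
        raise ValueError("use an odd number of voters to avoid ties")
    needed = n_voters // 2 + 1  # smallest strict majority
    return sum(
        comb(n_voters, k) * p_correct**k * (1 - p_correct) ** (n_voters - k)
        for k in range(needed, n_voters + 1)
    )


if __name__ == "__main__":
    # Each voter is only slightly better than a coin flip.
    for n in (1, 11, 101):
        print(n, round(majority_correct_probability(n, 0.6), 3))
```

With individual accuracy above 0.5 the majority's accuracy rises with group size; below 0.5 the same formula shows the majority getting worse, which is why the "> 50%" condition matters.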
![Page 48: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/48.jpg)
“wisdom of crowds”
1906 Sir Francis Galton
• asked 787 people to guess the weight of an ox
• none got the right answer
• their collective guess was almost perfect
![Page 49: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/49.jpg)
WWII Math Rosies
1942: Ballistics calculations and flight trajectories
![Page 50: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/50.jpg)
NASA’s Computer Room
transcribe raw flight data from celluloid film & oscillograph paper
![Page 51: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/51.jpg)
can we harness it?
![Page 52: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/52.jpg)
CrowdTruth
http://crowdtruth.org/
![Page 53: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/53.jpg)
CrowdTruth
Three basic causes of disagreement: workers, examples, target semantics
Disagreement is signal, not noise.
It is indicative of the variation in human semantic interpretation
It can indicate ambiguity, vagueness, similarity, over-generality, etc., as well as quality
CrowdTruth: Machine-human computation framework for harnessing disagreement in gathering annotated data (2014)
O. Inel, A. Dumitrache, L. Aroyo, C. Welty
![Page 54: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/54.jpg)
one truth: multiple truths
all examples are created equal: each example is unique
disagreement bad: disagreement is good
experts rule: crowd rules
one is enough: the more the better
detailed explanations help: keep it simple stupid
once done, forever valid: maintenance is necessary
“Truth is a Lie: 7 Myths about Human Annotation”, AI Magazine 2014, L. Aroyo, C. Welty
![Page 55: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/55.jpg)
changes needed: video archive enrichment
improve support for fragment search
time-based annotations
bridging vocabulary gap between searcher & cataloguer
![Page 56: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/56.jpg)
crowdsourcing video tagging
two video tagging pilots
![Page 57: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/57.jpg)
@waisda http://waisda.nl
engage crowds through continuous gaming
![Page 58: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/58.jpg)
“On the Role of User-Generated Metadata in A/V Collections”, Riste Gligorov et al. KCAP2011
![Page 59: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/59.jpg)
time-based
bernhard
just “tags”
![Page 60: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/60.jpg)
objects (57%)
westminster abbey, abbey, priester (priest), geestelijken (clergy), hek (fence), paarden (horses), tocht (procession), aankomst (arrival), koets (carriage), kroning (coronation), mensenmassa (crowd), parade, kroon (crown), regen (rain)
![Page 61: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/61.jpg)
persons (31%)
bernhard
juliana
objects (57%)
![Page 62: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/62.jpg)
user vocabulary: 8% in professional vocabulary, 23% in Dutch lexicon, 89% found on Google
locations (7%): engeland (England)
persons (31%)
objects (57%)
![Page 63: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/63.jpg)
locations (7%)
describe mainly short segments
often not very specific
don’t describe programmes as a whole
![Page 64: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/64.jpg)
crowdsourcing medical relation extraction
diversity of opinions
independent perspectives
multitude of contexts
we exposed a richer set of possibilities that help in identifying, processing & understanding context
![Page 65: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/65.jpg)
Does this sentence express TREATS(Antibiotics, Typhus)?
ANTIBIOTICS are the first line treatment for indications of TYPHUS. 95%
Patients with TYPHUS who were given ANTIBIOTICS exhibited several side-effects. 75%
With ANTIBIOTICS in short supply, DDT was used during World War II to control the insect vectors of TYPHUS. 50%
The crowd results capture the natural ambiguity
![Page 66: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/66.jpg)
Experts Hallucinate
What is the relation between the highlighted terms?
He was the first physician to identify the relationship between HEMOPHILIA and HEMOPHILIC ARTHROPATHY.
experts: cause
crowd: no relation
Crowd reads text literally - provides better examples to machine
![Page 67: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/67.jpg)
Unclear relationship between the two arguments reflected in the disagreement
Medical Relation Extraction
![Page 68: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/68.jpg)
Clearly expressed relation between the two arguments reflected in the agreement
Medical Relation Extraction
![Page 69: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/69.jpg)
Unclear relationship between the two arguments reflected in the disagreement
Medical Relation Extraction
![Page 70: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/70.jpg)
![Page 71: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/71.jpg)
Learning Curves
(crowd with pos./neg. threshold at 0.5)
above 400 sent.: crowd consistently over baseline & single
above 600 sent.: crowd out-performs experts
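The "pos./neg. threshold at 0.5" turns each sentence's graded crowd score into a binary training label. A minimal sketch, assuming scores lie in [0, 1] and that a score exactly at the threshold counts as positive (the slides do not say); the sentence ids, scores, and the helper `to_binary_labels` are hypothetical.

```python
def to_binary_labels(scores, threshold=0.5):
    """Map each sentence's crowd score in [0, 1] to a positive/negative label.

    Assumption: a score equal to the threshold is treated as positive.
    """
    return {sentence_id: score >= threshold for sentence_id, score in scores.items()}


# Hypothetical sentence ids and scores, for illustration only.
example_scores = {"s1": 0.95, "s2": 0.75, "s3": 0.30}
labels = to_binary_labels(example_scores)

if __name__ == "__main__":
    for sentence_id, is_positive in labels.items():
        print(sentence_id, "positive" if is_positive else "negative")
```

Thresholding discards the graded score, so the cut-off is a tuning choice and not a fixed property of the data.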
![Page 72: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/72.jpg)
Learning Curves Extended
(crowd with pos./neg. threshold at 0.5)
crowd consistently performs better than baseline
![Page 73: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/73.jpg)
# of Workers: Impact on Sentence-Relation Score
![Page 74: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/74.jpg)
Training a Relation Extraction Classifier

| Annotation source | F1 | Cost per sentence |
|---|---|---|
| CrowdTruth | 0.642 | $0.66 |
| Expert Annotator | 0.638 | $2.00 |
| Single Annotator | 0.492 | $0.08 |
“wisdom of the crowd” provides training data that is at least as good as, if not better than, that of experts,
but only with a proper analytic framework for harnessing disagreement from the crowd
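To make the trade-off in the table concrete, the short sketch below compares each annotation source against CrowdTruth using only the F1 and cost-per-sentence figures from the slide; the variable names are illustrative.

```python
# F1 and cost per sentence (USD) as reported on the slide.
results = {
    "CrowdTruth": {"f1": 0.642, "cost": 0.66},
    "Expert Annotator": {"f1": 0.638, "cost": 2.00},
    "Single Annotator": {"f1": 0.492, "cost": 0.08},
}

baseline = results["CrowdTruth"]

# Relative cost and F1 difference of each source versus CrowdTruth.
comparison = {
    name: {
        "cost_ratio": r["cost"] / baseline["cost"],
        "f1_delta": r["f1"] - baseline["f1"],
    }
    for name, r in results.items()
}

if __name__ == "__main__":
    for name, c in comparison.items():
        print(f"{name}: {c['cost_ratio']:.2f}x cost, F1 difference {c['f1_delta']:+.3f}")
```

On these figures the expert annotations cost roughly three times as much per sentence for a marginally lower F1, while a single crowd annotator is far cheaper but clearly less accurate.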
![Page 75: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/75.jpg)
map music to moods
Goal: tag songs with emotional clusters
Comfort Zone Solution: people assign the prevalent mood of a song
![Page 77: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/77.jpg)
Is this song …?
Passionate, Rousing, Confident, Boisterous, Rowdy
Literate, Poignant, Wistful, Bittersweet, Autumnal, Brooding
Rollicking, Cheerful, Fun, Sweet, Amiable, Good-natured
Humorous, Silly, Campy, Whimsical, Witty, Wry
Aggressive, Fiery, Tense, Anxious, Intense, Volatile
![Page 78: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/78.jpg)
If “One Truth” & “No Disagreement”
Worker Mood-C1 Mood-C2 Mood-C3 Mood-C4 Mood-C5
W1 1
W2 1
W3 1
W4 1
W5 1
W6 1
W7
W8
W9 1
W10 1
Totals 1 3 1 2 1
![Page 79: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/79.jpg)
Worker Mood-C1 Mood-C2 Mood-C3 Mood-C4 Mood-C5 Other
W1 1 1 1
W2 1 1 1
W3 1 1 1
W4 1 1
W5 1 1
W6 1 1 1
W7 1 1 1
W8 1 1 1
W9 1 1
W10 1 1 1 1 1
Totals 3 5 6 5 2 8
If “Many Truths” & “Disagreement”
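One way to read the "Totals" row is as a single vote vector for the song, scored against each mood label by cosine similarity. The sketch below loosely follows the cosine-based scoring described in the CrowdTruth papers; treat that mapping, and the helper name `label_scores`, as assumptions and not the project's actual implementation. Only the column totals are used, since the per-worker columns are not recoverable from the transcript.

```python
from math import sqrt

# Column totals from the "Many Truths" table above.
LABELS = ["Mood-C1", "Mood-C2", "Mood-C3", "Mood-C4", "Mood-C5", "Other"]
TOTALS = [3, 5, 6, 5, 2, 8]


def label_scores(totals):
    """Cosine between the aggregated vote vector and each label's unit vector.

    For a unit vector on label i this reduces to totals[i] / ||totals||.
    """
    norm = sqrt(sum(t * t for t in totals))
    if norm == 0:
        return [0.0] * len(totals)
    return [t / norm for t in totals]


scores = dict(zip(LABELS, label_scores(TOTALS)))

if __name__ == "__main__":
    for label, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{label}: {score:.2f}")
```

No label dominates here: the scores are spread across several clusters, which is the kind of disagreement the talk treats as information about the song and the categorisation.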
![Page 80: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/80.jpg)
Disagreement as Signal
Worker Mood-C1 Mood-C2 Mood-C3 Mood-C4 Mood-C5 Other
W10 1 1 1 1 1
Totals 3 5 6 5 2 8
can indicate alternative interpretations
can indicate ambiguity in the categorisation
can indicate low quality workers
![Page 81: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/81.jpg)
so …
![Page 82: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/82.jpg)
getting comfortable again
![Page 83: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/83.jpg)
Take Home Message
People first, experts second
True and False is not enough: there is diversity in human interpretation
CrowdTruth introduces a spatial representation of meaning that harnesses disagreement
With CrowdTruth untrained workers can be just as reliable as highly trained experts
![Page 84: My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone](https://reader031.vdocuments.us/reader031/viewer/2022021815/5a6479027f8b9a8e568b45fb/html5/thumbnails/84.jpg)
http://data.crowdtruth.org/