Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data
Crowdsourcing is a significant source of data that needs to be analyzed and interpreted. These tasks influence the quality of the output as well as the efficiency of the process. Visualization has proved to be an effective way of dealing with large amounts of data. In this paper we propose a visualization analytics model in the context of the CrowdTruth framework and CrowdTruth metrics for optimizing the crowdsourcing process and improving its data quality. The requirements for dynamic, scalable and interactive visualizations were extracted from the literature and from interviews with users of the framework.
![Page 1: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data](https://reader036.vdocuments.us/reader036/viewer/2022062513/556ac72ed8b42acd348b4d99/html5/thumbnails/1.jpg)
By Tatiana Cristea
Supervised by Lora Aroyo (VU) & Robert-Jan Sips (IBM)
![Page 2: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data](https://reader036.vdocuments.us/reader036/viewer/2022062513/556ac72ed8b42acd348b4d99/html5/thumbnails/2.jpg)
Visualizations for quality assessment of crowdsourced data
Noisy crowdsourced data
Quality data
![Page 3: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data](https://reader036.vdocuments.us/reader036/viewer/2022062513/556ac72ed8b42acd348b4d99/html5/thumbnails/3.jpg)
Current practices: based on the consensus of workers
CrowdTruth metrics: consider disagreement informative
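To make the contrast concrete, here is a minimal Python sketch. The labels are borrowed from the example task in these slides, but the votes and the function names `majority_label` and `label_distribution` are illustrative assumptions, not part of CrowdTruth. A consensus approach keeps a single label per unit, while a disagreement-aware approach keeps the share of workers behind every label.

```python
from collections import Counter

# Hypothetical answers from five workers for one image (one content unit).
votes = ["Balloon", "Balloon", "Ghost", "Balloon", "Person"]

def majority_label(votes):
    """Consensus view: keep only the most frequent label."""
    return Counter(votes).most_common(1)[0][0]

def label_distribution(votes):
    """Disagreement-aware view: keep the share of workers behind every label."""
    counts = Counter(votes)
    total = len(votes)
    return {label: count / total for label, count in counts.items()}

print(majority_label(votes))
print(label_distribution(votes))
```

The distribution still shows that two of the five workers saw something else in the image, which is exactly the signal a majority vote throws away.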
![Page 4: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data](https://reader036.vdocuments.us/reader036/viewer/2022062513/556ac72ed8b42acd348b4d99/html5/thumbnails/4.jpg)
Select from the list the objects depicted in the image:
Balloon Flower Human Car Ghost Person
Can you identify the low quality worker(s)?
Balloon Flower Human Car Ghost Person
Balloon Flower Human Car Ghost Person
Worker 1 Worker 2 Worker 3
Unclear image (content unit)
![Page 5: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data](https://reader036.vdocuments.us/reader036/viewer/2022062513/556ac72ed8b42acd348b4d99/html5/thumbnails/5.jpg)
Select from the list the objects depicted in the image:
Can you identify the low quality worker(s)?
Balloon Flower Human Car Ghost Person
Worker 1 Balloon Flower Human Car Ghost Person
Worker 2 Balloon Flower Human Car Ghost Person
Worker 3
Not clearly separable answers
![Page 6: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data](https://reader036.vdocuments.us/reader036/viewer/2022062513/556ac72ed8b42acd348b4d99/html5/thumbnails/6.jpg)
Select from the list the objects depicted in the image:
Can you identify the low quality workers?
Balloon Flower Human Car Ghost Person
Worker 1 Balloon Flower Human Car Ghost Person
Worker 2 Balloon Flower Human Car Ghost Person
Worker 3
Low quality workers
![Page 7: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data](https://reader036.vdocuments.us/reader036/viewer/2022062513/556ac72ed8b42acd348b4d99/html5/thumbnails/7.jpg)
How good is the unit for the specific task?
How well did the worker understand the task?
Are the annotation options clear and separable?
Unit, Annotation, Worker
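One way to approach the three questions above is to represent each worker's answers on a unit as a binary vector over the annotation options and compare vectors. The sketch below is a simplified illustration under stated assumptions: cosine similarity is assumed as the agreement measure, the selections are hypothetical (the slides do not record which boxes each worker ticked), and the function names are invented for this example. It is not the CrowdTruth reference implementation.

```python
import math

OPTIONS = ["Balloon", "Flower", "Human", "Car", "Ghost", "Person"]

# Hypothetical selections for one image (1 = option ticked by the worker).
answers = {
    "worker_1": [1, 0, 0, 0, 1, 0],
    "worker_2": [1, 0, 0, 0, 0, 1],
    "worker_3": [0, 1, 0, 1, 0, 0],
}

def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def unit_vector(answers):
    """Sum of all worker vectors: how many workers chose each option."""
    return [sum(col) for col in zip(*answers.values())]

def worker_agreement(answers, worker):
    """Cosine between one worker's vector and the sum of all other workers."""
    others = [v for w, v in answers.items() if w != worker]
    rest = [sum(col) for col in zip(*others)]
    return cosine(answers[worker], rest)

def annotation_scores(answers):
    """Cosine between the unit vector and each single-option vector."""
    total = unit_vector(answers)
    scores = {}
    for i, option in enumerate(OPTIONS):
        one_hot = [1 if j == i else 0 for j in range(len(OPTIONS))]
        scores[option] = cosine(total, one_hot)
    return scores

for worker in answers:
    print(worker, round(worker_agreement(answers, worker), 2))
scores = annotation_scores(answers)
print(scores)
# The highest annotation score is used here as a rough unit clarity value.
print("unit clarity:", round(max(scores.values()), 2))
```

In this toy data `worker_3` shares no option with the other workers and therefore gets the lowest agreement. With real data, low agreement could equally point to an unclear image or to overlapping options, which is why the unit, worker and annotation scores are read together.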
![Page 8: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data](https://reader036.vdocuments.us/reader036/viewer/2022062513/556ac72ed8b42acd348b4d99/html5/thumbnails/8.jpg)
[Diagram: JOB 1, JOB 2, ..., JOB N, each with its own Unit, Annotation and Worker components; the Units, Workers and Annotations of all jobs are also shown grouped together]
![Page 9: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data](https://reader036.vdocuments.us/reader036/viewer/2022062513/556ac72ed8b42acd348b4d99/html5/thumbnails/9.jpg)
Visualization approach for quality assessment of crowdsourced data:
a) at aggregate level
b) at a specific level
c) and in the context of their interdependencies
![Page 10: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data](https://reader036.vdocuments.us/reader036/viewer/2022062513/556ac72ed8b42acd348b4d99/html5/thumbnails/10.jpg)
![Page 11: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data](https://reader036.vdocuments.us/reader036/viewer/2022062513/556ac72ed8b42acd348b4d99/html5/thumbnails/11.jpg)
Extracted through interviews
Visualization of properties, statistics and metrics of:
a single job/unit/worker
a collection of jobs/units/workers
Functional requirements:
Filtering, sorting
Support for detection of outliers
Visualization of connected workers, content units and jobs
Support of comparative analysis
Support for navigation between connected elements, etc.
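The first functional requirements (filtering, sorting, outlier detection) can be sketched on plain records before any chart is drawn. Everything below is a hypothetical illustration: the record fields, the scores and the outlier rule (a score more than `k` standard deviations below the mean) are assumptions made for this example, not the rule used by the framework.

```python
from statistics import mean, pstdev

# Hypothetical per-worker records; field names and values are illustrative.
workers = [
    {"id": "w1", "job": "job_1", "agreement": 0.82},
    {"id": "w2", "job": "job_1", "agreement": 0.78},
    {"id": "w3", "job": "job_2", "agreement": 0.15},
    {"id": "w4", "job": "job_2", "agreement": 0.74},
    {"id": "w5", "job": "job_1", "agreement": 0.80},
]

def filter_by_job(rows, job):
    """Filtering: keep only the workers who took part in one job."""
    return [r for r in rows if r["job"] == job]

def sort_by_agreement(rows, descending=True):
    """Sorting: order workers by their agreement score."""
    return sorted(rows, key=lambda r: r["agreement"], reverse=descending)

def low_outliers(rows, k=1.0):
    """Outlier detection (assumed rule): score below mean - k * std deviation."""
    scores = [r["agreement"] for r in rows]
    threshold = mean(scores) - k * pstdev(scores)
    return [r for r in rows if r["agreement"] < threshold]

print([r["id"] for r in filter_by_job(workers, "job_1")])
print([r["id"] for r in sort_by_agreement(workers)])
print([r["id"] for r in low_outliers(workers)])
```

Keeping these operations as small functions over records makes it straightforward to apply the same filter or sort to units and jobs as well as workers.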
![Page 12: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data](https://reader036.vdocuments.us/reader036/viewer/2022062513/556ac72ed8b42acd348b4d99/html5/thumbnails/12.jpg)
DEMO TOUR
![Page 13: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data](https://reader036.vdocuments.us/reader036/viewer/2022062513/556ac72ed8b42acd348b4d99/html5/thumbnails/13.jpg)
We evaluated the design with 9 people
Different levels of experience with crowdsourcing tasks
![Page 14: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data](https://reader036.vdocuments.us/reader036/viewer/2022062513/556ac72ed8b42acd348b4d99/html5/thumbnails/14.jpg)
Useful in:
the assessment of quality
deep analysis of the data
But…
![Page 15: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data](https://reader036.vdocuments.us/reader036/viewer/2022062513/556ac72ed8b42acd348b4d99/html5/thumbnails/15.jpg)
The amount of information was a (little) bit overwhelming…
![Page 16: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data](https://reader036.vdocuments.us/reader036/viewer/2022062513/556ac72ed8b42acd348b4d99/html5/thumbnails/16.jpg)
The interactions are great!
… if you know about them
![Page 17: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data](https://reader036.vdocuments.us/reader036/viewer/2022062513/556ac72ed8b42acd348b4d99/html5/thumbnails/17.jpg)
The time dimension is not always present…
![Page 18: Visualization of Disagreement-based Quality Metrics of Crowdsourcing Data](https://reader036.vdocuments.us/reader036/viewer/2022062513/556ac72ed8b42acd348b4d99/html5/thumbnails/18.jpg)
Create user profiles
Decouple the visualization component and provide it as a separate plugin
Add the time dimension to the visualizations