Crowdsourcing Linked Data Quality Assessment



TRANSCRIPT

Page 1: Crowdsourcing Linked Data Quality Assessment

KIT – University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association www.kit.edu

@ISWC2013

Crowdsourcing Linked Data Quality Assessment
Maribel Acosta, Amrapali Zaveri, Elena Simperl, Dimitris Kontokostas, Sören Auer and Jens Lehmann

Page 2: Crowdsourcing Linked Data Quality Assessment


Motivation


Varying quality of Linked Data sources

Some quality issues require interpretation that humans can perform easily

Solution: Include human verification in the process of LD quality assessment

Direct application: detecting patterns in the errors may help identify (and correct) flaws in the extraction mechanisms

dbpedia:Dave_Dobbyn dbprop:dateOfBirth “3”.
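A quality problem of this kind can already be surfaced automatically before any crowdsourcing. A minimal sketch, assuming the public DBpedia SPARQL endpoint and the SPARQLWrapper library (the endpoint, prefixes and limit are illustrative), that lists dbprop:dateOfBirth values which are not valid xsd:date literals:

```python
# Minimal sketch: find suspicious dbprop:dateOfBirth values in DBpedia.
# Endpoint URL, query and limit are illustrative assumptions.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
PREFIX dbprop: <http://dbpedia.org/property/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?s ?birth WHERE {
  ?s dbprop:dateOfBirth ?birth .
  FILTER (isLiteral(?birth) && datatype(?birth) != xsd:date)
}
LIMIT 100
""")
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    # Values such as "3" end up here and are candidates for human verification.
    print(row["s"]["value"], row["birth"]["value"])
```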

Page 3: Crowdsourcing Linked Data Quality Assessment


Research questions

RQ1: Is it possible to detect quality issues in LD data sets via crowdsourcing mechanisms?

RQ2: What type of crowd is most suitable for each type of quality issue?

RQ3: Which types of errors are made by lay users and experts when assessing RDF triples?


Page 4: Crowdsourcing Linked Data Quality Assessment


Related work


Crowdsourcing & Linked Data: ZenCrowd (entity resolution), CrowdMAP (ontology alignment), games with a purpose (GWAP) for LD, assessing LD mappings (automatic).

Web of Data quality assessment: quality characteristics of LD data sources (semi-automatic); DBpedia; WIQA, Sieve (manual).

Our work: at the intersection of crowdsourcing & Linked Data and Web of Data quality assessment.

Page 5: Crowdsourcing Linked Data Quality Assessment


OUR APPROACH


Page 6: Crowdsourcing Linked Data Quality Assessment


Methodology

Steps to implement the methodology:

1. Selecting LD quality issues to crowdsource
2. Selecting the appropriate crowdsourcing approaches
3. Designing and generating the interfaces to present the data to the crowd

Workflow: triples {s p o .} from the dataset pass through steps 1-3; each triple is classified as correct, or as incorrect together with the quality issue it exhibits.
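As a rough illustration of this workflow's outcome, here is a minimal sketch of how an assessed triple could be represented; the class and field names are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the assessment outcome per triple (illustrative names only).
from dataclasses import dataclass
from typing import Optional

@dataclass
class TripleAssessment:
    subject: str
    predicate: str
    obj: str
    correct: bool                        # verdict after the Find and Verify stages
    quality_issue: Optional[str] = None  # set when correct is False, e.g. "incorrect object"

example = TripleAssessment(
    subject="dbpedia:Dave_Dobbyn",
    predicate="dbprop:dateOfBirth",
    obj='"3"',
    correct=False,
    quality_issue="incorrect object",
)
print(example)
```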

Page 7: Crowdsourcing Linked Data Quality Assessment


Selecting LD quality issues to crowdsource (methodology step 1)

Three categories of quality problems occur in DBpedia [Zaveri2013] and can be crowdsourced:

Incorrect object. Example: dbpedia:Dave_Dobbyn dbprop:dateOfBirth “3” .

Incorrect data type or language tag. Example: dbpedia:Torishima_Izu_Islands foaf:name “鳥島”@en .

Incorrect link to external Web pages. Example: dbpedia:John-Two-Hawks dbpedia-owl:wikiPageExternalLink <http://cedarlakedvd.com/> .
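Some instances of the second category can be pre-filtered automatically before they are sent to the crowd. Below is a minimal sketch, assuming a simple script-based heuristic (the regular expression and function name are illustrative), that flags @en literals containing CJK characters, as in the foaf:name example above.

```python
import re

# Rough heuristic: an @en-tagged literal consisting of CJK characters
# (as in foaf:name "鳥島"@en) is a candidate "incorrect language tag" issue.
CJK = re.compile(r"[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff]")

def suspicious_english_tag(literal: str, lang: str) -> bool:
    """Return True if a literal tagged as English looks like it is not English."""
    return lang == "en" and bool(CJK.search(literal))

print(suspicious_english_tag("鳥島", "en"))        # True  -> flag for human verification
print(suspicious_english_tag("Tori-shima", "en"))  # False
```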

Page 8: Crowdsourcing Linked Data Quality Assessment


Selecting appropriate crowdsourcing approaches (methodology step 2)

Find stage: contest among LD experts (difficult task, rewarded with a final prize), run with TripleCheckMate [Kontokostas2013].

Verify stage: microtasks for crowd workers (easy task, rewarded with micropayments), run on MTurk (http://mturk.com).

Find-Verify pattern adapted from [Bernstein2010].
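A minimal sketch of this Find-Verify split follows; the function names, stub data and the fixed worker answer are illustrative assumptions, since the actual stages ran through TripleCheckMate and MTurk.

```python
# Minimal sketch of the Find-Verify pattern (illustrative data and names).

def find_stage(triples, expert_flags):
    """Find: LD experts (via the contest) flag triples they consider problematic."""
    return [t for t in triples if t in expert_flags]

def verify_stage(flagged, ask_worker, assignments=5):
    """Verify: each flagged triple is submitted as a microtask to several workers."""
    return {t: [ask_worker(t) for _ in range(assignments)] for t in flagged}

# Illustrative run with a stub worker that always answers "incorrect".
triples = [("dbpedia:Dave_Dobbyn", "dbprop:dateOfBirth", '"3"')]
answers = verify_stage(find_stage(triples, set(triples)), lambda t: "incorrect")
print(answers)
```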

Page 9: Crowdsourcing Linked Data Quality Assessment


Presenting the data to the crowd (methodology step 3)

• Selection of foaf:name or rdfs:label to extract human-readable descriptions

• Values extracted automatically from Wikipedia infoboxes

• Link to the Wikipedia article via foaf:isPrimaryTopicOf

• Preview of external pages rendered in an HTML iframe

Microtask interfaces (MTurk tasks), one per quality issue: incorrect object, incorrect data type or language tag, incorrect outlink.
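As a rough sketch of how such values could be pulled for a task interface (assuming the rdflib library and DBpedia's /data/ RDF export URLs; this is not the generator used for the actual MTurk tasks):

```python
# Minimal sketch: fetch the values shown in a microtask for one DBpedia resource.
# The /data/<name>.rdf URL pattern and the chosen resource are assumptions.
from rdflib import Graph, Namespace
from rdflib.namespace import RDFS

FOAF = Namespace("http://xmlns.com/foaf/0.1/")
DBR = Namespace("http://dbpedia.org/resource/")

g = Graph()
g.parse("http://dbpedia.org/data/Dave_Dobbyn.rdf")

resource = DBR["Dave_Dobbyn"]
# Prefer foaf:name and fall back to rdfs:label for the human-readable description.
label = g.value(resource, FOAF.name) or g.value(resource, RDFS.label)
# Link back to the Wikipedia article the resource was extracted from.
wikipedia = g.value(resource, FOAF.isPrimaryTopicOf)
print(label, wikipedia)
```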

Page 10: Crowdsourcing Linked Data Quality Assessment


EXPERIMENTAL STUDY


Page 11: Crowdsourcing Linked Data Quality Assessment


Experimental design

• Crowdsourcing approaches:
  • Find stage: contest with LD experts
  • Verify stage: microtasks (5 assignments per triple)

• Creation of a gold standard:
  • Two of the authors of this paper (MA, AZ) generated the gold standard for all the triples obtained from the contest
  • Each author independently evaluated the triples
  • Conflicts were resolved via mutual agreement

• Metric: precision
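Precision here is the fraction of verdicts that agree with the gold standard. A minimal sketch of the computation, with majority voting over the 5 assignments; the triple ids and votes are made-up illustrative data.

```python
from collections import Counter

def majority_vote(votes, n=5):
    """Aggregate the n microtask assignments collected for one triple."""
    return Counter(votes[:n]).most_common(1)[0][0]

def precision(verdicts, gold):
    """Fraction of evaluated triples whose verdict matches the gold standard."""
    return sum(1 for t, v in verdicts.items() if gold[t] == v) / len(verdicts)

# Illustrative data: worker votes per triple id and the manually built gold standard.
votes = {"t1": ["incorrect"] * 4 + ["correct"], "t2": ["correct"] * 3 + ["incorrect"] * 2}
gold = {"t1": "incorrect", "t2": "incorrect"}

verdicts = {t: majority_vote(v) for t, v in votes.items()}
print(precision(verdicts, gold))  # 0.5
```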


Page 12: Crowdsourcing Linked Data Quality Assessment


Overall results

                                 LD experts              Microtask workers
Number of distinct participants  50                      80
Total time                       3 weeks (predefined)    4 days
Total triples evaluated          1,512                   1,073
Total cost                       ~US$ 400 (predefined)   ~US$ 43


Page 13: Crowdsourcing Linked Data Quality Assessment


Precision results: Incorrect object task

MTurk workers can be used to reduce the error rates of LD experts for the Find stage:

• 117 DBpedia triples had predicates related to dates with incorrect/incomplete values, e.g., ”2005 Six Nations Championship” Date 12 .

• 52 DBpedia triples had erroneous values taken over from the source, e.g., ”English (programming language)” Influenced by ? .

• Experts classified all these triples as incorrect; workers compared the values against Wikipedia and successfully classified these triples as “correct”.


Triples compared   LD experts   MTurk (majority voting, n=5)
509                0.7151       0.8977

Page 14: Crowdsourcing Linked Data Quality Assessment


Precision results: Incorrect data type task


[Figure: number of triples per data type (Date, English, Millimetre, Nanometre, Number, Number with decimals, Second, Volt, Year, Not specified/URI), broken down into expert true/false positives and crowd true/false positives.]

Triples compared   LD experts   MTurk (majority voting, n=5)
341                0.8270       0.4752
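Many data-type problems can also be pre-checked mechanically before crowdsourcing. A minimal sketch, assuming a plain-Python heuristic (not the study's tooling), that tests whether a lexical value is a valid xsd:date:

```python
from datetime import date

def valid_xsd_date(lexical: str) -> bool:
    """Rough check: xsd:date uses the YYYY-MM-DD form (timezone offsets ignored here)."""
    try:
        date.fromisoformat(lexical)
        return True
    except ValueError:
        return False

print(valid_xsd_date("1957-01-03"))  # True
print(valid_xsd_date("3"))           # False -> candidate for the crowd, as in dateOfBirth "3"
```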

Page 15: Crowdsourcing Linked Data Quality Assessment


Precision results: Incorrect link task

• We analyzed the 189 misclassifications by the experts: 50% were Freebase links, 39% Wikipedia images, and 11% other external links.

• The 6% of misclassifications by the workers correspond to pages in a language other than English.

Triples compared   Baseline   LD experts   MTurk (majority voting, n=5)
223                0.2598     0.1525       0.9412
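A trivial pre-check can weed out dead links before crowdsourcing, while judging whether a reachable page is actually related to the resource is left to the workers (who see it in the iframe preview). A minimal sketch, with an assumed helper name and the example URL from the deck:

```python
import urllib.request

def link_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if the external page answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        return False

# Output depends on the network and on the target site being up.
print(link_reachable("http://cedarlakedvd.com/"))
```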

Page 16: Crowdsourcing Linked Data Quality Assessment


Final discussion

RQ1: Is it possible to detect quality issues in LD data sets via crowdsourcing mechanisms?

Both forms of crowdsourcing can be applied to detect certain LD quality issues

RQ2: What type of crowd is most suitable for each type of quality issue?

The effort of LD experts is best applied to tasks demanding domain-specific skills. The MTurk crowd was exceptionally good at performing data comparisons

RQ3: Which types of errors are made by lay users and experts?

Lay users do not have the skills to solve domain-specific tasks, while experts' performance is very low on tasks that demand extra effort (e.g., checking an external page)


Page 17: Crowdsourcing Linked Data Quality Assessment


CONCLUSIONS & FUTURE WORK


Page 18: Crowdsourcing Linked Data Quality Assessment


Conclusions & Future Work

A crowdsourcing methodology for LD quality assessment:

Find stage: LD experts

Verify stage: MTurk workers

Crowdsourcing approaches are feasible for detecting the studied quality issues

Application: detecting patterns in errors to fix the extraction mechanisms

Future Work

Conducting new experiments (other quality issues and domains)

Integration of the crowd into curation processes and tools


Page 19: Crowdsourcing Linked Data Quality Assessment


References & Acknowledgements

[Bernstein2010] M. S. Bernstein, G. Little, R. C. Miller, B. Hartmann, M. S. Ackerman, D. R. Karger, D. Crowell, and K. Panovich. Soylent: a word processor with a crowd inside. In Proceedings of the 23rd Annual ACM Symposium on User Interface Software and Technology (UIST '10), pages 313–322, New York, NY, USA, 2010. ACM.

[Kontokostas2013] D. Kontokostas, A. Zaveri, S. Auer, and J. Lehmann. TripleCheckMate: A Tool for Crowdsourcing the Quality Assessment of Linked Data. In Knowledge Engineering and the Semantic Web, 2013.

[Zaveri2013] A. Zaveri, A. Rula, A. Maurino, R. Pietrobon, J. Lehmann, and S. Auer. Quality assessment methodologies for Linked Open Data. Under review, http://www.semantic-web-journal.net/content/quality-assessment-methodologies-linked-open-data.

Page 20: Crowdsourcing Linked Data Quality Assessment


QUESTIONS?


Summary of the approach: Find stage as a contest for LD experts (difficult task, final prize) using TripleCheckMate; Verify stage as microtasks for MTurk workers (easy task, micropayments), with one MTurk task type per quality issue (incorrect object, incorrect data type, incorrect outlink).

Results (precision):

                          Object values   Data types   Interlinks
Linked Data experts       0.7151          0.8270       0.1525
MTurk (majority voting)   0.8977          0.4752       0.9412