Post on 28-Oct-2019

Carsten Keßler, Institute for Geoinformatics, University of Münster | soon: Hunter College, CUNY

http://carsten.io | @carstenkessler

Collecting a Ground Truth Dataset for OpenStreetMap

Background

‣ Study: Can we assess a feature’s quality based on its trustworthiness?

‣ Problem: What is the gold standard for a feature’s quality?

‣ Thursday, 10:30 – Session 4.2

Known problem

‣ Haklay (2010): Ordnance Survey
‣ Girres & Touya (2010): Institut Géographique National
‣ Mooney et al. (2010): Ordnance Survey Ireland (and others)
‣ Zielstra and Zipf (2010): Tele Atlas
‣ …

Comparing apples and oranges?

‣ Focus on spatial accuracy
‣ What about:
  ‣ thematic accuracy, consistency, and completeness
  ‣ timeliness
  ‣ fitness for purpose
  → harder, but also more interesting
‣ What about data that do not even exist anywhere else?
‣ Should other datasets set the benchmark for the OSM data?

2 Proposals

1. Combining existing datasets

‣ Lots of services explicitly collect information about places:
  ‣ Facebook, Foursquare, Yelp, …
    Wikipedia/Wikidata, LOD Cloud, …
‣ Different thematic angles → more complete
‣ Complement authoritative/commercial reference data (scope)

Limitations and challenges

‣ APIs and terms of use limit how far these sources can be exploited
‣ Only useful for specific lookups
‣ Different APIs
‣ Powerful and reliable identity reasoning required
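
To make the identity-reasoning point concrete, here is a minimal sketch of deciding whether two place records from different sources describe the same real-world feature. It is an illustration only, not the method used in the study: the record layout (`id`, `name`, `lat`, `lon`), the sample records, and both thresholds are assumptions, and name similarity from Python's standard `difflib` stands in for whatever matching a real system would need.

```python
import math
from difflib import SequenceMatcher


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 coordinates."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.casefold().strip(), b.casefold().strip()).ratio()


def same_place(p, q, max_dist_m=100.0, min_name_sim=0.8):
    """Treat two records as the same place if they are close AND similarly named.

    Both thresholds are illustrative assumptions, not calibrated values.
    """
    close = haversine_m(p["lat"], p["lon"], q["lat"], q["lon"]) <= max_dist_m
    similar = name_similarity(p["name"], q["name"]) >= min_name_sim
    return close and similar


def match_places(source_a, source_b, **thresholds):
    """Return (id_a, id_b) pairs for all records judged to be the same place."""
    return [(p["id"], q["id"])
            for p in source_a
            for q in source_b
            if same_place(p, q, **thresholds)]


# Invented sample records; ids and coordinates are hypothetical.
osm_places = [
    {"id": "osm/1", "name": "Café Extrablatt", "lat": 51.9625, "lon": 7.6256},
]
other_places = [
    {"id": "x/9", "name": "Cafe Extrablatt", "lat": 51.9626, "lon": 7.6257},
    {"id": "x/10", "name": "Stadtbücherei", "lat": 51.9630, "lon": 7.6290},
]
matches = match_places(osm_places, other_places)
```

Even this toy version shows why the reasoning has to be "powerful and reliable": a fixed distance and string threshold will miss renamed or relocated places and will wrongly merge similarly named neighbours.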

2. Crowdsourcing

‣ … in the spirit of OSM
‣ Ad-hoc collection of required reference data
‣ Potential to use the data not just for comparison
‣ Problem: Motivation

Conclusions

‣ OSM should define its own quality requirements
‣ Reference data: External sources + crowdsourcing + X
‣ Lots of research opportunities:
  ‣ Identity reasoning, HCI, social aspects, gamification, …
‣ Best incentive: increase the usage of the OSM data

Thank you!

carsten.kessler@uni-muenster.de | http://carsten.io | @carstenkessler

References

Girres and Touya (2010) Quality assessment of the French OpenStreetMap dataset. Transactions in GIS, 14(4):435–459.

Haklay (2010) How good is volunteered geographical information? A comparative study of OpenStreetMap and Ordnance Survey datasets. Environment and Planning B: Planning and Design, 37(4):682.

Mooney, Corcoran, and Winstanley (2010) Towards quality metrics for OpenStreetMap. Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 514–517. ACM.

Zielstra and Zipf (2010) A comparative study of proprietary geodata and volunteered geographic information for Germany. 13th AGILE International Conference on Geographic Information Science, Guimarães, Portugal.
