Collecting a Ground Truth Dataset for OpenStreetMap

Carsten Keßler
Institute for Geoinformatics, University of Münster | soon: Hunter College, CUNY
http://carsten.io | @carstenkessler

flrec.ifas.ufl.edu/geomatics/agile2013/...


Background

‣ Study: Can we assess a feature’s quality based on its trustworthiness?

‣ Problem: What is the gold standard for a feature’s quality?

‣ Thursday, 10:30 – Session 4.2


Known problem

‣ Haklay (2010): Ordnance Survey
‣ Girres & Touya (2010): Institut Géographique National
‣ Mooney et al. (2010): Ordnance Survey Ireland (and others)
‣ Zielstra and Zipf (2010): Tele Atlas
‣ …


Comparing apples and oranges?

‣ Focus on spatial accuracy
‣ What about:
  ‣ thematic accuracy, consistency, and completeness
  ‣ timeliness
  ‣ fitness for purpose
  → harder, but also more interesting
‣ What about data that do not even exist anywhere else?
‣ Should other datasets set the benchmark for the OSM data?
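
To make the thematic side of the comparison concrete, here is a minimal sketch of two simple measures: how many expected tags a feature carries (completeness), and how often its tag values agree with a reference record (accuracy). The function names, the expected key list, and the sample records are illustrative assumptions, not part of the talk or of any OSM tooling.

```python
def tag_completeness(tags, expected_keys):
    """Fraction of expected keys that have a non-empty value in `tags`."""
    if not expected_keys:
        return 1.0
    present = sum(1 for key in expected_keys if tags.get(key))
    return present / len(expected_keys)


def tag_agreement(tags, reference):
    """Fraction of keys shared with `reference` whose values match exactly.

    Returns None when the two records share no keys, because agreement
    is undefined in that case.
    """
    shared = [key for key in reference if key in tags]
    if not shared:
        return None
    matches = sum(1 for key in shared if tags[key] == reference[key])
    return matches / len(shared)


# Hypothetical feature and reference record, for illustration only.
osm_tags = {"name": "Cafe Central", "amenity": "cafe", "opening_hours": ""}
reference_tags = {"name": "Cafe Central", "amenity": "restaurant"}
expected = ["name", "amenity", "opening_hours", "website"]

completeness = tag_completeness(osm_tags, expected)
agreement = tag_agreement(osm_tags, reference_tags)
```

Which keys count as "expected" depends on the feature type and on the intended use, which is exactly where fitness for purpose enters.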


2 Proposals


1. Combining existing datasets

‣ Lots of services explicitly collect information about places:
  ‣ Facebook, Foursquare, Yelp, …
    Wikipedia/Wikidata, LOD Cloud, …
‣ Different thematic angles → more complete
‣ Complement authoritative/commercial reference data (scope)


Limitations and challenges

‣ APIs and terms of use limit how far these sources can be exploited
‣ Only useful for specific lookups
‣ Different APIs
‣ Powerful and reliable identity reasoning required
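
As a rough illustration of what identity reasoning involves, the sketch below decides whether two place records from different sources describe the same real-world place by combining spatial proximity with name similarity. This is a deliberately naive baseline under stated assumptions: the record layout (`name`, `lat`, `lon`), the thresholds, and the sample records are hypothetical, and a reliable approach would need far more evidence than this.

```python
from difflib import SequenceMatcher
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres on a spherical Earth (radius 6,371 km)."""
    earth_radius_m = 6_371_000.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.casefold().strip(), b.casefold().strip()).ratio()


def same_place(p, q, max_dist_m=75.0, min_name_sim=0.8):
    """Naive identity test: both records must be close together AND similarly named.

    The two thresholds are arbitrary assumptions chosen for illustration.
    """
    close = haversine_m(p["lat"], p["lon"], q["lat"], q["lon"]) <= max_dist_m
    similar = name_similarity(p["name"], q["name"]) >= min_name_sim
    return close and similar


# Hypothetical records, as they might come from OSM and from another service.
osm_record = {"name": "Café Central", "lat": 51.9625, "lon": 7.6256}
other_record = {"name": "Cafe Central", "lat": 51.9626, "lon": 7.6257}
far_record = {"name": "Cafe Central", "lat": 52.5200, "lon": 13.4050}

is_match = same_place(osm_record, other_record)
is_far_match = same_place(osm_record, far_record)
```

Fixed thresholds like these break down quickly for chains with many nearby branches or for places with several names, which is why the slide asks for something more powerful.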


2. Crowdsourcing

‣ … in the spirit of OSM
‣ Ad-hoc collection of required reference data
‣ Potential to use the data not just for comparison
‣ Problem: Motivation


Conclusions

‣ OSM should define its own quality requirements
‣ Reference data: External sources + crowdsourcing + X
‣ Lots of research opportunities:
  ‣ Identity reasoning, HCI, social aspects, gamification, …
‣ Best incentive: increase the usage of the OSM data


References

Girres and Touya (2010) Quality assessment of the French OpenStreetMap dataset. Transactions in GIS, 14(4):435–459.

Haklay (2010) How good is volunteered geographical information? A comparative study of OpenStreetMap and Ordnance Survey datasets. Environment and Planning B: Planning and Design, 37(4):682.

Zielstra and Zipf (2010) A Comparative Study of Proprietary Geodata and Volunteered Geographic Information for Germany. 13th AGILE International Conference on Geographic Information Science, Guimarães, Portugal.

Mooney, Corcoran, and Winstanley (2010) Towards quality metrics for OpenStreetMap. In Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 514–517. ACM.