Improve OSM data quality with Deep Learning

@o_courtin

@fosdem 2019

Detect inconsistencies between two DataSets

Neural Network

Imagery

Labels

Loss Function

TrainedModel

Prediction

AlternateDataSet

Compare

RoboSat.pink

@RobosatP Semantic Segmentation ecosystem for GeoSpatial Imagery

DataSet Quality Analysis

Change Detection highlighter

Features extraction

RoboSat.pink Spirit

State-of-the-art SemSeg

Industrial code robustness

Code minimalism as a code aesthetic

Modular and extensible, by design

OSM and MapBox ecosystem friendly

GeoSpatial standards compliant

MIT Licence

Download WMS

Rasterize GeoJSON

Extract OSM pbf

Raster

Subset

TrainingDataSetBbox

XYZ dir

Data Preparation

https://arxiv.org/pdf/1806.00844.pdf

PreTrained Encoder

Image Label Cross Entropy mIoU Lovasz

http://www.cs.toronto.edu/~wenjie/papers/iccv17/mattyus_etal_iccv17.pdf

http://www.cs.umanitoba.ca/~ywang/papers/isvc16.pdf

https://arxiv.org/abs/1705.08790

Semantic Loss

From OpenData to OpenDataSet

https://github.com/datapink/robosat.pink/blob/master/docs/from_opendata_to_opendataset.md

rsp cover --zoom 18 --type bbox 4.795,45.628,4.935,45.853 ~/rsp_dataset/cover

rsp download --type WMS 'https://download.data.grandlyon.com/wms/grandlyon?SERVICE=WMS&REQUEST=GetMap&VERSION=1.3.0&LAYERS=Ortho2015_vue_ensemble_16cm_CC46&WIDTH=512&HEIGHT=512&CRS=EPSG:3857&BBOX={xmin},{ymin},{xmax},{ymax}&FORMAT=image/jpeg' --web_ui --ext jpeg ~/rsp_dataset/cover ~/rsp_dataset/images

Imagery
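The `rsp cover` step above turns a bounding box into a list of XYZ tiles at zoom 18. As a minimal sketch of that idea, using the standard Web Mercator tile formula rather than RoboSat.pink's own code (the helper names `lonlat_to_tile` and `bbox_cover` are hypothetical):

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) XYZ tile index containing a WGS84 point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def bbox_cover(west, south, east, north, zoom):
    """List every (x, y, z) tile touched by a lon/lat bounding box."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # top-left corner
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # bottom-right corner
    return [
        (x, y, zoom)
        for x in range(x_min, x_max + 1)
        for y in range(y_min, y_max + 1)
    ]


# Same bounding box and zoom as the rsp cover command.
tiles = bbox_cover(4.795, 45.628, 4.935, 45.853, zoom=18)
print(len(tiles))
```

With this formula the Lyon bounding box gives 103 x 236 = 24308 tiles, which equals the 16384 + 7924 tiles used in the training and validation split.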

wget -O ~/rsp_dataset/lyon_roofprint.json 'https://download.data.grandlyon.com/wfs/grandlyon?SERVICE=WFS&REQUEST=GetFeature&TYPENAME=ms:fpc_fond_plan_communaut.fpctoit&VERSION=1.1.0&srsName=EPSG:4326&outputFormat=application/json; subtype=geojson'

rsp rasterize --config config.toml --zoom 18 --web_ui ~/rsp_dataset/lyon_roofprint.json ~/rsp_dataset/cover ~/rsp_dataset/labels

Labels

mkdir ~/rsp_dataset/training ~/rsp_dataset/validation

cat ~/rsp_dataset/cover | sort -R > ~/rsp_dataset/cover.shuffled

head -n 16384 ~/rsp_dataset/cover.shuffled > ~/rsp_dataset/training/cover

tail -n 7924 ~/rsp_dataset/cover.shuffled > ~/rsp_dataset/validation/cover

rsp subset --web_ui --dir ~/rsp_dataset/images --cover ~/rsp_dataset/training/cover --out ~/rsp_dataset/training/images

rsp subset --web_ui --dir ~/rsp_dataset/labels --cover ~/rsp_dataset/training/cover --out ~/rsp_dataset/training/labels

rsp subset --web_ui --dir ~/rsp_dataset/images --cover ~/rsp_dataset/validation/cover --out ~/rsp_dataset/validation/images

rsp subset --web_ui --dir ~/rsp_dataset/labels --cover ~/rsp_dataset/validation/cover --out ~/rsp_dataset/validation/labels

rsp train --config config.toml ~/rsp_dataset/pth

Split DataSet and first Training
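The shuffle, head and tail commands of this step amount to a random split of the tile cover. A minimal Python sketch of the same logic, assuming the cover is just a list of tile identifiers (the function name `split_cover` and the fixed seed are illustrative, not part of RoboSat.pink):

```python
import random


def split_cover(tiles, n_train, n_val, seed=0):
    """Shuffle tiles, then take the first n_train and the last n_val."""
    shuffled = list(tiles)
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]                          # like head -n
    val = shuffled[max(0, len(shuffled) - n_val):]      # like tail -n
    return train, val


# Placeholder identifiers standing in for the 24308 tiles of the cover.
cover = list(range(16384 + 7924))
train_cover, val_cover = split_cover(cover, 16384, 7924)
```

The two subsets are disjoint only when `n_train + n_val` does not exceed the number of tiles, which holds here since 16384 + 7924 is the full cover.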

Buildings IoU metric on validation dataset, after 10 epochs: 0.82
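For reference, IoU (Intersection over Union) on a binary building mask can be sketched as below. This is the textbook definition on a toy 2x4 tile; how RoboSat.pink aggregates the metric over the validation set is not stated in the slides, and the `iou` helper is illustrative only.

```python
def iou(pred, label):
    """Intersection over Union of two binary masks given as nested lists."""
    inter = 0
    union = 0
    for pred_row, label_row in zip(pred, label):
        for p, l in zip(pred_row, label_row):
            if p and l:
                inter += 1
            if p or l:
                union += 1
    # An empty union means neither mask has foreground: IoU is undefined.
    return inter / union if union else float("nan")


label = [[0, 1, 1, 0],
         [0, 1, 1, 0]]
pred = [[0, 1, 1, 1],
        [0, 1, 0, 0]]
score = iou(pred, label)  # 3 shared pixels out of 5 in the union
```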

rsp predict --config config.toml --checkpoint ~/rsp_dataset/pth/checkpoint-00010-of-00010.pth --web_ui ~/rsp_dataset/images ~/rsp_dataset/masks

Predict

Detect wrong labels (zoom out)

rsp compare --images ~/rsp_dataset/images ~/rsp_dataset/labels ~/rsp_dataset/masks --mode stack --labels ~/rsp_dataset/labels --masks ~/rsp_dataset/masks --config config.toml --ext jpeg --web_ui ~/rsp_dataset/compare

rsp compare --mode list --labels ~/rsp_dataset/labels --maximum_qod 80 --minimum_fg 5 --masks ~/rsp_dataset/masks --config config.toml --geojson ~/rsp_dataset/compare/tiles.json
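The list mode above keeps tiles that score at most 80 and have at least 5% foreground. The sketch below illustrates that kind of thresholding only. It assumes a 0-100 agreement score computed as IoU x 100, which is a stand-in and not RoboSat.pink's actual QoD formula, and it assumes the foreground test applies to the label. All function names are hypothetical.

```python
def foreground_pct(mask):
    """Percentage of foreground pixels in a flat binary mask."""
    return 100.0 * sum(1 for v in mask if v) / len(mask)


def agreement_pct(label, mask):
    """Stand-in score (assumption): IoU between label and mask, times 100."""
    inter = sum(1 for l, m in zip(label, mask) if l and m)
    union = sum(1 for l, m in zip(label, mask) if l or m)
    return 100.0 * inter / union if union else 100.0


def select_suspicious(tiles, maximum_score=80.0, minimum_fg=5.0):
    """Return tiles with enough foreground and a low agreement score."""
    selected = []
    for tile, (label, mask) in tiles.items():
        if foreground_pct(label) < minimum_fg:
            continue  # nearly empty tile, score would not be meaningful
        if agreement_pct(label, mask) <= maximum_score:
            selected.append(tile)
    return selected


# Three toy tiles of 100 pixels each, keyed by hypothetical (x, y, z).
tiles = {
    (1, 1, 18): ([1] * 50 + [0] * 50, [1] * 50 + [0] * 50),  # perfect match
    (1, 2, 18): ([1] * 40 + [0] * 60, [1] * 20 + [0] * 80),  # half missed
    (1, 3, 18): ([1] * 2 + [0] * 98, [0] * 100),             # 2% foreground
}
suspicious = select_suspicious(tiles)
```

Only the second tile is selected: the first agrees fully and the third falls under the foreground threshold.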

Detect wrong labels (zoom in)

Semi-manually select wrong labels

rsp compare --mode side --images ~/rsp_dataset/images ~/rsp_dataset/compare --labels ~/rsp_dataset/labels --maximum_qod 80 --minimum_fg 5 --masks ~/rsp_dataset/masks --config config.toml --ext jpeg --web_ui ~/rsp_dataset/compare_side

rsp subset --mode delete --dir ~/rsp_dataset/training/images --cover ~/rsp_dataset/cover.to_remove > /dev/null

rsp subset --mode delete --dir ~/rsp_dataset/training/labels --cover ~/rsp_dataset/cover.to_remove > /dev/null

rsp subset --mode delete --dir ~/rsp_dataset/validation/images --cover ~/rsp_dataset/cover.to_remove > /dev/null

rsp subset --mode delete --dir ~/rsp_dataset/validation/labels --cover ~/rsp_dataset/cover.to_remove > /dev/null

rsp train --config config.toml --epochs 100 ~/rsp_dataset/pth_clean

Buildings IoU metric on validation dataset

after 10 epochs: 0.84

after 100 epochs: 0.87

Remove selected wrong labels and Train again

Both Prediction and DataSet are quite consistent

Change Detection

Prediction False Negative

Compare Predicts against OSM

wget -O /tmp/ra_osm.pbf http://download.geofabrik.de/europe/france/rhone-alpes-latest.osm.pbf

osmosis --read-pbf file="/tmp/ra_osm.pbf" --bounding-box left=4.795 bottom=45.628 right=4.935 top=45.853 completeWays=yes completeRelations=yes cascadingRelations=yes --write-pbf file="/tmp/osm_lyon.pbf"

rsp extract --type building /tmp/osm_lyon.pbf ~/rsp_dataset/osm.json

rsp rasterize --config ~/robosat.pink/config.toml --zoom 18 ~/rsp_dataset/osm.json ~/rsp_dataset/cover ~/rsp_dataset/osm

rsp compare --images ~/rsp_dataset/images ~/rsp_dataset/osm ~/rsp_dataset/masks_clean --mode stack --labels ~/rsp_dataset/osm --masks ~/rsp_dataset/masks_clean --config config.toml --web_ui ~/rsp_dataset/compare_osm

rsp vectorize --type building --config config.toml ~/rsp_dataset/masks_clean /tmp/building.json

Compare Predicts against OSM

In OSM but nothing related in the imagery:

- building was built since the imagery was taken

- building was destroyed but is still in OSM

Predicted from imagery but not in OSM:

- OSM polygon is OK but lacks the building attribute (most frequent)

- building is really missing in OSM

- building was destroyed since the imagery was taken

- model prediction artefact
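The two families of divergence listed above correspond to the two set differences between the rasterized OSM mask and the predicted mask. A minimal sketch on toy binary masks follows; the function name `disagreement` is hypothetical, and deciding which of the listed causes applies still needs a human look at the imagery.

```python
def disagreement(osm, pred):
    """Split per-pixel disagreement into OSM-only and prediction-only masks."""
    osm_only = [[int(bool(o) and not p) for o, p in zip(o_row, p_row)]
                for o_row, p_row in zip(osm, pred)]
    pred_only = [[int(bool(p) and not o) for o, p in zip(o_row, p_row)]
                 for o_row, p_row in zip(osm, pred)]
    return osm_only, pred_only


osm = [[1, 1, 0, 0],
       [1, 1, 0, 0]]
pred = [[0, 0, 0, 1],
        [1, 1, 0, 1]]
osm_only, pred_only = disagreement(osm, pred)
# osm_only marks pixels mapped in OSM with nothing predicted on the imagery;
# pred_only marks pixels predicted from the imagery but absent from OSM.
```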

OSM and Training DataSet classification divergence

Performances

Whole Data Preparation : About an hour and a half (downloads included)

Manual Filtering : About two hours

Training : ~20 min per epoch (i.e. ~30 hours for 100 epochs)

Prediction : ~3 MegaPixels per second

On a single GTX 1080 Ti

Training can scale with multi GPUs.
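A back-of-the-envelope check of these figures, as a sketch. The tile size (512x512) comes from the WMS request and the tile count (16384 + 7924) from the split. Taking one MegaPixel as 10^6 pixels and predicting the whole cover are assumptions.

```python
TILE_SIDE = 512                # pixels, from WIDTH=512&HEIGHT=512
N_TILES = 16384 + 7924         # training + validation covers
PREDICT_MPIX_PER_S = 3         # quoted prediction throughput
MIN_PER_EPOCH = 20             # quoted training time per epoch
EPOCHS = 100

total_mpix = N_TILES * TILE_SIDE ** 2 / 1e6        # assumes 1 MP = 10^6 px
predict_minutes = total_mpix / PREDICT_MPIX_PER_S / 60
training_hours = MIN_PER_EPOCH * EPOCHS / 60

print(f"imagery: {total_mpix:.0f} MP")
print(f"prediction: about {predict_minutes:.0f} min")
print(f"training: about {training_hours:.0f} h")
```

Under these assumptions, predicting the whole Lyon cover takes a bit over half an hour, and 100 epochs take about 33 hours, the same order as the ~30 hours quoted.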

Stacks

Proj 4

GEOS GDAL

Rasterio

CUDA cuDNN

PyTorch

OpenCV

RoboSat.pink

Pillow Shapelib Osmium

Mercantile

SuperMercado

August RoboSat 1.0 MapBox RoboSat Initial release daniel-j-h bkowshik

September RoboSat 1.1 Training perfs increase Jesse-jApps ocourtin

October RoboSat master OSM Roads extraction DragonEmperorG

mIoU and Lovasz losses ocourtin

November RoboSat PR 138 MultiBands support and Tools refactor ocourtin

November RoboSat.pink 0.1 QoD support and whole refactor ocourtin

February RoboSat.pink 0.2 Feature Extraction ocourtin

From RoboSat to RoboSat.pink

Next ?

- SemSeg on lower resolution imagery, such as Sentinel-2

- Prediction performance improvements

- OSM OpenDataSet and Pre Trained models

Take Away

- Industrial, state-of-the-art Aerial SemSeg ecosystem available, and playful

- Plain OpenData can be used to train a model

- Prediction speed still needs to improve to scale at large
