open spatial processing

Open Spatial Data: Progress towards a reusable gazetteer. Open Data Group – 16th April 2012, @ianibbo. This work is licensed under a Creative Commons Attribution 3.0 Unported License.


Page 1: Open spatial processing

Open Spatial Data: Progress towards a reusable gazetteer

Open Data Group – 16th April 2012, @ianibbo

This work is licensed under a Creative Commons Attribution 3.0 Unported License.

Page 2: Open spatial processing
Page 3: Open spatial processing

Overview

Original Problem

How to transition a central-government-funded aggregation of childcare and positive activities, with a budget of >£2m / year, to an open data* model running on £60/month hardware

Retaining security (of a certain level)

Retaining functionality

(See http://www.madwdata.org.uk/blog/id/394)

Page 4: Open spatial processing

2 Major Costs To Mitigate

Large cluster of proprietary OS hosts, ~12 front end web servers, hot backup sql server

Migrated to 1*Pound Host server ~£60/month, server has 2 hard drives, hot backup, off site rsync

Data costs – BPH Address-Point data – Used for geocoding incoming records and lookups on search terms. OS Boundary Line

???

Page 5: Open spatial processing

Some Noise

Open Spatial Data Consultation......

Page 6: Open spatial processing

Open Spatial Data

Ordnance Survey Open Data

http://www.ordnancesurvey.co.uk/oswebsite/products/os-locator/index.html

Code Point Open

Postcodes to Northing/Easting

OS Locator

Gazetteer of road names (And other features)

Obtained by registering on website, requesting, getting email, following link, …..
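The Code Point Open lookup described above boils down to a postcode-keyed table of eastings and northings. A minimal sketch, assuming CSV rows with the postcode in column 0 and the easting and northing in columns 2 and 3 (check this against the column-header file shipped with your release). The names `normalise_postcode` and `load_codepoint` are hypothetical, and the sample rows are invented for illustration.

```python
import csv
import io


def normalise_postcode(raw):
    """Remove all whitespace and upper-case, so 's1 1ab' matches 'S11AB'."""
    return "".join(raw.split()).upper()


def load_codepoint(lines, postcode_col=0, easting_col=2, northing_col=3):
    """Build {postcode: (easting, northing)} from an iterable of CSV lines.

    The column positions are assumptions and can be overridden.
    """
    lookup = {}
    widest = max(postcode_col, easting_col, northing_col)
    for row in csv.reader(lines):
        if len(row) <= widest:
            continue  # short or blank row
        try:
            easting = int(row[easting_col])
            northing = int(row[northing_col])
        except ValueError:
            continue  # header row or non-numeric coordinates
        lookup[normalise_postcode(row[postcode_col])] = (easting, northing)
    return lookup


# Invented sample rows, not real Code Point Open data.
sample = io.StringIO('"S1 1AA",10,435000,387000\n"S1 1AB",10,435100,387200\n')
table = load_codepoint(sample)
print(table.get(normalise_postcode("s1 1ab")))
```

Keeping the table keyed on a normalised postcode means search terms typed with or without the space resolve to the same entry.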

Page 7: Open spatial processing

The reality of CodePoint Open

The core data is “Open”

Missing the one vital link between CodePoint Open and OS Locator – PostCode → Road Names / Identifiers.

If you're happy to display Postcodes without road names, it's ideal.

Last Mile Problem.

Finding an automated way to link the 2 is hard!

Licensed data is now open, but out of date
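To make the linking problem concrete, here is a deliberately naive sketch: attach each postcode coordinate to the road whose representative point is closest. This is not the method used in the project. It assumes each road record carries a single easting/northing pair, the function name `nearest_road` is hypothetical, and the sample data is invented.

```python
import math


def nearest_road(point, roads):
    """Return (road_name, distance_in_metres) for the road closest to point.

    point is (easting, northing); roads is an iterable of
    (name, easting, northing) tuples. Road names repeat across the
    country, so a list of tuples is used rather than a dict.
    """
    best_name, best_dist = None, math.inf
    for name, easting, northing in roads:
        dist = math.hypot(easting - point[0], northing - point[1])
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist


# Invented sample data for illustration only.
roads = [
    ("HIGH STREET", 435000, 387000),
    ("CHURCH LANE", 435300, 387400),
]
name, dist = nearest_road((435100, 387100), roads)
print(name, round(dist, 1))
```

The closest representative point is often not the road a postcode actually sits on (long roads, cul-de-sacs, postcodes spanning several streets), which is one reason the automated link is hard.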

Page 8: Open spatial processing

Address Point

Still Licensed

Expensive

Probably not that useful anyway for most projects

Page 9: Open spatial processing

Problem with focus on “Open Data”

Everyone ends up implementing their own gazetteer

Large scale providers have rate limits and introduce external dependencies / Speed issues

People want local geo-coding (for lots of different reasons).

Having rolled your own gazetteer, you discover you need to handle updates (Full replacements)

It's not an end in itself

Page 10: Open spatial processing

Vision

A stand-alone gazetteer web app designed for local network use with features for importing updates from OS, reconciling multiple data sources and performing geo-coding lookups.

Page 11: Open spatial processing

Available Tools

Apache SOLR

Long-Standing stalwart of the open data and search community

Schemas slightly clunky

Several spatial options, all with different strengths / weaknesses. Multiple points a problem in some.

ElasticSearch

Schema Free, Apparently Solid Spatial, Multi Points

Good integration with Mongo via Rivers

Page 12: Open spatial processing

Problems / Issues

ES Spatial search hard to do directly via a COOL URL

Spatial query syntax is expressive, but complex and needs JSON sub-documents

Need service wrappers

But that's easily done

Updates!
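A service wrapper of the kind mentioned above can be little more than a function that turns simple URL parameters into the JSON sub-document the search engine expects. A minimal sketch, assuming a hypothetical URL shape (`/gaz/search?lat=..&lon=..&dist=..`) and a geo-point field called `location`. The body follows the `geo_distance` query form of current Elasticsearch releases, so older versions may need a different envelope.

```python
import json
from urllib.parse import urlparse, parse_qs


def geo_query_from_url(url, field="location", default_distance="5km"):
    """Translate ?lat=..&lon=..&dist=.. into an Elasticsearch query body.

    The URL parameter names and the field name are assumptions.
    """
    params = parse_qs(urlparse(url).query)
    lat = float(params["lat"][0])
    lon = float(params["lon"][0])
    distance = params.get("dist", [default_distance])[0]
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        field: {"lat": lat, "lon": lon},
                    }
                },
            }
        }
    }


body = geo_query_from_url("http://localhost/gaz/search?lat=53.38&lon=-1.47&dist=2km")
print(json.dumps(body, indent=2))
```

The resulting dictionary would be posted to the index's `_search` endpoint by whatever HTTP client the wrapper uses. That step is left out here.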

Page 13: Open spatial processing

Missed Level of Abstraction (Common to many open data sets?)

Source

LocalCopy

Compare

Processing

NoSQL (like Mongo) is ideal for this

ES Ideal for this
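The Source / LocalCopy / Compare / Processing steps can be sketched as a diff between the newly downloaded release and the local copy, so that only additions, changes and removals are pushed on to processing. This is a minimal in-memory sketch, not the project's implementation: the function name `compare_releases` is hypothetical, records are assumed to be keyed by a stable identifier (here a postcode), and the sample data is invented.

```python
def compare_releases(local, source):
    """Diff two {key: record} snapshots.

    Returns (added, changed, removed): added and changed map keys to the
    new record, removed is a sorted list of keys missing from the source.
    """
    added = {k: source[k] for k in source.keys() - local.keys()}
    removed = sorted(local.keys() - source.keys())
    changed = {
        k: source[k]
        for k in source.keys() & local.keys()
        if source[k] != local[k]
    }
    return added, changed, removed


# Invented sample snapshots for illustration only.
local_copy = {"S11AA": (435000, 387000), "S11AB": (435100, 387200)}
new_release = {"S11AB": (435110, 387200), "S11AD": (435400, 387500)}

added, changed, removed = compare_releases(local_copy, new_release)
print(added, changed, removed)
```

With a document store holding the local copy, the same comparison would run record by record rather than over whole dictionaries, but the three outputs stay the same.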

Page 14: Open spatial processing

Progress

Starting to extract code from existing services into a generic spatial app

https://github.com/ianibo/AnOpenGazetteerFramework/

Work progressing under aegis of GIST Mobile group / Open Data group

Workable gazetteer now, but importing is command-line only.

Page 15: Open spatial processing

Questions / Comments?

Page 16: Open spatial processing

Some supporting info

Original Project – FOI request to DfE

[Chart: Total costs - First 3 years. X-axis: 2008-09, 2009-10, 2010-11. Y-axis: 0 to 7,000,000. Series: Local Authority Revenue, Local Authority Capital, Central Office of Information, Qi Consulting, Redhouse, DfE Staff Costs, Consultation seminars, Methods Consulting, Engine Group, Digital Public, Tribal Education]

Page 17: Open spatial processing

[Chart: First 3 years - Non LA costs. X-axis: 2008-09, 2009-10, 2010-11. Y-axis: 0 to 2,500,000. Series: Central Office of Information, Qi Consulting, Redhouse, DfE Staff Costs, Consultation seminars, Methods Consulting, Engine Group, Digital Public, Tribal Education]