Post on 25-Jan-2017

Shared data infrastructures: From Smart Cities to Education

Mathieu d’Aquin (@mdaquin)

Data Science Group (@DataScienceGr)

KMi, The Open University

datahub.technology

A city?

A place

Power infrastructure

Water infrastructure

Transport infrastructure

A bunch of people (and organisations)

live / reside there

use

understand???

effective with???

A smart city?

A place

Power infrastructure

Water infrastructure

Transport infrastructure

Data infrastructure

A bunch of people (and organisations)

mksmart.org

Making Milton Keynes a Smart City

Challenges

Transport: By 2026, transport demand in MK is estimated to grow by 60%, with engineering solutions only likely to meet half of this.

Water: MK is in a water-stressed area, and it is projected that climate change may reduce regional water availability by 29 million litres per day by 2025.

Energy: MK needs to transition to being a low-energy city to support sustainable economic growth. The 2013 core strategy of MK Council aims to achieve a 22% reduction in CO2 emissions per capita from a 2005 base by 2020.

data

apps

apps

apps

A data infrastructure

Because addressing vertically every route from data to application in isolation is crazy!

Challenges:

- Data heterogeneity: The content of the data is not the same

- Data diversity: The context and conditions under which the data is available are not the same

Data heterogeneity

3 types of data in a city:

- Highly temporal data

- Highly spatial data

- Others...

Typical integration approach

Answer the question: “what do we know about x?” where x is a place, organisation, bus route, roundabout, etc.

Data sources 1 to 6 -> ETL process (one per data source) -> Big City Warehouse -> access

Hard to maintain and keep running at scale (i.e. as the number of datasets grows)

Taking a Linked Data approach

Answer the question: “what do we know about x?” where x is a place, organisation, bus route, roundabout, etc.

Data sources 1 to 6 -> Query template (one per data source) -> access

Can be added and maintained in isolation from each other

Virtual (i.e. never fully materialised) - no need for maintenance
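
The query-template idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MK:Smart implementation: the `EntityAPI` class, the two in-memory sources and the identifier `bus-stop/42` are all hypothetical, and a real template would more likely wrap a query against the source itself. The point it shows is that each source contributes one small template, and the integrated view of an entity is assembled at request time rather than stored.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# A "query template" is modelled here as a function that, given an entity
# identifier, returns what one data source knows about that entity.
QueryTemplate = Callable[[str], Dict[str, object]]


@dataclass
class EntityAPI:
    """Hypothetical entity-centric API built from per-source templates."""

    templates: Dict[str, QueryTemplate] = field(default_factory=dict)

    def register(self, source: str, template: QueryTemplate) -> None:
        # Sources are added independently; no other template is touched.
        self.templates[source] = template

    def describe(self, entity_id: str) -> Dict[str, Dict[str, object]]:
        # The integrated view is computed on demand and never materialised.
        result: Dict[str, Dict[str, object]] = {}
        for source, template in self.templates.items():
            facts = template(entity_id)
            if facts:
                result[source] = facts
        return result


# Two made-up sources standing in for real city datasets.
TIMETABLE = {"bus-stop/42": {"routes": ["4", "300"]}}
ACCESSIBILITY = {"bus-stop/42": {"step_free": True}}


def timetable_template(entity_id: str) -> Dict[str, object]:
    return TIMETABLE.get(entity_id, {})


def accessibility_template(entity_id: str) -> Dict[str, object]:
    return ACCESSIBILITY.get(entity_id, {})


api = EntityAPI()
api.register("timetable", timetable_template)
api.register("accessibility", accessibility_template)
print(api.describe("bus-stop/42"))
```

Removing or replacing one source only means changing its own template, which is the maintenance property the slide contrasts with a central warehouse fed by ETL processes.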

Result

An “Entity-API” for things in the city, integrating hundreds of datasets and providing thousands of data endpoints, each providing integrated information about a bus stop, an area, a restaurant, a school, a roundabout, etc.

Result - Top MK - card playing with city data

Result - MK Insight - City data portal

Data Diversity

The eskimo language has 255 different words for “visiting linguist”

What’s the point of integrating data if we have to go through every bit of it to check if it can be used?

Example

Inputs: Smart meter data, Solar panel monitoring, Weather data, Location data, Electricity tariff data

Flow: inputs -> Anonymisation (smart meter and solar panel data) -> Anon data -> analysis -> Model -> prediction/recommendation -> Results

Policies and licences on the inputs: Data prot. (x3), Corp lic. 1, Corp lic. 2, Corp lic. 3, User T&C, OGL

Policies and licences on the derived data, model and results: ?

Semantic approach

Explicit, machine-readable representation of data policies and licences...

… as well as of the data flows through which they are processed

Handled through a sophisticated data cataloguing process
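
A minimal sketch of what propagating machine-readable policies along a data flow could look like is given below. It is an assumption-laden toy, not the datahub's actual policy model: policies are reduced to plain string labels, the `Step` type and the `discharges` field are hypothetical, and the claim that anonymisation discharges a data-protection constraint is made purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class Step:
    """One processing step in a data flow (hypothetical structure)."""

    name: str
    inputs: Tuple[str, ...]
    output: str
    # Policy labels this step is assumed to remove from its output.
    discharges: FrozenSet[str] = frozenset()


def propagate(
    source_policies: Dict[str, Set[str]], steps: List[Step]
) -> Dict[str, Set[str]]:
    """Derive the policies of every intermediate and final output.

    Steps are assumed to be listed in execution order. An output inherits
    the union of its inputs' policies, minus whatever the step discharges.
    """
    policies = {name: set(terms) for name, terms in source_policies.items()}
    for step in steps:
        inherited: Set[str] = set()
        for name in step.inputs:
            inherited |= policies[name]
        policies[step.output] = inherited - step.discharges
    return policies


# Illustrative labels only; the real licences carry far richer terms.
sources = {
    "smart_meter": {"data-protection", "corp-licence-1"},
    "weather": {"OGL"},
}
flow = [
    Step("anonymisation", ("smart_meter",), "anon_meter",
         frozenset({"data-protection"})),
    Step("analysis", ("anon_meter", "weather"), "model"),
]
result = propagate(sources, flow)
print(sorted(result["model"]))
```

With both the policies and the flow described explicitly, the question marks on the derived outputs in the example become something a program can compute rather than something a person has to work out dataset by dataset.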

Result

datahub.mksmart.org

~ 400 registered users

~ 700 datasets

Result - Applications

Result - Reusable components

datahub.technology

Open source data cataloguing components that integrate with a CMS

“Entity API” framework for data integration

High-speed processing components

Data portal framework

What about education?

Learner

Platform

Analytics

VLE | Website | Library | Assessment | Enrollment

School/University

Prediction Drop out

BI

Planning

Recommendation

Sentiment Analysis

Collective Intelligence

Behaviour Analysis

Collaboration

Community Support

AFEL - Analytics for Everyday Learning (afel-project.eu)

Same challenges… same solutions

Pointers and next steps

Deploying to other cities in the UK and Europe, as well as for research projects and other types of organisations.

Dealing with data quality and accuracy - automatic checking based on a cross comparison with other datasets.

Reusable, parameterizable services for analytics - building a catalogue of pipelines and models.
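
As a rough illustration of the automatic checking by cross comparison mentioned among the next steps, the sketch below flags values in one dataset that disagree with a second dataset covering the same entities. Everything in it is an assumption made for the example: the function name `cross_check`, the 10% relative tolerance, and the sensor readings are invented, and the talk does not say how the comparison would actually be done.

```python
from typing import Dict, Tuple


def cross_check(
    primary: Dict[str, float],
    reference: Dict[str, float],
    tolerance: float = 0.1,
) -> Dict[str, Tuple[float, float]]:
    """Return entries of `primary` that deviate from `reference`.

    A value is flagged when its relative difference from the reference
    value exceeds `tolerance`. Keys missing from `reference` are skipped,
    since there is nothing to compare them against.
    """
    flagged: Dict[str, Tuple[float, float]] = {}
    for key, value in primary.items():
        if key not in reference:
            continue
        ref = reference[key]
        scale = max(abs(ref), 1e-9)  # avoid division by zero
        if abs(value - ref) / scale > tolerance:
            flagged[key] = (value, ref)
    return flagged


# Made-up temperature readings from two overlapping datasets.
city_sensors = {"sensor-a": 20.5, "sensor-b": 35.0, "sensor-c": 18.0}
weather_service = {"sensor-a": 20.0, "sensor-b": 21.0}

suspect = cross_check(city_sensors, weather_service)
print(suspect)
```

A flagged entry does not say which dataset is wrong, only that the two disagree, so in practice such a check would feed a review step rather than correct data automatically.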

Thank you!

@mdaquin

@DataScienceGr

http://datahub.technology
