Extracting Value from Big Data - The Case of Vehicular Traffic Data, by Christian S. Jensen


Upload: infinit-innovationsnetvaerket-for-it

Post on 31-Jul-2015


Page 1: Extracting Value from Big Data - The Case Vehicular Traffic Data by Christian S. Jensen

Christian S. Jensen

www.cs.aau.dk/~csj

Extracting Value from Big Data –

The Case of Vehicular Traffic Data

Page 2

Roadmap

• Big data

Hype or substance?

Instrumentation of reality and digitization

The digital universe

Moore’s Law generalized

Big data challenges

• Big data in traffic

Motivation

Data and systems

Eco driving and routing

Traffic analytics examples

Page 3

Hype or Substance?

• We have been pushing the boundaries for decades

How much data we can handle

How fast

Data integration

• Examples

VLDB: International Conference on Very Large Data Bases

TODS: ACM Transactions on Database Systems

• So is it all hype?

No

Page 4

Instrumentation and Digitization

• Instrumentation of reality

Notably, smartphones

• Digitization of processes

E.g., e-commerce, public services, communications, social interactions

Page 5

The Vatican, 2005

Page 6

The Vatican, 2013

Page 7

2005 vs. 2013

Page 8

The Digital Universe

• The digital universe

Doubling every ~18-24 months

Grew 60% in 2009, 50% in 2010

2009: 0.8 zettabytes, 2010: 1.2 zettabytes, 2020: 35 zettabytes

2009-20: growth by a factor of 44

http://www.emc.com/collateral/demos/microsites/idc-digital-universe/iview.htm

1 zettabyte = 1024 exabytes

1 exabyte = 1024 petabytes = 2^60 bytes = 1,152,921,504,606,846,976 bytes ≈ 10^18 bytes
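The figures on this slide can be cross-checked with a little arithmetic. The sketch below is illustrative only: it uses the binary units stated above, treats 2009 to 2020 as 11 years of compound growth (an assumption), and its function names are made up for the example.

```python
import math

# Unit sizes as stated on the slide (binary prefixes).
EXABYTE = 2 ** 60            # bytes
ZETTABYTE = 1024 * EXABYTE   # bytes

def growth_factor(start, end):
    """Overall growth factor between two sizes."""
    return end / start

def annual_rate(factor, years):
    """Compound annual growth rate implied by an overall factor."""
    return factor ** (1 / years) - 1

def doubling_months(rate):
    """Months needed to double at a given annual growth rate."""
    return 12 * math.log(2) / math.log(1 + rate)

factor = growth_factor(0.8, 35)   # 2009 -> 2020, in zettabytes
rate = annual_rate(factor, 11)    # assumption: 11 years of compounding

print(EXABYTE)                    # 1152921504606846976
print(round(factor, 2))           # overall growth, close to the quoted factor of 44
print(round(rate * 100), "% per year;", round(doubling_months(rate)), "months to double")
```

Under these assumptions the implied doubling time lands at the upper end of the quoted 18-24 month range.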

Page 9

Moore’s Law – The Bicycle Analogy

• Moore’s Law: computers double in speed every 24 months.

Applies also to quality-adjusted microprocessor prices, memory capacity, disks, networks, sensors, and the number and size of pixels in cameras.

• How fast would a bicyclist be if Moore’s Law applied?

50 years of doubling every 24 months

30 km/h originally

~1 billion km/h now

• Three lessons

Growth rates in computing are dramatic and difficult to imagine.

Hardware advances are important information technology drivers.

Humans don’t really improve – they are the constants.
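The bicycle figure follows directly from the numbers on the slide: 50 years at one doubling per 24 months is 25 doublings. A minimal sketch of that calculation (the function name is illustrative):

```python
def moores_law_speed(initial_kmh, years, doubling_period_years=2):
    """Speed after repeated doubling, one doubling per period."""
    doublings = years // doubling_period_years
    return initial_kmh * 2 ** doublings

speed = moores_law_speed(30, 50)   # 25 doublings of 30 km/h
print(f"{speed:,} km/h")           # 1,006,632,960 km/h
```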

Page 10

Big Data – Synthesis

• The result is new opportunity.

• Lots of data and unprecedented computing infrastructure combine to offer potential for value creation from data.

• To be competitive, society and businesses must be able to create value from data.

• Data-based decisions and data-driven processes

Decisions based on good data beat decisions based on feelings or opinions.

• A finer granularity of services

• Entirely new services

Page 11

Big Data – Data-Driven Society, Business

Page 12

ITS – Motivation

• A safer, greener, and more efficient and cost-effective transportation infrastructure

• Greenhouse gas emissions reductions via eco-routing

• Congestion, greater Copenhagen region

~10 billion DKK/year (2004)

• Bad setting of signalized intersections in Denmark

~9.3 billion DKK/year (2012)

Page 13

Data, Software and Hardware

• Experimental infrastructure

• Data

4+ billion GPS records, 17,000+ vehicles

350+ million CAN Bus records

Conventional and electric vehicles (GPS/CAN bus data)

17 data sources, ~3 million rows per day from 3,500 vehicles

• Software

A complete software stack for handling traffic data

Map-matching, data cleansing, multiple map support

• Hardware

Very modern server farm

Newest machine has 2TB main memory
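Map-matching, listed under Software above, assigns each GPS record to a road segment. The slides do not say how the stack does this; the sketch below is a deliberately naive nearest-segment version in planar coordinates, with hypothetical segment ids, meant only to show the idea. Production map-matchers typically also use the order of the points and the network topology.

```python
import math

def project_onto_segment(p, a, b):
    """Return the point on segment a-b closest to p (planar coordinates)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a  # degenerate segment
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp to the segment's endpoints
    return (ax + t * dx, ay + t * dy)

def map_match(point, segments):
    """Snap a GPS point to the nearest segment: (segment_id, snapped_point)."""
    best = None
    for seg_id, (a, b) in segments.items():
        q = project_onto_segment(point, a, b)
        d = math.dist(point, q)
        if best is None or d < best[0]:
            best = (d, seg_id, q)
    return best[1], best[2]

# Two hypothetical parallel road segments.
segments = {"s1": ((0, 0), (10, 0)), "s2": ((0, 5), (10, 5))}
print(map_match((5, 1), segments))
```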

Page 15

Eco-Routing Framework

[Figure: eco-routing framework]

• Road network lifting: 3D laser scan point cloud and 2D road network → 3D road network

• Eco-weight initialization (historical GPS data) and eco-weight maintenance (real-time GPS data) → eco-weighted road network

• Query input: source, target, time

• Basic eco-routing → basic eco-routes

• Skyline eco-routing → skyline eco-routes

• Personalized eco-routing → personalized eco-routes
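Once every edge of the road network carries an eco-weight, basic eco-routing can be viewed as a shortest-path search that minimizes fuel instead of distance. The sketch below uses plain Dijkstra on a tiny hypothetical graph with made-up weights in litres. It ignores the time component of the query and is not necessarily the algorithm used in the framework.

```python
import heapq

def eco_route(graph, source, target):
    """Dijkstra's algorithm over edge weights given in litres of fuel."""
    best = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, fuel in graph.get(node, {}).items():
            new_cost = cost + fuel
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                prev[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if target not in best:
        return None, float("inf")
    # Walk the predecessor links back from the target.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], best[target]

# Hypothetical eco-weighted network (litres per edge).
graph = {
    "A": {"B": 0.25, "C": 0.125},
    "B": {"D": 0.5},
    "C": {"D": 0.75},
    "D": {},
}
path, litres = eco_route(graph, "A", "D")
print(path, litres)
```

Here the route via B wins even though the first edge towards C is cheaper, because the total fuel is what counts.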

Page 16

3D Spatial Network

• Spatial network lifting

2D spatial network: OpenStreetMap

Aerial laser scan of Denmark (1+ point per m²; 2.5 TB for Denmark)
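Lifting gives each vertex of the 2D network an elevation taken from the laser scan, so that road grades become available. The sketch below is one simple way to do this, assuming planar coordinates in metres and averaging the laser points within a fixed radius. The radius, the sample points, and the function names are illustrative, not taken from the slides.

```python
import math

def lift_vertex(vertex, points, radius=1.0):
    """Attach an elevation to a 2D vertex: mean z of laser points within radius."""
    x, y = vertex
    zs = [pz for px, py, pz in points if math.hypot(px - x, py - y) <= radius]
    if not zs:
        return None  # no laser points nearby; elevation unknown
    return (x, y, sum(zs) / len(zs))

def grade(a, b):
    """Slope of the lifted segment a-b: rise over horizontal run."""
    run = math.hypot(b[0] - a[0], b[1] - a[1])
    return (b[2] - a[2]) / run

# Hypothetical laser points (x, y, z) around two road vertices 100 m apart.
points = [(0.0, 0.5, 10.0), (0.5, 0.0, 12.0), (100.0, 0.0, 15.0), (100.5, 0.0, 17.0)]
a = lift_vertex((0.0, 0.0), points)
b = lift_vertex((100.0, 0.0), points)
print(a, b, grade(a, b))
```

A linear scan over all points is only workable for a toy example; at 1+ point per m² a spatial index would be needed.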

Page 17

Basic Eco Routing – CPH to the Train

Route  Time   Distance  Fuel
Good   17:08  8.06 km   0.80 l
Bad    18:06  13.64 km  1.23 l
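The two routes above also illustrate the idea behind skyline eco-routing: a route belongs to the skyline if no other route is at least as good on every criterion and strictly better on one. The sketch below applies that dominance test to the two routes, assuming the first column is a travel time in minutes:seconds. The function names are illustrative.

```python
def to_seconds(mmss):
    """Convert 'mm:ss' to seconds (assumes the column is a travel time)."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

def dominates(a, b):
    """True if a is no worse than b on every criterion and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(routes):
    """Keep the routes that no other route dominates."""
    return {
        name: cost
        for name, cost in routes.items()
        if not any(dominates(other, cost) for other in routes.values())
    }

# (travel time in s, distance in km, fuel in litres), from the slide.
routes = {
    "good": (to_seconds("17:08"), 8.06, 0.80),
    "bad": (to_seconds("18:06"), 13.64, 1.23),
}
print(skyline(routes))                                  # only the good route remains
print(f"fuel saved: {1 - 0.80 / 1.23:.0%}")
```

With real alternatives the skyline usually holds several routes, each a different trade-off between time, distance, and fuel.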

Page 18

Napoleon’s Russian Campaign

Page 19

Digression: Methodology

• The same methodology underlies the studies.

• Define precisely a problem of (perceived) real-world interest.

• Develop solutions

Concepts, data structures, algorithms

• Carry out mathematical analyses

Correctness, complexity, storage size

• Prototype the solutions and perform empirical studies

Often, real data is needed

Offers detailed insight into the design properties of the solutions

• Iterate!

Page 20

The Future

• Much more data

Inductive loop detectors

Bus data

Rejsekortet

• Many more connected vehicles

• New services

Routing

Safety and warnings

Parking, fees, insurance, road pricing

Car sharing, multi-modality

• Driverless vehicles

Page 21

Thank you for your attention.

Page 22

Four Prototypes

• http://daisy.aau.dk/its

Point based

Travel-time map, congestion map, and eco route

• http://daisy.aau.dk/its/spqdemo

Trajectory based, Strict-Path Queries

Trips (historical travel-time), route choice, Napoleon (road usage)

• http://daisy.aau.dk/its/sheaf

Trajectory based

Traffic sheaf (advanced, high-performance)

• http://daisy.aau.dk/its/eco

Point-based skyline queries

Advanced weights
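The trajectory-based prototypes answer questions such as "how long did trips along this path take historically?". The sketch below assumes that a strict-path query returns the trajectories that follow a given path segment by segment, without detours, and reports their travel time on that path. The data layout, segment ids, and function name are hypothetical and do not reflect the prototypes' actual interface.

```python
def strict_path_travel_times(trajectories, path):
    """Travel times of trajectories that traverse `path` as a contiguous run."""
    results = {}
    n = len(path)
    for traj_id, visits in trajectories.items():
        segments = [seg for seg, _ in visits]
        for i in range(len(segments) - n + 1):
            if segments[i:i + n] == path:
                # Sum the per-segment travel times over the matched run.
                results[traj_id] = sum(t for _, t in visits[i:i + n])
                break
    return results

# Each trajectory: list of (segment id, travel time in seconds).
trajectories = {
    "t1": [("a", 30), ("b", 45), ("c", 20)],
    "t2": [("a", 35), ("x", 60), ("b", 42), ("c", 25)],  # detour via x
    "t3": [("b", 50), ("c", 15), ("d", 10)],
}
print(strict_path_travel_times(trajectories, ["a", "b", "c"]))  # {'t1': 95}
print(strict_path_travel_times(trajectories, ["b", "c"]))
```

Trajectory t2 visits a, b, and c but leaves the path in between, so it does not count for the longer path.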