Smart Cities and Big Data- 1
SUSTAINABLE SYSTEMS
Smart Cities and Big Data
Based in part on material from
Professor Pang-Ning Tan
Dept of Computer Science & Engineering
Michigan State University
Website: http://www.cse.msu.edu/~ptan
Smart Cities and Big Data- 2
Google Trends (http://www.google.com/trends)
‘Big Data’ trend: Jan. 2004 – Sept. 2019
[Chart: relative search interest over time, vertical axis 0 to 100; the only surviving time-axis label is 2014]
Smart Cities and Big Data- 3
Big Data: How Much Data is Out There?
Source: https://www.emc.com/collateral/analyst-reports/idc-digital-universe-2014.pdf
Smart Cities and Big Data- 4
Google Trends (http://www.google.com/trends)
‘Smart Cities’ trend: Jan. 2004 – Apr. 2018
Smart Cities and Big Data- 5
Big Data for Smart Cities
“Smart cities, founded on the use of information and communication technologies, aim at tackling many local problems, from local economy and transportation to quality of life and e-governance.”
Martínez-Ballesté et al., IEEE Communications, 2013.
Smart Cities and Big Data- 6
Should cities bother to collect big data?
• Pros
  Can lead to better management of resources, services, etc.
  Can lead to better predictions of patterns of use, of trouble, and of resource demand.
• Cons
  Hard to identify the key data to collect.
  Expensive to measure in the field.
  Requires expertise to build such systems.
Smart Cities and Big Data- 7
Quality of the data
Stamp's Statistical Probability
“The government [is] extremely fond of amassing great quantities of statistics. These are raised to the nth degree, the cube roots are extracted, and the results are arranged into elaborate and impressive displays. What must be kept ever in mind, however, is that in every case, the figures are first put down by a village watchman, and he puts down anything he damn well pleases.”
(Attributed to Sir Josiah Stamp, 1880-1941, H.M. Collector of Inland Revenue.)
Smart Cities and Big Data- 8
How can big data lead to better performance?
• Command and Control (C&C) approach
  Measure current system behavior
  Compare to target (desired) behavior
  Issue commands to meet targets
• C&C diagram
  Feedback link is critical
Smart Cities and Big Data- 9
C&C diagram: general
[Diagram: targets → Compare and act → inputs → Actual system → outputs (actual data); Sensors turn the actual data into measured data, which feeds back to Compare and act]
Smart Cities and Big Data- 10
Example: vehicle speed control
[Diagram: target speed → Compare and act → gas pedal → Actual vehicle → actual speed; Sensors (tach) turn the actual speed into measured speed, which feeds back to Compare and act]
Whether a driver or an automated speed control system is in charge, the same structure works.
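The compare-and-act loop for speed control can be sketched in a few lines of Python. This is a minimal sketch, not a real cruise-control design: it assumes a proportional controller and a toy vehicle model, and the function name `simulate_speed_control`, the gain, and the drag term are all illustrative assumptions that do not come from the slides.

```python
def simulate_speed_control(target_speed, steps=200, gain=0.5, drag=0.1, dt=0.1):
    """Toy feedback loop: compare measured speed to the target, then act on the pedal.

    The vehicle model (acceleration = pedal - drag * speed) and every
    parameter value are assumptions made purely for illustration.
    """
    speed = 0.0      # actual speed of the vehicle
    history = []
    for _ in range(steps):
        measured = speed                      # sensor: assumed to measure perfectly
        error = target_speed - measured       # compare measured speed to target
        pedal = max(0.0, gain * error)        # act: proportional command, no braking
        speed += (pedal - drag * speed) * dt  # the actual system responds
        history.append(speed)
    return history


speeds = simulate_speed_control(target_speed=30.0)
print(f"final speed: {speeds[-1]:.2f}")
```

With these assumed values the speed settles near 25 rather than 30: a purely proportional command balances against drag before the target is reached. The loop only corrects at all because the measured speed is fed back into the comparison, which is why the feedback link is the critical part of the diagram.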
Smart Cities and Big Data- 11
Example: traffic management in an urban environment
[Diagram: target flows → Compare and act → set traffic lights → Street grid → actual flows; Sensors turn the actual flows into measured flows, which feed back to Compare and act]
Consider the amount of data involved.
Smart Cities and Big Data- 12
NASA EOSDIS: a growing source of ‘Big Data’
NASA’s Earth Observing System Data and Information System (EOSDIS) is in the middle of a critical project prototyping, testing, and evaluating a significant change in the way data users access and use NASA Earth Observation (EO) data. Ironically, data users likely will not even notice if this change is implemented. What they will notice is more efficient access to more data and the ability to do more with these data.
The change being considered is moving EOSDIS data to the cloud. This move would not only be a logical technical evolution for EOSDIS, but also a proactive effort to provide broader access to a data archive that is expected to grow significantly over the next several years.
https://earthdata.nasa.gov/about/eosdis-cloud-evolution, 09/11/2018
Smart Cities and Big Data- 13
NASA EOSDIS: a growing source of ‘Big Data’
Smart Cities and Big Data- 14
Some social media sources of ‘Big Data’
• YouTube: 400 hours of video uploaded every minute (https://expandedramblings.com/index.php/youtube-statistics/)
• Facebook: 600 TB of data per day (2014) (https://code.facebook.com/posts/229861827208629/scaling-the-facebook-data-warehouse-to-300-pb/?_fb_noscript=1)
• Instagram: 60 million photos posted per day (2014) (http://get.simplymeasured.com/rs/simplymeasured2/images/InstagramStudy2014Q3.pdf)
• Twitter: 250 million tweets posted per day (2011) (http://highscalability.com/blog/2011/12/19/how-twitter-stores-250-million-tweets-a-day-using-mysql.html)
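The figures in this list use different time units, so it can help to put them on a common footing. The short sketch below only rescales the numbers quoted on the slide; the one added assumption is the decimal convention 1 TB = 1000 GB.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_DAY = 24 * 60

# Figures as quoted on the slide (for the years cited there).
youtube_hours_per_minute = 400
facebook_tb_per_day = 600
instagram_photos_per_day = 60_000_000
tweets_per_day = 250_000_000

youtube_hours_per_day = youtube_hours_per_minute * MINUTES_PER_DAY
# Assumption: decimal units, 1 TB = 1000 GB.
facebook_gb_per_second = facebook_tb_per_day * 1000 / SECONDS_PER_DAY
instagram_photos_per_second = instagram_photos_per_day / SECONDS_PER_DAY
tweets_per_second = tweets_per_day / SECONDS_PER_DAY

print(f"YouTube:   {youtube_hours_per_day:,} hours of video per day")
print(f"Facebook:  {facebook_gb_per_second:.2f} GB per second")
print(f"Instagram: {instagram_photos_per_second:.0f} photos per second")
print(f"Twitter:   {tweets_per_second:.0f} tweets per second")
```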
Smart Cities and Big Data- 15
Examples of Big Data for Smart Cities
• Sensor time series
• Surveillance video streams
• GPS trajectories from mobile devices
• Smart card
• Social media
• Structured data
Smart Cities and Big Data- 16
Trendsmap (http://www.trendsmap.com)
Smart Cities and Big Data- 17
Example: Sustainable Transportation
Use a big data approach for:
  Bike station placement prediction
  Demand forecasting
Smart Cities and Big Data- 18
Bike sharing station placement
• Previous studies have shown that placement should be based on:
  Area function (high demand near residential areas, transit hubs, and tourist attractions)
  Human activity (people rent a bike for commuting, shopping, entertainment, and personal errands)
  Demographics (users tend to be younger, highly educated, and less affluent)
• How can big data be used to determine the placement locations?
  Google Places API – provides information about businesses and points of interest
  Foursquare API – provides user check-ins to restaurants and other places
  Data.gov – US government open data portal (to obtain demographic data about an area)
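One way the three kinds of data could be combined is a weighted score per candidate location. The sketch below is purely illustrative: the `Candidate` fields, the weights, and the sample numbers are hypothetical, and no real API is called. In practice the counts would have to be obtained from sources such as those listed above.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    poi_count: int      # nearby businesses / points of interest (hypothetical values)
    checkins: int       # user check-ins as a proxy for human activity (hypothetical)
    young_share: float  # share of residents in the target age group, 0..1 (hypothetical)


def min_max(values):
    """Rescale values to the range 0..1 so the features are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_candidates(candidates, weights=(0.4, 0.4, 0.2)):
    """Return (score, name) pairs, best first. The weights are assumed, not fitted."""
    pois = min_max([c.poi_count for c in candidates])
    activity = min_max([c.checkins for c in candidates])
    young = min_max([c.young_share for c in candidates])
    scored = [
        (weights[0] * p + weights[1] * a + weights[2] * y, c.name)
        for c, p, a, y in zip(candidates, pois, activity, young)
    ]
    return sorted(scored, reverse=True)


candidates = [
    Candidate("transit hub", poi_count=120, checkins=900, young_share=0.35),
    Candidate("residential block", poi_count=40, checkins=150, young_share=0.50),
    Candidate("industrial park", poi_count=10, checkins=20, young_share=0.10),
]
ranking = rank_candidates(candidates)
for score, name in ranking:
    print(f"{name}: {score:.3f}")
```

A fixed-weight score is the simplest possible choice; a model fitted to usage data from existing stations would be the natural next step.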
Smart Cities and Big Data- 19
Bike sharing demand forecasting
Spatio-temporal prediction – where and when?
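A very simple baseline for the "where and when" question is the historical average number of pickups per station, weekday, and hour. The sketch below assumes trip records of the form (station, pickup time); the function names and the sample trips are hypothetical, and a real forecast would also use weather, events, and neighboring stations.

```python
from collections import defaultdict
from datetime import datetime


def fit_baseline(trips):
    """trips: iterable of (station_id, pickup_datetime).

    Returns the average pickups per (station, weekday, hour) slot.
    Limitation: days with zero pickups in a slot are not seen in the
    trip log, so averages are biased upward for quiet slots.
    """
    per_day = defaultdict(int)  # pickups per (station, date, hour)
    for station, ts in trips:
        per_day[(station, ts.date(), ts.hour)] += 1

    totals = defaultdict(int)
    days_seen = defaultdict(int)
    for (station, day, hour), n in per_day.items():
        slot = (station, day.weekday(), hour)
        totals[slot] += n
        days_seen[slot] += 1
    return {slot: totals[slot] / days_seen[slot] for slot in totals}


def predict(model, station, when):
    """Expected pickups at `station` during the hour containing `when`."""
    return model.get((station, when.weekday(), when.hour), 0.0)


# Hypothetical trip log: two Mondays, 08:00-09:00, at station "A".
trips = [("A", datetime(2024, 1, 1, 8, m)) for m in (5, 20, 40)]
trips += [("A", datetime(2024, 1, 8, 8, m)) for m in (0, 10, 15, 30, 50)]

model = fit_baseline(trips)
forecast = predict(model, "A", datetime(2024, 1, 15, 8, 0))
print(forecast)
```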
Smart Cities and Big Data- 20
Data flow diagram - 1
[Diagram, COLLECTION PHASE: instruments → Raw data → Initial processing → Data storage]
[Diagram, ANALYSIS PHASE: Data storage → Analysis for insight → Display for insight]
Smart Cities and Big Data- 21
The 4 V’s of Big Data
Volume
Variety
Veracity
Velocity
The ‘Value added’ question:
Should we add the capability to handle big data of type ‘X’, in view of the cost?
Smart Cities and Big Data- 22
Data flow diagram - 1
[Diagram: instruments → Raw data → Initial processing → Data storage → Analysis for insight → Display for insight, annotated with the labels VELOCITY; VERACITY, VARIETY; VOLUME; and VALUE added (twice)]
Smart Cities and Big Data- 23
Initial processing of data
Noise
Outliers
Missing values
Overlapping data
Varying formats, scales, etc.
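Several of these issues can be handled in one cleaning pass. The sketch below is a minimal illustration for temperature readings and rests on several assumptions: readings arrive as (timestamp, value, unit) tuples, implausible values are treated as missing, duplicates keep the first usable reading, and gaps are filled by carrying the last valid value forward. Noise smoothing is left out. The function name and the valid range are hypothetical.

```python
def clean_readings(readings, valid_range=(-40.0, 60.0)):
    """readings: list of (timestamp, value, unit); unit is 'C' or 'F', value may be None.

    Returns a time-ordered list of (timestamp, value_in_celsius).
    """
    by_time = {}
    for ts, value, unit in readings:
        if value is not None and unit == "F":
            value = (value - 32.0) * 5.0 / 9.0  # varying scales: convert to Celsius
        if value is not None and not (valid_range[0] <= value <= valid_range[1]):
            value = None                        # outlier: treat as missing
        # Overlapping data: keep the first usable reading per timestamp.
        if ts not in by_time or by_time[ts] is None:
            by_time[ts] = value

    result, last_valid = [], None
    for ts, value in sorted(by_time.items()):
        if value is None:
            value = last_valid                  # missing value: carry forward
        if value is not None:
            result.append((ts, value))
            last_valid = value
    return result


raw = [
    (0, 20.0, "C"),
    (1, 68.0, "F"),   # same temperature, different scale
    (1, 20.0, "C"),   # overlapping reading for timestamp 1
    (2, None, "C"),   # missing value
    (3, 999.0, "C"),  # implausible outlier
    (4, 21.0, "C"),
]
cleaned = clean_readings(raw)
print(cleaned)
```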
Smart Cities and Big Data- 24
Types of processing tasks
Anomaly detection
Descriptive statistics
Clustering
Association
Prediction
Smart Cities and Big Data- 25
Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes
11   No      Married         60K             No
12   Yes     Divorced        220K            No
13   No      Single          85K             Yes
14   No      Married         75K             No
15   No      Single          90K             Yes
Data
Complex Data Mining Tasks
Ranking/Recommendation
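The table on this slide is small enough to summarize directly, which illustrates the descriptive-statistics task. The sketch below copies the 15 rows from the slide and counts how often Cheat is "Yes" for each marital status; the variable names are illustrative.

```python
from collections import Counter

# Rows copied from the slide: (Tid, Refund, Marital Status, Taxable Income in $K, Cheat)
records = [
    (1, "Yes", "Single", 125, "No"),
    (2, "No", "Married", 100, "No"),
    (3, "No", "Single", 70, "No"),
    (4, "Yes", "Married", 120, "No"),
    (5, "No", "Divorced", 95, "Yes"),
    (6, "No", "Married", 60, "No"),
    (7, "Yes", "Divorced", 220, "No"),
    (8, "No", "Single", 85, "Yes"),
    (9, "No", "Married", 75, "No"),
    (10, "No", "Single", 90, "Yes"),
    (11, "No", "Married", 60, "No"),
    (12, "Yes", "Divorced", 220, "No"),
    (13, "No", "Single", 85, "Yes"),
    (14, "No", "Married", 75, "No"),
    (15, "No", "Single", 90, "Yes"),
]

total_by_status = Counter(status for _, _, status, _, _ in records)
cheat_by_status = Counter(
    status for _, _, status, _, cheat in records if cheat == "Yes"
)
cheat_rate = {s: cheat_by_status[s] / total_by_status[s] for s in total_by_status}

# How many records combine Refund = Yes with Cheat = Yes?
refund_yes_cheats = sum(
    1 for _, refund, _, _, cheat in records if refund == "Yes" and cheat == "Yes"
)

for status in sorted(cheat_rate):
    print(f"{status}: {cheat_by_status[status]}/{total_by_status[status]} cheat")
print(f"Refund=Yes and Cheat=Yes: {refund_yes_cheats}")
```

Summaries like these are often the first hint of which attributes a predictive model could use; with only 15 rows they are an illustration, not evidence.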
Smart Cities and Big Data- 26
Predictive Modeling example
Object detection for
autonomous vehicle driving
Smart Cities and Big Data- 27
Cluster Analysis example
Crime hotspot detection
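As a rough illustration of hotspot detection, incident locations can be binned into grid cells and the densest cells reported. This grid-count approach is a deliberately simple stand-in for a full clustering algorithm such as k-means or DBSCAN; the function name, cell size, threshold, and sample coordinates are all hypothetical.

```python
from collections import Counter


def find_hotspots(points, cell_size=1.0, min_count=3):
    """Bin (x, y) incident locations into square grid cells.

    Returns (cell, count) pairs for cells holding at least `min_count`
    incidents, densest first. A cell is identified by its integer
    (column, row) index.
    """
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in points
    )
    dense = [(cell, n) for cell, n in counts.items() if n >= min_count]
    return sorted(dense, key=lambda item: -item[1])


# Hypothetical incident coordinates in arbitrary map units.
incidents = [
    (0.2, 0.3), (0.5, 0.9), (0.7, 0.1), (0.9, 0.4),  # dense area near the origin
    (5.1, 5.2), (5.3, 5.8), (5.6, 5.5),              # second dense area
    (9.0, 2.0),                                      # isolated incident
]
hotspots = find_hotspots(incidents)
print(hotspots)
```

Fixed grid cells can split a real hotspot across a cell boundary, which is one reason density-based clustering is usually preferred on real data.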
Smart Cities and Big Data- 28
Anomaly Detection
Detect significant deviations from normal
observations
Smart Cities and Big Data- 29
Anomaly Detection examples
Smart Transportation
• Congestion detection
• Sensor fault detection
Smart Home/Building
• Water theft detection
• Pipe burst detection
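A common starting point for examples like pipe burst or sensor fault detection is to compare each new reading with the recent past. The sketch below assumes a single sensor series and flags readings that lie more than a chosen number of standard deviations from the mean of a trailing window; the function name, window size, threshold, and sample flow values are hypothetical.

```python
from statistics import mean, stdev


def detect_anomalies(series, window=5, threshold=3.0):
    """Return the indices whose value deviates strongly from the trailing window.

    A reading is flagged when |value - window mean| exceeds
    `threshold` window standard deviations.
    """
    anomalies = []
    for i in range(window, len(series)):
        recent = series[i - window:i]
        mu, sigma = mean(recent), stdev(recent)
        if sigma == 0:
            # Perfectly flat window: any change at all is a deviation.
            if series[i] != mu:
                anomalies.append(i)
            continue
        if abs(series[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical water-flow readings with one sudden spike.
flow = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 9.8, 25.0, 10.0, 10.2]
flagged = detect_anomalies(flow)
print(flagged)
```

Note that the spike itself enters the following windows and inflates their standard deviation, so a second anomaly right after the first could be missed; robust statistics such as the median are one way to reduce that effect.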
Smart Cities and Big Data- 30
Big Data Challenge:
Privacy
Smart Cities and Big Data- 31
Challenge: Privacy and Security
Example: Hacking into database for private purposes …
Smart Cities and Big Data- 32
Thanks For Listening!