Large-scale data analytics for smart cities
The 4th International Workshop on Cyber-Physical Cloud Computing, Osaka, Japan, August 2014.
Payam Barnaghi
Institute for Communication Systems (ICS)
University of Surrey
Guildford, United Kingdom
The Cyber-Physical Cloud Computing Workshop, August 2014, Osaka, Japan
Things, Data, and lots of it
image courtesy: Smarter Data - I.03_C by Gwen Vanhee
Current focus on Big Data
− Emphasis on power of data and data mining solutions
− Technology solutions to handle large volumes of data; e.g. Hadoop, NoSQL, Graph Databases, …
− Trying to find patterns and trends from large volumes of data…
Myths About Big Data
− Big Data is only about massive data volume
− Big Data means Hadoop
− Big Data means unstructured data
− If we have enough data we can draw conclusions (enough here often means massive amounts)
− NoSQL means No SQL
− It is about increasing computational power and taking more data and running data mining algorithms.
Some of the items are adapted from: Brian Gentile, http://mashable.com/2012/06/19/big-data-myths/
What happens if we only focus on data
− Number of burgers consumed per day.
− Number of cats outside.
− Number of people checking their Facebook account.
− What insight would you draw?
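A small sketch of why counts alone can mislead: two series that both happen to grow over the same week correlate almost perfectly even though nothing links them. The daily counts below are made up for illustration and do not come from the talk; `pearson` is a plain textbook implementation of the Pearson correlation coefficient.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical daily counts over one week (invented values).
burgers = [100, 120, 140, 160, 180, 200, 220]
facebook_checks = [1000, 1150, 1300, 1450, 1600, 1750, 1900]
cats_outside = [12, 7, 15, 9, 14, 8, 11]

# Both series rise steadily, so they correlate strongly without any causal link.
r_trend = pearson(burgers, facebook_checks)
# The cat counts fluctuate without a trend, so the correlation is weak.
r_cats = pearson(burgers, cats_outside)

print(f"burgers vs Facebook checks: {r_trend:.3f}")
print(f"burgers vs cats outside:    {r_cats:.3f}")
```

A high coefficient here says only that both counts share a trend; deciding whether that means anything needs context and background knowledge.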
Smart City Data
− Data is multi-modal and heterogeneous
− Noisy and incomplete
− Time and location dependent
− Dynamic and varies in quality
− Crowdsourced data can be unreliable
− Requires (near-) real-time analysis
− Privacy and security are important issues
− Data alone may not give a clear picture: we need contextual information, background knowledge, multi-source information and obviously better data analytics solutions…
Smart City Data
What type of problems do we expect to solve in “smart” cities?
Back to the future
Source: LA Times, http://documents.latimes.com/la-2013/
Future cities: a view from 1998
Source: http://robertluisrabello.com/denial/traffic-in-la/#gallery[default]/0/
Source: Wikipedia
We need an Integrated Approach
Processing steps
[Figure: processing steps. Components: Federation of Heterogeneous Data Streams; IoT Stream Processing; Real-Time IoT Information Extraction; Open Reference Data Sets; On Demand Data Federation; Semantic Integration; Accuracy & Trust Modelling; Real-Time Monitoring & Testing; Knowledge-based Stream Processing; Context-aware Decision Support, Visualisation; Analytics Toolbox; Exposure APIs. Phases: Design-Time, Run-Time, Testing.]
Some of the key issues
− Data collection, representation, interoperability
− Indexing, search and selection
− Storage and provision
− Stream analysis, fusion and integration of multi-source, multi-modal and variable-quality data
− Aggregation, abstraction, pattern extraction and time/location dependencies
− Adaptive learning models for dynamic data
− Reasoning methods for uncertain and incomplete data
− Privacy, trust, security
− Scalability and flexibility of the solutions
Some of our recent work in this domain
Data discovery in the IoT
[Figure: query attributes (Time, Location, Type) pass through query pre-processing to form [#location | #Time | Type]. Components: Information Repository (IR) (archived data; #location, #type), Discovery Server (DS), Gateway. Domains: Device/Sensor domain, Network/Back-end domain, Application/user domain. Distributed/scalable.]
Large-scale data discovery
[Figure: large-scale data discovery. Labels: time, location, type; Query formulating; [#location | #type | time]; Discovery ID; Discovery/DHT Server; Data repository (archived data; #location, #type); Gateway; Core network. Legend: Network Connection, Logical Connection, Data.]
Seyed Amir Hoseinitabatabaei, Payam Barnaghi, Chonggang Wang, Rahim Tafazolli, Lijun Dong, "A Distributed Data Discovery Mechanism for the Internet of Things", 2014.
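The slide shows a query of the form [#location | #type | time] being resolved through a Discovery/DHT server. A minimal sketch of that general idea follows, under stated assumptions: the cited paper's actual hashing and routing scheme is not given on the slides, so the truncated SHA-1 hash, the consistent-hashing ring, and all names here (`h`, `discovery_id`, `DiscoveryRing`, `formulate_query`, the server names) are hypothetical choices made for illustration.

```python
import hashlib
from bisect import bisect_right

def h(value: str) -> int:
    """Stable 32-bit hash of a string (truncated SHA-1; an illustrative choice)."""
    return int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest()[:4], "big")

def discovery_id(location: str, sensor_type: str) -> int:
    """Combine the hashed location and hashed type into one lookup key."""
    return h(f"{h(location)}|{h(sensor_type)}")

def formulate_query(location: str, sensor_type: str, time_range):
    """Build a query in the spirit of [#location | #type | time].

    Location and type are hashed; time is passed through unhashed
    (an assumption, so that a repository could still filter by time range).
    """
    return {"id": discovery_id(location, sensor_type), "time": time_range}

class DiscoveryRing:
    """Consistent-hashing ring that maps a discovery ID to one server."""

    def __init__(self, servers):
        self._ring = sorted((h(name), name) for name in servers)
        self._points = [point for point, _ in self._ring]

    def server_for(self, key: int) -> str:
        # First server clockwise from the key, wrapping around at the end.
        idx = bisect_right(self._points, key) % len(self._ring)
        return self._ring[idx][1]

# Hypothetical server names and query values.
servers = ["ds-1", "ds-2", "ds-3"]
ring = DiscoveryRing(servers)
query = formulate_query("Guildford", "temperature", ("2014-08-01", "2014-08-31"))
target = ring.server_for(query["id"])
print(f"query {query['id']} is routed to {target}")
```

Because the key depends only on location and type, every gateway that hashes the same attributes reaches the same server without a central index, which is the property a distributed, scalable lookup needs.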
Data abstraction
F. Ganz, P. Barnaghi, F. Carrez, "Information Abstraction for Heterogeneous Real World Internet Data", IEEE Sensors Journal, 2013.
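As a rough illustration of what abstracting raw sensor data can mean, the sketch below reduces a numeric stream to a few segment means (piecewise aggregate approximation) and then maps each mean to a coarse label. This is a generic technique chosen for illustration, not a reproduction of the method in the cited paper; the readings, thresholds and label names are invented.

```python
from statistics import mean

def paa(values, segments):
    """Piecewise aggregate approximation: the mean of each of `segments` chunks."""
    n = len(values)
    if segments <= 0 or segments > n:
        raise ValueError("segments must be between 1 and len(values)")
    # Integer boundaries keep every chunk non-empty when segments <= n.
    bounds = [i * n // segments for i in range(segments + 1)]
    return [mean(values[bounds[i]:bounds[i + 1]]) for i in range(segments)]

def to_labels(levels, thresholds, labels):
    """Map each level to the label of the first threshold it falls below."""
    result = []
    for level in levels:
        for threshold, label in zip(thresholds, labels):
            if level < threshold:
                result.append(label)
                break
        else:
            # Above every threshold: use the last label.
            result.append(labels[-1])
    return result

# Hypothetical temperature readings in degrees Celsius.
readings = [18.0, 18.5, 19.0, 22.0, 23.0, 24.0, 27.0, 28.0, 29.0]
levels = paa(readings, 3)
abstraction = to_labels(levels, thresholds=[20.0, 26.0], labels=["low", "medium", "high"])
print(levels, abstraction)
```

Nine raw values become three labels, which is easier to index, search and reason over than the raw stream, at the cost of detail.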
Ontology learning from real world data
Adaptable and dynamic learning methods
http://kat.ee.surrey.ac.uk/
Social media analysis (collaboration with Kno.e.sis)
City Infrastructure
Tweets from a city
P. Anantharam, P. Barnaghi, K. Thirunarayan, A. Sheth, "Extracting city events from social streams", under review, 2014.
https://osf.io/b4q2t/
Correlation analysis
Image source for equilibrium diagram: John D. Hey, The University of York.
Equilibrium in a transient and non-uniform world
Data analytics framework
[Figure: data analytics framework. Elements: Data; Domain Knowledge; Social systems; Interactions; Open Interfaces; Ambient Intelligence; Quality and Trust; Privacy and Security; Open Data.]
101 Smart City Use-case Scenarios
http://www.ict-citypulse.eu/page/content/smart-city-use-cases-and-requirements
In Conclusion
− Smart cities are complex social systems, and no technological or data-analytics-driven solution alone can solve their problems.
− Combining data from Physical, Cyber and Social sources gives more complete, complementary data and contributes to better analysis and insights.
− Intelligent processing methods should be adaptable and handle dynamic, multi-modal, heterogeneous, noisy and incomplete data.
− Effective visualisation and interaction methods are also key to developing successful solutions.
− There are several solutions for different parts of a data analytics framework in smart cities. An integrated approach, in which IoT devices, communication networks, data analytics and learning algorithms, services, and interaction and visualisation methods (and their optimisation algorithms) work together, is more effective.
Q&A
− Thank you.
− EU FP7 CityPulse Project:
http://www.ict-citypulse.eu/
@ictcitypulse