1
Transforming Big Data into Smart Data for Smart Energy: Deriving Value via harnessing Volume, Variety and Velocity
Amit Sheth, Kno.e.sis, Wright State University
2
Power Grids: A Historical Perspective on Complexity
Late 1800s, before Alternating Current (AC): High System Complexity! Separate power lines for different voltages.
1900s, after Alternating Current: Moderate System Complexity + Low Data Complexity. AC as a boon for electric companies.
Today, during/after Smart Grid: High System + Data Complexity! Smart Grid = high volume, variety and velocity.
http://en.wikipedia.org/wiki/Electric_power_transmission
3
Big Data in Smart Grid
One data point per month → 96 million data points / day / million consumers
Low instrumentation of the power grid with sensors → High instrumentation of the power grid with sensors
Low number of energy sources → High proliferation of cleaner energy sources like renewable energy
http://www.smartgridupdate.com/dataforutilities/pdf/DataManagementWhitePaper.pdf
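The figure of 96 million data points per day per million consumers is consistent with one reading per meter every 15 minutes (4 per hour × 24 hours = 96). The slide does not state the reporting interval, so the 15-minute value below is an assumption, and the function name is illustrative only.

```python
# Assumption: each smart meter reports once every 15 minutes (not stated on the slide).
READINGS_PER_HOUR = 4
HOURS_PER_DAY = 24


def daily_data_points(consumers: int, readings_per_hour: int = READINGS_PER_HOUR) -> int:
    """Number of meter readings generated per day by `consumers` meters."""
    return consumers * readings_per_hour * HOURS_PER_DAY


if __name__ == "__main__":
    # One million consumers at 15-minute intervals.
    print(daily_data_points(1_000_000))  # 96000000
```

By contrast, one reading per month per consumer is a single data point in the same period in which an instrumented meter would produce roughly 2,900.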
4
Sources of Big Data in Smart Grid
Velocity Volume
Variety
Veracity
Original 3Vs: Doug Laney: http://goo.gl/wH3qG From: http://www.smartgridupdate.com/dataforutilities/pdf/DataManagementWhitePaper.pdf
5
Big Data Analytics in Smart Grid
6
• What if your data volume gets so large and varied you don't know how to deal with it?
• Do you store all your data?
• Do you analyze it all?
• How can you find out which data points are really important?
• How can you use it to your best advantage?
Questions typically asked on Big Data
http://www.sas.com/big-data/
7
http://techcrunch.com/2012/10/27/big-data-right-now-five-trendy-open-source-technologies/
Variety of Data Analytics Enablers
8
• Prediction of the spread of flu in real time during H1N1 2009
– Google tested a mammoth 450 million different mathematical models against the search terms, comparing their predictions against the actual flu cases; 45 important parameters were found
– The model was tested when the H1N1 crisis struck in 2009 and gave more meaningful and valuable real-time information than any official public health system [Big Data, Viktor Mayer-Schonberger and Kenneth Cukier, 2013]
• FareCast: predict the direction of air fares over different routes [Big Data, Viktor Mayer-Schonberger and Kenneth Cukier, 2013]
• NY city manholes problem [ICML Discussion, 2012]
Illustrative Big Data Applications
9
• Current focus is mainly on serving business intelligence and targeted analytics needs, not on serving complex individual and collective human needs (e.g., empowering humans in health, fitness and well-being; better disaster coordination; smart energy consumption) that are highly personalized/individualized/contextualized
– Incorporate real-world complexity: multi-modal and multi-sensory nature of the real world and human perception
– Need deeper understanding of data and its role in information (e.g., skew, coverage)
– Beyond correlation -> causation :: actionable info, decisions grounded on insights
• Human involvement and guidance: leading to actionable information, understanding and insight right in the context of human activities
– Bottom-up & top-down processing: infusion of models and background knowledge (data + knowledge + reasoning)
What is missing?
10
Contextual
Information Smart Data
Makes Sense
Actionable or help decision support/making
11
Smart Data
Smart data makes sense out of Big data
It provides value by harnessing the challenges posed by the volume, velocity, variety and veracity of big data, in turn providing actionable information and improving decision making.
12
“OF human, BY human and FOR human”
Smart data is focused on the actionable value achieved by human
involvement in data creation, processing and consumption phases
for improving the human experience.
Another perspective on Smart Data
13
• Focus on verticals: advertising‚ social media‚ retail‚ financial services‚ telecom‚ and healthcare
– Aggregate data, focused on transactions, limited integration (limited complexity), analytics to find (simple) patterns
– Emphasis on technologies to handle volume/scale, and to lesser extent velocity: Hadoop, NoSQL,MPP warehouse ….
– Full faith in the power of data (no hypothesis), bottom up analysis
Current Focus on Big Data
14
Descriptive
Exploratory
Inferential
Predictive
Causal
Improved Analytics CREATION
PROCESSING
EXPERIENCE & DECISION MAKING
Human Centric Computing
15
“OF human, BY human and FOR human”
Another perspective on Smart Data
16
Petabytes of Physical (sensory)-Cyber-Social Data every day!
More on PCS Computing: http://wiki.knoesis.org/index.php/PCS
‘OF human’ : Relevant Real-time Data Streams for Human Experience
17
“OF human, BY human and FOR human”
Another perspective on Smart Data
Use of Prior Human-created Knowledge Models
18
‘BY human’: Involving Crowd Intelligence in data processing workflows
Crowdsourcing and Domain-expert guided Machine Learning Modeling
19
“OF human, BY human and FOR human”
Another perspective on Smart Data
20
Electricity usage over a day, device at work, power consumption, cost/kWh,
heat index, relative humidity, and public events from social stream
Weather Application
Power Monitoring Application
‘FOR human’ : Improving Human Experience
Population Level Observations
Personal Level Observations
Action in the Physical World
Washing and drying has resulted in significant cost
since it was done during peak load period. Consider
changing this time to night.
21
What matters?
Personal and Population Level Observations
Actionable information for optimized resource utilization
“The challenge for utilities in maximizing the benefits from smart grid data analytics is the ability to turn the huge volume of smart grid data into value”
- Marianne Hedin, Senior Research Analyst, Navigant Research
22
Why do we care about Smart Data rather than Big Data?
Transforming Big Data into Smart Data for Smart Energy: Deriving Value via harnessing Volume, Variety and Velocity
using semantics and Semantic Web
Keynote at Building Research Collaborations: Electricity Systems @ Purdue, August 28-29, 2013
Pavan Kapanipathi
Pramod Anantharam
Amit Sheth
Cory Henson
Dr. T.K. Prasad
Maryam Panahiazar
Contributions by many, but Special Thanks to:
Hemant Purohit
Special Thanks
The Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State, USA
24
10 Years Ago, August 14, 2003 Blackout!
http://www.npr.org/2013/08/14/210620446/10-years-after-the-blackout-how-has-the-power-grid-changed
Robert Giroux/Getty Images
25
50 Million People without Power in 5 Northeastern States of US
http://www.npr.org/2013/08/14/210620446/10-years-after-the-blackout-how-has-the-power-grid-changed
Jonathan Fickies/Getty Images
26
$6 Billion Lost Revenue
http://www.scientificamerican.com/article.cfm?id=2003-blackout-five-years-later http://www.npr.org/2013/08/14/210620446/10-years-after-the-blackout-how-has-the-power-grid-changed
Julie Jacobson/AP
Utilities are hit with millions of dollars in fines when such blackouts happen, costing them on average 1 million dollars a day!
27
Cause of the Problem: Informal Investigation
Excessive summer heat (31° C or 88° F) caused consumers to draw excess power for running air conditioners. Heating of the power lines made the cables sag and touch vegetation, creating a fault.
FirstEnergy (FE) Corporation’s control room had a failed alarm system further propagating the fault (cascading effect).
Lack of situational awareness by the control room is only one aspect of the problem. The problem is deeply rooted in consumer awareness for making
informed decisions
http://www.npr.org/2013/08/14/210620446/10-years-after-the-blackout-how-has-the-power-grid-changed
28
Cause of the Problem: Official Investigation
The U.S.-Canada Power System Outage Task Force reported four major causes leading to the blackout:
1) "failed to assess and understand the inadequacies of FE's system, particularly with respect to voltage instability and the vulnerability of the Cleveland-Akron area, and FE did not operate its system with appropriate voltage criteria."
2) "did not recognize or understand the deteriorating condition of its system."
3) "failed to manage adequately tree growth in its transmission rights-of-way."
4) "failure of the interconnected grid's reliability organizations to provide effective real-time diagnostic support."
http://en.wikipedia.org/wiki/Northeast_blackout_of_2003
29
"We've done some things that will reduce the risks of the blackouts that happened last time, but haven't done things that would prevent the next blackout”
-- Paul Hines, University of Vermont
Can we Prevent such Blackouts?
“we have new sensors installed in the grid, but utilities don't totally understand what to do with all the data”
-- Paul Hines, University of Vermont
http://epaabuse.com/5159/news/after-coal-plants-close-where-does-america-get-cheap-electricity/
30
How could Smart Data help?
Value: Utilities Context
31
Derive Insights from Smart Grid Data
"Big data .. for utility companies.. can turn the information from smart meter and smart grid projects into meaningful operational insights and insights about their customer’s behavior."
- Big Data in Action, IBM
http://www.ecomagination.com/portfolio/ges-grid-iq-advanced-metering-infrastructureami-point-to-multipoint-p2mp-solution http://gkenergyproject.blogspot.com/2010/07/smart-meter-diagram.html
32
Power Grid Control Rooms are Complex!
Pacific Gas and Electric Company in California has collected over 70 terabytes of AMI (Advanced Metering Infrastructure) data and this volume is increasing by 3 terabytes a month
- Data Management And Analytics for Utilities, Smart Grid Update, 2013
http://www.rugeleypower.com/electricity-generation/producing-electricity.php
33
Multimodal, Multisensory, and Real-time Observations
Synchrophasor data
Heat index, relative humidity
Current Grid Conditions
Renewable energy generation forecast
What is the overall health of the Grid?
What are the vulnerabilities for today?
Power consumption by consumers
http://www.rugeleypower.com/electricity-generation/producing-electricity.php
34
Grid Health Score (diagnostic)
Semantic Perception and risk assessment algorithms can transform raw data (hard to comprehend) into abstractions (e.g., Grid Health is 3 on a scale of 5) that are intuitively understandable and valuable for decision makers.
Having a health score for various parts of a grid will allow efficient utilization of a decision maker’s precious attention
Risk assessment model
Semantic Perception
Synchrophasor data
Heat index, relative humidity
Current Grid Conditions
Renewable energy generation forecast
Power consumption by consumers
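As a rough illustration of how raw observations might be abstracted into a 1-to-5 health score, the sketch below averages a few normalized risk indicators and maps the result onto the scale. The fields, thresholds, and equal weighting are all hypothetical; the slides do not specify the risk assessment model, and a real one would be built and validated with domain knowledge.

```python
from dataclasses import dataclass


@dataclass
class GridObservation:
    """Hypothetical snapshot of one part of the grid (all fields are assumptions)."""
    line_loading: float               # fraction of rated line capacity in use
    heat_index_c: float               # heat index in degrees Celsius
    renewable_forecast_error: float   # 0 (accurate forecast) .. 1 (unusable)
    phase_angle_deviation_deg: float  # derived from synchrophasor data


def clamp01(x: float) -> float:
    """Limit a value to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def health_score(obs: GridObservation) -> int:
    """Map observations to a health score from 1 (worst) to 5 (best)."""
    risks = [
        clamp01(obs.line_loading),
        clamp01((obs.heat_index_c - 25.0) / 20.0),    # assumed: 25 C safe, 45 C critical
        clamp01(obs.renewable_forecast_error),
        clamp01(obs.phase_angle_deviation_deg / 30.0),  # assumed: 30 degrees is critical
    ]
    risk = sum(risks) / len(risks)  # equal weights, purely illustrative
    return 5 - round(risk * 4)


if __name__ == "__main__":
    print(health_score(GridObservation(0.5, 35.0, 0.5, 15.0)))  # 3
```

The point of the abstraction is that a decision maker sees one number per grid segment instead of four streams of raw readings.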
35
Vulnerability Score (prognostic)
Vulnerability score (e.g., today’s vulnerability score is 4 on a scale of 5) is an abstraction that uses the current state of the grid (health score), power demand forecast, availability of alternative energy sources, and historical consumer behavior
Vulnerability score will alleviate the data deluge problem of decision makers by leveraging prior knowledge of the domain for creating risk assessment models
Risk assessment model
Semantic Perception
Synchrophasor data
Heat index, relative humidity
Current Grid Conditions
Renewable energy generation forecast
Power consumption by consumers
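A prognostic score of this kind could, for instance, combine the four inputs named above into a single 1-to-5 value. The sketch below is only one possible formulation: the parameter names, normalization, and equal weighting are assumptions, not the model used by the authors.

```python
def clamp01(x: float) -> float:
    """Limit a value to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def vulnerability_score(
    health: int,                     # current health score, 1 (worst) .. 5 (best)
    demand_forecast_mw: float,       # forecast power demand
    capacity_mw: float,              # conventional capacity available
    alternative_mw: float,           # alternative energy expected to be available
    historical_overshoot: float,     # share of similar past days where demand beat forecast
) -> int:
    """Return a vulnerability score from 1 (low) to 5 (high). Illustrative only."""
    health_risk = (5 - health) / 4
    net_demand = max(0.0, demand_forecast_mw - alternative_mw)
    demand_risk = clamp01(net_demand / capacity_mw)
    behavior_risk = clamp01(historical_overshoot)
    risk = (health_risk + demand_risk + behavior_risk) / 3  # equal weights assumed
    return 1 + round(risk * 4)


if __name__ == "__main__":
    print(vulnerability_score(3, 900.0, 1000.0, 100.0, 0.95))  # 4
```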
36
Value: Consumer Context
How could Smart Data help?
37
“To make good on the promise of a truly “smart” grid, the industry must continue to implement equipment that employs distributed intelligence, out to the edges of the distribution system.” -- Layered Intelligence Smart Grid Solutions, S&C Electric Company
“Intelligence at the Edges” of a Smart Grid
http://www.digikey.com/us/en/techzone/energy-harvesting/resources/articles/zigbees-smart-energy-20-profile.html
38
Data Overload for Consumers
“They respond well to suggestions to do something.” - Alex Laskey, President and Founder of Opower
Personal Schedule
Smart Meters
Power Consumption
Temperature, relative humidity
Dynamic pricing information
http://www.identika.com/2012/02/every-movie-made/
39
Optimizing Cost, Benefit, and Preferences
Algorithms on the consumer side of the Smart Grid should consider cost, benefit, and preference of the user to devise an optimal strategy for power consumption
Which devices are contributing to a higher power bill?
When should I operate the washer/dryer?
How much convenience am I willing to forgo?
Semantic Perception
Personalized optimization
Personalized recommendation
Img: http://marloncarvallovillae.blogspot.com/2011_02_01_archive.html http://www.1800timeclocks.com/icon-time-systems/icon-time-upgrades/icon-time-advanced-pack-upgrade-sb100-pro/
Personal Schedule
Smart Meters
Power Consumption
Temperature, relative humidity
Dynamic pricing information
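To make the washer/dryer question concrete, here is a minimal sketch that picks the cheapest start hour from a set of hours the user finds acceptable. The tariff (peak price between 14:00 and 20:00), the appliance load, and the allowed hours are invented for illustration; a real recommender would also weigh benefit and convenience, which this sketch reduces to the list of allowed start hours.

```python
# Hypothetical dynamic tariff in cents per kWh: peak from 14:00 to 19:59.
PRICES_CENTS = [30 if 14 <= hour < 20 else 10 for hour in range(24)]


def run_cost_cents(prices, start_hour, duration_h, load_kw):
    """Cost of running a constant load for `duration_h` hours from `start_hour`."""
    return sum(prices[(start_hour + i) % 24] for i in range(duration_h)) * load_kw


def cheapest_start(prices, allowed_starts, duration_h, load_kw):
    """Return (cost, start_hour) of the cheapest allowed start; ties go to the earliest hour."""
    return min(
        (run_cost_cents(prices, start, duration_h, load_kw), start)
        for start in allowed_starts
    )


if __name__ == "__main__":
    # A 3 kW washer/dryer cycle lasting 2 hours; the user accepts starts from 08:00 to 22:00.
    print(run_cost_cents(PRICES_CENTS, 17, 2, 3))           # 180 (peak period)
    print(cheapest_start(PRICES_CENTS, range(8, 23), 2, 3))  # (60, 8)
```

Under these assumed prices, moving the cycle out of the peak period cuts its cost to a third, which is the kind of actionable recommendation the slide describes.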
41
Big Data to Smart Data: A peek at some domains
Healthcare Social Media & Disaster Response
http://theshannoncompany.com.au/blog/?p=504
44
Sensing is a key enabler of the Internet of Things
BUT, how do we make sense of the resulting avalanche of sensor data?
50 Billion Things by 2020 (Cisco)
45
… and do it efficiently and at scale
What if we could automate this sense making ability?
46
Making sense of sensor data with
47
People are good at making sense of sensory input
What can we learn from cognitive models of perception?
• The key ingredient is prior knowledge
48
Perception Cycle*
Observe Property, Perceive Feature
1 Explanation: Translating low-level signals into high-level knowledge
2 Discrimination: Focusing attention on those aspects of the environment that provide useful information
Prior Knowledge
* based on Neisser’s cognitive model of perception
49
To enable machine perception,
Semantic Web technology is used to integrate sensor data with prior knowledge on the Web
50
Prior knowledge on the Web
W3C Semantic Sensor Network (SSN) Ontology Bi-partite Graph
52
Observe Property
Perceive Feature
1 Explanation
Translating low-level signals into high-level knowledge
Explanation
Explanation is the act of choosing the objects or events that best account for a set of observations; often referred to as hypothesis building
53
Explanation
Inference to the best explanation
• In general, explanation is an abductive problem, and hard to compute
Finding the sweet spot between abduction and OWL
• Single-feature assumption* enables use of OWL-DL deductive reasoner
* An explanation must be a single feature which accounts for all observed properties
Explanation is the act of choosing the objects or events that best account for a set of observations; often referred to as hypothesis building
54
Explanation
Explanatory Feature: a feature that explains the set of observed properties
ExplanatoryFeature ≡ ∃ssn:isPropertyOf⁻.{p1} ⊓ … ⊓ ∃ssn:isPropertyOf⁻.{pn}
elevated blood pressure
clammy skin
palpitations
Hypertension
Hyperthyroidism
Pulmonary Edema
Observed Property Explanatory Feature
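Under the single-feature assumption, explanation reduces to a subset test: a feature is explanatory when the set of properties it can account for contains every observed property. The sketch below shows this with plain Python sets rather than an OWL reasoner. The edges of the knowledge base (which property belongs to which feature) are assumed for illustration, since the slide lists the nodes but not the links.

```python
# Assumed bipartite knowledge base: feature -> properties it can account for.
KNOWLEDGE_BASE = {
    "Hypertension": {"elevated blood pressure", "palpitations"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "clammy skin"},
}


def explanatory_features(kb, observed):
    """Features that account for all observed properties (single-feature assumption)."""
    return {feature for feature, props in kb.items() if observed <= props}


if __name__ == "__main__":
    observed = {"elevated blood pressure", "palpitations"}
    print(sorted(explanatory_features(KNOWLEDGE_BASE, observed)))
    # ['Hypertension', 'Hyperthyroidism']
```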
55
Discrimination is the act of finding those properties that, if observed, would help distinguish between multiple explanatory features
Observe Property
Perceive Feature
Explanation
2 Discrimination
Focusing attention on those aspects of the environment that provide useful information
Discrimination
56
Discrimination
Expected Property: would be explained by every explanatory feature
ExpectedProperty ≡ ∃ssn:isPropertyOf.{f1} ⊓ … ⊓ ∃ssn:isPropertyOf.{fn}
elevated blood pressure
clammy skin
palpitations
Hypertension
Hyperthyroidism
Pulmonary Edema
Expected Property Explanatory Feature
57
Discrimination
Not Applicable Property: would not be explained by any explanatory feature
NotApplicableProperty ≡ ¬∃ssn:isPropertyOf.{f1} ⊓ … ⊓ ¬∃ssn:isPropertyOf.{fn}
elevated blood pressure
clammy skin
palpitations
Hypertension
Hyperthyroidism
Pulmonary Edema
Not Applicable Property Explanatory Feature
58
Discrimination
Discriminating Property: is neither expected nor not-applicable
DiscriminatingProperty ≡ ¬ExpectedProperty ⊓ ¬NotApplicableProperty
elevated blood pressure
clammy skin
palpitations
Hypertension
Hyperthyroidism
Pulmonary Edema
Discriminating Property Explanatory Feature
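The three property classes can be sketched with set operations: expected properties are the intersection over the explanatory features, not-applicable properties lie outside their union, and discriminating properties are whatever remains. As before, the links between features and properties are assumed for illustration, and the function name is hypothetical.

```python
# Assumed bipartite knowledge base: feature -> properties it can account for.
KNOWLEDGE_BASE = {
    "Hypertension": {"elevated blood pressure", "palpitations"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "clammy skin"},
}


def classify_properties(kb, features):
    """Split all known properties into (expected, not_applicable, discriminating)."""
    all_props = set().union(*kb.values())
    if not features:
        # With no explanatory feature, nothing is explained by any of them.
        return set(), all_props, set()
    prop_sets = [kb[f] for f in features]
    expected = set.intersection(*prop_sets)            # explained by every feature
    not_applicable = all_props - set.union(*prop_sets)  # explained by none
    discriminating = all_props - expected - not_applicable
    return expected, not_applicable, discriminating


if __name__ == "__main__":
    expected, not_applicable, discriminating = classify_properties(
        KNOWLEDGE_BASE, {"Hypertension", "Hyperthyroidism"}
    )
    print(sorted(discriminating))  # ['clammy skin']
```

With these assumed links, observing whether the skin is clammy is the one measurement that would separate the two candidate explanations, so attention can be focused there.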
59
Through physical monitoring and analysis, our cellphones could act as an early warning system to detect serious health conditions, and provide actionable information
canary in a coal mine
Our Motivation
kHealth: knowledge-enabled healthcare
60
How do we implement machine perception efficiently on a resource-constrained device?
Use of an OWL reasoner is resource intensive (especially on resource-constrained devices), in terms of both memory and time
• Runs out of resources with prior knowledge >> 15 nodes
• Asymptotic complexity: O(n^3)
61
intelligence at the edge
Approach 1: Send all sensor observations to the cloud for processing
Approach 2: downscale semantic processing so that each device is capable of machine perception
Henson et al. 'An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices', ISWC 2012.
62
Efficient execution of machine perception
Use bit vector encodings and their operations to encode prior knowledge and execute semantic reasoning
0101100011010011110010101100011011011010110001101001111001010110001101011000110100111
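The idea of replacing reasoning with bit vector operations can be illustrated as follows: each property gets one bit, each feature becomes a mask of the properties it accounts for, and explanation and discrimination turn into bitwise AND/OR. This is a simplified sketch, not the encoding or algorithm from the cited paper, and the feature-to-property links are assumed.

```python
PROPERTIES = ["elevated blood pressure", "clammy skin", "palpitations"]
BIT = {prop: 1 << i for i, prop in enumerate(PROPERTIES)}


def encode(props):
    """Encode a collection of property names as an integer bit mask."""
    mask = 0
    for prop in props:
        mask |= BIT[prop]
    return mask


# Assumed prior knowledge, encoded once as one mask per feature.
FEATURE_MASKS = {
    "Hypertension": encode(["elevated blood pressure", "palpitations"]),
    "Hyperthyroidism": encode(["elevated blood pressure", "clammy skin", "palpitations"]),
    "Pulmonary Edema": encode(["elevated blood pressure", "clammy skin"]),
}


def explain(observed_mask):
    """Features whose mask covers every observed bit."""
    return [f for f, m in FEATURE_MASKS.items() if (m & observed_mask) == observed_mask]


def discriminating_mask(features):
    """Bits explained by some, but not all, of the given features."""
    expected, union = ~0, 0
    for f in features:
        expected &= FEATURE_MASKS[f]
        union |= FEATURE_MASKS[f]
    return union & ~expected


if __name__ == "__main__":
    observed = encode(["elevated blood pressure", "palpitations"])
    features = explain(observed)
    print(features)  # ['Hypertension', 'Hyperthyroidism']
    print([p for p in PROPERTIES if discriminating_mask(features) & BIT[p]])  # ['clammy skin']
```

Each check is a single pass over the features with constant-time word operations, which is why this style of encoding suits phones and other resource-constrained devices.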
63
O(n^3) < x < O(n^4) → O(n)
Efficiency Improvement
• Problem size increased from 10’s to 1000’s of nodes
• Time reduced from minutes to milliseconds
• Complexity growth reduced from polynomial to linear
Evaluation on a mobile device
64
Semantic Perception for smarter analytics: 3 ideas to take away
1 Translate low-level data to high-level knowledge: Machine perception can be used to convert low-level sensory signals into high-level knowledge useful for decision making
2 Prior knowledge is the key to perception: Using SW technologies, machine perception can be formalized and integrated with prior knowledge on the Web
3 Intelligence at the edge: By downscaling semantic inference, machine perception can execute efficiently on resource-constrained devices
65
Qualities
- High BP
- Increased Weight
Entities
- Hypertension
- Hypothyroidism
kHealth
Machine Sensors
Personal Input
EMR/PHR
Comorbidity risk score e.g., Charlson Index
Longitudinal studies of cardiovascular risks
- Find correlations
- Validation
  - domain knowledge
  - domain expert
Parameterize the model
Risk Assessment Model
Current Observations
- Physical
- Physiological
- History
Risk Score (Actionable Information)
Model Creation
Validate correlations
Historical observations of each patient
Risk Score: from Data to Abstraction and Actionable Information
66
• 10 million people worldwide are living with Parkinson's disease [1].
• 60,000 Americans are diagnosed with Parkinson's disease each year [1].
• $25 billion is spent on Parkinson’s alone in a year in the US [1].
• Therapeutic surgery can cost up to $100,000 per patient [1].
• 1 million Americans live with Parkinson’s disease [1].
[1] http://www.pdf.org/en/parkinson_statistics
Parkinson’s Disease (PD)
67
Parkinson’s disease (PD) data from The Michael J. Fox Foundation for Parkinson’s Research.
1 https://www.kaggle.com/c/predicting-parkinson-s-disease-progression-with-smartphone-data
8 weeks of data from 5 sensors on a smartphone, collected for 16 patients, resulting in ~12 GB (with a lot of missing data).
Variety
Volume
Veracity
Velocity
Value
Can we detect the onset of Parkinson’s disease?
Can we characterize the disease progression?
Can we provide actionable information to the patient?
Semantics: Representing prior knowledge of PD led to a focused exploration of this massive dataset
WHY Big Data to Smart Data: Healthcare example
68
Big Data to Smart Data Using a Knowledge Based Approach
ParkinsonMild(person) = Tremor(person) ∧ PoorBalance(person)
ParkinsonModerate(person) = MoveSlow(person) ∧ PoorSleep(person) ∧ MonotoneSpeech(person)
ParkinsonAdvanced(person) = Fall(person)
Control Group: Movements of an active person have a good distribution over the X, Y, and Z axes
PD Patients: Restricted movements by a PD patient can be seen in the acceleration readings
Control Group: Audio is well modulated, with good variations in the energy of the voice
PD Patients: Audio is not well modulated, representing monotone speech
Declarative Knowledge of Parkinson’s Disease used to focus
our attention on symptom manifestations in sensor
observations
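The declarative rules above translate directly into a small rule check over detected symptoms. The sketch assumes that symptoms have already been extracted from the sensor observations as named flags, and that when several rules fire the most severe stage wins; both are assumptions made for illustration.

```python
from typing import Optional, Set

# Rules as stated on the slide: stage -> symptoms that must all be present.
STAGE_RULES = {
    "Advanced": {"Fall"},
    "Moderate": {"MoveSlow", "PoorSleep", "MonotoneSpeech"},
    "Mild": {"Tremor", "PoorBalance"},
}

# Assumed precedence: report the most severe stage whose rule is satisfied.
SEVERITY_ORDER = ["Advanced", "Moderate", "Mild"]


def parkinson_stage(symptoms: Set[str]) -> Optional[str]:
    """Return the most severe stage whose required symptoms are all present, else None."""
    for stage in SEVERITY_ORDER:
        if STAGE_RULES[stage] <= symptoms:
            return stage
    return None


if __name__ == "__main__":
    print(parkinson_stage({"Tremor", "PoorBalance"}))          # Mild
    print(parkinson_stage({"Tremor", "PoorBalance", "Fall"}))  # Advanced
    print(parkinson_stage({"Tremor"}))                         # None
```

Knowing which symptoms each rule needs also tells us which sensors to look at first, for example the accelerometer for tremor and falls and the microphone for monotone speech.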
69
• 25 million people in the U.S. are diagnosed with asthma (7 million are children) [1].
• 300 million people suffer from asthma worldwide [2].
• $50 billion is spent on asthma alone in a year [2].
• 155,000 hospital admissions in 2006 [3].
• 593,000 emergency department visits in 2006 [3].
[1] http://www.nhlbi.nih.gov/health/health-topics/topics/asthma/
[2] http://www.lung.org/lung-disease/asthma/resources/facts-and-figures/asthma-in-adults.html
[3] Akinbami et al. (2009). Status of childhood asthma in the United States, 1980–2007. Pediatrics, 123(Supplement 3), S131-S145.
Asthma
70
Asthma is a multifactorial disease with health signals spanning personal, public health, and population levels.
Real-time health signals from personal level (e.g., Wheezometer, NO in breath, accelerometer, microphone), public health (e.g., CDC, Hospital EMR), and population level (e.g., pollen level, CO2) arriving continuously in fine grained samples potentially with missing information and uneven sampling frequencies.
Variety
Volume
Veracity
Velocity
Value
Can we detect the asthma severity level?
Can we characterize asthma control level?
What risk factors influence asthma control?
What is the contribution of each risk factor?
Semantics: Understanding relationships between health signals and asthma attacks for providing actionable information
WHY Big Data to Smart Data: Healthcare example
71
Population Level
Personal
Public Health
Variety: Health signals span heterogeneous sources
Volume: Health signals are fine grained
Velocity: Real-time change in situations
Veracity: Reliability of health signals may be compromised
Value: Can I reduce my asthma attacks at night?
Decision support to doctors by providing them with deeper insights into patient asthma care
Asthma: Demonstration of Value
72
Sensordrone – for monitoring environmental air quality
Wheezometer – for monitoring wheezing sounds
Can I reduce my asthma attacks at night?
What are the triggers?
What is the wheezing level?
What is the propensity toward asthma?
What is the exposure level over a day?
What is the air quality indoors?
Commute to Work
Personal
Public Health
Population Level
Closing the window at home in the morning and taking an alternate route to the office may lead to reduced asthma attacks
Actionable Information
Asthma: Actionable Information for Asthma Patients
73
Personal, Public Health, and Population Level Signals for Monitoring Asthma
ICS= inhaled corticosteroid, LABA = inhaled long-acting beta2-agonist, SABA= inhaled short-acting beta2-agonist ; *consider referral to specialist
Asthma Control and Actionable Information
Sensors and their observations for understanding asthma
74
Personal Level Signals
Societal Level Signals
(Personal Level Signals)
(Personalized Societal Level Signal)
(Societal Level Signals)
Societal Level Signals Relevant to the Personal Level
Personal Level Sensors
(kHealth**) (EventShop*)
Qualify
Quantify
Action Recommendation
What are the features influencing my asthma?
What is the contribution of each of these features?
How controlled is my asthma? (risk score)
What will be my action plan to manage asthma?
Storage
Societal Level Sensors
Asthma Early Warning Model (AEWM)
Query AEWM
Verify & augment domain knowledge
Recommended Action
Action Justification
Asthma Early Warning Model
*http://www.slideshare.net/jain49/eventshop-120721, ** http://www.youtube.com/watch?v=btnRi64hJp4
75
Population Level
Personal
Wheeze – Yes
Do you have tightness of chest? – Yes
Observations
Physical-Cyber-Social System
Health Signal Extraction
Health Signal Understanding
<Wheezing=Yes, time, location>
<ChestTightness=Yes, time, location>
<PollenLevel=Medium, time, location>
<Pollution=Yes, time, location>
<Activity=High, time, location>
Wheezing
ChestTightness
PollenLevel
Pollution
Activity
Wheezing
ChestTightness
PollenLevel
Pollution
Activity
RiskCategory
<PollenLevel, ChestTightness, Pollution, Activity, Wheezing, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
.
.
.
Expert Knowledge
Background Knowledge
tweet reporting pollution level and asthma attacks
Acceleration readings from on-phone sensors
Sensor and personal observations
Signals from personal, personal spaces, and community spaces
Risk Category assigned by doctors
Qualify
Quantify
Enrich
Outdoor pollen and pollution
Public Health
Health Signal Extraction to Understanding
Well Controlled – continue
Not Well Controlled – contact nurse
Poorly Controlled – contact doctor
76
• Real Time Feature Streams: http://www.youtube.com/watch?v=_ews4w_eCpg
• kHealth: http://www.youtube.com/watch?v=btnRi64hJp4
Demos
77
Smart Data in Social Media & Disaster Response
To Understand critical information dynamics in real world events
78
Twitris’ Dimensions of Integrated Semantic Analysis
Sheth et al. Twitris- a System for Collective Social Intelligence, ESNAM-2013
79
What is Smart Data in the context of Disaster Management
ACTIONABLE: Timely delivery of right resources and information to the right people at right location!
Join us for the Social Good!
http://twitris.knoesis.org
RT @OpOKRelief: Southgate Baptist Church
on 4th Street in Moore has food, water, clothes, diapers, toys, and more. If you can't go,call 794
Text "FOOD" to 32333, REDCROSS to 90999, or STORM to 80888 to donate $10
in storm relief. #moore #oklahoma
#disasterrelief #donate
Want to help animals in #Oklahoma? @ASPCA tells
how you can help: http://t.co/mt8l9PwzmO
CITIZEN SENSORS
RESPONSE TEAMS (including humanitarian
org. and ‘pseudo’ responders)
VICTIM SITE
Coordination of emerging needs after a disaster
Does anyone know where to send a check to donate to the
tornado victims?
Where do I go to help out for
volunteer work around Moore? Anyone know?
Anyone know where to donate
to help the animals from the
Oklahoma disaster?
#oklahoma #dogs
Matched
Matched
Matched
Serving the need!
If you would like to volunteer today, help is desperately
needed in Shawnee. Call 273-5331 for more info
http://www.slideshare.net/hemant_knoesis/cscw-2012-hemantpurohit-1153161280Purohit et al. Framework to Analyze Coordination in Crisis Response, 2012. Int’l Collaboration
in-progress: with QCRI
81
Smart Data from Twitris system for Disaster Response Coordination
Which are the primary locations of power failure?
Who are all the people to engage with for better information diffusion?
Where are the charging stations to sustain communication?
Smart data provides actionable information and improves decision making through semantic analysis of Big Data.
Who are the resource seekers and suppliers?
83
Disaster Response Coordination: Twitris Summary for Actionable Nuggets
Important tags to summarize Big Data flow
Related to Oklahoma tornado
Images and Videos Related to Oklahoma tornado
84
Disaster Response Coordination: Twitris Real-time information for needs
Incoming Tweets with need types to give quick idea of what is needed and where
currently #OKC
Legends for Different needs #OKC
(It is real-time widget for monitoring of needs, so will not be active after the event has passed) http://twitris.knoesis.org/oklahomatornado
85
Disaster Response Coordination: Influencers to engage with for specific needs
Influential users for respective needs and their interaction network on the right.
86
Really sparse Signal to Noise:
• 2M tweets during the first week after #Oklahoma-tornado-2013
- 1.3% as the highly precise donation requests to help
- 0.02% as the highly precise donation offers to help
• Anyone know how to get involved to help the tornado victims in Oklahoma?? #tornado #oklahomacity (OFFER)
• I want to donate to the Oklahoma cause shoes clothes even food if I can (OFFER)
Disaster Response Coordination: Finding Actionable Nuggets for Responders to act
• Text REDCROSS to 909-99 to donate to those impacted by the Moore tornado! http://t.co/oQMljkicPs (REQUEST)
• Please donate to Oklahoma disaster relief efforts.: http://t.co/crRvLAaHtk (REQUEST)
For responders, most important information is the scarcity and availability of resources, can we mine it via Social Media?
87
Disaster Response Coordination: Engagement Interface for responders
What-Where-How-Who-Why Coordination
Influential users to engage with and resources for
seekers/supplies at a location, at a timestamp
Contextual Information for a
chosen topical tags
88
• Illustrative scenario: #Oklahoma-tornado 2013
Disaster Response Coordination: Anecdote for the value of Smart Data
FEMA asked us to quickly filter out gas-leak related data
Mining the data for smart nuggets to inform FEMA (Timely needs)
Engaged with the author of this information to confirm (Veracity)
e.g., All gas leaks in #moore were capped and stopped by 11:30 last night (at 5/22/2013 1:41:37)
A lot of tweets on ‘how to/where to’ assist (‘pseudo’ responders)
e.g., I want to go to Oklahoma this weekend & do what i can to help those people with food,cloths & supplies,im in the feel of wanting to help ! :)
89
Current Grid Conditions
Renewable energy generation forecast
Synchrophasor data
Heat index, relative humidity
Power consumption by consumers
Big Data from Smart Grid Smart Data from Smart Grid
What is the overall health of the Grid?
What are the vulnerabilities for today?
Red, yellow, and green indicate high, medium, and low risk allowing decision makers to focus on red & yellow lines
Big Data vs. Smart Data in Smart Grids (Utilities perspective)
90
Personal Schedule
Big Data from Smart Grid & Smart Meters
Smart Data from Smart Grid & Smart Meters
Smart Meters
Power Consumption
Temperature, relative humidity
Dynamic pricing information
http://www.digikey.com/us/en/techzone/energy-harvesting/resources/articles/zigbees-smart-energy-20-profile.html
Which devices are contributing to a higher power bill?
When should I operate the washer/dryer?
Red, yellow, and green indicating high, medium, and
low power consumption
Recommendation algorithms will analyze these abstractions
with domain knowledge
Actions to optimize power bill will be recommended
Big Data vs. Smart Data in Smart Grids (Consumer perspective)
91
Take Away
• Data processing for Smart Grids/Utilities and Consumers is a lot more than a Big Data processing problem
• It is all about the human, not computing, not the device: help them make better decisions, give actionable information
– Computing for human experience
• Whatever we do in Smart Data, focus on human-in-the-loop (empowering machine computing!):
– Of Human, By Human, For Human
– But in serving human needs, there is a lot more than what current big data analytics handle – variety, contextual, personalized, subjective, spanning data and knowledge across P-C-S dimensions
92
Acknowledgements
• Kno.e.sis team
• Funds: NSF, NIH, AFRL, Industry…
• Note:
• For images and sources, if not on slides, please see slide notes
• Some images were taken from Web search results, and all such images belong to their respective owners; we are grateful to the owners for the usefulness of these images in our context.
93
• OpenSource: http://knoesis.org/opensource
• Showcase: http://knoesis.org/showcase
• Vision: http://knoesis.org/node/266
• Publications: http://knoesis.org/library
References and Further Readings
Amit Sheth’s PHD students
Ashutosh Jadhav
Hemant Purohit
Vinh Nguyen
Lu Chen
Pavan Kapanipathi
Pramod Anantharam
Sujan Perera
Alan Smith
Pramod Koneru
Maryam Panahiazar
Sarasi Lalithsena
Cory Henson
Kalpa Gunaratna
Delroy Cameron
Sanjaya Wijeratne
Wenbo Wang
Kno.e.sis in 2012 = ~100 researchers (15 faculty, ~50 PhD students)
95
thank you, and please visit us at
http://knoesis.org
Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing
Wright State University, Dayton, Ohio, USA
Smart Data