Powering a Virtual Power Station with Big Data
![Page 1: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/1.jpg)
Powering a Virtual Power Station with Big Data
Michael Bironneau
April 2016
![Page 2: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/2.jpg)
[Chart: Installed Capacity (GW) and Generation (GW) by source: CCGT, Coal, Nuclear, Wind, Interconnectors, OCGT, Pumped Storage, Solar, Oil, Biomass, Hydro]
![Page 3: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/3.jpg)
![Page 4: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/4.jpg)
![Page 5: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/5.jpg)
![Page 6: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/6.jpg)
[Chart: Total Power (MW)]
Average upwards flex – 120%
Average downwards flex – 35%
![Page 7: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/7.jpg)
?
?
![Page 8: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/8.jpg)
Open Energi in the coming year:
• 25-40k messages processed per second
• Total size of data 500TB-800TB
![Page 9: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/9.jpg)
Perspective: here’s what “big data” means to Boeing [1]:
• ~64k messages per second from each aircraft
• Total size of data over 100 petabytes
[1]: http://bit.ly/18kQlMn
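To put the two sets of figures side by side, here is a rough back-of-envelope calculation in Python. It is only a sketch: it assumes decimal units (1 PB = 1000 TB), treats Boeing's "over 100 petabytes" as exactly 100 PB, and the variable names are illustrative. Note that the Boeing message rate on the slide is per aircraft, while the Open Energi rate is a total, so only the data sizes are compared directly.

```python
SECONDS_PER_DAY = 24 * 60 * 60
TB_PER_PB = 1000  # assumption: decimal units

# Figures quoted on the slides (low and high end of each range).
open_energi_msgs_per_s = (25_000, 40_000)
open_energi_data_tb = (500, 800)
boeing_data_pb = 100  # "over 100 petabytes", so this is a lower bound

# Messages Open Energi would process per day at each end of the range.
msgs_per_day = tuple(rate * SECONDS_PER_DAY for rate in open_energi_msgs_per_s)

# How many times larger Boeing's data set is than Open Energi's.
size_ratio = tuple(boeing_data_pb * TB_PER_PB / tb for tb in open_energi_data_tb)

print(msgs_per_day)  # (2160000000, 3456000000)
print(size_ratio)    # (200.0, 125.0)
```

Under these assumptions, Open Energi would handle roughly 2 to 3.5 billion messages a day, and Boeing's data set would be at least 125 to 200 times larger.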
![Page 10: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/10.jpg)
[Chart: Size of data (PB), Open Energi vs. Boeing]
Our data is not huge at the moment…
![Page 11: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/11.jpg)
…but after domestic demand-side response (or something else on that scale)
[Chart: Size of data (PB), Open Energi vs. Boeing]
![Page 12: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/12.jpg)
Why Hortonworks Data Platform
• Can scale quickly to respond to market demands
• Interoperability with existing code
• Fantastic data integration
• Knowledgeable technical support
• Security and data governance
![Page 13: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/13.jpg)
Batch | Our HDP setup
Flume
Asset Data
National Electricity Data
Market data
Other “live” timeseries data
Hive Streaming
Hive
Other applications
![Page 14: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/14.jpg)
Real-time | (Work ongoing)
Asset Data
ML models
HDFS, cache, Elasticsearch…
Update ML Models
Correlate Events
Enrich
![Page 15: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/15.jpg)
Apache Hive | Example
Index semi-structured data (Elasticsearch)
CREATE EXTERNAL TABLE semi_structured_stuff (...)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.resource' = 'semi/structured',
              'es.index.auto.create' = 'false');

Use Hive to integrate this with timeseries data and other metadata
SELECT something FROM semi_structured_stuff
JOIN metadata m ON …
LEFT JOIN timeseries t ON …

Farm out complex analytics to Python
SELECT transform(something) USING 'insane_maths.py' AS (result)
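The slides do not show the contents of insane_maths.py, so the following is only a sketch of the general shape such a script could take. Hive's TRANSFORM clause streams rows to the script's standard input as tab-separated lines and reads tab-separated result rows back from its standard output, with NULL encoded as \N by default. The squaring step and the names `transform_line` and `run` are hypothetical placeholders, not the author's code.

```python
import io
import sys


def transform_line(line):
    """Turn one tab-separated input row into one output row."""
    fields = line.rstrip("\n").split("\t")
    raw = fields[0]
    if raw == r"\N":  # Hive's default text encoding of NULL
        return r"\N"
    value = float(raw)
    return repr(value * value)  # hypothetical stand-in for the real maths


def run(instream, outstream):
    """Read rows from instream and write one result row per input row."""
    for line in instream:
        outstream.write(transform_line(line) + "\n")


# Local check with in-memory streams. Under Hive, the script would
# instead end with: run(sys.stdin, sys.stdout)
demo_out = io.StringIO()
run(io.StringIO("3.0\n\\N\n"), demo_out)
print(demo_out.getvalue(), end="")
```

Keeping the per-row logic in a plain function means existing Python code can be tested locally before it is wired into a Hive query.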
![Page 16: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/16.jpg)
Benefits
• Reduced storage cost compared to SAN + SQL Server
• Better utilisation of infrastructure thanks to YARN
• Pain-free integration of multiple data sources with external tables in Hive
• Scale up/down on demand
• Re-use existing Python code = low development overhead
![Page 17: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/17.jpg)
Dynamic Demand
Simulations
Insights via web
Machine learning
Statistical Analysis
Event correlation
Expert system
Real-time aggregation
Real-time web feed
![Page 18: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/18.jpg)
![Page 19: Powering a Virtual Power Station with Big Data](https://reader036.vdocuments.us/reader036/viewer/2022070516/586f77811a28ab10258b681b/html5/thumbnails/19.jpg)
Thanks for listening. Any questions?