![Page 1: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/1.jpg)
High Resolution Energy Modeling that Scales with Apache Spark 2.0
Jonathan Farland, Consultant | Data Scientist, DNV GL
![Page 2: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/2.jpg)
About me
• Data Scientist & Technical Consultant for DNV GL’s Policy Advisory and Research Group.
• Background in Econometrics, Forecasting, Machine Learning and Optimization.
• Working with Big Data for 3+ years.
![Page 3: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/3.jpg)
Agenda
• Introduction to DNV GL
• Energy Data Science using Spark
  – Data Scales and the DGP
  – Application 1 – Princeton Score Keeping Method (PRISM)
  – Application 2 – Hourly Predictive Modelling with Distributed Energy Resources
• Next Steps with Spark and Databricks
![Page 4: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/4.jpg)
Introduction to DNV GL
![Page 5: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/5.jpg)
![Page 6: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/6.jpg)
Energy Data Science: Data Scales and the DGP
![Page 7: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/7.jpg)
Data Generating Process
Metering Data: Historical measured quantities of electricity usage for a site or meter during a particular time period.
- Analogue in origin, requiring a physical reading of the meter on a specific cycle.
- Typically used by utility companies to bill customers for their usage.
- Advanced metering technologies and machine learning now allow for millisecond readings and disaggregation down to the end-use / appliance level.
Weather Data:
- Actual Weather: Records of temperature, humidity, cloud cover, solar irradiance, etc.
- Typical Weather: 30-year / 10-year averages that define “normal” weather conditions.
![Page 8: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/8.jpg)
Electricity Distribution Grid
[Figure: the grid from Generation through Transmission and Distribution to the Consumer, divided into the Bulk System and the Distribution System. Labels: Wind Farms and Photovoltaic generation; Bulk Storage (> 50 MW); Aggregated Utility Scale (2-50 MW); Utility Scale (100 kW-2 MW); Distributed Scale (25 kW-100 kW); Residential, Commercial & Industrial consumers.]
![Page 9: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/9.jpg)
The Rise of The Smart Grid
![Page 10: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/10.jpg)
Data Scales
![Page 11: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/11.jpg)
Analytical Solution
The embarrassingly parallel ‘Primary Modeling Unit’:
I. Temporal: Sub-hourly, hourly, daily, monthly, annually.
II. Cross-Sectional: Clusters/Segments, Geography, System Hierarchy.
III. Hybrid: Structure and Year specific.
Databricks: Rapid deployment and development of the existing analytics pipeline.
Spark 2.0: SparkR allows for UDFs and partition-based model learning
- gapply, dapply, lapply
Spark 2.1: Enables installing third-party packages on workers using spark.addFile
- SPARK-7159: Multiclass Logistic Regression in DataFrame-based API
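The idea of an embarrassingly parallel Primary Modeling Unit can be illustrated with a minimal, single-machine sketch in plain Python. It is not SparkR code and not the pipeline from the talk. The row layout, `fit_site_model`, and `group_apply` are hypothetical names introduced here. The point is only that each group's model is fit from that group's rows alone, which is the property a grouped-apply function such as gapply relies on.

```python
from collections import defaultdict
from statistics import mean


def fit_site_model(readings):
    """Hypothetical per-unit 'model': summary of one site's (hour, kWh) readings."""
    return {"n_obs": len(readings), "mean_kwh": mean(kwh for _, kwh in readings)}


def group_apply(rows, key, fit):
    """Group rows by `key` and fit one independent model per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append((row["hour"], row["kwh"]))
    # Each fit() call reads only its own group's data, so the calls could be
    # sent to separate workers with no coordination between them.
    return {unit: fit(readings) for unit, readings in groups.items()}


# Made-up meter readings for two sites (illustration only).
rows = [
    {"site": "A", "hour": 1, "kwh": 1.0},
    {"site": "A", "hour": 2, "kwh": 3.0},
    {"site": "B", "hour": 1, "kwh": 10.0},
    {"site": "B", "hour": 2, "kwh": 20.0},
    {"site": "B", "hour": 3, "kwh": 30.0},
]
models = group_apply(rows, key="site", fit=fit_site_model)
```

Changing `key` from a site identifier to a cluster, a geography, or a site-and-year pair changes the Primary Modeling Unit without touching the fitting code.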
![Page 12: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/12.jpg)
Energy Data Science: Princeton Score Keeping Method (PRISM)
![Page 13: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/13.jpg)
PRISM Algorithm
- Decomposes energy usage into its weather-driven and baseload components.
- Site-level modelling that combines both full and reduced-form models.
- Grid search over possible heating and cooling reference temperatures.
- Rich development history based on fundamental structural engineering principles.
- Origin: Miriam Goldberg's dissertation, "A Geometrical Approach to Non-differentiable Regression Models as Related to Methods for Assessing Residential Energy Conservation."
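The grid search over reference temperatures can be sketched as follows. This is a simplified, heating-only illustration under stated assumptions, not the implementation used in the talk: it works on made-up daily temperatures instead of billing-period degree-days, and `ols_line` and `fit_heating_model` are hypothetical helper names. For each candidate reference temperature it builds heating degree-days, fits a straight line by ordinary least squares, and keeps the candidate with the smallest sum of squared errors.

```python
def ols_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, sse)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        b = 0.0  # no variation in x, so the slope is not identified
    else:
        b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, sse


def fit_heating_model(temps, usage, candidates):
    """Grid search over reference temperatures for a heating-only model."""
    best = None
    for tau in candidates:
        # Heating degree-days: how far each temperature falls below tau.
        hdd = [max(tau - t, 0.0) for t in temps]
        base, slope, sse = ols_line(hdd, usage)
        if best is None or sse < best["sse"]:
            best = {"tau": tau, "base": base, "slope": slope, "sse": sse}
    return best


# Synthetic data generated with baseload 10, heating slope 2, reference 60.
temps = [30, 40, 50, 55, 60, 65, 70, 80]
usage = [10 + 2 * max(60 - t, 0) for t in temps]
fit = fit_heating_model(temps, usage, candidates=range(50, 71))
```

On this noise-free data the search recovers the generating reference temperature. `base` plays the role of the baseload component and `slope * hdd` the weather-driven component. A cooling term would add a second grid dimension over cooling reference temperatures.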
![Page 14: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/14.jpg)
Just a little math…
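For reference, one common way to write the heating-only form of the PRISM model is given below. The notation is assumed here and is not taken from the slides: $E_i$ is average daily consumption in billing period $i$, $\alpha$ the baseload per day, $\beta$ the heating slope, $\tau$ the reference temperature, $T_d$ the average outdoor temperature on day $d$, $N_i$ the number of days in the period, and $H_0(\tau)$ the heating degree-days of a typical-weather year.

```latex
E_i = \alpha + \beta \, H_i(\tau) + \varepsilon_i,
\qquad
H_i(\tau) = \frac{1}{N_i} \sum_{d \in i} \max(\tau - T_d,\; 0)

\mathrm{NAC} = 365\,\alpha + \beta \, H_0(\tau)
```

The reference temperature $\tau$ enters through the $\max$ term, so the model is not differentiable in $\tau$. That is why $\tau$ is chosen by grid search while $\alpha$ and $\beta$ are estimated by least squares at each candidate value. The normalized annual consumption (NAC) combines the fitted parameters with typical-weather degree-days.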
![Page 15: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/15.jpg)
Explained Visually
![Page 16: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/16.jpg)
SparkR – gapply, dapply, lapply
![Page 17: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/17.jpg)
Local Native R
![Page 18: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/18.jpg)
![Page 19: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/19.jpg)
![Page 20: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/20.jpg)
Energy Data Science: Predictive Modeling with Distributed Energy Resources
![Page 21: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/21.jpg)
Load Shifting: Electric Vehicles
[Figure: Demand (kW, axis 0 to 30) by Hour Ending (1 to 24), comparing the Standard Rate and the Electric Vehicle Rate.]
![Page 22: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/22.jpg)
Load Reduction: Demand Response
[Figure: Load (kWh, axis 0 to 160,000) by Hour Ending (1 to 24). Series: Forecasted - DR Baseline, Forecasted - DR Impacted Load, Forecasted - DR Reduction, Actual DR - Reduction.]
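The series names in this chart suggest the usual demand response accounting, sketched below under that assumption: the forecasted reduction in each hour is the baseline forecast minus the impacted-load forecast. The function name `dr_reduction` and all numbers are made up for illustration and are not values from the talk.

```python
def dr_reduction(baseline_kwh, impacted_kwh):
    """Hourly reduction = baseline load minus load during the DR event."""
    if len(baseline_kwh) != len(impacted_kwh):
        raise ValueError("baseline and impacted series must cover the same hours")
    return [b - i for b, i in zip(baseline_kwh, impacted_kwh)]


# Hypothetical four-hour window; the event is active in the middle two hours.
baseline_kwh = [120_000, 140_000, 150_000, 130_000]
impacted_kwh = [120_000, 125_000, 132_000, 130_000]
reduction_kwh = dr_reduction(baseline_kwh, impacted_kwh)
```

Comparing a forecasted reduction computed this way with the actual reduction is what a chart like this one lets the reader judge by eye.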
![Page 23: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/23.jpg)
Cluster Sizes:
1 – 10,495
2 – 4,513
3 – 1,127
4 – 9,823
Digitalization: Scalable Cluster Computing (Spark, Python, R)
Data Science: Machine Learning Algorithms (Spectral Clustering and K-means)
Predictive Analytics (Semiparametric Regression)
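The slide names Spectral Clustering and K-means. As a minimal sketch of the K-means half only, the code below groups made-up normalized load shapes using Lloyd's algorithm in plain Python. The function `kmeans`, the caller-supplied starting centers, and the four-value load shapes are all assumptions for illustration; the talk's actual features and settings are not given in the slides.

```python
def kmeans(points, centers, n_iter=20):
    """Lloyd's algorithm with caller-supplied starting centers."""
    centers = [list(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(
                range(len(centers)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(pt, centers[k])),
            )
            for pt in points
        ]
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical load shapes: share of peak demand in four periods of the day.
shapes = [
    [0.1, 0.2, 0.9, 0.8],  # evening-peaking site
    [0.2, 0.1, 0.8, 0.9],  # evening-peaking site
    [0.9, 0.8, 0.1, 0.2],  # morning-peaking site
    [0.8, 0.9, 0.2, 0.1],  # morning-peaking site
]
labels, centers = kmeans(shapes, centers=[shapes[0], shapes[2]])
cluster_sizes = {k: labels.count(k) for k in sorted(set(labels))}
```

Once sites carry a cluster label, the cluster becomes a natural Primary Modeling Unit: one predictive model can be fit per cluster and then applied to the sites within it.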
![Page 24: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/24.jpg)
![Page 25: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/25.jpg)
How well did it work?
[Figure: panels for Cluster 1 and Cluster 4.]
![Page 26: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/26.jpg)
Cluster Site Predictions
![Page 27: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/27.jpg)
Cluster Site Tech Simulations
[Figure: kW (axis 0 to 3) over the Forecast Horizon (1 to 139). Series: Load Forecast, Adjusted Load Forecast, PV Production, Storage Discharging.]
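One plausible reading of this chart, stated here as an assumption and not confirmed by the slides, is that the adjusted load forecast nets PV production and storage discharging out of the load forecast. The sketch below shows that arithmetic with a hypothetical function `adjusted_load` and made-up values.

```python
def adjusted_load(load_kw, pv_kw, storage_kw):
    """Net load per step: forecast load minus PV output minus storage discharge."""
    if not (len(load_kw) == len(pv_kw) == len(storage_kw)):
        raise ValueError("all series must cover the same forecast steps")
    # The result may be negative when on-site supply exceeds load (export).
    return [l - p - s for l, p, s in zip(load_kw, pv_kw, storage_kw)]


# Hypothetical three-step horizon in kW.
load_kw = [2.0, 2.5, 3.0]
pv_kw = [0.5, 1.0, 0.0]
storage_kw = [0.0, 0.5, 1.0]
net_kw = adjusted_load(load_kw, pv_kw, storage_kw)
```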
![Page 28: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/28.jpg)
Conclusions
![Page 29: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/29.jpg)
Take Home Message
Spark 2.0 / 2.1 have allowed DNV GL’s existing expertise and code base to scale.
Databricks has provided an environment that supported existing codebases as well as additional rapid development.
- Analytical contexts, prediction goals, and model selection processes define the Primary Modeling Unit (PMU) in any Energy Data Science application.
- The distributed computing framework must be able to scale with the appropriate Primary Modeling Unit for any Energy Data Science application.
![Page 30: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/30.jpg)
The Future!
Modeling Additional Fuels
- Natural Gas (Therms)
- Water (Liters / Gallons)
- Hybrid (British Thermal Units)
Climate Change Simulations
- DNV GL’s BayTown System Dynamics Model
Electricity Grid Optimization with Distributed Energy Resource Assets
![Page 31: High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summit East talk by Jonathan Farland](https://reader036.vdocuments.us/reader036/viewer/2022062522/58cef27a1a28abab738b46eb/html5/thumbnails/31.jpg)
Thank You.
Jonathan Farland
[email protected]
github.com/jfarland