A Comparative Study of Support Vector Machines and Artificial Neural Networks for Short Term Load Forecasting
Master Thesis Presentation
Oussama Saad
Renewable Energy and Energy Efficiency for the Middle East and North Africa Region [REMENA]University of Kassel
July 13, 2018
Oussama Saad A Comparative Study of SVM and ANN for STLF July 13, 2018 1 / 30
Table of Contents
1 Introduction
2 Methodology
3 Results
4 Conclusions and Future Work
Introduction
Table of Contents
1 Introduction
  Thesis Background and Motivation
  Thesis Contribution
  Selected Forecasting Techniques
2 Methodology
3 Results
4 Conclusions and Future Work
Introduction Thesis Background and Motivation
Thesis Background and Motivation
Bizerte is to become a smart city by the year 2050;
A set of projects aims at improving energy management, such as:
Deploying smart grids,
Implementing an advanced energy management system (EMS).
EMS and electric utilities rely on short term load forecasting (STLF), ranging from 1 hour to 1 week, for:
Improving equipment flexibility and asset management,
Planning demand side management (DSM) interventions,
Energy trading.
Accurate STLF is required to improve EMS operations.
Figure: Bizerte governorate, Tunisia (source: Wikipedia)
Introduction Thesis Contribution
Thesis Contribution
Build an STLF model to predict the electrical load in the governorate of Bizerte in hourly resolution.
Take into account the influence of the temperature as well as the characteristics of the calendar on the load.
Predict the electric load at an hour H using the electric load and the exogenous factors from the previous hour (H − 1).
Two supervised machine learning (ML) techniques were selected for comparative assessment:
Support Vector Machines (SVM),
Artificial Neural Networks (ANN).
Introduction Selected Forecasting Techniques
ν- Support Vector Regression (ν-SVR)
Main idea: Given a training set {(x_i, t_i)}_{i=1}^{N} ⊂ ℝ^n × ℝ, find a function f to approximate t_i:

f(x) = y = 〈w, x〉 + b,   (1)

with a maximum deviation ε from the target variable.
To give more flexibility to the SVR machine, slack variables ξ_i, ξ_i* are introduced.
Figure: Support vectors for linear SVR (the ε-tube around the regression line; points outside the tube incur slacks ξ_i, ξ_i* and are the support vectors).
Introduction Selected Forecasting Techniques
Convex quadratic programming problem (QPP)
minimize   (1/2)||w||² + C ( Nνε + Σ_{i=1}^{N} (ξ_i + ξ_i*) )

subject to t_i − 〈w, x_i〉 − b ≤ ε + ξ_i
           〈w, x_i〉 + b − t_i ≤ ε + ξ_i*
           ξ_i, ξ_i* ≥ 0
           0 < ν ≤ 1
           ∀ i ∈ {1, ..., N}   (2)
ν is a lower bound on the fraction of support vectors and an upper bound on the fraction of training errors [7].
C controls the trade-off between the minimization of the norm of w and the tolerated fraction of deviations larger than ε [7].
For non-linear problems, SVMs rely on kernel functions that map the data set into a higher-dimensional space in which the problem becomes linear.
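As a concrete illustration of ν-SVR with an RBF kernel, the following sketch uses scikit-learn's `NuSVR`, which wraps LIBSVM, the library used in this work. The data are synthetic and the hyperparameter values are placeholders borrowed from later slides, not a reproduction of the thesis experiment:

```python
# Illustrative nu-SVR fit on synthetic data (assumptions: scikit-learn
# available; 8 features mimic load + 7 exogenous factors; values arbitrary).
import numpy as np
from sklearn.svm import NuSVR

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 8))                   # 8 input features
t = np.sin(X[:, 0] * 6) + 0.1 * rng.normal(size=200)   # synthetic target

model = NuSVR(kernel="rbf", nu=0.8, C=7.743, gamma=5.0)
model.fit(X, t)
y = model.predict(X)

# nu acts as a lower bound on the fraction of support vectors
frac_sv = len(model.support_) / len(X)
print(f"fraction of support vectors: {frac_sv:.2f}")
```

Raising ν pulls more points into the support set, which is why ν = 0.8 yields a dense model here.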
Introduction Selected Forecasting Techniques
Artificial Neural Networks
Figure: A feedforward neural network diagram (input layer I_1 … I_n with inputs x_1 … x_n; hidden layer h_1 … h_m with biases b^1_1 … b^1_m; output layer O_1 … O_p with biases b^2_1 … b^2_p and outputs y_1 … y_p; weights w^1 between input and hidden layer, w^2 between hidden and output layer).
Introduction Selected Forecasting Techniques
Recurrent Neural Networks
Figure: Recurrent neural network diagram (input I, hidden layer H with a feedback loop, output O).
Using an additional feedback loop in their hidden layer, RNNs are capable of transmitting the treated information from one step to the following one.
RNNs are therefore more suitable for capturing the dependencies between successive observations.
Thus, for our study, RNNs were selected for forecasting the electric load.
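The feedback loop described above can be sketched in a few lines: the hidden state h carries information from one time step to the next. Layer sizes follow the design presented later (8 inputs, 3 hidden neurons, 1 output); the weights here are random placeholders, not trained values:

```python
# Minimal Elman-style RNN forward pass (assumption: random untrained
# weights; this only illustrates the recurrence, not the thesis model).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W_in  = rng.normal(size=(3, 8))   # input  -> hidden
W_rec = rng.normal(size=(3, 3))   # hidden -> hidden (the feedback loop)
W_out = rng.normal(size=(1, 3))   # hidden -> output

def rnn_forward(sequence):
    """Run the RNN over a sequence of feature vectors, returning outputs."""
    h = np.zeros(3)                            # hidden state, initially zero
    outputs = []
    for x in sequence:
        h = sigmoid(W_in @ x + W_rec @ h)      # state depends on all past inputs
        outputs.append(sigmoid(W_out @ h)[0])
    return outputs

seq = rng.uniform(0, 1, size=(24, 8))          # e.g. 24 hourly observations
print(rnn_forward(seq)[:3])
```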
Methodology
Table of Contents
1 Introduction
2 Methodology
  Datasets
  Data Preprocessing
  Model Design
  Training and Testing
  Forecasting
  Evaluation Metrics
3 Results
4 Conclusions and Future Work
Methodology
Methodology steps
Data preprocessing → Model design → Training and testing → Forecasting → Evaluation
Methodology Datasets
Datasets
Time series of the electric load in Bizerte covering the period from 01-01-2013 to 31-12-2017 (5 years) in 15-minute resolution (from STEG).
Time series of temperature covering the same period in hourly resolution (from INM).
Calendar data:
Day, month, year and hour;
Day of the week (DOW): Monday, Tuesday, · · · , Sunday;
Type of day (TOD): Monday, working day (from Tuesday to Friday), weekend and holidays.
In addition to the electric load, 7 exogenous features were considered in our problem.
Methodology Data Preprocessing
Data Preprocessing
Data aggregation: aggregate to hourly values by considering the hourly peaks.
Removing outliers: using the interquartile range (IQR) and local outlier factor (LOF) methods; each year was treated separately.
Filling missing values: using spline interpolation.
Data transformation: normalizing data values between [0, 1].
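The four steps above can be sketched on a synthetic 15-minute series: hourly-peak aggregation, the IQR outlier rule, spline interpolation of the gaps, and min-max scaling to [0, 1]. This assumes pandas and SciPy; the series, column layout, and the 1.5×IQR fence are illustrative choices, not taken from the thesis:

```python
# Preprocessing sketch on a synthetic 15-minute load series (assumptions:
# pandas + SciPy available; data, fence factor and spline order illustrative).
import numpy as np
import pandas as pd

idx = pd.date_range("2013-01-01", periods=4 * 24 * 7, freq="15min")
rng = np.random.default_rng(0)
load = pd.Series(
    100 + 20 * np.sin(np.arange(len(idx)) / 48) + rng.normal(0, 2, len(idx)),
    index=idx,
)

# 1. Aggregation: hourly peaks
hourly = load.resample("h").max()

# 2. Outlier removal with the interquartile range (IQR) rule
q1, q3 = hourly.quantile([0.25, 0.75])
iqr = q3 - q1
outliers = (hourly < q1 - 1.5 * iqr) | (hourly > q3 + 1.5 * iqr)
hourly[outliers] = np.nan

# 3. Fill missing values with spline interpolation (needs SciPy)
hourly = hourly.interpolate(method="spline", order=3)

# 4. Normalize to [0, 1]
scaled = (hourly - hourly.min()) / (hourly.max() - hourly.min())
print(scaled.min(), scaled.max())
```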
Methodology Model Design
Model design: ν-SVR
Using LIBSVM in Python [2],
The radial basis function (RBF) kernel:

K(x_i, x_j) = exp(−γ ||x_i − x_j||²)   (3)

γ, ν, C are the key hyperparameters to be optimised [2].

Table: Grid search intervals for the ν-SVR model

Hyperparameter | Interval
γ              | ]0, 5]
ν              | ]0, 1]
C              | ]0, 10]
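A grid search over these three intervals can be sketched with scikit-learn's LIBSVM wrapper; the grid resolution below is an assumption (the slides do not state the step sizes), and the data are synthetic:

```python
# Grid search sketch over the nu-SVR hyperparameters (assumptions:
# scikit-learn available; grid values and data are illustrative only).
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import NuSVR

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(150, 8))
t = np.sin(X[:, 0] * 6) + 0.1 * rng.normal(size=150)

param_grid = {
    "gamma": [0.5, 1.0, 5.0],    # drawn from ]0, 5]
    "nu":    [0.2, 0.5, 0.8],    # drawn from ]0, 1]
    "C":     [1.0, 5.0, 10.0],   # drawn from ]0, 10]
}
search = GridSearchCV(NuSVR(kernel="rbf"), param_grid, scoring="r2", cv=3)
search.fit(X, t)
print(search.best_params_, round(search.best_score_, 3))
```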
Methodology Model Design
Model design: RNN
Using PyBrain in Python [6],
Number of neurons in the input layer N_in = 8,
Number of neurons in the output layer N_out = 1,
Number of hidden layers = 1; a single hidden layer "can approximate any function that contains a continuous mapping from one finite space to another" [3],
Number of neurons in the hidden layer using the geometric pyramid rule:

N_hid = √(N_in · N_out) ≈ 3   (4)

Activation function: the sigmoid function

σ(x) = 1 / (1 + exp(−x))   (5)
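Equations (4) and (5) are short enough to verify directly; this quick check confirms the chosen layer size and the sigmoid's range:

```python
# Geometric pyramid rule and sigmoid activation, Eqs. (4)-(5).
import math

n_in, n_out = 8, 1
n_hid = round(math.sqrt(n_in * n_out))   # sqrt(8) ≈ 2.83, rounds to 3
print(n_hid)                             # 3

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0.0))                      # 0.5; outputs always lie in (0, 1)
```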
Methodology Training and Testing
Training and Testing
Training both models on observations from the first four years (2013-2016),
Testing their performance on the last year:
LIBSVM uses the coefficient of determination R²:

R² = 1 − Σ_{i=1}^{n} (t_i − y_i)² / Σ_{i=1}^{n} (t_i − t̄)²   (6)

PyBrain (RNN) uses the mean squared error (MSE):

MSE = (1/n) Σ_{i=1}^{n} (t_i − y_i)²   (7)

where t_i is the target or expected value, y_i is the predicted value and t̄ is the mean value of the target set.
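The two test metrics of Eqs. (6) and (7), written out directly for reference:

```python
# Coefficient of determination and mean squared error, Eqs. (6)-(7).
import numpy as np

def r2(t, y):
    t, y = np.asarray(t, float), np.asarray(y, float)
    return 1.0 - np.sum((t - y) ** 2) / np.sum((t - t.mean()) ** 2)

def mse(t, y):
    t, y = np.asarray(t, float), np.asarray(y, float)
    return np.mean((t - y) ** 2)

t = [1.0, 2.0, 3.0, 4.0]
print(r2(t, t))                       # 1.0 for a perfect prediction
print(mse(t, [1.0, 2.0, 3.0, 5.0]))   # 0.25
```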
Methodology Forecasting
Forecasting
Start
Initialization: input the load and exogenous factors at hour H − 1; set counter i = 0; set max-iterations.
Forecast: estimate the load for the next hour (H + 1) and append it to the forecast list.
Incrementation: i = i + 1.
If i ≤ max-iterations: concatenate the new value with the exogenous factors of the same time index and repeat the forecast step.
Otherwise: scale back the forecasted values.
End
Methodology Evaluation Metrics
Evaluation Metrics for Forecasts Accuracy
Mean absolute percentage error (MAPE):

MAPE = ( (1/n) Σ_{i=1}^{n} |t_i − y_i| / |t_i| ) × 100   (8)

Mean absolute error (MAE):

MAE = (1/n) Σ_{i=1}^{n} |t_i − y_i|   (9)

Root mean square error (RMSE):

RMSE = √( (1/n) Σ_{i=1}^{n} (t_i − y_i)² )   (10)
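The three accuracy metrics of Eqs. (8)-(10), written out directly:

```python
# Forecast accuracy metrics, Eqs. (8)-(10).
import numpy as np

def mape(t, y):
    t, y = np.asarray(t, float), np.asarray(y, float)
    return np.mean(np.abs(t - y) / np.abs(t)) * 100

def mae(t, y):
    t, y = np.asarray(t, float), np.asarray(y, float)
    return np.mean(np.abs(t - y))

def rmse(t, y):
    t, y = np.asarray(t, float), np.asarray(y, float)
    return np.sqrt(np.mean((t - y) ** 2))

t, y = [100.0, 200.0], [110.0, 190.0]
print(mape(t, y))   # (10/100 + 10/200)/2 * 100 = 7.5
print(mae(t, y))    # 10.0
print(rmse(t, y))   # 10.0
```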
Results
Table of Contents
1 Introduction
2 Methodology
3 Results
  ν-SVR Training and Test Performance
  RNN Training and Test Performance
  ν-SVR & RNN Forecasting Results
  Combined ν-SVR−RNN Forecasts
  Discussion
4 Conclusions and Future Work
Results ν-SVR Training and Test Performance
ν-SVR Training and Test Performance
Table: Best configuration of the ν-SVR model.

Hyperparameter | Value
γ              | 5
ν              | 0.8
C              | 7.743

Model accuracy on the test set: R² = 0.96
Figure: Contour plot of SVR model accuracy (R²) for ν = 0.8.
Results RNN Training and Test Performance
RNN Training and Test Performance
Figure: RNN training and test results.
After 20 epochs, the RNN learning process converged to an MSE on the scaled test set equal to 2 × 10⁻³.
Results ν-SVR & RNN Forecasting Results
ν-SVR & RNN Forecasting Results
Figure: One week-ahead load forecasting using ν-SVR and RNN.
Results ν-SVR & RNN Forecasting Results
Forecast Accuracy
(a) One Day-ahead forecast. (b) One Week-ahead forecast.
Figure: Forecast accuracy of ν-SVR and RNN models.
Results Combined ν-SVR−RNN Forecasts
Combined ν-SVR−RNN Forecasts
Figure: A week ahead load forecasting using combined ν-SVR and RNN.
Results Combined ν-SVR−RNN Forecasts
Forecast Accuracy (combined Model)
(a) Day-ahead forecast. (b) Week-ahead forecast.
Figure: Forecast accuracy of the combined ν-SVR−RNN model.
Results Discussion
Discussion
1 The under-performance of the RNN is due to two main limiting factors:
Vanishing gradient problem: the incapacity of RNNs to accurately capture information in a long sequence of input data,
The gradient descent optimisation method does not guarantee reaching the global minimum of the loss function.
2 Advantage of the structural risk minimisation principle (in SVM) over the empirical risk minimisation principle (in RNN).
3 Although better accuracy can be achieved by combining forecasts, it is recommended to consider only the forecasts of the ν-SVR model [1].
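The slides do not state which combination scheme was used; a common baseline, sketched here purely as an assumption, is a simple (optionally weighted) average of the two models' forecasts:

```python
# Illustrative forecast combination (assumption: simple weighted average;
# the thesis's actual combination scheme is not specified in these slides).
import numpy as np

def combine(svr_forecast, rnn_forecast, w=0.5):
    """Weighted average of the two forecasts (w = weight on the SVR model)."""
    svr = np.asarray(svr_forecast, float)
    rnn = np.asarray(rnn_forecast, float)
    return w * svr + (1.0 - w) * rnn

print(combine([100.0, 120.0], [110.0, 100.0]))   # [105. 110.]
```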
Conclusions and Future Work
Table of Contents
1 Introduction
2 Methodology
3 Results
4 Conclusions and Future Work
Conclusions and Future Work
Conclusion
1 Data preprocessed; two STLF models designed, trained and tested; one week-ahead load forecasting performed;
2 Forecasting results:
ν-SVR & RNN give good results,
ν-SVR outperforms RNN,
RNN is sensitive to error propagation,
Both models are able to detect the period of the daily peak load.
3 The combined ν-SVR−RNN model improves the day-ahead forecasting accuracy, but it is recommended to consider the results of the ν-SVR model.
Conclusions and Future Work
Future Work
Determine the optimal size of the training and test sets for similar tasks in order to reduce the time and computational cost.
Test other RNN models such as long short-term memory (LSTM) networks, which were introduced to overcome the vanishing gradient problem.
Estimate the expected energy savings from the implementation of such STLF models.
Extend to other applications (heat demand, renewable energy production, · · · ).
References
[1] Cruz E. Borges, Yoseba K. Penya, and Ivan Fernandez. Optimal combined short-term building load forecasting. Innovative Smart Grid Technologies Asia (ISGT), pages 1–7, 2011.
[2] Chih-Chung Chang and Chih-Jen Lin. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1–27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[3] Jeff Heaton. Introduction to Neural Networks with Java. Heaton Research, Inc., 2008.
[4] Chih-Wei Hsu, Chih-Chung Chang, Chih-Jen Lin, et al. A practical guide to support vector classification. 2003.
[5] Timothy Masters. Practical Neural Network Recipes in C++. Morgan Kaufmann, 1993.
[6] Tom Schaul, Justin Bayer, Daan Wierstra, Yi Sun, Martin Felder, Frank Sehnke, Thomas Rückstieß, and Jürgen Schmidhuber. PyBrain. Journal of Machine Learning Research, 11:743–746, 2010.
[7] Alex J. Smola and Bernhard Schölkopf. A tutorial on support vector regression. Statistics and Computing, 14(3):199–222, 2004.