Abstract — The increased integration of wind power into the
electric grid poses many challenges to power system
operators, mainly because wind power generation is variable
and hard to predict. Thus, an accurate wind power forecast
is imperative for system operators, aiming at an efficient and
economical operation and integration of wind power into the
power system.
This work addresses the forecasting of short-term wind
speed and short-term wind power one hour ahead,
combining Artificial Neural Networks (ANN) with
optimization techniques on real historical wind speed and wind
power data. Levenberg-Marquardt (LM) and
Particle Swarm Optimization (PSO) are used as training
algorithms to update the weights and biases of the ANN,
establishing two forecasting approaches: ANN-LM and
ANN-PSO.
The forecasting performance of the proposed
models is compared with each other and with the benchmark
persistence model.
Test results show higher performance for the ANN-LM
forecasting model, with accurate and reliable wind speed and
wind power prediction.
Keywords: Short-term wind speed and power forecast, Artificial
Neural Network, Levenberg-Marquardt, Particle Swarm
Optimization
1. INTRODUCTION
Wind power is the fastest growing source of renewable energy
in the world. It represents a clean and sustainable source of
energy, and is in abundant supply, which helps to explain the
growth in installed capacity of wind power plants in recent
years. This implies the need to efficiently integrate the power
generated from wind energy into existing power systems.
However, the increase in wind power penetration requires a
number of issues to be addressed. Since wind power has a
cubic relationship with wind speed, any error in the wind speed
forecast leads to a larger error in wind power production. This
dependency on the stochastic nature of wind speed also causes
uncertainty in wind power production, and unexpected
variations of wind power output may increase the operating
costs of the overall power system. Thus, the use of accurate
short-term wind power forecasting techniques is crucial in the
planning of economic dispatch, aiming at an efficient and
economical wind power integration and operation. This
helps to mitigate the undesirable effects of wind fluctuations
in the operation of power systems, namely by reducing the
spinning reserve margin and increasing wind power penetration.
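The effect of the cubic relationship on forecast errors can be illustrated with a short calculation. This is a minimal sketch that assumes power is exactly proportional to the cube of wind speed (which only holds between cut-in and rated speed); the function name is illustrative and not part of the original work.

```python
def relative_power_error(speed_error: float) -> float:
    """Relative error in power P ~ v**3 caused by a relative error in speed v."""
    return (1.0 + speed_error) ** 3 - 1.0


# A few relative wind speed errors and the power errors they would imply.
for err in (0.05, 0.10, -0.10):
    print(f"speed error {err:+.0%} -> power error {relative_power_error(err):+.1%}")
```

Under this assumption, a 10% overestimate of wind speed becomes roughly a 33% overestimate of power, which is why small speed errors matter.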
Recently, with the development of Artificial Intelligence (AI),
various new AI methods for wind speed and power prediction
have been developed and are being proposed. AI methods
mimic the learning process of the brain to discover the relations
between the variables of a system [1]. The newly developed AI
methods include Artificial Neural Networks (ANN) and
evolutionary techniques, such as Particle Swarm Optimization
(PSO).
Other new methods are catching researchers’ attention, namely
data mining, neural networks, fuzzy logic and neuro-fuzzy,
evolutionary algorithms, and some hybrid methods. Also,
wavelets and Markov chains are being used to capture the
relevant patterns of the time series and act as a pre-processing
filter.
As far as the statistical models are concerned, a key factor is the
computation of the parameters or the weight coefficients of the
model. This process can be enhanced by using optimization
techniques. This thesis aims at assessing the use
of optimization techniques to tune the parameters of
statistical wind forecast models. The study focuses on the use
of ANN to predict wind power, as this technique has proved to
be very promising. Levenberg-Marquardt (LM)
optimization and PSO are used as the weight coefficient
optimization methods.
In order to achieve the proposed objectives, the study is
organized in the following main tasks:
- To investigate ANN, PSO and LM.
- To implement wind speed forecast ANN-LM and ANN-PSO
models and compare their performance using accuracy
indexes.
- To assess whether other explanatory variables (such as pressure,
temperature and humidity) can improve the models’ performance.
Wind power forecast using Neural Networks
tuned with advanced optimization techniques
Gonçalo Miguel Reis de Pinho de Meneses Nazaré
Instituto Superior Técnico, Lisboa, Portugal
- To propose a wind power forecast model based on the
conclusions achieved in the previous tasks and to validate it
against experimental results.
As mentioned, for wind speed analysis, two different
forecasting approaches are studied: ANN together with
Levenberg-Marquardt optimization (ANN-LM) and ANN with
PSO (ANN-PSO). Firstly, the ANN-LM forecasting methodology
is applied to data obtained from the IST automatic weather station
(EMA), and other available weather parameters are studied in
order to assess how they can improve wind speed prediction.
Then, PSO is combined with ANN in order to determine whether the
optimization of the ANN weights and biases can be improved.
Based on the conclusions of this study, the method that proved
better is applied to wind power prediction, using data collected
from Redes Energéticas Nacionais (REN), containing injected
wind power in the Portuguese power system. A detailed study
of the wind power data from year 2014 is made in order to
establish some links between the forecasting errors and the
actual wind power data.
The effectiveness and efficiency of the proposed wind speed
and wind power forecasting strategy are demonstrated by
comparing the results with the widely used benchmark
persistence method.
2. TIME SERIES FORECASTING MODELS
This section presents the two forecasting models to be compared,
the ANN and the persistence model. Two techniques are presented to
update the ANN weights and biases: Levenberg-Marquardt
and PSO.
2.1 PERSISTENCE MODEL
The persistence method is the most common among time
series forecasting models. It is called the naive predictor
because the prediction is simply the last measured
value. This method is very good for short-term predictions (from
seconds up to a few hours) but lacks accuracy on medium and
long time-scale forecasts (from a few hours up to days ahead).
It is usually used as a benchmark for
other forecasting models, so it is also used in this thesis.
For a time series $Y_t$, given a historical set of data
$H_t = \{Y_0, Y_1, Y_2, \ldots, Y_t\}$, the forecast of the next value of the series by
a persistence process p is given by equation (1):
$$ p(H_t) = Y_t \tag{1} $$
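Equation (1) translates almost directly into code. The sketch below is illustrative only; the function names are assumptions and the wind speed values are made up for the example.

```python
def persistence_forecast(history):
    """Forecast the next value as the last observed one: p(H_t) = Y_t."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return history[-1]


def one_step_persistence(series):
    """One-step-ahead forecasts for series[1:], each equal to the previous value."""
    return list(series[:-1])


wind_speed = [3.1, 4.0, 5.2, 4.8]  # made-up hourly values in m/s
print(persistence_forecast(wind_speed))
print(one_step_persistence(wind_speed))
```

Because it needs no training, this predictor is a convenient baseline: any trained model is expected to beat it before being considered useful.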
2.2 ARTIFICIAL NEURAL NETWORKS
ANNs are structures based on the neural structure of the brain.
They are usually used in applications where there is no known
relationship between input and output variables, where
constant adaptation to frequent variations of these
variables is necessary, or where high noise hinders
good system identification. Therefore, they are a powerful,
general and flexible modelling tool for forecasting purposes [2].
The learning process to adjust the weight factor in the
connections between neurons can be supervised or
unsupervised. In supervised learning, the weights are adjusted
through an iterative process, in order to minimize a
predetermined error function. A commonly used error function
is the mean-squared error, which aims at minimizing the
average squared error between the network output and the target
value. In unsupervised learning, only the input variables are
presented to the network which is trained to identify different
classes of data [3].
ANNs may also be classified as feedforward or recurrent. A
feedforward neural network (FNN) is an artificial neural network
where connections between units do not form a cycle, i.e. the
information moves in only one direction. According to most of
the research done [4,5,6], this topology is usually applied in
wind speed and wind power forecasting, and for that reason an FNN
is used in this work. In a multilayer FNN, the
network is divided into three layers: input layer, hidden layer and
output layer, as shown in Figure 1. One can choose the number
of hidden layers in a neural network but, according to [2], one
hidden layer is sufficient to approximate any complex nonlinear
function with any desired accuracy.
Figure 1 – ANN layers: each circular node represents an
artificial neuron and an arrow between layers represents a
connection from the output of one neuron to the input of
another.
Figure 2 illustrates an example of a single neuron model.
Figure 2 – General model of a Neuron (plot from [7]).
$$ a = f\left(\sum_j w_j p_j + b\right) \tag{2} $$
The output a of a node is the image of the weighted sum of all
inputs p, plus the bias b, under some activation function f. The most common
activation functions are linear, log-sigmoid and tan-sigmoid.
During the network training step the weights w and bias b are
updated for each unit. To perform this, the
Levenberg-Marquardt optimization algorithm is employed.
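The single neuron model of equation (2) can be sketched in a few lines. The activation names below follow the Matlab naming (tansig, logsig, purelin) for readability, but this is a plain Python illustration and not the toolbox code; the input and weight values are made up.

```python
import math


def purelin(n):
    """Linear activation."""
    return n


def logsig(n):
    """Log-sigmoid activation, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))


def tansig(n):
    """Tan-sigmoid activation, output in (-1, 1)."""
    return math.tanh(n)


def neuron_output(p, w, b, f):
    """a = f(sum_j w_j * p_j + b) for inputs p, weights w and bias b."""
    return f(sum(wj * pj for wj, pj in zip(w, p)) + b)


p = [1.0, 2.0]      # made-up inputs
w = [0.5, -0.25]    # made-up weights
for f in (purelin, logsig, tansig):
    print(f.__name__, neuron_output(p, w, 0.0, f))
```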
The LM algorithm is an approximation to Newton's method
and was designed to approach second-order training speed
without having to compute the Hessian matrix. When the
performance function has the form of a sum of squares, as
happens in neural networks, the Hessian matrix is
approximated by (3), which avoids computing second-order
partial derivatives.
$$ H = J^T J \tag{3} $$
The Levenberg-Marquardt algorithm consists in solving
equation (4):
$$ J^T e = \left[\mu I + J^T J\right] \delta \tag{4} $$
In equation (4):
- J is the N-by-W Jacobian matrix that contains the first-order partial
derivatives, where N is the number of entries in the training set
and W is the total number of parameters (weights and biases) of
the neural network;
- e is the error vector containing the output error for each input
vector used in training the network;
- I is the identity matrix;
- $\delta$ is the weight update vector to be found;
- $\mu$ is Levenberg’s damping factor.
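To make equation (4) concrete, the sketch below applies one LM update to a toy two-parameter model, y = w*x + b, instead of a full neural network. It assumes the sign convention e = target - output with J holding the derivatives of the output with respect to the parameters, so that the parameters are updated by adding δ; the source does not state its convention, and the function name and data are illustrative.

```python
def lm_step(xs, ts, w, b, mu):
    """One Levenberg-Marquardt update for the toy model y = w*x + b."""
    # Error vector e: target minus model output, one entry per sample.
    e = [t - (w * x + b) for x, t in zip(xs, ts)]
    # Each Jacobian row is [dy/dw, dy/db] = [x, 1], so J^T J is 2x2.
    a11 = sum(x * x for x in xs) + mu          # (mu*I + J^T J)[0][0]
    a12 = sum(xs)                              # off-diagonal term
    a22 = len(xs) + mu                         # (mu*I + J^T J)[1][1]
    g1 = sum(x * ei for x, ei in zip(xs, e))   # (J^T e)[0]
    g2 = sum(e)                                # (J^T e)[1]
    # Solve [mu*I + J^T J] delta = J^T e by Cramer's rule.
    det = a11 * a22 - a12 * a12
    dw = (g1 * a22 - a12 * g2) / det
    db = (a11 * g2 - a12 * g1) / det
    return w + dw, b + db


xs = [0.0, 1.0, 2.0, 3.0]
ts = [1.0, 3.0, 5.0, 7.0]                      # generated by t = 2*x + 1
print(lm_step(xs, ts, 0.0, 0.0, mu=0.0))       # Gauss-Newton limit
print(lm_step(xs, ts, 0.0, 0.0, mu=10.0))      # damped, shorter step
```

With µ = 0 the step reduces to Gauss-Newton, which solves this linear toy problem in one update; a larger µ shortens the step and moves it towards gradient descent.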
Lastly, one of the most important tasks in developing a good
time series forecasting model is the selection of the input
variables, which determines the architecture of the neural
network. Since there is no systematic approach to choosing
these input variables in Artificial Intelligence based models,
statistical methods are employed to find relevant inputs, as
suggested by [8]. These methods are the partial autocorrelation
function (PACF) and the cross-correlation function (XCF), whose
expressions are taken from [9].
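As an illustration of how a correlation function can rank candidate inputs, the sketch below computes a common textbook estimate of the sample cross-correlation at a non-negative lag. The exact expressions used in the work come from [9] and may differ; the function name and the data are assumptions made for this example, and the PACF is not covered here.

```python
import math


def cross_correlation(x, y, lag):
    """Sample cross-correlation between x[t] and y[t + lag], for lag >= 0."""
    n = len(x)
    if len(y) != n or not 0 <= lag < n:
        raise ValueError("series must have equal length and 0 <= lag < n")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    num = sum((x[t] - mean_x) * (y[t + lag] - mean_y) for t in range(n - lag))
    den = math.sqrt(sum((v - mean_x) ** 2 for v in x)
                    * sum((v - mean_y) ** 2 for v in y))
    return num / den  # undefined (division by zero) for a constant series


temperature = [12.0, 13.5, 15.0, 14.0, 12.5, 11.0]  # made-up values
wind_speed = [4.0, 4.2, 5.1, 5.6, 5.0, 4.1]         # made-up values
for lag in range(3):
    print(lag, round(cross_correlation(temperature, wind_speed, lag), 3))
```

Lags with a large absolute correlation would be natural candidates for network inputs.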
2.3 PARTICLE SWARM OPTIMIZATION
The PSO algorithm was first introduced by Eberhart and Kennedy
in 1995 to model social behaviour such as fish schooling and
bird flocking [10]. This algorithm is a population-based stochastic
optimization technique for continuous nonlinear functions and
is defined by the evolution of a population of particles,
represented as vectors in an N-dimensional space. Each particle
flies around the multidimensional search space with a velocity
that is continuously updated according to the particle’s own
experience and the experience of the particle’s neighbours or
of the entire swarm, as shown in Figure 3.
Figure 3 – Random generation of particles in an N-dimensional
search space searching for the best solution.
The procedure to implement the PSO algorithm can be defined in
the following steps:
Step 1 - Initialize the particle positions, $x$, and velocities, $v$, in a
D-dimensional problem, randomly and uniformly distributed
across the design space.
Step 2 - Update the velocities of all particles at iteration k+1
using the particles' fitness or objective values, which are
functions of the particles' current positions in the design space
at iteration k. The best previous position of a particle is Pbest
and the best particle among all particles is Gbest.
Then, through (5) and (6), the velocity and position of a
particle are updated, respectively.
$$ v_i(k+1) = \omega\, v_i(k) + c_1 r_1 \left(x_{Pbest,i}(k) - x_i(k)\right) + c_2 r_2 \left(x_{Gbest}(k) - x_i(k)\right) \tag{5} $$
$$ x_i(k+1) = x_i(k) + v_i(k+1) \tag{6} $$
Where:
- $v_i$ is the particle velocity;
- $x_i$ is the particle position;
- $c_1$ and $c_2$ represent the cognitive and social acceleration
coefficients, respectively;
- $r_1$ and $r_2$ are n-dimensional column vectors whose elements
are independent pseudo-random numbers drawn from a
uniform distribution U(0,1);
- $\omega$ is the inertia weight, which favours a
particle continuing to move in the direction it was following
in the previous iteration. High values of inertia lead to
global exploration and, conversely, small values of inertia lead to
local exploration.
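The velocity and position updates can be sketched for a single particle as follows. This is a minimal illustration, not the toolbox implementation used in the work; the function name, the `rng` argument and the numeric values are assumptions, and the random factors are drawn independently per dimension.

```python
import random


def update_particle(x, v, pbest, gbest, omega, c1=2.0, c2=2.0, rng=random):
    """Return the new (position, velocity) of one particle."""
    new_x, new_v = [], []
    for xd, vd, pd, gd in zip(x, v, pbest, gbest):
        r1 = rng.random()  # U(0,1) factor for the cognitive term
        r2 = rng.random()  # U(0,1) factor for the social term
        vd_next = omega * vd + c1 * r1 * (pd - xd) + c2 * r2 * (gd - xd)
        new_v.append(vd_next)
        new_x.append(xd + vd_next)
    return new_x, new_v


rng = random.Random(0)  # seeded generator so the run is repeatable
x, v = update_particle([0.2, -0.4], [0.0, 0.1],
                       pbest=[0.1, -0.3], gbest=[0.0, 0.0],
                       omega=0.9, rng=rng)
print(x, v)
```

In ANN training, each position vector would hold one candidate set of network weights and biases.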
Equation (7) gives the inertia weight at each iteration:
$$ \omega = \omega_{max} - \frac{\omega_{max} - \omega_{min}}{iter_{max}} \cdot iter \tag{7} $$
Where:
- $\omega_{max}$ represents the highest inertia value;
- $\omega_{min}$ represents the lowest inertia value;
- $iter$ represents the current iteration;
- $iter_{max}$ is the maximum number of iterations.
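The linearly decreasing inertia weight is a one-line function. The default values below are the initial and final inertia weights (0.9 and 0.4) and the iteration limit (2000) listed later in Table 3; the function name is an assumption.

```python
def inertia_weight(iteration, iter_max, omega_max=0.9, omega_min=0.4):
    """Inertia weight that decreases linearly from omega_max to omega_min."""
    return omega_max - (omega_max - omega_min) / iter_max * iteration


for k in (0, 500, 1000, 2000):
    print(k, inertia_weight(k, iter_max=2000))
```

The schedule starts with wide exploration and gradually shifts towards local refinement as the iteration count grows.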
Figure 4 illustrates how a particle updates its position and
velocity in each iteration. The social and global influence on a
particle as it moves towards an optimal solution is
apparent.
Figure 4 – Velocity and position update in PSO (array 1 -
current motion influence, array 2 - particle memory influence,
array 3 - the swarm influence).
In this work, two different variants of the PSO algorithm are used: PSO
with local neighbourhood (Lbest) and PSO with global
neighbourhood (Gbest) [11]. The main difference between these two
variants is how each
particle moves through the D-dimensional space. In Gbest
PSO, each particle moves towards its best previous position
and towards the best particle in the entire swarm; in Lbest
PSO, each particle moves towards its best previous position
and towards the best particle in its restricted neighbourhood.
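The difference between the two variants comes down to which particle supplies the social attractor. The sketch below assumes a ring neighbourhood for Lbest (each particle sees itself and its immediate neighbours by index); the source does not specify the neighbourhood structure, so this choice and the function names are illustrative. Lower fitness is better, as with MSE.

```python
def global_best(positions, fitness):
    """Position of the best particle in the entire swarm (Gbest)."""
    best = min(range(len(positions)), key=lambda j: fitness[j])
    return positions[best]


def local_best(positions, fitness, i, radius=1):
    """Best position among particle i and its ring neighbours (Lbest)."""
    n = len(positions)
    neighbours = [(i + k) % n for k in range(-radius, radius + 1)]
    best = min(neighbours, key=lambda j: fitness[j])
    return positions[best]


positions = [[0.0], [1.0], [2.0], [3.0], [4.0]]   # made-up personal bests
fitness = [5.0, 1.0, 4.0, 3.0, 2.0]               # made-up MSE values
print(global_best(positions, fitness))
print(local_best(positions, fitness, i=3))
```

Gbest tends to converge faster, while Lbest spreads information more slowly through the swarm.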
3. APPLICATION TO WIND SPEED PREDICTION
This section discusses the ANN model used to perform wind
speed predictions. The available data was provided by the IST
automatic weather station (EMA), located in the South Tower of the
IST Alameda Campus, and contains mean hourly values of wind
speed, temperature, pressure and humidity for the years 2013 and
2014. The optimization of the ANN parameters was done
using two different techniques: Levenberg-Marquardt (LM)
and PSO.
3.1 PERFORMANCE EVALUATION
In order to evaluate the performance of the forecasting models,
two accuracy measures are computed: the Mean
Absolute Percentage Error (MAPE) and the Root Mean Square
Error (RMSE), presented in equations (8) and (9), respectively.
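$$ MAPE = \frac{1}{N} \sum_{i=1}^{N} \frac{\left|WS_i^{true} - WS_i^{forecast}\right|}{WS_i^{true}} \times 100\% \tag{8} $$
$$ RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(WS_i^{true} - WS_i^{forecast}\right)^2} \tag{9} $$
Where:
- $WS_i^{true}$ represents the actual wind speed at hour i;
- $WS_i^{forecast}$ is the predicted wind speed for hour i;
- N is the number of forecasted values.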
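Both accuracy measures can be written directly from equations (8) and (9). The function names and sample values below are illustrative. Note that MAPE is undefined when an actual value is zero, which this sketch does not guard against.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 * sum(abs(a - f) / a for a, f in zip(actual, forecast)) / n


def rmse(actual, forecast):
    """Root mean square error, in the units of the data."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


actual = [4.0, 5.0]     # made-up wind speeds in m/s
forecast = [3.0, 6.0]
print(mape(actual, forecast), rmse(actual, forecast))
```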
3.2 ANN-LM model
3.2.1 Forecasting methodology
With the available data, the main objective is to find how the
additional weather parameters, such as temperature, pressure
and humidity, can help improve wind speed prediction
when presented as inputs to the neural network. To verify this,
wind speed is combined with every subset of the other 3 weather
parameters, giving 8 different cases to be tested:
- Case 1: Wind speed prediction through wind speed;
- Case 2: Wind speed prediction through wind speed and
temperature;
- Case 3: Wind speed prediction through wind speed and
pressure;
- Case 4: Wind speed prediction through wind speed and
humidity;
- Case 5: Wind speed prediction through wind speed,
temperature and humidity;
- Case 6: Wind speed prediction through wind speed, pressure
and humidity;
- Case 7: Wind speed prediction through wind speed,
temperature and pressure;
- Case 8: Wind speed prediction through wind speed,
temperature, pressure and humidity.
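The 8 cases correspond to wind speed alone plus every subset of the three additional parameters (2^3 = 8). A short sketch can enumerate them; the ordering produced here differs from the case numbering above, and the names are illustrative.

```python
from itertools import combinations

EXTRA_PARAMETERS = ("temperature", "pressure", "humidity")


def input_cases():
    """Wind speed combined with every subset of the extra weather parameters."""
    cases = []
    for size in range(len(EXTRA_PARAMETERS) + 1):
        for extras in combinations(EXTRA_PARAMETERS, size):
            cases.append(("wind speed",) + extras)
    return cases


for case in input_cases():
    print(", ".join(case))
```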
Along with the weather parameters study, 7 different models
were tested in order to find the best number of past observations
which are relevant for future prediction. The models considered
are presented in Table 1:
Table 1 – Number of inputs for each prediction case.
Model          1         2        3        4       5       6       7
Training size  2 months  1 month  2 weeks  1 week  5 days  2 days  1 day
The simulations make use of the Neural Network
Toolbox for Matlab®. This toolbox allows one to
implement both the multilayer perceptron ANN and the
Levenberg-Marquardt algorithm. The functions used for the ANN
model implementation were the following:
- newff: creates a feedforward neural network; one can define
its structure: number of hidden layers, number of neurons in
each hidden layer, activation functions for the hidden and output
layers, and the sizes of the input and output targets;
- dividerand: separates the input and target vectors into three
sets;
- trainlm: network training function that updates weight and
bias values according to Levenberg-Marquardt optimization;
- sim: simulates a neural network, i.e. calculates the
network outputs for the presented inputs;
- mse: measures the network’s performance according to the
mean of squared errors (mse).
In the newff parameters, following [4,12], the
linear activation function is used for the hidden layer and the
tansig activation function is used for the output layer.
Firstly, the number of neurons in the hidden layer is set to 5 in
order to find the best training model. Then, to optimize the
number of neurons in the hidden layer, the best ANN model is
run with the number of neurons changing from 1 up to 20.
3.2.2 RESULTS AND DISCUSSION
In all simulations, the forecasting horizon is
24 hours and the predictions are made 1 hour ahead, which
corresponds to 24 predicted values. Following the strategy of other
papers [4,5,6,13], the results are separated by season.
The days chosen as representative of each season, and therefore
chosen for prediction, were the 1st of February for winter, the 1st
of May for spring, the 1st of August for summer and the 1st of
November for autumn. As mentioned before, the results obtained
are compared with the persistence model.
In order to find the best weather parameters and best training
size for wind speed prediction, the flowchart in Figure 5 is
employed. The results are presented in Section 3.3.4 of the
dissertation report.
The results show that models with short training size (models 4
to 7) perform poorly in general for the days
considered, i.e., worse than the persistence model. On the other
hand, large training size models (models 1 and 2) also
perform poorly in general and are again outperformed by the
persistence model. The strategy applied to find the best training
size was to look for a model that outperforms
persistence for all representative season days, with one
particular weather parameters case. This only happens with
model 3 (2 weeks training) for cases 2 (wind and temperature),
6 (wind, pressure and humidity), 7 (wind, temperature and
pressure) and 8 (wind, temperature, pressure and humidity),
where the RMSE and MAPE values are better than the
persistence ones. For this reason, model 3 is chosen as the training
size model.
Figures 6 and 7 present the MAPE and RMSE results
for model 3 for all representative season days, in order to
evaluate the best weather parameters case. All 8 cases are
presented in Figures 6 and 7 for a clear comparison,
although cases 2, 6, 7 and 8 were the best cases found.
Figure 5- Flowchart for ANN algorithm to evaluate the
best case and model.
Figure 6 – 1 hour ahead daily MAPE comparison for model 3
(2 weeks training).
Figure 7 - 1 hour ahead daily RMSE comparison for model 3
(2 weeks training).
The MAPE and RMSE results in Figures 6 and 7 show the
improvement in wind speed prediction when additional weather
parameters are employed. In Figure 6, the best MAPE result
occurs in case 8 for 1st February (16.75%), in case 2 for 1st
May (12.21%) and 1st August (9.33%), and in case 7 for 1st
November (38.89%).
Figure 7 indicates that the best RMSE values occur in case
8 for 1st February (1.164 m/s), in case 2 for 1st May (0.757 m/s)
and 1st August (0.560 m/s), and in case 4 for 1st November
(0.860 m/s).
The conclusion is that no single combination of weather
parameters gives the best results for all chosen days, so
it is difficult to choose the best case. However, cases 2, 7
and 8 point to similar MAPE and RMSE results. This leads to the
conclusion that, besides temperature, the weather parameters
humidity and pressure do not contribute significantly to a better
wind speed forecast for the data studied. Therefore, the best
ANN forecast model chosen is given by case 2, which
represents wind speed prediction through wind speed and
temperature, along with model 3, which has a 2 weeks training
size.
In order to optimize the number of neurons in the hidden layer,
an algorithm was developed in which the number of neurons
varies from 1 up to 20 (with case 2 and model 3). The results
are presented in Section 3.3.4 of the dissertation report. The test
results show that, with this ANN structure, the best number of
neurons in the hidden layer is 4, with an average MAPE
improvement of 7.2% and an average RMSE improvement of
11.9% when compared to the persistence model. The
corresponding wind forecast errors are presented in
Table 2 for the best ANN topology found.
Table 2: Best wind forecast errors result for best ANN
topology.
             1st February  1st May  1st August  1st November
RMSE [m/s]   1.156         0.777    0.571       0.865
MAPE [%]     16.57         12.35    9.57        40.78
Figure 8 presents a graphical comparison of the best
ANN topology forecast and the actual wind speed values for the 1st of
February (winter).
Figure 8 – Actual wind speed (blue) together with the best
ANN topology (red) forecasted for 1st February
3.3 ANN-PSO model
3.3.1 Forecasting methodology
To apply the PSO technique in neural network training, the add-in
by Tricia Rambharose was employed [14]. This add-in to the
PSO Research Toolbox [15] allows an ANN to be trained using
the Particle Swarm Optimization technique. Therefore, instead
of the trainlm function, the following function is used:
- train: trains the neural network according to a specific training
function and training parameters. This way, it is possible to
update weight and bias values using a swarm optimization
approach by choosing a function provided by the Research
Toolbox, trainpso, as the training function.
This toolbox allows choosing, among others, the following
features: PSO topology (Local best or Global best), number of
particles, cognitive and social acceleration, initial and final
inertia weight, static inertia weight and number of iterations.
The purpose of PSO in ANN training is to find the best set of weights
(particle positions), with several particles moving through the
search space in search of the best solution. The dimension of the
search space is the total number of weights and biases, which is
determined by the ANN structure.
Two different topologies were applied (Local best and Global
best, described in Section 2.3), in order to try to improve wind
speed prediction for the best topology found with the ANN model. In
each topology, five different numbers of particles were tested
(10, 20, 30, 40 and 50) with the PSO network
parameters presented in Table 3.
Table 3 – PSO network parameters for wind speed forecast.
Parameter                                     Value
Initial inertia weight $\omega_{max}$         0.9
Final inertia weight $\omega_{min}$           0.4
Static inertia weight $\omega$                1.4
Maximum number of iterations $iter_{max}$     2000
Cognitive acceleration $c_1$                  2
Social acceleration $c_2$                     2
Search space range                            (-1,1)
As mentioned before, PSO is used to update the weights and
biases in the ANN training procedure as an alternative to the
Levenberg-Marquardt optimization technique. This is possible because the
performance function mse is the objective function of the PSO
algorithm.
For the development of ANN-PSO algorithm the flowchart in
Figure 9 is followed.
The stopping criterion is presented in equation (10):
$$ \left|f(x_i(k)) - f(x_i(k-m))\right| \le \varepsilon, \quad m = 1, 2, \ldots, iter_{max} \tag{10} $$
Where:
- $f(x_i(k))$ represents the fitness function (MSE) of particle $x_i$
at iteration k;
- $f(x_i(k-m))$ represents the fitness function (MSE) of
particle $x_i$ at iteration k-m;
- $iter_{max}$ represents the maximum number of iterations;
- $\varepsilon$ is specified as $10^{-6}$ in order to achieve PSO convergence.
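One possible reading of this stopping criterion is a check on how much the fitness has changed over the last m iterations. The sketch below follows that interpretation, which is an assumption since the source does not detail how m is chosen; the function name and the fitness history are illustrative.

```python
def has_converged(fitness_history, m, eps=1e-6):
    """True when the fitness changed by at most eps over the last m iterations."""
    if len(fitness_history) <= m:
        return False  # not enough iterations recorded yet
    return abs(fitness_history[-1] - fitness_history[-1 - m]) <= eps


history = [0.80, 0.35, 0.2000004, 0.2000001]  # made-up MSE values per iteration
print(has_converged(history, m=1))
print(has_converged(history, m=2))
```

In practice such a test is combined with the iteration limit, so training stops at whichever condition is met first.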
3.3.2 Results and discussion
As mentioned before, the hybrid ANN-PSO model is used to try to
improve on the best ANN model presented. To compare both
forecasting models, the same strategy is employed regarding the
chosen days to forecast and the forecasting horizon, presented
in Section 3.2.2. The flowchart in Figure 9 is followed for the
Global best and Local best topologies. The results are presented
in Section 3.4.3 of the dissertation report.
The errors obtained with the ANN-PSO algorithm are in general
worse than those of the best ANN model found in Section 3.2.2 and also
worse than those of the persistence model. Since no ANN-PSO configuration
outperforms the best ANN model in all seasons, it can be concluded
that using the PSO technique to optimize weights and biases in
neural network training is less effective than the
Levenberg-Marquardt backpropagation algorithm
when applied to EMA data.
ANN-LM performs slightly better than ANN-PSO. Therefore,
ANN-LM is used hereafter to perform wind power
predictions.
Figure 9 - PSO-ANN flowchart.
4. APPLICATION TO WIND POWER PREDICTION
This section addresses the application of ANN-LM to wind
power forecasting. The data available for wind
power forecasting was provided by Redes Energéticas Nacionais
(REN), the Portuguese Transmission System Operator (TSO).
The data contains records of injected wind power in
the Portuguese power system, as 15-minute averages from
2010 up to 2014, from all wind farms in Portugal that have
telemetry with REN. It also contains, at the end
of each year, the installed wind power capacity of all wind
farms in Continental Portugal.
4.1 Performance evaluation
In wind power forecasting, two accuracy measures are
used to evaluate the performance of the forecasting models: the
Mean Absolute Percentage Error (MAPE) and the Normalized Root
Mean Square Error (NRMSE), presented in equations (11) and
(12), respectively.
$$ MAPE = \frac{1}{N} \sum_{i=1}^{N} \frac{\left|WP_i^{true} - WP_i^{forecast}\right|}{WP_i^{true}} \times 100\% \tag{11} $$
$$ NRMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(\frac{WP_i^{true} - WP_i^{forecast}}{WP_N}\right)^2} \times 100\% \tag{12} $$
Where:
- $WP_i^{true}$ represents the real wind power at hour i;
- $WP_i^{forecast}$ is the predicted wind power for hour i;
- N is the number of forecasted values;
- $WP_N$ represents the installed wind power capacity.
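Equation (12) differs from the plain RMSE only in the normalization by installed capacity. The sketch below uses the 4453 MW capacity value adopted in Section 4.2.2; the function name and the power values are made up for illustration.

```python
import math


def nrmse(actual, forecast, installed_capacity):
    """Root mean square error normalized by installed capacity, in percent."""
    n = len(actual)
    squared = sum(((a - f) / installed_capacity) ** 2
                  for a, f in zip(actual, forecast))
    return 100.0 * math.sqrt(squared / n)


actual = [1000.0, 2000.0]      # made-up injected wind power in MW
forecast = [1100.0, 1900.0]
print(nrmse(actual, forecast, installed_capacity=4453.0))
```

Normalizing by capacity makes errors comparable across systems of different size and avoids the blow-up that MAPE shows when actual power is low.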
4.2 ANN-LM model
4.2.1 Forecasting methodology
In order to choose the number of past observations that lead to
a better wind power forecast, 7 different models were
employed. Table 4 presents the training size for each model
considered.
Table 4 – Training size for each model.
Model          1        2       3       4       5      6         7
Training size  2 weeks  1 week  5 days  2 days  1 day  12 hours  6 hours
Once again, following methodology from 3.2.1, the number of
neurons in the hidden layer is set to 5 in order to find the best
training model. Then, to optimize the number of neurons in the
hidden layer, the best ANN model is run with the number of
neurons changing from 1 up to 20.
4.2.2 Results and discussion
In all simulations, the forecasting horizon is 24 hours;
therefore, with 15-minute average wind power records, the
number of predicted values is 96. The days chosen for
wind power prediction follow the strategy in Section 3.2.2,
which is to separate the results by season by predicting
the following representative days from year 2014: 1st of
February for winter, 1st of May for spring, 1st of August for
summer and 1st of November for autumn. The results are
compared with the persistence model.
It is also important to mention that in the NRMSE calculations
(equation (12)), the $WP_N$ value (installed wind power capacity)
was set to 4453 MW. This is the average of
the installed wind power capacity at the end of year 2013 (4364
MW) and 2014 (4541 MW), i.e. 4452.5 MW rounded up, since it is
not possible to know the
value of the installed capacity at each forecast instant.
The results are presented in Section 4.3.3 of the dissertation
report. Models 1, 2 and 3 perform better
than persistence, outperforming it for the representative days of
each season. Figure 10 and Figure 11 show a comparison of the
NRMSE and MAPE indexes for these models.
Figure 10 – NRMSE comparison of Persistence with ANN
models 1 (2 weeks training), 2 (1 week training) and 3 (5 days
training).
Figure 11 – MAPE comparison of Persistence with ANN
models 1 (2 weeks training), 2 (1 week training) and 3 (5 days
training).
The results in Figures 10 and 11 show the improvements in
wind power forecast, when compared with the persistence
model.
Analyzing Figure 10, model 2 presents the best NRMSE results
for 1st February, 1st May and 1st August and for 1st November
the best NRMSE result is from model.
From the results in Figure 11, model 2 again presents better
MAPE results for 1st February and 1st May, and model 1 has
better MAPE results for 1st August and 1st November.
All in all, models 1 and 2 have the best NRMSE and MAPE
results for each season, and choosing the best ANN model is
difficult, since they present very similar results. However,
model 2 has, in general, lower NRMSE and MAPE results and
also lower average CPU times. Therefore, model 2 (1 week training
size) was chosen.
In order to optimize the number of neurons in the hidden layer,
an algorithm was developed in which the number of neurons
varies from 1 up to 20 (with model 2). The results
are presented in Section 4.3.3 of the dissertation report. The test
results show that, with this ANN structure, the best number of
neurons in the hidden layer is 3, with an average MAPE
improvement of 28% and an average NRMSE improvement of
29.9% when compared to the persistence model. Therefore, the
number of neurons in the hidden layer was set
to 3.
The corresponding wind power forecast errors are
presented in Table 5 for the best ANN topology found.
Table 5: Best wind power forecast errors result for best ANN
topology.
            1st February  1st May  1st August  1st November
NRMSE [%]   1.12          0.60     0.87        0.25
MAPE [%]    1.45          2.54     2.46        4.67
Figure 12 presents a graphical comparison of the best
ANN topology forecast and the actual wind power values for the 1st of
February (winter).
Figure 12 – Actual wind power (blue) together with the best
ANN topology (red) forecasted for 1st February
4.3 Detailed wind power analysis
Now that the best ANN-LM topology has been established, a more
detailed analysis is made of the wind power data from REN for year
2014. All data from this year was predicted, which corresponds
to 35040 forecasted values (15-minute average measures over one
year).
For the following analysis, only MAPE is considered as the
accuracy measure.
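The sample counts quoted in this section follow from the 15-minute resolution of the data, as this short check shows (2014 is not a leap year; the variable names are illustrative):

```python
SAMPLES_PER_HOUR = 60 // 15                    # one record every 15 minutes
samples_per_day = 24 * SAMPLES_PER_HOUR        # values forecast per day
samples_per_year = 365 * samples_per_day       # values forecast for 2014

print(samples_per_day, samples_per_year)
```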
4.3.1 Correlation MAPE versus Wind power
To evaluate the relation between wind power and MAPE, three
scatter graphs were produced. These scatter graphs present
hourly, daily and monthly MAPE indexes together with the hourly,
daily and monthly averages of actual wind power data,
respectively. The scatter graphs for the hourly and daily
analysis are presented in Section 3.3.4 of the dissertation
report. Figure 13 shows the scatter plot for the monthly analysis
(12 values), which has the highest R-squared value of all scatter
plots, equal to 0.376. This indicates that the correlation with
actual wind power is strongest for the monthly MAPE.
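For reference, the R-squared value of a scatter plot with a
straight trend line can be computed as the squared Pearson
correlation between the two variables. The sketch below is a
minimal pure-Python version; the function name and the sample
points are hypothetical and are not the data of Figure 13.

```python
def r_squared(x, y):
    """R-squared of the least-squares line through (x, y), i.e. squared Pearson correlation."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must not be constant")
    return sxy * sxy / (sxx * syy)


# Hypothetical points (illustration only).
example_x = [1.0, 2.0, 3.0, 4.0]
example_y = [1.0, 3.0, 2.0, 4.0]
example_r2 = r_squared(example_x, example_y)
print(round(example_r2, 2))
```

With the 12 monthly MAPE values as x and the 12 monthly averages
of actual wind power as y, the same function would give the
R-squared of the monthly scatter plot.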
[Bar charts of NRMSE [%] and MAPE [%] for the persistence model
and models 1, 2 and 3 on 1st February, 1st May, 1st August and
1st November.]
Figure 13 – Scatter graph for monthly MAPE and actual wind
power.
4.4.2 MAPE versus time
Figure 14 shows the evolution of the daily MAPE over the year
2014. The days marked in this figure correspond to the highest
MAPE result (10th October) and the lowest MAPE result (17th
October).
Figure 14 – Daily MAPE for year 2014.
In order to understand why the daily MAPE was so high on 10th
October, Figure 4.13 and Figure 4.14, from Section 4.4.2 of the
dissertation report, show the actual wind power and the ANN-LM
forecasted values for this day.
To sum up, the level of actual wind power has a large impact on
the MAPE analysis, since higher MAPE results are expected for
lower wind power values and lower MAPE results for higher wind
power values.
Figure 15 presents the absolute difference between actual and
forecasted daily wind power. To compute it, the average absolute
difference is first calculated for each hour, so that negative
terms do not cancel positive ones when computing the daily
difference. The days marked in this figure correspond to the
highest (11th February) and lowest (17th April) absolute
difference between actual and forecasted daily wind power.
Comparing Figure 15 with Figure 14, it is evident that the
maximum and minimum days marked in the two figures are not the
same. This leads to the conclusion that MAPE values do not always
correspond directly to the absolute difference between actual and
forecasted wind power, because MAPE also depends strongly on the
actual wind power.
Figure 15 – Absolute difference between actual and forecasted
daily wind power for year 2014.
Figures 4.16 and 4.17 from the dissertation report show the
actual wind power and the ANN-LM forecasted values for 11th
February and 17th April, respectively. This analysis reinforces
the conclusion above: the level of actual wind power must be
known in order to interpret MAPE results.
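The dependence of MAPE on the level of actual wind power can be
illustrated with a small sketch. It assumes the usual definition
of MAPE as the mean of |actual - forecast| / |actual|, expressed
in percent; the function names and the two sample series are
hypothetical and are not data from REN.

```python
def mape(actual, forecast):
    """Mean absolute percentage error [%] (assumed standard definition)."""
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


def mean_abs_diff(actual, forecast):
    """Mean absolute difference, in the units of the input series."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Two hypothetical periods with the same 10 MW error at every step.
low_actual = [100.0, 120.0, 80.0, 100.0]
low_forecast = [110.0, 110.0, 90.0, 90.0]
high_actual = [1000.0, 1200.0, 800.0, 1000.0]
high_forecast = [1010.0, 1190.0, 810.0, 990.0]

print(mean_abs_diff(low_actual, low_forecast), mean_abs_diff(high_actual, high_forecast))
print(mape(low_actual, low_forecast), mape(high_actual, high_forecast))
```

Both periods have the same mean absolute difference, but the
low-wind period has a MAPE ten times larger, because each error is
divided by a smaller actual value.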
4.4.3 Typical days MAPE analysis
The strategy to calculate the typical day MAPE of each season
consists of the following steps:
- Step 1: Calculate the hourly MAPE (8760 values);
- Step 2: Separate the data by season (January, February and
December for winter; March, April and May for spring; June, July
and August for summer; and September, October and November for
autumn);
- Step 3: For each season, compute the typical day MAPE as the
average of each hour over all days of the months of that season.
For example, hour 1 for winter is the average MAPE of all hours 1
of the days of January, February and December.
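The three steps above can be sketched as follows. This is a
minimal illustration, not the implementation used in the study:
the function name, its `start` argument and the synthetic input
are assumptions, and the hourly MAPE values of Step 1 are taken
as already computed.

```python
from datetime import datetime, timedelta

# Season assignment by month, as listed in Step 2.
SEASON_BY_MONTH = {
    1: "winter", 2: "winter", 12: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def typical_day_mape(hourly_mape, start):
    """Average hourly MAPE values per season and hour of day.

    hourly_mape: hourly MAPE values [%] in time order (Step 1).
    start: datetime of the first value (hypothetical argument).
    Returns {season: [24 averages]}; index 0 is the first hour of the day.
    """
    sums = {s: [0.0] * 24 for s in set(SEASON_BY_MONTH.values())}
    counts = {s: [0] * 24 for s in sums}
    for i, value in enumerate(hourly_mape):
        t = start + timedelta(hours=i)
        season = SEASON_BY_MONTH[t.month]   # Step 2: assign the season
        sums[season][t.hour] += value       # Step 3: accumulate per hour
        counts[season][t.hour] += 1
    return {
        s: [total / n if n else None for total, n in zip(sums[s], counts[s])]
        for s in sums
    }


# Synthetic input: 8760 hourly values whose MAPE equals the hour of day.
synthetic = [float(i % 24) for i in range(8760)]
profiles = typical_day_mape(synthetic, datetime(2014, 1, 1))
print(profiles["winter"][:3])
```

Each season yields a 24-value profile, which is what a typical day
figure plots against the hour of the day.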
Figures 16 to 19 present the typical day MAPE indexes over time
for winter, spring, summer and autumn, respectively.
Figure 16 – Evolution of MAPE in a winter day.
[Axes of Figures 13 to 16. Figure 13: wind power [MW] versus MAPE
[%], R² = 0.3763. Figure 14: MAPE [%] versus time [days]. Figure
15: wind power [MW] versus time [days]. Figure 16: MAPE [%]
versus time [hours].]
Figure 17 – Evolution of MAPE in a spring day.
Figure 18 – Evolution of MAPE in a summer day.
Figure 19 – Evolution of MAPE in an autumn day.
From these figures, some conclusions may be drawn:
- Prediction errors are small, typically within the [1%; 2%]
interval.
- Prediction errors tend to be lower at night than during the
day.
- Maximum prediction errors occur around lunchtime in summer and
winter, whereas in autumn and spring the maximum prediction error
is expected in the morning.
- The autumn MAPE pattern is very smooth, with errors around
1.5%, except for a peak of 4% occurring in the morning.
- Minimum prediction errors are expected during the night for
all typical days, varying between 1.0% and 1.5%.
5. CONCLUSION
Test results show higher performance for the ANN-LM forecasting
model, with accurate and reliable wind speed and wind power
prediction. For wind power prediction, the results show a
correlation between actual wind power and the monthly Mean
Absolute Percentage Error (MAPE), with an R-squared value of
0.376, indicating that these two variables should be analysed
together. Also, a typical day analysis was performed for each
season, showing that prediction errors are normally small,
between 1% and 2%, and tend to be lower at night and higher
during the day.
REFERENCES
[1] Bhaskar, Kanna, and S. N. Singh. "AWNN-assisted wind power forecasting using feed-forward neural network." Sustainable Energy, IEEE Transactions on 3.2, 2012: 306-315. [2] Zhang, Guoqiang, B. Eddy Patuwo, and Michael Y. Hu. "Forecasting with artificial neural networks: The state of the art." International journal of forecasting 14.1, 1998: 35-62.W.-K. Chen, Linear Networks and Systems. Belmont, CA: Wadsworth, 1993, pp. 123–135. [3] Fonte, Pedro. Previsão de Potência em Geradores Eólicos. Master’s thesis, IST/UTL, 2006.J. U. Duncombe, “Infrared navigation—Part I: An assessment of feasibility,” IEEE Trans. Electron Devices, vol. ED-11, no. 1, pp. 34–39, Jan. 1959. [4] Catalão, J. P. S., H. M. I. Pousinho, and V. M. F. Mendes. "Short-term wind power forecasting in Portugal by neural networks and wavelet transform.” Renewable Energy 36.4, 2011: 1245-1251. [5] Catalao, J., H. M. I. Pousinho, and V. M. F. Mendes. "Hybrid wavelet-PSO-ANFIS approach for short-term wind power forecasting in Portugal." Sustainable Energy, IEEE Transactions on 2.1, 2011: 50-59. [6] Haque, A. U., Mandal, P., Kaye, M. E., Meng, J.,
Chang, L., & Senjyu, T. (2012). A new strategy for predicting
short-term wind speed using soft computing
models. Renewable and sustainable energy reviews, 16(7),
4563-4573.
[7] Demuth, Howard and Mark Beale. "Neural network
toolbox for use with MATLAB." (1993).
[8] Sfetsos, A. "A comparison of various forecasting
techniques applied to mean hourly wind speed time
series." Renewable energy 21.1, 2000: 23-35.
[9] Box, G. E., Jenkins, G. M., Reinsel, G. C., & Ljung, G.
M. Time series analysis: forecasting and control. John Wiley
& Sons, 2015.
[10] Kennedy, J. and R. Eberhart, “Particle swarm
optimization”, IEEE international conference on neural
networks. 4 1942- 1948, 1995.
[11] Mendes, R., Cortez, P., Rocha, M., & Neves, J. “Particle
swarms for feedforward neural network
training”. learning, 2002, 6(1).
[12] Bashir, Z. “Short term load forecasting by using wavelet
neural networks”. In electrical and Computer Engineering,
2000. Canadian Conference on (Vol. 1, pp. 163-166). IEEE.
[13] Catalão, J. P. S., H. M. I. Pousinho, and V. M. F. Mendes.
"Short-term wind power forecasting in Portugal by neural
networks and wavelet transform.” Renewable Energy 36.4,
2011: 1245-1251. [14] Rambharose, Tricia. Neural Network Add-in for PSORT.
Found at:
https://www.mathworks.com/matlabcentral/fileexchange/2956
5-neural-network-add-in-for-psort, 2011.
[15] Evers, George. ”Particle Swarm Optimization Research
Toolbox”, Found at:
www.georgeevers.org/pso_research_toolbox.htm, 2011.