
Abstract — The increased integration of wind power into the

electric grid implies many challenges to power systems

operators, mainly due to the variability and limited predictability of

wind power generation. Thus, an accurate wind power forecast

is imperative for systems operators, aiming at an efficient and

economical wind power operation and integration into the

power system.

This work addresses the issue of forecasting short-term wind

speed and short-term wind power for one hour ahead,

combining Artificial Neural Networks (ANN) with

optimization techniques on real historical wind speed and wind

power data. Levenberg-Marquardt (LM) and Particle Swarm
Optimization (PSO) are used as training algorithms to update
the weights and biases of the ANN, establishing two forecasting
approaches: ANN-LM and ANN-PSO.

The forecasting performances of the proposed models are compared
with each other and with the benchmark persistence model.

Test results show higher performance for the ANN-LM
forecasting model, with accurate and reliable wind speed and
wind power prediction.

Keywords: Short-term wind speed and power forecast, Artificial

Neural Network, Levenberg-Marquardt, Particle Swarm

Optimization

1. INTRODUCTION

Wind power is the fastest growing source of renewable energy

in the world. It represents a clean and sustainable source of

energy, and is in abundant supply, which helps to explain the

growth in installed capacity of wind power plants in recent

years. This implies the need to efficiently integrate the power

generated from wind energy into existing power systems.

However, the increase in wind power penetration requires a

number of issues to be addressed. Since wind power has a
cubic relationship with wind speed, any error in the wind speed
forecast leads to a larger error in wind power production. This
dependence on the stochastic nature of wind speed also causes

uncertainty in wind power production, and unexpected

variations of wind power output may increase the operating

costs for the overall power system. Thus, the use of accurate
short-term wind power forecasting techniques is crucial in the
planning of economic dispatch, aiming at an efficient and
economical wind power integration and operation. This makes it
possible to mitigate the undesirable effects of wind fluctuations
on the operation of power systems, namely by reducing the spinning
reserve margin and increasing wind power penetration.
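The effect of the cubic relationship can be illustrated with a short calculation. The sketch below assumes that power is exactly proportional to the cube of wind speed, which is an idealization (a real turbine follows a power curve with cut-in, rated and cut-out speeds); the function name is hypothetical.

```python
def relative_power_error(speed_error: float) -> float:
    """Relative error in power caused by a relative error in wind speed.

    Assumes P is proportional to v**3 (idealized, no power-curve limits).
    """
    return (1.0 + speed_error) ** 3 - 1.0


if __name__ == "__main__":
    for speed_error in (0.05, 0.10, 0.20):
        power_error = relative_power_error(speed_error)
        print(f"speed error {speed_error:.0%} -> power error {power_error:.1%}")
```

Under this assumption, a 10% overestimate of wind speed gives roughly a 33% overestimate of power, since 1.1³ = 1.331.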

Recently, with the development of Artificial Intelligence (AI),

various new AI methods for wind speed and power prediction

have been developed and are being proposed. AI methods

mimic the learning process of the brain to discover the relations

between the variables of a system [1]. The newly developed AI
methods include Artificial Neural Networks (ANN) and

evolutionary techniques, such as Particle Swarm Optimization

(PSO).

Other new methods are catching researchers’ attention, namely,

data mining, neural networks, fuzzy logic and neuro-fuzzy,

evolutionary algorithms, and some hybrid methods. Also,

wavelets and Markov chains are being used to capture the

relevant patterns of the time-series and act as pre-processing

filter.

As far as the statistical models are concerned, a key factor is the

computation of the parameters or the weight coefficients of the

model. This process can be enhanced by using optimization
techniques. This thesis aims at assessing the use
of optimization techniques to tune the parameters of
statistical wind forecast models. The study focuses on the use
of ANN to predict wind power, as this technique has proved to
be very promising. Levenberg-Marquardt (LM)
optimization and PSO are used as the methods for optimizing
the weight coefficients.

In order to achieve the proposed objectives, the study is
organized into the following main tasks:

- To investigate ANN, PSO and LM.

- To implement wind speed forecast ANN-LM and ANN-PSO

models and compare their performance using some accuracy

indexes.

- To assess if other explanatory variables (such as pressure,

temperature, humidity) can improve the models’ performance.

Wind power forecast using Neural Networks

tuned with advanced optimization techniques

Gonçalo Miguel Reis de Pinho de Meneses Nazaré

[email protected]

Instituto Superior Técnico, Lisboa, Portugal


- To propose a wind power forecast model based on the

conclusions achieved in the previous tasks and to validate it

against experimental results.

As mentioned, for wind speed analysis, two different

forecasting approaches are studied: ANN together with

Levenberg-Marquardt optimization (ANN-LM) and ANN with

PSO (ANN-PSO). Firstly, the ANN-LM forecasting methodology
is applied to data obtained from the IST automatic weather station
(EMA), and other available weather parameters are studied in
order to assess how they can improve wind speed prediction.
Then, PSO is combined with ANN in order to determine whether the
optimization of the ANN weights and biases can be improved.

Based on the conclusions of this study, the method that proved
better is applied to wind power prediction, using data collected
from Redes Energéticas Nacionais (REN) on the wind power injected
into the Portuguese power system. A detailed study of the wind
power data from 2014 is carried out in order to
establish some links between the forecasting errors and the
actual wind power data.

The effectiveness and efficiency of the proposed wind speed

and wind power forecasting strategy is demonstrated by

comparing the results with the widely used benchmark

persistence method.

2. TIME SERIES FORECASTING MODELS

This section presents the two forecasting models to be compared:
the ANN and the persistence model. Two techniques for updating the
ANN weights and biases are also presented: Levenberg-Marquardt
and PSO.

2.1 PERSISTENCE MODEL

The persistence method is the most common among time
series forecasting models. It is called the naive predictor
because the prediction is simply the last measured
value. This method is very good for short predictions (from
seconds up to a few hours) but lacks accuracy for medium and
long time-scale forecasts (from a few hours up to days ahead).
It is usually used as a benchmark for
other forecasting models, so it is also used in this thesis.

For a time series $Y_t$, given a historical set of data $H_t = \{Y_0, Y_1, Y_2, \ldots, Y_t\}$, the forecast of the forthcoming value of $Y_t$ by
a persistence process $p$ is given by equation (1):

$$p(H_t) = Y_t \tag{1}$$
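Equation (1) amounts to repeating the last observation. A minimal sketch follows; the function names and the sample wind speed values are illustrative assumptions, not taken from the data used in this work.

```python
def persistence_forecast(history):
    """One-step-ahead persistence forecast: the last measured value."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return history[-1]


def one_step_persistence(series):
    """Persistence forecasts for series[1:], each equal to the previous value."""
    return list(series[:-1])


if __name__ == "__main__":
    wind_speed = [5.1, 6.0, 5.7, 6.3]  # hypothetical hourly values in m/s
    print(persistence_forecast(wind_speed))   # forecast for the next hour
    print(one_step_persistence(wind_speed))   # forecasts for hours 1..3
```

The second function is convenient for benchmarking, since it aligns each forecast with the observation it is meant to predict.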

2.2 ARTIFICIAL NEURAL NETWORKS

ANNs are structures based on the neural structure of the brain.
They are usually used in applications where there is no known
relationship between input and output variables, where constant
adaptation to frequent variations of these variables is
necessary, or where high noise hinders
good system identification. They are therefore a powerful,
general and flexible modelling tool for forecasting purposes [2].

The learning process to adjust the weight factor in the

connections between neurons can be supervised or

unsupervised. In supervised learning, the weights are adjusted

through an iterative process, in order to minimize a

predetermined error function. A commonly used error function

is the mean-squared error, which aims at minimizing the

average squared error between the network output and the target

value. In unsupervised learning, only the input variables are

presented to the network which is trained to identify different

classes of data [3].

ANN may also be classified as feedforward or recurrent. A

Feedforward neural network is an artificial neural network

where connections between units do not form a cycle, i.e. the

information moves only in one direction. According to most of
the research done [4,5,6], this topology is usually applied in
wind speed and wind power forecasting, and for that reason a
feedforward neural network (FNN) is used in this work. A multilayer FNN is
divided into three layers: input layer, hidden layer and
output layer, as shown in Figure 1. One can choose the number

of hidden layers in a neural network but according to [2], one

hidden layer is sufficient to approximate any complex nonlinear

function with any desired accuracy.

Figure 1 – ANN layers: each circular node represents an

artificial neuron and an arrow between layers represents a

connection from the output of one neuron to the input of

another.

Figure 2 illustrates an example of a single neuron model.

Figure 2 – General model of a Neuron (plot from [7]).

$$a = f\Big(\sum_j w_j p_j + b\Big) \tag{2}$$

The output a of a node is the image of the weighted sum of all
inputs p, plus the bias b, under some activation function f. The most common
activation functions are linear, log-sigmoid and tan-sigmoid.
During the network training step, the weights w and bias b are
updated for each unit. To perform this, the Levenberg-Marquardt
optimization algorithm is employed.
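The single neuron model of Figure 2 can be sketched in a few lines. The activation names below mirror the usual Matlab names (purelin, logsig, tansig); the input, weight and bias values are arbitrary illustrative numbers.

```python
import math


def purelin(n: float) -> float:
    """Linear activation."""
    return n


def logsig(n: float) -> float:
    """Log-sigmoid activation, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))


def tansig(n: float) -> float:
    """Tan-sigmoid (hyperbolic tangent) activation, output in (-1, 1)."""
    return math.tanh(n)


def neuron_output(p, w, b, f):
    """Output a = f(sum_j w_j * p_j + b) of a single neuron."""
    if len(p) != len(w):
        raise ValueError("inputs and weights must have the same length")
    return f(sum(wj * pj for wj, pj in zip(w, p)) + b)


if __name__ == "__main__":
    p = [1.0, 2.0]      # inputs (illustrative)
    w = [0.5, -0.25]    # weights (illustrative)
    for f in (purelin, logsig, tansig):
        print(f.__name__, neuron_output(p, w, 0.1, f))
```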


The LM algorithm is an approximation to the Newton method

and it was designed to approach second-order training speed

without having to compute the Hessian matrix. When the

performance function has the form of a sum of squares, which

happens in Neural Networks, the Hessian matrix is
approximated by (3), so that second-order partial derivatives
need not be computed.

$$H = J^T J \tag{3}$$

The Levenberg-Marquardt algorithm consists of solving
equation (4):

$$J^T e = [\mu I + J^T J]\,\delta \tag{4}$$

In equation (4):

- J is the N-by-W Jacobian matrix that contains first-order partial
derivatives, where N is the number of entries in the training set
and W is the total number of parameters (weights and biases) of
the Neural Network;

- e is the error vector containing the output error for each input

vector used on training the network;

- I is the identity matrix;

- 𝛿 is the weight update vector to be found;

- µ is Levenberg’s damping factor;
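To make equation (4) concrete, the sketch below performs Levenberg-Marquardt steps for a model with only two parameters, y = w·x + b, so that the 2-by-2 linear system can be solved by hand. Several choices here are assumptions made for illustration: the error is taken as e = target − output, J holds the derivatives of the output with respect to the parameters, the damping factor µ is kept fixed (practical implementations adapt it between iterations), and the data are synthetic.

```python
def lm_step(params, xs, ts, mu):
    """One Levenberg-Marquardt update for the model y = w * x + b.

    Solves (mu * I + J^T J) delta = J^T e and returns params + delta,
    with e = target - output and J = d(output)/d(params).
    """
    w, b = params
    e = [t - (w * x + b) for x, t in zip(xs, ts)]   # error vector
    jac = [(x, 1.0) for x in xs]                    # rows: [dy/dw, dy/db]

    # Entries of the symmetric matrix A = mu * I + J^T J
    a11 = mu + sum(r[0] * r[0] for r in jac)
    a12 = sum(r[0] * r[1] for r in jac)
    a22 = mu + sum(r[1] * r[1] for r in jac)

    # Right-hand side g = J^T e
    g1 = sum(r[0] * ei for r, ei in zip(jac, e))
    g2 = sum(r[1] * ei for r, ei in zip(jac, e))

    # Solve the 2x2 system by Cramer's rule
    det = a11 * a22 - a12 * a12
    d1 = (g1 * a22 - a12 * g2) / det
    d2 = (a11 * g2 - a12 * g1) / det
    return (w + d1, b + d2)


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0]
    ts = [1.0, 3.0, 5.0, 7.0]   # generated from w = 2, b = 1
    params = (0.0, 0.0)
    for _ in range(50):
        params = lm_step(params, xs, ts, mu=0.01)
    print(params)
```

With µ = 0 the step reduces to a Gauss-Newton step, while a large µ shrinks the step towards a small gradient-descent move; this is the trade-off controlled by the damping factor.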

Lastly, one of the most important tasks in developing a good

time series forecasting model is the selection of the input

variables, which determines the architecture of the Neural

Network. Since there is no systematic approach to choosing
these input variables in Artificial Intelligence based models,
statistical methods are employed to find relevant inputs, as
suggested by [8]. These methods are the partial autocorrelation
function (PACF) and the cross-correlation function (XCF), whose
expressions are taken from [9].
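As an illustration of how such a screening can work, the sketch below computes a sample cross-correlation between two series at a given lag. It uses a common textbook estimator, which is an assumption: the exact expressions of [9] are not reproduced here and may differ in normalization. The sample values are hypothetical.

```python
import math


def cross_correlation(x, y, lag):
    """Sample cross-correlation between x[t] and y[t + lag], for lag >= 0.

    Common estimator: lagged cross-covariance divided by the product of
    the full-sample standard deviations (assumed, not taken from [9]).
    """
    n = len(x)
    if len(y) != n or not 0 <= lag < n:
        raise ValueError("series must have equal length and 0 <= lag < n")
    mx = sum(x) / n
    my = sum(y) / n
    sx = math.sqrt(sum((v - mx) ** 2 for v in x))
    sy = math.sqrt(sum((v - my) ** 2 for v in y))
    cov = sum((x[t] - mx) * (y[t + lag] - my) for t in range(n - lag))
    return cov / (sx * sy)


if __name__ == "__main__":
    temperature = [12.0, 13.5, 15.0, 16.2, 15.1, 13.9, 12.4, 11.8]  # hypothetical
    wind_speed = [4.1, 4.0, 4.8, 5.5, 6.1, 5.2, 4.6, 4.0]           # hypothetical
    for lag in range(4):
        print(lag, round(cross_correlation(temperature, wind_speed, lag), 3))
```

Lags at which the correlation is large in magnitude are candidates for network inputs.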

2.3 PARTICLE SWARM OPTIMIZATION

The PSO algorithm was first introduced by Eberhart and Kennedy
in 1995 to explain social behaviour such as fish schooling and
bird flocking [10]. This algorithm is a population-based stochastic
optimization technique for continuous nonlinear functions and
is defined by the evolution of a population of particles,
represented as vectors in an N-dimensional space. Each particle

flies around the multidimensional search space with a velocity,

which is continuously brought up to date by the particle’s own

experience and the experience of the particle’s neighbours or

the experience of the entire swarm, as shown in Figure 3.

Figure 3 – Random generation of particles in an N-dimensional

search space searching for the best solution.

The procedure to implement the PSO algorithm can be defined in
the following steps:

Step 1 - Initialize the particle positions, $x$, and velocities, $v$, in a
D-dimensional problem, randomly and uniformly distributed
across the design space.

Step 2 - Update the velocities of all particles at iteration k+1
using the particles' fitness or objective values, which are
functions of the particles' current positions in the design space
at iteration k. The best previous position of a particle is Pbest and the index of the best particle among all particles is Gbest. Then, through (5) and (6), the velocity and position of a
particle are updated, respectively.

$$v_i(k+1) = \omega\, v_i(k) + c_1 r_1 \big(x_{Pbest,i}(k) - x_i(k)\big) + c_2 r_2 \big(x_{Gbest}(k) - x_i(k)\big) \tag{5}$$

$$x_i(k+1) = x_i(k) + v_i(k+1) \tag{6}$$

Where:

- $v_i$ is the particle velocity;
- $x_i$ is the particle position;
- $c_1$ and $c_2$ represent the cognitive and social acceleration
coefficients, respectively;
- $r_1$ and $r_2$ are n-dimensional column vectors whose elements
are independent pseudo-random numbers selected from a
uniform distribution U(0,1);
- $\omega$ is a static inertia weight that provides preference for a
particle to keep moving in the same direction it was following
in the previous iteration. High values of inertia lead to
global exploration, whereas small values of inertia lead to
local exploration.

Equation (7) enables the computation of the inertia weight.

$$\omega = \omega_{max} - \frac{\omega_{max} - \omega_{min}}{iter_{max}} \cdot iter \tag{7}$$

Where:

- 𝜔𝑚𝑎𝑥 represents the inertia highest value;

- 𝜔𝑚𝑖𝑛 represents the inertia lowest value;

- 𝑖𝑡𝑒𝑟 represents the current iteration and

- 𝑖𝑡𝑒𝑟𝑚𝑎𝑥 is the number of maximum iterations.
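The velocity update, the position update and the linearly decreasing inertia weight described in this section can be combined into a compact Gbest PSO. The sketch below is a minimal illustration under several assumptions: velocities are initialized in the same range as positions, no velocity clamping or boundary handling is applied, the test function (a sphere function) and all default parameter values are illustrative, and the helper names are hypothetical.

```python
import random


def inertia(it, iter_max, w_max=0.9, w_min=0.4):
    """Linearly decreasing inertia weight, from w_max down to w_min."""
    return w_max - (w_max - w_min) / iter_max * it


def pso(f, dim, n_particles=20, iter_max=200, c1=2.0, c2=2.0,
        bounds=(-1.0, 1.0), seed=0):
    """Minimize f over a dim-dimensional space with a Gbest PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Step 1: random, uniformly distributed positions and velocities
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [xi[:] for xi in x]
    pbest_val = [f(xi) for xi in x]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for k in range(iter_max):
        w = inertia(k, iter_max)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Step 2: velocity update followed by position update
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                x[i][d] += v[i][d]
            val = f(x[i])
            if val < pbest_val[i]:          # update particle memory
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:         # update swarm memory
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val


def sphere(p):
    """Illustrative objective with its minimum at the origin."""
    return sum(c * c for c in p)


if __name__ == "__main__":
    best, best_val = pso(sphere, dim=2)
    print(best, best_val)
```

In ANN training, the objective f would be the network's mean-squared error evaluated at the weights and biases encoded by the particle position.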

Figure 4 illustrates how a particle updates its position and
velocity in each iteration. The influence of the particle's own
memory and of the swarm on a particle moving towards an optimal
solution is apparent.

Figure 4 – Velocity and position update in PSO (arrow 1 -
current motion influence, arrow 2 - particle memory influence,
arrow 3 - the swarm influence).


In this work, two different variants of the PSO algorithm are used: PSO
with local neighbourhood (Lbest) and PSO with global
neighbourhood (Gbest) [11]. The main difference between these two
variants is how each
particle moves through the D-dimensional space. In Gbest

PSO, each particle moves towards its best previous position

and towards the best particle in the entire swarm; in Lbest

PSO, each particle moves towards its best previous position

and towards the best particle in its restricted neighbourhood.

3. APPLICATION TO WIND SPEED PREDICTION

This section discusses the ANN model used to perform wind
speed predictions. The data available were provided by the IST
automatic weather station (EMA), located in the South Tower of the
IST Alameda Campus, and contain mean hourly values of wind
speed, temperature, pressure and humidity for the years 2013 and
2014. The optimization of the ANN parameters was done
using two different techniques: Levenberg-Marquardt (LM)
and PSO.

3.1 PERFORMANCE EVALUATION

In order to evaluate the performance of the forecasting models,
two accuracy measures are computed: the Mean
Absolute Percentage Error (MAPE) and the Root Mean Square
Error (RMSE), presented in equations (8) and (9), respectively.

$$MAPE = \frac{1}{N} \sum_{i=1}^{N} \frac{\left|WS_i^{true} - WS_i^{forecast}\right|}{WS_i^{true}} \cdot 100\% \tag{8}$$

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(WS_i^{true} - WS_i^{forecast}\right)^2} \tag{9}$$

Where:

- $WS_i^{true}$ represents the actual wind speed at hour i;
- $WS_i^{forecast}$ is the predicted wind speed for hour i;
- N is the number of forecasted values.
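Equations (8) and (9) translate directly into code. The sketch below is illustrative; the function names and sample values are assumptions. Note that the MAPE is undefined whenever an actual value is zero, so the sketch rejects that case explicitly.

```python
import math


def mape(true, forecast):
    """Mean absolute percentage error, in percent (equation (8))."""
    if len(true) != len(forecast) or not true:
        raise ValueError("series must be non-empty and of equal length")
    if any(t == 0 for t in true):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 / len(true) * sum(abs(t - f) / abs(t)
                                   for t, f in zip(true, forecast))


def rmse(true, forecast):
    """Root mean square error, in the units of the data (equation (9))."""
    if len(true) != len(forecast) or not true:
        raise ValueError("series must be non-empty and of equal length")
    return math.sqrt(sum((t - f) ** 2 for t, f in zip(true, forecast))
                     / len(true))


if __name__ == "__main__":
    actual = [5.0, 6.0, 4.0]      # hypothetical wind speeds in m/s
    predicted = [5.5, 5.4, 4.4]   # hypothetical forecasts in m/s
    print(mape(actual, predicted), rmse(actual, predicted))
```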

3.2 ANN-LM model

3.2.1 Forecasting methodology

With the data available, the main objective is to find out how the
additional weather parameters, such as temperature, pressure
and humidity, can help improve wind speed prediction
when presented as inputs to the Neural Network. To verify this,
the 4 weather parameters are combined, giving
8 different cases to be tested:

- Case 1: Wind speed prediction through wind speed;
- Case 2: Wind speed prediction through wind speed and
temperature;
- Case 3: Wind speed prediction through wind speed and
pressure;
- Case 4: Wind speed prediction through wind speed and
humidity;
- Case 5: Wind speed prediction through wind speed,
temperature and humidity;
- Case 6: Wind speed prediction through wind speed, pressure
and humidity;
- Case 7: Wind speed prediction through wind speed,
temperature and pressure;
- Case 8: Wind speed prediction through wind speed,
temperature, pressure and humidity.

Along with the weather parameters study, 7 different models
were tested in order to find the number of past observations
most relevant for future prediction. The models considered
are presented in Table 1:

Table 1 – Training size for each model.

Model           1          2         3         4        5        6        7
Training size   2 months   1 month   2 weeks   1 week   5 days   2 days   1 day

The simulations performed make use of the Neural Network
Toolbox for Matlab®. This toolbox allows one to
implement both the multilayer perceptron ANN and the
Levenberg-Marquardt algorithm. The functions used to implement
the ANN models were the following:

- newff: Creates a feedforward neural network. One can define
its structure: number of hidden layers, number of neurons in
each hidden layer, activation functions for the hidden and output
layers, and the sizes of the inputs and output targets;
- dividerand: Separates input and target vectors into three
sets;
- trainlm: Network training function that updates weight and
bias values according to Levenberg-Marquardt optimization;
- sim: Simulates a Neural Network, i.e. calculates the
network outputs for the presented inputs;
- mse: Measures the network’s performance according to the
mean of squared errors (mse).

In the newff parameters, following the researched papers [4,12], the
linear activation function is used for the hidden layer and the
tansig activation function is used for the output layer.

Firstly, the number of neurons in the hidden layer is set to 5 in

order to find the best training model. Then, to optimize the

number of neurons in the hidden layer, the best ANN model is

run with the number of neurons changing from 1 up to 20.

3.2.2 RESULTS AND DISCUSSION

In all simulations, the forecasting horizon considered is equal
to 24 hours and the predictions were made 1 hour ahead, which
corresponds to 24 predicted values. Following the strategy of other
papers [4,5,6,13], the results are separated by season.
The days chosen as representative of each season, and therefore
chosen for prediction, were the 1st of February for winter, the 1st
of May for spring, the 1st of August for summer and the 1st of
November for autumn. As mentioned before, the results obtained
are compared with the persistence model.

In order to find the best weather parameters and best training

size for wind speed prediction, the flowchart from Figure 5 is

employed. The results are presented in Section 3.3.4 of the

dissertation report.

The results show that models with a short training size (models 4
to 7) generally perform poorly for the days
considered, i.e., worse than the persistence model. On the other
hand, models with a large training size (models 1 and 2) also
generally perform poorly and are again outperformed by the
persistence model. The strategy applied to find the best training
size model was to find one model that can outperform
persistence for all representative season days, with one
particular weather parameters case. This only happens with
model 3 (2 weeks training) for cases 2 (wind and temperature),
6 (wind, pressure and humidity), 7 (wind, temperature and
pressure) and 8 (wind, temperature, pressure and humidity),
where the RMSE and MAPE values are better than the
persistence ones. For this reason, model 3 is chosen as the training
size model.

Figures 6 and 7 present charts with the MAPE and RMSE results
for model 3 for all representative season days, in order to
evaluate the best weather parameters case. All 8 cases are
presented in Figures 6 and 7 for a clear comparison,
although cases 2, 6, 7 and 8 were the best cases found.

Figure 5- Flowchart for ANN algorithm to evaluate the

best case and model.

Figure 6 – 1 hour ahead daily MAPE comparison for model 3

(2 weeks training).

Figure 7 - 1 hour ahead daily RMSE comparison for model 3

(2 weeks training).

The MAPE and RMSE results in Figures 6 and 7 show the
improvement in wind speed prediction when additional weather
parameters are employed. In Figure 6, the best MAPE result
occurs in case 8 for 1st February (16.75%), in case 2 for 1st
May (12.21%) and 1st August (9.33%), and in case 7 for 1st
November (38.89%).

Figure 7 indicates that the best RMSE values occur in case
8 for 1st February (1.164 m/s), in case 2 for 1st May (0.757 m/s)
and 1st August (0.560 m/s), and in case 4 for 1st November
(0.860 m/s).

The conclusion is that no single combination of weather
parameters gives the best results for all chosen days, so
it is more difficult to choose the best case. However, cases 2, 7
and 8 point to similar MAPE and RMSE results. This allows one to
conclude that, besides temperature, the weather parameters
humidity and pressure do not contribute significantly to better
wind speed forecasting for the data studied. Therefore, the best
ANN forecast model chosen is given by case 2, which
represents wind speed prediction through wind speed and
temperature, along with model 3, which has a 2-week training
size.

In order to optimize the number of neurons in the hidden layer,
an algorithm was developed in which the number of neurons
varies from 1 up to 20 (with case 2 and model 3). The results
are presented in Section 3.3.4 of the dissertation report. The test
results show that, with this ANN structure, the best number of
neurons in the hidden layer is 4, with an average MAPE
improvement of 7.2% and an average RMSE improvement of
11.9% when compared to the persistence model. The
corresponding wind speed forecast errors are presented in
Table 2 for the best ANN topology found.

Table 2: Best wind speed forecast errors for the best ANN
topology.

Day            RMSE [m/s]   MAPE [%]
1st February   1.156        16.57
1st May        0.777        12.35
1st August     0.571        9.57
1st November   0.865        40.78

Figure 8 presents a graphical comparison of the best
ANN topology forecast and the actual wind speed values for the 1st of
February (winter).

Figure 8 – Actual wind speed (blue) together with the best
ANN topology forecast (red) for 1st February.

3.3 ANN-PSO model

3.3.1 Forecasting methodology


To apply PSO technique in Neural Network training, the add-in

by Tricia Rambharose was employed [14]. This add-in to the

PSO Research Toolbox [15] allows an ANN to be trained using

the Particle Swarm Optimization technique. Therefore, instead
of the trainlm function, the following function is used:

- train: Trains the Neural Network according to a specified training
function and training parameters. This makes it possible to
update the weight and bias values using a swarm optimization
approach by choosing trainpso, a function provided by the Research
Toolbox, as the training function.

This toolbox allows choosing, among others, the following
features: PSO topology (Local best or Global best), number of
particles, cognitive and social acceleration, initial and final
inertia weight, static inertia weight and number of iterations.

The purpose of PSO in ANN is to get the best set of weights

(particles position) where several particles are trying to move

to get the best solution. The dimension of the search space is

the total number of weights and bias that is determined by the

ANN structure.
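The relation between the ANN structure and the dimension of the PSO search space can be sketched as follows, assuming a network with one hidden layer and a single output. The ordering of the parameters inside the particle vector is an arbitrary convention chosen here for illustration, and the example sizes (2 inputs, 4 hidden neurons) are hypothetical.

```python
def n_parameters(n_inputs, n_hidden, n_outputs=1):
    """Total number of weights and biases of a one-hidden-layer network."""
    return (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs


def decode(position, n_inputs, n_hidden):
    """Split a flat particle position into the parameters of a
    single-output network (ordering is an illustrative convention)."""
    if len(position) != n_parameters(n_inputs, n_hidden):
        raise ValueError("position length does not match the ANN structure")
    it = iter(position)
    w_hidden = [[next(it) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b_hidden = [next(it) for _ in range(n_hidden)]
    w_out = [next(it) for _ in range(n_hidden)]
    b_out = next(it)
    return w_hidden, b_hidden, w_out, b_out


if __name__ == "__main__":
    dim = n_parameters(n_inputs=2, n_hidden=4)
    print(dim)  # dimension of the search space for this structure
    print(decode(list(range(dim)), n_inputs=2, n_hidden=4))
```

Each particle therefore encodes one complete set of network parameters, and evaluating its fitness means decoding it and computing the network's mean-squared error.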

Two different topologies were applied (Local best and Global
best, described in Section 2.3) in order to try to improve wind
speed prediction for the best topology found by the ANN model. For
each topology, five different numbers of particles were tested
(10, 20, 30, 40 and 50) with the PSO network
parameters presented in Table 3.

Table 3 – PSO network parameters for wind speed forecast.

Parameter                                    Value
Initial inertia weight $\omega_{max}$        0.9
Final inertia weight $\omega_{min}$          0.4
Static inertia weight $\omega$               1.4
Maximum number of iterations $iter_{max}$    2000
Cognitive acceleration $c_1$                 2
Social acceleration $c_2$                    2
Search space range                           (-1, 1)

As mentioned before, PSO is used to update the weights and
biases in the ANN training procedure as an alternative to the
Levenberg-Marquardt optimization technique. This is possible because the
performance function mse is used as the objective function of the PSO
algorithm.

For the development of ANN-PSO algorithm the flowchart in

Figure 9 is followed.

The stopping criterion is presented in equation (10):

$$\left|f(x_i(k)) - f(x_i(k-m))\right| \le \varepsilon, \quad m = 1, 2, \ldots, iter_{max} \tag{10}$$

Where:

- $f(x_i(k))$ represents the fitness function (MSE) of particle $x_i$
at iteration k;
- $f(x_i(k-m))$ represents the fitness function (MSE) of
particle $x_i$ at iteration k-m;
- $iter_{max}$ represents the maximum number of iterations;
- $\varepsilon$ is specified as $10^{-6}$ in order to achieve PSO convergence.

3.3.2 Results and discussion

As mentioned before, the hybrid ANN-PSO model is used to try to
improve the best ANN model presented. To compare both
forecasting models, the same strategy is employed regarding the
chosen days to forecast and the forecasting horizon, presented
in Section 3.2.2. The flowchart in Figure 9 is followed for the
Global best and Local best topologies. The results are presented
in Section 3.4.3 of the dissertation report.

The error results with the ANN-PSO algorithm are in general
worse than those of the best ANN model found in Section 3.2.2 and also
worse than those of the persistence model. Since no ANN-PSO model
outperforms the best ANN model in all seasons, it can be concluded
that using PSO to optimize the weights and biases in
Neural Network training is less effective than
the Levenberg-Marquardt backpropagation algorithm
when applied to the EMA data.

ANN-LM performs slightly better than ANN-PSO. Therefore,
ANN-LM is used hereafter to perform wind power
predictions.

Figure 9 - PSO-ANN flowchart.

4. APPLICATION TO WIND POWER PREDICTION

This section addresses the application of ANN-LM to wind
power forecasting. The data available for wind
power forecasting were provided by Redes Energéticas Nacionais
(REN), the Portuguese Transmission System Operator (TSO).
The data contain records of the wind power injected into
the Portuguese power system, as 15-minute averages from
2010 to 2014, from all wind farms in Portugal that have
telemetry with REN. The data also contain, at the end
of each year, the installed wind power capacity for all wind
farms in Continental Portugal.

4.1 Performance evaluation

In wind power forecasting, two accuracy measures are
used to evaluate the performance of the forecasting models: the
Mean Absolute Percentage Error (MAPE) and the Normalized Root
Mean Square Error (NRMSE), presented in equations (11) and
(12), respectively.

$$MAPE = \frac{1}{N} \sum_{i=1}^{N} \frac{\left|WP_i^{true} - WP_i^{forecast}\right|}{WP_i^{true}} \cdot 100\% \tag{11}$$

$$NRMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(\frac{WP_i^{true} - WP_i^{forecast}}{WP_N}\right)^2} \cdot 100\% \tag{12}$$

Where:

- $WP_i^{true}$ represents the real Wind Power at hour $i$;
- $WP_i^{forecast}$ is the predicted Wind Power for hour $i$;
- N is the number of forecasted values;
- $WP_N$ represents the installed wind power capacity.
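Equation (12) can be sketched as below. The function name and the sample power values are illustrative assumptions; the installed capacity of 4453 MW is the value adopted for $WP_N$ in this work.

```python
import math


def nrmse(true, forecast, installed_capacity):
    """Normalized root mean square error, in percent (equation (12)).

    Each error is normalized by the installed wind power capacity.
    """
    if len(true) != len(forecast) or not true:
        raise ValueError("series must be non-empty and of equal length")
    if installed_capacity <= 0:
        raise ValueError("installed capacity must be positive")
    mean_sq = sum(((t - f) / installed_capacity) ** 2
                  for t, f in zip(true, forecast)) / len(true)
    return 100.0 * math.sqrt(mean_sq)


if __name__ == "__main__":
    actual = [1000.0, 2000.0, 1500.0]     # hypothetical wind power in MW
    predicted = [1100.0, 1900.0, 1450.0]  # hypothetical forecasts in MW
    print(nrmse(actual, predicted, installed_capacity=4453.0))
```

Because the normalization uses a fixed capacity rather than the actual value at each instant, the NRMSE does not inflate when the actual wind power is low, unlike the MAPE.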

4.2 ANN-LM model

4.2.1 Forecasting methodology


In order to choose the number of past observations that lead to

a better wind power forecast, 7 different models were

employed. Table 4 presents the training size for each model

considered.

Table 4 – Training size for each model.

Model           1         2        3        4        5       6          7
Training size   2 weeks   1 week   5 days   2 days   1 day   12 hours   6 hours

Once again, following the methodology of Section 3.2.1, the number of

neurons in the hidden layer is set to 5 in order to find the best

training model. Then, to optimize the number of neurons in the

hidden layer, the best ANN model is run with the number of

neurons changing from 1 up to 20.

4.2.2 Results and discussion

In all simulations, the forecasting horizon is equal to 24 hours;
therefore, with 15-minute average wind power records, the
number of values predicted is equal to 96. The days chosen for
wind power prediction follow the strategy in Section 3.2.2,
which is to separate the results by season by predicting
the following representative days from 2014: the 1st of
February for winter, the 1st of May for spring, the 1st of August for
summer and the 1st of November for autumn. The results are
compared with the persistence model.

It is also important to mention that in the NRMSE calculations
(equation (12)), the $WP_N$ value (installed wind power capacity)
was set equal to 4453 MW. This is the average of
the installed wind power capacity at the end of 2013 (4364
MW) and 2014 (4541 MW), since it is not possible to know the
value of the installed capacity at each forecast instant.

The results are presented in Section 4.3.3 of the dissertation
report. Models 1, 2 and 3 present better performance results
than persistence, outperforming it for the representative days of
each season. Figure 10 and Figure 11 show a comparison of the
NRMSE and MAPE indexes for these models.

Figure 10 – NRMSE comparison of Persistence with ANN
models 1 (2 weeks training), 2 (1 week training) and 3 (5 days
training).

Figure 11 – MAPE comparison of Persistence with ANN
models 1 (2 weeks training), 2 (1 week training) and 3 (5 days
training).

The results in Figures 10 and 11 show the improvements in

wind power forecast, when compared with the persistence

model.

Analyzing Figure 10, model 2 presents the best NRMSE results

for 1st February, 1st May and 1st August and for 1st November

the best NRMSE result is from model.

From the results in Figure 11, model 2 again presents better
MAPE results for 1st February and 1st May, and model 1 has
better MAPE results for 1st August and 1st November.

All in all, models 1 and 2 have the best NRMSE and MAPE
results for each season and choosing the best ANN model is
difficult, since they present very similar results. However,
model 2 has, in general, lower NRMSE and MAPE results and
also lower average CPU times. Therefore, model 2 (1 week training
size) was chosen.

In order to optimize the number of neurons in the hidden layer,
an algorithm was developed in which the number of neurons
varies from 1 up to 20 (with model 2). The results
are presented in Section 4.3.3 of the dissertation report. The test
results show that, with this ANN structure, the best number of
neurons in the hidden layer is 3, with an average MAPE
improvement of 28% and an average NRMSE improvement of
29.9% when compared to the persistence model. Therefore, the
number of neurons in the hidden layer was chosen to be
3, since it presents the best results.

The corresponding wind power forecast errors are
presented in Table 5 for the best ANN topology found.

Table 5: Best wind power forecast errors for the best ANN
topology.

Day            NRMSE [%]   MAPE [%]
1st February   1.12        1.45
1st May        0.60        2.54
1st August     0.87        2.46
1st November   0.25        4.67

Figure 12 presents a graphical comparison of the best
ANN topology forecast and the actual wind power values for the 1st of
February (winter).

Figure 12 – Actual wind power (blue) together with the best
ANN topology forecast (red) for 1st February.

4.3 Detailed wind power analysis

Now that the best ANN-LM topology has been established, a more
detailed analysis is made of the wind power data from REN for
2014. All data from this year were predicted, which corresponds
to 35040 forecasted values (15-minute average measurements over one
year).

For the following analysis, only MAPE is considered as

accuracy measure.

4.3.1 Correlation MAPE versus Wind power

To evaluate the relation between wind power and MAPE, three scatter graphs were made. These scatter graphs present hourly, daily and monthly MAPE indexes together with hourly, daily and monthly averages of actual wind power data, respectively. The scatter graphs for the hourly and daily analysis are presented in Section 3.3.4 of the dissertation report. Figure 13 shows the scatter plot for the monthly analysis (12 values), which has the highest R-squared value of all the scatter plots, equal to 0.376. This indicates that the correlation between MAPE and actual wind power is strongest at the monthly scale.
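The R-squared value can be computed from a least-squares line through the (MAPE, wind power) points. The sketch below assumes a simple linear trendline, which the text does not state explicitly; the function name is illustrative and no REN data is included.

```python
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Coefficient of determination of the least-squares line y = a*x + b.

    For a straight-line fit with intercept, R-squared equals the squared
    Pearson correlation coefficient between x and y.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("R-squared is undefined for a constant series")
    return (sxy * sxy) / (sxx * syy)
```

For the monthly analysis, `x` would hold the 12 monthly MAPE values and `y` the 12 monthly averages of actual wind power.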


Figure 13 – Scatter graph for monthly MAPE and actual wind

power.

4.4.2 MAPE versus time

Figure 14 shows the evolution of the daily MAPE results over the year 2014. The days marked in this figure correspond to the highest MAPE result (10th October) and the lowest MAPE result (17th October).

Figure 14 – Daily MAPE for year 2014.

In order to understand why the daily MAPE result was so high for 10th October, Figure 4.13 and Figure 4.14, in Section 4.4.2 of the dissertation report, show the actual wind power and the ANN-LM forecasted values for this day.

To sum up, the level of actual wind power has a strong impact on the MAPE analysis, since higher MAPE results are expected for lower wind power values and lower MAPE results for higher wind power values.

Figure 15 presents the absolute difference between actual and forecasted daily wind power. To obtain it, the absolute value of the average wind power difference is first calculated for each hour, so that negative terms do not cancel positive ones when computing the daily difference. The days marked in this figure correspond to the highest (11th February) and lowest (17th April) absolute difference between actual and forecasted daily wind power.
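One possible reading of this procedure is sketched below: the 15-minute values are averaged into hourly values, the absolute difference is taken hour by hour, and the hourly absolute differences are summed over the day. This interpretation, the function names and the use of a daily sum are assumptions, since the text does not give the exact formula.

```python
from typing import List, Sequence

SAMPLES_PER_HOUR = 4  # 15-minute averages


def hourly_means(quarter_hourly: Sequence[float]) -> List[float]:
    """Average consecutive groups of four 15-minute values into hourly values."""
    if len(quarter_hourly) % SAMPLES_PER_HOUR != 0:
        raise ValueError("expected a whole number of hours")
    return [
        sum(quarter_hourly[i:i + SAMPLES_PER_HOUR]) / SAMPLES_PER_HOUR
        for i in range(0, len(quarter_hourly), SAMPLES_PER_HOUR)
    ]


def daily_absolute_difference(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Sum over the day of the hourly |actual - forecast| values (assumed reading)."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    return sum(
        abs(a - f) for a, f in zip(hourly_means(actual), hourly_means(forecast))
    )


def daily_signed_difference(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Same sum without the absolute value, to show how errors can cancel."""
    return sum(a - f for a, f in zip(hourly_means(actual), hourly_means(forecast)))
```

With a forecast that is 10 MW too high during the first half of the day and 10 MW too low during the second half, the signed daily difference is zero, whereas the absolute daily difference is not, which is the reason for taking absolute values hour by hour.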

Comparing Figure 15 with Figure 14, it is evident that the maximum and minimum days marked in the two figures are not the same. It can therefore be concluded that MAPE values do not always correspond directly to the absolute difference between actual and forecasted wind power, because MAPE depends strongly on the actual wind power.
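A small numerical example illustrates this effect. The sketch assumes the usual per-sample definition of MAPE (mean of |actual - forecast| / actual, in percent), which may differ from the exact definition used elsewhere in this work, and the power values are invented for illustration only.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean of |actual - forecast|, in the units of the data (e.g. MW)."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """MAPE in percent, assuming the usual per-sample definition."""
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


# Invented values: a low-wind day and a high-wind day with the same 20 MW error.
low_wind_actual = [100.0, 120.0, 80.0]
high_wind_actual = [1000.0, 1200.0, 800.0]
low_wind_forecast = [a + 20.0 for a in low_wind_actual]
high_wind_forecast = [a + 20.0 for a in high_wind_actual]

if __name__ == "__main__":
    print(mean_absolute_error(low_wind_actual, low_wind_forecast),
          mape(low_wind_actual, low_wind_forecast))
    print(mean_absolute_error(high_wind_actual, high_wind_forecast),
          mape(high_wind_actual, high_wind_forecast))
```

Both days have the same absolute error, but the MAPE of the low-wind day is ten times larger, so a ranking of days by MAPE need not match a ranking by absolute difference.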

Figure 15 – Absolute difference between actual and forecasted daily wind power for year 2014.

Figures 4.16 and 4.17 of the dissertation report show the actual wind power and the ANN-LM forecasted values for 11th February and 17th April, respectively. This analysis strengthens the conclusion drawn above: the actual wind power level must be known in order to interpret MAPE results.

4.4.3 Typical days MAPE analysis

The strategy for calculating the typical-day MAPE of each season is presented in the following steps:
- Step 1: Calculate hourly MAPE (8760 values);
- Step 2: Separate the data by season (January, February and December for winter; March, April and May for spring; June, July and August for summer; and September, October and November for autumn);
- Step 3: For each season, the typical-day MAPE is the average, for each hour of the day, over all days in the months of that season. For example, hour 1 for winter is the average MAPE of all hours 1 from the days of January, February and December.
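The steps above can be sketched as follows. The sketch assumes that the hourly MAPE series from Step 1 is already available as a list ordered in time; the function name, the season mapping dictionary and the zero-based hour index (index 0 corresponds to hour 1 in the text) are illustrative choices.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, List, Sequence

# Step 2: month-to-season mapping as described in the text.
SEASON_BY_MONTH = {
    1: "winter", 2: "winter", 12: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def typical_day_mape(start: datetime, hourly_mape: Sequence[float]) -> Dict[str, List[float]]:
    """Average the hourly MAPE by season and hour of day (Step 3).

    hourly_mape[i] is assumed to be the MAPE of the hour that begins at
    start + i hours. Returns 24 averages per season.
    """
    sums = defaultdict(lambda: [0.0] * 24)
    counts = defaultdict(lambda: [0] * 24)
    for i, value in enumerate(hourly_mape):
        stamp = start + timedelta(hours=i)
        season = SEASON_BY_MONTH[stamp.month]
        sums[season][stamp.hour] += value
        counts[season][stamp.hour] += 1
    return {
        season: [s / c if c else float("nan") for s, c in zip(sums[season], counts[season])]
        for season in sums
    }
```

For the year 2014, `start` would be `datetime(2014, 1, 1)` and `hourly_mape` would contain 8760 values.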

Figures 16 to 19 present the typical-day MAPE indexes over the hours of the day for winter, spring, summer and autumn, respectively.

Figure 16 – Evolution of MAPE in a winter day.


Figure 17 – Evolution of MAPE in a spring day.

Figure 18 – Evolution of MAPE in a summer day.

Figure 19 – Evolution of MAPE in an autumn day.

From these figures, some conclusions may be drawn:

- Prediction errors are small, typically ranging in the [1%; 2%]

interval.

- Prediction errors tend to be lower at night than during the day.
- Maximum prediction errors occur around lunchtime in summer and winter, whereas in autumn and spring the maximum prediction error is expected in the morning.
- The autumn MAPE pattern is very smooth, with errors around 1.5%, except for a peak of 4% in the morning.
- Minimum prediction errors are expected during the night for all typical days, varying between 1.0% and 1.5%.

5. CONCLUSION

Test results show higher performance for the ANN-LM forecasting model, with accurate and reliable wind speed and wind power prediction. For wind power prediction, the results indicate a correlation between actual wind power and monthly Mean Absolute Percentage Error (MAPE), with an R-squared value of 0.376, which shows that it is important to analyse these two variables together. In addition, a typical-day analysis was performed for each season, showing that prediction errors are normally small, between 1% and 2%, and tend to be lower at night and higher during the day.

