Research Article

Short-Time Wind Speed Forecast Using Artificial Learning-Based Algorithms

Mariam Ibrahim,1 Ahmad Alsheikh,2 Qays Al-Hindawi,3 Sameer Al-Dahidi,4 and Hisham ElMoaqet1

1 Dept. of Mechatronics Eng., Faculty of Applied Technical Sciences, German Jordanian University, Amman 11180, Jordan
2 Faculty of Applied Sciences and Industrial Engineering, Deggendorf Institute of Technology, Deggendorf 94469, Germany
3 School of Electrical, Information and Media Eng., University of Wuppertal, Wuppertal 42119, Germany
4 Dept. of Mechanical & Maintenance Eng., Faculty of Applied Technical Sciences, German Jordanian University, Amman 11180, Jordan

Correspondence should be addressed to Mariam Ibrahim; [email protected]

Received 9 November 2019; Revised 17 February 2020; Accepted 19 February 2020; Published 25 April 2020

Academic Editor: Leonardo Franco

Copyright © 2020 Mariam Ibrahim et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The need for an efficient power source to operate modern industry has been increasing rapidly in recent years. The latest renewable power sources, however, are difficult to predict: the generated power depends strongly on fluctuating factors such as wind bearing, pressure, wind speed, and the humidity of the surrounding atmosphere. Accurate forecasting methods are therefore of paramount importance to develop and employ in practice. In this paper, a case study of a wind harvesting farm is investigated in terms of collected wind speed data. For data such as wind speed, which are hard to predict, a well-built and well-tested forecasting algorithm must be provided.
To accomplish this goal, four neural network-based algorithms, namely, artificial neural network (ANN), convolutional neural network (CNN), long short-term memory (LSTM), and a hybrid convolutional LSTM (ConvLSTM) model that combines LSTM with CNN, together with one support vector machine (SVM) model, are investigated, evaluated, and compared using different statistical and time indicators to ensure that the final model meets the goal it is built for. Results show that even though SVM delivered the most accurate predictions, ConvLSTM was chosen due to its lower computational effort as well as its high prediction accuracy.

1. Introduction

The need to move towards renewable and clean energy sources has increased considerably over the previous years. Fossil fuels are being used excessively and will eventually waste away. However, renewable energy (RE) sources such as wind, solar, and hydraulic or hydroelectric power are regularly replenished and will sustain forever. Grid operators who use RE face many challenges that lead to variability and uncertainty in power generation. For instance, in the case of solar power, clouds that move above solar power plants can narrow power generation for brief intervals of time. Cloud cover may introduce a very quick shift in the output of solar installations, but solar energy is still considered highly predictable because the motion of the sun is clearly understood [1]. Wind power generation, however, is less predictable due to the fact that fluctuations in wind speed are stochastic in nature. This issue causes a gap between supply and demand. So, in order to enhance and optimize renewable wind power generation, wind speed or power production forecasting models are now being used to resolve this problem. This has led to a huge increase in the installation of wind power plants [2].
Hindawi, Computational Intelligence and Neuroscience, Volume 2020, Article ID 8439719, https://doi.org/10.1155/2020/8439719

As the demand for wind power has increased over the last decades, there is a serious need to set up wind farms and construct facilities that depend on accurately forecasted wind data. Short-term wind forecasting from collected data has a significant



effect on the electricity [3], which is also necessary to identify the size of wind farms.

It is obvious that there is a need for an accurate wind forecasting technique to substantially reduce the cost of wind power scheduling [4]. There are several methods aimed at short-time wind forecasting (e.g., statistical time series and neural networks). For advanced and more accurate forecasting, hybrid models are used. These models combine physical and statistical approaches, short- and medium-term models, and combinations of alternative statistical models.

The concept of artificial neural networks (ANNs) was first introduced by McCulloch and Pitts [5] in 1943 as a computational model for biological neural networks. The convolutional neural network (CNN) was influenced by "Neocognitron" networks, which were first introduced by Fukushima in 1980 [6]. CNN was based on biological processes: hierarchical, multilayered neural networks used for image processing. These networks are capable of "learning without a teacher" for recognition of various catalyst shapes depending on their geometrical designs [7].

Long short-term memory (LSTM) [8] is built upon the recurrent neural network (RNN) structure. It was designed by Hochreiter and Schmidhuber in 1997. LSTM uses the concept proposed in [9], which depends on feedback connections between its layers. Unlike standard feedforward neural networks, LSTM can process entire sequences of data (such as voice or video) and not just single data points (such as images).

The support vector machine (SVM) [10] is a popular machine learning technique that is advanced enough to deal with complex data. It is aimed at dealing with challenging classification problems.

In 2016, convolutional LSTM (ConvLSTM) was used to build a video prediction model by Shi et al. [11]. A tool was developed to forecast action-conditioned video that modeled pixel movement by predicting a distribution over pixel motion from earlier frames. Stacked convolutional LSTMs were employed to generate motion predictions. This approach gained the finest outcomes in predicting future object motion.

An end-to-end learning of driving models was developed in [12] using an LSTM-based algorithm. A trainable structure for learning how to accurately predict a distribution over upcoming vehicle motion was developed by learning a generic vehicle motion model from large-scale crowd-sourced video. The data source used rapid monocular camera observations and past vehicle states. The images were encoded through a long short-term memory fully convolutional network (FCN-LSTM) to determine the relevant graphical representation in every input frame, side by side with a temporal network to use the motion history information. The authors were able to compose an innovative hybrid structure for time-series prediction (TSP) that combined an LSTM temporal encoder with a fully convolutional visual encoder.

Various papers in the literature have explored wind speed forecasting. For instance, a model was introduced by Xu et al. [13] to predict short-term wind speed using LSTM, empirical wavelet transformation (EWT), and Elman neural network approaches. The EWT is implemented to break down the raw wind speed data into multiple sublayers and employ them in an Elman neural network (ENN) and an LSTM network to predict the low- and high-frequency sublayers. An unscented Kalman filter (UKF) along with a support vector regression (SVR)-based state-space model was applied by Chen and Yu [14] to efficiently correct the short-term estimation of the wind speed sequence.

A nonlinear-learning scheme for deep-learning time-series prediction, EnsemLSTM, was developed by Chen et al. [15]. This scheme relied on LSTMs, a support vector regression machine (SVRM), and an extremal optimization (EO) algorithm. Wind speed data are forecasted separately by an array of LSTMs containing hidden layers, with neurons built into every hidden layer. The authors proved that the introduced EnsemLSTM is capable of achieving improved forecasting execution along with the least mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE) and the highest R-squared (R2).

A hybrid model constructed of a wavelet transform (WT) and SVM was proposed by Liu et al. [16] to predict wind speed in the short term. The model is improved by a genetic algorithm (GA), which is implemented to vary essential parameters of the SVM by reducing the produced errors and searching for the optimum parameters to bypass the danger of instability. The presented model is proved to be more efficient than an SVM-GA model. Wang [17] developed a genetic algorithm of wavelet neural network (GAWNN) model. The developed model showed enhanced operation as compared to the normal wavelet neural network (WNN) model in predicting short-term wind power; the improvement appears both at the beginning of network training and in convergence precision.

A prediction model was proposed by Sheikh et al. [18] based on support vector regression (SVR) and a neural network (NN) with the backpropagation technique. Windowing data preprocessing was combined with cross and sliding-window validations in order to predict wind speed with high accuracy. A hybrid method was presented by Nantian et al. [19] that included variational mode decomposition (VMD), partial autocorrelation function (PACF) feature selection, and modular weighted regularized extreme learning machine (WRELM) prediction. The optimal number of decomposition layers was analyzed by the prediction error of one-step forecasting with different decomposition layers.

A robust forecasting model was proposed by Haijian and Deng [20] by evaluating seasonal features and lag space in the wind resource. The proposed model was based on a multilayered perceptron with one hidden layer, trained using the Levenberg–Marquardt optimization method. A least squares support vector machine (LSSVM) was used by Xiaodan [21] for wind speed forecasting. The accuracy of the prediction model parameters was


optimized utilizing particle swarm optimization (PSO) to minimize the fitness function in the training process. Ningsih et al. [22] predicted wind speed using recurrent neural networks (RNNs) with long short-term memory (LSTM). Two optimization models, stochastic gradient descent (SGD) and adaptive moment estimation (Adam), were evaluated. The Adam method was shown to be better and quicker than SGD, with a higher level of accuracy and less deviation from the target.

A nonlinear autoregressive neural network (NARNET) model was developed by Datta [23]. The model employed univariate time-series data to generate hourly wind speed forecasts. The closed-loop structure provided error feedback to the hidden layer to generate the forecast of the next point. A short-term wind speed forecasting method was proposed by Guanlong et al. [24] using a backpropagation (BP) neural network. The weight and threshold values of the BP network are trained and optimized by an improved artificial bee colony algorithm. Then, the gathered samples of wind speed are trained and optimized. When training is finished, test samples are used to forecast and validate.

Fuzzy C-means (FCM) clustering was used by Gonggui et al. [25] to forecast wind speed. The input data of the BP neural network with similar characteristics are divided into corresponding classes, and different BP neural networks are established for each class. The coefficient of variation is used to illustrate the dispersion of the data, and statistical knowledge is used to eliminate input data with large dispersion from the original dataset. Artificial neural networks (ANNs) and decision trees (DTs) were used by ZhanJie and Mazharul Mujib [26] to analyze meteorological data for the application of data mining techniques through cloud computing in wind speed prediction. The neurons in the hidden layer are increased gradually, and the network performance, in the form of an error, is examined. Table 1 highlights the main characteristics of the existing schemes developed for wind speed forecasting.

The novelty of this work lies in enhancing the accuracy of wind speed forecasting by using a hybrid model called ConvLSTM and comparing it with four other commonly used models with optimized lags, hidden neurons, and parameters. This includes testing and comparing the performance of these five different models based on historical data, as well as employing the multi-lags-one-step (MLOS) ahead forecasting concept. MLOS provided efficient generalization to new time-series data and thus increased the overall prediction accuracy. The remainder of this paper is organized as follows. Section 2 describes the four learning algorithms, in addition to a hybrid algorithm, investigated for accurate wind speed forecasting. Section 3 illustrates the study methodology. Section 4 presents a real case study of a wind farm. Section 5 introduces the results and discussion. Finally, conclusions and future works are presented in Section 6.
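As a concrete illustration of the multi-lags-one-step idea mentioned above, the windowing step can be sketched as follows. This is a minimal sketch under the assumption that MLOS means "several lagged observations in, one step-ahead value out"; the function name and toy values are illustrative, not the paper's data.

```python
import numpy as np

def make_mlos_dataset(series, n_lags):
    """Build multi-lags-one-step (MLOS) samples: each row holds n_lags
    consecutive past values, and the target is the next value."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])  # the lag window (inputs)
        y.append(series[t])             # one-step-ahead target
    return np.array(X), np.array(y)

# toy wind-speed series in m/s (illustrative values only)
speeds = np.array([5.1, 5.3, 5.0, 4.8, 5.2, 5.6, 5.4])
X, y = make_mlos_dataset(speeds, n_lags=3)
print(X.shape, y.shape)  # (4, 3) (4,)
```

Each of the five models in this study could then be trained on pairs (X, y) of this shape.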

1.1. Acronyms and Notations. Table 2 lists the acronyms and notations used throughout the paper.

2. Prediction Algorithms

In this section, the algorithms used for wind speed forecasting are summarized as follows.

2.1. LSTM Algorithms. LSTM is built on a unique architecture that empowers it to forget unnecessary information by turning multiplication into addition and by using a function whose second derivative can be preserved over a long range before going to zero, in order to reduce the vanishing gradient problem (VGP). It is constructed of a sigmoid layer that takes the inputs x_t and h_{t−1} and then decides, by generating zeros, which part of the old output should be removed. This process is done through the forget gate f_t; the gate output is given as f_t * c_{t−1}. After that, a vector of all the possible values from the new input is created by a tanh layer. These two results are combined to renew the old memory c_{t−1}, which gives c_t. In other words, the sigmoid layer decides which portions of the cell state will be the outcome. Then, the outcome of the sigmoid gate is multiplied by all the possible values set up through tanh. Thus, the output consists of only the parts that were decided to be generated.

LSTM networks [8] are part of recurrent neural networks (RNNs), which are capable of learning long-term dependencies and are powerful for modeling long-range dependencies. The main criterion of the LSTM network is the memory cell, which can memorize the temporal state. It is also shaped by the addition or removal of information through three controlling gates: the input gate, the forget gate, and the output gate. LSTMs are able to renew and control the information flow in the block using these gates, as in the following equations:

i_t = σ(x_t · θ_{xi} + h_{t−1} · θ_{hi} + θ_{i,bias})  (1)

f_t = σ(x_t · θ_{xf} + h_{t−1} · θ_{hf} + θ_{f,bias})  (2)

c̃_t = tanh(x_t · θ_{xc̃} + h_{t−1} · θ_{hc̃} + θ_{c̃,bias})  (3)

c_t = c̃_t ⊙ i_t + c_{t−1} ⊙ f_t  (4)

o_t = σ(x_t · θ_{xo} + h_{t−1} · θ_{ho} + θ_{o,bias})  (5)

h_t = o_t ⊙ tanh(c_t)  (6)

where "·" denotes matrix multiplication, "⊙" is an elementwise multiplication, and "θ" stands for the weights. c̃_t is the input to the cell c, which is gated by the input gate, while o_t is the output gate. The nonlinear functions σ and tanh are applied elementwise, where σ(x) = 1/(1 + e^{−x}). Equations (1) and (2) establish the gate activations, equation (3) gives the cell input, equation (4) determines the new cell state where the "memories" are stored or deleted, and equation (5) gives the output gate activations, which produce the final output in equation (6).
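Equations (1)–(6) can be exercised with a minimal NumPy cell step. This is an illustrative sketch: the weight shapes, random initialization, and input sizes below are assumptions for demonstration, not the trained model from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, theta):
    """One LSTM step following equations (1)-(6); theta maps each gate name
    to a tuple (input weights, hidden weights, bias)."""
    i_t = sigmoid(x_t @ theta["i"][0] + h_prev @ theta["i"][1] + theta["i"][2])      # (1)
    f_t = sigmoid(x_t @ theta["f"][0] + h_prev @ theta["f"][1] + theta["f"][2])      # (2)
    c_tilde = np.tanh(x_t @ theta["c"][0] + h_prev @ theta["c"][1] + theta["c"][2])  # (3)
    c_t = c_tilde * i_t + c_prev * f_t                                               # (4)
    o_t = sigmoid(x_t @ theta["o"][0] + h_prev @ theta["o"][1] + theta["o"][2])      # (5)
    h_t = o_t * np.tanh(c_t)                                                         # (6)
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4  # e.g. 3 lagged wind speeds, 4 hidden units (illustrative)
theta = {g: (rng.normal(scale=0.1, size=(n_in, n_hid)),
             rng.normal(scale=0.1, size=(n_hid, n_hid)),
             np.zeros(n_hid)) for g in "ifco"}
h, c = np.zeros(n_hid), np.zeros(n_hid)
h, c = lstm_step(rng.normal(size=n_in), h, c, theta)
print(h.shape)  # (4,)
```

Because h_t = o_t ⊙ tanh(c_t) with o_t ∈ (0, 1), every component of the hidden state stays strictly inside (−1, 1).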


Table 1: Main characteristics of the existing wind speed forecasting schemes.

Hybrid wind speed prediction [13]
  Technique: Empirical wavelet transformation (EWT), long short-term memory neural network, and a deep learning algorithm.
  Merits/outcomes: The proposed model has satisfactory multistep forecasting results.
  Demerits: The performance of the EWT for wind speed multistep forecasting has not been studied.
  Dataset: Four sets of original wind speed series including 700 samples.

Wind speed forecasting [14]
  Technique: Unscented Kalman filter (UKF) integrated with a support vector regression (SVR) model.
  Merits/outcomes: Better performance in both one-step-ahead and multistep-ahead predictions than ANN, SVR, autoregressive, and autoregressive-integrated-with-Kalman-filter models.
  Demerits: Needs to develop predictive model-based control and optimization strategies for wind farm operation.
  Dataset: Center for Energy Efficiency and Renewable Energy at University of Massachusetts.

Wind speed forecasting [15]
  Technique: Long short-term memory neural networks, support vector regression machine, and extremal optimization algorithm.
  Merits/outcomes: Better forecasting performance than ARIMA, SVR, ANN, KNN, and GBRT models.
  Demerits: Needs to consider more interrelated features such as weather conditions, human factors, and power system status.
  Dataset: A wind farm in Inner Mongolia, China.

A hybrid short-term wind speed forecasting [16]
  Technique: Wavelet transform (WT), genetic algorithm (GA), and support vector machines (SVMs).
  Merits/outcomes: More efficient than a persistent model and an SVM-GA model without WT.
  Demerits: Needs to augment external information such as air pressure, precipitation, and air humidity besides temperature.
  Dataset: Wind speed data every 0.5 h in a wind farm of North China in September 2012.

Short-term wind speed prediction [18]
  Technique: Support vector regression (SVR) and artificial neural network (ANN) with backpropagation.
  Merits/outcomes: The proposed SVR and ANN models are able to predict wind speed with more than 99% accuracy.
  Demerits: Computationally expensive.
  Dataset: Historical dataset (2008–2014) of wind speed of the Chittagong coastal area from the Bangladesh Meteorological Division (BMD).

Hybrid wind speed forecasting [19]
  Technique: Variational mode decomposition (VMD), partial autocorrelation function (PACF), and weighted regularized extreme learning machine (WRELM).
  Merits/outcomes: (i) The VMD reduces the influences of randomness and volatility of wind speed. (ii) PACF reduces the feature dimension and complexity of the model. (iii) ELM improves the prediction accuracy.
  Demerits: The forecasting accuracy of two-step-ahead and three-step-ahead predictions declined to different degrees.
  Dataset: USA National Renewable Energy Laboratory (NREL) in 2004.

Short-term wind speed forecasting [20]
  Technique: Wavelet analysis and AdaBoosting neural network.
  Merits/outcomes: (i) Benefits the analysis of the wind speed's randomness and the optimal neural network structure. (ii) Can be used to promote the model's configuration and show confidence in high-accuracy forecasting.
  Demerits: Needs to consider a dynamical model with error-correction and adaptive-adjustment ability.
  Dataset: USA National Renewable Energy Laboratory (NREL) in 2004.

Short-term wind speed forecasting [21]
  Technique: Support vector machine (SVM) with particle swarm optimization (PSO).
  Merits/outcomes: The best forecasting accuracy compared to classical SVM and backpropagation neural network models.
  Demerits: Needs additional information for efficient forecasting, such as season and weather variables.
  Dataset: Wind farm data in China in 2011.

Wind speed predictions [22]
  Technique: Recurrent neural network (RNN) with long short-term memory (LSTM).
  Merits/outcomes: The model provides 92.7% accuracy for training data and 91.6% for new data.
  Demerits: High-rate epochs increased the process time and eventually provided low accuracy performance.
  Dataset: Nganjuk Meteorology and Geophysics Agency (BMKG), East Java (2008–2017).

Forecasting multistep-ahead wind speed [23]
  Technique: NARNET model to forecast hourly wind speed using an artificial neural network (ANN).
  Merits/outcomes: The model is cost-effective and can work with minimum availability of statistical data.
  Demerits: (i) Faulty measurements of inputs are likely to affect the model parameters. (ii) Removing rapid changes using a low-pass filter might result in neglecting important information.
  Dataset: Meteorological data from the National Oceanic and Atmospheric Administration (NOAA) located in Dodge City, Kansas (January 2010–December 2010).

Short-term wind speed prediction [24]
  Technique: Backpropagation (BP) neural network based on an improved artificial bee colony algorithm (ABC-BP).
  Merits/outcomes: High precision and fast convergence rate compared with traditional and genetic BP neural networks.
  Demerits: Sensitive to noisy data; therefore, data should be filtered, which may affect the nature of the data.
  Dataset: Wind farm in Tianjin, China (December 2013–January 2014).

Short-term wind speed forecasting [25], 2019
  Technique: Fuzzy C-means clustering (FCM) and improved mind evolutionary algorithm-BP (IMEA-BP).
  Merits/outcomes: Suitable for one-step forecasting and enhances the accuracy of multistep forecasting.
  Demerits: The accuracy of multistep forecasting needs to be further improved.
  Dataset: Wind farm in China.

Predicting wind speed [26]
  Technique: Artificial neural network and decision tree algorithms.
  Merits/outcomes: The platform has the ability of mass storage of meteorological data and efficient query and analysis of weather forecasting.
  Demerits: Needs improvement in order to forecast more realistic weather parameters.
  Dataset: Meteorological data provided by the Dalian Meteorological Bureau (2011–2015).

Our scheme
  Technique: Employing the multi-lags-one-step (MLOS) ahead forecasting technique with artificial learning-based algorithms.
  Merits/outcomes: The provided results suggest that the ConvLSTM model has the best performance as compared to the ANN, CNN, LSTM, and SVM models.
  Demerits: Increasing the number of hidden layers may increase the computational time exponentially.
  Dataset: National Wind Institution, West Texas Mesonet (2012–2015).

Table 2: Acronyms and notations used.

Acronyms:
  ANN: Artificial neural network
  CNN: Convolutional neural network
  LSTM: Long short-term memory
  ConvLSTM: Convolutional LSTM hybrid model
  SVM: Support vector machine
  RE: Renewable energy
  RNN: Recurrent neural network
  EWT: Empirical wavelet transformation
  ENN: Elman neural network
  FC-LSTM: Fully connected long short-term memory
  FCN-LSTM: Long short-term memory fully convolutional network
  TSP: Time-series prediction
  UKF: Unscented Kalman filter
  SVR: Support vector regression
  SVRM: Support vector regression machine
  EO: Extremal optimization
  MAE: Mean absolute error
  RMSE: Root mean square error
  MAPE: Mean absolute percentage error
  R2: R-squared
  WT: Wavelet transform
  GA: Genetic algorithm
  GAWNN: Genetic algorithm of wavelet neural network
  WNN: Wavelet neural network
  MLOS: Multi-lags-one-step
  VGP: Vanishing gradient problem
  LM: Levenberg–Marquardt
  RBF: Radial basis function


2.2. CNN Algorithms. CNN is a feed-forward neural network. To achieve network-architecture optimization and solve for the unknown parameters in the network, the attributes of a two-dimensional image are extracted and backpropagation algorithms are implemented. To achieve the final outcome, the sampled data are fed into the network to extract the needed attributes with prerefining; next, classification or regression is applied [27].

The CNN is composed of basically two types of layers: the convolutional and the pooling layers. The neurons are locally connected between the convolution layer and the preceding layer, and the neurons' local attributes are obtained.

Table 2 (continued): Notations.

  f_t: Forget gate
  c_t: The cell state
  i_t: Input gate
  x_t: Current input data
  h_{t−1}: The previous hidden output
  c̃_t: Input to the cell c
  c: Memory cell
  c_{t−1}: Past cell status
  o_t: Output gate
  h_t: Hidden state
  ·: Matrix multiplication
  ⊙: Elementwise multiplication
  θ: Weight
  σ: Nonlinear function
  z_j: The jth hidden neuron
  p: Number of inputs to the network
  m: Number of hidden neurons
  w_ij: The connection weight from the ith input node to the jth hidden node
  y_{k−i}: The i-step-behind previous wind speed
  f_h(·): The activation function in the hidden layer
  w_j: The connection weight from the jth hidden node to the output node
  ŷ_k: The predicted wind speed at the kth sampling moment
  f_o: The activation function for the output layer
  y_k: Actual wind speed
  x_i: Input vector
  y_i: Output vector
  R^m: The m-dimensional real space
  f(x): A function that describes the correlation between inputs and outputs
  ϕ(x): Preknown mapping function
  R[f]: Structure risk
  w: The regression coefficient vector
  b: Bias term
  C: Punishment coefficient
  L(x_i, y_i, f(x_i)): The ε-insensitive loss function
  ε: Threshold
  ζ_i, ζ*_i: Slack variables that make the constraints feasible
  a_i, a*_i: The Lagrange multipliers
  K(x_i, x_j): The kernel function
  W: The weight matrix
  *: Convolution operation
  b_i, b_f, b_c: Bias vectors
  ∘: Hadamard product
  H_t: Hidden state
  X_t: Current wind speed measure
  X_{t−1}: Previous wind speed measure
  X_{t+1}: Future wind speed measure


The local sensitivity is found through the pooling layer to obtain the attributes repeatedly. The existence of the convolution and pooling layers minimizes the attribute resolution and the number of network specifications that require enhancement.

CNN typically describes data and constructs them as a two-dimensional array, and it is extensively utilized in the area of image processing. In this paper, the CNN algorithm is configured to predict the wind speed and fitted to process a one-dimensional array of data. In the preprocessing phase, the one-dimensional data are reconstructed into a two-dimensional array; this enables the CNN machine algorithm to deal with the data smoothly. This creates two files: the property file and the response file. These files are delivered as inputs to CNN; the response file also contains the data of the expected output value.

Each sample is represented by a line from the property and response files. Weights and biases can be obtained as soon as an acceptable number of samples to train the CNN is delivered. The training continues by comparing the regression results with the response values in order to reach the minimum possible error. This delivers the final trained CNN model, which is utilized to achieve the needed predictions.

The fitting mechanism of CNN is pooling. Various computational approaches have shown that two kinds of pooling can be used: average pooling and maximum pooling. Images are stationary, and all parts of an image share similar attributes; therefore, the pooling approach applies the same average or maximum calculation to every part of a high-resolution image. The pooling process leads to a reduction in the statistical dimensions and an increase in the generalization strength of the model. The results are well optimized and have a lower possibility of overfitting.
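The 1-D-to-2-D reshaping and the two pooling variants described above can be sketched as follows. The window length, the 4x4 grid shape, and the pooling block size are illustrative assumptions, not the configuration used in the paper.

```python
import numpy as np

def pool2d(a, size=2, mode="max"):
    """Non-overlapping 2-D pooling over equal blocks (average or maximum),
    as described for the pooling layer; assumes both dims divisible by size."""
    h, w = a.shape
    blocks = a.reshape(h // size, size, w // size, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))

# Reshape a one-dimensional window into a 2-D array, mirroring the
# preprocessing step described above (here, 16 lagged values -> 4x4 grid).
window = np.arange(16, dtype=float)
grid = window.reshape(4, 4)

mx = pool2d(grid, mode="max")  # [[ 5.  7.] [13. 15.]]
av = pool2d(grid, mode="avg")  # [[ 2.5  4.5] [10.5 12.5]]
print(mx)
print(av)
```

Both variants halve each spatial dimension, which is the dimensionality reduction the text attributes to pooling.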

2.3. ANN Algorithms. An ANN has three layers that build up the network: the input, hidden, and output layers. These layers have the ability to correlate an input vector to an output scalar or vector using activation functions in the various neurons. The jth hidden neuron z_j can be computed from the p inputs and m hidden neurons using the following equation [14]:

z_j = f_h( Σ_{i=1}^{p} w_ij · y_{k−i} )  (7)

where w_ij is the connection weight from the ith input node to the jth hidden node, y_{k−i} is the i-step-behind previous wind speed, and f_h(·) is the activation function in the hidden layer. Therefore, the future wind speed can be predicted through

ŷ_k = f_o( Σ_{j=1}^{m} w_j · z_j )  (8)

where w_j is the connection weight from the jth hidden node to the output node, ŷ_k is the predicted wind speed at the kth sampling moment, and f_o is the activation function for the output layer. By minimizing the error between the actual and predicted wind speeds, y_k and ŷ_k respectively, using the Levenberg–Marquardt (LM) algorithm, the nonlinear mapping efficiency of the ANN can be obtained [28].
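The forward pass of equations (7) and (8) can be sketched in a few lines of NumPy. The weights here are random placeholders (training via Levenberg–Marquardt is not shown), and the activation choices are assumptions for illustration.

```python
import numpy as np

def ann_forecast(lags, W_in, w_out, f_h=np.tanh, f_o=lambda s: s):
    """Feed-forward pass of equations (7) and (8): p lagged wind speeds
    -> m hidden neurons -> one predicted speed."""
    z = f_h(lags @ W_in)   # (7): z_j = f_h( sum_i w_ij * y_{k-i} )
    return f_o(z @ w_out)  # (8): y_hat_k = f_o( sum_j w_j * z_j )

p, m = 3, 5  # 3 lags, 5 hidden neurons (illustrative sizes)
rng = np.random.default_rng(1)
W_in = rng.normal(scale=0.1, size=(p, m))   # untrained placeholder weights
w_out = rng.normal(scale=0.1, size=m)
y_hat = ann_forecast(np.array([5.2, 5.0, 4.9]), W_in, w_out)
print(float(y_hat))
```

In practice the LM algorithm would iteratively adjust W_in and w_out to minimize the error between y_k and ŷ_k.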

2.4. SVM Algorithms. Assume a set of samples {(x_i, y_i)}, i = 1, 2, ..., N, with input vectors x_i ∈ R^m and outputs y_i. The regression problem aims to identify a function f(x) that describes the correlation between inputs and outputs. The interest of SVR is to obtain a linear regression in the high-dimensional feature space delivered by mapping the primary input set using a predefined function φ(x), and to minimize the structural risk R[f]. This mechanism can be written as follows [15]:

\[ f(x) = w^{T}\varphi(x) + b, \qquad R[f] = \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{N} L\big(x_i, y_i, f(x_i)\big), \tag{9} \]

where w, b, and C are, respectively, the regression coefficient vector, the bias term, and the punishment coefficient, and L(x_i, y_i, f(x_i)) is the ε-insensitive loss function. The regression problem can be handled by the following constrained optimization problem:

\[ \min \; \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{N} \big(\zeta_i + \zeta_i^{*}\big) \tag{10} \]
\[ \text{s.t.} \quad y_i - \big(w^{T}\varphi(x_i) + b\big) \le \varepsilon + \zeta_i, \quad \big(w^{T}\varphi(x_i) + b\big) - y_i \le \varepsilon + \zeta_i^{*}, \quad \zeta_i, \zeta_i^{*} \ge 0, \; i = 1, 2, \ldots, N, \]

where ζ_i and ζ_i^* represent the slack variables that keep the constraints feasible. By using the Lagrange multipliers, the regression function can be written as follows:

\[ f(x) = \sum_{i=1}^{N} \big(a_i - a_i^{*}\big) K\big(x_i, x_j\big) + b, \tag{11} \]

where a_i and a_i^* are the Lagrange multipliers that fulfill the conditions a_i ≥ 0, a_i^* ≥ 0, and \(\sum_{i=1}^{N} (a_i - a_i^{*}) = 0\), and K(x_i, x_j) is a general kernel function. In this study, the well-known radial basis function (RBF) is chosen as the kernel function:

Computational Intelligence and Neuroscience 7

\[ K\big(x_i, x_j\big) = \exp\left(-\frac{\|x_i - x_j\|^{2}}{2\sigma^{2}}\right), \tag{12} \]

where σ defines the RBF kernel width [15].
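The RBF kernel of equation (12) and the ε-insensitive loss underlying equations (9)–(10) can be stated directly. A NumPy sketch with illustrative σ and ε values (not the paper's tuned parameters from Table 6):

```python
import numpy as np

def rbf_kernel(xi, xj, sigma):
    """Equation (12): K(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    d2 = np.sum((np.asarray(xi) - np.asarray(xj)) ** 2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def eps_insensitive_loss(y_true, y_pred, eps):
    """Loss used in equations (9)-(10): zero inside the eps tube."""
    return max(0.0, abs(y_true - y_pred) - eps)

print(rbf_kernel([1.0, 2.0], [1.0, 2.0], sigma=0.2))   # identical points -> 1.0
print(eps_insensitive_loss(3.0, 3.25, eps=0.5))        # inside the tube -> 0.0
print(eps_insensitive_loss(3.0, 5.0, eps=0.5))         # 2.0 beyond the tube -> 1.5
```

The ε tube is what makes SVR tolerant of small forecast deviations: only errors larger than ε contribute to the risk term in equation (9).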

2.5. ConvLSTM Algorithm. ConvLSTM is designed to be trained on the spatial information in the dataset, and its aim is to deal with 3-dimensional data as input. Furthermore, it replaces the matrix multiplication with a convolution operation at every gate of the LSTM cell. By doing so, it has the ability to capture the underlying spatial features in multidimensional data. The formulas used at each of the gates (input, forget, and output) are as follows:

\[ i_t = \sigma\big(W_{xi} * x_t + W_{hi} * h_{t-1} + b_i\big) \]
\[ f_t = \sigma\big(W_{xf} * x_t + W_{hf} * h_{t-1} + b_f\big) \]
\[ o_t = \sigma\big(W_{xo} * x_t + W_{ho} * h_{t-1} + b_o\big) \]
\[ C_t = f_t \circ C_{t-1} + i_t \circ \tanh\big(W_{xc} * x_t + W_{hc} * h_{t-1} + b_c\big) \]
\[ H_t = o_t \circ \tanh\big(C_t\big) \tag{13} \]

where i_t, f_t, and o_t are the input, forget, and output gates, W is the weight matrix, x_t is the current input data, h_{t−1} is the previous hidden output, and C_t is the cell state.

The difference between these equations and those of the standard LSTM is that the matrix multiplication (·) is substituted by the convolution operation (*) between W and each of x_t and h_{t−1} at every gate. By doing so, the fully connected layer is replaced by a convolutional layer. Thus, the number of weight parameters in the model can be significantly reduced.
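A minimal single-step sketch of equation (13) on a 1D feature map (NumPy with 'same'-padded convolutions and random weights — purely illustrative assumptions, not the trained model):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv_same(w, x):
    """'Same'-padded 1D convolution standing in for the * operator in eq. (13)."""
    return np.convolve(x, w, mode="same")

def convlstm_step(x_t, h_prev, c_prev, W):
    """One ConvLSTM time step; W holds the kernels and biases per gate."""
    i = sigmoid(conv_same(W["xi"], x_t) + conv_same(W["hi"], h_prev) + W["bi"])
    f = sigmoid(conv_same(W["xf"], x_t) + conv_same(W["hf"], h_prev) + W["bf"])
    o = sigmoid(conv_same(W["xo"], x_t) + conv_same(W["ho"], h_prev) + W["bo"])
    c = f * c_prev + i * np.tanh(conv_same(W["xc"], x_t)
                                 + conv_same(W["hc"], h_prev) + W["bc"])
    h = o * np.tanh(c)                 # H_t = o_t ∘ tanh(C_t)
    return h, c

rng = np.random.default_rng(1)
n, k = 8, 3                            # feature-map length and kernel width (assumed)
W = {name: rng.normal(scale=0.1, size=k)
     for name in ["xi", "hi", "xf", "hf", "xo", "ho", "xc", "hc"]}
W.update({b: 0.0 for b in ["bi", "bf", "bo", "bc"]})
h, c = convlstm_step(rng.normal(size=n), np.zeros(n), np.zeros(n), W)
print(h.shape, c.shape)                # the state keeps the spatial dimension: (8,) (8,)
```

Unlike a fully connected gate, each kernel here has only k weights, which is the parameter reduction the text refers to.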

3. Methodology

Due to the nonlinear, nonstationary attributes and the stochastic variations of the wind speed time series, accurate prediction of wind speed is known to be a challenging effort [29]. In this work, to improve the accuracy of the wind speed forecasting model, a comparison between five models is conducted to forecast wind speed considering the available historical data. A new concept called multi-lags-one-step (MLOS) ahead forecasting is employed to illustrate its effect on the five models' accuracies. Assume that we are at time index X_t. To forecast one output element in the future, X_{t+1}, the input dataset can be split into many lags (past data) X_{t−I}, where I ∈ {1, ..., 10}. By doing so, the model can be trained on more elements before predicting a single event in the future. In addition, the model accuracy showed an improvement until it reached the optimum lagging point, which had the best accuracy. Beyond this point, the model accuracy degrades, as will be illustrated in the Results section.
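The MLOS splitting amounts to a sliding window over the series. A pure-Python sketch (the 1–10 lag range follows the text; the example values are hypothetical):

```python
def make_mlos_pairs(series, n_lags):
    """Build (lagged inputs, one-step-ahead target) pairs.

    For each time index t, the inputs are the n_lags values
    X_{t-n_lags+1} ... X_t and the target is X_{t+1}.
    """
    X, y = [], []
    for t in range(n_lags - 1, len(series) - 1):
        X.append(series[t - n_lags + 1 : t + 1])
        y.append(series[t + 1])
    return X, y

speeds = [3.1, 3.4, 3.0, 2.8, 3.3, 3.9]       # hypothetical wind speeds (mph)
X, y = make_mlos_pairs(speeds, n_lags=3)
print(X[0], y[0])   # -> [3.1, 3.4, 3.0] 2.8
print(len(X))       # -> 3 training pairs from 6 samples
```

Sweeping `n_lags` from 1 to 10 and scoring each setting is exactly the search for the optimum lagging point described above.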

Figure 1 illustrates the workflow of the forecasting model. Specifically, the proposed methodology entails four steps.

In Step 1, the data are collected and averaged from 5 minutes to 30 minutes and to 1 hour, respectively. The datasets are then standardized to a mean value of 0 and a standard deviation of 1. The lagging stage in Step 2 is very important, as the data are split into different lags to study the effect of training the models on more than one element (input) to predict a single event in the future. In Step 3, the models are applied, taking into consideration that some models such as CNN, LSTM, and ConvLSTM need to be adjusted from a matrix-shape perspective; these models normally work with 2D data or more, so manipulation and reshaping of the matrices are conducted in this stage. To check and evaluate the proposed models in Step 4, three main metrics are used to validate the case study (MAE, RMSE, and R²). In addition, the execution time and the optimum lag are taken into account to select the best model.
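Step 1's standardization can be sketched as follows (pure Python; using the population standard deviation is an assumption, since the text does not specify):

```python
def standardize(series):
    """Rescale a series to zero mean and unit standard deviation (Step 1)."""
    n = len(series)
    mean = sum(series) / n
    std = (sum((x - mean) ** 2 for x in series) / n) ** 0.5
    return [(x - mean) / std for x in series]

z = standardize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
mean_z = sum(z) / len(z)
std_z = (sum(v * v for v in z) / len(z)) ** 0.5
print(round(mean_z, 10), round(std_z, 10))   # -> 0.0 1.0
```

In practice the mean and standard deviation would be computed on the training split only and then reused to transform the validation and test splits, to avoid information leakage.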

Algorithm 1 illustrates the training procedure for ConvLSTM.

4. Collected Data

Table 3 illustrates the characteristics of the collected data at a 5-minute time span. The data are collected from a real wind speed dataset over a three-year period from the West Texas Mesonet, with a 5-minute observation period, from near Lake Alan Garza [30]. The data are processed by averaging from 5 minutes to 30 minutes (whose statistical characteristics are given in Table 4) and one more time to 1 hour (whose statistical characteristics are given in Table 5). The goal of averaging is to study the effect of reducing the data size in order to compare the five models and then select the one that can achieve the highest accuracy for the three dataset cases. As shown in the three tables, the datasets are almost identical and preserve their seasonality; they are also not affected by the averaging process.
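The averaging from 5-minute samples to 30-minute and hourly values can be sketched as non-overlapping block means (pure Python; the sample values are hypothetical):

```python
def block_average(samples, block):
    """Average consecutive, non-overlapping blocks (e.g., 6 x 5 min = 30 min)."""
    return [sum(samples[i:i + block]) / block
            for i in range(0, len(samples) - block + 1, block)]

five_min = [3.0, 3.2, 3.4, 3.6, 3.8, 4.0,   # first half hour of 5-min readings
            4.2, 4.4, 4.6, 4.8, 5.0, 5.2]   # second half hour
half_hour = block_average(five_min, 6)      # ~[3.5, 4.7]
hourly = block_average(five_min, 12)        # ~[4.1]
print(half_hour, hourly)
```

Each averaging pass cuts the dataset size by the block factor (6x and 12x here), which is the data-size reduction the comparison targets.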

The data have been split into three sets (training, validation, and test) with fractions of 53%, 14%, and 33%, respectively.
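A split with the fractions reported above can be sketched as follows (pure Python; a chronological rather than shuffled split is an assumption, since the text does not specify):

```python
def split_dataset(series, train_frac=0.53, val_frac=0.14):
    """Split a series chronologically into training/validation/test sets."""
    n = len(series)
    i_train = int(n * train_frac)
    i_val = i_train + int(n * val_frac)
    return series[:i_train], series[i_train:i_val], series[i_val:]

data = list(range(100))
train, val, test = split_dataset(data)
print(len(train), len(val), len(test))   # -> 53 14 33
```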

5. Results and Discussion

To quantitatively evaluate the performance of the predictive models, commonly used statistical measures are tested [20]. All of them measure the deviation between the actual and predicted wind speed values. Specifically, RMSE, MAE, and R² are defined as follows:

\[ \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\big(y_i - \hat{y}_i\big)^{2}} \]
\[ \mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\big|y_i - \hat{y}_i\big| \]
\[ R^{2} = 1 - \frac{\sum_{i=1}^{N}\big(y_i - \hat{y}_i\big)^{2}}{\sum_{i=1}^{N}\big(y_i - \bar{y}\big)^{2}} \tag{14} \]

where y_i and \hat{y}_i are the actual and the predicted wind speed, respectively, while \bar{y} is the mean value of the actual wind speed sequence. Typically, smaller amplitudes of these measures indicate an improved forecasting procedure, while R² is the goodness-of-fit measure for the model; therefore, the larger its value, the better the fit. The testbed environment configuration is given below.
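The three measures in equation (14) can be computed directly; a pure-Python sketch with hypothetical values:

```python
def rmse(y_true, y_pred):
    n = len(y_true)
    return (sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n) ** 0.5

def mae(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - y_bar) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot

actual    = [3.0, 4.0, 5.0, 6.0]
predicted = [3.0, 4.0, 5.0, 6.0]
print(rmse(actual, predicted), mae(actual, predicted), r_squared(actual, predicted))
# a perfect forecast -> 0.0 0.0 1.0
```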


Figure 1: The proposed forecasting methodology. Raw time-series data are collected and averaged (5 min, 30 min, 1 hr), preprocessed, lagged (L1–L10), modeled (ANN, CNN, LSTM, ConvLSTM, SVR), and evaluated against the KPIs (MSE, RMSE, MAE, R², delay, and execution time) to select the best-fitted model.

Input: the wind speed time series data. Output: forecasting performance indices.

(1) The wind speed time series data are measured every 5 minutes and averaged twice, to 30 minutes and 1 hour, respectively.
(2) The wind datasets are split into training, validation, and test sets.
(3) Initiate the multi-lags-one-step (MLOS) arrays for the training, validation, and test sets.
(4) Define the MLOS range as 1..10 to optimize the number of needed lags.
(5) loop 1:
      Split the first set based on the MLOS range.
      Initiate and extract set features with a CNN layer.
      Pass the output to a defined LSTM layer.
      Select the first value in the range of hidden-neuron counts.
      Generate the prediction results and performance indices.
      Count the time needed to execute and produce the prediction results.
      Save and compare the results with the previous ones.
    loop 2:
      Select the next MLOS value.
      If the MLOS value = maximum of the range, then go to loop 3 and reinitialize the MLOS range.
      Go to loop 1.
    loop 3:
      Select a new number of hidden neurons.
      If the number of hidden neurons = maximum of the range, then go to loop 4 and reinitialize the hidden-neuron range.
      Go to loop 1.
(6) loop 4:
      Select a new dataset from the sets (5 min, 30 min, and 1 hr) and go to loop 1.
      If the dataset range = maximum of the range, then generate the performance indices of all tested sets and select the best results metrics.

Algorithm 1: ConvLSTM training.
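The nested loops of Algorithm 1 amount to a grid search over lags, hidden neurons, and datasets. A skeleton of that search, with a hypothetical `train_and_score` stub standing in for the actual ConvLSTM training and validation scoring:

```python
import itertools
import time

def train_and_score(dataset, n_lags, n_neurons):
    """Hypothetical stand-in for ConvLSTM training plus evaluation.

    A real implementation would build the CNN/LSTM layers, fit on the
    training set, and return a validation score (e.g., R^2). Here a toy
    surrogate peaks at 4 lags and 15 neurons, echoing the paper's
    reported optimum for the 5-min case.
    """
    return 1.0 / (1 + abs(n_lags - 4) + abs(n_neurons - 15))

datasets = ["5min", "30min", "1hr"]
lag_range = range(1, 11)             # loop 2: MLOS range 1..10
neuron_range = [5, 8, 15, 17, 20]    # loop 3: candidate hidden-neuron counts

best = {}
for ds in datasets:                  # loop 4: iterate over the three datasets
    results = []
    for lags, neurons in itertools.product(lag_range, neuron_range):
        t0 = time.perf_counter()
        score = train_and_score(ds, lags, neurons)
        elapsed = time.perf_counter() - t0   # execution-time KPI per trial
        results.append((score, lags, neurons, elapsed))
    best[ds] = max(results)          # keep the best-scoring configuration
print(best["5min"][1:3])             # the surrogate's optimum -> (4, 15)
```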


(1) CPU: Intel(R) Core(TM) i7-8550U CPU @ 1.80 GHz (2001 MHz), 4 core(s), 8 logical processor(s)
(2) RAM: installed physical memory 16.0 GB
(3) GPU: AMD Radeon(TM) RX 550, 1.0 GB
(4) Framework: Anaconda 2019.07, Python 3.7

Table 6 illustrates the chosen optimized internal parameters (hyperparameters) for the forecasting methods used in this work. For each method, the optimal number of hidden neurons is chosen to achieve the maximum R² and the minimum RMSE and MAE values.

After the implementation of CNN, ANN, LSTM, ConvLSTM, and SVM, the most fitted model was chosen depending on its accuracy in predicting future wind speed values. Thus, the seasonality is considered in the forecast mechanism. The chosen model has to deliver the most fitted data with the least amount of error, taking into consideration the nature of the data and not applying naive forecasting to it.

To achieve this goal, the statistical error indicators are calculated for every model and time lapse, as Figure 2 illustrates. The provided results suggest that the ConvLSTM model has the best performance as compared to the other four models. The chosen model has to reach the minimum values of RMSE and MAE and the maximum R² value.

Different parameters are also tested to ensure the right decision in choosing the best-fitted model. The optimum number of lags, presented in Table 7, is one of the most important indicators in selecting the best-fitted model, since the fewer historical points the model needs, the lower the computational effort. For each method, the optimal number of lags is chosen to achieve the maximum R² and the minimum RMSE and MAE values. For instance, Figures 3 and 4 show the relation between the statistical measures and the number of lags and hidden neurons, respectively, for the proposed ConvLSTM method in the 5-minute time span case. It can be seen that 4 lags and 15 hidden neurons achieved the maximum R² and the minimum RMSE and MAE values.

The execution time shown in Table 8 is calculated for each method and time lapse to ensure that the final chosen model is efficient and can effectively predict future wind speed. The shorter the execution time, the more efficient and helpful the model; this is also a sign that the model is suitable for further modifications. According to Table 8, the ConvLSTM model beats all other models in the time needed to process the historical data and deliver a final prediction. SVM needed about 54 minutes to accomplish the training and produce testing results, while ConvLSTM made it in just 1.7 minutes. This huge difference between them motivated the choice of ConvLSTM.

Table 3: Dataset characteristics for the 5-min samples.

Dataset            Max     Median   Min    Mean   Std
All datasets       18.73   3.53     0.01   3.91   2.10
Training dataset   18.73   3.47     0.01   3.83   2.05
Test dataset       14.87   3.67     0.01   4.05   2.20

Table 4: Dataset characteristics for the 30-min samples.

Dataset            Max     Median   Min    Mean       Std
All datasets       17.66   3.53     0.01   3.91       2.08
Training dataset   17.66   3.47     0.01   3.837309   2.02
Test dataset       14.32   3.67     0.02   4.05       2.18

Table 5: Dataset characteristics for the 1-hour samples.

Dataset            Max     Median   Min    Mean   Std
All datasets       17.61   3.53     0.07   3.91   2.05
Training dataset   17.61   3.46     0.07   3.83   2.00
Test dataset       14.22   3.66     0.07   4.05   2.15

Table 6: Optimized internal parameters for the forecasting methods.

Method     Sets of parameters
LSTM       5 min: 17 hidden neurons; 30 min: 20 hidden neurons; 1 hour: 8 hidden neurons
CNN        5 min: 15 hidden neurons; 30 min: 5 hidden neurons; 1 hour: 15 hidden neurons
ConvLSTM   5 min: 15 hidden neurons; 30 min: 8 hidden neurons; 1 hour: 20 hidden neurons
ANN        5 min: 15 hidden neurons; 30 min: 15 hidden neurons; 1 hour: 20 hidden neurons
SVR        5 min: C = 7, ε = 0.1, c = 0.2; 30 min: C = 1, ε = 0.25, c = 0.15; 1 hour: C = 1, ε = 0.1, c = 0.05


Figure 5 shows that the 5-minute lapse dataset is the most fitted dataset for our chosen model. It indicates how accurate the prediction of future wind speed will be.

For completeness, to effectively evaluate the investigated forecasting techniques in terms of their prediction accuracies, a 50-trial cross-validation procedure is carried out, in which the investigated techniques are built and then evaluated on 50 different training and test datasets, respectively, randomly sampled from the available overall dataset. The ultimate performance metrics are then reported as the average and the standard deviation values of the 50 metrics obtained in the cross-validation trials. In this regard, Figure 6 shows the average performance metrics on the test dataset using the 50-trial cross-validation procedure. It can be easily recognized that the forecasting models that employ the LSTM technique outperform the other investigated techniques in terms of the three performance metrics R², RMSE, and MAE.

From the experimental results of short-term wind speed forecasting shown in Figure 6, we can observe that ConvLSTM performs the best in terms of the forecasting metrics (R², RMSE, and MAE) as compared to the other models (i.e., CNN, ANN, SVR, and LSTM). The related statistics in Tables 6 and 7, respectively, have proved the effectiveness of ConvLSTM and its capability of handling noisy, large data. ConvLSTM showed that it can produce highly accurate wind speed predictions with fewer lags and hidden neurons. This was indeed reflected in the results shown in Table 8, with less computation time as compared to the other tested models. Furthermore, we introduced multi-lags-one-step (MLOS) ahead forecasting combined with the hybrid ConvLSTM model to provide efficient generalization to new time series data and to predict wind speed accurately. The results showed that the ConvLSTM model proposed in this paper is an effective and promising model for wind speed forecasting.

Figure 2: Models' key performance indicators (KPIs). Values shown in the bar charts:

(a) R²
          CNN      ANN      LSTM     SVM      ConvLSTM
5 min     0.9873   0.9873   0.9876   0.9815   0.9876
30 min    0.9091   0.9089   0.9097   0.9078   0.9098
1 hour    0.8309   0.8302   0.8315   0.8211   0.8316

(b) RMSE
          CNN      ANN      LSTM     SVM      ConvLSTM
5 min     0.2482   0.2483   0.2424   0.2462   0.2419
30 min    0.6578   0.6588   0.6467   0.6627   0.6463
1 hour    0.8862   0.8881   0.8718   0.9116   0.8717

(c) MAE
          CNN      ANN      LSTM     SVM      ConvLSTM
5 min     0.1806   0.1798   0.1758   0.1855   0.1755
30 min    0.4802   0.4799   0.4721   0.4818   0.4741
1 hour    0.6468   0.6473   0.6392   0.6591   0.6369

Table 7: Optimum number of lags.

Method     5 min   30 min   1 hour
CNN        9       4        4
ANN        3       3        3
LSTM       7       6        10
ConvLSTM   4       5        8
SVR        9       4        5


Figure 3: ConvLSTM measured statistical values (R², RMSE, and MAE) versus the number of hidden neurons.

Figure 4: ConvLSTM measured statistical values (R², RMSE, and MAE) versus the number of lags.


Similar to our work, the EnsemLSTM model proposed by Chen et al. [15] contained different clusters of LSTMs with different hidden layers and hidden neurons. They combined the LSTM clusters with SVR and an external optimizer in order to enhance the generalization capability and robustness of their model. However, their model showed a high computational complexity with mediocre performance indices. Our proposed ConvLSTM with MLOS assured boosting the generalization and robustness for the new time series data as well as producing high performance indices.

Table 8: Execution time (minutes).

Method     5 min     30 min   1 hour
ConvLSTM   1.7338    0.3849   0.1451
SVR        54.1424   0.8214   0.2250
CNN        0.87828   0.1322   0.0708
ANN        0.7431    0.2591   0.0587
LSTM       1.6570    0.3290   0.1473

Figure 5: ConvLSTM true vs. predicted wind speed (mph) over the number of samples, for the 5-min time span.

Figure 6: Average performance metrics (R², RMSE, and MAE) obtained on the test dataset using the 50-trial cross-validation procedure.

6. Conclusions

In this study, we proposed a hybrid deep learning-based framework, ConvLSTM, for short-term prediction of wind speed time series measurements. The proposed dynamic prediction model was optimized for the number of input lags and the number of internal hidden neurons. Multi-lags-one-step (MLOS) ahead wind speed forecasting using the proposed approach showed superior results compared to four other models built using standard ANN, CNN, LSTM, and SVM approaches. The proposed modeling framework combines the benefits of CNN and LSTM networks in a hybrid modeling scheme that shows highly accurate wind speed prediction results with fewer lags and hidden neurons, as well as less computational complexity. For future work, further investigation can be done to improve the accuracy of the ConvLSTM model, for instance, by increasing and optimizing the number of hidden layers, applying multi-lags-multi-steps (MLMS) ahead forecasting, and introducing a reinforcement learning agent to optimize the parameters as compared to other optimization methods.

Data Availability

The wind speed data used in this study have been taken from the West Texas Mesonet of the US National Wind Institute (http://www.depts.ttu.edu/nwi/research/facilities/wtm/index.php). Data are provided freely for academic research purposes only and cannot be shared/distributed beyond academic research use without permission from the West Texas Mesonet.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors acknowledge the help of Prof. Brian Hirth at Texas Tech University for providing them with access to weather data.

References

[1] R. Rothermel, "How to predict the spread and intensity of forest and range fires," General Technical Report INT-143, Intermountain Forest and Range Experiment Station, Ogden, UT, USA, 1983.

[2] A. Tascikaraoglu and M. Uzunoglu, "A review of combined approaches for prediction of short-term wind speed and power," Renewable and Sustainable Energy Reviews, vol. 34, pp. 243–254, 2014.

[3] R. J. Barthelmie, F. Murray, and S. C. Pryor, "The economic benefit of short-term forecasting for wind energy in the UK electricity market," Energy Policy, vol. 36, no. 5, pp. 1687–1696, 2008.

[4] Y.-K. Wu and H. Jing-Shan, A Literature Review of Wind Forecasting Technology in the World, pp. 504–509, IEEE Lausanne Powertech, Lausanne, Switzerland, 2007.

[5] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," The Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115–133, 1943.

[6] K. Fukushima, "Neocognitron," Scholarpedia, vol. 2, no. 1, p. 1717, 2007.

[7] K. Fukushima, "Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics, vol. 36, no. 4, pp. 193–202, 1980.

[8] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.

[9] H. T. Siegelmann and E. D. Sontag, "On the computational power of neural nets," Journal of Computer and System Sciences, vol. 50, no. 1, pp. 132–150, 1995.

[10] R. Sunil, "Understanding support vector machine algorithm," https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code, 2017.

[11] X. Shi, Z. Chen, H. Wang, D. Y. Yeung, W. K. Wong, and W. C. Woo, "Convolutional LSTM network: a machine learning approach for precipitation nowcasting," in Proceedings of the Neural Information Processing Systems, pp. 7–12, Montreal, Canada, December 2015.

[12] C. Finn, I. Goodfellow, and S. Levine, "Unsupervised learning for physical interaction through video prediction," in Proceedings of the Annual Conference on Neural Information Processing Systems, pp. 64–72, Barcelona, Spain, December 2016.

[13] H. Xu, Y. Gao, F. Yu, and T. Darrell, "End-to-end learning of driving models from large-scale video datasets," in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3530–3538, Honolulu, HI, USA, July 2017.

[14] K. Chen and J. Yu, "Short-term wind speed prediction using an unscented Kalman filter based state-space support vector regression approach," Applied Energy, vol. 113, pp. 690–705, 2014.

[15] J. Chen, G.-Q. Zeng, W. Zhou, W. Du, and K.-D. Lu, "Wind speed forecasting using nonlinear-learning ensemble of deep learning time series prediction and extremal optimization," Energy Conversion and Management, vol. 165, pp. 681–695, 2018.

[16] D. Liu, D. Niu, H. Wang, and L. Fan, "Short-term wind speed forecasting using wavelet transform and support vector machines optimized by genetic algorithm," Renewable Energy, vol. 62, pp. 592–597, 2014.

[17] Y. Wang, "Short-term wind power forecasting by genetic algorithm of wavelet neural network," in Proceedings of the 2014 International Conference on Information Science, Electronics and Electrical Engineering, pp. 1752–1755, Sapporo, Japan, April 2014.

[18] N. Sheikh, D. B. Talukdar, R. I. Rasel, and N. Sultana, "A short term wind speed forcasting using SVR and BP-ANN: a comparative analysis," in Proceedings of the 2017 20th International Conference of Computer and Information Technology (ICCIT), pp. 1–6, Dhaka, Bangladesh, December 2017.

[19] H. Nantian, C. Yuan, G. Cai, and E. Xing, "Hybrid short term wind speed forecasting using variational mode decomposition and a weighted regularized extreme learning machine," Energies, vol. 9, no. 12, pp. 989–1007, 2016.

[20] S. Haijian and X. Deng, "AdaBoosting neural network for short-term wind speed forecasting based on seasonal characteristics analysis and lag space estimation," Computer Modeling in Engineering & Sciences, vol. 114, no. 3, pp. 277–293, 2018.

[21] W. Xiaodan, "Forecasting short-term wind speed using support vector machine with particle swarm optimization," in Proceedings of the 2017 International Conference on Sensing, Diagnostics, Prognostics and Control (SDPC), pp. 241–245, Shanghai, China, August 2017.

[22] F. R. Ningsih, E. C. Djamal, and A. Najmurrakhman, "Wind speed forecasting using recurrent neural networks and long short term memory," in Proceedings of the 2019 6th International Conference on Instrumentation, Control and Automation (ICA), pp. 137–141, Bandung, Indonesia, July–August 2019.

[23] P. K. Datta, "An artificial neural network approach for short-term wind speed forecast," Master's thesis, Kansas State University, Manhattan, KS, USA, 2018.

[24] J. Guanlong, D. Li, L. Yao, and P. Zhao, "An improved artificial bee colony-BP neural network algorithm in the short-term wind speed prediction," in Proceedings of the 2016 12th World Congress on Intelligent Control and Automation (WCICA), pp. 2252–2255, Guilin, China, June 2016.

[25] C. Gonggui, J. Chen, Z. Zhang, and Z. Sun, "Short-term wind speed forecasting based on fuzzy C-means clustering and improved MEA-BP," IAENG International Journal of Computer Science, vol. 46, no. 4, pp. 27–35, 2019.

[26] W. ZhanJie and A. B. M. Mazharul Mujib, "The weather forecast using data mining research based on cloud computing," Journal of Physics: Conference Series, vol. 910, no. 1, Article ID 012020, 2017.

[27] A. Zhu, X. Li, Z. Mo, and R. Wu, "Wind power prediction based on a convolutional neural network," in Proceedings of the 2017 International Conference on Circuits, Devices and Systems (ICCDS), pp. 131–135, Chengdu, China, September 2017.

[28] O. Linda, T. Vollmer, and M. Manic, "Neural network based intrusion detection system for critical infrastructures," in Proceedings of the 2009 International Joint Conference on Neural Networks, pp. 1827–1834, Atlanta, GA, USA, June 2009.

[29] C. Li, Z. Xiao, X. Xia, W. Zou, and C. Zhang, "A hybrid model based on synchronous optimisation for multi-step short-term wind speed forecasting," Applied Energy, vol. 215, pp. 131–144, 2018.

[30] National Wind Institute, West Texas Mesonet, online referencing, http://www.depts.ttu.edu, 2018.


effect on the electricity [3], which is also necessary to identify the size of wind farms.

It is obvious that there is a need for an accurate wind forecasting technique to substantially reduce the cost of wind power scheduling [4]. There are several methods aimed at short-time wind forecasting (e.g., statistical, time series, and neural networks). For advanced and more accurate forecasting, hybrid models are used. These models combine physical and statistical approaches, short- and medium-term models, and combinations of alternative statistical models.

The concept of artificial neural networks (ANNs) was first introduced by McCulloch and Pitts [5] in 1943 as a computational model for biological neural networks. The convolutional neural network (CNN) was influenced by the "Neocognitron" networks first introduced by Fukushima in 1980 [6]. CNN was based on biological processes, employing hierarchical multilayered neural networks used for image processing. These networks are capable of "learning without a teacher" for the recognition of various catalyst shapes depending on their geometrical designs [7].

Long short-term memory (LSTM) [8] is built upon the recurrent neural network (RNN) structure. It was designed by Hochreiter and Schmidhuber in 1997. LSTM uses the concept proposed in [9], which depends on feedback connections between its layers. Unlike standard feedforward neural networks, LSTM can process entire sequences of data (such as voice or video) and not just single data points (such as images).

Support vector machine (SVM) [10] is a popular machine learning technique which is advanced enough to deal with complex data. It is aimed at dealing with challenges in classification problems.

In 2016, convolutional LSTM (ConvLSTM) was used to build a video prediction model by Shi et al. [11]. A tool was developed to prognose action-conditioned video that modeled pixel movement by predicting a distribution over pixel movement from earlier frames. Stacked convolutional LSTMs were employed to generate motion predictions. This approach has gained the finest outcomes in predicting future object motion.

An end-to-end learning of driving models was developed in [12] using an LSTM-based algorithm. A trainable structure for learning how to accurately predict a distribution among upcoming vehicle movements was developed through learning generic vehicle motion from large-scale crowd-sourced video. The data source used rapid monocular camera observations and past vehicle states. The images were encoded through a long short-term memory fully convolutional network (FCN-LSTM) to determine the related graphical illustration in every input frame, side by side with a temporal network that uses the movement history information. The authors were able to compose an innovative hybrid structure for time-series prediction (TSP) that combined an LSTM temporal encoder with a fully convolutional visual encoder.

Various papers in the literature have explored wind speed forecasting. For instance, a model was introduced by Xu et al. [13] to predict short-term wind speed using LSTM, empirical wavelet transformation (EWT), and Elman neural network approaches. The EWT is implemented to break down the raw wind speed data into multiple sublayers and employ them in an Elman neural network (ENN) and an LSTM network to predict the low- and high-frequency sublayers. An unscented Kalman filter (UKF) along with a support vector regression (SVR) based state-space model was applied by Chen and Yu [14] to efficiently correct the short-term estimation of the wind speed chain.

A nonlinear-learning scheme of deep learning time series prediction, EnsemLSTM, was developed by Chen et al. [15]. This scheme relied on LSTMs, a support vector regression machine (SVRM), and an extremal optimization algorithm (EO). Wind speed data are forecasted separately by an array of LSTMs that contain hidden layers, with neurons built in every hidden layer. The authors proved that the introduced EnsemLSTM is capable of achieving an improved forecasting execution along with the least mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE), and the highest R-squared (R²).

A hybrid model constructed of wavelet transform (WT) and SVM was proposed by Liu et al. [16] to predict wind speed in the short term. The model is improved by a genetic algorithm (GA), which is implemented to vary essential specifications of the SVM by reducing the produced errors and searching for the optimum specifications to bypass the danger of instability. The presented model is proved to be more efficient than the SVM-GA model. Wang [17] developed a genetic algorithm of wavelet neural network (GAWNN) model. The developed model showed enhanced operation as compared to the normal wavelet neural network (WNN) model in predicting short-term wind power. The improvement can be located at the beginning of network training as well as in convergent precision.

A prediction model was proposed by Sheikh et al. [18] based on support vector regression (SVR) and a neural network (NN) with the backpropagation technique. A windowing data preprocessing was combined with cross and sliding window validations in order to predict wind speed with high accuracy. A hybrid method was presented by Nantian et al. [19] which included variational mode decomposition (VMD), partial autocorrelation function (PACF) feature selection, and modular weighted regularized extreme learning machine (WRELM) prediction. The optimal number of decomposition layers was analyzed by the prediction error of one-step forecasting with different decomposition layers.

A robust forecasting model was proposed by Haijian and Deng [20] by evaluating seasonal features and lag space in the wind resource. The proposed model was based on a multilayered perceptron with one hidden layer, using the Levenberg–Marquardt optimization method. The least squares support vector machine (LSSVM) was used by Xiaodan [21] for wind speed forecasting. The accuracy of the prediction model parameters was optimized utilizing particle swarm optimization (PSO) to minimize the fitness function in the training process. Ningsih et al. [22] predicted wind speed using recurrent neural networks (RNNs) with long short-term memory (LSTM). Two optimization models, stochastic gradient descent (SGD) and adaptive moment estimation (Adam), were evaluated. The Adam method was shown to be better and quicker than SGD, with a higher level of accuracy and less deviation from the target.

A nonlinear autoregressive neural network (NARNET) model was developed by Datta [23]. The model employed univariate time series data to generate hourly wind speed forecasts. The closed-loop structure provided error feedback to the hidden layer to generate the forecast of the next point. A short-term wind speed forecasting method was proposed by Guanlong et al. [24] using a backpropagation (BP) neural network. The weight and threshold values of the BP network are trained and optimized by the improved artificial bee colony algorithm. Then the gathered samples of wind speed are trained and optimized. When training is finished, test samples are used to forecast and validate.

Fuzzy C-means (FCM) clustering was used by Gonggui et al. [25] to forecast wind speed. The input data of the BP neural network with similar characteristics are divided into corresponding classes, and different BP neural networks are established for each class. The coefficient of variation is used to illustrate the dispersion of the data, and statistical knowledge is used to eliminate input data with large dispersion from the original dataset. Artificial neural networks (ANNs) and decision trees (DTs) were used by ZhanJie and Mazharul Mujib [26] to analyze meteorological data for the application of data mining techniques through cloud computing in wind speed prediction. The neurons in the hidden layer are increased gradually, and the network performance, in the form of an error, is examined. Table 1 highlights the main characteristics of the existing schemes developed for wind speed forecasting.

The novelty of this work lies in enhancing the accuracy of wind speed forecasting by using a hybrid model called ConvLSTM and comparing it with four other commonly used models with optimized lags, hidden neurons, and parameters. This includes testing and comparing the performance of these five different models based on historical data, as well as employing the multi-lags-one-step (MLOS) ahead forecasting concept. MLOS provided an efficient generalization to new time series data and thus increased the overall prediction accuracy. The remainder of this paper is organized as follows. Section 2 describes the four learning algorithms, in addition to a hybrid algorithm, investigated for accurate wind speed forecasting. Section 3 illustrates the study methodology. Section 4 presents a real case study of a wind farm. Section 5 introduces the results and discussion. Finally, conclusions and future works are presented in Section 6.

1.1. Acronyms and Notations. Table 2 illustrates the acronyms and notations used throughout the paper.

2. Prediction Algorithms

In this section, the algorithms used for wind speed forecasting are summarized as follows.

2.1. LSTM Algorithms. LSTM is built on a unique architecture that empowers it to forget unnecessary information by turning multiplication into addition and by using a function whose second derivative can persist over a long range before going to zero, in order to reduce the vanishing gradient problem (VGP). It is constructed of a sigmoid layer which takes the inputs x_t and h_{t−1} and then decides, by generating zeros, which part of the old output should be removed. This process is done through the forget gate f_t. The gate output is given as f_t ∗ c_{t−1}. After that, a vector of all the possible values from the new input is created by a tanh layer. These two results are combined to renew the old memory c_{t−1}, which gives c_t. In other words, the sigmoid layer decides which portions of the cell state will be the outcome. Then the outcome of the sigmoid gate is multiplied by all possible values that are set up through tanh. Thus, the output consists of only the parts that are decided to be generated.

LSTM networks [8] are part of recurrent neural networks (RNNs), which are capable of learning long-term dependencies and are powerful for modeling long-range dependencies. The main criterion of the LSTM network is the memory cell, which can memorize the temporal state. It is also shaped by the addition or removal of information through three controlling gates: the input gate, the forget gate, and the output gate. LSTMs are able to renew and control the information flow in the block using these gates via the following equations:

i_t = σ(x_t · θ_{xi} + h_{t−1} · θ_{hi} + θ_{i,bias}),    (1)

f_t = σ(x_t · θ_{xf} + h_{t−1} · θ_{hf} + θ_{f,bias}),    (2)

c̃_t = tanh(x_t · θ_{xc̃} + h_{t−1} · θ_{hc̃} + θ_{c̃,bias}),    (3)

c_t = c̃_t ⊙ i_t + c_{t−1} ⊙ f_t,    (4)

o_t = σ(x_t · θ_{xo} + h_{t−1} · θ_{ho} + θ_{o,bias}),    (5)

h_t = o_t ⊙ tanh(c_t),    (6)

where "·" denotes matrix multiplication, "⊙" is an elementwise multiplication, and "θ" stands for the weights; c̃_t is the input to the cell c_t, which is gated by the input gate, while o_t is the output. The nonlinear functions σ and tanh are applied elementwise, where σ(x) = 1/(1 + e^{−x}). Equations (1) and (2) establish the gate activations, equation (3) indicates the cell inputs, equation (4) determines the new cell states where the "memories" are stored or deleted, and equation (5) results in the output gate activations, which yield the final output in equation (6).
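As a concrete illustration of equations (1)–(6), the following NumPy sketch performs a single LSTM cell update. The weights are random placeholders, not trained values, and the layer sizes are arbitrary:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, theta):
    """One LSTM cell update following equations (1)-(6)."""
    i_t = sigmoid(x_t @ theta["xi"] + h_prev @ theta["hi"] + theta["ib"])      # (1) input gate
    f_t = sigmoid(x_t @ theta["xf"] + h_prev @ theta["hf"] + theta["fb"])      # (2) forget gate
    c_tilde = np.tanh(x_t @ theta["xc"] + h_prev @ theta["hc"] + theta["cb"])  # (3) cell input
    c_t = c_tilde * i_t + c_prev * f_t                                         # (4) new cell state
    o_t = sigmoid(x_t @ theta["xo"] + h_prev @ theta["ho"] + theta["ob"])      # (5) output gate
    h_t = o_t * np.tanh(c_t)                                                   # (6) hidden output
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hid = 4, 3  # e.g., 4 lagged wind speed inputs, 3 hidden units (illustrative)
theta = {k: rng.standard_normal((n_in, n_hid)) for k in ("xi", "xf", "xc", "xo")}
theta.update({k: rng.standard_normal((n_hid, n_hid)) for k in ("hi", "hf", "hc", "ho")})
theta.update({k: np.zeros(n_hid) for k in ("ib", "fb", "cb", "ob")})

h, c = lstm_step(rng.standard_normal(n_in), np.zeros(n_hid), np.zeros(n_hid), theta)
print(h.shape)  # (3,)
```

Because h_t is an output gate times a tanh, every hidden component stays strictly inside (−1, 1), which is the bounded behavior that mitigates the VGP discussed above.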


Table 1: Main characteristics of the existing wind speed forecasting schemes.

| Aim | Technique | Merits/outcomes | Demerits | Dataset |
| --- | --- | --- | --- | --- |
| Hybrid wind speed prediction [13] | Empirical wavelet transformation (EWT), long short-term memory neural network, and a deep learning algorithm | The proposed model has satisfactory multistep forecasting results | The performance of the EWT for wind speed multistep forecasting has not been studied | Four sets of original wind speed series including 700 samples |
| Wind speed forecasting [14] | Unscented Kalman filter (UKF) integrated with a support vector regression (SVR) model | Better performance in both one-step-ahead and multistep-ahead predictions than ANN, SVR, autoregressive, and autoregressive-integrated-with-Kalman-filter models | Needs to develop predictive model-based control and optimization strategies for wind farm operation | Center for Energy Efficiency and Renewable Energy at the University of Massachusetts |
| Wind speed forecasting [15] | Long short-term memory neural networks, support vector regression machine, and extremal optimization algorithm | Better forecasting performance than ARIMA, SVR, ANN, KNN, and GBRT models | Needs to consider more interrelated features such as weather conditions, human factors, and power system status | A wind farm in Inner Mongolia, China |
| Hybrid short-term wind speed forecasting [16] | Wavelet transform (WT), genetic algorithm (GA), and support vector machines (SVMs) | More efficient than a persistent model and an SVM-GA model without WT | Needs to augment external information such as air pressure, precipitation, and air humidity besides temperature | Wind speed data every 0.5 h in a wind farm of North China, September 2012 |
| Short-term wind speed prediction [18] | Support vector regression (SVR) and artificial neural network (ANN) with backpropagation | The proposed SVR and ANN models predict wind speed with more than 99% accuracy | Computationally expensive | Historical dataset (2008–2014) of wind speed of the Chittagong coastal area from the Bangladesh Meteorological Division (BMD) |
| Hybrid wind speed forecasting [19] | Variational mode decomposition (VMD), partial autocorrelation function (PACF), and weighted regularized extreme learning machine (WRELM) | (i) VMD reduces the influences of randomness and volatility of wind speed; (ii) PACF reduces the feature dimension and complexity of the model; (iii) ELM improves the prediction accuracy | The forecasting accuracy of two-step-ahead and three-step-ahead predictions declined to different degrees | USA National Renewable Energy Laboratory (NREL), 2004 |
| Short-term wind speed forecasting [20] | Wavelet analysis and AdaBoosting neural network | (i) Benefits the analysis of the wind speed's randomness and the optimal neural network structure; (ii) can be used to promote the model's configuration and show confidence in high-accuracy forecasting | Needs to consider a dynamical model with error correction and adaptive adjustment capability | USA National Renewable Energy Laboratory (NREL), 2004 |
| Short-term wind speed forecasting [21] | Support vector machine (SVM) with particle swarm optimization (PSO) | Best forecasting accuracy compared to classical SVM and backpropagation neural network models | Needs to consider additional information for efficient forecasting, such as season and weather variables | Wind farm data in China, 2011 |
| Wind speed predictions [22] | Recurrent neural network (RNN) with long short-term memory (LSTM) | The model provides 92.7% accuracy for training data and 91.6% for new data | High-rate epochs increased the process time and eventually provided low accuracy | Nganjuk Meteorology and Geophysics Agency (BMKG), East Java (2008–2017) |
| Forecasting multistep-ahead wind speed [23] | NARNET model to forecast hourly wind speed using an artificial neural network (ANN) | The model is cost effective and can work with minimum availability of statistical data | (i) Faulty input measurements are likely to affect the model parameters; (ii) removing rapid changes using a low-pass filter might neglect important information | Meteorological data from the National Oceanic and Atmospheric Administration (NOAA), Dodge City, Kansas (January 2010–December 2010) |
| Short-term wind speed prediction [24] | Backpropagation (BP) neural network based on improved artificial bee colony algorithm (ABC-BP) | High precision and fast convergence rate compared with traditional and genetic BP neural networks | Sensitive to noisy data; therefore, data should be filtered, which may affect the nature of the data | Wind farm in Tianjin, China (December 2013–January 2014) |
| Short-term wind speed forecasting [25], 2019 | Fuzzy C-means clustering (FCM) and improved mind evolutionary algorithm-BP (IMEA-BP) | Suitable for one-step forecasting and enhances the accuracy of multistep forecasting | The accuracy of multistep forecasting needs to be further improved | Wind farm in China |
| Predicting wind speed [26] | Artificial neural network and decision tree algorithms | The platform has mass storage of meteorological data and efficient query and analysis of weather forecasting | Needs improvement in order to forecast more realistic weather parameters | Meteorological data provided by the Dalian Meteorological Bureau (2011–2015) |
| Our scheme | Multi-lags-one-step (MLOS) ahead forecasting technique with artificial learning-based algorithms | The provided results suggest that the ConvLSTM model has the best performance as compared to ANN, CNN, LSTM, and SVM models | Increasing the number of hidden layers may increase the computational time exponentially | National Wind Institute, West Texas Mesonet (2012–2015) |

Table 2: Acronyms and notations used.

Acronyms:

| Item | Description |
| --- | --- |
| ANN | Artificial neural network |
| CNN | Convolutional neural network |
| LSTM | Long short-term memory |
| ConvLSTM | Convolutional LSTM hybrid model |
| SVM | Support vector machine |
| RE | Renewable energy |
| RNN | Recurrent neural network |
| EWT | Empirical wavelet transformation |
| ENN | Elman neural network |
| FC-LSTM | Fully connected long short-term memory |
| FCN-LSTM | Long short-term memory fully convolutional network |
| TSP | Time-series prediction |
| UKF | Unscented Kalman filter |
| SVR | Support vector regression |
| SVRM | Support vector regression machine |
| EO | Extremal optimization |
| MAE | Mean absolute error |
| RMSE | Root mean square error |
| MAPE | Mean absolute percentage error |
| R² | R-squared |
| WT | Wavelet transform |
| GA | Genetic algorithm |
| GAWNN | Genetic algorithm of wavelet neural network |
| WNN | Wavelet neural network |
| MLOS | Multi-lags-one-step |
| VGP | Vanishing gradient problem |
| LM | Levenberg–Marquardt |
| RBF | Radial basis function |


2.2. CNN Algorithms. CNN is a feed-forward neural network. To achieve network architecture optimization and solve for the unknown parameters in the network, the attributes of a two-dimensional image are extracted and backpropagation algorithms are implemented. To achieve the final outcome, the sampled data are fed into the network to extract the needed attributes with prerefining. Next, classification or regression is applied [27].

The CNN is composed of basically two types of layers: the convolutional and the pooling layers. The neurons are locally connected between the convolution layer and the preceding layer; meanwhile, the neurons' local attributes are obtained.

Table 2: Continued.

Notations:

| Symbol | Description |
| --- | --- |
| f_t | Forget gate |
| c_t | The cell state |
| i_t | Input gate |
| x_t | Current input data |
| h_{t−1} | The previous hidden output |
| c̃_t | Input to the cell c |
| c | Memory cell |
| c_{t−1} | Past cell state |
| o_t | Output gate |
| h_t | Hidden state |
| · | Matrix multiplication |
| ⊙ | Elementwise multiplication |
| θ | Weight |
| σ | Nonlinear function |
| z_j | The jth hidden neuron |
| p | Number of inputs to the network |
| m | Number of hidden neurons |
| w_{ij} | Connection weight from the ith input node to the jth hidden node |
| y_{k−i} | Previous wind speed, i steps behind |
| f_h(·) | The activation function in the hidden layer |
| w_j | Connection weight from the jth hidden node to the output node |
| ŷ_k | The predicted wind speed at the kth sampling moment |
| f_o | The activation function for the output layer |
| y_k | Actual wind speed |
| x_i | Input vector |
| y_i | Output vector |
| R^m | The m-dimensional real input space |
| f(x) | A function that describes the correlation between inputs and outputs |
| φ(x) | Preknown mapping function |
| R[f] | Structure risk |
| w | The regression coefficient vector |
| b | Bias term |
| C | Punishment coefficient |
| L(x_i, y_i, f(x_i)) | The ε-insensitive loss function |
| ε | Threshold |
| ζ_i, ζ*_i | Slack variables that make the constraints feasible |
| a_i, a*_i | The Lagrange multipliers |
| K(x_i, x_j) | The kernel function |
| W | The weight matrix |
| ∗ | Convolution operation |
| b_i, b_f, b_c | Bias vectors |
| ∘ | Hadamard product |
| H_t | Hidden state |
| X_t | Current wind speed measure |
| X_{t−1} | Previous wind speed measure |
| X_{t+1} | Future wind speed measure |


The local sensitivity is found through the pooling layer to obtain the attributes repeatedly. The existence of the convolution and the pooling layers minimizes the attribute resolution and the number of network specifications which require enhancement.

CNN typically describes data and constructs them as a two-dimensional array, and it is extensively utilized in the area of image processing. In this paper, the CNN algorithm is configured to predict the wind speed and is fitted to process a one-dimensional array of data. In the preprocessing phase, the one-dimensional data are reconstructed into a two-dimensional array. This enables the CNN algorithm to smoothly deal with the data, and it creates two files: the property and the response files. These files are delivered as inputs to the CNN. The response file also contains the data of the expected output value.
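A minimal sketch of this preprocessing step (the paper does not specify the exact file layout, so the array shapes here are illustrative): each lag window becomes one row of the "property" array, and the observation that follows it becomes the corresponding "response" entry:

```python
import numpy as np

def to_property_response(series, n_lags):
    """Reshape a 1-D series into a 2-D property array (one lag window
    per row) and a 1-D response array (the value to be predicted)."""
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

speed = np.array([3.1, 3.4, 3.8, 3.5, 3.9, 4.2, 4.0])  # toy wind speeds (mph)
X, y = to_property_response(speed, n_lags=3)
print(X.shape, y.shape)  # (4, 3) (4,)
# X[0] = [3.1, 3.4, 3.8] is paired with response y[0] = 3.5
```

Each row of X is one "sample line" in the sense used below; a real run would use the full three-year series and the lag count chosen by the MLOS search.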

Each sample is represented by a line from the property and the response files. Weights and biases can be obtained as soon as an acceptable number of samples to train the CNN is delivered. The training continues by comparing the regression results with the response values in order to reach the minimum possible error. This delivers the final trained CNN model, which is utilized to achieve the needed predictions.

The fitting mechanism of CNN is pooling. Various computational approaches have shown that two kinds of pooling can be used: average pooling and maximum pooling. Images are stationary, and all parts of an image share similar attributes; therefore, the pooling approach applies the same average or maximum calculation to every part of a high-resolution image. The pooling process leads to a reduction in the statistics dimensions and an increase in the generalization strength of the model. The results are well optimized and have a lower possibility of overfitting.
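The two pooling variants can be illustrated on a 1-D signal (a sketch; the non-overlapping window of size 2 is an arbitrary choice):

```python
import numpy as np

def pool1d(x, window, mode="max"):
    """Non-overlapping 1-D pooling: reduces resolution by a factor of `window`."""
    trimmed = x[: len(x) // window * window].reshape(-1, window)
    return trimmed.max(axis=1) if mode == "max" else trimmed.mean(axis=1)

x = np.array([1.0, 3.0, 2.0, 8.0, 4.0, 6.0])
print(pool1d(x, 2, "max"))  # [3. 8. 6.]
print(pool1d(x, 2, "avg"))  # [2. 5. 5.]
```

Either variant halves the attribute resolution here, which is exactly the dimensionality reduction described above.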

2.3. ANN Algorithms. ANN has three layers which build up the network: the input, hidden, and output layers. These layers have the ability to correlate an input vector to an output scalar or vector using activation functions in various neurons. The jth hidden neuron z_j can be computed from the p inputs and m hidden neurons using the following equation [14]:

z_j = f_h( Σ_{i=1}^{p} w_{ij} y_{k−i} ),    (7)

where w_{ij} is the connection weight from the ith input node to the jth hidden node, y_{k−i} is the previous wind speed i steps behind, and f_h(·) is the activation function in the hidden layer. Therefore, the future wind speed can be predicted through

ŷ_k = f_o( Σ_{j=1}^{m} w_j z_j ),    (8)

where w_j is the connection weight from the jth hidden node to the output node and ŷ_k is the predicted wind speed at the kth sampling moment, while f_o is the activation function for the output layer. By minimizing the error between the actual and the predicted wind speeds, y_k and ŷ_k, respectively, using the Levenberg–Marquardt (LM) algorithm, the nonlinear mapping efficiency of the ANN can be obtained [28].
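Equations (7) and (8) amount to a single forward pass. A NumPy sketch follows, with random untrained weights; the tanh hidden activation and linear output are assumed choices (the paper does not specify them), and LM training is not shown:

```python
import numpy as np

def ann_forecast(y_lags, w_ih, w_ho):
    """Predict the next wind speed from p lagged values via equations (7)-(8).
    Hidden activation f_h is tanh; output activation f_o is linear."""
    z = np.tanh(y_lags @ w_ih)  # (7): z_j = f_h(sum_i w_ij * y_{k-i})
    return z @ w_ho             # (8): y_hat_k = f_o(sum_j w_j * z_j)

rng = np.random.default_rng(42)
p, m = 3, 5                          # 3 lags, 5 hidden neurons (illustrative)
w_ih = rng.standard_normal((p, m))   # input-to-hidden weights w_ij
w_ho = rng.standard_normal(m)        # hidden-to-output weights w_j
y_hat = ann_forecast(np.array([3.5, 3.9, 4.2]), w_ih, w_ho)
print(float(y_hat))
```

In practice, LM iteratively updates w_ih and w_ho to minimize the squared error between y_k and ŷ_k over the training samples.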

2.4. SVM Algorithms. Assume a set of samples {x_i, y_i}, where i = 1, 2, …, N, with input vector x_i ∈ R^m and output y_i. The regression problem aims to identify a function f(x) that describes the correlation between inputs and outputs. The interest of SVR is to obtain a linear regression in the high-dimensional feature space delivered by mapping the primary input set utilizing a preknown function φ(x), and to minimize the structure risk R[f]. This mechanism can be written as follows [15]:

f(x) = w^T φ(x) + b,

R[f] = (1/2)‖w‖² + C Σ_{i=1}^{N} L(x_i, y_i, f(x_i)),    (9)

where w, b, and C, respectively, are the regression coefficient vector, the bias term, and the punishment coefficient, and L(x_i, y_i, f(x_i)) is the ε-insensitive loss function. The regression can be handled by the following constrained optimization problem:

min (1/2)‖w‖² + C Σ_{i=1}^{N} (ζ_i + ζ*_i)

s.t. y_i − (w^T φ(x_i) + b) ≤ ε + ζ_i,
     (w^T φ(x_i) + b) − y_i ≤ ε + ζ*_i,
     ζ_i, ζ*_i ≥ 0, i = 1, 2, …, N,    (10)

where ζ_i and ζ*_i represent the slack variables that make the constraints feasible. By using the Lagrange multipliers, the regression function can be written as follows:

f(x) = Σ_{i=1}^{N} (a_i − a*_i) K(x_i, x_j) + b,    (11)

where a_i and a*_i are the Lagrange multipliers that fulfill the conditions a_i ≥ 0, a*_i ≥ 0, and Σ_{i=1}^{N} (a_i − a*_i) = 0, and K(x_i, x_j) is a general kernel function. In this study, the well-known radial basis function (RBF) is chosen as the kernel function:


K(x_i, x_j) = exp( −‖x_i − x_j‖² / (2σ²) ),    (12)

where σ defines the RBF kernel width [15].

2.5. ConvLSTM Algorithm. ConvLSTM is designed to be trained on the spatial information in the dataset, and its aim is to deal with 3-dimensional data as input. Furthermore, it exchanges matrix multiplication for a convolution operation at every LSTM cell's gate. By doing so, it has the ability to capture the underlying spatial features in multidimensional data. The formulas used at each of the gates (input, forget, and output) are as follows:

i_t = σ(W_{xi} ∗ x_t + W_{hi} ∗ h_{t−1} + b_i),

f_t = σ(W_{xf} ∗ x_t + W_{hf} ∗ h_{t−1} + b_f),

o_t = σ(W_{xo} ∗ x_t + W_{ho} ∗ h_{t−1} + b_o),

C_t = f_t ∘ C_{t−1} + i_t ∘ tanh(W_{xc} ∗ x_t + W_{hc} ∗ h_{t−1} + b_c),

H_t = o_t ∘ tanh(C_t),    (13)

where i_t, f_t, and o_t are the input, forget, and output gates and W is the weight matrix, while x_t is the current input data, h_{t−1} is the previous hidden output, and C_t is the cell state.

The difference from the LSTM equations is that the matrix multiplication (·) is substituted by the convolution operation (∗) between W and each of x_t and h_{t−1} at every gate. By doing so, the fully connected layer is replaced by a convolutional layer; thus, the number of weight parameters in the model can be significantly reduced.

3. Methodology

Due to the nonlinear, nonstationary attributes and the stochastic variations in the wind speed time series, accurate prediction of wind speed is known to be a challenging effort [29]. In this work, to improve the accuracy of the wind speed forecasting model, a comparison between five models is conducted to forecast wind speed considering available historical data. A concept called multi-lags-one-step (MLOS) ahead forecasting is employed to illustrate its effect on the five models' accuracies. Assume that we are at time index X_t. To forecast one output element in the future, X_{t+1}, the input dataset can be split into many lags (past data) X_{t−I}, where I ∈ {1, …, 10}. By doing so, the model can be trained on more elements before predicting a single event in the future. In addition, the model accuracy showed an improvement until it reached the optimum lagging point, which had the best accuracy; beyond this point, the model accuracy degraded, as will be illustrated in the Results section.

Figure 1 illustrates the workflow of the forecasting model. Specifically, the proposed methodology entails four steps.

In Step 1, data are collected and averaged from 5 minutes to 30 minutes and to 1 hour, respectively. The datasets are then standardized to yield a mean value of 0 and a standard deviation of 1. The lagging stage is very important: in Step 2, the data are split into different lags to study the effect of training the models on more than one element (input) to predict a single event in the future. In Step 3, the models are applied, taking into consideration that some models, such as CNN, LSTM, and ConvLSTM, need to be adjusted from a matrix-shape perspective; these models normally work with 2D data or more, so manipulation and reshaping of the matrix are conducted at this stage. For the sake of checking and evaluating the proposed models, in Step 4, three main metrics are used to validate the case study (MAE, RMSE, and R²). In addition, the execution time and the optimum lag are taken into account to select the best model.

Algorithm 1 illustrates the training procedure for ConvLSTM.

4. Collected Data

Table 3 illustrates the characteristics of the collected data in a 5-minute time span. The data are collected from a real wind speed dataset over a three-year period from the West Texas Mesonet, with a 5-minute observation period, near Lake Alan Garza [30]. The data are processed through averaging from 5 minutes to 30 minutes (whose statistical characteristics are given in Table 4) and once more to 1 hour (whose statistical characteristics are given in Table 5). The goal of averaging is to study the effect of reducing the data size in order to compare the five models and then select the one that can achieve the highest accuracy for the three dataset cases. As shown in the three tables, the datasets are almost identical and retain their seasonality; they are not affected by the averaging process.

The data have been split into three sets (training, validation, and test) with fractions of 53%, 14%, and 33%, respectively.

5. Results and Discussion

To quantitatively evaluate the performance of the predictive models, commonly used statistical measures are tested [20]. All of them measure the deviation between the actual and predicted wind speed values. Specifically, RMSE, MAE, and R² are defined as follows:

RMSE = √( (1/N) Σ_{i=1}^{N} (y_i − ŷ_i)² ),

MAE = (1/N) Σ_{i=1}^{N} |y_i − ŷ_i|,

R² = 1 − [ Σ_{i=1}^{N} (y_i − ŷ_i)² / Σ_{i=1}^{N} (y_i − ȳ)² ],    (14)
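As a sanity check, the three measures can be computed with a few lines of NumPy (a sketch on toy values, not the paper's data):

```python
import numpy as np

def rmse(y, y_hat):
    return np.sqrt(np.mean((y - y_hat) ** 2))

def mae(y, y_hat):
    return np.mean(np.abs(y - y_hat))

def r2(y, y_hat):
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

y = np.array([3.5, 3.9, 4.2, 4.0])      # actual wind speeds (mph)
y_hat = np.array([3.6, 3.8, 4.1, 4.1])  # toy predictions, each off by 0.1
print(round(rmse(y, y_hat), 6))  # 0.1
print(round(mae(y, y_hat), 6))   # 0.1
print(round(r2(y, y_hat), 4))    # 0.8462
```

With every prediction off by the same 0.1, RMSE and MAE coincide; they diverge once the errors vary, since RMSE penalizes large errors more heavily.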

where y_i and ŷ_i are the actual and the predicted wind speed, respectively, while ȳ is the mean value of the actual wind speed sequence. Typically, smaller amplitudes of the first two measures indicate an improved forecasting procedure, while R² is the goodness-of-fit measure for the model; therefore, the larger its value is, the better the model fits. The testbed environment configuration is as follows.


Figure 1: The proposed forecasting methodology. (Workflow: collect raw time-series data; average and preprocess at 5 min, 30 min, and 1 hr; lag the data (L1, L2, …, L10); model with ANN, CNN, LSTM, ConvLSTM, and SVR; evaluate the forecasting results against the KPIs MSE, RMSE, MAE, R², delay, and execution time to select the best-fitted model.)

Input: the wind speed time series data.
Output: forecasting performance indices.

(1) The wind speed time series data are measured every 5 minutes and averaged twice, to 30 minutes and 1 hour, respectively.
(2) The wind datasets are split into training, validation, and test sets.
(3) Initiate the multi-lags-one-step (MLOS) arrays for the training, validation, and test sets.
(4) Define the MLOS range as 1, …, 10 to optimize the number of needed lags.
(5) loop 1:
        Split the first set based on the MLOS range.
        Initiate and extract set features with a CNN layer.
        Pass the output to a defined LSTM layer.
        Select the first value in the range of hidden-neuron numbers.
        Generate the prediction results (performance indices).
        Count the time needed to execute and produce the prediction results.
        Save and compare the results with previous ones.
    loop 2:
        Select the next MLOS value.
        If the MLOS value equals the maximum of the range, then go to loop 3 and reinitialize the MLOS range.
        Go to loop 1.
    loop 3:
        Select a new number of hidden neurons.
        If the number of hidden neurons equals the maximum of the range, then go to loop 4 and reinitialize the hidden-neuron range.
        Go to loop 1.
(6) loop 4:
        Select a new dataset from the sets 5 min, 30 min, and 1 hr.
        Go to loop 1.
        If the dataset is the last set, then generate the performance indices of all tested sets and select the best result metrics.

Algorithm 1: ConvLSTM training.


(1) CPU: Intel(R) Core(TM) i7-8550U @ 1.80 GHz (2001 MHz), 4 cores, 8 logical processors.
(2) RAM: 16.0 GB installed physical memory.
(3) GPU: AMD Radeon(TM) RX 550, 10 GB.
(4) Framework: Anaconda 2019.07, Python 3.7.

Table 6 illustrates the chosen optimized internal parameters (hyperparameters) for the forecasting methods used in this work. For each method, the optimal number of hidden neurons is chosen to achieve the maximum R² and the minimum RMSE and MAE values.

After the implementation of CNN, ANN, LSTM, ConvLSTM, and SVM, the most fitted model was chosen depending on its accuracy in predicting future wind speed values; thus, the seasonality is considered in the forecast mechanism. The chosen model has to deliver the most fitted data with the least amount of error, taking into consideration the nature of the data and not applying naive forecasting to it.

To achieve this goal, the statistical error indicators are calculated for every model and time lapse and are fully represented as Figure 2 illustrates. The provided results suggest that the ConvLSTM model has the best performance as compared to the other four models. The chosen model has to reach the minimum values of RMSE and MAE and the maximum R² value.

Different parameters are also tested to ensure the right decision in choosing the best fitted model. The optimum number of lags, presented in Table 7, is one of the most important indicators in selecting the best fitted model, since the fewer historical points the model needs, the smaller the computational effort will be. For each method, the optimal number of lags is chosen to achieve the maximum R² and the minimum RMSE and MAE values. For instance, Figures 3 and 4 show the relation between the statistical measures and the number of lags and hidden neurons, respectively, for the proposed ConvLSTM method in the 5-minute time span case. It can be seen that 4 lags and 15 hidden neurons achieved the maximum R² and the minimum RMSE and MAE values.

The execution time shown in Table 8 is calculated for each method and time lapse to assure that the final chosen model is efficient and can effectively predict future wind speed. The shorter the execution time, the more efficient and helpful the model will be; this is also a sign that the model is suitable for further modifications. According to Table 8, the ConvLSTM model beats all other models in the time needed to process the historical data and deliver a final prediction. SVM needed about 54 minutes to accomplish the training and produce testing results, while ConvLSTM made it in just 1.7 minutes. This huge difference between them motivated the choice of ConvLSTM.

Table 3: Dataset characteristics for the 5 min sample.

| Dataset | Max | Median | Min | Mean | Std |
| --- | --- | --- | --- | --- | --- |
| All datasets | 18.73 | 3.53 | 0.01 | 3.91 | 2.10 |
| Training dataset | 18.73 | 3.47 | 0.01 | 3.83 | 2.05 |
| Test dataset | 14.87 | 3.67 | 0.01 | 4.05 | 2.20 |

Table 4: Dataset characteristics for the 30 min sample.

| Dataset | Max | Median | Min | Mean | Std |
| --- | --- | --- | --- | --- | --- |
| All datasets | 17.66 | 3.53 | 0.01 | 3.91 | 2.08 |
| Training dataset | 17.66 | 3.47 | 0.01 | 3.837309 | 2.02 |
| Test dataset | 14.32 | 3.67 | 0.02 | 4.05 | 2.18 |

Table 5: Dataset characteristics for the 1 hour sample.

| Dataset | Max | Median | Min | Mean | Std |
| --- | --- | --- | --- | --- | --- |
| All datasets | 17.61 | 3.53 | 0.07 | 3.91 | 2.05 |
| Training dataset | 17.61 | 3.46 | 0.07 | 3.83 | 2.00 |
| Test dataset | 14.22 | 3.66 | 0.07 | 4.05 | 2.15 |

Table 6: Optimized internal parameters for the forecasting methods.

| Method | 5 min | 30 min | 1 hour |
| --- | --- | --- | --- |
| LSTM | 17 hidden neurons | 20 hidden neurons | 8 hidden neurons |
| CNN | 15 hidden neurons | 5 hidden neurons | 15 hidden neurons |
| ConvLSTM | 15 hidden neurons | 8 hidden neurons | 20 hidden neurons |
| ANN | 15 hidden neurons | 15 hidden neurons | 20 hidden neurons |
| SVR | C = 7, ε = 0.1, γ = 0.2 | C = 1, ε = 0.25, γ = 0.15 | C = 1, ε = 0.1, γ = 0.05 |


Figure 5 shows that the 5-minute lapse dataset is the most fitted dataset for our chosen model; it illustrates how accurate the prediction of future wind speed will be.

For completeness, to effectively evaluate the investigated forecasting techniques in terms of their prediction accuracies, a 50-trial cross-validation procedure is carried out, in which the investigated techniques are built and then evaluated on 50 different training and test datasets, respectively, randomly sampled from the available overall dataset. The ultimate performance metrics are then reported as the average and the standard deviation of the 50 metrics obtained across the cross-validation trials. In this regard, Figure 6 shows the average performance metrics on the test dataset using the 50-trial cross-validation procedure. It can be easily recognized that the forecasting models that employ the LSTM technique outperform the other investigated techniques in terms of the three performance metrics R², RMSE, and MAE.

From the experimental results of short-term wind speed forecasting shown in Figure 6, we can observe that ConvLSTM performs the best in terms of the forecasting metrics (R², RMSE, and MAE) as compared to the other models (i.e., CNN, ANN, SVR, and LSTM). The related statistical tests in Tables 6 and 7, respectively, have proved the effectiveness of ConvLSTM and its capability of handling noisy, large data. ConvLSTM showed that it can produce highly accurate wind speed predictions with fewer lags and hidden neurons. This was indeed reflected in the results shown in Table 8, with less computation time as compared to the other tested models. Furthermore, we introduced multi-lags-one-step (MLOS) ahead forecasting combined with the hybrid ConvLSTM model to provide an efficient generalization to new time series data and to predict wind speed accurately. The results showed that the ConvLSTM proposed in this paper is an effective and promising model for wind speed forecasting.

Figure 2: Models' key performance indicators (KPIs).

(a) R²

| Time span | CNN | ANN | LSTM | SVM | ConvLSTM |
| --- | --- | --- | --- | --- | --- |
| 5 min | 0.9873 | 0.9873 | 0.9876 | 0.9815 | 0.9876 |
| 30 min | 0.9091 | 0.9089 | 0.9097 | 0.9078 | 0.9098 |
| 1 hour | 0.8309 | 0.8302 | 0.8315 | 0.8211 | 0.8316 |

(b) RMSE

| Time span | CNN | ANN | LSTM | SVM | ConvLSTM |
| --- | --- | --- | --- | --- | --- |
| 5 min | 0.2482 | 0.2483 | 0.2424 | 0.2462 | 0.2419 |
| 30 min | 0.6578 | 0.6588 | 0.6467 | 0.6627 | 0.6463 |
| 1 hour | 0.8862 | 0.8881 | 0.8718 | 0.9116 | 0.8717 |

(c) MAE

| Time span | CNN | ANN | LSTM | SVM | ConvLSTM |
| --- | --- | --- | --- | --- | --- |
| 5 min | 0.1806 | 0.1798 | 0.1758 | 0.1855 | 0.1755 |
| 30 min | 0.4802 | 0.4799 | 0.4721 | 0.4818 | 0.4741 |
| 1 hour | 0.6468 | 0.6473 | 0.6392 | 0.6591 | 0.6369 |

Table 7: Optimum number of lags.

| Method | 5 min | 30 min | 1 hour |
| --- | --- | --- | --- |
| CNN | 9 | 4 | 4 |
| ANN | 3 | 3 | 3 |
| LSTM | 7 | 6 | 10 |
| ConvLSTM | 4 | 5 | 8 |
| SVR | 9 | 4 | 5 |


Figure 3: ConvLSTM measured statistical values versus the number of hidden neurons (5-minute case): (a) R², (b) RMSE, (c) MAE.

Figure 4: ConvLSTM measured statistical values versus the number of lags (5-minute case): (a) R², (b) RMSE, (c) MAE.


Similar to our work, the EnsemLSTM model proposed by Chen et al. [15] contained different clusters of LSTMs with different hidden layers and hidden neurons. They combined the LSTM clusters with SVR and an external optimizer in order to enhance the generalization capability and robustness of their model. However, their model showed a high computational complexity with mediocre performance indices. Our proposed ConvLSTM with MLOS assured boosting the generalization and robustness for new time series data, as well as producing high performance indices.

6. Conclusions

In this study, we proposed a hybrid deep learning-based framework, ConvLSTM, for short-term prediction of the wind

Table 8: Execution time.

| Method | 5 min time (min) | 30 min time (min) | 1 hour time (min) |
| --- | --- | --- | --- |
| ConvLSTM | 1.7338 | 0.3849 | 0.1451 |
| SVR | 54.1424 | 0.8214 | 0.2250 |
| CNN | 0.87828 | 0.1322 | 0.0708 |
| ANN | 0.7431 | 0.2591 | 0.0587 |
| LSTM | 1.6570 | 0.3290 | 0.1473 |

[True versus predicted wind speed (mph) for ConvLSTM over 200 samples at the 5 min time span.]

Figure 5: ConvLSTM true/predicted wind speed and number of samples.

[Panels (a)-(c): average R2, RMSE, and MAE of ConvLSTM, LSTM, ANN, CNN, and SVR on the test dataset.]

Figure 6: Average performance metrics obtained on the test dataset using the 50 cross-validation procedure.


speed time series measurements. The proposed dynamic prediction model was optimized for the number of input lags and the number of internal hidden neurons. Multi-lags-one-step (MLOS) ahead wind speed forecasting using the proposed approach showed superior results compared to four other models built using standard ANN, CNN, LSTM, and SVM approaches. The proposed modeling framework combines the benefits of CNN and LSTM networks in a hybrid modeling scheme that delivers highly accurate wind speed predictions with fewer lags and hidden neurons, as well as lower computational complexity. For future work, further investigation can be done to improve the accuracy of the ConvLSTM model, for instance, by increasing and optimizing the number of hidden layers, applying multi-lags-multi-steps (MLMS) ahead forecasting, and introducing a reinforcement learning agent to optimize the parameters as compared to other optimization methods.

Data Availability

The wind speed data used in this study have been taken from the West Texas Mesonet of the US National Wind Institute (http://www.depts.ttu.edu/nwi/research/facilities/wtm/index.php). Data are provided freely for academic research purposes only and cannot be shared/distributed beyond academic research use without permission from the West Texas Mesonet.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors acknowledge the help of Prof. Brian Hirth at Texas Tech University for providing them with access to weather data.

References

[1] R. Rothermel, "How to predict the spread and intensity of forest and range fires," General Technical Report INT-143, Intermountain Forest and Range Experiment Station, Ogden, UT, USA, 1983.

[2] A. Tascikaraoglu and M. Uzunoglu, "A review of combined approaches for prediction of short-term wind speed and power," Renewable and Sustainable Energy Reviews, vol. 34, pp. 243-254, 2014.

[3] R. J. Barthelmie, F. Murray, and S. C. Pryor, "The economic benefit of short-term forecasting for wind energy in the UK electricity market," Energy Policy, vol. 36, no. 5, pp. 1687-1696, 2008.

[4] Y.-K. Wu and H. Jing-Shan, A Literature Review of Wind Forecasting Technology in the World, pp. 504-509, IEEE Lausanne Powertech, Lausanne, Switzerland, 2007.

[5] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," The Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115-133, 1943.

[6] K. Fukushima, "Neocognitron," Scholarpedia, vol. 2, no. 1, p. 1717, 2007.

[7] K. Fukushima, "Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics, vol. 36, no. 4, pp. 193-202, 1980.

[8] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.

[9] H. T. Siegelmann and E. D. Sontag, "On the computational power of neural nets," Journal of Computer and System Sciences, vol. 50, no. 1, pp. 132-150, 1995.

[10] R. Sunil, "Understanding support vector machine algorithm," https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code, 2017.

[11] X. Shi, Z. Chen, H. Wang, D. Y. Yeung, W. K. Wong, and W. C. Woo, "Convolutional LSTM network: a machine learning approach for precipitation nowcasting," in Proceedings of the Neural Information Processing Systems, pp. 7-12, Montreal, Canada, December 2015.

[12] C. Finn, I. Goodfellow, and S. Levine, "Unsupervised learning for physical interaction through video prediction," in Proceedings of the Annual Conference on Neural Information Processing Systems, pp. 64-72, Barcelona, Spain, December 2016.

[13] H. Xu, Y. Gao, F. Yu, and T. Darrell, "End-to-end learning of driving models from large-scale video datasets," in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3530-3538, Honolulu, HI, USA, July 2017.

[14] K. Chen and J. Yu, "Short-term wind speed prediction using an unscented Kalman filter based state-space support vector regression approach," Applied Energy, vol. 113, pp. 690-705, 2014.

[15] J. Chen, G.-Q. Zeng, W. Zhou, W. Du, and K.-D. Lu, "Wind speed forecasting using nonlinear-learning ensemble of deep learning time series prediction and extremal optimization," Energy Conversion and Management, vol. 165, pp. 681-695, 2018.

[16] D. Liu, D. Niu, H. Wang, and L. Fan, "Short-term wind speed forecasting using wavelet transform and support vector machines optimized by genetic algorithm," Renewable Energy, vol. 62, pp. 592-597, 2014.

[17] Y. Wang, "Short-term wind power forecasting by genetic algorithm of wavelet neural network," in Proceedings of the 2014 International Conference on Information Science, Electronics and Electrical Engineering, pp. 1752-1755, Sapporo, Japan, April 2014.

[18] N. Sheikh, D. B. Talukdar, R. I. Rasel, and N. Sultana, "A short term wind speed forecasting using SVR and BP-ANN: a comparative analysis," in Proceedings of the 2017 20th International Conference of Computer and Information Technology (ICCIT), pp. 1-6, Dhaka, Bangladesh, December 2017.

[19] H. Nantian, C. Yuan, G. Cai, and E. Xing, "Hybrid short term wind speed forecasting using variational mode decomposition and a weighted regularized extreme learning machine," Energies, vol. 9, no. 12, pp. 989-1007, 2016.

[20] S. Haijian and X. Deng, "AdaBoosting neural network for short-term wind speed forecasting based on seasonal characteristics analysis and lag space estimation," Computer Modeling in Engineering & Sciences, vol. 114, no. 3, pp. 277-293, 2018.

[21] W. Xiaodan, "Forecasting short-term wind speed using support vector machine with particle swarm optimization," in Proceedings of the 2017 International Conference on Sensing, Diagnostics, Prognostics and Control (SDPC), pp. 241-245, Shanghai, China, August 2017.

[22] F. R. Ningsih, E. C. Djamal, and A. Najmurrakhman, "Wind speed forecasting using recurrent neural networks and long short term memory," in Proceedings of the 2019 6th International Conference on Instrumentation, Control and Automation (ICA), pp. 137-141, Bandung, Indonesia, July-August 2019.

[23] P. K. Datta, "An artificial neural network approach for short-term wind speed forecast," Master thesis, Kansas State University, Manhattan, KS, USA, 2018.

[24] J. Guanlong, D. Li, L. Yao, and P. Zhao, "An improved artificial bee colony-BP neural network algorithm in the short-term wind speed prediction," in Proceedings of the 2016 12th World Congress on Intelligent Control and Automation (WCICA), pp. 2252-2255, Guilin, China, June 2016.

[25] C. Gonggui, J. Chen, Z. Zhang, and Z. Sun, "Short-term wind speed forecasting based on fuzzy C-means clustering and improved MEA-BP," IAENG International Journal of Computer Science, vol. 46, no. 4, pp. 27-35, 2019.

[26] W. ZhanJie and A. B. M. Mazharul Mujib, "The weather forecast using data mining research based on cloud computing," in Journal of Physics: Conference Series, vol. 910, no. 1, Article ID 012020, 2017.

[27] A. Zhu, X. Li, Z. Mo, and R. Wu, "Wind power prediction based on a convolutional neural network," in Proceedings of the 2017 International Conference on Circuits, Devices and Systems (ICCDS), pp. 131-135, Chengdu, China, September 2017.

[28] O. Linda, T. Vollmer, and M. Manic, "Neural network based intrusion detection system for critical infrastructures," in Proceedings of the 2009 International Joint Conference on Neural Networks, pp. 1827-1834, Atlanta, GA, USA, June 2009.

[29] C. Li, Z. Xiao, X. Xia, W. Zou, and C. Zhang, "A hybrid model based on synchronous optimisation for multi-step short-term wind speed forecasting," Applied Energy, vol. 215, pp. 131-144, 2018.

[30] National Wind Institute, West Texas Mesonet, Online Referencing, http://www.depts.ttu.edu, 2018.


optimized utilizing particle swarm optimization (PSO) to minimize the fitness function in the training process. Ningsih et al. [22] predicted wind speed using recurrent neural networks (RNNs) with long short-term memory (LSTM). Two optimization models, stochastic gradient descent (SGD) and adaptive moment estimation (Adam), were evaluated. The Adam method was shown to be better and quicker than SGD, with a higher level of accuracy and less deviation from the target.

A nonlinear autoregressive neural network (NARNET) model was developed by Datta [23]. The model employed univariate time series data to generate hourly wind speed forecasts. The closed-loop structure provided error feedback to the hidden layer to generate the forecast of the next point. A short-term wind speed forecasting method was proposed by Guanlong et al. [24] using a backpropagation (BP) neural network. The weight and threshold values of the BP network are trained and optimized by an improved artificial bee colony algorithm. Then, the gathered samples of wind speed are trained and optimized. When training is finished, test samples are used to forecast and validate.

Fuzzy C-means (FCM) clustering was used by Gonggui et al. [25] to forecast wind speed. The input data of the BP neural network with similar characteristics are divided into corresponding classes, and different BP neural networks are established for each class. The coefficient of variation is used to illustrate the dispersion of the data, and statistical knowledge is used to illuminate the input data with large dispersion from the original dataset. Artificial neural networks (ANNs) and decision trees (DTs) were used by ZhanJie and Mazharul Mujib [26] to analyze meteorological data for the application of data mining techniques through cloud computing in wind speed prediction. The neurons in the hidden layer are increased gradually, and the network performance, in the form of an error, is examined. Table 1 highlights the main characteristics of the existing schemes developed for wind speed forecasting.

The novelty of this work lies in enhancing the accuracy of wind speed forecasting by using a hybrid model called ConvLSTM and comparing it with four other commonly used models with optimized lags, hidden neurons, and parameters. This includes testing and comparing the performance of these five different models based on historical data, as well as employing the multi-lags-one-step (MLOS) ahead forecasting concept. MLOS provided an efficient generalization to new time series data and thus increased the overall prediction accuracy. The remainder of this paper is organized as follows. Section 2 describes the four learning algorithms, in addition to a hybrid algorithm, investigated for accurate wind speed forecasting. Section 3 illustrates the study methodology. Section 4 shows a real case study of a wind farm. Section 5 introduces the results and discussion. Finally, conclusions and future works are presented in Section 6.

1.1. Acronyms and Notations. Table 2 illustrates the acronyms and notations used throughout the paper.

2. Prediction Algorithms

In this section, the algorithms used for wind speed forecasting are summarized as follows.

2.1. LSTM Algorithms. LSTM is built on a unique architecture that empowers it to forget unnecessary information by turning multiplication into addition and by using a function whose second derivative can persist over a long range before going to zero, in order to reduce the vanishing gradient problem (VGP). It is constructed of a sigmoid layer that takes the inputs x_t and h_{t-1} and then decides, by generating zeros, which part of the old output should be removed. This process is done through the forget gate f_t. The gate output is given as f_t * c_{t-1}. After that, a vector of all the possible values from the new input is created by the tanh layer. These two results are multiplied to renew the old memory c_{t-1}, which gives c_t. In other words, the sigmoid layer decides which portions of the cell state will be the outcome. Then, the outcome of the sigmoid gate is multiplied by all possible values that are set up through tanh. Thus, the output consists of only the parts that are decided to be generated.

LSTM networks [8] are part of recurrent neural networks (RNNs), which are capable of learning long-term dependencies and are powerful for modeling long-range dependencies. The main criterion of the LSTM network is the memory cell, which can memorize the temporal state. It is also shaped by the addition or removal of information through three controlling gates: the input gate, the forget gate, and the output gate. LSTMs are able to renew and control the information flow in the block using these gates, as in the following equations:

i_t = σ(x_t · θ_{xi} + h_{t-1} · θ_{hi} + θ_{i,bias})   (1)

f_t = σ(x_t · θ_{xf} + h_{t-1} · θ_{hf} + θ_{f,bias})   (2)

c̃_t = tanh(x_t · θ_{xc̃} + h_{t-1} · θ_{hc̃} + θ_{c̃,bias})   (3)

c_t = c̃_t ⊙ i_t + c_{t-1} ⊙ f_t   (4)

o_t = σ(x_t · θ_{xo} + h_{t-1} · θ_{ho} + θ_{o,bias})   (5)

h_t = o_t ⊙ tanh(c_t)   (6)

where "·" denotes matrix multiplication, "⊙" is an elementwise multiplication, and "θ" stands for the weights. c̃ is the input to the cell c, which is gated by the input gate, while o_t is the output. The nonlinear functions σ and tanh are applied elementwise, where σ(x) = 1/(1 + e^{-x}). Equations (1) and (2) establish the gate activations, equation (3) indicates the cell inputs, equation (4) determines the new cell states where the "memories" are stored or deleted, and equation (5) results in the output gate activations, which yield the final output in equation (6).
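To make equations (1)-(6) concrete, they can be sketched as a single NumPy time step; the weight shapes and the tiny random initialization below are illustrative only, not the configuration used in the experiments:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, theta):
    """One LSTM step following equations (1)-(6); theta holds the weights
    and biases of the input (i), forget (f), cell (c), and output (o) gates."""
    i_t = sigmoid(x_t @ theta["xi"] + h_prev @ theta["hi"] + theta["bi"])      # (1)
    f_t = sigmoid(x_t @ theta["xf"] + h_prev @ theta["hf"] + theta["bf"])      # (2)
    c_tilde = np.tanh(x_t @ theta["xc"] + h_prev @ theta["hc"] + theta["bc"])  # (3)
    c_t = c_tilde * i_t + c_prev * f_t                                         # (4)
    o_t = sigmoid(x_t @ theta["xo"] + h_prev @ theta["ho"] + theta["bo"])      # (5)
    h_t = o_t * np.tanh(c_t)                                                   # (6)
    return h_t, c_t

# Tiny demo: 1 input feature, 4 hidden units, random illustrative weights.
rng = np.random.default_rng(0)
n_in, n_hid = 1, 4
theta = {}
for g in "ifco":
    theta["x" + g] = rng.normal(size=(n_in, n_hid))
    theta["h" + g] = rng.normal(size=(n_hid, n_hid))
    theta["b" + g] = np.zeros(n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
h, c = lstm_step(np.array([0.5]), h, c, theta)
```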


Table 1: Main characteristics of the existing wind speed forecasting schemes.

Hybrid wind speed prediction [13].
  Technique: empirical wavelet transformation (EWT), long short-term memory neural network, and a deep learning algorithm.
  Merits/outcomes: the proposed model has satisfactory multistep forecasting results.
  Demerits: the performance of the EWT for wind speed multistep forecasting has not been studied.
  Dataset: four sets of original wind speed series including 700 samples.

Wind speed forecasting [14].
  Technique: unscented Kalman filter (UKF) integrated with a support vector regression (SVR) model.
  Merits/outcomes: the proposed method has better performance in both one-step-ahead and multistep-ahead predictions than ANN, SVR, autoregressive, and autoregressive integrated with Kalman filter models.
  Demerits: needs to develop predictive model-based control and optimization strategies for wind farm operation.
  Dataset: Center for Energy Efficiency and Renewable Energy at University of Massachusetts.

Wind speed forecasting [15].
  Technique: long short-term memory neural networks, support vector regression machine, and extremal optimization algorithm.
  Merits/outcomes: the proposed model can achieve a better forecasting performance than ARIMA, SVR, ANN, KNN, and GBRT models.
  Demerits: needs to consider more interrelated features like weather conditions, human factors, and power system status.
  Dataset: a wind farm in Inner Mongolia, China.

A hybrid short-term wind speed forecasting [16].
  Technique: wavelet transform (WT), genetic algorithm (GA), and support vector machines (SVMs).
  Merits/outcomes: the proposed method is more efficient than a persistent model and a SVM-GA model without WT.
  Demerits: needs to augment external information such as the air pressure, precipitation, and air humidity besides the temperature.
  Dataset: the wind speed data every 0.5 h in a wind farm of North China in September 2012.

Short-term wind speed prediction [18].
  Technique: support vector regression (SVR) and artificial neural network (ANN) with backpropagation.
  Merits/outcomes: the proposed SVR and ANN models are able to predict wind speed with more than 99% accuracy.
  Demerits: computationally expensive.
  Dataset: historical dataset (2008-2014) of wind speed of the Chittagong costal area from the Bangladesh Meteorological Division (BMD).

Hybrid wind speed forecasting [19].
  Technique: variational mode decomposition (VMD), the partial autocorrelation function (PACF), and weighted regularized extreme learning machine (WRELM).
  Merits/outcomes: (i) the VMD reduces the influences of randomness and volatility of wind speed; (ii) PACF reduces the feature dimension and complexity of the model; (iii) ELM improves the prediction accuracy.
  Demerits: the forecasting accuracy of two-step-ahead and three-step-ahead predictions declined to different degrees.
  Dataset: USA National Renewable Energy Laboratory (NREL) in 2004.

Short-term wind speed forecasting [20].
  Technique: wavelet analysis and AdaBoosting neural network.
  Merits/outcomes: (i) benefits the analysis of the wind speed's randomness and the optimal neural network's structure; (ii) can be used to promote the model's configuration and show confidence in high-accuracy forecasting.
  Demerits: needs to consider a dynamical model with the ability of error correction and adaptive adjustment.
  Dataset: USA National Renewable Energy Laboratory (NREL) in 2004.

Short-term wind speed forecasting [21].
  Technique: support vector machine (SVM) with particle swarm optimization (PSO).
  Merits/outcomes: the proposed model has the best forecasting accuracy compared to classical SVM and backpropagation neural network models.
  Demerits: needs to consider additional information for efficient forecasting, such as season and weather variables.
  Dataset: wind farm data in China in 2011.

Wind speed predictions [22].
  Technique: recurrent neural network (RNN) with long short-term memory (LSTM).
  Merits/outcomes: the model provides 92.7% accuracy for training data and 91.6% for new data.
  Demerits: high-rate epochs increased the process time and eventually provided low accuracy performance.
  Dataset: Nganjuk Meteorology and Geophysics Agency (BMKG), East Java (2008-2017).

Forecasting multistep-ahead wind speed [23].
  Technique: NARNET model to forecast hourly wind speed using an artificial neural network (ANN).
  Merits/outcomes: the model is cost effective and can work with minimum availability of statistical data.
  Demerits: (i) faulty measurements of inputs are likely to affect the model parameters; (ii) removing rapid changes using a low-pass filter might result in neglecting important information.
  Dataset: meteorological data from the National Oceanic and Atmospheric Administration (NOAA) located in Dodge City, Kansas (January 2010-December 2010).

Short-term wind speed prediction [24].
  Technique: backpropagation (BP) neural network based on an improved artificial bee colony algorithm (ABC-BP).
  Merits/outcomes: the model has high precision and a fast convergence rate compared with traditional and genetic BP neural networks.
  Demerits: sensitive to noisy data; therefore, data should be filtered, which may affect the nature of the data.
  Dataset: wind farm in Tianjin, China (December 2013-January 2014).

Short-term wind speed forecasting [25] (2019).
  Technique: fuzzy C-means clustering (FCM) and improved mind evolutionary algorithm-BP (IMEA-BP).
  Merits/outcomes: the proposed model is suitable for one-step forecasting and enhances the accuracy of multistep forecasting.
  Demerits: the accuracy of multistep forecasting needs to be further improved.
  Dataset: wind farm in China.

Predicting wind speed [26].
  Technique: artificial neural network and decision tree algorithms.
  Merits/outcomes: the platform has the ability of mass storage of meteorological data and efficient query and analysis of weather forecasting.
  Demerits: needs improvement in order to forecast more realistic weather parameters.
  Dataset: meteorological data provided by the Dalian Meteorological Bureau (2011-2015).

Our scheme.
  Technique: employing the multi-lags-one-step (MLOS) ahead forecasting technique with artificial learning-based algorithms.
  Merits/outcomes: the provided results suggest that the ConvLSTM model has the best performance as compared to ANN, CNN, LSTM, and SVM models.
  Demerits: increasing the number of hidden layers may increase the computational time exponentially.
  Dataset: National Wind Institute, West Texas Mesonet (2012-2015).

Table 2: Acronyms and notations used.

Acronyms:
ANN: artificial neural network
CNN: convolutional neural network
LSTM: long short-term memory
ConvLSTM: convolutional LSTM hybrid model
SVM: support vector machine
RE: renewable energy
RNN: recurrent neural network
EWT: empirical wavelet transformation
ENN: Elman neural network
FC-LSTM: fully connected long short-term memory
FCN-LSTM: long short-term memory fully convolutional network
TSP: time-series prediction
UKF: unscented Kalman filter
SVR: support vector regression
SVRM: support vector regression machine
EO: extremal optimization
MAE: mean absolute error
RMSE: root mean square error
MAPE: mean absolute percentage error
R2: R-squared
WT: wavelet transform
GA: genetic algorithm
GAWNN: genetic algorithm of wavelet neural network
WNN: wavelet neural network
MLOS: multi-lags-one-step
VGP: vanishing gradient problem
LM: Levenberg-Marquardt
RBF: radial basis function


2.2. CNN Algorithms. CNN is a feed-forward neural network. To achieve network architecture optimization and solve the unknown parameters in the network, the attributes of a two-dimensional image are extracted and backpropagation algorithms are implemented. To achieve the final outcome, the sampled data are fed into the network to extract the needed attributes with prerefining. Next, the classification or regression is applied [27].

The CNN is basically composed of two types of layers: the convolutional and the pooling layers. The neurons are locally connected within the convolution layer and the preceding layer. Meanwhile, the neurons' local attributes are obtained.

Table 2 (continued).

Notations:
f_t: forget gate
c_t: the cell state
i_t: input gate
x_t: current input data
h_{t-1}: the previous hidden output
c̃_t: input to cell c
c: memory cell
c_{t-1}: past cell status
o_t: output gate
h_t: hidden state
·: matrix multiplication
⊙: an elementwise multiplication
θ: weight
σ: nonlinear function
z_j: the jth hidden neuron
p: number of inputs to the network
m: number of hidden neurons
w_{ij}: the connection weight from the ith input node to the jth hidden node
y_{k-i}: i-step-behind previous wind speed
f_h(·): the activation function in the hidden layer
w_j: the connection weight from the jth hidden node to the output node
ŷ_k: the predicted wind speed at the kth sampling moment
f_o: the activation function for the output layer
y_k: actual wind speed
x_i: input vector
y_i: output vector
R^m: regularized function
f(x): a function that describes the correlation between inputs and outputs
φ(x): preknown function
R[f]: structure risk
w: the regression coefficient vector
b: bias term
C: punishment coefficient
L(x_i, y_i, f(x_i)): the ε-insensitive loss function
ε: threshold
ζ_i, ζ_i*: slack variables that make the constraints feasible
a_i, a_i*: the Lagrange multipliers
K(x_i, x_j): the kernel function
W: the weight matrix
*: convolution operation
b_i, b_f, b_c: bias vectors
∘: Hadamard product
H_t: hidden state
X_t: current wind speed measure
X_{t-1}: previous wind speed measure
X_{t+1}: future wind speed measure


The local sensitivity is found through the pooling layer to obtain the attributes repeatedly. The existence of the convolution and pooling layers minimizes the attribute resolution and the number of network specifications which require enhancement.

CNN typically describes data and constructs them as a two-dimensional array and is extensively utilized in the area of image processing. In this paper, the CNN algorithm is configured to predict the wind speed and fitted to process a one-dimensional array of data. In the preprocessing phase, the one-dimensional data are reconstructed into a two-dimensional array. This enables the CNN machine algorithm to smoothly deal with the data. This creates two files: the property and the response files. These files are delivered as inputs to the CNN. The response file also contains the data of the expected output value.

Each sample is represented by a line from the property and the response files. Weights and biases can be obtained as soon as an acceptable number of samples to train the CNN is delivered. The training continues by comparing the regression results with the response values in order to reach the minimum possible error. This delivers the final trained CNN model, which is utilized to achieve the needed predictions.

The fitting mechanism of CNN is pooling. Various computational approaches have shown that two pooling approaches can be used: average pooling and maximum pooling. Images are stationary, and all parts of an image share similar attributes. Therefore, the pooling approach applies similar average or maximum calculations to every part of the high-resolution images. The pooling process leads to a reduction in the statistics dimensions and an increase in the generalization strength of the model. The results are well optimized and have a lower possibility of overfitting.
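As a small illustration, both pooling variants can be sketched on a one-dimensional signal (the window size of 2 and the sample values are arbitrary):

```python
import numpy as np

def pool1d(x, size=2, mode="max"):
    """Non-overlapping 1-D pooling: reduces resolution by a factor of `size`."""
    x = x[: len(x) // size * size].reshape(-1, size)  # drop any trailing remainder
    return x.max(axis=1) if mode == "max" else x.mean(axis=1)

x = np.array([1.0, 3.0, 2.0, 2.0, 5.0, 1.0])
print(pool1d(x, mode="max"))   # [3. 2. 5.]
print(pool1d(x, mode="avg"))   # [2. 2. 3.]
```

Either variant halves the resolution here, which is the dimensionality reduction the text describes.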

2.3. ANN Algorithms. ANN has three layers which build up the network: the input, hidden, and output layers. These layers have the ability to correlate an input vector to an output scalar or vector using activation functions in the various neurons. The jth hidden neuron z_j can be computed from the p inputs and m hidden neurons using the following equation [14]:

z_j = f_h( Σ_{i=1}^{p} w_{ij} y_{k-i} )   (7)

where w_{ij} is the connection weight from the ith input node to the jth hidden node, y_{k-i} is the i-step-behind previous wind speed, and f_h(·) is the activation function in the hidden layer. Therefore, the future wind speed can be predicted through

ŷ_k = f_o( Σ_{j=1}^{m} w_j z_j )   (8)

where w_j is the connection weight from the jth hidden node to the output node and ŷ_k is the predicted wind speed at the kth sampling moment, while f_o is the activation function for the output layer. By minimizing the error between the actual and the predicted wind speeds, y_k and ŷ_k, respectively, using the Levenberg-Marquardt (LM) algorithm, the nonlinear mapping efficiency of the ANN can be obtained [28].
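Equations (7) and (8) amount to a single forward pass through one hidden layer; a minimal sketch, assuming tanh for f_h, a linear f_o, and made-up weights:

```python
import numpy as np

def ann_forecast(y_lags, W_in, w_out):
    """Eq. (7)-(8): y_lags holds the p previous wind speeds y_{k-1}..y_{k-p}."""
    z = np.tanh(y_lags @ W_in)   # (7): hidden neuron outputs z_j
    return z @ w_out             # (8): predicted wind speed y_hat_k (linear f_o)

rng = np.random.default_rng(1)
p, m = 3, 5                      # 3 input lags, 5 hidden neurons (illustrative)
W_in = rng.normal(size=(p, m))
w_out = rng.normal(size=m)
y_hat = ann_forecast(np.array([4.2, 4.0, 3.9]), W_in, w_out)
```

In practice, W_in and w_out would be fitted by the LM algorithm rather than drawn at random.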

2.4. SVM Algorithms. Assume a set of samples {x_i, y_i}, where i = 1, 2, ..., N, with input vector x_i ∈ R^m and output vector y_i ∈ R^m. The regression problem aims to identify a function f(x) that describes the correlation between inputs and outputs. The interest of SVR is to obtain a linear regression in the high-dimensional feature space delivered by mapping the primary input set utilizing a preknown function φ(x), and to minimize the structure risk R[f]. This mechanism can be written as follows [15]:

f(x) = w^T φ(x) + b,

R[f] = (1/2)‖w‖² + C Σ_{i=1}^{N} L(x_i, y_i, f(x_i))   (9)

where w, b, and C, respectively, are the regression coefficient vector, bias term, and punishment coefficient, and L(x_i, y_i, f(x_i)) is the ε-insensitive loss function. The regression problem can be handled by the following constrained optimization problem:

min (1/2)‖w‖² + C Σ_{i=1}^{N} (ζ_i + ζ_i*)

s.t.  y_i − (w^T φ(x_i) + b) ≤ ε + ζ_i,
      (w^T φ(x_i) + b) − y_i ≤ ε + ζ_i*,
      ζ_i, ζ_i* ≥ 0,  i = 1, 2, ..., N   (10)

where ζ_i and ζ_i* represent the slack variables that make the constraints feasible. By using the Lagrange multipliers, the regression function can be written as follows:

f(x) = Σ_{i=1}^{N} (a_i − a_i*) K(x_i, x_j) + b   (11)

where a_i and a_i* are the Lagrange multipliers that fulfill the conditions a_i ≥ 0, a_i* ≥ 0, and Σ_{i=1}^{N} (a_i − a_i*) = 0, and K(x_i, x_j) is a general kernel function. In this study, the well-known radial basis function (RBF) is chosen as the kernel function:


K(x_i, x_j) = exp(−‖x_i − x_j‖² / (2σ²))   (12)

where σ defines the RBF kernel width [15].
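Equation (12) can be evaluated for a whole sample set at once; a minimal sketch (the kernel width σ = 1 and the sample points are arbitrary):

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    """Gram matrix K(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 sigma^2)), eq. (12)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # pairwise squared distances
    return np.exp(-d2 / (2.0 * sigma**2))

X = np.array([[0.0], [1.0], [2.0]])
K = rbf_kernel(X)
```

The resulting matrix is symmetric with ones on the diagonal, as expected for an RBF kernel.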

2.5. ConvLSTM Algorithm. ConvLSTM is designed to be trained on spatial information in the dataset, and its aim is to deal with 3-dimensional data as an input. Furthermore, it exchanges matrix multiplication for a convolution operation at every LSTM cell's gate. By doing so, it has the ability to capture the underlying spatial features in multidimensional data. The formulas that are used at each one of the gates (input, forget, and output) are as follows:

i_t = σ(W_{xi} * x_t + W_{hi} * h_{t-1} + b_i)

f_t = σ(W_{xf} * x_t + W_{hf} * h_{t-1} + b_f)

o_t = σ(W_{xo} * x_t + W_{ho} * h_{t-1} + b_o)

C_t = f_t ∘ C_{t-1} + i_t ∘ tanh(W_{xc} * x_t + W_{hc} * h_{t-1} + b_c)

H_t = o_t ∘ tanh(C_t)   (13)

where i_t, f_t, and o_t are the input, forget, and output gates, and W is the weight matrix, while x_t is the current input data, h_{t-1} is the previous hidden output, and C_t is the cell state.

The difference from the corresponding LSTM equations is that the matrix multiplication (·) is substituted by the convolution operation (*) between W and each of x_t and h_{t-1} at every gate. By doing so, the fully connected layer is replaced by a convolutional layer. Thus, the number of weight parameters in the model can be significantly reduced.
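The replacement of matrix multiplication by convolution can be sketched for a single gate on a one-dimensional input; the 3-tap kernels and the wind speed patch below are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv_gate(w_x, x_t, w_h, h_prev, b):
    """One ConvLSTM gate: sigma(W_x * x_t + W_h * h_{t-1} + b),
    with '*' a same-padded convolution instead of a matrix product."""
    s = (np.convolve(x_t, w_x, mode="same")
         + np.convolve(h_prev, w_h, mode="same") + b)
    return sigmoid(s)

x_t = np.array([3.1, 2.8, 3.0, 3.4, 3.2])   # a short wind speed patch (mph)
h_prev = np.zeros(5)
i_t = conv_gate(np.array([0.2, 0.5, 0.2]), x_t,
                np.array([0.1, 0.3, 0.1]), h_prev, b=0.0)
```

One 3-weight kernel is shared across all five positions, instead of a full 5 x 5 weight matrix per gate, which is where the reduction in parameters comes from.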

3. Methodology

Due to the nonlinear, nonstationary attributes and the stochastic variations in the wind speed time series, accurate prediction of wind speed is known to be a challenging effort [29]. In this work, to improve the accuracy of the wind speed forecasting model, a comparison between five models is conducted to forecast wind speed considering the available historical data. A concept called multi-lags-one-step (MLOS) ahead forecasting is employed to illustrate its effect on the five models' accuracies. Assume that we are at time index X_t. To forecast one output element in the future, X_{t+1}, the input dataset can be split into many lags (past data) X_{t-I}, where I ∈ {1, ..., 10}. By doing so, the model can be trained on more elements before predicting a single event in the future. In addition, the model accuracy showed an improvement until it reached the optimum lagging point, which had the best accuracy. Beyond this point, the model accuracy degraded, as will be illustrated in the Results section.
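The MLOS splitting described above can be sketched as follows (the helper name and the toy series are ours, for illustration):

```python
import numpy as np

def mlos_split(series, n_lags):
    """Build (X, y) where X[k] = [x_{t-n_lags}, ..., x_{t-1}] and y[k] = x_t."""
    X = np.array([series[k:k + n_lags] for k in range(len(series) - n_lags)])
    y = np.array(series[n_lags:])
    return X, y

X, y = mlos_split(list(range(8)), n_lags=3)
# X[0] = [0, 1, 2] -> y[0] = 3, and so on.
```

Sweeping n_lags from 1 to 10 and refitting each model reproduces the lag optimization reported in Table 7.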

Figure 1 illustrates the workflow of the forecasting model. Specifically, the proposed methodology entails four steps.

In Step 1, data are collected and averaged from 5 minutes to 30 minutes and to 1 hour, respectively. The datasets are then standardized to have a mean value of 0 and a standard deviation of 1. The lagging stage, Step 2, is very important, as the data are split into different lags to study the effect of training the models on more than one element (input) to predict a single event in the future. In Step 3, the models are applied, taking into consideration that some models, such as CNN, LSTM, and ConvLSTM, need to be adjusted from a matrix-shape perspective; these models normally work with 2D data or more. In this stage, manipulation and reshaping of the matrix are conducted. For the sake of checking and evaluating the proposed models, in Step 4, three main metrics are used to validate the case study (MAE, RMSE, and R2). In addition, the execution time and optimum lag are taken into account to select the best model.

Algorithm 1 illustrates the training procedure for ConvLSTM.

4. Collected Data

Table 3 illustrates the characteristics of the collected data at a 5-minute time span. The data are collected from a real wind speed dataset over a three-year period from the West Texas Mesonet, with a 5-minute observation period, from near Lake Alan Garza [30]. The data are processed through averaging from 5 minutes to 30 minutes (whose statistical characteristics are given in Table 4) and one more time to 1 hour (whose statistical characteristics are given in Table 5). The goal of averaging is to study the effect of reducing the data size in order to compare the five models and then select the one that can achieve the highest accuracy for the three dataset cases. As shown in the three tables, the datasets are almost identical and retain their seasonality. Also, they are not affected by the averaging process.
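The averaging step can be sketched as non-overlapping block means, six 5-minute samples per 30-minute value and twelve per 1-hour value (the toy readings are ours):

```python
import numpy as np

def block_average(x, block):
    """Average consecutive, non-overlapping blocks of `block` samples."""
    x = np.asarray(x, dtype=float)
    x = x[: len(x) // block * block]      # drop a trailing partial block
    return x.reshape(-1, block).mean(axis=1)

five_min = np.arange(24, dtype=float)     # two hours of 5-min readings (toy data)
half_hour = block_average(five_min, 6)    # 30-min averages
hourly = block_average(five_min, 12)      # 1-hour averages
```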

The data have been split into three sets (training, validation, and test) with fractions of 53%, 14%, and 33%, respectively.
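Assuming the split is chronological (shuffling a time series would leak future samples into training), those fractions translate as follows; the helper is our sketch, not the paper's code:

```python
def chrono_split(data, train=0.53, val=0.14):
    """Split a time-ordered sequence into train/validation/test parts."""
    n = len(data)
    i, j = int(n * train), int(n * (train + val))
    return data[:i], data[i:j], data[j:]

train, val, test = chrono_split(list(range(100)))
# -> 53, 14, and 33 samples, respectively
```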

5. Results and Discussion

To quantitatively evaluate the performance of the predictive models, commonly used statistical measures are tested [20]. All of them measure the deviation between the actual and predicted wind speed values. Specifically, RMSE, MAE, and R2 are as follows:

RMSE

1113936Ni

i1 yi minus 1113954yi( 11138572

Ni

1113971

MAE 1N

1113944N

i1 yi minus 1113954yi

11138681113868111386811138681113868111386811138681113868

R2

1 minus1113936

Ni

i1 yi minus 1113954yi( 11138572

1113936Ni

i1 yi minus yi( 11138572

(14)

where $y_i$ and $\hat{y}_i$ are the actual and predicted wind speed, respectively, while $\bar{y}$ is the mean value of the actual wind speed sequence. Typically, smaller amplitudes of these measures indicate an improved forecasting procedure, while R² is the goodness-of-fit measure for the model; therefore, the larger its value, the better the model fits. The testbed environment is configured as listed below.
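A minimal NumPy sketch of the three measures in equation (14):

```python
import numpy as np

def rmse(y, y_hat):
    # Root mean square error: square root of the mean squared deviation.
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mae(y, y_hat):
    # Mean absolute error.
    return float(np.mean(np.abs(y - y_hat)))

def r2(y, y_hat):
    # Coefficient of determination: 1 - SS_res / SS_tot.
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)

y = np.array([1.0, 2.0, 3.0, 4.0])       # toy actual values
y_hat = np.array([2.0, 2.0, 3.0, 3.0])   # toy predictions
# rmse ~ 0.707, mae = 0.5, r2 = 0.6
```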

8 Computational Intelligence and Neuroscience

[Figure 1 depicts the proposed forecasting pipeline: data processing (collect raw time-series data; averaging and preprocessing into 5 min, 30 min, and 1 hr sets), lagging (lags L1, L2, ..., L10), modeling (ANN, CNN, LSTM, ConvLSTM, and SVR), and evaluation (forecasting results scored by MSE, RMSE, MAE, R², delay, and execution time to select the best-fitted model by its KPIs).]

Figure 1: The proposed forecasting methodology.

Input: the wind speed time series data
Output: forecasting performance indices

(1) The wind speed time series data are measured every 5 minutes and averaged twice, to 30 minutes and 1 hour, respectively
(2) The wind datasets are split into Training, Validation, and Test sets
(3) Initiate the multi-lags-one-step (MLOS) arrays for the Training, Validation, and Test sets
(4) Define the MLOS range as 1-10 to optimize the number of needed lags
(5) loop 1:
      Split the first set based on the MLOS range
      Initiate and extract set features with a CNN layer
      Pass the output to a defined LSTM layer
      Select the first range of the number of hidden neurons
      Generate the prediction results' performance indices
      Count the time to execute and produce prediction results
      Save and compare results with previous ones
    loop 2:
      Select the next MLOS range
      If MLOS range = maximum range, then go to loop 3 and initialize the MLOS range
      Go to loop 1
    loop 3:
      Select a new number of hidden neurons
      If number of hidden neurons = maximum range, then go to loop 4 and initialize the number of hidden neurons range
      Go to loop 1
(6) loop 4:
      Select a new dataset from the sets 5 min, 30 min, and 1 hr
      Go to loop 1
      If set range = maximum range, then
        Generate the performance indices of all tested sets
        Select the best results' metrics

Algorithm 1: ConvLSTM training.
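The nested loops of Algorithm 1 can be sketched in plain Python. Here `fit_convlstm` is a hypothetical stand-in for the actual model training, with a toy score whose optimum mimics the 4-lag / 15-neuron result reported later for the 5-minute dataset:

```python
import time

def fit_convlstm(dataset, n_lags, n_hidden):
    """Placeholder for training ConvLSTM; returns an R2-like score."""
    # Toy score that peaks at 4 lags and 15 hidden neurons.
    return 1.0 - 0.01 * abs(n_lags - 4) - 0.001 * abs(n_hidden - 15)

def grid_search(datasets, lag_range, hidden_range):
    best = None
    for name in datasets:                                   # loop 4
        for n_hidden in hidden_range:                       # loop 3
            for n_lags in lag_range:                        # loops 1-2
                t0 = time.perf_counter()
                score = fit_convlstm(name, n_lags, n_hidden)
                elapsed = time.perf_counter() - t0          # execution time
                cand = (score, name, n_lags, n_hidden, elapsed)
                if best is None or cand[0] > best[0]:
                    best = cand                             # keep best indices
    return best

best = grid_search(["5min"], range(1, 11), range(5, 21, 5))
# best lags/neurons for the toy score: 4 lags, 15 hidden neurons
```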


(1) CPU: Intel(R) Core(TM) i7-8550U CPU @ 1.80 GHz, 2001 MHz, 4 core(s), 8 logical processor(s)
(2) RAM: 16.0 GB installed physical memory
(3) GPU: AMD Radeon(TM) RX 550, 1.0 GB
(4) Framework: Anaconda 2019.07, Python 3.7

Table 6 illustrates the chosen optimized internal parameters (hyperparameters) for the forecasting methods used in this work. For each method, the optimal number of hidden neurons is chosen to achieve the maximum R² and the minimum RMSE and MAE values.

After the implementation of CNN, ANN, LSTM, ConvLSTM, and SVM, the most fitted model was chosen depending on its accuracy in predicting future wind speed values. Thus, the seasonality is considered in the forecast mechanism. The chosen model has to deliver the most fitted data with the least amount of error, taking into consideration the nature of the data and not applying naive forecasting to it.

To achieve this goal, the statistical error indicators are calculated for every model and time lapse and fully represented as Figure 2 illustrates. The provided results suggest that the ConvLSTM model has the best performance compared to the other four models. The chosen model has to reach the minimum values of RMSE and MAE and the maximum R² value.

Different parameters are also tested to ensure the right decision in choosing the best fitted model. The optimum number of lags, presented in Table 7, is one of the most important indicators in selecting the best fitted model, since the fewer historical points the model needs, the lower the computational effort will be. For each method, the optimal number of lags is chosen to achieve the maximum R² and the minimum RMSE and MAE values. For instance, Figures 3 and 4 show the relation between the statistical measures and the number of hidden neurons and the number of lags, respectively, for the proposed ConvLSTM method in the 5-minute time span case. It can be seen that 4 lags and 15 hidden neurons achieved the maximum R² and minimum RMSE and MAE values.

The execution time shown in Table 8 is calculated for each method and time lapse to assure that the final chosen model is efficient and can effectively predict future wind speed. The shorter the execution time, the more efficient and helpful the model will be; this is also a sign that the model is amenable to further modifications. According to Table 8, the ConvLSTM model beats all other models in the time needed to process the historical data and deliver a final prediction: SVM needed 54 minutes to accomplish the training and produce testing results, while ConvLSTM made it in just 1.7 minutes. This huge difference between them motivated the choice of ConvLSTM.

Table 3: Dataset characteristics for the 5 min sample.

| Dataset | Max | Median | Min | Mean | Std |
|---|---|---|---|---|---|
| All datasets | 18.73 | 3.53 | 0.01 | 3.91 | 2.10 |
| Training dataset | 18.73 | 3.47 | 0.01 | 3.83 | 2.05 |
| Test dataset | 14.87 | 3.67 | 0.01 | 4.05 | 2.20 |

Table 4: Dataset characteristics for the 30 min sample.

| Dataset | Max | Median | Min | Mean | Std |
|---|---|---|---|---|---|
| All datasets | 17.66 | 3.53 | 0.01 | 3.91 | 2.08 |
| Training dataset | 17.66 | 3.47 | 0.01 | 3.837309 | 2.02 |
| Test dataset | 14.32 | 3.67 | 0.02 | 4.05 | 2.18 |

Table 5: Dataset characteristics for the 1 hour sample.

| Dataset | Max | Median | Min | Mean | Std |
|---|---|---|---|---|---|
| All datasets | 17.61 | 3.53 | 0.07 | 3.91 | 2.05 |
| Training dataset | 17.61 | 3.46 | 0.07 | 3.83 | 2.00 |
| Test dataset | 14.22 | 3.66 | 0.07 | 4.05 | 2.15 |

Table 6: Optimized internal parameters for the forecasting methods.

| Method | Sets of parameters |
|---|---|
| LSTM | 5 min: 17 hidden neurons; 30 min: 20 hidden neurons; 1 hour: 8 hidden neurons |
| CNN | 5 min: 15 hidden neurons; 30 min: 5 hidden neurons; 1 hour: 15 hidden neurons |
| ConvLSTM | 5 min: 15 hidden neurons; 30 min: 8 hidden neurons; 1 hour: 20 hidden neurons |
| ANN | 5 min: 15 hidden neurons; 30 min: 15 hidden neurons; 1 hour: 20 hidden neurons |
| SVR | 5 min: C = 7, ε = 0.1, γ = 0.2; 30 min: C = 1, ε = 0.25, γ = 0.15; 1 hour: C = 1, ε = 0.1, γ = 0.05 |


Figure 5 shows that the 5-minute lapse dataset is the most fitted dataset for the chosen model; it illustrates how accurate the prediction of future wind speed will be.

For completeness, to effectively evaluate the investigated forecasting techniques in terms of their prediction accuracies, a 50-trial cross-validation procedure is carried out, in which the investigated techniques are built and then evaluated on 50 different training and test datasets, respectively, randomly sampled from the available overall dataset. The ultimate performance metrics are then reported as the average and standard deviation of the 50 metrics obtained across the cross-validation trials. In this regard, Figure 6 shows the average performance metrics on the test dataset using the 50-trial cross-validation procedure. It can be easily recognized that the forecasting models that employ the LSTM technique outperform the other investigated techniques in terms of the three performance metrics R², RMSE, and MAE.
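The 50-trial procedure amounts to repeated random subsampling. A sketch, where the `evaluate` callback is a hypothetical stand-in for building and scoring one model:

```python
import random
import statistics

def repeated_holdout(data, evaluate, trials=50, test_frac=0.33, seed=0):
    """Draw `trials` random train/test splits and summarize the metric."""
    rng = random.Random(seed)
    scores = []
    for _ in range(trials):
        shuffled = data[:]
        rng.shuffle(shuffled)                    # random resampling per trial
        n_test = int(len(shuffled) * test_frac)
        train, test = shuffled[:-n_test], shuffled[-n_test:]
        scores.append(evaluate(train, test))
    # report the average and the spread across trials
    return statistics.mean(scores), statistics.stdev(scores)
```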

From the experimental results of short-term wind speed forecasting shown in Figure 6, we can observe that ConvLSTM performs the best in terms of the forecasting metrics (R², RMSE, and MAE) compared to the other models (i.e., CNN, ANN, SVR, and LSTM). The related statistical tests in Tables 6 and 7, respectively, have proved the effectiveness of ConvLSTM and its capability of handling noisy, large data. ConvLSTM showed that it can produce highly accurate wind speed predictions with fewer lags and hidden neurons. This was indeed reflected in the results shown in Table 8, with less computation time compared to the other tested models. Furthermore, we introduced multi-lags-one-step (MLOS) ahead forecasting combined with the hybrid ConvLSTM model to provide efficient generalization to new time series data and predict wind speed accurately. Results showed that the ConvLSTM proposed in this paper is an effective and promising model for wind speed forecasting.

[Figure 2 reports the three KPIs per model and time span, with values as follows.]

(a) R²:

| Time span | CNN | ANN | LSTM | SVM | ConvLSTM |
|---|---|---|---|---|---|
| 5 min | 0.9873 | 0.9873 | 0.9876 | 0.9815 | 0.9876 |
| 30 min | 0.9091 | 0.9089 | 0.9097 | 0.9078 | 0.9098 |
| 1 hour | 0.8309 | 0.8302 | 0.8315 | 0.8211 | 0.8316 |

(b) RMSE:

| Time span | CNN | ANN | LSTM | SVM | ConvLSTM |
|---|---|---|---|---|---|
| 5 min | 0.2482 | 0.2483 | 0.2424 | 0.2462 | 0.2419 |
| 30 min | 0.6578 | 0.6588 | 0.6467 | 0.6627 | 0.6463 |
| 1 hour | 0.8862 | 0.8881 | 0.8718 | 0.9116 | 0.8717 |

(c) MAE:

| Time span | CNN | ANN | LSTM | SVM | ConvLSTM |
|---|---|---|---|---|---|
| 5 min | 0.1806 | 0.1798 | 0.1758 | 0.1855 | 0.1755 |
| 30 min | 0.4802 | 0.4799 | 0.4721 | 0.4818 | 0.4741 |
| 1 hour | 0.6468 | 0.6473 | 0.6392 | 0.6591 | 0.6369 |

Figure 2: Models' key performance indicators (KPIs).

Table 7: Optimum number of lags.

| Method | 5 min | 30 min | 1 hour |
|---|---|---|---|
| CNN | 9 | 4 | 4 |
| ANN | 3 | 3 | 3 |
| LSTM | 7 | 6 | 10 |
| ConvLSTM | 4 | 5 | 8 |
| SVR | 9 | 4 | 5 |


[Figure 3 plots (a) R-squared, (b) RMSE, and (c) MAE for ConvLSTM against the number of hidden neurons on the 5-minute dataset.]

Figure 3: ConvLSTM measured statistical values versus the number of hidden neurons.

[Figure 4 plots (a) R-squared, (b) RMSE, and (c) MAE for ConvLSTM against the number of lags (1-10) on the 5-minute dataset.]

Figure 4: ConvLSTM measured statistical values versus the number of lags.


Similar to our work, the EnsemLSTM model proposed by Chen et al. [15] contained different clusters of LSTMs with different hidden layers and hidden neurons. They combined the LSTM clusters with SVR and an external optimizer in order to enhance the generalization capability and robustness of their model. However, their model showed high computational complexity with mediocre performance indices. Our proposed ConvLSTM with MLOS assured boosting the generalization and robustness for new time series data as well as producing high performance indices.

6. Conclusions

In this study, we proposed a hybrid deep learning-based framework, ConvLSTM, for short-term prediction of the wind

Table 8: Execution time.

| Method | 5 min: time (min) | 30 min: time (min) | 1 hour: time (min) |
|---|---|---|---|
| ConvLSTM | 1.7338 | 0.3849 | 0.1451 |
| SVR | 54.1424 | 0.8214 | 0.2250 |
| CNN | 0.87828 | 0.1322 | 0.0708 |
| ANN | 0.7431 | 0.2591 | 0.0587 |
| LSTM | 1.6570 | 0.3290 | 0.1473 |

[Figure 5 overlays the predicted and true wind speed (mph) over 200 test samples for the ConvLSTM model in the 5-minute time span.]

Figure 5: ConvLSTM true/predicted wind speed versus the number of samples.

[Figure 6 shows bar charts of (a) R², (b) RMSE, and (c) MAE for ConvLSTM, LSTM, ANN, CNN, and SVR.]

Figure 6: Average performance metrics obtained on the test dataset using the 50-trial cross-validation procedure.


speed time series measurements. The proposed dynamic prediction model was optimized for the number of input lags and the number of internal hidden neurons. Multi-lags-one-step (MLOS) ahead wind speed forecasting using the proposed approach showed superior results compared to four other models built using standard ANN, CNN, LSTM, and SVM approaches. The proposed modeling framework combines the benefits of CNN and LSTM networks in a hybrid modeling scheme that delivers highly accurate wind speed predictions with fewer lags and hidden neurons as well as less computational complexity. For future work, further investigation can be done to improve the accuracy of the ConvLSTM model, for instance, increasing and optimizing the number of hidden layers, applying multi-lags-multi-steps (MLMS) ahead forecasting, and introducing a reinforcement learning agent to optimize the parameters as compared to other optimization methods.

Data Availability

The wind speed data used in this study have been taken from the West Texas Mesonet of the US National Wind Institute (http://www.depts.ttu.edu/nwi/research/facilities/wtm/index.php). Data are provided freely for academic research purposes only and cannot be shared/distributed beyond academic research use without permission from the West Texas Mesonet.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors acknowledge the help of Prof. Brian Hirth at Texas Tech University for providing them with access to weather data.

References

[1] R. Rothermel, "How to predict the spread and intensity of forest and range fires," General Technical Report INT-143, Intermountain Forest and Range Experiment Station, Ogden, UT, USA, 1983.

[2] A. Tascikaraoglu and M. Uzunoglu, "A review of combined approaches for prediction of short-term wind speed and power," Renewable and Sustainable Energy Reviews, vol. 34, pp. 243–254, 2014.

[3] R. J. Barthelmie, F. Murray, and S. C. Pryor, "The economic benefit of short-term forecasting for wind energy in the UK electricity market," Energy Policy, vol. 36, no. 5, pp. 1687–1696, 2008.

[4] Y.-K. Wu and H. Jing-Shan, A Literature Review of Wind Forecasting Technology in the World, pp. 504–509, IEEE Lausanne Powertech, Lausanne, Switzerland, 2007.

[5] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," The Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115–133, 1943.

[6] K. Fukushima, "Neocognitron," Scholarpedia, vol. 2, no. 1, p. 1717, 2007.

[7] K. Fukushima, "Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics, vol. 36, no. 4, pp. 193–202, 1980.

[8] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.

[9] H. T. Siegelmann and E. D. Sontag, "On the computational power of neural nets," Journal of Computer and System Sciences, vol. 50, no. 1, pp. 132–150, 1995.

[10] R. Sunil, "Understanding support vector machine algorithm," https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code, 2017.

[11] X. Shi, Z. Chen, H. Wang, D. Y. Yeung, W. K. Wong, and W. C. Woo, "Convolutional LSTM network: a machine learning approach for precipitation nowcasting," in Proceedings of the Neural Information Processing Systems, pp. 7–12, Montreal, Canada, December 2015.

[12] C. Finn, I. Goodfellow, and S. Levine, "Unsupervised learning for physical interaction through video prediction," in Proceedings of the Annual Conference on Neural Information Processing Systems, pp. 64–72, Barcelona, Spain, December 2016.

[13] H. Xu, Y. Gao, F. Yu, and T. Darrell, "End-to-end learning of driving models from large-scale video datasets," in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3530–3538, Honolulu, HI, USA, July 2017.

[14] K. Chen and J. Yu, "Short-term wind speed prediction using an unscented Kalman filter based state-space support vector regression approach," Applied Energy, vol. 113, pp. 690–705, 2014.

[15] J. Chen, G.-Q. Zeng, W. Zhou, W. Du, and K.-D. Lu, "Wind speed forecasting using nonlinear-learning ensemble of deep learning time series prediction and extremal optimization," Energy Conversion and Management, vol. 165, pp. 681–695, 2018.

[16] D. Liu, D. Niu, H. Wang, and L. Fan, "Short-term wind speed forecasting using wavelet transform and support vector machines optimized by genetic algorithm," Renewable Energy, vol. 62, pp. 592–597, 2014.

[17] Y. Wang, "Short-term wind power forecasting by genetic algorithm of wavelet neural network," in Proceedings of the 2014 International Conference on Information Science, Electronics and Electrical Engineering, pp. 1752–1755, Sapporo, Japan, April 2014.

[18] N. Sheikh, D. B. Talukdar, R. I. Rasel, and N. Sultana, "A short term wind speed forecasting using SVR and BP-ANN: a comparative analysis," in Proceedings of the 2017 20th International Conference of Computer and Information Technology (ICCIT), pp. 1–6, Dhaka, Bangladesh, December 2017.

[19] H. Nantian, C. Yuan, G. Cai, and E. Xing, "Hybrid short term wind speed forecasting using variational mode decomposition and a weighted regularized extreme learning machine," Energies, vol. 9, no. 12, pp. 989–1007, 2016.

[20] S. Haijian and X. Deng, "AdaBoosting neural network for short-term wind speed forecasting based on seasonal characteristics analysis and lag space estimation," Computer Modeling in Engineering & Sciences, vol. 114, no. 3, pp. 277–293, 2018.

[21] W. Xiaodan, "Forecasting short-term wind speed using support vector machine with particle swarm optimization," in Proceedings of the 2017 International Conference on Sensing, Diagnostics, Prognostics and Control (SDPC), pp. 241–245, Shanghai, China, August 2017.

[22] F. R. Ningsih, E. C. Djamal, and A. Najmurrakhman, "Wind speed forecasting using recurrent neural networks and long short term memory," in Proceedings of the 2019 6th International Conference on Instrumentation, Control and Automation (ICA), pp. 137–141, Bandung, Indonesia, July–August 2019.

[23] P. K. Datta, "An artificial neural network approach for short-term wind speed forecast," Master's thesis, Kansas State University, Manhattan, KS, USA, 2018.

[24] J. Guanlong, D. Li, L. Yao, and P. Zhao, "An improved artificial bee colony-BP neural network algorithm in the short-term wind speed prediction," in Proceedings of the 2016 12th World Congress on Intelligent Control and Automation (WCICA), pp. 2252–2255, Guilin, China, June 2016.

[25] C. Gonggui, J. Chen, Z. Zhang, and Z. Sun, "Short-term wind speed forecasting based on fuzzy C-means clustering and improved MEA-BP," IAENG International Journal of Computer Science, vol. 46, no. 4, pp. 27–35, 2019.

[26] W. ZhanJie and A. B. M. Mazharul Mujib, "The weather forecast using data mining research based on cloud computing," Journal of Physics: Conference Series, vol. 910, no. 1, Article ID 012020, 2017.

[27] A. Zhu, X. Li, Z. Mo, and R. Wu, "Wind power prediction based on a convolutional neural network," in Proceedings of the 2017 International Conference on Circuits, Devices and Systems (ICCDS), pp. 131–135, Chengdu, China, September 2017.

[28] O. Linda, T. Vollmer, and M. Manic, "Neural network based intrusion detection system for critical infrastructures," in Proceedings of the 2009 International Joint Conference on Neural Networks, pp. 1827–1834, Atlanta, GA, USA, June 2009.

[29] C. Li, Z. Xiao, X. Xia, W. Zou, and C. Zhang, "A hybrid model based on synchronous optimisation for multi-step short-term wind speed forecasting," Applied Energy, vol. 215, pp. 131–144, 2018.

[30] National Wind Institute, West Texas Mesonet, online referencing: http://www.depts.ttu.edu, 2018.


Table 1: Main characteristics of the existing wind speed forecasting schemes.

| Aim | Technique | Merits/outcomes | Demerits | Dataset |
|---|---|---|---|---|
| Hybrid wind speed prediction [13] | Empirical wavelet transformation (EWT), long short-term memory neural network, and a deep learning algorithm | The proposed model has satisfactory multistep forecasting results | The performance of the EWT for wind speed multistep forecasting has not been studied | Four sets of original wind speed series including 700 samples |
| Wind speed forecasting [14] | Unscented Kalman filter (UKF) integrated with a support vector regression (SVR) model | Better performance in both one-step-ahead and multistep-ahead predictions than ANN, SVR, autoregressive, and autoregressive-integrated-with-Kalman-filter models | Needs to develop predictive model-based control and optimization strategies for wind farm operation | Center for Energy Efficiency and Renewable Energy at the University of Massachusetts |
| Wind speed forecasting [15] | Long short-term memory neural networks, support vector regression machine, and extremal optimization algorithm | Achieves better forecasting performance than ARIMA, SVR, ANN, KNN, and GBRT models | Needs to consider more interrelated features, such as weather conditions, human factors, and power system status | A wind farm in Inner Mongolia, China |
| A hybrid short-term wind speed forecasting [16] | Wavelet transform (WT), genetic algorithm (GA), and support vector machines (SVMs) | More efficient than a persistent model and an SVM-GA model without WT | Needs to augment external information, such as air pressure, precipitation, and air humidity, besides the temperature | Wind speed data every 0.5 h in a wind farm of North China in September 2012 |
| Short-term wind speed prediction [18] | Support vector regression (SVR) and artificial neural network (ANN) with backpropagation | The proposed SVR and ANN models are able to predict wind speed with more than 99% accuracy | Computationally expensive | Historical dataset (2008–2014) of wind speed of the Chittagong coastal area from the Bangladesh Meteorological Division (BMD) |
| Hybrid wind speed forecasting [19] | Variational mode decomposition (VMD), the partial autocorrelation function (PACF), and weighted regularized extreme learning machine (WRELM) | (i) VMD reduces the influences of randomness and volatility of wind speed; (ii) PACF reduces the feature dimension and complexity of the model; (iii) ELM improves the prediction accuracy | The forecasting accuracy of two-step-ahead and three-step-ahead predictions declined to different degrees | USA National Renewable Energy Laboratory (NREL) in 2004 |
| Short-term wind speed forecasting [20] | Wavelet analysis and AdaBoosting neural network | (i) Benefits the analysis of the wind speed's randomness and the optimal neural network structure; (ii) can be used to improve the model's configuration and show confidence in high-accuracy forecasting | Needs to consider a dynamical model with the ability of error correction and adaptive adjustment | USA National Renewable Energy Laboratory (NREL) in 2004 |
| Short-term wind speed forecasting [21] | Support vector machine (SVM) with particle swarm optimization (PSO) | Best forecasting accuracy compared to classical SVM and backpropagation neural network models | Needs to consider additional information for efficient forecasting, such as season and weather variables | Wind farm data in China in 2011 |
| Wind speed predictions [22] | Recurrent neural network (RNN) with long short-term memory (LSTM) | The model provides 92.7% accuracy for training data and 91.6% for new data | High-rate epochs increased the process time and eventually provided low accuracy performance | Nganjuk Meteorology and Geophysics Agency (BMKG), East Java (2008–2017) |
| Forecasting multistep-ahead wind speed [23] | NARNET model to forecast hourly wind speed using an artificial neural network (ANN) | The model is cost effective and can work with minimum availability of statistical data | (i) Faulty input measurements are likely to affect the model parameters; (ii) removing rapid changes using a low-pass filter might result in neglecting important information | Meteorological data from the National Oceanic and Atmospheric Administration (NOAA) located in Dodge City, Kansas (January 2010–December 2010) |
| Short-term wind speed prediction [24] | Backpropagation (BP) neural network based on an improved artificial bee colony algorithm (ABC-BP) | The model has high precision and a fast convergence rate compared with traditional and genetic BP neural networks | Sensitive to noisy data; therefore, data should be filtered, which may affect the nature of the data | Wind farm in Tianjin, China (December 2013–January 2014) |
| Short-term wind speed forecasting [25], 2019 | Fuzzy C-means clustering (FCM) and improved mind evolutionary algorithm-BP (IMEA-BP) | The proposed model is suitable for one-step forecasting and enhances the accuracy of multistep forecasting | The accuracy of multistep forecasting needs to be further improved | Wind farm in China |
| Predicting wind speed [26] | Artificial neural network and decision tree algorithms | The platform has the ability of mass storage of meteorological data and efficient query and analysis of weather forecasting | Needs improvement in order to forecast more realistic weather parameters | Meteorological data provided by the Dalian Meteorological Bureau (2011–2015) |
| Our scheme | Employing the multi-lags-one-step (MLOS) ahead forecasting technique with artificial learning-based algorithms | The provided results suggest that the ConvLSTM model has the best performance compared to the ANN, CNN, LSTM, and SVM models | Increasing the number of hidden layers may increase the computational time exponentially | National Wind Institute, West Texas Mesonet (2012–2015) |

Table 2: Acronyms and notations used.

| Category | Items/symbols | Description |
|---|---|---|
| Acronyms | ANN | Artificial neural network |
| | CNN | Convolutional neural network |
| | LSTM | Long short-term memory |
| | ConvLSTM | Convolutional LSTM hybrid model |
| | SVM | Support vector machine |
| | RE | Renewable energy |
| | RNN | Recurrent neural network |
| | EWT | Empirical wavelet transformation |
| | ENN | Elman neural network |
| | FC-LSTM | Fully connected long short-term memory |
| | FCN-LSTM | Long short-term memory fully convolutional network |
| | TSP | Time-series prediction |
| | UKF | Unscented Kalman filter |
| | SVR | Support vector regression |
| | SVRM | Support vector regression machine |
| | EO | Extremal optimization |
| | MAE | Mean absolute error |
| | RMSE | Root mean square error |
| | MAPE | Mean absolute percentage error |
| | R² | R-squared |
| | WT | Wavelet transform |
| | GA | Genetic algorithm |
| | GAWNN | Genetic algorithm of wavelet neural network |
| | WNN | Wavelet neural network |
| | MLOS | Multi-lags-one-step |
| | VGP | Vanishing gradient problem |
| | LM | Levenberg–Marquardt |
| | RBF | Radial basis function |


2.2. CNN Algorithms. CNN is a feed-forward neural network. To achieve network-architecture optimization and solve for the unknown parameters in the network, the attributes of a two-dimensional image are extracted and backpropagation algorithms are implemented. To obtain the final outcome, the sampled data are fed into the network to extract the needed attributes with pre-refining; then, classification or regression is applied [27].

The CNN is composed of basically two types of layers: the convolutional and the pooling layers. The neurons are locally connected between the convolution layer and the preceding layer, through which the neurons' local attributes are obtained.

Table 2: Continued.

| Category | Items/symbols | Description |
|---|---|---|
| Notations | f_t | Forget gate |
| | C_t | The cell state |
| | i_t | Input gate |
| | x_t | Current input data |
| | h_{t-1} | The previous hidden output |
| | c̃_t | Input to cell c |
| | c | Memory cell |
| | C_{t-1} | Past cell status |
| | o_t | Output gate |
| | h_t | Hidden state |
| | · | Matrix multiplication |
| | ⊙ | Elementwise multiplication |
| | θ | Weight |
| | σ | Nonlinear function |
| | z_j | The jth hidden neuron |
| | p | Number of inputs to the network |
| | m | Number of hidden neurons |
| | w_{ij} | The connection weight from the ith input node to the jth hidden node |
| | y_{k-i} | i-step-behind previous wind speed |
| | f_h(·) | The activation function in the hidden layer |
| | w_j | The connection weight from the jth hidden node to the output node |
| | ŷ_k | The predicted wind speed at the kth sampling moment |
| | f_o | The activation function for the output layer |
| | y_k | Actual wind speed |
| | x_i | Input vector |
| | y_i | Output vector |
| | R_m | Regularized function |
| | f(x) | A function that describes the correlation between inputs and outputs |
| | φ(·) | Preknown function |
| | R[f] | Structure risk |
| | w | The regression coefficient vector |
| | b | Bias term |
| | C | Punishment coefficient |
| | L(x_i, y_i, f(x_i)) | The ε-insensitive loss function |
| | ε | Threshold |
| | ζ_i, ζ_i* | Slack variables that keep the constraints feasible |
| | a_i, a_i* | The Lagrange multipliers |
| | K(x_i, x_j) | The kernel function |
| | W | The weight matrix |
| | * | Convolution operation |
| | b_i, b_f, b_c | Bias vectors |
| | ∘ | Hadamard product |
| | H_t | Hidden state |
| | X_t | Current wind speed measure |
| | X_{t-1} | Previous wind speed measure |
| | X_{t+1} | Future wind speed measure |


The local sensitivity is found through the pooling layer to obtain the attributes repeatedly. The existence of the convolution and pooling layers minimizes the attribute resolution and the number of network specifications that require enhancement.

CNN typically describes data and constructs them as a two-dimensional array and is extensively utilized in the area of image processing. In this paper, the CNN algorithm is configured to predict the wind speed and fitted to process a one-dimensional array of data. In the preprocessing phase, the one-dimensional data are reconstructed into a two-dimensional array; this enables the CNN algorithm to deal smoothly with the data. This creates two files: the property file and the response file. These files are delivered as inputs to the CNN, and the response file also contains the data of the expected output value.

Each sample is represented by a line from the property and response files. Weights and biases can be obtained as soon as an acceptable number of samples to train the CNN is delivered. The training continues by comparing the regression results with the response values in order to reach the minimum possible error. This delivers the final trained CNN model, which is utilized to achieve the needed predictions.

The fitting mechanism of CNN is pooling. Various computational approaches have shown that two pooling approaches can be used: average pooling and maximum pooling. Images are stationary, and all parts of an image share similar attributes; therefore, the pooling approach applies the same average or maximum calculation to every part of the high-resolution image. The pooling process reduces the statistics dimensions and increases the generalization strength of the model. The results are well optimized and have a lower possibility of overfitting.
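A sketch of the 1-D-to-2-D reshaping step described above (the shapes are illustrative; the paper does not specify the exact 2-D layout it uses):

```python
import numpy as np

# Six lagged wind speed samples form one "property" line of the input file.
window = np.array([3.1, 3.4, 3.8, 4.0, 3.9, 3.7])

# View the 1-D window as a 2-D array so convolution/pooling layers apply.
as_2d = window.reshape(2, 3)

# Add batch and channel axes, as most CNN APIs expect: (batch, H, W, channels).
batch = as_2d.reshape(1, 2, 3, 1)
```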

2.3. ANN Algorithms. ANN has three layers that build up the network: the input, hidden, and output layers. These layers have the ability to correlate an input vector to an output scalar or vector using activation functions in the various neurons. The jth hidden neuron $z_j$ can be computed from the $p$ inputs and $m$ hidden neurons using the following equation [14]:

$$z_j = f_h\left(\sum_{i=1}^{p} w_{ij}\, y_{k-i}\right), \qquad (7)$$

where $w_{ij}$ is the connection weight from the ith input node to the jth hidden node, $y_{k-i}$ is the i-step-behind previous wind speed, and $f_h(\cdot)$ is the activation function in the hidden layer. Therefore, the future wind speed can be predicted through

$$\hat{y}_k = f_o\left(\sum_{j=1}^{m} w_j z_j\right), \qquad (8)$$

where $w_j$ is the connection weight from the jth hidden node to the output node, $\hat{y}_k$ is the predicted wind speed at the kth sampling moment, and $f_o$ is the activation function for the output layer. By minimizing the error between the actual and predicted wind speeds, $y_k$ and $\hat{y}_k$, respectively, using the Levenberg–Marquardt (LM) algorithm, the nonlinear mapping efficiency of the ANN can be obtained [28].
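Equations (7) and (8) amount to a two-layer forward pass. A NumPy sketch, where the weight values and the tanh activation are illustrative assumptions, not the paper's trained parameters:

```python
import numpy as np

def ann_forecast(y_lags, W_in, w_out, act=np.tanh):
    """Eq. (7): z_j = f_h(sum_i w_ij y_{k-i}); eq. (8): y_hat = f_o(sum_j w_j z_j)."""
    z = act(W_in @ y_lags)        # hidden-layer outputs z_j
    return float(act(w_out @ z))  # predicted wind speed y_hat_k

p, m = 3, 2                       # p lagged inputs, m hidden neurons
y_lags = np.array([3.5, 3.8, 4.1])
W_in = np.full((m, p), 0.1)       # toy input-to-hidden weights w_ij
w_out = np.full(m, 0.5)           # toy hidden-to-output weights w_j
y_hat = ann_forecast(y_lags, W_in, w_out)
```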

24 SVM Algorithms Assuming a set of samples xi yi1113864 1113865where i 1 2 N with input vector xi isin Rm and out-put vectoryi isin Rm )e regression obstacles aim to identifya function f(x)that describes the correlation between inputsand outputs )e interest of SVR is to obtain a linear re-gression in the high-dimensional feature space delivered bymapping the primary input set utilizing a preknown func-tion ϕ(x())and to minimize the structure riskR[f] )ismechanism can be written as follows [15]

f(x) wTϕ(x) + b

R[f] 12W

2+ C 1113944

N

i1L xiyif xi( )1113874 1113875

(9)

where W b and Crespectively are the regression coefficientvector bias term and punishment coefficient L(xi yi f(xi)

)

is the e-insensitive loss function)e regression problem canbe handled by the following constrained optimizationproblem

min 12W

2+ C 1113944

N

i1L ζ iζ

lowasti( 1113857

st yi minus wTϕ(x) + b1113872 1113873 le ε + ζ i wTϕ(x) + b1113872 1113873 minus yi le ε + ζ lowasti ζ i ζlowasti ge 0 i 1 2 N

(10)

where ζ iand ζ lowasti represent the slack variables that let con-straints feasible By using the Lagrange multipliers the re-gression function can be written as follows

f(x) 1113944N

i1ai minus a

lowasti( 1113857K xixj1113872 1113873 + b (11)

where $a_i$ and $a_i^*$ are the Lagrange multipliers that fulfil the conditions $a_i \ge 0$, $a_i^* \ge 0$, and $\sum_{i=1}^{N}\left(a_i - a_i^*\right) = 0$, and $K(x_i, x_j)$ is a general kernel function. In this study, the well-known radial basis function (RBF) is chosen as the kernel function:

Computational Intelligence and Neuroscience 7

$$K\left(x_i, x_j\right) = \exp\left(-\frac{\left\|x_i - x_j\right\|^2}{2\sigma^2}\right), \qquad (12)$$

where σ defines the RBF kernel width [15]
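For concreteness, the prediction side of equations (11) and (12) can be sketched in a few lines of NumPy; the support vectors, the multiplier differences $(a_i - a_i^*)$, and the bias are assumed to come from a QP solver (in practice a library such as scikit-learn provides them after fitting):

```python
import numpy as np

def rbf_kernel(xi, xj, sigma=1.0):
    """Eq. (12): RBF kernel with width sigma."""
    return np.exp(-np.sum((xi - xj) ** 2) / (2.0 * sigma ** 2))

def svr_predict(x, support_vectors, coeffs, b, sigma=1.0):
    """Eq. (11): f(x) = sum_i (a_i - a_i*) K(x_i, x) + b.
    coeffs[i] stands for the difference (a_i - a_i*)."""
    return sum(c * rbf_kernel(sv, x, sigma)
               for sv, c in zip(support_vectors, coeffs)) + b

# Illustrative call with two hypothetical support vectors
f_x = svr_predict(np.array([3.8]),
                  [np.array([3.5]), np.array([4.2])],
                  [0.6, -0.4], b=0.1, sigma=0.5)
```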

2.5. ConvLSTM Algorithm. ConvLSTM is designed to be trained on spatial information in the dataset, and its aim is to deal with 3-dimensional data as input. Furthermore, it replaces matrix multiplication with a convolution operation at every gate of the LSTM cell. By doing so, it has the ability to capture the underlying spatial features in multidimensional data. The formulas used at each of the gates (input, forget, and output) are as follows:

$$i_t = \sigma\left(W_{xi} * x_t + W_{hi} * h_{t-1} + b_i\right),$$
$$f_t = \sigma\left(W_{xf} * x_t + W_{hf} * h_{t-1} + b_f\right),$$
$$o_t = \sigma\left(W_{xo} * x_t + W_{ho} * h_{t-1} + b_o\right),$$
$$C_t = f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * x_t + W_{hc} * h_{t-1} + b_c\right),$$
$$H_t = o_t \circ \tanh\left(C_t\right), \qquad (13)$$

where $i_t$, $f_t$, and $o_t$ are the input, forget, and output gates, $W$ is the weight matrix, $x_t$ is the current input data, $h_{t-1}$ is the previous hidden output, and $C_t$ is the cell state.

The difference from the standard LSTM equations is that the matrix multiplication ($\cdot$) is substituted by the convolution operation ($*$) between $W$ and each of $x_t$ and $h_{t-1}$ at every gate. By doing so, the fully connected layer is replaced by a convolutional layer; thus, the number of weight parameters in the model can be significantly reduced.
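A toy one-dimensional NumPy sketch of a single ConvLSTM cell update per equation (13); the hand-built 1-D kernels and zero biases here are illustrative assumptions (the real model learns multichannel kernels during training):

```python
import numpy as np

def conv1d_same(w, x):
    """'Same'-padded 1-D convolution, standing in for the * operator in eq. (13)."""
    return np.convolve(x, w, mode="same")

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def convlstm_step(x_t, h_prev, c_prev, W):
    """One ConvLSTM cell update following eq. (13); W maps gate names to kernels/biases."""
    i_t = sigmoid(conv1d_same(W["xi"], x_t) + conv1d_same(W["hi"], h_prev) + W["bi"])
    f_t = sigmoid(conv1d_same(W["xf"], x_t) + conv1d_same(W["hf"], h_prev) + W["bf"])
    o_t = sigmoid(conv1d_same(W["xo"], x_t) + conv1d_same(W["ho"], h_prev) + W["bo"])
    c_t = f_t * c_prev + i_t * np.tanh(conv1d_same(W["xc"], x_t)
                                       + conv1d_same(W["hc"], h_prev) + W["bc"])
    h_t = o_t * np.tanh(c_t)                 # H_t = o_t ∘ tanh(C_t)
    return h_t, c_t

# Example: length-4 input "row", identity-like kernel, zero biases
k = np.array([0.0, 1.0, 0.0])
W = {g: k for g in ("xi", "hi", "xf", "hf", "xo", "ho", "xc", "hc")}
W.update({"bi": 0.0, "bf": 0.0, "bo": 0.0, "bc": 0.0})
h1, c1 = convlstm_step(np.array([1.0, 2.0, 3.0, 4.0]), np.zeros(4), np.zeros(4), W)
```

Note how the gate arithmetic is identical to a plain LSTM; only the weight application changes from matrix multiplication to convolution.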

3. Methodology

Due to the nonlinear, nonstationary attributes and the stochastic variations of the wind speed time series, accurate prediction of wind speed is known to be challenging [29]. In this work, to improve the accuracy of the wind speed forecasting model, a comparison between five models is conducted to forecast wind speed considering the available historical data. A new concept called multi-lags-one-step (MLOS) ahead forecasting is employed to illustrate its effect on the five models' accuracies. Assume that we are at time index $X_t$. To forecast one output element in the future, $X_{t+1}$, the input dataset can be split into many lags (past data) $X_{t-I}$, where $I \in \{1, \ldots, 10\}$. By doing so, the model can be trained on more elements before predicting a single event in the future. In addition, the model accuracy improves until it reaches the optimum lagging point, which has the best accuracy; beyond this point, the model accuracy degrades, as will be illustrated in the Results section.
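The MLOS lag construction above can be sketched as a simple sliding window over the series (a minimal illustration, not the paper's exact implementation):

```python
import numpy as np

def make_mlos_arrays(series, n_lags):
    """Build multi-lags-one-step (MLOS) pairs: each input row holds n_lags
    consecutive past values X_{t-n_lags}, ..., X_{t-1}; the target is X_t."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])
        y.append(series[t])
    return np.array(X), np.array(y)

X, y = make_mlos_arrays(np.arange(10.0), n_lags=4)
# X[0] is [0, 1, 2, 3] and y[0] is 4.0: four lags predict the next value
```

Sweeping `n_lags` from 1 to 10 and refitting each model reproduces the lag-optimization loop described here.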

Figure 1 illustrates the workflow of the forecasting model. Specifically, the proposed methodology entails four steps.

In Step 1, the data have been collected and averaged from 5 minutes to 30 minutes and to 1 hour, respectively. The datasets are then standardized to have a mean value of 0 and a standard deviation of 1. The lagging stage in Step 2 is very important, as the data are split into different lags to study the effect of training the models on more than one element (input) to predict a single event in the future. In Step 3, the models are applied, taking into consideration that some models, such as CNN, LSTM, and ConvLSTM, need to be adjusted from a matrix-shape perspective; these models normally work with 2D or higher-dimensional inputs, so manipulation and reshaping of the matrices are conducted in this stage. For the sake of checking and evaluating the proposed models, in Step 4, three main metrics are used to validate the case study (MAE, RMSE, and R²). In addition, the execution time and optimum lag are taken into account to select the best model.
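The Step 1 preprocessing (block-averaging the 5-minute series, then standardizing) can be sketched as follows; the factor values are assumptions consistent with the stated resampling (6 five-minute samples per 30 minutes, 12 per hour):

```python
import numpy as np

def preprocess(speeds_5min, factor):
    """Block-average the 5-minute series (factor=6 -> 30-minute means,
    factor=12 -> hourly means), then standardize to zero mean, unit std."""
    n = len(speeds_5min) // factor * factor          # drop the ragged tail
    avg = speeds_5min[:n].reshape(-1, factor).mean(axis=1)
    return (avg - avg.mean()) / avg.std()

z = preprocess(np.array([1.0, 1.0, 1.0, 3.0, 3.0, 3.0]), factor=3)
# averaged blocks [1.0, 3.0] standardize to [-1.0, 1.0]
```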

Algorithm 1 illustrates the training procedure for ConvLSTM.

4. Collected Data

Table 3 illustrates the characteristics of the collected data at a 5 minutes time span. The data are collected from a real wind speed dataset over a three-year period from the West Texas Mesonet, with a 5-minute observation period, from near Lake Alan Garza [30]. The data are processed by averaging from 5 minutes to 30 minutes (whose statistical characteristics are given in Table 4) and one more time to 1 hour (whose statistical characteristics are given in Table 5). The goal of the averaging is to study the effect of reducing the data size in order to compare the five models and then select the one that can achieve the highest accuracy for the three dataset cases. As shown in the three tables, the datasets are almost identical and preserve their seasonality; they are also not affected by the averaging process.

The data have been split into three sets (training, validation, and test) with fractions of 53%, 14%, and 33%, respectively.
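A sketch of the 53/14/33 split; the chronological (unshuffled) ordering here is an assumption, since the paper does not state how the boundaries are drawn:

```python
import numpy as np

def three_way_split(data, fracs=(0.53, 0.14, 0.33)):
    """Split a series into contiguous training/validation/test blocks
    using the 53/14/33 fractions from the paper."""
    n = len(data)
    i = int(fracs[0] * n)
    j = i + int(fracs[1] * n)
    return data[:i], data[i:j], data[j:]

train, val, test = three_way_split(np.arange(100))
# len(train), len(val), len(test) == 53, 14, 33
```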

5. Results and Discussion

To quantitatively evaluate the performance of the predictive models, three commonly used statistical measures are tested [20]. All of them measure the deviation between the actual and predicted wind speed values. Specifically, RMSE, MAE, and R² are as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2},$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|,$$

$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i - \overline{y}\right)^2}, \qquad (14)$$
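These three measures translate directly into NumPy (a plain transcription of equation (14), shown here for reference):

```python
import numpy as np

def rmse(y, y_hat):
    """Root mean square error."""
    return np.sqrt(np.mean((y - y_hat) ** 2))

def mae(y, y_hat):
    """Mean absolute error."""
    return np.mean(np.abs(y - y_hat))

def r2(y, y_hat):
    """Coefficient of determination (goodness of fit)."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.0, 2.0, 5.0])
# rmse ~ 1.155, mae ~ 0.667, r2 == -1.0 for this deliberately poor prediction
```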

where $y_i$ and $\hat{y}_i$ are the actual and the predicted wind speed, respectively, while $\overline{y}$ is the mean value of the actual wind speed sequence. Typically, smaller values of RMSE and MAE indicate an improved forecasting procedure, while $R^2$ is the goodness-of-fit measure for the model; therefore, the larger its value, the better the model fits. The testbed environment configuration is as follows:


Figure 1: The proposed forecasting methodology. Raw time-series data are collected and preprocessed (averaged to 5 min, 30 min, and 1 hr), split into lags L1, L2, ..., L10, passed to the five models (ANN, CNN, LSTM, ConvLSTM, and SVR), and the forecasting results are evaluated (MSE, RMSE, MAE, R², delay, and execution time) to select the best-fitted model.

Input: the wind speed time series data
Output: forecasting performance indices

(1) The wind speed time series data are measured every 5 minutes and averaged twice, to 30 minutes and to 1 hour, respectively
(2) The wind datasets are split into training, validation, and test sets
(3) Initiate the multi-lags-one-step (MLOS) arrays for the training, validation, and test sets
(4) Define the MLOS range as 1, ..., 10 to optimize the number of needed lags
(5) loop 1:
      Split the first set based on the MLOS range
      Initiate and extract the set features with a CNN layer
      Pass the output to a defined LSTM layer
      Select the first value in the range of numbers of hidden neurons
      Generate the prediction results (performance indices)
      Count the time needed to execute and produce the prediction results
      Save and compare the results with the previous ones
    loop 2:
      Select the next MLOS value
      If the MLOS value = maximum of the range, then go to loop 3 and initialize the MLOS range
      go to loop 1
    loop 3:
      Select a new number of hidden neurons
      If the number of hidden neurons = maximum of the range, then go to loop 4 and initialize the hidden-neuron range
      go to loop 1
(6) loop 4:
      Select a new dataset from the sets (5 min, 30 min, and 1 hr)
      go to loop 1
      If the set index = maximum of the range, then
        Generate the performance indices of all tested sets
        Select the best results (metrics)

Algorithm 1: ConvLSTM training.


(1) CPU: Intel(R) Core(TM) i7-8550U CPU @ 1.80 GHz (2001 MHz), 4 core(s), 8 logical processor(s)
(2) RAM: installed physical memory 16.0 GB
(3) GPU: AMD Radeon(TM) RX 550, 1.0 GB
(4) Framework: Anaconda 2019.07, Python 3.7

Table 6 illustrates the chosen optimized internal parameters (hyperparameters) for the forecasting methods used in this work. For each method, the optimal number of hidden neurons is chosen to achieve the maximum R² and the minimum RMSE and MAE values.

After implementing CNN, ANN, LSTM, ConvLSTM, and SVM, the most fitted model was chosen depending on its accuracy in predicting future wind speed values; thus, the seasonality is considered in the forecast mechanism. The chosen model has to deliver the best fit to the data with the least amount of error, taking into consideration the nature of the data and not applying naive forecasting to it.

To achieve this goal, the statistical error indicators are calculated for every model and time lapse and are fully represented in Figure 2. The provided results suggest that the ConvLSTM model has the best performance as compared to the other four models. The chosen model has to reach the minimum RMSE and MAE values and the maximum R² value.

Different parameters are also tested to ensure the right decision in choosing the best-fitted model. The optimum number of lags, presented in Table 7, is one of the most important indicators in selecting the best-fitted model, since the fewer historical points the model needs, the lower the computational effort. For each method, the optimal number of lags is chosen to achieve the maximum R² and the minimum RMSE and MAE values. For instance, Figures 3 and 4 show the relation between the statistical measures and the number of hidden neurons and the number of lags, respectively, for the proposed ConvLSTM method in the 5 minutes time span case. It can be seen that 4 lags and 15 hidden neurons achieved the maximum R² and the minimum RMSE and MAE values.

The execution time shown in Table 8 is calculated for each method and time lapse to assure that the final chosen model is efficient and can effectively predict future wind speed. The shorter the execution time, the more efficient and helpful the model will be; this is also a sign that the model is suited to further modifications. According to Table 8, the ConvLSTM model beats all other models in the time needed to process the historical data and deliver a final prediction. SVM needed 54 minutes to accomplish the training and produce the testing results, while ConvLSTM made it in just 1.7 minutes. This huge difference between them motivated the choice of ConvLSTM.

Table 3: Dataset characteristics for the 5 min sample.

Dataset            Max     Median  Min    Mean   Std
All datasets       18.73   3.53    0.01   3.91   2.10
Training dataset   18.73   3.47    0.01   3.83   2.05
Test dataset       14.87   3.67    0.01   4.05   2.20

Table 4: Dataset characteristics for the 30 min sample.

Dataset            Max     Median  Min    Mean      Std
All datasets       17.66   3.53    0.01   3.91      2.08
Training dataset   17.66   3.47    0.01   3.837309  2.02
Test dataset       14.32   3.67    0.02   4.05      2.18

Table 5: Dataset characteristics for the 1 hour sample.

Dataset            Max     Median  Min    Mean   Std
All datasets       17.61   3.53    0.07   3.91   2.05
Training dataset   17.61   3.46    0.07   3.83   2.00
Test dataset       14.22   3.66    0.07   4.05   2.15

Table 6: Optimized internal parameters for the forecasting methods.

Method     5 min                     30 min                     1 hour
LSTM       17 hidden neurons         20 hidden neurons          8 hidden neurons
CNN        15 hidden neurons         5 hidden neurons           15 hidden neurons
ConvLSTM   15 hidden neurons         8 hidden neurons           20 hidden neurons
ANN        15 hidden neurons         15 hidden neurons          20 hidden neurons
SVR        C = 7, ε = 0.1, γ = 0.2   C = 1, ε = 0.25, γ = 0.15  C = 1, ε = 0.1, γ = 0.05


Figure 5 shows that the 5-minute lapse dataset is the most fitted dataset to our chosen model. It demonstrates how accurate the prediction of future wind speed will be.

For completeness, to effectively evaluate the investigated forecasting techniques in terms of their prediction accuracies, a 50-trial cross-validation procedure is carried out, in which the investigated techniques are built and then evaluated on 50 different training and test datasets, respectively, randomly sampled from the available overall dataset. The ultimate performance metrics are then reported as the average and standard deviation of the 50 metrics obtained across the cross-validation trials. In this regard, Figure 6 shows the average performance metrics on the test dataset using the 50-trial cross-validation procedure. It can be easily recognized that the forecasting models that employ the LSTM technique outperform the other investigated techniques in terms of the three performance metrics: R², RMSE, and MAE.
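This repeated random-resampling evaluation can be sketched as follows; `model_fit_predict` is a hypothetical stand-in for fitting one of the five models on the sampled training set and predicting on the held-out set:

```python
import numpy as np

def repeated_cv_rmse(series, model_fit_predict, n_trials=50, test_frac=0.33, seed=0):
    """Repeatedly draw random train/test splits, refit, and report the
    mean and standard deviation of the test RMSE over n_trials runs."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_trials):
        idx = rng.permutation(len(series))
        n_test = int(test_frac * len(series))
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        pred = model_fit_predict(series[train_idx], series[test_idx])
        scores.append(np.sqrt(np.mean((series[test_idx] - pred) ** 2)))
    return np.mean(scores), np.std(scores)

# Illustration with a train-mean baseline "model"
mean_rmse, std_rmse = repeated_cv_rmse(
    np.sin(np.arange(200) / 10.0),
    lambda tr, te: np.full(len(te), tr.mean()),
    n_trials=10)
```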

From the experimental results of short-term wind speed forecasting shown in Figure 6, we can observe that ConvLSTM performs the best in terms of the forecasting metrics (R², RMSE, and MAE) as compared to the other models (i.e., CNN, ANN, SVR, and LSTM). The related statistical results in Tables 6 and 7, respectively, have demonstrated the effectiveness of ConvLSTM and its capability of handling noisy, large data. ConvLSTM showed that it can produce highly accurate wind speed predictions with fewer lags and hidden neurons. This was indeed reflected in the results shown in Table 8, with less computation time as compared to the other tested models. Furthermore, we introduced multi-lags-one-step (MLOS) ahead forecasting combined with the hybrid ConvLSTM model to provide efficient generalization to new time series data and predict wind speed accurately. The results showed that the ConvLSTM model proposed in this paper is an effective and promising model for wind speed forecasting.

Figure 2: Models' key performance indicators (KPIs).

(a) R²
Time span  CNN     ANN     LSTM    SVM     ConvLSTM
5 min      0.9873  0.9873  0.9876  0.9815  0.9876
30 min     0.9091  0.9089  0.9097  0.9078  0.9098
1 hour     0.8309  0.8302  0.8315  0.8211  0.8316

(b) RMSE
Time span  CNN     ANN     LSTM    SVM     ConvLSTM
5 min      0.2482  0.2483  0.2424  0.2462  0.2419
30 min     0.6578  0.6588  0.6467  0.6627  0.6463
1 hour     0.8862  0.8881  0.8718  0.9116  0.8717

(c) MAE
Time span  CNN     ANN     LSTM    SVM     ConvLSTM
5 min      0.1806  0.1798  0.1758  0.1855  0.1755
30 min     0.4802  0.4799  0.4721  0.4818  0.4741
1 hour     0.6468  0.6473  0.6392  0.6591  0.6369

Table 7: Optimum number of lags.

Method     5 min  30 min  1 hour
CNN        9      4       4
ANN        3      3       3
LSTM       7      6       10
ConvLSTM   4      5       8
SVR        9      4       5


Figure 3: ConvLSTM measured statistical values ((a) R-squared, (b) RMSE, (c) MAE) versus the number of hidden neurons.

Figure 4: ConvLSTM measured statistical values ((a) R-squared, (b) RMSE, (c) MAE) versus the number of lags.


Similar to our work, the EnsemLSTM model proposed by Chen et al. [15] contained different clusters of LSTMs with different hidden layers and hidden neurons. They combined the LSTM clusters with SVR and an external optimizer in order to enhance the generalization capability and robustness of their model. However, their model showed high computational complexity with mediocre performance indices. Our proposed ConvLSTM with MLOS assures boosted generalization and robustness for new time series data, as well as high performance indices.

6. Conclusions

In this study, we proposed a hybrid deep learning-based framework, ConvLSTM, for short-term prediction of the wind

Table 8: Execution time (minutes).

Method     5 min     30 min   1 hour
ConvLSTM   1.7338    0.3849   0.1451
SVR        54.1424   0.8214   0.2250
CNN        0.87828   0.1322   0.0708
ANN        0.7431    0.2591   0.0587
LSTM       1.6570    0.3290   0.1473

Figure 5: ConvLSTM true and predicted wind speed (mph) versus the number of samples, for the 5 min time span.

Figure 6: Average performance metrics ((a) R², (b) RMSE, (c) MAE) obtained on the test dataset using the 50-trial cross-validation procedure.


speed time series measurements. The proposed dynamic prediction model was optimized for the number of input lags and the number of internal hidden neurons. Multi-lags-one-step (MLOS) ahead wind speed forecasting using the proposed approach showed superior results compared to four other models built using standard ANN, CNN, LSTM, and SVM approaches. The proposed modeling framework combines the benefits of CNN and LSTM networks in a hybrid modeling scheme that delivers highly accurate wind speed predictions with fewer lags and hidden neurons, as well as lower computational complexity. For future work, further investigation can be done to improve the accuracy of the ConvLSTM model, for instance, by increasing and optimizing the number of hidden layers, applying multi-lags-multi-steps (MLMS) ahead forecasting, and introducing a reinforcement learning agent to optimize the parameters as compared to other optimization methods.

Data Availability

The wind speed data used in this study have been taken from the West Texas Mesonet of the US National Wind Institute (http://www.depts.ttu.edu/nwi/research/facilities/wtm/index.php). Data are provided freely for academic research purposes only and cannot be shared/distributed beyond academic research use without permission from the West Texas Mesonet.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors acknowledge the help of Prof. Brian Hirth at Texas Tech University for providing them with access to the weather data.

References

[1] R. Rothermel, "How to predict the spread and intensity of forest and range fires," General Technical Report INT-143, Intermountain Forest and Range Experiment Station, Ogden, UT, USA, 1983.

[2] A. Tascikaraoglu and M. Uzunoglu, "A review of combined approaches for prediction of short-term wind speed and power," Renewable and Sustainable Energy Reviews, vol. 34, pp. 243–254, 2014.

[3] R. J. Barthelmie, F. Murray, and S. C. Pryor, "The economic benefit of short-term forecasting for wind energy in the UK electricity market," Energy Policy, vol. 36, no. 5, pp. 1687–1696, 2008.

[4] Y.-K. Wu and H. Jing-Shan, A Literature Review of Wind Forecasting Technology in the World, pp. 504–509, IEEE Lausanne Powertech, Lausanne, Switzerland, 2007.

[5] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," The Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115–133, 1943.

[6] K. Fukushima, "Neocognitron," Scholarpedia, vol. 2, no. 1, p. 1717, 2007.

[7] K. Fukushima, "Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics, vol. 36, no. 4, pp. 193–202, 1980.

[8] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.

[9] H. T. Siegelmann and E. D. Sontag, "On the computational power of neural nets," Journal of Computer and System Sciences, vol. 50, no. 1, pp. 132–150, 1995.

[10] R. Sunil, "Understanding support vector machine algorithm," https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code, 2017.

[11] X. Shi, Z. Chen, H. Wang, D. Y. Yeung, W. K. Wong, and W. C. Woo, "Convolutional LSTM network: a machine learning approach for precipitation nowcasting," in Proceedings of the Neural Information Processing Systems, pp. 7–12, Montreal, Canada, December 2015.

[12] C. Finn, I. Goodfellow, and S. Levine, "Unsupervised learning for physical interaction through video prediction," in Proceedings of the Annual Conference on Neural Information Processing Systems, pp. 64–72, Barcelona, Spain, December 2016.

[13] H. Xu, Y. Gao, F. Yu, and T. Darrell, "End-to-end learning of driving models from large-scale video datasets," in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3530–3538, Honolulu, HI, USA, July 2017.

[14] K. Chen and J. Yu, "Short-term wind speed prediction using an unscented Kalman filter based state-space support vector regression approach," Applied Energy, vol. 113, pp. 690–705, 2014.

[15] J. Chen, G.-Q. Zeng, W. Zhou, W. Du, and K.-D. Lu, "Wind speed forecasting using nonlinear-learning ensemble of deep learning time series prediction and extremal optimization," Energy Conversion and Management, vol. 165, pp. 681–695, 2018.

[16] D. Liu, D. Niu, H. Wang, and L. Fan, "Short-term wind speed forecasting using wavelet transform and support vector machines optimized by genetic algorithm," Renewable Energy, vol. 62, pp. 592–597, 2014.

[17] Y. Wang, "Short-term wind power forecasting by genetic algorithm of wavelet neural network," in Proceedings of the 2014 International Conference on Information Science, Electronics and Electrical Engineering, pp. 1752–1755, Sapporo, Japan, April 2014.

[18] N. Sheikh, D. B. Talukdar, R. I. Rasel, and N. Sultana, "A short term wind speed forcasting using SVR and BP-ANN: a comparative analysis," in Proceedings of the 2017 20th International Conference of Computer and Information Technology (ICCIT), pp. 1–6, Dhaka, Bangladesh, December 2017.

[19] H. Nantian, C. Yuan, G. Cai, and E. Xing, "Hybrid short term wind speed forecasting using variational mode decomposition and a weighted regularized extreme learning machine," Energies, vol. 9, no. 12, pp. 989–1007, 2016.

[20] S. Haijian and X. Deng, "AdaBoosting neural network for short-term wind speed forecasting based on seasonal characteristics analysis and lag space estimation," Computer Modeling in Engineering & Sciences, vol. 114, no. 3, pp. 277–293, 2018.

[21] W. Xiaodan, "Forecasting short-term wind speed using support vector machine with particle swarm optimization," in Proceedings of the 2017 International Conference on Sensing, Diagnostics, Prognostics and Control (SDPC), pp. 241–245, Shanghai, China, August 2017.


[22] F. R. Ningsih, E. C. Djamal, and A. Najmurrakhman, "Wind speed forecasting using recurrent neural networks and long short term memory," in Proceedings of the 2019 6th International Conference on Instrumentation, Control and Automation (ICA), pp. 137–141, Bandung, Indonesia, July–August 2019.

[23] P. K. Datta, "An artificial neural network approach for short-term wind speed forecast," Master's thesis, Kansas State University, Manhattan, KS, USA, 2018.

[24] J. Guanlong, D. Li, L. Yao, and P. Zhao, "An improved artificial bee colony-BP neural network algorithm in the short-term wind speed prediction," in Proceedings of the 2016 12th World Congress on Intelligent Control and Automation (WCICA), pp. 2252–2255, Guilin, China, June 2016.

[25] C. Gonggui, J. Chen, Z. Zhang, and Z. Sun, "Short-term wind speed forecasting based on fuzzy C-means clustering and improved MEA-BP," IAENG International Journal of Computer Science, vol. 46, no. 4, pp. 27–35, 2019.

[26] W. ZhanJie and A. B. M. Mazharul Mujib, "The weather forecast using data mining research based on cloud computing," Journal of Physics: Conference Series, vol. 910, no. 1, Article ID 012020, 2017.

[27] A. Zhu, X. Li, Z. Mo, and R. Wu, "Wind power prediction based on a convolutional neural network," in Proceedings of the 2017 International Conference on Circuits, Devices and Systems (ICCDS), pp. 131–135, Chengdu, China, September 2017.

[28] O. Linda, T. Vollmer, and M. Manic, "Neural network based intrusion detection system for critical infrastructures," in Proceedings of the 2009 International Joint Conference on Neural Networks, pp. 1827–1834, Atlanta, GA, USA, June 2009.

[29] C. Li, Z. Xiao, X. Xia, W. Zou, and C. Zhang, "A hybrid model based on synchronous optimisation for multi-step short-term wind speed forecasting," Applied Energy, vol. 215, pp. 131–144, 2018.

[30] National Wind Institute, West Texas Mesonet, online referencing, http://www.depts.ttu.edu, 2018.


Table 1: Continued. (Columns: Aim; Technique; Merits/outcomes; Demerits; Dataset.)

Short-term wind speed prediction [24]
  Technique: backpropagation (BP) neural network based on an improved artificial bee colony algorithm (ABC-BP)
  Merits/outcomes: the model has high precision and a fast convergence rate compared with traditional and genetic BP neural networks
  Demerits: sensitive to noisy data; therefore, the data should be filtered, which may affect their nature
  Dataset: wind farm in Tianjin, China (December 2013–January 2014)

Short-term wind speed forecasting [25], 2019
  Technique: fuzzy C-means clustering (FCM) and an improved mind evolutionary algorithm-BP (IMEA-BP)
  Merits/outcomes: the proposed model is suitable for one-step forecasting and enhances the accuracy of multistep forecasting
  Demerits: the accuracy of multistep forecasting needs to be further improved
  Dataset: wind farm in China

Predicting wind speed [26]
  Technique: artificial neural network and decision tree algorithms
  Merits/outcomes: the platform has the ability of mass storage of meteorological data and efficient query and analysis for weather forecasting
  Demerits: needs improvement in order to forecast more realistic weather parameters
  Dataset: meteorological data provided by the Dalian Meteorological Bureau (2011–2015)

Our scheme
  Technique: employing the multi-lags-one-step (MLOS) ahead forecasting technique with artificial learning-based algorithms
  Merits/outcomes: the provided results suggest that the ConvLSTM model has the best performance as compared to the ANN, CNN, LSTM, and SVM models
  Demerits: increasing the number of hidden layers may increase the computational time exponentially
  Dataset: National Wind Institute, West Texas Mesonet (2012–2015)

Table 2: Acronyms and notations used.

Acronyms:
ANN: artificial neural network
CNN: convolutional neural network
LSTM: long short-term memory
ConvLSTM: convolutional LSTM hybrid model
SVM: support vector machine
RE: renewable energy
RNN: recurrent neural network
EWT: empirical wavelet transformation
ENN: Elman neural network
FC-LSTM: fully connected long short-term memory
FCN-LSTM: long short-term memory fully convolutional network
TSP: time-series prediction
UKF: unscented Kalman filter
SVR: support vector regression
SVRM: support vector regression machine
EO: extremal optimization
MAE: mean absolute error
RMSE: root mean square error
MAPE: mean absolute percentage error
R2: R-squared
WT: wavelet transform
GA: genetic algorithm
GAWNN: genetic algorithm of wavelet neural network
WNN: wavelet neural network
MLOS: multi-lags-one-step
VGP: vanishing gradient problem
LM: Levenberg–Marquardt
RBF: radial basis function


2.2. CNN Algorithms. CNN is a feed-forward neural network. To achieve network architecture optimization and solve for the unknown parameters in the network, the attributes of a two-dimensional image are extracted and backpropagation algorithms are implemented. To achieve the final outcome, the sampled data are fed into the network to extract the needed attributes with pre-refining. Next, classification or regression is applied [27].

The CNN is basically composed of two types of layers: the convolutional and the pooling layers. The neurons are locally connected between the convolutional layer and the preceding layer. Meanwhile, the neurons' local attributes are obtained.

Table 2: Continued.

Notations:
$f_t$: forget gate
$C_t$: the cell state
$i_t$: input gate
$x_t$: current input data
$h_{t-1}$: the previous hidden output
$\tilde{c}_t$: input to cell $c$
$c$: memory cell
$C_{t-1}$: past cell status
$O_t$: output gate
$h_t$: hidden state
$\cdot$: matrix multiplication
$\odot$: elementwise multiplication
$\theta$: weight
$\sigma$: nonlinear function
$z_j$: the $j$th hidden neuron
$p$: number of inputs to the network
$m$: number of hidden neurons
$w_{ij}$: the connection weight from the $i$th input node to the $j$th hidden node
$y_{k-i}$: the wind speed $i$ steps in the past
$f_h(\cdot)$: the activation function in the hidden layer
$w_j$: the connection weight from the $j$th hidden node to the output node
$\hat{y}_k$: the predicted wind speed at the $k$th sampling moment
$f_o$: the activation function for the output layer
$y_k$: actual wind speed
$x_i$: input vector
$y_i$: output vector
$R^m$: the $m$-dimensional real space
$f(x)$: a function that describes the correlation between inputs and outputs
$\phi(x)$: preknown mapping function
$R[f]$: structural risk
$w$: the regression coefficient vector
$b$: bias term
$C$: punishment coefficient
$L(x_i, y_i, f(x_i))$: the $\varepsilon$-insensitive loss function
$\varepsilon$: threshold
$\zeta_i, \zeta_i^*$: slack variables that make the constraints feasible
$a_i, a_i^*$: the Lagrange multipliers
$K(x_i, x_j)$: the kernel function
$W$: the weight matrix
$*$: convolution operation
$b_i, b_f, b_c$: bias vectors
$\circ$: Hadamard product
$H_t$: hidden state
$X_t$: current wind speed measure
$X_{t-1}$: previous wind speed measure
$X_{t+1}$: future wind speed measure


The local sensitivity is found through the pooling layer to obtain the attributes repeatedly. The presence of the convolution and pooling layers reduces the attribute resolution and the number of network parameters that require optimization.

CNN typically describes data and constructs them as a two-dimensional array and is extensively utilized in the area of image processing. In this paper, the CNN algorithm is configured to predict the wind speed and fitted to process a one-dimensional array of data. In the preprocessing phase, the one-dimensional data are reconstructed into a two-dimensional array; this enables the CNN algorithm to deal smoothly with the data. This creates two files, the property and the response files, which are delivered as inputs to the CNN. The response file also contains the data of the expected output value.

Each sample is represented by a line from the property and the response files. Weights and biases can be obtained as soon as an acceptable number of samples to train the CNN is delivered. The training continues by comparing the regression results with the response values in order to reach the minimum possible error. This delivers the final trained CNN model, which is utilized to achieve the needed predictions.

The fitting mechanism of CNN is pooling. Various computational approaches have shown that two kinds of pooling can be used: average pooling and maximum pooling. Images are stationary, and all parts of an image share similar attributes; therefore, the pooling approach applies the same average or maximum calculation to every part of the high-resolution image. The pooling process leads to a reduction in the statistical dimensions and an increase in the generalization strength of the model. The results are well optimized and have a lower possibility of overfitting.
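The two pooling variants described above can be sketched on a one-dimensional signal (a minimal, non-overlapping-window illustration):

```python
import numpy as np

def pool1d(x, size, mode="max"):
    """Non-overlapping 1-D pooling: reduces the attribute resolution by
    taking the max (or average) over each window of length `size`."""
    n = len(x) // size * size                 # drop any ragged tail
    windows = x[:n].reshape(-1, size)
    return windows.max(axis=1) if mode == "max" else windows.mean(axis=1)

pool1d(np.array([1.0, 3.0, 2.0, 8.0]), 2)          # max pooling -> [3., 8.]
pool1d(np.array([1.0, 3.0, 2.0, 8.0]), 2, "avg")   # average pooling -> [2., 5.]
```

Either way, the output is half the length of the input here, which is exactly the resolution reduction the text refers to.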

2.3. ANN Algorithms. An ANN is built up of three layers: the input, hidden, and output layers. These layers have the ability to correlate an input vector to an output scalar or vector using activation functions in the various neurons. The $j$th hidden neuron $z_j$ can be computed from the $p$ inputs and $m$ hidden neurons using the following equation [14]:

$$z_j = f_h\left(\sum_{i=1}^{p} w_{ij} y_{k-i}\right), \qquad (7)$$

where wij is the connection weight from the ith input node tothe jth hidden node ykminusi is i-step behind previous windspeed and fh() is the activation function in the hiddenlayer )erefore the future wind speed can be predictedthrough

1113954yk fo 1113944

m

j1wjzj

⎛⎝ ⎞⎠ (8)

where wj is the connection weight from the jth hidden nodeto the output node and 1113954yk is the predicted wind speed at thekth sampling moment while f0 is the activation function forthe output layer By minimizing the error between the actualand the predicted wind speeds yk and 1113954yk respectively usingLevenbergndashMarquardt (LM) algorithm the nonlinearmapping efficiency of ANN can be obtained [28]

24 SVM Algorithms Assuming a set of samples xi yi1113864 1113865where i 1 2 N with input vector xi isin Rm and out-put vectoryi isin Rm )e regression obstacles aim to identifya function f(x)that describes the correlation between inputsand outputs )e interest of SVR is to obtain a linear re-gression in the high-dimensional feature space delivered bymapping the primary input set utilizing a preknown func-tion ϕ(x())and to minimize the structure riskR[f] )ismechanism can be written as follows [15]

f(x) wTϕ(x) + b

R[f] 12W

2+ C 1113944

N

i1L xiyif xi( )1113874 1113875

(9)

where W b and Crespectively are the regression coefficientvector bias term and punishment coefficient L(xi yi f(xi)

)

is the e-insensitive loss function)e regression problem canbe handled by the following constrained optimizationproblem

$$\min \;\; \frac{1}{2}\|w\|^{2} + C \sum_{i=1}^{N} \left(\zeta_i + \zeta_i^{*}\right)$$
$$\text{s.t.} \quad y_i - \left(w^{T}\phi(x_i) + b\right) \le \varepsilon + \zeta_i, \qquad \left(w^{T}\phi(x_i) + b\right) - y_i \le \varepsilon + \zeta_i^{*}, \qquad \zeta_i, \zeta_i^{*} \ge 0, \; i = 1, 2, \ldots, N \qquad (10)$$

where ζ_i and ζ_i* are slack variables that keep the constraints feasible. Using Lagrange multipliers, the regression function can be written as follows:

$$f(x) = \sum_{i=1}^{N} \left(a_i - a_i^{*}\right) K\left(x_i, x_j\right) + b \qquad (11)$$

where a_i and a_i* are the Lagrange multipliers, which satisfy the conditions a_i ≥ 0, a_i* ≥ 0, and ∑_{i=1}^{N}(a_i − a_i*) = 0, and K(x_i, x_j) is a general kernel function. In this study, the well-known radial basis function (RBF) is chosen as the kernel function:

Computational Intelligence and Neuroscience 7

$$K\left(x_i, x_j\right) = \exp\left(-\frac{\left\|x_i - x_j\right\|^{2}}{2\sigma^{2}}\right) \qquad (12)$$

where σ defines the RBF kernel width [15].
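Equations (11) and (12) translate directly into code. In the sketch below, the support vectors, multipliers, and bias are hypothetical placeholders standing in for the quantities a trained SVR would supply:

```python
import math

def rbf_kernel(xi, xj, sigma=1.0):
    # Equation (12): K(x_i, x_j) = exp(-||x_i - x_j||^2 / (2*sigma^2))
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def svr_predict(x, support_vectors, alphas, alphas_star, b, sigma=1.0):
    # Equation (11): f(x) = sum_i (a_i - a_i*) K(x_i, x) + b
    return b + sum((a - a_s) * rbf_kernel(sv, x, sigma)
                   for sv, a, a_s in zip(support_vectors, alphas, alphas_star))

# Toy usage with two placeholder support vectors and multipliers
svs = [[3.5, 3.7], [4.0, 4.2]]
f = svr_predict([3.8, 3.9], svs, alphas=[0.6, 0.3], alphas_star=[0.0, 0.2], b=3.9)
```

A solver for problem (10) (e.g., any quadratic-programming routine) would produce the multipliers; only the prediction step is shown here.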

2.5. ConvLSTM Algorithm. ConvLSTM is designed to learn the spatial information in a dataset and accepts three-dimensional data as input. It replaces the matrix multiplication with a convolution operation at every gate of the LSTM cell, which gives it the ability to capture the underlying spatial features in multidimensional data. The formulas used at each of the gates (input, forget, and output) are as follows:

$$i_t = \sigma\left(W_{xi} \ast x_t + W_{hi} \ast h_{t-1} + b_i\right)$$
$$f_t = \sigma\left(W_{xf} \ast x_t + W_{hf} \ast h_{t-1} + b_f\right)$$
$$o_t = \sigma\left(W_{xo} \ast x_t + W_{ho} \ast h_{t-1} + b_o\right)$$
$$C_t = f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} \ast x_t + W_{hc} \ast h_{t-1} + b_c\right)$$
$$H_t = o_t \circ \tanh\left(C_t\right) \qquad (13)$$

where i_t, f_t, and o_t are the input, forget, and output gates, W denotes the weight matrices, x_t is the current input data, h_{t−1} is the previous hidden output, and C_t is the cell state.

The difference from the standard LSTM equations is that the matrix multiplication (·) is substituted by the convolution operation (∗) between W and each of x_t and h_{t−1} at every gate. By doing so, the fully connected layer is replaced by a convolutional layer, and thus the number of weight parameters in the model can be significantly reduced.
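A minimal one-dimensional sketch of equation (13), with placeholder length-3 kernels and the Hadamard products written out explicitly (the actual model operates on higher-dimensional tensors inside a deep-learning framework):

```python
import math

def conv1d_same(w, x):
    """'Same'-padded 1-D convolution of an odd-length kernel w over sequence x."""
    k = len(w) // 2
    padded = [0.0] * k + list(x) + [0.0] * k
    return [sum(w[j] * padded[i + j] for j in range(len(w))) for i in range(len(x))]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def convlstm_step(x_t, h_prev, c_prev, Wx, Wh, b):
    """One ConvLSTM step over a 1-D spatial input, following equation (13).

    Wx, Wh : dicts of convolution kernels for gates 'i', 'f', 'o', 'c'
    b      : dict of per-gate scalar biases
    """
    def gate(name, squash):
        pre = conv1d_same(Wx[name], x_t)       # W_x* convolved with x_t
        rec = conv1d_same(Wh[name], h_prev)    # W_h* convolved with h_{t-1}
        return [squash(p + r + b[name]) for p, r in zip(pre, rec)]

    i_t = gate("i", sigmoid)    # input gate
    f_t = gate("f", sigmoid)    # forget gate
    o_t = gate("o", sigmoid)    # output gate
    g_t = gate("c", math.tanh)  # candidate cell input
    # C_t = f_t o C_{t-1} + i_t o tanh(...)   (o = Hadamard product)
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    # H_t = o_t o tanh(C_t)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Toy usage: spatial width 4, placeholder kernels and zero initial state
Wx = {k: [0.1, 0.2, 0.1] for k in "ifoc"}
Wh = {k: [0.05, 0.1, 0.05] for k in "ifoc"}
b = {k: 0.0 for k in "ifoc"}
h, c = convlstm_step([0.5, 1.0, 0.8, 0.3], [0.0] * 4, [0.0] * 4, Wx, Wh, b)
```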

3. Methodology

Due to the nonlinear, nonstationary attributes and the stochastic variations of the wind speed time series, accurate prediction of wind speed is known to be challenging [29]. In this work, to improve the accuracy of the wind speed forecasting model, a comparison between five models is conducted considering the available historical data. A concept called multi-lags-one-step (MLOS) ahead forecasting is employed to illustrate its effect on the five models' accuracies. Assume that we are at time index X_t. To forecast one output element in the future, X_{t+1}, the input dataset can be split into many lags (past data) X_{t−I}, where I ∈ 1–10. By doing so, the model can be trained on more elements before predicting a single event in the future. The model accuracy improved until it reached the optimum lagging point with the best accuracy; beyond this point, the accuracy degraded, as illustrated in the Results section.
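The MLOS framing described above can be sketched as a simple windowing routine (illustrative only; the actual pipeline additionally standardizes and reshapes the data):

```python
def mlos_split(series, n_lags):
    """Multi-lags-one-step (MLOS) framing: each sample is n_lags consecutive
    past values (inputs) paired with the next value (one-step-ahead target)."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])  # X_{t-n_lags}, ..., X_{t-1}
        y.append(series[t])             # X_t, the one-step-ahead target
    return X, y

# Toy usage: 3 lags over a short series
X, y = mlos_split([1, 2, 3, 4, 5, 6], n_lags=3)
# X = [[1, 2, 3], [2, 3, 4], [3, 4, 5]], y = [4, 5, 6]
```

Sweeping `n_lags` from 1 to 10 reproduces the lag-optimization loop used to find the optimum lagging point.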

Figure 1 illustrates the workflow of the forecasting model. Specifically, the proposed methodology entails four steps.

In Step 1, data are collected and averaged from 5 minutes to 30 minutes and to 1 hour, respectively. The datasets are then standardized to a mean of 0 and a standard deviation of 1. The lagging stage in Step 2 is very important, as the data are split into different lags to study the effect of training the models on more than one element (input) to predict a single event in the future. In Step 3, the models are applied, taking into consideration that some models, such as CNN, LSTM, and ConvLSTM, need to be adjusted from a matrix-shape perspective, as these models normally work with 2D data or more; in this stage, manipulation and reshaping of the matrices are conducted. In Step 4, three main metrics (MAE, RMSE, and R²) are used to check and evaluate the proposed models on the case study. In addition, the execution time and the optimum lag are taken into account to select the best model.
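The standardization in Step 1 is a plain z-score transform; a minimal sketch:

```python
import math

def standardize(series):
    """Z-score standardization: rescale to mean 0 and standard deviation 1."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    return [(v - mean) / std for v in series]

# Toy usage on a few wind speed values
z = standardize([3.1, 4.7, 2.8, 5.2, 3.9])
```

In a real pipeline the mean and standard deviation would be estimated on the training set only and reused on the validation and test sets.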

Algorithm 1 illustrates the training procedure for ConvLSTM.

4. Collected Data

Table 3 illustrates the characteristics of the data collected at a 5-minute time span. The data are taken from a real wind speed dataset covering a three-year period from the West Texas Mesonet, with a 5-minute observation period, recorded near Lake Alan Garza [30]. The data are processed by averaging from 5 minutes to 30 minutes (statistical characteristics given in Table 4) and once more to 1 hour (Table 5). The goal of averaging is to study the effect of reducing the data size in order to compare the five models and then select the one that achieves the highest accuracy across the three dataset cases. As shown in the three tables, the datasets are almost identical and retain their seasonality; they are not affected by the averaging process.
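The averaging step can be sketched as a block-mean downsampler (illustrative: a factor of 6 maps 5-minute data to 30-minute data, and a factor of 12 maps it to hourly data):

```python
def downsample_mean(series, factor):
    """Average non-overlapping blocks of `factor` consecutive samples."""
    n = len(series) - len(series) % factor  # drop any incomplete tail block
    return [sum(series[i:i + factor]) / factor for i in range(0, n, factor)]

# Toy usage: twelve 5-minute readings -> two 30-minute averages
wind_5min = [4.0, 4.2, 3.8, 4.1, 3.9, 4.0, 5.0, 5.2, 4.8, 5.1, 4.9, 5.0]
wind_30min = downsample_mean(wind_5min, factor=6)  # approx. [4.0, 5.0]
```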

The data have been split into three sets (training, validation, and test) with fractions of 53%, 14%, and 33%, respectively.

5. Results and Discussion

To quantitatively evaluate the performance of the predictive models, three commonly used statistical measures are applied [20]. All of them quantify the deviation between the actual and predicted wind speed values. Specifically, RMSE, MAE, and R² are defined as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^{2}}$$
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} \left|y_i - \hat{y}_i\right|$$
$$R^{2} = 1 - \frac{\sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^{2}}{\sum_{i=1}^{N} \left(y_i - \bar{y}\right)^{2}} \qquad (14)$$

where y_i and ŷ_i are the actual and predicted wind speed, respectively, while ȳ is the mean value of the actual wind speed sequence. Typically, smaller amplitudes of RMSE and MAE indicate an improved forecasting procedure, while R² is the goodness-of-fit measure for the model; therefore, the larger its value, the better the fit. The testbed environment configuration is as follows:


[Figure 1: The proposed forecasting methodology. Raw time-series data are collected, averaged (5 min / 30 min / 1 hr), and preprocessed; split into lags L1–L10; modeled with ANN, CNN, LSTM, ConvLSTM, and SVR; and evaluated with MSE, RMSE, MAE, R², delay, and execution-time KPIs to select the best-fitted model.]

Input: the wind speed time series data. Output: forecasting performance indices.

(1) The wind speed time series data are measured every 5 minutes and averaged twice, to 30 minutes and to 1 hour, respectively.
(2) The wind datasets are split into training, validation, and test sets.
(3) Initiate the multi-lags-one-step (MLOS) arrays for the training, validation, and test sets.
(4) Define the MLOS range as 1:10 to optimize the number of needed lags.
(5) loop 1:
      Split the first set based on the MLOS range
      Initiate and extract set features with a CNN layer
      Pass the output to a defined LSTM layer
      Select the first range of the number of hidden neurons
      Generate prediction results and performance indices
      Count the time to execute and produce prediction results
      Save and compare results with previous ones
    loop 2:
      Select the next MLOS range
      If MLOS range = maximum range, then go to loop 3 and initialize the MLOS range
      go to loop 1
    loop 3:
      Select a new number of hidden neurons
      If number-of-hidden-neurons range = maximum range, then go to loop 4 and initialize the number-of-hidden-neurons range
      go to loop 1
(6) loop 4:
      Select a new dataset from the sets 5 min, 30 min, and 1 hr
      go to loop 1
      If set range = maximum range, then
        Generate the performance indices of all tested sets
        Select the best results metrics

Algorithm 1: ConvLSTM training.


(1) CPU: Intel(R) Core(TM) i7-8550U @ 1.80 GHz (2001 MHz), 4 cores, 8 logical processors
(2) RAM: 16.0 GB installed physical memory
(3) GPU: AMD Radeon(TM) RX 550, 1.0 GB
(4) Framework: Anaconda 2019.07, Python 3.7
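The three indicators of equation (14), used throughout the evaluation, translate directly into code; a plain-Python sketch:

```python
import math

def rmse(y, y_hat):
    """Root mean square error of equation (14)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y, y_hat)) / len(y))

def mae(y, y_hat):
    """Mean absolute error of equation (14)."""
    return sum(abs(a - p) for a, p in zip(y, y_hat)) / len(y)

def r2(y, y_hat):
    """Coefficient of determination (goodness of fit) of equation (14)."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

# Toy usage on a short actual/predicted pair
y_true = [3.0, 4.0, 5.0, 4.0]
y_pred = [2.8, 4.1, 5.2, 3.9]
scores = (r2(y_true, y_pred), rmse(y_true, y_pred), mae(y_true, y_pred))
```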

Table 6 lists the chosen optimized internal parameters (hyperparameters) for the forecasting methods used in this work. For each method, the optimal number of hidden neurons is chosen to achieve the maximum R² and the minimum RMSE and MAE values.

After implementing CNN, ANN, LSTM, ConvLSTM, and SVM, the most fitted model was chosen according to its accuracy in predicting future wind speed values; the seasonality is thereby accounted for in the forecast mechanism. The chosen model has to deliver the most fitted data with the least error, taking into consideration the nature of the data rather than applying naive forecasting to it.

To achieve this goal, the statistical error indicators are calculated for every model and time lapse, as Figure 2 illustrates. The results suggest that the ConvLSTM model has the best performance compared to the other four models: the chosen model has to reach the minimum RMSE and MAE values and the maximum R² value.

Different parameters are also tested to ensure the right choice of the best-fitted model. The optimum number of lags, presented in Table 7, is one of the most important indicators in selecting the best-fitted model: the fewer historical points the model needs, the lower the computational effort. For each method, the optimal number of lags is chosen to achieve the maximum R² and the minimum RMSE and MAE values. For instance, Figures 3 and 4 show the relation between the statistical measures and the number of hidden neurons and the number of lags, respectively, for the proposed ConvLSTM method in the 5-minute time span case. It can be seen that 4 lags and 15 hidden neurons achieved the maximum R² and the minimum RMSE and MAE values.

The execution time shown in Table 8 is calculated for each method and time lapse to ensure that the chosen model is efficient and can effectively predict future wind speed: the shorter the execution time, the more efficient and useful the model, and the more amenable it is to further modifications. According to Table 8, ConvLSTM required far less execution time than SVM to process the historical data and deliver a final prediction: SVM needed about 5.4 minutes to complete training and produce testing results, while ConvLSTM finished in about 1.7 minutes. This large difference motivated the choice of ConvLSTM.
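Execution times like those in Table 8 can be gathered with a simple wall-clock wrapper (a generic sketch, not the authors' instrumentation):

```python
import time

def timed(fn, *args):
    """Measure the wall-clock execution time of a training/prediction routine."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_min = (time.perf_counter() - start) / 60.0  # minutes, as in Table 8
    return result, elapsed_min

# Toy usage with a placeholder "model" routine
_, minutes = timed(sum, range(100000))
```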

Table 3: Dataset characteristics for the 5-min sample.

Dataset            Max     Median  Min    Mean   Std
All datasets       18.73   3.53    0.01   3.91   2.10
Training dataset   18.73   3.47    0.01   3.83   2.05
Test dataset       14.87   3.67    0.01   4.05   2.20

Table 4: Dataset characteristics for the 30-min sample.

Dataset            Max     Median  Min    Mean     Std
All datasets       17.66   3.53    0.01   3.91     2.08
Training dataset   17.66   3.47    0.01   3.8373   2.02
Test dataset       14.32   3.67    0.02   4.05     2.18

Table 5: Dataset characteristics for the 1-hour sample.

Dataset            Max     Median  Min    Mean   Std
All datasets       17.61   3.53    0.07   3.91   2.05
Training dataset   17.61   3.46    0.07   3.83   2.00
Test dataset       14.22   3.66    0.07   4.05   2.15

Table 6: Optimized internal parameters for the forecasting methods.

Method      5 min                 30 min                1 hour
LSTM        17 hidden neurons     20 hidden neurons     8 hidden neurons
CNN         15 hidden neurons     5 hidden neurons      15 hidden neurons
ConvLSTM    15 hidden neurons     8 hidden neurons      20 hidden neurons
ANN         15 hidden neurons     15 hidden neurons     20 hidden neurons
SVR         C=7, ε=0.1, γ=0.2     C=1, ε=0.25, γ=0.15   C=1, ε=0.1, γ=0.05


Figure 5 shows that the 5-minute-lapse dataset is the best-fitted dataset for the chosen model, illustrating how accurate the prediction of future wind speed can be.

For completeness, to effectively evaluate the investigated forecasting techniques in terms of prediction accuracy, a 50-trial cross-validation procedure is carried out, in which the investigated techniques are built and then evaluated on 50 different training and test datasets, respectively, randomly sampled from the available overall dataset. The ultimate performance metrics are then reported as the average and standard deviation of the 50 metrics obtained across the cross-validation trials. In this regard, Figure 6 shows the average performance metrics on the test dataset using this procedure. It can easily be recognized that the forecasting models that employ the LSTM technique outperform the other investigated techniques in terms of the three performance metrics R², RMSE, and MAE.
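The repeated random-sampling procedure can be sketched as follows; the persistence "model" here (ŷ_t = y_{t−1}) is a deliberately naive placeholder standing in for any of the five trained models:

```python
import random
import statistics

def cross_validate(series, n_trials=50, test_frac=0.33, seed=0):
    """Repeated random-subsampling validation: in each trial, randomly pick
    test indices, score a model on them, and average the metric over trials.
    The 'model' is a naive persistence forecast (y_hat_t = y_{t-1})."""
    rng = random.Random(seed)
    maes = []
    indices = list(range(1, len(series)))  # start at t = 1 so y_{t-1} exists
    for _ in range(n_trials):
        test_idx = rng.sample(indices, int(test_frac * len(indices)))
        errors = [abs(series[t] - series[t - 1]) for t in test_idx]
        maes.append(sum(errors) / len(errors))
    # Report mean and standard deviation over the 50 trials, as in Figure 6
    return statistics.mean(maes), statistics.stdev(maes)

avg_mae, std_mae = cross_validate(
    [3.1, 3.4, 3.2, 4.0, 3.8, 3.5, 3.9, 4.2, 4.0, 3.7])
```

In the paper's setting, each trial would retrain the model on the sampled training set before scoring; the resampling-and-averaging skeleton is the same.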

From the experimental results of short-term wind speed forecasting shown in Figure 6, we observe that ConvLSTM performs best in terms of the forecasting metrics (R², RMSE, and MAE) compared to the other models (i.e., CNN, ANN, SVR, and LSTM). The related statistics in Tables 6 and 7 support the effectiveness of ConvLSTM and its capability to handle noisy, large data: ConvLSTM produced highly accurate wind speed predictions with fewer lags and hidden neurons, which was reflected in the results of Table 8 as less computation time compared to the other tested models. Furthermore, we introduced multi-lags-one-step (MLOS) ahead forecasting combined with the hybrid ConvLSTM model to provide efficient generalization to new time-series data for accurate wind speed prediction. The results show that the ConvLSTM model proposed in this paper is an effective and promising model for wind speed forecasting.

[Figure 2: Models' key performance indicators (KPIs). The plotted values are as follows.]

(a) R²
Time span   CNN      ANN      LSTM     SVM      ConvLSTM
5 min       0.9873   0.9873   0.9876   0.9815   0.9876
30 min      0.9091   0.9089   0.9097   0.9078   0.9098
1 hour      0.8309   0.8302   0.8315   0.8211   0.8316

(b) RMSE
Time span   CNN      ANN      LSTM     SVM      ConvLSTM
5 min       0.2482   0.2483   0.2424   0.2462   0.2419
30 min      0.6578   0.6588   0.6467   0.6627   0.6463
1 hour      0.8862   0.8881   0.8718   0.9116   0.8717

(c) MAE
Time span   CNN      ANN      LSTM     SVM      ConvLSTM
5 min       0.1806   0.1798   0.1758   0.1855   0.1755
30 min      0.4802   0.4799   0.4721   0.4818   0.4741
1 hour      0.6468   0.6473   0.6392   0.6591   0.6369

Table 7: Optimum number of lags.

Method     5 min   30 min   1 hour
CNN        9       4        4
ANN        3       3        3
LSTM       7       6        10
ConvLSTM   4       5        8
SVR        9       4        5


[Figure 3: ConvLSTM measured statistical values (R², RMSE, and MAE) versus the number of hidden neurons, 5-minute time span.]

[Figure 4: ConvLSTM measured statistical values (R², RMSE, and MAE) versus the number of lags, 5-minute time span.]


Similar to our work, the EnsemLSTM model proposed by Chen et al. [15] contains different clusters of LSTMs with different hidden layers and hidden neurons; they combined the LSTM clusters with SVR and an external optimizer to enhance the generalization capability and robustness of their model. However, their model showed high computational complexity with mediocre performance indices. Our proposed ConvLSTM with MLOS boosts generalization and robustness on new time-series data while producing high performance indices.

6. Conclusions

In this study, we proposed a hybrid deep learning-based framework, ConvLSTM, for short-term prediction of the wind

Table 8: Execution time.

Method      5-min dataset (min)   30-min dataset (min)   1-hour dataset (min)
ConvLSTM    1.7338                0.3849                 0.1451
SVR         5.41424               0.8214                 0.2250
CNN         0.87828               0.1322                 0.0708
ANN         0.7431                0.2591                 0.0587
LSTM        1.6570                0.3290                 0.1473

[Figure 5: ConvLSTM predicted versus true wind speed (mph) over the number of samples, 5-minute time span.]

[Figure 6: Average performance metrics (R², RMSE, and MAE) obtained on the test dataset using the 50-trial cross-validation procedure, comparing ConvLSTM, LSTM, ANN, CNN, and SVR.]


speed time series measurements. The proposed dynamic prediction model was optimized over the number of input lags and the number of internal hidden neurons. Multi-lags-one-step (MLOS) ahead wind speed forecasting using the proposed approach showed superior results compared to four other models built using standard ANN, CNN, LSTM, and SVM approaches. The proposed modeling framework combines the benefits of CNN and LSTM networks in a hybrid scheme that delivers highly accurate wind speed predictions with fewer lags and hidden neurons as well as less computational complexity. For future work, further investigation can be done to improve the accuracy of the ConvLSTM model, for instance, by increasing and optimizing the number of hidden layers, applying multi-lags-multi-steps (MLMS) ahead forecasting, and introducing a reinforcement learning agent to optimize the parameters in comparison with other optimization methods.

Data Availability

The wind speed data used in this study were taken from the West Texas Mesonet of the US National Wind Institute (http://www.depts.ttu.edu/nwi/research/facilities/wtm/index.php). Data are provided freely for academic research purposes only and cannot be shared/distributed beyond academic research use without permission from the West Texas Mesonet.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors acknowledge the help of Prof. Brian Hirth at Texas Tech University for providing them with access to weather data.

References

[1] R. Rothermel, "How to predict the spread and intensity of forest and range fires," General Technical Report INT-143, Intermountain Forest and Range Experiment Station, Ogden, UT, USA, 1983.
[2] A. Tascikaraoglu and M. Uzunoglu, "A review of combined approaches for prediction of short-term wind speed and power," Renewable and Sustainable Energy Reviews, vol. 34, pp. 243–254, 2014.
[3] R. J. Barthelmie, F. Murray, and S. C. Pryor, "The economic benefit of short-term forecasting for wind energy in the UK electricity market," Energy Policy, vol. 36, no. 5, pp. 1687–1696, 2008.
[4] Y.-K. Wu and H. Jing-Shan, A Literature Review of Wind Forecasting Technology in the World, pp. 504–509, IEEE Lausanne Powertech, Lausanne, Switzerland, 2007.
[5] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," The Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115–133, 1943.
[6] K. Fukushima, "Neocognitron," Scholarpedia, vol. 2, no. 1, p. 1717, 2007.
[7] K. Fukushima, "Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics, vol. 36, no. 4, pp. 193–202, 1980.
[8] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[9] H. T. Siegelmann and E. D. Sontag, "On the computational power of neural nets," Journal of Computer and System Sciences, vol. 50, no. 1, pp. 132–150, 1995.
[10] R. Sunil, "Understanding support vector machine algorithm," https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code, 2017.
[11] X. Shi, Z. Chen, H. Wang, D. Y. Yeung, W. K. Wong, and W. C. Woo, "Convolutional LSTM network: a machine learning approach for precipitation nowcasting," in Proceedings of the Neural Information Processing Systems, pp. 7–12, Montreal, Canada, December 2015.
[12] C. Finn, I. Goodfellow, and S. Levine, "Unsupervised learning for physical interaction through video prediction," in Proceedings of the Annual Conference on Neural Information Processing Systems, pp. 64–72, Barcelona, Spain, December 2016.
[13] H. Xu, Y. Gao, F. Yu, and T. Darrell, "End-to-end learning of driving models from large-scale video datasets," in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3530–3538, Honolulu, HI, USA, July 2017.
[14] K. Chen and J. Yu, "Short-term wind speed prediction using an unscented Kalman filter based state-space support vector regression approach," Applied Energy, vol. 113, pp. 690–705, 2014.
[15] J. Chen, G.-Q. Zeng, W. Zhou, W. Du, and K.-D. Lu, "Wind speed forecasting using nonlinear-learning ensemble of deep learning time series prediction and extremal optimization," Energy Conversion and Management, vol. 165, pp. 681–695, 2018.
[16] D. Liu, D. Niu, H. Wang, and L. Fan, "Short-term wind speed forecasting using wavelet transform and support vector machines optimized by genetic algorithm," Renewable Energy, vol. 62, pp. 592–597, 2014.
[17] Y. Wang, "Short-term wind power forecasting by genetic algorithm of wavelet neural network," in Proceedings of the 2014 International Conference on Information Science, Electronics and Electrical Engineering, pp. 1752–1755, Sapporo, Japan, April 2014.
[18] N. Sheikh, D. B. Talukdar, R. I. Rasel, and N. Sultana, "A short term wind speed forecasting using SVR and BP-ANN: a comparative analysis," in Proceedings of the 2017 20th International Conference of Computer and Information Technology (ICCIT), pp. 1–6, Dhaka, Bangladesh, December 2017.
[19] H. Nantian, C. Yuan, G. Cai, and E. Xing, "Hybrid short term wind speed forecasting using variational mode decomposition and a weighted regularized extreme learning machine," Energies, vol. 9, no. 12, pp. 989–1007, 2016.
[20] S. Haijian and X. Deng, "AdaBoosting neural network for short-term wind speed forecasting based on seasonal characteristics analysis and lag space estimation," Computer Modeling in Engineering & Sciences, vol. 114, no. 3, pp. 277–293, 2018.
[21] W. Xiaodan, "Forecasting short-term wind speed using support vector machine with particle swarm optimization," in Proceedings of the 2017 International Conference on Sensing, Diagnostics, Prognostics and Control (SDPC), pp. 241–245, Shanghai, China, August 2017.
[22] F. R. Ningsih, E. C. Djamal, and A. Najmurrakhman, "Wind speed forecasting using recurrent neural networks and long short term memory," in Proceedings of the 2019 6th International Conference on Instrumentation, Control and Automation (ICA), pp. 137–141, Bandung, Indonesia, July–August 2019.
[23] P. K. Datta, "An artificial neural network approach for short-term wind speed forecast," Master's thesis, Kansas State University, Manhattan, KS, USA, 2018.
[24] J. Guanlong, D. Li, L. Yao, and P. Zhao, "An improved artificial bee colony-BP neural network algorithm in the short-term wind speed prediction," in Proceedings of the 2016 12th World Congress on Intelligent Control and Automation (WCICA), pp. 2252–2255, Guilin, China, June 2016.
[25] C. Gonggui, J. Chen, Z. Zhang, and Z. Sun, "Short-term wind speed forecasting based on fuzzy C-means clustering and improved MEA-BP," IAENG International Journal of Computer Science, vol. 46, no. 4, pp. 27–35, 2019.
[26] W. ZhanJie and A. B. M. Mazharul Mujib, "The weather forecast using data mining research based on cloud computing," Journal of Physics: Conference Series, vol. 910, no. 1, Article ID 012020, 2017.
[27] A. Zhu, X. Li, Z. Mo, and R. Wu, "Wind power prediction based on a convolutional neural network," in Proceedings of the 2017 International Conference on Circuits, Devices and Systems (ICCDS), pp. 131–135, Chengdu, China, September 2017.
[28] O. Linda, T. Vollmer, and M. Manic, "Neural network based intrusion detection system for critical infrastructures," in Proceedings of the 2009 International Joint Conference on Neural Networks, pp. 1827–1834, Atlanta, GA, USA, June 2009.
[29] C. Li, Z. Xiao, X. Xia, W. Zou, and C. Zhang, "A hybrid model based on synchronous optimisation for multi-step short-term wind speed forecasting," Applied Energy, vol. 215, pp. 131–144, 2018.
[30] National Wind Institute, West Texas Mesonet, online referencing, http://www.depts.ttu.edu, 2018.


2.2. CNN Algorithms. CNN is a feed-forward neural network. To optimize the network architecture and solve for the unknown parameters in the network, the attributes of a two-dimensional image are extracted and backpropagation algorithms are implemented. To obtain the final outcome, the sampled data are fed into the network to extract the needed attributes with pre-refining; then, classification or regression is applied [27].

The CNN is basically composed of two types of layers: the convolutional and the pooling layers. The neurons are locally connected between the convolution layer and the preceding layer, through which the neurons' local attributes are obtained.

Table 2 (continued): Notations.

f_t — forget gate
C_t — the cell state
i_t — input gate
x_t — current input data
h_{t−1} — the previous hidden output
c̃_t — the input to cell c
c — memory cell
C_{t−1} — past cell status
O_t — output gate
h_t — hidden state
· — matrix multiplication
⊙ — elementwise multiplication
θ — weight
σ — nonlinear function
z_j — the jth hidden neuron
p — number of inputs to the network
m — number of hidden neurons
w_ij — the connection weight from the ith input node to the jth hidden node
y_{k−i} — the wind speed i steps in the past
f_h(·) — the activation function in the hidden layer
w_j — the connection weight from the jth hidden node to the output node
ŷ_k — the predicted wind speed at the kth sampling moment
f_o — the activation function for the output layer
y_k — actual wind speed
x_i — input vector
y_i — output vector
R^m — the m-dimensional real space
f(x) — a function that describes the correlation between inputs and outputs
ϕ(x) — predefined mapping function
R[f] — structural risk
w — the regression coefficient vector
b — bias term
C — penalty coefficient
L(x_i, y_i, f(x_i)) — the ε-insensitive loss function
ε — threshold
ζ_i, ζ_i* — slack variables that keep the constraints feasible
a_i, a_i* — the Lagrange multipliers
K(x_i, x_j) — the kernel function
W — the weight matrix
∗ — convolution operation
b_i, b_f, b_c — bias vectors
∘ — Hadamard product
H_t — hidden state
X_t — current wind speed measure
X_{t−1} — previous wind speed measure
X_{t+1} — future wind speed measure


The local sensitivity is found through the pooling layer to obtain the attributes repeatedly. The presence of the convolution and pooling layers reduces the attribute resolution and the number of network parameters that require tuning.

CNN typically describes data and constructs them as a two-dimensional array and is extensively utilized in the area of image processing. In this paper, the CNN algorithm is configured to predict the wind speed and fitted to process a one-dimensional array of data. In the preprocessing phase, the one-dimensional data are reconstructed into a two-dimensional array; this enables the CNN algorithm to deal smoothly with the data. This creates two files, the property and the response files, which are delivered as inputs to CNN; the response file also contains the data of the expected output value.

Each sample is represented by a line from the property and response files. Weights and biases can be obtained as soon as an acceptable number of samples to train the CNN is delivered. The training continues by comparing the regression results with the response values in order to reach the minimum possible error. This delivers the final trained CNN model, which is utilized to produce the needed predictions.

The fitting mechanism of CNN is pooling. Various computational approaches have shown that two kinds of pooling can be used: average pooling and maximum pooling. Images are stationary, and all parts of an image share similar attributes; therefore, the pooling approach applies the same average or maximum calculation to every part of a high-resolution image. The pooling process reduces the dimensions of the statistics and increases the generalization strength of the model. The results are well optimized and have a lower possibility of overfitting.
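A one-dimensional sketch of the two pooling variants (the paper's CNN applies the same idea to its internal feature maps):

```python
def pool1d(x, window, mode="max"):
    """Non-overlapping 1-D pooling: reduce each window to its max or average."""
    agg = max if mode == "max" else (lambda w: sum(w) / len(w))
    return [agg(x[i:i + window]) for i in range(0, len(x) - window + 1, window)]

# Toy usage on a small feature map
feature_map = [1.0, 3.0, 2.0, 5.0, 4.0, 0.5]
pooled_max = pool1d(feature_map, window=2)              # [3.0, 5.0, 4.0]
pooled_avg = pool1d(feature_map, window=2, mode="avg")  # [2.0, 3.5, 2.25]
```

Either variant halves the feature resolution here, which is the parameter-reduction effect described above.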

23 ANN Algorithms ANN has three layers which build upthe network )ese are input hidden and output layers)ese layers have the ability to correlate an input vector to anoutput scalar or vector using activation function in variousneurons)e jth hidden neuron Zjcan be computed by the p

inputs and m hidden neurons using the following equation[14]

Zj fh 1113944

p

i1wijykminusi

⎛⎝ ⎞⎠ (7)

where wij is the connection weight from the ith input node tothe jth hidden node ykminusi is i-step behind previous windspeed and fh() is the activation function in the hiddenlayer )erefore the future wind speed can be predictedthrough

1113954yk fo 1113944

m

j1wjzj

⎛⎝ ⎞⎠ (8)

where wj is the connection weight from the jth hidden nodeto the output node and 1113954yk is the predicted wind speed at thekth sampling moment while f0 is the activation function forthe output layer By minimizing the error between the actualand the predicted wind speeds yk and 1113954yk respectively usingLevenbergndashMarquardt (LM) algorithm the nonlinearmapping efficiency of ANN can be obtained [28]

24 SVM Algorithms Assuming a set of samples xi yi1113864 1113865where i 1 2 N with input vector xi isin Rm and out-put vectoryi isin Rm )e regression obstacles aim to identifya function f(x)that describes the correlation between inputsand outputs )e interest of SVR is to obtain a linear re-gression in the high-dimensional feature space delivered bymapping the primary input set utilizing a preknown func-tion ϕ(x())and to minimize the structure riskR[f] )ismechanism can be written as follows [15]

f(x) wTϕ(x) + b

R[f] 12W

2+ C 1113944

N

i1L xiyif xi( )1113874 1113875

(9)

where W b and Crespectively are the regression coefficientvector bias term and punishment coefficient L(xi yi f(xi)

)

is the e-insensitive loss function)e regression problem canbe handled by the following constrained optimizationproblem

min 12W

2+ C 1113944

N

i1L ζ iζ

lowasti( 1113857

st yi minus wTϕ(x) + b1113872 1113873 le ε + ζ i wTϕ(x) + b1113872 1113873 minus yi le ε + ζ lowasti ζ i ζlowasti ge 0 i 1 2 N

(10)

where ζ iand ζ lowasti represent the slack variables that let con-straints feasible By using the Lagrange multipliers the re-gression function can be written as follows

f(x) 1113944N

i1ai minus a

lowasti( 1113857K xixj1113872 1113873 + b (11)

where aiand alowasti are the Lagrange multipliers that fulfilthe conditions ai ge 0 alowasti ge 0 and 1113936

Ni1(ai minus alowasti ) 0 K(xixj)

is a general kernel function In this study the well-knownradial basis function (RBF) is chosen here as the kernelfunction

Computational Intelligence and Neuroscience 7

K xixj1113872 1113873 exp minusxi minus xj

2

2σ2⎛⎜⎜⎝ ⎞⎟⎟⎠ (12)

where σ defines the RBF kernel width [15]

25 ConvLSTM Algorithm ConvLSTM is designed to betrained on spatial information in the dataset and its aim is todeal with 3-dimentional data as an input Furthermore itexchanges matrix multiplication through convolution op-eration on every LSTM cellrsquos gate By doing so it has theability to put the underlying spatial features in multidi-mensional data)e formulas that are used at each one of thegates (input forget and output) are as follows

it σ Wxi lowast xt + Whi lowast htminus1 + bi( 1113857

ft σ Wxf lowast xt + Whf lowast htminus1 + bf1113872 1113873

ot σ Wxo lowastxt + Who lowast htminus1 + bo( 1113857

Ct ft ∘Ctminus1 tan h Wxc lowastxt + Whc lowast htminus1 + bc( 1113857

Ht o minus t ∘ tan h ct( 1113857

(13)

where it ft and ot are input forget and output gates and W

is the weight matrix while xt is the current input data htminus1 isthe previous hidden output and Ct is the cell state

)e difference between these equations in LSTM is thatthe matrix multiplication (middot) is substituted by the convo-lution operation (lowast) between W and each xt htminus1 at everygate By doing so the whole connected layer is replaced by aconvolutional layer )us the number of weight parametersin the model can be significantly reduced

3 Methodology

Due to the nonlinear nonstationary attributes and thestochastic variations in the wind speed time series the ac-curate prediction of wind speed is known to be a challengingeffort [29] In this work to improve the accuracy of the windspeed forecasting model a comparison between five modelsis conducted to forecast wind speed considering availablehistorical data A new concept called multi-lags-one-step(MLOS) ahead forecasting is employed to illustrate the effecton the five models accuracies Assume that we are at timeindex Xt To forecast one output element in the future Xt+1the input dataset can be splitted into many lags (past data)XtminusI where I isin 1ndash10 By doing so the model can be trainedon more elements before predicting a single event in thefuture In addition to that the model accuracy showed animprovement until it reached the optimum lagging pointwhich had the best accuracy Beyond this point the modelaccuracy is degraded as it will be illustrated in the Resultssection

Figure 1 illustrates the workflow of the forecasting model. Specifically, the proposed methodology entails four steps.

In Step 1, data have been collected and averaged from 5 minutes to 30 minutes and to 1 hour, respectively. The datasets are then standardized to have a mean value of 0 and a standard deviation of 1. The lagging stage in Step 2 is very important, as the data are split into different lags to study the effect of training the models on more than one element (input) to predict a single event in the future. In Step 3, the models have been applied, taking into consideration that some models, such as CNN, LSTM, and ConvLSTM, need to be adjusted from a matrix-shape perspective; these models normally work with 2D data or higher, so manipulation and reshaping of the matrices are conducted in this stage. To check and evaluate the proposed models in Step 4, three main metrics are used to validate the case study (MAE, RMSE, and R²). In addition, the execution time and the optimum lag are taken into account to select the best model.
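The averaging and standardization of Step 1 can be sketched as follows (an illustrative NumPy fragment, not the authors' code; with 5-minute data, `factor = 6` gives 30-minute means and `factor = 12` gives hourly means):

```python
import numpy as np

def average_and_standardize(series_5min, factor):
    """Step 1: average the 5-minute series into coarser bins, then
    standardize the result to zero mean and unit standard deviation."""
    n = len(series_5min) // factor * factor      # drop any incomplete bin
    averaged = np.asarray(series_5min[:n], float).reshape(-1, factor).mean(axis=1)
    return (averaged - averaged.mean()) / averaged.std()
```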

Algorithm 1 illustrates the training procedure for ConvLSTM.

4. Collected Data

Table 3 illustrates the characteristics of the data collected at a 5-minute time span. The data are collected from a real wind speed dataset over a three-year period from the West Texas Mesonet, with a 5-minute observation period, from near Lake Alan Garza [30]. The data are processed through averaging from 5 minutes to 30 minutes (whose statistical characteristics are given in Table 4) and one more time to 1 hour (whose statistical characteristics are given in Table 5). The goal of averaging is to study the effect of reducing the data size in order to compare the five models and then select the one that can achieve the highest accuracy for the three dataset cases. As shown in the three tables, the datasets are almost identical and preserve their seasonality; they are not affected by the averaging process.

The data have been split into three sets (training, validation, and test) with fractions of 53%, 14%, and 33%, respectively.

5. Results and Discussion

To quantitatively evaluate the performance of the predictive models, four commonly used statistical measures are tested [20]. All of them measure the deviation between the actual and predicted wind speed values. Specifically, RMSE, MAE, and R² are defined as follows:

\[
\begin{aligned}
\mathrm{RMSE} &= \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^{2}},\\
\mathrm{MAE} &= \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|,\\
R^{2} &= 1 - \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^{2}}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^{2}},
\end{aligned}
\tag{14}
\]

where y_i and ŷ_i are the actual and the predicted wind speed, respectively, while ȳ is the mean value of the actual wind speed sequence. Typically, smaller amplitudes of these measures indicate an improved forecasting procedure, while R² is the goodness-of-fit measure for the model; therefore, the larger its value is, the better the model fits. The testbed environment configuration is as follows:

8 Computational Intelligence and Neuroscience

[Figure 1 (flowchart): raw time-series data are collected; data processing averages the series (5 min, 30 min, 1 hr) and preprocesses it; lagging produces lags L1–L10; modeling runs the five models (ANN, CNN, LSTM, ConvLSTM, SVR); evaluation compares the forecasting results via MSE, RMSE, MAE, R², delay, and execution time to select the best-fitted KPIs.]

Figure 1: The proposed forecasting methodology.

Input: the wind speed time series data
Output: forecasting performance indices

(1) The wind speed time series data are measured every 5 minutes and averaged two times, to 30 minutes and 1 hour, respectively.
(2) The wind datasets are split into training, validation, and test sets.
(3) Initiate the multi-lags-one-step (MLOS) arrays for the training, validation, and test sets.
(4) Define the MLOS range as 1, ..., 10 to optimize the number of needed lags.
(5) loop 1:
        Split the first set based on the MLOS range.
        Initiate and extract set features with a CNN layer.
        Pass the output to a defined LSTM layer.
        Select the first value of the number-of-hidden-neurons range.
        Generate prediction results and performance indices.
        Count the time needed to execute and produce the prediction results.
        Save and compare the results with the previous ones.
    loop 2:
        Select the next MLOS value.
        If the MLOS value = the maximum of the range, then go to loop 3 and reinitialize the MLOS range.
        Go to loop 1.
    loop 3:
        Select a new number of hidden neurons.
        If the number of hidden neurons = the maximum of its range, then go to loop 4 and reinitialize the number-of-hidden-neurons range.
        Go to loop 1.
(6) loop 4:
        Select a new dataset from the sets (5 min, 30 min, and 1 hr).
        Go to loop 1.
        If the set range = the maximum range, then:
            Generate the performance indices of all tested sets.
            Select the best results' metrics.

Algorithm 1: ConvLSTM training.
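The nested loops of Algorithm 1 amount to a grid search over datasets, lag counts, and hidden-neuron counts, keeping the configuration with the best score and its execution time. A compact sketch (with `fit_and_score` standing in for actually building, training, and scoring the ConvLSTM; all names here are illustrative):

```python
import itertools
import time

def grid_search(datasets, lag_range, hidden_range, fit_and_score):
    """Sketch of Algorithm 1's loops: for every dataset, lag count, and
    hidden-neuron count, fit a model, time it, and keep the best R^2."""
    best = None
    for name, data in datasets.items():                      # loop 4
        for n_lags, n_hidden in itertools.product(lag_range, hidden_range):
            t0 = time.perf_counter()
            r2 = fit_and_score(data, n_lags, n_hidden)       # loop 1 body
            trial = {"set": name, "lags": n_lags, "hidden": n_hidden,
                     "r2": r2, "time": time.perf_counter() - t0}
            if best is None or r2 > best["r2"]:              # compare with previous
                best = trial
    return best
```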


(1) CPU: Intel(R) Core(TM) i7-8550U CPU @ 1.80 GHz (2001 MHz), 4 core(s), 8 logical processor(s)
(2) RAM: 16.0 GB installed physical memory
(3) GPU: AMD Radeon(TM) RX 550, 1.0 GB
(4) Framework: Anaconda 2019.07, Python 3.7
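The three indicators can be computed directly from the actual and predicted series; a small self-contained sketch (function name is ours):

```python
import numpy as np

def forecast_metrics(y_true, y_pred):
    """RMSE, MAE, and R^2 between actual and predicted wind speeds."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    err = y_true - y_pred
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return rmse, mae, r2
```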

Table 6 illustrates the chosen optimized internal parameters (hyperparameters) for the forecasting methods used in this work. For each method, the optimal number of hidden neurons is chosen to achieve the maximum R² and the minimum RMSE and MAE values.

After the implementation of CNN, ANN, LSTM, ConvLSTM, and SVM, the most fitted model was chosen depending on its accuracy in predicting future wind speed values; thus, the seasonality is considered in the forecast mechanism. The chosen model has to deliver the best fit with the least amount of error, taking into consideration the nature of the data and not applying naive forecasting to it.

To achieve this goal, the statistical error indicators are calculated for every model and time lapse, as Figure 2 illustrates. The provided results suggest that the ConvLSTM model has the best performance compared with the other four models. The chosen model has to reach the minimum values of RMSE and MAE and the maximum R² value.

Different parameters are also tested to ensure the right decision in choosing the best fitted model. The optimum number of lags, presented in Table 7, is one of the most important indicators in selecting the best fitted model, since the fewer historical points the model needs, the lower its computational effort will be. For each method, the optimal number of lags is chosen to achieve the maximum R² and the minimum RMSE and MAE values. For instance, Figures 3 and 4 show the relation between the statistical measures and the number of lags and hidden neurons, respectively, for the proposed ConvLSTM method in the 5-minute time span case. It can be seen that 4 lags and 15 hidden neurons achieved the maximum R² and minimum RMSE and MAE values.

The execution time shown in Table 8 is calculated for each method and time lapse to assure that the final chosen model is efficient and can effectively predict future wind speed. The shorter the execution time, the more efficient and helpful the model will be; this is also a sign that the model is suitable for further modifications. According to Table 8, the ConvLSTM model beats all other models in the time needed to process the historical data and deliver a final prediction. SVM needed 5.4 minutes to accomplish the training and produce testing results, while ConvLSTM made it in just 1.7 minutes. This huge difference between them motivated the choice of ConvLSTM.

Table 3: Dataset characteristics for the 5 min sample.

Dataset            Max     Median   Min    Mean   Std
All datasets       18.73   3.53     0.01   3.91   2.10
Training dataset   18.73   3.47     0.01   3.83   2.05
Test dataset       14.87   3.67     0.01   4.05   2.20

Table 4: Dataset characteristics for the 30 min sample.

Dataset            Max     Median   Min    Mean       Std
All datasets       17.66   3.53     0.01   3.91       2.08
Training dataset   17.66   3.47     0.01   3.837309   2.02
Test dataset       14.32   3.67     0.02   4.05       2.18

Table 5: Dataset characteristics for the 1 hour sample.

Dataset            Max     Median   Min    Mean   Std
All datasets       17.61   3.53     0.07   3.91   2.05
Training dataset   17.61   3.46     0.07   3.83   2.00
Test dataset       14.22   3.66     0.07   4.05   2.15

Table 6: Optimized internal parameters for the forecasting methods.

Method      Sets of parameters
LSTM        5 min: 17 hidden neurons; 30 min: 20 hidden neurons; 1 hour: 8 hidden neurons
CNN         5 min: 15 hidden neurons; 30 min: 5 hidden neurons; 1 hour: 15 hidden neurons
ConvLSTM    5 min: 15 hidden neurons; 30 min: 8 hidden neurons; 1 hour: 20 hidden neurons
ANN         5 min: 15 hidden neurons; 30 min: 15 hidden neurons; 1 hour: 20 hidden neurons
SVR         5 min: C = 7, ε = 0.1, γ = 0.2; 30 min: C = 1, ε = 0.25, γ = 0.15; 1 hour: C = 1, ε = 0.1, γ = 0.05


Figure 5 shows that the 5-minute lapse dataset is the best fitted to our chosen model. It indicates how accurate the prediction of future wind speed will be.

For completeness, to effectively evaluate the investigated forecasting techniques in terms of their prediction accuracies, a 50 cross-validation procedure is carried out, in which the investigated techniques are built and then evaluated on 50 different training and test datasets, respectively, randomly sampled from the available overall dataset. The ultimate performance metrics are then reported as the average and the standard deviation values of the 50 metrics obtained in the cross-validation trials. In this regard, Figure 6 shows the average performance metrics on the test dataset using the 50 cross-validation procedure. It can be easily recognized that the forecasting models that employ the LSTM technique outperform the other investigated techniques in terms of the three performance metrics (R², RMSE, and MAE).
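The repeated random-split evaluation described above can be sketched as follows (an illustrative fragment; `score_fn` stands in for training a model on the sampled training set and scoring it on the test set):

```python
import numpy as np

def repeated_holdout(series, n_trials, test_frac, score_fn, seed=0):
    """Repeated random train/test splits (50 trials in the paper):
    report the mean and standard deviation of the per-trial metric."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_trials):
        idx = rng.permutation(len(series))
        n_test = int(len(series) * test_frac)
        test, train = idx[:n_test], idx[n_test:]
        scores.append(score_fn(series[train], series[test]))
    return float(np.mean(scores)), float(np.std(scores))
```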

From the experimental results of short-term wind speed forecasting shown in Figure 6, we can observe that ConvLSTM performs the best in terms of the forecasting metrics (R², RMSE, and MAE) compared with the other models (i.e., CNN, ANN, SVR, and LSTM). The related statistical tests in Tables 6 and 7, respectively, have proved the effectiveness of ConvLSTM and its capability of handling noisy, large data. ConvLSTM showed that it can produce highly accurate wind speed predictions with fewer lags and hidden neurons. This was indeed reflected in the results shown in Table 8, with less computation time compared with the other tested models. Furthermore, we introduced multi-lags-one-step (MLOS) ahead forecasting combined with the hybrid ConvLSTM model to provide efficient generalization to new time series data and to predict wind speed accurately. The results showed that the ConvLSTM model proposed in this paper is an effective and promising model for wind speed forecasting.

Figure 2: Models' key performance indicators (KPIs). The plotted values are:

(a) R²
Time span   CNN      ANN      LSTM     SVM      ConvLSTM
5 min       0.9873   0.9873   0.9876   0.9815   0.9876
30 min      0.9091   0.9089   0.9097   0.9078   0.9098
1 hour      0.8309   0.8302   0.8315   0.8211   0.8316

(b) RMSE
Time span   CNN      ANN      LSTM     SVM      ConvLSTM
5 min       0.2482   0.2483   0.2424   0.2462   0.2419
30 min      0.6578   0.6588   0.6467   0.6627   0.6463
1 hour      0.8862   0.8881   0.8718   0.9116   0.8717

(c) MAE
Time span   CNN      ANN      LSTM     SVM      ConvLSTM
5 min       0.1806   0.1798   0.1758   0.1855   0.1755
30 min      0.4802   0.4799   0.4721   0.4818   0.4741
1 hour      0.6468   0.6473   0.6392   0.6591   0.6369

Table 7: Optimum number of lags.

Method      5 min   30 min   1 hour
CNN         9       4        4
ANN         3       3        3
LSTM        7       6        10
ConvLSTM    4       5        8
SVR         9       4        5


Figure 3: ConvLSTM measured statistical values versus the number of hidden neurons: (a) R-squared; (b) RMSE; (c) MAE.

Figure 4: ConvLSTM measured statistical values versus the number of lags: (a) R-squared; (b) RMSE; (c) MAE.


Similar to our work, the EnsemLSTM model proposed by Chen et al. [15] contained different clusters of LSTMs with different hidden layers and hidden neurons. They combined the LSTM clusters with SVR and an external optimizer in order to enhance the generalization capability and robustness of their model. However, their model showed high computational complexity with mediocre performance indices. Our proposed ConvLSTM with MLOS assured boosting the generalization and robustness for new time series data as well as producing high performance indices.

6. Conclusions

In this study, we proposed a hybrid deep learning-based framework, ConvLSTM, for short-term prediction of the wind

Table 8: Execution time.

Method      5 min time (min)   30 min time (min)   1 hour time (min)
ConvLSTM    1.7338             0.3849              0.1451
SVR         5.41424            0.8214              0.2250
CNN         0.87828            0.1322              0.0708
ANN         0.7431             0.2591              0.0587
LSTM        1.6570             0.3290              0.1473

Figure 5: ConvLSTM true/predicted wind speed (mph) versus the number of samples (5 min time span).

Figure 6: Average performance metrics obtained on the test dataset using the 50 cross-validation procedure: (a) R²; (b) RMSE; (c) MAE.


speed time series measurements. The proposed dynamic prediction model was optimized for the number of input lags and the number of internal hidden neurons. Multi-lags-one-step (MLOS) ahead wind speed forecasting using the proposed approach showed superior results compared with four other models built using standard ANN, CNN, LSTM, and SVM approaches. The proposed modeling framework combines the benefits of CNN and LSTM networks in a hybrid modeling scheme that shows highly accurate wind speed prediction results with fewer lags and hidden neurons as well as less computational complexity. For future work, further investigation can be done to improve the accuracy of the ConvLSTM model, for instance, increasing and optimizing the number of hidden layers, applying multi-lags-multi-steps (MLMS) ahead forecasting, and introducing a reinforcement learning agent to optimize the parameters as compared with other optimization methods.

Data Availability

The wind speed data used in this study have been taken from the West Texas Mesonet of the US National Wind Institute (http://www.depts.ttu.edu/nwi/research/facilities/wtm/index.php). Data are provided freely for academic research purposes only and cannot be shared/distributed beyond academic research use without permission from the West Texas Mesonet.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors acknowledge the help of Prof. Brian Hirth at Texas Tech University for providing them with access to weather data.

References

[1] R. Rothermel, "How to predict the spread and intensity of forest and range fires," General Technical Report INT-143, Intermountain Forest and Range Experiment Station, Ogden, UT, USA, 1983.

[2] A. Tascikaraoglu and M. Uzunoglu, "A review of combined approaches for prediction of short-term wind speed and power," Renewable and Sustainable Energy Reviews, vol. 34, pp. 243–254, 2014.

[3] R. J. Barthelmie, F. Murray, and S. C. Pryor, "The economic benefit of short-term forecasting for wind energy in the UK electricity market," Energy Policy, vol. 36, no. 5, pp. 1687–1696, 2008.

[4] Y.-K. Wu and H. Jing-Shan, A Literature Review of Wind Forecasting Technology in the World, pp. 504–509, IEEE Lausanne Powertech, Lausanne, Switzerland, 2007.

[5] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," The Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115–133, 1943.

[6] K. Fukushima, "Neocognitron," Scholarpedia, vol. 2, no. 1, p. 1717, 2007.

[7] K. Fukushima, "Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics, vol. 36, no. 4, pp. 193–202, 1980.

[8] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.

[9] H. T. Siegelmann and E. D. Sontag, "On the computational power of neural nets," Journal of Computer and System Sciences, vol. 50, no. 1, pp. 132–150, 1995.

[10] R. Sunil, "Understanding support vector machine algorithm," https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code, 2017.

[11] X. Shi, Z. Chen, H. Wang, D. Y. Yeung, W. K. Wong, and W. C. Woo, "Convolutional LSTM network: a machine learning approach for precipitation nowcasting," in Proceedings of the Neural Information Processing Systems, pp. 7–12, Montreal, Canada, December 2015.

[12] C. Finn, I. Goodfellow, and S. Levine, "Unsupervised learning for physical interaction through video prediction," in Proceedings of the Annual Conference on Neural Information Processing Systems, pp. 64–72, Barcelona, Spain, December 2016.

[13] H. Xu, Y. Gao, F. Yu, and T. Darrell, "End-to-end learning of driving models from large-scale video datasets," in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3530–3538, Honolulu, HI, USA, July 2017.

[14] K. Chen and J. Yu, "Short-term wind speed prediction using an unscented Kalman filter based state-space support vector regression approach," Applied Energy, vol. 113, pp. 690–705, 2014.

[15] J. Chen, G.-Q. Zeng, W. Zhou, W. Du, and K.-D. Lu, "Wind speed forecasting using nonlinear-learning ensemble of deep learning time series prediction and extremal optimization," Energy Conversion and Management, vol. 165, pp. 681–695, 2018.

[16] D. Liu, D. Niu, H. Wang, and L. Fan, "Short-term wind speed forecasting using wavelet transform and support vector machines optimized by genetic algorithm," Renewable Energy, vol. 62, pp. 592–597, 2014.

[17] Y. Wang, "Short-term wind power forecasting by genetic algorithm of wavelet neural network," in Proceedings of the 2014 International Conference on Information Science, Electronics and Electrical Engineering, pp. 1752–1755, Sapporo, Japan, April 2014.

[18] N. Sheikh, D. B. Talukdar, R. I. Rasel, and N. Sultana, "A short term wind speed forecasting using SVR and BP-ANN: a comparative analysis," in Proceedings of the 2017 20th International Conference of Computer and Information Technology (ICCIT), pp. 1–6, Dhaka, Bangladesh, December 2017.

[19] H. Nantian, C. Yuan, G. Cai, and E. Xing, "Hybrid short term wind speed forecasting using variational mode decomposition and a weighted regularized extreme learning machine," Energies, vol. 9, no. 12, pp. 989–1007, 2016.

[20] S. Haijian and X. Deng, "AdaBoosting neural network for short-term wind speed forecasting based on seasonal characteristics analysis and lag space estimation," Computer Modeling in Engineering & Sciences, vol. 114, no. 3, pp. 277–293, 2018.

[21] W. Xiaodan, "Forecasting short-term wind speed using support vector machine with particle swarm optimization," in Proceedings of the 2017 International Conference on Sensing, Diagnostics, Prognostics and Control (SDPC), pp. 241–245, Shanghai, China, August 2017.

[22] F. R. Ningsih, E. C. Djamal, and A. Najmurrakhman, "Wind speed forecasting using recurrent neural networks and long short term memory," in Proceedings of the 2019 6th International Conference on Instrumentation, Control and Automation (ICA), pp. 137–141, Bandung, Indonesia, July–August 2019.

[23] P. K. Datta, "An artificial neural network approach for short-term wind speed forecast," Master's thesis, Kansas State University, Manhattan, KS, USA, 2018.

[24] J. Guanlong, D. Li, L. Yao, and P. Zhao, "An improved artificial bee colony-BP neural network algorithm in the short-term wind speed prediction," in Proceedings of the 2016 12th World Congress on Intelligent Control and Automation (WCICA), pp. 2252–2255, Guilin, China, June 2016.

[25] C. Gonggui, J. Chen, Z. Zhang, and Z. Sun, "Short-term wind speed forecasting based on fuzzy C-means clustering and improved MEA-BP," IAENG International Journal of Computer Science, vol. 46, no. 4, pp. 27–35, 2019.

[26] W. ZhanJie and A. B. M. Mazharul Mujib, "The weather forecast using data mining research based on cloud computing," Journal of Physics: Conference Series, vol. 910, no. 1, Article ID 012020, 2017.

[27] A. Zhu, X. Li, Z. Mo, and R. Wu, "Wind power prediction based on a convolutional neural network," in Proceedings of the 2017 International Conference on Circuits, Devices and Systems (ICCDS), pp. 131–135, Chengdu, China, September 2017.

[28] O. Linda, T. Vollmer, and M. Manic, "Neural network based intrusion detection system for critical infrastructures," in Proceedings of the 2009 International Joint Conference on Neural Networks, pp. 1827–1834, Atlanta, GA, USA, June 2009.

[29] C. Li, Z. Xiao, X. Xia, W. Zou, and C. Zhang, "A hybrid model based on synchronous optimisation for multi-step short-term wind speed forecasting," Applied Energy, vol. 215, pp. 131–144, 2018.

[30] National Wind Institute, West Texas Mesonet, online referencing, http://www.depts.ttu.edu, 2018.


The local sensitivity is found through the pooling layer so that the attributes can be obtained repeatedly. The existence of the convolution and pooling layers reduces the attribute resolution and the number of network parameters that require optimization.

CNN typically describes data and constructs them as a two-dimensional array and is extensively utilized in the area of image processing. In this paper, the CNN algorithm is configured to predict the wind speed and fitted to process a one-dimensional array of data. In the preprocessing phase, the one-dimensional data are reconstructed into a two-dimensional array; this enables the CNN algorithm to smoothly deal with the data. This creates two files, the property file and the response file, which are delivered as inputs to the CNN. The response file also contains the data of the expected output value.

Each sample is represented by a line from the property and the response files. Weights and biases can be obtained as soon as an acceptable number of samples to train the CNN is delivered. The training continues by comparing the regression results with the response values in order to reach the minimum possible error. This delivers the final trained CNN model, which is utilized to produce the needed predictions.

The fitting mechanism of CNN is pooling. Various computational approaches have proved that two approaches of pooling can be used: average pooling and maximum pooling. Images are stationary, and all parts of an image share similar attributes; therefore, the pooling approach applies the same average or maximum calculation to every part of a high-resolution image. The pooling process leads to a reduction in the statistical dimensions and an increase in the generalization strength of the model. The results are well optimized and have a lower possibility of overfitting.
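The two pooling variants described above can be sketched for the one-dimensional case used in this paper (an illustrative NumPy fragment with our own naming; non-overlapping windows assumed):

```python
import numpy as np

def pool1d(x, size, mode="max"):
    """Non-overlapping 1-D pooling: reduce resolution by taking the max or
    the average of each window of `size` consecutive values."""
    n = len(x) // size * size                       # drop any incomplete window
    windows = np.asarray(x[:n], float).reshape(-1, size)
    return windows.max(axis=1) if mode == "max" else windows.mean(axis=1)
```

Each window collapses to a single value, which is the dimensionality reduction and generalization effect the paragraph describes.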

2.3. ANN Algorithms. ANN has three layers which build up the network: the input, hidden, and output layers. These layers have the ability to correlate an input vector to an output scalar or vector using activation functions in the various neurons. The jth hidden neuron z_j can be computed from the p inputs and m hidden neurons using the following equation [14]:

\[
z_j = f_h\left(\sum_{i=1}^{p} w_{ij}\, y_{k-i}\right),
\tag{7}
\]

where w_{ij} is the connection weight from the ith input node to the jth hidden node, y_{k−i} is the previous wind speed i steps behind, and f_h(·) is the activation function in the hidden layer. Therefore, the future wind speed can be predicted through

\[
\hat{y}_k = f_o\left(\sum_{j=1}^{m} w_j z_j\right),
\tag{8}
\]

where w_j is the connection weight from the jth hidden node to the output node, ŷ_k is the predicted wind speed at the kth sampling moment, and f_o is the activation function for the output layer. By minimizing the error between the actual and predicted wind speeds, y_k and ŷ_k, respectively, using the Levenberg–Marquardt (LM) algorithm, the nonlinear mapping efficiency of the ANN can be obtained [28].
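The forward pass of equations (7) and (8) can be sketched directly (a minimal NumPy fragment with our own naming; biases are omitted, matching the equations as stated):

```python
import numpy as np

def ann_forecast(lags, W_hidden, w_out, f_h=np.tanh, f_o=lambda a: a):
    """One-hidden-layer forward pass: hidden activations z_j = f_h(sum_i
    w_ij * y_{k-i}); the forecast is y_hat_k = f_o(sum_j w_j * z_j)."""
    z = f_h(W_hidden @ lags)   # one z_j per hidden neuron (m of them)
    return f_o(w_out @ z)      # scalar one-step-ahead wind speed forecast
```

Training would then adjust `W_hidden` and `w_out` to minimize the error between y_k and ŷ_k, e.g. with the Levenberg–Marquardt algorithm mentioned above.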

2.4. SVM Algorithms. Assume a set of samples {x_i, y_i}, where i = 1, 2, ..., N, with input vectors x_i ∈ R^m and output values y_i ∈ R. The regression problem aims to identify a function f(x) that describes the correlation between inputs and outputs. The interest of SVR is to obtain a linear regression in the high-dimensional feature space delivered by mapping the primary input set with a predefined function ϕ(x), and to minimize the structural risk R[f]. This mechanism can be written as follows [15]:

\[
\begin{aligned}
f(x) &= w^{T}\phi(x) + b,\\
R[f] &= \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{N} L\bigl(x_i, y_i, f(x_i)\bigr),
\end{aligned}
\tag{9}
\]

where w, b, and C, respectively, are the regression coefficient vector, the bias term, and the punishment coefficient, and L(x_i, y_i, f(x_i)) is the ε-insensitive loss function. The regression problem can be handled by the following constrained optimization problem:

\[
\begin{aligned}
\min \quad & \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{N}\left(\zeta_i + \zeta_i^{\ast}\right)\\
\text{s.t.} \quad & y_i - \left(w^{T}\phi(x_i) + b\right) \le \varepsilon + \zeta_i,\\
& \left(w^{T}\phi(x_i) + b\right) - y_i \le \varepsilon + \zeta_i^{\ast},\\
& \zeta_i,\ \zeta_i^{\ast} \ge 0,\quad i = 1, 2, \ldots, N,
\end{aligned}
\tag{10}
\]

where ζ_i and ζ*_i represent the slack variables that keep the constraints feasible. By using Lagrange multipliers, the regression function can be written as follows:

\[
f(x) = \sum_{i=1}^{N}\left(a_i - a_i^{\ast}\right) K\left(x_i, x_j\right) + b,
\tag{11}
\]

where a_i and a*_i are the Lagrange multipliers that fulfil the conditions a_i ≥ 0, a*_i ≥ 0, and Σ_{i=1}^{N}(a_i − a*_i) = 0, and K(x_i, x_j) is a general kernel function. In this study, the well-known radial basis function (RBF) is chosen as the kernel function:


\[
K\left(x_i, x_j\right) = \exp\left(-\frac{\lVert x_i - x_j\rVert^{2}}{2\sigma^{2}}\right),
\tag{12}
\]

where σ defines the RBF kernel width [15].
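The RBF kernel of equation (12) can be written out directly (a minimal NumPy sketch; the function name is ours):

```python
import numpy as np

def rbf_kernel(xi, xj, sigma):
    """RBF kernel: K(xi, xj) = exp(-||xi - xj||^2 / (2 * sigma^2))."""
    d2 = np.sum((np.asarray(xi, float) - np.asarray(xj, float)) ** 2)
    return np.exp(-d2 / (2.0 * sigma ** 2))
```

Identical inputs give K = 1, and the value decays toward 0 as the inputs move apart, with σ controlling how quickly.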


)e execution time shown in Table 8 is calculated foreach method and time lapse to assure that the final andchosen model is efficient and can effectively predict futurewind speed )e shorter the time for execution is the moreefficient and helpful the model will be )is is also a sign thatthe model is efficient for further modifications According toTable 8 the ConvLSTM model beats all other models in thetime that it needed to process the historical data and deliver afinal prediction SVM needed 54 minutes to accomplish thetraining and produce testing results while ConvLSTM madeit in just 17 minutes )is huge difference between them hasmade the choice of using ConvLSTM

Table 3 Dataset characteristic for 5min sample

Dataset Max Median Min Mean StdAll datasets 1873 353 001 391 210Training dataset 1873 347 001 383 205Test dataset 1487 367 001 405 220

Table 4 Dataset characteristic for 30min sample

Dataset Max Median Min Mean StdAll datasets 1766 353 001 391 208Training dataset 1766 347 001 3837309 202Test dataset 1432 367 002 405 218

Table 5 Dataset characteristic for 1 hour sample

Dataset Max Median Min Mean StdAll datasets 1761 353 007 391 205Training dataset 1761 346 007 383 200Test dataset 1422 366 007 405 215

Table 6 Optimized internal parameters for the forecastingmethods

Method Sets of parameters

LSTM5min 17 hidden neurons30min 20 hidden neurons1 hour 8 hidden neurons

CNN5min 15 hidden neurons30min 5 hidden neurons1 hour 15 hidden neurons

ConvLSTM5min 15 hidden neurons30min 8 hidden neurons1 hour 20 hidden neurons

ANN5min 15 hidden neurons30min 15 hidden neurons1 hour 20 hidden neurons

SVR5min C 7 ε 01 c 02

30min C 1 ε 025 c 0151 hour C 1 ε 01 c 005

10 Computational Intelligence and Neuroscience

Figure 5 shows that the 5-minute lapse dataset is themost fitted dataset to our chosen model It declares howaccurate the prediction of future wind speed will be

For completeness to effectively evaluate the investi-gated forecasting techniques in terms of their predictionaccuracies 50 cross validation procedure is carried out inwhich the investigated techniques are built and thenevaluated on 50 different training and test datasets re-spectively randomly sampled from the available overalldataset )e ultimate performance metrics are then re-ported as the average and the standard deviation values ofthe 50 metrics obtained in each cross validation trial Inthis regard Figure 6 shows the average performancemetrics on the test dataset using the 50 cross validationprocedure It can be easily recognized that the forecastingmodels that employ the LSTM technique outperform theother investigated techniques in terms of the three per-formance metrics R2 RMSE and MAE

From the experimental results of short-term wind speedforecasting shown in Figure 6 we can observe thatConvLSTM performs the best in terms of forecasting metrics(R2 RMSE andMAE) as compared to the other models (ieCNN ANN SVR and LSTM) )e related statistical tests inTables 6 and 7 respectively have proved the effectiveness ofConvLSTM and its capability of handling noisy large dataConvLSTM showed that it can produce high accurate windspeed prediction with less lags and hidden neurons )iswas indeed reflected in the results shown in Table 8 withless computation time as compared to the other testedmodels Furthermore we introduced the multi-lags-one-step (MLOS) ahead forecasting combined with the hybridConvLSTMmodel to provide an efficient generalization fornew time series data to predict wind speed accuratelyResults showed that ConvLSTM proposed in this paperis an effective and promising model for wind speedforecasting

CNN ANN LSTM SVM ConvLSTM5min 09873 09873 09876 09815 0987630min 09091 09089 09097 09078 090981hour 08309 08302 08315 08211 08316

Val

ues

Models

R2

0009018027036045054063072081

09099

(a)

Val

ues

CNN ANN LSTM SVM ConvLSTM5min 02482 02483 02424 02462 0241930min 06578 06588 06467 06627 064631hour 08862 08881 08718 09116 08717

Models

RMSE

0010203040506070809

(b)

Val

ues

CNN ANN LSTM SVM ConvLSTM5min 01806 01798 01758 01855 0175530min 04802 04799 04721 04818 047411hour 06468 06473 06392 06591 06369

Models

MAE

0006012018024

03036042048054

06066

(c)

Figure 2 Modelsrsquo key performance indicators (KPIs)

Table 7 Optimum number of lags

Method 5min 30min 1 hourCNN 9 4 4ANN 3 3 3LSTM 7 6 10ConvLSTM 4 5 8SVR 9 4 5

Computational Intelligence and Neuroscience 11

ConvLSTM

50 75 100 125 150 175 20025Number of hidden neurons

0986909870098710987209873098740987509876

R-sq

uare

d

(a)

ConvLSTM

0986909870098710987209873098740987509876

RMSE

50 75 100 125 150 175 20025Number of hidden neurons

(b)

ConvLSTM

0176017701780179018001810182

MA

E

50 75 100 125 150 175 20025Number of hidden neurons

(c)

Figure 3 ConvLSTM measured statistical values and number of hidden neurons

2 4 6 8 10Number of lags

ConvLSTM

098575

098600

098625

098650

098675

098700

098725

098750

R-sq

uare

d

(a)

2 4 6 8 10Number of lags

ConvLSTM

02425

02450

02475

02500

02525

02550

02575

02600

RMSE

(b)

2 4 6 8 10Number of lags

ConvLSTM

0176

0178

0180

0182

0184

0186

0188

0190

MA

E

(c)

Figure 4 ConvLSTM measured statistical values and number of lags

12 Computational Intelligence and Neuroscience

Similar to our work the proposed EnsemLSTM modelby Chen et al [15] contained different clusters of LSTMwith different hidden layers and hidden neurons )eycombined LSTM clusters with SVR and external optimizerin order to enhance the generalization capability and ro-bustness of their model However their model showed ahigh computational complexity with mediocre perfor-mance indices Our proposed ConvLSTM with MLOS

assured boosting the generalization and robustness for thenew time series data as well as producing high performanceindices

6 Conclusions

In this study we proposed a hybrid deep learning-basedframework ConvLSTM for short-term prediction of the wind

Table 8 Execution time

(5min) time (min) (30min) time (min) (1 hour) time (min)ConvLSTM 17338 03849 01451SVR 541424 08214 02250CNN 087828 01322 00708ANN 07431 02591 00587LSTM 16570 03290 01473

5

4

3

2

1

0 25 50 75 100 125 150 175 200Number of samples

Win

d sp

eed

(mph

)ConvLSTM 5 min time span

PredictedTrue

Figure 5 ConvLSTM truepredicted wind speed and number of samples

0988

0986

0984

0982

098ConvLSTM LSTM ANN CNN SVR

R2

(a)

ConvLSTM LSTM ANN CNN SVR

026

024

022

02

RMSE

(b)

ConvLSTM LSTM ANN CNN SVR

02

019

018

017

016

015

MA

E

(c)

Figure 6 Average performance metrics obtained on the test dataset using 50 cross validation procedure

Computational Intelligence and Neuroscience 13

speed time series measurements )e proposed dynamicpredictionmodel was optimized for the number of input lagsand the number of internal hidden neurons Multi-lags-one-step (MLOS) ahead wind speed forecasting using the pro-posed approach showed superior results compared to fourother different models built using standard ANN CNNLSTM and SVM approaches )e proposed modelingframework combines the benefits of CNN and LSTM net-works in a hybrid modeling scheme that shows highly ac-curate wind speed prediction results with less lags andhidden neurons as well as less computational complexityFor future work further investigation can be done to im-prove the accuracy of the ConvLSTM model for instanceincreasing and optimizing the number of hidden layersapplying a multi-lags-multi-steps (MLMS) ahead forecast-ing and introducing reinforcement learning agent to op-timize the parameters as compared to other optimizationmethods

Data Availability

)ewind speed data used in this study have been taken fromthe West Texas Mesonet of the US National Wind Institute(httpwwwdeptsttuedunwiresearchfacilitieswtmindexphp) Data are provided freely for academic researchpurposes only and cannot be shareddistributed beyondacademic research use without permission from the WestTexas Mesonet

Conflicts of Interest

)e authors declare that they have no conflicts of interest

Acknowledgments

)e authors acknowledge the help of Prof Brian Hirth inTexas Tech University for providing them with access toweather data

References

[1] R Rothermel ldquoHow to predict the spread and intensity offorest and range firesrdquo General Technical Report INT-143Intermountain Forest and Range Experiment Station OgdenUT USA 1983

[2] A Tascikaraoglu and M Uzunoglu ldquoA review of combinedapproaches for prediction of short-term wind speed andpowerrdquo Renewable and Sustainable Energy Reviews vol 34pp 243ndash254 2014

[3] R J Barthelmie F Murray and S C Pryor ldquo)e economicbenefit of short-term forecasting for wind energy in the UKelectricity marketrdquo Energy Policy vol 36 no 5 pp 1687ndash1696 2008

[4] Y-K Wu and H Jing-Shan A Literature Review of WindForecasting Technology in the World pp 504ndash509 IEEELausanne Powertech Lausanne Switzerland 2007

[5] W S McCulloch and W Pitts ldquoA logical calculus of the ideasimmanent in nervous activityrdquo Be Bulletin of MathematicalBiophysics vol 5 no 4 pp 115ndash133 1943

[6] K Fukushima ldquoNeocognitronrdquo Scholarpedia vol 2 no 1p 1717 2007

[7] K Fukushima ldquoNeocognitron a self-organizing neural net-work model for a mechanism of pattern recognition unaf-fected by shift in positionrdquo Biological Cybernetics vol 36no 4 pp 193ndash202 1980

[8] S Hochreite and J Schmidhuber ldquoLong short-termmemoryrdquoNeural Computation vol 9 no 8 pp 1735ndash1780 1997

[9] H T Siegelmann and E D Sontag ldquoOn the computationalpower of neural netsrdquo Journal of Computer and System Sci-ences vol 50 no 1 pp 132ndash150 1995

[10] R Sunil ldquoUnderstanding support vector machine algorithmrdquohttpswwwanalyticsvidhyacomblog201709understaing-support-vector-machine-example-code 2017

[11] X Shi Z Chen H Wang D Y Yeung W K Wong andW C Woo ldquoConvolutional LSTM network a machinelearning approach for precipitation nowcastingrdquo in Pro-ceedings of the Neural Information Processing Systemspp 7ndash12 Montreal Canada December 2015

[12] C Finn ldquoGoodfellow I and Levine S Unsupervised learningfor physical interaction through video predictionrdquo in Pro-ceedings of the Annual Conference Neural Information Pro-cessing Systems pp 64ndash72 Barcelona Spain December 2016

[13] H Xu Y Gao F Yu and T Darrell ldquoEnd-to-end learning ofdriving models from large-scale video datasetsrdquo in Proceed-ings of the 2017 IEEE Conference on Computer Vision andPattern Recognition pp 3530ndash3538 Honolulu HI USA July2017

[14] K Chen and J Yu ldquoShort-term wind speed prediction usingan unscented Kalman filter based state-space support vectorregression approachrdquo Applied Energy vol 113 pp 690ndash7052014

[15] J Chen G-Q Zeng W Zhou W Du and K-D Lu ldquoWindspeed forecasting using nonlinear-learning ensemble of deeplearning time series prediction and extremal optimizationrdquoEnergy Conversion and Management vol 165 pp 681ndash6952018

[16] D Liu D Niu H Wang and L Fan ldquoShort-term wind speedforecasting using wavelet transform and support vectormachines optimized by genetic algorithmrdquo Renewable Energyvol 62 pp 592ndash597 2014

[17] Y Wang ldquoShort-term wind power forecasting by geneticalgorithm of wavelet neural networkrdquo in Proceedings of the2014 International Conference on Information Science Elec-tronics and Electrical Engineering pp 1752ndash1755 SapporoJapan April 2014

[18] N Sheikh D B Talukdar R I Rasel and N Sultana ldquoA shortterm wind speed forcasting using svr and bp-ann a com-parative analysisrdquo in Proceedings of the 2017 20th Interna-tional Conference of Computer and Information Technology(ICCIT) pp 1ndash6 Dhaka Bangladesh December 2017

[19] H Nantian C Yuan G Cai and E Xing ldquoHybrid short termwind speed forecasting using variational mode decompositionand a weighted regularized extreme learning machinerdquo En-ergies vol 9 no 12 pp 989ndash1007 2016

[20] S Haijian and X Deng ldquoAdaBoosting neural network forshort-term wind speed forecasting based on seasonal char-acteristics analysis and lag space estimationrdquo ComputerModeling in Engineering amp Sciences vol 114 no 3pp 277ndash293 2018

[21] W Xiaodan ldquoForecasting short-term wind speed usingsupport vector machine with particle swarm optimizationrdquo inProceedings of the 2017 International Conference on SensingDiagnostics Prognostics and Control (SDPC) pp 241ndash245Shanghai China August 2017

14 Computational Intelligence and Neuroscience

[22] F R Ningsih E C Djamal and A Najmurrakhman ldquoWindspeed forecasting using recurrent neural networks and longshort term memoryrdquo in Proceedings of the 2019 6th Inter-national Conference on Instrumentation Control and Auto-mation (ICA) pp 137ndash141 Bandung Indonesia JulyndashAugust2019

[23] P K Datta ldquoAn artificial neural network approach for short-term wind speed forecastrdquo Master thesis Kansas State Uni-versity Manhattan KS USA 2018

[24] J Guanlong D Li L Yao and P Zhao ldquoAn improved ar-tificial bee colony-BP neural network algorithm in the short-term wind speed predictionrdquo in Proceedings of the 2016 12thWorld Congress on Intelligent Control and Automation(WCICA) pp 2252ndash2255 Guilin China June 2016

[25] C Gonggui J Chen Z Zhang and Z Sun ldquoShort-term windspeed forecasting based on fuzzy C-means clustering andimproved MEA-BPrdquo IAENG International Journal of Com-puter Science vol 46 no 4 pp 27ndash35 2019

[26] W ZhanJie and A B M Mazharul Mujib ldquo)e weatherforecast using data mining research based on cloud com-putingrdquo in Journal of Physics Conference Series vol 910 no 1Article ID 012020 2017

[27] A Zhu X Li Z Mo and R Wu ldquoWind power predictionbased on a convolutional neural networkrdquo in Proceedings ofthe 2017 International Conference on Circuits Devices AndSystems (ICCDS) pp 131ndash135 Chengdu China September2017

[28] O Linda T Vollmer and M Manic ldquoNeural network basedintrusion detection system for critical infrastructuresrdquo inProceedings of the 2009 International Joint Conference onNeural Networks pp 1827ndash1834 Atlanta GA USA June2009

[29] C Li Z Xiao X Xia W Zou and C Zhang ldquoA hybrid modelbased on synchronous optimisation for multi-step short-termwind speed forecastingrdquoApplied Energy vol 215 pp 131ndash1442018

[30] Institute National Wind West Texas Mesonet Online Ref-erencing httpwwwdeptsttueducom 2018

Computational Intelligence and Neuroscience 15

K(x_i, x_j) = \exp\left( -\frac{\left\| x_i - x_j \right\|^2}{2\sigma^2} \right), \qquad (12)

where σ defines the RBF kernel width [15].
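As an illustration, the kernel in equation (12) can be sketched in a few lines of numpy (the function name and the default σ are ours, not from the paper):

```python
import numpy as np

def rbf_kernel(xi, xj, sigma=1.0):
    """RBF kernel of equation (12): exp(-||xi - xj||^2 / (2 * sigma^2))."""
    diff = np.asarray(xi, dtype=float) - np.asarray(xj, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma**2)))

print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))  # identical inputs -> 1.0
```

A larger σ makes the similarity decay more slowly with distance, which is exactly the width effect described above.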

2.5. ConvLSTM Algorithm. ConvLSTM is designed to be trained on the spatial information in the dataset, and it is intended to handle 3-dimensional data as input. Furthermore, it replaces matrix multiplication with a convolution operation at every gate of the LSTM cell. By doing so, it can capture the underlying spatial features in multidimensional data. The formulas used at each of the gates (input, forget, and output) are as follows:

\begin{aligned}
i_t &= \sigma\left(W_{xi} * x_t + W_{hi} * h_{t-1} + b_i\right),\\
f_t &= \sigma\left(W_{xf} * x_t + W_{hf} * h_{t-1} + b_f\right),\\
o_t &= \sigma\left(W_{xo} * x_t + W_{ho} * h_{t-1} + b_o\right),\\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * x_t + W_{hc} * h_{t-1} + b_c\right),\\
H_t &= o_t \circ \tanh\left(C_t\right),
\end{aligned} \qquad (13)

where i_t, f_t, and o_t are the input, forget, and output gates, and W is the weight matrix, while x_t is the current input data, h_{t-1} is the previous hidden output, and C_t is the cell state.

The difference from the standard LSTM equations is that the matrix multiplication (·) is substituted by the convolution operation (*) between W and each of x_t and h_{t-1} at every gate. By doing so, the fully connected layer is replaced by a convolutional layer; thus, the number of weight parameters in the model can be significantly reduced.
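A minimal single-channel, 1-D sketch of one cell update of equation (13) can be written with numpy; this is our illustrative simplification, not the authors' implementation (a full model would use a 2-D convolutional variant such as Keras's ConvLSTM2D layer), and all names here are ours:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(x_t, h_prev, c_prev, W, b):
    """One ConvLSTM cell update following equation (13): every matrix
    product is replaced by a 1-D convolution (*), and the state updates
    use elementwise (Hadamard) products."""
    conv = lambda w, v: np.convolve(v, w, mode="same")
    i_t = sigmoid(conv(W["xi"], x_t) + conv(W["hi"], h_prev) + b["i"])
    f_t = sigmoid(conv(W["xf"], x_t) + conv(W["hf"], h_prev) + b["f"])
    o_t = sigmoid(conv(W["xo"], x_t) + conv(W["ho"], h_prev) + b["o"])
    c_t = f_t * c_prev + i_t * np.tanh(conv(W["xc"], x_t) + conv(W["hc"], h_prev) + b["c"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

# Toy usage: size-3 convolution kernels over an 8-sample window.
rng = np.random.default_rng(1)
kernels = {k: 0.1 * rng.standard_normal(3) for k in ("xi", "hi", "xf", "hf", "xo", "ho", "xc", "hc")}
biases = {k: 0.0 for k in ("i", "f", "o", "c")}
h, c = convlstm_step(rng.standard_normal(8), np.zeros(8), np.zeros(8), kernels, biases)
```

Because the same small kernel slides across the whole window, the weight count depends on the kernel size rather than the window size, which is the parameter reduction noted above.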

3. Methodology

Due to the nonlinear, nonstationary attributes and the stochastic variations of the wind speed time series, accurate prediction of wind speed is known to be challenging [29]. In this work, to improve the accuracy of the wind speed forecasting model, a comparison between five models is conducted to forecast wind speed from the available historical data. A concept called multi-lags-one-step (MLOS) ahead forecasting is employed to illustrate its effect on the accuracies of the five models. Assume that we are at time index X_t. To forecast one output element in the future, X_{t+1}, the input dataset can be split into many lags (past data) X_{t-i}, where i ∈ {1, …, 10}. By doing so, the model can be trained on more elements before predicting a single event in the future. In addition, the model accuracy showed an improvement until it reached the optimum lagging point, which had the best accuracy. Beyond this point, the model accuracy degrades, as will be illustrated in the Results section.
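The MLOS construction above amounts to a sliding window over the series; a minimal sketch (function name is ours):

```python
import numpy as np

def make_mlos_arrays(series, n_lags):
    """Build multi-lags-one-step (MLOS) pairs: each row of X holds n_lags
    past values, and y holds the single next value to be forecast."""
    s = np.asarray(series, dtype=float)
    X = np.stack([s[i:i + n_lags] for i in range(len(s) - n_lags)])
    y = s[n_lags:]
    return X, y

X, y = make_mlos_arrays([1, 2, 3, 4, 5, 6], n_lags=3)
print(X.tolist())  # [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
print(y.tolist())  # [4.0, 5.0, 6.0]
```

Sweeping n_lags from 1 to 10 and scoring each setting is how the optimum lagging point described above is located.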

Figure 1 illustrates the workflow of the forecasting model. Specifically, the proposed methodology entails four steps.

In Step 1, data have been collected and averaged from 5 minutes to 30 minutes and to 1 hour, respectively. The datasets are then standardized to a mean value of 0 and a standard deviation of 1. The lagging stage is very important in Step 2, as the data are split into different lags to study the effect of training the models on more than one element (input) to predict a single event in the future. In Step 3, the models are applied, taking into consideration that some models, such as CNN, LSTM, and ConvLSTM, need to be adjusted from a matrix-shape perspective; these models normally work with 2D data or more, so matrix manipulation and reshaping are conducted at this stage. For checking and evaluating the proposed models in Step 4, three main metrics are used to validate the case study (MAE, RMSE, and R²). In addition, the execution time and the optimum lag are taken into account to select the best model.
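The Step 1 standardization can be sketched as follows; returning the fitted statistics lets forecasts be mapped back to mph later (an illustrative helper, not the paper's code):

```python
import numpy as np

def standardize(series):
    """Scale a series to zero mean and unit standard deviation, returning
    the statistics needed to invert the transform on the forecasts."""
    s = np.asarray(series, dtype=float)
    mu, sd = s.mean(), s.std()
    return (s - mu) / sd, mu, sd

z, mu, sd = standardize([2.0, 4.0, 6.0, 8.0])
```

In practice mu and sd would be estimated on the training split only and reused for the validation and test splits.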

Algorithm 1 illustrates the training procedure for ConvLSTM.

4. Collected Data

Table 3 illustrates the characteristics of the collected data at the 5-minute time span. The data are collected from a real wind speed dataset over a three-year period from the West Texas Mesonet, with a 5-minute observation period, from near Lake Alan Garza [30]. The data are processed by averaging from 5 minutes to 30 minutes (whose statistical characteristics are given in Table 4) and once more to 1 hour (whose statistical characteristics are given in Table 5). The goal of averaging is to study the effect of reducing the data size in order to compare the five models and then select the one that achieves the highest accuracy for the three dataset cases. As shown in the three tables, the datasets are almost identical and preserve their seasonality; they are not affected by the averaging process.

The data have been split into three sets (training, validation, and test) with fractions of 53%, 14%, and 33%, respectively.
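The averaging and the chronological split can be sketched as below (helper names and the handling of an incomplete tail block are our assumptions):

```python
import numpy as np

def average_blocks(series, k):
    """Average consecutive blocks of k samples: k=6 turns 5-min data into
    30-min data, k=12 into 1-hour data (an incomplete tail is dropped)."""
    s = np.asarray(series, dtype=float)
    n = (len(s) // k) * k
    return s[:n].reshape(-1, k).mean(axis=1)

def chrono_split(series, fracs=(0.53, 0.14)):
    """Chronological 53% / 14% / 33% training/validation/test split."""
    n = len(series)
    i = int(fracs[0] * n)
    j = i + int(fracs[1] * n)
    return series[:i], series[i:j], series[j:]

print(average_blocks(np.arange(12.0), k=6).tolist())  # [2.5, 8.5]
```

A chronological (rather than shuffled) split keeps the test set strictly in the future of the training set, which matches the forecasting setting.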

5. Results and Discussion

To quantitatively evaluate the performance of the predictive models, commonly used statistical measures are adopted [20]. All of them quantify the deviation between the actual and predicted wind speed values. Specifically, RMSE, MAE, and R² are defined as follows:

\begin{aligned}
\mathrm{RMSE} &= \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^2},\\
\mathrm{MAE} &= \frac{1}{N} \sum_{i=1}^{N} \left|y_i - \hat{y}_i\right|,\\
R^2 &= 1 - \frac{\sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N} \left(y_i - \bar{y}\right)^2},
\end{aligned} \qquad (14)

where y_i and ŷ_i are the actual and the predicted wind speed, respectively, while ȳ is the mean value of the actual wind speed sequence. Typically, smaller values of these measures indicate an improved forecasting procedure, while R² is the goodness-of-fit measure for the model; therefore, the larger its value, the better the model fits. The testbed environment configuration is as follows:
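The three measures of equation (14) translate directly into numpy (an illustrative helper, not the authors' code):

```python
import numpy as np

def forecast_metrics(y_true, y_pred):
    """RMSE, MAE, and R^2 exactly as defined in equation (14)."""
    y = np.asarray(y_true, dtype=float)
    yhat = np.asarray(y_pred, dtype=float)
    rmse = float(np.sqrt(np.mean((y - yhat) ** 2)))
    mae = float(np.mean(np.abs(y - yhat)))
    r2 = 1.0 - float(np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2))
    return rmse, mae, r2

rmse, mae, r2 = forecast_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(round(mae, 4), round(r2, 4))  # 0.15 0.98
```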

8 Computational Intelligence and Neuroscience

Figure 1: The proposed forecasting methodology. Raw time-series data are collected, averaged (5 min, 30 min, 1 hr), and preprocessed; the lagging stage builds lags L1, L2, …, L10; the modeling stage trains ANN, CNN, LSTM, ConvLSTM, and SVR; the evaluation stage scores the forecasting results with the KPIs MSE, RMSE, MAE, R², delay, and execution time to select the best fit.

Input: the wind speed time series data
Output: forecasting performance indices

(1) The wind speed time series data are measured every 5 minutes and averaged twice, to 30 minutes and 1 hour, respectively.
(2) The wind datasets are split into Training, Validation, and Test sets.
(3) Initiate the multi-lags-one-step (MLOS) arrays for the Training, Validation, and Test sets.
(4) Define the MLOS range as 1 to 10 to optimize the number of needed lags.
(5) loop 1:
      Split the first set based on the MLOS range
      Initiate and extract set features with the CNN layer
      Pass the output to a defined LSTM layer
      Select the first value in the range of the number of hidden neurons
      Generate prediction results and performance indices
      Count the time to execute and produce prediction results
      Save and compare results with previous ones
    loop 2:
      Select the next MLOS value
      If MLOS value > maximum range, then go to loop 3 and initialize the MLOS range
      Go to loop 1
    loop 3:
      Select a new number of hidden neurons
      If the number of hidden neurons > maximum range, then go to loop 4 and initialize the number-of-hidden-neurons range
      Go to loop 1
(6) loop 4:
      Select a new dataset from the sets {5 min, 30 min, 1 hr} and go to loop 1
      If all sets have been tested, then:
        Generate the performance indices of all tested sets
        Select the best result metrics

ALGORITHM 1: ConvLSTM training.
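The outer loops of Algorithm 1 are an ordinary grid search. The sketch below sweeps the MLOS lag count and keeps the setting with the lowest validation RMSE; a linear least-squares predictor stands in for the CNN + LSTM stages (which would need a deep learning framework), and the hidden-neuron loop is omitted for brevity — all names and the synthetic series are ours:

```python
import numpy as np

def lag_matrix(series, n_lags):
    X = np.stack([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    return X, series[n_lags:]

def search_best_lags(series, lag_range=range(1, 11)):
    """Skeleton of Algorithm 1's lag loop: try each MLOS lag count and
    keep the one with the lowest validation RMSE."""
    best = None
    for n_lags in lag_range:
        X, y = lag_matrix(series, n_lags)
        A = np.hstack([X, np.ones((len(X), 1))])   # lags plus intercept
        cut = int(0.67 * len(y))                   # chronological split
        coef, *_ = np.linalg.lstsq(A[:cut], y[:cut], rcond=None)
        rmse = float(np.sqrt(np.mean((A[cut:] @ coef - y[cut:]) ** 2)))
        if best is None or rmse < best[1]:
            best = (n_lags, rmse)
    return best

rng = np.random.default_rng(0)
t = np.arange(400.0)
series = np.sin(2 * np.pi * t / 24.0) + 0.1 * rng.standard_normal(400)
n_lags, val_rmse = search_best_lags(series)
```

In the paper's pipeline the inner call would train the ConvLSTM and also sweep the number of hidden neurons, but the loop structure is the same.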


(1) CPU: Intel(R) Core(TM) i7-8550U @ 1.80 GHz (2001 MHz), 4 cores, 8 logical processors
(2) RAM: 16.0 GB installed physical memory
(3) GPU: AMD Radeon(TM) RX 550, 1.0 GB
(4) Framework: Anaconda 2019.07, Python 3.7

Table 6 lists the chosen optimized internal parameters (hyperparameters) for the forecasting methods used in this work. For each method, the optimal number of hidden neurons is chosen to achieve the maximum R² and the minimum RMSE and MAE values.

After the implementation of CNN, ANN, LSTM, ConvLSTM, and SVM, the best-fitting model was chosen based on its accuracy in predicting future wind speed values. Thus, the seasonality is considered in the forecast mechanism. The chosen model has to deliver the best fit with the least amount of error, taking into consideration the nature of the data and not applying naive forecasting to it.

To achieve this goal, the statistical error indicators are calculated for every model and time lapse, as Figure 2 illustrates. The results suggest that the ConvLSTM model has the best performance compared with the other four models: the chosen model reaches the minimum RMSE and MAE values along with the maximum R² value.

Different parameters are also tested to ensure the right decision in choosing the best-fitting model. The optimum number of lags, presented in Table 7, is one of the most important indicators in selecting the best-fitting model, since the fewer historical points the model needs, the lower the computational effort. For each method, the optimal number of lags is chosen to achieve the maximum R² and the minimum RMSE and MAE values. For instance, Figures 3 and 4 show the relation between the statistical measures and the number of hidden neurons and lags, respectively, for the proposed ConvLSTM method in the 5-minute time span case. It can be seen that 4 lags and 15 hidden neurons achieved the maximum R² and minimum RMSE and MAE values.

The execution time shown in Table 8 is calculated for each method and time lapse to verify that the chosen model is efficient and can effectively predict future wind speed. The shorter the execution time, the more efficient and useful the model; this is also a sign that the model lends itself to further modifications. According to Table 8, the ConvLSTM model needed far less time than SVM to process the historical data and deliver a final prediction: SVM needed about 54 minutes to accomplish the training and produce testing results, while ConvLSTM did so in just over 17 minutes. This large difference motivated the choice of ConvLSTM.
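Execution times of the kind reported in Table 8 can be measured with a small wall-clock wrapper (an illustrative helper, not the paper's tooling):

```python
import time

def timed_minutes(fn, *args, **kwargs):
    """Run fn and return its result together with the wall-clock
    execution time in minutes, as used for Table 8-style comparisons."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, (time.perf_counter() - start) / 60.0

total, minutes = timed_minutes(sum, range(1_000_000))
```

In practice fn would be the full train-and-test routine of each model, run on the same machine and dataset so the timings are comparable.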

Table 3: Dataset characteristics for the 5 min sample.

Dataset            Max     Median  Min    Mean   Std
All datasets       18.73   3.53    0.01   3.91   2.10
Training dataset   18.73   3.47    0.01   3.83   2.05
Test dataset       14.87   3.67    0.01   4.05   2.20

Table 4: Dataset characteristics for the 30 min sample.

Dataset            Max     Median  Min    Mean      Std
All datasets       17.66   3.53    0.01   3.91      2.08
Training dataset   17.66   3.47    0.01   3.837309  2.02
Test dataset       14.32   3.67    0.02   4.05      2.18

Table 5: Dataset characteristics for the 1 hour sample.

Dataset            Max     Median  Min    Mean   Std
All datasets       17.61   3.53    0.07   3.91   2.05
Training dataset   17.61   3.46    0.07   3.83   2.00
Test dataset       14.22   3.66    0.07   4.05   2.15

Table 6: Optimized internal parameters for the forecasting methods.

Method     5 min                      30 min                      1 hour
LSTM       17 hidden neurons          20 hidden neurons           8 hidden neurons
CNN        15 hidden neurons          5 hidden neurons            15 hidden neurons
ConvLSTM   15 hidden neurons          8 hidden neurons            20 hidden neurons
ANN        15 hidden neurons          15 hidden neurons           20 hidden neurons
SVR        C = 7, ε = 0.1, γ = 0.2    C = 1, ε = 0.25, γ = 0.15   C = 1, ε = 0.1, γ = 0.05


Figure 5 shows that the 5-minute lapse dataset is the best fit to our chosen model; it demonstrates how accurate the prediction of future wind speed can be.

For completeness, to effectively evaluate the investigated forecasting techniques in terms of their prediction accuracies, a 50-trial cross-validation procedure is carried out, in which the investigated techniques are built and then evaluated on 50 different training and test datasets, respectively, randomly sampled from the available overall dataset. The ultimate performance metrics are then reported as the average and standard deviation of the 50 metrics obtained across the cross-validation trials. In this regard, Figure 6 shows the average performance metrics on the test dataset using the 50-trial cross-validation procedure. It can easily be recognized that the forecasting models that employ the LSTM technique (LSTM and ConvLSTM) outperform the other investigated techniques in terms of the three performance metrics: R², RMSE, and MAE.
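The resampling protocol described above can be sketched as follows; `mean_model` is a toy stand-in for the trained networks, and all names are ours:

```python
import numpy as np

def repeated_cv_rmse(X, y, fit_fn, trials=50, test_frac=0.33, seed=0):
    """Evaluate a model builder on `trials` random train/test resamples and
    report the mean and standard deviation of the test RMSE."""
    rng = np.random.default_rng(seed)
    n_test = int(test_frac * len(y))
    scores = []
    for _ in range(trials):
        idx = rng.permutation(len(y))
        test, train = idx[:n_test], idx[n_test:]
        predict = fit_fn(X[train], y[train])
        err = predict(X[test]) - y[test]
        scores.append(float(np.sqrt(np.mean(err ** 2))))
    return float(np.mean(scores)), float(np.std(scores))

def mean_model(X_tr, y_tr):
    """Trivial stand-in model: always predict the training mean."""
    mu = float(y_tr.mean())
    return lambda X: np.full(len(X), mu)

X = np.arange(100.0).reshape(-1, 1)
y = np.arange(100.0)
avg_rmse, std_rmse = repeated_cv_rmse(X, y, mean_model)
```

Reporting both the mean and the standard deviation over the 50 trials shows not only which model is most accurate on average but also how stable that accuracy is across resamples.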

From the experimental results of short-term wind speed forecasting shown in Figure 6, we can observe that ConvLSTM performs the best in terms of the forecasting metrics (R², RMSE, and MAE) compared with the other models (i.e., CNN, ANN, SVR, and LSTM). The related results in Tables 6 and 7, respectively, support the effectiveness of ConvLSTM and its capability of handling noisy, large data. ConvLSTM showed that it can produce highly accurate wind speed predictions with fewer lags and hidden neurons; this was indeed reflected in the results shown in Table 8, with less computation time compared with the other tested models. Furthermore, we introduced multi-lags-one-step (MLOS) ahead forecasting combined with the hybrid ConvLSTM model to provide efficient generalization to new time series data and predict wind speed accurately. The results showed that the ConvLSTM proposed in this paper is an effective and promising model for wind speed forecasting.

Figure 2: Models' key performance indicators (KPIs).

(a) R²
Time span   CNN      ANN      LSTM     SVM      ConvLSTM
5 min       0.9873   0.9873   0.9876   0.9815   0.9876
30 min      0.9091   0.9089   0.9097   0.9078   0.9098
1 hour      0.8309   0.8302   0.8315   0.8211   0.8316

(b) RMSE
Time span   CNN      ANN      LSTM     SVM      ConvLSTM
5 min       0.2482   0.2483   0.2424   0.2462   0.2419
30 min      0.6578   0.6588   0.6467   0.6627   0.6463
1 hour      0.8862   0.8881   0.8718   0.9116   0.8717

(c) MAE
Time span   CNN      ANN      LSTM     SVM      ConvLSTM
5 min       0.1806   0.1798   0.1758   0.1855   0.1755
30 min      0.4802   0.4799   0.4721   0.4818   0.4741
1 hour      0.6468   0.6473   0.6392   0.6591   0.6369

Table 7: Optimum number of lags.

Method     5 min   30 min   1 hour
CNN        9       4        4
ANN        3       3        3
LSTM       7       6        10
ConvLSTM   4       5        8
SVR        9       4        5


Figure 3: ConvLSTM measured statistical values versus the number of hidden neurons (5 min time span case): (a) R-squared, (b) RMSE, (c) MAE.

Figure 4: ConvLSTM measured statistical values versus the number of lags (5 min time span case): (a) R-squared, (b) RMSE, (c) MAE.


Similar to our work, the EnsemLSTM model proposed by Chen et al. [15] contained different clusters of LSTMs with different hidden layers and hidden neurons. They combined the LSTM clusters with SVR and an external optimizer in order to enhance the generalization capability and robustness of their model. However, their model showed high computational complexity with mediocre performance indices. Our proposed ConvLSTM with MLOS assured boosted generalization and robustness for new time series data, as well as high performance indices.

Table 8: Execution time (minutes).

Method     5 min     30 min   1 hour
ConvLSTM   17.338    0.3849   0.1451
SVR        54.1424   0.8214   0.2250
CNN        0.87828   0.1322   0.0708
ANN        0.7431    0.2591   0.0587
LSTM       16.570    0.3290   0.1473

Figure 5: ConvLSTM true versus predicted wind speed (mph) over 200 samples for the 5 min time span.

Figure 6: Average performance metrics obtained on the test dataset using the 50-trial cross-validation procedure, for ConvLSTM, LSTM, ANN, CNN, and SVR: (a) R², (b) RMSE, (c) MAE.

6. Conclusions

In this study, we proposed a hybrid deep learning-based framework, ConvLSTM, for short-term prediction of wind speed time series measurements. The proposed dynamic prediction model was optimized for the number of input lags and the number of internal hidden neurons. Multi-lags-one-step (MLOS) ahead wind speed forecasting using the proposed approach showed superior results compared with four other models built using standard ANN, CNN, LSTM, and SVM approaches. The proposed modeling framework combines the benefits of CNN and LSTM networks in a hybrid modeling scheme that delivers highly accurate wind speed predictions with fewer lags and hidden neurons, as well as lower computational complexity. For future work, further investigation can be done to improve the accuracy of the ConvLSTM model, for instance, by increasing and optimizing the number of hidden layers, applying multi-lags-multi-steps (MLMS) ahead forecasting, and introducing a reinforcement learning agent to optimize the parameters in comparison with other optimization methods.

Data Availability

The wind speed data used in this study have been taken from the West Texas Mesonet of the US National Wind Institute (http://www.depts.ttu.edu/nwi/research/facilities/wtm/index.php). Data are provided freely for academic research purposes only and cannot be shared/distributed beyond academic research use without permission from the West Texas Mesonet.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors acknowledge the help of Prof. Brian Hirth at Texas Tech University for providing them with access to weather data.

References

[1] R. Rothermel, "How to predict the spread and intensity of forest and range fires," General Technical Report INT-143, Intermountain Forest and Range Experiment Station, Ogden, UT, USA, 1983.
[2] A. Tascikaraoglu and M. Uzunoglu, "A review of combined approaches for prediction of short-term wind speed and power," Renewable and Sustainable Energy Reviews, vol. 34, pp. 243–254, 2014.
[3] R. J. Barthelmie, F. Murray, and S. C. Pryor, "The economic benefit of short-term forecasting for wind energy in the UK electricity market," Energy Policy, vol. 36, no. 5, pp. 1687–1696, 2008.
[4] Y.-K. Wu and H. Jing-Shan, A Literature Review of Wind Forecasting Technology in the World, pp. 504–509, IEEE Lausanne Powertech, Lausanne, Switzerland, 2007.
[5] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," The Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115–133, 1943.
[6] K. Fukushima, "Neocognitron," Scholarpedia, vol. 2, no. 1, p. 1717, 2007.
[7] K. Fukushima, "Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics, vol. 36, no. 4, pp. 193–202, 1980.
[8] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[9] H. T. Siegelmann and E. D. Sontag, "On the computational power of neural nets," Journal of Computer and System Sciences, vol. 50, no. 1, pp. 132–150, 1995.
[10] R. Sunil, "Understanding support vector machine algorithm," https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code, 2017.
[11] X. Shi, Z. Chen, H. Wang, D. Y. Yeung, W. K. Wong, and W. C. Woo, "Convolutional LSTM network: a machine learning approach for precipitation nowcasting," in Proceedings of the Neural Information Processing Systems, pp. 7–12, Montreal, Canada, December 2015.
[12] C. Finn, I. Goodfellow, and S. Levine, "Unsupervised learning for physical interaction through video prediction," in Proceedings of the Annual Conference on Neural Information Processing Systems, pp. 64–72, Barcelona, Spain, December 2016.
[13] H. Xu, Y. Gao, F. Yu, and T. Darrell, "End-to-end learning of driving models from large-scale video datasets," in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3530–3538, Honolulu, HI, USA, July 2017.
[14] K. Chen and J. Yu, "Short-term wind speed prediction using an unscented Kalman filter based state-space support vector regression approach," Applied Energy, vol. 113, pp. 690–705, 2014.
[15] J. Chen, G.-Q. Zeng, W. Zhou, W. Du, and K.-D. Lu, "Wind speed forecasting using nonlinear-learning ensemble of deep learning time series prediction and extremal optimization," Energy Conversion and Management, vol. 165, pp. 681–695, 2018.
[16] D. Liu, D. Niu, H. Wang, and L. Fan, "Short-term wind speed forecasting using wavelet transform and support vector machines optimized by genetic algorithm," Renewable Energy, vol. 62, pp. 592–597, 2014.
[17] Y. Wang, "Short-term wind power forecasting by genetic algorithm of wavelet neural network," in Proceedings of the 2014 International Conference on Information Science, Electronics and Electrical Engineering, pp. 1752–1755, Sapporo, Japan, April 2014.
[18] N. Sheikh, D. B. Talukdar, R. I. Rasel, and N. Sultana, "A short term wind speed forcasting using SVR and BP-ANN: a comparative analysis," in Proceedings of the 2017 20th International Conference of Computer and Information Technology (ICCIT), pp. 1–6, Dhaka, Bangladesh, December 2017.
[19] H. Nantian, C. Yuan, G. Cai, and E. Xing, "Hybrid short term wind speed forecasting using variational mode decomposition and a weighted regularized extreme learning machine," Energies, vol. 9, no. 12, pp. 989–1007, 2016.
[20] S. Haijian and X. Deng, "AdaBoosting neural network for short-term wind speed forecasting based on seasonal characteristics analysis and lag space estimation," Computer Modeling in Engineering & Sciences, vol. 114, no. 3, pp. 277–293, 2018.
[21] W. Xiaodan, "Forecasting short-term wind speed using support vector machine with particle swarm optimization," in Proceedings of the 2017 International Conference on Sensing, Diagnostics, Prognostics, and Control (SDPC), pp. 241–245, Shanghai, China, August 2017.
[22] F. R. Ningsih, E. C. Djamal, and A. Najmurrakhman, "Wind speed forecasting using recurrent neural networks and long short term memory," in Proceedings of the 2019 6th International Conference on Instrumentation, Control, and Automation (ICA), pp. 137–141, Bandung, Indonesia, July–August 2019.
[23] P. K. Datta, "An artificial neural network approach for short-term wind speed forecast," Master's thesis, Kansas State University, Manhattan, KS, USA, 2018.
[24] J. Guanlong, D. Li, L. Yao, and P. Zhao, "An improved artificial bee colony-BP neural network algorithm in the short-term wind speed prediction," in Proceedings of the 2016 12th World Congress on Intelligent Control and Automation (WCICA), pp. 2252–2255, Guilin, China, June 2016.
[25] C. Gonggui, J. Chen, Z. Zhang, and Z. Sun, "Short-term wind speed forecasting based on fuzzy C-means clustering and improved MEA-BP," IAENG International Journal of Computer Science, vol. 46, no. 4, pp. 27–35, 2019.
[26] W. ZhanJie and A. B. M. Mazharul Mujib, "The weather forecast using data mining research based on cloud computing," Journal of Physics: Conference Series, vol. 910, no. 1, Article ID 012020, 2017.
[27] A. Zhu, X. Li, Z. Mo, and R. Wu, "Wind power prediction based on a convolutional neural network," in Proceedings of the 2017 International Conference on Circuits, Devices and Systems (ICCDS), pp. 131–135, Chengdu, China, September 2017.
[28] O. Linda, T. Vollmer, and M. Manic, "Neural network based intrusion detection system for critical infrastructures," in Proceedings of the 2009 International Joint Conference on Neural Networks, pp. 1827–1834, Atlanta, GA, USA, June 2009.
[29] C. Li, Z. Xiao, X. Xia, W. Zou, and C. Zhang, "A hybrid model based on synchronous optimisation for multi-step short-term wind speed forecasting," Applied Energy, vol. 215, pp. 131–144, 2018.
[30] National Wind Institute, West Texas Mesonet, online referencing, http://www.depts.ttu.edu, 2018.


Figure 1: The proposed forecasting methodology. Raw time-series data are collected, averaged, and preprocessed into 5 min, 30 min, and 1 hr sets (data processing); lagged inputs L1, L2, ..., L10 are formed (lagging); ANN, CNN, LSTM, ConvLSTM, and SVR models are trained (modeling); and the forecasting results are evaluated with MSE, RMSE, MAE, R2, delay, and execution time to select the best-fitted model (evaluation).

Input: the wind speed time series data
Output: forecasting performance indices

(1) The wind speed time series data are measured every 5 minutes and averaged twice to obtain the 30 minute and 1 hour sets, respectively
(2) The wind datasets are split into training, validation, and test sets
(3) Initiate the multi-lags-one-step (MLOS) arrays for the training, validation, and test sets
(4) Define the MLOS range as 1 to 10 to optimize the number of needed lags
(5) loop 1:
        Split the first set based on the MLOS range
        Initiate and extract set features with the CNN layer
        Pass the output to a defined LSTM layer
        Select the first value in the range of the number of hidden neurons
        Generate the prediction results' performance indices
        Count the time to execute and produce prediction results
        Save and compare the results with previous ones
    loop 2:
        Select the next MLOS value
        If the MLOS value equals the maximum of the range, then go to loop 3 and reinitialize the MLOS value
        Go to loop 1
    loop 3:
        Select a new number of hidden neurons
        If the number of hidden neurons equals the maximum of the range, then go to loop 4 and reinitialize the number of hidden neurons
        Go to loop 1
(6) loop 4:
        Select a new dataset from the sets: 5 min, 30 min, and 1 hr
        Go to loop 1
        If the last set has been processed, then
            Generate the performance indices of all tested sets
            Select the best result metrics

ALGORITHM 1: ConvLSTM training.
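The loop structure of Algorithm 1 is a nested grid search over the number of lags and the number of hidden neurons. A minimal structural sketch follows; `train_convlstm` and its toy error surface are stand-ins of mine for the CNN-feature-extraction plus LSTM training step, which the paper implements with a deep learning framework:

```python
import itertools

def train_convlstm(series, n_lags, n_hidden):
    # Hypothetical stand-in: returns (r2, rmse, mae) for one configuration.
    # A real implementation would build and fit the CNN + LSTM model here.
    err = abs(n_lags - 4) * 0.001 + abs(n_hidden - 15) * 0.0001  # toy error surface
    return 0.9876 - err, 0.2419 + err, 0.1755 + err

def grid_search(series, lag_range, hidden_range):
    """Loops 1-3 of Algorithm 1: try every (lags, hidden neurons) pair and
    keep the configuration with the best indices seen so far."""
    best = None
    for n_lags, n_hidden in itertools.product(lag_range, hidden_range):
        r2, rmse, mae = train_convlstm(series, n_lags, n_hidden)
        if best is None or r2 > best["r2"]:
            best = {"n_lags": n_lags, "n_hidden": n_hidden,
                    "r2": r2, "rmse": rmse, "mae": mae}
    return best

best = grid_search(series=[], lag_range=range(1, 11),
                   hidden_range=(5, 8, 15, 17, 20))
print(best["n_lags"], best["n_hidden"])  # -> 4 15 on this toy surface
```

Loop 4 of the algorithm would simply repeat this search for each of the 5 min, 30 min, and 1 hr datasets.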


(1) CPU: Intel(R) Core(TM) i7-8550U @ 1.80 GHz (2001 MHz), 4 cores, 8 logical processors
(2) RAM: 16.0 GB installed physical memory
(3) GPU: AMD Radeon(TM) RX 550, 1.0 GB
(4) Framework: Anaconda 2019.07, Python 3.7

Table 6 illustrates the chosen optimized internal parameters (hyperparameters) for the forecasting methods used in this work. For each method, the optimal number of hidden neurons is chosen to achieve the maximum R2 and the minimum RMSE and MAE values.

After implementing CNN, ANN, LSTM, ConvLSTM, and SVM, the most fitted model was chosen based on its accuracy in predicting future wind speed values. Thus, the seasonality is considered in the forecast mechanism. The chosen model has to deliver the best fit to the data with the least amount of error, taking the nature of the data into consideration rather than applying naive forecasting to it.

To achieve this goal, the statistical error indicators are calculated for every model and time lapse, as Figure 2 illustrates. The provided results suggest that the ConvLSTM model has the best performance as compared to the other four models. The chosen model has to reach the minimum values of RMSE and MAE while reaching the maximum R2 value.
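The three indicators can be computed directly from the true and predicted series. A minimal pure-Python sketch (the function names and sample values are mine, for illustration only):

```python
import math

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [3.2, 4.1, 3.8, 5.0]   # illustrative wind speeds (mph)
y_pred = [3.0, 4.3, 3.7, 4.8]
print(round(rmse(y_true, y_pred), 4), round(mae(y_true, y_pred), 4))  # -> 0.1803 0.175
```

The best-fitted model is then the one that maximizes `r_squared` while minimizing `rmse` and `mae`.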

Different parameters are also tested to ensure the right decision in choosing the best fitted model. The optimum number of lags, which is presented in Table 7, is one of the most important indicators in selecting the best fitted model, since the fewer historical points the model needs, the lower its computational effort will be. For each method, the optimal number of lags is chosen to achieve the maximum R2 and the minimum RMSE and MAE values. For instance, Figures 3 and 4 show the relation between the statistical measures and the number of hidden neurons and the number of lags, respectively, for the proposed ConvLSTM method in the 5 minute time span case. It can be seen that 4 lags and 15 hidden neurons achieved the maximum R2 and minimum RMSE and MAE values.
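The multi-lags-one-step input arrays behind this lag search can be built with a simple sliding window over the series: each sample holds the last n_lags observations as inputs and the next observation as the one-step-ahead target. A minimal sketch (the function name and sample values are mine):

```python
def make_mlos_arrays(series, n_lags):
    """Multi-lags-one-step: each sample pairs the `n_lags` most recent
    values (inputs) with the next value (one-step-ahead target)."""
    X, y = [], []
    for i in range(n_lags, len(series)):
        X.append(series[i - n_lags:i])  # lag window
        y.append(series[i])             # one step ahead
    return X, y

speeds = [3.1, 3.4, 3.2, 3.9, 4.1, 3.8]          # illustrative wind speeds (mph)
X, y = make_mlos_arrays(speeds, n_lags=4)         # 4 lags, as selected for ConvLSTM at 5 min
print(X)  # -> [[3.1, 3.4, 3.2, 3.9], [3.4, 3.2, 3.9, 4.1]]
print(y)  # -> [4.1, 3.8]
```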

The execution time, shown in Table 8, is calculated for each method and time lapse to assure that the final chosen model is efficient and can effectively predict future wind speed. The shorter the execution time, the more efficient and useful the model will be; this is also a sign that the model lends itself to further modifications. According to Table 8, the ConvLSTM model beats all other models in the time needed to process the historical data and deliver a final prediction: SVM needed about 54 minutes to accomplish the training and produce testing results, while ConvLSTM made it in just about 1.7 minutes. This huge difference between them motivated the choice of ConvLSTM.
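Such an execution-time comparison can be reproduced by wrapping each model's fit-and-predict call in a timer. A minimal sketch using Python's `time.perf_counter`; `dummy_model` is a stand-in of mine for a real training call:

```python
import time

def timed(fn, *args):
    """Return (result, elapsed_minutes) for one training-and-forecast run."""
    start = time.perf_counter()
    result = fn(*args)
    minutes = (time.perf_counter() - start) / 60.0
    return result, minutes

def dummy_model(n):
    # Stand-in for a model's fit-and-predict call
    return sum(i * i for i in range(n))

_, minutes = timed(dummy_model, 100_000)
print(f"{minutes:.4f} min")
```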

Table 3: Dataset characteristics for the 5 min sample.

Dataset            Max     Median   Min    Mean   Std
All datasets       18.73   3.53     0.01   3.91   2.10
Training dataset   18.73   3.47     0.01   3.83   2.05
Test dataset       14.87   3.67     0.01   4.05   2.20

Table 4: Dataset characteristics for the 30 min sample.

Dataset            Max     Median   Min    Mean       Std
All datasets       17.66   3.53     0.01   3.91       2.08
Training dataset   17.66   3.47     0.01   3.837309   2.02
Test dataset       14.32   3.67     0.02   4.05       2.18

Table 5: Dataset characteristics for the 1 hour sample.

Dataset            Max     Median   Min    Mean   Std
All datasets       17.61   3.53     0.07   3.91   2.05
Training dataset   17.61   3.46     0.07   3.83   2.00
Test dataset       14.22   3.66     0.07   4.05   2.15
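Summary statistics of this kind can be computed with Python's `statistics` module. A minimal sketch (the sample values are illustrative, and the paper does not state whether the population or sample standard deviation was used):

```python
import statistics

def summarize(samples):
    """Max / median / min / mean / std summary, as in Tables 3-5."""
    return {
        "max": max(samples),
        "median": statistics.median(samples),
        "min": min(samples),
        "mean": statistics.mean(samples),
        "std": statistics.pstdev(samples),  # population std (assumption)
    }

stats = summarize([3.1, 3.4, 3.2, 3.9, 4.1, 3.8])  # illustrative wind speeds (mph)
print(round(stats["median"], 2), round(stats["mean"], 3))  # -> 3.6 3.583
```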

Table 6: Optimized internal parameters for the forecasting methods.

Method     Sets of parameters
LSTM       5 min: 17 hidden neurons; 30 min: 20 hidden neurons; 1 hour: 8 hidden neurons
CNN        5 min: 15 hidden neurons; 30 min: 5 hidden neurons; 1 hour: 15 hidden neurons
ConvLSTM   5 min: 15 hidden neurons; 30 min: 8 hidden neurons; 1 hour: 20 hidden neurons
ANN        5 min: 15 hidden neurons; 30 min: 15 hidden neurons; 1 hour: 20 hidden neurons
SVR        5 min: C = 7, ε = 0.1, γ = 0.2; 30 min: C = 1, ε = 0.25, γ = 0.15; 1 hour: C = 1, ε = 0.1, γ = 0.05


Figure 5 shows that the 5-minute lapse dataset is the one best fitted by our chosen model; it illustrates how accurate the prediction of future wind speed is.

For completeness, to effectively evaluate the investigated forecasting techniques in terms of their prediction accuracies, a 50-trial cross-validation procedure is carried out, in which the investigated techniques are built and then evaluated on 50 different training and test datasets, respectively, randomly sampled from the available overall dataset. The ultimate performance metrics are then reported as the average and the standard deviation of the 50 metrics obtained across the cross-validation trials. In this regard, Figure 6 shows the average performance metrics on the test dataset using the 50-trial cross-validation procedure. It can be easily recognized that the forecasting models that employ the LSTM technique (LSTM and ConvLSTM) outperform the other investigated techniques in terms of the three performance metrics: R2, RMSE, and MAE.
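The repeated random train/test resampling described above can be sketched as follows; the `repeated_holdout` name is mine, and the placeholder metric stands in for fitting a model and computing R2, RMSE, and MAE on each test split:

```python
import random
import statistics

def repeated_holdout(data, n_trials=50, test_frac=0.2, seed=0):
    """Repeated random train/test splits; returns the mean and std of a
    placeholder per-trial score across `n_trials` resamplings."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_trials):
        shuffled = data[:]
        rng.shuffle(shuffled)
        n_test = max(1, int(len(shuffled) * test_frac))
        test, train = shuffled[:n_test], shuffled[n_test:]
        # Placeholder: a real run would fit the model on `train` and
        # score R2 / RMSE / MAE on `test` here.
        scores.append(sum(test) / len(test))
    return statistics.mean(scores), statistics.stdev(scores)

data = [3.1, 3.4, 3.2, 3.9, 4.1, 3.8, 2.9, 4.4, 3.6, 3.3]  # illustrative series
mean_score, std_score = repeated_holdout(data)
print(round(mean_score, 3), round(std_score, 3))
```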

From the experimental results of short-term wind speed forecasting shown in Figure 6, we can observe that ConvLSTM performs the best in terms of the forecasting metrics (R2, RMSE, and MAE) as compared to the other models (i.e., CNN, ANN, SVR, and LSTM). The related statistical results in Tables 6 and 7, respectively, demonstrate the effectiveness of ConvLSTM and its capability of handling noisy, large data. ConvLSTM showed that it can produce highly accurate wind speed predictions with fewer lags and hidden neurons. This was indeed reflected in the results shown in Table 8, with less computation time as compared to the other tested models. Furthermore, we introduced multi-lags-one-step (MLOS) ahead forecasting combined with the hybrid ConvLSTM model to provide efficient generalization to new time series data and predict wind speed accurately. The results showed that the ConvLSTM model proposed in this paper is an effective and promising model for wind speed forecasting.

Figure 2: Models' key performance indicators (KPIs).

(a) R2
Time span   CNN      ANN      LSTM     SVM      ConvLSTM
5 min       0.9873   0.9873   0.9876   0.9815   0.9876
30 min      0.9091   0.9089   0.9097   0.9078   0.9098
1 hour      0.8309   0.8302   0.8315   0.8211   0.8316

(b) RMSE
Time span   CNN      ANN      LSTM     SVM      ConvLSTM
5 min       0.2482   0.2483   0.2424   0.2462   0.2419
30 min      0.6578   0.6588   0.6467   0.6627   0.6463
1 hour      0.8862   0.8881   0.8718   0.9116   0.8717

(c) MAE
Time span   CNN      ANN      LSTM     SVM      ConvLSTM
5 min       0.1806   0.1798   0.1758   0.1855   0.1755
30 min      0.4802   0.4799   0.4721   0.4818   0.4741
1 hour      0.6468   0.6473   0.6392   0.6591   0.6369

Table 7: Optimum number of lags.

Method     5 min   30 min   1 hour
CNN        9       4        4
ANN        3       3        3
LSTM       7       6        10
ConvLSTM   4       5        8
SVR        9       4        5


ConvLSTM

50 75 100 125 150 175 20025Number of hidden neurons

0986909870098710987209873098740987509876

R-sq

uare

d

(a)

ConvLSTM

0986909870098710987209873098740987509876

RMSE

50 75 100 125 150 175 20025Number of hidden neurons

(b)

ConvLSTM

0176017701780179018001810182

MA

E

50 75 100 125 150 175 20025Number of hidden neurons

(c)

Figure 3 ConvLSTM measured statistical values and number of hidden neurons

2 4 6 8 10Number of lags

ConvLSTM

098575

098600

098625

098650

098675

098700

098725

098750

R-sq

uare

d

(a)

2 4 6 8 10Number of lags

ConvLSTM

02425

02450

02475

02500

02525

02550

02575

02600

RMSE

(b)

2 4 6 8 10Number of lags

ConvLSTM

0176

0178

0180

0182

0184

0186

0188

0190

MA

E

(c)

Figure 4 ConvLSTM measured statistical values and number of lags

12 Computational Intelligence and Neuroscience

Similar to our work the proposed EnsemLSTM modelby Chen et al [15] contained different clusters of LSTMwith different hidden layers and hidden neurons )eycombined LSTM clusters with SVR and external optimizerin order to enhance the generalization capability and ro-bustness of their model However their model showed ahigh computational complexity with mediocre perfor-mance indices Our proposed ConvLSTM with MLOS

assured boosting the generalization and robustness for thenew time series data as well as producing high performanceindices

6 Conclusions

In this study we proposed a hybrid deep learning-basedframework ConvLSTM for short-term prediction of the wind

Table 8 Execution time

(5min) time (min) (30min) time (min) (1 hour) time (min)ConvLSTM 17338 03849 01451SVR 541424 08214 02250CNN 087828 01322 00708ANN 07431 02591 00587LSTM 16570 03290 01473

5

4

3

2

1

0 25 50 75 100 125 150 175 200Number of samples

Win

d sp

eed

(mph

)ConvLSTM 5 min time span

PredictedTrue

Figure 5 ConvLSTM truepredicted wind speed and number of samples

0988

0986

0984

0982

098ConvLSTM LSTM ANN CNN SVR

R2

(a)

ConvLSTM LSTM ANN CNN SVR

026

024

022

02

RMSE

(b)

ConvLSTM LSTM ANN CNN SVR

02

019

018

017

016

015

MA

E

(c)

Figure 6 Average performance metrics obtained on the test dataset using 50 cross validation procedure

Computational Intelligence and Neuroscience 13

speed time series measurements )e proposed dynamicpredictionmodel was optimized for the number of input lagsand the number of internal hidden neurons Multi-lags-one-step (MLOS) ahead wind speed forecasting using the pro-posed approach showed superior results compared to fourother different models built using standard ANN CNNLSTM and SVM approaches )e proposed modelingframework combines the benefits of CNN and LSTM net-works in a hybrid modeling scheme that shows highly ac-curate wind speed prediction results with less lags andhidden neurons as well as less computational complexityFor future work further investigation can be done to im-prove the accuracy of the ConvLSTM model for instanceincreasing and optimizing the number of hidden layersapplying a multi-lags-multi-steps (MLMS) ahead forecast-ing and introducing reinforcement learning agent to op-timize the parameters as compared to other optimizationmethods

Data Availability

)ewind speed data used in this study have been taken fromthe West Texas Mesonet of the US National Wind Institute(httpwwwdeptsttuedunwiresearchfacilitieswtmindexphp) Data are provided freely for academic researchpurposes only and cannot be shareddistributed beyondacademic research use without permission from the WestTexas Mesonet

Conflicts of Interest

)e authors declare that they have no conflicts of interest

Acknowledgments

)e authors acknowledge the help of Prof Brian Hirth inTexas Tech University for providing them with access toweather data

References

[1] R Rothermel ldquoHow to predict the spread and intensity offorest and range firesrdquo General Technical Report INT-143Intermountain Forest and Range Experiment Station OgdenUT USA 1983

[2] A Tascikaraoglu and M Uzunoglu ldquoA review of combinedapproaches for prediction of short-term wind speed andpowerrdquo Renewable and Sustainable Energy Reviews vol 34pp 243ndash254 2014

[3] R J Barthelmie F Murray and S C Pryor ldquo)e economicbenefit of short-term forecasting for wind energy in the UKelectricity marketrdquo Energy Policy vol 36 no 5 pp 1687ndash1696 2008

[4] Y-K Wu and H Jing-Shan A Literature Review of WindForecasting Technology in the World pp 504ndash509 IEEELausanne Powertech Lausanne Switzerland 2007

[5] W S McCulloch and W Pitts ldquoA logical calculus of the ideasimmanent in nervous activityrdquo Be Bulletin of MathematicalBiophysics vol 5 no 4 pp 115ndash133 1943

[6] K Fukushima ldquoNeocognitronrdquo Scholarpedia vol 2 no 1p 1717 2007

[7] K Fukushima ldquoNeocognitron a self-organizing neural net-work model for a mechanism of pattern recognition unaf-fected by shift in positionrdquo Biological Cybernetics vol 36no 4 pp 193ndash202 1980

[8] S Hochreite and J Schmidhuber ldquoLong short-termmemoryrdquoNeural Computation vol 9 no 8 pp 1735ndash1780 1997

[9] H T Siegelmann and E D Sontag ldquoOn the computationalpower of neural netsrdquo Journal of Computer and System Sci-ences vol 50 no 1 pp 132ndash150 1995

[10] R Sunil ldquoUnderstanding support vector machine algorithmrdquohttpswwwanalyticsvidhyacomblog201709understaing-support-vector-machine-example-code 2017

[11] X Shi Z Chen H Wang D Y Yeung W K Wong andW C Woo ldquoConvolutional LSTM network a machinelearning approach for precipitation nowcastingrdquo in Pro-ceedings of the Neural Information Processing Systemspp 7ndash12 Montreal Canada December 2015

[12] C Finn ldquoGoodfellow I and Levine S Unsupervised learningfor physical interaction through video predictionrdquo in Pro-ceedings of the Annual Conference Neural Information Pro-cessing Systems pp 64ndash72 Barcelona Spain December 2016

[13] H Xu Y Gao F Yu and T Darrell ldquoEnd-to-end learning ofdriving models from large-scale video datasetsrdquo in Proceed-ings of the 2017 IEEE Conference on Computer Vision andPattern Recognition pp 3530ndash3538 Honolulu HI USA July2017

[14] K Chen and J Yu ldquoShort-term wind speed prediction usingan unscented Kalman filter based state-space support vectorregression approachrdquo Applied Energy vol 113 pp 690ndash7052014

[15] J Chen G-Q Zeng W Zhou W Du and K-D Lu ldquoWindspeed forecasting using nonlinear-learning ensemble of deeplearning time series prediction and extremal optimizationrdquoEnergy Conversion and Management vol 165 pp 681ndash6952018

[16] D Liu D Niu H Wang and L Fan ldquoShort-term wind speedforecasting using wavelet transform and support vectormachines optimized by genetic algorithmrdquo Renewable Energyvol 62 pp 592ndash597 2014

[17] Y Wang ldquoShort-term wind power forecasting by geneticalgorithm of wavelet neural networkrdquo in Proceedings of the2014 International Conference on Information Science Elec-tronics and Electrical Engineering pp 1752ndash1755 SapporoJapan April 2014

[18] N Sheikh D B Talukdar R I Rasel and N Sultana ldquoA shortterm wind speed forcasting using svr and bp-ann a com-parative analysisrdquo in Proceedings of the 2017 20th Interna-tional Conference of Computer and Information Technology(ICCIT) pp 1ndash6 Dhaka Bangladesh December 2017

[19] H Nantian C Yuan G Cai and E Xing ldquoHybrid short termwind speed forecasting using variational mode decompositionand a weighted regularized extreme learning machinerdquo En-ergies vol 9 no 12 pp 989ndash1007 2016

[20] S Haijian and X Deng ldquoAdaBoosting neural network forshort-term wind speed forecasting based on seasonal char-acteristics analysis and lag space estimationrdquo ComputerModeling in Engineering amp Sciences vol 114 no 3pp 277ndash293 2018

[21] W Xiaodan ldquoForecasting short-term wind speed usingsupport vector machine with particle swarm optimizationrdquo inProceedings of the 2017 International Conference on SensingDiagnostics Prognostics and Control (SDPC) pp 241ndash245Shanghai China August 2017

14 Computational Intelligence and Neuroscience

[22] F R Ningsih E C Djamal and A Najmurrakhman ldquoWindspeed forecasting using recurrent neural networks and longshort term memoryrdquo in Proceedings of the 2019 6th Inter-national Conference on Instrumentation Control and Auto-mation (ICA) pp 137ndash141 Bandung Indonesia JulyndashAugust2019

[23] P K Datta ldquoAn artificial neural network approach for short-term wind speed forecastrdquo Master thesis Kansas State Uni-versity Manhattan KS USA 2018

[24] J Guanlong D Li L Yao and P Zhao ldquoAn improved ar-tificial bee colony-BP neural network algorithm in the short-term wind speed predictionrdquo in Proceedings of the 2016 12thWorld Congress on Intelligent Control and Automation(WCICA) pp 2252ndash2255 Guilin China June 2016

[25] C Gonggui J Chen Z Zhang and Z Sun ldquoShort-term windspeed forecasting based on fuzzy C-means clustering andimproved MEA-BPrdquo IAENG International Journal of Com-puter Science vol 46 no 4 pp 27ndash35 2019

[26] W ZhanJie and A B M Mazharul Mujib ldquo)e weatherforecast using data mining research based on cloud com-putingrdquo in Journal of Physics Conference Series vol 910 no 1Article ID 012020 2017

[27] A Zhu X Li Z Mo and R Wu ldquoWind power predictionbased on a convolutional neural networkrdquo in Proceedings ofthe 2017 International Conference on Circuits Devices AndSystems (ICCDS) pp 131ndash135 Chengdu China September2017

[28] O Linda T Vollmer and M Manic ldquoNeural network basedintrusion detection system for critical infrastructuresrdquo inProceedings of the 2009 International Joint Conference onNeural Networks pp 1827ndash1834 Atlanta GA USA June2009

[29] C Li Z Xiao X Xia W Zou and C Zhang ldquoA hybrid modelbased on synchronous optimisation for multi-step short-termwind speed forecastingrdquoApplied Energy vol 215 pp 131ndash1442018

[30] Institute National Wind West Texas Mesonet Online Ref-erencing httpwwwdeptsttueducom 2018

Computational Intelligence and Neuroscience 15

(1) CPU Intel (R) Core(TM) i7-8550U CPU 180GHz 2001 Mhz 4 Core (s) 8 Logical Processor(s)

(2) RAM Installed Physical Memory 160 GB(3) GPU AMD Radeon(TM) RX 550 10GB(4) Framework Anaconda 201907 Python 37

Table 6 illustrates the chosen optimized internal pa-rameters (hyperparameters) for the forecasting methodsused in this work For each method the optimal number ofhidden neurons is chosen to achieve the maximum R2 andthe minimum RMSEand MAE values

After the implementation of CNN ANN LSTMConvLSTM and SVM it was noticed that the most fittedmodel was chosen depending on its accuracy in predictingfuture wind speed values )us the seasonality is consideredfor the forecast mechanism)e chosen model has to deliverthe most fitted data with the least amount of error takinginto consideration the nature of the data and not applyingnaive forecasting on it

To achieve this goal the statistical error indicators arecalculated for every model and time lapse and fully repre-sented as Figure 2 illustrates )e provided results suggestthat the ConvLSTM model has the best performance ascompared to the other four models)e chosen model has toreach the minimum values of RMSE and MAE whilemaximum R2 value

Different parameters are also tested to ensure the rightdecision of choosing the best fitted model )e optimumnumber of lags which is presented in Table 7 is one of themost important indicators in selecting the best fitted modelSince the less historical points are needed by the model thecomputational effort will be less as well For each methodthe optimal number of lags is chosen to achieve the max-imum R2and the minimum RMSE and MAE values Forinstance Figures 3 and 4 show the relation between the

statistical measures and the number of lags and hiddenneurons respectively for the proposed ConvLSTM methodfor the 5 minutes time span case It can be seen that 4 lagsand 15 hidden neurons achieved maximum R2 and mini-mum RMSE and MAE values

)e execution time shown in Table 8 is calculated foreach method and time lapse to assure that the final andchosen model is efficient and can effectively predict futurewind speed )e shorter the time for execution is the moreefficient and helpful the model will be )is is also a sign thatthe model is efficient for further modifications According toTable 8 the ConvLSTM model beats all other models in thetime that it needed to process the historical data and deliver afinal prediction SVM needed 54 minutes to accomplish thetraining and produce testing results while ConvLSTM madeit in just 17 minutes )is huge difference between them hasmade the choice of using ConvLSTM

Table 3 Dataset characteristic for 5min sample

Dataset Max Median Min Mean StdAll datasets 1873 353 001 391 210Training dataset 1873 347 001 383 205Test dataset 1487 367 001 405 220

Table 4 Dataset characteristic for 30min sample

Dataset Max Median Min Mean StdAll datasets 1766 353 001 391 208Training dataset 1766 347 001 3837309 202Test dataset 1432 367 002 405 218

Table 5 Dataset characteristic for 1 hour sample

Dataset Max Median Min Mean StdAll datasets 1761 353 007 391 205Training dataset 1761 346 007 383 200Test dataset 1422 366 007 405 215

Table 6 Optimized internal parameters for the forecastingmethods

Method Sets of parameters

LSTM5min 17 hidden neurons30min 20 hidden neurons1 hour 8 hidden neurons

CNN5min 15 hidden neurons30min 5 hidden neurons1 hour 15 hidden neurons

ConvLSTM5min 15 hidden neurons30min 8 hidden neurons1 hour 20 hidden neurons

ANN5min 15 hidden neurons30min 15 hidden neurons1 hour 20 hidden neurons

SVR5min C 7 ε 01 c 02

30min C 1 ε 025 c 0151 hour C 1 ε 01 c 005

10 Computational Intelligence and Neuroscience

Figure 5 shows that the 5-minute lapse dataset is themost fitted dataset to our chosen model It declares howaccurate the prediction of future wind speed will be

For completeness to effectively evaluate the investi-gated forecasting techniques in terms of their predictionaccuracies 50 cross validation procedure is carried out inwhich the investigated techniques are built and thenevaluated on 50 different training and test datasets re-spectively randomly sampled from the available overalldataset )e ultimate performance metrics are then re-ported as the average and the standard deviation values ofthe 50 metrics obtained in each cross validation trial Inthis regard Figure 6 shows the average performancemetrics on the test dataset using the 50 cross validationprocedure It can be easily recognized that the forecastingmodels that employ the LSTM technique outperform theother investigated techniques in terms of the three per-formance metrics R2 RMSE and MAE

From the experimental results of short-term wind speedforecasting shown in Figure 6 we can observe thatConvLSTM performs the best in terms of forecasting metrics(R2 RMSE andMAE) as compared to the other models (ieCNN ANN SVR and LSTM) )e related statistical tests inTables 6 and 7 respectively have proved the effectiveness ofConvLSTM and its capability of handling noisy large dataConvLSTM showed that it can produce high accurate windspeed prediction with less lags and hidden neurons )iswas indeed reflected in the results shown in Table 8 withless computation time as compared to the other testedmodels Furthermore we introduced the multi-lags-one-step (MLOS) ahead forecasting combined with the hybridConvLSTMmodel to provide an efficient generalization fornew time series data to predict wind speed accuratelyResults showed that ConvLSTM proposed in this paperis an effective and promising model for wind speedforecasting

Figure 2: Models' key performance indicators (KPIs): (a) R2, (b) RMSE, and (c) MAE.

(a) R2
Horizon    CNN      ANN      LSTM     SVM      ConvLSTM
5 min      0.9873   0.9873   0.9876   0.9815   0.9876
30 min     0.9091   0.9089   0.9097   0.9078   0.9098
1 hour     0.8309   0.8302   0.8315   0.8211   0.8316

(b) RMSE
Horizon    CNN      ANN      LSTM     SVM      ConvLSTM
5 min      0.2482   0.2483   0.2424   0.2462   0.2419
30 min     0.6578   0.6588   0.6467   0.6627   0.6463
1 hour     0.8862   0.8881   0.8718   0.9116   0.8717

(c) MAE
Horizon    CNN      ANN      LSTM     SVM      ConvLSTM
5 min      0.1806   0.1798   0.1758   0.1855   0.1755
30 min     0.4802   0.4799   0.4721   0.4818   0.4741
1 hour     0.6468   0.6473   0.6392   0.6591   0.6369

Table 7: Optimum number of lags.

Method     5 min   30 min   1 hour
CNN        9       4        4
ANN        3       3        3
LSTM       7       6        10
ConvLSTM   4       5        8
SVR        9       4        5
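The optimum lag counts in Table 7 can, in principle, be found by sweeping candidate lag counts and keeping the one with the lowest hold-out error. The sketch below uses a mean-of-lags predictor purely as a hypothetical stand-in for the trained networks; the selection loop, not the predictor, is the point.

```python
from typing import List, Tuple

def make_windows(series: List[float], n_lags: int) -> Tuple[List[List[float]], List[float]]:
    """Build (lag-window, next-value) pairs for one-step-ahead forecasting."""
    X = [series[i - n_lags:i] for i in range(n_lags, len(series))]
    y = [series[i] for i in range(n_lags, len(series))]
    return X, y

def sweep_lags(series: List[float], candidate_lags: List[int],
               test_frac: float = 0.3) -> Tuple[int, float]:
    """Return the lag count with the lowest hold-out RMSE. A mean-of-lags
    predictor stands in for a trained model (illustrative assumption)."""
    best_lag, best_rmse = None, float("inf")
    for n_lags in candidate_lags:
        X, y = make_windows(series, n_lags)
        cut = int(len(X) * (1 - test_frac))        # chronological hold-out split
        X_test, y_test = X[cut:], y[cut:]
        preds = [sum(w) / n_lags for w in X_test]  # stand-in forecast
        err = (sum((p - t) ** 2 for p, t in zip(preds, y_test)) / len(preds)) ** 0.5
        if err < best_rmse:
            best_lag, best_rmse = n_lags, err
    return best_lag, best_rmse
```

The same loop, run per sampling horizon with the real models, would produce a table of the shape of Table 7.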

Computational Intelligence and Neuroscience 11

Figure 3: ConvLSTM measured statistical values versus the number of hidden neurons (25 to 200): (a) R-squared, (b) RMSE, and (c) MAE.

Figure 4: ConvLSTM measured statistical values versus the number of lags (2 to 10): (a) R-squared, (b) RMSE, and (c) MAE.


Similar to our work, the EnsemLSTM model proposed by Chen et al. [15] contained different clusters of LSTMs with different hidden layers and hidden neurons. They combined the LSTM clusters with SVR and an external optimizer in order to enhance the generalization capability and robustness of their model. However, their model showed high computational complexity with mediocre performance indices. Our proposed ConvLSTM with MLOS boosts generalization and robustness on new time series data while producing high performance indices.

Table 8: Execution time.

Method     5 min time (min)   30 min time (min)   1 hour time (min)
ConvLSTM   1.7338             0.3849              0.1451
SVR        54.1424            0.8214              0.2250
CNN        0.87828            0.1322              0.0708
ANN        0.7431             0.2591              0.0587
LSTM       1.6570             0.3290              0.1473

Figure 5: ConvLSTM true and predicted wind speed (mph) versus the number of samples (5 min time span).

Figure 5 shows that the 5-minute interval dataset is the one best fitted by the chosen model, and it illustrates how accurate the prediction of future wind speed will be.

Figure 6: Average performance metrics obtained on the test dataset using the 50-trial cross-validation procedure: (a) R2, (b) RMSE, and (c) MAE.

6. Conclusions

In this study, we proposed a hybrid deep learning-based framework, ConvLSTM, for short-term prediction of wind speed time series measurements. The proposed dynamic prediction model was optimized for the number of input lags and the number of internal hidden neurons. Multi-lags-one-step (MLOS) ahead wind speed forecasting using the proposed approach showed superior results compared to four other models built using standard ANN, CNN, LSTM, and SVM approaches. The proposed modeling framework combines the benefits of CNN and LSTM networks in a hybrid modeling scheme that delivers highly accurate wind speed predictions with fewer lags and hidden neurons as well as lower computational complexity. For future work, further investigation can be done to improve the accuracy of the ConvLSTM model, for instance, by increasing and optimizing the number of hidden layers, applying multi-lags-multi-steps (MLMS) ahead forecasting, and introducing a reinforcement learning agent to optimize the parameters in comparison with other optimization methods.

Data Availability

The wind speed data used in this study have been taken from the West Texas Mesonet of the US National Wind Institute (http://www.depts.ttu.edu/nwi/research/facilities/wtm/index.php). Data are provided freely for academic research purposes only and cannot be shared or distributed beyond academic research use without permission from the West Texas Mesonet.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors acknowledge the help of Prof. Brian Hirth at Texas Tech University for providing them with access to weather data.

References

[1] R. Rothermel, "How to predict the spread and intensity of forest and range fires," General Technical Report INT-143, Intermountain Forest and Range Experiment Station, Ogden, UT, USA, 1983.

[2] A. Tascikaraoglu and M. Uzunoglu, "A review of combined approaches for prediction of short-term wind speed and power," Renewable and Sustainable Energy Reviews, vol. 34, pp. 243–254, 2014.

[3] R. J. Barthelmie, F. Murray, and S. C. Pryor, "The economic benefit of short-term forecasting for wind energy in the UK electricity market," Energy Policy, vol. 36, no. 5, pp. 1687–1696, 2008.

[4] Y.-K. Wu and H. Jing-Shan, A Literature Review of Wind Forecasting Technology in the World, pp. 504–509, IEEE Lausanne Powertech, Lausanne, Switzerland, 2007.

[5] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," The Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115–133, 1943.

[6] K. Fukushima, "Neocognitron," Scholarpedia, vol. 2, no. 1, p. 1717, 2007.

[7] K. Fukushima, "Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics, vol. 36, no. 4, pp. 193–202, 1980.

[8] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.

[9] H. T. Siegelmann and E. D. Sontag, "On the computational power of neural nets," Journal of Computer and System Sciences, vol. 50, no. 1, pp. 132–150, 1995.

[10] R. Sunil, "Understanding support vector machine algorithm," https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code, 2017.

[11] X. Shi, Z. Chen, H. Wang, D. Y. Yeung, W. K. Wong, and W. C. Woo, "Convolutional LSTM network: a machine learning approach for precipitation nowcasting," in Proceedings of the Neural Information Processing Systems, pp. 7–12, Montreal, Canada, December 2015.

[12] C. Finn, I. Goodfellow, and S. Levine, "Unsupervised learning for physical interaction through video prediction," in Proceedings of the Annual Conference on Neural Information Processing Systems, pp. 64–72, Barcelona, Spain, December 2016.

[13] H. Xu, Y. Gao, F. Yu, and T. Darrell, "End-to-end learning of driving models from large-scale video datasets," in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3530–3538, Honolulu, HI, USA, July 2017.

[14] K. Chen and J. Yu, "Short-term wind speed prediction using an unscented Kalman filter based state-space support vector regression approach," Applied Energy, vol. 113, pp. 690–705, 2014.

[15] J. Chen, G.-Q. Zeng, W. Zhou, W. Du, and K.-D. Lu, "Wind speed forecasting using nonlinear-learning ensemble of deep learning time series prediction and extremal optimization," Energy Conversion and Management, vol. 165, pp. 681–695, 2018.

[16] D. Liu, D. Niu, H. Wang, and L. Fan, "Short-term wind speed forecasting using wavelet transform and support vector machines optimized by genetic algorithm," Renewable Energy, vol. 62, pp. 592–597, 2014.

[17] Y. Wang, "Short-term wind power forecasting by genetic algorithm of wavelet neural network," in Proceedings of the 2014 International Conference on Information Science, Electronics and Electrical Engineering, pp. 1752–1755, Sapporo, Japan, April 2014.

[18] N. Sheikh, D. B. Talukdar, R. I. Rasel, and N. Sultana, "A short term wind speed forecasting using SVR and BP-ANN: a comparative analysis," in Proceedings of the 2017 20th International Conference of Computer and Information Technology (ICCIT), pp. 1–6, Dhaka, Bangladesh, December 2017.

[19] H. Nantian, C. Yuan, G. Cai, and E. Xing, "Hybrid short term wind speed forecasting using variational mode decomposition and a weighted regularized extreme learning machine," Energies, vol. 9, no. 12, pp. 989–1007, 2016.

[20] S. Haijian and X. Deng, "AdaBoosting neural network for short-term wind speed forecasting based on seasonal characteristics analysis and lag space estimation," Computer Modeling in Engineering & Sciences, vol. 114, no. 3, pp. 277–293, 2018.

[21] W. Xiaodan, "Forecasting short-term wind speed using support vector machine with particle swarm optimization," in Proceedings of the 2017 International Conference on Sensing, Diagnostics, Prognostics and Control (SDPC), pp. 241–245, Shanghai, China, August 2017.


[22] F. R. Ningsih, E. C. Djamal, and A. Najmurrakhman, "Wind speed forecasting using recurrent neural networks and long short term memory," in Proceedings of the 2019 6th International Conference on Instrumentation, Control and Automation (ICA), pp. 137–141, Bandung, Indonesia, July–August 2019.

[23] P. K. Datta, "An artificial neural network approach for short-term wind speed forecast," Master's thesis, Kansas State University, Manhattan, KS, USA, 2018.

[24] J. Guanlong, D. Li, L. Yao, and P. Zhao, "An improved artificial bee colony-BP neural network algorithm in the short-term wind speed prediction," in Proceedings of the 2016 12th World Congress on Intelligent Control and Automation (WCICA), pp. 2252–2255, Guilin, China, June 2016.

[25] C. Gonggui, J. Chen, Z. Zhang, and Z. Sun, "Short-term wind speed forecasting based on fuzzy C-means clustering and improved MEA-BP," IAENG International Journal of Computer Science, vol. 46, no. 4, pp. 27–35, 2019.

[26] W. ZhanJie and A. B. M. Mazharul Mujib, "The weather forecast using data mining research based on cloud computing," Journal of Physics: Conference Series, vol. 910, no. 1, Article ID 012020, 2017.

[27] A. Zhu, X. Li, Z. Mo, and R. Wu, "Wind power prediction based on a convolutional neural network," in Proceedings of the 2017 International Conference on Circuits, Devices and Systems (ICCDS), pp. 131–135, Chengdu, China, September 2017.

[28] O. Linda, T. Vollmer, and M. Manic, "Neural network based intrusion detection system for critical infrastructures," in Proceedings of the 2009 International Joint Conference on Neural Networks, pp. 1827–1834, Atlanta, GA, USA, June 2009.

[29] C. Li, Z. Xiao, X. Xia, W. Zou, and C. Zhang, "A hybrid model based on synchronous optimisation for multi-step short-term wind speed forecasting," Applied Energy, vol. 215, pp. 131–144, 2018.

[30] National Wind Institute, West Texas Mesonet, Online Referencing, http://www.depts.ttu.edu, 2018.



For completeness to effectively evaluate the investi-gated forecasting techniques in terms of their predictionaccuracies 50 cross validation procedure is carried out inwhich the investigated techniques are built and thenevaluated on 50 different training and test datasets re-spectively randomly sampled from the available overalldataset )e ultimate performance metrics are then re-ported as the average and the standard deviation values ofthe 50 metrics obtained in each cross validation trial Inthis regard Figure 6 shows the average performancemetrics on the test dataset using the 50 cross validationprocedure It can be easily recognized that the forecastingmodels that employ the LSTM technique outperform theother investigated techniques in terms of the three per-formance metrics R2 RMSE and MAE

From the experimental results of short-term wind speedforecasting shown in Figure 6 we can observe thatConvLSTM performs the best in terms of forecasting metrics(R2 RMSE andMAE) as compared to the other models (ieCNN ANN SVR and LSTM) )e related statistical tests inTables 6 and 7 respectively have proved the effectiveness ofConvLSTM and its capability of handling noisy large dataConvLSTM showed that it can produce high accurate windspeed prediction with less lags and hidden neurons )iswas indeed reflected in the results shown in Table 8 withless computation time as compared to the other testedmodels Furthermore we introduced the multi-lags-one-step (MLOS) ahead forecasting combined with the hybridConvLSTMmodel to provide an efficient generalization fornew time series data to predict wind speed accuratelyResults showed that ConvLSTM proposed in this paperis an effective and promising model for wind speedforecasting

CNN ANN LSTM SVM ConvLSTM5min 09873 09873 09876 09815 0987630min 09091 09089 09097 09078 090981hour 08309 08302 08315 08211 08316

Val

ues

Models

R2

0009018027036045054063072081

09099

(a)

Val

ues

CNN ANN LSTM SVM ConvLSTM5min 02482 02483 02424 02462 0241930min 06578 06588 06467 06627 064631hour 08862 08881 08718 09116 08717

Models

RMSE

0010203040506070809

(b)

Val

ues

CNN ANN LSTM SVM ConvLSTM5min 01806 01798 01758 01855 0175530min 04802 04799 04721 04818 047411hour 06468 06473 06392 06591 06369

Models

MAE

0006012018024

03036042048054

06066

(c)

Figure 2 Modelsrsquo key performance indicators (KPIs)

Table 7 Optimum number of lags

Method 5min 30min 1 hourCNN 9 4 4ANN 3 3 3LSTM 7 6 10ConvLSTM 4 5 8SVR 9 4 5

Computational Intelligence and Neuroscience 11

ConvLSTM

50 75 100 125 150 175 20025Number of hidden neurons

0986909870098710987209873098740987509876

R-sq

uare

d

(a)

ConvLSTM

0986909870098710987209873098740987509876

RMSE

50 75 100 125 150 175 20025Number of hidden neurons

(b)

ConvLSTM

0176017701780179018001810182

MA

E

50 75 100 125 150 175 20025Number of hidden neurons

(c)

Figure 3 ConvLSTM measured statistical values and number of hidden neurons

2 4 6 8 10Number of lags

ConvLSTM

098575

098600

098625

098650

098675

098700

098725

098750

R-sq

uare

d

(a)

2 4 6 8 10Number of lags

ConvLSTM

02425

02450

02475

02500

02525

02550

02575

02600

RMSE

(b)

2 4 6 8 10Number of lags

ConvLSTM

0176

0178

0180

0182

0184

0186

0188

0190

MA

E

(c)

Figure 4 ConvLSTM measured statistical values and number of lags

12 Computational Intelligence and Neuroscience

Similar to our work the proposed EnsemLSTM modelby Chen et al [15] contained different clusters of LSTMwith different hidden layers and hidden neurons )eycombined LSTM clusters with SVR and external optimizerin order to enhance the generalization capability and ro-bustness of their model However their model showed ahigh computational complexity with mediocre perfor-mance indices Our proposed ConvLSTM with MLOS

assured boosting the generalization and robustness for thenew time series data as well as producing high performanceindices

6 Conclusions

In this study we proposed a hybrid deep learning-basedframework ConvLSTM for short-term prediction of the wind

Table 8 Execution time

(5min) time (min) (30min) time (min) (1 hour) time (min)ConvLSTM 17338 03849 01451SVR 541424 08214 02250CNN 087828 01322 00708ANN 07431 02591 00587LSTM 16570 03290 01473

5

4

3

2

1

0 25 50 75 100 125 150 175 200Number of samples

Win

d sp

eed

(mph

)ConvLSTM 5 min time span

PredictedTrue

Figure 5 ConvLSTM truepredicted wind speed and number of samples

0988

0986

0984

0982

098ConvLSTM LSTM ANN CNN SVR

R2

(a)

ConvLSTM LSTM ANN CNN SVR

026

024

022

02

RMSE

(b)

ConvLSTM LSTM ANN CNN SVR

02

019

018

017

016

015

MA

E

(c)

Figure 6 Average performance metrics obtained on the test dataset using 50 cross validation procedure

Computational Intelligence and Neuroscience 13

speed time series measurements )e proposed dynamicpredictionmodel was optimized for the number of input lagsand the number of internal hidden neurons Multi-lags-one-step (MLOS) ahead wind speed forecasting using the pro-posed approach showed superior results compared to fourother different models built using standard ANN CNNLSTM and SVM approaches )e proposed modelingframework combines the benefits of CNN and LSTM net-works in a hybrid modeling scheme that shows highly ac-curate wind speed prediction results with less lags andhidden neurons as well as less computational complexityFor future work further investigation can be done to im-prove the accuracy of the ConvLSTM model for instanceincreasing and optimizing the number of hidden layersapplying a multi-lags-multi-steps (MLMS) ahead forecast-ing and introducing reinforcement learning agent to op-timize the parameters as compared to other optimizationmethods

Data Availability

)ewind speed data used in this study have been taken fromthe West Texas Mesonet of the US National Wind Institute(httpwwwdeptsttuedunwiresearchfacilitieswtmindexphp) Data are provided freely for academic researchpurposes only and cannot be shareddistributed beyondacademic research use without permission from the WestTexas Mesonet

Conflicts of Interest

)e authors declare that they have no conflicts of interest

Acknowledgments

)e authors acknowledge the help of Prof Brian Hirth inTexas Tech University for providing them with access toweather data

References

[1] R Rothermel ldquoHow to predict the spread and intensity offorest and range firesrdquo General Technical Report INT-143Intermountain Forest and Range Experiment Station OgdenUT USA 1983

[2] A Tascikaraoglu and M Uzunoglu ldquoA review of combinedapproaches for prediction of short-term wind speed andpowerrdquo Renewable and Sustainable Energy Reviews vol 34pp 243ndash254 2014

[3] R J Barthelmie F Murray and S C Pryor ldquo)e economicbenefit of short-term forecasting for wind energy in the UKelectricity marketrdquo Energy Policy vol 36 no 5 pp 1687ndash1696 2008

[4] Y-K Wu and H Jing-Shan A Literature Review of WindForecasting Technology in the World pp 504ndash509 IEEELausanne Powertech Lausanne Switzerland 2007

[5] W S McCulloch and W Pitts ldquoA logical calculus of the ideasimmanent in nervous activityrdquo Be Bulletin of MathematicalBiophysics vol 5 no 4 pp 115ndash133 1943

[6] K Fukushima ldquoNeocognitronrdquo Scholarpedia vol 2 no 1p 1717 2007

[7] K Fukushima ldquoNeocognitron a self-organizing neural net-work model for a mechanism of pattern recognition unaf-fected by shift in positionrdquo Biological Cybernetics vol 36no 4 pp 193ndash202 1980

[8] S Hochreite and J Schmidhuber ldquoLong short-termmemoryrdquoNeural Computation vol 9 no 8 pp 1735ndash1780 1997

[9] H T Siegelmann and E D Sontag ldquoOn the computationalpower of neural netsrdquo Journal of Computer and System Sci-ences vol 50 no 1 pp 132ndash150 1995

[10] R Sunil ldquoUnderstanding support vector machine algorithmrdquohttpswwwanalyticsvidhyacomblog201709understaing-support-vector-machine-example-code 2017

[11] X Shi Z Chen H Wang D Y Yeung W K Wong andW C Woo ldquoConvolutional LSTM network a machinelearning approach for precipitation nowcastingrdquo in Pro-ceedings of the Neural Information Processing Systemspp 7ndash12 Montreal Canada December 2015

[12] C Finn ldquoGoodfellow I and Levine S Unsupervised learningfor physical interaction through video predictionrdquo in Pro-ceedings of the Annual Conference Neural Information Pro-cessing Systems pp 64ndash72 Barcelona Spain December 2016

[13] H Xu Y Gao F Yu and T Darrell ldquoEnd-to-end learning ofdriving models from large-scale video datasetsrdquo in Proceed-ings of the 2017 IEEE Conference on Computer Vision andPattern Recognition pp 3530ndash3538 Honolulu HI USA July2017

[14] K Chen and J Yu ldquoShort-term wind speed prediction usingan unscented Kalman filter based state-space support vectorregression approachrdquo Applied Energy vol 113 pp 690ndash7052014

[15] J Chen G-Q Zeng W Zhou W Du and K-D Lu ldquoWindspeed forecasting using nonlinear-learning ensemble of deeplearning time series prediction and extremal optimizationrdquoEnergy Conversion and Management vol 165 pp 681ndash6952018

[16] D Liu D Niu H Wang and L Fan ldquoShort-term wind speedforecasting using wavelet transform and support vectormachines optimized by genetic algorithmrdquo Renewable Energyvol 62 pp 592ndash597 2014

[17] Y Wang ldquoShort-term wind power forecasting by geneticalgorithm of wavelet neural networkrdquo in Proceedings of the2014 International Conference on Information Science Elec-tronics and Electrical Engineering pp 1752ndash1755 SapporoJapan April 2014

[18] N Sheikh D B Talukdar R I Rasel and N Sultana ldquoA shortterm wind speed forcasting using svr and bp-ann a com-parative analysisrdquo in Proceedings of the 2017 20th Interna-tional Conference of Computer and Information Technology(ICCIT) pp 1ndash6 Dhaka Bangladesh December 2017

[19] H Nantian C Yuan G Cai and E Xing ldquoHybrid short termwind speed forecasting using variational mode decompositionand a weighted regularized extreme learning machinerdquo En-ergies vol 9 no 12 pp 989ndash1007 2016

[20] S Haijian and X Deng ldquoAdaBoosting neural network forshort-term wind speed forecasting based on seasonal char-acteristics analysis and lag space estimationrdquo ComputerModeling in Engineering amp Sciences vol 114 no 3pp 277ndash293 2018

[21] W Xiaodan ldquoForecasting short-term wind speed usingsupport vector machine with particle swarm optimizationrdquo inProceedings of the 2017 International Conference on SensingDiagnostics Prognostics and Control (SDPC) pp 241ndash245Shanghai China August 2017

14 Computational Intelligence and Neuroscience

[22] F R Ningsih E C Djamal and A Najmurrakhman ldquoWindspeed forecasting using recurrent neural networks and longshort term memoryrdquo in Proceedings of the 2019 6th Inter-national Conference on Instrumentation Control and Auto-mation (ICA) pp 137ndash141 Bandung Indonesia JulyndashAugust2019

[23] P K Datta ldquoAn artificial neural network approach for short-term wind speed forecastrdquo Master thesis Kansas State Uni-versity Manhattan KS USA 2018

[24] J Guanlong D Li L Yao and P Zhao ldquoAn improved ar-tificial bee colony-BP neural network algorithm in the short-term wind speed predictionrdquo in Proceedings of the 2016 12thWorld Congress on Intelligent Control and Automation(WCICA) pp 2252ndash2255 Guilin China June 2016

[25] C Gonggui J Chen Z Zhang and Z Sun ldquoShort-term windspeed forecasting based on fuzzy C-means clustering andimproved MEA-BPrdquo IAENG International Journal of Com-puter Science vol 46 no 4 pp 27ndash35 2019

[26] W ZhanJie and A B M Mazharul Mujib ldquo)e weatherforecast using data mining research based on cloud com-putingrdquo in Journal of Physics Conference Series vol 910 no 1Article ID 012020 2017

[27] A Zhu X Li Z Mo and R Wu ldquoWind power predictionbased on a convolutional neural networkrdquo in Proceedings ofthe 2017 International Conference on Circuits Devices AndSystems (ICCDS) pp 131ndash135 Chengdu China September2017

[28] O Linda T Vollmer and M Manic ldquoNeural network basedintrusion detection system for critical infrastructuresrdquo inProceedings of the 2009 International Joint Conference onNeural Networks pp 1827ndash1834 Atlanta GA USA June2009

[29] C Li Z Xiao X Xia W Zou and C Zhang ldquoA hybrid modelbased on synchronous optimisation for multi-step short-termwind speed forecastingrdquoApplied Energy vol 215 pp 131ndash1442018

[30] Institute National Wind West Texas Mesonet Online Ref-erencing httpwwwdeptsttueducom 2018

Computational Intelligence and Neuroscience 15

ConvLSTM

50 75 100 125 150 175 20025Number of hidden neurons

0986909870098710987209873098740987509876

R-sq

uare

d

(a)

ConvLSTM

0986909870098710987209873098740987509876

RMSE

50 75 100 125 150 175 20025Number of hidden neurons

(b)

ConvLSTM

0176017701780179018001810182

MA

E

50 75 100 125 150 175 20025Number of hidden neurons

(c)

Figure 3 ConvLSTM measured statistical values and number of hidden neurons

2 4 6 8 10Number of lags

ConvLSTM

098575

098600

098625

098650

098675

098700

098725

098750

R-sq

uare

d

(a)

2 4 6 8 10Number of lags

ConvLSTM

02425

02450

02475

02500

02525

02550

02575

02600

RMSE

(b)

2 4 6 8 10Number of lags

ConvLSTM

0176

0178

0180

0182

0184

0186

0188

0190

MA

E

(c)

Figure 4 ConvLSTM measured statistical values and number of lags

12 Computational Intelligence and Neuroscience

Similar to our work the proposed EnsemLSTM modelby Chen et al [15] contained different clusters of LSTMwith different hidden layers and hidden neurons )eycombined LSTM clusters with SVR and external optimizerin order to enhance the generalization capability and ro-bustness of their model However their model showed ahigh computational complexity with mediocre perfor-mance indices Our proposed ConvLSTM with MLOS

assured boosting the generalization and robustness for thenew time series data as well as producing high performanceindices

6 Conclusions

In this study we proposed a hybrid deep learning-basedframework ConvLSTM for short-term prediction of the wind

Table 8 Execution time

(5min) time (min) (30min) time (min) (1 hour) time (min)ConvLSTM 17338 03849 01451SVR 541424 08214 02250CNN 087828 01322 00708ANN 07431 02591 00587LSTM 16570 03290 01473

5

4

3

2

1

0 25 50 75 100 125 150 175 200Number of samples

Win

d sp

eed

(mph

)ConvLSTM 5 min time span

PredictedTrue

Figure 5 ConvLSTM truepredicted wind speed and number of samples

0988

0986

0984

0982

098ConvLSTM LSTM ANN CNN SVR

R2

(a)

ConvLSTM LSTM ANN CNN SVR

026

024

022

02

RMSE

(b)

ConvLSTM LSTM ANN CNN SVR

02

019

018

017

016

015

MA

E

(c)

Figure 6 Average performance metrics obtained on the test dataset using 50 cross validation procedure

Computational Intelligence and Neuroscience 13

speed time series measurements )e proposed dynamicpredictionmodel was optimized for the number of input lagsand the number of internal hidden neurons Multi-lags-one-step (MLOS) ahead wind speed forecasting using the pro-posed approach showed superior results compared to fourother different models built using standard ANN CNNLSTM and SVM approaches )e proposed modelingframework combines the benefits of CNN and LSTM net-works in a hybrid modeling scheme that shows highly ac-curate wind speed prediction results with less lags andhidden neurons as well as less computational complexityFor future work further investigation can be done to im-prove the accuracy of the ConvLSTM model for instanceincreasing and optimizing the number of hidden layersapplying a multi-lags-multi-steps (MLMS) ahead forecast-ing and introducing reinforcement learning agent to op-timize the parameters as compared to other optimizationmethods

Data Availability

)ewind speed data used in this study have been taken fromthe West Texas Mesonet of the US National Wind Institute(httpwwwdeptsttuedunwiresearchfacilitieswtmindexphp) Data are provided freely for academic researchpurposes only and cannot be shareddistributed beyondacademic research use without permission from the WestTexas Mesonet

Conflicts of Interest

)e authors declare that they have no conflicts of interest

Acknowledgments

)e authors acknowledge the help of Prof Brian Hirth inTexas Tech University for providing them with access toweather data

References

[1] R Rothermel ldquoHow to predict the spread and intensity offorest and range firesrdquo General Technical Report INT-143Intermountain Forest and Range Experiment Station OgdenUT USA 1983

[2] A Tascikaraoglu and M Uzunoglu ldquoA review of combinedapproaches for prediction of short-term wind speed andpowerrdquo Renewable and Sustainable Energy Reviews vol 34pp 243ndash254 2014

[3] R J Barthelmie F Murray and S C Pryor ldquo)e economicbenefit of short-term forecasting for wind energy in the UKelectricity marketrdquo Energy Policy vol 36 no 5 pp 1687ndash1696 2008

[4] Y-K Wu and H Jing-Shan A Literature Review of WindForecasting Technology in the World pp 504ndash509 IEEELausanne Powertech Lausanne Switzerland 2007

[5] W S McCulloch and W Pitts ldquoA logical calculus of the ideasimmanent in nervous activityrdquo Be Bulletin of MathematicalBiophysics vol 5 no 4 pp 115ndash133 1943

[6] K Fukushima ldquoNeocognitronrdquo Scholarpedia vol 2 no 1p 1717 2007

[7] K Fukushima ldquoNeocognitron a self-organizing neural net-work model for a mechanism of pattern recognition unaf-fected by shift in positionrdquo Biological Cybernetics vol 36no 4 pp 193ndash202 1980

[8] S Hochreite and J Schmidhuber ldquoLong short-termmemoryrdquoNeural Computation vol 9 no 8 pp 1735ndash1780 1997

[9] H T Siegelmann and E D Sontag ldquoOn the computationalpower of neural netsrdquo Journal of Computer and System Sci-ences vol 50 no 1 pp 132ndash150 1995

[10] R Sunil ldquoUnderstanding support vector machine algorithmrdquohttpswwwanalyticsvidhyacomblog201709understaing-support-vector-machine-example-code 2017

[11] X Shi Z Chen H Wang D Y Yeung W K Wong andW C Woo ldquoConvolutional LSTM network a machinelearning approach for precipitation nowcastingrdquo in Pro-ceedings of the Neural Information Processing Systemspp 7ndash12 Montreal Canada December 2015

[12] C Finn ldquoGoodfellow I and Levine S Unsupervised learningfor physical interaction through video predictionrdquo in Pro-ceedings of the Annual Conference Neural Information Pro-cessing Systems pp 64ndash72 Barcelona Spain December 2016

[13] H Xu Y Gao F Yu and T Darrell ldquoEnd-to-end learning ofdriving models from large-scale video datasetsrdquo in Proceed-ings of the 2017 IEEE Conference on Computer Vision andPattern Recognition pp 3530ndash3538 Honolulu HI USA July2017

[14] K Chen and J Yu ldquoShort-term wind speed prediction usingan unscented Kalman filter based state-space support vectorregression approachrdquo Applied Energy vol 113 pp 690ndash7052014

[15] J Chen G-Q Zeng W Zhou W Du and K-D Lu ldquoWindspeed forecasting using nonlinear-learning ensemble of deeplearning time series prediction and extremal optimizationrdquoEnergy Conversion and Management vol 165 pp 681ndash6952018

[16] D Liu D Niu H Wang and L Fan ldquoShort-term wind speedforecasting using wavelet transform and support vectormachines optimized by genetic algorithmrdquo Renewable Energyvol 62 pp 592ndash597 2014

[17] Y Wang ldquoShort-term wind power forecasting by geneticalgorithm of wavelet neural networkrdquo in Proceedings of the2014 International Conference on Information Science Elec-tronics and Electrical Engineering pp 1752ndash1755 SapporoJapan April 2014

Computational Intelligence and Neuroscience

Similar to our work, the proposed EnsemLSTM model by Chen et al. [15] contained different clusters of LSTMs with different hidden layers and hidden neurons. They combined the LSTM clusters with SVR and an external optimizer in order to enhance the generalization capability and robustness of their model. However, their model showed high computational complexity with mediocre performance indices. Our proposed ConvLSTM with MLOS, in contrast, boosted generalization and robustness on new time series data while producing high performance indices.
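The multi-lags-one-step (MLOS) framing used throughout this comparison can be illustrated with a short, library-free sketch: each training sample consists of the previous n lagged wind speed readings, and the target is the single reading that follows them. The function and variable names below are our own illustration, not code from the study.

```python
def mlos_windows(series, n_lags):
    """Frame a univariate series for multi-lags-one-step (MLOS)
    forecasting: each input is n_lags consecutive readings and the
    target is the single reading that immediately follows them."""
    inputs, targets = [], []
    for t in range(len(series) - n_lags):
        inputs.append(series[t:t + n_lags])   # lagged input window
        targets.append(series[t + n_lags])    # one step ahead
    return inputs, targets

# Example: 5-minute wind speed readings (mph), 3 lags per sample
speeds = [3.1, 3.4, 3.2, 3.8, 4.0, 3.9]
X, y = mlos_windows(speeds, n_lags=3)
# X[0] == [3.1, 3.4, 3.2] and y[0] == 3.8
```

Any of the compared models (ANN, CNN, LSTM, ConvLSTM, SVR) can then be trained on the (X, y) pairs; only the internal architecture differs.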

6. Conclusions

In this study, we proposed a hybrid deep learning-based framework, ConvLSTM, for short-term prediction of the wind speed time series measurements. The proposed dynamic prediction model was optimized for the number of input lags and the number of internal hidden neurons. Multi-lags-one-step (MLOS) ahead wind speed forecasting using the proposed approach showed superior results compared to four other models built using standard ANN, CNN, LSTM, and SVM approaches. The proposed modeling framework combines the benefits of CNN and LSTM networks in a hybrid modeling scheme that delivers highly accurate wind speed predictions with fewer lags and hidden neurons as well as lower computational complexity. For future work, further investigation can be done to improve the accuracy of the ConvLSTM model, for instance, by increasing and optimizing the number of hidden layers, applying multi-lags-multi-steps (MLMS) ahead forecasting, and introducing a reinforcement learning agent to optimize the parameters, as compared to other optimization methods.

Table 8: Execution time (min) per dataset resolution.

Model       5 min data    30 min data    1 hour data
ConvLSTM    1.7338        0.3849         0.1451
SVR         54.1424       0.8214         0.2250
CNN         0.87828       0.1322         0.0708
ANN         0.7431        0.2591         0.0587
LSTM        1.6570        0.3290         0.1473

Figure 5: ConvLSTM true and predicted wind speed (mph) versus number of samples, 5 min time span.

Figure 6: Average performance metrics obtained on the test dataset using the 50 cross-validation procedure: (a) R2; (b) RMSE; (c) MAE.
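The three indicators reported in Figure 6 follow their standard definitions. The sketch below is our own generic illustration of those definitions, not the authors' evaluation code; it computes R2, RMSE, and MAE for a pair of true/predicted series.

```python
import math

def evaluate(true, pred):
    """Standard regression metrics used to compare the forecasters:
    coefficient of determination (R2), root mean square error (RMSE),
    and mean absolute error (MAE)."""
    n = len(true)
    mean_true = sum(true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(true, pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in true)        # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(true, pred)) / n
    return r2, rmse, mae

# Toy example with four wind speed readings (mph)
r2, rmse, mae = evaluate([3.0, 4.0, 5.0, 6.0], [3.1, 3.9, 5.2, 5.8])
```

Lower RMSE and MAE, and R2 closer to 1, indicate a better forecaster; averaging these metrics over repeated cross-validation runs yields the bar values in Figure 6.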

Data Availability

The wind speed data used in this study have been taken from the West Texas Mesonet of the US National Wind Institute (http://www.depts.ttu.edu/nwi/research/facilities/wtm/index.php). Data are provided freely for academic research purposes only and cannot be shared/distributed beyond academic research use without permission from the West Texas Mesonet.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors acknowledge the help of Prof. Brian Hirth at Texas Tech University for providing them with access to weather data.

References

[1] R. Rothermel, "How to predict the spread and intensity of forest and range fires," General Technical Report INT-143, Intermountain Forest and Range Experiment Station, Ogden, UT, USA, 1983.

[2] A. Tascikaraoglu and M. Uzunoglu, "A review of combined approaches for prediction of short-term wind speed and power," Renewable and Sustainable Energy Reviews, vol. 34, pp. 243–254, 2014.

[3] R. J. Barthelmie, F. Murray, and S. C. Pryor, "The economic benefit of short-term forecasting for wind energy in the UK electricity market," Energy Policy, vol. 36, no. 5, pp. 1687–1696, 2008.

[4] Y.-K. Wu and H. Jing-Shan, A Literature Review of Wind Forecasting Technology in the World, pp. 504–509, IEEE Lausanne Powertech, Lausanne, Switzerland, 2007.

[5] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," The Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115–133, 1943.

[6] K. Fukushima, "Neocognitron," Scholarpedia, vol. 2, no. 1, p. 1717, 2007.

[7] K. Fukushima, "Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics, vol. 36, no. 4, pp. 193–202, 1980.

[8] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.

[9] H. T. Siegelmann and E. D. Sontag, "On the computational power of neural nets," Journal of Computer and System Sciences, vol. 50, no. 1, pp. 132–150, 1995.

[10] R. Sunil, "Understanding support vector machine algorithm," https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code, 2017.

[11] X. Shi, Z. Chen, H. Wang, D. Y. Yeung, W. K. Wong, and W. C. Woo, "Convolutional LSTM network: a machine learning approach for precipitation nowcasting," in Proceedings of the Neural Information Processing Systems, pp. 7–12, Montreal, Canada, December 2015.

[12] C. Finn, I. Goodfellow, and S. Levine, "Unsupervised learning for physical interaction through video prediction," in Proceedings of the Annual Conference on Neural Information Processing Systems, pp. 64–72, Barcelona, Spain, December 2016.

[13] H. Xu, Y. Gao, F. Yu, and T. Darrell, "End-to-end learning of driving models from large-scale video datasets," in Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3530–3538, Honolulu, HI, USA, July 2017.

[14] K. Chen and J. Yu, "Short-term wind speed prediction using an unscented Kalman filter based state-space support vector regression approach," Applied Energy, vol. 113, pp. 690–705, 2014.

[15] J. Chen, G.-Q. Zeng, W. Zhou, W. Du, and K.-D. Lu, "Wind speed forecasting using nonlinear-learning ensemble of deep learning time series prediction and extremal optimization," Energy Conversion and Management, vol. 165, pp. 681–695, 2018.

[16] D. Liu, D. Niu, H. Wang, and L. Fan, "Short-term wind speed forecasting using wavelet transform and support vector machines optimized by genetic algorithm," Renewable Energy, vol. 62, pp. 592–597, 2014.

[17] Y. Wang, "Short-term wind power forecasting by genetic algorithm of wavelet neural network," in Proceedings of the 2014 International Conference on Information Science, Electronics and Electrical Engineering, pp. 1752–1755, Sapporo, Japan, April 2014.

[18] N. Sheikh, D. B. Talukdar, R. I. Rasel, and N. Sultana, "A short-term wind speed forcasting using SVR and BP-ANN: a comparative analysis," in Proceedings of the 2017 20th International Conference of Computer and Information Technology (ICCIT), pp. 1–6, Dhaka, Bangladesh, December 2017.

[19] H. Nantian, C. Yuan, G. Cai, and E. Xing, "Hybrid short term wind speed forecasting using variational mode decomposition and a weighted regularized extreme learning machine," Energies, vol. 9, no. 12, pp. 989–1007, 2016.

[20] S. Haijian and X. Deng, "AdaBoosting neural network for short-term wind speed forecasting based on seasonal characteristics analysis and lag space estimation," Computer Modeling in Engineering & Sciences, vol. 114, no. 3, pp. 277–293, 2018.

[21] W. Xiaodan, "Forecasting short-term wind speed using support vector machine with particle swarm optimization," in Proceedings of the 2017 International Conference on Sensing, Diagnostics, Prognostics and Control (SDPC), pp. 241–245, Shanghai, China, August 2017.

[22] F. R. Ningsih, E. C. Djamal, and A. Najmurrakhman, "Wind speed forecasting using recurrent neural networks and long short term memory," in Proceedings of the 2019 6th International Conference on Instrumentation, Control and Automation (ICA), pp. 137–141, Bandung, Indonesia, July–August 2019.

[23] P. K. Datta, "An artificial neural network approach for short-term wind speed forecast," Master's thesis, Kansas State University, Manhattan, KS, USA, 2018.

[24] J. Guanlong, D. Li, L. Yao, and P. Zhao, "An improved artificial bee colony-BP neural network algorithm in the short-term wind speed prediction," in Proceedings of the 2016 12th World Congress on Intelligent Control and Automation (WCICA), pp. 2252–2255, Guilin, China, June 2016.

[25] C. Gonggui, J. Chen, Z. Zhang, and Z. Sun, "Short-term wind speed forecasting based on fuzzy C-means clustering and improved MEA-BP," IAENG International Journal of Computer Science, vol. 46, no. 4, pp. 27–35, 2019.

[26] W. ZhanJie and A. B. M. Mazharul Mujib, "The weather forecast using data mining research based on cloud computing," Journal of Physics: Conference Series, vol. 910, no. 1, Article ID 012020, 2017.

[27] A. Zhu, X. Li, Z. Mo, and R. Wu, "Wind power prediction based on a convolutional neural network," in Proceedings of the 2017 International Conference on Circuits, Devices and Systems (ICCDS), pp. 131–135, Chengdu, China, September 2017.

[28] O. Linda, T. Vollmer, and M. Manic, "Neural network based intrusion detection system for critical infrastructures," in Proceedings of the 2009 International Joint Conference on Neural Networks, pp. 1827–1834, Atlanta, GA, USA, June 2009.

[29] C. Li, Z. Xiao, X. Xia, W. Zou, and C. Zhang, "A hybrid model based on synchronous optimisation for multi-step short-term wind speed forecasting," Applied Energy, vol. 215, pp. 131–144, 2018.

[30] National Wind Institute, West Texas Mesonet, online referencing, http://www.depts.ttu.edu, 2018.
