

Journal of Hydrology 508 (2014) 356–363


SVR-based prediction of evaporation combined with chaotic approach

0022-1694/$ - see front matter © 2013 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.jhydrol.2013.11.008

Özlem Baydaroglu *, Kasım Koçak
Department of Meteorology, İstanbul Technical University, İstanbul, Turkey

* Corresponding author. Tel.: +90 2122853122. E-mail addresses: [email protected] (Ö. Baydaroglu), [email protected] (K. Koçak).


Article history: Received 19 April 2013; received in revised form 1 November 2013; accepted 6 November 2013; available online 20 November 2013. This manuscript was handled by Corrado Corradini, Editor-in-Chief, with the assistance of Fritz Stauffer, Associate Editor.

Keywords: Prediction; Evaporation; Water losses; Support Vector Regression; Chaos

Evaporation, temperature, wind speed, solar radiation and relative humidity time series are used to predict water losses. Prediction of evaporation amounts is performed using Support Vector Regression (SVR), which originated from the Support Vector Machine (SVM). To prepare the input data for SVR, phase space reconstructions are realized using both univariate and multivariate time series embedding methods. The idea behind SVR is the computation of a linear regression in a multidimensional feature space. Observation vectors in the input space are transformed to the feature space by way of a kernel function. In this study, the Radial Basis Function (RBF) is preferred as the kernel function due to its flexibility in fitting observations from many diverse fields. SVR has been reported to be more effective for prediction than other classical and modern methods such as Artificial Neural Networks (ANN), Autoregressive Integrated Moving Average (ARIMA) and the Group Method of Data Handling (GMDH) (Samsudin et al., 2011). Thus SVR has been chosen to predict evaporation amounts because of its good generalization capability. The results show that SVR-based predictions are very successful, with determination coefficients as high as 83% and 97% for univariate and multivariate time series embeddings, respectively.


1. Introduction

Both observations and climate models indicate that global warming will amplify evaporation from soil and free water surfaces. This is expected to have serious adverse effects on water resources, compounded by the increasing world population. Thus, prediction of evaporation amounts is of crucial importance in the management of water resources.

There are many methods for prediction of a time series, but recent applications indicate that SVR (Vapnik, 1995; Vapnik and Cortes, 1995) proves itself a powerful method when compared with classical and state-of-the-art methods. Using SVR, classification of events, prediction of time series and estimation of the probability distribution of a random variable can all be performed. It also has a good generalization capability. However, SVR requires a special input data format and high-speed computers due to its long calculation time. One important point is to decide the type of the kernel function and to determine the optimum parameter values of this function.

In the literature, there are some important applications in which both direct and hybrid versions of SVR are employed successfully. Some of these applications are as follows. In the study of Wu et al. (2009), Autoregressive Moving Average (ARMA), K-Nearest Neighbours (KNN), ANN, Crisp Distributed Artificial Neural Network (CDANN) and Crisp Distributed Support Vector Regression (CDSVR) were employed to predict one-month-ahead streamflow, and the coefficient of efficiency (CE) (Nash and Sutcliffe, 1970) was found to be 1.00 for CDSVR.

Zhao et al. (2010) studied 10-min mean wind speed data and used a Back Propagation Neural Network (BPNN) and SVR for prediction; the SVR results were better than those of BPNN. Similarly, Wang et al. (2010) applied SVR and ANN to 30-min mean wind speed data, with a chaotic approach for the preparation of the input data; the relative root-mean-squared error (RRMSE) for SVR was found to be smaller than that for ANN.

Sivapragasam et al. (2001) used Non-linear Prediction (NLP) and SVR for daily flow data and found that the root-mean-squared error (RMSE) for SVR is the smaller of the two.

For forecasting monthly discharge, Wang et al. (2009) utilized ARMA, ANN, the Adaptive Neural-Based Fuzzy Inference System (ANFIS), Genetic Programming (GP) and SVM. The results show that the highest coefficient of determination for the Manwan Hydropower station belongs to SVM.

Samsudin et al. (2011) used Autoregressive Integrated Moving Average (ARIMA), ANN, the Group Method of Data Handling (GMDH), the Least Squares Support Vector Machine (LSSVM) and GLSSVM, a hybrid of GMDH and LSSVM; according to the coefficients of determination of all the techniques used, the hybrid gave the most accurate results.

Most of the studies mentioned above show that SVR is a quite successful method when compared with the others.

In this study, phase space reconstruction is utilized to form theinput data set for SVR. Prior to phase space reconstruction based on


the Embedding Theorem (Takens, 1981), embedding parameters such as the embedding dimension and the time delay should be determined from the time series. In the literature, there are various methods, such as the Grassberger–Procaccia (GP) algorithm (Grassberger and Procaccia, 1983) and the False Nearest Neighbour (FNN) algorithm (Kennel et al., 1992), to find the embedding dimension. Although these methods give similar results, application of FNN is more practical than the other methods, which is why FNN is chosen in this study. Similarly, there are two basic methods to set the proper time delay: the Autocorrelation Function (ACF) and the Mutual Information Function (MIF) (Fraser and Swinney, 1986), which is a nonlinear version of ACF. Even though ACF and MIF follow completely different procedures to determine the optimum time delay, they usually give very close results. In this study, the result of ACF is chosen because the attractor produced by this approach has a more distinct appearance than that of the other method.

Daily total evaporation (mm), mean temperature (°C), mean wind speed (m/s), mean solar radiation (cal/cm²) and mean relative humidity (%) time series are used for univariate and multivariate time series embedding. Evaporation data observed between 2007 and 2011 were obtained from evaporation pans at Ercan Meteorology Station. All of the data used in the study belong to the same station and observation period. The embedding dimension and the time delay are determined for each time series embedding using FNN, ACF and MIF. The results of ACF are preferred to those of MIF because more accurate predictions were obtained with ACF. These algorithms were implemented using the TISEAN (Time Series Analysis) package (Hegger et al., 1999).

The first step of the implementation of SVR is the selection of an appropriate kernel function, which directly affects the success of SVR. In the literature, there are different kernel functions, such as the linear, polynomial, sigmoid and radial basis (Gaussian) functions (RBF). In this application, RBF is chosen as the best kernel function according to past hydrometeorological studies (see Debnath and Takahashi, 2004; Cheng-Ping et al., 2011); it also represents hydrometeorological processes very well when compared with other kernel functions. LIBSVM (A Library for Support Vector Machines) (Chang and Lin, 2001) is employed for all SVR applications. Daily evaporation data are used for univariate time series embedding. In addition to daily evaporation, temperature, wind speed, solar radiation and relative humidity data are utilized for multivariate time series embeddings. One-step prediction is applied for a one-year prediction period.

2. Methods

In this paper, four different methods are used to characterize the chaotic approach: (1) phase space reconstruction, which requires determination of the embedding parameters so as to prepare the input data; (2) the False Nearest Neighbour (FNN) algorithm to find the embedding dimension; (3) the Mutual Information Function (MIF) and (4) the Autocorrelation Function (ACF) to determine the optimum time delay.

Finally, Support Vector Regression (SVR) with the Radial Basis Function (RBF) is used to predict daily evaporation amounts. The above-mentioned methods are explained in detail in the following subsections.

2.1. Phase space reconstruction

The time evolution of a phenomenon can be given by its trajectories in the phase space. The coordinates of this space are spanned by the variables which are necessary to specify the time evolution of the system (Koçak et al., 2004). Every point in a phase space shows a state of the system, and every trajectory represents the time evolution of the system corresponding to different initial conditions. Points or a set of points in a phase space compose a unique pattern which attracts trajectories onto itself. These kinds of patterns are called attractors.

A phase space reconstruction is necessary in order to estimate the complexity of the system's attractor quantitatively and to determine whether the observed dynamic behaviours are complex or not. From a one-dimensional time series, a phase space can be constructed using the Embedding Theorem (Takens, 1981).

By considering a time series of a single variable, it is assumedthat the time series is generated from a chaotic dynamical system.

Let us consider a time series

$$x_i \in \mathbb{R}, \quad i = 1, 2, \ldots, N, \qquad (1)$$

then the reconstruction procedure is given as

$$X_i = (x_i,\, x_{i-\tau},\, \ldots,\, x_{i-(m-1)\tau}) \in \mathbb{R}^m, \qquad (2)$$

where $X_i$ is an $m$-dimensional vector and $\tau$ is the time delay (Koçak et al., 2000).
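For illustration, the reconstruction of Eq. (2) can be sketched in a few lines of NumPy. The helper name `delay_embed` and its interface are ours; the study itself relied on the TISEAN package rather than custom code.

```python
import numpy as np

def delay_embed(x, m, tau):
    """Delay embedding of Eq. (2): row j is the vector
    X_i = (x_i, x_{i-tau}, ..., x_{i-(m-1)tau}) for i = (m-1)*tau + j."""
    x = np.asarray(x, dtype=float)
    start = (m - 1) * tau  # first index with a full delay history
    if start >= len(x):
        raise ValueError("series too short for the chosen m and tau")
    # column k holds the series delayed by k*tau
    return np.column_stack([x[start - k * tau : len(x) - k * tau]
                            for k in range(m)])
```

For the evaporation series of this study, the corresponding call would be `delay_embed(evaporation, m=11, tau=49)`.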

A phase space reconstruction can also be applied to a multivar-iate time series embedding (Cao et al., 1998).

For an $M$-dimensional time series $X_1, X_2, \ldots, X_N$, where $X_i = (x_{1,i}, x_{2,i}, \ldots, x_{M,i})$, $i = 1, 2, \ldots, N$, the embedding vectors can be reconstructed as

$$V_n = \big(x_{1,n},\, x_{1,n-\tau_1},\, \ldots,\, x_{1,n-(d_1-1)\tau_1},\;
x_{2,n},\, x_{2,n-\tau_2},\, \ldots,\, x_{2,n-(d_2-1)\tau_2},\; \ldots,\;
x_{M,n},\, x_{M,n-\tau_M},\, \ldots,\, x_{M,n-(d_M-1)\tau_M}\big) \qquad (3)$$

where $\tau_i$ and $d_i$, $i = 1, 2, \ldots, M$, are the time delays and the embedding dimensions, respectively. When generalized in accordance with the Embedding Theorem (Takens, 1981; Sauer et al., 1992), Cao et al. (1998) remark that in the generic case there exists a function $F : \mathbb{R}^d \to \mathbb{R}^d$, where $d = \sum_{i=1}^{M} d_i$, such that

$$V_{n+1} = F(V_n) = \begin{pmatrix} x_{1,n+1} = F_1(V_n) \\ x_{2,n+1} = F_2(V_n) \\ \vdots \\ x_{M,n+1} = F_M(V_n) \end{pmatrix} \qquad (4)$$
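The multivariate embedding of Eq. (3) concatenates per-variable delay coordinates. The following NumPy sketch is an illustrative helper of our own (the argument names `dims` and `taus` stand for the $d_i$ and $\tau_i$ of the equation):

```python
import numpy as np

def multivariate_embed(series, dims, taus):
    """Build the vectors V_n of Eq. (3) from M scalar series of equal length.
    series: list of 1-D arrays; dims, taus: embedding dimension d_i and
    time delay tau_i for each variable."""
    n = len(series[0])
    # first index n for which every variable has its full delay history
    start = max((d - 1) * t for d, t in zip(dims, taus))
    blocks = []
    for x, d, t in zip(series, dims, taus):
        x = np.asarray(x, dtype=float)
        blocks.append(np.column_stack([x[start - k * t : n - k * t]
                                       for k in range(d)]))
    return np.hstack(blocks)  # shape: (n - start, sum(dims))
```

With the parameters of Table 2, e.g. MTSE (E,T) would use `dims=(10, 11)` and `taus=(49, 3)`.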

2.1.1. False Nearest Neighbour (FNN) algorithm

Points of the trajectories on the attractor have neighbours in the phase space. The behaviour of these neighbours provides valuable information for understanding the evolution of neighbourhoods in order to produce equations for prediction (Abarbanel et al., 1990). On the other hand, the behaviour of the neighbours in the phase space enables us to develop a simple but efficient algorithm to determine the optimum embedding dimension.

To implement the FNN algorithm, the first step is to reconstruct the phase space from the time series as given in Eq. (2).

The square of the Euclidean distance between the point $X_i$ and its $r$th nearest neighbour $X_i^{(r)}$ in the $m$-dimensional phase space can be stated as

$$R_m^2(i, r) = \sum_{k=0}^{m-1} \left[ x(i + k\tau) - x^{(r)}(i + k\tau) \right]^2 \qquad (5)$$

In going from dimension $m$ to dimension $m + 1$ by time delay embedding, an $(m + 1)$th coordinate is added onto each of the vectors $X_i$. This new coordinate is just $x(i + m\tau)$. Kennel et al. (1992) indicate that, after the addition of the new $(m + 1)$th coordinate, the distance between $X_i$ and the same $r$th nearest neighbour determined in $m$ dimensions is


Fig. 1. Regression with the ε-insensitive tube.

Fig. 2. False nearest neighbours for the evaporation time series.


$$R_{m+1}^2(i, r) = R_m^2(i, r) + \left[ x(i + m\tau) - x^{(r)}(i + m\tau) \right]^2 \qquad (6)$$

After determining the Euclidean distance between a point and its nearest neighbour in phase spaces of dimensions $m$ and $m + 1$, it is necessary to assess whether these points are false or true neighbours. It is obvious that this assessment requires a predetermined threshold value $R_{tol}$ (see details in Kennel et al. (1992)).
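The test above can be sketched with a brute-force neighbour search; the study used TISEAN's implementation, so the helper below is purely illustrative, and the default threshold of 15 is a commonly used value rather than one stated in this paper.

```python
import numpy as np

def fnn_fraction(x, m, tau, r_tol=15.0):
    """Fraction of false nearest neighbours when the embedding dimension is
    increased from m to m+1 (cf. Eqs. (5) and (6)), via brute-force search."""
    x = np.asarray(x, dtype=float)
    n = len(x) - m * tau  # x[i + m*tau] must exist for the new coordinate
    X = np.column_stack([x[k * tau : k * tau + n] for k in range(m)])
    n_false = 0
    for i in range(n):
        d2 = np.sum((X - X[i]) ** 2, axis=1)  # squared distances, Eq. (5)
        d2[i] = np.inf                        # exclude the point itself
        r = int(np.argmin(d2))                # nearest neighbour in dimension m
        r_m = np.sqrt(d2[r])
        extra = abs(x[i + m * tau] - x[r + m * tau])  # gain in dim m+1, Eq. (6)
        if r_m > 0 and extra / r_m > r_tol:
            n_false += 1
    return n_false / n
```

Scanning $m = 1, 2, \ldots$ and taking the first dimension where this fraction reaches its minimum reproduces the logic behind Fig. 2.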

2.1.2. Autocorrelation Function (ACF)

Autocorrelation analysis is a classical way of measuring the linear dependence between observations separated by a time delay $k$.

Fig. 3. (a) Autocorrelation function for the evaporation time series. (b) The attractor of the reconstructed phase space for the evaporation time series.

The autocorrelation coefficient $r_k$ is computed from a discrete time series $x_i$ as

$$r_k = \frac{\sum_{i=1}^{N-k} (x_i - \bar{x})(x_{i+k} - \bar{x})}{\sum_{i=1}^{N} (x_i - \bar{x})^2} \qquad (7)$$

where $N$ is the number of observation points and $\bar{x}$ is the average of the observations. If $r_k$ is plotted versus $k$, the ACF is obtained. In applications, the time delay value that corresponds to the first zero of the ACF, or to the value 0.5, is generally taken as the optimum time delay for the attractor reconstruction.
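Eq. (7) and the delay-selection rule translate into a few lines of NumPy; these helpers are illustrative (the paper's delay of 49 follows from applying the 0.5 rule to the evaporation series):

```python
import numpy as np

def acf(x, max_lag):
    """Autocorrelation coefficients r_k of Eq. (7) for k = 0, ..., max_lag."""
    x = np.asarray(x, dtype=float)
    xm = x - x.mean()
    denom = np.sum(xm ** 2)
    return np.array([np.sum(xm[:len(x) - k] * xm[k:]) / denom
                     for k in range(max_lag + 1)])

def delay_from_acf(x, max_lag, threshold=0.5):
    """First lag k at which r_k drops to the threshold (0.5, as in the paper)."""
    r = acf(x, max_lag)
    below = np.where(r <= threshold)[0]
    return int(below[0]) if below.size else None
```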

2.1.3. Mutual Information Function (MIF)

As mentioned above, ACF is a measure of the linear dependence that may exist in a time series. In the case of real observations, any time series may have nonlinear inner dependence. Thus, in applications, we need a tool that measures dependence of every kind. In the relevant literature, as a nonlinear counterpart of ACF, MIF is a frequently used tool to determine the optimum time delay for attractor reconstruction.

The mutual information is defined as

$$M(\tau) = \sum_{i=1}^{N-(m-1)\tau} P(x_i, x_{i+\tau}, \ldots, x_{i+(m-1)\tau}) \log \left[ \frac{P(x_i, x_{i+\tau}, \ldots, x_{i+(m-1)\tau})}{P(x_i)\, P(x_{i+\tau}) \cdots P(x_{i+(m-1)\tau})} \right] \qquad (8)$$

where $P(x_i)$ is the probability of occurrence of the time series variable $x_i$ and $P(x_i, x_{i+\tau}, \ldots, x_{i+(m-1)\tau})$ is the joint probability of occurrence of the attractor coordinate $X_i = (x_i, x_{i+\tau}, \ldots, x_{i+(m-1)\tau})$. $M(\tau)$ is a measure of the statistical dependence of the


Fig. 4. (a) Mutual information function for the evaporation time series. (b) The attractor of the reconstructed phase space for the evaporation time series.

Table 1. The input matrix for SVR based on the phase space reconstruction.

                 Target     1st dim.    2nd dim.     3rd dim.     ...   mth dim.
Training data    X_{t+1}    X_t         X_{t-τ}      X_{t-2τ}     ...   X_{t-(m-1)τ}
                 X_{t+2}    X_{t+1}     ...          ...          ...   ...
                 ...        ...         ...          ...          ...   ...
Test data        ...        ...         ...          ...          ...   ...
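The layout of Table 1 follows directly from the delay embedding; a NumPy sketch (an illustrative helper of our own naming) that produces the input rows and their one-step-ahead targets:

```python
import numpy as np

def svr_input_matrix(x, m, tau):
    """Rows: phase space vectors (X_t, X_{t-tau}, ..., X_{t-(m-1)tau});
    targets: the one-step-shifted series X_{t+1}, as in Table 1."""
    x = np.asarray(x, dtype=float)
    start = (m - 1) * tau
    rows = np.column_stack([x[start - k * tau : len(x) - 1 - k * tau]
                            for k in range(m)])
    targets = x[start + 1:]
    return rows, targets
```

Splitting `rows` and `targets` front-to-back then gives the training and test portions of the table.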

Table 2. Embedding parameters for the univariate time series embedding (UTSE) and the multivariate time series embeddings (MTSE). (E: evaporation, T: temperature, WS: wind speed, SR: solar radiation, RH: relative humidity.)

UTSE (E):             E: m = 11, τ = 49
MTSE (E,T):           E: m = 10, τ = 49;  T: m = 11, τ = 3
MTSE (E,T,WS):        E: m = 7, τ = 49;   T: m = 7, τ = 2;  WS: m = 8, τ = 2
MTSE (E,T,WS,SR,RH):  E: m = 7, τ = 49;   T: m = 7, τ = 2;  WS: m = 8, τ = 2;  SR: m = 4, τ = 6;  RH: m = 8, τ = 3


reconstruction variables on each other. If the coordinates are statistically independent, then

$$P(x_i, x_{i+\tau}, \ldots, x_{i+(m-1)\tau}) = P(x_i)\, P(x_{i+\tau}) \cdots P(x_{i+(m-1)\tau}) \qquad (9)$$

and it follows that $M(\tau) = 0$. This would be the case for a completely stochastic process such as white noise. In contrast, complete dependence results in $M(\tau) = \infty$. A suitable choice of time delay requires the mutual information to be a minimum; when this is the case, the attractor is as "spread out" as possible. This condition for the choice of delay time is known as the minimum mutual information criterion (Addison, 1997).
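A two-dimensional histogram estimate of the mutual information, i.e. a simplified, illustrative version of Eq. (8) restricted to pairs $(x_i, x_{i+\tau})$, can be sketched as follows (the bin count is an assumption, not a value from the paper):

```python
import numpy as np

def mutual_information(x, tau, bins=16):
    """Histogram estimate of M(tau) between x_i and x_{i+tau}; the optimum
    delay is taken at the first minimum of M(tau) over increasing tau."""
    x = np.asarray(x, dtype=float)
    a, b = x[:-tau], x[tau:]
    pxy, _, _ = np.histogram2d(a, b, bins=bins)
    pxy /= pxy.sum()                            # joint probabilities
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)   # marginal probabilities
    mask = pxy > 0
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / np.outer(px, py)[mask])))
```

For white noise this estimate is close to zero, while strongly dependent samples give large values, matching the discussion above.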

2.2. Support Vector Regression (SVR)

The SV (Support Vector) algorithm was developed as a nonlinear generalization algorithm (Vapnik and Lerner, 1963; Vapnik and Chervonenkis, 1964). Support Vector Machines (SVMs) are learning machines which have a good generalization capability, applying structural risk minimization to a limited number of learning patterns (Basak et al., 2007).

SVR, a subcategory of SVM, was proposed to solve the regression problem (Cheng-Ping et al., 2011). SVR is based on the computation of a linear regression function in a multidimensional feature space.

The learning machine is given the training data $\{x_i, y_i\}$, $i = 1, \ldots, L$, $y_i \in \mathbb{R}$, $x \in \mathbb{R}^D$. The approximation function of these data is a linear regression hyperplane

$$y_i = w x_i + b \qquad (10)$$


Fig. 5. MAE and the SVR parameters g and C (a) for UTSE (E), (b) for MTSE (E,T), (c) for MTSE (E,T,WS), (d) for MTSE (E,T,WS,SR,RH).


where $w$ is the weight vector, $y_i$ is the target value and $b$ is the bias (deviation).

If the data are not linearly separable, slack variables are added to the optimization model. First of all, it is necessary to choose the kernel function and its parameters for nonlinear conditions.

If the predicted value $y_i$ is less than a distance $\varepsilon$ away from the actual value, no penalty is incurred. Otherwise, when output variables lie outside the $\varepsilon$-insensitive tube, slack variable penalties are added depending on whether they lie above ($\xi^+$) or below ($\xi^-$) the tube (see Fig. 1).
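The tube penalty described here can be written as a short function; the split into the two slack variables by the sign of the error is our illustrative convention for "above" and "below" the tube:

```python
def eps_insensitive_penalty(y_true, y_pred, eps):
    """e-insensitive penalty: zero inside the tube, |error| - eps outside;
    the excess is assigned to xi_plus (above) or xi_minus (below)."""
    err = y_pred - y_true
    excess = max(0.0, abs(err) - eps)
    xi_plus = excess if err > 0 else 0.0
    xi_minus = excess if err < 0 else 0.0
    return xi_plus, xi_minus
```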

The error function of SVR can be taken as

$$C \sum_{i=1}^{L} (\xi_i^+ + \xi_i^-) + \frac{1}{2} \|w\|^2 \qquad (11)$$

In order to minimize the error function subject to the constraints $\xi_i^+ \geq 0$, $\xi_i^- \geq 0$ $\forall i$, Lagrange multipliers $\alpha_i^+ \geq 0$, $\alpha_i^- \geq 0$, $\mu_i^+ \geq 0$, $\mu_i^- \geq 0$ $\forall i$ are introduced. To find $\alpha^+$ and $\alpha^-$, the following function is maximized:

$$L_D = \sum_{i=1}^{L} (\alpha_i^+ - \alpha_i^-) t_i - \varepsilon \sum_{i=1}^{L} (\alpha_i^+ + \alpha_i^-) - \frac{1}{2} \sum_{i,j} (\alpha_i^+ - \alpha_i^-)(\alpha_j^+ - \alpha_j^-)\, \phi(x_i) \cdot \phi(x_j) \qquad (12)$$

under the constraints $0 \leq \alpha_i^+ \leq C$, $0 \leq \alpha_i^- \leq C$ $\forall i$ and $\sum_{i=1}^{L} (\alpha_i^+ - \alpha_i^-) = 0$. In Eq. (12), $\phi(x_i) \cdot \phi(x_j)$ is a kernel function, $K(x_i, x_j)$. There are various kernel functions, such as the linear, polynomial, radial basis and sigmoid functions. As mentioned before, the RBF has been used in this study:

$$K(x_i, x_j) = \exp\!\left( -\frac{\|x_i - x_j\|^2}{2 g^2} \right) \qquad (13)$$

where $g$ is the width of the radial basis function. A set of support vectors $S$ can be obtained by finding the indices $i$ where $0 \leq \alpha_i \leq C$ and $\xi_i^- = 0$ or $\xi_i^+ = 0$. Averaging over the $N_s$ support vectors in $S$ gives

$$b = \frac{1}{N_s} \sum_{s \in S} \left[ t_s - \varepsilon - \sum_{m=1}^{L} (\alpha_m^+ - \alpha_m^-) K(x_m, x_s) \right] \qquad (14)$$

Each new point $x'$ is evaluated as (Fletcher, 2009)

$$y' = \sum_{i=1}^{L} (\alpha_i^+ - \alpha_i^-) K(x_i, x') + b \qquad (15)$$
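Eqs. (13) and (15) translate directly into code. The function names below are ours, and the dual coefficients $(\alpha_i^+ - \alpha_i^-)$ and bias $b$ are assumed to come from a trained model such as LIBSVM:

```python
import numpy as np

def rbf_kernel(xi, xj, g):
    """RBF kernel of Eq. (13): K(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 g^2))."""
    return float(np.exp(-np.sum((np.asarray(xi) - np.asarray(xj)) ** 2)
                        / (2.0 * g ** 2)))

def svr_predict(x_new, support_vectors, alpha_diff, b, g):
    """Eq. (15): y' = sum_i (alpha_i^+ - alpha_i^-) K(x_i, x') + b."""
    return sum(a * rbf_kernel(sv, x_new, g)
               for sv, a in zip(support_vectors, alpha_diff)) + b
```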

3. Time series embeddings

As mentioned earlier, the prediction procedures are applied both to a univariate time series embedding using the daily evaporation time series, and to multivariate time series embeddings using the daily evaporation (mm), temperature (°C), wind speed (m/s), solar radiation (cal/cm²) and relative humidity (%) time series observed between 2007 and 2011. In the multivariate time series embeddings, the phase spaces are reconstructed by evaluating the most appropriate embedding parameters for each variable separately. For the phase space reconstruction, the embedding parameters are calculated as explained below.

3.1. Embedding dimension estimation

The percentage of FNN versus $m$ is obtained to select the optimum embedding dimension $m$ (Fig. 2). As shown in this figure, the percentage of FNN takes its minimum value at $m = 11$. Due to the noise effect, the percentage of FNN does not drop to zero.

3.2. Time delay estimations

In applications, two methods are commonly used for the estimation of the time delay, namely ACF and MIF.

3.2.1. Autocorrelation Function (ACF)

The time delay is obtained as 49 from the autocorrelation function for $r_k = 0.5$ (Fig. 3(a)). For these embedding parameters ($\tau = 49$ and $m = 11$), the projection of the reconstructed attractor onto the 3-D phase space is given in Fig. 3(b). The trajectories behave like a quasi-periodic motion, as shown in this figure.


3.2.2. Mutual Information Function (MIF)

The time delay can be taken as 3 because this corresponds to the first minimum of the mutual information function (Fig. 4(a)). For these embedding parameters ($\tau = 3$ and $m = 11$), the attractor in the reconstructed phase space is obtained as given in Fig. 4(b). As shown in this figure, the attractor is stretched along the diagonal when compared with the attractor given in Fig. 3(b).

3.3. Phase space reconstruction

The embedding dimension using the FNN algorithm and the time delay using ACF are determined as 11 and 49, respectively. Considering this embedding dimension and time delay, the input matrix for SVR is constructed as seen in Table 1.

As shown in that table, the input data are formed by the phase space coordinates. The target column contains the training and test data. Because of one-step prediction, the elements of this column are a one-step-shifted version of the original time series.

A phase space should be constructed for the univariate (UTSE) and multivariate time series embeddings (MTSE) to carry out the predictions. Table 2 shows the embedding parameters which are necessary to reconstruct the phase spaces. As shown in that table, the embedding parameters for UTSE are determined by using the FNN and ACF approaches. In other words, direct methods are applied to determine the optimum values of the embedding parameters.

Fig. 6. Prediction of evaporation using SVR (a) for UTSE (E), (b) for MTSE (E,T), (c) for MTSE (E,T,WS), (d) for MTSE (E,T,WS,SR,RH).

Table 3. Information about the predictions made by univariate and multivariate time series embeddings (lengths in days; g, C and ε are the SVR parameters).

Time series embedding   Total data   Phase space   Training   Test   g          C           ε          MAE      R²
UTSE (E)                1826         1796          1430       366    2.593679   0.018581    0.006237   0.00649  0.833
MTSE (E,T)              1826         1796          1430       366    0.068157   2.828427    0.007418   0.00163  0.968
MTSE (E,T,WS)           1826         1812          1446       366    0.002533   512         0.00085    0.00151  0.969
MTSE (E,T,WS,SR,RH)     1826         1784          1418       366    0.011049   10.374717   0.000601   0.00149  0.970

Page 7: SVR-based prediction of evaporation combined with chaotic approach

Fig. 7. Scatterplots of observed and predicted values (a) UTSE (E), (b) MTSE (E,T), (c) MTSE (E,T,WS), (d) MTSE (E,T,WS,SR,RH).


On the other hand, the embedding parameters for MTSE are chosen so that the prediction errors remain at a minimum level.

3.4. Prediction using SVR

The performance of SVR depends on the kernel function and the other parameters, which are related to the noise distribution in the training data (Basak et al., 2007). In this study, RBF is utilized as the kernel function.

There are three parameters that should be set prior to the implementation of SVR: g (the width of the radial basis function), C (the trade-off between the slack variable penalty and the size of the margin) and ε (the tolerated error within the extent of the ε-insensitive tube). In this study, g, C and ε are calculated over their valid ranges, which are determined by a Fortran code before the implementation of SVR. The optimum values of the parameters, extracted from Fig. 5(a–d), are used in the SVR program. The vertical axes of these figures represent the mean absolute error (MAE), while the remaining axes represent the two SVR parameters g and C (the third parameter, ε, is not shown). Flat regions in these figures are the most suitable domains for parameter selection.

If a time series is chaotic, a short-term prediction is quite likely to be more accurate than a long-term prediction because of the sensitivity to initial conditions (Koçak et al., 2000). With this consideration, one-step prediction is performed with a prediction period of one year.

Table 3 summarizes the four types of prediction realized by SVR. In UTSE, only the evaporation time series is used to form the input data set; then one-step prediction of evaporation is fulfilled. After this application, a new variable is added to evaporation each time to construct the MTSE. The last two columns of Table 3 show the mean absolute errors (MAE) and coefficients of determination (R²). Both criteria indicate clearly that SVR is a very successful method in prediction processes.
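The two criteria in Table 3 are standard; for completeness, a sketch of their computation (helper names are ours):

```python
import numpy as np

def mae(obs, pred):
    """Mean absolute error between observed and predicted series."""
    return float(np.mean(np.abs(np.asarray(obs, float) - np.asarray(pred, float))))

def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```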

The 3-month-period prediction results are given in Fig. 6(a–d). In this figure, time series of observed and predicted values are shown. It can easily be seen that the predictions made by the MTSE are more accurate than those of the UTSE.

Scatterplots of observed and predicted evaporation amounts are given in Fig. 7. It is clear from these scatterplots that SVR based on MTSE gives more accurate predictions than SVR based on UTSE.

4. Results and discussion

Evaporation is one of the most important components of water losses from any kind of surface. In this study, evaporation from a free water surface is considered. It is important to predict the amount of water that will be available in reservoirs. This is crucial for water management for the purposes of irrigation, flood control, energy production, recreational activities and drinking water supply.

A five-year meteorological data set has been used to predict the future values of evaporation. One-step prediction is realized for a one-year prediction period. On the other hand, SVR is also capable of making both direct and indirect predictions with longer time steps. Besides, noise reduction has not been performed in this study, but SVR may be applied to noise-free data to obtain predictions with higher accuracy.

In this study, water losses by way of evaporation are predicted using a state-of-the-art statistical learning technique, namely SVR. The input data for SVR are prepared using another powerful approach, chaos theory. To make use of the advantages of chaos theory, it is necessary to reconstruct the phase space from the observations. Thus, two kinds of phase space embedding techniques are applied to the observations: univariate and multivariate time series embeddings. The results show clearly that the prediction method invoked in this study is very successful and encourages similar applications to different hydrometeorological variables.

Acknowledgement

This article was presented at the AOGS–AGU (WPGM) Joint Assembly in Singapore, August 15th, 2012.

References

Abarbanel, H.D.I., Brown, R., Kadtke, J.B., 1990. Prediction in chaotic nonlinear systems: methods for time series with broadband Fourier spectra. Phys. Rev. A 41, 1782.
Addison, P.S., 1997. Fractals and Chaos: An Illustrated Course, first ed. UK.
Basak, D., Pal, S., Patranabis, D.C., 2007. Support vector regression. Neural Inform. Process. Lett. Rev. 11 (10).
Cao, L., Mees, A., Judd, K., 1998. Dynamics from multivariate time series. Phys. D 121, 75–88.
Chang, C.-C., Lin, C.-J., 2001. LIBSVM – a library for support vector machines. <http://www.csie.ntu.edu.tw/~cjlin/libsvm/>.
Cheng-Ping, Z., Chuan, L., Hai-wei, G., 2011. Research on hydrology time series prediction based on grey theory and ε-support vector regression. In: Second International Conference on Digital Manufacturing & Automation.
Debnath, R., Takahashi, H., 2004. Kernel selection for the support vector machine. IEICE Trans. Inform. Syst. E87-D (12), 2903–2904.
Fletcher, T., 2009. Support vector machines explained. Tutorial paper. <http://www.cs.ucl.ac.uk/staff/T.Fletcher/>.
Fraser, A.M., Swinney, H.L., 1986. Independent coordinates for strange attractors from mutual information. Phys. Rev. A 33 (2).
Grassberger, P., Procaccia, I., 1983. Estimation of the Kolmogorov entropy from a chaotic signal. Phys. Rev. A 28 (4), 2591–2593.
Hegger, R., Kantz, H., Schreiber, T., 1999. Practical implementation of nonlinear time series methods: the TISEAN package. Chaos 9, 413–435.
Kennel, M.B., Brown, R., Abarbanel, H.D.I., 1992. Determining embedding dimension for phase-space reconstruction using a geometrical construction. Phys. Rev. A 45, 3403–3411.
Koçak, K., Şaylan, L., Şen, O., 2000. Nonlinear time series prediction of O3 concentration in İstanbul. Atmos. Environ., 1267–1271.
Koçak, K., Şaylan, L., Eitzinger, J., 2004. Nonlinear prediction of near-surface temperature via univariate and multivariate time series embedding. Ecol. Model., 1–7.
Nash, J.E., Sutcliffe, J.V., 1970. River flow forecasting through conceptual models; part I – a discussion of principles. J. Hydrol. 10, 282–290.
Samsudin, R., Saad, P., Shabri, A., 2011. River flow time series using least squares support vector machines. Hydrol. Earth Syst. Sci. 15, 1835–1852.
Sauer, T., Yorke, J.A., Casdagli, M., 1992. Embedology. J. Stat. Phys. 65, 579–616.
Sivapragasam, C., Liong, S.Y., Pasha, M.F.K., 2001. Rainfall and runoff forecasting with SSA–SVM approach. J. Hydroinform. 3 (3), 141–152.
Takens, F., 1981. Detecting strange attractors in turbulence. In: Rand, D.A., Young, L.S. (Eds.), Lecture Notes in Mathematics. Springer-Verlag, pp. 366–381.
Vapnik, V., 1995. The Nature of Statistical Learning Theory. Springer, New York.
Vapnik, V., Chervonenkis, A., 1964. A note on one class of perceptrons. Automation and Remote Control 25.
Vapnik, V., Cortes, C., 1995. Support vector networks. Machine Learning 20, 273–297.
Vapnik, V., Lerner, A., 1963. Pattern recognition using generalized portrait method. Automation and Remote Control 24, 774–780.
Wang, W.-C., Chau, K.-W., Cheng, C.-T., Qiu, L., 2009. A comparison of performance of several artificial intelligence methods for forecasting monthly discharge time series. J. Hydrol. 374, 294–306.
Wang, Y., Wu, D.L., Guo, C.X., Wu, Q.H., Qian, W.Z., Yang, J., 2010. Short-term wind speed prediction using support vector regression. In: IEEE Power and Energy Society General Meeting, pp. 1–6.
Wu, C.L., Chau, K.W., Li, Y.S., 2009. Predicting monthly streamflow using data-driven models coupled with data-preprocessing techniques. Water Resour. Res. 45, W08432.
Zhao, P., Xia, J., Dai, Y., He, J., 2010. Wind speed prediction using support vector regression. In: IEEE Conference on Industrial Electronics and Applications, pp. 882–886.