
Neurocomputing 61 (2004) 139–168, www.elsevier.com/locate/neucom

ANFIS unfolded in time for multivariate time series forecasting

N. Arzu Sisman-Yilmaz a,∗, Ferda N. Alpaslan b, Lakhmi Jain c

a Central Bank of the Republic of Turkey, Ankara, Turkey
b Middle East Technical University, Computer Engineering Department, Ankara 06531, Turkey
c University of South Australia, Adelaide, Australia

Abstract

This paper proposes a temporal neuro-fuzzy system named ANFIS unfolded in time, which is designed to provide an environment that keeps temporal relationships between the variables and to forecast the future behavior of data by using fuzzy rules. It is a modification of the ANFIS neuro-fuzzy model. The rule base of ANFIS unfolded in time contains temporal TSK (Takagi–Sugeno–Kang) fuzzy rules. In the training phase, the back-propagation learning algorithm is used. The system takes the multivariate data and the number of lags needed to construct the unfolded model in order to describe a variable, and predicts the future behavior. Computer simulations are performed by using real multivariate data and a benchmark problem (Gas Furnace Data). Experimental results show that the proposed model achieves online learning and prediction on temporal data. The results are compared with the results of ANFIS.

© 2004 Elsevier B.V. All rights reserved.

Keywords:  Neuro-fuzzy systems; Unfolding in time; Backpropagation

1. Introduction

In multivariate time series analysis, it is possible to define each time series in terms of previous values of itself and previous values of other time series in the same system. The definition of each time series can be represented as a rule which can be used in a rule-based system. These rules can be utilized for forecasting the future behavior of the system.

Neuro-fuzzy systems are widely used for combining the function approximation and learning ability of neural networks with the enhanced explanation capability of fuzzy systems. The Recurrent Neural Network is a convenient structure for processing time series data, as stated in [10]. In Recurrent Neural Networks, if the input sequence is of a

∗ Corresponding author.

0925-2312/$ - see front matter © 2004 Elsevier B.V. All rights reserved.

doi:10.1016/j.neucom.2004.03.009


maximum length T, the recurrent network can be turned into an equivalent feed-forward neural network defined over T time intervals. This idea is called unfolding in time [12]. The feed-forward neural network is duplicated T times, so that each copy of the neural network is kept for one time interval. In other words, each neural network represents a state of the input sequence in time. The connection weights between the same nodes in the duplicated neural networks are identical, so that the neural networks at different time intervals behave identically. A modified version of the back-propagation algorithm is used for training the neural network: the weight updates are summed up and applied to the same weights at the different time intervals. In the literature, there are various examples of unfolding-in-time applications, such as [3,12]. These examples are mostly knowledge-based systems.
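The summed-update scheme above can be sketched on a toy recurrent cell. This is an illustrative stand-in, not the paper's neuro-fuzzy network: a one-unit cell y_t = tanh(w·y_{t−1} + v·x_t) is unfolded over T steps, and the gradient contributions that land on the shared weight w are accumulated across the copies.

```python
import math

# Unfold the cell over len(x) steps; the T copies share the weights w, v.
def unfolded_forward(w, v, x, y0=0.0):
    """Return all states y_0 .. y_T of the unfolded network."""
    ys = [y0]
    for xt in x:
        ys.append(math.tanh(w * ys[-1] + v * xt))
    return ys

def summed_weight_gradient(w, v, x, target, y0=0.0):
    """Back-propagate E = (y_T - target)^2 through all T copies,
    summing the updates that land on the shared weight w."""
    ys = unfolded_forward(w, v, x, y0)
    delta = 2.0 * (ys[-1] - target)          # dE/dy_T
    grad_w = 0.0
    for t in range(len(x), 0, -1):           # walk backwards through the copies
        dtanh = 1.0 - ys[t] ** 2             # derivative of tanh at step t
        grad_w += delta * dtanh * ys[t - 1]  # contribution of copy t to shared w
        delta = delta * dtanh * w            # propagate the error to copy t-1
    return grad_w

x = [0.5, -0.2, 0.1, 0.4]
g = summed_weight_gradient(0.3, 0.8, x, target=0.25)
```

Because the weight is shared, the single scalar `g` is the sum of the per-copy contributions, which is exactly what applying the summed updates to the identical weights achieves.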

In this paper, the unfolding-in-time approach is used to construct the knowledge base in a neuro-fuzzy model. A neuro-fuzzy system is used and duplicated for the number of time intervals needed to forecast the system output accurately. A modified temporal back-propagation algorithm is used as the learning algorithm. The aim is to present the temporal data to the neuro-fuzzy system continuously; in other words, the learning algorithm performs online learning. The neuro-fuzzy model unfolded in time is treated as a single neural network, rather than as a duplication of the basic neural network. The connections between the neural networks at different time intervals are also taken into account while the network parameters are updated.

During the experiments, each data set is processed by the Fuzzy multivariate auto-regression (MAR) algorithm [13], which is a variable extraction method for multivariate time series data. Fuzzy MAR is based on fuzzy linear regression. It aims to extract a set of temporal variables by solving a linear programming problem by means of the Simplex method.

2. ANFIS

Adaptive neuro-fuzzy inference system (ANFIS) is a neuro-fuzzy system developed by Roger Jang [6–9]. It has a feed-forward neural network structure where each layer is a neuro-fuzzy system component (Fig. 1). It simulates a Takagi–Sugeno–Kang (TSK) fuzzy rule [14] of type-3, where the consequent part of the rule is a linear combination of the input variables and a constant. The final output of the system is the weighted average of each rule's output. The form of the type-3 rule simulated in the system is as follows:

IF x1 is A1 AND x2 is A2 AND · · · AND xp is Ap
THEN y = c0 + c1 x1 + c2 x2 + · · · + cp xp.

The neural network structure contains five layers, excluding the input layer.

• Layer 0 is the input layer. It has n nodes, where n is the number of inputs to the system.

• Layer 1 is the fuzzification layer, in which each node represents the membership value of an input to a linguistic term, given by the bell-shaped membership function

μ_Ai(x) = 1 / (1 + [((x − c_i)/a_i)^2]^(b_i)),


Fig. 1. Basic ANFIS structure.

where a_i, b_i, c_i are parameters of the function. These are adaptive parameters: their values are adapted by means of the back-propagation algorithm during the learning stage. As the values of the parameters change, the membership function of the linguistic term A_i changes. These parameters are called premise parameters.

In this layer there exist n × p nodes, where n is the number of input variables and p is the number of membership functions. For example, if size is an input variable and there exist two linguistic values for size, SMALL and LARGE, then two nodes are kept in the first layer and they denote the membership values of the input variable size to the linguistic values SMALL and LARGE.

• Each node in Layer 2 provides the strength of a rule by means of the multiplication operator; it performs the AND operation:

w_i = μ_Ai(x0) · μ_Bi(x1).

Every node in this layer computes the multiplication of its input values and gives the product as the output, as in the above equation. The membership values represented by μ_Ai(x0) and μ_Bi(x1) are multiplied in order to find the firing strength of a rule where the variable x0 has linguistic value A_i and x1 has linguistic value B_i in the antecedent part of Rule i.

There are p^n nodes, denoting the number of rules, in Layer 2. Each node represents the antecedent part of a rule.

If there are two variables in the system, namely x1 and x2, that can take the two fuzzy linguistic values SMALL and LARGE, there exist four rules in the system, whose antecedent parts are as follows:

IF   x1   is SMALL AND   x2   is SMALL

IF   x1   is SMALL AND   x2   is LARGE

IF   x1   is LARGE AND   x2   is SMALL

IF   x1   is LARGE AND   x2   is LARGE


• Layer 3 is the normalization layer, which normalizes the strengths of all rules according to the equation

w̄_i = w_i / Σ_{j=1..R} w_j,

where w_i is the firing strength of the ith rule, computed in Layer 2, and R is the number of rules. Node i computes the ratio of the ith rule's firing strength to the sum of all rules' firing strengths. There are p^n nodes in this layer.

• Layer 4 is a layer of adaptive nodes. Every node in this layer computes a linear function, whose coefficients are adapted by using the error function of the multi-layer feed-forward neural network:

w̄_i f_i = w̄_i (p_0 x_0 + p_1 x_1 + p_2),

where the p_i are the parameters; there are n + 1 of them, n being the number of inputs to the system (i.e., the number of nodes in Layer 0). In this example, since there exist two variables (x_0 and x_1), there are three parameters (p_0, p_1 and p_2) in Layer 4. w̄_i is the output of Layer 3. The parameters are updated by a learning step: least squares approximation is used in ANFIS, while in the temporal model the back-propagation algorithm is used for training.

• Layer 5 is the output layer, whose function is the summation of the net outputs of the nodes in Layer 4. The output is computed as

Σ_i w̄_i f_i = (Σ_i w_i f_i) / (Σ_i w_i),

where w̄_i f_i is the output of node i in Layer 4; it denotes the consequent part of rule i. The overall output of the neuro-fuzzy system is the summation of the rule consequents.
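The five layers above can be sketched as a single forward pass. This is a minimal illustration, not the authors' implementation: two inputs with two membership functions each (four rules) are assumed, and all parameter values below are made up for the example.

```python
# Layer 1: bell-shaped membership, 1 / (1 + ((x - c)/a)^2)^b
def bell_mf(x, a, b, c):
    return 1.0 / (1.0 + ((x - c) / a) ** 2) ** b

def anfis_forward(x0, x1, premise, consequent):
    # Layer 1: membership degrees of each input to its two linguistic values
    mu0 = [bell_mf(x0, a, b, c) for (a, b, c) in premise[0]]
    mu1 = [bell_mf(x1, a, b, c) for (a, b, c) in premise[1]]
    # Layer 2: firing strength of each of the p^n = 4 rules (product = AND)
    w = [mu0[i] * mu1[j] for i in range(2) for j in range(2)]
    # Layer 3: normalized firing strengths
    s = sum(w)
    w_bar = [wi / s for wi in w]
    # Layer 4: each rule's linear consequent f_i = p0*x0 + p1*x1 + p2
    f = [p0 * x0 + p1 * x1 + p2 for (p0, p1, p2) in consequent]
    # Layer 5: weighted sum of the rule consequents
    return sum(wb * fi for wb, fi in zip(w_bar, f))

premise = [[(1.0, 2.0, -1.0), (1.0, 2.0, 1.0)],   # (a, b, c) MFs for x0
           [(1.5, 2.0, 0.0), (1.5, 2.0, 2.0)]]    # (a, b, c) MFs for x1
consequent = [(0.5, 0.2, 0.1)] * 4                # identical rules, for the demo
y = anfis_forward(0.3, 1.2, premise, consequent)
```

With identical consequents the normalized weights cancel out and the output equals the common consequent value, which is a quick sanity check on Layers 3 and 5.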

ANFIS uses a hybrid learning algorithm in order to train the network. For the parameters in Layer 1, the back-propagation algorithm is used. For training the parameters in Layer 4, a variation of least squares approximation is used. The following example describes the processing of ANFIS over a data set.

Example.   Gas furnace data processed by ANFIS

ANFIS accepts the input data in the (GasFlowRate(t − 4), CO2Concentration(t − 1)) format.

(1) An input data pair is given to the network.

(2) The network performs the forward pass, i.e., the output of the function, which is CO2Concentration(t), is computed.

(3) Another input data pair is presented to the network, and the above computation continues until the network is trained with (sample size − 4) data points (the last four pairs cannot be used since the expected output is not known), where sample size denotes the total number of data points in the training data set.


(4) The error is computed for this epoch by using an error measure to compare the expected output to the output of the system.

(5) Training is performed by updating the parameters in Layer 1 (a, b, c) and in Layer 4 (p_i). This is offline learning, because the whole data set is presented to the network at once before the parameters are updated.

(6) After a predetermined number of training epochs is reached, the training process terminates.
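The pairing scheme used in this example can be sketched as follows. The series below are synthetic stand-ins, not the real gas-furnace measurements, and `make_lagged_pairs` is a hypothetical helper name.

```python
def make_lagged_pairs(x, y, x_lag=4, y_lag=1):
    """Return ((x[t - x_lag], y[t - y_lag]), y[t]) for every usable t."""
    start = max(x_lag, y_lag)
    return [((x[t - x_lag], y[t - y_lag]), y[t])
            for t in range(start, len(x))]

x = list(range(10))             # stand-in for GasFlowRate
y = [v * 2 for v in range(10)]  # stand-in for CO2Concentration
pairs = make_lagged_pairs(x, y)
# Only sample_size - 4 pairs are usable: the first four targets
# have no x[t - 4] available.
```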

The fuzzy rules produced in terms of parameters are as follows.

Rule 1:

IF  GasFlowRate(t  − 4) is SMALL1   AND  CO2Concentration(t  − 1) is SMALL2

THEN  CO2Concentration(t ) = p11   *  GasFlowRate(t  − 4)

+p12   *  CO2Concentration(t  − 1) + p13

Rule 2:

IF  GasFlowRate(t  − 4) is SMALL1   AND  CO2Concentration(t  − 1) is LARGE2

THEN  CO2Concentration(t ) = p21   *  GasFlowRate(t  − 4)

+p22   *  CO2Concentration(t  − 1) + p23

Rule 3:

IF  GasFlowRate(t  − 4) is LARGE1   AND  CO2Concentration(t  − 1) is SMALL2

THEN  CO2Concentration(t ) = p31   *  GasFlowRate(t  − 4)

+p32   *  CO2Concentration(t  − 1) + p33

Rule 4:

IF  GasFlowRate(t  − 4) is LARGE1   AND  CO2Concentration(t  − 1) is LARGE2

THEN  CO2Concentration(t ) = p41   *  GasFlowRate(t  − 4)

+p42   *  CO2Concentration(t  − 1) + p43

In this example there are two fuzzy values, SMALL_i and LARGE_i, for both variables (GasFlowRate(t − 4) and CO2Concentration(t − 1)), where i denotes the index of the variable. Each fuzzy value such as SMALL_i is denoted by the parameters in the first layer (a_i, b_i, c_i). p_jk is the parameter in the fourth layer, where j denotes the rule and k denotes the parameter index. It is used in computing the output of the system, which is CO2Concentration(t).

3. ANFIS unfolded in time

The neuro-fuzzy systems in the literature are mostly multi-layer feed-forward neural network structures. When temporal data is concerned, it is necessary to construct a neural network structure which uses temporal relationships.

Recurrent neural network structures are more convenient for that purpose. Unfolding in time is a method used for training recurrent neural network structures. The neuro-fuzzy approach in this study utilizes this method.

The unfolding-in-time approach is applied to the neuro-fuzzy system in order to construct a temporal multi-layer feed-forward neural network. The feed-forward neural network is duplicated T times, where T is the number of time intervals needed in the specific problem. The resulting system is called ANFIS unfolded in time.


Fig. 2. ANFIS unfolded in time.

The neural network structure enables us to define a problem where the input can be a vector such as (x, y) while the system produces only one output, which is y.

A sample system can be seen in Fig. 2. Each of the boxes represents one ANFIS structure as defined in Fig. 1. In the problem given in Fig. 2, it is assumed that the output of the system depends on four previous input values. In order to achieve the network structure, ANFIS is duplicated for 4 time intervals. The input of the neuro-fuzzy system is composed of two elements, X and Y. There is only one output of the system, which is Y. Initially (at time t = 0), X_0 and Y_0 are input to NN1 (the network component for time interval 1). The output of NN1 is obtained as Y_1 (at time t = 1). Then, it is input to NN2. Another input is needed, which is X_1 (external input); it is supplied externally, since the system does not produce X_1. The output for the second time interval is obtained as Y_2. The same process is performed for the rest of the time intervals (two more time intervals). Finally, Y_4 is obtained as the output of NN4. It is treated as the output of the system ANFIS unfolded in time for t = 0. In other words, the input is supplied at time 0, and the output for time 4 is obtained.

The same process is repeated for time t = 1, 2, ... until the end of the sample data set is reached.
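The chaining in Fig. 2 can be sketched as a loop in which each copy's Y output feeds the next copy while X is supplied externally at every interval. `net` below is an arbitrary stand-in for one ANFIS block, not the real model.

```python
def unfolded_predict(net, x_seq, y0):
    """Chain one network through len(x_seq) intervals: Y_{t+1} = net(X_t, Y_t)."""
    y = y0
    for x in x_seq:   # X_0 .. X_{T-1}, supplied externally at each interval
        y = net(x, y) # Y_1 .. Y_T, produced internally and fed forward
    return y          # Y_T: the system output for the input given at t = 0

net = lambda x, y: 0.5 * x + 0.5 * y            # illustrative stand-in block
y4 = unfolded_predict(net, [1.0, 1.0, 1.0, 1.0], 0.0)
```

With T = 4 this matches the description above: the input is supplied at time 0 and the output for time 4 is obtained.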

3.1. Temporal back-propagation algorithm

The algorithm used in the neural network structure is a modification of the back-propagation algorithm. Since the basic neuro-fuzzy system is a feed-forward neural network, the back-propagation algorithm is convenient to use. Because the neural network is duplicated T times, the basic back-propagation learning algorithm is modified accordingly.

The neuro-fuzzy system is treated as a black box containing T neural networks. The connections between the neural networks are also taken into account, representing the temporal relationships. The parameters in the last network are updated according to the error in the last interval. The error in one of the previous networks is computed by using the error in the specified time interval and the errors propagated from the following intervals: the errors coming from the following intervals are back-propagated to the error in the specific interval. Unlike the conventional unfolding-in-time method, the parameters are updated independently.

The algorithm in Fig. 3 describes the steps for processing the data for one specific time interval. The data is presented to the neuro-fuzzy system at each time interval.


Part 1 (Forward Phase)

1. The data at time T = k is given as input to the system: (x_k, y_k).
2. The output y_{k+1} is computed by the network at time interval 1. If T is greater than 1, the last element in the input vector y_k becomes y_{k+1} (y = y_1). The output of each step becomes one of the elements in the input vector until t = T.
3. The output of the last neural network is the output of ANFIS_unfolded_in_time; in other words, Y_k = y_{T+k}.

Part 2 (Backward Phase)

1. Compute the error for the output response of the system Y_k and the given input vector (x_k, y_k): error = (y_k − Y_k)^2.
2. Back-propagate the error to the parameters in network T.
3. Back-propagate the error to the error in network T − 1. Then update the parameters in network T − 1 by using the propagated error.
4. Repeat the above step until t = 0.

Fig. 3. Forward and backward processing phases in ANFIS unfolded in time.

Fig. 4. Error computation for a time interval.

The data is processed and the output is obtained for T time intervals ahead. The error is computed and back-propagated through the network, updating the parameters of each node (online learning).

The algorithm contains two phases: a Forward phase and a Backward phase. In the Forward phase, the data at a specific time k is introduced to the system and the computations are performed according to the input value. The important feature of the temporal neuro-fuzzy system is that, at the end of the computation for the data at time interval k, it yields the output of the system at time interval k + T.

In the Backward phase, the parameters in all networks are updated according to the output produced by the neuro-fuzzy system. The error in network T is used to update only the parameters in network T. But, for updating the parameters in network T − 1, the error of the following network (which is network T) is back-propagated to network T − 1. The same process is applied to all previous time intervals until all the parameters in all networks are updated.

The output of the forward phase is accepted as the output of ANFIS unfolded in time. At the end of the Backward phase all parameters are updated, and the data in the next time interval is presented to the system.

The method of back-propagating the error is shown in Fig. 4. If the error computed at time interval t is E(t), then the error is back-propagated through the neural network


Table 1
Comparison of RMSE values for different neuro-fuzzy systems and ANFIS unfolded in time

Model            nfMod   NFIDENT   ANFIS   System   ANFIS unfolded in time
RMSE             0.485   0.623     0.241   0.367    0.662
Number of rules  26      21        49      26       16

Fig. 5. Training results of ANFIS unfolded in time for Gas-furnace data.

at time interval t (i.e., NN(t)). Moreover, it is also back-propagated through the neural network at time interval t − 1 (i.e., NN(t − 1)). For that purpose, the partial derivative of E(t) is taken over E(t − 1), which is ∂E(t)/∂E(t−1). The error is propagated through the parameters in the next time interval, such that the partial derivative of the error E(t) over the parameters (a, b, c) and (p_0, p_1, p_2, ...) in time interval t is summed to the error in time interval t − 1, which is E(t − 1). The procedure goes on like this for the previous time intervals.
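The backward phase can be sketched on a linear stand-in network, one weight per copy (illustrative only, not the actual ANFIS parameters): the output error is propagated copy by copy, and each copy's parameter receives its own independent update, as described above.

```python
def backward_phase(ws, v, x_seq, y0, target, lr=0.1):
    """One backward pass over T copies of the linear cell y_{t+1} = w_t*y_t + v*x_t.
    Each copy's w_t is updated independently with the error that reaches it."""
    # forward pass, keeping the state at every interval
    ys = [y0]
    for w, x in zip(ws, x_seq):
        ys.append(w * ys[-1] + v * x)
    delta = 2.0 * (ys[-1] - target)       # dE/dY_T for E = (Y_T - target)^2
    new_ws = list(ws)
    for t in range(len(ws) - 1, -1, -1):  # last copy first
        grad = delta * ys[t]              # dE/dw_t for this copy only
        new_ws[t] = ws[t] - lr * grad     # independent update of copy t
        delta = delta * ws[t]             # back-propagate dE to interval t-1
    return new_ws, ys[-1]

ws, yT = backward_phase([0.5, 0.5], v=1.0, x_seq=[1.0, 1.0],
                        y0=0.0, target=2.0, lr=0.1)
```

Note the contrast with the conventional scheme sketched in Section 1: there the per-copy gradients are summed onto one shared weight, whereas here each copy keeps and updates its own parameter.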

4. Experimental results

In the experiments, real data taken from [1,2,11] are used. First, the Gas-furnace data experiment, a benchmark problem, is performed. Thirteen data sets are used in the second part of the tests. The fuzzy MAR algorithm [13] is used as a preprocessing step to obtain the input variables and the number of time intervals.


Table 2
Temporal variables defining each time series in the system

Data set                    Series  Defining variables
AAA Bonds Interest Rates    x0      x0,t−1, x0,t−2
                            x1      x1,t−1, x1,t−2, x0,t−1
Agriculture                 x0      x0,t−1, x0,t−2, x1,t−1, x1,t−2
                            x1      x2,t−1, x1,t−1, x1,t−2, x0,t−1
                            x2      x2,t−1, x2,t−2, x3,t−2, x1,t−1
                            x3      x0,t−1, x0,t−2, x1,t−1, x1,t−2, x3,t−1, x3,t−2
Flour Price Indices         x0      x1,t−1, x1,t−2, x2,t−1, x0,t−1
                            x1      x2,t−1, x0,t−1
                            x2      x2,t−1
Forest                      x0      x0,t−1, x0,t−2, x1,t−1
                            x1      x1,t−1, x0,t−1, x3,t−1
                            x2      x2,t−1, x2,t−2, x1,t−1, x3,t−1, x0,t−1
                            x3      x3,t−1, x3,t−2
Gas Furnace                 x0      x0,t−1, x0,t−4, x1,t−1
                            x1      x1,t−1, x1,t−4, x0,t−1
Grain Price Indices         x0      x0,t−1, x1,t−1
                            x1      x1,t−1, x1,t−2, x0,t−1
                            x2      x2,t−1, x1,t−1, x0,t−1
                            x3      x3,t−1, x3,t−2
Housing Starts and Sold     x0      x1,t−1, x1,t−2, x0,t−1
                            x1      x1,t−1, x1,t−2
Interest Rates              x0      x0,t−1, x1,t−1
                            x1      x1,t−1, x1,t−2, x2,t−1
                            x2      x2,t−1, x1,t−1
Investment and Inventories  x0      x0,t−1
                            x1      x0,t−1, x1,t−1
Mink-Muskrat Furs           x0      x0,t−1, x0,t−2, x1,t−1
                            x1      x1,t−1, x1,t−2
Power Station               x0      x0,t−1, x0,t−2, x2,t−1
                            x1      x1,t−1, x1,t−2, x2,t−1
                            x2      x2,t−1, x2,t−2, x0,t−2
Production and Billing      x0      x0,t−1, x0,t−2
                            x1      x1,t−1, x1,t−2, x0,t−2
Unemployment and GDP        x0      x0,t−1, x1,t−1
                            x1      x1,t−1, x1,t−2, x0,t−1

4.1. Gas-furnace data experiment

Gas-furnace data consists of 296 data pairs [2]. The data has two variables, which are the gas flow rate (input X) and the concentration of CO2 in the exhaust gas (output Y). The data set consists of measurements sampled at a fixed interval of 9 seconds. The measured input X_k represents the flow rate of the methane gas in a gas furnace, and the output measurement Y_k represents the concentration of carbon dioxide (CO2) in the gas mixture flowing out of the furnace under a steady air supply [4]. It is stated that the output Y depends on previous values of itself and also the values of


Table 3
RMSE for experiments with ANFIS and ANFIS unfolded in time using real data

                                    ANFIS                        ANFIS unfolded in time
Series                      Var.    Train.       Recog.          Train.        Recog.
AAA Bonds Interest Rates    x0      0.119902     0.652238        0.129453      0.650826
                            x1      0.173633     4.317652        0.409995      1.146950
Agriculture                 x0      0.406124     38.933538       1.491038      1.56818
                            x1      174.645974   24059.645974    1622.169764   2181.88
                            x2      1.461472     177.453338      5.934202      8.91102
                            x3      5.529144     5241.348344     860.494170    1049.02
Flour Price Indices         x0      2.960263     28.606507       7.628765      7.875690
                            x1      2.689367     71.649687       8.334503      8.080240
                            x2      9.378680     9.485944        10.297852     8.69711
Forest                      x0      30.690742    978622.038395   53.744128     537.123
                            x1      0.016055     307.751859      0.056393      0.441184
                            x2      2.345635     7352.330969     18.841375     68.5216
                            x3      0.604859     0.848928        1.049248      0.895668
Gas Furnace                 x0      0.213040     0.514342        0.211783      0.201765
                            x1      0.109253     1.406189        0.380676      0.918988
Grain Price Indices         x0      0.107664     0.41797         0.389521      0.179064
                            x1      1.544528     2.923548        0.048258      0.086044
                            x2      0.041943     0.307117        0.094808      0.139154
                            x3      0.050431     0.092057        0.068364      0.0756004
Housing Starts and Sold     x0      0.000747     243.336994      6.911277      12.946600
                            x1      2.667528     584.922654      5.700290      11.561000
Interest Rates              x0      0.156515     1.104897        0.222275      0.745151
                            x1      0.143651     8.526677        0.2177        0.374334
                            x2      0.153290     0.665197        0.255971      0.503007
Investment and Inventories  x0      2.538655     9.461350        14.156823     6.830420
                            x1      1.737847     1980.411870     4.490392      6.38886
Mink-Muskrat Furs           x0      0.000006     1.262477        0.268479      0.295973
                            x1      0.241326     0.414645        0.284571      0.492810
Power Station               x0      0.000026     2.430449        0.592158      0.963655
                            x1      0.000175     19.510901       0.825107      0.871221
                            x2      0.000118     5.237194        0.486922      0.947947
Production and Billing      x0      1.065657     1.851736        1.339888      1.783490
                            x1      1.285181     18.492608       4.48082       7.087390
Unemployment and GDP        x0      1.361111     260.3560        0.000028      29.644100
                            x1      0.211697     9.657877        1.027380      9.2844100

X. Most studies used the inputs X_{t−4} and Y_{t−1} for the output Y_t. It is observed that the Gas-furnace data in [2] is used as a benchmark problem in the neuro-fuzzy literature. In order to compare the results for ANFIS unfolded in time, the same data is used in an experiment.

The step size for learning is set to a very small number (0.02), since the training algorithm performs online learning. The parameters are updated after each data pair is presented to the system. The results are compared with the results taken from [5]. RMSE (root mean square error) is used as the error criterion in order to compare the


Fig. 6. Output for AAA-CP Bonds data: (a) x0, (b) x1.

accuracy of the system with other systems. The formula for RMSE is given as follows:

RMSE = √( Σ_{k=1}^{K} (Y_k − Ŷ_k)^2 / K ).


Fig. 7. Output for Agriculture data: (a) x0, (b) x1, (c) x2, (d) x3.

In RMSE, K is the number of samples in the data set, Y_k is the expected output for the given input, and Ŷ_k is the system's response. RMSE is computed after each epoch, i.e., after the whole data set has been trained. The size of the data set is 292. When a smaller set of data pairs is used, the RMSE value decreases to 0.583. This



is 0.662, as seen in Table 1. The reason is that the identification of a function which holds for the whole data set is more difficult when the number of data points increases.
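The RMSE criterion above can be written out directly; the sample values below are illustrative, not taken from the experiments.

```python
def rmse(expected, predicted):
    """Root mean square error: sqrt( sum_k (Y_k - Yhat_k)^2 / K )."""
    k = len(expected)
    return (sum((y - yh) ** 2 for y, yh in zip(expected, predicted)) / k) ** 0.5

err = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])  # one prediction off by 2
```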

In the results shown in Table 1, the error is computed for one-step-ahead predictions. This means that the model is using the data pair (x_{t−4}, y_{t−1}) and producing the result


Fig. 8. Output for Flour Price data: (a) x0, (b) x1, (c) x2.

y_t, rather than taking (x_{t−4}, y_{t−4}) and producing y_t, as ANFIS unfolded in time does. The latter is not one-step-ahead but four-step-ahead prediction. Since the other models are all multi-layer feed-forward neural network models, they take input values only in Layer 0. It is not suitable to take the values of variables at different time instances as input to the neural network when temporal data processing is concerned. Since the



values at different time instances are important for different variables, the neuro-fuzzy model must perform learning and prediction over the time interval needed. ANFIS unfolded in time is a neuro-fuzzy structure which performs the processing for a predefined time interval. This is advantageous when time series prediction and control applications are concerned.

The training results for 296 points in time are displayed in Fig. 5. It is observed that there is no significant deviation of the output produced by the system from the expected output. Two hundred and ninety-two data points are used as input, since four-step-ahead prediction is performed.

4.2. Real data experiments

In this section, the results obtained by using real data sets are presented. Thirteen data sets are used in the real data experiments. The data sets are tested by using ANFIS and ANFIS unfolded in time. The input variables for each of the series in the data sets, obtained by the Fuzzy MAR algorithm, can be seen in Table 2.

These data sets contain the following information:

• There exist two time series in the AAA Bonds Data Set, which are the AAA Bond Rate (x0) and the Commercial Paper Rate (x1). The data is collected quarterly between 1953 and 1970.

• Monthly agriculture data contains four time series, which are the first difference of the logarithm of the exchange rate (x0), price (x1), the logarithm of levels of sales (x2) and the logarithm of shipments (x3). The data is collected between February 1978 and December 1992.

Fig. 9. Output for Forestry data: (a) x0, (b) x1, (c) x2, (d) x3.

• The flour price data set contains monthly flour price indices for three US cities, which are Buffalo (x0), Minneapolis (x1) and Kansas City (x2). The data belongs to the time interval January 1972 to November 1980.


Fig. 9. (continued).

• Monthly Forestry data contains four variables: lumber production (x0), lumber price (x1), the price that housing starts (x2) and disposable income (x3).

• Monthly US Grain Price Data contains the price in dollars per 100-pound sack for wheat flour (x0) and per bushel for corn (x1), wheat (x2) and rye (x3). The data is obtained between January 1961 and October 1972.


Fig. 10. Output for Gas Furnace data, variable x0.

• Monthly US housing starts and sold data contains two variables: the prices that housing starts (x0) and the prices that housing sold (x1). The data is collected for the period January 1965 to December 1974.

• Monthly Interest Rate data contains three series: the Federal Funds Rate (x0), the 90-Day Treasury Bill Rate (x1), and the 1-Year Treasury Bill Rate (x2).

• Quarterly, seasonally adjusted US Fixed Investment (x0) and changes in business inventories (x1) are tested by the algorithm. The data is recorded between 1947 and 1971.

• Natural logarithms of the annual sales of mink furs (x0) and muskrat furs (x1) by the Hudson's Bay Company are used in the experiments. The data is recorded between 1850 and 1911.

• Power station data is taken from a 50 MW turbo-alternator and contains in-phase current deviations (x0), out-of-phase current deviations (x1) and frequency deviations of the generated voltage (x2).

• The Production data set contains two variables: the weekly production figures in thousands of units (x0) and the weekly billing figures in millions of dollars (x1) of a company.

• The Unemployment data set contains two variables: unemployment (x0) and gross domestic product (x1) in the UK between 1955 and 1969. The data is recorded quarterly.

In the experiments, the first half of the observations is used in the training phase. The entire data set is used in the recognition phase. The training and recognition results are


Fig. 11. Output for Grain Price data: (a) x0, (b) x1, (c) x2, (d) x3.

presented in Table 3. Figs. 6–18 show the expected output, the obtained output, and the error (RMSE) for all data sets.
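The RMSE measure used in these figures can be sketched as follows. This is a minimal illustration with made-up numbers; the names are ours, not from the paper:

```python
import numpy as np

def rmse(expected, obtained):
    """Root-mean-square error between expected and obtained series."""
    expected = np.asarray(expected, dtype=float)
    obtained = np.asarray(obtained, dtype=float)
    return float(np.sqrt(np.mean((expected - obtained) ** 2)))

print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))  # sqrt(4/3) ≈ 1.1547
```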

The results of the experiments can be summarized as follows:

ANFIS unfolded in time performs better prediction for variable x0 than for x1 in the AAA-CP bonds data set, as seen in Fig. 6. In Table 3, the recognition error for x0


Fig. 11. (continued).

is better than that for x1; the values are 0.650826 and 1.146950, respectively.

For the Agriculture data, the recognition results are very different, as shown in Table 3: x0 and x2 are recognized better than x1 and x3. This is also validated in Fig. 7.


Fig. 12. Output for Housing data: (a) x0, (b) x1.

As seen in Fig. 8, the recognition phase yields similar patterns for the variables x0, x1 and x2 in the Flour Prices data set. All variables are recognized with recognition errors close to the training errors, as seen in Table 3.

For the Forestry data set, the recognition errors for x1 and x3 are better than those for x0 and x2, as given in Table 3. The output figure for x3 shows the best recognition result in Fig. 9.


Fig. 13. Output for Interest Rate data: (a) x0, (b) x1, (c) x2.

Variable x0 of the Gas Furnace data is also used in the real data experiments. Fig. 10 shows that the expected and obtained x0 are very close to each other. This can also be seen in Table 3.


Fig. 13. (continued).

The recognition error is 0.201765. For the Grain Prices data set, the recognition errors range between 0.075 and 0.17, which are very close values, as given in Table 3. In Fig. 11, it can be seen that the output for x0 successfully simulates the behavior of the original x0 data; the other variables also yield errors very close to zero.

The Housing data set has approximately the same recognition errors for both x0 and x1, as seen in Table 3: 12.9466 and 11.561. Also, as seen in Fig. 12, for both x0 and x1 the output of ANFIS unfolded in time follows the same pattern as the original output but misses the local peak data points.

The Interest Rates data set variables x0, x1 and x2 yield promising recognition errors in Table 3. Fig. 13 validates this: the obtained outputs are very close to the expected outputs.

The recognition errors for x0 and x1 in the Investment and Inventories data set are very close to each other in Table 3. The out-of-sample recognition error is slightly higher than the training-sample error for both variables, as shown in Fig. 14.

The recognition error is slightly worse for x1 than for x0 in the Mink-Muskrat data set in Table 3. In Fig. 15, the outputs are close to the expected results for both variables.

In Table 3, for the Power Station data, x0 and x2 have approximately the same recognition errors (0.963655 and 0.947947, respectively), whereas x1 yields a better recognition error (0.871221). Moreover, x1 yields the best figure among the variables in Fig. 16.

For the Production and Billing data set, the recognition error for x0 is small compared to that for x1 in Table 3.


Fig. 14. Output for Investment and Inventories data: (a) x0, (b) x1.

In Fig. 17, the output for x0 yields a nearly linear curve, missing most of the peak values, whereas the output for x1 adapts itself to the fluctuations in the expected values.

In the Unemployment and GDP data, x0 gives a worse recognition error than x1, as in Table 3. For the out-of-sample values, recognition gets worse for x1, as seen in Fig. 18; on the other hand, x0 simulates the expected results better.


Fig. 15. Output for Mink and Muskrat Furs data: (a) x0, (b) x1.

The experimental results show that:

Fig. 16. Output for Power Station data: (a) x0, (b) x1, (c) x2.

• ANFIS yields smaller training errors than ANFIS unfolded in time. This is an expected result: during the training phase, our model uses online learning, and the error obtained for a sample is a cumulative error containing the error over the whole time interval.

• Our model gives better recognition results than ANFIS. This is also not surprising: since the neuro-fuzzy model performs t-step-ahead prediction, the recognition results are promising.


Fig. 16. (continued).

5. Conclusion

A neuro-fuzzy system is constructed for the prediction of time series data by means of a temporal learning algorithm and a temporal neuro-fuzzy model. The model, named ANFIS unfolded in time, provides an on-line environment which takes a time series and forecasts its future behavior. Because the recurrent neural network structure is convenient for time series analysis, the unfolding-in-time approach is useful for representing a recurrent neural network as a feed-forward one. In this approach, the neuro-fuzzy system, which is basically a black box of feed-forward neural networks, is duplicated for T time intervals. The number of time intervals is provided to the neuro-fuzzy system as an argument; it is computed by using the Fuzzy-MAR algorithm [13]. As an alternative, tests can be performed iteratively to find the best number of time intervals for the given time series data.
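The unfolding-in-time idea can be sketched as follows: applying one recurrent block T times is equivalent to a feed-forward chain of T duplicated copies of that block sharing the same parameters. The tiny tanh block below is only a stand-in for the actual neuro-fuzzy layer, and all names are ours:

```python
import numpy as np

def block(state, x, w):
    """Stand-in for the neuro-fuzzy block applied at each time step."""
    return np.tanh(w[0] * state + w[1] * x)

def recurrent(inputs, w, state=0.0):
    """Recurrent form: one block with a feedback loop."""
    for x in inputs:
        state = block(state, x, w)
    return state

def unfolded(inputs, w, state=0.0):
    """Unfolded form: T explicit feed-forward stages, one per time
    interval, all duplicating the same block and sharing w."""
    stages = [block] * len(inputs)   # T copies of the same block
    for stage, x in zip(stages, inputs):
        state = stage(state, x, w)
    return state

w = (0.5, 0.8)
xs = [0.1, 0.4, -0.2, 0.3]           # T = 4 time intervals
print(np.isclose(recurrent(xs, w), unfolded(xs, w)))  # True
```

During training, the error at the final stage can then be back-propagated through all T copies, which is why the error is cumulative over the whole interval.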

Because the resulting model can be used for small time intervals T, it can be applied to areas involving short-term prediction, such as:

•   Financial or meteorological data forecasting can be an application area, since the

model is convenient for forecasting problems.

• Sequence detection, as described by Rumelhart [12], is a possible application of the unfolding-in-time concept.

• Image processing, specifically motion detection, can be a different application area for ANFIS unfolded in time because of its use of temporal expert system rules.


 

Fig. 17. Output for Production and Billing data: (a) x0, (b) x1.

The method is tested on various data sets and the results are compared with those of ANFIS. Although the training error is slightly higher than that of ANFIS, the recognition error is much smaller for ANFIS unfolded in time. Since


Fig. 18. Output for Unemployment and GDP data: (a) x0, (b) x1.

ANFIS unfolded in time uses the error at t-ahead time intervals, the error back-propagated through the network is a cumulative error.

References

[1] D. Akleman, F.N. Alpaslan, Temporal rule extraction for rule-based systems using time series approach,

Proceedings of ISCA CAINE-97, San Antonio, Texas, December 1997.


[2] G.E. Box, G.M. Jenkins, Time Series Analysis: Forecasting and Control, Holden Day, San Francisco,

1970.

[3] F.N. Civelek-Alpaslan, K.M. Swigger, A temporal neural network model for constructing connectionist expert system knowledge bases, J. Network Comput. Appl. 19 (1996) 19–133.

[4] W. Farag, A. Tawfik, On fuzzy model identification and the gas furnace data, Proceedings of the IASTED International Conference, Hawaii, 2000.

[5] M.B. Gorzalczany, A. Gluszek, Neuro-fuzzy systems for rule-based modeling of dynamic processes,

Proceedings of ESIT, 2000, pp. 416–422.

[6] J.-S.R. Jang, Self-learning fuzzy controllers based on temporal back propagation, IEEE Trans. Neural

 Networks 3 (5) (1992) 714–723.

[7] J.-S.R. Jang, ANFIS: adaptive-network-based fuzzy inference systems, IEEE Trans. Systems, Man

Cybern. 23 (03) (1993) 665–685.

[8] J.-S.R. Jang, Neuro-Fuzzy and Soft Computing, Prentice-Hall, New Jersey, 1997.

[9] J.-S.R. Jang, Roger Jang’s Publications and Softwares,  http://www.cs.nthu.edu.tw/∼ jang/publication.htm.

[10] T.M. Mitchell, Machine Learning, McGraw-Hill, New York, 1997.

[11] G.C. Reinsel, Elements of Multivariate Time Series Analysis, Springer, New York, 1997.

[12] D.E. Rumelhart, G.E. Hinton, R.J. Williams, Learning internal representations by error propagation, in: D.E. Rumelhart, J.L. McClelland (Eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, MIT Press, 1986, pp. 318–362.

[13] N.A. Sisman-Yilmaz, A temporal neuro-fuzzy approach for time series analysis, Ph.D. Thesis,

Department of Computer Engineering, Middle East Technical University, 2003.

[14] M. Sugeno, G.T. Kang, Structure identification of fuzzy model, Fuzzy Sets and Systems 28 (1988)

15–33.