
Application of fuzzy rule-based modeling technique to regional drought

R. Pongracz a,1, I. Bogardi b,*, L. Duckstein c,2

a Department of Meteorology, Eotvos Lorand University, Pazmany setany 1, Budapest, H-1117, Hungary
b Department of Civil Engineering, University of Nebraska-Lincoln, W359 Nebraska Hall, Lincoln, NE 68588-0531, USA
c Ecole Nationale du Genie Rural des Eaux et des Forets, 19, avenue du Maine, 75732 Paris Cedex 15, France

Received 12 January 1999; accepted 19 August 1999

Abstract

Fuzzy rule-based modeling is applied to the prediction of regional droughts (characterized by the modified Palmer index, PMDI) using two forcing inputs, El Nino/Southern Oscillation (ENSO) and large scale atmospheric circulation patterns (CPs), in a typical Great Plains state, Nebraska. Although there is a significant relationship between simultaneous monthly CP, lagged Southern Oscillation Index (SOI) and PMDI in Nebraska, the weakness of the correlations, the dependence between CP and SOI, and the relatively short data set limit the applicability of statistical modeling for prediction. Because of these difficulties, a fuzzy rule-based approach is presented to predict PMDI from monthly frequencies of daily CP types and lagged prior SOIs. The fuzzy rules are defined and calibrated using a subset, called the learning set, of the observed time series of premises and PMDI response. Then another subset, the validation set, is used to check how well the application of fuzzy rules reproduces the observed PMDI. In all eight climate divisions of Nebraska and in the state as a whole, the fuzzy rule-based technique using the joint forcing of CP and SOI is able to learn the high variability and persistence of PMDI and results in almost perfect reproduction of the empirical frequency distributions. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Fuzzy rule-based modeling; Drought; ENSO; Circulation pattern; Palmer index

Journal of Hydrology 224 (1999) 100–114
0022-1694/99/$ - see front matter © 1999 Elsevier Science B.V. All rights reserved.
PII: S0022-1694(99)00131-6
www.elsevier.com/locate/jhydrol

* Corresponding author. Tel: +1-402-472-1726; fax: +1-402-472-8934.
E-mail addresses: [email protected] (R. Pongracz), [email protected] (I. Bogardi), [email protected] (L. Duckstein)
1 Tel: +36-1-209-0555/6615; fax: +36-1-372-2904.
2 Tel: +33-1-4549-8931; fax: +33-0-1-4549-8827.

1. Introduction

The purpose of this paper is to develop and apply fuzzy rule-based modeling to the prediction of regional droughts from the joint use of two forcing inputs or premises, namely El Nino/Southern Oscillation (ENSO) and large scale atmospheric circulation patterns (CPs), applied to the case study of a typical Great Plains state, Nebraska (Fig. 1).

Drought is a normal part of the Great Plains climate, and it is different from other natural hazards that affect the region. Drought is a slow-onset, insidious hazard that is often well established before it is recognized as a threat, taking months or years to develop. Economic, environmental, and social impacts of drought can be enormous (WGA, 1996). The Federal Emergency Management Agency (FEMA, 1995) estimates annual drought losses in the US to be US$6–8 billion. The 1987–89 drought across much of the US totaled an estimated US$39.4 billion in direct and indirect losses, which is still the largest amount for any natural disaster in the US (Riebsame et al., 1991). Environmental and social impacts of drought are harder to measure, but no less significant. In the Great Plains, droughts have always played a major role. During the second half of the 19th Century, drought directly affected settlement patterns and population shifts as European Americans moved westward from the eastern US. In this century, drought conditions during the 1930s, and the associated dust storms, gave the Great Plains the nickname “the Dust Bowl”, and again desperate farmers fled the Plains for the West Coast. Droughts in the 1950s and 1970s caused less social upheaval, but still resulted in large agricultural losses in the Plains. The late 1980s drought severely affected the northern Plains, while the recent droughts had a major effect in the southern Plains, causing US$5 billion in 1996 and US$7 billion in 1998 in losses in Texas (Chenault and Parsons, 1998).

Fig. 1. Climate divisions in Nebraska. 1: Western Nebraska; 2: Northern Nebraska; 3: Northeastern Nebraska; 5: Central Nebraska; 6: Eastern Nebraska; 7: Southwestern Nebraska; 8: South-Central Nebraska; 9: Southeastern Nebraska.

Fig. 2. PMDI time series (1946–1997) in climate divisions 1 and 3 (panels: Western Nebraska, Northeastern Nebraska; PMDI from −8 to 8 versus years).

Drought indices have become common tools to measure the intensity and spatial extent of droughts. One of the most commonly used climatic drought indices in the US is the Palmer Drought Severity Index (PDSI) (Palmer, 1965), which is based on the principles of a balance between moisture supply and demand when man-made changes are not considered. This index indicates the severity of a wet or dry spell: the greater the absolute value, the more severe the dry or wet spell. The PDSI was modified by the National Weather Service Climate Analysis Center to obtain another index (modified PDSI or PMDI), which is more sensitive to the transition periods between dry and wet conditions (Heddinghause and Sabol, 1991). This paper considers the modified Palmer index. The methodology is, however, applicable to any other drought index, such as the Standardized Precipitation Index (McKee et al., 1993) or the Bhalme–Mooley drought index (Bogardi et al., 1994). This is an important point because it has been argued that Palmer Drought Indices have weaknesses that limit their application as a drought monitoring tool (Alley, 1984; Guttman et al., 1992). On the other hand, given the observed high variability and persistence of PMDI (Fig. 2), it is a more challenging task to reproduce these features with any modeling technique.

A long-term historical data set of PMDI values exists for climatic divisions around the US (Guttman and Quayle, 1996). In the present paper, PMDI is evaluated during the summer half-year (April–September) in eight climate divisions in Nebraska (Fig. 1).

Drought conditions are quite different in these divisions; Fig. 2 shows the observed time series for divisions 1 and 3 (Western and Northeastern Nebraska, respectively) during the 1946–97 period. Several drought periods can be identified according to these divisional PMDI time series. After the drought in the 1930s, the next most significant drought period occurred from 1952 through 1957 (Lawson et al., 1977), which is obvious in both Nebraskan regions (Fig. 2). Although the climate of the different divisions varies considerably, the main patterns are similar. The western part of Nebraska is colder and drier in general, compared to the eastern part (Palecki, 1996), and also less variable in PMDI values.

The question arises whether the monthly PMDI values are homogeneous. To this end, Fig. 3 shows the cumulative frequency distribution of PMDI in division 1 for two periods: the training sets of 1946–62 and 1978–94, and the validation set of 1963–77. The two frequency distributions are different at the 0.1, but the same at the 0.05 significance level, using the two-sample Kolmogorov–Smirnov test. The other divisions behave similarly.
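The comparison described above can be illustrated with a minimal sketch of the two-sample Kolmogorov–Smirnov test. The function names are hypothetical, and the large-sample critical-value coefficients (1.22 for 0.10, 1.36 for 0.05) are the standard asymptotic ones, assumed here rather than taken from the paper.

```python
import math

def ks_statistic(sample_a, sample_b):
    """Largest vertical gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    for x in sorted(set(a) | set(b)):
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

# Asymptotic coefficients c(alpha) of the two-sample test (assumed values).
KS_COEFF = {0.10: 1.22, 0.05: 1.36}

def ks_critical(n_a, n_b, alpha):
    """Large-sample critical value for samples of size n_a and n_b."""
    return KS_COEFF[alpha] * math.sqrt((n_a + n_b) / (n_a * n_b))

def differs(sample_a, sample_b, alpha):
    """True if the two samples differ at significance level alpha."""
    d = ks_statistic(sample_a, sample_b)
    return d > ks_critical(len(sample_a), len(sample_b), alpha)
```

A statistic that lies between the two critical values would give the outcome reported in the text: different at the 0.1 level, but not at the 0.05 level.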


Fig. 3. Distributions of PMDI for the learning and validation sets in climate division 1.


2. Atmospheric circulation, ENSO and droughts

Great Plains droughts are strongly related to unusual and persistent synoptic meteorological conditions, mostly large scale circulation patterns and ENSO (El Nino and La Nina) events. The importance of ENSO effects on weather anomalies and crop production in the Midwest was shown by many researchers, as summarized by Carlson et al. (1996). The association of PDSI with ENSO has been demonstrated for the whole United States (Piechota and Dracup, 1996), but the correlations are not strong enough to predict drought from the ENSO index alone. Pesti et al. (1996) used a fuzzy rule-based technique to identify the relationship between PDSI and CP types in New Mexico.

Large scale circulation patterns (CPs) can be represented by the daily geopotential height field of the 500 hPa level above a US-East-Pacific area centered at Nebraska. The grid consists of 49 points between latitudes of 25°–65°N and longitudes of 80°–130°W. This data set (NCAR and University of Washington, 1996) provides grid point values of daily geopotential height field observations for the 1946–94 period. To overcome the time scale difference between monthly droughts and daily CP, the effects of CP on droughts are represented by the monthly empirical relative frequencies of daily CP types. The CP types were identified by a combined multivariate technique (Wilks, 1995), namely principal component analysis (PCA) and cluster analysis using the k-means method (MacQueen, 1967). The same methodology has been used as in Matyasovszky et al. (1993), but for longer time series and a smaller number of clusters in the present study (Pongracz, 1999).

The procedure starts with a PCA performed on the daily geopotential height fields of the 500 hPa level in order to obtain new uncorrelated variables for the classification. Then, a system with k initial cluster centers (here, k = 6 is chosen as the number of CP types) is defined, and the first daily PCA grid is examined by calculating the distances between the grid and each cluster center. The grid is classified into the closest cluster, and the center of this cluster, which now has a new member, is recalculated. The same steps are applied to each daily PCA grid, one after the other. Then, the final cluster centers obtained after classifying all PCA grids are handled as initial centers, and the whole classification procedure is reiterated until the cluster centers are stabilized.
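The sequential classification described above can be sketched as follows. This is an illustration only, not the authors' code: the function names are hypothetical, the PCA step is assumed to have been done already (each point is a vector of component scores), and the seed centers are assumed to carry no weight of their own when a center is recalculated.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def sequential_pass(points, centers):
    """One pass: assign each point to its nearest center and immediately
    recalculate that center as the running mean of its members."""
    centers = [list(c) for c in centers]
    counts = [0] * len(centers)
    labels = []
    for p in points:
        c = min(range(len(centers)), key=lambda idx: sq_dist(p, centers[idx]))
        counts[c] += 1
        # incremental mean update of the winning center
        centers[c] = [m + (x - m) / counts[c] for m, x in zip(centers[c], p)]
        labels.append(c)
    return centers, labels

def cluster(points, initial_centers, tol=1e-9, max_iter=100):
    """Repeat passes, reusing the final centers as new initial centers,
    until the centers stop moving (within tol)."""
    centers = [list(c) for c in initial_centers]
    labels = []
    for _ in range(max_iter):
        new_centers, labels = sequential_pass(points, centers)
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, labels
```

In the setting of the paper, `initial_centers` would hold k = 6 seed vectors; how those seeds were chosen is not stated in the text.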

The ENSO phenomena are represented by the time series of SOI (Southern Oscillation Index), which is one of the most commonly used indices in ENSO research. Monthly values are available from the Internet (NOAA, 1997) for the years 1933–1997. The data set of the Palmer index, the PMDI, consists of monthly values during the period of 1895–1998 (NOAA, 1998). Drought events occur in the case of negative PMDI values, while positive values imply wet conditions. Possible statistical relationships between the three above-mentioned time series have been analyzed elsewhere (Pongracz et al., 1997), and some results are shown here. First, discrete categories are defined on the PMDI and SOI (Tables 1 and 2).

The correlation coefficients between the monthly relative frequencies of CP types and lagged PMDI or SOI are smaller than 0.18, and mostly not significant. On the other hand, the empirical frequency distributions of CP types during the five drought categories are different at the 0.01 significance level. Fig. 4 shows the frequencies of CP types during the two most extreme PMDI categories: very dry and very wet conditions. The frequencies of CP types during the three ENSO phases are also significantly different.

The correlation coefficients between PMDI and lagged SOI reach 0.39 and are significant (Fig. 5). Both directions of lag have been evaluated, since simultaneous, lag and pre-lag teleconnections of climate variables may be related to ENSO (Wright, 1985). The conditional frequency distributions of PMDI during El Nino and La Nina periods (Fig. 6) are also significantly different.

Table 1. Categories defined on PMDI

PMDI intervals        Drought categories
PMDI < −3             Very dry
−3 ≤ PMDI < −1        Dry
−1 ≤ PMDI ≤ +1        Normal
+1 < PMDI ≤ +3        Wet
PMDI > +3             Very wet

Table 2. Categories defined on SOI

SOI intervals         ENSO phases
SOI ≤ −1              El Nino
−1 < SOI < +1         Normal
SOI ≥ +1              La Nina
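The discrete categories of Tables 1 and 2 can be written as two small lookup functions. This is a minimal sketch with hypothetical function names; the thresholds follow the tables.

```python
def pmdi_category(pmdi):
    """Drought category of a PMDI value (Table 1)."""
    if pmdi < -3:
        return "Very dry"
    if pmdi < -1:
        return "Dry"
    if pmdi <= 1:
        return "Normal"
    if pmdi <= 3:
        return "Wet"
    return "Very wet"

def enso_phase(soi):
    """ENSO phase of an SOI value (Table 2)."""
    if soi <= -1:
        return "El Nino"
    if soi < 1:
        return "Normal"
    return "La Nina"
```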

This simple statistical analysis reinforces earlier findings (e.g. Piechota and Dracup, 1996) that despite the strong teleconnection between ENSO and droughts, droughts have occurred in this region under various phases of ENSO (Carlson et al., 1996). Thus, in the Great Plains, the partial signals of ENSO and CP on drought are even weaker than in other regions. Also, CP and ENSO are evidently interdependent, as shown for example by Bartholy et al. (1996). Thus, the more traditional stochastic approach of regressing a drought index on SOI and the frequencies of CP types may not work, as shown later.

3. Fuzzy rule-based methodology

Due to the above difficulties of the traditional statistical analysis, fuzzy rule-based modeling is used to exploit the existing linkage between the joint ENSO–CP forcing and the drought index. The fuzzy rule-based approach has a relatively simple structure and requires neither independence nor long data sets (Galambosi et al., 1999). In the following, a fuzzy rule-based technique, namely the weighted counting algorithm (Bardossy and Duckstein, 1995), is adapted to the present case and described in a step-by-step manner.

3.1. Selection of the input variables (premises) and definition of the training and validation data sets

Based on the previous analysis, SOI and the monthly frequency distribution of the six CP types constitute the input, forcing function, or premises. In addition, the question arises how many prior monthly premises should be considered to predict the drought index. There is no strict rule for this case; here we used a selection based on the correlation between SOI with different lag periods and the drought.

For the CP types, none of the prior months has any significant correlation; thus, only the simultaneous frequency distributions of the six CP types represent the first type of premises ($X_1, \ldots, X_6$). For SOI, as expected, the picture is different (Fig. 5), since the lag correlations are significant up to the prior six months. The highest correlation between PMDI and SOI occurs for a lag of six months, but then the correlations weaken. However, another local maximum correlation can be seen at the −4 months lag period. Furthermore, theoretically, an annual cycle is considered, so beyond six months in either direction no lag periods are taken into account. Based on these findings, we used four lagged periods (0, −2, −4 and −6 months) of high correlations as SOI-type premises ($X_7$, $X_8$, $X_9$, $X_{10}$). Note the trade-off between the increasing number of premises and the length of the data set.

Fig. 4. Empirical relative frequency distributions of CP types during extreme drought conditions in climate division 8 (panels: Very dry, Very wet; frequency of CP types CP 1 to CP 6).

Fig. 5. Correlation coefficients between drought and the lagged SOI (1946–94; lag time from −6 to 6 months).
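Assembling the ten premises for one month can be sketched as below. The function name, argument layout and the placeholder SOI values in the usage note are hypothetical; only the choice of six CP frequencies plus SOI at lags of 0, 2, 4 and 6 months comes from the text.

```python
SOI_LAGS = (0, 2, 4, 6)  # months before the target month

def premise_vector(cp_freq, soi, month_index):
    """Return [X1..X6, X7..X10] for one month.

    cp_freq: the six monthly relative frequencies of the CP types;
    soi: list of consecutive monthly SOI values;
    month_index: position of the target month in soi.
    """
    if len(cp_freq) != 6:
        raise ValueError("expected six CP relative frequencies")
    if month_index < max(SOI_LAGS):
        raise ValueError("not enough SOI history for the 6-month lag")
    lagged = [soi[month_index - lag] for lag in SOI_LAGS]
    return list(cp_freq) + lagged
```

For example, `premise_vector([0.30, 0, 0.20, 0.07, 0.03, 0.40], soi, 6)` with `soi = [0.25, 0.0, 0.60, 0.0, 0.31, 0.0, -1.04]` returns the April 1946 premises quoted later in the paper (the intermediate zeros in `soi` are placeholders).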

The entire 1946–1994 data set $\{X_{i,j}, Y_j\}_{i=1,\ldots,k;\ j=1,\ldots,n}$ contains k = 6 + 4 = 10 premises $X_i$ and n observations on premises and the response Y. The entire time series is split into two parts: a training set $\tau$ (2/3 of the entire period) and a validation set $\nu$ (1/3 of the entire period). The training set is used to learn the fuzzy rules, so it must be long enough to provide valuable model outputs. The validation set is used to validate the rules derived from the training set, namely, how correctly they can estimate the observed response. Different partitions of the data set were used to check the sensitivity of results to this operation; in the present case, the results are not sensitive to the selection of partitions. Therefore all the examples use the same partitioning (1946–62 and 1978–94 as the training period, and 1963–77 as the validation period).
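The partitioning used in the examples can be sketched as a split by calendar year. The record layout (year, month, premises, response) and the function names are assumptions made for illustration.

```python
TRAIN_PERIODS = ((1946, 1962), (1978, 1994))
VALID_PERIODS = ((1963, 1977),)

def in_periods(year, periods):
    """True if year falls inside any (first, last) period, inclusive."""
    return any(first <= year <= last for first, last in periods)

def split_records(records):
    """records: iterable of (year, month, premises, response) tuples."""
    records = list(records)
    train = [r for r in records if in_periods(r[0], TRAIN_PERIODS)]
    valid = [r for r in records if in_periods(r[0], VALID_PERIODS)]
    return train, valid
```

If only the summer half-year months (April to September) are kept, this split gives 34 training years and 15 validation years, which is roughly the 2/3 to 1/3 proportion stated above.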

3.2. Definition of fuzzy numbers

Fuzzy numbers are defined for each variable involved in the model. A fuzzy number $A_i$ consists of $(x, \mu_{A_i}(x))$ pairs, where x is an element of some type of continuous set and $\mu_{A_i}$ is the membership function, which must have no local minimum and attain a maximum of one. Values of $\mu_{A_i}(x)$ vary in the range [0, 1] depending on the truth of the characteristics that are considered (Dubois and Prade, 1980). One of the simplest fuzzy numbers is the triangular fuzzy number, represented as $(a_1, a_2, a_3)_T$ using the notation of Fig. 7.
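Under the usual piecewise-linear reading of a triangular fuzzy number (an assumption; the formula is not written out in the text), the membership function of $(a_1, a_2, a_3)_T$ with $a_1 < a_2 < a_3$ can be stated as:

```latex
\mu_A(x) =
\begin{cases}
\dfrac{x - a_1}{a_2 - a_1}, & a_1 \le x \le a_2, \\[6pt]
\dfrac{a_3 - x}{a_3 - a_2}, & a_2 < x \le a_3, \\[6pt]
0, & \text{otherwise.}
\end{cases}
```

The function rises linearly from 0 at $a_1$ to 1 at $a_2$ and falls back to 0 at $a_3$, so it has a single maximum of one and no local minimum, as required.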

Define, for example, the fuzzy number of the “very dry” condition. Characterizing a very dry climatological condition with a single value of PMDI may not be correct. Even an interval has sharp limits, and the various values between the lower and upper limits are not distinguished. Fuzzy logic, in contrast, makes it possible to look at these PMDI values as a sort of continuum. Although interval partitioning can be appropriate in some cases, fuzzy numbers are closer to human thinking than intervals. So all possible PMDI values have some membership values in very dry climatological conditions (all the positive PMDI values representing wet conditions must have 0 membership values). Using a triangular fuzzy number, very dry climatological conditions can be defined with $a_1 = -6$, $a_2 = -4$ and $a_3 = -1$, or $(-6, -4, -1)_T$. For instance, actual PMDI values of −1.9, −2.1 and −4.5 have membership values of 0.30, 0.37 and 0.75, respectively, so they all represent a very dry condition but to different degrees.

Fig. 6. Empirical relative frequency distributions of drought conditions during El Nino and La Nina (panels: El Nino, La Nina; drought categories: Very dry, Dry, Normal, Wet, Very wet).

Fig. 7. A triangular fuzzy number (membership function from 0 to 1, with the points a1, a2, a3 marked on the horizontal axis).
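The three membership values quoted for the “very dry” fuzzy number can be checked with a short sketch. The function name is hypothetical, and linear sides with $a_1 < a_2 < a_3$ are assumed.

```python
def triangular(x, a1, a2, a3):
    """Membership of x in the triangular fuzzy number (a1, a2, a3)_T,
    assuming a1 < a2 < a3 and linear sides."""
    if x <= a1 or x >= a3:
        return 0.0
    if x <= a2:
        return (x - a1) / (a2 - a1)
    return (a3 - x) / (a3 - a2)

VERY_DRY = (-6.0, -4.0, -1.0)
for pmdi in (-1.9, -2.1, -4.5):
    # prints the membership of each PMDI value in "very dry"
    print(pmdi, round(triangular(pmdi, *VERY_DRY), 2))
```

Rounded to two decimals, the values agree with the 0.30, 0.37 and 0.75 given in the text, and any positive PMDI value gets membership 0.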

The definitions of the fuzzy numbers were based mainly on the range of the premises and the response variable. Then, a linear partitioning was applied to each variable (SOI values, CP relative frequencies, PMDI values).

3.2.1. Fuzzy numbers defined on premises

The entire range of possible premise values is divided into several overlapping classes, each forming a fuzzy number. The more fuzzy numbers we define, the better the estimation that can be expected for the values of PMDI. However, if too many fuzzy numbers are defined on the premises, the validation set might contain too many observations that have never occurred before in the training set, so fuzzy rules cannot be applied to them. As a compromise, all premises (relative frequencies of CP types, and lagged SOI time series) are divided into five regions. For monthly CP occurrence these are: very rare $A^{(1)}$, rare $A^{(2)}$, medium $A^{(3)}$, frequent $A^{(4)}$ and very frequent $A^{(5)}$ (Fig. 8). For SOI they are: strong El Nino $A^{(1)}$, weak El Nino $A^{(2)}$, normal $A^{(3)}$, weak La Nina $A^{(4)}$ and strong La Nina $A^{(5)}$ (Fig. 9). Various CP types occur with different frequencies, so for the sake of comparability the highest monthly frequency that ever occurred in the data set is defined as the maximum of the given CP type premise (Table 3).

As an example, during the first month of data, April 1946, the occurrences of CP types are: CP1: 9, CP2: 0, CP3: 6, CP4: 2, CP5: 1, and CP6: 12 days. Thus, for that month $X_1 = 0.30$ (relative frequency of CP1), $X_2 = 0$, $X_3 = 0.20$, $X_4 = 0.07$, $X_5 = 0.03$, $X_6 = 0.40$, $X_7 = -1.04$ (simultaneous SOI), $X_8 = 0.31$ (SOI 2 months before), $X_9 = 0.60$ (SOI 4 months before) and $X_{10} = 0.25$ (SOI 6 months before). The corresponding membership functions are given in Table 4. For example, the relative frequency of CP1, $X_{1,1} = 0.30$, has membership values (different from 0) in both fuzzy sets “Rare monthly CP occurrence” and “Medium monthly CP occurrence”, 0.44 and 0.56, respectively; the relative frequency of CP5, $X_{5,1} = 0.03$, has membership values (different from 0) in both fuzzy sets “Very rare monthly CP occurrence” and “Rare monthly CP occurrence”, 0.86 and 0.14, respectively.

Fig. 8. Fuzzy numbers defined on the monthly relative frequency of a given CP type (very rare, rare, medium, frequent, very frequent; horizontal axis: 0, ¼·max, ½·max, ¾·max, max).

Fig. 9. Fuzzy numbers defined on SOI (strong El Nino, weak El Nino, normal, weak La Nina, strong La Nina; SOI values from −3 to 3).

Table 3. Monthly maximum relative frequencies (max) for daily CP types and their proportions

            CP1    CP2    CP3    CP4    CP5    CP6
max         0.77   0.50   0.93   0.74   0.94   0.60
3/4·max     0.58   0.38   0.70   0.56   0.70   0.45
1/2·max     0.38   0.25   0.47   0.37   0.47   0.30
1/4·max     0.19   0.13   0.23   0.19   0.23   0.15
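The membership values quoted for April 1946 are consistent with five equally spaced triangular fuzzy numbers whose peaks lie at 0, ¼·max, ½·max, ¾·max and max for a CP frequency, and at −3, −1.5, 0, 1.5 and 3 for SOI. That layout is inferred here from Figs. 8 and 9 and the quoted values, not stated explicitly in the text, so the sketch below should be read as an assumption; the function name is hypothetical.

```python
def memberships(x, low, high, n_classes=5):
    """Membership of x in n_classes equally spaced triangular fuzzy
    numbers with peaks from low to high; neighbours overlap so that
    the non-zero memberships of any x sum to one."""
    step = (high - low) / (n_classes - 1)
    x = min(max(x, low), high)  # clamp to the defined range
    values = []
    for l in range(n_classes):
        peak = low + l * step
        values.append(max(0.0, 1.0 - abs(x - peak) / step))
    return values

# CP1: 9 days out of 30 in April 1946, maximum frequency 0.77 (Table 3)
cp1 = memberships(9 / 30, 0.0, 0.77)
# simultaneous SOI for April 1946
soi = memberships(-1.04, -3.0, 3.0)
```

Rounded to two decimals, `cp1` reproduces the 0.44 (rare) and 0.56 (medium) quoted above, and `soi` gives 0.69 (weak El Nino) and 0.31 (normal).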

3.2.2. Fuzzy numbers defined on response variable

PMDI as the response variable (Y) was considered for the eight climate divisions and the spatial average of the entire state of Nebraska (NOAA, 1998).

Different fuzzy number systems were defined on the range from extremely dry (large negative PMDI values) to extremely wet (large positive PMDI values) conditions. As the total number of fuzzy numbers increases (7, 8, 11, 12, 17, 18), the accuracy of the fuzzy rule-based model improves. So, the last option was chosen, with fuzzy numbers $B^{(1)}, \ldots, B^{(18)}$ (Fig. 10). This number of fuzzy partitions offers a proper representation of the wide range of PMDI, and the data set can provide several points in each interval.

For the example of April 1946, values of the PMDI membership function are given in Table 5 for the eight climate divisions and the spatial average.

3.3. Rule construction

Fuzzy rules are constructed using the training set $\tau: \{X_{i,j}, Y_j\}_{i=1,\ldots,k;\ j=1,\ldots,n_\tau}$ (where $n_\tau < n$ is the number of observations in the time series of the training set) by applying the following steps.

3.3.1. Determine the highest values of all membership functions for each data point

First, values of membership functions are calculated for each observed premise and response variable: $\mu_{A_i^{(l_i)}}(X_{i,j})$ (for $l_i = 1, \ldots, 5$; $i = 1, \ldots, k$) and $\mu_{B^{(l)}}(Y_j)$. Then, the maximum values of the membership functions are selected. Thus, each $X_{i,j}$ data point within the data set ($j = 1, \ldots, n_\tau$) possesses a value $M_{i,j}$:

$$M_{i,j} = \max_{l_i = 1, \ldots, 5} \mu_{A_i^{(l_i)}}(X_{i,j}),$$

and also each response $Y_j$ possesses a value $M_{0,j}$:

$$M_{0,j} = \max_{l = 1, \ldots, 18} \mu_{B^{(l)}}(Y_j).$$

Table 6 shows these selected maximum values for the first data point, April 1946.

Table 4. Values of membership function for the first data point (April 1946)

i    X_{i,1}   μA(1)       μA(2)     μA(3)    μA(4)      μA(5)
               Very rare   Rare      Medium   Frequent   Very frequent
1    0.30      0           0.44      0.56     0          0
2    0         1.00        0         0        0          0
3    0.20      0.14        0.86      0        0          0
4    0.07      0.64        0.36      0        0          0
5    0.03      0.86        0.14      0        0          0
6    0.40      0           0         0.33     0.67       0

               Strong      Weak      Normal   Weak       Strong
               El Nino     El Nino            La Nina    La Nina
7    −1.04     0           0.69      0.31     0          0
8    0.31      0           0         0.79     0.21       0
9    0.60      0           0         0.60     0.40       0
10   0.25      0           0         0.83     0.17       0

Fig. 10. Fuzzy numbers defined on PMDI (extreme dry, dry 7 to dry 1, normal, wet 1 to wet 8, extreme wet; PMDI values from −8 to 9).
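Selecting the maximum membership value and the fuzzy number it belongs to can be sketched as follows. The function name is hypothetical, and ties are resolved in favour of the first class, which is an assumption since the text does not discuss ties.

```python
def strongest_class(membership_values, labels):
    """Return (M, label): the largest membership value and the name of
    the fuzzy number that attains it (first one wins on ties)."""
    best = max(range(len(membership_values)),
               key=lambda l: membership_values[l])
    return membership_values[best], labels[best]

CP_LABELS = ("Very rare", "Rare", "Medium", "Frequent", "Very frequent")
SOI_LABELS = ("Strong El Nino", "Weak El Nino", "Normal",
              "Weak La Nina", "Strong La Nina")

# Rows 1 and 7 of the April 1946 membership values
m_cp1 = strongest_class([0, 0.44, 0.56, 0, 0], CP_LABELS)
m_soi = strongest_class([0, 0.69, 0.31, 0, 0], SOI_LABELS)
```

Here `m_cp1` is `(0.56, "Medium")` and `m_soi` is `(0.69, "Weak El Nino")`, matching the first and seventh premises reported for April 1946.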

Table 5. Values of response membership function for the first data point (April 1946; PMDI in different regions of Nebraska)

Division   Y_1     μB(1)         μB(2)   …   μB(6)   μB(7)   μB(8)   μB(9)    μB(10)   …   μB(17)   μB(18)
                   Extreme dry   Dry 7   …   Dry 3   Dry 2   Dry 1   Normal   Wet 1    …   Wet 8    Extreme wet
1          −1.16   0             0       …   0       0.16    0.84    0        0        …   0        0
2          −1.71   0             0       …   0       0.71    0.29    0        0        …   0        0
3          −1.31   0             0       …   0       0.31    0.69    0        0        …   0        0
5          −2.47   0             0       …   0.47    0.53    0       0        0        …   0        0
6          −1.47   0             0       …   0       0.47    0.53    0        0        …   0        0
7          −2.08   0             0       …   0.08    0.92    0       0        0        …   0        0
8          −2.29   0             0       …   0.29    0.71    0       0        0        …   0        0
9          −1.76   0             0       …   0       0.76    0.24    0        0        …   0        0
NE         −1.84   0             0       …   0       0.84    0.16    0        0        …   0        0

Table 6. Maximum membership function values and weights for the first data point (April 1946)

i    Name       Maximum value M_{i,1}   Name of the fuzzy number
1    CP1        0.56                    Medium
2    CP2        1.00                    Very rare
3    CP3        0.86                    Rare
4    CP4        0.64                    Very rare
5    CP5        0.86                    Very rare
6    CP6        0.67                    Frequent
7    SOI        0.69                    Weak El Nino
8    SOI (−2)   0.79                    Normal
9    SOI (−4)   0.60                    Normal
10   SOI (−6)   0.83                    Normal

DOF_1 = 0.049

Response variable   Location       Maximum value M_{0,1}   Name of the fuzzy number   Weight of rule 1, v_1 = DOF_1·M_{0,1}
PMDI div. 1         W-Ne           0.84                    dry 1                      0.041
PMDI div. 2         N-Ne           0.71                    dry 2                      0.035
PMDI div. 3         NE-Ne          0.69                    dry 1                      0.034
PMDI div. 5         Central-Ne     0.53                    dry 2                      0.026
PMDI div. 6         E-Ne           0.53                    dry 1                      0.026
PMDI div. 7         SW-Ne          0.92                    dry 2                      0.045
PMDI div. 8         S-Central Ne   0.71                    dry 2                      0.035
PMDI div. 9         SE-Ne          0.76                    dry 2                      0.037
PMDI/NE             Nebraska       0.84                    dry 2                      0.041

3.3.2. Combined effect of fuzzy numbers (using operator AND)

Since we have more than one premise, the effects of premises should be combined. The two most commonly used operators for fuzzy numbers are AND and OR (Zimmermann, 1985). In the present model we used only the operator AND to combine the effects of different premises. So a rule will look like this:

IF ($X_{1,j}$ is $A^{(l_1)}$ AND $X_{2,j}$ is $A^{(l_2)}$ AND … AND $X_{10,j}$ is $A^{(l_{10})}$) THEN $Y_j$ is $B^{(l)}$.

The combined effect of all premises is represented here by the product of membership functions, called the degree of fulfillment (DOF), which indicates the degree of applicability of the rule within the system. Thus, the DOF of the jth set of data points ($DOF_j$) is calculated as:

$$DOF_j = \prod_{i=1}^{k} M_{i,j}.$$

The first data point (April 1946) has

$$DOF_1 = 0.56 \cdot 1.00 \cdot 0.86 \cdot 0.64 \cdot 0.86 \cdot 0.67 \cdot 0.69 \cdot 0.79 \cdot 0.60 \cdot 0.83 = 0.049.$$
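The product can be reproduced in a few lines; the function name is hypothetical. With the two-decimal values quoted above the product comes out near 0.048; the reported 0.049 presumably reflects unrounded membership values, which is an assumption on our part.

```python
import math

# Maximum membership values M_{i,1} for April 1946, as quoted above
M_APRIL_1946 = [0.56, 1.00, 0.86, 0.64, 0.86, 0.67, 0.69, 0.79, 0.60, 0.83]

def degree_of_fulfillment(max_memberships):
    """Product (operator AND) of the maximum membership values."""
    return math.prod(max_memberships)

dof_1 = degree_of_fulfillment(M_APRIL_1946)
```

Because the DOF is a product of ten values that are each at most one, it shrinks quickly as premises are added, which is one face of the trade-off between the number of premises and the length of the data set noted in Section 3.1.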

In the very beginning, the fuzzy rule system is empty and contains no rules at all; the first rule is derived from the first observed values. In the present case, this first rule for the entire state of Nebraska, and for Northern, Central, Southwestern, South-Central and Southeastern Nebraska, is as follows:

IF Medium CP1 occurrence AND Very rare CP2 occurrence AND Rare CP3 occurrence AND Very rare CP4 occurrence AND Very rare CP5 occurrence AND Frequent CP6 occurrence AND weak El Nino in the actual month AND Normal phase 2 months before AND Normal phase 4 months before AND Normal phase 6 months before
THEN dry 2 drought condition (1)

For Western, Northeastern and Eastern Nebraska:

IF Medium CP1 occurrence AND Very rare CP2 occurrence AND Rare CP3 occurrence AND Very rare CP4 occurrence AND Very rare CP5 occurrence AND Frequent CP6 occurrence AND weak El Nino in the actual month AND Normal phase 2 months before AND Normal phase 4 months before AND Normal phase 6 months before
THEN dry 1 drought condition (2)

The rule system will grow as more and more rules are added on the basis of observed data points. If a rule derived from a given set of data points is not included in the rule system yet, then it should be added to the rule system.

3.3.3. Assign a weight to each rule

Weights indicate the proportion of the training data set explained by a given (mth) rule. They are calculated as the sum of the products of $DOF_j$ and the value of the membership function of the response variable ($M_{0,j}$):

$$v_m = \sum_{j=1}^{n_\tau} DOF_j \cdot M_{0,j}.$$

For the first data point (April 1946), the weights of rule (1) or (2), depending on the area considered, are shown in Table 6.

If the first rule (1) or (2) appears in more data points, the individual weights are summed.

After proceeding through the entire training set, all derived rules will possess a weight, which will be used in the validation procedure when the estimated values of the response variable are calculated during the defuzzification step.
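The accumulation of weights can be sketched with a dictionary keyed by rule. The data layout and function name are hypothetical; each training point is assumed to have been reduced already to its strongest premise classes, its DOF, its strongest response class and the corresponding $M_{0,j}$.

```python
from collections import defaultdict

def build_rule_base(training_points):
    """training_points: iterable of
    (premise_labels, dof, response_label, m0) tuples.
    Returns {(premise_labels, response_label): weight}."""
    weights = defaultdict(float)
    for premise_labels, dof, response_label, m0 in training_points:
        rule = (tuple(premise_labels), response_label)
        # a new rule starts at 0; a repeated rule has its weights summed
        weights[rule] += dof * m0
    return dict(weights)
```

For April 1946 in Western Nebraska the contribution is 0.049 · 0.84 ≈ 0.041, the value listed for rule (2) in Table 6.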

3.4. Validation procedure

Fuzzy rules are validated using the validation data set ($\nu: \{X_{i,j}, Y_j\}_{i=1,\ldots,k;\ j=n_\tau+1,\ldots,n}$) in the following steps.

3.4.1. Calculate all possible DOF for each data point

All values of the membership functions are calculated for each premise, so we have all $\mu_{A(l_i)}(X_{i,j})$ values (for $l_i = 1, \dots, 5$; $i = 1, \dots, k$). Since the fuzzy numbers are defined as overlapping regions, each data point falls into two different fuzzy regions of a given premise (Figs. 8 and 9). Thus, theoretically, there are $2^k$ possible rules, but most of them are either impossible or did not occur in the training set (the maximum number of rules is determined by the length of the training set, $n_t$, which is much less than $2^k$). Therefore only a few existing rules will be taken into account in specifying the response output. As an example, for a data point from the validation set (July 1966), the possible membership values are calculated in the western Nebraskan region (Table 7). The total number of potentially applicable fuzzy rules is $2^{10} = 1024$ in the present study.
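The count of $2^{10} = 1024$ candidate rules can be checked with a short sketch that enumerates every combination of the two active fuzzy classes per premise, using the non-zero membership values of Table 7. One assumption is made here that this section does not state explicitly: the DOF of a premise combination is taken to be the product of its membership values. That choice agrees with the DOF values listed in Table 8 to within rounding, but it should be read as an inference, not as the documented operator.

```python
from itertools import product
from math import prod

# Non-zero membership values per premise, transcribed from Table 7
# (July 1966, western Nebraska). Premises 1-6 are CP frequencies
# (VR = very rare, R = rare, M = medium); premises 7-10 are SOI values
# (sE = strong El Nino, wE = weak El Nino, N = normal).
memberships = [
    {"VR": 0.33, "R": 0.67},
    {"VR": 0.48, "R": 0.52},
    {"VR": 0.03, "R": 0.97},
    {"R": 0.26, "M": 0.74},
    {"VR": 0.03, "R": 0.97},
    {"VR": 0.79, "R": 0.21},
    {"wE": 0.16, "N": 0.84},
    {"wE": 0.43, "N": 0.57},
    {"sE": 0.18, "wE": 0.82},
    {"wE": 0.89, "N": 0.11},
]


def candidate_rules(memberships):
    """Yield (premise labels, DOF) for every combination of active fuzzy classes.

    Assumption: the DOF is the product of the membership values involved.
    """
    for labels in product(*(m.keys() for m in memberships)):
        dof = prod(m[label] for m, label in zip(memberships, labels))
        yield labels, dof


candidates = dict(candidate_rules(memberships))
print(len(candidates))  # number of premise combinations with non-zero DOF
```

Only the combinations that also exist in the trained rule system contribute to the estimate; in the July 1966 example that leaves the five rules of Table 8.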

3.4.2. Combine the fuzzy responses: defuzzification

At this time, the application of each rule provides a fuzzy response. The defuzzification process combines the fuzzy responses and arrives at a crisp (real-number) estimated response. The center of gravity is commonly used to obtain the estimated value of the response variable ($Y_j$):

$$Y_j = \frac{\sum_{m \in t} \mathrm{DOF}_m \cdot v_m \cdot B_m^{(2)}}{\sum_{m \in t} \mathrm{DOF}_m \cdot v_m}$$

where $B_m^{(2)}$ is the most likely value ($\mu_{B_m} = 1$) of the consequence fuzzy number $B_m$ defined on PMDI.
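The center-of-gravity combination can be sketched in a few lines. The inputs below are the five applied rules of Table 8 (July 1966, western Nebraska), used as tabulated; the unit factors of the DOF and weight columns cancel in the ratio. The function name `defuzzify` is introduced here for illustration.

```python
# (DOF_m in units of 1e-5, weight v_m in units of 1e-2, most likely value of the
# consequence fuzzy number) for the five applied rules listed in Table 8.
applied_rules = [
    (0.96, 6.4, -1),   # dry1
    (2.51, 1.4, 1),    # wet1
    (35.91, 2.3, 3),   # wet3
    (10.12, 1.5, 1),   # wet1
    (91.20, 3.0, -3),  # dry3
]


def defuzzify(rules):
    """Weighted center of gravity: sum(DOF * v * B) / sum(DOF * v)."""
    numerator = sum(dof * weight * peak for dof, weight, peak in rules)
    denominator = sum(dof * weight for dof, weight, _ in rules)
    if denominator == 0:
        raise ValueError("no applicable rule: the estimate is undefined")
    return numerator / denominator


estimate = defuzzify(applied_rules)
print(round(estimate, 2))
```

With the rounded table entries this sketch returns about -1.47, slightly different from the -1.45 reported in the paper's worked example; the gap presumably reflects rounding in the tabulated DOF and weight values.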

In our example, for the data point of July 1966, five rules are applicable out of the 1024 possible fuzzy rules (Table 8). The estimate of PMDI in western Nebraska for July 1966 is therefore:

$$Y_{124} = \frac{10^{-5} \cdot \left(0.096 \cdot 0.64 \cdot (-1) + 0.25 \cdot 0.14 \cdot 1 + 3.59 \cdot 0.23 \cdot 3 + 1.01 \cdot 0.15 \cdot 1 + 9.12 \cdot 0.30 \cdot (-3)\right)}{10^{-5} \cdot \left(0.096 \cdot 0.64 + 0.25 \cdot 0.14 + 3.59 \cdot 0.23 + 1.01 \cdot 0.15 + 9.12 \cdot 0.30\right)} = \frac{-5.39}{3.73} = -1.45$$

Table 7
Membership function values at the data point of July 1966 (Western Nebraska)

CP premises; columns 3 to 7 are $\mu_{A(1_i)}$ to $\mu_{A(5_i)}$:

i    X_{i,124}   Very rare   Rare   Medium   Frequent   Very frequent
1    0.13        0.33        0.67   0        0          0
2    0.07        0.48        0.52   0        0          0
3    0.23        0.03        0.97   0        0          0
4    0.32        0           0.26   0.74     0          0
5    0.22        0.03        0.97   0        0          0
6    0.03        0.79        0.21   0        0          0

SOI premises; columns 3 to 7 are $\mu_{A(1_i)}$ to $\mu_{A(5_i)}$:

i    X_{i,124}   Strong El Nino   Weak El Nino   Normal   Weak La Nina   Strong La Nina
7    -0.24       0                0.16           0.84     0              0
8    -0.65       0                0.43           0.57     0              0
9    -1.77       0.18             0.82           0        0              0
10   -1.33       0                0.89           0.11     0              0

3.5. Evaluate the fuzzy rule-based model

The fuzzy rule-based model must be evaluated in terms of how well it reproduces the statistical properties and the actual time series of the consequences in the validation set.

4. Results

The results of the model using 5 fuzzy numbers on each premise and 18 on the response variable are summarized in Table 9 by providing the means and standard deviations of observed and estimated PMDI, the root-mean-squared errors (RMSE) and the correlation coefficients between observed and estimated time series for each climate division. These statistical characteristics serve as criteria of verification. It is evident from Table 9 that the fuzzy rule-based model preserves the statistical parameters of the PMDI; in fact, there is no significant difference in any of the regions considered. In addition, the distributions of the calculated PMDI reproduce the empirical distributions (Fig. 11). This is even more noteworthy if we consider the much larger difference between the empirical distributions in the learning set (from which the rules are derived) and the validation set (Fig. 3). However, the performance of the model is very sensitive to the selection of the number of classes in the premises. If, for instance, only three fuzzy numbers are defined on the two types of premises (Rare, Medium, Frequent monthly CP occurrence, and El Nino, Normal, La Nina phases), the distributions differ significantly (Fig. 12) and the results are not as good.
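The verification statistics named above (mean, standard deviation, RMSE and correlation coefficient of observed versus estimated PMDI) can be computed with a sketch like the following. The series are hypothetical values for illustration only, not data from the paper, and the use of the sample standard deviation ($n - 1$ denominator) is an assumption, since the text does not say which form was used.

```python
from math import sqrt


def mean(xs):
    return sum(xs) / len(xs)


def std_dev(xs):
    """Sample standard deviation (n - 1 in the denominator; an assumption)."""
    m = mean(xs)
    return sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


def rmse(observed, estimated):
    """Root-mean-squared error between two equally long series."""
    return sqrt(mean([(o - e) ** 2 for o, e in zip(observed, estimated)]))


def correlation(observed, estimated):
    """Pearson correlation coefficient."""
    mo, me = mean(observed), mean(estimated)
    cov = sum((o - mo) * (e - me) for o, e in zip(observed, estimated))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_e = sum((e - me) ** 2 for e in estimated)
    return cov / sqrt(var_o * var_e)


# Hypothetical monthly PMDI values, for illustration only.
observed = [-2.1, -1.4, 0.3, 1.8, 2.5, -0.6]
estimated = [-1.7, -1.9, 0.8, 1.2, 2.0, -0.2]

summary = {
    "mean_observed": mean(observed),
    "mean_estimated": mean(estimated),
    "std_observed": std_dev(observed),
    "std_estimated": std_dev(estimated),
    "rmse": rmse(observed, estimated),
    "correlation": correlation(observed, estimated),
}
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Applied to each climate division in turn, such a summary would yield one column of a table laid out like Table 9.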

Fig. 13 shows the observed and estimated time series for two divisions. The model performs almost perfectly over the training set, and quite well over the entire period. However, the estimated values are not exactly the same as the observed drought index during the validation period. This was expected, since droughts are triggered by a large number of atmospheric, hydrologic, agricultural, and other phenomena in addition to the two types of premises this model considers. Another reason is that during the “learning” process huge and persistent negative (years 1954–57) and positive (1992–94) peaks must be “assimilated”. The model did learn all the peaks, which is necessary to apply the fuzzy rule-based model to the entire range of PMDI.

Table 9
Summary of results. Fuzzy numbers: 5 in CP time series, 5 in SOI time series, 18 in PMDI time series

                         Div1   Div2   Div3   Div5   Div6   Div7   Div8   Div9   NE
Ave. of observed         0.18   0.75   0.65   0.97   0.71   0.53   0.83   0.53   0.42
Ave. of estimated        0.15   1.00   0.59   0.89   0.59   0.62   0.68   0.40   0.32
Std. dev. of observed    2.13   2.87   2.85   2.70   2.59   2.60   2.47   2.75   2.91
Std. dev. of estimated   1.99   2.76   2.78   2.83   2.53   2.54   2.46   2.60   2.92
RMSE                     1.53   1.82   1.63   1.74   1.57   1.73   1.64   1.65   1.77
Correlation coeff.       0.74   0.80   0.84   0.82   0.82   0.79   0.79   0.82   0.83

[Figure: South-Central Nebraska. x-axis: PMDI values (-9 to 9); y-axis: relative frequency (0 to 1). Series: observed PMDI; estimated PMDI / CP + SOI; estimated PMDI / CP only; estimated PMDI / SOI only.]
Fig. 11. Comparison of cumulative frequency distributions of PMDI time series (1946–1994) in South-Central Nebraska (fuzzy partitions: 5 on CP, 5 on SOI, 18 on PMDI).

Table 8
Characteristics of the applied rules for the data point of July 1966 (Western Nebraska)

Applied (mth) rule                    DOF_m [10^-5]   Weight v_m [10^-2]   B_m^(2)
VR,R,VR,R,R,R,N,wE,wE,N → dry1        0.96            6.4                  -1
VR,R,VR,M,R,VR,wE,N,wE,N → wet1       2.51            1.4                  1
VR,R,R,R,R,VR,wE,wE,sE,wE → wet3      35.91           2.3                  3
R,VR,VR,N,R,R,wE,N,wE,wE → wet1       10.12           1.5                  1
R,R,R,R,R,R,wE,wE,wE,wE → dry3        91.20           3.0                  -3


[Figure: x-axis: PMDI values (-9 to 9); y-axis: relative frequency (0 to 1). Series: observed PMDI; estimated PMDI.]
Fig. 12. Cumulative frequency distribution of the observed and estimated PMDI time series (1946–1994) in Nebraska (fuzzy partitions: 3 on CP, 3 on SOI, 17 on PMDI).

[Figure: two panels, South-Central Nebraska and Western Nebraska. x-axis: years (1946 to 1994); y-axis: PMDI (-9 to 9). Series: observed PMDI; estimated PMDI. Marked periods: training set – first half; training set – second half; validation set.]
Fig. 13. Observed and estimated PMDI time series (1946–1994, summer half years). Fuzzy partitions: 5 on CP, 5 on SOI, 18 on PMDI.


5. Discussion and conclusions

A fuzzy rule-based methodology has been presented to estimate the modified Palmer index using global atmospheric circulation and ENSO for the climate divisions in Nebraska. Separate use of either the relative frequencies of CP types or the lagged SOI as premises shows that neither formulation can reproduce the empirical frequency distribution (Fig. 11). In fact, prediction based solely on SOI is the worst. The consideration of the joint forcing then results in dramatic improvement. Thus, one of the main findings of this paper is that both types of premises must be taken into consideration for the prediction of PMDI in Nebraska. On the other hand, multivariate regression provides very poor results regardless of whether either or both types of premises are used. Fig. 14 presents typical results of multivariate regression; evidently this tool cannot be used in the regions considered with the amount of data available. Similar results were found for precipitation in Arizona in Galambosi et al. (1997).

The fuzzy rule-based technique has potential to generate time series of drought indices under climate change scenarios. The main idea is to use, instead of the historical CP and ENSO data, GCM-produced data with the established linkage (fuzzy rule) to predict the drought indices. Several GCMs are able to reproduce features of present atmospheric general circulation patterns quite correctly (e.g. Simmons and Bengtsson, 1988; Mearns et al., 1999). On the other hand, because of the possible difficulty of obtaining GCM-produced ENSO indices, one may resort to a scenario analysis: for instance, an unchanged ENSO regime, a more frequent El Nino scenario, and a more frequent La Nina scenario can be assumed, although the rapid development of GCMs may provide meaningful ENSO outputs in the near future.

The following conclusions can be drawn:

1. Climate divisions in Nebraska reflect different drought conditions of high variability and persistence.

2. Although there is a significant relationship between simultaneous monthly CP, lagged SOI and PMDI in Nebraska, the weakness of the correlations, the dependence between CP and SOI and the relatively short data set limit the applicability of statistical modeling for prediction.

3. Fuzzy rule-based modeling, which does not seek a mathematical function to describe the relationship, provides an excellent tool to predict PMDI from only two types of premises: monthly frequencies of daily CP types and lagged prior SOI.

4. The fuzzy rules can be ascertained from a subset, called the learning set, of the observed time series of premises and PMDI response. Another subset, the validation set, should also be defined to check how well the application of the fuzzy rules reproduces the observed PMDI.

5. In all eight climate divisions and in Nebraska as a whole, the fuzzy rule-based technique using the joint forcing of CP and SOI is able to learn the high variability and persistence of PMDI and results in almost perfect reproduction of the empirical frequency distributions; the “real time” prediction is also acceptable.

6. A step-by-step description of the methodology has been provided to facilitate its application to other similar cases.

[Figure: South-Central Nebraska. x-axis: years (1946 to 1994); y-axis: PMDI values (-9 to 9). Series: observed PMDI; predicted PMDI.]
Fig. 14. Observed and multivariate regression estimated PMDI time series for South-Central Nebraska.

Acknowledgements

This research has been partially supported by the US National Science Foundation under grants CMS-9613654 and CMS-9614017.

References

Alley, W.M., 1984. The Palmer drought severity index: limitations and assumptions. J. Clim. Appl. Meteorol. 23 (7), 1100–1109.

Bardossy, A., Duckstein, L., 1995. Fuzzy Rule-Based Modeling with Applications to Geophysical, Biological and Engineering Sciences, CRC Press, Boca Raton, FL, 232 pp.

Bartholy, J., Matyasovszky, I., Duckstein, L., Bogardi, I., 1996. Interrelationship between ENSO and large-scale circulation patterns. Presentation at the Conference on Probability and Statistics, San Francisco, CA, 21–23 February.

Bogardi, I., Matyasovszky, I., Bardossy, A., Duckstein, L., 1994. A hydroclimatological model of areal drought. J. Hydrol. 153, 245–264.

Carlson, R.E., Todey, D.P., Taylor, S.E., 1996. Midwestern corn yield and weather in relation to extremes of the Southern Oscillation. J. Prod. Agric. 9 (3), 347–352.

Chenault, E.A., Parsons, G., 1998. Drought worse than 96; cotton crop’s one of worst ever. http://agnews.tamu.edu/stories/AGEC/Aug1998a.htm, Texas A&M Agricultural News Home Page, College Station, TX, August 19.

Dubois, P., Prade, H., 1980. Fuzzy Sets and Systems: Theory and Applications, Academic Press, San Diego, CA.

Federal Emergency Management Agency, 1995. National Mitigation Strategy: Partnerships for Building Safer Communities. FEMA, Washington DC.

Galambosi, A., Duckstein, L., Ozelkan, E., Bogardi, I., 1997. A fuzzy rule-based model to link circulation patterns, ENSO, and extreme precipitation. In: Haimes, Y.Y., Moser, D.A., Stakhiv, E.Z. (Eds.). Risk-Based Decision Making in Water Resources VIII, ASCE, Reston, VA, pp. 83–103.

Galambosi, A., Ozelkan, E., Duckstein, L., Bogardi, I., 1999. A fuzzy rule-based model for precipitation analysis under climate change in the US Southwest. Presentation at the 79th Annual Meeting, AMS, Dallas, TX, 10–15 January.

Guttman, N.B., Quayle, R.G., 1996. A historical perspective of U.S. climate divisions. Bull. Am. Met. Soc. 77 (2), 293–303.

Guttman, N.B., Wallis, J.R., Hosking, J.R.M., 1992. Spatial comparability of the Palmer drought severity index. Water Resour. Bull. 28, 1111–1119.

Heddinghaus, T.R., Sabol, P., 1991. A review of the Palmer Drought Severity Index and where do we go from here. Proc. of the Seventh Conference on Applied Climatology, 242–246.

Lawson, M.P., Dewey, K.F., Neild, R.E., 1977. Climatic Atlas of Nebraska, University of Nebraska Press, Lincoln, NE, 88 pp.

MacQueen, J.B., 1967. Some methods for classification and analysis of multivariate observations. Proc. 5th Berkeley Symp. on Math. Stat. Probab. 1, 281–297.

Matyasovszky, I., Bogardi, I., Bardossy, A., Duckstein, L., 1993. Estimation of local precipitation statistics reflecting climate change. Water Resour. Res. 29 (12), 3955–3968.

McKee, T.B., Doesken, N.J., Kleist, J., 1993. The relationship of drought frequency and duration to time scales. Presented at the Eighth Conference on Applied Climatology, AMS, Boston, MA.

Mearns, L.O., Bogardi, I., Giorgi, F., Matyasovszky, I., Palecki, M., 1999. Comparison of climate change scenarios generated from regional climate model experiments and statistical downscaling. J. Geophys. Res. 104 (8), 6603–6621.

NCAR Data Support Section and University of Washington Dept. of Atmospheric Sciences, 1996. NCEP Grid Point Data Set, version III.

NOAA, 1997. SOI time series. http://nic.fb4.noaa.gov:80/data/cddb/cddb/soi.

NOAA, National Climatic Data Center, 1998. Modified Palmer Drought Severity Index. ftp://ftp/ncdc.noaa.gov/pub/data/cirs/9808.pmdi.

Palecki, M., 1996. Nebraska’s climate: past and future. Nebraskaland Magazine 74 (1), 106–121.

Palmer, W.C., 1965. Meteorological drought. Research Paper 45, US Weather Bureau, Washington DC, 58 pp.

Piechota, T.C., Dracup, J.A., 1996. Drought and regional hydrologic variation in the United States: Associations with the El Nino-Southern Oscillations. Water Resour. Res. 32 (5), 1359–1373.

Pesti, G., Shrestha, B., Duckstein, L., Bogardi, I., 1996. A fuzzy rule-based approach to drought assessment. Water Resour. Res. 32 (6), 1741–1747.

Pongracz, R., 1999. ENSO impacts and climate change consequences in the Northern midlatitudes. PhD dissertation, Eotvos Lorand University, Budapest, Hungary (unpublished).

Pongracz, R., Bogardi, I., Duckstein, L., Bartholy, J., 1997. Risk of regional drought influenced by ENSO. In: Haimes, Y.Y., Moser, D.A., Stakhiv, E.Z. (Eds.). Risk-Based Decision Making in Water Resources VIII, ASCE, Reston, VA, pp. 114–125.

Riebsame, W.E., Changnon Jr., S.A., Karl, T.R., 1991. Drought and Natural Resources Management in the United States: Impacts and Implications of the 1987–89 Drought, Westview Press, Boulder, CO, 174 pp.

Simmons, A.J., Bengtsson, L., 1988. Atmospheric general circulation models: their design and use for climate studies. In: Schlesinger, M. (Ed.). Physically-Based Modelling and Simulation of Climate and Change, NATO ASI Series, II. Kluwer Academic, Dordrecht, pp. 627–652.

Western Governors’ Association, 1996. Drought Response Action Plan. WGA, Denver, CO.

Wilks, D.S., 1995. Statistical Methods in the Atmospheric Sciences, Academic Press, San Diego, CA, 467 pp.

Wright, P.B., 1985. The Southern Oscillation: an ocean–atmosphere feedback system? Bull. Amer. Met. Soc. 66, 398–412.

Zimmermann, H.J., 1985. Fuzzy Set Theory—and Its Applications, Kluwer–Nijhoff, Boston, MA, 363 pp.
