

Analysis and statistical forecasting of trends in river discharges under uncertain climate changes

PIETER H.A.J.M. VAN GELDER
TU Delft, Civil Engineering, Hydraulic and Offshore Engineering Section,
P.O. Box 5048, NL-2600 GA DELFT, The Netherlands
E-mail: [email protected]

VADIM A. KUZMIN
Department of Hydrology, Russian State Hydrometeorological University,
Malookhtinski 98, Saint-Petersburg 195196, Russia
E-mail: [email protected] [email protected]

PAUL J. VISSER
TU Delft, Civil Engineering, Hydraulic and Offshore Engineering Section,
P.O. Box 5048, NL-2600 GA DELFT, The Netherlands
E-mail: [email protected]

Key words: long-term forecasting, stochastic modeling, robustness, method of main components, climate change

Abstract This paper describes the application of the statistical method of main components (MMC) to annual average river discharges for which over a century of data is available. For these cases, the MMC is able to provide quite reliable forecasts of the average discharges up to a few years ahead. Furthermore, some dominant near-decadal modes of oscillation have been detected by MMC.

1 Introduction

The majority of statistical methods for analysing hydrological data are based on the assumption that the hydrological processes are quasi-stationary. However, with the observed changes in climate this assumption is no longer quite correct. It is known that there are many causes of climate change. A number of them follow from global processes such as solar activity, fluctuation of the earth's axis, changes of earth zones, etc. One way to identify these causes in the available datasets is to use the statistical Method of Main Components (MMC). This method is closely related to the Fourier method for spectral data analysis.

MMC was developed at the St.-Petersburg State University. It is an efficient statistical procedure, and suitable software (Caterpillar®) has been developed. Caterpillar can separate several main components of casual processes. Using a 100-year observed data set, it provides a forecast length of a few years.

The techniques will be applied in this paper to time series of annual average discharges of the:
- River Rhine at the locations Koeln (Germany) and Lobith (NL), for which 182 (1816-1997) and 96 (1901-1996) years of data are available, respectively;
- River Meuse at the location Borgharen (NL), for which 80 (1911-1990) years of data are available;
- River Weser at the location Vlotho (Germany), with 175 (1820-1994) years.

Using modern time series analysis, Tsonis et al. (1998) discovered a characteristic time scale in the global temperature record. They found a time scale corresponding to about 20 months and separating processes that promote a trend in the past from processes that reverse this tendency. This characteristic scale has important implications, one of which might be that the El Niño/La Niña cycle may act as a mechanism countering the tendency of shorter time scale events to organize a positive or a negative temperature trend.

[Plot: Annual Average River Discharge Data at Lobith, Rhine; horizontal axis 1900-2000, vertical axis 1000-3500.]

The annual record of hurricane activity in the North Atlantic basin for the period 1886-1996 has been examined by Elsner et al. (1999) from the perspective of time series analysis. Singular spectrum analysis combined with the maximum entropy method was used on the time series of annual hurricane occurrences over the entire basin to extract the dominant modes of oscillation. The annual frequency of hurricanes was modulated on the biennial, semidecadal, and near-decadal timescales. The biennial and semidecadal oscillations corresponded to two well-known physical forcings in the local and global climate. These include a shift in tropical stratospheric winds between an east and west phase [quasi-biennial oscillation (QBO)] and a shift in equatorial Pacific Ocean temperatures between a warm and cold phase [El Niño-Southern Oscillation (ENSO)]. These climate signals have previously been implicated in modulating interannual hurricane activity in the North Atlantic and elsewhere. Their near-decadal oscillation was a new finding. Separate analyses on tropical-only (TO) and baroclinically enhanced (BE) hurricane frequencies showed that the two components are largely complementary with respect to their frequency spectra. The spectrum of TO hurricanes is dominated by timescales associated with ENSO and the QBO, while the near-decadal timescale dominates the spectrum of BE hurricanes. Speculations as to the cause of the near-decadal oscillation of BE hurricanes center on changes in Atlantic SSTs, possibly through changes in evaporation rates. Specifically, cross-correlation analysis pointed to solar activity as a possible explanation, as given by Elsner et al. (1999).

In this paper the time series of river discharges will be examined with the same type of techniques as were applied in the two above-mentioned papers (Tsonis et al., 1998 and Elsner et al., 1999), namely singular spectrum analysis (SSA), or the method of main components (MMC).

The paper is organised as follows. First, two interesting discoveries in the time series analysis of annual average discharges are presented. These motivate applying the MMC to these time series. The algorithm of the MMC is explained in Sec. 3. Results are given in Sec. 4, followed by the conclusions.

2 River discharge time series analysis

A linear regression on the annual average river discharge data at Lobith (Rhine) shows (Fig. 1) that the slope of the mean trend is estimated at -2.67 × 10^-1 m3/s/yr, with 95% confidence bounds of [-1.97, 1.44] m3/s/yr. Because this interval contains zero, the trend is not statistically significant. Despite suggestions that rainfall is increasing in Western Europe because of climate change, the annual average river discharges at Lobith have remained fairly stable over the last century.

Fig. 1 Annual average discharges (in m3/s) over the period 1901-1996 at Lobith.
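The trend estimate above can be illustrated with a minimal Python sketch of an ordinary least-squares slope and its approximate 95% confidence bounds. The function name `trend_with_bounds` is hypothetical, the series is synthetic (it is not the Lobith record), and the bounds use a normal quantile rather than the exact Student-t quantile, which is an assumption that is reasonable only for series of roughly this length.

```python
import numpy as np
from statistics import NormalDist


def trend_with_bounds(years, discharge, level=0.95):
    """OLS slope of discharge against year with approximate confidence bounds."""
    years = np.asarray(years, dtype=float)
    discharge = np.asarray(discharge, dtype=float)
    n = years.size
    t = years - years.mean()                      # centred time axis
    sxx = np.sum(t ** 2)
    slope = np.sum(t * (discharge - discharge.mean())) / sxx
    intercept = discharge.mean() - slope * years.mean()
    residuals = discharge - (intercept + slope * years)
    sigma2 = np.sum(residuals ** 2) / (n - 2)     # residual variance
    std_err = np.sqrt(sigma2 / sxx)               # standard error of the slope
    # Assumption: normal quantile instead of the Student-t quantile.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return slope, (slope - z * std_err, slope + z * std_err)


# Synthetic stand-in for 96 years of annual average discharges (not real data).
rng = np.random.default_rng(0)
years = np.arange(1901, 1997)
discharge = 2200.0 + rng.normal(0.0, 470.0, size=years.size)

slope, (lower, upper) = trend_with_bounds(years, discharge)
print(f"slope = {slope:.2f} m3/s/yr, bounds = [{lower:.2f}, {upper:.2f}]")
print("trend significant:", not (lower <= 0.0 <= upper))
```

If the bounds straddle zero, as reported for Lobith, the data do not support a non-zero linear trend at that confidence level.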

[Plots for Figs 2a-2d: vertical axis "Spectral Density", horizontal axis "Time [yr]" (logarithmic, 10^0 to 10^3). Panel titles: Borgharen Annual Average Discharges; Koeln Annual Average Discharges; Vlotho Annual Average Discharges; Lobith Annual Average.]

Using the MATLAB software, the above-mentioned datasets are Fourier analysed, leading to the following results (Figs 2a, b, c, d).

Figs 2a, 2b, 2c, 2d Spectral densities of river discharge time series.

Notice that each of the above spectra contains a periodic component at 4.2 years, for which no explanation has been found yet. Fourier analysis is a suitable tool for detecting such components, but it cannot be used to predict the time series. The MMC algorithm has been developed for this purpose: it is related to Fourier analysis, but has the advantage that it can produce forecasts.
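As an illustration of how a periodic component can be read off a spectrum, the following Python sketch computes a raw periodogram with NumPy's FFT and reports the dominant period. It is not the authors' MATLAB analysis: the function name `periodogram` is hypothetical and the series is synthetic, with a 4.2-year cycle deliberately built in.

```python
import numpy as np


def periodogram(series):
    """Return periods (in sampling steps) and raw periodogram ordinates."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()                          # remove the mean (zero-frequency term)
    power = np.abs(np.fft.rfft(x)) ** 2 / x.size
    freqs = np.fft.rfftfreq(x.size, d=1.0)    # cycles per year for annual data
    return 1.0 / freqs[1:], power[1:]         # drop the zero frequency


# Synthetic annual series with an imposed 4.2-year cycle (not real data).
rng = np.random.default_rng(1)
t = np.arange(96)
synthetic = 2200.0 + 300.0 * np.sin(2.0 * np.pi * t / 4.2) + rng.normal(0.0, 100.0, t.size)

periods, power = periodogram(synthetic)
dominant_period = periods[np.argmax(power)]
print(f"dominant period: {dominant_period:.2f} years")
```

With only 96 samples the frequency grid is coarse, so the detected period can only fall on the nearest available bin rather than exactly on 4.2 years.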

3 MMC Algorithm

The basic algorithm of MMC consists of 4 parts. Each part will be considered in detail.

3.1 Reconstruction of 1D-series into 2D-series

Consider a time series $(x_i)$:

$$x_1, x_2, x_3, \ldots, x_M, \ldots, x_N \tag{1}$$

Let $N$ be its length. Choose a number $M$, known as the lag ($M < N$). The larger the lag, the longer the period of the fluctuations due to latent natural causes that can be observed, and the smaller the number of main components. Rewrite the 1D-series $(x_i)$ as a 2D-series, i.e. as a matrix $X$ with elements $x_{ij} = x_{i+j-1}$ and $k = N - M + 1$ rows:

$$X = (x_{ij})_{i=1,\,j=1}^{i=k,\,j=M} =
\begin{pmatrix}
x_1 & x_2 & x_3 & \cdots & x_M \\
x_2 & x_3 & x_4 & \cdots & x_{M+1} \\
\vdots & \vdots & \vdots & & \vdots \\
x_k & x_{k+1} & x_{k+2} & \cdots & x_N
\end{pmatrix} \tag{2}$$

This matrix may be viewed as a multi-dimensional (MD) time series of dimension $k$. Every column corresponds to a broken (non-straight) line consisting of $k-1$ segments:

$$\underbrace{(x_1; x_2), (x_2; x_3), \ldots, (x_{k-1}; x_k)}_{k-1} \tag{3}$$

The same procedure can be carried out for the rows; however, considering the columns is more suitable for the theoretical explanation.
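The embedding step of Sec. 3.1 can be sketched in Python as follows. The helper name `trajectory_matrix` is hypothetical; the only thing taken from the text is the layout of Eqn. (2), in which row i holds M consecutive values of the series starting at the i-th one.

```python
import numpy as np


def trajectory_matrix(series, lag):
    """Arrange a 1D series of length N into the k x M matrix of Eqn. (2)."""
    x = np.asarray(series, dtype=float)
    n = x.size
    if not 0 < lag < n:
        raise ValueError("the lag M must satisfy 0 < M < N")
    k = n - lag + 1                           # number of rows
    # Row i (0-based) holds x[i], x[i+1], ..., x[i+lag-1].
    return np.array([x[i:i + lag] for i in range(k)])


X = trajectory_matrix([1, 2, 3, 4, 5, 6], lag=3)
print(X)
```

Every anti-diagonal of the resulting matrix carries a single value of the original series, which is what makes it possible to return from the 2D form to a 1D series later on.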

3.2 Analysis of the main components: singular decomposition of the sample correlation matrix

Let us start with the calculation of the average values of the columns:

$$\bar{x}_j = \frac{1}{k}\sum_{i=1}^{k} x_{i+j-1} \tag{4}$$

and the standard deviation:

$$s_j = \sqrt{\frac{1}{k}\sum_{i=1}^{k}\left(x_{i+j-1} - \bar{x}_j\right)^2}. \tag{5}$$

The statistics found, $\bar{x}_j$ and $s_j$, are used in the transformation of the initial series. The main components (MC) will be determined from the centered matrix $X^* = (x^*_{ij})$:

$$x^*_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j}, \qquad i = 1, 2, \ldots, k; \quad j = 1, 2, \ldots, M. \tag{6}$$

Eqn. (6) expresses the centering of the values in every column. The values can also be centered for every column and for every row simultaneously. However, centering once leaves casual trends visible, whereas centering twice removes the information about the average trend. A single centering step is therefore recommended. As every column is a vector, the matrix X may be seen as a series of M k-dimensional vectors. Now the matrix R is to be calculated:

$$R = \frac{1}{k}\,(X^*)^T X^*, \tag{7}$$

$R$ is the correlation matrix, which consists of the elements $r_{ij}$:

$$r_{ij} = \frac{1}{k}\,\frac{1}{s_i s_j}\sum_{l=1}^{k}\left(x_{i+l-1} - \bar{x}_i\right)\left(x_{j+l-1} - \bar{x}_j\right) \tag{8}$$

Note that in the space of columns every element $r_{ij}$ is the cosine of the angle between centered $k$-dimensional vectors. The next part of the second step is the calculation of the eigenvalues and eigenvectors of the matrix $R$, i.e. its decomposition, as follows:

$$R = P \Lambda P^T, \tag{9}$$

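where $\Lambda$ is the diagonal matrix of eigenvalues:

$$\Lambda =
\begin{pmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_M
\end{pmatrix}, \tag{10}$$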

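and $P$ is the orthogonal matrix of eigenvectors of the matrix $R$:

$$P = (p_1, p_2, \ldots, p_M) =
\begin{pmatrix}
p_{11} & p_{21} & \cdots & p_{M1} \\
p_{12} & p_{22} & \cdots & p_{M2} \\
\vdots & \vdots & \ddots & \vdots \\
p_{1M} & p_{2M} & \cdots & p_{MM}
\end{pmatrix}. \tag{11}$$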

The matrix $P$ acts as a transition matrix to the main components $y_i$:

$$X^* P = Y = (y_1, y_2, \ldots, y_M). \tag{12}$$

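There is a very useful interpretation of the eigenvalues $\lambda_i$. The $\lambda_i$-values are the sample variances of the corresponding main components $y_i$, and they characterise the lengths of the axes of the ellipsoid described by the matrix $R$. Because $R$ is a correlation matrix with unit diagonal, its trace equals $M$, and thus $\sum_{j=1}^{M} \lambda_j = M$.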
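The steps of Sec. 3.2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Caterpillar implementation: the function name `main_components`, the lag value and the synthetic series are all hypothetical, and the eigenvalues are sorted in descending order purely for convenience.

```python
import numpy as np


def main_components(X):
    """Standardise the columns of X and diagonalise their correlation matrix."""
    X = np.asarray(X, dtype=float)
    k = X.shape[0]
    col_mean = X.mean(axis=0)                 # Eqn. (4)
    col_std = X.std(axis=0)                   # Eqn. (5): 1/k normalisation (ddof=0)
    X_star = (X - col_mean) / col_std         # Eqn. (6)
    R = X_star.T @ X_star / k                 # Eqn. (7)
    eigvals, P = np.linalg.eigh(R)            # Eqn. (9); eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # largest eigenvalue first
    eigvals, P = eigvals[order], P[:, order]
    Y = X_star @ P                            # Eqn. (12)
    return eigvals, P, Y


# Synthetic stand-in for a discharge record (not real data).
rng = np.random.default_rng(0)
series = 2200.0 + rng.normal(0.0, 470.0, size=96)
M = 10                                        # hypothetical lag
k = series.size - M + 1
X = np.array([series[i:i + M] for i in range(k)])

eigvals, P, Y = main_components(X)
print("sum of eigenvalues:", eigvals.sum())   # should be close to M
```

`numpy.linalg.eigh` is used because R is symmetric; the check on the sum of the eigenvalues is a quick way to confirm that the columns were standardised with the same 1/k normalisation as in Eqn. (5).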

3.3 Selection of the main components

As the main components are orthonormal ($Y^{*T} Y^* = I_M$), the initial series is decomposed into natural orthogonal components. Every vector $y_j$ may be associated with the corresponding eigenvector $p_j$, because $y_j$ is the result of projecting the initial $M$-dimensional series onto the direction generated by $p_j$. At the same time, the operation $y_j = X^* p_j$ is analogous to a linear transformation of the initial process by a convolution ("roll") operator. The MMC thus generates a set of filters that tune themselves to the components of the initial series.

Attention should be paid to the Nyquist frequency, which becomes relevant when the natural processes are measured too sparsely. Too few measurements lead to the appearance of false harmonics (aliasing) which do not have a natural prototype.

3.4 Reconstruction of 1D-initial series

The reconstruction procedure is the key part of the MMC. It is based on simple relationships. As follows from Eqn. (12), the matrix $X^*$ may be reconstructed by multiplying $Y$ by $P^T$:

$$X^* = Y P^T = (y_1, y_2, \ldots, y_M) \cdot
\begin{pmatrix} p_1^T \\ p_2^T \\ \vdots \\ p_M^T \end{pmatrix}
= \sum_{l=1}^{M} y_l\, p_l^T = \sum_{l=1}^{M} X^*_l. \tag{13}$$

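Furthermore,

$$X = \mathbf{1}_k\, \bar{x}^T + X^* S = X_0 + \sum_{l=1}^{M} X^*_l S = \sum_{l=0}^{M} X_l. \tag{14}$$

Here, consistent with Eqn. (6), $S$ is taken to be the diagonal matrix of the column standard deviations $s_j$, $X_0 = \mathbf{1}_k\, \bar{x}^T$ is the matrix of column means, and $X_l = X^*_l S$ for $l \geq 1$. We obtain the decomposition of the initial series as the sum of $(M+1)$ series.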
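The decomposition of Eqns. (13) and (14) can be checked numerically with a short Python sketch. It assumes that S is the diagonal matrix of column standard deviations, uses a synthetic series and a hypothetical lag, and stops at the matrix level: the step from a reconstructed matrix back to a 1D series is not shown.

```python
import numpy as np

# Synthetic stand-in for a discharge record (not real data).
rng = np.random.default_rng(2)
series = 2200.0 + rng.normal(0.0, 470.0, size=60)
M = 8                                          # hypothetical lag
k = series.size - M + 1
X = np.array([series[i:i + M] for i in range(k)])

col_mean, col_std = X.mean(axis=0), X.std(axis=0)
X_star = (X - col_mean) / col_std              # Eqn. (6)
eigvals, P = np.linalg.eigh(X_star.T @ X_star / k)
P = P[:, np.argsort(eigvals)[::-1]]            # largest eigenvalue first
Y = X_star @ P                                 # Eqn. (12)

X0 = np.ones((k, 1)) * col_mean                # matrix of column means
# One elementary matrix per component: y_l p_l^T rescaled by the column
# standard deviations (assumption: S = diag(s_1, ..., s_M)).
components = [np.outer(Y[:, l], P[:, l]) * col_std for l in range(M)]

X_full = X0 + sum(components)                  # all M components, Eqn. (14)
X_partial = X0 + sum(components[:3])           # smoothed version from 3 components

print("full sum reproduces X:", np.allclose(X_full, X))
```

Keeping only the leading components, as in `X_partial`, is the filtering step: the discarded components are treated as noise.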

3.5 The lag in the MMC

As explained in the previous sections, the main parameter of the model is the so-called lag ($M$). Here we meet a contradiction. On the one hand, the theory requires that $M < N/2$. On the other hand, experience from hydrological time series analysis shows that increasing the lag decreases the average error of the forecast. This fact can be explained from a hydrological point of view. Hydrological processes may be generated by long-term fluctuations whose periods are much longer than the duration of the observations. A short lag has difficulty detecting these fluctuations. Short-term fluctuations are reflected well by a short lag, but they do not define the statistical properties of hydrological processes.

The described paradox may be approached in the following way. First we compute the eigenvalues by selecting a maximum lag (for example, N - 10). After that we determine the maximum on the periodogram and, finally, we choose the lag equal to the given period.

4 Results

The MMC leads to the following eigenfunctions, principal components and reconstruction (Figs. 3, 4 and 5) of the Lobith data:

Fig. 3: First 6 eigenfunctions of the Lobith dataset

Fig. 4: Principal components of Lobith

Fig. 5: Reconstructed series of Lobith (based on the first 6 components and a lag of 80).

[Plot for Fig. 6a: Sorted Annual Average Discharges @ Lobith + Normal Distribution Fit; horizontal axis: Discharge [m3/s] (1000-3500); vertical axis: CDF (0-1).]

[Plot for Fig. 6b: Lobith; observed and forecasted; horizontal axis: Year (87-96); vertical axis: Discharge (1000-3500); legend: Observed, Forecasted.]

A forecasting experiment is performed on the Lobith data. The base series for the forecast is 1901-1986. The annual average discharges of 10 years will be forecasted and compared with the measured discharges. The results are given in the following table:

Year   Observed   Forecasted   Difference   Abs Diff
87     2967.5     3205.9        -8.0         8.0
88     1887.2     2319.9       -22.9        22.9
89     1942.9     1679.1        13.6        13.6
90     1938.9     1332.8        31.3        31.3
91     2121.2     2718.1       -28.1        28.1
92     2040.8     2176.6        -6.7         6.7
93     1875.6     1673.2        10.8        10.8
94     1952.5     1574.7        19.3        19.3
95     2077       2929.2       -41.0        41.0
96     2240       2242.0        -0.1         0.1

Mean Abs Diff                               18.2

Table 1: Lobith results (discharges are in m3/s, differences are in %).

Figs 6a, 6b: Variability of the Lobith data and comparison between observed and forecasted data over a period of 10 years.

Note the high absolute errors in the forecasts over the period of 10 years. The average difference is 18.2% and the maximum difference is 41.0% (for the year 1995). According to Fig. 6a, the mean annual average discharge at Lobith is 2217 m3/s and the standard deviation is 471 m3/s (coefficient of variation 21%). A 10-year prediction of constant discharge (the mean) would lead to an average difference of 14%. However, the directions of the fluctuations in the observed and forecasted discharges appear to be quite similar (see Fig. 6b). This is reflected in the correlation coefficient between observed and forecasted values, which is 0.70.
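The summary figures quoted above can be recomputed from the values in Table 1 with a few lines of Python. The sketch assumes that the Difference column is defined as (observed - forecasted) / observed, which is the convention that reproduces the tabulated percentages; the variable names are illustrative.

```python
import numpy as np

# Observed and forecasted annual average discharges at Lobith, 1987-1996 (Table 1).
observed = np.array([2967.5, 1887.2, 1942.9, 1938.9, 2121.2,
                     2040.8, 1875.6, 1952.5, 2077.0, 2240.0])
forecast = np.array([3205.9, 2319.9, 1679.1, 1332.8, 2718.1,
                     2176.6, 1673.2, 1574.7, 2929.2, 2242.0])

# Assumed convention: difference relative to the observed value, in percent.
difference_pct = 100.0 * (observed - forecast) / observed
mean_abs_diff = np.mean(np.abs(difference_pct))
worst_year = 1987 + int(np.argmax(np.abs(difference_pct)))
correlation = np.corrcoef(observed, forecast)[0, 1]

print(f"mean absolute difference: {mean_abs_diff:.1f} %")
print(f"largest difference in {worst_year}")
print(f"correlation coefficient: {correlation:.2f}")
```

Reporting the correlation alongside the mean absolute difference separates two questions: how far off the forecasts are in magnitude, and whether they move in the same direction as the observations.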

If we concentrate on 1-year forecasting, the following satisfactory results are obtained (Table 2):

Year   Observed   Forecasted   Error   Abs Error
81     2608       2100         -19.5   19.5
82     2378       2466           3.7    3.7
83     2841       2606          -8.3    8.3
84     2187       2345           7.2    7.2
85     2319       2347           1.2    1.2
86     1746       2079          19.1   19.1
87     2968       2685          -9.5    9.5
88     1887       1792          -5.0    5.0
89     1943       2077           6.9    6.9
90     1939       1682         -13.3   13.3
91     2121       2248           6.0    6.0
92     2041       2008          -1.6    1.6
93     1876       1921           2.4    2.4
94     1953       2073           6.1    6.1
95     2077       2212           6.5    6.5
96     2240       2468          10.2   10.2

Average Err                             7.9

Table 2: Results of 1-year forecasting.

The results are obtained using 6 principal components. Notice that there are years with a poor forecast (1981, 1986) in which the error is larger than if the mean discharge had been predicted. However, the average error is 7.9%, which is significantly smaller than 14%. Finally, results are shown for the forecasts of the last year of the datasets of the other locations (Table 3):

No.  Name       Length  #PC  Lag  Forecasted  Observed  Rel err %  Size area
1.   Lobith         96   17   45        2329      2240        3.9     106800
2.   Koeln         182    6   90        1961      2001        2.0     144323
3.   Vlotho        175   20   85         164       286       42.5      17618
4.   Borgharen      80   25   39         182       172        5.3      21300

Table 3: Errors in forecasting the annual average discharge of the last year (using all data up to and including the last year - 1).

Notice that by increasing the number of principal components (from 6 to 17), the relative error in the 1996 forecast at Lobith decreased from 10.2% to 3.9%. However, the optimal number of principal components is difficult to determine, and a sensitivity analysis is suggested for this purpose (as well as for the determination of the lag size). Furthermore, notice from Table 3 that the 1994 forecast at Vlotho was quite poor (42.5% error); this is partly caused by stochastic variability (the 1993 forecast at Vlotho has a 21.8% error) and partly by the multi-peaked spectrum at Vlotho (Fig 2b).

Conclusions

In this paper the method of main components (MMC) is described and applied to the long-term forecasting of annual average river discharges. It was shown in the case studies of different rivers that the MMC can provide a fairly reliable forecast of the average discharges up to a few years ahead. The directions of the fluctuations are reflected reasonably well by the MMC up to a decade. Furthermore, it was interesting to note a stable (non-increasing, non-decreasing) behaviour in the annual average discharges over time (on the order of a century), and finally an as yet unexplained periodic component of 4.2 years was detected in the annual average discharges of different rivers.

Acknowledgements

The authors thank the developers of the Caterpillar software for the possibility to apply their software to the river flow datasets. The Global Runoff Data Center in Koblenz (DE) is also acknowledged for providing the annual average river discharge datasets. The work was performed under project code C71-CC5252.

Literature

1. Elsner JB, Kara AB, Owens MA, Fluctuations in North Atlantic hurricane frequency, JOURNAL OF CLIMATE, 12: (2) 427-437 FEB 1999

2. Tsonis AA, Roebber PJ, Elsner JB, A characteristic time scale in the global temperature record, GEOPHYSICAL RESEARCH LETTERS, 25: (15) 2821-2823 AUG 1 1998

3. Marple, S. Lawrence (Jr.), Digital spectral analysis: with applications, Publisher Englewood Cliffs: Prentice-Hall, 1996

4. Elsner, James B., Tsonis, Anastasios A., Singular spectrum analysis; a new tool in time series analysis, Publisher New York: Plenum, 1996, ISBN 0-306-45472-6, 164 pp.

5. Vadim Kuzmin, Pieter van Gelder, Hafzullah Aksoy and Ismail Kucuk, APPLICATION OF THE STOCHASTIC SELF-TRAINING METHOD FOR THE MODELING OF EXTREME FLOODS, THE EXTREMES OF THE EXTREMES, July 17-19, 2000, Grand Hótel Reykjavík, Iceland.