On the multiple breakpoint problem and the number of significant breaks in homogenisation of climate records

Separation of true from spurious breaks

Ralf Lindau & Victor Venema, University of Bonn

12th EMS Annual Meeting, Lodz, Poland – 13. September 2012

Internal and External Variance

Consider the differences of one station compared to a neighbour or a reference.

Breaks are defined by abrupt changes in the station-reference time series.

Internal variance: within the subperiods

External variance: between the means of different subperiods

Criterion: maximum external variance attained by a minimum number of breaks
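A minimal sketch of the two quantities, assuming the difference series is a plain list of yearly values and that breaks are given as the indices where a new subperiod starts. The function name `variance_split` and the sample values are illustrative only.

```python
def variance_split(x, breaks):
    """Return (external, internal) variance of x for the given break indices."""
    n = len(x)
    mean = sum(x) / n
    bounds = [0] + sorted(breaks) + [n]
    external = 0.0
    internal = 0.0
    for start, end in zip(bounds[:-1], bounds[1:]):
        seg = x[start:end]
        seg_mean = sum(seg) / len(seg)
        # external: spread of the subperiod means around the overall mean
        external += len(seg) * (seg_mean - mean) ** 2
        # internal: spread of the years around their own subperiod mean
        internal += sum((xi - seg_mean) ** 2 for xi in seg)
    return external / n, internal / n


# Hypothetical station-minus-reference differences with one jump after year 4
diffs = [0.1, -0.2, 0.0, 0.1, 1.1, 0.9, 1.0, 1.2]
ext_var, int_var = variance_split(diffs, [4])
print(ext_var, int_var)
```

A break placed at the jump puts almost all of the variance into the external part, which is what the criterion rewards.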

Decomposition of Variance

n: total number of years
N: number of subperiods
$n_i$: years within a subperiod

The sum of external and internal variance is constant.
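Written out, the standard decomposition behind this statement reads as follows. The notation $x_{ij}$ for year $j$ of subperiod $i$, and the bars for means, are assumptions introduced here, since the slide's own formula did not survive in the transcript.

$$
\underbrace{\frac{1}{n}\sum_{i=1}^{N}\sum_{j=1}^{n_i}\left(x_{ij}-\bar{x}\right)^2}_{\text{total}}
=
\underbrace{\frac{1}{n}\sum_{i=1}^{N} n_i\left(\bar{x}_i-\bar{x}\right)^2}_{\text{external}}
+
\underbrace{\frac{1}{n}\sum_{i=1}^{N}\sum_{j=1}^{n_i}\left(x_{ij}-\bar{x}_i\right)^2}_{\text{internal}}
$$

The left-hand side does not depend on where the breaks are placed, so maximising the external variance is the same as minimising the internal variance.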

First Question

How do random data behave?

Needed as a stop criterion for the number of significant breaks.

Random Time Series

with stddev = 1

Segment averages $\bar{x}_i$ scatter randomly

mean: 0

stddev: $1/\sqrt{n_i}$

Because any deviation from zero can be seen as inaccuracy due to the limited number of members.
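The scatter of the segment averages can be checked with a small simulation. This is only a sketch: the segment length `n_i = 25`, the number of trials and the random seed are arbitrary choices, not values from the talk.

```python
import random
import statistics

random.seed(0)
n_i = 25        # years per segment (illustrative)
trials = 4000   # number of simulated segments

# Average of n_i standard-normal values, repeated many times
seg_means = [
    statistics.fmean([random.gauss(0.0, 1.0) for _ in range(n_i)])
    for _ in range(trials)
]

observed_mean = statistics.fmean(seg_means)
observed_std = statistics.pstdev(seg_means)
expected_std = 1.0 / n_i ** 0.5

print(observed_mean, observed_std, expected_std)
```

The observed standard deviation should lie close to $1/\sqrt{n_i}$, and the mean close to zero.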

χ²-distribution

The external variance is equal to the mean square sum of a random, standard normally distributed variable.

Weighted measure for the variability of the subperiods' means
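A worked step may help here. It is a sketch under the assumption of independent data with unit variance and $k$ breaks, hence $k+1$ subperiods. The symbols $z_i$ and $\mathrm{var}_{\mathrm{ext}}$ are introduced for illustration only.

$$
z_i = \sqrt{n_i}\,\bar{x}_i \sim \mathcal{N}(0,1),
\qquad
n\,\mathrm{var}_{\mathrm{ext}} = \sum_{i=1}^{k+1} n_i\left(\bar{x}_i-\bar{x}\right)^2
= \sum_{i=1}^{k+1} z_i^2 - n\,\bar{x}^2 \;\sim\; \chi^2_k
$$

One degree of freedom is lost because the overall mean $\bar{x}$ is estimated from the same data.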

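From χ² to β distribution

n = 21 years, k = 7 breaks

As the total variance is normalized to 1, a kind of normalized χ²-distribution is expected. This is the β-distribution:

$$
p(v) = \frac{v^{\frac{k}{2}-1}\,(1-v)^{\frac{n-k-1}{2}-1}}{B\!\left(\frac{k}{2},\,\frac{n-k-1}{2}\right)}
$$

[Figure legend: data]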

The exceeding probability P gives the best (maximum) solution for v

Incomplete Beta Function

$$
P(v) = 1 - \int_0^{v} p(v')\,dv'
$$

[Figure: 7 breaks in 21 years]
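The density and its exceeding probability can be evaluated numerically. The sketch below uses only the standard library and a simple midpoint rule; the function names `beta_pdf` and `exceedance` and the step count are assumptions made for this example. It assumes $k \ge 2$ so that the density stays finite near $v = 0$.

```python
import math


def beta_pdf(v, n, k):
    """Density of the normalised external variance v for n years and k breaks."""
    a = k / 2.0
    b = (n - k - 1) / 2.0
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)  # ln B(a, b)
    return math.exp((a - 1) * math.log(v) + (b - 1) * math.log(1 - v) - log_norm)


def exceedance(v, n, k, steps=20000):
    """P(v) = 1 - integral of the density from 0 to v (midpoint rule), 0 < v < 1."""
    h = v / steps
    cdf = sum(beta_pdf((j + 0.5) * h, n, k) for j in range(steps)) * h
    return 1.0 - cdf


# Example from the slide: 7 breaks in 21 years
for v in (0.2, 0.35, 0.5, 0.7):
    print(v, exceedance(v, 21, 7))
```

The mean of this distribution is $k/(n-1)$, which is 0.35 for the slide's example, so exceeding probabilities well below one half require clearly larger values of v.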

Added variance per break

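Incomplete β-function:

$$
P(v) = \sum_{l=0}^{i-1}\binom{m}{l}\,v^{l}\,(1-v)^{m-l},
\qquad m = \frac{n-3}{2}, \quad i = \frac{k}{2}
$$

Transformation to dv/dk:

[Equation: relative gain of external variance per added break, $\frac{1}{1-v}\,\frac{dv}{dk^*}$, as a function of $k^*$]

[Figure: curves for the mean and the 90% and 95% levels]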
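The finite sum above can be inverted numerically to find the external variance that belongs to a chosen level. The sketch below assumes an odd n and an even k, so that m and i are integers. The names `exceedance_sum` and `variance_at_level` are hypothetical, and reading the 90% and 95% curves as such levels is an interpretation, not something the transcript states.

```python
import math


def exceedance_sum(v, n, k):
    """Finite-sum form of the exceeding probability (n odd, k even)."""
    m = (n - 3) // 2
    i = k // 2
    return sum(math.comb(m, l) * v ** l * (1 - v) ** (m - l) for l in range(i))


def variance_at_level(level, n, k, tol=1e-10):
    """Bisection for the v whose exceeding probability equals 1 - level."""
    target = 1.0 - level
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # the exceeding probability decreases monotonically with v
        if exceedance_sum(mid, n, k) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


n_years = 21
for k_breaks in (2, 4, 6, 8, 10):
    v90 = variance_at_level(0.90, n_years, k_breaks)
    v95 = variance_at_level(0.95, n_years, k_breaks)
    print(k_breaks, v90, v95)
```

Differences of these values between successive break numbers give a discrete estimate of the added variance per break.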

The existing algorithm Prodige

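Original formulation of Caussinus and Mestre for the penalty term in Prodige:

$$
C_k(Y) = \ln\left(1 - \frac{\sum_{j=1}^{k+1} n_j\left(\bar{Y}_j - \bar{Y}\right)^2}{\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2}\right) + \frac{2k}{n-1}\,\ln(n) \;\to\; \min
$$

Translation into terms used by us:

$$
\ln(1 - v) + \frac{2k}{n-1}\,\ln n \;\to\; \min
$$

Normalisation by k* = k / (n - 1):

$$
\ln(1 - v) + 2k^*\,\ln n \;\to\; \min
$$

Differentiation with respect to k* to find the minimum:

$$
-\frac{1}{1-v}\,\frac{dv}{dk^*} + 2\ln n = 0
$$

In Prodige it is postulated that the relative gain of external variance is a constant for a given n:

$$
\frac{1}{1-v}\,\frac{dv}{dk^*} = 2\ln n
$$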
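The criterion can be evaluated for a given segmentation as sketched below. This is not the Prodige implementation: it only evaluates the logarithmic term plus a penalty of $2k\ln(n)/(n-1)$ for $k$ breaks, with breaks passed as the indices where a new subperiod starts. The function name and the sample series are made up for illustration.

```python
import math


def caussinus_mestre(y, breaks):
    """ln(1 - external/total variance) + 2k/(n-1) * ln(n) for k = len(breaks)."""
    n = len(y)
    mean = sum(y) / n
    total = sum((yi - mean) ** 2 for yi in y)
    bounds = [0] + sorted(breaks) + [n]
    external = 0.0
    for start, end in zip(bounds[:-1], bounds[1:]):
        seg = y[start:end]
        external += len(seg) * (sum(seg) / len(seg) - mean) ** 2
    k = len(breaks)
    # Undefined if the segmentation explains all variance (log of zero)
    return math.log(1.0 - external / total) + 2.0 * k / (n - 1) * math.log(n)


# Hypothetical series with one clear jump in the middle
y = [0.0, 0.1, -0.1, 0.0, 0.1, 5.0, 5.1, 4.9, 5.0, 5.1]
c_none = caussinus_mestre(y, [])
c_one = caussinus_mestre(y, [5])
c_two = caussinus_mestre(y, [3, 5])
print(c_none, c_one, c_two)
```

For this series the single break at the jump gives the lowest value. The second break gains too little external variance to pay for its penalty.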

Shorter length, less certainty

n = 21 years | n = 101 years

Exceeding probability: 1/128, 1/64, 1/32, 1/16, 1/8, 1/4

Second Question

How do true breaks behave?

True Breaks

Identical Behaviour

True breaks behave identically to random data.

But the abscissa scale is now $k / n_k$ instead of $k / n$, where $n_k$ is the number of true breaks.

Compared to random time series, the external variance grows faster by the factor $n / n_k$.

[Figure: data and theory for $n_k = 19$ true breaks within a time series of n = 100 years; abscissa: assumed / true break number $k / n_k$]

Break vs Scatter Regime

Simulated data with 19 breaks, disturbed by scatter

The internal variance decreases as a function of break number.

In the break regime the variance decreases faster by the factor $n / n_k$.

15 breaks are detectable, depending on the signal-to-noise ratio.
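A curve of this kind can be reproduced with a small experiment. The sketch below searches for the optimal break positions with dynamic programming, which is a choice made for this example and is not mentioned on the slides. The noise level of 0.5, the random seed and the function name `min_internal_ss` are likewise assumptions.

```python
import random


def min_internal_ss(x, max_breaks):
    """best[k] = smallest internal sum of squares reachable with k breaks."""
    n = len(x)
    s1, s2 = [0.0], [0.0]            # prefix sums of x and x^2
    for xi in x:
        s1.append(s1[-1] + xi)
        s2.append(s2[-1] + xi * xi)

    def cost(a, b):
        """Sum of squared deviations of x[a:b] from its own mean."""
        s = s1[b] - s1[a]
        return (s2[b] - s2[a]) - s * s / (b - a)

    # prev[e]: optimum for x[:e] with the current number of breaks
    prev = [0.0] + [cost(0, e) for e in range(1, n + 1)]
    best = [prev[n]]
    for j in range(1, max_breaks + 1):
        cur = [float("inf")] * (n + 1)
        for e in range(j + 1, n + 1):
            # last segment is x[s:e]; the first s values hold j - 1 breaks
            cur[e] = min(prev[s] + cost(s, e) for s in range(j, e))
        best.append(cur[n])
        prev = cur
    return best


random.seed(1)
n, n_true = 100, 19
positions = sorted(random.sample(range(1, n), n_true))
levels = [random.gauss(0.0, 1.0) for _ in range(n_true + 1)]
series, seg = [], 0
for t in range(n):
    if seg < n_true and t == positions[seg]:
        seg += 1                      # a true break: switch to the next level
    series.append(levels[seg] + random.gauss(0.0, 0.5))

curve = min_internal_ss(series, 30)
internal = [ss / curve[0] for ss in curve]   # normalised by the total variance
print([round(v, 3) for v in internal])
```

Plotting `internal` against the assumed break number should show a steep initial decrease followed by a flatter tail. Where the curve bends depends on the chosen noise level.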

[Figure panels: time series length; number of true breaks]

Conclusions

• The analysis of random data shows that the external variance is β-distributed, which leads to a new formulation for the penalty term.

• True breaks are also β-distributed. Their external variance increases faster by a factor of $n / n_k$ compared to random scatter.
