statistical validation of numerical models: some methods
Ricardo Lemos
subject index
1. data setup
2. standard methods of model validation
3. model validation for a single location - time-series analysis
4. model validation for a single instant - spatial data analysis
5. model validation for variable space and time - spatiotemporal data analysis
6. summary
1. data setup
a) deterministic model already calibrated – the whole dataset is used to validate the model
b) deterministic model needs calibration – data subsetting according to the purpose of the numerical model (description vs. prediction)
[Figure: calibration and validation subsets of the data in space and time. Description: random subsampling. Prediction: subsampling with the aim of forecasting.]
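The two subsetting strategies can be sketched as follows. This is a minimal illustration along the time axis only; the function names, the 30% validation fraction and the seed are assumptions, not part of the original slides.

```python
import random

def random_split(n, validation_fraction=0.3, seed=0):
    """Description: validation points are drawn at random from the whole record."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    n_val = int(round(n * validation_fraction))
    validation = sorted(indices[:n_val])
    calibration = sorted(indices[n_val:])
    return calibration, validation

def forecast_split(n, validation_fraction=0.3):
    """Prediction: calibrate on the early part, validate on the final part."""
    n_val = int(round(n * validation_fraction))
    cut = n - n_val
    return list(range(cut)), list(range(cut, n))

cal_r, val_r = random_split(10)
cal_f, val_f = forecast_split(10)
print("random subsampling:", cal_r, val_r)
print("subsampling for forecasting:", cal_f, val_f)
```

With the forecasting split, every calibration index precedes every validation index, so the validation step mimics a true forecast.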
Chang, J.C., Hanna, S.R., 2004. Air quality model performance evaluation. Meteorol Atmos Phys 87: 167–196
«Because there is not a single best performance measure or best evaluation methodology, it is recommended that a suite of different performance measures be applied.» (Chang and Hanna, 2004)
2. standard methods of model validation
WWRP/WGNE (World Weather Research Program / Working Group on Numerical Experimentation) Joint Working Group on Verification
Forecast Verification - Issues, Methods and FAQ
http://www.bom.gov.au/bmrc/wefor/staff/eee/verif/verif_web_page.shtml
contents of the web site:
Introduction - what is this web site about?
Issues: Why verify?; Types of forecasts and verification; What makes a forecast good?; Forecast quality vs. value; What is "truth"?; Validity of verification results; Pooling vs. stratifying results
Methods:
Standard verification methods: Methods for dichotomous (yes/no) forecasts; Methods for multi-category forecasts; Methods for forecasts of continuous variables; Methods for probabilistic forecasts
Scientific or diagnostic verification methods: Methods for spatial forecasts; Methods for probabilistic forecasts, including ensemble prediction systems; Other methods
Sample forecast datasets: Finley tornado forecasts; Sydney 2000 Forecast Demonstration Project radar-based rainfall nowcasts; ..... climate example .....
Some Frequently Asked Questions
Discussion group
References: Links to other verification sites; References and further reading
Contributors to this site
methods:
a) compare raw model predictions and observed data
b) analyse residuals (observed – predicted)
a) compare raw model predictions and observed data
a1) “eyeball” verification
a2) straightforward statistical analysis - steps:
i. define important features of the data
ii. quantify them in some way - “statistical probes” (Kendall et al., 1999)
iii. investigate to what extent those features are captured by the model
[Table: statistical probes, each evaluated for the data and for the model: mean, variance, min, max, lag-1 autocorrelation, amplitude of periodical fluctuation (e.g., period = 29.5 d), phase, trend]
Kendall B.E., et al., 1999. Why do populations cycle? A synthesis of statistical and mechanistic modeling approaches. Ecology 80(6): 1789-1805
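A minimal sketch of how such probes could be computed with NumPy, assuming a regularly sampled series and a single known period. The function name, the synthetic series and the choice of a joint least-squares fit for trend, amplitude and phase are illustrative assumptions, not the procedure of Kendall et al. (1999).

```python
import numpy as np

def statistical_probes(x, t, period):
    """Return the probes listed above for one series (data or model)."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    anomalies = x - x.mean()
    lag1 = np.sum(anomalies[:-1] * anomalies[1:]) / np.sum(anomalies ** 2)
    # least-squares fit of intercept, linear trend and one harmonic
    design = np.column_stack([
        np.ones_like(t),
        t,
        np.sin(2 * np.pi * t / period),
        np.cos(2 * np.pi * t / period),
    ])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    _, trend, a_sin, a_cos = coef
    return {
        "mean": x.mean(),
        "variance": x.var(ddof=1),
        "min": x.min(),
        "max": x.max(),
        "lag1_autocorrelation": lag1,
        "amplitude": np.hypot(a_sin, a_cos),
        "phase": np.arctan2(a_cos, a_sin),  # signal = A sin(2 pi t / period + phase)
        "trend": trend,
    }

# hypothetical daily series: the "model" underestimates the amplitude and is out of phase
t = np.arange(0.0, 120.0)
data = 15 + 0.01 * t + 2.0 * np.sin(2 * np.pi * t / 29.5)
model = 15 + 0.01 * t + 1.5 * np.sin(2 * np.pi * t / 29.5 + 0.3)
probes_data = statistical_probes(data, t, 29.5)
probes_model = statistical_probes(model, t, 29.5)
for name in probes_data:
    print(f"{name:22s} data={probes_data[name]: .3f} model={probes_model[name]: .3f}")
```

Printing the two rows side by side reproduces the layout of the probe table: one row for the data, one for the model.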
b) analyse residuals (observed – predicted)
b1) “eyeball” verification:
[Figure: three panels of residuals plotted against time or space; panel labels: validated model, overfitted?, incomplete?]
b2) straightforward statistical analysis: autocorrelation plot, periodogram
rationale: if the model performs well, it should closely follow the observations and leave only white noise in the residuals
[Figure: autocorrelation plot and periodogram of the residuals; annotations: significant lag-1 autocorrelation in residuals, unmatched periodicity]
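The two diagnostics can be sketched with NumPy alone. The residual series below is synthetic (white noise plus a periodicity that a hypothetical model failed to reproduce), and the ±1.96/√n bound is the usual large-sample approximation for white noise.

```python
import numpy as np

def autocorrelation(residuals, max_lag):
    """Sample autocorrelation of the residuals for lags 1..max_lag."""
    r = np.asarray(residuals, dtype=float)
    r = r - r.mean()
    denom = np.sum(r ** 2)
    return np.array([np.sum(r[: len(r) - k] * r[k:]) / denom
                     for k in range(1, max_lag + 1)])

def periodogram(residuals, dt=1.0):
    """Raw periodogram; the zero frequency is dropped."""
    r = np.asarray(residuals, dtype=float)
    r = r - r.mean()
    power = np.abs(np.fft.rfft(r)) ** 2 / len(r)
    freqs = np.fft.rfftfreq(len(r), d=dt)
    return freqs[1:], power[1:]

rng = np.random.default_rng(1)
n = 400
t = np.arange(n)
white = rng.normal(0.0, 1.0, n)                          # residuals of a good model
incomplete = white + 1.5 * np.sin(2 * np.pi * t / 25.0)  # unmatched periodicity

bound = 1.96 / np.sqrt(n)                                # approximate 95% white-noise bound
acf_white = autocorrelation(white, 5)
acf_incomplete = autocorrelation(incomplete, 5)
freqs, power = periodogram(incomplete)
peak_period = 1.0 / freqs[np.argmax(power)]
print("lag-1 autocorrelation:", acf_white[0], acf_incomplete[0], "bound:", bound)
print("dominant residual period:", peak_period)
```

A lag-1 autocorrelation outside the bound, or a clear periodogram peak, suggests that the residuals are not white noise and that the model is missing some structure.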
3. model validation for a single location - time-series analysis
methods:
i. compare the performance of the numerical model with the performance of statistical time-series models:
a) Autoregressive Integrated Moving Average Models (ARIMA)
b) Bayesian Dynamic Linear Models (DLM)
c) Analogue Forecasting Models
this method requires subsetting in order to build the statistical models.
ii. examine in detail the performance of the numerical model
a) models for known periodicities
b) bootstrap R
c) process convolutions
i. compare the performance of the numerical model with the performance of statistical time-series models:
a) Autoregressive Integrated Moving Average Models (ARIMA; Box and Jenkins, 1976)
$X_{t+1} = 0.9 X_t - 0.2 X_{t-1} + \varepsilon_t$, with $\varepsilon_t \sim N(0, 1.2)$
ARIMA models can contain seasonal components.
[Figure: temperature T [ºC] against time, with forecasts at t+1, t+2 and t+3]
Box, G.E.P., Jenkins, G.M., 1976. Time Series Analysis: forecasting and control. Holden Day, Oakland, CA.
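As a minimal sketch, the second-order autoregressive example on this slide (coefficients 0.9 and -0.2) can be simulated and used for forecasts one to three steps ahead. The function names are hypothetical, and the second argument of N(0, 1.2) is assumed here to be a variance.

```python
import math
import random

def simulate_ar2(n, phi1=0.9, phi2=-0.2, noise_var=1.2, seed=0):
    """Simulate n values of X[t+1] = phi1*X[t] + phi2*X[t-1] + noise."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n):
        noise = rng.gauss(0.0, math.sqrt(noise_var))
        x.append(phi1 * x[-1] + phi2 * x[-2] + noise)
    return x[2:]

def ar2_forecast(x_prev, x_curr, steps, phi1=0.9, phi2=-0.2):
    """Point forecasts for t+1..t+steps; future noise is replaced by its mean, 0."""
    forecasts = []
    for _ in range(steps):
        x_next = phi1 * x_curr + phi2 * x_prev
        forecasts.append(x_next)
        x_prev, x_curr = x_curr, x_next
    return forecasts

series = simulate_ar2(200)
forecasts = ar2_forecast(series[-2], series[-1], steps=3)
print("forecasts at t+1, t+2, t+3:", forecasts)
```

In a validation exercise the coefficients would be estimated from the calibration subset rather than fixed in advance.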
i. compare the performance of the numerical model with the performance of statistical time-series models:
b) Bayesian Dynamic Linear Models (DLM; West and Harrison, 2000)
$X_{t+1} = \alpha_t X_t + \beta_t X_{t-1} + \varepsilon_t$, with $\varepsilon_t \sim N(0, 1.2)$
$\alpha_t = \alpha_{t-1} + \omega_t$, with $\omega_t \sim N(0, 0.05)$
$\beta_t = \beta_{t-1} + \nu_t$, with $\nu_t \sim N(0, 0.02)$
[Figure: temperature T [ºC] against time, with the forecast at t+1]
West, M., Harrison, J., 2000. Bayesian Forecasting and Dynamic Models. Springer-Verlag, NY.
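A minimal sketch of the sequential updating for a DLM of this form, in which the two autoregressive coefficients follow random walks. It assumes that the observation variance (1.2) and the evolution variances (0.05 and 0.02) are known, which reduces the updating to a Kalman filter. The function name, the prior and the synthetic series are assumptions made for illustration.

```python
import numpy as np

def dlm_filter(x, obs_var=1.2, evol_var=(0.05, 0.02), m0=(0.0, 0.0), c0=1.0):
    """One-step-ahead forecasts from a DLM with time-varying AR(2) coefficients."""
    x = np.asarray(x, dtype=float)
    m = np.array(m0, dtype=float)        # posterior mean of the two coefficients
    C = np.eye(2) * c0                   # posterior covariance of the coefficients
    W = np.diag(evol_var)                # evolution (random-walk) covariance
    forecasts = np.full(len(x), np.nan)
    for t in range(1, len(x) - 1):
        F = np.array([x[t], x[t - 1]])   # regressors X[t] and X[t-1]
        R = C + W                        # prior covariance after evolution
        f = F @ m                        # forecast of X[t+1]
        Q = F @ R @ F + obs_var          # forecast variance
        A = R @ F / Q                    # adaptive coefficient (Kalman gain)
        e = x[t + 1] - f                 # one-step forecast error
        m = m + A * e
        C = R - np.outer(A, A) * Q
        forecasts[t + 1] = f
    return forecasts, m, C

# synthetic series generated with fixed coefficients 0.9 and -0.2
rng = np.random.default_rng(0)
x = np.zeros(300)
for t in range(1, 299):
    x[t + 1] = 0.9 * x[t] - 0.2 * x[t - 1] + rng.normal(0.0, np.sqrt(1.2))

forecasts, m_final, C_final = dlm_filter(x)
print("final coefficient estimates:", m_final)
```

The first two forecasts are left undefined because two past values are needed before the first prediction can be made.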
i. compare the performance of the numerical model with the performance of statistical time-series models:
c) Analogue Forecasting Models (McNames, 2002)
$X_{t+1,L} = 0.7 X_{t+1,A_1} + 0.3 X_{t+1,A_2} + \varepsilon$, with $\varepsilon \sim N(0, 1.2)$
[Figure: temperature T [ºC] against time, showing the analogue segments A1 and A2 and the latest segment L, each with its value at t+1]
McNames, J., 2002. Local averaging optimization for chaotic time series prediction. Neurocomputing 48(1-4): 279-297
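The idea can be sketched as a nearest-neighbour search: find the past segments (analogues) that most resemble the latest segment and average the values that followed them. The inverse-distance weights, the segment length and the function name are assumptions; McNames (2002) optimises the local averaging rather than fixing it in this way.

```python
import numpy as np

def analogue_forecast(x, segment_length=5, n_analogues=2):
    """Forecast the next value as a weighted mean of the successors of the analogues."""
    x = np.asarray(x, dtype=float)
    latest = x[-segment_length:]
    # candidate segments must not overlap the latest segment
    starts = np.arange(0, len(x) - 2 * segment_length + 1)
    distances = np.array([np.linalg.norm(x[s:s + segment_length] - latest)
                          for s in starts])
    order = np.argsort(distances)[:n_analogues]
    successors = x[starts[order] + segment_length]  # value following each analogue
    weights = 1.0 / (distances[order] + 1e-12)      # closer analogues weigh more
    weights /= weights.sum()
    return float(weights @ successors)

# synthetic periodic series: the analogues are exact, so the forecast is too
t = np.arange(98)
series = np.sin(2 * np.pi * t / 20.0)
forecast = analogue_forecast(series)
truth = np.sin(2 * np.pi * 98 / 20.0)
print("forecast:", forecast, "true next value:", truth)
```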
i. compare the performance of the numerical model with the performance of statistical time-series models
[Figure: temperature T [ºC] against time; series: observations, numerical model, ARIMA, Bayesian DLM, analogue forecasting model]
i. compare the performance of the numerical model with the performance of statistical time-series models
Taylor, K.E. 2001. Summarizing multiple aspects of model performance in a single diagram. J Geophys Res 106(D7): 7183–7192
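The diagram of Taylor (2001) displays, for each model, its standard deviation, its correlation with the observations and the centred root-mean-square difference, which are linked by the law of cosines. A minimal sketch of these statistics, with a hypothetical function name and synthetic series:

```python
import numpy as np

def taylor_statistics(model, obs):
    """Standard deviations, correlation and centred RMS difference."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    sd_model = model.std()
    sd_obs = obs.std()
    corr = np.corrcoef(model, obs)[0, 1]
    centred_rms = np.sqrt(np.mean(((model - model.mean()) - (obs - obs.mean())) ** 2))
    return sd_model, sd_obs, corr, centred_rms

rng = np.random.default_rng(4)
obs = rng.normal(15.0, 2.0, 200)
model = 0.8 * obs + rng.normal(3.5, 1.0, 200)   # hypothetical model output
sd_model, sd_obs, corr, centred_rms = taylor_statistics(model, obs)

# law of cosines underlying the diagram
rhs = sd_model ** 2 + sd_obs ** 2 - 2 * sd_model * sd_obs * corr
print(centred_rms ** 2, rhs)
```

Computing the same statistics for the numerical model and for each statistical model places all of them on a single diagram. The overall bias is not shown, because the means are removed.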
ii. examine in detail the performance of the numerical model
a) models for known periodicities
e.g.: is the numerical model emulating the major tide components?
model for the observations:
$$\begin{aligned}
X ={}& a_1 \sin(2\pi f_1 t) + a_2 \cos(2\pi f_1 t) \\
 &+ a_3 \sin(2\pi f_2 t) + a_4 \cos(2\pi f_2 t) \\
 &+ a_5 \sin(2\pi f_3 t) + a_6 \cos(2\pi f_3 t) \\
 &+ a_7 \sin(2\pi f_4 t) + a_8 \cos(2\pi f_4 t)
\end{aligned}$$
model for the numerical model output:
$$\begin{aligned}
X ={}& (a_1 + \delta_1) \sin(2\pi f_1 t) + (a_2 + \delta_2) \cos(2\pi f_1 t) \\
 &+ (a_3 + \delta_3) \sin(2\pi f_2 t) + (a_4 + \delta_4) \cos(2\pi f_2 t) \\
 &+ (a_5 + \delta_5) \sin(2\pi f_3 t) + (a_6 + \delta_6) \cos(2\pi f_3 t) \\
 &+ (a_7 + \delta_7) \sin(2\pi f_4 t) + (a_8 + \delta_8) \cos(2\pi f_4 t)
\end{aligned}$$
if, for example, $\delta_2$ is significantly different from 0, we may conclude that the model is not reproducing the $f_1$ periodicity well.
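A minimal sketch of how the coefficient differences could be estimated: when model output and observations share the same time stamps, regressing their difference on the sine and cosine terms yields the differences directly, and ordinary least-squares standard errors give a rough significance check. The two frequencies (periods of 12.42 h and 12 h), the amplitudes and the noise level are hypothetical, and the standard errors ignore serial dependence in the residuals.

```python
import numpy as np

def harmonic_design(t, freqs):
    """Columns sin(2 pi f t), cos(2 pi f t) for each known frequency f."""
    cols = []
    for f in freqs:
        cols.append(np.sin(2 * np.pi * f * t))
        cols.append(np.cos(2 * np.pi * f * t))
    return np.column_stack(cols)

def fit_delta(t, obs, model, freqs):
    """Least-squares estimate and standard error of each coefficient difference."""
    D = harmonic_design(np.asarray(t, dtype=float), freqs)
    y = np.asarray(model, dtype=float) - np.asarray(obs, dtype=float)
    delta, *_ = np.linalg.lstsq(D, y, rcond=None)
    resid = y - D @ delta
    sigma2 = resid @ resid / (len(y) - D.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(D.T @ D)))
    return delta, se

rng = np.random.default_rng(5)
t = np.arange(720.0)                          # 30 days of hourly values
freqs = [1 / 12.42, 1 / 12.0]                 # cycles per hour, hypothetical components
true_obs = [1.0, 0.5, 0.3, 0.0]
true_model = [1.0, 0.2, 0.3, 0.0]             # second coefficient is off by -0.3
D = harmonic_design(t, freqs)
obs = D @ np.array(true_obs) + rng.normal(0.0, 0.05, t.size)
model = D @ np.array(true_model) + rng.normal(0.0, 0.05, t.size)

delta, se = fit_delta(t, obs, model, freqs)
for k, (d, s) in enumerate(zip(delta, se), start=1):
    print(f"delta_{k} = {d: .3f} (s.e. {s:.3f}, ratio {d / s: .1f})")
```

A ratio far from zero for a given coefficient points to the periodicity that the numerical model fails to reproduce.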
ii. examine in detail the performance of the numerical model
b) bootstrap R (Mudelsee, 2003) – time series usually have positive serial dependence, a.k.a. persistence (i.e., lagged autocorrelations are significant and positive). This affects the estimation of confidence intervals for the cross-correlation (R)
[Figure: temperature T [ºC] against time; series: observations, numerical model]
Mudelsee, M., 2003. Estimating Pearson’s Correlation Coefficient With Bootstrap Confidence Interval From Serially Dependent Time Series. Mathematical Geology 35(6): 651-665
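A simplified sketch of the idea: resampling pairs of values in blocks of consecutive time steps preserves the persistence within each block, so the resulting interval is usually wider than one that assumes independent data. The fixed block length and the plain percentile interval are assumptions; Mudelsee (2003) selects the block length from the data and calibrates the interval.

```python
import numpy as np

def block_bootstrap_r(x, y, block_length=10, n_boot=2000, alpha=0.05, seed=0):
    """Pearson's r with a moving-block bootstrap percentile confidence interval."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    rng = np.random.default_rng(seed)
    n_blocks = int(np.ceil(n / block_length))
    offsets = np.arange(block_length)
    r_boot = np.empty(n_boot)
    for b in range(n_boot):
        # draw block start points, keep (x, y) pairs together inside each block
        starts = rng.integers(0, n - block_length + 1, size=n_blocks)
        idx = (starts[:, None] + offsets[None, :]).ravel()[:n]
        r_boot[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    r = np.corrcoef(x, y)[0, 1]
    lower, upper = np.quantile(r_boot, [alpha / 2, 1 - alpha / 2])
    return r, lower, upper

# synthetic persistent series standing in for observations and model output
rng = np.random.default_rng(3)
n = 300
x = np.zeros(n)
for i in range(1, n):
    x[i] = 0.7 * x[i - 1] + rng.normal()
y = x + rng.normal(0.0, 0.5, n)

r, lower, upper = block_bootstrap_r(x, y, n_boot=500)
print(f"R = {r:.3f}, 95% interval = [{lower:.3f}, {upper:.3f}]")
```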
ii. examine in detail the performance of the numerical model
c) process convolutions (Higdon, 2002) – help to define time periods where observations and predictions differ significantly. Should be applied to residuals (observations – predictions)
[Figure, upper panel: temperature T [ºC] against time; series: observations, numerical model]
[Figure, lower panel: residual [ºC] against time, with a 95% confidence band around 0; annotations: observational missing values, wider confidence bands, significant model misfit]
Higdon, D., 2002. Space and space-time modeling using process convolutions. In Quantitative Methods for Current Environmental Issues, eds. C. Anderson, V. Barnett, P. C. Chatwin, and A. H. El-Shaarawi, 37–56. London: Springer-Verlag
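A minimal sketch of a process convolution applied to residuals: the smooth misfit is written as a sum of Gaussian kernels centred on a lattice of knots, with independent normal weights, and the posterior of the weights gives a 95% band. The kernel width, knot spacing, noise variance and prior variance are fixed here for illustration; a full treatment in the spirit of Higdon (2002) would estimate them. All names and data are hypothetical.

```python
import numpy as np

def process_convolution_band(t_obs, resid, t_grid, knots, kernel_sd,
                             noise_var, prior_var=1.0):
    """Posterior mean and 95% band of a kernel-convolution process."""
    knots = np.asarray(knots, dtype=float)

    def kernel_matrix(t):
        d = np.asarray(t, dtype=float)[:, None] - knots[None, :]
        return np.exp(-0.5 * (d / kernel_sd) ** 2)

    K = kernel_matrix(t_obs)
    precision = K.T @ K / noise_var + np.eye(len(knots)) / prior_var
    cov = np.linalg.inv(precision)                  # posterior covariance of weights
    mean = cov @ K.T @ np.asarray(resid, dtype=float) / noise_var
    K_grid = kernel_matrix(t_grid)
    z_mean = K_grid @ mean
    z_sd = np.sqrt(np.einsum("ij,jk,ik->i", K_grid, cov, K_grid))
    return z_mean, z_mean - 1.96 * z_sd, z_mean + 1.96 * z_sd

rng = np.random.default_rng(2)
t_all = np.arange(100.0)
keep = (t_all < 20) | (t_all > 35)                  # observational missing values
misfit = np.where((t_all >= 60) & (t_all <= 80), 2.0, 0.0)
resid = misfit[keep] + rng.normal(0.0, 0.3, keep.sum())
knots = np.arange(-10.0, 111.0, 5.0)

z_mean, lower, upper = process_convolution_band(
    t_all[keep], resid, t_all, knots, kernel_sd=5.0, noise_var=0.09)
significant = (lower > 0) | (upper < 0)             # band excludes zero
print("times with significant misfit:", t_all[significant])
```

Where observations are missing the band widens, and where it excludes zero the observations and predictions differ significantly.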
4. model validation for a single instant – spatial data analysis
methods:
i. direct comparison between numerical model and observations
a) figure of merit in space (FMS) / measure of effectiveness (MOE)
b) entity-based verification
ii. residual analysis
a) process convolutions
[Figure: maps of temperature T [ºC]: output of the numerical model and in-situ measurements]
i. direct comparison between numerical model and observations
a) figure of merit in space (FMS) / measure of effectiveness (MOE)
[Figure: maps of temperature T [ºC], with T1 and T2 marked on the colour scale: output of the numerical model (predictions) and in-situ measurements (observations)]
$A_P$: area where $T_1 < T_P < T_2$
$A_O$: area where $T_1 < T_O < T_2$
[Figure: overlap of $A_O$ and $A_P$, showing $A_P \cap A_O$, $A_{False\ Negative}$ and $A_{False\ Positive}$]
i. direct comparison between numerical model and observations
a) figure of merit in space (FMS) / measure of effectiveness (MOE)
[Figure: AP∩AO with directions at 0º, 45º and 90º; plot of d against azimuth [º]]
this is a simple statistical approach, easy to interpret and with potential impact on decision-makers. However, it depends on some subjective criteria that have a strong impact on the outcome: boundaries (T1 and T2), interpolation algorithm, interpolation smoothness; the density and location of the observations are also important.
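A minimal sketch on gridded fields, assuming that the observations have already been interpolated to the model grid and that all cells have equal area. It uses the usual definitions: FMS is the overlap area divided by the union of the predicted and observed areas, and the two MOE components are one minus the false-negative fraction of the observed area and one minus the false-positive fraction of the predicted area. The function name and the small fields are hypothetical.

```python
import numpy as np

def fms_and_moe(pred, obs, t1, t2):
    """Figure of merit in space and measure of effectiveness for T1 < T < T2."""
    a_p = (pred > t1) & (pred < t2)      # predicted area AP
    a_o = (obs > t1) & (obs < t2)        # observed area AO
    overlap = np.sum(a_p & a_o)
    union = np.sum(a_p | a_o)
    false_neg = np.sum(a_o & ~a_p)       # observed but not predicted
    false_pos = np.sum(a_p & ~a_o)       # predicted but not observed
    fms = overlap / union
    moe = (1 - false_neg / np.sum(a_o), 1 - false_pos / np.sum(a_p))
    return fms, moe

pred = np.array([[10.0, 12.0, 14.0],
                 [16.0, 18.0, 20.0]])
obs = np.array([[11.0, 13.0, 17.0],
                [15.0, 19.0, 21.0]])
fms, moe = fms_and_moe(pred, obs, t1=11.5, t2=18.5)
print("FMS:", fms, "MOE:", moe)
```

Repeating the calculation with different boundaries T1 and T2 gives a quick impression of how sensitive the score is to those subjective choices.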
i. direct comparison between numerical model and observations
b) entity-based verification (Ebert and McBride, 2000)
the total mean squared error (MSE) can be written as:
$MSE_{total} = MSE_{displacement} + MSE_{volume} + MSE_{pattern}$
the difference between the mean squared error before and after translation is the contribution to total error due to displacement:
$MSE_{displacement} = MSE_{total} - MSE_{shifted}$
the error component due to volume represents the bias in mean intensity:
$MSE_{volume} = (\bar{F} - \bar{X})^2$
where $\bar{F}$ and $\bar{X}$ are the entity’s mean forecast and observed values after the shift. The pattern error accounts for differences in the fine structure of the forecast and observed fields:
$MSE_{pattern} = MSE_{shifted} - MSE_{volume}$
Ebert, E.E., McBride, J.L., 2000. Verification of precipitation in weather systems: Determination of systematic errors. J. Hydrology 239: 179-202.
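The decomposition can be sketched on a small grid by searching for the translation of the forecast that minimises the MSE. The brute-force search, the periodic wrap-around of `np.roll` and the synthetic fields are simplifying assumptions made for illustration, not the pattern-matching procedure of Ebert and McBride (2000).

```python
import numpy as np

def mse_decomposition(forecast, observed, max_shift=5):
    """Split the total MSE into displacement, volume and pattern components."""
    mse_total = np.mean((forecast - observed) ** 2)
    best = (mse_total, 0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(forecast, (dy, dx), axis=(0, 1))
            mse = np.mean((shifted - observed) ** 2)
            if mse < best[0]:
                best = (mse, dy, dx)
    mse_shifted, dy, dx = best
    shifted = np.roll(forecast, (dy, dx), axis=(0, 1))
    mse_volume = (shifted.mean() - observed.mean()) ** 2
    return {
        "shift": (dy, dx),
        "total": mse_total,
        "displacement": mse_total - mse_shifted,
        "volume": mse_volume,
        "pattern": mse_shifted - mse_volume,
    }

# synthetic entity: the forecast is 20% too intense and displaced 3 cells along x
yy, xx = np.mgrid[0:20, 0:20]
observed = np.exp(-((yy - 10) ** 2 + (xx - 10) ** 2) / (2 * 2.0 ** 2))
forecast = 1.2 * np.roll(observed, 3, axis=1)

parts = mse_decomposition(forecast, observed)
print(parts)
```

In this example the best shift moves the forecast back by three cells, and most of the total error is attributed to displacement.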
ii. residual analysis
a) process convolutions (Higdon, 2002)
[Figure: residual surface z over the spatial coordinates x and y, with a 95% confidence interval around 0]
5. model validation for variable space and time - spatiotemporal data analysis
methods:
i. analyse observations and predictions at a single location (time-series analysis) or time instant (spatial data analysis) – see sections 3 & 4
ii. residual analysis – dynamic process convolutions
ii. residual analysis - dynamic process convolutions (Higdon, 2002)
residuals = spatial process + noise, e.g. at Time 2: $y_i = S(x_i, 2) + \varepsilon_i$
[Figure: the spatial process S(., 1), S(., 2), S(., 3) at Time 1, Time 2 and Time 3]
6. summary
in essence, two validation approaches were proposed:
1) signal analysis – used to investigate to what extent the most important features of the data are captured by the numerical model
2) residual analysis – used to investigate if some significant features were left out by the numerical model
a third option is available: compare the performance of the numerical model with that of statistical models (ARIMA, DLMs, etc.).