
Page 1: Application of machine learning methods to model bias

2020.01.30 Data Assimilation Workshop

Application of machine learning methods to model bias correction:

Lorenz-96 model experiments

Data Assimilation Research Team, RIKEN Center for Computational Science

Arata Amemiya*, Takemasa Miyoshi

Shlok Mohta

Graduate School of Science, the University of Tokyo

Page 2

Motivation: bias in weather and climate models

Weather and climate models have model biases from various sources:

• Truncation error
• Approximation of unresolved physical processes
  - Convection
  - Small-scale topography
  - Turbulence
  - Cloud microphysics

(Figure from the JMA website)

Page 3

Treatment of forecast error in data assimilation

Kalman filter:

Forecast step (update the state and the forecast error covariance):
$x_{t+1}^f = \mathcal{M}(x_t^a)$
$P_{t+1}^f = M P_t^a M^T$, where $M = \partial\mathcal{M}/\partial x$

Analysis step (calculate the Kalman gain, then the analysis state and error covariance):
$K = P^f H^T (H P^f H^T + R)^{-1}$
$x^a = x^f + K\,(y - H(x^f))$
$P^a = (I - KH)\, P^f$

Model bias leads to the underestimation of forecast (background) error.

(Schematic of the assimilation cycle: previous analysis → forecast mean → analysis mean, with the observation entering at the analysis step.)

http://www.data-assimilation.riken.jp/jp/research/index.html
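As a concrete illustration, the analysis step above can be written out in a few lines (a minimal NumPy sketch assuming a linear observation operator $H$; the function name is ours):

```python
import numpy as np

def kf_analysis(x_f, P_f, y, H, R):
    """One Kalman-filter analysis step with a linear observation operator.

    x_f: forecast mean (n,)    P_f: forecast error covariance (n, n)
    y:   observations (p,)     H:   observation operator (p, n)
    R:   observation error covariance (p, p)
    """
    # Kalman gain: K = P^f H^T (H P^f H^T + R)^{-1}
    K = P_f @ H.T @ np.linalg.inv(H @ P_f @ H.T + R)
    # Analysis state: x^a = x^f + K (y - H x^f)
    x_a = x_f + K @ (y - H @ x_f)
    # Analysis error covariance: P^a = (I - K H) P^f
    P_a = (np.eye(len(x_f)) - K @ H) @ P_f
    return x_a, P_a
```

With model bias, the innovation $y - Hx^f$ is systematically non-zero while $P^f$, and hence $K$, no longer reflects the actual forecast error.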

Page 4

Treatment of an imperfect model

Insufficient treatment of model error degrades the performance of the Kalman filter. Two common remedies:

1. Covariance inflation
   - Multiplicative inflation: $P^a \to \alpha P^a$
   - Additive inflation: $P^a \to P^a + Q$
   - Relaxation-to-prior: $P^a \to (1-\alpha)P^a + \alpha P^f$

2. Correction of the systematic bias component
   $x_{t+1}^f \to x_{t+1}^f + b$
   $x_{t+1}^a = x_{t+1}^f + K\,(y_{t+1} - H(x_{t+1}^f))$

Page 5

Bias correction with a simple functional form

✓ "Offline" bias correction: a correction term $D(x)$ is estimated from a set of training data $\{\delta x, x^f\}$ and added to the model tendency:

$\frac{d}{dt}x = f_{\mathrm{true}}(x)$ (truth)
$\frac{d}{dt}x = f_{\mathrm{model}}(x)$ (imperfect model)
$\frac{d}{dt}x = f_{\mathrm{model}}(x) + D(x)$ (corrected model)

Training pairs: forecasts $x^f$ started from true states $x^t(t)$, with error $\delta x = x^t(t+\Delta t) - x^f(t+\Delta t)$.

Simplest form: linear dependence on the state,
$D(x) = D_0 + L x', \qquad x' = x^f - \overline{x^f}$
• Steady component: $D_0 = \overline{\delta x} / \Delta t$
• Linearly state-dependent component (Leith, 1978): $L x' = C_{\delta x, x}\, C_{x,x}^{-1}\, x' / \Delta t$, where $C$ denotes a correlation (covariance) matrix

Dimensionality reduction can be applied using Singular Value Decomposition (SVD) (Danforth et al. 2007).
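The estimation of $D_0$ and $L$ from training pairs reduces to a least-squares regression of $\delta x$ on the forecast anomalies. A minimal NumPy sketch (function names are ours; sample covariance matrices are used, which is the standard least-squares form of Leith's operator):

```python
import numpy as np

def estimate_bias_correction(dx, xf, dt):
    """Estimate D(x) = D0 + L x' from training pairs {dx, xf}.

    dx: (T, n) forecast errors  dx = x_true(t+dt) - x_f(t+dt)
    xf: (T, n) forecast states
    dt: forecast lead time
    """
    # Steady component: D0 = <dx> / dt
    D0 = dx.mean(axis=0) / dt
    # Anomalies x' = x^f - mean(x^f)
    xp = xf - xf.mean(axis=0)
    dxp = dx - dx.mean(axis=0)
    # Cross- and auto-covariance matrices
    C_dx_x = dxp.T @ xp / len(xf)
    C_x_x = xp.T @ xp / len(xf)
    # Linearly state-dependent component (Leith 1978)
    L = C_dx_x @ np.linalg.inv(C_x_x) / dt
    return D0, L
```

The corrected tendency is then $f_{\mathrm{model}}(x) + D_0 + L\,(x - \overline{x^f})$.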

Page 6

Bias correction with nonlinear basis functions

Higher-order polynomials:
• Coupled Lorenz96 system (Wilks 2005, Arnold et al. 2013)
• Real case: all-sky satellite infrared brightness temperature (Otkin et al. 2018)

Neural networks:
• Coupled Lorenz96 system (Watson et al. 2019)

(Fig. 2 of Otkin et al. 2018: probability of (obs − fcst) as a function of obs.)

Page 7

Online bias correction

✓ "Online" bias correction = simultaneous estimation of the state variables and the bias correction terms within the assimilation cycle (sequential treatment / augmented state).

• Kalman filter
  - Steady component (Dee and da Silva, 1998; Baek et al., 2006)
  - Polynomials (Pulido et al., 2018)

• Variational data assimilation ("VarBC")
  - Legendre polynomials (Cameron and Bell, 2016; for satellite sounding in the UK Met Office operational model)
  - A penalty function for the bias-corrected observation error plus a penalty function for regularization

Page 8

Localization

In a high-dimensional spatiotemporal system, geographically (and temporally) local interaction is usually dominant.

Localization is built into LETKF:
✓ Reduced matrix size → low cost
✓ Highly effective parallelization (Miyoshi and Yamane, 2007)

Localization is also used in ML-based data-driven modelling (Pathak et al. 2018; Watson et al. 2019) and in simultaneous parameter estimation (Aksoy et al. 2006).

(Figure from Pathak et al. 2018.)
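On a 1-D cyclic domain such as Lorenz96, the Gaussian observation weighting used for localization can be sketched as (a minimal sketch; function names are ours, and the 3-grid length scale is the value used later in this deck):

```python
import numpy as np

def localization_weights(dist, length_scale=3.0):
    """Gaussian observation weights, decaying with distance
    from the analysed grid point."""
    return np.exp(-0.5 * (dist / length_scale) ** 2)

def cyclic_distance(i, j, K):
    # Shortest grid-point distance on a 1-D cyclic domain of size K
    d = abs(i - j)
    return min(d, K - d)
```

Observations far from the analysed point receive near-zero weight, which is what shrinks the effective matrix sizes and makes the analysis embarrassingly parallel over grid points.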

Page 9

The goal of this study

✓ ML-based online bias correction using Long Short-Term Memory (LSTM)
✓ Combined with LETKF, using a similar localization

Test experiments with the coupled Lorenz96 model:
• Experimental setup
• Online bias correction with simple linear regression as a reference
• (Online bias correction with LSTM)

Page 10

Bias correction in the LETKF system

One cycle of the bias-corrected LETKF:

Model forecast: $x_{t+1}^f = M(x_t^a)$
Bias correction: $\tilde{x}_{t+1}^f = b(x_{t+1}^f)$
LETKF analysis: $x_{t+1}^a = \tilde{x}_{t+1}^f + K\,(y_{t+1} - H(\tilde{x}_{t+1}^f))$

The bias correction function $b(x^f)$ is updated after each analysis, using the observation $y_{t+1}$.

Page 11

Lorenz96 model (Lorenz 1996)

$\frac{d}{dt}x_k = x_{k-1}(x_{k+1} - x_{k-2}) - x_k + F, \qquad k = 1, 2, \ldots, K$

• 1-D cyclic domain
• The uniform state $x_k = F$ is a fixed point; chaotic behavior for sufficiently large $F$
• Standard setting: $K = 40$, $F = 8$

(Figures: wind at 250 hPa from NCEP GFS, https://earth.nullschool.net/, as an atmospheric analogue of the cyclic domain; largest Lyapunov exponent; mean $x(t)$.)

Page 12

Coupled Lorenz96 model (Wilks, 2005)

Large-scale (slow) variables:
$\frac{d}{dt}x_k = x_{k-1}(x_{k+1} - x_{k-2}) - x_k + F - \frac{hc}{b}\sum_{j=J(k-1)+1}^{kJ} y_j, \qquad k = 1, 2, \ldots, K$

Small-scale (fast) variables:
$\frac{d}{dt}y_j = -cb\, y_{j+1}(y_{j+2} - y_{j-1}) - c y_j + \frac{hc}{b}\, x_{\mathrm{int}[(j-1)/J]+1}, \qquad j = 1, 2, \ldots, KJ$

The coupling terms give multi-scale interaction: each large-scale variable $x_k$ is coupled to the $J$ small-scale variables $y_{J(k-1)+1}, \ldots, y_{kJ}$.

The coupled system serves as the "nature run". The forecast model (Danforth and Kalnay, 2008) resolves only the large-scale variables, with an additional sinusoidal forcing:
$\frac{d}{dt}x_k = x_{k-1}(x_{k+1} - x_{k-2}) - x_k + F + A \sin\!\left(2\pi \frac{k}{K}\right)$

Parameters used in this study:

𝐾 = 16, 𝐽 = 16

β„Ž = 1, 𝑏 = 20, 𝑐 = 50

𝐹 = 16, A = 1
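A sketch of the two tendency functions, for reference (names are ours; the default $h$, $b$, $c$ are the standard Wilks (2005) values, so pass this study's parameter settings explicitly):

```python
import numpy as np

def coupled_l96_tendency(x, y, F=16.0, h=1.0, b=10.0, c=10.0):
    """Tendencies of the two-scale Lorenz96 system.

    x: (K,) large-scale variables; y: (K*J,) small-scale variables,
    both treated as cyclic.
    """
    K, J = len(x), len(y) // len(x)
    # Large scale: advection - damping + forcing - coupling to the y's
    coupling_x = (h * c / b) * y.reshape(K, J).sum(axis=1)
    dx = np.roll(x, 1) * (np.roll(x, -1) - np.roll(x, 2)) - x + F - coupling_x
    # Small scale: fast advection - damping + coupling to the parent x_k
    coupling_y = (h * c / b) * np.repeat(x, J)
    dy = (-c * b * np.roll(y, -1) * (np.roll(y, -2) - np.roll(y, 1))
          - c * y + coupling_y)
    return dx, dy
```

Integrating this system provides the "nature run"; the model bias of the single-scale forecast model is then the missing coupling term plus the extra forcing.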

Page 13

Data assimilation experiment

"Observation": nature run + random error
• Observation operator: identity (obs = model grid)
• Error standard deviation: 0.1
• Interval: 0.025 / 0.05 / 0.1 (cf. doubling time ≈ 0.2)

LETKF configuration:
• Members: 20
• Localization: Gaussian weighting (length scale = 3 grids)
• Covariance inflation: multiplicative (factor $\alpha$)

"Control run": data assimilation without bias correction.

Obs interval   Minimum RMSE   Inflation α
0.025          0.0760         2.9
0.05           0.0851         5.8
0.1            0.0902         15.0

A large inflation factor $\alpha$ is needed to stabilize the DA cycle.

Page 14

Example: simple linear regression

Online bias correction by linear regression. One cycle:

Observation: $y_{t+1} = H(x_{t+1}^{\mathrm{true}}) + \epsilon_{\mathrm{obs}}$
Model forecast: $x_{t+1}^f = M(x_t^a)$
Bias correction: $\tilde{x}_{t+1}^f = W_{t+1}^f\, x_{t+1}^f + b_{t+1}^f$
LETKF analysis: $x_{t+1}^a = \tilde{x}_{t+1}^f + K\,(y_{t+1} - H(\tilde{x}_{t+1}^f))$

Coefficient updates:
$W_{t+1}^a = W_{t+1}^f + \beta K\,(y_{t+1} - H(\tilde{x}_{t+1}^f)) \cdot (x_{t+1}^f)^{\mathrm{T}} \,/\, |x_{t+1}^f|^2$
$b_{t+1}^a = b_{t+1}^f + \beta K\,(y_{t+1} - H(\tilde{x}_{t+1}^f))$

Forgetting factor:
$W_{t+1} = (1 - \gamma)\, W_t, \qquad b_{t+1} = (1 - \gamma)\, b_t$
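One cycle of this scheme can be sketched in NumPy (a minimal sketch; names are ours, and the forgetting factor is implemented here as a per-cycle retention factor `decay` that pulls $W$ and $b$ back toward zero correction, which is our reading of the slide's $\gamma$):

```python
import numpy as np

def bias_correction_cycle(x_f, W, b, y, K_gain, H, beta=0.02, decay=0.999):
    """One analysis cycle with online linear-regression bias correction."""
    x_corr = W @ x_f + b                       # bias-corrected forecast
    increment = K_gain @ (y - H @ x_corr)      # K (y - H x~f)
    x_a = x_corr + increment                   # LETKF analysis
    # Regression updates driven by the analysis increment,
    # with decay toward zero correction between cycles
    W_a = decay * W + beta * np.outer(increment, x_f) / (x_f @ x_f)
    b_a = decay * b + beta * increment
    return x_a, W_a, b_a
```

Repeated over many cycles, the regression coefficients absorb the systematic part of the analysis increments, so the increments themselves shrink toward the random observation-error level.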

Page 15

Online bias correction by linear regression: results

$\beta = 0.02$, $\gamma = 0.999$ (half-life: ~37)

$x^a - \tilde{x}^f$ has a large negative correlation with $x^f$, and is mostly offset by the bias correction term $\tilde{x}^f - x^f$.

Control run (no bias correction):
Interval (day)   Min. infl.   RMSE
0.025            2.9          0.0760
0.1              15.0         0.0902

With bias correction:
Interval (day)   Min. infl.   RMSE
0.025            2.0          0.0625
0.1              8.2          0.0849

The cycle is stable with a smaller inflation factor $\alpha$, and the RMSE is reduced.

Page 16

Correction term

(Figure: the correction term and the diagnosed bias, forecast (before correction) minus analysis, over the time windows 0-30, 120-150, 240-270, and 360-390.)

Page 17

Nonlinear bias correction with ML

The linear regression is replaced by a recurrent neural network (RNN):

Model forecast: $x_{t+1}^f = M(x_t^a)$
Bias correction: $\tilde{x}_{t+1}^f = b(x_{t+1}^f, \ldots)$
LETKF analysis: $x_{t+1}^a = \tilde{x}_{t+1}^f + K\,(y_{t+1} - H(\tilde{x}_{t+1}^f))$

The network $b$ is trained with the pairs $\{x_{t+1}^f, x_{t+1}^a\}$ produced by the cycle.

Page 18

Plan: LSTM implementation

Network architecture:
• 1 LSTM + 3 Dense layers: LSTM (nx=15) → Dense (nx=10) → Dense (nx=5) → output
• Activation: tanh / sigmoid (recurrent)
• No regularization / dropout

LETKF-like localization:
• Input: forecasts $x^f_{t-\Delta t}, \ldots, x^f_t$ over a localized area (nx=9, nt=5)
• Output: the analysis $x^a_t$ at one grid point (nx=1)

The TensorFlow (Python) LSTM is implemented and integrated with the LETKF codes.
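The localized input/output pairing can be sketched as a small windowing helper that prepares the $(nt, nx) = (5, 9)$ sequences the LSTM would consume (a sketch; `make_training_samples` and `half_width` are our names, assuming a cyclic domain as in Lorenz96):

```python
import numpy as np

def make_training_samples(xf_hist, xa, k, half_width=4, nt=5):
    """Build one grid point's training samples for the localized network.

    xf_hist: (T, K) forecasts; xa: (T, K) analyses, cyclic in space.
    Input : forecasts at the nt latest times over a local patch of
            2*half_width + 1 = 9 grid points centred on k.
    Target: the analysis at grid point k at the latest time.
    """
    T, K = xf_hist.shape
    cols = [(k + d) % K for d in range(-half_width, half_width + 1)]
    X, targets = [], []
    for t in range(nt - 1, T):
        X.append(xf_hist[t - nt + 1 : t + 1][:, cols])  # shape (nt, 9)
        targets.append(xa[t, k])
    return np.array(X), np.array(targets)
```

Because every grid point gets its own small samples of this shape, the training (like the LETKF analysis itself) parallelizes trivially over grid points.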

Page 19

Summary

• Systematic model bias degrades forecasts and analyses.
• "Offline" bias correction can be performed by ML as nonlinear regression.
• "Online" bias correction within data assimilation has been studied using fixed sets of basis functions.
• LSTM-based online bias correction is implemented and is to be tested.

Open questions:
• The efficiency of localization?
• Online learning?