a sampling filter for non-gaussian data assimilation...a sampling filter for non-gaussian data...
TRANSCRIPT
A Sampling Filter forNon-Gaussian Data Assimilation
Ahmed Attia1 and Adrian Sandu1
1Computational Science Laboratory (CSL)Department of Computer Science
Virginia Tech{attia,sandu}@cs.vt.edu
[1/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)
Outline
I MotivationI Problem formulationI Sampling approach: HMCMCI Experiment and resultsI Future workI Questions
Outline [2/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)
Motivation:
I A Monte Carlo approach capable of finding a good analysis for DAproblems in both Gaussian and non-Gaussian frameworks!
I Why don’t we sample directly from the posterior PDF?I A robust, fast sampling strategy is required!I MCMC: popular and guaranteed to converge but may take forever!I Auxiliary variable MCMC: Hybrid Monte Carlo (HMCMC);
I Fast convergence rate,I High acceptance probability of generated states,I Fairly easy to tune the parameters and enhance performance.
Motivation [3/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)
Sequential DA:I From Bayes’ Theorem:
Pa(x) = P(x|y) =P(y|x)Pb(x)
P(y), (1)
∝ P(y|x)Pb(x) . (2)
I Gaussian framework:
Pb(x) ∝ exp(−1
2(x− xb)T B−1(x− xb)
), (3)
P(y|x) ∝ exp(−1
2(H(x)− y)T R−1(H(x)− y)
). (4)
I Posterior PDF:
Pa(x) ∝ exp (−J (x)) , (5)
J (x) =12
(x− xb)T B−1(x− xb) +12
(H(x)− y)T R−1(H(x)− y) . (6)
Problem Formulation [4/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)
Hybrid MC:I HMCMC: Analogy to Hamiltonian mechanical system.I To draw samples {x(e)} from a given probability distribution π(x):
- View the state variable as a position variable,- Add an auxiliary variable ”momentum“ p and sample from the joint PDF of
both variables (p, x).+ The Hamiltonian:
H(p, x) =12
pT M−1p− log(π(x)) =12
pT M−1p︸ ︷︷ ︸kinetic energy
+ J (x)︸ ︷︷ ︸potential energy
, (7)
+ The Hamiltonian dynamics (symplectic integrator needed):
dxdt
= ∇p H = M−1p ,dpdt
= −∇x H = −∇xJ (x). (8)
I The canonical PDF of (p,x):
exp (−H(p, x)) = exp(−1
2pT M−1p− J (x)
)= exp
(−1
2pT M−1p
)· π(x). (9)
I Generate a Markov chain with stationary distribution is exp (−H(p,x)).
Sampling Approach: HMCMC [5/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)
HMCMC Sampling for Sequential DA:
Assimilation at time tkI Forecast: given an analysis ensemble {xa
k−1(e)}e=1,2,...,Nens at time tk−1;generate the forecast ensemble via the modelM:
xbk (e) =Mtk−1→tk
(xa
k−1(e)), e = 1,2, . . . , Nens. (10)
I Analysis: given the observation vector yk at time point tk , start samplingfrom Pa(x|yk ):
i- Initialize the MC? Use the best estimate available for (x0)! (e.g. Forecast,EnKF, 3D-Var),
ii- Optionally, calculate the ensemble-based forecast error covariance matrixBk ,
iii- Choose the mass matrix M! (e.g. constant diagonal, diagonal of B0 or Bk ),iv- Generate the MC and sample at stationarity,iv- Use the generated samples {xa
k (e)}e=1,2,...,Nens as an analysis ensemble andcalculate the best estimate of the state (e.g. the mean), and the analysiserror covariance matrix.
Sampling Approach: HMCMC [6/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)
HMCMC Sampling Algorithm:Given the initial state of the chain:
I Draw a random vector pk ∼ N (0,M),I Use a symplectic numerical integrator (e.g. Verlet) to advance the current
state (pk ,xk ) by a time increment T to obtain a proposal state (p∗,x∗):
(p∗,x∗) = ΦT((pk ,xk )
).
I Evaluate the loss of energy based on the Hamiltonian function ∆H
∆H = H(p∗,x∗)− H(pk ,xk ). (11)
I Calculate the probability:
a(k) = 1 ∧ e−∆H . (12)
I Discard both p∗, pk .I (Acceptance/Rejection) Draw a uniform random variable u(k) ∼ U(0,1):
i- If a(k) > u(k) accept the proposal as the next sample: xk+1 := x∗;ii- If a(k) ≤ u(k) reject the proposal and continue with the current state:
xk+1 := xk .I Repeat steps 1 to 6 until sufficiently many distinct samples are drawn.
Sampling Approach: HMCMC [7/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)
Symplectic numerical integratorsOne step advance of the solution of the Hamiltonian equations from time tk totime tk+1 = tk + h as follows:
1. Position Verlet integrator
xk+1/2 = xk +h2
M−1 pk , (13a)
pk+1 = pk − h∇xJ (xk+1/2) , (13b)
xk+1 = xk+1/2 +h2
M−1 pk+1. (13c)
2. Two-stage integrator
x1 = xk + (a1h)M−1pk , (14a)
p1 = pk − (b1h)∇xJ (x1) , (14b)
x2 = x1 + (a2h)M−1p1 , (14c)
pk+1 = p1 − (b1h)∇xJ (x2) , (14d)
xk+1 = x2 + (a2h)M−1pk+1 , (14e)
a1 = 0.21132 , a2 = 1− 2a1 , b1 = 0.5 .
Sampling Approach: HMCMC [8/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)
Symplectic numerical integratorsOne step advance of the solution of the Hamiltonian equations from time tk totime tk+1 = tk + h as follows:
3. Three-stage integrator
x1 = xk + (a1h)M−1pk , (15a)
p1 = pk − (b1h)∇xJ (x1) , (15b)
x2 = x1 + (a2h)M−1p1 , (15c)
p2 = p1 − (b2h)∇xJ (x2) , (15d)
x3 = x2 + (a2h)M−1p2 , (15e)
pk+1 = p2 − (b1h)∇xJ (x3) , (15f)
xk+1 = x3 + (a1h)M−1pk+1 , (15g)
where:
a1 = 0.11888010966548 ,
a2 = 0.5− a1 ,
b1 = 0.29619504261126 ,
b2 = 1− 2b1.
Sampling Approach: HMCMC [9/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)
Symplectic numerical integratorsOne step advance of the solution of the Hamiltonian equations from time tk totime tk+1 = tk + h as follows:
4. Four-stage integratorx1 = xk + (a1h)M−1pk , (16a)
p1 = pk − (b1h)∇xJ (x1) , (16b)
x2 = x1 + (a2h)M−1p1 , (16c)
p2 = p1 − (b2h)∇xJ (x2) , (16d)
x3 = x2 + (a3h)M−1p2 , (16e)
p3 = p2 − (b2h)∇xJ (x3) , (16f)
x4 = x3 + (a2h)M−1p3 , (16g)
pk+1 = p3 − (b1h)∇xJ (x4) , (16h)
xk+1 = x4 + (a1h)M−1pk+1 , (16i)
where:
a1 = 0.071353913450279725904 ,
a2 = 0.268458791161230105820 ,
a3 = 1− 2a1 − 2a2 , b1 = 0.1916678 , b2 = 0.5− b1.
Sampling Approach: HMCMC [10/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)
Symplectic numerical integratorsOne step advance of the solution of the Hamiltonian equations from time tk totime tk+1 = tk + h as follows:
5. Infinite dimensional integrator
p1 = pk −h2
M−1∇xJ (xk ) , (17a)
xk+1 = cos (h)xk + sin (h)p1 , (17b)
p2 = − sin (h)xk + cos (h)p1 , (17c)
pk+1 = p2 −h2
M−1∇xJ (xk+1). (17d)
Here, the loss of energy is computed as:
∆H = φ(x∗)− φ(xk ) +h2
8
(|M−
12 (−∇φ(xk ))|2 − |M−
12 (−∇φ(x∗))|2
)+ h
m−1∑i=1
(pT
k (−∇φ(xk )))
+h2
(pT
k (−∇φ(xk )) + (p∗)T(−∇φ(x∗))
),
where φ(x) = − log (π(x))
Sampling Approach: HMCMC [11/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)
Experimental SettingsI Applied to 40-variables Lorenz-96, Observe each third component,I Observation operators:
1) Linear:H(x) = Hx = (x1, x4, x7, . . . , x37, x40)T ∈ R14 .
2) Quadratic:H(x) = (x2
1 , x24 , x2
7 , . . . , x237, x2
40)T ∈ R14.
3) Cubic:H(x) = (x3
1 , x34 , x3
7 , . . . , x337, x3
40)T ∈ R14.
4) Magnitude:
H(x) = (|x1|, |x4|, |x7|, . . . , |x37|, |x40|)T ∈ R14.
5) Quadratic with threshold:
H(x) = (x ′1, x ′4, x ′7, . . . , x ′37, x ′40)T ∈ R14 ,
where
x ′i =
{x2
i : xi ≥ 0.5−x2
i : xi < 0.5 ,6) Exponential:
H(x) = (er·x1 , er·x4 , er·x7 , . . . , er·x37 , er·x40 )T ∈ R14 ,
Experiment and Results [12/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)
Results: Quadratic H with a threshold 0.5
0 1 2 3 4 5 6 7 8 9 100
1
2
3
4
5
6
7
Time
RMSE
Forecast
EnKF
MLEF
(a) Position Verlet integrator(13)
0 1 2 3 4 5 6 7 8 9 100
1
2
3
4
5
6
7
Time
RMSE
Forecast
EnKF
MLEF
(b) Two-stage integrator (14)
0 1 2 3 4 5 6 7 8 9 100
1
2
3
4
5
6
7
Time
RMSE
Forecast
EnKF
MLEF
(c) Three-stage integrator
0 1 2 3 4 5 6 7 8 9 10
0
1
2
3
4
5
6
7
Time
RMSE
Forecast
MLEF
EnKF
(d) Four-stage integrator (16)
0 1 2 3 4 5 6 7 8 9 100
1
2
3
4
5
6
7
Time
RMSE
Forecast
EnKF
MLEF
(e) Integrator defined onHilbert space (17)
Figure : Data assimilation results with the quadratic observation operator with a thresh-old value (12). The integrator used is indicated under each panel. The time step for allintegrators is T = 0.1 with h = 0.01, m = 10, and 30 inter-chain steps.
Experiment and Results [13/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)
Results: Exponential H with r = 0.2
0 1 2 3 4 5 6 7 8 9 100
1
2
3
4
5
6
7
8
Time
RMSE
EnKF
Forecast
MLEF
(a) Position Verlet integrator(13)
0 1 2 3 4 5 6 7 8 9 100
1
2
3
4
5
6
7
8
Time
RMSE
Forecast
EnKF
MLEF
(b) Two-stage integrator (14)
0 1 2 3 4 5 6 7 8 9 100
1
2
3
4
5
6
7
8
Time
RMSE
EnKF
Forecast
MLEF
(c) Three-stage integrator(15)
0 1 2 3 4 5 6 7 8 9 100
1
2
3
4
5
6
7
8
Time
RMSE
Forecast
EnKF
MLEF
(d) Four-stage integrator (16)
0 1 2 3 4 5 6 7 8 9 100
1
2
3
4
5
6
7
8
Time
RMSE
Forecast
EnKF
MLEF
(e) Integrator defined onHilbert space (17)
Figure : Data assimilation results with the exponential observation operator with r =0.2 (12). The integrator used is indicated under each panel. The time step for allintegrators is T = 0.1 with h = 0.01, m = 10, and 30 inter-chain steps.
Experiment and Results [14/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)
Results: Exponential H with r = 0.2
Figure : Scatter plot of the sampled joint pdf of selective components. Results areobtained from HMCMC ensembles with three-stage integrator.
Experiment and Results [15/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)
Results: Exponential H with a factor r = 0.5
0 1 2 3 4 5 6 7 8 9 100
1
2
3
4
5
6
Time
RM
SE
Forecast
(a) Three-stage integrator; h = 0.01, m =60
0 1 2 3 4 5 6 7 8 9 100
1
2
3
4
5
6
Time
RMSE
Forecast
(b) Four-stage integrator; h = 0.001, m =30
Figure : Data assimilation results with the exponential observation operator with r =0.5 (12). The integrator used is indicated under each panel. The step size h andthe number of steps m are indicated under each panel, and 30 inter-chain steps areconsidered.
Experiment and Results [16/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)
Tuning the parameters:
(a) RMSE mean: two-stageintegrator; h = 2/Nvar, m =Nvar
(b) RMSE standard devia-tion: two-stage integrator;h = 2/Nvar, m = Nvar
(c) RMSE mean: three-stageintegrator; h = 3/Nvar, m =Nvar/2
(d) RMSE standard devia-tion: three-stage integrator;h = 3/Nvar, m = Nvar/2
(e) RMSE mean: four-stageintegrator; h = 4/Nvar, m =Nvar/2
(f) RMSE standard deviation:four-stage integrator; h =4/Nvar, m = Nvar/2
Figure : The mean and standard deviation of RMS error for the quadratic observationoperator (12) for different MC number of inter-chain steps.
Experiment and Results [17/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)
Results: Quadratic H with tuned parameters
0 1 2 3 4 5 6 7 8 9 100
1
2
3
4
5
6
7
Time
RMSE
Forecast
EnKF
MLEF
Figure : RMS error of the sampling filter using three-stage integrator with h =3/Nvar, m = Nvar/2. The number of inter-chain steps is 40.
Experiment and Results [18/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)
Future work
I Apply to larger models,I Extend to 4D case,I Enhance and parallelize,I ....
Future Work [19/19]October 13, 2014, UMD-VT Data Assimilation Day. Ahmed Attia, Adrian Sandu. (http://csl.cs.vt.edu)