
Introduction to Data Assimilation and Applications within NOAA – Part I
NOAA Educational Partnership Program with Minority Serving Institutions
9th Biennial Science and Education Forum
Howard University, Washington, DC
March 20, 2018

Karina Apodaca
Colorado State University/Cooperative Institute for Research in the Atmosphere, Ft. Collins, Colorado
NOAA/OAR/AOML/HRD and Global Observing Systems Analysis Group

1. Data assimilation overview

2. Fundamental terminology
- The analysis
- The background
- The state vector, control space, and observations

3. Statistical methods

4. Uncertainty and probability density functions

5. Modeling of error variables

6. Importance of error covariances

7. Prior and conditional probabilities

8. Least-squares estimation

9. Introduction to common data assimilation algorithms

Outline

Karina Apodaca – Introduction to Data Assimilation

10. 3D Variational (3DVar) Analysis

11. 4D Variational (4DVar) Analysis

12. Ensemble Approach

13. Hybrid methods

14. Applications within NOAA



Data assimilation (DA) is a sequence of operations aimed at determining the best possible state of a system by combining observations with statistical/dynamical knowledge about the system (a short-range forecast).

Objective: To make the best estimate of the state of an unknown system based on an imperfect model and a finite set of observations.

Assimilation

Courtesy: http://www.knmi.nl/research/weather_research/data_assimilation/

NOAA - http://celebrating200years.noaa.gov/breakthroughs/climate_model/AtmosphericModelSchematic.png

Data Assimilation Overview

Typically, DA is a sequential, or intermittent time-stepping procedure, in which a previous model forecast is compared with newly received observations, the model state is then updated to reflect the observations, a new forecast is initiated, and so on.

Analysis: the update step in this process

Background: the short model forecast used to produce the analysis
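The forecast/update cycle described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the scalar "model" (`forecast`, with its `decay` factor) and the fixed blending `weight` in `analysis` are hypothetical placeholders, not anything prescribed by the talk. In practice the weight comes from error statistics rather than being fixed by hand.

```python
def forecast(x, steps=1, decay=0.9):
    """Hypothetical toy model: the scalar state decays toward zero each step."""
    for _ in range(steps):
        x = decay * x
    return x


def analysis(xb, y, weight=0.5):
    """Update step: pull the background xb toward the observation y.

    The fixed weight is an assumption made for illustration only.
    """
    return xb + weight * (y - xb)


def cycle(x0, observations, weight=0.5):
    """Run forecast -> analysis -> forecast ... for each new observation.

    Returns a list of (background, analysis) pairs, one per cycle.
    """
    history = []
    xa = x0
    for y in observations:
        xb = forecast(xa)             # short forecast from the previous analysis
        xa = analysis(xb, y, weight)  # update the state to reflect the observation
        history.append((xb, xa))
    return history


if __name__ == "__main__":
    for xb, xa in cycle(10.0, [8.0, 8.0]):
        print(f"background={xb:.3f}  analysis={xa:.3f}")
```

Each analysis becomes the starting point of the next short forecast, which in turn serves as the next background.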

Widely used in geophysics (meteorology, oceanography, atmospheric chemistry, hydrology), robotics, nuclear energy, systems biology, economics



Applications within geophysics:

Forecasting: to estimate initial conditions

Model tuning: for parameter estimation

Inverse modeling: to estimate parameter fields

Data analysis: re-analysis (used in modeling through interpolation operators)

Difficulties for application in Earth sciences:

• Non-linearities

• Large number of dimensions

• Large computational costs

• Insufficient information on error statistics


• An accurate picture of the true state of the atmosphere at a given time, represented in a model as a collection of numbers (example: a “nature run” [NR])

• Useful in itself as a comprehensive and self-consistent diagnostic of the atmosphere

• Can be used as input data to another operation, e.g. initial state for a numerical weather forecast, or as data retrieval to be used as a pseudo-observation

• Can provide a reference to check the quality of observations

• Under-determined in most cases because data are sparse and only indirectly related to model variables

The analysis

• A priori estimate of the model state

• Defines the knowledge about the dynamical state before new observations are assimilated

• Can be a climatology or any trivial state

• Can be generated from the output of the previous analysis with some assumptions of consistency in time

The background

State vector x:

A group of numbers needed to represent the atmospheric state of the model collected in a column matrix

Components: truth, background, and analysis

The best possible representation of reality as a state vector is xt, the true state at the time of the analysis

The a priori or background xb is the estimate of the true state before the analysis is carried out

The analysis is what we are after: xa

The state vector, control space, and observations

Control variable space:

It is more convenient not to solve the whole analysis problem for all components of the model state.

The work space for the analysis is not the model space, but the space allowed for corrections to the background (control variable space).

Need to find the correction dx (analysis increment), such that:

xa = xb + dx is as close as possible to xt

Instead of looking for xa we look for (xa – xb) in a suitable space
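As a rough sketch of the idea, the snippet below applies an increment that lives in a smaller control space to a full background state. The choice of control space here (a subset of state components selected by `control_indices`) is an assumption made purely for illustration. Real systems typically define the control variable through more elaborate transforms.

```python
def apply_increment(xb, dx_control, control_indices):
    """Return xa = xb + dx, where dx is non-zero only on the control components.

    xb              -- background state vector (list of floats)
    dx_control      -- correction expressed in the (smaller) control space
    control_indices -- hypothetical map from control components to state components
    """
    if len(dx_control) != len(control_indices):
        raise ValueError("increment and control space must have the same size")
    xa = list(xb)  # components outside the control space keep their background value
    for i, d in zip(control_indices, dx_control):
        xa[i] += d
    return xa


if __name__ == "__main__":
    xb = [1.0, 2.0, 3.0]
    # Only the second state component is allowed to be corrected.
    print(apply_increment(xb, [0.5], [1]))
```

The analysis problem is then solved for the short vector `dx_control` rather than for every component of the model state.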


Observations:

For a given analysis we use a limited number of observed values, gathered into an observation vector y

The number of observations corresponds to the dimension of y

The correct way to compare observations to the state vector is through a mapping function from the model state space to the observation space

This is called the observation operator (H), or forward operator.

H is required to use observations in the analysis


Observations (continued):

This operator generates the values H(x) that the observations would take if both they and the state vector were perfect (no modeling error)

H is a collection of interpolation operators from the model to the observation points, and conversions from model variables to observed parameters
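A minimal sketch of such an operator, assuming a one-dimensional model grid holding temperature in kelvin and instruments that report degrees Celsius at off-grid locations. The grid, the unit conversion, and all function names are hypothetical choices for illustration. They combine the two ingredients named above: interpolation to the observation points, then conversion to the observed parameter.

```python
def interpolate(grid_x, grid_values, x_obs):
    """Linearly interpolate gridded model values to one observation location."""
    for i in range(len(grid_x) - 1):
        x0, x1 = grid_x[i], grid_x[i + 1]
        if x0 <= x_obs <= x1:
            t = (x_obs - x0) / (x1 - x0)
            return (1 - t) * grid_values[i] + t * grid_values[i + 1]
    raise ValueError("observation location lies outside the model grid")


def kelvin_to_celsius(t_kelvin):
    """Convert the model variable (K) to the observed parameter (deg C)."""
    return t_kelvin - 273.15


def observation_operator(state, grid_x, obs_locations):
    """H(x): map a model state to the values the observations would report."""
    return [kelvin_to_celsius(interpolate(grid_x, state, x)) for x in obs_locations]


if __name__ == "__main__":
    state = [280.0, 290.0, 300.0]   # model temperatures on the grid (K)
    grid_x = [0.0, 1.0, 2.0]        # grid point coordinates
    print(observation_operator(state, grid_x, [0.5, 2.0]))
```

The output has one entry per observation, so it can be compared directly with the observation vector y.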


Departures:

The vector of departures at observation points is equal to: y - H(x)

When calculated using the background (xb), this vector is called the innovations: y - H(xb)

When calculated using the analysis (xa), this vector is called the analysis residuals: y - H(xa)

These calculations are a way to assess the quality of the data assimilation technique being used
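The departures can be computed directly once H(xb) and H(xa) are available. The sketch below uses made-up numbers, and the root-mean-square summary (`rms`) is an assumed diagnostic chosen for illustration. Residuals smaller than innovations indicate only that the analysis was drawn toward the observations, which is a useful check but not a full measure of quality.

```python
def departures(y, hx):
    """Return y - H(x), element by element, at the observation points."""
    if len(y) != len(hx):
        raise ValueError("y and H(x) must have the same length")
    return [yi - hi for yi, hi in zip(y, hx)]


def rms(values):
    """Root-mean-square size of a departure vector."""
    return (sum(v * v for v in values) / len(values)) ** 0.5


if __name__ == "__main__":
    y = [11.0, 27.0]      # observations (illustrative values)
    h_xb = [12.0, 25.0]   # H(xb): background mapped to observation space
    h_xa = [11.4, 26.5]   # H(xa): analysis mapped to observation space

    innovations = departures(y, h_xb)   # y - H(xb)
    residuals = departures(y, h_xa)     # y - H(xa)
    print("innovation RMS:", rms(innovations))
    print("residual RMS:  ", rms(residuals))
```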


• Data assimilation is probabilistic

• Model equations and observations are imperfect

• Also, the initial conditions, model errors and estimation parameters are NOT perfectly known

• All of these unknowns imply that the model forecast will also contain errors

• Since observations and model forecasts are inputs to data assimilation…

Why do we need statistical methods?

• Data assimilation procedures will also be flawed (e.g., insufficient knowledge of the input implies insufficient knowledge of the output)

• Uncertainties and imperfect knowledge are best measured by probability

• In order to produce a good-quality analysis of the system, we should begin with a good quality first guess (previous analysis or forecast)

• The first guess can be a climatology, any trivial state, or a state generated from the output of the previous analysis under some assumptions

