TRANSCRIPT
2013 DOE/EU Retrieval Workshop, Köln
“Common Algorithm Evaluation Approaches” Session
Shaocheng Xie, Kerstin Ebell, Dave Turner, and Ulrich Löhnert
EU/DOE Ground-based Cloud and Precipitation Retrieval Workshop, 13-14 May 2013, Cologne, Germany
Goals
• Identify common algorithm evaluation approaches for retrieval development and uncertainty quantification
• Identify group activities to address the challenging issues that arose from previous intercomparison studies
What Will Be Covered?
• Common algorithm evaluation approaches
  – Observation System Simulation Experiments (OSSEs)
  – In-situ comparisons
  – Radiative closure
  – Comparison with other retrieval datasets (satellite, other instruments)
  – Intercomparison of retrievals
• Talks (~40 minutes)
  – Dave Turner: Some examples of using these approaches to evaluate cloud retrievals
  – Shaocheng Xie: Questions for discussion
• Discussion (~50 minutes)
Earlier Efforts Made by EU/DOE
Major cloud retrieval intercomparison studies:
• EG-CLIMET
  – Löhnert, Donovan, Ebell, et al. (2013): Assessment of ground-based cloud liquid water profiling retrieval techniques (3 algorithms; Continental Stratus – CABAUW; Maritime Stratocumulus – ASTEX; liquid only; both OSSEs and real cases)
• DOE
  – Comstock et al. (2007): High-level ice clouds (16 algorithms, SGP March 2000 IOP)
  – Turner et al. (2007): Optically thin liquid clouds (18 algorithms, SGP March 2000 IOP)
  – Shupe et al. (2008): Mixed-phase clouds (8 algorithms, NSA-MPACE)
• Limitations of the instruments, and uncertainties in the measurements and in the assumptions used by the algorithms, account for a significant portion of the differences
• The accuracy varies greatly with instrument, analysis method, and cloud type
• No single retrieval method works properly for all instruments and all cloud conditions
Evaluating Retrieval Algorithm Results
DOE / EU Ground-based Cloud and Precipitation Retrieval Workshop
Evaluating Different Retrievals: Which is Better? (1)
• Observation System Simulation Experiments (OSSEs)
  – Start with known atmospheric conditions
  – Use a forward model to compute radiance/backscatter and, by adding realistic random noise, create the 'observations'
  – Retrievals applied to these observations can be compared against a known truth
  – Do the simulations cover the entire range of possible conditions?
  – Assumes that the forward model is 'perfect'
  – Biases are generally not evaluated here
  – However, careful, well-constructed simulations can be quite illuminating, especially when comparing the sensitivities of different instruments and techniques
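The OSSE loop described above (known truth → forward model → noise → retrieval → comparison) can be sketched in a few lines. This is a toy illustration under stated assumptions: the one-channel forward model, noise level, and retrieval below are made-up stand-ins, not any instrument or algorithm from the workshop.

```python
# Minimal OSSE sketch: truth -> forward model -> noisy 'observations'
# -> retrieval -> comparison against the known truth.
import numpy as np

rng = np.random.default_rng(0)

def forward_model(lwc):
    """Toy forward model: map LWC (g m^-3) to a saturating signal."""
    return 1.0 - np.exp(-2.0 * lwc)

def retrieval(obs):
    """Exact inverse of the toy forward model, i.e. the forward model
    is assumed 'perfect', as the slide notes."""
    return -0.5 * np.log(np.clip(1.0 - obs, 1e-6, None))

truth = rng.uniform(0.05, 0.5, size=1000)                       # known states
obs = forward_model(truth) + rng.normal(0.0, 0.01, truth.size)  # add noise
retrieved = retrieval(obs)

# With the truth known, random and systematic errors are directly computable.
bias = float(np.mean(retrieved - truth))      # systematic error
rand_err = float(np.std(retrieved - truth))   # random error
print(f"bias = {bias:.4f} g m-3, random error = {rand_err:.4f} g m-3")
```

Because the same forward model is used to simulate and to invert, the residual error here reflects only the added noise; a real OSSE additionally has to worry about whether the simulations span the conditions of interest.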
Assessment of ground-based cloud liquid water profiling retrieval techniques
Ulrich Löhnert (University of Cologne)
Dave Donovan (KNMI)
Kerstin Ebell (University of Cologne)
Giovanni Martucci (Galway University)
Simone Placidi (TU Delft)
Christine Brandau (TU Delft)
Ewan O’Connor (FMI/University of Reading)
Herman Russchenberg (TU Delft)
MISSION
To recommend an optimum European network of economical and unmanned ground-based profiling stations for observing winds, humidity, temperature, and clouds (together with associated errors), for use in evaluating climate and NWP models on a global and high-resolution (typically 1 km) scale, and ultimately for assimilating into NWP.
European Ground-based Observations of Essential Variables for Climate and Operational Meteorology
EG-CLIMET: ES-0702, www.eg-climet.org, 2008-2012, 16 countries, 13 national weather services
European COST action EG-CLIMET
Objective of this study
Evaluate current liquid cloud profiling techniques:
• identify errors and discuss assumptions
• correct for errors
→ recommendations for an optimal retrieval method

Simulation case
• "truth" known
• direct evaluation
• need to simulate measurements

Real case
• application to real measurements
• evaluation with radiative closure

ECSIM (EarthCARE Simulator)
• radar, lidar, microwave radiometer
• SW, LW fluxes
Overview of measurements and parameters
... measurements which are used ...
• Lidar/cloud radar: cloud base & top (Cloudnet TC)
• Z: cloud radar reflectivity factor (dBZ)
• MWR: brightness temperature TB (K)
• LWP: MWR liquid water path (g m-2)
... parameters to be retrieved ...
• LWC: liquid water content (g m-3)
• Reff: cloud droplet effective radius (μm)
• N0: cloud droplet concentration (cm-3)
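The radar reflectivity above is listed in dBZ, but moment-based retrievals typically operate on the linear reflectivity factor. A small helper for the standard conversion (general radar convention, not code from this study):

```python
# Standard dBZ <-> linear reflectivity conversions.
import numpy as np

def dbz_to_linear(dbz):
    """dBZ -> linear reflectivity factor (mm^6 m^-3)."""
    return 10.0 ** (np.asarray(dbz, dtype=float) / 10.0)

def linear_to_dbz(z_lin):
    """Linear reflectivity factor (mm^6 m^-3) -> dBZ."""
    return 10.0 * np.log10(np.asarray(z_lin, dtype=float))

print(dbz_to_linear(-30.0))   # weak, non-drizzling liquid cloud: 0.001 mm^6 m^-3
```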
Retrievals
BRANDAU (Brandau et al. / Frisch et al.): retrieve LWC(z), Reff(z), and N0
• input: MWR LWP and radar Z, cloud boundaries
• uni-modal drop size distribution
• relation between moments (2nd and 3rd) of the DSD (Brenguier et al. 2011)

Cloudnet (O'Connor et al.): retrieve LWC(z)
• input: MWR LWP, cloud boundaries & temperature
• linearly scaled adiabatic LWC, non-drizzle

IPT (Löhnert et al. / Ebell et al.): retrieve LWC(z), Reff(z)
• input: MWR TB, radar Z and a priori LWC, cloud boundaries, Cloudnet TC
• minimize a cost function to match TB, the a priori LWC profiles, and a radar Z-LWC relation; Reff according to Frisch et al. (2002)

All retrievals use common "measurements" and cloud boundaries
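Two of the LWP-constrained scaling ideas above can be sketched compactly. This is a simplified illustration, not the operational Cloudnet or Brandau code: the constant adiabatic lapse rate `gamma_ad` is a made-up stand-in (real implementations derive it from temperature and pressure, and screen for drizzle), and the second function just captures the Frisch-style idea of distributing the MWR LWP across radar gates in proportion to sqrt(Z).

```python
# Sketch of two LWP-constrained LWC profile scalings (simplified).
import numpy as np

def lwc_scaled_adiabatic(z, lwp, gamma_ad=2.0e-3):
    """Cloudnet-style linearly scaled adiabatic profile.
    z: heights above cloud base (m); lwp: g m^-2;
    gamma_ad: assumed adiabatic LWC increase (g m^-3 per m, illustrative)."""
    lwc_ad = gamma_ad * z                  # adiabatic (linear) shape
    dz = np.gradient(z)
    scale = lwp / np.sum(lwc_ad * dz)      # force column integral == LWP
    return scale * lwc_ad

def lwc_z_weighted(z_lin, lwp, dz):
    """Frisch-style split of LWP in proportion to sqrt(Z).
    z_lin: reflectivity in linear units (mm^6 m^-3); dz: gate spacing (m)."""
    w = np.sqrt(z_lin)
    return lwp * w / (np.sum(w) * dz)      # LWC per gate, g m^-3

heights = np.linspace(30.0, 300.0, 10)     # gates above cloud base
lwc = lwc_scaled_adiabatic(heights, lwp=100.0)
print(np.sum(lwc * np.gradient(heights)))  # integrates back to ~100 g m^-2
```

Both sketches force the vertical integral of LWC to equal the MWR LWP, which is why LWP accuracy is crucial, as the CABAUW sensitivity case below shows.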
Simulated case #1: Continental Stratus
One continental case simulated with LES
• CABAUW: only bulk microphysics (LWC), reasonable drop size distribution assumed (uni-modal), non-drizzling, only liquid
CABAUW case: Cloudnet
• large random error due to linear scaling
• random error: 50-60%; systematic error: ~0%
[Figure: retrieved vs. true profiles; axis: LWC / g m-3]
CABAUW case: Brandau LWC & Reff
• LWC: random error <10%; systematic error <10%
• Reff: random error <5%; systematic error ~10%
[Figure: red: retrieval, black: "truth"; axes: LWC / g m-3, Reff / μm]
CABAUW case: Brandau with LWP error
LWP error: +25 g m-2 (for 20 < LWP < 50 g m-2)
• LWC: random error <5%; systematic error ~15%
• Reff: random error ~10%; systematic error ~50%
LWP accuracy crucial!
[Figure: retrieved vs. true profiles; axes: LWC / g m-3, Reff / μm]
CABAUW case: IPT LWC & Reff
• LWC: random error <15%; systematic error 20-35%
• Reff: random error ~5%; systematic error up to 70%
[Figure: red: retrieval, black: "truth"; axes: LWC / g m-3, Reff / μm]
Simulated case #2: Maritime Stratocumulus
One maritime case simulated with LES
• ASTEX: spectrally resolved microphysics, low LWP (< 100 g m-2), partially drizzling, only liquid
ASTEX: Brandau
• LWC: random error 20-50%; systematic error ~50%
• Reff: random error >100%; systematic error ~50%
• drop size distribution no longer uni-modal: a small number of drizzle droplets leads to Reff overestimation
[Figure: red: retrieval, black: "truth"; axes: LWC / g m-3, Reff / μm]
ASTEX: IPT
• LWC: random error 30-50%; systematic error <30%
• Reff: random error ~50%; systematic error <60%
• fairly robust LWC profile in the drizzle-"contaminated" region
[Figure: red: retrieval, black: "truth"; axes: LWC / g m-3, Reff / μm]
Evaluating Different Retrievals: Which is Better? (2)
• Comparisons with other retrieved datasets
  – Which is truth?
[Figure: different retrievals applied in single-layer warm liquid clouds]
Evaluating Different Retrievals: Which is Better? (3)
• Comparisons against in-situ observations
  – Sampling volumes can be quite different
  – Temporal and spatial sampling issues
  – Statistics are key
  – Aircraft are expensive
  – In-situ obs aren't necessarily truth!
[Figure: radar sampling volume (45 m × 30 m) vs. aircraft track at 50-100 m/s]
Evaluating Different Retrievals: Which is Better? (4)
• Closure exercises can provide a robust test, if the closure variable is independent of the retrieval
• Broadband fluxes are critical for ARM, so agreement in radiative flux is a nice metric to use
• Compute broadband fluxes to compare with observed fluxes at the surface
  – Use retrieved cloud properties as input (what we want to test)
  – Also need ancillary observations (T/q profiles, surface albedo, aerosol properties, etc.)
  – Cloud fraction is an important modulator of the observed flux, so need to select cases that are 'homogeneously' overcast
  – Generally evaluate improvement in RMS, not in bias
• This is a necessary, but not sufficient, closure exercise
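The RMS-vs-bias distinction in the bullets above can be made concrete with a short calculation; the flux values below are invented for illustration only.

```python
# Bias vs. RMS of a flux closure comparison (illustrative values).
import numpy as np

f_obs = np.array([310.0, 295.0, 280.0, 305.0, 290.0])  # observed surface SW flux (W m-2)
f_mod = np.array([300.0, 301.0, 275.0, 312.0, 284.0])  # computed from retrieved cloud properties

diff = f_mod - f_obs
bias = float(diff.mean())                # systematic offset
rms = float(np.sqrt(np.mean(diff**2)))   # total scatter: the metric to reduce
print(f"bias = {bias:+.1f} W m-2, RMS = {rms:.1f} W m-2")
```

A retrieval change can reduce the scatter (RMS) without moving the mean offset, or vice versa, which is why the slide recommends judging improvement by RMS rather than by bias alone.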
Radiative Closure Example
• Using ARM Mobile Facility data from Pt. Reyes, California, in July-August 2005
  – Overcast ~85% of the time
  – Clouds were low altitude (~500 m) and warm (i.e., liquid only)
  – Very few cases with overlapping higher-altitude clouds
• Evaluated two different passive retrieval methods
  – Combined MWR+AERI retrieval of LWP and Reff
  – MWR-only retrieved LWP (used AERI+MWR retrieval for Reff)
• Comparison in SW radiative flux at the surface and TOA
  – Should not use LW flux, since AERI measures LW radiance
  – Compare both BIAS and RMS of the flux differences
Turner JGR 2007
2-Day Example from Pt. Reyes
Turner JGR 2007
2-Day Example from Pt. Reyes
Combined AERI+MWR retrieval has smaller bias in SW surface flux
Turner JGR 2007
Surface SW Closure Exercise
• Similar exercise as LW closure
• No aerosols included in the calculations
• MIXCRA shows negative bias, but small amount of aerosol would improve results (and worsen MWRRET results)
• Variance in MIXCRA results is much lower than variance in MWRRET results for LWP below 120 g/m2
Turner JGR 2007
TOA SW Closure Exercise
• Similar exercise as SW surface closure
• No aerosols included in the calculations
• Both methods show negative bias, but small amount of aerosol would improve result slightly
• Unable to get agreement with both surface and TOA by changing LWP
• Variance in MIXCRA results is much lower than variance in MWRRET results for LWP below 100 g/m2
Turner JGR 2007
Questions for discussion
• What are the major issues in using these evaluation approaches for algorithm development and uncertainty quantification?
• What are the strategies for better using these algorithm evaluation approaches?
• What are the key areas that need the EU/DOE retrieval community to work together to improve algorithm evaluation approaches?
• What are our future plans?
Some thoughts on these questions
Q#1: What are the major issues in using these evaluation approaches for algorithm development and uncertainty quantification?
• No single approach is perfect
  – OSSEs: simulations may not cover the entire range of possible conditions, and the forward model is not perfect
  – In-situ data: sampling issues, uncertainties, limited cases
  – Radiative closure: uncertainties in the other input data and in the surface/TOA radiative fluxes; cannot evaluate the vertical structure of cloud properties
  – Intercomparison of different retrievals: differences in retrieval basis, parameters, and underlying assumptions, as well as in input and constraint parameters
  – Comparison with other retrievals (e.g., MFRSR, satellite): none is truth!
Q#2: What are the strategies for better using these algorithm evaluation approaches?
• Need to identify which types of retrievals are of interest to the EU/DOE joint effort
  – We may focus only on algorithms that retrieve cloud properties from radar, lidar, and radiometer, since these are available at both ARM and European sites and provide long-term continuous retrieval data
• Uncertainty is large in both retrieved products and in-situ observations
  – Statistics could reduce the uncertainty
  – Case studies vs. statistical evaluations
• Can new instruments help?
• What is critically needed for algorithm development and uncertainty quantification?
• What is critically needed by the modeling community?
  – Error bars; statistics for various types of clouds
Q#2: What are the strategies for better using these algorithm evaluation approaches?
• Possible improvements
  – OSSEs: develop more OSSE cases covering various types of clouds
  – In-situ data: statistics are key; need long-term continuous observations to build up statistics for different cloud types
  – Radiative closure: facilitate the evaluation of retrievals using BBHRP
  – Intercomparison of different retrievals: a common input dataset and a consistent set of assumptions
  – Comparison with other retrievals (e.g., MFRSR): is there a consensus in the community on which instruments or retrievals are more reliable for a particular cloud parameter? Compare with retrievals from new instruments?
Q#2: What are the strategies for better using these algorithm evaluation approaches?
• Develop a cloud retrieval testbed
  – Suitable for both case studies and statistical evaluation
  – Combine the strengths of these common algorithm evaluation approaches
  – Build up a cloud retrieval test case library that includes OSSE cases as well as radar, lidar, and radiometer measurements co-located with reference in-situ microphysical parameters
  – Build up a common input dataset and use consistent assumptions for key parameters in current retrieval algorithms
  – Make use of BBHRP
  – Quantify uncertainties in the validation datasets
  – Develop statistics for each cloud type based on long-term observations
Q#3: What are the key areas that need the EU/DOE retrieval community to work together to improve algorithm evaluation approaches?
• Develop the cloud retrieval testbed
  – Data sharing
  – Algorithm sharing
• Intercomparison studies on both retrieval algorithms and forward models
Q#4: What are our future plans?
• Collaborate with other existing science focus groups (e.g., ASR QUICR, IcePro; EG-CLIMET)
• Develop the cloud retrieval testbed
• Intercomparison studies
  – retrievals
  – forward models
  – key assumptions
• Future workshops (coordinated with ASR and AGU meetings)
Discussion on Common Algorithm Evaluation Approaches