b.ritter (dwd) & r.pincus (uni of colorado)

Deutscher Wetterdienst

Monte Carlo Spectral Integration

A computationally efficient alternative to sparse radiative transfer calculations in COSMO NWP simulations

B. Ritter (DWD) & R. Pincus (Uni of Colorado)

B.Ritter, FE14 – 04/24/23

Radiative transfer in a plane parallel, horizontally homogeneous atmosphere can be described by the monochromatic radiative transfer equation (RTE):

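$$
\mu \frac{\partial L(\delta,\mu,\varphi)}{\partial \delta} = L(\delta,\mu,\varphi) - (1-\tilde{\omega})\,B(\delta) - \frac{\tilde{\omega}}{4\pi}\,S_0\,P(\cos\Theta_0)\,e^{-\delta/\mu_0} - \frac{\tilde{\omega}}{4\pi}\int_{0}^{2\pi}\!\int_{-1}^{1} P(\cos\Theta)\,L(\delta,\mu',\varphi')\,d\mu'\,d\varphi'
$$

with
$L$: monochromatic directional radiance
$\delta$: optical thickness
$\mu$, $\mu_0$: cosine of zenith angle for diffuse resp. direct radiation
$\varphi$: azimuth angle
$\tilde{\omega}$: single scattering albedo
$B$: Planck function
$P$: phase function for scattering
$\Theta$, $\Theta_0$: scattering angle for diffuse resp. direct solar radiation
$S_0$: solar constant

The terms on the right-hand side represent, in order: incoming radiance, emission, scattering of the direct beam, and scattering of diffuse radiance components.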


The efficiency problem of radiative transfer parameterisation

even though the RTE in the form shown previously already contains severe approximations (e.g. considering only one dimension), further simplifications are required to find a solution that is feasible at all within the constraints of operational NWP

the most critical issue in this context originates from the need to integrate solutions of the RTE over all energetically relevant wavelengths



integration over wavelength is severely hampered by the huge spectral variability of gaseous absorption coefficients


replacing the numerical integration over wavelength by the so-called k-distribution method leads to affordable, but still very expensive solutions of the RT problem (cf. Fu and Liou, 1992)

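$$
T = \frac{1}{\Delta\nu}\int_{\Delta\nu} e^{-k_\nu u}\,d\nu = \int_{0}^{1} e^{-k(g)\,u}\,dg \approx \sum_{g}^{G} w_g\,e^{-k_g u}
$$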
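The idea behind the k-distribution method can be illustrated with a minimal Python sketch. It is not the scheme of Fu and Liou (1992): the absorption spectrum is synthetic, and the choices of equal-width intervals in g and of the arithmetic mean as the representative k_g are assumptions made purely for illustration. The function names are hypothetical.

```python
import math
import random


def transmission_spectral(k_values, u):
    """Brute-force band transmission: average exp(-k*u) over every spectral point."""
    return sum(math.exp(-k * u) for k in k_values) / len(k_values)


def k_distribution(k_values, n_gpoints):
    """Sort k and split the cumulative probability g into equal intervals.

    Returns (w_g, k_g) pairs; the weights w_g sum to 1.
    Assumes len(k_values) >= n_gpoints so that no interval is empty.
    """
    k_sorted = sorted(k_values)
    n = len(k_sorted)
    pairs = []
    for i in range(n_gpoints):
        chunk = k_sorted[i * n // n_gpoints:(i + 1) * n // n_gpoints]
        # Illustrative choice: arithmetic mean of k within the interval.
        pairs.append((len(chunk) / n, sum(chunk) / len(chunk)))
    return pairs


def transmission_kdist(pairs, u):
    """Quadrature over g: sum of w_g * exp(-k_g * u)."""
    return sum(w * math.exp(-k_g * u) for w, k_g in pairs)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic spectrum: k spread log-uniformly over four orders of magnitude.
    k_values = [10 ** rng.uniform(-3, 1) for _ in range(20000)]
    pairs = k_distribution(k_values, 16)
    for u in (0.1, 1.0, 10.0):
        print(u, transmission_spectral(k_values, u), transmission_kdist(pairs, u))
```

In this toy setting 16 exponentials stand in for 20000 spectral points, which is why the method is affordable; each g-point still requires its own solution of the RTE, which is why it remains expensive.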


simulation of radiative transfer in the atmosphere for every column at each model time step provides the spatio-temporal distribution of corresponding fluxes as:

$$
F(x,y,z,t) = \sum_{b}^{B} w(b) \sum_{g}^{G(b)} w_g\,F_{b,g}(x,y,z,t)
$$

where the summation over spectral bands reflects the fact that the grouping of gaseous absorption coefficients is carried out within intervals where the optical properties of other constituents (e.g. cloud droplets) are considered to be constant

BUT: RT simulation as described above is still far too expensive for NWP!

Question: What means of saving CPU time are left?

Making interactive radiation in NWP models affordable

The usual compromise
Compute radiative heating rates every N time steps, apply these uniformly in time and/or employ a coarser grid for RT calculation than for other processes simulated

leads to the usual disadvantages
• flux and heating rate errors are correlated with the flow (largest where flow/development is most vigorous, smallest in quasi-stationary situations)
• no theoretical basis: no guidelines for choosing ‘optimal’ radiation time step and/or spatial resolution, and no way of knowing when this choice is affecting the solution
• difficult to assess whether optimal computational efficiency (i.e. a good cost/benefit ratio) is achieved



A schematic illustration of the effects of reduced temporal sampling

[Figure: radiative flux versus model time step (1 to 12); curves F(t) and F(t_Rad)]

change of atmospheric state between radiation time steps is not reflected in fluxes, leading to sub-optimal interaction with other processes

substantial bias may occur (e.g. time lag between radiative fluxes and diurnal cycle of cloud field)


An alternative: Monte Carlo spectral integration

Why are heating rate calculations so expensive? It’s the broadband integration - the double sum over bands and g-points!

We use the roadblock as a springboard:

$$
F(x,y,z,t) \approx B\,w(b)\,F_{b,g}(x,y,z,t)
$$

where $p(b) = 1/B$ and $p(g) = w_g$

This is a Monte Carlo sample of the calculation we’d like to do but can’t afford. A single estimate is (very) noisy but many estimates converge to the right answer.

(cf. Pincus & Stevens, 2008)
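The convergence claim can be checked with a small Python sketch. It assumes the broadband flux is a weighted double sum, F = sum over b of w(b) times sum over g of w_g F_{b,g}, with the g-point weights summing to 1 within each band; a band is drawn with p(b) = 1/B and a g-point with p(g) = w_g. All numbers, names and the band weights are invented for illustration and are not taken from Pincus & Stevens (2008) or from any radiation scheme.

```python
import random

# Hypothetical toy spectrum: 3 bands with 2, 3 and 1 g-points.
BAND_WEIGHTS = [0.5, 0.3, 0.2]                     # w(b)
G_WEIGHTS = [[0.6, 0.4], [0.5, 0.3, 0.2], [1.0]]   # w_g, summing to 1 per band
FLUX = [[100.0, 40.0], [80.0, 30.0, 10.0], [20.0]] # F_{b,g} in W/m2


def full_flux():
    """Full broadband integration: the double sum over bands and g-points."""
    return sum(
        BAND_WEIGHTS[b] * sum(w * f for w, f in zip(G_WEIGHTS[b], FLUX[b]))
        for b in range(len(BAND_WEIGHTS))
    )


def mcsi_sample(rng):
    """One MCSI estimate: a single (band, g-point) pair, rescaled by B * w(b)."""
    n_bands = len(BAND_WEIGHTS)
    b = rng.randrange(n_bands)                                            # p(b) = 1/B
    g = rng.choices(range(len(G_WEIGHTS[b])), weights=G_WEIGHTS[b])[0]   # p(g) = w_g
    return n_bands * BAND_WEIGHTS[b] * FLUX[b][g]


def mcsi_mean(n_samples, rng):
    """Average of many independent single-sample estimates."""
    return sum(mcsi_sample(rng) for _ in range(n_samples)) / n_samples


if __name__ == "__main__":
    rng = random.Random(1)
    print("full:", full_flux())
    for n in (1, 100, 10000):
        print(n, mcsi_mean(n, rng))
```

Each sample costs one flux calculation instead of one per g-point in every band, and the expectation of the rescaled sample equals the full double sum, so the noise averages out rather than accumulating as a bias.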


A schematic depiction of the MCSI approach


Classical approach: low frequency RT calculations


MCSI approach: high frequency random sampling



A schematic illustration of the advantages of the MCSI approach over the classical approach (low frequency, deterministic sampling)

[Figure: radiative flux versus model time step (1 to 12); curves F(t), F(t_Rad) and F(t)_MC]

MCSI approach introduces noise in fluxes but responds to changes in atmospheric state immediately

no bias occurs for sufficiently large sample size
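The contrast between the two sampling strategies can be reproduced in a toy experiment. The Python sketch below is an illustration under strong assumptions, not a model of COSMO: the spectrum is reduced to four invented terms, the atmospheric state is a single scale factor that doubles linearly over the run, and all columns share the same state. The classical approach performs the full calculation every n_hold steps and holds the result; MCSI draws one random spectral term per column at every step.

```python
import random

WEIGHTS = [0.4, 0.3, 0.2, 0.1]        # hypothetical spectral weights, sum to 1
FLUXES = [120.0, 60.0, 30.0, 10.0]    # hypothetical fluxes per term, W/m2


def scale(step, n_steps):
    """Hypothetical evolving atmospheric state: fluxes double over the run."""
    return 1.0 + step / n_steps


def time_mean_errors(n_steps, n_hold, n_columns, rng):
    """Return (held-flux error, MCSI error) of the time and domain mean flux."""
    exact_full = sum(w * f for w, f in zip(WEIGHTS, FLUXES))
    sum_true = sum_held = sum_mc = 0.0
    held = 0.0
    for step in range(n_steps):
        s = scale(step, n_steps)
        exact = s * exact_full
        if step % n_hold == 0:
            held = exact              # full calculation only every n_hold steps
        # MCSI: one random spectral term per column and step, p(g) = w_g.
        picks = rng.choices(FLUXES, weights=WEIGHTS, k=n_columns)
        sum_true += exact
        sum_held += held
        sum_mc += s * sum(picks) / n_columns
    return (sum_held - sum_true) / n_steps, (sum_mc - sum_true) / n_steps


if __name__ == "__main__":
    held_err, mc_err = time_mean_errors(360, 36, 200, random.Random(2))
    print("held flux error:", held_err)
    print("MCSI error:     ", mc_err)
```

In this setup the held flux lags the rising true flux and is systematically too low, identically in every column, whereas the MCSI error is random and shrinks as more columns and time steps are averaged.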

(Pincus and Stevens, 2008)

Is MCSI a valid approximation for atmospheric simulations?
Can the approach be used in the COSMO NWP model?

• Pincus & Stevens (2008) demonstrate the successful application of the MCSI approach in LES model simulations of the evolution of nocturnal stratocumulus fields.

• The radiative transfer scheme of the COSMO model (cf. Ritter and Geleyn, 1992) can be modified easily so that an MCSI-like behaviour is achieved.

• Some proof-of-concept tests demonstrate that MCSI may be used successfully in COSMO in order to overcome problems associated with low temporal frequency radiation calculations


Convergence test for the MCSI approach implemented in RG92 RT-scheme

Example: MCSI average results for solar clear sky case at bottom of atmosphere

[Figure: average flux in W/m² (axis from 365 to 410) versus number of calls NCalls (10 to 1,000,000, logarithmic axis); curves orig, soft, full]

The ‘soft’ version of MCSI:
• retains the loop over spectral bands
• is less noisy than the original MCSI version
• is less efficient than the original, but still much faster than standard RT calculations
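The trade-off between the original and the ‘soft’ variant can be sketched in Python as follows. This is an illustration with invented numbers, not the RG92 implementation: it assumes a weighted double sum over bands and g-points, where the original variant draws one band and one g-point per call, while the soft variant keeps the loop over all bands and draws one g-point within each band with p(g) = w_g.

```python
import random
import statistics

# Hypothetical toy spectrum: 3 bands with 2, 3 and 1 g-points.
BAND_WEIGHTS = [0.5, 0.3, 0.2]
G_WEIGHTS = [[0.6, 0.4], [0.5, 0.3, 0.2], [1.0]]
FLUX = [[100.0, 40.0], [80.0, 30.0, 10.0], [20.0]]  # W/m2


def pick_g(b, rng):
    """Draw a g-point index within band b with probability w_g."""
    return rng.choices(range(len(G_WEIGHTS[b])), weights=G_WEIGHTS[b])[0]


def orig_sample(rng):
    """Original MCSI: one random band and g-point; cost is 1 flux calculation."""
    n_bands = len(BAND_WEIGHTS)
    b = rng.randrange(n_bands)
    return n_bands * BAND_WEIGHTS[b] * FLUX[b][pick_g(b, rng)]


def soft_sample(rng):
    """Soft MCSI: loop over all bands, one random g-point each; cost is B calculations."""
    return sum(
        BAND_WEIGHTS[b] * FLUX[b][pick_g(b, rng)]
        for b in range(len(BAND_WEIGHTS))
    )


def mean_and_noise(sampler, n_samples, rng):
    """Sample mean and standard deviation of single-call estimates."""
    values = [sampler(rng) for _ in range(n_samples)]
    return statistics.fmean(values), statistics.stdev(values)


if __name__ == "__main__":
    rng = random.Random(3)
    print("orig:", mean_and_noise(orig_sample, 20000, rng))
    print("soft:", mean_and_noise(soft_sample, 20000, rng))
```

Both variants average to the same flux; the soft variant removes the sampling noise between bands at the price of B flux calculations per call, compared with the sum of all G(b) for the full integration.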

Experiments with COSMO-DE (Version 4.14)

Initial date: 20100808 12 UTC
• Experiment 1: Operational DWD configuration, i.e. hincrad=0.25, lradf_avg=.true.
• Experiment 2: as 1), but lradf_avg=.false., i.e. RT is calculated at every grid point
• Experiment 3: as 2), but soft MCSI approach instead of full spectral integration
• Experiment 4: as 3), but nincrad=1, i.e. ‘radiation time step’ = ‘dynamics time step’
• Experiment 5: as 2), but nincrad=1, i.e. ‘radiation time step’ = ‘dynamics time step’

Experiment 5 can be considered as ‘reference’!

Comparison of hourly precipitation rates

[Figure panels: Reference | Experiment 4, i.e. MCSI at each time step and grid point]

[Figure: Experiment 3, i.e. MCSI at each grid point, but nincrad=36]

[Figure: Experiment 1, i.e. operational COSMO-DE configuration]

Comparison of T2m at end of forecast range

[Figure panels: Reference | Experiment 4]

MCSI introduces some small scale noise, but no bias


Comparison of T2m at end of forecast range

[Figure panels: Experiment 1, operational configuration | Experiment 2, no spatial averaging]

Impact of ‘coarse radiation grid’ is at least as large as that of MCSI

Computational efficiency

Comparison of computational efficiency: COSMO-DE 21h forecast on DWD NEC SX9, NTASKS=8

[Figure: bar chart of wall-clock time in seconds (logarithmic axis, 10 to 10000) for Routine, NoAVE_36, NoAVE_36_MCSI, NoAVE_01 and NoAVE_01_MCSI]

Configuration    WallClock Total [s]   WallClock Radiation [s]
Routine          1415                  61
NoAVE_36         1549                  180
NoAVE_36_MCSI    1379                  36
NoAVE_01         8046                  6269
NoAVE_01_MCSI    2748                  1306

• Standard RT calculations at each time step and grid point would blow the computational budget available for operational NWP

• RT calculations employing the ‚soft‘ MCSI approach are approximately 5 times faster than the full RG92 scheme

• through further code optimization a theoretically possible speed-up factor of ~10 may be achieved

• Using the MCSI approach and a fairly small additional investment of CPU time, we could avoid the downscaling & call radiation more often
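The speed-up quoted above follows directly from the wall-clock figures in the table. The short Python sketch below only redoes that arithmetic; the dictionary layout and function names are choices made for this illustration.

```python
# Wall-clock seconds from the table: (total, radiation).
TIMINGS = {
    "Routine": (1415, 61),
    "NoAVE_36": (1549, 180),
    "NoAVE_36_MCSI": (1379, 36),
    "NoAVE_01": (8046, 6269),
    "NoAVE_01_MCSI": (2748, 1306),
}


def radiation_speedup(full, mcsi):
    """Ratio of radiation wall-clock times, full scheme over MCSI variant."""
    return TIMINGS[full][1] / TIMINGS[mcsi][1]


def radiation_share(config):
    """Fraction of the total wall-clock time spent in radiation."""
    total, radiation = TIMINGS[config]
    return radiation / total


if __name__ == "__main__":
    print(radiation_speedup("NoAVE_36", "NoAVE_36_MCSI"))   # radiation every 36 steps
    print(radiation_speedup("NoAVE_01", "NoAVE_01_MCSI"))   # radiation every step
    for name in TIMINGS:
        print(name, round(radiation_share(name), 3))
```

Both pairs of runs give a radiation speed-up close to 5, and with full RT at every time step and grid point radiation accounts for more than three quarters of the total wall-clock time.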


Pitfalls resulting from naive use of RNG in parallel architectures

Code like x=random_number() will provide a useable series of random numbers on a serial machine, but
• every random number generator keeps track of its ‘state’ via a (set of) global variable(s)
• in a distributed memory architecture, the Single Program – Multiple Memory concept for parallelisation will lead to identical sequences of random numbers for each processor if no attention is paid to the seeding/initialization of the RNG
• in a shared memory architecture, where both the program and the state of the RNG are shared between processors, the sequence of random numbers obtained by an individual processor may depend on the work balance between processes and/or the domain decomposition, leading to non-reproducible results

If undesirable side effects are to be avoided, care in the choice and application of the RNG is essential!


Pitfalls resulting from naive use of RNG in parallel architectures: here ‘careless use on distributed memory machine’

[Figure: Random Number Sequence on 4 Tasks of NEC SX9 w/o specific seeding; random number (0 to 1) versus random number sequence no. within each task (1 to 41); curves Task 0 to Task 3]

Identical random number sequences on each task (=sub-domain) are definitely not what we want for MCSI!


Pitfalls resulting from naive use of RNG in parallel architectures

[Figure: Random Number Sequence on 4 Tasks of NEC SX9 after proper seeding; random number (0 to 1) versus random number sequence no. within each task (1 to 41); curves Task 0 to Task 3]

Proper seeding for each task individually avoids identical sequences, but reproducibility is only ensured if the domain decomposition is not changed!
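The three situations can be mimicked with Python's standard `random` module, where each `random.Random` instance plays the role of one task's private generator state. This is an analogy only: it does not use the NEC generator, and the per-column seeding shown last is one possible way to become independent of the domain decomposition, not something taken from the COSMO implementation. The seed formula and all function names are hypothetical, and consecutively seeded streams are not guaranteed to be statistically independent.

```python
import random


def task_sequences(n_tasks, n_draws, seed_for_task):
    """Each task owns its generator state, as on a distributed-memory machine."""
    sequences = []
    for task in range(n_tasks):
        rng = random.Random(seed_for_task(task))
        sequences.append([rng.random() for _ in range(n_draws)])
    return sequences


def column_draw(base_seed, column, step):
    """Draw whose seed depends only on the global column index and the time step."""
    seed = (base_seed * 1_000_003 + column) * 1_000_003 + step
    return random.Random(seed).random()


def field(base_seed, n_columns, n_tasks, step):
    """Split columns into contiguous blocks per task and collect the draws."""
    block = -(-n_columns // n_tasks)   # ceiling division
    result = [0.0] * n_columns
    for task in range(n_tasks):
        for column in range(task * block, min((task + 1) * block, n_columns)):
            result[column] = column_draw(base_seed, column, step)
    return result


if __name__ == "__main__":
    same = task_sequences(4, 3, lambda task: 42)            # no task-specific seeding
    distinct = task_sequences(4, 3, lambda task: 42 + task) # individual seed per task
    print("identical on all tasks:", all(seq == same[0] for seq in same))
    print("distinct per task:     ", len({tuple(seq) for seq in distinct}) == 4)
    print("independent of decomposition:", field(42, 12, 2, 0) == field(42, 12, 4, 0))
```

Creating a generator per column and time step is costly; in practice a counter-based or skip-ahead generator would serve the same purpose, but that choice lies outside what the slides describe.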


Use of NEC random number generator for MCSI in COSMO-DE

• individual seeding for each task
• check of ‘randomness’ in space & time of random numbers obtained in RT scheme

Summary and Conclusions

MCSI provides an opportunity to overcome some shortcomings of classical approaches to deal with the efficiency problem of RT calculations

Applying the MCSI approach in the framework of the COSMO NWP model demonstrated:
• the introduction as a variation of the RG92 radiation scheme poses no major problem
• no evidence of significant deterioration of critical forecast products was found in a forecast experiment
• even the ‘soft’ version of MCSI is much faster than the standard RT scheme
• special attention is necessary to ensure that the mechanism employed for the generation of random numbers does not lead to undesirable side effects

A closer inspection of this approach appears to be worth the effort!
