
Page 1: Information Content

Tristan L’Ecuyer

Page 2: Historical Perspective

Information theory has its roots in telecommunications and specifically in addressing the engineering problem of transmitting signals over noisy channels.

Papers in 1924 and 1928 by Harry Nyquist and Ralph Hartley, respectively, introduced the notion of information as a measurable quantity representing the ability of a receiver to distinguish different sequences of symbols.

The formal theory begins with Shannon (1948), the first to establish the connection between information content and entropy.

Since this seminal work, information theory has grown into a broad and deep mathematical field with applications in data communication, data compression, error-correction, and cryptographic algorithms (codes and ciphers).

Claude Shannon (1948), “A Mathematical Theory of Communication”, Bell System Technical Journal 27, pp. 379-423 and 623-656.

Page 3: Link to Remote Sensing

Shannon (1948): “The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point.”

Similarly, the fundamental goal of remote sensing is to use measurements to reproduce a set of geophysical parameters, the “message”, that are defined or “selected” in the atmosphere at the remote point of observation (e.g., a satellite).

Information theory makes it possible to examine the capacity of transmission channels (usually in bits), accounting for noise, signal gaps, and other forms of signal degradation.

Likewise in remote sensing we can use information theory to examine the “capacity” of a combination of measurements to convey information about the geophysical parameters of interest accounting for “noise” due to measurement error and model error.

Page 4: Corrupting the Message: Noise and Non-uniqueness

Measurement and model error as well as the character of the forward model all introduce non-uniqueness in the solution.

[Figure: for a fixed measurement uncertainty ∆y, the solution uncertainty ∆x grows with the nonlinearity of the forward model, ∆x(linear) < ∆x(quadratic) < ∆x(cubic), admitting unwanted solutions]

Page 5: Forward Model Errors (∆y)

Uncertainty due to unknown “influence parameters” that impact forward model calculations but are not directly retrieved often represents the largest source of retrieval error.

Errors in these parameters introduce non-uniqueness in the solution space by broadening the effective measurement PDF.

Forward problem: y = F(x; b) + ε, where b are the “influence” parameters and ε represents forward model errors and measurement error.

Inverse problem: x = F⁻¹(y; b), where measurement error, forward model errors, and uncertainty in the “influence” parameters all produce errors in the inversion.

Page 6: Error Propagation in Inversion

Bi-variate PDF of (sim. – obs.) measurements. Width dictated by measurement error and uncertainty in forward model assumptions.

[Figure: bivariate PDF in R(0.64 μm) and R(2.13 μm) space centered on the observation, with widths σ_TB and σ_∆TB]

The error in the retrieved product follows from the width of the posterior distribution obtained by applying Bayes’ theorem.

[Figure: posterior PDF in τ and Reff space centered on the solution, with widths σ_τ and σ_Reff]

Page 7: Visible Ice Cloud Retrievals

[Figure: Nakajima-King diagram of 2.13 μm reflectance vs. 0.66 μm reflectance, with curves for τ = 2, 10, 20, 30, 50 and Re = 8, 12, 24, 48 μm. Example retrievals: τ = 45±5, Re = 11±2 and τ = 18±2, Re = 19±2; due to assumptions: τ = 16-50, Re = 9-21]

Nakajima and King (1990) technique based on a conservative-scattering visible channel for optical depth and an absorbing near-IR channel for Reff.

Influence parameters are crystal habit, particle size distribution, and surface albedo.

Page 8: CloudSat Snowfall Retrievals

Snowfall retrievals relate reflectivity, Z, to snowfall rate, S. This relationship depends on snow crystal shape, density, size distribution, and fall speed. Since few, if any, of these factors can be retrieved from reflectivity alone, they all broaden the Z-S relationship and lead to uncertainty in the retrieved snowfall rate.
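The broadening can be sketched numerically; the Z = a·S^b coefficients below are hypothetical stand-ins for two shape assumptions, not the lecture’s values:

```python
# Hypothetical Z = a * S**b power laws for two crystal-shape assumptions
# (coefficients are illustrative stand-ins, not the lecture's values).
relations = {
    "hex columns":    (120.0, 1.9),
    "6-arm rosettes": (250.0, 2.0),
}

Z_dBZ = 15.0                       # an observed reflectivity
Z = 10 ** (Z_dBZ / 10)             # dBZe -> linear units

# Invert S = (Z / a)**(1 / b) under each assumption: the spread between the
# retrieved rates is forward-model (assumption) uncertainty.
retrieved = {name: (Z / a) ** (1 / b) for name, (a, b) in relations.items()}
for name, S in retrieved.items():
    print(f"{name}: S = {S:.2f} mm/h")
```

The same observed reflectivity yields different snowfall rates under each assumption, which is exactly the non-uniqueness described above.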

Page 9: Impacts of Crystal Shape (2-7 dBZ)

[Figure: reflectivity (dBZe) vs. snowfall rate (mm h-1) for hex columns and 4-, 6-, and 8-arm rosettes]

Page 10: Impacts of PSD (3-6 dBZ)

[Left figure: sensitivity to ν, reflectivity (dBZe) vs. snowfall rate (mm h-1) for ν = 0, 1, 2]

[Right figure: sensitivity to PSD shape, reflectivity (dBZe) vs. snowfall rate (mm h-1) for Sekhon/Srivastava coefficients and a & b = ±10%]

N(D) = N₀ D^ν exp(−ΛD),  Λ = αS^β,  N₀ = aS^b
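A minimal sketch of this size distribution; the default coefficient values below are placeholders for illustration, not the coefficients used on the slide:

```python
import numpy as np

# Size distribution of the form N(D) = N0 * D**nu * exp(-lam * D), with
# snowfall-rate-dependent parameters lam = alpha * S**beta and N0 = a * S**b.
# All default coefficient values are illustrative placeholders.
def psd(D, S, nu=0.0, alpha=22.9, beta=-0.45, a=2.5e6, b=-0.94):
    lam = alpha * S ** beta
    N0 = a * S ** b
    return N0 * D ** nu * np.exp(-lam * D)

# Changing the shape parameter nu redistributes particles across sizes,
# one of the assumptions that broadens the Z-S relationship.
D = np.linspace(0.01, 2.0, 200)    # particle dimension (arbitrary units)
for nu in (0, 1, 2):
    print(nu, psd(D, S=1.0, nu=nu).max())
```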

Page 11: Implications for Retrieval

Given a “perfect” forward model, 1 dB measurement errors lead to errors in retrieved snowfall rate of less than 10%.

[Left figure: Ideal Case, reflectivity vs. snowfall rate (mm h-1). Right figure: “Reality”, reflectivity vs. snowfall rate (mm h-1)]

PSD and snow crystal shape, however, spread the range of allowable solutions in the absence of additional constraint

Page 12: Quantitative Retrieval Metrics

Four useful metrics for assessing how well formulated a retrieval problem is:

– Sx – the error covariance matrix provides a useful diagnostic of retrieval performance measuring the uncertainty in the products

– A – the averaging kernel describes, among other things, the amount of information that comes from the measurements as opposed to a priori information

– Degrees of freedom

– Information content

All require accurate specification of uncertainties in all inputs including errors due to forward model assumptions, measurements, and any mathematical approximations required to map geophysical parameters into measurement space.

Page 13: Degrees of Freedom

The cost function can be used to define two very useful measures of the quality of a retrieval: the number of degrees of freedom for signal and for noise, denoted ds and dn, respectively:

Φ = (x − xa)ᵀ Sa⁻¹ (x − xa) + (y − Kx)ᵀ Sy⁻¹ (y − Kx)

where Sa is the covariance matrix describing the prior state space and K represents the Jacobian of the measurements with respect to the parameters of interest; the first (state) term gives rise to ds and the second (measurement) term to dn.

ds specifies the number of observations that are actually used to constrain retrieval parameters, while dn is the corresponding number that are lost due to noise.

Clive Rodgers (2000), “Inverse Methods for Atmospheric Sounding: Theory and Practice”, World Scientific, 238 pp.
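As a sketch, the cost function can be evaluated directly; the state, measurements, Jacobian, and covariances below are illustrative stand-ins, not values from the lecture:

```python
import numpy as np

# Illustrative two-parameter, two-measurement setup.
xa = np.array([1.0, 1.0])       # a priori state
x  = np.array([1.9, 2.1])       # candidate state
K  = np.array([[1.0, 0.5],
               [0.2, 1.0]])     # Jacobian of the measurements
y  = np.array([4.1, 3.9])       # observations
Sa = np.diag([1.0, 1.0])        # prior covariance
Sy = np.diag([0.1, 0.1])        # measurement-error covariance

def cost(x, xa, y, K, Sa, Sy):
    """Phi = (x - xa)^T Sa^-1 (x - xa) + (y - Kx)^T Sy^-1 (y - Kx)."""
    dx = x - xa
    r  = y - K @ x
    return dx @ np.linalg.solve(Sa, dx) + r @ np.linalg.solve(Sy, r)

print(cost(x, xa, y, K, Sa, Sy))
```

The cost vanishes only when the candidate state matches the prior and reproduces the observations exactly.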

Page 14: Degrees of Freedom

Using the expression for the state vector that minimizes the cost function, it is relatively straightforward to show that

ds = Tr[(Kᵀ Sy⁻¹ K + Sa⁻¹)⁻¹ Kᵀ Sy⁻¹ K] = Tr(A)

dn = Tr[Sy (K Sa Kᵀ + Sy)⁻¹] = Tr(Im − A)

where Im is the m × m identity matrix and A is the averaging kernel.

NOTE: Even if the number of retrieval parameters is equal to or less than the number of measurements, a retrieval can still be under-constrained if noise and redundancy are such that the number of degrees of freedom for signal is less than the number of parameters to be retrieved.
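Under the same notation, the two traces can be computed with a toy Jacobian and covariances (all values are illustrative assumptions):

```python
import numpy as np

# Illustrative 2-state, 3-measurement system.
K  = np.array([[1.0, 0.2],
               [0.5, 1.0],
               [0.3, 0.3]])
Sa = np.diag([1.0, 1.0])               # prior covariance
Sy = np.diag([0.25, 0.25, 0.25])       # measurement-error covariance

Sy_inv = np.linalg.inv(Sy)
Sx = np.linalg.inv(K.T @ Sy_inv @ K + np.linalg.inv(Sa))   # posterior covariance
A  = Sx @ K.T @ Sy_inv @ K                                  # averaging kernel

ds = np.trace(A)                                            # DOF for signal
dn = np.trace(Sy @ np.linalg.inv(K @ Sa @ K.T + Sy))        # DOF for noise

print(ds, dn)   # ds + dn equals the number of measurements m
```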

Page 15: Entropy-based Information Content

The Gibbs entropy is the logarithm of the number of discrete internal states of a thermodynamic system:

S(P) = −k Σᵢ pᵢ ln pᵢ

where pᵢ is the probability of the system being in state i and k is the Boltzmann constant.

The information theory analogue has k = 1 and the pᵢ representing the probabilities of all possible combinations of retrieval parameters.

More generally, for a continuous distribution (e.g., a Gaussian):

S[P(x)] = −∫ P(x) log₂ P(x) dx
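A minimal numerical illustration of the discrete form with k = 1 (the distributions are toy examples):

```python
import numpy as np

# Shannon entropy (k = 1) of a discrete distribution, in bits.
def entropy_bits(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                   # 0 * log(0) -> 0 by convention
    return float(-np.sum(p * np.log2(p)))

print(entropy_bits([0.5, 0.5]))    # fair coin: 1 bit
print(entropy_bits([0.25] * 4))    # four equally likely states: 2 bits
```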

Page 16: Entropy of a Gaussian Distribution

For the Gaussian distributions typically used in optimal estimation,

P(x) = (2πσ²)^(−1/2) exp[−(x − x̄)²/(2σ²)]

we have:

S[P(x)] = −∫ P(x) log₂{(2πσ²)^(−1/2) exp[−(x − x̄)²/(2σ²)]} dx = log₂[(2πe)^(1/2) σ]

For an m-variate Gaussian distribution:

S[P(x)] = m log₂[(2πe)^(1/2)] + ½ log₂|S|
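The closed form can be checked against the defining integral; σ here is an arbitrary illustrative value:

```python
import numpy as np

# Check S[P(x)] = log2[(2*pi*e)**0.5 * sigma] against the integral definition.
sigma = 2.0
S_formula = np.log2(np.sqrt(2 * np.pi * np.e) * sigma)

# Discretize -P log2 P on a fine grid spanning +/- 10 sigma.
x = np.linspace(-10 * sigma, 10 * sigma, 200001)
P = np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
dx = x[1] - x[0]
S_numeric = float(-np.sum(P * np.log2(P)) * dx)

print(S_formula, S_numeric)
```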

Page 17: Information Content of a Retrieval

The information content of an observing system is defined as the difference in entropy between an a priori set of possible solutions, S(P₁), and the subset of these solutions that also satisfy the measurements, S(P₂):

H = S(P₁) − S(P₂)

If Gaussian distributions are assumed for the prior and posterior state spaces, as in the optimal estimation approach, this can be written:

H = ½ log₂|S₁| − ½ log₂|S₂| = ½ log₂|Sa| − ½ log₂|Ŝx|

since, after minimizing the cost function, the covariance of the posterior state space is:

Ŝx = (Sa⁻¹ + Kᵀ Sy⁻¹ K)⁻¹
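As a sketch, H can be evaluated directly from the prior and posterior covariances; the matrices below are illustrative assumptions, not values from the lecture:

```python
import numpy as np

# Illustrative prior and measurement setup.
K  = np.array([[1.0, 0.2],
               [0.5, 1.0]])
Sa = np.diag([1.0, 1.0])                  # prior covariance
Sy = np.diag([0.1, 0.1])                  # measurement-error covariance

# Posterior covariance after minimizing the cost function:
Sx = np.linalg.inv(np.linalg.inv(Sa) + K.T @ np.linalg.inv(Sy) @ K)

# H = (1/2) log2 |Sa| - (1/2) log2 |Sx|, in bits
H = 0.5 * np.log2(np.linalg.det(Sa) / np.linalg.det(Sx))
print(H)
```

Because the measurements always shrink the posterior relative to the prior, H is non-negative.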

Page 18: Interpretation

Qualitatively, information content describes the factor by which knowledge of a quantity is improved by making a measurement.

Using Gaussian statistics we see that the information content provides a measure of how much the ‘volume of uncertainty’ represented by the a priori state space is reduced after measurements are made.

Essentially this is a generalization of the scalar concept of ‘signal-to-noise’ ratio.

H = ½ log₂|Sa Ŝx⁻¹|

Page 19: Measuring Stick Analogy

Information content measures the resolution of the observing system for resolving solution space.

Analogous to the divisions on a measuring stick: the higher the information content, the finer the scale that can be resolved.

Full range of a priori solutions:

A: Biggest scale = 2 divisions, H = 1

B: Next finer scale = 4 divisions, H = 2

C: Finer still = 8 divisions, H = 3

D: Finest scale = 16 divisions, H = 4
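The analogy in a few lines of arithmetic; the full range is an arbitrary illustrative value:

```python
import math

# The a priori range is split into 2**H resolvable divisions: each extra bit
# of information content halves the resolvable interval.
full_range = 16.0                          # arbitrary a priori solution range
for H in (1, 2, 3, 4):
    divisions = 2 ** H
    print(f"H = {H}: {divisions} divisions of width {full_range / divisions}")
```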

Page 20: Liquid Cloud Retrievals

Blue: a priori state space

Green: state space that also matches the MODIS visible channel (0.64 μm)

Red: state space that matches both the 0.64 and 2.13 μm channels

Yellow: state space that matches all 17 MODIS channels

[Figures: LWP (g m-3) vs. Re (μm) state spaces: prior; 0.64 μm (H = 1.20); 0.64 & 2.13 μm (H = 2.51); 17 channels (H = 3.53)]

Page 21: Snowfall Retrieval Revisited

With a 140 GHz brightness temperature accurate to ±5 K as a constraint, the range of solutions narrows by up to a factor of 4, implying an information content of ~2.

[Figures: reflectivity vs. snowfall rate (mm h-1), radar only vs. radar + radiometer]

Page 22: Return to Polynomial Functions

Two simulated measurements are formed from polynomials of the two unknowns:

y₁ = a₁x₁ + b₁x₂²
y₂ = a₂x₁² + b₂x₂³

with truth x₁ = x₂ = 2 and prior x₁ₐ = x₂ₐ = 1.

σy = 10%, σa = 100%:

Order, N   X1      X2      Error (%)   ds      H
1          1.984   1.988   18          1.933   1.45
2          1.996   1.998   9           1.985   2.19
5          1.999   2.000   3           1.998   3.16

σy = 10%, σa = 10%:

Order, N   X1      X2      Error (%)   ds      H
1          1.401   1.432   8           0.568   0.07
2          1.682   1.771   7           1.099   0.21
5          1.927   1.976   3           1.784   0.83

σy = 25%, σa = 100%:

Order, N   X1      X2      Error (%)   ds      H
1          1.909   1.929   41          1.659   0.65
2          1.976   1.986   21          1.911   1.29
5          1.996   1.998   8           1.987   2.25

Page 23: Application: MODIS Cloud Retrievals

The concept of information content provides a useful tool for analyzing the properties of observing systems within the constraints of realistic error assumptions.

As an example, consider the problem of assessing the information content of the channels on the MODIS instrument for retrieving cloud microphysical properties.

Application of information theory requires:

– Characterize the expected uncertainty in modeled radiances due to assumed temperature, humidity, ice crystal shape/density, particle size distribution, etc. (i.e., evaluate Sy);

– Determine the sensitivity of each radiance to the microphysical properties of interest (i.e., compute K);

– Establish error bounds provided by any available a priori information (e.g., cloud height from CloudSat);

– Evaluate diagnostics such as Sx, A, ds, and H.
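The four steps can be strung together in a minimal sketch; every matrix below is an illustrative stand-in rather than an actual MODIS value:

```python
import numpy as np

# Hypothetical two-channel, two-parameter cloud retrieval.
Sy = np.diag([0.5, 0.8]) ** 2          # step 1: modeled-radiance error covariance
K  = np.array([[2.0, 0.3],             # step 2: radiance sensitivities (Jacobian)
               [0.4, 1.5]])
Sa = np.diag([2.0, 2.0]) ** 2          # step 3: a priori error bounds

# Step 4: evaluate the diagnostics Sx, A, ds, and H.
Sy_inv = np.linalg.inv(Sy)
Sx = np.linalg.inv(K.T @ Sy_inv @ K + np.linalg.inv(Sa))    # posterior covariance
A  = Sx @ K.T @ Sy_inv @ K                                   # averaging kernel
ds = np.trace(A)                                             # DOF for signal
H  = 0.5 * np.log2(np.linalg.det(Sa) / np.linalg.det(Sx))    # info content (bits)

print(ds, H)
```

Swapping in realistic Sy, K, and Sa for a given scene reproduces the kind of channel-by-channel analysis shown on the following slides.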

1. L’Ecuyer et al. (2006), J. Appl. Meteor. 45, 20-41.

2. Cooper et al. (2006), J. Appl. Meteor. 45, 42-62.

Page 24: Error Analyses

Fractional errors reveal a strong scene-dependence that varies from channel to channel.

LW channels are typically better at lower optical depths while SW channels improve at higher values.

Page 25: Sensitivity Analyses

The sensitivity matrices also illustrate a strong scene dependence that varies from channel to channel.

The SW channels have the best sensitivity to number concentration in optically thick clouds and effective radius in thin clouds.

LW channels exhibit the most sensitivity to cloud height for thick clouds and to number concentration for clouds with optical depths between 0.5 and 4.

[Panels: 0.646 μm, 2.130 μm, and 11.00 μm channels]

Page 26: Information Content

Information content is related to the ratio of the sensitivity to the uncertainty – i.e. the signal-to-noise.

[Figures: H and ds for cloud heights of 9, 11, and 14 km]

Page 27: The Importance of Uncertainties

Rigorous specification of forward model uncertainties is critical for an accurate assessment of the information content of any set of measurements.

[Figures: 11 km case with uniform 10% errors vs. rigorous errors]

Page 28: The Role of A Priori

Information content measures the amount by which the state space is reduced relative to prior information.

As prior information improves, the information content of the measurements decreases.

The presence of cloud height information from CloudSat, for example, constrains the a priori state space and reduces the information content of the MODIS observations.

[Figures: 11 km case without CloudSat vs. with CloudSat]