


9 Neural Network Applications to Solve Forward and Inverse Problems in Atmospheric and Oceanic Satellite Remote Sensing

Vladimir M. Krasnopolsky

List of Acronyms

BT – Brightness Temperature
DAS – Data Assimilation System
FM – Forward Model
MLP – Multi-Layer Perceptron
NWP – Numerical Weather Prediction
PB – Physically Based
RMSE – Root Mean Square Error
RS – Remote Sensing
SSM/I – Special Sensor Microwave Imager
SST – Sea Surface Temperature
TF – Transfer Function

9.1 Introduction

Here we discuss two very important practical applications of the neural network (NN) technique: solution of forward and inverse problems in atmospheric and oceanic satellite remote sensing (RS). A particular example of this type of NN application, solving the SAR wind speed retrieval problem, is also presented in Chapter 10 by G. Yung. These applications and those that we discuss in Chapter 11 belong, from the mathematical point of view, to the broad class of applications called approximation of mappings. A particular type of NN, a Multi-Layer Perceptron (MLP) NN (Rumelhart et al. 1986), is usually employed to approximate mappings. We start by introducing the remote sensing, mapping, and NN background.

Vladimir M. Krasnopolsky (*)
Science Application International Company at Environmental Modeling Center, National Centers for Environmental Prediction, National Oceanic and Atmospheric Administration, Camp Springs, Maryland, USA

Earth System Science Interdisciplinary Center, University of Maryland, EMC/NCEP/NOAA, 5200 Auth Rd., Camp Springs, MD 20746, USA
Phone: 301-763-8000 ext. 7262; fax 301-763-8545; email: [email protected]

9.1.1 Remote Sensing Background

Estimating high quality geophysical parameters (information about the physical, chemical, and biological properties of the oceans, atmosphere, and land surface) from remote measurements (satellite, aircraft, etc.) is a very important problem in fields such as meteorology, oceanography, climatology, and environmental modeling and prediction. Direct measurements of many parameters of interest, like vegetation moisture, phytoplankton concentrations in the ocean, and aerosol concentrations in the atmosphere, are in general not available for the entire globe at the required spatial and temporal resolution. Even when in situ measurements are available, they are usually sparse (especially over the oceans) and located mainly at ground level or at the ocean surface. Often such parameters can be estimated indirectly from their influence on the electromagnetic radiation measured by a remote sensor. Remote measurements allow us to obtain spatially dense measurements all over the globe at and above the level of the ground and ocean surface.

S. E. Haupt et al. (eds.), Artificial Intelligence Methods in the Environmental Sciences, © Springer-Verlag Berlin Heidelberg 2009


Satellite RS data are used in a wide variety of applications and by a wide variety of users. Satellite sensors generate measurements like radiances, backscatter coefficients, brightness temperatures, etc. The applications usually utilize geophysical parameters such as pressure, temperature, wind speed and direction, water vapor concentration, etc. derived from satellite data. Satellite forward models, which simulate satellite measurements from given geophysical parameters, and retrieval algorithms, which transform satellite measurements into geophysical parameters, play the role of mediators between satellite sensors and applications. There exists an entire spectrum of different approaches to extracting geophysical information from satellite measurements. At one end of this spectrum are the ‘satellite only’ approaches; we will call them standard or traditional retrievals. They use measurements performed by one particular sensor only, sometimes from different channels (frequencies, polarizations, etc.) of the same sensor, to estimate geophysical parameters. Variational retrieval techniques or direct assimilation techniques are located at the other end of the spectrum. They use an entire data assimilation system (DAS), including a numerical weather prediction (NWP) model and analysis (Prigent et al. 1997), which in turn includes all kinds of meteorological measurements (buoys, radiosondes, ships, aircraft, etc.) as well as data from numerous satellite sensors. Data assimilation is a method in which observations of the current (and possibly past) state of a system (atmosphere and/or ocean) are combined with the results from a numerical model to produce an analysis, which is considered ‘the best’ estimate of the current state of the system. The analysis can be used for many purposes, including initialization of the next step of the numerical model integration. Many approaches have been developed that belong to the intermediate part of this spectrum. These approaches use measurements from several satellite sensors, combine satellite measurements with other kinds of measurements, and/or use background fields or profiles from NWP models for regularization of the inverse problem (retrievals) or for ambiguity removal; i.e., these approaches use some type of data fusion to regularize (see Sections 9.1.2 and 9.4 below) the solution of the inverse problem.

It is noteworthy that over the last few years, direct assimilation of some geophysical parameters into modern DASs has been successfully developed and implemented. It has improved the quality of assimilated products and of numerical forecasts that use some of these products as initial conditions. Direct assimilation replaces or eliminates the need for using retrievals of these geophysical parameters in DASs. However, there are still many other geophysical parameters (e.g., precipitation, atmospheric ice) that have not yet been included, or for which it is not clear from theoretical and/or practical considerations how they could be included into DASs through direct assimilation. There are also other users of the retrieved geophysical parameters. Therefore, there is still an urgent need to use the standard retrievals for these geophysical parameters and to develop the corresponding retrieval algorithms, to which the NN technique could be efficiently applied. Direct assimilation is discussed below, in Section 9.2.2, in the description of variational retrieval techniques.

The remote measurements themselves are usually very accurate. The quality of geophysical parameters derived from these measurements varies significantly depending on the strength and uniqueness of the signal from the geophysical parameter and the mathematical methods applied to extract the parameter, i.e., to solve RS forward and/or inverse problems (see Section 9.2). The NN technique is a useful mathematical tool for solving the forward and inverse problems in RS accurately. The number of NN RS applications has been increasing steadily over the last decade.

A broad class of NN applications has been developed for solving the forward and inverse problems in RS in order to infer geophysical parameters from satellite data, i.e., to produce so-called satellite retrievals. A brief review of RS NN applications was presented by Atkinson and Tatnall (1997). Examples of such applications follow. The NN technique was applied for the inversion of a multiple scattering model to estimate snow parameters from passive microwave measurements (Tsang et al. 1992). Smith (1993) used NNs for the inversion of a simple two-stream radiative transfer model to derive the leaf area index from Moderate Resolution Imaging Spectrometer data. In other studies, NNs were applied to simulate scatterometer measurements and to retrieve wind speed and direction from these measurements (Thiria et al. 1993; Cornford et al. 2001); to develop an inversion algorithm for radar scattering from vegetation canopies (Pierce et al. 1994); and to estimate atmospheric humidity profiles (Cabrera-Mercader and Staelin 1995), atmospheric temperature, moisture, and ozone profiles (Aires et al. 2002), and atmospheric ozone profiles (Mueller et al. 2003). Stogryn et al. (1994) and Krasnopolsky et al. (1995) applied NNs to invert Special Sensor Microwave Imager (SSM/I) data and retrieve surface wind speed. Davis et al. (1995) applied NNs to invert a forward model to estimate soil moisture, surface air temperature, and vegetation moisture from Scanning Multichannel Microwave Radiometer data. Using a NN technique, a fast SSM/I forward model (Krasnopolsky 1997) and an SSM/I multi-parameter retrieval algorithm (Krasnopolsky et al. 1999, 2000; Meng et al. 2007) have been derived from empirical data (buoy SSM/I collocations). Abdelgadir et al. (1998) applied NNs to the forward and inverse modeling of canopy directional reflectance. Schiller and Doerffer (1999) used a NN technique for inverting a radiative transfer forward model to estimate the concentration of phytoplankton pigment from Medium Resolution Imaging Spectrometer data.

9.1.2 Mapping and Neural Networks Background

A mapping, M, between two vectors X (input vector) and Y (output vector) can be symbolically written as

$$Y = M(X); \quad X \in \Re^n, \; Y \in \Re^m \tag{9.1}$$

where $X \in \Re^n$ means that the vector X has n components and all of them are real numbers. A large number of important practical geophysical applications may be considered mathematically as a mapping like (9.1). Keeping in mind that a NN technique will be used to approximate this mapping, we will call it a target mapping, using a common term from nonlinear approximation theory (DeVore 1998). The target mapping may be given to us explicitly or implicitly. It can be given explicitly as a set of equations based on first principles (e.g., radiative transfer or heat transfer equations) and/or empirical dependencies, or as a computer code. Observational records composed of Xs and Ys represent an implicit target mapping. In this case, it is assumed that the unknown target mapping generates the data.

The mapping (9.1) is a complicated mathematical object with many important characteristics like mapping dimensionalities, domain, range, complexities, etc. Some of them are illustrated by Fig. 9.1.

Fig. 9.1 The mapping (9.1), M: its input vector, X; output vector, Y; domain, D (in $\Re^n$); and range, R (in $\Re^m$)

Among the applications considered in this chapter, we will find inverse problems that can be considered as continuous unique mappings (9.1); however, for these mappings small perturbations in X may cause large changes in Y, and the problem is then called ill-posed (Vapnik 1995). Large changes in Y may occur because an ill-posed problem may have more than one solution, or the solution may depend discontinuously upon the initial data (Hadamard 1902). Such a problem is also known as an improperly posed problem. Ill-posed problems usually arise when one attempts to estimate an unknown cause from observed effects (most geophysical inverse problems belong to this class, e.g., the satellite retrieval problem considered in Section 9.2.1) or to restore a whole object from its low-dimensional projection (e.g., estimating the NN Jacobian, considered in Aires et al. 2004 and Krasnopolsky 2006). If X contains even a low level of noise, the uncertainties in Y may be very large. To solve ill-posed problems, additional a priori information about the solution (regularization) should be introduced into the solution approach (Vapnik and Kotz 2006).
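The sensitivity to noise described above can be illustrated with a toy linear inverse problem. The sketch below is not taken from the chapter: the nearly singular 2 × 2 operator `A`, the noise vector, the prior guess, and the weight `lam` are all assumptions chosen only to make the effect visible. It recovers a "cause" from an observed "effect" with and without a simple Tikhonov (ridge) penalty, which is one common way of introducing a priori information.

```python
import numpy as np

# Hypothetical, nearly singular forward operator: effect = A @ cause.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])
cause_true = np.array([1.0, 2.0])
effect = A @ cause_true

# A small perturbation of the observed effect (a low level of noise).
noise = np.array([0.0, 1e-3])

# Naive inversion: solve A @ cause = effect + noise exactly.
cause_naive = np.linalg.solve(A, effect + noise)

# Tikhonov-regularized inversion, pulled toward a prior guess (a priori information).
lam = 1e-2                        # assumed regularization weight
prior = np.array([1.5, 1.5])      # assumed prior guess of the cause
lhs = A.T @ A + lam * np.eye(2)
rhs = A.T @ (effect + noise) + lam * prior
cause_reg = np.linalg.solve(lhs, rhs)

err_naive = np.linalg.norm(cause_naive - cause_true)
err_reg = np.linalg.norm(cause_reg - cause_true)
print(f"condition number of A: {np.linalg.cond(A):.1e}")
print(f"error without regularization: {err_naive:.2f}")
print(f"error with regularization:    {err_reg:.2f}")
```

In this toy case the regularized answer is not exact, since it leans on the prior along the poorly observed direction, but its error stays bounded while the unregularized error is amplified by the conditioning of `A`.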

The simplest MLP NN is a generic analytical nonlinear approximation or model for a mapping like the target mapping (9.1). For the approximation, the MLP NN uses a family of functions like

$$y_q = \mathrm{NN}(X, a, b) = a_{q0} + \sum_{j=1}^{k} a_{qj} \cdot z_j; \quad q = 1, 2, \ldots, m \tag{9.2}$$

$$z_j = \phi\left(b_{j0} + \sum_{i=1}^{n} b_{ji} \cdot x_i\right) \tag{9.3}$$

where $x_i$ and $y_q$ are components of the input and output vectors respectively, a and b are fitting parameters, φ is a so-called activation function (a nonlinear function, often a hyperbolic tangent), n and m are the numbers of inputs and outputs respectively, and k is the number of nonlinear basis functions $z_j$ (9.3) in the expansion (9.2). The expansion (9.2) is a linear expansion (a linear combination of the basis functions $z_j$ (9.3)), and the coefficients $a_{qj}$ (q = 1, ..., m and j = 1, ..., k) are linear coefficients in this expansion. It is essential that the basis functions $z_j$ (9.3) are nonlinear with respect to the inputs $x_i$ (i = 1, ..., n) and to the fitting parameters or coefficients $b_{ji}$ (j = 1, ..., k). As a result of the nonlinear dependence of the basis functions on multiple fitting parameters $b_{ji}$, the basis $\{z_j\}_{j=1,\ldots,k}$ turns into a very flexible set of non-orthogonal basis functions that have a great potential to adjust to the functional complexity of the mapping (9.1) to be approximated. It has been shown by many authors in different contexts that the family of functions (9.2, 9.3) can approximate any continuous or almost continuous (with a finite number of finite discontinuities, like a step function) mapping (9.1) (Cybenko 1989; Funahashi 1989; Hornik 1991; Chen and Chen 1995a, b). The accuracy of the NN approximation, or the ability of the NN to resolve details of the target mapping (9.1), is proportional to the number of basis functions (hidden neurons) k (Attali and Pagès 1997).
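The expansion (9.2, 9.3) maps directly onto a few lines of array code. The following is a minimal sketch, assuming NumPy and a hyperbolic tangent activation; the function name `mlp_forward`, the array layout, and the random weights are illustrative assumptions, not trained values from any application in this chapter.

```python
import numpy as np

def mlp_forward(x, a0, a, b0, b, phi=np.tanh):
    """Evaluate the MLP (9.2, 9.3) for one input vector x of length n.

    a0: (m,) output biases a_q0;  a: (m, k) linear coefficients a_qj
    b0: (k,) hidden biases b_j0;  b: (k, n) nonlinear coefficients b_ji
    """
    z = phi(b0 + b @ x)      # basis functions z_j, Eq. (9.3)
    return a0 + a @ z        # outputs y_q, Eq. (9.2)

# Arbitrary illustrative sizes and random weights (assumptions, not trained values).
n, k, m = 3, 5, 2
rng = np.random.default_rng(0)
a0, a = rng.normal(size=m), rng.normal(size=(m, k))
b0, b = rng.normal(size=k), rng.normal(size=(k, n))

x = np.array([0.1, -0.4, 0.7])
y = mlp_forward(x, a0, a, b0, b)
print(y.shape)   # (2,)
```

The outputs are linear in `a0` and `a` but nonlinear in `b0` and `b`, which is the distinction the text draws between the linear coefficients and the parameters inside the basis functions.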

In this chapter and in Chapter 11, we use the terms an emulating NN or a NN emulation for a NN (9.2, 9.3) that provides a functional emulation of the target mapping (9.1), which implies a small approximation error for the training set and smooth and accurate interpolation between training set data points inside the mapping domain D. The term “emulation” is introduced to distinguish between these NNs and approximating NNs or NN approximations, which guarantee a small approximation error for the training set only.

When an emulating NN is developed, in addition to the criterion of small approximation error, at least three other criteria are used: (i) the NN complexity (proportional to the number k of hidden neurons when other topological parameters are fixed) is controlled and restricted to a minimal level sufficient for good approximation and interpolation; (ii) independent validation and test data sets are used, during training (validation set) to control overfitting and after training (test set) to evaluate interpolation accuracy; (iii) a redundant training set (additional redundant data points are added in between the training data points that are sufficient for a good approximation) is used to improve the NN interpolation abilities (Krasnopolsky 2007).

9.2 Deriving Geophysical Parameters from Satellite Measurements: Standard Retrievals and Variational Retrievals Obtained Through Direct Assimilation

9.2.1 Standard or Conventional Retrievals

Conventional methods for using satellite data (standard retrievals) involve solving an inverse or retrieval problem and deriving a transfer function (TF) f, which relates a geophysical parameter of interest G (e.g., surface wind speed over the ocean, atmospheric moisture concentration, sea surface temperature (SST), etc.) to a satellite measurement S (e.g., brightness temperatures, radiances, reflection coefficients, etc.),

$$G = f(S) \tag{9.4}$$

where both G and S may be vectors. The TF f (also called a retrieval algorithm) usually cannot be derived directly from first principles because the relationship (9.4) does not correspond to a cause and effect principle, and multiple values of G can sometimes correspond to a single S. In contrast, a forward model (FM) F, which relates a vector G to a vector S,

$$S = F(G) \tag{9.5}$$

can usually be derived from first principles and physical considerations (e.g., radiative transfer theory) in accordance with cause and effect principles, because geophysical parameters affect the satellite measurements (but not vice versa). Thus, the forward problem (9.5) is a well-posed problem, in contrast to the inverse problem (9.4), which is often an ill-posed one (Parker 1994), although from a mathematical point of view both the FM (9.5) and the TF (9.4) are continuous (or almost continuous) mappings between the two vectors S and G. Even in cases where the mapping (9.4) is not unique, this multi-valued mapping may be considered as a collection of single-valued continuous mappings. In order to derive the TF (9.4), the FM (9.5) has to be inverted (an inverse problem has to be solved). The inversion technique usually searches for a vector $G^0$ which minimizes the functional (Stoffelen and Anderson 1997)

$$\|\Delta S\| = \|S^0 - F(G)\| \tag{9.6}$$

where $S^0$ is an actual vector of satellite measurements. Since the FM F is usually a complicated nonlinear function, this approach leads to a full-scale nonlinear optimization with all its numerical problems, like slow convergence, multiple solutions, etc. This approach does not determine the TF explicitly; it assumes this function implicitly, and for each new measurement $S^0$ the entire process has to be repeated. A simplified linearization method to minimize the functional (9.6) can be applied if there is a good approximation for the solution of the inverse problem, that is, an approximate vector of the geophysical parameters $G^0$. Then the difference vector $\Delta S$ is small and there is a vector G in close proximity to $G^0$ ($|\Delta G| = |G - G^0|$ is small) where $\Delta S(G) = 0$. Expanding F(G) in a Taylor series and keeping only those terms which are linear with respect to $\Delta G$, we can obtain a system of linear equations to calculate the components of the vector $\Delta G$ (e.g., Wentz 1997),

$$\sum_{i=1}^{n} \left.\frac{\partial F(G)}{\partial G_i}\right|_{G=G^0} \Delta G_i = S^0 - F(G^0) \tag{9.7}$$

where n is the dimension of the vector G. After $\Delta G$ is calculated, the next iteration of (9.7) with $G^0 = G^0 + \Delta G$ is performed. The process is expected to converge quickly to the vector of retrievals G. Again, in this case the TF, f, is not determined explicitly but is only determined implicitly for the vector $S^0$ by the solution of (9.7). This type of retrieval can be called a “local” or “localized” linear inversion. These techniques (9.6, 9.7) are usually called physically based retrievals. It is important to emphasize that the physically based algorithms (9.6, 9.7) are by definition multi-parameter algorithms since they retrieve several geophysical parameters simultaneously (a complete vector G).
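The iteration built on (9.7) can be sketched in a few lines. The example below is only an illustration under stated assumptions: the two-parameter forward model `forward_model` and its Jacobian are invented smooth functions, not a real radiative transfer model, and the system is assumed square so that the linear equations can be solved directly.

```python
import numpy as np

def forward_model(g):
    """Hypothetical smooth nonlinear FM S = F(G), used only for illustration."""
    g1, g2 = g
    return np.array([g1 + 0.1 * g1 * g2, g2 + 0.1 * g1 ** 2])

def forward_jacobian(g):
    """Analytic Jacobian dF_p/dG_i of the toy FM above."""
    g1, g2 = g
    return np.array([[1.0 + 0.1 * g2, 0.1 * g1],
                     [0.2 * g1, 1.0]])

def local_linear_inversion(s_obs, g_first_guess, n_iter=10, tol=1e-12):
    """Iterate Eq. (9.7): solve J(G0) dG = S0 - F(G0), then update G0."""
    g = np.asarray(g_first_guess, dtype=float).copy()
    for _ in range(n_iter):
        residual = s_obs - forward_model(g)
        delta_g = np.linalg.solve(forward_jacobian(g), residual)
        g += delta_g
        if np.linalg.norm(delta_g) < tol:
            break
    return g

g_true = np.array([2.0, 3.0])
s_obs = forward_model(g_true)          # simulated "measurement" S0
g_retrieved = local_linear_inversion(s_obs, g_first_guess=[1.5, 2.5])
print(g_retrieved)
```

Note that the whole loop runs again for every new measurement, and that both the FM and its Jacobian are evaluated at every iteration, which is the cost the text attributes to physically based retrievals.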

Empirical algorithms are based on an approach which, from the beginning, assumes the existence of an explicit analytical representation for a TF, f. A mathematical (statistical) model, $f_{mod}$, is usually chosen (usually some kind of regression), which contains a vector of fitting (or regression) parameters $a = \{a_1, a_2, \ldots\}$,

$$G_k = f_{mod}(S, a) \tag{9.8}$$

where these parameters are determined from an empirical (or simulated) matchup data set $\{G_k, S\}$ collocated in space and time, using, for example, statistical techniques such as the method of least squares. This type of retrieval can be called a “global” inversion as it is not restricted to a given vector of satellite measurements. The subscript k in $G_k$ stresses the fact that the majority of empirical retrieval algorithms are single-parameter algorithms. For example, for the Special Sensor Microwave Imager (SSM/I) there exist algorithms which retrieve only wind speed (Goodberlet et al. 1989), water vapor (Alishouse et al. 1990; Petty 1993), or cloud liquid water (Weng and Grody 1994). Krasnopolsky et al. (1999, 2000) showed that single-parameter algorithms have additional (compared to multi-parameter retrievals) systematic (bias) and random (unaccounted variance) errors in a single retrieved parameter $G_k$.
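As a concrete instance of (9.8), the sketch below fits a linear regression for a single parameter by least squares. Everything numeric in it is an assumption made for illustration: the matchup data set is synthetic, the two "channels" and the coefficients `a_true` are invented, and the linear form of `f_mod` is just one possible choice of statistical model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for a matchup data set {G_k, S}: 200 collocations,
# two "channels" per measurement. All numbers are invented for illustration.
n_samples = 200
S = rng.uniform(150.0, 280.0, size=(n_samples, 2))
a_true = np.array([5.0, 0.10, -0.08])                  # hypothetical coefficients
G_k = a_true[0] + S @ a_true[1:] + rng.normal(0.0, 0.5, n_samples)

def f_mod(S, a):
    """Linear regression model G_k = a_0 + a_1*S_1 + a_2*S_2, one form of Eq. (9.8)."""
    return a[0] + S @ a[1:]

# Least-squares estimate of the fitting parameters a.
design = np.column_stack([np.ones(n_samples), S])
a_fit, *_ = np.linalg.lstsq(design, G_k, rcond=None)

rmse = np.sqrt(np.mean((f_mod(S, a_fit) - G_k) ** 2))
print("fitted a:", a_fit)
print("RMSE on the matchup set:", rmse)
```

Once `a_fit` is known, `f_mod` can be applied to any new measurement vector without repeating the fit, which is what makes this a "global" inversion.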

The obvious way to improve single-parameter retrievals (9.8) is to include other parameters in the retrieval process using an empirical multi-parameter approach, which, as in the physically based multi-parameter approach (9.6, 9.7), inverts the data in the complete space of the geophysical parameters (Krasnopolsky et al. 1999, 2000). Thus, the complete vector of the related geophysical parameters is retrieved simultaneously from a given vector of satellite measurements S,

$$G = f_{mod}(S, a) \tag{9.9}$$

where $G = \{G_i\}$ is now a vector containing the primary, physically related geophysical parameters, which contribute to the observed satellite measurements S. These retrievals do not contain the additional systematic and random errors just described. Because equations (9.4), (9.5), (9.8), and (9.9) represent continuous mappings, the NN technique is well suited for emulating the FM, TF, and $f_{mod}$.

The standard retrievals derived using the TF (9.4) have the same spatial resolution as the sensor measurements and produce instantaneous values of geophysical parameters over the areas where the measurements are available. Geophysical parameters derived using standard retrievals can be used for many applications, such as NWP DASs. In this case, a contribution to the variational analysis cost function $\chi_G$ from a particular retrieval, $G^0$, is:

$$\chi_G = \frac{1}{2}\,(G - G^0)^T (O + E)^{-1} (G - G^0) \tag{9.10}$$

where $G^0 = f(S^0)$ is a vector of the retrieved geophysical parameter, $S^0$ is a vector of the sensor measurements, G is a vector of the geophysical parameters being analyzed, O is the expected error covariance of the observations, and E is the expected error covariance of the retrieval algorithm.
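The quadratic form (9.10) is straightforward to evaluate numerically. The sketch below assumes NumPy; the function name `retrieval_cost` and the small diagonal covariance matrices are illustrative assumptions, chosen so that the result can be checked by hand.

```python
import numpy as np

def retrieval_cost(g, g_retrieved, obs_cov, alg_cov):
    """Contribution (9.10): 0.5 * (G - G0)^T (O + E)^{-1} (G - G0)."""
    d = np.asarray(g, dtype=float) - np.asarray(g_retrieved, dtype=float)
    # Solve (O + E) w = d instead of forming the inverse explicitly.
    return 0.5 * d @ np.linalg.solve(obs_cov + alg_cov, d)

# Illustrative numbers only: two analyzed parameters, diagonal covariances.
O = np.diag([1.0, 0.25])
E = np.diag([1.0, 0.75])
chi_g = retrieval_cost(g=[10.0, 2.0], g_retrieved=[8.0, 1.0], obs_cov=O, alg_cov=E)
print(chi_g)   # 1.5
```

Here the difference vector is (2, 1) and O + E is diag(2, 1), so the quadratic form is 2 + 1 = 3 and the cost is 1.5.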

9.2.2 Variational Retrievals Through the Direct Assimilation of Satellite Measurements

Because standard retrievals are based on the solution of an inverse problem which is usually mathematically ill-posed (Parker 1994), this approach has some rather subtle properties and error characteristics (Eyre and Lorenc 1989) which cause additional errors and problems in retrievals (e.g., an amplification of errors, ambiguities, etc.). As a result, high-quality sensor measurements might be converted into lower-quality geophysical parameters. This type of error can be avoided or reduced by using a variational retrieval technique (or an inversion) through direct assimilation of satellite measurements (Lorenc 1986; Parrish and Derber 1992; Phalippou 1996; Prigent et al. 1997; Derber and Wu 1998; McNally et al. 2000).

Variational retrievals or direct assimilation of satellite data offer an alternative to deriving geophysical parameters from the satellite measurements. They use the entire data assimilation system for the inversion (as a retrieval algorithm). In this case, a contribution to the analysis cost function $\chi_S$ from a particular sensor measurement, $S^0$, is:

$$\chi_S = \frac{1}{2}\,(S - S^0)^T (O + E)^{-1} (S - S^0) \tag{9.11}$$

where S = F(G), and F is a FM which relates an analysis state vector G (or a vector of geophysical parameters in the analysis) to a vector of simulated sensor measurements S, O is the expected error covariance of the observations, and E is the expected error covariance of the forward model. The forward problem (9.5) is a well-posed problem in contrast to the inverse problem (9.4). However, a background term has to be added to (9.11) to prevent the data assimilation problem from being ill-posed (Parrish and Derber 1992).
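The role of the background term can be seen in a small linear example. The chapter does not give the form of that term, so the sketch below assumes the commonly used quadratic penalty toward a background state with error covariance `B`; the linear forward operator `H`, the matrix `R` (standing in for O + E), and all numbers are hypothetical.

```python
import numpy as np

def variational_cost(g, s_obs, forward_model, err_cov, g_background, bkg_cov):
    """Eq. (9.11) plus an assumed quadratic background term."""
    ds = forward_model(g) - s_obs
    dg = g - g_background
    chi_s = 0.5 * ds @ np.linalg.solve(err_cov, ds)    # Eq. (9.11), err_cov = O + E
    chi_b = 0.5 * dg @ np.linalg.solve(bkg_cov, dg)    # assumed background term
    return chi_s + chi_b

# Toy linear FM with one measurement and two parameters: S = H @ G.
H = np.array([[1.0, 1.0]])
R = np.array([[0.5]])             # stands in for O + E
B = np.diag([4.0, 4.0])           # hypothetical background error covariance

cost = variational_cost(g=np.array([1.0, 2.0]), s_obs=np.array([3.5]),
                        forward_model=lambda g: H @ g, err_cov=R,
                        g_background=np.array([1.0, 1.0]), bkg_cov=B)

hess_obs = H.T @ np.linalg.solve(R, H)        # Hessian of chi_S alone
hess_full = hess_obs + np.linalg.inv(B)       # Hessian with the background term

print(cost)
print(np.linalg.matrix_rank(hess_obs))    # 1: one measurement cannot fix two parameters
print(np.linalg.matrix_rank(hess_full))   # 2: the background term makes the minimum unique
```

With a single measurement of the sum of two parameters, the measurement term alone has a singular Hessian and therefore a whole line of minimizers; adding the background term makes the Hessian full rank, so the minimum becomes unique.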

The retrieval in this case results in an entire field (global in the case of the global data assimilation system) for the geophysical parameter G (non-local retrievals), which has the same resolution as the numerical forecast model used in the data assimilation system. This resolution may be lower or higher than the resolution of standard retrievals. The variational retrievals are also not instantaneous but usually averaged in time over the analysis cycle; however, the field is continuous and coherent (e.g., it should not have problems such as directional ambiguity in the scatterometer winds). The variational retrievals are the result of fusing many different types of data (including satellite data, ground observations, and numerical model first guesses) inside the data assimilation system. Sparse standard retrievals can be converted into continuous fields using the regular data assimilation procedure (9.10), which fuses sparse observations with the numerical model short-term prediction, producing the integrated product on the model grid.

It is important to emphasize a very significant difference between the use of the explicit TF for standard retrievals and the use of the FM in variational retrievals. In standard retrievals, the explicit TF (9.4) is usually simple (e.g., a regression) and is applied once per sensor observation to produce a geophysical parameter. In variational retrievals the FM, which is usually much more complicated than a simple explicit TF, and its partial derivatives (the number of derivatives is equal to m × n, where m and n are the dimensions of the vectors G and S, respectively) have to be estimated for each of the k iterations performed during the cost function (9.11) minimization. Thus the requirements for simplicity of the FM used in the variational retrievals are restrictive, and variational retrievals often require some special, simplified, and fast versions of FMs.

9.3 NNs for Emulating Forward Models

FMs are usually complex due to the complexity of the physical processes which they describe and the complexity of the first principle formalism on which they are based (e.g., radiative transfer theory). Dependencies of satellite measurements on geophysical parameters, which FMs describe, are complicated and nonlinear. These dependencies may exhibit different types of nonlinear behavior. FMs are usually exploited in physically based retrieval algorithms, where they are numerically inverted to retrieve geophysical parameters, and in data assimilation systems, where they are used for the direct assimilation of satellite measurements (variational retrievals). Both numerical inversions and direct assimilation are iterative processes where FMs and their Jacobians are calculated many times for each satellite measurement. Thus, the retrieval process becomes very time consuming, sometimes prohibitively expensive for operational (real time) applications.

For such applications, it is essential to have fast and accurate versions of FMs. Because the functional complexity of FM mappings (complexity of input/output relationships) is usually not as high as their physical complexity, NNs can provide fast and accurate emulations of FMs. Moreover, a NN can also provide an entire Jacobian matrix with only a small additional computational effort. This is one of the NN applications where the NN Jacobian is calculated and used. Because statistical inference of a Jacobian is an ill-posed problem, the NN Jacobian should be carefully tested and controlled.
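For the MLP (9.2, 9.3) with a hyperbolic tangent activation, the Jacobian follows from the chain rule as $\partial y_q / \partial x_i = \sum_j a_{qj}\,(1 - z_j^2)\,b_{ji}$, which reuses the hidden-layer values already computed in the forward pass. The sketch below assumes NumPy and random weights standing in for a trained FM emulation (inputs play the role of G, outputs of S); the finite-difference comparison only checks the derivative of the NN itself, not how well it matches the Jacobian of the true FM.

```python
import numpy as np

def mlp_forward(x, a0, a, b0, b):
    """MLP (9.2, 9.3) with a tanh activation."""
    return a0 + a @ np.tanh(b0 + b @ x)

def mlp_jacobian(x, a, b0, b):
    """Analytic Jacobian dy_q/dx_i = sum_j a_qj * (1 - z_j**2) * b_ji for tanh."""
    z = np.tanh(b0 + b @ x)
    return (a * (1.0 - z ** 2)) @ b       # shape (m, n)

# Random weights standing in for a trained FM emulation (assumption).
n, k, m = 3, 4, 2
rng = np.random.default_rng(2)
a0, a = rng.normal(size=m), rng.normal(size=(m, k))
b0, b = rng.normal(size=k), rng.normal(size=(k, n))
x = rng.normal(size=n)

jac = mlp_jacobian(x, a, b0, b)

# Central finite differences as an independent check of the analytic result.
eps = 1e-6
jac_fd = np.column_stack([
    (mlp_forward(x + eps * e, a0, a, b0, b) - mlp_forward(x - eps * e, a0, a, b0, b)) / (2 * eps)
    for e in np.eye(n)
])
print(np.max(np.abs(jac - jac_fd)))
```

The analytic Jacobian costs one extra matrix product beyond the forward pass, which is the "small additional computational effort" referred to above.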

To develop a NN emulation for the FM, a training set which consists of matched pairs of vectors of geophysical parameters and satellite measurements, $\{G, S\}_{i=1,\ldots,N}$, has to be created. If a physically based FM exists, it can be used to simulate the data. Otherwise, empirical data can be used. The resulting data set constitutes the training set and will be employed to develop the NN emulation.

9.4 NNs for Solving Inverse Problems: NNs Emulating Retrieval Algorithms

NNs can be used in several different ways for retrieval algorithms. In physically based retrieval algorithms a fast NN, emulating the complex and slow physically based FM and its Jacobian, can be used to speed up the local inversion process (9.7). NNs can be used in many cases for a global inversion to explicitly invert a FM. In such cases, after an inversion the NN provides an explicit retrieval algorithm (or TF), which is a solution of the inverse problem and can be used for retrievals. To train a NN which emulates an explicit retrieval algorithm, a training set $\{G, S\}_{i=1,\ldots,N}$ is required. As in the case of FMs, simulated or empirical data can be used to create the training set.

In addition to the complications related to FMs (complexity, nonlinearity, etc.), retrieval algorithms exhibit some problems because they are solutions of the inverse problem, which is an ill-posed problem. This is why mathematical tools which are used to develop retrieval algorithms have to be accurate and robust in order to deal with these additional problems. NNs are fast, accurate, and robust tools for emulating nonlinear (continuous) mappings and can be effectively used for modeling multi-parameter retrieval algorithms. One of the serious problems related to retrieval algorithms is the problem of regularizing the solution of the inverse problem. Without regularization, usually only poor quality or ambiguous geophysical parameters can be retrieved, even from very accurate satellite measurements. To regularize an ill-posed inverse problem, additional (regularization) information should be introduced (Vapnik and Kotz 2006). The NN technique is flexible enough to accommodate regularization information as additional inputs and/or outputs and as additional regularization terms in the error or loss function. For example, in their pioneering work on using NNs for the simultaneous retrieval of temperature, water vapor, and ozone atmospheric profiles from satellite measurements (Aires et al. 2002; Mueller et al. 2003), the authors made good use of this NN flexibility by introducing the first guess from the atmospheric model or DAS as additional regularizing inputs in their NN based retrieval algorithms.

9.5 Controlling the NN Generalization

NNs are well suited to modeling complicated nonlinear relationships between multiple variables, as is the case in multispectral remote sensing. Well-constructed NNs (NN emulations) have good interpolation properties; however, they may produce unpredictable outputs when forced to extrapolate. The NN training data (simulated by a FM or constructed from empirical data sets) cover a certain manifold $D_T$ (a sub-domain $D_T \subset D$) in the full domain $D$. Real data fed into the NN, $f_{NN}$, which emulates a TF, may not always lie in $D_T$. There are many reasons why real data deviate from the low dimensional manifold $D_T$ of training data, e.g., simplifications built into the model design, neglect of the natural variability of parameters occurring in the model, and measurement errors in the satellite signal that were not taken into account when the training data set was generated. When empirical data are used, extreme events (the highest and lowest values of geophysical parameters) are usually not sufficiently represented in the training set because they occur rarely in nature. This means that during the retrieval stage, real data may in some cases force the NN emulation, $f_{NN}$, to extrapolate. The error resulting from such a forced extrapolation will increase with the distance of the input point from $D_T$ and will also depend on the orientation of the input point relative to $D_T$.

In order to recognize NN inputs that were not foreseen in the NN training phase and are therefore out of the scope of the inversion algorithm, a validity check (Schiller and Krasnopolsky 2001) can be used. This check may serve as the basis for a quality control (QC) procedure. Some kind of QC procedure is usually applied to the satellite retrievals in a DAS. Let the model $S = F(G)$ have an inverse $G = f(S)$; then, by definition, $S = F(f(S))$. Further, let $f_{NN}$ be the NN emulating the inverse model in the domain $D_T$. The result of $G_0 = f_{NN}(S_0)$ for $S_0 \notin D_T$ may be arbitrary and, in general, $F(f_{NN}(S_0))$ will not be equal to $S_0$. The validity of $S = F(f_{NN}(S))$ is a necessary condition for $S \in D$.

Now, if in the application stage of the NN, $f_{NN}$, $S$ is not in the domain $D_T$, the NN is forced to extrapolate. In such a situation the validity condition may not be fulfilled, and the resulting $G$ is in general meaningless. For operational applications, such events must be reported so that a proper decision can be made about correcting the retrieval or removing it from the data stream. In order to perform the validity test, the FM must be applied after each inversion. This requires a fast but accurate FM. Such a FM can be obtained by developing a NN that accurately emulates the original FM, $S = F_{NN}(G)$. Thus, the validity check algorithm consists of a combination of inverse and forward NNs that, in addition to the inversion, computes a quality measure for the inversion:

$\delta = \| S - F_{NN}(f_{NN}(S)) \|$ (9.12)

In conclusion, the scope check problem is solved by estimating δ (9.12), where S is the result of the satellite measurement. This procedure (i) allows the detection of situations where the forward model and/or transfer function is inappropriate, (ii) performs an “in scope” check for the retrieved parameters even if the domain has a complicated geometry, and (iii) can be adapted to all cases where a NN is used to emulate the inverse of an existing forward model.
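The quality measure (9.12) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the operational algorithm: `forward_emulation` and `inverse_emulation` are hypothetical toy stand-ins for $F_{NN}$ and $f_{NN}$, chosen so that the training domain is the curve S = (g, g²) in a two-channel signal space, and the threshold value is arbitrary.

```python
import math


def forward_emulation(g):
    """Hypothetical stand-in for F_NN: one parameter g -> two-channel signal S."""
    return (g, g * g)


def inverse_emulation(s):
    """Hypothetical stand-in for f_NN: exact only on the curve S = (g, g^2)."""
    return s[0]


def validity_delta(s):
    """Quality measure (9.12): Euclidean norm of S - F_NN(f_NN(S))."""
    s_back = forward_emulation(inverse_emulation(s))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, s_back)))


def retrieve_with_qc(s, threshold):
    """Return the retrieval, its delta, and a flag that is True when it passes QC."""
    delta = validity_delta(s)
    return inverse_emulation(s), delta, delta <= threshold


in_scope = (0.5, 0.25)      # lies on the training curve
out_of_scope = (0.5, 0.9)   # same first channel, but off the curve

print(retrieve_with_qc(in_scope, threshold=0.1))
print(retrieve_with_qc(out_of_scope, threshold=0.1))
```

Both inputs yield the same retrieved value, so the retrieval alone cannot reveal the problem; only the round trip through the forward emulation separates the in-scope measurement from the out-of-scope one.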

9.6 Neural Network Emulations for SSM/I Data

In previous sections, we discussed the theoretical possibilities and premises for using NNs for modeling TFs and FMs. In this section, we illustrate these theoretical considerations using real-life applications of the NN approach to the SSM/I forward and retrieval problems. SSM/I is a well-established instrument, flown since 1987. Many retrieval algorithms and several forward models have been developed for this sensor, and several databases are available for algorithm development and validation. A variety of techniques have been applied to algorithm development. Therefore, we can present an extensive comparison of different methods and approaches for this instrument. A raw buoy-SSM/I matchup database created by the Navy was used for NN algorithm development, validation, and comparison. This database is quite representative, with the exception of high latitude and high wind speed events. To improve this situation, the data set was enriched by adding matchup databases collected by the high latitude European ocean weather ships Mike and Lima to the Navy database. Various filters have been applied to remove errors and noisy data (for a detailed discussion see Krasnopolsky 1997 and Krasnopolsky et al. 1999).

9.6.1 NN Emulation of the Empirical FM for SSM/I

The empirical SSM/I FM represents the relationship between the vector of geophysical parameters G and the vector of satellite brightness temperatures (BTs) S, where S = {T19V, T19H, T22V, T37V, T37H} (TXXY denotes frequency XX in GHz and polarization Y). Four geophysical parameters are included in G: surface wind speed W, columnar water vapor V, columnar liquid water L, and SST (or Ts). These are the main parameters influencing the BTs measured by satellite sensors, and they were used as inputs in the physically based FMs of Petty and Katsaros (1992, 1994) (referred to below as PK) and Wentz (1997) (see Table 9.1). The NN emulation FM1 (Krasnopolsky 1997), which implements this SSM/I FM, has four inputs {W, V, L, SST}, one hidden layer with 12 neurons, and five nonlinear BT outputs {T19V, T19H, T22V, T37V, T37H} (see Fig. 9.2). The derivatives of the outputs with respect to the inputs, which can be easily calculated, constitute the Jacobian matrix $K[S] = \{\partial S_i / \partial G_j\}$, which is required in the process of direct assimilation of the SSM/I BTs when the gradient of the SSM/I contribution to the cost function (9.11), $\chi_s$, is calculated (Parrish and Derber 1992; Phalippou 1996). Computing the NN emulation of the FM and its derivatives is a much simpler and faster task than calculating radiative transfer forward models. The quality of the Jacobian matrix was evaluated in Krasnopolsky (1997).

Table 9.1 Comparison of physically based radiative transfer and empirical NN forward models under clear and clear + cloudy (in parentheses) weather conditions

| Author | Type | Inputs | BT RMS error (K), vertical | BT RMS error (K), horizontal |
|---|---|---|---|---|
| P & K (1992) | PB | W, V, L, SST, Theta¹, P0², HWV³, ZCLD⁴, Ta⁵, G⁶ | 1.9 (2.3) | 3.3 (4.3) |
| Wentz (1997) | PB | W, V, L, SST, Theta¹ | 2.3 (2.8) | 3.4 (5.1) |
| Krasnopolsky (1997) | NN | W, V, L, SST | 1.5 (1.7) | 3.0 (3.4) |

¹ Theta – incidence angle; ² P0 – surface pressure; ³ HWV – vapor scale height; ⁴ ZCLD – cloud height; ⁵ Ta – effective surface temperature; ⁶ G – lapse rate

Fig. 9.2 SSM/I NN forward model, FM1, which generates brightness temperatures S = TXXY (XX – frequency in GHz, Y – polarization) given the vector G of four geophysical parameters: ocean surface wind speed (W), water vapor (V) and liquid water (L) concentrations, and sea surface temperature (SST)
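To illustrate why the Jacobian of an MLP is easy to calculate, the sketch below builds a network with the layer sizes quoted for FM1 (4 inputs, 12 hidden neurons, 5 outputs) and evaluates $\partial S_i / \partial G_j$ analytically with the chain rule, then cross-checks it against central finite differences. Everything beyond the layer sizes is an assumption made for illustration: the tanh activations, the random placeholder weights (the trained FM1 weights are not given in the text), and the normalized input values.

```python
import math
import random

N_IN, N_HID, N_OUT = 4, 12, 5  # layer sizes quoted for FM1


def init_params(seed=0):
    """Random placeholder weights; these are NOT the trained FM1 weights."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(N_IN)] for _ in range(N_HID)]
    b1 = [rng.uniform(-0.5, 0.5) for _ in range(N_HID)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(N_HID)] for _ in range(N_OUT)]
    b2 = [rng.uniform(-0.5, 0.5) for _ in range(N_OUT)]
    return w1, b1, w2, b2


def forward(params, g):
    """Return hidden activations h and outputs s for an input vector g."""
    w1, b1, w2, b2 = params
    h = [math.tanh(sum(w * x for w, x in zip(row, g)) + b) for row, b in zip(w1, b1)]
    s = [math.tanh(sum(w * x for w, x in zip(row, h)) + b) for row, b in zip(w2, b2)]
    return h, s


def jacobian(params, g):
    """Analytic Jacobian K[i][j] = dS_i/dG_j (chain rule through both tanh layers)."""
    w1, _, w2, _ = params
    h, s = forward(params, g)
    return [
        [
            (1.0 - s[i] ** 2)
            * sum(w2[i][k] * (1.0 - h[k] ** 2) * w1[k][j] for k in range(N_HID))
            for j in range(N_IN)
        ]
        for i in range(N_OUT)
    ]


def finite_difference_jacobian(params, g, eps=1e-6):
    """Central-difference Jacobian, used only to cross-check the analytic one."""
    jac = [[0.0] * N_IN for _ in range(N_OUT)]
    for j in range(N_IN):
        g_plus = list(g)
        g_minus = list(g)
        g_plus[j] += eps
        g_minus[j] -= eps
        _, s_plus = forward(params, g_plus)
        _, s_minus = forward(params, g_minus)
        for i in range(N_OUT):
            jac[i][j] = (s_plus[i] - s_minus[i]) / (2.0 * eps)
    return jac


params = init_params()
g = [0.3, -0.2, 0.1, 0.5]  # arbitrary normalized {W, V, L, SST}
k_analytic = jacobian(params, g)
k_numeric = finite_difference_jacobian(params, g)
max_diff = max(
    abs(a - b) for ra, rb in zip(k_analytic, k_numeric) for a, b in zip(ra, rb)
)
print(f"max |analytic - numeric| = {max_diff:.2e}")
```

The analytic Jacobian reuses the activations already computed in the forward pass, which is why the outputs and their derivatives can be obtained together at little extra cost.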

The matchup databases for the F11 SSM/I have been used for training (about 3,500 matchups) and validating (about 3,500 matchups) our forward model. FM1, the NN emulation of the FM, was trained using all matchups that correspond to clear and cloudy weather conditions in accordance with the retrieval flags introduced by Stogryn et al. (1994). Only those cases where the microwave radiation cannot penetrate the clouds were removed. Then, more than 6,000 matchups for the F10 instrument were used for testing FM1 and comparing it with the physically based forward models of PK and Wentz (1997). The RMS errors for FM1 are systematically lower than those for the PK and Wentz FMs for all weather conditions and all channels considered. With FM1, the horizontally polarized channels 19H and 37H have the highest RMSE, ∼3.5 K under clear and ∼4 K under clear and cloudy conditions. For the vertically polarized channels RMSEs are lower, 1.5 K under clear and 1.7 K under partly clear and partly cloudy conditions. The same trend can be observed for the PK and Wentz FMs. Table 9.1 presents total statistics (RMS errors) for the three FMs discussed here. RMS errors are averaged over different frequencies separately for the vertical and horizontal polarizations. RMS errors are higher under cloudy conditions because the complexity of the forward model increases due to the interaction of the microwave radiation with clouds.

Thus, FM1 gives results that are, in terms of RMSEs, comparable to or better than those obtained with more sophisticated physically based models (shown in Table 9.1), and it is much simpler than physically based FMs. FM1 is not as general as a radiative transfer model; it was developed for use in the data assimilation system for variational retrieval and direct assimilation of SSM/I BTs at particular frequencies from a particular instrument. However, for this particular application (direct assimilation) and particular instrument it has a significant advantage (it is significantly simpler and faster), especially in an operational environment. This is also one of the applications where the accuracy of the NN Jacobian is essential. FM1 simultaneously calculates the BTs and the Jacobian matrix. Krasnopolsky (1997) has demonstrated that for this particular application the NN Jacobian is sufficiently smooth. A generic NN ensemble technique that improves the stability and reduces the uncertainties of the NN emulation Jacobian, if desired, is discussed by Krasnopolsky (2007).

9.6.2 NN Empirical SSM/I Retrieval Algorithms

The SSM/I wind speed retrieval problem is a perfect example illustrating the general discussion presented in Section 9.2.1. The problems encountered in SSM/I wind speed retrievals are very representative, and the methods used to solve them can easily be generalized to other geophysical parameters and sensors. About 10 different SSM/I wind speed retrieval algorithms, both empirical and physically based, have been developed using a large variety of approaches and methods. Here these algorithms are compared in order to illustrate some properties of the different approaches mentioned in previous sections, and some advantages of the NN approach.

Goodberlet et al. (1989) developed the first global SSM/I wind speed retrieval algorithm. This algorithm is a single-parameter algorithm (it retrieves only wind speed) and is linear with respect to BTs (a linear multiple regression is used). Statistics for this algorithm are shown in Table 9.2 under the abbreviation GSW. The algorithm is a linear approximation of a nonlinear (especially under cloudy sky conditions) SSM/I TF (9.8). Under clear sky conditions (Table 9.2), it retrieves the wind speed with acceptable accuracy. However, under cloudy conditions, where the amount of water vapor and/or cloud liquid water in the atmosphere increases, errors in the retrieved wind speed increase significantly.

Goodberlet and Swift (1992) tried to improve the performance of the GSW algorithm under cloudy conditions by using nonlinear regression with a rational type of nonlinearity. Since the nature of the nonlinearity of the SSM/I TF under cloudy conditions is not known precisely, a nonlinear regression with a particular fixed type of nonlinearity may not be enough to improve results, as happens with the algorithm we refer to as GS. In many cases the GS algorithm generates false high wind speeds when real wind speeds are less than 15 m/s (Krasnopolsky et al. 1996).

A nonlinear (with respect to BTs) algorithm (called the GSWP algorithm here) introduced by Petty (1993) is based on a generalized linear regression. It presents a case where the nonlinearity introduced in the algorithm represents the nonlinear behavior of the TF much better. This algorithm introduces a nonlinear correction for the linear GSW algorithm when the amount of water vapor in the atmosphere is not zero. Table 9.2 shows that the GSWP algorithm improves the accuracy of retrievals compared to the linear GSW algorithm under both clear and cloudy conditions. However, it does not improve the GSW algorithm performance at high wind speeds because most high wind speed events occur at mid- and high latitudes, where the amount of water vapor in the atmosphere is not significant. Here, cloud liquid water is the main source of the nonlinear behavior of the TF and has to be taken into account.

Table 9.2 Errors (in m/s) for different SSM/I wind speed algorithms under clear and clear + cloudy (in parentheses) conditions

| Algorithm | Method | Bias | Total RMSE | W > 15 m/s RMSE |
|---|---|---|---|---|
| GSW¹ | Multiple linear regression | −0.2 (−0.5) | 1.8 (2.1) | (2.7) |
| GSWP² | Generalized linear regression | −0.1 (−0.3) | 1.7 (1.9) | (2.6) |
| GS³ | Nonlinear regression | 0.5 (0.7) | 1.8 (2.5) | (2.7) |
| Wentz⁴ | Physically-based | 0.1 (−0.1) | 1.7 (2.1) | (2.6) |
| NN1⁵ | Neural network | −0.1 (−0.2) | 1.5 (1.7) | (2.3) |
| NN2⁶ | Neural network | (−0.3) | (1.5) | – |

¹ Goodberlet et al. (1989); ² Petty (1993); ³ Goodberlet and Swift (1992); ⁴ Wentz (1997); ⁵ Krasnopolsky et al. (1996, 1999); ⁶ Meng et al. (2007)

NN algorithms have been introduced as an alternative to nonlinear and generalized linear regressions because a NN can model the nonlinear behavior of a TF better than these regressions. Stogryn et al. (1994) developed the first NN SSM/I wind speed algorithm, which consists of two NNs, each with the surface wind speed as a single output. One performs retrievals under clear and the other under cloudy conditions. Krasnopolsky et al. (1995) showed that a single NN with the same architecture (a single output) can generate retrievals for surface winds with the same accuracy as the two NNs developed by Stogryn et al. (1994) under both clear and cloudy conditions. Application of a single NN emulation led to a significant improvement in wind speed retrieval accuracy under clear conditions. Under higher moisture/cloudy conditions, the improvement was even greater (25–30%) compared to the GSW algorithm. The increase in areal coverage due to the improvements in accuracy was about 15% on average and higher in areas where there were significant weather events (higher levels of atmospheric moisture).

Both NN algorithms described above give very similar results because they were developed using the same matchup database. This database, however, does not contain matchups for wind speeds higher than about 20 m/s and contains very few matchups for wind speeds higher than 15 m/s. These algorithms are also single-parameter algorithms, i.e., they retrieve only one parameter, wind speed; therefore, they cannot account for the variability in all related atmospheric (e.g., water vapor and liquid water) and surface (e.g., SST) parameters, which is especially important at higher wind speeds. This is why these NN algorithms share the same problem: they cannot generate acceptable wind speeds above 18–19 m/s.

The next generation NN algorithm, a multi-parameter NN algorithm developed at NCEP (NN1 in Table 9.2) by Krasnopolsky et al. (1996, 1999), solved the high wind speed problem through three main advances. First, a new buoy/SSM/I matchup database was used in the development of this algorithm. It contained an extensive matchup data set for the F8, F10, and F11 sensors, provided by the Naval Research Laboratory, and augmented with additional data from the European Ocean Weather Ships Mike and Lima for high latitude, high wind speed events (up to 26 m/s). Second, the NN training method was improved by enhancing the learning for the high wind speed range by weighting the high wind speed events. Third, the variability of related atmospheric and surface parameters was taken into account; surface wind speed (W), columnar water vapor (V), columnar liquid water (L), and SST are all retrieved simultaneously. In this case, the output vector of geophysical parameters is G = {W, V, L, SST}. The NN1 algorithm uses five SSM/I channels: 19 and 37 GHz for horizontal and vertical polarization and 22 GHz for vertical polarization (see Fig. 9.3).
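The second advance, weighting the high wind speed events during training, can be sketched as a weighted error function. The text does not give the weighting scheme that was actually used, so the step-shaped weight, the 15 m/s cut-off, and the boost factor of 5 below are hypothetical choices made only to show the idea.

```python
def high_wind_weights(wind_speeds, cutoff=15.0, boost=5.0):
    """Hypothetical scheme: weight events above `cutoff` (m/s) by `boost`, others by 1."""
    return [boost if w > cutoff else 1.0 for w in wind_speeds]


def weighted_mse(predicted, observed, weights):
    """Weighted mean squared error, normalized by the sum of the weights."""
    total = sum(wt * (p - o) ** 2 for p, o, wt in zip(predicted, observed, weights))
    return total / sum(weights)


# Made-up example values (m/s): the rare high wind event has the largest error.
observed = [5.0, 10.0, 20.0]
predicted = [5.5, 9.5, 17.0]

uniform = weighted_mse(predicted, observed, [1.0] * len(observed))
boosted = weighted_mse(predicted, observed, high_wind_weights(observed))
print(f"unweighted MSE = {uniform:.3f}, high-wind weighted MSE = {boosted:.3f}")
```

With the boosted weights the poorly fitted high wind event dominates the error function, so a training procedure minimizing it is pushed harder to fit the sparsely sampled high wind range.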

Meng et al. (2007) (NN2 in Table 9.2) use the NN multi-parameter retrieval approach developed by Krasnopolsky et al. (1996, 1999) to design another NN multi-parameter retrieval algorithm for SSM/I. They use all 7 SSM/I BTs as inputs. Their output vector also has four components, G = {W, Ta, H, SST}, where surface wind speed (W), surface air temperature (Ta), humidity (H), and SST are retrieved simultaneously. In this case, the training database was limited to maximum wind speeds of about 20 m/s. Moreover, it contains only a few higher speed events with W > 15–17 m/s.

Fig. 9.3 SSM/I retrieval algorithm (NN1) emulating the inverse model to retrieve the vector G of four geophysical parameters, ocean surface wind speed (W), water vapor (V) and liquid water (L) concentrations, and sea surface temperature (SST), given five brightness temperatures S = TXXY (XX – frequency in GHz, Y – polarization)

Table 9.2 compares the performance of all the aforementioned empirical algorithms in terms of the accuracy of the surface wind speed retrievals. It also shows statistics for a physically based algorithm developed by Wentz (1997), which is based on a linearized numerical inversion (9.7) of a physically based FM. The statistics presented in Table 9.2 were calculated using independent buoy-SSM/I matchups. Table 9.2 shows that the NN algorithms outperform all other algorithms. All algorithms except the NN algorithms show a tendency to overestimate high wind speeds. This happens because high wind speed events are usually accompanied by a significant amount of cloud liquid water in the atmosphere. Under these circumstances the transfer function f becomes a complicated nonlinear function; simple single-parameter regression algorithms cannot represent this function adequately and confuse a high concentration of cloud liquid water with very high wind speeds. Krasnopolsky et al. (1999, 2000) showed that, because of these effects, single-parameter algorithms have additional (compared to multi-parameter retrievals) systematic (bias) and random (unaccounted variance due to other parameters) errors in the single retrieved parameter. NN1 shows the best overall performance in terms of bias, RMSE, and high wind speed performance.

As mentioned above, one of the significant advantages of the NN1 algorithm is its ability to retrieve simultaneously not only the wind speed but also three other atmospheric and ocean surface parameters: columnar water vapor V, columnar liquid water L, and SST. Krasnopolsky et al. (1996) showed that the accuracy of retrieval for the other geophysical parameters is very good and close to that attained by the algorithms of Alishouse et al. (1990) (for V) and Weng and Grody (1994) (for L). In addition, Krasnopolsky et al. (1999, 2000) have shown that the errors of multi-parameter NN algorithms have a weaker dependence on the related atmospheric and surface parameters than the errors of the single-parameter algorithms considered. The retrieved SST in this case is not accurate (the RMS error is about 4 °C, see Krasnopolsky et al. 1996); however, including SST in the vector of retrieved parameters decreases the error in other retrievals correlated with the SST. For the multi-parameter NN algorithm NN2 (Meng et al. 2007), the choice of the additional outputs surface air temperature (Ta) and humidity (H), which are closely and physically related to and correlated with SST, makes the accuracy of the retrieved SST signal higher (the bias is about 0.1 °C and the RMS error 1.54 °C). In accordance with the classical, “linear” remote sensing paradigm, the SSM/I instrument does not have the frequency required to sense SST. However, due to the nonlinear nature of the NN emulation and the proper choice of output parameters, the multi-parameter NN algorithm is probably able to use weak nonlinear dependencies between NN inputs and outputs and between NN outputs to retrieve SST with a high accuracy.

9.6.3 Controlling the NN Generalization in the SSM/I Case

The NN1 retrieval algorithm was judged so successful that it has been used as the operational algorithm in the global data assimilation system at NCEP/NOAA since 1998. Given five brightness temperatures, it retrieves the four geophysical parameters: ocean surface wind speed, water vapor and liquid water concentrations, and sea surface temperature. At high levels of liquid water concentration the microwave radiation cannot penetrate clouds and surface wind speed retrievals become impossible. Brightness temperatures on these occasions fall far outside the training domain $D_T$. The retrieval algorithm in these cases, if not flagged properly, will produce wind speed retrievals which are physically meaningless (i.e., not related to actual surface wind speed). Usually a statistically based retrieval flag, based on global statistics, is used to indicate such occurrences. Under complicated local conditions, however, it can produce a significant number of false alarms or fail to produce alarms when needed.

[Figure 9.4: NN1 maps T19V, T19H, T22V, T37V, T37H to W, V, L, SST; FM1 maps these back to T19V′, T19H′, T22V′, T37V′, T37H′; the validity check compares Σ|T − T′| with a threshold ε and sets the quality flag.]

Fig. 9.4 SSM/I retrieval algorithm (NN1) emulating the inverse model to retrieve the vector G of four geophysical parameters, ocean surface wind speed (W), water vapor (V) and liquid water (L) concentrations, and sea surface temperature (SST), given five brightness temperatures S = TXXY (XX – frequency in GHz, Y – polarization). This vector G is fed to FM1, which emulates the forward model, to obtain brightness temperatures S′ = TXXY′. The difference ΔS = |S − S′| is monitored, and a warning flag is raised if it is above a suitably chosen threshold

The validity check shown in Fig. 9.4, if added to a standard retrieval flag, helps indicate such occurrences. The NN SSM/I forward model FM1 is used in combination with the NN1 retrieval algorithm. For each satellite measurement S, the geophysical parameters retrieved from the brightness temperatures S are fed into the NN SSM/I forward model, which produces another set of brightness temperatures S′. For S within the training domain ($S \in D_T$) the difference, ΔS = |S − S′|, is sufficiently small. For S outside the training domain the difference is larger and raises a warning flag if it is above a suitably chosen threshold. Krasnopolsky and Schiller (2003) showed the percentage of removed data and the improvement in the accuracy of the wind speed retrievals as functions of this threshold. They showed that applying the generalization control reduces the RMS error significantly; the maximum error is reduced even more. This means that this approach is very efficient at removing outliers.
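The dependence on the threshold described above can be explored with a small helper that, for a given threshold on ΔS, reports the fraction of retrievals removed and the RMS and maximum errors of those retained. This is only a sketch: the function name `qc_statistics` is hypothetical, and the ΔS values and wind speed errors in the example are made up for illustration; they are not results from Krasnopolsky and Schiller (2003).

```python
import math


def qc_statistics(deltas, errors, threshold):
    """Return (removed fraction, RMSE of kept retrievals, max |error| of kept)."""
    kept = [e for d, e in zip(deltas, errors) if d <= threshold]
    removed_fraction = 1.0 - len(kept) / len(deltas)
    if not kept:
        return removed_fraction, None, None
    rmse = math.sqrt(sum(e * e for e in kept) / len(kept))
    max_abs_error = max(abs(e) for e in kept)
    return removed_fraction, rmse, max_abs_error


# Made-up illustrative values: one retrieval has a large Delta S and a large error.
deltas = [0.2, 0.5, 0.3, 4.0, 0.4]      # Delta S for each retrieval (K)
errors = [1.0, -1.0, 2.0, 10.0, -2.0]   # retrieved minus observed wind speed (m/s)

for threshold in (float("inf"), 1.0):
    removed, rmse, max_err = qc_statistics(deltas, errors, threshold)
    print(f"threshold={threshold}: removed={removed:.0%}, "
          f"RMSE={rmse:.2f} m/s, max error={max_err:.1f} m/s")
```

Sweeping the threshold over a range of values gives the trade-off between data loss and accuracy from which a suitable threshold can be chosen.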

9.7 Discussion

In this chapter we discussed a broad class of NN applications dealing with the solution of the RS forward and inverse problems. These applications are closely related to the standard and variational retrievals, which estimate geophysical parameters from remote satellite measurements. Both standard and variational techniques require a data converter to convert satellite measurements into geophysical parameters or vice versa. Standard retrievals use a TF (a solution of the inverse problem) and variational retrievals use a FM (a solution of the forward problem) for this purpose. In many cases the TF and the FM can be considered as continuous nonlinear mappings. Because the NN technique is a general technique for approximating continuous nonlinear mappings, it can be used successfully for modeling TFs and FMs.

The theoretical considerations presented in this section were illustrated using several real-life applications that exemplify a NN based intelligent integral approach (e.g., the approach and design presented in Fig. 9.4), in which the entire retrieval system, including the quality control block, is designed from a combination of several specialized NNs. This approach offers significant advantages in real-life operational applications. Such an intelligent retrieval system not only produces accurate retrievals but also performs an analysis and quality control of the retrievals and environmental conditions, rejecting any poor retrievals that occur.

The NN applications presented in this section illustrate the strengths and limits of the NN technique for inferring geophysical parameters from remote sensing measurements. The success of the NN approach, like that of any nonlinear statistical approach, depends strongly on the adequacy of the data set used for the NN training. The availability, precision, quality, representativeness, and amount of data are crucial for success in this type of NN application. However, NNs successfully compete with other statistical methods and usually perform better because they are able to emulate the functional relationship between the inputs and the outputs in an optimal way. NNs can successfully compete even with physically based approaches because, in many cases, explicit knowledge of very complicated physical processes in the environment is limited and a NN based empirical approach is more appropriate. By using the NN ability to learn inherent dependencies from data, it can take more physics into account implicitly than a physically based approach would include explicitly.

References

Abdelgadir, A. et al. (1998). Forward and inverse modeling of canopy directional reflectance using a neural network. International Journal of Remote Sensing, 19, 453–471.

Aires, F., Rossow, W. B., Scott, N. A., & Chedin, A. (2002). Remote sensing from the infrared atmospheric sounding interferometer instrument. 2. Simultaneous retrieval of temperature, water vapor, and ozone atmospheric profiles. Journal of Geophysical Research, 107, 4620.

Aires, F., Prigent, C., & Rossow, W. B. (2004). Neural network uncertainty assessment using Bayesian statistics with application to remote sensing. 3. Network Jacobians. Journal of Geophysical Research, 109, D10305.

Alishouse, J. C. et al. (1990). Determination of oceanic total precipitable water from the SSM/I. IEEE Transactions on Geoscience and Remote Sensing, GE-23, 811–816.

Atkinson, P. M., & Tatnall, A. R. L. (1997). Neural networks in remote sensing – introduction. International Journal of Remote Sensing, 18(4), 699–709.

Attali, J.-G., & Pagès, G. (1997). Approximations of functions by a multilayer perceptron: A new approach. Neural Networks, 10, 1069–1081.

Cabrera-Mercader, C. R., & Staelin, D. H. (1995). Passive microwave relative humidity retrievals using feedforward neural networks. IEEE Transactions on Geoscience and Remote Sensing, 33, 1324–1328.

Chen, T., & Chen, H. (1995a). Approximation capability to functions of several variables, nonlinear functionals and operators by radial basis function neural networks. Neural Networks, 6, 904–910.

Chen, T., & Chen, H. (1995b). Universal approximation to nonlinear operators by neural networks with arbitrary activation function and its application to dynamical systems. Neural Networks, 6, 911–917.

Cornford, D., Nabney, I. T., & Ramage, G. (2001). Improved neural network scatterometer forward models. Journal of Geophysical Research, 106, 22331–22338.

Cybenko, G. (1989). Approximation by superposition of sigmoidal functions. Mathematics of Control, Signals and Systems, 2, 303–314.

Davis, D. T. et al. (1995). Solving inverse problems by Bayesian iterative inversion of a forward model with application to parameter mapping using SMMR remote sensing data. IEEE Transactions on Geoscience and Remote Sensing, 33, 1182–1193.

Derber, J. C., & Wu, W.-S. (1998). The use of TOVS cloud-cleared radiances in the NCEP SSI analysis system. Monthly Weather Review, 126, 2287–2299.

DeVore, R. A. (1998). Nonlinear approximation. Acta Numerica, 8, 51–150.

Eyre, J. R., & Lorenc, A. C. (1989). Direct use of satellite sounding radiances in numerical weather prediction. Meteorological Magazine, 118, 13–16.

Funahashi, K. (1989). On the approximate realization of continuous mappings by neural networks. Neural Networks, 2, 183–192.

Goodberlet, M. A., & Swift, C. T. (1992). Improved retrievals from the DMSP wind speed algorithm under adverse weather conditions. IEEE Transactions on Geoscience and Remote Sensing, 30, 1076–1077.

Goodberlet, M. A., Swift, C. T., & Wilkerson, J. C. (1989). Remote sensing of ocean surface winds with the special sensor microwave imager. Journal of Geophysical Research, 94, 14547–14555.

Hadamard, J. (1902). Sur les problèmes aux dérivées partielles et leur signification physique. Princeton University Bulletin, 13, 49–52.

Hornik, K. (1991). Approximation capabilities of multilayer feedforward networks. Neural Networks, 4, 251–257.

Krasnopolsky, V. (1997). A neural network-based forward model for direct assimilation of SSM/I brightness temperatures. Technical note (OMB contribution No. 140). NCEP/NOAA, Camp Springs, MD 20746.

Krasnopolsky, V., Breaker, L. C., & Gemmill, W. H. (1995). A neural network as a nonlinear transfer function model for retrieving surface wind speeds from the special sensor microwave imager. Journal of Geophysical Research, 100, 11033–11045.

Krasnopolsky, V., Gemmill, W. H., & Breaker, L. C. (1996). A new transfer function for SSM/I based on an expanded neural network architecture. Technical note (OMB contribution No. 137). NCEP/NOAA, Camp Springs, MD 20746.

Krasnopolsky, V. M. (2007). Reducing uncertainties in neuralnetwork Jacobians and improving accuracy of neural net-work emulations with NN ensemble approaches. Neural Net-works, 20, 454–461.

Krasnopolsky, V. M., & Schiller, H. (2003). Some neural net-work applications in environmental sciences. Part I: Forwardand inverse problems in satellite remote sensing. Neural Net-works, 16, 321–334.

Krasnopolsky, V. M., Gemmill, W. H., & Breaker, L. C. (1999).A multiparameter empirical ocean algorithm for SSM/Iretrievals. Canadian Journal of Remote Sensing, 25, 486–503.

Krasnopolsky, V. M., Gemmill, W. H., & Breaker, L. C. (2000).A neural network multi-parameter algorithm SSM/I oceanretrievals: Comparisons and validations. Remote Sensing ofEnvironment, 73, 133–142.

Lorenc, A. C. (1986). Analysis methods for numerical weatherprediction. Quarterly Journal of Royal Meteorology Society,122, 1177–1194.

McNally, A. P., Derber, J. C., Wu, W.-S., & Katz, B. B. (2000).The use of TOVS level 1B radiances in the NCEP SSIanalysis system. Quarterly Journal of Royal MeteorologicalSociety, 126, 689–724.

Meng, L. et al. (2007). Neural network retrieval of ocean surfaceparameters from SSM/I data. Monthly Weather Review, 126,586–597.

Mueller, M. D. et al. (2003). Ozone profile retrieval fromglobal ozone monitoring experiment data using a neuralnetwork approach (Neural Network Ozone Retrieval System(NNORSY)). Journal of Geophysical Research, 108, 4497.

Parker, R. L. (1994). Geophysical inverse theory (400 pp.).Princeton, NJ: Princeton University Press.

Parrish, D. F., & Derber, J. C. (1992). The national meteorologi-cal center’s spectral statistical-interpolation analysis system.Monthly Weather Review, 120, 1747–1763.

Petty, G. W. (1993). A comparison of SSM/I algorithms for the estimation of surface wind. Proceedings of Shared Processing Network DMSP SSM/I Algorithm Symposium, Monterey, CA, June 8–10, 1993.

Petty, G. W., & Katsaros, K. B. (1992). The response of the special sensor microwave/imager to the marine environment. Part I: An analytic model for the atmospheric component of observed brightness temperature. Journal of Atmospheric Oceanic Technology, 9, 746–761.

Petty, G. W., & Katsaros, K. B. (1994). The response of the SSM/I to the marine environment. Part II: A parameterization of the effect of the sea surface slope distribution on emission and reflection. Journal of Atmospheric Oceanic Technology, 11, 617–628.

Phalippou, L. (1996). Variational retrieval of humidity profile, wind speed and cloud liquid–water path with the SSM/I: Potential for numerical weather prediction. Quarterly Journal of Royal Meteorological Society, 122, 327–355.

Pierce, L., Sarabandi, K., & Ulaby, F. T. (1994). Application of an artificial neural network in canopy scattering inversion. International Journal of Remote Sensing, 15, 3263–3270.

Prigent, C., Phalippou, L., & English, S. (1997). Variational inversion of the SSM/I observations during the ASTEX campaign. Journal of Applied Meteorology, 36, 493–508.

Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal representations by error propagation. In D. E. Rumelhart, J. L. McClelland, & P. R. Group (Eds.), Parallel distributed processing (Vol. 1, pp. 318–362). Cambridge, MA: MIT Press.

Schiller, H., & Doerffer, R. (1999). Neural network for emulation of an inverse model - operational derivation of case II water properties from MERIS data. International Journal of Remote Sensing, 20, 1735–1746.

Schiller, H., & Krasnopolsky, V. M. (2001). Domain check for input to NN emulating an inverse model. Proceedings of International Joint Conference on Neural Networks, Washington, DC, July 15–19, pp. 2150–2152.

Smith, J. A. (1993). LAI inversion using a back-propagation neural network trained with a multiple scattering model. IEEE Transactions on Geoscience and Remote Sensing, GE-31, 1102–1106.

Stoffelen, A., & Anderson, D. (1997). Scatterometer data interpretation: Estimation and validation of the transfer function CMOD4. Journal of Geophysical Research, 102, 5767–5780.

Stogryn, A. P., Butler, C. T., & Bartolac, T. J. (1994). Ocean surface wind retrievals from special sensor microwave imager data with neural networks. Journal of Geophysical Research, 90, 981–984.

Thiria, S., Mejia, C., Badran, F., & Crepon, M. (1993). A neural network approach for modeling nonlinear transfer functions: Application for wind retrieval from spaceborne scatterometer data. Journal of Geophysical Research, 98, 22827–22841.

Tsang, L. et al. (1992). Inversion of snow parameters from passive microwave remote sensing measurements by a neural network trained with a multiple scattering model. IEEE Transactions on Geoscience and Remote Sensing, GE-30, 1015–1024.

Vapnik, V. N. (1995). The nature of statistical learning theory (189 pp.). New York: Springer.

Vapnik, V. N., & Kotz, S. (2006). Estimation of dependences based on empirical data (Information Science and Statistics) (495 pp.). New York: Springer.

Weng, F., & Grody, N. G. (1994). Retrieval of cloud liquid water using the special sensor microwave imager (SSM/I). Journal of Geophysical Research, 99, 25535–25551.

Wentz, F. J. (1997). A well-calibrated ocean algorithm for special sensor microwave/imager. Journal of Geophysical Research, 102, 8703–8718.