Laboratory in Oceanography: Data and Methods MAR599, Spring 2009 Anne-Marie E.G. Brunner-Suzuki Empirical Orthogonal Functions

Laboratory in Oceanography:

Data and Methods

MAR599, Spring 2009

Anne-Marie E.G. Brunner-Suzuki

Empirical Orthogonal Functions

Motivation

- Miles’s class
- Distinguish patterns from noise
- Reduce dimensionality
- Prediction
- Smoothing

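The Goal

1. Separate the time and space dependence of the data, so that each mode k is the product of a time series c_k(t) and a spatial pattern u_k(s): $X(t, s) = \sum_k c_k(t)\, u_k(s)$

2. Filter out the noise and reveal “hidden” structure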

Matlab Example 1

Data

t: time; x_t is one “map” in time. There are n timesteps and p different measurements at each timestep.

Matlab Example 2 – artificial signal

Summary

- EOF analysis lets us separate an ensemble of data into k different modes.
- Each mode has a ‘space’ component (EOF = u) and a ‘time’ component (EC = c).
- Pre-treating the data (taking out the temporal or spatial mean) can be useful in finding “hidden” structures.
- But all the information is contained in the data.
- It is “just” a mathematical construct. We, the researchers, are responsible for finding appropriate explanations.

Naming convention

- Empirical Orthogonal Function (EOF) analysis
- Principal Component Analysis (PCA)
- Discrete Karhunen–Loève functions
- Hotelling transform
- Proper orthogonal decomposition

How to deal with gaps?

- Ignore them; leave them be.
- Introduce randomly generated data to fill the gaps, and test for M realizations.
- Fill the gaps in each data series using e.g. optimal interpolation.
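
The second option can be sketched roughly as follows. This is only an illustration under stated assumptions: the data set is made up, gaps are marked as NaN, and each gap is filled with a random value drawn to match the mean and spread of the observed values at that location. The variable names (`gaps`, `lead`, `spread`) are hypothetical.

```matlab
% Hypothetical data set with about 10% of the values missing (NaN).
n = 120; p = 5; M = 20;
t = (1:n)';
X = sin(2*pi*t/30) * ones(1, p) + 0.3*randn(n, p);
gaps = rand(n, p) < 0.1;
X(gaps) = NaN;

lead = zeros(M, 1);                 % variance explained by EOF 1, per realization
for m = 1:M
    Xf = X;
    for j = 1:p
        good = ~isnan(X(:, j));
        mu = mean(X(good, j));
        sd = std(X(good, j));
        % Fill the gaps of column j with random values of matching mean and spread.
        Xf(~good, j) = mu + sd * randn(sum(~good), 1);
    end
    s = svd(detrend(Xf, 0));        % singular values of the demeaned, filled data
    lead(m) = s(1)^2 / sum(s.^2);
end
spread = max(lead) - min(lead);     % a small spread suggests the gaps matter little
```

If the leading mode changes little across the M realizations, the result is probably not sensitive to how the gaps were filled.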

Next time

- Some math: what happens inside the black box?
- How do we know how many modes are significant?
- Some problems and pitfalls
- More advanced EOF
- Matlab’s own function

References

- Preisendorfer
- Storch
- Hannachi

Laboratory in Oceanography:

Data and Methods

MAR599, Spring 2009

Anne-Marie E.G. Brunner-Suzuki

Empirical Orthogonal Functions, Part II

Pre-treating the data X

1. Shaping the data set: X has n timesteps and p different measurements, so that [n,p] = size(X); Use ‘reshape’ to convert from 3D to 2D, then transpose to get time x space: X = reshape(X3D, [nx*ny ntimes])';

2. Remove the mean from the data, so each column (= time series) has zero mean: X = detrend(X,0);
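
A minimal sketch of both pre-treatment steps, assuming a hypothetical 3-D array `X3D` of size nx x ny x ntimes (the sizes and the random values are made up for illustration):

```matlab
% Hypothetical gridded field: nx-by-ny maps at ntimes time steps.
nx = 4; ny = 3; ntimes = 50;
X3D = randn(nx, ny, ntimes);

% 1. Shape the data: reshape gives space x time, so transpose
%    afterwards to get one map per row (time x space).
X = reshape(X3D, [nx*ny, ntimes])';
[n, p] = size(X);                   % n = ntimes, p = nx*ny

% 2. Remove the time mean of every column (grid point).
X = detrend(X, 0);                  % order 0: subtract the mean only

% To plot a pattern later, a p-by-1 column can be mapped back onto
% the grid with reshape(column, [nx, ny]).
```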

How to do it

1. Form the covariance matrix: Cx = X’X (up to the normalization factor 1/(n−1)).

2. Solve the eigenvalue problem: Cx R = R Λ. Λ is a diagonal matrix containing all the eigenvalues λ of Cx. The columns ri of R are the eigenvectors of Cx, each corresponding to its λi. We pick the ri to be our EOF patterns: R = EOFs

3. We sort the eigenvalues in descending order, λ1 > λ2 > … > λp, and reorder the ri correspondingly.

Eigenvectors & Eigenvalues

Cx R = R Λ. Here, R is a set of vectors that Cx transforms into the same vectors, except for a multiplication factor λ. Each vector changes in length, but not in direction.

The columns of R are called eigenvectors. The diagonal elements of Λ are called eigenvalues.

Also, because Cx is Hermitian (symmetric about the diagonal: Cx’ = Cx), there will be p eigenvectors; if Cx has rank p, all p eigenvalues are non-zero.

The eigenvectors of a symmetric matrix such as Cx are orthogonal (or can always be chosen to be). This does not hold for matrices in general.

4. Together, all EOFs explain 100% of the variance. Each mode explains part of the total variance, namely λi divided by the sum of all eigenvalues.

5. All eigenvectors are orthogonal to each other; hence Empirical ORTHOGONAL Functions.

6. To see how the EOFs evolve in time, we compute the ‘expansion coefficients’ or amplitudes: ECi = X EOFi;

In Matlab:

1. Shape your data into time x space
2. Demean your data: X = detrend(X,0);
3. Compute the covariance: Cx = cov(X);
4. Compute eigenvectors and eigenvalues: [EOFs, l] = eig(Cx);
5. Sort according to size. For a symmetric matrix, eig usually returns the eigenvalues in ascending order, so sort them explicitly: [lambda, idx] = sort(diag(l), 'descend'); EOFs = EOFs(:, idx);
6. Compute EC: EC1 = X * EOFs(:, 1);
7. Compute variance explained: Var_expl = lambda/sum(lambda);
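
Put together, the covariance route might look like the sketch below. The data set is synthetic (one made-up spatial pattern oscillating in time, plus noise), and the explicit descending sort is added so that mode 1 is the largest mode regardless of the order in which `eig` returns its output.

```matlab
% Synthetic data (assumption): one spatial pattern oscillating in time plus noise.
n = 200; p = 6;
t = (1:n)';
pattern = [1 2 3 3 2 1];                    % hypothetical spatial pattern
X = sin(2*pi*t/50) * pattern + 0.1*randn(n, p);

X = detrend(X, 0);                          % 2. remove the time mean
Cx = cov(X);                                % 3. p-by-p covariance matrix
[EOFs, L] = eig(Cx);                        % 4. eigenvectors and eigenvalues
[lambda, idx] = sort(diag(L), 'descend');   % 5. largest eigenvalue first
EOFs = EOFs(:, idx);
EC = X * EOFs;                              % 6. expansion coefficients, one column per mode
var_expl = lambda / sum(lambda);            % 7. fraction of variance per mode
```

With this input nearly all of the variance should land in the first mode, and `EC * EOFs'` gives back the demeaned data.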

Normalization

Often the EOFs are normalized, so that the highest value is 1 or 100.

The EC will need to be adjusted correspondingly, as X = EC * EOFs’ has to remain valid.
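
One way to do this rescaling, shown as a sketch on random test data (the names `scale`, `EOFs_n` and `EC_n` are hypothetical): each EOF is divided by its largest absolute value and the matching EC is multiplied by the same factor, so their product does not change.

```matlab
% Assumed inputs: a small random data set and its EOFs from an SVD.
n = 100; p = 5;
X = detrend(randn(n, p), 0);
[U, S, V] = svd(X, 'econ');
EOFs = V;
EC = U * S;

% Scale every EOF so that its largest absolute value is 1 ...
scale = max(abs(EOFs), [], 1);              % 1-by-p row of scale factors
EOFs_n = EOFs * diag(1 ./ scale);
% ... and scale the matching EC the other way, so the product is unchanged.
EC_n = EC * diag(scale);

err = norm(EC_n * EOFs_n' - X);             % should be at round-off level
```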

How to understand this?

Let’s assume we only have 2 samples xa and ya that evolve in time.

If all the observations were random, they would form a blob in (xa, ya) space. Any regularities would show up as directionalities in the blob.

EOF analysis aims to find these directionalities by defining a new coordinate system, where the new axes go right along them.
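
The two-variable picture can be tried out numerically. In this sketch the two series are made up: both contain a shared signal `s`, with `ya` half as strong as `xa`, so the cloud of points is stretched along one direction.

```matlab
% Two hypothetical series that share a common signal.
n = 500;
s = randn(n, 1);                            % shared signal
xa = s + 0.2*randn(n, 1);
ya = 0.5*s + 0.2*randn(n, 1);
X = detrend([xa ya], 0);

[R, L] = eig(cov(X));
[lambda, idx] = sort(diag(L), 'descend');
R = R(:, idx);                              % columns = axes of the new coordinate system

% Direction of the first new axis, as an angle from the xa axis (degrees).
angle1 = atan2(R(2,1), R(1,1)) * 180/pi;

% Coordinates of the data in the new system are uncorrelated.
Y = X * R;
C_new = cov(Y);                             % diagonal, with lambda on the diagonal
```

The first axis should point roughly along the line ya = 0.5 xa (about 27 degrees, or the opposite direction, since the sign of an eigenvector is arbitrary).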

With p observations, we have a p-dimensional space, and we want to find every cluster by laying a new coordinate system (basis) through the data.

The EOF method takes all the variability in a time-evolving field and breaks it into a few standing oscillations and a time series to go with each oscillation. The EC show how the EOF modes vary in time.

A word about removing the mean

Removing the time mean has nothing to do with the process of finding eigenvectors, but it allows us to interpret Cx as a covariance matrix, and hence to understand our results. Strictly speaking, one can find EOFs without removing any mean.

EOF via SVD

SVD: Singular Value Decomposition. It decomposes any n x p matrix X into the form: X = U S V’

- U is an n x n orthonormal matrix.
- S is a diagonal n x p matrix with the elements si,i on the diagonal; these are called singular values.
- V is a p x p orthonormal matrix.
- The columns of U and V contain the singular vectors of X.

Connecting SVD and EOF

X is the demeaned data matrix as before.

1. Cx = X’X = (U S V’)’ (U S V’) = V S’ U’ U S V’ = V S’S V’

2. Cx = EOFs Λ EOFs’ (rewritten eigenvalue problem)

Comparing 1. & 2.: EOFs = V (at least almost)

Λ = S’S: the squared singular values are the eigenvalues.

The columns of V contain the eigenvectors of Cx = X’X; our EOFs.

The columns of U contain the eigenvectors of X X’; they are also the normalized time series.
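
The identity can be checked numerically. The sketch below uses random test data and the definition Cx = X’X from this slide; the "almost" shows up as an arbitrary sign on each eigenvector. If `cov(X)` is used instead, the eigenvalues are additionally divided by n−1.

```matlab
n = 60; p = 4;
X = detrend(randn(n, p), 0);                % demeaned random test data (assumption)

[U, S, V] = svd(X, 'econ');
s2 = diag(S).^2;                            % squared singular values, descending

Cx = X' * X;                                % as defined on this slide
[R, L] = eig(Cx);
[lambda, idx] = sort(diag(L), 'descend');
R = R(:, idx);

% Eigenvalues agree with the squared singular values ...
err_val = max(abs(lambda - s2)) / max(s2);
% ... and each eigenvector matches a column of V up to its sign.
err_vec = max(abs(abs(sum(R .* V, 1)) - 1));
```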

How to do it

1. Use SVD to find U, S and V such that X = U S V’

2. Compute the eigenvalues of Cx as the squared singular values (the diagonal of S’S).

3. The eigenvectors of Cx are the column vectors of V.

We never have to actually compute Cx!

In Matlab

1. Shape your data into time x space
2. Demean your data: X = detrend(X,0);
3. Perform SVD: [U, S, V] = svd(X);
4. Compute eigenvalues: EVal = diag(S.^2);
5. Compute explained variance: expl_var = EVal/sum(EVal);
6. EOFs are the column vectors of V: EOFs = V;
7. Compute expansion coefficients: EC = U*S;
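
The SVD recipe as one runnable sketch, on a made-up signal. Two choices here are assumptions rather than part of the slide: the economy-size call `svd(X, 'econ')`, which avoids building the full n x n matrix U, and the final step that rebuilds the field from the first k modes to illustrate smoothing.

```matlab
n = 200; p = 6;
t = (1:n)';
X = cos(2*pi*t/40) * [1 -1 2 -2 1 -1] + 0.1*randn(n, p);  % made-up signal plus noise

X = detrend(X, 0);                          % 2. demean
[U, S, V] = svd(X, 'econ');                 % 3. economy-size SVD
EVal = diag(S).^2;                          % 4. eigenvalues of X'X
expl_var = EVal / sum(EVal);                % 5. explained variance
EOFs = V;                                   % 6. EOFs are the columns of V
EC = U * S;                                 % 7. expansion coefficients

% Keep only the first k modes to get a smoothed (filtered) field.
k = 1;
X_k = EC(:, 1:k) * EOFs(:, 1:k)';
```

Note that `EC` computed this way equals `X * EOFs`, the definition used in the covariance method.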

The two techniques

There are basically two techniques:

1. Computing eigenvectors and eigenvalues of the covariance matrix
2. Singular Value Decomposition (SVD) of the data.

Both methods give similar results. Check it out!

However,

1. There are some differences in dimensionality.
2. SVD is much faster, especially when your data are above 1000 x 1000 points.

Testing Domain Dependency

If the first EOF is unimodal and the second bimodal, the EOF analysis might be domain dependent.

Testing:

- Split your domain into two sections (e.g. North and South).
- Repeat the EOF analysis for each sub-domain.
- Are the same results (unimodal and bimodal structures) obtained for each sub-domain?

If yes: the EOF analysis is domain dependent. Interpretation becomes difficult or impossible.

A possible solution is “rotated EOFs” (REOF): after an EOF analysis, some of the eigenvectors are rotated.
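
A rough sketch of the split test, under several assumptions: the field is made-up spatially correlated noise on p grid points ordered from south to north, and "unimodal" or "bimodal" is judged crudely by counting sign changes along each pattern. The helper `nchange` is hypothetical, not a Matlab function.

```matlab
% Made-up field on p grid points, ordered from south to north (assumption).
n = 150; p = 20;
X = detrend(cumsum(randn(n, p), 2), 0);     % spatially correlated noise, no real modes

south = 1:p/2;
north = (p/2 + 1):p;
[Us, Ss, Vs] = svd(X(:, south), 'econ');    % EOFs of the southern half
[Un, Sn, Vn] = svd(X(:, north), 'econ');    % EOFs of the northern half

% Count sign changes along each pattern: 0 = unimodal, 1 = bimodal, ...
nchange = @(v) sum(abs(diff(sign(v))) > 0);
modes_south = [nchange(Vs(:,1)) nchange(Vs(:,2))];
modes_north = [nchange(Vn(:,1)) nchange(Vn(:,2))];
```

If both halves show the same unimodal and bimodal sequence as the full domain, the patterns probably reflect the shape of the domain rather than the physics.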

EOF from Hannachi example

Winter (DJF) monthly SLP over the Northern Hemisphere (NH) from NCEP/NCAR reanalyses, January 1948 to December 2000.

The mean annual cycle was removed.

Positive contours solid, negative contours dashed. EOFs have been multiplied by 100.

Selection Rules

Visual.

North’s Rule of Thumb

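North et al. defined the “typical error” of an eigenvalue λ, to be compared with the spacing to its neighboring eigenvalues:

$\Delta\lambda_k \approx \lambda_k \sqrt{2/n}$

and the “typical error” of the corresponding eigenvector ψ, where λj is the eigenvalue closest to λk:

$\Delta\psi_k \approx \frac{\Delta\lambda_k}{\lambda_j - \lambda_k}\, \psi_j$

n is the number of degrees of freedom, which is generally less than the number of data points.

If two modes are too close, they are called degenerate.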
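
As a small worked example of the rule of thumb, assuming the eigenvalue error is λ√(2/n) and using a made-up eigenvalue spectrum and number of degrees of freedom:

```matlab
% Hypothetical eigenvalue spectrum and number of degrees of freedom.
lambda = [10; 6; 5.5; 2; 1];                % sorted, largest first
n_dof = 50;

dlambda = lambda * sqrt(2 / n_dof);         % typical error of each eigenvalue
gap = -diff(lambda);                        % spacing to the next smaller eigenvalue

% Modes k and k+1 are flagged as degenerate when the spacing between
% them is smaller than the typical error of mode k.
degenerate = gap < dlambda(1:end-1);
```

Here the errors are 2, 1.2, 1.1, 0.4 and 0.2, and the spacings are 4, 0.5, 3.5 and 1, so only modes 2 and 3 are flagged: their patterns are likely to be mixed and should not be interpreted separately.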

Complex EOF

- Allows us to analyze propagating signals.
- Analyze a set of time series by creating a phase lag between them, adding a 90-degree phase shift. This is done in complex space using the Hilbert transform.
- It is a cool technique, but pretty complex.
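
A sketch of the idea on a made-up propagating wave. To avoid depending on a toolbox, the analytic signal (the data plus i times its 90-degree shifted copy) is built directly with the FFT here rather than with a `hilbert` function; the wave is sampled over whole periods so the construction is exact.

```matlab
% Made-up propagating wave cos(w*t - k*x), sampled over whole periods.
n = 64; p = 16;
t = (0:n-1)';
x = (0:p-1);
X = cos((2*pi*4*t/n) * ones(1, p) - ones(n, 1) * (2*pi*x/p));
X = detrend(X, 0);

% Ordinary EOF: the propagating wave is split over two standing modes.
s = svd(X);
expl_real = s.^2 / sum(s.^2);

% Analytic signal along time, built with the FFT: keep the mean and Nyquist
% terms, double the positive frequencies, zero the negative ones (n is even).
h = zeros(n, 1);
h([1, n/2 + 1]) = 1;
h(2:n/2) = 2;
Xc = ifft(fft(X) .* (h * ones(1, p)));      % real part = X, imaginary part = shifted X

% Complex EOF: SVD of the complex data matrix.
sc = svd(Xc);
expl_cplx = sc.^2 / sum(sc.^2);
```

For this idealized wave the ordinary EOFs need two modes with half of the variance each, while the first complex EOF captures the whole propagating signal.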

Monte Carlo

Create surrogate data – a randomized data set by scrambling the monthly maps in the time domain, in order to break the chronological order.

Compute EOF of scrambled dataset and analyze EOFs.
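
One way to make this concrete is sketched below. It rests on an assumption that should be stated loudly: each location's time series is shuffled independently. Shuffling whole maps in time would leave X’X, and hence the EOFs of a single field, unchanged. The data, the number of realizations M and the 95% threshold are all made up for illustration.

```matlab
% Made-up data: one coherent pattern plus noise.
n = 100; p = 10; M = 100;
t = (1:n)';
X = sin(2*pi*t/20) * ones(1, p) + 0.5*randn(n, p);
X = detrend(X, 0);

s = svd(X);
obs = s(1)^2 / sum(s.^2);                   % variance explained by EOF 1

surr = zeros(M, 1);
for m = 1:M
    Xs = X;
    for j = 1:p
        Xs(:, j) = X(randperm(n), j);       % shuffle each series on its own
    end
    ss = svd(Xs);
    surr(m) = ss(1)^2 / sum(ss.^2);
end

surr = sort(surr);
threshold = surr(ceil(0.95 * M));           % 95th percentile of the surrogates
significant = obs > threshold;
```

A mode whose explained variance clearly exceeds what the scrambled data produce is unlikely to be an artifact of noise.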

Matlab’s own functions

PRINCOMP

- [COEFF,SCORE,latent] = princomp(X)
- [EOFs, EC, EigVal] = princomp(data);
- The EOFs are columns and so are the ECs.

PCACOV

- [COEFF,latent,explained] = pcacov(V), where V is a covariance matrix
- [EOFs, EigVal, expl_var] = pcacov(cov(data));
- I believe this uses svd

Assumptions we made

- Orthogonality
- Normally distributed data
- High signal-to-noise ratio
- Standing patterns only
- “The mean”

Problems that might occur:

- No physical interpretation possible
- Degenerate modes
- Domain dependency

A warning from von Storch and Navarra:

“I have learned the following rule to be useful when dealing with advanced methods. Such methods are often needed to find a signal in a vast noisy space, i.e. the needle in the haystack. But after having the needle in our hand, we should be able to identify the needle by simply looking at it. Whenever you are unable to do so there is a good chance that something is rotten in the analysis.”

References

R. W. Preisendorfer: Principal Component Analysis in Meteorology and Oceanography. Elsevier Science, 1988.

Hans v. Storch and Francis W. Zwiers: Statistical Analysis in Climate Research. Cambridge University Press, 2002.

North, G.R., T.L. Bell, R.F. Cahalan, and F.J. Moeng, Sampling errors in the estimation of empirical orthogonal functions, Mon. Wea. Rev., 110, 699-706, 1982.

Hannachi, A., I. T. Jolliffe and D. B. Stephenson: Empirical orthogonal functions and related techniques in atmospheric science: A review. International Journal of Climatology, 27, 1119–1152, 2007.