Post on 22-Feb-2016

Page 1: Oceanography 569 Oceanographic Data Analysis Laboratory

Oceanography 569
Oceanographic Data Analysis Laboratory

Kathie Kelly
Applied Physics Laboratory

515 Ben Hall IR Bldg
class web site: faculty.washington.edu/kellyapl/classes/ocean569_2014/

Page 2: Oceanography 569 Oceanographic Data Analysis Laboratory

Propagation of Errors

Example 1: linear function

Example 2: mean

x_t is the true value of x

if no bias in the error

sample mean averages errors in data
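
The equations for these two examples did not survive extraction. A standard form consistent with the text, assuming each measurement is the true value plus an error (the symbols ε, a, b and N are assumptions introduced here, not recovered slide content), might read:

```latex
% Measurement = true value + error
x = x_t + \epsilon

% Example 1: linear function y = a x + b, so the error is scaled by a
y = a x + b = (a x_t + b) + a\,\epsilon
\quad\Rightarrow\quad \epsilon_y = a\,\epsilon

% Example 2: sample mean over N measurements x_i = x_{t,i} + \epsilon_i
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i
        = \frac{1}{N}\sum_{i=1}^{N} x_{t,i} + \frac{1}{N}\sum_{i=1}^{N} \epsilon_i

% With no bias, \langle \epsilon_i \rangle = 0, so the second term averages toward zero
```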

Page 3: Oceanography 569 Oceanographic Data Analysis Laboratory

Mean Squared Error

How much is error reduced by averaging? Examine the mean squared error or error variance.

The <~> indicates an ensemble average over N realizations.

If the errors are random and similar with no bias

error variance is reduced by a factor of N if errors are uncorrelated
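
The factor-of-N reduction can be checked numerically. The following is a minimal Monte Carlo sketch, assuming Gaussian, unbiased, uncorrelated errors; the function name, the number of realizations and the seed are illustrative choices, not part of the course material.

```python
import random

def error_variance_of_mean(n_points, sigma=1.0, n_realizations=20000, seed=0):
    """Monte Carlo estimate of the error variance of an n_points sample mean."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_realizations):
        # Unbiased, uncorrelated errors with standard deviation sigma
        mean_error = sum(rng.gauss(0.0, sigma) for _ in range(n_points)) / n_points
        total += mean_error ** 2
    # Ensemble average of the squared error of the mean
    return total / n_realizations

single = error_variance_of_mean(1)     # error variance of one measurement
averaged = error_variance_of_mean(25)  # error variance of a 25-point mean
ratio = single / averaged              # should come out close to 25
```

If the errors were correlated, the ratio would be smaller than the number of points averaged.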

Page 4: Oceanography 569 Oceanographic Data Analysis Laboratory

Errors for Differences

Example 3: difference in time

If errors are random (uncorrelated), taking the difference increases the squared error by a factor of 2

If errors are correlated (bias), taking the difference reduces the error

Most errors are a combination of random error and bias
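
The slide's equation is missing from the transcript. A sketch of the underlying algebra, assuming the two errors ε₁ and ε₂ have the same variance σ² and a correlation coefficient c (both symbols are assumptions introduced here):

```latex
\langle (\epsilon_2 - \epsilon_1)^2 \rangle
  = \langle \epsilon_1^2 \rangle + \langle \epsilon_2^2 \rangle
    - 2\,\langle \epsilon_1 \epsilon_2 \rangle
  = 2\sigma^2 (1 - c)

% c = 0 (random, uncorrelated errors): squared error doubles, 2\sigma^2
% c = 1 (pure bias, identical errors): the error cancels in the difference
```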

Page 5: Oceanography 569 Oceanographic Data Analysis Laboratory

General Error Estimates

Given a quantity that is a function of several variables, F(x,y,z),

a variation (or error) in F is related to variations in the variables

or in terms of the error variance

assuming errors in x, y and z are uncorrelated
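
The two equations referred to above were lost in extraction. The standard first-order propagation formulas, which match the wording of the slide, are given here as a likely reconstruction rather than recovered content:

```latex
% Variation in F from variations in x, y, z (first-order Taylor expansion)
\delta F = \frac{\partial F}{\partial x}\,\delta x
         + \frac{\partial F}{\partial y}\,\delta y
         + \frac{\partial F}{\partial z}\,\delta z

% Error variance, with cross terms dropped because the errors are uncorrelated
\langle \delta F^2 \rangle
  = \left(\frac{\partial F}{\partial x}\right)^{2} \langle \delta x^2 \rangle
  + \left(\frac{\partial F}{\partial y}\right)^{2} \langle \delta y^2 \rangle
  + \left(\frac{\partial F}{\partial z}\right)^{2} \langle \delta z^2 \rangle
```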

Page 6: Oceanography 569 Oceanographic Data Analysis Laboratory

Error Estimate Example

where ρ and c_p are constant

Squared error

Factor out F²

Take ensemble average

and define relative error
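
The function used in this example is missing from the transcript. Assuming it is the heat flux term F = Q/(ρ c_p H) that appears later in Exercise 3 (an assumption, as are the symbols r_F, r_Q and r_H), the steps listed on the slide would work out as follows:

```latex
F = \frac{Q}{\rho\, c_p H}

% First-order variation, using dF/dQ = F/Q and dF/dH = -F/H
\delta F = \frac{\partial F}{\partial Q}\,\delta Q + \frac{\partial F}{\partial H}\,\delta H
         = F \left( \frac{\delta Q}{Q} - \frac{\delta H}{H} \right)

% Squared error, with F^2 factored out
\delta F^2 = F^2 \left( \frac{\delta Q}{Q} - \frac{\delta H}{H} \right)^{2}

% Ensemble average; the cross term vanishes for uncorrelated errors
\frac{\langle \delta F^2 \rangle}{F^2}
  = \frac{\langle \delta Q^2 \rangle}{Q^2} + \frac{\langle \delta H^2 \rangle}{H^2}

% In terms of relative errors
r_F^2 = r_Q^2 + r_H^2
```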

Page 7: Oceanography 569 Oceanographic Data Analysis Laboratory

Another Example

Wind stress where c_D is constant

Error is given as a fraction r of wind speed

so relative error is

What is the relative error of wind stress (magnitude)?

What is the stress error if the wind speed error is 10%?

Page 8: Oceanography 569 Oceanographic Data Analysis Laboratory

Another Example Solution

Wind stress where c_D is constant

General formula:

A 10% error in wind speed s gives a 20% error in stress
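
The general formula itself is not in the transcript. A minimal sketch, assuming the stress follows a bulk formula proportional to the square of wind speed s, and using the first-order rule that a power law F = a·xⁿ has relative error |n| times that of x (the function name is illustrative):

```python
def power_law_relative_error(rel_error_x, exponent):
    """First-order relative error of F = a * x**exponent, with a constant."""
    return abs(exponent) * rel_error_x

# Wind stress scales as s**2, so a 10% speed error gives a 20% stress error
stress_rel_error = power_law_relative_error(0.10, 2)

# Exact fractional change for comparison; slightly larger (roughly 0.21)
# because the first-order estimate drops the squared-error term
exact_change = (1.0 + 0.10) ** 2 - 1.0
```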

Page 9: Oceanography 569 Oceanographic Data Analysis Laboratory

Exercise 3: Error Estimates

Known errors for Q, T and H

Need error estimates for

• Q/(ρ c_p H)
• dT/dt
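
One way the two estimates could be set up is sketched below. It assumes relative errors in Q and H add in quadrature and that dT/dt is a difference of two temperatures separated by dt. The constants, function names and the optional error correlation argument are illustrative assumptions, not values from the exercise.

```python
import math

RHO = 1025.0  # seawater density, kg m^-3 (assumed typical value)
CP = 3990.0   # specific heat of seawater, J kg^-1 K^-1 (assumed typical value)

def flux_term_error(q, h, sigma_q, sigma_h):
    """Error (K s^-1) of Q/(rho*cp*H), relative errors added in quadrature."""
    term = q / (RHO * CP * h)
    rel_error = math.sqrt((sigma_q / q) ** 2 + (sigma_h / h) ** 2)
    return abs(term) * rel_error

def tendency_error(sigma_t, dt, error_correlation=0.0):
    """Error (K s^-1) of dT/dt formed from two temperatures dt seconds apart."""
    # Uncorrelated errors double the variance; correlated errors partly cancel
    return math.sqrt(2.0 * (1.0 - error_correlation)) * sigma_t / dt

# Hypothetical inputs: Q = 100 +/- 20 W m^-2, H = 50 +/- 10 m,
# temperature error 0.1 K, differences taken over 30 days
flux_err = flux_term_error(100.0, 50.0, 20.0, 10.0)
dTdt_err = tendency_error(0.1, 30 * 86400.0)
```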

Page 10: Oceanography 569 Oceanographic Data Analysis Laboratory

Exercise 3: Are other terms significant?

1. compute LHS
2. estimate total errors for LHS

Is the LHS difference larger than the estimated errors?

Notes:
• convert relative error variance of x to error variance using var(x)
• check that all units match

Page 11: Oceanography 569 Oceanographic Data Analysis Laboratory

Hypothesis Testing

To determine whether a relationship is significant we formulate a null hypothesis, that the proposed relationship is NOT true

We test to determine if the null hypothesis can be rejected within a given probability, say α = 0.05 (5%). (The level of confidence is 95%.)

A significance test consists of finding the probability of a given result (a p-value) and comparing that with the alpha test value. If the p-value (probability) is less than alpha, then the null hypothesis is rejected.

Page 12: Oceanography 569 Oceanographic Data Analysis Laboratory

Test Example

Is the mean <X> of a subsample of X over N points significantly different from the known mean value μ?

Depends on the std dev (error) of the mean estimate

A measure of how large this is (how likely it is to be significant) is found from the Z-transform

Probability of Z score (or lower) from a normal distribution N(0,1) is

p = normcdf(Z,0,1) [Matlab function]

Or let Matlab do the work:

p = normcdf(<X>,μ,σ_m)
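
For readers without Matlab, the same test can be sketched in Python. This assumes the usual forms Z = (<X> - μ)/σ_m and σ_m = σ/√N for N independent points, which are not shown in the transcript. The `normcdf` helper below is a stand-in built on `math.erf`, not the Matlab function, and the sample numbers are hypothetical.

```python
import math

def normcdf(x, mu=0.0, sigma=1.0):
    """Normal cumulative distribution, same argument order as Matlab's normcdf."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def mean_z_test(sample_mean, mu, sigma, n):
    """Return (Z, p), where p is the probability of this Z score or lower."""
    sigma_m = sigma / math.sqrt(n)  # std dev (error) of the mean estimate
    z = (sample_mean - mu) / sigma_m
    return z, normcdf(z)

# Hypothetical numbers: subsample mean 10.5 over 64 points, mu = 10, sigma = 2
z, p = mean_z_test(10.5, 10.0, 2.0, 64)

# normcdf returns the lower-tail probability, so a two-sided p-value is
two_sided_p = 2.0 * min(p, 1.0 - p)
```

Here Z is 2.0 and the two-sided p-value is just under 0.05, so the null hypothesis would be rejected at α = 0.05.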

Page 13: Oceanography 569 Oceanographic Data Analysis Laboratory

Analysis of Variance (ANOVA)

To test how well a dynamical or statistical model fits observations d(t), we estimate the fraction of variance described by the model z

Two common types of models are

(1) known function z = f(x,y)

(2) linear estimator (coefficients by regression) z = a x + b y + c

The ratio of the squared residual (or error) r² = (d - z)² to the variance of the observations σ_d² is the fraction of variance not explained by the model.
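
The ratio described above can be computed directly. A minimal sketch, assuming the squared residual is averaged over the record and the variance is normalized by the number of points (the function name and sample values are illustrative):

```python
def fraction_unexplained(d, z):
    """Mean squared residual (d - z)^2 divided by the variance of d."""
    n = len(d)
    d_mean = sum(d) / n
    var_d = sum((di - d_mean) ** 2 for di in d) / n
    mean_sq_residual = sum((di - zi) ** 2 for di, zi in zip(d, z)) / n
    return mean_sq_residual / var_d

# Hypothetical observations and model values
d = [1.0, 2.0, 3.0, 4.0, 5.0]
z = [1.5, 2.0, 3.0, 4.0, 4.5]
unexplained = fraction_unexplained(d, z)
explained = 1.0 - unexplained  # fraction of variance described by the model
```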

Page 14: Oceanography 569 Oceanographic Data Analysis Laboratory

Time Series Analysis

The analysis of time series differs from that of independent objects (tossing dice, medical patient studies, etc) in that the measurements generally have serial correlation:

So a time series with N points does not have N independent measurements.

The effective number of independent measurements (degrees of freedom N*) depends on the degree of correlation of successive measurements, the autocorrelation of the time series:

Page 15: Oceanography 569 Oceanographic Data Analysis Laboratory

Covariance and Correlation

For two time series x(t) and y(t) covariance is defined as

where <~> is expected value and Δt is a time lag

Correlation is the covariance normalized by the standard deviations (values between -1 and 1)

Notes:
1) this terminology differs from that in Matlab, but is common
2) when applied to a single variable x, these are the autocovariance and autocorrelation
3) these are time-lagged values, but we often use only the zero-lag value
4) we generally remove the mean values (as shown)
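
The defining equations are missing from the transcript. A minimal sketch of a lagged covariance and correlation, assuming the convention C(Δt) = <x'(t) y'(t+Δt)> with means removed, equal-length series, and an average over the overlapping pairs only (a normalization choice made here, not taken from the slide):

```python
import math

def lagged_covariance(x, y, lag):
    """Covariance of x(t) with y(t + lag); lag in samples, means removed."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    if lag >= 0:
        pairs = zip(x[:n - lag], y[lag:])
    else:
        pairs = zip(x[-lag:], y[:n + lag])
    products = [(xi - x_mean) * (yi - y_mean) for xi, yi in pairs]
    return sum(products) / len(products)

def lagged_correlation(x, y, lag):
    """Lagged covariance normalized by the two zero-lag standard deviations."""
    std_x = math.sqrt(lagged_covariance(x, x, 0))
    std_y = math.sqrt(lagged_covariance(y, y, 0))
    return lagged_covariance(x, y, lag) / (std_x * std_y)

# Synthetic check: y is x delayed by 5 samples, so the peak is near lag = 5
x = [math.sin(2.0 * math.pi * t / 20.0) for t in range(200)]
y = [math.sin(2.0 * math.pi * (t - 5) / 20.0) for t in range(200)]
r_at_lag_5 = lagged_correlation(x, y, 5)
```

Passing the same series twice gives the autocovariance and autocorrelation.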

Page 16: Oceanography 569 Oceanographic Data Analysis Laboratory

Correlations

Some common types of correlations:

1) autocorrelation (to get a time scale for the data)
2) correlations between two variables
3) lagged correlations to determine if one variable leads or lags another
4) vector correlations (as opposed to scalar correlations)

To evaluate a correlation, we need an objective measure of significance

Page 17: Oceanography 569 Oceanographic Data Analysis Laboratory

Autocorrelation & Periodic Signals

Autocorrelation of variable with periodic signal mostly shows the periodicity

Remove harmonics before computing (auto) correlations for better interpretation & statistics

Page 18: Oceanography 569 Oceanographic Data Analysis Laboratory

Characteristic Time Scale

Is there a characteristic time scale for each variable?

First zero crossing?

Or something more robust?

Page 19: Oceanography 569 Oceanographic Data Analysis Laboratory

Integral Time Scale

More robust method: takes into account shape of function

integral time scale: integrate correlation (to first zero crossing) to get equivalent time (tau) for perfect correlation

integral time scales:
• 1 month for Qnet
• 4 months for SSH

integral time scales are shorter than the zero crossing
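
The integration can be sketched as follows. This assumes a one-sided integral from zero lag, the trapezoidal rule, and linear interpolation to locate the zero crossing; these are choices made for illustration, since the slide does not state the exact convention.

```python
def integral_time_scale(autocorr, dt=1.0):
    """Integrate autocorr (lags 0, dt, 2*dt, ...) up to its first zero crossing."""
    total = 0.0
    for k in range(1, len(autocorr)):
        prev, curr = autocorr[k - 1], autocorr[k]
        if curr <= 0.0:
            # Interpolate linearly to the crossing and add the final triangle
            frac = prev / (prev - curr)
            return total + 0.5 * prev * frac * dt
        total += 0.5 * (prev + curr) * dt
    return total  # no zero crossing within the supplied lags

# Hypothetical autocorrelation falling linearly to zero at a lag of 2 months
tau = integral_time_scale([1.0, 0.5, 0.0, -0.2], dt=1.0)
```

In this example the first zero crossing is at 2 months while the integral time scale is 1 month, which illustrates why the integral scale is the shorter of the two.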

Page 20: Oceanography 569 Oceanographic Data Analysis Laboratory

Caution: Covariance from Observations

Autocovariance (or autocorrelation) from a single time series is an overestimate of the actual function

because the error is correlated with itself.

It should be estimated from two different measurements of the same quantity at the same location.

If the errors have shorter time scales than the variable, then the error can be estimated from the autocovariance at non-zero lags

Page 21: Oceanography 569 Oceanographic Data Analysis Laboratory

Autocorrelation: estimate correction for zero lag

extrapolate to zero lag

difference in correlation from unresolved signal variance and actual errors (upper bound)

[Figure panels: SSH, Qnet]
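
The extrapolation can be sketched in a few lines. The version below assumes a straight-line extrapolation from the first two non-zero lags, which is a simplification chosen here; the transcript does not say how the curve on the slide was fitted.

```python
def noise_variance_upper_bound(autocov):
    """Zero-lag autocovariance minus a linear extrapolation from lags 1 and 2."""
    extrapolated_zero_lag = 2.0 * autocov[1] - autocov[2]
    # The excess at zero lag holds both the error variance and any signal
    # variance at unresolved time scales, hence an upper bound on the error
    return autocov[0] - extrapolated_zero_lag

# Hypothetical autocovariance values at lags 0, 1 and 2
error_bound = noise_variance_upper_bound([1.3, 0.9, 0.8])
```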

Page 22: Oceanography 569 Oceanographic Data Analysis Laboratory

Significance of a Correlation (degrees of freedom)

The integral time scale τ is used to define the number of degrees of freedom N* of a time series

N* = N/τ

where N is the length of the series

which is needed to determine the statistical significance of the correlation

Z-test for significance of the correlation r based on a random parent distribution ρ of possible correlations

Create a new variable

The mean and std dev of w are
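
The equations on this slide were lost in extraction. The standard Fisher transformation matches the description and the Bendat & Piersol treatment cited on the next slide, so a likely reconstruction (an assumption, with N* used in place of the number of independent samples) is:

```latex
% New variable
w = \frac{1}{2} \ln\!\left( \frac{1 + r}{1 - r} \right)

% Mean and standard deviation of w
\mu_w = \frac{1}{2} \ln\!\left( \frac{1 + \rho}{1 - \rho} \right),
\qquad
\sigma_w = \frac{1}{\sqrt{N^{*} - 3}}
```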

Page 23: Oceanography 569 Oceanographic Data Analysis Laboratory

Derivation of Significance Test (cont’d)

For null hypothesis ρ = 0 so μ = 0. Normalize using Z transform

If Z is within region containing fraction (1-α) of distribution

the correlation is NOT significant.

Alternatively, one can solve for the critical value of correlation rc

See Bendat & Piersol (2000), pp. 101-111, for the derivation
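
A minimal sketch of the test, assuming the Fisher transformation w = ½ ln[(1+r)/(1-r)] with standard deviation 1/√(N*-3), and a two-sided test for which the critical Z value at α = 0.05 is 1.96. The function names and the sample numbers are illustrative.

```python
import math

def effective_dof(n_points, tau):
    """N* = N / tau, with tau in units of the sampling interval."""
    return n_points / tau

def correlation_z(r, n_star):
    """Z score of correlation r under the null hypothesis rho = 0."""
    w = 0.5 * math.log((1.0 + r) / (1.0 - r))  # Fisher transformation
    sigma_w = 1.0 / math.sqrt(n_star - 3.0)
    return w / sigma_w

def critical_correlation(n_star, z_crit=1.96):
    """Smallest |r| that is significant for the given critical Z value."""
    return math.tanh(z_crit / math.sqrt(n_star - 3.0))

# Hypothetical record: 120 monthly values with a 4-month integral time scale
n_star = effective_dof(120, 4.0)    # 30 degrees of freedom
r_c = critical_correlation(n_star)  # about 0.36
```

A correlation with magnitude below r_c would not be considered significant at the 95% level for this record.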

Page 24: Oceanography 569 Oceanographic Data Analysis Laboratory

Exercise 4: Lagged correlations

[Figures: SSH longitude-time plot; SSH at two locations, with the lag indicated]

Can you estimate the speed of the Rossby wave from the SSH?
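
One way to turn a lag into a speed is to divide the zonal distance between the two locations by the lag of maximum correlation. The sketch below assumes both locations are at the same latitude on a spherical Earth; the function name and the numbers are hypothetical, not taken from the exercise data.

```python
import math

EARTH_RADIUS_M = 6.371e6  # mean Earth radius in metres

def zonal_phase_speed(lon_west, lon_east, lat, lag_days):
    """Speed (m/s) of a signal that takes lag_days to travel between two longitudes."""
    # Distance along a latitude circle between the two longitudes
    distance = (EARTH_RADIUS_M * math.cos(math.radians(lat))
                * math.radians(lon_east - lon_west))
    return distance / (lag_days * 86400.0)

# Hypothetical case: sites 10 degrees apart at 32N, western site lags by 200 days
speed = zonal_phase_speed(150.0, 160.0, 32.0, 200.0)  # a few cm/s, westward
```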

Page 25: Oceanography 569 Oceanographic Data Analysis Laboratory

Exercise 4: Vectors

Mean wind vectors
• KEO mooring
• ECMWF
• QuikSCAT
• NCEP2

Note: vector correlations do not include means

Page 26: Oceanography 569 Oceanographic Data Analysis Laboratory

Vector Correlations

complex correlation gives persistent direction errors & magnitude errors
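
A minimal sketch of a complex correlation, assuming each vector (u, v) is written as the complex number u + iv and the means are removed first. The magnitude measures how well the two series agree, and the phase gives the average counterclockwise rotation of the second series relative to the first. The function name and this sign convention are assumptions, not taken from the slide.

```python
import cmath
import math

def complex_correlation(u1, v1, u2, v2):
    """Return (magnitude, angle in degrees) of the complex vector correlation."""
    w1 = [complex(u, v) for u, v in zip(u1, v1)]
    w2 = [complex(u, v) for u, v in zip(u2, v2)]
    mean1 = sum(w1) / len(w1)
    mean2 = sum(w2) / len(w2)
    a = [w - mean1 for w in w1]  # means removed, so they do not contribute
    b = [w - mean2 for w in w2]
    cross = sum(x.conjugate() * y for x, y in zip(a, b))
    norm = math.sqrt(sum(abs(x) ** 2 for x in a) * sum(abs(y) ** 2 for y in b))
    rho = cross / norm
    return abs(rho), math.degrees(cmath.phase(rho))

# Hypothetical winds: the second series is the first rotated 90 degrees
# counterclockwise, so the magnitude is 1 and the angle is 90 degrees
u1, v1 = [1.0, 2.0, 0.0, -1.0, 3.0], [0.0, 1.0, 2.0, -2.0, 1.0]
u2, v2 = [-v for v in v1], list(u1)
magnitude, angle = complex_correlation(u1, v1, u2, v2)
```

Because the correlation is normalized, a persistent difference in magnitude does not change it; that would have to be estimated separately, for example from the ratio of the two standard deviations.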