Lecture 2
Probability and Measurement Error, Part 1
Syllabus
Lecture 01 Describing Inverse Problems
Lecture 02 Probability and Measurement Error, Part 1
Lecture 03 Probability and Measurement Error, Part 2
Lecture 04 The L2 Norm and Simple Least Squares
Lecture 05 A Priori Information and Weighted Least Squares
Lecture 06 Resolution and Generalized Inverses
Lecture 07 Backus-Gilbert Inverse and the Trade Off of Resolution and Variance
Lecture 08 The Principle of Maximum Likelihood
Lecture 09 Inexact Theories
Lecture 10 Nonuniqueness and Localized Averages
Lecture 11 Vector Spaces and Singular Value Decomposition
Lecture 12 Equality and Inequality Constraints
Lecture 13 L1, L∞ Norm Problems and Linear Programming
Lecture 14 Nonlinear Problems: Grid and Monte Carlo Searches
Lecture 15 Nonlinear Problems: Newton’s Method
Lecture 16 Nonlinear Problems: Simulated Annealing and Bootstrap Confidence Intervals
Lecture 17 Factor Analysis
Lecture 18 Varimax Factors, Empirical Orthogonal Functions
Lecture 19 Backus-Gilbert Theory for Continuous Problems; Radon’s Problem
Lecture 20 Linear Operators and Their Adjoints
Lecture 21 Fréchet Derivatives
Lecture 22 Exemplary Inverse Problems, incl. Filter Design
Lecture 23 Exemplary Inverse Problems, incl. Earthquake Location
Lecture 24 Exemplary Inverse Problems, incl. Vibrational Problems
Purpose of the Lecture
review random variables and their probability density functions
introduce correlation and the multivariate Gaussian distribution
relate error propagation to functions of random variables
Part 1
random variables and their probability density functions
d = ?
random variable d: no fixed value until it is realized
d = ? (indeterminate) → d = 1.04 → (indeterminate again) → d = 0.98
random variables have systematics: a tendency to take on some values more often than others
[Figure: N realizations of data d_i plotted against index i]
[Figure: panels (A), (B), (C): a histogram of counts of the realizations of d and the corresponding probability density functions p(d)]
[Figure: p.d.f. p(d) with a strip of width Δd highlighted at value d]
The probability that the variable lies between d and d+Δd is p(d)Δd.
In general, probability is an integral. The probability that d lies between d1 and d2 is

P(d_1, d_2) = \int_{d_1}^{d_2} p(d) \, dd

The probability that d has some value is 100%, or unity, so the probability that d lies between its minimum and maximum bounds, dmin and dmax, is one:

\int_{d_{min}}^{d_{max}} p(d) \, dd = 1
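For instance (a worked example added here, not from the slides): for a uniform p.d.f. with p(d) = 1 on 0 < d < 1,

P(0.25, 0.75) = \int_{0.25}^{0.75} 1 \, dd = 0.5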
How do these two p.d.f.’s differ?
[Figure: two different p.d.f.’s p(d) on the interval 0 to 5]
Summarizing a probability density function
typical value: the “center” of the p.d.f.
amount of scatter around the typical value: the “width” of the p.d.f.
Several possibilities for a typical value
(1) the point beneath the peak, the “maximum likelihood point” or “mode”, dML
[Figure: p.d.f. p(d) with dML marked beneath the peak]
(2) the point dividing the area in half, the “median”, dmedian
[Figure: p.d.f. p(d) with dmedian splitting the area into 50% and 50%]
(3) the balancing point, the “mean” or “expectation”, <d>
[Figure: p.d.f. p(d) with the mean <d> marked at the balancing point]
These can all be different:
dML ≠ dmedian ≠ <d>
formula for the “mean” or “expected value”

step 1: usual formula for the mean of the data:

\langle d \rangle \approx \frac{1}{N} \sum_{i=1}^{N} d_i

step 2: replace the data with its histogram, with Ns counts in bins centered at ds:

\langle d \rangle \approx \sum_{s} d_s \frac{N_s}{N}

step 3: replace the histogram with the probability distribution, P(ds) ≈ Ns/N:

\langle d \rangle \approx \sum_{s} d_s P(d_s)

If the data are continuous, use the analogous formula containing an integral:

\langle d \rangle = \int_{d_{min}}^{d_{max}} d \, p(d) \, dd
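A minimal MatLab sketch of the three steps (my own illustration, not from the lecture; the realization vector dr, the bin spacing Dd, and the bin range are assumptions):

% dr: column vector of N realizations of d (assumed given)
N = length(dr);

% step 1: usual formula for the mean
dbar1 = sum(dr)/N;

% step 2: replace the data with its histogram (bin centers ds, counts Ns)
Dd = 0.1;                      % assumed bin spacing
ds = (0:Dd:10)';               % assumed bin centers
Ns = hist(dr, ds)';            % counts in each bin
dbar2 = sum(ds.*Ns)/N;

% step 3: replace the histogram with the probability distribution P(ds) = Ns/N
P = Ns/N;
dbar3 = sum(ds.*P);            % identical to dbar2; approximates dbar1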
quantifying width

This function grows away from the typical value:

q(d) = (d - \langle d \rangle)^2

so the function q(d)p(d) is small if most of the area is near <d> (a narrow p(d)) and large if most of the area is far from <d> (a wide p(d)). So quantify width as the area under q(d)p(d).
[Figure: (A) q(d); (B) a narrow p(d); (C) their product q(d)p(d), which has small area; (D) q(d); (E) a wide p(d); (F) their product q(d)p(d), which has large area]
variance

\sigma^2 = \int_{d_{min}}^{d_{max}} (d - \langle d \rangle)^2 p(d) \, dd

The width is actually the square root of the variance, that is, σ.
estimating mean and variance from data

usual formula for the “sample mean”:

\bar{d} = \frac{1}{N} \sum_{i=1}^{N} d_i

usual formula for the square of the “sample standard deviation”:

\sigma_d^2 = \frac{1}{N} \sum_{i=1}^{N} (d_i - \bar{d})^2
MatLab scripts for mean and variance

from a tabulated p.d.f. p, sampled at spacing Dd:

dbar = Dd*sum(d.*p);       % mean: approximates the integral of d*p(d)
q = (d-dbar).^2;           % squared deviation from the mean
sigma2 = Dd*sum(q.*p);     % variance: approximates the integral of q(d)*p(d)
sigma = sqrt(sigma2);

from realizations of the data, dr:

dbar = mean(dr);
sigma = std(dr);
sigma2 = sigma^2;
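As a quick cross-check of the two scripts (my own synthetic example; the true mean and standard deviation here are 5 and 1):

dr = 5 + 1*randn(1000,1);          % synthetic Gaussian realizations

dbar = mean(dr); sigma = std(dr);  % estimates from the realizations

% tabulate an empirical p.d.f. p from the same data, then re-estimate
Dd = 0.1; d = (0:Dd:10)';
p = hist(dr, d)'/(1000*Dd);        % normalized so Dd*sum(p) = 1
dbar2 = Dd*sum(d.*p);              % close to dbar
sigma2b = Dd*sum((d-dbar2).^2.*p); % close to sigma^2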
two important probability density functions:
uniform
Gaussian (or Normal)
uniform p.d.f.

p(d) = \frac{1}{d_{max} - d_{min}} \quad \text{for} \quad d_{min} \le d \le d_{max}

a box-shaped function: the probability is the same everywhere in the range of possible values
[Figure: uniform p.d.f., constant between dmin and dmax]
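As a worked example of the variance integral above (a standard result, not on the slide), the uniform p.d.f. has mean (dmin+dmax)/2 and

\sigma^2 = \int_{d_{min}}^{d_{max}} \left( d - \frac{d_{min}+d_{max}}{2} \right)^2 \frac{dd}{d_{max}-d_{min}} = \frac{(d_{max}-d_{min})^2}{12}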
Gaussian (or “Normal”) p.d.f.

p(d) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[ -\frac{(d - \langle d \rangle)^2}{2\sigma^2} \right]

a bell-shaped function with large probability near the mean <d>; its variance is σ².
[Figure: Gaussian p.d.f. with the width 2σ marked]
The probability that d lies between <d> ± nσ is about 68% for n = 1, 95% for n = 2, and 99.7% for n = 3.
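A minimal MatLab sketch of this formula (my own script; the mean of 50 and σ of 10 are assumptions chosen to roughly match the figure):

dbar = 50; sigma = 10;             % assumed mean and standard deviation
Dd = 0.1; d = (0:Dd:100)';
p = exp(-(d-dbar).^2/(2*sigma^2)) / (sqrt(2*pi)*sigma);
area = Dd*sum(p);                  % approximately 1: total probability is unity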
Part 2
correlated errors
uncorrelated random variables
no pattern between the values of one variable and the values of another:
when d1 is higher than its mean, d2 is higher or lower than its mean with equal probability
[Figure: joint probability density function p(d1,d2), uncorrelated case, with the means <d1> and <d2> marked]
[Figure: uncorrelated joint p.d.f.: no tendency for d2 to be either high or low when d1 is high]
in the uncorrelated case, the joint p.d.f. is just the product of the individual p.d.f.’s:

p(d_1, d_2) = p(d_1) \, p(d_2)
[Figure: uncorrelated joint p.d.f. with the width 2σ1 marked, aligned with the d1 axis]
[Figure: correlated joint p.d.f.: a tendency for d2 to be high when d1 is high]
[Figure: correlated joint p.d.f. with widths 2σ1 and 2σ2 and inclination angle θ]
[Figure: panels (A), (B), (C): joint p.d.f.’s illustrating different degrees of correlation between d1 and d2]
formula for covariance:

\mathrm{cov}(d_1, d_2) = \int\!\!\int (d_1 - \langle d_1 \rangle)(d_2 - \langle d_2 \rangle) \, p(d_1, d_2) \, dd_1 \, dd_2

+ positive correlation: high d1 goes with high d2
− negative correlation: high d1 goes with low d2
for a joint p.d.f., the mean is a vector and the covariance is a symmetric matrix
diagonal elements: variances; off-diagonal elements: covariances
estimating covariance from a table D of data
Dki: realization k of data-type i
in MatLab: C = cov(D)
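A small sketch of this usage (the data below are my own synthetic example, built so that d2 correlates positively with d1):

N = 1000;
d1 = randn(N,1);
d2 = 0.8*d1 + 0.6*randn(N,1);      % correlated with d1 by construction
D = [d1, d2];                      % row k: realization k; column i: data-type i
C = cov(D);                        % 2x2 symmetric covariance matrix
% diagonal: variances of d1 and d2; off-diagonal: their covariance (positive here)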
univariate p.d.f. formed from a joint p.d.f.
p(d) → p(di): the behavior of di irrespective of the other d’s
integrate over everything but di:

p(d_i) = \int \cdots \int p(\mathbf{d}) \, dd_1 \cdots dd_{i-1} \, dd_{i+1} \cdots dd_N
[Figure: joint p.d.f. p(d1,d2); integrating over d2 gives p(d1), and integrating over d1 gives p(d2)]
multivariate Gaussian (or Normal) p.d.f.

p(\mathbf{d}) = \frac{1}{(2\pi)^{N/2} |\mathbf{C}|^{1/2}} \exp\left[ -\tfrac{1}{2} (\mathbf{d} - \langle \mathbf{d} \rangle)^{T} \mathbf{C}^{-1} (\mathbf{d} - \langle \mathbf{d} \rangle) \right]

with mean <d> and covariance matrix C
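One common way to draw realizations from a multivariate Gaussian (a sketch, not from the lecture; the mean vector and covariance matrix below are assumptions) is to use a Cholesky factor of the covariance matrix:

dmean = [5; 5];                       % assumed mean vector
C = [1.0, 0.8; 0.8, 1.0];             % assumed covariance matrix
R = chol(C);                          % upper-triangular R with C = R'*R
N = 1000;
D = ones(N,1)*dmean' + randn(N,2)*R;  % each row is one realization of d
Cest = cov(D);                        % sample covariance approximates C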
Part 3
functions of random variables
functions of random variables
data with measurement error → data analysis process → inferences with uncertainty
given p(d), with m = f(d), what is p(m)?
univariate p.d.f.: use the chain rule
rule for transforming a univariate p.d.f.:

p(m) = p(d(m)) \left| \frac{dd}{dm} \right|
multivariate p.d.f.: use the Jacobian determinant, the determinant of the matrix with elements ∂di/∂mj
rule for transforming a multivariate p.d.f.:

p(\mathbf{m}) = p(\mathbf{d}(\mathbf{m})) \left| \det\left( \frac{\partial \mathbf{d}}{\partial \mathbf{m}} \right) \right|
simple example
data with measurement error → data analysis process → inferences with uncertainty
one datum d with a uniform p.d.f. on 0 < d < 1; one model parameter, m = 2d
m = 2d and d = m/2; d = 0 → m = 0 and d = 1 → m = 2; dd/dm = ½
p(d) = 1, so p(m) = p(d)|dd/dm| = ½ on 0 < m < 2
[Figure: p(d) = 1 on 0 < d < 1; p(m) = ½ on 0 < m < 2]
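A quick numerical check of this result (my own sketch):

N = 10000;
d = rand(N,1);                   % uniform realizations, p(d) = 1 on (0,1)
m = 2*d;                         % transformed variable on (0,2)
Dm = 0.1; mc = (Dm/2:Dm:2)';     % histogram bin centers
pm = hist(m, mc)'/(N*Dm);        % empirical p(m): approximately 0.5 everywhere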
another simple example
data with measurement error → data analysis process → inferences with uncertainty
one datum d with a uniform p.d.f. on 0 < d < 1; one model parameter, m = d²
m = d² and d = m^{1/2}; d = 0 → m = 0 and d = 1 → m = 1; dd/dm = ½m^{−1/2}
p(d) = 1, so p(m) = ½m^{−1/2} on 0 < m < 1
[Figure: (A) p(d) = 1 on 0 < d < 1; (B) p(m) = ½m^{−1/2}, which grows without bound as m → 0]
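A quick check via the mean (my own sketch): the transformed p.d.f. implies <m> = ∫₀¹ m · ½m^{−1/2} dm = 1/3, and the sample mean agrees:

N = 100000;
d = rand(N,1);                   % uniform realizations on (0,1)
m = d.^2;
mbar = mean(m);                  % approximately 1/3, as the p.d.f. predicts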
multivariate example
data with measurement error → data analysis process → inferences with uncertainty
2 data, d1 and d2, each with a uniform p.d.f. on 0 < d < 1
2 model parameters: m1 = d1 + d2 and m2 = d1 − d2
p(d) = 1 on the unit square 0 < d1 < 1, 0 < d2 < 1
[Figure (A): the unit square in the (d1,d2) plane with the axes m1 = d1 + d2 and m2 = d1 − d2 superimposed]
m = Md with M = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
d = M^{-1}m, so ∂d/∂m = M^{-1} and J = |det(M^{-1})| = ½, since |det(M)| = 2
p(m) = p(d) J = ½
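A numerical sketch of this transformation (my own check of the Jacobian factor):

N = 100000;
d = rand(N,2);                   % p(d) = 1 on the unit square
M = [1, 1; 1, -1];               % m = M*d; note abs(det(M)) = 2
m = d*M';                        % rows: m1 = d1+d2, m2 = d1-d2
% the points fill a square of area 2 rotated 45 degrees,
% so the uniform density there is 1/abs(det(M)) = 1/2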
[Figure: (A) p(d1,d2) = 1 on the unit square in the (d1,d2) plane; (B) p(m1,m2) = ½ on a square rotated 45° in the (m1,m2) plane; (C) the univariate p.d.f. p(d1); (D) the univariate p.d.f. p(m1)]
Note that the shape in m is different from the shape in d.
The shape in m is a square with sides of length √2 and amplitude ½, so the total probability is ½ × √2 × √2 = 1, as required.
moral
p(m) can behave quite differently from p(d)