statistical inferences
Post on 24-Feb-2016
Uncertainty Analysis for Engineers 1
Statistical Inferences
Jake Blanchard
Spring 2010

Introduction
Statistical inference = process of drawing conclusions from random data
Conclusions of this process are “propositions,” for example
◦ Estimates
◦ Confidence intervals
◦ Credible intervals
◦ Rejecting a hypothesis
◦ Clustering data points
Part of this is the estimation of model parameters

Parameter Estimation
Point Estimation
◦ Calculate single number from a set of observational data
Interval Estimation
◦ Determine interval within which true parameter lies (along with confidence level)

Properties
Bias = difference between expected value of estimator and parameter (the two are not necessarily equal)
Consistency = estimator approaches parameter as n approaches infinity
Efficiency = smaller variance of estimator implies higher efficiency
Sufficiency = estimator utilizes all pertinent information in a sample

Point Estimation
Start with data sample of size N
Example: estimate fraction of voters who will vote for particular candidate (estimate is based on random sample of voters)
Other examples: quality control, clinical trials, software engineering, orbit prediction
Assume successive samples are statistically independent

Estimators
Maximum likelihood
Method of moments
Minimum mean squared error
Bayes estimators
Cramer-Rao bound
Maximum a posteriori
Minimum variance unbiased estimator
Best linear unbiased estimator
etc.

Maximum Likelihood
Suppose we have a random variable x with pdf $f(x;\theta)$
Take n samples of x
What is value of $\theta$ that will maximize the likelihood of obtaining these n observations?
Let L = likelihood of observing this set of values for x
Then maximize L with respect to $\theta$
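
Maximum Likelihood
$$L(x_1, x_2, \ldots, x_n; \theta) = f(x_1;\theta)\, f(x_2;\theta) \cdots f(x_n;\theta)$$
$$\frac{\partial L(x_1, \ldots, x_n; \theta)}{\partial \theta} = 0$$
$$\frac{\partial \log L(x_1, \ldots, x_n; \theta)}{\partial \theta} = 0$$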
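
Example
Times between successive arrivals of vehicles at an intersection are 1.2, 3, 6.3, 10.1, 5.2, 2.4, and 7.2 seconds
Assume exponential distribution
Find MLE for $\lambda$ (the mean time between arrivals)

Solution
$$f(t) = \frac{1}{\lambda} e^{-t/\lambda}$$
$$L = \prod_{i=1}^{7} \frac{1}{\lambda} \exp\left(-\frac{t_i}{\lambda}\right) = \lambda^{-7} \exp\left(-\frac{1}{\lambda}\sum_{i=1}^{7} t_i\right)$$
$$\log(L) = -7\log(\lambda) - \frac{1}{\lambda}\sum_{i=1}^{7} t_i = -7\log(\lambda) - \frac{35.4}{\lambda}$$
$$\frac{\partial \log(L)}{\partial \lambda} = -\frac{7}{\lambda} + \frac{35.4}{\lambda^2} = 0$$
$$\lambda = \frac{35.4}{7} = 5.06 \text{ seconds}$$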
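The closed-form result can be checked numerically. The sketch below is not part of the original slides: it minimizes the negative log-likelihood with `fminbnd`, and the search bounds (0.01 to 100 seconds) and variable names are assumptions chosen only for illustration.

```matlab
% Arrival-time data from the example (seconds)
t = [1.2 3 6.3 10.1 5.2 2.4 7.2];
n = numel(t);

% Negative log-likelihood for f(t) = (1/lambda)*exp(-t/lambda)
negLogL = @(lambda) n*log(lambda) + sum(t)/lambda;

% Closed-form MLE: the sample mean
lambdaClosed = mean(t);

% Numerical check; the bounds 0.01 and 100 are assumed, not from the slides
lambdaNumeric = fminbnd(negLogL, 0.01, 100);

fprintf('closed form: %.3f s, numerical: %.3f s\n', lambdaClosed, lambdaNumeric);
```

Both values should agree to within the optimizer tolerance, which is a quick way to catch arithmetic slips in a hand derivation.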
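
2-Parameter Example
Measure cycles to failure of saturated sand (25, 20, 28, 33, 26 cycles)
Assume lognormal distribution with parameters $\lambda$ and $\zeta$ (mean and standard deviation of ln x)

Solution
$$f(x) = \frac{1}{\sqrt{2\pi}\,\zeta x} \exp\left[-\frac{1}{2}\left(\frac{\ln(x)-\lambda}{\zeta}\right)^2\right]$$
$$L = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\zeta x_i} \exp\left[-\frac{1}{2}\left(\frac{\ln(x_i)-\lambda}{\zeta}\right)^2\right]$$
$$\ln(L) = -n\ln(\sqrt{2\pi}) - n\ln(\zeta) - \sum_{i=1}^{n}\ln(x_i) - \frac{1}{2\zeta^2}\sum_{i=1}^{n}\left(\ln(x_i)-\lambda\right)^2$$
$$\frac{\partial \ln(L)}{\partial \lambda} = \frac{1}{\zeta^2}\sum_{i=1}^{n}\left(\ln(x_i)-\lambda\right) = 0$$
$$\frac{\partial \ln(L)}{\partial \zeta} = -\frac{n}{\zeta} + \frac{1}{\zeta^3}\sum_{i=1}^{n}\left(\ln(x_i)-\lambda\right)^2 = 0$$
$$\lambda = \frac{1}{n}\sum_{i=1}^{n}\ln(x_i) = 3.26$$
$$\zeta^2 = \frac{1}{n}\sum_{i=1}^{n}\left(\ln(x_i)-\lambda\right)^2 = 0.027$$
$$\zeta = 0.163$$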
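The two closed-form estimates are easy to evaluate directly. The following is a minimal sketch, not taken from the slides; the variable names `lambdaHat` and `zetaHat` are assumptions, and the symbols follow the usual convention that the lognormal parameters are the mean and standard deviation of ln x.

```matlab
% Cycles to failure from the example
x = [25 20 28 33 26];
lnx = log(x);

% MLE of the mean of ln(x)
lambdaHat = mean(lnx);

% MLE of the standard deviation of ln(x); note the division by n, not n-1
zetaHat = sqrt(mean((lnx - lambdaHat).^2));

fprintf('lambda = %.3f, zeta = %.3f\n', lambdaHat, zetaHat);
```

Because the MLE divides by n, `zetaHat` is slightly smaller than `std(lnx)`, which divides by n-1.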

Method of Moments
Use sample moments (mean, variance, etc.) to set distribution parameters

Example
Times between successive arrivals of vehicles at an intersection are 1.2, 3, 6.3, 10.1, 5.2, 2.4, and 7.2 seconds
Assume exponential distribution
Set the distribution mean equal to the sample mean: 35.4/7 = 5.06 seconds

2-Parameter Example
Measure cycles to failure of saturated sand (25, 20, 28, 33, 26 cycles)
Assume lognormal distribution
Mean = 26.4
Standard Deviation = 4.72
Solve for $\lambda$ and $\zeta$ (mean and standard deviation of ln x)
$\lambda = 3.26$
$\zeta = 0.177$
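The slide gives only the results. A sketch of the intermediate algebra, assuming the standard lognormal moment relations and taking $\lambda$ and $\zeta$ to be the mean and standard deviation of ln x (the symbols are an assumption, since they did not survive in the transcript):

```latex
\mu = \exp\left(\lambda + \tfrac{1}{2}\zeta^2\right), \qquad
\sigma^2 = \mu^2\left(e^{\zeta^2} - 1\right)

\zeta^2 = \ln\left(1 + \frac{\sigma^2}{\mu^2}\right)
        = \ln\left(1 + \left(\frac{4.72}{26.4}\right)^2\right) \approx 0.0315,
\qquad \zeta \approx 0.177

\lambda = \ln(\mu) - \tfrac{1}{2}\zeta^2 = \ln(26.4) - 0.0157 \approx 3.26
```

Here 26.4 and 4.72 are the sample mean and sample standard deviation substituted for $\mu$ and $\sigma$.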

Minimum Mean Square Error
Choose parameters to minimize mean squared error between measured data and continuous distribution
Essentially a curve fit

Approach
Excel
◦ Guess parameters
◦ Calculate sum of squares of errors
◦ Vary guessed parameters to minimize error (use the Solver)
Matlab
◦ Use fminsearch function

Example
Solar insolation data
◦ Gather data
◦ Form histogram
◦ Normalize histogram by number of samples and width of bins

Scatter Plot and Histogram
[Figure: scatter plot of the insolation samples (sample index 0 to 35, values up to 4500) and histogram of the data over the range 3500 to 4500]

Normal and Weibull Fits
[Figure: two panels showing the normalized histogram over the range 3500 to 4500 with the fitted normal and Weibull pdfs]
Mean=3980 (fit)
Mean=3915 (data)

Excel Screen Shot
[Screenshot not included in transcript]

Excel Screen Shot
[Screenshot not included in transcript]

Solver Set Up
[Screenshot not included in transcript]
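
Matlab Script
% Fit a normal pdf to a normalized histogram by least squares
y = xlsread('matlabfit.xlsx','normal');    % raw data from the 'normal' worksheet
[s,t] = hist(y,8);                         % s = bin counts, t = bin centers
binwidth = t(2)-t(1);                      % centers span 7 bin widths, so do not use (max(t)-min(t))/8
s = s/binwidth/numel(y);                   % normalize so the histogram has unit area
zin(1) = mean(t);                          % initial guess for the mean
zin(2) = std(t);                           % initial guess for the standard deviation
sumoferrs(zin,t,s)                         % sum of squared errors at the initial guess
zout = fminsearch(@(z) sumoferrs(z,t,s), zin)
sumoferrs(zout,t,s)                        % sum of squared errors at the optimum
xplot = t(1):(t(end)-t(1))/(10*numel(t)):t(end);
yplot = curve(xplot,zout);
plot(t,s,'+',xplot,yplot)

Matlab Script
function f = curve(x,z)
% Normal pdf with mean z(1) and standard deviation z(2)
mu = z(1);
sig = z(2);
f = normpdf(x,mu,sig);
end

function f = sumoferrs(z,x,y)
% Sum of squared differences between the fitted pdf and the histogram heights
f = sum((curve(x,z)-y).^2);
end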
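
Sampling Distributions
How do we assess inaccuracy in using sample mean to estimate population mean?
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
$$E(\bar{x}) = E\left[\frac{1}{n}\sum_{i=1}^{n} x_i\right] = \frac{1}{n}\sum_{i=1}^{n} E(x_i) = \frac{1}{n}\, n\mu = \mu$$
$$Var(\bar{x}) = Var\left[\frac{1}{n}\sum_{i=1}^{n} x_i\right] = \frac{1}{n^2}\sum_{i=1}^{n} Var(x_i) = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}$$
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$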

Conclusions
Expected value of sample mean is equal to population mean
Mean of sample is unbiased estimator of mean of population
Variance of sample mean ($\sigma^2/n$) measures the sampling error
By CLT, sample mean is Gaussian for large n
Mean of x is $N(\mu, \sigma/\sqrt{n})$
Estimator for $\mu$ improves as n increases
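These conclusions can be illustrated by simulation. The sketch below is not from the slides: the population parameters (`mu = 10`, `sigma = 2`), the sample size, and the number of repetitions are all hypothetical values chosen for illustration.

```matlab
rng(0);                       % fix the seed so the run is repeatable
mu = 10; sigma = 2;           % hypothetical population mean and standard deviation
n = 25;                       % size of each sample
nrep = 10000;                 % number of repeated samples

samples = mu + sigma*randn(n, nrep);   % each column is one sample of size n
xbar = mean(samples, 1);               % sample mean of each column

fprintf('mean of sample means: %.3f (population mean %.1f)\n', mean(xbar), mu);
fprintf('std of sample means:  %.3f (sigma/sqrt(n) = %.3f)\n', std(xbar), sigma/sqrt(n));
```

The spread of the sample means should be close to sigma/sqrt(n), and it shrinks as n is increased.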

Sample Mean with Unknown $\sigma$
In previous derivation, $\sigma$ is the population standard deviation
This is generally not known
All we have is the sample variance ($s^2$)
If sample size is small, distribution will not be Gaussian
We can use a “Student’s t-distribution”
$$f_T(t) = \frac{\Gamma\left[(f+1)/2\right]}{\sqrt{\pi f}\;\Gamma(f/2)}\left(1 + \frac{t^2}{f}\right)^{-(f+1)/2}$$
f = number of degrees of freedom
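
Distribution of Sample Variance
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$
$$E(s^2) = \frac{1}{n-1}\, E\left[\sum_{i=1}^{n}(x_i - \bar{x})^2\right]$$
$$\sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n}\left[(x_i - \mu) - (\bar{x} - \mu)\right]^2 = \sum_{i=1}^{n}(x_i - \mu)^2 - 2(\bar{x} - \mu)\sum_{i=1}^{n}(x_i - \mu) + n(\bar{x} - \mu)^2$$
$$\sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n}(x_i - \mu)^2 - n(\bar{x} - \mu)^2 \quad \text{since } \sum_{i=1}^{n}(x_i - \mu) = n(\bar{x} - \mu)$$
$$E(s^2) = \frac{1}{n-1}\left[\sum_{i=1}^{n}E(x_i - \mu)^2 - nE(\bar{x} - \mu)^2\right] = \frac{1}{n-1}\left[n\sigma^2 - n\frac{\sigma^2}{n}\right] = \sigma^2$$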
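
Conclusions
Sample variance is unbiased estimator of population variance
$$Var(s^2) = \frac{\sigma^4}{n}\left[\frac{\mu_4}{\sigma^4} - \frac{n-3}{n-1}\right], \qquad \mu_4 = E(x - \mu)^4$$
For normal variates
$$(n-1)s^2 = \sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n}(x_i - \mu)^2 - n(\bar{x} - \mu)^2$$
$$\frac{(n-1)s^2}{\sigma^2} = \sum_{i=1}^{n}\left(\frac{x_i - \mu}{\sigma}\right)^2 - \left(\frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\right)^2$$
This quantity follows a Chi-Square Distribution with n-1 dof
This approaches normal distribution for large n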

Testing Hypotheses
Used to make decisions about population based on sample
Steps
◦ Define null and alternative hypotheses
◦ Identify test statistic
◦ Estimate test statistic, based on sample
◦ Specify level of significance
   Type I error: rejecting null hypothesis when it is true
   Type II error: accepting null hypothesis when it is false
◦ Define region of rejection (one tail or two?)

Level of Significance
Type I error
◦ Level of significance ($\alpha$)
◦ Typically 1-5%
Type II error ($\beta$) is seldom used

Example
We need yield strength of rebar to be at least 38 psi
We order sample of 25 rebars
Sample mean from 25 tests is 37.5 psi
Standard deviation of rebar strength $\sigma$ = 3 psi
Use one-sided test
Hypotheses: null: $\mu = 38$; alt.: $\mu < 38$
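
Solution
$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{37.5 - 38}{3/\sqrt{25}} = -0.833$$
$$z_{\alpha} = \Phi^{-1}(\alpha) = \Phi^{-1}(0.05) = \text{norminv}(0.05, 0, 1) = -1.64$$
Z = -0.833 is greater than -1.64, so it lies outside the region of rejection
So we cannot reject the null hypothesis and the supplier is considered acceptable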
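The same test can be scripted. This is a minimal sketch under the example's stated assumptions (known standard deviation, one-sided lower-tail test); the variable names `mu0` and `rejectNull` are hypothetical, and `norminv` requires the Statistics and Machine Learning Toolbox.

```matlab
xbar = 37.5;     % sample mean (psi)
mu0 = 38;        % mean under the null hypothesis (psi)
sigma = 3;       % known population standard deviation (psi)
n = 25;          % sample size
alpha = 0.05;    % level of significance

Z = (xbar - mu0)/(sigma/sqrt(n));   % test statistic
zcrit = norminv(alpha, 0, 1);       % lower-tail critical value

% Reject only if the statistic falls below the critical value
rejectNull = Z < zcrit;
fprintf('Z = %.3f, critical value = %.3f, reject = %d\n', Z, zcrit, rejectNull);
```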
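
Variation of This Example
Suppose standard deviation is not known
Use Student’s t-distribution
Sample stand. dev. = 3.5 psi
$\bar{x} = 37.5$ psi, $s = 3.5$ psi
$$T = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{37.5 - 38}{3.5/\sqrt{25}} = -0.714$$
$f = \text{dof} = 25 - 1 = 24$
$t_{\alpha} = \text{tinv}(0.05, 24) = -1.711$
T = -0.714 is greater than -1.711, so it lies outside the region of rejection
So we cannot reject the null hypothesis and the supplier is considered acceptable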

Third Variation
Sample size increased to 41
Sample mean = 37.6 psi
Sample standard deviation = 3.75 psi
Null: variance = 9
Alternative: variance > 9
Use Chi-Square distribution
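
Solution
$$C = \frac{(n-1)s^2}{\sigma^2} = \frac{(41-1)(3.75)^2}{9} = 62.5$$
$\alpha = 0.025$; $f = 40$
$$c_{0.975} = \text{chi2inv}(0.975, 40) = 59.34$$
C = 62.5 is greater than 59.34, so it lies inside the region of rejection
So we reject the null hypothesis and the supplier is not acceptable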
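A short script for the variance test, written as a sketch rather than as the author's own code: the names `sigma0sq` and `rejectNull` are hypothetical, and `chi2inv` requires the Statistics and Machine Learning Toolbox.

```matlab
n = 41;            % sample size
s = 3.75;          % sample standard deviation (psi)
sigma0sq = 9;      % variance under the null hypothesis (psi^2)
alpha = 0.025;     % level of significance used in the example

C = (n-1)*s^2/sigma0sq;           % test statistic
ccrit = chi2inv(1-alpha, n-1);    % upper-tail critical value, 40 dof

% Upper-tail test: reject if the statistic exceeds the critical value
rejectNull = C > ccrit;
fprintf('C = %.2f, critical value = %.2f, reject = %d\n', C, ccrit, rejectNull);
```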

Confidence Intervals
In addition to mean, standard deviation, etc., confidence intervals can help us characterize populations
For example, the mean gives us a best estimate of the expected value of the population, but confidence intervals can help indicate the accuracy of the mean
Confidence interval is defined as the range within which a parameter will lie with a prescribed probability

CI of the Mean
First, we’ll assume the variance is known
The central limit theorem states that the pdf of the mean of n individual observations from any distribution with finite mean and variance approaches a normal distribution as n approaches infinity
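
CI of the Mean
$$K = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$$
$$P\left(k_{\alpha/2} < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \le k_{1-\alpha/2}\right) = 1 - \alpha$$
$$P\left(\bar{x} + k_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu < \bar{x} + k_{1-\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$
$$CI = \langle\mu\rangle_{1-\alpha} = \left(\bar{x} + k_{\alpha/2}\frac{\sigma}{\sqrt{n}};\ \bar{x} + k_{1-\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$
$$k_{\alpha/2} = -\Phi^{-1}(1 - \alpha/2), \qquad k_{1-\alpha/2} = \Phi^{-1}(1 - \alpha/2)$$
$\Phi$ is the CDF of the standard normal variate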

Example
Measure strength of rebar
25 samples
Mean = 37.5 psi
Standard deviation = 3 psi
Find 95% confidence interval for mean
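
Solution
$1 - \alpha = 0.95$; $\alpha/2 = 0.025$; $1 - \alpha/2 = 0.975$
$$k_{0.025} = -\Phi^{-1}(0.975) = -1.96, \qquad k_{0.975} = \Phi^{-1}(0.975) = 1.96$$
$$\langle\mu\rangle_{0.95} = \left(37.5 - 1.96\frac{3}{\sqrt{25}};\ 37.5 + 1.96\frac{3}{\sqrt{25}}\right) = (36.3;\ 38.7) \text{ psi}$$
So the mean of the strength falls between 36.3 and 38.7 psi with a 95% confidence level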
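
The Script
mu = 37.5                    % sample mean (psi)
sig = 3                      % known standard deviation (psi)
n = 25                       % number of samples
alpha = 0.05
ka = -norminv(1-alpha/2)     % lower critical value k_(alpha/2)
k1ma = -ka                   % upper critical value k_(1-alpha/2)
cil = mu + ka*sig/sqrt(n)    % lower confidence limit
ciu = mu - ka*sig/sqrt(n)    % upper confidence limit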

Variance Not Known
What if the variance of the population ($\sigma^2$) is not known?
That is, we only know variance of sample.
Let s = standard deviation of sample
We can show that
$$\frac{\bar{x} - \mu}{s/\sqrt{n}}$$
does not conform to a normal distribution, especially for small n
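
Variance Not Known
We can show that this quantity follows a Student’s t-distribution with n-1 degrees of freedom (f)
$$f_T(t) = \frac{\Gamma\left[(f+1)/2\right]}{\sqrt{\pi f}\;\Gamma(f/2)}\left(1 + \frac{t^2}{f}\right)^{-(f+1)/2}$$
$$P\left(t_{\alpha/2,\,n-1} < \frac{\bar{x} - \mu}{s/\sqrt{n}} \le t_{1-\alpha/2,\,n-1}\right) = 1 - \alpha$$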

Example
Measure strength of rebar
25 samples
Mean = 37.5 psi
s = 3.5 psi
Find 95% confidence interval for mean
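
Script
Result is 36.06, 38.94
xbar = 37.5;                    % sample mean (psi)
s = 3.5;                        % sample standard deviation (psi)
n = 25;
alpha = 0.05;
ka = -tinv(1-alpha/2,n-1);      % lower critical value t_(alpha/2,n-1)
kb = -tinv(alpha/2,n-1);        % upper critical value t_(1-alpha/2,n-1)
cil = xbar + ka*s/sqrt(n)       % lower confidence limit
ciu = xbar + kb*s/sqrt(n)       % upper confidence limit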
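
One-Sided Confidence Limit
Sometimes we only care about the upper or lower bounds
Lower
$$\langle\mu)_{1-\alpha} = \bar{x} - k_{1-\alpha}\frac{\sigma}{\sqrt{n}} \quad (\sigma \text{ known})$$
$$\langle\mu)_{1-\alpha} = \bar{x} - t_{1-\alpha,\,n-1}\frac{s}{\sqrt{n}} \quad (\sigma \text{ not known})$$
Upper
$$(\mu\rangle_{1-\alpha} = \bar{x} + k_{1-\alpha}\frac{\sigma}{\sqrt{n}} \quad (\sigma \text{ known})$$
$$(\mu\rangle_{1-\alpha} = \bar{x} + t_{1-\alpha,\,n-1}\frac{s}{\sqrt{n}} \quad (\sigma \text{ not known})$$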

Example
100 steel specimens, measure strength
Mean = 2200 kgf; s = 220 kgf
Specify lower 95% confidence limit of mean
Assume $\sigma = s = 220$ kgf
$1 - \alpha = 0.95$; $\alpha = 0.05$
$$k_{0.95} = \Phi^{-1}(0.95) = 1.65$$
$$\langle\mu)_{0.95} = 2200 - 1.65\frac{220}{\sqrt{100}} = 2164 \text{ kgf}$$
Manufacturer has 95% confidence that yield strength is at least 2164 kgf

Example
Now only 15 steel specimens
Mean = 2200 kgf; s = 220 kgf
Specify lower 95% confidence limit of mean
$$t_{0.95,\,14} = 1.761$$
$$\langle\mu)_{0.95} = 2200 - 1.761\frac{220}{\sqrt{15}} = 2100 \text{ kgf}$$
Manufacturer has 95% confidence that yield strength is at least 2100 kgf
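Both one-sided limits can be reproduced in a few lines. This is a sketch, not the author's script; the variable names `lowerNormal` and `lowerT` are hypothetical, and `norminv` and `tinv` require the Statistics and Machine Learning Toolbox.

```matlab
xbar = 2200;     % sample mean (kgf)
s = 220;         % sample standard deviation (kgf)
alpha = 0.05;

% 100 specimens: treat s as the known sigma and use the normal distribution
lowerNormal = xbar - norminv(1-alpha)*s/sqrt(100);

% 15 specimens: use Student's t with 14 degrees of freedom
lowerT = xbar - tinv(1-alpha, 14)*s/sqrt(15);

fprintf('n = 100: %.0f kgf, n = 15: %.0f kgf\n', lowerNormal, lowerT);
```

The smaller sample gives a noticeably lower limit, both because the t quantile is larger and because the standard error grows as n shrinks.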
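
Confidence Interval of Variance
$$P\left(c_{\alpha/2,\,n-1} < \frac{(n-1)s^2}{\sigma^2} \le c_{1-\alpha/2,\,n-1}\right) = 1 - \alpha$$
$$\langle\sigma^2\rangle_{1-\alpha} = \left(\frac{(n-1)s^2}{c_{1-\alpha/2,\,n-1}};\ \frac{(n-1)s^2}{c_{\alpha/2,\,n-1}}\right)$$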

Example
25 storms, sample variance for measured runoff is 0.36 in²
Find upper 95% confidence limit for variance
$$(\sigma^2\rangle_{1-\alpha} = \frac{(n-1)s^2}{c_{\alpha,\,n-1}} = 0.624 \text{ in}^2$$
So, we can say, with 95% confidence, that the upper bound of the variance of the runoff is 0.624 in² and the upper bound of the standard deviation is 0.79 in
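
Script
s2 = 0.36                  % sample variance (in^2); renamed from var to avoid shadowing the built-in var function
n = 25                     % number of storms
alpha = 0.05
c = chi2inv(alpha,n-1)     % lower-tail chi-square value with n-1 dof
ci = 1/c*s2*(n-1)          % upper confidence limit on the variance
si = sqrt(ci)              % upper confidence limit on the standard deviation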
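For comparison, the two-sided interval for the same runoff data can be computed from the general expression for the confidence interval of the variance. This is a sketch that is not part of the slides; the names `cLow`, `cHigh`, `ciVar`, and `ciStd` are hypothetical.

```matlab
s2 = 0.36;       % sample variance (in^2)
n = 25;          % number of storms
alpha = 0.05;

cLow = chi2inv(alpha/2, n-1);       % lower chi-square value, n-1 dof
cHigh = chi2inv(1-alpha/2, n-1);    % upper chi-square value, n-1 dof

% The larger chi-square value gives the lower limit, and vice versa
ciVar = [(n-1)*s2/cHigh, (n-1)*s2/cLow];
ciStd = sqrt(ciVar);

fprintf('variance: (%.3f; %.3f) in^2\n', ciVar(1), ciVar(2));
fprintf('std dev:  (%.3f; %.3f) in\n', ciStd(1), ciStd(2));
```

The interval is not symmetric about 0.36 because the chi-square distribution is skewed.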
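
Measurement Theory
Suppose we are measuring distances
$d_1, d_2, \ldots, d_n$ are measured distances
Distance estimate is
$$\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i$$
Standard error is
$$\sigma_{\bar{d}} = \frac{s}{\sqrt{n}}$$
◦ s = standard deviation of sample
◦ d is the expected value of the mean
$$\langle d\rangle_{1-\alpha} = \left(\bar{d} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}};\ \bar{d} + t_{1-\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\right)$$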
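A small numerical illustration of these formulas follows. The five distance readings are hypothetical values invented for this sketch, not data from the slides, and `tinv` requires the Statistics and Machine Learning Toolbox.

```matlab
d = [100.12 100.08 100.15 100.10 100.05];   % hypothetical repeated measurements (m)
n = numel(d);
alpha = 0.05;

dbar = mean(d);                  % distance estimate
se = std(d)/sqrt(n);             % standard error of the mean
tcrit = tinv(1-alpha/2, n-1);    % t quantile with n-1 dof

% Two-sided confidence interval for the distance
ci = [dbar - tcrit*se, dbar + tcrit*se];
fprintf('d = %.3f m, 95%% CI = (%.3f; %.3f) m\n', dbar, ci(1), ci(2));
```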