TRANSCRIPT
ELEC 303 – Random Signals
Lecture 18 – Classical Statistical Inference, Dr. Farinaz Koushanfar
ECE Dept., Rice University
Nov 4, 2010
Lecture outline
• Reading: 9.1-9.2
• Confidence intervals
– Central limit theorem
– Student t-distribution
• Linear regression
Confidence interval
• Consider an estimator Θ̂n for an unknown parameter θ
• We fix a confidence level, 1−α
• For every θ, replace the single point estimator with a lower estimate Θ̂n⁻ and an upper one Θ̂n⁺ s.t.
P(Θ̂n⁻ ≤ θ ≤ Θ̂n⁺) ≥ 1−α
• We call [Θ̂n⁻, Θ̂n⁺] a 1−α confidence interval
Confidence interval - example
• Observations Xi's are i.i.d. normal with unknown mean θ and known variance σ²; the sample mean Θ̂n is then normal with mean θ and variance σ²/n
• Let α = 0.05
• Find the 95% confidence interval (a code sketch follows below)
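As a concrete illustration, here is a minimal Python sketch of this known-variance CI; numpy and scipy are assumed available, and the data values are hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical i.i.d. normal observations with known variance sigma^2
sigma = 2.0                          # known standard deviation (assumed)
x = np.array([4.8, 5.3, 5.1, 4.6, 5.7, 5.0, 4.9, 5.4])
n = len(x)

alpha = 0.05
z = stats.norm.ppf(1 - alpha / 2)    # 1.96 for alpha = 0.05

theta_hat = x.mean()                 # sample mean (point estimator)
half_width = z * sigma / np.sqrt(n)
print(f"95% CI: [{theta_hat - half_width:.3f}, {theta_hat + half_width:.3f}]")
```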
Confidence interval (CI)
• Wrong: the true parameter lies in the CI with 95% probability...
• Correct: suppose that θ is fixed
• We construct the CI many times, using the same statistical procedure
• Each time, we obtain a collection of n observations and construct the corresponding CI
• About 95% of these CIs will include θ (see the simulation sketch below)
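This frequency interpretation can be checked numerically. A minimal Python sketch (the true mean, variance, and sample size below are illustrative choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
theta, sigma, n = 5.0, 2.0, 50        # true mean, known std, sample size
alpha = 0.05
z = stats.norm.ppf(1 - alpha / 2)

trials, covered = 10_000, 0
for _ in range(trials):
    x = rng.normal(theta, sigma, size=n)
    half = z * sigma / np.sqrt(n)
    if x.mean() - half <= theta <= x.mean() + half:
        covered += 1
print(f"empirical coverage: {covered / trials:.3f}")  # close to 0.95
```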
A note on Central Limit Theorem (CLT)
• Let X1, X2, X3, ..., Xn be a sequence of n independent and identically distributed RVs with finite expectation µ and variance σ² > 0
• CLT: as the sample size n increases, the PDF of the sample average of the RVs approaches N(µ, σ²/n), irrespective of the shape of the original distribution
CLT
[Figure: a probability density function, followed by the densities of sums of two, three, and four variables, showing the progression toward a normal shape]
CLT
• Let the sum of n random variables be Sn, given by Sn = X1 + ... + Xn. Then, define a new RV
Zn = (Sn − nµ) / (σ√n)
• The distribution of Zn converges to the standard normal N(0,1) as n approaches ∞ (this is convergence in distribution), written as Zn → N(0,1) in distribution
• In terms of the CDFs: lim n→∞ P(Zn ≤ z) = Φ(z) for every z, where Φ is the standard normal CDF (a numerical check follows below)
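A quick numerical check of this convergence in distribution; the exponential source distribution and sample sizes in this Python sketch are arbitrary choices:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu, sigma = 1.0, 1.0          # mean and std of Exp(1), a skewed distribution

for n in (2, 10, 100):
    x = rng.exponential(scale=1.0, size=(100_000, n))
    z = (x.sum(axis=1) - n * mu) / (sigma * np.sqrt(n))
    # Compare the empirical CDF of Z_n with Phi at a few points
    for pt in (-1.0, 0.0, 1.0):
        emp = (z <= pt).mean()
        print(f"n={n:3d}  P(Zn<={pt:+.0f}) = {emp:.3f}  Phi = {stats.norm.cdf(pt):.3f}")
```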
Confidence interval approximation
• Suppose that the observations Xi are i.i.d. with mean θ and variance σ², both unknown
• Estimate the mean and the (unbiased) variance:
Θ̂n = (X1 + ... + Xn) / n,  Ŝn² = Σi (Xi − Θ̂n)² / (n − 1)
• We may estimate the variance σ²/n of the sample mean by Ŝn²/n
• For any given α, we may use the CLT to approximate the confidence interval in this case:
[Θ̂n − z Ŝn/√n, Θ̂n + z Ŝn/√n], where z = Φ⁻¹(1 − α/2)
From the normal table: Φ(1.96) ≈ 0.975, so z ≈ 1.96 for α = 0.05 (a code sketch follows below)
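A minimal Python sketch of this normal approximation with the variance estimated from the sample (data values hypothetical):

```python
import numpy as np
from scipy import stats

x = np.array([4.8, 5.3, 5.1, 4.6, 5.7, 5.0, 4.9, 5.4])  # hypothetical sample
n = len(x)
alpha = 0.05

theta_hat = x.mean()
s_hat = x.std(ddof=1)                  # unbiased (n-1) sample std
z = stats.norm.ppf(1 - alpha / 2)      # ~1.96

half = z * s_hat / np.sqrt(n)
print(f"approx. 95% CI: [{theta_hat - half:.3f}, {theta_hat + half:.3f}]")
```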
Confidence interval approximation
• Two different approximations are in effect:
– Treating the sum as if it is a normal RV
– The true variance is replaced by the estimated variance from the sample
• Even in the special case where the Xi's are i.i.d. normal, the variance is an estimate and the RV Tn below is not normally distributed:
Tn = √n (Θ̂n − θ) / Ŝn
t-distribution
• For normal Xi, it can be shown that the PDF of Tn does not depend on θ and σ
• This is called the t-distribution with n−1 degrees of freedom
t-distribution
• It is also symmetric and bell-shaped (like the normal)
• The probabilities of various intervals are available in tables
• When the Xi's are normal and n is relatively small, a more accurate CI is
[Θ̂n − t Ŝn/√n, Θ̂n + t Ŝn/√n], where t is the (1 − α/2) percentile of the t-distribution with n−1 degrees of freedom
Example
• The weight of an object is measured 8 times using an electric scale
• It reports true weight + random error ~N(0, σ²)
.5547, .5404, .6364, .6438, .4917, .5674, .5564, .6066
• Compute the 95% confidence interval using the t-distribution (see the sketch below)
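A Python sketch of this computation on the eight measurements above, using scipy.stats for the t percentile:

```python
import numpy as np
from scipy import stats

x = np.array([.5547, .5404, .6364, .6438, .4917, .5674, .5564, .6066])
n = len(x)
alpha = 0.05

theta_hat = x.mean()
s_hat = x.std(ddof=1)                       # unbiased sample std
t = stats.t.ppf(1 - alpha / 2, df=n - 1)    # t percentile, 7 dof (~2.365)

half = t * s_hat / np.sqrt(n)
print(f"95% t-based CI: [{theta_hat - half:.4f}, {theta_hat + half:.4f}]")
```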
Linear regression
• Building a model of relation between two or more variables of interest
• Consider two variables x and y, based on a collection of data points (xi,yi), i=1,…,n
• Assume that the scatter plot of these two variables shows a systematic, approximately linear relationship between xi and yi
• It is natural to build a model: y ≈ β0 + β1x
Linear regression
• Often, we cannot build an exact model, but we can estimate the parameters β̂0 and β̂1 from the data
• The i-th residual is: ỹi = yi − (β̂0 + β̂1xi)
Linear regression
• The parameters β̂0 and β̂1 are chosen to minimize the sum of squared residuals, Σi (yi − β̂0 − β̂1xi)²
• Always keep in mind that the postulated model may not be true
• To perform the optimization, we set the partial derivatives w.r.t. β̂0 and β̂1 to zero
Linear regression
• Given n data pairs (xi, yi), the estimates that minimize the sum of the squared residuals are (see the sketch below)
β̂1 = Σi (xi − x̄)(yi − ȳ) / Σi (xi − x̄)²,  β̂0 = ȳ − β̂1 x̄
where x̄ = (1/n) Σi xi and ȳ = (1/n) Σi yi
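These closed-form estimates translate directly into code. A minimal Python sketch (the data pairs are hypothetical):

```python
import numpy as np

def fit_line(x, y):
    """Least-squares estimates for y ~ b0 + b1*x (closed form)."""
    xbar, ybar = x.mean(), y.mean()
    b1 = ((x - xbar) * (y - ybar)).sum() / ((x - xbar) ** 2).sum()
    b0 = ybar - b1 * xbar
    return b0, b1

# Hypothetical data points
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
b0, b1 = fit_line(x, y)
print(f"estimates: b0 = {b0:.3f}, b1 = {b1:.3f}")
```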
Example
• The leaning tower of Pisa continuously tilts
• Measurements were taken between 1975 and 1987
• Find the linear regression
Solution
Justification of least squares
• Maximum likelihood
• Approximation of Bayesian linear LMS (under a possibly nonlinear model)
• Approximation of Bayesian LMS estimation (linear model)
Maximum likelihood justification
• Assume that xi’s are given numbers
• Assume yi's are realizations of a RV Yi as below, where the Wi's are i.i.d. ~N(0, σ²)
Yi = β0 + β1xi + Wi
• The likelihood function has the form
L(β0, β1) = Πi (1/√(2πσ²)) exp(−(yi − β0 − β1xi)² / (2σ²))
• ML is therefore equivalent to minimizing the sum of squared residuals (see the log-likelihood below)
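To see the equivalence in one step, take the logarithm of the likelihood above (a standard manipulation, written here in LaTeX):

```latex
\log L(\beta_0,\beta_1)
  = -\frac{n}{2}\log\left(2\pi\sigma^2\right)
    - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i-\beta_0-\beta_1 x_i\right)^2
```

Since the first term and the factor 1/(2σ²) do not depend on β0 and β1, maximizing log L is the same as minimizing Σi (yi − β0 − β1xi)².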
Approximate Bayesian linear LMS
• Assume xi and yi are realizations of RVs Xi and Yi
• (Xi,Yi) pairs are i.i.d with unknown joint PDF
• Assume an additional independent pair (X0,Y0)
• We observe X0 and want to estimate Y0 by a linear estimator
• The linear estimator is of the form Ŷ0 = β0 + β1X0, with β0 and β1 chosen to minimize the mean squared error E[(Y0 − β0 − β1X0)²]; approximating the expectations involved by sample averages over the (xi, yi) pairs yields the least-squares estimates
Approximate Bayesian LMS
• For the previous scenario, make the additional assumption of a linear model
Yi = β0 + β1Xi + Wi
• The Wi's are i.i.d. ~N(0, σ²), independent of Xi
• We know that E[Y0|X0] minimizes the mean squared estimation error; under this model, E[Y0|X0] = β0 + β1X0
• As n → ∞, the least-squares estimates converge to β0 and β1, so the fitted line approximates the Bayesian LMS estimator E[Y0|X0]
Multiple linear regression
• Many phenomena involve multiple underlying variables, also called explanatory variables
• Such models are called multiple regression
• E.g., for a triplet of data points (xi, yi, zi) we wish to estimate the model: y ≈ β0 + β1x + β2z
• Minimize:
Σi (yi − β0 − β1xi − β2zi)²
• In general, we can consider the model y ≈ β0 + Σj βj hj(x), as shown in the sketch below
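A minimal Python sketch of the two-explanatory-variable case, using numpy's least-squares solver on hypothetical data triplets:

```python
import numpy as np

# Hypothetical data triplets (x_i, y_i, z_i); we regress y on x and z
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
z = np.array([0.5, 1.0, 0.8, 1.6, 1.2, 2.0])
y = np.array([2.3, 4.1, 5.2, 7.9, 8.1, 10.6])

# Design matrix with a column of ones for the intercept beta_0
A = np.column_stack([np.ones_like(x), x, z])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # minimizes ||A beta - y||^2
print(f"b0 = {beta[0]:.3f}, b1 = {beta[1]:.3f}, b2 = {beta[2]:.3f}")
```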
Nonlinear regression
• Sometimes the expression is nonlinear in the unknown parameter θ
• Variables x and y obey the form y ≈ h(x; θ)
• Minimize: Σi (yi − h(xi; θ))²
• The minimization typically has no closed-form solution
• Assuming the Wi's are N(0, σ²) and Yi = h(xi; θ) + Wi, the ML function is again maximized by minimizing the sum of squared residuals (see the sketch below)
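A Python sketch of such a fit using scipy.optimize.curve_fit; the exponential model h(x; θ) = θ0·exp(θ1·x) and the data are illustrative choices:

```python
import numpy as np
from scipy.optimize import curve_fit

def h(x, theta0, theta1):
    """Hypothetical nonlinear model h(x; theta) = theta0 * exp(theta1 * x)."""
    return theta0 * np.exp(theta1 * x)

rng = np.random.default_rng(2)
x = np.linspace(0.0, 2.0, 20)
y = h(x, 1.5, 0.8) + rng.normal(0.0, 0.1, size=x.size)  # N(0, sigma^2) noise

# curve_fit minimizes the sum of squared residuals sum (y_i - h(x_i; theta))^2
theta_hat, _ = curve_fit(h, x, y, p0=[1.0, 1.0])
print(f"theta estimates: {theta_hat}")
```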
Practical considerations
• Heteroskedasticity
• Nonlinearity
• Multicollinearity
• Overfitting
• Causality