TRANSCRIPT
ELEC 303 – Random Signals
Lecture 18 – Classical Statistical Inference, Dr. Farinaz Koushanfar
ECE Dept., Rice University
Nov 4, 2010
Lecture outline
• Reading: 9.1-9.2
• Confidence intervals
– Central limit theorem
– Student t-distribution
• Linear regression
Confidence interval
• Consider an estimator Θ̂n for an unknown parameter θ
• We fix a confidence level, 1−α
• For every θ, replace the single point estimator with a lower estimate Θ̂n⁻ and an upper one Θ̂n⁺ s.t.
P(Θ̂n⁻ ≤ θ ≤ Θ̂n⁺) ≥ 1−α
• We call [Θ̂n⁻, Θ̂n⁺] a 1−α confidence interval
Confidence interval - example
• Observations Xi's are i.i.d. normal with unknown mean θ and known variance σ²; the sample mean Θ̂n is then normal with mean θ and variance σ²/n
• Let α = 0.05
• Find the 95% confidence interval (a code sketch follows below)
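As a concrete illustration, here is a minimal Python sketch of this known-variance CI; numpy and scipy are assumed available, and the data values are hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical i.i.d. normal observations with known variance sigma^2
sigma = 2.0                          # known standard deviation (assumed)
x = np.array([4.8, 5.3, 5.1, 4.6, 5.7, 5.0, 4.9, 5.4])
n = len(x)

alpha = 0.05
z = stats.norm.ppf(1 - alpha / 2)    # 1.96 for alpha = 0.05

theta_hat = x.mean()                 # sample mean (point estimator)
half_width = z * sigma / np.sqrt(n)
print(f"95% CI: [{theta_hat - half_width:.3f}, {theta_hat + half_width:.3f}]")
```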
Confidence interval (CI)
• Wrong: the true parameter lies in the CI with 95% probability...
• Correct: suppose that θ is fixed
• We construct the CI many times, using the same statistical procedure
• Each time, we obtain a collection of n observations and construct the corresponding CI
• About 95% of these CIs will include θ (see the simulation sketch below)
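This frequency interpretation can be checked numerically. A minimal Python sketch (the true mean, variance, and sample size below are illustrative choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
theta, sigma, n = 5.0, 2.0, 50        # true mean, known std, sample size
alpha = 0.05
z = stats.norm.ppf(1 - alpha / 2)

trials, covered = 10_000, 0
for _ in range(trials):
    x = rng.normal(theta, sigma, size=n)
    half = z * sigma / np.sqrt(n)
    if x.mean() - half <= theta <= x.mean() + half:
        covered += 1
print(f"empirical coverage: {covered / trials:.3f}")  # close to 0.95
```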
A note on Central Limit Theorem (CLT)
• Let X1, X2, X3, ..., Xn be a sequence of n independent and identically distributed RVs with finite expectation µ and variance σ² > 0
• CLT: as the sample size n increases, the PDF of the sample average of the RVs approaches N(µ, σ²/n), irrespective of the shape of the original distribution
CLT
[Figure: a probability density function, followed by the densities of sums of two, three, and four variables, showing the progression toward a normal shape]
CLT
• Let the sum of n random variables be Sn, given by Sn = X1 + ... + Xn. Then, define a new RV
Zn = (Sn − nµ) / (σ√n)
• The distribution of Zn converges to the standard normal N(0,1) as n approaches ∞ (this is convergence in distribution), written as Zn → N(0,1) in distribution
• In terms of the CDFs: lim n→∞ P(Zn ≤ z) = Φ(z) for every z, where Φ is the standard normal CDF (a numerical check follows below)
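A quick numerical check of this convergence in distribution; the exponential source distribution and sample sizes in this Python sketch are arbitrary choices:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu, sigma = 1.0, 1.0          # mean and std of Exp(1), a skewed distribution

for n in (2, 10, 100):
    x = rng.exponential(scale=1.0, size=(100_000, n))
    z = (x.sum(axis=1) - n * mu) / (sigma * np.sqrt(n))
    # Compare the empirical CDF of Z_n with Phi at a few points
    for pt in (-1.0, 0.0, 1.0):
        emp = (z <= pt).mean()
        print(f"n={n:3d}  P(Zn<={pt:+.0f}) = {emp:.3f}  Phi = {stats.norm.cdf(pt):.3f}")
```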
Confidence interval approximation
• Suppose that the observations Xi are i.i.d. with mean θ and variance σ², both unknown
• Estimate the mean and the (unbiased) variance:
Θ̂n = (X1 + ... + Xn) / n,  Ŝn² = Σi (Xi − Θ̂n)² / (n − 1)
• We may estimate the variance σ²/n of the sample mean by Ŝn²/n
• For any given α, we may use the CLT to approximate the confidence interval in this case:
[Θ̂n − z Ŝn/√n, Θ̂n + z Ŝn/√n], where z = Φ⁻¹(1 − α/2)
From the normal table: Φ(1.96) ≈ 0.975, so z ≈ 1.96 for α = 0.05 (a code sketch follows below)
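A minimal Python sketch of this normal approximation with the variance estimated from the sample (data values hypothetical):

```python
import numpy as np
from scipy import stats

x = np.array([4.8, 5.3, 5.1, 4.6, 5.7, 5.0, 4.9, 5.4])  # hypothetical sample
n = len(x)
alpha = 0.05

theta_hat = x.mean()
s_hat = x.std(ddof=1)                  # unbiased (n-1) sample std
z = stats.norm.ppf(1 - alpha / 2)      # ~1.96

half = z * s_hat / np.sqrt(n)
print(f"approx. 95% CI: [{theta_hat - half:.3f}, {theta_hat + half:.3f}]")
```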
Confidence interval approximation
• Two different approximations are in effect:
– Treating the sum as if it is a normal RV
– The true variance is replaced by the estimated variance from the sample
• Even in the special case where the Xi's are i.i.d. normal, the variance is an estimate and the RV Tn below is not normally distributed:
Tn = √n (Θ̂n − θ) / Ŝn
t-distribution
• For normal Xi, it can be shown that the PDF of Tn does not depend on θ and σ
• This is called the t-distribution with n−1 degrees of freedom
t-distribution
• It is also symmetric and bell-shaped (like the normal)
• The probabilities of various intervals are available in tables
• When the Xi's are normal and n is relatively small, a more accurate CI is
[Θ̂n − t Ŝn/√n, Θ̂n + t Ŝn/√n], where t is the (1 − α/2) percentile of the t-distribution with n−1 degrees of freedom
Example
• The weight of an object is measured 8 times using an electric scale
• It reports true weight + random error ~N(0, σ²)
.5547, .5404, .6364, .6438, .4917, .5674, .5564, .6066
• Compute the 95% confidence interval using the t-distribution (see the sketch below)
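A Python sketch of this computation on the eight measurements above, using scipy.stats for the t percentile:

```python
import numpy as np
from scipy import stats

x = np.array([.5547, .5404, .6364, .6438, .4917, .5674, .5564, .6066])
n = len(x)
alpha = 0.05

theta_hat = x.mean()
s_hat = x.std(ddof=1)                       # unbiased sample std
t = stats.t.ppf(1 - alpha / 2, df=n - 1)    # t percentile, 7 dof (~2.365)

half = t * s_hat / np.sqrt(n)
print(f"95% t-based CI: [{theta_hat - half:.4f}, {theta_hat + half:.4f}]")
```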
Linear regression
• Building a model of relation between two or more variables of interest
• Consider two variables x and y, based on a collection of data points (xi,yi), i=1,…,n
• Assume that the scatter plot of these two variables shows a systematic, approximately linear relationship between xi and yi
• It is natural to build a model: y ≈ β0 + β1x
Linear regression
• Often, we cannot build an exact model, but we can estimate the parameters β̂0 and β̂1 from the data
• The i-th residual is: ỹi = yi − (β̂0 + β̂1xi)
Linear regression
• The parameters β̂0 and β̂1 are chosen to minimize the sum of squared residuals, Σi (yi − β̂0 − β̂1xi)²
• Always keep in mind that the postulated model may not be true
• To perform the optimization, we set the partial derivatives w.r.t. β̂0 and β̂1 to zero
Linear regression
• Given n data pairs (xi, yi), the estimates that minimize the sum of the squared residuals are (see the sketch below)
β̂1 = Σi (xi − x̄)(yi − ȳ) / Σi (xi − x̄)²,  β̂0 = ȳ − β̂1 x̄
where x̄ = (1/n) Σi xi and ȳ = (1/n) Σi yi
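These closed-form estimates translate directly into code. A minimal Python sketch (the data pairs are hypothetical):

```python
import numpy as np

def fit_line(x, y):
    """Least-squares estimates for y ~ b0 + b1*x (closed form)."""
    xbar, ybar = x.mean(), y.mean()
    b1 = ((x - xbar) * (y - ybar)).sum() / ((x - xbar) ** 2).sum()
    b0 = ybar - b1 * xbar
    return b0, b1

# Hypothetical data points
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
b0, b1 = fit_line(x, y)
print(f"estimates: b0 = {b0:.3f}, b1 = {b1:.3f}")
```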
Example
• The leaning tower of Pisa continuously tilts
• Measurements were taken between 1975 and 1987
• Find the linear regression
Solution
Justification of least squares
• Maximum likelihood
• Approximation of Bayesian linear LMS (under a possibly nonlinear model)
• Approximation of Bayesian LMS estimation (linear model)
Maximum likelihood justification
• Assume that xi’s are given numbers
• Assume yi's are realizations of a RV Yi as below, where the Wi's are i.i.d. ~N(0, σ²)
Yi = β0 + β1xi + Wi
• The likelihood function has the form
L(β0, β1) = Πi (1/√(2πσ²)) exp(−(yi − β0 − β1xi)² / (2σ²))
• ML is therefore equivalent to minimizing the sum of squared residuals (see the log-likelihood below)
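To see the equivalence in one step, take the logarithm of the likelihood above (a standard manipulation, written here in LaTeX):

```latex
\log L(\beta_0,\beta_1)
  = -\frac{n}{2}\log\left(2\pi\sigma^2\right)
    - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i-\beta_0-\beta_1 x_i\right)^2
```

Since the first term and the factor 1/(2σ²) do not depend on β0 and β1, maximizing log L is the same as minimizing Σi (yi − β0 − β1xi)².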
Approximate Bayesian linear LMS
• Assume xi and yi are realizations of RVs Xi and Yi
• (Xi,Yi) pairs are i.i.d with unknown joint PDF
• Assume an additional independent pair (X0,Y0)
• We observe X0 and want to estimate Y0 by a linear estimator
• The linear estimator is of the form Ŷ0 = β0 + β1X0, with β0 and β1 chosen to minimize the mean squared error E[(Y0 − β0 − β1X0)²]; approximating the expectations involved by sample averages over the (xi, yi) pairs yields the least-squares estimates
Approximate Bayesian LMS
• For the previous scenario, make the additional assumption of a linear model
Yi = β0 + β1Xi + Wi
• The Wi's are i.i.d. ~N(0, σ²), independent of Xi
• We know that E[Y0|X0] minimizes the mean squared estimation error; under this model, E[Y0|X0] = β0 + β1X0
• As n → ∞, the least-squares estimates converge to β0 and β1, so the fitted line approximates the Bayesian LMS estimator E[Y0|X0]
Multiple linear regression
• Many phenomena involve multiple underlying variables, also called explanatory variables
• Such models are called multiple regression
• E.g., for a triplet of data points (xi, yi, zi) we wish to estimate the model: y ≈ β0 + β1x + β2z
• Minimize:
Σi (yi − β0 − β1xi − β2zi)²
• In general, we can consider the model y ≈ β0 + Σj βj hj(x), as shown in the sketch below
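A minimal Python sketch of the two-explanatory-variable case, using numpy's least-squares solver on hypothetical data triplets:

```python
import numpy as np

# Hypothetical data triplets (x_i, y_i, z_i); we regress y on x and z
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
z = np.array([0.5, 1.0, 0.8, 1.6, 1.2, 2.0])
y = np.array([2.3, 4.1, 5.2, 7.9, 8.1, 10.6])

# Design matrix with a column of ones for the intercept beta_0
A = np.column_stack([np.ones_like(x), x, z])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # minimizes ||A beta - y||^2
print(f"b0 = {beta[0]:.3f}, b1 = {beta[1]:.3f}, b2 = {beta[2]:.3f}")
```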
Nonlinear regression
• Sometimes the expression is nonlinear in the unknown parameter θ
• Variables x and y obey the form y ≈ h(x; θ)
• Minimize: Σi (yi − h(xi; θ))²
• The minimization typically has no closed-form solution
• Assuming the Wi's are N(0, σ²) and Yi = h(xi; θ) + Wi, the ML function is again maximized by minimizing the sum of squared residuals (see the sketch below)
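A Python sketch of such a fit using scipy.optimize.curve_fit; the exponential model h(x; θ) = θ0·exp(θ1·x) and the data are illustrative choices:

```python
import numpy as np
from scipy.optimize import curve_fit

def h(x, theta0, theta1):
    """Hypothetical nonlinear model h(x; theta) = theta0 * exp(theta1 * x)."""
    return theta0 * np.exp(theta1 * x)

rng = np.random.default_rng(2)
x = np.linspace(0.0, 2.0, 20)
y = h(x, 1.5, 0.8) + rng.normal(0.0, 0.1, size=x.size)  # N(0, sigma^2) noise

# curve_fit minimizes the sum of squared residuals sum (y_i - h(x_i; theta))^2
theta_hat, _ = curve_fit(h, x, y, p0=[1.0, 1.0])
print(f"theta estimates: {theta_hat}")
```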
Practical considerations
• Heteroskedasticity
• Nonlinearity
• Multicollinearity
• Overfitting
• Causality