
Sampling Distributions of Estimators and Hypothesis Testing

Weerachart T. Kilenthong

Research Institute for Policy Evaluation and Design (RIPED), University of the Thai Chamber of Commerce

Tee (RIPED) Statistical Inference 1 / 45


Random Variable and Distribution Function

Definition (Random Variable)

A random variable X is a real-valued function defined on the sample space S.

We usually use a random variable to represent an uncertain situation by assigning a value to each possible outcome, e.g.,

- Raining: let the random variable X represent the raining experiment, with X = 1 if it is raining and X = 0 otherwise.

To understand a random variable completely, we need to know its distribution function, e.g.,

- Raining: suppose that the probability of raining is p = 0.6. Its distribution function is defined by

Pr (X = 1) = 0.6,

Pr (X = 0) = 0.4.

- Income: suppose that household income is distributed as log-normal with parameters $(\mu = 10, \sigma^2 = 100)$.



Distribution Function from Simulated Data: Bernoulli Distribution

[Bar chart: Bernoulli distribution, p = 0.6. Vertical axis: probability, from 0.0 to 1.0. Observed fractions: 0.4006 for x = 0 and 0.5994 for x = 1.]

Figure: Observed fractions in a sample of 10,000 draws from the Bernoulli distribution
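A minimal sketch of how observed fractions like these could be obtained by simulation. The function name and the seed are illustrative assumptions, not the code behind the slide:

```python
import random


def simulate_bernoulli_fractions(p, n_draws, seed=0):
    """Draw n_draws Bernoulli(p) values and return the observed fractions of 0s and 1s."""
    rng = random.Random(seed)  # seeded so that the run is reproducible
    draws = [1 if rng.random() < p else 0 for _ in range(n_draws)]
    frac_one = sum(draws) / n_draws
    return {0: 1.0 - frac_one, 1: frac_one}


# p = 0.6 and 10,000 draws follow the raining example in the slides
fractions = simulate_bernoulli_fractions(p=0.6, n_draws=10_000)
print(fractions)  # the fraction for x = 1 should be close to 0.6
```

With a different seed the fractions change slightly, which is exactly the sampling variation discussed in the rest of the slides.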



Distribution Function from Simulated Data: CDF of Log-Normal Distribution

[Plot: log-normal distribution, mu = 10, sigma = 10. Horizontal axis: y, from 0 to 100. Vertical axis: cumulative distribution function, from 0.0 to 1.0.]

Figure: Observed fraction of a sample of 10,000 draws from the log-normal distribution



Distribution Function from the Population

The question then is: how can we get the distribution of the random variable we are interested in?

Of course, if we can observe the whole population, then we can just calculate the following cumulative distribution function (c.d.f.):

$$F(x) = \Pr(X \le x).$$

For example, for each level of income x, we can obtain F(x) as the fraction of households whose income is no more than x.

Then, when F is differentiable, we can get the probability density function (p.d.f.) f(x) by

$$f(x) = \frac{dF(x)}{dx},$$

which is the slope of the c.d.f. F(x).
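The two steps above (F as a fraction, f as a slope) can be sketched as follows. This is only an illustration: the log-normal parameters (0, 1), the step size h, and the function names are assumptions chosen for the example, not values from the slides.

```python
import bisect
import random


def empirical_cdf(sample):
    """Return a function F with F(x) = fraction of observations no greater than x."""
    ordered = sorted(sample)
    n = len(ordered)

    def F(x):
        # bisect_right counts how many sorted observations are <= x
        return bisect.bisect_right(ordered, x) / n

    return F


def slope(F, x, h):
    """Central finite-difference approximation to the slope dF/dx at x."""
    return (F(x + h) - F(x - h)) / (2 * h)


rng = random.Random(1)
# Illustrative "incomes": log-normal with parameters (0, 1), an assumption for this sketch
incomes = [rng.lognormvariate(0.0, 1.0) for _ in range(10_000)]
F = empirical_cdf(incomes)

print(F(1.0))              # near 0.5, because the median of this log-normal is exp(0) = 1
print(slope(F, 1.0, 0.1))  # a rough density estimate near x = 1
```

The finite-difference slope is a crude density estimate; a smaller h tracks the slope more closely but is noisier for a fixed sample size.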



Distribution Function from Simulated Data: pdf of Log-Normal Distribution

[Plot: log-normal distribution, mu = 10, sigma = 10. Horizontal axis: y, from 0 to 100. Vertical axis: probability distribution function, from 0.00 to 0.10.]

Figure: Observed fraction of a sample of 10,000 draws from the log-normal distribution



Population Mean and Variance

Sometimes it is more convenient to characterize a distribution function by its moments (expectations).

The most popular moments are the mean $E[X] \equiv \mu$ and the variance $\mathrm{Var}[X] \equiv \sigma^2$:

$$\mu = E[X] = \int x f(x)\,dx,$$

$$\sigma^2 = \mathrm{Var}[X] = \int (x - E[X])^2 f(x)\,dx.$$

One of the reasons is that $(\mu, \sigma^2)$ completely characterize a Normal distribution. That is, if we know $(\mu, \sigma^2)$ of a Normal distribution, then we know the whole distribution. Note: why do we care about the Normal distribution so much?



Sample and Population

Unfortunately, it is almost impossible to observe the whole population.

Therefore, we need to use estimation and inference.

The data we observe are called a sample. We need to use the sample to infer about the underlying random variable, i.e., the truth; we cannot wait for the population data.

Problem: obviously, the sample is not the same as the population. What can we do?



Estimation of Mean and Variance

We can use a non-parametric method to estimate the whole distribution directly, but it requires a very large sample, which we usually do not have. This is the best way if you can do it, of course.

Most of the time, though, we will estimate only the mean and the variance.

There are several underlying theories that give us the following estimators for the mean and the standard deviation:

$$\hat{\mu} = \bar{x}_n = \frac{\sum_{i=1}^{n} x_i}{n},$$

$$\hat{\sigma} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x}_n)^2}{n}}.$$

Note: there is an alternative estimator that divides by n − 1:

$$\hat{\sigma} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x}_n)^2}{n - 1}},$$

whose square is an unbiased estimator of the variance $\sigma^2$. Both will be very close when n is large.
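As a small illustration of the two divisors, the following sketch computes both versions on a made-up data set. The data values and the `ddof` argument name are assumptions for this example only.

```python
import math


def mean_estimate(xs):
    """Sample average: sum of the observations divided by n."""
    return sum(xs) / len(xs)


def sd_estimate(xs, ddof=0):
    """Standard deviation estimate; ddof=0 divides by n, ddof=1 divides by n - 1."""
    n = len(xs)
    m = mean_estimate(xs)
    sum_sq = sum((x - m) ** 2 for x in xs)
    return math.sqrt(sum_sq / (n - ddof))


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical observations

print(mean_estimate(data))        # average of the eight values
print(sd_estimate(data, ddof=0))  # divisor n
print(sd_estimate(data, ddof=1))  # divisor n - 1, slightly larger
```

The gap between the two estimates shrinks as n grows, since the ratio of the divisors, n / (n − 1), approaches one.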



Estimators as Random Variables

Question: Should we consider an estimator as a constant number or as a random variable?

Answer: Given that we use a sample, which is not the population, we need to consider an estimator as a random variable.

As a result, an estimator itself must have a distribution function, which is the key to a statistical analysis.

A statement from a statistical analysis, therefore, needs to incorporate the distribution function of the estimator. Technically, this procedure is called statistical inference.



Distribution of the Mean Estimator or Average

Question: What is the distribution of the average

$$\hat{\mu} = \frac{\sum_{i=1}^{n} x_i}{n}?$$

- Using a law of large numbers, we know that the average $\hat{\mu}$ converges to the true parameter $\mu$ as n grows; moreover, $E[\hat{\mu}] = \mu$.

Question: What is the distribution of the estimator?

- If we assume that the distribution is Normal (we usually prove this using the central limit theorem), then we can simply ask: what is the variance of the mean estimator?



Standard Error of the Average

We usually estimate the standard deviation of an estimator by its standard error (s.e.).

Question: What is the standard error of the mean estimator of a Normal distribution?

Back to basics: what is the standard error?

- It is the standard deviation of the mean estimator.
- But how can we calculate it?

Statistical programs (e.g., STATA, R) usually calculate the s.e. for you.

Here we will show you how to construct it manually. The purpose is not for you to do it this way but to show you its meaning.



Simulated Data from the Standard Normal Distribution with $(\mu = 0, \sigma^2 = 1)$

We begin by simulating data from a known Normal distribution with $(\mu = 0, \sigma^2 = 1)$. In this case, these are the true parameters.

Suppose that we simulate 10,000 observations.

You can see the distribution of each sample set in the following figure.

- It gets closer to the theoretical distribution as the sample size gets larger.



Simulated Data from a Normal Distribution with $(\mu = 0, \sigma^2 = 1)$

[Six panels plotting density (vertical axis) against z (horizontal axis), under the overall title "Normal Distribution: µ = 0, σ² = 1". Three of the panels are titled "number of sample = 30", "number of sample = 100", and "number of sample = 500".]

Figure: Distribution of simulated samples from the standard normal distribution



Standard Error of the Average from Simulated Data

We will randomly draw samples of size n = 50, 100, 500, and 1,000 from the 10,000 observations, repeating each draw 1,000 times.

Then, calculate the average (the mean estimator) for each draw:

$$\hat{\mu}_n = \frac{\sum_{i=1}^{n} x_i}{n}.$$

Get the distribution of the averages $\hat{\mu}_n$ from those draws.

This is the distribution of the estimator. Of course, its accuracy depends on the number of draws.
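The procedure above can be sketched as follows. The slides do not say whether the draws are made with or without replacement; this sketch assumes drawing with replacement, and the seed and function name are illustrative.

```python
import random
import statistics


def simulate_averages(population, n, n_repeats, rng):
    """Draw n observations (with replacement) n_repeats times and return the averages."""
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(n_repeats)]


rng = random.Random(2018)
# 10,000 simulated observations from the standard Normal distribution
population = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

sd_of_averages = {}
for n in (50, 100, 500, 1000):
    averages = simulate_averages(population, n, 1000, rng)
    # The standard deviation of the 1,000 averages is the simulated standard error
    sd_of_averages[n] = statistics.stdev(averages)
    print(n, round(statistics.fmean(averages), 4), round(sd_of_averages[n], 4))
```

The standard deviation of the averages shrinks as n grows, roughly in proportion to one over the square root of n.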



Standard Error of the Average from Simulated Data

Figure: Distribution of the estimated mean (average) from randomly drawn samples



Standard Error from Theory

Question: Do we need to do this bootstrapping all the time?

Answer: We usually do not do it this way. We use probability theory to guide us in calculating the standard error (s.e.) of an estimator.

For example, using a central limit theorem, we can show that $\sqrt{n}(\bar{X}_n - \mu)$ has, approximately for large n, the Normal distribution with mean zero and variance $\sigma^2$. Hence, $\mathrm{Var}[\bar{X}_n] = \frac{\sigma^2}{n}$.

As a result, we can approximate the standard error of the average (the mean estimator) by

$$\frac{\hat{\sigma}}{\sqrt{n}}.$$

That is, we replace $\sigma$ by its estimator $\hat{\sigma}$:

$$\hat{\sigma} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x}_n)^2}{n - 1}}.$$

As a result, when the data are Normal, the standardized average $(\bar{X}_n - \mu)/(\hat{\sigma}/\sqrt{n})$ has the t distribution with n − 1 degrees of freedom.
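A minimal sketch of the formula-based standard error computed from a single sample. The sample here is simulated from a standard Normal distribution purely for illustration; the seed and the sample size of 500 are assumptions.

```python
import math
import random
import statistics


def standard_error_of_mean(xs):
    """Formula-based s.e. of the average: sigma_hat / sqrt(n), using the n - 1 divisor."""
    return statistics.stdev(xs) / math.sqrt(len(xs))


rng = random.Random(7)
sample = [rng.gauss(0.0, 1.0) for _ in range(500)]  # one simulated sample, true sigma = 1

se = standard_error_of_mean(sample)
print(se)  # should be near 1 / sqrt(500), i.e. about 0.0447
```

Only one sample is needed here, which is why the formula is preferred in practice over repeating the draws many times.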



Standard Error from Simulated Data and Theory

                     Simulation
   n   µ   σ    E[x̄]       Var[x̄]     SD(x̄)      SE formula
  50   0   1    0.018542   0.018731   0.140978   0.140585
 100   0   1    0.005671   0.010035   0.099687   0.098938
 500   0   1    0.012806   0.001813   0.044581   0.044573
1000   0   1    0.011740   0.000861   0.031524   0.031520



Standard Error is Not the Standard Deviation of the Sample

One may wonder whether the standard error is the same as the standard deviation of the data.

The answer is clearly NO.

To see this further, let us look at the standard error of the standard deviation estimator.



Standard Error of the Standard Deviation $\hat{\sigma}$ from Simulated Data

Figure: Distribution of the estimated standard deviation from randomly drawn samples



Proportion/Fraction as Probability

Another simple but useful estimator is a fraction $\hat{p}_i$, the ratio of the number of observations in group i, $n_i$, to the total number of observations n:

$$\hat{p}_i = \frac{n_i}{n}.$$

This fraction is an efficient estimator of the probability of belonging to group i in the population.

Example: Bernoulli distribution with parameter p; the outcome is either 1 or 0. The estimator for p is

$$\hat{p} = \bar{x}_n = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{n_1}{n},$$

where $n_1$ is the number of observations with $x_i = 1$.

The key point: for large n, approximately, $\hat{p} \sim N\left(p, \frac{p(1-p)}{n}\right)$.


Proportion/Fraction from Simulation

In theory, the standard error for $\hat{p}$ is $\sqrt{\frac{p(1-p)}{n}}$.

[Four density panels for n = 50, n = 100, n = 500, and n = 1000, with p = 0.75. Horizontal axis: estimated p, from 0.60 to 0.90. Vertical axis: density, from 0 to 40.]

Figure: Distribution of estimated p: the true parameter is p = 0.75.
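A rough sketch comparing the theoretical standard error of the estimated proportion with the spread of simulated estimates. The number of repetitions (1,000), the seed, and the function names are assumptions for illustration.

```python
import math
import random
import statistics


def proportion_se(p, n):
    """Theoretical standard error of the sample proportion: sqrt(p (1 - p) / n)."""
    return math.sqrt(p * (1.0 - p) / n)


def simulate_p_hat(p, n, n_repeats, rng):
    """Return n_repeats estimates of p, each from n Bernoulli(p) draws."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(n_repeats)]


rng = random.Random(22)
for n in (50, 100, 500, 1000):
    p_hats = simulate_p_hat(0.75, n, 1000, rng)
    # Columns: n, simulated standard deviation of p_hat, theoretical standard error
    print(n, round(statistics.stdev(p_hats), 4), round(proportion_se(0.75, n), 4))
```

The two columns should be close for every n, and both shrink as n grows.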


Bayes Estimation for a normal distribution

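Assume that $X_1, X_2, \ldots, X_n$ is a random sample from $N(\mu, \sigma^2)$ and that $\sigma^2$ is known.

Assume that the prior is $\mu \sim N(\mu_0, \sigma_0^2)$.

The key point: the posterior of $\mu$ is also Normal, with mean and variance

$$\mu_1 = \frac{\sigma^2 \mu_0 + n \sigma_0^2 \bar{X}_n}{\sigma^2 + n \sigma_0^2} \quad \text{and} \quad \sigma_1^2 = \frac{\sigma^2 \sigma_0^2}{\sigma^2 + n \sigma_0^2},$$

where $\bar{X}_n$ is the sample mean and n is the sample size.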
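The posterior formulas can be evaluated directly. In this sketch the function name, the argument names, and the numerical inputs are hypothetical, chosen only to show how the prior and the data are weighted.

```python
def normal_posterior(prior_mean, prior_var, sample_mean, n, sigma2):
    """Posterior mean and variance of mu for a Normal prior and Normal data with known sigma2."""
    denom = sigma2 + n * prior_var
    post_mean = (sigma2 * prior_mean + n * prior_var * sample_mean) / denom
    post_var = sigma2 * prior_var / denom
    return post_mean, post_var


# Hypothetical inputs: prior N(0, 4), known sigma^2 = 1, n = 4 observations averaging 1.0
mu1, var1 = normal_posterior(prior_mean=0.0, prior_var=4.0, sample_mean=1.0, n=4, sigma2=1.0)
print(mu1, var1)  # the posterior mean lies between the prior mean and the sample mean
```

As n grows, the weight on the sample mean approaches one and the posterior variance shrinks toward zero; with n = 0 the function returns the prior unchanged.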



Simulation of posterior distribution

Figure: Distribution of estimated µ1: the true parameter is µ = 0.



Maximum Likelihood Estimation for a normal distribution

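Assume that $X_1, X_2, \ldots, X_n$ is a random sample from $N(\mu, \sigma^2)$.

For this example, the likelihood function is

$$L = \prod_{i=1}^{n} \frac{1}{\sigma}\,\phi\!\left(\frac{x_i - \mu}{\sigma}\right), \qquad (1)$$

where $\phi$ is the standard Normal density.

The MLE of $\theta = (\mu, \sigma^2)$ is $\hat{\theta} = (\hat{\mu}, \hat{\sigma}^2) = \left(\bar{x}_n, \frac{\sum_{i=1}^{n} (x_i - \bar{x}_n)^2}{n}\right)$, and the asymptotic normality result states that, when $\theta_0 = (\mu_0, \sigma_0^2)$ is the true parameter,

$$\sqrt{n}(\hat{\mu} - \mu_0) \xrightarrow{d} N(0, \sigma_0^2),$$

$$\sqrt{n}(\hat{\sigma}^2 - \sigma_0^2) \xrightarrow{d} N\left(0, 2(\sigma_0^2)^2\right).$$

Equivalently, for large n, $\hat{\mu}$ is approximately $N\left(\mu_0, \frac{\sigma_0^2}{n}\right)$ and $\hat{\sigma}^2$ is approximately $N\left(\sigma_0^2, \frac{2(\sigma_0^2)^2}{n}\right)$.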
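A small simulation sketch of the MLE for a Normal sample and of the asymptotic variance of the variance estimator. The sample size (200), the number of repetitions (1,000), and the seed are assumptions made for this illustration.

```python
import random
import statistics


def normal_mle(xs):
    """MLE for a Normal sample: the average and the variance with divisor n."""
    n = len(xs)
    mu_hat = sum(xs) / n
    sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / n
    return mu_hat, sigma2_hat


rng = random.Random(25)
n = 200
sigma2_hats = []
for _ in range(1000):
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true parameters: mu = 0, sigma^2 = 1
    sigma2_hats.append(normal_mle(sample)[1])

print(statistics.fmean(sigma2_hats))         # close to the true variance, 1
print(n * statistics.variance(sigma2_hats))  # close to 2 * (sigma_0^2)^2 = 2
```

Multiplying the simulated variance of the estimates by n puts it on the same scale as the limiting variance in the asymptotic result.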



Simulation of estimator using MLE

Figure: Distribution of estimated $\mu$ and $\sigma^2$: the true parameters are $\mu = 0$ and $\sigma^2 = 1$.



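Method of Moments Estimation for a normal distribution

Assume that $X_1, X_2, \ldots, X_n$ is a random sample from $N(\mu, \sigma^2)$.

The moment conditions for this example equate the sample moments to the population moments:

$$\frac{\sum_{i=1}^{n} x_i}{n} = E(X) = \mu \qquad (2)$$

$$\frac{\sum_{i=1}^{n} x_i^2}{n} = E(X^2) = \mu^2 + \sigma^2 \qquad (3)$$

The MM estimator of $\theta = (\mu, \sigma^2)$ is $\hat{\theta} = (\hat{\mu}, \hat{\sigma}^2) = \left(\bar{x}_n, \frac{\sum_{i=1}^{n} (x_i - \bar{x}_n)^2}{n}\right)$.

In general, bootstrap (Monte Carlo simulation) methods provide approximations to the sampling distributions of MM estimators.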
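Solving the two moment conditions for the parameters can be sketched as below. The function name and the data are hypothetical.

```python
def normal_method_of_moments(xs):
    """Solve the first two moment conditions of a Normal sample for (mu, sigma^2)."""
    n = len(xs)
    m1 = sum(xs) / n                 # first sample moment, matched to mu
    m2 = sum(x * x for x in xs) / n  # second sample moment, matched to mu^2 + sigma^2
    mu_hat = m1
    sigma2_hat = m2 - m1 ** 2        # algebraically equal to sum((x - mean)^2) / n
    return mu_hat, sigma2_hat


data = [1.0, 2.0, 3.0]  # hypothetical observations
print(normal_method_of_moments(data))
```

Computing the variance as the second moment minus the squared first moment can lose numerical precision when the mean is large relative to the spread; the centered form avoids that.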



Simulation of estimator using MM estimators

Figure: Distribution of estimated $\mu$ and $\sigma^2$: the true parameters are $\mu = 0$ and $\sigma^2 = 1$.



Hypothesis Testing

In general, we apply a statistical analysis to test whether a hypothesis can be rejected or not.

This is a main reason why we need to know the distribution, or at least the standard error, of an estimator.

We will not go into detail here about the many types of tests available. We will simply focus on the basic concepts that might be useful for simple analysis.



Null and Alternative Hypothesis

We usually want to test whether a parameter $\theta$ is in a set $\Omega_0$ or not:

$$H_0 : \theta \in \Omega_0,$$

which is called the null hypothesis. The alternative hypothesis is then automatically defined as

$$H_1 : \theta \in \Omega_1,$$

where $\Omega_0$ and $\Omega_1$ form a partition of the parameter space $\Omega$, i.e., $\Omega_0 \cap \Omega_1 = \emptyset$ and $\Omega_0 \cup \Omega_1 = \Omega$.

If we knew the true value of $\theta$, it would be very easy to tell which one is true. Unfortunately, we usually do not know the true value; we can at best estimate it.



Test Statistics

To simplify a test further, we usually define a test statistic T = r(X), which is a function of the observables X.

Examples:

- t-statistic: a test statistic for testing whether a parameter is different from zero:

$$t = \frac{\hat{\theta}}{\mathrm{s.e.}(\hat{\theta})},$$

where $\mathrm{s.e.}(\hat{\theta})$ is the standard error of the estimator $\hat{\theta}$. If the parameter of interest is the mean, whose estimator is the average, we can then test whether the mean is equal to a constant $\mu$ or not, using the following t-stat:

$$t = \frac{\bar{X}_n - \mu}{\hat{\sigma} / \sqrt{n}},$$

where the distribution of this t-stat is the t distribution with n − 1 degrees of freedom. Note that here $\hat{\theta} = \frac{\bar{X}_n - \mu}{\sigma}$ and $\mathrm{s.e.}(\hat{\theta}) = \frac{\hat{\sigma}}{\sigma \sqrt{n}}$.
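A minimal sketch of the one-sample t-statistic for the mean. The data and the hypothesized mean of zero are made up for illustration, and the function name is an assumption.

```python
import math
import statistics


def t_statistic(xs, mu_null):
    """t-stat for H0: mean = mu_null, using the n - 1 divisor for sigma_hat."""
    n = len(xs)
    se = statistics.stdev(xs) / math.sqrt(n)  # standard error of the average
    return (statistics.fmean(xs) - mu_null) / se


returns = [0.2, -0.1, 0.4, 0.0, 0.3, 0.1, -0.2, 0.5]  # hypothetical data
t = t_statistic(returns, mu_null=0.0)
print(t)  # compare with critical values of the t distribution with n - 1 = 7 degrees of freedom
```

The standard library has no t distribution, so the comparison with a critical value is left to a table or a statistical package.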



t and Normal Distribution

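[Plot titled "Student's t Distributions": density (vertical axis, from 0.0 to 0.4) over the range −4 to 4, with curves for the normal distribution and for t distributions with df = 3, df = 5, df = 20, and df = 30.]

Figure: Normal and t distributions.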



Error Type I and Error Type II: Graphical Representation

Figure: Error type I and error type II



Error Type I and Error Type II: Formal Definition

Figure: Error type I and error type II



Significance Level of A Test

We usually like to state how large the probability of a Type I error is; this probability is the significance level α.

Figure: Significance Level



p-Value

Definition

The p-value is the smallest level α such that we would reject H0 at level α with the observed data.

In other words, the p-value tells us the smallest probability of a Type I error that we must accept in order to reject H0 with the observed data.

Figure: p-value
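As an illustration of the definition, the sketch below computes a two-sided p-value for a test statistic that is approximately standard Normal under H0. Using the Normal rather than the t distribution is a large-sample simplification assumed here, and the function names are hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Two-sided p-value for a statistic that is approximately N(0, 1) under H0."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def reject(z, alpha):
    """Reject H0 at level alpha exactly when the p-value is no larger than alpha."""
    return two_sided_p_value(z) <= alpha


print(two_sided_p_value(1.96))  # close to 0.05
print(reject(2.5, alpha=0.05))  # True: the p-value is below 0.05
print(reject(1.0, alpha=0.05))  # False: the p-value is above 0.05
```

The rule in `reject` mirrors the definition: the p-value is the smallest α for which the observed statistic still leads to rejection.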



Confidence Interval

The daily net return of ABSM from 11/09/2017 to 11/09/2018

[Plot: ABSM daily net return (vertical axis, from −0.02 to 0.02) against date (horizontal axis, from 11/09/17 to 11/09/18), showing fitted values and a 95% CI.]

Figure: Confidence Interval



Comparing Means of Two Distributions: t-Distribution

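When we want to compare the means of two distributions with the same variance $\sigma^2$:

$$H_0 : \mu_1 = \mu_2$$

We can use the following statistic to test the hypothesis:

$$U = \frac{\dfrac{\bar{X}_n - \bar{Y}_m}{\sigma \sqrt{\frac{1}{n} + \frac{1}{m}}}}{\sqrt{\dfrac{S_X^2 + S_Y^2}{\sigma^2 (n + m - 2)}}},$$

where $S_X^2 = \sum_{i=1}^{n} (X_i - \bar{X}_n)^2$ and $S_Y^2 = \sum_{j=1}^{m} (Y_j - \bar{Y}_m)^2$ are the sums of squared deviations. Note that $\sigma$ cancels, so U can be computed from the data alone.

The key point is that, when H0 is true and both samples are Normal, it has the t distribution with n + m − 2 degrees of freedom.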
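A short sketch of this two-sample statistic, written after cancelling σ and taking $S_X^2$ and $S_Y^2$ to be sums of squared deviations from each sample mean. The function names and the data are hypothetical.

```python
import math


def sum_sq_dev(xs):
    """Sum of squared deviations from the sample mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)


def two_sample_t(xs, ys):
    """Equal-variance two-sample statistic U; sigma cancels from numerator and denominator."""
    n, m = len(xs), len(ys)
    mean_diff = sum(xs) / n - sum(ys) / m
    pooled_var = (sum_sq_dev(xs) + sum_sq_dev(ys)) / (n + m - 2)
    return mean_diff / math.sqrt(pooled_var * (1.0 / n + 1.0 / m))


x = [1.0, 2.0, 3.0]       # hypothetical sample 1
y = [2.0, 3.0, 4.0, 5.0]  # hypothetical sample 2
u = two_sample_t(x, y)
print(u)  # negative here, because the first sample has the smaller mean
```

The value of U is then compared with critical values of the t distribution with n + m − 2 degrees of freedom, here 5.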



Comparing Means of Two Distributions: Example

[Two kernel density panels, one for ABSM and one for 1AMSET50. Horizontal axis: daily net return, from −0.02 to 0.02. Vertical axis: kdensity, from 0 to 70.]

Figure: Comparing Means of Two Distributions: daily net return of ABSM and 1AMSET50 from 11/09/2017 to 11/09/2018.



Comparing Means of Two Distributions: Example

We set up a hypothesis test to compare the means of two funds:

$$H_0 : \mu_{ABSM} = \mu_{1AMSET50}$$

$$H_a : \mu_{ABSM} \neq \mu_{1AMSET50}$$



Comparing Variances of Two Distributions: F-Distribution

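We can also compare the variances of two distributions:

$$H_0 : \sigma_1 = \sigma_2$$

We can use the following statistic to test the hypothesis:

$$U = \frac{S_X^2 / (n - 1)}{S_Y^2 / (m - 1)}.$$

The key point is that, when H0 is true and both samples are Normal, it has the F distribution with n − 1 and m − 1 degrees of freedom.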
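A minimal sketch of the variance-ratio statistic, taking $S_X^2$ and $S_Y^2$ to be sums of squared deviations so that each term is a sample variance. The function names and the data are hypothetical.

```python
def sum_sq_dev(xs):
    """Sum of squared deviations from the sample mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)


def f_statistic(xs, ys):
    """Ratio of the two sample variances, each computed with its own n - 1 divisor."""
    n, m = len(xs), len(ys)
    return (sum_sq_dev(xs) / (n - 1)) / (sum_sq_dev(ys) / (m - 1))


x = [1.0, 2.0, 3.0]       # hypothetical sample 1, sample variance 1
y = [2.0, 4.0, 6.0, 8.0]  # hypothetical sample 2, sample variance 20/3
print(f_statistic(x, y))  # well below 1: the first sample is less spread out
```

Values far from one in either direction count against the null hypothesis of equal variances.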



Comparing Variances of Two Distributions: Example

We set up a hypothesis test to compare the variances of two funds:

$$H_0 : \sigma_{ABSM} = \sigma_{1AMSET50}$$

$$H_a : \sigma_{ABSM} \neq \sigma_{1AMSET50}$$

Figure: Comparing Variances of Two Distributions: daily net return of ABSM and 1AMSET50 from 11/09/2017 to 11/09/2018.



Comparing Two Distributions: χ2-Distribution

We sometimes would like to compare two distributions.

With categorical or discrete data, we can form the null hypothesis as

$$H_0 : p_{ji} = p_{0i} \quad \text{for } i = 1, \ldots, k \text{ and } \forall j,$$

where each sample can be categorized into k groups.

The test statistic is

$$Q = \sum_{i=1}^{k} \frac{(N_i - N p_{0i})^2}{N p_{0i}},$$

where $N_i$ is the number of observations in group i, and $\sum_{i=1}^{k} N_i = N$.

The key point: for large N, approximately, $Q \sim \chi^2_{k-1}$.
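The statistic Q can be computed directly from group counts and the null probabilities. The counts and probabilities below are made up for illustration, and the function name is an assumption.

```python
def chi_square_statistic(counts, null_probs):
    """Q = sum over groups of (N_i - N p_0i)^2 / (N p_0i)."""
    total = sum(counts)
    return sum(
        (count - total * p) ** 2 / (total * p)
        for count, p in zip(counts, null_probs)
    )


counts = [50, 30, 20]         # hypothetical observed counts in k = 3 groups
null_probs = [0.4, 0.4, 0.2]  # hypothetical null probabilities, summing to one
q = chi_square_statistic(counts, null_probs)
print(q)  # compare with critical values of the chi-square distribution with k - 1 = 2 degrees of freedom
```

Each term compares an observed count with the count expected under H0, so a perfect match gives Q = 0 and larger values count against H0.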



Comparing Two Distributions: Example

We set up a hypothesis test to compare the proportions of funds across two categorical variables: type and policy.

$$H_0 : p_{ji} = p_{\text{Total},i}$$

$$H_a : p_{ji} \neq p_{\text{Total},i}$$

where i = Equity, Fixed, Mixed, Other and j = global, local, Total.

          global   local   Total
Equity    0.34     0.59    0.43
Fixed     0.38     0.23    0.33
Mixed     0.18     0.17    0.18
Other     0.10     0.01    0.06
Total     1.00     1.00    1.00

Table: Proportions of the number of funds



Comparing Two Distributions: Example

Figure: Comparing proportion of Two Distributions: Type and policy.

