convergence in probability
TRANSCRIPT
-
8/8/2019 Convergence in Probability
1/3
1
Convergence in Probability
One of the handiest tools in regression is the asymptotic analysis of estimators as thenumber of observations becomes large. This is handy for the following reason. When you have
a nonlinear function of a random variable g(X), when you take an expectation E[g(X)], this is not
the same as g(E[X]). For example, for a mean centered X, E[X
2
] is the variance and this is notthe same as (E[X])2=(0)2=0. However, as we will soon show, plim[g(Xn)] is equal tog(plim[Xn]), so asymptotic properties can be passed through nonlinear operators. What is this
plim in the above?
Suppose that we have a sequence of random variables X1, X2,...,Xn,each withcumulative distribution functions F1(X), F2(X),...,Fn(X), We say that this sequence converges
in probability to a constant c if 0cobPrlim nn
!I"gp
for any small I>0. In terms of the
CDFs, this says that Fn gets very close to a staircase step function at the value c, as seen in the
figure below.
That is, the limiting random variable is very unlikely to be either below or above the value of c.
We use the notation plim[Xn]=c to denote the above 0cXobrlim nn
I
, and we say that
the probability limit of Xn is c.
If g(X) is a continuous function then values of X close to c produce values of g(X) that
are close to g(c), since that is what continuous means. If the plim[Xn]=c, then almost certainlyXn is near c, so almost certainly g(Xn) is near g(c). That is, plim[g(Xn)]=g(plim[Xn]). For
example, if g(X)=X2
and plim[ nx ]=Q, then the plim[ nx2]= (plim[ nx ])
2=Q2. It is not true, of
course, that E[ nx2
]=( E[ nx ])2
=Q2
. In fact, since var( nx )=E[( nx -Q)2
]=E[ nx2
]-Q2
, we know thatE[ nx
2]=Q2+var( nx )=Q
2+W2/n Q2.
c X
.0
F
c X
.0
Fn
-
8/8/2019 Convergence in Probability
2/3
2
Chebychevs Inequality: 2k
1kXobr eW
Q .
Chebychevs inequality (sometimes written Chebysheff) says that only an insignificant portionof the probability falls beyond a given number of standard deviations of the mean, regardless ofthe probability distribution. For example, less than 25% of the probability can be more than 2
standard deviations of the mean; of course, for a normal distribution, we can be more specific less than 5% of the probability is more than 2 standard deviations from the mean.
roof: g
WQ
WQ
gg
gQQuQ|W
k
2k 222dx)x(f)x(dx)x(f)x(dx)x(f)x( because we have
left out the middle piece of the sum of positive numbers.
a. Look at the first integral on the right of the inequality. Notice that in this region of integration
x eQ-kW or x-Qe-kW, but x
-
8/8/2019 Convergence in Probability
3/3
3
Application of plim in regression.
Suppose Y=XF+I and the OLS estimator is b=(XX)-1XY=F+(XX/n)-1XI/n. The plimb=F+(XX/n)
-1plim(XI/n). For the kth variable, plim(7XikIi/n)=0, since E[7XikIi/n]=0 and
var(7XikIi/n)=W2
7Xik2
/n2
. By Chebychevs inequality 2
2
ik
iikk
1
n
X
kn/Xobr e
7W
I7 . Set
7W
n
Xk
2
ikand we have
2
2ik
2
iikn
Xn/Xobr
7We
!I7 . Taking the limit with respect
to n, plim(7XikIi/n)=0. That is, the plim b =F and we say that the OLS estimator is consistent.