TRANSCRIPT
STATISTICS–THEORY AND PRACTICE
Richard A. Johnson Conference, University of Wisconsin
Revisiting Local Asymptotic Normality (LAN) and Passing on to Local Asymptotically Mixed Normality (LAMN)
G.G. Roussas
University of California, Davis
(formerly, University of Wisconsin, Madison)
May, 2008
1. Basic Problem
On the basis of a random sample, whose probability law depends on a parameter θ, discriminate between two values θ and θ* (θ ≠ θ*).
2. Some Notation
X_0, X_1, ..., X_n i.i.d., (𝒳, 𝒜, P_θ), θ ∈ Θ open ⊆ ℝ^k, k ≥ 1,
𝒜_n = σ(X_0, ..., X_n), P_{n,θ} = P_θ | 𝒜_n,
P_{0,θ} ≈ P_{0,θ*}, θ, θ* ∈ Θ (θ ≠ θ*),

q(X_0; θ, θ*) = dP_{0,θ*}/dP_{0,θ},

ϕ_j(θ, θ*) = ϕ(X_j; θ, θ*) = [q(X_j; θ, θ*)]^{1/2},

L_n(θ, θ*) = dP_{n,θ*}/dP_{n,θ} = ∏_{j=0}^n ϕ_j^2(θ, θ*),

Λ_n(θ, θ*) = log L_n(θ, θ*);

restrict to θ*'s close to θ; i.e., θ_n = θ + h_n/√n,
h_n → h ∈ ℝ^k (all limits taken as n → ∞).
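The objects above can be made concrete with a minimal sketch for the N(θ, 1) family (an illustration, not part of the talk; the family, sample size, and parameter values are assumptions of the example):

```python
import math
import random

# Illustration (assumed N(theta, 1) family): q(x; theta, theta*) is the density
# ratio dP_{0,theta*}/dP_{0,theta}, phi_j = q(X_j)^{1/2}, L_n = prod phi_j^2,
# and Lambda_n = log L_n.

def q(x, theta, theta_star):
    """Density ratio of N(theta*, 1) to N(theta, 1) at x."""
    return math.exp(-0.5 * (x - theta_star) ** 2 + 0.5 * (x - theta) ** 2)

def log_likelihood_ratio(xs, theta, theta_star):
    """Lambda_n(theta, theta*) = sum_j log phi_j^2 = sum_j log q(X_j)."""
    return sum(math.log(q(x, theta, theta_star)) for x in xs)

random.seed(0)
theta, h, n = 1.0, 0.5, 400
theta_n = theta + h / math.sqrt(n)        # local alternative theta + h/sqrt(n)
xs = [random.gauss(theta, 1.0) for _ in range(n + 1)]   # X_0, ..., X_n
lam = log_likelihood_ratio(xs, theta, theta_n)

# For this family Lambda_n has the closed form
# (theta_n - theta) * sum_j (X_j - theta) - (n + 1)/2 * (theta_n - theta)^2.
delta = theta_n - theta
closed = delta * sum(x - theta for x in xs) - 0.5 * (n + 1) * delta ** 2
```

The agreement of `lam` with `closed` is just the algebra of normal densities; for general families no closed form is available and the q.m. framework below takes over.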
3. Basic Assumption
ϕ_0(θ, θ*) is differentiable in q.m. with respect to θ* at θ – under P_θ – with q.m. derivative ϕ̇_0(θ) (a k × 1 random vector). Set

L_n(θ, θ_n) = dP_{n,θ_n}/dP_{n,θ}, Λ_n(θ, θ_n) = log L_n(θ, θ_n),

Δ_n(θ) = (2/√n) ∑_{j=0}^n ϕ̇_j(θ),

Γ(θ): h′Γ(θ)h = 4E_θ[h′ϕ̇_0(θ)]^2, h ∈ ℝ^k,

A(h, θ) = (1/2) h′Γ(θ)h.
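For the N(θ, 1) family the q.m. derivative and Γ(θ) can be computed explicitly (an illustrative computation, not in the talk):

```latex
\begin{aligned}
q(x;\theta,\theta^*) &= \exp\{(\theta^*-\theta)(x-\theta) - \tfrac12(\theta^*-\theta)^2\},\\
\varphi(x;\theta,\theta^*) &= q^{1/2}
  = \exp\{\tfrac12(\theta^*-\theta)(x-\theta) - \tfrac14(\theta^*-\theta)^2\},\\
\dot\varphi_0(\theta) &= \partial_{\theta^*}\,\varphi(X_0;\theta,\theta^*)\big|_{\theta^*=\theta}
  = \tfrac12\,(X_0-\theta),\\
\Gamma(\theta) &= 4E_\theta[\dot\varphi_0(\theta)]^2 = E_\theta(X_0-\theta)^2 = 1
  \quad(\text{the Fisher information}),\\
A(h,\theta) &= \tfrac12\,h^2\Gamma(\theta) = \tfrac12\,h^2 .
\end{aligned}
```

So in this example Δ_n(θ) = (2/√n)∑ϕ̇_j(θ) = n^{-1/2}∑_{j=0}^n (X_j − θ), which will be used in the sketches below.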
4. First Installment of Results
Theorem 1: Λ_n(θ, θ_n) − h′Δ_n(θ) → −A(h, θ) in P_{n,θ}-probability.

Theorem 2: ℒ[Δ_n(θ) | P_{n,θ}] ⇒ N(0, Γ(θ)).

Theorem 3: ℒ[Λ_n(θ, θ_n) | P_{n,θ}] ⇒ N(−(1/2) h′Γ(θ)h, h′Γ(θ)h).
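Theorem 2 can be checked numerically; a minimal Monte Carlo sketch for the N(θ, 1) family (an illustration; there Γ(θ) = 1 and Δ_n(θ) = n^{-1/2}∑(X_j − θ)):

```python
import math
import random
import statistics

# Monte Carlo sketch (assumed N(theta, 1) family): phi_dot_j = (X_j - theta)/2,
# so Delta_n(theta) = (2/sqrt(n)) * sum_j phi_dot_j = (1/sqrt(n)) * sum_j (X_j - theta).
# Theorem 2: L[Delta_n | P_{n,theta}] => N(0, Gamma(theta)) with Gamma = 1 here.

random.seed(1)
theta, n, reps = 0.3, 400, 2000
samples = []
for _ in range(reps):
    xs = [random.gauss(theta, 1.0) for _ in range(n + 1)]   # X_0, ..., X_n
    samples.append(sum(x - theta for x in xs) / math.sqrt(n))

m = statistics.mean(samples)        # should be near 0
v = statistics.variance(samples)    # should be near Gamma(theta) = 1
```

With these draws the empirical mean and variance of Δ_n(θ) land close to 0 and 1, as Theorem 2 predicts.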
5. Reminder
{P_n} and {Q_n}, defined on (𝒳_n, 𝒜_n), are contiguous if, whenever P_n(A_n) → 0, A_n ∈ 𝒜_n, then Q_n(A_n) → 0, and vice versa.
6. Second Installment of Results
Theorem 4: Λ_n(θ, θ_n) − h′Δ_n(θ) → −A(h, θ) in P_{n,θ_n}-probability.

Theorem 5: ℒ[Δ_n(θ) | P_{n,θ_n}] ⇒ N(Γ(θ)h, Γ(θ)).

Theorem 6: ℒ[Λ_n(θ, θ_n) | P_{n,θ_n}] ⇒ N((1/2) h′Γ(θ)h, h′Γ(θ)h).
7. Loose Interpretation of Theorem 1
Λ_n(θ, θ_n) ≈ h′Δ_n(θ) − A(h, θ)

or

L_n(θ, θ_n) ≈ e^{h′Δ_n(θ) − A(h,θ)}.
8. Precise Formulation of Above Result
Theorem 7: There exists a (suitably) truncated version Δ*_n(θ) of Δ_n(θ) such that:

E_θ e^{h′Δ*_n(θ)} =: e^{B_n(h)} < ∞,

P_{n,θ}[Δ*_n(θ) ≠ Δ_n(θ)] → 0, P_{n,θ_n}[Δ*_n(θ) ≠ Δ_n(θ)] → 0,

and if

R_{n,h}(A) = e^{−B_n(h)} ∫_A e^{h′Δ*_n(θ)} dP_{n,θ}, A ∈ 𝒜_n,

(so that dR_{n,h}/dP_{n,θ} = e^{h′Δ*_n(θ) − B_n(h)}, h ∈ ℝ^k),

then

‖P_{n,θ_n} − R_{n,h_n}‖ → 0, or

sup{‖P_{n,θ+h/√n} − R_{n,h}‖ ; h ∈ B bounded ⊂ ℝ^k} → 0.
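In the N(θ, 1) illustration the approximating exponential family can be written down exactly, and no truncation is needed (a worked example, not in the talk):

```latex
\begin{aligned}
\Delta_n(\theta) &= n^{-1/2}\textstyle\sum_{j=0}^n (X_j-\theta), \qquad
E_\theta\, e^{h\Delta_n(\theta)} = e^{h^2(n+1)/2n} =: e^{B_n(h)},\\
\frac{dR_{n,h}}{dP_{n,\theta}} &= e^{h\Delta_n(\theta)-B_n(h)} .
\end{aligned}
```

A direct computation with normal densities gives dP_{n,θ_n}/dP_{n,θ} = exp{hΔ_n(θ) − h²(n+1)/2n} for θ_n = θ + h/√n, so here R_{n,h} coincides with P_{n,θ_n} and ‖P_{n,θ_n} − R_{n,h}‖ = 0 for every n; Theorem 7 asserts that this picture holds asymptotically in general.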
9. Statistical Significance of Theorem 7
Exploitation of the local approximation (in the L_1-norm, or total variation norm) of the given family of probability measures by an exponential family of probability measures.
10. Statistical Applications of the Theorems
(i) Θ ⊆ R
(a) For hypotheses (alternatives) for which there are UMP tests in the exponential family, we can construct tests ϕ_n – based on Δ_n(θ_j), θ_j, j = 0, 1, 2 boundary points – which are AUMP (i.e., lim sup [sup (E_θ ω_n − E_θ ϕ_n; θ ∈ A)] ≤ 0) among all tests ω_n such that lim sup [sup (E_θ ω_n; θ ∈ H)] ≤ α. For example: H: θ ≤ θ_0, A: θ > θ_0 at level α,

ϕ_n = ϕ_n(Δ_n(θ_0)) =
  1,   if Δ_n(θ_0) > c_n,
  γ_n, if Δ_n(θ_0) = c_n,
  0,   if Δ_n(θ_0) < c_n,

with c_n, γ_n determined by E_{θ_0} ϕ_n = α.
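A minimal sketch of the one-sided test, again assuming the N(θ, 1) family (Γ = 1 there; Δ_n(θ_0) is continuously distributed, so the randomization γ_n plays no role and c_n may be taken as the upper α-quantile of N(0, Γ)):

```python
import math
import random
from statistics import NormalDist

# Sketch (assumed N(theta,1) family, Gamma = 1): test H: theta <= theta0 vs
# A: theta > theta0 at level alpha, rejecting when Delta_n(theta0) >= c_n.
# Delta_n is continuous, so c_n can be taken as the upper alpha-quantile of N(0, 1).

def delta_n(xs, theta0):
    n = len(xs) - 1
    return sum(x - theta0 for x in xs) / math.sqrt(n)

random.seed(2)
alpha, theta0, n, reps = 0.05, 0.0, 200, 4000
c = NormalDist().inv_cdf(1 - alpha)     # c_n -> upper alpha-quantile
rejections = 0
for _ in range(reps):
    xs = [random.gauss(theta0, 1.0) for _ in range(n + 1)]   # data from H
    if delta_n(xs, theta0) >= c:
        rejections += 1
level = rejections / reps               # empirical size, near alpha
```

The empirical rejection rate under θ_0 settles near α, which is the size requirement E_{θ_0}ϕ_n = α in the limit.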
(b) For hypotheses (alternatives) for which there are UMPU tests in the exponential family, we can construct tests ϕ_n – based on Δ_n(θ_j), θ_j, j = 0, 1, 2 boundary points – which are AUMPU (i.e., lim sup [sup (E_θ ω_n − E_θ ϕ_n; θ ∈ A)] ≤ 0) among all asymptotically unbiased tests ω_n (i.e., lim inf [inf (E_θ ω_n; θ ∈ A)] ≥ α). For example: H: θ = θ_0, A: θ ≠ θ_0 at level α,

ϕ_n = ϕ_n(Δ_n(θ_0)) =
  1, if Δ_n(θ_0) < a_n or Δ_n(θ_0) > b_n,
  0, if a_n ≤ Δ_n(θ_0) ≤ b_n,

(a_n < b_n) with a_n → −ξ_{α/2}, b_n → ξ_{α/2} (ξ_p = upper p-th quantile of N(0, Γ(θ_0))) is AUMPU.

Remark: Actually, the tests are locally AUMP or AUMPU, but they become globally so under the additional assumption:

Δ_n(θ_j) → ±∞ in P_{n,θ_n}-probability if √n(θ_n − θ_j) → ±∞, j = 0, 1, 2.
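Theorem 5 makes the local power of such tests computable. A sketch for the two-sided test in the N(θ, 1) family (Γ = 1; the values of h, n, α are assumptions of the illustration):

```python
import math
import random
from statistics import NormalDist

# Sketch (assumed N(theta,1) family, Gamma = 1): under theta_n = theta0 + h/sqrt(n),
# Theorem 5 gives Delta_n(theta0) => N(Gamma*h, Gamma) = N(h, 1), so the two-sided
# test rejecting when |Delta_n(theta0)| > b (b the upper alpha/2-quantile) has
# limiting power Phi(h - b) + Phi(-h - b).

nd = NormalDist()
alpha, theta0, h, n, reps = 0.05, 0.0, 2.0, 200, 4000
b = nd.inv_cdf(1 - alpha / 2)
theta_n = theta0 + h / math.sqrt(n)

random.seed(3)
rejections = 0
for _ in range(reps):
    xs = [random.gauss(theta_n, 1.0) for _ in range(n + 1)]  # data from theta_n
    d = sum(x - theta0 for x in xs) / math.sqrt(n)
    if abs(d) > b:
        rejections += 1

power = rejections / reps
limit_power = nd.cdf(h - b) + nd.cdf(-h - b)   # Theorem 5 prediction
```

The simulated power under the contiguous alternative matches the normal-limit formula, which is exactly the computation Theorems 4–6 license.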
(ii) Θ ⊆ Rk, k ≥ 1
H: θ = θ_0, A: θ ≠ θ_0 at level α.

Simple tests ϕ_n are constructed – based on Δ_n(θ_0) – which enjoy Wald-type asymptotically optimal properties:

Weighted average power over certain surfaces is largest – within a class of competing tests.

The sup of the difference between the sup and the inf of the power over certain surfaces → 0.

The sup, over certain surfaces, of the difference between the envelope power and the power of the test ϕ_n, compared with the corresponding sup for any competing test, yields a difference whose lim sup is ≤ 0.
11. Set θ_n = θ + h/√n. Then the estimate T_n is regular if

ℒ[√n(T_n − θ_n) | P_{n,θ_n}] ⇒ ℒ(θ),

a probability measure (not depending on h).

Theorem 8: For regular estimates,

ℒ(θ) = N(0, Γ^{-1}(θ)) ∗ ℒ*(θ),

ℒ*(θ) a specific probability measure.
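In the N(θ, 1) illustration (not in the talk), the sample mean X̄_n = (n+1)^{-1}∑_{j=0}^n X_j shows the convolution theorem at its sharpest:

```latex
\mathcal{L}\big[\sqrt{n}\,(\bar X_n - \theta_n)\,\big|\,P_{n,\theta_n}\big]
\Rightarrow N(0,1) = N\big(0,\Gamma^{-1}(\theta)\big),
\qquad \text{so } \mathcal{L}(\theta) = N\big(0,\Gamma^{-1}(\theta)\big) * \delta_0 .
```

The limit does not depend on h, so X̄_n is regular, and its ℒ*(θ) is degenerate at 0; for any other regular estimate the extra convolution factor ℒ*(θ) can only spread the limit law out, which is the sense in which N(0, Γ^{-1}(θ)) is the best attainable limit.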
12. Application to Asymptotic Efficiency of
Estimates via Theorem 8: The Weiss-Wolfowitz
Approach
Θ ⊆ R
P_{n,θ}[√n(T_n − θ) ≤ x] → F_T(x; θ), a d.f., continuously in Θ for each fixed x ∈ ℝ and continuously on ℝ for each θ ∈ Θ. Let ℓ_T(θ) and u_T(θ) be the "smallest" and the "largest" median of the (continuous) d.f. F_T(·; θ). Then

lim P_{n,θ}(θ − t_1/√n + ℓ_T(θ)/√n ≤ T_n ≤ θ + t_2/√n + u_T(θ)/√n) ≤ B(θ; t_1, t_2),

B(θ; t_1, t_2) = Φ[t_2 σ(θ)] − Φ[−t_1 σ(θ)], t_1, t_2 > 0,

Φ is the d.f. of N(0, σ^2(θ)), σ^2(θ) = 4E_θ[ϕ̇_0(θ)]^2.
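A sketch of the bound for the N(θ, 1) family (an assumed illustration: σ(θ) = 1, and for T_n = X̄_n the medians ℓ_T(θ) = u_T(θ) = 0, so the bound should be attained):

```python
import math
import random
from statistics import NormalDist

# Sketch (assumed N(theta,1) family, sigma(theta) = 1): for the sample mean,
# l_T(theta) = u_T(theta) = 0, and the Weiss-Wolfowitz bound
#   B(theta; t1, t2) = Phi(t2 * sigma) - Phi(-t1 * sigma)
# should be attained: P_{n,theta}(theta - t1/sqrt(n) <= T_n <= theta + t2/sqrt(n)) -> B.

random.seed(4)
theta, t1, t2, n, reps = 1.0, 1.0, 1.5, 100, 4000
hits = 0
for _ in range(reps):
    xs = [random.gauss(theta, 1.0) for _ in range(n)]
    tn = sum(xs) / n                    # T_n = sample mean
    if theta - t1 / math.sqrt(n) <= tn <= theta + t2 / math.sqrt(n):
        hits += 1

coverage = hits / reps
bound = NormalDist().cdf(t2) - NormalDist().cdf(-t1)   # B with sigma = 1
```

The empirical concentration of X̄_n in the shrinking interval matches B(θ; t_1, t_2); any other estimate satisfying the assumptions can only do worse in the limit.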
Θ ⊆ Rk, k ≥ 1
Under similar assumptions as above,

lim P_{n,θ}[−t_1 + ℓ_T(θ, h) ≤ √n h′(T_n − θ) ≤ t_2 + u_T(θ, h)] ≤ Φ[t_2 σ^{-1}(θ, h)] − Φ[−t_1 σ^{-1}(θ, h)],

t_1, t_2 > 0, Φ the d.f. of N(0, σ^2(θ, h)), σ^2(θ, h) = h′Γ^{-1}(θ)h.
Θ ⊆ R
ℒ[√n(T_n − θ) | P_{n,θ}] ⇒ ℒ_{T,θ}, a probability measure having 0 as its median. Then

lim sup P_{n,θ}(θ − t_1/√n ≤ T_n ≤ θ + t_2/√n) ≤ B(θ; t_1, t_2), t_1, t_2 > 0.

Consider median-unbiased T_n's; i.e., P_θ(T_n ≥ θ) ≥ 1/2, P_θ(T_n ≤ θ) ≥ 1/2. Then

lim sup P_θ(θ − t_1/√n ≤ T_n ≤ θ + t_2/√n) ≤ B(θ; t_1, t_2), t_1, t_2 > 0.
13. Application to Asymptotic Efficiency of
Estimates: The Classical Approach
(i) Θ ⊆ R
Consider estimates T_n such that

P_θ[√n(T_n − θ) ≤ x] → Φ_T(x; θ)

(or continuously so in Θ), where Φ_T(·; θ) is the d.f. of N(0, σ_T^2(θ)). Then σ_T^2(θ) ≥ 1/σ^2(θ) a.e.[ℓ] (or pointwise, respectively).

Also, lim inf E_θ[n(T_n − θ)^2] ≥ 1/σ^2(θ) a.e.[ℓ] (or pointwise, respectively).

(ii) Θ ⊆ ℝ^k, k ≥ 1

Consider estimates T_n such that

P_θ[√n(T_n − θ) ≤ x] ⇒ Φ^{(k)}(x; C_T(θ))

continuously in Θ, where Φ^{(k)}(·; C_T(θ)) is the d.f. of N(0, C_T(θ)) with C_T(θ) positive definite. Then, with Γ(θ) = 4E_θ[ϕ̇_0(θ)ϕ̇_0′(θ)], C_T(θ) − Γ^{-1}(θ) is positive semi-definite a.e.[ℓ_k]. Also, C_T is continuous in Θ. Furthermore, if Γ is also continuous, then C_T(θ) − Γ^{-1}(θ) is positive semi-definite for all θ ∈ Θ.
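The variance bound can be seen numerically; a sketch for the N(θ, 1) family (an assumed illustration: 1/σ^2(θ) = 1, the sample mean attains the bound, the sample median has asymptotic variance π/2 > 1):

```python
import math
import random
import statistics

# Sketch (assumed N(theta,1) family, 1/sigma^2(theta) = 1): the sample mean
# attains the bound sigma_T^2 >= 1/sigma^2(theta) = 1, while the sample median
# has asymptotic variance pi/2 > 1, illustrating the inequality.

random.seed(5)
theta, n, reps = 0.0, 201, 2000          # odd n: median is a single order statistic
mean_errs, med_errs = [], []
for _ in range(reps):
    xs = [random.gauss(theta, 1.0) for _ in range(n)]
    mean_errs.append(math.sqrt(n) * (statistics.fmean(xs) - theta))
    med_errs.append(math.sqrt(n) * (statistics.median(xs) - theta))

v_mean = statistics.variance(mean_errs)  # near 1 = 1/sigma^2(theta)
v_med = statistics.variance(med_errs)    # near pi/2, strictly above the bound
```

Both estimates are asymptotically normal, but only the mean's limiting variance meets the lower bound; the median's exceeds it by the factor π/2.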
14. Generalizations
Most of the above results have been generalized to:

(i) the i.n.n.i.d. case
(ii) Markov processes
(iii) Semi-Markov processes
(iv) General stochastic processes, which need not even be stationary
(v) Continuous time Markov processes
(vi) Lévy processes of the discontinuous type
(vii) Continuous time diffusions and Gaussian processes with known covariance.

Also,

(viii) Certain generalizations when the sample size is a stopping time.
15. Case Where the Underlying Family of Probability Measures is not LAN but Rather LAMN
Refer to Theorem 1:

Λ_n(θ, θ_n) − [h′Δ_n(θ) − A(h, θ)] → 0 in P_{n,θ}-probability,

A(h, θ) non-random. Instead, it may happen that

Λ_n(θ, θ_n) − [h′Δ_n(θ) − (1/2) h′T_n(θ)h] → 0 in P_{n,θ}-probability,

T_n(θ) k × k 𝒜_n-measurable random matrices,

Δ_n(θ) = T_n^{1/2}(θ) W_n(θ),

W_n(θ) k-dimensional 𝒜_n-measurable random vectors,

ℒ{[W_n(θ), T_n(θ)] | P_{n,θ}} ⇒ ℒ{[W, T(θ)] | P_θ}, W ∼ N(0, I_k) independent of T(θ), or

ℒ{[Δ_n(θ), T_n(θ)] | P_{n,θ}} ⇒ ℒ{[Δ(θ), T(θ)] | P_θ}, Δ(θ) = T^{1/2}(θ)W. It follows that, in the limit under the local alternatives θ_n,

ℒ[Δ(θ) | T(θ)] = N(T(θ)h, T(θ)).
Thus, Theorem 1 becomes here

Theorem 1′: Λ_n(θ, θ_n) − h′Δ_n(θ) + (1/2) h′T_n(θ)h → 0 in P_{n,θ}-probability.

The underlying family of probability measures is referred to as Locally Asymptotically Mixed Normal (LAMN).

From Theorem 1′,

L_n(θ, θ_n) ≈ e^{h′Δ_n(θ) − (1/2) h′T_n(θ)h}.

The expression on the RHS above is referred to as a curved exponential family.

From the fact that, in the limit under the local alternatives θ_n, ℒ[Δ(θ) | T(θ)] = N(T(θ)h, T(θ)), it follows that, in the limit, inference can be carried out conditionally as in a normal distribution, and then one reverts to the original family of probability measures.
Examples
(i) Explosive autoregressive process of first order

X_j = θX_{j−1} + ε_j, X_0 = 0, |θ| > 1,
ε_j, j ≥ 1, i.i.d. ∼ N(0, 1).

Here

Δ_n(θ) = ((θ^2 − 1)/θ^n) ∑_{j=1}^n X_{j−1}ε_j,

T_n(θ) = ((θ^2 − 1)^2/θ^{2n}) ∑_{j=1}^n X_{j−1}^2,

ℒ[T(θ) | P_θ] = χ_1^2, W ∼ N(0, 1), T^{1/2}(θ) ∼ N(0, 1), T^{1/2}(θ) and W independent.
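A simulation sketch of the explosive AR(1) case. The normalization of T_n below is an assumption of this illustration, chosen to be consistent with the stated χ²_1 limit:

```python
import math
import random
import statistics

# Simulation sketch for explosive AR(1): X_j = theta * X_{j-1} + eps_j, X_0 = 0,
# |theta| > 1, eps_j i.i.d. N(0,1). The normalization below is an assumption of
# this illustration, chosen so that T_n(theta) has the chi^2_1 limit:
#   T_n(theta) = ((theta^2 - 1)^2 / theta^(2n)) * sum_{j=1}^n X_{j-1}^2.

random.seed(6)
theta, n, reps = 1.5, 40, 3000
t_samples = []
for _ in range(reps):
    x, s = 0.0, 0.0
    for _ in range(n):
        s += x * x                       # accumulate X_{j-1}^2
        x = theta * x + random.gauss(0.0, 1.0)
    t_samples.append((theta**2 - 1.0) ** 2 / theta ** (2 * n) * s)

m = statistics.mean(t_samples)           # chi^2_1 has mean 1
v = statistics.variance(t_samples)       # chi^2_1 has variance 2
```

The simulated T_n(θ) has mean near 1 and variance near 2, consistent with a χ²_1 limit; the randomness of the limit (rather than a constant Γ(θ)) is exactly what makes the family LAMN rather than LAN.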
(ii) Super-critical Galton–Watson branching process with geometric offspring distribution

Here

Δ_n(θ) = (1/θ^{(n−1)/2}) ∑_{j=1}^n (X_j − θX_{j−1}),

T_n(θ) = ((θ − 1)/θ^n) ∑_{j=1}^n X_{j−1},

T(θ) exponentially distributed with mean 1, W ∼ N(0, 1), T^{1/2}(θ) and W independent.

For LAMN families, some results analogous to those associated with LAN families have been established.
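A simulation sketch of the branching-process case. The specific offspring parametrization (geometric on {0, 1, 2, ...} with mean θ, started from X_0 = 1) is an assumption of this illustration, and only the mean of T_n is checked against the stated exponential (mean 1) limit:

```python
import math
import random
import statistics

# Simulation sketch for the Galton-Watson example. Assumptions of this
# illustration: offspring distribution geometric on {0, 1, 2, ...} with
# P(k) = (1 - c) * c**k and mean c/(1 - c) = theta; X_0 = 1. We check that
#   T_n(theta) = ((theta - 1)/theta^n) * sum_{j=1}^n X_{j-1}
# has mean approaching 1, matching the exponential (mean 1) limit in the mean.

def geometric(c):
    """Number of failures before the first success: P(k) = (1 - c) * c**k."""
    u = 1.0 - random.random()            # u in (0, 1]
    return int(math.floor(math.log(u) / math.log(c)))

random.seed(7)
theta, n, reps = 2.0, 8, 1500
c = theta / (1.0 + theta)                # mean c/(1 - c) = theta
t_samples = []
for _ in range(reps):
    x, s = 1, 0
    for _ in range(n):
        s += x                           # accumulate X_{j-1}
        x = sum(geometric(c) for _ in range(x))   # next generation size
    t_samples.append((theta - 1.0) / theta**n * s)

m = statistics.mean(t_samples)           # E[T_n] = 1 - theta**(-n), near 1
```

Here E[T_n(θ)] = 1 − θ^{-n} exactly, so the empirical mean sits near 1; note that paths hitting extinction contribute small values of T_n, again a genuinely random limiting "information" T(θ).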
Some Selected References
1. Akritas, M.G. (1978): Contiguity of Probability Measures Associated with Continuous Time Stochastic Processes. Ph.D. Thesis, Department of Statistics, University of Wisconsin, Madison.
2. Akritas, M.G. and Roussas, G.G. (1980): Asymptotic inference in continuous semi-Markov processes. Scandinavian Journal of Statistics, Vol. 7, 73–79.
3. Bhattacharya, D. and Roussas, G.G. (2001): Exponential approximation for randomly stopped locally asymptotically mixture of normal experiments. Stochastic Modeling and Applications, Vol. 4, No. 2, 56–71.
4. Jeganathan, P. (1982): On the asymptotic theory of estimation when the limit of the log-likelihood ratios is mixed normal. Sankhyā, Vol. 44, Ser. A, 173–212.
5. Jeganathan, P. (1995): Some aspects of asymptotic theory with applications to time series models. Econometric Theory, Vol. 11, 818–887.
6. Johnson, R.A. and Roussas, G.G. (1969): Asymptotically most powerful tests in Markov processes. Annals of Mathematical Statistics, Vol. 40, 1207–1215.
7. Johnson, R.A. and Roussas, G.G. (1970): Asymptotically optimal tests in Markov processes. Annals of Mathematical Statistics, Vol. 41, 918–938.
8. Johnson, R.A. and Roussas, G.G. (1972): Applications of contiguity to multiparameter hypothesis testing. Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, 195–226.
9. Le Cam, L. (1960): Locally asymptotically normal families of distributions. University of California Publications in Statistics, Vol. 3, 37–98.
10. Le Cam, L. (1986): Asymptotic Methods in Statistical Decision Theory. Springer-Verlag, New York.
11. Le Cam, L. and Yang, G.L. (2000): Asymptotics in Statistics: Some Basic Concepts (2nd edition). Springer Series in Statistics, Springer, New York.
12. Lind, B. and Roussas, G.G. (1972): A remark on quadratic mean differentiability. Annals of Mathematical Statistics, Vol. 43, 1030–1034.
13. Roussas, G.G. (1972): Contiguity of Probability Measures: Some Applications in Statistics. Cambridge University Press, Cambridge.
14. Roussas, G.G. (1979): Asymptotic distribution of the log-likelihood function for stochastic processes. Z. Wahrscheinlichkeitstheorie und Verwandte Gebiete, Vol. 47, 31–46.
15. Roussas, G.G. (2005): An Introduction to Measure-Theoretic Probability. Elsevier Academic Press, Burlington, Massachusetts.
16. Roussas, G.G. and Soms, A. (1972): On the exponential approximation of a family of probability measures and a representation theorem of Hájek–Inagaki. Annals of the Institute of Statistical Mathematics, Vol. 25, 27–39.
17. Roussas, G.G. and Bhattacharya, D. (1999): Asymptotic behavior of the log-likelihood function in stochastic processes when based on a random number of random variables. In Semi-Markov Models and Applications, Jacques Janssen and Nikolaos Limnios (eds), 119–147, Kluwer Academic Publishers, Dordrecht.
18. Roussas, G.G. and Bhattacharya, D. (2002): Exponential approximation of distributions. Teoriya Imovirnostey ta Matematichna Statystika, Vol. 66, 109–120. Also in Theory of Probability and Mathematical Statistics, Vol. 66, 119–132 (2003) (English version).
19. Roussas, G.G. and Bhattacharya, D. (2008): Hájek–Inagaki representation theorem, under a general stochastic processes framework, based on stopping times. Statistics & Probability Letters.
20. Van der Vaart, A.W. (1998): Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press.
21. Wald, A. (1943): Tests of statistical hypotheses concerning several parameters when the number of observations is large. Transactions of the American Mathematical Society, Vol. 54, 426–482.