tests of risk premia in linear factor models · linear factor models are amongst the most commonly...

30
Tests of Risk Premia in Linear Factor Models Frank Kleibergen Preliminary Version, Not to be quoted without permission. January 3, 2005 Abstract We show that inference on risk premia in linear factor models that is based on the Fama-MacBeth and GLS risk premia estimators is misleading when the β’s are small and/or the number of assets is large. We propose some novel statistics that remain trustworthy in these cases. The inadequacy of Fama-MacBeth and GLS based Wald statistics is highlighted in a power comparison and using daily portfolio returns from Jagannathan and Wang (1996). The power comparison shows that the Fama-MacBeth and GLS Wald statistics can be severly size distorted. The daily portfolio returns from Jagannathan and Wang (1996) reveal a large discrepancy between the 95% condence sets for the risk premia that result from the dierent statistics. The Fama-MacBeth and GLS Wald statistics imply small 95% condence sets for the risk premia on the yield premium and labor income growth while the statistics that remain trustworthy in case of small β’s imply large and possibly unbounded condence sets for these risk premia. 1 Introduction Linear factor models are amongst the most commonly used statistical models in nance, see e.g. Fama and MacBeth (1973), Fama and French (1996) and Jagannathan and Wang (1996). This results as many well-known nancial models, like, for example, the capital asset pricing model, imply a linear factor structure for the asset returns. Linear factor models imply risk premia for the dierent factors that result from a regression of the asset returns on the factor β ’s where the β’s are obtained in a regression of the asset returns on the factors. This regression is known as the Fama-MacBeth (FM) regression since it was initially proposed by Fama and MacBeth (1973). Regressions are known to be sensitive to collinearity of the explanatory variables so the estimates of the risk premia that result from a FM regression are sensitive to collinearity of the β ’s. Collinearity of the β’s occurs when they are small or zero. Kan and Zhang (1999), for example, show that the estimates of the risk premia that result from a FM regression are erroneous when the β’s are zero and the expected asset returns are non-zero. Misleading risk premia estimates that result from a FM regression occur more often than just in the case suggested by Kan and Zhang (1999). We show that the risk premia estimates are spurious whenever the β’s are small and are aggrevated in factor models with a large number of assets. Hence, even for size-able values of the β’s the risk premia estimates can be misleading when the number of assets is large. The FM risk premia estimates that are obtained in such settings are often too small Department of Economics, Brown University, Box B, Providence, RI 02912, United States and Department of Quantitative Economics, University of Amsterdam, Roetersstraat 11, 1018 WB Amsterdam, The Netherlands, Email: [email protected]. Homepage: http://www.econ.brown.edu/fac/Frank_Kleibergen. I thank Sung Jae Jun for research assistance and discussion. 1

Upload: others

Post on 26-Jun-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

Tests of Risk Premia in Linear Factor ModelsFrank Kleibergen∗

Preliminary Version, Not to be quoted without permission.

January 3, 2005

Abstract

We show that inference on risk premia in linear factor models that is based on the Fama-MacBethand GLS risk premia estimators is misleading when the β’s are small and/or the number of assetsis large. We propose some novel statistics that remain trustworthy in these cases. The inadequacyof Fama-MacBeth and GLS based Wald statistics is highlighted in a power comparison and usingdaily portfolio returns from Jagannathan and Wang (1996). The power comparison shows that theFama-MacBeth and GLS Wald statistics can be severly size distorted. The daily portfolio returnsfrom Jagannathan and Wang (1996) reveal a large discrepancy between the 95% confidence sets forthe risk premia that result from the different statistics. The Fama-MacBeth and GLS Wald statisticsimply small 95% confidence sets for the risk premia on the yield premium and labor income growthwhile the statistics that remain trustworthy in case of small β’s imply large and possibly unboundedconfidence sets for these risk premia.

1 Introduction

Linear factor models are amongst the most commonly used statistical models in finance, see e.g. Famaand MacBeth (1973), Fama and French (1996) and Jagannathan and Wang (1996). This results as manywell-known financial models, like, for example, the capital asset pricing model, imply a linear factorstructure for the asset returns. Linear factor models imply risk premia for the different factors thatresult from a regression of the asset returns on the factor β’s where the β’s are obtained in a regressionof the asset returns on the factors. This regression is known as the Fama-MacBeth (FM) regressionsince it was initially proposed by Fama and MacBeth (1973). Regressions are known to be sensitiveto collinearity of the explanatory variables so the estimates of the risk premia that result from a FMregression are sensitive to collinearity of the β’s. Collinearity of the β’s occurs when they are small orzero. Kan and Zhang (1999), for example, show that the estimates of the risk premia that result froma FM regression are erroneous when the β’s are zero and the expected asset returns are non-zero.

Misleading risk premia estimates that result from a FM regression occur more often than just inthe case suggested by Kan and Zhang (1999). We show that the risk premia estimates are spuriouswhenever the β’s are small and are aggrevated in factor models with a large number of assets. Hence,even for size-able values of the β’s the risk premia estimates can be misleading when the number ofassets is large. The FM risk premia estimates that are obtained in such settings are often too small

∗Department of Economics, Brown University, Box B, Providence, RI 02912, United States and Department ofQuantitative Economics, University of Amsterdam, Roetersstraat 11, 1018 WB Amsterdam, The Netherlands, Email:[email protected]. Homepage: http://www.econ.brown.edu/fac/Frank_Kleibergen. I thank Sung Jae Junfor research assistance and discussion.

1

Page 2: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

and confidence sets indicate an erroneous high level of precision. Statistical inference based on standardt-statistics is then misguided since the large sample distribution of the t-statistic differs from the normalone. We propose a set of statistics whose large sample distribution remains unaltered. These statisticscan then be used to construct trustworthy confidence sets.

The paper is organized as follows. In the second section, we discuss the FM and Generalized LeastSquares (GLS) risk premia estimates and construct their large sample distributions when the β’s aresmall. These large sample distributions are non-standard and imply that Wald statistics, like, forexample, the t-statistic, are misleading when the β’s are small or when the number of assets is large.The third section proposes the statistics whose large sample distributions remain unaltered when the β’sare small and/or the number of assets is large. These statistics are based on the Generalized Method ofMoment (GMM) and instrumental variable statistics that are proposed in Kleibergen (2002,2004a) andMoreira (2003) but differ from these statistics since the linear factor model differs from the models thatare estimable by means of GMM. The fourth section conducts a size and power comparison. It illustratesthe size distortion of the Wald t-statistics that result for the FM and GLS risk premia estimators. Thefifth section generalizes the test statistics to allow for tests on some of the risk premia. The sixth sectionanalyzes a linear factor model for the daily portfolio returns analyzed in Jagannathan and Wang (1999).It shows that the 95% confidence sets reported in Jagannathan and Wang (1999) for the risk premia onthe yield premium and labor income growth are misleading. These confidence sets give the erroneousimpression that the risk premia are estimated with a high level of precision while the confidence sets ofthese risk premia that result from our novel statistics are large and possibly unbounded which revealan imprecise estimate of the risk premia. The seventh section concludes.

We use the following notation throughout the paper: E(a) is the expected value of the randomvariable a, vec(A) stands for the column vectorization of the T × n dimensional matrix A, vec(A) =(a01 . . . a0n)0 when A = (a1 . . . an), vecinvn((a01 . . . a0n)0) = A, PA = A(A0A)−1A0 and MA = IT − PA fora full rank matrix A and the T × T identity matrix IT , ιn is a n × 1 vector of ones, “→

p” indicates

convergence in probability and “→d” indicates convergence in distribution.

2 Risk Premia estimators and small β’s

We analyze the FM estimator, see Fama and MacBeth (1973), in a GMM setting, see e.g. Hansen (1982).We specify the moments of the factor related (excess) assets returns1 as, see e.g. Cochrane (2001),

E(Rt) = ιnλ1 + βλFcov(Rt, Ft) = βvar(Ft)

E(Ft) = µF ,(1)

where Rt is the n× 1 vector of (excess) asset returns, Ft is the k × 1 vector of factors, λ1 is the zero-βreturn, λF is the k × 1 vector of factor risk premia, β is the n × k matrix of asset β’s and µF is thek×1 vector of factor means. The moments conditions (1) result from a linear factor model for the assetreturns,

Rt = ιnλ1 + β(Ft − µF + λF ) + εt, (2)

where εt is the n×1 vector of disturbances which we assume to be independently distributed with meanzero and n× n dimensional covariance matrix Ω.

For time series observations of Rt and Ft, t = 1, . . . , T, the FM two-pass methodology estimates the

risk premia using a least squares time-series regression estimate of β, β =PT

t=1 RtF 0thPT

j=1 FjF0j

i−1,

1All asset returns are in deviation from the riskless return.

2

Page 3: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

with Ft = Ft− F , F = 1T

PTt=1 Ft, Rt = Rt− R, R = 1

T

PTt=1Rt. Given β, estimators for λ1 and λk are

obtained by regressing the average value of Rt, R, on (ιn... β),µ

λ1λF

¶=

·(ιn

... β)0(ιn... β)

¸−1(ιn

... β)0R. (3)

We analyze the statistical properties of the FM risk premia estimator in large samples. We thereforemake an assumption about the behavior of the disturbances of the linear factor model (2) in largesamples.

Assumption 1. When the number of time series observations becomes large,

1√T

PTt=1

µ1Ft

¶⊗ (Rt − ιnλ1 − β(Ft + λF ))

Ft − µf

→d

ϕRϕβϕF

, (4)

with (ϕ0R ϕ0β ϕ0F )0 ∼ N(0, V ) and

V =

µ(Q⊗Ω) 00 VFF

¶, (5)

with Q =³Q11QF1

Q1FQFF

´: (k + 1)× (k + 1), Q11 : 1× 1, QF1 = Q01F : k × 1, Q : k × k and VFF : k × k.

Assumption 1 is a central limit theorem for the disturbances of the linear factor model (2) interactedwith a constant and the factors. Because of the independence of the disturbances over time and theirfinite variance, Assumption 1 holds under the linear factor model (2) with a constant covariance matrix.

The specification of the covariance matrix variance V (5) is implied by the linear factor model aswell. Since the factors are uncorrelated with the disturbances of the linear factor model, the off-diagonalblocks of V are equal to zero. The Kronecker product form of the remaining part of the covariancematrix V results from the constant covariance matrix of the disturbances over time. The specificationof Q,

Q = limT→∞Eh1T

PTt=1

¡ 1Ft

¢¡ 1Ft

¢0i, (6)

is such thatQ = 1

T

PTt=1

¡ 1Ft

¢¡ 1Ft

¢0, (7)

is a consistent estimator of Q.We deduce the large sample distribution of the average asset returns andthe least squares estimator of the β’s, β, from Assumption 1.

Lemma 1. When Assumption 1 holds, the average returns, R, the least squares estimator of the assetsβ’s, β, and the average factors, F , converge in distribution as described by

√T

R− ιnλ1 − βλFvec(β − β)F − µf

→d

ψR

ψβ

ψF

, (8)

where ψR, ψβ and ψF are n × 1, nk × 1 and k × 1 independent random vectors that have normaldistributions with mean zero and covariance matrices, Ω, QFF.1 ⊗ Ω and VFF with QFF.1 = VFF =

limT→∞Eh1T

PTt=1 FtF

0t

i.

3

Page 4: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

Proof. see Appendix A.When the β’s have size-able values, the FM risk premia estimator has a standard normal limiting

distribution so standard Wald t-tests perform adequately. Kan and Zhang (1999) show that the largesample distribution of the FM risk premia estimator is notably different when the β’s are equal to zeroand E(R) equals some n×1 vector of constants c. The behavior of the FM risk premia estimator differsin other cases as well. It results from the dependence of the FM risk premia estimator on the β’s. Thisdependence is similar to the dependence of the two stage least squares estimator on the quality of theinstruments. There is an extensive literature that discusses the behavior of the two stage least squaresestimator in case of irrelevant, see e.g. Phillips (1989), or weak instruments, see e.g. Staiger and Stock(1997). Theorem 1 states the behavior of the FM risk premia estimator in similar cases for the β’s. Wetherefore define, analogous to the weak instruments from Staiger and Stock (1999), so-called weak β’sthat coincide with the occurence of small values of the β’s with a large number of observations T.

Theorem 1. When Assumptions 1 and 2 hold, the behavior of the FM risk premia estimator (3) inlarge samples in case of zero or weak β’s is characterized by:

1. When β = 0 and

(a) E(R) = ιnλ1 : µ √T (λ1 − λ1)

λF

¶→d

Ã1√T1ab0

[Ψ0βMιnΨβ]−1Ψ0βMιn

!ψR, (9)

and the order of finite moments of the limit behavior of λF is equal to n− k while the meanof the limit behavior of λF equals zero when n > k.

(b) E(R) = c :Ã √T (λ1 − 1

ab0c)

λF −√T [Ψ0βMιnΨβ]

−1Ψ0βMιnc

!→d

Ã1√T1ab0

[Ψ0βMιnΨβ ]−1Ψ0βMιn

!ψR (10)

where Ψβ =vecinvk(ψβ), a = ι0nMΨβιn, b =MΨβ

ιn.

2. When β has a weak value such that β = 1√TB and

(a) E(R) = ιnλ1 +1√TBλF :µ √T (λ1 − λ1)

λF − λF

¶→dÃ1√T1ab0

[(B +Ψβ)0Mιn(B +Ψβ)]

−1(B +Ψβ)0Mιn

!(ψR −ΨβλF ).

(11)

and the order of finite moments of the limit behavior of λF is equal to n− k.

(b) E(R) = c : Ã √T (λ1 − 1

ab0c)

λF −√T [(B +Ψβ)

0Mιn(B +Ψβ)]−1(B +Ψβ)

0Mιnc

!→d

Ã1√T1ab0

[(B +Ψβ)0Mιn(B +Ψβ)]

−1(B +Ψβ)0Mιn

!ψR,

(12)

4

Page 5: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

where Ψβ =vecinvk(ψβ), a = ι0nMB+Ψβ ιn, b =MB+Ψβ ιn.

Proof. see Appendix B.

Theorem 1 shows that the large sample distribution of the FM risk premia estimator differs substan-tially from normality in case of zero or weak values of the β’s. First, Theorem 1 shows that zero or weakvalues of β alter the convergence rate of the risk premia estimator. When the moment conditions apply,so E(R) = ιnλ1 (1a) or E(R) = ιnλ1 +

1√TBλF (2a), Theorem 1 shows that the risk premia estimator

λF converges to a random variable instead of the convergence at rate√T to the true value λF in case of

size-able values of β.When the number of assets exceeds the number of factors, the mean of the randomvariable where the risk premia estimator converges to is equal to zero since ψR is independent of Ψβ.When the moment conditions do not apply, so E(R) = c in (1b) and (2b), the risk premia estimatorλF diverges at rate

√T both in case of zero or weak β’s. This extends Kan and Zhang (1999) who

show that the risk premia estimator λF diverges when β equals zero and E(R) = c which correspondswith (1b). Second, Theorem 1 shows that the random variable where the risk premia estimator λFconverges to depends on the number of assets and factors since the number of finite moments of thisrandom variable depends on the number of assets. In general, density functions of random variablesthat possess more moments are more concentrated and have smaller tail probabilities. Hence, when weadd assets with zero β’s, the density of the risk premia estimator becomes more concentrated aroundzero although the risk premia are not identified since β equals zero.

To illustrate the consequences of Theorem 1, we compute the empirical distribution of the riskpremia estimator λF in some simulated data-sets. We therefore simulate data from the model,

Rt = c+ βFt + εt, t = 1, . . . , T, (13)

with E(Ft) = 0 and εt are independent realizations of N(0,Ω) distributed random variables. Theparameter settings that we use for the simulated data-sets are obtained from Jagannathan and Wang(1996). The data-set of Jagannathan and Wang (1996) comprises of 330 daily observations of the returnon hundred size and β sorted portfolios. The (demeaned) return on the value weighted portfolio is theonly factor that we use from Jagannathan and Wang (1996). The covariance matrix Ω results fromregressing the size and β sorted portfolio returns on a constant and the return on the value weightedportfolio. Figures 1.1-1.4 in Panel 1 show the different elements of Theorem 1. The value of λ1 that isused for the simulated data-sets results from the FM-estimator (3) when applied to the Jagannathanand Wang (1996) data.

Figure 1.1 contains the sampling density of λF for different values of the sample size T. It illustratesCase 1a of Theorem 1. Figure 1.1 shows that λF converges to a random variable when β = 0, themodel is correctly specified and T converges to infinity since it reflects no convergence of the samplingdistribution towards a point mass. The density of λF is centered around zero because the mean of λFis equal to zero as indicated by Theorem 1. Since β is equal to zero, λF is not identified so the densityof λF is centered around a value of λF that contains no information about λF .

Figure 1.2 illustrates the divergence of λF as indicated by Case 1b of Theorem 1. In Figure 1.2, β = 0and E(R) equals a realization of a normal distributed random variable with mean zero and covariancematrix (0.25)2In. Figure 1.2 shows that the variance of λF rises when the sample size increases. Hence,λF converges to (plus or minus) infinity when the sample size converges to infinity. Figure 1.2 coincideswith Kan and Zhang (1999).

5

Page 6: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

Panel 1: Shapes of densities of λF for small or zero values of β.

-4 -3 -2 -1 0 1 2 3 40

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4de

nsity

λf

λf

-8 -6 -4 -2 0 2 4 6 80

0.05

0.1

0.15

0.2

0.25

0.3

0.35

dens

ity λ

f

λf

Figure 1.1: Density of λF when β = 0, λ1 = 1.24 Figure 1.2: Density of λF when β = 0, λ1 = 1.24and T = 330 (solid line), 990 (dashed), 1880 and E(R) = c ∼ N(0, 0.252In), T = 330 (solid(dashed-dotted), 3300 (solid with plusses) line), 990 (dashed), 1880 (dashed-dotted), 3300

(solid with plusses)

-2 -1.5 -1 -0.5 0 0.5 1 1.5 2 2.5 30

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

dens

ity λ

f

λf

-3 -2 -1 0 1 2 3 40

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8de

nsity

λf

λf

Figure 1.3: Density of λF with λF = 1, λ1 = 1.24, Figure 1.4: Density λF with E(R) = (ιnλ1 + β,β = (0.3 0 . . . 0)0 and n = 2 (solid with plusses), +c) with λ1 = 1.24, β = (0.3 0 . . . 0)0,10 (dashed-dotted), 24 (dashed), 100 (solidline) c ∼ N(0.0.04In), and n = 2 (solid with plusses),

10 (dashed-dotted), 24 (dashed), 100 (solidline)

Figure 1.3 shows the influence of adding assets whose β’s equal zero to a factor model whose assetshave rather small values of β. The β’s are therefore specified such that only the first one is small andnon-zero, β1 = 0.3. All other elements of β are equal to zero, β2 = . . . = βn = 0. Figure 1.3 shows thesampling density of λF for various values of n while only the β of the first asset is unequal to zero. Forsmall values of n, n = 2, the density of λF is centered around the true value of λF that is equal to one.When we increase the number of assets n by adding assets with a zero value of β, the density of λFbecomes centered around zero. The density of λF in case of zero β’s is centered around zero as well soFigure 1.3 shows that adding assets with a zero value of β, which therefore contain no information onthe risk premia λF , moves the density of λF towards the density which holds in case of zero β’s.

6

Page 7: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

Figure 1.4 shows the consequences of misspecification for the parameter setting that is used inFigure 1.3. We therefore added some misspecification to the E(R), E(R) = (ιnλ1 + β + c), where c isa realization of a N(0, 0.04In) distributed random vector. All remaining parameter values are identicalto the ones that are used for Figure 1.3. Figure 1.4 shows that the density of λF moves towards zeroand that its’ variance increases when we add assets with zero β’s and the model is misspecified. Figure1.4 reveals a combined effect of the phenomena present in Figures 1.2 and 1.3.

The odd behavior of the FM risk premia estimator for small or zero values of the β’s holds for otherrisk premia estimators as well. The behavior of the GLS risk premia estimator,µ

λ1λF

¶=

·(ιn

... β)0Ω−1(ιn... β)

¸−1(ιn

... β)0Ω−1R, (14)

is qualitively similar to the behavior of the FM risk premia estimator for small or zero values of the β’s.Theorem 2 states the behavior of the GLS risk premia estimator (14) in case of small and zero valuesof the β’s.

Theorem 2. When Assumptions 1 and 2 hold, the behavior of the GLS risk premia estimator (14) inlarge samples in case of zero or weak β’s is characterized by:

1. When β = 0 and

(a) E(R) = ιnλ1 :µ √T (λ1 − λ1)

λF

¶→d

à 1√T1ab0

(Ψ0βΩ− 12M

Ω−12 ιnΩ−

12Ψβ)

−1Ψ0βΩ− 12M

Ω−12 ιnΩ−

12

!ψR, (15)

so λF converges to a multi-variate t distributed random variable with location 0, scale param-eter Q11.FQ

−1FF and n-k degrees of freedom and the mean of the limit behavior of λF equals

zero when n > k.

(b) E(R) = c : Ã √T (λ1 − 1

ab0Ω−

12 c)

λF −√T (Ψ0βΩ

− 12M

Ω−12 ιnΩ−

12Ψβ)

−1Ψ0βΩ− 12M

Ω−12 ιnΩ−

12 c

!→dà 1√

T1ab0

(Ψ0βΩ− 12M

Ω−12 ιnΩ−

12Ψβ)

−1Ψ0βΩ− 12M

Ω−12 ιn

!Ω−

12ψR

(16)

where Ψβ =vecinvk(ψβ), a = ι0nΩ−12M

Ω−12ΨβΩ−

12 ιn, b =M

Ω−12ΨβΩ−

12 ιn,Q11.F =Q11−Q1FQ−1FFQF1.

2. When β has a weak value such that β = 1√TB and

(a) E(R) = ιnλ1 +1√TBλF :µ √

T (λ1 − λ1)

λF − λF

¶→d

Ã1√T1ab0

[(B +Ψβ)0Ω−

12M

Ω−12 ιnΩ−

12 (B +Ψβ)]

−1(B +Ψβ)0Ω−

12M

Ω−12 ιn

!Ω−

12 (ψR.β +Ψβ(ρ− λF )).

(17)and the order of finite moments of the limit behavior of λF − ρ is equal to n− k.

7

Page 8: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

(b) E(R) = c :Ã √T (λ1 − 1

ab0Ω−

12 c)

λF −√T [(B +Ψβ)

0Ω−12M

Ω−12 ιnΩ−

12 (B +Ψβ)]

−1(B +Ψβ)0Ω−

12M

Ω−12 ιnΩ−

12 c

!

→d

à 1√T1ab0

[(B +Ψβ)0Ω−

12M

Ω−12 ιnΩ−

12 (B +Ψβ)]

−1(B +Ψβ)0Ω−

12M

Ω−12 ιn

!Ω−

12ψR,

(18)

where Ψβ =vecinvk(ψβ), a = ι0nΩ− 12M

Ω−12 (B+Ψβ)

Ω−12 ιn, b =M

Ω−12 (B+Ψβ)

Ω−12 ιn.

Proof. see Appendix C.

Theorems 1 and 2 show that the dependence of the behavior of risk premia estimators on the valueof the β’s makes standard statistical inference that is based on standard Wald t statistics unreliablewhen the β’s are small. This shows the need for statistics that remain reliable when the β’s are small.

3 Tests of the risk premium

Theorems 1 and 2 show that the usual risk premia estimators converge to random variables for smallor zero values of the β’s when the sample size gets large. The convergence of Wald or t-statistics totest hypothezes on the risk premia is therefore also non-standard for small or zero values of the β’s. Toovercome this problem, we propose an alternative set of statistics to test hypothezes on the risk premia,like, for example, H0 : λF = λF,0. Since our interest is on the risk premia λF , we remove λ1 from themodel by dropping the return on the n-th asset and taking all other asset returns in deviation from thereturn on the n-th asset.2 The moment conditions then become

E(Rt) = BλFcov(Rt, Ft) = Bvar(Ft)

E(Ft) = µF ,(19)

where Rt = R1t − ιn−1Rnt and B = β1 − ιn−1βn, for Rt = (R01t R0nt)0, β = (β01 β

0n)0; R1t : (n− 1)× 1,

Rnt : 1×1, β1 : (n−1)×k, βn : 1× k. Under H0 : λF = λF,0, a least squares estimator for B is given by

B =PTt=1Rt(Ft + λF,0)

hPTj=1(Fj + λF,0)(Fj + λF,0)

0i−1

. (20)

Under H0 : λF = λF,0 and since B is a least squares estimator, we use Assumption 1 to determine thejoint behavior of the average returns and the least squares estimator B.

Lemma 2. Under H 0 : λF = λF,0 and when Assumption 1 holds,

√T

R− BλF,0vec(B − B)F − µF

→d

ψRψBψF

,

where ψR, ψB and ψF are independent (n− 1)× 1, (n− 1)k× 1 and k× 1 normal distributed randomvectors with mean zero and covariance matrices (1 − λ0F,0Q(λF )FFλF,0)⊗ Σ, Q(λF )FF ⊗ Σ and VFF ,

2The results are invariant with respect to the asset return that is dropped and with respect to which all assets returnsare taken in deviation from.

8

Page 9: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

with Q(λF )FF =hlimT→∞E

³1T

PTt=1(Ft + λF )(Ft + λF )

0´i−1

and Σ = Ω11 − ιn−1ω1n − ωn1ι0n−1 +

ιn−1ωnnι0n−1 for Ω =³Ω11ωn1

ω1nωnn

´, Ω11 : (n− 1)× (n− 1), ω1n = ω0n1 : 1× (n− 1) and ωnn : 1× 1.

Proof. see Appendix D.

The Wald statistic that is based on the FM risk premia estimator to test H0 : λF = λF,0, reads

FM-W(λF,0) = (λF − λF,0)0var(λF )−1(λF − λF,0). (21)

For size-able full rank values of β and under H0 : λF = λF,0, FM-W(λF,0) converges in distribution to aχ2(k) distributed random variable when the sample size gets large. Theorem 1 shows that λF convergesto a non-normal distributed random variable that is centered around zero for weak and zero values of theβ’s. The Wald statistic WFM(λF,0) does therefore not converge to a χ2(m) distributed random variablefor such values of the β’s. It implies that the Wald statistic is unreliable for conducting inference inthese cases. To indicate a solution to this problem, we specify the Wald statistic that tests H0 as

FM-W(λF,0) = (R− ιN λ1 − βλF,0)0βΘ−1β

0(R− ιN λ1 − βλF,0), (22)

with Θ = β0Ωβ− βΩιn(ι

0nΩιn)

−1ι0nΩβ and where Ω is a least squares covariance matrix estimator of Ω,Ω = 1

T−k−1PT

t=1(Rt− βFt)(Rt− βFt)0. Lemma 1 shows that R− ιNλ1−βλF,0 and β are independentlydistributed in large samples. Because of this dependence, R−ιN λ1−βλF,0, is, however, not independentof β in large samples which explains the convergence issues with the Wald statistic when the samplesize gets large and the β’s are relatively small.

Lemma 2 shows that R−BλF,0 and B are independently distributed when the sample size gets large.Statistics that are based on R − BλF,0 and/or B therefore possess better convergence properties thanthe Wald-statistic. We propose a few of these statistics which are based on the GMM and instrumentalvariable statistics that are proposed in Kleibergen (2002, 2004a) and Moreira (2003). Since the factormodel is not accomodated by the GMM framework proposed in Kleibergen (2004a), the statistics donot result automatically from the GMM and instrumental variable statistics.

Fama-MacBeth LM statistic The independence of R− BλF,0 and B in large samples implies thatwe can base a statistic with convenient properties on their product.

Theorem 3. Under H 0 : λF = λF,0 and when Assumption 1 holds, the statistic,

FM-LM(λF,0) = T1−λ0F,0Q(λF )−1FFλF,0

(R− BλF,0)0B(B0ΣB)−1B0(R− BλF,0), (23)

where Q(λF ) = 1T

PTt=1(Ft+λF )(Ft+λF )

0 and Σ = 1T−k

PTt=1(Rt− B(Ft+λF,0))(Rt− B(Ft+λF,0))

0,converges to a χ2(m) distributed random variable when the sample size gets large for all possible valuesof the β’s.

Proof. results from the independence of R− BλF,0 and B shown in Lemma 2 and that Q(λF ) →p

Q(λF ) and Σ→pΣ.

The FM-LM statistic is based on the FM-Wald statistic in (22) which explains why we refer to itas a Fama-MacBeth LM statistic. Compared with the FM-Wald statistic (22), the primary differencesare the estimators of the β’s, B instead of β, and that λ1 is no longer present since it has been removedby taking the returns in deviation from the returns on the n-th asset. Identical to the FM risk premiaestimator, FM-LM(λF,0) is not invariant to transformations of the asset returns so it differs in thisrespect from the standard LM statistics which are invariant to transformations.

9

Page 10: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

GLS-LM statistic The GLS risk premia estimator (14) is invariant to transformations of the assetreturns so we obtain an invariant LM statistic by incorporating the inverse of the covariance matrix in(23).

Theorem 4. Under H 0 : λF = λF,0 and when Assumption 1 holds, the statistic,

GLS-LM(λF,0) = T1−λ0F,0Q(λF )−1FFλF,0

(R− BλF,0)0Σ−1B(B0Σ−1B)−1B0Σ−1(R− BλF,0), (24)

converges to a χ2(m) distributed random variable when the sample size gets large for all possible valuesof the β’s.

Proof. results from the independence of R− BλF,0 and B shown in Lemma 2.Unlike the FM-LM statistic (23), the GLS-LM statistic (24) is invariant to transformations of the

asset returns. The GLS-LM statistic (24) does, however, depend on the inverse of the covariance matrixΣ which can be of a large dimension and therefore difficult to obtain. For example, this is a 99×99matrix in Jagannathan and Wang (1996).

Factor AR statistic When the disturbances in the linear factor model (2) are normally distributedwith mean zero and covariance matrix Ω, we can construct the likelihood function of λ1, λF , β and Ω.In an identical manner, we can construct the likelihood function for a linear factor model in which theexpected asset returns are not restricted to be equal to ιnλ1 + βλF . After concentrating with respectto λ1 and β, the difference between the logarithms of the likelihoods of the unrestricted linear factormodel and the restricted factor model under H0 : λF = λF,0 is proportional to:

FAR(λF,0) = T1−λ0F,0Q(λF )−1FFλF,0

(R− BλF,0)0Σ−1(R− BλF,0). (25)

We refer to this statistic as the factor AR statistic (FAR) since it is similar to the Anderson-Rubinstatistic, see Anderson and Rubin (1949), in the instrumental variables regression model. The FARstatistic is proportional to the square of the Hansen-Jagannathan distance when evaluated at λF,0, seeHansen and Jagannathan (1997).

Theorem 5. Under H 0 : λF = λF,0 and when Assumption 1 holds, the Factor AR statistic (25)converges to a χ2(n − 1) distributed random variable when the sample size gets large for all possiblevalues of the β’s.

Proof. results from the large sample behavior of R− BλF,0 documented in Lemma 2.Because of the relationship between the FAR statistic (25) and the likelihood function, minimizing

the FAR statistic over λF,0 is identical to maximizing the likelihood. Hence, the maximum likelihoodestimator for λF,0 is obtained by minimizing FAR(λF,0) over λF,0.

Misspecification factor pricing tests The GLS-LM statistic (24) is a quadratic form of the deriva-tive of FAR(λF,0) since3

∂∂λ0F,0

FAR(λF,0) = c(R− BλF,0)0Σ−1B, (26)

with c = −2 T1−λ0F,0Q(λF )−1FFλF,0

h1 + 1

T−k−1FAR(λF,0)i. The GLS-LM statistic is therefore equal to zero

at the maximum likelihood estimator of λF,0 which also minimizes FAR(λF,0). Besides the maximum

3see Appendix E for a proof.

10

Page 11: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

likelihood estimator, the GLS-LM statistic is also equal to zero at other values of λF,0 which set thederivative of the likelihood to zero, like, for example, local maxima and inflexion points. This hampersinference using the GLS-LM statistic.

To overcome the difficulty with the GLS-LM statistic, it is convenient to conduct a pre-test thatverifies the validity of the moment conditions (19) at λF = λF,0. A statistic that is well suited for thispurpose is the factor pricing statistic

JGLS(λF,0) = FAR(λF,0)−GLS-LM(λF,0). (27)

Theorem 6. Under H 0 : λF = λF,0 and when Assumption 1 holds, the factor pricing JGLS(λF,0)statistic (27) converges to a χ2(n− k− 1) distributed random variable that is independently distributedof the χ2(k) distributed random variable where GLS-LM (λF,0) (24) converges to when the sample sizegets large for all possible values of the β’s.

Proof. results from the large sample behavior of R− BλF,0 documented in Lemma 2. The indepen-dence from GLS-LM(λF,0) results since JGLS(λF,0) projects Σ−

12 (R− BλF,0) on the space orthogonal

to Σ−12 B where GLS-LM(λF,0) projects it onto.

The factor pricing statistic tests the moment conditions (19) for a pre-specified value of the riskpremia λF . The FM-LM statistic can be used as well for this purpose

JFM(λF,0) = FAR(λF,0)− FM-LM(λF,0). (28)

Theorem 7. Under H 0 : λF = λF,0 and when Assumption 1 holds, the factor pricing JFM (λF,0)statistic (28) converges to a χ2(n− k− 1) distributed random variable that is independently distributedof the χ2(k) distributed random variable where FM-LM (λF,0) (23) converges to when the sample sizegets large for all possible values of the β’s.

Proof. results from the large sample behavior of R − BλF,0 documented in Lemma 2. The inde-pendence from FM-LM(λF,0) results since JFM(λF,0) projects Σ−

12 (R− BλF,0) on the space orthogonal

to Σ12 B where FM-LM(λF,0) projects it onto.The factor pricing statistics JGLS(λF,0) and JFM(λF,0) resolve the inferential problems with the

GLS-LM and FM-LM statistics when we use them as pre-tests before we apply the GLS-LM andFM-LM statistics. A test which tests the adequacy of the moment conditions and H0 : λF = λF,0with (1− α)× 100% significance then results by applying the JGLS(λF,0) or JFM(λF,0) statistics with(1−αJ)× 100% significance and in case we do not reject, we apply the GLS-LM(λF,0) or FM-LM(λF,0)statistics with (1−αLM)×100% significance, where (1−α) = (1−αJ)(1−αLM) so α = αJ+αLM+αJαLM ≈αJ + αLM, see Kleibergen (2004a,b).

Conditional likelihood ratio statistic The FAR statistic equals the difference between the loga-rithms of the likelihood of the factor model with an unrestricted constant and the likelihood evaluatedat H0 : λF = λF,0. The likelihood ratio statistic to test H0 : λF = λF,0 against H1 : λF 6= λF,0 thereforeequals:

LR(λF,0) = FAR(λF,0)−minλF FAR(λF ). (29)

Under H0, R− BλF,0 and B are independently distributed in large samples and we can specify minλFFAR(λF ) as a function of R−BλF,0 and B. When k = 1, the specification of the likelihood ratio statisticcan then be specified as4

CLR(λF,0) = 12

hFAR(λF,0)− r(λF,0) +

p(FAR(λF,0) + r(λF,0))2 − 4r(λF,0)JFAR(λF,0)

i, (30)

4see Appendix F for a proof.

11

Page 12: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

with r(λF,0) = Q(λF )−1FF B0Σ−1B, which corresponds with the conditional likelihood (CLR) ratio statisticof Moreira (2003) in the linear instrumental variables regresssion model. The CLR statistic (30) has aconditional distribution that depends on r(λF,0) in large samples, see Moreira (2003).

Theorem 8. Under H 0 : λF = λF,0, k = 1 and when Assumption 1 holds, the conditional likelihoodratio statistic CLR(λF,0) statistic (30) given r(λF,0) converges to the random variable:

12

£ϕk + ϕn−k−1 − r(λF,0) +

p(ϕk + ϕn−k−1 + r(λF,0))2 − 4r(λF,0)ϕn−k−1

¤, (31)

where ϕk and ϕn−k−1 are independent χ2(k) and χ2(n− k− 1) distributed random variables, when thesample size gets large.

Proof. results from the large sample behavior of R − BλF,0 documented in Lemma 2 and thatFAR(λF,0)→

dϕk + ϕn−k−1, JFAR(λF,0)→

dϕn−k−1.

The expression of the large sample distribution of the CLR statistic in Theorem 8 only applies whenk = 1. It provides a tight upperbound on the large sample distribution of the CLR statistic when r(λF,0)is a statistic that tests for the rank of B using B when k exceeds one, see Kleibergen (2004b).

Kan and Zhang Misspecification The factor pricing JGLS and JFM statistics, which test thehypothesis of factor pricing at H0 : λF = λF,0, and the FAR statistic, which tests the joint hypothesis offactor pricing and H0, have power against the misspecification analyzed by Kan and Zhang (1999). Thecombinations of the JGLS and GLS-LM statistics and JFM and FM-LM statistics, that were proposedto overcome the spurious power declines of the GLS-LM and FM-LM statistics, therefore also providepower against misspecification of the asset returns.

4 Power and Size Comparison

We analyze the size and power of the different statistics in a simulation experiment that is based onJagannathan and Wang (1996). We therefore generate assets returns from the one factor asset pricingmodel

Rt = ιnλ1 + β(Ft + λF ) + εt, t = 1, . . . , T, (32)

where Rt is a n×1 vector of asset returns, β is the n×1 vector of β’s, λ1 is the zero-β return, λF is thescalar risk premium and the disturbances εt are independently normal distributed with mean zero andcovariance matrix Ω. We use parameter settings that are obtained from the data used by Jagannathanand Wang (1996) so n = 100, T = 330, Ft is the demeaned return on the value weighted portfolio, λ1equals the FM estimate λ1 and Ω equals the least squares covariance matrix estimator.

We generate asset returns from the linear factor model (32) for three different values of β, 1. β =βVW , 2. β = 0.25βVW and 3. β = 0.1βVW , with βVW the least squares estimator for the β’s of the assetreturns with respect to the value weighted portfolio. We use a range of values of λF and the differentstatistics that we discussed previously to test H0 : λF = 3 with 95% significance.

The power curves in Panel 2 reveal the issues involved with the different statistics. The lefthandsideof Panel 2, which consists of Figures 2.1, 2.3 and 2.5, displays the power curves of the FM and GLSbased Wald and LM statistics. The righthandside of Panel 2, which consists of Figures 2.2, 2.4 and 2.6,displays the power curves of the CLR, FAR and combinations of the JGLS and GLS-LM statistics andthe JFM and FM-LM statistics. These combinations are such that a 99% critical value is applied toJGLS and JFM and a 96% critical value to GLS-LM and FM-LM so the overall size of the tests equals5%.

12

Page 13: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

The power curves of the FM and GLS Wald statistics in Figures 2.1-2.5 show that the rejectionfrequency of these statistics does not equal the size of 5% at λF = 3. Hence, these statistics are sizedistorted. Figures 2.1-2.5 indicate that the size distortion increases when β decreases. Figures 2.1-2.5reveal that the FM and GLS-LM statistics are not size distorted. Figures 2.1-2.5 demonstrate a spuriousdecline of power of the FM and GLS-LM statistics at values of λF which are considerable different fromthe hypothesized value of three. These power declines results as the FM and GLS LM statistics areproportional to the derivatives of resp. (R − βλF )

0(R − βλF ) and the FAR statistic with respect toλF . Hence, the FM and GLS-LM statistics are equal to zero at minima, maxima and inflexion pointsof these statistics which explains the spurious power declines of the FM and GLS-LM statistics.

It is remarkable that the FM and GLS Wald statistics are already size distorted when β equals βVWin Figure 2.1. This especially holds for the GLS Wald statistic whose size distortion is enormous inFigure 2.1. The size distortion of the GLS Wald statistic results to a large extent from the inversion ofthe 100×100 covariance matrix Ω. Appendix G explains why this size distortion does not occur for theFM-GLS and the other statistics which contain the inverse of the 99×99 matrix Σ.

Figures 2.2, 2.4 and 2.6 only contain power curves of statistics whose large sample distributions areunaffected by small or zero values of the β’s. These Figures therefore reveal no size distortion of any ofthe depicted statistics. Figures 2.2-2.6 show that the power of the FAR statistic is in general below thepower of the other (combinations of) statistics but the FAR statistic has similar power than the otherstatistics when the β’s are very small. The smaller power of the FAR statistic in case of reasonablylarge β’s results from the larger degree of freedom parameter of its large sample distribution comparedto the other statistics.

Figures 2.2-2.6 show that the combinations of the JGLS and GLS-LM statistics and JFM andFM-LM statistics overcome the spurious power decline of the GLS-LM and FM-LM statistics. Thesecombined statistics therefore perform appropriately and also provide power against misspecification.The combination of the JGLS and GLS-LM statistics and the CLR statistics have comparable powercurves and outperform the combination of the JFM and FM-LM statistics. We note that the CLRstatistic has no power against misspecification.

The power curves in Panel 2 clearly show the appeal of the statistics proposed in the previoussection. Compared to the FM-Wald statistic, the optimal configuration of the statistics, which are thecombined JGLS and GLS-LM statistics and the CLR statistic, do not indicate any sacrifice of power toimprove the size. For example, in Figure 2.1, these statistics already dominate the FM-Wald statisticin terms of power. This continues to hold in the other Figures where the FM-Wald statistics becomeseventually enormlously size distorted for very small values of the β’s.

5 Tests on some risk premia

The statistics that we proposed sofar test hypothezes that are specified on all risk premia, H0 : λF = λF,0.Instead of testing hypothezes specified on all parameters, we often want to test hypothezes that are

specified on sub-sets of the parameters, for example, H∗0 : µF = µF,0, where λF = (ν0F... µ0F )

0, withνF : kν × 1; µF,0, µF : kµ × 1 and k = kν + kµ. The large sample distributions of the statistics that weproposed previously extend to this setting when the β’s that are associated with νF are size-able andconstitute a full rank matrix.

Assumption 2. The matrix of β’s,

β = (βν... βµ), (33)

with βν : n× kν and βµ : n× kµ, is such that βν is a fixed full rank matrix.

13

Page 14: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

Panel 2: Power curves of statistics that test H0 : λF = 3 with 95% significanceβ = βVW : Figure 2.1 Figure 2.2

1 1.5 2 2.5 3 3.5 4 4.5 50

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

λf

Rej

ectio

n F

requ

ency

1 1.5 2 2.5 3 3.5 4 4.5 50

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

λf

Rej

ectio

n F

requ

ency

β = 0.25βVW : Figure 2.3 Figure 2.4

-5 0 5 10 150

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Rej

ectio

n F

requ

ency

λf

-5 0 5 10 150

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

λf

Rej

ectio

n F

requ

ency

β = 0.1βVW : Figure 2.5 Figure 2.6

-15 -10 -5 0 5 10 15 20 25 30 35 400

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Rej

ectio

n F

requ

ency

λf

-15 -10 -5 0 5 10 15 20 25 30 35 400

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

λf

Rej

ectio

n F

requ

ency

Figures 2.1, 2.3, 2.5: Power curves of FM-W Figures 2.2, 2.4, 2.6: Power curves of CLR (solid),(solid), GLS-W (dashed), FM-LM (dashed-dotted) combination of JGLS and GLS-LM (dashed), com-and GLS-LM (solid with plusses) statistics. bination of JFM and FM-LM (dashed-dotted) and

FAR (solid with plusses) statistics.

14

Page 15: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

Under Assumptions 1 and 2, the large sample distributions of the robust statistics extend towards asetting where we use these statistics to test the hypothesis H∗0 : µF = µF,0 that is specified on a sub-setof the parameters. The statistics to test H∗0 result when we use

λF,0 =¡νF (µF,0)

µF,0

¢, (34)

in the expressions of the robust statistics. In (34), νF (µF,0) is a consistent estimator for νF . Thisestimator can, for example, be obtained from the first order condition for νF that results from (26), so

(R− B)0Σ−1Bν = 0, (35)

with B =(Bν... Bµ) =

PTt=1Rt(Ft +

¡νF (µF,0)µF,0

¢)hPT

j=1(Fj +¡νF (µF,0)µF,0

¢)(Fj +

¡νF (µF,0)µF,0

¢)0i−1

, Bν : (n −1)×kν , Bµ : (n−1)×kµ. The estimator νF (µF,0) that results from (35) seems complicated to constructbut, since it corresponds with the maximum likelihood estimator, results in a straightforward mannerfrom an eigenvalue problem.

Theorem 9. Under H ∗0 : µF = µF,0 and when Assumptions 1 and 2 hold, the large sample distribu-tion(s) of

1. FM-LM( (νF (µF,0)0 ... µ0F,0)

0) and GLS-LM( (νF (µF,0)0... µ0F,0)

0) are χ2(kµ) distributions.

2. FAR( (νF (µF,0)0 ... µ0F,0)

0) is a χ2(n− kν − 1) distribution.

3. JFM( (νF (µF,0)0 ... µ0F,0)

0) and JGLS( (νF (µF,0)0... µ0F,0)

0) are χ2(n − k − 1) distributions whichare independent from the large sample distributions of FM-LM( (νF (µF,0)

0 ... µ0F,0)0) and GLS-

LM( (νF (µF,0)0 ... µ0F,0)

0).

4. CLR( (νF (µF,0)0 ... µ0F,0)

0) given r( (νF (µF,0)0... µ0F,0)

0) results from 12

·ϕkµ + ϕn−k−1 − r((νF (µF,0)0

... µ0F,0)0)+r

(ϕkµ + ϕn−k−1 + r((νF (µF,0)0... µ0F,0)0))2 − 4r((νF (µF,0)0

... µ0F,0)0)ϕn−k−1

#, where ϕkµ and ϕn−k−1

are independent χ2(kµ) and χ2(n− k − 1) distributed random variables.

for all possible values of Bµ.

Proof. results since νF (µF,0) is a consistent estimator of νF under H∗0 and Assumptions 1 and 2.

6 Confidence sets of the risk premia from Jagannathan and Wang(1996)

When we specify a range of values of λF,0, or µF,0, we can use the statistics to construct a confidenceset for λF , or µF . The (1 − α) × 100% asymptotic confidence set contains all values of λF,0, or µF,0,for which the value of the statistic that is used to test H0 : λF = λF,0, or H∗0 : µF = µF,0, is below its(1− α)× 100% asymptotic critical value.

15

Page 16: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

We construct the asymptotic confidence sets for the risk premia for the asset returns from Jagan-nathan and Wang (1996). Jagannathan and Wang (1996) use the factor model,

Rt = ιnλ1 + sizeλs + β(Ft + λF ) + εt, t = 1, . . . , T, (36)

to describe the return on hundred size and β sorted portfolios that are collected into the vector ofasset returns Rt, so n = 100, for three hundred and thirty daily observations, so T = 330. The n × 1vector size reflects the relative size of the different portfolios and is constant over time. The k × 1vector of (demeaned) factors Ft can contain up to three different factors: the return on a value weightedportfolio, the yield premium between low and high grade corporate bonds and the growth of per capitalabor income. Since our interest is on the risk premia λF , we remove λ1 and λs from the factor model(36) in an analogous manner as by taking the returns in deviation from the n-th asset return as weproposed previously to remove λ1.

Panel 3: One minus p-value plots for λvw.

-1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 10

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

λvw

One

min

us p

-val

ue

-1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 10

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

One

min

us p

-val

ue

λvw

Figure 3.1: One minus p-value plot for FM-W Figure 3.2: One minus p-value plot for CLR (solid),(solid), GLS-W (dashed), FM-LM (dashed-dotted) JGLS (solid with plusses), FAR (dashed-dotted)and GLS-LM (solid with plusses) statistics. and GLS-LM (dashed) statistics.

Figures 3.1 and 3.2 in Panel 3 show the p-value plots of tests of the hypothesis H0 : λvw = λvw,0,where λvw is the risk premium on the value weighted portfolio, for a range of values of λvw,0 in a factormodel where the return on the value weighted portfolio is the only factor. Both Figures contain a lineat 95% which enables us to construct the 95% confidence sets in a convenient manner by using theintersection of the 95% line and the p-value plots. Panel 3 shows that the GLS-LM and CLR statisticslead to considerably larger 95% confidence sets compared to the FM-W and GLS-W statistics. Panel3 also shows that λvw is well-identified since the confidence sets are relatively small so the differencebetween the different confidence sets is not too large.

Figure 3.1 shows that the 95% confidence set that results from the FM-LM statistic differs consid-erably from the other confidence sets. Figure 2.1 shows that the FM-LM statistic has considerably lesspower when the risk premium is well-identified. This explains the different 95% confidence set thatresults from the FM-LM statistic.

The one minus p-value plots of the FAR and JGLS statistics in Figure 3.2 are rather low whichresults from the large degrees of freedom parameters of their χ2 large sample distributions, 98 and 97.

16

Page 17: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

Panel 4: One minus p-value plots for λprem and λlabor.

λprem : Figure 4.1 Figure 4.2

-0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.80

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

λprem

One

min

us p

-val

ue

-0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.80

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

One

min

us p

-val

ue

λprem

λlabor : Figure 4.3 Figure 4.4

0 0.5 1 1.50

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

λlabor

One

min

us p

-val

ue

0 0.5 1 1.50

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

λlabor

One

min

us p

-val

ue

Figure 4.1, 4.3: One minus p-value plot for FM-W Figure 4.2, 4.4: One minus p-value plot for CLR(solid), GLS-W (dashed), FM-LM (dashed-dotted) (solid), JGLS (solid with plusses), FAR (dashedand GLS-LM (solid with plusses) statistics. -dotted) and GLS-LM (dashed) statistics.

Panel 4 shows the one minus p-value plots of statistics that test H∗0 : λprem = λprem or H∗0 : λlabor =λlabor,0 in a factor model where the return on the value weighted portfolio and either the yield premiumor the labor income growth rate are the only two factors. Because the β’s of the return on the valueweighted portfolio are size-able, Assumption 2 holds and we can conduct tests on λprem or λlabor only.

Figures 4.1 and 4.2 show that the 95% confidence sets of λprem and λlabor that result from the FM-Wand GLS-W statistics are small while the 95% confidence that result from the FM-LM, GLS-LM andCLR statistics are very large and even sometimes unbounded. This indicates that the β’s associated withthe yield premium and labor growth rate are small so the FM-W and GLS-W statistics are unreliable.5

The GLS-LM and CLR statistics remain reliable when the β’s are small so Panel 4 shows that the

5Kleibergen and Paap (2003) test the rank of the β-matrix and also conclude that reduced rank values of the β-matrixcan not be rejected.

17

Page 18: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

results from the FM-W and GLS-W statistics are misleading. The GLS-LM and CLR statistics reflectthat the β’s are small and that inference on the risk premia is difficult.

The one minus p-value plots of the FAR and JGLS statistics in Figures 4.2 and 4.4 are rather lowwhich results from the large degrees of freedom parameter of the χ2 large sample distribution of thesestatistics, 97 and 96.

Panel 5: 95% confidence sets of (λprem, λlabor).

Figure 5.1: FM-W Figure 5.2: GLS-W

-0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.40

0.5

1

1.5

λlabor

λ pre

m

-0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.40

0.5

1

1.5

λ pre

m

λlabor

Figure 5.3: GLS-LM Figure 5.4: CLR

0 0.5 1 1.50

0.5

1

1.5

λlabor

λ pre

m

0 0.5 1 1.50

0.5

1

1.5

λ pre

m

λlabor

Since the β’s that are associated with the yield premium and labor income growth are small, As-sumption 2 does not hold for these β’s. We can thus not construct the 95% confidence sets of λprem andλlabor in a factor model that contains all three factors in a reliable manner. Since Assumption 2 holdsfor the β’s that are associated with the return on a value weighted portfolio, we can, however, constructthe (two dimensional) joint 95% confidence set of (λprem, λlabor). Panel 5 contains the 95% confidenceset of (λprem, λlabor) that results from the different statistics. Panel 5 shows that there are enormousdifferences between the different 95% confidence sets. The 95% confidence sets that result from theFM-W and GLS-W statistics in Figures 5.1 and 5.2 are small while the 95% confidence sets that resultfrom the GLS-LM and CLR statistics are large or even unbounded, which holds for the CLR statistic.This shows that the FM-W and GLS-W statistics indicate a misleading high degree of precision which

18

Page 19: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

is completely spurious.

Panel 6: One minus p-value plots for (λprem, λlabor).

Figure 6.1: FAR Figure 6.2 JGLS

0

0.5

1

1.5

0

0.5

1

1.5

0

0.2

0.4

0.6

0.8

1

λlaborλ

prem

One

min

us p

-val

ue

0

0.5

1

1.5

0

0.5

1

1.5

0

0.2

0.4

0.6

0.8

1

λlaborλ

prem

One

min

us p

-val

ue

Panel 6 contains the one minus p-value plots of the FAR and JGLS statistics for different valuesof (λprem, λlabor). The FAR statistic is both proportional to the logarithm of the likelihood and thesquare of the Hansen-Jagannathan statistic. Its p-value plot shows that the FAR statistic is small formost positive values of (λprem, λlabor). This explains the large 95% confidence sets that result from theGLS-LM and CLR statistics. The Figures in Panel 6 again confirm the misleading results that areobtained from applying the FM-W and GLS-W statistics.

7 Conclusions

We reveal the inadequacy of standard inferential procedures that are based on the FM and GLS riskpremia estimators when the β’s are small and/or the number of assets is large. We propose some newstatistics whose large sample distributions remain trustworthy. We apply these statistics to the sizeand β sorted portfolio return series from Jagannathan and Wang (1996). We obtain that the small95% confidence sets of the risk premia that result from the standard inferential procedures indicate anerroneous high level of precision since the 95% confidence sets of some of the risk premia that resultfrom the new statistics are very large. This shows that these risk premia are almost non-identified.

The linear factor models analyzed in this paper assume a constant covariance matrix of the assetreturns over time. We will relax this assumption in future work.

19

Page 20: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

Appendix

A. Proof of Lemma 1. Assumption 1 states that

1√T

PTt=1

µ1Ft

¶⊗ (Rt − ιnλ1 − β(Ft + λF ))

Ft − µF

→d

ϕRϕβϕF

,

with (ϕ0R ϕ0β ϕ0F )0 ∼ N(0, V ). We pre-multiply it with µ

1 0

−( 1TPT

j=1 FjF0j)−1F ( 1T

PTj=1 FjF

0j)−1

¶⊗ In 0

0 Ik

to obtain, since

PTt=1 Ft = 0, β =

PTt=1RtF

0t(PT

j=1 FjF0j)−1,PT

t=1[(ιnλ1 + β(Ft + λF ))F0t − (ιnλ1 + β(Ft + λF ))F

0](PT

j=1 FjFj)−1 =PT

t=1[βFt(Ft − F )0](PT

j=1 FjFj)−1 = β,

and

limT→∞µ

1 0

−( 1TPT

j=1 FjF0j)−1F ( 1T

PTj=1 FjFj)

−1

¶Q

µ1 0

−( 1TPT

j=1 FjF0j)−1F ( 1T

PTj=1 FjFj)

−1

¶0=µ

Q11 00 Q−1FF.1

¶with QFF.1 = limT→∞E( 1T

PTj=1 FjF

0j), that

√T

R− ιnλ1 − βλFvec(β − β)F − µF

→d

ψR

ψβ

ψF

,

where ψR, ψβ and ψF are independent n× 1, nk× 1 and k× 1 normal distributed random vectors withmean 0 and covariance matrices, Ω, Q−1FF.1 ⊗Ω and VFF .

B. Proof of Theorem 1. To analyze the behavior of the FM estimator, we rephraze Lemma 1 as R

vec(β)F

=a

ιnλ1 + βλkvec(β)µF

+ 1√T

ψR

ψβ

ψF

+ op(1√T),

where =aimplies equality in large samples and op(

1√T) indicates that all other elements are of a lower

order than 1√T.

1. β = 0. We use that since (ιn... 1√

TΨβ)

0(ιn... 1√

TΨβ) =

µι0nιn1√TΨ0βιn

1√Tι0nΨβ

1TΨ0βΨβ

¶,

·(ιn

... 1√TΨβ)

0(ιn... 1√

TΨβ)

¸−1=Ã

a−1 −a−1ι0nΨβ(Ψ0βΨβ)

−1

−√T (Ψ0βMinΨβ)−1Ψ0βιn

n T (Ψ0βMinΨβ)−1

!.

20

Page 21: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

with a = ι0nMΨβ ιn.1a. E(R) = ιnλ1 andµ

λ1λF

¶=

·(ιn

... β)0(ιn... β)

¸−1(ιn

... β)0R

=a

·(ιn

... 1√TΨβ)

0(ιn... 1√

TΨβ)

¸−1(ιn

... 1√TΨβ)

0(ιnλ1 + 1√TψR)

=a

Ãa−1 −a−1ι0nΨβ(Ψ

0βΨβ)

−1

−√T (Ψ0βMinΨβ)−1Ψ0βιn

n T (Ψ0βMinΨβ)−1

ι0nιnλ1 +1√Tι0nψR

1√TΨ0βιnλ1 +

1TΨ

0βψR)

!=a

Ãa−1(aλ1 + 1√

Tb0ψR)

(Ψ0βMinΨβ)−1Ψ0βMinψR

!=a

µλ10

¶+

Ã1√T1ab0

(Ψ0βMinΨβ)−1Ψ0βMin

!ψR

where Ψβ =vecinvk(ψβ) and b =MΨβιn, so, since Ψβ and ψR are independent,µ √

T (λ1 − λ1)

λF

¶→d

Ã1√T1ab0

(Ψ0βMinΨβ)−1Ψ0βMin

!ψR.

Since ψR is a normal distributed random vector that is independent of Ψβ, the moments of limitingdistribution of λF are determined by the moments of (Ψ0βMinΨβ)

−1Ψ0βMin . When Ω is an identitymatrix, Ψ0βMinΨβ is a Wishart distributed random matrix with n − 1 degrees of freedom and scalematrix QFF , see e.g. Muirhead (1982). Hence, (Ψ0βMinΨβ)

−1Ψ0βMin is the square root of the inverse ofthe Wishart distributed random matrix. To proof the order of finite moments of this matrix, we use analternative definition of a multi-variate t distributed random variable.

A multi-variate t distributed random k × 1 vector x with ν degrees, k× k dimensional scale matrixΣ−1 and k × 1 scale vector α can be defined by

x = α+W− 12 z,

where z ∼ N(0, I) and independent of the k × k Wishart distributed random matrix W with ν + k − 1degrees of freedom and scale matrix Σ. Since the moments of the multi-variate t exist up to order ν, themoments of the square root of the inverted-Wishart thus exist up to order ν + k− 1. This definition ofa multi-variate t distributed random matrix results from the definition of a matric-variate t distributedrandom matrix, see e.g. Zellner (1971) and Drèze and Richard (1983).

In our case W = Ψ0βMinΨβ which is a k × k Wishart distributed matrix with n − 1 degrees offreedom. The moments of (Ψ0βMinΨβ)

−1Ψ0βMin exist therefore up to order n− 1− k+1 = n− k which

holds as well for the moments of λF . The difference of the covariance matrix of Ω from the identitymatrix does not change the order of finite moments.

21

Page 22: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

1b. When β is equal to zero and E(R) = c:µλ1λF

¶=

·(ιn

... β)0(ιn... β)

¸−1(ιn

... β)0R

=a

·(ιn

... 1√TΨβ)

0(ιn... 1√

TΨβ)

¸−1(ιn

... 1√TΨβ)

0(c+ 1√TψR)

=

Ãa−1 −a−1ι0nΨβ(Ψ

0βΨβ)

−1

−√T (Ψ0βMinΨβ)−1Ψ0βιn

n T (Ψ0βMinΨβ)−1

ι0nc+1√Tι0nψR

1√TΨ0βc+

1TΨ

0βψR)

!=

Ãa−1b0c+ 1√

Ta−1b0ψR√

T (Ψ0βMinΨβ)−1Ψβ)

−1Ψ0βMιnc+ (Ψ0βMinΨβ)

−1Ψβ)−1Ψ0βMιnψR

!.

=

µ 1ab0√

T (Ψ0βMinΨβ)−1Ψβ)

−1Ψ0βMιnc

¶c+

Ã1√T1ab0

(Ψ0βMinΨβ)−1Ψβ)

−1Ψ0βMιn

!ψR

and à √T (λ1 − 1

ab0c)

λF −√T (Ψ0βMinΨβ)

−1Ψβ)−1Ψ0βMιnc

!→d

Ã1√T1ab0

(Ψ0βMinΨβ)−1Ψβ)

−1Ψ0βMιn

!ψR

so λF diverges when T gets large.2a. β = 1√

TB, with B a n× k matrix of full rank and E(R) = ιnλ1 +

1√TB so

µλ1λF

¶=a

·(ιn

... β)0(ιn... β)

¸−1(ιn

... β)0R

=a

·[ιn... 1√

T(B +Ψβ)]

0[ιn... 1√

T(B +Ψβ)]

¸−1(ιn

... 1√T(B +Ψβ))

0(ιnλ1 + 1√T(BλF + ψR))

=a

Ãa−1 −a−1√Tι0n(B +Ψβ)((B +Ψβ)

0(B +Ψβ))−1

−√T [(B +Ψβ)0Mιn(B +Ψβ)]

−1 (B+Ψβ)0ιn

n [(B +Ψβ)0Mιn(B +Ψβ)]

−1

ι0nιnλ1 + 1√Tι0n(BλF + ψR)

1√T(B +Ψβ)

0ιnλ1 + 1T (B +Ψβ)

0(BλF + ψR))

!=a

Ãa−1(aλ1 + 1√

Ta−1b0(BλF + ψR))

((B +Ψβ)0Mιn(B +Ψβ))

−1(B +Ψβ)0Mιn(BλF + ψR))

!=a

µλ10

¶+

Ã1√T1ab0

((B +Ψβ)0Mιn(B +Ψβ))

−1(B +Ψβ)0Mιn

!(BλF + ψR)

with a = ι0nM(B+Ψβ)ιn, b =M(B+Ψβ)ιn, so,µ √T (λ1 − λ1)

λF − λF

¶→d

Ã1√T1ab0

[(B +Ψβ)0(B +Ψβ)]

−1(B +Ψβ)0(In − 1

aιnb0)

!(ψR.β −ΨβλF ).

In this case (B + Ψβ)0(B + Ψβ) has a non-central Wishart distribution when Ω = In which has no

implications for the order of the finite moments compared to the case when B = 0. Hence, the momentsof λF exist up to order n− k.

22

Page 23: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

2b. β = 1√TB, with B a n× k matrix of full rank and E(R) = c:µ

λ1λF

¶=

·(ιn

... β)0(ιn... β)

¸−1(ιn

... β)0R

=a

·[ιn... 1√

T(B +Ψβ)]

0[ιn... 1√

T(B +Ψβ)]

¸−1[ιn... 1√

T(B +Ψβ)]

0(c+ 1√TψR)

=a

Ãa−1 −a−1√Tι0n(B +Ψβ)((B +Ψβ)

0(B +Ψβ))−1

−√T [(B +Ψβ)0Mιn(B +Ψβ)]

−1 (B+Ψβ)0ιn

n [(B +Ψβ)0Mιn(B +Ψβ)]

−1

ι0nc+ 1√Tι0nψR

1√T(B +Ψβ)

0c+ 1T (B +Ψβ)

0ψR)

!=a

Ãa−1b0c+ 1√

Ta−1b0ψR

[(B +Ψβ)0Mιn(B +Ψβ)]

−1(B +Ψβ)0Mιn [

√Tc+ ψR]

!.

=a

µ 1ab0√

T [(B +Ψβ)0Mιn(B +Ψβ)]

−1(B +Ψβ)0Mιn

¶c+Ã

1√T1ab0

[(B +Ψβ)0Mιn(B +Ψβ)]

−1(B +Ψβ)0Mιn

!ψR

with a = ι0nM(B+Ψβ)ιn, b =M(B+Ψβ)ιn, andµ √T (λ1 − λ1)

λF −√T [(B +Ψβ)

0(B +Ψβ)]−1(B +Ψβ)

0(In − 1aιnb

0)c

¶→d

Ã1√T1ab0

[(B +Ψβ)0(B +Ψβ)]

−1(B +Ψβ)0(In − 1

aιnb0)

!ψR.

C. Proof of Theorem 2.

1. β = 0. We use that since (ιn... 1√

TΨβ)

0Ω−1(ιn... 1√

TΨβ) =

µι0nΩ−1ιn1√TΨ0βΩ−1ιn

1√Tι0nΩ−1Ψβ

1TΨ0βΩ−1Ψβ

¶,

·(ιn

... 1√TΨβ)

0Ω−1(ιn... 1√

TΨβ)

¸−1=Ã

a−1 −√Ta−1ι0nΩ−1Ψβ(Ψ0βΩ

−1Ψβ)−1

−√T (Ψ0βΩ−12M

Ω−12 ιnΩ−

12Ψβ)

−1Ψ0βΩ−1ιnι0nΩ−1ιn

T (Ψ0βΩ− 12M

Ω−12 ιnΩ−

12Ψβ)

−1

!.

with a = ι0nΩ− 12M

Ω−12Ψβ

Ω−12 ιn.

1a. E(R) = ιnλ1 and since Ω→pΩ,

µλ1λF

¶=

·(ιn

... β)0Ω−1(ιn... β)

¸−1(ιn

... β)0ΩR

=a

·(ιn

... 1√TΨβ)

0Ω−1(ιn... 1√

TΨβ)

¸−1(ιn

... 1√TΨβ)

0Ω−1(ιnλ1 + 1√TψR)

=a

Ãa−1 −√Ta−1ι0nΩ−1Ψβ(Ψ

0βΩ

−1Ψβ)−1

−√T (Ψ0βΩ−12M

Ω−12 ιnΩ−

12Ψβ)

−1Ψ0βΩ−1ιnι0nΩ−1ιn

T (Ψ0βΩ− 12M

Ω−12 ιnΩ−

12Ψβ)

−1

ι0nΩ−1ιnλ1 +1√Tι0nΩ−1ψR

1√TΨ0βΩ

−1ιnλ1 + 1TΨ

0βΩ

−1ψR)

!

23

Page 24: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

=a

Ãa−1(aλ1 + 1√

Tb0Ω−

12ψR)

(Ψ0βΩ− 12M

Ω−12 ιnΩ−

12Ψβ)

−1Ψ0βΩ− 12M

Ω−12 ιnΩ−

12ψR

!

=a

µλ10

¶+

à 1√T1ab0

(Ψ0βΩ−12M

Ω−12 ιnΩ−

12Ψβ)

−1Ψ0βΩ− 12M

Ω−12 ιn

!Ω−

12ψR

where Ψβ =vecinvk(ψβ) and b =MΩ−

12Ψβ

Ω−12 ιn, so

µ √T (λ1 − λ1)

λF

¶→d

à 1√T1ab0

(Ψ0βΩ− 12M

Ω−12 ιnΩ−

12Ψβ)

−1Ψ0βΩ− 12M

Ω−12 ιn

!Ω−

12ψR.

Since ψR is a normal distributed random vector that is independent of Ψβ. The random matrix(Ψ0βΩ

− 12M

Ω−12 ιnΩ−

12Ψβ)

−1Ψ0βΩ− 12M

Ω−12 ιn

is the square root of the inverse of the Wishart distributed

random matrix (Ω−12Ψβ)

0MΩ−

12 ιnΩ−

12Ψβ which has a Wishart distributed matrix with n− 1 degrees of

freedom and scale matrix QFF . Hence, it results from the proof of Theorem 1 that the limiting distri-bution of λF is a multi-variate t distribution with n− k degrees of freedom, location 0 and scale matrixQ11.FQ

−1FF with Q11.F = Q11 −Q1FQ

−1FFQF1.

1b. When β is equal to zero and E(R) = c:µλ1λF

¶=

·(ιn

... β)0Ω−1(ιn... β)

¸−1(ιn

... β)0ΩR

=a

·(ιn

... 1√TΨβ)

0Ω−1(ιn... 1√

TΨβ)

¸−1(ιn

... 1√TΨβ)

0Ω−1(ιnλ1 + 1√TψR)

=a

Ãa−1 −√Ta−1ι0nΩ−1Ψβ(Ψ

0βΩ

−1Ψβ)−1

−√T (Ψ0βΩ−12M

Ω−12 ιnΩ−

12Ψβ)

−1Ψ0βΩ−1ιnι0nΩ−1ιn

T (Ψ0βΩ− 12M

Ω−12 ιnΩ−

12Ψβ)

−1

ι0nΩ−1c+1√Tι0nΩ−1ψR

1√TΨ0βΩ

−1c+ 1TΨ

0βΩ

−1ψR)

!

=a

a−1ι0nΩ−12M

Ω−12ΨβΩ−

12 c+ 1√

Tb0Ω−

12ψR)

(Ψ0βΩ− 12M

Ω−12 ιnΩ−

12Ψβ)

−1Ψ0βΩ− 12M

Ω−12 ιnΩ−

12 (√Tc+ ψR)

=a

Ãa−1b0Ω−

12 c√

T (Ψ0βΩ−12M

Ω−12 ιnΩ−

12Ψβ)

−1Ψ0βΩ− 12M

Ω−12 ιnΩ−

12 c

!+Ã 1√

T1ab0

(Ψ0βΩ− 12M

Ω−12 ιnΩ−

12Ψβ)

−1Ψ0βΩ− 12M

Ω−12 ιn

!Ω−

12ψR

and à √T (λ1 − a−1b0Ω−

12 c)

λF −√T (Ψ0βΩ

−12M

Ω−12 ιnΩ−

12Ψβ)

−1Ψ0βΩ− 12M

Ω−12 ιnΩ−

12 c

!→dÃ

1√T1ab0

(Ψ0βΩ− 12M

Ω−12 ιnΩ−

12Ψβ)

−1Ψ0βΩ−12M

Ω−12 ιnΩ−

12

!ψR

and λF diverges when T gets large.

24

Page 25: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

2a. β = 1√TB, with B a n× k matrix of full rank and E(R) = ιnλ1 +

1√TB soµ

λ1λF

¶=

·(ιn

... β)0Ω−1(ιn... β)

¸−1(ιn

... β)0Ω−1R

=a

·[ιn... 1√

T(B +Ψβ)]

0Ω−1[ιn... 1√

T(B +Ψβ)]

¸−1(ιn

... 1√T(B +Ψβ))

0Ω−1(ιnλ1 + 1√T(BλF + ψR))

=a

Ãa−1

−√T ((B +Ψβ)0Ω−

12M

Ω−12 ιnΩ−

12 (B +Ψβ))

−1 (B+Ψβ)0Ω−1ιn

ι0nΩ−1ιn−√Ta−1ι0nΩ−1(B +Ψβ)((B +Ψβ)

0Ω−1(B +Ψβ))−1

T ((B +Ψβ)0Ω−

12M

Ω−12 ιnΩ−

12 (B +Ψβ))

−1

ι0nΩ−1ιnλ1 +1√Tι0nΩ−1(BλF + ψR)

1√T(B +Ψβ)

0Ω−1ιnλ1 + 1T (B +Ψβ)

0Ω−1(BλF + ψR))

!

=a

Ãa−1(aλ1 + 1√

Ta−1b0Ω−

12 (BλF + ψR))

((B +Ψβ)0Ω−

12M

Ω−12 ιnΩ−

12 (B +Ψβ))

−1(B +Ψβ)0Ω−

12M

Ω−12 ιnΩ−

12 (BλF + ψR))

!

=a

µλ10

¶+

à 1√T1ab0

((B +Ψβ)0Ω−

12M

Ω−12 ιnΩ−

12 (B +Ψβ))

−1(B +Ψβ)0Ω−

12M

Ω−12 ιn

!Ω−

12 (BλF + ψR).

Hence, µ √T (λ1 − λ1)

λF − λF

¶→dà 1√

T1ab0

((B +Ψβ)0Ω−

12M

Ω−12 ιnΩ−

12 (B +Ψβ))

−1(B +Ψβ)0Ω−

12Mιn

!Ω−

12 (ψR −ΨβλF ).

In this case (B + Ψβ)0(B + Ψβ) has a non-central Wishart distribution regardless of the value of Ω.

Hence, the moments of λF exist up to order n− k.2b. β = 1√

TB, with B a n× k matrix of full rank and E(R) = c:µ

λ1λF

¶=

·(ιn

... β)0Ω−1(ιn... β)

¸−1(ιn

... β)0R

=a

·[ιn... 1√

T(B +Ψβ)]

0Ω−1[ιn... 1√

T(B +Ψβ)]

¸−1(ιn

... 1√T(B +Ψβ))

0Ω−1(c+ 1√TψR))

=a

Ãa−1

−√T ((B +Ψβ)0Ω−

12M

Ω−12 ιnΩ−

12 (B +Ψβ))

−1 (B+Ψβ)0Ω−1ιn

ι0nΩ−1ιn−√Ta−1ι0nΩ−1(B +Ψβ)((B +Ψβ)

0Ω−1(B +Ψβ))−1

T ((B +Ψβ)0Ω−

12M

Ω−12 ιnΩ−

12 (B +Ψβ))

−1

ι0nΩ−1c+ 1√Tι0nΩ−1ψR

1√T(B +Ψβ)

0Ω−1c+ 1T (B +Ψβ)

0Ω−1ψR

!=a

Ãa−1(b0Ω−

12 c+ 1√

Tb0Ω−

12ψR)

((B +Ψβ)0Mιn(B +Ψβ))

−1(B +Ψβ)0Mιn)(

√Tc+ ψR))

!

=a

Ã1ab0Ω−

12 c√

T ((B +Ψβ)0Ω−

12M

Ω−12 ιnΩ−

12 (B +Ψβ))

−1(B +Ψβ)0Ω−

12M

Ω−12 ιnΩ−

12 c

!

+

Ã1√T1ab0

((B +Ψβ)0Ω−

12M

Ω−12 ιnΩ−

12 (B +Ψβ))

−1(B +Ψβ)0Ω−

12M

Ω−12 ιn

!Ω−

12ψR,

25

Page 26: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

with a = ι0nΩ− 12M

Ω−12 (B+Ψβ)

Ω−12 ιn, b =M

Ω−12 (B+Ψβ)

Ω−12 ιn, andà √

T (λ1 − 1ab0Ω−

12 c)

λF −√T ((B +Ψβ)

0Ω−12M

Ω−12 ιnΩ−

12 (B +Ψβ))

−1(B +Ψβ)0Ω−

12M

Ω−12 ιnΩ−

12 c

!

→d

Ã1√T1ab0

((B +Ψβ)0Ω−

12M

Ω−12 ιnΩ−

12 (B +Ψβ))

−1(B +Ψβ)0Ω−

12M

Ω−12 ιn

!Ω−

12ψR.

D. Proof of Lemma 2. Assumption 1 states that

1√T

PTt=1

µ1Ft

¶⊗ (Rt − ιnλ1 − β(Ft + λF ))

Ft − µF

→d

ϕRϕβϕF

,

with (ϕ0R ϕ0β ϕ0F )0 ∼ N(0, V ). We take all asset returns in deviation from the n-th asset return so

1√T

PTt=1

µ1Ft

¶⊗ (Rt − B(Ft + λF ))

Ft − µF

→d

ϕRϕBψF

,

with ϕR = JnϕR, ϕβ = (Ik⊗Jn)ϕβ with Jn = (IN−1...−ιn−1) so (ϕ0R

... ϕ0B... ϕ0F )

0 ∼ N(0,diag(Q⊗Σ, VFF ))with Σ = JnΩJ

0n. We pre-multiply the result from Assumption 1 with

Ã1 0h

1T

PTj=1(Fj + λF,0)(Fj + λF,0)

0i−1

(λF,0 − F )h1T

PTj=1(Fj + λF,0)(Fj + λF,0)

0i−1 !⊗ In 0

0 Ik

to obtain under H0 : λF = λF,0 that

√T

R− BλFvec(B − B)F − µF

→d

ξRξBξF

,

where B =PTt=1Rt(Ft+λF,0)

0(PT

j=1(Fj +λF,0)(Fj +λF,0)0)−1 and (ξ0R

... ξ0B... ξ0F ) ∼ N(0,diag(Q(λF )⊗

Σ, VFF )) and

Q(λF ) = limT→∞E

1 λ0Fh1T

PTt=1(Ft + λF )(Ft + λF )

i−1h1T

PTt=1(Ft + λF )(Ft + λF )

i−1λF

h1T

PTt=1(Ft + λF )(Ft + λF )

0i−1

=

µ Q(λF )11 Q(λF )1FQ(λF )F1 Q(λF )FF

¶.

Hence, under H0 : λF = λF,0, R−BλF,0 = R−BλF−(B−B)λF so√T (R−BλF,0)→

dξR−(λ0F⊗In−1)ξB,

which is independent of ξB since³10−λ0F,0Ik

´Q(λF )

³10−λ0F,0Ik

´0=diag(1− λ0F,0Q(λF )−1FFλF,0,Q(λF )FF ), so

√T

R− BλFvec(B − B)F − µF

→d

ψRψBψF

,

26

Page 27: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

where ψR, ψB and ψF are independent (n− 1)× 1, (n− 1)k × 1 and k × 1 normal distributed randomvectors with mean zero and covariance matrices, (1− λ0F,0Q(λF )−1FFλF,0)⊗Σ, Q(λF )FF ⊗Σ and VFF .

E. The derivative of the FAR statistic with respect to λF,0To obtain the derivative of

FAR(λF,0) = T1−λ0F,0Q(λF )−1FFλF,0

(R− BλF,0)0Σ−1(R− BλF,0),we first construct the derivatives of its different elements with respect to λF,0.

∂vec(B)∂λ0F,0

= ∂∂λ0F,0

vechPT

t=1Rt(Ft + λF,0)0(PT

j=1(Fj + λF,0)(Fj + λF,0)0)−1

i=

h(PT

j=1(Fj + λF,0)(Fj + λF,0)0)−1 ⊗ In−1

i∂

∂λ0F,0vec

hPTt=1Rt(Ft + λF,0)

0i

+hIk ⊗

PTt=1Rt(Ft + λF,0)

0i

∂∂λ0F,0

vech(PT

j=1(Fj + λF,0)(Fj + λF,0)0)−1

i=

h(PT

j=1(Fj + λF,0)(Fj + λF,0)0)−1 ⊗ T R

i+hIk ⊗

PTt=1Rt(Ft + λF,0)

0i

∂∂λ0F,0

vech(PT

j=1(Fj + λF,0)(Fj + λF,0)0)−1

i=

h(PT

j=1(Fj + λF,0)(Fj + λF,0)0)−1 ⊗ T R

i−2h(PT

j=1(Fj + λF,0)(Fj + λF,0)0)−1 ⊗ T BλF,0

i= T

h(PT

j=1(Fj + λF,0)(Fj + λF,0)0)−1 ⊗ (R− BλF,0)

i∂

∂λ0F,0

h1− λ0F,0Q(λF )−1FFλF,0

i= −2λ0F,0( 1T

PTj=1(Fj + λF,0)(Fj + λF,0)

0)−1−(λ0F,0 ⊗ λ0F,0)

∂∂λ0F,0

vech( 1TPT

j=1(Fj + λF,0)(Fj + λF,0)0)−1

i= −2λ0F,0( 1T

PTj=1(Fj + λF,0)(Fj + λF,0)

0)−1FF(1− λ0F,0(

1T

PTj=1(Fj + λF,0)(Fj + λF,0)

0)−1λF,0)∂vec(Σ)∂λ0F,0

= −2( 1T−k−1

PTt=1(Rt − B(Ft + λF,0))⊗ B)

−2( 1T−k−1

PTt=1(Rt − B(Ft + λF,0))(Ft + λF,0)

0 ⊗ In−1)= −2( 1

T−k−1PT

t=1(Rt − B(Ft + λF,0))⊗ B)= −2( T

T−k−1((R− BλF,0)⊗ B)since

PTt=1(Rt − B(Ft + λF,0))(Ft + λF,0)

0 = 0 and F = 0.Hence,

∂∂λ0F,0

FAR(λF,0) = −2½

T1−λ0F,0Q(λF )−1FFλF,0

(R− BλF,0)0Σ−1B+T

1−λ0F,0Q(λF )−1FFλF,0(λ0F,0 ⊗ (R− BλF,0)0Σ−1)∂vec(B)∂λ0F,0

T(1−λ0F,0Q(λF )−1FFλF,0)2

(R− BλF,0)0Σ−1(R− BλF,0)∂

∂λ0F,0

h1− λ0F,0Q(λF )−1FFλF,0

i+ T1−λ0F,0Q(λF )−1FFλF,0

((R− BλF,0)0Σ−1 ⊗ (R− BλF,0)0Σ−1)∂vec(Σ)∂λ0F,0

= −2 T1−λ0F,0Q(λF )−1FFλF,0

n(R− BλF,0)0Σ−1B+h

λ0F,0(1T

PTj=1(Fj + λF,0)(Fj + λF,0)

0)−1 ⊗ (R− BλF,0)0Σ−1(R− BλF,0)i

−hλ0F,0(

1T

PTj=1(Fj + λF,0)(Fj + λF,0)

0)−1FF (R− BλF,0)0Σ−1(R− BλF,0)i

( TT−k−1(R− BλF,0)0Σ−1(R− BλF,0)⊗ (R− BλF,0)0Σ−1B)

= −2 T1−λ0F,0Q(λF )−1FFλF,0

h1 + 1

T−k−1FAR(λF,0)i(R− BλF,0)0Σ−1B

27

Page 28: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

F. Minimal value of FAR(λF ) and specification of CLR(λF,0).The minimal value of

FAR(λF,0) = T1−λ0F,0Q(λF )−1FFλF,0

(R− BλF,0)0Σ−1(R− BλF,0)

results from the eigenvalue problem¯µQ(λF )−

¡ R B ¢0Σ−1

¡ R B ¢¯= 0⇔¯

µ³1−λ0F,0Q(λF )−1FFλF,0

00

Q(λF )FF´− ¡ R− BλF B ¢0

Σ−1¡ R− BλF B ¢¯

= 0⇔¯¯µ³10 0

IK

´−µΣ−

12

³R− BλF

´1q

1−λ0F,0Q(λF )−1FFλF,0Σ−

12 BQ(λF )−

12

FF

¶0µΣ−

12 (R− BλF ) 1q

1−λ0F,0Q(λF )−1FFλF,0Σ−

12 BQ(λF )−

12

FF

¶¯= 0,

where the second equation is obtained by pre and post-multiplying by³

1−λF,0

0Ik

´0,³

1−λF,0

0Ik

´0which

has a determinantal value of one. When k = 1, the above equation is a quadratic polynomial and wehave a closed form expression for the minimal value of µ, see Moreira (2003):

µ = 12

hFAR(λF,0) + r(λF,0)−

p(FAR(λF,0) + r(λF,0))2 − 4r(λF,0)JFAR(λF,0)

iwith r(λF,0) = Q(λF )−1FF B0Σ−1B. For larger values of k, the value of µ provides a lowerbound on thesmallest root of the eigenvalue problem when r(λF,0) is a statistic that test for the rank of B, seeKleibergen (2004a,b). The value of the CLR statistic that results is

CLR(λF,0) = 12

hFAR(λF,0)− r(λF,0) +

p(FAR(λF,0) + r(λF,0))2 − 4r(λF,0)JFAR(λF,0)

i.

G. Size distortions of statistics that result from the inversion of a high dimensional covari-ance matrix.

The expression of the GLS Wald statistic is

GLS-W(λF,0) = (R− ιnλ1 − βλF,0)0Ω−1βΘ−1β

0Ω−1(R− ιnλ1 − βλF,0),

with Θ = β0Ω−1β − βΩ−1ιn(ι0nΩ−1ιn)−1ι0nΩ−1β. The expression for Ω is:

Ω = 1T−k−1

PTt=1(Rt − βFt)(Rt − βFt)0.

To understand the size distortion of GLS-W, we specify it using matrix notation:

GLS-W(λF,0) =h1T ι0T (R− ιT (λ1ι

0n + λ0F,0β

0)− F β

0)iΩ−1βΘ−1β

0Ω−1

h1T (R− ιT (λ1ι

0n + λ0F,0β

0)− F β

0)0ιT

i,

with R = (R01 . . . R0T )0, F = (F 01 . . . F 0T )

0, so ι0T F = 0, and Ω = 1T−k−1R

0M(ιT F )R. When T equals n,

Ω is a T × T matrix of rank T − 2 since M(ιT F ) is a T × T matrix of rank T − 2. The eigenvectorsassociated with the zero eigenvalues of M(ιT F ) are spanned by ιT and F . The eigenvectors associated

with the zero eigenvalues of Ω are obtained by specifying R as

R = P(ιT F )R+M(ιT F )R.

28

Page 29: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

The first part of this specification of R, P(ιT F )R, is mapped onto zero when R is multiplied byM(ιT F ).

When T = n, the eigenvectors that are associated with a zero value of Ω are therefore spanned byR0(ιT F ) and the eigenvectors that belong to the non-zero eigenvalues are spanned by the orthogonalcomplement of R0(ιT F ). For large values of n smaller than T, Ω is non-singular but will have tworelatively small eigenvalues whose eigenvectors are spanned by R0(ιT F ). In the GLS-Wald statistic,Ω−1 is post-multiplied by (R− ιT (λ1+ β

0λF,0)− F β0)0ιT . This vector lies in the span of R0(ιT F ) which

is the eigenvector associated with the smallest eigenvalues of Ω or put differently the largest eigenvaluesof Ω−1. This shows that the large size distortion of the GLS-Wald statistics results from the associationbetween (R − ιT (λ1 + β

0λF,0)− F β

0)0ιT and the eigenvectors that belong to the largest eigenvalues of

Ω−1.When we use the covariance matrix estimator,

Σ = 1T−k−1

PTt=1(Rt − B(Ft + λF,0))(Rt − B(Ft + λF,0))

0,

the matrix expression of this covariance matrix estimator reads

Ω = 1T−k−1R0MF+ιT λ

0F,0R.

with R = (R01 . . .R0T )0. Since ι0T (R− (F + ιTλF,0)B0) is not spanned by R0(F + ιTλ0F,0), ι

0T (R− (F +

ιTλF,0)B0) is not associated with the eigenvector of the largest eigenvalue of Σ−1 and therefore allstatistics that use Σ−1 perform appropriately despite the inversion of a large dimensional matrix.

29

Page 30: Tests of Risk Premia in Linear Factor Models · Linear factor models are amongst the most commonly used statistical models in finance, see e.g.Fama and MacBeth (1973), Fama and French

References

[1] Anderson, T.W. and H. Rubin. Estimators of the Parameters of a Single Equation in a CompleteSet of Stochastic Equations. The Annals of Mathematical Statistics, 21:570—582, (1949).

[2] Cochrane, J.H. Asset Pricing. Princeton University Press, 2001.

[3] Drèze, J.H. and J.F. Richard. Bayesian Analysis of Simultaneous Equations systems. In Z. Grilichesand M.D. Intrilligator, editor, Handbook of Econometrics, volume 1. Elsevier Science (Amsterdam),(1983).

[4] Fama, E.F. and J.D. MacBeth. Risk, Return and Equillibrium: Empirical Tests. Journal of PoliticalEconomy, 81:607—636, 1973.

[5] Fama, E.F. and K.R. French. Multifactor Explanations of Asset Pricing Anomalies. Journal ofFinance, 51:55—84, 1996.

[6] Hansen, L.P. Large Sample Properties of Generalized Method Moments Estimators. Econometrica,50:1029—1054, 1982.

[7] Hansen, L.P. and R. Jagannathan. Assessing specification errors in stochastic discount factormodels. Journal of Finance, (52):557—590, 1997.

[8] Jagannathan, R. and Z. Wang. The Conditional CAPM and the Cross-Section of Expected Returns.Journal of Finance, 51:3—53, 1996.

[9] Kan, R. and C. Zhang. Two-Pass Tests of Asset Pricing Models with Useless Factors. Journal ofFinance, 54:203—235, 1999.

[10] Kleibergen, F. Pivotal Statistics for testing Structural Parameters in Instrumental Variables Re-gression. Econometrica, 70:1781—1803, 2002.

[11] Kleibergen, F. Testing Parameters in GMM without assuming that they are identified. TinbergenInstitute Discussion Paper TI 2001-67/4, 2004a.

[12] Kleibergen, F. Generalizing weak instrument robust IV statistics towards multiple parameters,unrestricted covariance matrices and identification statistics. 2004b. Department of Economics,Brown University, Working Paper.

[13] Kleibergen, F. and R. Paap. Generalized Reduced Rank Tests using the Singular Value Decompo-sition. UvA-Econometrics working paper 2002/25, University of Amsterdam, 2004.

[14] Moreira, M.J.,. A Conditional Likelihood Ratio Test for Structural Models. Econometrica, 71:1027—1048, 2003.

[15] Muirhead, R.J. Aspects of Multivariate Statistical Theory. John Wiley (New York), 1982.

[16] Phillips, P.C.B. Partially Identified Econometric Models. Econometric Theory, 5:181—240, (1989).

[17] Staiger, D. and J.H. Stock. Instrumental Variables Regression with Weak Instruments. Economet-rica, 65:557—586, 1997.

[18] Zellner, A. An Introduction to Bayesian Inference in Econometrics. Wiley, (1971).

30