AMS 225 final project presentation
Waley W.J. Liang
Department of Applied Mathematics and Statistics, University of California, Santa Cruz
March 11, 2010
Joint Bayesian Analysis of Factor Scores and StructuralParameters in the Factor Analysis Model
Authors: Sik-Yum Lee and Jian-Qing Shi
Published in: Annals of the Institute of Statistical Mathematics, Springer, vol. 52(4), pages 722-736, December 2000.
The factor analysis model
y = Λζ + δ, (1)
where
• y is a (p × 1) observed random vector.
• Λ is a (p × r) factor loading matrix.
• ζ is an (r × 1) vector of factor scores such that ζ ∼ N(0, Φ).
• δ is a (p × 1) random vector of measurement errors such that δ ∼ N(0, Ψ), where Ψ is a diagonal matrix.
Common practice in estimation
• Estimate the unknown parameters in Λ, Ψ, and Φ based on ML or generalized least squares.
• Estimate ζ based on the above estimates of the unknown parameters.
• Sampling errors are ignored.
Setup for Bayesian inference
Let
• Y = (y1, · · · , yn) be the observed data matrix.
• Z = (ζ1, · · · , ζn) be the matrix of factor scores.
• θ be the structural parameter vector that contains the unknown elements of Λ, Φ, and Ψ.
The goal is to obtain p(θ, Z|Y), which can be done by drawing samples from the conditional distributions p(Z|Y, θ) and p(θ|Z, Y).
Conditional distribution of Z given (Y, θ)
Note that for i = 1, · · · , n, the ζi are mutually independent, and the yi are also mutually independent given (ζi, θ). We have

p(Z|Y, θ) = ∏_{i=1}^{n} p(ζi|yi, θ) ∝ ∏_{i=1}^{n} p(ζi|θ) p(yi|ζi, θ).

Since ζi|θ ∼ N(0, Φ) and yi|ζi, θ ∼ N(Λζi, Ψ), it can be shown that

ζi|yi, θ ∼ N((Φ⁻¹ + Λ′Ψ⁻¹Λ)⁻¹Λ′Ψ⁻¹yi, (Φ⁻¹ + Λ′Ψ⁻¹Λ)⁻¹).
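This Gaussian posterior for the factor scores is easy to sample from directly. A minimal NumPy sketch (the function name and interface are ours, not from the paper; Ψ is assumed diagonal as in model (1)):

```python
import numpy as np

def sample_factor_scores(Y, Lam, Phi, Psi, rng):
    """Draw one sample of Z = (zeta_1, ..., zeta_n) from p(Z | Y, theta).

    Y   : (p, n) observed data matrix, one column per observation.
    Lam : (p, r) factor loading matrix.
    Phi : (r, r) factor covariance matrix.
    Psi : (p, p) diagonal measurement-error covariance.
    """
    Psi_inv = np.diag(1.0 / np.diag(Psi))
    # Posterior covariance (Phi^-1 + Lam' Psi^-1 Lam)^-1, shared by all i.
    V = np.linalg.inv(np.linalg.inv(Phi) + Lam.T @ Psi_inv @ Lam)
    # Posterior means, one column per observation: V Lam' Psi^-1 y_i.
    M = V @ Lam.T @ Psi_inv @ Y                      # (r, n)
    L = np.linalg.cholesky(V)
    return M + L @ rng.standard_normal(M.shape)      # (r, n)
```

Note that the posterior covariance does not depend on yi, so it is computed once and shared across all n observations.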
Conditional distribution of θ given (Y, Z)
When Z is given, the distribution of Y depends only on (Λ, Ψ), and the distribution of Z involves only Φ. We can therefore assume that (Λ, Ψ) and Φ are independent in the prior:

p(θ) = p(Λ, Φ, Ψ) = p(Λ, Ψ) p(Φ),
and it follows that
p(Λ, Ψ, Φ|Y, Z) = p(θ|Y, Z)
∝ p(Y, Z|θ) p(θ)
∝ p(Y|θ, Z) p(Z|θ) p(θ)
∝ p(Y|θ, Z) p(Z|θ) p(Λ, Ψ) p(Φ)
∝ p(Λ, Ψ) p(Y|Λ, Ψ, Z) · p(Z|Φ) p(Φ).
Conditional distribution of θ given (Y, Z) (continued)
Let ψkk and Λk′ be the k-th diagonal element of Ψ and the k-th row of Λ, respectively. Assuming that the prior distribution of ψkk is independent of ψhh, and Λk is independent of Λh, for k ≠ h, the priors are specified as

ψkk⁻¹ ∼ G(α0k, β0k),
Λk|ψkk ∼ N(Λ0k, ψkk H0k),
Φ⁻¹ ∼ W(R0, ρ0).
Conditional distribution of θ given (Y, Z) (continued)
The conditional distributions of Λk, ψkk, and Φ are given by

ψkk⁻¹|Y, Z ∼ G(n/2 + α0k, βk),
Λk|Y, Z, ψkk ∼ N(µk, ψkkΩk),
Φ|Y, Z ∼ W⁻¹(ZZ′ + R0⁻¹, n + ρ0),

where

Ωk = (H0k⁻¹ + ZZ′)⁻¹,
µk = Ωk(H0k⁻¹Λ0k + ZYk),
βk = β0k + 2⁻¹(Yk′Yk − µk′Ωk⁻¹µk + Λ0k′H0k⁻¹Λ0k).
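Under these conjugate priors, one Gibbs sweep for θ = (Λ, Ψ, Φ) given (Y, Z) can be sketched as follows. This is a minimal sketch under our own conventions (the interface and hyperparameter layout are assumptions, not code from the paper); SciPy's inverse-Wishart parameterization (df, scale) is used for W⁻¹(ZZ′ + R0⁻¹, n + ρ0).

```python
import numpy as np
from scipy.stats import invwishart

def gibbs_theta_step(Y, Z, Lam0, H0, alpha0, beta0, R0, rho0, rng):
    """One draw from p(theta | Y, Z) under the conjugate priors above.

    Y: (p, n) data; Z: (r, n) factor scores.
    Lam0: (p, r) prior means of the rows of Lam; H0: (p, r, r) prior scales.
    alpha0, beta0: (p,) Gamma hyperparameters; R0, rho0: inv-Wishart hyperparameters.
    """
    p, n = Y.shape
    r = Z.shape[0]
    ZZt = Z @ Z.T
    Lam = np.empty((p, r))
    psi = np.empty(p)
    for k in range(p):
        H0k_inv = np.linalg.inv(H0[k])
        Omega_k = np.linalg.inv(H0k_inv + ZZt)
        mu_k = Omega_k @ (H0k_inv @ Lam0[k] + Z @ Y[k])
        beta_k = beta0[k] + 0.5 * (Y[k] @ Y[k]
                                   - mu_k @ np.linalg.inv(Omega_k) @ mu_k
                                   + Lam0[k] @ H0k_inv @ Lam0[k])
        # psi_kk^-1 | Y, Z ~ Gamma(n/2 + alpha_0k, rate beta_k)
        psi_inv = rng.gamma(shape=n / 2 + alpha0[k], scale=1.0 / beta_k)
        psi[k] = 1.0 / psi_inv
        # Lam_k | Y, Z, psi_kk ~ N(mu_k, psi_kk * Omega_k)
        L = np.linalg.cholesky(psi[k] * Omega_k)
        Lam[k] = mu_k + L @ rng.standard_normal(r)
    # Phi | Y, Z ~ W^-1(ZZ' + R0^-1, n + rho0)
    Phi = invwishart.rvs(df=n + rho0, scale=ZZt + np.linalg.inv(R0),
                         random_state=rng)
    return Lam, psi, Phi
```

Alternating this update with a draw of Z from p(Z|Y, θ) gives the full Gibbs sampler.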
Summary
• A Bayesian approach is developed to jointly estimate the factor scores Z and the unknown parameters in Λ, Φ, and Ψ.
• The key idea is to treat Z as missing data and to derive the conditional distributions p(Z|Y, θ) and p(θ|Z, Y), from which Gibbs sampling can be deployed.
Bayesian analysis of the factor model with finance applications

Authors: Sik-Yum Lee, Wai-Yin Poon, and Xin-Yuan Song
Published in: Quantitative Finance, Taylor and Francis Journals, vol. 7(3), pages 343-356.
The factor model of asset return
yt = µ + Λζt + δt , (2)
where
• yt is a (p × 1) vector of returns of p assets at time t.
• µ is a vector of mean returns of the p assets.
• Λ is a (p × q) factor loading matrix.
• ζt is a (q × 1) random vector of factor scores such that ζt ∼ N(0, I).
• δt is a (p × 1) random vector of measurement errors such that δt ∼ N(0, Ψ), where Ψ is a diagonal matrix.
Procedure of applying factor models in finance
1. Determination of the number of factors.
2. Estimation of the factor loadings and factor scores.
3. Use the results to analyze various financial problems.
Common practice of finding the number of factors q
• PCA based on the eigenvalues of Σ = ΛΛ′ + Ψ.
• An ML approach that uses an asymptotic likelihood ratio statistic to test the goodness-of-fit of a model with a specified q. (Problems: 1. No conclusion can be drawn if the null is not rejected. 2. The null is rejected too frequently when the sample size is large.)
Model selection using the Bayes factor
Let p(Y|Mk) denote the probability density of Y given model Mk, for k = 0, 1. The Bayes factor for these models is defined as

B01 = p(Y|M1) / p(Y|M0).

It is useful to consider the natural logarithm of the Bayes factor and interpret the resulting value based on the following criterion (Kass and Raftery 1995):
log B01:   < 0          [0, 1)          [1, 3)        ≥ 3
Evidence:  Support M0   No conclusion   Support M1    Strongly support M1
Model selection using the Bayes factor (continued)
The key quantity in computing B01 is

p(Y|Mk) = ∫ p(Y|θk, Mk) p(θk|Mk) dθk,   k = 0, 1,

where

• θk is the vector of unknown parameters under Mk.
• p(θk|Mk) is the prior density of θk.
• p(Y|θk, Mk) is the likelihood of Y given θk, under Mk.

In many cases, this integral is intractable!
Path sampling
Consider
p(θ, Z|Y, v) = p(θ, Z, Y|v) / p(Y|v),   v ∈ [0, 1],

where p(Y|v = k) = p(Y|Mk) for k = 0, 1. Let

U(θ, Z, Y, v) = d log p(Z, Y|θ, v) / dv,

where p(Z, Y|θ, v) is the complete-data likelihood function at v. Assuming θ is independent of v, it follows from Gelman and Meng (1998) that

log B01 = ∫₀¹ Eθ,Z[U(θ, Z, Y, v)] dv,

where Eθ,Z denotes the expectation with respect to p(θ, Z|Y, v).
Path sampling (continued)
The trapezoidal rule can be used to estimate the above integral. Fix S ordered grid points v(1) < · · · < v(S) in (0, 1), with v(0) = 0 and v(S+1) = 1; then log B01 is estimated by

log B̂01 = (1/2) ∑_{s=0}^{S} (v(s+1) − v(s)) (Ū(s+1) + Ū(s)),

where

Ū(s) = J⁻¹ ∑_{j=1}^{J} U(θ(j), Z(j), Y, v(s)),

and (θ(j), Z(j)), j = 1, · · · , J, are samples from p(θ, Z|Y, v(s)), which can be obtained using the Gibbs sampler described in the previous paper.
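The trapezoidal-rule estimate is straightforward to code once Gibbs samples are available at each grid point. A sketch, assuming a user-supplied routine (a hypothetical interface, wrapping the Gibbs sampler) that returns the J sampled values of U at a given v:

```python
import numpy as np

def log_bayes_factor(u_draws, grid):
    """Path-sampling estimate of log B01 via the trapezoidal rule.

    grid    : increasing array of grid points v(0) = 0, ..., v(S+1) = 1.
    u_draws : callable v -> 1-D array of J values U(theta_j, Z_j, Y, v)
              computed from Gibbs samples at that v.
    """
    # U-bar_(s): Monte Carlo average of U at each grid point.
    u_bar = np.array([np.mean(u_draws(v)) for v in grid])
    # (1/2) sum_s (v(s+1) - v(s)) (U-bar_(s+1) + U-bar_(s))
    return 0.5 * np.sum(np.diff(grid) * (u_bar[1:] + u_bar[:-1]))
```

The grid need not be equally spaced; finer spacing where Ū changes rapidly reduces the discretization error.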
Path sampling (continued)
Consider applying the above procedure to select one of the following two competing models, M0 and M1:

M0 : y = µ + Λ10ζ1 + · · · + Λq0ζq + δ,
M1 : y = µ + Λ11ζ1 + · · · + Λm1ζm + δ,

where q < m (without loss of generality), and Λh0 and Λh1 are the columns of the loading matrices of M0 and M1, respectively. These two models can be linked by v ∈ [0, 1] as follows:

Mv : y = µ + [(1 − v)Λ10 + vΛ11]ζ1 + · · · + [(1 − v)Λq0 + vΛq1]ζq + vΛ(q+1)1ζ(q+1) + · · · + vΛm1ζm + δ.
Path sampling (continued)

Letting

Λv = ((1 − v)Λ10 + vΛ11, · · · , (1 − v)Λq0 + vΛq1, vΛ(q+1)1, · · · , vΛm1),

it can be shown that the corresponding log likelihood is given by

log p(Z, Y|θ, v) = −(1/2) { Tp log(2π) + T log|Ψ| + ∑_{t=1}^{T} ζt′ζt + ∑_{t=1}^{T} (yt − µ − Λvζt)′Ψ⁻¹(yt − µ − Λvζt) }.

Differentiating with respect to v gives

U(θ, Z, Y, v) = ∑_{t=1}^{T} (yt − µ − Λvζt)′Ψ⁻¹Λv0ζt,

where Λv0 = dΛv/dv = (Λ11 − Λ10, · · · , Λq1 − Λq0, Λ(q+1)1, · · · , Λm1).
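The quantity U for the linked model Mv vectorizes cleanly. A sketch under our own conventions (rows of Y and Z index time; Ψ, being diagonal, is stored as a vector of its diagonal entries):

```python
import numpy as np

def u_linked_model(Y, Z, mu, Lam0, Lam1, Psi_diag, v):
    """U(theta, Z, Y, v) for the linked model M_v.

    Y: (T, p) returns; Z: (T, m) factor scores under the larger model;
    Lam0: (p, q) loadings of M_0; Lam1: (p, m) loadings of M_1, q <= m;
    Psi_diag: (p,) diagonal of Psi.
    """
    q = Lam0.shape[1]
    # Lam_v: first q columns interpolate between the models,
    # the remaining m - q columns are scaled by v.
    Lam_v = np.hstack([(1 - v) * Lam0 + v * Lam1[:, :q], v * Lam1[:, q:]])
    # dLam_v/dv, column by column.
    dLam_v = np.hstack([Lam1[:, :q] - Lam0, Lam1[:, q:]])
    resid = Y - mu - Z @ Lam_v.T                     # (T, p)
    # sum_t resid_t' Psi^-1 (dLam_v zeta_t), exploiting diagonal Psi
    return np.sum((resid / Psi_diag) * (Z @ dLam_v.T))
```

Dividing by Psi_diag applies Ψ⁻¹ elementwise, which is valid only because Ψ is diagonal in this model.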
Summary
• A Bayes factor approach based on path sampling is used to determine the number of factors in a factor model.
• This approach uses samples of θ and Z drawn from their conditional distribution at each grid value v.
Extension
A natural extension would be a dynamic factor model in which the factor scores follow a random walk:

yt = µ + Λζt + δt, (3)
ζt = ζt−1 + εt. (4)