Lecture #8: Multicollinearity
Studenmund (2006): Chapter 8
8.1
All rights reserved by Dr. Bill Wan Sing Hung - HKBU
Lecture #8. Studenmund (2006): Chapter 8

Objectives
• Perfect and imperfect multicollinearity
• Effects of multicollinearity
• Detecting multicollinearity
• Remedies for multicollinearity
8.2
The nature of Multicollinearity
Perfect multicollinearity: some exact functional (linear) relationship exists among the independent variables, that is,

Σᵢ λᵢXᵢ = 0, or λ₁X₁ + λ₂X₂ + λ₃X₃ + … + λᵢXᵢ = 0,

for some λᵢ not all zero. For example, λ₁X₁ + λ₂X₂ = 0 implies X₁ = −(λ₂/λ₁)X₂.

If multicollinearity is perfect, the regression coefficients of the Xᵢ variables, the β̂ᵢ's, are indeterminate and their standard errors, se(β̂ᵢ)'s, are infinite.
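As a quick numeric check, here is a short sketch (hypothetical data) showing that an exact linear relation between two regressors makes the cross-product matrix X'X singular, so the OLS normal equations have no unique solution:

```python
# Perfect collinearity: x2 is an exact multiple of x1, i.e. 2*x1 - x2 = 0.
# Hypothetical data for illustration only.
x1 = [10, 15, 18, 24, 30]
x2 = [2 * v for v in x1]          # x2 = 2*x1 exactly

# 2x2 cross-product matrix X'X (no intercept, for brevity)
s11 = sum(a * a for a in x1)
s12 = sum(a * b for a, b in zip(x1, x2))
s22 = sum(b * b for b in x2)

det = s11 * s22 - s12 ** 2        # determinant of X'X
print(det)                        # 0 -> X'X is singular, OLS is indeterminate
```

With any exact relation λ₁X₁ + λ₂X₂ = 0 this determinant is identically zero, which is why the estimators' denominators collapse to zero.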
8.3
Example: three-variable case:

Ŷ = β̂₀ + β̂₁X₁ + β̂₂X₂

In deviation form, the OLS estimators are

β̂₁ = [(Σyx₁)(Σx₂²) − (Σyx₂)(Σx₁x₂)] / [(Σx₁²)(Σx₂²) − (Σx₁x₂)²]

β̂₂ = [(Σyx₂)(Σx₁²) − (Σyx₁)(Σx₁x₂)] / [(Σx₁²)(Σx₂²) − (Σx₁x₂)²]

If x₂ = λx₁, substituting gives

β̂₁ = [(Σyx₁)(λ²Σx₁²) − (λΣyx₁)(λΣx₁²)] / [(Σx₁²)(λ²Σx₁²) − λ²(Σx₁²)²] = 0/0, indeterminate,

and similarly β̂₂ = 0/0, indeterminate.
8.4
If multicollinearity is imperfect,

x₂ = λx₁ + ε (or x₂ = λ₀ + λ₁x₁ + ε), where ε is a stochastic error,

then the regression coefficients, although determinate, possess large standard errors: the coefficients can be estimated, but with less accuracy. Substituting into β̂₁:

β̂₁ = [(Σyx₁)(λ²Σx₁² + Σε²) − (λΣyx₁ + Σyε)(λΣx₁² + Σx₁ε)] / [(Σx₁²)(λ²Σx₁² + Σε²) − (λΣx₁² + Σx₁ε)²] ≠ 0/0 (Why?)
8.5
Example: production function

Yᵢ = β₀ + β₁X₁ᵢ + β₂X₂ᵢ + β₃X₃ᵢ + εᵢ

Y (output)   X1 (capital)   X2 (labor)   X3 (land)
122          10             50           52
170          15             75           75
202          18             90           97
270          24             120          129
330          30             150          152

Note that X₂ = 5X₁ in every row: labor is an exact multiple of capital, so X₁ and X₂ are perfectly collinear.
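A one-line check of the table above confirms the exact relation (the data are from the slide):

```python
# Production-function data from the slide: labor (X2) is exactly 5x capital (X1).
X1 = [10, 15, 18, 24, 30]    # capital
X2 = [50, 75, 90, 120, 150]  # labor
print(all(b == 5 * a for a, b in zip(X1, X2)))  # True -> perfect multicollinearity
```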
8.6
Example: Perfect multicollinearity
a. Suppose D1, D2, D3 and D4 = 1 for spring, summer, autumn and winter, respectively:
Yᵢ = α₀ + α₁D₁ᵢ + α₂D₂ᵢ + α₃D₃ᵢ + α₄D₄ᵢ + β₁X₁ᵢ + εᵢ
(D₁ᵢ + D₂ᵢ + D₃ᵢ + D₄ᵢ = 1 for every i, duplicating the intercept.)
b. Yᵢ = β₀ + β₁X₁ᵢ + β₂X₂ᵢ + β₃X₃ᵢ + εᵢ
X1: nominal interest rate; X2: real interest rate; X3: CPI
c. Yₜ = β₀ + β₁Xₜ + β₂ΔXₜ + β₃Xₜ₋₁ + εₜ
where ΔXₜ = (Xₜ − Xₜ₋₁) is called the "first difference" (so Xₜ = ΔXₜ + Xₜ₋₁ exactly).
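The dummy-variable trap in example (a) can be seen directly: with all four seasonal dummies included, their columns sum to the constant column. A minimal sketch with a hypothetical eight-quarter sample:

```python
# Dummy-variable trap: the four seasonal dummies sum to the intercept column.
seasons = ["spring", "summer", "autumn", "winter"] * 2   # 8 quarterly observations
D = {s: [1 if q == s else 0 for q in seasons]
     for s in ["spring", "summer", "autumn", "winter"]}

# Sum the four dummy columns observation by observation
col_sum = [sum(D[s][i] for s in D) for i in range(len(seasons))]
print(col_sum)  # [1, 1, 1, 1, 1, 1, 1, 1] -- identical to the intercept column
```

Because the sum reproduces the constant regressor exactly, including all four dummies plus an intercept is perfect multicollinearity; the standard fix is to drop one dummy (or the intercept).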
8.7
Imperfect Multicollinearity

Yᵢ = β₀ + β₁X₁ᵢ + β₂X₂ᵢ + … + βKXKᵢ + εᵢ

When some independent variables are linearly correlated but the relation is not exact, there is imperfect multicollinearity:

λ₀ + λ₁X₁ᵢ + λ₂X₂ᵢ + … + λKXKᵢ + uᵢ = 0

where u is a random error term and λₖ ≠ 0 for some k.

When will it be a problem?
8.8
Consequences of imperfect multicollinearity

1. The estimated coefficients are still BLUE; however, the OLS estimators have large variances and covariances, making estimation less precise.
2. The confidence intervals tend to be much wider, leading to accepting the "zero null hypothesis" more readily.
3. The t-statistics of the coefficients tend to be statistically insignificant.
4. The R² can nevertheless be very high.
5. The OLS estimators and their standard errors can be sensitive to small changes in the data.

These symptoms can be detected from the regression results.
8.9
OLS estimators are still BLUE under imperfect multicollinearity. Why?

Remarks:
• Unbiasedness is a repeated-sampling property, not a property of the estimates in any given sample.
• Minimum variance does not mean small variance.
• Imperfect multicollinearity is just a sample phenomenon.
8.10
Effects of Imperfect Multicollinearity
Unaffected:
a. OLS estimators are still BLUE.
b. The overall fit of the equation.
c. The estimation of the coefficients of the non-multicollinear variables.
8.11
The variances of OLS estimators increase with the degree of multicollinearity
Regression model:
Yᵢ = β₀ + β₁X₁ᵢ + β₂X₂ᵢ + εᵢ

High correlation between X₁ and X₂ makes it difficult to isolate the effects of X₁ and X₂ from each other.
8.12
Closer relation between X₁ and X₂ → larger r₁₂² → larger VIF → larger variances:

var(β̂ₖ) = (σ²/Σxₖ²)·VIFₖ, where VIFₖ = 1/(1 − Rₖ²), k = 1, …, K,

and Rₖ² is the coefficient of determination from regressing Xₖ on the other (K − 1) explanatory variables.
8.14
Larger var(β̂ₖ) → larger standard errors of the estimated coefficients, se(β̂ₖ):

a. More likely to get unexpected signs.
b. se(β̂ₖ) tends to be large.
c. Larger standard errors → lower t-values, since tₖ = (β̂ₖ − βₖ*)/se(β̂ₖ).
8.15
d. Larger standard errors → wider confidence intervals, β̂ₖ ± t(α/2, df)·se(β̂ₖ) → less precise interval estimates.
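A small sketch of point (d), with a hypothetical coefficient estimate and a critical value t ≈ 2.06 (roughly the 5% two-sided value at df = 25): the interval width scales one-for-one with the standard error.

```python
# Confidence interval beta_hat +/- t * se; the width grows with se.
# All numbers are hypothetical, for illustration only.
def conf_interval(beta_hat, se, t_crit=2.06):
    return (beta_hat - t_crit * se, beta_hat + t_crit * se)

low1, high1 = conf_interval(0.5, 0.1)   # modest standard error
low2, high2 = conf_interval(0.5, 0.4)   # collinearity-inflated standard error
print(high1 - low1, high2 - low2)       # the second interval is 4x wider
```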
8.16
Detection of Multicollinearity
Example: Data set: CONS8 (pp. 254 – 255)
COᵢ = β₀ + β₁Ydᵢ + β₂LAᵢ + εᵢ
CO: Annual consumption expenditure
Yd: Annual disposable income
LA: Liquid assets
8.17
Since LA (liquid assets: savings, etc.) is highly related to Yd (disposable income). See Studenmund (2006), Eq. 8.9, p. 254.

Results: high R² and adjusted R², but less significant t-values → drop one variable.
8.18
OLS estimates and SE’s can be sensitive to specification and small changes in data
Small changes:
Add or drop some observations
Change some data values
Specification changes:
Add or drop variables
8.19
High Simple Correlation Coefficients

rᵢⱼ = Σ(Xᵢ − X̄ᵢ)(Xⱼ − X̄ⱼ) / √[Σ(Xᵢ − X̄ᵢ)²·Σ(Xⱼ − X̄ⱼ)²]

Remark: a high rᵢⱼ for some i and j is a sufficient indicator of multicollinearity, but not a necessary one.
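The rᵢⱼ formula can be sketched directly in the deviation-from-mean form; the perfectly collinear capital/labor data from the earlier production-function example give r = 1:

```python
import math

# Simple correlation coefficient r_ij between two regressors,
# using the deviation-from-mean formula.
def corr(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Perfectly collinear columns (X2 = 5*X1) give r = 1 exactly
print(corr([10, 15, 18, 24, 30], [50, 75, 90, 120, 150]))
```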
8.20
Variance Inflation Factors (VIF) method
Procedure:
(1) Y = β₀ + β₁X₁ + β₂X₂ + … + βKXK + ε
(2) Regress each Xₖ on all the other explanatory variables, e.g. X₁ = α₁ + α₂X₂ + α₃X₃ + … + αKXK + v, and obtain Rₖ²
(3) VIF(β̂ₖ) = 1/(1 − Rₖ²)

Rule of thumb: VIF > 5 → severe multicollinearity.
Notes: (a) Using the VIF is not a statistical test. (b) The cutoff point is arbitrary.
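A sketch of the VIF procedure for the two-regressor case, where the auxiliary Rₖ² is just the squared simple correlation r₁₂² (the data below are hypothetical and imperfectly collinear):

```python
import math

# VIF for a two-regressor model: auxiliary R^2 of regressing X1 on X2
# equals the squared simple correlation r12^2, so VIF = 1/(1 - r12^2).
def vif_two_regressors(x1, x2):
    m1, m2 = sum(x1) / len(x1), sum(x2) / len(x2)
    sxy = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    sxx = sum((a - m1) ** 2 for a in x1)
    syy = sum((b - m2) ** 2 for b in x2)
    r2 = sxy ** 2 / (sxx * syy)           # auxiliary R^2 = r12^2
    return 1.0 / (1.0 - r2)

# Hypothetical data: x2 roughly (not exactly) tracks 5*x1
x1 = [10, 15, 18, 24, 30]
x2 = [52, 75, 97, 129, 152]
print(vif_two_regressors(x1, x2))         # far above the rule-of-thumb cutoff of 5
```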
8.21
Remedial Measures

1. Drop the Redundant Variable
Use theory to pick which variable(s) to drop. Do not drop a variable that is strongly supported by theory (danger of specification error).
8.22
(Regression results: the coefficients on M1 and M2 are individually insignificant, since M1 and M2 are highly related.)

Other examples of redundant pairs: CPI and WPI; the CD rate and the TB rate; GDP, GNP and GNI.
8.23
Check after dropping variables:
• The estimated coefficients of the other variables are not affected. (necessary)
• R² does not fall much when some collinear variables are dropped. (necessary)
• More significant t-values and smaller standard errors. (likely)
8.24
2. Redesigning the Regression Model

There is no definite rule for this method. Example (Studenmund (2006), p. 268):

Fₜ = f(PFₜ, PBₜ, lnYdₜ, Nₜ, Pₜ)
Fₜ = β₀ + β₁PFₜ + β₂PBₜ + β₃lnYdₜ + β₄Nₜ + β₅Pₜ + εₜ

Ft = average pounds of fish consumed per capita
PFt = price index for fish
PBt = price index for beef
Ydt = real per capita disposable income
Nt = the number of Catholics
Pt = dummy variable: 1 after the Pope's 1966 decision, 0 otherwise
8.25
Signs are unexpected; most t-values are insignificant. High correlations:

VIF(PF) = 43.4, VIF(lnYd) = 23.3, VIF(PB) = 18.9, VIF(N) = 18.5, VIF(P) = 4.4
8.26
Dropping N does not improve the results.

Use the relative price RPₜ = PFₜ/PBₜ instead:

Fₜ = f(RPₜ, lnYdₜ, Pₜ)
Fₜ = β₀ + β₁RPₜ + β₂lnYdₜ + β₃Pₜ + εₜ

Improved.
8.27
Using the lagged term RPₜ₋₁ allows for a lag effect in the regression:

Fₜ = β₀ + β₁RPₜ₋₁ + β₂lnYdₜ + β₃Pₜ + εₜ

Much improved.
8.28
3. Using A Priori Information

From previous empirical work, e.g.

Consᵢ = β₀ + β₁Incomeᵢ + β₂Wealthᵢ + εᵢ

with the a priori information β₂ = 0.1, construct a new variable (proxy)

Cons*ᵢ = Consᵢ − 0.1·Wealthᵢ

and run OLS: Cons*ᵢ = β₀ + β₁Incomeᵢ + εᵢ
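A minimal sketch of the a priori restriction, with hypothetical Cons/Wealth/Income figures: impose β₂ = 0.1, construct Cons*, and the remaining regression involves Income alone.

```python
# Impose the prior beta2 = 0.1 and form Cons* = Cons - 0.1*Wealth.
# All figures are hypothetical, for illustration only.
cons   = [120.0, 150.0, 180.0, 210.0]
wealth = [400.0, 500.0, 600.0, 700.0]
income = [100.0, 130.0, 160.0, 190.0]

cons_star = [c - 0.1 * w for c, w in zip(cons, wealth)]   # Cons* = Cons - 0.1*Wealth
print(cons_star)  # [80.0, 100.0, 120.0, 140.0]
```

Wealth has been removed from the right-hand side, so the collinearity between Income and Wealth no longer enters the estimation.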
8.29
4. Transformation of the Model

Take first differences of the time-series data. Original regression model:

Yₜ = β₀ + β₁X₁ₜ + β₂X₂ₜ + εₜ

Transformed model (first differencing):

ΔYₜ = β′₀ + β′₁ΔX₁ₜ + β′₂ΔX₂ₜ + uₜ

where ΔYₜ = Yₜ − Yₜ₋₁ (Yₜ₋₁ is called a lagged term), ΔX₁ₜ = X₁ₜ − X₁,ₜ₋₁, and ΔX₂ₜ = X₂ₜ − X₂,ₜ₋₁.
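First differencing is a one-liner; a sketch with a hypothetical series:

```python
# First difference: Delta X_t = X_t - X_{t-1}; the transformed regression
# uses these differences in place of the levels.
def first_diff(series):
    return [b - a for a, b in zip(series, series[1:])]

X = [100, 104, 110, 113, 121]   # hypothetical levels
print(first_diff(X))            # [4, 6, 3, 8]
```

Note the transformed sample loses one observation (the first), since Δ is undefined at t = 1.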
8.30
5. Collect More Data (expand the sample size)
A larger sample size means smaller variances of the estimators.

6. Do Nothing
Unless multicollinearity causes serious bias and a change of specification gives better results.
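A small Monte Carlo sketch of remedy 5 (all numbers hypothetical, seeded for reproducibility): the sampling variance of an OLS slope estimated on n = 10 observations versus n = 100.

```python
import random

# Monte Carlo check that the sampling variance of the OLS slope
# shrinks as the sample grows.
def slope_variance(n, reps=2000, seed=42):
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [2.0 * xi + rng.gauss(0, 1) for xi in x]   # true slope = 2
        mx, my = sum(x) / n, sum(y) / n
        b1 = (sum((a - mx) * (b - my) for a, b in zip(x, y))
              / sum((a - mx) ** 2 for a in x))
        slopes.append(b1)
    m = sum(slopes) / reps
    return sum((s - m) ** 2 for s in slopes) / reps

print(slope_variance(10) > slope_variance(100))  # True: more data, smaller variance
```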