MULTI HETERO AUTO

CHAPTER 10: MULTICOLLINEARITY: WHAT HAPPENS IF THE REGRESSORS ARE CORRELATED?

The Nature of Multicollinearity

Multicollinearity: the existence of a perfect, or exact, linear relationship among some or all explanatory variables for a regression model.

Why does the classical linear regression model assume that there is no multicollinearity among the Xs?

If multicollinearity is perfect, the regression coefficients of the X variables are indeterminate and their standard errors are infinite. If multicollinearity is less than perfect, the regression coefficients, although determinate, possess large standard errors, which means the coefficients cannot be estimated with great precision or accuracy.

    Sources of Multicollinearity

1. The data collection method employed.
2. Constraints on the model or in the population being sampled.
3. Model specification.
4. An overdetermined model.

    Practical Consequences of Multicollinearity

1. Although BLUE, the OLS estimators have large variances and covariances, making precise estimation difficult.

2. Because of consequence 1, the confidence intervals tend to be much wider, leading to acceptance of the zero null hypothesis.

3. Also because of consequence 1, the t ratio of one or more coefficients tends to be statistically insignificant.

4. Although the t ratio of one or more coefficients is statistically insignificant, $R^2$, the overall measure of goodness of fit, can be very high.

    5. The OLS estimators and their standard errors can be sensitive to small changes in the data.

    Large Variances and Covariances of OLS Estimators

$$\operatorname{var}(\hat{\beta}_2) = \frac{\sigma^2}{\sum x_{2i}^2\,(1 - r_{23}^2)}$$

$$\operatorname{var}(\hat{\beta}_3) = \frac{\sigma^2}{\sum x_{3i}^2\,(1 - r_{23}^2)}$$

$$\operatorname{cov}(\hat{\beta}_2, \hat{\beta}_3) = \frac{-r_{23}\,\sigma^2}{(1 - r_{23}^2)\sqrt{\sum x_{2i}^2}\sqrt{\sum x_{3i}^2}}$$

where $r_{23}$ is the coefficient of correlation between $X_2$ and $X_3$.

    Detection of Multicollinearity

1. High $R^2$ but few significant t ratios.
2. High pair-wise correlations among regressors.
3. Examination of partial correlations.
4. Auxiliary regressions.
5. Eigenvalues and condition index.

$$\text{CI} = \sqrt{\frac{\text{Maximum eigenvalue}}{\text{Minimum eigenvalue}}} = \sqrt{k}$$

where $k$ is the condition number.

6. Tolerance and variance inflation factor; see the sketch below.
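To make diagnostics 5 and 6 concrete, here is a minimal Python sketch (not part of the original notes) that computes VIFs and the condition index on simulated data; the data and variable names are illustrative assumptions, and numpy and statsmodels are assumed available.

```python
# Sketch: VIF and condition index for a simulated, nearly collinear design.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
n = 100
x2 = rng.normal(size=n)
x3 = 0.95 * x2 + 0.05 * rng.normal(size=n)   # x3 nearly collinear with x2
X = sm.add_constant(np.column_stack([x2, x3]))

# VIF_j = 1 / (1 - R_j^2), from regressing X_j on the remaining regressors
for j in (1, 2):
    print(f"VIF for regressor {j}: {variance_inflation_factor(X, j):.1f}")

# Condition index: sqrt(max eigenvalue / min eigenvalue) of X'X
eigvals = np.linalg.eigvalsh(X.T @ X)
print("Condition index:", np.sqrt(eigvals.max() / eigvals.min()))
```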

    Remedial Measures

Do Nothing

Rule-of-Thumb Procedures:
1) A priori information.
2) Combining cross-sectional and time series data.
3) Dropping a variable(s) and specification bias.
4) Transformation of variables.
5) Additional or new data.
6) Reducing collinearity in polynomial regressions.
7) Other methods of remedying multicollinearity.

    CHAPTER 11: HETEROSCEDASTICITY: WHAT HAPPENS IF THE ERROR VARIANCE IS NONCONSTANT?

    The Nature of Heteroscedasticity

There are several reasons why the variances of $u_i$ may be variable, some of which are as follows:

    1. Following the error-learning models, as people learn, their errors of behavior become smaller over time.

2. As incomes grow, people have more discretionary income and hence more scope of choice about the disposition of their income.

3. As data collecting techniques improve, $\sigma_i^2$ is likely to decrease.
4. Heteroscedasticity can also arise as a result of the presence of outliers.
5. Another source of heteroscedasticity arises from violating Assumption 9 of the CLRM, namely, that the regression model is correctly specified.
6. Another source of heteroscedasticity is skewness in the distribution of one or more regressors included in the model.

7. Heteroscedasticity can also arise because of (1) incorrect data transformation and (2) incorrect functional form.

    OLS Estimation in the Presence of Heteroscedasticity

In the presence of heteroscedasticity:

$$\operatorname{var}(\hat{\beta}_2) = \frac{\sum x_i^2 \sigma_i^2}{\left(\sum x_i^2\right)^2}$$

whereas under homoscedasticity:

$$\operatorname{var}(\hat{\beta}_2) = \frac{\sigma^2}{\sum x_i^2}$$

The Method of Generalized Least Squares takes such information into account explicitly and is therefore capable of producing estimators that are BLUE.

Difference Between OLS and GLS

OLS minimizes

$$\sum \hat{u}_i^2 = \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)^2$$

GLS minimizes the weighted sum of squares

$$\sum w_i \hat{u}_i^2 = \sum w_i (Y_i - \hat{\beta}_1^{*} X_{0i} - \hat{\beta}_2^{*} X_i)^2$$

where $w_i = 1/\sigma_i^2$ and $X_{0i} = 1$ for each $i$.
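A minimal sketch of this contrast, assuming simulated data in which $\sigma_i^2$ is known: statsmodels' WLS takes the weights $w_i = 1/\sigma_i^2$ directly. The data-generating process here is an assumption for illustration.

```python
# Sketch: OLS vs. weighted least squares when sigma_i^2 is known (simulated).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(1, 10, size=n)
sigma2 = 0.5 * x**2                                # error variance grows with X
y = 2.0 + 3.0 * x + rng.normal(scale=np.sqrt(sigma2))
X = sm.add_constant(x)

ols = sm.OLS(y, X).fit()
wls = sm.WLS(y, X, weights=1.0 / sigma2).fit()     # w_i = 1 / sigma_i^2
print(ols.bse)   # OLS standard errors
print(wls.bse)   # WLS standard errors, typically smaller here
```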

Consequence of Using OLS in the Presence of Heteroscedasticity

OLS Estimation Disregarding Heteroscedasticity

    In short, if we persist in using the usual testing procedures despite heteroscedasticity, whatever conclusions we draw or inferences we make may be very misleading.

    Detection of Heteroscedasticity

1. Informal Methods

Nature of the Problem. Very often the nature of the problem under consideration suggests whether heteroscedasticity is likely to be encountered.

Graphical Method. If there is no a priori or empirical information about the nature of heteroscedasticity, in practice one can do the regression analysis on the assumption that there is no heteroscedasticity and then do a postmortem examination of the squared residuals $\hat{u}_i^2$ to see if they exhibit any systematic pattern.

2. Formal Methods

Park Test. Park formalizes the graphical method by suggesting that $\sigma_i^2$ is some function of the explanatory variable $X_i$.

Glejser Test. After obtaining the residuals $\hat{u}_i$ from the OLS regression, Glejser suggests regressing the absolute values of $\hat{u}_i$ on the X variable that is thought to be closely associated with $\sigma_i^2$.

Spearman's Rank Correlation Test.

$$r_s = 1 - 6\left[\frac{\sum d_i^2}{n(n^2 - 1)}\right]$$

Step 1: Fit the regression to the data on Y and X and obtain the residuals $\hat{u}_i$.

Step 2: Ignoring the sign of $\hat{u}_i$, that is, taking their absolute values $|\hat{u}_i|$, rank both $|\hat{u}_i|$ and $X_i$ (or $\hat{Y}_i$) in ascending or descending order and compute the Spearman's rank correlation coefficient given previously.

Step 3: Assuming that the population rank correlation coefficient $\rho_s$ is zero and $n > 8$, the significance of the sample $r_s$ can be tested by the t test as follows:

$$t = \frac{r_s \sqrt{n - 2}}{\sqrt{1 - r_s^2}}$$

with df = $n - 2$.
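A small sketch of this rank-correlation procedure on hypothetical data, using scipy's spearmanr for the coefficient and the t test above; the data-generating process is an assumption for illustration.

```python
# Sketch: Spearman rank correlation between |residuals| and X.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(2)
n = 50
x = rng.uniform(1, 10, size=n)
y = 2.0 + 3.0 * x + rng.normal(scale=x)          # heteroscedastic errors
res = sm.OLS(y, sm.add_constant(x)).fit()

r_s, _ = stats.spearmanr(np.abs(res.resid), x)   # rank both |u_i| and X_i
t = r_s * np.sqrt(n - 2) / np.sqrt(1 - r_s**2)   # t test with n - 2 df
p = 2 * stats.t.sf(abs(t), df=n - 2)
print(f"r_s = {r_s:.3f}, t = {t:.2f}, p = {p:.4f}")
```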

Goldfeld-Quandt Test. This popular method is applicable if one assumes that the heteroscedastic variance $\sigma_i^2$ is positively related to one of the explanatory variables in the regression model.

Step 1: Order or rank the observations according to the values of $X_i$, beginning with the lowest X value.

Step 2: Omit c central observations, where c is specified a priori, and divide the remaining $(n - c)$ observations into two groups, each of $(n - c)/2$ observations.

Step 3: Fit separate OLS regressions to the first $(n - c)/2$ observations and the last $(n - c)/2$ observations, and obtain the respective residual sums of squares RSS$_1$ and RSS$_2$, RSS$_1$ representing the RSS from the regression corresponding to the smaller $X_i$ values and RSS$_2$ that from the larger $X_i$ values.

Step 4: Compute the ratio

$$\lambda = \frac{\text{RSS}_2/\text{df}}{\text{RSS}_1/\text{df}}$$

If the $u_i$ are assumed to be normally distributed and if the assumption of homoscedasticity is valid, then $\lambda$ follows the F distribution.
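statsmodels ships a Goldfeld-Quandt implementation; the sketch below is illustrative only (the data and parameter choices are assumptions). It sorts by the suspect regressor and drops a central fraction of observations, as in steps 1 and 2.

```python
# Sketch: Goldfeld-Quandt test via statsmodels on simulated data.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_goldfeldquandt

rng = np.random.default_rng(3)
n = 60
x = rng.uniform(1, 10, size=n)
y = 2.0 + 3.0 * x + rng.normal(scale=x)          # variance rises with X
X = sm.add_constant(x)

# Sort by column 1 (the X variable) and drop ~1/6 of the central observations
f_val, p_val, _ = het_goldfeldquandt(y, X, idx=1, drop=1/6)
print(f"F = (RSS2/df)/(RSS1/df) = {f_val:.2f}, p = {p_val:.4f}")
```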

Breusch-Pagan-Godfrey Test

Step 1: Estimate

$$Y_i = \beta_1 + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + u_i$$

by OLS and obtain the residuals $\hat{u}_1, \hat{u}_2, \ldots, \hat{u}_n$.

Step 2: Obtain $\tilde{\sigma}^2 = \sum \hat{u}_i^2 / n$.

Step 3: Construct variables $p_i$ defined as

$$p_i = \hat{u}_i^2 / \tilde{\sigma}^2$$

which is simply each squared residual divided by $\tilde{\sigma}^2$.

Step 4: Regress the $p_i$ thus constructed on the Zs as

  • 8/3/2019 Multi Hetero Auto

    5/8

$$p_i = \alpha_1 + \alpha_2 Z_{2i} + \cdots + \alpha_m Z_{mi} + v_i$$

where $v_i$ is the residual term of this regression.

Step 5: Obtain the ESS (explained sum of squares) and define

$$\Theta = \tfrac{1}{2}(\text{ESS})$$

Assuming the $u_i$ are normally distributed, $\Theta$ asymptotically follows the chi-square distribution with $(m - 1)$ df.
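A sketch of the test using statsmodels' het_breuschpagan on simulated data (the data and the choice of Zs are assumptions here); note that statsmodels reports an LM-form statistic rather than $\Theta = \tfrac{1}{2}\text{ESS}$, so the numbers can differ slightly from the hand computation while leading to the same conclusion asymptotically.

```python
# Sketch: Breusch-Pagan(-Godfrey) test via statsmodels.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(4)
n = 100
x = rng.uniform(1, 10, size=n)
y = 2.0 + 3.0 * x + rng.normal(scale=x)
X = sm.add_constant(x)
res = sm.OLS(y, X).fit()

# The Zs are taken to be the regressors themselves (constant included)
lm, lm_p, f_stat, f_p = het_breuschpagan(res.resid, X)
print(f"LM = {lm:.2f}, p = {lm_p:.4f}")
```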

White's General Heteroscedasticity Test

Step 1: Given the data, we estimate

$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$$

and obtain the residuals $\hat{u}_i$.

Step 2: We then run the following auxiliary regression:

$$\hat{u}_i^2 = \alpha_1 + \alpha_2 X_{2i} + \alpha_3 X_{3i} + \alpha_4 X_{2i}^2 + \alpha_5 X_{3i}^2 + \alpha_6 X_{2i} X_{3i} + v_i$$

Step 3: Under the null hypothesis that there is no heteroscedasticity, it can be shown that the sample size (n) times the $R^2$ obtained from the auxiliary regression asymptotically follows the chi-square distribution with df equal to the number of regressors (excluding the constant) in the auxiliary regression. That is,

$$n \cdot R^2 \;\overset{\text{asy}}{\sim}\; \chi^2_{\text{df}}$$

Step 4: If the chi-square value obtained exceeds the critical chi-square value at the chosen level of significance, the conclusion is that there is heteroscedasticity.
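A sketch of White's test via statsmodels' het_white, which builds the auxiliary regression (levels, squares, and cross products) internally and returns $n \cdot R^2$ with its chi-square p-value; the data below are assumptions for illustration.

```python
# Sketch: White's general heteroscedasticity test via statsmodels.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

rng = np.random.default_rng(5)
n = 100
x2 = rng.uniform(1, 10, size=n)
x3 = rng.uniform(1, 10, size=n)
y = 1.0 + 2.0 * x2 + 3.0 * x3 + rng.normal(scale=x2)
X = sm.add_constant(np.column_stack([x2, x3]))
res = sm.OLS(y, X).fit()

# het_white regresses u_i^2 on levels, squares, and cross products internally
lm, lm_p, f_stat, f_p = het_white(res.resid, X)
print(f"n*R^2 = {lm:.2f}, p = {lm_p:.4f}")
```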

Other Tests of Heteroscedasticity

Koenker-Bassett (KB) Test

Remedial Measures

When $\sigma_i^2$ Is Known: The Method of Weighted Least Squares

When $\sigma_i^2$ Is Not Known

Plausible assumptions about the heteroscedasticity pattern:

Assumption 1: The error variance is proportional to $X_i^2$:

$$E(u_i^2) = \sigma^2 X_i^2$$

Assumption 2: The error variance is proportional to $X_i$ (the square root transformation):

$$E(u_i^2) = \sigma^2 X_i$$

Assumption 3: The error variance is proportional to the square of the mean value of Y:

$$E(u_i^2) = \sigma^2 [E(Y_i)]^2$$

Assumption 4: A log transformation such as

$$\ln Y_i = \beta_1 + \beta_2 \ln X_i + u_i$$

very often reduces heteroscedasticity when compared with the regression $Y_i = \beta_1 + \beta_2 X_i + u_i$.
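As an illustration of Assumption 1 (a sketch on simulated data, not from the original notes), dividing the model through by $X_i$ yields the transformed model $Y_i/X_i = \beta_2 + \beta_1 (1/X_i) + u_i/X_i$, whose error $u_i/X_i$ is homoscedastic:

```python
# Sketch: the deflation transform under Assumption 1, E(u_i^2) = sigma^2 * X_i^2.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 100
x = rng.uniform(1, 10, size=n)
y = 2.0 + 3.0 * x + x * rng.normal(size=n)   # var(u_i) proportional to x_i^2

y_t = y / x                                  # transformed dependent variable
X_t = sm.add_constant(1.0 / x)               # regress on a constant and 1/X
res = sm.OLS(y_t, X_t).fit()
# The constant now estimates beta_2 (~3); the 1/X coefficient estimates beta_1 (~2)
print(res.params)
```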

CHAPTER 12: AUTOCORRELATION: WHAT HAPPENS IF THE ERROR TERMS ARE CORRELATED?

The Nature of the Problem

1. Autocorrelation: correlation between members of a series of observations ordered in time.

2. Specification Bias: Excluded Variables Case. In empirical analysis, the researcher often starts with a plausible regression model that may not be the most perfect one. After the regression analysis, the researcher does the postmortem to find out whether the results accord with a priori expectations.

3. Cobweb Phenomenon. The supply of many agricultural commodities reflects the so-called cobweb phenomenon, where supply reacts to price with a lag of one time period because supply decisions take time to implement.

4. Lags. A regression such as

$$\text{Consumption}_t = \beta_1 + \beta_2 \text{Income}_t + \beta_3 \text{Consumption}_{t-1} + u_t$$

is known as autoregression because one of the explanatory variables is the lagged value of the dependent variable.

5. Manipulation of Data. Another source of manipulation is interpolation or extrapolation of data.

OLS Estimation in the Presence of Autocorrelation

$\rho$ (rho) is known as the coefficient of autocovariance. The scheme

$$u_t = \rho u_{t-1} + \varepsilon_t, \qquad -1 < \rho < 1$$

is the first-order autoregressive scheme.
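A short simulation sketch of this AR(1) scheme (all numbers are assumptions), showing that the first-order sample autocorrelation of $u_t$ comes out close to $\rho$:

```python
# Sketch: simulating the AR(1) error scheme u_t = rho * u_{t-1} + eps_t.
import numpy as np

rng = np.random.default_rng(7)
rho, n = 0.8, 200                        # |rho| < 1 keeps the scheme stationary
eps = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = rho * u[t - 1] + eps[t]

# Sample first-order autocorrelation of u should be near rho
print(np.corrcoef(u[1:], u[:-1])[0, 1])
```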

4. Therefore, the usual t and F tests of significance are no longer valid, and if applied, are likely to give seriously misleading conclusions about the statistical significance of the estimated regression coefficients.

Detecting Autocorrelation

I. Graphical Method

    II. The Runs Test

Mean:

$$E(R) = \frac{2N_1 N_2}{N} + 1$$

Variance:

$$\sigma_R^2 = \frac{2N_1 N_2 (2N_1 N_2 - N)}{N^2 (N - 1)}$$

where $N = N_1 + N_2$ is the total number of observations, $N_1$ the number of + residuals, $N_2$ the number of − residuals, and R the number of runs.

Decision Rule: Do not reject the null hypothesis of randomness with 95% confidence if R, the number of runs, lies in the confidence interval $E(R) \pm 1.96\,\sigma_R$; reject the null hypothesis if the estimated R lies outside these limits.
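The runs test can be computed directly from the formulas above; the sketch below applies them to a deliberately autocorrelated series (the data and the sign cutoff are illustrative assumptions, not from the original notes):

```python
# Sketch: runs test on residual signs, using the mean/variance formulas above.
import numpy as np

rng = np.random.default_rng(8)
resid = np.cumsum(rng.normal(size=100))        # positively autocorrelated series
signs = resid > resid.mean()                   # + / - classification of residuals

N = len(signs)
N1 = int(signs.sum())                          # number of + residuals
N2 = N - N1                                    # number of - residuals
R = 1 + int(np.sum(signs[1:] != signs[:-1]))   # a run ends at each sign change

mean_R = 2 * N1 * N2 / N + 1
var_R = 2 * N1 * N2 * (2 * N1 * N2 - N) / (N**2 * (N - 1))
lo, hi = mean_R - 1.96 * np.sqrt(var_R), mean_R + 1.96 * np.sqrt(var_R)
print(f"R = {R}, 95% interval = [{lo:.1f}, {hi:.1f}]")  # R outside => reject randomness
```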

III. Durbin-Watson d Test

The Durbin-Watson d statistic:

$$d = \frac{\sum_{t=2}^{n} (\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=1}^{n} \hat{u}_t^2}$$

Assumptions underlying the d statistic:
1. The regression model includes the intercept term.
2. The explanatory variables, the Xs, are nonstochastic, or fixed in repeated sampling.
3. The disturbances $u_t$ are generated by the first-order autoregressive scheme: $u_t = \rho u_{t-1} + \varepsilon_t$.
4. The error term $u_t$ is assumed to be normally distributed.
5. The regression model does not include the lagged value(s) of the dependent variable as one of the explanatory variables.
6. There are no missing observations in the data.

As a rule of thumb, if d is found to be 2 in an application, one may assume that there is no first-order autocorrelation, either positive or negative.

Mechanics of the Durbin-Watson Test

1) Run the OLS regression and obtain the residuals.
2) Compute d.
3) For the given sample size and given number of explanatory variables, find the critical $d_L$ and $d_U$ values.
4) Now follow the decision rules of the Durbin-Watson d test.
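A sketch of steps 1 and 2 using statsmodels' durbin_watson helper on simulated AR(1) errors (the data-generating process is an assumption for illustration):

```python
# Sketch: computing the Durbin-Watson d statistic from OLS residuals.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(9)
n = 100
x = rng.normal(size=n)
eps = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):                  # AR(1) errors with rho = 0.7
    u[t] = 0.7 * u[t - 1] + eps[t]
y = 1.0 + 2.0 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
print(durbin_watson(res.resid))        # well below 2 => positive autocorrelation
```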

IV. A General Test of Autocorrelation: The Breusch-Godfrey (BG) Test

Steps:
1. Estimate $Y_t = \beta_1 + \beta_2 X_t + u_t$ by OLS and obtain the residuals $\hat{u}_t$.
2. Regress $\hat{u}_t$ on the original $X_t$ and $\hat{u}_{t-1}, \hat{u}_{t-2}, \ldots, \hat{u}_{t-p}$, where the latter are the lagged values of the estimated residuals from step 1.
3. If the sample size is large, Breusch and Godfrey have shown that

$$(n - p) R^2 \sim \chi^2_p$$
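statsmodels implements the BG test as acorr_breusch_godfrey; in the sketch below (simulated data, with names and numbers assumed) it returns an LM statistic with its chi-square p-value, asymptotically equivalent to the $(n - p)R^2$ form above.

```python
# Sketch: Breusch-Godfrey test via statsmodels on simulated AR(1) errors.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(10)
n = 200
x = rng.normal(size=n)
eps = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + eps[t]
y = 1.0 + 2.0 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
lm, lm_p, f_stat, f_p = acorr_breusch_godfrey(res, nlags=2)  # p = 2 lagged residuals
print(f"LM = {lm:.2f}, p = {lm_p:.4f}")
```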

What To Do When You Find Autocorrelation: Remedial Measures

1. Try to find out if the autocorrelation is pure autocorrelation and not the result of misspecification of the model.
2. If it is pure autocorrelation, one can use an appropriate transformation of the original model so that in the transformed model we do not have the problem of autocorrelation.
3. In large samples, we can use the Newey-West method to obtain standard errors of OLS estimators that are corrected for autocorrelation.
4. In some situations we can continue to use the OLS method.
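A sketch of remedy 3 on simulated data: statsmodels computes Newey-West (HAC) standard errors through the cov_type option; the lag choice maxlags=4 and the data here are assumptions for illustration.

```python
# Sketch: Newey-West (HAC) standard errors for OLS via statsmodels.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
n = 200
x = rng.normal(size=n)
eps = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):                  # AR(1) errors with rho = 0.6
    u[t] = 0.6 * u[t - 1] + eps[t]
y = 1.0 + 2.0 * x + u
X = sm.add_constant(x)

ols = sm.OLS(y, X).fit()
hac = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 4})
print(ols.bse)   # conventional standard errors
print(hac.bse)   # autocorrelation-corrected (Newey-West) standard errors
```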