financial econometrics – 2014, dr. kashif saleem (lut) 1 financial econometrics dr. kashif saleem...

Financial Econometrics – 2014, Dr. Kashif Saleem (LUT)1

Financial Econometrics

Dr. Kashif Saleem

Associate Professor (Finance)

Lappeenranta School of Business


Lecture 3 Agenda

Further development and analysis of the classical linear regression model

•Generalising the simple model to multiple linear regression •Testing multiple hypotheses: the F-test •Data mining•Goodness of fit statistics •Multifactor Models


Generalising the Simple Model to Multiple Linear Regression

• Before, we have used the model t = 1,2,...,T

• But what if our dependent (y) variable depends on more than one independent variable? For example the number of cars sold might plausibly depend on

1. the price of cars2. the price of public transport3. the price of petrol4. the extent of the public’s concern about global warming

• Similarly, stock returns might depend on several factors.• Having just one independent variable is no good in this case - we want to

have more than one x variable. It is very easy to generalise the simple model to one with k-1 regressors (independent variables).

ttt uxy


Multiple Regression and the Constant Term

• Now we write

, t=1,2,...,T

• Where is x1? It is the constant term. In fact the constant term is usually represented by a column of ones of length T:

1 is the coefficient attached to the constant term (which we called before).

tktkttt uxxxy ...33221

1

1

1

1 x


Different Ways of Expressing the Multiple Linear Regression Model

• We could write out a separate equation for every value of t:

• We can write this in matrix form

y = X +u

where y is T 1

X is T k

is k 1

u is T 1

TkTkTTT

kk

kk

uxxxy

uxxxy

uxxxy

...

...

...

33221

2232322212

1131321211


Inside the Matrices of the Multiple Linear Regression Model

• e.g. if k is 2, we have 2 regressors, one of which is a column of ones:

T 1 T2 21 T1

• Notice that the matrices written in this way are conformable.

TTT u

u

u

x

x

x

y

y

y

2

1

2

1

2

22

21

2

1

1

1

1


How Do We Calculate the Parameters (the ) in this Generalised Case?

• Previously, we took the residual sum of squares, and minimised it w.r.t. and .

• In the matrix notation, we have

• The RSS would be given by

Tu

u

u

u

ˆ

ˆ

ˆ

ˆ 2

1

2222

21

2

1

21 ˆˆ...ˆˆ

ˆ

ˆ

ˆ

ˆˆˆˆ'ˆ tT

T

T uuuu

u

u

u

uuuuu


The OLS Estimator for the Multiple Regression Model

• In order to obtain the parameter estimates, 1, 2,..., k, we would minimise the RSS with respect to all the s.

• It can be shown that

yXXX

k

12

1

)(

ˆ

ˆ

ˆ

ˆ


Calculating the Standard Errors for the Multiple Regression Model

• Check the dimensions: is k 1 as required.

• But how do we calculate the standard errors of the coefficient estimates?

• Previously, to estimate the variance of the errors, 2, we used .

• Now using the matrix notation, we use

• where k = number of regressors. It can be proved that the OLS estimator of the variance of is given by the diagonal elements of , so that the variance of is the first element, the variance of is the second element, and …, and the variance of is the kth diagonal element.

su u

T k2

'

s X X2 1( ' )

1 2 k

2

ˆ22

T

us t


Calculating Parameter and Standard Error Estimates for Multiple Regression Models: An Example

• Example: The following model with k=3 is estimated over 15 observations: and the following data have been calculated from the original X’s.

Calculate the coefficient estimates and their standard errors.• To calculate the coefficients, just multiply the matrix by the vector to obtain

.• To calculate the standard errors, we need an estimate of 2.

( ' ). . .. . .. . .

, ( ' ).

.

., ' .X X X y u u

12 0 35 1035 10 6510 65 4 3

302 206

1096

uxxy 33221

yXXX '' 1


Calculating Parameter and Standard Error Estimates for Multiple Regression Models: An Example (cont’d)

• The variance-covariance matrix of is given by

• The variances are on the leading diagonal:

• We write:

Var SE

Var SE

Var SE

( ) . ( ) .

( ) . ( ) .

( ) . ( ) .

1 1

2 2

3 3

183 135

0 91 0 96

3 93 198

98.196.035.1

88.1940.410.1ˆ 32 tt xxy


Testing Multiple Hypotheses: The F-test

• We used the t-test to test single hypotheses, i.e. hypotheses involving only one coefficient. But what if we want to test more than one coefficient simultaneously?

• We do this using the F-test. The F-test involves estimating 2 regressions.

• The unrestricted regression is the one in which the coefficients are freely determined by the data, as we have done before.

• The restricted regression is the one in which the coefficients are restricted, i.e. the restrictions are imposed on some s.


The F-test: Restricted and Unrestricted Regressions

• Example


Calculating the F-Test Statistic

• The test statistic is given by

where URSS = RSS from unrestricted regression

RRSS = RSS from restricted regression

m = number of restrictions

T = number of observations

k = number of regressors in unrestricted regression including a constant in the unrestricted regression (or the total number of parameters to be estimated).

test statisticRRSS URSS

URSST k

m


The F-Distribution

• The test statistic follows the F-distribution, which has 2 d.f. parameters.

• The value of the degrees of freedom parameters are m and (T-k) respectively (the order of the d.f. parameters is important).

• The appropriate critical value will be in column m, row (T-k).

• The F-distribution has only positive values and is not symmetrical. We therefore only reject the null if the test statistic > critical F-value.


Determining the Number of Restrictions in an F-test

• Examples :

H0: hypothesis No. of restrictions, m

1 + 2 = 2 ?

2 = 1 and 3 = -1 ?

2 = 0, 3 = 0 and 4 = 0 ?• If the model is yt = 1 + 2x2t + 3x3t + 4x4t + ut,

then the null hypothesis

H0: 2 = 0, and 3 = 0 and 4 = 0 is tested by the regression F-statistic.

What it means????

• Note the form of the alternative hypothesis for all tests when more than one restriction is involved: H1: 2 0, or 3 0 or 4 0


What we Cannot Test with Either an F or a t-test

• We cannot test using this framework hypotheses which are not linear

or which are multiplicative, e.g.

Example …..

cannot be tested.


F-test Example

• Question: Suppose a researcher wants to test whether the returns on a company stock (y) show unit sensitivity to two factors (factor x2 and factor x3) among three considered. The regression is carried out on 144 monthly observations.

• The regression is yt = 1 + 2x2t + 3x3t + 4x4t+ ut

- What are the restricted and unrestricted regressions?

- If the two RSS are 436.1 and 397.2 respectively, perform the test.

• Solution:

Conclusion:


Data Mining

• Data mining is searching many series for statistical relationships without theoretical justification.

• For example, suppose we generate one dependent variable and twenty explanatory variables completely randomly and independently of each other.

• If we regress the dependent variable separately on each independent variable, on average one slope coefficient will be significant at 5%.

• If data mining occurs, the true significance level will be greater than the nominal significance level.


Goodness of Fit Statistics

• We would like some measure of how well our regression model actually fits the data.

• We have goodness of fit statistics to test this: i.e. how well the sample regression function (srf) fits the data.

• The most common goodness of fit statistic is known as R2. One way to define R2 is to say that it is the square of the correlation coefficient between y and .

• For another explanation, recall that what we are interested in doing is explaining the variability of y about its mean value, , i.e. the total sum of squares, TSS:

• We can split the TSS into two parts, the part which we have explained (known as the explained sum of squares, ESS) and the part which we did not explain using the model (the RSS).

y

t

t yyTSS 2

y


Defining R2

• That is, TSS = ESS + RSS

• Our goodness of fit statistic is

• But since TSS = ESS + RSS, we can also write

• R2 must always lie between zero and one. To understand this, consider two extremesRSS = TSS i.e. ESS = 0 so R2 = ESS/TSS = 0ESS = TSS i.e. RSS = 0 so R2 = ESS/TSS = 1

RESSTSS

2

RESS

TSS

TSS RSS

TSS

RSS

TSS2 1

t

tt t

tt uyyyy 222 ˆˆ


The Limit Cases: R2 = 0 and R2 = 1

ty

y

tx

ty

tx


Problems with R2 as a Goodness of Fit Measure

• There are a number of them:

1. R2 is defined in terms of variation about the mean of y so that if a model is reparameterised (rearranged) and the dependent variable changes, R2 will change.

2. R2 never falls if more regressors are added. to the regression, e.g. consider:

Regression 1: yt = 1 + 2x2t + 3x3t + ut

Regression 2: y = 1 + 2x2t + 3x3t + 4x4t + ut

R2 will always be at least as high for regression 2 relative to regression 1.

3. R2 quite often takes on values of 0.9 or higher for time series regressions.


Adjusted R2

• In order to get around these problems, a modification is often made which takes into account the loss of degrees of freedom associated with adding extra variables. This is known as , or adjusted R2:

• So if we add an extra regressor, k increases and unless R2 increases by a more than offsetting amount, will actually fall.

• There are still problems with the criterion:1. A “soft” rule

2. No distribution for or R2

2R

)1(1

1 22 RkT

TR

2R

2R

Multifactor Models

• Classic References: Chen, Roll and Ross’ (1986) , "Economic forces and the stock market“

• Examine the relation between (macro) economic state variables and stock returns

• Data:

• Monthly returns on 20 EW size-sorted portfolios, 1953-1983


Macro variables:


Example: Chen, Roll and Ross’ (1986)

• Factors:– • growth in industrial production (MP).– • change in expected inflation (DEI).– • unexpected inflation (UI).– • unexpected changes in risk premiums (UPR).– • unexpected changes in term structure slope (UTS).– • Also include EV and VW market indexes in the regressions.

Methodology: Fama-MacBeth’s two pass estimation of the slopes.

Findings: Statistical significant slopes for MP, UI, and UPR (this factors are priced!). Market indexes not significant.


Macroeconomic variables that are employed in previous APT tests


Example: Exam Question


Dependent Variable

Explanatory Variable

Expected sign

Average Rating

Moody’s Rating

S&P Rating

Moody’s / S&P Difference

Intercept ? 1.442 (0.663)

3.408 (1.379)

-0.524 (-0.223)

3.932** (2.521)

Per capita income + 1.242*** (5.302)

1.027*** (4.041)

1.458*** (6.048)

-0.431*** (-2.688)

GDP growth + 0.151 (1.935)

0.130 (1.545)

0.171** (2.132)

-0.040 (0.756)

Inflation - -0.611*** (-2.839)

-0.630*** (-2.701)

-0.591*** (2.671)

-0.039 (-0.265)

Fiscal Balance + 0.073 (1.324)

0.049 (0.818)

0.097* (1.71)

-0.048 (-1.274)

External Balance + 0.003 (0.314)

0.006 (0.535)

0.001 (0.046)

0.006 (0.779)

External Debt - -0.013*** (-5.088)

-0.015*** (-5.365)

-0.011*** (-4.236)

-0.004*** (-2.133)

Development dummy + 2.776*** (4.25)

2.957*** (4.175)

2.595*** (3.861)

0.362 (0.81)

Default dummy - -2.042*** (-3.175)

-1.63** (-2.097)

-2.622*** (-3.962)

1.159*** (2.632)

Adjusted R2 0.924 0.905 0.926 0.836 Notes: t-ratios in parentheses; *, **, and *** indicate significance at the 10%, 5% and 1% levels respectively. Source: Cantor and Packer (1996). Reprinted with permission from Institutional Investor.

Home Excercise

• Chose a market index of your interest, for example, BSE India (Bombay stock exchange) as your dependent variable. Collect at least five macroeconomic variables of the same market ( for example, inflation, exchange rates, industrial production, interest rates, money supply, etc.). Use multifactor model of APT type. Interpret the results

• Note: Data for the macro economic variables can be accessed from the state banks website freely, stock market data is freely available from the concerned stock exchange website or yahoo finance. Use monthly data of at least 5 years length


financial econometrics – 2014, dr. kashif saleem (lut) 1 financial econometrics dr. kashif saleem...

Documents