1
PREDICTION
In the previous sequence, we saw how to predict the price of a good or asset given the composition of its characteristics. In this sequence, we discuss the properties of such predictions.
True model: $P_i = \beta_1 + \sum_{j=2}^{k} \beta_j X_{ji} + u_i$
Fitted model: $\hat{P}_i = b_1 + \sum_{j=2}^{k} b_j X_{ji}$
2
Suppose that, given a sample of n observations, we have fitted a pricing model with k – 1 characteristics, as shown.
3
Suppose now that one encounters a new variety of the good with characteristics $\{X_2^*, X_3^*, \ldots, X_k^*\}$. Given the sample regression result, it is natural to predict that the price of the new variety should be given by the prediction equation shown.
Prediction conditional on $X_2^*, X_3^*, \ldots, X_k^*$: $\hat{P}^* = b_1 + \sum_{j=2}^{k} b_j X_j^*$
4
What can one say about the properties of this prediction? First, it is natural to ask whether it is fair, in the sense of not systematically overestimating or underestimating the actual price. Second, we will be concerned about the likely accuracy of the prediction.
5
We will consider the case where the good has only one relevant characteristic and suppose that we have fitted the simple regression model shown. Hence, given a new variety of the good with characteristic $X = X^*$, the model gives us the predicted price.
True model: $P_i = \beta_1 + \beta_2 X_i + u_i$
Fitted model: $\hat{P}_i = b_1 + b_2 X_i$
Prediction conditional on $X^*$: $\hat{P}^* = b_1 + b_2 X^*$
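The fit-then-predict step can be sketched in a few lines of Python. This is only an illustration: the sample below is hypothetical (it lies exactly on the line P = 5 + 2X so that the result is easy to check by hand), and the function names `fit_simple_ols` and `predict` are not from the slideshow.

```python
def fit_simple_ols(x, p):
    """Return (b1, b2) for the fitted line P_hat = b1 + b2 * X."""
    n = len(x)
    x_bar = sum(x) / n
    p_bar = sum(p) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxp = sum((xi - x_bar) * (pi - p_bar) for xi, pi in zip(x, p))
    b2 = sxp / sxx              # slope estimate
    b1 = p_bar - b2 * x_bar     # intercept estimate
    return b1, b2


def predict(b1, b2, x_star):
    """Predicted price for a new variety with characteristic X = x_star."""
    return b1 + b2 * x_star


# Hypothetical sample lying exactly on P = 5 + 2X (no disturbance term)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
p = [7.0, 9.0, 11.0, 13.0, 15.0]

b1, b2 = fit_simple_ols(x, p)
p_hat_star = predict(b1, b2, 6.0)
print(b1, b2, p_hat_star)
```

With real data the observations would not lie exactly on a line, and the prediction would differ from the actual price by the prediction error discussed next.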
6
We will assume that the model applies to the new good and therefore the actual price, conditional on $X = X^*$, is generated as shown, where $u^*$ is the value of the disturbance term for the new good.
Actual value of $P^*$: $P^* = \beta_1 + \beta_2 X^* + u^*$
7
We will define the prediction error of the model, PE, as the difference between the actual price and the predicted price.
$PE = P^* - \hat{P}^*$
8
Substituting for the actual and predicted prices, the prediction error is as shown.
$PE = P^* - \hat{P}^* = (\beta_1 + \beta_2 X^* + u^*) - (b_1 + b_2 X^*)$
9
We take expectations.
$$
\begin{aligned}
E(PE) &= E(\beta_1 + \beta_2 X^* + u^*) - E(b_1 + b_2 X^*) \\
      &= \beta_1 + \beta_2 X^* + E(u^*) - E(b_1) - X^* E(b_2) \\
      &= \beta_1 + \beta_2 X^* - \beta_1 - \beta_2 X^* = 0
\end{aligned}
$$
10
$\beta_1$ and $\beta_2$ are assumed to be fixed parameters, so they are not affected by taking expectations. Likewise, $X^*$ is assumed to be a fixed quantity and unaffected by taking expectations. However, $u^*$, $b_1$ and $b_2$ are random variables.
11
$E(u^*) = 0$ because $u^*$ is randomly drawn from the distribution for $u$, which we have assumed to have zero population mean. Under the usual OLS assumptions, $b_1$ will be an unbiased estimator of $\beta_1$ and $b_2$ an unbiased estimator of $\beta_2$.
12
Hence the expectation of the prediction error is zero. The result generalizes easily to the case where there are multiple characteristics and the new good embodies a new combination of them.
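A small Monte Carlo experiment gives a rough check on the zero-expectation result. The sketch below is an illustration under assumed values, none of which come from the slideshow: true parameters 10 and 2, X taking the values 1 to 20, X* = 25, and u drawn from N(0, 1). The function names are hypothetical.

```python
import random


def fit_simple_ols(x, p):
    """Return (b1, b2) for the fitted line P_hat = b1 + b2 * X."""
    n = len(x)
    x_bar = sum(x) / n
    p_bar = sum(p) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxp = sum((xi - x_bar) * (pi - p_bar) for xi, pi in zip(x, p))
    b2 = sxp / sxx
    return p_bar - b2 * x_bar, b2


def mean_prediction_error(beta1, beta2, x, x_star, reps, seed=0):
    """Average of PE = P* - P_hat* over repeated samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        # Fresh sample from the assumed true model, u ~ N(0, 1)
        p = [beta1 + beta2 * xi + rng.gauss(0.0, 1.0) for xi in x]
        b1, b2 = fit_simple_ols(x, p)
        # Actual price of the new good, with its own disturbance u*
        p_star = beta1 + beta2 * x_star + rng.gauss(0.0, 1.0)
        total += p_star - (b1 + b2 * x_star)
    return total / reps


x = [float(i) for i in range(1, 21)]
mean_pe = mean_prediction_error(10.0, 2.0, x, x_star=25.0, reps=5000)
print(mean_pe)
```

The average should be close to zero, though not exactly zero in any finite number of replications.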
13
The population variance of the prediction error is given by the expression shown. Unsurprisingly, this implies that the further $X^*$ is from the sample mean of $X$, the larger will be the population variance of the prediction error.
Variance of prediction error:
$$
\sigma_{PE}^2 = \left\{ 1 + \frac{1}{n} + \frac{(X^* - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right\} \sigma_u^2
$$
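The variance expression is easy to evaluate directly. The sketch below assumes a hypothetical sample with X = 1, ..., 20 and a disturbance variance of 1, and compares the variance at the sample mean with the variance well outside the sample range; `pe_variance` is an illustrative name.

```python
def pe_variance(x, x_star, sigma_u2):
    """Population variance of the prediction error at X = x_star."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return (1.0 + 1.0 / n + (x_star - x_bar) ** 2 / sxx) * sigma_u2


x = [float(i) for i in range(1, 21)]       # hypothetical sample, mean 10.5
var_at_mean = pe_variance(x, 10.5, 1.0)    # X* equal to the sample mean
var_far = pe_variance(x, 25.0, 1.0)        # X* far from the sample mean
print(var_at_mean, var_far)
```

At the sample mean the third term vanishes, leaving (1 + 1/n) times the disturbance variance; moving X* away from the mean adds the squared-distance term.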
14
It also implies, again unsurprisingly, that the larger the sample, the smaller will be the population variance of the prediction error, with a lower limit of $\sigma_u^2$.
15
Provided that the regression model assumptions are valid, $b_1$ and $b_2$ will tend to their true values as the sample becomes large, so the only source of error in the prediction will be $u^*$, and by definition this has population variance $\sigma_u^2$.
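The approach to the lower limit can be seen numerically. As an assumed setup (not from the slideshow), the sketch below enlarges the sample by repeating the X values 1, ..., 20 and evaluates the variance of the prediction error at X* = 25 with a disturbance variance of 1.

```python
def pe_variance(x, x_star, sigma_u2):
    """Population variance of the prediction error at X = x_star."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return (1.0 + 1.0 / n + (x_star - x_bar) ** 2 / sxx) * sigma_u2


base = [float(i) for i in range(1, 21)]
sigma_u2 = 1.0

variances = []
for copies in (1, 10, 100, 1000):
    x = base * copies                      # sample of size 20 * copies
    variances.append(pe_variance(x, 25.0, sigma_u2))

for copies, v in zip((1, 10, 100, 1000), variances):
    print(20 * copies, v)
```

Both the 1/n term and the squared-distance term shrink as the sample grows, so the variance falls towards the disturbance variance but never below it.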
16
The standard error of the prediction error is calculated using the square root of the expression for the population variance, replacing the variance of $u$ with the estimate obtained when fitting the model in the sample period.
Standard error:
$$
\text{s.e.}(PE) = \sqrt{ \left\{ 1 + \frac{1}{n} + \frac{(X^* - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right\} s_u^2 }
$$
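A minimal sketch of the standard error calculation follows. It assumes that the variance of u is estimated in the usual way for simple regression, as the residual sum of squares divided by n - 2; the five-observation sample and the function name are hypothetical.

```python
import math


def prediction_standard_error(x, p, x_star):
    """Standard error of the prediction error at X = x_star."""
    n = len(x)
    x_bar = sum(x) / n
    p_bar = sum(p) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxp = sum((xi - x_bar) * (pi - p_bar) for xi, pi in zip(x, p))
    b2 = sxp / sxx
    b1 = p_bar - b2 * x_bar
    # Assumed estimator of the disturbance variance: RSS / (n - 2)
    rss = sum((pi - b1 - b2 * xi) ** 2 for xi, pi in zip(x, p))
    s_u2 = rss / (n - 2)
    return math.sqrt((1.0 + 1.0 / n + (x_star - x_bar) ** 2 / sxx) * s_u2)


# Hypothetical sample
x = [1.0, 2.0, 3.0, 4.0, 5.0]
p = [7.1, 8.9, 11.2, 12.8, 15.0]

se = prediction_standard_error(x, p, x_star=6.0)
print(round(se, 4))
```

For this sample the fitted line is 5.09 + 1.97X, the residual sum of squares is 0.091, and the standard error at X* = 6 is about 0.2524.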
17
Hence we are able to construct a confidence interval for a prediction. $t_{\text{crit}}$ is the critical level of $t$, given the significance level selected and the number of degrees of freedom, and s.e. is the standard error of the prediction.
$$
\hat{P}^* - t_{\text{crit}} \times \text{s.e.} < P^* < \hat{P}^* + t_{\text{crit}} \times \text{s.e.}
$$
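The interval itself is a one-line calculation once the prediction, its standard error, and the critical value of t are known. In the sketch below all three inputs are hypothetical: a prediction of 16.91 with standard error 0.2524, and 3.182, the tabulated two-sided 5 percent critical value of t with 3 degrees of freedom.

```python
def prediction_interval(p_hat_star, se, t_crit):
    """Return (lower, upper) limits of the confidence interval for P*."""
    half_width = t_crit * se
    return p_hat_star - half_width, p_hat_star + half_width


# Hypothetical inputs; t_crit is looked up in t tables for the chosen
# significance level and degrees of freedom
p_hat_star = 16.91
se = 0.2524
t_crit = 3.182

lower, upper = prediction_interval(p_hat_star, se, t_crit)
print(lower, upper)
```

Taking the critical value as an input keeps the sketch free of any statistics library; in practice it would come from tables or from a t-distribution routine.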
18
The confidence interval has been drawn as a function of $X^*$. As we noted from the mathematical expression, it becomes wider, the greater the distance from $X^*$ to the sample mean.
[Figure: the fitted line $\hat{P} = b_1 + b_2 X$ with the upper and lower limits of the confidence interval for $P^*$, plotted against $X$; $\bar{X}$ and $X^*$ are marked on the horizontal axis.]
19
With multiple explanatory variables, the expression for the prediction variance becomes complex. One point to note is that multicollinearity may not have an adverse effect on prediction precision, even if the estimates of the coefficients have large variances.
20
For simplicity, suppose that there are two explanatory variables, that both have positive true coefficients, and that they are positively correlated, the model being as shown, and that we are predicting the value of $Y^*$, given values $X_2^*$ and $X_3^*$.
$Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + u$
Suppose $X_2$ and $X_3$ are positively correlated, $\beta_2 > 0$, $\beta_3 > 0$.
Then $\operatorname{cov}(b_2, b_3) < 0$.
If $b_2$ is overestimated, $b_3$ is likely to be underestimated.
$(b_2 X_2^* + b_3 X_3^*)$ may be a good estimate of $(\beta_2 X_2^* + \beta_3 X_3^*)$.
Similarly, for other combinations.
21
Then if the effect of $X_2$ is overestimated, so that $b_2 > \beta_2$, the effect of $X_3$ is likely to be underestimated, with $b_3 < \beta_3$. As a consequence, the effects of the errors may to some extent cancel out, with the result that the linear combination $(b_2 X_2^* + b_3 X_3^*)$ may be close to $(\beta_2 X_2^* + \beta_3 X_3^*)$.
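The cancellation can be made explicit with the standard formula for the variance of a linear combination, treating $X_2^*$ and $X_3^*$ as fixed and leaving the intercept term aside:

```latex
\operatorname{var}(b_2 X_2^* + b_3 X_3^*)
  = (X_2^*)^2 \operatorname{var}(b_2)
  + (X_3^*)^2 \operatorname{var}(b_3)
  + 2 X_2^* X_3^* \operatorname{cov}(b_2, b_3)
```

When $X_2^*$ and $X_3^*$ are both positive and $\operatorname{cov}(b_2, b_3) < 0$, the last term is negative and offsets part of the first two, so the combination can be estimated much more precisely than either coefficient on its own.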
22
This will be illustrated with a simulation, with the model and data shown. We fit the model and make the prediction $Y^* = b_1 + b_2 X_2^* + b_3 X_3^*$.
Simulation
$Y = 10 + 2X_2 + 3X_3 + u$
$X_2 = \{1, 2, 3, 4, \ldots, 17, 18, 19, 20\}$
$X_3 = \{2, 2, 4, 4, \ldots, 18, 18, 20, 20\}$
$r_{X_2, X_3} = 0.9962$
$u \sim N(0, 1)$
$Y^* = b_1 + b_2 X_2^* + b_3 X_3^* \approx b_1 + (b_2 + b_3) X_2^*$
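The two explanatory variables can be rebuilt from the patterns given, and their sample correlation checked against the figure quoted. The sketch below uses only the standard library; the helper name `correlation` is illustrative.

```python
import math

# X2 = 1, 2, ..., 20; X3 rounds each X2 value up to the next even number
x2 = [float(i) for i in range(1, 21)]
x3 = [float(i + 1) if i % 2 else float(i) for i in range(1, 21)]


def correlation(a, b):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


r = correlation(x2, x3)
print(round(r, 4))
```

The result agrees with the correlation of 0.9962 stated for the simulation data.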
23
Since $X_2$ and $X_3$ are virtually identical, this may be approximated as $Y^* = b_1 + (b_2 + b_3) X_2^*$.
Thus the predictive accuracy depends on how close $(b_2 + b_3)$ is to $(\beta_2 + \beta_3)$, that is, to 5.
24
The figure shows the distributions of $b_2$ and $b_3$ for 10 million samples. Their distributions have relatively wide variances around their true values, as should be expected, given the multicollinearity. The actual standard deviations of their distributions are 0.45.
[Figure: distributions of $b_2$, $b_3$ and $b_2 + b_3$. Standard deviations of $b_2$ and $b_3$: 0.45; standard deviation of $b_2 + b_3$: 0.04.]
25
The figure also shows the distribution of their sum. As anticipated, it is distributed around 5, but with a much lower standard deviation, 0.04, despite the multicollinearity affecting the point estimates of the individual coefficients.
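The simulated standard deviations can be compared with their theoretical values. The sketch below is not the 10-million-sample simulation itself; it assumes fixed regressors and uses the standard result that, in a model with an intercept, the variances and covariance of the two slope estimators are the disturbance variance times the inverse of the matrix of sums of squares and cross-products of deviations from the means. The helper name `dev_cross` is illustrative.

```python
import math

x2 = [float(i) for i in range(1, 21)]
x3 = [float(i + 1) if i % 2 else float(i) for i in range(1, 21)]
sigma_u2 = 1.0  # u ~ N(0, 1) in the simulation


def dev_cross(a, b):
    """Sum of cross-products of deviations from the sample means."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))


s22 = dev_cross(x2, x2)
s33 = dev_cross(x3, x3)
s23 = dev_cross(x2, x3)
det = s22 * s33 - s23 ** 2

# Elements of sigma_u^2 times the inverse of [[s22, s23], [s23, s33]]
var_b2 = sigma_u2 * s33 / det
var_b3 = sigma_u2 * s22 / det
cov_b2_b3 = -sigma_u2 * s23 / det

sd_b2 = math.sqrt(var_b2)
sd_b3 = math.sqrt(var_b3)
sd_sum = math.sqrt(var_b2 + var_b3 + 2.0 * cov_b2_b3)
print(round(sd_b2, 3), round(sd_b3, 3), round(sd_sum, 3))
```

The theoretical values, roughly 0.45 for each coefficient and 0.04 for their sum, match the figures reported for the simulation, and the negative covariance is what makes the sum so much more precise than its components.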
Copyright Christopher Dougherty 2012.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 3.6 of C. Dougherty,
Introduction to Econometrics, fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.
2012.12.03