Chapter 4 – Nonlinear Models and Transformations of Variables
© Christopher Dougherty 1999–2006
LINEARITY AND NONLINEARITY
Motivating example:
[Figure: scatter diagram of Y against X]
Linear in variables and parameters:
\[ Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + u \]
To introduce the topic of fitting nonlinear regression models, we need a definition of linearity.
The model shown above is linear in two senses. The right side is linear in variables because the variables are included exactly as defined, rather than as functions.
It is also linear in parameters since a different parameter appears as a multiplicative factor in each term.
Linear in parameters, nonlinear in variables:
\[ Y = \beta_1 + \beta_2 X_2^2 + \beta_3 \sqrt{X_3} + \beta_4 \log X_4 + u \]
The second model above is linear in parameters, but nonlinear in variables.
\[ Z_2 = X_2^2, \quad Z_3 = \sqrt{X_3}, \quad Z_4 = \log X_4 \]
\[ Y = \beta_1 + \beta_2 Z_2 + \beta_3 Z_3 + \beta_4 Z_4 + u \]
Such models present no problem at all. Define new variables as shown. With these cosmetic transformations, we have made the model linear in both variables and parameters.
Nonlinear in parameters:
\[ Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_2 \beta_3 X_4 + u \]
This model is nonlinear in parameters since the coefficient of X4 is the product of the coefficients of X2 and X3. As we will see, some models which are nonlinear in parameters can be linearized by appropriate transformations, but this is not one of those.
household    bananas (lbs)    income ($10,000)
                   Y                 X
    1             1.71               1
    2             6.88               2
    3             8.25               3
    4             9.52               4
    5             9.81               5
    6            11.43               6
    7            11.09               7
    8            10.87               8
    9            12.15               9
   10            10.94              10
We will begin with an example of a model that can be linearized by a cosmetic transformation. Suppose that you have data on annual consumption of bananas and annual income for a sample of 10 households.
Here is the scatter diagram.
[Figure: scatter diagram of Y (bananas) against X (income)]
. reg Y X

  Source |       SS       df       MS              Number of obs =      10
---------+------------------------------           F(  1,     8) =   17.44
   Model |  58.8774834     1  58.8774834           Prob > F      =  0.0031
Residual |   27.003764     8   3.3754705           R-squared     =  0.6856
---------+------------------------------           Adj R-squared =  0.6463
   Total |  85.8812475     9  9.54236083           Root MSE      =  1.8372

------------------------------------------------------------------------------
       Y |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
       X |   .8447878   .2022741     4.176   0.003        .378343    1.311233
   _cons |   4.618667   1.255078     3.680   0.006       1.724453    7.512881
------------------------------------------------------------------------------
Here is the output from a linear regression. As you would expect from seeing the scatter diagram, the coefficient of X is highly significant, and the fit, as measured by \(R^2\), is quite good.
Here is the scatter diagram again with the regression line, the fitted values and residuals plotted. The residuals indicate that the model must be misspecified in some way.
If the model is correctly specified, the residuals should be random. Here, we have one negative residual followed by 6 positive ones and 3 negative ones.
[Figure: scatter diagram of Y against X with the regression line, fitted values, and residuals]
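The residual pattern described above can be checked directly from the banana data. The following is a minimal sketch in plain Python, not the Stata procedure used in the slides; the helper name `ols` is an assumption introduced for illustration.

```python
# Banana data from the table above: Y = bananas (lbs), X = income ($10,000).
X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Y = [1.71, 6.88, 8.25, 9.52, 9.81, 11.43, 11.09, 10.87, 12.15, 10.94]


def ols(x, y):
    """Return (intercept, slope) of a simple least squares regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


b1, b2 = ols(X, Y)

# Residual = actual value minus fitted value, in the order of the households.
residuals = [yi - (b1 + b2 * xi) for xi, yi in zip(X, Y)]
signs = "".join("+" if e > 0 else "-" for e in residuals)

print(b1, b2)   # should agree with _cons and the coefficient of X in the output
print(signs)    # one negative, then six positives, then three negatives
```

A run of same-signed residuals like this is what one would expect when a straight line is fitted to a relationship that flattens out.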
Revised model:
\[ Y = \beta_1 + \frac{\beta_2}{X} + u \]
Relating Y to 1/X would be more sensible. Y still increases with X if \(\beta_2 < 0\), but the rate of increase falls and there is an upper limit, \(\beta_1\). After all, you can eat only so many bananas.
This is a nonlinear model, but we can linearize it by defining a new variable Z to be the reciprocal of X.
\[ Z = \frac{1}{X} \]
\[ Y = \beta_1 + \beta_2 Z + u \]
household    bananas (lbs)    income ($10,000)
                   Y                 X              Z
    1             1.71               1            1.00
    2             6.88               2            0.50
    3             8.25               3            0.33
    4             9.52               4            0.25
    5             9.81               5            0.20
    6            11.43               6            0.17
    7            11.09               7            0.14
    8            10.87               8            0.13
    9            12.15               9            0.11
   10            10.94              10            0.10
The next step is to calculate the data for Z from the data for X. All serious regression applications allow you to create new variables from existing ones.
Here is the scatter diagram for Y and Z.
[Figure: scatter diagram of Y against Z]
. g Z=1/X
. reg Y Z

  Source |       SS       df       MS              Number of obs =      10
---------+------------------------------           F(  1,     8) =  286.10
   Model |  83.5451508     1  83.5451508           Prob > F      =  0.0000
Residual |  2.33609666     8  .292012083           R-squared     =  0.9728
---------+------------------------------           Adj R-squared =  0.9694
   Total |  85.8812475     9  9.54236083           Root MSE      =  .54038

------------------------------------------------------------------------------
       Y |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
       Z |  -10.98865   .6496573   -16.915   0.000      -12.48677   -9.490543
   _cons |   12.48354   .2557512    48.811   0.000       11.89378    13.07331
------------------------------------------------------------------------------
Here is the regression output. The first command creates the new variable Z. ("g" is short for "generate".)
The regression equation is as shown.
\[ \hat{Y} = 12.48 - 10.99 Z \]
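For readers without Stata, the same two steps (generate Z, then regress Y on Z) can be sketched in plain Python. This is an illustrative reimplementation using the banana data from the table, not the author's code; the helper names `ols` and `r_squared` are assumptions.

```python
# Banana data from the table above.
X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Y = [1.71, 6.88, 8.25, 9.52, 9.81, 11.43, 11.09, 10.87, 12.15, 10.94]


def ols(x, y):
    """Return (intercept, slope) of a simple least squares regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared(x, y):
    """Return R-squared of the simple regression of y on x."""
    a, b = ols(x, y)
    y_bar = sum(y) / len(y)
    ssr = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ssr / sst


# Cosmetic transformation: Z is the reciprocal of X (the "g Z=1/X" step).
Z = [1 / xi for xi in X]

b1_z, b2_z = ols(Z, Y)   # the "reg Y Z" step
print(b1_z, b2_z)        # close to 12.48 and -10.99
print(r_squared(X, Y), r_squared(Z, Y))   # fit before and after the transformation
```

The model is still fitted by ordinary least squares; only the regressor has changed.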
Here is the scatter diagram again with the regression line plotted.
[Figure: scatter diagram of Y against Z with the regression line \(\hat{Y} = 12.48 - 10.99 Z\)]
Substituting the reciprocal of X for Z and plotting the curve in the original diagram, we get a much better fit.
\[ \hat{Y} = 12.48 - \frac{10.99}{X} \]
[Figure: scatter diagram of Y against X with the fitted curve]
ELASTICITIES AND LOGARITHMIC MODELS
This sequence defines elasticities and shows how one may fit nonlinear models with constant elasticities. First, the general definition of an elasticity.
Elasticity of Y with respect to X is the proportional change in Y per proportional change in X:
\[ \text{elasticity} = \frac{dY/Y}{dX/X} = \frac{dY/dX}{Y/X} \]
[Figure: curve of Y against X, with a point A on the curve and the origin O]
Re-arranging the expression for the elasticity, we can obtain a graphical interpretation. The elasticity at any point on the curve is the ratio of the slope of the tangent at that point to the slope of the line joining the point to the origin.
\[ \text{elasticity} = \frac{dY/Y}{dX/X} = \frac{dY/dX}{Y/X} = \frac{\text{slope of the tangent at } A}{\text{slope of } OA} \]
In this case it is clear that the tangent at A is flatter than the line OA and so the elasticity must be less than 1.
[Figure: curve of Y against X on which the tangent at A is flatter than OA; elasticity < 1]
In this case the tangent at A is steeper than OA and the elasticity is greater than 1.
[Figure: curve of Y against X on which the tangent at A is steeper than OA; elasticity > 1]
In general the elasticity will be different at different points on the function relating Y to X.
\[ Y = \beta_1 + \beta_2 X \]
In the example above, Y is a linear function of X. The tangent at any point coincides with the line itself, so in this case its slope is always \(\beta_2\). The elasticity depends on the slope of the line joining the point to the origin.
\[ \text{elasticity} = \frac{dY/dX}{Y/X} = \frac{\text{slope of the tangent at } A}{\text{slope of } OA} = \frac{\beta_2}{(\beta_1 + \beta_2 X)/X} = \frac{\beta_2}{(\beta_1/X) + \beta_2} \]
[Figure: the line \(Y = \beta_1 + \beta_2 X\) with a point A on it and the line OA from the origin]
OB is flatter than OA, so the elasticity is greater at B than at A. (This ties in with the mathematical expression: \((\beta_1/X) + \beta_2\), the denominator of the elasticity, is smaller at B than at A, assuming that \(\beta_1\) is positive.)
[Figure: the line \(Y = \beta_1 + \beta_2 X\) with points A and B on it, B further from the origin than A, and the lines OA and OB]
\[ Y = \beta_1 X^{\beta_2} \]
However, a function of the type shown above (also known as the constant elasticity model) has the same elasticity for all values of X.
For the numerator of the elasticity expression, we need the derivative of Y with respect to X.
\[ \frac{dY}{dX} = \beta_1 \beta_2 X^{\beta_2 - 1} \]
For the denominator, we need Y/X.
\[ \frac{Y}{X} = \frac{\beta_1 X^{\beta_2}}{X} = \beta_1 X^{\beta_2 - 1} \]
Hence we obtain the expression for the elasticity. This simplifies to \(\beta_2\) and is therefore constant.
\[ \text{elasticity} = \frac{dY/dX}{Y/X} = \frac{\beta_1 \beta_2 X^{\beta_2 - 1}}{\beta_1 X^{\beta_2 - 1}} = \beta_2 \]
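The contrast between the linear function and the constant elasticity function can be illustrated numerically. The sketch below evaluates (dY/dX)/(Y/X) with a finite difference; the parameter values 2.0 and 0.5 are hypothetical, chosen only for illustration.

```python
def elasticity(f, x, h=1e-6):
    """Numerical elasticity (dY/dX) / (Y/X), using a central difference for dY/dX."""
    slope = (f(x + h) - f(x - h)) / (2 * h)
    return slope / (f(x) / x)


# Hypothetical parameter values, not taken from the slides.
beta1, beta2 = 2.0, 0.5


def linear(x):
    """Y = beta1 + beta2 * X: the elasticity varies with X."""
    return beta1 + beta2 * x


def power(x):
    """Y = beta1 * X ** beta2: the elasticity is beta2 at every X."""
    return beta1 * x ** beta2


for x in (1.0, 5.0, 10.0):
    print(x, elasticity(linear, x), elasticity(power, x))
```

For the linear function the printed elasticity should follow \(\beta_2 / ((\beta_1/X) + \beta_2)\), rising with X, while for the power function it should stay at 0.5 throughout.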
By way of illustration, the function will be plotted for a range of values of \(\beta_2\). We will start with a very low value, 0.25.
[Figure: \(Y = \beta_1 X^{\beta_2}\) plotted against X for \(\beta_2 = 0.25\)]
We will increase \(\beta_2\) in steps of 0.25 and see how the shape of the function changes.
[Figures: the same function for \(\beta_2 = 0.50\) and \(\beta_2 = 0.75\)]
When \(\beta_2\) is equal to 1, the curve becomes a straight line through the origin.
[Figures: the same function for \(\beta_2 = 1.00\), 1.25, 1.50, and 1.75]
Note that the curvature can be quite gentle over wide ranges of X. This means that even if the true model is of the constant elasticity form, a linear model may be a good approximation over a limited range.
[Figure: the same function for \(\beta_2 = 2.00\)]
It is easy to fit a constant elasticity function using a sample of observations. You can linearize the model by taking the logarithms of both sides.
\[ Y = \beta_1 X^{\beta_2} \]
\[ \log Y = \log \beta_1 X^{\beta_2} = \log \beta_1 + \log X^{\beta_2} = \log \beta_1 + \beta_2 \log X \]
You thus obtain a linear relationship between Y' and X', as defined. All serious regression applications allow you to generate logarithmic variables from existing ones.
\[ Y' = \beta_1' + \beta_2 X', \quad \text{where } Y' = \log Y, \; X' = \log X, \; \beta_1' = \log \beta_1 \]
The coefficient of X' will be a direct estimate of the elasticity, \(\beta_2\).
The constant term will be an estimate of \(\log \beta_1\). To obtain an estimate of \(\beta_1\), you calculate \(\exp(b_1')\), where \(b_1'\) is the estimate of \(\beta_1'\). (This assumes that you have used natural logarithms, that is, logarithms to base e, to transform the model.)
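The linearization can be tried out on artificial data. In this sketch the data are generated exactly from a constant elasticity function with hypothetical parameters (3.0 and 0.75, with no disturbance term), so the regression of log Y on log X should recover them; the helper `ols` is an assumption introduced for illustration.

```python
import math


def ols(x, y):
    """Return (intercept, slope) of a simple least squares regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical parameters and data, not taken from the slides.
beta1_true, beta2_true = 3.0, 0.75
X = [1.0, 2.0, 4.0, 8.0, 16.0]
Y = [beta1_true * x ** beta2_true for x in X]

# Transform both variables with natural logarithms, then fit a straight line.
LGX = [math.log(x) for x in X]
LGY = [math.log(y) for y in Y]
b1_prime, b2 = ols(LGX, LGY)

# The slope estimates the elasticity; exp(intercept) estimates beta1.
b1 = math.exp(b1_prime)
print(b2, b1)
```

Using `math.log` (base e) for the transformation is what makes `math.exp` the right back-transformation for the intercept.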
Here is a scatter diagram showing annual household expenditure on FDHO, food eaten at home, and EXP, total annual household expenditure, both measured in dollars, for 1995 for a sample of 869 households in the United States (Consumer Expenditure Survey data).
[Figure: scatter diagram of FDHO against EXP]
. reg FDHO EXP

  Source |       SS       df       MS              Number of obs =     869
---------+------------------------------           F(  1,   867) =  381.47
   Model |   915843574     1   915843574           Prob > F      =  0.0000
Residual |  2.0815e+09   867  2400831.16           R-squared     =  0.3055
---------+------------------------------           Adj R-squared =  0.3047
   Total |  2.9974e+09   868  3453184.55           Root MSE      =  1549.5

------------------------------------------------------------------------------
    FDHO |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
     EXP |   .0528427   .0027055    19.531   0.000       .0475325    .0581529
   _cons |   1916.143   96.54591    19.847   0.000       1726.652    2105.634
------------------------------------------------------------------------------
Here is a regression of FDHO on EXP. It is usual to relate types of consumer expenditure to total expenditure, rather than income, when using household data. Household income data tend to be relatively erratic.
The regression implies that, at the margin, 5 cents out of each dollar of expenditure is spent on food at home. Does this seem plausible? Probably, though possibly a little low.
It also suggests that $1,916 would be spent on food at home if total expenditure were zero. Obviously this is impossible. It might be possible to interpret it somehow as baseline expenditure, but we would need to take into account family size and composition.
Here is the regression line plotted on the scatter diagram.
[Figure: scatter diagram of FDHO against EXP with the linear regression line]
We will now fit a constant elasticity function using the same data. The scatter diagram shows the logarithm of FDHO plotted against the logarithm of EXP.
[Figure: scatter diagram of LGFDHO against LGEXP]
. g LGFDHO = ln(FDHO)
. g LGEXP = ln(EXP)
. reg LGFDHO LGEXP

  Source |       SS       df       MS              Number of obs =     868
---------+------------------------------           F(  1,   866) =  396.06
   Model |  84.4161692     1  84.4161692           Prob > F      =  0.0000
Residual |  184.579612   866  .213140429           R-squared     =  0.3138
---------+------------------------------           Adj R-squared =  0.3130
   Total |  268.995781   867  .310260416           Root MSE      =  .46167

------------------------------------------------------------------------------
  LGFDHO |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   LGEXP |   .4800417   .0241212    19.901   0.000       .4326988    .5273846
   _cons |   3.166271    .244297    12.961   0.000       2.686787    3.645754
------------------------------------------------------------------------------
Here is the result of regressing LGFDHO on LGEXP. The first two commands generate the logarithmic variables.
The estimate of the elasticity is 0.48. Does this seem plausible?
Yes, definitely. Food is a normal good, so its elasticity should be positive, but it is a basic necessity. Expenditure on it should grow less rapidly than expenditure generally, so its elasticity should be less than 1.
The intercept has no substantive meaning. To obtain an estimate of \(\beta_1\), we calculate \(e^{3.17}\), which is 23.8.
\[ \widehat{LGFDHO} = 3.17 + 0.48\, LGEXP \]
\[ \widehat{FDHO} = 23.8\, EXP^{0.48} \]
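The back-transformation of the intercept is a one-line calculation. The sketch below uses the coefficients reported in the regression output; the function name `fdho_hat` is an assumption introduced for illustration. It also shows that the figure 23.8 corresponds to the intercept rounded to 3.17, and that the unrounded intercept gives a slightly smaller value.

```python
import math

# Coefficients as reported in the output of "reg LGFDHO LGEXP".
intercept = 3.166271    # _cons, an estimate of log(beta1)
elasticity = 0.4800417  # coefficient of LGEXP, an estimate of beta2

b1_rounded = math.exp(3.17)     # about 23.8, using the intercept rounded to 2 decimals
b1_full = math.exp(intercept)   # about 23.7, using the intercept as reported


def fdho_hat(exp_total):
    """Fitted food-at-home expenditure for a given total expenditure (dollars)."""
    return b1_full * exp_total ** elasticity


print(b1_rounded, b1_full)
print(fdho_hat(20000.0), fdho_hat(40000.0))
```

Because the elasticity is below 1, doubling total expenditure in `fdho_hat` multiplies the fitted food expenditure by 2 raised to the power 0.48, that is, by less than 2.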
Here is the scatter diagram with the regression line plotted.
[Figure: scatter diagram of LGFDHO against LGEXP with the regression line]
Here, we compare the regression line from the logarithmic regression (plotted in the original scatter diagram), to the linear regression line.
You can see that the logarithmic regression line gives a somewhat better fit, especially at low levels of expenditure.
[Figure: scatter diagram of FDHO against EXP with the linear regression line and the curve from the logarithmic regression]
However, the difference in the fit is not dramatic. The main reason for preferring the constant elasticity model is that it makes more sense theoretically. It also has a technical advantage that we will come to later on (when discussing heteroscedasticity).
SEMILOGARITHMIC MODELS
This sequence introduces the semilogarithmic model and shows how it may be applied to an earnings function. The dependent variable is linear but the explanatory variables, multiplied by their coefficients, are exponents of e.
\[ Y = \beta_1 e^{\beta_2 X} \]
The derivative of Y with respect to X simplifies to \(\beta_2 Y\).
\[ \frac{dY}{dX} = \beta_1 \beta_2 e^{\beta_2 X} = \beta_2 Y \]
Hence the proportional change in Y per unit change in X is equal to \(\beta_2\). It is therefore independent of the value of X.
\[ \frac{dY/Y}{dX} = \beta_2 \]
Strictly speaking, this interpretation is valid only for small changes in X. When changes are not small, the interpretation may be a little more complex.
The semilogarithmic model is generally used to fit earnings functions and we will use an earnings function as an example.
\[ EARNINGS = \beta_1 e^{\beta_2 S} \]
In this example, we will relate earnings to years of schooling, measured as highest grade completed. This is a discrete variable, changing in whole units. We will consider the effect of increasing S by one year to S'.
\[ EARNINGS' = \beta_1 e^{\beta_2 S'}, \quad \text{where } S' = S + 1 \]
We will relate earnings with S' years of schooling to earnings with S years of schooling.
\[ EARNINGS' = \beta_1 e^{\beta_2 S'} = \beta_1 e^{\beta_2 (S + 1)} = \beta_1 e^{\beta_2 S} e^{\beta_2} = EARNINGS \, e^{\beta_2} = EARNINGS \left(1 + \beta_2 + \frac{\beta_2^2}{2!} + \dots \right) \]
The right side of the equation can be rewritten as shown. Thus in this model, an extra year of schooling causes earnings to be multiplied by a factor \(e^{\beta_2}\).
The final line replaces \(e^{\beta_2}\) with its mathematical definition, the power series. If \(\beta_2\) is small, \(\beta_2^2\) will be very small and can be neglected. In that case, the right side of the equation simplifies to \(EARNINGS \, (1 + \beta_2)\) and the original marginal interpretation of \(\beta_2\) still applies.
\[ Y = \beta_1 e^{\beta_2 X} \]
\(\beta_1\) is the value of Y when X is equal to zero (note that \(e^0\) is equal to 1).
\[ X = 0 \;\Rightarrow\; Y = \beta_1 e^0 = \beta_1 \]
To fit a function of this type, you take logarithms of both sides. The right side of the equation becomes a linear function of X (note that the logarithm of e, to base e, is 1).
\[ Y = \beta_1 e^{\beta_2 X} \]
\[ \log Y = \log \beta_1 e^{\beta_2 X} = \log \beta_1 + \log e^{\beta_2 X} = \log \beta_1 + \beta_2 X \log e = \beta_1' + \beta_2 X, \quad \text{where } \beta_1' = \log \beta_1 \]
Hence we can fit the model with a linear regression, regressing log Y on X.
. reg LGEARN S

      Source |       SS       df       MS              Number of obs =     540
-------------+------------------------------           F(  1,   538) =  140.05
       Model |  38.5643833     1  38.5643833           Prob > F      =  0.0000
    Residual |   148.14326   538  .275359219           R-squared     =  0.2065
-------------+------------------------------           Adj R-squared =  0.2051
       Total |  186.707643   539   .34639637           Root MSE      =  .52475

------------------------------------------------------------------------------
      LGEARN |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           S |   .1096934   .0092691    11.83   0.000     .0914853    .1279014
       _cons |   1.292241   .1287252    10.04   0.000     1.039376    1.545107
------------------------------------------------------------------------------
Here is the regression output from a regression using Data Set 21. The estimate of \(\beta_2\) is 0.110. As an approximation, this implies that an extra year of schooling increases earnings by a proportion 0.110, that is, 11.0%.
If we take account of the fact that a year of schooling is not a marginal change, and work out the effect exactly, the increase is 11.6%.
\[ EARNINGS = \beta_1 e^{0.110 S} \]
\[ EARNINGS' = \beta_1 e^{0.110 S'}, \quad \text{where } S' = S + 1 \]
\[ EARNINGS' = \beta_1 e^{0.110 (S + 1)} = \beta_1 e^{0.110 S} e^{0.110} = EARNINGS \, e^{0.110} = EARNINGS \, (1 + 0.110 + 0.006 + \dots) \]
When \(\beta_2\) is less than 0.1, there is little to be gained by working out the effect exactly.
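The gap between the approximate and the exact interpretation can be tabulated for a few coefficient values. In this sketch only 0.110 comes from the regression output; the other values are hypothetical, included to show how the gap grows with the size of the coefficient.

```python
import math


def exact_effect(beta2):
    """Exact proportional change in Y for a one-unit increase in X: exp(beta2) - 1."""
    return math.exp(beta2) - 1.0


# 0.110 is the estimate from the slides; the other values are hypothetical.
for beta2 in (0.05, 0.110, 0.20, 0.30):
    approx_pct = 100 * beta2
    exact_pct = 100 * exact_effect(beta2)
    print(f"beta2 = {beta2:.3f}   approximate {approx_pct:5.1f}%   exact {exact_pct:5.1f}%")
```

For the schooling coefficient the exact figure rounds to 11.6%, as stated above, against 11.0% from the approximation.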
The intercept in the regression is an estimate of \(\log \beta_1\). From it, we obtain an estimate of \(\beta_1\) equal to \(e^{1.29}\), which is 3.64.
\[ \log b_1 = 1.292 \]
\[ b_1 = e^{1.292} = 3.64 \]
Literally this implies that a person with no schooling would earn $3.64 per hour. Note: it is dangerous to extrapolate so far from the range for which we have data.
Here is the scatter diagram with the semilogarithmic regression.
[Figure: logarithm of hourly earnings against years of schooling, with the semilogarithmic regression line]
Here is the semilogarithmic regression line plotted in a scatter diagram with the untransformed data, with the linear regression shown for comparison.
[Figure: hourly earnings ($) against years of schooling, with the semilogarithmic regression curve and the linear regression line]
There is not much difference in the fit of the regression lines, but the semilogarithmic regression is more satisfactory in two respects.
• First, the linear specification predicts that earnings will increase by about $2 per hour with each additional year of schooling, which is implausible for high levels of education. The semilogarithmic specification allows the increment to increase with level of education.
• Second, the linear specification predicts negative earnings for an individual with no schooling. The semilogarithmic specification predicts hourly earnings of $3.64, which at least is not obvious nonsense.
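Both points can be checked from the fitted semilogarithmic equation. The sketch below uses the coefficients reported in the output of the regression of LGEARN on S; the function name `earnings_hat` and the schooling levels 8, 12, and 16 are assumptions chosen for illustration.

```python
import math

# Coefficients as reported in the output of "reg LGEARN S".
B1_LOG = 1.292241   # _cons, an estimate of log(beta1)
B2 = 0.1096934      # coefficient of S


def earnings_hat(s):
    """Fitted hourly earnings ($) for s years of schooling."""
    return math.exp(B1_LOG + B2 * s)


# Fitted earnings with no schooling: positive, about $3.64.
print(earnings_hat(0))

# Dollar increment from one extra year, at three illustrative schooling levels.
increments = [earnings_hat(s + 1) - earnings_hat(s) for s in (8, 12, 16)]
print(increments)
```

The proportional increase per year is the same at every level, so the dollar increment grows as fitted earnings grow.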
THE DISTURBANCE TERM IN NONLINEAR MODELS
Thus far, nothing has been said about the disturbance term in nonlinear regression models.
For the regression results in a linearized model to have the desired properties, the disturbance term in the transformed model should be additive and it should satisfy the regression model conditions.
To be able to perform the usual tests, it should be normally distributed in the transformed model.
In the case of the first example of a nonlinear model, there was no problem. If the disturbance term had the required properties in the original model, it would have them in the regression model. It has not been affected by the transformation.
\[ Y = \beta_1 + \frac{\beta_2}{X} + u \]
\[ Z = \frac{1}{X} \]
\[ Y = \beta_1 + \beta_2 Z + u \]
\[ Y = \beta_1 X^{\beta_2} e^{u} = \beta_1 X^{\beta_2} v \]
\[ \log Y = \log \beta_1 + \beta_2 \log X + u \]
In the discussion of the logarithmic model, the disturbance term was omitted altogether. However, implicitly it was being assumed that there was an additive disturbance term in the transformed model. For this to be possible, the random component in the original model must be a multiplicative term, \(e^{u}\). We will denote this multiplicative term v.
When u is equal to 0, not modifying the value of log Y, v is equal to 1, likewise not modifying the value of Y.
Positive values of u correspond to values of v greater than 1, the random factor having a positive effect on Y and log Y. Likewise negative values of u correspond to values of v between 0 and 1, the random factor having a negative effect on Y and log Y.
Besides requiring u to satisfy the regression model conditions, we need it to be normally distributed if we are to perform t tests and F tests.
[Figure: density f(v) of the lognormal distribution, plotted for v from 0 to 16]
This will be the case if v has a lognormal distribution, shown above.
The median of the distribution is located at v = 1, where u = 0.
The same multiplicative disturbance term is needed in the semilog model. Note that, with this asymmetric distribution, one should expect a small proportion of observations to be subject to large positive random effects.
[Figure: density f(v) of the lognormal distribution, plotted for v from 0 to 16]
\[ Y = \beta_1 e^{\beta_2 X} e^{u} = \beta_1 e^{\beta_2 X} v \]
\[ \log Y = \log \beta_1 + \beta_2 X + u \]
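The properties of the multiplicative term can be illustrated by simulation. This is a sketch under stated assumptions: the standard deviation of u (0.5), the sample size, the random seed, and the threshold v > 2 are all arbitrary choices for illustration, not values from the slides.

```python
import math
import random
import statistics

random.seed(0)   # arbitrary seed, only so that the run is repeatable

sigma = 0.5      # hypothetical standard deviation of u
n = 100_000      # hypothetical sample size

# u is normal with mean 0; v = exp(u) is then lognormal.
u = [random.gauss(0.0, sigma) for _ in range(n)]
v = [math.exp(ui) for ui in u]

median_v = statistics.median(v)   # should be close to 1, where u = 0
mean_v = sum(v) / n               # pulled above the median by the long right tail
share_large = sum(vi > 2.0 for vi in v) / n   # share of large positive random effects

print(median_v, mean_v, share_large)
```

Every simulated v is positive, the median sits near 1, and a small share of observations receives a large multiplicative shock, which is the asymmetry described above.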
Here is the scatter diagram for earnings and schooling using Data Set 21. You can see that there are several outliers, with the four most extreme highlighted.
[Figure: scatter diagram of hourly earnings ($) against years of schooling, with the four most extreme outliers highlighted]
Here is the scatter diagram for the semilogarithmic model, with its regression line. The same four observations remain outliers, but they do not appear to be so extreme.
[Figure: scatter diagram of the logarithm of hourly earnings against years of schooling, with the regression line and the same four observations highlighted]
The histogram above compares the distributions of the residuals from the linear and semi-logarithmic regressions. The distributions have been standardized, that is, scaled so that they have standard deviation equal to 1, to make them comparable.
[Figure: histogram of standardized residuals, from –3 to 3, for the linear and semilogarithmic regressions]
It can be shown that if the disturbance term in a regression model has a normal distribution, so will the residuals.
It is obvious that the residuals from the semilogarithmic regression are approximately normal, but those from the linear regression are not. This is evidence that the semi-logarithmic model is the better specification.
What would happen if the disturbance term in the logarithmic or semilogarithmic model were additive, rather than multiplicative?
\[ Y = \beta_1 X^{\beta_2} + u \]
If this were the case, we would not be able to linearize the model by taking logarithms. There is no way of simplifying \(\log(\beta_1 X^{\beta_2} + u)\). We should have to use some nonlinear regression technique.
\[ \log Y = \log(\beta_1 X^{\beta_2} + u) \]
See the textbook for an example. The Stata command for nonlinear regression is nl.
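To give a flavour of what a nonlinear regression technique involves, here is a deliberately crude sketch: a grid search over \(\beta_2\), with \(\beta_1\) chosen by least squares for each trial value. It is not the algorithm used by Stata's nl command, and the data, the small additive disturbances, and the grid range are all hypothetical.

```python
# Hypothetical data: Y = 3 * X**0.5 + u with a small additive disturbance u.
beta1_true, beta2_true = 3.0, 0.5
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
u = [0.05, -0.04, 0.03, -0.02, 0.01, -0.01, 0.02, -0.03, 0.04, -0.05]
Y = [beta1_true * x ** beta2_true + ui for x, ui in zip(X, u)]


def profile_ssr(beta2):
    """For a trial beta2, return (sum of squared residuals, best-fitting beta1)."""
    w = [x ** beta2 for x in X]
    # Given beta2, the model is linear in beta1, so beta1 has a closed form.
    beta1 = sum(y * wi for y, wi in zip(Y, w)) / sum(wi * wi for wi in w)
    ssr = sum((y - beta1 * wi) ** 2 for y, wi in zip(Y, w))
    return ssr, beta1


# Try beta2 = 0.000, 0.001, ..., 2.000 and keep the value with the smallest SSR.
grid = [i / 1000 for i in range(2001)]
best_beta2 = min(grid, key=lambda b: profile_ssr(b)[0])
best_beta1 = profile_ssr(best_beta2)[1]

print(best_beta1, best_beta2)
```

The residuals are minimized in the original units of Y, which is what an additive disturbance term calls for; no logarithms are taken at any point.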