
• 8/3/2019 2 Simple Regression Model 29x09x2011

    2 The simple regression model

    Version: 29-09-2011

    Ezequiel Uriel

    2.1 Definition of the simple regression model

    Population model

In the simple regression model, the reference model or population model is the following:

    y = \beta_0 + \beta_1 x + u    (1)

We are going to look at the different elements of the model (1) and the terminology used to designate them. We consider that the three variables of the model (y, x and u) are random variables. In this model x is the unique factor that explains y. All the other factors that affect y are captured by u.

We typically refer to y as the endogenous (from the Greek: generated inside) variable or dependent variable. Other names are also used to designate y: left-hand side variable, explained variable, or regressand. In this model all these names are equivalent, but in other models, as we will see later, there can be some differences.

In the simple linear regression of y on x, we typically refer to x as the exogenous (from the Greek: generated outside) variable or independent variable. Other names are also used to designate x: right-hand side variable, explanatory variable, regressor, covariate, or control variable. All these names are equivalent, but in other models, as we will see later, there can be some differences.

The variable u represents factors other than x that affect y. It is called the error or random disturbance. The error term can also capture measurement error in the dependent variable. The error is an unobservable variable.

The parameters \beta_0 and \beta_1 are fixed and unknown.

    Population regression function

In order to define the population regression function, we are first going to introduce two statistical assumptions.

The first assumption is that the mean value of the error in the population is 0. That is to say, the expectation of u is 0:

    E(u) = 0    (2)

This is not a restrictive assumption, since we can always use \beta_0 to normalize E(u) to 0. Let us suppose, for example, that E(u) = 4; then we could redefine the model in the following way:

    y = (\beta_0 + 4) + \beta_1 x + v

where v = u - 4. Therefore, the expectation of the new error, v, is 0, and the expectation of u has been absorbed by the intercept.
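This normalization can be illustrated with a small simulation. The sketch below uses hypothetical parameter values (not part of the original text): it generates data whose error has E(u) = 4 and checks that the OLS intercept absorbs the error mean while the slope is unaffected.

```python
import random

def ols(x, y):
    """OLS slope and intercept for the simple regression model."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = sum((yi - ybar) * (xi - xbar) for xi, yi in zip(x, y)) \
         / sum((xi - xbar) ** 2 for xi in x)
    b0 = ybar - b1 * xbar
    return b0, b1

random.seed(0)
n = 100_000
x = [random.uniform(0, 10) for _ in range(n)]
# Error with non-zero mean: E(u) = 4, as in the example above.
u = [random.gauss(4, 1) for _ in range(n)]
# Hypothetical "true" model: y = 2 + 0.5 x + u.
y = [2 + 0.5 * xi + ui for xi, ui in zip(x, u)]

b0, b1 = ols(x, y)
print(round(b0, 1), round(b1, 2))  # intercept ≈ 2 + 4 = 6, slope ≈ 0.5
```

The estimated intercept converges to \beta_0 + E(u), while the slope is untouched, which is exactly the redefinition above.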


The second assumption is about how x and u are related. Concretely, this assumption states that

    E(ux) = 0    (3)

This is a crucial assumption in regression analysis. If this assumption were not acceptable, it would be difficult to separate the effects of x and u on y.

Taking the two assumptions together, we have

    E(u|x) = E(u) = 0    (4)

which implies

    E(y|x) = \beta_0 + \beta_1 x    (5)

This equation is known as the population regression function (PRF) or population line. Therefore, E(y|x), as can be seen in figure 2.1, is a linear function of x with intercept \beta_0 and slope \beta_1.

Figure 2.1. The population regression function (PRF)

The linearity means that a one-unit increase in x changes the expected value of y by the amount \beta_1.

Now, let us suppose we have a random sample of size n, {(y_i, x_i): i = 1, ..., n}, from the population. In figure 2.2 the scatter diagram corresponding to these data is displayed.


Figure 2.2. The scatter diagram

We can express the population model for each observation of the sample:

    y_i = \beta_0 + \beta_1 x_i + u_i    i = 1, 2, ..., n    (6)

In figure 2.3 the population regression function and the scatter diagram are put together, but it is important to keep in mind that \beta_0 and \beta_1 are fixed, but unknown. According to the model it is possible, from a theoretical point of view, to make the following decomposition:

    y_i = E(y_i | x_i) + u_i    i = 1, 2, ..., n    (7)

which has been represented in figure 2.3 for observation i. However, from an empirical point of view, it is not possible to do so, because \beta_0 and \beta_1 are unknown parameters and u_i is not observable.


Figure 2.3. The population regression function and the scatter diagram

    Sample regression function

The basic idea of regression is to estimate the population parameters, \beta_0 and \beta_1, from a given sample.

The sample regression function (SRF) is the sample counterpart of the population regression function (PRF). Since the SRF is obtained for a given sample, a new sample will generate different estimates.

The SRF, which is an estimate of the PRF, given by

    \hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i    (8)

allows us to calculate the fitted value (\hat{y}_i) for y when x = x_i. In the SRF, \hat{\beta}_0 and \hat{\beta}_1 are estimators of the parameters \beta_0 and \beta_1.

For each x_i we have an observed value (y_i) and a fitted value (\hat{y}_i).

We call the difference between y_i and \hat{y}_i the residual (\hat{u}_i). That is,

    \hat{u}_i = y_i - \hat{y}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i    (9)

In other words, the residual \hat{u}_i is the difference between the sample point and the fitted line, as can be seen in figure 2.4. In this case, it is possible to calculate the decomposition

    y_i = \hat{y}_i + \hat{u}_i

for a given sample.


Figure 2.4. The sample regression function and the scatter diagram

To sum up, \hat{\beta}_0, \hat{\beta}_1, \hat{y}_i and \hat{u}_i are the sample counterparts of \beta_0, \beta_1, E(y|x) and u_i respectively. It is possible to calculate \hat{\beta}_0 and \hat{\beta}_1 for a given sample, but for each sample the estimates will change. On the contrary, \beta_0 and \beta_1 are fixed, but unknown.

    2.2 Deriving the Ordinary Least Squares Estimates

    Different criteria of estimation

Before deriving the ordinary least squares estimates, we are going to examine three alternative criteria to illustrate the problem we face. These three criteria have in common that they try to minimize, in some way, the value of the residuals as a whole.

    Criterion 1

A first criterion would consist of taking as estimators those values of \hat{\beta}_0 and \hat{\beta}_1 that make the sum of all the residuals as near to zero as possible. With this criterion the expression to minimize would be the following:

    \min \sum_{i=1}^{n} \hat{u}_i    (10)

The main problem with this criterion is that residuals of different sign can compensate each other. Such a situation can be observed graphically in figure 2.5, in which three aligned observations, (x_1, y_1), (x_2, y_2) and (x_3, y_3), are graphed. In this case the following holds:

    \frac{y_2 - y_1}{x_2 - x_1} = \frac{y_3 - y_1}{x_3 - x_1}


Figure 2.5. The problems of criterion 1.

If a straight line is fitted so that it passes through the three points, each one of the residuals takes the value zero, so that

    \sum_{i=1}^{n} \hat{u}_i = 0

This fit could be considered optimal. But it is also possible to obtain \sum_{i=1}^{3} \hat{u}_i = 0 by rotating the straight line around the point (x_2, y_2) in any direction, as figure 2.5 shows, because \hat{u}_3 = -\hat{u}_1. In other words, by rotating in this way the result \sum_{i=1}^{3} \hat{u}_i = 0 is always obtained. This simple example shows that this criterion is not appropriate for the estimation of the parameters because, for any set of observations, infinitely many straight lines satisfy it.
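The degeneracy of this criterion is easy to verify numerically. The following sketch (with three hypothetical aligned points) shows that both the perfect-fit line and a line rotated around the middle point give a zero sum of residuals:

```python
# Three aligned observations, as in figure 2.5.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]

def residual_sum(b0, b1):
    """Sum of residuals u_i = y_i - b0 - b1*x_i for a candidate line."""
    return sum(yi - b0 - b1 * xi for xi, yi in zip(x, y))

# The line through the three points: y = 0 + 2x  ->  sum of residuals is 0.
perfect = residual_sum(0.0, 2.0)

# Rotate the line around the middle point (x2, y2) = (2, 4): a steeper line
# y = 4 + 5*(x - 2) also gives a zero residual sum, because the residuals
# at x1 and x3 cancel each other (u3 = -u1).
rotated = residual_sum(4.0 - 5.0 * 2.0, 5.0)

print(perfect, rotated)  # both 0.0
```

Any line through (x_2, y_2) satisfies the criterion, which is why it cannot identify the parameters.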

    Criterion 2

A way to avoid the compensation of positive residuals with negative ones consists of taking the absolute values of the residuals. In this case the following expression would be minimized:

    \min \sum_{i=1}^{n} |\hat{u}_i|    (11)

Unfortunately, although the estimators thus obtained have some interesting properties, their calculation is complicated, requiring the solution of a linear programming problem or the application of an iterative calculation procedure.

    Criterion 3

A third criterion consists of minimizing the sum of the squared residuals, that is to say, minimizing


    \min S = \min \sum_{i=1}^{n} \hat{u}_i^2    (12)

The estimators obtained are called least squares estimators, and they enjoy certain desirable statistical properties that we will study later. On the one hand, as opposed to the first of the examined criteria, squaring the residuals avoids their compensation; on the other hand, unlike the second criterion, the least squares estimators are simple to obtain. It is important to point out that, from the moment we square the residuals, we penalize large residuals proportionally more than small ones (if a residual is double another one, its square is four times greater), which also characterizes least squares estimation with respect to other possible methods.

Application of the least squares criterion

Now we are going to look at the process of obtaining the least squares estimators. The objective is to minimize the sum of squared residuals (S). To do this, in the first place we express S as a function of the estimators, using (9). Therefore, we must solve

    \min_{\hat{\beta}_0, \hat{\beta}_1} S = \min_{\hat{\beta}_0, \hat{\beta}_1} \sum_{i=1}^{n} \hat{u}_i^2 = \min_{\hat{\beta}_0, \hat{\beta}_1} \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2    (13)

To minimize S, we differentiate partially with respect to \hat{\beta}_0 and \hat{\beta}_1:

    \frac{\partial S}{\partial \hat{\beta}_0} = -2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)

    \frac{\partial S}{\partial \hat{\beta}_1} = -2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) x_i

The least squares estimators are obtained by setting the previous derivatives equal to 0:

    \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0    (14)

    \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) x_i = 0    (15)

Equations (14) and (15) are called the normal equations or LS first-order conditions.

In operations with summations, the following rules must be taken into account:

    \sum_{i=1}^{n} a = na

    \sum_{i=1}^{n} a x_i = a \sum_{i=1}^{n} x_i


    \sum_{i=1}^{n} (x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i

Operating with the normal equations, we have

    \sum_{i=1}^{n} y_i = n \hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} x_i    (16)

    \sum_{i=1}^{n} y_i x_i = \hat{\beta}_0 \sum_{i=1}^{n} x_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2    (17)

Dividing both sides of (16) by n, we have

    \bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}    (18)

Therefore,

    \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}    (19)

Substituting this value of \hat{\beta}_0 into the second normal equation (17), we have

    \sum_{i=1}^{n} y_i x_i = (\bar{y} - \hat{\beta}_1 \bar{x}) \sum_{i=1}^{n} x_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2

    \sum_{i=1}^{n} y_i x_i = \bar{y} \sum_{i=1}^{n} x_i - \hat{\beta}_1 \bar{x} \sum_{i=1}^{n} x_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2

Solving for \hat{\beta}_1,

    \hat{\beta}_1 = \frac{\sum_{i=1}^{n} y_i x_i - \bar{y} \sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2 - \bar{x} \sum_{i=1}^{n} x_i} = \frac{\sum_{i=1}^{n} y_i x_i - n \bar{y} \bar{x}}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2}    (20)

Taking into account that

    \sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x}) = \sum_{i=1}^{n} y_i x_i - \bar{x} \sum_{i=1}^{n} y_i - \bar{y} \sum_{i=1}^{n} x_i + n \bar{y} \bar{x} = \sum_{i=1}^{n} y_i x_i - n \bar{y} \bar{x}

    \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} (x_i^2 - 2 \bar{x} x_i + \bar{x}^2) = \sum_{i=1}^{n} x_i^2 - 2 \bar{x} \sum_{i=1}^{n} x_i + n \bar{x}^2 = \sum_{i=1}^{n} x_i^2 - n \bar{x}^2


then (20) can be expressed in the following way:

    \hat{\beta}_1 = \frac{\sum_{i=1}^{n} y_i x_i - n \bar{y} \bar{x}}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2} = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}    (21)

Once \hat{\beta}_1 is calculated, we can obtain \hat{\beta}_0 by using (19).

These are the least squares (LS) estimators. Since other, more complicated methods also called least squares exist, the method we have applied is called ordinary least squares (OLS), due to its simplicity.

In the preceding paragraphs \hat{\beta}_0 and \hat{\beta}_1 have been used to designate generic estimators. From now on this notation will designate only the OLS estimators.
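Expressions (19) and (21) translate directly into code. The following is a minimal sketch on a small hypothetical sample (the helper name ols_estimates is ours, not from the text):

```python
def ols_estimates(x, y):
    """OLS estimates via the closed-form expressions:
    slope     = sum((y_i - ybar)(x_i - xbar)) / sum((x_i - xbar)^2)
    intercept = ybar - slope * xbar."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((yi - ybar) * (xi - xbar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Small hypothetical sample.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.7]

b0, b1 = ols_estimates(x, y)
print(b0, b1)  # intercept 0.18, slope 1.94 (up to floating-point error)
```

For this sample, \bar{x} = 3, \bar{y} = 6, \sum (y_i - \bar{y})(x_i - \bar{x}) = 19.4 and \sum (x_i - \bar{x})^2 = 10, so \hat{\beta}_1 = 1.94 and \hat{\beta}_0 = 0.18.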

An alternative approach to estimation

We are going to see an alternative method to obtain estimators: the method of moments. This is a general method to estimate econometric models.

The main idea behind this method is quite intuitive and very simple. The assumptions in an econometric model imply some restrictions on the means, variances or covariances (i.e., moments) of the population. In the method of moments the estimates are the values that satisfy these restrictions when we use the sample means, variances and covariances instead of the population ones.

In the presentation of the population model in simple regression, we introduced assumptions (2) and (3), which are restrictions on the model. Taking into account that

    u = y - \beta_0 - \beta_1 x

we can express our two restrictions just in terms of x, y, \beta_0 and \beta_1:

    E(y - \beta_0 - \beta_1 x) = 0    (22)

    E[(y - \beta_0 - \beta_1 x) x] = 0    (23)

The sample restrictions that correspond to (22) and (23) are the following:

    \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0    (24)

    \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) x_i = 0    (25)

From the two previous equations the method of moments estimators \hat{\beta}_0 and \hat{\beta}_1 are obtained.

But these two equations are precisely equal to the two normal equations, (14) and (15), obtained by minimizing the sum of the squared residuals. In any case, it must be kept in mind that the estimators obtained by the method of moments are conditional on the restrictions imposed on the model. Consequently, the equality between the two approaches is due to the two assumptions considered.

    2.3 Properties of OLS on any sample of data

    Algebraic properties of OLS estimators

    The algebraic properties of OLS are properties derived exclusively from the

    application of the method of OLS:

1. The sum of the OLS residuals is equal to 0:

    \sum_{i=1}^{n} \hat{u}_i = 0    (26)

From the definition of the residual,

    \hat{u}_i = y_i - \hat{y}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i    i = 1, 2, ..., n    (27)

If we add over the n observations, we get

    \sum_{i=1}^{n} \hat{u}_i = \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0    (28)

which is precisely the first equation (14) of the system of normal equations.

Note that, if (26) holds, it implies that

    \sum_{i=1}^{n} y_i = \sum_{i=1}^{n} \hat{y}_i    (29)

and, dividing (26) and (29) by n, we obtain

    \bar{\hat{u}} = 0 \qquad \bar{y} = \bar{\hat{y}}    (30)

2. The OLS line always goes through the mean of the sample, (\bar{x}, \bar{y}). Effectively, dividing equation (16) by n, we have

    \bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}    (31)

3. The sample covariance between the regressor and the OLS residuals is zero.

The sample covariance between the regressor and the OLS residuals is equal to

    S_{x\hat{u}} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(\hat{u}_i - \bar{\hat{u}})}{n} = \frac{\sum_{i=1}^{n} x_i \hat{u}_i}{n} - \bar{x}\,\bar{\hat{u}} = \frac{\sum_{i=1}^{n} x_i \hat{u}_i}{n}

since \bar{\hat{u}} = 0.

    Therefore, if


    \sum_{i=1}^{n} x_i \hat{u}_i = 0    (32)

then the sample covariance between the regressor and the OLS residuals is equal to 0.

It can be seen that (32) is equal to the second normal equation,

    \sum_{i=1}^{n} x_i \hat{u}_i = \sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0

given in (15).

    Therefore, the sample covariance between the regressor and the OLS residuals is

    zero.

4. The sample covariance between the fitted values (\hat{y}_i) and the OLS residuals (\hat{u}_i) is zero.

Analogously to property 3, the sample covariance between the fitted values (\hat{y}_i) and the OLS residuals (\hat{u}_i) is equal to 0 if the following holds:

    \sum_{i=1}^{n} \hat{y}_i \hat{u}_i = 0    (33)

Proof

Taking into account the algebraic properties 1, (26), and 3, (32), we have

    S_{\hat{y}\hat{u}} = \frac{\sum_{i=1}^{n} \hat{y}_i \hat{u}_i}{n} = \frac{\sum_{i=1}^{n} (\hat{\beta}_0 + \hat{\beta}_1 x_i) \hat{u}_i}{n} = \hat{\beta}_0 \frac{\sum_{i=1}^{n} \hat{u}_i}{n} + \hat{\beta}_1 \frac{\sum_{i=1}^{n} x_i \hat{u}_i}{n} = 0
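Algebraic properties 1, 3 and 4 can be checked numerically for any sample. A minimal sketch with hypothetical data:

```python
def ols_fit(x, y):
    """Fit y = b0 + b1*x by OLS; return fitted values and residuals."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = sum((yi - ybar) * (xi - xbar) for xi, yi in zip(x, y)) \
         / sum((xi - xbar) ** 2 for xi in x)
    b0 = ybar - b1 * xbar
    fitted = [b0 + b1 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    return fitted, resid

# Hypothetical sample.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 2.5, 4.0, 3.5, 6.0]
fitted, resid = ols_fit(x, y)

s1 = sum(resid)                                    # property 1: sum of residuals
s3 = sum(xi * ui for xi, ui in zip(x, resid))      # property 3: sum of x_i * u_i
s4 = sum(fi * ui for fi, ui in zip(fitted, resid)) # property 4: sum of yhat_i * u_i
print(abs(s1) < 1e-9, abs(s3) < 1e-9, abs(s4) < 1e-9)  # True True True
```

All three sums are zero up to floating-point error, as the normal equations guarantee.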

Decomposition of the variance of y

By definition,

    y_i = \hat{y}_i + \hat{u}_i    (34)

Subtracting \bar{y} from both sides of the previous expression (remember that \bar{\hat{y}} is equal to \bar{y}), we have

    y_i - \bar{y} = (\hat{y}_i - \bar{y}) + \hat{u}_i

Squaring both sides:

    (y_i - \bar{y})^2 = (\hat{y}_i - \bar{y})^2 + \hat{u}_i^2 + 2 \hat{u}_i (\hat{y}_i - \bar{y})

Summing over all i:

    \sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum \hat{u}_i^2 + 2 \sum \hat{u}_i (\hat{y}_i - \bar{y})

Taking into account algebraic properties 1 and 4, the third term of the right-hand side is equal to 0. Analytically,


    \sum \hat{u}_i (\hat{y}_i - \bar{y}) = \sum \hat{u}_i \hat{y}_i - \bar{y} \sum \hat{u}_i = 0    (35)

Therefore, we have

    \sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum \hat{u}_i^2    (36)

It must be stressed that it is necessary to use relation (26) to assure that (35) is equal to 0. We must remember that (26) is associated with the first normal equation, that is to say, with the equation corresponding to the intercept. If the fitted model has no intercept, then in general the decomposition (36) will not hold.

In words,

    total sum of squares (SST) = explained sum of squares (SSE) + residual sum of squares (SSR)

or,

    SST = SSE + SSR

This decomposition can be done in terms of variances, dividing both sides of (36) by n:

    \frac{\sum (y_i - \bar{y})^2}{n} = \frac{\sum (\hat{y}_i - \bar{y})^2}{n} + \frac{\sum \hat{u}_i^2}{n}    (37)

In words,

    total variance = explained variance + residual variance
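The decomposition SST = SSE + SSR can be verified numerically for any sample fitted with an intercept. A minimal sketch with hypothetical data:

```python
def sums_of_squares(x, y):
    """Decompose SST into SSE + SSR for an OLS fit with intercept."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = sum((yi - ybar) * (xi - xbar) for xi, yi in zip(x, y)) \
         / sum((xi - xbar) ** 2 for xi in x)
    b0 = ybar - b1 * xbar
    fitted = [b0 + b1 * xi for xi in x]
    sst = sum((yi - ybar) ** 2 for yi in y)               # total
    sse = sum((fi - ybar) ** 2 for fi in fitted)          # explained
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual
    return sst, sse, ssr

# Hypothetical sample.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 2.5, 4.0, 3.5, 6.0]
sst, sse, ssr = sums_of_squares(x, y)
print(abs(sst - (sse + ssr)) < 1e-9)  # True: SST = SSE + SSR
```

Dropping the intercept from the fit breaks property (26), and with it this decomposition, as noted above.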

Goodness of fit: Coefficient of determination (R^2)

So far we have obtained the estimators by minimizing the sum of squared residuals. Now, once the estimation has been done, we can look at how well our sample regression line fits our data.

The statistics that indicate how well the sample regression line fits the data are called goodness of fit measures. We are going to look at the best-known measure, called the coefficient of determination or R-squared (R^2). This measure is defined in the following way:

    R^2 = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}    (38)

Therefore, R^2 is the proportion of the total sum of squares that is explained by the regression, that is to say, by the model. We can also say that 100 R^2 is the percentage of the sample variation in y explained by x.

Alternatively, taking into account (36), we have

    \sum (\hat{y}_i - \bar{y})^2 = \sum (y_i - \bar{y})^2 - \sum \hat{u}_i^2

Substituting in (38), we have

    R^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2 - \sum_{i=1}^{n} \hat{u}_i^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} = 1 - \frac{\sum_{i=1}^{n} \hat{u}_i^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}    (39)

Therefore, R^2 is equal to 1 minus the proportion of the total sum of squares that is not explained by the regression.

According to the definition of R^2, the following must hold:

    0 \le R^2 \le 1

Extreme cases:

a) If we have a perfect fit, then \hat{u}_i = 0 for all i. This implies that

    \hat{y}_i = y_i \; \forall i \implies \sum (\hat{y}_i - \bar{y})^2 = \sum (y_i - \bar{y})^2 \implies R^2 = 1

b) If \hat{y}_i = c for all i, it implies that

    \hat{y}_i = c \implies \bar{y} = c \implies \sum (\hat{y}_i - \bar{y})^2 = 0 \implies R^2 = 0

If R^2 is close to 0, we have a poor fit: very little variation in y is explained by x.

In many cases a high R^2 is obtained when the model is fitted with time series data, due to the effect of a common trend. On the contrary, when we use cross-sectional data a low value is obtained in many cases, but this does not mean that the fitted model is bad.

Coefficient of determination and coefficient of correlation

In descriptive statistics the coefficient of correlation is studied. It is calculated according to the following formula:

    r_{xy} = \frac{\mathrm{cov}(x, y)}{\sqrt{\mathrm{var}(x)} \sqrt{\mathrm{var}(y)}} = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}    (40)


What is the relationship between the coefficient of determination and the coefficient of correlation? Answer: the coefficient of determination is equal to the squared coefficient of correlation. Analytically,

    r_{xy}^2 = R^2    (41)

(This equality is valid in the simple regression model, but not in the multiple regression model.)

Proof

First we are going to see an equivalence that will be used in the proof. By definition,

    \hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i

From the first normal equation, we have

    \bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}

Subtracting the second equation from the first one:

    \hat{y}_i - \bar{y} = \hat{\beta}_1 (x_i - \bar{x})

Squaring both sides,

    (\hat{y}_i - \bar{y})^2 = \hat{\beta}_1^2 (x_i - \bar{x})^2

and summing over all i, we have

    \sum (\hat{y}_i - \bar{y})^2 = \hat{\beta}_1^2 \sum (x_i - \bar{x})^2

Taking into account the previous equivalence, we have

    R^2 = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} = \frac{\hat{\beta}_1^2 \sum_{i=1}^{n} (x_i - \bar{x})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} = \left[ \frac{\sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right]^2 \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} = \frac{\left[ \sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x}) \right]^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2} = r_{xy}^2
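The equality between the coefficient of determination and the squared coefficient of correlation can be checked numerically. A minimal sketch, using the definitions in (39) and (40) on hypothetical data:

```python
import math

def r_squared_and_corr(x, y):
    """Return (R^2 from the OLS regression, squared correlation coefficient)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((yi - ybar) * (xi - xbar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    r2 = 1.0 - ssr / syy              # equation (39)
    rxy = sxy / math.sqrt(sxx * syy)  # equation (40)
    return r2, rxy ** 2

# Hypothetical sample.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 2.5, 4.0, 3.5, 6.0]
r2, rxy2 = r_squared_and_corr(x, y)
print(abs(r2 - rxy2) < 1e-9)  # True in the simple regression model
```

As the proof shows, this coincidence holds only for simple regression with an intercept.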


    2.4 Units of measurement and functional form

    Units of Measurement

Changing the units of measurement (changing the scale)

If x is multiplied/divided by a constant c ≠ 0, then the OLS slope is divided/multiplied by the same constant c. Thus,

    \hat{y}_i = \hat{\beta}_0 + \frac{\hat{\beta}_1}{c} (c x_i)    (42)

Example

Let us suppose the following estimated consumption function, in which both variables are measured in thousands of euros:

    \hat{cons}_i = 0.2 + 0.85 \, inc_i    (43)

If we now express income in euros (multiplying by 1000) and call it ince, the fitted model with the new units of measurement of income would be the following:

    \hat{cons}_i = 0.2 + 0.00085 \, ince_i

As can be seen, changing the units of measurement of the explanatory variable does not affect the intercept.

If y is multiplied/divided by a constant c ≠ 0, then the OLS slope and intercept are both multiplied/divided by the same constant c. Thus,

    c \hat{y}_i = (c \hat{\beta}_0) + (c \hat{\beta}_1) x_i    (44)

Example

If, in model (43), we express consumption in euros (multiplying by 1000) and call it conse, the fitted model with the new units of measurement of consumption would be the following:

    \hat{conse}_i = 200 + 850 \, inc_i

Changing the origin

If one adds/subtracts a constant d to x and/or y, then the OLS slope is not affected. However, changing the origin of x and/or y affects the intercept of the regression.

If one subtracts a constant d from x, the intercept changes in the following way:

    \hat{y}_i = (\hat{\beta}_0 + \hat{\beta}_1 d) + \hat{\beta}_1 (x_i - d)    (45)

    Example


Let us suppose that the average income is 20 thousand euros. If we define the variable incd_i = inc_i - \bar{inc} and both variables are measured in thousands of euros, the fitted model with this change of origin will be the following:

    \hat{cons}_i = (0.2 + 0.85 \times 20) + 0.85 (inc_i - 20) = 17.2 + 0.85 \, incd_i

If one subtracts a constant d from y, the intercept changes in the following way:

    \hat{y}_i - d = (\hat{\beta}_0 - d) + \hat{\beta}_1 x_i    (46)

Example

Let us suppose that the average consumption is 15 thousand euros. If we define the variable consd_i = cons_i - \bar{cons} and both variables are measured in thousands of euros, the fitted model with this change of origin will be the following:

    \hat{cons}_i - 15 = (0.2 - 15) + 0.85 \, inc_i

that is to say,

    \hat{consd}_i = -14.8 + 0.85 \, inc_i

Finally, note that R^2 is invariant to changes in the units of x and/or y, and also to changes in the origin of the variables.
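These scaling rules can be verified numerically. The sketch below uses hypothetical income and consumption figures (not the data behind model (43)): it rescales x and y by 1000 and checks the effect on the OLS coefficients.

```python
def ols(x, y):
    """OLS intercept and slope for the simple regression model."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = sum((yi - ybar) * (xi - xbar) for xi, yi in zip(x, y)) \
         / sum((xi - xbar) ** 2 for xi in x)
    return ybar - b1 * xbar, b1

# Hypothetical data: income and consumption in thousands of euros.
inc = [10.0, 15.0, 20.0, 25.0, 30.0]
cons = [9.0, 13.5, 17.0, 21.5, 25.0]
b0, b1 = ols(inc, cons)

# Income in euros: the slope is divided by 1000, the intercept is unchanged.
ince = [1000 * v for v in inc]
c0, c1 = ols(ince, cons)

# Consumption in euros: both coefficients are multiplied by 1000.
conse = [1000 * v for v in cons]
d0, d1 = ols(inc, conse)

print(abs(c1 - b1 / 1000) < 1e-12, abs(c0 - b0) < 1e-9,
      abs(d0 - 1000 * b0) < 1e-6, abs(d1 - 1000 * b1) < 1e-9)  # all True
```

The same check with centered variables would show the slope invariant to a change of origin, with only the intercept shifting.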

    Functional Form

Linear relationships are not adequate for many economic applications. However, we can incorporate many nonlinearities (in the variables) into simple regression analysis by appropriately redefining the dependent and independent variables.

Before examining the different functional forms, we are going to look at some definitions which will be useful in the exposition.

Some definitions

We are going to look at some definitions of variation measures that will be useful in the interpretation of the coefficients corresponding to the different functional forms. Specifically, we will look at the following: absolute change, proportional change, percentage change and change in logarithms.

The absolute change between x_1 and x_0 is given by

    \Delta x = x_1 - x_0    (47)

This measure of variation is not much used by economists.

The proportional change (or relative rate of variation) between x_1 and x_0 is given by

    \frac{\Delta x}{x_0} = \frac{x_1 - x_0}{x_0}    (48)


Multiplying a proportional change by 100, a percentage change is obtained. That is to say:

    \frac{\Delta x}{x_0} \times 100 \, \%    (49)

In economic reports this is the most used measure.

The change in logarithms between x_1 and x_0 is given by

    \Delta \log(x) = \log(x_1) - \log(x_0)    (50)

This is also a variation rate, widely used in economic research. The relationship between the proportional change and the change in logarithms can be seen if we expand (50) in a Taylor series:

    \log(x_1) - \log(x_0) = \log\left(\frac{x_1}{x_0}\right) = \log\left(1 + \frac{\Delta x}{x_0}\right) = \frac{\Delta x}{x_0} - \frac{1}{2}\left(\frac{\Delta x}{x_0}\right)^2 + \frac{1}{3}\left(\frac{\Delta x}{x_0}\right)^3 - \cdots    (51)

Therefore, if we take the linear approximation in this expansion, we have

    \Delta \log(x) = \log(x_1) - \log(x_0) \approx \frac{\Delta x}{x_0}    (52)

This approximation is good when the proportional change is small, but the differences can be important when the proportional change is big, as can be seen in table 1.


Table 1. Examples of proportional change and change in logarithms

    x1                         202     210     220     240     300
    x0                         200     200     200     200     200
    Proportional change        0.010   0.050   0.100   0.200   0.500
    Change in logarithms       0.010   0.049   0.095   0.182   0.405
    Proportional change %      1.0%    5.0%    10.0%   20.0%   50.0%
    Change in logarithms %     1.0%    4.9%    9.5%    18.2%   40.5%
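The values in table 1 can be reproduced with a few lines of code:

```python
import math

x0 = 200
for x1 in [202, 210, 220, 240, 300]:
    prop = (x1 - x0) / x0                # proportional change, eq. (48)
    logch = math.log(x1) - math.log(x0)  # change in logarithms, eq. (50)
    print(x1, round(prop, 3), round(logch, 3))
```

The two measures agree to three decimals for the 1% change and drift apart as the change grows, matching the table.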

Elasticity is the ratio of the relative changes of two variables. If we use proportional changes, the elasticity of the variable y with respect to the variable x is given by

    \varepsilon_{y/x} = \frac{\Delta y / y_0}{\Delta x / x_0}    (53)

If we use changes in logarithms and consider infinitesimal changes, the elasticity of the variable y with respect to the variable x is given by

    \varepsilon_{y/x} = \frac{dy / y}{dx / x} = \frac{d \log(y)}{d \log(x)}    (54)

In general, in econometric models elasticity is defined by using (54).

Alternative functional forms

The OLS method can be applied without any problem to models in which transformations of the endogenous variable and/or of the exogenous variable have been made. In the presentation of model (1) we said that exogenous variable and regressor were equivalent terms. But from now on, the regressor is the specific form in which an exogenous variable appears in the equation. For example, in the model

    y = \beta_0 + \beta_1 \log(x) + u

the exogenous variable is x, but the regressor is log(x).

In the presentation of model (1) we also said that endogenous variable and regressand were equivalent. But from now on, the regressand is the specific form in which an endogenous variable appears in the equation. For example, in the model

    \log(y) = \beta_0 + \beta_1 x + u

the endogenous variable is y, but the regressand is log(y).

The two previous models are linear in the parameters, although they are not linear in the variable x (the first one) or in the variable y (the second one). In any case, if a model is linear in the parameters, it can be estimated by applying the OLS method. On the contrary, if a model is not linear in the parameters, iterative methods must be used in the estimation. However, there are certain nonlinear models that, by means of suitable transformations, can become linear. These models are called linearizable.


Thus, on some occasions potential (power) models are postulated in economic theory, as in the well-known Cobb-Douglas production function. A potential model with a unique explanatory variable is given by

    y = A x^{\beta_1}

If we introduce the error term in the following way,

    y = A x^{\beta_1} e^u    (55)

then, taking natural logarithms on both sides of (55), we obtain a model that is linear in the parameters:

    \log(y) = \beta_0 + \beta_1 \log(x) + u    (56)

where we have called \beta_0 = \log(A).

On the contrary, if we introduce the error term in the following way,

    y = A x^{\beta_1} + u

then there is no transformation that turns this model into a linear model. This is a non-linearizable model.

Now we are going to consider some models with alternative functional forms, all of them linear in the parameters. We will look at the interpretation of the coefficient \beta_1 in each case.

a) Linear model

In this model, defined in (1), if the other factors included in u are held fixed, so that the change in u is zero (\Delta u = 0), then x has a linear effect on y:

    \Delta y = \beta_1 \Delta x \quad \text{if } \Delta u = 0

Therefore, \beta_1 is the change in y (in the units in which y is measured) per unit change of x (in the units in which x is measured).

For example, in the fitted function (43), if income increases by 1 unit, consumption will increase by 0.85 units.

The linearity of this model implies that a one-unit change in x always has the same effect on y, regardless of the value of x considered.

b) Linear-log model

A linear-log model is given by

    y = \beta_0 + \beta_1 \log(x) + u    (57)

Taking first differences in (57) and setting \Delta u = 0, we have

    \Delta y = \beta_1 \Delta \log(x) \quad \text{if } \Delta u = 0

Therefore, if x increases by 1%, then y will increase by \beta_1 / 100 units, with \Delta u = 0.


    c) Log linear model

    A log linear model is given by

    0 1log( )y x u (58)

    This model can be obtained from the following one:

    0 1exp( )y x u

    by taking natural logs in both sides. For this reason this model is also called

    exponential.

    Taking first differences in (58) and doing 0u , we have

    1log( ) if 0y x u

    Therefore, ifx increases by 1 unit, then y will increase by 100 1 %, being

    0u .d)Log log model

    The model given in (56) is a log-log model or, before the transformation, a potential model (55). This model is also called the constant elasticity model.

    Taking first differences in (56) and setting Δu = 0, we have

    Δlog(y) = β₁·Δlog(x)   if Δu = 0

    Therefore, if x increases by 1%, then y will increase by β₁%, holding u fixed (Δu = 0). It is important to remark that, in this model, β₁ is the elasticity of y with respect to x, for any value of x and y. Consequently, in this model the elasticity is constant.

    In table 2, the interpretation of β₁ in these four models is shown.

    Table 2. Interpretation of β₁ in different models

    Model        If x increases by    then y will increase by
    linear       1 unit               β₁ units
    linear-log   1%                   (β₁/100) units
    log-linear   1 unit               (100·β₁)%
    log-log      1%                   β₁%
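    The rules of thumb in table 2 can be checked numerically. The sketch below uses arbitrary illustrative coefficients (b0 = 1.0, b1 = 0.05) and evaluation point (x = 10), none of which come from the text.

    ```python
    import math

    # Numeric illustration of two rows of Table 2; b0, b1 and x are our choices.
    b0, b1 = 1.0, 0.05
    x = 10.0

    # Linear-log model: y = b0 + b1*log(x). A 1% rise in x adds about b1/100 units.
    y_before = b0 + b1 * math.log(x)
    y_after = b0 + b1 * math.log(x * 1.01)
    delta_y = y_after - y_before          # ~ b1/100 = 0.0005

    # Log-linear model: log(y) = b0 + b1*x. A 1-unit rise in x multiplies y by
    # exp(b1), roughly a 100*b1 % increase when b1 is small.
    growth = math.exp(b1) - 1             # ~ b1 = 0.05

    print(round(delta_y, 6), round(growth, 4))
    ```

    The approximations are exact only in the limit of small changes, which is why the printed values are close to, but not exactly, b1/100 and b1.
    
    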

    2.5 Statistical Properties of OLS

    We are now going to study the statistical properties of the OLS estimators, β̂₀ and β̂₁, considered as estimators of the population parameters β₀ and β₁.

    The first property to study is the unbiasedness of OLS.

    Unbiasedness of OLS

    To prove this property we need to formulate the following assumptions.

    Assumptions


    SLR.1: Linear in parameters

    The population model is linear in parameters and given by

    y = β₀ + β₁x + u    (59)

    SLR.2: Random sampling

    We have a random sample of size n, {(yᵢ, xᵢ): i = 1, 2, …, n}, from the population model.

    Thus we can write the population model in terms of the sample:

    yᵢ = β₀ + β₁xᵢ + uᵢ,   i = 1, 2, …, n    (60)

    SLR.3: Sample variation in the explanatory variable

    The sample outcomes on x, namely xᵢ, i = 1, 2, …, n, are not all the same value. This assumption implies that

    S²_X = Σᵢ₌₁ⁿ (xᵢ − x̄)² / n > 0    (61)

    SLR.4: Zero conditional mean

    E(u|x) = 0

    For a random sample, this assumption implies that

    E(uᵢ|xᵢ) = 0,   i = 1, 2, …, n

    According to SLR.4, all derivations using this assumption will be conditional on the sample values of x.

    We are going to prove only the unbiasedness of the estimator β̂₁, which is the most important one. In order to do this proof, we need to rewrite our estimator in terms of the population parameter. Start with a simple rewrite of the OLS formula (21) as

    β̂₁ = Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ) / Σᵢ₌₁ⁿ (xᵢ − x̄)² = Σᵢ₌₁ⁿ (xᵢ − x̄)yᵢ / Σᵢ₌₁ⁿ (xᵢ − x̄)²    (62)

    because Σᵢ₌₁ⁿ (xᵢ − x̄)ȳ = ȳ · Σᵢ₌₁ⁿ (xᵢ − x̄) = 0.

    Substituting (60) into (62), we have


    β̂₁ = Σᵢ₌₁ⁿ (xᵢ − x̄)(β₀ + β₁xᵢ + uᵢ) / Σᵢ₌₁ⁿ (xᵢ − x̄)²
       = β₁ + Σᵢ₌₁ⁿ (xᵢ − x̄)uᵢ / Σᵢ₌₁ⁿ (xᵢ − x̄)²    (63)

    The estimator equals the population slope, β₁, plus a term that is a linear combination of the errors {u₁, …, uₙ}.

    In the derivation of (63), we have taken into account that

    Σᵢ₌₁ⁿ (xᵢ − x̄)β₀ = β₀ · Σᵢ₌₁ⁿ (xᵢ − x̄) = 0

    Σᵢ₌₁ⁿ (xᵢ − x̄)β₁xᵢ = β₁ · Σᵢ₌₁ⁿ (xᵢ − x̄)xᵢ = β₁ · Σᵢ₌₁ⁿ (xᵢ − x̄)²

    THEOREM 2.1 Unbiasedness of OLS

    Under assumptions SLR.1 to SLR.4, the expectations of β̂₀ and β̂₁, conditional on x, equal the population parameters β₀ and β₁; that is, the OLS estimators are unbiased.

    Proof that β̂₁ is unbiased:

    E(β̂₁|X) = β₁ + E[ Σᵢ₌₁ⁿ (xᵢ − x̄)uᵢ / Σᵢ₌₁ⁿ (xᵢ − x̄)² | X ]
            = β₁ + Σᵢ₌₁ⁿ (xᵢ − x̄)·E(uᵢ|X) / Σᵢ₌₁ⁿ (xᵢ − x̄)² = β₁    (64)

    (From now on, in order to simplify the notation, we will write E(β̂₁) instead of E(β̂₁|X), with the understanding that it is a conditional expectation.)

    Therefore, the OLS estimator of β₁ is unbiased. Similarly, one can show that the OLS estimator of β₀ is also unbiased. In any case, the proof of unbiasedness depends on assumptions SLR.1 to SLR.4: if any assumption fails, the OLS estimators are not necessarily unbiased. Remember that unbiasedness is a property of the estimator's sampling distribution, which is centered at the population parameter; in a given sample we may be near to or far from the true value.
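    Theorem 2.1 can be illustrated with a small Monte Carlo simulation: drawing many samples with E(u|x) = 0 and averaging the OLS slopes. All design choices below (β₀ = 2, β₁ = 0.5, n = 50, 2000 replications) are ours, for illustration only.

    ```python
    import random

    # Monte Carlo sketch of unbiasedness: the average OLS slope across many
    # samples should be close to the true beta1.
    random.seed(0)
    beta0, beta1, n, reps = 2.0, 0.5, 50, 2000
    x = [i / 5 for i in range(n)]                  # fixed regressor values
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)

    estimates = []
    for _ in range(reps):
        y = [beta0 + beta1 * xi + random.gauss(0, 1) for xi in x]
        estimates.append(sum((xi - xbar) * yi for xi, yi in zip(x, y)) / sxx)

    mean_b1 = sum(estimates) / reps
    print(mean_b1)   # close to beta1 = 0.5
    ```

    Individual estimates vary from sample to sample; it is the average over replications that settles near β₁, which is exactly what unbiasedness asserts.
    
    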

    Variance of the OLS estimators


    Now we know that the sampling distribution of our estimator is centered around the true parameter. How spread out is this distribution? This will be a measure of uncertainty.

    In order to calculate the variance of β̂₁ (and β̂₀), an additional assumption is needed.

    SLR.5: Homoskedasticity

    Homoskedasticity implies that all errors have the same variance. That is to say,

    var(u|x) = σ²    (65)

    We add SLR.5 because it simplifies the variance calculations and because it implies that OLS has certain efficiency properties.

    The homoskedasticity assumption is quite distinct from the zero conditional mean assumption, E(u|x) = 0. SLR.4 involves the expected value of u, while SLR.5 concerns the variance of u. Homoskedasticity plays no role in showing that β̂₀ and β̂₁ are unbiased.

    Taking into account that E(u|x) = 0 and that E(u²|x) = E(u²) (because E(xu) = 0), we obtain

    var(u|x) = E[(u − E(u|x))²|x] = E(u²|x) − [E(u|x)]² = E(u²|x) = E(u²) = var(u) = σ²

    Thus, σ² is the conditional variance, but also the unconditional variance. It is called the error variance. The square root of the error variance, σ, is called the standard deviation of the error.

    We can say

    E(y|x) = β₀ + β₁x   and   var(u|x) = σ²

    So, the conditional expectation of y given x is linear in x, but the variance of y given x is constant. When var(u|x) depends on x, the error term is said to exhibit heteroskedasticity.

    On the other hand,

    var(y|x) = E[(y − E(y|x))²|x] = E[(y − β₀ − β₁x)²|x] = E(u²|x) = var(u|x)    (66)

    Since var(u|x) = var(y|x), heteroskedasticity is present whenever var(y|x) is a function of x.

    THEOREM 2.2 Sampling variances of OLS estimators

    Under assumptions SLR.1 to SLR.5, the variances of β̂₁ and β̂₀, conditional on {x₁, …, xₙ}, are the following:

    var(β̂₁) = σ² / Σᵢ₌₁ⁿ (xᵢ − x̄)²    (67)

    and

    var(β̂₀) = σ² · (n⁻¹ Σᵢ₌₁ⁿ xᵢ²) / Σᵢ₌₁ⁿ (xᵢ − x̄)²    (68)

    Proof of (67):

    Taking into account (63),

    var(β̂₁|X) = E[(β̂₁ − β₁)²|X]
              = E[( Σᵢ₌₁ⁿ (xᵢ − x̄)uᵢ / Σᵢ₌₁ⁿ (xᵢ − x̄)² )² | X]
              = (1/[Σᵢ₌₁ⁿ (xᵢ − x̄)²]²) · E[( Σᵢ₌₁ⁿ (xᵢ − x̄)uᵢ )² | X]
              = (1/[Σᵢ₌₁ⁿ (xᵢ − x̄)²]²) · [ Σᵢ₌₁ⁿ (xᵢ − x̄)²·E(uᵢ²|X) + Σᵢ≠ⱼ (xᵢ − x̄)(xⱼ − x̄)·E(uᵢuⱼ|X) ]
              = σ² · Σᵢ₌₁ⁿ (xᵢ − x̄)² / [Σᵢ₌₁ⁿ (xᵢ − x̄)²]²
              = σ² / Σᵢ₌₁ⁿ (xᵢ − x̄)² = σ² / (n·S²_X)    (69)

    where S²_X = Σᵢ₌₁ⁿ (xᵢ − x̄)² / n is the sample variance of x.

    In the above proof we have taken into account that E(uᵢuⱼ) is equal to 0 for i ≠ j, because the sampling is random and the errors are therefore independent.

    (From now on, in order to simplify the notation, we will not write conditional expectations explicitly.)

    We can draw the following conclusions from the last expression:


    1) The larger the error variance, σ², the larger the variance of the slope estimator. This is not at all surprising: more noise in the equation (a larger σ²) makes it more difficult to estimate the effect of x on y, and this is reflected in a higher variance of the OLS slope estimator. Since σ² is a feature of the population, it has nothing to do with the sample size.

    2) The larger the variability in the xᵢ, the smaller the variance of the slope estimator. Everything else being equal, for estimating β₁ we prefer to have as much sample variation in x as possible. In any case, Assumption SLR.3 rules out S²_X = 0.

    3) The larger the sample size n, the smaller the variance of the slope estimator.
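    Formula (67) can also be checked by simulation: the empirical dispersion of the slope across repeated samples should match σ²/Σ(xᵢ − x̄)². The design below (σ = 2, n = 40, 4000 replications) is an illustrative choice of ours.

    ```python
    import random

    # Simulation check of the sampling variance formula (67).
    random.seed(1)
    sigma, n, reps = 2.0, 40, 4000
    x = [i * 0.25 for i in range(n)]
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    theoretical_var = sigma ** 2 / sxx             # equation (67)

    slopes = []
    for _ in range(reps):
        y = [1.0 + 0.5 * xi + random.gauss(0, sigma) for xi in x]
        slopes.append(sum((xi - xbar) * yi for xi, yi in zip(x, y)) / sxx)

    mean_b1 = sum(slopes) / reps
    empirical_var = sum((b - mean_b1) ** 2 for b in slopes) / reps
    print(theoretical_var, empirical_var)          # the two should be close
    ```

    Doubling σ would quadruple both variances, while spreading the xᵢ out further, or raising n, would shrink them, which is exactly conclusions 1) to 3).
    
    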

    We have a problem in calculating var(β̂₁): the error variance, σ², is unknown.

    Estimating the Error Variance

    We don't know what the error variance, σ², is, and we cannot estimate it from the errors, uᵢ, because we don't observe the errors.

    Given that σ² = E(u²), an unbiased estimator would be

    Σᵢ₌₁ⁿ uᵢ² / n    (70)

    Unfortunately, this is not a true estimator, because we don't observe the errors uᵢ. But we do have estimates of the uᵢ, namely the OLS residuals ûᵢ.

    The relation between errors and residuals is given by

    ûᵢ = yᵢ − ŷᵢ = (β₀ + β₁xᵢ + uᵢ) − (β̂₀ + β̂₁xᵢ)
       = uᵢ − (β̂₀ − β₀) − (β̂₁ − β₁)xᵢ    (71)

    Hence ûᵢ is not the same as uᵢ, although the difference between them does have an expected value of zero. If we replace the errors with the OLS residuals, we have

    Σᵢ₌₁ⁿ ûᵢ² / n    (72)

    This is a true estimator, because it gives a computable rule for any sample of the data, x and y. However, this estimator is biased, essentially because it does not account for two restrictions that must be satisfied by the OLS residuals:


    Σᵢ₌₁ⁿ ûᵢ = 0   and   Σᵢ₌₁ⁿ xᵢûᵢ = 0    (73)

    One way to view these restrictions is the following: if we know n − 2 of the residuals, we can get the other two residuals by using the restrictions implied by the moment conditions.

    Thus, there are only n − 2 degrees of freedom in the OLS residuals, as opposed to n degrees of freedom in the errors. The unbiased estimator of σ² that we will use makes an adjustment for the degrees of freedom:

    σ̂² = Σᵢ₌₁ⁿ ûᵢ² / (n − 2)    (74)

    THEOREM 2.3 Unbiased estimator of σ²

    Under assumptions SLR.1 to SLR.5,

    E(σ̂²) = σ²    (75)

    If σ̂² is plugged into the variance formulas, we then have unbiased estimators of var(β̂₀) and var(β̂₁).

    The natural estimator of σ is σ̂ = √σ̂², and it is called the standard error of the regression.

    The square root of the variance of β̂₁, defined in (69), is called the standard deviation of β̂₁, that is to say,

    sd(β̂₁) = σ / √( Σᵢ₌₁ⁿ (xᵢ − x̄)² )    (76)

    Therefore, its natural estimator is

    se(β̂₁) = σ̂ / √( Σᵢ₌₁ⁿ (xᵢ − x̄)² )    (77)

    Note that se(β̂₁), the standard error of β̂₁, is viewed as a random variable when we think of running OLS over different samples; this is because it depends on σ̂, which varies with different samples. The standard error of an estimate gives us an idea of how precise the estimator is.
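    The estimators (74) and (77) are easy to compute by hand. The sketch below fits OLS on a small made-up sample (the data are illustrative, not from the text) and applies the n − 2 degrees-of-freedom correction.

    ```python
    # Compute sigma-hat^2, equation (74), and se(b1), equation (77), on a toy sample.
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    y = [1.2, 2.1, 2.8, 4.2, 4.9, 6.1]
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)

    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar

    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sigma2_hat = sum(e ** 2 for e in residuals) / (n - 2)   # equation (74)
    se_b1 = sigma2_hat ** 0.5 / sxx ** 0.5                  # equation (77)
    print(b1, se_b1)
    ```

    Note that the residuals from this fit sum to zero, as restriction (73) requires; dividing by n instead of n − 2 would understate σ² on average.
    
    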


    2.6 Regression through the origin

    If we force the regression line to pass through the point (0,0), we are constraining the intercept to be zero, as can be seen in figure 2.5. This is called a regression through the origin.

    Figure 2.5. A regression through the origin.

    This is not very often done because, among other problems, when β₀ ≠ 0 the slope estimator will be biased.

    We are now going to see how to estimate a regression line through the origin. The fitted model is the following:

    ŷᵢ = β̂₁xᵢ    (78)

    Therefore, we must minimize

    min S = Σᵢ₌₁ⁿ (yᵢ − β̂₁xᵢ)²    (79)

    To minimize S, we differentiate with respect to β̂₁ and set the derivative equal to 0:

    dS/dβ̂₁ = −2 Σᵢ₌₁ⁿ (yᵢ − β̂₁xᵢ)xᵢ = 0    (80)

    Solving for β̂₁, we obtain


    β̂₁ = Σᵢ₌₁ⁿ yᵢxᵢ / Σᵢ₌₁ⁿ xᵢ²    (81)

    Another problem with fitting a regression line through the origin is that, in general,

    Σᵢ (yᵢ − ȳ)² ≠ Σᵢ (ŷᵢ − ȳ)² + Σᵢ ûᵢ²

    If the decomposition of the variance of y into two components (explained and residual) is not possible, then the R² is meaningless: in this case it can take values that are negative or greater than 1.

    To sum up, an intercept must be included in the regressions, unless there are strong theoretical reasons, from economic theory, to exclude it.
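    The sketch below illustrates both points: it estimates the slope through the origin with equation (81) on made-up data whose true intercept is far from zero, and shows that the usual R² formula can then come out negative (the data values are illustrative choices of ours).

    ```python
    # Regression through the origin, equation (81), on data with a large intercept.
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [10.2, 10.9, 12.1, 13.0, 13.8]   # roughly y = 9 + x, intercept far from 0

    b1_origin = sum(yi * xi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
    fitted = [b1_origin * xi for xi in x]

    ybar = sum(y) / len(y)
    ss_total = sum((yi - ybar) ** 2 for yi in y)
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    r2 = 1 - ss_resid / ss_total
    print(b1_origin, r2)   # r2 is negative here: the variance decomposition fails
    ```

    Forcing the line through (0,0) both biases the slope upward here and makes the residual sum of squares exceed the total sum of squares, hence the negative R².
    
    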


    Case study. Engel's Curve. Demand for dairy products

    The expression Engel's Curve refers to the line showing the relationship between the various quantities of a good a consumer is willing to purchase at varying income levels.

    In a survey of 40 households, data on expenditure on dairy products and income were obtained. These data appear in table 3. In order to avoid distortions due to the different sizes of the households, both consumption and income are expressed in per capita terms. The data are expressed in euros per month.

    Table 3. Expenditure on dairy products (dairy) and disposable income (inc), per capita. Unit: euros per month.

    household dairy inc household dairy inc

    1 8.87 1,250 21 16.20 2,100

    2 6.59 985 22 10.39 1,470

    3 11.46 2,175 23 13.50 1,225

    4 15.07 1,025 24 8.50 1,380

    5 15.60 1,690 25 19.77 2,450

    6 6.71 670 26 9.69 910

    7 10.02 1,600 27 7.90 690

    8 7.41 940 28 10.15 1,450

    9 11.52 1,730 29 13.82 2,275

    10 7.47 640 30 13.74 1,620

    11 6.73 860 31 4.91 740

    12 8.05 960 32 20.99 1,125

    13 11.03 1,575 33 20.06 1,335

    14 10.11 1,230 34 18.93 2,875

    15 18.65 2,190 35 13.19 1,680

    16 10.30 1,580 36 5.86 870

    17 15.30 2,300 37 7.43 1,620

    18 13.75 1,720 38 7.15 960

    19 11.49 850 39 9.10 1,125

    20 6.69 780 40 15.31 1,875

    There are multiple types of models used in demand studies. In particular, in this case study we are going to consider the following models: linear, inverse, linear-log, log-log (potential), log-linear (exponential) and inverse exponential. In the first three models, the regressand of the equation is directly the endogenous variable, whereas in the last three the regressand is the natural logarithm of the endogenous variable.

    In all the models we will calculate the marginal propensity to expenditure, as well as the expenditure/income elasticity.

    Linear model

    The linear model for the demand for dairy products is the following:

    dairy = β₀ + β₁·inc + u    (82)

    As is known, the marginal propensity indicates how expenditure changes when income varies, and it is obtained by differentiating expenditure with respect to income in the demand equation. In the linear model, the marginal propensity to expenditure is given by


    d(dairy)/d(inc) = β₁    (83)

    In other words, in the linear model the marginal propensity is constant and, therefore, independent of the value that income takes. The fact that it is constant is an advantage, but at the same time it has the disadvantage of not being well suited to describing the behaviour of consumers, especially when there are important differences in the incomes of the households. Thus, it is unreasonable that the marginal propensity to expenditure on dairy products should be the same in a low-income family and in a high-income family. However, if the variation of income in the sample is not very large, a linear model can be used to describe the demand for certain goods.

    In this model, the expenditure/income elasticity is the following:

    ε_dairy/inc = (d(dairy)/d(inc)) · (inc/dairy) = β₁ · (inc/dairy)    (84)

    Estimating the model (82) with the data of table 3, we obtain the fitted equation

    dairy = 4.012 + 0.005288·inc,   R² = 0.4584    (85)
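    Using the coefficients reported in (85), we can predict expenditure and evaluate the point elasticity (84). The income level inc = 1500 euros/month below is an illustrative choice of ours, not a value from the text.

    ```python
    # Prediction and point elasticity for the fitted linear model (85).
    b0, b1 = 4.012, 0.005288          # coefficients reported in (85)
    inc = 1500.0                      # illustrative income level (our choice)
    dairy_hat = b0 + b1 * inc
    elasticity = b1 * inc / dairy_hat  # equation (84) at the fitted value
    print(dairy_hat, elasticity)
    ```

    Because the elasticity depends on the point (inc, dairy) at which it is evaluated, it changes along the fitted line even though the marginal propensity β₁ does not.
    
    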

    Inverse model

    In an inverse model there is a linear relationship between expenditure and the inverse of income. Therefore, this model is directly linear in the parameters and is expressed in the following way:

    dairy = β₀ + β₁·(1/inc) + u    (86)

    The sign of the coefficient β₁ will be negative if income is positively correlated with expenditure. It is easy to see that, when income tends towards infinity, expenditure tends towards a limit equal to β₀. In other words, β₀ represents the maximum consumption of this good.

    In figure 2.6, we can see a double representation of the population function

    corresponding to this model. In the first one, the relationship between the dependent

    variable and explanatory variable has been represented. In the second one, the

    relationship between the regressand and regressor has been represented. As can be seen,

    the second function is linear.

    Figure 2.6. The inverse model

    In the inverse model, the marginal propensity to expenditure is given by


    d(dairy)/d(inc) = −β₁·(1/inc²)    (87)

    According to (87), the propensity is inversely proportional to the square of the income level.

    On the other hand, the elasticity is inversely proportional to the product of expenditure and income, as can be seen in the following expression:

    ε_dairy/inc = (d(dairy)/d(inc)) · (inc/dairy) = −β₁ · 1/(inc·dairy)    (88)

    Estimating the model (86) with the data of table 3, we obtain the fitted equation

    dairy = 18.652 − 8702·(1/inc),   R² = 0.4281    (89)

    In this case the coefficient β₁ does not have a direct economic meaning.

    Linear-log model

    This model is denominated the linear-log model because expenditure is a linear function of the logarithm of income, that is to say,

    dairy = β₀ + β₁·log(inc) + u    (90)

    In this model the marginal propensity to expenditure is given by

    d(dairy)/d(inc) = β₁ · d(log(inc))/d(inc) = β₁·(1/inc)    (91)

    and the expenditure/income elasticity by

    ε_dairy/inc = (d(dairy)/d(inc)) · (inc/dairy) = β₁·(1/dairy)    (92)

    As can be seen, in the linear-log model the marginal propensity is inversely

    proportional to the level of income, while the elasticity is inversely proportional to the

    level of expenditure on dairy products.

    In figure 2.7, we can see a double representation of the population function

    corresponding to this model.

    Figure 2.7. The linear log model


    Estimating the model (90) with the data of table 3, we obtain the fitted equation

    dairy = −41.623 + 7.399·log(inc),   R² = 0.4567    (93)

    The interpretation of β̂₁ is the following: if income increases by 1%, the demand for dairy products will increase by 0.07399 euros.

    Log-log model or potential model

    The potential model is defined in the following way:

    dairy = A·inc^β₁·e^u    (94)

    This model is not linear in the parameters, but it is linearizable by taking natural logarithms, obtaining the model

    log(dairy) = β₀ + β₁·log(inc) + u    (95)

    where β₀ = log(A).

    This model is also called the log-log model, because that is the structure of the corresponding linearized model.

    In this model the marginal propensity to expenditure is given by

    d(dairy)/d(inc) = β₁·(dairy/inc)    (96)

    In the log-log model, the elasticity is constant. Therefore, if income increases by 1%, expenditure will increase by β₁%, since

    ε_dairy/inc = (d(dairy)/d(inc)) · (inc/dairy) = d(log(dairy))/d(log(inc)) = β₁    (97)

    In figure 2.8, we can see a double representation of the population function corresponding to this model.

    Figure 2.8. The log log model

    Estimating the model (95) with the data of table 3, we obtain the fitted equation

    log(dairy) = −2.556 + 0.6866·log(inc),   R² = 0.5190    (98)
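    The constant-elasticity property of (98) can be checked by predicting expenditure before and after a 1% income rise; the relative change is close to the slope coefficient. The income level inc = 1500 below is an illustrative choice of ours.

    ```python
    import math

    # Constant elasticity in the fitted log-log model (98).
    b0, b1 = -2.556, 0.6866           # coefficients reported in (98)
    inc = 1500.0                      # illustrative income level (our choice)
    d0 = math.exp(b0 + b1 * math.log(inc))
    d1 = math.exp(b0 + b1 * math.log(inc * 1.01))
    pct_change = 100 * (d1 - d0) / d0
    print(pct_change)   # close to b1 = 0.6866 (in percent)
    ```

    Repeating the calculation at any other income level gives essentially the same percentage change, which is precisely what "constant elasticity" means.
    
    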


    In this case, β̂₁ is the expenditure/income elasticity. Its interpretation is the following: if income increases by 1%, the demand for dairy products will increase by 0.68%.

    Log-linear or exponential model

    The exponential model is defined in the following way:

    dairy = exp(β₀ + β₁·inc + u)    (99)

    By taking natural logarithms on both sides of (99), we obtain the following model, which is linear in the parameters:

    log(dairy) = β₀ + β₁·inc + u    (100)

    In this model the marginal propensity to expenditure is given by

    d(dairy)/d(inc) = β₁·dairy    (101)

    In the exponential model, unlike the other models seen previously, the marginal propensity increases when the level of expenditure does. For this reason, this model is adequate for describing the demand for luxury products. On the other hand, the elasticity is proportional to the level of income:

    ε_dairy/inc = (d(dairy)/d(inc)) · (inc/dairy) = (d(log(dairy))/d(inc)) · inc = β₁·inc    (102)

    In figure 2.9, we can see a double representation of the population function corresponding to this model.

    Figure 2.9. The log linear model

    Estimating the model (100) with the data of table 3, we obtain the fitted equation

    log(dairy) = 1.694 + 0.00048·inc,   R² = 0.4978    (103)

    The interpretation of β̂₁ is the following: if income increases by 1 euro, the demand for dairy products will increase by 0.048%.

    Inverse exponential model

    The inverse exponential model, which is a mixture of the exponential model and the inverse model, is given by

    dairy = exp(β₀ + β₁·(1/inc) + u)    (104)

    By taking natural logarithms on both sides of (104), we obtain the following model, which is linear in the parameters:

    log(dairy) = β₀ + β₁·(1/inc) + u    (105)

    In this model the marginal propensity to expenditure is given by

    d(dairy)/d(inc) = −β₁·(dairy/inc²)    (106)

    and the elasticity by

    ε_dairy/inc = (d(log(dairy))/d(inc)) · inc = −β₁·(1/inc)    (107)

    Estimating the model (105) with the data of table 3, we obtain the fitted equation

    log(dairy) = 3.049 − 822.02·(1/inc),   R² = 0.5040    (108)

    In this case, as in the inverse model, the coefficient β₁ does not have a direct economic meaning.

    In table 4, the marginal propensity, the expenditure/income elasticity and the R² of the six fitted models are shown.

    Table 4. Marginal propensity, expenditure/income elasticity and R² in the fitted models.

    Model                 Marginal propensity              Elasticity                          R²
    Linear                β̂₁ = 0.0053                     β̂₁·(inc/dairy) = 0.6505            0.4440
    Inverse               −β̂₁·(1/inc²) = 0.0044           −β̂₁·(1/(dairy·inc)) = 0.5361       0.4279
    Linear-log            β̂₁·(1/inc) = 0.0052             β̂₁·(1/dairy) = 0.6441              0.4566
    Log-log               β̂₁·(dairy/inc) = 0.0056         β̂₁ = 0.6864                        0.5188
    Log-linear            β̂₁·dairy = 0.0055               β̂₁·inc = 0.6783                    0.4976
    Inverse exponential   −β̂₁·(dairy/inc²) = 0.0047       −β̂₁·(1/inc) = 0.5815               0.5038


    The R² obtained in the first three models is not comparable with the R² obtained in the last three, because the functional form of the regressand is different: y in the first three models and log(y) in the last three.

    Comparing the first three models, the best one is the linear-log model, if we use the R² as the goodness-of-fit measure. Comparing the last three models, the best one is the log-log model. If we had used the Akaike Information Criterion (AIC), which allows comparing models with different functional forms for the regressand, then the log-log model would have been the best among the six fitted models.