Regression

Upload: fansuri80

Post on 29-May-2018


TRANSCRIPT

  • 8/9/2019 Regressi On

    1/16

    Chapter: Regression Approaches

    - Initial investigations.
    - Simple linear regression model.
      - Parameter estimation.
      - Forecasting.
    - Multivariate linear regression model.
      - Parameter estimation.
      - Forecasting.
    - Model building and residual analysis.


    Initial Investigation

    It is good practice to investigate the data before performing advanced analyses (e.g. modelling).

    Some reasons for performing initial analyses:
    - to identify patterns in the data,
    - to identify potential outliers or non-normal behaviour in some observations, and
    - to understand the data better.

    Some possible methods:
    - Plots (e.g. scatter plot, histogram, distribution plot).
    - Simple summary statistics (e.g. mean, variance, skewness).
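As a rough illustration of the simple statistics mentioned above, the sketch below computes the mean, sample variance and skewness of a small data set using only the Python standard library. The observations are made up for illustration, and `skewness` is a hypothetical helper (population skewness), not something defined in the slides.

```python
import statistics


def skewness(values):
    """Population skewness: mean cubed deviation divided by the cubed population std. dev."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (n * sd ** 3)


# Made-up observations, for illustration only.
y = [2.0, 4.0, 5.0, 4.0, 5.0]

summary = {
    "mean": statistics.fmean(y),
    "variance": statistics.variance(y),  # sample variance (divides by n - 1)
    "skewness": skewness(y),
}
print(summary)
```

A clearly negative or positive skewness is one hint of non-normal behaviour that is worth confirming with a histogram.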


    Simple Linear Regression Model

    Objective: to model the relationship between two variables.

    This model assumes that the relationship between the dependent variable, $y$, and the independent variable, $x$, can be described by a straight line:

    $$y = \beta_0 + \beta_1 x + \varepsilon \qquad (1)$$

    where
    - $\beta_0$ - intercept of $y$ when $x = 0$,
    - $\beta_1$ - slope; the change in the mean value of $y$ associated with a unit increase in $x$,
    - $\varepsilon$ - error term.

    The unknown parameters can be estimated using the least squares method, giving the estimated model $\hat{y} = b_0 + b_1 x$, where $b_0$ and $b_1$ are unbiased estimators of $\beta_0$ and $\beta_1$ respectively.


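    Least Squares Method

    Objective: this method seeks the estimators ($b_0$ and $b_1$) that minimise the sum of squared errors, where $e_t = y_t - \hat{y}_t$.

    The total squared error is computed by:

    $$\sum_t e_t^2 = \sum_t (y_t - \hat{y}_t)^2 \qquad (2)$$

    Minimising equation (2) with respect to $b_0$ and $b_1$ gives the estimators:

    $$b_0 = \bar{y} - b_1 \bar{x}, \qquad b_1 = \frac{SS_{xy}}{SS_{xx}}$$

    where

    $$SS_{xy} = \sum_{t=1}^{n} (x_t - \bar{x})(y_t - \bar{y}) = \sum_t x_t y_t - \frac{\left(\sum_t x_t\right)\left(\sum_t y_t\right)}{n}$$

    $$SS_{xx} = \sum_{t=1}^{n} (x_t - \bar{x})^2 = \sum_t x_t^2 - \frac{\left(\sum_t x_t\right)^2}{n}$$

    $$\bar{y} = \sum_{t=1}^{n} y_t / n \quad \text{and} \quad \bar{x} = \sum_{t=1}^{n} x_t / n.$$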
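The least squares formulas can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: `fit_simple_linear` is a hypothetical helper name and the data are made up. It also checks that the shortcut (raw-sum) forms of SSxy and SSxx agree with the deviation forms.

```python
def fit_simple_linear(x, y):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Deviation forms of SSxy and SSxx.
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = fit_simple_linear(x, y)

# The shortcut (raw-sum) forms give the same SSxy and SSxx.
n = len(x)
ss_xy_short = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
ss_xx_short = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"b1 from shortcut sums = {ss_xy_short / ss_xx_short:.3f}")
```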


    Model Fit

    (i) Determination of the relationship between x and y.

    The degree of relationship between $x$ and $y$ represents how much of the variability in $y$ can be explained by $x$.

    In regression analysis, the total variation consists of explained variation and unexplained variation:

    - Total variation: the sum of squared errors obtained when the explanatory variable $x$ is not considered, $\sum_t (y_t - \bar{y})^2$.
    - Unexplained variation: the amount of variation in the values of $y$ that is NOT explained by $x$, also called SSE, $\sum_t (y_t - \hat{y}_t)^2$.
    - Explained variation: the amount of variation in the values of $y$ that is explained by $x$, $\sum_t (\hat{y}_t - \bar{y})^2$.


    Model Fit

    The degree of relationship between $x$ and $y$ can be measured using a simple coefficient called $R^2$:

    $$R^2 = \frac{\text{Explained variation}}{\text{Total variation}}$$

    where $0 \le R^2 \le 1$.

    This coefficient gives the proportion of the total variation in $y$ that is explained by the simple linear regression model, based on the sample of size $n$. The closer $R^2$ is to 1, the more of the variation the constructed model explains.

    $R = \pm\sqrt{R^2}$ ($-1 \le R \le 1$), taking the sign of $b_1$, gives the direction of the relationship: $R > 0$ shows a positive relationship and $R < 0$ a negative relationship.
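To make the decomposition concrete, the sketch below computes the total, explained and unexplained variation for a least squares line, then R-squared and the signed R. The function name `variation_summary` and the data are assumptions made for illustration.

```python
import math


def variation_summary(x, y):
    """Split the total variation of y around a least squares line and compute R^2 and R."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * xi for xi in x]

    total = sum((yi - y_bar) ** 2 for yi in y)
    unexplained = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # SSE
    explained = sum((fi - y_bar) ** 2 for fi in y_hat)

    r2 = explained / total
    r = math.copysign(math.sqrt(r2), b1)  # R takes the sign of the slope
    return {"total": total, "explained": explained,
            "unexplained": unexplained, "r2": r2, "r": r}


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

result = variation_summary(x, y)
print(result)
```

For a least squares fit with an intercept, the explained and unexplained parts add up to the total variation, which is a useful sanity check on any implementation.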


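    Model Fit

    Hypothesis test for a significant relationship between $x$ and $y$:

    $H_0$: There is no relationship between $x$ and $y$, $\rho = 0$.
    $H_1$: There is a relationship between $x$ and $y$, $\rho \neq 0$.

    The test statistic is

    $$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$

    where $r$ is the sample correlation coefficient. Under $H_0$, $t$ has a t-distribution with $n - 2$ degrees of freedom.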
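A small sketch of this test statistic, assuming r is the usual sample correlation coefficient computed from SSxy, SSxx and SSyy. The helper name `correlation_t_statistic` and the data are hypothetical.

```python
import math


def correlation_t_statistic(x, y):
    """Return (r, t) where t = r * sqrt(n - 2) / sqrt(1 - r^2)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    r = ss_xy / math.sqrt(ss_xx * ss_yy)
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    return r, t


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r, t = correlation_t_statistic(x, y)
print(f"r = {r:.4f}, t = {t:.4f}")
```

The resulting t would then be compared with a critical value from a t table with n - 2 degrees of freedom.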


    Model Fit

    (ii) An F-test for testing the model.

    This statistic tests the significance of the constructed model.

    Hypothesis testing:
    $H_0$: $\beta_1 = 0$ ($x$ is not important in the model).
    $H_1$: $\beta_1 \neq 0$ ($x$ is important in the model).

    The relevant test statistic is

    $$F_M = \frac{\text{Explained variation}}{\text{Unexplained variation}/(n-2)}.$$

    If the regression assumptions hold, then under $H_0$ the statistic $F_M$ has an F-distribution with 1 and $n - 2$ degrees of freedom.
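The F statistic can be sketched as follows, assuming the unexplained variation is divided by n - 2 so that it matches the stated 1 and n - 2 degrees of freedom. `f_statistic` is a hypothetical helper and the data are made up.

```python
def f_statistic(x, y):
    """F = explained variation / (unexplained variation / (n - 2)) for a simple linear fit."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * xi for xi in x]

    explained = sum((fi - y_bar) ** 2 for fi in y_hat)
    unexplained = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return explained / (unexplained / (n - 2))


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

f_model = f_statistic(x, y)
print(f"F = {f_model:.4f}")
```

In simple linear regression this F value equals the square of the t statistic for the slope, so the two tests lead to the same conclusion.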


    Model Fit

    (iii) Testing the significance of $b_1$.

    Objective: to check whether the relationship between $x$ and $y$ is significant.

    Hypotheses (for example): $H_0: \beta_1 = 0$ vs $H_1: \beta_1 \neq 0$.

    If the regression assumptions hold, then

    $$b_1 \sim N\!\left(\beta_1,\ \sigma_{b_1} = \sigma / \sqrt{SS_{xx}}\right)$$

    where the estimator of $\sigma_{b_1}$ is $s_{b_1} = s / \sqrt{SS_{xx}}$.

    Then

    $$\frac{b_1 - \beta_1}{s_{b_1}}$$

    has a t-distribution with $n - 2$ degrees of freedom.
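A minimal sketch of the slope test, assuming s is the standard error sqrt(SSE / (n - 2)). The helper `slope_t_test`, its `beta1_null` parameter and the data are all illustrative assumptions.

```python
import math


def slope_t_test(x, y, beta1_null=0.0):
    """Return (b1, s_b1, t) with t = (b1 - beta1_null) / s_b1."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar

    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))   # standard error of the regression
    s_b1 = s / math.sqrt(ss_xx)    # estimated standard deviation of b1
    t = (b1 - beta1_null) / s_b1
    return b1, s_b1, t


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b1, s_b1, t_b1 = slope_t_test(x, y)
print(f"b1 = {b1:.3f}, s_b1 = {s_b1:.4f}, t = {t_b1:.4f}")
```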


    Model Fit

    (iv) Testing the significance of $b_0$.

    Objective: to check the significance of the intercept on the y-axis.

    Hypotheses (for example): $H_0: \beta_0 = 0$ vs $H_1: \beta_0 \neq 0$.

    If the regression assumptions hold, then $b_0 \sim N(\beta_0, \sigma_{b_0})$, where the estimator of $\sigma_{b_0}$ is

    $$s_{b_0} = s \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{SS_{xx}}}$$

    Then

    $$\frac{b_0 - \beta_0}{s_{b_0}}$$

    has a t-distribution with $n - 2$ degrees of freedom.
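The intercept test follows the same pattern as the slope test, with a different standard deviation estimate. The sketch below assumes s = sqrt(SSE / (n - 2)); `intercept_t_test`, `beta0_null` and the data are hypothetical.

```python
import math


def intercept_t_test(x, y, beta0_null=0.0):
    """Return (b0, s_b0, t) with s_b0 = s * sqrt(1/n + x_bar^2 / SSxx)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar

    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    s_b0 = s * math.sqrt(1 / n + x_bar ** 2 / ss_xx)
    t = (b0 - beta0_null) / s_b0
    return b0, s_b0, t


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, s_b0, t_b0 = intercept_t_test(x, y)
print(f"b0 = {b0:.3f}, s_b0 = {s_b0:.4f}, t = {t_b0:.4f}")
```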


    Model Adequacy Check

    Statistical models depend on some assumptions. These must be checked so that the results obtained can be accepted.

    In the least squares linear model, the following assumptions must be fulfilled:
    - A linear relationship between $x$ and $y$.
    - The error term, $\varepsilon$, must be normally distributed with mean 0 and a constant variance $\sigma^2$.
    - The error values are statistically independent of each other.

    The mean square error, $\sigma^2$, is estimated by

    $$s^2 = \frac{\sum_t y_t^2 - b_0 \sum_t y_t - b_1 \sum_t x_t y_t}{n-2} = \frac{SSE}{n-2}.$$

    The standard error, $\sigma$, is estimated by

    $$s = \sqrt{\frac{SSE}{n-2}}.$$

    All these assumptions can be checked through plots.
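Before plotting, the residuals and the standard error can be computed directly. The sketch below uses made-up data, computes SSE both from the residuals and from the raw-sum shortcut, and prints each residual scaled by s as a crude text substitute for a residual plot. Dividing by s is only a rough scaling chosen for this illustration.

```python
import math

# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

# Residuals e_t = y_t - y_hat_t; with an intercept in the model they sum to zero.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

sse = sum(e ** 2 for e in residuals)
# Shortcut form: sum(y^2) - b0 * sum(y) - b1 * sum(x * y).
sse_shortcut = (sum(yi ** 2 for yi in y) - b0 * sum(y)
                - b1 * sum(xi * yi for xi, yi in zip(x, y)))
s = math.sqrt(sse / (n - 2))

# Look for residuals that show no trend in x and a roughly constant spread.
for xi, e in zip(x, residuals):
    print(f"x = {xi:4.1f}  residual = {e:+.2f}  residual / s = {e / s:+.2f}")
```

A curved pattern in the residuals against x suggests the linearity assumption fails, while a funnel shape suggests a non-constant variance.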


    Some Informative Plots

    [Figure: plots not preserved in the transcript.]


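    Forecasting Using the Simple Linear Regression Model

    Once the constructed model has been checked and we are satisfied with it, forecasts can be made.

    Forecasts can be made through:

    - Point estimate: e.g. let the constructed model be $\hat{y} = 0.5 + 2x$. Substituting a value of $x$ gives the forecast value of $y$.
    - Interval estimate: e.g. given a value of $x$, we want to know the range of plausible values of $y$. This can be obtained from

    $$\hat{y} \pm t^{(n-2)}_{\alpha/2}\, s \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{SS_{xx}}}$$

    This form gives a confidence interval for the mean value of $y$ at the given $x$; a prediction interval for an individual value of $y$ uses $1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{SS_{xx}}$ under the square root.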
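Both kinds of forecast can be sketched as below. The point forecast uses the example model from the text (0.5 + 2x) at a hypothetical x = 3. The interval follows the formula as written, which is a confidence interval for the mean of y, and is applied to made-up data. The critical value 3.182 is taken from a standard t table for 3 degrees of freedom and alpha/2 = 0.025; the function names are assumptions.

```python
import math


def point_forecast(b0, b1, x0):
    """Point estimate from the fitted line y_hat = b0 + b1 * x0."""
    return b0 + b1 * x0


def mean_interval(x, y, x0, t_crit):
    """y_hat +/- t_crit * s * sqrt(1/n + (x0 - x_bar)^2 / SSxx) for the mean of y at x0."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))

    y_hat = point_forecast(b0, b1, x0)
    half_width = t_crit * s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / ss_xx)
    return y_hat - half_width, y_hat + half_width


# Example model from the text, evaluated at a hypothetical x = 3.
example = point_forecast(0.5, 2.0, 3.0)

# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
T_CRIT_DF3 = 3.182  # t table value: 3 degrees of freedom, alpha/2 = 0.025

low, high = mean_interval(x, y, x0=4.0, t_crit=T_CRIT_DF3)
print(f"point forecast = {example}")
print(f"95% interval for the mean of y at x = 4: ({low:.3f}, {high:.3f})")
```

Passing the critical value in as an argument keeps the sketch free of third-party dependencies; in practice it could come from a table or a statistics library.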


    Example

    Quality Home Improvement Center (QHIC)

    QHIC operates five stores in a large metropolitan area. The marketing department at QHIC wishes to study the relationship between home value (in thousands of dollars), $x$, and yearly expenditure on home upkeep (in dollars), $y$. A random sample of 40 homeowners is taken, and they are asked to estimate their expenditures during the previous year on the types of home upkeep products and services offered by QHIC.


    How to Investigate

    Check list

    1. What is the relationship between x and y? Is it linear?

    2. What are the estimated values of the parameters $\beta_0$ and $\beta_1$?

    3. Is the constructed model good enough?

    4. Does the constructed model fulfil all the assumptions? (Check the error plots.)

    5. Can prediction or forecasting be made?

    6. What can we conclude about the constructed model?
