Chapter 1: Regression Analysis

Post on 22-Oct-2015


“Regression is a statistical technique which establishes a functional relationship between two or more variables, in the form of an equation, to estimate the value of one variable based on the value of another variable.”

Regression Analysis

• Simple Linear Regression Model

$y = \beta_0 + \beta_1 x + \varepsilon$

• Simple Linear Regression Equation

$y = \beta_0 + \beta_1 x$

• Estimated Simple Linear Regression Equation

$\hat{y} = b_0 + b_1 x$

Principle of least squares technique

Case 1:

Graph 1

Observed points: (4,8); (8,1); (12,6)

Estimated points: (4,6); (8,5); (12,4)

Graph 2

Observed points: (4,8); (8,1); (12,6)

Estimated points: (4,2); (8,5); (12,8)

Error (graph 1)          Error (graph 2)

8 - 6 = 2                8 - 2 = 6

1 - 5 = -4               1 - 5 = -4

6 - 4 = 2                6 - 8 = -2

Total error = 0          Total error = 0

Absolute error (graph 1)     Absolute error (graph 2)

|8 - 6| = 2                  |8 - 2| = 6

|1 - 5| = 4                  |1 - 5| = 4

|6 - 4| = 2                  |6 - 8| = 2

Total absolute error = 8     Total absolute error = 12

Case 2:

Graph 1

Observed points: (2,4); (6,7); (10,2)

Estimated points: (2,4); (6,3); (10,2)

Graph 2

Observed points: (2,4); (6,7); (10,2)

Estimated points: (2,5); (6,4); (10,3)

Absolute error (graph 1)     Absolute error (graph 2)

|4 - 4| = 0                  |4 - 5| = 1

|7 - 3| = 4                  |7 - 4| = 3

|2 - 2| = 0                  |2 - 3| = 1

Total absolute error = 4     Total absolute error = 5

Squared error (graph 1)      Squared error (graph 2)

(4 - 4)² = 0                 (4 - 5)² = 1

(7 - 3)² = 16                (7 - 4)² = 9

(2 - 2)² = 0                 (2 - 3)² = 1

Sum of squared errors = 16 (Graph 1)

Sum of squared errors = 11 (Graph 2)
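The comparison in Case 1 and Case 2 can be reproduced with a short Python sketch. The function name `error_summary` and the variable names are illustrative choices, not part of the original slides; the y-values are the observed and estimated points listed above.

```python
def error_summary(observed, estimated):
    """Return (sum of errors, sum of absolute errors, sum of squared errors)."""
    errors = [o - e for o, e in zip(observed, estimated)]
    return (
        sum(errors),
        sum(abs(e) for e in errors),
        sum(e * e for e in errors),
    )

# y-values of the observed and estimated points from the two cases
case1_observed = [8, 1, 6]
case1_graph1 = [6, 5, 4]
case1_graph2 = [2, 5, 8]

case2_observed = [4, 7, 2]
case2_graph1 = [4, 3, 2]
case2_graph2 = [5, 4, 3]

for label, obs, est in [
    ("Case 1, graph 1", case1_observed, case1_graph1),
    ("Case 1, graph 2", case1_observed, case1_graph2),
    ("Case 2, graph 1", case2_observed, case2_graph1),
    ("Case 2, graph 2", case2_observed, case2_graph2),
]:
    total, absolute, squared = error_summary(obs, est)
    print(f"{label}: total={total}, absolute={absolute}, squared={squared}")
```

In Case 1 the plain total error is 0 for both lines, so it cannot separate them, while the absolute error can. In Case 2 the absolute-error totals favour graph 1 (4 against 5) but the squared-error totals favour graph 2 (11 against 16), because squaring penalises the single large miss more heavily.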

Least Squares Method

• Least Squares Criterion

$\min \sum (y_i - \hat{y}_i)^2$

where:

$y_i$ = observed value of the dependent variable for the i-th observation

$\hat{y}_i$ = estimated value of the dependent variable for the i-th observation

• Slope for the Estimated Regression Equation

$b_1 = \dfrac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2}$

x = value of independent variable for i-th observation

y = value of dependent variable for i-th observation

n = total number of observations

• y-Intercept for the Estimated Regression Equation

$b_0 = \bar{y} - b_1 \bar{x}$

$\bar{x}$ = mean value for independent variable

$\bar{y}$ = mean value for dependent variable

• Simple Linear Regression

Reed Auto periodically has a special week-long sale. As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown below.

Number of TV Ads 1 3 2 1 3

Number of Cars Sold 14 24 18 17 27
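As a minimal sketch, the slope and intercept formulas from the least squares method can be applied to the Reed Auto sample in Python. The function name `least_squares_line` is a hypothetical helper introduced here for illustration.

```python
def least_squares_line(x, y):
    """Return (b0, b1) for the estimated equation y_hat = b0 + b1 * x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    # Slope: (n*Sum(xy) - Sum(x)*Sum(y)) / (n*Sum(x^2) - (Sum(x))^2)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: mean of y minus slope times mean of x
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]

b0, b1 = least_squares_line(tv_ads, cars_sold)
print(f"y_hat = {b0:g} + {b1:g} x")
```

For this sample the sums are Σx = 10, Σy = 100, Σxy = 220 and Σx² = 24, which give a slope of 5 and an intercept of 10: each additional TV ad is associated with about 5 more cars sold.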

• The HRD manager of a company wants to find a measure which he can use to fix the monthly income of persons applying for a job in the production department. As an experimental project, he collected data on 7 persons from that department referring to years of service and their monthly income (in 000’s).

Years of experience 11 7 9 5 8 6 10

Income 10 8 6 5 9 7 11

• Find the regression equation of income on years of service.

• What starting income would you recommend for a person applying for the job after having served in a similar capacity in another company for 13 years?

• Do you think other factors are to be considered (in addition to the years of service) in fixing the income? Explain.
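One way to work the first two parts, using the slope and intercept formulas of the least squares method with x = years of experience and y = income (in thousands), is sketched below. The sums are computed from the seven observations given above.

$$
\begin{aligned}
n &= 7, \quad \sum x = 56, \quad \sum y = 56, \quad \sum xy = 469, \quad \sum x^2 = 476 \\
b_1 &= \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2} = \frac{7(469) - (56)(56)}{7(476) - 56^2} = \frac{147}{196} = 0.75 \\
b_0 &= \bar{y} - b_1 \bar{x} = 8 - 0.75(8) = 2 \\
\hat{y} &= 2 + 0.75x \\
\hat{y}(13) &= 2 + 0.75(13) = 11.75
\end{aligned}
$$

On this fit, 13 years of experience corresponds to an estimated income of about 11.75 thousand. Note that 13 years lies outside the observed range of 5 to 11 years, so the figure is an extrapolation.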

Properties of regression lines and their coefficients:

1. The correlation coefficient is the geometric mean of the two regression coefficients: $r = \pm\sqrt{b_{yx} \, b_{xy}}$.

2. The sign of the correlation coefficient is the same as that of the regression coefficients.

3. Regression coefficients are independent of a change of origin but not of a change of scale.
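These three properties can be checked numerically. The sketch below uses the years-of-experience and income data from the HRD example; `slope` and `correlation` are hypothetical helper names, and the shift and scale values are arbitrary choices for illustration.

```python
from math import copysign, sqrt

def slope(x, y):
    """Least-squares regression coefficient of y on x."""
    n = len(x)
    s_xy = n * sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y)
    s_xx = n * sum(a * a for a in x) - sum(x) ** 2
    return s_xy / s_xx

def correlation(x, y):
    """Pearson correlation coefficient between x and y."""
    n = len(x)
    s_xy = n * sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y)
    s_xx = n * sum(a * a for a in x) - sum(x) ** 2
    s_yy = n * sum(b * b for b in y) - sum(y) ** 2
    return s_xy / sqrt(s_xx * s_yy)

x = [11, 7, 9, 5, 8, 6, 10]   # years of experience
y = [10, 8, 6, 5, 9, 7, 11]   # income (in thousands)

b_yx = slope(x, y)            # regression coefficient of y on x
b_xy = slope(y, x)            # regression coefficient of x on y
r = correlation(x, y)

# Properties 1 and 2: r is the geometric mean of b_yx and b_xy, with their sign
geometric_mean = copysign(sqrt(b_yx * b_xy), b_yx)

# Property 3: shifting the origin leaves the coefficient unchanged ...
shifted = slope([xi - 8 for xi in x], [yi - 5 for yi in y])
# ... but rescaling x (here halving it) changes it (here doubling it)
scaled = slope([xi / 2 for xi in x], y)

print(r, geometric_mean, b_yx, shifted, scaled)
```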

In finance, it is of interest to look at the relationship between Y, a stock’s average return, and X, the overall market return. The slope coefficient computed by linear regression is called the stock’s beta by investment analysts. A beta greater than 1 indicates that the stock is relatively sensitive to changes in the market; a beta less than 1 indicates that the stock is relatively insensitive. For the following data, compute the beta and suggest market trend.

X (%)

10 12 8 15 9 11 8 10 13 11

Y (%)

11 15 3 18 10 12 6 7 18 13
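A minimal sketch of the beta computation in Python follows, treating beta as the least-squares slope of Y on X as described above. The variable names are illustrative.

```python
market_return = [10, 12, 8, 15, 9, 11, 8, 10, 13, 11]   # X (%)
stock_return = [11, 15, 3, 18, 10, 12, 6, 7, 18, 13]    # Y (%)

n = len(market_return)
sum_x = sum(market_return)
sum_y = sum(stock_return)
sum_xy = sum(x * y for x, y in zip(market_return, stock_return))
sum_x2 = sum(x * x for x in market_return)

# Beta is the slope of the regression of stock return on market return
beta = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
intercept = sum_y / n - beta * sum_x / n

print(f"beta = {beta:.3f}, intercept = {intercept:.3f}")
```

Here Σx = 107, Σy = 113, Σxy = 1301 and Σx² = 1189, so beta = 919/441, which is roughly 2.08. A value above 1 suggests that this stock is relatively sensitive to changes in the market.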

Multiple regression Analysis

• A linear regression equation with more than one independent variable is called a multiple regression model.

The linear regression equation with k independent variables takes the form:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_k x_k + \varepsilon$

where

y is the value of the dependent variable to be estimated

$\beta_0$ is a constant

$\beta_1, \beta_2, \dots, \beta_k$ are the regression coefficients associated with each of the independent variables x

$\varepsilon$ is the random error due to chance.

Let the fitted linear regression equation be

$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k$

which minimizes the sum of squares of errors (SSE) $\sum (y - \hat{y})^2$

where

$\hat{y}$ is the estimated value of the dependent variable y

$b_1, b_2, b_3, \dots, b_k$ are the partial regression coefficients and are obtained by the principle of the least squares technique.

• Let us consider the case of two independent variables and one dependent variable.

The multiple linear regression model involving two independent variables is:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$

where

y is the dependent variable

$x_1$ and $x_2$ are independent variables

$\varepsilon$ is the random error due to chance

$\beta_0$ is the y-intercept

$\beta_1, \beta_2$ are the regression coefficients.

Let the fitted multiple linear regression equation be

$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$

or

$\hat{y} = b_0 + b_{y1.2} x_1 + b_{y2.1} x_2$

where

$\hat{y}$ is the estimated value of the dependent variable y

$x_1, x_2$ are the independent variables

$b_0, b_1, b_2$ are the unknown constants and are determined by the principle of the least squares technique, which minimizes the sum of squares of errors (SSE) $\sum (y - \hat{y})^2$.

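By solving the following equations, the values of $b_0$, $b_1$, $b_2$ (with $b_1 = b_{y1.2}$ and $b_2 = b_{y2.1}$) can be determined:

$\sum y = n b_0 + b_{y1.2} \sum x_1 + b_{y2.1} \sum x_2$

$\sum x_1 y = b_0 \sum x_1 + b_{y1.2} \sum x_1^2 + b_{y2.1} \sum x_1 x_2$

$\sum x_2 y = b_0 \sum x_2 + b_{y1.2} \sum x_1 x_2 + b_{y2.1} \sum x_2^2$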

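Let the fitted multiple linear regression equation be

$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$

or

$\hat{y} = b_0 + b_{y1.2} x_1 + b_{y2.1} x_2$ ---(1)

$\bar{y} = b_0 + b_{y1.2} \bar{x}_1 + b_{y2.1} \bar{x}_2$ ---(2)

(1)-(2):

$(\hat{y} - \bar{y}) = b_{y1.2} (x_1 - \bar{x}_1) + b_{y2.1} (x_2 - \bar{x}_2)$

$Y = b_{y1.2} X_1 + b_{y2.1} X_2$

$b_{y1.2} = \dfrac{\sum Y X_1 \sum X_2^2 - \sum Y X_2 \sum X_1 X_2}{\sum X_1^2 \sum X_2^2 - (\sum X_1 X_2)^2}$

$b_{y2.1} = \dfrac{\sum Y X_2 \sum X_1^2 - \sum Y X_1 \sum X_1 X_2}{\sum X_1^2 \sum X_2^2 - (\sum X_1 X_2)^2}$

where

$Y = y - \bar{y}$

$X_1 = x_1 - \bar{x}_1$

$X_2 = x_2 - \bar{x}_2$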

• A marketing manager of a company wants to predict demand for the product. He strongly believes that demand (Y) is highly influenced by the annual average price (X1) of the product (in units) and the advertising expenditure (X2) (Rs in lakh). He has collected past data to study the effect of these factors on demand, given below:

Y 4 6 7 9 13 15

X1 15 12 8 6 4 3

X2 30 24 20 14 10 4
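A minimal Python sketch of the deviation-form formulas for $b_{y1.2}$ and $b_{y2.1}$, applied to the marketing data, is given below. The function name `fit_two_regressors` and the short names for the sums are assumptions made for this illustration.

```python
def fit_two_regressors(y, x1, x2):
    """Return (b0, b_y1_2, b_y2_1) for y_hat = b0 + b_y1_2*x1 + b_y2_1*x2."""
    n = len(y)
    y_bar, x1_bar, x2_bar = sum(y) / n, sum(x1) / n, sum(x2) / n

    # Deviations from the means: Y = y - y_bar, X1 = x1 - x1_bar, X2 = x2 - x2_bar
    Y = [v - y_bar for v in y]
    X1 = [v - x1_bar for v in x1]
    X2 = [v - x2_bar for v in x2]

    s_y1 = sum(a * b for a, b in zip(Y, X1))    # Sum(Y * X1)
    s_y2 = sum(a * b for a, b in zip(Y, X2))    # Sum(Y * X2)
    s_11 = sum(a * a for a in X1)               # Sum(X1^2)
    s_22 = sum(a * a for a in X2)               # Sum(X2^2)
    s_12 = sum(a * b for a, b in zip(X1, X2))   # Sum(X1 * X2)

    denom = s_11 * s_22 - s_12 ** 2
    b_y1_2 = (s_y1 * s_22 - s_y2 * s_12) / denom
    b_y2_1 = (s_y2 * s_11 - s_y1 * s_12) / denom
    # Intercept from equation (2): the fitted plane passes through the means
    b0 = y_bar - b_y1_2 * x1_bar - b_y2_1 * x2_bar
    return b0, b_y1_2, b_y2_1

demand = [4, 6, 7, 9, 13, 15]          # Y
price = [15, 12, 8, 6, 4, 3]           # X1
advertising = [30, 24, 20, 14, 10, 4]  # X2

b0, b1, b2 = fit_two_regressors(demand, price, advertising)
print(f"y_hat = {b0:.4f} + ({b1:.4f}) x1 + ({b2:.4f}) x2")
```

For these data the deviation sums are ΣYX1 = -93, ΣYX2 = -198, ΣX1² = 110, ΣX2² = 454 and ΣX1X2 = 218, so the common denominator is 2416.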

Ex: Christmas week is a critical period for most ski resorts. Because many students and adults are free from other obligations, they are able to spend several days indulging in their favorite pastime, skiing. A large proportion of gross revenue is earned during this period. A ski resort in Vermont wanted to determine the effect that weather had on its sales of lift tickets. The manager of the resort collected data on the number of lift tickets sold during Christmas week (y), the total snowfall in inches (x1), and the average temperature in degrees Fahrenheit (x2) for the past 10 years. Develop the multiple regression model.

Tickets 6835 7870 6173 7979 7639 7167 8094 9903 9788 9557

Snowfall 19 15 7 11 19 2 21 19 18 20

Temperature 11 -19 36 22 14 -20 39 27 26 16

• The Federal Reserve is performing a preliminary study to determine the relationship between certain economic indicators and annual percentage change in the gross national product (GNP). Two such indicators being examined are the amount of the federal government’s deficit (in billions of dollars) and the Dow Jones Industrial Average (the mean value over the year). Data for 6 years follow:

Change in GNP 2.5 -1.0 4.0 1.0 1.5 3.0

Federal Deficit 100.0 400.0 120.0 200.0 180.0 80.0

Dow Jones 2850 2100 3300 2400 2550 2700

i) Calculate the least squares equation that best describes the data.

ii) What % change in GNP would be expected in a year in which the federal deficit was $240 billion and the mean Dow Jones value was 3000?
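One possible approach to parts (i) and (ii) is to build the three normal equations for $b_0$, $b_1$, $b_2$ from the data and solve them directly. The sketch below does this in plain Python with Gaussian elimination; `solve_linear_system` and `dot` are hypothetical helpers written for this illustration, not functions from any library.

```python
def solve_linear_system(a, b):
    """Solve a * z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        # Bring the row with the largest entry in this column to the pivot position
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

gnp_change = [2.5, -1.0, 4.0, 1.0, 1.5, 3.0]            # y
deficit = [100.0, 400.0, 120.0, 200.0, 180.0, 80.0]     # x1
dow_jones = [2850, 2100, 3300, 2400, 2550, 2700]        # x2

# Columns 1, x1, x2: the normal equations are
#   Sum(c_i * y) = b0*Sum(c_i * 1) + b1*Sum(c_i * x1) + b2*Sum(c_i * x2)
columns = [[1.0] * len(gnp_change), deficit, dow_jones]
lhs = [[dot(ci, cj) for cj in columns] for ci in columns]
rhs = [dot(ci, gnp_change) for ci in columns]

b0, b1, b2 = solve_linear_system(lhs, rhs)
predicted = b0 + b1 * 240 + b2 * 3000

print(f"y_hat = {b0:.4f} + ({b1:.6f}) x1 + ({b2:.6f}) x2")
print(f"predicted % change in GNP: {predicted:.2f}")
```

With only six observations the fitted coefficients are sensitive to individual data points, so the prediction for a deficit of $240 billion and a Dow Jones mean of 3000 (about 2.3%) should be read as a rough estimate.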

• Multiple correlation analysis:

It is a measure of association between a dependent variable and several independent variables taken together.

The coefficient of multiple correlation is given by,

$R_{y.12} = \sqrt{\dfrac{r_{y1}^2 + r_{y2}^2 - 2 r_{y1} r_{y2} r_{12}}{1 - r_{12}^2}}$

Its value always lies between 0 and 1.

• Coefficient of multiple determination:

It is the proportion of the total variation in the values of the dependent variable y that is accounted for, or explained, by the independent variables in the multiple regression model.

• The square of coefficient of multiple correlation is called Coefficient of multiple determination.
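As a rough illustration, the coefficient of multiple correlation $R_{y.12}$ and its square can be computed from the three pairwise correlation coefficients. The sketch below uses the demand, price and advertising figures from the marketing example; `pearson_r` and `multiple_correlation` are hypothetical helper names.

```python
from math import sqrt

def pearson_r(u, v):
    """Simple (pairwise) correlation coefficient between u and v."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar) ** 2 for a in u)
    s_vv = sum((b - v_bar) ** 2 for b in v)
    return s_uv / sqrt(s_uu * s_vv)

def multiple_correlation(y, x1, x2):
    """Return (R, R squared) for y on x1 and x2 from pairwise correlations."""
    r_y1 = pearson_r(y, x1)
    r_y2 = pearson_r(y, x2)
    r_12 = pearson_r(x1, x2)
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return sqrt(r_squared), r_squared

demand = [4, 6, 7, 9, 13, 15]          # Y
price = [15, 12, 8, 6, 4, 3]           # X1
advertising = [30, 24, 20, 14, 10, 4]  # X2

R, R2 = multiple_correlation(demand, price, advertising)
print(f"R = {R:.4f}, R^2 = {R2:.4f}")
```

For these data the coefficient of multiple determination comes to about 0.968, meaning that roughly 97% of the variation in demand is explained by price and advertising expenditure taken together.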
