Chapter 6 (cont.) Regression Estimation

Upload: charity-lynette-blankenship

Post on 23-Dec-2015


Page 1: Chapter 6 (cont.) Regression Estimation. Simple Linear Regression: review of least squares procedure 2

Chapter 6 (cont.) Regression Estimation

Simple Linear Regression: review of least squares procedure

Introduction

We will examine the relationship between quantitative variables x and y via a mathematical equation.

x: explanatory variable
y: response variable

Data: \( (x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n) \)

The Model

The model has a deterministic and a probabilistic component.

Most lots sell for $25,000. Building a house costs about $75 per square foot.

House cost = 25000 + 75(Size)

[Figure: house cost plotted against house size]

The Model

However, house costs vary even among houses of the same size. Since costs behave unpredictably, we add a random component ε:

House cost = 25000 + 75(Size) + ε

[Figure: house cost plotted against house size]
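As a minimal sketch of the two components, the snippet below separates the deterministic part, 25000 + 75(Size), from a random error term. The normal distribution for the error and its standard deviation of 10,000 are assumptions made for illustration only; the slides do not specify how the random component is distributed.

```python
import random

LOT_PRICE = 25_000      # from the slide: most lots sell for 25,000
COST_PER_SQFT = 75      # from the slide: about 75 per square foot
ERROR_SD = 10_000       # hypothetical spread of the random component


def deterministic_cost(size_sqft):
    """Deterministic component: 25000 + 75 * size."""
    return LOT_PRICE + COST_PER_SQFT * size_sqft


def simulated_cost(size_sqft, rng):
    """Deterministic component plus a random error (assumed normal)."""
    return deterministic_cost(size_sqft) + rng.gauss(0, ERROR_SD)


rng = random.Random(1)  # fixed seed so repeated runs give the same draws
costs = [simulated_cost(2000, rng) for _ in range(5)]

print("Deterministic cost for 2000 sq ft:", deterministic_cost(2000))
print("Five simulated costs:", [round(c) for c in costs])
```

Houses of the same size share the deterministic cost but differ in the simulated one, which is the behavior the random component is meant to capture.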

The Model

The first-order linear model

\[ y = \beta_0 + \beta_1 x + \varepsilon \]

y = response variable
x = explanatory variable
β0 = y-intercept
β1 = slope of the line
ε = error variable

[Figure: line in the (x, y) plane with intercept β0 and slope β1 = Rise/Run]

β0 and β1 are unknown population parameters and are therefore estimated from the data.

Estimating the Coefficients

The estimates are determined by

– drawing a sample from the population of interest,

– calculating sample statistics,

– producing a straight line that cuts into the data.

[Figure: scatterplot of y against x]

Question: What should be considered a good line?

The Least Squares (Regression) Line

A good line is one that minimizes the sum of squared differences (errors) \( y_i - \hat{y}_i \) between the scatterplot points and the line.

Data: \( (x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n) \)

Fitted values: \( \hat{y}_i = b_0 + b_1 x_i \)

Determine \( b_0 \) and \( b_1 \) to minimize

\[ SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 . \]

The Least Squares (Regression) Line

Let us compare two lines for the points (1, 2), (2, 4), (3, 1.5), (4, 3.2).

[Figure: the four points with two candidate lines; the second line is horizontal at y = 2.5]

First line (fitted values 1, 2, 3, 4):
Sum of squared differences = (2 - 1)^2 + (4 - 2)^2 + (1.5 - 3)^2 + (3.2 - 4)^2 = 7.89

Second line (horizontal, fitted value 2.5 everywhere):
Sum of squared differences = (2 - 2.5)^2 + (4 - 2.5)^2 + (1.5 - 2.5)^2 + (3.2 - 2.5)^2 = 3.99

The smaller the sum of squared differences, the better the fit of the line to the data.
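The comparison can be checked with a short sketch. It assumes the first line is ŷ = x, which is inferred from the fitted values 1, 2, 3, 4 that appear in the squared terms; the helper name `sse` is introduced here for illustration. Evaluating the four terms for the first line gives 1 + 4 + 2.25 + 0.64 = 7.89.

```python
def sse(points, predict):
    """Sum of squared differences between observed y and the line's fitted value."""
    return sum((y - predict(x)) ** 2 for x, y in points)


points = [(1, 2), (2, 4), (3, 1.5), (4, 3.2)]

# First line: assumed to be y-hat = x (fitted values 1, 2, 3, 4).
sse_line_1 = sse(points, lambda x: x)

# Second line: horizontal at y-hat = 2.5.
sse_line_2 = sse(points, lambda x: 2.5)

print("Line 1 SSE:", round(sse_line_1, 2))
print("Line 2 SSE:", round(sse_line_2, 2))
```

By this criterion the horizontal line fits these four points better than the first line, since its sum of squared differences is smaller.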

The Estimated Coefficients

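To calculate the estimates of the slope and intercept of the least squares line, use the formulas:

\[ b_1 = r\,\frac{s_y}{s_x}, \qquad b_0 = \bar{y} - b_1 \bar{x} \]

where r is the correlation coefficient and

\[ s_y = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1}}, \qquad s_x = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} . \]

The least squares prediction equation that estimates the mean value of y for a particular value of x is:

\[ \hat{y} = b_0 + b_1 x = (\bar{y} - b_1 \bar{x}) + b_1 x = \bar{y} + b_1 (x - \bar{x}) \]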
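As a short worked step, the slope formula can be tied back to the least squares criterion. The slides name r but do not spell it out, so the derivation below assumes the usual sample correlation coefficient, \( r = \sum (x_i - \bar{x})(y_i - \bar{y}) / ((n-1)\, s_x s_y) \). Under that assumption:

\[ b_1 = r\,\frac{s_y}{s_x} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{(n-1)\, s_x s_y} \cdot \frac{s_y}{s_x} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{(n-1)\, s_x^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \]

The last expression is the familiar closed form for the slope that minimizes SSE, so the two ways of writing \( b_1 \) agree.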

Simple Linear Regression

• Example: Consumers Union recently evaluated 26 brands of frozen pizza based on taste (y).

We will examine the taste scores (y) and the corresponding fat content (x).

Brand                                Fat   Score
Freschetta 4 Cheese                   15      75
Freschetta stuffed crust              11      56
DiGiorno                              12      71
Amy's organic                         14      81
Safeway                                9      41
Tony's                                12      67
Kroger                                 9      55
Tombstone stuffed crust               18      75
Red Baron                             20      73
Boboli                                12      67
Tombstone extra cheese                14      60
Jack's                                13      51
Celeste                               17      59
McCain Ellio's                         9      46
Totino's                              14      68
Freschetta pepperoni                  18      80
DiGiorno pepperoni                    16      78
Tombstone stuffed crust pepperoni     22      80
Tombstone pepperoni                   20      73
Red Baron pepperoni                   23      64
Tony's pepperoni                      26      86
Red Baron deep dish pepperoni         25      77
Stouffer's pepperoni                  14      54
Weight Watchers pepperoni              6      43
Jeno's pepperoni                      20      75
Totino's pepperoni                    20      65

The Simple Linear Regression Line (example, cont.)

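• Solution
– Solving by hand: calculate a number of statistics

\[ \bar{x} = 15.73, \quad s_x = 5.23; \qquad \bar{y} = 66.15, \quad s_y = 12.47; \qquad r = 0.724 \]

where n = 26.

\[ b_1 = r\,\frac{s_y}{s_x} = 0.724 \cdot \frac{12.47}{5.23} = 1.726 \]

\[ b_0 = \bar{y} - b_1 \bar{x} = 66.15 - (1.726)(15.73) = 39.002 \]

\[ \hat{y} = b_0 + b_1 x = 39.002 + 1.726x \]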
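The hand calculation can be reproduced from the pizza data with a short sketch. The lists are transcribed from the example's table, the variable names are introduced here for illustration, and r is computed with the usual sample correlation formula, which the slides do not write out.

```python
from statistics import mean, stdev

# Fat content (x) and taste score (y) for the 26 brands, in table order.
fat = [15, 11, 12, 14, 9, 12, 9, 18, 20, 12, 14, 13, 17,
       9, 14, 18, 16, 22, 20, 23, 26, 25, 14, 6, 20, 20]
score = [75, 56, 71, 81, 41, 67, 55, 75, 73, 67, 60, 51, 59,
         46, 68, 80, 78, 80, 73, 64, 86, 77, 54, 43, 75, 65]

n = len(fat)
x_bar, y_bar = mean(fat), mean(score)
s_x, s_y = stdev(fat), stdev(score)  # sample standard deviations (divide by n - 1)

# Sample correlation coefficient (standard definition, assumed here).
r = sum((x - x_bar) * (y - y_bar) for x, y in zip(fat, score)) / ((n - 1) * s_x * s_y)

b1 = r * s_y / s_x        # slope
b0 = y_bar - b1 * x_bar   # intercept

print(f"n = {n}, x_bar = {x_bar:.2f}, y_bar = {y_bar:.2f}")
print(f"s_x = {s_x:.2f}, s_y = {s_y:.2f}, r = {r:.3f}")
print(f"y_hat = {b0:.3f} + {b1:.3f} x")
```

Working from unrounded statistics matters slightly here: plugging the rounded values 66.15, 1.726 and 15.73 into the intercept formula gives about 39.000, while the unrounded calculation gives the 39.002 reported on the slide.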

The Simple Linear Regression Line (example, cont.)

• Solution – continued
– Using the computer

1. Scatterplot
2. Trend function
3. Data tab > Data Analysis > Regression

The Simple Linear Regression Line (example, cont.)

Regression Statistics
Multiple R           0.723546339
R Square             0.523519305
Adjusted R Square    0.503665943
Standard Error       8.785081398
Observations         26

ANOVA
             df   SS            MS         F         Significance F
Regression    1   2035.120891   2035.121   26.3693   2.95293E-05
Residual     24   1852.263724   77.17766
Total        25   3887.384615

            Coefficients   Standard Error   t Stat     P-value    Lower 95%     Upper 95%
Intercept   39.00208322    5.561098219      7.013378   2.99E-07   27.5245406    50.47962583
Fat         1.72602894     0.336123407      5.135105   2.95E-05   1.032304324   2.419753555

\[ \hat{y} = 39.002 + 1.726x \]
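The entries in the regression output are linked by a few standard identities, sketched below using only the sums of squares, coefficient and standard error printed in the output. The variable names are introduced here for illustration; the identities are the usual ones for simple linear regression with one explanatory variable.

```python
from math import sqrt

n = 26                          # Observations
ss_regression = 2035.120891     # ANOVA: Regression SS
ss_residual = 1852.263724       # ANOVA: Residual SS (this is SSE)
ss_total = ss_regression + ss_residual

r_square = ss_regression / ss_total
mse = ss_residual / (n - 2)     # Residual MS, with n - 2 degrees of freedom
standard_error = sqrt(mse)      # "Standard Error" in Regression Statistics
f_stat = (ss_regression / 1) / mse
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - 2)
t_slope = 1.72602894 / 0.336123407   # slope coefficient / its standard error

print(f"R Square          = {r_square:.6f}")
print(f"MSE               = {mse:.5f}")
print(f"Standard Error    = {standard_error:.6f}")
print(f"F                 = {f_stat:.4f}")
print(f"Adjusted R Square = {adjusted_r_square:.6f}")
print(f"t Stat (Fat)      = {t_slope:.4f}")
```

With a single explanatory variable, the square of the slope's t statistic equals F, which is why the two P-values in the output coincide.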

The Simple Linear Regression Line (example, cont.)

[Figure: Pizza Score vs Fat Content, scatterplot with fitted trend line; horizontal axis from 5 to 30, vertical axis from 40 to 90]

f(x) = 1.72602893981195 x + 39.0020832160351
R² = 0.523519304840857

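Regression estimator of a population mean \( \mu_y \)

\[ \hat{\mu}_{yL} = \bar{y} + b_1 (\mu_x - \bar{x}), \qquad \text{where } b_1 = r\,\frac{s_y}{s_x} \]

Estimated variance of \( \hat{\mu}_{yL} \):

\[ \hat{V}(\hat{\mu}_{yL}) = \left(1 - \frac{n}{N}\right) \frac{1}{n} \cdot \frac{SSE}{n-2} = \left(1 - \frac{n}{N}\right) \frac{MSE}{n} \]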
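A minimal sketch of the estimator and its estimated variance follows, assuming the form ȳ + b1(μx − x̄) for the estimator and (1 − n/N)·MSE/n for the variance, with MSE = SSE/(n − 2). The function name `regression_estimate`, the population size N = 100 and the known population mean μx = 3 are hypothetical values chosen for illustration; the sample reuses the four points from the earlier line-comparison slide.

```python
from statistics import mean


def regression_estimate(x, y, mu_x, N):
    """Return (mu_hat_yL, estimated variance) for a sample (x, y).

    mu_x is the known population mean of x and N the population size.
    """
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # equals r * s_y / s_x
    mu_hat = y_bar + b1 * (mu_x - x_bar)
    # SSE of the fitted least squares line y_bar + b1 * (x - x_bar)
    sse = sum((yi - (y_bar + b1 * (xi - x_bar))) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)
    var_hat = (1 - n / N) * mse / n     # finite population correction applied
    return mu_hat, var_hat


# Hypothetical inputs: sample of 4 from a population of N = 100 with mu_x = 3.
x = [1, 2, 3, 4]
y = [2, 4, 1.5, 3.2]
mu_hat, var_hat = regression_estimate(x, y, mu_x=3, N=100)
print("Estimate of mu_y:", round(mu_hat, 4))
print("Estimated variance:", round(var_hat, 5))
```

When the sample mean of x happens to equal μx, the adjustment term vanishes and the estimator reduces to the sample mean ȳ.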