Chapter 6 (cont.): Regression Estimation
Simple Linear Regression: Review of the Least Squares Procedure
Introduction

We will examine the relationship between quantitative variables x and y via a mathematical equation.

x: explanatory variable
y: response variable

Data: (x₁, y₁), (x₂, y₂), …, (xₙ, yₙ)
The Model

Most lots sell for $25,000, and building a house costs about $75 per square foot, so:

House cost = 25000 + 75(Size)

[Plot: House cost vs. house size]

The model has a deterministic and a probabilistic component.
The Model

However, house costs vary even among same-size houses! Since cost behaves unpredictably, we add a random component:

House cost = 25000 + 75(Size) + e

[Plot: House cost vs. house size, with costs scattered around the line]
The Model

The first-order linear model:

y = β₀ + β₁x + ε

where
y = response variable
x = explanatory variable
β₀ = y-intercept
β₁ = slope of the line (rise/run)
ε = error variable

β₀ and β₁ are unknown population parameters and are therefore estimated from the data.
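As a minimal sketch of the deterministic-plus-random structure, we can simulate data from the house-cost model above (β₀ = 25000, β₁ = 75 from the earlier slide); the size range and error standard deviation here are illustrative assumptions, not values from the slides:

```python
import numpy as np

# Simulate the first-order linear model y = b0 + b1*x + e
# with b0 = 25000 and b1 = 75 (the house-cost example).
rng = np.random.default_rng(1)
size = rng.uniform(1000, 3000, size=200)  # house sizes in sq ft (assumed range)
e = rng.normal(0, 5000, size=200)         # the random (probabilistic) component
cost = 25000 + 75 * size + e              # deterministic part + random part

# With enough data, least squares recovers the parameters:
b1_hat, b0_hat = np.polyfit(size, cost, 1)
```

The fitted slope and intercept land close to 75 and 25000, which is the sense in which the unknown parameters are "estimated from the data."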
Estimating the Coefficients

The estimates are determined by:
– drawing a sample from the population of interest,
– calculating sample statistics,
– producing a straight line that cuts into the data.
[Scatterplot of sample data points]

Question: What should be considered a good line?
The Least Squares (Regression) Line

A good line is one that minimizes the sum of squared differences (errors) between the scatterplot points and the line.

Given the data (x₁, y₁), (x₂, y₂), …, (xₙ, yₙ), determine b₀ and b₁ to minimize

SSE = Σᵢ (yᵢ − ŷᵢ)²,  where ŷᵢ = b₀ + b₁xᵢ.
The Least Squares (Regression) Line

Let us compare two lines on the points (1, 2), (2, 4), (3, 1.5), (4, 3.2). The first line gives fitted values 1, 2, 3, 4 (the line ŷ = x); the second line is horizontal at ŷ = 2.5.

Sum of squared differences = (2 − 1)² + (4 − 2)² + (1.5 − 3)² + (3.2 − 4)² = 7.89
Sum of squared differences = (2 − 2.5)² + (4 − 2.5)² + (1.5 − 2.5)² + (3.2 − 2.5)² = 3.99

The smaller the sum of squared differences, the better the fit of the line to the data.
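The two sums of squared differences can be checked directly; a small sketch:

```python
# The four data points from the comparison above.
points = [(1, 2), (2, 4), (3, 1.5), (4, 3.2)]

def sse(predict, data):
    """Sum of squared differences between each observed y and the line's value."""
    return sum((y - predict(x)) ** 2 for x, y in data)

sse_first = sse(lambda x: x, points)         # the line y = x (fitted values 1, 2, 3, 4)
sse_horizontal = sse(lambda x: 2.5, points)  # the horizontal line y = 2.5
print(round(sse_first, 2), round(sse_horizontal, 2))  # → 7.89 3.99
```

The horizontal line has the smaller SSE here, so by the least squares criterion it fits these four points better.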
The Estimated Coefficients

To calculate the estimates of the slope and intercept of the least squares line, use the formulas:

b₁ = r (s_y / s_x)
b₀ = ȳ − b₁x̄

where r is the correlation coefficient and

s_x = √( Σᵢ (xᵢ − x̄)² / (n − 1) ),  s_y = √( Σᵢ (yᵢ − ȳ)² / (n − 1) ).

The least squares prediction equation that estimates the mean value of y for a particular value of x is:

ŷ = b₀ + b₁x = (ȳ − b₁x̄) + b₁x = ȳ + b₁(x − x̄)
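As a sketch, the two formulas can be applied directly to summary statistics. The numbers below are the pizza-example statistics from the following slides; note that these rounded inputs give a slightly different intercept than a full-precision calculation would:

```python
# b1 = r * (s_y / s_x) and b0 = ybar - b1 * xbar, from summary statistics.
r = 0.724            # correlation coefficient (rounded)
s_x, s_y = 5.23, 12.47
xbar, ybar = 15.73, 66.15

b1 = r * s_y / s_x
b0 = ybar - b1 * xbar
y_hat_20 = b0 + b1 * 20.0   # predicted mean response at x = 20
```

Because the regression line passes through (x̄, ȳ), a prediction at x above x̄ lies above ȳ whenever b₁ is positive.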
Simple Linear Regression

Example: Consumer’s Union recently evaluated 26 brands of frozen pizza based on taste (y). We will examine the taste scores (y) and the corresponding fat content (x).

| Brand | Fat | Score |
| --- | --- | --- |
| Freshetta 4 Cheese | 15 | 75 |
| Freschetta stuffed crust | 11 | 56 |
| DiGiorno | 12 | 71 |
| Amy's organic | 14 | 81 |
| Safeway | 9 | 41 |
| Tony's | 12 | 67 |
| Kroger | 9 | 55 |
| Tombstone stuffed crust | 18 | 75 |
| Red Baron | 20 | 73 |
| Bobli | 12 | 67 |
| Tombstone extra cheese | 14 | 60 |
| Jack's | 13 | 51 |
| Celeste | 17 | 59 |
| McCain Ellio's | 9 | 46 |
| Totino's | 14 | 68 |
| Freschetta pepperoni | 18 | 80 |
| DiGiorno pepperoni | 16 | 78 |
| Tombstone stuffed crust pepperoni | 22 | 80 |
| Tombstone pepperoni | 20 | 73 |
| Red Baron pepperoni | 23 | 64 |
| Tony's pepperoni | 26 | 86 |
| Red Baron deep dish pepperoni | 25 | 77 |
| Stouffer's pepperoni | 14 | 54 |
| Weight Watchers pepperoni | 6 | 43 |
| Jeno's pepperoni | 20 | 75 |
| Totino's pepperoni | 20 | 65 |
The Simple Linear Regression Line (example, cont.)

Solution (solving by hand): calculate a number of statistics, where n = 26:

x̄ = 15.73,  ȳ = 66.15
s_x = 5.23,  s_y = 12.47
r = 0.724

b₁ = r (s_y / s_x) = 0.724 × (12.47 / 5.23) = 1.726
b₀ = ȳ − b₁x̄ = 66.15 − (1.726)(15.73) = 39.002 (carrying unrounded values)

ŷ = b₀ + b₁x = 39.002 + 1.726x
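The hand calculation can be checked against the full data table; this sketch runs NumPy's least squares fit on the 26 (fat, score) pairs listed above:

```python
import numpy as np

# Fat content (x) and taste score (y) for the 26 frozen-pizza brands,
# in the order they appear in the table.
x = np.array([15, 11, 12, 14, 9, 12, 9, 18, 20, 12, 14, 13, 17,
              9, 14, 18, 16, 22, 20, 23, 26, 25, 14, 6, 20, 20], float)
y = np.array([75, 56, 71, 81, 41, 67, 55, 75, 73, 67, 60, 51, 59,
              46, 68, 80, 78, 80, 73, 64, 86, 77, 54, 43, 75, 65], float)

b1, b0 = np.polyfit(x, y, 1)        # least squares slope and intercept
print(round(b0, 3), round(b1, 3))   # → 39.002 1.726
```

The fitted intercept and slope match the hand calculation (and the computer output on the next slides) to three decimals.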
The Simple Linear Regression Line (example, cont.)

Solution, continued: using the computer (Excel):
1. Scatterplot
2. Trend function
3. Data tab > Data Analysis > Regression
The Simple Linear Regression Line (example, cont.)

| Regression Statistics | |
| --- | --- |
| Multiple R | 0.723546339 |
| R Square | 0.523519305 |
| Adjusted R Square | 0.503665943 |
| Standard Error | 8.785081398 |
| Observations | 26 |

| ANOVA | df | SS | MS | F | Significance F |
| --- | --- | --- | --- | --- | --- |
| Regression | 1 | 2035.120891 | 2035.121 | 26.3693 | 2.95293E-05 |
| Residual | 24 | 1852.263724 | 77.17766 | | |
| Total | 25 | 3887.384615 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
| --- | --- | --- | --- | --- | --- | --- |
| Intercept | 39.00208322 | 5.561098219 | 7.013378 | 2.99E-07 | 27.5245406 | 50.47962583 |
| Fat | 1.72602894 | 0.336123407 | 5.135105 | 2.95E-05 | 1.032304324 | 2.419753555 |

ŷ = 39.002 + 1.726x
The Simple Linear Regression Line (example, cont.)

[Scatterplot "Pizza Score vs Fat Content": score (40–90) against fat content (5–30), with fitted line f(x) = 1.726x + 39.002 and R² = 0.5235]
Regression Estimator of a Population Mean μ_y

When the population mean μ_x of the auxiliary variable x is known, the regression estimator of μ_y is

μ̂_yL = ȳ + b₁(μ_x − x̄),  where b₁ = r (s_y / s_x).

Estimated variance of μ̂_yL:

V̂(μ̂_yL) = (1 − n/N) (1/n) (SSE / (n − 2)) = (1 − n/N) (MSE / n)

where N is the population size and MSE = SSE / (n − 2).
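A sketch of the estimator under assumed survey quantities: the population size N and the known population mean μ_x below are hypothetical, and the ten sample pairs are illustrative, not from the slides:

```python
import numpy as np

# Hypothetical population quantities (assumptions for illustration only).
N = 500        # population size
mu_x = 16.0    # known population mean of the auxiliary variable x

# Illustrative sample of n = 10 (x, y) pairs.
x = np.array([15.0, 11, 12, 14, 9, 18, 20, 12, 14, 13])
y = np.array([75.0, 56, 71, 81, 41, 75, 73, 67, 60, 51])
n = len(x)

b1, b0 = np.polyfit(x, y, 1)                   # least squares fit on the sample
mu_hat_yL = y.mean() + b1 * (mu_x - x.mean())  # regression estimator of mu_y

sse = ((y - (b0 + b1 * x)) ** 2).sum()         # sum of squared residuals
mse = sse / (n - 2)
v_hat = (1 - n / N) * mse / n                  # estimated variance of mu_hat_yL
```

Since μ_x exceeds x̄ here and the fitted slope is positive, the regression estimator adjusts ȳ upward; the finite population correction (1 − n/N) shrinks the variance relative to simple expansion.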