statistics for business and economics
DESCRIPTION
Statistics for Business and Economics. Chapter 10 Simple Linear Regression. Learning Objectives. Describe the Linear Regression Model State the Regression Modeling Steps Explain Least Squares Compute Regression Coefficients Explain Correlation Predict Response Variable. - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/1.jpg)
Statistics for Business and Economics
Chapter 10
Simple Linear Regression
![Page 2: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/2.jpg)
Learning Objectives
1. Describe the Linear Regression Model
2. State the Regression Modeling Steps
3. Explain Least Squares
4. Compute Regression Coefficients
5. Explain Correlation
6. Predict Response Variable
![Page 3: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/3.jpg)
What is Regression?
1.1. Method of modeling Method of modeling relationships relationships between a between a responseresponse variable Y and one or more variable Y and one or more predictorspredictors X X
2.2. A way of “fitting line A way of “fitting line through data”through data”
![Page 4: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/4.jpg)
Example 1: Store Site Selection
Model sales Y at existing sites as a Model sales Y at existing sites as a function of demographic variables:function of demographic variables:
XX11 == Population in store Population in store vicinityvicinity
XX22 == Income in areaIncome in areaXX33 == Age of houses in areaAge of houses in areaXX44 ==XX55 ==From equation, predict sales at new sitesFrom equation, predict sales at new sites
![Page 5: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/5.jpg)
Example 2:Marketing Research
Model consumer response to a Model consumer response to a product on basis of product product on basis of product characteristics:characteristics:
YY == Taste score on soft Taste score on soft drinkdrink
XX11 == Sugar levelSugar levelXX22 == Carbonation levelCarbonation levelXX33 ==XX44 ==
![Page 6: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/6.jpg)
Example 3:Operations/Quality
Model product quality in plant as a Model product quality in plant as a function of manufacturing process function of manufacturing process characteristics:characteristics:
YY == Quality score of sheet Quality score of sheet metalmetal
XX11 == Raw material purity Raw material purity scorescore
XX22 == Molten aluminum Molten aluminum temptemp
XX33 == Line speedLine speedXX44 ==
![Page 7: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/7.jpg)
Example 4:Real Estate Pricing
YY == Selling price of housesSelling price of houses
XX11 == Square feetSquare feetXX22 == TaxesTaxesXX33 == Lot acreageLot acreageXX44 ==XX55 ==XX66 ==
![Page 8: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/8.jpg)
One-Predictor Regression
Price
050000
100000150000
200000250000
300000350000
400000
0 2000 4000 6000
Square Feet (X)
Pri
ce
(Y
in
US
$)
25 Homes Sold in Essex County, New Jersey, 199625 Homes Sold in Essex County, New Jersey, 1996
![Page 9: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/9.jpg)
One-Predictor Regression
Price
050000
100000150000
200000250000
300000350000
400000
0 2000 4000 6000
Square Feet (X)
Pri
ce
(Y
in
US
$)
Regression Equation: Y= -39001 + 85.0 XRegression Equation: Y= -39001 + 85.0 X
![Page 10: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/10.jpg)
Regression Models
• Answers ‘What is the relationship between the variables?’
• Equation used– One numerical dependent (response) variable
What is to be predicted– One or more numerical or categorical
independent (explanatory) variables
• Used mainly for prediction and estimation
![Page 11: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/11.jpg)
Prediction Using Regression
Price
050000
100000150000
200000250000
300000350000
400000
0 2000 4000 6000
Square Feet (X)
Pri
ce
(Y
in
US
$)
Y= -39001 + 85.0(4000)= $300,999Y= -39001 + 85.0(4000)= $300,999
![Page 12: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/12.jpg)
Regression Modeling Steps
1. Hypothesize deterministic component
2. Estimate unknown model parameters
3. Specify probability distribution of random error term
• Estimate standard deviation of error
4. Evaluate model
5. Use model for prediction and estimation
![Page 13: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/13.jpg)
Specifying the Model
1. Define variables• Conceptual (e.g., Advertising, price)• Empirical (e.g., List price, regular price) • Measurement (e.g., $, Units)
2. Hypothesize nature of relationship• Expected effects (i.e., Coefficients’ signs)• Functional form (linear or non-linear)• Interactions
![Page 14: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/14.jpg)
Model Specification Is Based on Theory
• Theory of field (e.g., Sociology)
• Mathematical theory
• Previous research
• ‘Common sense’
![Page 15: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/15.jpg)
Advertising
Sales
Advertising
Sales
Advertising
Sales
Advertising
Sales
Thinking Challenge: Which Is More Logical?
![Page 16: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/16.jpg)
Simple
1 ExplanatoryVariable
Types of Regression Models
RegressionModels
2+ ExplanatoryVariables
Multiple
LinearNon-
LinearLinear
Non-Linear
![Page 17: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/17.jpg)
y x 0 1
Linear Regression Model
Relationship between variables is a linear function
Dependent (Response) Variable
Independent (Explanatory) Variable
Population Slope
Population y-intercept
Random Error
![Page 18: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/18.jpg)
y
E(y) = β0 + β1x (line of means)
β0 = y-intercept
x
Change in y
Change in x
β1 = Slope
Line of Means
High school teacher
![Page 19: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/19.jpg)
YY XXii ii ii 00 11
Linear Regression Model
• 1. Relationship between variables is a linear function
Dependent Dependent (response) (response) variablevariable
Independent Independent (explanatory) (explanatory) variablevariable
Population Population slopeslope
Population Population Y-interceptY-intercept
Random Random errorerror
![Page 20: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/20.jpg)
Y
X
Y
X
Population Linear Regression Model
Y Xi i i 0 1Y Xi i i 0 1ObservedObservedvaluevalue
Observed valueObserved value
ii = Random error= Random error
E Y X iaf 0 1E Y X iaf 0 1
![Page 21: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/21.jpg)
Observed valueObserved value
Sample LinearRegression Model
YY
XXTrue Unknown LineTrue Unknown Line
Y Xi i 0 1 Y Xi i 0 1
![Page 22: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/22.jpg)
ii = = Random Random
errorerror
Observed valueObserved value
^
Sample LinearRegression Model
YY
XX
Y Xi i i 0 1Y Xi i i 0 1
![Page 23: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/23.jpg)
Regression Modeling Steps
1. Hypothesize deterministic component
2. Estimate unknown model parameters
3. Specify probability distribution of random error term
• Estimate standard deviation of error
4. Evaluate model
5. Use model for prediction and estimation
![Page 24: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/24.jpg)
Scattergram
1. Plot of all (xi, yi) pairs
2. Suggests how well model will fit
0204060
0 20 40 60
x
y
![Page 25: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/25.jpg)
0204060
0 20 40 60
x
y
Thinking Challenge
• How would you draw a line through the points?• How do you determine which line ‘fits best’?
![Page 26: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/26.jpg)
Least Squares
• ‘Best fit’ means difference between actual y values and predicted y values are a minimum
– But positive differences off-set negative
2 2
1 1
ˆ ˆn n
ii ii i
y y
• Least Squares minimizes the Sum of the Squared Differences (SSE)
![Page 27: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/27.jpg)
Least Squares Graphically
2
y
x
1
3
4
^^
^^
2 0 1 2 2ˆ ˆ ˆy x
0 1ˆ ˆˆi iy x
2 2 2 2 21 2 3 4
1
ˆ ˆ ˆ ˆ ˆLS minimizes n
ii
![Page 28: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/28.jpg)
Coefficient Equations
1 1
11 2
12
1
ˆ
n n
i ini i
i ixy i
nxx
ini
ii
x y
x ySS nSS
x
xn
Slope
y-intercept
0 1ˆ ˆy x Prediction Equation
0 1ˆ ˆy x
![Page 29: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/29.jpg)
Computation Table
xi2
x12
x2
2
: :
xn2
Σxi2
yi
y1
y2
:
yn
Σyi
xi
x1
x2
:
xn
Σxi
2
2
2
2
2
yi
y1
y2
:
yn
Σyi
xiyi
Σxiyi
x1y1
x2y2
xnyn
![Page 30: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/30.jpg)
Interpretation of Coefficients
^
^
^1. Slope (1)
• Estimated y changes by 1 for each 1unit increase
in x1. If 1 = 2, then Sales (y) is expected to increase by 2
for each 1 unit increase in Advertising (x)
2. Y-Intercept (0)
• Average value of y when x = 02. If 0 = 4, then Average Sales (y) is expected to be
4 when Advertising (x) is 0
^
^
![Page 31: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/31.jpg)
Least Squares Example
You’re a marketing analyst for Hasbro Toys. You gather the following data:
Ad $ Sales (Units)1 12 13 24 25 4
Find the least squares line relatingsales and advertising.
![Page 32: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/32.jpg)
0
1
2
3
4
0 1 2 3 4 5
Scattergram Sales vs. Advertising
Sales
Advertising
![Page 33: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/33.jpg)
Parameter Estimation Solution Table
2 2
1 1 1 1 1
2 1 4 1 2
3 2 9 4 6
4 2 16 4 8
5 4 25 16 20
15 10 55 26 37
xi yi xi yi xiyi
![Page 34: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/34.jpg)
Parameter Estimation Solution
1 1
11 2 2
12
1
15 1037
5ˆ .7015
555
n n
i ini i
i ii
n
ini
ii
x y
x yn
x
xn
ˆ .1 .7y x
0 1ˆ ˆ 2 .70 3 .10y x
![Page 35: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/35.jpg)
Parameter Estimates
Parameter Standard T for H0:
Variable DF Estimate Error Param=0 Prob>|T|
INTERCEP 1 -0.1000 0.6350 -0.157 0.8849
ADVERT 1 0.7000 0.1914 3.656 0.0354
Parameter Estimation Computer Output
0^
1^
ˆ .1 .7y x
![Page 36: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/36.jpg)
JMP “Fit Model” Results
![Page 37: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/37.jpg)
Coefficient Interpretation Solution
1. Slope (1)
• Sales Volume (y) is expected to increase by .7 units for each $1 increase in Advertising (x)
^
2. Y-Intercept (0)
• Average value of Sales Volume (y) is -.10 units when Advertising (x) is 0— Difficult to explain to marketing manager
— Expect some sales without advertising
^
![Page 38: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/38.jpg)
Regression Modeling Steps
1. Hypothesize deterministic component
2. Estimate unknown model parameters
3. Specify probability distribution of random error term
• Estimate standard deviation of error
4. Evaluate model
5. Use model for prediction and estimation
![Page 39: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/39.jpg)
Linear Regression Assumptions
1. Mean of probability distribution of error, ε, is 0
2. Probability distribution of error has constant variance
3. Probability distribution of error, ε, is normal
4. Errors are independent
![Page 40: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/40.jpg)
11XX22
XX
Error Probability Distribution
^
YY
f(f())
XX
![Page 41: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/41.jpg)
11XX22
XX
Error Probability Distribution
^
YY
f(f())
XX
![Page 42: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/42.jpg)
11XX22
XX
Error Probability Distribution
^
YY
f(f())
XX
![Page 43: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/43.jpg)
Estimating
Recall that:Recall that:
where is our estimator of the mean. where is our estimator of the mean.
Now substitute (1) Now substitute (1) YYii for X for Xii , (2) , , (2) , and and nn-2 df for -2 df for nn-1:-1:
€
s2 =(X i − X )2
n −1i =1
n
∑
€
X
€
ˆ Y for X
€
s2 =(Yi − ˆ Y )2
n − 2i=1
n
∑
![Page 44: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/44.jpg)
Why are df = n - 2?
1.1. Two statistics are estimated Two statistics are estimated to compute the regression line, to compute the regression line, 00 and and 11
2.2. As a result, if I know any n-2 As a result, if I know any n-2 residuals, the other two can be residuals, the other two can be computed from computed from ii = 0 and = 0 and iiXXii = 0.= 0.
^ ^
^ ^
![Page 45: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/45.jpg)
Regression Modeling Steps
1. Hypothesize deterministic component
2. Estimate unknown model parameters
3. Specify probability distribution of random error term
• Estimate standard deviation of error
4. Evaluate model
5. Use model for prediction and estimation
![Page 46: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/46.jpg)
Test of Slope Coefficient
• Shows if there is a linear relationship between x and y
• Involves population slope 1
• Hypotheses – H0: 1 = 0 (No Linear Relationship)
– Ha: 1 0 (Linear Relationship)
• Theoretical basis is sampling distribution of slope
![Page 47: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/47.jpg)
y
Population Linex
Sample 1 Line
Sample 2 Line
Sampling Distribution of Sample Slopes
1
Sampling Distribution
11
11SS
^
^
All Possible Sample Slopes
Sample 1: 2.5
Sample 2: 1.6
Sample 3: 1.8
Sample 4: 2.1 : :
Very large number of sample slopes
![Page 48: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/48.jpg)
Slope Coefficient Test Statistic
1
1 1
ˆ
2
12
1
ˆ ˆ2
wherexx
n
ini
xx ii
t df nsS
SS
x
SS xn
![Page 49: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/49.jpg)
Test of Slope Coefficient Example
You’re a marketing analyst for Hasbro Toys.
You find β0 = –.1, β1 = .7 and s = .6055.Ad $ Sales (Units)
1 12 13 24 25 4
Is the relationship significant at the .05 level of significance?
^^
![Page 50: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/50.jpg)
Test StatisticSolution
1
1
ˆ 2
1
ˆ
.6055.1914
1555
5
ˆ .703.657
.1914
xx
SS
SS
tS
![Page 51: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/51.jpg)
JMPFit Model Results
t = 1 / S
P-Value
S
1
1
1
^^^
^
![Page 52: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/52.jpg)
Computing a CI for a Slope
A (1 - )100% confidence interval for the true slope parameter 1 is given by:
This assumes that all of the standard regression assumptions are met!
Example from previous slide: 6123.7000.0
1914.1824.37000.0
1914.)3;025(.7000.0
t
121 )2;(ˆ
snt
![Page 53: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/53.jpg)
Proportion of variation ‘explained’ by relationship between x and y
Coefficient of Determination
2 Explained Variation
Total Variationyy
yy
SS SSEr
SS
0 r2 1
r2 = (coefficient of correlation)2
![Page 54: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/54.jpg)
Y
X
Y
X
Y
X
Coefficient of Determination Examples
Y
X
r2 = 1 r2 = 1
r2 = .8 r2 = 0
![Page 55: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/55.jpg)
Coefficient of Determination Example
You’re a marketing analyst for Hasbro Toys.
You know r = .904. Ad $ Sales (Units)
1 12 13 24 25 4
Calculate and interpret thecoefficient of determination.
![Page 56: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/56.jpg)
Coefficient of Determination Solution
r2 = (coefficient of correlation)2
r2 = (.904)2
r2 = .817
Interpretation: About 81.7% of the sample variation in Sales (y) can be explained by using Ad $ (x) to predict Sales (y) in the linear model.
![Page 57: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/57.jpg)
r2 Computer Output
Root MSE 0.60553 R-square 0.8167
Dep Mean 2.00000 Adj R-sq 0.7556
C.V. 30.27650
r2 adjusted for number of explanatory variables & sample size
r2
![Page 58: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/58.jpg)
JMPFit Model Results
r2
r2 adjusted for number of explanatory variables & sample size
![Page 59: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/59.jpg)
Correlation Models
1. Answer ‘How strong is the linear relationship between 2 variables?’
2. Coefficient of correlation usedPopulation coefficient denoted (rho)
Values range from -1 to +1
Measures degree of association
3. Used mainly for understanding
![Page 60: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/60.jpg)
1.Pearson product moment coefficient of correlation, r
Sample Coefficient of Correlation
€
r = (sign of slope) × R2
=(X i − X )(Yi −Y )∑
(X i − X )2 (Yi −Y )2∑∑
![Page 61: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/61.jpg)
Coefficient of Correlation Values
-1.0-1.0 +1.0+1.000
Perfect Perfect Positive Positive
CorrelationCorrelation
-.5-.5 +.5+.5
Perfect Perfect Negative Negative
CorrelationCorrelationNo No
CorrelationCorrelation
![Page 62: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/62.jpg)
Coefficient of Correlation Examples
Y
X
Y
X
Y
X
Y
X
r = 1 r = -1
r = .89 r = 0
![Page 63: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/63.jpg)
Regression Modeling Steps
1. Hypothesize deterministic component
2. Estimate unknown model parameters
3. Specify probability distribution of random error term
• Estimate standard deviation of error
4. Evaluate model
5. Use model for prediction and estimation
![Page 64: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/64.jpg)
Prediction With Regression Models
• Types of predictions– Point estimates– Interval estimates
• What is predicted– Population mean response E(y) for given x
Point on population regression line
– Individual response (yi) for given x
![Page 65: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/65.jpg)
What Is Predicted
Mean y, E(y)
y
^yIndividual
Prediction, y
E(y) = x
^x
xP
^ ^y i =
x
![Page 66: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/66.jpg)
Confidence Interval Estimate for Mean Value of y at x = xp
€
ˆ y ± (tα / 2)(s)1
n+
X p − X ( )2
SSxx
df = n – 2
![Page 67: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/67.jpg)
Factors Affecting Interval Width
1. Level of confidence (1 – )• Width increases as confidence increases
2. Data dispersion (s)• Width increases as variation increases
3. Sample size• Width decreases as sample size increases
4. Distance of xp from meanx1. Width increases as distance increases
![Page 68: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/68.jpg)
Confidence Interval Estimate Example
You’re a marketing analyst for Hasbro Toys.
You find β0 = -.1, β 1 = .7 and s = .6055.
Ad $ Sales (Units)1 12 13 24 25 4
Find a 95% confidence interval forthe mean sales when advertising is $4.
^^
![Page 69: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/69.jpg)
Confidence Interval Estimate Solution
2
/ 2
2
1ˆ
ˆ .1 .7 4 2.7
4 312.7 3.182 .6055
5 10
1.645 ( ) 3.755
p
xx
x xy t s
n SS
y
E Y
x to be predicted
![Page 70: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/70.jpg)
Prediction Interval of Individual Value of y at x = xp
Note!
2
/ 2
1ˆ 1
p
xx
x xy t S
n SS
df = n – 2
![Page 71: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/71.jpg)
Why the Extra ‘S’?
Expected(Mean) y
yy we're trying topredict
Prediction, y
xxp
E(y) = x
^^
^y i =
x i
![Page 72: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/72.jpg)
Prediction Interval Solution
2
/ 2
2
4
1ˆ 1
ˆ .1 .7 4 2.7
4 312.7 3.182 .6055 1
5 10
.503 4.897
p
xx
x xy t s
n SS
y
y
x to be predicted
![Page 73: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/73.jpg)
Dep Var Pred Std Err Low95% Upp95% Low95% Upp95%
Obs SALES Value Predict Mean Mean Predict Predict
1 1.000 0.600 0.469 -0.892 2.092 -1.837 3.037
2 1.000 1.300 0.332 0.244 2.355 -0.897 3.497
3 2.000 2.000 0.271 1.138 2.861 -0.111 4.111
4 2.000 2.700 0.332 1.644 3.755 0.502 4.897
5 4.000 3.400 0.469 1.907 4.892 0.962 5.837
Interval Estimate Computer Output
Predicted y when x = 4
Confidence Interval
SYPredictionInterval
![Page 74: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/74.jpg)
Confidence Intervals v. Prediction Intervals
x
y
x
^^ ^
y i =
x i
![Page 75: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/75.jpg)
Confidence Intervals vs. Prediction Intervals in JMP
![Page 76: Statistics for Business and Economics](https://reader033.vdocuments.us/reader033/viewer/2022061607/56812f52550346895d94e1a6/html5/thumbnails/76.jpg)
Conclusion
1. Described the Linear Regression Model
2. Stated the Regression Modeling Steps
3. Explained Least Squares
4. Computed Regression Coefficients
5. Used the model for prediction and estimation
6. Evaluated model on the basis of significance, Rsquare, size of prediction intervals
7. Discussed Rsquare and correlation coefficient