Overview
4.2 Introduction to Correlation
4.3 Introduction to Regression
Scatterplots
Used to summarize the relationship between two quantitative variables that have been measured on the same element
Graph of points (x, y), each of which represents one observation from the data set
One of the variables is measured along the horizontal axis and is called the x variable
The other variable is measured along the vertical axis and is called the y variable
Predictor Variable and Response Variable
The value of the x variable can be used to predict or estimate the value of the y variable
The x variable is referred to as the predictor variable
The y variable is called the response variable
Scatterplot Terminology
Note the terminology in the caption to Figure 4.2.
When describing a scatterplot, always name the y variable first, followed by the term versus (vs.) or against, and then the x variable.
This terminology reinforces the notion that the y variable depends on the x variable.
FIGURE 4.2 Scatterplot of sales price versus square footage.
Positive relationship
As the x variable increases in value, the y variable also tends to increase.
FIGURE 4.3 (a) Scatterplot of a positive relationship
Negative relationship
As the x variable increases in value, the y variable tends to decrease.
FIGURE 4.3 (b) Scatterplot of a negative relationship
No apparent relationship
As the x variable increases in value, the y variable tends to remain unchanged.
FIGURE 4.3 (c) Scatterplot of no apparent relationship
4.2 Introduction to Correlation
Objective:
By the end of this section, I will be able to…
1) Calculate and interpret the value of the correlation coefficient.
Correlation Coefficient r
Measures the strength and direction of the linear relationship between two variables.
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{(n - 1)\, s_x s_y}$$
$s_x$ is the sample standard deviation of the x data values.
$s_y$ is the sample standard deviation of the y data values.
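The definitional formula for r can be sketched in Python as follows. This is a minimal illustration rather than the worked solution from the slides: the function name `correlation` and the sample values are hypothetical (they are not the Table 4.11 data), and `statistics.stdev` supplies the sample standard deviations.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation coefficient r from the definitional formula."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of the products of the deviations from the two means
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # stdev() divides by n - 1, so it is the sample standard deviation
    return cross / ((n - 1) * stdev(xs) * stdev(ys))

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = correlation(x, y)
print(round(r, 4))
```

Points that lie exactly on a rising straight line, such as `correlation([1, 2, 3], [2, 4, 6])`, give a value of 1 up to floating-point rounding.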
Example 4.5 - Calculating the correlation coefficient r
Find the value of the correlation coefficient r for the temperature data in Table 4.11.
Table 4.11 High and low temperatures, in degrees Fahrenheit, of 10 American cities
Interpreting the Correlation Coefficient r
1) Values of r close to 1 indicate a positive relationship between the two variables.
The variables are said to be positively correlated.
As x increases, y tends to increase as well.
2) Values of r close to -1 indicate a negative relationship between the two variables.
The variables are said to be negatively correlated.
As x increases, y tends to decrease.
3) Values of r close to 0 indicate the lack of either a positive or negative linear relationship between the two variables.
The variables are said to be uncorrelated.
As x increases, y tends to neither increase nor decrease linearly.
Guidelines for Interpreting the Correlation Coefficient r
If the correlation coefficient between two variables is
greater than 0.7, the variables are positively correlated.
between 0.33 and 0.7, the variables are mildly positively correlated.
between –0.33 and 0.33, the variables are not correlated.
between –0.7 and –0.33, the variables are mildly negatively correlated.
less than –0.7, the variables are negatively correlated.
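These guidelines can be expressed as a small lookup function. The sketch below is illustrative only: the name `interpret_r` is hypothetical, and because the guidelines do not say which category the boundary values (±0.33 and ±0.7) belong to, the choice made here is an assumption.

```python
def interpret_r(r):
    """Map a correlation coefficient to the guideline wording."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    # Boundary handling is an assumption: exactly 0.7 counts as "mildly
    # positively", exactly 0.33 and -0.33 as "not correlated", and
    # exactly -0.7 as "mildly negatively".
    if r > 0.7:
        return "positively correlated"
    if r > 0.33:
        return "mildly positively correlated"
    if r >= -0.33:
        return "not correlated"
    if r >= -0.7:
        return "mildly negatively correlated"
    return "negatively correlated"

print(interpret_r(0.9761))  # value of r reported in Example 4.6
```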
Example 4.6 - Interpreting the correlation coefficient
Interpret the correlation coefficient found in Example 4.5.
Example 4.6 continued
Solution
In Example 4.5, we found the correlation coefficient for the relationship between high and low temperature to be r = 0.9761.
r = 0.9761 is very close to 1. We would therefore say that high and low temperatures for these 10 American cities are strongly positively correlated.
As the low temperature increases, the high temperature also tends to increase.
Equivalent Computational Formula for Calculating the Correlation Coefficient r
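$$r = \frac{\sum xy - \left(\sum x\right)\left(\sum y\right)/n}{\sqrt{\sum x^2 - \left(\sum x\right)^2/n}\;\sqrt{\sum y^2 - \left(\sum y\right)^2/n}}$$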
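The computational formula needs only five running sums, which makes it convenient for hand or spreadsheet work. A minimal Python sketch follows; the function name `correlation_computational` and the data are hypothetical, not the Glen Ellyn values from Table 4.6.

```python
from math import sqrt

def correlation_computational(xs, ys):
    """Correlation coefficient r from the computational (sums) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sqrt(sum_x2 - sum_x ** 2 / n) * sqrt(sum_y2 - sum_y ** 2 / n)
    return numerator / denominator

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = correlation_computational(x, y)
print(round(r, 4))
```

For this data the numerator is 66 − (15)(20)/5 = 6 and the denominator is √10 · √6, so r = 6/√60, the same value the definitional formula produces.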
Example 4.7
Use the computational formula to calculate the correlation coefficient r for the relationship between square footage and sales price of the eight home lots for sale in Glen Ellyn from Table 4.6 (Example 4.3 in Section 4.1).
Summary
Section 4.2 introduces the correlation coefficient r, a measure of the strength of linear association between two numeric variables.
Values of r close to 1 indicate that the variables are positively correlated.
Values of r close to –1 indicate that the variables are negatively correlated.
Values of r close to 0 indicate that the variables are not correlated.
4.3 Introduction to Regression
Objectives:
By the end of this section, I will be able to…
1) Calculate the value and understand the meaning of the slope and the y intercept of the regression line.
2) Predict values of y for given values of x.
Equation of the Regression Line
Approximates the relationship between x and y
The equation is $\hat{y} = b_0 + b_1 x$, where the regression coefficients are the slope, b1, and the y intercept, b0.
The “hat” over the y (pronounced “y-hat”) indicates that this is an estimate of y and not necessarily an actual value of y.
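The transcript does not preserve the steps used to compute b0 and b1. The sketch below therefore assumes the standard least-squares formulas, b1 = r · s_y / s_x and b0 = ȳ − b1 · x̄, which may differ in presentation from the method on the original slides. The function name `regression_coefficients` and the data are hypothetical.

```python
from statistics import mean, stdev

def regression_coefficients(xs, ys):
    """Return (b0, b1) for the line y-hat = b0 + b1 * x.

    Assumes the standard least-squares formulas:
    b1 = r * s_y / s_x and b0 = y_bar - b1 * x_bar.
    """
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = cross / ((n - 1) * s_x * s_y)
    b1 = r * s_y / s_x
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical data, for illustration only
b0, b1 = regression_coefficients([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(b0, 4), round(b1, 4))
```

When the points lie exactly on a line, for example y = 1 + 2x, the function recovers that intercept and slope up to rounding.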
Example 4.8 - Calculating the regression coefficients b0 and b1
Find the value of the regression coefficients b0 and b1 for the temperature data in Table 4.11.
Table 4.11 High and low temperatures, in degrees Fahrenheit, of 10 American cities
Example 4.8 continued
Step 4:
Thus, the equation of the regression line for the temperature data is
$$\hat{y} = 10.0533 + 0.9865x$$
Example 4.8 continued
Since y and x represent high and low temperatures, respectively, this equation is read as follows:
“The estimated high temperature for an American city is 10.0533 degrees Fahrenheit plus 0.9865 times the low temperature for that city.”
Using the Regression Equation to Make Predictions
For any particular value of x, the predicted value for y lies on the regression line.
Example 4.11
Suppose we are considering moving to a city that has a low temperature of 47 degrees Fahrenheit (°F) on this particular winter’s day. What would the estimated high temperature be for this city?
Example 4.11 continued
Solution
Plug the value of 47°F for the variable low into the regression equation from Example 4.8:
$$\hat{y} = 10.0533 + 0.9865(\text{low}) = 10.0533 + 0.9865(47) = 56.4188$$
We would say: “The estimated high temperature for an American city with a low of 47°F is 56.4188°F.”
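The same substitution can be checked in a few lines of Python. The coefficients are the ones reported in Example 4.8; the names `B0`, `B1`, and `predict_high` are hypothetical and chosen only for this sketch.

```python
# Regression coefficients reported in Example 4.8
B0 = 10.0533  # y intercept
B1 = 0.9865   # slope

def predict_high(low):
    """Estimated high temperature (°F) for a given low temperature (°F)."""
    return B0 + B1 * low

print(round(predict_high(47), 4))  # 56.4188
```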
Interpreting the Slope
Relationship Between Slope and Correlation Coefficient
The slope b1 of the regression line and the correlation coefficient r always have the same sign.
b1 is positive if and only if r is positive.
b1 is negative if and only if r is negative.
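One way to see why the signs agree: assuming the standard least-squares slope (not shown in this transcript), b1 and r share the numerator Σ(x − x̄)(y − ȳ), and each is divided by a positive quantity. The sketch below checks this on hypothetical data; the function name `r_and_slope` is illustrative.

```python
from statistics import mean, stdev

def r_and_slope(xs, ys):
    """Return (r, b1); both are built from the same cross-deviation sum."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = cross / ((n - 1) * stdev(xs) * stdev(ys))  # positive denominator
    b1 = cross / sum((x - x_bar) ** 2 for x in xs)  # positive denominator
    return r, b1

# Hypothetical decreasing data: both values come out negative
r_neg, b1_neg = r_and_slope([1, 2, 3, 4, 5], [9, 7, 6, 4, 1])
# Hypothetical increasing data: both values come out positive
r_pos, b1_pos = r_and_slope([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(r_neg < 0 and b1_neg < 0, r_pos > 0 and b1_pos > 0)
```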