TRANSCRIPT
Chapter 10
Lecture 2
Section: 10.3
We analyzed paired data with the goal of determining whether there is a linear correlation between two variables. The main objective of this section is to describe the relationship between two variables by finding the graph and equation of the straight line that represents the relationship. This straight line is called the regression line, and its equation is called the regression equation.
The regression equation expresses a relationship between the variable x and the variable y, just as the equation $y = mx + b$ does.
• x is called the independent variable, predictor variable, or explanatory variable.
• y is the dependent variable, or response variable.
The typical equation of a straight line, $y = mx + b$, is expressed in the form $\hat{y} = b_0 + b_1 x$, where $b_0$ is the y-intercept and $b_1$ is the slope.
Assumptions:
1. We are investigating only linear relationships.
2. For each x-value, y is a random variable having a normal distribution. All of these y distributions have the same variance. Also, for a given value of x, the distribution of y-values has a mean that lies on the regression line. Results are not seriously affected if departures from normal distributions and equal variances are not too extreme.
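For $\hat{y} = b_0 + b_1 x$:

$$b_0 = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2}, \qquad \text{equivalently} \qquad b_0 = \bar{y} - b_1 \bar{x}$$

$$b_1 = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$$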
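The sum formulas for the y-intercept and slope can be turned into a few lines of code. What follows is a minimal Python sketch, not a prescribed method: the function name `regression_coefficients` is hypothetical, and the data are the income and food-expenditure values from the first example in this section.

```python
def regression_coefficients(xs, ys):
    """Return (b0, b1) for y-hat = b0 + b1*x using the sum formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Both coefficients share the same denominator: n(sum x^2) - (sum x)^2.
    denom = n * sum_x2 - sum_x ** 2
    b1 = (n * sum_xy - sum_x * sum_y) / denom
    b0 = (sum_y * sum_x2 - sum_x * sum_xy) / denom
    return b0, b1


# Data from the first example (monthly income and food expenditure).
income = [5500, 8300, 3800, 6100, 3300, 4900, 6700]
food_ex = [1400, 2400, 1300, 1600, 900, 1500, 1700]

b0, b1 = regression_coefficients(income, food_ex)
print(f"FoodEx-hat = {b0:.1f} + {b1:.5f} * income")
```

The printed coefficients can be compared with the Constant and income rows of the MINITAB output for that example (150.7 and 0.25246).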
Using the Regression Equation for Predictions:
Regression equations can be helpful when used for predicting the value of one variable, given some particular value of the other variable. If the regression line fits the data quite well, then it makes sense to use its equation for predictions, provided that we don't go beyond the scope of the available values. However, we should use the equation of the regression line only if r indicates that there is a linear correlation. In the absence of a linear correlation, we should not use the regression equation for projecting or predicting; instead, our best estimate of the second variable is simply its sample mean, $\bar{y}$.
Thus our guidelines in predicting a value of y based on some given value of x are:
1. If there is not a linear correlation, the best predicted y-value is $\bar{y}$.
2. If there is a linear correlation, the best predicted y-value is found by substituting the x-value into the regression equation.
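The two guidelines amount to a simple decision rule, sketched below in Python. This is an illustration under stated assumptions rather than a definitive procedure: the function name `best_predicted_y` is hypothetical, the test for a linear correlation is done by comparing |r| with a critical value that the caller looks up, and the value 0.707 (n = 8, 0.05 significance level) is taken from a standard table of critical values of r that is not part of this section. The data are the paper and household-size values from the household example in this section.

```python
def best_predicted_y(xs, ys, x_new, r_critical):
    """Apply the two prediction guidelines to paired data.

    r_critical is the critical value of r for this sample size and
    significance level, supplied by the caller (an assumption here).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / (sxx * syy) ** 0.5  # linear correlation coefficient

    if abs(r) <= r_critical:
        # Guideline 1: no linear correlation, so use the sample mean of y.
        return mean_y
    # Guideline 2: linear correlation, so use the regression equation.
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0 + b1 * x_new


paper = [2.41, 7.57, 9.55, 8.82, 8.72, 6.96, 6.83, 11.42]
hsize = [2, 3, 3, 6, 4, 2, 1, 5]

# 0.707 is the assumed table value for n = 8 at the 0.05 significance level.
print(best_predicted_y(paper, hsize, 10, r_critical=0.707))
```

For these data r is about 0.63, which does not exceed 0.707, so the sketch returns the sample mean of the household sizes (3.25) instead of a value from the regression line.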
1. The accompanying table lists monthly incomes and the corresponding food expenditures for the month of December.
Income            $5,500  $8,300  $3,800  $6,100  $3,300  $4,900  $6,700
Food Expenditure  $1,400  $2,400  $1,300  $1,600    $900  $1,500  $1,700
MINITAB Output
Regression Analysis: FoodEx versus income
The regression equation is
FoodEx = 151 + 0.252 income

Predictor     Coef  SE Coef     T      P
Constant     150.7    217.4  0.69  0.519
income     0.25246  0.03788  6.66  0.001

S = 159.508   R-Sq = 89.9%   R-Sq(adj) = 87.9%
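The S and R-Sq values in this output can be reproduced from the seven data pairs. The Python sketch below assumes the usual definitions, which the output itself does not spell out: S is the square root of the sum of squared residuals divided by n - 2, and R-Sq is the proportion of the variation in y explained by the regression line (the square of r). Variable names are illustrative only.

```python
import math

income = [5500, 8300, 3800, 6100, 3300, 4900, 6700]
food_ex = [1400, 2400, 1300, 1600, 900, 1500, 1700]

n = len(income)
mean_x = sum(income) / n
mean_y = sum(food_ex) / n
sxx = sum((x - mean_x) ** 2 for x in income)
syy = sum((y - mean_y) ** 2 for y in food_ex)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(income, food_ex))

# Least-squares slope and intercept.
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

# Sum of squared residuals about the regression line.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(income, food_ex))

r_sq = 1 - sse / syy                    # assumed definition of R-Sq
s = math.sqrt(sse / (n - 2))            # assumed definition of S
r = math.copysign(math.sqrt(r_sq), b1)  # r carries the sign of the slope

print(f"S = {s:.3f}, R-Sq = {100 * r_sq:.1f}%, r = {r:.3f}")
```

An r this close to 1 is what justifies using the regression equation for predictions in this example.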
[Figure: Scatterplot of FoodEx vs income. Horizontal axis: income; vertical axis: FoodEx.]
2. Compute the regression equation and find the arm length if the neck size is 16.0.
Neck (x)        15.8  18.0  16.3  17.5  16.6  17.2  16.5
Arm Length (y)  33.5  36.8  35.2  35.0  34.7  35.7  34.8
3. Compute the regression equation and find the GPA of a student whose SAT score is 1000.
SAT GPA
1591 4.33
1530 3.96
1322 3.74
1169 3.12
979 2.80
825 2.70
791 2.54
766 2.35
743 2.32
633 2.07
4. The accompanying table lists weights in pounds of paper discarded by a sample of households, along with the size of the household.
Paper  2.41  7.57  9.55  8.82  8.72  6.96  6.83  11.42
HSize     2     3     3     6     4     2     1      5
What is the best predicted size of a household that discards 10 lb of paper?
[Figure: Fitted Line Plot, HSize = 0.152 + 0.3979 Paper. S = 1.40073, R-Sq = 39.6%, R-Sq(adj) = 29.6%. Horizontal axis: Paper; vertical axis: HSize.]
[Figure: Scatterplot of HSize vs Paper. Horizontal axis: Paper; vertical axis: HSize.]
5. The following are the ages and corresponding blood pressures of 10 subjects randomly selected from a large city. Estimate the blood pressure of someone who is 40 years of age.
Age             38   41   42   45   50   52   55   60   62   65
BloodPressure  120  115  130  120  132  135  140  145  140  149
6. A study was conducted to investigate the relationship between age (in years) and BAC (blood alcohol concentration) measured when convicted DWI (driving while intoxicated) jail inmates were first arrested. What is the best predicted BAC of a person who is 22 years of age who has been convicted and jailed for a DWI?
Age  17.2  43.5  30.7  53.1  37.2  21.0  27.6  46.3
BAC  0.19  0.20  0.26  0.16  0.24  0.20  0.18  0.23