![Page 1: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/1.jpg)
Working with relationships between two variables
• “Donation” made to teacher & Stats Test Score
[Figure: scatterplot of Stats Test Score (y-axis, 0 to 100) against donation amount (x-axis, $0 to $80)]
![Page 2: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/2.jpg)
Correlation & Regression
• Univariate & Bivariate Statistics
– U: frequency distribution, mean, mode, range, standard deviation
– B: correlation (two variables)
• Correlation
– linear pattern of relationship between one variable (x) and another variable (y)
– an association between two variables
– relative position of one variable correlates with relative distribution of another variable
• X: an explanatory variable attempts to explain the observed outcomes in Y
• Y: a response variable measures an outcome of a study
• Warning:
– No proof of causality
– Cannot assume x causes y
![Page 3: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/3.jpg)
Scatterplot or Scatter Diagram
a plot of paired data to determine or show a relationship between two variables
![Page 4: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/4.jpg)
Graduating Seniors by State in 2005
The state of Louisiana
The state of Rhode Island
![Page 5: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/5.jpg)
AP Statistics, Section 3.1, Part 1
Figure 3.1 (Percent taking SAT vs. Score)
• Attributes of a good scatterplot
– Consistent and uniform scale
– Labels on both axes
– Accurate placement of data
– Data spread throughout the axes
– Axis break lines if not starting at zero
• To achieve this goal you should try to do your scatterplots on graph paper.
![Page 6: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/6.jpg)
Graduating Seniors by State in 2005
States from NE, Mid-Atlantic and West
States from Midwest, Mtn Central, and Southwest
![Page 7: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/7.jpg)
Paired Data
Miles traveled   Minutes
2                6
5                9
12               23
7                18
7                15
15               28
10               19
![Page 8: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/8.jpg)
Scatter Diagram
[Figure: scatter diagram, "Relationship between miles traveled and minutes"; x-axis: miles (0 to 20); y-axis: minutes (0 to 30); plotted from the paired data on the previous slide]
![Page 9: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/9.jpg)
Linear Correlation
The general trend of the points seems to follow a straight line segment.
![Page 10: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/10.jpg)
Linear Correlation
![Page 11: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/11.jpg)
Non-Linear Correlation
![Page 12: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/12.jpg)
No Linear Correlation
![Page 13: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/13.jpg)
High Linear Correlation
Points lie close to a straight line.
![Page 14: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/14.jpg)
High Linear Correlation
![Page 15: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/15.jpg)
Moderate Linear Correlation
![Page 16: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/16.jpg)
Low Linear Correlation
![Page 17: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/17.jpg)
Perfect Linear Correlation
![Page 18: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/18.jpg)
Questions Arising
• Can we find a relationship between x and y?
• How strong is the relationship?
[Figure: scatter diagram, "Relationship between miles traveled and minutes"; x-axis: miles (0 to 20); y-axis: minutes (0 to 30)]
![Page 19: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/19.jpg)
When there appears to be a linear relationship between x and y:
attempt to “fit” a line to the scatter diagram.
![Page 20: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/20.jpg)
When using x values to predict y values:
• Call x the explanatory variable
• Call y the response variable
![Page 21: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/21.jpg)
Scatterplot!
• No Correlation
– Random or circular assortment of dots
• Positive Correlation
– ellipse leaning to right
– GPA and SAT
– Smoking and Lung Damage
– Number of Whoppers eaten and Mr. Flynn’s weight
• Negative Correlation
– ellipse leaning to left
– Depression & Self-esteem
– Studying & test errors
– Vampire friends & Werewolf boyfriends
![Page 22: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/22.jpg)
Interpreting Scatterplots
• Pattern/Shape: linear, parabola, bell shaped
– Deviations from pattern: Are there areas where the data conform less to the pattern?
– Form: Are there clusters of data?
– Special data: Are there any influential points?
– Is a transformation of data necessary?
• Trend/Direction: positive, negative, or WTF?
– As x increases, what happens to y?
• Strength/Association: weak, moderate, strong
– If a line were drawn through the data, how close would the points be to the line?
– Is there a small or large amount of variability within the y values?
![Page 23: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/23.jpg)
Pearson’s Correlation Coefficient
• “r” indicates…
– strength of relationship (strong, weak, or none)
– the variation of the points around the model (linear)
– direction of relationship
• positive (direct): variables move in same direction
• negative (inverse): variables move in opposite directions
• r ranges in value from –1.0 to +1.0
Strong Negative (–1.0) | No Relationship (0.0) | Strong Positive (+1.0)
• Try quick estimates
– Next slide and strange quiz
![Page 24: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/24.jpg)
Practice with Scatterplots
r = .__ __
r = .__ __
r = .__ __
r = .__ __
![Page 25: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/25.jpg)
A relationship between the correlation coefficient, r, and the slope, b, of the least squares line:
$$b = r\,\frac{s_y}{s_x}$$
where $s_y$ is the standard deviation of the y values and $s_x$ is the standard deviation of the x values.
![Page 26: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/26.jpg)
Linear correlation coefficient
$$-1 \le r \le +1$$
![Page 27: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/27.jpg)
Calculating the Correlation Coefficient, r
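$$r = \frac{SS_{xy}}{\sqrt{SS_x\,SS_y}}$$
where
$$SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n}$$
$$SS_x = \sum x^2 - \frac{(\sum x)^2}{n}$$
$$SS_y = \sum y^2 - \frac{(\sum y)^2}{n}$$
and n = number of data pairs.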
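The sum-of-squares formula for r can be sketched in a few lines of Python. This is only an illustration: the function name `correlation_ss` is made up for this example, and the data are the miles/minutes pairs used elsewhere in these slides.

```python
from math import sqrt

def correlation_ss(xs, ys):
    """Pearson's r from the computational (sum-of-squares) formulas."""
    n = len(xs)                      # number of data pairs
    sum_x, sum_y = sum(xs), sum(ys)
    # SSxy = sum(xy) - (sum x)(sum y) / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    # SSx = sum(x^2) - (sum x)^2 / n, and likewise for SSy
    ss_x = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_y = sum(y * y for y in ys) - sum_y ** 2 / n
    return ss_xy / sqrt(ss_x * ss_y)

miles = [2, 5, 12, 7, 7, 15, 10]
minutes = [6, 9, 23, 18, 15, 28, 19]

r = correlation_ss(miles, minutes)
print(f"r = {r:.4f}")
```

Because the formula only needs the running totals of x, y, x², y², and xy, it matches the column-sum table used for hand calculation.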
![Page 28: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/28.jpg)
Paired Data
Miles traveled   Minutes
2                6
5                9
12               23
7                18
7                15
15               28
10               19
![Page 29: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/29.jpg)
Scatter Diagram
[Figure: scatter diagram, "Relationship between miles traveled and minutes"; x-axis: miles (0 to 20); y-axis: minutes (0 to 30); plotted from the paired data on the previous slide]
![Page 30: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/30.jpg)
Find the Least Squares Line
x (Miles Traveled)   y (Minutes)   x²     xy
2                    6             4      12
5                    9             25     45
12                   23            144    276
7                    18            49     126
7                    15            49     105
15                   28            225    420
10                   19            100    190
Σx = 58              Σy = 118      Σx² = 596   Σxy = 1174
![Page 31: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/31.jpg)
Finding the slope
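$$SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} = 1174 - \frac{(58)(118)}{7} = 196.28571$$
and
$$SS_x = \sum x^2 - \frac{(\sum x)^2}{n} = 596 - \frac{58^2}{7} = 115.42857$$
$$\text{slope} = b = \frac{SS_{xy}}{SS_x} = \frac{196.28571}{115.42857} = 1.700495$$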
![Page 32: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/32.jpg)
Finding the y-intercept
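$$\bar{x} = \text{mean of x values} = \frac{58}{7} = 8.2857143$$
$$\bar{y} = \text{mean of y values} = \frac{118}{7} = 16.857143$$
$$y\text{-intercept} = a = \bar{y} - b\bar{x} = 16.857143 - 1.700495(8.2857143) = 2.7673273$$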
![Page 33: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/33.jpg)
The equation of the least squares line is:
y = a + bx
y = 2.8 + 1.7x
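As a check on the hand calculation, the slope and intercept can be recomputed with a short Python sketch. The function name `least_squares_line` is an assumption made for this example; the data are the seven miles/minutes pairs from the table.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y = a + bx."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    ss_x = sum(x * x for x in xs) - sum_x ** 2 / n
    b = ss_xy / ss_x                 # slope = SSxy / SSx
    a = sum_y / n - b * sum_x / n    # intercept = mean(y) - b * mean(x)
    return a, b

miles = [2, 5, 12, 7, 7, 15, 10]
minutes = [6, 9, 23, 18, 15, 28, 19]

a, b = least_squares_line(miles, minutes)
print(f"y = {a:.1f} + {b:.1f}x")
```

The unrounded values of a and b should be kept for any further calculation; rounding to one decimal is only for writing the final equation.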
![Page 34: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/34.jpg)
To Compute r:
• Complete a table, with columns listing x, y, x², y², xy
• Compute SSxy, SSx, and SSy
• Use the formula:
$$r = \frac{SS_{xy}}{\sqrt{SS_x\,SS_y}}$$
![Page 35: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/35.jpg)
Find the Correlation Coefficient
x (Miles)   y (Min.)   x²     y²     xy
2           6          4      36     12
5           9          25     81     45
12          23         144    529    276
7           18         49     324    126
7           15         49     225    105
15          28         225    784    420
10          19         100    361    190
Σx = 58     Σy = 118   Σx² = 596   Σy² = 2340   Σxy = 1174
![Page 36: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/36.jpg)
Calculations:
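$$SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} = 1174 - \frac{(58)(118)}{7} = 196.28571$$
$$SS_x = \sum x^2 - \frac{(\sum x)^2}{n} = 596 - \frac{58^2}{7} = 115.42857$$
$$SS_y = \sum y^2 - \frac{(\sum y)^2}{n} = 2340 - \frac{118^2}{7} = 350.85714$$
$$r = \frac{SS_{xy}}{\sqrt{SS_x\,SS_y}} = \frac{196.28571}{\sqrt{(115.42857)(350.85714)}} = 0.9753643$$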
![Page 37: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/37.jpg)
The Correlation Coefficient,
r = 0.9753643
r ≈ 0.98
![Page 38: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/38.jpg)
AP Statistics, Section 3.2, Part 1
Calculating Correlation
• The calculation of correlation is based on mean and standard deviation.
• Remember that neither the mean nor the standard deviation is a resistant measure.
$$r = \frac{1}{n-1}\sum\left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)$$
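The z-score form of the formula can be sketched in Python as follows. The function name `correlation_z` is hypothetical, and the data are the miles/minutes pairs from the earlier slides; `statistics.stdev` uses the n − 1 denominator, which is what this formula assumes.

```python
from statistics import mean, stdev

def correlation_z(xs, ys):
    """Pearson's r as the average product of z-scores (n - 1 denominator)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)   # sample standard deviations
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

miles = [2, 5, 12, 7, 7, 15, 10]
minutes = [6, 9, 23, 18, 15, 28, 19]

r = correlation_z(miles, minutes)
print(f"r = {r:.4f}")
```

Because the mean and standard deviation both appear in every term, a single outlier can shift r noticeably, which is the sense in which r is not resistant.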
![Page 39: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/39.jpg)
Calculating Correlation
• What do the contents of the parentheses look like?
– Each one is the formula for calculating a z-value.
• What happens when the values are both from the lower half of the population? From the upper half?
– Lower half: both z-values are negative. Their product is positive.
– Upper half: both z-values are positive. Their product is positive.
$$r = \frac{1}{n-1}\sum\left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)$$
![Page 40: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/40.jpg)
Calculating Correlation
• What happens when one value is from the lower half of the population but the other value is from the upper half?
– One z-value is positive and the other is negative. Their product is negative.
$$r = \frac{1}{n-1}\sum\left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)$$
![Page 41: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/41.jpg)
Using the TI-83/84 to calculate r
• With Diagnostics ON:
• Run LinReg(a+bx) [STAT>CALC>option 8] with the explanatory variable as the first list and the response variable as the second list
• The results are the slope and vertical intercept of the regression equation (more on that later) and the values of r and r². (For more on r², check the next handout ;)
![Page 42: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/42.jpg)
Predictive Potential
• Coefficient of Determination
– r²
– Amount of variance accounted for in y by x
– Percentage increase in accuracy you gain by using the regression line to make predictions
– Without correlation, you can only guess the mean of y
– [Used with regression]
[Figure: percentage scale from 0% to 100% in steps of 20%]
Understanding r-squared activity
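One way to see "variance accounted for" is to compare the prediction error of the regression line with the error from always guessing the mean of y. The sketch below does this for the miles/minutes data from the earlier slides; the function name `r_squared` is made up for the example.

```python
def r_squared(xs, ys):
    """Share of the variation in y that the least squares line accounts for."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    # Error when predicting with the line y' = a + bx
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Error when always guessing the mean of y
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst

miles = [2, 5, 12, 7, 7, 15, 10]
minutes = [6, 9, 23, 18, 15, 28, 19]

r2 = r_squared(miles, minutes)
print(f"r squared = {r2:.3f}")
```

For a least squares line with one explanatory variable, this ratio equals the square of the correlation coefficient r.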
![Page 43: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/43.jpg)
Limitations of Correlation
• linearity:
– can’t describe (accurately) non-linear relationships
– e.g., flavor and % eaten, thickness and strength
• truncation of range:
– underestimate strength of relationship if you can’t see full range of x values
• no proof of causation
– third variable problem:
• could be 3rd variable causing change in both variables
• directionality: can’t be sure which way causality “flows”
• “We don’t get it” – what does it have to do with that f#$%@! Line?
That is for another session…
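The linearity limitation can be illustrated with a small made-up data set in which y depends perfectly on x, but along a parabola rather than a line. The numbers and the function name `pearson_r` are assumptions for this sketch, not data from the slides.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sqrt(sum((x - x_bar) ** 2 for x in xs)
               * sum((y - y_bar) ** 2 for y in ys))
    return num / den

# Hypothetical data: y is completely determined by x, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.2f}")
```

Here r comes out at zero even though the relationship is exact, so a small r only rules out a linear pattern, not a relationship.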
![Page 44: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/44.jpg)
Regression
• Regression: Correlation + Prediction
– predicting y based on x
– e.g., predicting….
• throwing points (y)
• based on distance from target (x)
• Regression equation
– formula that specifies a line
– y’ = a + bx
– plug in an x value (distance from target) and predict y (points)
– note
• y = actual value of a score
• y’ = predicted value
• Data Handout
– Test takers, planets, darts
![Page 45: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/45.jpg)
AP Statistics, Section 3.3, Part 1
The Least-Square Regression
• Finds the best-fit line by minimizing the total area of the squares formed by the differences between the observed data and the values predicted by the model (the sum of the squared residuals).
![Page 46: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/46.jpg)
The Least-Square Regression
• Statisticians use a slightly different version of “slope-intercept” form.
$$\hat{y} = a + bx$$
$$b = r\,\frac{s_y}{s_x}$$
$$a = \bar{y} - b\bar{x}$$
• Slope is the product of the r value and the ratio of the standard deviations
• Y-intercept is the value found using the average x and average y
![Page 47: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/47.jpg)
Regression Graphic – Regression Line
[Figure: scatterplot with fitted regression line; x-axis: Distance from target (8 to 26); y-axis: Total ball toss points (0 to 120); Rsq = 0.6031]
• if x=18 then y’=47
• if x=24 then y’=20
![Page 48: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/48.jpg)
Predicting Model
• To put the regression line on the graph, paste RegEQ (Vars menu: Statistics > EQ > RegEQ) into Y1.
• Then you can use Trace, Table, or Y1 to find the response values that correspond to particular explanatory (x) values.
![Page 49: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/49.jpg)
Regression Equation
• y’ = a + bx
– y’ = predicted value of y
– b = slope of the line
– x = value of x that you plug in
– a = y-intercept (where line crosses y axis)
• In the dart throwing case….
– y’ = 125.401 - 4.263(x)
• So if the distance is 20 feet
– y’ = 125.401 - 4.263(20)
– y’ = 125.401 - 85.26
– y’ = 40.141
See STAT – CALC – LinReg: a + bx
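The dart-throwing prediction is just a function of x, which a short Python sketch makes explicit. The function name `predict_points` is made up for this example; the coefficients are the ones given on the slide, and the distance of 20 matches the slide's arithmetic (4.263 × 20 = 85.26).

```python
def predict_points(distance, a=125.401, b=-4.263):
    """Predicted points y' = a + b*x for a given distance from the target."""
    return a + b * distance

y_pred = predict_points(20)
print(f"y' = {y_pred:.3f}")
```

Plugging in other distances works the same way, but predictions are only trustworthy for x values inside the range of the observed data (roughly 8 to 26 on the regression graphic).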
![Page 50: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/50.jpg)
Drawing a Regression Line by Hand
Four steps
1. Use the y-intercept (if possible; does it have meaning? interval vs. ratio scale)
2. Plot the average point (mean x, mean y)
3. Choose a large value for x (just so it falls on the right end of the graph), plug it into the equation, then plot the resulting point
4. Connect the three points with a straight line!
![Page 51: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/51.jpg)
Residuals
• It is important to note that the observed values almost never match the predicted values exactly
• The difference between the observed value and the predicted value has a special name: residual
Observed value: $y$
Predicted value: $\hat{y}$
Residual: $y - \hat{y}$
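Residuals are easy to compute once a line has been fitted. The sketch below fits the least squares line to the miles/minutes data from the earlier slides and lists observed minus predicted for each point; all variable names are chosen for this example.

```python
miles = [2, 5, 12, 7, 7, 15, 10]
minutes = [6, 9, 23, 18, 15, 28, 19]

n = len(miles)
x_bar, y_bar = sum(miles) / n, sum(minutes) / n
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(miles, minutes))
     / sum((x - x_bar) ** 2 for x in miles))
a = y_bar - b * x_bar

predicted = [a + b * x for x in miles]
# residual = observed value - predicted value
residuals = [y - y_hat for y, y_hat in zip(minutes, predicted)]

for x, y, e in zip(miles, minutes, residuals):
    print(f"x = {x:2d}  observed = {y:2d}  residual = {e:+.2f}")
```

For a least squares line the residuals always sum to zero (up to rounding), so it is their pattern, not their total, that says something about the fit.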
![Page 52: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/52.jpg)
Residual Plots
• You can plot the residuals to see if there are any trends in the quality of the predictive model
• Try looking in the List menu for “RESID:”
![Page 53: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/53.jpg)
Residual Plots
• This residual plot shows no tendencies. It is equally bad throughout.
• This suggests that the original relationship is linear.
![Page 54: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/54.jpg)
“Pattern” =Not Linear
“Well Distributed”=Linear
![Page 55: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/55.jpg)
Predictive Ability
• Mantra!! – As variability decreases, prediction accuracy __________
– if we can account for variance, we can make better predictions
• As r increases (in absolute value):
– r² increases
• “variance accounted for” increases
• the prediction accuracy increases
– prediction error decreases (distance between y’ and y)
– Sy decreases
• the standard error of the residual/predictor
• measures overall amount of prediction error
– It can be thought of like this …
![Page 56: –Working with relationships between two variables “Donation “ made to teacher & Stats Test Score](https://reader035.vdocuments.us/reader035/viewer/2022081603/5697bf7a1a28abf838c82ff6/html5/thumbnails/56.jpg)
Thanks – Peace !
We like big r’s and we cannot lie!!!
You other brothers can’t deny!!!
Check out those residuals son
and plot em with your TI-84 on
Cause if they don’t look all scattered and patterned
then your least squared line is shattered
Then I only want that - if your scale and r squared is fat
So kick out those nasty outliers
When your correlation factor is on
BABY GOT STATS!