GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Upload: melina-rodgers

Post on 28-Dec-2015


Page 1: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

GAISEing into the Statistics Common Core
Day 2: Statistical Association
June 27, 2013

Page 2: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Team

• Dr. Stephanie Casey is an Assistant Prof. of MathEd at EMU. Her research focuses on teacher knowledge for teaching statistics at the middle and secondary levels, motivated by her experience of teaching secondary mathematics for fourteen years.

• Dr. Andrew Ross is an Associate Prof. of Math at EMU, specializing in operations research. He was named the Michigan MAA Distinguished Teaching Awardee in 2011.

• Dr. Brenda Gunderson is a Senior Lecturer in Stats Dept at the University of Michigan. She coordinates and teaches Statistics and Data Analysis, with approximately 1800 students each term.

• Anamaria Kazanis, PStat, is a Senior Statistician at MSU. She is the current president of the Ann Arbor Chapter of the ASA.

• Karen Nielsen is a PhD student in the Stats Dept. at the University of Michigan. She has taught 2 years of undergraduate introductory Statistics labs and served as a mentor to other Graduate Student Instructors. As part of a cross-disciplinary team, she helped to bring online learning objects into large-enrollment gateway classes.

• Mackenzie Fankell graduated from the U of M in 2009 with a degree in psychology. After graduating, she worked as an English teacher in Chile for two years before returning to the US and working as a high school math teacher in Dearborn, MI. She began her master's in education at U of M in 2012 but transferred to a master's program in statistics later that year. She hopes to pursue research in education and the social sciences.

Page 3: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Outline of Our Day

• 9:00-10:30 a.m.: GAISE into the CCSS-M statistics standard(s) of the day:
  • the standard,
  • its learning trajectory, and
  • content
• 10:30-10:40 a.m.: BREAK
• 10:40 a.m.-12:10 p.m.: GAISE activities part 1
  • activities that teach the standard through the GAISE process,
  • debrief on the experience and how to utilize the activity in their own classroom
• 12:10-1:00 p.m.: LUNCH BREAK
• 1:00-2:00 p.m.: GAISE activities part 2
• 2:00-2:30 p.m.: Interactive lecture on
  • knowledge of standard and students,
  • discussing what students are likely to think about and do as they progress through the learning trajectory for the standard;
  • common student conceptions, effective ways to support students as they move through the learning trajectory
• 2:30-3:00 p.m.: Reflections on the day’s standard(s), share ideas, comments, concerns, etc. for teaching the standard(s)

Page 4: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

9:00-10:30 a.m. GAISE into the CCSS-M statistics standards of the day:

• The standards
• Learning trajectory
• Content

Page 5: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Standards, Grade 8 (part 1)

Investigate patterns of association in bivariate data.

• CCSS.Math.Content.8.SP.A.1 Construct and interpret scatter plots for bivariate measurement data to investigate patterns of association between two quantities. Describe patterns such as clustering, outliers, positive or negative association, linear association, and nonlinear association.

• CCSS.Math.Content.8.SP.A.2 Know that straight lines are widely used to model relationships between two quantitative variables. For scatter plots that suggest a linear association, informally fit a straight line, and informally assess the model fit by judging the closeness of the data points to the line.

Page 6: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Standards, Grade 8 (part 2)

Investigate patterns of association in bivariate data.

• CCSS.Math.Content.8.SP.A.3 Use the equation of a linear model to solve problems in the context of bivariate measurement data, interpreting the slope and intercept. For example, in a linear model for a biology experiment, interpret a slope of 1.5 cm/hr as meaning that an additional hour of sunlight each day is associated with an additional 1.5 cm in mature plant height.

• CCSS.Math.Content.8.SP.A.4 Understand that patterns of association can also be seen in bivariate categorical data by displaying frequencies and relative frequencies in a two-way table. Construct and interpret a two-way table summarizing data on two categorical variables collected from the same subjects. Use relative frequencies calculated for rows or columns to describe possible association between the two variables. For example, collect data from students in your class on whether or not they have a curfew on school nights and whether or not they have assigned chores at home. Is there evidence that those who have a curfew also tend to have chores?

Page 7: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Standards, High School (part 1)

Summarize, represent, and interpret data on two categorical and quantitative variables

• CCSS.Math.Content.HSS-ID.B.5 Summarize categorical data for two categories in two-way frequency tables. Interpret relative frequencies in the context of the data (including joint, marginal, and conditional relative frequencies). Recognize possible associations and trends in the data.

• CCSS.Math.Content.HSS-ID.B.6 Represent data on two quantitative variables on a scatter plot, and describe how the variables are related.

• CCSS.Math.Content.HSS-ID.B.6a Fit a function to the data; use functions fitted to data to solve problems in the context of the data. Use given functions or choose a function suggested by the context. Emphasize linear, quadratic, and exponential models.

• CCSS.Math.Content.HSS-ID.B.6b Informally assess the fit of a function by plotting and analyzing residuals.

• CCSS.Math.Content.HSS-ID.B.6c Fit a linear function for a scatter plot that suggests a linear association.

Page 8: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Standards, High School (part 2)

Interpret linear models

• CCSS.Math.Content.HSS-ID.C.7 Interpret the slope (rate of change) and the intercept (constant term) of a linear model in the context of the data.

• CCSS.Math.Content.HSS-ID.C.8 Compute (using technology) and interpret the correlation coefficient of a linear fit.

• CCSS.Math.Content.HSS-ID.C.9 Distinguish between correlation and causation.

Page 9: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

AP Statistics (part 1)

• I. Exploring Data: Describing patterns and departures from patterns (20%–30%)

Exploratory analysis of data makes use of graphical and numerical techniques to study patterns and departures from patterns. Emphasis should be placed on interpreting information from graphical and numerical displays and summaries.

D. Exploring bivariate data
  1. Analyzing patterns in scatterplots
  2. Correlation and linearity
  3. Least-squares regression line
  4. Residual plots, outliers and influential points
  5. Transformations to achieve linearity: logarithmic and power transformations
E. Exploring categorical data
  1. Frequency tables and bar charts
  2. Marginal and joint frequencies for two-way tables
  3. Conditional relative frequencies and association
  4. Comparing distributions using bar charts

Page 10: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

AP Statistics (part 2)

• IV. Statistical Inference: Estimating population parameters and testing hypotheses (30%–40%)

Statistical inference guides the selection of appropriate models.

A. Estimation (point estimators and confidence intervals)
  8. Confidence interval for the slope of a least-squares regression line
B. Tests of significance
  6. Chi-square test for … homogeneity of proportions, and independence (…two-way tables)
  7. Test for the slope of a least-squares regression line

Page 11: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Learning Trajectories/Progressions

• TurnOnCCMath.net
• Progressions for the Common Core State Standards in Mathematics
• Project SET: http://project-set.com/
• http://project-set.com/presentations/121712-regressionlp-final-released/

Page 12: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Turn On CC Math.net (up to 8th grade)

Page 13: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013
Page 14: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013
Page 15: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Progressions for the Common Core State Standards in Mathematics

• By the Common Core Standards Writing Team themselves

Page 16: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

GAISE Level A, assoc.-related

• I. Formulate the Question
  → Teachers help pose questions (questions in contexts of interest to the student).
• II. Collect Data to Answer the Question
  → Students conduct a census of the classroom.
  → Students understand individual-to-individual natural variability.
  → Students conduct simple experiments with nonrandom assignment of treatments.
• III. Analyze the Data
  → Students observe association between two variables
  → Students use tools for exploring … association, including:
    ▪ Scatterplot
    ▪ Tables (using counts)
• IV. Interpret Results

Page 17: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Example:

Page 18: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

GAISE Level B, assoc.-related

• I. Formulate Questions
  → Students begin to pose their own questions
• III. Analyze Data
  → Students quantify the strength of association between two variables, develop simple models for association between two numerical variables, and use expanded tools for exploring association, including:
    ▪ Contingency tables for two categorical variables
    ▪ Time series plots
    ▪ The QCR (Quadrant Count Ratio) as a measure of strength of association
    ▪ Simple lines for modeling association between two numerical variables
• IV. Interpret Results
  → Students understand basic interpretations of measures of association.
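The QCR can be computed directly from the points of a scatterplot. A minimal sketch, assuming the usual GAISE definition (quadrants are formed by lines through the mean of x and the mean of y); the function name, the treatment of points that fall exactly on a mean line, and the data are assumptions made for illustration:

```python
from statistics import mean

def quadrant_count_ratio(xs, ys):
    """QCR = (points in quadrants 1 and 3 - points in quadrants 2 and 4) / n."""
    x_bar, y_bar = mean(xs), mean(ys)
    agree = 0     # quadrants 1 and 3: both values above, or both below, their means
    disagree = 0  # quadrants 2 and 4: one value above its mean, the other below
    for x, y in zip(xs, ys):
        product = (x - x_bar) * (y - y_bar)
        if product > 0:
            agree += 1
        elif product < 0:
            disagree += 1
        # Assumption: a point exactly on a mean line counts toward neither group.
    return (agree - disagree) / len(xs)

# Made-up illustration data (not from the workshop).
heights = [150, 155, 160, 165, 170, 175]
arm_spans = [148, 164, 157, 166, 172, 171]
qcr = quadrant_count_ratio(heights, arm_spans)
print(qcr)
```

Like r, the QCR lies between -1 and +1, but it only counts points and ignores how far each point is from the means.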

Page 19: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Example: favorite music

Page 20: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

GAISE Level C, assoc.-related

• I. Formulate Questions
  → Students should be able to formulate questions and determine how data can be collected and analyzed to provide an answer
• III. Analyze Data
  → Students should be able to recognize association between two categorical variables.
  → Students should be able to recognize when the relationship between two numerical variables is reasonably linear, know that Pearson’s correlation coefficient is a measure of the strength of the linear relationship between two numerical variables, and understand the least squares criterion in line fitting

Page 21: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Example

Page 22: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Example: plotting residuals

Page 23: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

• http://project-set.com (there are many other Project SETs)
• Aimed at high school
• Loop 1, golf ball drop, could be used in middle school
  • Informal lines of fit
• Loop 2, vertical leap, is for HS: least-squares, residuals
  • And possibility of categorical association
• Loop 2, used car prices, is for HS: least-squares, residuals
• Loop 3, NFL QB salaries, is for HS: least-squares, r or R^2
• Loop 4&5, texting, just for AP Stat

Page 24: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Loop 1: Informal Fit

• Using Golf Ball Drop data
• Please read the handout, use spaghetti to show your informally fitted line.
• Not allowed to break spaghetti to connect individual dots!
• Finish instructions on handout.
• Also, what is wrong with the experimental plan?

Page 25: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Lack of Replication!

• When possible, should do at least 2 experiments under each experimental setting (drop height, in this case)
• Helps quantify uncertainty at each x value
• Can then use fancy tests for nonlinearity (post-AP-level stats)

What if we had only done one trial at each dose? Might see just the diamonds, or just the Xs. Also, when designing, choose 3 or more X values, so we can detect nonlinearity.

Page 26: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Show 3 Types of Scatterplots:

• Designed experiment, with replication

Don’t average the y values at each x value to “make it simpler”!

Page 27: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Show 3 Types of Scatterplots:

• Observational Study

[Figure: "Collection 1 Scatter Plot"; axis label: average_salary (thousands).]

[Figure: scatter plot "Pennsylvania, district-by-district" with fitted line y = 0.0059x + 1023.8, R^2 = 0.2108; horizontal axis: Avg Teacher Salary ($0 to $80,000); vertical axis: Math test scores (0 to 1800).]

Page 28: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Show 3 Types of Scatterplots:

• Time Series

Page 29: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Common Suggestions for Informal Fits

• Connect First and Last Points
• Connect Lowest and Highest Points
• Divide the data in half
• Connect as many points as possible
• And others we’ll get to later.
• Before we go on, sketch graphs that show these ideas aren’t great.

Page 30: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Common suggestions for informal fits:

Page 31: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Another common suggestion

Page 32: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Loop 2: Residuals = actual - predicted

New ideas for informal fit?

NOTE: Residuals are measured VERTICALLY, not horizontally and not perpendicular to the line of best fit.
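A small sketch of the residual calculation, using made-up data and a made-up candidate line (the function name is hypothetical). Each residual is a vertical distance: the observed y minus the y that the line predicts at the same x.

```python
def residuals(xs, ys, slope, intercept):
    """Vertical residuals: actual y minus the y predicted by the line."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Made-up data and a made-up candidate line, for illustration only.
xs = [1, 2, 3, 4]
ys = [3, 4, 8, 9]
res = residuals(xs, ys, slope=2, intercept=1)
print(res)
```

A positive residual means the point sits above the line; a negative residual means it sits below.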

Page 33: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013
Page 34: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Usual student’s answer: sum the absolute residuals

• Not a bad idea!
• But, some bad points about it:
  • Historically, harder to do than what we’ll see next.
  • Sometimes the choice of line is not unique.
  • Advanced statistical theory supports a different choice.
• Good points:
  • Modern software can do it.
  • It’s resistant to outliers.
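To illustrate the non-uniqueness point, here is a sketch with made-up data (two observations at each of two x values). Under these assumptions, several different lines tie for the smallest possible sum of absolute residuals; the function name is hypothetical.

```python
def sum_abs_residuals(points, slope, intercept):
    """Sum of |actual - predicted| over all points."""
    return sum(abs(y - (slope * x + intercept)) for x, y in points)

# Made-up data: two observations at each of two x values.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Three different candidate lines, written as (slope, intercept).
candidates = [(0.0, 0.25), (0.0, 0.75), (0.5, 0.25)]
totals = [sum_abs_residuals(points, m, b) for m, b in candidates]
print(totals)  # all three lines give the same total

# A line that misses the data entirely does worse.
worse = sum_abs_residuals(points, 0.0, 2.0)
print(worse)
```

Any line that passes between the two observations at both x values gives the same total here, so this criterion alone cannot pick a single "best" line for this data set.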

Page 35: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Usual statistician’s answer: sum the squared residuals

• This applet shows the geometric squares of the residuals: http://www.geogebra.org/en/upload/files/mrfox001/line_of_best_fit.html
• Does CCSSM require use or knowledge of formulas to find the line that minimizes the sum of squared residuals?
  • Standards aren’t so clear to me; the draft Progressions document seems to focus only on using technology to fit the line automatically.
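For reference, the least-squares line has a closed-form solution. The sketch below uses the standard textbook formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x) on made-up data; function names are hypothetical, and nothing here is meant to suggest that CCSSM requires students to know these formulas.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared vertical residuals."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def sum_squared_residuals(xs, ys, slope, intercept):
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = least_squares_line(xs, ys)
best = sum_squared_residuals(xs, ys, slope, intercept)

# Nudging the slope or intercept away from the fitted values makes the total larger.
nudged_slope = sum_squared_residuals(xs, ys, slope + 0.1, intercept)
nudged_intercept = sum_squared_residuals(xs, ys, slope, intercept - 0.1)
print(slope, intercept, best, nudged_slope, nudged_intercept)
```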

Page 36: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Standards, High School (part 1)

Summarize, represent, and interpret data on two categorical and quantitative variables

• CCSS.Math.Content.HSS-ID.B.5 Summarize categorical data for two categories in two-way frequency tables. Interpret relative frequencies in the context of the data (including joint, marginal, and conditional relative frequencies). Recognize possible associations and trends in the data.

• CCSS.Math.Content.HSS-ID.B.6 Represent data on two quantitative variables on a scatter plot, and describe how the variables are related.

• CCSS.Math.Content.HSS-ID.B.6a Fit a function to the data; use functions fitted to data to solve problems in the context of the data. Use given functions or choose a function suggested by the context. Emphasize linear, quadratic, and exponential models.

• CCSS.Math.Content.HSS-ID.B.6b Informally assess the fit of a function by plotting and analyzing residuals.

• CCSS.Math.Content.HSS-ID.B.6c Fit a linear function for a scatter plot that suggests a linear association.

Page 37: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Standards, High School (part 2)

Interpret linear models

• CCSS.Math.Content.HSS-ID.C.7 Interpret the slope (rate of change) and the intercept (constant term) of a linear model in the context of the data.

• CCSS.Math.Content.HSS-ID.C.8 Compute (using technology) and interpret the correlation coefficient of a linear fit.

• CCSS.Math.Content.HSS-ID.C.9 Distinguish between correlation and causation.

Page 38: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

What is the length of this line?

Page 39: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

What is the length of this line?

Page 40: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Is this a square? What is its area?

Page 41: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Popular drawing: sum of squared residuals

But squares are actually coming “out of the page” at us; both base & depth are measured in $

Page 42: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

What is the danger lurking in the equation that it shows?

Page 43: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

http://xkcd.com/833/

Page 44: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Label the axes!

• It is very easy to get confused: is y=original data, or y=residuals?
• Other, more advanced plots have:
  • X=predicted, y=actual
  • X=predicted, y=residual
  • X=run sequence of data (1st, 2nd, etc.), y=residual
• Here are six recommended plots for examining the residuals: http://www.itl.nist.gov/div898/handbook/eda/section3/6plot.htm However, it neglects another type that it mentions elsewhere: a run-order or run-sequence plot.

Page 45: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

It is standard practice to graph the residuals!

Page 46: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Timing data from yesterday

Let’s try it on the TI calculators.

Page 47: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Mackenzie   17  17
Lori        21  27
Paul        17  21
ASK         24  22
Katelyn     23  20
Karin       24  33
Allison     18  19
Karen       20  20
Jamie       22  19
Andra       18  20
Susan       24  25
Sherita     53  45
Susan       15  16
Stephanie   23  27
Jordan      27  26
Ed          18  18
Mila        25  27
Wendy       24  23
Claudia     28  27
Steve       24  26
Linda       25  25
Karen       28  28
Elizabeth   28  26
Jeff        26  25
Kim         38  26
Jeannette   19  24
Lisa        29  28
Joanne      30  25
Molly       31  38
Laura       33  35

Page 48: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

With line of best fit:

• What if we flip the x & y data before doing regression?
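One way to explore the flip question is to fit both regressions and compare. A sketch with made-up data (not the workshop timing data); it relies on the standard fact that the product of the two least-squares slopes equals r^2, so the flipped line is generally not the algebraic inverse of the original one. The function name is hypothetical.

```python
from statistics import mean

def least_squares_slope(xs, ys):
    """Slope of the least-squares line predicting ys from xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope_y_on_x = least_squares_slope(xs, ys)  # predict y from x
slope_x_on_y = least_squares_slope(ys, xs)  # predict x from y (axes flipped)
r_squared = slope_y_on_x * slope_x_on_y

print(slope_y_on_x, slope_x_on_y, 1 / slope_y_on_x, r_squared)
```

The two slopes are reciprocals of each other only when r^2 = 1, that is, when every point lies exactly on a line.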

Page 49: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

It is standard practice to graph the residuals!

Page 50: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

What should residual graphs look like?

• No patterns!
• If there are any patterns, that means our original regression missed something. Which of these are okay/not okay?

Each graph has x=original x data values, y=residuals

Page 51: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Usual Procedure

1. Graph the data
2. Fit a function
3. Compute and graph residuals
4. Any pattern left? Repeat from step 2
5. No pattern left? We’re done!

• Students get confused: do I want to see a pattern, or not?

• In the original data, yes (usually). In the residuals, no.
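A sketch of steps 2 to 4 on made-up data that follow a curve (y = x^2). Fitting a straight line leaves a clear U-shaped pattern in the residuals, which is the signal to go back and fit a different function. The function name and data are assumptions for illustration.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Made-up curved data: y = x^2.
xs = [0, 1, 2, 3, 4]
ys = [x ** 2 for x in xs]

# Step 2: fit a (linear) function. Step 3: compute the residuals.
slope, intercept = least_squares_line(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Step 4: positive at both ends, negative in the middle, so a pattern is left.
print(residuals)
```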

Page 52: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Correlation Coefficient

• r (that is, little-r)
• Always between -1 and +1
• Close to zero: no linear relationship
• Close to -1 or +1: close-to-linear relationship
• |r| from 0 to 0.5 is weak, 0.5 to 0.8 is moderate, above 0.8 is strong (though that’s for social-science/biology stuff, not engr/physics)
• r doesn’t change if x or y units change (or both), or axes flip.
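The last bullet can be checked numerically. A minimal sketch with made-up heights and weights: r is computed from the textbook formula, then recomputed after converting units and after swapping the axes. The function name and data are assumptions for illustration.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up heights (inches) and weights (pounds).
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [115, 120, 140, 155, 160, 180]

# Same people, different units.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.45359237 for w in weights_lb]

r_original = correlation(heights_in, weights_lb)
r_metric = correlation(heights_cm, weights_kg)
r_flipped = correlation(weights_lb, heights_in)  # axes swapped
print(r_original, r_metric, r_flipped)
```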

Page 53: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Which has the highest correl.?

• This is called Anscombe’s Quartet
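For anyone who wants to check the four data sets numerically, here is a sketch. The values below are the commonly published Anscombe (1973) values, not taken from the slides, and the function name is hypothetical. It computes r for each data set so the results can be compared with what the four scatterplots suggest.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Commonly published values for Anscombe's quartet (assumed, not from the slides).
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

r_values = {name: correlation(xs, ys) for name, (xs, ys) in quartet.items()}
for name, r in r_values.items():
    print(name, round(r, 3))
```

All four correlations come out nearly identical even though the scatterplots look very different, which is why the graph should always be drawn before r is interpreted.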

Page 54: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

• Again, which has the highest correlation?

Page 55: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Teacher Value-Added Scores

• New York City, 2006 vs 2008 school year
• Data on each individual teacher
• Same school, same subject, same grade level
• At least 3 years of experience
• X=VA z-score in 2006; Y=VA z-score in 2008
• What will the scatterplot look like?
• What will the r or R^2 value be?

Page 56: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013
Page 57: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Outliers & Influential Points

• Each data point could be:
  • An x outlier
  • A y outlier
  • Both x and y outlier, or neither x nor y outlier
  • A regression outlier (far from the pattern of the data)
  • Influential (if removed, the slope of the regression line would change more than just a little bit, whatever that means in the context of the problem)
• Not all outliers are influential!

Page 58: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Outlier boundaries

• Using the usual definition of outlier: more than 1.5 IQR below Q1 or above Q3
• Slanted lines for regression outliers use Q1, Q3, IQR of residuals:
  • Trendline + Q3 + 1.5 IQR, and
  • Trendline + Q1 – 1.5 IQR (Q1 of residuals will be < 0, almost always)
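A sketch of the regression-outlier boundaries described above, on made-up data. It fits the least-squares line, takes Q1, Q3 and the IQR of the residuals, and flags points whose residual falls outside Q1 - 1.5 IQR or Q3 + 1.5 IQR. Quartile conventions differ between textbooks and calculators; this sketch assumes the default method of Python's `statistics.quantiles`, and the function names are hypothetical.

```python
from statistics import mean, quantiles

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def regression_outliers(xs, ys):
    """Points lying outside trendline + (Q1 - 1.5 IQR) or trendline + (Q3 + 1.5 IQR)."""
    slope, intercept = least_squares_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    q1, _, q3 = quantiles(residuals, n=4)  # assumption: default quartile method
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [(x, y) for x, y, res in zip(xs, ys, residuals) if res < lower or res > upper]

# Made-up data: roughly y = x, with one point far above the pattern.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1.5, 1.5, 3.5, 3.5, 20, 5.5, 7.5, 7.5, 9.5]
flagged = regression_outliers(xs, ys)
print(flagged)
```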

Page 59: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Influence on Slope

• Consider a lattice of possible points we might add to data set
• Compute abs(% change in regression slope)
• Color small changes blue, large changes red.
• Center is near (mean x, mean y)
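The lattice idea can be sketched for a few candidate points. The data and function names below are made up for illustration; the calculation is the one on the slide, the absolute percent change in the least-squares slope when one extra point is added.

```python
from statistics import mean

def least_squares_slope(xs, ys):
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def pct_change_in_slope(xs, ys, new_x, new_y):
    """abs(% change in regression slope) after adding the point (new_x, new_y)."""
    old = least_squares_slope(xs, ys)
    new = least_squares_slope(xs + [new_x], ys + [new_y])
    return abs((new - old) / old) * 100

# Made-up data; mean x is 3 and mean y is 4.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

at_center = pct_change_in_slope(xs, ys, 3, 4)       # at (mean x, mean y)
above_center = pct_change_in_slope(xs, ys, 3, 10)   # y outlier at mean x
far_right_low = pct_change_in_slope(xs, ys, 10, 0)  # far from mean x, off the trend
print(at_center, above_center, far_right_low)
```

In this sketch a point added at the mean of x leaves the slope unchanged no matter how extreme its y value is, while a point far from the mean of x and off the trend changes the slope a great deal.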

Page 60: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Influence on R^2

• Using abs(% change in R^2)
• Red regions show large change, blue shows little change
• Note white-ish regions at bottom-left & top-right: adding points from those regions (which are near the original trendline) increases R^2
• Adding points from top-left or bottom-right decreases R^2

Page 61: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Common Names for Variables

horizontal     vertical
x              y
independent    dependent
free           outcome
predictor      predicted
stimulus       response
cause          effect
controlled     uncontrolled
explanatory    explained
regressor      result

Page 62: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Example

• Suppose you read an article that says that people who eat at least one carrot a day tend to spend less on health care than those that don’t. Does this mean you should eat more carrots to stay healthier?

• Perhaps a hidden variable is a person’s attitude toward health. People who try to take good care of themselves probably eat a lot of all veggies. They probably also have better health than those who don’t care what they eat.

• This proposes a specific lurking variable (rather than saying “there is a lurking variable” with no further explanation), and it says how that variable affects both of the variables already mentioned. It doesn’t argue that the link doesn’t exist.

Page 63: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Correlation does not imply Causation

• Perhaps a 3rd variable, not in the study, is affecting both variables that were in the study (“lurking”)
• Perhaps the causation runs the opposite way of what was proposed
• Common lurking variables:
  • SES = Socio-Economic Status (poverty, etc.)
  • A person’s overall health
  • A person’s health attitude
  • Population Size (of a city/state/country)
  • Inflation or flow of time
  • Weather
  • Local cost of living
  • % liberal/conservative by region

Page 64: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Argue about these:

• There is a positive link between consumption of tobacco and non-use of seatbelts. So if you want to cut down on smoking, buckle up!

• There is a link between the # of years of math someone takes in high school and their future income. So, Michigan should require high school students to take at least Algebra 2.

• An actual article said something like: there is a link between credit card debt and health problems. So, to make yourself healthier, pay down your credit cards, since credit card debt causes stress which can cause health problems.

• There is a link between the presence of computers in K-12 schools and their standardized test scores. Therefore, we should spend more money on computers in schools.

Page 65: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

• Smoking and seat belts: health attitude
• Algebra-2: lurking variable of geekiness?
• Debt and health problems: maybe health problems cause debt, more than debt causes health problems?

• Computers in schools: socio-economic status?

Page 66: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

http://xkcd.com/552/

Page 67: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Interpreting Slope

[Figure: scatter plot "Pennsylvania, district-by-district" with fitted line y = 0.0059x + 1023.8, R^2 = 0.2108; horizontal axis: Avg Teacher Salary ($0 to $80,000); vertical axis: Math test scores (0 to 1800).]

Page 68: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Algebra vs Statistics vocabulary: Slope

• Algebra: slope = how much y will change for a 1-unit change in x
• Statistics: slope of regression line = AVERAGE change in y per 1-unit DIFFERENCE in x
  • AVERAGE: no guarantee that y will change exactly that much.
  • DIFFERENCE: saying “change” might give the impression that we are changing the x value of a data point (putting someone on a stretching machine) instead of comparing two different x values (two people of different heights).

Page 69: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

True or False?

1. T/F: If you give a raise of $10,000 to each teacher in a particular district, that district’s avg. test score will go up 59 points.

2. T/F: If you live in a district with a $50,000 average teacher salary and move to a district with a $60,000 average, your child’s test scores will go up, on average, by 59 points.

3. T/F: If you live in a district with a $50,000 average and move to a district with a $60,000 average, that district’s score will be 59 points higher.

4. T/F: If you live in a district with a $50,000 average and move to a district with a $60,000 average, then on average the scores in that district will be 59 points higher.

5. T/F: Since there is a district with a salary of about $30,000 with test scores above 1400, and another district around $65,000 with test scores below 1200, we can see that there’s no correlation between salary and test scores.

Page 70: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Answers

1. False; this presumes that the correlation is a causation. It talks about changing the x-value of one data point, not comparing two data points.

2. False; your child is still your child with all their existing demographics. While an increase might happen, there’s no reason to think it would even average 59 points.

3. False; this statement sounds like a guarantee.

4. True; saying “on average” is the key point.

5. False; a single counter-example (or even many of them) doesn’t disprove a general trend.
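The 59 points in these statements comes from the fitted line on the Pennsylvania plot, y = 0.0059x + 1023.8. A short sketch of the arithmetic (the function name is hypothetical): the model's predicted average score for a $60,000 district minus its predicted average for a $50,000 district.

```python
# Coefficients from the fitted line shown on the Pennsylvania scatter plot.
slope = 0.0059        # test-score points per dollar of average teacher salary
intercept = 1023.8

def predicted_score(avg_salary):
    """Model's predicted average test score for a district with this average salary."""
    return slope * avg_salary + intercept

difference = predicted_score(60_000) - predicted_score(50_000)
print(round(difference, 1))
```

The result is a difference between the model's averages for two kinds of districts. It is not a guarantee for any single district and not evidence that a raise would cause the increase.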

Page 71: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Free School Lunches cause bad test scores?

[Figure: scatter plot with fitted line y = -304.48x + 1394.4, R^2 = 0.4524; x = % free lunch (horizontal axis "Pct Free Lunch", 0% to 100%); y = score (0 to 1800).]

Page 72: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

http://xkcd.com/605/

Page 73: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

http://xkcd.com/1007/

Page 74: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Ecological Fallacy

• Better to call it Aggregation Fallacy (my personal opinion)
• The fallacy is assuming that aggregate data gives useful info on individuals.

Page 75: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

10:40 a.m.-12:10 p.m.

• GAISE activities part 1: participants engage in activities that teach the standard through the GAISE process, then debrief on the experience and how to utilize the activity in their own classroom
• Possible / Favorite activities for Quantitative Association:
  • Barbie Bungee
  • Spaghetti Bridge
  • Balloon Descent Time
  • Paper Helicopter Descent Time
  • Sports ball bounce height (Golf? Ping-pong? Superbounce?)
  • M&M Exponential Survival Curve
  • Two Estimates of Timing, or weight of objects, or age of a person

Page 76: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

12:10-1:00 p.m. : Lunch Break

Workshop participants engage in a process of balancing nutritional value, price, and flavor options to decide what food to eat.

They might have already done this in pre-class work (“brown bag”)

Participants will eat their selected food (or randomly selected?) n=30 times and record the resulting observations.

Page 77: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

1:00-2:00 p.m. GAISE activities part 2: participants engage in activities that teach the standard through the GAISE process and utilize technology, then debrief on the experience and how to utilize the activity in their own classroom

Categorical Association Activity: Eyes!

Page 78: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Categorical Association

• CCSS.Math.Content.8.SP.A.4 Understand that patterns of association can also be seen in bivariate categorical data by displaying frequencies and relative frequencies in a two-way table. Construct and interpret a two-way table summarizing data on two categorical variables collected from the same subjects. Use relative frequencies calculated for rows or columns to describe possible association between the two variables. For example, collect data from students in your class on whether or not they have a curfew on school nights and whether or not they have assigned chores at home. Is there evidence that those who have a curfew also tend to have chores?

Page 79: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Categorical Association

• CCSS.Math.Content.HSS-ID.B.5 Summarize categorical data for two categories in two-way frequency tables. Interpret relative frequencies in the context of the data (including joint, marginal, and conditional relative frequencies). Recognize possible associations and trends in the data.

Page 80: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Summer Employment and Gender: Joint Frequency

Job Experience                                    Male   Female   Total
Never had a part-time job                           21       31
Had a part-time job during summer only              15       13
Had a part-time job but not only during summer      12        8
Total

What numbers/tables and graphs could we look at to explore the data, looking for any association between gender and summer employment?

Page 81: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Are these helpful?

[Figures: bar charts of the joint frequencies for Male and Female across the three job-experience categories (Never had a part-time job; Had a part-time job during summer only; Had a part-time job but not only during summer).]

Page 82: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Marginal Frequency

Job Experience                                    Male   Female   Total
Never had a part-time job                           21       31      52
Had a part-time job during summer only              15       13      28
Had a part-time job but not only during summer      12        8      20
Total                                               48       52     100

What numbers/tables and graphs could we look at to explore the data, looking for any association between gender and summer employment?

Page 83: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Conditional Frequency, by Gender

Job Experience                                    Male   Female   Total
Never had a part-time job                          44%      60%     52%
Had a part-time job during summer only             31%      25%     28%
Had a part-time job but not only during summer     25%      15%     20%
Total                                             100%     100%    100%

Grade 8: Use relative frequencies calculated for rows or columns.
HS: Interpret relative frequencies in the context of the data (including joint, marginal, and conditional relative frequencies)
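The conditional percentages on these slides can be reproduced from the joint counts. The following is a minimal Python sketch, not part of the original workshop materials; the variable names are illustrative assumptions, and the counts are the ones in the Summer Employment table.

```python
# Joint counts from the Summer Employment table
# (rows: job experience, columns: gender).
counts = {
    "Never had a part-time job": {"Male": 21, "Female": 31},
    "Had a part-time job during summer only": {"Male": 15, "Female": 13},
    "Had a part-time job but not only during summer": {"Male": 12, "Female": 8},
}
genders = ["Male", "Female"]

# Marginal totals.
row_totals = {job: sum(by_gender.values()) for job, by_gender in counts.items()}
col_totals = {g: sum(counts[job][g] for job in counts) for g in genders}
grand_total = sum(row_totals.values())

# Conditional relative frequencies by gender: each column sums to 1.
cond_by_gender = {
    job: {g: counts[job][g] / col_totals[g] for g in genders} for job in counts
}

# Conditional relative frequencies by experience: each row sums to 1.
cond_by_experience = {
    job: {g: counts[job][g] / row_totals[job] for g in genders} for job in counts
}

for job in counts:
    cells = "  ".join(f"{g}: {cond_by_gender[job][g]:.0%}" for g in genders)
    print(f"{job:<48}{cells}")
```

Dividing by column totals answers "what share of males (or females) fall in each job category?", while dividing by row totals answers "what share of each job category is male (or female)?". Comparing the columns of the first table, or the rows of the second, is what reveals a possible association.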

Page 84: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Conditional Frequency, by Gender: 100% Stacked Column
• A.K.A. Segmented Bar Graph

[Figure: 100% stacked column chart with one column each for Male and Female, segmented by job experience (Never had a part-time job; Had a part-time job during summer only; Had a part-time job but not only during summer); axis from 0% to 100%.]

Page 85: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Conditional Frequency by Experience

Job Experience                                    Male   Female   Total
Never had a part-time job                          40%      60%    100%
Had a part-time job during summer only             54%      46%    100%
Had a part-time job but not only during summer     60%      40%    100%
Total                                              48%      52%    100%

What numbers/tables and graphs could we look at to explore the data, looking for any association between gender and summer employment?

Page 86: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Conditional Frequency, by Experience

[Figure: 100% stacked chart with Female and Male segments; axis from 0% to 100%.]

Page 87: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Famous Discrimination Case

          Recommended for Promotion   Not Recommended for Promotion   Total
Male                 21                             3                   24
Female               14                            10                   24
Total                35                            13                   48

Suppose these numbers show an actual association (“statistically significant”).
Is that evidence of illegal discrimination?
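One way to make "statistically significant" concrete for this table is to ask how often a split this lopsided would arise if the 35 recommendations were handed out with no regard to gender. The sketch below is an illustration added here, not the analysis used in the workshop. It counts the possibilities exactly with `math.comb`; the variable and function names are assumptions.

```python
from math import comb

# Counts from the promotion table.
males, females = 24, 24
recommended = 35
observed_males_recommended = 21

total = males + females
ways_total = comb(total, recommended)  # ways to choose which 35 files get recommended

def prob_males_recommended(k):
    """Chance that exactly k recommended files are male if gender plays no role."""
    return comb(males, k) * comb(females, recommended - k) / ways_total

# Chance of a split at least as favorable to males as the observed 21.
p_at_least_observed = sum(
    prob_males_recommended(k)
    for k in range(observed_males_recommended, min(males, recommended) + 1)
)
print(f"P(21 or more males recommended by chance alone) = {p_at_least_observed:.4f}")
```

A small probability only says that chance alone is an unlikely explanation for the association. Whether the association reflects illegal discrimination depends on how the data were produced, which is the point of the slide's question.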

Page 88: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Activity starter
• Everyone get a triangle-ended slip of paper.
• With both eyes open, hold hands at arm’s distance like this, [image]
• Focus on a distant object.
• Close left eye. Reopen.
• Close right eye. Reopen.

Page 89: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

• Whichever eye still saw the object is your “dominant eye”
• Write your dominant eye (L or R) on the triangle-end of your slip of paper.
• What might be related to eye dominance?
• (suggestions from the whole group)
• Write that on the non-triangle end of your slip of paper.
• Pass slips of paper to the front

Page 90: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Let’s make a table

             Left Hand   Right Hand   Total
Left Eye         2           10         12
Right Eye        2           18         20
Total            4           28         32

Page 91: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Example table 1 (fake data)

        Left   Right
Eye       5     35
Hand      9     31

Page 92: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Example table 2 (fake data)

             Left Hand   Right Hand
Left Eye         3            2
Right Eye        6           29

Page 93: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

What if extreme association?

             Left Hand   Right Hand   Total
Left Eye         4            8         12
Right Eye        0           20         20
Total            4           28         32

Page 94: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

What if no association?

• First, make a segmented bar chart (100% Stacked Column chart)
• Then, make a table of conditional relative frequencies
• Then, make a table of joint frequencies

Conditional:   Left Hand   Right Hand   Total
Left Eye         12.5%       87.5%      100%
Right Eye        12.5%       87.5%      100%

Joint:         Left Hand   Right Hand   Total
Left Eye       12*12.5%    12*87.5%      12
Right Eye      20*12.5%    20*87.5%      20
Total              4          28         32

Page 95: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

What if no association?

• First, make a segmented bar chart (100% Stacked Column chart)
• Then, make a table of conditional relative frequencies
• Then, make a table of joint frequencies

Conditional:   Left Hand   Right Hand   Total
Left Eye         12.5%       87.5%      100%
Right Eye        12.5%       87.5%      100%

Joint:         Left Hand   Right Hand   Total
Left Eye          1.5        10.5        12
Right Eye         2.5        17.5        20
Total              4          28         32
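The "no association" joint counts follow from the margins alone: each cell is (row total × column total) / grand total, which is the same as applying the overall 12.5% / 87.5% split to each eye group. A minimal Python sketch of that arithmetic, with illustrative names:

```python
# Margins from the eye/hand table.
row_totals = {"Left Eye": 12, "Right Eye": 20}
col_totals = {"Left Hand": 4, "Right Hand": 28}
n = sum(row_totals.values())  # 32 people in total

# Joint counts we would expect if eye and hand dominance were unrelated.
expected = {
    eye: {hand: row_totals[eye] * col_totals[hand] / n for hand in col_totals}
    for eye in row_totals
}

for eye, cells in expected.items():
    print(eye, cells)
```

The fractional counts (such as 1.5 people) cannot occur in real data; they serve as a reference point against which the observed table is compared.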

Page 96: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Simulating No Association (artificial data)

             Left Hand   Right Hand   Total
Left Eye                                 5
Right Eye                               35
Total            9           31         40
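One way to fill in the empty cells under "no association" is to pair the eye labels and hand labels at random, which mimics shuffling two decks of slips. The sketch below is an assumption about how such a simulation could be run in Python, not the procedure used in the workshop; the function name and its parameters are hypothetical.

```python
import random

def simulate_no_association(n_left_eye=5, n_right_eye=35,
                            n_left_hand=9, n_right_hand=31, seed=None):
    """Randomly pair eye and hand labels while keeping the margins fixed."""
    rng = random.Random(seed)
    eyes = ["L"] * n_left_eye + ["R"] * n_right_eye
    hands = ["L"] * n_left_hand + ["R"] * n_right_hand
    rng.shuffle(hands)  # random pairing makes hand unrelated to eye

    # Keys are (eye, hand) pairs.
    table = {("L", "L"): 0, ("L", "R"): 0, ("R", "L"): 0, ("R", "R"): 0}
    for eye, hand in zip(eyes, hands):
        table[(eye, hand)] += 1
    return table

# Repeat the shuffle a few times to see how much the cells vary by chance alone.
for trial in range(5):
    print(simulate_no_association(seed=trial))
```

Every simulated table has the same row and column totals as the slide; only the interior cells change, which shows how much apparent association random pairing can produce.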

Page 97: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

2:00-2:30 p.m.
• Interactive lecture on
  • knowledge of standard and students,
  • what students are likely to think about and do as they progress through the learning trajectory for the standard
  • common student (mis-)conceptions,
  • effective ways to support students as they move through the learning trajectory

Page 98: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Common student conceptions
“What Fits” -- from Project SET and STEW
• Activity: drop a golf ball from various heights, record height of its first bounce
• The following graphs show where actual students placed a “best-fit” line (thin metal rod)
• For each, try to figure out the student's reasoning; then we'll reveal it.

Page 99: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

[Figure: scatter plot “Golf Ball drop height and bounce height”; horizontal axis Height_of_ball_cm, 0 to 80; vertical axis 0 to 50. Student's line: Bounce_height_cm = -0.00155 Height_of_ball_cm + 19.5]

Page 100: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

• I thought of the line of best fit like the mode, so I put the line through two points with the same y-coordinate because they occur most often.

[Figure: scatter plot “Golf Ball drop height and bounce height”; horizontal axis Height_of_ball_cm, 0 to 80; vertical axis 0 to 50. Student's line: Bounce_height_cm = -0.00155 Height_of_ball_cm + 19.5]

Page 101: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

[Figure: scatter plot “Golf Ball drop height and bounce height”; horizontal axis Height_of_ball_cm, 0 to 80; vertical axis 0 to 50. Student's line: Bounce_height_cm = 0.741 Height_of_ball_cm]

Page 102: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

• The line needs to start at (0,0) then go through the most dots. I got my line to go through two of the dots so I put it there.

[Figure: scatter plot “Golf Ball drop height and bounce height”; horizontal axis Height_of_ball_cm, 0 to 80; vertical axis 0 to 50. Student's line: Bounce_height_cm = 0.741 Height_of_ball_cm]

Page 103: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

[Figure: scatter plot “Golf Ball drop height and bounce height”; horizontal axis Height_of_ball_cm, 0 to 80; vertical axis 0 to 50. Student's line: Bounce_height_cm = 23.6]

Page 104: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

• The line should be in the middle of the highest and lowest points because that’s like the average.

[Figure: scatter plot “Golf Ball drop height and bounce height”; horizontal axis Height_of_ball_cm, 0 to 80; vertical axis 0 to 50. Student's line: Bounce_height_cm = 23.6]

Page 105: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

[Figure: scatter plot “Golf Ball drop height and bounce height”; horizontal axis Height_of_ball_cm, 0 to 80; vertical axis 0 to 50. Student's line drawn at Height_of_ball_cm mean = 41.015]

Page 106: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

• I know the line should go in the middle of the data. I put it here so it would be in the middle, four points on each side.

[Figure: scatter plot “Golf Ball drop height and bounce height”; horizontal axis Height_of_ball_cm, 0 to 80; vertical axis 0 to 50. Student's line drawn at Height_of_ball_cm mean = 41.015]

Page 107: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

[Figure: scatter plot “Golf Ball drop height and bounce height”; horizontal axis Height_of_ball_cm, 0 to 80; vertical axis 0 to 50. Student's line: Bounce_height_cm = 0.684 Height_of_ball_cm - 4.1]

Page 108: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

• I tried to make my line go through the most dots.

[Figure: scatter plot “Golf Ball drop height and bounce height”; horizontal axis Height_of_ball_cm, 0 to 80; vertical axis 0 to 50. Student's line: Bounce_height_cm = 0.684 Height_of_ball_cm - 4.1]

Page 109: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Help these students
• You are leading the class in analyzing the relationship between GPA and ACT scores for a data set, using technology. You have asked your students to find the correlation coefficient, coefficient of determination, and regression line for the data set. One pair of students asks for your help. They have done their work independently and are now comparing their answers. They have the same correlation coefficient & coefficient of determination, but their regression lines are different.
• What went wrong?
• How would you help them?

Page 110: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

The answer
• One of them used GPA to predict ACT, and the other did the reverse.
• The r and R^2 values will be the same
• The slopes will be different (slope1 × slope2 = r^2, so slope1 is close to 1/slope2 only when the correlation is strong, and equals it exactly only when r = ±1)
• The intercepts will be different
• Predictions will be different
• How to help? Perhaps ask each to predict the ACT of a 4.0 student; one will be able to answer quickly, the other will have to solve a linear equation.
• Perhaps sketch a quick graph with vertical residuals and horizontal residuals?
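The two students' results can be imitated with a short Python sketch. The GPA and ACT values below are made up for illustration (the workshop data set is not reproduced in these slides), and the function names are assumptions.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def correlation(x, y):
    """Pearson correlation coefficient r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data, invented for this example only.
gpa = [2.1, 2.8, 3.0, 3.3, 3.5, 3.7, 3.9, 4.0]
act = [17, 21, 20, 25, 24, 29, 27, 33]

slope_act, intercept_act = fit_line(gpa, act)  # student A: ACT predicted from GPA
slope_gpa, intercept_gpa = fit_line(act, gpa)  # student B: GPA predicted from ACT
r = correlation(gpa, act)

# Student A plugs in GPA = 4.0; student B must solve 4.0 = slope*ACT + intercept.
pred_a = slope_act * 4.0 + intercept_act
pred_b = (4.0 - intercept_gpa) / slope_gpa

print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
print(f"slope A * slope B = {slope_act * slope_gpa:.3f}")
print(f"ACT for a 4.0 student: A says {pred_a:.1f}, B says {pred_b:.1f}")
```

The product of the two slopes matches r^2, and the two "predictions" for a 4.0 student disagree because each line minimizes residuals in a different direction.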

Page 111: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Rerouting a Student Idea
• To start a lesson on lines of best fit, you are following the curriculum guide and have presented your students the following data set about the pounds of beans used by families of different sizes when traveling on the Overland Trail:

#people            5   8   6   7   11   10   5    7   10   5    8    7    9   12   10
Pounds of beans   61  95  56  75  125  135  80  100  103  75  100  105  125  150  125

• First question: how much does a party of 20 need?
• Student suggests: You could look at people with like 10 people in their families and just double that amount.
• How to reroute into a line-of-best-fit?

Page 112: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Rerouting a Student Idea
• Which party of 10? There are 3 of them.
• Why not take a party of 5 and quadruple their usage?
• Why not take a party of 12 and a party of 8 and add them?
• What if the parties of 10 were by chance not big eaters? Can we use all of the data to estimate needs?
• What if a student suggests to compute pounds-per-person for each party, average them all, then multiply by 20?
• Industrial engineers would point out that while a linear regression will forecast the needed amount of food, you should actually bring along some safety stock, above the forecast.
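The competing estimates for a party of 20 can be compared side by side. This is a minimal Python sketch using the Overland Trail data from the slide; the function and variable names are illustrative assumptions, and the least-squares line is one reasonable choice of "line of best fit".

```python
people = [5, 8, 6, 7, 11, 10, 5, 7, 10, 5, 8, 7, 9, 12, 10]
beans = [61, 95, 56, 75, 125, 135, 80, 100, 103, 75, 100, 105, 125, 150, 125]

def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Student idea: double the usage of a party of 10 (there are three such parties).
doubled = [2 * b for p, b in zip(people, beans) if p == 10]

# Alternative idea: average pounds-per-person over all parties, times 20.
per_person = [b / p for p, b in zip(people, beans)]
per_person_estimate = 20 * sum(per_person) / len(per_person)

# Line of best fit using all 15 parties. A party of 20 is beyond the largest
# observed party (12), so this is an extrapolation.
slope, intercept = fit_line(people, beans)
line_estimate = slope * 20 + intercept

print("Doubling each party of 10:", doubled)
print(f"Average pounds per person x 20: {per_person_estimate:.1f}")
print(f"Line of best fit at 20 people: {line_estimate:.1f}")
```

Printing the three doubled values next to each other makes the "which party of 10?" question visible, which is a natural lead-in to using all of the data at once.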

Page 113: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Vocabulary Time
• A student in your class raises his hand and asks “What’s the difference between correlation, association, and regression?”.
• What would you say, do, or draw?

Page 114: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

The answer
• Association is the most general. All of these are associated: [images] Association also applies to categorical data.
• Correlation is the direction and strength of any linear relationship
• Regression is the process of fitting a line or curve to a data set.
• But, many people use them as if they mean the same thing, so we can't assume when we read or hear something that the person is being technically accurate in their usage. This is even true for some textbooks.

Page 115: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

• R^2 = % of variation explained by the predictor variable, using the model that you used.

• = 1 – (sum of squared residuals)/(sum of squared y deviations from the y mean)

• “Coefficient of Determination”

• Quadratic fits will ALWAYS be better (or at least as good) than linear fits, as measured by R^2 or sum of squared residuals.

• Similarly, cubic will ALWAYS beat quadratic (or be at least as good), etc.

• This is the danger of “Overfitting”: modeling your existing data points too well, at the expense of making good predictions of future data points.
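The claim that higher-degree polynomials never lose on R^2 can be checked numerically. The sketch below is an illustration, not workshop material: it fits degree 1, 2, and 3 polynomials to the Overland Trail beans data from an earlier slide by solving the normal equations in plain Python. All function names are assumptions.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def poly_fit(x, y, degree):
    """Least-squares coefficients c[0] + c[1]*x + ... + c[degree]*x**degree."""
    k = degree + 1
    a = [[sum(xi ** (i + j) for xi in x) for j in range(k)] for i in range(k)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(k)]
    return solve(a, b)

def r_squared(x, y, coeffs):
    """1 - (sum of squared residuals) / (sum of squared y deviations from the y mean)."""
    mean_y = sum(y) / len(y)
    fitted = [sum(c * xi ** p for p, c in enumerate(coeffs)) for xi in x]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

people = [5, 8, 6, 7, 11, 10, 5, 7, 10, 5, 8, 7, 9, 12, 10]
beans = [61, 95, 56, 75, 125, 135, 80, 100, 103, 75, 100, 105, 125, 150, 125]

r2 = {d: r_squared(people, beans, poly_fit(people, beans, d)) for d in (1, 2, 3)}
for degree, value in r2.items():
    print(f"degree {degree}: R^2 = {value:.4f}")
```

R^2 cannot go down as the degree goes up, because every lower-degree polynomial is a special case of the higher-degree one. That is why a larger R^2 alone is not a reason to prefer the more complicated model.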

Page 116: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Zero Correlation
• Create three scatterplots with distinctly different patterns which all have a correlation coefficient of approximately zero.
• Explain why the correlation coefficient is near zero for each of the cases.

Page 117: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Zero Correlation

• Also, if all the data is on a perfect flat line, y=0x+b, the correlation coefficient is technically undefined (divide-by-zero), but we might redefine it to be zero in that instance.

• But that would never happen with real data.
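Three possible patterns, plus the flat-line case, can be checked with a few lines of Python. The specific point sets below are illustrative choices, not the ones shown in the workshop, and the function name is an assumption.

```python
def correlation(x, y):
    """Pearson r, or None when either variable has no spread (division by zero)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return None
    return sxy / (sxx * syy) ** 0.5

xs = [-3, -2, -1, 0, 1, 2, 3]

patterns = {
    # Strong curved relationship, but the rise on the right cancels the fall on the left.
    "parabola": (xs, [x * x for x in xs]),
    # Two crossing lines, y = x and y = -x, whose linear trends cancel.
    "X shape": (xs + xs, xs + [-x for x in xs]),
    # A 3-by-3 grid of points: no relationship at all.
    "grid": ([x for x in (-1, 0, 1) for _ in range(3)],
             [y for _ in range(3) for y in (-1, 0, 1)]),
    # Perfectly flat line: y has no variation, so r is undefined.
    "flat line": (xs, [5] * len(xs)),
}

for name, (x, y) in patterns.items():
    print(f"{name}: r = {correlation(x, y)}")
```

The parabola and the X shape are useful reminders that r near zero means "no linear relationship", not "no relationship".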

Page 118: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Model Selection
• You have been teaching your class about finding the best model for a data set. A student says “So I just try all of these different equations on the data set and whichever one gives me the biggest value of r is the best one, right?”.

Page 119: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

An answer
• First, is there theory that indicates which model might be best? Exponential growth for populations, for example. Decay toward zero in some cases (could be power or exponential). Horizontal asymptotes predicted by theory? Are negative values (for x or for y) allowed?
• If theory has nothing to say, we have an inherent preference for linear models, since (a) Occam’s razor says to use the simplest thing, and while mathematically they are all equally simple in some sense, we really like linear stuff, and (b) if you zoom in on any of these smooth curves you will get approximately a line. This preference might lead us to select a linear model with R^2=0.80 instead of a power/log/exponential model with R^2=0.83, for example. How big a difference is acceptable? Hard to say.
• If we’re including polynomial models like a*x^2+b*x+c, warn about the danger of overfitting: adding more terms never decreases R^2 (and almost always increases it), but can make the predictions of future values ridiculous.

Page 120: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

And then there’s Logs
• Logarithmic transforms aren’t part of the CCSSM, but are part of AP Statistics.
• If some of the data has already been log-transformed (decibels, earthquake Richter scale magnitudes, pH, etc.) then it’s unlikely you would want to log it again.
• If the data (usually y rather than x) spans roughly 2 or more orders of magnitude, consider logging it
• If the residuals have a fan-shape, consider transforming y (either logging or square-rooting, usually)
• If the residuals look skewed, consider transforming y to bring them back toward normality.

Page 121: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Common Misconceptions on Categorical Association
• Research has found that students commonly have three incorrect conceptions about association of categorical variables:

• Determinist: students believed that an association meant all cases must show an association with no exceptions. These students believed that the cells in the two-way table that did not agree with the association should have zero frequency.

• Unidirectional: students believed dependence occurred only when it was direct. This could be explained by the tendency of students to give more relevance to positive cases than negative cases that confirm a given hypothesis.

• Localist: students looked at part of the data to determine if an association existed, often only looking at the cell with the highest frequency or at only one conditional distribution.

Page 122: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Determinist: no exceptions!

Artificial data:

                     Has HIV   Doesn't have HIV
Has AIDS                97             2
Doesn't have AIDS        1          9900

• What are some arguments, based on the numbers in this table, that they are NOT associated?

• What are some arguments, based on the numbers in this table, that they ARE associated?

• What does a Determinist look at in a segmented bar chart?

• Does an association have to hold for every single case for the association to be true?

• Does the strong association shown in the table show that HIV causes AIDS?

Page 123: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Unidirectional?
• Poll 100 people at your high school about their college plans.

           In-state   Out-of-state
Private        5           25
Public        60           10

Does this table, above, show an association?

Does the table below show an association?

           In-state   Out-of-state
Public        60           10
Private        5           25

What does Unidirectional thinking look like on a segmented bar chart?

Page 124: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Localist?
• Ask 100 people whether they like summer or winter weather more, and their birthplace.

                           Summer   Winter
East of the Mississippi      72       18
West of the Mississippi       8        2

It's pretty clear that people from the East prefer Summer more than Winter?

Does this mean that there is an association between birthplace and weather preference?

Compute the conditional distributions:
• What is the probability that someone from the east likes Summer weather? [72/(72+18) = 72/90 = 80%]
• What is the probability that someone from the west likes Summer weather? [8/(8+2) = 8/10 = 80%]
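The same check can be scripted so that students compare whole conditional distributions instead of a single large cell. A minimal Python sketch with the counts from this slide; the variable names are illustrative.

```python
# Counts from the birthplace / weather preference table.
counts = {
    "East of the Mississippi": {"Summer": 72, "Winter": 18},
    "West of the Mississippi": {"Summer": 8, "Winter": 2},
}

# Conditional distribution of weather preference within each birthplace group.
conditional = {}
for region, row in counts.items():
    row_total = sum(row.values())
    conditional[region] = {pref: n / row_total for pref, n in row.items()}

for region, dist in conditional.items():
    print(region, {pref: f"{share:.0%}" for pref, share in dist.items()})
```

Both regions show the same 80% / 20% split, so these data give no evidence of association even though the East/Summer cell is by far the largest.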

Page 125: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Categorical Association and Experimental Design
• You are leading your students in an activity in which they complete all 4 steps of the GAISE framework (formulate a research question, collect their own data, analyze the data, and interpret the results) in a situation relevant to categorical association.
• Students have brainstormed the following research questions.
• For each question, determine if it is relevant to categorical association, and provide your response to the student.
• Consider data collection issues as well. For those that are not relevant and/or have data collection issues, suggest some modifications.

Page 126: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Relevant to Categ. Assoc.? Data collection issues?
• LAURA: I want to study if girls are smarter than boys, so I am going to compare the GPAs of boys and girls at school.

• BILL: I think people with higher GPAs are less likely to have had a car crash while driving. I am going to ask seniors with driver's licenses their GPAs and if they’ve crashed while driving.

• ANNE MARIE: I want to see whether boys are more likely to own smartphones than girls. So I’m going to count during passing periods the number of boys I see with smartphones and the number of girls I see with no phones.

Page 127: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Relevant to Categ. Assoc.? Data collection issues?
• NICK: I think girls are more likely to bring their lunch to school than boys. So I’m going to stand at the entrance to the lunchroom and count the number of boys and girls that bring their lunches.

• JEFF: I want to study the relationships between gender, race, grade level, political party and whether students make honor roll or not at our school.

• KRISTIN: I read that 77% of high school students go on to college. I want to see if that’s true of the students here.

Page 128: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

2:30-3:00 p.m.
• Reflections on the day’s standard(s), share ideas, comments, concerns, etc. for teaching the standard(s)

Page 129: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Standards for Mathematical Practice

• How do they relate to Statistical Association?
• (not sure what part of the day this fits in the best, if at all)

Page 130: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

• CCSS.Math.Practice.MP1 Make sense of problems and persevere in solving them.

• Make sense of trends in scatterplots
• Make sense of outliers/influential points
• Persevere in trying different functional fits

Page 131: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

• CCSS.Math.Practice.MP2 Reason abstractly and quantitatively.
• “The slope has important practical interpretations for most statistical investigations of this type” (CCSS Progressions grade 8)

Page 132: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

• CCSS.Math.Practice.MP3 Construct viable arguments and critique the reasoning of others.

• Think of lurking variables that would argue against a correlation being a causation

• Evaluate linear/exponential/power/log models based on their asymptotic behavior

Page 133: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

• CCSS.Math.Practice.MP4 Model with mathematics.

• Everything!
• “build statistical models to explore the relationship between two variables” (CCSS Progression grade 8)
• “using their knowledge of functions to fit models to quantitative data” (CCSS Progression HS)

Page 134: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

• CCSS.Math.Practice.MP5 Use appropriate tools strategically.
• Not much argument that technology is needed to do most regression analysis?
• Tension between classroom technology and real-world technology?

Page 135: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

• CCSS.Math.Practice.MP6 Attend to precision.
• Don't report too many decimal places in slope, intercept, R^2, or predictions of new values
• Phrase predictions about new values carefully, avoiding causal or deterministic language.

Page 136: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

• CCSS.Math.Practice.MP7 Look for and make use of structure.
• “looking for and making use of structure to describe possible association in bivariate Data” (CCSS Progression grade 8)
• “using their knowledge of proportions to describe categorical associations”, “Looking for patterns in tables” (CCSS Progression HS)

Page 137: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

• CCSS.Math.Practice.MP8 Look for and express regularity in repeated reasoning.

Page 138: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Day 2 Wrap-Up

• What surprised you today?
• What did you find interesting?
• How might you bring these ideas to your class?
• What would you change?
• Other activities/ideas to share with the group?

Page 139: GAISEing into the Statistics Common Core Day 2: Statistical Association June 27, 2013

Project-SET sports data

Gender   Height (in)   Vertical Jump Height (in)   Likes Sports?
male        72.5              22.5                     Yes
female      70                18.5                     Yes
male        71                17                       No
female      64                17                       No
female      69                16.5                     Yes
male        72.5              27.5                     Yes
female      64.75             12.5                     Yes
male        70                16                       Yes
female      67.5               5.5                     No
female      66                12                       Yes
male        65.5              20.5                     Yes
female      66.75             13.5                     No
female      59.25             11                       No
male        69.25             16                       No

What categorical association question can we ask?
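One possible question is "Is liking sports associated with gender?", since Gender and Likes Sports? are the two categorical variables in the data set. The Python sketch below builds the two-way table and the conditional relative frequencies for that question; it is an illustration under that assumption, not the answer intended by the presenters.

```python
from collections import Counter

# (gender, likes sports?) pairs from the Project-SET sports data.
records = [
    ("male", "Yes"), ("female", "Yes"), ("male", "No"), ("female", "No"),
    ("female", "Yes"), ("male", "Yes"), ("female", "Yes"), ("male", "Yes"),
    ("female", "No"), ("female", "Yes"), ("male", "Yes"), ("female", "No"),
    ("female", "No"), ("male", "No"),
]

table = Counter(records)                          # joint frequencies
gender_totals = Counter(g for g, _ in records)    # marginal frequencies by gender

# Conditional relative frequency of liking sports within each gender.
likes_given_gender = {
    g: table[(g, "Yes")] / gender_totals[g] for g in gender_totals
}

for g in gender_totals:
    print(f"{g}: Yes={table[(g, 'Yes')]}, No={table[(g, 'No')]}, "
          f"share who like sports={likes_given_gender[g]:.0%}")
```

With only 14 students, any difference between the two shares should be described as a possible association and discussed alongside how much such a small sample can vary.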