correlation and regression


Upload: shhussain

Post on 23-Jun-2015

Page 1: Correlation and regression

Introduction to Correlation and Regression

Ginger Holmes Rowell, Ph.D.
Associate Professor of Mathematics
Middle Tennessee State University

Page 2: Correlation and regression

Outline

Introduction
Linear Correlation
Regression
Simple Linear Regression
Using the TI-83
Model/Formulas

Page 3: Correlation and regression

Outline continued

Applications
Real-life Applications
Practice Problems
Internet Resources
Applets
Data Sources

Page 4: Correlation and regression

Correlation

Correlation: A measure of association between two numerical variables.

Example (positive correlation): Typically, in the summer, as the temperature increases, people are thirstier.

Page 5: Correlation and regression

Specific Example

For seven random summer days, a person recorded the temperature and their water consumption during a three-hour period spent outside.

Temperature (F)    Water Consumption (ounces)
75                 16
83                 20
85                 25
85                 27
92                 32
97                 48
99                 48

Page 6: Correlation and regression

How would you describe the graph?

Page 7: Correlation and regression

How “strong” is the linear relationship?

Page 8: Correlation and regression

Measuring the Relationship

Pearson’s Sample Correlation Coefficient, r, measures the direction and the strength of the linear association between two numerical paired variables.

Page 9: Correlation and regression

Direction of Association

Positive Correlation
Negative Correlation

Page 10: Correlation and regression

Strength of Linear Association

r value    Interpretation
1          perfect positive linear relationship
0          no linear relationship
-1         perfect negative linear relationship

Page 11: Correlation and regression

Strength of Linear Association

Page 12: Correlation and regression

Other Strengths of Association

r value Interpretation

0.9 strong association

0.5 moderate association

0.25 weak association

Page 13: Correlation and regression

Other Strengths of Association

Page 14: Correlation and regression

Formula

$\sum$ = the sum
$n$ = number of paired items
$x_i$ = input variable
$y_i$ = output variable
$\bar{x}$ = x-bar = mean of x’s
$\bar{y}$ = y-bar = mean of y’s
$s_x$ = standard deviation of x’s
$s_y$ = standard deviation of y’s
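The symbols listed above are the ones used in the standard sample form of Pearson's r, namely r = (1/(n-1)) times the sum over i of ((x_i - x-bar)/s_x) * ((y_i - y-bar)/s_y). Assuming that is the intended formula, a minimal Python sketch applies it to the temperature and water data from the Specific Example slide. The function names are illustrative and not part of the presentation.

```python
from math import sqrt

# Temperature (F) and water consumption (ounces) from the example table.
temps = [75, 83, 85, 85, 92, 97, 99]
water = [16, 20, 25, 27, 32, 48, 48]


def mean(values):
    return sum(values) / len(values)


def sample_sd(values):
    # Sample standard deviation (divides by n - 1).
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def pearson_r(xs, ys):
    # r = 1/(n-1) * sum of products of the standardized x and y values.
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = sample_sd(xs), sample_sd(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)


r = pearson_r(temps, water)
print(round(r, 3))  # 0.963
```

A value this close to 1 indicates a strong positive linear association for these seven days.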

Page 15: Correlation and regression

Regression

Regression: Specific statistical methods for finding the “line of best fit” for one response (dependent) numerical variable based on one or more explanatory (independent) variables.

Page 16: Correlation and regression

Curve Fitting vs. Regression

Regression

Includes using statistical methods to assess the "goodness of fit" of the model (e.g., the correlation coefficient).

Page 17: Correlation and regression

Regression: 3 Main Purposes

To describe (or model)
To predict (or estimate)
To control (or administer)

Page 18: Correlation and regression

Simple Linear Regression

Statistical method for finding the “line of best fit” for one response (dependent) numerical variable based on one explanatory (independent) variable.

Page 19: Correlation and regression

Least Squares Regression

GOAL: minimize the sum of the squares of the errors of the data points.

This minimizes the Mean Square Error.
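The goal above can be sketched in Python using the usual closed-form least-squares solution for a single explanatory variable. This is an illustration under stated assumptions, not the presenter's own method: it uses the temperature and water data from the Specific Example slide, and the function names are hypothetical.

```python
temps = [75, 83, 85, 85, 92, 97, 99]
water = [16, 20, 25, 27, 32, 48, 48]


def least_squares_line(xs, ys):
    # Closed-form slope and intercept that minimize the sum of squared errors.
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def sum_squared_errors(xs, ys, slope, intercept):
    # Sum of squared vertical distances from the points to the line.
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


a, b = least_squares_line(temps, water)
best = sum_squared_errors(temps, water, a, b)
print(round(a, 2), round(b, 2))  # 1.45 -96.85

# Nudging either coefficient away from the fitted value increases the error.
for da, db in [(0.1, 0.0), (-0.1, 0.0), (0.0, 1.0), (0.0, -1.0)]:
    assert sum_squared_errors(temps, water, a + da, b + db) > best
```

Dividing the sum of squared errors by a fixed count does not change which line is best, so minimizing the sum also minimizes the mean square error.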

Page 20: Correlation and regression

Example

Plan an outdoor party.

Estimate number of soft drinks to buy per person, based on how hot the weather is.

Use Temperature/Water data and regression.

Page 21: Correlation and regression

Steps to Reaching a Solution

Draw a scatterplot of the data.

Page 22: Correlation and regression

Steps to Reaching a Solution

Draw a scatterplot of the data.
Visually, consider the strength of the linear relationship.

Page 23: Correlation and regression

Steps to Reaching a Solution

Draw a scatterplot of the data.
Visually, consider the strength of the linear relationship.
If the relationship appears relatively strong, find the correlation coefficient as a numerical verification.

Page 24: Correlation and regression

Steps to Reaching a Solution

Draw a scatterplot of the data.
Visually, consider the strength of the linear relationship.
If the relationship appears relatively strong, find the correlation coefficient as a numerical verification.
If the correlation is still relatively strong, then find the simple linear regression line.

Page 25: Correlation and regression

Our Next Steps

Learn to Use the TI-83 for Correlation and Regression.
Interpret the Results (in the Context of the Problem).

Page 26: Correlation and regression

Finding the Solution: TI-83

Using the TI-83 graphing calculator:
Turn on the calculator diagnostics.
Enter the data.
Graph a scatterplot of the data.
Find the equation of the regression line and the correlation coefficient.
Graph the regression line on a graph with the scatterplot.

Page 27: Correlation and regression

Preliminary Step

Turn the Diagnostics On.
Press 2nd 0 (for Catalog).
Scroll down to DiagnosticOn. The marker points to the right of the words.
Press ENTER. Press ENTER again.
The word Done should appear on the right hand side of the screen.

Page 28: Correlation and regression

Example

Temperature (F)    Water Consumption (ounces)
75                 16
83                 20
85                 25
85                 27
92                 32
97                 48
99                 48

Page 29: Correlation and regression

1. Enter the Data into Lists

Press STAT.
Under EDIT, select 1: Edit.
Enter x-values (input) into L1.
Enter y-values (output) into L2.
After data is entered in the lists, go to 2nd MODE to quit and return to the home screen.

Note: If you need to clear out a list, for example list 1, place the cursor on L1, then hit CLEAR and ENTER.

Page 30: Correlation and regression

2. Set up the Scatterplot

Press 2nd Y= (STAT PLOTS).
Select 1: PLOT 1 and hit ENTER.
Use the arrow keys to move the cursor down to On and hit ENTER.
Arrow down to Type: and select the first graph under Type.
Under Xlist: enter L1.
Under Ylist: enter L2.
Under Mark: select any of these.

Page 31: Correlation and regression

3. View the Scatterplot

Press 2nd MODE to quit and return to the home screen.
To plot the points, press ZOOM and select 9: ZoomStat.
The scatterplot will then be graphed.

Page 32: Correlation and regression

4. Find the regression line

Press STAT.
Press CALC.
Select 4: LinReg(ax + b).
Press 2nd 1 (for List 1).
Press the comma key.
Press 2nd 2 (for List 2).
Press ENTER.

Page 33: Correlation and regression

5. Interpreting and Visualizing

Interpreting the result: y = ax + b

The value of a is the slope.
The value of b is the y-intercept.
r is the correlation coefficient.
r^2 is the coefficient of determination.

Page 34: Correlation and regression

5. Interpreting and Visualizing

Write down the equation of the line in slope-intercept form.
Press Y= and enter the equation under Y1. (Clear all other equations.)
Press GRAPH and the line will be graphed through the data points.

Page 35: Correlation and regression

Questions ???

Page 36: Correlation and regression

Interpretation in Context

Regression Equation: y = 1.5*x - 96.9

Water Consumption = 1.5*Temperature - 96.9

Page 37: Correlation and regression

Interpretation in Context

Slope = 1.5 (ounces)/(degrees F)

For each 1 degree F increase in temperature, you expect an increase of 1.5 ounces of water drunk.

Page 38: Correlation and regression

Interpretation in Context

y-intercept = -96.9

For this example, when the temperature is 0 degrees F, a person would drink about -97 ounces of water.

That does not make any sense! Our model is not applicable for x=0.
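One way to act on this caution is to refuse predictions outside the range of temperatures that were actually observed. The sketch below is a hedged illustration: it uses the rounded coefficients from the regression equation above and the seven temperatures from the example table, and `predict_water` is a hypothetical helper name.

```python
temps = [75, 83, 85, 85, 92, 97, 99]

# Rounded coefficients from the regression equation y = 1.5*x - 96.9.
SLOPE = 1.5
INTERCEPT = -96.9


def predict_water(temp_f, observed=temps):
    """Predict ounces of water, refusing to extrapolate beyond the data."""
    lo, hi = min(observed), max(observed)
    if not lo <= temp_f <= hi:
        raise ValueError(
            f"{temp_f} F is outside the observed range {lo}-{hi} F"
        )
    return SLOPE * temp_f + INTERCEPT


print(round(predict_water(90), 1))  # 38.1
try:
    predict_water(0)
except ValueError as err:
    print(err)  # the model is not applied at 0 F
```

Whether to reject such inputs or merely warn is a design choice; the point is that the line is only supported by data between 75 and 99 degrees F.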

Page 39: Correlation and regression

Prediction Example

Predict the amount of water a person would drink when the temperature is 95 degrees F.

Solution: Substitute the value of x=95 (degrees F) into the regression equation and solve for y (water consumption).

If x=95, y = 1.5*95 - 96.9 = 45.6 ounces.
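A short Python check of this arithmetic, which also shows how sensitive the prediction is to rounding the slope. It assumes the data from the Example table and an ordinary least-squares fit; the variable names are illustrative only.

```python
temps = [75, 83, 85, 85, 92, 97, 99]
water = [16, 20, 25, 27, 32, 48, 48]

n = len(temps)
x_bar = sum(temps) / n
y_bar = sum(water) / n
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(temps, water))
         / sum((x - x_bar) ** 2 for x in temps))
intercept = y_bar - slope * x_bar

rounded_prediction = 1.5 * 95 - 96.9        # equation as written on the slide
fitted_prediction = slope * 95 + intercept  # unrounded least-squares fit

print(round(rounded_prediction, 1))  # 45.6
print(round(fitted_prediction, 1))   # 41.0
```

The fitted slope is about 1.45, so rounding it to 1.5 moves the prediction at 95 degrees F by several ounces. Carrying more decimal places from the calculator output avoids this.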

Page 40: Correlation and regression

Strength of the Association: r^2

Coefficient of Determination (r^2)

General Interpretation: The coefficient of determination tells the percent of the variation in the response variable that is explained (determined) by the model and the explanatory variable.

Page 41: Correlation and regression

Interpretation of r^2

Example: r^2 = 92.7%.

Interpretation: Almost 93% of the variability in the amount of water consumed is explained by outside temperature using this model.

Note: Therefore about 7% of the variation in the amount of water consumed is not explained by this model using temperature.
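The "percent of variation explained" reading can be made concrete by computing r^2 as one minus the ratio of leftover variation to total variation. This is a sketch under the assumption of an ordinary least-squares fit to the Example data; the variable names are not from the presentation.

```python
temps = [75, 83, 85, 85, 92, 97, 99]
water = [16, 20, 25, 27, 32, 48, 48]

n = len(temps)
x_bar = sum(temps) / n
y_bar = sum(water) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(temps, water))
s_xx = sum((x - x_bar) ** 2 for x in temps)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Total variation in water consumption around its mean.
total_ss = sum((y - y_bar) ** 2 for y in water)
# Variation left over after the line is fitted.
residual_ss = sum((y - (slope * x + intercept)) ** 2
                  for x, y in zip(temps, water))

r_squared = 1 - residual_ss / total_ss
print(f"{r_squared:.1%}")  # 92.7%
```

The unexplained share is `residual_ss / total_ss`, which is the roughly 7% mentioned in the note above.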

Page 42: Correlation and regression

Questions ???

Page 43: Correlation and regression

Simple Linear Regression Model

The model for simple linear regression is

There are mathematical assumptions behind the concepts that we are covering today.
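For reference, a common textbook statement of the simple linear regression model is given below. This is an assumption about the intended model, and the symbols $\beta_0$, $\beta_1$ and $\varepsilon_i$ are not taken from the slides.

```latex
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad i = 1, \dots, n
```

The usual assumptions are that the errors $\varepsilon_i$ are independent, have mean zero and constant variance, and (for inference) are normally distributed.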

Page 44: Correlation and regression

Formulas

Prediction Equation:
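In the notation used earlier in this presentation ($y = ax + b$ from the calculator, with $r$, $s_x$, $s_y$, $\bar{x}$ and $\bar{y}$ as defined on the Formula slide), the least-squares prediction equation is usually written as follows. This is the standard form, offered as a sketch rather than as the slide's exact content.

```latex
\hat{y} = a x + b, \qquad a = r \, \frac{s_y}{s_x}, \qquad b = \bar{y} - a \, \bar{x}
```

Here $\hat{y}$ is the predicted response for a given $x$, and the line always passes through the point $(\bar{x}, \bar{y})$.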

Page 45: Correlation and regression

Real Life Applications

Cost Estimating for Future Space Flight Vehicles (Multiple Regression)

Page 46: Correlation and regression

Nonlinear Application

Predicting when Solar Maximum Will Occur

http://science.msfc.nasa.gov/ssl/pad/solar/predict.htm

Page 47: Correlation and regression

Real Life Applications

Estimating Seasonal Sales for Department Stores (Periodic)

Page 48: Correlation and regression

Real Life Applications

Predicting Student Grades Based on Time Spent Studying

Page 49: Correlation and regression

Real Life Applications

. . .

What ideas can you think of?

What ideas can you think of that your students will relate to?

Page 50: Correlation and regression

Practice Problems

Measure Height vs. Arm Span
Find the line of best fit for height.
Predict height for one student not in the data set.
Check the predictability of the model.

Page 51: Correlation and regression

Practice Problems

Is there any correlation between shoe size and height?

Does gender make a difference in this analysis?

Page 52: Correlation and regression

Practice Problems

Can the number of points scored in a basketball game be predicted by
the time a player plays in the game?
the player’s height?

(Idea modified from Steven King, Aiken, SC. NCTM presentation 1997.)

Page 53: Correlation and regression

Resources

Data Analysis and Statistics. Curriculum and Evaluation Standards for School Mathematics. Addenda Series, Grades 9-12. NCTM. 1992.

Data and Story Library. Internet Website. http://lib.stat.cmu.edu/DASL/ 2001.

Page 54: Correlation and regression

Internet Resources

Correlation

Guessing Correlations - An interactive site that allows you to try to match correlation coefficients to scatterplots. University of Illinois, Urbana-Champaign Statistics Program. http://www.stat.uiuc.edu/~stat100/java/guess/GCApplet.html

Page 55: Correlation and regression

Internet Resources

Regression

Effects of adding an Outlier. W. West, University of South Carolina. http://www.stat.sc.edu/~west/javahtml/Regression.html

Page 56: Correlation and regression

Internet Resources

Regression

Estimate the Regression Line. Compare the mean square error from different regression lines. Can you find the minimum mean square error? Rice University Virtual Statistics Lab. http://www.ruf.rice.edu/~lane/stat_sim/reg_by_eye/index.html

Page 57: Correlation and regression

Internet Resources: Data Sets

Data and Story Library. Excellent source for small data sets. Search for specific statistical methods (e.g. boxplots, regression) or for data concerning a specific field of interest (e.g. health, environment, sports). http://lib.stat.cmu.edu/DASL/

Page 58: Correlation and regression

Internet Resources: Data Sets

FEDSTATS. "The gateway to statistics from over 100 U.S. Federal agencies" http://www.fedstats.gov/

"Kid's Pages." (not all related to statistics) http://www.fedstats.gov/kids.html

Page 59: Correlation and regression

Internet Resources

Other

Statistics Applets. Using Web Applets to Assist in Statistics Instruction. Robin Lock, St. Lawrence University. http://it.stlawu.edu/~rlock/maa99/

Page 60: Correlation and regression

Internet Resources

Other

Ten Websites Every Statistics Instructor Should Bookmark. Robin Lock, St. Lawrence University. http://it.stlawu.edu/~rlock/10sites.html

Page 61: Correlation and regression

For More Information…

On-line version of this presentation: http://www.mtsu.edu/~stats/corregpres/index.html

More information about regression: visit the STATS @ MTSU web site, http://www.mtsu.edu/~stats