sta302: regression analysis. statistics objective: to draw reasonable conclusions from noisy...
TRANSCRIPT
![Page 1: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/1.jpg)
STA302: Regression Analysis
See last slide for copyright information
![Page 2: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/2.jpg)
Statistics
• Objective: To draw reasonable conclusions from noisy numerical data
• Entry point: Study relationships between variables
![Page 3: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/3.jpg)
Data File
• Rows are cases. There are n cases.
• Columns are variables. A variable is a piece of information that is recorded for every case.
![Page 4: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/4.jpg)
![Page 5: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/5.jpg)
Variables can be
• Independent: Predictor or cause (contributing factor)
• Dependent: Predicted or effect
![Page 6: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/6.jpg)
Simple regression and correlation
• Simple means one IV• DV quantitative• IV usually quantitative too
![Page 7: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/7.jpg)
Simple regression and correlation
High School GPA University GPA
88 86
78 73
87 89
86 81
77 67
… …
![Page 8: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/8.jpg)
Scatterplot
![Page 9: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/9.jpg)
Least squares line
![Page 10: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/10.jpg)
Correlation between variables
•
is an estimate of
•
![Page 11: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/11.jpg)
Correlation coefficient r
• -1 ≤ r ≤ 1• r = +1 indicates a perfect positive linear
relationship. All the points are exactly on a line with a positive slope.
• r = -1 indicates a perfect negative linear relationship. All the points are exactly on a line with a negative slope.
• r = 0 means no linear relationship (curve possible). Slope of least squares line = 0
• r2 = proportion of variation explained
![Page 12: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/12.jpg)
r = 0.004
![Page 13: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/13.jpg)
r = 0.112
![Page 14: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/14.jpg)
r = 0.368
![Page 15: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/15.jpg)
r = 0.547
![Page 16: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/16.jpg)
r = 0.733
![Page 17: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/17.jpg)
r = - 0.822
![Page 18: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/18.jpg)
r = 0.025
![Page 19: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/19.jpg)
r = - 0.811
![Page 20: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/20.jpg)
Why -1 ≤ r ≤ 1 ?
•
•
![Page 21: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/21.jpg)
A Statistical Model
![Page 22: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/22.jpg)
One Independent Variable at a Time Can Produce Misleading Results
• The standard elementary methods all have a single independent variable (at most), so they should be used with caution in practice.
• Example: Artificial and extreme, to make a point:
• Suppose the correlation between Age and Strength is r = -0.96
![Page 23: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/23.jpg)
![Page 24: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/24.jpg)
Need multiple regression
![Page 25: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/25.jpg)
Multiple regression in scalar form
![Page 26: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/26.jpg)
Multiple regression in matrix form
![Page 27: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/27.jpg)
So we need
• Matrix algebra• Random vectors, especially multivariate
normal• Software to do the computation
![Page 28: STA302: Regression Analysis. Statistics Objective: To draw reasonable conclusions from noisy numerical data Entry point: Study relationships between variables](https://reader035.vdocuments.us/reader035/viewer/2022062802/56649ea05503460f94ba3529/html5/thumbnails/28.jpg)
Copyright Information
This slide show was prepared by Jerry Brunner, Department of Statistical Sciences, University of Toronto. It is licensed under a Creative Commons Attribution - ShareAlike 3.0 Unported License. Use any part of it as you like and share the result freely. These Powerpoint slides are available from the course website:
http://www.utstat.toronto.edu/~brunner/oldclass/302f15