Post on 21-Dec-2015
PSY 1950 Regression
November 10, 2008
Definition
• Simple linear regression
– Models the linear relationship between one predictor variable and one outcome variable
– e.g., predicting income based upon age
• Multiple linear regression
– Models the linear relationship between more than one predictor variable and one outcome variable
– e.g., predicting income based upon age and sex
• Lingo
– Independent/dependent, predictor/outcome
History
• Astronomical predictions: method of least squares
– Piazzi (1801) spotted Ceres, made 22 observations over 41 days, got sick, lost Ceres
– Gauss: "... for it is now clearly shown that the orbit of a heavenly body may be determined quite nearly from good observations embracing only a few days; and this without any hypothetical assumption."
• Genetics: Regression to the mean
– Galton, F. (1886). Regression towards mediocrity in hereditary stature. Journal of the Anthropological Institute, 15, 246–263.
Lines
• Mathematically, a line is defined by its slope and intercept
– Slope is the change in Y per unit change in X
– Intercept is the point at which the line crosses the Y-axis, i.e., the value of Y when X = 0
• Y = bX + a
– b is the slope
– a is the intercept
Which Line is Best?
[Figure: Record Sales (Thousands) plotted against Advertising Budget (Thousands of $)]
Residuals
• Residuals are
– Errors in prediction
– Differences between expected values (under your model) and observed values (in your dataset)
[Figure: Record Sales (Thousands) against Advertising Budget (Thousands of $) with the fitted line Y = 0.063X + 131.59]
[Figure: Residuals (Thousands of Record Sales) against Advertising Budget (Thousands of $)]
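As a minimal sketch of this definition, the snippet below computes residuals for the fitted line Y = 0.063X + 131.59. The three (budget, sales) pairs are invented for illustration and are not the lecture's data, and the sign convention (observed minus predicted) is an assumption.

```python
# Fitted line from the slide: Y = 0.063X + 131.59
SLOPE = 0.063
INTERCEPT = 131.59


def predict(budget):
    """Predicted record sales (thousands) for an advertising budget (thousands of $)."""
    return SLOPE * budget + INTERCEPT


# Hypothetical observations (budget, sales); not the lecture's dataset.
observations = [(500, 150.0), (1000, 210.0), (2000, 250.0)]

# Residual = observed value minus predicted value (sign convention assumed here).
residuals = [sales - predict(budget) for budget, sales in observations]

for (budget, sales), res in zip(observations, residuals):
    print(f"budget={budget}  observed={sales:.2f}  "
          f"predicted={predict(budget):.2f}  residual={res:.2f}")
```

A negative residual means the line over-predicts that observation; a positive one means it under-predicts.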
Minimizing Residuals
• Can define the best-fit line as the one that minimizes the sum of
– Absolute residuals (Method of Least Absolute Deviations)
– Squared residuals (Method of Least Squares)
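The two criteria can be sketched as two small functions that score a candidate line. The data points and both candidate lines below are hypothetical; the point is only that each criterion turns a line into a single number, and the best-fit line is the one with the smallest number.

```python
def sum_abs_residuals(points, slope, intercept):
    """Least-absolute-deviations criterion: sum of |observed - predicted|."""
    return sum(abs(y - (slope * x + intercept)) for x, y in points)


def sum_sq_residuals(points, slope, intercept):
    """Least-squares criterion: sum of (observed - predicted) squared."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points)


# Hypothetical data and two hypothetical candidate lines given as (slope, intercept).
points = [(1, 2.0), (2, 4.5), (3, 5.5), (4, 8.0)]
candidates = {"line A": (2.0, 0.0), "line B": (1.0, 2.0)}

for name, (b, a) in candidates.items():
    print(name,
          "abs:", sum_abs_residuals(points, b, a),
          "squared:", sum_sq_residuals(points, b, a))
```

Here line A scores lower than line B under both criteria, so both methods would prefer it over line B.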
Which is Better?
• Method of Least Squares
– Not robust
– Stable (line doesn’t “jump” with small changes in X)
– Only one solution (unique line for each dataset)
• Method of Least Absolute Deviations
– Robust
– Unstable (line does “jump” with small changes in X)
– Multiple solutions (sometimes)
• http://www.math.wpi.edu/Course_Materials/SAS/lablets/7.3/7.3c/lab73c.html
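The robustness contrast can be illustrated with a small sketch on made-up data containing one outlier. The least-squares line uses the usual closed-form solution. The least-absolute-deviations line is found by brute force, relying on the fact that some optimal LAD line passes through two of the data points; this is an illustrative shortcut for tiny datasets, not how LAD is fitted in practice.

```python
from itertools import combinations


def ols_line(points):
    """Least-squares slope and intercept from the closed-form solution."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sp = sum((x - mx) * (y - my) for x, y in points)
    ssx = sum((x - mx) ** 2 for x, _ in points)
    slope = sp / ssx
    return slope, my - slope * mx


def lad_line(points):
    """Least-absolute-deviations line by trying every pair of data points."""
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # a vertical line cannot be written as Y = bX + a
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        cost = sum(abs(y - (slope * x + intercept)) for x, y in points)
        if best is None or cost < best[0]:
            best = (cost, slope, intercept)
    return best[1], best[2]


# Hypothetical data: Y = 2X + 1 exactly, except the last point is an outlier.
points = [(1, 3), (2, 5), (3, 7), (4, 9), (5, 11), (6, 13), (7, 100)]

ols_slope, ols_intercept = ols_line(points)
lad_slope, lad_intercept = lad_line(points)
print("least squares:            ", ols_slope, ols_intercept)
print("least absolute deviations:", lad_slope, lad_intercept)
```

In this example the single outlier pulls the least-squares slope far above 2, while the least-absolute-deviations line stays on the six well-behaved points.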
Multiple Solutions
• Any line within the “green zone” produces the same summed residuals via the method of least absolute deviations
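A small made-up example shows how such ties arise. With two observations (0 and 2) at each of two X values, any line that stays between the lower and upper observation at both X values has a summed absolute residual of 4, which is the minimum because the absolute deviations from 0 and 2 add up to at least 2 at each X. The three lines below are hypothetical choices from that zone; the squared criterion, by contrast, tells them apart.

```python
def sum_abs_residuals(points, slope, intercept):
    """Sum of |observed - predicted| for the line Y = slope * X + intercept."""
    return sum(abs(y - (slope * x + intercept)) for x, y in points)


def sum_sq_residuals(points, slope, intercept):
    """Sum of (observed - predicted) squared for the same line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points)


# Hypothetical data: two observations at X = 0 and two at X = 1.
points = [(0, 0.0), (0, 2.0), (1, 0.0), (1, 2.0)]

# Three different hypothetical lines (slope, intercept) inside the tie zone.
lines = [(0.0, 1.0), (1.0, 0.5), (-1.5, 1.75)]

abs_costs = [sum_abs_residuals(points, b, a) for b, a in lines]
sq_costs = [sum_sq_residuals(points, b, a) for b, a in lines]
print("absolute:", abs_costs)  # identical for all three lines
print("squared: ", sq_costs)   # differs from line to line
```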
Method of (Ordinary) Least Squares
[Figure: Record Sales (Thousands) against Advertising Budget (Thousands of $) with the fitted line y = 0.063x + 131.59, R² = 0.355]
Regression Coefficients
• Slope
• Intercept
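A minimal sketch of how the two coefficients can be computed, assuming the standard closed-form least-squares estimates for Y = bX + a: b = SP / SS_X and a = M_Y - b * M_X, where SP is the sum of products of deviations and SS_X is the sum of squared deviations of X. The function name and the data are made up for illustration.

```python
def regression_coefficients(xs, ys):
    """Return (slope, intercept) of the least-squares line Y = bX + a."""
    n = len(xs)
    mx = sum(xs) / n                      # M_X
    my = sum(ys) / n                      # M_Y
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # sum of products
    ssx = sum((x - mx) ** 2 for x in xs)  # sum of squares of X
    slope = sp / ssx
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b, a = regression_coefficients(xs, ys)
print(f"Y = {b:.2f}X + {a:.2f}")
```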
Standardized Coefficients
[Figure: Record Sales (z-score) against Advertising Budget (z-score) with the fitted line Y = 0.5958X]
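With one predictor, regressing z-scores on z-scores gives a slope equal to Pearson's r and an intercept of zero. This matches the slides' own numbers, since 0.5958 squared is about 0.355. The sketch below checks the relationship on made-up data; the helper names are illustrative.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def z_scores(values):
    """Standardize using the sample standard deviation (n - 1 in the denominator)."""
    m = mean(values)
    sd = sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))
    return [(v - m) / sd for v in values]


def slope_intercept(xs, ys):
    """Least-squares slope and intercept."""
    mx, my = mean(xs), mean(ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    slope = sp / ssx
    return slope, my - slope * mx


def pearson_r(xs, ys):
    """Pearson correlation from sums of products and sums of squares."""
    mx, my = mean(xs), mean(ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    return sp / sqrt(ssx * ssy)


# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

beta, intercept_z = slope_intercept(z_scores(xs), z_scores(ys))
r = pearson_r(xs, ys)
print("standardized slope:", beta)
print("standardized intercept:", intercept_z)
print("Pearson r:", r)
```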
Regression Line Passes Through (M_X, M_Y)
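A short derivation of this property, assuming the standard least-squares intercept a = M_Y - b M_X for the line Y = bX + a: substituting X = M_X into the regression equation returns M_Y.

```latex
\begin{aligned}
\hat{Y} &= bX + a, \qquad a = M_Y - b\,M_X \\
\hat{Y}\,\big|_{X = M_X} &= b\,M_X + \left(M_Y - b\,M_X\right) = M_Y
\end{aligned}
```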
Correlation and Regression
• Statistical distinction based on the nature of the variables
– In correlation, both X and Y are random
– In regression, X is fixed and Y is random
• Practical distinction based on the interest of the researcher
– With correlation, the researcher asks: What is the strength (and direction) of the linear relationship between X and Y?
– With regression, the researcher asks the above and/or: How do I predict Y given X?
Goodness of Fit
• The regression equation does not reveal how well your data fit your model
– e.g., in the figure below, both sets of data produce the same regression equation
[Figure: two plots of datasets that produce the same regression equation]
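This point can be reproduced with a small sketch on made-up data. Both datasets are built from the same hypothetical line, Y = 0.5X + 1, plus a noise pattern that sums to zero and is uncorrelated with X, so least squares recovers the same equation for both even though one dataset scatters ten times more widely around the line.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    slope = sp / ssx
    return slope, my - slope * mx


def sum_sq_residuals(xs, ys, slope, intercept):
    """Sum of squared residuals around the line Y = slope * X + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data: same underlying line, different amounts of scatter.
xs = [4, 5, 6, 7, 8, 9, 10]
noise = [1, -2, 1, 0, 1, -2, 1]  # sums to zero and is uncorrelated with xs
tight = [0.5 * x + 1 + 0.1 * e for x, e in zip(xs, noise)]
loose = [0.5 * x + 1 + 1.0 * e for x, e in zip(xs, noise)]

tight_slope, tight_intercept = fit_line(xs, tight)
loose_slope, loose_intercept = fit_line(xs, loose)
tight_ss = sum_sq_residuals(xs, tight, tight_slope, tight_intercept)
loose_ss = sum_sq_residuals(xs, loose, loose_slope, loose_intercept)

print("tight:", tight_slope, tight_intercept, tight_ss)
print("loose:", loose_slope, loose_intercept, loose_ss)
```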
Standard Error of Estimate
• The standard residual
• Why df = n - 2?
– To determine the regression equation (and thus the residuals), we need to estimate two population parameters
  • Slope and intercept OR
  • Mean of X and mean of Y
– A regression with n = 2 has no df
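A minimal sketch of the computation, assuming the usual definition of the standard error of estimate as the square root of the residual sum of squares divided by df = n - 2. The function name and data are made up for illustration.

```python
from math import sqrt


def standard_error_of_estimate(xs, ys):
    """sqrt(SS_residual / (n - 2)) for a simple linear regression."""
    n = len(xs)
    if n <= 2:
        # With n = 2 the line passes through both points and df = 0.
        raise ValueError("need more than two observations so that df = n - 2 > 0")
    mx = sum(xs) / n
    my = sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    slope = sp / ssx
    intercept = my - slope * mx
    ss_residual = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return sqrt(ss_residual / (n - 2))


# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

see = standard_error_of_estimate(xs, ys)
print("standard error of estimate:", see)
```

The result is in the units of Y, so it can be read as the size of a typical prediction error.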
Coefficient of Determination (r²)
Partitioning Sums of Squares
[Figure: three plot panels, each with both axes running from 0 to 6]
Partitioning Sums of Squares
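The partition can be sketched numerically, assuming the standard decomposition SS_total = SS_regression + SS_residual and r² = SS_regression / SS_total. The variable names and data are made up for illustration.

```python
# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mx = sum(xs) / n
my = sum(ys) / n

# Least-squares line.
sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
ssx = sum((x - mx) ** 2 for x in xs)
slope = sp / ssx
intercept = my - slope * mx
predicted = [slope * x + intercept for x in xs]

# Total variability of Y around its mean.
ss_total = sum((y - my) ** 2 for y in ys)
# Variability of the predictions around the mean of Y (explained by the line).
ss_regression = sum((p - my) ** 2 for p in predicted)
# Variability of the observations around the predictions (left unexplained).
ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predicted))

r_squared = ss_regression / ss_total
print(ss_total, ss_regression, ss_residual, r_squared)
```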
Testing the Model
• df for the model: # predictors
• df for the residual: n minus # model parameters = n minus (1 + # predictors)
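A minimal sketch of the test statistic, assuming the standard F ratio for a regression model: the model mean square divided by the residual mean square, with the degrees of freedom listed above. The function name and data are made up, and the p-value lookup from the F distribution is left out.

```python
def f_test_simple_regression(xs, ys):
    """Return (F, df_model, df_residual) for a one-predictor regression."""
    n = len(xs)
    n_predictors = 1
    mx = sum(xs) / n
    my = sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    slope = sp / ssx
    intercept = my - slope * mx
    predicted = [slope * x + intercept for x in xs]

    ss_model = sum((p - my) ** 2 for p in predicted)
    ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predicted))

    df_model = n_predictors               # number of predictors
    df_residual = n - (1 + n_predictors)  # n minus number of model parameters

    f_ratio = (ss_model / df_model) / (ss_residual / df_residual)
    return f_ratio, df_model, df_residual


# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

f_ratio, df_model, df_residual = f_test_simple_regression(xs, ys)
print(f"F({df_model}, {df_residual}) = {f_ratio:.2f}")
```

With one predictor, the same value follows from r²(n - 2) / (1 - r²).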
Online Applets
• Explaining variance
– http://www.duxbury.com/authors/mcclellandg/tiein/johnson/reg.htm
• Leverage
– http://www.stat.sc.edu/~west/javahtml/Regression.html
• Distribution of slopes/intercepts
– http://lstat.kuleuven.be/java/version2.0/Applet003.html