Lecture 20: Simple Linear Regression
API-201Z
Maya Sen
Harvard Kennedy School
http://scholar.harvard.edu/msen
Announcements
- Midterms nearly graded
- Executive summaries now due on 11/29 (Thursday, as part of PS #10)
- We’ll set up an online poll for which groups will present on 12/4 (due date)
- Regular office hours resume post-TG – happy to chat with you at any point about final exercises!
Roadmap
- Introduce the concept of the Ordinary Least Squares (OLS) method of estimating linear regression
- Discuss the simplest application, Simple Linear Regression
  - Relationship between two continuous variables
- Hypothesis tests and CIs for regression parameters
- Sets us up to cover regression with more than one explanatory variable, and interpretation of regression tables
Last time
- We have covered several more advanced inference techniques
  - ANOVA: global test comparing means across groups
  - Chi-square test: test of independence of rows and columns in a frequency table
- But both suffer from a weakness →
  - If the null is rejected, then what can we say about the strength/direction of the association?
  - Can we predict anything?
- Linear regression allows us to assess (1) the strength and (2) the direction of the relationship between two variables
- Useful across many different applications and for prediction
- Along with difference in means, one of the most widely used statistical techniques; we’ll cover only the basics in this course
State Unemployment Example
- Motivate linear regression with a simple example:
- Suppose our policy area is labor unemployment – think unemployment is “sticky” and lags over time
- Is there a relationship between state-level unemployment rates in the U.S. in 1995 and in 2000?
- A random sample of 30 states was taken
- For each state, data was collected on:
  - Unemployment rate in 1995
  - Unemployment rate in 2000
State Unemployment Example

State        1995   2000
Alabama       5.3    4.0
Alaska        7.1    6.2
Arizona       5.4    4.1
Arkansas      4.8    4.1
California    8.0    5.0
Colorado      4.3    3.0
...           ...    ...
State Unemployment Example

[Scatterplot of state unemployment rates: 1995 rate (horizontal axis) vs. 2000 rate (vertical axis)]
State Unemployment Example
- The two variables are obviously correlated → could use correlation to examine the relationship
- Correlation: measures the strength of linear association between 2 variables
- The 2 variables are treated in a similar manner → variables are interchangeable (the correlation of x with y, or of y with x, is the same)
- The correlation coefficient r takes values between −1 and 1
- State unemployment rate example:
  - Strong positive correlation
  - Correlation coefficient r = 0.78
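The correlation coefficient can be computed directly from its definition. A minimal sketch, using only the six states shown in the table above (the lecture’s r = 0.78 comes from the full 30-state sample, so the value here differs); the function name `pearson_r` is ours, not from the lecture:

```python
import math

# Unemployment rates for the six states listed on the slide.
rate_1995 = [5.3, 7.1, 5.4, 4.8, 8.0, 4.3]
rate_2000 = [4.0, 6.2, 4.1, 4.1, 5.0, 3.0]

def pearson_r(x, y):
    """Correlation coefficient: r = sum((x-mx)(y-my)) / (sd_x * sd_y terms)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)

r = pearson_r(rate_1995, rate_2000)
print(round(r, 2))

# The two variables are interchangeable: swapping the arguments
# gives exactly the same value, as noted on the slide.
```

Note that `pearson_r(rate_2000, rate_1995)` returns the same value, illustrating the symmetry point above.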
State Unemployment Example

However: we can put more structure on the relationship with regression
- Regression: each variable has a specific role
- x is the explanatory (or independent, or predictor) variable
  - Always represented on the horizontal (X) axis
  - Can be binary, categorical, or continuous (will discuss a bit in this class)
- y is the outcome (or dependent, or response) variable, the variable we are trying to predict
  - Always represented on the vertical (Y) axis
  - Here: continuous (expanded to include dichotomous, categorical outcomes next semester)
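With these roles fixed (x = 1995 rate predicts y = 2000 rate), the OLS line can be sketched from the textbook formulas for slope and intercept. This uses the same six-state subset as before, so the coefficients are illustrative only, not the lecture’s 30-state estimates:

```python
# OLS for simple linear regression, y = a + b*x.
x = [5.3, 7.1, 5.4, 4.8, 8.0, 4.3]  # 1995 rates (explanatory variable)
y = [4.0, 6.2, 4.1, 4.1, 5.0, 3.0]  # 2000 rates (outcome variable)

n = len(x)
mx, my = sum(x) / n, sum(y) / n

# OLS slope: b = sum((x - mx)(y - my)) / sum((x - mx)^2)
b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
     / sum((xi - mx) ** 2 for xi in x))

# Intercept: the OLS line always passes through the point of means (mx, my)
a = my - b * mx

# Predicted 2000 rate for a state with a 6.0% rate in 1995
print(round(a + b * 6.0, 2))
```

Unlike correlation, this fit is not symmetric: regressing x on y would give a different line, which is why assigning the explanatory and outcome roles matters.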
![Page 41: Lecture 20: Simple Linear Regression API-201Z · Announcements I Midterms nearly graded I Executive summaries now due on 11/29 (Thursday, as part of PS #10) I We’ll set up online](https://reader033.vdocuments.us/reader033/viewer/2022051910/5fff5b5ab2f04c43f97b4b47/html5/thumbnails/41.jpg)
State Unemployment Example
However: we can put more structure on relationship w/ regression
I Regression: Each variable has specific roleI x is explanatory (or independent or predictor) variable
I Always represented on horizontal (X ) axisI Can be binary, categorical, or continuous (will discuss a bit in
this class)
I y is outcome (or dependent or response) variable, the variablewe are trying to predict
I Always represented on vertical (Y ) axisI Here: Continuous (expanded to include dichotomous,
categorical outcomes next semester)
![Page 42: Lecture 20: Simple Linear Regression API-201Z · Announcements I Midterms nearly graded I Executive summaries now due on 11/29 (Thursday, as part of PS #10) I We’ll set up online](https://reader033.vdocuments.us/reader033/viewer/2022051910/5fff5b5ab2f04c43f97b4b47/html5/thumbnails/42.jpg)
State Unemployment Example
However: we can put more structure on relationship w/ regression
I Regression: Each variable has specific role
I x is explanatory (or independent or predictor) variableI Always represented on horizontal (X ) axisI Can be binary, categorical, or continuous (will discuss a bit in
this class)
I y is outcome (or dependent or response) variable, the variablewe are trying to predict
I Always represented on vertical (Y ) axisI Here: Continuous (expanded to include dichotomous,
categorical outcomes next semester)
![Page 43: Lecture 20: Simple Linear Regression API-201Z · Announcements I Midterms nearly graded I Executive summaries now due on 11/29 (Thursday, as part of PS #10) I We’ll set up online](https://reader033.vdocuments.us/reader033/viewer/2022051910/5fff5b5ab2f04c43f97b4b47/html5/thumbnails/43.jpg)
State Unemployment Example
However: we can put more structure on relationship w/ regression
I Regression: Each variable has specific roleI x is explanatory (or independent or predictor) variable
I Always represented on horizontal (X ) axisI Can be binary, categorical, or continuous (will discuss a bit in
this class)
I y is outcome (or dependent or response) variable, the variablewe are trying to predict
I Always represented on vertical (Y ) axisI Here: Continuous (expanded to include dichotomous,
categorical outcomes next semester)
State Unemployment Example

[Scatterplot: horizontal axis labeled "Predictor, Explanatory, or Independent Variable"; vertical axis labeled "Outcome, Response, or Dependent Variable"]
Correlation versus Regression

Regression offers key advantages:

1. Assess whether there is a statistically significant relationship between the two variables
2. Assess the magnitude of that relationship
3. Use the explanatory variable to generate predicted values of the outcome variable
4. Eventually will allow us to take other variables into account
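These advantages can be made concrete with a minimal sketch in Python. The unemployment numbers below are made up for illustration (not the actual course data): correlation gives direction and strength only, while the regression slope gives a magnitude in units of y per unit of x, and the fitted line gives predictions.

```python
import math

# Illustrative 1995 (x) and 2000 (y) state unemployment rates -- made up numbers
x = [5.6, 4.2, 5.1, 6.4, 4.9, 5.8]
y = [4.0, 3.5, 4.1, 4.9, 3.8, 4.4]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

r = sxy / math.sqrt(sxx * syy)   # correlation: direction and strength only
b1 = sxy / sxx                   # regression slope: change in y per unit change in x
b0 = y_bar - b1 * x_bar          # intercept

# Regression also yields predictions, which correlation alone cannot:
y_hat = b0 + b1 * 5.0            # predicted 2000 rate for a state at 5.0% in 1995

# The two quantities are linked: b1 = r * (sd_y / sd_x)
assert abs(b1 - r * math.sqrt(syy / sxx)) < 1e-12
```

The same slope would come out of any least-squares fit; writing it out by hand just makes the link between r and b1 visible.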
Simple Linear Regression

Let's explore with the simplest kind of regression:

- Simple: only one independent variable (so bivariate)
- Linear: straight-line relationship
- Regression: a method of fitting a (linear) model to data
- However: how do we find the line that best describes the dataset we have collected?
State Unemployment Example

[Scatterplot of state unemployment rates: 1995 on the horizontal axis, 2000 on the vertical axis]
Simple Linear Regression

- If we had a true linear relationship between 1995 unemployment (x) and 2000 unemployment (y), it would be expressed by:

  y = β0 + β1x

  - where y is the outcome
  - β0 is the intercept
  - β1 is the slope
  - and x is the explanatory variable

- Much of our interest is in the size and sign of β1, the slope
- The slope captures the linear relationship between x and y
Positive Relationship Between X and Y

[Scatterplot of Y against X with an upward-sloping fitted line: slope is positive]

Negative Relationship Between X and Y

[Scatterplot of Y against X with a downward-sloping fitted line: slope is negative]

No Relationship Between X and Y

[Scatterplot of Y against X with a flat fitted line: slope is 0]
Simple Linear Regression

- However: the simple line yi = β0 + β1xi assumes a perfectly deterministic relationship between x and y
- Maybe good for understanding, e.g., the relationship of Fahrenheit to Celsius, but not much else!
- More realistic: x and y are related linearly, but there is some noise around that, so it's not a single perfect line
- Thus, for a single observation (xi, yi):

  yi = β0 + β1xi + εi
  (β0: intercept; β1: slope; εi: error)

- where the εi are also known as random errors
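A quick simulation can make the error term concrete. This sketch uses arbitrary parameter values (β0 = 1.0, β1 = 0.6, errors with standard deviation 0.3 -- all assumptions for illustration): each observation deviates from the line by exactly its random error, so the points scatter around the line rather than falling on it.

```python
import random

random.seed(0)

# True (population) parameters -- arbitrary illustrative values
beta0, beta1 = 1.0, 0.6

x = [4.0 + 0.2 * i for i in range(25)]    # explanatory variable
eps = [random.gauss(0, 0.3) for _ in x]   # random errors, mean 0
y = [beta0 + beta1 * xi + ei for xi, ei in zip(x, eps)]

# Each observation's deviation from the line equals its error term
deviations = [yi - (beta0 + beta1 * xi) for xi, yi in zip(x, y)]
```

Plotting y against x here would show a cloud of points around the line y = 1.0 + 0.6x, not a single perfect line.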
Simple Linear Regression

- This describes the "true" relationship between x and y:

  yi = β0 + β1xi + εi

- However: we can never observe β0 and β1 → these are population parameters!
- The best we can do is estimate them using our data
- Thus, we have an estimated linear relationship:

  yi = b0 + b1xi + ei

- Sometimes also denoted using "hat" notation as

  yi = β̂0 + β̂1xi + ε̂i

- Residuals (ei) represent estimates of the random errors, εi
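A sketch of the estimation step, again with made-up numbers rather than the actual course data: compute b1 and b0 by least squares, then the residuals ei as observed minus fitted values. Two defining properties of least-squares residuals fall out: they sum to zero, and they are uncorrelated with x.

```python
# Least-squares estimates b0, b1 and residuals ei = yi - (b0 + b1*xi).
# Illustrative numbers, not the actual state unemployment figures.
x = [5.6, 4.2, 5.1, 6.4, 4.9, 5.8]
y = [4.1, 3.4, 4.0, 5.0, 3.9, 4.3]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

# Residuals: observed y minus the fitted value on the estimated line
e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Least-squares residuals sum to zero and are uncorrelated with x
# (both up to floating-point error)
assert abs(sum(e)) < 1e-9
assert abs(sum(ei * xi for ei, xi in zip(e, x))) < 1e-9
```

Unlike the unobservable errors εi, the residuals ei are computable from the data, which is what makes them useful as estimates.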
![Page 95: Lecture 20: Simple Linear Regression API-201Z · Announcements I Midterms nearly graded I Executive summaries now due on 11/29 (Thursday, as part of PS #10) I We’ll set up online](https://reader033.vdocuments.us/reader033/viewer/2022051910/5fff5b5ab2f04c43f97b4b47/html5/thumbnails/95.jpg)
Simple Linear Regression

I Note: Important alternative way of thinking about linear regression is via expected values
I E[yi|xi] gives the expected (or mean) value of yi for a given value of the independent variable, xi
I Under the linear specification,

    E[yi|xi] = β0 + β1xi

I All predicted values fall exactly on the regression line
I Why no error term here? Because E[εi|xi] = 0
I (You’ll see violations of this in API 202)
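The claim that E[εi|xi] = 0 makes the error term drop out can be checked by simulation. This Python sketch uses hypothetical parameters (β0 = 2, β1 = 0.5): holding x fixed and averaging many draws of y recovers β0 + β1x, because the errors average out to zero.

```python
import random

random.seed(1)
beta0, beta1 = 2.0, 0.5  # hypothetical population parameters
x_fixed = 4.0

# Draw many y's at the same x; E[y|x] should equal beta0 + beta1*x
# because the mean-zero errors cancel out on average
draws = [beta0 + beta1 * x_fixed + random.gauss(0, 1) for _ in range(100_000)]
mean_y = sum(draws) / len(draws)
# mean_y is close to 2.0 + 0.5 * 4.0 = 4.0
```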
How to find the best estimated line?

Going back to our data: how to fit the best line?

We’ll take the line that minimizes the sum of squared residuals.
How to find the best estimated line?

I Specifically, we will choose the values of b0 and b1 that minimize:

    Σᵢ₌₁ⁿ (yi − ŷi)²

I Or:

    Σᵢ₌₁ⁿ (yi − b0 − b1xi)²

I Gives the Ordinary Least Squares (OLS) estimators (see appendix for proof)
I Could calculate other ways to fit a line, but OLS has very attractive properties
I Under the Gauss-Markov Theorem, the least squares line is “BLUE” (Best Linear Unbiased Estimator)
I For properties, see Wikipedia (Link)
I Video of proof at Khan Academy (Link)
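As a sanity check on the least-squares property, this Python sketch (made-up data, not from the lecture) computes the closed-form (b0, b1) and confirms that perturbing either coefficient only increases the sum of squared residuals:

```python
import random

random.seed(2)
n = 50
x = [random.uniform(0, 10) for _ in range(n)]
y = [1.0 + 0.8 * xi + random.gauss(0, 1) for xi in x]  # hypothetical data

def ssr(b0, b1):
    """Sum of squared residuals for the line y = b0 + b1*x."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

# Closed-form OLS solution
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
    sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

# The SSR at the OLS solution is the minimum over all candidate lines
best = ssr(b0, b1)
```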
OLS Estimates for One Explanatory Variable

I Proof (in Appendix) gives us the equation for the slope estimate:

    b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²

I and the equation for the intercept estimate:

    b0 = ȳ − b1x̄

I where x̄ is the average of the x values (explanatory variable)
I and ȳ is the average of the y values (outcome variable)
I Note that b1 = r · (sy/sx), where r is the sample correlation and sx, sy are the sample standard deviations
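The identity b1 = r · (sy/sx) can be verified numerically. A Python sketch on made-up data follows; the (n − 1) factors inside r, sx, and sy cancel, so the centered sums Sxy, Sxx, Syy suffice:

```python
import math
import random

random.seed(3)
x = [random.uniform(0, 10) for _ in range(100)]
y = [1.0 + 0.8 * xi + random.gauss(0, 1) for xi in x]  # hypothetical data

xbar = sum(x) / len(x)
ybar = sum(y) / len(y)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
Sxx = sum((xi - xbar) ** 2 for xi in x)
Syy = sum((yi - ybar) ** 2 for yi in y)

b1 = Sxy / Sxx                     # OLS slope formula
r = Sxy / math.sqrt(Sxx * Syy)     # sample correlation
b1_alt = r * math.sqrt(Syy / Sxx)  # r * (sy / sx); the (n-1) factors cancel
```

Both routes give the same slope, which is why the correlation coefficient and the regression slope always share a sign.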
State Unemployment Example

I Rare to calculate by hand except for the simplest cases
I In Stata:

. regress yr2000 yr1995
-----------------------------------------------------
     yr2000 |      Coef.   Std. Err.      t     P>|t|
-----------------------------------------------------
     yr1995 |   .5398317   .0818083     6.60    0.000
      _cons |   1.077917   .4571589     2.36    0.026
-----------------------------------------------------

I In R, use the lm (linear model) command: lm(yr2000 ~ yr1995)
I Statistical software will give you:
I   Intercept coefficient estimate (b0 or β̂0): 1.077917
I   Slope coefficient estimate (b1 or β̂1): 0.5398317
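The same computation takes only a few lines by hand. The numbers below are made-up rates for illustration, NOT the actual state data; run on the real yr1995/yr2000 series, the same code would reproduce the Stata coefficients:

```python
# Hypothetical unemployment-rate pairs (NOT the actual state data)
yr1995 = [5.6, 4.9, 5.1, 4.7, 7.8, 4.0, 5.5, 4.3, 5.5, 4.9]
yr2000 = [4.6, 6.6, 4.0, 4.4, 4.9, 2.7, 2.3, 4.0, 3.6, 3.7]

n = len(yr1995)
xbar = sum(yr1995) / n
ybar = sum(yr2000) / n
b1 = sum((x - xbar) * (y - ybar) for x, y in zip(yr1995, yr2000)) / \
    sum((x - xbar) ** 2 for x in yr1995)
b0 = ybar - b1 * xbar
# b0 and b1 play the roles of _cons and the yr1995 coefficient in the Stata output
```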
State Unemployment Example

I Gives us the estimated regression line:

    ŷ = 1.08 + 0.54x

I How to interpret?
I A one-unit increase in x is associated w/ a b1 increase/decrease in y
I Here: Based on our data, an increase of 1 percentage point in the 1995 unemployment rate is associated w/ an increase of 0.54 percentage points in the 2000 unemployment rate
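As a worked example of using the fitted line (the 5 percent input is hypothetical): a state with a 1995 unemployment rate of 5 percent has a predicted 2000 rate of 1.08 + 0.54 × 5 ≈ 3.78 percent.

```python
b0, b1 = 1.08, 0.54   # estimated coefficients from the slide
x_1995 = 5.0          # hypothetical 1995 unemployment rate (percent)
y_2000_hat = b0 + b1 * x_1995
# y_2000_hat is about 3.78 (percent)
```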
State Unemployment Example

[Figure]
OLS Assumptions

OLS relies on several key assumptions:

I (1) There is a linear relationship in the population between the independent variable x and the outcome y
I (2) Observations are independent (i.e., one observation on each state)
I (3) Errors are not correlated with one another
I → You’ll study violations of these assumptions in API 202
Using Regression for Prediction

I We can use information from the estimated regression line to predict relationships between x and y

I Ex) Suppose we are interested in predicting the 2000 unemployment rate for another state not included in the sample

I One state has an unemployment rate of 7.5% in 1995 → what is the predicted 2000 rate?

ŷ = 1.08 + 0.54x
  = 1.08 + 0.54(7.5) = 5.13

I Another state has an unemployment rate of 14% in 1995 → what is the predicted 2000 rate?

ŷ = 1.08 + 0.54x
  = 1.08 + 0.54(14.0) = 8.64

I These are called predicted values
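The predicted-value arithmetic above can be sketched in code (the course uses Stata/R; this Python version is just an equivalent illustration using the estimated coefficients b0 = 1.08 and b1 = 0.54):

```python
# Predicted values from the estimated regression line.
# b0 and b1 are the estimated intercept and slope from the slides.

def predict_2000_rate(rate_1995: float) -> float:
    """Predicted 2000 unemployment rate (%) given the 1995 rate (%)."""
    b0, b1 = 1.08, 0.54
    return b0 + b1 * rate_1995

print(round(predict_2000_rate(7.5), 2))   # 1.08 + 0.54(7.5)  = 5.13
print(round(predict_2000_rate(14.0), 2))  # 1.08 + 0.54(14.0) = 8.64
```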
State Unemployment Example

Some notes on prediction:

1. These are good predictions, but not necessarily correct!

2. The regression line is only good for predicting values in the range for which we have data

I Best not to extrapolate, i.e., predict values outside this range
I 1995 state unemployment in our data ranges from 3% to around 10%
I Should we use the regression equation to predict 2000 unemployment for a state w/ 40% 1995 unemployment?
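For illustration, the fitted line will mechanically produce a prediction even at 40%, far outside the 3%–10% range it was estimated on, which is exactly the danger. A sketch using the estimated coefficients:

```python
# The fitted line returns a number for any input, even far outside
# the 3%-10% range it was estimated on: that is the extrapolation risk.
b0, b1 = 1.08, 0.54
mechanical_prediction = b0 + b1 * 40.0
print(round(mechanical_prediction, 2))  # 22.68: mechanically valid, substantively dubious
```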
Using Regression for Hypothesis Tests of Slope

I Can also use OLS estimators in a hypothesis testing framework

I Remember that for OLS we estimate the slope via:

b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²

I and the intercept via:

b0 = ȳ − b1x̄

I Both b1 and b0 are sums and means of random variables
I Means that the CLT kicks in!
I → b1 and b0 are normally distributed in large samples!
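A minimal Python sketch of these two formulas, using made-up toy data rather than the state unemployment data:

```python
# OLS slope and intercept computed directly from the slide formulas.
# Toy data chosen to lie exactly on y = 1 + 0.5x (not the real data).

def ols_fit(x, y):
    """Return (b0, b1), the least-squares intercept and slope."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    # b1 = sum (xi - x_bar)(yi - y_bar) / sum (xi - x_bar)^2
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar  # intercept: b0 = y_bar - b1 * x_bar
    return b0, b1

b0, b1 = ols_fit([2.0, 4.0, 6.0, 8.0], [2.0, 3.0, 4.0, 5.0])
print(b0, b1)  # 1.0 0.5
```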
Using Regression for Hypothesis Tests of Slope

I Can use this fact to conduct hypothesis tests, usually two-tailed

I Specifically: if the slope β1 is zero, then there is no linear relationship between the two variables

I Null and alternative hypotheses:
I H0: β1 = 0
I Ha: β1 ≠ 0

I Test statistic given by

t(n−2) = (b1 − 0) / SE(b1)

I where we use a t distribution with n − 2 degrees of freedom, (usually) a two-tailed test, and

SE(b1) = √[Σ(yi − ŷi)² / (n − 2)] / √[Σ(xi − x̄)²]
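A Python sketch of the test statistic, computing SE(b1) from the residuals via the conventional residual standard error s = √(Σ(yi − ŷi)²/(n − 2)); the data here are toy values, not the state unemployment sample:

```python
import math

# t test for H0: beta1 = 0, per the slide formulas. SE(b1) uses the
# residual standard error s = sqrt(SSE / (n - 2)). Toy data only.

def slope_t_test(x, y):
    """Return (b1, se_b1, t) for the two-tailed test of beta1 = 0."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
    return b1, se_b1, (b1 - 0) / se_b1  # compare t to a t(n-2) distribution

b1, se_b1, t = slope_t_test([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(round(b1, 3), round(t, 2))
```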
State Unemployment Example

I Stata and R report the results of the two-tailed hypothesis test. In Stata:

. regress yr2000 yr1995
-----------------------------------------------------
     yr2000 |     Coef.   Std. Err.      t     P>|t|
-----------------------------------------------------
     yr1995 |  .5398317    .0818083    6.60    0.000
      _cons |  1.077917    .4571589    2.36    0.026
-----------------------------------------------------

I Note: For β1, the hypothesis test yields a p-value < 0.001
I Note: The hypothesis test for β0 tests the null hypothesis that the intercept equals zero → i.e., that the expected value of y is zero when x is zero
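As a rough check of the reported output: the t statistic is just Coef./Std. Err., and a normal approximation to the t distribution (reasonable if df = n − 2 = 48, assuming all 50 states are in the sample, which is an assumption here) already puts the two-sided p-value far below 0.001:

```python
import math

# Sanity check of the reported output: t = Coef. / Std. Err., and a
# normal approximation to the t distribution gives the two-sided
# p-value; math.erfc(t / sqrt(2)) equals 2 * (1 - Phi(t)).
t_stat = 0.5398317 / 0.0818083
p_approx = math.erfc(t_stat / math.sqrt(2))
print(round(t_stat, 2), p_approx < 0.001)  # 6.6 True
```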
State Unemployment Example

I Statistical interpretation?
I Since the p-value < 0.001, we can reject the null hypothesis that β1 = 0 at the α = 0.05 level

I Substantive interpretation?
I Strong evidence against the slope being zero
I Implies that there appears to be some relationship between state unemployment rates in 1995 and in 2000
I In addition: the estimated slope suggests a positive association → a higher 1995 rate is linked w/ a higher 2000 rate
Using Regression for Confidence Intervals of Slope
I Just as we can conduct hypothesis tests, can also construct confidence intervals for true slope, β1
I Follows the same formula as before:
b1 ± t(n−2, α/2) × SE[b1]
I In our example (w/ 30 observations):
0.5398 ± t(28, α/2) × 0.0818 → (0.372, 0.707)
I Interpretation: In repeated sampling, expect 95 out of 100 confidence intervals to contain true slope
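The interval can be checked numerically. A sketch in Python, using scipy for the t critical value (estimate and standard error from the regression output):

```python
from scipy import stats

b1, se = 0.5398317, 0.0818083    # slope estimate and its standard error
df = 28                          # n - 2 = 30 - 2
t_crit = stats.t.ppf(0.975, df)  # two-tailed critical value for alpha = 0.05

lower = b1 - t_crit * se
upper = b1 + t_crit * se
print(round(lower, 3), round(upper, 3))  # (0.372, 0.707), matching the slide
```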
State Unemployment Example
STATA and R will also report 95% CIs
. regress yr2000 yr1995
Source | SS df MS Number of obs = 30
-------------+------------------------------ F( 1, 28) = 43.54
Model | 13.3338426 1 13.3338426 Prob > F = 0.0000
Residual | 8.57415592 28 .306219854 R-squared = 0.6086
-------------+------------------------------ Adj R-squared = 0.5947
Total | 21.9079986 29 .755448226 Root MSE = .55337
------------------------------------------------------------------------------
yr2000 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
yr1995 | .5398317 .0818083 6.60 0.000 .372255 .7074084
_cons | 1.077917 .4571589 2.36 0.026 .1414697 2.014365
------------------------------------------------------------------------------
Model Fit of a Simple Linear Regression
I Model fit is a measure of how “well” the line fits the data
I In linear regression, R² most commonly used measure
I R²: Proportion of variance in y explained by variance in x
I With one explanatory variable (one x), correlation coefficient r is square root of R² (with the sign of the slope):
r = √R²
I Here: √0.6086 ≈ 0.780
I Substantive interpretation: High R² → two variables highly correlated, regression explaining a lot of the variance in the outcome
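To check the arithmetic, with R² = 0.6086 from the regression output and a positive estimated slope:

```python
import math

r_squared = 0.6086        # R-squared reported by Stata
r = math.sqrt(r_squared)  # sign of r follows the slope; slope is positive here
print(round(r, 3))        # 0.78
```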
Some Notes About Residuals
I Residuals represent estimates of the random errors, εi
I Empirically: Represent “left-over” distance from each observation to regression line after fitting
I Differences observed in our sample data between each point and regression line (vertically):
Residual = Observed y − Predicted y
I Least-squares line makes sum of the squared residuals as small as possible
I Any other strategy for drawing the line gives a bigger value for this sum
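A quick numerical check of that least-squares property, on made-up data (illustrative numbers, not the state unemployment data): the OLS line's sum of squared residuals is smaller than any other line's.

```python
import numpy as np

# illustrative data, not the actual unemployment figures
x = np.array([4.0, 5.0, 5.5, 6.0, 7.0, 8.0])
y = np.array([3.1, 3.9, 4.6, 4.4, 5.2, 5.9])

b1, b0 = np.polyfit(x, y, 1)                # least-squares slope and intercept
ssr_ols = np.sum((y - (b0 + b1 * x)) ** 2)  # sum of squared residuals

# perturb the line: any other slope/intercept pair gives a larger sum
ssr_other = np.sum((y - (b0 + 0.1 + (b1 - 0.05) * x)) ** 2)
print(ssr_ols < ssr_other)  # True
```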
Some Notes About Residuals
I Sum of residuals equals zero using least-squares regression
I → Plotting residuals against x values should result in a plot that looks random, i.e. no pattern present
I If pattern, a line might not be a good fit for the data
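The zero-sum property can be verified directly. A sketch on simulated data (hypothetical rates, chosen only to resemble the example's scale):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(3.0, 9.0, 30)                    # hypothetical 1995-style rates
y = 1.08 + 0.54 * x + rng.normal(0.0, 0.55, 30)  # linear signal plus noise

b1, b0 = np.polyfit(x, y, 1)  # least-squares fit with an intercept
res = y - (b0 + b1 * x)       # observed y minus predicted y

print(abs(res.sum()) < 1e-8)  # True: residuals sum to (numerically) zero
```

Plotting `res` against `x` for data like these should show no pattern; the zero-sum property holds for any least-squares fit that includes an intercept.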
Some Notes About Residuals
In Stata:
predict res, r
scatter res yr1995
Some Notes About Residuals
I Left hand side: Looks random
I Right hand side: Looks like errors get bigger with larger x values → heteroskedasticity
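The two patterns can be simulated. A sketch contrasting constant-spread residuals with residuals whose spread grows with x (all numbers here are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(1.0, 10.0, 200)

res_homo = rng.normal(0.0, 1.0, 200)   # constant spread: plot looks random
res_hetero = rng.normal(0.0, 0.3 * x)  # spread grows with x: funnel shape

small_x, large_x = x < 5.5, x >= 5.5
spread_small = res_hetero[small_x].std()
spread_large = res_hetero[large_x].std()
print(spread_large > spread_small)  # True: errors bigger at larger x
```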
Outliers and Leverage Points
I Outlier: Observation that has an unusual y value, conditional on x
I Leverage point: Observation that has an unusual x value (far from the mean of X)
I An observation is influential if it substantially changes the regression line → that is, it is an outlier and has high leverage
I Outliers, leverage points, and influential observations raise interesting questions to examine further
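A toy illustration of influence: start with perfectly linear data, then add a single point that is both an outlier and high leverage, and watch the fitted slope move (an invented dataset, chosen to make the effect dramatic):

```python
import numpy as np

x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0  # perfectly linear data: slope 2, intercept 1

slope_before, _ = np.polyfit(x, y, 1)

# one influential point: unusual x (high leverage) AND unusual y (outlier)
x2 = np.append(x, 30.0)
y2 = np.append(y, 0.0)
slope_after, _ = np.polyfit(x2, y2, 1)

print(round(slope_before, 2), round(slope_after, 2))  # slope drops sharply
```

A point with unusual y but typical x, or unusual x but y on the line, would move the fit far less; it takes both to exert this much influence.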
Warning about Association versus Causation
I Linear regression allows us, under certain circumstances, to:
  I Make statements about whether a relationship between two variables exists
  I Make statements about the size of that relationship
  I Predict one variable using another
I However: At this point, it is not OK to say that one variable "causes" a change in another variable → that requires additional assumptions about the relationship between x and y
I You'll see the additional assumptions required to make causal statements in API 202
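One concrete way to see this warning is a simulated confounder. In the sketch below (my own illustration, not from the lecture; the variable names are invented), x has no causal effect on y at all, yet both are driven by a common cause z, so the regression of y on x still finds a strong association:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

z = rng.normal(0, 1, n)          # unobserved common cause (confounder)
x = z + rng.normal(0, 1, n)      # x is driven by z; x has NO effect on y
y = 2 * z + rng.normal(0, 1, n)  # y is driven only by z

# Simple-regression slope of y on x: Cov(x, y) / Var(x)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
print(b1)  # close to 1: a clear association, despite zero causal effect
```

Here the population slope is Cov(x, y)/Var(x) = 2/2 = 1, so the regression "finds" a relationship even though intervening on x would not change y — exactly the gap between association and causation.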
Next Time
I More on interpretation
I Multiple regression: regression with two or more explanatory variables
Appendix: Proof of Least Squares Coefficient Estimators

Taking the partial derivatives:

$$S(b_0, b_1) = \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2 = \sum_{i=1}^{n} \left( Y_i^2 - 2 Y_i b_0 - 2 Y_i b_1 X_i + b_0^2 + 2 b_0 b_1 X_i + b_1^2 X_i^2 \right)$$

$$\frac{\partial S(b_0, b_1)}{\partial b_0} = \sum_{i=1}^{n} (-2 Y_i + 2 b_0 + 2 b_1 X_i) = -2 \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)$$

$$\frac{\partial S(b_0, b_1)}{\partial b_1} = \sum_{i=1}^{n} (-2 Y_i X_i + 2 b_0 X_i + 2 b_1 X_i^2) = -2 \sum_{i=1}^{n} X_i (Y_i - b_0 - b_1 X_i)$$
Appendix: Proof of Least Squares Coefficient Estimators

I One condition for β0 and β1 to minimize the sum of the squared residuals is that they must set the partial derivatives equal to 0
I Each of these conditions is called a first order condition
I The first order conditions are:

$$0 = -2 \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)$$

$$0 = -2 \sum_{i=1}^{n} X_i (Y_i - \beta_0 - \beta_1 X_i)$$
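As a quick numerical sanity check (my own sketch, not part of the slides), the least-squares estimates derived in the rest of this appendix should make both of these sums zero up to floating-point error:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(0, 1, 100)
y = 1 + 2 * x + rng.normal(0, 1, 100)

# Closed-form least-squares estimates (derived in the rest of this appendix)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

resid = y - b0 - b1 * x
print(np.sum(resid))      # first order condition for the intercept: ~0
print(np.sum(x * resid))  # first order condition for the slope: ~0
```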
Appendix: Proof of Least Squares Coefficient Estimators

I Let's solve for the estimator of the intercept first:

$$0 = -2 \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)$$

$$0 = \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_i)$$

$$0 = \sum_{i=1}^{n} Y_i - \sum_{i=1}^{n} \beta_0 - \sum_{i=1}^{n} \beta_1 X_i$$

$$\beta_0 n = \left( \sum_{i=1}^{n} Y_i \right) - \beta_1 \left( \sum_{i=1}^{n} X_i \right)$$

Dividing through by n:

$$\beta_0 = \bar{Y} - \beta_1 \bar{X}$$
Appendix: Proof of Least Squares Coefficient Estimators

I Now, we can plug this back in to get an estimate for the slope:

$$0 = -2 \sum_{i=1}^{n} X_i (Y_i - \beta_0 - \beta_1 X_i)$$

$$0 = \sum_{i=1}^{n} X_i (Y_i - \beta_0 - \beta_1 X_i)$$

$$0 = \sum_{i=1}^{n} X_i \left( Y_i - (\bar{Y} - \beta_1 \bar{X}) - \beta_1 X_i \right)$$

$$0 = \sum_{i=1}^{n} X_i \left( Y_i - \bar{Y} - \beta_1 (X_i - \bar{X}) \right)$$

$$0 = \sum_{i=1}^{n} X_i (Y_i - \bar{Y}) - \beta_1 \sum_{i=1}^{n} X_i (X_i - \bar{X})$$

$$\beta_1 \sum_{i=1}^{n} X_i (X_i - \bar{X}) = \sum_{i=1}^{n} X_i (Y_i - \bar{Y}) - \bar{X} \sum_{i=1}^{n} (Y_i - \bar{Y})$$

(the last step subtracts $\bar{X} \sum_{i=1}^{n} (Y_i - \bar{Y})$, which equals 0, from the right-hand side)
Appendix: Proof of Least Squares Coefficient Estimators

$$\beta_1 \left[ \sum_{i=1}^{n} X_i (X_i - \bar{X}) - \bar{X} \sum_{i=1}^{n} (X_i - \bar{X}) \right] = \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$$

$$\beta_1 \sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$$

$$\beta_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$
Appendix: Proof of Least Squares Coefficient Estimators

I Note: We used a key fact about sums and means: $\sum_{i=1}^{n} (Y_i - \bar{Y}) = 0$
I Deviations from the mean sum to 0
I Intuitively this makes sense because the mean is just the sum of the observations divided by n
I This allows us to write $\sum_{i=1}^{n} X_i (Y_i - \bar{Y}) = \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$
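Both the key fact and the final coefficient formulas are easy to verify numerically. The sketch below (my own, not from the slides) checks that deviations from the mean sum to 0 and that the closed-form coefficients agree with np.polyfit, which fits a degree-1 polynomial by least squares:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(5, 2, 200)
y = -1 + 0.5 * x + rng.normal(0, 1, 200)

# Key fact: deviations from the mean sum to 0 (up to floating-point error)
print(np.sum(y - y.mean()))

# Closed-form coefficients from the derivation above
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# np.polyfit(x, y, 1) returns the slope and intercept of the least-squares line
slope_fit, intercept_fit = np.polyfit(x, y, 1)
print(b1, slope_fit)      # should agree
print(b0, intercept_fit)  # should agree
```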