adequacy of linear regression models

53
06/14/22 http:// numericalmethods.eng.usf.edu 1 Adequacy of Linear Regression Models http://numericalmethods.eng.us f.edu Transforming Numerical Methods Education for STEM Undergraduates

Upload: salaam

Post on 12-Feb-2016

37 views

Category:

Documents


1 download

DESCRIPTION

Adequacy of Linear Regression Models. http://numericalmethods.eng.usf.edu Transforming Numerical Methods Education for STEM Undergraduates. 8/6/2014. http://numericalmethods.eng.usf.edu. 1. Data. Is this adequate?. Straight Line Model. Quality of Fitted Data. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Adequacy of Linear Regression Models

04/22/23 http://numericalmethods.eng.usf.edu 1

Adequacy of Linear Regression Models

http://numericalmethods.eng.usf.eduTransforming Numerical Methods Education for STEM

Undergraduates

Page 2: Adequacy of Linear Regression Models

Data

-350 -300 -250 -200 -150 -100 -50 0 50 1002

2.5

3

3.5

4

4.5

5

5.5

6

6.5

x

yy vs x

Page 3: Adequacy of Linear Regression Models

Is this adequate?

-350 -300 -250 -200 -150 -100 -50 0 50 1002

2.5

3

3.5

4

4.5

5

5.5

6

6.5

7

x

yy vs x

Straight Line Model

Page 4: Adequacy of Linear Regression Models

Quality of Fitted Data• Does the model describe the data

adequately?

• How well does the model predict the response variable predictably?

Page 5: Adequacy of Linear Regression Models

Linear Regression Models• Limit our discussion to adequacy of

straight-line regression models

Page 6: Adequacy of Linear Regression Models

Four checks

1. Plot the data and the model.2. Find standard error of estimate.3. Calculate the coefficient of

determination.4. Check if the model meets the

assumption of random errors.

Page 7: Adequacy of Linear Regression Models

Example: Check the adequacy of the straight line model for given data

T(F)

α (μin/in/F)

-340 2.45

-260 3.58

-180 4.52

-100 5.28

-20 5.86

60 6.36

Taa 10

Page 8: Adequacy of Linear Regression Models

END

Page 9: Adequacy of Linear Regression Models

1. Plot the data and the model

Page 10: Adequacy of Linear Regression Models

Data and model

T(F)

α (μin/in/F)

-340 2.45-260 3.58-180 4.52-100 5.28-20 5.8660 6.36

-350 -300 -250 -200 -150 -100 -50 0 50 1002

2.5

3

3.5

4

4.5

5

5.5

6

6.5

7

T

TT 0096964.00325.6)(

Page 11: Adequacy of Linear Regression Models

END

Page 12: Adequacy of Linear Regression Models

2. Find the standard error of estimate

Page 13: Adequacy of Linear Regression Models

Standard error of estimate

2/

nSs r

T

n

iiir TaaS

1

210 )(

Page 14: Adequacy of Linear Regression Models

Standard Error of Estimate

-340-260-180-100-2060

2.453.584.525.285.866.36

2.73573.51144.28715.06295.83866.6143

-0.285710.0685710.232860.21714

0.021429-0.25429

iT i iTaa 10 ii Taa 10

TT 0096964.00325.6)(

Page 15: Adequacy of Linear Regression Models

Standard Error of Estimate

25283.0rS

2/

nS

s rT

2625283.0

25141.0

Page 16: Adequacy of Linear Regression Models

Standard Error of Estimate

T

ii

sTaa

/

10 ResidualScaled

-350 -300 -250 -200 -150 -100 -50 0 50 1002

3

4

5

6

7

8

T

Page 17: Adequacy of Linear Regression Models

Scaled Residuals

T

ii

sTaa

/

10 ResidualScaled

Estimateof Error StandardResidual ResidualScaled

95% of the scaled residuals need to be in [-2,2]

Page 18: Adequacy of Linear Regression Models

Scaled Residuals

Ti αi Residual Scaled Residual

-340-260-180-100-2060

2.453.584.525.285.866.36

-0.285710.0685710.232860.217140.021429-0.25429

-1.13640.272750.926220.86369

0.085235-1.0115

25141.0/ Ts

Page 19: Adequacy of Linear Regression Models

END

Page 20: Adequacy of Linear Regression Models

3. Find the coefficient of determination

Page 21: Adequacy of Linear Regression Models

Coefficient of determination

n

iiir TaaS

1

210

n

iitS

1

2

t

rt

SSS

r

2

Page 22: Adequacy of Linear Regression Models

Sum of square of residuals between data and mean

n

iit yyS

1

2

11, yx 33 , yx

22 , yx

),( nn yx

ii yx ,y

x

_

yy _

yy

Page 23: Adequacy of Linear Regression Models

Sum of square of residuals between observed and predicted

n

iiir xaayS

1

210

11, yx

33 , yx

22 , yx

),( nn yx ii yx ,

iii xaayE 10

y

x

Page 24: Adequacy of Linear Regression Models

Limits of Coefficient of Determination

t

rt

SSS

r

2

10 2 r

Page 25: Adequacy of Linear Regression Models

Calculation of St

-340-260-180-100-2060

2.453.584.525.285.866.36

-2.2250-1.09500.155000.605001.18501.6850

iT i i

783.10tS6750.4

Page 26: Adequacy of Linear Regression Models

Calculation of Sr

-340-260-180-100-2060

2.453.584.525.285.866.36

2.73573.51144.28715.06295.83866.6143

-0.285710.0685710.232860.21714

0.021429-0.25429

iT i iTaa 10 ii Taa 10

25283.0rS

Page 27: Adequacy of Linear Regression Models

Coefficient of determination

t

rt

SSS

r

2

783.1025283.0783.10

97655.0

Page 28: Adequacy of Linear Regression Models

Correlation coefficient

t

rt

SSS

r

98820.0

How do you know if r is positive or negative ?

Page 29: Adequacy of Linear Regression Models

What does a particular value of |r| mean?

0.8 to 1.0 - Very strong relationship 0.6 to 0.8 - Strong relationship 0.4 to 0.6 - Moderate relationship 0.2 to 0.4 - Weak relationship 0.0 to 0.2 - Weak or no relationship 

Page 30: Adequacy of Linear Regression Models

Caution in use of r2

• Increase in spread of regressor variable (x) in y vs. x increases r2

• Large regression slope artificially yields high r2

• Large r2 does not measure appropriateness of the linear model

• Large r2 does not imply regression model will predict accurately

Page 31: Adequacy of Linear Regression Models

Final Exam Grade

Page 32: Adequacy of Linear Regression Models

Final Exam Grade vs Pre-Req GPA

Page 33: Adequacy of Linear Regression Models

END

Page 34: Adequacy of Linear Regression Models

4. Model meets assumption of random errors

Page 35: Adequacy of Linear Regression Models

Model meets assumption of random errors

• Residuals are negative as well as positive

• Variation of residuals as a function of the independent variable is random

• Residuals follow a normal distribution• There is no autocorrelation between the

data points.

Page 36: Adequacy of Linear Regression Models

Therm exp coeff vs temperatureT α60 6.3640 6.2420 6.120 6.00-20 5.86-40 5.72-60 5.58-80 5.43

T α-100 5.28-120 5.09-140 4.91-160 4.72-180 4.52-200 4.30-220 4.08-240 3.83

T α-280 3.33-300 3.07-320 2.76-340 2.45

Page 37: Adequacy of Linear Regression Models

Data and model

-350 -300 -250 -200 -150 -100 -50 0 50 1002

2.5

3

3.5

4

4.5

5

5.5

6

6.5

7

T

T0093868.00248.6

Page 38: Adequacy of Linear Regression Models

Plot of Residuals

-350 -300 -250 -200 -150 -100 -50 0 50 100-0.4

-0.3

-0.2

-0.1

0

0.1

0.2

0.3

T

Resid

ual

Page 39: Adequacy of Linear Regression Models

Histograms of Residuals

Page 40: Adequacy of Linear Regression Models

Check for Autocorrelation• Find the number of times, q the sign of the

residual changes for the n data points.• If (n-1)/2-√(n-1) ≤q≤ (n-1)/2+√(n-1), you

most likely do not have an autocorrelation.

1222

1221222

)122(

q

083.159174.5 q

Page 41: Adequacy of Linear Regression Models

Is there autocorrelation?

083.159174.5 q

-350 -300 -250 -200 -150 -100 -50 0 50 100-0.4

-0.3

-0.2

-0.1

0

0.1

0.2

0.3

T

Resid

ual

Page 42: Adequacy of Linear Regression Models

y vs x fit and residuals

n=40

Is 13.3≤21≤ 25.7? Yes!(n-1)/2-√(n-1) ≤p≤ (n-1)/2+√(n-1)

Page 43: Adequacy of Linear Regression Models

y vs x fit and residuals

(n-1)/2-√(n-1) ≤p≤ (n-1)/2+√(n-1)Is 13.3≤2≤ 25.7? No!

n=40

Page 44: Adequacy of Linear Regression Models

END

Page 45: Adequacy of Linear Regression Models

What polynomial model to choose if one needs to be chosen?

Page 46: Adequacy of Linear Regression Models

First Order of Polynomial

-350 -300 -250 -200 -150 -100 -50 0 50 1002

2.5

3

3.5

4

4.5

5

5.5

6

6.5

7x 10

-6 Polynomial Regression of order 1

x

y =

a 0+a1*x

+a2*x

2 +....

.+a m

*xm

Page 47: Adequacy of Linear Regression Models

Second Order Polynomial

-350 -300 -250 -200 -150 -100 -50 0 50 1002

2.5

3

3.5

4

4.5

5

5.5

6

6.5

7x 10

-6 Polynomial Regression of order 2

x

y =

a 0+a1*x

+a2*x

2 +....

.+a m

*xm

Page 48: Adequacy of Linear Regression Models

Which model to choose?

-350 -300 -250 -200 -150 -100 -50 0 50 1002

2.5

3

3.5

4

4.5

5

5.5

6

6.5

7

x

yy vs x

Page 49: Adequacy of Linear Regression Models

Optimum Polynomial

0 1 2 3 4 5 60

1

2

3

4

5

x 10-14 Optimum Order of Polynomial

Order of Polynomial, m

Sr

[n-(m

+1)]

Page 50: Adequacy of Linear Regression Models

THE END

Page 51: Adequacy of Linear Regression Models

Effect of an Outlier

Page 52: Adequacy of Linear Regression Models

Effect of Outlier

y = 2xR2 = 1

0

5

10

15

20

25

0 2 4 6 8 10 12

Page 53: Adequacy of Linear Regression Models

Effect of Outlier

y = 3.2727x - 5.0909R2 = 0.6879

-10

0

10

20

30

40

50

60

0 2 4 6 8 10 12