Post on 28-Mar-2015

Simple linear models

• Straight line is simplest case, but key is that parameters appear linearly in the model

• Needs estimates of the model parameters (slope and intercept), usually obtained by least squares

• Makes a number of assumptions, usually checked graphically using residuals
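
As a minimal sketch of the least-squares step, the slope and intercept of a straight line can be computed from the usual closed-form sums. The function name `fit_line` and the data are made up for illustration and are not from the slides.

```python
def fit_line(x, y):
    """Return (intercept, slope) minimising the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squares of x about its mean, and cross-products of x and y
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
intercept, slope = fit_line(x, y)
print(intercept, slope)
```

Because the made-up points lie exactly on a line, the fit should recover that line; real data scatter about the line, and the residuals are what the assumption checks examine.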

Examples for linear regression

• How is LOI related to moisture?

• How should we estimate merchantable volume of wood from the height of a living tree?

• How is pest infestation late in the season affected by the concentration of insecticide applied early in the season?

Scatterplot of tree volume vs height

Minitab commands

Regression Output

Interpreting the output

• Goodness of fit (R-squared) and ANOVA table p-value

• Confidence intervals and tests for the parameters

• Assessing assumptions (outliers and influential observations)

• Residual plots
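
To make the goodness-of-fit item concrete, here is a rough sketch of how R-squared follows from the residual and total sums of squares. The data and the helper name `r_squared` are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def r_squared(x, y):
    """Fit a straight line by least squares and return R-squared = 1 - SSE/SST."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # SSE: variation left over after fitting the line
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # SST: total variation of y about its mean
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / sst


# Hypothetical data, not taken from the slides
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
r2 = r_squared(x, y)
print(f"R-squared = {r2:.3f}")
```

R-squared measures how much of the variation in the response the line accounts for; it does not by itself say whether the slope is statistically significant, which is what the ANOVA p-value addresses.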

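t = distance between estimate and hypothesised value, in units of standard error

$t = \dfrac{\text{Coef} - \text{hypothesised value}}{\text{SE Coef}}$, which reduces to $\text{Coef} / \text{SE Coef}$ when the hypothesised value is 0; compare $|t|$ with $t_{\text{crit}}$

$\text{CI} = \text{Coef} \pm t_{\text{crit}} \times \text{SE Coef}$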
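
The following sketch works through the t statistic and confidence interval for a slope on a small made-up data set. The data, the variable names, and the critical value (taken from a standard t table for 3 degrees of freedom at the two-sided 95% level) are assumptions for illustration, not values from the slides.

```python
import math

# Hypothetical data, not taken from the slides
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

# Least-squares estimates of slope and intercept
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual variance on n - 2 degrees of freedom, then the SE of the slope
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
s2 = sse / (n - 2)
se_slope = math.sqrt(s2 / sxx)

# t = (Coef - hypothesised value) / SE Coef, here testing slope = 0
hypothesised = 0.0
t_stat = (slope - hypothesised) / se_slope

# Assumed table value: two-sided 95% critical t for n - 2 = 3 degrees of freedom
T_CRIT = 3.182

# CI = Coef +/- tcrit * SE Coef
lower = slope - T_CRIT * se_slope
upper = slope + T_CRIT * se_slope

print(f"t = {t_stat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
print("significant at 5%" if abs(t_stat) > T_CRIT else "not significant at 5%")
```

The test and the interval agree by construction: the interval excludes the hypothesised value exactly when $|t|$ exceeds $t_{\text{crit}}$.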

Confidence intervals and t-tests

Regression output

Outliers

Residual plots

[Figure: Residual Plots for VOLUME, in four panels. Normal Probability Plot of the Residuals (percent vs standardized residual); Residuals Versus the Fitted Values (standardized residual vs fitted value); Histogram of the Residuals (frequency vs standardized residual); Residuals Versus the Order of the Data (standardized residual vs observation order).]
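
As a rough sketch of what goes into such plots, standardized residuals can be computed by dividing each residual by its estimated standard deviation, which depends on the leverage of the observation. The data, the function name, and the cut-off of 2 (a common rule of thumb) are assumptions, not taken from the slides.

```python
import math


def standardized_residuals(x, y):
    """Return standardized residuals from a straight-line least-squares fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard deviation on n - 2 degrees of freedom
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    result = []
    for xi, e in zip(x, residuals):
        # Leverage of the observation in simple linear regression
        h = 1.0 / n + (xi - x_bar) ** 2 / sxx
        result.append(e / (s * math.sqrt(1.0 - h)))
    return result


# Hypothetical data, not taken from the slides
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
std_res = standardized_residuals(x, y)

# Rule-of-thumb screen: flag observations with |standardized residual| > 2
flagged = [i for i, r in enumerate(std_res) if abs(r) > 2.0]
print([round(r, 3) for r in std_res], flagged)
```

Plotting these values against the fitted values, against observation order, and on a normal probability scale gives the kind of panels described in the figure.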

Confidence and prediction intervals

[Figure: fitted line plot of volume as a function of height (VOLUME vs HEIGHT), showing the regression line with 95% CI and 95% PI bands. Fitted equation: VOLUME = -87.12 + 1.543 HEIGHT. S = 13.3970, R-Sq = 35.8%, R-Sq(adj) = 33.6%.]
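
A small sketch of using the fitted equation for point prediction follows. The coefficients are the ones reported on the plot; the function name `predict_volume` and the heights passed to it are made up for illustration.

```python
# Coefficients as reported in the fitted equation on the plot
INTERCEPT = -87.12
SLOPE = 1.543


def predict_volume(height):
    """Point prediction of VOLUME from HEIGHT using the fitted line."""
    return INTERCEPT + SLOPE * height


# Hypothetical heights chosen inside the range of the plotted data
for h in (65, 75, 85):
    print(h, round(predict_volume(h), 3))

# Height at which the fitted line crosses zero volume
zero_crossing = -INTERCEPT / SLOPE
print(round(zero_crossing, 2))
```

The line predicts negative volume for heights below the zero crossing, a reminder that the fit is only an approximation over the range of the observed data. The point prediction is the same for the confidence and prediction intervals; the prediction interval is wider because it also covers the scatter of an individual tree about the line.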

Four possible outcomes

• Low R-sq, low p-value (significant)

• Low R-sq, high p-value (non-significant)

• High R-sq, low p-value (significant)

• High R-sq, high p-value (non-significant)

Why linear?

• Not because relationships are linear

• Transformations can often help linearise

• Good simple starting point: results are well understood

• Approximation to a smoothly varying curve
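
One way a transformation can linearise a relationship is sketched here: if y grows exponentially with x, then log(y) is a straight-line function of x and can be fitted by ordinary least squares. The exponential form, the constants, and the data are assumptions made for illustration only.

```python
import math


def fit_line(x, y):
    """Return (intercept, slope) from a straight-line least-squares fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical curved data generated from y = 2 * exp(0.5 * x)
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 * math.exp(0.5 * xi) for xi in x]

# After taking logs the model is linear: log(y) = log(2) + 0.5 * x
log_y = [math.log(yi) for yi in y]
log_intercept, rate = fit_line(x, log_y)

# Back-transform the intercept to recover the multiplicative constant
scale = math.exp(log_intercept)
print(round(scale, 6), round(rate, 6))
```

The parameters still appear linearly in the transformed model, which is the property that matters; the residual checks should then be made on the transformed scale.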
