Diploma in Statistics
Introduction to Regression
Lecture 5.1
1. Review
2. Transforming data, the log transform
i. liver fluke egg hatching rate
ii. explaining CEO remuneration
iii. brain weights and body weights
3. SLR with transformed data
4. Transforming X, quadratic fit
5. Other options
Using t values
Convention: n > 30 is big, n < 30 is small.
$z_{0.05} = 1.96 \approx 2$
$t_{30,\,0.05} = 2.04 \approx 2$
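The "≈ 2" rule of thumb can be checked numerically. The sketch below is illustrative only: it computes the normal critical value with Python's standard library and compares it with a few two-sided 5% t values copied by hand from the table of selected critical values in this lecture. The dictionary name `t_crit_05` is an assumption made for the example.

```python
from statistics import NormalDist

# Two-sided 5% critical value of the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)

# Two-sided 5% critical values of t, copied from the .05 column of the
# lecture's table (keys are degrees of freedom).
t_crit_05 = {10: 2.23, 20: 2.09, 30: 2.04, 60: 2.00, 120: 1.98}

print(f"z: {z_crit:.2f}")
for df, t in t_crit_05.items():
    # The gap between t and z shrinks as the degrees of freedom grow.
    print(f"df={df:>3}: t={t:.2f}, excess over z={t - z_crit:.2f}")
```

With 30 or more degrees of freedom the tabulated t value is within about 0.1 of the normal value, which is why both are rounded to 2.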
Selected critical values for the t-distribution

| ν | .25 | .10 | .05 | .02 | .01 | .002 | .001 |
|---|---|---|---|---|---|---|---|
| 1 | 2.41 | 6.31 | 12.71 | 31.82 | 63.66 | 318.32 | 636.61 |
| 2 | 1.60 | 2.92 | 4.30 | 6.96 | 9.92 | 22.33 | 31.60 |
| 3 | 1.42 | 2.35 | 3.18 | 4.54 | 5.84 | 10.22 | 12.92 |
| 4 | 1.34 | 2.13 | 2.78 | 3.75 | 4.60 | 7.17 | 8.61 |
| 5 | 1.30 | 2.02 | 2.57 | 3.36 | 4.03 | 5.89 | 6.87 |
| 6 | 1.27 | 1.94 | 2.45 | 3.14 | 3.71 | 5.21 | 5.96 |
| 7 | 1.25 | 1.89 | 2.36 | 3.00 | 3.50 | 4.79 | 5.41 |
| 8 | 1.24 | 1.86 | 2.31 | 2.90 | 3.36 | 4.50 | 5.04 |
| 9 | 1.23 | 1.83 | 2.26 | 2.82 | 3.25 | 4.30 | 4.78 |
| 10 | 1.22 | 1.81 | 2.23 | 2.76 | 3.17 | 4.14 | 4.59 |
| 12 | 1.21 | 1.78 | 2.18 | 2.68 | 3.05 | 3.93 | 4.32 |
| 15 | 1.20 | 1.75 | 2.13 | 2.60 | 2.95 | 3.73 | 4.07 |
| 20 | 1.18 | 1.72 | 2.09 | 2.53 | 2.85 | 3.55 | 3.85 |
| 24 | 1.18 | 1.71 | 2.06 | 2.49 | 2.80 | 3.47 | 3.75 |
| 30 | 1.17 | 1.70 | 2.04 | 2.46 | 2.75 | 3.39 | 3.65 |
| 40 | 1.17 | 1.68 | 2.02 | 2.42 | 2.70 | 3.31 | 3.55 |
| 60 | 1.16 | 1.67 | 2.00 | 2.39 | 2.66 | 3.23 | 3.46 |
| 120 | 1.16 | 1.66 | 1.98 | 2.36 | 2.62 | 3.16 | 3.37 |
| ∞ | 1.15 | 1.64 | 1.96 | 2.33 | 2.58 | 3.09 | 3.29 |
Homework 4.2.1
Quantify the extent of the recovery in Year 6, Q3.
P̂ = 1030 Q1 + 1292 Q2 + 1210 Q3 + 1279 Q4 + 33.7 Time
Year 6 Q2: P = 1657
P̂ = 1292 + 33.7 × 22 = 2033
P − P̂ = 1657 − 2033 = −376
Year 6 Q3: P = 2185
P̂ = 1210 + 33.7 × 23 = 1985
P − P̂ = 2185 − 1985 = 200
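The arithmetic for the two quarters can be reproduced with a short sketch. It uses the rounded coefficients and the Time values 22 and 23 given on the slide; the function names `fitted` and `residual` are illustrative, not part of the course material.

```python
# Rounded coefficients as quoted on the slide.
QUARTER_COEF = {"Q1": 1030, "Q2": 1292, "Q3": 1210, "Q4": 1279}
TIME_COEF = 33.7

def fitted(quarter, time):
    """Fitted value: the quarter's coefficient plus the time trend."""
    return QUARTER_COEF[quarter] + TIME_COEF * time

def residual(actual, quarter, time):
    """Actual minus fitted."""
    return actual - fitted(quarter, time)

cases = [("Year 6 Q2", 1657, "Q2", 22), ("Year 6 Q3", 2185, "Q3", 23)]
for label, actual, quarter, time in cases:
    print(label, round(fitted(quarter, time)), round(residual(actual, quarter, time)))
```

The residual moves from well below the fitted value in Q2 to above it in Q3, which is the recovery the homework asks to quantify.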
Homework 4.2.2
List correspondences between the output from the original regression and the output from the alternative regression.
Confirm that each of the Q1, Q2 and Q3 coefficients in the original equals the corresponding coefficient in the alternative plus the alternative's Constant, which is the Q4 coefficient in the original.
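Noconstant

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Q1 | 1029.87 | 23.41 | 43.99 | 0.000 |
| Q2 | 1292.35 | 24.45 | 52.85 | 0.000 |
| Q3 | 1210.42 | 25.55 | 47.37 | 0.000 |
| Q4 | 1278.70 | 26.71 | 47.88 | 0.000 |
| Time | 33.725 | 1.619 | 20.83 | 0.000 |

S = 40.9654

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 1278.70 | 26.71 | 47.88 | 0.000 |
| Q1 | -248.82 | 26.36 | -9.44 | 0.000 |
| Q2 | 13.65 | 26.11 | 0.52 | 0.609 |
| Q3 | -68.27 | 25.96 | -2.63 | 0.019 |
| Time | 33.725 | 1.619 | 20.83 | 0.000 |

S = 40.9654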
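The correspondence asked for in Homework 4.2.2 can be checked directly from the two sets of coefficients. This is a minimal sketch with the values typed in from the output above; the dictionary names are illustrative, and the 0.02 tolerance is an assumption that allows for rounding in the printed output.

```python
# Coefficients from the no-constant fit and from the fit with a constant.
no_constant = {"Q1": 1029.87, "Q2": 1292.35, "Q3": 1210.42, "Q4": 1278.70}
with_constant = {"Constant": 1278.70, "Q1": -248.82, "Q2": 13.65, "Q3": -68.27}

for q in ("Q1", "Q2", "Q3"):
    # Constant (the Q4 level) plus the quarter's offset should rebuild the
    # quarter's own level in the no-constant fit, up to rounding.
    rebuilt = with_constant["Constant"] + with_constant[q]
    agrees = abs(rebuilt - no_constant[q]) < 0.02
    print(q, no_constant[q], round(rebuilt, 2), agrees)
```

In the fit with a constant, each quarterly coefficient is therefore a difference from Q4, which explains why its t-value tests something different from the corresponding t-value in the no-constant fit.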
Homework 4.2.3
1. Calculate the simple linear regressions of Jobtime on each of T_Ops and Units. Confirm the corresponding t-values.
2. Calculate the simple linear regression of Jobtime on Ops per Unit. Comment on the negative correlation of Jobtime with Ops per Unit in the light of the corresponding t-value.
3. Confirm the calculation of the R² values.
Solution 4.2.3
2. Calculate the simple linear regression of Jobtime on Ops per Unit. Comment on the negative correlation of Jobtime with Ops per Unit in the light of the corresponding t-value.
Comment: The t-value is not statistically significant; the negative correlation is consistent with chance variation and has no substantive meaning.
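Variance Inflation Factors
$$\mathrm{SE}(\hat\beta_k) = \frac{\sigma}{\sqrt{n}\,s_k} \times \frac{1}{\sqrt{1 - R_k^2}}$$
$$R_k^2 = 0 \;\Rightarrow\; \mathrm{SE}(\hat\beta_k) = \frac{\sigma}{\sqrt{n}\,s_k}$$
$$\frac{1}{\sqrt{1 - R_k^2}} = \text{standard error inflation factor}$$
$$\frac{1}{1 - R_k^2} = \text{variance inflation factor}$$
Here $s_k$ is the standard deviation of $X_k$ and $R_k^2$ is the $R^2$ from regressing $X_k$ on the other X variables.
Convention: problem if $R_k^2 > 90\%$ or $\mathrm{VIF}_k > 10$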
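As a hedged illustration of the convention, the sketch below evaluates the variance inflation factor 1 / (1 − R²) and its square root (the standard error inflation factor) for a few values of R². The function names `vif` and `se_inflation` are made up for this example.

```python
import math

def vif(r_squared):
    """Variance inflation factor 1 / (1 - R_k^2) for one X variable."""
    if not 0 <= r_squared < 1:
        raise ValueError("R-squared must be in [0, 1)")
    return 1.0 / (1.0 - r_squared)

def se_inflation(r_squared):
    """Standard error inflation factor: the square root of the VIF."""
    return math.sqrt(vif(r_squared))

for r2 in (0.0, 0.5, 0.9, 0.99):
    print(f"R-sq={r2:.2f}  VIF={vif(r2):6.1f}  SE inflation={se_inflation(r2):5.2f}")
```

An R² of 90% gives a VIF of 10, so the two thresholds in the convention describe the same cut-off.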
What to do?
• Get new X values, to break correlation pattern
  – impractical in observational studies
• Choose a subset of the X variables
  – manually
  – automatically
    • stepwise regression
    • other methods
Residential load survey data
Data collected by a US electricity supplier during an investigation of the factors that influence peak demand for electricity by residential customers.
Load (Y) is demand at the system peak demand hour, in kW,
Size (X1) is house size, in SqFt/1000,
Income (X2) is annual family income, in $/1000,
AirCon (X3) is air conditioning capacity, in tons,
Index (X4) is the house appliance index, in kW,
Residents (X5) is the number in the house on a typical day.
Matrix plot
Results
All variables in:

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 0.1263 | 0.2289 | 0.55 | 0.585 |
| Size | -2.6689 | 0.9059 | -2.95 | 0.006 |
| Income | 0.00027912 | 0.00007892 | 3.54 | 0.001 |
| AirCon | 0.42462 | 0.03472 | 12.23 | 0.000 |
| Index | 0.00038137 | 0.00007884 | 4.84 | 0.000 |
| Residents | 0.00197 | 0.02218 | 0.09 | 0.930 |

Income deleted:

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | -397.0 | 492.7 | -0.81 | 0.426 |
| Size | 10943.3 | 594.2 | 18.42 | 0.000 |
| AirCon | -1.86 | 75.45 | -0.02 | 0.980 |
| Index | 0.0721 | 0.1709 | 0.42 | 0.676 |
| Residents | 38.65 | 47.75 | 0.81 | 0.424 |
Exercise
Calculate the VIF for Size. Comment.
Homework
Calculate variance inflation factors for all explanatory variables. Discuss.
Multicollinearity
Occurs when there is perfect correlation within the X variables.
Example: Indicators
Illustration: Minitab
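The indicator example can be sketched without any statistical software. Assuming a hypothetical two years of quarterly data, the four quarterly indicators always sum to 1, so together they reproduce the constant column exactly. This is the perfect correlation that forces a choice between fitting with no constant or dropping one indicator.

```python
# Hypothetical design: two years of quarterly observations.
quarters = [1, 2, 3, 4, 1, 2, 3, 4]

# One 0/1 indicator column per quarter.
indicators = {f"Q{k}": [1 if q == k else 0 for q in quarters] for k in (1, 2, 3, 4)}

# The column a regression constant would contribute.
constant = [1] * len(quarters)

# Row by row, the four indicators add up to the constant column.
row_sums = [sum(column[i] for column in indicators.values()) for i in range(len(quarters))]
print(row_sums == constant)
```

Because one column is an exact linear combination of the others, the least squares coefficients are not uniquely determined when all five columns are included.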
(i) Hatching of liver fluke eggs
The life cycle of the liver fluke
1. Adults in liver lay eggs
2. Animals excrete eggs
3. Eggs hatch on ground
4. Larvae seek snail
5. Development within snail
6. Emergence from snail
7. Consumption by animal
8. Penetration to liver
Hatching of liver fluke eggs: Duration and Success rate
Duration and success rate of hatching of 600 liver fluke eggs at a series of fixed temperatures

| Temperature (°C) | Number hatched | Duration (mean days) | SD | Hatch% |
|---|---|---|---|---|
| 10 | 546 | 115.75 | 2.14 | 91.0 |
| 13 | 543 | 56.50 | 2.33 | 90.5 |
| 16 | 534 | 32.39 | 1.98 | 89.0 |
| 18 | 501 | 24.49 | 1.41 | 83.5 |
| 20 | 499 | 18.92 | 1.39 | 83.1 |
| 22 | 497 | 15.58 | 1.23 | 82.8 |
| 24 | 465 | 13.39 | 1.03 | 77.5 |
| 26 | 448 | 11.98 | 1.28 | 74.0 |
| 28 | 438 | 10.16 | 0.94 | 73.0 |
| 30 | 432 | 9.45 | 0.96 | 72.0 |
| 32 | 256 | 10.37 | 0.94 | 42.5 |
| 34 | 42 | 11.52 | 0.85 | 7.0 |
| 35 | 0 | | | |
[Figure: Scatterplot of Duration vs Temperature]
[Figure: Scatterplot of Log(Duration) vs Temperature]
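The transformed values behind the second scatterplot can be reproduced from the hatching table. The sketch assumes base-10 logarithms, which is consistent with the plotted range of roughly 1.0 to 2.2. The 35 °C row is left out because no eggs hatched and no duration was recorded.

```python
import math

# Temperature (°C) and mean hatching duration (days) from the table.
temperature = [10, 13, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34]
duration = [115.75, 56.50, 32.39, 24.49, 18.92, 15.58,
            13.39, 11.98, 10.16, 9.45, 10.37, 11.52]

# Base-10 log of duration (assumed base; see the note above).
log_duration = [math.log10(d) for d in duration]

for t, d, ld in zip(temperature, duration, log_duration):
    print(f"{t:>3}  {d:>7.2f}  {ld:.3f}")
```

On the raw scale the first two durations dominate the plot; on the log scale the points are spread far more evenly across the vertical axis.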
(ii) Explaining CEO Compensation and Company Sales (Forbes magazine, May 1994)
[Figure: Scatterplot of Total comp vs Sales]
Explaining CEO Remuneration, bivariate log transformation
[Figure: Scatterplot of LogComp vs LogSales]
(iii) Mammals' Brainweight vs Bodyweight

| Species | Bodyweight | Brainweight |
|---|---|---|
| African elephant | 6654 | 5712 |
| African giant pouched rat | 1 | 6.6 |
| Arctic fox | 3.385 | 44.5 |
| Arctic ground squirrel | 0.92 | 5.7 |
| Asian elephant | 2547 | 4603 |
| Brachiosaurus | 87000 | 154.5 |
| Baboon | 10.55 | 179.5 |
| Big brown bat | 0.023 | 0.3 |
| Brazilian tapir | 160 | 169 |
| Cat | 3.3 | 25.6 |
| Chimpanzee | 52.16 | 440 |
| ⋮ | ⋮ | ⋮ |
Scatterplot view
[Figure: Scatterplot of Brainweight vs Bodyweight]
Scatterplot view, log transform
[Figure: Scatterplot of LBrainW vs LBodyW]
Scatterplot view, dinosaurs deleted
[Figure: Scatterplot of LBrainW vs LBodyW]
Histogram view
[Figure: Histogram of Brainweight; Histogram of Bodyweight]
Histogram view, log transform
[Figure: Histogram of LBrainW; Histogram of LBodyW]
Changing spread with log
Why the log transform works
With Y = log(X):
• High spread at high X is transformed to low spread at high Y.
• Low spread at low X is transformed to high spread at low Y.
Why the log transform works
10 to 100 is transformed to $\log_{10}(10)$ to $\log_{10}(10^{2})$, i.e., 1 to 2.
1/10 = 0.1 to 1/100 = 0.01 is transformed to $\log_{10}(10^{-1})$ to $\log_{10}(10^{-2})$, i.e., −1 to −2.
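The same two intervals can be compared numerically. In this small sketch (the function name `log10_width` is illustrative), the interval from 10 to 100 is a thousand times wider than the interval from 0.01 to 0.1 on the raw scale, yet both have width 1 after taking base-10 logs.

```python
import math

def log10_width(low, high):
    """Width of the interval [low, high] after a base-10 log transform."""
    return math.log10(high) - math.log10(low)

for low, high in [(10, 100), (0.01, 0.1)]:
    raw_width = high - low
    print(f"{low} to {high}: raw width {raw_width:g}, log width {log10_width(low, high):g}")
```

Equal ratios become equal distances on the log scale, which is what compresses the large values and stretches the small ones.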
SLR with transformed data: LBrainW versus LBodyW
The regression equation is
LBrainW = 0.932 + 0.753 LBodyW

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 0.93237 | 0.04170 | 22.36 | 0.000 |
| LBodyW | 0.75309 | 0.02858 | 26.35 | 0.000 |

S = 0.302949
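A straight line on the log-log scale corresponds to a power law on the original scale. The sketch below assumes base-10 logs, as used in the "Why the log transform works" slides, and rewrites LBrainW = a + b × LBodyW as Brainweight = 10^a × Bodyweight^b. The function names are illustrative and the back-transformation is offered as an interpretation aid, not as part of the Minitab output.

```python
import math

# Coefficients from the fitted log-log regression.
INTERCEPT, SLOPE = 0.93237, 0.75309

def predicted_log_brain(log_body):
    """Fitted value on the log10 scale."""
    return INTERCEPT + SLOPE * log_body

def predicted_brain(body):
    """The same fit on the original scale: 10**a * body**b."""
    return 10 ** INTERCEPT * body ** SLOPE

# The two forms agree for any positive body weight (values from the data table).
for body in (0.023, 3.3, 6654):
    via_log = 10 ** predicted_log_brain(math.log10(body))
    print(body, round(via_log, 3), round(predicted_brain(body), 3))
```

Under this reading, doubling body weight multiplies the predicted brain weight by 2 to the power 0.753, whatever the starting weight.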
Application: Do humans conform?
[Figure: Scatterplot of LBrainW vs LBodyW, with the Human point labelled]
Application: Do humans conform?
• Delete the Human data,
• calculate regression,
• predict human LBrainW and
• compare to actual, relative to s
Application: Do humans conform?
Regression Analysis: LBrainW versus LBodyW
The regression equation is
LBrainW = 0.924 + 0.744 LBodyW

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 0.92410 | 0.03933 | 23.50 | 0.000 |
| LBodyW | 0.74383 | 0.02706 | 27.48 | 0.000 |

S = 0.285036
Application: Do humans conform?
LBodyW(Human) = 1.79239
LBrainW(Human) = 3.12057
Predicted LBrainW = 0.924 + 0.744 × 1.79239 = 2.25754
Residual = 3.12057 − 2.25754 = 0.86303
Residual / s = 0.86303 / 0.285036 = 3.03
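The calculation can be replayed step by step. The sketch uses the rounded coefficients 0.924 and 0.744 exactly as the slide does, so it reproduces the slide's figures. The variable names are illustrative.

```python
# Fit with the Human case deleted (rounded coefficients, as on the slide).
INTERCEPT, SLOPE, S = 0.924, 0.744, 0.285036

log_body_human = 1.79239
log_brain_human = 3.12057

predicted = INTERCEPT + SLOPE * log_body_human   # deleted fitted value
residual = log_brain_human - predicted           # deleted residual
ratio = residual / S                             # residual in units of s

print(round(predicted, 5), round(residual, 5), round(ratio, 2))
```

The human brain weight sits about three residual standard deviations above the line fitted to the other species.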
Deleted residuals
For each potentially exceptional case:
– delete the case
– calculate the regression from the rest
– use the fitted equation to calculate a deleted fitted value
– calculate deleted residual = observed value – deleted fitted value
Minitab does this automatically for all cases!
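The procedure above can be sketched for simple linear regression in plain Python. The data are hypothetical (y = 2x except for a final case set 10 units too high) and the function names are made up for the example. The sketch returns raw deleted residuals only; it makes no claim about how Minitab scales or labels its own output.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for a simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def deleted_residuals(xs, ys):
    """For each case: refit without it, then compare observed with deleted fit."""
    out = []
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(rest_x, rest_y)
        deleted_fit = intercept + slope * xs[i]
        out.append(ys[i] - deleted_fit)
    return out

# Hypothetical data: the last case lies 10 units above the line y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 20.0]
print([round(r, 2) for r in deleted_residuals(xs, ys)])
```

Deleting the unusual case stops it from pulling the fitted line towards itself, so its deleted residual shows the full 10-unit departure.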
Application: Do humans conform?
With 63 cases, we do not expect to see any cases with residuals exceeding 3 standard deviations.
On the other hand, recalling the scatter plot, the humans do not appear particularly exceptional. The dotplot view of deleted residuals emphasises this:
[Figure: Dotplot of deleted residuals, with Human and Water Opossum labelled]
Water opossums appear more exceptional.
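The statement about 63 cases can be made concrete under the assumption that standardised residuals are approximately standard normal. The sketch computes the two-sided probability of exceeding 3 standard deviations and the expected number of such cases among 63. The variable names are illustrative.

```python
from statistics import NormalDist

n_cases = 63

# Probability that a standard normal value lies more than 3 SDs from zero.
p_beyond_3sd = 2 * (1 - NormalDist().cdf(3))

# Expected number of such cases among n_cases residuals.
expected_count = n_cases * p_beyond_3sd

print(f"P(|Z| > 3) = {p_beyond_3sd:.4f}, expected count = {expected_count:.2f}")
```

The expected count is well below one, so a residual beyond 3 standard deviations is unusual in a data set of this size.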
Application: Do humans conform?
[Figure: Probability Plot of Deleted Residuals (Deleted Residuals vs Score); AD 0.385, P-Value 0.383]
Optimising a nicotine extraction process
In determining the quantity of nicotine in different samples of tobacco, temperature is a key variable in optimising the extraction process. A study of this phenomenon involving analysis of 18 samples produced these data.
Optimising a nicotine extraction process
Regression Analysis: Nicotine versus Temperature
The regression equation is
Nicotine = 2.61 + 0.0247 Temperature

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 2.6086 | 0.2121 | 12.30 | 0.000 |
| Temperature | 0.024656 | 0.003579 | 6.89 | 0.000 |

S = 0.217412 R-Sq = 74.8%
Optimising a nicotine extraction process, quadratic fit
[Figure: Scatterplot of Nicotine vs Temperature]
Optimising a nicotine extraction process, quadratic fit
The regression equation is
Nicotine = 1.20 + 0.0767 Temperature - 0.000453 Temp-sqr

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 1.2041 | 0.6312 | 1.91 | 0.076 |
| Temperature | 0.07674 | 0.02257 | 3.40 | 0.004 |
| Temp-sqr | -0.0004529 | 0.0001943 | -2.33 | 0.034 |

S = 0.192398 R-Sq = 81.5%
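Since the stated aim is to optimise the extraction, one natural use of the quadratic fit is to locate the temperature at which the fitted curve peaks. The sketch below does this with the usual vertex formula −b1 / (2 × b2). This step is not shown on the slides and is offered only as an illustration; the names are made up for the example.

```python
# Coefficients from the quadratic fit: Nicotine = B0 + B1*T + B2*T^2.
B0, B1, B2 = 1.2041, 0.07674, -0.0004529

def fitted_nicotine(temperature):
    """Fitted nicotine value at a given temperature."""
    return B0 + B1 * temperature + B2 * temperature ** 2

# B2 is negative, so the parabola opens downward and its vertex is a maximum.
t_peak = -B1 / (2 * B2)

print(f"peak at T = {t_peak:.1f}, fitted nicotine = {fitted_nicotine(t_peak):.2f}")
```

The peak falls near the upper end of the temperature axis shown in the scatterplot (30 to 90), so the estimate rests on few nearby observations and should be read with caution.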
Optimising a nicotine extraction process, quadratic fit, case 5 excluded
The regression equation is
Nicotine = 1.21 + 0.0750 Temperature - 0.000419 Temp-sqr

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 1.2096 | 0.5129 | 2.36 | 0.033 |
| Temperature | 0.07504 | 0.01835 | 4.09 | 0.001 |
| Temp-sqr | -0.0004189 | 0.0001583 | -2.65 | 0.019 |

S = 0.156321 R-Sq = 88.6%
5. Other options
• Other functions,
– e.g., 1/Y, √Y, Y², etc.; the same for X
• Generalised linear models,
– choose a function of Y, a model for
• etc.
Reading
EM Section 6.7.1
Hamilton, Ch. 5
Extra Notes: More on log