curves_coefficients_cutoffs@measurments_sensitivity_specificity
TRANSCRIPT
7/25/2019
All Rights Reserved, Indian
Session Outline
Introduction to classification problems and discrete choice models.
Introduction to Logistic Regression.
Logistic function and Logit function.
Maximum Likelihood Estimator (MLE) for estimation of LR parameters.
Examples: Challenger Shuttle, German Bank Credit Rating.
Classification Problems
Classification is an important category of problems in which the decision maker would like to classify the customers into two or more groups.
Examples of Classification Problems:
Customer churn.
Credit rating (low, medium and high risk).
Employee attrition.
Fraud (classification of a transaction as fraud/non-fraud).
Outcome of any binomial and multinomial experiment.
Logistic Regression attempts to classify customers into different categories
[Figure: illustrative images labelled "Always Cheerful" and "Always Sad"]
Discrete Choice Models
Problems involving discrete choices available to the decision maker.
Discrete choice models (in business) examine which alternative is chosen by a customer, and why.
Most discrete choice models estimate the probability that a customer chooses a particular alternative from several possible alternatives.
Logistic Regression
Classification Problem
Discrete Choice Model
Probability of an event
Logistic Regression - Introduction
The name logistic regression emerges from the logistic function.
Mathematically, logistic regression attempts to estimate the conditional probability of an event:

P(Y = 1) = e^Z / (1 + e^Z)
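The formula above can be checked in a few lines of Python (an illustrative sketch, not part of the slides):

```python
import math

def logistic(z):
    """P(Y = 1) = e^z / (1 + e^z): maps any real z into (0, 1)."""
    return math.exp(z) / (1 + math.exp(z))

print(logistic(0))   # 0.5
print(logistic(4))   # about 0.982
print(logistic(-4))  # about 0.018
```

Whatever the value of Z, the output always lies strictly between 0 and 1, which is what makes the function suitable for modelling a probability.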
Logistic Regression
Logistic regression models estimate how the probability of an event may be affected by one or more explanatory variables.
Logistic Function (Sigmoidal function)

z = β0 + β1 x1 + β2 x2 + ... + βn xn

π(z) = e^z / (1 + e^z)
Logistic Regression with one Explanatory Variable

P(Y = 1 | X = x) = π(x) = exp(β0 + β1 x) / (1 + exp(β0 + β1 x))

β1 = 0 implies that P(Y | x) is the same for each value of x
β1 > 0 implies that P(Y | x) increases as the value of x increases
β1 < 0 implies that P(Y | x) decreases as the value of x increases
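The three cases for the slope coefficient can be verified directly (illustrative Python; the values of β0 and β1 are made up):

```python
import math

def p_event(x, b0, b1):
    """P(Y = 1 | x) for a single explanatory variable."""
    z = b0 + b1 * x
    return math.exp(z) / (1 + math.exp(z))

# b1 = 0: P(Y | x) is the same for every x
# b1 > 0: P(Y | x) increases with x
# b1 < 0: P(Y | x) decreases with x
print(p_event(1, 0.5, 0.0) == p_event(9, 0.5, 0.0))   # True
print(p_event(9, 0.5, 0.8) > p_event(1, 0.5, 0.8))    # True
print(p_event(9, 0.5, -0.8) < p_event(1, 0.5, -0.8))  # True
```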
Logit Function
The Logit function is the logarithmic transformation of the logistic function. It is defined as the natural logarithm of odds.
The Logit of π is given by:

Logit(π) = ln( π / (1 - π) ) = β0 + β1 x

odds = π / (1 - π)
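A quick numerical check (illustrative, not from the slides) that the logit undoes the logistic function:

```python
import math

def logit(p):
    """Natural logarithm of the odds p / (1 - p)."""
    return math.log(p / (1 - p))

# logit is the inverse of the logistic transformation
z = 1.3
p = math.exp(z) / (1 + math.exp(z))
print(round(logit(p), 6))  # 1.3
```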
Logistic regression
More robust: the error terms need not be normally distributed.
No requirement of equal variance for the error terms (homoscedasticity).
No requirement of a linear relationship between the dependent and independent variables.
Estimation of parameters in Logistic Regression
Estimation of parameters in logistic regression is carried out using the Maximum Likelihood Estimation (MLE) technique.
No closed form solution exists for estimation of the regression parameters of logistic regression.
Maximum Likelihood Estimator (MLE)
MLE is a statistical technique for estimating the parameters of a model.
For a given data set, the MLE chooses the values of the model parameters that make the observed data more likely than any other parameter values would.
Likelihood Function
The likelihood function L(θ) represents the joint probability of observing the data that have been collected.
MLE chooses the estimator of the set of unknown parameters θ that maximizes the likelihood function L(θ).
Maximum Likelihood Estimator
Assume that x1, x2, ..., xn are sample observations of a random variable with density function f(x, θ), where θ is an unknown parameter.
The likelihood function is L(θ) = f(x1, x2, ..., xn; θ), the joint probability density function of the sample.
The value θ* which maximizes L(θ) is called the maximum likelihood estimator of θ.
Example: Exponential Distribution
Let x1, x2, ..., xn be sample observations that follow an exponential distribution with parameter λ. That is:

f(x, λ) = λ e^(-λx)

The likelihood function is given by (assuming independence):

L(λ) = f(x1, λ) f(x2, λ) ... f(xn, λ) = λ e^(-λ x1) λ e^(-λ x2) ... λ e^(-λ xn) = λ^n e^(-λ Σ xi)
Log-likelihood function
The log-likelihood function is given by:

ln(L(x, λ)) = n ln λ - λ Σ xi

The optimal λ* is obtained by setting the derivative to zero:

d/dλ [ ln(L(x, λ)) ] = n/λ - Σ xi = 0, which gives λ* = n / Σ xi = 1 / x̄
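The closed-form result λ* = 1 / x̄ can be confirmed numerically; the sketch below uses made-up sample values and a coarse grid search over λ (illustrative only, not from the slides):

```python
import math

sample = [0.8, 1.1, 2.3, 0.5, 1.9, 0.4]   # made-up observations
lam_mle = len(sample) / sum(sample)       # closed form: n / sum(xi) = 1 / x-bar

def log_likelihood(lam):
    # ln L = n ln(lambda) - lambda * sum(xi)
    return len(sample) * math.log(lam) - lam * sum(sample)

# A coarse grid search over lambda finds the same maximiser
grid = [i / 1000 for i in range(1, 5000)]
lam_grid = max(grid, key=log_likelihood)
print(round(lam_mle, 3), round(lam_grid, 3))  # 0.857 0.857
```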
Likelihood Function for Binary Logistic Regression
The probability density function for binary logistic regression is given by:

f(yi) = πi^yi (1 - πi)^(1 - yi)

L(β) = f(y1, y2, ..., yn) = Π(i=1..n) πi^yi (1 - πi)^(1 - yi)

ln(L(β)) = Σ(i=1..n) yi ln(π(xi)) + Σ(i=1..n) (1 - yi) ln(1 - π(xi))
Likelihood Function for Binary Logistic Regression

ln(L(β)) = Σ(i=1..n) yi (β0 + β1 xi) - Σ(i=1..n) ln(1 + exp(β0 + β1 xi))
Estimation of LR parameters
∂ln(L(β0, β1))/∂β0 = Σ(i=1..n) [ yi - exp(β0 + β1 xi) / (1 + exp(β0 + β1 xi)) ] = 0

∂ln(L(β0, β1))/∂β1 = Σ(i=1..n) [ xi yi - xi exp(β0 + β1 xi) / (1 + exp(β0 + β1 xi)) ] = 0

The above system of equations is solved iteratively to estimate β0 and β1.
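One common iterative scheme for these two score equations is Newton-Raphson. The sketch below, on made-up data (the data, function name and iteration count are illustrative, not from the slides), drives both equations to zero:

```python
import math

# Made-up data: outcome y against a single covariate x (illustrative only)
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 1, 0, 1, 1]

def fit_logistic(xs, ys, iters=25):
    """Newton-Raphson on the two score equations for (beta0, beta1)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        ps = [1 / (1 + math.exp(-(b0 + b1 * x))) for x in xs]
        g0 = sum(y - p for y, p in zip(ys, ps))               # d lnL / d b0
        g1 = sum(x * (y - p) for x, y, p in zip(xs, ys, ps))  # d lnL / d b1
        w = [p * (1 - p) for p in ps]                         # IRLS weights
        h00 = sum(w)                                          # negative Hessian
        h01 = sum(x * wi for x, wi in zip(xs, w))
        h11 = sum(x * x * wi for x, wi in zip(xs, w))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det                     # Newton step
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

b0, b1 = fit_logistic(xs, ys)
ps = [1 / (1 + math.exp(-(b0 + b1 * x))) for x in xs]
# Both score equations are (numerically) zero at the optimum
print(abs(sum(y - p for y, p in zip(ys, ps))) < 1e-6)  # True
```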
Limitations of MLE
The maximum likelihood estimator may not be unique, or may not exist.
A closed form solution may not exist in many cases; one may have to use an iterative procedure to estimate the parameter values.
Challenger Data

Flt       Temp  Damage
STS-1      66   No
STS-2      70   Yes
STS-3      69   No
STS-4      80   No
STS-5      68   No
STS-6      67   No
STS-7      72   No
STS-8      73   No
STS-9      70   No
STS-41B    57   Yes
STS-41C    63   Yes
STS-41D    70   Yes
STS-41G    78   No
STS-51-A   67   No
STS-51-C   53   Yes
STS-51-D   67   No
STS-51-B   75   No
STS-51-G   70   No
STS-51-F   81   No
STS-51-I   76   No
STS-51-J   79   No
STS-61-A   75   Yes
STS-61-B   76   No
STS-61-C   58   Yes
Challenger launch temperature vs damage data
Logistic Regression of challenger data
Let:
Yi = 0 denote no damage
Yi = 1 denote damage to the O-ring
P(Yi = 1) = πi and P(Yi = 0) = 1 - πi
We predict P(Yi = 1 | xi), where xi = launch temperature.
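Using the Challenger data from the earlier table, the model can be fitted with plain gradient ascent on the log-likelihood. This is an illustrative stdlib-only sketch (a statistics package would normally be used; the learning rate, iteration count and standardisation step are my choices, not from the slides):

```python
import math

# Challenger data from the slides: launch temperature and damage (1 = Yes)
temps = [66, 70, 69, 80, 68, 67, 72, 73, 70, 57, 63, 70,
         78, 67, 53, 67, 75, 70, 81, 76, 79, 75, 76, 58]
damage = [0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1,
          0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1]

def sigmoid(z):
    # Numerically stable logistic function
    return 1 / (1 + math.exp(-z)) if z >= 0 else math.exp(z) / (1 + math.exp(z))

# Standardise temperature, then maximise the log-likelihood by gradient ascent
m = sum(temps) / len(temps)
s = (sum((t - m) ** 2 for t in temps) / len(temps)) ** 0.5
zs = [(t - m) / s for t in temps]

b0 = b1 = 0.0
for _ in range(50000):
    ps = [sigmoid(b0 + b1 * z) for z in zs]
    b0 += 0.05 * sum(y - p for y, p in zip(damage, ps))
    b1 += 0.05 * sum(z * (y - p) for z, y, p in zip(zs, damage, ps))

slope = b1 / s                 # back on the original temperature scale
intercept = b0 - b1 * m / s
# The slides' SPSS run reports intercept 15.297 and slope -0.236
print(round(intercept, 2), round(slope, 3))
```

The negative slope means that colder launches have a higher estimated probability of O-ring damage, which is the conclusion the slides draw.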
Logistic Regression using SPSS
Dependent variable:
In binary logistic regression, the dependent variable can take only two values.
In multinomial logistic regression, the dependent variable can take three or more values (but not continuous values).
Covariate:
All independent (predictor) variables are entered as covariates.
ln( πi / (1 - πi) ) = 15.297 - 0.236 Xi

Variables in the Equation (Step 1; variable entered on step 1: LaunchTemperature):

                    B        S.E.    Wald    df   Sig.
LaunchTemperature   -.236    .107    4.832   1    .028
Constant            15.297   7.329   4.357   1    .037

P(Yi = 1) = exp(15.297 - 0.236 Xi) / (1 + exp(15.297 - 0.236 Xi))
Challenger: Probability of failure estimate

πi = exp(15.297 - 0.236 Xi) / (1 + exp(15.297 - 0.236 Xi))

[Figure: estimated probability of O-ring damage plotted against launch temperature (0 to 120), probability on the vertical axis from 0 to 1.2]
Classification table from SPSS
Accuracy Paradox
Assume an example of insurance fraud. Past data has revealed that out of 1000 claims in the past, 950 are true claims and 50 are fraudulent claims.
The classification table using a logistic regression model is given below:

Observed   Predicted 0   Predicted 1   % accuracy
0          900           50            94.73%
1          5             45            90.00%

The overall accuracy is 94.5%. Not predicting fraud at all will give 95% accuracy!
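The paradox is plain arithmetic, as this short sketch of the table above shows:

```python
# Confusion matrix from the insurance fraud example (rows = observed)
tn, fp = 900, 50   # observed 0 (true claims): 950 in total
fn, tp = 5, 45     # observed 1 (fraud): 50 in total

overall = (tn + tp) / 1000
naive = 950 / 1000          # classify every claim as "not fraud"
print(overall, naive)       # 0.945 0.95
```

The model catches 45 of the 50 frauds, yet the useless "never fraud" rule scores higher on overall accuracy; this is why accuracy alone is a poor yardstick on imbalanced data.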
ODDS and ODDS RATIO
ODDS: ratio of two probability values:

odds = π / (1 - π)

ODDS Ratio: ratio of two odds.
ODDS RATIO
Assume that X is an independent variable (covariate). The odds ratio, OR, is defined as the ratio of the odds for X = 1 to the odds for X = 0. The odds ratio is given by:

OR = [ π(1) / (1 - π(1)) ] / [ π(0) / (1 - π(0)) ]
Interpretation of Beta Coefficient in LR
ln( π(x) / (1 - π(x)) ) = β0 + β1 x                      (1)

For x = 0:  ln( π(0) / (1 - π(0)) ) = β0                 (2)

For x = 1:  ln( π(1) / (1 - π(1)) ) = β0 + β1            (3)

β1 = ln[ ( π(1) / (1 - π(1)) ) / ( π(0) / (1 - π(0)) ) ]
Interpretation of LR coefficients
ln[ ( π(x+1) / (1 - π(x+1)) ) / ( π(x) / (1 - π(x)) ) ] = β1 = change in ln odds ratio

( π(x+1) / (1 - π(x+1)) ) / ( π(x) / (1 - π(x)) ) = e^β1 = change in odds ratio
Odds Ratio for Binary Logistic Regression

OR = [ π(1) / (1 - π(1)) ] / [ π(0) / (1 - π(0)) ] = e^β1

If OR = 2, then the event is twice as likely to occur when X = 1 compared to X = 0.
The odds ratio approximates the relative risk.
Interpretation of LR coefficients
β1 is the change in the log-odds ratio for a unit change in the explanatory variable.
A unit change in the explanatory variable changes the odds ratio by a factor exp(β1).
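Applying this to the Challenger coefficient reported earlier in the slides (a one-line illustrative check):

```python
import math

b1 = -0.236                  # LaunchTemperature coefficient from the SPSS output
odds_factor = math.exp(b1)   # change in odds per 1 degree F increase
# Each extra degree multiplies the odds of damage by about 0.79,
# i.e. roughly a 21% drop in the odds per degree
print(round(odds_factor, 2))  # 0.79
```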
Session Outline
Measuring the fitness of the Logistic Regression Model.
Testing individual regression parameters (Wald's test).
Omnibus test for overall model fitness.
Hosmer-Lemeshow goodness of fit test.
R2 in Logistic Regression.
Confidence intervals for parameters and probabilities.
Wald Test
The Wald test is used to check the significance of individual variables (similar to the t-statistic in linear regression).
The Wald test statistic is given by:

W = ( βi / SE(βi) )^2

W is a chi-square statistic.
Wald test hypothesis
Null Hypothesis H0: βi = 0
Alternative Hypothesis H1: βi ≠ 0
Wald Test Challenger Data
The explanatory variable is significant, since the p value is less than 0.05.
Wald Test Challenger Data
For significant variables, the confidence interval of Exp(β) will not contain 1.
All Rights Reserved, Indian
Hosmer-Lemeshow Goodness of Fit
A test for the overall fitness of the model for a binary logistic regression (similar to the chi-square goodness of fit test).
The observations are grouped into 10 groups based on their predicted probabilities.
Hosmer-Lemeshow Test Statistic
The Hosmer-Lemeshow test statistic is given by:

C = Σ(k=1..g) (Ok - nk π̄k)^2 / ( nk π̄k (1 - π̄k) )

g  = number of groups
nk = number of observations in each group
Ok = sum of the y values for the kth group
π̄k = the average predicted probability in the kth group
H-L test for Challenger Data
Hosmer and Lemeshow Test (Step 1): Chi-square = 9.396, df = 8, Sig. = .310

Since p (= 0.310) is more than 0.05, we accept the null hypothesis that there is no difference between the predicted and observed frequencies (accept the model).
Classification Table
                          Observed 1 (Positive): 7     Observed 0 (Negative): 17
Predicted 1 (Positive)    4  [True Positive, TP]       0  [False Positive, FP]
Predicted 0 (Negative)    3  [False Negative, FN]      17 [True Negative, TN]

Sensitivity = TP / (TP + FN) = 4 / 7 = 0.571
Specificity = TN / (TN + FP) = 17 / 17 = 1.00
Sensitivity and Specificity

Sensitivity = Number of true positives / (Number of true positives + Number of false negatives)

Specificity = Number of true negatives / (Number of true negatives + Number of false positives)

Sensitivity is the probability that the predicted value of y is 1, given that the observed value is 1.
Specificity is the probability that the predicted value of y is 0, given that the observed value is 0.
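The two measures follow directly from the Challenger classification table above:

```python
# Counts from the Challenger classification table
tp, fn = 4, 3    # 7 observed positives (damage)
tn, fp = 17, 0   # 17 observed negatives (no damage)

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print(round(sensitivity, 3), round(specificity, 3))  # 0.571 1.0
```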
Receiver Operating Characteristic (ROC) Curve
The ROC curve plots the true positive rate (sensitivity) against the false positive rate (1 - specificity) and compares the classifier with random classification.
The higher the area under the ROC curve, the better the predictive ability of the model.
Challenger Example: Sensitivity vs 1 - Specificity
(True positive vs False positive)
Cut-off Value   Sensitivity   Specificity   1 - Specificity
0.05            1             0.235         0.765
0.1             0.857         0.412         0.588
0.2             0.857         0.529         0.471
0.3             0.571         0.706         0.294
0.4             0.571         0.941         0.059
0.5             0.571         1             0
0.6             0.571         1             0
0.7             0.429         1             0
0.8             0.429         1             0
0.9             0.143         1             0
0.95            0             1             0
Area Under the ROC Curve
The area under the ROC curve is interpreted as the probability that the model will rank a randomly chosen positive higher than a randomly chosen negative.
If n1 is the number of positives (1s) and n2 is the number of negatives (0s), then the area under the ROC curve is the proportion, out of all n1 x n2 possible (positive, negative) pairs, in which the positive receives a higher predicted probability than the negative.
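The pairwise interpretation can be computed directly. The predicted probabilities below are made up for illustration (not the Challenger values):

```python
# Made-up predicted probabilities to illustrate the pairwise view of AUC
pos_scores = [0.9, 0.7, 0.4]          # predicted probabilities of observed 1s
neg_scores = [0.2, 0.5, 0.1, 0.3]     # predicted probabilities of observed 0s

pairs = [(p, n) for p in pos_scores for n in neg_scores]
# Count a pair as 1 if the positive outranks the negative, 0.5 on ties
auc = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs) / len(pairs)
print(auc)  # 11 of the 12 pairs are ranked correctly: 0.9166...
```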
ROC Curve
General rule for acceptance of the model:
If the area under the ROC curve is:
ROC area = 0.5                No discrimination
0.7 <= ROC area < 0.8         Acceptable discrimination
0.8 <= ROC area < 0.9         Excellent discrimination
ROC area >= 0.9               Outstanding discrimination
Optimal Cut-off probabilities
Using classification plots.
Youden's Index.
Cost based optimization.
Classification Plots
Step number: 1
Observed Groups and Predicted Probabilities
4 + 1
I 1
I 1
F I 1
R 3 + 1 0
E I 1 0
Q I 1 0
U I 1 0
E 2 + 0 0 1 0 0
N I 0 0 1 0 0
C I 0 0 1 0 0
Y I 0 0 1 0 0
1 + 000 0 0 0 0 0 0 0 0 0 1 1 1
I 000 0 0 0 0 0 0 0 0 0 1 1 1
I 000 0 0 0 0 0 0 0 0 0 1 1 1
I 000 0 0 0 0 0 0 0 0 0 1 1 1
Predicted ---------+---------+---------+---------+---------+---------+---------+---------+---------+-
Prob: 0 .1 .2 .3 .4 .5 .6 .7 .8 .9
Group: 0000000000000000000000000000000000000000000000000011111111111111111111111111111111111111111
Youden's Index
Youden's index is a measure of diagnostic accuracy.
Youden's index is calculated by deducting 1 from the sum of sensitivity and specificity:

Youden's Index: J(p) = Sensitivity(p) + Specificity(p) - 1
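Applying Youden's index to the Challenger cut-off table given earlier picks out the cut-off that best balances sensitivity and specificity (an illustrative sketch using the slide values):

```python
# Cut-off table from the Challenger slides: (cutoff, sensitivity, specificity)
table = [(0.05, 1.0, 0.235), (0.1, 0.857, 0.412), (0.2, 0.857, 0.529),
         (0.3, 0.571, 0.706), (0.4, 0.571, 0.941), (0.5, 0.571, 1.0),
         (0.6, 0.571, 1.0), (0.7, 0.429, 1.0), (0.8, 0.429, 1.0),
         (0.9, 0.143, 1.0), (0.95, 0.0, 1.0)]

def youden(sens, spec):
    # J(p) = Sensitivity(p) + Specificity(p) - 1
    return sens + spec - 1

best = max(table, key=lambda row: youden(row[1], row[2]))
print(best[0], round(youden(best[1], best[2]), 3))  # 0.5 0.571
```

The index is maximised (J = 0.571) from cut-off 0.5 onwards, where specificity reaches 1 while sensitivity is still 0.571.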