
Lab 6: Logistic Regression and Metrics

Department of Computer Science, National Tsing Hua University, Taiwan

2020.10.08

Outline

• Brief Review: Logistic Regression

- Maximum Likelihood in Logistic Regression

- Implementation

• Common Evaluation Metrics for Binary Classification

- Confusion Matrix

- Soft Classifiers - ROC Curve

Maximum Likelihood

• Flipping a coin: we already know the ground-truth distribution. For example, $P(x = \text{head}) = 1/2$ and $P(x = \text{tail}) = 1/2$.

[Figure: bar chart of $P(x)$ for head and tail, y-axis from 0 to 1; sample flips: H, T, H, H, T, T, …]

• However, in many tasks the ground-truth distribution is never known, e.g., the probability distribution of getting COVID-19.

[Figure: bar chart with y-axis from 0 to 1 and x-axis labels head and tail, captioned "Is Patient?"]

Maximum Likelihood

• The process to approximate the distribution:

- First, we assume the number of people diagnosed with a disease follows a Binomial distribution, e.g., $X \sim \mathrm{Bin}(A, \rho)$, where $A$ is the number of people tested (the number of Bernoulli trials) and $\rho$ is the illness rate.

- If there are 4 patients out of 10 people, the number of Bernoulli trials would be 10, i.e., $X \sim \mathrm{Bin}(10, \rho)$, so

$$P(X = 4 \mid \rho) = C^{10}_{4}\, \rho^{4} (1 - \rho)^{10 - 4}$$

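Maximum Likelihood

$$P(X = 4 \mid \rho) = C^{10}_{4}\, \rho^{4} (1 - \rho)^{10 - 4}$$

[Figure: plot of the likelihood $P(X = 4 \mid \rho)$ (y-axis) against $\rho$ (x-axis)]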
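As a rough illustration of how the likelihood above picks out an illness rate, the following sketch evaluates $P(X = 4 \mid \rho)$ on a grid of candidate values and keeps the largest. The function name `binomial_likelihood` and the grid resolution are assumptions made for this example, not part of the lab.

```python
from math import comb

def binomial_likelihood(rho, n=10, k=4):
    """P(X = k | rho) for X ~ Bin(n, rho)."""
    return comb(n, k) * rho**k * (1 - rho)**(n - k)

# Evaluate the likelihood on a grid of candidate illness rates (step 0.01).
grid = [i / 100 for i in range(101)]
likelihoods = [binomial_likelihood(r) for r in grid]

# The grid point with the largest likelihood approximates the
# maximum likelihood estimate of rho.
rho_hat = grid[likelihoods.index(max(likelihoods))]
print(rho_hat)
```

For 4 patients out of 10 people, the grid search lands on the observed proportion 4/10, which matches the closed-form maximizer $k/n$ of the Binomial likelihood.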


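Logistic Regression

• In logistic regression, we maximize the log-likelihood instead:

$$\arg\max_{w} \log P(\mathbb{X} \mid w)$$

• Update with gradient ascent on the log-likelihood (equivalently, gradient descent on the negative log-likelihood):

$$w^{(t+1)} = w^{(t)} + \eta \nabla_{w} \log P(\mathbb{X} \mid w^{(t)})$$

where

$$\nabla_{w} \log P(\mathbb{X} \mid w^{(t)}) = \sum_{i=1}^{N} \left[ y'^{(i)} - \sigma\!\left(w^{(t)\top} x^{(i)}\right) \right] x^{(i)}, \qquad y' = \frac{y + 1}{2}$$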
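The update rule can be sketched in NumPy as follows. This is a minimal sketch, not the lab's reference implementation: it assumes labels in {-1, +1} (as implied by $y' = (y + 1)/2$), steps in the direction that increases the log-likelihood, and uses made-up toy data. The names `fit`, `eta`, and `n_iters` are illustrative choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood_gradient(w, X, y):
    """Sum over examples of [y' - sigma(w^T x)] x, with y' = (y + 1) / 2."""
    y_prime = (y + 1) / 2          # map labels {-1, +1} to {0, 1}
    return X.T @ (y_prime - sigmoid(X @ w))

def fit(X, y, eta=0.1, n_iters=500):
    """Plain batch updates; eta and n_iters are illustrative choices."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        # Step along the gradient so that the log-likelihood increases.
        w = w + eta * log_likelihood_gradient(w, X, y)
    return w

# Toy data (made up for illustration): a bias column plus one feature.
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([-1, -1, 1, 1])
w = fit(X, y)
```

On linearly separable data such as this toy set, the weights keep growing with more iterations, so in practice a stopping criterion or regularization is usually added.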

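Logistic Regression

• Soft prediction:

$$P(y \mid x; w) = \sigma(w^{\top} x)^{y'} \left[ 1 - \sigma(w^{\top} x) \right]^{(1 - y')}$$

• Label prediction:

$$\arg\max_{y} \left\{ \sigma(w^{\top} x),\ 1 - \sigma(w^{\top} x) \right\} = \mathrm{sign}(w^{\top} x)$$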
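The two kinds of prediction can be sketched as below, assuming labels in {-1, +1} and an already-trained weight vector. The weights and inputs here are hypothetical values chosen only to show the calls, and the tie $w^{\top} x = 0$ is assigned to the positive class as an arbitrary convention.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_soft(w, X):
    """Soft prediction: P(y = +1 | x; w) = sigma(w^T x) for each row of X."""
    return sigmoid(X @ w)

def predict_label(w, X):
    """Label prediction: +1 where w^T x >= 0 (sigma >= 0.5), otherwise -1."""
    return np.where(X @ w >= 0, 1, -1)

# Hypothetical weights and inputs, chosen only for illustration.
w = np.array([0.0, 1.5])
X = np.array([[1.0, -2.0], [1.0, 0.5]])
probs = predict_soft(w, X)
labels = predict_label(w, X)
```

Thresholding the soft prediction at 0.5 gives the same labels as taking the sign of $w^{\top} x$, because $\sigma(z) \ge 0.5$ exactly when $z \ge 0$.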


Confusion Matrix

• Aside from accuracy, it is important to know how the model makes wrong predictions.

• In binary classification, the confusion matrix is a common tool to analyze the predictions.

[Figure: confusion matrix, annotated with "Ground truth" and "Positive predictions of your model"]




• Many metrics are derived from the confusion matrix, e.g.,

$$\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad \mathrm{FPR} = \frac{FP}{FP + TN}$$
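To make the definitions concrete, here is a small sketch that counts the four confusion-matrix cells and derives TPR and FPR from them. It assumes labels in {-1, +1} with +1 as the positive class. The helper `confusion_counts` and the sample labels are made up for illustration.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for labels in {-1, +1}, with +1 as positive."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == -1) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == -1)))
    tn = int(np.sum((y_true == -1) & (y_pred == -1)))
    return tp, fp, fn, tn

# Made-up labels and predictions for illustration.
y_true = [1, 1, 1, -1, -1, -1, -1, 1]
y_pred = [1, -1, 1, -1, 1, -1, -1, 1]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)

tpr = tp / (tp + fn)                        # true positive rate
fpr = fp / (fp + tn)                        # false positive rate
accuracy = (tp + tn) / (tp + fp + fn + tn)
```

With these sample values the counts are TP = 3, FP = 1, FN = 1, TN = 3, so TPR = 0.75 and FPR = 0.25.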


ROC Curve

• The ROC curve analyzes the performance at every threshold of a soft classifier.

• X-axis: FPR

• Y-axis: TPR

$$\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad \mathrm{FPR} = \frac{FP}{FP + TN}$$
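One way to trace such a curve, sketched under the assumption of labels in {-1, +1} and real-valued scores where larger means "more positive", is to sweep the threshold over the distinct scores and record one (FPR, TPR) point per threshold. The function `roc_points` and the sample scores are illustrative. Library routines such as scikit-learn's `roc_curve` serve the same purpose.

```python
import numpy as np

def roc_points(y_true, scores):
    """Return arrays (fpr, tpr), one point per threshold, for labels in {-1, +1}."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    # Start above every score (nothing predicted positive), then lower the
    # threshold through each distinct score in decreasing order.
    thresholds = np.concatenate(([np.inf], np.unique(scores)[::-1]))
    n_pos = np.sum(y_true == 1)
    n_neg = np.sum(y_true == -1)
    fpr, tpr = [], []
    for thr in thresholds:
        pred_pos = scores >= thr
        tpr.append(np.sum(pred_pos & (y_true == 1)) / n_pos)
        fpr.append(np.sum(pred_pos & (y_true == -1)) / n_neg)
    return np.array(fpr), np.array(tpr)

# Made-up soft predictions for illustration.
y_true = [1, 1, -1, 1, -1, -1]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
fpr, tpr = roc_points(y_true, scores)

# Area under the curve by the trapezoidal rule.
auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))
```

The curve starts at (0, 0) and ends at (1, 1). The area under it summarizes all thresholds in a single number, with values closer to 1 indicating better separation of the two classes.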

ROC Curve

• What is the best ROC curve?

Homework

• Homework: Lab 6

- Lab 6: Logistic Regression, Metrics

• Bonus: Lab 7, Lab 8

- Lab 7: Support Vector Machine, k-Nearest Neighbors

- Lab 8: Cross Validation, Ensemble

• Deadline: 10/20 23:59 (Tue)

- Due to the heavy workload, we have extended the deadline.

