Lab 6: Logistic Regression and Metrics, Department of Computer Science, National Tsing Hua University, Taiwan, 2020.10.08


  • Lab 6: Logistic Regression and Metrics

    Department of Computer Science, National Tsing Hua University, Taiwan

    2020.10.08

  • Outline

    • Brief Review: Logistic Regression

    - Maximum likelihood in Logistic Regression

    - Implementation

    • Common Evaluation Metrics for Binary Classification

    - Confusion Matrix

    - Soft Classifiers - ROC Curve


  • Maximum Likelihood

    • Flipping a coin: we already know the ground-truth distribution. For example, $P(x = \text{head}) = 1/2$ and $P(x = \text{tail}) = 1/2$.

    • However, in many tasks the ground-truth distribution is never known, e.g., the probability distribution of getting COVID-19.

    [Figure: bar chart with a probability axis from 0 to 1 and bars for "head" and "tail"; sample flips H, T, H, H, T, T, …; second panel labeled "Is Patient?"]

  • Maximum Likelihood

    • The process to approximate the distribution:

    - First, we assume the number of people found to have a disease follows a binomial distribution, e.g., $X \sim \mathrm{Bin}(A, \rho)$, where $A$ is the number of people examined (the number of Bernoulli trials) and $\rho$ is the illness rate.

    - If there are 4 patients out of 10 people, the number of Bernoulli trials would be 10, i.e., $X \sim \mathrm{Bin}(10, \rho)$, and the likelihood of this observation is

      $$P(X = 4 \mid \rho) = C^{10}_{4}\, \rho^{4} (1 - \rho)^{10 - 4}$$

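  • Maximum Likelihood

    $$P(X = 4 \mid \rho) = C^{10}_{4}\, \rho^{4} (1 - \rho)^{10 - 4}$$

    [Figure: plot of $P(X = 4 \mid \rho)$ (vertical axis) against $\rho$ (horizontal axis)]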
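To see where the likelihood of 4 patients out of 10 peaks, it can be evaluated numerically. The following is a minimal sketch, assuming a plain grid search over $\rho$; the helper name `likelihood` and the grid resolution are illustrative choices, not part of the lab material.

```python
from math import comb

def likelihood(rho, n=10, k=4):
    # P(X = k | rho) for X ~ Bin(n, rho)
    return comb(n, k) * rho**k * (1 - rho)**(n - k)

# Hypothetical grid of candidate illness rates: 0.00, 0.01, ..., 1.00
grid = [i / 100 for i in range(101)]

# The maximum likelihood estimate is the candidate with the largest likelihood.
best_rho = max(grid, key=likelihood)
print(best_rho, likelihood(best_rho))
```

Under these assumptions the search returns $\rho = 0.4$, which matches the observed proportion 4/10, with a likelihood of roughly 0.25.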


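  • Logistic Regression

    • In logistic regression, we maximize the log-likelihood instead:

      $$\arg\max_{w} \log P(\mathbb{X} \mid w)$$

    • Update with gradient ascent (equivalently, gradient descent on the negative log-likelihood):

      $$w^{(t+1)} = w^{(t)} + \eta \nabla_{w} \log P(\mathbb{X} \mid w^{(t)})$$

      where

      $$\nabla_{w} \log P(\mathbb{X} \mid w^{(t)}) = \sum_{i=1}^{N} \left[ y'^{(i)} - \sigma\!\left(w^{(t)\top} x^{(i)}\right) \right] x^{(i)}, \qquad y' = \frac{y + 1}{2}$$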

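  • Logistic Regression

    • Soft prediction:

      $$P(y \mid x; w) = \sigma(w^{\top} x)^{y'} \left[ 1 - \sigma(w^{\top} x) \right]^{(1 - y')}$$

    • Label prediction:

      $$\arg\max_{y} \left\{ \sigma(w^{\top} x),\; 1 - \sigma(w^{\top} x) \right\} = \mathrm{sign}(w^{\top} x)$$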
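The update rule and the two kinds of prediction can be sketched in a few lines of plain Python. This is a minimal illustration, not the lab's reference implementation: the function names (`fit_logistic`, `predict_proba`, `predict_label`), the toy data, the learning rate, and the number of epochs are all assumptions. Labels are taken to be in {-1, +1}, the weights are moved along the gradient of the log-likelihood, and a tie at $w^{\top}x = 0$ is arbitrarily mapped to +1.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def fit_logistic(X, y, eta=0.1, epochs=200):
    """Maximize the log-likelihood by gradient ascent; y holds labels in {-1, +1}."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x_i, y_i in zip(X, y):
            y_prime = (y_i + 1) / 2                 # map {-1, +1} to {0, 1}
            err = y_prime - sigmoid(dot(w, x_i))    # y' - sigma(w^T x)
            for j, xj in enumerate(x_i):
                grad[j] += err * xj
        # Step along the gradient because the objective is maximized.
        w = [wj + eta * gj for wj, gj in zip(w, grad)]
    return w

def predict_proba(w, x):
    # Soft prediction: estimated probability of the positive class.
    return sigmoid(dot(w, x))

def predict_label(w, x):
    # Label prediction: sign(w^T x), with ties mapped to +1 (assumption).
    return 1 if dot(w, x) >= 0 else -1

# Hypothetical toy data; the first feature is a constant 1.0 acting as the bias.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
y = [-1, -1, 1, 1]
w = fit_logistic(X, y)
print([predict_label(w, x) for x in X])
```

The soft prediction is kept separate from the label prediction because threshold-based metrics such as the ROC curve need the probability, not the final label.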


  • Confusion Matrix

    • Aside from accuracy, it is important to know how the model makes wrong predictions.

    • In binary classification, the confusion matrix is a common tool for analyzing the predictions.

    [Figure: confusion matrix diagram with the annotations "Ground truth" and "Positive predictions of your model"]

    • Many metrics are derived from the confusion matrix, e.g.,

      $$\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad \mathrm{FPR} = \frac{FP}{FP + TN}$$
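As a small worked example of these definitions, the four confusion-matrix counts and the two rates can be computed directly from label lists. This is a sketch under assumptions: the helper names `confusion_counts` and `tpr_fpr` and the example labels are made up for illustration, with +1 treated as the positive class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1   # predicted positive, actually positive
            else:
                fp += 1   # predicted positive, actually negative
        else:
            if t == positive:
                fn += 1   # predicted negative, actually positive
            else:
                tn += 1   # predicted negative, actually negative
    return tp, fp, tn, fn

def tpr_fpr(tp, fp, tn, fn):
    # Guard against division by zero when a class is absent.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr

# Hypothetical ground truth and predictions.
y_true = [1, 1, 1, -1, -1, -1, -1, 1]
y_pred = [1, -1, 1, -1, 1, -1, -1, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(tp, fp, tn, fn)            # 3 1 3 1
print(tpr_fpr(tp, fp, tn, fn))   # (0.75, 0.25)
```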


  • ROC Curve

    • The ROC curve analyzes the performance of a soft classifier at every threshold.

    • X-axis: FPR, $\mathrm{FPR} = \dfrac{FP}{FP + TN}$

    • Y-axis: TPR, $\mathrm{TPR} = \dfrac{TP}{TP + FN}$
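One way to make "every threshold" concrete is to sweep the threshold over the distinct scores of a soft classifier and record one (FPR, TPR) point per threshold. The sketch below assumes both classes are present, treats +1 as the positive class, and predicts positive when the score is at least the threshold; the function name `roc_points` and the example scores are hypothetical.

```python
def roc_points(y_true, scores, positive=1):
    """Return (FPR, TPR) points, from the strictest threshold to the loosest."""
    n_pos = sum(1 for t in y_true if t == positive)
    n_neg = len(y_true) - n_pos
    # A threshold above every score predicts nothing as positive.
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == positive)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t != positive)
        points.append((fp / n_neg, tp / n_pos))
    return points

# Hypothetical soft predictions, e.g. sigma(w^T x) for six samples.
y_true = [1, 1, -1, 1, -1, -1]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Plotting these points with FPR on the x-axis and TPR on the y-axis gives the ROC curve; the curve always starts at (0, 0) and ends at (1, 1).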

  • ROC Curve

    • What is the best ROC curve?

  • Homework

    • Homework: Lab 6

    - Lab 6: Logistic Regression, Metrics

    • Bonus: Lab 7, Lab 8

    - Lab 7: Support Vector Machine, k-Nearest Neighbors

    - Lab 8: Cross Validation, Ensemble

  • Homework

    • Deadline: 10/20 23:59 (Tue)

    - Due to the heavy workload, we have extended the deadline.

  • Reference

    • https://bookdown.org/ccwang/medical_statistics6/section-43.html

    • https://bookdown.org/ccwang/medical_statistics6/bernoulli.html

    • https://bookdown.org/ccwang/medical_statistics6/binomial.html

    • https://bookdown.org/ccwang/medical_statistics6/likelihood-definition.html

    • https://en.wikipedia.org/wiki/Sensitivity_and_specificity

    • https://github.com/dariyasydykova/open_projects/tree/master/ROC_animation
