quiz 3 practice problems solutions 1. - computer action teamweb.cecs.pdx.edu › ~mm ›...

Page 1: Quiz 3 Practice Problems Solutions 1. - Computer Action Teamweb.cecs.pdx.edu › ~mm › MachineLearningWinter2019 › ...following two attributes: Genre (Romance, Self-Help, or Thriller)

Quiz 3 Practice Problems

Solutions

1. Consider the table below, which reports, for eight instances, the actual class of each instance and the score given by a classifier to that instance.

Instance   Actual Class   Score
1          Positive        7
2          Positive        4
3          Negative        2
4          Negative        1
5          Negative       -1
6          Positive       -4
7          Negative       -5
8          Negative       -6

(a) For each value of the threshold in the table below, fill in the precision, recall (TPR), and false positive rate (FPR) of the classifier. (If the score is greater than or equal to the threshold, the classifier assigns “positive”; otherwise it assigns “negative”.)

Threshold   Precision   Recall (TPR)   FPR
 5          1/1         1/3            0/5
 3          2/2         2/3            0/5
 1          2/4         2/3            2/5
-3          2/5         2/3            3/5
-6          3/8         3/3            5/5
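The entries in the threshold table can be checked mechanically. The following is a minimal Python sketch, not part of the original quiz; the names `instances` and `metrics_at` are illustrative assumptions. It applies the rule stated in (a), score >= threshold means “positive”, and reports precision, recall, and FPR as exact fractions.

```python
from fractions import Fraction

# (actual class is positive?, score) for the eight instances in the quiz table
instances = [
    (True, 7), (True, 4), (False, 2), (False, 1),
    (False, -1), (True, -4), (False, -5), (False, -6),
]

def metrics_at(threshold):
    """Return (precision, recall, fpr) when score >= threshold is called positive."""
    tp = sum(1 for pos, s in instances if pos and s >= threshold)
    fp = sum(1 for pos, s in instances if not pos and s >= threshold)
    n_pos = sum(1 for pos, _ in instances if pos)
    n_neg = len(instances) - n_pos
    # Precision is undefined when nothing is predicted positive.
    precision = Fraction(tp, tp + fp) if tp + fp else None
    return precision, Fraction(tp, n_pos), Fraction(fp, n_neg)

for t in (5, 3, 1, -3, -6):
    precision, recall, fpr = metrics_at(t)
    print(t, precision, recall, fpr)
```

Note that `Fraction` prints values in lowest terms, so the table's 2/4 appears as 1/2 and 3/3 as 1.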

(b) Sketch a precision-recall curve with these five thresholds. Put precision on y-axis, and recall on x-axis. (Be sure to label the axes.)

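Solution: [Figure not reproduced. From the table in (a), the points to plot as (recall, precision) are (1/3, 1), (2/3, 1), (2/3, 1/2), (2/3, 2/5), and (1, 3/8).]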

(c) Sketch a ROC curve with these five thresholds. (Be sure to label the axes.)

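Solution: [Figure not reproduced. From the table in (a), the points to plot as (FPR, TPR), with FPR on the x-axis and TPR on the y-axis, are (0, 1/3), (0, 2/3), (2/5, 2/3), (3/5, 2/3), and (1, 1).]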

2. A rare genetic disease has recently been identified. The good news is that only one in 100,000 people has the disease, and the genetic test is extremely good: if you have the disease, the test will always be positive; if you don’t have the disease, the test will return a false positive only 2% of the time. The bad news is that you have tested positive for the disease. Using Bayes’ rule, compute the probability that you have the disease. (Show your work.)

Solution:

Data D = test is positive

P(Disease) = .00001

P(NotDisease) = .99999

P(D | Disease) = 1

P(D | NotDisease) = .02

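$$P(D) = P(D \mid \text{Disease})\,P(\text{Disease}) + P(D \mid \text{NotDisease})\,P(\text{NotDisease}) = 1(.00001) + (.02)(.99999) = .0200098$$

$$P(\text{Disease} \mid D) = \frac{P(D \mid \text{Disease})\,P(\text{Disease})}{P(D)} = \frac{1(.00001)}{.0200098} \approx .0005$$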
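The arithmetic in this solution can be reproduced in a few lines. This is a minimal Python sketch, not part of the original quiz; the variable names are illustrative assumptions, and the numbers come directly from the problem statement.

```python
# Quantities from the problem statement
p_disease = 1 / 100_000          # prior: one in 100,000 people
p_pos_given_disease = 1.0        # the test is always positive if you have the disease
p_pos_given_healthy = 0.02       # false-positive rate

# Total probability of a positive test, P(D)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' rule: P(Disease | D)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos

print(f"P(D) = {p_pos:.7f}")
print(f"P(Disease | D) = {p_disease_given_pos:.4f}")
```

Even with a test this accurate, the posterior stays near 1 in 2,000 because false positives among the healthy majority far outnumber true positives.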

3. You’ve been hired by Amazon to work on their recommendation system for books.

Your first task is to see if you can predict the recommendations of a single user, using the following two attributes: Genre (Romance, Self-Help, or Thriller) and Price (High, Medium, or Low). The classification is whether the user recommended the book or not. You collect the following training set.

Book Genre Price Class

B1 Romance Low Recommended

B2 Romance Medium Recommended

B3 Thriller Low Recommended

B4 Thriller High Recommended

B5 Self-Help Low Recommended

B6 Self-Help High Not Recommended

B7 Romance High Not Recommended

Show how a Naïve Bayes classifier, with Laplace smoothing of probabilities, would classify the following new example as “Recommended” or “Not Recommended”:

Book Genre Price Class

B8 Self-Help Medium

Show your work.

Solution:

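$$P(\text{Rec}) = \frac{5}{7} \qquad P(\text{NotRec}) = \frac{2}{7}$$

Each arrow below shows Laplace smoothing: add 1 to the count and 3 (the number of values the attribute can take) to the total.

$$P(\text{SelfHelp} \mid \text{Rec}) = \frac{1}{5} \rightarrow \frac{2}{8} \qquad P(\text{Medium} \mid \text{Rec}) = \frac{1}{5} \rightarrow \frac{2}{8}$$

$$P(\text{SelfHelp} \mid \text{NotRec}) = \frac{1}{2} \rightarrow \frac{2}{5} \qquad P(\text{Medium} \mid \text{NotRec}) = \frac{0}{2} \rightarrow \frac{1}{5}$$

$$\text{Rec: } \frac{5}{7} \cdot \frac{2}{8} \cdot \frac{2}{8} = .045 \qquad \text{NotRec: } \frac{2}{7} \cdot \frac{2}{5} \cdot \frac{1}{5} = .023$$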

Class is “Recommended”
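The same calculation can be sketched in Python. This is an illustration rather than part of the original solution: the names `books`, `N_VALUES`, and `score` are assumptions, and the sketch assumes, as the results .045 and .023 imply, that only the conditional probabilities are smoothed while the class priors are left as raw frequencies.

```python
from fractions import Fraction

# Training set from the problem: (genre, price, class)
books = [
    ("Romance", "Low", "Recommended"),
    ("Romance", "Medium", "Recommended"),
    ("Thriller", "Low", "Recommended"),
    ("Thriller", "High", "Recommended"),
    ("Self-Help", "Low", "Recommended"),
    ("Self-Help", "High", "Not Recommended"),
    ("Romance", "High", "Not Recommended"),
]
N_VALUES = 3  # Genre and Price each take three possible values

def score(genre, price, label):
    """Unsmoothed prior times Laplace-smoothed conditional probabilities."""
    rows = [b for b in books if b[2] == label]
    prior = Fraction(len(rows), len(books))
    # Laplace smoothing: add 1 to each count and N_VALUES to each total.
    p_genre = Fraction(sum(b[0] == genre for b in rows) + 1, len(rows) + N_VALUES)
    p_price = Fraction(sum(b[1] == price for b in rows) + 1, len(rows) + N_VALUES)
    return prior * p_genre * p_price

scores = {label: score("Self-Help", "Medium", label)
          for label in ("Recommended", "Not Recommended")}
prediction = max(scores, key=scores.get)

for label, s in scores.items():
    print(label, s, round(float(s), 3))
print("Predicted class:", prediction)
```

Exact fractions are used so the intermediate values can be compared digit for digit with the hand calculation before rounding.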

4. (a) Consider the following training data, where each instance x is described by two features, F1 and F2. The possible classes are POS and NEG.

Instance   F1   F2   Class
x1         7    1    POS
x2         6    2    POS
x3         8    3    POS
x4         4    6    NEG
x5         5    5    NEG
x6         6    4    NEG

Give the probabilistic model that Gaussian Naive Bayes would compute—that is, the prior class probabilities and parameters of the relevant Gaussian distributions.

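Hint: Recall that

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$

(b) Using the formula for a Gaussian distribution,

$$N(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$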

show how the Gaussian Naive Bayes classifier from part (a) would classify the instance

x7 = (8, 4).

Solution:

$$\mu_{F_1,\text{POS}} = \frac{7 + 6 + 8}{3} = 7$$

$$\sigma_{F_1,\text{POS}} = \sqrt{\frac{(7-7)^2 + (7-6)^2 + (7-8)^2}{3}} = .82$$

$$\mu_{F_2,\text{POS}} = \frac{1 + 2 + 3}{3} = 2$$

$$\sigma_{F_2,\text{POS}} = \sqrt{\frac{(2-1)^2 + (2-2)^2 + (2-3)^2}{3}} = .82$$

$$\mu_{F_1,\text{NEG}} = \frac{4 + 5 + 6}{3} = 5$$

$$\sigma_{F_1,\text{NEG}} = \sqrt{\frac{(5-4)^2 + (5-5)^2 + (5-6)^2}{3}} = .82$$

$$\mu_{F_2,\text{NEG}} = \frac{6 + 5 + 4}{3} = 5$$

$$\sigma_{F_2,\text{NEG}} = \sqrt{\frac{(5-6)^2 + (5-5)^2 + (5-4)^2}{3}} = .82$$
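The model parameters for part (a) can be recomputed with the standard library. This is a minimal Python sketch, not part of the original solution; the names `data` and `model` are illustrative assumptions. It uses `statistics.pstdev`, which divides by N, because the worked solution divides by 3 rather than 2.

```python
from statistics import mean, pstdev

# Training data from part (a): class -> list of (F1, F2)
data = {
    "POS": [(7, 1), (6, 2), (8, 3)],
    "NEG": [(4, 6), (5, 5), (6, 4)],
}
n_total = sum(len(rows) for rows in data.values())

model = {}
for label, rows in data.items():
    f1 = [r[0] for r in rows]
    f2 = [r[1] for r in rows]
    model[label] = {
        # Prior class probability: fraction of training instances in the class
        "prior": len(rows) / n_total,
        # (mean, population standard deviation) for each feature
        "F1": (mean(f1), pstdev(f1)),
        "F2": (mean(f2), pstdev(f2)),
    }

for label, params in model.items():
    print(label, params)
```

Each standard deviation is sqrt(2/3), which rounds to the .82 used in the solution.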

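x7 = (8, 4), Positive:

$$P(\text{POS}) = .5$$

$$\mathrm{PDF}(F_1 = 8 \mid \text{POS}) = \frac{1}{\sqrt{2\pi}\,(.82)}\, e^{-\frac{(8-7)^2}{2(.82)^2}} = .231$$

$$\mathrm{PDF}(F_2 = 4 \mid \text{POS}) = \frac{1}{\sqrt{2\pi}\,(.82)}\, e^{-\frac{(4-2)^2}{2(.82)^2}} = .025$$

$$\ln .5 + \ln .231 + \ln .025 = -5.85$$

x7 = (8, 4), Negative:

$$P(\text{NEG}) = .5$$

$$\mathrm{PDF}(F_1 = 8 \mid \text{NEG}) = \frac{1}{\sqrt{2\pi}\,(.82)}\, e^{-\frac{(8-5)^2}{2(.82)^2}} = .000603$$

$$\mathrm{PDF}(F_2 = 4 \mid \text{NEG}) = \frac{1}{\sqrt{2\pi}\,(.82)}\, e^{-\frac{(4-5)^2}{2(.82)^2}} = .231$$

$$\ln .5 + \ln .000603 + \ln .231 = -9.57$$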

Class is POSITIVE.
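The comparison of log scores can be checked with a short script. This is a minimal Python sketch, not part of the original solution; the names `SIGMA`, `means`, `priors`, `gaussian_pdf`, and `log_score` are illustrative assumptions. It uses the rounded value σ = .82 from the solution rather than the exact sqrt(2/3), so the numbers line up with the hand calculation.

```python
import math

SIGMA = 0.82  # rounded standard deviation used for every feature and class
means = {"POS": (7, 2), "NEG": (5, 5)}   # (mu_F1, mu_F2) per class
priors = {"POS": 0.5, "NEG": 0.5}

def gaussian_pdf(x, mu, sigma):
    """Density of a Gaussian with mean mu and standard deviation sigma at x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def log_score(x, label):
    """ln P(class) plus the sum of ln PDF(feature value | class)."""
    total = math.log(priors[label])
    for value, mu in zip(x, means[label]):
        total += math.log(gaussian_pdf(value, mu, SIGMA))
    return total

x7 = (8, 4)
log_scores = {label: log_score(x7, label) for label in priors}
predicted = max(log_scores, key=log_scores.get)

print(log_scores)
print("Predicted class:", predicted)
```

Summing logarithms instead of multiplying densities gives the same ranking of classes and avoids very small products such as the .000603 term.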