announcementslwehbe/10701_s19/files/lecture_5.pdf · 2019-02-05 · 10-701 introduction to machine...

21
10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine Learning Department Slides based on Tom Mitchell’s 10-701 Spring 2016 material Announcements HW2 was released, due on February 15th • Projects Teams + project due February 11th (let us know if you have problems!) Midway Report due March 6th Poster Session on April 17th, 9am-2pm Final Report due May 1st You want to build a classifier for new customers by learning P(Y|X) O: Is older than 35 years I: Has Personal Income S: Is a Student J: Birthday before July 1st Y: Buys computer 0 1 0 0 0 0 1 0 1 0 1 1 0 1 1 1 1 0 1 1 1 0 1 0 1 1 0 1 1 0 0 0 1 0 1 0 1 0 0 0 0 0 1 1 1 0 1 1 0 1 0 0 1 1 1 1 1 0 1 1 0 1 1 0 1 1 1 0 0 0 0 1 1 1 ? New customer i: Want to find P(Yi = 0| O=0, I = 0, S = 1, J = 1) How many parameters must we estimate? Suppose X =(X 1 ,X 2 ) where X i and Y are boolean RV’s To estimate P(Y| X 1 , X 2 ) X 1 X 2 P(Y = 1 | X 1 ,X 2 ) P(Y = 0 | X 1 ,X 2 ) 0 0 0.1 0.9 1 0 0.24 0.76 0 1 0.54 0.46 1 1 0.23 0.77

Upload: others

Post on 08-Jul-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes

Leila Wehbe Carnegie Mellon University

Machine Learning Department

Slides based on Tom Mitchell’s 10-701 Spring 2016 material

Announcements• HW2 was released, due on February

15th

• Projects • Teams + project due February 11th (let

us know if you have problems!) • Midway Report due March 6th • Poster Session on April 17th,

9am-2pm • Final Report due May 1st

You want to build a classifier for new customers by learning P(Y|X)

O: Is older than 35

years

I: Has Personal

Income

S: Is a

Student

J: Birthday before

July 1st

Y: Buys computer

0 1 0 0 00 1 0 1 01 1 0 1 11 1 0 1 11 0 1 0 11 0 1 1 00 0 1 0 10 1 0 0 00 0 1 1 10 1 1 0 10 0 1 1 11 1 0 1 10 1 1 0 11 1 0 0 0

0 1 1 1 ?

New customer i:

Want to find P(Yi = 0| O=0, I = 0, S = 1, J = 1)

How many parameters must we estimate?

Suppose X =(X1,X2) where Xi and Y are boolean RV’s

To estimate P(Y| X1, X2)

X1 X2 P(Y = 1 | X1,X2) P(Y = 0 | X1,X2)

0 0 0.1 0.9

1 0 0.24 0.76

0 1 0.54 0.46

1 1 0.23 0.77

Page 2: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

How many parameters must we estimate?

Suppose X =(X1,X2) where Xi and Y are boolean RV’s

To estimate P(Y| X1, X2)

X1 X2 P(Y = 1 | X1,X2) P(Y = 0 | X1,X2)

0 0 0.1 0.9

1 0 0.24 0.76

0 1 0.54 0.46

1 1 0.23 0.77

4 parameters: P(Y=0|X) = 1-P(Y=1|X)

How many parameters must we estimate?

Suppose X =(X1,… Xn) where Xi and Y are boolean RV’s

To estimate P(Y| X1, X2, … Xn)

X1 X2 X3 … XN

P(Y = 1 | X)

P(Y = 0 | X)

0 0 0 … 0 0.1 0.91 0 0 … … 0.24 0.76… … … … … … …… … … … … … …… … … … … … …1 1 1 1 1 0.52 0.48

How many parameters must we estimate?

Suppose X =(X1,… Xn) where Xi and Y are boolean RV’s

To estimate P(Y| X1, X2, … Xn)

X1 X2 X3 … XN

P(Y = 1 | X)

P(Y = 0 | X)

0 0 0 … 0 0.1 0.91 0 0 … … 0.24 0.76… … … … … … …… … … … … … …… … … … … … …1 1 1 1 1 0.52 0.48

2n rows!

How many parameters must we estimate?

Suppose X =(X1,… Xn) where Xi and Y are boolean RV’s

To estimate P(Y| X1, X2, … Xn)

If we have 30 boolean Xi’s: P(Y | X1, X2, … X30) ~ 1Billion

X1 X2 X3 … XN

P(Y = 1 | X)

P(Y = 0 | X)

0 0 0 … 0 0.1 0.91 0 0 … … 0.24 0.76… … … … … … …… … … … … … …… … … … … … …1 1 1 1 1 0.52 0.48

2n rows!

Page 3: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

Bayes Rule

Which is shorthand for:

Equivalently:

Can we reduce params using Bayes Rule?

Suppose X =(X1,… Xn) where Xi and Y are boolean RV’s

How many parameters to define P(X1,… Xn | Y)?

Can we reduce params using Bayes Rule?

Suppose X =(X1,… Xn) where Xi and Y are boolean RV’s

How many parameters to define P(X1,… Xn | Y)?

Y X1 X2 … XNP(X|Y)

0 0 0 … 0 0.10 1 0 … … 0.02… … … … … …… … … … … …… … … … … …0 1 1 1 1 0.16

2n rows!

-1 because all the probabilities sum to 1

Can we reduce params using Bayes Rule?

Suppose X =(X1,… Xn) where Xi and Y are boolean RV’s

How many parameters to define P(X1,… Xn | Y)?

Y X1 X2 … XNP(X|Y)

0 0 0 … 0 0.10 1 0 … … 0.02… … … … … …… … … … … …… … … … … …0 1 1 1 1 0.16

Y X1 X2 … XNP(X|Y)

1 0 0 … 01 1 0 … …… … … … …… … … … …… … … … …1 1 1 1 1

2n -1 Should I compute these as well?

Page 4: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

Can we reduce params using Bayes Rule?

Suppose X =(X1,… Xn) where Xi and Y are boolean RV’s

How many parameters to define P(X1,… Xn | Y)?

Y X1 X2 … XNP(X|Y)

0 0 0 … 0 0.10 1 0 … … 0.02… … … … … …… … … … … …… … … … … …0 1 1 1 1 0.16

Y X1 X2 … XNP(X|Y)

1 0 0 … 01 1 0 … …… … … … …… … … … …… … … … …1 1 1 1 1

2n -1 2n -1

Can we reduce params using Bayes Rule?

Suppose X =(X1,… Xn) where Xi and Y are boolean RV’s

How many parameters to define P(X1,… Xn | Y)?

How many parameters to define P(Y)?

2*(2n -1)

If n = 30, 2 billion

Can we reduce params using Bayes Rule?

Suppose X =(X1,… Xn) where Xi and Y are boolean RV’s

How many parameters to define P(X1,… Xn | Y)?

How many parameters to define P(Y)?

2*(2n -1)

If n = 30, 2 billion

1

Naïve Bayes

Naïve Bayes assumes

i.e., that Xi and Xj are conditionally independent given Y, for all i≠j

Page 5: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

Conditional IndependenceDefinition: X is conditionally independent of Y given Z, if

the probability distribution governing X is independent of the value of Y, given the value of Z

Which we often write

E.g.,

P(thunder|raining,lightning) = P(thunder|lightning)

Naïve Bayes uses assumption that the Xi are conditionally independent, given Y. E.g.,

Given this assumption, then:

Naïve Bayes uses assumption that the Xi are conditionally independent, given Y. E.g.,

Given this assumption, then:

Chain rule

Naïve Bayes uses assumption that the Xi are conditionally independent, given Y. E.g.,

Given this assumption, then:

Conditional independence

Page 6: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

Naïve Bayes uses assumption that the Xi are conditionally independent, given Y. E.g.,

Given this assumption, then:

in general:

Naïve Bayes uses assumption that the Xi are conditionally independent, given Y. E.g.,

Given this assumption, then:

in general:

How many parameters to describe P(X1…Xn|Y)? P(Y)? • Without conditional indep assumption? • With conditional indep assumption?

2(2n -1) and 1

Naïve Bayes uses assumption that the Xi are conditionally independent, given Y. E.g.,

Given this assumption, then:

in general:

How many parameters to describe P(X1…Xn|Y)? P(Y)? • Without conditional indep assumption? • With conditional indep assumption?

2(2n -1) and 1 2n and 1

Bayes rule:

Assuming conditional independence among Xi’s:

So, to pick most probable Y for Xnew = ( X1, …, Xn )

Naïve Bayes in a Nutshell

(estimate in training)

(testing)

Page 7: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

Naïve Bayes Algorithm – discrete Xi

• Train Naïve Bayes (examples) for each* value yk

estimate for each* value xij of each attribute Xi

estimate • Classify (Xnew)

* probabilities must sum to 1, so need estimate only v-1 of these, where v is the number of values, which is 2 in the binary case

Let’s train! Buy computer? P(Y|O,S,J)• O= Is older than 35 • S= Is a student

• J = Birthday before July 1 • Y= buys computer

What probability parameters must we estimate?

O: Is older than 35

years

S: Is a

Student

J: Birthday before

July 1st

Y: Buys computer

0 0 0 00 0 1 01 0 1 11 0 1 11 1 0 11 1 1 00 1 0 10 0 0 00 1 1 10 1 0 10 1 1 11 0 1 10 1 0 11 0 0 0

Let’s train! Buy computer? P(Y|O,S,J)• O= Is older than 35 • S= Is a student

• J = Birthday before July 1 • Y= buys computer

What probability parameters must we estimate?

P(Y=1) : P(O=1 | Y=1) : P(O=1 | Y=0) : P(S=1 | Y=1) : P(S=1 | Y=0) : P(J=1 | Y=1) : P(J=1 | Y=0) :

P(Y=0) : P(O=0 | Y=1) : P(O=0 | Y=0) : P(S=0 | Y=1) : P(S=0 | Y=0) : P(J=0 | Y=1) : P(J=0 | Y=0) :

O S J Y

1 0 0 00 0 1 01 0 1 11 0 1 11 1 0 11 1 1 00 1 0 11 0 0 00 1 1 10 1 0 10 1 1 11 0 1 10 1 0 11 0 0 0

Let’s train! Buy computer? P(Y|O,S,J)• O= Is older than 35 • S= Is a student

• J = Birthday before July 1 • Y= buys computer

What probability parameters must we estimate?

P(Y=1) : P(O=1 | Y=1) : P(O=1 | Y=0) : P(S=1 | Y=1) : P(S=1 | Y=0) : P(J=1 | Y=1) : P(J=1 | Y=0) :

P(Y=0) : P(O=0 | Y=1) : P(O=0 | Y=0) : P(S=0 | Y=1) : P(S=0 | Y=0) : P(J=0 | Y=1) : P(J=0 | Y=0) :

5/149/14O S J Y

1 0 0 00 0 1 01 0 1 11 0 1 11 1 0 11 1 1 00 1 0 11 0 0 00 1 1 10 1 0 10 1 1 11 0 1 10 1 0 11 0 0 0

Page 8: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

Let’s train! Buy computer? P(Y|O,S,J)• O= Is older than 35 • S= Is a student

• J = Birthday before July 1 • Y= buys computer

What probability parameters must we estimate?

P(Y=1) : P(O=1 | Y=1) : P(O=1 | Y=0) : P(S=1 | Y=1) : P(S=1 | Y=0) : P(J=1 | Y=1) : P(J=1 | Y=0) :

P(Y=0) : P(O=0 | Y=1) : P(O=0 | Y=0) : P(S=0 | Y=1) : P(S=0 | Y=0) : P(J=0 | Y=1) : P(J=0 | Y=0) :

5/149/144/9

O S J Y

1 0 0 00 0 1 01 0 1 11 0 1 11 1 0 11 1 1 00 1 0 11 0 0 00 1 1 10 1 0 10 1 1 11 0 1 10 1 0 11 0 0 0

Let’s train! Buy computer? P(Y|O,S,J)• O= Is older than 35 • S= Is a student

• J = Birthday before July 1 • Y= buys computer

What probability parameters must we estimate?

P(Y=1) : P(O=1 | Y=1) : P(O=1 | Y=0) : P(S=1 | Y=1) : P(S=1 | Y=0) : P(J=1 | Y=1) : P(J=1 | Y=0) :

P(Y=0) : P(O=0 | Y=1) : P(O=0 | Y=0) : P(S=0 | Y=1) : P(S=0 | Y=0) : P(J=0 | Y=1) : P(J=0 | Y=0) :

5/149/144/9 5/9

O S J Y

1 0 0 00 0 1 01 0 1 11 0 1 11 1 0 11 1 1 00 1 0 11 0 0 00 1 1 10 1 0 10 1 1 11 0 1 10 1 0 11 0 0 0

Let’s train! Buy computer? P(Y|O,S,J)• O= Is older than 35 • S= Is a student

• J = Birthday before July 1 • Y= buys computer

What probability parameters must we estimate?

P(Y=1) : P(O=1 | Y=1) : P(O=1 | Y=0) : P(S=1 | Y=1) : P(S=1 | Y=0) : P(J=1 | Y=1) : P(J=1 | Y=0) :

P(Y=0) : P(O=0 | Y=1) : P(O=0 | Y=0) : P(S=0 | Y=1) : P(S=0 | Y=0) : P(J=0 | Y=1) : P(J=0 | Y=0) :

5/149/144/9 5/92/5 3/5

O S J Y

1 0 0 00 0 1 01 0 1 11 0 1 11 1 0 11 1 1 00 1 0 11 0 0 00 1 1 10 1 0 10 1 1 11 0 1 10 1 0 11 0 0 0

Let’s train! Buy computer? P(Y|O,S,J)• O= Is older than 35 • S= Is a student

• J = Birthday before July 1 • Y= buys computer

What probability parameters must we estimate?

P(Y=1) : P(O=1 | Y=1) : P(O=1 | Y=0) : P(S=1 | Y=1) : P(S=1 | Y=0) : P(J=1 | Y=1) : P(J=1 | Y=0) :

P(Y=0) : P(O=0 | Y=1) : P(O=0 | Y=0) : P(S=0 | Y=1) : P(S=0 | Y=0) : P(J=0 | Y=1) : P(J=0 | Y=0) :

5/149/144/9 5/94/5 1/56/9 3/91/5 4/5

O S J Y

1 0 0 00 0 1 01 0 1 11 0 1 11 1 0 11 1 1 00 1 0 11 0 0 00 1 1 10 1 0 10 1 1 11 0 1 10 1 0 11 0 0 0

Page 9: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

Let’s train! Buy computer? P(Y|O,S,J)• O= Is older than 35 • S= Is a student

• J = Birthday before July 1 • Y= buys computer

What probability parameters must we estimate?

P(Y=1) : P(O=1 | Y=1) : P(O=1 | Y=0) : P(S=1 | Y=1) : P(S=1 | Y=0) : P(J=1 | Y=1) : P(J=1 | Y=0) :

P(Y=0) : P(O=0 | Y=1) : P(O=0 | Y=0) : P(S=0 | Y=1) : P(S=0 | Y=0) : P(J=0 | Y=1) : P(J=0 | Y=0) :

O S J Y

1 0 0 00 0 1 01 0 1 11 0 1 11 1 0 11 1 1 00 1 0 11 0 0 00 1 1 10 1 0 10 1 1 11 0 1 10 1 0 11 0 0 0

5/149/144/9 5/94/5 1/56/9 3/91/5 4/55/9 4/92/5 3/5

Let’s test! Buy computer? P(Y|O,S,J)• O= Is older than 35 • S= Is a student

• J = Birthday before July 1 • Y= buys computer

What probability parameters must we estimate?

P(Y=1) : P(O=1 | Y=1) : P(O=1 | Y=0) : P(S=1 | Y=1) : P(S=1 | Y=0) : P(J=1 | Y=1) : P(J=1 | Y=0) :

P(Y=0) : P(O=0 | Y=1) : P(O=0 | Y=0) : P(S=0 | Y=1) : P(S=0 | Y=0) : P(J=0 | Y=1) : P(J=0 | Y=0) :

5/149/144/9 5/94/5 1/56/9 3/91/5 4/55/9 4/92/5 3/5

0 1 1 ?

Let’s test! Buy computer? P(Y|O,S,J)• O= Is older than 35 • S= Is a student

• J = Birthday before July 1 • Y= buys computer

What probability parameters must we estimate?

P(Y=1) : P(O=1 | Y=1) : P(O=1 | Y=0) : P(S=1 | Y=1) : P(S=1 | Y=0) : P(J=1 | Y=1) : P(J=1 | Y=0) :

P(Y=0) : P(O=0 | Y=1) : P(O=0 | Y=0) : P(S=0 | Y=1) : P(S=0 | Y=0) : P(J=0 | Y=1) : P(J=0 | Y=0) :

0 1 1 ?

5/149/144/9 5/94/5 1/56/9 3/91/5 4/55/9 4/92/5 3/5

P(O=0,S=1,J=1|Y=0) * P(Y=0) = 1/5 * 1/5 * 2/5 * 5/14 = 0.0057

Let’s test! Buy computer? P(Y|O,S,J)• O= Is older than 35 • S= Is a student

• J = Birthday before July 1 • Y= buys computer

What probability parameters must we estimate?

P(Y=1) : P(O=1 | Y=1) : P(O=1 | Y=0) : P(S=1 | Y=1) : P(S=1 | Y=0) : P(J=1 | Y=1) : P(J=1 | Y=0) :

P(Y=0) : P(O=0 | Y=1) : P(O=0 | Y=0) : P(S=0 | Y=1) : P(S=0 | Y=0) : P(J=0 | Y=1) : P(J=0 | Y=0) :

0 1 1 ?

5/149/144/9 5/94/5 1/56/9 3/91/5 4/55/9 4/92/5 3/5

P(O=0,S=1,J=1|Y=0) * P(Y=0) = 1/5 * 1/5 * 2/5 * 5/14 = 0.0057

P(O=0,S=1,J=1|Y=1) * P(Y=1) = 5/9 * 6/9 * 5/9 * 9/14 = 0.13

Page 10: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

Let’s test! Buy computer? P(Y|O,S,J)• O= Is older than 35 • S= Is a student

• J = Birthday before July 1 • Y= buys computer

What probability parameters must we estimate?

P(Y=1) : P(O=1 | Y=1) : P(O=1 | Y=0) : P(S=1 | Y=1) : P(S=1 | Y=0) : P(J=1 | Y=1) : P(J=1 | Y=0) :

P(Y=0) : P(O=0 | Y=1) : P(O=0 | Y=0) : P(S=0 | Y=1) : P(S=0 | Y=0) : P(J=0 | Y=1) : P(J=0 | Y=0) :

0 1 1 ?

5/149/144/9 5/94/5 1/56/9 3/91/5 4/55/9 4/92/5 3/5

P(O=0,S=1,J=1|Y=0) * P(Y=0) = 1/5 * 1/5 * 2/5 * 5/14 = 0.0057

P(O=0,S=1,J=1|Y=1) * P(Y=1) = 5/9 * 6/9 * 5/9 * 9/14 = 0.13

Can already pick label 1

Let’s test! Buy computer? P(Y|O,S,J)• O= Is older than 35 • S= Is a student

• J = Birthday before July 1 • Y= buys computer

What probability parameters must we estimate?

P(Y=1) : P(O=1 | Y=1) : P(O=1 | Y=0) : P(S=1 | Y=1) : P(S=1 | Y=0) : P(J=1 | Y=1) : P(J=1 | Y=0) :

P(Y=0) : P(O=0 | Y=1) : P(O=0 | Y=0) : P(S=0 | Y=1) : P(S=0 | Y=0) : P(J=0 | Y=1) : P(J=0 | Y=0) :

0 1 1 ?

5/149/144/9 5/94/5 1/56/9 3/91/5 4/55/9 4/92/5 3/5

P(O=0,S=1,J=1|Y=0) * P(Y=0) = 1/5 * 1/5 * 2/5 * 5/14 = 0.0057

P(O=0,S=1,J=1|Y=1) * P(Y=1) = 5/9 * 6/9 * 5/9 * 9/14 = 0.13

P (X) =X

i

P (X|Y = i)P (Y = i)<latexit sha1_base64="vmS1q7EwCzdmHDZj7XUc+voZ32M=">AAACBnicdVDLSgMxFM34rPU16lKEYBHazZCp1dZFoeDGZQX7kHYYMmnahmYeJBmhjF258VfcuFDErd/gzr8x01ZQ0QMhJ+fcy809XsSZVAh9GAuLS8srq5m17PrG5ta2ubPblGEsCG2QkIei7WFJOQtoQzHFaTsSFPsepy1vdJ76rRsqJAuDKzWOqOPjQcD6jGClJdc8qOfbBViFXRn7LoP6dXtdZQVN0ss1c8hCpyW7aENknSC7UkQzclY+hraFpsiBOequ+d7thST2aaAIx1J2bBQpJ8FCMcLpJNuNJY0wGeEB7WgaYJ9KJ5muMYFHWunBfij0CRScqt87EuxLOfY9XeljNZS/vVT8y+vEql9xEhZEsaIBmQ3qxxyqEKaZwB4TlCg+1gQTwfRfIRligYnSyWV1CF+bwv9Js2jZyLIvS7laZR5HBuyDQ5AHNiiDGrgAddAABNyBB/AEno1749F4MV5npQvGvGcP/IDx9gn76ZY9</latexit><latexit sha1_base64="vmS1q7EwCzdmHDZj7XUc+voZ32M=">AAACBnicdVDLSgMxFM34rPU16lKEYBHazZCp1dZFoeDGZQX7kHYYMmnahmYeJBmhjF258VfcuFDErd/gzr8x01ZQ0QMhJ+fcy809XsSZVAh9GAuLS8srq5m17PrG5ta2ubPblGEsCG2QkIei7WFJOQtoQzHFaTsSFPsepy1vdJ76rRsqJAuDKzWOqOPjQcD6jGClJdc8qOfbBViFXRn7LoP6dXtdZQVN0ss1c8hCpyW7aENknSC7UkQzclY+hraFpsiBOequ+d7thST2aaAIx1J2bBQpJ8FCMcLpJNuNJY0wGeEB7WgaYJ9KJ5muMYFHWunBfij0CRScqt87EuxLOfY9XeljNZS/vVT8y+vEql9xEhZEsaIBmQ3qxxyqEKaZwB4TlCg+1gQTwfRfIRligYnSyWV1CF+bwv9Js2jZyLIvS7laZR5HBuyDQ5AHNiiDGrgAddAABNyBB/AEno1749F4MV5npQvGvGcP/IDx9gn76ZY9</latexit><latexit sha1_base64="vmS1q7EwCzdmHDZj7XUc+voZ32M=">AAACBnicdVDLSgMxFM34rPU16lKEYBHazZCp1dZFoeDGZQX7kHYYMmnahmYeJBmhjF258VfcuFDErd/gzr8x01ZQ0QMhJ+fcy809XsSZVAh9GAuLS8srq5m17PrG5ta2ubPblGEsCG2QkIei7WFJOQtoQzHFaTsSFPsepy1vdJ76rRsqJAuDKzWOqOPjQcD6jGClJdc8qOfbBViFXRn7LoP6dXtdZQVN0ss1c8hCpyW7aENknSC7UkQzclY+hraFpsiBOequ+d7thST2aaAIx1J2bBQpJ8FCMcLpJNuNJY0wGeEB7WgaYJ9KJ5muMYFHWunBfij0CRScqt87EuxLOfY9XeljNZS/vVT8y+vEql9xEhZEsaIBmQ3qxxyqEKaZwB4TlCg+1gQTwfRfIRligYnSyWV1CF+bwv9Js2jZyLIvS7laZR5HBuyDQ5AHNiiDGrgAddAABNyBB/AEno1749F4MV5npQvGvGcP/IDx9gn76ZY9</latexit><latexit sha1_base64="vmS1q7EwCzdmHDZj7XUc+voZ32M=">AAACBnicdVDLSgMxFM34rPU16lKEYBHazZCp1dZFoeDGZQX7kHYYMmnahmYeJBmhjF258VfcuFDErd/gzr8x01ZQ0QMhJ+fcy809XsSZVAh9GAuLS8srq5m17PrG5ta2ubPblGEsCG2QkIei7WFJOQtoQzHFaTsSFPsepy1vdJ76rRsqJAuDKzWOqOPjQcD6jGClJdc8qOfbBViFXRn7LoP6dXtdZQVN0ss1c8hCpyW7aENknSC7UkQzclY+hraFpsiBOequ+d7thST2aaAIx1J2bBQpJ8FCMcLpJNuNJY0wGeEB7WgaYJ9KJ5muMYFHWunBfij0CRScqt87EuxLOfY9XeljNZS/vVT8y+vEql9xEhZEsaIBmQ3qxxyqEKaZwB4TlCg+1gQTwfRfIRligYnSyWV1CF+bwv9Js2jZyLIvS7laZR5HBuyDQ5AHNiiDGrgAddAABNyBB/AEno1749F4MV5npQvGvGcP/IDx9gn76ZY9</latexit>

Let’s test! Buy computer? P(Y|O,S,J)• O= Is older than 35 • S= Is a student

• J = Birthday before July 1 • Y= buys computer

What probability parameters must we estimate?

P(Y=1) : P(O=1 | Y=1) : P(O=1 | Y=0) : P(S=1 | Y=1) : P(S=1 | Y=0) : P(J=1 | Y=1) : P(J=1 | Y=0) :

P(Y=0) : P(O=0 | Y=1) : P(O=0 | Y=0) : P(S=0 | Y=1) : P(S=0 | Y=0) : P(J=0 | Y=1) : P(J=0 | Y=0) :

0 1 1 ?

5/149/144/9 5/94/5 1/56/9 3/91/5 4/55/9 4/92/5 3/5

P(Y=0|O=0,S=1,J=1) = 0.04

P(Y=1|O=0,S=1,J=1) = 0.96 Choice is Label 1

Assume we have a lot of data• We estimate these probabilities

• We obtain 68% test accuracy. • What does this mean?

• Our estimated P(Y=1) was 64.5%, let’s say this is actually the truth.

• Is 68% impressive?

Page 11: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

• What is chance performance?

• What happens if I flip an unbiased coin?

• If P(Y=1)>0.5, then we can just predict 1 all the time! • What will be the accuracy?

• What happens if you predict Y=1 with probability 0.645 ==> this is called probability matching in cognitive science

Naïve Bayes: Point #1

Often the Xi are not really conditionally independent

• We use Naïve Bayes in many cases anyway, and it often works pretty well – often the right classification, even when not the right

probability (see [Domingos&Pazzani, 1996])

• What is effect on estimated P(Y|X)? – Extreme case: what if we add two copies: Xi = Xk

Extreme case: what if we add two copies: Xi = Xk

P(Y=1) P(O=1|Y=1) P(S=0|Y=1) P(J=0|Y=1) _____________________________________________________________________________ P(Y=1) P(O=1|Y=1) P(S=0|Y=1) P(J=0|Y=1) + P(Y=0) P(O=1|Y=0) P(S=0|Y=0) P(J=0|Y=0)

Naïve Bayes: Point #2

Irrelevant variables, do they affect you?

P(Y=1) P(O=1|Y=1) P(S=0|Y=1) P(J=0|Y=1) _____________________________________________________________________________ P(Y=1) P(O=1|Y=1) P(S=0|Y=1) P(J=0|Y=1) + P(Y=0) P(O=1|Y=0) P(S=0|Y=0) P(J=0|Y=0)

Page 12: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

Naïve Bayes: Point #2

Irrelevant variables, do they affect you?

P(Y=1) P(O=1|Y=1) P(S=0|Y=1) P(J=0|Y=1) _____________________________________________________________________________ P(Y=1) P(O=1|Y=1) P(S=0|Y=1) P(J=0|Y=1) + P(Y=0) P(O=1|Y=0) P(S=0|Y=0) P(J=0|Y=0)

If J is not relevant then P(J|Y=0) = P(J|Y=1) = P(J)

Does it hurt classification?

Naïve Bayes: Point #2

Irrelevant variables, do they affect you?

P(Y=1) P(O=1|Y=1) P(S=0|Y=1) P(J=0|Y=1) _____________________________________________________________________________ P(Y=1) P(O=1|Y=1) P(S=0|Y=1) P(J=0|Y=1) + P(Y=0) P(O=1|Y=0) P(S=0|Y=0) P(J=0|Y=0)

If J is not relevant then P(J|Y=0) = P(J|Y=1) = P(J)

Does it hurt classification?

If we had the correct estimate, performance is not affected! If we have noisy estimates ==> performance is affected

Another way to view Naïve Bayes (Boolean Y, Xi’s): Decision rule: is this quantity greater or less than 1?

(is q1 larger than q0?)

Naïve Bayes: Point #3

P (Y = 1|X1...Xn)

P (Y = 0|X1...Xn)=

P (Y = 1)Q

i P (Xi|Y = 1)

P (Y = 0)Q

i P (Xi|Y = 0)<latexit sha1_base64="qPxGUhFFqiqy0kAj6kcWnWvH5J0=">AAACUXicbVFNS8MwGH5Xv+b8mnr0EhyCu5RWBHcRBl48TnCzspaSZqkLpmlJUmHU/UUPevJ/ePGgmG49zOkLgYfngzd5EmWcKe047zVrZXVtfaO+2dja3tnda+4fDFSaS0L7JOWp9CKsKGeC9jXTnHqZpDiJOL2LHq9K/e6JSsVScasnGQ0S/CBYzAjWhgqbYz+WmBSod3p/6T57oWvbtheK9rQoGWeRQZdobp55234m01HITNIL2XPJVJk2WpYcEw6bLcd2ZoP+ArcCLaimFzZf/VFK8oQKTThWaug6mQ4KLDUjnE4bfq5ohskjfqBDAwVOqAqKWSNTdGKYEYpTaY7QaMYuJgqcKDVJIuNMsB6rZa0k/9OGuY47QcFElmsqyHxRnHOkU1TWi0ZMUqL5xABMJDN3RWSMTWvafELDlOAuP/kvGJzZrmO7N+etbqeqow5HcAyn4MIFdOEaetAHAi/wAV/wXXurfVpgWXOrVasyh/BrrK0fGJSuHg==</latexit><latexit sha1_base64="qPxGUhFFqiqy0kAj6kcWnWvH5J0=">AAACUXicbVFNS8MwGH5Xv+b8mnr0EhyCu5RWBHcRBl48TnCzspaSZqkLpmlJUmHU/UUPevJ/ePGgmG49zOkLgYfngzd5EmWcKe047zVrZXVtfaO+2dja3tnda+4fDFSaS0L7JOWp9CKsKGeC9jXTnHqZpDiJOL2LHq9K/e6JSsVScasnGQ0S/CBYzAjWhgqbYz+WmBSod3p/6T57oWvbtheK9rQoGWeRQZdobp55234m01HITNIL2XPJVJk2WpYcEw6bLcd2ZoP+ArcCLaimFzZf/VFK8oQKTThWaug6mQ4KLDUjnE4bfq5ohskjfqBDAwVOqAqKWSNTdGKYEYpTaY7QaMYuJgqcKDVJIuNMsB6rZa0k/9OGuY47QcFElmsqyHxRnHOkU1TWi0ZMUqL5xABMJDN3RWSMTWvafELDlOAuP/kvGJzZrmO7N+etbqeqow5HcAyn4MIFdOEaetAHAi/wAV/wXXurfVpgWXOrVasyh/BrrK0fGJSuHg==</latexit><latexit sha1_base64="qPxGUhFFqiqy0kAj6kcWnWvH5J0=">AAACUXicbVFNS8MwGH5Xv+b8mnr0EhyCu5RWBHcRBl48TnCzspaSZqkLpmlJUmHU/UUPevJ/ePGgmG49zOkLgYfngzd5EmWcKe047zVrZXVtfaO+2dja3tnda+4fDFSaS0L7JOWp9CKsKGeC9jXTnHqZpDiJOL2LHq9K/e6JSsVScasnGQ0S/CBYzAjWhgqbYz+WmBSod3p/6T57oWvbtheK9rQoGWeRQZdobp55234m01HITNIL2XPJVJk2WpYcEw6bLcd2ZoP+ArcCLaimFzZf/VFK8oQKTThWaug6mQ4KLDUjnE4bfq5ohskjfqBDAwVOqAqKWSNTdGKYEYpTaY7QaMYuJgqcKDVJIuNMsB6rZa0k/9OGuY47QcFElmsqyHxRnHOkU1TWi0ZMUqL5xABMJDN3RWSMTWvafELDlOAuP/kvGJzZrmO7N+etbqeqow5HcAyn4MIFdOEaetAHAi/wAV/wXXurfVpgWXOrVasyh/BrrK0fGJSuHg==</latexit><latexit sha1_base64="qPxGUhFFqiqy0kAj6kcWnWvH5J0=">AAACUXicbVFNS8MwGH5Xv+b8mnr0EhyCu5RWBHcRBl48TnCzspaSZqkLpmlJUmHU/UUPevJ/ePGgmG49zOkLgYfngzd5EmWcKe047zVrZXVtfaO+2dja3tnda+4fDFSaS0L7JOWp9CKsKGeC9jXTnHqZpDiJOL2LHq9K/e6JSsVScasnGQ0S/CBYzAjWhgqbYz+WmBSod3p/6T57oWvbtheK9rQoGWeRQZdobp55234m01HITNIL2XPJVJk2WpYcEw6bLcd2ZoP+ArcCLaimFzZf/VFK8oQKTThWaug6mQ4KLDUjnE4bfq5ohskjfqBDAwVOqAqKWSNTdGKYEYpTaY7QaMYuJgqcKDVJIuNMsB6rZa0k/9OGuY47QcFElmsqyHxRnHOkU1TWi0ZMUqL5xABMJDN3RWSMTWvafELDlOAuP/kvGJzZrmO7N+etbqeqow5HcAyn4MIFdOEaetAHAi/wAV/wXXurfVpgWXOrVasyh/BrrK0fGJSuHg==</latexit>

Another way to view Naïve Bayes (Boolean Y, Xi’s): Decision rule: is this quantity greater or less than 1?

Naïve Bayes: Point #3

logP (Y = 1|X1...Xn)

P (Y = 0|X1...Xn)= log

P (Y = 1)

P (Y = 0)+

X

i

logP (Xi|Y = 1)

P (Xi|Y = 0)<latexit sha1_base64="/rYJIOwZIzM4Y+zUxADSRw01G6Y=">AAACXHicbZHPS8MwHMXTOnU/nE4FL16CQ5gIpRXBXQYDLx4nuK2yjZJm6RaWpiVJhdHtn/S2i/+KZlsnc/MLgcd7n5DkxY8Zlcq2F4Z5kDs8Os4XiqWT8ulZ5fyiI6NEYNLGEYuE6yNJGOWkrahixI0FQaHPSNefPC/z7gcRkkb8TU1jMgjRiNOAYqS05VUki0b9QCCcwlbtveHMXM+xLMv1+N08XTr2tgMb8Jdf4RtIR/ew2JdJ6NFtxPXobIOttUa9StW27NXAfeFkogqyaXmVz/4wwklIuMIMSdlz7FgNUiQUxYzMi/1EkhjhCRqRnpYchUQO0lU5c3irnSEMIqEXV3Dlbu9IUSjlNPQ1GSI1lrvZ0vwv6yUqqA9SyuNEEY7XBwUJgyqCy6bhkAqCFZtqgbCg+q4Qj5HuRen/KOoSnN0n74vOg+XYlvP6WG3Wszry4BrcgBpwwBNoghfQAm2AwQJ8G3mjYHyZObNklteoaWR7LsGfMa9+AMWMsBk=</latexit><latexit sha1_base64="/rYJIOwZIzM4Y+zUxADSRw01G6Y=">AAACXHicbZHPS8MwHMXTOnU/nE4FL16CQ5gIpRXBXQYDLx4nuK2yjZJm6RaWpiVJhdHtn/S2i/+KZlsnc/MLgcd7n5DkxY8Zlcq2F4Z5kDs8Os4XiqWT8ulZ5fyiI6NEYNLGEYuE6yNJGOWkrahixI0FQaHPSNefPC/z7gcRkkb8TU1jMgjRiNOAYqS05VUki0b9QCCcwlbtveHMXM+xLMv1+N08XTr2tgMb8Jdf4RtIR/ew2JdJ6NFtxPXobIOttUa9StW27NXAfeFkogqyaXmVz/4wwklIuMIMSdlz7FgNUiQUxYzMi/1EkhjhCRqRnpYchUQO0lU5c3irnSEMIqEXV3Dlbu9IUSjlNPQ1GSI1lrvZ0vwv6yUqqA9SyuNEEY7XBwUJgyqCy6bhkAqCFZtqgbCg+q4Qj5HuRen/KOoSnN0n74vOg+XYlvP6WG3Wszry4BrcgBpwwBNoghfQAm2AwQJ8G3mjYHyZObNklteoaWR7LsGfMa9+AMWMsBk=</latexit><latexit sha1_base64="/rYJIOwZIzM4Y+zUxADSRw01G6Y=">AAACXHicbZHPS8MwHMXTOnU/nE4FL16CQ5gIpRXBXQYDLx4nuK2yjZJm6RaWpiVJhdHtn/S2i/+KZlsnc/MLgcd7n5DkxY8Zlcq2F4Z5kDs8Os4XiqWT8ulZ5fyiI6NEYNLGEYuE6yNJGOWkrahixI0FQaHPSNefPC/z7gcRkkb8TU1jMgjRiNOAYqS05VUki0b9QCCcwlbtveHMXM+xLMv1+N08XTr2tgMb8Jdf4RtIR/ew2JdJ6NFtxPXobIOttUa9StW27NXAfeFkogqyaXmVz/4wwklIuMIMSdlz7FgNUiQUxYzMi/1EkhjhCRqRnpYchUQO0lU5c3irnSEMIqEXV3Dlbu9IUSjlNPQ1GSI1lrvZ0vwv6yUqqA9SyuNEEY7XBwUJgyqCy6bhkAqCFZtqgbCg+q4Qj5HuRen/KOoSnN0n74vOg+XYlvP6WG3Wszry4BrcgBpwwBNoghfQAm2AwQJ8G3mjYHyZObNklteoaWR7LsGfMa9+AMWMsBk=</latexit><latexit sha1_base64="/rYJIOwZIzM4Y+zUxADSRw01G6Y=">AAACXHicbZHPS8MwHMXTOnU/nE4FL16CQ5gIpRXBXQYDLx4nuK2yjZJm6RaWpiVJhdHtn/S2i/+KZlsnc/MLgcd7n5DkxY8Zlcq2F4Z5kDs8Os4XiqWT8ulZ5fyiI6NEYNLGEYuE6yNJGOWkrahixI0FQaHPSNefPC/z7gcRkkb8TU1jMgjRiNOAYqS05VUki0b9QCCcwlbtveHMXM+xLMv1+N08XTr2tgMb8Jdf4RtIR/ew2JdJ6NFtxPXobIOttUa9StW27NXAfeFkogqyaXmVz/4wwklIuMIMSdlz7FgNUiQUxYzMi/1EkhjhCRqRnpYchUQO0lU5c3irnSEMIqEXV3Dlbu9IUSjlNPQ1GSI1lrvZ0vwv6yUqqA9SyuNEEY7XBwUJgyqCy6bhkAqCFZtqgbCg+q4Qj5HuRen/KOoSnN0n74vOg+XYlvP6WG3Wszry4BrcgBpwwBNoghfQAm2AwQJ8G3mjYHyZObNklteoaWR7LsGfMa9+AMWMsBk=</latexit>

> or < 0 ?

> or < 1 ?

1� ✓ik = P̂ (Xi = 0|Y = k)<latexit sha1_base64="B/e7ypgMQQOWjtzVDDblu7ueuoI=">AAACDXicbVDLSsNAFJ3UV62vqEs3g1WoC0sigt0UCm5cVrAPaUqYTKfN0MmDmRuhxPyAG3/FjQtF3Lp35984bbPQ6oELh3Pu5d57vFhwBZb1ZRSWlldW14rrpY3Nre0dc3evraJEUtaikYhk1yOKCR6yFnAQrBtLRgJPsI43vpz6nTsmFY/CG5jErB+QUciHnBLQkmse2afYAZ8BcVM+znAdOz6BtJnhStfldev+tj4+cc2yVbVmwH+JnZMyytF0zU9nENEkYCFQQZTq2VYM/ZRI4FSwrOQkisWEjsmI9TQNScBUP519k+FjrQzwMJK6QsAz9edESgKlJoGnOwMCvlr0puJ/Xi+BYa2f8jBOgIV0vmiYCAwRnkaDB1wyCmKiCaGS61sx9YkkFHSAJR2CvfjyX9I+q9pW1b4+LzdqeRxFdIAOUQXZ6AI10BVqohai6AE9oRf0ajwaz8ab8T5vLRj5zD76BePjG6olmf0=</latexit><latexit sha1_base64="B/e7ypgMQQOWjtzVDDblu7ueuoI=">AAACDXicbVDLSsNAFJ3UV62vqEs3g1WoC0sigt0UCm5cVrAPaUqYTKfN0MmDmRuhxPyAG3/FjQtF3Lp35984bbPQ6oELh3Pu5d57vFhwBZb1ZRSWlldW14rrpY3Nre0dc3evraJEUtaikYhk1yOKCR6yFnAQrBtLRgJPsI43vpz6nTsmFY/CG5jErB+QUciHnBLQkmse2afYAZ8BcVM+znAdOz6BtJnhStfldev+tj4+cc2yVbVmwH+JnZMyytF0zU9nENEkYCFQQZTq2VYM/ZRI4FSwrOQkisWEjsmI9TQNScBUP519k+FjrQzwMJK6QsAz9edESgKlJoGnOwMCvlr0puJ/Xi+BYa2f8jBOgIV0vmiYCAwRnkaDB1wyCmKiCaGS61sx9YkkFHSAJR2CvfjyX9I+q9pW1b4+LzdqeRxFdIAOUQXZ6AI10BVqohai6AE9oRf0ajwaz8ab8T5vLRj5zD76BePjG6olmf0=</latexit><latexit sha1_base64="B/e7ypgMQQOWjtzVDDblu7ueuoI=">AAACDXicbVDLSsNAFJ3UV62vqEs3g1WoC0sigt0UCm5cVrAPaUqYTKfN0MmDmRuhxPyAG3/FjQtF3Lp35984bbPQ6oELh3Pu5d57vFhwBZb1ZRSWlldW14rrpY3Nre0dc3evraJEUtaikYhk1yOKCR6yFnAQrBtLRgJPsI43vpz6nTsmFY/CG5jErB+QUciHnBLQkmse2afYAZ8BcVM+znAdOz6BtJnhStfldev+tj4+cc2yVbVmwH+JnZMyytF0zU9nENEkYCFQQZTq2VYM/ZRI4FSwrOQkisWEjsmI9TQNScBUP519k+FjrQzwMJK6QsAz9edESgKlJoGnOwMCvlr0puJ/Xi+BYa2f8jBOgIV0vmiYCAwRnkaDB1wyCmKiCaGS61sx9YkkFHSAJR2CvfjyX9I+q9pW1b4+LzdqeRxFdIAOUQXZ6AI10BVqohai6AE9oRf0ajwaz8ab8T5vLRj5zD76BePjG6olmf0=</latexit><latexit sha1_base64="B/e7ypgMQQOWjtzVDDblu7ueuoI=">AAACDXicbVDLSsNAFJ3UV62vqEs3g1WoC0sigt0UCm5cVrAPaUqYTKfN0MmDmRuhxPyAG3/FjQtF3Lp35984bbPQ6oELh3Pu5d57vFhwBZb1ZRSWlldW14rrpY3Nre0dc3evraJEUtaikYhk1yOKCR6yFnAQrBtLRgJPsI43vpz6nTsmFY/CG5jErB+QUciHnBLQkmse2afYAZ8BcVM+znAdOz6BtJnhStfldev+tj4+cc2yVbVmwH+JnZMyytF0zU9nENEkYCFQQZTq2VYM/ZRI4FSwrOQkisWEjsmI9TQNScBUP519k+FjrQzwMJK6QsAz9edESgKlJoGnOwMCvlr0puJ/Xi+BYa2f8jBOgIV0vmiYCAwRnkaDB1wyCmKiCaGS61sx9YkkFHSAJR2CvfjyX9I+q9pW1b4+LzdqeRxFdIAOUQXZ6AI10BVqohai6AE9oRf0ajwaz8ab8T5vLRj5zD76BePjG6olmf0=</latexit>

✓ik = P̂ (Xi = 1|Y = k)<latexit sha1_base64="mi3cvEbeM0hMBeUWxZLXWgT5+x8=">AAACCnicbVDLSsNAFJ3UV62vqEs3o0Wom5KIYDeFghuXFexD2hAm00kzdPJg5kYoMWs3/oobF4q49Qvc+TdOHwttPXDhcM693HuPlwiuwLK+jcLK6tr6RnGztLW9s7tn7h+0VZxKylo0FrHsekQxwSPWAg6CdRPJSOgJ1vFGVxO/c8+k4nF0C+OEOSEZRtznlICWXPO4DwED4mZ8lOM67gcEsmaOK12X1+2Hu/rozDXLVtWaAi8Te07KaI6ma371BzFNQxYBFUSpnm0l4GREAqeC5aV+qlhC6IgMWU/TiIRMOdn0lRyfamWA/VjqigBP1d8TGQmVGoee7gwJBGrRm4j/eb0U/JqT8ShJgUV0tshPBYYYT3LBAy4ZBTHWhFDJ9a2YBkQSCjq9kg7BXnx5mbTPq7ZVtW8uyo3aPI4iOkInqIJsdIka6Bo1UQtR9Iie0St6M56MF+Pd+Ji1Foz5zCH6A+PzB2wLmWI=</latexit><latexit sha1_base64="mi3cvEbeM0hMBeUWxZLXWgT5+x8=">AAACCnicbVDLSsNAFJ3UV62vqEs3o0Wom5KIYDeFghuXFexD2hAm00kzdPJg5kYoMWs3/oobF4q49Qvc+TdOHwttPXDhcM693HuPlwiuwLK+jcLK6tr6RnGztLW9s7tn7h+0VZxKylo0FrHsekQxwSPWAg6CdRPJSOgJ1vFGVxO/c8+k4nF0C+OEOSEZRtznlICWXPO4DwED4mZ8lOM67gcEsmaOK12X1+2Hu/rozDXLVtWaAi8Te07KaI6ma371BzFNQxYBFUSpnm0l4GREAqeC5aV+qlhC6IgMWU/TiIRMOdn0lRyfamWA/VjqigBP1d8TGQmVGoee7gwJBGrRm4j/eb0U/JqT8ShJgUV0tshPBYYYT3LBAy4ZBTHWhFDJ9a2YBkQSCjq9kg7BXnx5mbTPq7ZVtW8uyo3aPI4iOkInqIJsdIka6Bo1UQtR9Iie0St6M56MF+Pd+Ji1Foz5zCH6A+PzB2wLmWI=</latexit><latexit sha1_base64="mi3cvEbeM0hMBeUWxZLXWgT5+x8=">AAACCnicbVDLSsNAFJ3UV62vqEs3o0Wom5KIYDeFghuXFexD2hAm00kzdPJg5kYoMWs3/oobF4q49Qvc+TdOHwttPXDhcM693HuPlwiuwLK+jcLK6tr6RnGztLW9s7tn7h+0VZxKylo0FrHsekQxwSPWAg6CdRPJSOgJ1vFGVxO/c8+k4nF0C+OEOSEZRtznlICWXPO4DwED4mZ8lOM67gcEsmaOK12X1+2Hu/rozDXLVtWaAi8Te07KaI6ma371BzFNQxYBFUSpnm0l4GREAqeC5aV+qlhC6IgMWU/TiIRMOdn0lRyfamWA/VjqigBP1d8TGQmVGoee7gwJBGrRm4j/eb0U/JqT8ShJgUV0tshPBYYYT3LBAy4ZBTHWhFDJ9a2YBkQSCjq9kg7BXnx5mbTPq7ZVtW8uyo3aPI4iOkInqIJsdIka6Bo1UQtR9Iie0St6M56MF+Pd+Ji1Foz5zCH6A+PzB2wLmWI=</latexit><latexit sha1_base64="mi3cvEbeM0hMBeUWxZLXWgT5+x8=">AAACCnicbVDLSsNAFJ3UV62vqEs3o0Wom5KIYDeFghuXFexD2hAm00kzdPJg5kYoMWs3/oobF4q49Qvc+TdOHwttPXDhcM693HuPlwiuwLK+jcLK6tr6RnGztLW9s7tn7h+0VZxKylo0FrHsekQxwSPWAg6CdRPJSOgJ1vFGVxO/c8+k4nF0C+OEOSEZRtznlICWXPO4DwED4mZ8lOM67gcEsmaOK12X1+2Hu/rozDXLVtWaAi8Te07KaI6ma371BzFNQxYBFUSpnm0l4GREAqeC5aV+qlhC6IgMWU/TiIRMOdn0lRyfamWA/VjqigBP1d8TGQmVGoee7gwJBGrRm4j/eb0U/JqT8ShJgUV0tshPBYYYT3LBAy4ZBTHWhFDJ9a2YBkQSCjq9kg7BXnx5mbTPq7ZVtW8uyo3aPI4iOkInqIJsdIka6Bo1UQtR9Iie0St6M56MF+Pd+Ji1Foz5zCH6A+PzB2wLmWI=</latexit>

= logP (Y = 1)

P (Y = 0)+X

i

Xilog✓i1✓i0

+ (1�Xi)1� ✓i11� ✓i0

<latexit sha1_base64="e2j9oPERXnt8bgyTVS1N9P82sIY=">AAACZ3icbZFLSwMxEMez66uur/pABC/RIrRIy0YEexEELx4rWK10y5JNs20w+yCZFcqyfkhv3r34LUzbBZ8DQ/6Z+Q1J/glSKTS47ptlLywuLa9UVp219Y3Nrer2zr1OMsV4lyUyUb2Aai5FzLsgQPJeqjiNAskfgqfraf/hmSstkvgOJikfRHQUi1AwCqbkV1/wJZbJyAsVZXmn/nhJGsVsdRsFPsWOp7PIF7hn0mB4znkw5kD9XJCiwF87t5iO1EnT0I0SJU38kybN77xfrbktdxb4ryClqKEyOn711RsmLIt4DExSrfvETWGQUwWCSV44XqZ5StkTHfG+kTGNuB7kM58KfGIqQxwmymQMeFb9PpHTSOtJFBgyojDWv3vT4n+9fgZhe5CLOM2Ax2x+UJhJDAmemo6HQnEGcmIEZUqYu2I2psYfMF/jGBPI7yf/FfdnLeK2yO157apd2lFBh+gY1RFBF+gK3aAO6iKG3i3H2rX2rA97y963D+aobZUzu+hH2EefVIK1ig==</latexit><latexit sha1_base64="e2j9oPERXnt8bgyTVS1N9P82sIY=">AAACZ3icbZFLSwMxEMez66uur/pABC/RIrRIy0YEexEELx4rWK10y5JNs20w+yCZFcqyfkhv3r34LUzbBZ8DQ/6Z+Q1J/glSKTS47ptlLywuLa9UVp219Y3Nrer2zr1OMsV4lyUyUb2Aai5FzLsgQPJeqjiNAskfgqfraf/hmSstkvgOJikfRHQUi1AwCqbkV1/wJZbJyAsVZXmn/nhJGsVsdRsFPsWOp7PIF7hn0mB4znkw5kD9XJCiwF87t5iO1EnT0I0SJU38kybN77xfrbktdxb4ryClqKEyOn711RsmLIt4DExSrfvETWGQUwWCSV44XqZ5StkTHfG+kTGNuB7kM58KfGIqQxwmymQMeFb9PpHTSOtJFBgyojDWv3vT4n+9fgZhe5CLOM2Ax2x+UJhJDAmemo6HQnEGcmIEZUqYu2I2psYfMF/jGBPI7yf/FfdnLeK2yO157apd2lFBh+gY1RFBF+gK3aAO6iKG3i3H2rX2rA97y963D+aobZUzu+hH2EefVIK1ig==</latexit><latexit sha1_base64="e2j9oPERXnt8bgyTVS1N9P82sIY=">AAACZ3icbZFLSwMxEMez66uur/pABC/RIrRIy0YEexEELx4rWK10y5JNs20w+yCZFcqyfkhv3r34LUzbBZ8DQ/6Z+Q1J/glSKTS47ptlLywuLa9UVp219Y3Nrer2zr1OMsV4lyUyUb2Aai5FzLsgQPJeqjiNAskfgqfraf/hmSstkvgOJikfRHQUi1AwCqbkV1/wJZbJyAsVZXmn/nhJGsVsdRsFPsWOp7PIF7hn0mB4znkw5kD9XJCiwF87t5iO1EnT0I0SJU38kybN77xfrbktdxb4ryClqKEyOn711RsmLIt4DExSrfvETWGQUwWCSV44XqZ5StkTHfG+kTGNuB7kM58KfGIqQxwmymQMeFb9PpHTSOtJFBgyojDWv3vT4n+9fgZhe5CLOM2Ax2x+UJhJDAmemo6HQnEGcmIEZUqYu2I2psYfMF/jGBPI7yf/FfdnLeK2yO157apd2lFBh+gY1RFBF+gK3aAO6iKG3i3H2rX2rA97y963D+aobZUzu+hH2EefVIK1ig==</latexit><latexit sha1_base64="e2j9oPERXnt8bgyTVS1N9P82sIY=">AAACZ3icbZFLSwMxEMez66uur/pABC/RIrRIy0YEexEELx4rWK10y5JNs20w+yCZFcqyfkhv3r34LUzbBZ8DQ/6Z+Q1J/glSKTS47ptlLywuLa9UVp219Y3Nrer2zr1OMsV4lyUyUb2Aai5FzLsgQPJeqjiNAskfgqfraf/hmSstkvgOJikfRHQUi1AwCqbkV1/wJZbJyAsVZXmn/nhJGsVsdRsFPsWOp7PIF7hn0mB4znkw5kD9XJCiwF87t5iO1EnT0I0SJU38kybN77xfrbktdxb4ryClqKEyOn711RsmLIt4DExSrfvETWGQUwWCSV44XqZ5StkTHfG+kTGNuB7kM58KfGIqQxwmymQMeFb9PpHTSOtJFBgyojDWv3vT4n+9fgZhe5CLOM2Ax2x+UJhJDAmemo6HQnEGcmIEZUqYu2I2psYfMF/jGBPI7yf/FfdnLeK2yO157apd2lFBh+gY1RFBF+gK3aAO6iKG3i3H2rX2rA97y963D+aobZUzu+hH2EefVIK1ig==</latexit>

P (Y = 1|X1...Xn)

P (Y = 0|X1...Xn)=

P (Y = 1)Q

i P (Xi|Y = 1)

P (Y = 0)Q

i P (Xi|Y = 0)<latexit sha1_base64="qPxGUhFFqiqy0kAj6kcWnWvH5J0=">AAACUXicbVFNS8MwGH5Xv+b8mnr0EhyCu5RWBHcRBl48TnCzspaSZqkLpmlJUmHU/UUPevJ/ePGgmG49zOkLgYfngzd5EmWcKe047zVrZXVtfaO+2dja3tnda+4fDFSaS0L7JOWp9CKsKGeC9jXTnHqZpDiJOL2LHq9K/e6JSsVScasnGQ0S/CBYzAjWhgqbYz+WmBSod3p/6T57oWvbtheK9rQoGWeRQZdobp55234m01HITNIL2XPJVJk2WpYcEw6bLcd2ZoP+ArcCLaimFzZf/VFK8oQKTThWaug6mQ4KLDUjnE4bfq5ohskjfqBDAwVOqAqKWSNTdGKYEYpTaY7QaMYuJgqcKDVJIuNMsB6rZa0k/9OGuY47QcFElmsqyHxRnHOkU1TWi0ZMUqL5xABMJDN3RWSMTWvafELDlOAuP/kvGJzZrmO7N+etbqeqow5HcAyn4MIFdOEaetAHAi/wAV/wXXurfVpgWXOrVasyh/BrrK0fGJSuHg==</latexit><latexit sha1_base64="qPxGUhFFqiqy0kAj6kcWnWvH5J0=">AAACUXicbVFNS8MwGH5Xv+b8mnr0EhyCu5RWBHcRBl48TnCzspaSZqkLpmlJUmHU/UUPevJ/ePGgmG49zOkLgYfngzd5EmWcKe047zVrZXVtfaO+2dja3tnda+4fDFSaS0L7JOWp9CKsKGeC9jXTnHqZpDiJOL2LHq9K/e6JSsVScasnGQ0S/CBYzAjWhgqbYz+WmBSod3p/6T57oWvbtheK9rQoGWeRQZdobp55234m01HITNIL2XPJVJk2WpYcEw6bLcd2ZoP+ArcCLaimFzZf/VFK8oQKTThWaug6mQ4KLDUjnE4bfq5ohskjfqBDAwVOqAqKWSNTdGKYEYpTaY7QaMYuJgqcKDVJIuNMsB6rZa0k/9OGuY47QcFElmsqyHxRnHOkU1TWi0ZMUqL5xABMJDN3RWSMTWvafELDlOAuP/kvGJzZrmO7N+etbqeqow5HcAyn4MIFdOEaetAHAi/wAV/wXXurfVpgWXOrVasyh/BrrK0fGJSuHg==</latexit><latexit sha1_base64="qPxGUhFFqiqy0kAj6kcWnWvH5J0=">AAACUXicbVFNS8MwGH5Xv+b8mnr0EhyCu5RWBHcRBl48TnCzspaSZqkLpmlJUmHU/UUPevJ/ePGgmG49zOkLgYfngzd5EmWcKe047zVrZXVtfaO+2dja3tnda+4fDFSaS0L7JOWp9CKsKGeC9jXTnHqZpDiJOL2LHq9K/e6JSsVScasnGQ0S/CBYzAjWhgqbYz+WmBSod3p/6T57oWvbtheK9rQoGWeRQZdobp55234m01HITNIL2XPJVJk2WpYcEw6bLcd2ZoP+ArcCLaimFzZf/VFK8oQKTThWaug6mQ4KLDUjnE4bfq5ohskjfqBDAwVOqAqKWSNTdGKYEYpTaY7QaMYuJgqcKDVJIuNMsB6rZa0k/9OGuY47QcFElmsqyHxRnHOkU1TWi0ZMUqL5xABMJDN3RWSMTWvafELDlOAuP/kvGJzZrmO7N+etbqeqow5HcAyn4MIFdOEaetAHAi/wAV/wXXurfVpgWXOrVasyh/BrrK0fGJSuHg==</latexit><latexit sha1_base64="qPxGUhFFqiqy0kAj6kcWnWvH5J0=">AAACUXicbVFNS8MwGH5Xv+b8mnr0EhyCu5RWBHcRBl48TnCzspaSZqkLpmlJUmHU/UUPevJ/ePGgmG49zOkLgYfngzd5EmWcKe047zVrZXVtfaO+2dja3tnda+4fDFSaS0L7JOWp9CKsKGeC9jXTnHqZpDiJOL2LHq9K/e6JSsVScasnGQ0S/CBYzAjWhgqbYz+WmBSod3p/6T57oWvbtheK9rQoGWeRQZdobp55234m01HITNIL2XPJVJk2WpYcEw6bLcd2ZoP+ArcCLaimFzZf/VFK8oQKTThWaug6mQ4KLDUjnE4bfq5ohskjfqBDAwVOqAqKWSNTdGKYEYpTaY7QaMYuJgqcKDVJIuNMsB6rZa0k/9OGuY47QcFElmsqyHxRnHOkU1TWi0ZMUqL5xABMJDN3RWSMTWvafELDlOAuP/kvGJzZrmO7N+etbqeqow5HcAyn4MIFdOEaetAHAi/wAV/wXXurfVpgWXOrVasyh/BrrK0fGJSuHg==</latexit>

Page 13: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

Another way to view Naïve Bayes (Boolean Y, Xi’s): Decision rule: is this quantity greater or less than 1?

Naïve Bayes: Point #3

logP (Y = 1|X1...Xn)

P (Y = 0|X1...Xn)= log

P (Y = 1)

P (Y = 0)+

X

i

logP (Xi|Y = 1)

P (Xi|Y = 0)<latexit sha1_base64="/rYJIOwZIzM4Y+zUxADSRw01G6Y=">AAACXHicbZHPS8MwHMXTOnU/nE4FL16CQ5gIpRXBXQYDLx4nuK2yjZJm6RaWpiVJhdHtn/S2i/+KZlsnc/MLgcd7n5DkxY8Zlcq2F4Z5kDs8Os4XiqWT8ulZ5fyiI6NEYNLGEYuE6yNJGOWkrahixI0FQaHPSNefPC/z7gcRkkb8TU1jMgjRiNOAYqS05VUki0b9QCCcwlbtveHMXM+xLMv1+N08XTr2tgMb8Jdf4RtIR/ew2JdJ6NFtxPXobIOttUa9StW27NXAfeFkogqyaXmVz/4wwklIuMIMSdlz7FgNUiQUxYzMi/1EkhjhCRqRnpYchUQO0lU5c3irnSEMIqEXV3Dlbu9IUSjlNPQ1GSI1lrvZ0vwv6yUqqA9SyuNEEY7XBwUJgyqCy6bhkAqCFZtqgbCg+q4Qj5HuRen/KOoSnN0n74vOg+XYlvP6WG3Wszry4BrcgBpwwBNoghfQAm2AwQJ8G3mjYHyZObNklteoaWR7LsGfMa9+AMWMsBk=</latexit><latexit sha1_base64="/rYJIOwZIzM4Y+zUxADSRw01G6Y=">AAACXHicbZHPS8MwHMXTOnU/nE4FL16CQ5gIpRXBXQYDLx4nuK2yjZJm6RaWpiVJhdHtn/S2i/+KZlsnc/MLgcd7n5DkxY8Zlcq2F4Z5kDs8Os4XiqWT8ulZ5fyiI6NEYNLGEYuE6yNJGOWkrahixI0FQaHPSNefPC/z7gcRkkb8TU1jMgjRiNOAYqS05VUki0b9QCCcwlbtveHMXM+xLMv1+N08XTr2tgMb8Jdf4RtIR/ew2JdJ6NFtxPXobIOttUa9StW27NXAfeFkogqyaXmVz/4wwklIuMIMSdlz7FgNUiQUxYzMi/1EkhjhCRqRnpYchUQO0lU5c3irnSEMIqEXV3Dlbu9IUSjlNPQ1GSI1lrvZ0vwv6yUqqA9SyuNEEY7XBwUJgyqCy6bhkAqCFZtqgbCg+q4Qj5HuRen/KOoSnN0n74vOg+XYlvP6WG3Wszry4BrcgBpwwBNoghfQAm2AwQJ8G3mjYHyZObNklteoaWR7LsGfMa9+AMWMsBk=</latexit><latexit sha1_base64="/rYJIOwZIzM4Y+zUxADSRw01G6Y=">AAACXHicbZHPS8MwHMXTOnU/nE4FL16CQ5gIpRXBXQYDLx4nuK2yjZJm6RaWpiVJhdHtn/S2i/+KZlsnc/MLgcd7n5DkxY8Zlcq2F4Z5kDs8Os4XiqWT8ulZ5fyiI6NEYNLGEYuE6yNJGOWkrahixI0FQaHPSNefPC/z7gcRkkb8TU1jMgjRiNOAYqS05VUki0b9QCCcwlbtveHMXM+xLMv1+N08XTr2tgMb8Jdf4RtIR/ew2JdJ6NFtxPXobIOttUa9StW27NXAfeFkogqyaXmVz/4wwklIuMIMSdlz7FgNUiQUxYzMi/1EkhjhCRqRnpYchUQO0lU5c3irnSEMIqEXV3Dlbu9IUSjlNPQ1GSI1lrvZ0vwv6yUqqA9SyuNEEY7XBwUJgyqCy6bhkAqCFZtqgbCg+q4Qj5HuRen/KOoSnN0n74vOg+XYlvP6WG3Wszry4BrcgBpwwBNoghfQAm2AwQJ8G3mjYHyZObNklteoaWR7LsGfMa9+AMWMsBk=</latexit><latexit sha1_base64="/rYJIOwZIzM4Y+zUxADSRw01G6Y=">AAACXHicbZHPS8MwHMXTOnU/nE4FL16CQ5gIpRXBXQYDLx4nuK2yjZJm6RaWpiVJhdHtn/S2i/+KZlsnc/MLgcd7n5DkxY8Zlcq2F4Z5kDs8Os4XiqWT8ulZ5fyiI6NEYNLGEYuE6yNJGOWkrahixI0FQaHPSNefPC/z7gcRkkb8TU1jMgjRiNOAYqS05VUki0b9QCCcwlbtveHMXM+xLMv1+N08XTr2tgMb8Jdf4RtIR/ew2JdJ6NFtxPXobIOttUa9StW27NXAfeFkogqyaXmVz/4wwklIuMIMSdlz7FgNUiQUxYzMi/1EkhjhCRqRnpYchUQO0lU5c3irnSEMIqEXV3Dlbu9IUSjlNPQ1GSI1lrvZ0vwv6yUqqA9SyuNEEY7XBwUJgyqCy6bhkAqCFZtqgbCg+q4Qj5HuRen/KOoSnN0n74vOg+XYlvP6WG3Wszry4BrcgBpwwBNoghfQAm2AwQJ8G3mjYHyZObNklteoaWR7LsGfMa9+AMWMsBk=</latexit>

What happens when X has a large number of dimensions? 1� ✓ik = P̂ (Xi = 0|Y = k)

<latexit sha1_base64="B/e7ypgMQQOWjtzVDDblu7ueuoI=">AAACDXicbVDLSsNAFJ3UV62vqEs3g1WoC0sigt0UCm5cVrAPaUqYTKfN0MmDmRuhxPyAG3/FjQtF3Lp35984bbPQ6oELh3Pu5d57vFhwBZb1ZRSWlldW14rrpY3Nre0dc3evraJEUtaikYhk1yOKCR6yFnAQrBtLRgJPsI43vpz6nTsmFY/CG5jErB+QUciHnBLQkmse2afYAZ8BcVM+znAdOz6BtJnhStfldev+tj4+cc2yVbVmwH+JnZMyytF0zU9nENEkYCFQQZTq2VYM/ZRI4FSwrOQkisWEjsmI9TQNScBUP519k+FjrQzwMJK6QsAz9edESgKlJoGnOwMCvlr0puJ/Xi+BYa2f8jBOgIV0vmiYCAwRnkaDB1wyCmKiCaGS61sx9YkkFHSAJR2CvfjyX9I+q9pW1b4+LzdqeRxFdIAOUQXZ6AI10BVqohai6AE9oRf0ajwaz8ab8T5vLRj5zD76BePjG6olmf0=</latexit><latexit sha1_base64="B/e7ypgMQQOWjtzVDDblu7ueuoI=">AAACDXicbVDLSsNAFJ3UV62vqEs3g1WoC0sigt0UCm5cVrAPaUqYTKfN0MmDmRuhxPyAG3/FjQtF3Lp35984bbPQ6oELh3Pu5d57vFhwBZb1ZRSWlldW14rrpY3Nre0dc3evraJEUtaikYhk1yOKCR6yFnAQrBtLRgJPsI43vpz6nTsmFY/CG5jErB+QUciHnBLQkmse2afYAZ8BcVM+znAdOz6BtJnhStfldev+tj4+cc2yVbVmwH+JnZMyytF0zU9nENEkYCFQQZTq2VYM/ZRI4FSwrOQkisWEjsmI9TQNScBUP519k+FjrQzwMJK6QsAz9edESgKlJoGnOwMCvlr0puJ/Xi+BYa2f8jBOgIV0vmiYCAwRnkaDB1wyCmKiCaGS61sx9YkkFHSAJR2CvfjyX9I+q9pW1b4+LzdqeRxFdIAOUQXZ6AI10BVqohai6AE9oRf0ajwaz8ab8T5vLRj5zD76BePjG6olmf0=</latexit><latexit sha1_base64="B/e7ypgMQQOWjtzVDDblu7ueuoI=">AAACDXicbVDLSsNAFJ3UV62vqEs3g1WoC0sigt0UCm5cVrAPaUqYTKfN0MmDmRuhxPyAG3/FjQtF3Lp35984bbPQ6oELh3Pu5d57vFhwBZb1ZRSWlldW14rrpY3Nre0dc3evraJEUtaikYhk1yOKCR6yFnAQrBtLRgJPsI43vpz6nTsmFY/CG5jErB+QUciHnBLQkmse2afYAZ8BcVM+znAdOz6BtJnhStfldev+tj4+cc2yVbVmwH+JnZMyytF0zU9nENEkYCFQQZTq2VYM/ZRI4FSwrOQkisWEjsmI9TQNScBUP519k+FjrQzwMJK6QsAz9edESgKlJoGnOwMCvlr0puJ/Xi+BYa2f8jBOgIV0vmiYCAwRnkaDB1wyCmKiCaGS61sx9YkkFHSAJR2CvfjyX9I+q9pW1b4+LzdqeRxFdIAOUQXZ6AI10BVqohai6AE9oRf0ajwaz8ab8T5vLRj5zD76BePjG6olmf0=</latexit><latexit sha1_base64="B/e7ypgMQQOWjtzVDDblu7ueuoI=">AAACDXicbVDLSsNAFJ3UV62vqEs3g1WoC0sigt0UCm5cVrAPaUqYTKfN0MmDmRuhxPyAG3/FjQtF3Lp35984bbPQ6oELh3Pu5d57vFhwBZb1ZRSWlldW14rrpY3Nre0dc3evraJEUtaikYhk1yOKCR6yFnAQrBtLRgJPsI43vpz6nTsmFY/CG5jErB+QUciHnBLQkmse2afYAZ8BcVM+znAdOz6BtJnhStfldev+tj4+cc2yVbVmwH+JnZMyytF0zU9nENEkYCFQQZTq2VYM/ZRI4FSwrOQkisWEjsmI9TQNScBUP519k+FjrQzwMJK6QsAz9edESgKlJoGnOwMCvlr0puJ/Xi+BYa2f8jBOgIV0vmiYCAwRnkaDB1wyCmKiCaGS61sx9YkkFHSAJR2CvfjyX9I+q9pW1b4+LzdqeRxFdIAOUQXZ6AI10BVqohai6AE9oRf0ajwaz8ab8T5vLRj5zD76BePjG6olmf0=</latexit>

✓ik = P̂ (Xi = 1|Y = k)<latexit sha1_base64="mi3cvEbeM0hMBeUWxZLXWgT5+x8=">AAACCnicbVDLSsNAFJ3UV62vqEs3o0Wom5KIYDeFghuXFexD2hAm00kzdPJg5kYoMWs3/oobF4q49Qvc+TdOHwttPXDhcM693HuPlwiuwLK+jcLK6tr6RnGztLW9s7tn7h+0VZxKylo0FrHsekQxwSPWAg6CdRPJSOgJ1vFGVxO/c8+k4nF0C+OEOSEZRtznlICWXPO4DwED4mZ8lOM67gcEsmaOK12X1+2Hu/rozDXLVtWaAi8Te07KaI6ma371BzFNQxYBFUSpnm0l4GREAqeC5aV+qlhC6IgMWU/TiIRMOdn0lRyfamWA/VjqigBP1d8TGQmVGoee7gwJBGrRm4j/eb0U/JqT8ShJgUV0tshPBYYYT3LBAy4ZBTHWhFDJ9a2YBkQSCjq9kg7BXnx5mbTPq7ZVtW8uyo3aPI4iOkInqIJsdIka6Bo1UQtR9Iie0St6M56MF+Pd+Ji1Foz5zCH6A+PzB2wLmWI=</latexit><latexit sha1_base64="mi3cvEbeM0hMBeUWxZLXWgT5+x8=">AAACCnicbVDLSsNAFJ3UV62vqEs3o0Wom5KIYDeFghuXFexD2hAm00kzdPJg5kYoMWs3/oobF4q49Qvc+TdOHwttPXDhcM693HuPlwiuwLK+jcLK6tr6RnGztLW9s7tn7h+0VZxKylo0FrHsekQxwSPWAg6CdRPJSOgJ1vFGVxO/c8+k4nF0C+OEOSEZRtznlICWXPO4DwED4mZ8lOM67gcEsmaOK12X1+2Hu/rozDXLVtWaAi8Te07KaI6ma371BzFNQxYBFUSpnm0l4GREAqeC5aV+qlhC6IgMWU/TiIRMOdn0lRyfamWA/VjqigBP1d8TGQmVGoee7gwJBGrRm4j/eb0U/JqT8ShJgUV0tshPBYYYT3LBAy4ZBTHWhFDJ9a2YBkQSCjq9kg7BXnx5mbTPq7ZVtW8uyo3aPI4iOkInqIJsdIka6Bo1UQtR9Iie0St6M56MF+Pd+Ji1Foz5zCH6A+PzB2wLmWI=</latexit><latexit sha1_base64="mi3cvEbeM0hMBeUWxZLXWgT5+x8=">AAACCnicbVDLSsNAFJ3UV62vqEs3o0Wom5KIYDeFghuXFexD2hAm00kzdPJg5kYoMWs3/oobF4q49Qvc+TdOHwttPXDhcM693HuPlwiuwLK+jcLK6tr6RnGztLW9s7tn7h+0VZxKylo0FrHsekQxwSPWAg6CdRPJSOgJ1vFGVxO/c8+k4nF0C+OEOSEZRtznlICWXPO4DwED4mZ8lOM67gcEsmaOK12X1+2Hu/rozDXLVtWaAi8Te07KaI6ma371BzFNQxYBFUSpnm0l4GREAqeC5aV+qlhC6IgMWU/TiIRMOdn0lRyfamWA/VjqigBP1d8TGQmVGoee7gwJBGrRm4j/eb0U/JqT8ShJgUV0tshPBYYYT3LBAy4ZBTHWhFDJ9a2YBkQSCjq9kg7BXnx5mbTPq7ZVtW8uyo3aPI4iOkInqIJsdIka6Bo1UQtR9Iie0St6M56MF+Pd+Ji1Foz5zCH6A+PzB2wLmWI=</latexit><latexit sha1_base64="mi3cvEbeM0hMBeUWxZLXWgT5+x8=">AAACCnicbVDLSsNAFJ3UV62vqEs3o0Wom5KIYDeFghuXFexD2hAm00kzdPJg5kYoMWs3/oobF4q49Qvc+TdOHwttPXDhcM693HuPlwiuwLK+jcLK6tr6RnGztLW9s7tn7h+0VZxKylo0FrHsekQxwSPWAg6CdRPJSOgJ1vFGVxO/c8+k4nF0C+OEOSEZRtznlICWXPO4DwED4mZ8lOM67gcEsmaOK12X1+2Hu/rozDXLVtWaAi8Te07KaI6ma371BzFNQxYBFUSpnm0l4GREAqeC5aV+qlhC6IgMWU/TiIRMOdn0lRyfamWA/VjqigBP1d8TGQmVGoee7gwJBGrRm4j/eb0U/JqT8ShJgUV0tshPBYYYT3LBAy4ZBTHWhFDJ9a2YBkQSCjq9kg7BXnx5mbTPq7ZVtW8uyo3aPI4iOkInqIJsdIka6Bo1UQtR9Iie0St6M56MF+Pd+Ji1Foz5zCH6A+PzB2wLmWI=</latexit>

= logP (Y = 1)

P (Y = 0)+X

i

Xilog✓i1✓i0

+ (1�Xi)1� ✓i11� ✓i0

<latexit sha1_base64="e2j9oPERXnt8bgyTVS1N9P82sIY=">AAACZ3icbZFLSwMxEMez66uur/pABC/RIrRIy0YEexEELx4rWK10y5JNs20w+yCZFcqyfkhv3r34LUzbBZ8DQ/6Z+Q1J/glSKTS47ptlLywuLa9UVp219Y3Nrer2zr1OMsV4lyUyUb2Aai5FzLsgQPJeqjiNAskfgqfraf/hmSstkvgOJikfRHQUi1AwCqbkV1/wJZbJyAsVZXmn/nhJGsVsdRsFPsWOp7PIF7hn0mB4znkw5kD9XJCiwF87t5iO1EnT0I0SJU38kybN77xfrbktdxb4ryClqKEyOn711RsmLIt4DExSrfvETWGQUwWCSV44XqZ5StkTHfG+kTGNuB7kM58KfGIqQxwmymQMeFb9PpHTSOtJFBgyojDWv3vT4n+9fgZhe5CLOM2Ax2x+UJhJDAmemo6HQnEGcmIEZUqYu2I2psYfMF/jGBPI7yf/FfdnLeK2yO157apd2lFBh+gY1RFBF+gK3aAO6iKG3i3H2rX2rA97y963D+aobZUzu+hH2EefVIK1ig==</latexit><latexit sha1_base64="e2j9oPERXnt8bgyTVS1N9P82sIY=">AAACZ3icbZFLSwMxEMez66uur/pABC/RIrRIy0YEexEELx4rWK10y5JNs20w+yCZFcqyfkhv3r34LUzbBZ8DQ/6Z+Q1J/glSKTS47ptlLywuLa9UVp219Y3Nrer2zr1OMsV4lyUyUb2Aai5FzLsgQPJeqjiNAskfgqfraf/hmSstkvgOJikfRHQUi1AwCqbkV1/wJZbJyAsVZXmn/nhJGsVsdRsFPsWOp7PIF7hn0mB4znkw5kD9XJCiwF87t5iO1EnT0I0SJU38kybN77xfrbktdxb4ryClqKEyOn711RsmLIt4DExSrfvETWGQUwWCSV44XqZ5StkTHfG+kTGNuB7kM58KfGIqQxwmymQMeFb9PpHTSOtJFBgyojDWv3vT4n+9fgZhe5CLOM2Ax2x+UJhJDAmemo6HQnEGcmIEZUqYu2I2psYfMF/jGBPI7yf/FfdnLeK2yO157apd2lFBh+gY1RFBF+gK3aAO6iKG3i3H2rX2rA97y963D+aobZUzu+hH2EefVIK1ig==</latexit><latexit sha1_base64="e2j9oPERXnt8bgyTVS1N9P82sIY=">AAACZ3icbZFLSwMxEMez66uur/pABC/RIrRIy0YEexEELx4rWK10y5JNs20w+yCZFcqyfkhv3r34LUzbBZ8DQ/6Z+Q1J/glSKTS47ptlLywuLa9UVp219Y3Nrer2zr1OMsV4lyUyUb2Aai5FzLsgQPJeqjiNAskfgqfraf/hmSstkvgOJikfRHQUi1AwCqbkV1/wJZbJyAsVZXmn/nhJGsVsdRsFPsWOp7PIF7hn0mB4znkw5kD9XJCiwF87t5iO1EnT0I0SJU38kybN77xfrbktdxb4ryClqKEyOn711RsmLIt4DExSrfvETWGQUwWCSV44XqZ5StkTHfG+kTGNuB7kM58KfGIqQxwmymQMeFb9PpHTSOtJFBgyojDWv3vT4n+9fgZhe5CLOM2Ax2x+UJhJDAmemo6HQnEGcmIEZUqYu2I2psYfMF/jGBPI7yf/FfdnLeK2yO157apd2lFBh+gY1RFBF+gK3aAO6iKG3i3H2rX2rA97y963D+aobZUzu+hH2EefVIK1ig==</latexit><latexit sha1_base64="e2j9oPERXnt8bgyTVS1N9P82sIY=">AAACZ3icbZFLSwMxEMez66uur/pABC/RIrRIy0YEexEELx4rWK10y5JNs20w+yCZFcqyfkhv3r34LUzbBZ8DQ/6Z+Q1J/glSKTS47ptlLywuLa9UVp219Y3Nrer2zr1OMsV4lyUyUb2Aai5FzLsgQPJeqjiNAskfgqfraf/hmSstkvgOJikfRHQUi1AwCqbkV1/wJZbJyAsVZXmn/nhJGsVsdRsFPsWOp7PIF7hn0mB4znkw5kD9XJCiwF87t5iO1EnT0I0SJU38kybN77xfrbktdxb4ryClqKEyOn711RsmLIt4DExSrfvETWGQUwWCSV44XqZ5StkTHfG+kTGNuB7kM58KfGIqQxwmymQMeFb9PpHTSOtJFBgyojDWv3vT4n+9fgZhe5CLOM2Ax2x+UJhJDAmemo6HQnEGcmIEZUqYu2I2psYfMF/jGBPI7yf/FfdnLeK2yO157apd2lFBh+gY1RFBF+gK3aAO6iKG3i3H2rX2rA97y963D+aobZUzu+hH2EefVIK1ig==</latexit>

P (Y = 1|X1...Xn)

P (Y = 0|X1...Xn)=

P (Y = 1)Q

i P (Xi|Y = 1)

P (Y = 0)Q

i P (Xi|Y = 0)<latexit sha1_base64="qPxGUhFFqiqy0kAj6kcWnWvH5J0=">AAACUXicbVFNS8MwGH5Xv+b8mnr0EhyCu5RWBHcRBl48TnCzspaSZqkLpmlJUmHU/UUPevJ/ePGgmG49zOkLgYfngzd5EmWcKe047zVrZXVtfaO+2dja3tnda+4fDFSaS0L7JOWp9CKsKGeC9jXTnHqZpDiJOL2LHq9K/e6JSsVScasnGQ0S/CBYzAjWhgqbYz+WmBSod3p/6T57oWvbtheK9rQoGWeRQZdobp55234m01HITNIL2XPJVJk2WpYcEw6bLcd2ZoP+ArcCLaimFzZf/VFK8oQKTThWaug6mQ4KLDUjnE4bfq5ohskjfqBDAwVOqAqKWSNTdGKYEYpTaY7QaMYuJgqcKDVJIuNMsB6rZa0k/9OGuY47QcFElmsqyHxRnHOkU1TWi0ZMUqL5xABMJDN3RWSMTWvafELDlOAuP/kvGJzZrmO7N+etbqeqow5HcAyn4MIFdOEaetAHAi/wAV/wXXurfVpgWXOrVasyh/BrrK0fGJSuHg==</latexit><latexit sha1_base64="qPxGUhFFqiqy0kAj6kcWnWvH5J0=">AAACUXicbVFNS8MwGH5Xv+b8mnr0EhyCu5RWBHcRBl48TnCzspaSZqkLpmlJUmHU/UUPevJ/ePGgmG49zOkLgYfngzd5EmWcKe047zVrZXVtfaO+2dja3tnda+4fDFSaS0L7JOWp9CKsKGeC9jXTnHqZpDiJOL2LHq9K/e6JSsVScasnGQ0S/CBYzAjWhgqbYz+WmBSod3p/6T57oWvbtheK9rQoGWeRQZdobp55234m01HITNIL2XPJVJk2WpYcEw6bLcd2ZoP+ArcCLaimFzZf/VFK8oQKTThWaug6mQ4KLDUjnE4bfq5ohskjfqBDAwVOqAqKWSNTdGKYEYpTaY7QaMYuJgqcKDVJIuNMsB6rZa0k/9OGuY47QcFElmsqyHxRnHOkU1TWi0ZMUqL5xABMJDN3RWSMTWvafELDlOAuP/kvGJzZrmO7N+etbqeqow5HcAyn4MIFdOEaetAHAi/wAV/wXXurfVpgWXOrVasyh/BrrK0fGJSuHg==</latexit><latexit sha1_base64="qPxGUhFFqiqy0kAj6kcWnWvH5J0=">AAACUXicbVFNS8MwGH5Xv+b8mnr0EhyCu5RWBHcRBl48TnCzspaSZqkLpmlJUmHU/UUPevJ/ePGgmG49zOkLgYfngzd5EmWcKe047zVrZXVtfaO+2dja3tnda+4fDFSaS0L7JOWp9CKsKGeC9jXTnHqZpDiJOL2LHq9K/e6JSsVScasnGQ0S/CBYzAjWhgqbYz+WmBSod3p/6T57oWvbtheK9rQoGWeRQZdobp55234m01HITNIL2XPJVJk2WpYcEw6bLcd2ZoP+ArcCLaimFzZf/VFK8oQKTThWaug6mQ4KLDUjnE4bfq5ohskjfqBDAwVOqAqKWSNTdGKYEYpTaY7QaMYuJgqcKDVJIuNMsB6rZa0k/9OGuY47QcFElmsqyHxRnHOkU1TWi0ZMUqL5xABMJDN3RWSMTWvafELDlOAuP/kvGJzZrmO7N+etbqeqow5HcAyn4MIFdOEaetAHAi/wAV/wXXurfVpgWXOrVasyh/BrrK0fGJSuHg==</latexit><latexit sha1_base64="qPxGUhFFqiqy0kAj6kcWnWvH5J0=">AAACUXicbVFNS8MwGH5Xv+b8mnr0EhyCu5RWBHcRBl48TnCzspaSZqkLpmlJUmHU/UUPevJ/ePGgmG49zOkLgYfngzd5EmWcKe047zVrZXVtfaO+2dja3tnda+4fDFSaS0L7JOWp9CKsKGeC9jXTnHqZpDiJOL2LHq9K/e6JSsVScasnGQ0S/CBYzAjWhgqbYz+WmBSod3p/6T57oWvbtheK9rQoGWeRQZdobp55234m01HITNIL2XPJVJk2WpYcEw6bLcd2ZoP+ArcCLaimFzZf/VFK8oQKTThWaug6mQ4KLDUjnE4bfq5ohskjfqBDAwVOqAqKWSNTdGKYEYpTaY7QaMYuJgqcKDVJIuNMsB6rZa0k/9OGuY47QcFElmsqyHxRnHOkU1TWi0ZMUqL5xABMJDN3RWSMTWvafELDlOAuP/kvGJzZrmO7N+etbqeqow5HcAyn4MIFdOEaetAHAi/wAV/wXXurfVpgWXOrVasyh/BrrK0fGJSuHg==</latexit>

Another way to view Naïve Bayes (Boolean Y, Xi’s): Decision rule: is this quantity greater or less than 1?

Naïve Bayes: Point #3

logP (Y = 1|X1...Xn)

P (Y = 0|X1...Xn)= log

P (Y = 1)

P (Y = 0)+

X

i

logP (Xi|Y = 1)

P (Xi|Y = 0)<latexit sha1_base64="/rYJIOwZIzM4Y+zUxADSRw01G6Y=">AAACXHicbZHPS8MwHMXTOnU/nE4FL16CQ5gIpRXBXQYDLx4nuK2yjZJm6RaWpiVJhdHtn/S2i/+KZlsnc/MLgcd7n5DkxY8Zlcq2F4Z5kDs8Os4XiqWT8ulZ5fyiI6NEYNLGEYuE6yNJGOWkrahixI0FQaHPSNefPC/z7gcRkkb8TU1jMgjRiNOAYqS05VUki0b9QCCcwlbtveHMXM+xLMv1+N08XTr2tgMb8Jdf4RtIR/ew2JdJ6NFtxPXobIOttUa9StW27NXAfeFkogqyaXmVz/4wwklIuMIMSdlz7FgNUiQUxYzMi/1EkhjhCRqRnpYchUQO0lU5c3irnSEMIqEXV3Dlbu9IUSjlNPQ1GSI1lrvZ0vwv6yUqqA9SyuNEEY7XBwUJgyqCy6bhkAqCFZtqgbCg+q4Qj5HuRen/KOoSnN0n74vOg+XYlvP6WG3Wszry4BrcgBpwwBNoghfQAm2AwQJ8G3mjYHyZObNklteoaWR7LsGfMa9+AMWMsBk=</latexit><latexit sha1_base64="/rYJIOwZIzM4Y+zUxADSRw01G6Y=">AAACXHicbZHPS8MwHMXTOnU/nE4FL16CQ5gIpRXBXQYDLx4nuK2yjZJm6RaWpiVJhdHtn/S2i/+KZlsnc/MLgcd7n5DkxY8Zlcq2F4Z5kDs8Os4XiqWT8ulZ5fyiI6NEYNLGEYuE6yNJGOWkrahixI0FQaHPSNefPC/z7gcRkkb8TU1jMgjRiNOAYqS05VUki0b9QCCcwlbtveHMXM+xLMv1+N08XTr2tgMb8Jdf4RtIR/ew2JdJ6NFtxPXobIOttUa9StW27NXAfeFkogqyaXmVz/4wwklIuMIMSdlz7FgNUiQUxYzMi/1EkhjhCRqRnpYchUQO0lU5c3irnSEMIqEXV3Dlbu9IUSjlNPQ1GSI1lrvZ0vwv6yUqqA9SyuNEEY7XBwUJgyqCy6bhkAqCFZtqgbCg+q4Qj5HuRen/KOoSnN0n74vOg+XYlvP6WG3Wszry4BrcgBpwwBNoghfQAm2AwQJ8G3mjYHyZObNklteoaWR7LsGfMa9+AMWMsBk=</latexit><latexit sha1_base64="/rYJIOwZIzM4Y+zUxADSRw01G6Y=">AAACXHicbZHPS8MwHMXTOnU/nE4FL16CQ5gIpRXBXQYDLx4nuK2yjZJm6RaWpiVJhdHtn/S2i/+KZlsnc/MLgcd7n5DkxY8Zlcq2F4Z5kDs8Os4XiqWT8ulZ5fyiI6NEYNLGEYuE6yNJGOWkrahixI0FQaHPSNefPC/z7gcRkkb8TU1jMgjRiNOAYqS05VUki0b9QCCcwlbtveHMXM+xLMv1+N08XTr2tgMb8Jdf4RtIR/ew2JdJ6NFtxPXobIOttUa9StW27NXAfeFkogqyaXmVz/4wwklIuMIMSdlz7FgNUiQUxYzMi/1EkhjhCRqRnpYchUQO0lU5c3irnSEMIqEXV3Dlbu9IUSjlNPQ1GSI1lrvZ0vwv6yUqqA9SyuNEEY7XBwUJgyqCy6bhkAqCFZtqgbCg+q4Qj5HuRen/KOoSnN0n74vOg+XYlvP6WG3Wszry4BrcgBpwwBNoghfQAm2AwQJ8G3mjYHyZObNklteoaWR7LsGfMa9+AMWMsBk=</latexit><latexit sha1_base64="/rYJIOwZIzM4Y+zUxADSRw01G6Y=">AAACXHicbZHPS8MwHMXTOnU/nE4FL16CQ5gIpRXBXQYDLx4nuK2yjZJm6RaWpiVJhdHtn/S2i/+KZlsnc/MLgcd7n5DkxY8Zlcq2F4Z5kDs8Os4XiqWT8ulZ5fyiI6NEYNLGEYuE6yNJGOWkrahixI0FQaHPSNefPC/z7gcRkkb8TU1jMgjRiNOAYqS05VUki0b9QCCcwlbtveHMXM+xLMv1+N08XTr2tgMb8Jdf4RtIR/ew2JdJ6NFtxPXobIOttUa9StW27NXAfeFkogqyaXmVz/4wwklIuMIMSdlz7FgNUiQUxYzMi/1EkhjhCRqRnpYchUQO0lU5c3irnSEMIqEXV3Dlbu9IUSjlNPQ1GSI1lrvZ0vwv6yUqqA9SyuNEEY7XBwUJgyqCy6bhkAqCFZtqgbCg+q4Qj5HuRen/KOoSnN0n74vOg+XYlvP6WG3Wszry4BrcgBpwwBNoghfQAm2AwQJ8G3mjYHyZObNklteoaWR7LsGfMa9+AMWMsBk=</latexit>

= logP (Y = 1)

P (Y = 0)+

X

i

xilog✓i1✓i0

+ (1� xi)1� ✓i11� ✓i0

<latexit sha1_base64="daTz7ayQg2WPc7ful+7fQQEAHOw=">AAACZ3icbZFLSwMxEMez63t9VSsieIkWoUVaNiLopSB48VjB+qBblmyabYPZB8msWJb1Q3rz7sVvYdou+BwY8s/Mb0jyT5BKocF13yx7bn5hcWl5xVldW9/YrGxt3+okU4x3WSITdR9QzaWIeRcESH6fKk6jQPK74PFy0r974kqLJL6Bccr7ER3GIhSMgin5lRfcxjIZeqGiLO/UH9qkUUxXt1HgY+x4Oot8gZ9NGgzPOA9GHKifC1IU+GvnFpOROmkaulGipIl/0qT5nfcrNbflTgP/FaQUNVRGx6+8eoOEZRGPgUmqdY+4KfRzqkAwyQvHyzRPKXukQ94zMqYR1/186lOBj0xlgMNEmYwBT6vfJ3IaaT2OAkNGFEb6d29S/K/XyyA87+ciTjPgMZsdFGYSQ4InpuOBUJyBHBtBmRLmrpiNqPEHzNc4xgTy+8l/xe1Ji7gtcn1auzgv7VhG++gQ1RFBZ+gCXaEO6iKG3i3Hqlo71oe9ae/aezPUtsqZKvoR9sEnxYK1yg==</latexit><latexit sha1_base64="daTz7ayQg2WPc7ful+7fQQEAHOw=">AAACZ3icbZFLSwMxEMez63t9VSsieIkWoUVaNiLopSB48VjB+qBblmyabYPZB8msWJb1Q3rz7sVvYdou+BwY8s/Mb0jyT5BKocF13yx7bn5hcWl5xVldW9/YrGxt3+okU4x3WSITdR9QzaWIeRcESH6fKk6jQPK74PFy0r974kqLJL6Bccr7ER3GIhSMgin5lRfcxjIZeqGiLO/UH9qkUUxXt1HgY+x4Oot8gZ9NGgzPOA9GHKifC1IU+GvnFpOROmkaulGipIl/0qT5nfcrNbflTgP/FaQUNVRGx6+8eoOEZRGPgUmqdY+4KfRzqkAwyQvHyzRPKXukQ94zMqYR1/186lOBj0xlgMNEmYwBT6vfJ3IaaT2OAkNGFEb6d29S/K/XyyA87+ciTjPgMZsdFGYSQ4InpuOBUJyBHBtBmRLmrpiNqPEHzNc4xgTy+8l/xe1Ji7gtcn1auzgv7VhG++gQ1RFBZ+gCXaEO6iKG3i3Hqlo71oe9ae/aezPUtsqZKvoR9sEnxYK1yg==</latexit><latexit sha1_base64="daTz7ayQg2WPc7ful+7fQQEAHOw=">AAACZ3icbZFLSwMxEMez63t9VSsieIkWoUVaNiLopSB48VjB+qBblmyabYPZB8msWJb1Q3rz7sVvYdou+BwY8s/Mb0jyT5BKocF13yx7bn5hcWl5xVldW9/YrGxt3+okU4x3WSITdR9QzaWIeRcESH6fKk6jQPK74PFy0r974kqLJL6Bccr7ER3GIhSMgin5lRfcxjIZeqGiLO/UH9qkUUxXt1HgY+x4Oot8gZ9NGgzPOA9GHKifC1IU+GvnFpOROmkaulGipIl/0qT5nfcrNbflTgP/FaQUNVRGx6+8eoOEZRGPgUmqdY+4KfRzqkAwyQvHyzRPKXukQ94zMqYR1/186lOBj0xlgMNEmYwBT6vfJ3IaaT2OAkNGFEb6d29S/K/XyyA87+ciTjPgMZsdFGYSQ4InpuOBUJyBHBtBmRLmrpiNqPEHzNc4xgTy+8l/xe1Ji7gtcn1auzgv7VhG++gQ1RFBZ+gCXaEO6iKG3i3Hqlo71oe9ae/aezPUtsqZKvoR9sEnxYK1yg==</latexit><latexit sha1_base64="daTz7ayQg2WPc7ful+7fQQEAHOw=">AAACZ3icbZFLSwMxEMez63t9VSsieIkWoUVaNiLopSB48VjB+qBblmyabYPZB8msWJb1Q3rz7sVvYdou+BwY8s/Mb0jyT5BKocF13yx7bn5hcWl5xVldW9/YrGxt3+okU4x3WSITdR9QzaWIeRcESH6fKk6jQPK74PFy0r974kqLJL6Bccr7ER3GIhSMgin5lRfcxjIZeqGiLO/UH9qkUUxXt1HgY+x4Oot8gZ9NGgzPOA9GHKifC1IU+GvnFpOROmkaulGipIl/0qT5nfcrNbflTgP/FaQUNVRGx6+8eoOEZRGPgUmqdY+4KfRzqkAwyQvHyzRPKXukQ94zMqYR1/186lOBj0xlgMNEmYwBT6vfJ3IaaT2OAkNGFEb6d29S/K/XyyA87+ciTjPgMZsdFGYSQ4InpuOBUJyBHBtBmRLmrpiNqPEHzNc4xgTy+8l/xe1Ji7gtcn1auzgv7VhG++gQ1RFBZ+gCXaEO6iKG3i3Hqlo71oe9ae/aezPUtsqZKvoR9sEnxYK1yg==</latexit>

✓ik = P̂ (xi = 1|Y = k)<latexit sha1_base64="s1BRrSy+i6eIhIIR7ZaPuGDtl/I=">AAACCnicbVDLSsNAFJ34rPUVdelmtAh1UxIR7KZQcOOygn1IW8JkOm2GTCZh5kYssWs3/oobF4q49Qvc+TdOHwttPXDhcM693HuPnwiuwXG+raXlldW19dxGfnNre2fX3ttv6DhVlNVpLGLV8olmgktWBw6CtRLFSOQL1vTDy7HfvGNK81jewDBh3YgMJO9zSsBInn3UgYAB8TIejnAFdwICWW2Ei/cer7gPt5Xw1LMLTsmZAC8Sd0YKaIaaZ391ejFNIyaBCqJ123US6GZEAaeCjfKdVLOE0JAMWNtQSSKmu9nklRE+MUoP92NlSgKeqL8nMhJpPYx80xkRCPS8Nxb/89op9MvdjMskBSbpdFE/FRhiPM4F97hiFMTQEEIVN7diGhBFKJj08iYEd/7lRdI4K7lOyb0+L1TLszhy6BAdoyJy0QWqoitUQ3VE0SN6Rq/ozXqyXqx362PaumTNZg7QH1ifP52rmYI=</latexit><latexit sha1_base64="s1BRrSy+i6eIhIIR7ZaPuGDtl/I=">AAACCnicbVDLSsNAFJ34rPUVdelmtAh1UxIR7KZQcOOygn1IW8JkOm2GTCZh5kYssWs3/oobF4q49Qvc+TdOHwttPXDhcM693HuPnwiuwXG+raXlldW19dxGfnNre2fX3ttv6DhVlNVpLGLV8olmgktWBw6CtRLFSOQL1vTDy7HfvGNK81jewDBh3YgMJO9zSsBInn3UgYAB8TIejnAFdwICWW2Ei/cer7gPt5Xw1LMLTsmZAC8Sd0YKaIaaZ391ejFNIyaBCqJ123US6GZEAaeCjfKdVLOE0JAMWNtQSSKmu9nklRE+MUoP92NlSgKeqL8nMhJpPYx80xkRCPS8Nxb/89op9MvdjMskBSbpdFE/FRhiPM4F97hiFMTQEEIVN7diGhBFKJj08iYEd/7lRdI4K7lOyb0+L1TLszhy6BAdoyJy0QWqoitUQ3VE0SN6Rq/ozXqyXqx362PaumTNZg7QH1ifP52rmYI=</latexit><latexit sha1_base64="s1BRrSy+i6eIhIIR7ZaPuGDtl/I=">AAACCnicbVDLSsNAFJ34rPUVdelmtAh1UxIR7KZQcOOygn1IW8JkOm2GTCZh5kYssWs3/oobF4q49Qvc+TdOHwttPXDhcM693HuPnwiuwXG+raXlldW19dxGfnNre2fX3ttv6DhVlNVpLGLV8olmgktWBw6CtRLFSOQL1vTDy7HfvGNK81jewDBh3YgMJO9zSsBInn3UgYAB8TIejnAFdwICWW2Ei/cer7gPt5Xw1LMLTsmZAC8Sd0YKaIaaZ391ejFNIyaBCqJ123US6GZEAaeCjfKdVLOE0JAMWNtQSSKmu9nklRE+MUoP92NlSgKeqL8nMhJpPYx80xkRCPS8Nxb/89op9MvdjMskBSbpdFE/FRhiPM4F97hiFMTQEEIVN7diGhBFKJj08iYEd/7lRdI4K7lOyb0+L1TLszhy6BAdoyJy0QWqoitUQ3VE0SN6Rq/ozXqyXqx362PaumTNZg7QH1ifP52rmYI=</latexit><latexit sha1_base64="s1BRrSy+i6eIhIIR7ZaPuGDtl/I=">AAACCnicbVDLSsNAFJ34rPUVdelmtAh1UxIR7KZQcOOygn1IW8JkOm2GTCZh5kYssWs3/oobF4q49Qvc+TdOHwttPXDhcM693HuPnwiuwXG+raXlldW19dxGfnNre2fX3ttv6DhVlNVpLGLV8olmgktWBw6CtRLFSOQL1vTDy7HfvGNK81jewDBh3YgMJO9zSsBInn3UgYAB8TIejnAFdwICWW2Ei/cer7gPt5Xw1LMLTsmZAC8Sd0YKaIaaZ391ejFNIyaBCqJ123US6GZEAaeCjfKdVLOE0JAMWNtQSSKmu9nklRE+MUoP92NlSgKeqL8nMhJpPYx80xkRCPS8Nxb/89op9MvdjMskBSbpdFE/FRhiPM4F97hiFMTQEEIVN7diGhBFKJj08iYEd/7lRdI4K7lOyb0+L1TLszhy6BAdoyJy0QWqoitUQ3VE0SN6Rq/ozXqyXqx362PaumTNZg7QH1ifP52rmYI=</latexit>

1� ✓ik = P̂ (xi = 0|Y = k)<latexit sha1_base64="33YFRAM5JISW7rVcYS5ZeIJcR88=">AAACDXicbVC7SgNBFJ2NrxhfUUubwSjEwrArgmkCARvLCOYh2WWZncwmQ2YfzNwVw5ofsPFXbCwUsbW382+cJFto4oELh3Pu5d57vFhwBab5beSWlldW1/LrhY3Nre2d4u5eS0WJpKxJIxHJjkcUEzxkTeAgWCeWjASeYG1veDnx23dMKh6FNzCKmROQfsh9TgloyS0eWafYhgED4qZ8OMY1bA8IpI0xLt+7vGY+3NaGJ26xZFbMKfAisTJSQhkabvHL7kU0CVgIVBClupYZg5MSCZwKNi7YiWIxoUPSZ11NQxIw5aTTb8b4WCs97EdSVwh4qv6eSEmg1CjwdGdAYKDmvYn4n9dNwK86KQ/jBFhIZ4v8RGCI8CQa3OOSURAjTQiVXN+K6YBIQkEHWNAhWPMvL5LWWcUyK9b1ealezeLIowN0iMrIQheojq5QAzURRY/oGb2iN+PJeDHejY9Za87IZvbRHxifP9vFmh0=</latexit><latexit sha1_base64="33YFRAM5JISW7rVcYS5ZeIJcR88=">AAACDXicbVC7SgNBFJ2NrxhfUUubwSjEwrArgmkCARvLCOYh2WWZncwmQ2YfzNwVw5ofsPFXbCwUsbW382+cJFto4oELh3Pu5d57vFhwBab5beSWlldW1/LrhY3Nre2d4u5eS0WJpKxJIxHJjkcUEzxkTeAgWCeWjASeYG1veDnx23dMKh6FNzCKmROQfsh9TgloyS0eWafYhgED4qZ8OMY1bA8IpI0xLt+7vGY+3NaGJ26xZFbMKfAisTJSQhkabvHL7kU0CVgIVBClupYZg5MSCZwKNi7YiWIxoUPSZ11NQxIw5aTTb8b4WCs97EdSVwh4qv6eSEmg1CjwdGdAYKDmvYn4n9dNwK86KQ/jBFhIZ4v8RGCI8CQa3OOSURAjTQiVXN+K6YBIQkEHWNAhWPMvL5LWWcUyK9b1ealezeLIowN0iMrIQheojq5QAzURRY/oGb2iN+PJeDHejY9Za87IZvbRHxifP9vFmh0=</latexit><latexit sha1_base64="33YFRAM5JISW7rVcYS5ZeIJcR88=">AAACDXicbVC7SgNBFJ2NrxhfUUubwSjEwrArgmkCARvLCOYh2WWZncwmQ2YfzNwVw5ofsPFXbCwUsbW382+cJFto4oELh3Pu5d57vFhwBab5beSWlldW1/LrhY3Nre2d4u5eS0WJpKxJIxHJjkcUEzxkTeAgWCeWjASeYG1veDnx23dMKh6FNzCKmROQfsh9TgloyS0eWafYhgED4qZ8OMY1bA8IpI0xLt+7vGY+3NaGJ26xZFbMKfAisTJSQhkabvHL7kU0CVgIVBClupYZg5MSCZwKNi7YiWIxoUPSZ11NQxIw5aTTb8b4WCs97EdSVwh4qv6eSEmg1CjwdGdAYKDmvYn4n9dNwK86KQ/jBFhIZ4v8RGCI8CQa3OOSURAjTQiVXN+K6YBIQkEHWNAhWPMvL5LWWcUyK9b1ealezeLIowN0iMrIQheojq5QAzURRY/oGb2iN+PJeDHejY9Za87IZvbRHxifP9vFmh0=</latexit><latexit sha1_base64="33YFRAM5JISW7rVcYS5ZeIJcR88=">AAACDXicbVC7SgNBFJ2NrxhfUUubwSjEwrArgmkCARvLCOYh2WWZncwmQ2YfzNwVw5ofsPFXbCwUsbW382+cJFto4oELh3Pu5d57vFhwBab5beSWlldW1/LrhY3Nre2d4u5eS0WJpKxJIxHJjkcUEzxkTeAgWCeWjASeYG1veDnx23dMKh6FNzCKmROQfsh9TgloyS0eWafYhgED4qZ8OMY1bA8IpI0xLt+7vGY+3NaGJ26xZFbMKfAisTJSQhkabvHL7kU0CVgIVBClupYZg5MSCZwKNi7YiWIxoUPSZ11NQxIw5aTTb8b4WCs97EdSVwh4qv6eSEmg1CjwdGdAYKDmvYn4n9dNwK86KQ/jBFhIZ4v8RGCI8CQa3OOSURAjTQiVXN+K6YBIQkEHWNAhWPMvL5LWWcUyK9b1ealezeLIowN0iMrIQheojq5QAzURRY/oGb2iN+PJeDHejY9Za87IZvbRHxifP9vFmh0=</latexit>

Prevents underflow

Naïve Bayes: Point #4

If unlucky, our MLE estimate for P(Xi | Y) might be zero. (for example, Xi = birthdate. Xi = Jan_25_1992)

• Why worry about just one parameter out of many?

• What can be done to address this?

Naïve Bayes: Point #4

If unlucky, our MLE estimate for P(Xi | Y) might be zero. (for example, Xi = birthdate. Xi = Jan_25_1992)

• Why worry about just one parameter out of many?

• What can be done to address this?

P (Y = i)P (X1|Y = i)P (X2|Y = i) . . . P (Xn|Y = i)<latexit sha1_base64="DyJsZu8dRCE7q5OSN/HHZ9lolN0=">AAACFnicbZDLSgMxFIbPeK31VnXpJliEurDMFMFuhIIblxXsRdphyKSZNjSTGZKMUGqfwo2v4saFIm7FnW9jOh1EW38IfPznnCTn92POlLbtL2tpeWV1bT23kd/c2t7ZLeztN1WUSEIbJOKRbPtYUc4EbWimOW3HkuLQ57TlDy+n9dYdlYpF4kaPYuqGuC9YwAjWxvIKp/XS7QU7QfVS23Puf7Ayw24v0io1RGp4haJdtlOhRXAyKEKmulf4NFeQJKRCE46V6jh2rN0xlpoRTif5bqJojMkQ92nHoMAhVe44XWuCjo3TQ0EkzREape7viTEOlRqFvukMsR6o+drU/K/WSXRQdcdMxImmgsweChKOdISmGaEek5RoPjKAiWTmr4gMsMREmyTzJgRnfuVFaFbKjl12rs+KtWoWRw4O4QhK4MA51OAK6tAAAg/wBC/waj1az9ab9T5rXbKymQP4I+vjG5oum+M=</latexit><latexit sha1_base64="DyJsZu8dRCE7q5OSN/HHZ9lolN0=">AAACFnicbZDLSgMxFIbPeK31VnXpJliEurDMFMFuhIIblxXsRdphyKSZNjSTGZKMUGqfwo2v4saFIm7FnW9jOh1EW38IfPznnCTn92POlLbtL2tpeWV1bT23kd/c2t7ZLeztN1WUSEIbJOKRbPtYUc4EbWimOW3HkuLQ57TlDy+n9dYdlYpF4kaPYuqGuC9YwAjWxvIKp/XS7QU7QfVS23Puf7Ayw24v0io1RGp4haJdtlOhRXAyKEKmulf4NFeQJKRCE46V6jh2rN0xlpoRTif5bqJojMkQ92nHoMAhVe44XWuCjo3TQ0EkzREape7viTEOlRqFvukMsR6o+drU/K/WSXRQdcdMxImmgsweChKOdISmGaEek5RoPjKAiWTmr4gMsMREmyTzJgRnfuVFaFbKjl12rs+KtWoWRw4O4QhK4MA51OAK6tAAAg/wBC/waj1az9ab9T5rXbKymQP4I+vjG5oum+M=</latexit><latexit sha1_base64="DyJsZu8dRCE7q5OSN/HHZ9lolN0=">AAACFnicbZDLSgMxFIbPeK31VnXpJliEurDMFMFuhIIblxXsRdphyKSZNjSTGZKMUGqfwo2v4saFIm7FnW9jOh1EW38IfPznnCTn92POlLbtL2tpeWV1bT23kd/c2t7ZLeztN1WUSEIbJOKRbPtYUc4EbWimOW3HkuLQ57TlDy+n9dYdlYpF4kaPYuqGuC9YwAjWxvIKp/XS7QU7QfVS23Puf7Ayw24v0io1RGp4haJdtlOhRXAyKEKmulf4NFeQJKRCE46V6jh2rN0xlpoRTif5bqJojMkQ92nHoMAhVe44XWuCjo3TQ0EkzREape7viTEOlRqFvukMsR6o+drU/K/WSXRQdcdMxImmgsweChKOdISmGaEek5RoPjKAiWTmr4gMsMREmyTzJgRnfuVFaFbKjl12rs+KtWoWRw4O4QhK4MA51OAK6tAAAg/wBC/waj1az9ab9T5rXbKymQP4I+vjG5oum+M=</latexit><latexit sha1_base64="DyJsZu8dRCE7q5OSN/HHZ9lolN0=">AAACFnicbZDLSgMxFIbPeK31VnXpJliEurDMFMFuhIIblxXsRdphyKSZNjSTGZKMUGqfwo2v4saFIm7FnW9jOh1EW38IfPznnCTn92POlLbtL2tpeWV1bT23kd/c2t7ZLeztN1WUSEIbJOKRbPtYUc4EbWimOW3HkuLQ57TlDy+n9dYdlYpF4kaPYuqGuC9YwAjWxvIKp/XS7QU7QfVS23Puf7Ayw24v0io1RGp4haJdtlOhRXAyKEKmulf4NFeQJKRCE46V6jh2rN0xlpoRTif5bqJojMkQ92nHoMAhVe44XWuCjo3TQ0EkzREape7viTEOlRqFvukMsR6o+drU/K/WSXRQdcdMxImmgsweChKOdISmGaEek5RoPjKAiWTmr4gMsMREmyTzJgRnfuVFaFbKjl12rs+KtWoWRw4O4QhK4MA51OAK6tAAAg/wBC/waj1az9ab9T5rXbKymQP4I+vjG5oum+M=</latexit>

Page 14: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

Estimating Parameters• Maximum Likelihood Estimate (MLE): choose θ that

maximizes probability of observed data

• Maximum a Posteriori (MAP) estimate: choose θ that is most probable given prior probability and the data

Maximum likelihood estimates:

Estimating Parameters: Y, Xi discrete-valued

MAP estimates (Beta, Dirichlet priors): Only difference:

“imaginary” examples

How many counts should we add?

Learning to classify text documents• Classify which emails are spam? • Classify which emails promise an attachment? • Classify which web pages are student home pages?

How shall we represent text documents for Naïve Bayes?

Baseline: Bag of Words Approach

aardvark 0

about 2

all 2

Africa 1

apple 0

anxious 0

...

gas 1

...

oil 1

Zaire 0

Page 15: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

How can we express X?

• Y discrete valued. e.g., Spam or not • X = (X1, X2, … Xn) with n the number of words in English.

• What are the problems with this representation?

How can we express X?

• Y discrete valued. e.g., Spam or not • X = (X1, X2, … Xn) with n the number of words in English.

• What are the problems with this representation? • Some words always present • Some words very infrequent • Doesn’t count how often a word appears • Conditional independence assumption is false…

Learning to classify document: P(Y|X)the “Bag of Words” model

• Y discrete valued. e.g., Spam or not • X = (X1, X2, … Xn) = document

• Xi is a random variable describing the word at position i in the document

• possible values for Xi : any word wk in English

• Xi represents the ith word position in document

• X1 = “I”, X2 = “am”, X3 = “pleased”

• and, let’s assume the Xi are iid (indep, identically distributed)

Learning to classify document: P(Y|X)the “Bag of Words” model

• Y discrete valued. e.g., Spam or not • X = (X1, X2, … Xn) = document

• Xi is a random variable describing the word at position i in the document

• possible values for Xi : any word wk in English

Multinomial distribution

All Xi have the same distribution

P (D|✓) / ✓↵11 ✓↵2

2 . . . ✓↵nn

<latexit sha1_base64="Uk8D2QK9q1pbLGje3ccUxeuS+gU=">AAACQXicbZBLSwMxFIUzvq2vqks3wSLopswUQZeCLlxWsFro1OFOmtpgJhOSO0IZ56+58R+4c+/GhSJu3Zg+xOeFwMl37s3jxFoKi77/4E1MTk3PzM7NlxYWl5ZXyqtrZzbNDOMNlsrUNGOwXArFGyhQ8qY2HJJY8vP46nDgn19zY0WqTrGveTuBSyW6ggE6FJWb9e2jmxB7HGGHhtqkGlM62kd5UFzkIUjdgygoPmHtC9aKsJOi/XTUl6OKqFzxq/6w6F8RjEWFjKsele/dYSxLuEImwdpW4Gts52BQMMmLUphZroFdwSVvOakg4badDxMo6JYjHdpNjVsK6ZB+n8ghsbafxK4zAezZ394A/ue1Muzut3OhdIZcsdFF3UxSl9IgTtoRhjOUfSeAGeHeSlkPDDB0oZdcCMHvL/8VZ7Vq4FeDk93Kwf44jjmyQTbJNgnIHjkgx6ROGoSRW/JInsmLd+c9ea/e26h1whvPrJMf5b1/AC8Csnw=</latexit><latexit sha1_base64="Uk8D2QK9q1pbLGje3ccUxeuS+gU=">AAACQXicbZBLSwMxFIUzvq2vqks3wSLopswUQZeCLlxWsFro1OFOmtpgJhOSO0IZ56+58R+4c+/GhSJu3Zg+xOeFwMl37s3jxFoKi77/4E1MTk3PzM7NlxYWl5ZXyqtrZzbNDOMNlsrUNGOwXArFGyhQ8qY2HJJY8vP46nDgn19zY0WqTrGveTuBSyW6ggE6FJWb9e2jmxB7HGGHhtqkGlM62kd5UFzkIUjdgygoPmHtC9aKsJOi/XTUl6OKqFzxq/6w6F8RjEWFjKsele/dYSxLuEImwdpW4Gts52BQMMmLUphZroFdwSVvOakg4badDxMo6JYjHdpNjVsK6ZB+n8ghsbafxK4zAezZ394A/ue1Muzut3OhdIZcsdFF3UxSl9IgTtoRhjOUfSeAGeHeSlkPDDB0oZdcCMHvL/8VZ7Vq4FeDk93Kwf44jjmyQTbJNgnIHjkgx6ROGoSRW/JInsmLd+c9ea/e26h1whvPrJMf5b1/AC8Csnw=</latexit><latexit sha1_base64="Uk8D2QK9q1pbLGje3ccUxeuS+gU=">AAACQXicbZBLSwMxFIUzvq2vqks3wSLopswUQZeCLlxWsFro1OFOmtpgJhOSO0IZ56+58R+4c+/GhSJu3Zg+xOeFwMl37s3jxFoKi77/4E1MTk3PzM7NlxYWl5ZXyqtrZzbNDOMNlsrUNGOwXArFGyhQ8qY2HJJY8vP46nDgn19zY0WqTrGveTuBSyW6ggE6FJWb9e2jmxB7HGGHhtqkGlM62kd5UFzkIUjdgygoPmHtC9aKsJOi/XTUl6OKqFzxq/6w6F8RjEWFjKsele/dYSxLuEImwdpW4Gts52BQMMmLUphZroFdwSVvOakg4badDxMo6JYjHdpNjVsK6ZB+n8ghsbafxK4zAezZ394A/ue1Muzut3OhdIZcsdFF3UxSl9IgTtoRhjOUfSeAGeHeSlkPDDB0oZdcCMHvL/8VZ7Vq4FeDk93Kwf44jjmyQTbJNgnIHjkgx6ROGoSRW/JInsmLd+c9ea/e26h1whvPrJMf5b1/AC8Csnw=</latexit><latexit sha1_base64="Uk8D2QK9q1pbLGje3ccUxeuS+gU=">AAACQXicbZBLSwMxFIUzvq2vqks3wSLopswUQZeCLlxWsFro1OFOmtpgJhOSO0IZ56+58R+4c+/GhSJu3Zg+xOeFwMl37s3jxFoKi77/4E1MTk3PzM7NlxYWl5ZXyqtrZzbNDOMNlsrUNGOwXArFGyhQ8qY2HJJY8vP46nDgn19zY0WqTrGveTuBSyW6ggE6FJWb9e2jmxB7HGGHhtqkGlM62kd5UFzkIUjdgygoPmHtC9aKsJOi/XTUl6OKqFzxq/6w6F8RjEWFjKsele/dYSxLuEImwdpW4Gts52BQMMmLUphZroFdwSVvOakg4badDxMo6JYjHdpNjVsK6ZB+n8ghsbafxK4zAezZ394A/ue1Muzut3OhdIZcsdFF3UxSl9IgTtoRhjOUfSeAGeHeSlkPDDB0oZdcCMHvL/8VZ7Vq4FeDk93Kwf44jjmyQTbJNgnIHjkgx6ROGoSRW/JInsmLd+c9ea/e26h1whvPrJMf5b1/AC8Csnw=</latexit>

Page 16: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

Learning to classify document: P(Y|X)the “Bag of Words” model

• Y discrete valued. e.g., Spam or not • X = (X1, X2, … Xn) = document

• Xi is a random variable describing the word at position i in the document

• possible values for Xi : any word wk in English

• Document = bag of words: the vector of counts for all wk’s – like #heads, #tails, but we have many more than 2 values – assume word probabilities are position independent

(i.i.d. rolls of a 50,000-sided die)

Naïve Bayes Algorithm – discrete Xi • Train Naïve Bayes (examples) for each value yk

estimate for each value xj of each attribute Xi

estimate

• Classify (Xnew)

prob that word xj appears in position i, given Y=yk

* Additional assumption: word probabilities are position independent

MAP estimates for bag of words

Map estimate for multinomial

✓jk =↵jk + �jk � 1P

m ↵mk +P

m �mk � 1<latexit sha1_base64="uX3PJcn417A8oAFe0wB+DSPe5bM=">AAACSHicbZDNS8MwGMbT+TXn19Sjl+AQBHG0IriLMPDicYL7gHWUNEu3uKQtyVthlP55Xjx682/w4kERb2ZbGbr5QuDheX4vSR4/FlyDbb9ahZXVtfWN4mZpa3tnd6+8f9DSUaIoa9JIRKrjE80ED1kTOAjWiRUj0hes7Y9uJnn7kSnNo/AexjHrSTIIecApAWN5Zc+FIQPipQ+jDF9jN1CEpi4R8TD3zrDrz4FzJ8OpqxPpyTkkR9mUyt0ZLHM488oVu2pPBy8LJxcVlE/DK7+4/YgmkoVABdG669gx9FKigFPBspKbaBYTOiID1jUyJJLpXjotIsMnxunjIFLmhICn7u+NlEitx9I3pCQw1IvZxPwv6yYQ1HopD+MEWEhnFwWJwBDhSau4zxWjIMZGEKq4eSumQ2K6BNN9yZTgLH55WbQuqo5dde4uK/VaXkcRHaFjdIocdIXq6BY1UBNR9ITe0Af6tJ6td+vL+p6hBSvfOUR/plD4AZUvssE=</latexit><latexit sha1_base64="uX3PJcn417A8oAFe0wB+DSPe5bM=">AAACSHicbZDNS8MwGMbT+TXn19Sjl+AQBHG0IriLMPDicYL7gHWUNEu3uKQtyVthlP55Xjx682/w4kERb2ZbGbr5QuDheX4vSR4/FlyDbb9ahZXVtfWN4mZpa3tnd6+8f9DSUaIoa9JIRKrjE80ED1kTOAjWiRUj0hes7Y9uJnn7kSnNo/AexjHrSTIIecApAWN5Zc+FIQPipQ+jDF9jN1CEpi4R8TD3zrDrz4FzJ8OpqxPpyTkkR9mUyt0ZLHM488oVu2pPBy8LJxcVlE/DK7+4/YgmkoVABdG669gx9FKigFPBspKbaBYTOiID1jUyJJLpXjotIsMnxunjIFLmhICn7u+NlEitx9I3pCQw1IvZxPwv6yYQ1HopD+MEWEhnFwWJwBDhSau4zxWjIMZGEKq4eSumQ2K6BNN9yZTgLH55WbQuqo5dde4uK/VaXkcRHaFjdIocdIXq6BY1UBNR9ITe0Af6tJ6td+vL+p6hBSvfOUR/plD4AZUvssE=</latexit><latexit sha1_base64="uX3PJcn417A8oAFe0wB+DSPe5bM=">AAACSHicbZDNS8MwGMbT+TXn19Sjl+AQBHG0IriLMPDicYL7gHWUNEu3uKQtyVthlP55Xjx682/w4kERb2ZbGbr5QuDheX4vSR4/FlyDbb9ahZXVtfWN4mZpa3tnd6+8f9DSUaIoa9JIRKrjE80ED1kTOAjWiRUj0hes7Y9uJnn7kSnNo/AexjHrSTIIecApAWN5Zc+FIQPipQ+jDF9jN1CEpi4R8TD3zrDrz4FzJ8OpqxPpyTkkR9mUyt0ZLHM488oVu2pPBy8LJxcVlE/DK7+4/YgmkoVABdG669gx9FKigFPBspKbaBYTOiID1jUyJJLpXjotIsMnxunjIFLmhICn7u+NlEitx9I3pCQw1IvZxPwv6yYQ1HopD+MEWEhnFwWJwBDhSau4zxWjIMZGEKq4eSumQ2K6BNN9yZTgLH55WbQuqo5dde4uK/VaXkcRHaFjdIocdIXq6BY1UBNR9ITe0Af6tJ6td+vL+p6hBSvfOUR/plD4AZUvssE=</latexit><latexit sha1_base64="uX3PJcn417A8oAFe0wB+DSPe5bM=">AAACSHicbZDNS8MwGMbT+TXn19Sjl+AQBHG0IriLMPDicYL7gHWUNEu3uKQtyVthlP55Xjx682/w4kERb2ZbGbr5QuDheX4vSR4/FlyDbb9ahZXVtfWN4mZpa3tnd6+8f9DSUaIoa9JIRKrjE80ED1kTOAjWiRUj0hes7Y9uJnn7kSnNo/AexjHrSTIIecApAWN5Zc+FIQPipQ+jDF9jN1CEpi4R8TD3zrDrz4FzJ8OpqxPpyTkkR9mUyt0ZLHM488oVu2pPBy8LJxcVlE/DK7+4/YgmkoVABdG669gx9FKigFPBspKbaBYTOiID1jUyJJLpXjotIsMnxunjIFLmhICn7u+NlEitx9I3pCQw1IvZxPwv6yYQ1HopD+MEWEhnFwWJwBDhSau4zxWjIMZGEKq4eSumQ2K6BNN9yZTgLH55WbQuqo5dde4uK/VaXkcRHaFjdIocdIXq6BY1UBNR9ITe0Af6tJ6td+vL+p6hBSvfOUR/plD4AZUvssE=</latexit>

MAP estimates for bag of words

Map estimate for multinomial

What β’s should we choose?

✓jk =↵jk + �jk � 1P

m ↵mk +P

m �mk � 1<latexit sha1_base64="uX3PJcn417A8oAFe0wB+DSPe5bM=">AAACSHicbZDNS8MwGMbT+TXn19Sjl+AQBHG0IriLMPDicYL7gHWUNEu3uKQtyVthlP55Xjx682/w4kERb2ZbGbr5QuDheX4vSR4/FlyDbb9ahZXVtfWN4mZpa3tnd6+8f9DSUaIoa9JIRKrjE80ED1kTOAjWiRUj0hes7Y9uJnn7kSnNo/AexjHrSTIIecApAWN5Zc+FIQPipQ+jDF9jN1CEpi4R8TD3zrDrz4FzJ8OpqxPpyTkkR9mUyt0ZLHM488oVu2pPBy8LJxcVlE/DK7+4/YgmkoVABdG669gx9FKigFPBspKbaBYTOiID1jUyJJLpXjotIsMnxunjIFLmhICn7u+NlEitx9I3pCQw1IvZxPwv6yYQ1HopD+MEWEhnFwWJwBDhSau4zxWjIMZGEKq4eSumQ2K6BNN9yZTgLH55WbQuqo5dde4uK/VaXkcRHaFjdIocdIXq6BY1UBNR9ITe0Af6tJ6td+vL+p6hBSvfOUR/plD4AZUvssE=</latexit><latexit sha1_base64="uX3PJcn417A8oAFe0wB+DSPe5bM=">AAACSHicbZDNS8MwGMbT+TXn19Sjl+AQBHG0IriLMPDicYL7gHWUNEu3uKQtyVthlP55Xjx682/w4kERb2ZbGbr5QuDheX4vSR4/FlyDbb9ahZXVtfWN4mZpa3tnd6+8f9DSUaIoa9JIRKrjE80ED1kTOAjWiRUj0hes7Y9uJnn7kSnNo/AexjHrSTIIecApAWN5Zc+FIQPipQ+jDF9jN1CEpi4R8TD3zrDrz4FzJ8OpqxPpyTkkR9mUyt0ZLHM488oVu2pPBy8LJxcVlE/DK7+4/YgmkoVABdG669gx9FKigFPBspKbaBYTOiID1jUyJJLpXjotIsMnxunjIFLmhICn7u+NlEitx9I3pCQw1IvZxPwv6yYQ1HopD+MEWEhnFwWJwBDhSau4zxWjIMZGEKq4eSumQ2K6BNN9yZTgLH55WbQuqo5dde4uK/VaXkcRHaFjdIocdIXq6BY1UBNR9ITe0Af6tJ6td+vL+p6hBSvfOUR/plD4AZUvssE=</latexit><latexit sha1_base64="uX3PJcn417A8oAFe0wB+DSPe5bM=">AAACSHicbZDNS8MwGMbT+TXn19Sjl+AQBHG0IriLMPDicYL7gHWUNEu3uKQtyVthlP55Xjx682/w4kERb2ZbGbr5QuDheX4vSR4/FlyDbb9ahZXVtfWN4mZpa3tnd6+8f9DSUaIoa9JIRKrjE80ED1kTOAjWiRUj0hes7Y9uJnn7kSnNo/AexjHrSTIIecApAWN5Zc+FIQPipQ+jDF9jN1CEpi4R8TD3zrDrz4FzJ8OpqxPpyTkkR9mUyt0ZLHM488oVu2pPBy8LJxcVlE/DK7+4/YgmkoVABdG669gx9FKigFPBspKbaBYTOiID1jUyJJLpXjotIsMnxunjIFLmhICn7u+NlEitx9I3pCQw1IvZxPwv6yYQ1HopD+MEWEhnFwWJwBDhSau4zxWjIMZGEKq4eSumQ2K6BNN9yZTgLH55WbQuqo5dde4uK/VaXkcRHaFjdIocdIXq6BY1UBNR9ITe0Af6tJ6td+vL+p6hBSvfOUR/plD4AZUvssE=</latexit><latexit sha1_base64="uX3PJcn417A8oAFe0wB+DSPe5bM=">AAACSHicbZDNS8MwGMbT+TXn19Sjl+AQBHG0IriLMPDicYL7gHWUNEu3uKQtyVthlP55Xjx682/w4kERb2ZbGbr5QuDheX4vSR4/FlyDbb9ahZXVtfWN4mZpa3tnd6+8f9DSUaIoa9JIRKrjE80ED1kTOAjWiRUj0hes7Y9uJnn7kSnNo/AexjHrSTIIecApAWN5Zc+FIQPipQ+jDF9jN1CEpi4R8TD3zrDrz4FzJ8OpqxPpyTkkR9mUyt0ZLHM488oVu2pPBy8LJxcVlE/DK7+4/YgmkoVABdG669gx9FKigFPBspKbaBYTOiID1jUyJJLpXjotIsMnxunjIFLmhICn7u+NlEitx9I3pCQw1IvZxPwv6yYQ1HopD+MEWEhnFwWJwBDhSau4zxWjIMZGEKq4eSumQ2K6BNN9yZTgLH55WbQuqo5dde4uK/VaXkcRHaFjdIocdIXq6BY1UBNR9ITe0Af6tJ6td+vL+p6hBSvfOUR/plD4AZUvssE=</latexit>

# seen “aardvark” # hallucinated “aardvark”

# seen words # hallucinated words

Page 17: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

MAP estimates for bag of words

Map estimate for multinomial

What β’s should we choose?

✓jk =↵jk + �jk � 1P

m ↵mk +P

m �mk � 1<latexit sha1_base64="uX3PJcn417A8oAFe0wB+DSPe5bM=">AAACSHicbZDNS8MwGMbT+TXn19Sjl+AQBHG0IriLMPDicYL7gHWUNEu3uKQtyVthlP55Xjx682/w4kERb2ZbGbr5QuDheX4vSR4/FlyDbb9ahZXVtfWN4mZpa3tnd6+8f9DSUaIoa9JIRKrjE80ED1kTOAjWiRUj0hes7Y9uJnn7kSnNo/AexjHrSTIIecApAWN5Zc+FIQPipQ+jDF9jN1CEpi4R8TD3zrDrz4FzJ8OpqxPpyTkkR9mUyt0ZLHM488oVu2pPBy8LJxcVlE/DK7+4/YgmkoVABdG669gx9FKigFPBspKbaBYTOiID1jUyJJLpXjotIsMnxunjIFLmhICn7u+NlEitx9I3pCQw1IvZxPwv6yYQ1HopD+MEWEhnFwWJwBDhSau4zxWjIMZGEKq4eSumQ2K6BNN9yZTgLH55WbQuqo5dde4uK/VaXkcRHaFjdIocdIXq6BY1UBNR9ITe0Af6tJ6td+vL+p6hBSvfOUR/plD4AZUvssE=</latexit><latexit sha1_base64="uX3PJcn417A8oAFe0wB+DSPe5bM=">AAACSHicbZDNS8MwGMbT+TXn19Sjl+AQBHG0IriLMPDicYL7gHWUNEu3uKQtyVthlP55Xjx682/w4kERb2ZbGbr5QuDheX4vSR4/FlyDbb9ahZXVtfWN4mZpa3tnd6+8f9DSUaIoa9JIRKrjE80ED1kTOAjWiRUj0hes7Y9uJnn7kSnNo/AexjHrSTIIecApAWN5Zc+FIQPipQ+jDF9jN1CEpi4R8TD3zrDrz4FzJ8OpqxPpyTkkR9mUyt0ZLHM488oVu2pPBy8LJxcVlE/DK7+4/YgmkoVABdG669gx9FKigFPBspKbaBYTOiID1jUyJJLpXjotIsMnxunjIFLmhICn7u+NlEitx9I3pCQw1IvZxPwv6yYQ1HopD+MEWEhnFwWJwBDhSau4zxWjIMZGEKq4eSumQ2K6BNN9yZTgLH55WbQuqo5dde4uK/VaXkcRHaFjdIocdIXq6BY1UBNR9ITe0Af6tJ6td+vL+p6hBSvfOUR/plD4AZUvssE=</latexit><latexit sha1_base64="uX3PJcn417A8oAFe0wB+DSPe5bM=">AAACSHicbZDNS8MwGMbT+TXn19Sjl+AQBHG0IriLMPDicYL7gHWUNEu3uKQtyVthlP55Xjx682/w4kERb2ZbGbr5QuDheX4vSR4/FlyDbb9ahZXVtfWN4mZpa3tnd6+8f9DSUaIoa9JIRKrjE80ED1kTOAjWiRUj0hes7Y9uJnn7kSnNo/AexjHrSTIIecApAWN5Zc+FIQPipQ+jDF9jN1CEpi4R8TD3zrDrz4FzJ8OpqxPpyTkkR9mUyt0ZLHM488oVu2pPBy8LJxcVlE/DK7+4/YgmkoVABdG669gx9FKigFPBspKbaBYTOiID1jUyJJLpXjotIsMnxunjIFLmhICn7u+NlEitx9I3pCQw1IvZxPwv6yYQ1HopD+MEWEhnFwWJwBDhSau4zxWjIMZGEKq4eSumQ2K6BNN9yZTgLH55WbQuqo5dde4uK/VaXkcRHaFjdIocdIXq6BY1UBNR9ITe0Af6tJ6td+vL+p6hBSvfOUR/plD4AZUvssE=</latexit><latexit sha1_base64="uX3PJcn417A8oAFe0wB+DSPe5bM=">AAACSHicbZDNS8MwGMbT+TXn19Sjl+AQBHG0IriLMPDicYL7gHWUNEu3uKQtyVthlP55Xjx682/w4kERb2ZbGbr5QuDheX4vSR4/FlyDbb9ahZXVtfWN4mZpa3tnd6+8f9DSUaIoa9JIRKrjE80ED1kTOAjWiRUj0hes7Y9uJnn7kSnNo/AexjHrSTIIecApAWN5Zc+FIQPipQ+jDF9jN1CEpi4R8TD3zrDrz4FzJ8OpqxPpyTkkR9mUyt0ZLHM488oVu2pPBy8LJxcVlE/DK7+4/YgmkoVABdG669gx9FKigFPBspKbaBYTOiID1jUyJJLpXjotIsMnxunjIFLmhICn7u+NlEitx9I3pCQw1IvZxPwv6yYQ1HopD+MEWEhnFwWJwBDhSau4zxWjIMZGEKq4eSumQ2K6BNN9yZTgLH55WbQuqo5dde4uK/VaXkcRHaFjdIocdIXq6BY1UBNR9ITe0Af6tJ6td+vL+p6hBSvfOUR/plD4AZUvssE=</latexit>

Probabilities over all classes?

Constant per word?

# seen “aardvark” # hallucinated “aardvark”

# seen words # hallucinated words

MAP estimates for bag of words

Map estimate for multinomial

What β’s should we choose?

✓jk =↵jk + �jk � 1P

m ↵mk +P

m �mk � 1<latexit sha1_base64="uX3PJcn417A8oAFe0wB+DSPe5bM=">AAACSHicbZDNS8MwGMbT+TXn19Sjl+AQBHG0IriLMPDicYL7gHWUNEu3uKQtyVthlP55Xjx682/w4kERb2ZbGbr5QuDheX4vSR4/FlyDbb9ahZXVtfWN4mZpa3tnd6+8f9DSUaIoa9JIRKrjE80ED1kTOAjWiRUj0hes7Y9uJnn7kSnNo/AexjHrSTIIecApAWN5Zc+FIQPipQ+jDF9jN1CEpi4R8TD3zrDrz4FzJ8OpqxPpyTkkR9mUyt0ZLHM488oVu2pPBy8LJxcVlE/DK7+4/YgmkoVABdG669gx9FKigFPBspKbaBYTOiID1jUyJJLpXjotIsMnxunjIFLmhICn7u+NlEitx9I3pCQw1IvZxPwv6yYQ1HopD+MEWEhnFwWJwBDhSau4zxWjIMZGEKq4eSumQ2K6BNN9yZTgLH55WbQuqo5dde4uK/VaXkcRHaFjdIocdIXq6BY1UBNR9ITe0Af6tJ6td+vL+p6hBSvfOUR/plD4AZUvssE=</latexit><latexit sha1_base64="uX3PJcn417A8oAFe0wB+DSPe5bM=">AAACSHicbZDNS8MwGMbT+TXn19Sjl+AQBHG0IriLMPDicYL7gHWUNEu3uKQtyVthlP55Xjx682/w4kERb2ZbGbr5QuDheX4vSR4/FlyDbb9ahZXVtfWN4mZpa3tnd6+8f9DSUaIoa9JIRKrjE80ED1kTOAjWiRUj0hes7Y9uJnn7kSnNo/AexjHrSTIIecApAWN5Zc+FIQPipQ+jDF9jN1CEpi4R8TD3zrDrz4FzJ8OpqxPpyTkkR9mUyt0ZLHM488oVu2pPBy8LJxcVlE/DK7+4/YgmkoVABdG669gx9FKigFPBspKbaBYTOiID1jUyJJLpXjotIsMnxunjIFLmhICn7u+NlEitx9I3pCQw1IvZxPwv6yYQ1HopD+MEWEhnFwWJwBDhSau4zxWjIMZGEKq4eSumQ2K6BNN9yZTgLH55WbQuqo5dde4uK/VaXkcRHaFjdIocdIXq6BY1UBNR9ITe0Af6tJ6td+vL+p6hBSvfOUR/plD4AZUvssE=</latexit><latexit sha1_base64="uX3PJcn417A8oAFe0wB+DSPe5bM=">AAACSHicbZDNS8MwGMbT+TXn19Sjl+AQBHG0IriLMPDicYL7gHWUNEu3uKQtyVthlP55Xjx682/w4kERb2ZbGbr5QuDheX4vSR4/FlyDbb9ahZXVtfWN4mZpa3tnd6+8f9DSUaIoa9JIRKrjE80ED1kTOAjWiRUj0hes7Y9uJnn7kSnNo/AexjHrSTIIecApAWN5Zc+FIQPipQ+jDF9jN1CEpi4R8TD3zrDrz4FzJ8OpqxPpyTkkR9mUyt0ZLHM488oVu2pPBy8LJxcVlE/DK7+4/YgmkoVABdG669gx9FKigFPBspKbaBYTOiID1jUyJJLpXjotIsMnxunjIFLmhICn7u+NlEitx9I3pCQw1IvZxPwv6yYQ1HopD+MEWEhnFwWJwBDhSau4zxWjIMZGEKq4eSumQ2K6BNN9yZTgLH55WbQuqo5dde4uK/VaXkcRHaFjdIocdIXq6BY1UBNR9ITe0Af6tJ6td+vL+p6hBSvfOUR/plD4AZUvssE=</latexit><latexit sha1_base64="uX3PJcn417A8oAFe0wB+DSPe5bM=">AAACSHicbZDNS8MwGMbT+TXn19Sjl+AQBHG0IriLMPDicYL7gHWUNEu3uKQtyVthlP55Xjx682/w4kERb2ZbGbr5QuDheX4vSR4/FlyDbb9ahZXVtfWN4mZpa3tnd6+8f9DSUaIoa9JIRKrjE80ED1kTOAjWiRUj0hes7Y9uJnn7kSnNo/AexjHrSTIIecApAWN5Zc+FIQPipQ+jDF9jN1CEpi4R8TD3zrDrz4FzJ8OpqxPpyTkkR9mUyt0ZLHM488oVu2pPBy8LJxcVlE/DK7+4/YgmkoVABdG669gx9FKigFPBspKbaBYTOiID1jUyJJLpXjotIsMnxunjIFLmhICn7u+NlEitx9I3pCQw1IvZxPwv6yYQ1HopD+MEWEhnFwWJwBDhSau4zxWjIMZGEKq4eSumQ2K6BNN9yZTgLH55WbQuqo5dde4uK/VaXkcRHaFjdIocdIXq6BY1UBNR9ITe0Af6tJ6td+vL+p6hBSvfOUR/plD4AZUvssE=</latexit>

Probabilities over all classes?

Constant per word?

Prior is

Dirichlet

Posterior is also

Dirichlet

(Dirichlet is the conjugate prior for a multinomial

likelihood function)

P (✓k) =✓�1k

1k , ✓�2k

2k , . . . , ✓�mk

mk

Beta(�1k,�2k, . . . ,�mk)<latexit sha1_base64="03N+DkYPgNUd/J3uXdmRdwzpDpw=">AAACgHicbZHZSgMxFIYz41brVvXSm2ARWqh1pggWQSh642UFu0BbSybN2DCZheSMUIZ5Dt/LOx9GMN1r64HAn/87J8s5TiS4Asv6Nsyt7Z3dvcx+9uDw6Pgkd3rWVGEsKWvQUISy7RDFBA9YAzgI1o4kI74jWMvxnsa89cGk4mHwCqOI9XzyHnCXUwLa6uc+64UuDBmQfuKlRfyAu64kNJl7tpe+JV1nrtPSHFRWQGUCBiGoBfZXsNZpipNHvSksjyoti+e1i/QiTvu5vFW2JoE3hT0TeTSLej/3pU+hsc8CoIIo1bGtCHoJkcCpYGm2GysWEeqRd9bRMiA+U71k0sAUX2lngN1Q6hUAnrirFQnxlRr5js70CQzVOhub/7FODG61l/AgioEFdHqRGwsMIR5PAw+4ZBTESAtCJddvxXRI9ARAzyyrm2Cvf3lTNCtl2yrbL7f5WnXWjgy6QJeogGx0h2roGdVRA1H0Y+SNknFtmmbBvDHtaappzGrO0Z8w738BTN3EdA==</latexit><latexit sha1_base64="03N+DkYPgNUd/J3uXdmRdwzpDpw=">AAACgHicbZHZSgMxFIYz41brVvXSm2ARWqh1pggWQSh642UFu0BbSybN2DCZheSMUIZ5Dt/LOx9GMN1r64HAn/87J8s5TiS4Asv6Nsyt7Z3dvcx+9uDw6Pgkd3rWVGEsKWvQUISy7RDFBA9YAzgI1o4kI74jWMvxnsa89cGk4mHwCqOI9XzyHnCXUwLa6uc+64UuDBmQfuKlRfyAu64kNJl7tpe+JV1nrtPSHFRWQGUCBiGoBfZXsNZpipNHvSksjyoti+e1i/QiTvu5vFW2JoE3hT0TeTSLej/3pU+hsc8CoIIo1bGtCHoJkcCpYGm2GysWEeqRd9bRMiA+U71k0sAUX2lngN1Q6hUAnrirFQnxlRr5js70CQzVOhub/7FODG61l/AgioEFdHqRGwsMIR5PAw+4ZBTESAtCJddvxXRI9ARAzyyrm2Cvf3lTNCtl2yrbL7f5WnXWjgy6QJeogGx0h2roGdVRA1H0Y+SNknFtmmbBvDHtaappzGrO0Z8w738BTN3EdA==</latexit><latexit sha1_base64="03N+DkYPgNUd/J3uXdmRdwzpDpw=">AAACgHicbZHZSgMxFIYz41brVvXSm2ARWqh1pggWQSh642UFu0BbSybN2DCZheSMUIZ5Dt/LOx9GMN1r64HAn/87J8s5TiS4Asv6Nsyt7Z3dvcx+9uDw6Pgkd3rWVGEsKWvQUISy7RDFBA9YAzgI1o4kI74jWMvxnsa89cGk4mHwCqOI9XzyHnCXUwLa6uc+64UuDBmQfuKlRfyAu64kNJl7tpe+JV1nrtPSHFRWQGUCBiGoBfZXsNZpipNHvSksjyoti+e1i/QiTvu5vFW2JoE3hT0TeTSLej/3pU+hsc8CoIIo1bGtCHoJkcCpYGm2GysWEeqRd9bRMiA+U71k0sAUX2lngN1Q6hUAnrirFQnxlRr5js70CQzVOhub/7FODG61l/AgioEFdHqRGwsMIR5PAw+4ZBTESAtCJddvxXRI9ARAzyyrm2Cvf3lTNCtl2yrbL7f5WnXWjgy6QJeogGx0h2roGdVRA1H0Y+SNknFtmmbBvDHtaappzGrO0Z8w738BTN3EdA==</latexit><latexit sha1_base64="hP+6LrUf2d3tZaldqaQQvEKMXyw=">AAAB2XicbZDNSgMxFIXv1L86Vq1rN8EiuCozbnQpuHFZwbZCO5RM5k4bmskMyR2hDH0BF25EfC93vo3pz0JbDwQ+zknIvSculLQUBN9ebWd3b/+gfugfNfzjk9Nmo2fz0gjsilzl5jnmFpXU2CVJCp8LgzyLFfbj6f0i77+gsTLXTzQrMMr4WMtUCk7O6oyaraAdLMW2IVxDC9YaNb+GSS7KDDUJxa0dhEFBUcUNSaFw7g9LiwUXUz7GgUPNM7RRtRxzzi6dk7A0N+5oYkv394uKZ9bOstjdzDhN7Ga2MP/LBiWlt1EldVESarH6KC0Vo5wtdmaJNChIzRxwYaSblYkJN1yQa8Z3HYSbG29D77odBu3wMYA6nMMFXEEIN3AHD9CBLghI4BXevYn35n2suqp569LO4I+8zx84xIo4</latexit><latexit sha1_base64="BtOu40dmbNDpNla2CgWibObMJGw=">AAACdXicbZHLSgMxFIYz463WqqNbN8EitFDrTDe6EUQ3LivYC3RqyaSZNjRzITkjlGGew/dy58MIppdpa/VA4M//5Zwk53ix4Aps+8swd3b39g8Kh8Wj0vHJqXVWaqsokZS1aCQi2fWIYoKHrAUcBOvGkpHAE6zjTZ5mvPPOpOJR+ArTmPUDMgq5zykBbQ2sj2bFhTEDMkgnWRXfY9eXhKa550yyt9T1cp3VctDYAI05GEagVjjYwFpnGU4f9aayLlVbJ+e5q+NVnA2ssl2354H/CmcpymgZzYH1qavQJGAhUEGU6jl2DP2USOBUsKzoJorFhE7IiPW0DEnAVD+dNzDDV9oZYj+SeoWA5+5mRkoCpaaBp08GBMZqm83M/1gvAf+un/IwToCFdHGRnwgMEZ5NAw+5ZBTEVAtCJddvxXRM9ARAz6yom+Bsf/mvaDfqjl13XmxUQBfoElWQg27RA3pGTdRCFH0bZaNmXJumWTFvFu0yjWXfztGvMJ0fyPDDkg==</latexit><latexit sha1_base64="BtOu40dmbNDpNla2CgWibObMJGw=">AAACdXicbZHLSgMxFIYz463WqqNbN8EitFDrTDe6EUQ3LivYC3RqyaSZNjRzITkjlGGew/dy58MIppdpa/VA4M//5Zwk53ix4Aps+8swd3b39g8Kh8Wj0vHJqXVWaqsokZS1aCQi2fWIYoKHrAUcBOvGkpHAE6zjTZ5mvPPOpOJR+ArTmPUDMgq5zykBbQ2sj2bFhTEDMkgnWRXfY9eXhKa550yyt9T1cp3VctDYAI05GEagVjjYwFpnGU4f9aayLlVbJ+e5q+NVnA2ssl2354H/CmcpymgZzYH1qavQJGAhUEGU6jl2DP2USOBUsKzoJorFhE7IiPW0DEnAVD+dNzDDV9oZYj+SeoWA5+5mRkoCpaaBp08GBMZqm83M/1gvAf+un/IwToCFdHGRnwgMEZ5NAw+5ZBTEVAtCJddvxXRM9ARAz6yom+Bsf/mvaDfqjl13XmxUQBfoElWQg27RA3pGTdRCFH0bZaNmXJumWTFvFu0yjWXfztGvMJ0fyPDDkg==</latexit><latexit sha1_base64="2OOn6zPXhp7CH/LzqyoQWWX0IUQ=">AAACgHicbZHZSgMxFIYz41brVvXSm2ARWqh1pjcWQSh642UFu0BbSybNtGEyC8kZoQzzHL6Xdz6MYLrX1gOBP/93TpZznEhwBZb1bZg7u3v7B5nD7NHxyelZ7vyiqcJYUtagoQhl2yGKCR6wBnAQrB1JRnxHsJbjPU9464NJxcPgDcYR6/lkGHCXUwLa6uc+64UujBiQfuKlRfyIu64kNFl4tpe+J11nodPSAlTWQGUKBiGoJfbXsNZpipMnvSmsjiqtihe1y/QiTvu5vFW2poG3hT0XeTSPej/3pU+hsc8CoIIo1bGtCHoJkcCpYGm2GysWEeqRIetoGRCfqV4ybWCKb7QzwG4o9QoAT931ioT4So19R2f6BEZqk03M/1gnBrfaS3gQxcACOrvIjQWGEE+mgQdcMgpirAWhkuu3YjoiegKgZ5bVTbA3v7wtmpWybZXtVytfq87bkUFX6BoVkI3uUQ29oDpqIIp+jLxRMm5N0yyYd6Y9SzWNec0l+hPmwy9LncRw</latexit><latexit sha1_base64="03N+DkYPgNUd/J3uXdmRdwzpDpw=">AAACgHicbZHZSgMxFIYz41brVvXSm2ARWqh1pggWQSh642UFu0BbSybN2DCZheSMUIZ5Dt/LOx9GMN1r64HAn/87J8s5TiS4Asv6Nsyt7Z3dvcx+9uDw6Pgkd3rWVGEsKWvQUISy7RDFBA9YAzgI1o4kI74jWMvxnsa89cGk4mHwCqOI9XzyHnCXUwLa6uc+64UuDBmQfuKlRfyAu64kNJl7tpe+JV1nrtPSHFRWQGUCBiGoBfZXsNZpipNHvSksjyoti+e1i/QiTvu5vFW2JoE3hT0TeTSLej/3pU+hsc8CoIIo1bGtCHoJkcCpYGm2GysWEeqRd9bRMiA+U71k0sAUX2lngN1Q6hUAnrirFQnxlRr5js70CQzVOhub/7FODG61l/AgioEFdHqRGwsMIR5PAw+4ZBTESAtCJddvxXRI9ARAzyyrm2Cvf3lTNCtl2yrbL7f5WnXWjgy6QJeogGx0h2roGdVRA1H0Y+SNknFtmmbBvDHtaappzGrO0Z8w738BTN3EdA==</latexit><latexit sha1_base64="03N+DkYPgNUd/J3uXdmRdwzpDpw=">AAACgHicbZHZSgMxFIYz41brVvXSm2ARWqh1pggWQSh642UFu0BbSybN2DCZheSMUIZ5Dt/LOx9GMN1r64HAn/87J8s5TiS4Asv6Nsyt7Z3dvcx+9uDw6Pgkd3rWVGEsKWvQUISy7RDFBA9YAzgI1o4kI74jWMvxnsa89cGk4mHwCqOI9XzyHnCXUwLa6uc+64UuDBmQfuKlRfyAu64kNJl7tpe+JV1nrtPSHFRWQGUCBiGoBfZXsNZpipNHvSksjyoti+e1i/QiTvu5vFW2JoE3hT0TeTSLej/3pU+hsc8CoIIo1bGtCHoJkcCpYGm2GysWEeqRd9bRMiA+U71k0sAUX2lngN1Q6hUAnrirFQnxlRr5js70CQzVOhub/7FODG61l/AgioEFdHqRGwsMIR5PAw+4ZBTESAtCJddvxXRI9ARAzyyrm2Cvf3lTNCtl2yrbL7f5WnXWjgy6QJeogGx0h2roGdVRA1H0Y+SNknFtmmbBvDHtaappzGrO0Z8w738BTN3EdA==</latexit><latexit sha1_base64="03N+DkYPgNUd/J3uXdmRdwzpDpw=">AAACgHicbZHZSgMxFIYz41brVvXSm2ARWqh1pggWQSh642UFu0BbSybN2DCZheSMUIZ5Dt/LOx9GMN1r64HAn/87J8s5TiS4Asv6Nsyt7Z3dvcx+9uDw6Pgkd3rWVGEsKWvQUISy7RDFBA9YAzgI1o4kI74jWMvxnsa89cGk4mHwCqOI9XzyHnCXUwLa6uc+64UuDBmQfuKlRfyAu64kNJl7tpe+JV1nrtPSHFRWQGUCBiGoBfZXsNZpipNHvSksjyoti+e1i/QiTvu5vFW2JoE3hT0TeTSLej/3pU+hsc8CoIIo1bGtCHoJkcCpYGm2GysWEeqRd9bRMiA+U71k0sAUX2lngN1Q6hUAnrirFQnxlRr5js70CQzVOhub/7FODG61l/AgioEFdHqRGwsMIR5PAw+4ZBTESAtCJddvxXRI9ARAzyyrm2Cvf3lTNCtl2yrbL7f5WnXWjgy6QJeogGx0h2roGdVRA1H0Y+SNknFtmmbBvDHtaappzGrO0Z8w738BTN3EdA==</latexit><latexit sha1_base64="03N+DkYPgNUd/J3uXdmRdwzpDpw=">AAACgHicbZHZSgMxFIYz41brVvXSm2ARWqh1pggWQSh642UFu0BbSybN2DCZheSMUIZ5Dt/LOx9GMN1r64HAn/87J8s5TiS4Asv6Nsyt7Z3dvcx+9uDw6Pgkd3rWVGEsKWvQUISy7RDFBA9YAzgI1o4kI74jWMvxnsa89cGk4mHwCqOI9XzyHnCXUwLa6uc+64UuDBmQfuKlRfyAu64kNJl7tpe+JV1nrtPSHFRWQGUCBiGoBfZXsNZpipNHvSksjyoti+e1i/QiTvu5vFW2JoE3hT0TeTSLej/3pU+hsc8CoIIo1bGtCHoJkcCpYGm2GysWEeqRd9bRMiA+U71k0sAUX2lngN1Q6hUAnrirFQnxlRr5js70CQzVOhub/7FODG61l/AgioEFdHqRGwsMIR5PAw+4ZBTESAtCJddvxXRI9ARAzyyrm2Cvf3lTNCtl2yrbL7f5WnXWjgy6QJeogGx0h2roGdVRA1H0Y+SNknFtmmbBvDHtaappzGrO0Z8w738BTN3EdA==</latexit><latexit sha1_base64="03N+DkYPgNUd/J3uXdmRdwzpDpw=">AAACgHicbZHZSgMxFIYz41brVvXSm2ARWqh1pggWQSh642UFu0BbSybN2DCZheSMUIZ5Dt/LOx9GMN1r64HAn/87J8s5TiS4Asv6Nsyt7Z3dvcx+9uDw6Pgkd3rWVGEsKWvQUISy7RDFBA9YAzgI1o4kI74jWMvxnsa89cGk4mHwCqOI9XzyHnCXUwLa6uc+64UuDBmQfuKlRfyAu64kNJl7tpe+JV1nrtPSHFRWQGUCBiGoBfZXsNZpipNHvSksjyoti+e1i/QiTvu5vFW2JoE3hT0TeTSLej/3pU+hsc8CoIIo1bGtCHoJkcCpYGm2GysWEeqRd9bRMiA+U71k0sAUX2lngN1Q6hUAnrirFQnxlRr5js70CQzVOhub/7FODG61l/AgioEFdHqRGwsMIR5PAw+4ZBTESAtCJddvxXRI9ARAzyyrm2Cvf3lTNCtl2yrbL7f5WnXWjgy6QJeogGx0h2roGdVRA1H0Y+SNknFtmmbBvDHtaappzGrO0Z8w738BTN3EdA==</latexit><latexit sha1_base64="03N+DkYPgNUd/J3uXdmRdwzpDpw=">AAACgHicbZHZSgMxFIYz41brVvXSm2ARWqh1pggWQSh642UFu0BbSybN2DCZheSMUIZ5Dt/LOx9GMN1r64HAn/87J8s5TiS4Asv6Nsyt7Z3dvcx+9uDw6Pgkd3rWVGEsKWvQUISy7RDFBA9YAzgI1o4kI74jWMvxnsa89cGk4mHwCqOI9XzyHnCXUwLa6uc+64UuDBmQfuKlRfyAu64kNJl7tpe+JV1nrtPSHFRWQGUCBiGoBfZXsNZpipNHvSksjyoti+e1i/QiTvu5vFW2JoE3hT0TeTSLej/3pU+hsc8CoIIo1bGtCHoJkcCpYGm2GysWEeqRd9bRMiA+U71k0sAUX2lngN1Q6hUAnrirFQnxlRr5js70CQzVOhub/7FODG61l/AgioEFdHqRGwsMIR5PAw+4ZBTESAtCJddvxXRI9ARAzyyrm2Cvf3lTNCtl2yrbL7f5WnXWjgy6QJeogGx0h2roGdVRA1H0Y+SNknFtmmbBvDHtaappzGrO0Z8w738BTN3EdA==</latexit>

For code and data, see www.cs.cmu.edu/~tom/mlbook.html click on “Software and Data” For code and data, see

www.cs.cmu.edu/~tom/mlbook.html click on “Software and Data”

19 Prior probabilities

Page 18: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

For code and data, see www.cs.cmu.edu/~tom/mlbook.html click on “Software and Data”

19 Prior probabilities

Can group over sub-classes to generate

prior counts

(hallucinate training examples)

Performance can be very good• Even when taking half of the email

• Assumption doesn’t hurt the particular problem?

• Redundancy? • Leads less examples to train?

Converges faster to asymptotic performance? (Ng and Jordan)

What if we have continuous Xi ?

Eg., image classification: Xi is real-valued ith pixel

What if we have continuous Xi ?Eg., image classification: Xi is real-valued ith pixel

Naïve Bayes requires P(Xi | Y=yk), but Xi is real (continuous)

Common approach: assume P(Xi | Y=yk) follows a Normal (Gaussian) distribution

Page 19: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

GNB Example: Classify a person’s cognitive state, based on brain image

• reading a sentence or viewing a picture? • reading the word describing a “Tool” or “Building”? • answering the question, or getting confused?

time

Stimuli for the study:

ant

or 60 distinct exemplars, presented 6 times each

What if we have continuous Xi ?Gaussian Naïve Bayes (GNB): assume

Sometimes assume variance • is independent of Y (i.e., σi), • or independent of Xi (i.e., σk) • or both (i.e., σ)

Gaussian Naïve Bayes Algorithm – continuous Xi (but still discrete Y)

• Train Naïve Bayes (examples) for each value yk

estimate* for each attribute Xi estimate

• class conditional mean , variance

• Classify (Xnew)

* probabilities must sum to 1, so need estimate only n-1 parameters...

Page 20: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

Estimating Parameters: Y discrete, Xi continuous

Maximum likelihood estimates: jth training example

δ()=1 if (Yj=yk) else 0

ith feature kth class

Mean activations over all training examples for Y=“bottle”

Y is the mental state (reading “house” or “bottle”) Xi are the voxel activities,

this is a plot of the µ’s defining P(Xi | Y=“bottle”)

fMRI activation

high

below average

average

Classification task: is person viewing a “tool” or “building”?

p4 p8 p6 p11 p5 p7 p10 p9 p2 p12 p3 p10

0.10.20.30.40.50.60.70.80.9

1

Participants

Cla

ssifi

catio

n ac

cura

cy

statistically significant

p<0.05

Cla

ssifi

catio

n ac

cura

cy

Where is information encoded in the brain?

Accuracies of cubical 27-voxel

classifiers centered at

each significant voxel

[0.7-0.8]

Page 21: Announcementslwehbe/10701_S19/files/Lecture_5.pdf · 2019-02-05 · 10-701 Introduction to Machine Learning (PhD) Lecture 5: Naive Bayes Leila Wehbe Carnegie Mellon University Machine

Questions to think about:

• How can we extend Naïve Bayes if just 2 of the Xi‘s are dependent?

• What error will the classifier achieve if Naïve Bayes assumption is satisfied and we have infinite training data?

• What does the decision surface of a Naïve Bayes classifier look like?

• Can you use Naïve Bayes for a combination of discrete and real-valued Xi?

We covered: • Bayes classifiers to learn P(Y|X) • MLE and MAP estimates for parameters of P • Conditional independence • Naïve Bayes ! make Bayesian learning practical • Text classification • Naïve Bayes and continuous variables Xi:

• Gaussian Naïve Bayes classifier

Next: • Learn P(Y|X) directly

• Logistic regression, Regularization, Gradient ascent • Naïve Bayes or Logistic Regression?

• Generative vs. Discriminative classifiers

Gaussian Naïve Bayes – Big Picture

Example: Y= PlayBasketball (boolean), X1=Height, X2=MLgrade

assume P(Y=1) = 0.5

Logistic RegressionIdea: • Naïve Bayes allows computing P(Y|X) by

learning P(Y) and P(X|Y)

• Why not learn P(Y|X) directly?