introduction to probability: lecture 14: introduction to ... · a sample of application domains •...

21
LECTURE 14: Introduction to Bayesian inference The big picture motivation, applications problem types (hypothesis testing, estimation, etc.) The general framework Bayes' rule , posterior (4 versions) point estimates (MAP, LMS) performance measures) (prob. of error; mean squared error) examples 1

Upload: others

Post on 04-May-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

LECTURE 14: Introduction to Bayesian inference

• The big picture

motivation, applications

problem types (hypothesis testing, estimation, etc.)

• The general framework

Bayes' rule , posterior

(4 versions)

point estimates (MAP, LMS)

performance measures) (prob. of error; mean squared error)

examples

1

Page 2: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

Inference: the big picture

redictions Real world

Probability theory

Decisions (Analysis)

Models

In fere n ce IS tatist i cs •

2

Page 3: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

Inference then and now

• Then:

0 patients were treated: 3 died

0 patients were not treated: 5 died

herefore ...

w:

Big data

Big models

• Big computers

1

1

T

No

3

Page 4: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

A sample of application domains

• Design and interpretation of experiments

STATE COUNTS

- polling •

17 SOLIDLY DEMOCRATIC 23 SOLIDLY RIPUBLICAN 11 TOSSUP

ELECTORAL VOTE COUNTS 237 LIKELY DEMOCRATIC 191 LIKELY REPUBLICAN

I IO TOSSUP

Obama/B den I Romne /Ryani

© Source unknown. All rights reserved. This content is excludedfrom our Creative Commons license. For more information, seehttps://ocw.mit.edu/help/faq-fair-use.

4

Page 5: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

A sample of application domains

• marketing , advertising

• recommendation systems

- Netflix competition

• •

t 5

Page 6: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

S�P 500 INDEX (STANDARD & POOR' as of 30-Ma r -2007

1500

1450�-+--- - ---+- - - --+-- - ---+- - - -t-----:""T"'rf"''rt-----l

1350

1300

12001---�.....,.,---,� .,..+�� ��......,.,---,��.,-L�- � ��'-,-:, ��,--,..,�- ----1 Ma 06 Jul06 Se 06 Nov06 Jan07 Mar07

6.0 6 4 .0 l----+-- - ---+- - - -+-- - --+-�- -+-- - --.+-�----1 ·-

::: 2 .o

0.0 Cop�right 2007 Yahoo! Inc. http://financ�.�ahoo.co•/

A sample of application domains

• Finance

© Source unknown. All rights reserved. This content is excludedfrom our Creative Commons license. For more information, seehttps://ocw.mit.edu/help/faq-fair-use.

6

Page 7: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

x,.n.l, tonti

x,.n.» ' ..

' ·..r:o2ea:!JlSI • IIT :l:'ll·

--

-,)I1 oll ,•

.� Mo,-3$.,2?

�-.m11,.�n

.

. it,2:t,21 .r:

x,. n . )(fon.:

n ·.;,:,:s,n.�I):!.. m.

"' .. • 111'&1,

�·- -l"·11 ...... "M

.:.3I �!lH

llo,1t491

,.

:ll't>lJ.'l J(f,21,! ... 21.1

llf -'tf77'4 )l'.tll,I ,. •. ,ou,, , irT_,m-n. t,:o

""·'"'"llf,OlflOI

_111 -"'l',l, ·111.ons. 11,,m,1• 11,.� _1111,.,,.11,,no

W.

:

41

� ,17t:)2t

"·Mll1

,'l'S�l••w.:zt

-11 "•

o

• ·"·� , ""'

� _111w1.

1:::*ili llo, ltfl

-�•*· 11,,,..1n ,4,

� llo,2U.S'l 110.)11 10

.. _....... 1 11,.,011.t "·'",.,

"'-11,'26. 11, .. ,.

i::::1m

11,,&:tl llf,:!011�

, �

-tn':1259", .;,:,:s,n. •llf,019� 11,,,,,...,.

ia

Honnones Swvival Fbetors Tfansml11ers Grov.th Factors

(e.g� IGFl) (e.g., nter1eukins, (e.g., TGFo, EGF) ElCltacelUar iserotonin. etc.) Mattix

I I I I GPCR ...

RT I RT� I cdc42'i

"i""' Wm

1-I I G">2f0 / F'yr1Shc I I

Pl3K /P�C ' ...____

G-Protel Reis S

n / FAK O

----- I '\,_ RtSrc

i$hCVOlltd -

, Akt PKC ,I GSK-30

I

i

..-----rr--Adenytate

Akka NF-11.8i cyclase MEK

I ......._

APC / PKA

4 ,,s

/ I " -+-- JAKs

MEKK MAPK MKK P,.catenln I

-..... STAT3..S

Cytoeh'ome C

Caspase9 I

C.SPM08 _::: e ARF

mdm2 I

8C'- --+

FkO l•2 1

- -Sad

- '

f:asR Abno

� M1-

SenSOf rmalily-S .

p53

im J

Oeath I

tactors (e.g. Fast., Tnf)

A sample of application domains systems biology

• Life sciences Chemotines.

genomics

ldeoer-11"� H$UniG:tl�

neuroscience, etc., etc. •

7

This image is in the public domain. This image is in the public domain. Source: Wikimedia. Source: Wikimedia.

yunpeng
Rectangle
yunpeng
Rectangle
Page 8: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

A sample of application domains

• Modeling and monitoring the oceans

Modeling and monitoring global climate

Modeling and monitoring pollution

Interpreting data from physics experiments •

Interpreting astronomy data

8

Page 9: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

A sample of application domains

ignal processing

communication systems (noisy ... )

speech processing and understanding

image processing and understanding

tracking of objects

positioning systems (e.g., GPS)

detection of abnormal events •

• S

9

Page 10: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

10

Page 11: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

Hypothesis testing versus estimation

• Hypothesis testing:

unknown takes one of few possible values

aim at small probability of incorrect decision

it an airplane or a bird? Is

• Estimation:

numerical unknown(s)

aim at an estimate that is "close" to the true but unknown value

11

Page 12: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

The Bayesian inference framework

• Unknown

treated as a random va r iable

prior distribution Pe or Ie

Observation X

observa tion model P x le or I x le

Use appropriate version of the Bayes rule

to find Pelx(· 1 X = x) or Ie lx(· 1 X = x)

e -

-

-

Pr l• or Pe

Cond itionaPX IS

• Where does the prior come fro

symmetry

known range

earlier studies

subjective or arbitrary

m ?

PSlx(· I X =x) x Observation Posterior

l Process Calculation

12

Page 13: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

ELECTORAL VOTE DISTRIBUTION FOR OBAMA

ROMNEY 14.62% 84.59% OBAMA

11 " Tlr

6%

5% >--t: 4%...., -

-< 3% <O 0 cL 2% "'-

1 %

270

0% ___________ .....

NUMBER OF ELECTORAL VOTES

The output of Bayesian inference

The complete answer is a posterior distribution: PMF PeixC· Ix) or PDF feixC· Ix)

Pri or .-------------, Pe I

-- Observation X I

. Posterior Pe1x ( · I X = x) 1• I

Point Estimates 1

Process Calculation - I Error Analysis - I

: etc. Cond itional -I - - ---- _._ -- - :

PxJe 13

• •

© Source unknown. All rights reserved. This content is excludedfrom our Creative Commons license. For more information, seehttps://ocw.mit.edu/help/faq-fair-use.

Page 14: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

I I r I I ,

Point estimates in Bayesian inference

The complete answer is a posterior distribution :

PMF PerxC- I x) or PDF fe rxC- I x)

• Maximum a posteriori probability (MAP):

Perx(O* I x) = mrperx(8 1 x) I

fe rx(O* I x) = mrfelx(81 x) ,

ConditIonal expectation: E[e I X = xl (LMS: L east Mean Squares)

estimate: (j = g(x)

(number)

estimator: e = g(X) =

(random va ri ab le)

14

Page 15: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

0 .6

0.3

0.1

1 e

Discrete 6. discrete X

• values of 6: alternative hypotheses

• MAP rule: fJ = 2.

px(x) = 2>e(8') PX le(x I ef)

0'

• conditional prob of error:

P (fJ i=6 I X = x) = 0.11 smallest under the MAP rule

overall probability of error:

P(8 i=6 ) =2:P(8i=6 I X=x)px(x) •

I x I •

I

= 2:P(8 i= 6 I 6 = e)pe(e) e

15

Page 16: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

Discrete e, continuous X pe(e) fX le(x I e)

• Standard example:

send signal e E {l , 2, 3} fx(x) = LPe(e')fx le(x I ef) x=e+w 8'

w ~ N(O , a 2 ), indep. of e • conditional prob of error:

fX le(x I e) = fw(x - e) p(Bi=e l x=x)

0.6 smallest under the MAP rule •

0.3 • overall probability of error:

0.1 '- pee i= e) = f,P(e i= e I X = x )fx(x) dx 1

1 2 3 e ~

= LP(e i= e I e = e)pe(e) MAP rule: e

16

pelx(e I x) = fx(x)

Page 17: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

Continuous 8, continuous X fe(O) fX leCx ] 0)

fe lxCO ] x) = fx(x) linear normal models

fx(x) = fe(O')fx le(x ] 0') dO' estimation of a noisy signal

J

8 and W: independent normals

multi-dimensional versions Cmany normal parameters, many observations)

estimating the parameter of a uniform - MAt • 8 = g(X) I.."?$

• interested in: X: uniform[O, 8]

8: uniform [0 , 1] E[ce - 8)2 ] X = x] E[ce - 8)2]

17

Page 18: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

Inferring the unknown bias of a coin and the Beta distribution

• Standard example: 1 (0 1 k) = PK le(k 0)

coin with bias 8; prior leO e lK PK(k) fix n; K =number of heads

PK(k) = le(O')PKle(k 0') dO' ssume leO

1

is uniform in [0, 1) J

/, 1 • I ~ )

- I< e Ie

(1-19) ..

PI' (k ~ • .

1 Ok ( 1 _ 0) n - k "Beta distribution , with parameters (k + 1, n - k + 1)" d(n,k)

f prior is Beta: le(O) = ~ 0"(1 - 0) 13 c

18

le(O) I

-

• A

-

• I

Page 19: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

Inferring the unknown bias of a coin: point estimates

• Standard example:

- coin with bias e; prior feUfix n; K =number of heads -

• Assume is uniform in [0, 1

1 k ( )n k0 1 - 0 -

MAP estimate:

BMAP = kin e e.,r e[K €o 1 ~ Ie) (1-

.... fo- • e) =- 0

8 M AP = -- Ie: t- I

~ +-~

19

1<1 ...

feU

fe lK(O I k) = d(n, k)

"'" Q)( ("'l -l>

~/e ("'-"=)

{l " fJ _ ",1131 1

10 0 ( - 0) dO - ('" + 13 + 1) 1

,

) E[e I K = k) = e fe/K (0 IJddB o

I I Ie + ) ... _I: I -:. - (J (I-e) of)

, o )

I

J) ~---

~I

Page 20: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

Summary

• Problem data: peO, pXleC I ·)

Given the value x of X: find , e.g., pelxC 1 x ) •

- using appropriate version of the Bayes rule ("I Gf20 I'Ges)

-• Estimator e = g(X) Estimate Ii = g(x)

MAP: liMAP = 9MAP(X) maximizes pelx(e 1 x)

LMS : liLMS = 9LMS(X) = E[e 1 X = xl •

20

Page 21: Introduction to Probability: Lecture 14: Introduction to ... · A sample of application domains • Design and interpretation of experiments STATE COUNTS - polling • 17 SOLIDLY

MIT OpenCourseWarehttps://ocw.mit.edu

Resource: Introduction to ProbabilityJohn Tsitsiklis and Patrick Jaillet

The following may not correspond to a particular course on MIT OpenCourseWare, but has been provided by the author as an individual learning resource.

For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.