TRANSCRIPT
3/23/17
CS6140: Machine Learning, Spring 2017
Instructor: Lu Wang, College of Computer and Information Science
Northeastern University. Webpage: www.ccs.neu.edu/home/luwang
Email: [email protected]
Logistics
• Assignment 3 is due on 3/30.
• 4/13: course project presentation.
• 4/20: final exam.
What We Learned Last Time
• Sequential labeling models
– Hidden Markov Models
– Maximum-entropy Markov model
– Conditional Random Fields
Sample Markov Model for POS
[Figure: state-transition diagram over POS states start, Det, Noun, PropNoun, Verb, and stop, with edges labeled by transition probabilities (0.95, 0.9, 0.8, 0.5, 0.4, 0.25, 0.1, 0.05, ...).]
The Markov Assumption
Hidden Markov Models (HMMs)
[Figure: observed words emitted from hidden part-of-speech tags.]
Formally: Viterbi Backtrace
[Figure: trellis of states s1 ... sN (plus start state s0 and final state sF) over time steps t1, t2, t3, ..., tT-1, tT, with backpointers traced from sF.]
Most likely sequence: s0 sN s1 s2 ... s2 sF
Log-Linear Models
Using Log-Linear Models
Conditional Random Fields (CRFs)
Today's Outline
• Bayesian Networks
• Mixture Models
• Expectation Maximization
• Latent Dirichlet Allocation
[Some slides are borrowed from Christopher Bishop and David Sontag]
Today's Outline
• Bayesian Networks
• Mixture Models
• Expectation Maximization
• Latent Dirichlet Allocation
K-means Algorithm
• Goal: represent a data set in terms of K clusters, each of which is summarized by a prototype (mean)
• Initialize prototypes, then iterate between two phases:
– Step 1: assign each data point to the nearest prototype
– Step 2: update prototypes to be the cluster means
• Simplest version is based on Euclidean distance
BCS Summer School, Exeter, 2003. Christopher M. Bishop
The Gaussian Distribution
• Multivariate Gaussian with mean \mu and covariance \Sigma:
  N(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{D/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right)
Gaussian Mixtures
• Linear superposition of Gaussians:
  p(x) = \sum_{k=1}^{K} \pi_k \, N(x \mid \mu_k, \Sigma_k)
• Normalization and positivity require
  0 \le \pi_k \le 1, \qquad \sum_{k=1}^{K} \pi_k = 1
• Can interpret the mixing coefficients \pi_k as prior probabilities
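The superposition above can be evaluated directly. A minimal sketch (illustrative names, not from the slides):

```python
import numpy as np

def gaussian_pdf(x, mu, Sigma):
    """Multivariate Gaussian density N(x | mu, Sigma)."""
    D = len(mu)
    diff = x - mu
    norm = (2 * np.pi) ** (-D / 2) * np.linalg.det(Sigma) ** -0.5
    return norm * np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff))

def mixture_pdf(x, pis, mus, Sigmas):
    """Linear superposition: p(x) = sum_k pi_k N(x | mu_k, Sigma_k)."""
    return sum(pi * gaussian_pdf(x, mu, S)
               for pi, mu, S in zip(pis, mus, Sigmas))
```

Because the mixing coefficients are nonnegative and sum to one, the mixture is itself a valid probability density.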
Example: Mixture of 3 Gaussians
Contours of Probability Distribution
Sampling from the Gaussian Mixture
• To generate a data point:
– first pick one of the components with probability \pi_k
– then draw a sample from that component
• Repeat these two steps for each new data point
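This two-step (ancestral) sampling procedure can be sketched as follows; the function name and signature are illustrative:

```python
import numpy as np

def sample_mixture(n, pis, mus, Sigmas, seed=0):
    """Generate n points from a Gaussian mixture:
    pick a component k with probability pi_k, then draw from
    N(mu_k, Sigma_k); repeat for each new data point."""
    rng = np.random.default_rng(seed)
    ks = rng.choice(len(pis), size=n, p=pis)   # component label per point
    X = np.array([rng.multivariate_normal(mus[k], Sigmas[k]) for k in ks])
    return X, ks
```

Discarding the labels `ks` yields exactly the "synthetic data set without labels" situation the next slides discuss.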
Synthetic Data Set
Synthetic Data Set Without Labels
Fitting the Gaussian Mixture
• We wish to invert this process: given the data set, find the corresponding parameters:
– mixing coefficients
– means
– covariances
• If we knew which component generated each data point, the maximum likelihood solution would involve fitting each component to the corresponding cluster
• Problem: the data set is unlabelled
• We shall refer to the labels as latent (= hidden) variables
Synthetic Data Set Without Labels
Posterior Probabilities
• We can think of the mixing coefficients \pi_k as prior probabilities for the components
• For a given value of x we can evaluate the corresponding posterior probabilities, called responsibilities
• These are given from Bayes' theorem by
  \gamma_k(x) = \frac{\pi_k \, N(x \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, N(x \mid \mu_j, \Sigma_j)}
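The Bayes'-theorem computation of the responsibilities can be sketched directly (a minimal illustration; names are my own):

```python
import numpy as np

def responsibilities(x, pis, mus, Sigmas):
    """gamma_k = pi_k N(x|mu_k,Sigma_k) / sum_j pi_j N(x|mu_j,Sigma_j)."""
    def npdf(x, mu, S):
        # Multivariate Gaussian density.
        D, d = len(mu), x - mu
        norm = (2 * np.pi) ** (-D / 2) * np.linalg.det(S) ** -0.5
        return norm * np.exp(-0.5 * d @ np.linalg.solve(S, d))
    # Prior times likelihood for each component, then normalize.
    weighted = np.array([pi * npdf(x, mu, S)
                         for pi, mu, S in zip(pis, mus, Sigmas)])
    return weighted / weighted.sum()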
Posterior Probabilities (colour coded)
Today's Outline
• Bayesian Networks
• Mixture Models
• Expectation Maximization
• Latent Dirichlet Allocation
EM in General
• Consider an arbitrary distribution q(Z) over the latent variables (p is the true distribution)
• The following decomposition always holds:
  \ln p(X \mid \theta) = \mathcal{L}(q, \theta) + \mathrm{KL}(q \,\|\, p)
  where
  \mathcal{L}(q, \theta) = \sum_Z q(Z) \ln \frac{p(X, Z \mid \theta)}{q(Z)}, \qquad \mathrm{KL}(q \,\|\, p) = -\sum_Z q(Z) \ln \frac{p(Z \mid X, \theta)}{q(Z)}
Decomposition
Optimizing the Bound
• E-step: maximize \mathcal{L}(q, \theta) with respect to q(Z)
– equivalent to minimizing the KL divergence
– sets q(Z) equal to the posterior distribution p(Z \mid X, \theta)
• M-step: maximize the bound with respect to \theta
– equivalent to maximizing the expected complete-data log likelihood
• Each EM cycle must increase the incomplete-data likelihood unless already at a (local) maximum
[Figure: the bound before and after the E-step and the M-step.]
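For the Gaussian mixture, the E-step computes responsibilities and the M-step re-fits each component with those soft weights. A minimal NumPy sketch under those assumptions (the `init_mus` argument and regularization term are my additions, not from the slides):

```python
import numpy as np

def em_gmm(X, K, n_iters=50, init_mus=None, seed=0):
    """EM for a Gaussian mixture; returns (pis, mus, Sigmas)."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    pis = np.full(K, 1.0 / K)
    mus = (np.array(init_mus, dtype=float) if init_mus is not None
           else X[rng.choice(N, K, replace=False)].copy())
    Sigmas = np.stack([np.eye(D) for _ in range(K)])
    for _ in range(n_iters):
        # E-step: set q(z) to the posterior, i.e. responsibilities gamma (N, K).
        dens = np.empty((N, K))
        for k in range(K):
            diff = X - mus[k]
            inv = np.linalg.inv(Sigmas[k])
            norm = (2 * np.pi) ** (-D / 2) * np.linalg.det(Sigmas[k]) ** -0.5
            dens[:, k] = norm * np.exp(-0.5 * np.einsum('nd,de,ne->n', diff, inv, diff))
        gamma = pis * dens
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: maximize the expected complete-data log likelihood.
        Nk = gamma.sum(axis=0)
        pis = Nk / N
        mus = (gamma.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mus[k]
            # Small diagonal term guards against singular covariances.
            Sigmas[k] = (gamma[:, k, None] * diff).T @ diff / Nk[k] + 1e-6 * np.eye(D)
    return pis, mus, Sigmas
```

Each cycle increases the incomplete-data likelihood, so the iterates climb to a local maximum rather than to the global optimum in general.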
Today's Outline
• Bayesian Networks
• Mixture Models
• Expectation Maximization
• Latent Dirichlet Allocation
[Slides are based on David Blei's ICML 2012 tutorial]
Generative model for a document in LDA
Generative model for a document in LDA
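LDA's generative process can be sketched as code: draw per-document topic proportions from a Dirichlet, then for each word slot draw a topic and a word. A minimal illustration assuming given topic-word distributions `betas` (all names here are my own):

```python
import numpy as np

def lda_generate(n_docs, doc_len, alpha, betas, seed=0):
    """Generative process for LDA (an admixture model):
    theta_d ~ Dirichlet(alpha); for each word slot,
    z ~ Categorical(theta_d), then w ~ Categorical(beta_z)."""
    rng = np.random.default_rng(seed)
    K, V = betas.shape  # K topics over a vocabulary of V word types
    docs = []
    for _ in range(n_docs):
        theta = rng.dirichlet(alpha)               # per-document topic mix
        zs = rng.choice(K, size=doc_len, p=theta)  # one topic per word
        words = [rng.choice(V, p=betas[z]) for z in zs]
        docs.append(words)
    return docs
```

Unlike a plain mixture, where a whole document gets a single topic, here every document mixes topics in its own proportions, which is the mixture-versus-admixture contrast drawn below.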
Comparison of mixture and admixture models
Usage of LDA
EM for mixture models
EM for mixture models
What We Learned Today
• Bayesian Networks
• Mixture Models
• Expectation Maximization
• Latent Dirichlet Allocation
Homework
• Reading: Murphy 11.1-11.2, 11.4.1-11.4.4, 27.1-27.3
• More about EM:
– http://cs229.stanford.edu/notes/cs229-notes7b.pdf
– http://cs229.stanford.edu/notes/cs229-notes8.pdf
• More about LDA:
– http://menome.com/wp/wp-content/uploads/2014/12/Blei2011.pdf
– http://obphio.us/pdfs/lda_tutorial.pdf