Generative learning methods for bags of features (lazebnik/spring10/lec20_generative.pdf)
TRANSCRIPT
Generative learning methods for bags of features
• Model the probability of a bag of features given a class
Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba
Generative methods
• We will cover two models, both inspired by text document analysis:
  • Naïve Bayes
  • Probabilistic Latent Semantic Analysis
The Naïve Bayes model
• Assume that each feature is conditionally independent given the class:
  p(f1, …, fN | c) = ∏_{i=1}^{N} p(fi | c)
  fi: ith feature in the image
  N: number of features in the image
Csurka et al. 2004
The Naïve Bayes model
• Assume that each feature is conditionally independent given the class:
  p(f1, …, fN | c) = ∏_{i=1}^{N} p(fi | c) = ∏_{j=1}^{M} p(wj | c)^n(wj)
  fi: ith feature in the image
  N: number of features in the image
  wj: jth visual word in the vocabulary
  M: size of visual vocabulary
  n(wj): number of features of type wj in the image
Csurka et al. 2004
The Naïve Bayes model
• Assume that each feature is conditionally independent given the class:
  p(f1, …, fN | c) = ∏_{i=1}^{N} p(fi | c) = ∏_{j=1}^{M} p(wj | c)^n(wj)
  p(wj | c) = (No. of features of type wj in training images of class c) / (Total no. of features in training images of class c)
Csurka et al. 2004
The Naïve Bayes model
• Assume that each feature is conditionally independent given the class:
  p(f1, …, fN | c) = ∏_{i=1}^{N} p(fi | c) = ∏_{j=1}^{M} p(wj | c)^n(wj)
  p(wj | c) = (No. of features of type wj in training images of class c + 1) / (Total no. of features in training images of class c + M)
  (Laplace smoothing to avoid zero counts)
Csurka et al. 2004
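The smoothed estimate above can be written as a short sketch (not from the slides; the class names and counts are made up for illustration). `counts[c]` is assumed to hold, per class, a length-M array of visual-word counts summed over that class's training images:

```python
# Minimal sketch: Laplace-smoothed estimation of p(w_j | c) from word counts.
import numpy as np

def estimate_word_likelihoods(counts):
    """Return {class: array of p(w_j | c)} with add-one (Laplace) smoothing."""
    likelihoods = {}
    for c, n in counts.items():
        n = np.asarray(n, dtype=float)
        M = n.size                           # vocabulary size
        # (count of w_j in class c + 1) / (total count in class c + M)
        likelihoods[c] = (n + 1.0) / (n.sum() + M)
    return likelihoods

# Toy vocabulary of M = 3 visual words, two hypothetical classes:
probs = estimate_word_likelihoods({"zebra": [8, 1, 1], "grass": [0, 9, 3]})
# Each distribution sums to 1, and no entry is zero even for unseen words.
```

Note how the word that never occurred in the "grass" images still gets probability 1/15 rather than 0, which is exactly what the smoothing is for.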
The Naïve Bayes model
• Maximum A Posteriori decision:
  c* = argmax_c p(c) ∏_{j=1}^{M} p(wj | c)^n(wj)
     = argmax_c [ log p(c) + ∑_{j=1}^{M} n(wj) log p(wj | c) ]
  (you should compute the log of the likelihood instead of the likelihood itself in order to avoid underflow)
Csurka et al. 2004
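A minimal sketch of this MAP decision in log space (not from the slides; the priors, likelihoods, and class names below are invented for illustration). `log_likelihoods[c]` is assumed to be a precomputed length-M array of log p(wj | c), and `n` holds the visual-word counts n(wj) for the test image:

```python
# Minimal sketch: Naive Bayes MAP classification using log-probabilities
# to avoid numerical underflow from multiplying many small terms.
import numpy as np

def classify(n, log_priors, log_likelihoods):
    """Return argmax_c [ log p(c) + sum_j n(w_j) * log p(w_j | c) ]."""
    n = np.asarray(n, dtype=float)
    scores = {c: log_priors[c] + n @ log_likelihoods[c] for c in log_priors}
    return max(scores, key=scores.get)

# Toy example with M = 3 words and two hypothetical classes:
log_priors = {"zebra": np.log(0.5), "grass": np.log(0.5)}
log_likelihoods = {"zebra": np.log([0.7, 0.2, 0.1]),
                   "grass": np.log([0.1, 0.6, 0.3])}
pred = classify([5, 1, 0], log_priors, log_likelihoods)  # → "zebra"
```

The dot product `n @ log_likelihoods[c]` is exactly the sum ∑_j n(wj) log p(wj | c) from the slide.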
The Naïve Bayes model
• “Graphical model”:
  [Figure: class node c with an arrow to word node w, inside a plate replicated N times (one per feature)]
Csurka et al. 2004
Probabilistic Latent Semantic Analysis
[Figure: an image decomposed as a mixture of “visual topics”: Image = p1 · zebra + p2 · grass + p3 · tree]
T. Hofmann, Probabilistic Latent Semantic Analysis, UAI 1999
Probabilistic Latent Semantic Analysis
• Unsupervised technique
• Two-level generative model: a document is a mixture of topics, and each topic has its own characteristic word distribution
  document → topic → word:  d —P(z|d)→ z —P(w|z)→ w
T. Hofmann, Probabilistic Latent Semantic Analysis, UAI 1999
Probabilistic Latent Semantic Analysis
• Unsupervised technique
• Two-level generative model: a document is a mixture of topics, and each topic has its own characteristic word distribution
  p(wi | dj) = ∑_{k=1}^{K} p(wi | zk) p(zk | dj)
T. Hofmann, Probabilistic Latent Semantic Analysis, UAI 1999
The pLSA model
  p(wi | dj) = ∑_{k=1}^{K} p(wi | zk) p(zk | dj)
  p(wi | dj): probability of word i in document j (known)
  p(wi | zk): probability of word i given topic k (unknown)
  p(zk | dj): probability of topic k given document j (unknown)
The pLSA model
  p(wi | dj) = ∑_{k=1}^{K} p(wi | zk) p(zk | dj)
  In matrix form: the observed codeword distributions p(wi | dj) (words × documents, M×N) factor into the codeword distributions per topic (class) p(wi | zk) (words × topics, M×K) times the topic distributions per image p(zk | dj) (topics × documents, K×N).
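This factorization can be checked numerically with a tiny sketch (not from the slides; the dimensions and random matrices are illustrative only):

```python
# Minimal sketch: the pLSA decomposition as a matrix product.
# P_wz is M×K (word distributions per topic), P_zd is K×N (topic
# distributions per document); their product gives p(w_i | d_j), M×N.
import numpy as np

M, K, N = 4, 2, 3
rng = np.random.default_rng(0)
P_wz = rng.random((M, K)); P_wz /= P_wz.sum(axis=0)   # columns sum to 1
P_zd = rng.random((K, N)); P_zd /= P_zd.sum(axis=0)   # columns sum to 1
P_wd = P_wz @ P_zd   # p(w_i | d_j) = sum_k p(w_i | z_k) p(z_k | d_j)
# Each column of P_wd is then a valid word distribution for one document.
```

Because each factor has columns summing to 1, every column of the product is automatically a proper distribution over words.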
Learning pLSA parameters
Maximize likelihood of data:
  L = ∏_{i=1}^{M} ∏_{j=1}^{N} p(wi | dj)^n(wi, dj)
  n(wi, dj): observed count of word i in document j
  M … number of codewords
  N … number of images
Slide credit: Josef Sivic
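The standard way to maximize this likelihood is Expectation-Maximization (as the summary slide notes). Below is a minimal, unoptimized sketch (not from the slides): the E-step computes the posterior p(zk | wi, dj), and the M-step re-estimates p(wi | zk) and p(zk | dj) from the posterior-weighted counts. All variable names are illustrative:

```python
# Minimal sketch: EM for pLSA. `n` is the M×N count matrix n(w_i, d_j);
# K is the chosen number of topics.
import numpy as np

def plsa_em(n, K, iters=50, seed=0):
    n = np.asarray(n, dtype=float)
    M, N = n.shape
    rng = np.random.default_rng(seed)
    P_wz = rng.random((M, K)); P_wz /= P_wz.sum(axis=0)
    P_zd = rng.random((K, N)); P_zd /= P_zd.sum(axis=0)
    for _ in range(iters):
        # E-step: posterior[i, j, k] proportional to p(w_i | z_k) p(z_k | d_j)
        joint = P_wz[:, None, :] * P_zd.T[None, :, :]          # M×N×K
        joint /= joint.sum(axis=2, keepdims=True) + 1e-12
        weighted = n[:, :, None] * joint                        # n(w,d) p(z|w,d)
        # M-step: renormalize the expected counts
        P_wz = weighted.sum(axis=1)                             # M×K
        P_wz /= P_wz.sum(axis=0, keepdims=True) + 1e-12
        P_zd = weighted.sum(axis=0).T                           # K×N
        P_zd /= P_zd.sum(axis=0, keepdims=True) + 1e-12
    return P_wz, P_zd
```

A practical implementation would avoid materializing the dense M×N×K posterior, but the small version above makes the two steps easy to match against the equations.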
Inference
• Finding the most likely topic (class) for an image:
  z* = argmax_z p(z | d)
Inference
• Finding the most likely topic (class) for an image:
  z* = argmax_z p(z | d)
• Finding the most likely topic (class) for a visual word in a given image:
  z* = argmax_z p(z | w, d) = argmax_z [ p(w | z) p(z | d) / ∑_{z′} p(w | z′) p(z′ | d) ]
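Given the learned factors, both argmax rules reduce to a few lines (a sketch, not from the slides; matrix names match the earlier M×K / K×N convention). Note the denominator of p(z | w, d) does not depend on z, so the per-word argmax only needs the numerator:

```python
# Minimal sketch: pLSA inference from learned P_wz (M×K) and P_zd (K×N).
import numpy as np

def image_topic(P_zd, j):
    """z* = argmax_z p(z | d_j)."""
    return int(np.argmax(P_zd[:, j]))

def word_topic(P_wz, P_zd, i, j):
    """z* = argmax_z p(w_i | z) p(z | d_j); the normalizer cancels."""
    scores = P_wz[i, :] * P_zd[:, j]
    return int(np.argmax(scores))
```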
Topic discovery in images
J. Sivic, B. Russell, A. Efros, A. Zisserman, W. Freeman, Discovering Objects and their Location in Images, ICCV 2005
Application of pLSA: Action recognition
• Space-time interest points
Juan Carlos Niebles, Hongcheng Wang and Li Fei-Fei, Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words, IJCV 2008.
pLSA model
  p(wi | dj) = ∑_{k=1}^{K} p(wi | zk) p(zk | dj)
  p(wi | dj): probability of word i in video j (known)
  p(wi | zk): probability of word i given topic k (unknown)
  p(zk | dj): probability of topic k given video j (unknown)
  • wi = spatial-temporal word
  • dj = video
  • n(wi, dj) = co-occurrence table (# of occurrences of word wi in video dj)
  • z = topic, corresponding to an action
Action recognition example
Multiple Actions
Summary: Generative models
• Naïve Bayes
  • Unigram models in document analysis
  • Assumes conditional independence of words given class
  • Parameter estimation: frequency counting
• Probabilistic Latent Semantic Analysis
  • Unsupervised technique
  • Each document is a mixture of topics (an image is a mixture of classes)
  • Can be thought of as matrix decomposition
  • Parameter estimation: Expectation-Maximization