Bayesian Nonparametric Matrix Factorization for Recorded Music
Reading Group Presenter:
Shujie Hou
Cognitive Radio Institute
Friday, October 15, 2010
Authors: Matthew D. Hoffman, David M. Blei, Perry R. Cook
Princeton University, Department of Computer Science, 35 Olden St., Princeton, NJ, 08540 USA
Outline
■ Introduction
□ Terminology
□ Problem statement and contribution of this paper
■ GaP-NMF Model (Gamma Process Nonnegative Matrix Factorization)
■ Variational Inference
□ Definition
□ Variational Objective Function
□ Coordinate Ascent Optimization
■ Other Approaches
■ Evaluation
Terminology(1)
■ Nonparametric Statistics:
□ The term non-parametric is not meant to imply that such models completely lack parameters but that the number and nature of the parameters are flexible and not fixed in advance.
■ Nonnegative Matrix Factorization:
□ Non-negative matrix factorization (NMF) is a group of algorithms in multivariate analysis and linear algebra where a matrix $X$ is factorized into (usually) two matrices $W$ and $H$, all of whose elements are greater than or equal to 0: $X \approx WH$.
The above two definitions are cited from Wikipedia
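The factorization $X \approx WH$ can be illustrated with a minimal sketch. It assumes the classical multiplicative updates of Lee and Seung for the Euclidean cost, which is not the method of this paper; the function name `nmf_multiplicative` and the toy data are hypothetical.

```python
import numpy as np

def nmf_multiplicative(X, K, n_iter=500, seed=0, eps=1e-12):
    """Factorize a nonnegative X (M x N) as W (M x K) @ H (K x N)."""
    rng = np.random.default_rng(seed)
    M, N = X.shape
    W = rng.random((M, K)) + eps
    H = rng.random((K, N)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def relative_error(X, W, H):
    return np.linalg.norm(X - W @ H) / np.linalg.norm(X)

# Toy data: an exactly rank-2 nonnegative matrix.
rng = np.random.default_rng(1)
X = rng.random((6, 2)) @ rng.random((2, 8))
W, H = nmf_multiplicative(X, K=2)
rel_err = relative_error(X, W, H)
```

Note that K has to be chosen before the factorization is run, which is the limitation the paper addresses.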
Terminology(2)
■ Variational Inference:
□ Variational inference approximates the posterior distribution with a simpler distribution, whose parameters are optimized to be close to the true posterior.
■ Mean-field Variational Inference:
□ In mean-field variational inference, each variable is given an independent distribution, usually of the same family as its prior.
Problem Statement and Contribution
■ Research topic:
□ Breaking audio spectrograms into separate sources of sound using latent variable decompositions, e.g., matrix factorization.
■ A potential problem:
□ The number of latent variables must be specified in advance, which is not always possible.
■ Contribution of this paper:
□ The paper develops Gamma Process Nonnegative Matrix Factorization (GaP-NMF), a Bayesian nonparametric approach to decomposing spectrograms.
Dataset on GaP-NMF Model
■ We are given an M by N matrix $X$, in which $X_{mn}$ is the power of the audio signal at time window n and frequency bin m.
If the number of latent variables is specified in advance:
■ Assume the audio signal is composed of K static sound sources. The problem is to decompose $X \approx WH$, in which $W$ is an M by K matrix and $H$ is a K by N matrix. Cell $W_{mk}$ is the average amount of energy source k exhibits at frequency m, and cell $H_{kn}$ is the gain of source k at time n.
■ The problem is solved by [equation missing from the transcript]
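The matrix $X$ described above can be sketched as follows. This assumes $X$ is the squared magnitude of a short-time Fourier transform; the function name `power_spectrogram`, the window length, the hop size, and the test signal are illustrative choices, not taken from the paper.

```python
import numpy as np

def power_spectrogram(signal, win_len=256, hop=128):
    """Return X (M x N): power at frequency bin m and time window n."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    # One windowed frame per column.
    frames = np.stack(
        [signal[i * hop : i * hop + win_len] * window for i in range(n_frames)],
        axis=1,
    )
    spectrum = np.fft.rfft(frames, axis=0)  # M = win_len // 2 + 1 frequency bins
    return np.abs(spectrum) ** 2

# Illustrative signal: two sinusoids sampled at 8 kHz for one second.
sr = 8000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)
X = power_spectrogram(signal)
```

Every entry of X is nonnegative, which is what makes a nonnegative factorization applicable.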
GaP-NMF Model
If the number of latent variables is not specified in advance:
■ GaP-NMF assumes that the data is drawn according to the following generative process, with a large truncation level L:
$W_{ml} \sim \mathrm{Gamma}(a, a)$
$H_{ln} \sim \mathrm{Gamma}(b, b)$
$\theta_l \sim \mathrm{Gamma}(\alpha / L, \alpha c)$
$X_{mn} \sim \mathrm{Exponential}\left(\sum_l \theta_l W_{ml} H_{ln}\right)$
□ $\theta_l$ is the overall gain of the corresponding source l; it is used to control the number of latent variables.
□ The exponential model for $X_{mn}$ is based on the formula of Abdallah & Plumbley (2004).
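Drawing data from this generative process can be sketched as below. It assumes the priors $W_{ml} \sim \mathrm{Gamma}(a, a)$, $H_{ln} \sim \mathrm{Gamma}(b, b)$, $\theta_l \sim \mathrm{Gamma}(\alpha/L, \alpha c)$ in shape and rate form, and an exponential likelihood parameterized by its mean. The function name `sample_gap_nmf` and all hyperparameter values are illustrative assumptions.

```python
import numpy as np

def sample_gap_nmf(M, N, L, a=1.0, b=1.0, alpha=1.0, c=1.0, seed=0):
    """Draw (X, W, H, theta) from the truncated generative process."""
    rng = np.random.default_rng(seed)
    # NumPy's gamma takes (shape, scale), and scale = 1 / rate.
    W = rng.gamma(a, 1.0 / a, size=(M, L))
    H = rng.gamma(b, 1.0 / b, size=(L, N))
    theta = rng.gamma(alpha / L, 1.0 / (alpha * c), size=L)
    mean = (W * theta) @ H      # sum over l of theta_l * W_ml * H_ln
    X = rng.exponential(mean)   # exponential with the given mean
    return X, W, H, theta

X, W, H, theta = sample_gap_nmf(M=4, N=5, L=50)
```

With a small shape $\alpha / L$, most entries of `theta` come out close to zero, so only a few of the L components contribute noticeably to X.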
GaP-NMF Model
■ The number of nonzero $\theta_l$ is the number of latent variables K.
■ As L increases towards infinity, the number K of elements $\theta_l$ greater than some $\epsilon > 0$ stays finite and obeys a Poisson distribution (Kingman, 1993).
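The claim that the number of non-negligible components stays finite as L grows can be checked by simulation. This is a minimal sketch, assuming $\theta_l \sim \mathrm{Gamma}(\alpha/L, \alpha c)$ in shape and rate form; the function name `count_active`, the threshold `eps`, and the values of `alpha` and `c` are illustrative.

```python
import numpy as np

def count_active(L, alpha=1.0, c=1.0, eps=1e-3, n_rep=200, seed=0):
    """Average number of theta_l > eps over n_rep draws at truncation level L."""
    rng = np.random.default_rng(seed)
    # shape = alpha / L, scale = 1 / rate = 1 / (alpha * c)
    theta = rng.gamma(alpha / L, 1.0 / (alpha * c), size=(n_rep, L))
    return (theta > eps).sum(axis=1).mean()

# The average count levels off instead of growing with L.
averages = {L: count_active(L) for L in (10, 100, 1000, 10000)}
```

The truncation level L can therefore be set generously without forcing the model to use all L components.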
Definition of Variational Inference
■ Variational inference approximates the posterior distribution with a simpler distribution, whose parameters are optimized to be close to the true posterior.
■ In the setting of this paper:
□ Posterior distribution: the distribution of the latent variables given what is measured (the spectrogram $X$).
□ Variational distribution: an assumed family of distributions with free parameters.
□ The free parameters are adjusted so that the variational distribution approximates the posterior distribution.
Variational Objective Function
■ Assume each variable obeys the following Generalized Inverse-Gaussian (GIG) family:
$\mathrm{GIG}(y; \gamma, \rho, \tau) = \dfrac{\exp\{(\gamma - 1)\log y - \rho y - \tau / y\}\, \rho^{\gamma/2}}{2\, \tau^{\gamma/2}\, \mathcal{K}_\gamma(2\sqrt{\rho\tau})}$
□ $\mathcal{K}_\gamma$ denotes a modified Bessel function of the second kind.
□ The gamma family is the special case $\tau \to 0$ with $\gamma > 0$.
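The GIG density and the expectations it yields can be checked numerically. This sketch assumes the parameterization with density proportional to $y^{\gamma-1} e^{-\rho y - \tau/y}$ on $y > 0$; the function names and the parameter values are illustrative. `scipy.special.kv` is the modified Bessel function of the second kind.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import kv  # modified Bessel function of the second kind

def gig_pdf(y, gamma, rho, tau):
    """Density proportional to y**(gamma - 1) * exp(-rho * y - tau / y), y > 0."""
    norm = 2.0 * (tau / rho) ** (gamma / 2.0) * kv(gamma, 2.0 * np.sqrt(rho * tau))
    return y ** (gamma - 1.0) * np.exp(-rho * y - tau / y) / norm

def gig_mean(gamma, rho, tau):
    """E[y] in closed form, as a ratio of Bessel functions."""
    z = 2.0 * np.sqrt(rho * tau)
    return np.sqrt(tau / rho) * kv(gamma + 1.0, z) / kv(gamma, z)

def gig_mean_inverse(gamma, rho, tau):
    """E[1 / y] in closed form, as a ratio of Bessel functions."""
    z = 2.0 * np.sqrt(rho * tau)
    return np.sqrt(rho / tau) * kv(gamma - 1.0, z) / kv(gamma, z)

g, r, t = 0.5, 2.0, 1.5  # illustrative parameter values
total, _ = quad(gig_pdf, 0.0, np.inf, args=(g, r, t))
mean_num, _ = quad(lambda y: y * gig_pdf(y, g, r, t), 0.0, np.inf)
inv_num, _ = quad(lambda y: gig_pdf(y, g, r, t) / y, 0.0, np.inf)
```

Both $E[y]$ and $E[1/y]$ are available in closed form, which is convenient because the variational updates need expectations of the variables and of their inverses.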
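Deduction(1)
■ From Jordan et al., 1999:
$\log p(X \mid a, b, \alpha, c) \ge E_q[\log p(X, W, H, \theta \mid a, b, \alpha, c)] - E_q[\log q(W, H, \theta)]$
■ The difference between the left and right sides is the Kullback-Leibler divergence between the true posterior and the variational distribution q.
■ Kullback-Leibler divergence: for probability distributions P and Q of a discrete random variable, their K–L divergence is defined to be
$D_{\mathrm{KL}}(P \,\|\, Q) = \sum_i P(i) \log \dfrac{P(i)}{Q(i)}$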
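The relation between the log evidence, the lower bound, and the K-L divergence can be verified on a toy discrete model. This is only an illustration with made-up numbers, not the GaP-NMF model: a latent variable z with three states and a single observation x.

```python
import numpy as np

# Toy model: p(z) and p(x | z) evaluated at the observed x.
prior = np.array([0.5, 0.3, 0.2])
lik = np.array([0.10, 0.40, 0.70])
joint = prior * lik                 # p(x, z)
evidence = joint.sum()              # p(x)
posterior = joint / evidence        # p(z | x)

def elbo(q):
    """E_q[log p(x, z)] - E_q[log q(z)]."""
    return np.sum(q * (np.log(joint) - np.log(q)))

def kl(p, q):
    """Discrete K-L divergence D(p || q)."""
    return np.sum(p * np.log(p / q))

q = np.array([0.2, 0.3, 0.5])       # an arbitrary variational distribution
gap = np.log(evidence) - elbo(q)    # equals kl(q, posterior)
```

The gap is zero exactly when q equals the posterior, so maximizing the lower bound over q is the same as minimizing the K-L divergence to the posterior.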
Deduction(2)
■ Using Jensen's inequality
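The kind of bound used at this step can be checked numerically. The sketch assumes the derivation relies on two standard inequalities for a sum of positive terms $x_k$: Jensen's inequality for the convex function $1/x$, with auxiliary weights $\phi_k$ on the simplex, and a first-order Taylor bound for the concave logarithm, with an auxiliary point $\omega > 0$. The names `phi` and `omega` and the random values are illustrative, and the check uses fixed numbers rather than expectations.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.gamma(2.0, 1.0, size=5)     # positive terms, stand-ins for theta_k * W_mk * H_kn
phi = rng.dirichlet(np.ones(5))     # any weights with phi_k >= 0 and sum(phi) = 1
omega = 3.0                         # any positive number

# Jensen on the convex function 1/x: 1 / sum(x) <= sum(phi**2 / x)
lhs_inv = 1.0 / x.sum()
rhs_inv = np.sum(phi ** 2 / x)

# Tangent bound on the concave log: log(sum(x)) <= log(omega) + sum(x) / omega - 1
lhs_log = np.log(x.sum())
rhs_log = np.log(omega) + x.sum() / omega - 1.0

# Both bounds are tight at these choices of the auxiliary parameters.
phi_star = x / x.sum()
omega_star = x.sum()
tight_inv = np.sum(phi_star ** 2 / x)
tight_log = np.log(omega_star) + x.sum() / omega_star - 1.0
```

Because the bounds hold for any valid `phi` and `omega`, these auxiliary parameters can be optimized along with the variational parameters to keep the bound as tight as possible.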
Objective function
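■ $\mathcal{L} = E_q[\log p(X, W, H, \theta \mid a, b, \alpha, c)] - E_q[\log q(W, H, \theta)]$
■ The likelihood term of the objective function has no closed form, so it is bounded using Jensen's inequality; the objective function becomes this bound plus the remaining prior and entropy terms.
■ Maximize the objective function defined above with respect to the corresponding parameters.
■ The variational distribution $q$ is obtained.
■ Because the three distributions are independent, we obtain $q(W, H, \theta) = q(W)\,q(H)\,q(\theta)$, which approximates the posterior $p(W, H, \theta \mid X, a, b, \alpha, c)$.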
Coordinate Ascent Algorithm(1)
■ Setting the derivative of the objective function with respect to the variational parameters to zero gives their updates.
■ The remaining variational parameters are updated similarly.
Coordinate Ascent Algorithm(2)
■ Using Lagrange multipliers, the optimal bound parameters are obtained.
■ The bound parameters and the variational parameters are then updated in turn according to equations 14, 15, 16, 17, and 18, ultimately reaching a local optimum of the bound.
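One half of the coordinate ascent loop, the update of the bound parameters, can be sketched as follows. It assumes the bound parameters are weights $\phi_{mnk}$, proportional to $1 / E_q[1/(\theta_k W_{mk} H_{kn})]$ and normalized over k, and points $\omega_{mn} = \sum_k E_q[\theta_k W_{mk} H_{kn}]$. The function name `update_bound_parameters` and the argument names are hypothetical, and the expectations are filled with random positive numbers for illustration only.

```python
import numpy as np

def update_bound_parameters(E_theta, E_W, E_H, E_inv_theta, E_inv_W, E_inv_H):
    """Return phi (M x N x K, sums to 1 over k) and omega (M x N)."""
    # Under the mean-field assumption, expectations of products factorize.
    # E[1 / (theta_k W_mk H_kn)] for every (m, n, k):
    E_inv = E_inv_W[:, None, :] * E_inv_H.T[None, :, :] * E_inv_theta[None, None, :]
    phi = 1.0 / E_inv
    phi /= phi.sum(axis=2, keepdims=True)   # normalization (the Lagrange constraint)
    # omega_mn = sum over k of E[theta_k] * E[W_mk] * E[H_kn]
    omega = (E_W * E_theta) @ E_H
    return phi, omega

# Illustrative expectations for M = 3, N = 4, K = 2.
rng = np.random.default_rng(0)
M, N, K = 3, 4, 2
E_theta, E_inv_theta = rng.random(K) + 0.1, rng.random(K) + 0.1
E_W, E_inv_W = rng.random((M, K)) + 0.1, rng.random((M, K)) + 0.1
E_H, E_inv_H = rng.random((K, N)) + 0.1, rng.random((K, N)) + 0.1
phi, omega = update_bound_parameters(E_theta, E_W, E_H, E_inv_theta, E_inv_W, E_inv_H)
```

In a full implementation this step would alternate with the updates of the variational parameters until the bound stops improving.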
Other Approaches
■ Finite Bayesian Model (also called GIG-NMF).
■ Finite Non-Bayesian Model.
■ EU-Nonnegative Matrix Factorization.
■ KL-Nonnegative Matrix Factorization.
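The last two baselines differ in the cost function they minimize. The sketch below assumes that EU and KL refer to the squared Euclidean distance and the generalized Kullback-Leibler divergence between $X$ and $WH$; the function names are hypothetical.

```python
import numpy as np

def euclidean_cost(X, W, H):
    """Squared Euclidean distance between X and WH."""
    return np.sum((X - W @ H) ** 2)

def kl_cost(X, W, H, eps=1e-12):
    """Generalized K-L divergence D(X || WH); eps guards against log(0)."""
    V = W @ H
    return np.sum(X * np.log((X + eps) / (V + eps)) - X + V)

# Both costs vanish when the factorization is exact.
rng = np.random.default_rng(0)
W = rng.random((4, 2)) + 0.1
H = rng.random((2, 5)) + 0.1
X_exact = W @ H
X_noisy = X_exact + 0.1
```

Both baselines need the number of components K to be fixed in advance.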
Evaluation on Synthetic data(1)
■ The data is generated according to the following model:
Evaluation on Synthetic data(2)
Evaluation on Recorded Music
Conclusion
■ The GaP-NMF model can determine the number of latent sources automatically.
■ The key step of the paper is to use a variational distribution to approximate the posterior distribution.
■ GaP-NMF works well for analyzing and processing recorded music, and it may be applicable to other types of audio.
■ Thank you!