bregman information bottleneck nips’03, whistler december 2003 koby crammer hebrew university of...
TRANSCRIPT
Bregman Information Bottleneck
NIPS’03, Whistler December 2003
Koby Crammer, Hebrew University of Jerusalem
Noam Slonim, Princeton University
Motivation
• Extend the IB for a broad family of representations
• Relation to the Exponential family
Hello, world
Multinomial distribution
Vectors
Outline
• Rate-Distortion Formulation
• Bregman Divergences
• Bregman IB
• Statistical Interpretation
• Summary
Information Bottleneck
X T Y
X
[ p(y=1|X) … p(y=n|X)]
[ p(y=1|T) … p(y=n|T)]
T
• Input
• Variables
• Distortion
Rate-Distortion Formulation
• Boltzmann Distribution:
• Markov + Bayes
• Marginal
Self-Consistent Equations
Bregman Divergences
f
(u,f(u))
(v,f(v))
(v, f(u)+f’(u)(v-u))
$B_f(v\|u) = f(v) - \left(f(u) + f'(u)(v-u)\right)$, with $f : S \to \mathbb{R}$
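The definition above (the gap between f(v) and the tangent of f at u, evaluated at v) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `bregman_divergence`, `half_sq_norm`, and `half_sq_norm_grad` are hypothetical, the gradient is assumed to be supplied by the caller, and the vector form with an inner product is assumed as the natural extension of the scalar formula on the slide.

```python
import numpy as np

def bregman_divergence(f, grad_f, v, u):
    """B_f(v||u) = f(v) - f(u) - <grad f(u), v - u> (vector form, assumed)."""
    v = np.asarray(v, dtype=float)
    u = np.asarray(u, dtype=float)
    return f(v) - f(u) - np.dot(grad_f(u), v - u)

# Illustrative convex function (an assumption): f(x) = 1/2 * ||x||^2
def half_sq_norm(x):
    return 0.5 * np.dot(x, x)

def half_sq_norm_grad(x):
    return x

d = bregman_divergence(half_sq_norm, half_sq_norm_grad, [1.0, 2.0], [0.0, 0.0])
d_self = bregman_divergence(half_sq_norm, half_sq_norm_grad, [1.0, 2.0], [1.0, 2.0])
```

For a convex f the result is non-negative, and it is zero when v equals u, which is what makes it usable as a distortion measure.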
• Functional
• Bregman Function
• Input
• Variables
• Distortion
Bregman IB: Rate-Distortion Formulation
• Boltzmann Distribution:
• Prototypes: convex combination of input vectors
• Marginal
Self-Consistent Equations
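The three ingredients listed for the Bregman IB (a Boltzmann-form assignment, prototypes as convex combinations of the input vectors, and the cluster marginal) can be iterated as a fixed-point loop. The sketch below is an assumption-laden illustration, not the authors' algorithm: the names `bregman_ib`, `half_sq_euclidean`, `q`, `p_t`, `V`, the trade-off symbol `beta`, the random soft initialisation, and the fixed iteration count are all hypothetical choices, and the assignment is assumed to take the form p(t|x) ∝ p(t) exp(-β B_f(x||v_t)).

```python
import numpy as np

def half_sq_euclidean(x, v):
    # Bregman divergence of f(x) = 1/2 * x^2, summed over coordinates
    return 0.5 * np.sum((x - v) ** 2)

def bregman_ib(X, p_x, n_clusters, beta, divergence, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    # q[i, t] plays the role of p(t | x_i); random soft initialisation (assumed)
    q = rng.random((X.shape[0], n_clusters))
    q /= q.sum(axis=1, keepdims=True)

    def marginal_and_prototypes(q):
        # Marginal: p(t) = sum_x p(x) p(t|x); clipped to avoid division by zero
        p_t = np.clip(p_x @ q, 1e-12, None)
        # Bayes: p(x|t) = p(x) p(t|x) / p(t)
        p_x_given_t = (p_x[:, None] * q) / p_t
        # Prototypes: convex combination of the input vectors
        V = p_x_given_t.T @ X
        return p_t, V

    for _ in range(n_iter):
        p_t, V = marginal_and_prototypes(q)
        D = np.array([[divergence(x, v) for v in V] for x in X])
        # Boltzmann form: p(t|x) proportional to p(t) * exp(-beta * D)
        logits = np.log(p_t) - beta * D
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        q = np.exp(logits)
        q /= q.sum(axis=1, keepdims=True)

    p_t, V = marginal_and_prototypes(q)
    return q, p_t, V

X = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 4.0], [4.0, 5.0]])
p_x = np.full(4, 0.25)
q, p_t, V = bregman_ib(X, p_x, n_clusters=2, beta=1.0, divergence=half_sq_euclidean)
```

Passing a different `divergence` function (for example a KL-type divergence on strictly positive vectors) changes the distortion without changing the loop, which is the point of the Bregman formulation.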
Special Cases
• Information Bottleneck
  Bregman function: $f(x) = x\log(x) - x$
  Domain: Simplex
  Divergence: Kullback-Leibler
• Soft K-means
  Bregman function: $f(x) = \tfrac{1}{2}x^2$
  Domain: $\mathbb{R}^n$
  Divergence: Euclidean Distance [Still, Bialek, Bottou, NIPS 2003]
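Both special cases follow by substituting the Bregman function into the definition of $B_f(v\|u)$. The short derivation below is a sketch that assumes f is applied coordinate-wise and summed over coordinates (the index $i$ is introduced here for illustration).

```latex
\begin{align*}
f(x) = x\log x - x,\ f'(u) = \log u:\quad
B_f(v\|u) &= \sum_i \Big[ v_i\log v_i - v_i - u_i\log u_i + u_i - \log u_i\,(v_i - u_i) \Big] \\
          &= \sum_i \Big[ v_i \log\frac{v_i}{u_i} - v_i + u_i \Big]
           = \sum_i v_i \log\frac{v_i}{u_i}
           \quad \text{on the simplex } \Big(\textstyle\sum_i v_i = \sum_i u_i = 1\Big), \\[4pt]
f(x) = \tfrac{1}{2}x^2,\ f'(u) = u:\quad
B_f(v\|u) &= \sum_i \Big[ \tfrac{1}{2}v_i^2 - \tfrac{1}{2}u_i^2 - u_i (v_i - u_i) \Big]
           = \tfrac{1}{2}\,\lVert v - u \rVert^2 .
\end{align*}
```

The second case gives one half of the squared Euclidean distance, which is the distortion used in (soft) k-means.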
Bregman IB
Information Bottleneck
Bregman Clustering
Rate-Distortion
Exponential Family
Exponential Family
• Expectation parameters:
• Examples (single dimension): Normal
Poisson
• Expectation parameters:
• Properties:
Exponential Family and Bregman Divergences
Illustration
• Expectation parameters:
• Properties:
Exponential Family and Bregman Divergences
• Distortion:
• Data vectors and prototypes: expectation parameters
• Question: For which exponential distribution do we have ?
Answer: Poisson
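The link between the Poisson distribution and the Bregman function $f(x) = x\log x - x$ can be checked directly. The derivation below is a sketch for a single dimension; the symbol $\mu$ for the expectation parameter is an assumption introduced here, not notation taken from the slides.

```latex
\begin{align*}
\log p(x \mid \mu) &= x\log\mu - \mu - \log x! \\
                   &= -\Big( x\log\frac{x}{\mu} - x + \mu \Big) + \big( x\log x - x - \log x! \big) \\
                   &= -B_f(x\|\mu) + c(x), \qquad f(x) = x\log x - x .
\end{align*}
```

The term $c(x)$ does not depend on $\mu$, so maximizing the Poisson likelihood over $\mu$ is the same as minimizing the KL-type Bregman divergence between the observation and the expectation parameter.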
Back to Distributional Clustering
Product of Poisson
Distributions
Illustration
a a b a a a b a a a
[Histogram over a, b: .8, .2]
[Histogram over a, b: 60, 40]
Pr
Multinomial Distribution
Back to Distributional Clustering
• Information Bottleneck: Distributional clustering of Poisson distributions
• (Soft) k-means: (Soft) Clustering of Normal distributions
• Distortion
• Input: Observations
• Output: Parameters of Distribution
• IB functional: EM [Elidan & Friedman, before]
Maximum Likelihood Perspective
• Posterior:
• Partition Function:
Weighted β-norm of the Likelihood
• β → ∞, most likely cluster governs
• β → 0, clusters collapse into a single prototype
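The two limits can be seen numerically from a Boltzmann-form posterior. The sketch below assumes the posterior has the form p(t|x) ∝ p(t) exp(-β d(x,t)) and that the trade-off parameter is β; the function name `posterior` and the example numbers are hypothetical.

```python
import numpy as np

def posterior(prior, distortions, beta):
    """Assumed form: p(t|x) proportional to p(t) * exp(-beta * d(x, t))."""
    logits = np.log(prior) - beta * np.asarray(distortions, dtype=float)
    logits -= logits.max()  # numerical stability
    w = np.exp(logits)
    return w / w.sum()

prior = np.array([0.5, 0.3, 0.2])   # illustrative cluster marginal p(t)
d = np.array([2.0, 0.5, 1.0])       # illustrative distortions d(x, t)

hard = posterior(prior, d, beta=1000.0)  # large beta: lowest-distortion cluster wins
flat = posterior(prior, d, beta=0.0)     # beta = 0: posterior equals the prior
```

At β = 0 the posterior no longer depends on x, so every prototype becomes the same weighted average of the inputs, which is the collapse into a single prototype.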
Back to Self Consistent Equations
Summary
• Bregman Information Bottleneck
  Clustering/Compression for many representations and divergences
• Statistical Interpretation
  Clustering of distributions from the exponential family
  EM like formulation
• Current Work:
  Algorithms
  Characterize distortion measures which also yield Boltzmann distributions
  General distortion measures