multivariate information bottleneck
![Page 1: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/1.jpg)
Multivariate Information Bottleneck
Nir Friedman, Ori Mosenzon, Noam Slonim, Naftali Tishby
Hebrew University
![Page 2: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/2.jpg)
Data Analysis
[Figure: population statistics by age (age axis from 5 to 80).]
![Page 3: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/3.jpg)
Information Bottleneck
Can we cluster “age” into clusters that are predictive of education level?
[Figure: education level (None, High school, Some college, Bachelor’s degree, PhD) by age (axis from 17 to 74).]
![Page 4: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/4.jpg)
Information Bottleneck
Can we cluster “age” into clusters that are predictive of education level?
Can we also cluster education attained so that it is predictive of age?
[Figure: education level (None, High school, Some college, Bachelor’s degree, PhD) by age (axis from 17 to 74).]
![Page 5: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/5.jpg)
Our contribution
Generalize Information Bottleneck:
Generic principle for specifying systems of interacting clusters
Characterization of the solution for these specs
General purpose methods for constructing solutions
![Page 6: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/6.jpg)
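Information Bottleneck [Tishby, Pereira & Bialek 99]
Input: joint distribution $P(A,B)$
Soft clustering: $P(T \mid A)$ maps values of A to clusters T and induces $P(T,B)$
[Diagram: A, T, B, with $I(T;A)$ between A and T and $I(T;B)$ between T and B.]
Minimize: $I(T;A) - \beta\, I(T;B)$
Minimizing $I(T;A)$: compression (information is lost about A)
$I(T;B)$: preserved information about B
$\beta$: tradeoff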
![Page 7: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/7.jpg)
Information Bottleneck Reexamined
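G in (diagram over A, B, T): the actual distribution
$P(A,B)\, P(T \mid A)$, where $P(A,B)$ is the input and $P(T \mid A)$ are the parameters
G out (diagram over A, B, T): the desired independencies
$\mathrm{Ind}(A; B \mid T)$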
![Page 8: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/8.jpg)
Example: Symmetric Bottleneck
Simultaneous clustering of both A and B, with parameters $P(T_A \mid A)$ and $P(T_B \mid B)$
[Diagrams: G in and G out over the nodes A, B, $T_A$, $T_B$.]
So that $T_A$ captures the information A contains about B
and $T_B$ captures the information B contains about A
![Page 9: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/9.jpg)
General Principle
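Input: $P(X_1,\dots,X_n)$
G in - Compression: $T_j$ clusters the values of its parents $\mathbf{pa}_j$
G out - Desired (conditional) independencies
Goal: Find $P(T_j \mid \mathbf{pa}_j)$ in G in to “match” G out
[Diagram: variables $X_1, X_2, \dots, X_n$ and cluster variables $T_1, \dots, T_k$.]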
![Page 10: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/10.jpg)
Multi-information
Information that random variables jointly contain about each other
Generalizes mutual information
$$I(X_1,\dots,X_n) = E_P\left[\log \frac{P(X_1,\dots,X_n)}{P(X_1)\cdots P(X_n)}\right]$$
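The definition above can be evaluated directly on a small joint probability table. The following is a minimal sketch, assuming the joint distribution is stored as an n-dimensional NumPy array with one axis per variable; the function name `multi_information` and the toy distributions are illustrative and not taken from the slides.

```python
import numpy as np

def multi_information(joint):
    """Multi-information E_P[log P(X1..Xn) / (P(X1)...P(Xn))], in nats."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()          # normalize to a probability table
    n = joint.ndim
    # Build the product of the single-variable marginals by broadcasting.
    product = np.ones_like(joint)
    for axis in range(n):
        other_axes = tuple(k for k in range(n) if k != axis)
        product = product * joint.sum(axis=other_axes, keepdims=True)
    mask = joint > 0                     # 0 * log(0) is treated as 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / product[mask])))

# Two independent fair bits carry no information about each other.
independent = np.full((2, 2), 0.25)

# Three perfect copies of one fair bit.
copies = np.zeros((2, 2, 2))
copies[0, 0, 0] = copies[1, 1, 1] = 0.5

mi_independent = multi_information(independent)
mi_copies = multi_information(copies)
```

For two variables the function reduces to ordinary mutual information, which is the sense in which multi-information generalizes it.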
![Page 11: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/11.jpg)
Graph Projection
Let G be a DAG
Define:
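$$KL(P \,\|\, G) = \min_{Q \models G} KL(P \,\|\, Q)$$
[Diagram: P is a point in the space of all possible distributions; it is projected onto the subset of distributions consistent with G.]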
![Page 12: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/12.jpg)
Graph Projection
Let G be a DAG
Define:
$$KL(P \,\|\, G) = \min_{Q \models G} KL(P \,\|\, Q)$$
Proposition:
$$KL(P \,\|\, G) = I(X_1,\dots,X_n) - I^G$$
$I(X_1,\dots,X_n)$: real multi-info
$I^G$: multi-info as though P is consistent with G
![Page 13: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/13.jpg)
Multi-information & Bayesian Networks
Proposition:
If P is consistent with G, i.e.
$$P(X_1,\dots,X_n) = \prod_i P(X_i \mid \mathbf{pa}_i)$$
Then
$$I(X_1,\dots,X_n) = \sum_i I(X_i; \mathbf{pa}_i)$$
Define
$$I^G = \sum_i I(X_i; \mathbf{pa}_i)$$
Sum of local interactions
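The link between the graph projection and the sum of local interactions can be checked numerically. The sketch below assumes a three-variable chain DAG X1 → X2 → X3 and uses the standard result that the closest distribution consistent with G is the product of P's own conditionals, Q = P(x1) P(x2 | x1) P(x3 | x2). The random table and all variable names are illustrative.

```python
import numpy as np

def mutual_information(p_xy):
    """I(X;Y) in nats for a strictly positive 2-D joint table."""
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    return float(np.sum(p_xy * np.log(p_xy / (p_x * p_y))))

# A random, strictly positive joint distribution P(X1, X2, X3).
rng = np.random.default_rng(0)
p = rng.random((2, 3, 2)) + 0.1
p /= p.sum()

# Marginals needed for the chain X1 -> X2 -> X3.
p1 = p.sum(axis=(1, 2))
p2 = p.sum(axis=(0, 2))
p3 = p.sum(axis=(0, 1))
p12 = p.sum(axis=2)
p23 = p.sum(axis=0)

# Real multi-information of P.
independent = p1[:, None, None] * p2[None, :, None] * p3[None, None, :]
multi_info = float(np.sum(p * np.log(p / independent)))

# I^G: sum of local interactions I(X_i; pa_i); X1 has no parents.
info_g = mutual_information(p12) + mutual_information(p23)

# Projection of P onto G: P(x1, x2) * P(x3 | x2).
q = p12[:, :, None] * p23[None, :, :] / p2[None, :, None]
kl_to_graph = float(np.sum(p * np.log(p / q)))
```

Here `kl_to_graph` and `multi_info - info_g` agree up to floating-point error, and both vanish when P already factorizes along the chain.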
![Page 14: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/14.jpg)
Optimizing Criteria
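Two goals:
Lose info wrt G in
Attain conditional independencies in G out
Optimization objective:
$$L = I^{G_{in}} + \gamma\, KL(P \,\|\, G_{out})$$
where $\gamma$ sets the tradeoff between the two goals
$I^{G_{in}}$: force clusters to compress
$KL(P \,\|\, G_{out})$: minimize violations of conditional indep. in G out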
![Page 15: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/15.jpg)
Additional Interpretation
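Using properties of $KL(P \,\|\, G)$ we can rewrite
$$L = I^{G_{in}} + \gamma\, KL(P \,\|\, G_{out}) = I^{G_{in}} + \gamma\,(I^{G_{in}} - I^{G_{out}})$$
(P is consistent with G in, so its multi-information equals $I^{G_{in}}$)
Thus, we can instead minimize
$$L = I^{G_{in}} - \beta\, I^{G_{out}}$$
with $\beta = \gamma / (1 + \gamma)$, obtained by dividing by $1 + \gamma$
$I^{G_{in}}$: minimize information in G in
$I^{G_{out}}$: maximize information in G out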
![Page 16: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/16.jpg)
Minimization Objective - Example
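Symmetric Bottleneck
$$L = I(T_A;A) + I(T_B;B) - \beta\, I(T_A;T_B)$$
($\beta$: tradeoff parameter)
[Diagrams: G in and G out over the nodes A, B, $T_A$, $T_B$.]
Recall
$$P(T_A,T_B) = \sum_{A,B} P(T_A \mid A)\, P(T_B \mid B)\, P(A,B)$$
$P(A,B)$: input (fixed)
$P(T_A \mid A)$, $P(T_B \mid B)$: parameters we can control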
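The symmetric bottleneck objective, I(T_A;A) + I(T_B;B) minus a tradeoff weight times I(T_A;T_B), can be evaluated for any choice of the two soft clusterings. The following is a minimal sketch: the function names, the array layout (rows of `q_a` are P(T_A | a), rows of `q_b` are P(T_B | b)), the tradeoff weight `beta`, and the toy inputs are assumptions made for illustration.

```python
import numpy as np

def mutual_information(p_xy):
    """I(X;Y) in nats; zero-probability cells contribute nothing."""
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x * p_y)[mask])))

def symmetric_objective(p_ab, q_a, q_b, beta):
    """L = I(T_A;A) + I(T_B;B) - beta * I(T_A;T_B)."""
    p_a = p_ab.sum(axis=1)
    p_b = p_ab.sum(axis=0)
    p_a_ta = p_a[:, None] * q_a     # joint P(A, T_A)
    p_b_tb = p_b[:, None] * q_b     # joint P(B, T_B)
    p_ta_tb = q_a.T @ p_ab @ q_b    # P(T_A, T_B), summed over A and B
    return (mutual_information(p_a_ta) + mutual_information(p_b_tb)
            - beta * mutual_information(p_ta_tb))

# Toy input: A and B are perfectly correlated fair bits.
p_ab = np.array([[0.5, 0.0],
                 [0.0, 0.5]])

# No compression: each value is its own cluster.
identity = np.eye(2)
loss_identity = symmetric_objective(p_ab, identity, identity, beta=1.0)

# Full compression: everything goes to a single cluster.
single = np.ones((2, 1))
loss_single = symmetric_objective(p_ab, single, single, beta=1.0)
```

With a weight of 1 the single-cluster solution (loss 0) beats the identity clustering (loss log 2); a larger weight rewards preserving I(T_A;T_B) and reverses the ordering.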
![Page 17: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/17.jpg)
Characterization of Solutions
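Thm: Minimal point if and only if
$$P(t_j \mid \mathbf{pa}_j) = \frac{P(t_j)}{Z(\mathbf{pa}_j, \beta)} \exp\{-\beta\, d(t_j, \mathbf{pa}_j)\}$$
$d(t_j, \mathbf{pa}_j)$ - measure of “distortion” between $t_j$ and $\mathbf{pa}_j$
($\beta$: tradeoff parameter; $Z$: normalization)
For example, in the symmetric bottleneck:
$$d(t_A, a) = KL\big(P(T_B \mid a) \,\|\, P(T_B \mid t_A)\big)$$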
![Page 18: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/18.jpg)
Finding Solutions
How can we find solutions?
Asynchronous update:
Pick an index j
Update $P(T_j \mid \mathbf{pa}_j)$ using
$$P(t_j \mid \mathbf{pa}_j) = \frac{P(t_j)}{Z(\mathbf{pa}_j, \beta)} \exp\{-\beta\, d(t_j, \mathbf{pa}_j)\}$$
Theorem: Asynchronous updates converge to (local) minima
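The asynchronous scheme can be sketched for the symmetric bottleneck, using the distortion d(t_A, a) = KL(P(T_B | a) || P(T_B | t_A)) from the characterization slide. This is an illustrative sketch, not the authors' implementation: the function names, the random initialization, the fixed number of sweeps, the small constant `EPS`, and the assumption that the update for T_B mirrors the one for T_A with the roles of A and B swapped are all choices made here.

```python
import numpy as np

EPS = 1e-12  # guards logarithms and divisions against exact zeros

def update_cluster(p_xy, q_x, q_y, beta):
    """One update of P(T_X | X), holding P(T_Y | Y) fixed."""
    p_x = p_xy.sum(axis=1)                          # P(x)
    p_tx = p_x @ q_x                                # P(t_X)
    p_ty_given_x = (p_xy / p_x[:, None]) @ q_y      # P(T_Y | x)
    p_tx_ty = q_x.T @ p_xy @ q_y                    # P(t_X, t_Y)
    p_ty_given_tx = p_tx_ty / (p_tx[:, None] + EPS) # P(T_Y | t_X)
    # d[x, t_X] = KL(P(T_Y | x) || P(T_Y | t_X))
    log_ratio = (np.log(p_ty_given_x[:, None, :] + EPS)
                 - np.log(p_ty_given_tx[None, :, :] + EPS))
    d = np.sum(p_ty_given_x[:, None, :] * log_ratio, axis=2)
    # P(t | x) proportional to P(t) * exp(-beta * d), computed in log space.
    logits = np.log(p_tx + EPS)[None, :] - beta * d
    logits -= logits.max(axis=1, keepdims=True)
    new_q = np.exp(logits)
    return new_q / new_q.sum(axis=1, keepdims=True)

def symmetric_bottleneck(p_ab, k_a, k_b, beta, sweeps=50, seed=0):
    """Alternate updates of P(T_A | A) and P(T_B | B)."""
    rng = np.random.default_rng(seed)
    q_a = rng.dirichlet(np.ones(k_a), size=p_ab.shape[0])
    q_b = rng.dirichlet(np.ones(k_b), size=p_ab.shape[1])
    for _ in range(sweeps):
        q_a = update_cluster(p_ab, q_a, q_b, beta)    # j = A
        q_b = update_cluster(p_ab.T, q_b, q_a, beta)  # j = B, roles swapped
    return q_a, q_b

# Toy joint distribution with two blocks of co-occurring values.
p_ab = np.array([[0.20, 0.20, 0.01, 0.01],
                 [0.20, 0.20, 0.01, 0.01],
                 [0.01, 0.01, 0.03, 0.03],
                 [0.01, 0.01, 0.03, 0.03]])
p_ab = p_ab / p_ab.sum()
q_a, q_b = symmetric_bottleneck(p_ab, k_a=2, k_b=2, beta=5.0)
```

Each row of `q_a` and `q_b` is a soft cluster assignment. The result depends on the initialization, which is consistent with the slide's statement that convergence is to local minima.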
![Page 19: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/19.jpg)
Example - 20 Newsgroups
20,000 messages from 20 newsgroups [Lang 1995]
A - newsgroup of the message
B - word in the message
P(a,b) - probability that choosing a random position in the corpus would select word b in a message in newsgroup a
We applied the symmetric bottleneck to both attributes
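The definition of P(a,b) amounts to normalizing a table of word counts by the total number of word positions in the corpus. A minimal sketch, using a hypothetical toy count matrix rather than the actual 20 Newsgroups data:

```python
import numpy as np

# Hypothetical toy counts: rows are newsgroups a, columns are words b.
# counts[a, b] is how often word b occurs in messages of newsgroup a.
counts = np.array([[4, 0, 1],
                   [1, 3, 1]], dtype=float)

# Picking a random position in the corpus selects the pair (a, b)
# with probability proportional to its count.
p_ab = counts / counts.sum()

p_a = p_ab.sum(axis=1)  # share of corpus positions in each newsgroup
p_b = p_ab.sum(axis=0)  # overall frequency of each word
```

Under this definition, newsgroups with longer messages carry more weight in P(a), since positions are sampled rather than messages.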
![Page 20: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/20.jpg)
20 Newsgroup: Symmetric Bottleneck
[Figure: matrix with axes Newsgroup and Word.]
![Page 21: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/21.jpg)
20 Newsgroup: Symmetric Bottleneck
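[Figure: $P(T_D, T_W)$ over newsgroup clusters and word clusters.]
Newsgroup clusters:
alt.atheism, rec.autos, rec.motorcycles, rec.sport.*, sci.med, sci.space, soc.religion.christian, talk.politics.*
comp.*, misc.forsale, sci.crypt, sci.electronics
Word clusters:
car, turkish, game, team, jesus, gun, hockey, …
x, file, image, encryption, window, dos, mac, …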
![Page 22: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/22.jpg)
20 Newsgroup: Symmetric Bottleneck
[Figure: $P(T_D, T_W)$ over newsgroup clusters and word clusters.]
![Page 23: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/23.jpg)
20 Newsgroup: Symmetric Bottleneck
[Figure: $P(T_D, T_W)$ over newsgroup clusters and word clusters.]
![Page 24: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/24.jpg)
20 Newsgroup: Symmetric Bottleneck
[Figure: $P(T_D, T_W)$ over newsgroup clusters and word clusters.]
![Page 25: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/25.jpg)
20 Newsgroup: Symmetric Bottleneck
[Figure: $P(T_D, T_W)$ over newsgroup clusters and word clusters.]
Newsgroups: alt.atheism, soc.religion.christian, talk.religion.misc
Words: atheists, christianity, jesus, bible, sin, faith, …
![Page 26: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/26.jpg)
Discussion
General framework:
Defines a new family of optimization problems … and solutions
Future directions:
Additional algorithms - agglomerative solutions
Relation to generative models
Parametric constraints in G out
![Page 27: Multivariate Information Bottleneck](https://reader035.vdocuments.us/reader035/viewer/2022070406/568141e7550346895dadc750/html5/thumbnails/27.jpg)
Example: Parallel Bottleneck
[Diagrams: G in and G out over the nodes A, B, $T_1$, $T_2$.]
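The objective L is built from the terms $I(T_1;A)$, $I(T_1;B)$, $I(T_1;T_2)$ and $I(T_1,T_2;B)$
The distortion $d(t_A, a)$ is built from the terms $KL\big(P(B \mid a, T_B) \,\|\, P(B \mid t_A, T_B)\big)$ and $KL\big(P(T_B \mid a) \,\|\, P(T_B \mid t_A)\big)$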