
Symmetry Breaking Bifurcations of the Information Distortion

Dissertation Defense, April 8, 2003

Albert E. Parker III

Complex Biological Systems Department of Mathematical Sciences

Center for Computational Biology

Montana State University

Goal: Solve the Information Distortion Problem

The goal of my thesis is to solve the Information Distortion problem, an optimization problem of the form

$\max_{q \in \Delta} G(q)$ constrained by $D(q) \le D_0$

where

• $\Delta$ is a subset of $\mathbb{R}^n$.
• G and D are sufficiently smooth in $\Delta$.
• G and D have symmetry: they are invariant to some group action.

Problems of this form arise in the study of clustering problems or optimal source coding systems.

Goal: Another Formulation

Using the method of Lagrange multipliers, the goal of finding solutions of the optimization problem can be rephrased as finding stationary points of the problem

$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$

where $\beta \in [0,\infty)$.

• $\Delta$ is a subset of $\mathbb{R}^{NK}$.
• G and D are sufficiently smooth in $\Delta$.
• G and D have symmetry: they are invariant to some group action.

How: Determine the Bifurcation Structure

We have described the bifurcation structure of stationary points to any problem of the form

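$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$

where $\beta \in [0,\infty)$.

• $\Delta$ is a linear subset of $\mathbb{R}^{NK}$.
• G and D are sufficiently smooth in $\Delta$.
• G and D have symmetry: they are invariant to some group action.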

Thesis Topics

• The Data Clustering Problem
• The Neural Coding Problem
• Information Theory / Probability Theory
• Optimization Theory
• Dynamical Systems
• Bifurcation Theory with Symmetries
• Group Theory
• Continuation Techniques

Outline of this talk

• The Data Clustering Problem
• A Class of Optimization Problems
• Bifurcation with Symmetries
• Numerical Results

The Data Clustering Problem

• Data Classification: identifying all of the books printed in 2002 which address the martial art Kempo

• Data Compression: converting a bitmap file to a jpeg file

[Figure: a clustering q(YN|Y) maps the K objects {yi} of Y to the N objects {yNi} of YN.]

A Symmetry: invariance to relabelling of the clusters of YN

[Figure, two panels: the same clustering q(YN|Y) of the K objects {yi} of Y into the N objects {yNi} of YN, first with the clusters labelled class 1 and class 2, then with the two labels swapped.]

Requirements of a Clustering Method

• The original data is represented reasonably well by the clusters

– Choosing a cost function, D(Y,YN) , called a distortion function, rigorously defines what we mean by the “data is represented reasonably well”.

• Fast implementation

Examples: optimizing at a distortion level $D(Y,Y_N) \le D_0$

• Deterministic Annealing (Rose 1998): A Fast Clustering Algorithm

$\max_{q \in \Delta} H(Y_N|Y)$ constrained by $D(Y,Y_N) \le D_0$

• Rate Distortion Theory (Shannon ~1950): Minimum Informative Compression

$\min_{q \in \Delta} I(Y;Y_N)$ constrained by $D(Y,Y_N) \le D_0$

where $\Delta := \{ q(y_N|y) \mid \sum_{y_N \in Y_N} q(y_N|y) = 1, \ y \in Y \} \subset \mathbb{R}^{NK}$

Inputs and Outputs and Clustered Outputs

• The Information Distortion method clusters the outputs Y into clusters YN so that the information that one can learn about X by observing YN, I(X;YN), is as close as possible to the mutual information I(X;Y).
• The corresponding information distortion function is

$D_I(Y;Y_N) = I(X;Y) - I(X;Y_N)$

[Figure: the inputs X (L objects {xi}) and the outputs Y (K objects {yi}) are related by p(X,Y); the clustering q(YN|Y) maps the outputs to the clusters YN (N objects {yNi}).]
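As a rough illustration of the definition above, the following Python sketch computes $D_I = I(X;Y) - I(X;Y_N)$ for a small joint distribution. The toy table `p_xy`, the two clusterings, the helper names, and the choice of base-2 logarithms are assumptions made for this example and are not taken from the talk.

```python
import math

def mutual_information(joint):
    """I(A;B) in bits for a joint distribution stored as a list of rows."""
    row = [sum(r) for r in joint]
    col = [sum(c) for c in zip(*joint)]
    total = 0.0
    for i, r in enumerate(joint):
        for j, p in enumerate(r):
            if p > 0:
                total += p * math.log2(p / (row[i] * col[j]))
    return total

def cluster_joint(p_xy, q):
    """p(x, yN) = sum_y p(x, y) q(yN | y), where q[y][n] stores q(yN = n | y)."""
    n_classes = len(q[0])
    return [[sum(p_xy[x][y] * q[y][n] for y in range(len(q)))
             for n in range(n_classes)]
            for x in range(len(p_xy))]

def information_distortion(p_xy, q):
    """D_I = I(X;Y) - I(X;YN) for the clustering q."""
    return mutual_information(p_xy) - mutual_information(cluster_joint(p_xy, q))

# Hypothetical toy data: 2 inputs, 4 outputs; outputs 0,1 and 2,3 behave alike.
p_xy = [[0.20, 0.20, 0.05, 0.05],
        [0.05, 0.05, 0.20, 0.20]]
q_good = [[1, 0], [1, 0], [0, 1], [0, 1]]   # groups the look-alike outputs
q_uniform = [[0.5, 0.5]] * 4                # ignores the data entirely

d_good = information_distortion(p_xy, q_good)
d_uniform = information_distortion(p_xy, q_uniform)
print(d_good, d_uniform)
```

In this toy case `q_good` merges outputs that carry the same information about X, so its distortion vanishes up to rounding, while the uniform clustering loses all of I(X;Y).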

Two optimization problems which use the information distortion function

• Information Distortion Method (Dimitrov and Miller 2001)

$\max_{q \in \Delta} H(Y_N|Y)$ constrained by $D_I(Y,Y_N) \le D_0$

$\max_{q \in \Delta} H(Y_N|Y) + \beta I(X;Y_N)$

• Information Bottleneck Method (Tishby, Pereira, Bialek 1999)

$\min_{q \in \Delta} I(Y;Y_N)$ constrained by $D_I(Y,Y_N) \le D_0$

$\max_{q \in \Delta} -I(Y;Y_N) + \beta I(X;Y_N)$

An annealing algorithm to solve

$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$

Let $q_0$ be the maximizer of $\max_q G(q)$, and let $\beta_0 = 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q G(q) + \beta_k D(q)$. Iterate the following steps until $\beta_K = \beta_{\max}$ for some K.

1. Perform $\beta$-step: Let $\beta_{k+1} = \beta_k + d_k$ where $d_k > 0$.

2. The initial guess for $q_{k+1}$ at $\beta_{k+1}$ is $q_{k+1}^{(0)} = q_k + \eta$ for some small perturbation $\eta$.

3. Optimization: solve $\max_q (G(q) + \beta_{k+1} D(q))$ to get the maximizer $q_{k+1}$, using initial guess $q_{k+1}^{(0)}$.
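The three annealing steps can be sketched in Python for the Information Distortion choice $G = H(Y_N|Y)$, $D = I(X;Y_N)$. The talk does not say which optimizer is used in step 3; this sketch assumes a plain fixed-point iteration on the stationarity condition $q(y_N|y) \propto \exp(\beta \sum_x p(x|y) \ln p(x|y_N))$, which is a swapped-in technique for illustration only. The toy table `p_xy`, the step size, the perturbation size, and all function names are likewise hypothetical.

```python
import math
import random

def fixed_point_step(p_xy, q, beta):
    """One pass of q(n|y) <- softmax_n(beta * sum_x p(x|y) ln p(x|n))."""
    n_x, n_y, n_c = len(p_xy), len(q), len(q[0])
    p_y = [sum(p_xy[x][y] for x in range(n_x)) for y in range(n_y)]
    p_xn = [[sum(p_xy[x][y] * q[y][n] for y in range(n_y)) for n in range(n_c)]
            for x in range(n_x)]
    p_n = [sum(p_xn[x][n] for x in range(n_x)) for n in range(n_c)]
    new_q = []
    for y in range(n_y):
        scores = [beta * sum(p_xy[x][y] / p_y[y] * math.log(p_xn[x][n] / p_n[n])
                             for x in range(n_x))
                  for n in range(n_c)]
        top = max(scores)                       # subtract the max for stability
        weights = [math.exp(s - top) for s in scores]
        z = sum(weights)
        new_q.append([w / z for w in weights])
    return new_q

def anneal(p_xy, n_classes, beta_max, d_beta=0.5, eps=1e-3,
           inner_iters=200, seed=0):
    rng = random.Random(seed)
    n_y = len(p_xy[0])
    # q_0: the uniform clustering maximizes H(YN|Y) at beta_0 = 0.
    q = [[1.0 / n_classes] * n_classes for _ in range(n_y)]
    beta = 0.0
    path = [(beta, q)]
    while beta < beta_max:
        beta = min(beta + d_beta, beta_max)             # step 1: beta-step
        guess = []
        for row in q:                                   # step 2: perturb q_k
            pert = [max(v + eps * rng.uniform(-1, 1), 1e-12) for v in row]
            s = sum(pert)
            guess.append([v / s for v in pert])
        q = guess
        for _ in range(inner_iters):                    # step 3: optimize
            q = fixed_point_step(p_xy, q, beta)
        path.append((beta, q))
    return path

# Hypothetical toy data: outputs 0,1 and outputs 2,3 behave alike.
p_xy = [[0.20, 0.20, 0.05, 0.05],
        [0.05, 0.05, 0.20, 0.20]]
path = anneal(p_xy, n_classes=2, beta_max=10.0)
for beta, q in path[::4]:
    print(beta, [round(v, 3) for v in q[0]], [round(v, 3) for v in q[2]])
```

Without the perturbation in step 2 the iteration would sit on the uniform clustering for every $\beta$, because that clustering satisfies the stationarity condition at all $\beta$; the perturbation is what lets the iterate leave it once it stops being a maximizer.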

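Application of the annealing method to the Information Distortion problem $\max_{q \in \Delta} (H(Y_N|Y) + \beta I(X;Y_N))$ when p(X,Y) is defined by four Gaussian blobs

[Figure: the joint distribution p(X,Y) of the inputs X (52 objects) and the outputs Y (52 objects); the clustering q(YN|Y) of the 52 objects of Y into N objects of YN; and the value I(X;YN) = D(q(YN|Y)).]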

Observed Bifurcations for the Four Blob problem:

We just saw the optimal clusterings $q^*$ at some $\beta^* = \beta_{\max}$. What do the clusterings look like for $\beta < \beta_{\max}$?

Bifurcations of $q^*(\beta)$

Observed Bifurcations for the 4 Blob Problem

Conceptual Bifurcation Structure

[Figure: conceptual bifurcation diagram of $q^*$, with the uniform solution $q_{1/N}$ marked.]

??????

Why are there only 3 bifurcations observed? In general, are there only N-1 bifurcations?

What kinds of bifurcations do we expect: pitchfork-like, transcritical, saddle-node, or some other type?

How many bifurcating solutions are there?

What do the bifurcating branches look like? Are they subcritical or supercritical ?

What is the stability of the bifurcating branches? Is there always a bifurcating branch which contains solutions of the optimization problem?

Are there bifurcations after all of the classes have resolved ?

[Figures: Conceptual Bifurcation Structure of $q^*$; Observed Bifurcations for the 4 Blob Problem]

Bifurcations with symmetry

• To better understand the bifurcation structure, we capitalize on the symmetries of the function $G(q) + \beta D(q)$.

• The “obvious” symmetry is that $G(q) + \beta D(q)$ is invariant to relabelling of the N classes of YN.

• The symmetry group of all permutations on N symbols is $S_N$.

[Figure: an example relabelling: switch labels 1 and 3]

Symmetry Breaking Bifurcations

[Figure sequence: the bifurcation diagram of $q^*$, annotated one branch at a time.]

• $q_{1/N}$ is fixed by $S_N = S_4$.
• $q^*$ is fixed by $S_{N-1} = S_3$.
• $q^*$ is fixed by $S_{N-2} = S_2$.
• $q^*$ is fixed by $\langle (N\text{-cycle})^p \rangle = \langle (1324)^2 \rangle = \langle (12)(34) \rangle$.

Existence Theorems for Bifurcating Branches

Given a bifurcation at a point fixed by $S_N$,

• Equivariant Branching Lemma (Vanderbauwhede and Cicogna 1980-1)

• There are N bifurcating branches, each of which has symmetry $S_{N-1}$.

• The Smoller-Wasserman Theorem (Smoller and Wasserman 1985-6)

• There are bifurcating branches which have symmetry $\langle (N\text{-cycle})^p \rangle$ for every prime p|N, p<N.

Existence Theorems for Bifurcating Branches

Given a bifurcation at a point fixed by $S_{N-1}$,

• Equivariant Branching Lemma (Vanderbauwhede and Cicogna 1980-1)

• Gives N-1 bifurcating branches which have symmetry $S_{N-2}$.

• The Smoller-Wasserman Theorem (Smoller and Wasserman 1985-6)

• Gives bifurcating branches which have symmetry $\langle (M\text{-cycle})^p \rangle$, with M = N-1, for every prime p|N-1, p<N-1.

When N = 4, N-1 = 3, there are no bifurcating branches given by the SW Theorem.
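The prime condition in the Smoller-Wasserman statement is easy to check by hand for small cases, and the short Python sketch below does the same bookkeeping. The function name `sw_primes` is hypothetical; it only enumerates the primes p with p|M and p<M and says nothing about the branches themselves.

```python
def sw_primes(m):
    """Primes p with p | m and p < m, as in the Smoller-Wasserman statement."""
    def is_prime(p):
        return p >= 2 and all(p % d for d in range(2, int(p ** 0.5) + 1))
    return [p for p in range(2, m) if m % p == 0 and is_prime(p)]

for m in (4, 3, 6):
    print(m, sw_primes(m))
```

For M = 4 the only such prime is 2, and for M = 3 the list is empty, which matches the remark that the theorem gives no branches when N-1 = 3.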

Bifurcation Structure corresponds with Group Structure

[Figure: a partial subgroup lattice for S4, with S4 at the top, four copies of S3 below it, twelve copies of S2 below those, and the trivial group 1 at the bottom. Each S3 node carries a bifurcating direction of the form (3v, -v, -v, -v, 0)^T and each S2 node a direction of the form (2v, -v, -v, 0, 0)^T, up to reordering of the first four blocks.]

A partial subgroup lattice for S4 and the corresponding bifurcating directions given by the Equivariant Branching Lemma

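[Figure: a partial subgroup lattice for S4 containing S4, A4, three subgroups labelled 12,34; 13,24; and 14,23, and the cyclic subgroup generated by (1324), with bifurcating directions built from the blocks v and -v. The annotations read Fix(A4) = {0} and Fix(<(1234)>) = {0}.]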

A partial subgroup lattice for S4 and the corresponding bifurcating directions given by the Smoller-Wasserman Theorem

Conceptual Bifurcation Structure

[Figure: the conceptual bifurcation diagram of $q^*$ beside the Group Structure: the lattice S4, four copies of S3, twelve copies of S2, and 1.]

The Equivariant Branching Lemma shows that the bifurcation structure from $S_M$ to $S_{M-1}$ is …

The Smoller-Wasserman Theorem shows additional structure …

[Figure: the conceptual bifurcation diagram of $q^*$ beside the Group Structure: the partial lattice with S4, A4, the three subgroups labelled 12,34; 13,24; and 14,23, and the cyclic subgroup generated by (1324).]

The Smoller-Wasserman Theorem shows additional structure … 3 branches from the S4 to S3 bifurcation only.

[Figure: Conceptual Bifurcation Structure of $q^*$ beside the Group Structure]

If we stay on a branch which is fixed by $S_M$, how many bifurcations are there?

Theorem: There are exactly K/N bifurcations on the branch $(q_{1/N}, \beta)$ for the Information Distortion problem.

There are 13 bifurcations on the first branch.

Bifurcation theory in the presence of symmetries enables us to answer the questions previously posed …


??????

Why are there only 3 bifurcations observed? In general, are there only N-1 bifurcations? There are N-1 symmetry breaking bifurcations from $S_M$ to $S_{M-1}$ for $M \le N$.

What kinds of bifurcations do we expect: pitchfork-like, transcritical, saddle-node, or some other type?

How many bifurcating solutions are there? There are at least N from the first bifurcation, at least N-1 from the next one, etc.

What do the bifurcating branches look like? They are subcritical or supercritical depending on the sign of the bifurcation discriminator $\zeta(q^*,\beta^*,u_k)$.

What is the stability of the bifurcating branches? Is there always a bifurcating branch which contains solutions of the optimization problem? No.

Are there bifurcations after all of the classes have resolved? In general, no.

[Figures: Conceptual Bifurcation Structure of $q^*$; Observed Bifurcations for the 4 Blob Problem]

We can explain the bifurcation structure of all problems of the form

$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$

where $\beta \in [0,\infty)$.

• $\Delta$ is a subset of $\mathbb{R}^{NK}$.
• G and D are sufficiently smooth in $\Delta$.
• G and D are invariant to relabelling of the classes of YN.
• The blocks of the Hessian $\Delta_q (G + \beta D)$ at bifurcation satisfy a set of generic conditions.

This class of problems includes the Information Distortion problem.

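$\Delta_{q,\lambda}\mathcal{L}$ is the constrained Hessian; $\Delta_q F$ is the unconstrained Hessian.

[Diagram: the possible cases when $\Delta_{q,\lambda}\mathcal{L}$ is singular]

• $\Delta_q F$ is singular and M > 1:
– $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$ is nonsingular: symmetry breaking bifurcation
– $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$ is singular: impossible scenario
• $\Delta_q F$ is singular and M = 1: non-generic
• $\Delta_q F$ is nonsingular:
– $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$ is singular: saddle-node bifurcation
– $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$ is nonsingular: impossible scenario

Thesis chapter tags on the slide: chapter 6, chapter 6, chapter 8, chapter 4.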

Continuation techniques provide numerical confirmation of the theory

Previously Observed Bifurcation Structure for the Four Blob problem:

Equivariant Branching Lemma: Previous vs. Actual Bifurcation Structure

We used Continuation Techniques and the Theory of Bifurcations with Symmetries on the 4 Blob Problem using the Information Distortion method to get this picture.

[Figure, two panels of $q^*$: "Previous results" and "Actual structure", with a legend marking singularity of F and singularity of L.]

Smoller-Wasserman Theorem: there are bifurcating branches with symmetry $\langle (1324)^2 \rangle = \langle (12)(34) \rangle$
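The identity between the two generators can be checked by composing the 4-cycle (1324) with itself. The Python sketch below does this with permutations stored as dictionaries; the helper names are hypothetical and the code is only a sanity check of the cycle arithmetic.

```python
def cycle_to_perm(cycle, n):
    """Permutation of {1..n} defined by a single cycle, as a dict i -> image."""
    perm = {i: i for i in range(1, n + 1)}
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        perm[a] = b
    return perm

def compose(f, g):
    """(f o g)(i) = f(g(i))."""
    return {i: f[g[i]] for i in g}

def cycles_of(perm):
    """Nontrivial cycles of a permutation, each starting at its smallest element."""
    seen, out = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cyc, i = [], start
        while i not in seen:
            seen.add(i)
            cyc.append(i)
            i = perm[i]
        if len(cyc) > 1:
            out.append(tuple(cyc))
    return out

sigma = cycle_to_perm([1, 3, 2, 4], 4)   # the 4-cycle (1324)
sigma_sq = compose(sigma, sigma)
print(cycles_of(sigma_sq))
```

Since the square has order 2, the subgroup it generates contains only the identity and (12)(34).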

A closer look …

[Figure: bifurcation diagram of $q^*$]

Bifurcation from S4 to S3 …

The bifurcation from S4 to S3 is subcritical …

(the theory predicted this since the bifurcation discriminator $\zeta(q_{1/4},\beta^*,u) < 0$)

Bifurcation from S3 to S2 …

The bifurcation from S3 to S2 is subcritical …

Bifurcation from S2 to S1 …

The bifurcation from S2 to S1 …

What are these branches ???

Conclusions

Theorem: In general, either symmetry breaking bifurcations or saddle-node bifurcations can occur.

Outline of proof: The Equivariant Branching Lemma, Smoller-Wasserman Theorem, and the following singularity structure when $\Delta_{q,\lambda}\mathcal{L}$ is singular:

• $\Delta_q F$ is singular and M > 1:
– $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$ is non-singular: symmetry breaking bifurcation
– $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$ is singular: impossible scenario
• $\Delta_q F$ is singular and M = 1: non-generic
• $\Delta_q F$ is non-singular:
– $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$ is singular: saddle-node bifurcation
– $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$ is non-singular: impossible scenario

Conclusions

Theorem: All symmetry breaking bifurcations from $S_M$ to $S_{M-1}$ are pitchfork-like, and there exist M bifurcating branches, for which we have explicit directions.

Conclusions

Theorem: The bifurcation discriminator of the pitchfork-like branch $(q^*,\lambda^*,\beta^*) + (t u_k, 0, \beta(t))$ is

[Equation: the explicit formula for $\zeta(q^*,\beta^*,u_k)$]

If $\zeta(q^*,\beta^*,u_k) < 0$, then the branch is subcritical. If $\zeta(q^*,\beta^*,u_k) > 0$, then the branch is supercritical.

Conclusions

Theorem: Solutions of the optimization problem do not always persist from bifurcation.

Theorem: In general, bifurcations do not occur after all of the classes have resolved.

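A numerical algorithm to solve $\max_{q \in \Delta}(G(q) + \beta D(q))$

Let $q_0$ be the maximizer of $\max_q G(q)$, $\beta_0 = 1$ and $\Delta s > 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q G(q) + \beta_k D(q)$. Iterate the following steps until $\beta_K = \beta_{\max}$ for some K.

1. Perform $\beta$-step: solve

$\Delta_{q,\lambda}\mathcal{L}(q_k,\lambda_k,\beta_k) \begin{pmatrix} \partial_\beta q_k \\ \partial_\beta \lambda_k \end{pmatrix} = -\partial_\beta \nabla_{q,\lambda}\mathcal{L}(q_k,\lambda_k,\beta_k)$

for $(\partial_\beta q_k, \partial_\beta \lambda_k)$ and select $\beta_{k+1} = \beta_k + d_k$ where $d_k = (\Delta s \, \mathrm{sgn}(\cos\theta)) / (\|\partial_\beta q_k\|^2 + \|\partial_\beta \lambda_k\|^2 + 1)^{1/2}$.

2. The initial guess for $(q_{k+1},\lambda_{k+1})$ at $\beta_{k+1}$ is $(q_{k+1}^{(0)},\lambda_{k+1}^{(0)}) = (q_k,\lambda_k) + d_k (\partial_\beta q_k, \partial_\beta \lambda_k)$.

3. Optimization: solve $\max_q (G(q) + \beta_{k+1} D(q))$ using pseudoarclength continuation to get the maximizer $q_{k+1}$ and the vector of Lagrange multipliers $\lambda_{k+1}$, using initial guess $(q_{k+1}^{(0)},\lambda_{k+1}^{(0)})$.

4. Check for bifurcation: compare the sign of the determinant of an identical block of each of $\Delta_q [G(q_k) + \beta_k D(q_k)]$ and $\Delta_q [G(q_{k+1}) + \beta_{k+1} D(q_{k+1})]$. If a bifurcation is detected, then set $q_{k+1}^{(0)} = q_k + d_k u$ where u is a bifurcating direction and repeat step 3.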
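Two pieces of this algorithm are simple enough to sketch directly: the step length $d_k$ from step 1 and the determinant sign test from step 4. The Python below is a minimal illustration under stated assumptions: the function names are hypothetical, the Hessian blocks are passed in as small dense lists, and sgn(0) is treated as +1, which the slide does not specify.

```python
import math

def beta_step(ds, cos_theta, dq, dlam):
    """d_k = ds * sgn(cos theta) / sqrt(|dq|^2 + |dlam|^2 + 1)."""
    norm = math.sqrt(sum(v * v for v in dq) + sum(v * v for v in dlam) + 1.0)
    sign = 1.0 if cos_theta >= 0 else -1.0   # assumption: sgn(0) taken as +1
    return ds * sign / norm

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, d = len(a), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[piv][c] == 0.0:
            return 0.0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d                            # a row swap flips the sign
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d

def bifurcation_detected(block_prev, block_next):
    """True when the determinant of the tracked block changes sign."""
    return det(block_prev) * det(block_next) < 0

# Hypothetical 2x2 blocks before and after a step in beta.
before = [[-1.0, 0.0], [0.0, -0.5]]
after = [[-1.0, 0.0], [0.0, 0.25]]
print(beta_step(0.1, 1.0, [3.0], [4.0]), bifurcation_detected(before, after))
```

A sign test of this kind only notices an odd number of eigenvalues crossing zero between the two points, so it is a detector for the single-nullvector situation rather than a general singularity test.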

Details …

• The Dynamical System

• Types of Singularities

• Continuation Techniques

• The Explicit Group of Symmetries

• Explicit Existence Theorems for bifurcating branches

A Class of Problems

$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$

• G and D are sufficiently smooth in $\Delta$.

• G and D must be invariant under relabelling of the classes.

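The Dynamical System

Goal: To determine the bifurcation structure of solutions to $\max_{q \in \Delta} (G(q) + \beta D(q))$ for $\beta \in [0,\infty)$.

Method: Study the equilibria of the flow

$\begin{pmatrix} \dot q \\ \dot \lambda \end{pmatrix} = \nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta), \qquad \mathcal{L}(q,\lambda,\beta) := G(q) + \beta D(q) + \sum_{y \in Y} \lambda_y \left( \sum_z q(z|y) - 1 \right)$

where $\nabla_{q,\lambda}\mathcal{L} : \mathbb{R}^{n+K} \times \mathbb{R} \to \mathbb{R}^{n+K}$.

• The Jacobian wrt q of the K constraints $\{\sum_{Y_N} q(Y_N|y) - 1\}$ is $J = (I_K \ I_K \ \dots \ I_K)$.

• If $w^T \Delta_q F(q^*,\beta) w < 0$ for every $w \in \ker J$, then $q^*(\beta)$ is a maximizer of the optimization problem.

• The first equilibrium is $q^*(\beta_0 = 0) \equiv 1/N$.

• In our dynamical system $(\dot q, \dot \lambda) = \nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)$, the Hessian

$\Delta_{q,\lambda}\mathcal{L}(q,\lambda,\beta) = \begin{pmatrix} \Delta_q F(q,\beta) & J^T \\ J & 0 \end{pmatrix}$

determines the stability of equilibria and the location of bifurcation.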
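The block form of the constrained Hessian follows from writing the K constraints as $Jq - \mathbf{1}_K = 0$ and differentiating twice. The short derivation below spells this out; it assumes q is ordered as N consecutive blocks of length K, one per class of $Y_N$, writes $\Delta$ for the Hessian, and uses $\mathbf{1}_K$ for the vector of ones, a symbol introduced only for this note.

```latex
\begin{align*}
\mathcal{L}(q,\lambda,\beta) &= F(q,\beta) + \lambda^{T}\left(Jq - \mathbf{1}_K\right),
  \qquad J = \begin{pmatrix} I_K & I_K & \cdots & I_K \end{pmatrix}, \\
\nabla_{q}\mathcal{L} &= \nabla_{q}F(q,\beta) + J^{T}\lambda,
  \qquad \nabla_{\lambda}\mathcal{L} = Jq - \mathbf{1}_K, \\
\Delta_{q,\lambda}\mathcal{L} &=
  \begin{pmatrix}
    \Delta_{q}F(q,\beta) & J^{T} \\
    J & 0
  \end{pmatrix}.
\end{align*}
```

The constraints are linear in q, so they contribute nothing to the upper-left block, and the lower-right block vanishes because $\mathcal{L}$ is linear in $\lambda$.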

Properties of the Dynamical System

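$\Delta_{q,\lambda}\mathcal{L}$ is the constrained Hessian; $\Delta_q F$ is the unconstrained Hessian.

[Diagram: the possible cases when $\Delta_{q,\lambda}\mathcal{L}$ is singular]

• $\Delta_q F$ is singular and M > 1:
– $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$ is nonsingular: symmetry breaking bifurcation
– $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$ is singular: impossible scenario
• $\Delta_q F$ is singular and M = 1: non-generic
• $\Delta_q F$ is nonsingular:
– $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$ is singular: saddle-node bifurcation
– $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$ is nonsingular: impossible scenario

Thesis chapter tags on the slide: chapter 6, chapter 6, chapter 8, chapter 4.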

The Dynamical System

How:

Use numerical continuation in a constrained system to choose the step in $\beta$ and to choose an initial guess to find the equilibria $q^*(\beta)$.

Use bifurcation theory with symmetries to understand bifurcations of the equilibria.

Investigating the Dynamical System

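Continuation

• A local maximum $q_k^*(\beta_k)$ of the optimization problem is an equilibrium of the gradient flow.

• Initial condition $q_{k+1}^{(0)}(\beta_{k+1}^{(0)})$ is sought in tangent direction $\partial_\beta q_k$, which is found by solving the matrix system

$\Delta_{q,\lambda}\mathcal{L}(q_k,\lambda_k,\beta_k) \begin{pmatrix} \partial_\beta q_k \\ \partial_\beta \lambda_k \end{pmatrix} = -\partial_\beta \nabla_{q,\lambda}\mathcal{L}(q_k,\lambda_k,\beta_k)$

• The continuation algorithm used to find $q_{k+1}^*(\beta_{k+1})$ is based on Newton’s method.

• Parameter continuation follows the dashed (---) path, pseudoarclength continuation follows the dotted (…) path.

[Figure: a solution curve in $(q,\lambda,\beta)$ showing the current point $(q_k,\lambda_k,\beta_k)$, the tangent direction, the initial guess $(q_{k+1}^{(0)},\lambda_{k+1}^{(0)},\beta_{k+1}^{(0)})$, and the next point $(q_{k+1},\lambda_{k+1},\beta_{k+1})$ reached along the dashed and dotted paths.]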
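The difference between the two continuation strategies shows up at a fold, where stepping in the parameter alone runs out of solutions. The Python sketch below follows a toy one-dimensional equilibrium curve $\beta = x^2$ through its fold using a tangent predictor and a Newton corrector on the augmented system. The scalar problem, the step length, and all names are assumptions for illustration; the talk's system lives in $(q,\lambda,\beta)$ and is not reproduced here.

```python
import math

def f(x, beta):
    """Toy equilibrium equation with a fold at (0, 0): solutions satisfy beta = x^2."""
    return beta - x * x

def grad_f(x, beta):
    """(df/dx, df/dbeta)."""
    return (-2.0 * x, 1.0)

def tangent(x, beta, prev):
    """Unit tangent to f = 0, oriented to agree with the previous tangent."""
    fx, fb = grad_f(x, beta)
    tx, tb = -fb, fx                       # orthogonal to the gradient
    norm = math.hypot(tx, tb)
    tx, tb = tx / norm, tb / norm
    if tx * prev[0] + tb * prev[1] < 0:
        tx, tb = -tx, -tb
    return tx, tb

def pseudo_arclength(x, beta, t0, ds, steps, newton_iters=20, tol=1e-12):
    path = [(x, beta)]
    t = t0
    for _ in range(steps):
        t = tangent(x, beta, t)
        xp, bp = x + ds * t[0], beta + ds * t[1]        # predictor
        xn, bn = xp, bp
        for _ in range(newton_iters):                    # Newton corrector
            r1 = f(xn, bn)                               # stay on the curve
            r2 = t[0] * (xn - xp) + t[1] * (bn - bp)     # stay on the normal line
            if abs(r1) < tol and abs(r2) < tol:
                break
            fx, fb = grad_f(xn, bn)
            det = fx * t[1] - fb * t[0]
            dx = (-r1 * t[1] + fb * r2) / det
            db = (-fx * r2 + t[0] * r1) / det
            xn, bn = xn + dx, bn + db
        x, beta = xn, bn
        path.append((x, beta))
    return path

# Start on the upper branch at (1, 1) and head toward the fold.
path = pseudo_arclength(1.0, 1.0, (-1.0, 0.0), ds=0.1, steps=40)
print(path[0], path[-1])
```

Because the unknowns are the pair (x, beta) rather than x alone, the augmented Jacobian stays invertible at the fold, which is why the path continues onto the lower branch instead of stalling.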

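The Groups

• Let $\mathcal{P}$ be the finite group of n × n “block” permutation matrices which represents the action of $S_N$ on q and $F(q,\beta)$. For example, if N = 3,

$\rho = \begin{pmatrix} 0 & I_K & 0 \\ I_K & 0 & 0 \\ 0 & 0 & I_K \end{pmatrix} \in \mathcal{P}$

permutes q(YN1|y) with q(YN2|y) for every y.

• $F(q,\beta)$ is $\mathcal{P}$-invariant means that for every $\rho \in \mathcal{P}$, $F(\rho q,\beta) = F(q,\beta)$.

• Let $\Gamma$ be the finite group of (n+K) × (n+K) block permutation matrices which represents the action of $S_N$ on $(q,\lambda)$ and $\nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)$:

$\Gamma := \left\{ \begin{pmatrix} \rho & 0_{n \times K} \\ 0_{K \times n} & I_K \end{pmatrix} \ \middle| \ \rho \in \mathcal{P} \right\}$ (the Lagrange multipliers and constraints are fixed!)

• $\nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)$ is $\Gamma$-equivariant means that for every $\gamma \in \Gamma$, $\gamma \nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta) = \nabla_{q,\lambda}\mathcal{L}(\gamma (q,\lambda),\beta)$.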
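The block permutation action and the invariance it induces can be checked numerically on a small case. The Python sketch below builds the matrix that swaps the first two classes for N = 3, applies it to a clustering stored as N blocks of length K, and confirms that $H(Y_N|Y)$ is unchanged. The numbers in `q_blocks` and `p_y` and the helper names are hypothetical, and the natural logarithm is an arbitrary choice.

```python
import math

def block_permutation_matrix(perm, k):
    """Matrix whose block row i is the identity placed in block column perm[i]."""
    size = len(perm) * k
    m = [[0] * size for _ in range(size)]
    for i, j in enumerate(perm):
        for r in range(k):
            m[i * k + r][j * k + r] = 1
    return m

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def flatten(blocks):
    return [x for block in blocks for x in block]

def cond_entropy(q_flat, p_y):
    """H(YN|Y) = -sum_y p(y) sum_n q(n|y) ln q(n|y), q stored block by block."""
    k = len(p_y)
    h = 0.0
    for idx, qy in enumerate(q_flat):
        if qy > 0:
            h -= p_y[idx % k] * qy * math.log(qy)
    return h

# Hypothetical clustering with N = 3 classes and K = 2 outputs;
# block n holds q(yN = n | y) for y = 0, 1, so each column sums to 1.
q_blocks = [[0.7, 0.2], [0.2, 0.3], [0.1, 0.5]]
p_y = [0.4, 0.6]

rho = block_permutation_matrix([1, 0, 2], k=2)   # swap classes 1 and 2
q = flatten(q_blocks)
rho_q = matvec(rho, q)
print(rho_q, cond_entropy(q, p_y), cond_entropy(rho_q, p_y))
```

The entropy only sees the multiset of values q(n|y) attached to each y, so any relabelling of the classes leaves it unchanged, up to floating-point summation order.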

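Notation and Definitions

• The symmetry of $(q,\lambda)$ is measured by its isotropy subgroup

$\Gamma_{(q,\lambda)} = \{ \gamma \in \Gamma \mid \gamma (q,\lambda) = (q,\lambda) \}$

• An isotropy subgroup $\Sigma$ is a maximal isotropy subgroup of $\Gamma$ if there does not exist an isotropy subgroup $\Omega$ of $\Gamma$ such that $\Sigma \subsetneq \Omega \subsetneq \Gamma$.

• At bifurcation $(q^*,\lambda^*,\beta^*)$, the fixed point subspace of $\Gamma_{(q^*,\lambda^*)}$ is

$\mathrm{Fix}(\Gamma_{(q^*,\lambda^*)}) = \{ w \in \ker \Delta_{q,\lambda}\mathcal{L}(q^*,\lambda^*,\beta^*) \mid \gamma w = w \ \text{for every} \ \gamma \in \Gamma_{(q^*,\lambda^*)} \}$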

Equivariant Branching Lemma

One of the Existence Theorems we use to describe a bifurcation in the presence of symmetries is the Equivariant Branching Lemma (Vanderbauwhede and Cicogna 1980-1).

Idea: The bifurcation structure of local solutions is described by the isotropy subgroups $\Sigma$ of $\Gamma$ which have dim Fix($\Sigma$) = 1.

• System: $\dot x = r(x,\beta)$, $r : \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}^m$.
• $r(x,\beta)$ is G-equivariant for some compact Lie Group G.
• $r(0,0) = 0$, $\partial_x r(0,0) = 0$.
• Fix(G) = {0}.
• Let H be an isotropy subgroup of G such that dim Fix(H) = 1.
• Assume $\partial^2_{x\beta} r(0,0) \ne 0$ (crossing condition).

Then there is a unique smooth solution branch $(t x_0, \beta(t))$ to r = 0 such that $x_0 \in$ Fix(H) and the isotropy subgroup of each solution is H.

From bifurcation, the Equivariant Branching Lemma shows that the following solutions emerge:

A stationary point $q^*$ is M-uniform if there exists $1 \le M \le N$ and a K x 1 vector P such that $q(y_{N_i}|Y) = P$ for M and only M of the classes $\{y_{N_i}\}_{i=1}^{N}$ of YN. These M classes of YN are unresolved classes. The classes of YN that are not unresolved are called resolved.

The first equilibrium, $q^* \equiv 1/N$, is N-uniform.

Theorem: $q^*$ is M-uniform if and only if $q^*$ is fixed by $S_M$.
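For small N the link between M-uniformity and the fixing subgroup can be verified by brute force: enumerate all relabellings of the N classes and keep those that leave q unchanged. The Python sketch below does this for a hypothetical 3-uniform clustering with N = 4; the function names are assumptions, and `uniformity` reports the largest M for which q is M-uniform.

```python
from itertools import permutations
from math import factorial

def fixing_permutations(q_blocks):
    """All relabellings p of the classes with q_blocks[p[i]] == q_blocks[i]."""
    n = len(q_blocks)
    return [p for p in permutations(range(n))
            if all(q_blocks[p[i]] == q_blocks[i] for i in range(n))]

def uniformity(q_blocks):
    """Largest number of identical blocks among the N class blocks."""
    return max(sum(1 for b in q_blocks if b == a) for a in q_blocks)

# Hypothetical clusterings with N = 4 classes and K = 2 outputs.
q_uniform = [[0.25, 0.25]] * 4
q_three = [[0.3, 0.2], [0.3, 0.2], [0.3, 0.2], [0.1, 0.4]]

for q in (q_uniform, q_three):
    print(uniformity(q), len(fixing_permutations(q)))
```

The counts are 4! = 24 for the uniform clustering and 3! = 6 for the 3-uniform one, the orders of S4 and S3.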

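Symmetry Breaking from $S_M$ to $S_{M-1}$

Kernel of the Hessian at Symmetry Breaking Bifurcation

Theorem: dim ker $\Delta_q F(q^*,\beta) = M$ with basis vectors $\{v_i\}_{i=1}^{M}$, where

$[v_i]_\nu = \begin{cases} v & \text{if } \nu \text{ is the } i^{th} \text{ unresolved class} \\ 0 & \text{otherwise} \end{cases}$

Theorem: dim ker $\Delta_{q,\lambda}\mathcal{L}(q^*,\lambda,\beta) = M-1$ with basis vectors

$\left\{ \begin{pmatrix} v_i \\ 0 \end{pmatrix} - \begin{pmatrix} v_M \\ 0 \end{pmatrix} \right\}_{i=1}^{M-1}$

Point: Since the bifurcating solutions whose existence is guaranteed by the EBL and the SW Theorem are tangential to ker $\Delta_{q,\lambda}\mathcal{L}(q^*,\lambda,\beta)$, we know the explicit form of the bifurcating directions.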

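Symmetry Breaking Bifurcation from M-uniform solutions

Assumptions:

• Let $q^*$ be M-uniform.
• Call the M identical blocks of $\Delta_q F(q^*,\beta)$: B. Call the other N-M blocks of $\Delta_q F(q^*,\beta)$: $\{R_i\}$. We assume that B has a single nullvector v and that $R_i$ is nonsingular for every i.
• If M<N, then $B \sum_i R_i^{-1} + M I_K$ is nonsingular.

Theorem: Let $(q^*,\lambda^*,\beta^*)$ be a singular point of the flow

$\begin{pmatrix} \dot q \\ \dot \lambda \end{pmatrix} = \nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)$

such that $q^*$ is M-uniform. Then there exist M bifurcating (M-1)-uniform solutions $(q^*,\lambda^*,\beta^*) + (t u_k, 0, \beta(t))$, where

$[u_k]_\nu = \begin{cases} (M-1)v & \text{if } \nu \text{ is the } k^{th} \text{ unresolved class} \\ -v & \text{if } \nu \text{ is any other unresolved class} \\ 0 & \text{otherwise} \end{cases}$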
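The direction $u_k$ can be assembled block by block, and a quick check shows that its class blocks sum to zero, so it respects the constraint Jacobian $J = (I_K \ \dots \ I_K)$. The Python sketch below is illustrative only: the nullvector `v` is an arbitrary made-up vector rather than an actual nullvector of a Hessian block, and the function names are hypothetical.

```python
def bifurcating_direction(v, unresolved, n_classes, k):
    """u_k as N class blocks followed by a zero block for the K multipliers."""
    m = len(unresolved)
    zero = [0.0] * len(v)
    blocks = []
    for c in range(n_classes):
        if c == unresolved[k]:
            blocks.append([(m - 1) * x for x in v])   # k-th unresolved class
        elif c in unresolved:
            blocks.append([-x for x in v])            # other unresolved classes
        else:
            blocks.append(zero[:])                    # resolved classes
    blocks.append(zero[:])                            # Lagrange multiplier block
    return blocks

def constraint_image(blocks):
    """J u for J = (I_K ... I_K): the sum of the N class blocks."""
    return [sum(col) for col in zip(*blocks[:-1])]

v = [1.0, -1.0]                                       # hypothetical nullvector, K = 2
u_from_s4 = bifurcating_direction(v, [0, 1, 2, 3], n_classes=4, k=2)
u_from_s3 = bifurcating_direction(v, [0, 1, 3], n_classes=4, k=1)
print(u_from_s4, constraint_image(u_from_s4))
print(u_from_s3, constraint_image(u_from_s3))
```

With four unresolved classes and k pointing at the third, the result has the pattern (-v, -v, 3v, -v, 0); with classes 1, 2 and 4 unresolved it has the pattern (-v, 2v, 0, -v, 0).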


Some of the bifurcating branches when N = 4 are given by the following isotropy subgroup lattice for S4

[Figure: the partial subgroup lattice S4, four copies of S3, twelve copies of S2, and 1. Each S3 node carries a bifurcating direction of the form (3v, -v, -v, -v, 0)^T and each S2 node a direction of the form (2v, -v, -v, 0, 0)^T, up to reordering of the first four blocks.]

For the 4 Blob problem: The isotropy subgroups and bifurcating directions of the observed bifurcating branches

• isotropy group S4: bif direction (-v,-v,3v,-v,0)^T
• isotropy group S3: bif direction (-v,2v,0,-v,0)^T
• isotropy group S2: bif direction (-v,0,0,v,0)^T
• isotropy group 1: … No more bifs!

Smoller-Wasserman Theorem

The other Existence Theorem:

Smoller-Wasserman Theorem (1985-6)

For variational problems where $r(x,\beta) = \nabla_x f(x,\beta)$, there is a bifurcating solution tangential to Fix(H) for every maximal isotropy subgroup H, not only those with dim Fix(H) = 1.

• dim Fix(H) = 1 implies that H is a maximal isotropy subgroup

The Smoller-Wasserman Theorem shows that (under the same assumptions as before) if M is composite, then there exist bifurcating solutions with isotropy group $\langle \gamma^p \rangle$ for every element $\gamma$ of order M in $\Gamma$ and every prime p|M, p<M. Furthermore,

dim (Fix $\langle \gamma^p \rangle$) = p-1

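Other branches

Bifurcating branches from a 4-uniform solution are given by the following isotropy subgroup lattice for S4

[Figure: a partial subgroup lattice for S4 containing S4, A4, three subgroups labelled 12,34; 13,24; and 14,23, and the cyclic subgroup generated by (1324), with bifurcating directions built from the blocks v and -v. The annotations read Fix(A4) = {0} and Fix(<(1234)>) = {0}.]

Maximal isotropy subgroup for S4

[Figure: S4 at the top; below it four copies of S3, A4, and the three subgroups labelled 12,34; 13,24; and 14,23.]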

Issues: SM

• The full lattice of subgroups of the group SM is not known for arbitrary M.

• The lattice of maximal subgroups of the group SM is not known for arbitrary M.

More about the Bifurcation Structure

Theorem: All symmetry breaking bifurcations from $S_M$ to $S_{M-1}$ are pitchfork-like.

Outline of proof: $\beta'(0) = 0$ since $\partial^2_{xx} r(0,0) = 0$.

Theorem: The bifurcation discriminator of the pitchfork-like branch $(q^*,\lambda^*,\beta^*) + (t u_k, 0, \beta(t))$ is

[Equation: the explicit formula for $\zeta(q^*,\beta^*,u_k)$]

If $\zeta(q^*,\beta^*,u_k) < 0$, then the branch is subcritical. If $\zeta(q^*,\beta^*,u_k) > 0$, then the branch is supercritical.

Theorem: Generically, bifurcations do not occur after all of the classes have resolved.

Theorem: If dim (ker $\Delta_{q,\lambda}\mathcal{L}(q^*,\lambda,\beta)$) = 1, and if a crossing condition is satisfied, then saddle-node bifurcation must occur.

