2012 mdsp pr10 ica
TRANSCRIPT
Course Calendar
Class  Date          Contents
1      Sep. 26       Course information & Course overview
2      Oct. 4        Bayes Estimation
3      Oct. 11       Classical Bayes Estimation - Kalman Filter -
4      Oct. 18       Simulation-based Bayesian Methods
5      Oct. 25       Modern Bayesian Estimation: Particle Filter
6      Nov. 1        HMM (Hidden Markov Model)
-      Nov. 8        No Class
7      Nov. 15       Bayesian Decision
8      Nov. 29       Nonparametric Approaches
9      Dec. 6        PCA (Principal Component Analysis)
10     Dec. 13       ICA (Independent Component Analysis)
11     Dec. 20       Applications of PCA and ICA
12     Dec. 27       Clustering, k-means et al.
13     Jan. 17       Other Topics 1: Kernel Machines
14     Jan. 22 (Tue) Other Topics 2
Lecture Plan
Independent Component Analysis
-1. Whitening by PCA
 1. Introduction: Blind Source Separation (BSS)
 2. Problem Formulation and Independence
 3. Whitening + ICA Approach
 4. Non-Gaussianity Measure

References:
[1] A. Hyvärinen, J. Karhunen, and E. Oja, "Independent Component Analysis", Wiley-Interscience, 2001.
-1. Whitening by PCA (Preparation for the ICA approach)

Whitened := uncorrelated + unit variance*

PCA is a very useful tool for transforming a random vector x into an uncorrelated, whitened vector z:

    z = V x

where x is an n-vector with covariance matrix C_x, V is an n × n matrix, and z is an n-vector whose covariance matrix is the identity, C_z = I. (Fig. 1)

The matrix V is not uniquely defined, so we have a free parameter (in the 2-D case: a rotation parameter). PCA gives one solution to the whitening problem.

* Here we assume all random variables are zero mean.
PCA whitening method:
- Define the covariance matrix

    C_x = E[x x^T]                                                    (1)

- Compute the {eigenvalue, eigenvector} pairs of C_x: {λ_i, e_i}, i = 1, ..., n
- Representation of C_x (*)

    C_x = E Λ E^T,  E = (e_1, e_2, ..., e_n),  Λ = diag(λ_1, λ_2, ..., λ_n)   (2)

- Whitening matrix transformation

    z = V x,  V = E Λ^{-1/2} E^T                                      (3)

  so that C_z = E[z z^T] = I.

* The matrix E is an orthogonal matrix that satisfies E E^T = E^T E = I.
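As a sketch of steps (1)-(3), the following snippet (an illustration with a made-up 2-D example, not from the lecture) whitens correlated samples and checks that the sample covariance of z is the identity:

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated zero-mean data: n samples of a 2-D random vector x
n = 10000
s = rng.uniform(-1, 1, size=(2, n))       # independent components
A = np.array([[2.0, 1.0], [1.0, 3.0]])    # arbitrary mixing (illustrative)
x = A @ s                                 # correlated, zero-mean data

# (1) sample covariance matrix C_x = E[x x^T]
Cx = x @ x.T / n

# (2) eigen-decomposition C_x = E Λ E^T
lam, E = np.linalg.eigh(Cx)

# (3) whitening matrix V = E Λ^{-1/2} E^T
V = E @ np.diag(lam ** -0.5) @ E.T
z = V @ x

# C_z = E[z z^T] should be (numerically) the identity matrix
Cz = z @ z.T / n
print(np.round(Cz, 3))
```

Because V is built from the sample covariance itself, C_z equals the identity up to floating-point error.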
1. Introduction: Blind Source Separation (BSS)

Ex.: source signals → mixed signals

BSS problem: recover or separate the source signals with no prior information on the mixing matrix [a_ij]. A typical real-world BSS problem is known as the "cocktail party problem".
- Independent Component Analysis (ICA) utilizes the independence of the source signals to solve the BSS problem.
    x_i(t) = Σ_{j=1}^{3} a_ij s_j(t),  i = 1, 2, 3

for source signals s_1(t), ..., s_3(t).

(Fig. 2: example source signals and the resulting mixed signals.)
(Fig. 3: ICA solution of BSS in the cocktail-party setting: sources s1(t) and s2(t) pass through the mixing process, are recorded at mic1 and mic2, and a separation process produces outputs y1(t), y2(t) by maximizing their degree of independence.)
2. Problem Formulation and Independence

- Source signals (zero mean): s_j(t), j = 1, ..., n
- Recorded signals: x_i(t), i = 1, ..., n
- Linear mixing process (no-delay model*)
  (* In a real environment, the arrival-time differences between microphones should be included in the mixing model.)

Recover the sources s_j(t), j = 1, ..., n, from the mixed signals x_i(t), i = 1, ..., n:
- The a_ij are unknown.
- We want to obtain both a_ij and s_i(t) (up to sign: (−a_ij, −s_i(t)) is equally valid),
under the following assumptions.
    x_i(t) = Σ_{j=1}^{n} a_ij s_j(t),  i = 1, ..., n                  (4)

Vector-matrix form:

    x = A s                                                           (5)

where A = [a_ij] (n × n constant matrix), x = (x_1, ..., x_n)^T, and s = (s_1, ..., s_n)^T.
Assumption 1: Source waveforms are statistically independent
Assumption 2: The sources have non-Gaussian distribution
Assumption 3: Matrix A is square and invertible (for simplicity)
The estimated A is used to recover the original signals by the inverse (de-mixing) operation:

    s = B x,  where B = A^{-1}
Ambiguities of the ICA solution
- The variance (amplitude) of the recovered signals cannot be determined:
  if the pair (a_ij, s_j(t)) is a solution of the underlying BSS problem, then
  (K a_ij, (1/K) s_j(t)) is also a solution for any K ≠ 0.
  The variances of the source signals are therefore assumed to be unity:

    E[s_j^2] = 1                                                      (6)

- The order of the recovered signals cannot be determined (permutation ambiguity).
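The scaling ambiguity behind (6) can be checked numerically. This is a minimal sketch with an arbitrary, hypothetical mixing matrix: scaling one column of A by K and the corresponding source by 1/K leaves the observations unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two unit-variance uniform sources and a hypothetical mixing matrix
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, 5))
A = np.array([[2.0, 1.0], [1.0, 3.0]])

x = A @ s  # observed mixtures, x = A s

# Scaling ambiguity: multiply a column of A by K, divide the
# corresponding source by K -- the observed mixtures are unchanged.
K = 4.0
A2 = A.copy(); A2[:, 0] *= K
s2 = s.copy(); s2[0, :] /= K

x2 = A2 @ s2
print(np.allclose(x, x2))  # → True
```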
Basics: Independence and Uncorrelatedness

Statistical independence of two random variables x and y:
knowing the value of x does not provide information about the distribution of y,

    p_{y|x}(y|x) = p_y(y),  i.e.  p_{x,y}(x,y) = p_x(x) p_y(y)        (7)

Uncorrelatedness of two random variables x and y:
their covariance is zero, i.e.

    E[(x − m_x)(y − m_y)] = 0

If the zero-mean variables x_1, ..., x_n are uncorrelated, then the covariance matrix C_x = E[x x^T] is a diagonal matrix. (Example: Fig. 4)

- Independence, p_{x,y}(x,y) = p_x(x) p_y(y), implies uncorrelatedness, E[(x − m_x)(y − m_y)] = 0; the converse does not hold in general.
- Gaussian density case: independence = uncorrelatedness.
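A quick numerical illustration (not from the lecture) of why the converse fails: with x symmetric about zero, y = x² has zero covariance with x, yet y is completely determined by x and therefore dependent on it.

```python
import numpy as np

rng = np.random.default_rng(2)

# x symmetric about zero; y = x**2 is a deterministic function of x,
# so x and y are clearly dependent -- yet their covariance vanishes.
x = rng.uniform(-1, 1, size=100000)
y = x ** 2

cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
print(round(cov_xy, 3))  # ≈ 0: uncorrelated

# Independence would require E[x^2 y] = E[x^2] E[y]; here it fails
# (analytically 1/5 vs 1/9 for uniform x on [-1, 1]):
gap = np.mean(x**2 * y) - np.mean(x**2) * np.mean(y)
print(round(gap, 3))
```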
Examples of Sources and Mixed Signals

[Uniform densities (sub-Gaussian)]
- Two independent components s_1, s_2 that have the same uniform density with variance 1:

    p_i(s_i) = 1/(2√3)  if |s_i| ≤ √3,  0 otherwise                   (8)
    p(s_1, s_2) = p_1(s_1) p_2(s_2)

- Mixing matrix:

    x = A s,  A = [  5  10 ]
                  [ 10   2 ]

  so x = s_1 a_1 + s_2 a_2, where a_1, a_2 are the column vectors of A.

(Fig. 5: joint distribution of the sources s_1, s_2. Fig. 6: joint distribution of the mixed signals, with the column directions a_1, a_2 marked.)
[Super-Gaussian densities]
- Two independent components s_1, s_2 have super-Gaussian densities as in Fig. 7.
- Mixed signals: x = A s.

(Fig. 7: joint distribution of the super-Gaussian source signals. Fig. 8: distribution of the mixed signals, with the column vectors a_1, a_2 marked.)
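The lecture does not specify the super-Gaussian density, so as an illustrative assumption this sketch uses unit-variance Laplacian sources, a standard example of a positive-kurtosis (super-Gaussian) distribution, and mixes them as above.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical super-Gaussian example: Laplacian sources scaled to
# unit variance (Var = 2 * scale^2 = 1), then mixed as x = A s.
n = 10000
s = rng.laplace(scale=1 / np.sqrt(2), size=(2, n))
A = np.array([[5.0, 10.0], [10.0, 2.0]])
x = A @ s

# Sample excess kurtosis of one source; Laplacian has kurtosis +3,
# i.e. clearly super-Gaussian.
kurt = np.mean(s[0]**4) - 3 * np.mean(s[0]**2)**2
print(kurt > 0)  # → True: super-Gaussian
```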
3. Whitening (PCA) + ICA Approach

The observed signals x are first whitened, then ICA is applied:

    x → (whitening) → z → (ICA) → s

    z = E Λ^{-1/2} E^T x = V x

The new mixing matrix Ã = V A is an orthogonal matrix**:

    z = V x = V A s = Ã s

    ** E[z z^T] = Ã E[s s^T] Ã^T = Ã Ã^T = I

Question*: Is this a unique solution?
Ans*: No. For any orthogonal matrix U, y = U z satisfies

    C_y = E[y y^T] = U E[z z^T] U^T = U I U^T = I

This means y is also a whitened signal.

Conclusion: whitening gives the independent components only up to an orthogonal transformation.
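The orthogonality of Ã = VA can be verified directly: with unit-variance, uncorrelated sources, C_x = A A^T exactly, so building V from it makes à à^T the identity. A minimal sketch with a hypothetical A:

```python
import numpy as np

# With unit-variance, uncorrelated sources, C_x = A A^T.  Whitening
# with V = E Λ^{-1/2} E^T makes A_tilde = V A orthogonal.
A = np.array([[2.0, 1.0], [1.0, 3.0]])    # hypothetical mixing matrix

Cx = A @ A.T                              # covariance of x = A s, C_s = I
lam, E = np.linalg.eigh(Cx)
V = E @ np.diag(lam ** -0.5) @ E.T        # whitening matrix

A_tilde = V @ A
print(np.round(A_tilde @ A_tilde.T, 6))   # identity: A_tilde is orthogonal
```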
Why Gaussian variables are forbidden (Assumption 2)
Suppose the two independent signals s_1 and s_2 are Gaussian:

    p(s_1, s_2) = (1/(2π)) exp(−(s_1² + s_2²)/2) = (1/(2π)) exp(−‖s‖²/2)     (9)

With z = A s, where A is an orthogonal matrix (A^{-1} = A^T, s = A^T z, |det A| = 1), we obtain

    p(z_1, z_2) = (1/(2π)) exp(−‖A^T z‖²/2) = (1/(2π)) exp(−‖z‖²/2)

The joint Gaussian distribution is invariant with respect to an orthogonal transformation. This means that we cannot find (identify) the orthogonal matrix A from the mixed data: the same density is observed, so no information arises from the orthogonal transformation.
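A small numerical check of the invariance in (9): the standard Gaussian density depends only on ‖z‖, and an orthogonal transformation preserves ‖z‖, so the density takes the same value before and after rotation. The rotation angle below is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(3)

# Standard 2-D Gaussian density: depends only on ||z||.
def p(z):
    return np.exp(-0.5 * z @ z) / (2 * np.pi)

theta = 0.7                                # arbitrary rotation angle
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # orthogonal matrix

z = rng.normal(size=2)
print(np.isclose(p(z), p(A @ z)))  # → True: same density after rotation
```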
We need to answer the following two questions:
1) How can the non-Gaussianity of y be measured?
2) How can we compute the values of B that maximize the measure?

Maximization of Non-Gaussianity

For a given density p(x), we define a measure of non-Gaussianity NG(p(x)) (non-negative; NG = 0 if p(x) is Gaussian).

[Intuitive interpretation of ICA as non-Gaussianity maximization]

    x = A s, where s is non-Gaussian; B := A^{-1} is unknown, s = B x

Consider one recovered component

    y = b^T x = b^T A s = q^T s,  where q^T := b^T A                  (10)

    NG(p(y)), as a function of b (equivalently of q), is maximized
    when y equals a single source component q_i s_i.                  (11)
Mixing reduces non-Gaussianity: NG(p_y) is smaller for a mixture y = q_1 s_1 + q_2 s_2 than for the pure components y = q_1 s_1 or y = q_2 s_2. Maximizing NG by adjusting b therefore drives y = b^T x toward a single source q_i s_i.

4. Measure of Non-Gaussianity

Kurtosis is a classical measure of non-Gaussianity:

    kurt(p(y)) := E[y^4] − 3 (E[y^2])^2                               (12)

kurt = 0 for a Gaussian density, kurt > 0 for super-Gaussian, and kurt < 0 for sub-Gaussian densities.

The absolute value of the kurtosis can be used as a measure of non-Gaussianity, and the optimization problem

    Max_b J(b),  J(b) := |kurt(p(y))|                                 (13)

is solved as an ICA solution.
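Putting the pieces together, here is a minimal sketch of kurtosis-based ICA: whitening per (1)-(3), then a search over the remaining rotation for the b maximizing |kurt(y)| per (12)-(13). The uniform sources and mixing matrix follow the earlier example; the grid search is an illustrative choice, not the lecture's algorithm.

```python
import numpy as np

rng = np.random.default_rng(4)

# Two unit-variance uniform (sub-Gaussian) sources, mixed as before
n = 50000
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, n))
A = np.array([[5.0, 10.0], [10.0, 2.0]])
x = A @ s

# Whitening: z = V x with sample covariance of z equal to I
Cx = x @ x.T / n
lam, E = np.linalg.eigh(Cx)
z = (E @ np.diag(lam ** -0.5) @ E.T) @ x

def abs_kurt(y):
    # eq. (12): |E[y^4] - 3 (E[y^2])^2|
    return abs(np.mean(y**4) - 3 * np.mean(y**2) ** 2)

# eq. (13): maximize J(b) = |kurt(y)| over the rotation parameter;
# after whitening only an orthogonal factor (an angle, in 2-D) remains
thetas = np.linspace(0, np.pi, 500)
best = max(thetas,
           key=lambda t: abs_kurt(np.array([np.cos(t), np.sin(t)]) @ z))
y = np.array([np.cos(best), np.sin(best)]) @ z

# y should match one source up to sign and permutation
corr = max(abs(np.corrcoef(y, s[0])[0, 1]),
           abs(np.corrcoef(y, s[1])[0, 1]))
print(round(corr, 2))
```

The recovered component correlates strongly with one of the true sources; which one, and with which sign, is undetermined (the permutation and scaling ambiguities discussed earlier).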