2012 mdsp pr10 ica
TRANSCRIPT
Course Calendar
Class  Date          Contents
1      Sep. 26       Course information & Course overview
2      Oct. 4        Bayes Estimation
3      Oct. 11       Classical Bayes Estimation - Kalman Filter -
4      Oct. 18       Simulation-based Bayesian Methods
5      Oct. 25       Modern Bayesian Estimation: Particle Filter
6      Nov. 1        HMM (Hidden Markov Model)
-      Nov. 8        No Class
7      Nov. 15       Bayesian Decision
8      Nov. 29       Nonparametric Approaches
9      Dec. 6        PCA (Principal Component Analysis)
10     Dec. 13       ICA (Independent Component Analysis)
11     Dec. 20       Applications of PCA and ICA
12     Dec. 27       Clustering, k-means et al.
13     Jan. 17       Other Topics 1: Kernel Machines
14     Jan. 22 (Tue) Other Topics 2
Lecture Plan
Independent Component Analysis
-1. Whitening by PCA
 1. Introduction: Blind Source Separation (BSS)
 2. Problem Formulation and Independence
 3. Whitening + ICA Approach
 4. Non-Gaussianity Measure

References:
[1] A. Hyvärinen, J. Karhunen, and E. Oja, "Independent Component Analysis", Wiley-Interscience, 2001.
-1. Whitening by PCA (Preparation for the ICA approach)

Whitened := uncorrelated + unit variance*

PCA is a very useful tool for transforming a random vector x into an uncorrelated, whitened vector z:

    z = V x

where x is an n-vector with covariance matrix C_x, V is an n × n matrix, and z is an n-vector whose covariance matrix is the identity, C_z = I. (Fig. 1)

The matrix V is not uniquely defined, so we have a free parameter (in the 2-D case: a rotation parameter). PCA gives one solution to the whitening problem.

* Here we assume all random variables are zero mean.
PCA whitening method:
- Define the covariance matrix

    C_x = E[x x^T]                                                    (1)

- Compute the {eigenvalue, eigenvector} pairs of C_x: {λ_i, e_i}, i = 1, ..., n
- Representation of C_x (*)

    C_x = E Λ E^T,  E = (e_1, e_2, ..., e_n),  Λ = diag(λ_1, λ_2, ..., λ_n)   (2)

- Whitening matrix transformation

    z = V x,  V = E Λ^{-1/2} E^T                                      (3)

  so that C_z = E[z z^T] = I.

* The matrix E is an orthogonal matrix that satisfies E E^T = E^T E = I.
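As a sketch of steps (1)-(3), the following snippet (an illustration with a made-up 2-D example, not from the lecture) whitens correlated samples and checks that the sample covariance of z is the identity:

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated zero-mean data: n samples of a 2-D random vector x
n = 10000
s = rng.uniform(-1, 1, size=(2, n))       # independent components
A = np.array([[2.0, 1.0], [1.0, 3.0]])    # arbitrary mixing (illustrative)
x = A @ s                                 # correlated, zero-mean data

# (1) sample covariance matrix C_x = E[x x^T]
Cx = x @ x.T / n

# (2) eigen-decomposition C_x = E Λ E^T
lam, E = np.linalg.eigh(Cx)

# (3) whitening matrix V = E Λ^{-1/2} E^T
V = E @ np.diag(lam ** -0.5) @ E.T
z = V @ x

# C_z = E[z z^T] should be (numerically) the identity matrix
Cz = z @ z.T / n
print(np.round(Cz, 3))
```

Because V is built from the sample covariance itself, C_z equals the identity up to floating-point error.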
1. Introduction: Blind Source Separation (BSS)

Ex.: source signals → mixed signals

BSS problem: recover or separate the source signals with no prior information on the mixing matrix [a_ij]. A typical real-world BSS problem is known as the "cocktail party problem".
- Independent Component Analysis (ICA) utilizes the independence of the source signals to solve the BSS problem.
    x_i(t) = Σ_{j=1}^{3} a_ij s_j(t),  i = 1, 2, 3

for source signals s_1(t), ..., s_3(t).

(Fig. 2: example source signals and the resulting mixed signals.)
(Fig. 3: ICA solution of BSS in the cocktail-party setting: sources s1(t) and s2(t) pass through the mixing process, are recorded at mic1 and mic2, and a separation process produces outputs y1(t), y2(t) by maximizing their degree of independence.)
2. Problem Formulation and Independence

- Source signals (zero mean): s_j(t), j = 1, ..., n
- Recorded signals: x_i(t), i = 1, ..., n
- Linear mixing process (no-delay model*)
  (* In a real environment, the arrival-time differences between microphones should be included in the mixing model.)

Recover the sources s_j(t), j = 1, ..., n, from the mixed signals x_i(t), i = 1, ..., n:
- The a_ij are unknown.
- We want to obtain both a_ij and s_i(t) (up to sign: (−a_ij, −s_i(t)) is equally valid),
under the following assumptions.
    x_i(t) = Σ_{j=1}^{n} a_ij s_j(t),  i = 1, ..., n                  (4)

Vector-matrix form:

    x = A s                                                           (5)

where A = [a_ij] (n × n constant matrix), x = (x_1, ..., x_n)^T, and s = (s_1, ..., s_n)^T.
Assumption 1: Source waveforms are statistically independent
Assumption 2: The sources have non-Gaussian distribution
Assumption 3: Matrix A is square and invertible (for simplicity)
The estimated A is used to recover the original signals by the inverse (de-mixing) operation:

    s = B x,  where B = A^{-1}
Ambiguities of the ICA solution
- The variance (amplitude) of the recovered signals cannot be determined:
  if the pair (a_ij, s_j(t)) is a solution of the underlying BSS problem, then
  (K a_ij, (1/K) s_j(t)) is also a solution for any K ≠ 0.
  The variances of the source signals are therefore assumed to be unity:

    E[s_j^2] = 1                                                      (6)

- The order of the recovered signals cannot be determined (permutation ambiguity).
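The scaling ambiguity behind (6) can be checked numerically. This is a minimal sketch with an arbitrary, hypothetical mixing matrix: scaling one column of A by K and the corresponding source by 1/K leaves the observations unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two unit-variance uniform sources and a hypothetical mixing matrix
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, 5))
A = np.array([[2.0, 1.0], [1.0, 3.0]])

x = A @ s  # observed mixtures, x = A s

# Scaling ambiguity: multiply a column of A by K, divide the
# corresponding source by K -- the observed mixtures are unchanged.
K = 4.0
A2 = A.copy(); A2[:, 0] *= K
s2 = s.copy(); s2[0, :] /= K

x2 = A2 @ s2
print(np.allclose(x, x2))  # → True
```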
Basics: Independence and Uncorrelatedness

Statistical independence of two random variables x and y:
knowing the value of x does not provide information about the distribution of y,

    p_{y|x}(y|x) = p_y(y),  i.e.  p_{x,y}(x,y) = p_x(x) p_y(y)        (7)

Uncorrelatedness of two random variables x and y:
their covariance is zero, i.e.

    E[(x − m_x)(y − m_y)] = 0

If the zero-mean variables x_1, ..., x_n are uncorrelated, then the covariance matrix C_x = E[x x^T] is a diagonal matrix. (Example: Fig. 4)

- Independence, p_{x,y}(x,y) = p_x(x) p_y(y), implies uncorrelatedness, E[(x − m_x)(y − m_y)] = 0; the converse does not hold in general.
- Gaussian density case: independence = uncorrelatedness.
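A quick numerical illustration (not from the lecture) of why the converse fails: with x symmetric about zero, y = x² has zero covariance with x, yet y is completely determined by x and therefore dependent on it.

```python
import numpy as np

rng = np.random.default_rng(2)

# x symmetric about zero; y = x**2 is a deterministic function of x,
# so x and y are clearly dependent -- yet their covariance vanishes.
x = rng.uniform(-1, 1, size=100000)
y = x ** 2

cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
print(round(cov_xy, 3))  # ≈ 0: uncorrelated

# Independence would require E[x^2 y] = E[x^2] E[y]; here it fails
# (analytically 1/5 vs 1/9 for uniform x on [-1, 1]):
gap = np.mean(x**2 * y) - np.mean(x**2) * np.mean(y)
print(round(gap, 3))
```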
Examples of Sources and Mixed Signals

[Uniform densities (sub-Gaussian)]
- Two independent components s_1, s_2 that have the same uniform density with variance 1:

    p_i(s_i) = 1/(2√3)  if |s_i| ≤ √3,  0 otherwise                   (8)
    p(s_1, s_2) = p_1(s_1) p_2(s_2)

- Mixing matrix:

    x = A s,  A = [  5  10 ]
                  [ 10   2 ]

  so x = s_1 a_1 + s_2 a_2, where a_1, a_2 are the column vectors of A.

(Fig. 5: joint distribution of the sources s_1, s_2. Fig. 6: joint distribution of the mixed signals, with the column directions a_1, a_2 marked.)
[Super-Gaussian densities]
- Two independent components s_1, s_2 have super-Gaussian densities as in Fig. 7.
- Mixed signals: x = A s.

(Fig. 7: joint distribution of the super-Gaussian source signals. Fig. 8: distribution of the mixed signals, with the column vectors a_1, a_2 marked.)
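The lecture does not specify the super-Gaussian density, so as an illustrative assumption this sketch uses unit-variance Laplacian sources, a standard example of a positive-kurtosis (super-Gaussian) distribution, and mixes them as above.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical super-Gaussian example: Laplacian sources scaled to
# unit variance (Var = 2 * scale^2 = 1), then mixed as x = A s.
n = 10000
s = rng.laplace(scale=1 / np.sqrt(2), size=(2, n))
A = np.array([[5.0, 10.0], [10.0, 2.0]])
x = A @ s

# Sample excess kurtosis of one source; Laplacian has kurtosis +3,
# i.e. clearly super-Gaussian.
kurt = np.mean(s[0]**4) - 3 * np.mean(s[0]**2)**2
print(kurt > 0)  # → True: super-Gaussian
```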
3. Whitening (PCA) + ICA Approach

The observed signals x are first whitened, then ICA is applied:

    x → (whitening) → z → (ICA) → s

    z = E Λ^{-1/2} E^T x = V x

The new mixing matrix Ã = V A is an orthogonal matrix**:

    z = V x = V A s = Ã s

    ** E[z z^T] = Ã E[s s^T] Ã^T = Ã Ã^T = I

Question*: Is this a unique solution?
Ans*: No. For any orthogonal matrix U, y = U z satisfies

    C_y = E[y y^T] = U E[z z^T] U^T = U I U^T = I

This means y is also a whitened signal.

Conclusion: whitening gives the independent components only up to an orthogonal transformation.
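The orthogonality of Ã = VA can be verified directly: with unit-variance, uncorrelated sources, C_x = A A^T exactly, so building V from it makes à à^T the identity. A minimal sketch with a hypothetical A:

```python
import numpy as np

# With unit-variance, uncorrelated sources, C_x = A A^T.  Whitening
# with V = E Λ^{-1/2} E^T makes A_tilde = V A orthogonal.
A = np.array([[2.0, 1.0], [1.0, 3.0]])    # hypothetical mixing matrix

Cx = A @ A.T                              # covariance of x = A s, C_s = I
lam, E = np.linalg.eigh(Cx)
V = E @ np.diag(lam ** -0.5) @ E.T        # whitening matrix

A_tilde = V @ A
print(np.round(A_tilde @ A_tilde.T, 6))   # identity: A_tilde is orthogonal
```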
Why Gaussian variables are forbidden (Assumption 2)
Suppose the two independent signals s_1 and s_2 are Gaussian:

    p(s_1, s_2) = (1/(2π)) exp(−(s_1² + s_2²)/2) = (1/(2π)) exp(−‖s‖²/2)     (9)

With z = A s, where A is an orthogonal matrix (A^{-1} = A^T, s = A^T z, |det A| = 1), we obtain

    p(z_1, z_2) = (1/(2π)) exp(−‖A^T z‖²/2) = (1/(2π)) exp(−‖z‖²/2)

The joint Gaussian distribution is invariant with respect to an orthogonal transformation. This means that we cannot find (identify) the orthogonal matrix A from the mixed data: the same density is observed, so no information arises from the orthogonal transformation.
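A small numerical check of the invariance in (9): the standard Gaussian density depends only on ‖z‖, and an orthogonal transformation preserves ‖z‖, so the density takes the same value before and after rotation. The rotation angle below is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(3)

# Standard 2-D Gaussian density: depends only on ||z||.
def p(z):
    return np.exp(-0.5 * z @ z) / (2 * np.pi)

theta = 0.7                                # arbitrary rotation angle
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # orthogonal matrix

z = rng.normal(size=2)
print(np.isclose(p(z), p(A @ z)))  # → True: same density after rotation
```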
We need to answer the following two questions:
1) How can the non-Gaussianity of y be measured?
2) How can we compute the values of B that maximize the measure?

Maximization of Non-Gaussianity

For a given density p(x), we define a measure of non-Gaussianity NG(p(x)) (non-negative; NG = 0 if p(x) is Gaussian).

[Intuitive interpretation of ICA as non-Gaussianity maximization]

    x = A s, where s is non-Gaussian; B := A^{-1} is unknown, s = B x

Consider one recovered component

    y = b^T x = b^T A s = q^T s,  where q^T := b^T A                  (10)

    NG(p(y)), as a function of b (equivalently of q), is maximized
    when y equals a single source component q_i s_i.                  (11)
Mixing reduces non-Gaussianity: NG(p_y) is smaller for a mixture y = q_1 s_1 + q_2 s_2 than for the pure components y = q_1 s_1 or y = q_2 s_2. Maximizing NG by adjusting b therefore drives y = b^T x toward a single source q_i s_i.

4. Measure of Non-Gaussianity

Kurtosis is a classical measure of non-Gaussianity:

    kurt(p(y)) := E[y^4] − 3 (E[y^2])^2                               (12)

kurt = 0 for a Gaussian density, kurt > 0 for super-Gaussian, and kurt < 0 for sub-Gaussian densities.

The absolute value of the kurtosis can be used as a measure of non-Gaussianity, and the optimization problem

    Max_b J(b),  J(b) := |kurt(p(y))|                                 (13)

is solved as an ICA solution.
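Putting the pieces together, here is a minimal sketch of kurtosis-based ICA: whitening per (1)-(3), then a search over the remaining rotation for the b maximizing |kurt(y)| per (12)-(13). The uniform sources and mixing matrix follow the earlier example; the grid search is an illustrative choice, not the lecture's algorithm.

```python
import numpy as np

rng = np.random.default_rng(4)

# Two unit-variance uniform (sub-Gaussian) sources, mixed as before
n = 50000
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, n))
A = np.array([[5.0, 10.0], [10.0, 2.0]])
x = A @ s

# Whitening: z = V x with sample covariance of z equal to I
Cx = x @ x.T / n
lam, E = np.linalg.eigh(Cx)
z = (E @ np.diag(lam ** -0.5) @ E.T) @ x

def abs_kurt(y):
    # eq. (12): |E[y^4] - 3 (E[y^2])^2|
    return abs(np.mean(y**4) - 3 * np.mean(y**2) ** 2)

# eq. (13): maximize J(b) = |kurt(y)| over the rotation parameter;
# after whitening only an orthogonal factor (an angle, in 2-D) remains
thetas = np.linspace(0, np.pi, 500)
best = max(thetas,
           key=lambda t: abs_kurt(np.array([np.cos(t), np.sin(t)]) @ z))
y = np.array([np.cos(best), np.sin(best)]) @ z

# y should match one source up to sign and permutation
corr = max(abs(np.corrcoef(y, s[0])[0, 1]),
           abs(np.corrcoef(y, s[1])[0, 1]))
print(round(corr, 2))
```

The recovered component correlates strongly with one of the true sources; which one, and with which sign, is undetermined (the permutation and scaling ambiguities discussed earlier).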