A Convex Analysis Framework for Blind Separation of Non-Negative Sources

Tsung-Han Chan†, Wing-Kin Ma∗, Chong-Yung Chi‡, and Yue Wang§

Abstract

This paper presents a new framework for blind source separation (BSS) of non-negative source signals. The proposed framework, referred to herein as convex analysis of mixtures of non-negative sources (CAMNS), is deterministic, requiring no source independence assumption, the entrenched premise in many existing (usually statistical) BSS frameworks. The development is based on a special assumption called local dominance. It is a good assumption for source signals exhibiting sparsity or high contrast, and is thus considered realistic for many real-world problems such as multichannel biomedical imaging. Under local dominance and several standard assumptions, we apply convex analysis to establish a new BSS criterion, which states that the source signals can be perfectly identified (in a blind fashion) by finding the extreme points of an observation-constructed polyhedral set. Algorithms for fulfilling the CAMNS criterion are also derived, using either linear programming or simplex geometry. Simulation results on several data sets are presented to demonstrate the efficacy of the proposed method over several other reported BSS methods.

Index Terms

Blind separation, Non-negative sources, Convex analysis criterion, Linear program, Simplex geometry, Convex optimization

EDICS: SSP-SSEP (Signal separation), SAM-APPL (Applications of sensor & array multichannel processing), BIO-APPL (Applications of biomedical signal processing)

This work was supported in part by the National Science Council (R.O.C.) under Grants NSC 95-2221-E-007-122 and NSC 95-2221-E-007-036, and by the US National Institutes of Health under Grants EB000830 and CA109872. This work was partly presented at the 32nd IEEE International Conference on Acoustics, Speech, and Signal Processing, Honolulu, Hawaii, USA, April 15-20, 2007.

†Tsung-Han Chan is with the Institute of Communications Engineering & Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan, R.O.C. E-mail: [email protected], Tel: +886-3-5715131X34033, Fax: +886-3-5751787.

∗Wing-Kin Ma is the corresponding author. Address: Department of Electronic Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong. E-mail: [email protected], Tel: +852-31634350, Fax: +852-26035558.

‡Chong-Yung Chi is with the Institute of Communications Engineering & Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan, R.O.C. E-mail: [email protected], Tel: +886-3-5731156, Fax: +886-3-5751787.

§Yue Wang is with the Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA. E-mail: [email protected], Tel: 703-387-6056, Fax: 703-528-5543.

December 27, 2007 DRAFT


I. INTRODUCTION

Recently there has been much interest in blind separation of non-negative source signals [1], [2], referred to herein as non-negative blind source separation (nBSS). There are many applications where the sources to be separated are non-negative by nature; for example, in analytical chemistry [3], [4], hyperspectral imaging [5], and biomedical imaging [6]. How to cleverly exploit the non-negative signal characteristic in nBSS has been an intriguing subject, leading to numerous proposed nBSS alternatives [7]–[11].

One major class of nBSS methods utilizes the statistical property that the sources are mutually uncorrelated or independent, supposing that the sources do satisfy that property. Methods falling in this class include second-order blind identification (SOBI) [12], the fast fixed-point algorithm for independent component analysis (ICA) [13], non-negative ICA (nICA) [7], stochastic non-negative ICA (SNICA) [8], and Bayesian positive source separation (BPSS) [9], to name a few. SOBI and fast ICA were originally developed for more general blind source separation (BSS) problems where signals can be negative. The two methods can be directly applied to nBSS, but it has been reported that their separated signals may have negative values, especially in the presence of finite sample effects [4], [14]. nICA takes source non-negativity into account, and is shown to provide perfect separation when the sources have non-vanishing density around zero (the so-called well-grounded condition). SNICA uses a simulated annealing algorithm for extracting non-negative sources under the minimum mutual information criterion. BPSS also uses source non-negativity. It applies Bayesian estimation with both the sources and the mixing matrix assigned Gamma distribution priors.

Another class of nBSS methods is deterministic, requiring no assumption of source independence or zero correlations. Roughly speaking, these methods explicitly exploit source non-negativity, or even mixing matrix non-negativity, in an attempt to satisfy some kind of least-squares criterion. Alternating least squares (ALS) [10], [15] deals with a sequence of least-squares problems in which non-negativity constraints on either the sources or the mixing matrix are imposed. Non-negative matrix factorization (NMF) [11] decomposes the observation matrix into a product of two non-negative matrices, one serving as the estimate of the sources and the other as the mixing matrix. NMF is not a unique decomposition, which may result in indeterminacy of the sources and mixing matrix. To overcome this problem, a sparsity constraint on the sources has been proposed [16].

In this paper we propose a new nBSS framework, known as convex analysis of mixtures of non-negative sources (CAMNS). Convex analysis and optimization techniques have drawn considerable attention in


signal processing, serving as powerful tools for various topics such as communications [17]–[22], array signal processing [23], and sensor networks [24]. Apart from using source non-negativity, CAMNS adopts a special deterministic assumption called local dominance. This assumption was initially proposed to capture the sparse characteristics of biomedical images [25], [26], but we have found it a good assumption, or approximation, for high-contrast images such as human portraits as well. Under the local dominance assumption and some standard nBSS assumptions, we show using convex analysis that the true source signals serve as the extreme points of an observation-constructed polyhedral set. This geometrical discovery is not only surprising but important, since it provides a novel nBSS criterion that guarantees perfect blind separation. To practically realize CAMNS, we derive extreme-point finding algorithms using either linear programming or simplex geometry. As we will see by simulations, the blind separation performance of the CAMNS algorithms is promising even in the presence of strongly correlated sources.

We should stress that the proposed CAMNS framework is deterministic, but it is conceptually different from the other existing deterministic frameworks such as NMF. In particular, the idea of using convex analysis to establish an nBSS criterion cannot be found in the other frameworks. Moreover, CAMNS does not require mixing matrix non-negativity, while NMF does. The closest match to CAMNS would be non-negative least correlated component analysis (nLCA) [25], [26], a concurrent development that also exploits the local dominance assumption. Nevertheless, convex analysis is not involved in nLCA.

CAMNS works by exploiting signal geometry, using convex analysis and optimization. It is interesting to note that in BSS of magnitude-bounded sources (BSS-MBS) [27]–[29], signal geometry is also utilized. For example, a key step in [27] is to use a neural learning algorithm to determine the vertices of an observation space generated by magnitude-bounded source signals. But we should also emphasize that the underlying assumptions and techniques employed in CAMNS are substantially different from those in BSS-MBS.

The paper is organized as follows. In Section II, the problem statement is given. In Section III, we review some key concepts of convex analysis that will be useful for understanding the mathematical derivations that follow. The new BSS criterion for separating non-negative sources is developed in Section IV. We show how to use linear programs (LPs) to practically achieve the new criterion in Section V. Section VI studies a geometric alternative to the LP method. Finally, in Section VII, we use simulations to evaluate the performance of the proposed methods as well as some other existing nBSS methods.


II. PROBLEM STATEMENT AND ASSUMPTIONS

For ease of later use, let us define the following notation:

R, R^N, R^{M×N}   set of real numbers, N-vectors, M × N matrices
R_+, R^N_+, R^{M×N}_+   set of non-negative real numbers, N-vectors, M × N matrices
1   all-one vector
I_N   N × N identity matrix
e_i   unit vector with the ith entry equal to 1
⪰   componentwise inequality
‖·‖   Euclidean norm
N(µ, Σ)   Gaussian distribution with mean µ and covariance Σ

The scenario under consideration is that of linear instantaneous mixtures. The signal model is given by

x[n] = A s[n],   n = 1, . . . , L   (1)

where s[n] = [ s1[n], . . . , sN[n] ]^T is the input or source vector sequence with N denoting the input dimension, x[n] = [ x1[n], . . . , xM[n] ]^T is the output or observation vector sequence with M denoting the output dimension, A ∈ R^{M×N} is the mixing matrix describing the input-output relation, and L is the sequence (or data) length, for which we assume L ≫ max{M, N}. Note that (1) can alternatively be expressed as

xi = Σ_{j=1}^{N} aij sj,   i = 1, . . . , M,   (2)

where aij is the (i, j)th element of A, sj = [ sj[1], . . . , sj[L] ]^T is a vector representing the jth source signal, and xi = [ xi[1], . . . , xi[L] ]^T is a vector representing the ith observed signal.
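To make the model concrete, an instance of (1)-(2) can be sketched in Python with NumPy; the dimensions, the sparsity level, and all variable names below are our illustrative choices, not part of the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, L = 3, 4, 1000  # sources, observations, data length (L >> max{M, N})

# Non-negative sources (rows are s_j^T); zeroing many samples makes the
# local dominance assumption (A2), discussed below, likely to hold.
S = rng.random((N, L)) * (rng.random((N, L)) > 0.6)

# Mixing matrix with unit row sum, matching (A3) below, and full column rank.
A = rng.random((M, N))
A /= A.sum(axis=1, keepdims=True)

X = A @ S  # X stacks the observations x_i^T as rows, i.e., model (2)

assert np.all(S >= 0) and np.allclose(A.sum(axis=1), 1.0)
```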

In BSS, the problem is to extract s[n] without information of A. The BSS framework to be proposed is based on the following assumptions:

(A1) All sj are componentwise non-negative; i.e., for each j, sj ∈ R^L_+.

(A2) Each source signal vector is local dominant, in the following sense: for each i ∈ {1, . . . , N}, there exists an (unknown) index ℓi such that si[ℓi] > 0 and sj[ℓi] = 0 for all j ≠ i. (This means that for each source there is at least one n at which that source dominates.)

(A3) The mixing matrix has unit row sum; i.e., for all i = 1, . . . , M,

Σ_{j=1}^{N} aij = 1.   (3)


(A4) M ≥ N and A is of full column rank.

Let us discuss the practicality of (A1)–(A4). Assumption (A1) is true in image analysis [5], [6], where image intensities are often represented by non-negative numbers. Assumption (A2) is special and instrumental to the development that ensues. It may be completely satisfied, or serve as a good approximation, when the source signals are sparse (or contain many zeros). In brain magnetic resonance imaging (MRI), for instance, the non-overlapping region of the spatial distributions of fast-perfusion and slow-perfusion source images [6] can be higher than 95%. It may also be appropriate to assume (A2) when the source signals exhibit high contrast. Assumption (A4) is a standard assumption in BSS. Assumption (A3) is automatically satisfied in MRI due to the partial volume effect [26], and in hyperspectral images due to the full additivity condition [5]. When (A3) is not satisfied, the following idea [26] can be used.

Suppose that xi^T 1 ≠ 0 and sj^T 1 ≠ 0 for all i and j, and consider the following normalized observation vectors:

x̄i = xi / (xi^T 1) = Σ_{j=1}^{N} ( aij sj^T 1 / xi^T 1 ) ( sj / sj^T 1 ).   (4)

By letting āij = aij sj^T 1 / xi^T 1 and s̄j = sj / sj^T 1, we obtain a BSS problem formulation x̄i = Σ_{j=1}^{N} āij s̄j, which is of the same form as the original signal model in (2). It is easy to show that the new mixing matrix, denoted by Ā, has unit row sum. In addition, the rank of Ā is the same as that of A. To show this, we notice that

Ā = D1^{-1} A D2   (5)

where D1 = diag(x1^T 1, . . . , xM^T 1) and D2 = diag(s1^T 1, . . . , sN^T 1). Since D1 and D2 are of full rank, we have rank(Ā) = rank(A).
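This normalization can be checked numerically in a few lines; the NumPy sketch below uses arbitrary random data and illustrative variable names:

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, L = 3, 4, 500
S = rng.random((N, L))            # rows are the sources s_j^T
A = rng.random((M, N))            # mixing matrix WITHOUT unit row sum
X = A @ S                         # rows are the observations x_i^T

# x_bar_i = x_i / (x_i^T 1) and s_bar_j = s_j / (s_j^T 1), as in (4).
X_bar = X / X.sum(axis=1, keepdims=True)
S_bar = S / S.sum(axis=1, keepdims=True)

# New mixing matrix A_bar = D1^{-1} A D2, as in (5).
D1_inv = np.diag(1.0 / X.sum(axis=1))
D2 = np.diag(S.sum(axis=1))
A_bar = D1_inv @ A @ D2

assert np.allclose(X_bar, A_bar @ S_bar)        # same form as model (2)
assert np.allclose(A_bar.sum(axis=1), 1.0)      # unit row sum, i.e., (A3)
assert np.linalg.matrix_rank(A_bar) == np.linalg.matrix_rank(A)
```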

III. SOME BASIC CONCEPTS OF CONVEX ANALYSIS

We review some convex analysis concepts that will play an important role in the ensuing development. For detailed explanations of these concepts, readers are referred to [30]–[32], which are excellent texts on convex analysis.

Given a set of vectors {s1, . . . , sN} ⊂ R^L, the affine hull is defined as

aff{s1, . . . , sN} = { x = Σ_{i=1}^{N} θi si | θ ∈ R^N, Σ_{i=1}^{N} θi = 1 }.   (6)

An affine hull can always be represented by an affine set:

aff{s1, . . . , sN} = { x = Cα + d | α ∈ R^P }   (7)


for some (non-unique) d ∈ R^L and C ∈ R^{L×P}. Here, C is assumed to be of full column rank, and P is the affine dimension, which must be less than N. For example, if {s1, . . . , sN} is linearly independent, then P = N − 1. In that case, a legitimate (C, d) is given by C = [ s1 − sN, s2 − sN, . . . , s_{N−1} − sN ] and d = sN. To give some insight, Fig. 1 pictorially illustrates an affine hull for N = 3. We can see that it is a plane passing through s1, s2, and s3.
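The representation (7) with the particular choice C = [ s1 − sN, . . . , s_{N−1} − sN ], d = sN can be verified numerically; the following is only a sketch with arbitrary random vectors:

```python
import numpy as np

rng = np.random.default_rng(2)
L, N = 6, 3
s1, s2, s3 = (rng.random(L) for _ in range(N))  # linearly independent w.p. 1

C = np.column_stack([s1 - s3, s2 - s3])         # legitimate C (full column rank)
d = s3                                          # legitimate d

# Any affine combination theta1*s1 + theta2*s2 + theta3*s3 with sum 1
# equals C @ alpha + d for alpha = (theta1, theta2).
theta = np.array([0.2, 0.5, 0.3])               # coefficients summing to 1
x = theta[0] * s1 + theta[1] * s2 + theta[2] * s3
alpha = theta[:2]

assert np.allclose(C @ alpha + d, x)
assert np.linalg.matrix_rank(C) == N - 1        # affine dimension P = N - 1
```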

Given a set of vectors {s1, . . . , sN} ⊂ R^L, the convex hull is defined as

conv{s1, . . . , sN} = { x = Σ_{i=1}^{N} θi si | θ ∈ R^N_+, Σ_{i=1}^{N} θi = 1 }.   (8)

For N = 3, a convex hull is a triangle with vertices s1, s2, and s3. Geometrically, s1, . . . , sN would be the 'corner points' of their convex hull, defined formally as the extreme points. A point x ∈ conv{s1, . . . , sN} is an extreme point of conv{s1, . . . , sN} if it cannot be a nontrivial convex combination of s1, . . . , sN; i.e.,

x ≠ Σ_{i=1}^{N} θi si   (9)

for all θ ∈ R^N_+ with Σ_{i=1}^{N} θi = 1 and θ ≠ ei for any i. The set of extreme points of conv{s1, . . . , sN} must be either the full set or a subset of {s1, . . . , sN}. In addition, if {s1, . . . , sN} is an affinely independent set (or {s1 − sN, . . . , s_{N−1} − sN} is a linearly independent set), then the set of extreme points of conv{s1, . . . , sN} is exactly {s1, . . . , sN}. Moreover, the boundary of conv{s1, . . . , sN} is entirely constituted by all its faces, defined by conv{s_{i1}, . . . , s_{iK}} where {i1, . . . , iK} ⊂ {1, . . . , N}, ik ≠ il for k ≠ l, and K < N. A facet is a face with K = N − 1. For example, in Fig. 1, we can see that the facets are the line segments conv{s1, s2}, conv{s2, s3}, and conv{s1, s3}.

Fig. 1. Example of 3-dimensional signal space geometry for N = 3.

A simplest simplex of affine dimension N − 1, or (N − 1)-simplex, is defined as the convex hull of N


affinely independent vectors {s1, . . . , sN} ⊂ R^{N−1}. An (N − 1)-simplex is formed by N extreme points, N facets, and a number of faces. For N = 1, 2, and 3, the vertex-descriptions of these simplest simplexes are a point, a line segment, and a triangle, respectively.

IV. NEW NBSS CRITERION BY CONVEX ANALYSIS

Now, consider the BSS problem formulation in (2) and the assumptions (A1)-(A4). Under (A3), we see that every observation xi is an affine combination of the true source signals {s1, . . . , sN}; that is,

xi ∈ aff{s1, . . . , sN}   (10)

for all i = 1, . . . , M. This leads to the interesting question of whether the observations x1, . . . , xM provide sufficient information to construct the source signal affine hull aff{s1, . . . , sN}. This is indeed possible, as described in the following lemma:

Lemma 1. Under (A3) and (A4), we have

aff{s1, . . . , sN} = aff{x1, . . . ,xM}. (11)

The proof of Lemma 1 is given in Appendix A. Figure 2(a) demonstrates geometrically the validity of Lemma 1 for the special case of N = 2, where aff{s1, . . . , sN} is a line. Now let us focus on the characterization of aff{s1, . . . , sN}. It can easily be shown from (A2) that {s1, . . . , sN} is linearly independent. Hence, aff{s1, . . . , sN} has dimension N − 1 and admits a representation

aff{s1, . . . , sN} = { x = Cα + d | α ∈ R^{N−1} }   (12)

for some (C, d) ∈ R^{L×(N−1)} × R^L such that rank(C) = N − 1. Note that (C, d) is non-unique. Without loss of generality, we can restrict C to be a semi-orthogonal matrix; i.e., C^T C = I. If M = N, it is easy to obtain (C, d) from the observations x1, . . . , xM; see the review in Section III. For the more general case of M ≥ N, (C, d) may be found by solving the following minimization problem:

(C, d) = arg min_{C, d: C^T C = I} Σ_{i=1}^{M} e_{A(C,d)}(xi)   (13)

where e_A(x) is the projection error of x onto A, defined as

e_A(x) = min_{x̃ ∈ A} ‖x − x̃‖₂²,   (14)

and

A(C, d) = { x = Cα + d | α ∈ R^{N−1} }   (15)


is an affine set parameterized by (C, d). The objective of (13) is to find an (N − 1)-dimensional affine set that has the minimum projection error with respect to the observations. At first glance, (13) is a nonconvex optimization problem, but in fact it can be solved analytically, as stated in the following proposition:

Proposition 1. The affine set fitting problem in (13) has the closed-form solution

d = (1/M) Σ_{i=1}^{M} xi   (16)

C = [ q1(UU^T), q2(UU^T), . . . , q_{N−1}(UU^T) ]   (17)

where U = [ x1 − d, . . . , xM − d ] ∈ R^{L×M}, and the notation qi(R) denotes the eigenvector associated with the ith principal eigenvalue of the input matrix R.

The proof of Proposition 1 is given in Appendix B. Since affine set fitting provides the best affine set in the sense of minimizing the data fitting error, it has a noise mitigation advantage in the presence of noise.
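Proposition 1 takes only a few lines to implement. In the NumPy sketch below (the helper name `affine_set_fitting` is ours, not the paper's), the principal eigenvectors of UU^T are obtained equivalently from the left singular vectors of U, which avoids forming the L × L matrix UU^T:

```python
import numpy as np

def affine_set_fitting(X, N):
    """Closed-form affine set fitting (16)-(17); X holds observations x_i as rows."""
    d = X.mean(axis=0)                            # (16): d = (1/M) sum_i x_i
    U = (X - d).T                                 # U = [x_1 - d, ..., x_M - d]
    Q = np.linalg.svd(U, full_matrices=False)[0]  # left singular vectors of U
    C = Q[:, :N - 1]                              # (17): N - 1 principal directions
    return C, d

# Noiseless check: every observation should lie in the fitted affine set A(C, d).
rng = np.random.default_rng(3)
N, M, L = 3, 5, 200
S = rng.random((N, L))
A = rng.random((M, N)); A /= A.sum(axis=1, keepdims=True)  # unit row sum (A3)
X = A @ S

C, d = affine_set_fitting(X, N)
assert np.allclose(C.T @ C, np.eye(N - 1))        # C is semi-orthogonal
for v in X:                                       # zero projection error, cf. (14)
    assert np.linalg.norm((v - d) - C @ (C.T @ (v - d))) < 1e-8
```

By Lemma 1, the true sources lie in the same affine set, which can be checked the same way.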

Fig. 2. Geometric illustrations of CAMNS for the special case of N = 2, M = 3, and L = 3. (a) Affine set geometry indicated by Lemma 1 and Proposition 1; (b) convex hull geometry suggested in Lemma 2.

Recall that the source signals are non-negative. Hence, we have si ∈ aff{s1, . . . , sN} ∩ R^L_+ for any i. Let us define

S = aff{s1, . . . , sN} ∩ R^L_+ = A(C, d) ∩ R^L_+   (18)
  = { x | x = Cα + d, x ⪰ 0, α ∈ R^{N−1} }   (19)


which is a polyhedral set. We can show that

Lemma 2. Under (A1) and (A2), we have

S = conv{s1, . . . , sN}. (20)

The proof of Lemma 2 is given in Appendix C. Following the simple illustrative example in Figure 2(a), in Figure 2(b) we verify geometrically that S is equivalent to conv{s1, . . . , sN} for N = 2. From Lemma 2, we notice the important consequence that the set of extreme points of S, or conv{s1, . . . , sN}, is {s1, . . . , sN}, since {s1, . . . , sN} is linearly independent [due to (A2)]. The extremal property of {s1, . . . , sN} can be seen in the illustration in Figure 2(b).

From the derivations above, we conclude that

Theorem 1. (nBSS criterion by CAMNS) Under (A1) to (A4), the polyhedral set

S = { x ∈ R^L | x = Cα + d ⪰ 0, α ∈ R^{N−1} }   (21)

where (C, d) is obtained from the observation set {x1, . . . , xM} by the affine set fitting procedure in Proposition 1, has N extreme points given by the true source vectors s1, . . . , sN.

Proof: Theorem 1 is a direct consequence of Lemma 1, Proposition 1, Lemma 2, and the basic result that the extreme points of conv{s1, . . . , sN} are s1, . . . , sN.

The theoretical implication of Theorem 1 is profound: it suggests that the true source vectors can be perfectly identified by finding all the extreme points of S. To the best of our knowledge, this provides new opportunities in nBSS that cannot be found in the presently available literature. Exploring these opportunities in practice is the subject of the following sections.

V. LINEAR PROGRAMMING METHOD FOR CAMNS

This section, as well as the next, is dedicated to the practical implementation of CAMNS. In this section, we propose an approach that uses linear programs (LPs) to systematically fulfil the CAMNS criterion. The next section will describe a geometric approach as an alternative to the LP.

Our problem, as indicated in Theorem 1, is to find all the extreme points of the polyhedral set S in (19). In the optimization literature this problem is known as vertex enumeration; see [33]–[35] and the references therein. The available extreme-point finding methods are sophisticated, requiring no assumption on the extreme points. However, the complexity of those methods increases exponentially with the number of inequalities L (note that L is also the data length in our problem, which is often large in practice),


except for a few special cases not applicable to this work. The notable difference of the development here is that we exploit the characteristic that the extreme points s1, . . . , sN are linearly independent in the CAMNS problem [recall that this property is a direct consequence of (A2)]. By doing so, we will establish an extreme-point finding method for CAMNS whose complexity is polynomial in L.

We first concentrate on identifying one extreme point of S. Consider the following linear minimization problem:

p⋆ = min_s r^T s   subject to (s.t.)   s ∈ S   (22)

for some arbitrarily chosen direction r ∈ R^L, where p⋆ denotes the optimal objective value of (22). Using the polyhedral structure of S in (19), problem (22) can be equivalently represented by the LP

p⋆ = min_α r^T (Cα + d)   s.t.   Cα + d ⪰ 0   (23)

which can be solved by readily available algorithms such as polynomial-time interior-point methods [36], [37]. Problem (23) is the problem we solve in practice, but (22) leads to important implications for extreme-point search.
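As an illustration, (23) can be handed directly to an off-the-shelf LP solver. The sketch below uses SciPy's `linprog` on a small synthetic instance (all sizes and names are our illustrative choices); the constraint Cα + d ⪰ 0 is passed in the solver's standard form as −Cα ≤ d:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(4)
N, M, L = 3, 4, 60

# Sparse non-negative sources with local dominance enforced by hand:
# sample n = i is "pure" in source i, cf. (A2).
S_true = rng.random((N, L)) * (rng.random((N, L)) > 0.5)
for i in range(N):
    S_true[:, i] = 0.0
    S_true[i, i] = 1.0
A = rng.random((M, N)); A /= A.sum(axis=1, keepdims=True)
X = A @ S_true

# Affine set fitting (Proposition 1), via the SVD of U.
d = X.mean(axis=0)
C = np.linalg.svd((X - d).T, full_matrices=False)[0][:, :N - 1]

# LP (23): minimize r^T (C a + d)  s.t.  C a + d >= 0.
r = rng.standard_normal(L)
res = linprog(c=C.T @ r, A_ub=-C, b_ub=d, bounds=[(None, None)] * (N - 1))
s_hat = C @ res.x + d  # with probability 1, one of the true sources (Lemma 3)

assert res.success
assert np.all(s_hat >= -1e-8)  # s_hat lies in S
```

Note that `linprog` bounds variables to be non-negative by default, so the free bounds on α must be given explicitly.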

A fundamental result in LP theory is that r^T s, the objective function of (22), attains its minimum at a point on the boundary of S. To provide more insight, some geometric illustrations are given in Fig. 3. We can see that the solution of (22) may be uniquely given by one of the extreme points si [Fig. 3(a)], or it may be any point on a face [Fig. 3(b)]. The latter case poses a problem for our task of identifying si, but it is arguably not a usual situation. For instance, in the demonstration in Fig. 3(b), r must be normal to s2 − s3, which is unlikely to happen for a randomly picked r. With this intuition in mind, we prove in Appendix D that

Lemma 3. Suppose that r is randomly generated following the distribution N(0, I_L). Then, with probability 1, the solution of (22) is uniquely given by si for some i ∈ {1, . . . , N}.

The idea behind Lemma 3 is to show that undesired cases, such as that in Fig. 3(b), happen with probability zero.

We may find another extreme point by solving the maximization counterpart of (22):

q⋆ = max_α r^T (Cα + d)   s.t.   Cα + d ⪰ 0.   (24)


Fig. 3. Geometric interpretation of an LP: (a) the optimum is attained at a unique extreme point; (b) the set of optimal points is a face.

Using the same development as above, we can show the following: under the premise of Lemma 3, the solution of (24) is, with probability 1, uniquely given by an extreme point si different from that in (22).

Suppose that we have identified l extreme points, say, without loss of generality, {s1, . . . , sl}. Our interest is in refining the above LP extreme-point finding procedure such that the search space is restricted to {s_{l+1}, . . . , sN}. To do so, consider a thin QR decomposition [38] of [ s1, . . . , sl ]:

[ s1, . . . , sl ] = Q1 R1,   (25)

where Q1 ∈ R^{L×l} is semi-unitary and R1 ∈ R^{l×l} is upper triangular. Let

B = I_L − Q1 Q1^T.   (26)

We assume that r takes the form

r = Bw   (27)

for some w ∈ R^L, and consider solving (23) and (24) with such an r. Since r is orthogonal to the old extreme points s1, . . . , sl, the intuitive expectation is that (23) and (24) should both lead to new extreme points. Interestingly, we found theoretically that this expectation is not true, but close. Consider the following lemma, which is proven in Appendix E:

Lemma 4. Suppose that r = Bw, where B ∈ R^{L×L} is given by (26) and w is randomly generated following the distribution N(0, I_L). Then, with probability 1, at least one of the optimal solutions of (23)


and (24) is a new extreme point; i.e., si for some i ∈ {l + 1, . . . , N}. The certificate of finding a new extreme point is |p⋆| ≠ 0 for the case of (23), and |q⋆| ≠ 0 for (24).

By repeating the above-described procedures, we can identify all the extreme points s1, . . . , sN. The resultant CAMNS-LP method is summarized in Table I.

TABLE I
A SUMMARY OF CAMNS-LP METHOD

CAMNS-LP method
Given an affine set characterization 2-tuple (C, d).
Step 1. Set l = 0 and B = I_L.
Step 2. Randomly generate a vector w ∼ N(0, I_L), and set r := Bw.
Step 3. Solve the LPs
    p⋆ = min_{α: Cα+d⪰0} r^T (Cα + d)
    q⋆ = max_{α: Cα+d⪰0} r^T (Cα + d)
and obtain their optimal solutions, denoted by α⋆1 and α⋆2, respectively.
Step 4. If l = 0, set S = [ Cα⋆1 + d, Cα⋆2 + d ];
    else
        if |p⋆| ≠ 0 then S := [ S, Cα⋆1 + d ];
        if |q⋆| ≠ 0 then S := [ S, Cα⋆2 + d ].
Step 5. Update l to be the number of columns of S.
Step 6. Apply the QR decomposition S = Q1 R1, where Q1 ∈ R^{L×l} and R1 ∈ R^{l×l}. Update B := I_L − Q1 Q1^T.
Step 7. Repeat from Step 2 until l = N.
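The steps of Table I can be sketched end-to-end as follows (Python with NumPy and SciPy's `linprog`; the function name `camns_lp` and the 1e-9 certificate threshold are our illustrative choices, whereas the paper states the certificates exactly as |p⋆| ≠ 0 and |q⋆| ≠ 0):

```python
import numpy as np
from scipy.optimize import linprog

def camns_lp(X, N, seed=0):
    """Sketch of the CAMNS-LP method (Table I); X holds observations as rows."""
    rng = np.random.default_rng(seed)
    d = X.mean(axis=0)                                 # affine set fitting
    C = np.linalg.svd((X - d).T, full_matrices=False)[0][:, :N - 1]
    L = X.shape[1]

    S = np.zeros((L, 0))                               # recovered sources (columns)
    B = np.eye(L)                                      # Step 1
    while S.shape[1] < N:                              # Step 7
        r = B @ rng.standard_normal(L)                 # Step 2
        first = S.shape[1] == 0
        for sign in (1.0, -1.0):                       # Step 3: min LP, then max LP
            res = linprog(c=sign * (C.T @ r), A_ub=-C, b_ub=d,
                          bounds=[(None, None)] * (N - 1))
            s = C @ res.x + d
            if first or abs(r @ s) > 1e-9:             # Step 4: |p*|, |q*| certificates
                S = np.column_stack([S, s])
        Q1 = np.linalg.qr(S)[0]                        # Step 6
        B = np.eye(L) - Q1 @ Q1.T
    return S.T                                         # rows are source estimates

# Small demonstration with locally dominant synthetic sources.
rng = np.random.default_rng(5)
N, M, L = 3, 5, 80
S_true = rng.random((N, L)) * (rng.random((N, L)) > 0.5)
for i in range(N):
    S_true[:, i] = 0.0
    S_true[i, i] = 1.0                                 # enforce (A2) by hand
A = rng.random((M, N)); A /= A.sum(axis=1, keepdims=True)
S_est = camns_lp(A @ S_true, N)
```

In the noiseless case, the rows of `S_est` should equal the true sources up to a permutation.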

The CAMNS-LP method in Table I is not only straightforward to apply systematically, it is also efficient, owing to the maturity of convex optimization algorithms. Using a primal-dual interior-point method, each LP [problem (23) or (24)] can be solved with a worst-case complexity of O(L^{0.5}(L(N − 1) + (N − 1)³)) ≃ O(L^{1.5}(N − 1)) for L ≫ N [37]. Since the algorithm solves 2(N − 1) LPs in the worst case, we infer that its worst-case complexity is O(L^{1.5}(N − 1)²).


Based on Theorem 1, Lemma 3, Lemma 4, and the above complexity discussion, we assert that

Proposition 2. Under (A1)-(A4), the CAMNS-LP method in Table I finds all the true source vectors s1, . . . , sN with probability 1. It does so with a worst-case complexity of O(L^{1.5}(N − 1)²).

We have provided a MATLAB implementation of the CAMNS-LP method at http://www.ee.nthu.edu.tw/∼wkma/CAMNS/CAMNS.htm. Readers are encouraged to test the code and give us feedback.

VI. GEOMETRIC METHOD FOR CAMNS

The CAMNS-LP method developed in the last section uses numerical optimization to achieve the CAMNS criterion. In this section we develop analytical or semi-analytical alternatives for achieving CAMNS that are simple and more efficient than CAMNS-LP. The methods to be proposed are for the cases of two and three sources only, where the relatively simple geometrical structures of the two cases are utilized. To facilitate the development, Section VI-A provides an alternate form of the CAMNS criterion. Then, Sections VI-B and VI-C describe the geometric algorithms for two and three sources, respectively.

A. An Alternate nBSS Criterion

Let us consider the pre-image of S under the mapping s = Cα + d, denoted by

F = { α ∈ R^{N−1} | Cα + d ⪰ 0 }
  = { α ∈ R^{N−1} | c_n^T α + d_n ≥ 0, n = 1, ..., L }     (28)

where c_n^T is the nth row of C. There is a direct correspondence between the extreme points of S and F, as described in the following lemma:

Lemma 5. The polyhedral set F in (28) is equivalent to an (N−1)-simplex

F = conv{α_1, ..., α_N}     (29)

where each α_i ∈ R^{N−1} satisfies

Cα_i + d = s_i.     (30)

The proof of Lemma 5 is given in Appendix F. Hence, as an alternative to our previously proposed approach where we find {s_1, ..., s_N} by identifying the extreme points of S, we can achieve perfect


blind separation by identifying the extreme points of F. In this alternative, we are aided by a useful convex analysis result, presented as follows:

Lemma 6. (Extreme point validation for polyhedra [31]) For a polyhedral set in the form of (28), a point α ∈ F is an extreme point of F if and only if the following collection of vectors

C_α = { c_n ∈ R^{N−1} | c_n^T α = −d_n, n = 1, ..., L }     (31)

contains N − 1 linearly independent vectors.
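Lemma 6 reduces extreme-point validation to a rank test on the constraints active at α. A minimal numerical sketch (the helper name `is_extreme_point` and the tolerance handling are our own additions):

```python
import numpy as np

def is_extreme_point(alpha, C, d, tol=1e-9):
    """Lemma 6: alpha in F is an extreme point iff the rows of C active at
    alpha (those with c_n^T alpha = -d_n) contain N-1 linearly independent
    vectors, i.e., they span R^{N-1}."""
    residual = C @ alpha + d             # must be >= 0 for alpha to lie in F
    if np.any(residual < -tol):
        return False                     # alpha is not even feasible
    active = C[np.abs(residual) <= tol]  # rows with c_n^T alpha + d_n = 0
    return np.linalg.matrix_rank(active) == C.shape[1] if active.size else False
```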

In summary, an alternate version of the BSS criterion in Theorem 1 is given as follows:

Theorem 2. (Alternate nBSS criterion by CAMNS) Under (A1) to (A4), the set

F = { α ∈ R^{N−1} | Cα + d ⪰ 0 }     (32)

where (C, d) are obtained from the observations x_1, ..., x_M via the affine set fitting solution in Proposition 1, has N extreme points α_1, ..., α_N. Each extreme point corresponds to a source signal through the relationship Cα_i + d = s_i. The extreme points of F may be validated by the procedure in Lemma 6.

Proof: Theorem 2 directly follows from Theorem 1, Lemma 5, and Lemma 6.

Based on Theorem 2, we propose geometry-based methods for the cases of two and three sources in the following two subsections.

B. Two Sources: A Simple Closed-Form Solution

For the case of N = 2, the set F is simply

F = { α ∈ R | c_n α + d_n ≥ 0, n = 1, ..., L }.     (33)

By Lemma 6, α is extremal if and only if

α = −d_n/c_n,  c_n ≠ 0,  α ∈ F     (34)

for some n ∈ {1, ..., L}. From (33) we see that α ∈ F implies the following two conditions:

α ≥ −d_n/c_n  for all n such that c_n > 0,     (35)
α ≤ −d_n/c_n  for all n such that c_n < 0.     (36)


We therefore conclude from (34), (35), and (36) that the two extreme points, which satisfy α_1 > α_2, are given by

α_1 = min{ −d_n/c_n | c_n < 0, n = 1, 2, ..., L },     (37)
α_2 = max{ −d_n/c_n | c_n > 0, n = 1, 2, ..., L },     (38)

which are simple closed-form expressions.
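The closed forms (37)-(38) amount to two one-line reductions over the rows of (c, d). A sketch in NumPy (the function name is ours; we assume F is a bounded interval so both index sets are non-empty):

```python
import numpy as np

def extreme_points_two_sources(c, d):
    """Closed-form extreme points (37)-(38) for N = 2, where F is the
    interval defined by c[n]*alpha + d[n] >= 0, n = 1, ..., L.
    Returns (alpha1, alpha2) with alpha1 > alpha2 when F is bounded."""
    neg, pos = c < 0, c > 0
    alpha1 = np.min(-d[neg] / c[neg])   # (37): min over n with c_n < 0
    alpha2 = np.max(-d[pos] / c[pos])   # (38): max over n with c_n > 0
    return alpha1, alpha2
```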

C. Three Sources: A Semi-Analytic Solution

α1

α2

α3

F

Hk1

Hk2

Hk3

Fig. 4. Geometric illustration ofF for N = 3.

For N = 3, the set F is a triangle in R². By exploiting the relatively simple geometry of F, we can locate the extreme points very effectively. Figure 4 shows the geometry of F in this case. We see that the boundary of F is entirely constituted by the three facets conv{α_1, α_2}, conv{α_1, α_3}, and conv{α_2, α_3}. They may be represented by the following polyhedral expressions:

conv{α_1, α_2} = F ∩ H_{k1},     (39)
conv{α_1, α_3} = F ∩ H_{k2},     (40)
conv{α_2, α_3} = F ∩ H_{k3},     (41)

for some k_1, k_2, k_3 ∈ {1, ..., L}, where

H_k = { α ∈ R² | c_k^T α = −d_k }.     (42)

Suppose that (k_1, k_2, k_3) is known. Then, by Lemma 6, the three extreme points are given in closed form by

α_1 = − [ c_{k1}^T ; c_{k2}^T ]^{−1} [ d_{k1} ; d_{k2} ],     (43a)
α_2 = − [ c_{k1}^T ; c_{k3}^T ]^{−1} [ d_{k1} ; d_{k3} ],     (43b)
α_3 = − [ c_{k2}^T ; c_{k3}^T ]^{−1} [ d_{k2} ; d_{k3} ],     (43c)

where [x ; y] denotes the 2 × 2 matrix with rows x and y.

A geometric interpretation of (43) is as follows: an extreme point is the intersection of any two of the facet-forming lines H_{k1}, H_{k2}, and H_{k3}. This can also be seen in Figure 4.
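Each of (43a)-(43c) is a 2 × 2 linear solve. A sketch (helper name ours):

```python
import numpy as np

def extreme_point_from_facets(ck_a, dk_a, ck_b, dk_b):
    """Intersect two facet lines H_{k_a} and H_{k_b} as in (43):
    alpha = -[c_{k_a}^T ; c_{k_b}^T]^{-1} [d_{k_a} ; d_{k_b}].
    For N = 3, each c_k is a length-2 vector, so the stacked matrix is 2x2."""
    A = np.vstack([ck_a, ck_b])
    return -np.linalg.solve(A, np.array([dk_a, dk_b]))
```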

Inspired by the fact that finding all the facets is equivalent to finding all the extreme points, we propose an extreme-point finding heuristic in Table II. The idea behind it is illustrated in Figure 5. In the first stage [Figure 5(a)], we move from some interior point α′ along a direction r until we reach the boundary. This process identifies one facet line, say H_{k1}. Similarly, by travelling in the direction opposite to r, we may locate another facet line, say H_{k2}. Intersecting H_{k1} and H_{k2} yields an extreme point α_1. In the second stage [Figure 5(b)], we use the same idea to locate the last facet line H_{k3}, by travelling along the direction α′ − α_1.

Fig. 5. Illustration of the operations of the geometric extreme-point finding algorithm for N = 3: (a) boundary points ψ_1 and ψ_2 found along ±r from α′; (b) boundary point ψ_3 found along α′ − α_1.

The proposed algorithm requires an interior point α′ as the starting point for the facet search. Sometimes the problem nature allows us to determine such a point easily. For instance, if the mixing matrix is componentwise non-negative, i.e., a_{ij} ≥ 0 for all i, j, then it can be verified that α′ = 0 is interior to F. When an interior point is not known, we can find one numerically by solving the LP

max_{α, β}  β
s.t.  c_n^T α + d_n ≥ β,  n = 1, ..., L.     (44)

(This is known as the phase I method in optimization [30].)
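The phase-I LP (44) is easily expressed in a standard solver over the stacked variable x = [α; β]. A sketch using SciPy's `linprog` (the helper name and the strict-interiority check β > 0 are our own):

```python
import numpy as np
from scipy.optimize import linprog

def interior_point(C, d):
    """Phase-I LP (44): maximize beta s.t. c_n^T alpha + d_n >= beta for
    all n. Returns a strictly interior point of F if one exists (i.e., the
    optimal beta is positive), else None."""
    L, m = C.shape
    # c_n^T alpha + d_n >= beta  <=>  -c_n^T alpha + beta <= d_n
    A_ub = np.hstack([-C, np.ones((L, 1))])
    res = linprog(c=np.r_[np.zeros(m), -1.0],   # linprog minimizes -beta
                  A_ub=A_ub, b_ub=d,
                  bounds=[(None, None)] * (m + 1))
    if not res.success:
        return None
    alpha, beta = res.x[:m], res.x[m]
    return alpha if beta > 0 else None
```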


TABLE II
A SUMMARY OF THE GEOMETRIC EXTREME-POINT FINDING ALGORITHM

Geometric Extreme-Point Finding Algorithm for N = 3
Given an affine set characterization 2-tuple (C, d), and a vector α′ interior to F.
Step 1. Randomly generate a direction r ∼ N(0, I_2).
Step 2. Locate a boundary point ψ_1 = α′ + t_1 r, where
        t_1 = sup{ t | α′ + t r ∈ F }
            = min{ −(c_n^T α′ + d_n)/(c_n^T r) | c_n^T r < 0, n = 1, ..., L }.
Step 3. Find the index set
        K_1 = { n | c_n^T ψ_1 = −d_n, n = 1, ..., L }.
        If {c_k | k ∈ K_1} contains 2 linearly independent vectors (i.e., ψ_1 is an extreme point and there is an indeterminacy in that H_k ≠ H_l for some k, l ∈ K_1), then go to Step 1.
Step 4. Locate a boundary point ψ_2 = α′ + t_2 r, where
        t_2 = inf{ t | α′ + t r ∈ F }
            = max{ −(c_n^T α′ + d_n)/(c_n^T r) | c_n^T r > 0, n = 1, ..., L }.
Step 5. Find the index set
        K_2 = { n | c_n^T ψ_2 = −d_n, n = 1, ..., L }.
        If {c_k | k ∈ K_2} contains 2 linearly independent vectors, then go to Step 1.
Step 6. Determine
        α_1 = − [ c_{k1}^T ; c_{k2}^T ]^{−1} [ d_{k1} ; d_{k2} ]
        for an arbitrary k_1 ∈ K_1 and k_2 ∈ K_2.
Step 7. Set r = α′ − α_1, and locate a boundary point ψ_3 = α′ + t_3 r, where
        t_3 = sup{ t | α′ + t r ∈ F }
            = min{ −(c_n^T α′ + d_n)/(c_n^T r) | c_n^T r < 0, n = 1, ..., L }.
Step 8. Find the index set
        K_3 = { n | c_n^T ψ_3 = −d_n, n = 1, ..., L }.
Step 9. Determine
        α_2 = − [ c_{k1}^T ; c_{k3}^T ]^{−1} [ d_{k1} ; d_{k3} ],   α_3 = − [ c_{k2}^T ; c_{k3}^T ]^{−1} [ d_{k2} ; d_{k3} ],
        for an arbitrary k_1 ∈ K_1, k_2 ∈ K_2, and k_3 ∈ K_3.
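The steps of Table II can be sketched as follows. This is an illustrative simplification, not the authors' code: we restart whenever a boundary hit activates more than one constraint, which subsumes the linear-independence checks of Steps 3 and 5 in the non-degenerate case, and all names and tolerances are our own.

```python
import numpy as np

def boundary_hit(alpha0, r, C, d, forward=True, tol=1e-9):
    """Steps 2/4/7 of Table II: walk from the interior point alpha0 along r
    (forward) or -r (backward) until the boundary of F is reached.
    Returns the boundary point and the index set of active constraints."""
    num = -(C @ alpha0 + d)          # -(c_n^T alpha0 + d_n), negative inside F
    den = C @ r                      # c_n^T r
    if forward:                      # t = sup{t | alpha0 + t r in F}
        t = np.min(num[den < -tol] / den[den < -tol])
    else:                            # t = inf{t | alpha0 + t r in F}
        t = np.max(num[den > tol] / den[den > tol])
    psi = alpha0 + t * r
    K = np.where(np.abs(C @ psi + d) <= 1e-6)[0]
    return psi, K

def geometric_extreme_points(C, d, alpha0, seed=0):
    """Sketch of the Table II heuristic for N = 3; alpha0 must be interior to F."""
    rng = np.random.default_rng(seed)
    while True:
        r = rng.standard_normal(2)                        # Step 1
        _, K1 = boundary_hit(alpha0, r, C, d, True)       # Steps 2-3
        _, K2 = boundary_hit(alpha0, r, C, d, False)      # Steps 4-5
        if len(K1) == 1 and len(K2) == 1:                 # restart on degenerate hits
            break
    vertex = lambda ka, kb: -np.linalg.solve(np.vstack([C[ka], C[kb]]),
                                             np.array([d[ka], d[kb]]))
    k1, k2 = K1[0], K2[0]
    a1 = vertex(k1, k2)                                   # Step 6, cf. (43a)
    _, K3 = boundary_hit(alpha0, alpha0 - a1, C, d, True) # Steps 7-8
    k3 = K3[0]
    return a1, vertex(k1, k3), vertex(k2, k3)             # Step 9
```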


VII. SIMULATIONS

To demonstrate the efficacy of the CAMNS-based algorithms, four simulation results are presented here. Section VII-A is an X-ray image example where our task is to distinguish bone structures from soft tissue. Section VII-B considers a benchmark problem [2] in which the sources are faces of three different persons. Section VII-C focuses on a challenging scenario reminiscent of ghosting effects in photography. Section VII-D uses Monte Carlo simulation to evaluate the performance of the CAMNS-based algorithms under noisy conditions. For performance comparison, we also test three existing nBSS algorithms, namely non-negative least-correlated component analysis (nLCA) [26], non-negative matrix factorization (NMF) [11], and non-negative independent component analysis (nICA) [7].

The performance measure used in this paper is described as follows. Let S = [s_1, ..., s_N] be the true multi-source signal matrix, and Ŝ = [ŝ_1, ..., ŝ_N] be the multi-source output of a BSS algorithm. It is well known that a BSS algorithm is inherently subject to permutation and scaling ambiguities. We propose a sum square error (SSE) measure for S and Ŝ [39], [40], given as follows:

e(S, Ŝ) = min_{π ∈ Π_N}  Σ_{i=1}^N ‖ s_i − (‖s_i‖ / ‖ŝ_{π_i}‖) ŝ_{π_i} ‖²     (45)

where π = (π_1, ..., π_N), and Π_N = { π | π_i ∈ {1, 2, ..., N}, π_i ≠ π_j for i ≠ j } is the set of all permutations of {1, 2, ..., N}. The optimization in (45) adjusts the permutation π so as to yield the best match between the true and estimated signals, while the factor ‖s_i‖/‖ŝ_{π_i}‖ fixes the scaling ambiguity. Problem (45) is an optimal assignment problem, which can be efficiently solved by the Hungarian algorithm¹ [41].
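The SSE (45), with the permutation found by the Hungarian algorithm, can be sketched as follows (we use SciPy's `linear_sum_assignment` as the assignment solver; the function name is ours, and sources are stored as columns):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def sse(S_true, S_est):
    """SSE measure (45): optimally permute and rescale the estimated sources
    before summing squared errors. Columns of S_true / S_est are sources."""
    N = S_true.shape[1]
    cost = np.empty((N, N))
    for i in range(N):
        for j in range(N):
            # scaling factor ||s_i|| / ||s_hat_j|| from (45)
            scale = np.linalg.norm(S_true[:, i]) / np.linalg.norm(S_est[:, j])
            cost[i, j] = np.sum((S_true[:, i] - scale * S_est[:, j]) ** 2)
    row, col = linear_sum_assignment(cost)   # Hungarian algorithm
    return cost[row, col].sum()
```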

A. Example of N = M = 2: Dual-energy Chest X-ray Imaging

Dual-energy chest x-ray imaging is clinically used for detecting calcified granuloma, a symptom of lung nodules [42]. The diagnostic images are acquired from two stacked detectors separated by a copper filter, through which x-rays at two different energies are passed. For visualizing the symptom of calcified granuloma, it is necessary to separate bone structures and soft tissue from the diagnostic images.

In this simulation we have two 164 × 164 source images, one representing bone structure and the other soft tissue. The two images can be found in [43], and they are displayed in Figure 6(a). Each image is represented by a source vector s_i ∈ R^L by scanning the image vertically from top left to bottom right (thereby L = 164² = 26896). We found that the two source signals satisfy the local dominance

¹A MATLAB implementation is available at http://si.utia.cas.cz/Tichavsky.html.


assumption [or (A2)] perfectly, by numerical inspection. The observation vectors, or the diagnostic images, are synthetically generated using the mixing matrix

A = [ 0.55  0.45
      0.63  0.37 ].     (46)

The mixed images are shown in Figure 6(b). The separated images of the various nBSS methods are illustrated in Figure 6(c)-(g). By visual inspection, the CAMNS-based methods, nICA, and nLCA appear to work equally well. In Table IV the various methods are quantitatively compared using the SSE in (45). The table suggests that the CAMNS-based methods, along with nLCA, achieve perfect separation.

B. Example of N = M = 3: Human Face Separation

Three 128 × 128 human face images, taken from the benchmarks in [2], are used to generate three observations. The mixing matrix is

A = [ 0.20  0.62  0.18
      0.35  0.37  0.28
      0.40  0.40  0.20 ].     (47)

In this example, the local dominance assumption is not perfectly satisfied. To shed some light on this, we propose a measure called the local dominance proximity factor (LDPF) of the ith source, defined as follows:

κ_i = max_{n=1,...,L}  s_i[n] / Σ_{j≠i} s_j[n].     (48)

When κ_i = ∞, the ith source satisfies the local dominance assumption perfectly. The values of κ_i in this example are shown in Table III, where we see that the LDPFs of the three sources are large but not infinite.
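The LDPF (48) is directly computable from the source matrix; a sketch (the function name and the 0/0 convention are ours):

```python
import numpy as np

def ldpf(S):
    """Local dominance proximity factors (48). S holds the non-negative
    sources as columns; returns kappa_i = max_n s_i[n] / sum_{j != i} s_j[n]
    for each source i. kappa_i = inf flags a perfectly locally dominant source."""
    others = S.sum(axis=1, keepdims=True) - S    # sum over j != i, per sample n
    with np.errstate(divide="ignore", invalid="ignore"):
        ratios = S / others                      # inf where s_i[n] > 0, others = 0
    ratios = np.where((S == 0) & (others == 0), 0.0, ratios)  # define 0/0 as 0
    return ratios.max(axis=0)
```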

Figure 7 shows the separated images of the various nBSS methods. We see that the CAMNS-based methods and nLCA provide good separation, despite the fact that the local dominance assumption is not perfectly satisfied. This result indicates that the CAMNS-based methods have some robustness against violation of local dominance. Moreover, nICA works poorly owing to the violation of its uncorrelated-sources assumption. The SSE performance of the various methods is given in Table IV, from which we make two observations. First, the CAMNS-LP method yields the best performance among all the methods under test. Second, the LP method performs better than the CAMNS-geometric method. The latter suggests that the LP method is more robust than the geometric method when local dominance is not exactly satisfied. This result will be further confirmed in the Monte Carlo simulation in Section VII-D.


Fig. 6. Dual-energy chest x-ray imaging: (a) the sources, (b) the observations, and the extracted sources obtained by (c) the CAMNS-LP method, (d) the CAMNS-geometric method, (e) nLCA, (f) NMF, and (g) nICA.

C. Example of M = N = 4: Ghosting Effect

We take a 285 × 285 Lena image from [2] as one source and then shift it diagonally to create three more sources; see Figure 8(a). Apparently, these sources are strongly correlated. Even worse, their LDPFs,


Fig. 7. Human face separation: (a) the sources, (b) the observations, and the extracted sources obtained by (c) the CAMNS-LP method, (d) the CAMNS-geometric method, (e) nLCA, (f) NMF, and (g) nICA.


shown in Table III, are not too satisfactory compared with the previous two examples. The mixing matrix is

A = [ 0.02  0.37  0.31  0.30
      0.31  0.21  0.26  0.22
      0.05  0.38  0.28  0.29
      0.33  0.23  0.21  0.23 ].     (49)

Figure 8(b) displays the observations, where the mixing effect is reminiscent of the ghosting effect in analog television. The image separation results are illustrated in Figure 8(c)-(f). Clearly, only the CAMNS-LP method and nLCA provide sufficiently good mitigation of the “ghosts”. This result once again suggests that the CAMNS-based methods (as well as nLCA) are not too sensitive to violation of local dominance. Comparing the CAMNS-LP method with nLCA, one may find, upon very careful visual inspection, that the nLCA-separated images have some ghosting residuals; for the proposed method, we argue that the residuals are harder to notice. Moreover, our numerical inspection revealed that the nLCA signal outputs sometimes have negative values. For these reasons, we see in Table IV that the SSE of nLCA is about 10 dB larger than that of the CAMNS-LP method.

TABLE III
LOCAL DOMINANCE PROXIMITY FACTORS κ_i IN THE THREE SCENARIOS

                        source 1   source 2   source 3   source 4
Dual-energy X-ray           ∞          ∞          -          -
Human face separation    15.625      6.172     10.000        -
Ghosting reduction        2.133      2.385      2.384      2.080

TABLE IV
THE SSES e(S, Ŝ) (IN dB) OF THE VARIOUS nBSS METHODS IN THE THREE SCENARIOS

                        CAMNS-LP   CAMNS-Geometric      nLCA       NMF      nICA
Dual-energy X-ray      -252.2154       -247.2876   -259.0132   30.8372   24.4208
Human face separation     9.4991         18.2349     19.6589   32.0085   38.5425
Ghosting reduction       20.7535            -        31.3767   38.6202   41.8963


Fig. 8. Ghosting reduction: (a) the sources, (b) the observations, and the extracted sources obtained by (c) the CAMNS-LP method, (d) nLCA, (e) NMF, and (f) nICA.


D. Example of M = 6, N = 3: Noisy Environment

We use Monte Carlo simulation to test the performance of the various methods when noise is present. The three face images in Figure 7(a) were used to generate six noisy observations. The noise is independently and identically distributed (i.i.d.), following a Gaussian distribution with zero mean and variance σ². To maintain non-negativity of the observations in the simulation, we force the negative noisy observations to zero. We performed 100 independent runs. At each run the mixing matrix was i.i.d. uniformly generated on [0,1], and each row was then normalized to unit sum to maintain (A3). The average errors e(S, Ŝ) for different SNRs (defined here as SNR = Σ_{i=1}^N ‖s_i‖² / (LNσ²)) are shown in Figure 9. One can see that the CAMNS-LP method performs better than the other methods.
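The Monte Carlo setup described above (uniform mixing matrix with unit row sum, additive Gaussian noise, clipping to preserve non-negativity) can be sketched as follows; the function name and seeding are our own:

```python
import numpy as np

def noisy_observations(S, M, sigma, seed=0):
    """Generate M noisy observations as in Section VII-D: the rows of A are
    drawn i.i.d. uniform on [0,1] and normalized to unit sum (A3); zero-mean
    Gaussian noise of variance sigma^2 is added; negative entries are forced
    to zero to keep the observations non-negative."""
    rng = np.random.default_rng(seed)
    L, N = S.shape
    A = rng.random((M, N))
    A /= A.sum(axis=1, keepdims=True)        # unit row sum, per (A3)
    X = S @ A.T + sigma * rng.standard_normal((L, M))
    return np.maximum(X, 0.0), A             # clip negative noisy samples
```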

Fig. 9. Performance evaluation (average e(S, Ŝ) versus SNR) of the CAMNS-based methods, nLCA, NMF, and nICA for the human face images experiment under noisy conditions.

VIII. CONCLUSION

We have developed a convex analysis based framework for non-negative blind source separation. The core of the framework is a new nBSS criterion, which guarantees perfect separation under some assumptions [see (A1)-(A4)] that are realistic in many applications such as multichannel biomedical imaging. To practically realize this result, we have proposed a systematic LP-based method for fulfilling the criterion. A side benefit is that the LP method deals with linear optimization, which can be solved efficiently and does not suffer from local minima. Moreover, we have used simplex geometry to establish a computationally very cheap alternative to the LP method. Our current development has led to two simple geometric algorithms, for two and three sources respectively. Future work should consider extending the geometric approach to four sources and beyond. We anticipate that the extension would be increasingly complex in a combinatorial manner. By contrast, the comparatively more expensive LP method does not have such trouble per se, and is applicable to any number of sources.

We have also performed extensive simulations to evaluate the separation performance of the CAMNS-based methods under several scenarios, including x-ray imaging, human portraits, and ghosting. The results indicate that the LP method offers the best performance among the various methods under test.

APPENDIX

A. Proof of Lemma 1

Any x ∈ aff{x_1, ..., x_M} can be represented by

x = Σ_{i=1}^M θ_i x_i,     (50)

where θ ∈ R^M, θ^T 1 = 1. Substituting (2) into (50), we get

x = Σ_{j=1}^N β_j s_j,     (51)

where β_j = Σ_{i=1}^M θ_i a_{ij} for j = 1, ..., N, or equivalently

β = A^T θ.     (52)

Since A has unit row sum [(A3)], we have

β^T 1 = θ^T (A1) = θ^T 1 = 1.     (53)

This implies that β^T 1 = 1, and as a result it follows from (51) that x ∈ aff{s_1, ..., s_N}.

On the other hand, any x ∈ aff{s_1, ..., s_N} can be represented by (51) with β^T 1 = 1. Since A has full column rank [(A4)], there always exists a θ such that (52) holds. Substituting (52) into (51) yields (50). Since (53) implies that θ^T 1 = 1, we conclude that x ∈ aff{x_1, ..., x_M}.

B. Proof of Proposition 1

As a basic result in least squares, each projection error in (13),

e_{A(C,d)}(x_i) = min_{α ∈ R^{N−1}} ‖ Cα + d − x_i ‖₂²,     (54)

has the closed form

e_{A(C,d)}(x_i) = (x_i − d)^T P_C^⊥ (x_i − d),     (55)

where P_C^⊥ is the orthogonal complement projector of C. Using (55), we can therefore rewrite the affine set fitting problem [in (13)] as

min_{C^T C = I} { min_d J(C, d) }     (56)

where

J(C, d) = Σ_{i=1}^M (x_i − d)^T P_C^⊥ (x_i − d).     (57)

The inner minimization problem in (56) is an unconstrained convex quadratic program, and it can be easily verified that d = (1/M) Σ_{i=1}^M x_i is an optimal solution to it. By substituting this optimal d into (56) and letting U = [x_1 − d, ..., x_M − d], problem (56) can be reduced to

min_{C^T C = I_{N−1}} Trace{ U^T P_C^⊥ U }.     (58)

When C^T C = I_{N−1}, the projection matrix P_C^⊥ simplifies to I_L − CC^T. Subsequently (58) can be further reduced to

max_{C^T C = I_{N−1}} Trace{ U^T C C^T U }.     (59)

An optimal solution of (59) is known to be the matrix of the N − 1 principal eigenvectors of UU^T [44].
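The resulting affine set fitting solution (sample mean for d, principal eigenvectors for C) can be sketched as follows; the function name is ours, and we obtain the principal eigenvectors of UU^T via an SVD of U:

```python
import numpy as np

def affine_set_fitting(X, N):
    """Proposition 1 / Appendix B: fit the (N-1)-dimensional affine set
    {C alpha + d} to the observations (columns of X). d is the sample mean;
    C holds the N-1 principal left singular vectors of U = X - d, which are
    the N-1 principal eigenvectors of U U^T."""
    d = X.mean(axis=1)
    U = X - d[:, None]
    Q, _, _ = np.linalg.svd(U, full_matrices=False)
    return Q[:, : N - 1], d
```

By Lemma 1, when A has unit row sum and full column rank, the fitted affine set also contains the sources themselves.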

C. Proof of Lemma 2

Assume that z ∈ aff{s_1, ..., s_N} ∩ R^L_+, i.e.,

z = Σ_{i=1}^N θ_i s_i ⪰ 0,  1^T θ = 1.

From (A2), it follows that z[ℓ_i] = θ_i s_i[ℓ_i] ≥ 0 for all i. Since s_i[ℓ_i] > 0, we must have θ_i ≥ 0 for all i. Therefore, z lies in conv{s_1, ..., s_N}. On the other hand, assume that z ∈ conv{s_1, ..., s_N}, i.e.,

z = Σ_{i=1}^N θ_i s_i,  1^T θ = 1,  θ ⪰ 0,

implying that z ∈ aff{s_1, ..., s_N}. From (A1), we have s_i ⪰ 0 for all i, and subsequently z ⪰ 0. This completes the proof of (20).

D. Proof of Lemma 3

Any point in S = conv{s_1, ..., s_N} can be equivalently represented by s = Σ_{i=1}^N θ_i s_i, where θ ⪰ 0 and θ^T 1 = 1. Applying this result to (22), problem (22) can be reformulated as

min_{θ ∈ R^N}  Σ_{i=1}^N θ_i ρ_i
s.t.  θ^T 1 = 1,  θ ⪰ 0,     (60)

where ρ_i = r^T s_i. We assume without loss of generality that ρ_1 ≤ ρ_2 ≤ ··· ≤ ρ_N. If ρ_1 < ρ_2 < ··· < ρ_N, then it is easy to verify that the optimal solution to (60) is uniquely given by θ⋆ = e_1. In its counterpart in (22), this translates into s⋆ = s_1. But when ρ_1 = ρ_2 = ··· = ρ_P and ρ_P < ρ_{P+1} ≤ ··· ≤ ρ_N for some P, the solution of (60) is not unique. In essence, the latter case can be shown to have the solution set

Θ = { θ | θ^T 1 = 1, θ ⪰ 0, θ_{P+1} = ··· = θ_N = 0 }.     (61)

We now prove that the non-unique-solution case happens with probability zero. Suppose that ρ_i = ρ_j for some i ≠ j, which means that

(s_i − s_j)^T r = 0.     (62)

Let v = (s_i − s_j)^T r. Since r is zero-mean Gaussian with identity covariance, v follows the distribution N(0, ‖s_i − s_j‖²). Since s_i ≠ s_j, the event v = 0 has measure zero, and hence Pr[ρ_i = ρ_j] = Pr[v = 0] = 0. This in turn implies that ρ_1 < ρ_2 < ··· < ρ_N holds with probability 1.

E. Proof of Lemma 4

The approach to proving Lemma 4 is similar to that for Lemma 3. Let

ρ_i = r^T s_i = (Bw)^T s_i,     (63)

for which we have ρ_i = 0 for i = 1, ..., l. It can be shown that

ρ_{l+1} < ρ_{l+2} < ··· < ρ_N     (64)

holds with probability 1, as long as {s_1, ..., s_N} is linearly independent. Problems (22) and (24) are respectively equivalent to

p⋆ = min_{θ ∈ R^N}  Σ_{i=l+1}^N θ_i ρ_i   s.t.  θ ⪰ 0,  θ^T 1 = 1,     (65)

q⋆ = max_{θ ∈ R^N}  Σ_{i=l+1}^N θ_i ρ_i   s.t.  θ ⪰ 0,  θ^T 1 = 1.     (66)

Assuming (64), we have three distinct cases to consider:

(C1) ρ_{l+1} < 0, ρ_N < 0,
(C2) ρ_{l+1} < 0, ρ_N > 0,


(C3) ρ_{l+1} > 0, ρ_N > 0.

For (C2), we can see the following: problem (65) has the unique optimal solution θ⋆ = e_{l+1} [and s⋆ = s_{l+1} in its counterpart in (22)], attaining the optimal value p⋆ = ρ_{l+1} < 0; problem (66) has the unique optimal solution θ⋆ = e_N [and s⋆ = s_N in its counterpart in (24)], attaining the optimal value q⋆ = ρ_N > 0. In other words, both (65) and (66) lead to the finding of new extreme points. For (C1), it is still true that (65) finds a new extreme point with p⋆ < 0. However, problem (66) can be shown to have the solution set

Θ = { θ | θ^T 1 = 1, θ ⪰ 0, θ_{l+1} = ··· = θ_N = 0 },     (67)

which contains convex combinations of the old extreme points, and the optimal value is q⋆ = 0. A similar situation happens with (C3), where (66) finds a new extreme point with q⋆ > 0 while (65) does not, with p⋆ = 0.

F. Proof of Lemma 5

Equation (28) can also be expressed as

F = { α ∈ R^{N−1} | Cα + d ∈ conv{s_1, ..., s_N} }.

Thus, every α ∈ F satisfies

Cα + d = Σ_{i=1}^N θ_i s_i     (68)

for some θ ⪰ 0, θ^T 1 = 1. Since C has full column rank, (68) can be re-expressed as

α = Σ_{i=1}^N θ_i α_i,     (69)

where α_i = (C^T C)^{−1} C^T (s_i − d) (or Cα_i + d = s_i). Equation (69) implies that F = conv{α_1, ..., α_N}. Now, assume that {α_1, ..., α_N} is affinely dependent; then there must exist an extreme point, say α_N = Σ_{i=1}^{N−1} γ_i α_i with Σ_{i=1}^{N−1} γ_i = 1. One then has s_N = Cα_N + d = Σ_{i=1}^{N−1} γ_i s_i with Σ_{i=1}^{N−1} γ_i = 1, which implies that {s_1, ..., s_N} is affinely dependent (a contradiction). Thus, the set F is an (N−1)-simplex.

REFERENCES

[1] A. Hyvarinen, J. Karhunen, and E. Oja,Independent Component Analysis. New York: John Wiley, 2001.

[2] A. Cichocki and S. Amari,Adaptive Blind Signal and Image Processing. John Wiley and Sons, Inc., 2002.

[3] E. R. Malinowski,Factor Analysis in Chemistry. New York: John Wiley, 2002.

[4] D. Nuzillard and J.-M. Nuzillard, “Application of blind source separation to 1-D and 2-D nuclear magnetic resonance spectroscopy,” IEEE Signal Process. Lett., vol. 5, no. 8, pp. 209–211, Feb. 1998.


[5] J. M. P. Nascimento and J. M. B. Dias, “Does independent component analysis play a role in unmixing hyperspectral data?” IEEE Trans. Geosci. Remote Sensing, vol. 43, no. 1, pp. 175–187, Jan. 2005.
[6] Y. Wang, J. Xuan, R. Srikanchana, and P. L. Choyke, “Modeling and reconstruction of mixed functional and molecular patterns,” International Journal of Biomedical Imaging, vol. 2006, pp. 1–9, 2006.
[7] M. D. Plumbley, “Algorithms for non-negative independent component analysis,” IEEE Trans. Neural Netw., vol. 14, no. 3, pp. 534–543, 2003.
[8] S. A. Astakhov, H. Stogbauer, A. Kraskov, and P. Grassberger, “Monte Carlo algorithm for least dependent non-negative mixture decomposition,” Analytical Chemistry, vol. 78, no. 5, pp. 1620–1627, 2006.
[9] S. Moussaoui, D. Brie, A. Mohammad-Djafari, and C. Carteret, “Separation of non-negative mixture of non-negative sources using a Bayesian approach and MCMC sampling,” IEEE Trans. Signal Process., vol. 54, no. 11, pp. 4133–4145, Nov. 2006.

[10] R. Tauler and B. Kowalski, “Multivariate curve resolution applied to spectral data from multiple runs of an industrial process,” Anal. Chem., vol. 65, pp. 2040–2047, 1993.
[11] D. Lee and H. S. Seung, “Learning the parts of objects by non-negative matrix factorization,” Nature, vol. 401, pp. 788–791, Oct. 1999.
[12] A. Belouchrani, K. Meraim, J.-F. Cardoso, and E. Moulines, “A blind source separation technique using second order statistics,” IEEE Trans. Signal Process., vol. 45, no. 2, pp. 434–444, Feb. 1997.
[13] A. Hyvarinen and E. Oja, “A fixed-point algorithm for independent component analysis,” Neur. Comput., vol. 9, pp. 1483–1492, 1997.
[14] M. S. Bartlett, J. R. Movellan, and T. J. Sejnowski, “Face recognition by independent component analysis,” IEEE Trans. Neural Netw., vol. 13, no. 6, pp. 1450–1464, Nov. 2002.

[15] C. Lawson and R. J. Hanson, Solving Least-Squares Problems. New Jersey: Prentice-Hall, 1974.
[16] P. O. Hoyer, “Non-negative matrix factorization with sparseness constraints,” Journal of Machine Learning Research, vol. 5, pp. 1457–1469, 2004.
[17] W.-K. Ma, T. Davidson, K. Wong, Z. Luo, and P. Ching, “Quasi-maximum-likelihood multiuser detection using semidefinite relaxation with application to synchronous CDMA,” IEEE Trans. Signal Process., vol. 50, no. 4, pp. 912–922, April 2002.
[18] D. P. Palomar, J. M. Cioffi, and M. A. Lagunas, “Joint Tx-Rx beamforming design for multicarrier MIMO channels: A unified framework for convex optimization,” IEEE Trans. Signal Process., vol. 51, no. 9, pp. 2381–2401, 2003.
[19] Z.-Q. Luo, T. N. Davidson, G. B. Giannakis, and K. M. Wong, “Transceiver optimization for block-based multiple access through ISI channels,” IEEE Trans. Signal Process., vol. 52, no. 4, pp. 1037–1052, 2004.
[20] Y. Ding, T. N. Davidson, Z.-Q. Luo, and K. M. Wong, “Minimum BER block precoders for zero-forcing equalization,” IEEE Trans. Signal Process., vol. 51, no. 9, pp. 2410–2423, 2003.
[21] A. Wiesel, Y. C. Eldar, and S. Shamai, “Linear precoding via conic optimization for fixed MIMO receivers,” IEEE Trans. Signal Process., vol. 54, no. 1, pp. 161–176, 2006.
[22] N. D. Sidiropoulos, T. N. Davidson, and Z.-Q. Luo, “Transmit beamforming for physical-layer multicasting,” IEEE Trans. Signal Process., vol. 54, no. 6, pp. 2239–2251, 2006.
[23] S. A. Vorobyov, A. B. Gershman, and Z.-Q. Luo, “Robust adaptive beamforming using worst-case performance optimization: A solution to the signal mismatch problem,” IEEE Trans. Signal Process., vol. 51, no. 2, pp. 313–324, 2003.


[24] P. Biswas, T.-C. Lian, T.-C. Wang, and Y. Ye, “Semidefinite programming based algorithms for sensor network localization,” ACM Trans. Sensor Networks, vol. 2, no. 2, pp. 188–220, 2006.
[25] F.-Y. Wang, Y. Wang, T.-H. Chan, and C.-Y. Chi, “Blind separation of multichannel biomedical image patterns by non-negative least-correlated component analysis,” in Lecture Notes in Bioinformatics (Proc. PRIB’06), Springer-Verlag, vol. 4146, Berlin, Dec. 9-14, 2006, pp. 151–162.
[26] F.-Y. Wang, C.-Y. Chi, T.-H. Chan, and Y. Wang, “Blind separation of positive dependent sources by non-negative least-correlated component analysis,” in IEEE International Workshop on Machine Learning for Signal Processing (MLSP’06), Maynooth, Ireland, Sept. 6-8, 2006, pp. 73–78.

[27] A. Prieto, C. G. Puntonet, and B. Prieto, “A neural learning algorithm for blind separation of sources based on geometric properties,” Signal Processing, vol. 64, pp. 315–331, 1998.
[28] A. T. Erdogan, “A simple geometric blind source separation method for bound magnitude sources,” IEEE Trans. Signal Process., vol. 54, no. 2, pp. 438–449, 2006.
[29] F. Vrins, J. A. Lee, and M. Verleysen, “A minimum-range approach to blind extraction of bounded sources,” IEEE Trans. Neural Netw., vol. 18, no. 3, pp. 809–822, 2006.

[30] S. Boyd and L. Vandenberghe,Convex Optimization. Cambridge Univ. Press, 2004.

[31] D. P. Bertsekas, A. Nedic, and A. E. Ozdaglar,Convex Analysis and Optimization. Athena Scientific, 2003.

[32] B. Grunbaum,Convex Polytopes. Springer, 2003.

[33] M. E. Dyer, “The complexity of vertex enumeration methods,” Mathematics of Operations Research, vol. 8, no. 3, pp.

381–402, 1983.

[34] K. G. Murty and S.-J. Chung, “Extreme point enumeration,” College of Engineering, University of Michigan, Technical Report 92-21, 1992. Available online: http://deepblue.lib.umich.edu/handle/2027.42/6731.
[35] K. Fukuda, T. M. Liebling, and F. Margot, “Analysis of backtrack algorithms for listing all vertices and all faces of a convex polyhedron,” Computational Geometry: Theory and Applications, vol. 8, no. 1, pp. 1–12, 1997.
[36] J. F. Sturm, “Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones,” Optimization Methods and Software, vol. 11-12, pp. 625–653, 1999.

[37] I. J. Lustig, R. E. Marsten, and D. F. Shanno, “Interior point methods for linear programming: Computational state of the

art,” ORSA Journal on Computing, vol. 6, no. 1, pp. 1–14, 1994.

[38] G. H. Golub and C. F. Van Loan, Matrix Computations. The Johns Hopkins University Press, 1996.
[39] P. Tichavsky and Z. Koldovsky, “Optimal pairing of signal components separated by blind techniques,” IEEE Signal Process. Lett., vol. 11, no. 2, pp. 119–122, 2004.
[40] J. R. Hoffman and R. P. S. Mahler, “Multitarget miss distance via optimal assignment,” IEEE Trans. Systems, Man, and Cybernetics, vol. 34, no. 3, pp. 327–336, May 2004.
[41] H. W. Kuhn, “The Hungarian method for the assignment problem,” Naval Research Logistics Quarterly, vol. 2, pp. 83–97, 1955.

[42] S. G. Armato, “Enhanced visualization and quantification of lung cancers and other diseases of the chest,” Experimental Lung Res., vol. 30, no. 30, pp. 72–77, 2004.
[43] K. Suzuki, R. Engelmann, H. MacMahon, and K. Doi, “Virtual dual-energy radiography: improved chest radiographs by means of rib suppression based on a massive training artificial neural network (MTANN),” Radiology, vol. 238. [Online]. Available: http://suzukilab.uchicago.edu/research.htm

[44] R. A. Horn and C. R. Johnson,Matrix Analysis. New York: Cambridge Univ. Press, 1985.
