binary principal component analysis

10
Binary Principal Component Analysis Feng Tang and Hai Tao University of California, Santa Cruz tang|[email protected] Abstract Efcient and compact representation of images is a fundamental problem in comp uter vision. Princ ipal Compone nt Anal ysis (PCA) has been widely used for image representation and has been successfully applied to many compu ter vision algo rithms. In this paper , we propose a method that uses Haar-like binary box functions to span a subspace which approximates the PCA subspa ce. The propose d method can perfor m vector dot produ ct very efciently using integral image. We also show that B-PCA base vectors are nearly orthogonal to each other. As a result, in the non-orthogonal vector de- composition process, the expensive pseudo-inverse projection operator can be approximated by the direct dot product without causing signicant dis- tance disto rtion. Exper iment s on real image datase ts show that B-PCA has comparable performance to PCA in image reconstruction and recognition tasks with signicant speed improvement. 1 Intr oduc ti on Principal component analysis(PCA) [3] is widely used in applications ranging from neu- roscience to computer vision [9, 1, 2, 4] to extract relevant information from high dimen- sional data sets. It reduces the dimension of a data set to reveal its essential characteristics and the subspac e captures the main structu re of the data. The main computa tion in PCA is the dot products of a data vect or with the PCA base vectors . This can be computa- tionally expensive especially when the original data dimension is high because it involves element-by-element oating point multiplications. This paper investigates the possibility of nding a subspace representation that is similar to PCA in terms capturing the essential data structure while avoiding the expensive oating point dot product operations. 1.1 Rel ated work In recent years, Haar-like box functions became a popul ar choice as image featur es due to the semin al face detectio n work of Viola and Jones [11]. Examp les of such featu res used in our method are shown in Figure 1. The main adva ntage of such featur es is that the inner product of a data vector with a box function can be performed by three or seven integer additions, instead of d oating point multiplications, where d is the dimension of the base vectors. In a more recent paper [7], the authors propos ed a metho d to repre sent images using these simple one- or two-box base vectors. Since the binary box base vec- tors are highly non-orthogonal to each other, the representation is called non-orthogonal binar y subs pace (NBS). The base vecto rs of NBS are selected using a greed y algor ithm 1

Upload: danishmjn

Post on 10-Apr-2018

239 views

Category:

Documents


0 download

TRANSCRIPT

8/8/2019 Binary Principal Component Analysis

http://slidepdf.com/reader/full/binary-principal-component-analysis 1/10

8/8/2019 Binary Principal Component Analysis

http://slidepdf.com/reader/full/binary-principal-component-analysis 2/10

8/8/2019 Binary Principal Component Analysis

http://slidepdf.com/reader/full/binary-principal-component-analysis 3/10

8/8/2019 Binary Principal Component Analysis

http://slidepdf.com/reader/full/binary-principal-component-analysis 4/10

8/8/2019 Binary Principal Component Analysis

http://slidepdf.com/reader/full/binary-principal-component-analysis 5/10

8/8/2019 Binary Principal Component Analysis

http://slidepdf.com/reader/full/binary-principal-component-analysis 6/10

8/8/2019 Binary Principal Component Analysis

http://slidepdf.com/reader/full/binary-principal-component-analysis 7/10

8/8/2019 Binary Principal Component Analysis

http://slidepdf.com/reader/full/binary-principal-component-analysis 8/10

Figure 8: B-PCA base vectors and its corresponding box functions for vehicle modeling.

could be similar. The test database comprises 64 aligned face images in the FERETdatabase. Each image is projected to the PCA and B-PCA subspace and the coefcientsare used as features. A nearest neighbor classier is used to nd the best match as therecognition result. We compared the PCA with B-PCA using the original projection

process which has the pseudo-inverse (ΨT

Ψ )− 1

ΨT

x and also that of direct dot prod-uct (DNP) Ψ T x for recognition, the results are shown in Fig. 9 and Fig. 10, in Fig. 9,the approximation threshold ζ is set to 0.9, in Fig. 10, ζ is 0.1. As can be observed,when ζ = 0.1, which means the B-PCA base vectors are more orthogonal, the differencebetween recognition rates using pseudo-inverse and DNP is very small. The DNP, Ψ T x,is computationally much cheaper because it only needs several additions. The differencebetween (Ψ T Ψ )− 1Ψ T x and Ψ T x is small when the B-PCA base vectors are more orthog-onal (smaller ζ ), and it is large when the base vectors are less orthogonal (larger ζ ). Aswe can observe that when the approximation threshold ζ is small, the difference betweenB-PCA and PCA base vectors is small, the difference between DNP and pseudo-inverseprojection is also small.

Figure 9: Recognition performance for facedataset with ζ = 0.9.

Figure 10: Recognition performance for facedataset with ζ = 0.1.

4.3 Speed Improvement of B-PCA

Suppose the image size is m × n , T PCA denotes the time for computing the PCA subspaceprojection coefcients. N denotes the number of PCA base vectors. It will take m × n × N oating point multiplications and N × (m × n − 1) + ( N − 1) oating point additions to

8

8/8/2019 Binary Principal Component Analysis

http://slidepdf.com/reader/full/binary-principal-component-analysis 9/10

perform the projection operation, orT PCA = N × m × n × T f m + [ N × (m × n − 1) + ( N − 1)] × T f a (4)

where T f m is the time for a single oating point multiplication, T f a is the time for a singleoating point addition.

For B-PCA, the time for a single projection is denoted as T BPCA . It consists of twoparts, one is T ii , the time to construct the integral image, for a m × n image, it will takem × n × 2 integer additions with recursive implementation. This is performed only oncefor each image. The other is the time for the projection operation PΨ (x) = ( Ψ T Ψ )− 1Ψ T x.When the bases are nearly orthogonal to each other, we can approximate the projectioncoefcient using direct dot-product Ψ T x. The B-PCA base vector ψ i (1 ≤ i ≤ N ) is repre-sented as a linear combination of N i box functions, ψ i = ∑ N i

j= 1 c jb j. The projection of x to

ψ i can be written as ψ i, x = ∑ N i j= 1 ci, j b j, x . Each box function b j has n j boxes, wheren j can be one or two. The b j, x can be performed using 3 × n j integer additions. Sincec j is oating point, ψ i, x needs N i oating point multiplications and N i − 1 oating pointadditions.

T BPCA = ∑ N i= 1 ∑ N i

j= 1 (3 × n j × T ia + N i × T f m + ( N i − 1) × T f a) (5)

where T ia is the time for one integer addition. As we can observe, T BPCA is only dependenton the number of binary box functions which is often much smaller than the dimensionof the image. While for PCA, the time is proportional to the image dimension. Sincethe number of operations in B-PCA is much smaller than PCA, T BPCA is much faster thanT PCA , and the speed up is more dramatic when the dimension of data is higher.

Table 2: Comparison of the computational cost between PCA and B-PCA projectionoperation

Approximation threshold: ζ 0.4 0.6 0.85# binary box functions 409 214 68

T PCA (a single subspace projection (s)) 3 .15 × 10− 4

T ii (integral image computation (s)) 4 .72 × 10− 5

T BPCA (a single subspace projection (s)) 8 .11 × 10− 6 4.29 × 10− 6 1.45 × 10− 6

Speedup ( T PCAT BPCA + T ii / N ) 27.97 42.34 68.47

Suppose m = n = 24, N = 15, T PCA needs 24 × 24 × 15 = 8640 oating point multipli-cations to compute the projection coefcients. Suppose the total number of NBS basevectors used to represent all the B-PCA base vectors is 200, that is, ∑ N

i= 1 N i = 200, theB-PCA needs only 2 × ∑ N

i= 1 N i − 1 = 399 oating point operations. The speed up is sig-nicant.

The experiment for speed improvement is carried out on a Pentium 4, 3.2GHz, 1GRAM machine, using C++ code. We compute the 15 PCA base vectors and 15 B-PCAbase vectors, and evaluate the computation time for an image to project onto the PCAsubspace and B-PCA subspace. We tested the PCA projection and B-PCA projection with

9

8/8/2019 Binary Principal Component Analysis

http://slidepdf.com/reader/full/binary-principal-component-analysis 10/10

1000 images and the time for a single projection is computed as the average. In Table-IV, we can observe, the B-PCA projection process Ψ T x is much faster than direct PCAprojection operation. The improvement is even more dramatic if more base vectors areused for comparison. Note: 1) for PCA projection test, the image data is put in continuousmemory space to avoid unnecessary memory access overhead; 2) the T ii is distributed intoeach base vector projection operation to make the comparison fair, because integral imageis only needed to be computed once.

5 Conclusion and future work

A novel compact and efcient representation called B-PCA is presented in this paper. Itcan capture the main structure of the dataset while take advantages of the computational

efciency of NBS. A PCA-embedded OOMP method is proposed to obtain the B-PCAbasis. We have applied the B-PCA method to the image reconstruction and recognitiontasks. Promising results are demonstrated in this paper. Future work may use more so-phisticated classication methods to enhance the recognition performance.

References

[1] M.J. Black and A.D. Jepson. Eigentracking: Robust matching and tracking of artic-ulated objects using a view-based representation. In ECCV , pages 329–342, 1996.

[2] T.F. Cootes, G.J. Edwards, and C.J. Taylor. Active appearance models. In ECCV ,pages II 484–498, 1998.

[3] I. Jollife. Principal Component Analysis . Springer-Verlag, 1986.

[4] H. Murase and S.K. Nayar. Visual learning and recognition of 3d objects fromappearance. IJCV , 14(1):5–24, June 1995.

[5] L. Rebollo-Neira and D. Lowe. Optimized orthogonal matching pursuit approach. IEEE Signal Processing Letters , 9(4):137–140, April 2002.

[6] B. Scholkopf, A. Smola, and K.R. Muller. Nonlinear component analysis as a kerneleigenvalue problem. Neural Computation , 10(5):1299–1319, July 1998.

[7] H. Tao, R. Crabb, and F. Tang. Non-orthogonal binary subspace and its applicationsin computer vision. In ICCV , pages 864–870, 2005.

[8] M. Tipping and C. Bishop. Probabilistic principal component analysis. J. RoyalStatistical Soc. , 61(3):611–622, 1999.

[9] M. Turk and A. Pentland. Face recognition using eigenface. In CVPR , pages 586–591, 1991.

[10] R. Vidal, Y. Ma, and S. Sastry. Generalized principal component analysis (gpca).PAMI , 27(12):1–15, December 2005.

[11] P. Viola and M. Jones. Rapid object detection using a boosted cascade of simplefeatures. In CVPR , pages I: 511–518, 2001.

10