

DESCRIPTION

Paper presented by the University of Vigo at the HOIP'10 workshop organized by the Information Systems and Interaction Unit of TECNALIA. More information at http://www.tecnalia.com/es/ict-european-software-institute/index.htm

TRANSCRIPT

Page 1: Hoip10 articulo reconocimiento facial_univ_vigo


Face recognition and the head pose problem

José Luis Alba Castro (1), Lucía Teijeiro-Mosquera (1) and Daniel González-Jiménez (2)

(1) Signal Theory and Communications Department, University of Vigo, (2) Gradiant

ABSTRACT

One of the main advantages of face recognition is that it does not need active cooperation from the subject. This advantage turns into a challenging research topic when it comes to processing face images acquired in uncontrolled scenarios. Matching faces on the web or pointing out criminals in a crowd are real application examples where we find many of the open issues in face recognition: gallery-test pose mismatch, extreme illumination conditions, partial occlusions or severe expression changes. Among these problems, pose variation has been identified as one of the most recurrent in real-life applications. In this paper we give a short introduction to the face recognition problem through a combination of local and global approaches, and then explain an AAM-based fully automatic system for pose-robust face recognition that allows faster fitting to non-frontal views.

1. INTRODUCTION TO FACE RECOGNITION

Identifying people from their face appearance is a natural and fundamental way of human interaction. That is the reason behind the great public acceptance of automatic face recognition as a biometric technology supporting many kinds of human-computer interfaces. We can find face recognition technology in many different applications, such as access control to physical and logical facilities, video-surveillance, personalized human-machine interaction, multimedia database indexing and retrieval, meeting summarization, interactive videogames, interactive advertising, etc. Some of these applications are quite mature and, under a set of restrictions, the performance of the face recognition module is high enough for normal operation. Nevertheless, face recognition is still a very challenging pattern recognition problem due to high intra-class variability. The main sources of variability can be divided into four groups:

Variability due to the 3D deformable and non-convex nature of heads: illumination, expression, pose/viewpoint.

Variability due to short-term changes: occlusions, make-up, moustache, beard, glasses, hat, etc.

Variability due to long-term changes: gaining weight, getting older.

Variability due to changes of acquisition device and quality (resolution, compression, focusing, etc.).

The last source of variability is common to many other pattern recognition problems in computer vision, but the other sources are quite particular to the face "object". Among them, pose, illumination and expression (PIE) variability are the ones that have attracted the most research effort since the nineties. For human beings it is quite disturbing that automatic systems fail so catastrophically: two pictures of the same person with different expression, point of view and lighting conditions are not categorized as the same identity, even if the pictures were taken two minutes apart. The remarkable ability of our brains to recognize faces has been widely studied and nowadays, thanks to the study of prosopagnosia or "face blindness", it is accepted that there is a specific area in the temporal lobe dedicated to this complex task, and that both local characteristics and the whole ordered appearance of the "object" play an important role when recognizing a particular identity.

Holistic approaches to face recognition have been evolving since the very first works on eigenfaces [1], where a simple dimensionality reduction (classical PCA) was applied to a set of face images to get rid of small changes in appearance, through Fisherfaces [2], where a similar dimensionality reduction principle (LDA) was applied to maximize discriminability among classes instead of minimizing MSE for optimal reconstruction, and probabilistic subspaces [3], where the concept of classes as separate identities is replaced by binary classification over image differences: same identity against different identity. Finally, the AAM technique [4] appeared as a natural evolution of shape description techniques and as a way to geometrically normalize textured images from the same class before applying any kind of dimensionality reduction. Thousands of papers have been written on different modifications and combinations of the above techniques. In this paper we will extend the AAM approach to handle pose variations and normalize faces before feature extraction and matching for recognition.
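The holistic pipeline sketched above (project faces onto a PCA subspace and match in the reduced space) can be illustrated with a minimal NumPy sketch; the subspace size and the nearest-neighbour matcher are illustrative assumptions, not details taken from [1] or [2]:

```python
import numpy as np

def train_eigenfaces(faces, n_components=50):
    """Fit a PCA subspace to vectorized face images (one row per face)."""
    mean = faces.mean(axis=0)
    centered = faces - mean
    # SVD of the centered data matrix gives the principal axes directly.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]            # the "eigenfaces"
    coords = centered @ basis.T          # low-dimensional gallery coefficients
    return mean, basis, coords

def recognize(probe, mean, basis, gallery_coords):
    """Nearest-neighbour identification in the eigenface subspace."""
    w = (probe - mean) @ basis.T
    dists = np.linalg.norm(gallery_coords - w, axis=1)
    return int(np.argmin(dists))
```

A training face projects exactly onto its own coefficients, so its nearest gallery entry is itself; real systems of course match probe images unseen at training time.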

Local approaches have been largely based on locating facial features and describing them jointly with their mutual distances. The elasticity of their geometrical relationship has been successfully described as a graph with anchor points in the EBGM (Elastic Bunch Graph Matching) model [5], where a cost function that fused local texture similarity and graph similarity was minimized in order to fit the EBGM model and to perform the comparison between two candidate faces. Since this model was introduced, many others have tried to find more robust representations for the local texture using local statistics of simpler descriptors like LBP (Local Binary Patterns) [6], SIFT (Scale-Invariant Feature Transform) [7] or HOG (Histograms of Oriented Gradients) [8], most of them trying to obtain descriptors invariant to illumination changes and more robust to small face-feature location errors. In this paper, the feature extraction module of the face recognizer is based on Gabor jets applied to user-specific face landmarks from an illumination-normalized image, adding, this way, more discriminability and illumination invariance to the local matching process.
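To give an idea of how simple these local descriptors are, the following is a minimal sketch of the basic 8-neighbour LBP code of [6]; the single 256-bin histogram over the whole image is a simplification, since LBP face descriptors usually concatenate histograms computed over a grid of regions:

```python
import numpy as np

def lbp_image(img):
    """Basic 8-neighbour Local Binary Pattern code for each interior pixel."""
    c = img[1:-1, 1:-1]
    # Neighbours ordered clockwise from the top-left; each contributes one bit.
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(c.shape, dtype=int)
    for bit, (dy, dx) in enumerate(shifts):
        n = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        codes |= ((n >= c).astype(int) << bit)
    return codes

def lbp_histogram(img):
    """Normalized 256-bin histogram of LBP codes: the local texture descriptor."""
    h = np.bincount(lbp_image(img).ravel(), minlength=256).astype(float)
    return h / h.sum()
```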

Local and holistic approaches for face recognition have been combined in very different ways to produce face recognizers robust to the main sources of variation. In our work we use a holistic approach that we call Pose-Dependent AAM to normalize the face to a frontal view, and a local approach to extract texture information over a set of face points in order to match two normalized sets of local textures from different face images.

The rest of the paper is organized as follows: Section 2 gives a brief review of methods that try to handle pose variations and quickly focuses on 2D pose correction approaches, which are the main body of the paper. Section 3 explains how feature extraction and matching are performed once the face has been corrected. Section 4 gives some results comparing the proposed pose-dependent approach with the well-known view-based AAM. Section 5 closes the paper with some conclusions.

2. FACE RECOGNITION ACROSS POSE

The last decade has witnessed great research efforts to deal with pose variation in face recognition. Most of the main works are compiled in [9]. The brute-force solution to this problem consists of saving different views for each registered subject. Nevertheless, in many real applications it is difficult to take more than one enrollment shot per subject, we cannot afford to store and/or match more than one image per subject, or, frequently, the stored face image is not frontal, as in video-surveillance. In these cases the approaches can be divided into those based on fully redesigning the recognizer [10][11][12] in order to match pose-invariant face features, and those that rely on creating virtual views of rotated faces and using a general-purpose recognizer for matching the real image and the synthetic one under the same pose. Within this last family, we distinguish between methods based on 3D modeling [13][14] and methods based on 2D [15][16][17][18]. In [15], a PDM-based PCA analysis makes it possible to identify the two eigenvectors responsible for in-depth rotation, which happen to be those with the highest eigenvalues. Manipulating the projection coefficients associated with these eigenvectors allows synthesizing a virtual shape over a wide range of pitch and yaw values, and then rendering the texture of the new image using thin-plate splines from the original face image.

In this paper we present a fully automatic pose-robust face recognition system, extending the study in [15] to include automatic estimation of pose parameters through AAM-based landmarking, and design a fully automatic system for face recognition across pose. We will present several variants for improving the speed and accuracy of the View-Based Active Appearance Model [18]. Since pose change is also one of the factors that degrade AAM landmarking performance, we compare View-Based AAM performance with this novel and faster variant, coined Pose-Dependent AAM [19]. This approach is also based on dividing the non-linear manifold created by face pose changes into several linear subspaces (different models for different poses), but it makes use of automatic pose estimation to decide between pose models in a multiresolution scheme, and it differs in the way virtual views are created. In the next subsection we give a brief review of AAM and View-Based AAM to introduce the concepts and notation that root our work.

2.1. VIEW-BASED AAM

Active Appearance Models combine a powerful model of joint shape and texture with a gradient-descent fitting algorithm. AAM was first introduced by Cootes et al. in 1998 [4] and, since then, this modeling has been widely used in face and medical image analysis. During the training stage we use a set of manually landmarked images, selected as representative examples of face variability. All the images have been manually landmarked with 72 points; the positions of these 72 landmarks constitute the face shape of the image. The training set of landmarked face shapes s_i = (x_{0i}, y_{0i}, ..., x_{ni}, y_{ni}) is aligned using Procrustes analysis in order to get invariance against 2D rigid changes (scale, translation, roll rotation). A shape model is created through PCA of the aligned training shapes (1). In the same way, textures g are warped to a reference frame and intensity-normalized before being combined in a PCA texture model (2). The joint shape-texture model is built by applying a third PCA to the suitably combined shape and texture coefficients. At this point we have a model that can represent a face with a small set of parameters, known as the appearance parameters c_i (3).

s_i = \bar{s} + P_s b_{s_i}    (1)

g_i = \bar{g} + P_g b_{g_i}    (2)

s_i = \bar{s} + Q_s c_i ,    g_i = \bar{g} + Q_g c_i    (3)

Interpretation of a previously unseen image is posed as an optimization problem in which the difference between the new image and the model (synthesized) image, i.e. the model reconstruction error, is minimized. For this purpose, after the model is built, a regression matrix R = δc/δr is calculated in order to learn the variation of the appearance parameters with respect to the variation of the residual. As we want to use a constant regression matrix, the residual r(p) = g_s - g_m needs to be calculated in a normalized reference frame (see [4] for details). Therefore, we minimize the squared error between the texture of the face normalized and warped to the reference frame, g_s, and the texture of the face reconstructed by the model, g_m, using the appearance parameters, where p includes both the appearance parameters and the rigid parameters (scale, translation, roll rotation). The assumption of a constant R lets us estimate it from the training set: we estimate δc/δr by numeric differentiation, systematically displacing each parameter from its known optimal value on training images and averaging over the training set. During the fitting stage, starting from a reasonably good initialization, the AAM algorithm iteratively corrects the appearance parameters using gradient descent. The projection of the residual onto the regression matrix gives the optimal parameter increment, δp = -R r(p), p_{i+1} = p_i + k δp. After updating the parameters, the residual (reconstruction error) is recalculated and the process is repeated until the error stops decreasing. Once the fitting converges we have found the appearance parameters that best represent the new image in our model. This means that, given a target face with unknown shape, we can recover the shape of the face by fitting the model and reading the shape off the appearance parameters.

One of the drawbacks of AAM is that the performance of the fitting algorithm decreases if the model has to explain large variations and non-Gaussian distributions, like those produced by face yaw and pitch. The straightforward solution to this problem is to divide the face space into different clusters depending on the rotation angle and train a different AAM model for each cluster. As a result, we have one model per view/pose group; each model has fewer variations to handle and its distributions are nearly Gaussian.

In [18], Cootes et al. trained different AAM models to fit different pose ranges. The view-based AAM can cope with large pose changes by fitting each new image with its most adequate model. The drawback of this approach is that the fitting relies on the ability to choose the best model for each image. The model selection approach proposed by Cootes consists of trying to fit each image with each one of the N view-based models (see Figure 1). The fitting stops after a few iterations, and the model with the smallest reconstruction error at that point is chosen as the most adequate model for the image.


Figure 1. Landmarking process using the View-Based AAM. The input image is fitted with each of the models; after a few iterations the model with the smallest error is chosen, and the fitting continues with that model until convergence.
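The selection procedure of Figure 1 can be sketched as below, assuming each view model exposes a hypothetical fit(image, max_iter) method returning the fitted parameters and the reconstruction error:

```python
import numpy as np

def select_view_model(image, models, probe_iters=5):
    """Cootes-style selection: run a few AAM iterations with every view
    model and keep the one with the smallest reconstruction error."""
    probe_errors = []
    for m in models:
        _, err = m.fit(image, max_iter=probe_iters)   # probe fit, later discarded
        probe_errors.append(err)
    best = int(np.argmin(probe_errors))
    # Only the winner is fitted to convergence; the probe iterations spent
    # on the other N-1 models are wasted work.
    return models[best].fit(image)
```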

From our point of view, the model selection of the View-Based approach presents two drawbacks. On the one hand, using the residual in different models to choose the best model relies on the assumption that different models have comparable residuals. We argue that, even though this works fine for acquisition-controlled datasets, as we will show later on, the assumption is not necessarily true, as was reported for view-based eigenfaces [3]. On the other hand, fitting each image with each of the view-based models is computationally expensive, with a cost that increases with the number of models; the iterations spent in the non-selected models are wasted. In the next section we propose a different way to select the best model for each image, based on estimating the rotation angle. In our approach (see Figure 2) we use a multiresolution framework in which a generic model including all pose variations is used to estimate the rotation angle; once the rotation angle is estimated, we use the most adequate pose model to fit the image. The next subsection explains how the rotation angle is estimated.

2.2. POSE-DEPENDENT AAM

In the scheme of Figure 1 we can save computation time if the final view-based model is selected before iterating over incorrect models. A pose estimator is then needed to select the candidate model. We have explored two different approaches to detect the head angle from manually or automatically landmarked images. In the first one, González-Jiménez et al. [15] show that pitch and yaw rotation angles have a nearly linear variation with the first two shape parameters of a PDM model, coined pose parameters. In [18], Cootes et al. restrict their approach to yaw rotations and claim that the appearance parameters follow an elliptic variation with the yaw rotation angle. Both approaches are different by nature: the first one is based on empirical observation of the shape parameters, while the second one is based on a cylindrical model of the head and an extension from the shape model to the appearance model. It is easy to show that González's approach can also be seen as a simplification of Cootes's model within a yaw range of ±45º. So far we have seen that the yaw angle can be estimated from both appearance and shape parameters, and that shape-elliptic and shape-linear approaches are equivalent for small angles. We will extend the angle estimation approach to the creation of virtual frontal faces and show comparative results among three different rotation models: appearance-elliptical (Cootes), shape-linear (González) and shape-elliptical (the result of mixing both approaches).
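The shape-linear estimator amounts to a least-squares line fitted on training shapes with known yaw; the slope/offset parameterization below is an illustrative sketch, not the exact formulation of [15]:

```python
import numpy as np

def fit_linear_pose_model(pose_params, yaw_angles):
    """Least-squares line yaw = a * b_pose + c, following the empirical
    near-linear relation between the first shape parameter and yaw."""
    A = np.column_stack([pose_params, np.ones_like(pose_params)])
    (a, c), *_ = np.linalg.lstsq(A, yaw_angles, rcond=None)
    return a, c

def estimate_yaw(b_pose, a, c):
    """Predict the yaw angle of a new face from its pose parameter."""
    return a * b_pose + c
```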

Once the yaw angle is estimated, we can manipulate it and build a virtual frontal view. Both gallery and probe images are processed; therefore, the face recognition across pose problem is reduced to a frontal face recognition problem and a general-purpose recognizer can be used. In this subsection we compare the approaches presented in [15] and [18]. González-Jiménez et al. [15] proposed a linear model to detect the head angle. They also proposed to build frontal synthetic images by creating synthetic frontal shapes, setting the pose parameter to zero and warping the image texture to the synthetic frontal shape. Each landmarked shape is projected into a PCA shape subspace. A PCA model of both frontal and yaw-rotated faces is used to capture the main variation in the first eigenvector, as explained before. Once we have a face represented in our shape model, b_{s_j}, the frontalization process consists of setting the pose parameter to zero and reconstructing the shape from the frontalized parameters. The frontalized shape is filled with the texture from the image [19]. In cases where the rotation is large and self-occlusions appear, symmetry is applied: the texture of the visible half of the face is used to fill both left and right halves of the frontalized shape. In [18], Cootes et al. modeled yaw rotation in the appearance subspace using an elliptical approach. Once the face with angle θ is represented in the appearance model, a new head angle α can be obtained using the transformation in equation (4), where r is the residual vector not explained by the rotation model. A synthetic frontal face can then be recovered from the appearance parameters. It was demonstrated in [19] that when we use the model texture instead of the original texture from the input image, recognition results decrease, because the test subjects are not included in the model and, consequently, the representation of their texture is less accurate. Therefore, to establish a fair comparison between [15] and [18], we frontalize the shape using both methods but always render the frontal face with the original texture from the image.

r = c(θ) - ( c_0 + c_x cos(θ) + c_y sin(θ) )

c(α) = c_0 + c_x cos(α) + c_y sin(α) + r    (4)
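The shape frontalization of [15] reduces to zeroing one coefficient in the shape subspace. A minimal sketch follows, assuming an orthonormal PCA basis P and that the pose parameter occupies a known index; the texture warping and the symmetry trick for self-occluded halves are left out:

```python
import numpy as np

def frontalize_shape(shape, mean_shape, P, pose_index=0):
    """Create a virtual frontal shape by zeroing the pose parameter
    in the PCA shape subspace (s = mean + P b_s)."""
    b = P.T @ (shape - mean_shape)   # project onto the shape model
    b[pose_index] = 0.0              # remove the in-depth rotation component
    return mean_shape + P @ b        # reconstruct the frontalized shape
```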

Table 1 shows comparative results of the two frontalization methods using manual landmarks. We also show the performance of the elliptical approach applied to the shape instead of the appearance. The linear approach performs better than the elliptical approach for angles between -45º and 45º. Also, using the elliptical model in the shape subspace instead of the appearance subspace performs slightly better; this may be because some information relevant to recognition is modified by the texture representation in the appearance subspace. In any case, these differences are not statistically significant.

                   Linear Model   Shape Elliptical Model   Texture Elliptical Model
Recognition rate   98.68%         95.44%                   94.71%

Table 1: Recognition rate with different frontalization methods. Results averaged over 34 subjects in PIE database [20].

In order to run the frontalization method automatically in the whole recognition process, we have resorted to a computationally simpler method coined Pose-Dependent AAM. First we use a multiresolution AAM [21] that includes both left and right rotations to get a coarse approximation of the face shape, and hence of the shape parameters. Before the highest resolution level, we decide which model is best for the image based on the pose parameter, whose selection was justified in the previous section, and the face is finally landmarked with its corresponding model. The highest resolution level of the generic multiresolution model is not used for fitting; it is replaced by the pose-dependent model. Figure 2 shows the full landmarking process using Pose-Dependent AAM. We use four Viola-Jones detectors [22] (face, eyes, nose, mouth) to estimate the 2D rigid parameters (scale, translation and tilt). A scaled and translated version of the mean shape is used as the initial shape at the lowest resolution level; as we show next, a good initialization improves the landmarking results. Once scaling and translation have been estimated, we fit the image with the low resolution level of the generic model, jumping to the next resolution level once the error stops decreasing. Before the highest resolution level, the decision about the image rotation angle is made and the adequate pose-dependent model is chosen.
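The overall flow can be sketched as follows; the pyramid level objects and their fit/pose_parameter methods are hypothetical stand-ins for the multiresolution AAM machinery described above:

```python
def pose_dependent_landmark(image, generic_pyramid, left_model, right_model,
                            init_shape):
    """Sketch of the Pose-Dependent AAM flow: the coarse levels of a generic
    (all-pose) multiresolution model refine the shape, the pose parameter is
    read out, and the highest resolution level is replaced by the chosen
    pose-specific model."""
    shape = init_shape                      # from the Viola-Jones detections
    for level in generic_pyramid[:-1]:      # lower and middle resolution only
        shape = level.fit(image, shape)
    b_pose = generic_pyramid[-2].pose_parameter(shape)
    model = left_model if b_pose < 0 else right_model
    return model.fit(image, shape)          # no iterations are wasted
```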

The advantage of our approach is that we save the extra cost of landmarking the face with several pose-dependent models. The view-based approach runs a few iterations of the AAM algorithm for each model and uses the best-fitting model to landmark the image, so the iterations performed in the non-selected models are wasted. On the contrary, with the pose-dependent approach no iterations are wasted: even if the generic multiresolution model cannot achieve a good fit for every image, it helps improve the initialization, and thus the number of iterations at the pose-dependent level decreases.


Figure 2: Landmarking process using the Pose-Dependent AAM. The image is registered with the generic multiresolution model; before the highest resolution level, pose estimation is performed and the image is then registered with the most adequate pose-dependent model.

3. FACE FEATURE EXTRACTION AND MATCHING

The frontal (or frontalized) face verification engine is based on multi-scale and multi-orientation Gabor features, which have been shown to provide accurate results for face recognition [23], both for biological reasons and because of their optimal resolution in the frequency and spatial domains [24]. More specifically, the face recognition system relies on the extraction of local Gabor responses (jets) at nodes located at facial points with informative shape characteristics. These points are selected by sampling the binary face image resulting from a ridge & valley operator, and the jets are computed on the geometrically and photometrically corrected face region [25]. The similarity between two jets is given by their normalized dot product, and the final score between two faces combines the local similarities using trained functions in order to optimize discriminability in the matching process [26]. Figure 3 shows the process of obtaining the texture features from the face image. Figure 4 shows the full face recognition diagram, robust to pose changes between -45º and 45º.
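The jet matching just described (normalized dot product of Gabor magnitude responses) can be sketched as follows; plain averaging of the local similarities is a simplification of the trained fusion functions of [26]:

```python
import numpy as np

def jet_similarity(jet_a, jet_b):
    """Similarity of two Gabor jets: normalized dot product of the
    magnitudes of the complex filter responses."""
    a = np.abs(jet_a).ravel()
    b = np.abs(jet_b).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def face_score(jets_a, jets_b):
    """Combine local similarities over corresponding landmarks (plain
    average here; the actual system uses trained fusion functions)."""
    return float(np.mean([jet_similarity(a, b) for a, b in zip(jets_a, jets_b)]))
```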

The next section presents comparative results between the Pose-Dependent and View-Based AAM. It is important to highlight that this scheme efficiently combines a local description of the face with a holistic representation that normalizes the faces before matching.


Figure 3: Feature extraction process: Sampling of ridges in the face image and Multiscale and multiorientation Gabor filtering centered on these points. Each point is represented by a texture vector called jet.

Figure 4: Face recognition diagram. After illumination processing, both gallery and probe images are landmarked using the pose-dependent approach. Frontal views of both faces are synthesized and matched to perform recognition.

4. COMPARATIVE RESULTS

This section shows the recognition results using the two schemes for registering the rotated image, together with the reference result obtained with manually landmarked images (72 points). The experiments are run on 34 subjects from the CMU PIE database to allow direct comparison with the results presented in [15]. Tables 2, 3 and 4 show the recognition results for manually landmarked faces and for the fully automatic landmarking system, using both the Pose-Dependent scheme and the View-Based scheme. The recognition results are comparable for most probe-gallery pose combinations, decreasing for pose differences of 45º due to the propagation of larger landmarking errors at these poses for both automatic approaches.



Probe angle    -45º     -22.5º   0º       +22.5º   +45º     Average
-45º           -        100      100      100      91.18    97.79
-22.5º         100      -        100      100      97.06    99.26
0º             100      100      -        100      100      100
+22.5º         100      100      100      -        100      100
+45º           91.18    94.12    100      100      -        96.68
Average        97.79    98.53    100      100      97.06    98.68

Table 2: Face recognition results using manual landmarks instead of AAM fitting.

Probe angle    -45º     -22.5º   0º       +22.5º   +45º     Average
-45º           -        97.06    97.06    94.12    91.18    94.85
-22.5º         94.12    -        100      100      85.29    94.85
0º             94.12    100      -        100      100      98.53
+22.5º         91.18    100      100      -        100      97.85
+45º           88.23    94.12    100      97.06    -        94.85
Average        91.91    97.79    99.26    97.79    94.12    96.18

Table 3: Face recognition results using the Pose-Dependent solution

Probe angle    -45º     -22.5º   0º       +22.5º   +45º     Average
-45º           -        97.06    97.06    94.12    91.18    94.85
-22.5º         97.06    -        100      100      85.29    95.58
0º             97.06    100      -        100      94.12    98.53
+22.5º         94.12    100      100      -        94.12    97.85
+45º           88.23    91.18    100      97.06    -        94.85
Average        94.12    97.06    99.26    98.53    91.78    96.02

Table 4: Face recognition results using the View-Based solution

Both approaches perform quite similarly and very close to the perfect fitting given by the manually landmarked faces. More interestingly, the Pose-Dependent solution was able to landmark 5 images per second, while the View-Based solution landmarked 2 images per second, in both cases on an Intel Core 2 Quad CPU (2.85 GHz).

5. CONCLUSIONS

In this paper we have presented a fully automatic face recognition system that combines local and global approaches and introduces a scheme to avoid most of the errors due to pose variation. The multiresolution scheme of the Pose-Dependent AAM allowed faster fitting to the correct pose-dependent model than the pure view-based approach. Recognition results on the CMU PIE database showed values similar to the view-based AAM and a performance quite close to that achieved with manually landmarked faces.

REFERENCES

[1] M. Turk and A. Pentland: "Eigenfaces for Recognition", J. Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.
[2] P. N. Belhumeur, J. P. Hespanha and D. J. Kriegman: "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711-720, July 1997.
[3] A. Pentland, B. Moghaddam and T. Starner: "View-Based and Modular Eigenspaces for Face Recognition", Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 84-91, 1994.
[4] T. F. Cootes, G. J. Edwards and C. J. Taylor: "Active Appearance Models", Proc. Fifth European Conf. Computer Vision, H. Burkhardt and B. Neumann, eds., vol. 2, pp. 484-498, 1998.
[5] L. Wiskott, J. M. Fellous, N. Krüger and C. von der Malsburg: "Face Recognition by Elastic Bunch Graph Matching", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 775-779, 1997.
[6] T. Ahonen, A. Hadid and M. Pietikäinen: "Face Description with Local Binary Patterns: Application to Face Recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 12, pp. 2037-2041, Dec. 2006.
[7] M. Bicego, A. Lagorio, E. Grosso and M. Tistarelli: "On the Use of SIFT Features for Face Authentication", Computer Vision and Pattern Recognition Workshop (CVPRW'06), p. 35, 2006.
[8] A. Albiol, D. Monzo, A. Martin, J. Sastre and A. Albiol: "Face Recognition Using HOG-EBGM", Pattern Recognition Letters, vol. 29, no. 10, pp. 1537-1543, 2008.
[9] X. Zhang and Y. Gao: "Face Recognition Across Pose: A Review", Pattern Recognition, vol. 42, no. 11, pp. 2876-2896, 2009.
[10] C. Castillo and D. Jacobs: "Using Stereo Matching for 2-D Face Recognition Across Pose", IEEE Conf. Computer Vision and Pattern Recognition (CVPR '07), pp. 1-8, June 2007.
[11] Z. Wang, X. Ding and C. Fang: "Pose Adaptive LDA Based Face Recognition", 19th Int. Conf. Pattern Recognition (ICPR 2008), pp. 1-4, Dec. 2008.
[12] R. Gross, I. Matthews and S. Baker: "Appearance-Based Face Recognition and Light-Fields", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 4, pp. 449-465, April 2004.
[13] X. Zhang, Y. Gao and M. Leung: "Recognizing Rotated Faces from Frontal and Side Views: An Approach Toward Effective Use of Mugshot Databases", IEEE Transactions on Information Forensics and Security, vol. 3, no. 4, pp. 684-697, Dec. 2008.
[14] V. Blanz and T. Vetter: "A Morphable Model for the Synthesis of 3D Faces", SIGGRAPH '99: Proc. 26th Annual Conf. Computer Graphics and Interactive Techniques, pp. 187-194, 1999.
[15] D. González-Jiménez and J. L. Alba-Castro: "Toward Pose-Invariant 2-D Face Recognition Through Point Distribution Models and Facial Symmetry", IEEE Transactions on Information Forensics and Security, vol. 2, no. 3, pp. 413-429, Sept. 2007.
[16] T. Shan, B. Lovell and S. Chen: "Face Recognition Robust to Head Pose from One Sample Image", 18th Int. Conf. Pattern Recognition (ICPR 2006), vol. 1, pp. 515-518, 2006.
[17] X. Chai, S. Shan, X. Chen and W. Gao: "Locally Linear Regression for Pose-Invariant Face Recognition", IEEE Transactions on Image Processing, vol. 16, no. 7, pp. 1716-1725, July 2007.
[18] T. F. Cootes, K. Walker and C. J. Taylor: "View-Based Active Appearance Models", Proc. Fourth IEEE Int. Conf. Automatic Face and Gesture Recognition, pp. 227-232, 2000.
[19] L. Teijeiro-Mosquera, J. L. Alba-Castro and D. González-Jiménez: "Face Recognition Across Pose with Automatic Estimation of Pose Parameters Through AAM-Based Landmarking", 20th Int. Conf. Pattern Recognition (ICPR 2010), pp. 1339-1342, 2010.
[20] T. Sim, S. Baker and M. Bsat: "The CMU Pose, Illumination, and Expression (PIE) Database of Human Faces", Technical Report.
[21] T. Cootes, C. Taylor and A. Lanitis: "Multi-Resolution Search with Active Shape Models", Proc. 12th IAPR Int. Conf. Pattern Recognition, vol. 1, pp. 610-612, Oct. 1994.
[22] P. Viola and M. Jones: "Rapid Object Detection Using a Boosted Cascade of Simple Features", Proc. Int. Conf. Computer Vision and Pattern Recognition, pp. 511-518, 2001.
[23] N. Poh, C. H. Chan, J. Kittler, S. Marcel, C. McCool, E. Argones Rúa, J. L. Alba Castro, M. Villegas, R. Paredes, V. Štruc, N. Pavešić, A. A. Salah, H. Fang and N. Costen: "An Evaluation of Video-to-Video Face Verification", IEEE Transactions on Information Forensics and Security, vol. 5, no. 4, pp. 781-801, Dec. 2010.
[24] J. G. Daugman: "Complete Discrete 2D Gabor Transforms by Neural Networks for Image Analysis and Compression", IEEE Trans. on Acoustics, Speech and Signal Processing, vol. 36, no. 7, pp. 1169-1179, July 1988.
[25] D. González-Jiménez and J. L. Alba-Castro: "Shape-Driven Gabor Jets for Face Description and Authentication", IEEE Transactions on Information Forensics and Security, vol. 2, no. 4, pp. 769-780, 2007.
[26] D. González-Jiménez, E. Argones-Rúa, J. L. Alba-Castro and J. Kittler: "Evaluation of Point Localization and Similarity Fusion Methods for Gabor Jets-Based Face Verification", IET Computer Vision, vol. 1, no. 3-4, pp. 101-112, Dec. 2007.