

A Seminar Report On

“Advances in Face Detection and Recognition Technologies”

By

Ramakrishna Paruchuri

Date of submission: 31st July 2005.


Abstract

This report is a brief summary based on the reference article “Advances in Face Detection and Recognition Technologies” by Atsushi Sato, Hitoshi Imaoka, Tetsuaki Suzuki and Toshinori Hosoi. It describes the advances in the authors' face detection and recognition technologies. For face detection, they propose a combined scheme for face and eye detection based on the Generalized Learning Vector Quantization (GLVQ) method. For face recognition, a perturbation method has been improved to reduce the adverse effects of both illumination and pose changes. The first section of this report gives an introduction to biometric authentication methods, the second section explains the different methods proposed by the authors for face detection and recognition, and the third section presents the experimental results of the proposed methods.


1. Introduction

In recent years there have been great expectations for biometric authentication. Biometric authentication is the automatic identification of an individual based on physiological or behavioral characteristics such as fingerprints, iris, veins, face and voice. This kind of authentication is commonly used to safeguard country borders, to control access to facilities and to enhance computer security.

Among biometric authentication methods, face recognition has special characteristics. The following are some of the advantages of face recognition technology:

- It does not require any physical contact.
- Faces can be captured from a distance.
- Some biometric methods, such as voice recognition, depend heavily on the surrounding environment, whereas face recognition is not affected in the same way.

The same authors developed a number of face detection techniques earlier [1]. The particular paper discussed here presents some of the advances in the authors' face detection and recognition technologies. For face detection, a combined scheme for both face and eye detection has been developed using Generalized Learning Vector Quantization (GLVQ); for face recognition, a perturbation method has been improved to reduce the adverse effects of both illumination and pose changes.

2. Face detection and alignment:

Face detection has two main tasks: one is to find all the faces against different backgrounds, and the other is to determine the alignment of each face, such as its position, rotation and size, to obtain a better view. A rigid face has six degrees of freedom: three translational and three rotational. In a frontal view the degrees of freedom reduce to four, since two of the rotation angles can be treated as fixed. Most detection algorithms therefore detect one in-plane characteristic, such as the eyes, and one out-of-plane characteristic, such as the mouth region. A lot of earlier work in this field relied on detecting skin regions, but the disadvantage of that approach is its sensitivity to background lighting. The development of view-based methods overcame this problem. However, view-based methods consume a lot of time, and the facial alignment they produce is not very precise because they ignore high-frequency components.

To address this, the authors of this paper developed a new algorithm for face detection, as shown in Figure 1.

Fig. 1: Diagram of the proposed face detection method

From the input image, the position of the face is first determined using low-frequency components by searching over multi-scale images, taking in-plane rotation into account. After that, the positions of both eyes are determined by a coarse-to-fine search that takes the high-frequency components of the image into account.

For simplicity I am not going to explain all the formulas involved in Generalized Learning Vector Quantization (GLVQ); I will rather focus on the way it helps in the process of face detection. I will briefly mention the idea behind vector quantization. A vector quantizer maps k-dimensional vectors in the vector space R^k into a finite set of vectors Y = {y_i : i = 1, 2, ..., N}. Each vector y_i is called a code vector or a codeword, and the set of all the codewords is called a codebook. Associated with each codeword y_i is a nearest-neighbour region called its Voronoi region, defined by:

V_i = { x ∈ R^k : ||x − y_i|| ≤ ||x − y_j|| for all j }


The set of Voronoi regions partitions the entire space R^k such that:

V_1 ∪ V_2 ∪ ... ∪ V_N = R^k and V_i ∩ V_j = ∅ for all i ≠ j.

As an example, take vectors in two dimensions without loss of generality. Figure 2 shows some vectors in space. Associated with each cluster of vectors is a representative codeword, and each codeword resides in its own Voronoi region. These regions are separated by imaginary boundary lines in Figure 2 for illustration. Given an input vector, the codeword chosen to represent it is the one in the same Voronoi region.

Figure 2: Codewords in 2-dimensional space. Input vectors are marked with an x, codewords are marked with red circles, and the Voronoi regions are separated with boundary lines.


The representative codeword is the one closest to the input vector in Euclidean distance, which is defined by:

d(x, y_i) = sqrt( Σ_{j=1}^{k} (x_j − y_ij)^2 )

where x_j is the jth component of the input vector and y_ij is the jth component of the codeword y_i. Vector quantization is thus the process of mapping a large number of vectors onto a small group of representative vectors, which makes comparison methods easier to apply. One example of a vector quantization algorithm is described below.

The algorithm

1. Determine the number of codewords, N, i.e. the size of the codebook.

2. Select N codewords at random and let them be the initial codebook. The initial codewords can be chosen randomly from the set of input vectors.

3. Using the Euclidean distance measure, cluster the input vectors around each codeword. This is done by taking each input vector, computing its Euclidean distance to every codeword, and assigning it to the cluster of the codeword that yields the minimum distance.

4. Compute a new set of codewords by taking the average of each cluster: add up the components of every vector in the cluster and divide by the number of vectors in it,

y_i = (1/m) Σ_{j=1}^{m} x_ij

where i indexes the components of each vector (x, y, z, ... directions), j indexes the vectors in the cluster, and m is the number of vectors in the cluster.

5. Repeat steps 3 and 4 until either the codewords do not change or the change in the codewords is small.
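The following is a compact sketch of the codebook-training loop in steps 1 to 5 above (essentially a k-means / LBG-style iteration). The function name, the stopping threshold and the iteration limit are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def train_codebook(vectors, num_codewords, tol=1e-6, max_iter=100, seed=None):
    """vectors: array of shape (M, k). Returns a codebook of shape (num_codewords, k)."""
    rng = np.random.default_rng(seed)
    # Steps 1-2: pick N codewords at random from the input vectors.
    codebook = vectors[rng.choice(len(vectors), num_codewords, replace=False)].astype(float)

    for _ in range(max_iter):
        # Step 3: assign every input vector to the cluster of its nearest codeword.
        dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
        labels = dists.argmin(axis=1)

        # Step 4: replace each codeword by the average of the vectors in its cluster.
        new_codebook = codebook.copy()
        for i in range(num_codewords):
            members = vectors[labels == i]
            if len(members) > 0:
                new_codebook[i] = members.mean(axis=0)

        # Step 5: stop when the codewords no longer change (or change very little).
        if np.linalg.norm(new_codebook - codebook) < tol:
            break
        codebook = new_codebook
    return codebook
```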

Face detection:

As shown in Figure 3, multiple images that differ in both size and resolution are generated from the input image. Reliability maps are then generated for each of these images by scanning them with a template using the GLVQ method. After that, the reliability maps are merged through interpolation to obtain the final result. Interpolation is an imaging process for increasing or decreasing the number of pixels in digital data; one of the simplest interpolation methods takes the average value of adjacent pixels as the value of a new pixel.
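Below is a rough sketch of this multi-scale scan. The GLVQ classifier is abstracted behind a placeholder `reliability(patch)` callable, and the scale factors, template size and the max-based merging rule are illustrative assumptions, not details from the paper.

```python
import numpy as np

def scan_image(image, template_size, reliability):
    """Slide a template-sized window over the image and record a reliability score per position."""
    h, w = image.shape
    th, tw = template_size
    scores = np.zeros((h - th + 1, w - tw + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            scores[y, x] = reliability(image[y:y + th, x:x + tw])
    return scores

def downscale(image, factor):
    """Very simple interpolation: average factor-by-factor blocks of adjacent pixels."""
    h, w = image.shape
    h, w = h - h % factor, w - w % factor
    return image[:h, :w].reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def multi_scale_reliability(image, template_size, reliability, factors=(1, 2, 4)):
    """Generate images of different sizes, scan each one, and merge the maps at full resolution."""
    h, w = image.shape
    merged = np.zeros((h, w))
    for f in factors:
        scaled = downscale(image, f)
        scores = scan_image(scaled, template_size, reliability)
        # Upsample the coarse map back to the original grid (nearest-neighbour for simplicity)
        # and keep the highest reliability seen at each position.
        up = np.kron(scores, np.ones((f, f)))
        merged[:up.shape[0], :up.shape[1]] = np.maximum(merged[:up.shape[0], :up.shape[1]], up)
    return merged
```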


Fig. 3: Step-wise procedure of the proposed face detection method

Fig. 4: Examples of face images taking in-plane rotation into account

Fig. 5: Scanning multi-scale images with a template

Figure 4 shows examples of the faces experimented on by the authors, taking in-plane rotation into account. The size of each face is normalized using the eye positions. Since the search speed depends on the size of the template, reducing the template size speeds up the process, as shown in Figure 5, but it also affects quality because it removes the higher-frequency components. To avoid this problem, the high-frequency components are extracted before the size of the image is reduced. Alternatively, the sizes of the multiple images can be reduced while the size of the template is retained.

Eye detection:

As shown in Figure 6, the exact positions of both eyes are detected using a coarse-to-fine search. First, a grid is placed on the initial positions of both eyes as determined by the face detection step. The reliability of each candidate eye position is then evaluated using the same GLVQ-based classification used in face detection. In each iteration the most likely positions of both eyes are determined and the grid spacing is reduced. As this procedure continues, the probability of finding the exact eye positions increases while the size of the grid shrinks.

Fig. 6: Processing flow of the coarse-to-fine search in eye detection.
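A simplified sketch of such a coarse-to-fine grid search is given below. The `eye_reliability(image, (x, y))` callable stands in for the GLVQ-based classifier; the initial spacing, grid radius and number of refinement steps are illustrative assumptions.

```python
import itertools

def coarse_to_fine_search(image, initial_pos, eye_reliability,
                          initial_spacing=8.0, steps=4, grid_radius=2):
    """Refine an initial eye position by repeatedly searching a shrinking grid around the best point."""
    best = initial_pos
    spacing = initial_spacing
    for _ in range(steps):
        # Evaluate the classifier on a small grid centred on the current best estimate.
        candidates = [(best[0] + dx * spacing, best[1] + dy * spacing)
                      for dx, dy in itertools.product(range(-grid_radius, grid_radius + 1), repeat=2)]
        best = max(candidates, key=lambda p: eye_reliability(image, p))
        # Shrink the grid so the next pass searches more finely around the new best point.
        spacing /= 2.0
    return best
```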

Face recognition: The idea of face recognition is to calculate the similarity between the query image and the enrolled facial images. The performance of face recognition is usually degraded by two factors:

1. Global changes: These include changes in the background illumination and in the pose. The method used to reduce these effects is the perturbation space method.

2. Local changes: These include aging effects and the wearing of glasses. The method used to reduce these effects is adaptive regional blend matching.

In the perturbation space method, the enrolled image, which has four degrees of freedom, is mapped onto a three-dimensional face model, as shown in Figure 7. It has been shown earlier that illumination changes can be described with no more than ten degrees of freedom [2]. Mapping the image onto the three-dimensional model increases the degrees of freedom by two.
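To make the idea of generating illumination-perturbed images concrete, here is a heavily simplified sketch. It assumes a per-pixel depth map from a standard face model is available and uses plain Lambertian shading; the actual method in the paper, including its pose handling, is more sophisticated than this.

```python
import numpy as np

def surface_normals(depth):
    """Approximate per-pixel surface normals from a depth map via finite differences."""
    dz_dy, dz_dx = np.gradient(depth)
    normals = np.dstack([-dz_dx, -dz_dy, np.ones_like(depth)])
    return normals / np.linalg.norm(normals, axis=2, keepdims=True)

def relight(enrolled_image, depth, light_directions):
    """Render one shaded image per light direction (Lambertian: intensity ~ max(0, n . l))."""
    normals = surface_normals(depth)
    images = []
    for light in light_directions:
        light = np.asarray(light, dtype=float)
        light = light / np.linalg.norm(light)
        shading = np.clip(normals @ light, 0.0, None)   # per-pixel n . l, clipped at zero
        images.append(enrolled_image * shading)
    return images
```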


Fig. 7: Conceptual figure of the generation of various facial images using a standard face model.

Once the enrolled image is mapped onto the 3D face model, various 3D faces are generated with different illumination conditions and different poses. This helps to reduce the negative effects of these global changes. The degrees of freedom are now increased to six. Comparing all of the generated images with the query image would be time consuming, so the generated images are compressed using principal component analysis (PCA) and then compared with the query image. Principal component analysis is a way of identifying patterns (similarities and differences) in data.
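The sketch below shows a minimal version of this PCA compression step: the perturbed images generated for one enrolled person are reduced to a low-dimensional subspace, and the query is scored against that subspace. The number of components and the use of reconstruction error as the similarity score are my assumptions, not details from the paper.

```python
import numpy as np

def build_subspace(generated_images, num_components=20):
    """generated_images: array of shape (n_images, n_pixels). Returns (mean, principal axes)."""
    mean = generated_images.mean(axis=0)
    centered = generated_images - mean
    # SVD of the centered data gives the principal components (rows of vt).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:num_components]

def match_score(query_image, mean, axes):
    """Higher score means the query is better explained by the person's perturbation subspace."""
    centered = query_image.ravel() - mean
    projection = axes.T @ (axes @ centered)   # reconstruct the query from the subspace
    residual = np.linalg.norm(centered - projection)
    return -residual                          # small residual -> high similarity
```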


Fig. 8: Conceptual figure of adaptive regional blend matching

To reduce the effects of local changes, the authors use the adaptive regional blend matching method shown in Figure 8. In this method both the query and the enrolled image are initially divided into N segments, and each segment is compared using the perturbation method. Segments that differ strongly between the two images are discarded, so that only the matching segments contribute to the comparison. These matching segments also help to locate the local changes between the two images.
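An illustrative sketch of this segment-wise matching idea follows: both images are cut into a grid of regions, each region is scored independently, and the worst-matching regions (presumed to contain local changes such as glasses) are left out of the final score. The grid size, the fraction of regions kept and the per-region distance measure are assumptions, not values from the paper.

```python
import numpy as np

def regional_match(query, enrolled, grid=(4, 4), keep_fraction=0.75):
    """query, enrolled: 2-D arrays of the same shape. Returns a similarity score."""
    rows, cols = grid
    h, w = query.shape
    scores = []
    for r in range(rows):
        for c in range(cols):
            ys = slice(r * h // rows, (r + 1) * h // rows)
            xs = slice(c * w // cols, (c + 1) * w // cols)
            q, e = query[ys, xs].ravel(), enrolled[ys, xs].ravel()
            # Per-region similarity: negative Euclidean distance (any matcher could be plugged in here).
            scores.append(-np.linalg.norm(q - e))
    scores = np.sort(np.array(scores))[::-1]                 # best-matching regions first
    kept = scores[:max(1, int(len(scores) * keep_fraction))]
    return kept.mean()                                       # ignore the worst regions when scoring
```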

3. Experimental results

Face recognition experiments were conducted by the authors to evaluate the effect of the proposed methods. Their laboratory has four different databases, each containing images of 200 to 1000 people. The databases are divided into the following categories:

DB1 & DB2: images with aging effects.
DB3: images with facial expression changes.
DB4: images with illumination changes.


Fig. 9: Experimental results of face recognition for several databases.

As shown in Figure 9, the proposed method is compared with the previous methods. The equal error rate in the figure is a verification performance measure in which the score threshold is tuned so that the false acceptance rate equals the false rejection rate. The identification rate is the probability of finding the enrolled image as the first match for the query image. As the figure shows, the proposed method outperforms the previous method in almost every case, and the accuracy of locating the eyes in the face is almost the same as when the eye positions are marked manually by humans.
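For clarity, here is a small sketch of how the equal error rate is defined: sweep the decision threshold and find where the false acceptance rate equals the false rejection rate. The score arrays are hypothetical inputs, not data from the paper.

```python
import numpy as np

def equal_error_rate(genuine_scores, impostor_scores):
    """Both inputs are 1-D arrays of similarity scores (higher = more similar)."""
    thresholds = np.unique(np.concatenate([genuine_scores, impostor_scores]))
    best = None
    for t in thresholds:
        far = np.mean(impostor_scores >= t)   # impostors wrongly accepted
        frr = np.mean(genuine_scores < t)     # genuine users wrongly rejected
        if best is None or abs(far - frr) < best[0]:
            best = (abs(far - frr), (far + frr) / 2)
    return best[1]                            # the error rate at the crossover point
```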

Conclusion: This report has described the advances in face detection and recognition technologies proposed by the authors of the reference paper. For face detection, a hierarchical scheme for combined face and eye detection has been proposed using the Generalized Learning Vector Quantization method. For face recognition, a perturbation method has been improved to reduce the adverse effects of illumination and pose changes. The experimental results show that the proposed method performs much better than the earlier methods. The proposed eye detection method is the striking point of this paper, as it detects the eyes almost as accurately as human annotation.


References

[1] A. Sato, A. Inoue, et al., "NeoFace: Development of face detection and recognition engine," Res. & Develop., Vol. 44, No. 3, pp. 303-306, July 2003.

[2] R. Ishiyama and S. Sakamoto, "Geodesic illumination basis: Compensating for illumination variations in any pose for face recognition," in Proc. Int. Conf. on Pattern Recognition, Vol. 4, pp. 297-301, 2002.

Sources of pictures 1 to 9 except 2:

http://www.nec.co.jp/techrep/en/r_and_d/a05/a05-no1/a028.pdf

Atsushi Sato, Hitoshi Imaoka, Tetsuaki Suzuki and Toshinori Hosoi, "Advances in Face Detection and Recognition Technologies," NEC Journal of Advanced Technology, Vol. 2, No. 1, Winter 2005.

Source of picture 2:

http://www.geocities.com/mohamedqasem/vectorquantization/vq.html