Representing facial information using Gabor wavelets



Properties of informative Gabor Wavelets

Spatial scales

[Figure: histograms of Noise, Signal, and Signal to Noise ratio (absolute frequencies, axis ticks at 1, 100, 10000), grouped by spatial frequency: 27.85, 13.93, 6.96, 3.48, 1.74, 0.87, and 0.44 c/fw]

- high spatial frequencies: low signal, intermediate noise, and low signal-to-noise ratio
- middle spatial frequencies: high signal, high noise, intermediate signal-to-noise ratio
- low spatial frequencies: low signal, low noise, high signal-to-noise ratio

Orientation

[Figure: Signal, Noise, and Signal to noise ratio by preferred wavelet orientation (W-E, WSW-ENE, SW-NE, SWS-NEN, S-N, SES-NWN, SE-NW, ESE-WNW)]

- horizontally oriented wavelets: high signal, medium noise, high signal-to-noise ratio
- vertically oriented wavelets: medium signal, medium noise, medium to low signal-to-noise ratio

Scatterplot of Signal to Noise

Revealing the average face

- Signal: ordering according to $S_{m,n,k,l}$ -> equal weights
- Noise: ordering according to $S_{m,n,k,l}/N_{m,n,k,l}$ -> signals weighted according to $1/N_{m,n,k,l}$
- Signal to noise ratio: ordering according to $S_{m,n,k,l}/N_{m,n,k,l}$ -> signals weighted according to $SN_{m,n,k,l}$

Location

- wavelets are densely clustered over facial features
- high-frequency wavelets (27.85 and 13.93 c/fw) at the same locations
- medium-frequency wavelets in facial interior

[Figure panels: Noise, Signal, Signal to noise ratio]

- circular arrangement with higher frequencies in the interior and lower frequencies closer to the outline of the face (facial features)
- high-frequency wavelets are densely clustered at the outlines of facial features with little overlap
- medium-frequency wavelets cover facial outlines

- wavelets broadly cover facial outlines
- orientation of wavelets perpendicular to outline
- almost no low-noise wavelets in facial interior

Different selection criteria

Population variance: Signal
- variance in wavelet activation across 96 different frontal face images
- face images are grey-valued snapshots of 3D head models [5]
- standardized in size (area coverage) and average brightness
- illumination source above and to the left of the face

Rotation variance: Noise
- variance in wavelet activation across different in-depth rotations of the average face around the vertical and horizontal axes
- variance computed as a weighted sum of squares, with weights according to a multivariate normal distribution with σ=2 and an aspect ratio of 2:1

Signal to noise ratio
- for each wavelet, calculate the ratio of population variance to rotation variance
- estimate of the Rayleigh quotient of between- to within-class variance
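The three criteria above can be sketched in a few lines of NumPy. This is a minimal illustration, not the author's implementation: the array layouts, the function names, the choice to measure rotation deviations relative to the frontal view, the units of σ (taken to be the same as the rotation angles), and the assignment of the longer 2:1 axis to rotations around the vertical axis are all assumptions made for the example.

```python
import numpy as np

def population_variance(activations):
    """Signal: variance of each wavelet's activation across face images.

    activations: array of shape (n_faces, n_wavelets).
    """
    return np.var(activations, axis=0)

def rotation_variance(rot_activations, frontal_activation, yaw, pitch,
                      sigma=2.0, aspect=2.0):
    """Noise: Gaussian-weighted sum of squared activation changes under rotation.

    rot_activations: (n_views, n_wavelets) activations for rotated views.
    frontal_activation: (n_wavelets,) activations for the frontal view.
    yaw, pitch: (n_views,) rotation angles around the vertical / horizontal axis.
    Assumption: the weight is an axis-aligned 2D normal with standard
    deviation sigma * aspect along yaw and sigma along pitch.
    """
    yaw = np.asarray(yaw, dtype=float)
    pitch = np.asarray(pitch, dtype=float)
    w = np.exp(-0.5 * ((yaw / (sigma * aspect)) ** 2 + (pitch / sigma) ** 2))
    w = w / w.sum()  # normalise so the weights sum to one
    dev = np.asarray(rot_activations) - np.asarray(frontal_activation)
    return np.sum(w[:, None] * dev ** 2, axis=0)

def signal_to_noise(signal, noise, eps=1e-12):
    """Per-wavelet ratio of population variance to rotation variance."""
    return np.asarray(signal) / (np.asarray(noise) + eps)

# Toy usage with random numbers standing in for wavelet activations.
rng = np.random.default_rng(0)
faces = rng.normal(size=(96, 10))   # 96 faces, 10 hypothetical wavelets
views = rng.normal(size=(25, 10))   # 25 hypothetical rotated views
yaw_deg, pitch_deg = np.meshgrid(np.linspace(-4, 4, 5), np.linspace(-2, 2, 5))
S = population_variance(faces)
N = rotation_variance(views, views[12], yaw_deg.ravel(), pitch_deg.ravel())
SN = signal_to_noise(S, N)
```

The small `eps` only guards against division by zero for wavelets whose activation does not change under rotation; it is not part of the poster's definition.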

Image representation using Gabor wavelets [4]

2D Gabor wavelets
- local spatial bandpass filters
- models of simple/complex cells in V1
- generated by dilation and rotation from a mother wavelet

Hierarchical ensemble of Gabor wavelets
- 8 different optimal spatial frequencies with 1.5 octave half-amplitude bandwidth
- 8 different optimal orientations
- centers of receptive fields (RFs) positioned in a hierarchical grid

Complete image representation
- spatial coverage of all input locations, overlapping RFs
- local coverage of the full Fourier spectrum, overlapping frequency bandwidths
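As a rough illustration of such an ensemble, the sketch below builds the octave-spaced frequency ladder (0.22 to 27.85 c/fw), eight orientations, and a single complex Gabor kernel whose Gaussian width is derived from a 1.5 octave half-amplitude bandwidth. It is a simplification under stated assumptions: the envelope is isotropic and has no DC correction, whereas the wavelets of [4] differ in both respects; `FACE_WIDTH_PX` and all function names are hypothetical.

```python
import numpy as np

def gabor_sigma(freq, bandwidth_octaves=1.5):
    """Gaussian std (pixels) giving the requested half-amplitude bandwidth.

    Follows from a Gaussian frequency response centred on freq:
    sigma = sqrt(ln 2 / 2) / (pi * freq) * (2^b + 1) / (2^b - 1).
    """
    b = 2.0 ** bandwidth_octaves
    return np.sqrt(np.log(2.0) / 2.0) / (np.pi * freq) * (b + 1.0) / (b - 1.0)

def gabor_kernel(freq, theta, bandwidth_octaves=1.5, n_sigmas=3.0):
    """Complex Gabor kernel; freq in cycles per pixel.

    theta is the direction of the carrier modulation, so the bars of the
    wavelet run perpendicular to theta.
    """
    sigma = gabor_sigma(freq, bandwidth_octaves)
    half = int(np.ceil(n_sigmas * sigma))  # support of +/- n_sigmas * sigma
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.exp(2j * np.pi * freq * xr)
    return envelope * carrier

FACE_WIDTH_PX = 256  # hypothetical face width used to convert c/fw to c/pixel

# Eight frequencies in octave steps, from 27.85 / 2^7 (about 0.22) up to 27.85 c/fw.
freqs_cfw = 27.85 / 2.0 ** np.arange(7, -1, -1)
# Eight orientations, evenly spaced over half a turn.
thetas = np.arange(8) * np.pi / 8

# Build only the highest-frequency kernel here: kernel size grows inversely
# with frequency, so lower frequencies need much larger kernels (or a
# correspondingly downsampled image).
kernel = gabor_kernel(freqs_cfw[-1] / FACE_WIDTH_PX, thetas[0])
```

Filtering an image with the real and imaginary parts of such a kernel gives even- and odd-symmetric responses; how those are combined into a single "activation" per wavelet is not specified on the poster.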

Spatial frequency filtering

[Figure: spatial frequency filtering at 0.22, 0.44, 0.87, 1.74, 3.48, 6.96, 13.93, and 27.85 c/fw]

Introduction

What kind of information do we extract from a face in order to correctly infer the person’s identity? Previous investigations have selectively studied the influence of spatial frequency [1], orientation [2], or location [3]. Here, we use a hierarchical ensemble of Gabor wavelets [4] to study all three aspects at the same time.

Summary and discussion

Informativeness of individual wavelets was measured by two criteria: Signal (population variance) and Noise (rotation variance). Wavelets with a high signal are mostly horizontally oriented and are either of medium spatial frequency and dispersed over the face, or of high spatial frequency and clustered over face parts (eyes, mouth, nose, ...).

Wavelets with a high signal-to-noise ratio include low spatial frequencies, show a more pronounced predominance of horizontal orientations, and are arranged around the outlines of face parts.

Properties of informative wavelets show complex interdependencies of spatial frequency, orientation, and location.

Rainer Stollhoff, MPI for Mathematics in the Sciences, Leipzig, Germany

References

[1] Gold, J.B. et al. (1999). Identification of band-pass filtered letters and faces by human and ideal observers. Vision Research, 39(21): 3537–3560.
[2] Dakin, S.C. and Watt, R.J. (2009). Biological "bar codes" in human faces. Journal of Vision, 9(4): 2.
[3] Gosselin, F. and Schyns, P.G. (2001). Bubbles: A new technique to reveal the use of visual information in recognition tasks. Vision Research, 41: 2261–2271.
[4] Lee, T.S. (1996). Image representation using 2D Gabor wavelets. IEEE T Pattern Anal, 18(10): 959–971.
[5] Troje, N.F. and Bülthoff, H.H. (1996). Face recognition under varying poses: the role of texture and shape. Vision Research, 36(12): 1761–71.

Acknowledgements

The author thanks J. Jost, T. Elze, L. Avdiyenko, and H. Wilhelm (MPI MiS), and I. Bülthoff and T. Wolf (MPI Biol. Cyb.) for interesting discussions, technical support, and assistance in constructing the stimuli.

Max Planck Institute for Mathematics in the Sciences