interactive 3d caricature from harmonic exaggeration · ation rules that allow the user to control...

Interactive 3D Caricature from Harmonic Exaggeration

Thomas Lewinera, Thales Vieirab, Dimas Martınezb, Adelailson Peixotob, Vinıcius Melloc, Luiz Velhod

aDepartment of Mathematics, PUC-Rio de Janeiro, BrazilbInstitute of Mathematics, UFAL, Maceio, Brazil

cInstitute of Mathematics, UFBA, Salvador, BrazildIMPA, Rio de Janeiro, Brazil

Abstract

A common variant of caricature relies on exaggerating characteristics of a shape that differs from a reference template,usually the distinctive traits of a human portrait. This work introduces a caricature tool that interactively emphasizesthe differences between two three-dimensional meshes. They are represented in the manifold harmonic basis of theshape to be caricatured, providing intrinsic controls on the deformation and its scales. It further provides a smoothlocalization scheme for the deformation. This lets the user edit the caricature part by part, combining different settingsand models of exaggeration, all expressed in terms of harmonic filter. This formulation also allows for interactivity,rendering the resulting 3d shape in real time.

1. Introduction

Caricature is an illustration technique that exagger-ates specific characteristic traits in a portrait of a humansubject. Its main goal is to reveal the essence of a personby emphasizing particular aspects that visually identi-fies the individual. In this way, some features are mag-nified while other features are attenuated creating a non-realistic personalized impression of the subject [13].

In the western culture, caricature has a long tradition,way back to the Renaissance. Early examples can befound in the works of Leonardo da Vinci. Subsequently,other great artists such as Honore Daumier specializedin this form of expression. Nowadays, caricatures arepresent in many types of media, ranging from newspa-pers and magazines to film and television. It has beentraditionally used in political cartoons and then becamea powerful means of entertainment in general. Manypublic figures, such as politicians and movie stars areassociated with their caricatured depiction.

URL: http://thomas.lewiner.org/ (Thomas Lewiner),http://www.im.ufal.br/professor/thales/ (Thales Vieira),http://www.im.ufal.br/professor/dimas/ (DimasMartınez), [email protected] (Adelailson Peixoto),http://www.dmat.ufba.br/docentes/vinicius-moreira-mello

(Vinıcius Mello), http://lvelho.impa.br (Luiz Velho)

As a practice, caricature can be considered an artform, not only because the production of an effectivecaricature requires skill and talent, but mainly becauseit essentially depends on an interpretation of the reality.In that sense, each caricaturist has a particular style thatmarks his/her work.

The social importance and appeal of caricature moti-vates the investigation of this topic in Computer Graph-ics. In its own way, it configures a remarkably rich areaof research because it can be seen both as modelingand as a visualization problem, where perceptual andsemantic issues play a fundamental role.

The main challenge in this area is to develop mod-els that are able to sensibly take into account subjectiveparameters and to implement systems that allow simpleintuitive expression. In this perspective, this work intro-duces an interactive system to create and model three-dimensional caricatures by exaggerating the harmonicdifferences between a shape and a template.

Historically, the research in computer-based carica-ture has its origins in the master’s thesis of Susan Bren-nan in 1982 [6]. Since then, the area experiences sig-nificant development as reflected by the large numberof publications devoted to this subject. Despite of thoseadvances, the basic problems are far from being com-pletely solved and still motivate intense research efforts.

Preprint submitted to SMI 2011 March 11, 2011

distance minimizing

normal displacement

Figure 1: Progressive caricature of a scanned head: the user paints the regions to be caricatured, and chooses the amplitude (here µ = −6), scaleselection as curve and filter model to use for each part. The differences are obtained by registration with the head of Figure 8 as template.

Related Work. Computer-based caricature methods canbe broadly subdivided into three main categories de-pending on the principles adopted to model the prob-lem: template-based, extrapolation, and style learningmethods. These methods produce caricatures automat-ically or semi-automatically, but most of them rely onsome form of user interaction.

Template-based methods employ a reference facialshape, which contains specific features that can be em-phasized or de-emphasized to different levels in order toproduce a caricature. In general, the shape is defined ge-ometrically and warping techniques are used to deformthe template geometry interactively. Most of such sys-tems work in two dimensions directly with photographsor with illustrations of a face [1, 8, 12] or parametricthree dimensional shapes [11, 20]. The techniques pro-posed here work directly on scanned 3d shapes.

Extrapolation methods assume that a caricature exag-gerates the traits distinguishing the face from the nor-mal one. They resort to an average face that serves asthe basis for extrapolating specific features. This kindof techniques works by amplifying the difference of theinput face from the mean face. The previously men-tioned seminal system of Brennan [6] was based on suchprinciples, also known by illustrators [22]. In recentyears, several systems of this type have been proposed[5, 14, 27]. A common way to model the face in suchsystems is through principal component analysis (PCA)that provides a representation in which the mean face isexplicitly defined. These systems also specify exagger-ation rules that allow the user to control the caricatureeffects [9, 21, 30]. The present work formulates the dif-

ferences in harmonic space, which performs at interac-tive rates on general shapes.

Style-Learning methods, instead of modeling the car-icature process by itself, attempt to recreate the mecha-nisms used by caricature artists. This is typically doneusing statistical inference techniques that construct aprobabilistic model from examples [10, 17, 18, 19, 25],capturing the style of a specific caricaturist.

Contribution. In this work, we propose to use spectralrepresentation of the differences between a template andthe 3d shape to be caricatured. We build on the Mani-fold Harmonics geometry processing [28, 24, 23], sinceit provides a very intuitive framework for mesh editionthrough a reduced set of parameters. In particular it ad-mits a direct GPU implementation [16] that performs atinteractive rate.

We derive the extrapolation principle within the har-monic space of the shape to be caricatured. We proposethree different harmonic deformation filters, extendingusual 3d caricatures beyond human faces, e.g. using an-imal shapes. We also use the harmonic basis to provideinteractive controls to tune the caricature: a localizationcontrol, which specifies which parts of the face shouldbe caricatured; and a scale control to use or exaggeratea subset of the harmonics for the caricature.

Our formulations adapt to a GPU implementation thatcorrectly computes the per-vertex normals. This allowsa WYSIWYG interface that gives feedback on the user’scontrol in real time. Moreover, a partial result can bequickly used in place of the initial shape, letting the usercaricature each feature with different methods and pa-rameters (Figure 1).

2

2. Basics of Manifold Harmonics

In this section, we quickly review the basics of mani-fold harmonics filtering [28, 23].

2.1. Laplace harmonicsOn a discrete mesh M with vertex set V , manifold

harmonics provide a linear basis Hk for the space of dis-crete scalar functions f : V → R defined at the meshvertices. This basis is built from the eigenvectors of adiscrete Laplace operator, Hk being interpreted as thekth fundamental oscillation mode of the mesh. Further-more, the basis is sorted by increasing frequency, thefirst eigenvectors corresponding to large scale deforma-tion and the last ones interpreted as fine details or noise.

uβuvβ'uv

v av

v

Figure 2: Geometric elements for the discrete Laplace operator.

In order to construct Hk, Vallet and Levy use theLaplace-De Rham operator derived from Discrete Ex-terior Calculus, which is given by an N × N matrix ∆,where N = #V is the number of vertices of the mesh. Itscoefficients ∆uv are zero if vertices u and v are differentand not adjacent, and otherwise:

∆uv = −cot (βuv) + cot

(β′uv

)√

au · av, ∆uu = −

∑v

∆uv ,

where av is the area of the circumcentric dual of vertexv, and angles βuv, β′uv are opposed to edge uv (Figure 2).

2.2. Harmonic transformThe above definition of the discrete Laplace oper-

ator is symmetric, guaranteeing the existence of real-valued orthogonal eigenvectors Hk : V → R, satisfying∆Hk(v) = λkHk and, using Kronecher’s δ notation, forall vertices u,v and all frequencies k,l, we have:∑

v∈V

Hk(v) · Hl(v) = δk,l,

N−1∑k=0

av · Hk(u) · Hk(v) = δu,v.

Since (Hk, 0 ≤ k < N) is a basis, any scalar functionf : V → R has coordinates on this basis, which wewrite fk. With the orthogonality properties, we obtainthose coordinates by simple projections [28]:

f (v) =

N−1∑k=0

Hk(v) · fk, fk =∑u∈V

au · Hk(u) · f (u) . (1)

2.3. Scalar Filtering

Using the analogy with Fourier analysis, the squareroot of the eigenvalue

√λk is interpreted as the fre-

quency of harmonic Hk. GIven a scalar signal on themesh f : V → R, the amplitudes fk of each of its fre-quencies of can be filtered by amplifying each of themby ϕk ∈ R. The filtered signal ϕf is then given by:

ϕf (v) =

N−1∑k=0

Hk(v) ·(ϕk · fk

).

As observed in the original work [28], the high fre-quencies (i.e. for k > n with n � N) are expensive tocompute, increase the complexity of the filter control,and do not provide significant modeling power at theglobal scale. We thus use a single factor ϕd to amplifyall those high frequencies:

ϕf (v) =

n−1∑k=0

Hk(v) ·(ϕk · fk

)+ ϕd ·

N−1∑k=n

Hk(v) · fk

=

n−1∑k=0

Hk(v) ·(ϕk · fk

)+ ϕd · d f (v) . (2)

The residual d f (v) of f at each vertex is computedat preprocessing using the complementary expression:d f (v) = f (v) −

∑n−1k=0 Hk(v) · fk. This way, the last eigen-

vectors of the basis are never used, and only the re-stricted basis (Hk, 0 ≤ k < n) needs to be computed.

3. Interactive 3D Caricature Overview

We propose a caricature system in the extrapolationcategory, which starts with a reference 3d template (e.g.,a normal face) and a 3d shape (e.g., the face to be carica-tured), both represented as triangular meshes (Figure 3).We first compute, for each vertex v of the shape, a cor-responding point xr(v) in the template (Section 6). Wedecompose their differences in the frequency domain,using the harmonic basis of the shape, since we wantto preserve and exaggerate its features, and not those ofthe template. We propose three different representationsof the differences between the shape and the template:coordinate-wise xδ, as a normal displacement ξ or as adistance minimizing filter φ, which are described in thenext section.

The caricature is created by the user through severalcontrols (Section 5): a morphing parameter µ, whichamplifies the differences; a localization function χ(v),which specifies the region of the shape to be deformed;

3

and a scale control Φk, which selects and eventually am-plifies the frequencies used for the caricature. The local-ization function χ(v) serves as a blending between theshape and the deformed region, and is obtained fromthe characteristic function of that region through filter-ing, also using the harmonics basis.

The computation of the caricature for each represen-tation is performed on the GPU, with an exact normalcomputation (Section 6), allowing the display of thethree resulting caricatures in real time with high quality.Finally, an optimized reprocessing allows using the re-sult of one caricature in place of the shape, caricaturingeach part independently with different parameters [2], ina layer-by-layer manner. The schematic representationof the whole process is illustrated in Figure 4.

4. Harmonic Exaggeration

In this section, we propose three harmonic represen-tations for the differences between the shape and thetemplate. We denote by x(v) = (x(v), y(v), z(v)) ∈ R3

the position of vertex v of the shape, and xr(v) the cor-responding point in the template. The k-th harmonic ofthe shape is denoted Hk : V → R.

Given a morphing parameter µ, and a representationXk of the differences, we compute the coordinates µx(v)of our exaggerated shape using the generic form:

µx(v) = x(v) + µ ·

∑k

Hk(v) · Xk

. (3)

This ensures that for µ = 0, 0x(v) = x(v), and that forµ = 1, xr(v) is approximated by 1x(v). The caricatureis obtained for µ < 0, exaggerating away from the tem-plate (Figure 5).

Figure 3: Vertex-wise correspondences between a reference template(left) and the original shape of Figure 1 to be caricatured (right), col-ored according to the value of H170 on each vertex.

Filters

ReferenceTemplate

xr(v')

OriginalModelx(v)

HarmonicbasisHk(v)

correspondences

v' v

Harmonicdifferences

x o~ ~

Coordinate-wiseNormal

displacementDistance

minimizing

User control , ,

Figure 4: Workflow of our interactive 3d caricature system: duringpreprocessing, the correspondences between the shape and the tem-plate and the harmonic decomposition of the shape are computed, andthen the harmonic representations of their differences. The user theninteracts by selecting morphing, localization and scale controls µ, χ, ϕ,visualizing the results of those controls in realtime. The user then vali-dates one of the three harmonic exaggeration models, and its resultingcaricature is quickly reused in place of the new shape.

4.1. Coordinate-wise filterThe simplest approach looks for a strict morphing:

1x(v) = xr(v), which translates in our framework byX(v) ≡ xδ(v) = xr(v) − x(v). The coordinate-wise ex-aggeration is written in coordinates as:

cµx(v) = x(v) + µ ·

n−1∑k=0

Hk(v) · xδk + dxδ(v)

(4)

This ensures that for µ = 0, c0x = x and for µ = 1, c

1x =

xr. Without further control, this corresponds to linearmorphing as obtained by direct correspondences [15].

Observe that there is no single filter ϕ that achievesmapping the three scalar signals x(v), y(v) and z(v) ofthe shape to the template through Equation (2). This isonly possible in general with three different filters ϕx

k ,ϕ

yk and ϕz

k, which are defined here by:

ϕxk = 1 + µ · xδk

xk

ϕyk = 1 + µ ·

yδkyk

ϕzk = 1 + µ · zδk

zk

=⇒µ x(v) =

xϕx (v)yϕy (v)zϕz (v)

.Such triple filtering allows mapping any mesh to anymesh with the same connectivity, and does not use muchof the intrinsic geometry of the shape (Figure 5).

4

Figure 5: Caricatures obtained with an ellipsoid as reference template, varying morphing parameter µ from −0.3 (top), −0.4 (middle) to −0.6(bottom), for the coordinate-wise filter (left), normal displacement filter (center) and distance minimizing filter (right).

4.2. Normal displacement filter

In order to model the shape to template differencesthrough a single scalar signal, we must give up the exactmorphing 1x(v) ≈ xr(v). We propose to approximate thedifference xδ(v) by its projection ξ(v) along the shapenormal n(v) ∈ S2 at vertex v: ξ(v) = 〈 xδ(v) | n(v) 〉.This normal displacement ξ : V → R is a scalar signal,directly representable in the low and high frequencies ofthe shape by ξk, dξ(v) through Equation (2). InterpretingX(v) � ξ(v)·n(v), the normal displacement exaggerationis derived from Equation (3):

nµx(v) = x(v) + µ ·

n−1∑k=0

Hk(v) · ξk + dξ(v)

· n(v) (5)

On perfect conditions [7], xr(v) = x(v) + ξ(v) · n(v),guaranteeing the morphing. The specificity of thoseconditions means that this filter restricts the deforma-tion, preserving more of the original geometry of theshape (Figure 5). However, this formulation generatesauto-intersections in the caricature for large |µ|.

4.3. Distance minimizing filter

Our third filter models the caricature directly as aspectral deformation [24], looking for a single scalarfilter ϕ, ϕd that, equally applied to each of the three co-ordinate signals x(v), y(v) and z(v) would best approxi-mate the template. Using a 2-norm, pair-wise distancefor approximation measure, this error is expressed as∑

v∈V ‖ϕx(v)−xr(v) ‖2, where the optimization variables

ϕ, ϕd appear in ϕx =∑n−1

k=0 Hk(v) ·(ϕk · xk

)+ ϕd · dx(v).

Subtracting 1 to each ϕk and to ϕd is equivalent to re-placing xr by xδ in the measure (see appendix):

oφ,oφd = argminϕ,ϕd

12

∑v∈V

‖ ϕx(v) − xδ(v) ‖2

This minimization actually simplifies due to the or-thogonality of the harmonic basis, and its solution is ex-plicitly given by (see appendix):

oφk =

⟨xk

∣∣∣ xδk

⟩‖xk‖

2 , oφd =

∑v∈V

⟨dx(v)

∣∣∣ dxδ(v)⟩

∑v∈V‖dx(v)‖2

.

This optimal filter is then exaggerated by morphingparameter µ using Equation (3):

oµx(v) = x(v) + µ ·

n−1∑k=0

Hk(v) ·oφk · xk + µ ·oφd · dx(v). (6)

This filter uses few extrinsic geometry of the shape, andthus further restricts the deformation possibilities, forsmall |µ| (Figure 5). In positive terms, it better preservesthe intrinsic geometry of the shape even for large |µ|.

5. Localization and Scale Controls

The above filters are already controlled by the mor-phing parameter µ, with the generic formulation of thecaricatured coordinates µx(v) = x(v)+µ·

(∑k Hk(v) · Xk

),

with Xk according to the chosen filter (Equations (4), (5)and (6)). We enhance the user control by letting her/himspecify which part of the shape (e.g. nose, chick) shouldbe deformed, and how much each scale (i.e. frequency)should contribute to the deformation. Those controls areinjected in the generic form of µx.

5

Figure 6: Localization control: by a simple pick on the mesh (left), the user sketches a region to deform χ01. This region is smoothed by cuttingthe high frequencies k ≥ nχ of its characteristic function χ01, here with nχ = 32 for the top row, 96 for the middle one and 192 for the lowest row,out of n = 512 computed frequencies. The results of the three filters with µ = −0.8 is represented in the same order as Figure 5.

5.1. Localization control

The user specifies the region to be deformed simplyby painting on the shape (Figure 6). The vertices pickedby the pointer defines a characteristic function on themesh: χ01 : V → R. This function is typically 1 forthe painted vertices and 0 otherwise, but intensity val-ues lower than 1 may be specified before picking. Thedeformation is then restricted to the painted region by:µx(v) = x(v) + µ · χ01(v) ·

(∑k Hk(v) · Xk + dX(v)

).

If restricting the deformation directly to the paintedregion, the blending by χ01 would lead to cracks. Wedecompose χ01 to obtain a smoother localization func-tion χ. Since we already computed the harmonics de-composition on the mesh, we obtain a surface Gaussiansmoothing χ interactively by a low-pass filter on χ01:

χ(v) =

nχ−1∑k=0

Hk(v) · χ01k .

The cutoff frequency nχ is specified by the user. Thecaricature is then smoothly localized by:

µx(v) = x(v) + µ · χ(v) ·

∑k

Hk(v) · Xk + dX(v)

.The user can caricature independently several regions

by using result on one region as the new shape, andspecify a new region on that shape.

5.2. Scale control

Our caricature system offers controls to exaggeraterather fine details or larger scale differences, by restrict-ing or amplifying the difference representation to cer-

tain frequencies. This is simply done through an inde-pendent filter Φk,Φd (Figure 7):

µx(v) = x(v) + µ χ(v)

∑k

Hk(v) Φk Xk + Φd dX(v)

.

Figure 7: The result of Figure 6 with nχ = 96 and the normal dis-placement filter is used in place of the original shape to continue thecaricaturing on the chin (left). The frequencies can be used equally(middle) or selecting and amplifying some of them through Φ, hereamplifying intermediate frequencies (right) to mark the chin wrinkle.

Combining with the morphing, localization and scalecontrol, we get the final formulations for each filter:

- coordinate-wise filter (Equation (4)):

cµx(v) = x(v) + µ χ(v)

n−1∑k=0

Hk(v) Φk xδk + Φd dxδ(v)

.- normal displacement filter (Equation (5)):

nµx(v) = x(v) + µ χ(v)

n−1∑k=0

Hk(v) Φk ξk + Φd dξ(v)

n(v).

- distance minimizing filter (Equation (6)):

oµx(v) = x(v) + µ χ(v)

n−1∑k=0

Hk(v) Φkoφk xk + Φd

oφddx(v)

.6

distance minimizingcoordinate-wisenormal displacement coordinate-wise

Figure 8: Progressive caricature of a face model registered to the face of Figure 5. The last step uses a scaled down model as template.

6. Implementation

The above formulations allow for interactive imple-mentation, offering to the user an immediate return fromthe chosen setting of each control. This is achievedthrough a GPU implementation of the filters, leaving tothe CPU the low-pass filter for the localization controlχ, which is very fast since nχ is small. The picking forthe scale control Φ and the definition of χ01 is also han-dled in CPU. Finally, to correctly illuminate the carica-tured shapes, we propose a GLSL program to computethe vertex normals. This geometry shader is not specificto our filters and would apply to any triangular mesh.

6.1. Shape / template correspondences

The correspondences computation is a challengingproblem, mainly because geometries may differ consid-erably between the template and the shape. In addition,our filters presume that meshes coordinates are repre-sented in a common global coordinate system, requiringa previous registration step.

A first option for the correspondences relies on semi-automatic cross-parameterization [15]. A set of match-ing vertices is obtained by user interaction, and a com-mon, coarse base mesh is constructed on those ver-tices. The connectivity of the shape is then mapped ontothe template through the parameterizations obtained oneach element of the base mesh. Then, we align the shapeto the template through a rigid-body transformation,computed by minimizing a point-to-point error metricbetween a set of correspondences. We use the corre-spondences obtained by the cross-parameterization, anda fast registration based on singular value decomposi-tion [3] (Figures 10, 11 and 14).

Since the parameterization step is closely related tothe registration step, as the quality of both results is de-pendent on the quality of the correspondences the re-verse approach is feasible. A robust registration actuallydefines (or may help improving) the correspondences

between the shape and the template. We illustrate thisapproach by using a single ICP [4] with automatic pre-alignment on some of the results (Figures 1 and 8).

As a final step of the caricature, the user may magnifyall the previously edited exaggerations at once. A sim-ple way to obtain such effect is to consider as templatea simplified or scaled down version of the shape (Fig-ures 1 and 8). The vertices of the versions of the meshthen naturally correspond.

6.2. GPU feeding and normal computation

We opted for GLSL as programming language forits portability, and follow previous GPU Manifold Har-monic filter implementation [16]. It uses a fragmentshader for the summation of Equation (2) and a render-to-vertex-buffer copy to obtain µx for each filter. Theonly difference resides in the larger number of textures:xδ, dxδ, ξ, dξ, n and oφ, complementing the basis Hk,original position x, dx and the controls µ, χ and Φ.

uniform float w,h ; // buffer dimensionsuniform float max ; // maximal areavarying in float id[], d−1[]; // vertex id and inverse degree

void main()vec3 e0 = gl PositionIn[1].xyz - gl PositionIn[0].xyz ;vec3 e1 = gl PositionIn[2].xyz - gl PositionIn[0].xyz ;vec3 nT = max ∗ cross(e0, e1) + vec3(0.5,0.5,0.5) ;for (int i=0; i< gl VerticesIn; i++)

gl FrontColor.rgb = d−1[i] * nT ;gl Position.x = 2.0 ∗ mod( id[i], w ) / w - 1.0 ;gl Position.y = 2.0 ∗ floor( id[i]/w ) / h - 1.0 ;EmitVertex() ;EndPrimitive() ;

Algorithm 9: Geometry shader for the vertex normal computation.

We introduce a vertex normal computation using thesame GLSL framework. We render the triangle meshonce with the updated vertices coordinates, and a fixedtexture containing the vertex id i(v) and the inverse ofits degree d−1(v). The geometry shader (Algorithm 9)

7

coordinate-wise = -0.75 coordinate-wise

= 0.55

normal displacement = -0.25

= 0

= 1

distance minimizing

= -1.0

distance minimizing = -0.75

normal displacement = 0.45

Figure 10: Independent caricatures obtained using different filters, regions and morphing parameter values, using Beethoven model as template forthe lion head. For each caricature, the vignettes represent the selected region χ01 (bottom) and blending function χ (top) all obtained with nχ = 96.The scale control has been used for the middle row, and is displayed below the vignettes.

computes the triangle normal nT and, for each trianglevertex v, emits a point at the 2d position in the normalarray corresponding to its id i(v), with a color encod-ing d−1(v) nT . Using a 1/1 blending function performsthe final sum, the frame buffer ends up storing, for eachvertex, the average of the normals of its adjacent faces.This sum is easily improved by weighting each normalby the area of the triangle, i.e. avoiding to normalize thecross product. The normal is color coded with a smallworkaround to the normalization. The normal is finallyused after a render-to-normal-buffer copy.

6.3. Fast use of a result as a new shape

The versatility of the proposed controls may leadthe user to prefer one filter for some part of the car-icature and different filters or control values for otherparts. This is possible by using a partial caricature re-sult in place of the original shape, in a layer-by-layerprocess. In particular, this permits caricaturing one fea-ture at a time, as recommended by professional [2]. Thefirst CPU preprocessing computes the template/shapevertex-wise correspondences, the manifold harmonicsbasis and the filter elements xδ, dxδ, ξ, dξ and oφ. Toavoid computing for each layer, we keep the first corre-spondences and the harmonic basis, preserving the orig-inal features of the shape. Only the filter elements areupdated and re-sent to the GPU.

7. Results

We experiment our interactive caricaturing tool, let-ting the user choose the filters that were more expressiveto her/him. We first observe that each of the three fil-ters was used, with a preference on the normal displace-ment filter for smooth shapes (Figure 7), distance min-imizing for isolated parts (Figures 1, 8, 12 and 13) andcoordinate-wise for the final exaggeration (Figure 8).

We also test different correspondences computationmethods: cross-parameterization [15], as illustrated inFigures 10, 11 and 14, registration only and scaled downmodel, both used in the examples of Figures 1 and 8.

We set the cut-off frequency to nχ = 96. We can usestrong morphing parameters for the distance minimizingfilters: µ = −8.0, while much lower ones −0.8 > µ >−2.4 suit better for the other filters. The scale control Φ

was set by hand, with amplifications from 0 to 2.

Interaction performances. Our harmonic formulationallows interactive rates (Table 1), and we take advantageof this capability by introducing our WYSIWYG inter-face for 3d caricature. This feature let the user quicklytune expressive caricatures even with a single iterationof the parameters (Figures 11 and 10). The fast re-useof the result as template also permits more sophisticatedeffects, while keeping the interactivity of the interface(Figures 1, 8, 13 and 12).

8

coordinate-wise = -0.27

distance minimizing

= 0.4

distance minimizing

= -0.5

coordinate-wise = -0.67 = 0

= 1

Figure 11: Independent caricatures obtained using different regions and morphing parameter values, using a dog head as template. For eachcaricature, the vignettes represent the selected region χ01 (bottom) and blending function χ (top) all obtained with nχ = 96.

verts Hk X usermodel fig. template #V secs secs fpsFace 1 1,3 Face 3 49k 429 4.4 8.6Simple 5-7 Ellipsoid 50k 422 3.8 8.5Face 2 8 Simple 33k 217 2.5 8.8Lion 10 Beethoven 30k 171 1.1 12.1Face 3 11 Dog 36k 191 1.7 10.6Planck 12 Egea 49k 402 3.9 9.0Egea 13 Planck 31k 183 2.4 9.6

Table 1: Performance tests on a 2.8GHz processor with a GeForceGT 9600M with 512MB of RAM. The interaction speed includes themodel display, picking, the three filters and the vertex normal update.It is measured in frame per second (fps), while the pre-computationtimes are expressed in seconds.

Indeed, the interface achieves above 8 frames per sec-ond when displaying the three filters simultaneously,even with the picking and smoothing of χ which takesin average 39 milliseconds on the models we use (Ta-ble 1). The substitution of the shape by one of the resultslasts in average 1.2 seconds for the filter computation onCPU and up to 4 seconds for the reprocessing.

Filter comparisons. We can compare the filters intro-duced here based on some preliminary experiments.The coordinate-wise filter allows for blended exagger-ation or reduction (as Egea’s mouth in Figure 13).Among the three filters, it is the most appropriate forgenerating sharp features, such as the nose in Figure 6,although with large µ it can induce self-intersections ofthe model (Figure 14). It is also appropriate for the finalexaggeration with a scaled down template. The normaldisplacement filter must be used with small µ to keepa coherent model, but suits very well to inflate regions

without thin features, such as the chin in Figure 8 orlarge noses in Figures 1 and 12. Finally, the distanceminimizing filter, by its least squares nature, offers sub-tle, diffusive but shape preserving deformations. For ex-ample in Figure 12, the ears are exaggerated but the in-ner cartilage shape is preserved, similarly to Egea’s scarin Figure 13.

The left group of Figure 14 illustrates the use of thecoordinate-wise filter alone, varying the morphing pa-rameter µ without other control. This morphing ex-actly corresponds to the only cross-parameterization ap-proach [15] if χ ≡ 1, and we can observe the more ex-pressive results obtained in Figures 12 and 13.

Finally, our formulation does not rely on a specifichuman face model, and thus allows for other caricaturestyles, such as using animals (Figure 11) or even carica-turing animals (Figure 10).

Discussion. The use of manifold harmonics to modelthe exaggeration as a filter restricts the deformation tooperate on non-localized basis. The reduction of the fil-ters to not too high frequencies further limits the abilityto capture details such as wrinkles, while a more carefultreatment of high frequencies may [26].

The correspondence accuracy may also impact theresult. In particular, using only continuous cross-parameterization reveals the C1-discontinuity of the cor-respondences for large |µ| (in particular the left Egea inthe last row of Figure 14).

9

distance minimizing distance minimizingnormal displacement

Figure 12: Progressive caricature of the Max Planck model with template from Egea (Figure 14).

8. Conclusion

In this work, we propose an interactive tool for 3dcaricature modeling. It derives the extrapolation princi-ple [6] in terms of harmonic filtering, introducing inter-activity and providing morphing, localization and scalecontrols. This work may be improved in several direc-tions, among which enhancing the interface with auto-matic segmentation to select significant parts to be exag-gerated, adding subjective learning in the interface withintelligent galleries [29], and including texture and non-photorealistic rendering.

Acknowledgements. We would like to thank the Max-Planck-Institut fur Informatik and Cyberware for the 3dmodels and the persons who let us use their 3d faces.This work is financed by CNPq, FAPERJ and FAPEAL.

References

[1] Akleman, E., 1997. Making caricatures with morphing. In: Sig-graph Visual Proceedings. p. 145.

[2] Akleman, E., Reisch, J., 2004. Modeling expressive 3D carica-tures. In: Siggraph Sketches. p. 61.

[3] Arun, K., Huang, T., Blostein, S., 1987. Least-squares fitting oftwo 3D point sets. Pattern Analysis and Machine Intelligence9 (5), 698–700.

[4] Besl, P., McKay, N., 1992. A method for registration of 3Dshapes. Pattern Analysis and Machine Intelligence 14 (2), 239–256.

[5] Blanz, V., Vetter, T., 1999. A morphable model for the synthesisof 3d faces. In: Siggraph. pp. 187–194.

[6] Brennan, S. E., 1985. Caricature generator: the dynamic exag-geration of faces by computer. Leonardo 18 (3), 170–178.

[7] Chazal, F., Lieutier, A., Rossignac, J., 2007. Normal-map be-tween normal-compatible manifolds. International Journal ofComputational Geometry and Applications 17 (5), 403–421.

[8] Chen, H., Zheng, N.-N., Liang, L., Li, Y., Xu, Y.-Q., Shum, H.-Y., 2002. Pictoon: a personalized image-based cartoon system.In: Multimedia. pp. 171–178.

[9] Chen, W., Yu, H., Zhang, J. J., 2009. Example based caricaturesynthesis. In: Computer Animation and Social Agents. pp. 11–18.

[10] Clarke, L., Chen, M., Mora, B., 2010. Automatic generation of3d caricatures based on artistic deformation styles. Transactionson Visualization and Computer Graphics.

[11] Fu, G., Chen, Y., Liu, J., Zhou, J., Li, P., 2008. Interactive ex-pressive 3D caricatures design. In: Multimedia. pp. 965–968.

[12] Gooch, B., Reinhard, E., Gooch, A., 2002. Perception-drivenblack-and-white drawings and caricatures.

[13] Grose, F., 1796. Rules for drawing caricatures : with an essayon comic painting. Hooper Wigstead.

[14] Koshimizu, H., Tominaga, M., Fujiwara, T., Murakami, K.,1999. On KANSEI facial processing for computerized facialcaricaturing system. In: Systems, Man, and Cybernetics. Vol. 6.pp. 294–299.

[15] Kraevoy, V., Sheffer, A., 2004. Cross-parameterization and com-patible remeshing of 3d models. In: Siggraph. pp. 861–869.

[16] Lewiner, T., Vieira, T., Bordignon, A., Cabral, A., Marques, C.,Paixao, J., Custodio, L., Lage, M., Andrade, M., Nascimento,R., de Botton, S., Pesco, S., Lopes, H., Mello, V., Peixoto, A.,Martinez, D., 2010. Tuning manifold harmonics filters. In: Sib-grapi. pp. 110–117.

[17] Li, P., Chen, Y., Liu, J., Fu, G., 2008. 3d caricature generationby manifold learning. In: Multimedia. pp. 941–944.

[18] Liang, L., Chen, H., Xu, Y.-Q., Shum, H.-Y., 2002. Example-based caricature generation with exaggeration. In: PacificGraphics. pp. 386–393.

[19] Liu, J., Chen, Y., Xie, J., Gao, X., Gao, W., 2009. Semi-supervised learning of caricature pattern from manifold regu-larization. In: Advances in Multimedia Modeling. pp. 413–424.

[20] Mattos, A., Cesar Jr, R., Mena-Chalco, J., Velho, L., 2010. 3Dlinear facial animation based on real data. In: Sibgrapi. pp. 271–278.

[21] Mo, Z., Lewis, J. P., Neumann, U., 2004. Improved automaticcaricature by feature normalization and exaggeration. In: Sig-graph Sketches. p. 57.

[22] Redman, L., 1984. How to draw caricatures. ContemporaryBooks.

10

[23] Reuter, M., Biasotti, S., Giorgi, D., Patane, G., Spagnuolo, M.,2009. Discrete Laplace-Beltrami operators for shape analysisand segmentation. Computers & Graphics 33 (3), 381–390.

[24] Rong, G., Cao, Y., Guo, X., 2008. Spectral mesh deformation.The Visual Computer 24 (7), 787–796.

[25] Shet, R. N., Lai, K. H., Edirisinghe, E. A., Chung, P. W., 2005.Use of neural networks in automatic caricature generation: Anapproach based on drawing style capture. Pattern Recognitionand Image Analysis, 343–351.

[26] Sorkine, O., Cohen-Or, D., Lipman, Y., Alexa, M., Rossl, C.,Seidel, H., 2004. Laplacian surface editing. In: Symposium onGeometry Processing. pp. 175–184.

[27] Tseng, C.-C., Lien, J.-J. J., 2007. Synthesis of exaggerative car-icature with inter and intra correlations. In: Asian Conferenceon Computer Vision. pp. 314–323.

[28] Vallet, B., Levy, B., 2008. Spectral geometry processing withmanifold harmonics. In: Computer Graphics Forum. Vol. 27.pp. 251–260.

[29] Vieira, T., Bordignon, A., Peixoto, A., Tavares, G., Lopes, H.,Velho, L., Lewiner, T., 2009. Learning good views through intel-ligent galleries. Computer Graphics Forum (Eurographics Pro-ceedings) 28 (2), 717–726.

[30] Yu, H., Zhang, J. J., 2010. Caricature synthesis based on meanvalue coordinates. In: Computer Animation and Social Agents.pp. 1–10.

Appendix A. Distance optimization details

The distance minimizing filter oφk,oφd is defined as the

global minimum of the quadratic functional F (ϕ, ϕd) =12

∑v∈V‖

ϕx(v) − xr(v)‖2. First, substituting ϕ per ϕ − 1leads to a simpler form:

ϕx(v) − xr(v) = x(ϕ−1)+1(v) −(xδ(v) − x(v)

)=

n−1∑k=0

Hk(v) · (ϕk − 1) · xk + (ϕd − 1) · dx

−

n−1∑k=0

Hk(v) · xk + dx − x(v)

− xδ(v)

= x(ϕ−1)(v) − 0 − xδ(v)

Using this substitution and the orthogonality proper-ties of the harmonic basis allows to solve this optimiza-tion problem explicitly. Without grouping the high fre-quencies yet, i.e. considering ϕk = ϕd for k ≥ n, we canwrite F in the harmonic basis.

F(ϕ) =12

∑v∈V

‖ϕx − xδ‖2

=12

∑v∈V

∥∥∥∥ N−1∑k=0

Hk(v) ·(ϕk · xk

)−

N−1∑k=0

Hk(v) · xδk

∥∥∥∥2

=12

∑v∈V

∥∥∥∥ N−1∑k=0

Hk(v) ·(ϕk · xk − xδk

) ∥∥∥∥2

∣∣∣∣ N−1∑l=0

Hl(v) ·(ϕl · xl − xδl

) ⟩

Grouping the terms by v and using Equation (1):

F(ϕ)=12

N−1∑k,l=0

∑v∈V

Hk(v)Hl(v)

·⟨ϕk · xk − xδk

∣∣∣∣ϕl · xl − xδl

⟩=

12

N−1∑k,l=0

δk,l

⟨ϕk · xk − xδk

∣∣∣∣ϕl · xl − xδl

⟩=

12

N−1∑k=0

∥∥∥∥ϕk · xk − xδk

∥∥∥∥2

=12

n−1∑k=0

∥∥∥∥ϕk xk − xδk

∥∥∥∥2+

12

N−1∑k=n

∥∥∥∥ϕd xk − xδk

∥∥∥∥2

The last line is deduced by grouping the high frequen-cies.

The minimum of functional F is achieved when itsgradient

[∂F∂ϕ(0) ,

∂F∂ϕ(2) , . . . ,

∂F∂ϕ(n−1) ,

∂F∂ϕd

]vanishes:

∂F∂ϕk

= 0 =⟨

xk

∣∣∣ ϕk · xk − xδk

⟩= ϕk · ‖xk‖

2 −⟨

xk

∣∣∣ xδk

⟩.

The high frequency part goes the reverse way of the pre-vious derivation of F:

∂F∂ϕd

= 0 =

N−1∑k=n

⟨xk

∣∣∣ ϕd · xk − xδk

⟩=

N−1∑k,l=n

δk,l ·⟨

xk

∣∣∣ ϕd · xl − xδl

⟩=

N−1∑k,l=n

∑v∈V

Hk(v)Hl(v) ·⟨

xk

∣∣∣ ϕd · xl − xδl

⟩=

∑v∈V

⟨ N−1∑k=n

Hk(v) · xk

∣∣∣ N−1∑l=n

ϕd · Hl(v) · xl − Hl(v) · xδl

⟩=

∑v∈V

⟨dx(v)

∣∣∣ ϕd · dx(v) − dxδ(v)⟩

= ϕd ·∑v∈V

‖dx(v)‖2 −∑v∈V

⟨dx(v)

∣∣∣ dxδ(v)⟩.

Each equation is linear and independent of the other.The minimum argument oφ,oφd of F is then explicitlygiven by:

oφk =

⟨xk

∣∣∣ xδk

⟩‖xk‖

2 , oφd =

∑v∈V

⟨dx(v)

∣∣∣ dxδ(v)⟩

∑v∈V‖dx(v)‖2

.

11

Figure 13: Progressive caricature of the Egea model, with reference from the Max Planck (Figure 14).

1

0

0

-1

μ

μ

Figure 14: Varying morphing parameter µ from +1 to −1 from Egea to Max Planck (left) and reversely (right), using the coordinate-wise, normaland distance minimizing filter.

12

interactive 3d caricature from harmonic exaggeration · ation rules that allow the user to control...

Documents