face animation
-
8/3/2019 Face Animation
Facial Animation
By: Shahzad Malik
CSC2529 Presentation
March 5, 2003
-
Motivation
Realistic human facial animation is a challenging problem (DOF, deformations)
Would like expressive and plausible animations of a photorealistic 3D face
Useful for virtual characters in film, video games
-
Papers
Three major areas are Lip Syncing, Face Modeling, and Expression Synthesis
  Video Rewrite (Bregler, SIGGRAPH 1997)
  Making Faces (Guenter, SIGGRAPH 1998)
  Expression Cloning (Noh, SIGGRAPH 2001)
-
Video Rewrite
Generate a new video of an actor mouthing a new utterance by piecing together old footage
Two stages:
  Analysis
  Synthesis
-
Analysis Stage
Given footage of the subject speaking, extract mouth position and lip shape
Hand label 26 training images:
  34 points on mouth (20 outer boundary, 12 inner boundary, 1 at bottom of upper teeth, 1 at top of lower teeth)
  20 points on chin and jaw line
Morph training set to get to 351 images
-
EigenPoints
Create EigenPoints models using this set
Use derived EigenPoints model to label features in all frames of the training video
-
EigenPoints (continued)
Problem: EigenPoints assumes features are undergoing pure translation
-
Face Warping
Before EigenPoints labeling, warp each image into a reference plane
Use a minimization algorithm to register images
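E(M) = \sum_i \left[ I_F(x'_i) - I_T(x_i) \right]^2, \quad x' = M x
M = \begin{bmatrix} m_0 & m_1 & m_2 \\ m_3 & m_4 & m_5 \\ m_6 & m_7 & m_8 \end{bmatrix}, \quad x = (x, y, 1)^T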
-
Face Warping (continued)
Use rigid parts of face to estimate warp M
Warp face by M^{-1}
Perform eigenpoint analysis
Back-project features by M onto face
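The registration error can be sketched in a few lines. This is only an illustration, assuming images are lists of rows of grey values, nearest-pixel sampling, and a sum of squared intensity differences between the reference image and the face image; the function names `apply_warp` and `ssd_error` are hypothetical, and the minimization algorithm itself is not specified on the slides, so only the error evaluation is shown.

```python
def apply_warp(M, x, y):
    """Map (x, y) through the 3x3 matrix M in homogeneous coordinates."""
    xh = M[0][0] * x + M[0][1] * y + M[0][2]
    yh = M[1][0] * x + M[1][1] * y + M[1][2]
    w = M[2][0] * x + M[2][1] * y + M[2][2]
    return xh / w, yh / w


def ssd_error(template, face, M, points):
    """Sum of squared differences over integer sample points (x, y).

    template, face: 2D lists of grey values indexed [row][col].
    points: positions in the template (reference) image.
    Points that warp outside the face image are skipped (an assumption).
    """
    total = 0.0
    for x, y in points:
        xw, yw = apply_warp(M, x, y)
        col, row = int(round(xw)), int(round(yw))
        if 0 <= row < len(face) and 0 <= col < len(face[0]):
            diff = face[row][col] - template[y][x]
            total += diff * diff
    return total
```

A registration routine would search over the entries of M for the value that minimizes `ssd_error`, using only points on the rigid parts of the face.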
-
Audio Analysis
Want to capture visual dynamics of speech
Phonemes are not enough
Consider coarticulation
Lip shapes for many phonemes are modified based on the phoneme's context (e.g. /T/ in "beet" vs. /T/ in "boot")
-
Audio Analysis (continued)
Segment speech into triphones
  (e.g. "teapot" becomes /SIL-T-IY/, /T-IY-P/, /IY-P-AA/, /P-AA-T/, and /AA-T-SIL/)
Emphasize middle of each triphone
Effectively captures forward and backward coarticulation
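The triphone segmentation can be illustrated with a short sketch. It assumes the phoneme sequence is already available as a list and that the utterance is padded with silence on both sides, as in the "teapot" example; the function name `to_triphones` is hypothetical.

```python
def to_triphones(phonemes, silence="SIL"):
    """Return the overlapping triphones of a phoneme sequence, padded with silence."""
    padded = [silence] + list(phonemes) + [silence]
    return [tuple(padded[i:i + 3]) for i in range(len(padded) - 2)]


teapot = to_triphones(["T", "IY", "P", "AA", "T"])
# teapot[0] is ("SIL", "T", "IY") and teapot[-1] is ("AA", "T", "SIL")
```

Each phoneme appears as the center of exactly one triphone, with its left and right neighbors carrying the coarticulation context.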
-
Audio Analysis (continued)
Training footage audio is labeled with phonemes and associated timing
Use gender-specific HMMs for segmentation
Convert transcript into triphones
-
Synthesis Stage
Given some new speech utterance
  Mark it with phoneme labels
  Determine triphones
  Find a video example with the desired transition in database
Compute a matching distance to each triphone:
  error = \alpha D_p + (1 - \alpha) D_s
-
Viseme Classes
Cluster phonemes into viseme classes
Use 26 viseme classes (10 consonant, 15 vowel, 1 silence):
  (1) /CH/, /JH/, /SH/, /ZH/
  (2) /K/, /G/, /N/, /L/
  ...
  (25) /IH/, /AE/, /AH/
  (26) /SIL/
-
Phoneme Context Distance
Dp is phoneme context distance
  Distance is 0 if phonemic categories are the same (e.g. /P/ and /P/)
  Distance is 1 if viseme classes are different (e.g. /P/ and /IY/)
  Distance is between 0 and 1 if different phonemic classes but same viseme class (e.g. /P/ and /B/)
Compute for the entire triphone
Weight the center phoneme most
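The rules above can be sketched as follows. Several values here are assumptions made only for illustration: the viseme table lists just the classes shown on the earlier slide plus hypothetical class numbers for /P/, /B/ and /IY/, the in-between distance is set to 0.5, and the per-position weights (0.25, 0.5, 0.25) are invented to give the center phoneme the largest weight.

```python
# Toy viseme table. Classes 1, 2, 25 and 26 come from the slide;
# the entries for P, B and IY use hypothetical class numbers.
VISEME_CLASS = {
    "CH": 1, "JH": 1, "SH": 1, "ZH": 1,
    "K": 2, "G": 2, "N": 2, "L": 2,
    "IH": 25, "AE": 25, "AH": 25,
    "SIL": 26,
    "P": 3, "B": 3,
    "IY": 20,
}

SAME_VISEME_DISTANCE = 0.5             # assumed value between 0 and 1
POSITION_WEIGHTS = (0.25, 0.5, 0.25)   # assumed; center phoneme weighted most


def phoneme_distance(a, b):
    """0 for the same phoneme, 1 for different viseme classes, else in between."""
    if a == b:
        return 0.0
    if VISEME_CLASS[a] == VISEME_CLASS[b]:
        return SAME_VISEME_DISTANCE
    return 1.0


def context_distance(target, candidate):
    """Weighted phoneme-context distance Dp between two triphones."""
    return sum(w * phoneme_distance(a, b)
               for w, a, b in zip(POSITION_WEIGHTS, target, candidate))
```

With these assumed numbers, /K-IY-P/ against /K-IY-B/ scores 0.125, while a mismatch in the center phoneme across viseme classes scores 0.5.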
-
Lip Shape Distance
Ds is distance between lip shapes in overlapping triphones
  E.g. for "teapot", contours for /IY/ and /P/ should match between /T-IY-P/ and /IY-P-AA/
Compute Euclidean distance between 4-element vectors (lip width, lip height, inner lip height, height of visible teeth)
Solution depends on neighbors in both directions (use dynamic programming)
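Because each choice affects the lip-shape cost of its neighbors, the selection can be sketched as a shortest-path search over candidates. The code below is a minimal illustration, reading "DP" as dynamic programming and assuming the total error is a weighted sum of the phoneme-context cost and the lip-shape cost with a scalar weight `alpha`; the data layout and the name `select_triphones` are hypothetical, not taken from the paper.

```python
import math


def lip_distance(a, b):
    """Euclidean distance between two 4-element lip-shape vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def select_triphones(dp_costs, lips_in, lips_out, alpha=0.5):
    """Choose one candidate video per target triphone by dynamic programming.

    dp_costs[t][c]: phoneme-context distance of candidate c for target t.
    lips_in[t][c], lips_out[t][c]: lip-shape vectors of candidate c in the
    region it shares with the previous / next triphone.
    Returns (total_error, chosen_candidate_indices).
    """
    best = [alpha * d for d in dp_costs[0]]
    back = []
    for t in range(1, len(dp_costs)):
        new_best, pointers = [], []
        for c, d in enumerate(dp_costs[t]):
            options = [
                best[p] + (1.0 - alpha)
                * lip_distance(lips_out[t - 1][p], lips_in[t][c])
                for p in range(len(best))
            ]
            p_min = min(range(len(options)), key=options.__getitem__)
            new_best.append(options[p_min] + alpha * d)
            pointers.append(p_min)
        best = new_best
        back.append(pointers)
    # Trace the cheapest path back from the last triphone.
    c = min(range(len(best)), key=best.__getitem__)
    total, path = best[c], [c]
    for pointers in reversed(back):
        c = pointers[c]
        path.append(c)
    return total, path[::-1]
```

The cost grows linearly with the number of target triphones and quadratically with the number of candidates per triphone, which is why a table-based search is preferable to enumerating every combination.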
-
Time Alignment of Triphone Videos
Need to combine triphone videos
Choose portion of overlapping triphones where lip shapes are as close as possible
Already done when computing Ds
-
Time Alignment to Utterance
Still need to time align with target audio
Compare corresponding phoneme transcripts
Start time of center phoneme in triphone is aligned with label in target transcript
Video is then stretched/compressed to fit time needed between target phoneme boundaries
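Stretching or compressing a clip to a target duration can be sketched as frame resampling. This assumes nearest-frame selection, which is a simplification chosen for the example; the slides do not say how frames are interpolated, and `resample_frames` is a hypothetical name.

```python
def resample_frames(frames, target_count):
    """Stretch or compress a list of frames to target_count frames.

    Picks the nearest source frame for each output position and keeps
    the first and last frames fixed.
    """
    if target_count <= 0 or not frames:
        return []
    if target_count == 1:
        return [frames[0]]
    scale = (len(frames) - 1) / (target_count - 1)
    return [frames[int(i * scale + 0.5)] for i in range(target_count)]
```

For example, a 4-frame clip resampled to 7 frames repeats some frames, while a 5-frame clip resampled to 3 frames keeps every second frame.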
-
Combining Lips and Background
Need to stitch new mouth movie into background original face sequence
Compute transform M as before
Warping replacement mask defines mouth and background portions in final video
[Figure: mouth mask and background mask]
-
Combining Lips and Background
Mouth shape comes from triphone image, and is warped using M
Jaw shape is combination of background jaw and triphone jaw lines
Near ears, jaw depends on background; near chin, jaw depends on mouth
Illumination matching is used to avoid seams between mouth and background
-
Video Rewrite Results
Video: 8 minutes of video, 109 sentences
Training Data: front-facing segments of video, around 1700 triphones
Emily sequences
-
Video Rewrite Results
2 minutes of video, 1157 triphones
JFK sequences
-
Video Rewrite
Image-based facial animation system
Driven by audio
Output sequence created from real video
Allows natural facial movements (eye blinks, head motions)
-
Making Faces
Allows capturing facial expressions in 3D from a video sequence
Provides a 3D model and texture that can be rendered on 3D hardware
-
Data Capture
Actor's face digitized using a Cyberware scanner to get a base 3D mesh
Six calibrated video cameras capture actor's expressions
[Figure: six camera views]
-
Data Capture
182 dots are glued to actor's face
Each dot is one of six colors with fluorescent pigment
Dots of same color are placed as far apart as possible
Dots follow the contours of the face (eyes, lips, nasolabial furrows, etc.)
-
Dot Labeling
Each dot needs a unique label
Dots will be used to warp the 3D mesh
Also used later for texture generation from the six views
For each frame in each camera:
  Classify each pixel as belonging to one of six categories
  Find connected components
  Compute the centroid
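The per-frame steps can be sketched as a flood fill over an already classified image. This assumes the pixel classification has been done and is given as a 2D list of color labels, and it uses 4-connectivity; both choices, and the name `dot_centroids`, are assumptions for the example.

```python
from collections import deque


def dot_centroids(labels, color):
    """Centroids (x, y) of 4-connected components of pixels labelled `color`.

    labels: 2D list where labels[row][col] is the color class of a pixel.
    """
    height, width = len(labels), len(labels[0])
    seen = [[False] * width for _ in range(height)]
    centroids = []
    for r in range(height):
        for c in range(width):
            if labels[r][c] != color or seen[r][c]:
                continue
            # Breadth-first flood fill collects one connected component.
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((x, y))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and labels[ny][nx] == color):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            cx = sum(p[0] for p in pixels) / len(pixels)
            cy = sum(p[1] for p in pixels) / len(pixels)
            centroids.append((cx, cy))
    return centroids
```

Running this once per color class yields the 2D dot positions for one camera frame.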
-
Dot Labeling (continued)
Need to compute dot correspondences between camera views
Must handle occlusions, false matches
Compute all point correspondences between k cameras and n 2D dots
\binom{k}{2} n^2 point correspondences
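The exhaustive pairing can be sketched with the standard library. The sketch assumes each camera contributes a list of 2D dots and pairs every dot in one camera with every dot in another; the name `candidate_correspondences` is hypothetical.

```python
from itertools import combinations, product


def candidate_correspondences(dots_per_camera):
    """Yield ((cam_a, dot_a), (cam_b, dot_b)) for every pair of cameras."""
    for cam_a, cam_b in combinations(range(len(dots_per_camera)), 2):
        for i, j in product(range(len(dots_per_camera[cam_a])),
                            range(len(dots_per_camera[cam_b]))):
            yield (cam_a, i), (cam_b, j)
```

If all six cameras saw all 182 dots, this would produce 15 x 182^2 = 496,860 candidate pairs per frame, which is why the later filtering steps matter.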
-
Dot Labeling (continued)
For each correspondence
  Triangulate a 3D point based on closest intersection of rays cast through 2D dots
  Check if back-projection error is above some threshold
All 3D candidates below threshold are stored
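Triangulation from two rays can be sketched as finding the midpoint of the shortest segment between them. This is a generic geometric construction, not necessarily the exact method used in the paper; the returned gap between the rays is only a stand-in for the back-projection check, which would need the calibrated camera models that the slides do not give.

```python
def closest_point_between_rays(o1, d1, o2, d2):
    """Midpoint of the shortest segment between two 3D lines, and their gap.

    Each line is origin + t * direction. Returns (point, gap), or None when
    the lines are (nearly) parallel.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    w = [a - b for a, b in zip(o1, o2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        return None
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    p1 = [o + t1 * v for o, v in zip(o1, d1)]
    p2 = [o + t2 * v for o, v in zip(o2, d2)]
    point = [(x + y) / 2.0 for x, y in zip(p1, p2)]
    gap = sum((x - y) ** 2 for x, y in zip(p1, p2)) ** 0.5
    return point, gap
```

A candidate whose rays pass far from each other is unlikely to be a true match and can be discarded early.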
-
Dot Labeling (continued)
Project stored 3D points into a reference view
Keep points that are within 2 pixels of dots in reference view
These points are potential 3D matches for a given 2D dot
Compute average as final 3D position
Assign to 2D dot in reference view
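The filtering and averaging step can be sketched as below. The `project` argument stands in for the calibrated reference camera, which is assumed here to be available as a function from a 3D point to pixel coordinates; `average_matches` is a hypothetical name.

```python
def average_matches(dot_2d, candidates, project, max_pixels=2.0):
    """Average the 3D candidates that project within max_pixels of dot_2d.

    candidates: list of (x, y, z) points.
    project: function mapping a 3D point to (u, v) in the reference view.
    Returns the mean 3D position, or None if no candidate is close enough.
    """
    close = []
    for point in candidates:
        u, v = project(point)
        if ((u - dot_2d[0]) ** 2 + (v - dot_2d[1]) ** 2) ** 0.5 <= max_pixels:
            close.append(point)
    if not close:
        return None
    return tuple(sum(axis) / len(close) for axis in zip(*close))
```

Averaging several consistent candidates reduces the noise of any single two-camera triangulation.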
-
Dot Labeling (continued)
Need to assign consistent labels to 3D dot locations across entire sequence
Define a reference set of dots D (frame 0)
Let d_j \in D be the neutral location for dot j
Position of d_j at frame i is d_j^i = d_j + v_j^i
For each reference dot, find the closest 3D dot of same color within some distance
-
Moving the Dots
Move reference dot to matched location
For unmatched reference dot d_k, let n_k be the set of neighbor dots with match in current frame i
v_k^i = \frac{1}{|n_k|} \sum_{d_j^i \in n_k} v_j^i
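The matching and the fallback for unmatched dots can be sketched together. The matching below is a simple greedy nearest-neighbor search per reference dot and ignores conflicts where two reference dots claim the same 3D dot, so it is a simplification; the data layout and function names are hypothetical.

```python
def match_reference_dots(reference, frame_dots, max_distance):
    """Match each reference dot to the closest frame dot of the same color.

    reference: {name: (color, (x, y, z))} neutral positions.
    frame_dots: list of (color, (x, y, z)) reconstructed in the current frame.
    Returns {name: offset} for the dots that found a match.
    """
    offsets = {}
    for name, (color, pos) in reference.items():
        best, best_dist = None, max_distance
        for other_color, other in frame_dots:
            if other_color != color:
                continue
            dist = sum((a - b) ** 2 for a, b in zip(pos, other)) ** 0.5
            if dist <= best_dist:
                best, best_dist = other, dist
        if best is not None:
            offsets[name] = tuple(a - b for a, b in zip(best, pos))
    return offsets


def mean_neighbor_offset(neighbor_offsets):
    """Offset for an unmatched dot: mean of its matched neighbors' offsets."""
    count = len(neighbor_offsets)
    if count == 0:
        # Assumption: with no matched neighbors, keep the neutral position.
        return (0.0, 0.0, 0.0)
    return tuple(sum(axis) / count for axis in zip(*neighbor_offsets))
```

An unmatched dot thus follows the average motion of the matched dots around it until it is seen again.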
-
Constructing the Mesh
Cyberware scan has problems:
  Fluorescent markers cause bumps on mesh
  No mouth opening
  Too many polygons
Bumps removed manually
Split mouth polygons, add teeth and tongue polygons
Run mesh simplification algorithm (Hoppe's algorithm: 460k to 4800 polys)
-
Moving the Mesh
Move vertices by linear combination of offsets of nearest dots
p_j^i = p_j + \sum_k \alpha_k^j (d_k^i - d_k), \quad \text{where } \sum_{d_k \in D} \alpha_k^j = 1
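Moving one vertex by a weighted sum of dot offsets can be sketched as follows. The sketch assumes the blend weights for the vertex are already known and sum to 1, and it stores dots by name in dictionaries; `move_vertex` is a hypothetical name.

```python
def move_vertex(p_neutral, blend, dots_neutral, dots_frame):
    """Displace a vertex by the blended offsets of its nearby dots.

    p_neutral: neutral vertex position (x, y, z).
    blend: {dot_name: weight}, expected to sum to 1.
    dots_neutral, dots_frame: {dot_name: (x, y, z)} neutral and current positions.
    """
    moved = list(p_neutral)
    for name, alpha in blend.items():
        for axis in range(3):
            moved[axis] += alpha * (dots_frame[name][axis] - dots_neutral[name][axis])
    return tuple(moved)
```

Because the weights sum to 1, a vertex surrounded by dots that all move by the same offset moves by exactly that offset.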
-
Assigning Blend Coefficients
Assign blend coefficients for a grid of 1400 evenly distributed points on face
-
Assigning Blend Coefficients
Label each dot, vertex, grid point as above, below, or neither
Find 2 closest dots to each grid point p
D_n is set of dots within 1.8(d_1+d_2)/2 of p, where d_1 and d_2 are the distances to those 2 dots
Remove dots that lie in roughly the same direction from p
Assign blend values based on distance from p
-
Assign Blend Coefficients (cont.)
If dot not in D_n, then \alpha is 0
If dot in D_n:
  let l_i = \frac{1.0}{\| d_i - p \|}, then \alpha_i = \frac{l_i}{\sum_{d_i \in D_n} l_i}
For vertices, find closest grid points
Copy blend coefficients
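The coefficient assignment can be sketched by reading the slide's formula as inverse-distance weighting normalized over the dots in D_n. The handling of a dot that coincides exactly with the grid point is an assumption added to avoid division by zero, and `blend_coefficients` is a hypothetical name.

```python
def blend_coefficients(p, dots):
    """Normalized inverse-distance weights for the dots in D_n.

    p: grid point (x, y, z).
    dots: {dot_name: (x, y, z)} for the dots already selected into D_n.
    """
    inverse = {}
    for name, d in dots.items():
        dist = sum((a - b) ** 2 for a, b in zip(d, p)) ** 0.5
        if dist == 0.0:
            # Assumption: a dot exactly at p receives all of the weight.
            return {n: (1.0 if n == name else 0.0) for n in dots}
        inverse[name] = 1.0 / dist
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}
```

For a grid point at the origin with dots at distances 1 and 3, the weights come out as 0.75 and 0.25.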
-
Dot Removal
Substitute skin color for dot colors
First low-pass filter the image
Directional filter prevents color bleeding
  Black = symmetric, White = directional
[Figure: face with dots; low-pass mask]
-
Dot Removal (continued)
Extract rectangular patch of dot-free skin
High-pass filter this patch
Register patch to center of dot regions
Blend it with the low-frequency skin
Clamp hue values to narrow range
-
Dot Removal (continued)
[Figure panels: Original, Low-pass, High-pass, Hue clamped]
-
Texture Generation
Texture map generated for each frame
Project mesh onto cylinder
Compute mesh location (k, β1, β2) for each texel (u,v)
For each camera, transform mesh into view
For each texel, get 3D coordinates in mesh
Project 3D point to camera plane
Get color at (x,y) and store as texel color
-
Texture Generation (continued)
Compute a texel weight (dot product between texel normal on mesh and direction to camera)
Merge the texture maps from all the cameras based on the weight map
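The weighting and merging for a single texel can be sketched as below. Clamping negative weights to zero (for surfaces facing away from a camera) and using a normalized weighted average are assumptions of this sketch; the slides only state that the weight is a dot product and that the maps are merged by weight.

```python
def view_weight(normal, to_camera):
    """Dot product of the unit normal and unit direction to camera, clamped at 0."""
    def unit(v):
        length = sum(c * c for c in v) ** 0.5
        return [c / length for c in v]

    n, d = unit(normal), unit(to_camera)
    return max(0.0, sum(a * b for a, b in zip(n, d)))


def merge_texel(colors, weights):
    """Weighted average of the per-camera colors for one texel.

    Returns None when no camera has a positive weight for this texel.
    """
    total = sum(weights)
    if total == 0.0:
        return None
    return tuple(sum(w * c[ch] for w, c in zip(weights, colors)) / total
                 for ch in range(len(colors[0])))
```

Cameras that see a texel head-on dominate its final color, while grazing views contribute little.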
-
Results
-
Making Faces Summary
3D geometry and texture of a face
Data generated for each frame of video
Shading and highlights glued to face
Nice results, but not totally automated
Need to repeat entire process for every new face we want to animate
-
Expression Cloning
Allows facial expressions to be mapped from one model to another
[Figure: source model, animation, target model]
-
Expression Cloning Outline
[Diagram: motion capture data or any animation mechanism drives the source model to give the source animation; the source model is deformed to obtain dense surface correspondences with the target model; vertex displacements pass through motion transfer to give cloned expressions, i.e. the target animation]
-
Source Animation Creation
Use any existing facial animation method
E.g. Making Faces paper described earlier
[Figure: motion capture data, source model, source animation]
-
Dense Surface Correspondence
Manually select 15-35 correspondences
Morph the source model using RBFs
Perform cylindrical projection (ray through source vertex, into target mesh)
Compute barycentric coords of intersection
[Figure: initial features, after RBF, after projection]
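Computing barycentric coordinates of a point inside a triangle can be sketched with dot products. This is the standard textbook construction rather than anything specific to the paper, and it assumes the intersection point already lies in the plane of a non-degenerate triangle.

```python
def barycentric(p, a, b, c):
    """Barycentric coordinates (u, v, w) of p in triangle abc, so p = u*a + v*b + w*c."""
    def sub(x, y):
        return [i - j for i, j in zip(x, y)]

    def dot(x, y):
        return sum(i * j for i, j in zip(x, y))

    v0, v1, v2 = sub(b, a), sub(c, a), sub(p, a)
    d00, d01, d11 = dot(v0, v0), dot(v0, v1), dot(v1, v1)
    d20, d21 = dot(v2, v0), dot(v2, v1)
    denom = d00 * d11 - d01 * d01
    v = (d11 * d20 - d01 * d21) / denom
    w = (d00 * d21 - d01 * d20) / denom
    return 1.0 - v - w, v, w
```

Storing these three weights per vertex lets any quantity defined at the triangle's corners be interpolated at the intersection point later.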
-
Automatic Feature Selection
Can also automate initial correspondences
Use basic facts about human geometry:
  Tip of Nose = point with highest Z value
  Top of Head = point with highest Y value
Currently use around 15 such heuristic rules
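The two rules named above can be sketched directly. The sketch assumes vertices are (x, y, z) tuples with the face looking along +Z and +Y pointing up, as the rules imply; the function names are hypothetical and the remaining heuristic rules are not listed on the slide.

```python
def tip_of_nose(vertices):
    """Index of the vertex with the highest Z value."""
    return max(range(len(vertices)), key=lambda i: vertices[i][2])


def top_of_head(vertices):
    """Index of the vertex with the highest Y value."""
    return max(range(len(vertices)), key=lambda i: vertices[i][1])
```

Applying the same rules to both the source and target meshes yields matching feature pairs without manual selection.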
-
Example Deformations
[Figure: source, deformed source, target]
Closely approximates the target models
-
Animation with Motion Vectors
Animate by displacing target vertex by motion of corresponding source point
Interpolate barycentric coordinates of target vertices based on source vertices
Need to project target model onto source model (opposite of what we did before)
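Once a target vertex has barycentric coordinates inside a source triangle, its motion can be sketched as a weighted blend of that triangle's vertex motions. This illustrates the interpolation only; the direction and magnitude adjustment is a separate step, and `interpolate_motion` is a hypothetical name.

```python
def interpolate_motion(bary, tri_motions):
    """Motion at a point inside a source triangle.

    bary: barycentric coordinates (u, v, w) of the point.
    tri_motions: motion vectors (x, y, z) of the triangle's three vertices.
    """
    return tuple(sum(w * m[axis] for w, m in zip(bary, tri_motions))
                 for axis in range(3))
```

A target vertex that projects exactly onto a source vertex simply inherits that vertex's motion.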
-
Motion Vector Transfer
Need to adjust direction and magnitude
[Figure: source and target models, showing the source motion vector and the proper target motion vector]
-
Motion Vector Transfer (cont.)
Attach local coordinate system for each vertex in source and deformed source
  X-axis = average normal
  Y-axis = proj. of any adjacent edge onto plane with X-axis as normal
  Z-axis = cross product of X and Y
[Figure: local X, Y, Z axes at a vertex on the source and on the deformed source, next to the target; labels m and M mark the motion vector and the transformation]
-
Motion Vector Transfer
Compute transformation between two coordinate systems
Mapping determines the deformed source model motion vectors
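The direction adjustment can be sketched by building a local frame at a vertex on each model and re-expressing the motion vector in the second frame. This covers only the change of direction; any magnitude scaling is omitted, and the sketch assumes the chosen edge is not parallel to the normal. All function names are hypothetical.

```python
def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _unit(v):
    length = _dot(v, v) ** 0.5
    return tuple(c / length for c in v)


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def local_frame(normal, edge):
    """Local axes at a vertex: X = normal, Y = edge projected onto the
    plane perpendicular to X, Z = X cross Y."""
    x = _unit(normal)
    along = _dot(edge, x)
    y = _unit([e - along * xi for e, xi in zip(edge, x)])
    return x, y, _cross(x, y)


def transfer_direction(motion, source_frame, deformed_frame):
    """Express `motion` in the source frame, then rebuild it from the
    deformed frame's axes (a rotation between the two frames)."""
    coords = [_dot(motion, axis) for axis in source_frame]
    return tuple(sum(c * axis[i] for c, axis in zip(coords, deformed_frame))
                 for i in range(3))
```

A motion along the source normal becomes a motion along the deformed-source normal, which is the behavior wanted when the two surfaces are oriented differently.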
-
Example Motion Transfer
[Figure: source and target models with their motion vectors; the adjusted motions on the target are labeled "More horizontal" and "Smaller"]
-
Results
[Figure: source and targets for three expressions: angry expression, distorted mouth, big open mouth]
-
Summary
EC can animate new models using a library of existing expressions
Transfers motion vectors from source animations to target models
Process is fast and can be fully automated