
VISION-BASED CONTROL OF 3D FACIAL ANIMATION

Jinxiang Chai · Jing Xiao · Jessica Hodgins, Carnegie Mellon University

Eurographics / SIGGRAPH Symposium on Computer Animation 2003

Presented by Yusuf Osmanlıoğlu, 2010

OUTLINE

•  Aim
•  Existing techniques
•  Proposed method and challenges
•  Related work
•  Overall system
•  Analysis of system
•  Results
•  Future work

AIM

“Interactive avatar control”

•  Designing a rich set of realistic facial actions for a virtual character

•  Providing intuitive and interactive control over these actions in real time

•  Physically modeling skin and muscles of the face

•  Motion capture techniques
   –  Vision-based
   –  Online

EXISTING TECHNIQUES

[Chart: motion capture techniques compared by control interface quality]

•  Online motion capture
   +  High resolution
   −  Expensive
   −  Troublesome

•  Vision-based animation
   +  Inexpensive
   +  Easy to use
   −  Noisy
   −  Low resolution

PROPOSED METHOD

•  Vision-based interface + motion capture database → interactive avatar control

CHALLENGES

•  Map low-quality visual signals to high-quality motion data

•  Extract meaningful animation control signals from the video sequence of a live performer in real time

•  Deform the vertices of the face model to form facial expressions, driven by the displacements of a limited number of markers

•  Allow any user to control any 3D face model

RELATED WORK

•  Keyframe interpolation
•  Performance capture
•  Pseudo-muscle-based / muscle-based simulation
•  2D facial data for speech (viseme-driven approach)
•  Full 3D motion capture data

RELATED WORK

Motion capture
•  Making Faces [Guenter et al. 98]
•  Expression Cloning [Noh and Neumann 01]

Vision-based tracking for direct animation
•  Physical markers [Williams 90]
•  Edges [Terzopoulos and Waters 93; Lanitis et al. 97]
•  Optical flow with 3D models [Essa et al. 96; Pighin et al. 99; DeCarlo et al. 00]

Vision-based animation with blendshapes
•  Hand-drawn expressions [Buck et al. 00]
•  3D avatar model [FaceStation]

SYSTEM OVERVIEW

[Pipeline diagram: video analysis (performance capture, 3D head pose) → expression control and animation, drawing on the preprocessed motion capture data → expression retargeting → avatar animation]

Video Analysis

•  Vision-based facial tracking
   –  Tracking 19 2D features on the face: 2 lip, 2 mouth, 4 eyebrow, 8 eye, and 3 nose points

•  Initialization
   –  Start from a neutral face
   –  Position and initialize the parameters of the cylindrical head model used to capture head pose
   –  Manually position the 19 feature points

•  Tracking the head pose
   –  6 DOF: yaw, pitch, roll, and 3D position
   –  Position and orientation updated per frame
   –  Accumulated errors are reset

•  Expression tracking
   –  Square windows defined, centered at each feature's position

•  Expression control parameters
   –  15 parameters extracted automatically from the 2D tracking points: mouth (6), nose (2), eyes (2), eyebrows (5)
   –  Each parameter is a distance between two tracking points, a distance between a line and a point, or the orientation and center of the mouth (see the sketch below)
   –  Together these parameters form the expression control signal
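To make the three kinds of measurements concrete, here is a minimal Python/NumPy sketch. The paper defines the 15 parameters precisely; the function names and point groupings below are illustrative assumptions, not its actual implementation.

```python
import numpy as np

def point_distance(p, q):
    # e.g. mouth opening: distance between an upper- and a lower-lip point
    return float(np.linalg.norm(p - q))

def point_to_line_distance(p, a, b):
    # e.g. eyebrow height: distance from a brow point p to the line
    # through two stable reference points a and b (all 2D)
    d, r = b - a, p - a
    # 2D cross-product magnitude = twice the triangle area
    return float(abs(d[0] * r[1] - d[1] * r[0]) / np.linalg.norm(d))

def mouth_orientation_and_center(left_corner, right_corner):
    # orientation (radians) and center of the mouth from its two corners
    center = (left_corner + right_corner) / 2.0
    dx, dy = right_corner - left_corner
    return float(np.arctan2(dy, dx)), center
```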

SYSTEM OVERVIEW

[Pipeline diagram repeated]

Motion Capture Data Preprocessing

•  Building the face model from a 3D laser scan

•  Motion capture
   –  76 reflective markers attached to the actor's face
   –  The actor is allowed to move his head freely

•  Head and facial movements are therefore coupled
   –  Pose and expression must be decoupled (one possible approach is sketched below)

[Diagram: raw marker data → 3D pose / expression separation → expression control parameter extraction]
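The slides do not spell out how pose and expression are decoupled. One standard way to do it, sketched below under that assumption, is a rigid Kabsch alignment of each frame's 76 markers against a neutral pose; the residual displacements then carry only expression.

```python
import numpy as np

def rigid_align(markers, neutral):
    """Best-fit rotation R and translation t with markers ~= neutral @ R.T + t (Kabsch)."""
    mu_m, mu_n = markers.mean(axis=0), neutral.mean(axis=0)
    H = (neutral - mu_n).T @ (markers - mu_m)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, mu_m - R @ mu_n

def remove_head_pose(markers, neutral):
    """Map markers back into the neutral head frame, leaving expression only."""
    R, t = rigid_align(markers, neutral)
    return (markers - t) @ R                   # rows: R.T @ (m_i - t)
```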

Motion capture database

•  70,000 frames at 120 fps (~10 minutes of recording)

•  76 reference points on the face

•  6 basic facial expressions: anger, fear, surprise, sadness, joy, disgust

•  Additional actions: eating, yawning, snoring

•  Each expression repeated 6 times during the mocap session

•  Very limited motion data for speaking (6,000 frames), which does not cover all variations of speech-related facial movement

SYSTEM OVERVIEW

[Pipeline diagram repeated]

Expression Control and Animation

[Diagram: the vision-based interface yields 2D tracking data (19×2 DOF), reduced to facial expression control parameters (15 DOF); the motion capture database holds 3D motion data (76×3 DOF), reduced to the same 15-DOF control parameters]

•  Visual expression control signals are very noisy

•  The mapping from the 15-DOF expression control signal space to the 76×3-DOF 3D motion space is one-to-many

Expression Control and Animation

•  Noisy control signal filtered by projection onto "eigen-curves" (online PCA; a sketch of the idea follows)
   –  Time interval: W = 20 frames / 60 fps = 0.33 s
   –  7 largest eigen-curves capture 99.5% of the energy

•  Nearest neighbor search with the filtered control signal against the preprocessed motion capture database
   –  K = 120 closest examples
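A minimal sketch of the eigen-curve idea: build a PCA basis over fixed-length windows of control-parameter trajectories from the database, then denoise each incoming window by projecting it onto the 7 leading components. The basis construction below is generic PCA; treat the details as assumptions rather than the paper's exact procedure.

```python
import numpy as np

def eigen_curve_basis(training_windows, n_components=7):
    # training_windows: (N, W*P) flattened W-frame control-signal segments
    mean = training_windows.mean(axis=0)
    _, _, Vt = np.linalg.svd(training_windows - mean, full_matrices=False)
    return mean, Vt[:n_components]        # the 7 largest "eigen-curves"

def filter_control_window(noisy_window, mean, basis):
    # noisy_window: (W, P), e.g. W = 20 frames of P = 15 control parameters
    x = noisy_window.reshape(-1) - mean
    return (mean + basis.T @ (basis @ x)).reshape(noisy_window.shape)
```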

Expression Control and Animation

[Diagram: the filtered control signal queries the nearest neighbor search, which returns K examples at distances d1, d2, …, dK; the examples are blended with weights w(d1), w(d2), …, w(dK) (see the sketch below)]
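The slides give K = 120 but not the weight function w(d). A common choice, assumed here, is normalized inverse-distance weighting over the K nearest database examples:

```python
import numpy as np

def blend_nearest_examples(query, db_controls, db_motions, k=120, eps=1e-8):
    # query: (15,) filtered control parameters
    # db_controls: (N, 15); db_motions: (N, 228) for 76 markers * 3 DOF
    dist = np.linalg.norm(db_controls - query, axis=1)
    idx = np.argpartition(dist, k)[:k]     # K closest examples
    w = 1.0 / (dist[idx] + eps)            # assumed w(d) = 1/d
    w /= w.sum()
    return w @ db_motions[idx]             # blended 3D motion frame
```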

SYSTEM OVERVIEW

[Pipeline diagram repeated]

EXPRESSION RETARGETING

[Diagram: synthesized expression on the source model mapped to the avatar expression; a source vertex x_s with displacement δx_s corresponds to a target vertex x_t with displacement δx_t]

•  Learn the surface mapping function f using radial basis functions, such that x_t = f(x_s)

•  Transfer each motion vector through the local Jacobian matrix: δx_t = J_f(x_s) δx_s

•  Run-time computational cost is independent of the number of vertices in the head model (a sketch follows)
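Here is a sketch of the two steps above with Gaussian kernels: fit f from corresponding source/target vertices, then form the 3×3 local Jacobian that carries a source displacement δx_s to the target. The kernel type and ridge regularization are assumptions; the slides only specify that f is an RBF and that motion transfers through J_f.

```python
import numpy as np

def fit_rbf(src, tgt, sigma=1.0, reg=1e-6):
    # src, tgt: (N, 3) corresponding vertices; solve K @ W = tgt for weights W
    sq = np.sum((src[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / sigma**2)
    return np.linalg.solve(K + reg * np.eye(len(src)), tgt)   # W: (N, 3)

def rbf_jacobian(x, centers, W, sigma=1.0):
    # J_f(x): 3x3 Jacobian of f(x) = sum_i W[i] * exp(-|x - c_i|^2 / sigma^2)
    diff = x - centers                                    # (N, 3)
    phi = np.exp(-np.sum(diff**2, axis=-1) / sigma**2)    # (N,) kernel values
    dphi = (-2.0 / sigma**2) * phi[:, None] * diff        # (N, 3) kernel gradients
    return W.T @ dphi                                     # (3, 3)

# transfer a source displacement:
# delta_xt = rbf_jacobian(xs, src, W) @ delta_xs
```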


SYSTEM OVERVIEW

[Pipeline diagram repeated]

RESULTS

CONCLUSIONS

Developed a performance-based facial animation system for interactive expression control:

•  Tracking facial movements in video in real time
•  Preprocessing the motion capture database
•  Transforming low-quality 2D visual control signals into high-quality 3D facial expressions
•  An efficient online expression retargeting

FUTURE WORK

•  Formal user study on the quality of the synthesized motion

•  Controlling and animating photorealistic 3D facial expressions

•  Expanding the size of the motion capture database

•  Speech as an input to the system

THANKS…

QUESTIONS?
