Embodied Speech and Facial Expression Avatar Dan Harbin - Evan Zoss - Jaclyn Tech - Brent Sicking May 10, 2004


Page 1:

Embodied Speech and Facial Expression Avatar

Dan Harbin - Evan Zoss - Jaclyn Tech - Brent Sicking May 10, 2004

Page 2:

Problem Background/Needs Statement

• Messages of the face help illustrate verbal communication by revealing what the expresser is feeling or trying to convey.

• The ability to generate animated facial expressions together with speech is important to many diverse application areas.

– A deaf person could use an animated face as a lip-reading aid.

– An autistic child could benefit from a robotic face in terms of social interaction, language development, and learning through structure and repetition.

Page 3:

Goals and Objectives

• The overall goal of this project is to create a robotic face capable of displaying human emotion accompanied with speech.

Page 4:

Goals and Objectives

• Reverse engineer Yano’s motors and sensors so we are able to move them to any desired position.

• Develop a GUI that allows the user to move each motor in both directions to a desired position.

• Research the psychology behind the use of facial expressions to convey emotion and mimic these facial expressions with the Yano face.

• Develop a GUI that allows the user to select and display real human facial expressions.

• Develop software to mimic speech based on a measure of the intensity of various pre-recorded wave files.

Page 5:

Yano Control System

Page 6:

Part 1: The Computer

1. Allows the user to directly control the movement of Yano’s eyes, cheeks, and mouth motors.

2. Provides parameterized control of Yano’s facial expressions by allowing the user to both select from a predefined set of expressions and to control his expression in terms of valence, arousal, and stance.

3. Allows the user to load a pre-recorded wave file and play it back as Yano mimics human speech based on the intensity of the wave file.

Page 7:

User Interface: The Main Menu

Page 8:

User Interface: Manual Motor Control

Page 9:

User Interface: Facial Expressions

Page 10:

User Interface: Facial Expressions

• Arousal – the degree of excitement or activation; how stirred up or awake the expresser appears.

• Valence – the degree of attraction or aversion that an individual feels toward a specific object or event.

• Stance – the attitude or posture of the expresser; mental posture; point of view; a position taken.
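The three parameters above span a simple expression space. As an illustration (the expression names and coordinates below are made up for this sketch, not taken from the project), a predefined expression can be chosen as the nearest point in (valence, arousal, stance) space:

```python
import math

# Hypothetical predefined expressions as points in
# (valence, arousal, stance) space, each axis in [-1, 1].
# The coordinates are illustrative, not the project's values.
EXPRESSIONS = {
    "happy":     ( 0.8,  0.6,  0.5),
    "sad":       (-0.7, -0.5, -0.3),
    "angry":     (-0.8,  0.8,  0.7),
    "surprised": ( 0.2,  0.9,  0.1),
    "calm":      ( 0.4, -0.6,  0.0),
}

def nearest_expression(valence, arousal, stance):
    """Return the predefined expression closest to the requested point."""
    point = (valence, arousal, stance)
    return min(
        EXPRESSIONS,
        key=lambda name: math.dist(point, EXPRESSIONS[name]),
    )
```

This lets the GUI's three sliders drive a face that only knows a fixed set of poses: whatever point the user picks, the face snaps to the most similar predefined expression.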

Page 11:

User Interface: Sound Processing

[Screenshot: sound processing GUI – progress bar, file name (.wav), intensity meter (based on power waveform)]

Page 12:

User Interface: Sound Processing

[Figure: original waveform and the derived power waveform]
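The power waveform can be computed as the mean-square power of short frames of samples, and the mouth opening derived from it. A minimal sketch, assuming a 256-sample frame (the slides do not give the project's actual window size):

```python
import math

def power_waveform(samples, frame_size=256):
    """Mean-square power of each frame of audio samples.

    The 256-sample frame size is an assumption for this sketch;
    the slides do not specify the window the project used.
    """
    powers = []
    for i in range(0, len(samples), frame_size):
        frame = samples[i:i + frame_size]
        powers.append(sum(s * s for s in frame) / len(frame))
    return powers

def mouth_opening(power, max_power):
    """Map a frame's power to a mouth opening in [0, 1].

    Taking the square root converts power back to an amplitude-like
    scale, so quiet passages still produce visible movement.
    """
    if max_power <= 0:
        return 0.0
    return min(1.0, math.sqrt(power / max_power))
```

Playing the wave file while stepping through the frame powers at the audio frame rate would then drive the mouth motor in time with the recording.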

Page 13:

[Figure: SV203 board layout – A/D input port (AD1–AD5), power (Vcc/Gnd), serial port, motor control port (SV1–SV6)]

Part 2: SV203 Microcontroller

Page 14:

• Receives commands through the serial port

• Sets or clears the appropriate Motor Control Pin(s)

• Reads an analogue voltage off of the desired Input Pin(s)

• Transmits a value representing the voltage back up the serial line

SV203 Functional Description

Page 15:

• Serial Port – ASCII text commands are sent to the board via the serial port to tell it what to do. Values from the input pins are also sent back to the computer via the serial port.

– List of commands we use:

• SVxM0 – initialize pin x to use digital logic
• PSx – set pin x high
• PCx – clear pin x to low
• Dn – delay for n milliseconds before the next command
• PC1PC3PC5D300PS1PS3PS5 – a typical motor control command
• ADy – read the voltage of input pin y and transmit it up the serial port

• Motor Control Port – sends the logic controls for the motors to the Yano I/O Board. When a pin is set high with PSx, it is set to 6V; PCx sets it to 0V. We use six pins, SV1 through SV6.

• A/D Input Port – receives the status of Yano’s switches from the Yano I/O Board. We use five pins, AD1 through AD5. Each pin reads 6V if its switch is open and near 0V if it is closed. The SV203 converts these voltages to a number from 0 to 255 for 0V–6V.

SV203 Interface Description
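Because the protocol is plain ASCII, command strings can be assembled with a few small helpers. A sketch showing only the string building and the 0–255 ADC conversion described above (the actual serial transmission, e.g. via a serial library, is omitted):

```python
def set_pin(x):
    """PSx - set motor control pin x high (6V)."""
    return f"PS{x}"

def clear_pin(x):
    """PCx - clear motor control pin x to low (0V)."""
    return f"PC{x}"

def delay(ms):
    """Dn - wait n milliseconds before the next command."""
    return f"D{ms}"

def pulse_motor(pins, duration_ms):
    """Drop the given pins low, wait, then return them to the high
    idle state - the shape of the 'typical motor control command'
    listed above."""
    return ("".join(clear_pin(p) for p in pins)
            + delay(duration_ms)
            + "".join(set_pin(p) for p in pins))

def adc_to_volts(value, vref=6.0):
    """Map the SV203's 0-255 ADC reading back to a 0V-6V voltage."""
    return value * vref / 255
```

For example, `pulse_motor([1, 3, 5], 300)` reproduces the typical command string from the list above.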

Page 16:

[Figure: SV203 Microcontroller ↔ Yano switch circuit]

Part 3: Yano I/O Board

Page 17:

• Receives logic controls for the motors from the SV203
• Converts them into powered control for Yano’s motors

• Reads in the status of Yano’s switches, open or closed
• Converts this to a voltage, 6V for open, 0V for closed, and sends it back to the SV203

Yano I/O Board Functional Description

Page 18:

• Motor Control Input – the logic input for the H-Bridges that determines motor direction and movement. The pins are paired off, 2 pins per H-Bridge, 1 H-Bridge per motor:

– Mouth: SV5 and SV6
– Cheeks: SV3 and SV4
– Eyes: SV1 and SV2

• Motor Outputs – three two-pin ports, one for each motor; each pin carries either Vcc or Gnd. If both pins are at Vcc (the default state), there is no potential between them and the motor does not turn. If one pin drops to Gnd, the motor turns one way; vice versa for the other pin.

• Sensor Inputs – these ports connect directly to Yano’s switches. The mouth and cheek motors each have two limit switches to determine when they have traveled far enough in each direction. The eye motor can run a complete 360-degree rotation, and so has just a single sensor that is triggered when the eye motor reaches a particular point in its rotation.

• Sensor Outputs – the interface back to the SV203; it has 5 pins, each of which is set to 6V for an open switch and 0V for a closed switch. They are paired off according to which motor’s limit switches they carry:

– Mouth: AD3 and AD4
– Cheeks: AD1 and AD2
– Eyes: AD5

Yano I/O Board Interface Description
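The pin-pair scheme above can be sketched as a small command builder. The direction names "open" and "close" below are illustrative stand-ins, not the project's terminology, and which pin of a pair corresponds to which direction is an assumption:

```python
# Pin pairs per motor, taken from the interface description above.
MOTOR_PINS = {"eyes": (1, 2), "cheeks": (3, 4), "mouth": (5, 6)}

def hbridge_command(motor, direction, duration_ms):
    """Build an SV203 command string that runs one motor briefly.

    Both pins of a pair idle high, so there is no potential across
    the motor. Dropping exactly one pin low turns the motor; which
    pin is dropped picks the direction.
    """
    a, b = MOTOR_PINS[motor]
    low = a if direction == "open" else b
    # Clear one pin, hold for the duration, then restore idle high.
    return f"PC{low}D{duration_ms}PS{low}"
```

For instance, `hbridge_command("mouth", "open", 300)` drops SV5 for 300 ms while SV6 stays high, turning the mouth motor in one direction.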

Page 19:

Part 4: Yano

Page 20:

• Yano has 3 motors powered by the Yano I/O Board: one for the mouth, one for the cheeks, and one to control the eyelids, eyebrows, and ears.

• When the mouth and cheek motors reach their endpoints (i.e. fully open or fully closed), they close a switch to indicate that the limit has been reached.

• These switches are read by the Yano I/O Board.

Yano Functional Description

Page 21:

• Yano’s interfaces are the motor controls and the switch feedbacks.

• The wires are coded as follows:

– Motors:

• Red/Black – Eyes

• Green/Black – Cheeks

• White/Black – Mouth

– Sensors:

• Red/Green/Brown – Mouth

• Gray/Yellow/Pink – Cheeks

• Green/Yellow/Red/Brown – Eyes

Yano Interface Description

Page 22:

Validation and Testing Procedures

• Calibration Test - Calibrate the motors, then run each motor to its limits and back to see if it stays calibrated.

• Expression Test - Change from any one expression to any other expression, and the face should show the desired expression each time.

• Speech Test - Using a sample sound file, make sure Yano produces the right mouth movements for the differences in sound volume consistently and accurately.
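The calibration test amounts to stepping a motor until its limit switch closes, after which its position is known exactly. A hardware-free sketch, with stand-in callables in place of the real SV203 commands (the function and class names here are hypothetical, not from the project):

```python
def calibrate(step_motor, limit_closed, max_steps=1000):
    """Step a motor toward its limit switch; return the steps taken.

    step_motor and limit_closed are stand-ins for the real hardware
    calls (short motor pulses and A/D switch reads over the serial
    link); any callables with these shapes will do.
    """
    for steps in range(max_steps):
        if limit_closed():
            return steps  # switch closed: position is now known
        step_motor()
    raise RuntimeError("limit switch never closed; check wiring")

# A fake motor for exercising the procedure without hardware.
class FakeMouth:
    def __init__(self, position):
        self.position = position  # steps away from the closed limit

    def step(self):
        self.position -= 1

    def at_limit(self):
        return self.position == 0
```

Running the motor to its limit and back, then recalibrating, checks whether the step count drifts, which is exactly what the calibration test above looks for.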

Page 23:

Validation and Testing Procedures

• How do we know we accomplished our goal?

• Calibration – We are able to know the exact position of each motor at any given time while the software is running.

• Expressions – We can produce and move between a pre-defined set of believable and readable facial expressions.

• Speech – Yano is consistently able to move his mouth in time with a wave file; the movement and amount of opening are believable and realistic.

Page 24:

Itemized Budget

Part                                     Quantity   Cost
Computer                                 1          N/A
Yano                                     1          $65.49
SV203 Microcontroller                    1          $59.98
TC4424 H-bridges                         9          $9.33
Serial Cable                             1          $11.99
Breadboard                               1          $19.97
2 pin .100" Female Locking Connector     6          $8.94
4 pin .100" Female Locking Connector     1          $1.49
6 pin .100" Female Locking Connector     1          $1.49
8 pin .100" Female Locking Connector     2          $2.98
2 pin .100" Male Locking Connector       4          $5.96
4 pin .100" Male Locking Connector       3          $4.47
6 pin .100" Male Locking Connector       1          $1.49
8 pin .100" Male Locking Connector       1          $1.49
1kΩ Resistor                             5          $0.99
.1 µF Capacitor                          1          $0.10
.01 µF Capacitor                         1          $0.10
Green Wire                               24         $1.00
Red Wire                                 17         $1.00
Black Wire                               12         $1.00
Total                                               $199.26

Page 25:

Timeline of Tasks

Page 26:

Thank you

Applied Materials and The National Science Foundation