teaching cars, robots and buildings to read human body...

Copyright © 2017 1 Teaching Cars, Robots and Buildings to Read Human Body Language using Webcams and GPUs

Upload: buikhuong

Post on 30-May-2018

Page 1: Teaching Cars, Robots and Buildings to Read Human Body ...on-demand.gputechconf.com/gtc-eu/2017/presentation/23375-paul... · body language tone of voice Dogs are our best friends

Copyright © 2017 1

Teaching Cars, Robots and Buildings to Read Human Body Language using Webcams and GPUs

Problem: How to allow Humans and Computers to Interact

Show me, don’t tell me
Humans are highly visual creatures
The majority of communication is non-verbal
  body language
  tone of voice
Dogs are our best friends because they have the unique ability to read our body language
How do we teach computers to read our body language?

Existing Tracking Solutions

Motion Capture
  Marker-based
  Sensor-based
    IMU
    Depth
Shortcomings
  Expensive to own / operate
  Time consuming to set up / use
  Require specialty sensors / suits
  Sensitive to light conditions

Eureka Moment: software “Kinect” using regular cameras and GPUs

Mobile AR Object Tracking

Solution: Webcams, GPUs and Deep Learning

Value Proposition
  No specialty sensors / use standard RGB monocular camera
  No specialty hardware / use GPUs
  Can be deployed in virtually any environment where humans can see
  No set up time
Durability
  Leverage consumer grade GPUs
  Form factor opens up new markets
  Future proof: system will only get better with more powerful GPUs and more data

BodySLAM™: Human Machine Interaction Engine

User Application

BodySLAM™

In: Video

Out: Who, What, Where, When
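The slide describes the engine's contract only as video in and who / what / where / when out. As a rough illustration of what one such output record could look like to a user application, here is a minimal sketch. The `Observation` class, its field types, and the `describe` helper are hypothetical and are not taken from BodySLAM itself.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Observation:
    """Hypothetical per-person result for a single video frame."""
    who: int                            # assumed: integer ID assigned by a tracker
    what: List[str]                     # assumed: recognized activities or gestures
    where: Tuple[float, float, float]   # assumed: 3D location in camera space
    when: float                         # assumed: frame timestamp in seconds


def describe(obs: Observation) -> str:
    """Format one observation as a short human-readable line."""
    actions = ", ".join(obs.what) if obs.what else "nothing recognized"
    return f"t={obs.when:.2f}s person {obs.who} at {obs.where}: {actions}"


example = Observation(who=7, what=["waving"], where=(0.5, 1.2, 3.0), when=12.4)
print(describe(example))
```

A user application would presumably consume a stream of such records, one per tracked person per frame, rather than raw pixels.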

BodySLAM™: Unique Features

Deep Learning
  Accurate
  Robust
  Keeps getting smarter
Real-time
  Enables interactivity
No Specialty Hardware
  Consumer grade cameras & GPUs

Live Demos

BodySLAM: High Level Architecture

[Diagram]
Input: 2D Video
Stages: 2D Human Detection, 3D Human Detection, Human Tracker, Activity Recognition, Gesture Recognition
Outputs: Background, 2D Skeletons, 3D Skeletons, ID, Location, Activities, Gestures
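The architecture slide names five stages and seven outputs but not how they are wired. A minimal sketch of one plausible arrangement is a linear chain in which each stage adds its outputs to a shared per-frame dictionary. The stage order, the function names, and the placeholder values below are assumptions for illustration; no real detection is performed.

```python
from typing import Any, Callable, Dict, List

Frame = Dict[str, Any]
Stage = Callable[[Frame], Frame]


def detect_2d(frame: Frame) -> Frame:
    # Placeholder: one person with two 2D joints, in pixel coordinates.
    frame["skeletons_2d"] = [[(120.0, 80.0), (118.0, 140.0)]]
    return frame


def detect_3d(frame: Frame) -> Frame:
    # Placeholder lift to 3D: attach a constant depth of 2.0 to every joint.
    frame["skeletons_3d"] = [
        [(x, y, 2.0) for (x, y) in skeleton] for skeleton in frame["skeletons_2d"]
    ]
    return frame


def track(frame: Frame) -> Frame:
    # Placeholder tracker: IDs are list indices, location is the first joint.
    frame["ids"] = list(range(len(frame["skeletons_3d"])))
    frame["locations"] = [skeleton[0] for skeleton in frame["skeletons_3d"]]
    return frame


def recognize_activity(frame: Frame) -> Frame:
    frame["activities"] = ["unknown" for _ in frame["ids"]]
    return frame


def recognize_gesture(frame: Frame) -> Frame:
    frame["gestures"] = ["none" for _ in frame["ids"]]
    return frame


# Assumed order: the order in which the slide lists the stages.
PIPELINE: List[Stage] = [
    detect_2d, detect_3d, track, recognize_activity, recognize_gesture,
]


def run_pipeline(frame: Frame) -> Frame:
    """Pass one frame through every stage in order."""
    for stage in PIPELINE:
        frame = stage(frame)
    return frame


result = run_pipeline({"image": None})
print(sorted(result.keys()))
```

Keeping each stage as a plain function over a shared record makes it easy to swap a placeholder for a real model without touching the rest of the chain.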

Applications: Human Monitoring

In Home for elder care, etc.
In Store for retail
In Vehicle for awareness
In Factories for accidents, etc.

Applications: Robot Interaction

Applications: Eyes for Virtual Assistants

Applications: Health & Wellness

Applications: Sport Analytics

Applications: AR / VR

BodySLAM™: Unique Features

Fast: allows real-time interactivity
Accurate: tracks 63 body parts per person, including fingers
Robust: handles large numbers of people in crowded conditions

Tech Stack

Training
  Enhanced COCO
  Proprietary Synthetic Data
Engine
  2D: VGG-19 backbone / 51 layers
  3D: Stacked Hourglass / 134 layers
  C++ Tracking
Inferencing
  TensorRT (NVIDIA GPU)
  wrCNN (MPS iOS)

2D Training Pipeline

Depth Training Pipeline

Synthetic Data: Unity Game Engine

6 Human Models
  3 Male / 3 Female
  Randomized skin color, clothing textures
50 Animations
100 Backgrounds
Randomized lighting and camera position
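The counts on this slide suggest a domain-randomization setup: each rendered training image combines a discrete choice of model, animation, and background with continuous variation in appearance, lighting, and camera. A minimal sketch of sampling such a scene description follows. Only the three counts come from the slide; the `SceneConfig` fields, the numeric ranges, and the use of Python rather than Unity scripting are assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import Tuple

# Counts taken from the slide.
NUM_MODELS = 6
NUM_ANIMATIONS = 50
NUM_BACKGROUNDS = 100


@dataclass
class SceneConfig:
    """Hypothetical description of one synthetic training scene."""
    model: int
    animation: int
    background: int
    skin_tone: float                              # assumed: 0.0 to 1.0 blend factor
    clothing_seed: int                            # assumed: seed for texture choice
    light_intensity: float                        # assumed range, arbitrary units
    camera_position: Tuple[float, float, float]   # assumed range, in metres


def sample_scene(rng: random.Random) -> SceneConfig:
    """Draw one random scene; all continuous ranges are illustrative."""
    return SceneConfig(
        model=rng.randrange(NUM_MODELS),
        animation=rng.randrange(NUM_ANIMATIONS),
        background=rng.randrange(NUM_BACKGROUNDS),
        skin_tone=rng.random(),
        clothing_seed=rng.randrange(2 ** 16),
        light_intensity=rng.uniform(0.2, 1.0),
        camera_position=(
            rng.uniform(-3.0, 3.0),
            rng.uniform(0.5, 2.5),
            rng.uniform(2.0, 8.0),
        ),
    )


# Discrete combinations before any continuous randomization is applied.
DISCRETE_COMBINATIONS = NUM_MODELS * NUM_ANIMATIONS * NUM_BACKGROUNDS

rng = random.Random(0)  # fixed seed so a dataset can be regenerated
scenes = [sample_scene(rng) for _ in range(1000)]
print(DISCRETE_COMBINATIONS, "discrete combinations before continuous randomization")
```

Even with only 6 models, the discrete choices alone give 30,000 combinations, and the randomized appearance, lighting, and camera position multiply that further.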

High Level Architecture

Runtime Performance

GPU         CPU                          OS       Total (mSec)  2D CNN  3D CNN  Misc
K80         Intel Xeon E5-2686v4 2.3GHz  Linux    131           90      36      5
1080        Intel i7-5930K 3.5GHz        Linux    37            22      12      3
1080        Intel i7-5930K 3.5GHz        Windows  41            24      13      4
Titan XP    Intel i7-5930K 3.5GHz        Linux    30            16      11      3
Titan XP    Intel i7-5930K 3.5GHz        Windows  34            18      12      4
iPhone 6s                                iOS      1183          963     200     20
iPhone 7                                 iOS      854           696     147     11
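In every row of the runtime table, the 2D CNN, 3D CNN, and Misc columns add up to the Total column. If the total is read as per-frame latency with no overlap between frames (an assumption; the slide does not say so), it converts directly to a frame rate. The short sketch below checks the sums and computes that rate; the row labels are shorthand introduced here.

```python
# (label, total_ms, cnn_2d_ms, cnn_3d_ms, misc_ms), copied from the table.
ROWS = [
    ("K80 / Linux", 131, 90, 36, 5),
    ("1080 / Linux", 37, 22, 12, 3),
    ("1080 / Windows", 41, 24, 13, 4),
    ("Titan XP / Linux", 30, 16, 11, 3),
    ("Titan XP / Windows", 34, 18, 12, 4),
    ("iPhone 6s / iOS", 1183, 963, 200, 20),
    ("iPhone 7 / iOS", 854, 696, 147, 11),
]


def frames_per_second(total_ms: float) -> float:
    """Frame rate implied by a per-frame latency, assuming no pipelining."""
    return 1000.0 / total_ms


def components_match(row) -> bool:
    """True when the three component times sum to the reported total."""
    _, total, cnn_2d, cnn_3d, misc = row
    return total == cnn_2d + cnn_3d + misc


for row in ROWS:
    label, total, cnn_2d, _, _ = row
    share_2d = cnn_2d / total
    print(
        f"{label:20s} {frames_per_second(total):6.1f} fps  "
        f"2D CNN share {share_2d:.0%}  sums match: {components_match(row)}"
    )
```

Under that reading, the desktop GPUs land at roughly 24 to 33 frames per second, the K80 at under 8, and the phones at about one frame per second, with the 2D CNN the largest component in every configuration.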

Questions?