S7348: Deep Learning in Ford's Autonomous Vehicles

Bryan Goodman

Argo AI

9 May 2017

Today: examples from
• Stereo image processing
• Object detection
• Using RNNs
• Motorsports

Ford’s 12-Year History in Autonomous Driving

Stereo Matching Problem

• Determining the correspondences in stereo images

• Calculating the disparities

• But what is the correct correspondence?

• Basic stereo matching algorithm
− Compare pixels on the same epipolar line in two images
− Choose the best match
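The basic algorithm above can be sketched for a single scanline. This is a minimal illustration, not the method used in the talk: it assumes rectified images (so the epipolar line is the same row in both views), uses the sum of absolute differences as the matching cost, and the names `sad` and `match_row` are made up for the example.

```python
def sad(left_row, right_row, x_left, x_right, half):
    """Sum of absolute differences between two 1-D windows."""
    return sum(
        abs(left_row[x_left + k] - right_row[x_right + k])
        for k in range(-half, half + 1)
    )


def match_row(left_row, right_row, max_disparity, half=1):
    """Return one disparity per pixel of a rectified scanline pair."""
    width = len(left_row)
    disparities = [0] * width
    for x in range(half, width - half):
        best_cost, best_d = None, 0
        # A point at column x in the left image appears at x - d in the right.
        for d in range(max_disparity + 1):
            xr = x - d
            if xr - half < 0:
                break  # window would leave the image
            cost = sad(left_row, right_row, x, xr, half)
            if best_cost is None or cost < best_cost:
                best_cost, best_d = cost, d  # choose the best match so far
        disparities[x] = best_d
    return disparities


# Toy scanline: the right view is the left view shifted by 2 pixels.
left = [0, 0, 0, 0, 10, 50, 90, 50, 10, 0, 0, 0]
right = left[2:] + [0, 0]
print(match_row(left, right, max_disparity=4))
```

In the flat regions of the toy scanline every candidate has the same cost, so the match is arbitrary there. That ambiguity is the "correct correspondence" question raised above.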

Deep neural networks for stereo matching

• The brain can estimate the distance of an object using the visual information from two eyes.

• We can use deep neural networks
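For reference, the classical geometric relation that links disparity to distance in a rectified stereo pair is Z = f · B / d. A network trained on distance maps has to capture this mapping implicitly. The sketch below is only that textbook relation; the focal length and baseline are hypothetical values, not Ford's camera parameters.

```python
def disparity_to_distance(disparity_px, focal_length_px, baseline_m):
    """Triangulate distance in metres for a rectified pair: Z = f * B / d."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity corresponds to a point at infinity
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera: 1000 px focal length, 0.5 m baseline.
for d in (50, 10, 2):
    print(d, disparity_to_distance(d, 1000.0, 0.5))
```

Small disparities map to large distances, so a one-pixel matching error matters far more for distant objects than for close ones.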

[Figure: Left Stereo Camera and Right Stereo Camera → Deep Convolutional Neural Networks → Post-Processing → Distance Map Estimation]

Proposed deep convolutional neural network
• AV driving requires an intelligent distance map estimation, which filters out the objects not of interest.
• Network I
− General network
− Encoding and decoding layers
− Retain objects of interest in the training data sets

[Figure: Network I architecture. Encoder and decoder blocks with layers Conv1–Conv10 and Deconv6–Deconv10, feeding the loss function.]

Proposed deep convolutional neural network
• Network II

− Specialized network

− Encoding and decoding layers

− The cross correlation layers force the network to look for correspondence on the epipolar line

− The weights in the encoding layers are shared
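A rough sketch of the two ideas in this slide: one set of encoder weights applied to both views, and a cross-correlation step that only compares patches along the same row. It assumes rectified 1-D scanlines, a single smoothing filter as a stand-in for the encoder, and normalised cross-correlation as the CC measure. All of these are illustrative choices and are not taken from the talk.

```python
import math


def encode(row, weights):
    """Toy encoder: one 1-D convolution with zero padding."""
    half = len(weights) // 2
    padded = [0.0] * half + [float(v) for v in row] + [0.0] * half
    return [
        sum(w * padded[x + k] for k, w in enumerate(weights))
        for x in range(len(row))
    ]


def ncc(a, b):
    """Normalised cross-correlation of two equal-length patches."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm else 0.0


def cc_layer(left_feat, right_feat, max_disp, half=1):
    """Return, for every left position, the CC value at each disparity 0..max_disp."""
    width = len(left_feat)
    volume = []
    for x in range(width):
        scores = [0.0] * (max_disp + 1)
        if half <= x < width - half:
            left_patch = left_feat[x - half:x + half + 1]
            for d in range(max_disp + 1):
                xr = x - d  # same row, shifted column: the epipolar search
                if xr - half >= 0:
                    scores[d] = ncc(left_patch, right_feat[xr - half:xr + half + 1])
        volume.append(scores)  # all CC values are kept, not only the best one
    return volume


shared_weights = [0.25, 0.5, 0.25]  # the same weights encode both views
left = [0, 0, 0, 0, 10, 50, 90, 50, 10, 0, 0, 0]
right = left[2:] + [0, 0]  # right view: left view shifted by 2 pixels
volume = cc_layer(encode(left, shared_weights), encode(right, shared_weights), max_disp=4)
best = [max(range(5), key=row.__getitem__) for row in volume]
print(best[4:8])
```

Because both views pass through the same filter, a shifted input produces a shifted feature map, which is what lets the correlation peak line up with the true disparity.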

[Figure: Network II architecture. Left and right encoder branches (Conv1L–Conv4L, Conv1R–Conv4R), cross-correlation layers CC5–CC7, layers Conv5–Conv9 and Deconv6–Deconv9, and the loss function.]

Proposed deep convolutional neural network

• Cross correlation (CC) layer
− Computes CC values between each pair of patches
− Outputs the CC values for each pair of patches
− Does not lose any information
• Loss function
− In AV driving, closer objects are more important than distant ones
− Assigns more weight to the closer objects
− The closer object distance is estimated more accurately
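One way to express a loss that favours closer objects is a per-pixel weight that shrinks with distance. The talk does not give the weighting function, so the linear form, the `floor` parameter and the L1 error below are all assumptions made for illustration.

```python
def distance_weight(d_norm, floor=0.2):
    """Hypothetical weight: 1.0 for the closest pixels, decaying linearly to `floor`."""
    d_norm = min(max(d_norm, 0.0), 1.0)  # distances normalised to [0, 1]
    return 1.0 - (1.0 - floor) * d_norm


def weighted_l1_loss(predicted, target):
    """Mean absolute error with each pixel weighted by its true normalised distance."""
    total, weight_sum = 0.0, 0.0
    for p, t in zip(predicted, target):
        w = distance_weight(t)
        total += w * abs(p - t)
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0


target = [0.1, 0.9]                            # one near pixel, one far pixel
near_error = weighted_l1_loss([0.2, 0.9], target)
far_error = weighted_l1_loss([0.1, 1.0], target)
print(near_error > far_error)
```

The same 0.1 error costs more on the near pixel than on the far one, so training is pushed toward accuracy at short range.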

[Figure: plot relating α and d; tick marks on both axes run from 0.2 to 1]

Performance on synthetic and real stereo data

• Synthetic data generation
− Generate 14,000 pairs of RGB stereo images

− Synthetic distance maps are only generated for the objects of interest, e.g. cars or pedestrians

− Gaussian noise added to the stereo images
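The noise step might look like the following sketch. The noise level, the 0–255 intensity range, and drawing the noise independently for each view are assumptions; the talk only states that Gaussian noise was added.

```python
import random


def add_gaussian_noise(image, sigma, rng):
    """Return a copy of `image` (rows of 0-255 intensities) with N(0, sigma^2) noise added."""
    return [
        [min(255, max(0, round(pixel + rng.gauss(0.0, sigma)))) for pixel in row]
        for row in image
    ]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
left = [[100] * 4 for _ in range(3)]
right = [[100] * 4 for _ in range(3)]
# Noise is drawn separately for each view (an assumption of this sketch).
noisy_left = add_gaussian_noise(left, sigma=5.0, rng=rng)
noisy_right = add_gaussian_noise(right, sigma=5.0, rng=rng)
print(noisy_left[0], noisy_right[0])
```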

Performance on synthetic and real stereo data
• Fine tuning with LIDAR data sets

− Project LIDAR point clouds onto the camera images

− The baseline and optical axes are not the same as in the synthetic data
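Projecting LIDAR returns onto an image to obtain sparse ground-truth distances can be sketched with a pinhole camera model. The sketch assumes the points are already expressed in the camera frame (x right, y down, z forward) and that lens distortion is ignored. The intrinsics are hypothetical and the extrinsic calibration is out of scope.

```python
def project_points(points_cam, fx, fy, cx, cy, width, height):
    """Project 3-D camera-frame points (metres) to a sparse {(u, v): depth} map."""
    depth = {}
    for x, y, z in points_cam:
        if z <= 0:
            continue  # behind the camera
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            # Keep the nearest return when several points land on one pixel.
            if (u, v) not in depth or z < depth[(u, v)]:
                depth[(u, v)] = z
    return depth


points = [(0.0, 0.0, 10.0), (1.0, 0.0, 10.0), (0.0, 0.0, 25.0), (0.0, 0.0, -5.0)]
sparse = project_points(points, fx=500.0, fy=500.0, cx=320.0, cy=240.0,
                        width=640, height=480)
print(sparse)
```

Only pixels hit by a LIDAR return receive a label, so a loss used for fine tuning would have to be evaluated on those pixels alone.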

[Figure: example results. Panels: Left camera, Right camera, Network I, Network II. Annotation: 1/2x]

Comparing Manual Annotation to DNN Model

[Figure panels: Detection Result, Original Image, Enhanced Contrast]

Network’s detection outperforms human labeler in low-contrast areas

[Figure legend: Pedestrian detection; Pedestrian misdetection; Detected, but not labeled]

Introducing Recurrence in Detection and Tracking

• Use RNNs to detect occluded objects
• Remember location of static objects
• Predict location of non-static objects
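A toy illustration of why recurrence helps with occlusion: a hidden state per grid cell carries evidence from earlier frames into a frame where the object is not visible. The update rule is a plain Elman-style step with hand-picked scalar weights, not a trained model and not the architecture from the talk. Predicting moving objects would additionally need a spatial convolution over the hidden state.

```python
import math


def rnn_step(features, hidden, w_x=2.0, w_h=1.5):
    """One recurrent update per grid cell: h_t = tanh(w_x * x_t + w_h * h_{t-1})."""
    return [
        [math.tanh(w_x * x + w_h * h) for x, h in zip(f_row, h_row)]
        for f_row, h_row in zip(features, hidden)
    ]


def run(frames):
    """Feed a sequence of feature maps through the recurrence and keep every state."""
    hidden = [[0.0] * len(row) for row in frames[0]]
    states = []
    for frame in frames:
        hidden = rnn_step(frame, hidden)
        states.append(hidden)
    return states


# Three frames of a 1x3 feature map; the object in the middle cell is occluded in frame 1.
frames = [
    [[0.0, 1.0, 0.0]],
    [[0.0, 0.0, 0.0]],  # occluded: no evidence in the current image
    [[0.0, 1.0, 0.0]],
]
states = run(frames)
for t, state in enumerate(states):
    print(t, [round(h, 2) for h in state[0]])
```

The middle cell stays strongly activated during the occluded frame, which is the behaviour a detector on top of the hidden state could use to keep reporting a static object.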

[Figure: Image 0, Image 1 and Image 2 each pass through FeatureMap, RNN Conv and Detector blocks.]

Orange = ground truth; Green = model prediction

Classifying NASCAR images

The Ford team reviews pictures during the race

Classifying NASCAR images

Looking for damage and other performance indicators

[Figure annotation: Gap]

Results – Boxing the Cars

Using ~2k images labeled with boxes around the vehicles, the model does well detecting cars

Results – Boxing the Cars

Classifying NASCAR images

Next – determine car number: labeled ~30k images

Classifying NASCAR images

Outliers easy to find in review

Classifying NASCAR images

Human: ???
Model: 78

Confidence: 0.999

Classifying NASCAR images

Human: ???
Model: 42

Confidence: 0.985
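A review pass of this kind could be as simple as the sketch below: list the images where the model is confident but the human label is missing or different. The record layout, file names and the 0.98 threshold are hypothetical.

```python
def flag_for_review(records, threshold=0.98):
    """Return records where a confident prediction differs from the human label."""
    return [
        r for r in records
        if r["confidence"] >= threshold and r["human"] != r["model"]
    ]


records = [
    {"image": "a.jpg", "human": "78", "model": "78", "confidence": 0.97},
    {"image": "b.jpg", "human": None, "model": "78", "confidence": 0.999},
    {"image": "c.jpg", "human": None, "model": "42", "confidence": 0.985},
    {"image": "d.jpg", "human": "21", "model": "12", "confidence": 0.60},
]
flagged = flag_for_review(records)
print([r["image"] for r in flagged])
```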

Inspecting the Neural Network

[Figure panels: Activated Filter, Input Image]

The Model is not a black box. We can see that it is detecting the numbers – important for robustness when the paint changes

Argo AI

• Argo AI is an artificial intelligence company, established to tackle one of the most challenging applications in computer science, robotics and artificial intelligence: self-driving vehicles

• Engineering hubs in Pittsburgh, Southeastern Michigan and the Bay Area of California

• For more information regarding Argo AI and its work, please talk to me at GTC or visit: www.argo.ai
