Deep Reinforcement Learning for Robotics Pieter Abbeel -- UC Berkeley EECS

Post on 13-Feb-2017

TRANSCRIPT

Page 1: Deep Reinforcement Learning for Robotics

Deep Reinforcement Learning for Robotics Pieter Abbeel -- UC Berkeley EECS

Page 2: Deep Reinforcement Learning for Robotics

Object Detection in Computer Vision

State-of-the-art object detection until 2012:

Input Image → Hand-engineered features (SIFT, HOG, DAISY, …) → Support Vector Machine (SVM) → “cat” “dog” “car” …

Deep Supervised Learning (Krizhevsky, Sutskever, Hinton 2012; also LeCun, Bengio, Ng, Darrell, …):

Input Image → 8-layer neural network with 60 million parameters to learn → “cat” “dog” “car” …

~1.2 million training images

60 million learned parameters (since then, billions of parameters)

Page 3: Deep Reinforcement Learning for Robotics

Performance

graph credit Matt Zeiler, Clarifai

Page 5: Deep Reinforcement Learning for Robotics

Performance

graph credit Matt Zeiler, Clarifai

AlexNet

Page 8: Deep Reinforcement Learning for Robotics

Speech Recognition

graph credit Matt Zeiler, Clarifai

Page 9: Deep Reinforcement Learning for Robotics

History

Is deep learning 3, 30, or 60 years old?

2000s Sparse, Probabilistic, and Energy models (Hinton, Bengio, LeCun, Ng)

Rosenblatt’s Perceptron

(Olshausen, 1996)

based on history by K. Cho

Presenter notes: connected the dots; exploration of model structure; optimization know-how; computation + data

Page 10: Deep Reinforcement Learning for Robotics

Data

1.2M training examples × 2048 (shifts) × 90 (PCA re-coloring)

1.2M × 2k × 90 ≈ 0.216 trillion

Human eye at 1k frames/s: ≈ 6.84 years
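The arithmetic on this slide can be checked with a short script. This is only a back-of-the-envelope sketch: the constant and function names are made up for illustration, the shift count is rounded to 2k as on the slide, and a year is assumed to be 365.25 days.

```python
# Back-of-the-envelope check of the data-scale figures on the slide.
# All names here are illustrative; the numbers come from the slide.
BASE_EXAMPLES = 1.2e6          # ~1.2 million training images
SHIFTS_ROUNDED = 2_000         # the slide rounds 2048 shifts to 2k
RECOLORINGS = 90               # PCA re-coloring variants
FRAMES_PER_SECOND = 1_000      # the slide's figure for the human eye
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: average year length


def augmented_examples(base, shifts, recolorings):
    """Number of distinct training examples after augmentation."""
    return base * shifts * recolorings


def years_of_viewing(num_frames, fps=FRAMES_PER_SECOND):
    """Years needed to see num_frames at fps frames per second."""
    return num_frames / fps / SECONDS_PER_YEAR


total = augmented_examples(BASE_EXAMPLES, SHIFTS_ROUNDED, RECOLORINGS)
years = years_of_viewing(total)
print(f"{total / 1e12:.3f} trillion examples, about {years:.2f} years of viewing")
```

Using the exact 2048 shifts instead of 2k would raise both figures by about 2.4%.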

Compute power

Two NVIDIA GTX 580 GPUs

5-6 days of training time

What’s Changed

Nonlinearity: Sigmoid → ReLU

Regularization: Drop-out (plus training data augmentation)

Exploration of model structure

Optimization know-how
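The nonlinearity and regularization changes listed above can be illustrated in a few lines of plain Python. This is a minimal sketch, not the code of any particular network: the function names are hypothetical, and the dropout shown is the "inverted" variant (survivors are rescaled at training time), which is an assumption about implementation rather than something the slide specifies.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid; shrinks toward 0 for large |x|."""
    s = sigmoid(x)
    return s * (1.0 - s)


def relu(x):
    return max(0.0, x)


def relu_grad(x):
    """Derivative of ReLU; stays at 1 for every positive input."""
    return 1.0 if x > 0 else 0.0


def dropout(activations, keep_prob, rng):
    """Inverted dropout: zero each unit with probability 1 - keep_prob
    and rescale the surviving units by 1 / keep_prob."""
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


# Compare gradients: the sigmoid saturates, ReLU does not.
for x in (0.0, 5.0, 10.0):
    print(f"x={x:5.1f}  sigmoid'={sigmoid_grad(x):.6f}  relu'={relu_grad(x):.1f}")

# Apply dropout to a small hidden layer (seeded for repeatability).
rng = random.Random(0)
hidden = [relu(v) for v in (-1.0, 0.5, 2.0, 3.0)]
dropped = dropout(hidden, keep_prob=0.5, rng=rng)
print(hidden, dropped)
```

The gradient comparison is the usual argument for ReLU in deep networks: a saturated sigmoid passes back almost no gradient, while an active ReLU passes it through unchanged.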

Page 11: Deep Reinforcement Learning for Robotics

Object Detection in Computer Vision

State-of-the-art object detection until 2012:

Input Image → Hand-engineered features (SIFT, HOG, DAISY, …) → Support Vector Machine (SVM) → “cat” “dog” “car” …

Deep Supervised Learning (Krizhevsky, Sutskever, Hinton 2012; also LeCun, Bengio, Ng, Darrell, …):

Input Image → 8-layer neural network with 60 million parameters to learn → “cat” “dog” “car” …

~1.2 million training images

60 million learned parameters (since then, billions of parameters)

Page 12: Deep Reinforcement Learning for Robotics

Robotics

Current state-of-the-art robotics:

Percepts → Hand-engineered state estimation → Hand-engineered control policy class (hand-tuned or learned, 10’ish free parameters) → Motor commands

Deep reinforcement learning:

Percepts → Many-layer neural network with many parameters to learn → Motor commands

Page 13: Deep Reinforcement Learning for Robotics

Reinforcement Learning (RL)

Robotics

Marketing / Advertising

Dialogue

Optimizing operations / logistics

Queue management

Robot + Environment

probability of taking action a in state s
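A stochastic policy of this kind, written here as pi(a | s), can be sketched for a small discrete action set. The linear-softmax form, the parameter matrix `theta`, and the example feature vector are all assumptions chosen for illustration; the slide does not specify a policy class.

```python
import math
import random


def softmax(scores):
    """Turn arbitrary scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(theta, state):
    """pi(a | s): probability of each action a in state s.
    Each row of theta scores one action as a dot product with the state."""
    scores = [sum(w * f for w, f in zip(row, state)) for row in theta]
    return softmax(scores)


def sample_action(theta, state, rng):
    """Draw an action index according to pi(. | s)."""
    probs = policy_probs(theta, state)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical setup: 3 actions, 2 state features.
theta = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
state = [0.2, -0.4]

probs = policy_probs(theta, state)
action = sample_action(theta, state, random.Random(0))
print(probs, action)
```

In deep RL the linear scoring step is replaced by a neural network, but the output is still read as a distribution over actions given the current state.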

Page 14: Deep Reinforcement Learning for Robotics

How About Deep RL?

Pong Enduro Beamrider Q*bert

Page 15: Deep Reinforcement Learning for Robotics

Deep Reinforcement Learning for Atari Games

Pong, Enduro, Beamrider, Q*bert

Deep Q-learning [Mnih et al, 2013]

Monte Carlo Tree Search [Guo et al, 2014]

Trust Region Policy Optimization [Schulman, Levine, Moritz, Jordan, Abbeel, 2015]
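Deep Q-learning, one of the Atari methods on this slide, builds on the classic Q-learning update. The sketch below shows that update in tabular form only, with a dictionary in place of the neural network; the state names, action names, step size `alpha`, and discount `gamma` are illustrative assumptions, not values from the cited work.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.99):
    """One tabular Q-learning step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action problem; all Q-values start at 0.
q = defaultdict(float)
actions = ("left", "right")

# Observe one transition: in s0, take "right", receive reward 1, land in s1.
new_value = q_learning_update(q, "s0", "right", 1.0, "s1", actions)
print(new_value)
```

In the deep variant, the table lookup `q[(state, action)]` becomes the output of a network that takes the game screen as input, and the same target is used as a regression label.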

Page 16: Deep Reinforcement Learning for Robotics

[Schulman, Levine, Moritz, Jordan, Abbeel, ICML 2015]

Experiments in Locomotion

Page 17: Deep Reinforcement Learning for Robotics

How About Real Robotic Visuo-Motor Skills?

Page 18: Deep Reinforcement Learning for Robotics

Architecture (92,000 parameters)

[Levine*, Finn*, Darrell, Abbeel, 2015, TR at: rll.berkeley.edu/deeplearningrobotics]

Page 19: Deep Reinforcement Learning for Robotics

Block Stacking – Learning the Controller for a Single Instance

Page 20: Deep Reinforcement Learning for Robotics

Learned Skills

Page 21: Deep Reinforcement Learning for Robotics

Frontiers / Limitations

Exploration

Architectures for shared learning / transfer learning

Multiple robots and sensors (including simulation)

Multiple tasks

Simulation – Real world

Controllers that require memory / estimation

Temporal hierarchy

Page 22: Deep Reinforcement Learning for Robotics

Thank you