![Page 1: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/1.jpg)
Deep Robotic Learning
Sergey Levine
UC Berkeley, Google Brain
![Page 2: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/2.jpg)
![Page 3: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/3.jpg)
robotic control pipeline: observations → state estimation (e.g. vision) → modeling & prediction → planning → low-level control → controls
![Page 4: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/4.jpg)
standard computer vision: features (e.g. HOG) → mid-level features (e.g. DPM) → classifier (e.g. SVM)
Felzenszwalb ‘08
deep learning: end-to-end training
robotic control pipeline: observations → state estimation (e.g. vision) → modeling & prediction → planning → low-level control → controls
deep robotic learning: observations → state estimation (e.g. vision) → modeling & prediction → planning → low-level control → controls (end-to-end training)
![Page 5: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/5.jpg)
no direct supervision
actions have consequences
![Page 6: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/6.jpg)
1. Does end-to-end learning produce better sensorimotor skills?
2. Can we apply sensorimotor skill learning to a wide variety of robots & tasks?
3. Can we scale up deep robotic learning and produce skills that generalize?
4. How can we learn safely and efficiently in safety-critical domains?
5. Can we transfer skills from simulation to the real world, and from one robot to another?
![Page 8: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/8.jpg)
Chelsea Finn
![Page 9: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/9.jpg)
end-to-end training
0% success rate
96.3% success rate
pose prediction
(trained on pose only)
L.*, Finn*, Darrell, Abbeel, ‘16
![Page 10: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/10.jpg)
1. Does end-to-end learning produce better sensorimotor skills?
2. Can we apply sensorimotor skill learning to a wide variety of robots & tasks?
3. Can we scale up deep robotic learning and produce skills that generalize?
4. How can we learn safely and efficiently in safety-critical domains?
5. Can we transfer skills from simulation to the real world, and from one robot to another?
![Page 11: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/11.jpg)
Deep Robotic Learning Applications
• manipulation (with N. Wagener, P. Abbeel)
• dexterous hands (with V. Kumar, A. Gupta, E. Todorov)
• locomotion (with V. Koltun)
• aerial vehicles (with G. Kahn, T. Zhang, P. Abbeel)
• tensegrity robot (with X. Geng, M. Zhang, J. Bruce, K. Caluwaerts, M. Vespignani, V. SunSpiral, P. Abbeel)
• soft hands (with C. Eppner, A. Gupta, P. Abbeel)
![Page 12: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/12.jpg)
1. Does end-to-end learning produce better sensorimotor skills?
2. Can we apply sensorimotor skill learning to a wide variety of robots & tasks?
3. Can we scale up deep robotic learning and produce skills that generalize?
4. How can we learn safely and efficiently in safety-critical domains?
5. Can we transfer skills from simulation to the real world, and from one robot to another?
![Page 13: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/13.jpg)
ingredients for success in learning:
supervised learning: computation, algorithms, data
learning robotic skills: computation, algorithms ~ data?
![Page 14: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/14.jpg)
monocular RGB camera
7 DoF arm
2-finger gripper
object bin
Grasping with Learned Hand-Eye Coordination
• monocular camera (no depth)
• no camera calibration either
• 2-5 Hz update
• continuous arm control
• servo the gripper to target
• fix mistakes
• no prior knowledge
L., Pastor, Krizhevsky, Quillen ‘16
Peter Pastor, Alex Krizhevsky, Deirdre Quillen
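The bullets above describe closed-loop servoing: the arm repeatedly picks a continuous motor command, observes the result, and corrects its mistakes. A minimal sketch of one such command-selection step follows. It assumes a sampling-based optimizer (the cross-entropy method), which the slide does not name, and uses `toy_success_score` as a hypothetical stand-in for a learned grasp-success predictor. All function names and parameter values here are illustrative, not taken from the talk.

```python
import random


def toy_success_score(command, target):
    """Hypothetical stand-in for a learned grasp-success predictor.

    Scores a commanded displacement higher the closer it is to `target`.
    A real system would score (image, command) pairs with a neural network.
    """
    return -sum((c - t) ** 2 for c, t in zip(command, target))


def cem_select_command(score_fn, dim=3, iters=5, samples=64, elites=6, seed=0):
    """Cross-entropy method: sample commands, keep the best, refit a Gaussian."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [1.0] * dim
    for _ in range(iters):
        batch = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                 for _ in range(samples)]
        # Highest-scoring commands first.
        batch.sort(key=score_fn, reverse=True)
        top = batch[:elites]
        mean = [sum(c[i] for c in top) / elites for i in range(dim)]
        # Floor the spread so sampling never collapses to a single point.
        std = [max(1e-3, (sum((c[i] - mean[i]) ** 2 for c in top) / elites) ** 0.5)
               for i in range(dim)]
    return mean


# One servoing step: choose the command the (toy) predictor likes best.
target = [0.3, -0.2, 0.1]
command = cem_select_command(lambda c: toy_success_score(c, target))
```

Running this selection at every control step, with a fresh camera image each time, is what lets a servoing loop of this kind recover from earlier errors instead of committing to a single open-loop plan.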
![Page 15: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/15.jpg)
Grasping Experiments
![Page 16: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/16.jpg)
Policy Learning with Multiple Robots
Local policy optimization
Global policy optimization
Rollout execution
Mrinal Kalakrishnan, Yevgen Chebotar, Adrian Li, Ali Yahya
![Page 17: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/17.jpg)
Yahya, Li, Kalakrishnan, Chebotar, L., ‘16
![Page 18: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/18.jpg)
Policy Learning with Multiple Robots: Deep RL with NAF
Gu*, Holly*, Lillicrap, L., ‘16
Shane Gu, Ethan Holly, Tim Lillicrap
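NAF (normalized advantage functions) makes Q-learning practical for continuous actions by restricting the advantage to a quadratic in the action: Q(x, u) = V(x) - 0.5 (u - mu(x))^T P(x) (u - mu(x)), with P = L L^T. The sketch below evaluates that expression for given network outputs. The inputs `value`, `mu`, and `l_matrix` are assumed to come from network heads that are not shown, and the function name is illustrative.

```python
def naf_q_value(value, mu, l_matrix, action):
    """Evaluate the NAF Q-value for one state.

    value    : scalar V(x)
    mu       : greedy action mu(x), list of length n
    l_matrix : lower-triangular L(x) as an n x n nested list, so P = L L^T
    action   : action u to evaluate, list of length n
    """
    diff = [a - m for a, m in zip(action, mu)]
    n = len(diff)
    # (u - mu)^T L L^T (u - mu) equals the squared norm of L^T (u - mu).
    proj = [sum(l_matrix[i][j] * diff[i] for i in range(n)) for j in range(n)]
    advantage = -0.5 * sum(p * p for p in proj)
    return value + advantage


mu = [0.5, -0.5]
L = [[1.0, 0.0], [0.5, 2.0]]
q_at_mu = naf_q_value(1.0, mu, L, mu)            # advantage is zero at u = mu
q_off_mu = naf_q_value(1.0, mu, L, [1.5, -0.5])  # any other action scores lower
```

Because the advantage is never positive and vanishes at u = mu, the maximizing action is available in closed form, which removes the need for an inner optimization over actions.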
![Page 19: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/19.jpg)
Learning a Predictive Model of Natural Images
original video
predictions
Chelsea Finn
![Page 20: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/20.jpg)
1. Does end-to-end learning produce better sensorimotor skills?
2. Can we apply sensorimotor skill learning to a wide variety of robots & tasks?
3. Can we scale up deep robotic learning and produce skills that generalize?
4. How can we learn safely and efficiently in safety-critical domains?
5. Can we transfer skills from simulation to the real world, and from one robot to another?
![Page 21: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/21.jpg)
Safe Uncertainty-Aware Learning
unknown environment
Key idea: To learn about collisions, must experience collisions (but safely!)
1. Learn a collision prediction model (inputs: raw image, command velocities; model: neural network ensemble)
2. Speed-dependent, uncertainty-aware collision cost
3. Iteratively train with on-policy samples
Kahn, Pong, Abbeel, L. ‘16
Greg Kahn
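One way to read "speed-dependent, uncertainty-aware collision cost" is sketched below: each ensemble member predicts a collision probability, disagreement between members is treated as uncertainty, and the resulting risk is scaled by speed. The exact functional form and the `risk_weight` parameter are assumptions made for illustration. The slide does not give the formula.

```python
import statistics


def collision_cost(collision_probs, speed, risk_weight=1.0):
    """Assumed cost form: speed * (mean + risk_weight * std) over the ensemble.

    collision_probs : one predicted collision probability per ensemble member
    speed           : magnitude of the commanded velocity
    risk_weight     : hypothetical knob for how strongly disagreement is penalized
    """
    mean_p = statistics.fmean(collision_probs)
    std_p = statistics.pstdev(collision_probs)
    return speed * (mean_p + risk_weight * std_p)


# Same mean prediction, but the second ensemble disagrees with itself.
confident = collision_cost([0.1, 0.1, 0.1], speed=2.0)
uncertain = collision_cost([0.0, 0.2], speed=2.0)
```

Under this form the robot is pushed to slow down exactly where the ensemble disagrees, so any collisions that do occur in unfamiliar parts of the environment tend to happen at low speed.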
![Page 22: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/22.jpg)
Safe Uncertainty-Aware Learning
Kahn, Pong, Abbeel, L. ‘16
![Page 23: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/23.jpg)
1. Does end-to-end learning produce better sensorimotor skills?
2. Can we apply sensorimotor skill learning to a wide variety of robots & tasks?
3. Can we scale up deep robotic learning and produce skills that generalize?
4. How can we learn safely and efficiently in safety-critical domains?
5. Can we transfer skills from simulation to the real world, and from one robot to another?
![Page 24: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/24.jpg)
Training in Simulation: CAD2RL
Sadeghi, L. ‘16
Fereshteh Sadeghi
![Page 25: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/25.jpg)
Training in Simulation: CAD2RL
Sadeghi, L. ‘16
![Page 26: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/26.jpg)
Training in Simulation: CAD2RL
Sadeghi, L. ‘16
![Page 27: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/27.jpg)
Sadeghi, L. ‘16
![Page 28: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/28.jpg)
Learning with Transfer in Mind: Ensemble Policy Optimization (EPOpt)
train, test, adapt
training on single torso mass vs. training on model ensemble
unmodeled effects
ensemble adaptation
Aravind Rajeswaran
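Training on a model ensemble can be sketched as follows: sample simulator parameters (here a torso mass), roll out the policy in each sampled model, and update only on the worst-performing fraction of trajectories so the policy stays robust across the ensemble. The mass distribution, the `epsilon` value, and the toy return function are assumptions for illustration, and the policy update itself is omitted.

```python
import random


def worst_percentile_subset(returns, epsilon):
    """Indices of the lowest-return trajectories (the worst `epsilon` fraction)."""
    k = max(1, int(len(returns) * epsilon))
    order = sorted(range(len(returns)), key=lambda i: returns[i])
    return order[:k]


def toy_rollout_return(torso_mass, nominal_mass=6.0):
    """Hypothetical return: performance degrades as mass departs from nominal."""
    return 100.0 - 10.0 * abs(torso_mass - nominal_mass)


rng = random.Random(0)
# Sample one simulated model (torso mass) per trajectory; values are assumed.
masses = [rng.gauss(6.0, 1.5) for _ in range(20)]
returns = [toy_rollout_return(m) for m in masses]

# Keep the worst 10% of trajectories; only these would drive the policy update.
hard_cases = worst_percentile_subset(returns, epsilon=0.1)
hard_masses = [masses[i] for i in hard_cases]
```

Concentrating the update on the hard cases trades some average performance for robustness, which matters when the test-time model differs from anything seen in training.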
![Page 29: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/29.jpg)
1. Does end-to-end learning produce better sensorimotor skills?
2. Can we apply sensorimotor skill learning to a wide variety of robots & tasks?
3. Can we scale up deep robotic learning and produce skills that generalize?
4. How can we learn safely and efficiently in safety-critical domains?
5. Can we transfer skills from simulation to the real world, and from one robot to another?
6. How can we get sufficient supervision to learn in unstructured real-world environments?
![Page 30: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/30.jpg)
Learning what Success Means
can we learn the goal with visual features?
Finn, Abbeel, L. ‘16
![Page 31: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/31.jpg)
Learning what Success Means
Sermanet, Xu, L. ‘16
![Page 32: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/32.jpg)
ingredients for success in learning:
supervised learning: computation, algorithms, data
learning robotic skills: computation, algorithms ~ data?
![Page 33: Deep Robotic Learning - lima-city...deep learning Felzenszwalb ‘08 robotic control pipeline observations state estimation (e.g. vision) modeling & prediction planning low-level control](https://reader033.vdocuments.us/reader033/viewer/2022042804/5f52277534a87911a64e7dfd/html5/thumbnails/33.jpg)
Announcement: New Conference
Conference on Robotic Learning (CoRL)
www.robot-learning.org
Goal: bring together robotics & machine learning in a focused conference format
Conference: November 2017
Papers deadline: late June 2017
Steering committee: Ken Goldberg (UC Berkeley), Sergey Levine (UC Berkeley), Vincent Vanhoucke (Google), Abhinav Gupta (CMU), Stefan Schaal (USC, MPI), Michael I. Jordan (UC Berkeley), Raia Hadsell (DeepMind), Dieter Fox (UW), Joelle Pineau (McGill), J. Andrew Bagnell (CMU), Aude Billard (EPFL), Stefanie Tellex (Brown), Minoru Asada (Osaka), Wolfram Burgard (Freiburg), Pieter Abbeel (UC Berkeley)
Chelsea Finn, Peter Pastor, Alex Krizhevsky, Deirdre Quillen
Mrinal Kalakrishnan, Yevgen Chebotar, Adrian Li, Ali Yahya, Shane Gu, Ethan Holly, Tim Lillicrap
Greg Kahn, Fereshteh Sadeghi, Aravind Rajeswaran
Pieter Abbeel, Trevor Darrell