nov 2019 russ tedrake robust control with perception in ... · visuomotor policy learning....

69
Robust Control with Perception in the Loop: Toward “Category-Level” Manipulation Russ Tedrake Nov 2019

Upload: others

Post on 24-May-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Robust Control with Perception in the Loop:Toward “Category-Level” Manipulation

Russ TedrakeNov 2019

Page 2: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Can we make a control system for a fixed-wing airplane to land on a perch like a bird?

Page 4: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

MIT Robot Locomotion Group

Page 6: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing
Page 7: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing
Page 8: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

ONR MURI: Provable-safe high-speed flightthrough forests

w/ Ani Majumdar

Page 10: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing
Page 11: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

My new favorite challenge:

Dexterous Manipulation

Page 14: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Many challenges for RL/Control● Synthesis (which RL algorithm?)

● How do we specify the task?○ Need evaluation function using

real-world sensors○ Over what set of environments?

● How do we represent the policy?○ What is the “state space”?○ Need “Output feedback”

● Can we meaningfully quantify distributional robustness?

Page 15: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Let’s narrow the scope (a bit):

“Category-Level” Manipulation

Page 18: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Problem StatementManipulate potentially unknown rigid objects from a category (e.g. mugs, shoes) into desired target configurations

Page 19: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

SE(3) pose is difficult to generalize across category

So how do we even specify the task?What’s the cost function?Images of mugs on the rack?

Page 20: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

3D Keypoints provide rich, class-general semantics

… and robust performance in practice

Page 21: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

kPAM pipeline No template model or pose appears in this pipeline.

Page 22: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Keypoint Training DataDense Reconstruction helps overcome partial observability

Page 23: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Keypoint networkArchitecture based on:Sun, Xiao, et al. "Integral human pose regression." Proceedings of the European Conference on Computer Vision (ECCV). 2018.

Screenshot from our custom annotation tool

Page 24: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

kPAM results

Page 25: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

kPAM-SC: now with shape completionMotion planning step can now include non-collision constraints on the object

Page 26: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Q: How do we specify a diversity of tasks?

Proposal: For many geometric tasks, simple costs and constraints on semantically-labeled keypoints.

Page 27: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

How do we represent the policy?

Dense Object Nets inVisuomotor Policy Learning

Page 28: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Leveraging advances in deep perception

Page 31: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Dense Object NetsIn initial paper, our tasks were very simple. Just “grab here”.

Do these representations facilitate more complicated control?

Page 32: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

(Dramatically) improved dense descriptor trainingin 2D

Page 33: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

(Dramatically) improved dense descriptor trainingAnd now 3D

Page 34: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Visuomotor policies

Levine*, Finn*, Darrell, Abbeel, JMLR 2016

Page 35: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Primary existing methods for training visual portion

1. Pose-based auxiliary loss

2. Auto-encoding

3. End-to-end

Estimate object/hand pose(but hard for class-general or deformable)

Page 36: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Idea: What if we use dense correspondences?

Page 37: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Idea: Use a small set of descriptors

Page 38: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Evaluations are based on imitation learningw/ standard “behavior cloning” objective

+ some novel(?) heuristics for data augmentation

from hand-coded policies in simulation and tele-op on real robot

Page 39: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Simulation experimentsPolicy is a small LSTM network (~ 100 LSTMs)

“flip box”“push box”

Page 44: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing
Page 45: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing
Page 46: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing
Page 47: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Can we meaningfully quantify distributional robustness?

How should we represent distributions at the category level?

Page 48: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Requirements authoring

In controls (polytopic/ellipsoidal, etc)Domain randomization in reinforcement learning

Developing autonomous systems in the real world.

Page 49: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing
Page 54: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

More advanced falsification via nonlinear black-box optimization and rare event simulation.

NIPS 2018

Page 56: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Procedural dishes

Page 57: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing
Page 58: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing
Page 59: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Outlier detection

Page 60: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

To achieve robustness, do I need to simulate the diversity of the world?

Page 61: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Can we simulate everything in the kitchen?

Napkins? Ketchup? Soba noodles?

How accurate do our models have to be?

Control authority+ model fidelity

Task requirements

Page 62: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

How do I provide test coverage for every possible kitchen?

Hypothesis: Only need a sufficiently rich sandbox to deploy

+ continual improvement (fleet learning)

Page 63: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

SummaryOptimization brought us today’s “modern control”...

..with strong results for relatively simple forms uncertainty.

Real world uncertainty and “domain randomization” in RL is much richer. “Black-box” optimization in RL still works.

Dealing with perception and “open-worlds” may cause the next major shift in controls research; we need the maturity of control to help address fundamental problems in robustness and sample-complexity.

Page 64: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

There is so much that we don’t know how to do yet!

Page 65: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing
Page 66: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing
Page 67: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing
Page 68: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

“For the Rubik’s cube task, we use 8 × 8 = 64 NVIDIA V100 GPUs and 8 × 115 = 920 worker machines with 32 CPU cores each. … The cumulative amount of experience ... is roughly 13 thousand years.”

Solving Rubik's Cube with a Robot Hand by OpenAI, arXiv:1910.07113

Page 69: Nov 2019 Russ Tedrake Robust Control with Perception in ... · Visuomotor Policy Learning. Leveraging advances in deep perception. CoRL 2018. Dense Object Nets. ... Primary existing

Solving Rubik's Cube with a Robot Hand by OpenAI, arXiv:1910.07113