safe model-based learning for robot control...safe model-based learning for robot control felix...
TRANSCRIPT
![Page 1: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/1.jpg)
Safe model-based learning for robot control
Felix Berkenkamp, Andreas Krause, Angela P. Schoellig
@CDC Workshop on Learning for Control
16th December 2018
![Page 2: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/2.jpg)
The future of automation
2Felix Berkenkamp
![Page 3: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/3.jpg)
The future of automation
3Felix Berkenkamp
Large prior uncertainties, active decision making
Need safe and high-performance behavior
![Page 4: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/4.jpg)
Control approach
4Felix Berkenkamp
System model System identification Controller design
Controlled environments Robustness towards errors Safety constraints
data collection
![Page 5: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/5.jpg)
Two approaches
5Felix Berkenkamp
Control(Systems)
+ Models+ Feedback+ Safety+ Worst-case- Learning- Data
Performance limited by system understanding
Systems must learn and adapt
![Page 6: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/6.jpg)
Reinforcement learning approach
6Felix Berkenkamp
System model System identification Controller designData samples Controller optimization
Action
State
Agent
Environment
Reward
Collecting relevant data for the task (in controlled environments)
Performance typically in expectation
![Page 7: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/7.jpg)
Two approaches
7Felix Berkenkamp
Control(Systems)
Machine Learning(Data)
+ Models+ Feedback+ Safety+ Worst-case- Learning- Data
+ Learning+ Data collection+ Explore / exploit+ Average case- Worst-case- Safety
Systems must learn and adapt
Performance limited by system understanding
Safety limited by lack of system understanding
safety, data efficiency
Model-based reinforcement learning
![Page 8: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/8.jpg)
Prerequisites for safe reinforcement learning
8Felix Berkenkamp
Safe Model-based Reinforcement Learning
Understand model errors and learning dynamics
Algorithm to safely acquiredata and optimize task
Define safety, analyze a model for safety
![Page 9: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/9.jpg)
Overview
9Felix Berkenkamp
Safe Model-based Reinforcement Learning
Understand model errors and learning dynamics
Algorithm to safely acquiredata and optimize task
Define safety, analyze a model for safety
![Page 10: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/10.jpg)
Learning a model
10
Dynamics
Felix Berkenkamp
Need to quantify model error
Model error must decrease with data
![Page 11: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/11.jpg)
Learning a model
10
Dynamics
Felix Berkenkamp
Need to quantify model error
Model error must decrease with data
![Page 12: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/12.jpg)
Learning a model
10
Dynamics
Felix Berkenkamp
Need to quantify model error
Model error must decrease with data
![Page 13: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/13.jpg)
Learning a model
10
Dynamics
Felix Berkenkamp
Need to quantify model error
Model error must decrease with data
![Page 14: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/14.jpg)
Learning a model
10
Dynamics
Felix Berkenkamp
Need to quantify model error
Model error must decrease with data
![Page 15: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/15.jpg)
Learning a model
10
Dynamics
Felix Berkenkamp
Need to quantify model error
Model error must decrease with data
sub-Gaussian
![Page 16: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/16.jpg)
Gaussian process
16Felix Berkenkamp
![Page 17: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/17.jpg)
Gaussian process
16Felix Berkenkamp
![Page 18: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/18.jpg)
Gaussian process
16Felix Berkenkamp
![Page 19: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/19.jpg)
Gaussian process
16Felix Berkenkamp
![Page 20: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/20.jpg)
Gaussian process
16Felix Berkenkamp
Theorem (informally):The model error is contained in the scaled Gaussian process confidence intervalswith probability at least jointly for all , time steps, and actively selected measurements.
Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental DesignN. Srinivas, A. Krause, S. Kakade, M.Seeger, ICML 2010
![Page 21: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/21.jpg)
A Bayesian dynamics model
21
Dynamics
Felix Berkenkamp
![Page 22: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/22.jpg)
A Bayesian dynamics model
21
Dynamics
Felix Berkenkamp
![Page 23: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/23.jpg)
A Bayesian dynamics model
21
Dynamics
Felix Berkenkamp
![Page 24: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/24.jpg)
A Bayesian dynamics model
21
Dynamics
Felix Berkenkamp
![Page 25: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/25.jpg)
Overview
25Felix Berkenkamp
Safe Model-based Reinforcement Learning
Understand model errors and learning dynamics
Algorithm to safely acquire data and optimize task
Define safety, analyze a model for safety
![Page 26: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/26.jpg)
Safety definition
26
unsafe
Felix Berkenkamp
robust, control-invariant
prior knowledge
![Page 27: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/27.jpg)
Safety for learned models
27Felix Berkenkamp
Dynamics Policy
+
Stability?Region of attraction?
![Page 28: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/28.jpg)
Lyapunov functions
28
[A.M. Lyapunov 1892]
Felix Berkenkamp
![Page 29: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/29.jpg)
Lyapunov functions
29Felix Berkenkamp
![Page 30: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/30.jpg)
Learning Lyapunov functions
30Felix Berkenkamp
Finding the right Lyapunov function is difficult!
The Lyapunov Neural Network: Adaptive Stability Certification for Safe Learning of Dynamic SystemsS.M. Richards, F. Berkenkamp, A. Krause, CoRL 2018
Weights - positive-definiteNonlinearities - trivial nullspace
Classification problem
![Page 31: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/31.jpg)
Overview
31Felix Berkenkamp
Safe Model-based Reinforcement Learning
Understand model errors and learning dynamics
Algorithm to safely acquiredata and optimize task
Define safety, analyze a model for safety
![Page 32: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/32.jpg)
Safety definition
32
unsafe
Felix Berkenkamp
Safe Model-based Reinforcement Learning with Stability GuaranteesF. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017
Initial safe policy
![Page 33: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/33.jpg)
Safety definition
32
unsafe
Felix Berkenkamp
Safe Model-based Reinforcement Learning with Stability GuaranteesF. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017
Initial safe policy
![Page 34: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/34.jpg)
Safety definition
32
unsafe
Felix Berkenkamp
Safe Model-based Reinforcement Learning with Stability GuaranteesF. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017
Initial safe policy
![Page 35: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/35.jpg)
Safety definition
32
unsafe
Felix Berkenkamp
Safe Model-based Reinforcement Learning with Stability GuaranteesF. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017
Initial safe policy
![Page 36: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/36.jpg)
Safety definition
32
unsafe
Felix Berkenkamp
Safe Model-based Reinforcement Learning with Stability GuaranteesF. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017
Initial safe policy
![Page 37: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/37.jpg)
Safety definition
32
unsafe
Felix Berkenkamp
Theorem (informally):
Under suitable conditionscan identify (near-)maximal subset of X on which π is stable, while never leaving the safe set
Safe Model-based Reinforcement Learning with Stability GuaranteesF. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017
Initial safe policy
![Page 38: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/38.jpg)
Illustration of safe learning
38
Policy
Need to safely explore!
Felix Berkenkamp
Safe Model-based Reinforcement Learning with Stability GuaranteesF. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017
low
high
![Page 39: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/39.jpg)
Illustration of safe learning
39
Policy
Felix Berkenkamp
Safe Model-based Reinforcement Learning with Stability GuaranteesF. Berkenkamp, M. Turchetta, A.P. Schoellig, A. Krause, NIPS, 2017
low
high
![Page 40: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/40.jpg)
Model predictive control
40Felix Berkenkamp
Makes decisions based on predictions about the future
Includes input / state constraints
![Page 41: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/41.jpg)
Model predictive control on a robot
41Felix Berkenkamp
Robust constrained learning-based NMPC enabling reliable mobile robot path trackingC.J. Ostafew, A.P. Schoellig, T.D. Barfoot, IJRR, 2016
https://youtu.be/3xRNmNv5Efk
![Page 42: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/42.jpg)
Model predictive control
42Felix Berkenkamp
Problem: True dynamics are unknown!
![Page 43: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/43.jpg)
Outer approximation contains true dynamics for all time steps with probability at least
Prediction under uncertainty
43
Learning-based Model Predictive Control for Safe Exploration T. Koller, F. Berkenkamp, M. Turchetta, A. Krause, CDC, 2018
Felix Berkenkamp
![Page 44: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/44.jpg)
Safe model-based learning framework
44
unsafe safety trajectory
exploration trajectoryfirst step same
Theorem (informally):
Under suitableconditions can alwaysguarantee that we areable to return to thesafe set
Felix Berkenkamp
![Page 45: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/45.jpg)
Exploration via expected performance
45Felix Berkenkamp
We design our cost functions to be helpful for optimization
Driving too fast Slow down for safety Faster driving after learning
Exploration objective:
subject to safety constraints
![Page 46: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/46.jpg)
Example
46Felix Berkenkamp
Robust constrained learning-based NMPC enabling reliable mobile robot path trackingC.J. Ostafew, A.P. Schoellig, T.D. Barfoot, IJRR, 2016
https://youtu.be/3xRNmNv5Efk
![Page 47: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/47.jpg)
Example
47Felix Berkenkamp
Robust constrained learning-based NMPC enabling reliable mobile robot path trackingC.J. Ostafew, A.P. Schoellig, T.D. Barfoot, IJRR, 2016
https://youtu.be/3xRNmNv5Efk
![Page 48: Safe model-based learning for robot control...Safe model-based learning for robot control Felix Berkenkamp, Andreas Krause, Angela P. Schoellig @CDC Workshop on Learning for Control](https://reader033.vdocuments.us/reader033/viewer/2022053015/5f164acd3af2e92ef65e198b/html5/thumbnails/48.jpg)
Summary
48Felix Berkenkamp
Safe Model-based Reinforcement Learning
Understand model and learning dynamics
Algorithm to safely acquiredata and optimize task
Define safety, analyze a model for safety
RKHS / Gaussian processes Lyapunov stability Model predictive control
https://berkenkamp.me www.dynsyslab.orgwww.las.inf.ethz.ch
reliable confidence intervals stability of learned models Uncertainty propagation, safe active learning