
Making Deep Q-learning Approaches Robust to Time Discretization
Corentin Tallec, Léonard Blier, Yann Ollivier
Université Paris-Sud, Facebook AI Research
June 4, 2019
C. Tallec et al. (UPSUD, FAIR) Framerate robust DQ Learning June 4, 2019 1 / 4
Reinforcement Learning in Near Continuous Time
What happens when standard RL methods are used with a small time discretization, i.e., a high framerate?
Usual RL algorithm + high framerate → failure
Scalability is limited by the algorithms: better hardware, sensors, and actuators → worse performance.
This contributes to the lack of robustness of Deep RL: new environment → different framerate → new hyperparameters.
[Videos: low FPS vs. high FPS]
[Video: ddpg_low_best.mp4]
As δt → 0, Q^π(s, a) → V^π(s)
Q^π no longer depends on the action as δt → 0 ⟹ Q^π cannot be used to select actions!
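A toy calculation (not from the slides) illustrates the collapse: with a per-step reward of r·δt and discount γ^δt, a one-step look-ahead gives Q(s, a) ≈ r(s, a)·δt + γ^δt·V(s′). Assuming, for simplicity, that both actions lead to states of the same value, the gap between their Q-values shrinks linearly with δt:

```python
# Toy illustration: why Q-values collapse onto V as the time step shrinks.
# One-step look-ahead with reward r * delta_t and discount gamma ** delta_t:
#   Q(s, a) ~= r(s, a) * delta_t + gamma**delta_t * V(s')
# Here both actions are assumed to reach states of equal value v_next,
# so the action gap is driven entirely by the reward term.

def q_gap(delta_t, gamma=0.99, r_good=1.0, r_bad=0.0, v_next=10.0):
    """Q-value gap between a good and a bad action for one step of size delta_t."""
    q_good = r_good * delta_t + gamma**delta_t * v_next
    q_bad = r_bad * delta_t + gamma**delta_t * v_next
    return q_good - q_bad

for dt in [1.0, 0.1, 0.01, 0.001]:
    print(f"delta_t={dt:>6}: action gap = {q_gap(dt):.4f}")
```

The gap is O(δt), so at high framerates it drowns in function-approximation noise, which is the failure mode the slides describe.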
There is no continuous-time ε-greedy exploration.
ε-greedy with ε = 1 on pendulum: δt = 0.05 vs. δt = 0.0001
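A hypothetical simulation (illustrative setup, not the authors' code) shows why ε-greedy degenerates in near-continuous time: fully random actions of size ±1, each applied for one step of size δt over a fixed physical duration T, form a random walk whose spread scales like √(T·δt), so the explored range vanishes as δt → 0:

```python
import random

# Fully random (epsilon = 1) actions of size +/-1, applied over a fixed
# physical duration total_time. Each step moves the state by a * delta_t,
# so the net displacement is a random walk with spread ~ sqrt(total_time * delta_t).

def exploration_spread(delta_t, total_time=10.0, n_runs=2000, seed=0):
    """Mean absolute displacement of a random-action policy after total_time."""
    rng = random.Random(seed)
    n_steps = int(total_time / delta_t)
    total = 0.0
    for _ in range(n_runs):
        x = 0.0
        for _ in range(n_steps):
            x += rng.choice((-1.0, 1.0)) * delta_t
        total += abs(x)
    return total / n_runs

for dt in [0.1, 0.01]:
    print(f"delta_t={dt}: mean |displacement| = {exploration_spread(dt):.3f}")
```

Halving δt shrinks the explored range by a factor of √2, matching the pendulum videos: at δt = 0.0001 the fully random policy barely moves.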
[Video: eps-0-05.mp4]
Poster #32 this evening
[Videos: low FPS vs. high FPS]
[Video: advup_low_best.mp4]