real-life reinforcement learning - boussias conferences · © 2019, amazon web services, inc. or...

33
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon Global Evangelist, AI & Machine Learning, AWS @julsimon

Upload: others

Post on 24-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Real-life reinforcement learning

Julien Simon

Global Evangelist, AI & Machine Learning, AWS

@julsimon

Page 2: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Supervised learning• Run an algorithm on a labeled data set.

• The model learns how to correctly predict the right answer.

• Regression and classification are examples of supervised learning.

Unsupervised learning• Run an algorithm on an unlabeled data set.

• The model learns patterns and organizes samples accordingly.

• Clustering and topic modeling are examples of unsupervised learning.

Types of Machine Learning

Page 3: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Building a dataset is not always an option

• Large, complex problems• Uncertain, dynamic environments• Continuous learning

• Supply chain management• HVAC systems• Industrial robotics• Autonomous vehicles• Portfolio management• Oil exploration• etc.

Page 4: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Building a dataset is not always an option

Large, complex problems

Uncertain, dynamic environments

Continuous learning

Supply chain management, HVAC systems, industrial robotics, autonomous vehicles, portfolio management, oil exploration, etc.

Page 5: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Learning without any data: we’ve all done it!

Page 6: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Types of Machine Learning

Reinforcement learning(RL)

Supervised learning

Unsupervised learning

AMOUNT OF TRAINING DATA REQUIRED

SOP

HIS

TIC

ATI

ON

OF

ML

MO

DEL

S

Page 7: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Reinforcement Learning

An agent interacts with its environment.

The agent receives a positive or negative reward for actions that it takes: rewards are computed by a user-defined function which outputs a numeric representation of the actions that should be incentivized.

By trying to maximize the accumulation of rewards, the agent learns an optimal strategy (aka policy) for decision making.

Source: Wikipedia

Page 8: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Training a RL model

1. Formulate the problem: goal, environment, state, actions, reward

2. Define the environment: real-world or simulator?

3. Define the presets

4. Write the training code and the reward function

5. Train the model

Page 9: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Environments can be sophisticated

Popular simulators:

OpenAI, Roboschool, EnergyPlus,

MATLAB,Simulink

Page 10: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

You can write your own RL environment

Page 11: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.

https://github.com/awslabs/amazon-sagemaker-examples/tree/master/reinforcement_learning/rl_roboschool_ray

Page 12: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

The players

Simulation environment

RL agent

R O L L O U T

Initial model

Page 13: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

At first, the agent can’t even stand up

Page 14: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Actions and observations

Simulation environment

Actions

Observations

RL agent

Initial model

Page 15: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

The model learns through actions and observations

Page 16: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Interactions generate training data

Simulation environment

Actions

Observation

RL agent

Training Data

Observation, action, reward

Training

Page 17: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Training results in model updates

Simulation environment

Actions

Observation

RL agent

Model updates

Training data

Observation, action, reward

Training

Page 18: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

The agent learns to stand and step

Page 19: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Multiple training episodes improve learning

Simulation environment

Actions

Observation

RL agent

Model updates

Training data

Observation, action, reward

Training

Page 20: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Making progress

Page 21: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

RL agents try to maximize rewards

Page 22: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Eventually, the model learns how to walk and run

You can continue training Harry to jump obstacles, play games, dance, and more

Page 23: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Evaluate and deploy trained models

Simulation environment

Actions

Observation

Trained Model

RL agent

Model updates

Training data

Training visualization

Observation, action, reward

Deploy Model

Training

Page 24: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Customers are using RL on AWS

Page 25: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Amazon SageMaker RLReinforcement learning for every developer and data scientist

Broad support for frameworks

Broad support for simulation environments including SimuLink and MatLab

K E Y F E A T U R E S

TensorFlow, Apache MXNet, Intel Coach, and

Ray RL support

2D & 3D physics environments and

OpenAI Gym support

Supports Amazon Sumerian and Amazon RoboMaker

Fully managed

Example notebooks and tutorials

Page 26: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Robotics

Page 27: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Auto Scaling

Objective Adapt instance capacity to load profile

S TAT E Current load, failed jobs, active machines

AC T I O N Remove or add machines

R E WA R D Positive for successful transactions

High penalty for losing transactions

https://github.com/awslabs/amazon-sagemaker-examples/tree/master/reinforcement_learning/rl_predictive_autoscaling_coach_customEnv

Page 28: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Financial portfolio management

Objective Maximize the value of a financial portfolio

S TAT E Current stock portfolio, price history

AC T I O N Buy, sell stocks

R E WA R D Positive when return is positive

Negative when return is negative

https://github.com/awslabs/amazon-sagemaker-examples/tree/master/reinforcement_learning/rl_portfolio_management_coach_customEnv

Jiang, Zhengyao, Dixing Xu, and Jinjun Liang« A deep reinforcement learning framework for the financial portfolio management problem. »arXiv:1706.10059 (2017)

Page 29: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Compressing deep learning models

Objective Compress model without losing accuracy

S TAT E Layers

AC T I O N Remove or shrink a layer

R E WA R D A combination of compression ratio and accuracy.

https://github.com/awslabs/amazon-sagemaker-examples/tree/master/reinforcement_learning/rl_network_compression_ray_custom

Ashok, Anubhav, Nicholas Rhinehart, Fares Beainy, and Kris M. Kitani"N2N learning: network to network compression via policy gradient

reinforcement learning." arXiv:1709.06030 (2017).

Page 30: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Vehicle routing

Objective Fulfill customer orders

S TAT E Current location, distance from homes

AC T I O N Accept, pick up, and deliver order

R E WA R D Positive when we deliver on time

Negative when we fail to deliver on time

https://github.com/awslabs/amazon-sagemaker-examples/tree/master/reinforcement_learning/rl_traveling_salesman_vehicle_routing_coach

Page 31: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Autonomous driving

AWS DeepRacer

1/18th scale autonomous vehicle

Amazon RoboMaker

Page 32: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Getting started

http://aws.amazon.com/free

https://ml.aws

https://aws.amazon.com/sagemakerhttps://github.com/awslabs/amazon-sagemaker-exampleshttps://aws.amazon.com/blogs/aws/amazon-sagemaker-rl-managed-reinforcement-learning-with-amazon-sagemaker/

https://aws.amazon.com/deepracer/

https://medium.com/@julsimon

Page 33: Real-life reinforcement learning - Boussias Conferences · © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Real-life reinforcement learning Julien Simon

Thank you!

© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Julien SimonGlobal Evangelist, AI & Machine Learning, AWS@julsimon