practice theory · 2020. 1. 3.
TRANSCRIPT
Practice vs. Theory

Practice: powerful modeling, simple exploration; e.g. Atari Deep Reinforcement Learning. Limited theory for rich observations.
Theory: sophisticated exploration in small-state MDPs; e.g. the E³ and R-MAX algorithms.
Goal

Develop reinforcement learning approaches guaranteed to learn an optimal policy with a small number of samples, despite rich observations.
Model | PAC guarantees
Small-state MDPs | Known
Structured large-state MDPs | New
Reactive POMDPs | Extended
Reactive PSRs | New
LQR (continuous actions) | Known
[Slide figure: the interaction protocol. At each step the agent observes a rich context x and takes an action π(x); each episode runs for a horizon of H steps.]
The Bellman optimality equation, where Γ₁ is the distribution of the initial state, Γ(x, a) the distribution of the next state, and r(x, a) the instantaneous reward:

V⋆(x) = max_a E[r(x, a)] + E_{x'∼Γ(x,a)}[V⋆(x')]
The optimal action-value function, and the optimal action it induces:

Q⋆(x, a) = E[r(x, a)] + E_{x'∼Γ(x,a)}[V⋆(x')]
π⋆(x) = argmax_a Q⋆(x, a)
Three equivalent forms of the Bellman backup at (x, a):

E[r(x, a)] + E_{x'∼Γ(x,a)}[V⋆(x')]
= E[r(x, a)] + E_{x'∼Γ(x,a)}[max_{a'} Q⋆(x', a')]
= E[r(x, a)] + E_{x'∼Γ(x,a)}[Q⋆(x', π⋆(x'))]
§
§
E 𝑓 𝑥', 𝑎' − 𝑟' − 𝑓 𝑥'CD, 𝑎'CD ,
𝑥'
E 𝑓 𝑥', 𝑎' − 𝑟' − 𝑓 𝑥'CD, 𝑎'CD ,
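From a batch of sampled transitions this expectation can be estimated by a sample mean. A sketch with invented names, where f is a candidate value function and pi_f its greedy policy:

```python
def avg_bellman_error(f, pi_f, transitions):
    """Monte Carlo estimate of E[f(x, a) - r - f(x', pi_f(x'))]
    over sampled transitions (x, a, r, x')."""
    errors = [f(x, a) - r - f(x2, pi_f(x2)) for (x, a, r, x2) in transitions]
    return sum(errors) / len(errors)
```

An estimate near zero is consistent with f = Q⋆; a large estimate certifies that f is not Q⋆.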
Validity condition
The optimal value satisfies E_{x∼Γ₁}[max_a Q⋆(x, a)] = E_{x∼Γ₁}[Q⋆(x, π⋆(x))]. By analogy, each candidate f predicts the value

V_f = E_{x∼Γ₁}[f(x, π_f(x))]

where π_f(x) = argmax_a f(x, a) is the greedy policy of f.
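Given states sampled from the initial distribution, V_f can likewise be estimated by a sample mean. A sketch (function names are invented for illustration):

```python
def predicted_value(f, pi_f, initial_states):
    """Monte Carlo estimate of V_f = E_{x~Gamma_1}[f(x, pi_f(x))]
    from states sampled from the initial distribution Gamma_1."""
    return sum(f(x, pi_f(x)) for x in initial_states) / len(initial_states)
```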
The algorithm repeats three steps:
1. Optimism under uncertainty: pick the surviving candidate f with the largest V_f, a guess for V^{π⋆} that is exact if f = Q⋆.
2. Checking our optimistic belief: run the greedy policy π_f and estimate the average Bellman errors on the collected data.
3. Prune the possible solutions: eliminate every candidate whose Bellman error on that data is large.
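The three steps can be sketched as an elimination loop over a finite candidate class. This is only a schematic illustration under invented interfaces, not the paper's algorithm with its precise deviation bounds and sample sizes:

```python
def optimistic_elimination(candidates, predicted_value, bellman_error, tol=0.1):
    """Schematic optimism / check / prune loop.

    predicted_value(f)  -> V_f, the value f predicts from the start state.
    bellman_error(g, f) -> estimated average Bellman error of candidate g
                           on data collected by running f's greedy policy.
    """
    survivors = list(candidates)
    while survivors:
        # 1. Optimism: trust the most promising surviving candidate.
        f = max(survivors, key=predicted_value)
        # 2. Check: small Bellman error on its own data validates the belief.
        if abs(bellman_error(f, f)) <= tol:
            return f
        # 3. Prune: keep only candidates consistent with the collected data.
        survivors = [g for g in survivors if abs(bellman_error(g, f)) <= tol]
    return None
```

Since Q⋆ has zero Bellman error under every roll-in policy, it is never eliminated, so the loop either returns a near-optimal candidate or keeps shrinking the class.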
Detailsat:https://arxiv.org/abs/1610.09512