lab 6-1: q network - github pageshunkim.github.io/ml/rl/rl06-l1.pdfplaying atari with deep...

Lab 6-1: Q Network

Reinforcement Learning with TensorFlow&OpenAI GymSung Kim <hunkim+ml@gmail.com>

State(0~15) as input

(2)Ws(1)sstate 7

State(0~15) as input

(2)Ws(1)sOne-hot state 7

np.identify

State (0~15) as input

(2)Ws(1)sOne-hot state 7

Q-Network training (Network construction)

(2)Ws(1)s

Q-Network training (linear regression)

(2)Ws(1)s

y = r + �maxQ(s0)

cost(W ) = (Ws� y)2

Algorithm

Playing Atari with Deep Reinforcement Learning - University of Toronto by V Mnih et al.

Algorithm

Y label and loss function

Code: Network and setup

Code: Training

Code: results

Percent of successful episodes: 0.5195%

Q-Table VS NetworkQ-network: 0.5195%

Q-table: 0.653

Array shape

[[0, 2, …]][ [0,1,2,3], [3,1,2,3], [0,5,2,3], … ]

[[a1,a2,a3,a4]]1x4

Array Shape

[[a1,a2,a3,a4]]1x4

Exercise

• Too slow- Minibatch?

• A bit unstable?

Lab: Q-network for

cart pole

lab 6-1: q network - github pageshunkim.github.io/ml/rl/rl06-l1.pdfplaying atari with deep...

Documents

dungeon master - atari st - manual - gamesdatabase...title...

status of csr rl06 grace reprocessing and preliminary...

fight for life - atari jaguar - manual - gamesdatabase ·...

sit-in dedicated game operation manual francisco rush...

atari software

atari st internals

atari 65xe - zgierz 65xe_owners_manual...atari 65xe personal...

atari game recorder

star raiders - atari 8-bit - manual - gamesdatabase ·...

icepic - computer files preserving archive manual will help...

atari sm124 mt21

rana 1000 systems - the atari site – the atari site

atari - lazard

de re atari - musanim.com · preface introduction by enter...

atari-head: atari human eye-tracking and demonstration...

atari teenage riot

atari system reference...

atari punk console

atari® program exchange program... · atari, inc., created...

arcade games arcade blaster list · atari baseball atari...