self-learning rock paper scissors robot

Self-Learning Rock Paper Scissors Robot Bryant Pong

Upload: tuyet

Post on 22-Feb-2016

41 views

Category:

Documents

0 download

Report

Download

Embed Size (px):

DESCRIPTION

Self-Learning Rock Paper Scissors Robot. Bryant Pong. Q-Learning Algorithm. Four variables needed (all variables are[0, 1)): Alpha (learning rate): higher alpha is to 1, robot learns faster, but incorrect moves will affect the robot more - PowerPoint PPT Presentation

TRANSCRIPT

Self-Learning Rock Paper Scissors Robot

Bryant Pong

Q-Learning Algorithm

Four variables needed (all variables are[0, 1)):Alpha (learning rate): higher alpha is to 1, robot learns faster, but incorrect moves will affect the robot moreGamma (discount factor): higher gamma is to 1, robot will use learned information rather than performing random movesEpsilon (probability of a learned action): epsilon starts off at 0.999, decrements after robot performs a moveKappa (thoroughness variable): kappa ensures that more accurate values are recorded for the future

Page 3: Self-Learning Rock Paper Scissors Robot

The Robot• 3 Touch Sensors/3 Motors

Page 4: Self-Learning Rock Paper Scissors Robot

Page 5: Self-Learning Rock Paper Scissors Robot

Rock (Retracted)

Page 6: Self-Learning Rock Paper Scissors Robot

Rock (fully extended)

Page 7: Self-Learning Rock Paper Scissors Robot

Page 8: Self-Learning Rock Paper Scissors Robot

Decision Making- Markov Chains- 3 x 3 x 3 array (like a Rubik’s cube)

- 1st Dimension: Move Player Made (rock, paper, scissors)- 2nd Dimension: Move Robot Made (rock, paper, scissors)- 3rd Dimension: Result (robot won, lost, or tied?)

- Array stores probability of success for a given state

Page 9: Self-Learning Rock Paper Scissors Robot

How does the robot make a decision?if(norm_random() > epsilon) { //Use learned information ClearScreen(); TextOut(0, LCD_LINE1, "I'm thinking..."); Wait(1000); //loop through all possible choices for(i = 0; i < CHOICES; i++) { //use the lookup table to determine what the best //move to make to counter the player's move if(qscores[initial_input][i] > max_q_score) { //find the move with the highest success rate //and store if max_q_score = qscores[initial_input][i]; next_movement = i; } } }

Page 10: Self-Learning Rock Paper Scissors Robot

Reward Systemif((i == 0 && j == 0) || (i == 0 && j == 2) || (i == 1 && j == 0) || (i == 1 && j == 1) ||

(i == 2 && j ==1) || (i == 2 && j == 2)) { reward = -10000; } //otherwise, the robot must win; reward it with a virtual point!else{ reward = 1;}

Page 11: Self-Learning Rock Paper Scissors Robot

How does the reward system work?• Robot gains virtual points when it wins; loses

10,000 points when it ties/loses • Example: Give a kid $20 if he can learn to fly• Kid tries to crawl, walk, run, and jump to fly• Eventually, kid learns that he cannot fly• Along the way though, kid learns how to

crawl, walk, run, jump

Multiplayer rock-paper-scissors

Rock Paper Scissors Lizard Spock by @orsnemes

Steal This Idea: Scissors, Paper, Rock

Rock, Paper, Scissors - Mr. Taggart's Virtual Classmrtaggart.weebly.com/uploads/1/1/1/7/11176853/rock... · Utah Agriculture in the Classroom 3 scissors, scissors dominate paper and

Rock Paper Scissors - Design Position Agreement

Rock, Paper, Scissors Play in pairs Count 1,2,3 Rock beats scissors Scissors beats paper Paper beats rock Sit down when you are out of candy

Total Correct Scorer’s Initials · or each rock-paper-scissors match, paper beats rock, rock beats scissors, scissors beat paper and two identical items result in a tie. ased on

2020 School Competition Sprint Round Problems 1−30 · For each rock-paper-scissors match, paper beats rock, rock beats scissors, scissors beat paper and two identical items result

Level 5 Games Scissors, Paper, Rock - kudlian.net · Scissors, Paper, Rock “ Design and create a Scissors, Paper, Rock App” In this session we are going to recreate the basic

Yr 8 Rock Paper Scissors Sg

Rock, Paper, Scissors Loot!