falcon on a cloudy day

Falcon on a Cloudy Day

A Ro Sham Bo Algorithm

by Andrew Post

Let’s Review

If you missed my previous presentation:
Ro Sham Bo = Rock Paper Scissors
  Can be more complicated though
Ro Sham Bo has important applications
Algorithms compete at Ro Sham Bo in tournaments
Iocaine Powder is the world champ of Ro Sham Bo
  Because it uses ‘Sicilian Reasoning’
I will beat Iocaine Powder
  Eventually…

What is Ro Sham Bo?

Also known as Rock Paper Scissors

What is Ro Sham Bo?

Generalized case of Rock Paper Scissors actually
  Not always three choices
  Ties can be resolved differently
  The game is not necessarily zero-sum

Why does it matter?

Many competitive scenarios involve a Ro Sham Bo
Example: CBS and NBC choosing Primetime TV Shows
  They can choose to show a Drama, Comedy, or Sports show
  Viewers prefer Comedy to Drama, Sports to Comedy, and Drama to Sports, given the choice.
  Neither station knows ahead of time what the other will choose
Billions of dollars every day rely on decisions like these.

How it works

Simplest Non-Cooperative Game
  Players cannot play to ensure they both win
Governed by the Nash Equilibrium
  There are strategies which cannot be dominated
http://www.youtube.com/watch?v=pdrBDfRvpBA
1:31 -- 2:20

How to Win

As you just heard, playing randomly can ensure you don’t lose, but how do you win?
How to predict your opponent
  Sub-Optimal Frequency Distributions
  Pattern Matching
  History Analysis
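
A minimal sketch of the point about random play, assuming a payoff of +1 for a win, -1 for a loss, 0 for a tie, and an assumed encoding of 0 = rock, 1 = paper, 2 = scissors (none of these names come from the slides). It shows that uniform random play breaks even on average against any opponent, while a biased opponent can be exploited.

```python
from fractions import Fraction

# Assumed encoding: 0 = rock, 1 = paper, 2 = scissors.
def payoff(mine, theirs):
    """+1 if `mine` beats `theirs`, -1 if it loses, 0 on a tie."""
    diff = (mine - theirs) % 3
    if diff == 1:
        return 1
    if diff == 2:
        return -1
    return 0

def expected_value(my_dist, their_dist):
    """Expected payoff per round for two mixed strategies (probability lists)."""
    return sum(my_dist[m] * their_dist[t] * payoff(m, t)
               for m in range(3) for t in range(3))

uniform = [Fraction(1, 3)] * 3
rock_heavy = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]  # hypothetical biased opponent
always_paper = [Fraction(0), Fraction(1), Fraction(0)]

print(expected_value(uniform, rock_heavy))       # uniform play breaks even
print(expected_value(always_paper, rock_heavy))  # countering the bias gains per round
```

Breaking even here means in expectation over many rounds; any single match can still go either way.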

Iocaine Powder

International Ro Sham Bo Programming Tournament Champion
Named for this famous scene:
http://youtube.com/watch?v=TUee1WvtQZU
0:57 -- 2:20

The Tournament

Tournament programs play thousands of rounds
Win by beating the most opponents by a large margin
Most programs play sub-optimally, so exploiting your opponent is more important than playing randomly to avoid losing.

Iocaine Powder

IP is the algorithm which does this best.
IP uses the same heuristics as everyone else to predict what an opponent is most likely to do.
Using the same tools, how can you be better?
Sicilian Reasoning!

Sicilian Reasoning

Levels of second guessing:
1. Opponent will play rock, so play paper
2. Opponent knows you will counter rock with paper, and will play scissors – so play rock
3. Opponent knows all this, and will now play paper to beat your rock – so play scissors
4. Opponent will play rock again – same as 1
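
The levels above can be sketched as repeated "counter the counter" steps. This is an illustrative sketch only; the function names `beats` and `second_guess_levels` are hypothetical and do not come from Iocaine Powder itself.

```python
MOVES = ["rock", "paper", "scissors"]

def beats(move):
    """Return the move that defeats `move`."""
    return MOVES[(MOVES.index(move) + 1) % 3]

def second_guess_levels(predicted):
    """Moves suggested by levels 1-3, given the predicted opponent move."""
    level1 = beats(predicted)      # counter the prediction directly
    level2 = beats(beats(level1))  # opponent counters level 1; counter that
    level3 = beats(beats(level2))  # opponent counters level 2; counter that
    return [level1, level2, level3]

print(second_guess_levels("rock"))
```

Applying the same step once more to the level 3 move gives the level 1 move again, which is why the cycle closes after three levels.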

Sicilian Reasoning

Use your predictive strategies to evaluate what is going to happen next.
Run SR on yourself and your opponent, and keep a table of what each of the six levels of reasoning says you should do.
Pick the level of reasoning which would have won against what your opponent actually did the most often.
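
One way to keep such a table is sketched below, assuming each level's suggestion is recorded every round and scored by plain win counts. The class name `LevelTable` and the scoring rule are assumptions for illustration; the slides do not say how the real program scores levels.

```python
# Maps each move to the move that defeats it.
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

class LevelTable:
    """Counts how often each reasoning level would have won so far."""

    def __init__(self, n_levels=6):
        self.wins = [0] * n_levels

    def record(self, suggestions, opponent_move):
        """suggestions[i] is the move level i proposed for the round just played."""
        for i, move in enumerate(suggestions):
            if move == BEATS[opponent_move]:
                self.wins[i] += 1

    def best_level(self):
        """Index of the level with the most hypothetical wins (first one on ties)."""
        return max(range(len(self.wins)), key=self.wins.__getitem__)
```

After each round, call `record` with all six suggestions and the opponent's actual move, then play whatever `best_level()` suggests for the next round.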

Wait, six? Don’t you mean three?

You can use the same predictive tools that your opponent uses to ‘predict’ what you are going to do.
Now you have three more levels of SR:
4. I will play rock. So he plays paper. So play scissors.
5. He knows I will counter with scissors, and will play rock. So play paper.
6. He expects me to counter-counter with paper, and will play scissors. So play rock.
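
A short sketch of how all six candidate moves could be derived from two predictions: one of the opponent's next move and one of your own next move as the opponent would see it. The function name `six_candidates` and its parameters are hypothetical.

```python
MOVES = ["rock", "paper", "scissors"]

def beats(move):
    """Return the move that defeats `move`."""
    return MOVES[(MOVES.index(move) + 1) % 3]

def six_candidates(predicted_opponent, predicted_self):
    """Candidate moves for SR levels 1-6, in order."""
    candidates = []
    # Levels 1-3 start from countering the predicted opponent move.
    # Levels 4-6 start from countering the opponent's counter to our predicted move.
    for start in (beats(predicted_opponent), beats(beats(predicted_self))):
        move = start
        for _ in range(3):
            candidates.append(move)
            move = beats(beats(move))  # one more round of second guessing
    return candidates

print(six_candidates("rock", "rock"))
```

With both predictions set to rock, the output follows the six numbered levels on the slides: paper, rock, scissors, then scissors, paper, rock.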

More Sicilian Reasoning

Just because one level of SR is winning now, doesn’t mean it always will be.
Opponents will change how they play if they are losing, so you must change too!
How do you switch your level of SR?

Switching Reasoning

SR-2 has just won the first 100 rounds
Opponent changes strategy
You lose 50 rounds before SR-4 has more than 100 theoretical wins.
You just wasted 50 rounds!

Use several different methodologies for switches
  Most wins in last 10, 25, 50, 100, 1000 rounds
  Has won the most in similar situations
  Causes the opponent to switch to a worse strategy

Here is the real genius – now use the switching methodology which has helped you win the most rounds!
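
The window-based switching rule and the "pick the best switching rule" idea can be sketched as follows. This covers only the "most wins in the last N rounds" methodology; the other two are not modelled. The names `best_level_in_window` and `choose_level`, and the representation of history as per-round win flags, are assumptions for illustration.

```python
WINDOWS = (10, 25, 50, 100, 1000)

def best_level_in_window(history, window):
    """history: non-empty list of per-round tuples, 1 if that level would have won, else 0.
    Returns the level with the most wins over the last `window` rounds."""
    recent = history[-window:]
    n_levels = len(history[0])
    totals = [sum(flags[i] for flags in recent) for i in range(n_levels)]
    return max(range(n_levels), key=totals.__getitem__)

def choose_level(history, windows=WINDOWS):
    """Replay the past to see which window rule's picks won most, then use that rule now."""
    credit = {w: 0 for w in windows}
    for t in range(1, len(history)):
        for w in windows:
            pick = best_level_in_window(history[:t], w)
            credit[w] += history[t][pick]  # would this rule's pick have won round t?
    best_window = max(windows, key=credit.__getitem__)
    return best_window, best_level_in_window(history, best_window)

# Scenario like the slide: SR-2 (index 1) wins 100 rounds, then SR-4 (index 3) takes over.
history = [(0, 1, 0, 0, 0, 0)] * 100 + [(0, 0, 0, 1, 0, 0)] * 20
print(choose_level(history))
```

The replay is quadratic in the number of rounds, which is fine for a sketch; a real entry would update the credit incrementally each round.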

Falcon on a Cloudy Day

So you ask, how do you beat Iocaine Powder?
  Improve the basic predictive heuristics
  Extend Sicilian Reasoning

Improving Prediction

What I have implemented:
Improved Variable History Analysis
  Look at just your history, your opponent’s, or both
Improved Frequency Analysis
  EV[x] = Pr[x+2] - Pr[x+1]
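
The formula can be read with moves numbered 0 = rock, 1 = paper, 2 = scissors and indices taken mod 3, so that move x beats x+2 and loses to x+1. That encoding is an assumption; the slide does not state it. A minimal sketch that estimates Pr from the opponent's observed move frequencies:

```python
from collections import Counter

def frequency_move(opponent_history):
    """Return (best_move, ev), where ev[x] = Pr[x+2] - Pr[x+1] with indices mod 3.
    Moves are assumed encoded as 0 = rock, 1 = paper, 2 = scissors."""
    if not opponent_history:
        return 0, [0.0, 0.0, 0.0]  # no information yet: every move has EV 0
    counts = Counter(opponent_history)
    n = len(opponent_history)
    pr = [counts[m] / n for m in range(3)]
    ev = [pr[(x + 2) % 3] - pr[(x + 1) % 3] for x in range(3)]
    best = max(range(3), key=ev.__getitem__)
    return best, ev

best, ev = frequency_move([0, 0, 0, 1])  # opponent played rock three times, paper once
print(best, ev)
```

For an opponent who favours rock, paper gets the highest expected value, as expected.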

Demonstration

Here is how my project does with what is implemented so far.

Improving Prediction

What I have not implemented yet:
Improved Pattern Matching
  Markov Models with MegaHAL
Extended Sicilian Reasoning

More on MegaHAL

MegaHAL is a very simple "infinite-order" Markov model.
Stores frequency information about the moves the opponent has made in the past for all possible contexts
Using the ‘context’ of the last few moves, the “appropriate” response is then selected.
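
A minimal sketch of the context-frequency idea described above. It is not MegaHAL's actual code: the class name `ContextModel`, the cap on context length (`max_order`), and the "longest seen context wins" rule are all assumptions made to keep the example small.

```python
from collections import Counter, defaultdict

class ContextModel:
    """Counts which opponent move followed each recent context of opponent moves."""

    def __init__(self, max_order=5):
        self.max_order = max_order          # assumed cap; the slide says "infinite-order"
        self.counts = defaultdict(Counter)  # context tuple -> Counter of next moves
        self.history = []

    def _context(self, k):
        """The last k moves as a tuple (empty tuple when k == 0)."""
        return tuple(self.history[len(self.history) - k:])

    def update(self, move):
        """Record `move` as the follower of every context length up to max_order."""
        for k in range(min(self.max_order, len(self.history)) + 1):
            self.counts[self._context(k)][move] += 1
        self.history.append(move)

    def predict(self):
        """Most frequent follower of the longest context seen before, or None."""
        for k in range(min(self.max_order, len(self.history)), -1, -1):
            context = self._context(k)
            if context in self.counts:
                return self.counts[context].most_common(1)[0][0]
        return None

model = ContextModel()
for move in [0, 1, 2] * 4:  # opponent cycles rock, paper, scissors
    model.update(move)
print(model.predict())
```

The predicted opponent move would then be countered, or fed into the Sicilian Reasoning levels as the base prediction.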

Extended Sicilian Reasoning

Q: Isn’t Sicilian Reasoning complete at 6?
A: Yes, but there is information we are ignoring.
By compressing your strategy decisions into the idea of which of six strategies is best right now, you have no way to keep track of how changing your strategies has paid off best in the past.

Now for some Math

Hilbert Space
Game Trajectory and Game State
Projection Operators
Annotated History Analysis
Project Enigma
