Lecture 2 - Probability
Intro to Probability
History of Probability
• Until the 16th Century, nobody put together a
systematic analysis of probability.
• Cardano, an eminent Mathematician (and
compulsive gambler) wrote “A Book on Games of
Chance” in 1526.
• He also included chapters on effective cheating
strategies.
Basics
• If you have five things to choose from, and only one
of them is right, you have a 1-in-5 chance of getting
it right.
‣ Also 1/5
‣ 20%
‣ 0.2
• If X represents “choosing right” we can say
‣ P(X) = 0.2 (or 20%, 1/5 etc)
Monty Hall
• Last lecture we talked about the “Monty Hall”
problem
• There are 3 possible doors - behind one of them is
a car, and behind two are donkeys.
• The aim is to win the car.
The Twist
• After you pick a door, the gameshow host opens
one of the other doors to show a donkey.
• You are offered the opportunity to change to the
other door.
• Should you?
Proof of Monty Hall
• Like we said yesterday, yes you should.
• And here’s why
Proof 1 - Simple
• Initially you had a 1/3 chance of being right.
• That means a 2/3 chance of being wrong.
• If you were wrong, the host has just eliminated the other
wrong door, so switching is guaranteed to win.
• So switching wins 2/3 of the time.
Proof 2 - Enumerate
• Car at 1, 2, or 3
• Player picks 1
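               Car at 1   Car at 2      Car at 3
Host Opens 2   Lose       x             Win (twice)
Host Opens 3   Lose       Win (twice)   x
(Outcome if you switch. x = cannot happen; “twice” = counted twice,
because the host has no choice of door in that case.)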
Switching has a 2/3 chance of winning
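The enumeration above can be checked exactly. The sketch below is a minimal illustration rather than part of the lecture: it assumes the host chooses at random when both unpicked doors hide donkeys, and the function name `switch_win_probability` is made up for this example.

```python
from fractions import Fraction

DOORS = (1, 2, 3)

def switch_win_probability(pick=1):
    """Exact chance that switching wins, enumerating car position and host choice."""
    win = Fraction(0)
    for car in DOORS:
        p_car = Fraction(1, 3)  # the car is equally likely to be behind any door
        # The host may open any door that is neither the player's pick nor the car.
        host_options = [d for d in DOORS if d != pick and d != car]
        for opened in host_options:
            p_host = Fraction(1, len(host_options))  # assumption: host picks at random
            # Switching means taking the one door that is neither picked nor opened.
            switched = next(d for d in DOORS if d != pick and d != opened)
            if switched == car:
                win += p_car * p_host
    return win

print(switch_win_probability())  # 2/3
```

Using `Fraction` keeps the arithmetic exact, so the result matches the table without rounding.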
Conditional Probability
• Bayes’ Theorem of Conditional Probability
• Hinges on the concept of dependent variables.
• What is the chance that X happens given that Y has
happened?
• If X and Y are independent, it’s just the probability of X
happening
Example
• What is the likelihood of flipping a coin and getting
heads, if we have just flipped a coin and got heads?
• One flip can’t affect the other.
‣ Probability of X given Y = P(X) = 1/2
• What is the likelihood that the next train will be
late if the last train was late?
(Actually, although the events are related, this
one is based more on queueing theory...)
Bayes’ Theorem
• P(A|B) => Probability of A given B has happened.
• P(A|B) = ( P(B|A) * P(A) ) / P(B)
• In AI we make a lot of use of this theorem
‣ “Bayesian Classification”
‣ What is the likelihood that this is a particular thing, given
the data we have observed?
Bayesian Monty Hall
• What is the probability that Door 1 wins, given we
have seen that Door 2 does not?
• 3 variables Car, Selection, Host - drawn from {1,2,3}
• P(C = 1 | S = 1, H = 2)
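As a rough sketch of how this posterior can be computed with Bayes' Theorem, the code below evaluates P(C = 1 | S = 1, H = 2) directly. The helper names `host_likelihood` and `posterior_car` are invented for this illustration, and it assumes a uniform prior over doors and a host who picks at random when two donkey doors are available.

```python
from fractions import Fraction

def host_likelihood(host, car, selection):
    """P(H = host | C = car, S = selection) for the standard game."""
    if host == selection or host == car:
        return Fraction(0)      # the host never opens the chosen door or the car
    if car == selection:
        return Fraction(1, 2)   # assumption: two donkey doors, host picks at random
    return Fraction(1)          # only one door is left for the host to open

def posterior_car(car, selection, host):
    """P(C = car | S = selection, H = host) via Bayes' Theorem."""
    prior = Fraction(1, 3)      # assumption: car equally likely behind each door
    evidence = sum(host_likelihood(host, c, selection) * prior for c in (1, 2, 3))
    return host_likelihood(host, car, selection) * prior / evidence

print(posterior_car(1, 1, 2))  # 1/3
print(posterior_car(3, 1, 2))  # 2/3
```

Sticking with Door 1 wins with probability 1/3, so the remaining door carries 2/3.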
Proof
• See Wikipedia entry on “Monty Hall Problem” for recap
of maths shown in class
Spam Filtering
• Spam detection can be done with Bayes’ Theorem
• What is the likelihood that this message is spam
given it has these characteristics?
• Characteristics are typically keywords, origin, header
info etc.
Spam Filtering
• Variables Spam, Characteristics
• P( S | C ) = ( P( C | S ) P( S ) ) / P( C )
• We can learn all the values of the RHS of this from
“training data”.
• Bayes’ Theorem then allows us to generalise to
items that aren’t in the training data.
(Note that actual spam filters are much
more sophisticated, but still use Bayes)
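A toy illustration of estimating these quantities from training data, using Bayes' Theorem in the form P(S | C) = P(C | S) P(S) / P(C). The message counts are entirely hypothetical, the single characteristic is "contains a given keyword", and `p_spam_given_keyword` is a name invented for this sketch.

```python
from fractions import Fraction

# Hypothetical hand-classified training data: (contains_keyword, is_spam).
training = (
    [(True, True)] * 40       # spam containing the keyword
    + [(False, True)] * 10    # spam without it
    + [(True, False)] * 5     # legitimate mail containing the keyword
    + [(False, False)] * 45   # legitimate mail without it
)

def p_spam_given_keyword(data):
    """Estimate P(S | C) = P(C | S) * P(S) / P(C) from labelled examples."""
    total = len(data)
    spam = [row for row in data if row[1]]
    p_s = Fraction(len(spam), total)                                 # P(S)
    p_c = Fraction(sum(1 for c, _ in data if c), total)              # P(C)
    p_c_given_s = Fraction(sum(1 for c, _ in spam if c), len(spam))  # P(C | S)
    return p_c_given_s * p_s / p_c

print(p_spam_given_keyword(training))  # 8/9
```

With these made-up counts, 40 of the 45 messages containing the keyword are spam, which agrees with the 8/9 the formula produces.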
Training Data
• Big data set
• Pre-classified (by hand)
• Statistical analysis builds up a picture of what spam
looks like
‣ E.g. Emails that include “viagra” are typically spam
• Future emails can be classified using the stats we
learnt from the training data
• Refine analysis by “Report Spam” and “Not Spam”
Using Bayesian Classifiers
• We’ll see next week how we can use Bayes’
Theorem in games to classify players into
“stereotypes”
• And we can use Utility Theory from last lecture to
exploit these stereotypes
Expected Value
• Expected Value is another statistical measure.
• “How much do I expect to win on average”
• Yesterday we talked about an example
‣ Guaranteed £1 or even chance at £3
• P(X) = 1/2, Payout is 3
‣ E(X) = £1.50
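The calculation is just probability times payout, summed over every outcome. A minimal sketch follows; `expected_value` is a name made up here, and the gamble is assumed to pay nothing when it loses.

```python
from fractions import Fraction

def expected_value(outcomes):
    """Sum of probability * payout over (probability, payout) pairs."""
    return sum(p * payout for p, payout in outcomes)

safe = [(Fraction(1), 1)]                             # guaranteed £1
gamble = [(Fraction(1, 2), 3), (Fraction(1, 2), 0)]   # even chance at £3, else nothing

print(expected_value(safe))    # 1
print(expected_value(gamble))  # 3/2
```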
Using Expected Value
• Expected Value can be used to make informed
choices.
• If we get to play the £1/£3 game repeatedly, over
time we will do better picking £3.
• Note that if we play only once, we may win nothing.
‣ Which explains the result in the £1,000,000/£3,000,000 game
• Expected Value can be deceptive, but it can also be
helpful.
The St Petersburg Paradox
• You pay a fee to enter a game where a coin is
flipped repeatedly. The game ends when the first
tails is shown.
• The payout starts at £1 and doubles for every head
that is shown.
• When the game ends, you win whatever the payout
has reached.
The St Petersburg Paradox
• What is a sensible entry fee?
• Would you pay £1 to play?
• Would you pay £10 to play?
The St Petersburg Paradox
• See Wikipedia entry on St Petersburg Paradox for recap
of maths shown in class
The St Petersburg Paradox
• The Expected Value of this game is infinite.
• Therefore it “makes sense” to pay any price to play.
• But of course it doesn’t.
‣ The high payout cases are vanishingly unlikely.
• We’ll talk next week about how we can work
around this.
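To see where the infinite Expected Value comes from, the sketch below adds up the contribution of games that end after fewer than a given number of heads. It follows the rules as stated (payout starts at £1 and doubles per head, game ends on the first tail), assumes a fair coin, and `truncated_expected_value` is a hypothetical helper name.

```python
from fractions import Fraction

def truncated_expected_value(max_heads):
    """Expected payout contributed by games that end after fewer than max_heads heads."""
    total = Fraction(0)
    for heads in range(max_heads):
        probability = Fraction(1, 2) ** (heads + 1)  # `heads` heads, then one tail
        payout = 2 ** heads                          # £1 doubled once per head
        total += probability * payout                # every term is exactly 1/2
    return total

for n in (10, 20, 40):
    print(n, truncated_expected_value(n))
# 10 5
# 20 10
# 40 20
```

Each possible game length adds another 50p to the total, so the sum grows without bound even though long runs of heads are extremely rare.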
Iterated Games
• If you repeatedly play a game we call it “Iterated”.
• Iterating opens up a whole host of other options.
• In games with equilibrium points, iterating doesn’t change the outcome.
• But in games without equilibrium points, it makes a
massive difference.
• In the same way we saw with Expected Values, we
can “average out” equilibrium points for the game.
Mixed Strategies
• When a player has a choice of A, B, C etc. these are
“Pure” strategies.
• When we are playing the same game repeatedly, we
can also choose a “Mixed” strategy.
• This is a probability distribution across two or more
of the Pure strategies.
‣ E.g. P(A) = 2/3, P(C) = 1/3
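In practice, playing a mixed strategy means drawing a pure strategy at random each round according to the chosen distribution. A small sketch using the example probabilities above; `play_mixed` is a made-up helper, and the seed is fixed only to make the run repeatable.

```python
import random
from collections import Counter

def play_mixed(strategy, rng):
    """Draw one pure strategy according to the mixed strategy's probabilities."""
    choices = list(strategy)
    weights = [strategy[c] for c in choices]
    return rng.choices(choices, weights=weights, k=1)[0]

mixed = {"A": 2 / 3, "C": 1 / 3}   # the example distribution above
rng = random.Random(0)             # fixed seed so the run is repeatable
counts = Counter(play_mixed(mixed, rng) for _ in range(30_000))
print(counts["A"] / 30_000, counts["C"] / 30_000)  # roughly 0.667 and 0.333
```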
Games Without Equilibria
(Payoffs to Player 1; rows are Player 1’s choice, columns are Player 2’s)
        Odd   Even
Odd     -1     1
Even     1    -1
Equilibria
• Remember the definition of an equilibrium point
• If Player 1 changes strategy, they can only do worse
(assuming Player 2 does not change)
• Likewise Player 2 cannot change their strategy
unilaterally and do any better either.
• For both players, this is the best they can hope to
achieve
The Odds/Evens Game
• But this does not hold in Odds/Evens
• Player 1 chooses Odd and Player 2 chooses Even
‣ Player 2 would do better to unilaterally change to Odd.
• Player 1 chooses Even and Player 2 chooses Even
‣ Player 1 would do better to unilaterally change to Odd.
• The other two cases fail in the same way.
• This game has no equilibria in pure strategies!
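All four combinations can be checked mechanically. The sketch below takes the Odd/Even payoffs as payoffs to Player 1 and assumes the game is zero-sum, so Player 2 receives the negative; `pure_equilibria` is a name invented for this example.

```python
# Payoffs to Player 1; assumed zero-sum, so Player 2 receives the negative.
PAYOFF = {
    ("Odd", "Odd"): -1, ("Odd", "Even"): 1,
    ("Even", "Odd"): 1, ("Even", "Even"): -1,
}
MOVES = ("Odd", "Even")

def pure_equilibria(payoff):
    """Return every cell where neither player gains by changing unilaterally."""
    found = []
    for p1 in MOVES:
        for p2 in MOVES:
            # Player 1 compares the alternatives in the same column.
            p1_happy = all(payoff[(alt, p2)] <= payoff[(p1, p2)] for alt in MOVES)
            # Player 2 compares the alternatives in the same row, with signs flipped.
            p2_happy = all(-payoff[(p1, alt)] <= -payoff[(p1, p2)] for alt in MOVES)
            if p1_happy and p2_happy:
                found.append((p1, p2))
    return found

print(pure_equilibria(PAYOFF))  # []
```

In every cell, one of the two players can improve by switching, so the list comes back empty.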
Pseudo-Equilibria
• Calculating appropriate mixed strategies is tough.
• It’s not important to know how to do it for this
course, just that it can be done.
• However, there is an easy approach that sometimes works:
‣ Delete all dominated strategies (consider that a strategy
may be dominated by a mixed strategy...)
‣ Find a combination that will give the same average payoff
regardless of your opponent.
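For a two-by-two game, the second step reduces to one linear equation. The sketch below is an illustration under assumptions, not a general solver: it covers only the equalising step (no removal of dominated strategies), treats the numbers as payoffs to the row player, and `equalising_mix` is a hypothetical helper name.

```python
from fractions import Fraction

def equalising_mix(a, b, c, d):
    """For row-player payoffs [[a, b], [c, d]], find the probability p of playing
    row 1 that makes the average payoff identical against either column."""
    # Solve p*a + (1-p)*c == p*b + (1-p)*d for p.
    denominator = a - b - c + d
    if denominator == 0:
        raise ValueError("no unique equalising mix exists for this matrix")
    p = Fraction(d - c, denominator)
    if not 0 <= p <= 1:
        raise ValueError("equalising mix is not a valid probability")
    value = p * a + (1 - p) * c   # equals p*b + (1-p)*d by construction
    return p, value

p_odd, value = equalising_mix(-1, 1, 1, -1)   # the Odd/Even payoffs
print(p_odd, value)  # 1/2 0
```

For Odd/Even this gives an even split between the two pure strategies and an average payoff of 0 whatever the opponent does.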
Iterated Odd/Even
• We talked previously about how best to play the
Odd/Even game, and how to vary your strategy.
• What works best is not to think or reason or plot
or scheme.
• A simple mixed strategy works best
‣ P(Odd) = 0.5, P(Even) = 0.5
• Regardless of your opponent, you will on average get the
value of the game, which is 0.
Iteration For Communication
• In non-zero sum games, it may be to our advantage
to telegraph to the other player our intentions.
• But we may have no direct way of communicating.
• In an iterated game, we can send our intentions
through our choice of strategy.
‣ Our previous plays become a transcript of the message
we are sending
Optimal Prisoner’s Dilemma
• A famously effective strategy for the Iterated Prisoner’s Dilemma is
tit-for-tat (it won Axelrod’s computer tournaments).
• Signal initially to your opponent that you are willing
to cooperate.
• Subsequently, play the strategy that the opponent
played last time.
• Punishes betrayal, rewards cooperation.
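Tit-for-tat is short enough to write out in full. In the sketch below the payoff numbers are assumed for illustration (a commonly used set, not taken from the lecture), "C" means cooperate and "D" means betray, and `play` and `always_defect` are names invented for this example.

```python
# Assumed payoffs (not from the lecture): (my points, their points) per round.
PAYOFFS = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"

def always_defect(opponent_history):
    """A comparison strategy that betrays every round."""
    return "D"

def play(strategy_a, strategy_b, rounds):
    """Play an iterated game and return the two total scores."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)   # each strategy sees only the other's past moves
        b = strategy_b(moves_a)
        gain_a, gain_b = PAYOFFS[(a, b)]
        score_a += gain_a
        score_b += gain_b
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat, 10))    # (30, 30)
print(play(tit_for_tat, always_defect, 10))  # (9, 14)
```

Against itself, tit-for-tat settles into mutual cooperation; against a constant betrayer it loses only the first round and then matches the betrayal.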
Iterated Prisoner’s Dilemma
• Why is this a good thing?
• Consider the Prisoner’s Dilemma
• We can signal to the other player that we are willing
to cooperate with them.
‣ We gain the best mutual payout.
‣ Removes a lot of the risk.
The Hangman Paradox
• The Hangman Paradox is something to be wary of.
• A prisoner has been sentenced to be executed.
• He has been told that it will take place next week.
• He has also been told that it will be a surprise.
The Hangman Paradox
• It can’t happen on Friday
‣ As that’s the last day of the week, if it did it would not be
a surprise.
• And if it can’t happen on Friday, equally it can’t
happen on Thursday by the same logic.
• By induction, he can’t be executed!
The Hangman Paradox
• Having realised that he can’t be executed, he now
feels safe.
• On Wednesday, the hangman arrives to execute him.
• He is, as predicted, very surprised.
Hangman Paradox for Iterated Games
• It’s easy to fall into the same reasoning for iterated
games.
‣ In the final iteration, there is no consequence to betrayal
‣ By induction, the case for cooperating at all falls apart
• This might be true for a determinate number of
iterations.
• What about an indeterminate number?
Summary
• Lots of Probability
‣ Bayesian Probability
‣ Expected Value
• Iterated Games
• Mixed Strategies
• Cooperation
Next Week
• Covering Poker in detail
• Designing agents to play games
• Mathematical models of players