TRANSCRIPT
© 2013 IBM Corporation
July, 2013
Tutorial: Introduction to Game Theory Jesus Rios IBM T.J. Watson Research Center, USA [email protected]
Approaches to decision analysis
Descriptive – Understanding of how decisions are made
Normative – Models of how decisions should be made
Prescriptive – Helping DM make smart decisions – Use of normative theory to support DM – Elicit inputs of normative models
• DM's preferences and beliefs (psycho-analysis) • Use of experts
– Role of descriptive theories of DM behavior
Game theory arena
Non-cooperative games – More than one intelligent player – Individual action spaces – Interdependent consequences
• Players’ consequences depend on their own and other players’ actions
Cooperative game theory – Normative bargaining models
• Joint decision making - Binding agreements on what to play
• Given players’ preferences and solution space, find a fair, jointly satisfying and Pareto-optimal agreement/solution
– Group decision making on a common action space (Social choice) • Preference aggregation • Voting rules
- Arrow’s theorem – Coalition games
Cooperative game theory: Bargaining solution concepts
• Disagreement point: BATNA, status quo • Feasible solutions: ZOPA • Pareto-efficiency • Aspiration levels • Fairness:
K-S, Nash, maxmin solutions
Example: Working alone, Juan earns $10 and Maria earns $20; working together they earn $100. How to distribute the profits of the cooperation? Juan gets x and Maria gets y, with x + y = 100, x ≥ 10, y ≥ 20. Juan's best feasible payoff is x = 80 and Maria's is y = 90, so the bliss point is (80, 90). The fair Kalai–Smorodinsky (K-S) solution, on the line from the disagreement point (10, 20) to the bliss point, gives x = 45, y = 55.
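The split above can be checked with a short computation. This is a minimal sketch, not from the slides; the function name is illustrative. It maximizes the Nash product (x − 10)(y − 20) on the line x + y = 100, and in this example the Nash and Kalai–Smorodinsky solutions happen to coincide.

```python
# Sketch: Nash bargaining solution for the Juan/Maria example,
# disagreement point d = (10, 20), joint surplus x + y = 100.
def nash_bargaining_split(total, d1, d2):
    """Maximize (x - d1)*(y - d2) subject to x + y = total.
    Setting the derivative of (x - d1)*(total - x - d2) to zero
    gives x = (total + d1 - d2) / 2."""
    x = (total + d1 - d2) / 2
    y = total - x
    return x, y

juan, maria = nash_bargaining_split(100, 10, 20)
print(juan, maria)  # 45.0 55.0, matching the slide
```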
Normative models of decision making under uncertainty
Models for a unitary DM – vN-M expected utility
• Objective probability distributions – Subjective expected utility (SEU)
• Subjective probability distributions
Example: investment decision problem – One decision variable with two alternatives
• In what to invest? - Treasury bonds - IBM shares
– One uncertainty with two possible states • IBM share price at the end of the year
- High - Low
– One evaluation criterion for consequences • Profit from investment
The simplest decision problem under uncertainty
Decision Table
DM chooses a row without knowing which column will occur
Choice depends on the relative likelihood of High and Low – If DM is sure that IBM share price will be High,
best choice is to buy Shares – If DM is sure that IBM share price will be Low,
best choice is to buy Bonds Elicit the DM’s beliefs about which column will occur
Choice depends on the value of money – Expected return not a good measure of decision preferences
• The two alternatives give the same expected return, but most DMs would not feel indifferent between them Elicit risk attitude of the DM
Decision tree representation
What does the choice depend upon? – relative likelihood of H vs L – strength of preferences for money
Tree: a decision node "What to buy" branches into IBM Shares and Bonds. The IBM Shares branch leads to a chance node "price" (uncertainty) with branches High → $2,000 and Low → - $1,000; the Bonds branch yields $500 with certainty.
Subjective expected utility solution
If the DM’s decision behavior is consistent with some set of “rational” desiderata (axioms), the DM decides as if he has
– probabilities to represent his beliefs about the future price of IBM shares – “utilities” to represent his preferences and risk attitude towards money
and chooses the alternative of maximum expected utility
The subjective expected utility model balances in a “rational” manner – the DM’s beliefs and risk attitudes
Application requires – knowing the DM’s beliefs and “utilities”
• Different elicitation methods – computing the expected utility of each decision strategy
• It may require approximation in non-simple problems
The Basic Canonical Reference Lottery ticket: p-BCRL
A p-BCRL pays $2,000 with canonical probability p and - $1,000 with probability 1 - p.
Preferences over BCRL p-BCRL > q-BCRL iff p > q
where p and q are canonical probabilities
A constructive definition of “utility”
Elicit prob. of the price of IBM shares
Event H – IBM price High
Event L – IBM price Low
Pr( H ) + Pr( L ) = 1
The DM compares holding IBM shares (which pay $2,000 if H and - $1,000 if L, resolved by the chance node "price") against a p-BCRL over the same two prizes.
Move p from 1 to 0 Which alternative is preferred by the DM?
– IBM shares – p-BCRL
There exists a breakeven canonical prob. such that the DM is indifferent – pH-BCRL ~ IBM shares
– The judgmental probability of H is pH
Elicit the utility of $500
What is U($500)? The DM compares a p-BCRL over $2,000 and - $1,000 against Bonds, which pay $500 for sure.
Move p from 1 to 0 Which alternative is preferred by the DM?
p-BCRL vs. Bonds There exists a breakeven canonical prob. such that the DM is indifferent
– u-BCRL ~ Bonds
– This scales the value of $500 between the value of $2,000 and - $1,000 U($500) = u
What is then U($500)? – The probability of a BCRL between $2,000 and - $1,000 that is indifferent (for the DM) to getting
$500 with certainty
Comparison of alternatives
– IBM shares ~ pH-BCRL over $2,000 and - $1,000 (via the elicited probability of the price event H)
– Bonds ($500 for sure) ~ U($500)-BCRL over the same prizes
Since p-BCRL > q-BCRL iff p > q, the DM prefers to invest in “IBM Shares” iff
pH > U($500)
Solving the tree: backward induction
Utility scaling 0 = U( - $1,000 ) < U( $500 ) = u < U( $2,000 ) = 1
The same tree as before, with utilities at the endpoints: $2,000 → 1, - $1,000 → 0, $500 → u, and probabilities pH and 1 - pH on the High/Low branches. Folding back: EU(IBM Shares) = pH · 1 + (1 - pH) · 0 = pH and EU(Bonds) = u, so buy IBM Shares iff pH > u.
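The fold-back computation can be sketched in a few lines. This is an illustrative snippet: the inputs pH = 0.7 and u = 0.6 are made-up elicitation results, not from the slides.

```python
# Sketch of backward induction on the slide's tree, with utilities
# scaled so that U(-$1,000) = 0, U($500) = u, U($2,000) = 1.
def best_choice(p_high, u_bonds):
    eu_shares = p_high * 1.0 + (1 - p_high) * 0.0  # = p_high
    eu_bonds = u_bonds
    return "IBM Shares" if eu_shares > eu_bonds else "Bonds"

# Hypothetical elicited values: pH = 0.7, u = U($500) = 0.6
print(best_choice(0.7, 0.6))  # IBM Shares, since 0.7 > 0.6
```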
Preferences: value vs. utility
Value function – measures the desirability (intensity of preferences) of money gained, – but does not measure risk attitude
Utility function – measures risk attitude – but not intensity of preferences over sure consequences
Many methods to elicit a utility function – Qualitative analysis of risk attitude leads to parametric utility functions – Ask quantitative indifference questions between deals (one of which must be an uncertain
lottery) to assess parameters of utility function – Consistency checks and sensitivity analysis
The Bayesian process of inference and evaluation with several stakeholders and decision makers (Group decision making)
Disagreements in group decision making
Group decision making assumes – Group value/utility function – Group probabilities on the uncertainties
If our experts disagree on the science (Expert problem) – How to draw together and learn from conflicting probabilistic judgements – Mathematical aggregation
• Bayesian approach • Opinion pools
- There is no opinion pool satisfying a minimal consensus set of “good” probabilistic properties • Issues
- How do we model knowledge overlap/correlation - Expertise evaluation
– Behavioural aggregation – The textbook problem
• If we do not have access to experts we need to develop meta-analytical methodologies for drawing together expert judgment studies
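As a concrete illustration of mathematical aggregation, here is a minimal sketch of a linear opinion pool, one of the simplest pooling rules; the equal weights are an assumption (in practice they might come from expertise evaluation).

```python
# Sketch: linear opinion pool. The group probability of an event is a
# weighted average of the experts' probabilities for that event.
def linear_pool(probs, weights):
    assert abs(sum(weights) - 1.0) < 1e-9  # weights must sum to one
    return sum(p * w for p, w in zip(probs, weights))

# Three experts' probabilities for the same event, equal weights:
print(linear_pool([0.2, 0.5, 0.8], [1/3, 1/3, 1/3]))  # ≈ 0.5
```

Note that no pooling rule of this kind satisfies all the “good” probabilistic properties at once, as the slide points out.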
Disagreements in group decision making
If group members disagree on the values – How to combine different individuals’ rankings of options into a group ranking? – Arbitration/voting
• Ordinal rankings - Arrow impossibility results.
• Cardinal ranking (values and not utilities -- Decisions without uncertainty) - Interpersonal comparison of preferences’ strengths - Supra decision maker approach (MAUT)
• Issues: manipulation and true reporting of rankings
Disagreement on the values and the science – Combining
• individual probabilities and utilities • into group probabilities and utilities, respectively, • to form the corresponding group expected utilities and choosing accordingly
– Impossibility of being Bayesian and Paretian at the same time • No aggregation method (of probabilities and utilities) exists that is compatible with the Pareto order
– Behavioral approaches • Consensus on group probabilities and utilities via sensitivity analysis. • Agreement on what to do via negotiation
Decision analysis in the presence of intelligent others
Matrix games against nature – One player: R (Row)
• Two choices: U (Up) and D (Down) – Payoff matrix
Payoff matrix (R's payoffs; Nature "chooses" a column):

         Nature
          L    R
  R  U    0    5
     D   10    3

If you were R, what would you do? D > U against L; U > D against R.
Games against nature
Do we know which column Nature will choose? – We know our best responses to Nature's moves, but not which move Nature will choose
Do we know the (objective) probabilities of Nature’s possible moves? – YES
          Nature
           L      R
  R  U     0      5
     D    10      3
           p    1 - p

Expected payoffs: EU(U) = 0·p + 5·(1 - p); EU(D) = 10·p + 3·(1 - p)
U > D iff p < 1/6 Payoffs = vNM utils
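The breakeven probability p = 1/6 can be verified numerically. A minimal sketch using the slide's payoffs (function name illustrative):

```python
# Sketch: expected payoffs in the game against Nature, Pr(L) = p.
def expected_payoffs(p):
    eu_up = 0 * p + 5 * (1 - p)
    eu_down = 10 * p + 3 * (1 - p)
    return eu_up, eu_down

# Breakeven: 5 - 5p = 3 + 7p  =>  p = 1/6; U is best iff p < 1/6.
for p in (0.1, 0.3):
    up, down = expected_payoffs(p)
    print(p, "U" if up > down else "D")  # 0.1 U, then 0.3 D
```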
Games against nature and the SEU criteria
Do we know the (objective) probabilities of Nature’s possible moves? – No
• Variety of decision criteria - Maximin (pessimistic), maxmax (optimistic), Hurwicz, minimax regret,…
          Nature
           L    R     min   max   max regret
  R  U     0    5      0     5       10
     D    10    3      3    10        2

Maxmin: D (pessimistic). Maxmax: D (optimistic). Minimax regret: D.

SEU criteria
Elicit DM’s subjective probabilistic beliefs about Nature move (p) Compute SEU of each alternative: D > U iff p > 1/6
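These criteria can be computed mechanically. A sketch for the same payoff matrix (the dictionary encoding is illustrative):

```python
# Sketch of the decision criteria on the slide.
payoffs = {"U": [0, 5], "D": [10, 3]}  # rows vs Nature's columns L, R

maximin = max(payoffs, key=lambda a: min(payoffs[a]))  # pessimistic
maximax = max(payoffs, key=lambda a: max(payoffs[a]))  # optimistic

# Regret of an action in a column: best column payoff minus its payoff.
col_best = [max(row[j] for row in payoffs.values()) for j in range(2)]
regret = {a: max(col_best[j] - payoffs[a][j] for j in range(2))
          for a in payoffs}
minimax_regret = min(regret, key=regret.get)

print(maximin, maximax, minimax_regret)  # D D D
```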
Games against other intelligent players
Bimatrix (simultaneous) games – Second intelligent player: C (Column)
• Two choices: L (Left) and R (Right) – Payoff bimatrix
• we know C's payoffs and that he will try to maximize them – As R, what would you do?
Payoff bimatrix (uR, uC):

             C
            L       R
  R  U    0, 2    5, 4
     D   10, 3    3, 8

– Knowledge of C's payoffs and rationality allows us to predict with certitude C's move: R strictly dominates L for C (4 > 2 and 8 > 3), so C plays R; R's best response is then U (5 > 3).
One shot simultaneous bi-matrix games
Two players – Trying to maximize their payoffs
Players must choose one out of two fixed alternatives – Row player chooses a row – Column player chooses a column
Payoffs depend on both players’ moves Simultaneous move game
– Players must act without knowing what the other player does – Play once
No other uncertainties involved Players have full and common knowledge of
– choice spaces – bi-matrix payoffs
No cooperation allowed
             C
            L                    R
  R  U   uR(U,L), uC(U,L)    uR(U,R), uC(U,R)
     D   uR(D,L), uC(D,L)    uR(D,R), uC(D,R)
Dominant alternatives and social dilemmas
Prisoner's dilemma – (NC,NC) is mutually dominant
• Players’ choices are independent of information regarding the other player’s move
– (NC,NC) is socially dominated by (C,C)
Airport network security
             C
            C         NC
  R  C    5, 5     -5, 10
     NC  10, -5    -2, -2

NC strictly dominates C for both players (10 > 5 and -2 > -5), yet (NC,NC) yields (-2,-2), worse for both than (C,C) = (5,5).
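The dominance claims can be checked directly against the bimatrix. A minimal sketch (the dictionary encoding is illustrative):

```python
# Sketch: dominance check in the slide's prisoner's dilemma.
# Entries are (row player's payoff, column player's payoff).
bimatrix = {("C", "C"): (5, 5),    ("C", "NC"): (-5, 10),
            ("NC", "C"): (10, -5), ("NC", "NC"): (-2, -2)}

# NC strictly dominates C for the row player:
assert bimatrix[("NC", "C")][0] > bimatrix[("C", "C")][0]    # 10 > 5
assert bimatrix[("NC", "NC")][0] > bimatrix[("C", "NC")][0]  # -2 > -5
# By symmetry, the same holds for the column player. Yet (C, C)
# Pareto-dominates the dominant-strategy outcome (NC, NC):
assert bimatrix[("C", "C")] > bimatrix[("NC", "NC")]  # (5,5) > (-2,-2)
print("social dilemma confirmed")
```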
Iterative dominance
No dominant strategy for either player, however – There are iterative dominated strategies
• L > R • Now M is dominant in the restricted game
- M > U and M > D • Now L > C in the restricted game
- 20 > - 10 – (M,L) is the solution by iterative elimination of (strictly) dominated strategies
• Common knowledge and rationality assumptions
Exercise – Determine whether there is a solution by iteratively eliminating dominated strategies
Solution: (D,C)
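The 3×3 matrices this slide refers to are not reproduced in the transcript, so the following sketch illustrates the elimination procedure on a hypothetical 2×2 bimatrix game; the function and example game are illustrative only.

```python
# Sketch: iterated elimination of strictly dominated strategies.
# payoffs[r][c] = (row player's payoff, column player's payoff).
def iterated_elimination(payoffs):
    rows = list(range(len(payoffs)))
    cols = list(range(len(payoffs[0])))
    changed = True
    while changed:
        changed = False
        for r in rows[:]:  # drop rows strictly dominated by another row
            if any(all(payoffs[r2][c][0] > payoffs[r][c][0] for c in cols)
                   for r2 in rows if r2 != r):
                rows.remove(r); changed = True
        for c in cols[:]:  # drop columns strictly dominated for Column
            if any(all(payoffs[r][c2][1] > payoffs[r][c][1] for r in rows)
                   for c2 in cols if c2 != c):
                cols.remove(c); changed = True
    return rows, cols

game = [[(3, 1), (0, 2)],
        [(1, 0), (2, 3)]]  # hypothetical 2x2 example
print(iterated_elimination(game))  # ([1], [1]): unique surviving profile
```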
Nash equilibrium
Games without – Dominant solution – Solution by iterative elimination of dominated alternatives
Battle of the sexes (two pure NEs, where the players coordinate):

            Concert    Ballet
  Ballet     0, 0       2, 1
  Concert    1, 2       0, 0

Matching pennies (no pure NE):

            Head      Tails
  Head      1, -1     -1, 1
  Tails    -1, 1       1, -1
Existence of Nash equilibrium (Nash)
Every finite game has a NE in mixed strategies – Requires extending the original set of alternatives of each player
Consider the matching pennies game – Mixed strategies
• Choosing a lottery of certain probabilities over Head and Tails – Players’ choice sets defined by the lottery’s probability
• Row: p in [0,1] • Column: q in [0,1]
– Payoff associated with a pair of strategies (p,q) is • (p, 1-p) P (q, 1-q)^T
where P is the payoff matrix for the original game in pure strategies • Payoffs need to be vNM utilities
– Nash equilibrium • Intersection of players' best-response correspondences
• (p*, q*) is a NE iff uR(p*, q*) ≥ uR(p, q*) for all p, and uC(p*, q*) ≥ uC(p*, q) for all q
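For matching pennies, the indifference conditions give the unique mixed NE p* = q* = 1/2. A minimal sketch of the payoff computation (zero-sum, so Column's payoff is the negative of Row's):

```python
# Sketch: Row's expected payoff in matching pennies under the mixed
# strategies (p, 1-p) and (q, 1-q). Row's pure-strategy payoff matrix:
P = [[1, -1],
     [-1, 1]]

def row_payoff(p, q):
    # (p, 1-p) P (q, 1-q)^T, expanded as a double sum.
    return sum(pi * qj * P[i][j]
               for i, pi in enumerate((p, 1 - p))
               for j, qj in enumerate((q, 1 - q)))

# At the equilibrium p* = q* = 1/2, Row's payoff is 0, and no
# unilateral deviation changes it:
print(row_payoff(0.5, 0.5))  # 0.0
```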
Nash equilibria concept as predictive tool
Supporting the row player against the column player Games with multiple NEs
             C
            L          R
  R  U    4, -100    10, 6
     D   12, 8        5, 4

Two NEs: (D,L) and (U,R). (D,L) > (U,R), since 12 > 10 and 8 > 6. C may prefer to play R
– To protect himself against -100 Knowing this, R would prefer to play U
– ending up at the inferior NE (U,R) How can we model C's behavior?
– Bayesian K-level thinking
K-level thinking
Row is not sure about Column’s move – p: Row’s beliefs about C moving L – Row’s SEU
• U: 4 p + 10 (1-p) • D: 12 p + 5 (1-p)
– U > D iff p < 5/13 ≈ 0.38 How to elicit p?
– Row’s analysis of Column’s decision • Assuming C behave as a SEU maximizer • q: C’s beliefs about whether Row is smart enough to choose D (best NE) • L SEU: -100 (1-q) + 8 q
R SEU: 6 (1-q) + 4 q • L > R iff q > 53/55 ≈ 0.96 • Since Row does not know q, his beliefs about q are represented by a CDF F • p = Pr(q > 53/55) = 1 - F(53/55)
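The two thresholds in this analysis can be checked exactly with rational arithmetic. A minimal sketch (function names are illustrative):

```python
# Sketch of the two thresholds in the K-level analysis:
# Row:    EU(U) = 4p + 10(1-p), EU(D) = 12p + 5(1-p)  =>  U > D iff p < 5/13
# Column: EU(L) = -100(1-q) + 8q, EU(R) = 6(1-q) + 4q =>  L > R iff q > 53/55
from fractions import Fraction

def row_prefers_U(p):
    return 4 * p + 10 * (1 - p) > 12 * p + 5 * (1 - p)

def col_prefers_L(q):
    return -100 * (1 - q) + 8 * q > 6 * (1 - q) + 4 * q

# Probe just inside both thresholds:
print(row_prefers_U(Fraction(4, 13)), col_prefers_L(Fraction(54, 55)))
# True True
```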
Simultaneous vs sequential games
First mover advantage – Both players want to move first
• Credible commitment/threat
Second mover advantage – Players want to observe their opponent’s move
before acting – Both players try not to disclose their moves
Examples: Game of Chicken (first-mover advantage); Matching pennies (second-mover advantage)
Dynamic games: backward induction
Sequential Defend-Attack games – Two intelligent players
• Defender and Attacker – Sequential moves
• First Defender, afterwards Attacker knowing Defender’s decision
Standard Game Theoretic Analysis
Solution:
Expected utilities at node S
Best Attacker’s decision at node A
Assuming Defender knows Attacker’s analysis Defender’s best decision at node D
Supporting a SEU maximizer Defender
Defender’s problem Defender’s solution of maximum SEU
Modeling input: ??
Example: Banks-Anderson (2006)
Exploring how to defend US against a possible smallpox attack – Random costs (payoffs)
– Conditional probabilities of each kind of smallpox attack given terrorists know what defence has been adopted
– Compute expected cost of each defence strategy
Solution: defence of minimum expected cost
This is the problematic step of the analysis
Predicting the Attacker’s decision
Defender problem Defender’s view of Attacker problem
Solving the assessment problem
Defender’s view of Attacker problem
Elicitation of
A is an EU maximizer
D’s beliefs about
MC simulation
Standard Game Theory vs. Bayesian Decision Analysis
Decision Analysis (unitary DM) – Use of decision trees – Opponent's actions treated as random variables
• How to elicit probs on opponents’ decisions?? • Sensitivity analysis on (problematic) probabilities
Game theory (multiple DMs) – Use of game trees – Opponent's actions treated as decision variables – All players are EU maximizers
• Do we really know the utilities our opponents try to maximize?
Bayesian decision analysis approach to games
One-sided prescriptive support – Use a prescriptive model (SEU) for supporting one of the DMs – Treat opponent's decisions as uncertainties – Assess probs over opponent's possible actions – Compute action of maximum expected utility
The ‘real’ Bayesian approach to games (Kadane & Larkey 1982) – Weaken common (prior) knowledge assumption
How to assess a prob distribution over actions of intelligent others?? – “Adversarial Risk Analysis” (DRI, DB and JR) – Development of new methods for the elicitation of probs on adversary’s actions
• by modeling the adversary’s decision reasoning - Descriptive decision models
Relevance to counterbioterrorism
Biological Threat Risk Assessment for DHS (Battelle, 2006) – Based on Probability Event Trees (PET)
• Government & Terrorists’ decisions treated as random events
Methodological improvements study (NRC committee) – PET appropriate for risk assessment of
• Random failures in engineering systems, but not for adversarial risk assessment
• Terrorists are intelligent adversaries trying to achieve their own objectives
• Their decisions (if rational) can be somehow anticipated
– PET cannot be used for a full risk management analysis • Government is a decision maker, not a random variable
Methodological improvement recommendations
Distinction between risks from – Nature/Accidents vs. – Actions of intelligent adversaries
Need for models to predict Terrorists’ behavior – Red team role playing (simulations of adversaries' thinking)
– Attack-preference models • Examine decision from Attacker viewpoint (T as DM)
– Decision analytic approaches • Transform the PET into a decision tree (G as DM)
- How to elicit probs on terrorist decisions?? - Sensitivity analysis on (problematic) probabilities - Von Winterfeldt and O’Sullivan (2006)
– Game theoretic approaches • Transform the PET into a game tree (G & T as DM)
Models to predict opponents’ behavior
Role playing (simulations of adversaries thinking)
Opponent-preference models – Examine decision from the opponent viewpoint
• Elicit opponent’s probs and utilities from our viewpoint (point estimates) – Treat the opponent as an EU maximizer ( = rationality?)
• Solve opponent’s decision problem by finding his action of max. EU
– Assuming we know the opponent’s true probs and utilities • We can anticipate with certitude what the opponent will do
Probabilistic prediction models – Acknowledge our uncertainty on opponent’s thinking
Opponent-preference models
Von Winterfeldt and O’Sullivan (2006) – Should We Protect Commercial Airplanes Against
Surface-to-Air Missile Attacks by Terrorists?
Decision tree + sensitivity analysis on probs
Parnell (2007)
Elicit Terrorist’s probs and utilities from our viewpoint – Point estimates
Solve Terrorist’s decision problem – Finding Terrorist’s action that gives him max. expected utility
Assuming we know the Terrorist’s true probs and utilities – We can anticipate with certitude what the terrorist will do
Paté-Cornell & Guikema (2002)
Assessing probabilities of terrorist’s actions – From the Defender viewpoint
• Model the Attacker’s decision problem • Estimate Attacker’s probs and utilities (point estimates) • Calculate expected utilities of attacker’s actions
– Prob of attacker’s actions proportional to their perceived EU
Feed these probs into the Defender’s decision problem – Uncertainty of Attacker’s decisions has been quantified – Choose defense of maximum expected utility
Shortcoming – If the (idealized) adversary is an EU maximizer he would certainly choose the attack of max expected utility
How to assess probabilities over the actions of an intelligent adversary??
Raiffa (2002): Asymmetric prescriptive/descriptive approach – Prescriptive advice to one party conditional on
a (probabilistic) description of how others will behave – Assess probability distribution from experimental data
• Lab role simulation experiments
Rios Insua, Rios & Banks (2009) – Assessment based on an analysis of the adversary rational behavior
• Assuming the opponent is a SEU maximizer - Model his decision problem - Assess his probabilities and utilities - Find his action of maximum expected utility
– Uncertainty in the Attacker’s decision stems from • our uncertainty about his probabilities and utilities
– Sources of information • Available past statistical data of Attacker’s decision behavior • Expert knowledge / Intelligence
The Defend–Attack–Defend model
Two intelligent players – Defender and Attacker
Sequential moves – First, Defender moves – Afterwards, Attacker knowing Defender’s move – Afterwards, Defender again responding to attack
Infinite regress
Under common knowledge of utilities and probs At node
Expected utilities at node S
Best Attacker’s decision at node A
Best Defender’s decision at node
Nash Solution:
Standard Game Theory Analysis
At node
Expected utilities at node S
At node A
Best Defender’s decision at node
??
Supporting the Defender against the Attacker
The Defender may want to exploit information about how the Attacker analyzes her problem
Hierarchy of recursive analysis – Infinite regress – Stop when there is no more information to elicit
The assessment of
Games with private information
Example: – Consider the following two-person simultaneous game with asymmetric information
• Player 1 (row) knows whether he is stronger than player 2 (Column) but player 2 does not know this
• A player's type is used to represent information privately known by that player
Bayes Nash Equilibrium
Assumption – common prior over the row player's type:
• Column's beliefs about the row player's type are common knowledge • Why would Column disclose this information? • Why would Row believe that Column is disclosing her true beliefs about his type?
Row’s strategy function
Is the common knowledge assumption realistic?
– Column is better off reporting that
Modeling opponents' learning of private information
Simultaneous decisions – Bayes Nash Equilibrium – No opportunity to learn about this information
Sequential decisions • Perfect Bayesian equilibrium/Sequential rationality • Opportunity to learn from the observed decision behavior
- Signaling games
Models of adversaries' thinking to anticipate their decision behavior – need to model opponents' learning of private information we want to keep secret – how would this lead to a predictive probability distribution?
Sequential Defend-Attack model with Defender’s private information
Two intelligent players – Defender and Attacker
Sequential moves – First Defender, afterwards Attacker knowing Defender’s decision
Defender’s decision takes into account her private information – The vulnerabilities and importance of sites she wants to protect – The position of ground soldiers in the data ferry control problem (ITA)
Attacker observes Defender’s decision – Attacker can infer/learn about information she wants to keep secret
How to model the Attacker’s learning?
Supporting the Defender
We weaken the common knowledge assumption The Defender’s decision problem
(Influence diagram: Defender's decision D, her private information V, outcome S, and the Attacker's decision A, whose assessment is marked ??)
How to stop this hierarchy of recursive analysis?
Potentially infinite analysis of nested decision models – where to stop?
• Accommodate as much information as we can • Stop when the Defender has no more information • Non-informative or reference model • Sensitivity analysis test