19074879 100 mark project game theory (1)
Post on 07-Apr-2018
219 Views
Preview:
TRANSCRIPT
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
1/55
Game Theory
Introduction
Game theory is concerned with the decision-making process in situations where
outcomes depend upon choices made by one or more players. The word "game" is not
used in the conventional sense but describes any situation involving positive or negative
outcomes determined by the players' choices and, in some cases, chance. In order for
game theory to apply, certain assumptions must be made. The first is that each player is
rational, acting in his self-interest. In addition, the players' choices determine the
outcome of the game, but each player has only partial control of the outcome.
Game theory is the mathematical analysis of a conflict of interest to find optimal choices
that will lead to desired outcome under given conditions. To put it simply, it's a study of
ways to win in a situation given the conditions of the situation. While seemingly trivial
in name, it is actually becoming a field of major interest in fields like economics,
sociology, and political and military sciences, where game theory can be used to predict
more important trends.
Though the title of originator is given to mathematician John von Neumann, the first to
explore this matter was a French mathematician named Borel. In the 1930s, Neumann
published a set of papers that outlined the tenets of game theory and thus made way for
the first simulations which considered mathematical probabilities. This was used by
strategists during the Second World War, and since then has earned game theory a place
in the context of Social Science.
It may at first seem arcane to involve mathematics in something that seems purely basedon skill and chance, but game theory is in actuality a complex part of many branches of
mathematics including set theory, probability and statistics, and plain algebra. This
results from the fact that games are dictated by a given set of rules that can be used to
outline a set of possible moves which can be ranked by desirability and effectiveness,
and with information available, such a set can also be constructed for the opponent, thus
allowing predictions about the possible outcomes within a certain number of moves with
a probabilistic accuracy.
1
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
2/55
Game Theory
Von Neumann and the development
of Game Theory
Emile Borel: The Forgotten Father of Game Theory?
In 1921, Emile Borel, a French mathematician, published
several papers on the theory of games. He used poker as
an example and addressed the problem of bluffing and
second-guessing the opponent in a game of imperfect
information. Borel envisioned game theory as being used
in economic and military applications. Borel's ultimate
goal was to determine whether a "best" strategy for a
given game exists and to find that strategy.
While Borel could be arguably called as the first mathematician to envision an organized
system for playing games, he did not develop his ideas very far. For that reason, most
historians give the credit for developing and popularizing game theory to John VonNeumann, who published his first paper on game theory in 1928, seven years after Borel.
John Von Neumann
Born in Budapest, Hungary, in 1903, Von Neumann distinguished himself from his peers
in childhood for having a photographic memory, being able to memorize and recite back
a page out of a phone book in a few minutes. Science, history, and psychology were
among his many interests; he succeeded in every academic subject in school.
He published his first mathematical paper in collaboration with his tutor at the age of
eighteen, and resolved to study mathematics in college. He enrolled in the University of
Budapest in 1921, and over the next few years attended the University of Berlin and the
Swiss Federal Institute of Technology in Zurich as well. By 1926, he received his Ph.D.
in mathematics with minors in physics and chemistry.
2
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
3/55
Game Theory
By his mid-twenties, von Neumann was known as a young mathematical genius and his
fame had spread worldwide in the academic community. In 1929, he was offered a job at
Princeton. Upon marrying his fiancee, Mariette, Neumann moved to the U.S. (Agnostic
most of his life, Von Neumann accepted his wife's Catholic faith for the marriage,
though not taking it very seriously.)
In 1937, Mariette left Von Neumann for J. B. Kuper, a physicist. Within a year of his
divorce, Von Neumann began an affair with Klara Dan, his childhood sweetheart, who
was willing to leave her husband for him.
Von Neumann is commonly described as a practical joker and always the life of the
party. John and Klara held a party every week or so, creating a kind of salon at theirhouse. Von Neumann used his phenomenal memory to compile an immense library of
jokes which he used to liven up a conversation. Von Neumann loved games and toys,
which probably contributed in great part to his work in Game Theory.
Beginning in 1927, Von Neumann applied new mathematical methods to quantum
theory. His work was instrumental in subsequent "philosophical" interpretations of the
theory.
For Von Neumann, the inspiration for game theory was poker, a game he played
occasionally and not terribly well. Von Neumann realized
that poker was not guided by probability theory alone, as
an unfortunate player who would use only probability
theory would find out. Von Neumann wanted to formalize
the idea of "bluffing," a strategy that is meant to deceive
the other players and hide information from them.
In his 1928 article, "Theory of Parlor Games," Von Neumann first approached the
discussion of game theory, and proved the famous Minimax theorem. From the outset,
Von Neumann knew that game theory would prove invaluable to economists. He teamed
up with Oskar Morgenstern, an Austrian economist at Princeton, to develop his theory.
Their book, Theory of Games and Economic Behavior, revolutionized the field of
economics. Although the work itself was intended solely for economists, its applications
to psychology, sociology, politics, warfare, recreational games, and many other fields
soon became apparent.
3
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
4/55
Game Theory
Although Von Neumann appreciated Game Theory's applications to economics, he was
most interested in applying his methods to politics and warfare, perhaps stemming from
his favorite childhood game, Kriegspiel, a chess-like military simulation. He used his
methods to model the Cold War interaction between the U.S. and the USSR, viewing
them as two players in a zero-sum game.
From the very beginning of World War II, Von Neumann was confident of the Allies'
victory. He sketched out a mathematical model of the conflict from which he deduced
that the Allies would win, applying some of the methods of game theory to his
predictions.
In 1943, Von Neumann was invited to work on the Manhattan Project. Von Neumanndid crucial calculations on the implosion design of the atomic bomb, allowing for a more
efficient, and more deadly, weapon. Von Neumann's mathematical models were also
used to plan out the path the bombers carrying the bombs would take to minimize their
chances of being shot down. The mathematician helped select the location in Japan to
bomb. Among the potential targets he examined was Kyoto, Yokohama, and Kokura.
"Of all of Von Neumann's postwar work, his development of the digital computer looms
the largest today." After examining the Army's ENIAC during the war, Von Neumann
came up with ideas for a better computer, using his mathematical abilities to improve the
computer's logic design. Once the war had ended, the U.S. Navy and other sources
provided funds for Von Neumann's machine, which he claimed would be able to
accurately predict weather patterns.
Capable of 2,000 operations a second, the computer did not predict weather very well,
but became quite useful doing a set of calculations necessary for the design of thehydrogen bomb. Von Neumann is also credited with coming up with the idea of basing
computer calculations on binary numbers, having programs stored in computer's memory
in coded form as opposed to punchcards, and several other crucial developments. Von
Neumann's wife, Klara, became one of the first computer programmers.
Von Neumann later helped design the SAGE computer system designed to detect a
Soviet nuclear attack
In 1948, Von Neumann became a consultant for the RAND Corporation. RAND
(Research ANd Development) was founded by defense contractors and the Air Force as a
4
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
5/55
Game Theory
"think tank" to "think about the unthinkable." Their main focus was exploring the
possibilities of nuclear war and the possible strategies for such a possibility.
Von Neumann was, at the time, a strong supporter of "preventive war." Confident even
during World War II that the Russian spy network had obtained many of the details of
the atom bomb design, Von Neumann knew that it was only a matter of time before the
Soviet Union became a nuclear power. He predicted that were Russia allowed to build a
nuclear arsenal, a war against the U.S. would be inevitable. He therefore recommended
that the U.S. launch a nuclear strike at Moscow, destroying its enemy and becoming a
dominant world power, so as to avoid a more destructive nuclear war later on. "With the
Russians it is not a question of whether but of when," he would say. An oft-quoted
remark of his is, "If you say why not bomb them tomorrow, I say why not today? If you
say today at 5 o'clock, I say why not one o'clock?"
Just a few years after "preventive war" was first advocated, it became an impossibility.
By 1953, the Soviets had 300-400 warheads, meaning that any nuclear strike would be
effectively retaliated.
In 1954, Von Neumann was appointed to the Atomic Energy Commission. A year later,
he was diagnosed with bone cancer. The disease resulted from the radiation Von
Neumann received as a witness to the atomic tests on Bikini atoll.
Von Neumann maintained a busy schedule throughout his sickness, even when he
became confined to a wheelchair. It has been claimed by some that the wheelchair-bound
mathematician was the inspiration for the character of Dr. Strangelove in the 1963 film
Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb.
Von Neumann's last public appearance was in February 1956, when President
Eisenhower presented him with the Medal of Freedom at the White House. In April, Von
Neumann checked into Walter Reed Hospital. He set up office in his room, and
constantly received visitors from the Air Force and the Secretary of Defense office, still
performing his duties as a consultant to many top political figures.
John von Neumann died on February 8, 1957.
His wife, Klara von Neumann, committed suicide six years later.
5
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
6/55
Game Theory
Dr. Marina von Neumann Whitman, John's daughter from his first marriage, was invited
by President Nixon to become the first woman to serve on the council of economic
advisers.
Concepts in Game Theory
Game
A conflict in interest among n individuals or groups (players). There exists a set of rules
that define the terms of exchange of information and pieces, the conditions under which
the game begins, and the possible legal exchanges in particular conditions. The entirety
of the game is defined by all the moves to that point, leading to an outcome.
Move
The way in which the game progresses between states
through exchange of information and pieces. Moves are
defined by the rules of the game and can be made in
either alternating fashion, occur simultaneously for all
players, or continuously for a single player until he
reaches a certain state or declines to move further. Moves
may be choice or by chance. For example, choosing a card from a deck or rolling a die is
a chance move with known probabilities. On the other hand, asking for cards in
blackjack is a choice move.
Information
A state of perfect information is when all moves are known to all players in a game.
Games without chance elements like chess are games of perfect information, while
games with chance involved like blackjack are games of imperfect information.
Strategy
A strategy is the set of best choices for a player for an entire game. It is an overlying plan
that cannot be upset by occurrences in the game itself.
Payoff
6
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
7/55
Game Theory
The payoff or outcome is the state of the game at it's conclusion. In games such as chess,
payoff is defined as win or a loss. In other situations the payoff may be material (i.e.
money) or a ranking as in a game with many players.
Extensive and Normal Form
Games can be characterized as extensive or normal. A in extensive form game is
characterized by a rules that dictate all possible moves in a state. It may indicate which
player can move at which times, the payoffs of each chance determination, and the
conditions of the final payoffs of the game to each player. Each player can be said to
have a set of preferred moves based on eventual goals and the attempt to gain the
maximum payoff, and the extensive form of a game lists all such preference patterns for
all players. Games involving some level of determination are examples of extensive form
games.
The normal form of a game is a game where computations can be carried out completely.
This stems from the fact that even the simplest extensive form game has an enormous
number of strategies, making preference lists are difficult to compute. More complicated
games such as chess have more possible strategies that there are molecules in the
universe. A normal form game already has a complete list of all possible combinations of
strategies and payoffs, thus removing the element of player choices. In short, in a normal
form game, the best move is always known.
7
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
8/55
Game Theory
Types of Games
One-Person Games
A one-person games has no real conflict of interest. Only the interest of the player in
achieving a particular state of the game exists. Single-person games are not interesting
from a game-theory perspective because there is no adversary making conscious choices
that the player must deal with. However, they can be interesting from a probabilistic
point of view in terms of their internal complexity.
Zero-Sum Games
In a zero-sum game the sum of the total possible payoffs at the end is zero since the
amounts won or lost are equal. Von Neumann and Oskar Morgenstern demonstrated
mathematically that n-person non-zero-sum game can be reduced to an n + 1 zero-sum
game, and that such n + 1 person games can be generalized from the special case of the
two-person zero-sum game. Another important theorem by Von Neumann, the minimax
theorem, states certain aspects of the maximal and minimal strategies of are part of all
two-person zero-sum games. Thanks to these discoveries, such games are a major part of
game theory.
Two-Person Games
Two-person games are the largest category of familiar
games. A more complicated game derived from 2-person
games is the n-person game. These games are
extensively analyzed by game theorists. However, in
extending these theories to n-person games a difficulty
arises in predicting the interaction possible among
players since opportunities arise for cooperation and
collusion.
8
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
9/55
Game Theory
Zero-Sum Games
A zero-sum game is one in which no wealth is created or destroyed. So, in a two-player
zero-sum game, whatever one player wins, the other loses. Therefore, the players share
no common interests. There are two general types of zero-sum games: those with perfect
information and those without.
In a game with perfect information, every player knows the results of all previous moves.
Such games include chess, tic-tac-toe, and Nim. In games of perfect information, there is
at least one "best" way to play for each player. This best strategy does not necessarily
allow him to win but will minimize his losses. For instance, in tic-tac-toe, there is a
strategy that will allow you to never lose, but there is no strategy that will allow you to
always win. Even though there is an optimal strategy, it is not always possible for players
to find it. For instance, chess is a zero-sum game with perfect information, but the
number of possible strategies is so large that it is not possible for our computers to
determine the best strategy.
In games with imperfect information, the players do not know all of the previous moves.
Often, this occurs because the players play simulataneously. Here are some examples of
such games:
Game 1
Suppose two people are playing a simple game with nickels and quarters. At the same
time, they each put out either a nickel or a quarter. If at least one player plays a nickel,
player 1 gets both coins. Otherwise, player 2 gets both. Naturally, both players wish to
gain as much money as possible. How should they play in order to do this?
Game 2
Suppose two people are playing a similar game with nickels and quarters. Now, if player
1 plays a nickel, player 2 gives him 5 cents. If player 2 plays a nickel and player 1 plays
a quarter, player 1 gets 25 cents. If both players play quarters, player 2 gets 25 cents.
Game 39
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
10/55
Game Theory
We still have two people playing a game with nickels and quarters. Now, if both players
play the same coin, player 2 gives player 1 the average value of the coins; otherwise,
player 1 gives player 2 the average value of the coins.
Although the three games seem similar, the methods used to find the best strategies in
each are very different. Game 1 is solved by eliminating dominant strategies, game 2's
solution is known as a saddle point, and game 3 requires a mixed strategy.
Game 1 Dominant Strategies
Suppose two people are playing a simple game with nickels and quarters. At the same
time, they each put out either a nickel or a quarter. If at least one player plays a nickel,
player 1 gets both coins. Otherwise, player 2 gets both. Naturally, both players wish to
gain as much money as possible. How should they play in order to do this?
We can assign payoff matrices to such games that define the payoffs that players will get
based on the strategies they use. In this example, each player has only two strategies--put
out a nickel or put out a quarter. Here is a payoff matrix for player 1:
Player 2Nickel Quarter
Player 1Nickel 5 25
Quarter 5 -25
The rows represent player 1's possible strategies, and the columns represent player 2's
possible strategies. If player 1 and player 2 both play nickels (the top left entry), player 1
wins player 2's nickel so gains 5 cents. On the other hand, if both play quarters (the
bottom right entry), player 2 wins player 1's quarter, so player 1 loses 25 cents.
Notice that every entry in the first row is greater than all of the entries in that column. In
other words, playing a nickel is always at least as good as playing a quarter for player 1.
So, playing a nickel is called a dominant strategy, and it dominates the strategy of
playing a quarter. It is never advantageous to play a dominated strategy, so we can
reduce our payoff matrix to reflect this:
Player 2
10
http://cse.stanford.edu/classes/sophomore-college/projects-98/game-theory/game1.htmlhttp://cse.stanford.edu/classes/sophomore-college/projects-98/game-theory/game2.htmlhttp://cse.stanford.edu/classes/sophomore-college/projects-98/game-theory/game3.htmlhttp://cse.stanford.edu/classes/sophomore-college/projects-98/game-theory/game2.htmlhttp://cse.stanford.edu/classes/sophomore-college/projects-98/game-theory/game3.htmlhttp://cse.stanford.edu/classes/sophomore-college/projects-98/game-theory/game1.html -
8/6/2019 19074879 100 Mark Project Game Theory (1)
11/55
Game Theory
Nickel Quarter
Player 1Nickel 5 25
Now, the nickel strategy for player 2 also dominates. So, playing nickels is the beststrategy for both players. Notice that, if either plays quarters, he will not gain more
money than if he had just played nickels.
Game 2 Saddle Points
This game differs from game 1 in that it has no dominant strategies. The rules are as
follows: If player 1 plays a nickel, player 2 gives him 5 cents. If player 2 plays a nickel
and player 1 plays a quarter, player 1 gets 25 cents. If both players play quarters, player 2
gets 25 cents. We get a payoff matrix for this game:
Player 2Nickel Quarter
Player 1Nickel 5 25
Quarter 25 -25
Notice that there are no longer any dominant strategies. To solve this game, we need a
more sophisticated approach. First, we can define lower and upper values of a game.
These specify the least and most (on average) that a player can expect to win in the game
if both player play rationally. To find the lower value of the game, first look at the
minimum of the entries in each row. In our example, the first row has minimum value 5
and the second has minimum -25. The lower value of the game is the maximum of these
numbers, or 5. In other words, player 1 expects to win at least an average of 5 cents per
game. To find the upper value of the game, do the opposite. Look at the maximum of
every column. In this case, these values are 25 and 5. The upper value of the game is the
minimum of these numbers, or 5. So, on average, player 1 should win at most 5 cents per
game.
Nickel Quarter Min
Nickel 5 5 5Quarter 25 -25 -25
Max 25 5
11
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
12/55
Game Theory
Notice that, in our example, the upper and lower values of the game are the same. This is
not always true; however, when it is, we just call this number the pure value of the game.
The row with value 5 and the column with value 5 intersect in the top right entry of the
payoff matrix. This entry is called the saddle point or minimax of the game and is both
the smallest in its row and the largest in its column. The row and column that the saddle
point belongs to are the best strategies for the players. So, in this example, player 1
should always play a nickel while player 2 should always play a quarter.
Game 3 Mixed Strategies
The rules of game 3 were as follows: two players have nickels and quarters. At the same
time, they each play one coin. If both players play the same coin, player 2 gives player 1
the average value of the coins; otherwise, player 1 gives player 2 the average value of the
coins. Here is the payoff matrix for this game:
Player 2Nickel Quarter
Player 1Nickel 5 -15
Quarter -15 25
The lower value of this game is -15 while the upper value is 5. Can we find a pure value
for the game? According to the Minimax Theorem, one of the most important results in
game theory, we can. The Minimax Theorem states that every finite, two-person, zero-
sum game has a value Vthat is the average amount that one player can expect to win if
both players act sensibly.
Suppose player 2 knows which coin player 1 will play on each turn. Then it will be easy
for player 2 to play a coin that makes player 2 lose money. Therefore, player 1 can't play
with a pattern. Instead, he must use a mixed strategy, in which he randomly chooses to
play a nickel or quarter on each turn. However, it is not necessarily true that he should
play each strategy half the time. He may want to weight the strategies differently,
playing one with probabilityp and the other with probability 1 - p. How do we figure out
p?
It turns out that one property of the value of a game is that, if player 1 plays his optimal
strategy, he will achieve exactly the value of the game no matter what the other player
does (as long as the other player has no dominant strategies). In particular, the yield
12
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
13/55
Game Theory
when player 1 plays agains player 2's two different pure strategies should be the same. In
other words, if player 1 uses his optimal strategy, he will get the same amount of money
whether player 2 always plays nickels or always plays quarters. Let's suppose that player
2 always plays nickels. Player 1 plays nickelsp of the time so gains 5 centsp of the time.
The other 1 -p of the time, he loses 15 cents. Overall, he wins 5p - 15(1 -p) = 20p - 15.
Now, suppose player 2 always plays quarters. Player 1 plays nickels p of the time so
loses 15 centsp of the time. The rest of the time, he wins 25 cents. Overall, he wins -15p
+ 25(1 - p) = 25 - 40p. Because he should win the same in both situations, the two
winnings are the same. So, 20p - 15 = 25 - 40p. Solving forp, we find that it is 2/3. To
find the amount that player 1 expects to win, we just plug this back into either of the
equations and find that he should win an average of -5/3 per game. Even if player 2
figures out this strategy, he cannot do anything to change it.
Similarly, we can look at the payoff matrix from player 2's point of view and find a
mixed strategy for player 2. If we do so, we find that player 2 should play nickels 2/3 of
the time and quarters 1/3 of the time. If he does so, he should win an average of 5/3 cents
per game.
13
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
14/55
Game Theory
Strategies of Play
The Minimax algorithm is the most well-known strategy of play of two-player, zero-sum
games. The minimax theorem was proven by John von Neumann in 1928. Minimax is a
strategy of always minimizing the maximum possible loss which can result from a choice
that a player makes. Before we examine minimax, though, let's look at some of the other
possible algorithms.
Maximax
Maximax principle counsels the player to choose the strategy that yields the best of the
best possible outcomes. For example, let's consider a zero-sum game where two players
simultaneously put either a blue or a red card on the table. If player 1 puts a red card
down on the table, whichever card player 2 puts down, no one wins anything. If player 1
puts a blue card on the table and player 2 puts a red card, then player 2 wins $1,000 from
player 1. Finally, if player 1 puts a blue card on the table and player 2 puts a blue card
down, then player 1 wins $1,000 from player 2.
The payoff matrix for player 1 is shown in this table:
Player 2Blue Red
Player 1Blue 1,000 -1,000
Red 0 0
Going by maximax principle, player 1 will always play the blue card, since it has the
maximum possible payoff of 1,000. But as can be clearly seen from the table, assuming
player 2 is rational, he will never play the blue card, since the red card gives him the
dominant strategy. In such a case, if player 1 plays by the maximax rule, he will always
lose.
The maximax principle is inherently irrational, as it does not take into account the other
player's possible choices. Maximax is often adopted by naive decision-makers such as
young children.
14
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
15/55
Game Theory
Maximin
Maximin is solely a one-person game strategy, i.e. a principle which may be used when a
person's "competition" is nature, or chance. Whereas the maximax principle is ultra-
optimistic, expecting the best possible payoff, the maximin is ultra-pessimistic, expectingthe worst possible payoff. It involves choosing the best of the worst possible outcomes.
A simple example of a slot machine game may be used. A player has only two choices to
make -- to gamble or not to gamble. If he gambles, he risks losing his bet (say, $1), but
also has a chance to win the maximum payoff (say, $10,000). If he does not gamble, he
can neither win nor lose.
The payoff matrix looks like this:
ChanceWin Lose
PlayerGamble
1000
0-1
Not Gamble 0 0
According to the maximin principle, the player should never gamble, because he faces a
risk of losing $1. It is clear that the maximin principle is quite inefficient, since it
discourages taking any risks, no matter how high the reward may be.
Minimax for One-Person Games
The Minimax Regret Principle is based on the Minimax Theorem advanced by John von
Neumann, but is geared only towards one-person games. It relies on the concept of regret
matrices. To demonstrate, consider an example of a company trying to decide whether or
not it should support a research project. Let's assume that the research project costs c
units. If it succeeds, it will bring in a return of r units. If it fails, it will obviously not
bring in anything.
The payoff matrix for the company looks like this:
15
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
16/55
Game Theory
ResearchSucceeds Fails
CompanySupports Research R - C - C
Neglect Research 0 0
By the maximax principle, a company should always support research if the expected
return on it is greater than its cost. By maximin, the company should never support
research, since it is risking the cost of the research. Minimax is slightly more
complicated.
We need to come up with a matrix that shows the "opportunity cost," or regret, of the
player, depending on each possible decision. For example, if the company supports theresearch and it fails, the company's regret will be c, the price of research. If the company
supports the research and it succeeds, the company will have no regrets. If the company
neglects research and it would have succeeded, its regret value is r-c, the return on the
research. So, the minimax regret matrix will look like this:
Research
Succeeds FailsCompany
Supports Research 0 C
Neglect Research R - C 0
The object is to minimize the maximum possible regret. It is not obvious from the above
matrix what the maximum value is. That is, is c greater than r-c? If (r-c) > c, the
company should support research. If (r-c) < c, the company should not. In other words,
the company should support research if c < r/2, or, if the expected return on research is
more than twice its cost.
Minimax for Two-Person Games
In a two-person, zero-sum game, a person can win only if the other player loses. No
cooperation is possible. Andrew Colman's Game Theory and Experimental Games shows
the following historical example:
16
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
17/55
Game Theory
In 1943, the Allied forces received reports that a Japanese convoy would be heading by
sea to reinforce their troops. The convoy could take on of two routes the Northern or
the Southern route. The Allies had to decide where to disperse their reconnaissance
aircraft - in the north or the south - in order to spot the convoy as early as possible. The
following payoff matrix shows the possible decisions made by the Japanese and the
Allies, with the outcomes expressed in the number of days of bombing the Allies could
achieve with each possibility:
Japanese RouteNorth South
Allies ReconnaissanceNorth 2 2
South 1 3
By this matrix, if the Japanese chose to take the southern route while the Allies decided
to focus their recon planes in the north, the convoy would be bombed for 2 days. The
best outcome for the Allies would be if they placed their airplanes in the south and the
Japanese took the southern route. The best outcome for the Japanese would be to take the
northern route, provided the Allies were looking for them in the south.
To minimize the worst possible outcome, the Allies would have to choose the north as
the focus of their reconnaisance efforts. This ensures them 2 days of bombing, whereas
they risk only 1 day of bombing if they focus on the south. Therefore, by minimax, the
best strategy for them would be to focus on the north.
The Japanese can use the same strategy. The worst possible outcome for them is the 3
days of bombing which might occur if they took the southern route. Therefore, the
Japanese would take the northern route.
It is, in fact, what had occurred: both the Allies and the Japanese chose the north, and the
Japanese convoy was bombed for 2 days.
The previous matrix had a saddle point, meaning that both the Japanese and the Allies
settled on the (North, North) square as the best outcome for both of them. Neither could
do any better if the opponent was rational. In this case, the maximin and the minimax
17
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
18/55
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
19/55
Game Theory
computer is programmed with a strong strategy, it becomes predictable and easy to take
advantage of.
On the other hand, the computer might very well benefit if it recognizes a predictable
strategy on the part of an opponent. Even in such a simple game as "Matching Pennies,"
where a 50/50 is called for, while the computer may follow a 50% algorithm for deciding
whether to play heads or tails, a human cannot come up with completely random
numbers. In fact, it has been observed that humans tend to play heads slightly more
often. If a computer recognizes that the probability of its opponent of picking heads is
slightly higher, it may adjust its own strategy to have an advantage.
19
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
20/55
Game Theory
Non-Zero-Sum Games
The theory of zero-sum games is vastly different from that of non-zero-sum games
because an optimal solution can always be found. However, this hardly represents the
conflicts faced in the everyday world. Problems in the real world do not usually have
straightforward results. The branch of Game Theory that better represents the dynamics
of the world we live in is called the theory of non-zero-sum games. Non-zero-sum games
differ from zero-sum games in that there is no universally accepted solution. That is,
there is no single optimal strategy that is preferable to all others, nor is there a
predictable outcome. Non-zero-sum games are also non-strictly competitive, as opposed
to the completely competitive zero-sum games, because such games generally have both
competitive and cooperative elements. Players engaged in a non-zero sum conflict have
some complementary interests and some interests that are completely opposed.
A Typical Example
The Battle of the Sexes is a simple example of a typical non-zero-sum game. In this
example a man and his wife want to go out for the evening. They have decided to go
either to a ballet or to a boxing match. Both prefer to go together rather than going alone.
While the man prefers to go to the boxing match, he would prefer to go with his wife to
the ballet rather than go to the fight alone. Similarly, the wife would prefer to go to the
ballet, but she too would rather go to the fight with her husband than go to the ballet
alone. The matrix representing the game is given below:
Husband
Boxing Match Ballet
WifeBoxing Match 2, 3 1, 1
Ballet 1, 1 3, 2
The wife's payoff matrix is represented by the first element of the ordered pair while the
husband's payoff matrix is represented by the second of the ordered pair.
From the matrix above, it can be seen that the situation represents a non-zero-sum, non-
strictly competitive conflict. The common interest between the husband and wife is that
they would both prefer to be together than to go to the events separately. However, the
20
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
21/55
Game Theory
opposing interests is that the wife prefers to go to the ballet while her husband prefers to
go to the boxing match.
Analyzing a Non-Zero-Sum Game Communication
It is conventional belief that the ability to communicate could never work to a player's
disadvantage since a player can always refuse to exercise his right to communicate.
However, refusing to communicate is different from an inability to communicate. The
inability to communicate may work to a player's advantage in many cases.
An experiment performed by Luce and Raiffa compares what happens when player can
communicate and when players cannot communicate. Luce and Raiffa devised the
following game:
A B
A 1, 2 3,1
B 0, -200 2, -300
If Susan and Bob cannot communicate, then there is no possibility of threats being made.
So, Susan can do no better than to play strategy A and Bob can do no better than to play
strategy a. Susan, therefore gains 1 and Bob gains 2. However, when communication is
allowed, the situation is complicated. Susan can threaten Bob by saying that she will play
strategy B unless Bob commits himself to playing strategy b. If Bob submits, Susan will
gain 2 and Bob will lose 1 (as opposed to Susan gaining 1 and Bob gaining 2 when
communication is not allowed).
Restricting Alternatives
The Battle of the Sexes example given above seems to be an unsolvable dilemma.
However, this problem can be solved it either the husband or the wife resticts the others'
alternatives. For example, if the wife buys two tickets for the ballet, indicating that she is
definitely not going to the boxing match, the husband would have to go to the ballet
along with his wife in order to maximize his self-interest. Because the wife bought the
two tickets, the husbands optimal payoff, now, would be to go along with his wife. If he
were to go to the boxing match alone, he would not be maximizing his self-interest.
21
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
22/55
Game Theory
The Number of Times the Game is "Played"
If the game is played only once, players do not have to fear retaliation from their
opponents, so they may play differently than they would in a game played repeatedly.
Typical Non-Zero-Sum Games:
Prisoner's Dilemma
Chicken and Volunteer's Dilemma
Deadlock and Stag Hunt
The Prisoners Dilemma
The Prisoner's Dilemma game was first proposed by Merrill Flood in 1951. It was
formalized and defined by Albert W. Tucker. The name refers to the following
hypothetical situation:
Two criminals are captured by the police. The police suspect that they are responsible for
a murder, but do not have enough evidence to prove it in court, though they are able to
convict them of a lesser charge (carrying a concealed
weapon, for example). The prisoners are put in separate
cells with no way to communicate with one another and
each is offered to confess.
If neither prisoner confesses, both will be convicted of the lesser offense and sentenced
to a year in prison. If both confess to murder, both will be sentenced to 5 years. If,
however, one prisoner confesses while the other does not, then the prisoner who
confessed will be granted immunity while the prisoner who did not confess will go to jail
for 20 years.
What should each prisoner do?
Discussion of Prisoners Dilemma
22
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
23/55
Game Theory
To help us determine the answer, let's come up with a payoff matrix for each prisoner.
The value in each cell is the time spent in prison, so the prisoners will try to end up in the
matrix cell with the lowest number. The first number of each pair refers to the prison
time of prisoner 1, and the second number to prisoner 2.
Prisoner 2
Confess Not Confess
Prisoner 1Confess 5, 5 0, 20
Not Confess 20, 0 1, 1
Let's assume the role of prisoner 1. We're looking to minimize our prison time. Since we
have no way of knowing whether our partner in crime has confessed, let's first assumethat he has not. If Prisoner 1 doesn't confess either, both will go to prison for 1 year. Not
bad. But, if Prisoner 1 confesses, he will go free, while his partner rots away in jail. We'll
assume that there is no "honor among thieves" and each prisoner only cares about
minimizing his jail time. From the above discussion, it is obvious that if Prisoner 2 does
not confess, Prisoner 1 is definitely better off confessing.
Now let's look at the other possibility. Say prisoner 2 confesses. If Prisoner 1 does notconfess, he will go to jail for 20 years. But if he does confess, he will get only 5 years in
prison. It is clearly better to confess in this case as well.
So is that it? Is the problem solved? Is each prisoner better off confessing? Well, it may
seem so from the above discussion, but if we look at the payoff matrix, it is clear that the
best payoff for both prisoners is when neither confesses! But game theory advocates that
both confess.
This "game" can be generalized to any situation when two players are in a non-
cooperative situation where the best all-around situation is for both to cooperate, but the
worst individual outcome is to be cooperating player while the other player defects.
On the one hand, it is tempting to defect, or confess. Since you have no way of
influencing the other player's decision, no matter what he does, you're better off
confessing. But on the other hand, you're both in the same boat. Both of you should be
sensible enough to realize that cheating undermines the common good.
23
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
24/55
Game Theory
There is no single "right" solution to the Prisoner's Dilemma (that's why it's a dilemma).
Its implications carry into psychology, economics, and many other fields.
The Flood Dresher Experiment
Many experiments have been done on the Prisoner's Dilemma, to try to gauge the normal
human behavior in a prisoner's dilemma-type situation.
The Flood-Dresher Experiment was a prisoner's dilemma game run 100 times between 2
players. In this case, the game was unfair - one of the players had an inherent advantage
over the other player, but the payoff matrix layout was still a prisoner's dilemma. The
following is the table used in the experiment. (In this case, the payoffs are positive, that
is, each rational player seeks to maximize the value in the matrix cell he ends up in.)
Prisoner 2
Defect Cooperate
Prisoner 1Cooperate -1, 2 0.5, 1
Defect 0, 0.5 1, -1
As in the Prisoner's Dilemma, both players are better off defecting. But when both
defect, they do relatively poorly. On the other hand, if both choose their "worse" strategy
consistently, they should both gain.
In the 100 trials, Player 1 chose to cooperate 68 times, and Player 2 78 times. Player 1
began the game expecting both players to defect. Player 2 realized the value of
cooperation and started cooperating. Both players started cooperating after the first 10 or
so moves, though Player 1 would defect on a regular basis, unhappy that his payoff
wasn't as big as Player 2's. This in turn brought retaliation from Player 2, who would
defect on the next move.
Each player kept a log of comments for each move. Some of those comments are quite
amusing. "The stinker," writes Player 2 after Player 1's defection. "He's a shady
character... A shiftless individual--opportunist, knave... He can't stand success."
24
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
25/55
Game Theory
The players' comments reflect their concern about the final few moves. Both seem to
realize that it would make sense for both to defect on move 100, since no retaliation from
the other player is possible. Player 1 worries about starting to defect earlier than Player 2
so that he has the advantage. As the game was played, both players cooperated on moves
83 through 98. On move 99, Player 1 defected, and on move 100, both defected.
It is clear that the long-term prospect of the game encouraged cooperation. Since the
game was played multiple times, it became beneficial for both players to cooperate. On
move 100, however, the game suddenly becomes a regular prisoner's dilemma, and both
players defect, as game theory advocates they should (although if they both cooperated
they would ensure themselves a gain of 0.5 points).
This reasoning is troubling though. Since both players must realize that they will both
defect on move 100, move 100 does not have to figure into the game. It can then be
thought that move 99 is really the last move in the game, since both players are
obviously going to defect on move 100. But if move 99 is the last move, both players
should defect, since no retaliation is possible (both players will defect anyway on move
100, no matter what the other player did on move 99). So both players should defect on
move 99 as well. Then, move 98 can be thought of as the last move in the game. This
line of reasoning can be extended indefinitely until move 1. So should both players
always defect?
Clearly not, since if they both cooperate, they will gain more than if both defect.
This paradox is still unresolved. William Poundstone, in Prisoner's Dilemma, says that
"Both Flood and Dresher say they initially hoped that someone would 'resolve' the
prisoner's dilemma. They expected someone to come up with a new and better theory of
non-zero-sum games. The solution never came. The prisoner's dilemma remains a
negative result - A demonstration of what's wrong with theory, and indeed, the world."
Axelrods Tournament
In 1980, Robert Axelrod, professor of political science at the University of Michigan,
held a tournament of various strategies for the prisoner's dilemma. He invited a number
25
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
26/55
Game Theory
of well-known game theorists to submit strategies to be run by computers. In the
tournament, programs played games against each other and themselves repeatedly. Each
strategy specified whether to cooperate or defect based on the previous moves of both
the strategy and its opponent.
Some of the strategies submitted were:
Always defect: This strategy defects on every turn. This is what game
theory advocates. It is the safest strategy since it cannot be taken advantage of.
However, it misses the chance to gain larger payoffs by cooperating with an
opponent who is ready to cooperate.
Always cooperate: This strategy does very well when matched against
itself. However, if the opponent chooses to defect, then this strategy will do
badly.
Random:The strategy cooperates 50% of the time.
All of these strategies are prescribed in advance. Therefore, they cannot take advantage
of knowing the opponent's previous moves and figuring out its strategy.
The winner of Axelrod's tournament was the TIT FOR TAT strategy. The strategy
cooperates on the first move, and then does whatever its opponent has done on the
previous move. Thus, when matched against the all-defect strategy, TIT FOR TAT
strategy always defects after the first move. When matched against the all-cooperate
strategy, TIT FOR TAT always cooperates. This strategy has the benefit of both
cooperating with a friendly opponent, getting the full benefits of cooperation, and of
defecting when matched against an opponent who defects. When matched against itself,
the TIT FOR TAT strategy always cooperates.
Several variations to TIT FOR TAT have been proposed. TIT FOR TWO TATS is a
forgiving strategy that defects only when the opponent has defected twice in a row. TWO
TITS FOR TAT, on the other hand, is a strategy that punishes every defection with two
of its own.
26
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
27/55
Game Theory
TIT FOR TAT relies on the assumption that its opponent is trying to maximize his score.
When paired with a mindless strategy like RANDOM, TIT FOR TAT sinks to its
opponent's level. For that reason, TIT FOR TAT cannot be called a "best" strategy.
It must be realized that there really is no "best" strategy for prisoner's dilemma. Each
individual strategy will work best when matched against a "worse" strategy. In order to
win, a player must figure out his opponent's strategy and then pick a strategy that is best
suited for the situation.
Multi-Person Prisoners Dilemma
The n-person prisoner's dilemma (NPD) is basically the Prisoner's Dilemma with more
than two players. The NPD emerged during the early 1970's and quickly became popular
among social theorists and economists. The sudden interest in NPD occurred mainly
because of the economic and social developments during the late 60s and early 70s. At
this time, problems such as inflation, voluntary wage restraint, the energy crisis, and
environmental pollution were pressing issues. This era of history, however, is better
known for the increasing international tension between the U.S. and the Soviet Union.
Both superpowers were engaged in mass production of nuclear weapons, creating a very
real threat to the existence of the entire world. With the proliferation of nuclear weapons
came the issue of multilateral disarmament. The various social, political, and economic
tensions of the 70's can all be modeled by the NPD, indicating the remarkable range of
real-world problems that NPDs can simulate.
Many real-world problems, be they social, political, or economic, can be modeled as an
NPD. In economics, an interesting example concerns the "invisible hand theory" and
how it applies to the labor market.
In 1776, economist Adam Smith introduced the theory of the "invisible hand" which still
remains the cornerstone of traditional economics. The "invisible hand", in short, is what
dictates the motion of the economy. It is not a single individual or government that
controls its motion, but is instead motivated by every person who participates in the
economy.
27
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
28/55
Game Theory
In the labor market, companies hiring workers are consumers and those looking for jobs
are the producers. That is, job seekers have a product to sell, namely their skills and the
companies want to buy their labor. The "invisible hand" which dictates the labor market
decides the wages that companies will pay.
An example of how and NPD can be used to model the labor market is as follows: Every
trade union's individual self-interest is to negotiate wages that exceed the rate of inflation
in the economy as a whole. However, if all trade unions negotiated wages to benefit their
own self-interest, the prices of goods and services go up and everyone is worse off than
if they had all exercised restraint.
In order to solve this problem, the British Labour Party issued a Manifesto (1974) which
contained an outline of a "social contract" whose aim was to encourage trade unions to
exercise voluntary wage restraint in order to decrease the rate of inflation. The "social
contract" was designed to encourage collective rationality in wage bargaining over
individual rationality. However, this solution was unsuccessful because it did not change
the underlying strategic structure of the wage bargaining game.
Another type of NPD that is readily evident in the real
world is that which simulates situations where
resources are scarce. For example, when there is a
shortage of any resource, such as water or energy,
there is usually a call for conservation. However, and
individual only benefits from restraint if everyone else restrains as well. However,
restraint of an individual is unnecessary. That is, if everyone else restrains then it would
make much of a difference if you didn't restrain. On the other-hand, if you restrain and
no one else does, then your attempt at conservation is futile. Therefore, it is everyone's
individual self-interest to NOT conserve. However, if everyone acts individualistically,
all are worse off.
One last interesting example of an NPD is called the tragedy of the commons. Suppose
there are six farmers who each owns one cow that weighs 1000 lbs. These six farmers
share one plot of grazing land, a plot that can maximally sustain six cows without
deterioration from overgrazing. For every additional cow that is added, the weight of
every animal decreases by 100 lbs. Suppose every farmer had the opportunity to add one
28
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
29/55
Game Theory
cow. If one farmer decides to add one cow, then his wealth increases since he will now
have two cows that weigh 900 lbs each instead of just one cow that weighs 1000 lbs.
Each of the six farmers, if they act in their own self-interest, will also add another cow.
However, if all six farmers do add another cow, then each farmer ends up worse off. That
is, each farmer will have two cows that weighs 400 lbs each instead of one cow that
weighs 1000 lbs.
Small farmers in England during the period of the enclosures in the eighteenth century
became impoverished because of this NPD situation.
All multi-person prisoner's dilemmas share a common underlying strategic structure.
Therefore, any game that satisfies the following criteria is an NPD by definition:
each player has two options: cooperate or defect
defecting is the dominant strategy for each player (i.e. each player is better off
choosing to defect than to cooperate no matter how many other players choose to
cooperate)
the dominant strategies (to defect) intersect at a deficient equilibrium point (if all
players choose to defect, the outcome is worse than if each player had chosen
non-dominant strategies (to cooperate))
Chicken
There is a game called Chicken, in which two people drive two very fast cars towards
each other from opposite ends of a long straight road. If one of them swerves before the
other, he is called a chicken. Of course, if neither swerves, they will crash. The worst
possible payoff is to crash into each other, so we assign this a value 0. The best payoff is
to have your opponent be the chicken, so we assign this a value 3. The next to worst
possibility is to be the chicken, so we assign this a value 1. The last possibility is that
both drivers swerve. Then, neither has less honor than the other, so this is preferable to
being the chicken. However, it is not quite as good as being the victor, so we assign it a
value 2. We could assign a payoff matrix to this:
Swerve Drive Straight
Swerve 2, 2 1, 3
29
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
30/55
Game Theory
Drive Straight 3, 1 0, 0
Unlike the prisoner's dilemma, mutual defection is the worst outcome in chicken. Both
players want to do the opposite of what the other player does.
Volunteers Dilemma
The version of chicken with more than two players is
known as the volunteer's dilemma. In a volunteer's
dilemma, one player needs to take an action that will
benefit all of the players. For instance, suppose James
Bond, Paris Carver, and Wai Lin are locked in three
sound-proof cells by Elliot Carver. In one hour, Elliott
will release poison gas into their cells unless at least one of the three pushes a button.
Whoever pushes the button will be immediately killed, but the other two will be released
immediately. The three cannot communicate or coordinate their efforts.
If any of the three are to survive, one of them must sacrifice himself or herself. The least
disturbing case is when all three reach the same conclusion about who should be
sacrificed. In this case, the martyr will push the button, and the others will be spared. A
second possibility is that all players decide to save each other. Then there will be a race
to push the button first. The most disturbing case is when each player decides that he or
she should be saved. When this happens, none push the button and the clock ticks away.
Suppose that you are Bond, and Paris and Wai Lin have not pushed the button by the end
of 59 minutes. It seems that they have decided that you should sacrifice yourself, but you
don't want to do that. There is no point in vowing to never push the button because then
all three of you will die. Ideally, you want to push the button in the last possible second.
However, there is no way to determine exactly when this is, so the resolution of the
conflict rides on chance and reflexes.
Other Dilemmas
These dilemmas are examples of games in which both players share the same
preferences. These games are known as symmetric games. In these games, neither player
has a privileged position. In this sense, they can often model the real world.
30
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
31/55
Game Theory
Deadlock
The payoff matrix for deadlock looks something like this:
Cooperate Defect
Cooperate 1, 1 0, 3
Defect 3, 0 2, 2
Each player does better defecting no matter what his partner does. Unlike the prisoner's
dilemma though, it is better for them to both defect than to both cooperate. This is called
deadlock because the two players will decide not to cooperate. This situation sometimes
arises when two countries do not want to disarm so fail to reach arms controlagreements.
Stag Hunt
The philosopher Jean-Jacques Rousseau imagined a situation like this:
In early societies, people formed alliances to hunt deer. If even one person in the group
did not help with the hunt, the deer would be lost. The hunters were sometimes tempted
to leave the hunt by seeing rabbits, but they preferred deer to rabbit. However, only one
person was needed to catch a rabbit. From a game theory perspective, the best strategy is
to hunt the deer, but people may decide to hunt the rabbit because they believe others
may defect from the hunt also.
Countries face the same dilemma in situations involving nuclear weapons. Each country
generally believes that the world would be better if no countries possessed nuclear
weapons. However, the temptation to build up a nuclear arsenal arises because each
country is afraid that other countries may stash nuclear warheads and undermine
international security.
31
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
32/55
Game Theory
Applications of Game Theory
Though at first glance the idea of game theory sounds trivial, applications of game theory
are extensive. Von Neumann and Morgenstern originally applied their models of games
to economic analysis. Each factor in the market, such as seasonal preferences, buyer
choice, changes in supply and material costs, and other such market factors can be used
to describe strategies to maximize the outcome and thus the profit. However, game
theory can be also used to simply study economics of the past and interactions of
different factors in a matter. It can also be used to investigate matters such as monetary
distributions and their effects on other outcomes.
Military strategists have turned to game theory to play "war games." Usually, such
games are not zero-sum games, for loses to one side are not won by the other, and they
have been criticized as potentially dangerous oversimplification of necessarily factors.
Economic situations are also more complicated than zero-sum games, but those factors
only require readjustments to the strategy over time. Sociologists have taken an interest
in game theory, and have developed an entire branch dedicated to group decision
making. Immunization procedures and vaccine or other medication tests are analyzed by
epidemiologists using game theory.
The properties of n-person non-zero-sum games can be used to study different aspects of
social sciences as well. Matters such as distribution of power, interactions between
nations, the distribution of classes and their effects of government, and many other
matters can be easily investigated by breaking the problem down into smaller games,
each of whose outcomes affect the final result of a larger game.
32
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
33/55
Game Theory
Philosophy
Philosophers are increasingly becoming interested in Game Theory because it provides a
way of elucidating the logical difficulty of philosophers such as Hobbes, Rousseau, Kant
and other social and political theorists.
Rationality and the pursuit of self-interest:
According to Bertrand Russell Reason has a perfectly clear and precise meaning. It
signifies the choice of the right means to an end that you wish to achieve". This is the
interpretation of 'reason' that most contemporary philosophers favor. However, many
philosophers have pointed out situations where the concept of rationality seems to break
down. The situations are those who strategic structures resemble that of the Prisoner's
Dilemma.
An example of a multiple person Prisoner's Dilemma is as follows: Suppose that during a
drought, a person must decide whether he should act in his own self-interest and water
the garden or whether he should exercise restraint and conserve water. No matter what
the other community members do, a person is always better off watering his garden
because this is the right means to the end that he desires. The reasoning for this is that it
is unnecessary for one person to exercise restraint if the most other community members
are restraining as well. Even if the rest of the community doesn't exercise restraint, it is
futile for just one person to do so since one person does not have that big of an impact on
the whole water supply.
The paradox is that if the entire community reasons this way, the water supply will dry
up completely but if each community member cooperates and exercises restraint (actsirrationally) the water supply will be spared. Moral philosopher, Derek Parfit, believes
that cooperation, instead of being the irrational choice, can be a rational course of action.
Parfit has proposed several solutions to the Prisoner's dilemma so that cooperation
becomes the reasonable choice. One solution involves changing the entire structure of
the game so that it is no longer a Prisoner's Dilemma. To do this, the payoff functions of
each player should be changed in order to make it unprofitable for anyone to defect. In
the case of the example given above, the payoff functions of each individual would
change if there were a fine for watering the garden during a drought. Such a solution is
33
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
34/55
Game Theory
considered a "political" solution and oftentimes these sorts of solutions cannot be
implemented.
Parfit argues that an even better solution would be to find ways to make people cooperate
for purely moral reasons. Parfit proposes that the way to achieve such a "moral" solution
would be to educate society about the Prisoner's Dilemma and it's most desirable, though
irrational solution.
Kant's Categorical Imperative
Immanuel Kant's categorical imperative, which is intended to be a fundamental principle
of morality, states: "Act only on such a maxim through which you can at the same time
will that it should become a universal law." A maxim is just a personal rule of conduct
while the universal law is the conduct of all people. Kant's categorical imperative is
continually debated among moral philosophers because of its obscurity. Through the use
of Game Theory, Kant's views can be clarified. Kant's beliefs, when understood, offers a
moral solution to the Prisoner's Dilemma.
One of Kant's examples of categorical imperative is illustrated in the following maxim:"Always borrow money when in need and promise to pay it back without any intention
of keeping the promise." This maxim cannot possibly made into a universal law because
it cannot be made universal without creating a contradiction. That is, if this maxim was
made universal, then everyone would break promises and a promise would have no
meaning and therefore promises would cease to exist. Therefore, if this maxim were
made universal, a logical contradiction would follow.
In terms of Game Theory, Kant's categorical imperative can be restated as follows:
"Choose only a strategy which, if you could will it to be chosen by all the players, would
yield a better outcome from you point of view than any other". This statement, then,
becomes a solution to the Prisoner's Dilemma. That is, according to Kant's categorical
imperative, only a cooperative choice can result. This is because the personal choice of
defecting, if made universal, is in contradiction to one's personal interest (similar to the
above example).
34
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
35/55
Game Theory
Hobbes's and Rousseau's Social Contract
Through the use of Game Theory, Hobbes' argument, later made popular by Jean-
Jacques Rousseau, for absolute monarchy can be reconstructed. Hobbes argued that,
without some form of external constraint on people's behaviors, anarchy would ensue.
Cooperation among people would be impossible since people act only to maximize
individual welfare and not the welfare of society as a whole. Granted, there will exists
altruists (maybe even many of them) who constrain their self-interests for the good of
others. However, if even one self-interested person exists, he/she will exploit the
altruists' constraints, profiting from both his/her absence of constraint and the altruist's
unselfish behavior. As a result, Hobbes believes that it is psychologically unnatural for
altruists to exist. If just one narrowly self-interested person exists no altruist can survive
unless he/she becomes narrowly self-interested too. In such an environment, known as a
State of Nature, Hobbes argues that a person must always be suspicious that another will
attack in order to maximize his/her own self-interest. Therefore, in order for a person to
maximize his best interest, he must attack the other person before that other person can
attack. Each such conflict between two people in a state of nature has been termed as the
"Hobbesian Dilemma." However, in the field of Game Theory, the Hobbesian Dilemma
has the same structure as a "Prisoner's Dilemma."
Hobbes believed that the "Hobbesian Dilemma" results in a State of Nature because
morality is an unstable enforcer of social cooperation. According to Hobbes, a stable
enforcer can only exist if not one person can deviate from the established rule by which
the rest adhere to. Since cooperation among people is biologically necessary, a stable
enforcer must exist. Hobbes believes that the best form of social enforcement is the
existence of an all-powerful sovereign.
Resource Allocation and Networking
Computer network bandwidth can be viewed as a limited resource. The users on the
network compete for that resource. Their competition can be simulated using game
theory models. No centralized regulation of network usage is possible because of the
diverse ownership of network resources.
35
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
36/55
Game Theory
The problem is of ensuring the fair sharing of network resources. For example, ten
Stanford students on the same local network need access to the Internet. Each person, by
using their network connection, diminishes the quality of the connection for the other
users. This particular case is that of a volunteer's dilemma. That is, if one person abstains
from using the network, the other people will be better off, but that person will be worse
off.
If a centralized system could be developed which would govern the use of the shared
resources, each person would get an assigned network usage time or bandwidth, thereby
limiting each person's usage of network resources to his or her fair share.
As of yet, however, such a system remains an impossibility, making the situation of
sharing network resources a competitive game between the users of the network and
decreasing everyone's utility.
Biology
Although the natural world is often thought of as brutal, dog-eat-dog type, cooperation
exists between many different species. The reason behind this coexistence can be
modeled using game theory. For example, birds called ziczacs enter crocodiles' mouths
to eat parasites. This symbiosis allows crocodiles to achieve good oral hygiene and
allows the ziczacs to get a decent meal. But any crocodile can easily eat a ziczac (defect).
So why don't they? Apparently, over the eons of evolutionary action, the crocodiles and
the ziczacs have learned the benefits of cooperation, the "equilibrium point."
Of course, chances are that neither the crocodiles nor the ziczacs rationalize their
behavior with game theory. But their behavior can still be modeled using game theory
principles.
Artificial Intelligence
One of the marks that differentiate a human from a machine is the human's ability to
make independent decisions based on environmental stimuli. Most computer programs
that are required to make any sort of a decision are currently pre-programmed with the
lists of decisions based on a number of conditions. However, if those conditions are not
36
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
37/55
Game Theory
met in some way or are altered, computers have no way of making decisions they were
not programmed to make.
In the future, AI programs may be endowed with the ability to make new decisions
unplanned for by their creators. This would require the programs to be able to generate
new payoff matrices based on the observed stimuli and experience. A program that is
able to do that would be capable of learning and would, in a lot of ways, resemble the
human decision-making process.
Economics
Many of the interactions in the business world may be modeled using game theory
methodology. A famous example is that of the similarity of the price-setting of
oligopolies to the Prisoner's Dilemma. If an oligopoly situation exists, the companies are
able to set prices if they choose to cooperate with each other. If they cooperate, both are
able to set higher prices, leading to higher profits. However, if one company decides to
defect by lowering its price, it will get higher sales, and, consequently, bigger profits
than its competitor(s), who will receive lower profits. If both companies decide to defect,
i.e. lower prices, a price war will ensue, in which case neither company will profit, since
it will retain its market share and experience lower revenues at the same time.
Similar arguments can be extended to many cases like:
advertising for a companys products
international trade between nations and Trade Barriers
expenditure on National Defense for two neighbouring nations
This can better be explained with the help of the following example:
Suppose PepsiCo & Coca-Cola enter a new market and decide to form a cartel and fix
prices. Assuming that if they honour the cartel agreement, they both would land up with
revenues of $6 million each. But in case one of them cheats and the other does not, the
difference in revenues could be huge ($6 million in this case; between the companies).
37
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
38/55
Game Theory
The temptation to cheat is very high in this case and if both the companies cheat on the
agreement, then its a loss to both as can be very well seen from the following table.
PepsiCoCheat on Cartel
(Charge Low Price)
Dont Cheat
(Charge Monopoly Price)
Coca-
Cola
Cheat on
Cartel$3 million each
Coke earns $8 million
Pepsi earns $2 million
Dont
Cheat
Coke earns $2 million
Pepsi earns $8 million$6 million each
Prisoner's dilemma is not the only game theory model which can be used to model
economic situations. Other models can be applied to different situations and, in many
cases, can suggest the best outcome for all parties concerned.
38
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
39/55
Game Theory
Case Study: Limitations of Game
Theory
& the Theory of Moves
Theory of Moves
"We're eyeball to eyeball, and I think the other fellow just
blinked" were the eerie words of Secretary of State Dean Rusk at
the height of the Cuban missile crisis in October 1962. He was
referring to signals by the Soviet Union that it desired to defuse
the most dangerous nuclear confrontation ever to occur between
the superpowers, which many analysts have interpreted as a
classic instance of nuclear "Chicken".
Chicken is the usual game used to model conflicts in which the players are on a collision
course. The players may be drivers approaching each other on a narrow road, in which
each has the choice of swerving to avoid a collision or not swerving. In the novel Rebel
without a Cause, which was later made into a movie starring James Dean, the drivers
were two teenagers, but instead of bearing down on each other they both raced toward a
cliff, with the object being not to be the first driver to slam on his brakes and thereby
"chicken out", while, at the same time, not plunging over the cliff.
While ostensibly a game of Chicken, the Cuban missile crisis is in fact not well modelled
by this game. Another game more accurately represents the preferences of American and
Soviet leaders, but even for this game standard game theory does not explain their
choices.
On the other hand, the "theory of moves," which is founded on game theory but radically
changes its standard rules of play, does retrodict, or make past predictions of, the leaders'
choices. More important, the theory explicates the dynamics of play, based on the
assumption that players think not just about the immediate consequences of their actions
but their repercussions for future play as well.
39
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
40/55
Game Theory
I will use the Cuban missile crisis to illustrate parts of the theory, which is not just an
abstract mathematical model but one that mirrors the real-life choices, and underlying
thinking, of flesh-and-blood decision makers. Indeed, Theodore Sorensen, special
counsel to President John Kennedy, used the language of "moves" to describe the
deliberations of Excom, the Executive Committee of key advisors to Kennedy during the
Cuban missile crisis:
"We discussed what the Soviet reaction would be to any possible move by the United
States, what our reaction with them would have to be to that Soviet action, and so on,
trying to follow each of those roads to their ultimate conclusion."
Classical Game Theory and the Missile Crisis
Game theory is a branch of mathematics concerned with decision-making in social
interactions. It applies to situations (games) where there are two or more people (called
players) each attempting to choose between two more more ways of acting (called
strategies). The possible outcomes of a game depend on the choices made by all players,
and can be ranked in order of preference by each player.
In some two-person, two-strategy games, there are combinations
of strategies for the players that are in a certain sense "stable".
This will be true when neither player, by departing from its
strategy, can do better. Two such strategies are together known
as Nash equilibrium, named after John Nash, a mathematician
who received the Nobel prize in economics in 1994 for his work
on game theory.
Nash equilibria do not necessarily lead to the best outcomes for one, or even both,
players. Moreover, for the games that will be analyzed - in which players can only rank
outcomes ("ordinal games") but not attach numerical values to them ("cardinal games") -
they may not always exist. (While they always exist, as Nash showed, in cardinal games,
Nash equilibria in such games may involve "mixed strategies," which will be described
later.)
The Cuban missile crisis was precipitated by a Soviet attempt in October 1962 to install
medium-range and intermediate-range nuclear-armed ballistic missiles in Cuba that were
40
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
41/55
Game Theory
capable of hitting a large portion of the United States. The goal of the United States was
immediate removal of the Soviet missiles, and U.S. policy makers seriously considered
two strategies to achieve this end [see Figure 1 below]:
1. A naval blockade (B), or "quarantine" as it was euphemistically called, to
prevent shipment of more missiles, possibly followed by stronger action to
induce the Soviet Union to withdraw the missiles already installed.
2. A "surgical" air strike (A) to wipe out the missiles already installed,
insofar as possible, perhaps followed by an invasion of the island.
The alternatives open to Soviet policy makers were:
1. Withdrawal (W)of their missiles.
2. Maintenance (M)of their missiles.
Soviet Union (S.U.)
Withdrawal (W) Maintenance (M)
United
States
(U.S.)
Blockade
(B)
Compromise
(3,3)
Soviet victory,
U.S. defeat
(2,4)
Air strike
(A)
U.S. victory,
Soviet defeat
(4,2)
Nuclear war
(1,1)
Figure 1: Cuban missile crisis as Chicken
Key: (x, y) = (payoff to U.S., payoff to S.U.)
4= best; 3= next best; 2= next worst; 1= worst
Nash equilibria underscored
These strategies can be thought of as alternative courses of action that the two sides, or
"players" in the parlance of game theory, can choose. They lead to four possible
outcomes, which the players are assumed to rank as follows: 4=best; 3=next best; 2=next
worst; and l=worst. Thus, the higher the number, the greater the payoff; but the payoffs
are only ordinal, that is, they indicate an ordering of outcomes from best to worst, not the
degree to which a player prefers one outcome over another. The first number in the
41
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
42/55
Game Theory
ordered pairs for each outcome is the payoff to the row player (United States), the second
number the payoff to the column player (Soviet Union).
Needless to say, the strategy choices, probable outcomes, and associated payoffs shown
in Figure 1 provide only a skeletal picture of the crisis as it developed over a period of
thirteen days. Both sides considered more than the two alternatives listed, as well as
several variations on each. The Soviets, for example, demanded withdrawal of American
missiles from Turkey as a quid pro quo for withdrawal of their own missiles from Cuba,
a demand publicly ignored by the United States.
Nevertheless, most observers of this crisis believe that the two superpowers were on a
collision course, which is actually the title of one book describing this nuclear
confrontation. They also agree that neither side was eager to take any irreversible step,
such as one of the drivers in Chicken might do by defiantly ripping off the steering wheel
in full view of the other driver, thereby foreclosing the option of swerving.
Although in one sense the United States "won" by getting the Soviets to withdraw their
missiles, Premier Nikita Khrushchev of the Soviet Union at the same time extracted from
President Kennedy a promise not to invade Cuba, which seems to indicate that the
eventual outcome was a compromise of sorts. But this is not game theory's prediction for
Chicken, because the strategies associated with compromise do not constitute a Nash
equilibrium.
To see this, assume play is at the compromise position (3,3), that is, the U.S. blockades
Cuba and the S.U. withdraws its missiles. This strategy is not stable, because both
players would have an incentive to defect to their more belligerent strategy. If the U.S.
were to defect by changing its strategy to airstrike, play would move to (4,2), improving
the payoff the U.S. received; if the S.U. were to defect by changing its strategy to
maintenance, play would move to (2,4), giving the S.U. a payoff of 4. (This classic game
theory setup gives us no information about which outcome would be chosen, because the
table of payoffs is symmetric for the two players. This is a frequent problem in
interpreting the results of a game theoretic analysis, where more than one equilibrium
position can arise.) Finally, should the players be at the mutually worst outcome of (1,1),
that is, nuclear war, both would obviously desire to move away from it, making the
strategies associated with it, like those with (3,3), unstable.
42
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
43/55
Game Theory
Theory of Moves and the Missile Crisis
Using Chicken to model a situation such as the Cuban missile
crisis is problematic not only because the (3,3) compromiseoutcome is unstable but also because, in real life, the two sides did
not choose their strategies simultaneously, or independently of
each other, as assumed in the game of Chicken described above.
The Soviets responded specifically to the blockade after it was
imposed by the United States. Moreover, the fact that the United
States held out the possibility of escalating the conflict to at least an air strike indicates
that the initial blockade decision was not considered final - that is, the United States
considered its strategy choices still open after imposing the blockade.
As a consequence, this game is better modelled as one of sequential bargaining, in which
neither side made an all-or-nothing choice but rather both considered alternatives,
especially should the other side fail to respond in a manner deemed appropriate. In the
most serious breakdown in the nuclear deterrence relationship between the superpowers
that had persisted from World War II until that point, each side was gingerly feeling its
way, step by ominous step. Before the crisis, the Soviets, fearing an invasion of Cuba by
the United States and also the need to bolster their international strategic position,
concluded that installing the missiles was worth the risk. They thought that the United
States, confronted by a fait accompli, would be deterred from invading Cuba and would
not attempt any other severe reprisals. Even if the installation of the missiles precipitated
a crisis, the Soviets did not reckon the probability of war to be high (President Kennedy
estimated the chances of war to be between 1/3 and 1/2 during the crisis), thereby
making it rational for them to risk provoking the United States.
There are good reasons to believe that U.S. policymakers did not view the confrontation
to be Chicken-like, at least as far as they interpreted and ranked the possible outcomes. I
offer an alternative representation of the Cuban missile crisis in the form of a game I will
call Alternative, retaining the same strategies for both players as given in Chicken but
presuming a different ranking and interpretation of outcomes by the United States [see
Figure 2]. These rankings and interpretations fit the historical record better than those of
"Chicken", as far as can be told by examining the statements made at the time by
43
-
8/6/2019 19074879 100 Mark Project Game Theory (1)
44/55
Game Theory
President Kennedy and the U.S. Air Force, and the type and number of nuclear weapons
maintained by the S.U. (more on this below).
BW:The choice of blockade by the United States and withdrawal by the Soviet Union
remains the compromise for both players - (3,3).
BM: In the face of a U.S. blockade, Soviet maintenance of their missiles leads to a
Soviet victory (its best outcome) and U.S. capitul
top related