L6 8 Adversarial Search
TRANSCRIPT
8/9/2019, page 1/22
CIT 6261: Advanced Artificial Intelligence
Acknowledgement: Tom Lenaerts, SWITCH, Vlaams Interuniversitair Instituut voor Biotechnologie
Adversarial Search
Lecturer: Dr. Md. Saiful Islam
2
Outline
What are games?
Optimal decisions in games: which strategy leads to success?
α-β pruning
Games of imperfect information
Games that include an element of chance
3
What are and why study games?
Games are a form of multi-agent environment. What do other agents do, and how do they affect our success?
Cooperative vs. competitive multi-agent environments.
Competitive multi-agent environments give rise to adversarial search, a.k.a. games.
Why study games? Fun; historically entertaining
Interesting subject of study because they are hard
The state of a game is easy to represent and agents are restricted to a small number of actions
4
Relation of Games to Search
Search: no adversary
Solution is a (heuristic) method for finding a goal
Heuristics and CSP techniques can find the optimal solution
Evaluation function: estimate of cost from start to goal through a given node
Examples: path planning, scheduling activities
Games: adversary
Solution is a strategy (a strategy specifies a move for every possible opponent reply)
Time limits force an approximate solution
Evaluation function: evaluates the goodness of a game position
Examples: chess, checkers, Othello, backgammon
5
Types of Games
6
Game setup
Two players: MAX and MIN
MAX moves first and they take turns until the game is over. The winner gets an award, the loser gets a penalty.
Games as search:
Initial state: e.g. the board configuration of chess
Successor function: list of (move, state) pairs specifying legal moves
Terminal test: is the game finished? States where the game has ended are called terminal states.
Utility function: gives a numerical value for terminal states, e.g. win (+1), lose (-1) and draw (0) in tic-tac-toe (next)
MAX uses a search tree to determine the next move.
7
Partial Game Tree for Tic-Tac-Toe
8
Optimal strategies
Find the contingent strategy for MAX assuming an infallible MIN opponent.
Assumption: both players play optimally!
Given a game tree, the optimal strategy can be determined by using the minimax value of each node:
MINIMAX-VALUE(n) =
  UTILITY(n)                                  if n is a terminal state
  max_{s ∈ Successors(n)} MINIMAX-VALUE(s)    if n is a MAX node
  min_{s ∈ Successors(n)} MINIMAX-VALUE(s)    if n is a MIN node
9
Two-Ply Game Tree
10
Two-Ply Game Tree
11
Two-Ply Game Tree
12
Two-Ply Game Tree
The minimax decision
Minimax maximizes the worst-case outcome for MAX.
13
What if MIN does not play optimally?
The definition of optimal play for MAX assumes MIN plays optimally: it maximizes the worst-case outcome for MAX.
But if MIN does not play optimally, MAX will do even better. (This can be proven.)
14
Minimax Algorithm
function MINIMAX-DECISION(state) returns an action
  inputs: state, current state in game
  v ← MAX-VALUE(state)
  return the action in SUCCESSORS(state) with value v

function MIN-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for a, s in SUCCESSORS(state) do
    v ← MIN(v, MAX-VALUE(s))
  return v

function MAX-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← -∞
  for a, s in SUCCESSORS(state) do
    v ← MAX(v, MIN-VALUE(s))
  return v
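The pseudocode above can be turned into a short runnable sketch. Everything here is an illustrative assumption, not part of the slides: the game tree is encoded as nested lists (a plain number is a terminal utility, a list is an interior node with MAX and MIN alternating by level), and `minimax_value` / `minimax_decision` are made-up helper names. The leaf values mirror the two-ply example used in the slides.

```python
def minimax_value(node, is_max):
    """Minimax value of a node: numbers are terminal utilities,
    lists are interior nodes with MAX and MIN alternating by level."""
    if isinstance(node, (int, float)):            # TERMINAL-TEST
        return node                               # UTILITY
    values = [minimax_value(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

def minimax_decision(successors):
    """MAX moves first, so each successor is a MIN node.
    Returns (index of the best move, its minimax value)."""
    values = [minimax_value(s, is_max=False) for s in successors]
    best = max(values)
    return values.index(best), best

# Three MIN nodes with leaves [3, 12, 8], [2, 4, 6], [14, 5, 2]:
# the MIN values are 3, 2, 2, so MAX picks the first move, value 3.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
move, value = minimax_decision(tree)   # (0, 3)
```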
15
Properties of Minimax
Criterion: Minimax
Time: O(b^m)
Space: O(bm) (depth-first exploration)
16
Multiplayer games
Games allow more than two players
Single minimax values become vectors
17
Problem of minimax search
The number of game states is exponential in the number of moves.
Solution: do not examine every node
==> Alpha-beta pruning
Remove branches that do not influence the final decision
Revisit example
18
Alpha-Beta Example
[-∞, +∞]
[-∞, +∞]
Range of possible values
Do DF-search until first leaf
19
Alpha-Beta Example (continued)
[-∞, 3]
[-∞, +∞]
3
20
Alpha-Beta Example (continued)
[-∞, 3]
[-∞, +∞]
21
Alpha-Beta Example (continued)
[3, +∞]
[3, 3]
22
Alpha-Beta Example (continued)
[-∞, 2]
[3, +∞]
[3, 3]
This node is worse for MAX
23
Alpha-Beta Example (continued)
[-∞, 2]
[3, 14]
[3, 3]
[-∞, 14]
24
Alpha-Beta Example (continued)
[-∞, 2]
[3, 5]
[3, 3]
[-∞, 5]
25
Alpha-Beta Example (continued)
[2, 2]
[-∞, 2]
[3, 3]
[3, 3]
26
Alpha-Beta Example (continued)
[2, 2]
[3, 3]
[3, 3]
[-∞, 2]
27
Alpha-Beta Algorithm
function ALPHA-BETA-SEARCH(state) returns an action
  inputs: state, current state in game
  v ← MAX-VALUE(state, -∞, +∞)
  return the action in SUCCESSORS(state) with value v

function MAX-VALUE(state, α, β) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← -∞
  for a, s in SUCCESSORS(state) do
    v ← MAX(v, MIN-VALUE(s, α, β))
    if v ≥ β then return v
    α ← MAX(α, v)
  return v
28
Alpha-Beta Algorithm
function MIN-VALUE(state, α, β) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for a, s in SUCCESSORS(state) do
    v ← MIN(v, MAX-VALUE(s, α, β))
    if v ≤ α then return v
    β ← MIN(β, v)
  return v
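A runnable Python sketch of the algorithm above. The nested-list game-tree encoding is a made-up illustration (a plain number is a terminal utility, a list is an interior node), and the leaf values mirror the worked example in the slides.

```python
def alphabeta(node, alpha, beta, is_max):
    """Alpha-beta value of a node: numbers are terminal utilities,
    lists are interior nodes with MAX and MIN alternating by level."""
    if isinstance(node, (int, float)):            # TERMINAL-TEST
        return node                               # UTILITY
    if is_max:
        v = float("-inf")
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False))
            if v >= beta:                         # MIN above will never allow v
                return v                          # prune the remaining children
            alpha = max(alpha, v)
    else:
        v = float("inf")
        for child in node:
            v = min(v, alphabeta(child, alpha, beta, True))
            if v <= alpha:                        # MAX above already has better
                return v
            beta = min(beta, v)
    return v

# Same tree as the worked example: the root value is 3, and the
# second MIN node is cut off after its first leaf (2 <= alpha = 3),
# so the leaves 4 and 6 are never examined.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
value = alphabeta(tree, float("-inf"), float("inf"), True)   # 3
```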
29
General alpha-beta pruning
Consider a node n somewhere in the tree.
If the player has a better choice m at the parent node of n, or at any choice point further up, then n will never be reached in actual play.
Hence once enough is known about n, it can be pruned.
30
Final Comments about Alpha-Beta Pruning
Pruning does not affect final results
Entire subtrees can be pruned.
Good move ordering improves the effectiveness of pruning
With perfect ordering, time complexity is O(b^(m/2)): an effective branching factor of √b!
Alpha-beta pruning can look twice as far ahead as minimax in the same amount of time
Repeated states are again possible. Store them in memory = transposition table
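The transposition-table idea can be sketched as a cache keyed by (state, player to move), so a repeated state is evaluated only once. The tuple-based tree encoding and the function name below are hypothetical illustrations, not from the slides.

```python
def minimax_tt(node, is_max, table):
    """Minimax with a transposition table: `table` maps
    (state, player-to-move) keys to already-computed values."""
    if isinstance(node, (int, float)):        # terminal utility
        return node
    key = (node, is_max)                      # tuples are hashable state keys
    if key in table:                          # transposition: state seen before
        return table[key]
    values = [minimax_tt(child, not is_max, table) for child in node]
    v = max(values) if is_max else min(values)
    table[key] = v                            # remember for later occurrences
    return v

table = {}
tree = ((3, 12, 8), (2, 4, 6), (14, 5, 2))
value = minimax_tt(tree, True, table)         # 3; table now caches 4 nodes
```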
31
Games of imperfect information
Minimax and alpha-beta pruning require too
much leaf-node evaluations.May be impractical within a reasonableamount of time.
SHANNON (1950): Cut off search earlier (replace TERMINAL-TEST by CUTOFF-
TEST)
Apply heuristic evaluation function EVAL (replacing utilityfunction of alpha-beta)
32
Cutting off search
Change: if TERMINAL-TEST(state) then return UTILITY(state)
into: if CUTOFF-TEST(state, depth) then return EVAL(state)
This introduces a fixed depth limit. The depth is selected so that the amount of time used will not exceed what the rules of the game allow.
When cutoff occurs, the evaluation is performed.
More robust approach: apply iterative deepening. When time runs out, the program returns the move selected by the deepest completed search.
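The change above can be sketched in runnable form. This is an illustration only: the nested-list tree encoding is assumed, and `avg_eval` is a made-up EVAL that scores an internal node by the average of the terminal utilities beneath it.

```python
def avg_eval(node):
    """A toy EVAL: the average of all terminal utilities under a node."""
    if isinstance(node, (int, float)):
        return node
    leaves, stack = [], list(node)
    while stack:
        n = stack.pop()
        if isinstance(n, (int, float)):
            leaves.append(n)
        else:
            stack.extend(n)
    return sum(leaves) / len(leaves)

def cutoff_minimax(node, depth, is_max, depth_limit):
    """Minimax with CUTOFF-TEST / EVAL in place of TERMINAL-TEST / UTILITY."""
    if isinstance(node, (int, float)):        # a true terminal state
        return node
    if depth >= depth_limit:                  # CUTOFF-TEST(state, depth)
        return avg_eval(node)                 # EVAL(state)
    values = [cutoff_minimax(c, depth + 1, not is_max, depth_limit)
              for c in node]
    return max(values) if is_max else min(values)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
approx = cutoff_minimax(tree, 0, True, depth_limit=1)   # 23/3, an estimate
exact = cutoff_minimax(tree, 0, True, depth_limit=2)    # 3, the true value
```

Searching one ply deeper replaces the estimate with the exact minimax value, which is the trade-off the cutoff controls.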
33
Heuristic EVAL
Idea: produce an estimate of the expected utility of the game from a given position.
Performance depends on the quality of EVAL.
Requirements:
EVAL should order terminal nodes in the same way as UTILITY.
The computation must not take too long.
For non-terminal states, EVAL should be strongly correlated with the actual chance of winning.
Only useful for quiescent states (no wild swings in value in the near future)
34
Heuristic EVAL example
Most evaluation functions compute separate numerical contributions for each feature and then combine them to find the total value.
Example: pawn = 1, knight = bishop = 3, rook = 5, queen = 9;
good pawn structure = king safety = ½ pawn.
Weighted linear evaluation function:
Eval(s) = w1·f1(s) + w2·f2(s) + … + wn·fn(s)
For chess:
fi = the number of each kind of piece on the board
wi = the value of the pieces
Nonlinear combination of features: a pair of bishops > 2 × (value of a single bishop); a bishop is worth more in the endgame than at the beginning.
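The weighted linear form can be sketched directly. The piece weights follow the example values above; the dict-based feature encoding (piece kind mapped to the material difference, ours minus the opponent's) is a made-up illustration.

```python
# wi: the material values from the example (pawn structure and king
# safety features are omitted in this sketch)
PIECE_VALUES = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def material_eval(feature_diffs):
    """Eval(s) = sum_i wi * fi(s), where fi(s) is the difference between
    our count of piece kind i and the opponent's."""
    return sum(PIECE_VALUES[piece] * diff
               for piece, diff in feature_diffs.items())

# Example: up a rook but down two pawns -> 5*1 + 1*(-2) = 3
advantage = material_eval({"rook": 1, "pawn": -2})
```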
35
Heuristic EVAL example
A secure advantage equivalent to one pawn is likely to win;
an advantage of three pawns gives almost certain victory.
36
Heuristic difficulties
Heuristic counts pieces won
37
Horizon effect
A fixed-depth search thinks it can avoid the queening move.
As hardware improvements lead to deeper searches, the horizon effect will occur less frequently.
38
Discussion
Combining all these techniques results in a program that can play reasonable chess (or another game).
We can generate and evaluate around a million nodes per second on the latest PC, allowing us to search roughly 200 million nodes per move under standard time controls (~3 minutes / move).
On average, the branching factor for chess is 35.
If we use minimax search, we could look ahead only about five plies (35^5 ≈ 50 million).
If we use alpha-beta search, we could look ahead about ten plies.
To reach grandmaster status we need an extensively tuned evaluation function. A supercomputer would perform better.
39
Games that include chance
Possible moves: (5-10, 5-11), (5-11, 19-24), (5-10, 10-16) and (5-11, 11-16)
Backgammon: luck + skill
40
Games that include chance
Doubles (e.g. [1,1], [6,6]) have chance 1/36; all other rolls have chance 1/18
We cannot calculate a definite minimax value, only an expected value
chance nodes
41
Expected minimax value
EXPECTIMINIMAX(n) =
  UTILITY(n)                                        if n is a terminal state
  max_{s ∈ Successors(n)} EXPECTIMINIMAX(s)         if n is a MAX node
  min_{s ∈ Successors(n)} EXPECTIMINIMAX(s)         if n is a MIN node
  sum_{s ∈ Successors(n)} P(s) · EXPECTIMINIMAX(s)  if n is a chance node
These equations can be backed up recursively all the way to the root of the game tree.
As with minimax, approximations can be made with EXPECTIMINIMAX:
Cut the search off at some point
Apply the evaluation function to each leaf
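The equations above can be sketched as a short recursion. The tagged-tuple tree encoding and the toy coin-flip example are illustrative assumptions, not from the slides.

```python
def expectiminimax(node):
    """node is a terminal number, ('max', children), ('min', children),
    or ('chance', [(probability, child), ...])."""
    if isinstance(node, (int, float)):        # terminal: UTILITY(n)
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    # chance node: sum over successors of P(s) * EXPECTIMINIMAX(s)
    return sum(p * expectiminimax(c) for p, c in children)

# MAX chooses between a certain 2 and a fair coin flip over 0 and 5:
# the flip is worth 0.5*0 + 0.5*5 = 2.5, so MAX prefers it.
tree = ("max", [2, ("chance", [(0.5, 0), (0.5, 5)])])
value = expectiminimax(tree)   # 2.5
```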
42
Position evaluation with chance nodes
Left: A1 wins
Right: A2 wins
The outcome of the evaluation function may change when the values are scaled differently.
To avoid this sensitivity, EVAL must be a positive linear transformation of the probability of winning from a position.
The presence of chance nodes requires us to be more careful about the evaluation values.
43
Discussion
Examine the section on state-of-the-art game programs yourself
Summary
Games are fun (and dangerous).
They illustrate several important points about AI:
Perfection is unattainable -> approximation
Uncertainty constrains the assignment of values to states
Games are to AI as grand prix racing is to automobile design.