adversarial search - jbnu.ac.krnlp.jbnu.ac.kr/ai/slides_cbnu/nash_ai_slides_ch5.pdf · evaluation...
TRANSCRIPT
![Page 1: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/1.jpg)
Adversarial Search
Seung-Hoon Na1
1Department of Computer ScienceChonbuk National University
2017.9.25
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 1 / 23
![Page 2: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/2.jpg)
Minimax value
MINIMAX (s): Minimax value of a node
The utility of being in the corresponding state, assuming that bothplayers play optimally from there to the end of the game.s is a terminal state: Its utility defined.
MINIMAX (s) = UTILITY (s) if TERMINAL-TEST (s)maxa∈Actions(s)MINIMAX (RESULT (s, a)) if PLAYER(s) = MAXmina∈Actions(s)MINIMAX (RESULT (s, a)) if PLAYER(s) = MIN
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 2 / 23
![Page 3: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/3.jpg)
Minimax algorithm
The algorithm first recurses down to the leaves of the tree, and
Then the minimax values are backed up through the tree as therecursion unwinds.
The time complexity: O(bm)
The space complexity: O(bm) or O(m)
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 3 / 23
![Page 4: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/4.jpg)
Minimax algorithm: A two-play game tree
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 4 / 23
![Page 5: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/5.jpg)
Optimal decision in multiplayer games
Use a vector of values for each node
〈vA, vB , vC 〉 in three-player game with A, B, and C .For terminal state, the vector gives the utility of the state from eachplayer’s viewpoint.
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 5 / 23
![Page 6: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/6.jpg)
Alpha-beta pruning
Returns the same move as minimax would, but prunes awaybranches that cannot possibly influence the final decision.
Use two parameters – alpha and beta – that describes bounds on thebacked-up values
α = the value of the best (i.e. highest-value) choice we have found sofar at any choice point along the path for MAX .β = the value of the best (i.e. lowest-value) choice we have found sofar at any choice point along the path for MIN.
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 6 / 23
![Page 7: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/7.jpg)
Alpha-beta pruning
If m is better than n for Player, n will never be reached in actual play.
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 7 / 23
![Page 8: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/8.jpg)
Alpha-beta search
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 8 / 23
![Page 9: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/9.jpg)
Alpha-beta pruning: An example
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 9 / 23
![Page 10: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/10.jpg)
Alpha-beta pruning: An example
MINIMAX (root) = max (min(3, 12, 8),min(2, x , y))
= max (3, z) where z = min(2, x , y) ≤ 2
= 3 (1)
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 10 / 23
![Page 11: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/11.jpg)
Alpha-beta pruning: Move ordering
We couldn’t prune any successors of D, because the worst successors weregenerated first.
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 11 / 23
![Page 12: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/12.jpg)
Imperfect real-time decision
Replace the utility function by a heuristic evaluation function EVAL,which estimates the position’s utility.
Replace the terminal test by a cutoff test that decides when to applyEVAL, resulting in:
H-MINIMAX (s, d) = EVAL(s) if CUTOFF -TEST (s, d)maxa∈Actions(s)H-MINIMAX (RESULT (s, a), d + 1) if PLAYER(s) = MAXmina∈Actions(s)H-MINIMAX (RESULT (s, a), d + 1) if PLAYER(s) = MIN
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 12 / 23
![Page 13: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/13.jpg)
Evaluation function
In most cases, use various features of the state:
E.g.) in chess, features can include: the num of white pawns, blackpawns, white queens, black queens, etc.
These features usually define various categories or equivalenceclasses of states
The states in each category have the same values for all the features
Any category will contain some states, that lead to wins, some thatlead to draws, and some that lead to losses.
E.g.) in chess, one category contains all two-pawn vs. one-pawnendgames. Among the states of the category, 72%: win, 20%: lose, 8%:draw- A reasonable evaluation for states is the expected value:(0.72)× 1 + (0.20)× 0 + 0.08× 1/2 = 0.76
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 13 / 23
![Page 14: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/14.jpg)
Evaluation function
Mathematically, it is a weighed linear function:
EVAL(s) = w1f1(s) + · · ·wnfn(s) =n∑
i=1
wi fi (s) (2)
where fi is a feature of the position and wi is a weight.
E.g.)material value for each piece: fi : the num of each kind of pieceon the board, wi : the value of the pieces (1 for pawn, 3 for bishop, 5for rook, 9 for queen, etc.).
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 14 / 23
![Page 15: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/15.jpg)
Cutting off search: The simple approach
CUTOFF -TEST (s, depth) returns true all depth greater than somefixed depth d
The depth d is chosen so that a move is selected within the allocatedtime, or an iterative deepening could be applied.
But lead to approximate errors
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 15 / 23
![Page 16: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/16.jpg)
Quiescence search
For a more sophisticated cutoff test, we apply the evaluation functiononly to quiescent positions.
that is, unlikely to exhibit wild swings in value in the near future.E.g.) in chess, positions in which favorable captures can be made arenot quiescent for an evaluation function that just counts material.
Quiescence search: The extra search that expands nonquiescentpositions until quiescent positions are reached.
This can also be used for dealing with the horizon problem
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 16 / 23
![Page 17: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/17.jpg)
Horizon effect
Horizon effect: the program is facing an opponent’s move that causesserious damage and is ultimately unavoidable, but can be temporarilyavoided by delaying tactics
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 17 / 23
![Page 18: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/18.jpg)
Singular extension
Another possible strategy to mitigate the horizon effect
Singular extension: move that is “clearly better” than all othermoves in a given position.
Once discovered anywhere in the tree in the course of a search, thissingular move is remembered.The algorithm checks to see if the singular extension is a legal move; ifit is, the algorithm allows the move to be considered.Makes the tree deeper, but because there will be only few singularextensions, it does not add many total nodes to the tree.
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 18 / 23
![Page 19: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/19.jpg)
Stochastic game
Stochastic game: The games that include unpredictable external eventsas random elements such as the throwing of dice
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 19 / 23
![Page 20: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/20.jpg)
Stochastic game tree
chance nodes are further included in addition to MAX and MIN nodes.
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 20 / 23
![Page 21: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/21.jpg)
Expecti-minimax value
E -MINIMAX (s) =UTILITY (s) if TERMINAL-TEST (s)maxa∈Actions(s)E -MINIMAX (RESULT (s, a)) if PLAYER(s) = MAXmina∈Actions(s)E -MINIMAX (RESULT (s, a)) if PLAYER(s) = MIN∑
r P(r)E -MINIMAX (RESULT (s, r)) if PLAYER(s) = CHANCE
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 21 / 23
![Page 22: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/22.jpg)
Partially observable games
Kriesgpiel: Partially observable chess
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 22 / 23
![Page 23: Adversarial Search - jbnu.ac.krnlp.jbnu.ac.kr/AI/slides_cbnu/nash_ai_slides_ch5.pdf · Evaluation function In most cases, use various features of the state: E.g.) in chess, features](https://reader036.vdocuments.us/reader036/viewer/2022071008/5fc57b8e65ea4008543a72ac/html5/thumbnails/23.jpg)
Partially observable games: Card games
Card games: Gives stochastic partial observability – missinginformation is generated randomly
Suppose each deal s occurs with probability P(s).
argmaxa
∑s
P(s)MINIMAX (RESULT (s, a)) (3)
where we run either MINIMAX or H-MINIMAX .Monte Carlo approximation:
argmaxa
1
N
N∑i=1
MINIMAX (RESULT (si , a)) (4)
Seung-Hoon Na (Chonbuk National University) Adversarial Search 2017.9.25 23 / 23