
CS 331/531 Dr M M Awais 1

search

A* Examples:


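8-Puzzle

[Figure: A* search tree for the 8-puzzle. Each node is labelled g(N)+h(N): the root is 0+4 and the search ends at a node labelled 5+0.]

Node labels in the figure, grouped by g: 0+4; 1+5, 1+5, 1+3; 2+3, 2+4, 2+3; 3+3, 3+4, 3+4, 3+2; 4+1; 5+2, 5+0.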

f(N) = g(N) + h(N) with h(N) = number of misplaced tiles
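The evaluation function above can be sketched in code as follows. This is a minimal illustration, not the lecture's own implementation: the goal layout `GOAL`, the helper names and the start state used in the usage note are assumptions, since the board from the slide's figure is not recoverable.

```python
import heapq

# Assumed goal layout; 0 marks the blank square.
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)

def misplaced(state):
    """h(N): number of tiles (blank excluded) that are not in their goal square."""
    return sum(1 for tile, want in zip(state, GOAL) if tile != 0 and tile != want)

def neighbours(state):
    """Yield every state reachable by sliding one tile into the blank."""
    i = state.index(0)
    row, col = divmod(i, 3)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:
            j = 3 * r + c
            nxt = list(state)
            nxt[i], nxt[j] = nxt[j], nxt[i]
            yield tuple(nxt)

def astar(start):
    """Return the number of moves on a cheapest path to GOAL, or None."""
    frontier = [(misplaced(start), 0, start)]   # entries are (f, g, state)
    best_g = {start: 0}
    while frontier:
        f, g, state = heapq.heappop(frontier)
        if state == GOAL:
            return g
        if g > best_g[state]:
            continue                            # stale queue entry
        for nxt in neighbours(state):
            if g + 1 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g + 1
                heapq.heappush(frontier, (g + 1 + misplaced(nxt), g + 1, nxt))
    return None
```

For example, `astar((1, 2, 3, 4, 5, 6, 0, 7, 8))` starts with f = 0 + 2 (tiles 7 and 8 are misplaced) and reaches the goal in two moves.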


Robot Navigation

[Figure: robot navigation grid. Each cell is labelled with its heuristic value h(N); the goal cell has h = 0.]

f(N) = h(N), with h(N) = Manhattan distance to the goal (not A*)


Robot Navigation

f(N) = g(N) + h(N), with h(N) = Manhattan distance to the goal (A*)

[Figure: the same grid. Each expanded cell is labelled with the two terms that sum to f(N).]
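The two evaluation functions for robot navigation differ only in whether g(N) is added. The sketch below makes that explicit with a single `use_g` switch. The grid, the `S`/`G`/`#` markers and all function names are hypothetical; the grid from the slides is not recoverable from the text.

```python
import heapq

# Hypothetical grid: S = start, G = goal, # = obstacle, . = free cell.
GRID = [
    "....",
    ".##.",
    "S#G.",
]

def find(ch):
    """Return the (row, col) position of the cell marked ch."""
    for r, row in enumerate(GRID):
        c = row.find(ch)
        if c != -1:
            return (r, c)
    raise ValueError(ch)

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def best_first(use_g):
    """Best-first search; f = g + h if use_g (A*), else f = h (greedy).

    Returns the length of the path found, or None if the goal is unreachable.
    """
    start, goal = find("S"), find("G")
    frontier = [(manhattan(start, goal), 0, start)]   # (f, g, cell)
    best_g = {start: 0}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            return g
        if g > best_g[cell]:
            continue                                  # stale queue entry
        r, c = cell
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]) and GRID[nr][nc] != "#":
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    h = manhattan((nr, nc), goal)
                    heapq.heappush(frontier, (ng + h if use_g else h, ng, (nr, nc)))
    return None
```

On this grid there is only one corridor from S to G, so `best_first(True)` and `best_first(False)` both return 8. On grids with several routes, only the A* variant is guaranteed to return a shortest path, because the Manhattan distance never overestimates the remaining cost on a four-connected grid.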


Adversary Search (Games)

The aim is to move in such a way as to ‘stop’ the opponent from making a good or winning move.

Game playing can use tree search.

The game tree alternates between the two players.


Games?

- Games are a form of multi-agent environment
  - What do other agents do?
  - How do they affect our success?
  - Cooperative vs. competitive multi-agent environments
  - Competitive multi-agent environments give rise to adversarial problems (games)
- Why study games?
  - Fun; historically entertaining
  - Interesting subject of study because they are hard
  - Easy to represent, and agents are restricted to a small number of actions


Games vs. Search

- Search: no adversary
  - Solution is a (heuristic) method for finding the goal
  - Heuristics and CSP techniques can find the optimal solution
  - Evaluation function: estimate of the cost from start to goal through a given node
  - Examples: path planning, scheduling activities
- Games: adversary
  - Solution is a strategy (a strategy specifies a move for every possible opponent reply)
  - Time limits force an approximate solution
  - Evaluation function: evaluates the “goodness” of a game position
  - Examples: chess, checkers, Othello, backgammon



Types of Games


Game setup

- Two players: MAX and MIN
- MAX and MIN take turns until the game is over. The winner gets a reward, the loser gets a penalty.
- Games as search:
  - Initial state: e.g. the board configuration in chess
  - Successor function: list of (move, state) pairs specifying legal moves
  - Terminal test: is the game finished?
  - Utility function: gives a numerical value for terminal states, e.g. win (+1), lose (-1) and draw (0) in tic-tac-toe (next)
- MAX uses the search tree to determine the next move.
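The four components of a game as a search problem can be written down directly for tic-tac-toe. The sketch below is one possible encoding, not the lecture's own: the board representation (a tuple of nine cells) and the function names are assumptions, with X playing as MAX and moving first.

```python
# Board: tuple of 9 cells in row-major order, each "X" (MAX), "O" (MIN) or " ".
INITIAL_STATE = (" ",) * 9

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),      # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),      # columns
         (0, 4, 8), (2, 4, 6)]                 # diagonals

def to_move(state):
    """X moves first, so X is to move whenever the mark counts are equal."""
    return "X" if state.count("X") == state.count("O") else "O"

def winner(state):
    """Return "X" or "O" if that player has completed a line, else None."""
    for a, b, c in LINES:
        if state[a] != " " and state[a] == state[b] == state[c]:
            return state[a]
    return None

def successors(state):
    """Successor function: list of (move, state) pairs for all legal moves."""
    player = to_move(state)
    return [(i, state[:i] + (player,) + state[i + 1:])
            for i, cell in enumerate(state) if cell == " "]

def terminal_test(state):
    """The game is finished when someone has won or the board is full."""
    return winner(state) is not None or " " not in state

def utility(state):
    """Utility of a terminal state from MAX's point of view: +1, -1 or 0."""
    return {"X": 1, "O": -1, None: 0}[winner(state)]
```

From `INITIAL_STATE` the successor function returns nine (move, state) pairs, one per empty square.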


Things to Remember:

1. Every move is vital.

2. The opponent could win at the next move or at subsequent moves.

3. Keep track of the safest moves.

4. The opponent is well informed.

5. Consider how the opponent is likely to respond to your moves.


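Two move win

Player 1 = P1

Player 2 = P2

Safest move for P1 is always A to C

Safest move for P2 is always A to D (if allowed the 1st move)

[Figure: game tree with root A, intermediate nodes B, C and D, and leaf nodes E to J. The levels are marked "P1 moves" and "P2 moves", and the leaves are marked with the player who wins: P1, P2, P1, P1, P2, P2.]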


MINIMAX Procedure for Games

Assumption: The opponent has the same knowledge of the state space and makes a consistent effort to WIN.

MIN: Label for the opponent, who tries to minimise the other player’s (MAX’s) score.

MAX: Player trying to win (maximise advantage).

BOTH MAX AND MIN ARE EQUALLY INFORMED


Rules

1. Label levels MAX and MIN

2. Assign values to leaf nodes:

0 if MIN wins

1 if MAX wins

3. Propagate values up the graph.

If the parent is MAX, assign it the max-value of its children

If the parent is MIN, assign it the min-value of its children

[Figure: game tree whose levels are labelled MAX, MIN, MAX, MIN from the root down.]


Rules: worked example

[Figure sequence: a four-level tree (MAX, MIN, MAX, MIN). Leaf values are assigned (rule 2) and then propagated up the graph (rule 3), one level per slide.]

Leaf values: 0 and 1

MAX level: Max(0,1) = 1 and Max(1) = 1

MIN level: Min(1) = 1 and Min(1) = 1

Root (MAX): Max(1,1) = 1
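The propagation rule can be sketched as a short recursive function. The nested-list encoding of the tree is an assumption made for this illustration: a list is an internal node and a number is a leaf value.

```python
def minimax(node, is_max):
    """Back up leaf values: max at MAX levels, min at MIN levels."""
    if not isinstance(node, list):          # leaf: return its utility value
        return node
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

# Root (MAX) -> two MIN nodes -> one MAX node each -> leaves 0, 1 and 1.
tree = [[[0, 1]], [[1]]]
root_value = minimax(tree, True)
print(root_value)
```

With this tree the function reproduces the values worked out above: Max(0,1) = 1 and Max(1) = 1 at the lower MAX level, Min(1) = 1 twice at the MIN level, and Max(1,1) = 1 at the root.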


Utility Values

• Leaf nodes represent the result of the game
• Results could be WIN or LOSE for any player
• WIN for MAX is 1, LOSE for MAX is 0
• These values are known as utility values / functions
• DRAW could be another result; in this case:
  • WIN for MAX could be 1
  • LOSE for MAX could be –1
  • DRAW could be 0


Game tree (2-player, deterministic, turns)


MINIMAX: Unfinished Games

• Apply from the leaf nodes to the start node
• That is, result nodes must be present in the search space
• What if you want to evaluate the game status at an intermediate level?
• E.g.,
  • The game finishes at level 5
  • We want to find out the relative advantage of MAX up to level 3
• Solution: evaluate intermediate nodes through a heuristic and then apply MINIMAX


Minimaxing to fixed ply depth (complex games)

Strategy: n-move look ahead

- Suppose you start in the middle of the game.

- One cannot assign WIN/LOSE values at that stage.

- In this case some heuristic evaluation is applied.

- Values are then projected back to indicate a WINNING/LOSING trend.
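A fixed-depth look ahead can be sketched on a toy game. The game used here is an assumption chosen only because it is small: a pile of stones, each player removes 1 to 3 stones per turn, and whoever takes the last stone wins. The heuristic `evaluate` is likewise hypothetical; it returns values of smaller magnitude than a real WIN/LOSE so that the two kinds of value can be told apart.

```python
def evaluate(pile, max_to_move):
    """Hypothetical heuristic: the player to move is favoured unless pile % 4 == 0."""
    score = 0.5 if pile % 4 != 0 else -0.5
    return score if max_to_move else -score

def minimax(pile, depth, max_to_move):
    """Minimax with an n-move look ahead; values are from MAX's point of view."""
    if pile == 0:
        # The previous player took the last stone and has won.
        return -1 if max_to_move else 1
    if depth == 0:
        # Cut-off: no WIN/LOSE value is available yet, so use the heuristic.
        return evaluate(pile, max_to_move)
    values = [minimax(pile - take, depth - 1, not max_to_move)
              for take in (1, 2, 3) if take <= pile]
    return max(values) if max_to_move else min(values)
```

With a deep enough look ahead the true result is found: `minimax(4, 10, True)` is -1 and `minimax(5, 10, True)` is 1. With `depth=0` only the heuristic trend is returned, for example `minimax(5, 0, True)` gives 0.5.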


HEURISTIC FUNCTION: TIC-TAC-TOE

M(n) = Total of possible winning lines for MAX

O(n) = Total of possible winning lines for the opponent

E(n) = M(n) - O(n)

[Figure: two tic-tac-toe positions with X and O marks, each annotated with its heuristic values.]

First position: M(n) = 4, O(n) = 2, E(n) = 2

Second position: M(n) = 5, O(n) = 1, E(n) = 4
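The heuristic E(n) = M(n) - O(n) can be computed by counting the lines each player could still complete. The sketch below assumes that MAX plays X and that a line is "possible" for a player when it contains none of the other player's marks. The example position is hypothetical, since the boards in the slide's figure are not recoverable.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),      # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),      # columns
         (0, 4, 8), (2, 4, 6)]                 # diagonals

def open_lines(board, blocker):
    """Count the lines that contain no mark belonging to `blocker`."""
    return sum(1 for line in LINES if all(board[i] != blocker for i in line))

def evaluate(board):
    """Return (M, O, E) for a board given as a list of 9 cells, row by row."""
    m = open_lines(board, "O")     # lines still winnable for MAX (X)
    o = open_lines(board, "X")     # lines still winnable for the opponent (O)
    return m, o, m - o

# Hypothetical position: O in the top-left corner, X in the centre.
board = [" "] * 9
board[0] = "O"
board[4] = "X"
m, o, e = evaluate(board)
```

Here O's corner blocks three of the eight lines, so M(n) = 5; X's centre blocks four, so O(n) = 4 and E(n) = 1.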


Two-Ply Game Tree

The minimax decision

Minimax maximizes the worst-case outcome for MAX.
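The statement above can be written as a recursive definition. The notation is an assumption introduced here and does not appear on the slide: Succ(n) is the set of children of node n, and Utility(n) is the value of a terminal node.

```latex
\mathrm{Minimax}(n) =
\begin{cases}
\mathrm{Utility}(n) & \text{if } n \text{ is a terminal node} \\
\max_{s \in \mathrm{Succ}(n)} \mathrm{Minimax}(s) & \text{if } n \text{ is a MAX node} \\
\min_{s \in \mathrm{Succ}(n)} \mathrm{Minimax}(s) & \text{if } n \text{ is a MIN node}
\end{cases}
```

The minimax decision at the root is the move leading to the child with the largest Minimax value, that is, the move whose worst-case outcome is best for MAX.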


Problem of minimax search

The number of game states is exponential in the number of moves.


Solution

- Do not examine every node
- Alpha-beta pruning
  - Alpha = value of the best choice found so far at any choice point along the path for MAX
  - Beta = value of the best choice found so far at any choice point along the path for MIN


Alpha-Beta Procedures

• The minimax procedure pursues all branches in the space. Some of them could have been ignored or pruned.

• To improve efficiency, pruning is applied to two-person games.


Simple Idea

if A > 5 OR B < 0

If the first condition A > 5 succeeds, then B < 0 need not be evaluated.

if A > 5 AND B < 0

If the first condition A > 5 fails, then B < 0 need not be evaluated.
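The same behaviour can be observed in Python, whose `or` and `and` operators skip the second operand once the result is decided. The helper functions and the `calls` log below are hypothetical and exist only to record which conditions were actually evaluated.

```python
calls = []

def a_gt_5(a):
    calls.append("A > 5")      # record that this condition was evaluated
    return a > 5

def b_lt_0(b):
    calls.append("B < 0")
    return b < 0

# OR: the first condition succeeds, so the second is skipped.
result_or = a_gt_5(7) or b_lt_0(-1)
or_calls = list(calls)

calls.clear()

# AND: the first condition fails, so the second is skipped.
result_and = a_gt_5(3) and b_lt_0(-1)
and_calls = list(calls)
```

In both cases only "A > 5" appears in the log. Alpha-beta pruning applies the same idea to max and min: once a bound already decides the outcome, the remaining children are not examined.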


Implementation

FORWARD PASS:

APPLY DEPTH-FIRST SEARCH TO REACH THE LEAF NODE

BACKWARD PASS:

PROPAGATE THE VALUES TO THE ROOT NODE


[Figure, repeated on the following slides: game tree with levels MAX, MIN, MAX, MIN from the top. The labelled nodes are a (MAX), b = 0.4 and c (MIN), e (MAX, below c), and g = -0.2 (a child of e).]

Node e: -0.2 (at least)

Why is –0.2 the least value?

Suppose the other child of e takes a value less than –0.2. The value for node e will not change and remains at –0.2.

Suppose this node takes a value greater than –0.2, say v. The value for node e will change to v.

WHAT IS THE LOWER BOUND ON v?

The lower bound is the value at node g.

The minimum advantage for the MAX node e is –0.2. This is called the ALPHA value for the MAX node.


[Figure: the same tree, with e marked -0.2 (at least).]

Node c: -0.2 (at most)

Why is –0.2 the AT MOST value for node c?

Suppose the other child of c takes a value v less than –0.2. The value for node c will change to v.

Suppose this node takes a value greater than –0.2. The value for node c will not change and will remain at –0.2.

WHAT IS THE UPPER BOUND ON v?

The UPPER bound is the value at node e.

The maximum advantage for the MIN node c is –0.2. This is called the BETA value for the MIN node.


[Figure: the same tree, with c marked -0.2 (at most) and e marked -0.2 (at least).]

FIND THE ALPHA VALUE FOR NODE a

Node a: 0.4 (at least)

The least advantage which MAX can get in this portion of the game is 0.4.

IF this least advantage is acceptable, then expanding to c and to all the following nodes can be neglected: prune away the link to c, with ALPHA = 0.4.


- A MAX node neglects values <= alpha (the least it can score) at MIN nodes below it.

- A MIN node neglects values >= beta (the most it can score) at MAX nodes below it.

[Figure: node A at the MAX level; B = 10 and C at the MIN level; G = 0 and H below C.]

C node can score AT MOST 0, nothing above 0 (beta)

A node can score AT LEAST 10, nothing less than 10 (alpha)
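The pruning decision in this example can be written out as a short derivation. The subscripted symbols are labels introduced here for illustration: the alpha value at A and the beta value at C.

```latex
\begin{aligned}
\alpha_A &\ge B = 10 \\
\beta_C &\le G = 0 \\
C = \min(G, H) &\le 0 < 10
\;\Longrightarrow\; A = \max(B, C) = 10 \text{ for every value of } H
\end{aligned}
```

Since the value of A does not depend on H, node H need not be evaluated.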


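Alpha-Beta Example

Range of possible values, shown as [lower bound, upper bound]

Do DF-search until first leaf

[Figure sequence: two-ply game tree with a root and three child nodes; the bounds below are updated one slide at a time.]

1. Root [-∞,+∞]; first child [-∞,+∞]
2. Root [-∞,+∞]; first child [-∞,3]
3. Root [3,+∞]; first child [3,3]
4. Root [3,+∞]; first child [3,3]; second child [-∞,2]. This node is worse for MAX
5. Root [3,14]; first child [3,3]; second child [-∞,2]; third child [-∞,14]
6. Root [3,5]; first child [3,3]; second child [-∞,2]; third child [-∞,5]
7. Root [3,3]; first child [3,3]; second child [-∞,2]; third child [2,2]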
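The bounds in this example can be reproduced with a short alpha-beta sketch. The nested-list tree encoding and the `seen` log are assumptions made for this illustration. The leaf values 3, 2, 14, 5 and 2 agree with the bounds shown on the slides; the remaining leaves (12, 8, 4, 6) are assumed, because the figure itself is not recoverable.

```python
import math

def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf, seen=None):
    """Minimax value of `node`, skipping branches that cannot change the result."""
    if not isinstance(node, list):          # leaf: record it and return its value
        if seen is not None:
            seen.append(node)
        return node
    if is_max:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, seen))
            alpha = max(alpha, value)
            if alpha >= beta:               # MIN above will never allow this
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, seen))
        beta = min(beta, value)
        if beta <= alpha:                   # MAX above already has something better
            break
    return value

# Root is a MAX node with three MIN children (leaf values partly assumed).
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen = []
root_value = alphabeta(tree, True, seen=seen)
```

The root value is 3, and `seen` ends up as `[3, 12, 8, 2, 14, 5, 2]`: after the leaf 2 in the second subtree, that subtree is known to be worth at most 2, which is worse for MAX than the 3 already secured, so its remaining leaves are never examined.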
