CS B551: Elements of Artificial Intelligence
CS B551: Elements of Artificial Intelligence
Instructor: Kris Hauser (http://cs.indiana.edu/~hauserk)
Announcements
• HW1 due Thursday
• Final project proposal due Tuesday, Oct 6
Recap
• Revisited states and non-unit costs
• Uniform-cost search
• Heuristic search
• Admissible heuristics and A*
Topics
• Revisited states in A*
  – More bookkeeping
  – Consistent heuristics
• Local search
  – Steepest descent
What to do with revisited states?
[Figure: a search graph in which the same state is reached by two paths of different cost; edge costs c, heuristic values h, and f-values are shown on the nodes]
If we discard this new node, then the search algorithm expands the goal node next and returns a non-optimal solution
It is not harmful to discard a node revisiting a state if the cost of the new path to this state is ≥ the cost of the previous path [so, in particular, one can discard a node if it re-visits a state already visited by one of its ancestors]
A* remains optimal, but states can still be re-visited multiple times [the size of the search tree can still be exponential in the number of visited states]
Fortunately, for a large family of admissible heuristics (the consistent heuristics) there is a much more efficient way to handle revisited states
Consistent Heuristic
An admissible heuristic h is consistent (or monotone) if for each node N and each child N’ of N:
h(N) ≤ c(N,N’) + h(N’)   (triangle inequality)
[Figure: triangle with vertices N, N’, and the goal; edges labeled c(N,N’), h(N), and h(N’)]
Intuition: a consistent heuristic becomes more precise as we get deeper in the search tree
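The definition can be checked mechanically on any graph whose edges and heuristic values are known. The tiny graph, costs, and heuristic values below are made-up illustrations, not from the slides:

```python
# Check the triangle inequality h(N) <= c(N, N') + h(N') on every edge
# of an explicit graph. Graph and heuristics are hypothetical examples.

def is_consistent(edges, h):
    """edges: dict mapping (n, n_child) -> cost c(n, n_child)."""
    return all(h[n] <= c + h[n2] for (n, n2), c in edges.items())

edges = {("A", "B"): 2, ("B", "G"): 3}
h_good = {"A": 5, "B": 3, "G": 0}     # satisfies the triangle inequality
h_bad = {"A": 100, "B": 10, "G": 0}   # violates it on edge (A, B)

print(is_consistent(edges, h_good))   # True
print(is_consistent(edges, h_bad))    # False
```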
Consistency Violation
[Figure: node N with h(N) = 100, child N’ with h(N’) = 10, edge cost c(N,N’) = 10, violating the triangle inequality]
If h tells that N is 100 units from the goal, then moving from N along an arc costing 10 units should not lead to a node N’ that h estimates to be 10 units away from the goal
Consistent Heuristic (alternative definition)
A heuristic h is consistent (or monotone) if
1) for each node N and each child N’ of N: h(N) ≤ c(N,N’) + h(N’)   (triangle inequality)
2) for each goal node G: h(G) = 0
A consistent heuristic is also admissible
Admissibility and Consistency
• A consistent heuristic is also admissible
• An admissible heuristic may not be consistent, but many admissible heuristics are consistent
8-Puzzle
1 2 3
4 5 6
7 8
[Figure: a scrambled 8-puzzle state STATE(N) next to the goal state shown above]
h1(N) = number of misplaced tiles
h2(N) = sum of the (Manhattan) distances of every tile to its goal position
Both are consistent (why?)
Recall: h(N) ≤ c(N,N’) + h(N’)
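The two heuristics are straightforward to compute; a minimal sketch (state encoding and names are my own choices, not from the slides):

```python
# Misplaced-tiles (h1) and Manhattan-distance (h2) heuristics for the
# 8-puzzle. States are 3x3 tuples of tuples, with 0 for the blank.

GOAL = ((1, 2, 3), (4, 5, 6), (7, 8, 0))

def h1(state, goal=GOAL):
    """Number of non-blank tiles not in their goal cell."""
    return sum(1 for i in range(3) for j in range(3)
               if state[i][j] != 0 and state[i][j] != goal[i][j])

def h2(state, goal=GOAL):
    """Sum of Manhattan distances of each non-blank tile to its goal cell."""
    pos = {goal[i][j]: (i, j) for i in range(3) for j in range(3)}
    return sum(abs(i - pos[t][0]) + abs(j - pos[t][1])
               for i in range(3) for j in range(3)
               for t in [state[i][j]] if t != 0)

s = ((1, 2, 3), (4, 0, 6), (7, 5, 8))  # tiles 5 and 8 out of place
print(h1(s), h2(s))                    # 2 2
```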
Robot Navigation
• Cost of one horizontal/vertical step = 1
• Cost of one diagonal step = √2
h1(N) = √((xN − xg)² + (yN − yg)²)  is consistent
h2(N) = |xN − xg| + |yN − yg|  is consistent if moving along diagonals is not allowed, and not consistent otherwise
Recall: h(N) ≤ c(N,N’) + h(N’)
Result #2
If h is consistent, then whenever A* expands a node, it has already found an optimal path to this node’s state
Proof (1/2)
1) Consider a node N and its child N’
Since h is consistent: h(N) ≤ c(N,N’) + h(N’)
f(N) = g(N) + h(N) ≤ g(N) + c(N,N’) + h(N’) = f(N’)
So, f is non-decreasing along any path
Proof (2/2)
2) If a node K is selected for expansion, then any other node N in the fringe verifies f(N) ≥ f(K)
If the path through such a node N reaches the state of K at some node N’, the cost of that path is no smaller than the cost of the path to K: f(N’) ≥ f(N) (f is non-decreasing along any path, by part 1) and f(N) ≥ f(K) (N is in the fringe when K is selected). Since h(N’) = h(K), it follows that g(N’) ≥ g(K)
[Figure: start S with two paths to the same state, one ending at K and one passing through N and its descendant N’]
Implication of Result #2
[Figure: node N reaching state S along the optimal path; a later node N2 also reaches S]
The path to N is the optimal path to S
N2 can be discarded
Revisited States with Consistent Heuristic (Search #3)
• When a node is expanded, store its state into CLOSED
• When a new node N is generated:
  – If STATE(N) is in CLOSED, discard N
  – If there exists a node N’ in the fringe such that STATE(N’) = STATE(N), discard the node (N or N’) with the largest f (or, equivalently, g)
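This bookkeeping might be sketched as follows (a simplified rendering under the assumption of a consistent h; the graph encoding and names are illustrative, not from the course):

```python
import heapq

def astar(start, goal, successors, h):
    """A* with a CLOSED set, valid when h is consistent.
    successors(s) yields (s', cost); returns (cost, path) or None."""
    closed = set()
    best_g = {start: 0}                       # smallest g known per state
    fringe = [(h(start), 0, start, [start])]  # (f, g, state, path)
    while fringe:
        f, g, s, path = heapq.heappop(fringe)
        if s in closed:
            continue                          # stale duplicate: discard
        if s == goal:
            return g, path
        closed.add(s)                         # expanded: optimal path found
        for s2, c in successors(s):
            g2 = g + c
            if s2 not in closed and g2 < best_g.get(s2, float("inf")):
                best_g[s2] = g2               # keep the smaller-g copy
                heapq.heappush(fringe, (g2 + h(s2), g2, s2, path + [s2]))
    return None

# toy weighted graph (hypothetical); h = 0 is trivially consistent
graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 5)], "B": [("G", 1)]}
print(astar("S", "G", lambda s: graph.get(s, []), lambda s: 0))
# (4, ['S', 'A', 'B', 'G'])
```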
A* is optimal if:
• h is admissible but not consistent, and either
  – revisited states are not detected, or
  – Search #2 is used
• h is consistent, and either
  – revisited states are not detected, or
  – Search #3 is used
Heuristic Accuracy
Let h1 and h2 be two consistent heuristics such that for all nodes N: h1(N) ≤ h2(N)
h2 is said to be more accurate (or more informed) than h1
Example:
h1(N) = number of misplaced tiles
h2(N) = sum of distances of every tile to its goal position
h2 is more accurate than h1
[Figure: an 8-puzzle state STATE(N) and the goal state]
Result #3
Let h2 be more accurate than h1. Let A1* be A* using h1 and A2* be A* using h2.
Whenever a solution exists, all the nodes expanded by A2*, except possibly for some nodes such that f1(N) = f2(N) = C* (cost of the optimal solution), are also expanded by A1*
Proof
C* = h*(initial-node) [cost of optimal solution]
Every node N such that f(N) < C* is eventually expanded. No node N such that f(N) > C* is ever expanded.
Every node N such that h(N) < C* − g(N) is eventually expanded. So, every node N such that h2(N) < C* − g(N) is expanded by A2*. Since h1(N) ≤ h2(N), N is also expanded by A1*.
If there are several nodes N such that f1(N) = f2(N) = C* (such nodes include the optimal goal nodes, if there exists a solution), A1* and A2* may or may not expand them in the same order (until one goal node is expanded)
Effective Branching Factor
It is used as a measure of the effectiveness of a heuristic.
Let n be the total number of nodes expanded by A* for a particular problem and d the depth of the solution.
The effective branching factor b* is defined by n = 1 + b* + (b*)² + ... + (b*)^d
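The defining equation has no closed form for b*, but since the right-hand side is increasing in b*, it can be solved numerically. A bisection sketch (my own illustration, not from the slides):

```python
def effective_branching_factor(n, d, tol=1e-9):
    """Solve n = 1 + b + b^2 + ... + b^d for b by bisection.
    Assumes n > d + 1 so that a root b > 1 exists."""
    def total(b):
        return sum(b ** i for i in range(d + 1))
    lo, hi = 1.0, float(n)          # total(1) = d+1 < n; total(n) >= n
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if total(mid) < n:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# sanity check: 1 + 2 + 4 + 8 + 16 + 32 = 63, so n=63, d=5 gives b* = 2
print(round(effective_branching_factor(63, 5), 3))  # 2.0
```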
Experimental Results (see R&N for details)
8-puzzle with:
h1 = number of misplaced tiles
h2 = sum of distances of tiles to their goal positions
Random generation of many problem instances.
Average effective branching factors (numbers of expanded nodes in parentheses):

d    IDS                A1*            A2*
2    2.45               1.79           1.79
6    2.73               1.34           1.30
12   2.78 (3,644,035)   1.42 (227)     1.24 (73)
16   --                 1.45           1.25
20   --                 1.47           1.27
24   --                 1.48 (39,135)  1.26 (1,641)
How to create good heuristics?
By solving relaxed problems at each node.
In the 8-puzzle, the sum of the distances of each tile to its goal position (h2) corresponds to solving 8 simple problems. It ignores negative interactions among tiles.
[Figure: an 8-puzzle state and the goal state]
di is the length of the shortest path to move tile i to its goal position, ignoring the other tiles; e.g., d5 = 2
h2 = Σ(i=1,...,8) di
Can we do better?
For example, we could consider two more complex relaxed problems:
h = d1234 + d5678 [disjoint pattern heuristic]
[Figure: an 8-puzzle state and the goal state, with tiles 1-4 and tiles 5-8 highlighted as two subproblems]
d1234 = length of the shortest path to move tiles 1, 2, 3, and 4 to their goal positions, ignoring the other tiles; d5678 is defined similarly for tiles 5-8
How to compute d1234 and d5678? These distances are pre-computed and stored [each requires generating a tree of 3,024 nodes/states (breadth-first search)]
→ Several order-of-magnitude speedups for the 15- and 24-puzzle (see R&N)
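Such a pattern database could be precomputed roughly as below. This is a sketch of the idea under my own assumptions (a pattern tile may slide to any adjacent cell not occupied by another pattern tile, each such slide costing 1); it is not the course's exact formulation:

```python
from collections import deque

# Backward BFS over placements of the pattern tiles (e.g. tiles 1-4) on
# the 3x3 board, ignoring all other tiles. A placement is a tuple of the
# board cells (0-8) occupied by the pattern tiles, in tile order.

NEIGHBORS = {i: [j for j in range(9)
                 if abs(i // 3 - j // 3) + abs(i % 3 - j % 3) == 1]
             for i in range(9)}

def build_pattern_db(goal_cells):
    """Return dict: placement -> distance to the goal placement."""
    dist = {goal_cells: 0}
    queue = deque([goal_cells])
    while queue:
        cells = queue.popleft()
        occupied = set(cells)
        for k, c in enumerate(cells):
            for c2 in NEIGHBORS[c]:          # slide tile k to a free cell
                if c2 not in occupied:
                    nxt = cells[:k] + (c2,) + cells[k + 1:]
                    if nxt not in dist:
                        dist[nxt] = dist[cells] + 1
                        queue.append(nxt)
    return dist

db = build_pattern_db((0, 1, 2, 3))  # tiles 1-4 at cells 0-3 in the goal
print(len(db))                       # 9*8*7*6 = 3024 placements
```

At search time, d1234 for a state is a single dictionary lookup on the cells currently occupied by tiles 1-4.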
Iterative Deepening A* (IDA*)
Idea: reduce the memory requirement of A* by applying a cutoff on the values of f
Requires a consistent heuristic function h
Algorithm IDA*:
1. Initialize cutoff to f(initial-node)
2. Repeat:
   a. Perform depth-first search by expanding all nodes N such that f(N) ≤ cutoff
   b. Reset cutoff to the smallest value f of the non-expanded (leaf) nodes
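The two-step loop above can be sketched in code; the toy graph and names are illustrative, not from the slides:

```python
import math

def ida_star(start, goal_test, successors, h):
    """IDA*: repeated depth-first search under an f-cutoff; the cutoff is
    raised each round to the smallest f that exceeded the previous one."""
    def dfs(node, g, cutoff, path):
        f = g + h(node)
        if f > cutoff:
            return f, None            # leaf: report its f for the next round
        if goal_test(node):
            return f, path
        smallest = math.inf
        for child, cost in successors(node):
            if child not in path:     # avoid cycles on the current path
                t, found = dfs(child, g + cost, cutoff, path + [child])
                if found is not None:
                    return t, found
                smallest = min(smallest, t)
        return smallest, None

    cutoff = h(start)
    while cutoff < math.inf:
        cutoff, found = dfs(start, 0, cutoff, [start])
        if found is not None:
            return found
    return None

graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 5)], "B": [("G", 1)]}
print(ida_star("S", lambda s: s == "G", lambda s: graph.get(s, []), lambda s: 0))
# ['S', 'A', 'B', 'G']
```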
8-Puzzle
f(N) = g(N) + h(N) with h(N) = number of misplaced tiles
[Figures: step-by-step IDA* trace on the 8-puzzle. With Cutoff=4, the depth-first search expands only the nodes with f ≤ 4; the smallest f-value among the non-expanded leaves is 5, so the cutoff is reset to 5 and the depth-first search is restarted, expanding deeper nodes]
Advantages/Drawbacks of IDA*
Advantages:
• Still complete and optimal
• Requires less memory than A*
• Avoids the overhead of sorting the fringe
Drawbacks:
• Can’t avoid revisiting states not on the current path
• Available memory is poorly used
• Non-unit costs?
Pathological case…
[Figure: a long chain of nodes with tiny edge costs ε]
C* = 100
Nodes expanded:
A*: O(100/ε)
IDA*: O((100/ε)²)
Memory-Bounded Search
• Proceed like A* until memory is full
• No more nodes can be added to the search tree
• Drop the node in the fringe with the highest f(N)
• Place its parent back in the fringe with “backed-up” f: f(P) ← min(f(P), f(N))
Extreme example: RBFS keeps only the nodes on the path to the current node
Intermission
Final Project Proposals
• Due October 6
• One per group
• 2-4 paragraphs
Criteria
Technical merit:
• Does it connect to ideas seen in class, or seek an exciting extension?
• Can you teach the class something new (and interesting)?
Project feasibility:
• 6 1/2 weeks (after proposals are reviewed)
• Remember that you still have to take the final!
Local Search
Local Search
• Light-memory search methods
• No search tree; only the current state is represented!
• Applicable to problems where the path is irrelevant (e.g., 8-queens)
• For other problems, must encode entire paths in the state
• Many similarities with optimization techniques
Idea: Minimize h(N)
…because h(G) = 0 for any goal G. An optimization problem!
Steepest Descent
1) S ← initial state
2) Repeat:
   a) S’ ← arg min over S’ ∈ SUCCESSORS(S) of h(S’)
   b) if GOAL?(S’) return S’
   c) if h(S’) < h(S) then S ← S’ else return failure
Similar to:
- hill climbing with −h
- gradient descent over continuous space
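The pseudocode above might be rendered as follows; the toy 1-D problem and function names are illustrative, not from the slides:

```python
def steepest_descent(start, successors, h, goal_test):
    """Follow the best successor while h strictly decreases."""
    s = start
    while True:
        succs = list(successors(s))
        if not succs:
            return None
        s2 = min(succs, key=h)        # best successor by h
        if goal_test(s2):
            return s2
        if h(s2) < h(s):
            s = s2                    # strict improvement: move
        else:
            return None               # stuck at a local minimum

# toy example: minimize h(x) = |x - 7| over integers, with steps of +-1
result = steepest_descent(
    0,
    lambda x: [x - 1, x + 1],
    lambda x: abs(x - 7),
    lambda x: x == 7,
)
print(result)  # 7
```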
Application: 8-Queens
Repeat n times:
1) Pick an initial state S at random with one queen in each column
2) Repeat k times:
   a) If GOAL?(S) then return S
   b) Pick an attacked queen Q at random
   c) Move Q within its column to minimize the number of attacking queens → new S [min-conflicts heuristic]
3) Return failure
[Figure: a chessboard showing, for each square in Q’s column, the number of queens attacking it]
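The loop above can be sketched in code; the restart/step counts and helper names are my own illustrative choices, not from the slides:

```python
import random

def min_conflicts_queens(n=8, restarts=50, steps=100, seed=0):
    """Min-conflicts local search for n-queens.
    board[i] = row of the queen in column i."""
    rng = random.Random(seed)

    def conflicts(board, col, row):
        """Queens (other than column col) attacking square (col, row)."""
        return sum(1 for c in range(n) if c != col and
                   (board[c] == row or abs(board[c] - row) == abs(c - col)))

    for _ in range(restarts):                      # "repeat n times"
        board = [rng.randrange(n) for _ in range(n)]
        for _ in range(steps):                     # "repeat k times"
            attacked = [c for c in range(n)
                        if conflicts(board, c, board[c]) > 0]
            if not attacked:
                return board                       # GOAL?(S)
            col = rng.choice(attacked)             # random attacked queen
            # move it within its column to a minimum-conflict row
            board[col] = min(range(n),
                             key=lambda r: conflicts(board, col, r))
    return None

sol = min_conflicts_queens()
print(sol)
```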
Why does it work?
1) There are many goal states that are well-distributed over the state space
2) If no solution has been found after a few steps, it’s better to start all over again; building a search tree would be much less efficient because of the high branching factor
3) Running time is almost independent of the number of queens
Steepest Descent in Continuous Space
Minimize f(x), x ∈ Rⁿ
Move in the direction opposite to the gradient ∇f(x)
[Figure: surface of f over (x1, x2) with descent steps]
[Figure: two objective functions f: a smooth one where SD works well, and one with many local minima where SD works poorly]
Problems for Discrete Optimization…
[Figure: plateaux and ridges in a discrete objective landscape]
NP-hard problems typically have an exponential number of local minima
Steepest Descent
1) S ← initial state
2) Repeat:
   a) S’ ← arg min over S’ ∈ SUCCESSORS(S) of h(S’)
   b) if GOAL?(S’) return S’
   c) if h(S’) < h(S) then S ← S’ else return failure
It may easily get stuck in local minima
→ Random restart (as in the n-queens example)
→ Monte Carlo descent
Monte Carlo Descent
1) S ← initial state
2) Repeat k times:
   a) If GOAL?(S) then return S
   b) S’ ← successor of S picked at random
   c) if h(S’) < h(S) then S ← S’
   d) else
      - Δh = h(S’) − h(S)
      - with probability ~ exp(−Δh/T), where T is called the “temperature”, do: S ← S’ [Metropolis criterion]
3) Return failure
Simulated annealing lowers T over the k iterations: it starts with a large T and slowly decreases T
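A sketch of the annealed version; the toy problem, linear cooling schedule, and parameter values are my own illustrative assumptions:

```python
import math
import random

def simulated_annealing(start, successors, h, k=2000, t0=1.0, seed=0):
    """Monte Carlo descent with a decreasing temperature (Metropolis
    criterion): uphill moves of size dh are accepted w.p. exp(-dh/T)."""
    rng = random.Random(seed)
    s = start
    best = s
    for i in range(k):
        if h(s) == 0:
            return s                          # GOAL?(S) for h-minimization
        temp = t0 * (1 - i / k) + 1e-9        # linear cooling schedule
        s2 = rng.choice(list(successors(s)))  # random successor
        dh = h(s2) - h(s)
        if dh < 0 or rng.random() < math.exp(-dh / temp):
            s = s2                            # accept (Metropolis criterion)
        if h(s) < h(best):
            best = s                          # remember the best state seen
    return best

# toy example: minimize h(x) = (x - 3)^2 on the integers
res = simulated_annealing(20, lambda x: [x - 1, x + 1],
                          lambda x: (x - 3) ** 2)
print(res)  # ends at or near x = 3
```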
“Parallel” Local Search Techniques
They perform several local searches concurrently, but not independently:
• Beam search
• Genetic algorithms
• Tabu search
• Ant colony optimization
Beam Search
• Keep k nodes in memory
• Init: sample k nodes at random
• Repeat: generate successors, keep the k best

What if the k best nodes are the successors of just a few states?
Stochastic Beam Search
• Keep k nodes in memory
• Init: sample k nodes at random
• Repeat:
  – Generate successors
  – Sample k successors at random with a probability distribution that decreases as f(N) increases, e.g. p(N) ∝ e^(−f(N))
• Can be used for search problems with paths as well
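The sampling step might look as follows; the toy objective, beam size, and iteration count are illustrative assumptions:

```python
import math
import random

def stochastic_beam_search(init_states, successors, f, k=4, iters=30, seed=0):
    """Keep k states; sample the next generation from all successors with
    probability proportional to exp(-f(N))."""
    rng = random.Random(seed)
    beam = list(init_states)
    for _ in range(iters):
        pool = [s2 for s in beam for s2 in successors(s)]
        if not pool:
            break
        weights = [math.exp(-f(s)) for s in pool]   # p(N) ~ e^(-f(N))
        beam = rng.choices(pool, weights=weights, k=k)
    return min(beam, key=f)

# toy example: minimize f(x) = |x - 10| starting from scattered integers
best = stochastic_beam_search([0, 50, 25, 75],
                              lambda x: [x - 1, x + 1],
                              lambda x: abs(x - 10))
print(best)
```

Unlike keeping the k best deterministically, the weighted sampling preserves some diversity in the beam, which mitigates the concentration problem noted on the previous slide.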
Empirical Successes of Local Search
• Satisfiability (SAT)
• Vertex Cover
• Traveling salesman problem
• Planning & scheduling
Recap
• Consistent heuristics for revisited states
• Local search techniques
Homework
R&N 4.1-5
Next Class
• More local search
• Review