oregon state university – cs430 intro to ai (c) 2003 thomas g. dietterich and devika subramanian1...
Post on 20-Dec-2015
215 views
TRANSCRIPT
(c) 2003 Thomas G. Dietterich and Devika Subramanian
1
Ore
gon
Sta
te U
nive
rsit
y –
CS
430
Intr
o to
AI Search-Based Agents
Appropriate in Static Environments where a model of the agent is known and the environment allows prediction of the effects of actions evaluation of goals or utilities of predicted states
Environment can be partially-observable, stochastic, sequential, continuous, and even multi-agent, but it must be static!
We will first study the deterministic, discrete, single-agent case.
(c) 2003 Thomas G. Dietterich and Devika Subramanian
2
Ore
gon
Sta
te U
nive
rsit
y –
CS
430
Intr
o to
AI Computing Driving Directions
Arad
Oradea
Zerind
Timisoara
Lugoj
Dobreta
Sibiu
Rimnicu Vilcea
Fagaras
Pitesti
Mehadia
Craiova
Bucharest
Giurgia
Urzicenl
Neamt
Iasi
Vasini
Hirsova
Eforie
71
75
118
140
151
99
80
111
70
75
120
146
97
138
101
211
85
90
142
92
87
98
86
You are here
You want to be here
(c) 2003 Thomas G. Dietterich and Devika Subramanian
3
Ore
gon
Sta
te U
nive
rsit
y –
CS
430
Intr
o to
AI Search Algorithms
Breadth-First Depth-First Uniform Cost A* Dijkstra’s Algorithm
(c) 2003 Thomas G. Dietterich and Devika Subramanian
4
Ore
gon
Sta
te U
nive
rsit
y –
CS
430
Intr
o to
AI
Oradea (146)
71
Breadth-First
Arad (0)
75
Zerind (75)
118
Timisoara (118)
Lugoj (229)
111
Mehadia (299)
70
75Dobreta
(486)120
140Sibiu (140)
151
Rimnicu Vilcea (220)
80
Fagaras (239)99
Craiova (366)
146
Pitesti (317)97
138 Bucharest (450)
211
101
Detect duplicate path (291)
Detect new shorter path (418)
Detect duplicate path (455)
Detect duplicate path (504)
Detect new shorter path (374)
Detect duplicate path (494)
Detect duplicate path (197)
Breadth-FirstOradea (146)
71
Arad (0)
75
Zerind (75)
118
Timisoara (118)
Lugoj (229)
111
Mehadia (299)
70
75Dobreta
(486)120
140Sibiu (140)
151
Rimnicu Vilcea (220)
80
Fagaras (239)99
Craiova (366)
146
Pitesti (317)97
138 Bucharest (450)
211
101
Arad (0)
Sibiu (140)Zerind (75) Timisoara (118)
Arad (150) Oradea (146) Arad (280) Fagaras (239) Oradea (291) Arad (236) Lugoj (229)
Sibiu (197) Sibiu (338) Bucharest (450)
Rimnicu Vilcea (220)
Timisoara (340) Mehadia (299)Sibiu (300) Pitesti (317) Craiova (366)
Lugoj (369) Dobreta (374)Craiova (455)Rimnicu Vilcea (317) Bucharest (418) Rimnicu Vilcea (482) Dobreta (486) Pitesti (504)
Medhadia (449) Craiova (494)
Zerind (217)
(c) 2003 Thomas G. Dietterich and Devika Subramanian
6
Ore
gon
Sta
te U
nive
rsit
y –
CS
430
Intr
o to
AI Formal Statement of Search
Problems
State Space: set of possible “mental” states cities in Romania
Initial State: state from which search begins Arad
Operators: simulated actions that take the agent from one mental state to another traverse highway between two cities
Goal Test: Is current state Bucharest?
(c) 2003 Thomas G. Dietterich and Devika Subramanian
7
Ore
gon
Sta
te U
nive
rsit
y –
CS
430
Intr
o to
AI General Search Algorithm
Strategy: first-in first-out queue (expand oldest leaf first)
(c) 2003 Thomas G. Dietterich and Devika Subramanian
8
Ore
gon
Sta
te U
nive
rsit
y –
CS
430
Intr
o to
AI Leaf Selection Strategies
Breadth-First Search: oldest leaf (FIFO) Depth-First Search: youngest leaf (LIFO) Uniform Cost Search: cheapest leaf (Priority
Queue) A* search: leaf with estimated shortest total
path length g(x) + h(x) = f(x) where g(x) is length so far and h(x) is estimate of remaining length (Priority Queue)
(c) 2003 Thomas G. Dietterich and Devika Subramanian
9
Ore
gon
Sta
te U
nive
rsit
y –
CS
430
Intr
o to
AI A* Search
Let h(x) be a “heuristic function” that gives an underestimate of the true distance between x and the goal state Example: Euclidean distance
Let g(x) be the distance from the start to x, then g(x) + h(x) is an lower bound on the length of the optimal path
(c) 2003 Thomas G. Dietterich and Devika Subramanian
10
Ore
gon
Sta
te U
nive
rsit
y –
CS
430
Intr
o to
AI Euclidean Distance Table
Arad 366 Mehadia 241
Bucharest 0 Neamt 234
Craiova 160 Oradea 380
Dobreta 242 Pitesti 100
Eforie 161 Rimnicu Vilcea 193
Fagaras 176 Sibiu 253
Giurgiu 77 Timisoara 329
Hirsova 151 Urziceni 80
Iasi 226 Vaslui 199
Lugoj 244 Zerind 374
(c) 2003 Thomas G. Dietterich and Devika Subramanian
11
Ore
gon
Sta
te U
nive
rsit
y –
CS
430
Intr
o to
AI A* Search
Arad (0+366=366)
Sibiu (140+253=393)Zerind (75+374=449) Timisoara (118+329=447)
Fagaras (239+176=415) Oradea (291+380=671)
Bucharest (450+0=450)
Rimnicu Vilcea (220+193=413)
Pitesti (317+100=417) Craiova (366+160=526)
Craiova (455+160=615)Bucharest (418+0=418)
All remaining leaves have f(x) ¸ 418, so we know they cannot have shorter paths to Bucharest
(c) 2003 Thomas G. Dietterich and Devika Subramanian
12
Ore
gon
Sta
te U
nive
rsit
y –
CS
430
Intr
o to
AI Dijkstra’s Algorithm
Works backwards from the goal Each node keeps track of the shortest
known path (and its length) to the goal Equivalent to uniform cost search
starting at the goal No early stopping: finds shortest path
from all nodes to the goal
(c) 2003 Thomas G. Dietterich and Devika Subramanian
13
Ore
gon
Sta
te U
nive
rsit
y –
CS
430
Intr
o to
AI Local Search Algorithms
Keep a single current state x Repeat
Apply one or more operators to x Evaluate the resulting states according to an
Objective Function J(x) Choose one of them to replace x (or decide
not to replace x at all)
Until time limit or stopping criterion
(c) 2003 Thomas G. Dietterich and Devika Subramanian
14
Ore
gon
Sta
te U
nive
rsit
y –
CS
430
Intr
o to
AI Hill Climbing
Simple hill climbing: apply a randomly-chosen operator to the current state
If resulting state is better, replace current state
Steepest-Ascent Hill Climbing:
Apply all operators to current state, keep state with the best value
Stop when no successors state is better than current state
(c) 2003 Thomas G. Dietterich and Devika Subramanian
15
Ore
gon
Sta
te U
nive
rsit
y –
CS
430
Intr
o to
AI Gradient Ascent
In continuous state spaces, x = (x1, x2, …, xn) is a vector of real values
Continuous operator: x := x+ x for any arbitrary vector x (infinitely many operators!)
Suppose J(x) is differentiable. Then we can compute the direction of steepest increase of J by the first derivative with respect to x, the gradient:
(c) 2003 Thomas G. Dietterich and Devika Subramanian
16
Ore
gon
Sta
te U
nive
rsit
y –
CS
430
Intr
o to
AI Gradient Descent Search
Repeat Compute Gradient rJ Update x := x + rJ
Until rJ ¼ 0
is the “step size”, and it must be chosen carefully
Methods such as conjugate gradient and Newton’s method choose automatically
(c) 2003 Thomas G. Dietterich and Devika Subramanian
17
Ore
gon
Sta
te U
nive
rsit
y –
CS
430
Intr
o to
AI Visualizing Gradient Ascent
If is too large, search may overshoot and miss the maximum or oscillate forever
(c) 2003 Thomas G. Dietterich and Devika Subramanian
18
Ore
gon
Sta
te U
nive
rsit
y –
CS
430
Intr
o to
AI Problems with Hill Climbing
Local optima
Flat regions
Random restarts can give good results
(c) 2003 Thomas G. Dietterich and Devika Subramanian
19
Ore
gon
Sta
te U
nive
rsit
y –
CS
430
Intr
o to
AI Simulated Annealing
T = 100 (or some large value) Repeat
Apply randomly-chosen operator to x to obtain x′. Let E = J(x′) – J(x) If E > 0, /* J(x′) is better */
switch to x′ Else /* J(x′) is worse */
switch to x′ with probability exp [E/T] /* large negative steps are less likely */ T := 0.99 * T */ “cool” T */
Slowly decrease T (“anneal”) to zero Stop when no changes have been accepted for many moves Idea: Accept “down hill” steps with some probability to help
escape from local minima. As T ! 0 this probability goes to zero.