cooperative heuristic search with software agentsusers.ics.aalto.fi/ahalme/coop/presis_halme.pdf ·...
TRANSCRIPT
Cooperative Heuristic Search with Software AgentsAntti Halme
2/30
Outline
Cooperation and heuristic search
The A! algorithm
Solving n-puzzles with A!
Conclusion
Cooperative Heuristic Search with Software AgentsAntti Halme
3/30
Outline
Cooperation and heuristic search
The A! algorithm
Solving n-puzzles with A!
Conclusion
Cooperative Heuristic Search with Software AgentsAntti Halme
3/30
Cooperation
co·op·er·ate
I to work togetherI to work with another person or group to do somethingI to be helpful by doing what someone asks or tells you to doI to act in a way that makes something possible or likelyI to produce the right conditions for something to happen
(Merriam-Webster)
Cooperative Heuristic Search with Software AgentsAntti Halme
4/30
Heuristic Search
I Searching is a fundamental computing taskI A heuristic provides focus to searching effortsI The most well-known heuristic search algorithm is the A∗
Uninformed search Heuristic search
Cooperative Heuristic Search with Software AgentsAntti Halme
5/30
A∗ search1
I Informed best-first graph search algorithmI Single-source shortest path problem (SSSP)I Fringe nodes ranked by a cost estimate
f (n)︸︷︷︸cost estimate
= g(n)︸︷︷︸known distance
+ h(n)︸︷︷︸estimated remaining
I Admissible heuristics never overestimate, h(n) ≤ h∗(n)I Consistent heuristics guarantee optimality
h(n) ≤ d(n,n′) + h(n′)
I Optimally efficient on given heuristicI No algorithm can expand fewer nodes, except in
tie-breaks1
Peter Hart, Nils Nilsson, and Bertram Raphael. A Formal Basis for the Heuristic Determination of MinimumCost Paths. IEEE Transactions on Systems Science and Cybernetics, 4(2):100–107, 1968.
Cooperative Heuristic Search with Software AgentsAntti Halme
6/30
Constructing cooperative searchI The goal
I Search faster as a collective, leverage parallel hardwareI The hypothesis
I Cooperating search agents outperform agents in isolationI The idea
I Cooperation is communicationI Agents share and make good use of progress information
I The methodI A secondary ranking heuristic, a dynamic tiebreaker
I The mechanismI Asynchronous messaging, natural concurrency,
implicit randomness, nondeterministic exploration
Unus pro omnibus, omnes pro uno.
Cooperative Heuristic Search with Software AgentsAntti Halme
7/30
Outline
Cooperation and heuristic search
The A! algorithm
Solving n-puzzles with A!
Conclusion
Cooperative Heuristic Search with Software AgentsAntti Halme
7/30
Overview of A!
I A∗ + cooperation + concurrency = A! (a-bang)
I A cooperative heuristic search algorithm for the SSSPI Search agents run an upgraded version of A∗
I Shared information included as a secondary heuristic, hI Simplest case: share best encountered (n,h(n))-pair
I Vanilla A∗ maintains a node priority queue sorted by
f (n) = g(n) + h(n),
A! takes k equal-valued nodes from the top, ranks by h
Cooperative Heuristic Search with Software AgentsAntti Halme
8/30
Overview of A!
Gt
u
s
vw BA
A! search on a grid graph with two agents, A and B. Between nodes t and u,of equal distance to G (1st heuristic), agent A chooses u, because its closer(2nd heuristic) to best node marked with star, discovered by B.
Cooperative Heuristic Search with Software AgentsAntti Halme
9/30
Optimality of A!I Proof sketch based on an argument for A∗ itself2
I Cost estimates, f -values, are nondecreasing for all pathsI Consistent heuristic
I Optimal path to a node is found before the node is openedI Expansion from the fringe, edge connection
I Let S be a subset of all equally good candidate nodes, ESall explored nodes with a successor in S
1. f (n′) is the same for all n′ ∈ S, so for n ∈ ES
f (n′) = g(n′) + h(n′) = g(n) + d(n, n′) + h(n′) ≥ g(n) + h(n) = f (n)
2. For a node to be opened before an optimal path there isfound, another fringe node m with a better f -value isimplied, but then f (n) < f (m) and f (n) > f (m).
2Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach, 3rd ed. Prentice Hall Press, 2009.
Cooperative Heuristic Search with Software AgentsAntti Halme
10/30
A! cooperation architecture
I Agents process async messages at their own paceI Concurrency, nondeterminism; implicit randomness
I Best-effort notion of a globally superior reference node
I A message broker entity can facilitate communicationI Instantaneous diffusion is not the goalI Publish/subscribe-topology (cf. web chat)
I Short-circuiting best-update leads to a momentum effectI Prefer nodes near the previously explored onesI Manifest already with a single agent
I Actor abstraction, fast termination, diversification, . . .
Cooperative Heuristic Search with Software AgentsAntti Halme
11/30
A! details – A!Search, A!Solver, A!Select
Algorithm 1 : A!Search
Require: N > 0, NODE start , PREDICATE isGoal , HEURISTIC h, HEURISTIC hEnsure: path from start to nearest node satisfying isGoal is shortest possible
mb ← MsgBroker()for i = 0 to N do
workers[i]← A!Solver(mb.portOut ,mb.portIn, s, isGoal, h, h)end forfor each worker in workers in parallel do
worker .launch()end for
wait for terminationreturn path← getPath(workers)
Cooperative Heuristic Search with Software AgentsAntti Halme
12/30
Algorithm 2 : A!Solver
Require: PORT portIn, portOut , NODE start , PRED isGoal , HEUR h, h1: openHeap ← FibonacciHeap〈INTEGER, NODE〉2: closedSet ← Set〈NODE〉3: pathMap ← Map〈NODE, NODE〉4: current ← start5: repeat6: if isGoal(current) then terminate(current , start , pathMap) end if7: closedSet .add(current)8: for each n in current .getNeighbors() do9: if closedSet .contains(n) then continue end if
10: g ← current .g + dist(current , n)11: f ← g + h(n)12: improved ← openHeap.update(n, f )13: if improved then pathMap.update(n, current) end if14: end for15: peekList ← openHeap.getPeekList()16: if isEmpty(peekList) then terminate() end if17: current ← A!Select(peekList , portIn, portOut , h, h)18: openHeap.remove(current)19: until termination
Cooperative Heuristic Search with Software AgentsAntti Halme
13/30
Algorithm 3 : A!Select
Require: LIST peekList , PORT portIn, portOut , HEURISTIC h, hEnsure: select is the most promising node in peekList according to h on best
1: update, updateH ← asyncRecv(portIn)2: if updateH < bestH then3: best , bestH ← update, updateH4: end if5: select ← peekList .pop()6: selectD ← h(select , best)7: for each node in peekList do8: d ← h(node, best)9: if d < selectD then select , selectD ← node, d end if
10: end for11: if h(select) < bestH then12: best , bestH ← select , selectH13: asyncSend(portOut , {best , bestH})14: end if15: return select
Cooperative Heuristic Search with Software AgentsAntti Halme
14/30
A! tradeoffs and challenges
I Shared memory vs. distributedI Cooperation details: what to share and how, usageI Maintaining diversity: partitioning, hashing, . . .I IDA∗ and maintaining memory-efficiencyI Clean, fast terminationI Priority queue bottleneckI Agent abstraction: actors, coroutines, . . .
Cooperative Heuristic Search with Software AgentsAntti Halme
15/30
Outline
Cooperation and heuristic search
The A! algorithm
Solving n-puzzles with A!
Conclusion
Cooperative Heuristic Search with Software AgentsAntti Halme
15/30
The n-puzzle
23 1
456 78
A start state
2 314 5 67 8
A standard goal state
Cooperative Heuristic Search with Software AgentsAntti Halme
16/30
The n-puzzle
I Classic sliding tile puzzle with a long history3
I Turn the start state into the target state by sliding tiles
I Literally a toy problemI Finding k -bound sequence for general m ×m is NP-C4
I 8-puzzle avg branching factor ∼3, avg solution 22 stepsI 322 tree states, 9!
2 ≈ 180k graphI 15-puzzle 1.3× 1012, 24-puzzle ∼1025
I Hardest 8-p on 31 steps, 15-p 80, 24-p 152–208I cf. God’s Number for the Rubik’s Cube is 20 (Rokicki et al., 2010)
3Jerry Slocum and Dic Sonneveld. The 15 Puzzle Book. The Slocum Puzzle Foundation, 2006.
4Daniel Ratner and Manfred Warmuth. Finding a Shortest Solution for the NxN Extension of the 15-PUZZLE Is
Intractable. AAAI ’86, pages 168–172, 1986.
Cooperative Heuristic Search with Software AgentsAntti Halme
17/30
Heuristics for n-puzzle
I Misplaced tile countI Manhattan distance
I The sum of distances the tiles are from their targetpositions, counted as moves along the grid
I Manhattan distance with linear collisionsI Two tiles on the right row, but in the wrong order must pass
each other to reach their targetsI Walking distanceI Pattern databasesI Additive/disjoint pattern databases
I Store precomputed solutions to sub-problems; disjoint setsensure combination heuristic remains admissible
I Learned heuristics
Cooperative Heuristic Search with Software AgentsAntti Halme
18/30
Computational setting
I Randomly generated 8- and 15-puzzle instancesI Grouped by optimal path length
I Substantial variance within each group
I Simple A! implementation based on A∗ by Brian BorowskiI Focus on nodes opened by winning agent
I Correlates with runtime, total opened countI Comparison with A∗ and non-coop randomized A∗, A?
I A!Select→ A*Select, A?Select
I Test execution on Aalto SCI Science-IT project resourcesI Triton cluster of mixed multi-core blade serversI 2.6GHz Opteron 2435, 2.67GHz Xeon X5650, and 2.8GHz Xeon E5 2680 v2
Cooperative Heuristic Search with Software AgentsAntti Halme
19/30
Results: Cooperation benefit and scalability
0 100000 200000 300000 400000 500000 600000 700000 800000 900000 0
100000
200000
300000
400000
500000
600000
700000
800000
900000
Visited vertices for winning agent, vanilla A* search
Vis
ited v
ert
ices for
win
nin
g a
gent, c
oopera
tive A
! searc
h15−puzzle runs, length groups 40−59, 1−8 agents, PDB heuristic.
1 agent
2 agents
4 agents
6 agents
8 agents
Par
1 A lin. reg.
2 A lin. reg.
4 A lin. reg.
6 A lin. reg.
8 A lin. reg.
Relative performance of A∗ (x-axis) and A! (y-axis). The black line is par, sodata points below it represent instances for which A! performs better than A∗.Agent count is evaluated in five batches – 1, 2, 4, 6, and 8 agents – with therespective trend lines showing how the methods compare.
Cooperative Heuristic Search with Software AgentsAntti Halme
20/30
Results: Cooperation benefit and scalability
0 100000 200000 300000 400000 500000 600000 700000 800000 900000 0
100000
200000
300000
400000
500000
600000
700000
800000
900000
Visited vertices for winning agent, random A? search
Vis
ited v
ert
ices for
win
nin
g a
gent, c
oopera
tive A
! searc
h15−puzzle runs, length groups 40−59, 1−8 agents, PDB heuristic.
1 agent
2 agents
4 agents
6 agents
8 agents
Par
1 A lin. reg.
2 A lin. reg.
4 A lin. reg.
6 A lin. reg.
8 A lin. reg.
Relative performance of A? and A!. As before, the data points and trend linesbelow par-line reflect the benefit from cooperation. The slopes vary fromaround 7
10 to 810 , reflecting a 25− 40% performance difference in favor of A!.
Cooperative Heuristic Search with Software AgentsAntti Halme
21/30
Results: Cooperation benefit and scalability
0
50000
100000
150000
200000
250000
300000
350000
400000V
isited v
ert
ices for
win
nin
g a
gent, g
roup m
edia
n
A*
A?
G40
A!
A*
A?
G41
A!
A*
A?
G42
A!
A*
A?
G43
A!
A*
A?
G44
A!
A*
A?
G45
A!
A*
A?
G46
A!
A*
A?
G47
A!
A*
A?
G48
A!
A*
A?
G49
A!
A*
A?
G50
A!
A*
A?
G51
A!
A*
A?
G52
A!
A*
A?
G53
A!
A*
A?
G54
A!
A*
A?
G55
A!
A*
A?
G56
A!
A*
A?
G57
A!
A*
A?
G58
A!
A*
A?
G59
A!
15−puzzle runs, length groups 40−59, 1−8 agents, PDB heuristic.
A*, 1−8 agents
A?, 1 agent
A!, 1 agent
A?, 2 agents
A!, 2 agents
A?, 4 agents
A!, 4 agents
A?, 6 agents
A!, 6 agents
A?, 8 agents
A!, 8 agents
General A∗, A? and A! performance trends. 100 inst./group. 1, 2, 4, 6, and 8agents. Cooperation appears to be more beneficial with harder instances.
Cooperative Heuristic Search with Software AgentsAntti Halme
22/30
0
0.5
1
1.5
2
A*
A*
1A? 2A? 4A? 6A? 8A?
40
1A! 2A! 4A! 6A! 8A!
40
41
41
42
42
43
43
44
44
45
45
46
46
47
47
48
48
49
49
50
50
51
51
52
52
53
53
54
54
55
55
56
56
57
57
58
58
59
59
15−puzzle runs, length groups 40−59, 1−8 agents, PDB heuristic.
Vis
ite
d v
ert
ice
s f
or
win
nin
g a
ge
nt,
me
an
s b
y g
rou
p,
no
rma
lize
d t
o A
*
A*
A? normalized group mean, 1−−8A
A! normalized group mean, 1−−8A
A? mean of norm. group means, 1−−8A
A! mean of norm. group means, 1−−8A
I A∗-normalized length groupsI Scaling benefit from more agents
I Overlapping trend lines
I Rudimentary asymptotics
Cooperative Heuristic Search with Software AgentsAntti Halme
23/30
Results: Heuristic impact
0 100000 200000 300000 400000 500000 600000 700000 800000 900000 0
50000
100000
150000
200000
250000
300000
Visited vertices, LC heuristic
Vis
ited v
ert
ices, 6−
6−
3 P
DB
heuristic
15−puzzle runs, lengths 30−57, 1−8 agents, LC/PDB heuristic.
1 agents
2 agents
4 agents
6 agents
8 agents
1 agent lin. reg.
2 agents lin. reg.
4 agents lin. reg.
6 agents lin. reg.
8 agents lin. reg.
Heuristic comparison with data grouped by agent configuration. The trendlines being essentially the same indicates that the number of agents is notstrongly correlated with heuristic impact: regardless of method, two agentsbenefit from a better heuristic as much (or little) as eight agents.
Cooperative Heuristic Search with Software AgentsAntti Halme
24/30
Results: Heuristic impact
0 100000 200000 300000 400000 500000 600000 700000 800000 900000 0
50000
100000
150000
200000
250000
300000
Visited vertices, LC heuristic
Vis
ited v
ert
ices, 6−
6−
3 P
DB
heuristic
15−puzzle runs, lengths 30−57, 1−8 agents, LC/PDB heuristic.
A?, 1−8 agents
A!, 1−8 agents
A? lin. reg.
A! lin. reg.
Heuristic comparison with data grouped by method. The now more visibledifference in trend lines suggests that A! benefits more from the improvedheuristic than A?. The slope is about 1
8 for A?, and around 110 for A!.
Cooperative Heuristic Search with Software AgentsAntti Halme
25/30
Results: Path diversity
1 2 3 4 5 6 7 8 9
1
2
3
4
5
6
7
8
9
0
1
2
Solution path
1 2 3 4 5 6 7 8 9
1
2
3
4
5
6
7
8
9
200
400
600
800
1000
1200
1400
Vanilla A∗
1 2 3 4 5 6 7 8 9
1
2
3
4
5
6
7
8
9
200
400
600
800
1000
1200
1400
Random A?
1 2 3 4 5 6 7 8 9
1
2
3
4
5
6
7
8
9
200
400
600
800
1000
1200
1400
Cooperative A!
Visualization of A∗, A? and A! on 8-puzzle [8 6 7 2 5 4 3 0 1]. The heat-maps are derived from a 9× 9
self-organizing map trained on an optimal solution path of 31 steps, shown top left in the codomain.
METHOD A∗ A? A!FREQ. 1 2 3 4 1 2 3 4 1 2 3 4MEAN 648 414 269 4780 3290 1147 903 5245 562 600 1006 4544RATIO 0.11 0.07 0.04 0.78 0.31 0.11 0.09 0.50 0.08 0.09 0.15 0.68STD 476 363 235 525 1345 559 327 716 233 254 372 656
State visits for A∗, A? and A!. The table gives the mean, ratio and standard deviation of state visit frequencies from
ten iterations of A∗ on four agents with Manhattan heuristic.
Cooperative Heuristic Search with Software AgentsAntti Halme
26/30
Results: Path diversity
2 4 6 8 10 12 14 16
2
4
6
8
10
12
14
16
0
1
2
Solution path
2 4 6 8 10 12 14 16
2
4
6
8
10
12
14
16
0
0.5
1
1.5
2
2.5
3
3.5
4
4.5
x 104
Vanilla A∗
2 4 6 8 10 12 14 16
2
4
6
8
10
12
14
16
0
0.5
1
1.5
2
2.5
3
3.5
4
4.5
x 104
Random A?
2 4 6 8 10 12 14 16
2
4
6
8
10
12
14
16
0
0.5
1
1.5
2
2.5
3
3.5
4
4.5
x 104
Cooperative A!
16× 16 SOM visualization of A∗, A? and A! on 54-optimal 15-puzzle [12 8 6 3 13 4 2 7 0 9 15 5 14 10 11 1].
METHOD A∗ A? A!FREQ. 1 2 3 4 1 2 3 4 1 2 3 4MEAN 661 862 346 158124 91459 32347 13256 76659 40168 8406 4253 75356RATIO 0.00 0.01 0.00 0.99 0.43 0.15 0.06 0.36 0.31 0.07 0.03 0.59STD 352 1573 379 1420 27768 27092 14508 4702 29408 8212 1800 1040
State visits for A∗, A? and A! on 15-puzzle [12 8 6 3 13 4 2 7 0 9 15 5 14 10 11 1]. The table gives the
mean, ratio and standard deviation of state visit frequencies from ten iterations of A∗ on four agents with a 6-6-3
disjoint pattern database heuristic.
Cooperative Heuristic Search with Software AgentsAntti Halme
27/30
Results: Hybrid performance
I Diminishing returns from simply adding more agentsI Search overhead, new agents mostly tread on old paths
I Path diversity is essentialI Secondary heuristic performance depends on it
I Ideally, one would like to have the focused searchperformance of A! and the diversity apparent in A?
I A! + A? = A′′
I Simple threshold combination does not appear to workI A! seems to skip past the very states that A? wastes time on
Cooperative Heuristic Search with Software AgentsAntti Halme
28/30
1 5 9 13 17 21 25 29 33 37 41 45 49 53 57 61 65 69 73 77 81 85 89 93 97 0
10000
20000
30000
40000
50000
60000
70000
80000
90000
40
42
44
46
48
50
52
15−puzzle runs, HYBRID mode, even length groups 40−52, 4 agents, 5x, PDB heuristic.
Cutoff threshold p for A": Pr[ A! ] = p, Pr[ A? ] = 1−p.
Vis
ite
d v
ert
ice
s f
or
win
nin
g a
ge
nt,
me
an
s b
y g
rou
p
I Hybrid A′′ performance overmultiple p-thresholds
I Three iterations per instance pergroup in the 45–54 range
I A! likelihood over A? grows withp to the right
I The downward trending slopessuggest that adding someA? elements into A! does notimprove overall performance
Cooperative Heuristic Search with Software AgentsAntti Halme
29/30
Results: Discussion
I A! outperforms both vanilla A∗ and the non-cooperativerandom parallel search A∗-variant A?
I Nondeterministic cooperation emerging from asyncmessage exchange was shown to be beneficial
I With more agents, A! was shown to work betterI Rapidly diminishing returnsI Search overhead appeared to be an issue
I Adding simple explicit randomization did not helpI More robust diversity increasing mechanisms are needed
I A! demonstrated extra sensitivity for heuristic improvementI The reason for this is unclear
I SOM visualization proved thought-provoking, but lo-fiI More appropriate visualizations worth exploring
Cooperative Heuristic Search with Software AgentsAntti Halme
30/30
Outline
Cooperation and heuristic search
The A! algorithm
Solving n-puzzles with A!
Conclusion
Cooperative Heuristic Search with Software AgentsAntti Halme
30/30
Thesis summary
I Cooperation approach to parallel computingI Heuristic search context, parallel A∗
I Cooperation as a secondary heuristic
I Cooperation is communicationI Asynchronity, concurrency and nondeterminismI Ripple effects, implicit randomness
I The A! algorithm
I Empirical study, computational experimentsI Performance, scalabilityI Path diversity, heuristic sensitivityI Hybrid algorithm
Cooperative Heuristic Search with Software AgentsAntti Halme
30/30
Related workI Alba, E. (Ed.). (2005). Parallel Metaheuristics. John Wiley & Sons. doi:10.1002/0471739383I Alba, E., Luque, G., & Nesmachnow, S. (2013). Parallel Metaheuristics: Recent Advances and New Trends.
Intl. Trans. in Operational Research, 20(1), 1–48. doi:10.1111/j.1475-3995.2012.00862.xI Barbucha, D. (2012). Search Modes for the Cooperative Multi-agent System Solving the Vehicle Routing
Problem. Neurocomputing, 88, 13–23. doi:10.1016/j.neucom.2011.07.032I Burns, E., Lemons, S., Ruml, W., & Zhou, R. (2010). Best-First Heuristic Search for Multicore Machines.
Journal of Artificial Intelligence Research, 39, 689–743. doi:10.1613/jair.3094I Clearwater, S. H., Huberman, B. A., & Hogg, T. (1991). Cooperative Solution of Constraint Satisfaction
Problems. Science, 254(5035), 1181–3. doi:10.1126/science.254.5035.1181I Crainic, T. G., & Toulouse, M. (2008). Explicit and Emergent Cooperation Schemes for Search Algorithms.
In Proc. of the 2nd Intl. Conf. on Learning and Intelligent Optimization (LION ’07) (pp. 95–109).doi:10.1007/978-3-540-92695-5_8
I Hogg, T., & Huberman, B. A. (1993). Better Than the Best: The Power of Cooperation. In 1992 Lectures inComplex Systems (Vol. V, pp. 165–184). Addison-Wesley.
I Hogg, T., & Williams, C. P. (1993). Solving the Really Hard Problems with Cooperative Search. In Proc. ofthe 11th National Conference on Artificial Intelligence (AAAI ’93) (pp. 231–236).
I Kishimoto, A., Fukunaga, A., & Botea, A. (2013). Evaluation of a Simple, Scalable, Parallel Best-FirstSearch Strategy. Artificial Intelligence, 195(0), 222–248. doi:10.1016/j.artint.2012.10.007
I Nitschke, G. (2005). Emergence of Cooperation: State of the Art. Artificial Life, 11(3), 367–96.doi:10.1162/1064546054407194
I Halme, A. (2014). Cooperative Heuristic Search with Software Agents. M.Sc. thesis, Aalto University.