cooperative heuristic search with software agentsusers.ics.aalto.fi/ahalme/coop/presis_halme.pdf ·...

Cooperative Heuristic Search withSoftware Agents

Antti [email protected]

April 24, 2014

Cooperative Heuristic Search with Software AgentsAntti Halme

2/30

Outline

Cooperation and heuristic search

The A! algorithm

Solving n-puzzles with A!

Conclusion


3/30

Outline


The A! algorithm


Conclusion


3/30

Cooperation

co·op·er·ate

I to work togetherI to work with another person or group to do somethingI to be helpful by doing what someone asks or tells you to doI to act in a way that makes something possible or likelyI to produce the right conditions for something to happen

(Merriam-Webster)


4/30

Heuristic Search

I Searching is a fundamental computing taskI A heuristic provides focus to searching effortsI The most well-known heuristic search algorithm is the A∗

Uninformed search Heuristic search


5/30

A∗ search1

I Informed best-first graph search algorithmI Single-source shortest path problem (SSSP)I Fringe nodes ranked by a cost estimate

f (n)︸︷︷︸cost estimate

= g(n)︸︷︷︸known distance

+ h(n)︸︷︷︸estimated remaining

I Admissible heuristics never overestimate, h(n) ≤ h∗(n)I Consistent heuristics guarantee optimality

h(n) ≤ d(n,n′) + h(n′)

I Optimally efficient on given heuristicI No algorithm can expand fewer nodes, except in

tie-breaks1

Peter Hart, Nils Nilsson, and Bertram Raphael. A Formal Basis for the Heuristic Determination of MinimumCost Paths. IEEE Transactions on Systems Science and Cybernetics, 4(2):100–107, 1968.


6/30

Constructing cooperative searchI The goal

I Search faster as a collective, leverage parallel hardwareI The hypothesis

I Cooperating search agents outperform agents in isolationI The idea

I Cooperation is communicationI Agents share and make good use of progress information

I The methodI A secondary ranking heuristic, a dynamic tiebreaker

I The mechanismI Asynchronous messaging, natural concurrency,

implicit randomness, nondeterministic exploration

Unus pro omnibus, omnes pro uno.


7/30

Outline


The A! algorithm


Conclusion


7/30

Overview of A!

I A∗ + cooperation + concurrency = A! (a-bang)

I A cooperative heuristic search algorithm for the SSSPI Search agents run an upgraded version of A∗

I Shared information included as a secondary heuristic, hI Simplest case: share best encountered (n,h(n))-pair

I Vanilla A∗ maintains a node priority queue sorted by

f (n) = g(n) + h(n),

A! takes k equal-valued nodes from the top, ranks by h


8/30

Overview of A!

Gt

u

s

vw BA

A! search on a grid graph with two agents, A and B. Between nodes t and u,of equal distance to G (1st heuristic), agent A chooses u, because its closer(2nd heuristic) to best node marked with star, discovered by B.


9/30

Optimality of A!I Proof sketch based on an argument for A∗ itself2

I Cost estimates, f -values, are nondecreasing for all pathsI Consistent heuristic

I Optimal path to a node is found before the node is openedI Expansion from the fringe, edge connection

I Let S be a subset of all equally good candidate nodes, ESall explored nodes with a successor in S

1. f (n′) is the same for all n′ ∈ S, so for n ∈ ES

f (n′) = g(n′) + h(n′) = g(n) + d(n, n′) + h(n′) ≥ g(n) + h(n) = f (n)

2. For a node to be opened before an optimal path there isfound, another fringe node m with a better f -value isimplied, but then f (n) < f (m) and f (n) > f (m).

2Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach, 3rd ed. Prentice Hall Press, 2009.


10/30

A! cooperation architecture

I Agents process async messages at their own paceI Concurrency, nondeterminism; implicit randomness

I Best-effort notion of a globally superior reference node

I A message broker entity can facilitate communicationI Instantaneous diffusion is not the goalI Publish/subscribe-topology (cf. web chat)

I Short-circuiting best-update leads to a momentum effectI Prefer nodes near the previously explored onesI Manifest already with a single agent

I Actor abstraction, fast termination, diversification, . . .


11/30

A! details – A!Search, A!Solver, A!Select

Algorithm 1 : A!Search

Require: N > 0, NODE start , PREDICATE isGoal , HEURISTIC h, HEURISTIC hEnsure: path from start to nearest node satisfying isGoal is shortest possible

mb ← MsgBroker()for i = 0 to N do

workers[i]← A!Solver(mb.portOut ,mb.portIn, s, isGoal, h, h)end forfor each worker in workers in parallel do

worker .launch()end for

wait for terminationreturn path← getPath(workers)


12/30

Algorithm 2 : A!Solver

Require: PORT portIn, portOut , NODE start , PRED isGoal , HEUR h, h1: openHeap ← FibonacciHeap〈INTEGER, NODE〉2: closedSet ← Set〈NODE〉3: pathMap ← Map〈NODE, NODE〉4: current ← start5: repeat6: if isGoal(current) then terminate(current , start , pathMap) end if7: closedSet .add(current)8: for each n in current .getNeighbors() do9: if closedSet .contains(n) then continue end if

10: g ← current .g + dist(current , n)11: f ← g + h(n)12: improved ← openHeap.update(n, f )13: if improved then pathMap.update(n, current) end if14: end for15: peekList ← openHeap.getPeekList()16: if isEmpty(peekList) then terminate() end if17: current ← A!Select(peekList , portIn, portOut , h, h)18: openHeap.remove(current)19: until termination


13/30

Algorithm 3 : A!Select

Require: LIST peekList , PORT portIn, portOut , HEURISTIC h, hEnsure: select is the most promising node in peekList according to h on best

1: update, updateH ← asyncRecv(portIn)2: if updateH < bestH then3: best , bestH ← update, updateH4: end if5: select ← peekList .pop()6: selectD ← h(select , best)7: for each node in peekList do8: d ← h(node, best)9: if d < selectD then select , selectD ← node, d end if

10: end for11: if h(select) < bestH then12: best , bestH ← select , selectH13: asyncSend(portOut , {best , bestH})14: end if15: return select


14/30

A! tradeoffs and challenges

I Shared memory vs. distributedI Cooperation details: what to share and how, usageI Maintaining diversity: partitioning, hashing, . . .I IDA∗ and maintaining memory-efficiencyI Clean, fast terminationI Priority queue bottleneckI Agent abstraction: actors, coroutines, . . .


15/30

Outline


The A! algorithm


Conclusion


15/30

The n-puzzle

23 1

456 78

A start state

2 314 5 67 8

A standard goal state


16/30

The n-puzzle

I Classic sliding tile puzzle with a long history3

I Turn the start state into the target state by sliding tiles

I Literally a toy problemI Finding k -bound sequence for general m ×m is NP-C4

I 8-puzzle avg branching factor ∼3, avg solution 22 stepsI 322 tree states, 9!

2 ≈ 180k graphI 15-puzzle 1.3× 1012, 24-puzzle ∼1025

I Hardest 8-p on 31 steps, 15-p 80, 24-p 152–208I cf. God’s Number for the Rubik’s Cube is 20 (Rokicki et al., 2010)

3Jerry Slocum and Dic Sonneveld. The 15 Puzzle Book. The Slocum Puzzle Foundation, 2006.

4Daniel Ratner and Manfred Warmuth. Finding a Shortest Solution for the NxN Extension of the 15-PUZZLE Is

Intractable. AAAI ’86, pages 168–172, 1986.


17/30

Heuristics for n-puzzle

I Misplaced tile countI Manhattan distance

I The sum of distances the tiles are from their targetpositions, counted as moves along the grid

I Manhattan distance with linear collisionsI Two tiles on the right row, but in the wrong order must pass

each other to reach their targetsI Walking distanceI Pattern databasesI Additive/disjoint pattern databases

I Store precomputed solutions to sub-problems; disjoint setsensure combination heuristic remains admissible

I Learned heuristics


18/30

Computational setting

I Randomly generated 8- and 15-puzzle instancesI Grouped by optimal path length

I Substantial variance within each group

I Simple A! implementation based on A∗ by Brian BorowskiI Focus on nodes opened by winning agent

I Correlates with runtime, total opened countI Comparison with A∗ and non-coop randomized A∗, A?

I A!Select→ A*Select, A?Select

I Test execution on Aalto SCI Science-IT project resourcesI Triton cluster of mixed multi-core blade serversI 2.6GHz Opteron 2435, 2.67GHz Xeon X5650, and 2.8GHz Xeon E5 2680 v2


19/30

Results: Cooperation benefit and scalability

0 100000 200000 300000 400000 500000 600000 700000 800000 900000 0

100000

200000

300000

400000

500000

600000

700000

800000

900000

Visited vertices for winning agent, vanilla A* search

Vis

ited v

ert

ices for

win

nin

g a

gent, c

oopera

tive A

! searc

h15−puzzle runs, length groups 40−59, 1−8 agents, PDB heuristic.

1 agent

2 agents

4 agents

6 agents

8 agents

Par

1 A lin. reg.

2 A lin. reg.

4 A lin. reg.

6 A lin. reg.

8 A lin. reg.

Relative performance of A∗ (x-axis) and A! (y-axis). The black line is par, sodata points below it represent instances for which A! performs better than A∗.Agent count is evaluated in five batches – 1, 2, 4, 6, and 8 agents – with therespective trend lines showing how the methods compare.


20/30


0 100000 200000 300000 400000 500000 600000 700000 800000 900000 0

100000

200000

300000

400000

500000

600000

700000

800000

900000

Visited vertices for winning agent, random A? search

Vis

ited v

ert

ices for

win

nin

g a

gent, c

oopera

tive A

! searc

h15−puzzle runs, length groups 40−59, 1−8 agents, PDB heuristic.

1 agent

2 agents

4 agents

6 agents

8 agents

Par

1 A lin. reg.

2 A lin. reg.

4 A lin. reg.

6 A lin. reg.

8 A lin. reg.

Relative performance of A? and A!. As before, the data points and trend linesbelow par-line reflect the benefit from cooperation. The slopes vary fromaround 7

10 to 810 , reflecting a 25− 40% performance difference in favor of A!.


21/30


0

50000

100000

150000

200000

250000

300000

350000

400000V

isited v

ert

ices for

win

nin

g a

gent, g

roup m

edia

n

A*

A?

G40

A!

A*

A?

G41

A!

A*

A?

G42

A!

A*

A?

G43

A!

A*

A?

G44

A!

A*

A?

G45

A!

A*

A?

G46

A!

A*

A?

G47

A!

A*

A?

G48

A!

A*

A?

G49

A!

A*

A?

G50

A!

A*

A?

G51

A!

A*

A?

G52

A!

A*

A?

G53

A!

A*

A?

G54

A!

A*

A?

G55

A!

A*

A?

G56

A!

A*

A?

G57

A!

A*

A?

G58

A!

A*

A?

G59

A!

15−puzzle runs, length groups 40−59, 1−8 agents, PDB heuristic.

A*, 1−8 agents

A?, 1 agent

A!, 1 agent

A?, 2 agents

A!, 2 agents

A?, 4 agents

A!, 4 agents

A?, 6 agents

A!, 6 agents

A?, 8 agents

A!, 8 agents

General A∗, A? and A! performance trends. 100 inst./group. 1, 2, 4, 6, and 8agents. Cooperation appears to be more beneficial with harder instances.


22/30

0

0.5

1

1.5

2

A*

A*

1A? 2A? 4A? 6A? 8A?

40

1A! 2A! 4A! 6A! 8A!

40

41

41

42

42

43

43

44

44

45

45

46

46

47

47

48

48

49

49

50

50

51

51

52

52

53

53

54

54

55

55

56

56

57

57

58

58

59

59

15−puzzle runs, length groups 40−59, 1−8 agents, PDB heuristic.

Vis

ite

d v

ert

ice

s f

or

win

nin

g a

ge

nt,

me

an

s b

y g

rou

p,

no

rma

lize

d t

o A

*

A*

A? normalized group mean, 1−−8A

A! normalized group mean, 1−−8A

A? mean of norm. group means, 1−−8A

A! mean of norm. group means, 1−−8A

I A∗-normalized length groupsI Scaling benefit from more agents

I Overlapping trend lines

I Rudimentary asymptotics


23/30

Results: Heuristic impact

0 100000 200000 300000 400000 500000 600000 700000 800000 900000 0

50000

100000

150000

200000

250000

300000

Visited vertices, LC heuristic

Vis

ited v

ert

ices, 6−

6−

3 P

DB

heuristic

15−puzzle runs, lengths 30−57, 1−8 agents, LC/PDB heuristic.

1 agents

2 agents

4 agents

6 agents

8 agents

1 agent lin. reg.

2 agents lin. reg.

4 agents lin. reg.

6 agents lin. reg.

8 agents lin. reg.

Heuristic comparison with data grouped by agent configuration. The trendlines being essentially the same indicates that the number of agents is notstrongly correlated with heuristic impact: regardless of method, two agentsbenefit from a better heuristic as much (or little) as eight agents.


24/30

Results: Heuristic impact

0 100000 200000 300000 400000 500000 600000 700000 800000 900000 0

50000

100000

150000

200000

250000

300000

Visited vertices, LC heuristic

Vis

ited v

ert

ices, 6−

6−

3 P

DB

heuristic

15−puzzle runs, lengths 30−57, 1−8 agents, LC/PDB heuristic.

A?, 1−8 agents

A!, 1−8 agents

A? lin. reg.

A! lin. reg.

Heuristic comparison with data grouped by method. The now more visibledifference in trend lines suggests that A! benefits more from the improvedheuristic than A?. The slope is about 1

8 for A?, and around 110 for A!.


25/30

Results: Path diversity

1 2 3 4 5 6 7 8 9

1

2

3

4

5

6

7

8

9

0

1

2

Solution path

1 2 3 4 5 6 7 8 9

1

2

3

4

5

6

7

8

9

200

400

600

800

1000

1200

1400

Vanilla A∗

1 2 3 4 5 6 7 8 9

1

2

3

4

5

6

7

8

9

200

400

600

800

1000

1200

1400

Random A?

1 2 3 4 5 6 7 8 9

1

2

3

4

5

6

7

8

9

200

400

600

800

1000

1200

1400

Cooperative A!

Visualization of A∗, A? and A! on 8-puzzle [8 6 7 2 5 4 3 0 1]. The heat-maps are derived from a 9× 9

self-organizing map trained on an optimal solution path of 31 steps, shown top left in the codomain.

METHOD A∗ A? A!FREQ. 1 2 3 4 1 2 3 4 1 2 3 4MEAN 648 414 269 4780 3290 1147 903 5245 562 600 1006 4544RATIO 0.11 0.07 0.04 0.78 0.31 0.11 0.09 0.50 0.08 0.09 0.15 0.68STD 476 363 235 525 1345 559 327 716 233 254 372 656

State visits for A∗, A? and A!. The table gives the mean, ratio and standard deviation of state visit frequencies from

ten iterations of A∗ on four agents with Manhattan heuristic.


26/30

Results: Path diversity

2 4 6 8 10 12 14 16

2

4

6

8

10

12

14

16

0

1

2

Solution path

2 4 6 8 10 12 14 16

2

4

6

8

10

12

14

16

0

0.5

1

1.5

2

2.5

3

3.5

4

4.5

x 104

Vanilla A∗

2 4 6 8 10 12 14 16

2

4

6

8

10

12

14

16

0

0.5

1

1.5

2

2.5

3

3.5

4

4.5

x 104

Random A?

2 4 6 8 10 12 14 16

2

4

6

8

10

12

14

16

0

0.5

1

1.5

2

2.5

3

3.5

4

4.5

x 104

Cooperative A!

16× 16 SOM visualization of A∗, A? and A! on 54-optimal 15-puzzle [12 8 6 3 13 4 2 7 0 9 15 5 14 10 11 1].

METHOD A∗ A? A!FREQ. 1 2 3 4 1 2 3 4 1 2 3 4MEAN 661 862 346 158124 91459 32347 13256 76659 40168 8406 4253 75356RATIO 0.00 0.01 0.00 0.99 0.43 0.15 0.06 0.36 0.31 0.07 0.03 0.59STD 352 1573 379 1420 27768 27092 14508 4702 29408 8212 1800 1040

State visits for A∗, A? and A! on 15-puzzle [12 8 6 3 13 4 2 7 0 9 15 5 14 10 11 1]. The table gives the

mean, ratio and standard deviation of state visit frequencies from ten iterations of A∗ on four agents with a 6-6-3

disjoint pattern database heuristic.


27/30

Results: Hybrid performance

I Diminishing returns from simply adding more agentsI Search overhead, new agents mostly tread on old paths

I Path diversity is essentialI Secondary heuristic performance depends on it

I Ideally, one would like to have the focused searchperformance of A! and the diversity apparent in A?

I A! + A? = A′′

I Simple threshold combination does not appear to workI A! seems to skip past the very states that A? wastes time on


28/30

1 5 9 13 17 21 25 29 33 37 41 45 49 53 57 61 65 69 73 77 81 85 89 93 97 0

10000

20000

30000

40000

50000

60000

70000

80000

90000

40

42

44

46

48

50

52

15−puzzle runs, HYBRID mode, even length groups 40−52, 4 agents, 5x, PDB heuristic.

Cutoff threshold p for A": Pr[ A! ] = p, Pr[ A? ] = 1−p.

Vis

ite

d v

ert

ice

s f

or

win

nin

g a

ge

nt,

me

an

s b

y g

rou

p

I Hybrid A′′ performance overmultiple p-thresholds

I Three iterations per instance pergroup in the 45–54 range

I A! likelihood over A? grows withp to the right

I The downward trending slopessuggest that adding someA? elements into A! does notimprove overall performance


29/30

Results: Discussion

I A! outperforms both vanilla A∗ and the non-cooperativerandom parallel search A∗-variant A?

I Nondeterministic cooperation emerging from asyncmessage exchange was shown to be beneficial

I With more agents, A! was shown to work betterI Rapidly diminishing returnsI Search overhead appeared to be an issue

I Adding simple explicit randomization did not helpI More robust diversity increasing mechanisms are needed

I A! demonstrated extra sensitivity for heuristic improvementI The reason for this is unclear

I SOM visualization proved thought-provoking, but lo-fiI More appropriate visualizations worth exploring


30/30

Outline


The A! algorithm


Conclusion


30/30

Thesis summary

I Cooperation approach to parallel computingI Heuristic search context, parallel A∗

I Cooperation as a secondary heuristic

I Cooperation is communicationI Asynchronity, concurrency and nondeterminismI Ripple effects, implicit randomness

I The A! algorithm

I Empirical study, computational experimentsI Performance, scalabilityI Path diversity, heuristic sensitivityI Hybrid algorithm


30/30

Related workI Alba, E. (Ed.). (2005). Parallel Metaheuristics. John Wiley & Sons. doi:10.1002/0471739383I Alba, E., Luque, G., & Nesmachnow, S. (2013). Parallel Metaheuristics: Recent Advances and New Trends.

Intl. Trans. in Operational Research, 20(1), 1–48. doi:10.1111/j.1475-3995.2012.00862.xI Barbucha, D. (2012). Search Modes for the Cooperative Multi-agent System Solving the Vehicle Routing

Problem. Neurocomputing, 88, 13–23. doi:10.1016/j.neucom.2011.07.032I Burns, E., Lemons, S., Ruml, W., & Zhou, R. (2010). Best-First Heuristic Search for Multicore Machines.

Journal of Artificial Intelligence Research, 39, 689–743. doi:10.1613/jair.3094I Clearwater, S. H., Huberman, B. A., & Hogg, T. (1991). Cooperative Solution of Constraint Satisfaction

Problems. Science, 254(5035), 1181–3. doi:10.1126/science.254.5035.1181I Crainic, T. G., & Toulouse, M. (2008). Explicit and Emergent Cooperation Schemes for Search Algorithms.

In Proc. of the 2nd Intl. Conf. on Learning and Intelligent Optimization (LION ’07) (pp. 95–109).doi:10.1007/978-3-540-92695-5_8

I Hogg, T., & Huberman, B. A. (1993). Better Than the Best: The Power of Cooperation. In 1992 Lectures inComplex Systems (Vol. V, pp. 165–184). Addison-Wesley.

I Hogg, T., & Williams, C. P. (1993). Solving the Really Hard Problems with Cooperative Search. In Proc. ofthe 11th National Conference on Artificial Intelligence (AAAI ’93) (pp. 231–236).

I Kishimoto, A., Fukunaga, A., & Botea, A. (2013). Evaluation of a Simple, Scalable, Parallel Best-FirstSearch Strategy. Artificial Intelligence, 195(0), 222–248. doi:10.1016/j.artint.2012.10.007

I Nitschke, G. (2005). Emergence of Cooperation: State of the Art. Artificial Life, 11(3), 367–96.doi:10.1162/1064546054407194

I Halme, A. (2014). Cooperative Heuristic Search with Software Agents. M.Sc. thesis, Aalto University.

doi:10.1002/0471739383

doi:10.1111/j.1475-3995.2012.00862.x

doi:10.1016/j.neucom.2011.07.032

doi:10.1613/jair.3094

doi:10.1126/science.254.5035.1181

doi:10.1007/978-3-540-92695-5_8

doi:10.1016/j.artint.2012.10.007

doi:10.1162/1064546054407194

cooperative heuristic search with software agentsusers.ics.aalto.fi/ahalme/coop/presis_halme.pdf ·...

Documents