Online Evolution for Multi-Action Adversarial Games
Niels Justesen IT University of Copenhagen
Tobias Mahlmann Lund University
Julian Togelius New York University
Hero Academy
https://github.com/njustesen/hero-aicademy
● 5 action points each turn
○ Movement
○ Healing
○ Attacking
○ Equipping
○ Swapping
● Branching factor:
○ One action: ~60
○ One turn: 60^5 ≈ 7.78 × 10^8 (about 778,000,000)
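The turn-level figure follows from raising the per-action branching factor to the number of action points. A minimal Python check, assuming exactly 60 options for every action (the slide gives only an approximate value, and the function name is illustrative):

```python
ACTION_POINTS = 5           # action points per turn (from the slide)
BRANCHING_PER_ACTION = 60   # approximate options per action (from the slide)


def turn_branching_factor(per_action: int, action_points: int) -> int:
    """Rough count of distinct action sequences in a single turn."""
    return per_action ** action_points


turn_sequences = turn_branching_factor(BRANCHING_PER_ACTION, ACTION_POINTS)
print(f"{turn_sequences:,}")           # exact count under the assumption above
print(f"{turn_sequences:.2e}")         # same value in scientific notation
```

This is an upper-bound style estimate: the number of legal actions changes as the turn progresses, so the real count varies from state to state.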
Hero Academy
Hand-written heuristic
(a) Bonus added to units with items.
(b) Bonus added to units on special squares.
Search algorithms in Hero Academy
● 1-ply search
○ Greedy on action-level
● 5-ply (1 turn) depth-first search
○ ~500,000 unique outcomes evaluated each turn (6 seconds)
○ Action pruning and sorting
○ Similar to MiniMax search depth-limited to 5 plies
○ Greedy on turn-level
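The difference between being greedy on action-level and greedy on turn-level can be sketched on a toy problem. The game below is hypothetical and is not Hero Academy: a state is a (score, power) pair, "strike" adds the current power to the score, and "train" raises power by one. The heuristic only looks at the score. The sketch omits the action pruning and sorting mentioned above and does not model an opponent.

```python
from typing import Callable, Dict, List, Tuple

State = Tuple[int, int]  # (score, power) in a hypothetical toy game

# Hypothetical actions: "strike" cashes in power now, "train" pays off later.
ACTIONS: Dict[str, Callable[[State], State]] = {
    "strike": lambda s: (s[0] + s[1], s[1]),
    "train": lambda s: (s[0], s[1] + 1),
}


def heuristic(state: State) -> int:
    """Stand-in evaluation function: only the score counts."""
    return state[0]


def greedy_action_level(state: State, action_points: int = 5) -> Tuple[List[str], int]:
    """1-ply search: pick the best-looking action, one action at a time."""
    plan: List[str] = []
    for _ in range(action_points):
        name = max(ACTIONS, key=lambda a: heuristic(ACTIONS[a](state)))
        state = ACTIONS[name](state)
        plan.append(name)
    return plan, heuristic(state)


def dfs_turn_level(state: State, action_points: int = 5) -> Tuple[List[str], int]:
    """Exhaustive depth-first search over one full turn; evaluates end states only."""
    if action_points == 0:
        return [], heuristic(state)
    best_plan: List[str] = []
    best_value = float("-inf")
    for name, apply_action in ACTIONS.items():
        plan, value = dfs_turn_level(apply_action(state), action_points - 1)
        if value > best_value:
            best_plan, best_value = [name] + plan, value
    return best_plan, best_value


start: State = (0, 1)
greedy_plan, greedy_value = greedy_action_level(start)
dfs_plan, dfs_value = dfs_turn_level(start)
print(greedy_plan, greedy_value)
print(dfs_plan, dfs_value)
```

In this toy the 1-ply search never trains, because training does not raise the score immediately, while the full-turn search finds a plan that trains first and strikes afterwards. With only two actions the full turn has 32 sequences; with ~60 actions per step the same exhaustive search is only feasible with pruning and a time budget.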
Monte Carlo Tree Search
Chaslot, Guillaume, et al. "Monte-Carlo Tree Search: A New Framework for Game AI." AIIDE. 2008.
MCTS in Hero Academy
● No longer an anytime algorithm
● Rollouts have negative effects - use the evaluation function
● ~200 unique outcomes evaluated each turn (6 seconds)
● Side effect of best-first searches in multi-action games: they avoid searching the opponent's turn
Online Evolution
Rolling Horizon Evolution
Perez, Diego, et al. "Rolling horizon evolution versus tree search for navigation in single-player real-time games." 2013.
Online Evolution
● Population size of 100
● 50% elitism
● Random selection of parents
● Uniform crossover
● 10% mutation rate
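The parameters above can be put together in a minimal Python sketch of one turn of online evolution. Several things here are assumptions rather than the authors' implementation: a genome is a list of 5 integer action ids in the range 0 to 59, every id is always legal, "10% mutation rate" is read as the chance that an offspring has one gene replaced, and `toy_fitness` with its `TARGET` is a hypothetical stand-in for executing the actions and calling the heuristic.

```python
import random
from typing import Callable, List, Optional, Tuple

Genome = List[int]  # one action id per action point


def evolve_turn(
    fitness: Callable[[Genome], float],
    n_actions: int = 60,
    genome_len: int = 5,
    pop_size: int = 100,
    elitism: float = 0.5,
    mutation_rate: float = 0.1,
    generations: int = 50,
    rng: Optional[random.Random] = None,
) -> Tuple[Genome, List[float]]:
    """Evolve one turn's action sequence; returns the best genome and fitness history."""
    rng = rng or random.Random()
    population = [
        [rng.randrange(n_actions) for _ in range(genome_len)]
        for _ in range(pop_size)
    ]
    n_elite = int(pop_size * elitism)
    history: List[float] = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        elite = population[:n_elite]            # 50% elitism: top half survives
        offspring: List[Genome] = []
        while len(elite) + len(offspring) < pop_size:
            mum, dad = rng.sample(elite, 2)     # random selection of parents
            child = [rng.choice(pair) for pair in zip(mum, dad)]  # uniform crossover
            if rng.random() < mutation_rate:    # assumed: mutate one random gene
                child[rng.randrange(genome_len)] = rng.randrange(n_actions)
            offspring.append(child)
        population = elite + offspring
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


# Hypothetical fitness: closeness to an arbitrary target sequence (0 is optimal).
TARGET: Genome = [7, 42, 13, 0, 59]


def toy_fitness(genome: Genome) -> float:
    return -float(sum(abs(g - t) for g, t in zip(genome, TARGET)))


best, history = evolve_turn(toy_fitness, rng=random.Random(0))
print(best, history[-1])
```

Because the elite is carried over unchanged, the best fitness never decreases from one generation to the next. In a real multi-action game, crossover and mutation can produce action sequences that are illegal in the current state, so some form of legality check or repair would be needed; the sketch ignores this. A real agent would also stop on a time budget (6 seconds per turn in the slides) rather than after a fixed number of generations.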
Online Evolution in Hero Academy
● ~10,000 unique outcomes evaluated each turn (6 seconds)
● ~3,500 generations each turn on average
Future work
● Considering opponent actions
○ Rollouts
○ Competitive co-evolution
● Evolving heuristics
○ Parameter tuning of existing heuristic
○ 1-ply evolution
○ NEAT / Deep Learning
● MCTS variations for Hero Academy
● Online Evolution in multi-action games with more actions