Towards cognitively-plausible
game playing systems
Jacek Mańdziuk
Faculty of Mathematics and Information Science, Warsaw University of Technology
Yonsei University, Seoul, Korea, 29/03/2011
Agenda
• Brief introduction to board games
• State-of-the-art achievements in board games
• Quo vadis board game research?
• Cognitively-plausible game playing systems
• Intuitive playing – a real challenge
• Examples of intuitively playing systems
• Conclusions
About me
• Associate Professor, Warsaw University of Technology, Poland
• Knowledge-free and learning-based methods in solving problems
• Application to games, financial modeling, bioinformatics, optimization
Board games - basics
Game         BF    State space   Length   Tree size
Chess        35    10^50         80       10^123
Checkers     3     5x10^20       70       10^31
Othello      7     10^28         60       10^50
Go (Baduk)   250   10^172        150      10^360
Basic comparison of the most popular board games (BF = branching factor)
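The tree-size figures in the comparison above are roughly what the estimate BF^length gives. The short Python check below assumes that estimate (the slide does not state how the tree sizes were derived) and uses only the numbers listed; the names `GAMES` and `tree_size_exponent` are illustrative.

```python
import math

# (branching factor, game length) exactly as listed in the comparison
GAMES = {
    "Chess": (35, 80),
    "Checkers": (3, 70),
    "Othello": (7, 60),
    "Go (Baduk)": (250, 150),
}


def tree_size_exponent(bf: int, length: int) -> float:
    """Return log10(bf ** length), a rough estimate of the game-tree size."""
    return length * math.log10(bf)


for name, (bf, length) in GAMES.items():
    print(f"{name}: roughly 10^{tree_size_exponent(bf, length):.1f}")
```

For chess, Othello and Go the estimate lands within about one order of magnitude of the listed tree size; for checkers it is about two orders higher, so the listed value presumably comes from a more careful count.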
• Perfect-information (vs. imperfect)
• Two-player (vs. multi-player)
• Zero-sum (vs. expected payoff)
• Deterministic (vs. non-deterministic)
• Turn-based (vs. simultaneously-played)
Minimax algorithm
von Neumann and Morgenstern (1944)
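A minimal sketch of minimax, assuming a toy subtraction game that is not part of the talk: players alternately remove 1-3 stones from a pile, and whoever takes the last stone wins. The names `minimax` and `best_move` are illustrative.

```python
def minimax(stones: int, maximizing: bool) -> int:
    """Return +1 if MAX wins with perfect play from this position, -1 if MIN wins."""
    if stones == 0:
        # The previous player took the last stone and therefore won.
        return -1 if maximizing else 1
    values = [
        minimax(stones - take, not maximizing)
        for take in (1, 2, 3)
        if take <= stones
    ]
    # MAX picks the best outcome for itself, MIN the worst outcome for MAX.
    return max(values) if maximizing else min(values)


def best_move(stones: int) -> int:
    """Return the number of stones MAX should take when it is MAX's turn."""
    legal = [take for take in (1, 2, 3) if take <= stones]
    return max(legal, key=lambda take: minimax(stones - take, False))
```

The full tree is searched here without any depth limit, which is feasible only for very small games; for chess-sized games the search is cut off and an evaluation function is applied at the horizon.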
Evaluation function
Claude Shannon (1950) – linear weighted evaluation function;
algorithms of type A and type B;
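\[
P(s, w) =
\begin{cases}
\mathrm{MAX}, & \text{if } s \text{ is a win} \\
\mathrm{MIN}, & \text{if } s \text{ is a loss} \\
V(s, w), & \text{otherwise}
\end{cases}
\qquad \text{where} \quad V(s, w) = \sum_{i=1}^{n} w_i x_i
\]

for convenience:

\[
\mathrm{MAX} = 1, \quad \mathrm{MIN} = -1, \quad V(s, w) = \tanh\Bigl(b \cdot \sum_{i=1}^{n} w_i x_i\Bigr)
\]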
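The evaluation function can be sketched in Python as follows. This is an illustration under assumptions: the features x_i are taken to be already extracted from the position s, b is treated as a scaling constant, and the names `v`, `p`, `MAX_VALUE` and `MIN_VALUE` are hypothetical.

```python
import math
from typing import Sequence

MAX_VALUE = 1.0   # value of a won position
MIN_VALUE = -1.0  # value of a lost position


def v(features: Sequence[float], weights: Sequence[float], b: float = 1.0) -> float:
    """Squashed linear evaluation: tanh(b * sum_i w_i * x_i), always in (-1, 1)."""
    return math.tanh(b * sum(w * x for w, x in zip(weights, features)))


def p(features: Sequence[float], weights: Sequence[float],
      is_win: bool = False, is_loss: bool = False, b: float = 1.0) -> float:
    """Return MAX for a win, MIN for a loss, and V(s, w) otherwise."""
    if is_win:
        return MAX_VALUE
    if is_loss:
        return MIN_VALUE
    return v(features, weights, b)
```

Squashing with tanh keeps every non-terminal estimate strictly between MIN and MAX, so a heuristic value can never outrank an actual win or loss.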
Why does AI care about games?
• Environments
  • popular
  • repeatable
  • cheap
• Used for benchmarking
  • heuristic search methods
  • methods of efficient problem representation
• Social and mental aspect of games
Chess
• The Turk (Baron Wolfgang von Kempelen, 1769) –
Empress Catherina II, Napoleon, …
State-of-the-art
board game playing systems
Chess
• Torres y Quevedo (around 1890), KR vs. K
• Alan Turing (1953) – the first chess playing program
Chess
• …
• Deep Blue II (1997 - rematch)
• Evaluation function composed of 8000 features
• Manually adjusted weights in the evaluation
function (depending on particular opponent)
• Parallel implementation based on 480 specialized
chess chips
• 30-node cluster
• 100-330 million positions/sec. (50 billion per 3 min.)
• NegaScout + transposition tables (TT) + iterative deepening + quiescence search
• Extended opening and endgame databases
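How iterative deepening and a transposition table cooperate can be sketched on a toy game. The example below is not Deep Blue's search: it uses plain negamax with alpha-beta pruning as a simpler stand-in for NegaScout, omits quiescence search, and uses the table only to remember each position's best move for move ordering. The game (take 1-3 stones, last stone wins) and all names are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

MOVES = (1, 2, 3)


def negamax(stones: int, depth: int, alpha: int, beta: int,
            tt: Dict[int, Optional[int]]) -> Tuple[int, Optional[int]]:
    """Depth-limited negamax with alpha-beta pruning.

    The value is from the side to move: +1 win, -1 loss, 0 unknown at the horizon.
    """
    if stones == 0:
        return -1, None  # the opponent took the last stone
    if depth == 0:
        return 0, None   # horizon reached: neutral static evaluation
    legal = [m for m in MOVES if m <= stones]
    hint = tt.get(stones)
    if hint in legal:
        # Try the best move from the previous, shallower iteration first.
        legal.remove(hint)
        legal.insert(0, hint)
    best_value, best_move = -2, None
    for move in legal:
        value = -negamax(stones - move, depth - 1, -beta, -alpha, tt)[0]
        if value > best_value:
            best_value, best_move = value, move
        alpha = max(alpha, value)
        if alpha >= beta:
            break  # cutoff: the opponent will not allow this line
    tt[stones] = best_move
    return best_value, best_move


def iterative_deepening(stones: int, max_depth: int) -> Tuple[int, Optional[int]]:
    """Search depths 1..max_depth, reusing move-ordering hints between iterations."""
    tt: Dict[int, Optional[int]] = {}
    result: Tuple[int, Optional[int]] = (0, None)
    for depth in range(1, max_depth + 1):
        result = negamax(stones, depth, -1, 1, tt)
    return result
```

A production engine would also store search depth, value and bound type in the table; keeping only the best move avoids those consistency issues in a short sketch.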
Chess
• Deep Fritz,
• Junior
• Shredder
• Hydra
…
• Rybka (Vasik Rajlich and Iweta Rajlich)
• Above 3100 ELO on a 4 CPU PC
• Close to 3000 ELO on a single-core PC
Checkers
• A.L. Samuel (1959)
• Evaluation function composed of 22 elements
• In effect, it was the first implementation of the TD (temporal difference) method
• Chinook (1994)
• 4 phases of the game (4 evaluation functions), 25 basic
features
• Each feature depends on several detailed parameters
• Dr. Marion Tinsley lost only 7 games in his 40-year
career as world champion (including 2 to Chinook)
Checkers…
• The game was solved (Science, 2007, Schaeffer et al.)
• Complete 10-piece endings database
• Partly relying on brute force
• With perfect play by both sides, checkers is a draw
Othello
• Logistello (1997) – M. Buro
• Logistello vs. Takeshi Murakami (6:0)
• Convincing win.
Go (Baduk)
• One of the last strongholds of human supremacy
• MoGo, Many Faces of Go, Fuego, CrazyStone, GoIntellect,
Indigo, Golois, …
• Size of the board
• High branching factor
• Additive nature of the game
• Serious problems with the analytical construction of an effective
evaluation function for the middle-game phase
(the assessment of end-game positions is relatively easy).
• Due to the variety of positional and tactical threats it is highly
probable that „no simple yet reasonable evaluation function
will ever be found for Go” – M. Müller
Go – rollout (MC) simulations
• While the stopping condition is not met (there is still time):
  • Start in the root (the current node in the game tree)
  • Until an end-game position is reached:
    • Choose a move according to some policy
    • Simulate execution of the move
  • Assess the game (in the end-game position)
  • Update all moves on the played path in the game tree
  according to the end-game score.
• Policy: UCT and its variations (exploration vs. exploitation)
• MoGo (Gelly et al.) in 2008 won, for the first time, a game
against a professional (dan) player (5d)
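The rollout loop with a UCT policy can be sketched as follows. This is a minimal illustration on an assumed toy game (take 1-3 stones, last stone wins), not on Go and not MoGo's implementation. The `Node` class, the exploration constant `c = 1.4`, the purely random rollout policy and the fixed iteration budget standing in for the time limit are all assumptions.

```python
import math
import random

MOVES = (1, 2, 3)


class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones    # stones left for the player to move here
        self.parent = parent
        self.move = move        # move that led from the parent to this node
        self.children = []
        self.untried = [m for m in MOVES if m <= stones]
        self.visits = 0
        self.wins = 0.0         # wins for the player who moved INTO this node


def uct_select(node, c=1.4):
    """Pick the child maximizing win rate plus an exploration bonus (UCB1)."""
    return max(
        node.children,
        key=lambda ch: ch.wins / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def rollout(stones, rng):
    """Play random moves to the end; True if the player to move at the start wins."""
    turn, winner = 0, 1  # with no stones left, the player to move has already lost
    while stones > 0:
        stones -= rng.choice([m for m in MOVES if m <= stones])
        if stones == 0:
            winner = turn
        turn ^= 1
    return winner == 0


def uct_search(stones, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(stones)
    for _ in range(iterations):  # stopping condition: fixed simulation budget
        node = root
        while not node.untried and node.children:  # descend using the tree policy
            node = uct_select(node)
        if node.untried:  # expand one new child
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # Assess the end of a random playout, then update the played path.
        entering_player_won = not rollout(node.stones, rng)
        while node is not None:
            node.visits += 1
            if entering_player_won:
                node.wins += 1
            entering_player_won = not entering_player_won
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move
```

Returning the most visited root move, rather than the one with the highest win rate, is a common choice because visit counts are less noisy than win rates of rarely tried moves.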
Quo vadis
board game research?
Quo vadis?
• Humans have practically no chance of catching up with
machines in chess, checkers, Othello, and many
other games.
• Baduk remains the last stronghold. For how long?
• What can we really gain from developing more and
more efficient programs running on faster and faster
hardware?
Human-type approach – a challenge
Mimicking human approach to game playing
in all its major aspects.
• Automatic feature selection for the evaluation function
• Modeling the opponent’s style of play.
• Learning from scratch (knowledge-free methods).
• Autonomous pattern-based knowledge acquisition.
• Highly selective search (efficient move preselection)
Anaconda (Blondie24) – neuro-evolution: checkers;
TD-Gammon – TD-learning (self-play mode): backgammon.
Grand challenges
• Intuition
• Creativity („to bring into form or being out of nothing”)
• Backgammon (new opening)
• Blondie24 (Anaconda)
• Morph
• Zenith
• Multigame playing (universal, game-independent learning)
• SAL
• Hoyle and METAGAMER
• General Game Playing contest (Cluneplayer, Cadia, Ary)
Intuitive playing – some insights
Quoting Albert Einstein
The intuitive mind is a sacred gift and the rational mind
is a faithful servant. We have created a society that
honors the servant and has forgotten the gift.
(Albert Einstein, 1931)
Facets of human-type intuition
• Instantaneous qualitative estimation of a game position
• Instantaneous focus on relevant regions of the game board
• Efficient move pre-ordering
• Search-free playing
• Focus on goals/plans rather than particular moves
• Ability to make long-term material sacrifice in order to
gain some positional advantage (without precise
verification of all consequences – possible paths of play)
Example of intuitive play
Immortal game, Anderssen vs. Kieseritzky, London 1851
11. Rg1 … 17. … Qxb2 23. Be7#
Intuition as a „side effect” of „perfect” play
Deep Blue II vs. Kasparov (NY, 1997, game 2)
After 36...axb5 Deep Blue played the „deeply strategic”,
intuitive move 37. Be4!!, despite the obvious 37. Qb6.
Kasparov accused the Deep Blue team of cheating!
Intuition – goals and plans of play
Wilkins, 1980
Intuition – neurobiological foundations
• „Monitoring” the decision-making process of human
players:
• How game positions are perceived (mainly in chess)
• Activity of brain regions – fMRI technique (chess and Go)
• Research on perception abilities [de Groot, 1965]
• A chess position composed of about 25 pieces
• 3-15 seconds
• Reconstruction rate: 93% for grandmasters, gradually decreasing
with decreasing level of play
Intuition – neurobiological foundations
• Perception [Chase and Simon, 1973]
• Confirmed de Groot’s conclusions
• Short, 5-second exposure
• Much higher capabilities of grandmasters than intermediate and
novice players, but only in the case of sensible positions.
• Not confirmed in the case of random positions (!)
• This ability is related neither to specific memory skills nor to
their training
• It is the effect of internal position representation in the form
of chunks of information (templates with associated moves),
which they use to remember and categorize positions
• A grandmaster's template library is composed of around 300K
elements.
Intuition – neurobiological foundations
• Saccadic eye movement [de Groot and Gobet, 2002]
• GM – concentration on edges, weaker players on squares
• GM – greater range (the average distance between consecutive
fixations is greater in case of GM)
• strong players decode 2-3 pieces (part of a template?)
within a single fixation; weaker ones – single element
• Analysis of the first 5 fixations in the problem of finding the best
move in a given position:
• strong players concentrated on relevant pieces/squares
more often than weak players
Intuition – neurobiological foundations
• Check status detection [Reingold and Charness, 2005]
• Positions on small 3x3 or 5x5 boards, king and 1-2 opponent’s
pieces (potentially attacking the king)
• One of the pieces can be cued
• GM – the same average time; weak players – shorter time in
case of one opponent’s piece or when the cued piece is the
attacking one
• Confirmation of suggestions about parallel analysis
performed by GMs, which covers a few pieces comprising
a meaningful chunk or part of it.
Intuition – neurobiological foundations
• Functional Magnetic Resonance Imaging (fMRI)
• Go players compared with chess players
• Amateur players (know the rules, play occasionally)
• Chess: empty board (focus on the center) vs. position with
improperly placed pieces (a few of them marked with a star – point
them out) vs. appropriate position with about 25 pieces (point out
the next move)
• Go: empty board (focus on the center) vs. position with improperly
placed pieces (point out 6 marked stones) vs. appropriate position
with about 25 pieces (point out the next move)
• Positions are shown in a cycle; each exposure lasts 30 sec.
Intuition – neurobiological foundations
• For each type of position different neuronal activity
patterns in players’ brains are observed.
• Some differences between Go and chess
• In the case of real Go positions: much higher
involvement of the right hemisphere (responsible for
spatial relations)
• In the case of real chess positions: greater involvement
of the „analytical” (left) hemisphere - QUESTIONABLE
Examples of intuitively
playing systems
Trajectories of solutions
Pioneer Project and Linguistic Geometry
• Pioneer Project (USSR, 1960s-1970s)
• Mikhail Botvinnik and his collaborators
• Working on search methods that significantly narrow the
game tree
• Goal: abstraction of relevant position features, and using
them to find „promising” trajectories (solution
candidates), similarly to human players, instead of applying
laborious, systematic search
• Continuation and generalization of this research in the LG project
(mainly with military applications; classified)
Abstraction and generalization of game features
Reti ending. White to begin and draw.
How about a machine solving the problem on a 100 x 100 board?
Intuition – trajectories of solutions
Pioneer Project and Linguistic Geometry
d=6:
Full tree: 10^6 nodes
Pioneer: 54 nodes, av. bf.=1.68
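The Pioneer figures are mutually consistent if "av. bf." is read as the branching factor b of a uniform tree of depth d = 6 with 1 + b + b^2 + ... + b^6 nodes in total. That reading is an assumption, as are the helper names in the check below.

```python
def total_nodes(bf: float, depth: int) -> float:
    """Number of nodes in a uniform tree: sum of bf**k for k = 0..depth."""
    return sum(bf ** k for k in range(depth + 1))


def effective_bf(nodes: float, depth: int, tol: float = 1e-9) -> float:
    """Find by bisection the branching factor giving `nodes` nodes at `depth`."""
    lo, hi = 1.0, float(nodes)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if total_nodes(mid, depth) < nodes:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(total_nodes(1.68, 6), 2))
print(round(effective_bf(54, 6), 2))
```

By the same arithmetic, 10^6 nodes at depth 6 corresponds to a branching factor of 10 when only the deepest level (10^6 = 10^6 positions) is counted.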
Intuitive chess playing – SYLPH
• SYLPH [Finkelstein and Markovitz, 1998]
• Extension of Morph (Levinson) and Chump (Gobet).
• Relies on move patterns (game position + played move).
• Game positions are represented as hypergraphs: nodes
refer to squares and edges to relations between them
(including empty squares).
• Relations: a piece controls a square, a piece attacks
directly (indirectly) another piece/square; double attack,
etc.
• More complicated relations – involving up to 4-5 nodes
(pieces/squares).
Intuitive chess playing – SYLPH
• Learning with a teacher
• a human,
• GNU Chess,
• copy of itself.
• Material patterns (ref. to capture moves): assigned weights
equal to the difference in material.
• Positional patterns (non-capture moves): weight proportional
to the frequency of using that move in played games.
• Augmentation process: observation of games played by two
copies of GNU Chess or analysis of grandmaster games from
game repository.
Intuitive chess playing – SYLPH
• SYLPH was equipped with the rules of chess
• 100 games against GNU Chess + augmentation: 50
games self-played by GNU Chess, yielding 4614 patterns
• Test (of intuitive play): find the best move in a given position
without search.
• Filter_tk: 1 <= k <= 10: selects the k (best) moves.
• Filter_gf: 0.1 <= f <= 1.0: selects the fraction f of (best) moves.
• A success = the best move is included in the selected set.
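The two filters and the success criterion can be sketched as below. This is an illustration, not SYLPH's code: the move scores are assumed to come from some pattern-based scorer, rounding the fraction up to a whole number of moves is an assumption, and the function names are hypothetical.

```python
import math
from typing import Dict, List


def filter_tk(scored_moves: Dict[str, float], k: int) -> List[str]:
    """Select the k highest-scoring moves."""
    ranked = sorted(scored_moves, key=scored_moves.get, reverse=True)
    return ranked[:k]


def filter_gf(scored_moves: Dict[str, float], f: float) -> List[str]:
    """Select the best fraction f of the moves (at least one move)."""
    count = max(1, math.ceil(f * len(scored_moves)))
    return filter_tk(scored_moves, count)


def is_success(selected: List[str], best_move: str) -> bool:
    """A test position counts as a success if the best move was selected."""
    return best_move in selected


# Made-up scores for six candidate moves in some position.
scores = {"e4": 0.9, "d4": 0.7, "Nf3": 0.6, "c4": 0.5, "a3": 0.1, "h4": 0.05}
print(filter_tk(scores, 2), filter_gf(scores, 0.5))
```

The reported efficiency of a filter is then the share of test positions for which `is_success` holds.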
Intuitive chess playing – SYLPH
• A few thousand test positions generated from games played
between two copies of GNU Chess.
• For k=4, the efficiency equals 0.557
• For f=1/3, SYLPH was superior to alpha-beta with d=4 (!) and
material-based evaluation function.
• On 3830 positions extracted from 100 games of M. Tal
(previously unknown to the system), results were similar to those
in the above test on GNU Chess positions, indicating good
generalization properties.
• No retention mechanism
Summary of the main facets
of cognitive playing
The main facets of cognitively-plausible
board game playing systems
• Allen Newell (1990) introduced several criteria that
define cognitive systems, which refer mainly to their
behavioral, learning, and knowledge-related
properties.
• Duch et al. (2008) proposed a simplified taxonomy of
cognitive systems based on two main criteria:
• LEARNING
• MEMORY
Learning-related postulates
• P1: Learning should be implemented as an incremental
development process.
• P2: Learning should be implemented in a parallel
(multitasking) mode.
• P3: The learning system should be capable of suitable
decomposition of game patterns into meaningful
subpatterns without the need for external intervention.
• P4: The learning process should be performed on several
levels of detail.
Memory-related postulates
• P5: Game-related concepts may be effectively
represented and processed with the use of
pattern-based representation.
• P6: Knowledge acquired in the learning process
should be represented in a hierarchical structure
with various inter- and intra-level connections.
• P7: Acquisition of knowledge as well as its further
processing in the system should take into account
symmetries existing in the game.
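One common way to respect P7 is to store each pattern in a canonical form shared by all of its symmetric variants. The sketch below assumes a square board with the full set of eight rotations and reflections, which suits Go or Othello; chess admits fewer symmetries. The function names are illustrative and not taken from the talk.

```python
def rotate(board):
    """Rotate a square board (tuple of row tuples) 90 degrees clockwise."""
    return tuple(zip(*board[::-1]))


def mirror(board):
    """Reflect a board left to right."""
    return tuple(row[::-1] for row in board)


def symmetries(board):
    """Return all 8 rotations and reflections of the board."""
    result = []
    current = tuple(tuple(row) for row in board)
    for _ in range(4):
        result.append(current)
        result.append(mirror(current))
        current = rotate(current)
    return result


def canonical(board):
    """Pick one representative so that symmetric positions share a single key."""
    return min(symmetries(board))


corner_a = ((1, 0), (0, 0))
corner_b = ((0, 0), (0, 1))
print(canonical(corner_a) == canonical(corner_b))
```

Using `canonical(board)` as the key of a pattern store means that knowledge learned in one orientation is available in the others without relearning.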
General conclusions
• Arguments were given for the potential virtues of cognitively-plausible,
pattern-based playing systems. Several concepts which, in our opinion,
should underlie such systems were presented.
• Certainly, cognitively-plausible methods are by no means
competitive with established AI approaches, but efficacy should
not be the sole goal of game research.
Thank you for your attention!
J. Mańdziuk, Knowledge-Free and Learning-Based Methods in Intelligent Game Playing, Studies in Computational Intelligence, vol. 276, Springer-Verlag, 2010
J. Mańdziuk, Towards cognitively-plausible game playing systems, IEEE Computational Intelligence Magazine, 6(2), May 2011, in press