Post on 20-Dec-2015
LEARNING SPATIAL REASONING
Jack Gelfand
Center for the Study of Brain, Mind and Behavior
Department of Psychology
LEARNING SPATIAL REASONING
• Computer Game Playing
• Game Playing and Pattern-Based Reasoning
• Organization of the visual system
– Multi-stream hierarchy
– Form perception
– Motion perception
• Elements of perceptual organization
– Gestalt figural organization
– Popout phenomena
• Learning New Spatial Concepts - Spatial Concept Formation Languages
• Structural Features and Functional Features
Readings
Epstein, Gelfand and Lock, Constraints, 2, 239 (1998).
Gelfand et al., Proceedings of the Joint Conference on Information Systems (1998).
Epstein, Gelfand and Lesniak, Computational Intelligence, 12, 199 (1996).
WHY PLAY GAMES WITH COMPUTERS?
• From the Artificial Intelligence and Cognitive Psychology Point of View
– Games are excellent testbeds as they:
• Have well-defined rules generating a large search space
• Easily represented in a computer
• Easy to test
• Computers can compete with humans at some games but not others
• From the Game Playing Point of View
– Can make the game much more enjoyable to play
– New levels of analysis
Search
• The brute force approach of search has been highly effective in games such as Checkers and Chess.
– Checkers
• Chinook (World Champion)
– Chess
• Best programs can hold their own with the best humans.
• Deep Blue II
– move generation and evaluation in hardware
– parallel search in software
EXHAUSTIVE SEARCH
• From the starting position
1. Generate every legal move for player 1.
2. For each legal move of player 1, generate every legal move for player 2.
3. Repeat steps 1 & 2 until the game reaches a definitive result.
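The three steps above can be sketched on a game small enough to search completely. This is a minimal illustration using tic-tac-toe, not code from the lecture; the names `LINES`, `winner` and `count_games` are assumptions made for the example.

```python
from typing import List, Optional

# The eight winning lines of tic-tac-toe, as index triples into a 9-cell board.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board: List[str]) -> Optional[str]:
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def count_games(board: List[str], mover: str) -> int:
    """Count every complete game reachable from `board` with `mover` to play."""
    if winner(board) is not None or " " not in board:
        return 1  # definitive result: a win, or a full board (draw)
    total = 0
    for cell in range(9):
        if board[cell] == " ":
            board[cell] = mover                       # generate one legal move
            total += count_games(board, "O" if mover == "X" else "X")
            board[cell] = " "                         # undo it and try the next
    return total

total_games = count_games([" "] * 9, "X")
print(total_games)
```

Tic-tac-toe finishes in well under a second because its tree is tiny; the same procedure applied to chess runs into the numbers on the next slide.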
PROBLEM WITH EXHAUSTIVE SEARCH
• Not practical
– A player in chess has, on average, 36 legal moves.
– A game could take 45 moves per player (90 plies) to reach a conclusion (an underestimate).
– Total number of positions = 36^90, roughly 10^140
– There are only ~10^81 atoms in the universe
• Couldn’t store all the positions in a computer the size of the universe.
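The arithmetic behind this comparison can be checked directly. The sketch below assumes the exponent 90 comes from 45 moves by each of two players, and it reuses the slide's estimate of the number of atoms; the variable names are illustrative.

```python
import math

branching = 36                    # average legal moves per position (from the slide)
plies = 90                        # assumed: 45 moves by each of two players
positions = branching ** plies    # exact, since Python integers are unbounded
atoms = 10 ** 81                  # the slide's estimate for atoms in the universe

exponent = plies * math.log10(branching)
print(f"36^90 is about 10^{exponent:.0f}")
print(f"positions per atom: about 10^{exponent - 81:.0f}")
```

Even one position stored per atom would fall short by dozens of orders of magnitude.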
EVALUATION FUNCTIONS
• Assign a value for each factor contributing to the worth of a position.
• Add up the terms
• Search positions based upon the values
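A minimal sketch of such an evaluation function for chess follows. The piece values are the conventional textbook ones and the mobility weight is an arbitrary assumption; neither is taken from the lecture or from any particular program.

```python
from typing import Dict

# Assumed weights: conventional textbook piece values, not those of any real engine.
PIECE_VALUES: Dict[str, int] = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

def material_term(pieces: str) -> int:
    """Sum piece values; uppercase letters are White, lowercase are Black."""
    score = 0
    for p in pieces:
        value = PIECE_VALUES.get(p.upper(), 0)    # kings and unknown symbols count 0
        score += value if p.isupper() else -value
    return score

def evaluate(pieces: str, mobility_white: int, mobility_black: int,
             mobility_weight: float = 0.1) -> float:
    """Add up the terms: material plus a small (assumed) mobility term."""
    return material_term(pieces) + mobility_weight * (mobility_white - mobility_black)

# White: K, Q, R and three pawns. Black: k, two rooks and three pawns.
score = evaluate("KQRPPPkrrppp", mobility_white=30, mobility_black=25)
print(score)
```

A positive score favors White and a negative one favors Black; the search then prefers lines that lead to better scores for the side to move.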
Searching the Game Tree
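[Figure: game tree with alternating MAX, MIN and MAX levels. Leaf values as extracted: -3 20 4 -5 -3-1 -4-2 0 1. Lower MAX level: -2, 4, -5, 2, -3, 1. MIN level: -2, -5, -3. Root (MAX): -2.]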
This is the Minimax Algorithm
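The backing-up of values in the tree can be sketched as a short recursive function. Because the leaf row of the figure is not legible, this example starts from the six values on the lower MAX level and treats them as leaves; the nested-list tree encoding is an assumption made for the illustration.

```python
from typing import List, Union

Tree = Union[int, List["Tree"]]

def minimax(node: Tree, maximizing: bool) -> int:
    """Back up values: MAX takes the largest child value, MIN the smallest."""
    if isinstance(node, int):          # leaf: a value from the evaluation function
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Three MIN nodes over the value pairs (-2, 4), (-5, 2) and (-3, 1).
tree: Tree = [[-2, 4], [-5, 2], [-3, 1]]
root_value = minimax(tree, maximizing=True)
print(root_value)   # -2
```

The MIN nodes back up -2, -5 and -3, and the MAX root picks -2, matching the figure.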
Improving Minimax
• The Minimax Algorithm has various improvements that are used in practice.
– Alpha-Beta
– Principal Variation Search (PVS)
– Transposition Tables
– Killer Move Heuristics
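• At best (with ideal move ordering), Alpha-Beta examines about the square root of the number of positions that plain Minimax would, so the same effort reaches roughly twice the depth.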
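Of these improvements, Alpha-Beta is the simplest to sketch. The version below is a generic textbook formulation, not the search used by any program named in these slides; the `visited` list is an assumption added only to show which leaves are actually evaluated.

```python
import math
from typing import List, Optional, Union

Tree = Union[int, List["Tree"]]

def alphabeta(node: Tree, maximizing: bool,
              alpha: float = -math.inf, beta: float = math.inf,
              visited: Optional[List[int]] = None) -> float:
    """Minimax value of `node`, skipping branches that cannot change the result."""
    if isinstance(node, int):
        if visited is not None:
            visited.append(node)        # record each leaf actually evaluated
        return node
    best = -math.inf if maximizing else math.inf
    for child in node:
        value = alphabeta(child, not maximizing, alpha, beta, visited)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:               # the other player already has a better option
            break
    return best

tree: Tree = [[-2, 4], [-5, 2], [-3, 1]]
seen: List[int] = []
value = alphabeta(tree, maximizing=True, visited=seen)
print(value, seen)
```

On this small tree the result is the same as plain Minimax, but only four of the six leaves are evaluated: once the first MIN node guarantees -2, the leaves 2 and 1 cannot matter.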
Computer Chess
• Deep Blue II
– 256 dedicated chess processors
• generate moves
• evaluate positions
– Search process in software (PVS)
– Database of opening sequences
– Databases of endgame sequences
• Deep Blue II can evaluate 200 million positions per second (about 36 billion in 3 minutes).
• Deep Blue II can hold its own with the best players in the world, but it is not invincible!
WHAT IS WRONG WITH THIS PICTURE?
THERE ARE AS MANY POSSIBLE GAME STATES IN CHESS AS ATOMS IN THE UNIVERSE.
THERE IS ABOUT 20 X 6 FEET OF SPACE FOR CHESS BOOKS IN THE LIBRARY.
WHAT’S WRONG?
MOST OF THE GAME STATES IN CHESS ARE IRRELEVANT.
HUMANS HAVE AN EXTREMELY COMPACT WAY OF REPRESENTING THE SALIENT CONCEPTS IN CHESS.
LEARNING NEW REPRESENTATIONS THROUGH EXPERIENCE
• Statement of problem or early experience does not necessarily provide optimal representation.
• People acquire optimal representations gradually.
• Often problems are stated in terms of local relationships. Experts utilize global spatial heuristics acquired through performing the task.
– Chess - Control of the center of the board
– Othello - Control of edges
– Go - Shape and thickness of zones
Vertical and Horizontal Control on the Chess Board
Diagonal Control on the Chess Board
HIERARCHICAL ORGANIZATION OF THE HUMAN VISUAL SYSTEM
• Multiple streams of processing
• System of feature hierarchies
RECEPTIVE FIELD OF A CORTICAL VISUAL NEURON
CONVERGENT PROJECTIONS IN THE VISUAL FEATURE HIERARCHY
NEURONS IN THE HIGHEST VISUAL FORM RECOGNITION AREAS OF CORTEX RESPOND TO COMPLEX STIMULI
PERCEPTUAL ORGANIZATION
URGE TO ORGANIZE
GESTALT FIGURAL GROUPING
• Forms or objects composed of elements
• Organization of elements into perceptual objects involves an active construction process
• Gestalt researchers studied the way in which these elements tend to become form-like or object-like perceptions
• GESTALT LAWS OF PERCEPTUAL ORGANIZATION
• Works for sounds as well
• Gestalt thinking was widely applied but became discredited because it lacked an underlying model. More modern neural models can account for these mechanisms.
GESTALT PRINCIPLES OF FIGURAL ORGANIZATION
PERCEPTUAL ORGANIZATION TAKES PLACE AT MANY LEVELS
X O OO
The level of perceptual organization depends upon the task and the attentional state of the viewer.
POPOUT PHENOMENA
SYNCHRONICITY OF NEURONS IN VISUAL CORTEX MAY LINK THE COMPONENTS OF THE FIGURE RELATIVE TO THE GROUND
HOYLE DECISION MAKING SYSTEM
[Figure: flow diagram]
Inputs: current state, acquired useful knowledge, legal moves
Tier 1: Shallow search and inference based on perfect knowledge (Victory, Panic, Enough Rope, …)
Absolute decision? yes: make move; no: continue to Tier 2
Tier 2: Heuristic opinions (Patsy, Material, Spatial-1, Spatial-2, …)
Blackboard, Voting, make move
FORR - FOr the Right Reasons
• Linear mixture of experts
• Advisors - decision-making rationales
• Multi-tier hierarchy
– Tier 1 - guaranteed correct, shallow search
– Higher tiers - heuristic knowledge - probably correct
Susan Epstein, CUNY
Empirical experience with Hoyle indicates that these weights are game specific and should therefore be learned. Initially, the weight of each general game-playing Advisor is set to 1. After every contest Hoyle plays against an expert, AWL (Algorithm for Weight
Name | Tier | Description | Useful Knowledge

General game-playing Advisors that do not rely on learned, game-specific knowledge
Victory | 1 | Makes winning move from current state if there is one. | -
Enough Rope | 1 | Avoids blocking losing move non-mover would have if it were its turn. | -
Candide | 2 | Formulates and advances naive offensive plans. | -
Challenge | 2 | Moves to maximize its number of winning lines or minimize non-mover’s. | -
Coverage | 2 | Maximizes mover’s influence on predrawn board lines or minimizes non-mover’s. | -
Freedom | 2 | Moves to maximize number of its immediate next moves or minimize non-mover’s. | -
Greedy | 2 | Moves to advance more than one winning line. | -
Material | 2 | Moves to increase number of its pieces or decrease those of non-mover. | -
Vulnerable | 2 | Reduces non-mover’s capture moves on two-ply lookahead. | -
Worried | 2 | Observes and destroys naive offensive plans of non-mover. | -

General game-playing Advisors that rely on learned, game-specific knowledge
Wiser | 1 | Makes correct move if current state is remembered as certain win. | Significant states
Sadder | 1 | Resigns if current state is remembered as certain loss. | Significant states
Don’t Lose | 1 | Eliminates any move that results in immediate loss. | Significant states
Panic | 1 | Blocks winning move non-mover would have if it were its turn now. | Significant states
Shortsight | 1 | Advises for or against moves based on two-ply lookahead. | Significant states
Anthropomorph | 2 | Moves as winning or drawing non-Hoyle expert did. | Expert moves
Cyber | 2 | Moves as winning or drawing Hoyle did. | Hoyle moves
Leery | 2 | Avoids moves to state from which loss occurred, but where limited search proved no certain failure. | Dangerous states
Not Again | 2 | Avoids moving as losing Hoyle did. | Hoyle moves
Open | 2 | Recommends previously-observed expert openings. | Opening database
Pitchfork | 2 | Advances offensive forks or destroys defensive ones. | Forks

General game-playing Advisor that relies on learned, game-specific spatial heuristics
Patsy | 2 | Supports or opposes moves based on their patterns’ associated outcomes. | Pattern cache

Learned game-specific Advisors that rely on learned, game-specific spatial concepts
Learned spatial Advisors | 2 | Supports or opposes moves based on their creation or destruction of a single pattern. | -
LINEAR NON-INTERACTING MIXTURE OF EXPERTS
w1A1 + w2A2 + w3A3 + .......
FORMULAS SUCH AS THESE ARE USED IN COLLEGE ADMISSIONS.
This is related to a perceptron neural network, which we will learn about later.
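The weighted sum w1A1 + w2A2 + ... can be sketched as a vote over candidate moves. The two advisors, their weights and the 0-to-1 strength scale below are hypothetical and chosen only for illustration; they are not Hoyle's Advisors or learned weights.

```python
from typing import Callable, Dict, List

# An advisor maps a candidate move to a strength; the 0..1 scale is an assumption.
Advisor = Callable[[str], float]

def vote(moves: List[str], advisors: Dict[str, Advisor],
         weights: Dict[str, float]) -> Dict[str, float]:
    """Score each move as w1*A1(move) + w2*A2(move) + ..."""
    return {m: sum(weights[name] * advise(m) for name, advise in advisors.items())
            for m in moves}

# Hypothetical advisors for a tic-tac-toe board numbered 1 to 9.
advisors: Dict[str, Advisor] = {
    "center": lambda m: 1.0 if m == "5" else 0.0,
    "corner": lambda m: 1.0 if m in {"1", "3", "7", "9"} else 0.0,
}
weights = {"center": 2.0, "corner": 1.0}

scores = vote(["1", "2", "5"], advisors, weights)
best_move = max(scores, key=scores.get)
print(scores, best_move)
```

Each advisor contributes independently of the others, which is what "non-interacting" means here: changing one weight rescales only that advisor's say in the total.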
Figure 1: (a) Spatial arrangements of game pieces processed in the algorithms described. (b) The L-shaped arrangement fitted to a game board two different ways. (c) A fitted L-shape instantiated to produce a tic-tac-toe pattern.
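The fitting and instantiation steps named in the caption can be sketched as follows. The three-cell L-shape, the restriction to translations only, and the function names `fit` and `instantiate` are assumptions for illustration; the actual shapes and fitting rules are in the figure and the cited papers.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]   # (row, column)

# Hypothetical three-cell L-shaped template, given relative to its top-left corner.
L_SHAPE: List[Cell] = [(0, 0), (1, 0), (1, 1)]

def fit(template: List[Cell], size: int = 3) -> List[List[Cell]]:
    """Return every translation of `template` that lies entirely on a size x size board."""
    height = max(r for r, _ in template) + 1
    width = max(c for _, c in template) + 1
    return [[(r + dr, c + dc) for r, c in template]
            for dr in range(size - height + 1)
            for dc in range(size - width + 1)]

def instantiate(cells: List[Cell], pieces: str) -> Dict[Cell, str]:
    """Assign one piece symbol to each fitted cell, giving a concrete pattern."""
    return dict(zip(cells, pieces))

placements = fit(L_SHAPE)
pattern = instantiate(placements[0], "XOX")
print(len(placements), pattern)
```

A shape is thus a reusable template, a fitting fixes where it sits on the board, and an instantiation fills it with pieces to give a pattern that can be matched against game states.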
CONCEPT FORMATION LANGUAGE
PATTERN LEARNING SYSTEM
[Figure: pattern learning flow with numbered steps 1 to 4. Components: Game State, Patsy, Pattern Waiting List, Pattern Cache, Spatial Concepts, New Spatial Advisors, Recommended Action. Operations: Associate patterns with outcomes, Validate, Remove, Generalize, Proceduralize, Test Correctness.]
GENERALIZING PATTERNS INTO SPATIAL CONCEPTS
Given distinct agreeing patterns | To construct a concept
Same mover and outcome | Drop the single differing position
Pieces and movers are opposites | Variabilize the mover and pieces
Different movers, pieces opposite in only one position | Variabilize the mover and position
[Figure: example tic-tac-toe patterns for each rule, labeled "For X", "For O" and "For *".]
Figure 5: Three learned spatial Advisors for lose tic-tac-toe, and their weights during 200 consecutive contests. The mover for each Advisor is in the current state; the pattern is matched for in the subsequent state. In an Advisor, either = X and = O or = O and = X. In an * Advisor, * is either X or O consistently.
[Plot: weight (0 to 6) versus contest number (0 to 200) for Advisor 1, Advisor 2 and Advisor 3, with the pattern diagram for each Advisor.]
AN ALGORITHM FOR WEIGHT LEARNING ADJUSTS THE WEIGHTS OF EACH ADVISOR BASED UPON PERFORMANCE
Game and condition | Perfect Player: Wins+Draws | Perfect Player: Wins | 90% Perfect: Wins+Draws | 90% Perfect: Wins | 30% Perfect: Wins+Draws | 30% Perfect: Wins
Tic-tac-toe | 100.0 | - | 100.0 | 16.4 | 100.0 | 80.7
Without patterns | 100.0 (0.0) | - | 98.0 (4.0) | 18.0 (7.5) | 97.0 (6.4) | 83.0 (11.9)
Pattern-oriented | 100.0 (0.0) | - | 97.0 (6.4) | 13.0 (12.8) | 94.0 (4.9) | 77.0 (13.5)
Context and weight 1 only | 100.0 (0.0) | - | 100.0 (0.0) | 22.0 (16.6) | 99.0 (3.0) | 85.0 (11.2)
Lose tic-tac-toe | 100.0 | - | 100.0 | 18.5 | 100.0 | 66.4
Without patterns | 100.0 (0.0) | - | 96.0 (4.9) | 18.0 (7.5) | 73.0 (7.8) | 54.0 (9.2)
Pattern-oriented | 100.0 (0.0) | - | 98.0 (6.0) | 18.0 (8.7) | 92.0 (6.0) | 49.0 (11.4)
Weight > 2 only | 100.0 (0.0) | - | 99.0 (3.0) | 18.0 (11.7) | 96.0 (6.6) | 68.0 (8.7)
Table 2: Average and standard deviation of performance with and without spatial orientation against three challengers. Boldface is an improvement over play without patterns at the 95% confidence level. Estimated optima are in italics.
A LEARNED SPATIAL ADVISOR AFFECTS DECISION-MAKING
Move | Vote with New Advisors | Vote without New Advisors
1 and 3 | 35.7 | 34.8
2 | 59.2 | 15.3
4 and 6 | 15.4 | 12.8
7 and 9 | 43.8 | 43.0
XO
X
XX
1 2 3
4 5 6
7 8 9
STRUCTURAL FEATURES - FUNCTIONAL FEATURES
• Perceptual features are related to functional features through the spatial nature of the rules and the layout of the game board.
• The architecture of our perceptual system is filtered through our experience in the physical world
• This leads to visual primitives that include lines, simple geometric arrangements, contiguous space and boundaries.
• Pieces influence adjacent pieces.
• The goal of most games involves:
– simple contiguous geometric arrangements - three-in-a-row
– capture of contiguous space - Go
– capture of pieces with contiguous space in between - chess, checkers
FLAX’S LAW
The rules of the games we like to play result in configurations on the game board we like to see.
THE GAME OF GO INVOLVES THE CONTROL OF SPACE
a
If white plays a stone at the point a then the three black stones will be captured and removed from the board.
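The capture rule behind this example can be sketched by counting a group's liberties, that is, the empty points adjacent to it. The board below is a hypothetical position invented for the illustration, since the slide's diagram is not available; the empty point at row 1, column 4 plays the role of "a".

```python
from typing import List, Set, Tuple

Point = Tuple[int, int]   # (row, column)

def group_and_liberties(board: List[str], start: Point) -> Tuple[Set[Point], Set[Point]]:
    """Flood-fill the connected group at `start`; return (stones, adjacent empty points)."""
    color = board[start[0]][start[1]]
    stones: Set[Point] = {start}
    liberties: Set[Point] = set()
    stack = [start]
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < len(board) and 0 <= nc < len(board[0]):
                if board[nr][nc] == ".":
                    liberties.add((nr, nc))
                elif board[nr][nc] == color and (nr, nc) not in stones:
                    stones.add((nr, nc))
                    stack.append((nr, nc))
    return stones, liberties

# Hypothetical position: B = black, W = white, . = empty.
board = [
    ".WWW..",
    "WBBB.W",
    ".WWW..",
]
stones, liberties = group_and_liberties(board, (1, 1))
print(len(stones), liberties)

# After white plays at the last liberty, the black group has none left and is captured.
board_after = [board[0], "WBBBWW", board[2]]
_, liberties_after = group_and_liberties(board_after, (1, 1))
print(liberties_after)
```

A group with no remaining liberties is removed from the board, which is why a single well-placed stone can capture all three black stones at once.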
THE CONTROL OF SPACE
COMPLEXITY OF DECISION MAKING
There is a tendency to analyze the complexity of reasoning tasks in terms of an exhaustive search of alternatives. Humans manage to function in complex reasoning domains by compartmentalization of the problem and restriction of search based upon past experience.
REORGANIZATION OF DOMAIN KNOWLEDGE WITH EXPERIENCE
Chunking of rules in production systems
Reorganizing mental models
HOMEWORK
DEVISE A GAME WHERE THE RULES RESULT IN GEOMETRIC ARRANGEMENTS OF PIECES OF TACTICAL OR STRATEGIC SIGNIFICANCE THAT ARE NOT EASILY PERCEIVED.
DIAGRAM THE BOARD
LIST THE RULES
SHOW A BOARD POSITION THAT RESULTS FROM LEGAL MOVES WHICH DEMONSTRATES THIS FACT.