Compressing Mental Model Spaces and
Modeling Human Strategic Intent
Prashant Doshi, University of Georgia, USA
http://thinc.cs.uga.edu
Yingke Chen, Doctoral student
Yifeng Zeng, Reader, Teesside Univ.
Previously: Assoc. Prof., Aalborg Univ.
Hua Mao, Doctoral student
Muthu Chandrasekaran, Doctoral student
What is a mental behavioral model?
How large is the behavioral model space?
General definition: A mapping from the agent’s history of observations to its actions
2^H → Δ(Aj): Uncountably infinite
Let’s assume computable models
Countable
A very large portion of the model space is not computable!
Daniel Dennett, Philosopher and Cognitive Scientist
Intentional stance: Ascribe beliefs, preferences and intent to explain others’ actions
(analogous to theory of mind - ToM)
Organize the mental models
Intentional models
E.g., POMDP θj = ⟨bj, Aj, Tj, Ωj, Oj, Rj, OCj⟩, BDI, ToM
Frame (may give rise to recursive modeling)
Subintentional models
E.g., Δ(Aj), finite state controller, plan
Finite model space grows as the interaction progresses
Growth in the model space
Other agent may receive any one of |Ωj| observations
Time step:    0     1          2            ...  t
Model count:  |Mj|  |Mj||Ωj|   |Mj||Ωj|^2   ...  |Mj||Ωj|^t
Growth is exponential in the number of time steps
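The growth pattern on this slide can be checked with a few lines of arithmetic. This is a minimal sketch; the function name and the example numbers (10 initial models, 6 observations) are illustrative assumptions, not values from the talk.

```python
def model_space_size(initial_models, num_observations, time_step):
    """Upper bound on the number of candidate models after `time_step` updates.

    Each model can be updated with any one of the other agent's observations,
    so the count is multiplied by the number of observations at every step.
    """
    return initial_models * num_observations ** time_step


# Hypothetical numbers: 10 initial models, 6 possible observations.
sizes = [model_space_size(10, 6, t) for t in range(4)]
print(sizes)
```

Even with these small numbers the count passes two thousand by the third step, which is the motivation for compression.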
Absolute continuity condition (ACC)
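ACC relates two distributions:
1. Subjective distribution over histories
2. True distribution over histories
ACC holds when every event with positive probability under the true distribution also has positive probability under the subjective distribution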
ACC is a sufficient and necessary condition for Bayesian update of belief over models
How do we satisfy ACC?
Cautious beliefs (full prior)
Grain of truth assumption
Prior with a grain of truth is sufficient but not necessary
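To see why the prior matters, here is a small sketch of a Bayesian update over a finite set of models. The model names, probabilities, and the `bayes_update` helper are hypothetical and only illustrate the idea: when the prior gives zero weight to the only model that can explain an observation, the update is undefined.

```python
def bayes_update(belief, likelihood):
    """Bayesian update of a belief over models.

    belief:     dict mapping model name -> prior probability
    likelihood: dict mapping model name -> P(observed action | model)
    """
    unnormalized = {m: belief[m] * likelihood[m] for m in belief}
    total = sum(unnormalized.values())
    if total == 0.0:
        # The observation was impossible under the subjective distribution.
        raise ValueError("observation has zero subjective probability")
    return {m: p / total for m, p in unnormalized.items()}


# Hypothetical observation that only model m2 can produce.
obs_likelihood = {"m1": 0.0, "m2": 0.9}

full_prior = {"m1": 0.5, "m2": 0.5}    # cautious belief: every model has support
narrow_prior = {"m1": 1.0, "m2": 0.0}  # excludes the true model m2

posterior = bayes_update(full_prior, obs_likelihood)
```

Calling `bayes_update(narrow_prior, obs_likelihood)` raises the error, which is the situation that a full prior or a grain of truth avoids.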
General model space is large and grows exponentially as the interaction progresses
It would be great if we can compress this space!
Lossless: No loss in value to the modeler
Lossy: Flexible loss in value for greater compression
Expansive usefulness of model space compression to many areas:
1. Sequential decision making (dt-planning) in multiagent settings
2. Bayesian plan recognition
3. Games of imperfect information
1. Sequential decision making in multiagent settings
Interactive POMDP framework (Gmytrasiewicz & Doshi 05)
Include models of the other agent in the state space
Update beliefs over the physical state and models
General and domain-independent approach for compression
Establish equivalence relations that partition the model space and retain representative models from each equivalence class
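The partition-and-keep-representatives idea can be sketched generically. In this hypothetical example each model carries a probability and a solution, and a caller-supplied `signature` function decides which models fall into the same class; the names, the tuple encoding of solutions, and the choice to give each representative the summed probability of its class are assumptions of the sketch.

```python
def compress(models, signature):
    """Partition models into equivalence classes and keep one representative each.

    models:    dict mapping model name -> (probability, solution)
    signature: function mapping a solution to a hashable key; models with the
               same key are treated as equivalent
    Returns a dict mapping representative name -> total probability of its class.
    """
    classes = {}  # key -> [representative name, accumulated probability]
    for name, (prob, solution) in models.items():
        key = signature(solution)
        if key in classes:
            classes[key][1] += prob
        else:
            classes[key] = [name, prob]
    return {rep: mass for rep, mass in classes.values()}


# Hypothetical models whose solutions are written as action sequences.
models = {
    "m1": (0.25, ("listen", "open-left")),
    "m2": (0.25, ("listen", "open-left")),
    "m3": (0.5, ("listen", "open-right")),
}
compressed = compress(models, signature=lambda solution: solution)
print(compressed)
```

Using the identity function as the signature treats two models as equivalent only when their whole solutions match; coarser signatures give coarser partitions.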
Approach #1: Behavioral equivalence (Rathnasabapathy et al. 06, Pynadath & Marsella 07)
Intentional models whose complete solutions are identical are considered equivalent
Behaviorally minimal set of models
Lossless
Works when intentional models have differing frames
[Figure: behavioral equivalence example, multiagent tiger]
Impact on dt-planning in multiagent settings
[Figures: results for multiagent tiger and multiagent MM]
Utilize model solutions (policy trees) for mitigating model growth
Model representatives that are not BE may become BE from the next step onward
Preemptively identify such models and do not update all of them
Redefine BE
Approach #2: ε-Behavioral equivalence (Zeng et al. 11, 12)
Intentional models whose partial depth-d solutions are identical and whose vectors of updated beliefs at the leaves of the partial trees are identical are considered equivalent
Approach #2: Revisit BE (Zeng et al. 11, 12)
Sufficient but not necessary
Lossless if frames are identical
Approach #2: (ε,d)-Behavioral equivalence
Two models are (ε,d)-BE if their partial depth-d solutions are identical and the vectors of updated beliefs at the leaves of the partial trees differ by at most ε
Models are (0.33,1)-BE
Lossy
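A check for this kind of approximate equivalence might look like the sketch below. Several things here are assumptions rather than the authors' implementation: policy trees are encoded as `(action, {observation: subtree})`, the updated beliefs at the depth-d leaves are supplied by the caller in matching order, and the difference between belief vectors is measured with KL divergence.

```python
import math


def truncate(tree, depth):
    """Keep only the top `depth` levels of a policy tree.

    A tree is (action, {observation: subtree}); a leaf has an empty dict.
    """
    action, children = tree
    if depth <= 1 or not children:
        return (action,)
    kept = tuple(sorted((obs, truncate(sub, depth - 1))
                        for obs, sub in children.items()))
    return (action, kept)


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total


def eps_d_equivalent(model_a, model_b, eps, depth):
    """Each model is (policy_tree, leaf_beliefs) with leaf beliefs in matching order."""
    tree_a, leaves_a = model_a
    tree_b, leaves_b = model_b
    if truncate(tree_a, depth) != truncate(tree_b, depth):
        return False
    if len(leaves_a) != len(leaves_b):
        return False
    return all(kl_divergence(a, b) <= eps for a, b in zip(leaves_a, leaves_b))


# Hypothetical two-step trees that agree on the first action only.
tree_a = ("listen", {"growl-left": ("open-right", {}), "growl-right": ("listen", {})})
tree_b = ("listen", {"growl-left": ("listen", {}), "growl-right": ("listen", {})})
model_a = (tree_a, [[0.85, 0.15], [0.15, 0.85]])
model_b = (tree_b, [[0.8, 0.2], [0.2, 0.8]])

print(eps_d_equivalent(model_a, model_b, eps=0.33, depth=1))
print(eps_d_equivalent(model_a, model_b, eps=0.33, depth=2))
```

With depth 1 the two hypothetical models are grouped together; comparing to depth 2 separates them because their second-step actions differ.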
Lemma (Boyen&Koller98): KL divergence between two distributions in a discrete Markov stochastic process reduces or remains the same after a transition, with the mixing rate acting as a discount factor
Mixing rate represents the minimal amount by which the posterior distributions agree with each other after one transition
Property of a problem and may be pre-computed
Given the mixing rate and a bound, ε, on the divergence between two belief vectors, the lemma allows computing the depth, d, at which the bound is reached
Compare two solutions up to depth d for equality
[Figure: Multiagent Concert, discount factor F = 0.5]
Impact on dt-planning in multiagent settings
[Figure: results for Multiagent Concert]
On a UAV reconnaissance problem in a 5x5 grid, this approach allows the solution to scale to a 10-step look-ahead in 20 minutes
What is the value of d when some problems exhibit F with a value of 0 or 1?
F=1 implies that the KL divergence is 0 after one step: Set d = 1
F=0 implies that the KL divergence does not reduce: Arbitrarily set d to the horizon
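The depth computation, including the two boundary cases, can be sketched as a simple loop. This assumes that each transition shrinks the divergence bound by a factor of (1 - F), where F is the mixing rate, and that an upper bound on the initial divergence is available. The function and parameter names are hypothetical, and returning at least depth 1 is a choice of this sketch.

```python
def comparison_depth(mixing_rate, initial_divergence, epsilon, horizon):
    """Smallest depth at which the divergence bound drops to epsilon or below.

    The bound is multiplied by (1 - mixing_rate) per transition. If it never
    reaches epsilon within the horizon, the horizon itself is returned.
    """
    depth = 0
    bound = initial_divergence
    while bound > epsilon and depth < horizon:
        bound *= (1.0 - mixing_rate)
        depth += 1
    # Compare at least the root action (an assumption of this sketch).
    return max(1, depth)


print(comparison_depth(0.5, 1.0, 0.1, horizon=10))  # intermediate mixing rate
print(comparison_depth(1.0, 1.0, 0.1, horizon=10))  # divergence vanishes in one step
print(comparison_depth(0.0, 1.0, 0.1, horizon=10))  # divergence never shrinks
```

With F = 1 the loop stops after one step, and with F = 0 it runs to the horizon, matching the two cases above.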
Approach #3: Action equivalence (Zeng et al. 09, 12)
Intentional or subintentional models whose predictions at time step t (action distributions) are identical are considered equivalent at t
Lossy
Works when intentional models have differing frames
Impact on dt-planning in multiagent settings
[Figure: results for multiagent tiger]
AE bounds the model space at each time step to the number of distinct actions
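Grouping by predicted action can be sketched as follows. The data layout (a dict of per-model action distributions at a fixed time step), the rounding tolerance, and the model names are assumptions for illustration. When every model predicts a single action, the number of groups cannot exceed the number of distinct actions.

```python
def action_equivalence_classes(predictions, decimals=6):
    """Group models whose predicted action distributions at time t coincide.

    predictions: dict mapping model name -> dict of action -> probability
    Returns a list of groups (lists of model names), in first-seen order.
    """
    classes = {}
    for model, dist in predictions.items():
        # Ignore zero-probability actions and round to tolerate float noise.
        key = tuple(sorted((a, round(p, decimals)) for a, p in dist.items() if p > 0))
        classes.setdefault(key, []).append(model)
    return list(classes.values())


# Hypothetical deterministic predictions over three actions.
predictions = {
    "m1": {"listen": 1.0},
    "m2": {"open-left": 1.0},
    "m3": {"listen": 1.0, "open-left": 0.0},
    "m4": {"open-right": 1.0},
}
groups = action_equivalence_classes(predictions)
print(groups)
```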
Approach #4: Influence equivalence (related to Witwicki & Durfee 11)
Intentional or subintentional models whose predictions at time step t influence the subject agent’s plan identically are considered equivalent at t
Regardless of whether the other agent opened the left or right door, the tiger resets, thereby affecting the agent’s plan identically
Influence may be measured as the change in the subject agent’s belief due to the action
Groups more models at time step t than AE
Lossy
Compression due to approximate equivalence may violate ACC
Regain ACC by appending a covering model to the compressed set of representatives
Open questions
N > 2 agents
Under what conditions could equivalent models belonging to different agents be
grouped together into an equivalence class?
Can we avoid solving models by using heuristics for identifying approximately
equivalent models?
Modeling Strategic Human Intent
Yifeng Zeng, Reader, Teesside Univ.
Previously: Assoc. Prof., Aalborg Univ.
Yingke Chen, Doctoral student
Hua Mao, Doctoral student
Muthu Chandrasekaran, Doctoral student
Xia Qu, Doctoral student
Roi Ceren, Doctoral student
Matthew Meisel, Doctoral student
Adam Goodie, Professor of Psychology, UGA
Computational modeling of human recursive thinking in sequential games
Computational modeling of probability judgment in stochastic games
Human strategic reasoning is generally hobbled by low levels of recursive thinking
(Stahl & Wilson 95, Hedden & Zhang 02, Camerer et al. 04, Ficici & Pfeffer 08)
(I think what you think that I think...)
You are Player I and II is human. Will you move or stay?
[Figure: game tree; at each decision point the player to move chooses Move or Stay]
Player to move: I, II, I, II
Payoffs (I, II) at the four outcomes: (3, 1), (1, 3), (2, 4), (4, 2)
Less than 40% of the sample population performed the rational action!
Thinking about how others think (...) is hard in general contexts
[Figure: game tree; at each decision point the player to move chooses Move or Stay]
Player to move: I, II, I, II
Payoff for I at the four outcomes: 0.6, 0.4, 0.2, 0.8
(Payoff for II is 1 minus the payoff for I)
About 70% of the sample population performed the rational action in this simpler and strictly competitive game
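The rational action in the strictly competitive game can be worked out by backward induction. The sketch below assumes three decision points with movers I, II, I (one per Move/Stay choice), that staying at decision point k ends the game with the k-th listed payoff, and that moving at the last point yields the final payoff. The function name and encoding are hypothetical.

```python
def backward_induction(payoffs_for_I, movers):
    """Solve a stay/move game from the last decision point to the first.

    payoffs_for_I[k] is player I's payoff if play stops at decision point k;
    payoffs_for_I[-1] is I's payoff if the last mover chooses to move.
    Player II's payoff is 1 minus I's payoff (strictly competitive).
    Returns the action at each decision point and I's resulting payoff.
    """
    continuation = payoffs_for_I[-1]
    actions = [None] * len(movers)
    for k in range(len(movers) - 1, -1, -1):
        stay_value = payoffs_for_I[k]
        if movers[k] == "I":
            move = continuation > stay_value
        else:
            # II maximizes 1 - payoff, which means minimizing I's payoff.
            move = continuation < stay_value
        actions[k] = "move" if move else "stay"
        if not move:
            continuation = stay_value
    return actions, continuation


actions, value = backward_induction([0.6, 0.4, 0.2, 0.8], ["I", "II", "I"])
print(actions, value)
```

Under these assumptions Player I should stay at the first decision point: moving would let II stay at the second point, leaving I with 0.4 instead of 0.6.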
Simplicity, competitiveness and embedding the task in intuitive representations seem to facilitate
human reasoning (Flobbe et al. 08, Meijering et al. 11, Goodie et al. 12)
3-stage game
Myopic opponents default to staying (level 0) while predictive opponents think about the player’s
decision (level 1)
Can we computationally model these strategic behaviors using process models?
Yes! Using a parameterized Interactive POMDP framework
Replace I-POMDP’s normative Bayesian belief update with Bayesian learning that underweights evidence, parameterized by γ
Notice that the achievement score increases as more games are played, indicating learning of the opponent models
Learning is slow and partial
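One common way to underweight evidence is to raise the likelihood to a power between 0 and 1 before normalizing. The sketch below uses that form as an assumption; the talk does not spell out the exact update, and the function name, the parameter name `gamma`, and the numbers are hypothetical.

```python
def underweighted_update(belief, likelihood, gamma):
    """Belief update that discounts evidence by an exponent gamma in [0, 1].

    gamma = 1 gives the normative Bayesian update; gamma = 0 ignores the
    evidence and returns the prior unchanged.
    """
    weighted = {m: belief[m] * likelihood[m] ** gamma for m in belief}
    total = sum(weighted.values())
    return {m: w / total for m, w in weighted.items()}


# Hypothetical belief over the opponent being myopic or predictive.
prior = {"myopic": 0.5, "predictive": 0.5}
likelihood = {"myopic": 0.2, "predictive": 0.8}

normative = underweighted_update(prior, likelihood, gamma=1.0)
slow = underweighted_update(prior, likelihood, gamma=0.5)
print(normative["predictive"], slow["predictive"])
```

With the smaller exponent the belief moves toward the predictive opponent more slowly, which is one way to capture slow and partial learning.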
Replace I-POMDP’s normative expected utility maximization with a quantal response model that selects actions with probabilities that increase with their utilities, parameterized by λ
Notice the presence of rationality errors in the participants’ choices (action is inconsistent with prediction)
Errors appear to reduce with time
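A standard form of quantal response is the logit rule, in which an action's probability is proportional to the exponential of its scaled utility. The sketch below assumes that form; the function name, the parameter name `lam`, and the utilities are hypothetical.

```python
import math


def quantal_response(utilities, lam):
    """Logit quantal response: P(a) is proportional to exp(lam * U(a)).

    lam = 0 gives uniformly random choice; larger lam approaches strict
    expected utility maximization.
    """
    best = max(utilities.values())
    # Subtract the maximum before exponentiating for numerical stability.
    weights = {a: math.exp(lam * (u - best)) for a, u in utilities.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


# Hypothetical utilities for the first decision point.
utilities = {"stay": 0.6, "move": 0.4}
for lam in (0.0, 5.0, 50.0):
    print(lam, quantal_response(utilities, lam))
```

Intermediate values of the parameter leave room for occasional choices of the lower-utility action, which is how this kind of model accounts for rationality errors.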
Underweighting evidence during learning and quantal response for
choice have prior psychological support
Use participants’ predictions of other’s action to learn γ and participants’ actions to learn λ
Use participants’ actions to learn both γ and λ
Let λ vary linearly
Insights revealed by process modeling:
1. Much evidence that participants did not make rote use of backward induction (BI) and instead engaged in recursive thinking
2. Rationality errors cannot be ignored when modeling human decision making and they may vary
3. Evidence that participants could be attributing surprising observations of others’ actions to their rationality errors
Open questions:
1. What is the impact on strategic thinking if action outcomes are uncertain?
2. Is there a damping effect on reasoning levels if participants need to concomitantly think ahead in time?
Suite of general and domain-independent approaches for compressing agent model
spaces based on equivalence
Computational modeling of human behavioral data pertaining to strategic thinking
Thank you for your time
2. Bayesian plan recognition under uncertainty
Plan recognition literature has paid scant attention to finding general ways of reducing the set of feasible
plans (Carberry, 01)
3. Games of imperfect information (Bayesian games)
Real-world applications often involve many player types
Examples:
• Ad hoc coordination in a spontaneous team
• Automated Poker player agent
Model space compression facilitates equilibrium computation