
Towards a Testbed for Evaluating Learning Systems in Gaming Simulators
21 March 2004, Symposium on Reasoning and Learning

Outline:
1. Objective
2. Specification
3. Intended functionality
4. Example of use
5. Status and Goals

David W. Aha (working with Matt Molineaux)
Intelligent Decision Aids Group
Naval Research Laboratory, Code 5515, Washington, DC
home.earthlink.net/~dwaha
[email protected]

TIELT


TRANSCRIPT

Page 1

Towards a Testbed for Evaluating Learning Systems in Gaming Simulators
21 March 2004, Symposium on Reasoning and Learning

Outline:
1. Objective
2. Specification
3. Intended functionality
4. Example of use
5. Status and Goals

David W. Aha (working with Matt Molineaux)
Intelligent Decision Aids Group
Naval Research Laboratory, Code 5515, Washington, DC
home.earthlink.net/~dwaha
[email protected]

TIELT

Page 2

Objective & Expected Benefits

Objective

Support the machine learning community by providing an API for a set of gaming engines, the ability to select a wide variety of learning and performance tasks, and an editor for specifying and conducting an evaluation methodology.


Benefits

1. Reduce costs (time, $) to create these integrations
   • Costs are often prohibitive, encouraging isolated studies
2. Encourage research on learning in cognitive systems:
   • Embedded (e.g., process-aware)
   • Rapid (i.e., few trials)
   • Knowledge-intensive
   • Enduring
3. Support analysis of alternative learning systems for a given task

Page 3

DARPA’s Cognitive Systems Thrust

IPTO: Information Processing Technology Office
• Director: Ron Brachman
• Assistant Director: Barbara Yoon

History: IPTO has many impressive contributions
• e.g., time-sharing, interactive computing, the Internet
• Long-term goal: human-computer symbiosis

Current goal: Develop "cognitive" systems that know what they're doing:
– can reason, using substantial amounts of appropriately represented knowledge
– can learn from their experience to perform better tomorrow than they did today
– can explain themselves and be told what to do
– can be aware of their own capabilities and reflect on their own behavior
– can respond robustly to surprise

Page 4

IPTO’s View of a Cognitive Agent

[Diagram: a Cognitive Agent situated in the External Environment, sensing via Perception (Sensors) and acting via Effectors (Action). Internally it layers Reactive, Deliberative, and Reflective Processes over STM and LTM (KB) holding Concepts and Sentences, with modules for Communication (language, gesture, image), Prediction/planning, Other reasoning, Affect, Attention, and Learning.]

Page 5

Learning Foci in Cognitive Systems (Langley & Laird, 2002)

Capability: Knowledge Container(s)
• Recognition & Categorization: Patterns, Pattern recognizer; Categories, Pattern categorizer
• Decision Making & Choice: Space of possible decisions; Decision selector; Conflict resolver; Decision application procedure
• Perception & Situation Assessment: Situation categories; Situation categorization; Information fuser
• Prediction & Monitoring: Environment model; Monitoring focus
• Problem Solving & Planning: Plans; Plan generator (e.g., search method); Plan adaptor; Action preconditions; Action effects; Resource allocator
• Reasoning & Belief Maintenance: Beliefs & belief relations; Inferencing knowledge and procedures; Explanation generation
• Execution & Action: Action executer; Action utility
• Interaction & Communication: NL interpretation; Dialogue coordination
• Remembering & Reflection: Recall procedure

Many opportunities for learning

Page 6

Typical Current Practice

Comparatively few machine learning (ML) efforts on cognitive systems:
• Isolated studies of learning algorithms
• Single-step, non-interactive tasks
• Knowledge-poor learning contexts

Few of today's cognitive systems support realistic learning capabilities.

[Diagram: today's ML systems are typically coupled to a Database rather than to a Cognitive System.]

Page 7

Wanted: A New Interface (thanks to W. Cohen)

[Diagram: at present, each ML System connects directly to Databases (e.g., the UCI Repository). Wanted: a shared Interface between ML systems and databases, and, by analogy, between ML systems and Cognitive Systems (e.g., TIELT?).]

Curmudgeon's viewpoint:
• This might encourage research on more challenging problems
• But don't count on it

Page 8

Your Potential Uses of TIELT (Hastily Considered…and Reaching)

1. Randy Jones: Smart way to update rule conditions
   • Use: Updating game model's tasks
2. Doug Pearson: Changing conditions on operators
   • Use: Controlling game agents
3. Prasad Tadepalli: Learning hierarchies in the game model
   • Use: Active development of a game model's task hierarchy
4. Jim Blythe: Knowledge acquisition
   • Use: Acquiring game model constraints
5. Gheorghe Tecuci: Mixed-initiative learning for knowledge acquisition
   • Use: Active learning of task models, etc.
6. Karen Myers: Incorporating guidance from humans for agent control
   • Use: Learning agent controls (assuming players can provide direct feedback)
7. Barney Pell: Learning to play any of a category of games given their rules
   • Use: Hmm…agent control, if collaborating with a game model-updating system
8. Afzal Upal: Updating plan quality
   • Use: Induce task-specific control rules
9. Susan Epstein: Learning to solve (large) CSPs
   • Use: Reasoning with game model's constraints
10. Frank Ritter: Recognition tasks (e.g., for strategies?)
   • Use: Learning opponent strategies
11. Dan Roth: Using multiple classifiers to solve problems
   • Use: Set of (coordinated) learning systems for problem solving
12. Ken Forbus: Analogical reasoning and companion cognitive systems
   • Use: Qualitative representation for game model, predicting human/agent intentions
13. Daniel Borrajo: Learning control knowledge for planning
   • Use: Incremental learning for agent control tasks
14. Niels Taatgen: Learning for real-time tasks
   • Use: Agent control, several RTS applications

Page 9

Outline (cont)

1. Objective
2. Specification of a testbed
   • Select a category of cognitive systems
   • Category-specific challenges
3. Intended functionality
4. Example of use
5. Status of project
6. Goals for future work

Page 10

Interface Comparison

Each characteristic is listed as (a) for an ML system integrated with a database (e.g., a supervised learning system) vs. (b) for an ML system integrated with a cognitive system (e.g., one involving planning):

1. Performance API: (a) input in a common data format, classification out; (b) state in, decision out
2. Learning API: (a) input matches the performance input format; (b) effects and state in
3. Integration: (a) data input module; (b) message passing
4. Performance task: (a) classification; (b) achieve a goal(s)
5. Learning task: (a) set weights, create a tree, etc.; (b) create a plan
6. Domain knowledge (and reasoning over it): (a) little or none; (b) significant (temporal, qualitative, …)
7. Evaluation methodology: (a) accuracy, ROC curves, etc.; (b) plan execution measures
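The two integration styles compared above can share one interface with two entry points. The sketch below is hypothetical (neither `LearningSystem` nor `MajorityClassifier` is part of TIELT): it shows a performance API (`decide`) and a learning API (`learn`) implemented by a toy database-style supervised learner; a planner-style system would implement the same two methods over states and plans.

```python
from abc import ABC, abstractmethod

class LearningSystem(ABC):
    """Hypothetical split of the two APIs compared above."""

    @abstractmethod
    def decide(self, state):
        """Performance API: map an input (data instance or game state) to an output."""

    @abstractmethod
    def learn(self, state, feedback):
        """Learning API: update internal knowledge from feedback."""

class MajorityClassifier(LearningSystem):
    """Toy database-style learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = {}

    def decide(self, state):
        # Classification: ignore the instance and return the majority label.
        return max(self.counts, key=self.counts.get) if self.counts else None

    def learn(self, state, feedback):
        # feedback is the true label for this training instance.
        self.counts[feedback] = self.counts.get(feedback, 0) + 1

clf = MajorityClassifier()
for label in ["orc", "orc", "knight"]:
    clf.learn(None, label)
print(clf.decide(None))  # prints "orc"
```

The point of the shared base class is that an evaluation harness can drive either kind of learner without knowing which it has.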

Page 11

What type of Cognitive System?

Desiderata:
1. Available implementations
   • Inexpensive to acquire and run
2. Pushes ML research boundaries
   • Challenging embedded learning tasks
3. Significant interest/excitement
   • Military, industry, academia, funding

Candidate: Interactive Gaming Simulators

Page 12

Gaming Genres (Laird & van Lent, 2001)

Genre (Example): Description. Sub-genres. AI roles.
• Action (Quake, Unreal): control a character. Sub-genres: 1st vs. 3rd person, solo vs. team play. AI roles: control enemies.
• Role-Playing (Diablo): be a character. Sub-genres: solo vs. (massively) multi-player. AI roles: control enemies, partners, and supporting characters.
• Adventure (King's Quest, Blade Runner): player solves puzzles, interacting w/ others. Sub-genres: linear vs. dynamic scripting. AI roles: control supporting characters.
• Strategy (Age of Empires, Warcraft, Civilization): god's eye view, controls many units (e.g., tactical warfare). AI roles: control all units and strategic enemies.
• God (SimCity, The Sims): control a simulated world & its units. AI roles: control unit goals and goal-achievement strategies.
• Individual Sports (many, e.g., driving games): individual competition. Sub-genres: 1st vs. 3rd person. AI roles: control enemy.
• Team Sports (Madden NFL Football): act as coach and a key player. AI roles: control units and strategic enemy (i.e., other coach), commentator.

Unfortunately,…reaction time and aiming skill are the most important factors in success in a first-person shooter game. Deep reasoning about tactics and strategy don’t end up playing a big role as might be expected. (van Lent et al., 2004)

Page 13

Real-Time Strategy (RTS) Games(Buro & Furtak, 2003)

Fundamental AI research problems

1. Adversarial real-time planning
   • Motivates need for abstractions of world state
2. Decision making under uncertainty
   • e.g., opponent intentions
3. Opponent modeling, learning
   • "One of the biggest shortcomings of current RTS game AI systems is their inability to learn quickly… Current ML approaches…are inadequate in this area."
4. Spatial and temporal reasoning
5. Resource management
6. Collaboration
7. Pathfinding

Page 14

Military: Learning in Simulators for Computer Generated Forces (CGF)

Purpose: Training (present) & planning (future)

• Simulators: JWARS, OneSAF, Full Spectrum Command, etc.
• Target: Control strategic opponent or own units

Evidence of commitment: Some Claims

• "Learning is an essential ability of intelligent systems" (NRC, 1998)
• "To realize the full benefit of a human behavior model within an intelligent simulator,…the model should incorporate learning" (Hunter et al., CCGBR'00)
• "Successful employment of human behavior models…requires that [they] possess the ability to integrate learning" (Banks & Stytz, CCGBR'00)

Status:

• No CGF simulator has been deployed with learning (D. Reece, 2003)
• Problems: performance (costly training), overtraining, behavioral accuracy (e.g., learned behaviors may become unpredictable), constraint violations (learned behaviors do not follow doctrine), difficult to isolate the utility of learning (Petty, CGFBR'01)

Page 15

Industry: Learning in Video and Computer Games

Focus: Increase sales via enhanced gaming experience

• Simulators: Many! (e.g., SimCity, Quake, SoF, UT)
• Target: Control avatars, unit behaviors

Status

• Few deployed systems have used learning (Kirby, 2004), e.g.:
   1. Black & White: on-line, explicit (player immediately reinforces behavior)
   2. C&C Renegade: on-line, implicit (agent updates set of legal paths)
   3. Re-Volt: off-line, implicit (GA tunes racecar behaviors prior to shipping)
• Problems: performance, constraints (preventing learning "something dumb"), trust in the learning system

Evidence of commitment

• Developers: "keenly interested in building AIs that might learn, both from the player & environment around them." (GDC'03 Roundtable Report)
• Middleware products that support learning (e.g., MASA, SHAI, LearningMachine)
• Long-term investments in learning (e.g., iKuni, Inc.)

"A computer that learns is worth 10 Microsofts." (B. Gates, 2004)

Page 16

Academia: Learning in Interactive Computer Games

Focus: Several research thrusts

• Game engines (e.g., GameBots, ORTS, RoboCup Soccer Server); use (other) open-source engines (e.g., FreeCiv, Stratagus)
• Representation (e.g., Forbus et al., 2001; Houk, 2004; Munoz-Avila & Fisher, 2004)
• Knowledge acquisition (e.g., Hieb et al., 1995)
• Supervised learning of lower-level behaviors (e.g., Geisler, 2002)
• Learning plans (e.g., Fasciano, 1996)
• Learning opponent unit models (e.g., Laird, 2001; Hill et al., 2002)
• Learning to provide advice (e.g., Sweetser & Dennis, 2003)
• Learning hierarchical knowledge (e.g., van Lent & Laird, 1998)
• Learning rule preferences (e.g., Ponsen, 2004)

Status: Publication options (specific to AI & gaming)
• AAAI symposia and workshops (several), e.g., AAAI'04 Workshop on Challenges in Game AI
• International Conference on Computers and Games
• Journals: J. of Game Development, Int. Computer Games J.

Page 17

Academia: Learning in Interactive Computer Games (cont.)

Example integrations:

Name + Reference: Game Engine; Learning Approach; Tasks
• (Goodman, AAAI'93): Bilestoad; projective visualization; fighting maneuvers
• CAPTAIN (Hieb et al., CCGFBR'95): ModSAF; multistrategy (e.g., version spaces); platoon placement
• MAYOR (Fasciano, 1996; U. of Chicago Dept. of CS TR 96-05): SimCity; case-based reasoning; city development
• (Fogel et al., CCGFBR'96): ModSAF; genetic programming; tank movements
• KnoMic (van Lent & Laird, ICML'98): ModSAF; rule condition learning in SOAR; aircraft maneuvers
• (Geisler, 2002): Soldier of Fortune; multiple (e.g., boosting, backprop); FPS action selection
• (Sweetser & Dennis, 2003): Tubby Terror; regression; advice generation
• (Chia & Williams, BRIMS'03): TankSoar; naïve Bayes classification; tank behaviors
• (Ponsen, 2004): Wargus/Stratagus; genetic algorithms (dynamic scripting); strategic rule selection

Page 18

Summary: Some Additional Challenges with Embedding Learning in Gaming Simulators

1. Low CPU requirements (e.g., in real-time games)
2. Constraining learned knowledge
   • Must not violate expectations
3. Learning & reasoning (e.g., planning)
4. Isolating learning contributions (for evaluation)

Page 19

Specification for Integrating Learning Systems with Gaming Simulators

1. Simplifies integration!
   • Interests ML researchers
   • Interests game developers
2. Learning focus concerns at least three types of models:
   • Task (e.g., learn how to perform, or advise on, a task)
   • Player (e.g., learn a human player's strategies)
   • Game (e.g., learn its objects, their relations & functions)
   • State interpretation/abstraction
3. Learning methods: a wide variety
   • They should be able to output their learned behaviors for inspection (e.g., by game developers)
4. Game engines: those with challenging learning tasks
   • i.e., large hypothesis spaces, knowledge-intensive
5. Supports reuse via modularity (to be at all feasible)
   • Abstracts interface definitions from game & task models
6. Free (unlike some similar commercial tools)
   • Preferably, open source

Page 20

Outline (cont)

1. Objective
2. Specification of a testbed
3. Intended functionality
   • Interaction
   • Types of use
   • Open issues
4. Example of use
5. Status and Goals

Page 21

TIELT: Testbed for Integrating and Evaluating Learning Techniques

[Architecture diagram: the TIELT User works through Editors to create five Knowledge Bases (Game Model Description, Game Interface Description, Learning Interface Description, Task Descriptions, Evaluation Methodology Description). TIELT sits between a Game Engine (e.g., Stratagus, FreeCiv) with its Game Player(s) and one or more Reasoning & Learning Systems (Learning System #1 … #n), accumulates Learned Knowledge, and reports through Prediction, Advice, and Evaluation Displays.]

Page 22

TIELT Knowledge Bases

• Game Model Description: defines the interpretation of the game (e.g., objects, operators, behaviors model, tasks, initial state)
• Game Interface Description: defines communication processes with the game engine
• Learning Interface Description: defines communication processes with the learning system
• Task Descriptions: define the selected learning and performance tasks (selected from the game model description)
• Evaluation Methodology Description: defines the empirical evaluation to conduct
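To make the division of labor concrete, here is a minimal sketch of how these five knowledge bases might be captured as plain data. Every field name and value in it is an assumption for illustration (FreeCiv aside), not TIELT's actual schema.

```python
# Hypothetical sketch of the five TIELT knowledge bases as plain data.
# All field names and values below are illustrative, not TIELT's schema.
experiment = {
    "game_model": {                       # interpretation of the game
        "objects": ["city", "settler", "tile"],
        "operators": ["build_city", "move"],
        "initial_state": {"turn": 0},
        "tasks": ["city placement"],
    },
    "game_interface": {                   # communication with the game engine
        "engine": "FreeCiv",
        "messages": ["state_update", "action"],
    },
    "learning_interface": {               # communication with the learning system
        "inputs": ["processed_state"],
        "outputs": ["decision"],
    },
    "task": {                             # learning/performance tasks, drawn
        "performance": "city placement",  # from the game model description
        "learning": "learn a placement policy",
    },
    "evaluation": {                       # empirical evaluation to conduct
        "trials": 10,
        "metric": "average score",
    },
}

# Sanity check: the selected task must come from the game model description.
assert experiment["task"]["performance"] in experiment["game_model"]["tasks"]
print(sorted(experiment))
# prints ['evaluation', 'game_interface', 'game_model', 'learning_interface', 'task']
```

The sanity check mirrors the slide's note that task descriptions are selected from the game model description, not defined independently.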

Page 23

Data Sources and Targeted Functionalities

Example learning functionalities supported:
1. Learning from observations (e.g., behavioral cloning)
2. Active learning
3. Learning from advice (requires inputs from user)
4. Learning to advise
5. …

Data sources:
1. Game (world) model (possibly incomplete, incorrect)
2. Simulator
   • Passive state observations (e.g., behavioral cloning)
   • Active testing (e.g., apply an action in a state)
3. Humans
   • Advice

Page 24

Example TIELT Usage: Controlling a Game Character

[Architecture diagram as on page 21: the Game Engine sends a Raw State to TIELT, which forwards a Processed State to the learning system; the learning system returns a Decision, which TIELT translates into an Action for the engine.]
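The character-control loop on this slide can be sketched end-to-end. Everything below is illustrative: `abstract_state`, `LearnerStub`, and the toy grid `engine_step` stand in for the game model, the learning system, and the game engine, respectively.

```python
import random

def abstract_state(raw_state):
    # Game-model step: reduce the raw engine state to features the learner sees.
    x, y = raw_state["pos"]
    return "near_goal" if abs(x - 3) + abs(y - 3) <= 1 else "far"

class LearnerStub:
    """Toy learning system: remembers which action last earned a reward
    in each processed state, otherwise explores at random."""

    def __init__(self, rng):
        self.rng = rng
        self.policy = {}

    def decide(self, processed):
        return self.policy.get(processed) or self.rng.choice("NSEW")

    def learn(self, processed, action, reward):
        if reward > 0:
            self.policy[processed] = action
        elif self.policy.get(processed) == action:
            del self.policy[processed]  # drop an action that stopped helping

def engine_step(raw_state, action):
    # Toy game engine: move on a grid; reward progress toward the goal (3, 3).
    dx, dy = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}[action]
    x, y = raw_state["pos"]
    new_state = {"pos": (x + dx, y + dy)}
    closer = (abs(x + dx - 3) + abs(y + dy - 3)) < (abs(x - 3) + abs(y - 3))
    return new_state, (1 if closer else -1)

learner = LearnerStub(random.Random(0))
raw = {"pos": (0, 0)}
for _ in range(20):
    processed = abstract_state(raw)         # raw state -> processed state
    action = learner.decide(processed)      # decision from the learning system
    raw, reward = engine_step(raw, action)  # TIELT relays the action
    learner.learn(processed, action, reward)
print(learner.policy)
```

The loop body is the slide's four arrows in order: raw state, processed state, decision, action, with the reward closing the learning feedback path.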

Page 25

Example TIELT Usage: Advising a Game Player

[Architecture diagram as on page 21: the Game Engine sends a Raw State; TIELT forwards a Processed State to the learning system, which returns a Decision and Reason; TIELT presents these to the Game Player(s) as Advice and an Explanation via the Advice Display.]

Page 26

Example TIELT Usage: Predicting a Game Player's Actions

[Architecture diagram as on page 21: from the Processed State, the learning system returns a Prediction and Reason; TIELT shows a Prediction and Explanation via the Prediction Display.]

Page 27

Example TIELT Usage: Updating a Game Model

[Architecture diagram as on page 21: from the Processed State, the learning system issues Edits that revise the Game Model Description.]

Page 28

Example TIELT Usage: Building a Task/Player Model

[Architecture diagram as on page 21: from the Processed State, the learning system induces a Model that TIELT stores as Learned Knowledge.]

Page 29

Intended Use Cases

Game Developer (drawing on a repository of learning systems and learning interface descriptions):
1. Define/store game engine interface
2. Define/store game model
3. Select learning system & interface
4. Select learning and performance tasks
5. Define (or select) evaluation methodology
6. Run experiments
7. Analyze displayed results

ML Researcher (drawing on a repository of game engines and game interface descriptions):
1. Define/store learning system interface
2. Select game engine & interface
3. Select game model
4. Select learning and performance tasks
5. Define (or select) evaluation methodology
6. Run experiments
7. Analyze displayed results
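Steps 5 through 7 in both use cases (define an evaluation methodology, run experiments, analyze results) could be scripted as below. This is a sketch under assumptions: `run_trial` is a hypothetical stand-in for a full TIELT-mediated game episode, and the methodology is a simple repeated-trials comparison with and without the learning system, which is one way to isolate the contribution of learning.

```python
import random
import statistics

def run_trial(use_learning, rng):
    # Stand-in for one TIELT-mediated episode; returns a game score.
    score = rng.gauss(50, 5)
    return score + (10 if use_learning else 0)  # pretend learning adds skill

def run_experiment(trials=30, seed=42):
    # Evaluation methodology: equal numbers of trials with and without the
    # learning system, so its contribution can be compared directly.
    rng = random.Random(seed)
    with_learning = [run_trial(True, rng) for _ in range(trials)]
    without = [run_trial(False, rng) for _ in range(trials)]
    return statistics.mean(with_learning), statistics.mean(without)

mean_learn, mean_base = run_experiment()
print(round(mean_learn - mean_base, 1))  # observed advantage of learning
```

Fixing the seed keeps runs repeatable, which matters when comparing alternative learning systems on the same task.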

Page 30

Some Open Questions

1. Game model:
   • What representation? STRIPS operators? Hierarchical task networks? Explicit constraints?
   • How to communicate it to the learning system?
   • Should it instead be maintained in the learning system?
2. What standards for:
   • Game engine message passing
   • Learning system message passing
   • Output format for learned knowledge
3. Support both on-line and off-line studies?
4. What representations for advice and explanations?
5. How to explicitly represent & apply constraints on learned knowledge?
6. How to evaluate TIELT's utility?


Outline (cont.)

1. Objective
2. Specification of a testbed
3. Intended functionality
   • Interaction
   • Types of use
   • Open issues
4. Example of use
   • Demonstration of initial GUI
   • Simple “city placement” task
5. Status and Goals


Outline (cont.)

1. Objective
2. Specification of a testbed
3. Intended functionality
   • Interaction
   • Types of use
   • Open issues
4. Example of use
5. Status and Goals


Status and Goals

TIELT Specification

TIELT (Initial GUI)

Matt Molineaux


Status and Goals: Recent Influences

1. Full Spectrum Command (van Lent et al., 2004)
   • Multiple AI systems, one game engine

2. ORTS (Open Real-Time Strategy) project: open source RTS game engine
   • Free
   • Flexible game specification (via scripts)
   • Hack-free server-side simulation
   • Open message protocol: players have total control
   • Prefer ORTS to Stratagus?

3. Collaboration with Lehigh University (Asst. Prof. H. Muñoz-Avila)
   • Extended Hierarchical Task Network (HTN) process representation for the Game Model’s tasks?
   • Fall 2004 PhD candidate: first to integrate ML with Stratagus
   • Fall 2004 student: will develop Game Models for us


Conclusion

Objective

Support the machine learning community by providing an API for a set of gaming engines, the ability to select a wide variety of learning and performance tasks, and an editor for specifying and conducting an evaluation methodology.

Status

• Started 12/03, effectively
• Initial GUI implementation
• Many open research questions

Goals

• 9/04: First complete implementation
• Incrementally integrate with game engines, learning systems
• Document & publicize for use to gain ML interest
• Subsequently, seek military/industry interest

And the game-developer community? Other research communities?


Backup Slides


TIELT: Initial Vision (DARPA, 11/13/03)

Goal: Wargaming testbed for the machine learning community
– Explore learning techniques in the context of today’s latest simulations & video games
– Facilitate exploration of strategies and “what if” scenarios
– Provide common platform for evaluating different learning techniques

[Diagram: New Learning Techniques and a Development Environment connect via an API to a Video Wargaming Testbed.]

Technical Approach: Enable insertion of learning/KA techniques into state-of-the-art video combat & strategy games
– Create API for integrating learning into selected video games
  • e.g., comm. module, socket interface, client-server comms protocol & language
– Create API that enables learning in computer generated forces (CGF) tools


API for Isolated Studies

Functionality: Supervised learning using a passive dataset

Performance (Classifier):
• Task: Classification
• Interface:
  – Input: None
  – Output: Common access format (across all tasks & datasets)

Learning:
• Task: Varies (e.g., tree, weight settings)
• Interface:
  – Input: Data instance or set
    • Common format (across all tasks & systems)
  – Output: Classification decision
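The interface pair above could be sketched as follows. This is a minimal illustration, not TIELT’s actual API: the class names, the `Instance` format, and the majority-class stand-in learner are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical common instance format (shared across all tasks & datasets).
@dataclass
class Instance:
    features: List[float]
    label: Optional[str] = None

class IsolatedLearner:
    """Isolated-study interface: train on a passive dataset, then classify."""
    def train(self, data: List[Instance]) -> None:
        # A trivial majority-class rule stands in for a real learning algorithm.
        labels = [ex.label for ex in data if ex.label is not None]
        self.majority = max(set(labels), key=labels.count)

    def classify(self, instance: Instance) -> str:
        # Output: a classification decision in the common format.
        return self.majority

learner = IsolatedLearner()
learner.train([Instance([0.1], "win"), Instance([0.2], "win"), Instance([0.9], "loss")])
decision = learner.classify(Instance([0.5]))  # → "win"
```

Any real learner (tree induction, weight tuning) would plug in behind the same `train`/`classify` pair, which is the point of the common access format.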


API for Cognitive Learning

Functionality: Learning by doing/being told/observation/etc.

Performance (Cognitive System):
• Task: Varied (e.g., planning, design, diagnosis, …, classification)
• Interface:
  – Input: Action
  – Output: Current state

Learning:
• Task: Varies (e.g., rule application conditions)
• Interface:
  – Input: Processed current state
  – Output: Decision
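The action-in/state-out loop implied by these interfaces can be sketched as below. Everything here is illustrative: the counter environment and threshold rule are stand-ins, not TIELT components.

```python
# Hypothetical cognitive-learning loop: the performance system maps
# actions to states; the learner maps processed states to decisions.
class PerformanceSystem:
    """Tiny stand-in environment: the state is a counter moved by actions."""
    def __init__(self) -> None:
        self.state = 0

    def apply(self, action: int) -> int:
        self.state += action      # Input: action; Output: current state
        return self.state

class Learner:
    def decide(self, processed_state: int) -> int:
        # Input: processed current state; Output: decision.
        return 1 if processed_state < 3 else 0

env, agent = PerformanceSystem(), Learner()
trace = []
for _ in range(5):
    decision = agent.decide(env.state)
    trace.append(env.apply(decision))
# trace is [1, 2, 3, 3, 3]: the agent advances until its rule says stop.
```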


Commercial Game Roles for AI (Laird & van Lent, 2001)

Tactical roles:

• Enemies
  – Focus: Challenge human player
  – State of the art: Cheats; scripts using FSMs; path planning; expert systems
  – AI needs: Situation assessment, user modeling, spatial & temporal reasoning, planning, plan recognition, learning
• Partners
  – Focus: Cooperation & coordination w/ human
  – State of the art: Scripted responses to specific commands
  – AI needs: Speech recognition, NLP, gesture recognition, user modeling, adaptation
• Support Characters
  – Focus: Guide/interact with human
  – State of the art: Canned responses
  – AI needs: NL understanding & generation, path planning, coordination

Strategic roles:

• Opponents
  – Focus: Develop high-level strategy, allocate resources, & issue unit-level commands
  – State of the art: Cheating, etc.
  – AI needs: Integrated planning, commonsense reasoning, spatial reasoning, plan recognition, resource allocation
• Units
  – Focus: Carry out high-level commands autonomously
  – State of the art: FSMs and path planning
  – AI needs: Commonsense reasoning & coordination
• Commentators
  – Focus: Observe and comment on game play
  – AI needs: NL generation, plan recognition


TIELT Architecture

[Diagram: TIELT mediates between Game Engines (e.g., Stratagus, FreeCiv) and one or more Learning Systems (System #1 … System #n). Editors (Game Interface, Learning Interface, Game Model, Task, Evaluation Methodology) let the user author the Game Interface Description, Learning Interface Description, Game Model Description, and Task Descriptions (learning and performance tasks). At run time, the Model Updater maintains the Current State from engine-state percepts; the Learning Translator (Mapper) passes a translated model subset to the Learning Systems; the Action Translator (Mapper) turns learning outputs into game actions; the Controller coordinates, the Evaluator drives the Evaluation Display per the Evaluation Settings, a Database Engine manages Stored States in a Database, and an Advice Display presents advice to the user.]


1. Sensing the Game State

[Diagram: Game Engine → Model Updater → Game Interface Description / Game Model Description → Current State → Controller]

1. In the Game Engine, the game begins and the colony pod is created and placed.
2. The Game Engine sends a “See” sensor message stating where the pod is.
3. The Model Updater receives the sensor message and finds the corresponding message template in the Game Interface Description.
4. The message template provides updates to the Game Model Description, which tell the Current State that there is a pod at the location See describes.
5. The Model Updater notifies the Controller that the See event has occurred.
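The template-lookup-and-update portion of this flow can be sketched minimally. The message shape, the dictionary-based interface description, and the update function are illustrative assumptions, not TIELT’s actual formats.

```python
# Hypothetical sketch of sensing: a "See" message arrives, the matching
# template is found in the interface description, the model is updated,
# and the Controller is notified.
game_interface_description = {
    # Template: message name -> how it updates the current state.
    "See": lambda params, state: state.update({"pod_location": params["location"]}),
}

current_state = {}
controller_events = []

def model_updater(message):
    name, params = message["name"], message["params"]
    template = game_interface_description[name]   # find the message template
    template(params, current_state)               # update the current state
    controller_events.append(name)                # notify the Controller

# The Game Engine reports where the colony pod was placed.
model_updater({"name": "See", "params": {"location": (4, 7)}})
```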


2. Fetching Decisions from the Learning System

[Diagram: Controller → Learning Translator → Learning System(s) (System #1 … System #n) → Action Translator]

1. The Controller notifies the Learning Translator that it has received a See message.
2. The Learning Translator finds a city-location task that is triggered by the See message. It queries the Controller for the learning mode, then creates a TestInput message to send to the learning system, with information on the pod’s location and the map from the Current State.
3. The Learning Translator transmits the TestInput message to the appropriate Learning System(s).
4. The Learning System(s) transmit output to the Action Translator.
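The TestInput round trip might look like the sketch below. The task table, message fields, and “one tile east” stand-in learner are hypothetical, chosen only to make the flow concrete.

```python
# Hypothetical sketch of decision fetching: a See event triggers the
# city-location task, a TestInput is built from the Current State, and
# the learning system returns an output for the Action Translator.
current_state = {"pod_location": (4, 7), "map": "tiny_map"}
tasks = {"See": "city-location"}   # which task each message type triggers

def learning_translator(event):
    task = tasks[event]                       # find the triggered task
    return {                                  # build the TestInput message
        "task": task,
        "pod_location": current_state["pod_location"],
        "map": current_state["map"],
    }

def learning_system(test_input):
    # Stand-in learner: propose founding the city one tile east of the pod.
    x, y = test_input["pod_location"]
    return {"name": "TestOutput", "destination": (x + 1, y)}

test_input = learning_translator("See")
learning_output = learning_system(test_input)  # goes to the Action Translator
```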


3. Acting in the Game World

[Diagram: Learning System → Action Translator → Game Engine / Advice Display]

1. The Action Translator receives a TestOutput message from a Learning System.
2. The Action Translator finds the TestOutput message template, determines that it is associated with the city-location task, and builds a MovePod operator (defined by the Current State) with the parameters of TestOutput.
3. The Action Translator determines that the Move action from the Game Interface Description is triggered by the MovePod operator, binds Move using information from MovePod, then sends Move to the Game Engine.
4. The Game Engine receives Move and updates the game to move the pod toward its destination, or
5. the Advice Display receives Move and displays advice to a human player on what to do next.
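The TestOutput-to-action mapping can be sketched as below. The operator and action dictionaries are illustrative stand-ins for TIELT’s message templates, and the engine teleports the pod rather than pathing toward the destination.

```python
# Hypothetical sketch of acting: a TestOutput message becomes a MovePod
# operator, which triggers a Move action sent to the game engine.
def action_translator(test_output):
    # Bind the MovePod operator with TestOutput's parameters.
    move_pod = {"operator": "MovePod", "destination": test_output["destination"]}
    # MovePod triggers the engine-level Move action from the interface description.
    return {"action": "Move", "to": move_pod["destination"]}

class GameEngine:
    def __init__(self, pod_at):
        self.pod_at = pod_at

    def receive(self, action):
        # Update the game: move the pod to its destination (teleport, for brevity).
        if action["action"] == "Move":
            self.pod_at = action["to"]

engine = GameEngine(pod_at=(4, 7))
engine.receive(action_translator({"name": "TestOutput", "destination": (5, 7)}))
```

In advice mode the same Move message would instead be routed to the Advice Display rather than executed.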


4. Displaying an Operation to the User

[Diagram: Controller → Evaluator → Task Descriptions / Current State → Evaluation Display]

1. The Evaluator is triggered by the Controller, according to a trigger from the Evaluation Settings.
2. The Evaluator obtains performance metrics from each Task and calculates them on the Current State.
3. The Evaluator sends the new metric values to the Evaluation Display, which updates with the new information.
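Steps 2–3 amount to mapping each task’s metric over the current state and pushing the results to a display. The metric itself (cities founded per turn) and the state fields are invented for illustration.

```python
# Hypothetical sketch of evaluation: each task exposes a metric function,
# the Evaluator computes it on the Current State, and the new values go
# to the Evaluation Display.
current_state = {"cities": 3, "turns": 60}

task_metrics = {
    "city-location": lambda s: s["cities"] / s["turns"],  # cities per turn
}

evaluation_display = {}

def evaluator():
    for task, metric in task_metrics.items():             # metric from each task
        evaluation_display[task] = metric(current_state)  # update the display

evaluator()
```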


5. Retrieving States from a Database

[Diagram: Controller → Database Engine → Database / Stored State → Learning Translator]

1. When in Record mode, the Controller triggers the Database Engine when the state updates.
2. The Database Engine records the Current State in a Database for later use.
3. Later, in Playback mode, the Controller triggers the Database Engine after the Learning System indicates readiness.
4. The Database Engine then queries a Database and retrieves a Stored State.
5. Finally, the Controller notifies the Learning System that an update has arrived and to query the Stored State for message info.
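The record/playback cycle reduces to snapshotting states on update and replaying them on demand. The in-memory list standing in for the Database, and the state fields, are illustrative assumptions.

```python
# Hypothetical sketch of record/playback: in Record mode each state update
# is snapshotted; in Playback mode stored states are retrieved for the
# learning system.
database = []

def on_state_update(mode, current_state):
    if mode == "record":
        database.append(dict(current_state))   # record a Current State snapshot

def playback(step):
    return database[step]                      # retrieve a Stored State

# Record two state updates, then replay the first for the learning system.
on_state_update("record", {"turn": 1, "pod_location": (4, 7)})
on_state_update("record", {"turn": 2, "pod_location": (5, 7)})
stored_state = playback(0)
```

Replaying recorded episodes this way is what lets off-line studies reuse a single game run across many learning systems.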