
Practical Reasoning Agents

Based on "An Introduction to MultiAgent Systems" and slides by Michael Wooldridge

What is Practical Reasoning?

Practical reasoning is reasoning directed towards actions: the process of figuring out what to do

Practical reasoning is a matter of weighing conflicting considerations for and against competing options, where the relevant considerations are provided by what the agent desires/values/cares about and what the agent believes.

Practical reasoning is different from theoretical reasoning (directed towards beliefs), e.g., all men are mortal ∧ Socrates is a man → Socrates is mortal

The Components of Practical Reasoning

Practical Reasoning = Deliberation + Means-Ends Reasoning

Deliberation: deciding what state of affairs we want to achieve; the outputs of deliberation are intentions

Means-ends reasoning: deciding how to achieve these states of affairs; the outputs of means-ends reasoning are plans

Computations and Resource Bounds

An agent has a fixed amount of memory and a fixed processor (resource bounds) to carry out the necessary computations

Resource bounds impose limits on the size of computations

Time constraints in real environments impose limits on the size of computations

Implications

1 An agent must control its reasoning effectively if it is to perform well

2 An agent cannot deliberate indefinitely (even if the state of affairs decided upon is not optimal)

Intentions

Several types of intention (in ordinary speech)

1 Intention may characterize an action: I might intentionally push someone under a train, and push them with the intention of killing them

2 Intention may characterize a state of mind: I might have the intention this morning of pushing someone under a train this afternoon

We focus on future-directed intentions as states of mind: intentions that an agent has towards some future state of affairs

Intentions and Desires

Intentions influence actions more strongly than desires do

My desire to play basketball this afternoon is merely a potential influencer of my conduct this afternoon. It must vie with my other relevant desires [. . . ] before it is settled what I will do. In contrast, once I intend to play basketball this afternoon, the matter is settled: I normally need not continue to weigh the pros and cons. When the afternoon arrives, I will normally just proceed to execute my intentions.

Intentions in Practical Reasoning

Intentions drive means-ends reasoning

If I have formed an intention, then I will attempt to achieve it (including deciding how to achieve it). If one course of action fails, then I will typically attempt others

Intentions persist

I will not give up my intentions without good reason. Intentions will persist until I have achieved them, I believe I cannot achieve them, or I believe the reason for the intention is no longer present

Intentions in Practical Reasoning

Intentions constrain further deliberation

I will not consider options that are incompatible with my current intentions (a "filter of admissibility")

Intentions influence the beliefs upon which practical reasoning is based

If I adopt an intention, then I believe I will achieve the intention. However, I must also recognize the possibility that I can fail to bring the intention about

Symbolic Representation of Beliefs, Desires and Intentions

An agent has to maintain an explicit symbolic representation (e.g., Prolog facts) of its beliefs, desires, and intentions. Let

B be a variable for current beliefs

Bel be the set of all such beliefs

D be a variable for current desires

Des be the set of all desires

I be a variable for current intentions

Int be the set of all intentions

Deliberation

Deliberation is modeled via two functions

an option generation function

options : 2^Bel × 2^Int → 2^Des

a filtering function

filter : 2^Bel × 2^Des × 2^Int → 2^Int

An agent's belief update process is modeled through a belief revision function

brf : 2^Bel × Per → 2^Bel
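As a toy illustration of these signatures, here is a minimal Python sketch; the tuple encoding of atoms and the placeholder function bodies are assumptions of these notes, not part of the slides.

from typing import Set, Tuple

Atom = Tuple[str, ...]          # e.g., ("On", "A", "B") encodes On(A, B)
Percept = Atom

def brf(B: Set[Atom], rho: Percept) -> Set[Atom]:
    """Belief revision function: fold a new percept into the beliefs."""
    return B | {rho}            # placeholder revision policy

def options(B: Set[Atom], I: Set[Atom]) -> Set[Atom]:
    """Option generation: propose desires given beliefs and intentions."""
    return set()                # placeholder: domain-specific in practice

def filter_(B: Set[Atom], D: Set[Atom], I: Set[Atom]) -> Set[Atom]:
    """Filtering function (filter_ avoids shadowing Python's builtin filter)."""
    return I | D                # placeholder: keep old intentions, adopt all options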

Means-Ends Reasoning

Means-ends reasoning is the process of deciding how to achieve an end (an intention an agent has) using the available means (i.e., the actions that an agent can perform)

Means-ends reasoning is better known as planning (in the AI community)

Planning is essentially automatic programming

Planner

A planner is a system that takes as input (representations of) the following:

1 a goal, intention, or task

2 the current state of the environment (the agent's beliefs)

3 the actions available to the agent

As output, a planner generates a plan (a course of action, a "recipe")

STRIPS

STRIPS, the first real planner, was developed in the late 1960s/early 1970s

Two basic components of STRIPS

a model of the world (a set of formulas of first-order logic)
a set of action schemata describing the preconditions and effects of all actions

The planning algorithm was based on

finding a difference between the current state of the world and the goal state
reducing this difference by applying an appropriate action
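As a toy Python illustration of the difference-reduction idea; it assumes the pre/add action descriptors introduced on the following slides, and is only a caricature of the real STRIPS algorithm, which used means-ends analysis working backwards from the goal:

def select_action(state, goal, actions):
    """Pick an applicable action whose add list reduces the
    difference between the current state and the goal state."""
    difference = goal - state
    for act in actions:
        if act.pre <= state and act.add & difference:
            return act
    return None                 # no single action reduces the difference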

The Blocks World

Contains a robot arm, 3 blocks (A, B, C) of equal size, and a table-top.

Representations in the Blocks World

Predicates for describing the Blocks World

Predicate      Meaning
On(x, y)       object x is on top of object y
OnTable(x)     object x is on the table
Clear(x)       nothing is on top of object x
Holding(x)     the robot arm is holding x
ArmEmpty       the robot arm is empty (not holding anything)

Representations in the Blocks World

The current (initial) state of the Blocks World (closed world assumption)

{Clear(A), On(A, B), OnTable(B), OnTable(C), Clear(C)} ∪ {ArmEmpty}

A goal to achieve (intention to bring about)

{OnTable(A), OnTable(B), OnTable(C)}
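Under the closed world assumption a belief database is just a finite set of ground facts, so the state and goal above can be written down directly; a minimal Python sketch (the tuple encoding of predicates is an assumption of these notes):

# Ground facts as tuples: ("On", "A", "B") stands for On(A, B).
initial_state = {
    ("Clear", "A"), ("On", "A", "B"), ("OnTable", "B"),
    ("OnTable", "C"), ("Clear", "C"), ("ArmEmpty",),
}

# The goal: all three blocks on the table.
goal = {("OnTable", "A"), ("OnTable", "B"), ("OnTable", "C")}

# Closed world assumption: any fact not in the database is false.
def holds(delta, fact):
    return fact in delta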

Actions in the Blocks World

Each action is characterized by

a name, which may have arguments

a precondition list: a list of facts that must be true for the action to be executed

a delete list: a list of facts that are no longer true after the action is performed

an add list: a list of facts made true by executing the action

Actions in the Blocks World

The Stack action occurs when the robot arm places the object x it is holding on top of object y

Stack(x, y)

pre: {Clear(y), Holding(x)}
del: {Clear(y), Holding(x)}
add: {ArmEmpty, On(x, y)}

The UnStack action occurs when the robot arm picks an object x up from on top of another object y

UnStack(x, y)

pre: {On(x, y), Clear(x), ArmEmpty}
del: {On(x, y), ArmEmpty}
add: {Holding(x), Clear(y)}

Actions in the Blocks World

The Pickup action occurs when the arm picks up an object x from the table

Pickup(x)

pre: {Clear(x), OnTable(x), ArmEmpty}
del: {OnTable(x), ArmEmpty}
add: {Holding(x)}

The PutDown action occurs when the arm places the object x onto the table

PutDown(x)

pre: {Holding(x)}
del: {Holding(x)}
add: {ArmEmpty, OnTable(x)}
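The four schemata translate mechanically into precondition/delete/add triples. A sketch in Python, with each schema instantiated by an ordinary function (the Action dataclass is illustrative, not a fixed API; the field is named dele because del is a Python keyword):

from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset     # precondition list
    dele: frozenset    # delete list
    add: frozenset     # add list

def Stack(x, y):
    return Action(f"Stack({x},{y})",
                  pre=frozenset({("Clear", y), ("Holding", x)}),
                  dele=frozenset({("Clear", y), ("Holding", x)}),
                  add=frozenset({("ArmEmpty",), ("On", x, y)}))

def UnStack(x, y):
    return Action(f"UnStack({x},{y})",
                  pre=frozenset({("On", x, y), ("Clear", x), ("ArmEmpty",)}),
                  dele=frozenset({("On", x, y), ("ArmEmpty",)}),
                  add=frozenset({("Holding", x), ("Clear", y)}))

def Pickup(x):
    return Action(f"Pickup({x})",
                  pre=frozenset({("Clear", x), ("OnTable", x), ("ArmEmpty",)}),
                  dele=frozenset({("OnTable", x), ("ArmEmpty",)}),
                  add=frozenset({("Holding", x)}))

def PutDown(x):
    return Action(f"PutDown({x})",
                  pre=frozenset({("Holding", x)}),
                  dele=frozenset({("Holding", x)}),
                  add=frozenset({("ArmEmpty",), ("OnTable", x)}))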

Means-Ends Reasoning More Formally

We assume a fixed set of actions Ac = {α1, α2, ..., αn} that the agent can perform

A descriptor for an action α ∈ Ac is a triple 〈Pα, Dα, Aα〉, where

Pα is the precondition list for α

Dα is the delete list for α

Aα is the add list for α

A planning problem (over the set of actions Ac) is determined by a triple 〈∆, O, γ〉, where

∆ is the beliefs of the agent about the initial state of the world
O = {〈Pα, Dα, Aα〉 | α ∈ Ac} is an indexed set of action descriptors
γ is the goal/task/intention to be achieved

Means-Ends Reasoning More Formally

A plan π is a sequence of actions, π = (α1, α2, ..., αn), where αi ∈ Ac

With respect to a planning problem 〈∆, O, γ〉, a plan π = (α1, α2, ..., αn) determines a sequence of n+1 belief databases ∆0, ∆1, ..., ∆n, where

∆0 = ∆, and ∆i = (∆i−1 \ Dαi) ∪ Aαi, for 1 ≤ i ≤ n
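The update rule is a single line of set algebra; a sketch building on the encodings above:

def progress(delta, action):
    """Delta_i = (Delta_{i-1} minus the delete list) union the add list."""
    return (delta - action.dele) | action.add

def trajectory(delta0, plan):
    """The n+1 belief databases Delta_0, ..., Delta_n induced by a plan."""
    deltas = [delta0]
    for act in plan:
        deltas.append(progress(deltas[-1], act))
    return deltas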

Means-Ends Reasoning More Formally

A (linear) plan π = (α1, α2, ..., αn) is said to be acceptable with respect to the problem 〈∆, O, γ〉 iff the precondition of every action is satisfied in the preceding belief database, i.e., iff ∆i−1 |= Pαi, for all 1 ≤ i ≤ n

A plan π = (α1, α2, ..., αn) is correct with respect to 〈∆, O, γ〉 iff

it is acceptable, and
∆n |= γ

The restated problem to be solved by a planner

Given a planning problem 〈∆, O, γ〉, find a correct plan for 〈∆, O, γ〉 or announce that none exists
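Acceptability and correctness are direct checks on the induced trajectory, and for a finite, fully ground problem the restated task can be answered by brute-force breadth-first search over belief databases; a sketch of the problem statement, not of how STRIPS itself searched:

from collections import deque

def acceptable(delta0, plan):
    """Every action's precondition holds in the preceding database."""
    delta = frozenset(delta0)
    for act in plan:
        if not act.pre <= delta:
            return False
        delta = progress(delta, act)
    return True

def correct(delta0, plan, gamma):
    """Acceptable, and the final database satisfies the goal gamma."""
    return acceptable(delta0, plan) and set(gamma) <= trajectory(frozenset(delta0), plan)[-1]

def find_plan(delta0, actions, gamma, max_len=10):
    """Return a correct plan, or None to announce that none exists
    (within the max_len search horizon)."""
    start = frozenset(delta0)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        delta, plan = frontier.popleft()
        if set(gamma) <= delta:
            return plan
        if len(plan) >= max_len:
            continue
        for act in actions:
            if act.pre <= delta:
                nxt = progress(delta, act)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, plan + [act]))
    return None

For the blocks-world problem above, calling find_plan with all ground instances of the four actions returns the shortest correct plan: UnStack(A, B) followed by PutDown(A).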

Means-Ends Reasoning More Formally

Let π be a plan, and Plans be the set of all plans (over some set of actions Ac). Then

pre(π) is the precondition of π, and body(π) is the body of π

empty(π) is a Boolean function that indicates whether π is empty or not

execute(π) is a procedure that executes π without stopping (i.e., executes each action in the plan body in turn)

hd(π) is the plan made up of the first action in the body of π

tail(π) is the plan made up of all but the first action in the body of π

sound(π, I, B) is a Boolean function that indicates whether π is a correct plan for intentions I given beliefs B
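With a plan represented as a Python list of actions, these operations are one-liners; a sketch, where sound() is implemented via the correctness check above (one natural reading of the slide) and perform(...) is an assumed low-level actuator hook:

def empty(pi):
    return len(pi) == 0

def hd(pi):
    return pi[:1]              # the plan made up of the first action

def tail(pi):
    return pi[1:]              # the plan minus its first action

def execute(pi):
    for act in pi:             # execute each action in turn, without stopping
        perform(act)           # perform(...): assumed actuator hook

def sound(pi, I, B):
    return correct(B, pi, I)   # beliefs as initial database, intention as goal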

Means-Ends Reasoning More Formally

An agent's means-ends reasoning capability is represented by a function

plan : 2^Bel × 2^Int × 2^Ac → Plan

A plan does not have to be created from scratch; in many practical solutions a plan library (a pre-assembled collection of plans) is used

Finding a proper plan means finding a plan whose precondition unifies with the agent's beliefs, and whose postcondition unifies with the intention
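A minimal sketch of plan-library lookup; real PRS-style systems use unification over first-order structures, whereas this toy version makes do with subset tests (the LibraryPlan structure is an assumption of these notes):

from dataclasses import dataclass

@dataclass
class LibraryPlan:
    precondition: frozenset    # beliefs required for the plan to be applicable
    postcondition: frozenset   # what the plan achieves
    body: list                 # the stored course of action

def lookup(library, B, I):
    """Return the body of the first stored plan whose precondition is
    satisfied by the beliefs B and whose postcondition covers the intention I."""
    for p in library:
        if p.precondition <= B and set(I) <= p.postcondition:
            return p.body
    return None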

Basic Control Structure for a Practical Reasoning Agent

The basic structure of the decision-making process is a loop, in which the agent continually

observes the world, and updates beliefs

deliberates to decide what intention to achieve (by first determining the available options and then by filtering)

uses means-ends reasoning to find a plan to achieve these intentions

executes the plan

Implementing a Practical Reasoning Agent

B ← B0
I ← I0
while true do
    get next percept ρ through the see(...) function
    B ← brf(B, ρ)
    D ← options(B, I)
    I ← filter(B, D, I)
    π ← plan(B, I, Ac)
    while not (empty(π) or succeeded(I, B) or impossible(I, B)) do
        α ← hd(π)
        execute(α)
        π ← tail(π)
        get next percept ρ through the see(...) function
        B ← brf(B, ρ)
        if reconsider(I, B) then
            D ← options(B, I)
            I ← filter(B, D, I)
        end
        if not sound(π, I, B) then
            π ← plan(B, I, Ac)
        end
    end
end
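The loop transcribes almost line for line into Python. A sketch: see(), succeeded(), impossible(), reconsider(), and plan() are assumed hooks (plan(B, I, Ac) could, for instance, wrap the find_plan search sketched earlier), and the rest are the functions sketched on the previous slides:

def agent_loop(B, I, Ac):
    """Practical reasoning agent control loop (sketch)."""
    while True:
        rho = see()                     # observe the world
        B = brf(B, rho)                 # update beliefs
        D = options(B, I)               # deliberation, step 1: generate options
        I = filter_(B, D, I)            # deliberation, step 2: filter to intentions
        pi = plan(B, I, Ac)             # means-ends reasoning
        while not (empty(pi) or succeeded(I, B) or impossible(I, B)):
            alpha = hd(pi)              # next step of the plan
            execute(alpha)
            pi = tail(pi)
            rho = see()                 # observe again after acting
            B = brf(B, rho)
            if reconsider(I, B):        # meta-level control
                D = options(B, I)
                I = filter_(B, D, I)
            if not sound(pi, I, B):     # replan if the plan has gone stale
                pi = plan(B, I, Ac)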

Commitments to Ends and Means

When an option (desire) successfully passes through the filter function and is chosen by the agent, we say the agent has made a commitment

Commitment implies temporal persistence: an intention, once adopted, should not immediately evaporate

Critical issues

How committed should an agent be to its intentions?

How long should an intention persist?
Under what circumstances should an intention vanish?

Commitment Strategies

A commitment strategy is the mechanism that an agent uses to determine when and how to drop intentions

Blind commitment
A blindly committed agent will continue to maintain an intention until it believes the intention has actually been achieved. Blind commitment is also sometimes referred to as fanatical commitment

Single-minded commitment
A single-minded agent will continue to maintain an intention until it believes that either the intention has been achieved, or else that it is no longer possible to achieve the intention

Open-minded commitment
An open-minded agent will maintain an intention as long as it is still believed possible
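In terms of the control loop above, the three strategies differ only in the guard that keeps the inner execution loop running; a sketch of the three guards (the function names are illustrative):

def maintain_blind(I, B):
    # Blind (fanatical) commitment: keep going until believed achieved.
    return not succeeded(I, B)

def maintain_single_minded(I, B):
    # Single-minded: also give up once achievement is believed impossible.
    return not (succeeded(I, B) or impossible(I, B))

def maintain_open_minded(I, B):
    # Open-minded: maintain only for as long as still believed possible.
    return not impossible(I, B)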

Intention Reconsideration

When should an agent stop to reconsider its intentions (reconsideration itself has a cost!)?

An agent that does not stop to reconsider sufficiently often will continue attempting to achieve its intentions even after it is clear that they cannot be achieved or that there is no longer any reason for achieving them

An agent that constantly reconsiders its intentions may spend insufficient time working to achieve them, and hence runs a risk of never actually achieving them

Controlling Intention Reconsideration

The solution is to incorporate an explicit meta-level control component that decides whether or not to reconsider (the reconsider(...) function)

The possible interactions between meta-level control and deliberation

Chose to       Changed        Would have changed   reconsider(...)
deliberate?    intentions?    intentions?          optimal?
No             -              No                   Yes
No             -              Yes                  No
Yes            No             -                    No
Yes            Yes            -                    Yes

Optimal Intention Reconsideration

The effectiveness of intention reconsideration strategies was investigated experimentally

Two different types of reconsideration strategy were used

bold agents, which never pause to reconsider intentions
cautious agents, which stop to reconsider after every action

Dynamism in the environment is represented by the rate of world change, γ

Optimal Intention Reconsideration

If γ is low (i.e., the environment does not change quickly), then bold agents do well compared to cautious ones. This is because cautious agents waste time reconsidering their commitments

If γ is high (i.e., the environment changes frequently), then cautious agents tend to outperform bold agents. This is because they are able to recognize when intentions are doomed, and take advantage of this

The Procedural Reasoning System (PRS)

PRS was the first agent architecture to explicitly embody the belief-desire-intention paradigm

PRS has proved to be one of the most durable agent architectures developed to date (it dates from the 1980s)

PRS has been applied in several significant multiagent applications built so far

OASIS: an air-traffic control system (Sydney airport)
SWARMM: a simulation system for the Royal Australian Air Force

PRS Architecture

Plans in PRS

PRS is equipped with a library of pre-compiled plans

Each plan has the following components

a goal: the postcondition of the plan
a context: the precondition of the plan
a body: the 'recipe' part of the plan (the course of action to carry out)

The body of a plan can itself contain goals; these goals must then be achieved before the remainder of the plan can be executed

It is possible to have disjunctions of goals ('achieve φ or achieve ψ'), and loops ('keep achieving φ until ψ')

Planning in PRS

When the agent starts up, the goal to be achieved is pushed onto the intention stack

The agent then searches through the plan library to see what plans have the top goal as their postcondition (their goal component)

The set of plans that achieve the top goal and have their precondition (context) satisfied become the possible options for the agent

Planning in PRS

The agent selects an appropriate plan during deliberation

Meta-level plans (plans about plans)
Plan utilities (the plan with maximal utility is chosen)

The chosen plan is executed; this may involve pushing further goals onto the intention stack.

If a particular plan fails, then the agent is able to select another plan from the set of all candidate plans

Example in JAM

GOALS:
  ACHIEVE blocks_stacked;

FACTS:
  // Block1 on Block2 initially so need to clear Block2 before stacking.
  FACT ON "Block1" "Block2";
  FACT ON "Block2" "Table";
  FACT ON "Block3" "Table";
  FACT CLEAR "Block1";
  FACT CLEAR "Block3";
  FACT CLEAR "Table";

Example in JAM

Plan : {NAME: "Top− l e v e l p l an "DOCUMENTATION: " E s t a b l i s h Block1 on Block2 on Block3 . "GOAL:

ACHIEVE b lock s_s tacked ;CONTEXT:BODY:

EXECUTE p r i n t "Goal i s Block1 on Block2 on Block2 on Table . \ n " ;EXECUTE p r i n t "World Model a t s t a r t i s : \ n " ; EXECUTE pr in tWor ldMode l ;EXECUTE p r i n t "ACHIEVEing Block3 on Table . \ n " ;ACHIEVE ON "Block3 " "Table " ;EXECUTE p r i n t "ACHIEVEing Block2 on Block3 . \ n " ;ACHIEVE ON "Block2 " "Block3 " ;EXECUTE p r i n t "ACHIEVEing Block1 on Block2 . \ n " ;ACHIEVE ON "Block1 " "Block2 " ;EXECUTE p r i n t "World Model a t end i s : \ n " ;EXECUTE pr in tWor ldMode l ;

}

Example in JAM

Plan : {NAME: " Stack b l o c k s t ha t a r e a l r e a d y c l e a r "GOAL:

ACHIEVE ON $OBJ1 $OBJ2 ;CONTEXT:BODY:

EXECUTE p r i n t "Making s u r e " $OBJ1 " i s c l e a r \n " ;ACHIEVE CLEAR $OBJ1 ;EXECUTE p r i n t "Making s u r e " $OBJ2 " i s c l e a r . \ n " ;ACHIEVE CLEAR $OBJ2 ;EXECUTE p r i n t "Moving " $OBJ1 " on top o f " $OBJ2 " .\ n " ;PERFORM move $OBJ1 $OBJ2 ;

UTILITY : 10 ;FAILURE :

EXECUTE p r i n t "\n\ nStack b l o c k s f a i l e d !\ n\n " ; }}

Example in JAM

Plan : {NAME: " C l e a r a b l o ck "GOAL:

ACHIEVE CLEAR $OBJ ;CONTEXT:

FACT ON $OBJ2 $OBJ ;BODY:

EXECUTE p r i n t " C l e a r i n g " $OBJ2 " from on top o f " $OBJ "\n " ;EXECUTE p r i n t "Moving " $OBJ2 " to t a b l e . \ n " ;PERFORM move $OBJ2 "Table " ;

EFFECTS :EXECUTE p r i n t "CLEAR : R e t r a c t i n g ON " $OBJ2 " " $OBJ "\n " ;RETRACT ON $OBJ1 $OBJ ;

FAILURE :EXECUTE p r i n t "\n\ nC l e a r i n g b l o ck " $OBJ " f a i l e d !\ n\n " ;

}

Example in JAM

Plan : {NAME: "Move a b l o ck onto ano the r o b j e c t "GOAL:

PERFORM move $OBJ1 $OBJ2 ;CONTEXT:

FACT CLEAR $OBJ1 ;FACT CLEAR $OBJ2 ;

BODY:EXECUTE p r i n t " Per fo rm ing low− l e v e l move a c t i o n " ;EXECUTE p r i n t " o f " $OBJ1 " to " $OBJ2 " .\ n " ;

EFFECTS :WHEN : TEST (!= $OBJ2 "Table ") {

EXECUTE p r i n t " R e t r a c t i n g CLEAR " $OBJ2 "\n " ;RETRACT CLEAR $OBJ2 ;

}FACT ON $OBJ1 $OBJ3 ;EXECUTE p r i n t " move : R e t r a c t i n g ON " $OBJ1 " " $OBJ3 "\n " ;RETRACT ON $OBJ1 $OBJ3 ;EXECUTE p r i n t " move : A s s e r t i n g CLEAR " $OBJ3 "\n " ;ASSERT CLEAR $OBJ3 ;EXECUTE p r i n t " move : A s s e r t i n g ON " $OBJ1 " " $OBJ2 "\n\n " ;ASSERT ON $OBJ1 $OBJ2 ;

FAILURE :EXECUTE p r i n t "\n\nMove f a i l e d !\ n\n " ;

} ;

OASIS Air Tra�c Management System

A system providing automatic support for air traffic controllers performing tactical air traffic management

Maximization of runway utilization
Arranging landing airplanes into an optimal order, assigning them a landing time, and then monitoring the progress of each individual plane

Typically such management is conducted manually by Flow Directors

Management starts ~120 miles / 20 minutes before landing
The system is not concerned with "safe" separation of landing airplanes

The system was developed for the airport in Sydney (60 airplanes landing per hour)

Structure of the OASIS System

Coordinator: coordinates the other global agents

Sequencer: arranges the landing sequence

Trajectory Manager: verifies that instructions provided by the Sequencer do not violate statutory separation requirements

Wind Model: uses wind observations made by individual aircraft to predict the wind field

User Interface: the single point of communication between the system and the Flow Director

Aircraft: estimates the landing time, monitors the progress, and plans the trajectory of the aircraft

Major Components of the Aircraft Agent

Predictor

computes when the aircraft can reach an airport (min...max)
uses performance profiles for various types of planes
uses information from the Wind Model

Monitor

compares predictions made by the Predictor with the actual flying time
notifies the rest of the system (Planner, Sequencer) about possible discrepancies
sends actual data to the Wind Model to improve its predictions

Planner

constructs plans so that the airplane lands at the assigned time
uses a library of various strategies (strict algorithms, heuristics)
the plan to be implemented is selected by the Flow Director

SWARMM Simulation System

A system used to model and simulate the tactical decision-making processes of pilots and fighter-controllers involved in air combat

Also employed for evaluating the potential merits of new hardware (radars, missiles, ...) and tactics

Automatic or HIL (human-in-the-loop) training

Decisions made during simulation should be explained in a comprehensible (human-like) way

reliance on expert/domain knowledge
problems with the acquisition of such knowledge (→ a significant risk?)

Used by the Royal Australian Air Force

User Interface in SWARMM

Structure of SWARMM

Physical models (external agents) capture the physics of the simulated entities

Reasoning models (reasoning agents) make decisions

Physical models supply data about themselves to the appropriate reasoning models

Plans

500 plans for a pilot agent

Plans developed with the help of (real) fighter pilots

Plans can be easily updated/modified at runtime