Practical Reasoning Agents (Poznań University, lecture transcript)
Practical Reasoning Agents
Based on "An Introduction to MultiAgent Systems" and slides by Michael Wooldridge
What is Practical Reasoning?
Practical reasoning is reasoning directed towards actions - the process of figuring out what to do
Practical reasoning is a matter of weighing conflicting considerations for and against competing options, where the relevant considerations are provided by what the agent desires/values/cares about and what the agent believes.
Practical reasoning is different from theoretical reasoning (directed towards beliefs) - e.g., all men are mortal ∧ Socrates is a man → Socrates is mortal
The Components of Practical Reasoning
Practical Reasoning = Deliberation + Means-Ends Reasoning
Deliberation - deciding what states of affairs we want to achieve - the outputs of deliberation are intentions
Means-ends reasoning - deciding how to achieve these states of affairs - the outputs of means-ends reasoning are plans
Computations and Resource Bounds
An agent has a fixed amount of memory and a fixed processor (resource bounds) to carry out the necessary computations
Resource bounds impose limits on the size of computations
Time constraints in real environments also impose limits on the size of computations
Implications
1 An agent must control its reasoning effectively if it is to perform well
2 An agent cannot deliberate indefinitely (even if the state of affairs it settles on is not optimal)
Intentions
Several types of intention (in ordinary speech)
1 Intention may characterize an action: I might intentionally push someone under a train, and push them with the intention of killing them
2 Intention may characterize a state of mind: I might have the intention this morning of pushing someone under a train this afternoon
We focus on future-directed intentions as states of mind → intentions that an agent has towards some future state of affairs
Intentions and Desires
Intentions are stronger than desires in influencing actions
My desire to play basketball this afternoon is merely a potential influencer of my conduct this afternoon. It must vie with my other relevant desires [...] before it is settled what I will do. In contrast, once I intend to play basketball this afternoon, the matter is settled: I normally need not continue to weigh the pros and cons. When the afternoon arrives, I will normally just proceed to execute my intentions.
Intentions in Practical Reasoning
Intentions drive means-ends reasoning
If I have formed an intention, then I will attempt to achieve it (including deciding how to achieve it). If one course of action fails, then I will typically attempt others
Intentions persist
I will not give up my intentions without good reason. Intentions will persist until I have achieved them, I believe I cannot achieve them, or I believe the reason for the intention is no longer present
Intentions in Practical Reasoning
Intentions constrain further deliberation
I will not consider options that are incompatible with my current intentions (a filter of admissibility)
Intentions influence the beliefs upon which practical reasoning is based
If I adopt an intention, then I believe I will achieve the intention. However, I must also recognize the possibility that I can fail to bring the intention about
Symbolic Representation of Beliefs, Desires and Intentions
An agent has to maintain an explicit symbolic representation (e.g., Prolog facts) of its beliefs, desires, and intentions. Let
B be a variable for current beliefs
Bel be the set of all such beliefs
D be a variable for current desires
Des be the set of all desires
I be a variable for current intentions
Int be the set of all intentions
Deliberation
Deliberation is modeled via two functions
an option generation function
options : 2^Bel × 2^Int → 2^Des
a filtering function
filter : 2^Bel × 2^Des × 2^Int → 2^Int
An agent's belief update process is modeled through a belief revision function
brf : 2^Bel × Per → 2^Bel
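As a minimal Python sketch of these interfaces: sets of ground facts stand in for members of 2^Bel, 2^Des and 2^Int, and a percept is just another fact. The vacuum-style facts ("room_dirty", "room_clean", "battery_low") and all three function bodies are invented for the example; only the signatures mirror the slides.

```python
def brf(beliefs, percept):
    """Belief revision function: fold a new percept into the beliefs.
    Naively, the percept is a fact that is simply added."""
    return beliefs | {percept}

def options(beliefs, intentions):
    """Option generation: propose desires given beliefs and intentions.
    This toy version desires a clean room whenever dirt is believed."""
    desires = set()
    if "room_dirty" in beliefs:
        desires.add("room_clean")
    return desires

def filter_(beliefs, desires, intentions):
    """Filtering: choose which desires to adopt as intentions.
    This toy version adopts every generated desire."""
    return intentions | desires

B = {"room_dirty"}
B = brf(B, "battery_low")      # belief update from a percept
D = options(B, set())          # generate options
I = filter_(B, D, set())       # commit to some of them as intentions
print(I)                       # -> {'room_clean'}
```

The point of the sketch is only the data flow: percepts revise B, (B, I) generate D, and (B, D, I) are filtered down to the new I.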
Means-Ends Reasoning
Means-ends reasoning is the process of deciding how to achieve an end (an intention the agent has) using the available means (i.e., the actions that the agent can perform)
Means-ends reasoning is better known as planning (in the AI community)
Planning is essentially automatic programming
Planner
A planner is a system that takes as input (representations of) the following:
1 a goal, intention or task
2 the current state of the environment - the agent's beliefs
3 the actions available to the agent
As output, a planner generates a plan (a course of action / a "recipe")
STRIPS
STRIPS - the first real planner, developed in the late 1960s/early 1970s
Two basic components of STRIPS
a model of the world (a set of formulas of first-order logic)
a set of action schemata describing the preconditions and effects of all actions
The planning algorithm was based on
finding a difference between the current state of the world and the goal state
reducing this difference by applying an appropriate action
Representations in the Blocks World
Predicates for describing the Blocks World
Predicate      Meaning
On(x,y)        object x is on top of object y
OnTable(x)     object x is on the table
Clear(x)       nothing is on top of object x
Holding(x)     the robot arm is holding x
ArmEmpty       the robot arm is empty (not holding anything)
Representations in the Blocks World
The current (initial) state of the Blocks World (closed world assumption)
{Clear(A), On(A,B), OnTable(B), OnTable(C), Clear(C)} ∪ {ArmEmpty}
A goal to achieve (an intention to bring about)
{OnTable(A), OnTable(B), OnTable(C)}
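These two sets can be written down directly in Python as sets of predicate tuples; the helper name `satisfies` is an assumption used to illustrate the closed-world reading, under which a goal holds iff every goal fact appears explicitly in the state.

```python
# Blocks World state and goal as sets of predicate tuples,
# following the predicates from the table above.
initial = {
    ("Clear", "A"), ("On", "A", "B"),
    ("OnTable", "B"), ("OnTable", "C"), ("Clear", "C"),
    ("ArmEmpty",),
}

goal = {("OnTable", "A"), ("OnTable", "B"), ("OnTable", "C")}

def satisfies(state, goal):
    """Closed-world assumption: a goal holds iff every goal fact
    is explicitly present in the state (anything absent is false)."""
    return goal <= state

print(satisfies(initial, goal))  # -> False, since A is still on B
```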
Actions in the Blocks World
Each action is characterized by
a name - which may have arguments
a precondition list - a list of facts that must be true for the action to be executed
a delete list - a list of facts that are no longer true after the action is performed
an add list - a list of facts made true by executing the action
Actions in the Blocks World
The Stack action occurs when the robot arm places the object x it is holding on top of object y
Stack(x,y)
pre {Clear(y), Holding(x)}
del {Clear(y), Holding(x)}
add {ArmEmpty, On(x,y)}
The UnStack action occurs when the robot arm picks an object x up from on top of another object y
UnStack(x,y)
pre {On(x,y), Clear(x), ArmEmpty}
del {On(x,y), ArmEmpty}
add {Holding(x), Clear(y)}
Actions in the Blocks World
The Pickup action occurs when the arm picks up an object x from the table
Pickup(x)
pre {Clear(x), OnTable(x), ArmEmpty}
del {OnTable(x), ArmEmpty}
add {Holding(x)}
The PutDown action occurs when the arm places the object x onto the table
PutDown(x)
pre {Holding(x)}
del {Holding(x)}
add {ArmEmpty, OnTable(x)}
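The pre/del/add machinery can be made concrete in a short Python sketch: an action is a named (pre, delete, add) triple, and applying it means checking the precondition list, removing the delete list, and adding the add list. The `Action` and `apply_action` names are illustrative, not part of STRIPS itself.

```python
from collections import namedtuple

# An action descriptor: a name plus precondition, delete and add
# lists, each represented as a set of predicate tuples.
Action = namedtuple("Action", "name pre delete add")

# UnStack(A, B), transcribed from the schema above
unstack_A_B = Action(
    name="UnStack(A,B)",
    pre={("On", "A", "B"), ("Clear", "A"), ("ArmEmpty",)},
    delete={("On", "A", "B"), ("ArmEmpty",)},
    add={("Holding", "A"), ("Clear", "B")},
)

def apply_action(state, act):
    """Check the precondition list, then return
    (state minus the delete list) union the add list."""
    assert act.pre <= state, "precondition not satisfied"
    return (state - act.delete) | act.add

state = {("Clear", "A"), ("On", "A", "B"), ("OnTable", "B"),
         ("OnTable", "C"), ("Clear", "C"), ("ArmEmpty",)}
state = apply_action(state, unstack_A_B)
print(("Holding", "A") in state)  # -> True
```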
Means-Ends Reasoning More Formally
We assume a fixed set of actions Ac = {α_1, α_2, ..., α_n} that the agent can perform
A descriptor for an action α ∈ Ac is a triple ⟨P_α, D_α, A_α⟩, where
P_α is the precondition list for α
D_α is the delete list for α
A_α is the add list for α
A planning problem (over the set of actions Ac) is determined by a triple ⟨Δ, O, γ⟩, where
Δ is the beliefs of the agent about the initial state of the world
O = {⟨P_α, D_α, A_α⟩ | α ∈ Ac} is an indexed set of action descriptors
γ is the goal/task/intention to be achieved
Means-Ends Reasoning More Formally
A plan π is a sequence of actions, π = (α_1, α_2, ..., α_n), where α_i ∈ Ac
With respect to a planning problem ⟨Δ, O, γ⟩, a plan π = (α_1, α_2, ..., α_n) determines a sequence of n+1 belief databases Δ_0, Δ_1, ..., Δ_n, where
Δ_0 = Δ, and
Δ_i = (Δ_{i-1} \ D_{α_i}) ∪ A_{α_i} for 1 ≤ i ≤ n
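The induced sequence of belief databases can be computed directly from this definition. In the sketch below, `databases` is an assumed helper name and a plan is represented as a list of (pre, delete, add) triples.

```python
def databases(delta0, plan):
    """Return [Delta_0, Delta_1, ..., Delta_n], where Delta_0 is the
    initial database and Delta_i = (Delta_{i-1} minus the delete list
    of action i) union its add list."""
    seq = [frozenset(delta0)]
    for pre, delete, add in plan:
        seq.append(frozenset((seq[-1] - delete) | add))
    return seq

# UnStack(A,B) as a (pre, delete, add) triple
unstack = ({("On", "A", "B"), ("Clear", "A"), ("ArmEmpty",)},
           {("On", "A", "B"), ("ArmEmpty",)},
           {("Holding", "A"), ("Clear", "B")})

delta = {("Clear", "A"), ("On", "A", "B"), ("OnTable", "B"),
         ("OnTable", "C"), ("Clear", "C"), ("ArmEmpty",)}

seq = databases(delta, [unstack])
print(len(seq), ("Holding", "A") in seq[1])  # -> 2 True
```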
Means-Ends Reasoning More Formally
A (linear) plan π = (α_1, α_2, ..., α_n) is said to be acceptable with respect to the problem ⟨Δ, O, γ⟩ iff the precondition of every action is satisfied in the preceding belief database, i.e., iff Δ_{i-1} |= P_{α_i} for all 1 ≤ i ≤ n
A plan π = (α_1, α_2, ..., α_n) is correct with respect to ⟨Δ, O, γ⟩ iff
it is acceptable, and
Δ_n |= γ
The restated problem to be solved by a planner
Given a planning problem ⟨Δ, O, γ⟩, find a correct plan for ⟨Δ, O, γ⟩ or announce that none exists
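For the three-block world, this restated problem can be solved by a brute-force breadth-first search over belief databases, returning a correct plan or None. Everything below (the `find_plan` and `ground_actions` names, the BFS strategy) is an illustrative sketch; STRIPS itself worked by difference reduction, not exhaustive search.

```python
from collections import deque, namedtuple

Action = namedtuple("Action", "name pre delete add")
BLOCKS = ["A", "B", "C"]

def ground_actions():
    """Instantiate the four action schemata for every block."""
    acts = []
    for x in BLOCKS:
        acts.append(Action(f"Pickup({x})",
                           {("Clear", x), ("OnTable", x), ("ArmEmpty",)},
                           {("OnTable", x), ("ArmEmpty",)},
                           {("Holding", x)}))
        acts.append(Action(f"PutDown({x})",
                           {("Holding", x)},
                           {("Holding", x)},
                           {("ArmEmpty",), ("OnTable", x)}))
        for y in BLOCKS:
            if x == y:
                continue
            acts.append(Action(f"Stack({x},{y})",
                               {("Clear", y), ("Holding", x)},
                               {("Clear", y), ("Holding", x)},
                               {("ArmEmpty",), ("On", x, y)}))
            acts.append(Action(f"UnStack({x},{y})",
                               {("On", x, y), ("Clear", x), ("ArmEmpty",)},
                               {("On", x, y), ("ArmEmpty",)},
                               {("Holding", x), ("Clear", y)}))
    return acts

def find_plan(delta, gamma):
    """Breadth-first search over belief databases: return a shortest
    correct plan (a list of action names), or None if none exists."""
    frontier = deque([(frozenset(delta), [])])
    seen = {frozenset(delta)}
    while frontier:
        state, path = frontier.popleft()
        if gamma <= state:              # correctness: the goal holds
            return path
        for a in ground_actions():
            if a.pre <= state:          # acceptability: precondition holds
                nxt = frozenset((state - a.delete) | a.add)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, path + [a.name]))
    return None

delta = {("Clear", "A"), ("On", "A", "B"), ("OnTable", "B"),
         ("OnTable", "C"), ("Clear", "C"), ("ArmEmpty",)}
gamma = {("OnTable", "A"), ("OnTable", "B"), ("OnTable", "C")}
print(find_plan(delta, gamma))  # -> ['UnStack(A,B)', 'PutDown(A)']
```

Because the search is breadth-first, the returned plan is a shortest one; for the slides' initial state and goal it is the two-step plan of unstacking A and putting it down.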
Means-Ends Reasoning More Formally
Let π be a plan, and Plans be the set of all plans (over some set of actions Ac). Then
pre(π) is the precondition of π, and body(π) is the body of π
empty(π) is a Boolean function that indicates whether π is empty or not
execute(π) is a procedure that executes π without stopping (i.e., executes each action in the plan body in turn)
hd(π) is the plan made up of the first action in the body of π
tail(π) is the plan made up of all but the first action in the body of π
sound(π, I, B) is a Boolean function that indicates whether π is a correct plan for intentions I given beliefs B
Means-Ends Reasoning More Formally
An agent's means-ends reasoning capability is represented by a function
plan : 2^Bel × 2^Int × 2^Ac → Plans
A plan does not have to be created from scratch - in many practical solutions a plan library (a pre-assembled collection of plans) is used
Finding a proper plan then means finding a plan whose precondition unifies with the agent's beliefs, and whose postcondition unifies with the intention
Basic Control Structure for a Practical Reasoning Agent
The basic structure of the decision-making process is a loop, in which the agent continually
observes the world, and updates its beliefs
deliberates to decide what intentions to achieve (by first determining the available options and then filtering)
uses means-ends reasoning to find a plan to achieve these intentions
executes the plan
Implementing a Practical Reasoning Agent
B ← B0
I ← I0
while true do
    get next percept ρ through the see(...) function
    B ← brf(B, ρ)
    D ← options(B, I)
    I ← filter(B, D, I)
    π ← plan(B, I, Ac)
    while not (empty(π) or succeeded(I, B) or impossible(I, B)) do
        α ← hd(π)
        execute(α)
        π ← tail(π)
        get next percept ρ through the see(...) function
        B ← brf(B, ρ)
        if reconsider(I, B) then
            D ← options(B, I)
            I ← filter(B, D, I)
        end-if
        if not sound(π, I, B) then
            π ← plan(B, I, Ac)
        end-if
    end-while
end-while
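The control loop can be rendered as a runnable Python toy. The door-opening environment, every stub body (brf, options, filter, plan, and so on), and the bounded outer loop are all invented for the demo; the real loop runs forever and the real functions would do genuine belief revision, deliberation, and planning.

```python
world = {"door_closed"}            # mutable toy environment state

def see():
    """Stub percept source: the percept is a snapshot of the world."""
    return world.copy()

def execute(alpha):
    """Stub effector: the only known action opens the door."""
    if alpha == "open_door":
        world.discard("door_closed")
        world.add("door_open")

def brf(B, rho): return set(rho)               # replace beliefs by the percept
def options(B, I): return {"door_open"} if "door_closed" in B else set()
def filter_(B, D, I): return set(D) or I       # adopt new desires, else keep I
def plan(B, I, Ac): return ["open_door"] if "door_open" in I else []
def succeeded(I, B): return I <= B             # every intended fact believed
def impossible(I, B): return False
def reconsider(I, B): return False             # a "bold" agent
def sound(pi, I, B): return True
def hd(pi): return pi[0]
def tail(pi): return pi[1:]

Ac = ["open_door"]
B, I = set(), set()
for _ in range(3):                 # bounded here; the slides loop forever
    B = brf(B, see())
    D = options(B, I)
    I = filter_(B, D, I)
    pi = plan(B, I, Ac)
    while not (pi == [] or succeeded(I, B) or impossible(I, B)):
        alpha = hd(pi)
        execute(alpha)
        pi = tail(pi)
        B = brf(B, see())
        if reconsider(I, B):
            D = options(B, I)
            I = filter_(B, D, I)
        if not sound(pi, I, B):
            pi = plan(B, I, Ac)
print("door_open" in B)            # -> True
```

Note how the inner loop re-observes the world after each action, so an intention that has succeeded (or become impossible) stops further execution without finishing the plan.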
Commitments to Ends and Means
When an option (desire) successfully passes through the filter function and is chosen by the agent, we say the agent has made a commitment
Commitment implies temporal persistence - an intention, once adopted, should not immediately evaporate
Critical issues
How committed should an agent be to its intentions?
How long should an intention persist? Under what circumstances should an intention vanish?
Commitment Strategies
A commitment strategy is the mechanism that an agent uses to determine when and how to drop intentions
Blind commitment
A blindly committed agent will continue to maintain an intention until it believes the intention has actually been achieved. Blind commitment is also sometimes referred to as fanatical commitment
Single-minded commitment
A single-minded agent will continue to maintain an intention until it believes that either the intention has been achieved, or else that it is no longer possible to achieve the intention
Open-minded commitment
An open-minded agent will maintain an intention as long as it is still believed possible
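The three strategies differ only in their "drop this intention?" test, which a short Python sketch makes explicit. All names here are assumptions, and the two belief predicates at the bottom are toy stand-ins for real belief queries.

```python
def drop_blind(i, B):
    """Blind/fanatical: drop only once the intention is believed achieved."""
    return achieved(i, B)

def drop_single_minded(i, B):
    """Single-minded: drop when achieved or believed unachievable."""
    return achieved(i, B) or not believed_possible(i, B)

def drop_open_minded(i, B):
    """Open-minded: keep the intention only while still believed possible."""
    return not believed_possible(i, B)

# Toy belief predicates for the demo
def achieved(i, B): return i in B
def believed_possible(i, B): return ("impossible", i) not in B

B = {("impossible", "fly")}
print(drop_blind("fly", B), drop_single_minded("fly", B), drop_open_minded("fly", B))
# -> False True True
```

The demo shows the key behavioural difference: faced with an intention now believed impossible, the blind agent keeps it, while the single-minded and open-minded agents drop it.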
Intention Reconsideration
When should an agent stop to reconsider its intentions (reconsideration has a cost!)?
An agent that does not stop to reconsider sufficiently often will continue attempting to achieve its intentions even after it is clear that they cannot be achieved, or that there is no longer any reason for achieving them
An agent that constantly reconsiders its intentions may spend insufficient time actually working to achieve them, and hence runs the risk of never actually achieving them
Controlling Intention Reconsideration
The solution is to incorporate an explicit meta-level control component that decides whether or not to reconsider (the reconsider(...) function)
The possible interactions between meta-level control and deliberation

Chose to deliberate?   Changed intentions?   Would have changed intentions?   reconsider(...) optimal?
No                     -                     No                               Yes
No                     -                     Yes                              No
Yes                    No                    -                                No
Yes                    Yes                   -                                Yes
Optimal Intention Reconsideration
The effectiveness of intention reconsideration strategies was investigated experimentally
Two different types of reconsideration strategy were used
bold agents, which never pause to reconsider their intentions
cautious agents, which stop to reconsider after every action
Dynamism in the environment is represented by the rate of world change, γ
Optimal Intention Reconsideration
If γ is low (i.e., the environment does not change quickly), then bold agents do well compared to cautious ones. This is because cautious agents waste time reconsidering their commitments
If γ is high (i.e., the environment changes frequently), then cautious agents tend to outperform bold agents. This is because they are able to recognize when intentions are doomed, and take advantage of this
The Procedural Reasoning System (PRS)
PRS was the first agent architecture to explicitly embody the belief-desire-intention paradigm
PRS has proved to be the most durable agent architecture developed to date (1980s)
PRS has been applied in several significant multiagent applications built so far
OASIS - an air-traffic control system (Sydney airport)
SWARMM - a simulation system for the Royal Australian Air Force
Plans in PRS
PRS is equipped with a library of pre-compiled plans
Each plan has the following components
a goal - the postcondition of the plan
a context - the precondition of the plan
a body - the "recipe" part of the plan (the course of action to carry out)
The body of a plan can itself contain goals - these goals must then be achieved before the remainder of the plan can be executed
It is possible to have disjunctions of goals ("achieve φ or achieve ψ"), and loops ("keep achieving φ until ψ")
Planning in PRS
When the agent starts up, the goal to be achieved is pushed onto the intention stack
The agent then searches through the plan library to see which plans have the goal on top of the stack as their postcondition
Of these, the plans whose precondition (context) is satisfied become the possible options for the agent
Planning in PRS
The agent selects an appropriate plan during deliberation, using
meta-level plans (plans about plans)
utilities attached to plans (maximized)
The chosen plan is executed - this may involve pushing further goals onto the intention stack
If a particular plan fails, then the agent is able to select another plan from the set of all candidate plans
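This selection step can be sketched in Python: plans from a library whose goal matches the top of the intention stack and whose context holds in the beliefs become the options, and the highest-utility option is chosen. All names, the two toy plans, and the belief facts are illustrative assumptions, not PRS's actual data structures.

```python
from dataclasses import dataclass

@dataclass
class Plan:
    goal: str        # postcondition: what the plan achieves
    context: set     # precondition: beliefs required for applicability
    body: list       # the "recipe" part
    utility: int = 0

library = [
    Plan("blocks_stacked", {"table_clear"}, ["stack_all"], utility=10),
    Plan("blocks_stacked", set(), ["clear_then_stack"], utility=5),
]

def applicable(library, top_goal, beliefs):
    """Options: plans achieving the top goal whose context holds."""
    return [p for p in library
            if p.goal == top_goal and p.context <= beliefs]

def select(options_):
    """Deliberation stand-in: pick the highest-utility option."""
    return max(options_, key=lambda p: p.utility)

beliefs = {"table_clear"}
stack = ["blocks_stacked"]          # intention stack, top element last
opts = applicable(library, stack[-1], beliefs)
print(select(opts).body)            # -> ['stack_all']
```

If `table_clear` were not believed, only the lower-utility fallback plan would be applicable, which mirrors how PRS falls back to other candidate plans when one fails or is inapplicable.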
Example in JAM
GOALS:
    ACHIEVE blocks_stacked;

FACTS:
    // Block1 on Block2 initially, so need to clear Block2 before stacking.
    FACT ON "Block1" "Block2";
    FACT ON "Block2" "Table";
    FACT ON "Block3" "Table";
    FACT CLEAR "Block1";
    FACT CLEAR "Block3";
    FACT CLEAR "Table";
Example in JAM
Plan: {
    NAME: "Top-level plan"
    DOCUMENTATION: "Establish Block1 on Block2 on Block3."
    GOAL:
        ACHIEVE blocks_stacked;
    CONTEXT:
    BODY:
        EXECUTE print "Goal is Block1 on Block2 on Block3 on Table.\n";
        EXECUTE print "World Model at start is:\n";
        EXECUTE printWorldModel;
        EXECUTE print "ACHIEVEing Block3 on Table.\n";
        ACHIEVE ON "Block3" "Table";
        EXECUTE print "ACHIEVEing Block2 on Block3.\n";
        ACHIEVE ON "Block2" "Block3";
        EXECUTE print "ACHIEVEing Block1 on Block2.\n";
        ACHIEVE ON "Block1" "Block2";
        EXECUTE print "World Model at end is:\n";
        EXECUTE printWorldModel;
}
Example in JAM
Plan : {NAME: " Stack b l o c k s t ha t a r e a l r e a d y c l e a r "GOAL:
ACHIEVE ON $OBJ1 $OBJ2 ;CONTEXT:BODY:
EXECUTE p r i n t "Making s u r e " $OBJ1 " i s c l e a r \n " ;ACHIEVE CLEAR $OBJ1 ;EXECUTE p r i n t "Making s u r e " $OBJ2 " i s c l e a r . \ n " ;ACHIEVE CLEAR $OBJ2 ;EXECUTE p r i n t "Moving " $OBJ1 " on top o f " $OBJ2 " .\ n " ;PERFORM move $OBJ1 $OBJ2 ;
UTILITY : 10 ;FAILURE :
EXECUTE p r i n t "\n\ nStack b l o c k s f a i l e d !\ n\n " ; }}
Example in JAM
Plan : {NAME: " C l e a r a b l o ck "GOAL:
ACHIEVE CLEAR $OBJ ;CONTEXT:
FACT ON $OBJ2 $OBJ ;BODY:
EXECUTE p r i n t " C l e a r i n g " $OBJ2 " from on top o f " $OBJ "\n " ;EXECUTE p r i n t "Moving " $OBJ2 " to t a b l e . \ n " ;PERFORM move $OBJ2 "Table " ;
EFFECTS :EXECUTE p r i n t "CLEAR : R e t r a c t i n g ON " $OBJ2 " " $OBJ "\n " ;RETRACT ON $OBJ1 $OBJ ;
FAILURE :EXECUTE p r i n t "\n\ nC l e a r i n g b l o ck " $OBJ " f a i l e d !\ n\n " ;
}
Example in JAM
Plan : {NAME: "Move a b l o ck onto ano the r o b j e c t "GOAL:
PERFORM move $OBJ1 $OBJ2 ;CONTEXT:
FACT CLEAR $OBJ1 ;FACT CLEAR $OBJ2 ;
BODY:EXECUTE p r i n t " Per fo rm ing low− l e v e l move a c t i o n " ;EXECUTE p r i n t " o f " $OBJ1 " to " $OBJ2 " .\ n " ;
EFFECTS :WHEN : TEST (!= $OBJ2 "Table ") {
EXECUTE p r i n t " R e t r a c t i n g CLEAR " $OBJ2 "\n " ;RETRACT CLEAR $OBJ2 ;
}FACT ON $OBJ1 $OBJ3 ;EXECUTE p r i n t " move : R e t r a c t i n g ON " $OBJ1 " " $OBJ3 "\n " ;RETRACT ON $OBJ1 $OBJ3 ;EXECUTE p r i n t " move : A s s e r t i n g CLEAR " $OBJ3 "\n " ;ASSERT CLEAR $OBJ3 ;EXECUTE p r i n t " move : A s s e r t i n g ON " $OBJ1 " " $OBJ2 "\n\n " ;ASSERT ON $OBJ1 $OBJ2 ;
FAILURE :EXECUTE p r i n t "\n\nMove f a i l e d !\ n\n " ;
} ;
OASIS Air Tra�c Management System
A system providing automatic support for air traffic controllers performing tactical air traffic management
Maximization of runway utilization
Arranging landing airplanes into an optimal order, assigning them a landing time, and then monitoring the progress of each individual plane
Typically such management is conducted manually by Flow Directors
Management starts ~120 miles / 20 minutes before landing
Not concerned with the "safe" separation of landing airplanes
The system was developed for the airport in Sydney (60 airplanes landing per hour)
Structure of the OASIS System
Coordinator - coordinates the other global agents
Sequencer - arranges the landing sequence
Trajectory Manager - verifies that the instructions provided by the Sequencer do not violate statutory separation requirements
Wind Model - uses wind observations made by individual aircraft to predict the wind field
User Interface - the single point of communication between the system and the Flow Director
Aircraft - estimates the landing time, monitors the progress, and plans the trajectory of the aircraft
Major Components of the Aircraft Agent
Predictor
computes when the aircraft can reach an airport (min...max)
uses performance profiles for various types of planes
uses information from the Wind Model
Monitor
compares the predictions made by the Predictor with actual flying times
notifies the rest of the system (Planner, Sequencer) about possible discrepancies
sends actual data to the Wind Model to improve its predictions
Planner
constructs plans so that the airplane lands at the assigned time
uses a library of various strategies (strict algorithms, heuristics)
the plan to be implemented is selected by the Flow Director
SWARMM Simulation System
A system used to model and simulate the tactical decision-making processes of pilots and fighter-controllers involved in air combat
Also employed for evaluating the potential merits of new hardware (radars, missiles) and tactics
Automatic or HIL (human-in-the-loop) training
Decisions made during simulation should be explained in a comprehensible (human-like) way
reliance on expert/domain knowledge
problems with the acquisition of such knowledge (→ significant risk?)
Used by the Royal Australian Air Force
Structure of SWARMM
Physical models (external agents) capture the physics of the simulated entities
Reasoning models (reasoning agents) make the decisions
The physical models supply data about themselves to the appropriate reasoning models