A Decision-Theoretic Approach to Designing Proactive Communication in Multi-Agent Teamwork
Thomas R. Ioerger, Yu Zhang,
Richard Volz, John Yen (PSU-IST)
Dept. of Computer Science
Texas A&M University
2
Motivation
[Figure labels: Agent; Multi-Agent Team]
– Agents share a large amount of knowledge about the teamwork.
– Hard-coded interactions among participants.
– High-frequency message exchange.
– Communication risk.
3
Challenging Issues in Designing Communication Protocols
Each agent has incomplete information from which uncertainties arise.
Each agent has different problem solving capabilities.
Data are decentralized, and the system has no global control.
Excessive or unrestricted communication limits scalability.
4
Our Approach and Its Contributions
Proactive Communication
– OBPC: Reduction of communication load through OBservations.
– DIP: Dynamic estimation of the probability distribution of Information Production and need.
– DTPC: Decision-Theoretic determination of communication strategies.
5
Background
– CAST (Collaborative Agents for Simulating Teamwork)
– MALLET (Multi-Agent Logic-based Language for Encoding Teamwork)

(team-plan killwumpus (?w)
  (process
    (seq
      (agent-bind ?ca (constraint (play-role ?ca scout)))
      (DO ?ca (findwumpus ?w))
      (agent-bind ?fi (constraint ((play-role ?fi fighter)
                                   (closest-to-wumpus ?fi ?w))))
      (DO ?fi (movetowumpus ?w))
      (DO ?fi (shootwumpus ?w)))))

(ioper shootwumpus (?w)
  (pre-cond (wumpus ?w) (location ?w ?x ?y) (dead ?w false))
  (effect (dead ?w true)))
6
Overview
[Figure: CAST architecture. Labels: KB (one per agent); Team Structure & Teamwork Procedure; Proactive Communication (OBPC, DIP, DTPC); Optimal Communication Strategy.]
7
Agent Execution Cycle
[Figure: execution cycle. Labels, in order: Observe (Sense); Predict (Info. need and production); Decide (Strategy); Communicate (Information); Act (Effect).]
8
Syntax of Observability
<observability> ::= (CanSee <viewing>)* (BelieveCanSee <believer> <viewing>)*
<viewing> ::= <observer> <observable> <cond>
<believer> ::= <agent>
<observer> ::= <agent>
<observable> ::= <property> | <action>
<cond> ::= (<property>)*
<property> ::= (<property-name> <object> <args>)
<action> ::= (DO <doer> (<operator-name> <args>))
<object> ::= <agent> | <non-agent>
<doer> ::= <agent>
9
Example Observability Rules
(CanSee ca (location ?o ?x ?y)
  (location ca ?xc ?yc) (location ?o ?x ?y)
  (inradius ?x ?y ?xc ?yc rca))
// The carrier can see the location property of an object.
(CanSee ca (DO ?fi (shootwumpus ?w))
  (play-role fighter ?fi) (location ca ?xc ?yc) (location ?fi ?x ?y)
  (adjacent ?xc ?yc ?x ?y))
// The carrier can see the shootwumpus action of a fighter.
(BelieveCanSee ca fi (location ?o ?x ?y)
  (location fi ?xi ?yi) (location ?o ?x ?y)
  (inradius ?x ?y ?xi ?yi rfi))
// The carrier believes the fighter is able to see the location property of an object.
(BelieveCanSee ca fi (DO ?f (shootwumpus ?w))
  (play-role fighter ?f) (≠ ?f fi)
  (location ca ?xc ?yc) (location fi ?xi ?yi) (location ?f ?x ?y)
  (inradius ?xi ?yi ?xc ?yc rca) (inradius ?x ?y ?xc ?yc rca)
  (adjacent ?x ?y ?xi ?yi))
// The carrier believes the fighter is able to see the shootwumpus action of another fighter.
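A minimal Python sketch of how the first CanSee rule above might be evaluated. It assumes that `inradius` means "Euclidean distance within a sensing radius"; the slides do not define it. The function names and the radius value are hypothetical and are not part of CAST.

```python
import math

def inradius(x, y, xc, yc, r):
    """Assumed meaning of (inradius ?x ?y ?xc ?yc r): Euclidean distance <= r."""
    return math.hypot(x - xc, y - yc) <= r

def can_see_location(observer_pos, object_pos, radius):
    """Sketch of the first CanSee rule: an observer sees an object's
    location when the object lies within the observer's sensing radius."""
    (xc, yc), (x, y) = observer_pos, object_pos
    return inradius(x, y, xc, yc, radius)

# Hypothetical carrier at (5, 5) with sensing radius rca = 3.
print(can_see_location((5, 5), (7, 6), 3))   # object about 2.24 cells away
print(can_see_location((5, 5), (9, 9), 3))   # object about 5.66 cells away
```

A BelieveCanSee rule could reuse the same check with the other agent's believed position and radius in place of the observer's own.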
10
Proactive Communication Based on Observation
ProactiveTell
– A provider reasons about what information it will have.
– A provider reasons about whether to deliver a piece of information when it has the information.
ActiveAsk
– A needer reasons about what information it will need.
– A needer reasons about whether to ask for a piece of information when it needs the information.
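One possible reading of ProactiveTell combined with observability: a provider can stay silent when it believes the needer can observe the information directly. The sketch below assumes that reading. `believes_can_see` stands in for the result of a BelieveCanSee rule, and every name here is hypothetical.

```python
def should_proactive_tell(has_info, needer_needs_info, believes_can_see):
    """Provider-side check (assumed logic, not the CAST implementation).

    Tell only if the provider holds the information, expects the needer to
    need it, and does not believe the needer can observe it directly.
    """
    return has_info and needer_needs_info and not believes_can_see

# The fighter is believed to see the wumpus itself, so no message is sent.
print(should_proactive_tell(True, True, believes_can_see=True))
# The fighter is believed to be out of range, so the carrier tells it.
print(should_proactive_tell(True, True, believes_can_see=False))
```

The ActiveAsk side would be symmetric under the same assumption: a needer asks only when it cannot observe the information itself.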
11
Evaluation
20 wumpuses, 8 pits, and 20 piles of gold per world.
1 carrier and 3 fighters compose a team.
The team goal is to kill wumpuses and get the gold without being killed.
5 randomly generated worlds with 20×20 cells.
Multi-Agent Wumpus World
12
Decision-Theoretic Proactive Communication
– Strategies
– Utility Function
– Cost Function
– Value Function
– Decision-Making
13
Decision-Making on Situation PA
Situation PA: Provider produces a new piece of information.
[Figure: decision tree with nodes 0, 1, 2 and end nodes e. Provider moves (a-b): ProactiveTell, Silence. Needer moves (b-a): Accept, Wait, Silence, ActiveAsk.]
a: provider; b: needer; e: end
14
DM on Situation PB
Situation PB: Provider receives a request for a piece of information.
[Figure: decision tree with node 0 and end nodes e. Provider moves (a-b): Reply, WaitUntilNext.]
15
DM on Situation NA
Situation NA: Needer needs a piece of information.
[Figure: decision tree with numbered nodes, transfer nodes t, and end nodes e. Needer moves (b-a): ActiveAsk, Silence, Wait. Provider moves (a-b): Reply, WaitUntilNext, Silence, ProactiveTell.]
t: transfer
16
DM on Situation NB
Situation NB: Needer receives a piece of information.
[Figure: decision tree with nodes t and 0 and end node e. Needer move (b-a): Accept.]
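The four situations and the first-move options of the deciding agent can be collected in a small lookup table. This is only an illustrative Python encoding of the decision-tree slides; the `Situation` enum and the `STRATEGIES` name are hypothetical.

```python
from enum import Enum

class Situation(Enum):
    PA = "provider produces a new piece of information"
    PB = "provider receives a request for a piece of information"
    NA = "needer needs a piece of information"
    NB = "needer receives a piece of information"

# First-move options for the deciding agent in each situation,
# read off the four decision-tree slides.
STRATEGIES = {
    Situation.PA: ("ProactiveTell", "Silence"),
    Situation.PB: ("Reply", "WaitUntilNext"),
    Situation.NA: ("ActiveAsk", "Wait", "Silence"),
    Situation.NB: ("Accept",),
}

for situation, options in STRATEGIES.items():
    print(f"{situation.name}: {', '.join(options)}")
```

Only situations PA, PB, and NA involve a real choice; NB has the single option Accept.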
17
Utility Function
Parameters in the utility function:
– I: information about which communication occurs
– t: time of decision-making
– t1: time at which I is needed
– t2: time at which the value of I that is used is produced
– SU: situation at t
– S: strategy available in SU
– M: the set of messages involved in obtaining I
– E: environment state at t
U(I, t, t1, t2, SU, S, M, E) = V(I, t, t1, t2, SU, S) – C(M)
18
Value Function
V(I, t, t1, t2, SU, S) = T(I, t, t1, t2, SU, S) + R(I, t, t1, t2, SU, S)
– T: Timeliness
– R: Relevance
19
Timeliness Function
Timeliness
– Whether agents use a value that can be produced in time when they need I.
d(I, t, t1, t2, SU, S) = max(0, t2 – t1)
ft(d(I, t, t1, t2, SU, S)) s.t. ft(x) < ft(y) if y < x
T(I, t, t1, t2, SU, S) = ft(d(I, t, t1, t2, SU, S))
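The slide constrains ft only to be strictly decreasing in the delay d. As an illustration, the Python sketch below picks an exponential decay for ft; that choice and the `decay` parameter are assumptions, not taken from the slides.

```python
import math

def delay(t1, t2):
    """d = max(0, t2 - t1): how long after the need time t1 the value is produced."""
    return max(0, t2 - t1)

def timeliness(t1, t2, decay=0.5):
    """T = ft(d) with ft(d) = exp(-decay * d), an assumed strictly decreasing ft."""
    return math.exp(-decay * delay(t1, t2))

print(timeliness(t1=10, t2=8))    # produced before the need: d = 0, so T is maximal
print(timeliness(t1=10, t2=14))   # produced 4 time steps late: smaller T
```

Any other strictly decreasing function, such as a linear penalty, would satisfy the stated constraint equally well.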
20
Relevance Function
Relevance
– Unprocessed, most recent, important
P(I, t, t1, t2, SU, S) = Pr(I, t, t1, t2, no other value for I was produced in Int[t1, t2] | S, SU)
frI(P(I, t, t1, t2, SU, S)) s.t. frI(x) < frI(y) if x < y
R(I, t, t1, t2, SU, S) = frI(P(I, t, t1, t2, SU, S))
21
Cost Function
C(Mi) = 0                     if Mi = ∅
C(Mi) = k1 + k2 × len(Mi)     otherwise
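The cost, value, and utility definitions from the preceding slides can be combined in a short Python sketch. The constants `K1` and `K2`, the concrete forms chosen for ft and frI, and summing the cost over the messages in M are all assumptions made for illustration.

```python
import math

K1, K2 = 1.0, 0.1   # hypothetical cost constants k1 and k2

def cost(message):
    """C(Mi) = 0 for an empty message, else k1 + k2 * len(Mi)."""
    return 0.0 if not message else K1 + K2 * len(message)

def value(t1, t2, p_fresh, decay=0.5, weight=5.0):
    """V = T + R with assumed forms ft(d) = exp(-decay * d) and frI(p) = weight * p.

    p_fresh plays the role of P: the probability that no other value
    for I was produced in the interval [t1, t2].
    """
    d = max(0, t2 - t1)
    return math.exp(-decay * d) + weight * p_fresh

def utility(t1, t2, p_fresh, messages):
    """U = V - C(M), with C(M) assumed to be the sum of per-message costs."""
    return value(t1, t2, p_fresh) - sum(cost(m) for m in messages)

# Hypothetical numbers: telling sends one message; silence sends none.
tell = utility(t1=10, t2=9, p_fresh=0.9, messages=["(location w1 3 4)"])
silent = utility(t1=10, t2=9, p_fresh=0.2, messages=[])
print(tell, silent)
```

With these made-up inputs the message cost is outweighed by the higher relevance, so telling scores higher; different constants could reverse the ordering.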
22
Expected Utility
E(U) = Σ_{t1 ∈ s1} Σ_{t2 ∈ s2} Pr(t1, t2) × U(t1, t2)
[Table: for each strategy, the time ranges of t1 and t2 used in the sums. Rows: P.ProactiveTell; P.Silence; P.Reply; P.WaitUntilNext; N.ActiveAsk (if a Reply; if a WaitUntilNext); N.Silence; N.Wait (if a ProactiveTell; if a Silence); N.Accept.]
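Assuming the expected utility is a probability-weighted sum of U over the possible (t1, t2) pairs, a strategy can be selected by comparing expected utilities. The Python sketch below uses a made-up joint distribution and made-up utility functions; none of the numbers come from the slides.

```python
def expected_utility(joint_prob, utility):
    """E(U) = sum over (t1, t2) of Pr(t1, t2) * U(t1, t2).

    joint_prob maps (t1, t2) pairs to probabilities; utility is a callable.
    """
    return sum(p * utility(t1, t2) for (t1, t2), p in joint_prob.items())

# Hypothetical distribution over need time t1 and production time t2.
joint = {(10, 9): 0.5, (10, 12): 0.3, (11, 12): 0.2}

def u_tell(t1, t2):
    """Hypothetical: a fixed value of 5 minus a message cost of 2."""
    return 5.0 - 2.0

def u_silence(t1, t2):
    """Hypothetical: no message cost, but a penalty of 1.5 per step of delay."""
    return 5.0 - 1.5 * max(0, t2 - t1)

options = {"ProactiveTell": u_tell, "Silence": u_silence}
best = max(options, key=lambda s: expected_utility(joint, options[s]))
print(best)
```

In practice the distribution would come from the DIP estimates of information production and need, which this sketch replaces with a fixed dictionary.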
23
Strategies
Situation PA: provider produces I
ProactiveTell? Silence?
[Figure: timeline around the current time t, divided into known and unknown. Marked time points: last sent, last not sent, last need aware of, unfulfilled need, next production.]
24
Strategies
Situation PB: provider receives a request for I
Reply? WaitUntilNext?
[Figure: timeline around the current time t, divided into known and unknown. Marked time points: last production, next production.]
25
Strategies
Situation NA: needer needs I
ActiveAsk? Wait? Silence?
[Figure: timeline around the current time t, divided into known and unknown. Marked time points: last I received, most recent production, next production.]
26
Strategies
Situation NB: needer receives I
Accept
27
Summary
• Advantages of approach: allows agents to make intelligent choices of communication policy based on:
– frequencies: of needs, of sensing, of info. change
– costs: of messages, plus penalties for delays in action or for acting with incorrect information
28
Criteria for Applicable Domains
There are information needs among the team.
Agents can communicate.
There is uncertainty in the environment.
– Stochastic properties of teamwork process.
– Agents have incomplete/disjoint knowledge about the world.
The team acts under critical time constraints, so proactive assistance becomes important.