
Dynamically Formed Human-Robot Teams Performing Coordinated Tasks

M. B. Dias, T. K. Harris, B. Browning, E. G. Jones, B. Argall, M. Veloso, A. Stentz, A. Rudnicky
School of Computer Science
Carnegie Mellon University
Pittsburgh, Pennsylvania 15213
Email: {mbdias,tkharris,brettb,egjones,bargall,mmv,axs,air}@cs.cmu.edu

Abstract

In this new era of space exploration where human-robot teams are envisioned maintaining a long-term presence on other planets, effective coordination of these teams is paramount. Three critical research challenges that must be solved to realize this vision are the human-robot team challenge, the pickup-team challenge, and the effective human-robot communication challenge. In this paper, we address these challenges, propose a novel approach towards solving them, and situate our approach in the newly introduced treasure hunt domain.

Introduction

We have entered a new era of space exploration that requires the sustained presence of human-robot teams on other planets. Meeting this requirement entails enabling teams of humans and robots to effectively carry out many tasks, including habitat construction, exploration and surveying of sites, and carrying out science experiments. Enabling effective human-robot cooperation requires robust and efficient solutions to many research challenges. Three critical challenges to be solved are (a) effective coordination of human-robot teams, (b) enabling pickup teams¹, and (c) effective communication mechanisms between humans and robots in a team setting. In response to these challenges, the vision that drives the reported work is that humans and robots will dynamically form teams to solve complex tasks by efficiently joining their complementary capabilities. Dynamically formed human-robot teams are also relevant to several other application domains including urban reconnaissance, disaster response, hazardous exploration, and domains such as RoboCup (Noda et al. 1998).

The Pickup Team challenge is to dynamically form teams (or sub-teams) to solve a task, where the teams consist of humans and/or robots with possibly little a priori information. That is, team members may have only minimal prior knowledge of each other's behavior, the tasks at hand, and the environments they operate in, but are able to combine effectively. There are several additional reasons why an increased understanding of pickup teams is needed. First, the development of large teams or teams of expensive robots at the same site at the same time is impractical. This limitation currently hinders multirobot research. Successful pickup teams will facilitate further research by allowing separate groups to easily pool their robots to create teams for further study. Second, human-robot teams may be needed for emergency tasks where there may be insufficient time to hand-engineer the coordination mechanisms before task execution. Pickup teams enable human-robot teams to be formed on very short notice for such tasks. Third, as robots fail, get lost, or otherwise malfunction, it is often necessary to substitute or add new robots. Similarly, humans must be substituted as they fail, or move from one domain to another. Successful pickup teams will allow the integration of new team members into existing teams, and also enable human-robot teams with heterogeneous robots to perform efficiently under dynamic and uncertain conditions. Thus, the overall research challenge is to provide a principled methodology for creating pickup teams. This paper presents a first approach to address this challenge.

¹ We define a pickup team as a team of humans and/or robots that forms dynamically in response to a task. Moreover, teammates may have little or no prior experience working with one another.

Copyright © 2006, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

The human-robot team coordination challenge further requires robust team operation across multiple environments, team capabilities applicable across humans and multiple robot types, effective multi-human-multi-robot interaction in a team setting, efficient task, role, and resource allocation among and within the teams or sub-teams, and teams that improve over time. Competitions, such as RoboCup, have been effective in focusing efforts to overcome some of these challenges (Noda et al. 1998). However, these competitions focus on part of the overall problem and do not generally address teams formed in an ad-hoc manner, complex environments beyond a well-defined soccer field, and the complexities of heterogeneous teams.

Combining these two challenges, this paper focuses on dynamically forming teams of humans and heterogeneous robots to perform tasks that require coordination. The robots have limited individual capabilities, can sense different information about their environment, and can be assigned abstract tasks for execution. Robots can solve primitive tasks in different ways depending on the robot capabilities and the prevailing environmental conditions.

A third challenge addressed in this paper is that of effective communication between humans and robots in a team setting. While there has been considerable work on human-robot coordination, effectively passing information back and forth between one or more humans and one or more robots in the midst of carrying out tasks remains a significant challenge.

We propose a novel approach to solving the three challenges of effective pickup teams, effective human-robot team coordination, and effective human-robot communication. The proposed approach combines and extends three previously proven components: the TraderBots (Dias 2004) market-based coordination approach for efficient and flexible allocation of tasks to teams and roles to team members, Plays (Bowling, Browning, & Veloso 2004) for synchronizing team execution of coordinated tasks, and the multi-agent multi-modal dialog system TeamTalk (Harris et al. 2004). This approach is implemented on a team consisting of a human, a Segway RMP robot, and a Pioneer IIDX robot, and situated in a treasure hunt scenario.

The Treasure Hunt Domain

To investigate the coordination of human-robot pickup teams, we require a domain that will allow for dynamic and heterogeneous team formation, encourage different levels of coordination between sub-teams and within sub-teams, and provide a metric against which to compare team performances. We believe the treasure hunt domain, jointly proposed by researchers at The Boeing Company and at Carnegie Mellon University, provides for each of these characteristics.

The treasure hunt domain consists of human-robot teams competing in and exploring an unknown space in a quest to locate and amass objects of interest, or treasure. The robots are heterogeneous, and the particular capabilities necessary to accomplish the tasks are distributed throughout the team. The treasure hunt domain requires a team to acquire specific objects, the treasure, within an unknown or partially known environment. Thus a representation of the world must be built as it is explored, and the treasure must be identified and then localized within the built representation, and possibly moved to and amassed at a home-base. Team coordination follows as a direct necessity, as the abilities required to perform each of these tasks are distributed among the team members.

The ultimate goal with respect to pickup team formation is speed and flexibility. Within this domain, not only are teams created quickly and on the fly, but each member has no prior knowledge about the abilities of its potential teammates; a robot knows of its own capabilities only. Included in the description of a hunt task are the abilities necessary to accomplish it. Communication between potential pickup team members is therefore carried out at the time of team formation, to ensure the satisfaction of all capability requirements.

This domain can also be extended to include an adversarial environment in which to execute the hunt tasks. Teams can compete against the clock, with the intent of collecting as much treasure as possible within the allotted time, or compete against other dynamically formed teams. A metric for team performance can be the amount of treasure collected by the team in a given time period, or conversely, the time taken by a team to collect a percentage of the total hidden treasure.

In our specific treasure hunt implementation, potential team members include humans, Pioneer robots, and the robotic Segway RMP platform provided by Segway, LLC (see Figure 1). The Pioneer robot is equipped with a SICK laser and gyroscope, and is therefore able to both construct a map of an unknown environment and localize itself upon that map. The Segway robot has been outfitted with two cameras, by which it is able to visually identify both the Pioneer robot and the treasure. The task assigned to the team is to search for and retrieve treasure. To explore an area while searching for the treasure, the Pioneer is able to navigate and build a map, while the Segway is able to follow the Pioneer and simultaneously search for treasure. Upon treasure identification, the robots must return home with the treasure. By localizing on its created map, the Pioneer is able to determine the home location; the Segway then follows the Pioneer as it proceeds home. Team coordination between the robots during execution is accomplished jointly via the visual identification of the Pioneer by the Segway, and by communication between the two robots should this visual link be lost (in which case the Pioneer is commanded to pause while the Segway searches for it). Communication additionally occurs when the Segway informs the Pioneer that treasure has been found.

Figure 1: The left figure shows a Segway robot, while the right figure shows the Pioneer robots.

The human team members have unique capabilities in planning, coordination, navigation, sensing, and manipulation that are valuable in the treasure hunt domain. In our implementation of the treasure hunt scenario, robots do not have manipulators and hence they are unable to load and unload treasure. Thus the human team member plays an essential role in treasure retrieval, and is required to coordinate with the robots to amass and return treasure to the team's home-base. The human also has a prominent role as a team leader, segmenting the world into prioritized search areas, and acts as an assimilator and authority for the team's shared knowledge about its environment. We anticipate the need for the human to issue other types of constraints on search tasks, such as judgments about when a search has become exhaustive, what routes provide the best affordances to the robots, and guidance to the robots for the search vs. exploration trade-off. However, it is important to note that the human does not carry out task allocation and task coordination for the team. For most cases the robots can autonomously coordinate their individual and team actions, and even generate tasks and allocate them to the human team members.

Overall, the treasure hunt domain satisfies the specified criteria for the study of human-robot pickup teams. It offers a number of challenging aspects, including robust and efficient operation in unconstrained environments, and ad-hoc team formation. Efficient execution requires a coordinated search of the space and the maintenance of accurate shared knowledge about the space. As such, this domain provides a rich environment in which to push the boundaries of adaptive, autonomous robotics.

Component Technologies

In this section we review our current approaches to teamwork: Skills, Tactics, and Plays (STP) for team coordination in adversarial environments, TraderBots for efficient and robust role assignment in multi-robot tasks, and TeamTalk for multi-agent, multi-modal interactions.

STP: Skills, Tactics, and Plays

Veloso and colleagues (Bowling, Browning, & Veloso 2004) introduce a skills, tactics, and plays architecture (STP) for controlling autonomous robot teams in adversarial environments. In STP, teamwork, individual behavior, and low-level control are decomposed into three separate modules. Relevant to our implementation are Plays, which provide the mechanism for adaptive team coordination. Plays are the central mechanism for coordinating team actions. Each play consists of the following components: (a) a set of roles for each team member executing the play, (b) a sequence of actions for each role to perform, (c) an applicability evaluation function, (d) a termination evaluation function, and (e) a weight to determine the likelihood of selecting the play.

Each play is a fixed team plan that describes a sequence of actions for each role in the team toward achieving the team goal(s). Each of the roles is assigned to a unique team member during execution. The role assignment is based on the believed state of the world and is dynamic (e.g., role A may start with player 1, but may switch to player 3 as execution progresses). Note that the role assignment mechanism is independent of the play framework.

The concept of plays was created for domains where tight synchronization of actions between team members is required. Therefore, the sequence of tactics to be performed by each role is executed in lock step with each other role in the play. Hence, the play forms a fixed team plan whereby the sequence of activities is synchronized between team members.

As not all plans are appropriate under all circumstances, each play has a boolean evaluation function that determines the applicability of the play. This function is defined on the team's belief state, and determines if the play can be executed or not. Thus, it is possible to define special-purpose plays that are applicable only under specific conditions as well as general-purpose plays that can be executed under much broader conditions. Once executed, there are two conditions under which the play can terminate. The first is that the team finishes executing the team plan. Each play includes an evaluation function that determines whether the play should be terminated. As with applicability, this evaluation function operates over the team's belief state. Hence, the second means of ending a play is if the termination evaluation function determines that the play should end, either because it has failed or because it has succeeded.

Team strategy consists of a set of plays, called a playbook, of which the team can execute only one play at any instant of time. A play can only be selected for execution if it is applicable. From the set of applicable plays, one is selected at random with a likelihood that is tied to the play's weight. The plays are selected with a likelihood determined by a Gibbs distribution from the weights over the set of applicable plays, which means the team strategy is, in effect, stochastic. This strategy is desirable in adversarial domains to prevent the team strategy being predictable, and therefore exploitable by the opponent.
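The selection rule above can be sketched as follows. This is a minimal illustration rather than the STP implementation: the `Play` class, the function names, and the choice of `exp(weight)` as the unnormalized Gibbs probability are assumptions made for this example.

```python
import math
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Play:
    """Hypothetical play record: a name, a weight, and an applicability test."""
    name: str
    weight: float
    is_applicable: Callable[[Dict], bool]


def selection_probabilities(plays: List[Play], belief: Dict) -> Dict[str, float]:
    """Gibbs distribution over the applicable plays (assumes unique play names)."""
    applicable = [p for p in plays if p.is_applicable(belief)]
    if not applicable:
        return {}
    # Subtract the maximum weight before exponentiating for numerical stability.
    max_weight = max(p.weight for p in applicable)
    scores = {p.name: math.exp(p.weight - max_weight) for p in applicable}
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


def select_play(plays: List[Play], belief: Dict, rng=random) -> Optional[Play]:
    """Draw one applicable play at random, or return None if none applies."""
    probs = selection_probabilities(plays, belief)
    if not probs:
        return None
    names = list(probs)
    chosen = rng.choices(names, weights=[probs[n] for n in names], k=1)[0]
    return next(p for p in plays if p.name == chosen)
```

Because inapplicable plays are filtered out before normalization, a high weight on a play never forces its selection when its applicability condition is false.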

TraderBots

TraderBots, developed by Dias and Stentz (Dias 2004), is a coordination mechanism inspired by the contract net protocol (Smith 1980). It is designed to inherit the efficacy and flexibility of a market economy, and to exploit these benefits to enable robust and efficient multirobot coordination in dynamic environments. A brief overview of the TraderBots approach is presented here to provide context for the reported implementation. For more complete details of this approach, such as decisional processes and task representations, the reader is referred to previous publications (Dias 2004).

Consider a team of robots assembled to perform a particular set of tasks. Consider further that each robot in the team is modeled as a self-interested agent, and the team of robots as an economy. The goal of the team is to complete the tasks successfully while minimizing overall costs. Each robot aims to maximize its individual profit; however, since all revenue is derived from satisfying team objectives, the robot's self-interest translates to doing global good. Moreover, all robots can increase their profit by eliminating excess cost. Thus, to solve the task-allocation problem, the robots run task auctions, and bid on tasks in other robots' task auctions.

If the global cost is determined by the summation of individual robot costs, each deal made by a robot will result in global cost reduction. Note that robots will only make profitable deals. Furthermore, the individual aim to maximize profit (rather than to minimize cost) allows added flexibility in the approach to prioritize tasks that are of high cost and high priority over tasks that incur low cost but provide lower value to the overall mission. The competitive element of the robots bidding for different tasks enables the system to decipher the competing local information of each robot, while the currency exchange provides grounding for the competing local costs in terms of the global value of the tasks.
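The claim that profitable deals reduce global cost can be illustrated with a single-task auction. This is a simplified sketch, not the TraderBots protocol: the `run_task_auction` function and its arguments are hypothetical, and revenue and payment are omitted so that only costs are compared.

```python
from typing import Dict, Tuple


def run_task_auction(seller: str, seller_cost: float,
                     bids: Dict[str, float]) -> Tuple[str, float]:
    """Return (owner, cost) of the task after the auction.

    The task changes hands only if some bidder can perform it for less
    than the seller, so the summed cost over the team never increases.
    """
    if not bids:
        return seller, seller_cost
    best_bidder = min(bids, key=bids.get)
    if bids[best_bidder] < seller_cost:
        return best_bidder, bids[best_bidder]
    return seller, seller_cost
```

For example, a seller whose own cost is 10 and who receives bids of 7 and 12 hands the task to the bidder at 7; if every bid exceeds 10, the seller keeps the task.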

Figure 2: TeamTalk system architecture

TeamTalk

The human interface to the robots, TeamTalk (Harris et al. 2004), is a multi-modal multi-agent dialog system, which accommodates multiple dialog agents and supports multiple simultaneous conversations. TeamTalk is based on the Ravenclaw dialog manager (Bohus & Rudnicky 2003), a generalized framework that makes use of task-tree representations to manage interaction between the user and the computer (in the present case, multiple robots). In addition to Ravenclaw, TeamTalk makes use of various spoken language technologies developed at Carnegie Mellon. These include Sphinx-II (Huang et al. 1993) for automatic speech recognition and Phoenix (Ward 1994) for context-free grammar parsing. Language generation is template based, and synthesis (the Kalliope module) uses Swift, a commercialized version of the Festival synthesizer (Black et al. 2004). A diagram of the system architecture, as configured for one human and two robots, is shown in Figure 2.

Implementation

In our implementation we address the three main challenges involved with enabling dynamic human-robot teams. First we describe how we address the challenge of human-robot communication.

The TeamTalk system provides a framework for exploring a number of issues in human-robot communication, including multi-participant dialog and grounding. Below we briefly discuss some of these issues and describe our current solutions.

Managing multi-participant dialogs. Building systems with multiple human and robot participants presents several challenges. After considering several alternatives, we settled on the following design. Each human participant is given a complete interface hosted on a single computer incorporating the components that perform speech recognition, understanding, generation, and synthesis. An alternative and seemingly natural solution would have been to give each robot its own spoken language capability, but this anthropomorphism is not necessary. A single transducer for the human also allows us to mitigate the effects of environment and distance. Associating the interface directly with the human also allows us to use a multi-modal interface that combines speech and gesture in a single device (a tablet computer). Finally, there is the issue of cost as the number of robots increases.

TeamTalk creates a stateful dialog manager for each robot; this manager handles all of the human-side communication. TeamTalk automatically spawns a new dialog manager whenever the system detects a hitherto unknown robot. This mechanism allows the system to deal gracefully with the appearance of new robots on the team.

To support flexible communication, user input is broadcast to all robots on the team, much as one might encounter on a shared radio channel. Broadcasting also allows us to support more complex group addressing functions (see below) and creates the conditions for beneficial eavesdropping.

Grounding concepts and clarification dialog. Robust mechanisms for grounding are an essential part of interaction between agents, particularly in open domains where novel events and entities can occur. The Treasure Hunt calls for interactive techniques that allow humans and robots to develop a common shared understanding of the environment that they are operating in. For example, a newly activated robot needs to communicate its capabilities to the human by transmitting an ontology (and associated language) that can be used to configure understanding and generation and to populate the sub-task library. Grounding is also performed during activities. For example, a shallow but useful form of grounding is agreeing on names for landmarks (with names chosen from a closed set of labels). A more complex version would involve introducing new labels to the system (given that task-specific names make more sense than abstract ones). We can go beyond that and have a system that can generalize mappings between labels and features in the environment ("Call that a doorway", "Now go to the doorway"). Current capability includes only fixed labeling.

Clarification deals with confusions and ambiguities in communication due either to processing (for example, recognition errors) or actual ambiguities ("I know of two doors. Which one do I go to?"). In the case of recognition or understanding errors, clarification is triggered by confidence metrics computed for recognition accuracy, grammar coverage, concept history, the task, and per-robot policies. In this way, the system can conservatively ground concepts for high-risk tasks, and yet liberally and efficiently execute low-risk tasks. An example of a per-robot policy is as follows: the Pioneer robot, with its laser-based collision detection, will translate on command, only asking for command confirmation when there is a very noisy speech signal; conversely, the Segway RMP, with its high momentum, poor odometry, and total lack of collision detection, almost always asks for translation confirmation.
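A per-robot confirmation policy of this kind could be sketched as a table of confidence thresholds. This is only an illustration: the `CONFIRMATION_THRESHOLDS` table, the `needs_confirmation` function, the action names, and all numeric values are hypothetical and are not taken from TeamTalk.

```python
from typing import Dict, Tuple

# Hypothetical thresholds: a command is executed without confirmation only
# when its combined confidence reaches the threshold for (robot, action).
CONFIRMATION_THRESHOLDS: Dict[Tuple[str, str], float] = {
    ("pioneer", "translate"): 0.30,  # confirms only on very noisy input
    ("segway", "translate"): 0.95,   # almost always confirms
}
DEFAULT_THRESHOLD = 0.60  # assumed fallback for unlisted (robot, action) pairs


def needs_confirmation(robot: str, action: str, confidence: float) -> bool:
    """Return True if the robot should ask the human to confirm the command."""
    threshold = CONFIRMATION_THRESHOLDS.get((robot, action), DEFAULT_THRESHOLD)
    return confidence < threshold
```

With these assumed values, the same moderately confident translation command would be executed immediately by the Pioneer but confirmed first by the Segway.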

The second and third challenges we address are the dynamic and efficient formation of pickup teams given heterogeneous robots, and the efficient coordination of these teams when performing tightly coordinated tasks in dynamic environments. We have adapted the TraderBots approach and Play system to perform this function. Tasks are matched with plays consisting of a number of roles; the roles of the play, when performed, should satisfy the requirements of the task. Each role in a play contains a sequence of action primitives that can actually be executed by the robots, but a role can also require a certain set of capabilities. Robots only bid on roles that they have the capability to perform, thus accommodating the heterogeneity of the robots and providing an efficient way for new kinds of robots with different sets of capabilities to represent themselves to the system. TraderBots also requires that robots have the ability to estimate the cost of actions; for instance, a cost may be the total distance that a role requires a robot to move. By minimizing cost in performing role allocations we hope to not only get feasible solutions but also to have high efficiency.

The heterogeneity of the pickup teams demands that much care be taken during play execution; roles may depend on each other, and all robots need to do their part to actually discover, localize, and retrieve treasure. Thus, once the allocation has been performed, the tight coordination subsystem must monitor and direct play execution. We now describe the details of this approach.

Overview of Architecture

The top layer of our implementation consists of a number of Traders. Each agent, including the human operator, is assigned a Trader. A Trader is the agent's interface to the market. The Trader can introduce items to be auctioned to other Traders by sending a call for bids, can respond to calls for bids with bids, can determine which bids are most beneficial for the auctioning agent, and can issue awards to bidders that have won auctions. Each robot agent has a trader called a RoboTrader.

In our system the human operator has a trader, called the OpTrader. Each human team member uses TeamTalk as a client from which to communicate with the OpTrader. In this implementation, TeamTalk runs on a tablet PC, and each human team member carries the tablet PC with them. The tablet runs an instance of the TeamTalk software, provides for pen-and-tablet-based I/O with the robots, and provides for speech-based I/O through an attached headset microphone.

Each human team member (with an instance of TeamTalk) is a TeamTalk client of the OpTrader, and each TeamTalk instance spawns a Ravenclaw dialog manager for each active robot. Thus n × m stateful dialogs are held, where n is the number of human team members (and TeamTalk instances) and m is the number of robotic team members. Each TeamTalk instance contains a single automatic speech recognition engine, parser, natural language generation module, text-to-speech engine, and pen-and-tablet interface. For each robot that comes on-line, the interface spawns a separate dialog management system and confidence annotation module.

The PlayManager forms the next component module. The primary role of the PlayManager is to select useful plays for a task and to coordinate the execution of activities in a play across a small sub-team to perform a won task. The PlayManager can select plays to be bid upon for a specified task, coordinate the execution of a play with the PlayManagers for the other assigned roles, execute the sequence of tasks for its particular robot, and synchronize the activities, where required, between the different roles.

A final important component, the Robot Server, provides an interface between the PlayManager and the components on the robot responsible for controlling the robot. A strength of our system is that neither the PlayManagers nor the Traders need to know very much about how the control of the system is actually implemented, as the Robot Server serves as the single point of contact. Thus the Robot Servers on the robots must understand a standard packet, cause PlayManager-commanded actions to be performed, and send action statuses, but beyond that robot platform developers are free to develop their system in any fashion they wish.

The Treasure Hunt Task Life Cycle

[Diagram: the OpTrader issues a task auction call for SEARCH (8,12) (8,18) (14,18) (14,12) to the RoboTraders of Pioneer 1, Segway 1, Segway 2, and Pioneer 2; each RoboTrader passes the task to its PlayManager, which returns a play, labeled SEARCH-1 through SEARCH-4.]

Figure 3: Parts (a) and (b) of the task life-cycle as discussed in the Implementation section.

A human operator utters (or pen gestures) a task. If the message is a pen gesture, it is passed directly to Helios. If the task is an utterance, it is decoded by Sphinx into one or more hypotheses along with associated confidence scores. The hypotheses are routed to Phoenix, which parses each hypothesized utterance and, for each parse, generates the concepts and additional confidence annotation.

These concepts are routed to m instances of Helios, one for each robot. Helios examines the acoustic and language model confidences from Sphinx, additional confidence annotations imparted by Phoenix's parser coverage of the hypotheses, and the expectation agenda generated by Ravenclaw based on that robot's dialog state. From all of this evidence, Helios picks concepts to pass to the dialog manager, and assigns a final combined confidence value for each unique concept that is passed along.

At this point, each robot's dialog manager has received a set of concepts with associated confidences. Each dialog manager decides whether it is the addressed robot in the dialog or it is simply eavesdropping on a conversation between the human and another dialog partner. We allow for the possibility of opportunistic eavesdropping, so that an unaddressed robot could, for example, develop a better context from which to understand the next utterance that may actually be addressed to it. The dialog manager may act on the concepts, or it may engage in a grounding action on one or more of the concepts. This decision is made based on a policy that depends on the concept confidence history in the dialog, and the action to be performed (more on this below). The dialog is a mixed-initiative dialog, so that if certain concepts are provided that only partially complete some agenda, the dialog system may take initiative to obtain the missing information, for example, "How far do you want me to go backwards?". Based on grounded concepts, the dialog manager sends messages to the back-end manager, which in turn inserts tasks into an OpTrader server.

A task is issued and enters the Trading system. Initially, the human operator designates a SearchArea task, embedding the points of a bounding polygon in the task data. The task is sent to the OpTrader. The OpTrader creates an auction call with the task data, serializes it, and sends it via the wireless network using UDP to all RoboTraders (see Figure 3).

Each RoboTrader receives the task auction call and gets a matching play from the PlayManager. The RoboTraders receive the call for bids from the OpTrader. They deserialize the call and pass the task specification to their individual PlayManagers over a UDP socket connection. Each contacted PlayManager will compare the task string against the applicability conditions for each play in its playbook. It will then select a play stochastically amongst the set of plays that are applicable, and return this play to the RoboTrader.

The RoboTrader assesses the play. The RoboTrader now has a play matching the task and a set of roles. It then must select one of the roles for itself. For each role in the play, the RoboTrader first considers whether or not it possesses all the capabilities required to perform that role. For each role it is capable of performing, the Trader then performs a cost evaluation.

In the treasure hunt we are largely concerned with minimizing the time it takes to accomplish the task. As our robots move at roughly the same speeds during play execution, we try to minimize distance traveled as an approximation of time. If robots were heterogeneous with respect to speed, this should be reflected in their costing function. For most roles, cost is computed as total metric path cost for goal points to be visited in the role. Costing in our system is modular, so additional or different costing functions can be added easily as required.
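A distance-based costing function of this kind might look like the sketch below. It assumes straight-line travel between consecutive goal points, which is a simplification: the actual system may plan paths around obstacles, and the `path_cost` name and signature are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def path_cost(start: Point, goals: Sequence[Point]) -> float:
    """Sum of straight-line distances from start through each goal in order."""
    total = 0.0
    current = start
    for goal in goals:
        total += math.dist(current, goal)  # Euclidean distance (Python 3.8+)
        current = goal
    return total
```

For a robot starting at (8, 12) that must visit the corners (8, 18), (14, 18), and (14, 12) of a square search area, the cost is the sum of three sides of length 6. Because the function takes only a start point and a goal list, a speed-aware or obstacle-aware variant could be substituted without changing its callers.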

Once a cost has been assigned to each role, the RoboTrader selects the lowest cost role for itself, then prepares an auction call with all the remaining roles. This auction call is serialized and sent to the other Traders, as shown on the left in Figure 4.

[Diagram: Play SEARCH 1 with Role 1 (cover area and map, cost $105) and Role 2 (follow and look for treasure); an auction call passes among the RoboTraders of Pioneer 1, Segway 1, Segway 2, and Pioneer 2.]

Figure 4: The RoboTrader, upon receiving a play from the PlayManager, selects a role for itself and produces a cost estimate. It auctions the other role in the play to the other robots.

The other RoboTraders bid on the role auction. After the other RoboTraders receive and deserialize the role call, they determine their own cost for each role in the call. For any role in the call that requires capabilities the Trader does not possess, it assigns an infinite cost. For all roles that the Trader can perform, it assigns the cost determined by the costing function. Note that the Trader must bid on all roles of which it is capable. The role bids are then returned to the auctioneering RoboTrader (shown in Figure 5).
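The bidding rule in this step can be sketched as follows. The function name, the representation of capabilities as sets of strings, and the capability labels in the usage note are assumptions for illustration only.

```python
import math
from typing import Callable, Dict, Set


def bid_on_roles(capabilities: Set[str],
                 role_requirements: Dict[str, Set[str]],
                 cost_fn: Callable[[str], float]) -> Dict[str, float]:
    """Return a bid for every role in the call.

    Roles whose required capabilities are not all present receive an
    infinite cost; every other role receives the costing function's estimate.
    """
    bids: Dict[str, float] = {}
    for role, required in role_requirements.items():
        if required <= capabilities:  # subset test: all requirements satisfied
            bids[role] = cost_fn(role)
        else:
            bids[role] = math.inf
    return bids
```

A robot with only a hypothetical `"camera"` capability would thus return a finite bid for a role requiring `{"camera"}` and an infinite bid for a role requiring `{"laser", "mapping"}`.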

The RoboTrader bids on the task. Once all bids are received or a timeout has expired, the auctioneer RoboTrader determines the winner or winners of the role auction. All role bids are considered, and the lowest cost bid is selected. If that bid is finite, the role is designated as assigned to the bidding Trader. As a Trader can only win a single role in a play, that Trader's other bids are nullified. Additionally, all bids for that role from any other Trader are nullified. Then the lowest cost remaining bid is considered; this continues until all roles in the play are assigned or there are no remaining bids.

If all roles from the play have been assigned, the Robo-Trader constructs a bid for the original task, with a cost thatis summed over all assigned role cost estimates. The bid issent to the OpTrader. This process is shown in Figure 6.
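The greedy winner determination and the resulting task bid can be sketched as follows. This is a minimal sketch under stated assumptions: bids are represented as `(trader, role, cost)` tuples, the trader names are placeholders, and sorting once and skipping nullified bids stands in for repeatedly picking the lowest remaining bid (the two are equivalent).

```python
import math


def assign_roles(roles, bids):
    """Greedily assign each auctioned role to the lowest finite bid.

    bids: iterable of (trader, role, cost) tuples (assumed format).
    Returns a dict mapping role -> (trader, cost) for the roles assigned.
    """
    finite = [b for b in bids if b[1] in roles and math.isfinite(b[2])]
    assigned = {}
    winners = set()
    for trader, role, cost in sorted(finite, key=lambda b: b[2]):
        # Skip bids nullified by an earlier award: a trader wins at most
        # one role, and a role is awarded at most once.
        if role in assigned or trader in winners:
            continue
        assigned[role] = (trader, cost)
        winners.add(trader)
    return assigned


def task_bid(own_role_cost, roles, bids):
    """Sum of all role costs, or None if some role could not be assigned."""
    assigned = assign_roles(roles, bids)
    if set(assigned) != set(roles):
        return None
    return own_role_cost + sum(cost for _, cost in assigned.values())


if __name__ == "__main__":
    # Costs follow the running example: the auctioneer keeps Role 1 at $105
    # and receives Role 2 bids of $180, $140, and infinity.
    role2_bids = [
        ("trader_a", "role2", 180),
        ("trader_b", "role2", 140),
        ("trader_c", "role2", math.inf),
    ]
    print(task_bid(105, ["role2"], role2_bids))
```

With these numbers the task bid is $105 + $140 = $245.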

Task and role awards are granted. Once the OpTrader has received bids from all RoboTraders or a call timeout has expired, it awards the task to the RoboTrader with the lowest cost bid. An award message is sent to the winning RoboTrader. All other RoboTraders are informed that they lost the auction. The winning RoboTrader then sends role award messages to all RoboTraders that were assigned roles in the winning play.

[Figure 5 diagram: the auctioneering RoboTrader holds Play SEARCH 1 (Role 1, cost $105; Role 2, cost $140) and receives Role 2 bids of $180, $140, and ∞ from the other RoboTraders. Robots shown: Segway 1, Segway 2, Pioneer 1, Pioneer 2.]

Figure 5: Receiving the bids for the remaining role, the auctioneering RoboTrader selects the lowest cost bid.

[Figure 6 diagram: the RoboTraders on Pioneer 1, Pioneer 2, Segway 1, and Segway 2 send task bids ($280, $260, $300, $245) to the OpTrader, which sends a task award.]

Figure 6: The bids from all plays are returned to the OpTrader, who selects the lowest cost task bid and sends a task award to the winning bidder.

Note that at this point the RoboTrader still has not accepted the task award.
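The OpTrader's award step reduces to picking the minimum over the task bids that arrived before the timeout. The sketch below assumes bids are collected in a dictionary keyed by trader name; the names and the return convention are illustrative only.

```python
def award_task(task_bids):
    """Pick the lowest cost task bid.

    task_bids: dict mapping trader name -> task cost, containing only the
    bids received before the call timeout (assumed format).
    Returns (winner, losers); winner is None when no bids arrived.
    """
    if not task_bids:
        return None, []
    winner = min(task_bids, key=task_bids.get)
    losers = [trader for trader in task_bids if trader != winner]
    return winner, losers


if __name__ == "__main__":
    # Bid values as in the running example; trader names are placeholders.
    bids = {"trader_a": 280, "trader_b": 260, "trader_c": 300, "trader_d": 245}
    winner, losers = award_task(bids)
    print("award to", winner, "and notify", losers)
```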

Awards are accepted and play execution begins. The initial role bids made by the RoboTraders were non-binding. However, a role award is binding; once accepted, a role must be performed. If a RoboTrader has received no other role award in the interval between bidding and the arrival of this award, it accepts the role award. Otherwise it rejects the role award. If any of the role awards are rejected, or one of the Traders awarded a role does not respond to the award by a timeout, then the task award is rejected and the OpTrader must perform another auction.

If all role awards are accepted, the RoboTrader sends a task acceptance message to the OpTrader. Then it sends an execution message to the PlayManager detailing the final role assignments. This stage is shown in Figure 7. The RoboTrader's role in the pickup team allocation is concluded except for relaying status information to the OpTrader when the play completes. Future work, however, will examine dynamic re-assignment of roles if and when required.
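The acceptance rule can be sketched as two small pieces: the awarded Trader's local decision, and the winning RoboTrader's check over the replies. The class and function names are hypothetical, and a missing reply is treated here as a timeout.

```python
class RoleHolder:
    """Tracks whether a Trader is already bound to a role (illustrative)."""

    def __init__(self):
        self.committed_role = None

    def on_role_award(self, role):
        """Accept the award only if no other role has been accepted."""
        if self.committed_role is None:
            self.committed_role = role  # binding from this point on
            return True
        return False


def task_award_accepted(awarded_traders, responses):
    """Decide whether the task award can be accepted.

    responses: dict mapping trader -> True (accept) or False (reject);
    traders absent from the dict did not answer before the timeout.
    """
    return all(responses.get(trader) is True for trader in awarded_traders)


if __name__ == "__main__":
    holder = RoleHolder()
    print(holder.on_role_award("role2"))   # first award is accepted
    print(holder.on_role_award("role1"))   # a later award is rejected
    print(task_award_accepted(["a", "b"], {"a": True}))  # "b" timed out
```

When `task_award_accepted` returns false, the task award is rejected and the OpTrader must run another auction, as described above.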

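[Figure 7 diagram: on Pioneer 1, the RoboTrader sends a task accept message to the OpTrader and passes Play Search 1 (Role 1 assigned to Pioneer 1, Role 2 assigned to Segway 2) to the PlayManager, which sends Role 1, Action 1 to the RobotServer and Role 2 to the PlayManager on Segway 2; that PlayManager sends Role 2, Action 1 to its RobotServer.]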

Figure 7: After the RoboTrader has confirmed the availability of the winning Role bidder, it can send notice of task acceptance to the OpTrader and forward the role assignments to the PlayManager. The PlayManager contacts the play managers of all robots assigned a role and sends them a description of the actions in their role. Action primitives are sent to the robot servers and the play execution begins.

The PlayManager begins play execution. Once informed of the play to execute and the list of assigned roles, the PlayManager forms the sub-team to execute the play. It does this by contacting each of the sub-team members' RoleExecutors and transmitting the play to them in compressed XML format. The RoleExecutor is responsible for executing the assigned role in the play. If any essential sub-team members fail to acknowledge the play reception, or cannot be contacted, the play is terminated and reported as such to the RoboTrader. Otherwise, execution proceeds by the PlayManager informing each RoleExecutor to start operation.

At this point, each RoleExecutor becomes loosely coupled to the PlayManager. The RoleExecutor will execute its sequence of actions and will only inform the PlayManager of termination (success or failure), or when it needs to synchronize with another role according to the play. To synchronize, the RoleExecutor contacts the appropriate teammate's RoleExecutor and informs the PlayManager for monitoring purposes.

When each RoleExecutor reaches the end of its sequence of actions to perform, it informs the PlayManager of successful termination, and when all team members have finished, the PlayManager reports this to the RoboTrader. Alternatively, if a robot fails, or the time limit for execution is reached (as encoded in the play itself), the play is terminated and reported as such. Taken together, execution is distributed and loosely coupled.
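The termination conditions above can be summarized as a small status function that a PlayManager-like monitor might evaluate whenever a RoleExecutor reports in. The `Status` enum, the argument names, and the choice to check success before the time limit are assumptions for illustration, not details taken from the implementation.

```python
from enum import Enum


class Status(Enum):
    RUNNING = "running"
    SUCCESS = "success"
    FAILURE = "failure"


def play_status(role_statuses, elapsed, time_limit):
    """Combine per-role reports into a status for the whole play.

    role_statuses: dict mapping role name -> Status reported so far.
    elapsed, time_limit: seconds; the limit is encoded in the play.
    """
    statuses = list(role_statuses.values())
    if any(s is Status.FAILURE for s in statuses):
        return Status.FAILURE  # any robot failure terminates the play
    if statuses and all(s is Status.SUCCESS for s in statuses):
        return Status.SUCCESS  # every role finished its action sequence
    if elapsed >= time_limit:
        return Status.FAILURE  # time limit reached before completion
    return Status.RUNNING


if __name__ == "__main__":
    reports = {"role1": Status.SUCCESS, "role2": Status.RUNNING}
    print(play_status(reports, elapsed=30.0, time_limit=120.0))
```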

The robots report status. Task completion messages, as well as other messages (e.g., status reports) originated by the robots, are captured by the TeamTalk back-end manager, and the messages are routed to each robot's corresponding Helios just as for the human utterances. Helios assigns high confidence to information that originates from robots, on the presumption that the robot communications channel is not noisy.

Generally, robot-originated messages are interpreted as requests by the robots to communicate to the human, and Ravenclaw passes those messages directly to Rosetta. Messages may also complete task agencies, allowing Ravenclaw to take these off of the focus stack and modify the expectation agenda as appropriate (Bohus & Rudnicky 2003). Messages are also routed to the GUI controller to allow the system to modify the display or perform some other signaling action. In the case of a completed task, this may change the color of the robot's icon.

Rosetta uses template-based natural language generation to render concept terms into well-formed sentences. The sentences generated by Rosetta are routed to Kalliope for synthesis. Kalliope selects a different Swift voice for each robot, so each robot speaks in its own distinct voice. This helps the user keep better track of information.
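To make the idea of template-based generation with per-robot voices concrete, here is a minimal Python sketch. It does not reproduce Rosetta's template syntax or Kalliope's interface; the concept names, templates, and voice labels are all invented for illustration.

```python
# Hypothetical concept-to-sentence templates (not Rosetta's actual format).
TEMPLATES = {
    "task_complete": "{robot} has completed the task {task}.",
    "treasure_found": "{robot} has found treasure at {location}.",
}


def render(concept, **slots):
    """Fill the template for a concept with the given slot values."""
    return TEMPLATES[concept].format(**slots)


class VoiceTable:
    """Gives each robot a stable voice, cycling if voices run out."""

    def __init__(self, voices):
        self._voices = list(voices)
        self._assigned = {}

    def voice_for(self, robot):
        if robot not in self._assigned:
            index = len(self._assigned) % len(self._voices)
            self._assigned[robot] = self._voices[index]
        return self._assigned[robot]


if __name__ == "__main__":
    voices = VoiceTable(["voice_1", "voice_2"])  # placeholder voice labels
    sentence = render("task_complete", robot="Pioneer 1", task="search")
    print(voices.voice_for("Pioneer 1"), "says:", sentence)
```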

Current System

The indicated architecture has been implemented and tested on the Pioneer and Segway platforms. The Pioneers and Segways each have platform-specific Robot Servers, which function as the interface from the robot-specific software to the rest of the implementation. The RoboTraders and PlayManagers run identically on both platforms. The capabilities of a given robot are defined in a configuration file unique to that robot; the RoboTrader reads this file and uses the information in the role allocation process.

We have implemented and tested one of the main plays required for the treasure hunt domain: the covering of an area while searching for treasure. This play consists of two roles, one that explores and maps an indicated area and another that follows the exploring agent while visually searching for treasure. We have tested this system from task issue to treasure identification.

In our tests, the search area was initially identified by the human via a graphical user interface. The OpTrader then dynamically allocated the two roles to our participating agents, the Pioneer and Segway. With the allocation completed, the PlayManager successfully enabled the tight coordination required of this team task. Each agent then executed its role until treasure was discovered, thus completing the task and its associated play. The appropriate robots then reported the completion of the task to the human.

To demonstrate the capabilities of the market-based allocation to dynamically form sub-teams, we created a tightly coordinated search play calling for two agents to map and explore an area together. The roles associated with this play designated leader and follower agents. We positioned four Pioneer robots at varying locations for this task. Using a distance-based cost model, the two agents with starting positions closest to the search area were dynamically selected for the sub-team and subsequently executed the search pattern.

The realization of these plays relied on a few simplifying assumptions. A good allocation requires the existence of an accurate cost model. Necessary for this cost model, as well as for proper execution, is the notion of a shared coordinate space, as well as the ability to accurately localize within this coordinate space. Additionally, the environment was visually simplified to allow a higher probability of treasure discovery; both the Pioneer and the treasure were outfitted with identifying visual fiducials. Work in progress includes implementing and testing treasure retrieval with human-robot participation, and with robots creating and allocating tasks to other robots and humans.

Related Work

Within the field of robotics, there has been considerable research into multi-robot coordination for a variety of domains and tasks (Mataric 1994; Balch & Arkin 1998; Gerkey & Mataric 2003). Many groups have focused on research questions relevant to robot teams particularly (Balch & Parker 2001). RoboCup robot soccer has offered a standardized domain in which to explore multi-robot teamwork in dynamic, adversarial tasks (Noda et al. 1998; Nardi et al. 2004) (see also http://www.robocup.org). Segway Soccer (Browning et al. 2005) is a new league within this RoboCup domain, which addresses the coordination of heterogeneous team members specifically. The emphasis of these human-robot teams is equality, both physically, as both ride the Segway mobility platform, and with respect to decision-making power and responsibility.

How to effectively coordinate heterogeneous teams has been an ongoing challenge in multi-robot research (Scerri et al. 2004; Kaminka & Frenkel 2005). However, no one has focused explicitly on the principles underlying the building of such highly dynamic teams when the a priori interaction between individual robot developers is so minimal. Much of the existing research implicitly assumes that the robot team is built by a group of people working closely together over an extended period of time. While some previous research within the software agents community has addressed the coordination of simulated agents built by different groups (Pynadath & Tambe 2003), none has chosen to address this pickup challenge for the coordination of multiple robots. We believe this research direction of forming dynamic teams will greatly advance the science of multi-robot systems.

Dialog agents are almost always designed with the tacit assumption that at any one time, there is a single agent and a single human, and that their communication channel is two-way and exclusive. As a consequence, such systems do not deal gracefully with additional speakers or agents in the channel. One approach, for example (Harris 2004), constructs an aggregated spoken dialog front-end for a community of under-specified agents. These systems, however, sacrifice the integration of domain knowledge into the dialog and support only shallow levels of interaction.

Conclusions

In this paper, we presented the concept of pickup teams, where teams are formed dynamically from humans and heterogeneous robots with little prior information about one another. Within the team, humans and robots communicate via speech as their primary medium. The TraderBots allocation mechanism is coupled with Play mechanisms to achieve dynamic sub-team formation, and efficient allocation of tasks and roles. Thus, we have proposed a new technique to address the research challenge of dynamic and efficient formation, coordination, and interaction of human-robot pickup teams. A first version of this approach has been implemented and tested on a team of one human and two heterogeneous robots developed by two different research groups. Future work will enhance this preliminary implementation in several dimensions including improved robustness, more complex environments, teams and tasks, and learned adaptation to dynamic conditions.

Acknowledgment

This work is principally funded by the Boeing Company Grant CMU-BA-GTA-1. The content of the information in this publication does not necessarily reflect the position or policy of the Boeing Company and no official endorsement should be inferred. The authors would also like to thank the Boeing Company for their contributions to this project and this paper. This work was also enabled in part by funding from the Qatar Foundation for Education, Science and Community Development and from the U.S. Army Research Laboratory, under contract Robotics Collaborative Technology Alliance (contract number DAAD19-01-2-0012).

References

Balch, T., and Arkin, R. 1998. Behavior-based formation control for multiagent robot teams. IEEE Transactions on Robotics and Automation.

Balch, T., and Parker, L., eds. 2001. Robot Teams: From Diversity to Polymorphism. AK Peters.

Black, A.; Clark, R.; Richmond, K.; and King, S. 2004. The festival speech synthesis system. http://www.cstr.ed.ac.uk/projects/festival/.

Bohus, D., and Rudnicky, A. I. 2003. Ravenclaw: Dialog management using hierarchical task decomposition and an expectation agenda. In Proceedings of European Conference on Speech Communication and Technology, 597–600.

Bowling, M.; Browning, B.; and Veloso, M. 2004. Plays as effective multiagent plans enabling opponent-adaptive play selection. In Proceedings of International Conference on Automated Planning and Scheduling (ICAPS'04).

Browning, B.; Searock, J.; Rybski, P. E.; and Veloso, M. 2005. Turning segways into soccer robots. Industrial Robot 32(2):149–156.

Dias, M. B. 2004. TraderBots: A New Paradigm for Robust and Efficient Multirobot Coordination in Dynamic Environments. Ph.D. Dissertation, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA.

Gerkey, B. P., and Mataric, M. J. 2003. Multi-robot task allocation: analyzing the complexity and optimality of key architectures. In Proceedings of ICRA'03, the 2003 IEEE International Conference on Robotics and Automation.

Harris, T. K.; Banerjee, S.; Rudnicky, A.; Sison, J.; Bodine, K.; and Black, A. 2004. A research platform for multi-agent dialogue dynamics. In Proc. of the IEEE International Workshop on Robotics and Human Interactive Communication, 497–502.

Harris, T. 2004. The speech graffiti personal universal controller. Master's thesis, Carnegie Mellon University, Pittsburgh.

Huang, D.; Alleva, F.; Hon, H. W.; Hwang, M. Y.; Lee, K. F.; and Rosenfeld, R. 1993. The Sphinx-II speech recognition system: An overview. Computer, Speech, and Language 7(2):137–148.

Kaminka, G., and Frenkel, I. 2005. Flexible teamwork in behavior-based robots. In Proceedings of the National Conference on Artificial Intelligence (AAAI-2005).

Mataric, M. J. 1994. Interaction and Intelligent Behavior. Ph.D. Dissertation, EECS, MIT, Boston, MA. Available as technical report AITR-1495.

Nardi, D.; Riedmiller, M.; Sammut, C.; and Santos-Victor, J., eds. 2004. RoboCup 2004: Robot Soccer World Cup VIII (LNCS/LNAI). Berlin: Springer-Verlag Press.

Noda, I.; Suzuki, S.; Matsubara, H.; Asada, M.; and Kitano, H. 1998. RoboCup-97: The first robot world cup soccer games and conferences. AI Magazine 19(3):49–59.

Pynadath, D. V., and Tambe, M. 2003. An automated teamwork infrastructure for heterogeneous software agents and humans. Autonomous Agents and Multi-Agent Systems 7(1-2):71–100.

Scerri, P.; Xu, Y.; Liao, E.; Lai, J.; and Sycara, K. 2004. Scaling teamwork to very large teams. In AAMAS'04.

Smith, R. G. 1980. The contract net protocol: High level communication and control in a distributed problem solver. IEEE Transactions on Computers C-29(12):1104–1113.

Ward, W. 1994. Extracting information from spontaneous speech. In Proceedings of the International Conference on Spoken Language Processing, 83–87.
