© 2005 by Thomas Brown. All rights reserved.

DECENTRALIZED COORDINATION WITH CRASH FAILURES

BY

THOMAS BROWN

B.S., University of Texas at Austin, 2002

THESIS

Submitted in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering

in the Graduate College of the University of Illinois at Urbana-Champaign, 2005

Urbana, Illinois

ACKNOWLEDGMENTS

I thank my adviser, Professor Gul Agha, and the members of the Open Systems Laboratory for

supporting my work with their time, resources and ideas.

The study described in this thesis is a continuation of a project done in collaboration with

Amr Ahmed, Abhilash Patel, MyungJoo Ham, and Hannaneh Hajishirzi. Together we built all the

infrastructure used in developing this thesis; I have made only relatively small improvements. Since

the original project ended, several conversations with Amr and Predrag Tosic have been fruitful.

Myeong-Wuk Jang generously found time for me to run my own experiments when his computers

were idle.

I also acknowledge the unceasing moral and grammatical support of Stephanie Jensen.

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

LIST OF ABBREVIATIONS

CHAPTER 1 INTRODUCTION

CHAPTER 2 RELATED WORK
    2.1 Related Problems
    2.2 Auction Theory
    2.3 Robot Specific

CHAPTER 3 MODEL
    3.1 Informal Description of Game
    3.2 Formal Description of Game
    3.3 Implementation

CHAPTER 4 THE IDAC ALGORITHM
    4.1 Creating an Auctioneer
    4.2 Auctioneer Mechanism
    4.3 Bidder Mechanism
        4.3.1 Main controller of the bidder
        4.3.2 Calculating the best bid

CHAPTER 5 SURVIVING CRASH FAILURES

CHAPTER 6 RESULTS
    6.1 Finding IDAC Parameters
    6.2 Demonstration of Crash Recovery
    6.3 Real Experiments Compared to Optimal
    6.4 Discussion
    6.5 Continuing Research

REFERENCES

LIST OF TABLES

Table 6.1 Summary of physical experiments.

Table 6.2 Summary of simulated experiments.

Table 6.3 Growth of combinations of trips.

Table 6.4 Comparison of IDAC without collision avoidance to the optimal trips for a given mission.

LIST OF FIGURES

Figure 3.1 The common structure of an agent is shown in this figure.

Figure 3.2 The connections between the control modules in the UAV agent are shown in this figure.

Figure 3.3 The robot with switches for real and simulated cases is shown in this figure.

Figure 6.1 This graph shows that total time to service all targets is weakly related to Cost Bid and Cost Assign.

Figure 6.2 This figure shows a potential deadlock.

Figure 6.3 This graph shows the proportion of missions that did not deadlock for various Cost Bid and Cost Assign.

Figure 6.4 This graph shows the mean number of Bid messages sent per mission for various Cost Bid and Cost Assign.

Figure 6.5 This graph shows the mean number of times a UAV bid for a target other than its current assignment for various Cost Bid and Cost Assign.

Figure 6.6 These figures show the starting positions used in physical experiments with (a) four moving targets and (b) six static targets.

Figure 6.7 The auction may be orientated with UAVs as (a) bidders or (b) sellers.

LIST OF ABBREVIATIONS

DCOP Distributed Constraint Optimization Problem

DDTA Dynamic Distributed Task Allocation

GUI Graphical User Interface

IDAC Illinois Distributed Auction Coordination, the algorithm described in this thesis

TSP Traveling Salesperson Problem

UAV Unmanned Autonomous Vehicle

CHAPTER 1

INTRODUCTION

There are many practical computing problems that remain largely unsolved today. For example, modern natural language processing applications fall far short of what science fiction writers envisioned 30 years ago, large computer systems struggle to avoid collapsing under the weight of their complexity, and many computations that can be quickly solved for small cases are essentially unsolvable for real-sized cases. Optimal task assignment at a useful scale is an example of a computation that modern and plausible future computers cannot solve. Instead, approximations with bounds on their worst-case behavior have been proposed. Most of these approximations run on a centralized processor to create a static solution, but large decentralized and dynamic systems may also need to assign tasks to their constituent agents. This thesis proposes and describes the implementation of a decentralized algorithm for task assignment in a dynamic environment.

Task assignment can be stated quite simply: given a set of agents, a set of goals, and a method

for measuring cost, find an agent for each goal that will minimize the total cost. Yet finding a

solution can be surprisingly difficult and becomes unsolvable in a reasonable amount of time as

the number of goals or agents increases. Unfortunately, this problem is faced in many real-life

situations. Package delivery firms plan paths for their trucks. A taxi dispatch must match cars

to passengers. A team of nurses must coordinate their individual schedules to provide continuous

care. Children in a daycare must share the toys and games. These people have methods for

finding acceptable solutions to their problems, but can their algorithms be improved or used in

new situations?

This thesis adapts a centralized, static, auction-based algorithm for single-agent task assignment to solve a distributed, dynamic, multiagent task assignment problem. The adaptation, which will be referred to as Illinois Distributed Auction Coordination (IDAC), is used to assign a group of unmanned autonomous vehicles (UAVs) to each target. The UAVs cooperatively select an auctioneer for each target, and each UAV bids on the target with the greatest expected profit. The auctions are run continuously, allowing the system to adapt to changes in the available UAVs and targets. When an auctioneer crashes, the other UAVs detect the failure and select a new auctioneer. In simulation, IDAC successfully coordinated scenarios as large as 20 UAVs chasing 40 targets, each needing up to six UAVs; with real robots, IDAC coordinated 4 UAVs chasing 6 targets.

The main contribution of this work is a reliable, totally decentralized auction algorithm for assigning tasks to agents that tolerates crash failures of agents. Some prior research has used centralized auctions to assign agents to tasks that each need multiple agents, and this work demonstrates a decentralized technique for making such auctions finish.

For our purposes, task allocation and task assignment are used as synonyms, as are agent, machine, and UAV, and task and target. In this version each agent may pursue at most one task at a time, and an assignment is a map from each agent to its task.

The remainder of this thesis is organized as follows. Chapter 2 reviews some literature related

to the algorithm and task assignment. Chapter 3 describes the problem this thesis addresses in

detail. Chapters 4 and 5 describe the algorithm, and Chapter 6 presents experimental results and

concluding remarks.

CHAPTER 2

RELATED WORK

The problem of distributing a set of tasks among a set of agents has been addressed by countless

researchers in different fields. This review starts with some general formulations and solutions to

the problem. Since IDAC depends on an auction, we briefly describe the current state of auction

theory. Finally, some similar work in UAV task assignment is compared to IDAC. Using the

common meanings (as explained in [1]), agents of a distributed system decide on an action by

negotiating with each other and agents of a decentralized system independently process (possibly

shared) information to decide what action to execute.

2.1 Related Problems

Finding a map from machines to assigned tasks is a well-known and well-researched problem.

Most instantiations of the problem have a method of ordering possible assignment maps and seek

to minimize the cost or, equivalently, to maximize the benefit. Burkard and Cela [2] describe some

techniques for solving the linear sum assignment problem (LSAP), a problem where each machine

has a cost for each task, and the total cost is the sum of the costs for each machine’s task. Because

we attempt to assign a set of UAVs to each target, the benefit of assigning a task depends on how

many other UAVs are also assigned to that task; thus, our problem is not an LSAP. It is instead

a multidimensional assignment problem as described at the end of [2] which is equivalent to what

Curiel [3] calls the multiassignment game. Like the traveling salesman problem (TSP) [4] or vehicle

routing problem (VRP) [5], these problems are all NP-hard.
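To make the LSAP cost definition above concrete, the following is a minimal sketch that scores every one-to-one assignment of machines to tasks as the sum of per-machine costs and keeps the cheapest. The function name `solve_lsap` and the cost matrix are illustrative assumptions, not taken from [2]; exhaustive search is used only because the instance is tiny.

```python
from itertools import permutations


def solve_lsap(cost):
    """Exhaustively solve a small linear sum assignment problem.

    cost[m][k] is the cost of giving task k to machine m (square matrix).
    Returns (lowest total cost, tuple mapping machine index -> task index).
    """
    n = len(cost)
    best_total, best_perm = None, None
    for perm in permutations(range(n)):
        # Total cost is the sum of each machine's cost for its own task.
        total = sum(cost[m][perm[m]] for m in range(n))
        if best_total is None or total < best_total:
            best_total, best_perm = total, perm
    return best_total, best_perm


# Hypothetical 3-machine, 3-task instance.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
best_total, best_perm = solve_lsap(cost)
```

In the problem addressed by this thesis the per-machine costs are not independent, because the benefit of a target depends on how many UAVs share it, so a sum of this form does not apply directly.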

While most treatments of assignment problems discuss the computational complexity in a centralized system, Yokoo and Hirayama [6] and Yokoo [7] generalized it to the distributed constraint satisfaction problem (DisCSP) and distributed constraint optimization problem (DCOP).

The design of IDAC is based on Bertsekas and Castanon’s [8] and Bertsekas’ [9] forward/reverse

auction for asymmetric assignment. Bertsekas describes a centralized algorithm that uses a global

value λ to ensure convergence to equilibrium and cannot handle tasks requiring more than one

agent. The bids are made for $\mathit{profit}_{\mathit{first}} - \mathit{profit}_{\mathit{second}} + \mathit{price}_{\mathit{first}}$, and the auctioneer raises the

price as much as possible while maintaining the necessary bid. This strategy increases the rate

of convergence in a static system, but creates a solution which is unstable in the dynamic system

addressed in this thesis.
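The bid formula above can be illustrated with a small worked example. This is a sketch only: it assumes that a bidder's profit for an item is its benefit minus the item's current price, which is the usual convention in auction algorithms but is not spelled out in this section, and the names `best_bid`, `benefits`, and `prices` are hypothetical.

```python
def best_bid(benefits, prices):
    """Return (item, bid) for one bidder.

    Assumption: profit = benefit - current price. The bid for the most
    profitable item is profit_first - profit_second + price_first.
    Requires at least two items.
    """
    profits = {item: benefits[item] - prices[item] for item in benefits}
    ranked = sorted(profits, key=profits.get, reverse=True)
    first, second = ranked[0], ranked[1]
    bid = profits[first] - profits[second] + prices[first]
    return first, bid


# Hypothetical values for three items.
benefits = {"a": 10.0, "b": 8.0, "c": 3.0}
prices = {"a": 4.0, "b": 3.0, "c": 0.0}
item, bid = best_bid(benefits, prices)
```

Here the profits are 6, 5, and 3, so the bidder offers 6 - 5 + 4 = 5 for item `a`: the price at which `a` becomes no more attractive than the second choice.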

2.2 Auction Theory

Despite the widespread use of auctions for many millennia, modern rigorous study of their mechanisms

did not start until the 1960s. Most early papers only consider many buyers competing for

a single item. They compare different mechanisms for deciding which bidder participates in the

exchange and at what price. Klemperer [10] provides a good introductory survey of auction theory.

Aside from the classic auction where a set of bidders compete for a single item, significant

research has been done on multiunit and combinatorial auctions. In these auctions bidders have

value for several identical items or a combination of different items. A single auctioneer conducts

multiple simultaneous auctions in an attempt to find a stable buyer and price for each item. De Vries

and Vohra [11] wrote a survey of combinatorial auctions.

Strategies for handling decentralized auctions (i.e., not run by a single auctioneer) are less well

developed. Esteva and Padget [12] proposed a method using leader election to run auctions without

any specific auctioneer, but it requires that every participant be on a ring network and that they

can only sell single items. Byde et al. [13] provided probabilistic strategies for a buyer when trying

to buy a combination of items from independent heterogeneous auctions.

In a typical auction system the sellers will exchange their item(s) for any price above each

seller’s constant reserve price, and the bidders compete with each other, raising the price. If the

price of an item does not exceed the reserve price, the seller keeps the item. In some distributed

auction markets sellers compete for buyers. McAfee [14] wrote an early paper analyzing competing

sellers. Coles and Eeckhout [15] built on this work to show that a distributed auction can be used

to solve a coordination problem when buyers and sellers are heterogeneous.

Unlike all these previously mentioned studies, IDAC uses many distributed auctions (one per

target) each competing for a set of buyers (UAVs to service the target). To the best of our

knowledge IDAC and its predecessor, as described by Ahmed et al. [16], have introduced a new

type of auction not mentioned in any prior auction theory work. While both the previous dynamic

benefits scheme and cost waiting in this thesis have demonstrated a working seller bundle auction,

Section 6.4 suggests an alternative method which may achieve the same result using a normal

multiunit auction.

2.3 Robot Specific

Many different approaches have been taken to coordinate a system of multiple UAVs. Cao et al. [17]

describe five axes along which much of the research may be analyzed. A summary of the axes (in

parentheses) and the position of IDAC on each axis follows.

1. Group Architecture (“robot heterogeneity/homogeneity, the ability of a given robot to recognize and model other robots, and communication structure”): The group of UAVs is homogeneous and communicates via intentional messages and a shared local view. There is no central control, but a UAV temporarily takes responsibility for coordinating a local task.

2. Resource Conflict (methods that allow “robots to inhabit a shared environment, manipulate

objects in the environment, and possibly communicate with each other”): Resource conflicts

are not an important focus of this work; 802.11b and Ethernet are used for communication.

3. Origin of Cooperation (“how cooperative behavior is actually motivated and achieved”):

The cooperation of UAVs is the result of explicit negotiations rather than innate eusocial

(i.e., swarm) behavior.

4. Learning (methods for “adaptability and flexibility”): Many of the variables used to control the

cooperation have been hand tuned during debugging, while a few are dynamically adjusted.

Our study does not focus on learning techniques.

5. Geometric Problems (issues such as “multi-agent path planning, moving to formation, and pattern generation”): Geometric problems were solved to the minimum needed. A simple reactive decentralized path planner ([18], [19]) and negotiated formation strategy are used to avoid collisions.

Cao et al. [17] explicitly exclude task allocation from their axes but, like most of their axes, task

allocation can be delineated along a scale with reactive/behavioral/swarm methods at one end and

intentional/deliberative/negotiated methods at the other end. IDAC is between the swarm and

negotiated end points for task allocation.

Research in optimal task allocation and scheduling generally solves a static mission with homogeneous

agents and, thus, cannot be directly used by robot teams [20]. Robots must work

in a dynamic and failure-prone environment to complete tasks that may need several heterogeneous

robots. We adapted a well-known task allocation algorithm to explore a new method of

coordinating robot teams.

ALLIANCE [20] uses behaviors that are enabled or suppressed based on models of the motives

impatience and acquiescence. Parker writes “While some efficiency may be lost as a consequence

of not negotiating the task subdivision in advance, robustness is gained if robot failures or other

dynamic events occur at any time during the mission. We speculate that a hybrid combination of

negotiation and the ALLIANCE motivational mechanisms would enable a system to experience the

benefits of both approaches” [20, p. 7]. IDAC combines negotiation and motivational mechanisms

in pursuit of these combined benefits. Like ALLIANCE, it easily reassigns tasks after failure or

environmental changes, but ALLIANCE is able to reassign based on actual progress rather than in

response to negotiation. The cost waiting of IDAC was inspired by impatience in ALLIANCE.

Unlike ALLIANCE, the task allocation system MURDOCH [21] uses explicit negotiation to

assign particular agent(s) to a task. Similar to the system described here, MURDOCH uses an

auction followed by simple negotiation to assign a task, and the auctioneer can reassign a task if

the first winner fails. Unlike MURDOCH, our system uses a multibid auction that attempts to

make a more efficient assignment at the cost of more messages, that may reassign a task even if

the original winner has not failed (for example, a much closer UAV could become available), and

that can handle failure of the auctioneer.

Cao et al. claim “very few works in cooperative robotics have centered on task decomposition and allocation” [17, p. 5] and Gerkey and Mataric say “task allocation for physically embodied robots has received far less attention” [21, p. 760] than for software agent systems. These statements cannot be repeated today with the same strength, as many recent works have focused on coordinating tasks between groups of UAVs (frequently “unmanned aerial vehicles” instead of “unmanned autonomous vehicles”).

For example, [22] addresses the intractability of large coordination problems by partitioning the UAVs and targets into small subteams where a leader can enable coordination. A market-like scheme for evaluating cost and benefit is used to swap UAVs and targets between subteams. Richards et al. [23] combine a distributed approximation of possible tours for a UAV with a centralized coordinator to make final assignments. Zigoris et al. [24] use a centralized stable marriage algorithm to assign tasks to robots. Schumacher et al. [25] make the brave assumption that every UAV has a synchronized world view so that they can each plan entire tours in a decentralized manner. Similarly, Frazzoli and Bullo [26] provide algorithms for solving the m-vehicle dynamic traveling repairperson problem and let the UAVs coordinate by each using the same method on the same world view. Walker [27] discusses the difficulty of scaling the problem to more UAVs and targets and the coupling of path planning to target assignment. He proposes several heuristics to help pick assignments.

Many market-based coordination schemes use an auction to assign or trade tasks between fixed

coalitions rather than as part of coalition formation. An exception is Guerrero and Oliver [28]

who suggested a method similar to IDAC in many respects. Both select a leader for each task

as tasks appear. The leader then requests and collects bids and uses a three-phase commit to

confirm a coalition. Both Guerrero and Oliver’s work and IDAC use the bidding mechanism to

form coalitions, though their work selects a coalition size based on the received bids whereas in

IDAC each target needs a coalition of fixed size. Guerrero and Oliver’s work and IDAC differ in how

dynamic behavior in the system is handled. They use leader-to-leader negotiation for moving

UAVs between coalitions, whereas IDAC simply reexecutes the same algorithm.

Adopt [29] is an algorithm for solving general DCOPs. Since the dynamic distributed task

allocation (DDTA) [16] problem is a subset of DCOP, Adopt can solve it. Adopt needs a depth-

first search tree with constraints only between ancestors, not siblings, so it can be highly inefficient

in a dynamic environment where every UAV shares a constraint with all UAVs within a finite distance.

We are able to run the same control logic on a real or simulated robot using a scheme similar to

RAVE [30]. Both RAVE and our scheme depend on a simulated version of the hardware updating

its state in a centralized simulation of the environment. Unlike RAVE, our vision server (equivalent

to RAVE’s environment manager) runs exclusively in simulated or real mode; we cannot combine

the two. In hindsight, adopting RAVE may have saved development time.

Although their task allocation is centralized, Brummit and Stentz [31] use a similar architecture

for their simulation of a team of autonomous ground vehicles by separating robot control, path

planning, and task allocation.

Pongpunwattana [32] uses an auction scheme between agents running identical software to

maintain an optimal assignment, but trades take place via a single coordinator.

IDAC directly builds on the infrastructure used in a previously published work [16] but makes

some significant changes. IDAC can recover from an auctioneer crashing, uses a new auction round

instead of swaps to handle the dynamic environment, and provides an intuitive connection between

the expected cost and the bid amount instead of changing benefit.

CHAPTER 3

MODEL

While IDAC can be applied to many problems, this instantiation uses a simple team game. This

chapter describes the game informally and formally and the parts of the implementation which are

not part of IDAC execution.

3.1 Informal Description of Game

A team of UAVs is placed in a rectangular arena with a group of targets. Each target must be

simultaneously tagged by a required number (typically one to four) of UAVs, at which time the

UAV team earns the benefit value associated with the target. The targets may appear at any time

and are fairly dumb. They may move in a fixed pattern until tagged and do not communicate,

other than to be told they have been tagged. The UAVs must communicate with each other to try

to make sure they send the required number to tag each target. Every robot is given fairly reliable

position information for all robots in the arena. (The infinite vision range could be reduced; see

Section 6.5.)

3.2 Formal Description of Game

Given a set of UAVs $U$ and a set of targets $T$, where each target $t \in T$ has a utility $\mathit{utility}_t$ and a required number of UAVs $\mathit{req}_t$, the UAVs must decide on a map of assignments $U \mapsto T$, represented as an adjacency matrix $A$ with elements $A_{ut}$ (Eq. (3.1)). Each UAV is assigned to at most one target (Eq. (3.2)) and each target has either zero or $\mathit{req}_t$ UAVs assigned to it (Eq. (3.3)).

$$A_{ut} \in \{0, 1\} \quad \forall u \in U,\ t \in T \tag{3.1}$$

$$\sum_{t \in T} A_{ut} \in \{0, 1\} \quad \forall u \in U \tag{3.2}$$

$$\sum_{u \in U} A_{ut} \in \{0, \mathit{req}_t\} \quad \forall t \in T \tag{3.3}$$
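The three constraints can be checked mechanically. The sketch below is illustrative only; the function name `is_valid_assignment` and the list-of-lists representation of the matrix are assumptions made for this example, not part of the implementation described in this thesis.

```python
def is_valid_assignment(A, req):
    """Check Eqs. (3.1)-(3.3) for an assignment matrix.

    A[u][t] is the entry for UAV u and target t; req[t] is the number of
    UAVs that target t requires.
    """
    num_targets = len(req)
    # Eq. (3.1): every entry is 0 or 1 (and every row covers all targets).
    for row in A:
        if len(row) != num_targets or any(a not in (0, 1) for a in row):
            return False
    # Eq. (3.2): each UAV is assigned to at most one target.
    if any(sum(row) > 1 for row in A):
        return False
    # Eq. (3.3): each target has either zero or exactly req[t] UAVs.
    for t in range(num_targets):
        assigned = sum(row[t] for row in A)
        if assigned not in (0, req[t]):
            return False
    return True


# Hypothetical mission: target 0 needs two UAVs, target 1 needs one.
req = [2, 1]
full = [[1, 0], [1, 0], [0, 1]]      # satisfies all three constraints
partial = [[1, 0], [0, 0], [0, 1]]   # target 0 has one of its two UAVs
double = [[1, 1], [1, 0], [0, 0]]    # UAV 0 is assigned to two targets
```

Note that Eq. (3.3) rejects partially staffed targets such as `partial`: a target is either left alone or given its full complement of UAVs.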

Each robot (UAV or target) has a position on a two-dimensional planar map. The distance

between two robots $i$ and $j$ is $d_{ij}$. Each robot has accurate location information for every robot

within Comm Range and may have less accurate information about more distant robots.

Every target can be in one of three states. It starts with $\mathit{status}_t = \mathit{free}$. Once at least one UAV with $A_{ut} = 1$ gets within Influence of $t$, it stops moving. When all $\mathit{req}_t$ UAVs with $A_{ut} = 1$ are within Influence of $t$, the target changes to $\mathit{status}_t = \mathit{serviced}$ and is removed from the game. The UAV team then earns $\mathit{utility}_t$ points.
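The target life cycle described above can be sketched as a small transition function. Several details here are assumptions made for illustration: the intermediate state is not named in the text, so `STOPPED` is a hypothetical label; "within Influence" is taken to include the boundary; and a stopped target is assumed to stay stopped if the UAV moves away, which the text does not specify.

```python
import math
from enum import Enum


class Status(Enum):
    FREE = "free"
    STOPPED = "stopped"    # hypothetical name for the unnamed middle state
    SERVICED = "serviced"


def next_status(current, target_pos, assigned_uav_positions, req, influence):
    """Return the target's next status given the UAVs assigned to it."""
    if current is Status.SERVICED:
        return current  # serviced targets are removed from the game
    near = sum(
        1 for p in assigned_uav_positions
        if math.dist(p, target_pos) <= influence
    )
    if req > 0 and near >= req:
        return Status.SERVICED
    if near >= 1 or current is Status.STOPPED:
        return Status.STOPPED  # assumption: stopping is not undone
    return Status.FREE


# Hypothetical target at the origin needing two UAVs, Influence = 1.0.
s1 = next_status(Status.FREE, (0.0, 0.0), [(5.0, 5.0), (0.5, 0.0)], 2, 1.0)
s2 = next_status(s1, (0.0, 0.0), [(0.0, 0.5), (0.5, 0.0)], 2, 1.0)
```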

As the robots move and the states of targets change, the UAVs may need to modify A to

continue efficiently collecting points.

The efficiency of the coordination algorithm could be measured in several ways, such as the

number of points collected in a fixed time, the time to collect all points, integration of the points

earned through time, or average or maximum service time in a dynamic system. Since all the

scenarios in this study have a constant number of targets, the time to service all targets has been

used to compare variations in the coordination algorithm.

A robot can send a message to a particular robot within Comm Range or use a broadcast to

send it to all robots within Comm Range. Every sent message will be received within Comm Delay

if the recipient is within Comm Range from the time the message is sent until it is received. No

message is sent over Comm Range or takes more than Comm Delay between sending and receiving.

For each pair of robots, each message arrives in the order it was sent (or is dropped because the

robots are more than Comm Range apart).

Each robot has accurate position information for all other robots within Comm Range and no

robot is incorrectly located inside Comm Range. Robots may have position information for robots

beyond Comm Range.

3.3 Implementation

IDAC has been tested on a robot system built as part of the DARPA TASK project. The game

described above was simplified by setting Comm Range = ∞ and broadcasting identical position

information to all robots. The system is composed of a network of agents. Each robot is an

agent, and the mission manager, graphical user interface (GUI) viewer, and localization server are

agents. Each type of agent is derived from a generic agent that contains a communication module

which sends, receives and delivers messages and a world model which stores information about the

location and status of all visible robots. Figure 3.1 shows the generic agent.

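[Diagram: a generic agent containing a Communication module and a World Model alongside agent-specific modules (for example, Coordination and Viewer). Labeled flows include agent-specific messages, status updates, robot information, and position updates exchanged with other agents.]

Figure 3.1 The common structure of an agent is shown in this figure.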

The agent for the UAV is the most complex, and it is expanded in Figure 3.2. The coordination

module contains the bidder and auctioneer (described in Chapter 4) and outputs the current target

assignment for this UAV. The target handler selects the best direction to approach the target,

decides where to go when there is no assignment, and gives the path planner a goal position on

the map. The path planner uses the A* algorithm to select a series of waypoints moving to the

goal and passes the nearest waypoint to the motion controller, which then creates robot-specific

commands to move towards the waypoint.
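As an illustration of the path planning step, the following is a minimal A* sketch. It is not the planner used in this system: the occupancy-grid map, four-connected moves, unit step cost, and Manhattan-distance heuristic are all assumptions chosen to keep the example short, and the name `astar` is hypothetical.

```python
import heapq


def astar(grid, start, goal):
    """Return a list of (row, col) cells from start to goal, or None.

    grid[r][c] == 1 marks an obstacle; moves are up, down, left, right.
    """
    def h(cell):
        # Manhattan distance never overestimates on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    open_heap = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    came_from = {}
    best_g = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if g > best_g[cell]:
            continue  # stale heap entry; a cheaper route was found later
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None


# Hypothetical 3x3 map with a wall across most of the middle row.
grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = astar(grid, (0, 0), (2, 0))
next_waypoint = path[1]  # the waypoint that would go to the motion controller
```

Only the first waypoint after the current cell is handed on, which mirrors the description above of passing the nearest waypoint to the motion controller and replanning as the world changes.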

[Diagram: the UAV agent. Inside the agent, the Communication module and World Model feed the Coordination module, Target Handler, Path Planner, and Motion Controller. Labeled flows include coordination messages, target messages, status updates, position updates, robot information, target assignment, destination, waypoint, and motion commands.]

Figure 3.2 The connections between the control modules in the UAV agent are shown in this figure.

While the entire coordination algorithm runs on the distributed UAV and target agents, there

are a few centralized agents in this implementation. The mission manager distributes unique ID

numbers and starting locations to the robots when they start, gives the targets a list of waypoints

to follow, and announces when the mission has finished. A GUI viewer receives periodic updates

from the robots so it can overlay a map of the robot positions with the current auction status, A*

path, and other useful debugging information. The localization server is the last centralized agent

and has an implementation for simulated and real modes.

The system can operate in simulated or real mode. To change between the two modes, the

target and UAV agents are recompiled to send their motion commands to a physical robot or to a

software-simulated robot. In the physical case a set of video cameras send images to a localization

server which recognizes the coordinates of a color-coded card on top of each robot and sends position

updates to each agent. If the simulated robots are active, they send periodic position reports to a

central simulated localization server which then sends position updates to each agent in the same

format as the real localization server. Figure 3.3 visualizes the real and simulated systems. In both cases robots communicate using an IP network.

[Diagram: a robot made up of an Input Handler, Reasoning, and an Output Handler, with switches that select between Real and Simulated sources of messages and destinations for motion commands. Other labels include the Localization Server, position reports, position updates, and the Car.]

Figure 3.3 The robot with switches for real and simulated cases is shown in this figure.

In the real case each UAV and target is a Garcia [33] robot controlled by an iPAQ running

Microsoft Pocket PC. The iPAQs have a built-in IEEE 802.11b wireless network interface for

communicating with other agents and an RS232 serial interface connected to the Garcia. In both

the real and simulated modes the UAVs move forwards and backwards at up to 27 cm/s and pivot

in place. The targets move similarly, but have a maximum speed of 10 cm/s.

The coordination module of each UAV contains a single state machine for the bidder and a state

machine for each auctioneer hosted on the UAV. Chapter 5 describes how a UAV starts hosting

an auctioneer. The workings of the auctioneer are described in Section 4.2 and of the bidder in

Section 4.3.

CHAPTER 4

THE IDAC ALGORITHM

This chapter describes the distributed auction coordination bidding and auctioneering mechanisms.

Each UAV agent hosts a single bidder (a buyer) and may host any number of auctioneers (sellers).

The targets are represented by auctioneers, but the target agents are not involved in the coordination.

Auctions run asynchronously for rounds of a fixed duration. An auction round ends with

either a complete set of winning bidders or no result.

4.1 Creating an Auctioneer

Every UAV $u \in U$ has an auctioneer for every target $t \in T$ within Comm Range/2 ($\mathit{Auct}^u_t$ will represent this auctioneer), but for each target there will normally be only a single UAV with an active auctioneer. Each auctioneer starts in state inactive (or waiting if all UAVs start synchronously). If $d_{ut} <$ Comm Range/2 and $u$ has not received any messages from an auctioneer for $t$ in over $2 \times$ Round Time, then $u$ can be certain that $t$ does not have an active auctioneer, and $\mathit{Auct}^u_t$ is changed to state waiting. If $d_{ut} <$ Comm Range/2 and $u$ has still not received any messages from an auctioneer for $t$ in $d_{ut} \times$ Dist Delay seconds, $\mathit{Auct}^u_t$ enters state initial. $\mathit{Auct}^u_t$ then proceeds to run auctions for target $t$. At the start of each auction round $\mathit{Auct}^u_t$ measures $d_{ut}$ and attaches it to each PriceUpdate and PriceUpdateFirst message during that round. If $u$ receives a message from $u' \neq u$ with $d_{u't} \le d_{ut}$, or if at any time $d_{ut} >$ Comm Range/2, then $\mathit{Auct}^u_t$ is immediately reverted to inactive. If $u$ receives a message from $u' \neq u$ with $d_{u't} > d_{ut}$, it may reply with a PriceUpdate to force $\mathit{Auct}^{u'}_t$ to revert to inactive.

Because all $u$ with an active auctioneer for $t$ must be within Comm Range/2 of $t$, any two such UAVs are within Comm Range of each other (by the triangle inequality), so every active auctioneer for $t$ will receive every message sent by an auctioneer for $t$. Thus, there can be at most one auctioneer for $t$ that has sent a PriceUpdateFirst message and is still active.
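The activation rules of this section can be sketched as a small state machine. This is a minimal sketch under stated assumptions, not the thesis implementation: the class `AuctioneerSlot`, its method names, the explicit timestamps, and the values of `COMM_RANGE` and `DIST_DELAY` are all hypothetical. It also assumes that the `Dist Delay` wait is measured from the moment the slot enters the waiting state, and that any message heard before the slot is active returns it to inactive.

```python
from dataclasses import dataclass

COMM_RANGE = 10.0   # placeholder value, not from the thesis
ROUND_TIME = 12.0   # seconds
DIST_DELAY = 0.5    # placeholder value, seconds per unit distance


@dataclass
class AuctioneerSlot:
    """Activation state of one UAV's auctioneer for one target (hypothetical)."""
    state: str = "inactive"       # "inactive", "waiting", or "initial"
    last_heard: float = 0.0       # last time another auctioneer for t was heard
    waiting_since: float = 0.0    # time the slot entered state "waiting"

    def on_tick(self, now: float, d_ut: float) -> None:
        """Periodic check using the current distance d_ut to the target."""
        if d_ut > COMM_RANGE / 2:
            self.state = "inactive"          # too far away to host the auctioneer
        elif d_ut < COMM_RANGE / 2:
            silent = now - self.last_heard
            if self.state == "inactive" and silent > 2 * ROUND_TIME:
                self.state, self.waiting_since = "waiting", now
            elif (self.state == "waiting"
                  and now - self.waiting_since >= d_ut * DIST_DELAY):
                self.state = "initial"       # start running auctions

    def on_message(self, now: float, d_other: float, d_ut: float) -> bool:
        """Handle a message from another auctioneer for the same target.

        Returns True when this UAV may reply with a PriceUpdate to force
        the more distant sender back to inactive.
        """
        self.last_heard = now
        if d_other <= d_ut or self.state != "initial":
            self.state = "inactive"          # a closer (or earlier) auctioneer wins
            return False
        return True
```

Scaling the wait by `d_ut` means that, after a silence, the UAV nearest the target tends to activate first, which is consistent with the distance comparison used to deactivate duplicates.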


Because multiple auctioneers (on different UAVs) may send PriceUpdateFirst messages for the

same target, the bidders must decide where to send their bid. For each target the bidder remembers

the current auctioneer UAV, the last distance in the message from that auctioneer, and the last

time it received a message from that auctioneer. If the bidder receives a PriceUpdateFirst message

with a smaller distance, the sender becomes the auctioneer for the target. Messages with a larger

distance are ignored. If more than Round Time + Comm Delay seconds pass without receiving a

message from the auctioneer, the bidder forgets about the auctioneer.
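The bookkeeping a bidder performs for one target might look like the following. This is an illustrative sketch only: the names `KnownAuctioneer`, `on_price_update_first`, and `expire` are hypothetical, the value of `COMM_DELAY` is a placeholder, and refreshing the record when the current auctioneer sends again is an assumption the text does not spell out.

```python
from dataclasses import dataclass
from typing import Optional

ROUND_TIME = 12.0   # seconds
COMM_DELAY = 0.5    # placeholder value, not from the thesis


@dataclass
class KnownAuctioneer:
    uav: str            # UAV currently treated as the auctioneer for the target
    distance: float     # last distance reported by that auctioneer
    last_heard: float   # time of the last message from that auctioneer


def on_price_update_first(current: Optional[KnownAuctioneer], sender: str,
                          distance: float, now: float) -> Optional[KnownAuctioneer]:
    """Adopt the sender if nothing is known, if it is the current auctioneer
    (assumed refresh), or if it reports a smaller distance."""
    if current is None or sender == current.uav or distance < current.distance:
        return KnownAuctioneer(sender, distance, now)
    return current      # messages with a larger distance are ignored


def expire(current: Optional[KnownAuctioneer], now: float) -> Optional[KnownAuctioneer]:
    """Forget the auctioneer after Round Time + Comm Delay of silence."""
    if current is not None and now - current.last_heard > ROUND_TIME + COMM_DELAY:
        return None
    return current
```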

4.2 Auctioneer Mechanism

The auctioneer for a target $i$ is trying to collect $\text{req}_i$ bids in each auction. There is no benefit to not selling the item, so it uses a zero reserve price. The auctioneer keeps the following state information: price, start price, highest rej price, state, time remaining, and, for each $k \in U$, $\text{bid}_k$ and $\text{confirm}_k$. For notational convenience, define the sets $\text{bids} = \{\text{bid}_k : k \in U \text{ and } \text{bid}_k \text{ is defined}\}$ and $\text{confirms} = \{\text{confirm}_k : k \in U \text{ and } \text{confirm}_k \text{ is defined}\}$ and the number of bids as $\text{numbids} = |\text{bids} \cup \text{confirms}|$. The following invariant shall be maintained: for all $k \in U$, $\text{confirm}_k$ is undefined or $\text{bid}_k$ is undefined.

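Before the first round, initialize these values:

\[
\begin{aligned}
\text{price} &= 0 \\
\text{confirm}_k &= \text{undefined} \quad \forall k \in U
\end{aligned}
\]

Before each round set:

\[
\begin{aligned}
\text{start price} &= \text{price} \\
\text{highest rej price} &= \text{start price} - \varepsilon \\
\text{state} &= \text{bid} \\
\text{time remaining} &= \text{Round Time} - 2 \cdot \text{Comm Delay} \\
\text{bid}_k &= \text{undefined} \quad \forall k \in U
\end{aligned}
\]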


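At the start of each round and whenever the set of bidders changes, the price is recalculated:

\[
\text{price} =
\begin{cases}
\text{start price} & \text{if numbids is 0} \\
\min(\{\text{highest rej price} + \varepsilon\} \cup \text{bids} \cup \text{confirms}) & \text{otherwise}
\end{cases}
\]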

After the price is set at the start of a round a PriceUpdateFirst message is broadcast. All

bidders that want to participate in the auction, including those that won in the previous round

(and thus have entries in confirms), reply with a Bid message.

Upon receiving a Bid message from UAV $k$, the auctioneer must update bids and confirms. The bid value in the message is stored in $\text{bid}_k$, and $\text{confirm}_k$ (which will be set if $k$ won in the last round) is set to undefined. If numbids is now larger than $\text{req}_i$, then the lowest value in $\text{bids} \cup \text{confirms}$ is set to undefined, and highest rej price may be increased. Then a new price is calculated, and a PriceUpdate message is broadcast.
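The price rule and the handling of a Bid message can be sketched as follows. This is a hedged illustration, not the thesis code: `AuctionState`, `recalc_price`, and `on_bid` are hypothetical names, `EPSILON` is a placeholder, and the sketch assumes that highest rej price is raised to the rejected value whenever that value is larger.

```python
from dataclasses import dataclass, field
from typing import Dict

EPSILON = 0.01  # placeholder for epsilon; the thesis does not give a value


@dataclass
class AuctionState:
    req: int                                  # bids the target needs
    start_price: float = 0.0
    highest_rej_price: float = -EPSILON       # start_price - epsilon at round start
    price: float = 0.0
    bids: Dict[str, float] = field(default_factory=dict)
    confirms: Dict[str, float] = field(default_factory=dict)


def recalc_price(s: AuctionState) -> float:
    """Price rule: start price if there are no bids, otherwise the minimum of
    highest_rej_price + epsilon and every value in bids and confirms."""
    values = list(s.bids.values()) + list(s.confirms.values())
    if not values:
        return s.start_price
    return min([s.highest_rej_price + EPSILON] + values)


def on_bid(s: AuctionState, k: str, value: float) -> float:
    """Store the bid, drop the lowest entry if over req, and return the new
    price that would be broadcast in a PriceUpdate."""
    s.bids[k] = value
    s.confirms.pop(k, None)                   # keeps the bid/confirm invariant
    if len(s.bids) + len(s.confirms) > s.req:
        entries = [(v, uav, s.bids) for uav, v in s.bids.items()]
        entries += [(v, uav, s.confirms) for uav, v in s.confirms.items()]
        low_value, low_uav, table = min(entries, key=lambda e: e[0])
        del table[low_uav]                    # reject the lowest bid
        s.highest_rej_price = max(s.highest_rej_price, low_value)  # assumption
    s.price = recalc_price(s)
    return s.price
```

With `req = 2`, the price stays at the start price until a third bid arrives, at which point the lowest bid is rejected and the price rises just above it.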

Upon receiving a Retract message from UAV $k$, the auctioneer clears $\text{bid}_k$ and $\text{confirm}_k$, calculates

a new price, and broadcasts a PriceUpdate.

After at least $2 \cdot \text{Comm Delay}$ has passed since the beginning of the round, every UAV that won the previous round should have replied with a Bid message if it wants to maintain its bid. The field confirms is cleared by setting $\text{confirm}_k = \text{undefined}$ for all $k \in U$. If every winner of the previous round did reply, then confirms was already empty. Otherwise, the price should be recalculated, and a new PriceUpdate broadcast.

Receiving a Bid message is the first step in a three-way handshake that ensures that the bidders and auctioneer agree on who won the auction. When the time remaining of the round reaches 0 (or, as an optimization, when $\text{numbids} = \text{req}_i$ and no Bid or Retract messages have been received in over $2 \cdot \text{Comm Delay}$), the auctioneer compares numbids to $\text{req}_i$. If $|\text{bids}| < \text{req}_i$, the auction round failed to collect enough bids and is finished. Otherwise, the auctioneer initiates the second step of the three-way handshake by sending a Confirm message to all $k \in \text{bids}$ and setting state = confirmwait. In this state Bid messages are ignored, a Retract message from a UAV in bids causes the auction round to abort, and a Confirmed message moves the bid of the sending UAV from bids to confirms. After at least $2 \cdot \text{Comm Delay}$ the auctioneer checks if the round has been successfully completed. If $|\text{bids}| > 0$ or $|\text{confirms}| < \text{req}_i$ then some UAV that was sent a Confirm message


did not reply with a Confirmed message. It may have crashed or retracted its bid, so the auction

round failed. Otherwise the three-way handshake was completed between the auctioneer and each

winning bidder (now in confirms), and a Winners message is broadcast.
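The second and third steps of the handshake reduce to a few checks on bids and confirms. The sketch below is illustrative only; the function names are hypothetical and message transport, timers, and the Retract case are left out.

```python
from typing import Dict, List


def end_of_bidding(bids: Dict[str, float], req: int) -> List[str]:
    """Return the UAVs that should be sent a Confirm message, or an empty
    list if the round failed to collect enough bids."""
    if len(bids) < req:
        return []
    return sorted(bids)


def on_confirmed(bids: Dict[str, float], confirms: Dict[str, float], k: str) -> None:
    """A Confirmed message moves the sender's bid from bids to confirms."""
    if k in bids:
        confirms[k] = bids.pop(k)


def round_succeeded(bids: Dict[str, float], confirms: Dict[str, float], req: int) -> bool:
    """Checked at least 2 * Comm Delay after the Confirm messages were sent:
    the round fails if any bid is still unconfirmed or too few were confirmed."""
    return not (len(bids) > 0 or len(confirms) < req)
```

A crashed bidder simply never sends Confirmed, so its entry stays in bids and the round is treated as failed without any explicit crash detection.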

If the auctioneer failed to collect $\text{req}_i$ bids (i.e., no Winners message was sent), then it reduces price almost as much as a bidder's $\text{cost dist}_t$ would have decreased if it were moving towards the target during the round. This decrease rate prevents an auctioneer from “stealing” a UAV that is moving towards its assigned target. The auctioneer always pauses for at least $2 \cdot \text{Comm Delay}$, but no longer than Round Time after it last sent a price update, before starting the next round. The

pause prevents messages from crossing between auction rounds and reduces the cost of running

continuous auctions. During the pause, a Retract message removes the sender from confirms, but

the auctioneer ignores all other messages and does not send messages.

4.3 Bidder Mechanism

The main controller of the bidder must send Bid and Retract messages as it changes its bid and

must respond to Confirm and Winners messages. Most of the work of the bidder is contained in a

single function that calculates a best bid using the state of the local UAV as well as the locations

and the most recent price updates from each target.

4.3.1 Main controller of the bidder

The bidder starts by initializing $\text{waiting mult}_t = 1$ for each target and clearing its current bid. The

bidder then computes its best bid, a target and a bid price, every 0.5 s. If the best bid target has

changed, a Retract message is sent to the previous current target (if defined), and if there is a new

target (there could be no defined best bid), a Bid message is sent. If the target has not changed,

but the bid price has changed by more than Cost Bid/3, then a Bid message is sent with the new

price.
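The decision made on each 0.5 s tick can be written as a pure function from the old and new best bids to the messages to send. This is a sketch under assumptions: the function name and the tuple representation of messages are hypothetical, and the value of `COST_BID` is a placeholder.

```python
from typing import List, Optional, Tuple

COST_BID = 100.0  # placeholder value

Bid = Tuple[str, float]                        # (target, bid price)
Message = Tuple[str, str, Optional[float]]     # (message type, target, price)


def messages_for_new_best(current: Optional[Bid], best: Optional[Bid]) -> List[Message]:
    """Messages to send when the freshly computed best bid replaces the current bid."""
    out: List[Message] = []
    cur_target = current[0] if current else None
    best_target = best[0] if best else None
    if best_target != cur_target:
        if cur_target is not None:
            out.append(("Retract", cur_target, None))
        if best_target is not None:
            out.append(("Bid", best_target, best[1]))
    elif best is not None and abs(best[1] - current[1]) > COST_BID / 3:
        # Same target: only re-bid when the price moved by more than Cost Bid / 3.
        out.append(("Bid", best_target, best[1]))
    return out
```

The `Cost Bid / 3` threshold suppresses re-bids for small price movements, which limits message traffic.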

When the bidder receives a Confirm message from the auctioneer of the current bid target, it

replies with a Confirmed message. If it receives a Winners message from the auctioneer of the current

bid target, it updates the Target Handler with the new assigned target. In no case does the bidder

stop calculating the best bid every 0.5 s and updating auctioneers with Retract and Bid messages


as needed.

As mentioned in Sections 4.1 and 4.2, the bidder saves the contents of the last PriceUpdate and

PriceUpdateFirst messages from the active auctioneer for every target, and it always replies to a

saved PriceUpdateFirst message for the current bid with a Bid message.

4.3.2 Calculating the best bid

The predefined constants Cost Bid, Cost Assign, Cost Per Unassigned, waiting mult max, and Cost Per Distance are used to determine the best bid. First $\text{cost fixed total}_t$ is calculated. It represents the total of the distance-independent costs that this UAV expects to incur to service $t$:

\[
\text{cost fixed total}_t = \text{Cost Bid} + \text{Cost Assign} + \text{Cost Per Unassigned} \cdot \text{req}_t
\]

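Each UAV $u$ then calculates the following dynamic values using its knowledge of the state of each target:

\[
\text{Cost Bid}_t =
\begin{cases}
0 & \text{if the current bid of } u \text{ is } t \\
\text{Cost Bid} & \text{otherwise}
\end{cases}
\]

\[
\text{Cost Assign}_t =
\begin{cases}
0 & \text{if } u \text{ is assigned to } t \\
2 \cdot \text{Cost Assign} & \text{if other UAVs are assigned to } t \\
\text{Cost Assign} & \text{otherwise, no UAVs are assigned to } t
\end{cases}
\]

\[
\text{cost waiting}_t = (\text{req}_t - \text{assigned}_t) \cdot \text{Cost Per Unassigned} \cdot \text{waiting mult}_t
\]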


\[
\text{cost dist}_t = d_{ut} \cdot \text{Cost Per Distance}
\]

These values are used to calculate $\text{cost fixed remaining}_t$, the expenses the UAV has yet to incur to service $t$:

\[
\text{cost fixed remaining}_t = \text{Cost Bid}_t + \text{Cost Assign}_t + \text{cost waiting}_t
\]

Note that $\text{cost fixed remaining}_t$ may be greater than $\text{cost fixed total}_t$. For example, if another UAV has been assigned to $t$, then $\text{Cost Assign}_t = 2 \cdot \text{Cost Assign}$ because $u$ needs to steal the assignment away from another UAV before it can be assigned to $t$.
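The fixed-cost terms can be evaluated with a few lines of code. The sketch below is illustrative: the function and parameter names are hypothetical, and the constant values are placeholders chosen so that the example shows the remaining cost exceeding the total cost.

```python
COST_BID = 100.0             # placeholder values
COST_ASSIGN = 400.0
COST_PER_UNASSIGNED = 100.0


def cost_fixed_total(req: int) -> float:
    """Distance-independent cost a UAV expects to incur to service a target."""
    return COST_BID + COST_ASSIGN + COST_PER_UNASSIGNED * req


def cost_fixed_remaining(req: int, assigned: int, waiting_mult: float,
                         bidding_on_t: bool, assigned_to_t: bool,
                         others_assigned: bool) -> float:
    """Fixed costs the UAV has yet to incur for the target."""
    cost_bid_t = 0.0 if bidding_on_t else COST_BID
    if assigned_to_t:
        cost_assign_t = 0.0
    elif others_assigned:
        cost_assign_t = 2 * COST_ASSIGN      # must steal the assignment
    else:
        cost_assign_t = COST_ASSIGN
    cost_waiting_t = (req - assigned) * COST_PER_UNASSIGNED * waiting_mult
    return cost_bid_t + cost_assign_t + cost_waiting_t
```

For a target with `req = 3` and one other UAV already assigned, the remaining cost is 100 + 800 + 200 = 1100, which is larger than the total of 800.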

The value $\text{waiting mult}_t$ models the increasing cost of waiting for other UAVs to bid on a target that has fewer than $\text{req}_t$ bids. When $\text{numbids}_t$ increases or $\text{numbids}_t = \text{req}_t$, then $\text{waiting mult}_t$ is reset to 1, and a new random $\text{waiting mult rate}_t$ is selected. Otherwise $\text{waiting mult}_t$ is continuously updated according to the following:

\[
\frac{d}{dt}\,\text{waiting mult}_t =
\begin{cases}
0 & \text{if } \text{waiting mult}_t = 1 \text{ or } \text{waiting mult max} \\
\text{waiting mult rate}_t & \text{if } u \text{ is bidding on } t \\
-\text{waiting mult rate}_t & \text{otherwise}
\end{cases}
\]

Finally $u$ computes the expected profit for each target $t$

\[
\text{profit}_t = \text{benefit}_t - \text{price}_t - \text{cost dist}_t - \text{cost fixed remaining}_t
\]

and finds the first highest profit target $f$ and second highest profit target $s$. If $\text{profit}_f < 0$, then $u$ will not bid. If $\text{profit}_s < 0$, then $u$ will use the best bid price $\text{profit}_f + \text{price}_f$. If both $\text{profit}_f > 0$ and $\text{profit}_s > 0$, then the bidder may bid for the marginal profit of $f$ over $s$, $\text{bid}_{\text{margin}}$ (Eq. (4.1)), or


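zero profit, $\text{bid}_{\text{all}}$ (Eq. (4.2)); see footnote 1. The term $\text{bid}_{\text{margin}}$ is the greatest price for $f$ such that $s$ is not more profitable. If $\text{bid}_{\text{margin}}$ becomes the price of $f$, then $u$ will expect to get the same profit, $\text{profit}_s$, from $f$ and $s$. When $\text{profit}_f$ is close to $\text{profit}_s$ but both are large, the use of $\text{bid}_{\text{margin}}$ can leave $u$ in a deadlock when $\text{bid}_{\text{margin}} \approx \text{cost fixed remaining}_s - \text{cost fixed total}_f$, which may be less than $\text{price}_f$ or even less than zero. The term $\text{bid}_{\text{all}}$ provides an alternative way to select a bid price. To select the actual bid amount, $u$ starts by using $\min(\text{bid}_{\text{margin}}, \text{bid}_{\text{all}})$, but if it has not won an auction in Round Time, it increases the best bid towards $\max(\text{bid}_{\text{margin}}, \text{bid}_{\text{all}})$.

\[
\begin{aligned}
\text{bid}_{\text{margin}} &= \text{profit}_f - \text{profit}_s + \text{price}_f + \text{cost fixed remaining}_f - \text{cost fixed total}_f - \varepsilon \\
&= \text{benefit}_f - \text{cost dist}_f - \text{profit}_s - \text{cost fixed total}_f - \varepsilon
\end{aligned}
\tag{4.1}
\]

\[
\text{bid}_{\text{all}} = \text{benefit}_f - \text{cost dist}_f - \text{cost fixed total}_f - \varepsilon
\tag{4.2}
\]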

1 During initial simulations and real experiments, $\text{bid}_{\text{second}} = \text{benefit}_s - \text{cost fixed total}_s - \text{cost dist}_s - \text{price}_s + \text{price}_f - \varepsilon$ was used instead of $\text{bid}_{\text{all}}$.
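The selection of a target and a bid range from Eqs. (4.1) and (4.2) can be sketched as below. This is a minimal illustration under assumptions: the function names, the dictionary keys, and the value of `EPSILON` are hypothetical, a profit of exactly zero is treated like a positive profit (the text does not cover that case), and a UAV that knows of only one target is handled like the case where the second-best profit is negative.

```python
from typing import Dict, Optional, Tuple

EPSILON = 0.01  # placeholder for epsilon


def profit(t: dict) -> float:
    """Expected profit of a target described by its cost and price terms."""
    return t["benefit"] - t["price"] - t["cost_dist"] - t["cost_fixed_remaining"]


def choose_bid(targets: Dict[str, dict]) -> Optional[Tuple[str, float, float]]:
    """Return (target, starting bid, ceiling bid), or None if the UAV should not bid.

    The starting bid is min(bid_margin, bid_all); the bidder may raise it
    towards the ceiling max(bid_margin, bid_all) if it keeps losing.
    """
    ranked = sorted(targets, key=lambda name: profit(targets[name]), reverse=True)
    if not ranked:
        return None
    f = targets[ranked[0]]
    profit_f = profit(f)
    if profit_f < 0:
        return None
    profit_s = profit(targets[ranked[1]]) if len(ranked) > 1 else None
    if profit_s is None or profit_s < 0:
        only = profit_f + f["price"]          # no competing target worth bidding on
        return ranked[0], only, only
    bid_all = f["benefit"] - f["cost_dist"] - f["cost_fixed_total"] - EPSILON
    bid_margin = bid_all - profit_s           # second form of Eq. (4.1)
    return ranked[0], min(bid_margin, bid_all), max(bid_margin, bid_all)
```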


CHAPTER 5

SURVIVING CRASH FAILURES

Chapter 4 describes the auction mechanism. This chapter describes how the auction mechanism

recovers after a UAV crashes.

Because the system state is distributed, a crash of any UAV will only affect the subset of UAVs

that interact with it. Normal coordinated actions will proceed no more than 2 ∗Round Time after

the last UAV crash.

If the crashed UAV u does not have a current bid (never sent a bid, sent a bid followed by a

retract, or sent a bid but was outbid by a different UAV), then no other UAV has state depending

on u, and the auctions will continue uninterrupted. If u has sent a bid, then either u will be

asked to confirm its bid, or the auction will time out without the required number of winners. If

u does not confirm its bid within 2 ∗ Comm Delay , the auctioneer will not announce winners for

the current round. If u has confirmed a bid, then it is part of a coalition moving towards a target,

and $\text{confirm}_k$ is set in the auctioneer. A new auction for the target will start within Round Time

and $\text{confirm}_k$ will be cleared $2 \cdot \text{Comm Delay}$ after the round starts.

Every auctioneer sends a PriceUpdateFirst at the start of each auction round. A running

auctioneer starts a new round at least every Round Time seconds. When any UAV does not

receive a PriceUpdateFirst or PriceUpdate for a target in over Round Time + Comm Delay , it

assumes the auctioneer has crashed and becomes the auctioneer as detailed in Section 4.1.

If the crashed UAV $u$ was hosting an auctioneer for $t \in T$, then once Round Time has passed

since $u$ last sent a message for the auctioneer of $t$, the running UAVs will time out and will activate

their auctioneer for t. If more than one auctioneer is activated, all but one will be deactivated by

the mechanism described in Section 4.1.


CHAPTER 6

RESULTS

6.1 Finding IDAC Parameters

In order for the bidding to converge to a solution, appropriate values for Cost Bid, Cost Assign and Cost Per Unassigned have to be estimated. This estimation was done manually by looking at the results of many simulations. To increase the determinism of the simulation, the robots were allowed to move directly towards their intended destination, ignoring all collisions and path planning. Even with this change the assignment map created by IDAC can vary significantly between runs with identical parameters. Fortunately, in general, if at least $\text{req}_t$ UAVs have $\text{benefit}_t$ larger than $\text{Cost Bid} + \text{Cost Assign} + \text{Cost Per Unassigned} \cdot \text{req}_t + \text{cost dist}_t$ for all unserviced targets $t$, then the auction will eventually make an assignment. Figure 6.1 shows the total time for six UAVs to service seven targets in a constant pattern with $3 \le \text{req}_t \le 4$. At first glance Figure 6.1 may suggest a relationship between Cost Bid and Cost Assign and the average mission time, but the large standard deviation indicates a large nondeterministic component. A small change in a parameter or starting location of a robot can significantly increase or decrease the total mission time if it changes the order in which UAVs bid on or tag the targets. To make a general statement about the relationship between IDAC parameters and mission time, a much larger study that covers a variety of mission layouts would be needed.

Besides the cases where $\text{benefit}_t$ was not large enough to yield a profit, the only other sets of parameters that prevented the auctions from ending were ones where Cost Bid is much larger than $\text{waiting mult max} \cdot \text{Cost Per Unassigned}$. When the UAV has bid on $t$ but more UAVs must bid to meet the target’s requirements, $\text{waiting mult}_t$ increases up to waiting mult max. This increase should cause a UAV to bid on a distant target when the required number of UAVs are not bidding on a nearer target. For example, in Figure 6.2 the top targets (triangles with thin circles around them)


[Figure 6.1 plot not recoverable from the extracted text; axis labels: Cost Bid, Cost Assign.]

Figure 6.1 This graph shows that total time to service all targets is weakly related to Cost Bid and Cost Assign. The error bars, StdDev in length, have been placed next to some points. Cost Per Unassigned = 100 and Round Time = 12 s.

Figure 6.2 This figure shows a potential deadlock. The two unserviced targets (at the top) each need three UAVs to tag them, but the UAVs have each bid on the closest target.


[Figure 6.3 plot not recoverable from the extracted text; axis labels: Finished Ratio, Cost Assign, Cost Bid.]

Figure 6.3 This graph shows the proportion of missions that did not deadlock for various Cost Bid and Cost Assign. The simulations were run with Cost Per Unassigned = 50 and Round Time = 12 s.

need three UAVs, so one UAV will need to bid on the more distant target for any auction to finish. Pick a UAV, and call its nearest target $n$ and the more distant target $d$. Since the targets are identical, the UAV will start by bidding on $n$. The UAV will then change its bid to $d$ when $\text{profit}_d - \text{profit}_n > 0$, which is expanded in Eq. (6.1) for $\text{assigned} = \text{req} - 1$. As $\text{waiting mult}_n$ increases from 1 to waiting mult max, the value of the equation increases toward 0, but if Cost Bid is huge, the UAV will never change its bid to $d$, creating a deadlock. Figure 6.3 shows this cutoff in the results of the simulations.

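\[
\begin{aligned}
\text{profit}_d - \text{profit}_n
&= -\text{cost dist}_d - \text{cost fixed remaining}_d + \text{cost dist}_n + \text{cost fixed remaining}_n \\
&= -\text{cost dist}_d - (\text{Cost Bid} + \text{Cost Assign} + \text{Cost Per Unassigned}) \\
&\quad + \text{cost dist}_n + (\text{Cost Assign} + \text{waiting mult}_n \cdot \text{Cost Per Unassigned}) \\
&= \text{cost dist}_n - \text{cost dist}_d - \text{Cost Bid} + \text{Cost Per Unassigned} \cdot (\text{waiting mult}_n - 1)
\end{aligned}
\tag{6.1}
\]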
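Eq. (6.1) gives a direct test for this kind of deadlock: evaluate the last line at its largest value, where the waiting multiplier of the near target has reached waiting mult max, and check whether it ever becomes positive. The sketch below is illustrative; the function names are hypothetical and the numbers in the usage note, including the value of waiting mult max, are placeholders rather than thesis parameters.

```python
def max_switch_margin(cost_bid: float, cost_per_unassigned: float,
                      waiting_mult_max: float,
                      cost_dist_n: float, cost_dist_d: float) -> float:
    """Largest value of profit_d - profit_n from Eq. (6.1), reached when the
    waiting multiplier of the near target n equals waiting_mult_max."""
    return (cost_dist_n - cost_dist_d - cost_bid
            + cost_per_unassigned * (waiting_mult_max - 1))


def can_deadlock(cost_bid: float, cost_per_unassigned: float,
                 waiting_mult_max: float,
                 cost_dist_n: float, cost_dist_d: float) -> bool:
    """True if the UAV never finds the distant target d more profitable than n."""
    return max_switch_margin(cost_bid, cost_per_unassigned, waiting_mult_max,
                             cost_dist_n, cost_dist_d) <= 0
```

For example, with Cost Per Unassigned = 100, waiting mult max = 4, and distance costs of 10 and 60, a Cost Bid of 100 leaves a margin of 150 and the UAV eventually switches, while a Cost Bid of 3200 gives a margin of -2950 and the UAV stays on the near target.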

In early versions of IDAC, UAVs would change their bid in response to small price fluctuations.

Changing bids caused the prices to change, creating an unstable system. After winning an auction,


a UAV would frequently start bidding on a different target, which would sometimes lead to a UAV

moving back and forth between two targets as it bids on and wins alternating auctions. Cost Bid

and Cost Assign were introduced to deliberately increase the profit difference needed for a UAV

to change its bid.

Figure 6.4 shows that for moderate values of Cost Bid, the number of Bid messages sent decreases

as Cost Bid increases. We conjecture that this relationship breaks down for Cost Assign =

1600 because some ranges of IDAC parameters take more messages to converge. A continuing study

should use a variety of missions to confirm this conjecture. Figure 6.5 shows the average number

of times a UAV won an auction and then started bidding in a different auction before servicing the

first target. Increasing the Cost Assign clearly reduces the number of times a UAV changes its bid

after winning an auction. Note that reducing Cost Assign will allow the system to more rapidly

adjust to changes such as a new target appearing.

This experiment shows that for a wide range of values Cost Bid affects the number of Bid

messages, and Cost Assign affects the stability of the assignment map without having much effect

on the presence of deadlocks or the average mission time.

[Figure 6.4 plot not recoverable from the extracted text; axis labels: Cost Bid, Cost Assign.]

Figure 6.4 This graph shows the mean number of Bid messages sent per mission for various Cost Bid and Cost Assign. The simulations were run with Cost Per Unassigned = 100 and Round Time = 12 s.


[Figure 6.5 plot not recoverable from the extracted text; axis labels: Number of bid changes, Cost Assign, Cost Bid.]

Figure 6.5 This graph shows the mean number of times a UAV bid for a target other than its current assignment for various Cost Bid and Cost Assign. The simulations were run with Cost Per Unassigned = 100 and Round Time = 12 s.

6.2 Demonstration of Crash Recovery

When a UAV crashes, the remaining UAVs should reassign targets as necessary to continue without

the crashed UAV. If the crashed UAV was acting as the auctioneer for an unserviced target, a

different UAV should take over the role. To test this functionality, the UAV was modified to include

a list of active auctioneers and their status in each status message sent to the viewer. The viewer

displays the auctioneer status for a target if exactly one UAV has listed itself as an auctioneer in

the past 8 s.

If the UAV with the active auctioneer for a target crashes, then the auctioneer status will

disappear after 8 s. After about Round Time ∗ 2, a different UAV will become the auctioneer, and

the status will reappear. If more than one UAV becomes the auctioneer at the same time, the

status may appear briefly, disappear because more than one UAV has sent auctioneer status for

the same target before they saw each other, and then reappear after eight more seconds.

In the simulator each UAV may be run in a different operating system process so a single UAV

may be easily “crashed.” Crash reliability was tested by deliberately killing the process of a UAV

and then watching the viewer and reviewing the debugging log generated by each UAV. As long as

at least one UAV was running, every target would eventually have a single auctioneer, and after


appropriate auction parameters had been found, all targets were serviced if enough UAVs were in

the arena.

6.3 Real Experiments Compared to Optimal

Physical experiments were conducted for three and four UAVs chasing four moving targets and

four and six UAVs chasing six static targets. The available area (3 m by 4.2 m) limited the number

of robots that could participate in the experiments. Figure 6.6 shows the starting positions for

the UAVs and targets. Table 6.1 provides a summary of the results. A video of the run (see footnote 1) was also

created.

(a) Moving targets with tracks (b) Static targets

Figure 6.6 These figures show the starting positions used in physical experiments with (a) four moving targets and (b) six static targets.

During the physical experiments the speed and acceleration of the real robots were measured to

improve their models in the simulator. Then exactly the same experiments reported in Table 6.1

were conducted in the simulator with results shown in Table 6.2. There is a noticeable difference

between the results, which could be explained by unresolved issues with the robot controller and

1 The video can be fetched from http://osl.cs.uiuc.edu/~tdbrown/tom-MS-thesis/.


Table 6.1 Summary of physical experiments.

Scenario                      Total time   Mean deviation   Trials
3 UAVs and 4 moving targets   214 s        2 s              2
4 UAVs and 4 moving targets   179 s        35 s             9
4 UAVs and 4 static targets   179 s        3 s              2
4 UAVs and 6 static targets   245 s        29 s             7

Table 6.2 Summary of simulated experiments.

Scenario                      Total time   Mean deviation   Trials
3 UAVs and 4 moving targets   177 s        24 s             20
4 UAVs and 4 moving targets   164 s        28 s             20
4 UAVs and 4 static targets   147 s        32 s             20
4 UAVs and 6 static targets   219 s        43 s             20

the limited number of real trials performed.

In order to compare the coordination method, which is the focus of this thesis, to optimal coordination,

the complexity of path planning has been ignored. The simulator was run again for the

static target experiments with all collision detection and avoidance disabled. A separate program

tried every combination of different UAV “trips” of the targets to find the fastest order for the

UAVs to visit the targets and meet the requirements. To reduce the time spent searching for an

optimal coordination all UAVs started from the average of their positions in the real experiments.

This optimal-seeking program suffers from the scaling problem of brute force traveling salesman

problem solvers and cannot provide a solution for scenarios larger than about 4 UAVs and

6 targets. Table 6.3 lists the number of different combinations of trips that must be compared to

find the optimal set of trips for the UAVs to follow.
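The entries of Table 6.3 appear to be consistent with counting the ways to choose |U| trips, with repetition and without regard to order, from the |T|! possible orderings of the targets. This formula is an inference from the tabulated numbers and is not stated in the thesis; the function name below is hypothetical.

```python
from math import comb, factorial


def trip_combinations(num_targets: int, num_uavs: int) -> int:
    """Number of multisets of size num_uavs drawn from the num_targets!
    orderings of the targets (inferred formula, see the note above)."""
    trips = factorial(num_targets)
    return comb(trips + num_uavs - 1, num_uavs)
```

The factorial inside the binomial coefficient is what makes exhaustive search impractical beyond about six targets.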

Table 6.4 compares the simulated and optimal times. For these small experiments the auction

coordination scheme is reasonably close to optimal, taking roughly 1.4 to 1.5 times the optimal time. Comparing the actual order in which the UAVs visited the

targets shows that with four targets the auction and optimal coordination visit in the same order

other than swapping two adjacent targets, but with six targets the order differs more significantly

(4,2,5,0,1,3 in the simulator and 3,4,2,1,0,5 in the optimal). In general the auction coordination

cannot be expected to perform close to the optimal in larger or more contrived scenarios because it

is a greedy algorithm. Each UAV repeatedly solves a snapshot of the scenario for the best target to

next visit while planning the optimal trip requires solving for a sequence of visits. See Section 6.5


Table 6.3 Growth of combinations of trips.

|T| \ |U|   3            4            5             6
2           4            5            6             7
3           56           126          252           462
4           2,600        17,550       98,280        475,020
5           295,240      9,078,630    225,150,024   5 × 10^9
6           62,467,440   11 × 10^9    2 × 10^12     198 × 10^12
7           21 × 10^9    27 × 10^12   27 × 10^15    23 × 10^18

Table 6.4 Comparison of IDAC without collision avoidance to the optimal trips for a given mission.

                              Total time to finish scenario
Scenario                      Mean for auction coordination (Mean deviation)   Optimal
4 UAVs and 4 static targets   48 s (0.4 s)                                     33.7 s
4 UAVs and 6 static targets   62 s (5.7 s)                                     40.2 s

for more discussion of how this algorithm could scale.

6.4 Discussion

Even though trajectory planning was not a part of this study, it needed to be handled in order to

run the physical robots. As the number of UAVs and req of each target increased, small amounts of

nondeterminism were exaggerated by collision avoidance. Our simple A* trajectory planner does

not use any information about the expected future positions of possible obstacles. Our initial design

deliberately separated coordination and trajectory formation, but we needed to combine them in

two places. The trajectory planner is used to estimate the cost of moving to a target, and when

the auctioneer announces the winners of an auction, it uses the trajectory planner to suggest which

corner of the target each UAV should approach. A more sophisticated merger of the coordination

and trajectory planner may dramatically reduce the mission time when physical space is under

contention.

Much of the difficulty in getting this algorithm to work consistently occurred because the

system, as originally designed, created an unstable equilibrium. Bidders increased the prices until

their marginal profit was zero. Then a small change in the cost caused a UAV to change its bid

to a different target, which in turn changed numbids, further disrupting the system. Solving the

case when req = 1 is fairly easy; the real problem lies in trying to force a set of UAVs to converge


their bids on a single target. It is perhaps not a coincidence that the problem of distributed sellers

competing for sets of buyers has not been addressed in auction theory. Perhaps a better strategy,

as visualized in Figure 6.7, would be to switch the roles of the bidder and auctioneer and have

targets bid for a set of UAVs.

[Figure 6.7 diagrams not recoverable from the extracted text; panels: (a) UAVs as bidders, (b) UAVs as sellers.]

Figure 6.7 The auction may be oriented with UAVs as (a) bidders or (b) sellers.

There is probably a connection between the ratio $r = |U| / \left(\sum_{t \in T} \text{req}_t\right)$ and the best direction to run the auction. Using the current bidding scheme, if $r \ll 1$, the price of most targets will stay zero until just before they are assigned a set of UAVs and subsequently tagged. Algorithms that let a UAV announce its intention to visit a sequence of targets have an obvious advantage. When $r \gg 1$, prices will be driven up as many UAVs compete for a few targets.

6.5 Continuing Research

In this implementation, every robot is given position information for all robots in the arena and can

send messages to any robot. A continuation of this work could demonstrate that IDAC works with

finite vision and communication (Comm Range) ranges in an arena of unbounded size. Presently

a UAV updates the status of a target, but with finite communication range a UAV must be able

to perceive the status of a target when it becomes visible. A multihop routing mechanism could

be used to increase the effective communication range. UAVs could be given partial information

about the arena beyond their vision range such as the density of UAVs and unserviced targets and

the average price of targets for some regions.

IDAC depends on $\text{numbids}_t$ in price updates to calculate $\text{cost waiting}_t$ and, thus, the expected


profit of t. This method of using status indicators in price updates to affect the expected profits

could be applied to other situations. For example, if a target needed a group of skills provided by

different heterogeneous UAVs to service it, then the price update could include a price and number

of accepted bids for each skill.

IDAC creates a greedy assignment from the current UAVs and unserviced targets, but could

be extended to allow UAVs to bid on future targets. Each UAV could calculate the best trip of

two (or n) targets for it to visit and the expected profit for each. The UAV can then bid on all the

targets in the best trip. This scheme, in which a UAV could win part of a trip and lose another

part, would need to be compared to a combinatorial auction where UAVs use a single bid for an

entire trip.


REFERENCES

[1] T. C. Lueth and T. Laengle, “Task description, decomposition and allocation in a distributed autonomous multi-agent robot system,” in Proceedings of International Conference on Intelligent Robots and Systems (IEEE/RSJ IROS), vol. 3, Sept. 1994, pp. 1516–1523.

[2] R. E. Burkard and E. Cela, “Linear assignment problems and extensions,” in Handbook of Combinatorial Optimization, Supplemental Volume A, D.-Z. Du and P. M. Pardalos, Eds. Dordrecht, The Netherlands: Kluwer Academic Publishers, pp. 75–149.

[3] I. Curiel, Cooperative Game Theory and Applications: Cooperative Games Arising from Combinatorial Optimization Problems. Boston, MA: Kluwer Academic print on demand, 1997.

[4] E. Lawler and A. Rinnooy Kan, Eds., The Traveling Salesman Problem: A Guided Tour of Combinatorial Optimization. New York, NY: John Wiley and Sons, 1985.

[5] G. Laporte and Y. Nobert, “Exact algorithms for the vehicle routing problem,” in Surveys in combinatorial optimization, ser. Annals of Discrete Mathematics 31, S. Martello, Ed. New York, NY: Elsevier Science Pub. Co., 1987, pp. 147–184.

[6] M. Yokoo and K. Hirayama, “Algorithms for distributed constraint satisfaction: A review,” Autonomous Agents and Multi-Agent Systems, vol. 3, no. 2, pp. 185–207, 2000.

[7] M. Yokoo, Distributed Constraint Satisfaction: Foundation of Cooperation in Multi-Agent Systems. New York, NY: Springer, 2001.

[8] D. P. Bertsekas and D. A. Castanon, “A forward/reverse auction algorithm for asymmetric assignment problems,” Massachusetts Institute of Technology, Tech. Rep. LIDS-P-2159, 1993.

[9] D. P. Bertsekas, Linear Network Optimization: Algorithms and Codes. Cambridge, MA: M.I.T. Press, 1991.

[10] P. Klemperer, “Auction theory: A guide to the literature,” Journal of Economic Surveys, vol. 13, no. 3, pp. 227–286, July 1999.

[11] S. de Vries and R. Vohra, “Combinatorial auctions: A survey,” Northwestern University, Center for Mathematical Studies in Economics and Management Science, Tech. Rep. 1296, May 2000.

[12] M. Esteva and J. A. Padget, “Auctions without auctioneers: Distributed auction protocols,” in Agent Mediated Electronic Commerce (IJCAI Workshop), 1999, pp. 220–238.


[13] A. Byde, C. Preist, and N. R. Jennings, “Decision procedures for multiple auctions,” in AA-MAS ’02: Proceedings of the First International Joint Conference on Autonomous Agents andMultiagent Systems, New York, NY, 2002, pp. 613–620.

[14] R. P. McAfee, “Mechanism design by competing sellers,” Econometrica, vol. 61, no. 6, pp.1281–1312, 1993.

[15] M. G. Coles and J. Eeckhout, “Heterogeneity as a coordination device,” Department of Eco-nomics and Business, Universitat Pompeu Fabra, Tech. Rep. 510, Feb. 2000.

[16] A. Ahmed, A. Patel, T. Brown, M. Ham, M.-W. Jang, and G. Agha, “Task assignment for a physical agent team via a dynamic forward/reverse auction mechanism,” in The International Conference of Integration of Knowledge Intensive Multi-Agent Systems KIMAS ’05: Modeling, Evolutions and Engineering, Apr. 2005, pp. 311–317.

[17] Y. U. Cao, A. S. Fukunaga, and A. B. Kahng, “Cooperative mobile robotics: Antecedents and directions,” Autonomous Robots, vol. 4, no. 1, pp. 7–23, Mar. 1997.

[18] V. J. Lumelsky and K. R. Harinarayan, “Decentralized motion planning for multiple mobile robots: The cocktail party model,” Autonomous Robots, vol. 4, no. 1, pp. 121–135, Mar. 1997.

[19] T. Arai and J. Ota, “Motion planning of multiple mobile robots,” in Proceedings of the 1992 IEEE/RSJ International Conference on Intelligent Robots and Systems (IEEE/RSJ IROS), July 1992, pp. 1761–1768.

[20] L. E. Parker, “ALLIANCE: An architecture for fault tolerant multirobot cooperation,” IEEE Transactions on Robotics and Automation, vol. 14, no. 2, pp. 220–240, Apr. 1998.

[21] B. P. Gerkey and M. J. Mataric, “Sold!: Auction methods for multi-robot coordination,” IEEE Transactions on Robotics and Automation, vol. 18, no. 5, pp. 758–768, Oct. 2002.

[22] P. R. Chandler and M. Pachter, “Hierarchical control for autonomous teams,” in Proceedings of AIAA Guidance, Navigation, and Control Conference, vol. 1, Aug. 2001, pp. 632–642, (AIAA Paper 2001-4149).

[23] A. Richards, J. Bellingham, M. Tillerson, and J. How, “Co-ordination and control of multiple UAVs,” in Proceedings of AIAA Guidance, Navigation, and Control Conference, Aug. 2002, (AIAA Paper 2002-4588).

[24] P. Zigoris, J. Siu, O. Wang, and A. T. Hayes, “Balancing automated behavior and human control in multi-agent systems: A case study in RoboFlag,” in Proceedings of the American Control Conference, 2003, pp. 667–671.

[25] C. Schumacher, P. R. Chandler, S. J. Rasmussen, and D. Walker, “Task allocation for wide area search munitions with variable path length,” in Proceedings of the American Control Conference, June 2003, pp. 3472–3477.

[26] E. Frazzoli and F. Bullo, “Decentralized algorithms for vehicle routing in a stochastic time-varying environment,” in Proceedings of the IEEE Conference on Decision and Control, vol. 4, Dec. 2004, pp. 3357–3363.

[27] D. H. Walker, “Coordinated UAV target assignment using distributed calculation of target-task tours,” M.S. thesis, Brigham Young University, 2004.

[28] J. Guerrero and G. Oliver, “Multi-robot task allocation strategies using auction-like mechanisms,” in Sixth Congress of the Catalan Association for Artificial Intelligence (CCIA), 2003.

[29] P. J. Modi, W.-M. Shen, M. Tambe, and M. Yokoo, “An asynchronous complete method for distributed constraint optimization,” in Proc. of Autonomous Agents and Multi-Agent Systems, 2003, pp. 161–168.

[30] K. Dixon, J. Dolan, W. Huang, C. Paredis, and P. Khosla, “RAVE: a real and virtual environment for multiple mobile robot systems,” in Proceedings of International Conference on Intelligent Robots and Systems (IEEE/RSJ IROS), vol. 3, 1999, pp. 1360–1367.

[31] B. L. Brummit and A. Stentz, “Dynamic mission planning for multiple mobile robots,” in Proceedings of the IEEE International Conference on Robotics and Automation, vol. 3, Apr. 1996, pp. 2396–2401.

[32] A. Pongpunwattana, “Real-time planning for teams of autonomous vehicles in dynamic uncertain environments,” Ph.D. dissertation, University of Washington, Seattle, Washington, 2004.

[33] Acroname, Inc., “Garcia,” Mar. 2005, http://www.acroname.com/garcia/garcia.html.
