Response Threshold Model Based UAV Search Planning and Task Allocation


J Intell Robot Syst · DOI 10.1007/s10846-013-9887-6

Response Threshold Model Based UAV Search Planning and Task Allocation

Min-Hyuk Kim · Hyeoncheol Baik · Seokcheon Lee

Received: 13 January 2013 / Accepted: 10 September 2013
© Springer Science+Business Media Dordrecht 2013

Abstract This paper addresses a search planning and task allocation problem for an Unmanned Aerial Vehicle (UAV) team that performs a search and destroy mission in an environment where targets with different values move around. The UAVs are heterogeneous, having different sensing and attack capabilities, and carry a limited amount of munitions. The objective of the mission is to find targets and eliminate them as quickly as possible, considering the values of the targets. In this context, there are two distinct issues that need to be addressed simultaneously: search planning and task allocation. The search plan generates an efficient search path for each UAV to facilitate fast target detection. The task allocation assigns

M.-H. Kim (B) Center for Army Analysis and Simulation, Republic of Korea Army, PO Box 501-15, Bunam-ri, Sindoan-myeon, Gyeryong-si, Chungnam-do, 321-929, South Korea. e-mail: [email protected]

H. Baik School of Aeronautics and Astronautics, Purdue University, 701 W. Stadium Avenue, West Lafayette, IN 47907, USA. e-mail: [email protected]

S. Lee School of Industrial Engineering, Purdue University, 315 N. Grant Street, West Lafayette, IN 47907, USA. e-mail: [email protected]

UAVs attack tasks over detected targets such that each UAV's attack capability is respected. We model these two issues in one framework and propose a distributed approach that utilizes a probabilistic decision-making mechanism based on the response threshold model. The proposed approach accounts for natural uncertainties in the environment and provides flexibility, resulting in efficient exploration of the environment and effective allocation of attack tasks. The approach is evaluated in simulation experiments in comparison with other methods, the results of which show that our approach outperforms the other methods.

Keywords Unmanned Aerial Vehicles · Search planning · Task allocation · Probabilistic decision making · Response threshold model

1 Introduction

Advanced technologies in Unmanned Aircraft Systems (UAS) allow Unmanned Aerial Vehicles (UAVs) to carry out complex missions that require UAVs to perform various types of tasks [1]. One of the crucial missions for which multiple UAVs can be utilized is the search and destroy mission, where a group of UAVs is deployed into a battlefield in order to search for targets and destroy identified targets.


This paper considers simultaneous search planning and task allocation for a group of UAVs that performs a search and destroy mission in a dynamic environment. In the environment, targets move dynamically and are of various types that have different values and munition resource requirements. The value of a target specifies the criticality of the target and may change over time. The munition requirement indicates the minimum amount of munition resources required to destroy the target. Each UAV is assumed to have: (1) a target sensing device to detect targets; (2) communication equipment to exchange information and cooperate with others; and (3) munition resources to attack targets. The UAV team is composed of heterogeneous members in terms of sensing and attack capabilities. The munition resources that a UAV possesses are limited in quantity and deplete with usage. In a mission, the UAVs are expected to carry out two different types of tasks associated with targets: search and attack. A search task is to search for targets in a search region and an attack task is to strike a target with munition resources. When a UAV detects a target, it either attacks the target single-handedly if it has the proper attack capability or assigns it to another UAV if not. The objective of the mission is to find targets and eliminate them as quickly as possible, taking into account the values of the targets.

In this scenario, there are two important issues that need to be considered simultaneously: search planning and task allocation. The search plan provides each UAV with a search path for fast target detection so that the ensuing attack task on a detected target can be performed shortly afterward. It is inherently impossible to plan an optimal search path for each UAV throughout the mission because of unpredictability in the environment, and hence the search path of each UAV has to be generated in an on-line fashion. The task allocation assigns a UAV to a detected target such that the UAV's attack capability and attack response time are respected. Though there have been significant efforts that deal with each issue separately, only a few works study the problem that combines these two issues [2].

Motivated by this notion, we model the problem that treats these two distinct issues in one framework. The problem is complicated by several factors: multiple UAVs, moving targets, constrained resources, and heterogeneous UAV capabilities. Moreover, the tight coupling of search planning and task allocation increases the complexity of the problem. Finding a globally optimal solution is computationally impracticable due to the dynamic nature of the problem. Therefore, we need an efficient procedure to produce a sufficiently good solution to this complex problem. In this work, we present a distributed approach that utilizes a probabilistic decision-making protocol based on the response threshold model, which is inspired by the task assignment behavior of social insects [3, 4]. The basic idea is that an individual's behavior is determined not deterministically but probabilistically, using a probability function conditioned on two factors, the so-called response threshold and stimulus. The response threshold represents the level of preference or specialization of an individual UAV for a task, and the stimulus corresponds to the demand, value, or priority of the task. Based on the threshold level and the stimulus intensity, the probability function determines the probability that the individual accepts the task. This probabilistic action selection mechanism accounts for natural uncertainties in the environment and provides flexibility, resulting in efficient exploration of the environment and effective allocation of attack tasks.

The remainder of this paper is organized as follows. In the next section we present a brief review of related work. In Section 3, we describe our problem in detail. The UAV information base is defined in Section 4, which is followed by the approach we propose in Section 5. We show simulation results comparing our approach to other algorithms in Section 6. Finally, Section 7 discusses the conclusion and future work.

2 Related Work

The problem of searching for targets in an environment has been an intensive research area for several decades. The earliest work was addressed in classical search theory [5], which mainly focuses on the optimal allocation of search resources to maximize the detection probability of


a stationary target. The basic elements of the optimal search problem include a prior distribution on target location, a function relating search resource to detection probability, and a constrained amount of search resource. An exponential function is a common assumption to describe the probability of target detection [6, 7]. While the classical search problem concerns a single stationary target [8–10], modern work on the search problem has extended the area by considering mobile targets and/or multiple targets. In this work, the search region is partitioned into a finite number of cells, in which the unknown position of the target is given by a probability distribution. The movement of the target is modeled as a Markovian motion with a transition matrix over cells [11]. The transition probabilities of the Markov chain can be constructed based on the known dynamics of the target, the operation strategy of the target, and the terrain features of the region.

However, all the work mentioned above, dealing with a single searcher, may not be suitable for modern applications where a team of multiple agents is adopted in order to ensure robustness and faster mission accomplishment. The search problem, when dealing with multiple agents, needs to address several realistic considerations such as heterogeneity of agents and distributed decision making. In recent years, there have been numerous studies on search strategies for multiple agents. Polycarpou et al. [12] address a cognitive map-based cooperative search strategy for a team of UAVs. In their work, each UAV utilizes a cognitive map of the search region as a knowledge base. Based on newly collected information about the environment, the map is dynamically updated and used to route UAVs. The objective of the search is to minimize the total uncertainty in the region. Sujit and Ghose [13] present a search problem that incorporates an endurance time constraint on UAVs, and propose an algorithm that utilizes the k-shortest path algorithm with the objective of minimizing uncertainty in a region. Other works that employ a cognitive map and use a measure of uncertainty for the construction of search paths can be found in the literature [14–19].

While search algorithms designed to minimize uncertainty yield good performance in stationary target search problems, their performance may deteriorate in dynamic environments where targets are assumed to move. Polycarpou et al. [12] state that the algorithm can be adjusted by applying a decay factor to uncertainty values in the case of dynamic environments. However, this model does not fully address the dynamic environment, as reported in [20], since the uncertainty measure will continuously increase at all points that are not under direct surveillance. Other approaches to solving a search problem for moving targets include probability-map based search and exhaustive search, such as in [21–23].

Other than search path planning, task allocation over targets is also of great interest for a team of multiple agents. Some researchers have formulated the task allocation problem as an optimization model and proposed solution approaches based on centralized control mechanisms utilizing mixed integer linear programming, dynamic programming, and genetic algorithms, which can be found in [24–28]. Though centralized mechanisms have advantages, including guaranteeing optimal solutions, decentralized approaches are more favored and common in dynamic and uncertain environments because of their robustness, quick response to dynamic events, and little computation overhead. Lemaire et al. [29] present a distributed task allocation scheme based on the contract net protocol for multiple UAVs. Gurfil [30] investigates the performance of a team of UAVs that performs a search and destroy mission utilizing auction-based task allocation. Jin et al. [2] address a search and response problem where a heterogeneous team of UAVs performs various tasks including search, attack, and damage assessment. The authors propose a predictive assignment algorithm where the UAVs estimate probabilities for future tasks and incorporate these predictions into their decision making. Sujit et al. propose a negotiation-based task allocation scheme for multiple UAVs that have communication constraints [31], and further develop the scheme using sequential auctions [32].

Most of the work mentioned above studies either problems that consider search and task allocation issues separately or problems that integrate the two issues only for stationary targets. However, for a complex mission where multiple UAVs are supposed to deal with moving targets, there is a need to develop an efficient task allocation mechanism combined with search planning in a distributed setting. In the following sections, we describe the problem in detail and develop a distributed search and task allocation scheme for the problem.

3 Problem Statement

We consider a team of UAVs deployed into a certain battlefield, where the team is supposed to perform search and attack tasks for targets that move around the battlefield. The UAVs are modeled as autonomous distributed agents that make their own decisions based on their knowledge about the environment. In the rest of the paper, we use the general term "agent" to represent a UAV.

3.1 Mission Space

The continuous battlefield is partitioned into a collection of identically-sized grid cells, C = {1, 2, . . . , NC}, as shown in Fig. 1. To describe the movements of agents and targets in this discretized environment, we define the movement period T as the time interval that discretizes the movements of agents and targets. In the model, agents and targets move among the cells at each discrete time step t = kT for k = 0, 1, 2, . . .

Fig. 1 Movement of agent in grid environment

In the grid environment, a cell represents a region where an agent expends its search effort or strikes a detected target. It is assumed that a cell is large enough for an agent to maneuver inside it, performing a search or attack task. At each time step, an agent determines a cell to search according to a search mechanism, and the sequence of cells that the agent chooses constitutes the agent's search path.

3.2 Targets

There are NT moving targets in the environment. The targets are probabilistically located at time t = 0 and move among cells according to a Markov process, each occupying one cell during each time interval. It is assumed that the initial location probabilities and transition probabilities of each target are known and independent of the others. For simplicity, we assume that the targets move according to a homogeneous Markov chain Λ.

The targets are of various types and each target is allotted a value based on its priority or importance level. A target value is time-dependent, typically discounted in time, and is specified by its value function,

Vj(t) = V0j · ϕj(t)   (1)

where V0j is a positive constant representing the initial value assigned to target j and ϕj(t) ∈ [0,1] is a function defined to be non-increasing over time. Each of these targets requires some amount of munition resources for destruction, and the amount may vary depending on the type of munition that an agent carries.
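The value function (1) can be sketched in code. The exponential discount used here is only an illustrative choice of ϕj(t); the model requires only that ϕj(t) ∈ [0,1] be non-increasing, so the `decay` parameter is an assumption of this sketch:

```python
import math

def target_value(v0, t, decay=0.05):
    """Time-discounted target value V_j(t) = V0_j * phi_j(t) (Eq. 1).

    phi_j(t) = exp(-decay * t) is an illustrative non-increasing
    discount function; any phi_j(t) in [0, 1] would satisfy the model.
    """
    phi = math.exp(-decay * t)  # phi_j(0) = 1, decreases toward 0
    return v0 * phi
```

With `decay = 0`, the target keeps its initial value throughout the mission, recovering the special case of a constant-value target.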

3.3 Agents

A team of agents consists of NA heterogeneous agents. Each agent moves autonomously through the environment executing tasks such as search and attack. The agents carry limited fuel resources and are assumed to have no refueling during the mission. It is assumed that global communication between the agents is possible and that there is no interruption or delay in communication.


3.3.1 Movement of Agent

During search, an agent moves from one cell to another or stays in a cell at each time step. The available location at the next time step is constrained to one of the cells adjacent to the agent's current location (see Fig. 1). When conducting a search task in a cell, an agent moves around searching for targets. We do not explicitly model the internal dynamics of the individual UAV (when maneuvering in a cell) and hence concentrate on the development of mechanisms for search planning and task allocation. However, when an agent needs to move to attack a target in another location, the agent flies directly from its current location to the cell where it is supposed to perform the attack task. We assume that the agent moves at its optimal speed, vattack, so as to carry out the attack task in minimum time.

A target may change its location while an agent is heading to attack it, and thus the agent initially computes its expected travel time based on the location where the target was spotted. The distance between two locations is measured as the Euclidean distance from the center of one cell to the center of the other, as illustrated in Fig. 2. Then the expected travel time that agent i in cell ci takes to arrive at cell cj, where target j is currently located, is computed as

Δtij = dist(ci, cj) / vattack = √(Δx²(ci, cj) + Δy²(ci, cj)) / vattack   (2)

Fig. 2 Euclidean path of agent for attacking a target

In the model, we define the response time for agent i to attack target j as the minimum number of time steps needed to arrive at the current location of the target, which is computed as

τij = ⌈Δtij / T⌉   (3)

where ⌈r⌉ denotes the smallest integer not less than the real number r and T represents the time interval. While moving to attack a target, an agent adjusts its path as it obtains new information about the target's location, and searches none of the cells that it moves through, in order to reach the target in minimum time.
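Equations 2 and 3 combine into a short helper. Here cell centers are represented as (x, y) coordinates, an assumption of this sketch, while `v_attack` and `T` are the attack speed and time interval of the model:

```python
import math

def response_time(agent_center, target_center, v_attack, T):
    """Response time tau_ij (Eq. 3): the expected travel time
    Delta t_ij (Eq. 2, Euclidean center-to-center distance divided
    by the attack speed) rounded up to whole time steps."""
    dx = target_center[0] - agent_center[0]
    dy = target_center[1] - agent_center[1]
    travel_time = math.hypot(dx, dy) / v_attack  # Delta t_ij, Eq. 2
    return math.ceil(travel_time / T)            # tau_ij, Eq. 3
```

For example, an agent at (0, 0) attacking a target spotted at (3, 4) with vattack = 1 and T = 2 needs ⌈5/2⌉ = 3 time steps.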

3.3.2 Capabilities of Agent

An agent is equipped with a sensing device, with which the agent is assumed to be able to detect a target with a certain probability, provided that the agent and the target are located in the same cell. The detection capability of agents may differ depending on the type of the agent and the search conditions of a region. The detection capability of agent i is defined by a detection function,

βi(c, t) = 1 − e^(−α(c,t))   (4)

where α(c, t) ≥ 0 represents the search effectiveness of cell c at time t, as is commonly used in search problems. We assume that, once an agent detects a target, the agent can also identify physical properties of the target, such as its type, value, and munition resource requirement.
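The detection function (4) is a one-liner; α is the cell's search effectiveness, supplied by the caller:

```python
import math

def detection_probability(alpha):
    """Detection function beta_i(c, t) = 1 - exp(-alpha(c, t)) (Eq. 4).

    alpha >= 0 is the search effectiveness of the cell; the detection
    probability is 0 at alpha = 0 and approaches 1 as alpha grows."""
    if alpha < 0:
        raise ValueError("search effectiveness must be non-negative")
    return 1.0 - math.exp(-alpha)
```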

In order to destroy a target that has been identified, the target needs to be serviced by an agent that holds the proper attack capability for the target. An agent carries munition resources that are limited in quantity and deplete with usage. While engaged with a target, an agent delivers some amount of munitions to the target and destroys the target with a kill probability, which is determined based on the effectiveness of the munitions the agent carries. The kill probability of agent i over target j is defined as

δij(μij(t)) = Probability(attack destroys target j | μij(t))   (5)

where μij(t) represents the amount of available munition resources at time t that agent i can use to destroy target j. The attack capability of agent i is specified by its kill probability for the target and is defined as follows:

γij(t) = { δij(μij(t)),  if δij(μij(t)) ≥ δmin
         { 0,            otherwise                  (6)

An agent is not eligible to attack target j if its kill probability is less than some positive number δmin or if it does not have the amount of munition resources required to attack the target. Following the attack task on a target, Battle Damage Assessment (BDA) is instantly performed by the attacking agent and is assumed to be precise. If the attack task fails, the agent is required to re-attack the target until the target is successfully destroyed.
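The eligibility rule of Eq. 6 reduces to a threshold test on the kill probability. Since δij(μij(t)) itself depends on the munition model, it is passed in as a plain number in this sketch:

```python
def attack_capability(kill_prob, delta_min):
    """Attack capability gamma_ij(t) (Eq. 6): equals the kill
    probability delta_ij(mu_ij(t)) when it meets the minimum
    threshold delta_min, and 0 otherwise (agent not eligible)."""
    return kill_prob if kill_prob >= delta_min else 0.0
```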

3.4 The Objective of the Mission

During the mission, individual agents collect information about targets and the environment, and share this information via communication with other agents. Utilizing this information, the team of agents collaborates to search for and destroy targets. When an agent successfully destroys a target, the agent is awarded a reward corresponding to the value of the target at that time. The objective of the mission for the team is then to maximize the total reward attained by successfully destroying targets within a given mission horizon. To accomplish this goal, the team needs to cooperate through efficient search planning and an effective task allocation mechanism.

4 The Agent Information Base

In order to effectively search for targets, agents must keep state information about the environment in terms of target locations. To do this, two information maps are used as the knowledge base for planning search routes.

4.1 Target Location Probability Map

Each agent has a target location probability map Q(t = kT) that contains probability information regarding target locations. In the map, each cell i has a value, termed the target location probability, q(i, j, t) ∈ [0,1], representing the probability that target j is present in the cell at time t. The location probability map is constructed from the initial target location probability distribution of each target, Q(t = 0) = {q(i, j, 0)}, known in advance, and is updated at each time step according to a Markov process Λ as new target location information is obtained.

In the absence of search, the target location distribution evolves according to the formula

Q(t = kT + T) = Q(t = kT) · Λ   (7)

where Λ = {λij | i, j ∈ C} represents the target transition probability matrix. However, incorporating the likelihood of overlooking targets during a search, the probability of target j being present in cell i is computed as

q(i, j, t + T) = Σ_{k ∈ adj(i)} λki · q(k, j, t) · ψa(k, t)   (8)

where adj(i) represents the set of cells adjacent to cell i, including the cell itself, and ψa(k, t) is the probability that agent a overlooks targets in cell k ∈ adj(i), which is defined as

ψa(k, t) = { 1 − βa(k, t),  if agent a searches cell k at time t
           { 1,             otherwise                              (9)

Agents cooperatively build the target location probability map, utilizing the new information each agent gathers during each time interval. As an agent carries out a task in a cell, it obtains some information about targets, e.g., targets detected, undetected, or destroyed. The agent propagates this information to other agents via communication and updates the target location probability map according to formulas (7) and (8). For example, if a target is detected in a cell, the location probability of the target is set to 1.0 in that cell and 0 for all other cells in the map. Once a target is destroyed, the target's location information is removed from the location probability map.
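One step of the map update (Eqs. 8 and 9) for a single target might look as follows. For simplicity this sketch sums over all cells rather than only adj(i), with λki = 0 standing in for non-adjacent pairs, and models a single searching agent:

```python
def update_location_probability(q, trans, searched, beta):
    """Propagate one target's location probabilities one time step
    (Eq. 8), weighting each source cell by the overlook probability
    psi (Eq. 9).

    q[k]        -- probability the target is in cell k at time t
    trans[k][i] -- transition probability lambda_ki from cell k to i
    searched    -- set of cell indices searched during this step
    beta[k]     -- detection probability of the searching agent in k
    """
    n = len(q)
    q_next = [0.0] * n
    for i in range(n):
        for k in range(n):
            # psi = 1 - beta if the cell was searched, else 1 (Eq. 9)
            psi = 1.0 - beta[k] if k in searched else 1.0
            q_next[i] += trans[k][i] * q[k] * psi
    return q_next
```

Note that after a search with no detection, the probabilities no longer sum to one, which is the un-normalized behavior the certainty map of the next section is designed to compensate for.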

Since all information is shared among agents through global communication, every agent keeps the same target location probability map at all times. Each agent utilizes the target location probability map, and a certainty map detailed in the following section, as a guide for autonomous decision making in its own search path planning.

4.2 Certainty Map

In the target location probability map, each cell contains estimated probabilities of targets' existence. In general, however, these probabilities may not precisely reflect the real probabilities of the targets' locations due to inherent uncertainty in the probabilities. For example, if agents have detected no target for a long period of time, the probabilities of targets' existence, which have been developed from Eqs. 7 and 8 over that time, come out with only small values. This implies that the location probability distribution is un-normalized over the entire environment and thus there may be high uncertainty in the probabilities. Therefore, using only the target location probability map may not provide sufficiently good information to account for target locations. For this reason, our model uses another measure, called certainty, to capture information describing the ambiguity in the location probabilities, as used in [14–16].

We define a certainty variable, h(i, t) ∈ [0,1], for each cell i, which quantifies the information deficiency about target location probabilities in the cell. This certainty value corresponds to the degree to which the cell has been searched and drives agents to explore un-searched regions. The certainty value h(i, t) = 0 implies that the cell has not been searched for a long time and thus the location probabilities in the cell are quite uncertain. As the cell is searched repeatedly, its certainty value approaches 1.0. The map that stores this information is called the certainty map, H(t = kT), which begins with an initial certainty value in each cell and is updated at each time step as cells are searched. Each time an agent visits a cell and performs a search, the certainty value of the cell changes according to the rule described in [15, 16]:

h(i, t + T) = h(i, t) + 0.5 (1 − h(i, t)) (10)

However, in a dynamic environment where targets move, if no search is performed in a cell, the cell's certainty level decays with time to cope with the changing environment, so that cells scanned in recent searches have more weight in the certainty level. In our model, the certainty value of a cell with no search decreases using a discount factor ε ∈ [0,1] as

h(i, t + T) = ε · h(i, t) (11)

This update rule is a simple way for agents to track the number of searches recently conducted on each cell and to capture the notion of uncertainty in the target location probability information. The certainty map constantly changes over the mission horizon and is shared among all agents through communication.
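The two update rules (Eqs. 10 and 11) can be applied to the whole map at once; ε is the decay factor for unsearched cells:

```python
def update_certainty(h, searched, epsilon=0.9):
    """Certainty-map update: searched cells move halfway toward 1.0
    via h <- h + 0.5*(1 - h) (Eq. 10); unsearched cells decay as
    h <- epsilon * h (Eq. 11)."""
    return [hi + 0.5 * (1.0 - hi) if i in searched else epsilon * hi
            for i, hi in enumerate(h)]
```

Repeated searches drive a cell's certainty toward 1.0 geometrically, while a long stretch without search drives it back toward 0, matching the intended "recently searched" semantics.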

5 Proposed Approach

The approach proposed in this paper utilizes a probabilistic decision-making protocol based on the response threshold model described in [3, 4]. In the response threshold model, each agent is given a set of thresholds and each task is assigned a stimulus. The threshold represents the preference or specialization of an agent toward a task and can be updated as the state changes. The stimulus of a task indicates the demand or value of the task itself and is used to attract agents' attention. A probability function is defined using the threshold and the stimulus, by


which an agent determines the probability that ittakes a task.

In our model, tasks include a set of search tasks, each for searching an individual cell, and a set of attack tasks, each for attacking an individual detected target. For the sets of search and attack tasks, the response thresholds of individual agent i are given as Θ^S_i = {θ^S_i1(t), · · · , θ^S_i,NC(t)} and Θ^A_i = {θ^A_i1(t), · · · , θ^A_i,NT(t)}, respectively. The stimulus of a task is determined based on the total expected values of targets. While the response thresholds of agents and the stimuli of tasks are defined differently for search and attack tasks, as detailed in the following sub-sections, agents operate in a similar manner: an agent probabilistically selects its action by evaluating tasks according to a probability function. Though a couple of different forms of the probability function are used in the literature, we adopt the function in [3]; i.e., given threshold θij and stimulus Sj, agent i takes task j with probability:

P(Sj, θij) = Sj² / (Sj² + a · θij² + b · Δτij²)   (12)

where Δτij is the time period until agent i starts task j, and a and b are parameters. The lower an individual's threshold or the higher a task's stimulus, the more likely the agent is to accept the task.
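A direct transcription of the probability function (12), assuming the delay term enters quadratically as b · Δτij², which is one reading of the flattened original:

```python
def response_probability(stimulus, threshold, delay=0.0, a=1.0, b=1.0):
    """Response-threshold probability (Eq. 12):
    P = S^2 / (S^2 + a*theta^2 + b*delay^2).
    A lower threshold, higher stimulus, or shorter delay makes
    acceptance more likely."""
    s2 = stimulus ** 2
    return s2 / (s2 + a * threshold ** 2 + b * delay ** 2)
```

When stimulus and threshold are equal (and the delay is zero), the acceptance probability is exactly 0.5, the familiar fixed point of threshold-response models.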

The task allocation framework we propose consists of two mechanisms: a search planning mechanism for search tasks and a target assignment mechanism for attack tasks. At each time step, every agent is presented with search tasks, each of which is to search one of the adjacent cells. An agent selects one search task through the search planning mechanism. During search, attack tasks are introduced when targets are detected. In our model, attack tasks have higher priority than search tasks. An agent that detects a target runs the target assignment mechanism, through which the attack task for the target is assigned. The overall flow of the task allocation framework is shown in Fig. 3.

Fig. 3 Task allocation framework flowchart

5.1 Search Planning Mechanism

An agent performs a search (as a default task) if no target is detected at the time or if the agent is not eligible for any of the announced attack tasks. At each time step, an agent selects one search task from the available search tasks for the next time interval. The response threshold for, and the stimulus of, an individual search task are defined using the environment state information at the time, such as the target location probabilities in the location probability map, Q(t), and the certainty values in the certainty map, H(t).

In the mechanism, the response threshold ofagent i to a search task in cell j, is given by thecell’s certainty valueh( j,t) at the time and boundedin [0,θmax], which is defined as:

θ Sij (t) = θmax · (

1 − e−ω·h( j,t)) (13)

where ω>0 is a parameter. Note that agents haveall common response thresholds to search taskssince the thresholds are determined only withcells’ certainty value. The response thresholds areupdated every time step as certainty value of cellschanges. A stimulus of a search task in cell j is

Page 9: Response Threshold Model Based UAV Search Planning and Task Allocation

J Intell Robot Syst

defined based on the potential to detect targets in the cell and the values of those targets, which is given by:

S^S_j(t) = Σ_{k∈N_T} q(j,k,t) · β_i(j,t) · V_k(t)    (14)
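Equations 13 and 14 can be sketched numerically. Below is a minimal Python sketch (the paper's implementation is in MATLAB); the sample values for h(j,t), q(j,k,t), β_i(j,t) and V_k(t) are illustrative, not taken from the paper's scenario:

```python
import math

THETA_MAX = 100.0  # upper bound on response thresholds (the paper uses 100)
OMEGA = 3.0        # steepness parameter omega > 0 (the paper uses 3.0)

def search_threshold(h_jt):
    # Eq. 13: the threshold rises toward THETA_MAX as the cell's
    # certainty h(j,t) grows, so well-covered cells become less attractive.
    return THETA_MAX * (1.0 - math.exp(-OMEGA * h_jt))

def search_stimulus(q_j, beta_i, values):
    # Eq. 14: sum over target types k of the location probability
    # q(j,k,t), weighted by the detection probability and target value.
    return sum(q * beta_i * v for q, v in zip(q_j, values))

# Illustrative numbers:
theta = search_threshold(0.2)                          # low certainty -> low threshold
s = search_stimulus([0.05, 0.10], 0.9, [100.0, 70.0])  # 0.05*0.9*100 + 0.10*0.9*70 = 10.8
```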

In our model, no delay time is required to transfer to the next search task, which means Δτ_ij = 0. Then, according to Eq. 12, the probability that agent i accepts a search task for cell j at time t is computed as:

P̃^S_ij(t) = (S^S_j(t))² / [ (S^S_j(t))² + a · (θ^S_ij(t))² ]    (15)

We interpret this probability as the preference level of an agent for a particular search task. Since an agent is given multiple search tasks, the agent needs to pick only one among them for the next time step. Thus the actual probability for an agent to select a search task is obtained by normalizing the preference levels:

P^S_ij(t) = P̃^S_ij(t) / Σ_{k∈adj(c)} P̃^S_ik(t)    (16)

According to function (15), the lower the response threshold or the greater the stimulus, the higher the preference level imposed on a search task in a cell, which means the cell is more likely to be visited. In this way, agents are driven to explore cells not only with high target location probabilities but also with low certainty values.
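The selection step of Eqs. 15 and 16 can be sketched as follows; the cell stimuli and thresholds are illustrative inputs, and the sampling helper is our own addition:

```python
import random

A = 0.01  # scaling constant a in Eq. 15 (the paper sets a = 0.01)

def preference(stimulus, threshold):
    # Eq. 15: an agent's preference level for one search task.
    return stimulus ** 2 / (stimulus ** 2 + A * threshold ** 2)

def selection_probs(stimuli, thresholds):
    # Eq. 16: normalize the preference levels over the adjacent cells.
    prefs = [preference(s, th) for s, th in zip(stimuli, thresholds)]
    total = sum(prefs)
    return [p / total for p in prefs]

def select_cell(stimuli, thresholds, rng=random.Random(0)):
    # Sample one adjacent cell according to the normalized probabilities.
    probs = selection_probs(stimuli, thresholds)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Four adjacent cells with illustrative stimuli and thresholds:
probs = selection_probs([10.8, 2.0, 6.0, 0.5], [45.1, 90.0, 20.0, 95.0])
```

A cell with a high stimulus and a low threshold dominates the normalized distribution, but every adjacent cell retains a nonzero chance of being chosen, which is what gives the method its exploratory flexibility.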

However, there might be a case where more than one agent competes for the same cell to search. In that case, a conflict resolution rule is applied to avoid overlap in search efforts: the cell is given to the agent with the higher preference level, and the other agent takes the cell with its second highest preference level. Since no more than one UAV occupies a cell at the same time, collision-free operation (flight) is guaranteed when searching in a cell. Figure 4 shows the flowchart that details the process of the search planning mechanism.

Fig. 4 Search planning mechanism flowchart
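The conflict resolution rule above can be sketched for the two-agent case; the (cell, preference) list structure is our assumption, not the paper's:

```python
def resolve_conflict(prefs_a, prefs_b):
    # Each argument is an agent's list of (cell, preference) pairs,
    # sorted best-first. If both agents want the same cell, the agent
    # with the higher preference keeps it; the other falls back to its
    # own second-best cell, so search efforts never overlap.
    cell_a, p_a = prefs_a[0]
    cell_b, p_b = prefs_b[0]
    if cell_a != cell_b:
        return cell_a, cell_b          # no conflict: both keep their choice
    if p_a >= p_b:
        return cell_a, prefs_b[1][0]   # agent B takes its second-best cell
    return prefs_a[1][0], cell_b       # agent A takes its second-best cell
```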

5.2 Target Assignment Mechanism

When an agent senses a target, it initiates the target assignment mechanism to select the most suitable agent to perform the attack task over the target. If an agent detects multiple targets, it runs the target assignment process in parallel for each target. The target assignment mechanism in our model utilizes an auction-like protocol. The process starts with the announcement of an attack task along with information associated with the target, such as its type, location, munitions requirement, and stimulus. Every other agent that receives the attack task information checks its eligibility, i.e., whether or not it holds a proper attack capability with sufficient munition resources, and if so, responds to the task with a probability given by function (12).

An agent’s response threshold to attack task isgiven based on the agent’s attack capability. If an


agent has a high kill probability over a target, it gets a low response threshold to the corresponding attack task. Therefore, the response threshold is defined to decrease linearly with the kill probability as follows:

θ^A_ij(t) = c − d · γ_ij(t)    (17)

where c and d are positive constants and θ^A_ij is bounded in [0, θmax]. If the agent has no attack capability against a target, the response threshold to the attack task is set to infinity. The stimulus of an attack task is simply the corresponding target's value at the time, i.e.,

S^A_j(t) = V_j(t)    (18)

Once the agent decides to respond, it sends a response message, including information such as the estimated response time (τ_ij), kill probability, and expected reward gain, to the auctioneering agent. The expected reward of agent i for destroying target j is computed as

E_ij(V_j(t + τ_ij)) = γ_ij(t) · V_j(t + τ_ij)    (19)
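Equations 17-19 can be sketched as below, using the paper's parameter settings c = d = 333.33 and θmax = 100; the clipping and the no-capability case follow the text above:

```python
C = 333.33         # constants c and d from the paper's parameter settings
D = 333.33
THETA_MAX = 100.0

def attack_threshold(gamma_ij):
    # Eq. 17: the threshold drops as the kill probability gamma_ij rises,
    # clipped to [0, THETA_MAX]; no attack capability -> infinite threshold.
    if gamma_ij <= 0.0:
        return float("inf")
    return min(max(C - D * gamma_ij, 0.0), THETA_MAX)

def attack_stimulus(target_value):
    # Eq. 18: the stimulus is simply the target's current value.
    return target_value

def expected_reward(gamma_ij, value_at_arrival):
    # Eq. 19: kill probability times the target's (decayed) value at the
    # estimated arrival time t + tau_ij.
    return gamma_ij * value_at_arrival
```

Note that with these constants, any kill probability at or below 0.7 saturates the threshold at θmax, so only agents with relatively strong attack capability receive meaningfully low thresholds.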

Fig. 5 Target assignment mechanism flowchart

Fig. 6 The performance of each method for varying numbers of targets when 5 UAVs and ε = 0.90 are used

An agent may have a schedule of previously assigned attack tasks. In that case, its response time is the total time to complete all the attack tasks in the schedule, plus the estimated time to reach the announced target after finishing the last task in the schedule. When multiple attack tasks are announced concurrently, an agent performs the same decision making process to select an attack task as it does in search task selection, i.e., it evaluates a response probability for each attack task and selects one by normalized probability.
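The response time estimate described above can be sketched as follows; the list-of-durations interface is our assumption, while the transit speed of 300 km/h comes from the simulation scenario in Section 6.1:

```python
SPEED_KMH = 300.0  # UAV attack-transit speed from the simulation scenario

def travel_minutes(distance_km):
    # Straight-line flight time, in minutes, at the scenario speed.
    return distance_km / SPEED_KMH * 60.0

def response_time(scheduled_minutes, distance_km):
    # tau_ij: time to finish all previously scheduled attack tasks plus
    # transit from the last scheduled location to the announced target.
    return sum(scheduled_minutes) + travel_minutes(distance_km)

# Two queued tasks (4 and 6 minutes) plus a 10 km flight:
tau = response_time([4.0, 6.0], 10.0)
```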

Fig. 7 The performance of each method for varying numbers of targets when 5 UAVs and ε = 0.95 are used


Table 1 The capability of UAVs

               Detection   Kill probability by target type
                           Type A   Type B   Type C   Type D   Type E
Type I UAV     0.9         0.5      0.2      0.9      0.8      0.85
Type II UAV    0.85        0.7      0.95     0.85     0.3      0.4
Type III UAV   0.95        0.9      0.75     0.7      0.75     0.7

After an agent collects all responses from eligible agents, it chooses the best one, i.e., the agent that can produce the largest expected reward. The agent sends a reject message to the other agents that were not selected. In the event that an agent does not get any response from other agents, it aborts the attack task and switches to a search task. However, even in this case, the agent propagates the location information of the detected target so that other agents can use this information to update their target location probability maps and guide future search plans.
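The auctioneer's winner selection can be sketched as below; the mapping from agent id to reported expected reward is an assumed message format, not specified in the paper:

```python
def choose_winner(responses):
    # `responses` maps an agent id to its reported expected reward (Eq. 19).
    # Returns (winner, losers); winner is None when nobody responded, in
    # which case the auctioneer aborts the attack task and only propagates
    # the target's location information to the team.
    if not responses:
        return None, []
    winner = max(responses, key=responses.get)
    losers = [agent for agent in responses if agent != winner]
    return winner, losers
```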

Agents cooperatively perform an attack task. Once an agent assigns an attack task to another agent, the assigning agent chases the target until the other agent reaches it to attack. While chasing the target, the agent communicates the target's location information to the other agent. This cooperative execution helps the attacking agent navigate to the final location where the attack task needs to be conducted. The agents are released from the attack task and begin other tasks when the task is completed. The overall process of the target assignment mechanism is illustrated in Fig. 5.

6 Simulation Results

In this section we demonstrate the performance of the proposed approach compared to two other methods in a simulation environment: a target location probability map based method [11] and a certainty (uncertainty) map based search and task allocation method [12]. Since these methods are designed only to address search path planning, we modify them to include a deterministic attack task allocation scheme for comparison with our approach. In the deterministic task allocation that the two methods use, an agent must respond to an announced attack task if it is eligible for the task. Each method is described as follows.

Target location probability map based search and task allocation This approach is a somewhat greedy method that utilizes only a target location probability map. At each time step an agent selects for search the cell that has the highest location probability weighted by target values. Upon detecting a target, an agent immediately performs an attack task over the target if it has the proper attack capability. Otherwise, the attack task is deterministically assigned to the other agent that is expected to yield the best reward by destroying the target.

Certainty map based search and task allocation In this approach, an agent uses only a certainty map for routing its search path; that is, an agent moves to the cell with the lowest certainty level for the next time step. The certainty value of a cell dynamically changes as described in the previous section.

Table 2 The average number of targets destroyed, total cumulative reward and mission completion time of each method when ε = 0.90 is applied in case study 1

No. of    Probability map based       Certainty map based         Response threshold model based
targets   Destroyed Rewards  Time     Destroyed Rewards  Time     Destroyed Rewards  Time
5         4.4       299.0    90.3     4.5       298.5    88.8     4.7       319.5    74.7
10        7.6       513.5    120.0    7.8       508.0    114.2    8.6       600.0    114.8
15        11.3      771.0    120.0    11.3      759.0    120.0    12.6      869.0    120.0
20        16.4      1106.5   120.0    15.4      1035.0   120.0    17.6      1200.5   116.9


Table 3 The average number of targets destroyed, total cumulative reward and mission completion time of each method when ε = 0.95 is applied in case study 1

No. of    Probability map based       Certainty map based         Response threshold model based
targets   Destroyed Rewards  Time     Destroyed Rewards  Time     Destroyed Rewards  Time
5         4.4       299.0    90.3     4.3       280.0    97.0     4.8       324.5    65.2
10        7.6       513.5    120.0    7.1       466.0    119.4    8.7       599.5    115.0
15        11.3      771.0    120.0    11.5      773.5    120.0    12.4      855.0    116.5
20        16.4      1106.5   120.0    16.3      1080.5   120.0    17.5      1192.0   120.0

The attack task allocation process is performed in the same manner as in the target location probability map based method.

6.1 Simulation Scenario

The simulation is conducted with a simulated UAV team that performs a search and destroy mission in a battlefield. The battlespace is represented by 10 × 10 grid cells, each 2 km on a side. In the environment there are five types of targets that move around. The initial values of targets are 100, 90, 70, 60 and 40 for types A, B, C, D, and E, respectively. The value of a target decreases according to the function

V_j(t) = V⁰_j · ( 1 / (1 + η·t) )    (20)

Fig. 8 The performance of each method for varying numbers of UAVs when 15 targets and ε = 0.90 are used

where η is set to 0.03. We consider three types of UAVs, whose munition resources are between 20 and 60 units. The munitions usage of a UAV to attack a target varies between 10 and 20 units. For simplicity, the detection and kill probabilities of each UAV type against each target type are set as shown in Table 1. The speed of a UAV moving to attack is identical and constant, assumed to be 300 km/h. The movement period T and the mission horizon are set to one minute and two hours, respectively.
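The value decay of Eq. 20 with η = 0.03 can be sketched directly:

```python
ETA = 0.03  # decay parameter eta from the simulation scenario

def target_value(v0, t):
    # Eq. 20: a target's value decays hyperbolically from its initial
    # value V0_j as mission time t (in movement periods) elapses.
    return v0 / (1.0 + ETA * t)

# A type-A target (V0 = 100) after 60 movement periods:
v = target_value(100.0, 60)  # 100 / (1 + 1.8) ~ 35.7
```

This decay is what makes fast detection and assignment valuable: the reward collected by Eq. 19 shrinks with every period a detected target is left alive.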

For each simulation run, the initial probability distribution of target locations is randomly generated and each target is located in one of the cells for which q(i, j, 0) > 0. The certainty value of each cell is initially set to 1.0. The parameters of the system are set as follows: δmin = 0.7; a = 0.01; b = 1.0; θmax = 100; ω = 3.0; c = 333.33; d = 333.33. The algorithms are coded in MATLAB, and the

Fig. 9 The performance of each method for varying numbers of UAVs when 15 targets and ε = 0.95 are used


Table 4 The average number of targets destroyed, total cumulative reward and mission completion time of each method when ε = 0.90 is applied in case study 2

No. of    Probability map based       Certainty map based         Response threshold model based
UAVs      Destroyed Rewards  Time     Destroyed Rewards  Time     Destroyed Rewards  Time
3         5.3       362.0    120.0    4.8       335.0    120.0    5.9       409.0    120.0
5         8.2       567.0    120.0    8.6       559.5    120.0    9.6       678.5    120.0
7         11.8      797.0    120.0    11.4      765.5    120.0    13.2      895.5    116.7
9         13.5      926.3    114.0    13.5      931.5    102.2    13.6      939.5    105.3

simulation results are the average of 50 runs for each case.

6.2 Results and Discussion

We test the response threshold based method against the other algorithms by varying the simulation environment. The performance is measured by the total cumulative reward that a UAV team collects by successfully destroying targets over a fixed time window. The performance is then represented as a percentage, obtained by dividing the total reward gained by the total sum of the targets' initial values.
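The performance metric can be sketched as follows; the example numbers are illustrative, assuming one target of each type (initial values 100, 90, 70, 60 and 40):

```python
def performance_pct(total_reward, initial_values):
    # The experiments' metric: cumulative reward collected by the team,
    # as a percentage of the sum of all targets' initial values.
    return 100.0 * total_reward / sum(initial_values)

# e.g. a reward of 319.5 against targets initially worth 360 in total:
pct = performance_pct(319.5, [100.0, 90.0, 70.0, 60.0, 40.0])  # 88.75
```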

Case study 1 In the first set of simulations, a UAV team composed of 5 UAVs (2 each of types I and II, and 1 of type III) is used, with the number of targets varied from 5 to 20. Figures 6 and 7 demonstrate the performance of each method for different discount factor values (ε = 0.90 and ε = 0.95). As seen in the figures, the response threshold model based method performs best in all instances, achieving up to 12.1 % and 12.7 % more reward than the probability map and certainty map based methods, respectively, when ε = 0.90, and 11.9 % and 18.5 % more, respectively, when ε = 0.95. We measure the average number of targets destroyed, the total cumulative reward and the mission completion time, as listed in Tables 2 and 3. Naturally, as more targets are injected into the system, each method obtains more reward by detecting and destroying more targets. In addition, the response threshold model based approach completes the mission in a shorter time, since the UAV team has better chances of finding targets under the proposed approach. Utilizing state information, including target location probabilities and certainty values, our approach reduces uncertainty in the search environment and provides flexibility in routing search paths. When assigning attack tasks, the proposed approach determines the most suitable UAV considering its attack capability and response time, and hence produces more reward.

Case study 2 The second set of simulations compares performance by varying the number of UAVs, given 15 targets. Figures 8 and 9 illustrate the performance of each method when discount factor values ε = 0.90 and ε = 0.95 are applied; they show that the response threshold model based method outperforms the other two methods in both cases. Our method attains at most 10.3 % more reward than the probability map based method

Table 5 The average number of targets destroyed, total cumulative reward and mission completion time of each method when ε = 0.95 is applied in case study 2

No. of    Probability map based       Certainty map based         Response threshold model based
UAVs      Destroyed Rewards  Time     Destroyed Rewards  Time     Destroyed Rewards  Time
3         5.3       362.0    120.0    4.9       331.0    120.0    5.7       408.5    120.0
5         8.2       567.0    120.0    8.1       553.5    120.0    9.6       672.5    120.0
7         11.8      797.0    120.0    12.0      811.0    120.0    12.3      849.5    120.0
9         13.5      926.3    114.0    13.7      940.0    111.3    13.8      947.0    105.1


when ε = 0.90, and 11.0 % more than the certainty map based method when ε = 0.95. Tables 4 and 5 summarize the average number of targets destroyed, the total cumulative reward and the mission completion time of each method. The response threshold model based method detects and destroys more targets in a given time and completes the mission faster. This confirms that the proposed approach explores the environment more efficiently and allocates tasks more effectively. Notice that the difference in performance is not as significant when a very small or very large number of UAVs is used. This is because a small UAV team has a limited total amount of munition resources, insufficient to handle all targets, whereas a large UAV team has enough UAVs with sufficient munition resources to react immediately to any detected target. This suggests that the advantage of the response threshold based method is most pronounced when a moderate number of UAVs is used.

Case study 3 The third set of simulations investigates the effect of the certainty discount factor on the performance of the response threshold model based method. In the first simulation, 5 UAVs with 10 targets and 7 UAVs with 15 targets are tried with discount factor values ranging from 0.8 to 1.0. Figures 10 and 11 illustrate the performance of each method. Observe that the performance of the response threshold model based method seems not greatly influenced by the discount factor value, while the performance of the certainty map based method fluctuates with it. This is because the response threshold model based method more flexibly explores the environment by probabilistically selecting a search cell, whereas the certainty map based method deterministically chooses the search cell with the highest uncertainty. This suggests that our method works

Fig. 10 Comparison of performance for varying discount factor values when 5 UAVs and 10 targets are used

Fig. 11 Comparison of performance for varying discount factor values when 7 UAVs and 15 targets are used

Fig. 12 The performance of the response threshold model based method for various numbers of targets and discount factor values when 5 UAVs are used


Fig. 13 The performance of the response threshold model based method for various numbers of UAVs and discount factor values when 15 targets are given

well in various situations, yielding stable performance. To verify this, a second simulation is conducted by varying the numbers of targets and UAVs. The results summarized in Figs. 12 and 13 show that the difference in performance is within only 6.1 %. This confirms that the proposed approach performs stably, and better than the certainty map based method.

7 Conclusion

This paper presents a distributed search planning and task allocation approach for a heterogeneous UAV team that performs a search and destroy mission against moving targets. In the model, the movement of targets is described by a Markov process, based on which target location probability information is developed. Certainty is used as a measure to account for ambiguity in the target location probability information. Using this environment state information, the UAV team cooperatively builds a target location probability map and a certainty map, which constitute a knowledge base for planning search routes and allocating tasks. In the proposed approach, a probabilistic decision making scheme based on the response threshold model is developed to provide the flexibility that enables the UAV team to efficiently explore

an environment and carry out tasks. The proposed approach has been evaluated in simulation experiments against deterministic methods. The results clearly show that the proposed approach outperforms the other methods under various conditions. Future research will examine practical issues such as target threats, communication constraints and failures, UAV malfunctions and shoot-downs, and task failures as extensions of the problem. These issues could significantly affect system performance in a real battlefield and hence should be addressed in the model. We will also investigate the impact of the model's parameters on the system and develop techniques to tune the parameters according to environmental conditions.

References

1. Unmanned Aircraft Systems Roadmap 2005–2030. Office of the Secretary of Defense (2005)
2. Jin, Y., Liao, Y., Minai, A., Polycarpou, M.: Balancing search and target response in cooperative Unmanned Aerial Vehicle (UAV) teams. IEEE Trans. Syst. Man Cybern. B Cybern. 36, 571–587 (2006)
3. Campos, M., Bonabeau, E., Theraulaz, G., Deneubourg, J.-L.: Dynamic scheduling and division of labor in social insects. Adapt. Behav. 8, 83–92 (2001)
4. Nouyan, S., Ghizzioli, R., Birattari, M., Dorigo, M.: An insect-based algorithm for the dynamic task allocation problem. Künstlich Intell. 4, 25–31 (2005)
5. Benkoski, S., Monticino, M., Weisinger, J.: A survey of the search theory literature. Nav. Res. Logist. 38, 469–494 (1991)
6. Stone, L.D.: Theory of Optimal Search, 2nd edn. Operations Research Society of America, Arlington (1989)
7. Washburn, A.: On a search for a moving target. Nav. Res. Logist. 27, 315–322 (1980)
8. Brown, S.: Optimal search for a moving target in discrete time and space. Oper. Res. 28, 1275–1289 (1980)
9. Stewart, T.: Search for a moving target when searcher motion is restricted. Comput. Oper. Res. 6, 129–140 (1979)
10. Washburn, A.: Search for a moving target: the FAB algorithm. Oper. Res. 31, 739–751 (1983)
11. Grundel, D.: Searching for a moving target: optimal path planning. In: Proceedings of the IEEE International Conference on Networking, Sensing, and Control, pp. 867–872. Los Alamitos, CA (2005)
12. Polycarpou, M., Yang, Y., Passino, K.: A cooperative search framework for distributed agents. In: Proceedings of the 2001 IEEE International Symposium on Intelligent Control, pp. 1–6. Mexico City, Mexico (2001)
13. Sujit, P., Ghose, D.: Search using multiple UAVs with flight time constraints. IEEE Trans. Aerosp. Electron. Syst. 40, 491–508 (2004)
14. Yang, Y., Minai, A., Polycarpou, M.: Evidential map-building approaches for multi-UAV cooperative search. In: Proceedings of the American Control Conference, pp. 116–121. Portland, OR (2005)
15. Jin, Y., Minai, A., Polycarpou, M.: Cooperative real-time search and task allocation in UAV teams. In: Proceedings of the IEEE International Conference on Decision and Control, pp. 7–12. Maui, Hawaii (2003)
16. Flint, M., Fernandez-Gaucherand, E., Polycarpou, M.: Cooperative control for UAVs searching risky environments for targets. In: Proceedings of the IEEE International Conference on Decision and Control, pp. 3567–3572. Maui, Hawaii (2003)
17. Yang, Y., Minai, A., Polycarpou, M.: Decentralized cooperative search by networked UAVs in an uncertain environment. In: Proceedings of the American Control Conference, pp. 5558–5563. Boston, Massachusetts (2004)
18. Flint, M., Polycarpou, M., Fernandez-Gaucherand, E.: Cooperative control for multiple autonomous UAVs searching for targets. In: Proceedings of the IEEE International Conference on Decision and Control, pp. 2823–2828. Las Vegas, Nevada (2002)
19. Sujit, P., Sinha, A., Ghose, D.: Team, game, and negotiation based intelligent autonomous UAV task allocation for wide area applications. In: Chahl, J.S., Jain, L.C., Mizutani, A., Sato-Ilic, M. (eds.) Innovations in Intelligent Machines (1), pp. 39–75. Springer-Verlag, Heidelberg (2007)
20. Passino, K., Polycarpou, M., Jacques, D., Pachter, M., Liu, Y., Yang, Y., Flint, M., Barum, M.: Cooperative control for autonomous air vehicles. In: Cooperative Control and Optimization. Kluwer, Boston (2002)
21. Vincent, P., Rubin, I.: A framework and analysis for cooperative search using UAV swarms. In: Proceedings of the 19th Annual ACM Symposium on Applied Computing, pp. 79–86. Nicosia, Cyprus (2004)
22. Jun, M., D'Andrea, R.: Probability map building of uncertain dynamic environments with indistinguishable obstacles. In: Proceedings of the American Control Conference, pp. 3417–3422. Denver, Colorado (2003)
23. Bertuccelli, L., How, J.: Search for dynamic targets with uncertain probability maps. In: Proceedings of the American Control Conference, pp. 737–742. Minneapolis, Minnesota (2006)
24. Shumacher, C., Chandler, P., Pachter, M., Pachter, L.: UAV task assignment with timing constraints via mixed-integer linear programming. In: Proceedings of the AIAA 3rd Unmanned Unlimited Systems Conference, pp. 232–252. Chicago, Illinois (2004)
25. Nygard, K., Chandler, P., Pachter, M.: Dynamic network flow optimization models for air vehicle resource allocation. In: Proceedings of the American Control Conference, pp. 1853–1858. Arlington, Texas (2001)
26. Shima, T., Rasmussen, S., Sparks, A., Passino, K.: Multiple task assignments for cooperating uninhabited aerial vehicles using genetic algorithms. Comput. Oper. Res. 33, 3252–3269 (2006)
27. Arslan, G., Wolfe, J., Shamma, J., Speyer, J.: Optimal planning for autonomous air vehicle battle management. In: Proceedings of the IEEE Conference on Decision and Control, pp. 3782–3787. Las Vegas, Nevada (2002)
28. Shetty, V., Sudit, M., Nagi, R.: Priority-based assignment and routing of a fleet of unmanned combat aerial vehicles. Comput. Oper. Res. 35, 1813–1828 (2008)
29. Lemaire, T., Alami, R., Lacroix, S.: A distributed task allocation scheme in multi-UAV context. In: Proceedings of the IEEE International Conference on Robotics and Automation, pp. 1157–1162. New Orleans, Louisiana (2004)
30. Gurfil, P.: Evaluating UAV flock mission performance using Dudek's taxonomy. In: Proceedings of the American Control Conference, pp. 4679–4684. Portland, Oregon (2005)
31. Sujit, P., Sinha, A., Ghose, D.: Multiple UAV task allocation using negotiation. In: Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, pp. 471–478. Hakodate, Hokkaido, Japan (2006)
32. Sujit, P., Beard, R.: Distributed sequential auctions for multiple UAV task allocation. In: Proceedings of the American Control Conference, pp. 3955–3960. New York City, New York (2007)