robustness in probabilistic temporal planning · 2015. 5. 11. · figure 2: temporal network...

Robustness in Probabilistic Temporal Planning

Jeb Brooks, Emilia Reed, Alexander Gruver and James C. Boerkoel Jr.Harvey Mudd College, Claremont, CA

{jbrooks,egreed,agruver,boerkoel}@g.hmc.edu

Abstract

Flexibility in agent scheduling increases the resilienceof temporal plans in the face of new constraints. How-ever, current metrics of flexibility ignore domain knowl-edge about how such constraints might arise in practice,e.g., due to the uncertain duration of a robot’s transitiontime from one location to another. Probabilistic tempo-ral planning accounts for actions whose uncertain dura-tions can be modeled with probability density functions.We introduce a new metric called robustness that mea-sures the likelihood of success for probabilistic tempo-ral plans. We show empirically that in multi-robot plan-ning, robustness may be a better metric for assessing thequality of temporal plans than flexibility, thus reframingmany popular scheduling optimization problems.

IntroductionTemporal plans exist to provide robust directives that agentscan follow to accomplish their goals, while also coordinat-ing when these activities should occur. Figure 1 shows atwo robot coordination problem—the Robot-T problem—that we will use as a running example throughout the paper.Here, Robot 1 must make its way to the bottom of the T(from α to δ via β) without colliding with Robot 2, which isattempting to traverse across the top of the T (from γ to β).To guarantee success, the plan enforces that Robot 1 mustvacate location β before Robot 2 can transition into β.

In general, we want temporal plans that are adaptableto events that are beyond the direct control of agents; e.g.,Robot 1 may experience slippage and so depart β later thanexpected. To do this, we must answer two questions: (1) howand when do new or unexpected events arise in practice?,and (2) how ‘good’ is the temporal plan at adapting to unex-pected events that might otherwise invalidate the plan?

First, current approaches for modeling how and when un-expected events arise simply capture the ranges of time dur-ing which events might occur. Unfortunately, this ignoresimportant domain knowledge about why such events are un-certain in practice. For instance, the exact amount of time arobot will take to traverse some portion of the course may bedependent on factors such as battery life, slippage, or surface

Copyright c© 2015, Association for the Advancement of ArtificialIntelligence (www.aaai.org). All rights reserved.

2

β

δ

α γ

1

Figure 1: Example Robot-T navigation problem. In order forRobot 2 to traverse the path γ → β, it must first wait forRobot 1 to take the path α→ β→ δ.

friction. Our approach more accurately models such uncer-tainty as a probability distribution over possible durations.

Second, the current state of the art in measuring the ‘good-ness’ of a schedule is called flexibility. As explained later,in our Robot-T example, if the robots are given 24 secondsto complete all the activities, flexibility gives a rating of21.4 seconds. Although flexibility has a strong mathematicalbacking, it lacks intuition for practical applications. Whatdoes having 21.4 seconds of flexibility really mean? Doesthis flexibility exist where we need it most? Our hypothesisin this paper is that where flexibility lies is more importantthan how much flexibility there is. Our work introduces anew metric called robustness, which assesses the likelihoodof temporal plan execution success. We argue that this is amore useful measure of plan quality.

More specifically, this paper contributes:

• an application of probabilistic temporal planning that cap-tures empirically derived domain knowledge about dura-tional uncertainty as a probability density function;

• a new metric, called robustness, that computes the like-lihood that a plan can be executed successfully, thus re-framing many scheduling optimization problems; and

• an empirical evaluation that shows that robustness is bet-ter suited than previous metrics for assessing plan qualityin deployed multi-robot coordination problems.

Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence

3239

α

α1A α1

D

[0,24] [0,24]

[1,2]

β

β1A β1

D

[0,24] [0,24]

[0,2]

δ

δ1A δ1

D

[0,24] [0,24]

[1,2]

β

β2A β2

D

[0,24] [0,24]

[1,2]

γ

γ2A

γ2D

[0,24] [0,24]

[1,∞]

[0,∞)

[4.6,∞) [6.6,∞)

[4.6,∞)

Robo

t 2Ro

bot 1

Figure 2: Temporal network corresponding to the Robot-Tproblem. Each robot’s arrival (A) and departure (D) of a lo-cation is encoded as an event (timepoint), and all events mustoccur between 0 and 24 seconds.

BackgroundFigure 2 is a graphical representation of our Robot-T prob-lem from the introduction. Each graph node is labeledwith the location it represents and a subscript that dictateswhether it is an arrival event (A) to or a departure event (D)from that location. The constraint β1

A −α1D ∈ [4.6,∞), which

indicates that Robot 1 must take at least 4.6 seconds to tra-verse the distance from α to β (and has no maximum transittime), is represented by the edge from α1

D to β1A with label

[4.6,∞). The constraint β1D−β

1A ∈ [0, 2] represents that there

is no need for the robot to wait at location β as it passesthrough, whereas the wait time at start/stop locations is atleast 1 second to allow for, e.g., a pick/drop activity. The factthat each event in the Robot-T problem must be completedwithin 24 seconds of the start of the task is represented bythe range of values [0,24] placed above each timepoint. Fi-nally, which agent owns a particular timepoint is indicatedby superscript, where Robot 1’s subproblem is contained inthe top half of the figure, while Robot 2’s subproblem is con-tained in the bottom half of the figure. There is a single inter-agent (external) constraint, shown as a dashed line, whichdictates that Robot 1 must arrive at location δ (i.e., vacatelocation β) before Robot 1 can depart location γ (to headtowards location β). This encoding of the Robot-T problemis expressed as a Simple Temporal Network, a problem for-malism we introduce next.

Simple Temporal NetworksA Simple Temporal Network (STN), S = 〈T,C〉, is com-posed of T — a set {t0, t1, . . . , tn} of n + 1 timepoint vari-ables, each of which represents a real-world event—andC—a set of m temporal difference constraints, each ofwhich restricts the amount of time that can elapse betweentwo events and takes the form ci j := t j − ti ≤ bi j for somereal-value bounded bi j (Dechter, Meiri, and Pearl 1991). Aconvenient graphical representation of an STN is called adistance graph, where each timepoint is represented as avertex and each constraint t j − ti ≤ bi j is represented as anedge from ti to t j with weight bi j. As a notational conve-nience, when ti − t j ≤ b ji also exists, the symmetric pair ofconstraints involving ti and t j can be combined into a singleconstraint t j − ti ∈

[−b ji, bi j

]. The Multiagent STN is one

that establishes how control of an STN is distributed amongagents as local subproblems that are related through a set of

α

α1A

α1D

[0,6.2] [1,7.2]

[1,2]

β

β1A β1

D

[5.6,11.8] [5.6,11.8]

[0,2]

δ

δ1A δ1

D

[12.2,18.4] [13.2,20.4]

[1,2]

β

β2A β2

D

[16.8,23] [17.8,24]

[1,2]

γ

γ2A

γ2D

[0,17.4] [12.2,18.4]

[1,18.3]

[0,6.2]

[4.6,10.8] [6.6,12.8]

[4.6,10.8]

Robo

t 2Ro

bot 1

Figure 3: Minimal (shortest path) version of Figure 2.

external constraints (Boerkoel and Durfee 2013).The STN can be viewed as a constraint satisfaction prob-

lem, where each variable’s domain is defined implicitlythrough constraints with the special zero timepoint, t0,which represents the start of time and so is assumed to beset to 0. Commonly, the domain of each timepoint is deter-mined by a makespan constraint—the deadline between theexecution of the first and last events. From Figure 2, the factthat γ2

A has a domain of [0, 24] is encoded as γ2A−t0 ∈ [0, 24].

Solving an STN requires assigning a time to each timepointsuch that all temporal difference constraints are satisfied. Afeasible STN—one that has a solution—is called consistent.As an alternative to a fully-assigned point solution, shortest-path algorithms such as Floyd-Warshall can be applied to anSTN’s distance graph in O(n3) time to compute a set of im-plied constraints. These implied constraints define minimalintervals, which represent the sound and complete range oftimes during which each event can take place. We displayminimal constraints for our running example in Figure 3.

Quality of Simple Temporal NetworksA key to the popularity of the STN representation of tem-poral plans in real applications is that it naturally sup-ports a flexible times representation (Bresina et al. 2005;Barbulescu et al. 2010; Boerkoel et al. 2013). That is, whenthe minimality of a temporal network is maintained the re-sult is a space of all possible solutions. For instance, asshown in Figure 3, Robot 1 can depart location α at anytime between 1 and 7.2 seconds. Thus, as new constraintsare introduced due to ongoing plan construction, execution,or unexpected disturbances, as long as some of the feasibleschedules are retained, the integrity of the plan as a wholeremains intact. To fully exploit the advantages of the STNwe first need (1) a model to assess where new constraintscome from, and (2) a measure of an STN’s ability to adaptto new constraints. Next we discuss related approaches thataddress each of these needs and that inspire our approach.

STN with Uncertainty A Simple Temporal Network withUncertainty (STNU) partitions the set of constraints of anSTN into a set of requirement links and a set of contin-gent links (Vidal and Ghallab 1996). A requirement link isa constraint that represents hard bounds on the difference be-tween two timepoint variables. A contingent link, ki j, mod-els the fact that the time that will elapse between ti and t j isdictated by an uncontrollable process such as nature. Naturewill choose some value, βi j, between the specified lower andupper bounds, t j − ti = βi j ∈ [−b ji, bi j], which is unknown

3240

to the agent prior to execution. A contingent timepoint isa timepoint t j that appears as the target of exactly one con-tingent link, ki j; non-contingent timepoints are called exe-cutable. In Figure 2, an agent controls how much time itspends at a location, but not the time it takes to navigatebetween locations. Thus, all navigation constraints are con-tingent links (bold). A primary concern of the STNU liter-ature is finding a controllable execution strategy—one thatguarantees success despite the presence of uncertain events,our approach instead computes the likelihood of executionsuccess. While controllability is an extremely useful prop-erty, our approach aims to be useful in environments wherea plan is worth attempting even if success is not guaranteed,e.g., when the expected value of attempting a plan may out-weigh the costs of potential failure.

Probabilistic STNs We originally proposed an extensionto STNUs that recognized that domain knowledge often ex-ists to allow contingent links to be modeled more accuratelyas probability density functions (pdf). We called this modelthe STN with Modeled Uncertainty. However, Tsamardi-nos (2002) independently introduced a similar extension toSTNUs, called the Probabilistic Simple Temporal Net-work (PSTN), and so we have recast our contributions interms of this model for probabilistic temporal planning.

PSTNs augment STNUs’ definition of a contingent linksuch that t j − ti = Xi j where Xi j is a random variablethat is determined by Pi j, a pdf that models the durationof the link from ti to t j; that is Xi j ∼ Pi j. This formula-tion of the PSTN permits the use of any distribution to de-scribe the duration of an activity, including ones that areparametrized, e.g. Pi j(S ). This means that any STNU canbe cast as a PSTN with uniform distributions over specifiedranges of times, making the PSTN strictly more expressivethan the STNU. Further, whenever it is possible to extractbounds from the distributions used in a PSTN, any algo-rithm that establishes controllability for an STNU can stillbe applied to a PSTN. Previous work has primarily focusedon generating execution strategies (assignments of times toexecutable timepoints) that minimizes risk (i.e., maximizesthe likelihood of successful execution) (Tsamardinos 2002;Fang, Yu, and Williams 2014)

Flexibility of Temporal Networks Generally speaking,the more schedules that a representation permits, the bettersituated it is to adapt to new constraints as they arise. Wil-son et al. (2014) formalized the idea of flexibility—a metricthat is designed to quantify the aggregate slack in temporalplans. To parallel our later discussion about robustness, westart by introducing the naıve flexibility metric, flexN . TheflexN of a timepoint is the length of its domain interval, thatis flexN(ti) = bi0−(−b0i), where ti−t0 ∈ [−b0i, bi0]. The flexNof an entire temporal network, then, is the sum of the flexi-bility of its timepoints, flexN(S) =

∑ni=1 flexN(ti). The flexN

of the network in Figure 3 is 6.2 × 9 + 17.4 = 73.2 seconds.This metric is naıve in the sense that it systematically overcounts flexibility. For example, even though both α1

A and α1D

have 6.2 seconds of flexibility, as soon as Robot 1 arrives atlocation α, the flexibility of α1

D shrinks to 1 (since it mustoccur at least 1 and at most 2 seconds after α1

A).

α

α1A α1

D

[0,0] [1,2]

[1,2]

β

β1A β1

D

[6.6,8.6] [8.6,8.6]

[0,2]

δ

δ1A δ1

D

[15.2,16.2] [17.2,17.2]

[1,2]

β

β2A β2

D

[22,23] [24,24]

[1,2]

γ

γ2A

γ2D

[0,15.2] [16.2,17.4]

[1,17.4]

[0,22]

[4.6,7.6] [6.6,7.6]

[4.6,6.8]

Robo

t 2Ro

bot 1

Figure 4: Interval schedule of the Robot-T Problem.

Wilson et al. propose a new flexibility metric that avoidsdouble counting by first decomposing all events into inde-pendent interval schedules. An interval schedule computesa range of times for each timepoint ti,

[t−i , t

+i

], such that

the execution of ti is independent of all other timepoints.This is accomplished by enforcing that t+j − t−i ≤ bi j holdsfor all constraints of the form t j − ti ≤ bi j, which guar-antees that ti can be set to any value within its interval[t−i , t

+i

]without impacting the values to which t j can be set.

This allows us to redefine the flexibility of a timepoint asflex(ti) =

(t+i − t−i

)and the flexibility of a temporal network

as flex(S) =∑n

i=1 flex(ti), eliminating the concern that flexi-bility is over counted.

We present an interval schedule of our ongoing examplein Figure 4. Here, all assignments of times to timepoints thatrespect the interval schedule will lead to a solution, since anytime within α1

A’s domain is guaranteed to be consistent withany time within α1

D’s domain. As a result, the reported flex-ibility of the interval schedule drops from 73.2 seconds to21.4 seconds. While flexibility yields an important measureof scheduling slack, it may not tell the whole story.

A New Robustness MetricHere we introduce a new metric called robustness for assess-ing the quality of probabilistic temporal plans. As motiva-tion for how PSTNs can be applied to real-world problems,consider the empirical distributions we captured (using theexperimental setup described later) for the time it takes ourrobots to travel forward a single unit both with and withoutcompleting a turn, labeled PT and PF respectively in Figure5. Allowing the robot between 5.5 and 7.5 seconds to travelforward, i.e., 2 seconds of flexibility, would lead to plan suc-cess (91.70% of the time) far more often than if the robot had7.5 - 11.5 seconds, or 4 seconds of flexibility (which leadsto success only 6.02% of the time). That is to say, in real-world problems where uncertainty is controlled by a mode-lable process, where flexibility lies may often be more im-portant than how much flexibility there is. In this section, weformalize this idea of using models of durational uncertaintyto compute robustness, a measure of the likelihood of plansuccess.

Robustness of PSTNsThe PSTN problem formulation allows us to consider ques-tions such as: How likely is a given temporal network to

3241

-0.2

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

5 6 7 8 9 10 11 12

PDF

Time (s)

Forward Distribution PFTurn Distribution PT

Figure 5: The empirical distributions for the robot transitiontime of moving forward PF and turning PT one unit.

lead to a successful execution? This is in contrast to previ-ous STNU controllability algorithms, which presuppose (orattempt to compute) a dispatch strategy that guarantees exe-cution success. For many interesting problems, determininga controllable execution strategy may be impossible, yet ifthe likelihood of success is high enough, it might still beworth attempting to execute the plan.

Our new robustness metric leverages domain knowledgesupplied within the PSTN definition to attempt to predictthe success rate of a given temporal plan. Our first attemptat defining robustness, which we call naıve robustness toparallel our earlier discussion about naıve flexibility, is:∏

c∈C

∫ ci j

−c ji

Pi j(x)dx

This metric is the product of the probability mass foundwithin contingent intervals specified in the PSTN. Applyingthis metric to the network in Figure 3 yields:∫ 10.8

4.6PF(x)dx ·

∫ 12.8

6.6PT (x)dx ·

∫ 10.8

4.6PF(x)dx

where PF and PT are the distributions from Figure 5.The definition of a PSTN assumes that the pdfs that de-

termine contingent links are independent from each other.However, as noted in our discussion of the naıve flexibilitymetric, the time intervals within which these activities musttake place are not independent. Thus, our naıve robustnessmetric is overly optimistic for the same reason that the naıveflexibility metric over-counts slack.

As an example, if Robot 1 takes x seconds to travel from γto β, it will have x fewer seconds to travel from β to δ. Sim-ilarly, if it takes Robot 1 y seconds to from β to δ, Robot 2will have (x+y) fewer seconds to complete its maneuver. Soto calculate the overall likelihood of success of a temporalnetwork requires evaluating every possible duration x thatan activity could take and weighting it by the likelihood offuture activities, given that the first activity took time x. Ofcourse, determining the likelihood of each subsequent activ-ity relies recursively on its future activities, and vice-versa.

Previous approaches for predicting the likelihood of suc-cess for PSTNs have dealt with the problem of temporalinter-dependencies by first assigning all executable time-points, which makes contingent links temporally indepen-dent, but does so at the cost of unnecessarily decreasing

Algorithm 1: Sampling-based SimulatorInput: A PSTN S , A number of samples NOutput: A robustness estimate for S

1 failures = 02 foreach i ∈ [1,N] do3 while Not all timepoints have been assigned do4 t = selectNextTimepoint(S )5 if t ∈ ExecutableTimepoints then6 t.assignToEarliestTime();

7 else8 l = S .getContingentLink(t)9 p = l.getPreviousTimepoint(t)

10 x = sample(Pl)11 t.assign(p.getAssignedTime() + x)

12 propagateConstraints(S )13 if S is inconsistent then14 failures += 115 break

16 return 1 − (failures/N)

the likelihood of execution success (Tsamardinos 2002). Tocorrectly capture these inter-dependencies without commit-ting to specific execution times for our running example, wewould need to construct a multi-dimensional integral thatadjusts the domain of integration the same way the networkupdates the time intervals, for instance:∫ 10.8

4.6PF(x)

∫ 12.8−x

6.6PT (y)

∫ 10.8−(x+y)

4.6PF(z)dz dy dx

While this multi-dimensional integral approach seems likea promising way to compute the robustness of a temporalnetwork, it suffers from two fatal flaws. First, generally con-structing and solving such systems of integrals quickly be-comes intractable as the number of activities and complexityof interdependencies grow. Second, as currently expressed,the multi-dimensional integral approach implicitly assumesthat the agent gets to make all of its decisions after naturehas determined the duration of all contingent links, which isof course not true in practice.

Next, we describe two computationally-efficient,simulation-based methods that correct these shortcomingsby approximating the true robustness of temporal networks.

Sampling-based Simulator Approximation Our first ap-proach for approximating the true robustness of a temporalnetwork is presented in Algorithm 1. Our Sampling-basedSimulator works by simulating N plan executions and count-ing the relative number of successes vs. failures. The select-NextTimepoint() function chooses timepoints in the sameorder they would be executed in the plan. This is determinedusing the next first protocol as defined by the dynamic con-trollability dispatcher (DC-dispatch), which always selectstimepoints that are both live (can occur next temporally)and enabled (all incoming links have already been executed)(Morris, Muscettola, and Vidal 2001).

If the selected timepoint is executable (has no incomingcontingent links), we follow the lead of DC-dispatch and as-sign to it the earliest possible time. Otherwise, if the selected

3242

timepoint is contingent, there must be exactly one incomingcontingent link. We extract that link, sample from its prob-ability distribution to get a duration x, and then assign thetimepoint to occur x seconds after the previous timepoint’sassigned time. If any assignment of a time to a timepointvariable leads to an inconsistent network, we count the sim-ulated plan execution as failure. Finally, after N iterations,we return the overall success rate, (1 − (failures/N)).

Whereas the computationally-intractable multi-dimensional integral approach theoretically evaluatesall possible combinations of contingent timepoint assign-ments, our Sampling-based Simulator instead draws samplecombinations according to their underlying likelihood. Italso captures temporal dependencies by propagating thetemporal implications of these samples to approximatethe frequency of success. Further, we faithfully recognizethat how nature chooses to assign contingent timepointsonly becomes known at execution time, and that, in themeantime, agents must still choose how and when toexecute their executable timepoints, which we accomplishby using the DC-dispatch strategy.

Our algorithm draws O(N) samples, where for each sam-ple, each of O(|T |) timepoints is assigned and propagated us-ing Floyd-Warshall, which takes O(|T |3) time. If we assumethat samples have been drawn as a preprocessing routine andcan be retrieved in constant time, the total runtime of our al-gorithm is O(N|T |4).

Representative Simulator Approximation OurSampling-based Simulator may require a large numberof iterations to accurately approximate the robustness of theunderlying temporal network. For cases where efficiencyis a primary concern, we have devised another simulation-based algorithm that approximates the robustness of atemporal network in a single pass. The idea is that, whenwe select a duration for a contingent link, instead of gener-ating a random sample, we instead select a representativesample—one that acts as a proxy for the various ways therandom samples could play out.

Algorithm 2: Representative SimulatorInput: A PSTN SOutput: A robustness estimate for S

1 robustness = 1.02 while Not all timepoints have been assigned do3 t = selectNextTimepoint(S )4 if t ∈ ExecutableTimepoints then5 t.assignToEarliestTime();

6 else7 l = S .getContingentLink(t)8 p = l.getPreviousTimepoint(t)

9 prob =∫ lmax

lminPl(x)dx

10 robustness = robustness × prob11 x = selectTime(Pl, lmin, lmax)12 t.assign(p.getAssignedTime() + x)

13 propagateConstraints(S )

14 return robustness

Our Representative Simulator (Algorithm 2) uses thesame basic dispatching method as our Sampling-based Sim-ulator. The key idea is that as we simulate, we keep a run-ning estimate of the robustness of the network, which is ini-tially set to be 1.0 (i.e., 100% likely). Then, before we as-sign a contingent timepoint, we first calculate the probabilitymass that lies within its current temporal bounds and multi-ply this probability into our running estimate. Next, we usethe selectTime function to select a time that both lies withinthe bounds specified and is representative of the underlyingdistribution. This ensures that we will evaluate a complete,representative simulation. In our implementation of the algo-rithm, we use the expected value to act as the representativesample, but any metric, such as median or mode, could besubstituted. The rationale for our approach is that assigninga timepoint effectively decomposes it from the rest of theschedule, which eliminates temporal dependencies betweencontingent links and so allows us to multiply the probabili-ties together as a running product.

We now demonstrate how Representative Simulatorworks on a small portion of our running Robot-T example.First, it fixes points γ2

A and α1A to their minimum value of 0,

since they have no incoming edges from unassigned time-points. This makes the timepoint α1

D enabled, and becauseits only incoming edge is a requirement edge, it also getsassigned its minimum value of 1. Next the algorithm willcompute the robustness of the edge α1

D → β1A. This edge has

constraints [4.6, 10.8] and so the algorithm will calculate theprobability of success of that edge as:

pα1D,β

1A

=

∫ 10.8

4.6PF(x)dx = 1.0

which happens to capture its entire probability mass. Theexpected value is then computed:

eα1D,β

1A

=

∫ 10.8

4.6

x × PF(x)pα1

D,β1A

dx = 6.12

Next the algorithm sets the timepoint β1A to α1

D + eα1D,β

1A

=

7.12. The algorithm then propagates the new constraints andrepeats this process for all remaining timepoints.

The runtime of this algorithm is O(|T |(|T |3 + I)) where I isthe runtime of the integrator. However, its accuracy dependson how well the choice of representative sample approxi-mates the convolution of distributions.

Empirical EvaluationWith our two new robustness approximations in hand, wenow ask how well these metrics assess the quality of tempo-ral plans in a deployed multi-robot navigation scenario.

Experimental SetupOur experiments involved iRobot Creates performing sim-ple navigation tasks. We placed two brightly colored mark-ers on top of each robot and monitored these markers with aMicrosoft Kinect in order to track the location and orienta-tion of the robots. Using this information, we built a simple

3243

0

0.2

0.4

0.6

0.8

1

18 19 20 21 22 23 24 25 13

14

15

16

17

18

19

20

21

22

23Ro

bust

ness

(%)

Flex

ibili

ty (s

)

Makespan Time (s)

Empirical RobustnessNaive Robustness

Sampling SimulatorRepresentative Simulator

Flexibility

Figure 6: Comparison of our robustness approximationsagainst both experimental data and flexibility for the Robot-T experiment.

control system that enabled the robots to accurately navigatefrom point to point on a grid.1

We focused our efforts on the Robot-T and Robot-Passexperiments. The Robot-Pass experiment involves threerobots and three unique points of coordination.2 For each ofthese experiments, we varied the makespan deadline withinwhich the robots had to complete the entire navigationplan. In all of our experiments, we dispatched robots us-ing the DC-dispatch protocol (described above) and usedDI∆STP (Boerkoel et al. 2013), a distributed algorithmthat exploits multiagent structure as our implementation ofpropagateConstraints(S ). To generate the Empirical Robust-ness curve in the charts below, we ran each multi-robot sce-nario 50 times for each deadline. For each run, we recordedwhether or not the robots successfully completed the navi-gation plan within the allotted time.

For the two approximation algorithms, we used the PFand PT pdfs displayed in Figure 5 as our models of dura-tional uncertainty for the PSTN input to both the Sampling-based Simulator (N = 1000) and Representative Simula-tor. We empirically derived these pdfs by running our threerobots around a one unit square 100 times each, and for eachindependent maneuver, building a histogram using a kernelsmoother. Finally, we also report the output of both our naıverobustness metric and Wilson et al.’s flexibility metric com-puted with PuLP, a Python library that leverages CoinMP -an open source linear programming library.

AnalysisFigures 6 and 7 show each of the aforementioned curves forthe Robot-T and Robot-Pass experiments, respectively. Ingeneral, the increased size and complexity of the Robot Passproblem yielded more variance in our results. In both ex-periments, the shape of our robustness curves more closelymatches the empirically measured success rate than eitherthe flexibility or naıve robustness metrics.

As predicted, the naıve robustness calculation is overlyoptimistic. Surprisingly, both our Sampling-based and Rep-resentative approaches are pessimistic in their predictions

1Control code available upon request.2Video available at: http://tinyurl.com/ocslka7

0

0.2

0.4

0.6

0.8

1

28 30 32 34 36 38 40 30

35

40

45

50

55

60

65

70

Robu

stne

ss (%

)

Flex

ibili

ty (s

)

Makespan Time (s)

Empirical RobustnessNaive Robustness

Sampling SimulatorRepresentative Simulator

Flexibility

Figure 7: Comparison of our robustness approximationsagainst both experimental data and flexibility for the Robot-Pass experiment.

of robustness—the robots succeeded more often in the real-world than our PSTN model would predict. Our preliminaryexplorations indicate that this is due to the fact that, in prac-tice, the underlying distributions are not independent, as isassumed by our PSTN model. In fact, correlations may beboth positive—a robot that is running fast in the first leg ofthe experiment is more likely to run fast in the second leg—and negative—a robot that ‘arrives’ faster by getting accept-ably close to its target (within its spatial tolerance) in thefirst leg may require further adjustment in the second leg.

There are also differences between the shape of our tworobustness approximations. While our sampling-based sim-ulation attempts to directly mimic the real-world scenariousing many sample executions, the representative-basedsampling method attempts to predict this value in one shot.Differences between the two are due to the fact that combin-ing the expected values of various distributions (representa-tive) is not the same as the expectation of their convolution(sampling-based). Flexibility tends to grow linearly with theexperimental deadline, indicating no obvious connection be-tween a plan’s flexibility and chance of success.

The two algorithms also exhibit slightly different run-time behavior. We tested our algorithms on an Intel XeonE5-1603 2.80GHz quadcore processor with 8 GB of RAMrunning Ubuntu 12.04 LTS. The runtime results reported inFigure 8 are the average execution time across 20 execu-tions of each configuration. We found that at low makespans,the Sampling-based simulator can fail early in its simula-tion and so has relatively short runtime. Additionally we seethat the sampling simulator grows linearly with N, each timethe number of samples is doubled the run time doubles. Be-cause the Representative simulator is making a sub-processcall to the Mathematica-based integrator, it does not recievethe same benefits from early failure. Surprisingly, we foundthat the Representative simulator performs similarly to theSampling-based simulator set at N = 1000, which was thesetting we found to perform well. It is important to note thatthe sampling simulator is embarrassingly parallel and thuscan benefit from parallelization, whereas the representativesimulator cannot. Thus, further work is needed to better as-sess the computational trade-offs of these two approaches asthe scale and complexity of problems grows.

3244

0

1

2

3

4

5

6

7

8

18 19 20 21 22 23 24 25

Run

Tim

e (s

)

Makespan (s)

Representative SimulatorSampling (N=2000)Sampling (N=1000)

Sampling (N=500)Sampling (N=250)

Figure 8: Algorithm runtimes for the Robot-T experiment.

Finally, let us consider our earlier hypothesis that hav-ing flexibility in the right places matters more than havingmore flexibility. We tested this hypothesis by approximatingthe robustness of interval schedules for the same Robot-Tproblem presented in Figure 6. As a reminder, the processof computing interval schedules artificially constrains time-points into temporally independent intervals so that flex-bility is not double counted. However, if all that mattersis the total flexibility, we would expect that removing re-dundant slack from the schedule would have no impact onrobustness. We modified Wilson et al.’s approach to com-puting interval schedules, because initially our LP-solverwould return interval schedules that provided no slack toone or more contingent timepoints (which yields a 0% suc-cess rate). Our modification replaced the objective functionof the LP-formulation so that it prioritized contingent time-points (since executable timepoints are chosen by agents andso require no flexibility). The results are shown in Figure 9.

Notice that despite the steady increase in flexibility, therobustness of the PSTN does not continue to increase. Infact, as the makespan increases to beyond 24 seconds, thereis a decrease in robustness. The reason is that even thoughthe size of the intervals increase as flexibility increases, thelocation of the intervals also shifts towards the tail of thedistribution. So even though flexibility is increasing in ab-solute terms, the intervals are capturing less of the probabil-ity mass, thus decreasing robustness. Interestingly, this dataalso appears to refute Wilson et al.’s conjecture that comput-ing an interval schedule has no costs in terms of flexibility.In other words, while flexibility correctly measures schedul-ing slack, it does not provide practical insight into a plan’sadaptability in real world problems. This argues for refram-ing many scheduling optimization problems in terms of ro-bustness rather than flexibility.

ConclusionIn this paper, we employ probablistic temporal planning as away to more precisely capture the nature of scheduling un-certainty within real-world problems. We developed a newmetric for assessing the quality of probabilistic temporalplans called robustness. Our robustness metric utilizes themodel of durational uncertainty to estimate the likelihood ofsuccess of a given temporal plan. Our empirical evaluationdemonstrates that our robustness approximations better esti-

0

0.02

0.04

0.06

0.08

0.1

0.12

0.14

0.16

18 20 22 24 26 28 30 12

14

16

18

20

22

24

26

28

Robu

stne

ss (%

)

Flex

ibili

ty (s

)

Makespan Time (s)

Representative SimulatorFlexibility

Figure 9: Comparison of robustness and flexibility as appliedto an interval schedule version of the Robot-T experiment.

mate plan success rate than previous metrics for determiningthe quality of a schedule. Robustness provides a supplementto, not a replacement for, controllability—robustness pro-vides helpful information for determining whether or not aplan is worth attempting (in expectation) even if agents can-not completely hedge against uncertainty.

This paper provides many future research directions.First, we would like to improve the quality of our ro-bustness approximations and better understand the theoret-ical implications of the assumptions that PSTNs make (pri-marily that the pdfs representing durational uncertainty areindependent). Second, assessing the robustness and flex-ibility of solution methods is hardly a new idea in AI.For instance, both ideas have been explored in the con-text of finite-domain constraint programming for dynamicand uncertain environments (Verfaillie and Jussien 2005;Climent et al. 2014). An interesting future direction inspiredby this literature is exploring whether the ideas of reactiveapproaches, which try to repair a broken schedule with min-imal upheaval, or proactive approaches, which try to useknowledge about possible future dynamics to minimize theireffects, are better suited for the continuous nature of tem-poral planning. Finally, perhaps the most interesting futureresearch direction is in reframing scheduling optimizationproblems in terms of robustness, rather than flexibility. Asdemonstrated in this paper, simply maximizing flexibilityfails to produce robust schedules (see Figure 9). Interestingoptimization problems include computing optimal temporaldecouplings of multiagent schedules (Planken, de Weerdt,and Witteveen 2010), and controllable schedules (Wilson etal. 2014). One challenge is that the efficiency of previousoptimization approaches has relied on the linearity of flexi-bility as a metric, whereas optimizing for robustness requiresaccounting for non-linear distributions; however the payoffis that we will have plans that are more robust to the messi-ness of the real-world.

Acknowledgements

We would like to thank Melissa O’Neill, Susan Martonosi,Jacob Rosenbloom, Priya Donti, and the anonymous review-ers for their helpful feedback.

3245

ReferencesBarbulescu, L.; Rubinstein, Z. B.; Smith, S. F.; and Zim-merman, T. L. 2010. Distributed coordination of mobileagent teams: The advantage of planning ahead. In Proc. ofAAMAS-10, 1331–1338.Boerkoel, J. C., and Durfee, E. H. 2013. Distributed Rea-soning for Multiagent Simple Temporal Problems. Journalof Artificial Intelligence Research (JAIR) 47:95–156.Boerkoel, J. C.; Planken, L. R.; Wilcox, R. J.; and Shah, J. A.2013. Distributed algorithms for incrementally maintainingmultiagent simple temporal networks. In Proc. of ICAPS-13,11–19.Bresina, J.; Jonsson, A. K.; Morris, P.; and Rajan, K. 2005.Activity planning for the Mars exploration rovers. In Proc.of ICAPS-05, 40–49.Climent, L.; Wallace, R. J.; Salido, M. A.; and Barber, F.2014. Robustness and stability in constraint programmingunder dynamism and uncertainty. J. Artif. Intell. Res.(JAIR)49:49–78.Dechter, R.; Meiri, I.; and Pearl, J. 1991. Temporal con-straint networks. In Knowledge Representation, volume 49,61–95.Fang, C.; Yu, P.; and Williams, B. C. 2014. Chance-constrained probabilistic simple temporal problems. InProc. of AAAI-14, 2264–2270.Morris, P.; Muscettola, N.; and Vidal, T. 2001. Dynamiccontrol of plans with temporal uncertainty. In Proc. ofIJCAI-01, 494–502.Planken, L. R.; de Weerdt, M. M.; and Witteveen, C. 2010.Optimal temporal decoupling in multiagent systems. InProc. of AAMAS-10, 789–796.Tsamardinos, I. 2002. A probabilistic approach to robustexecution of temporal plans with uncertainty. In Methodsand Applications of Artificial Intelligence. Springer. 97–108.Verfaillie, G., and Jussien, N. 2005. Constraint solving inuncertain and dynamic environments: A survey. Constraints10(3):253–281.Vidal, T., and Ghallab, M. 1996. Dealing with uncertaindurations in temporal constraint networks dedicated to plan-ning. In Proc. ECAI-96, 48–54.Wilson, M.; Klos, T.; Witteveen, C.; and Huisman, B. 2014.Flexibility and decoupling in simple temporal networks. Ar-tificial Intelligence 214:26–44.

3246

robustness in probabilistic temporal planning · 2015. 5. 11. · figure 2: temporal network...

Documents