path following with collision avoidance and velocity constraints for multi-agent systems

Path Following with Collision Avoidanceand Velocity Constraints for Multi-Agent

Systems

Ionela Prodan Sorin Olaru Cristina Stoica Silviu-Iulian Niculescu

SUPELEC Systems Sciences (E3S) - Automatic Control Department,3 rue Joliot Curie, F-91190, Gif-sur-Yvette cedex, France

(e-mail: {Ionela.Prodan, Sorin.Olaru, Cristina.Stoica}@supelec.fr). Laboratory of Signals and Systems, SUPELEC - CNRS,

3 rue Joliot Curie, Gif-sur-Yvette, France(e-mail: {Ionela.Prodan, Silviu.Niculescu}@lss.supelec.fr).

Abstract: This paper deals with avoidance constraints while following an optimal trajectory fora group of agents operating in open space. The basic idea is to use the Model Predictive Control(MPC) technique to solve a real time optimization problem over a finite time horizon. Followinga specified trajectory, the agents move in the same direction and eventually they end up in aparticular formation. Combining the optimization-based control study with the ability of MPCto handle convex and non-convex constraints allows a thorough analysis of the motion controlof a group of agents with linear dynamics subject to state constraints. Avoidance constraintsare also added to the optimization problem when the agents are operating in an environmentwith obstacles. A flat trajectory is planned in the physical open space allowing the agents tomaneuver successfully in such a dynamic environment and to reach a common objective.

Keywords: Multi-agent systems, cooperative control, Model Predictive Control, non-convexconstrains

1. INTRODUCTION

Questions about achieving a formation of a group of agentsand how to ensure that all the agents avoid collision bothamong the group and with the obstacles around themarise when dealing with multi-agent systems Richards andHow (2005). The goal of this paper is to control a set ofsubsystems having independent dynamics while achievinga common objective. The problem is relevant in manyapplications involving the control of cooperative systems(Mesbahi and Egerstedt (2010), Blondel et al. (2005),Olfati-Saber and Murray (2002)). Among the applications,can be cited the characterization of pedestrian behavior inthe crowd (Helbing et al. (2000), Fang et al. (2010)). Sucha characterization is essential for evaluating the safety ofthe social infrastructures. An important property of thesecooperating systems is that the group behavior is not im-posed by one of the agents, this behavior results from thelocal interaction between the agents and their neighbors.For instance, every pedestrian in a crowd knows where theother pedestrians in its neighborhood are heading, but itdoes not know the average heading of all pedestrians inthe crowd.

This paper considers the Model Predictive Control (MPC),a widely used technique in control community due toits ability to handle control and state constraints, whileoffering good performance specifications (Camacho andBordons (2004), Rossiter (2003), Mayne et al. (2000),Maciejowski (2002), Bemporad and Morari (1999)). This

paper deals with a group of agents in a predictive controlcontext, which enables the inclusion of state constraints,both for collision avoidance between the agents and forthe velocity of each agent. The design of vehicle formationthrough the use of MPC is detailed in Dunbar and Murray(2002). Other approaches can be found in aerospace ap-plications, where MPC is applied to spacecraft formationkeeping Manikonda et al. (1999), but no avoidance con-straints are considered. The collision avoidance betweenthe agents is known to be a difficult problem, since certainconstraints require the use of auxiliary binary variables. Inparticular, in Bemporad and Morari (1999), the authorsconsidered such an approach based on the use of auxil-iary binary variables together with MPC, with examplesin hybrid control systems. In Richards and How (2005),the collision and obstacle avoidance are included in thetrajectory planning of spacecraft vehicles, but the velocityconstraints are not taken into account.

The present paper considers a two-dimensional environ-ment for the group of agents, with supplementary non-convex velocity constraints that can also be handled byadding binary variables.

An important contribution to previous work is the decreaseof the complexity of the control design problem. This isobtained by reducing the number of auxiliary binary vari-ables used to reformulate the non-convex state constraintsin a linear form.

The path following problem formulated in a non-convexconstrained predictive control framework is described fromthe standard centralized point of view as a receding hori-zon mixed integer optimization problem. Using the pre-dicted control laws, the agents move in the same directionfollowing a specified trajectory. The imposed state con-straints will enforce a certain safety distance, eventually,the agents ending up in a particular formation. The speci-fied trajectory of the group of agents can be generated us-ing the differential flatness formalism (Van Nieuwstadt andMurray (1998), De Dona et al. (2009)). In De Dona et al.(2009), the use of MPC is combined with the differentialflatness formalism for trajectory generation of nonlinearsystems. In conclusion the goal of our paper is to achievean agent group formation only by imposing constraintson the position and the velocity of the agents, while theyfollow a specified path.

This remaining paper is organized as follows. Section 2introduces the agent dynamics and the reference trajectorygeneration mechanism. The constrained predictive controlproblem is then summarized in. Section 3 deals withthe linear reformulation of the state constraints for thereal-time optimization problem. Section 4 presents theMPC problem, where the generated trajectory is usedby the group of agents for prediction in a centralizedapproach. Based on the information received from theMPC formulation the avoidance and velocity constraintsare taken into account, leading the agents to follow thereference trajectory in a formation which depends onthe geometry of the constraints. In Section 5, illustrativesimulation results are presented. Finally, some conclusionsare drawn in Section 6.

Throughout the paper, the following notations are used.An intersection of finitely many half-spaces, a polytopedenoted as P will be used to describe a safety region foran agent.

(x, y) - position coordinates of an agent(vx, vy) - velocity coordinates of an agent - agent stateNa - number of agentsi - the i-th agentP (i) - polyhedral safety region of the i-th agentP (i) - the complement of the polyhedral safety region P (i)N - prediction horizonVN - cost functionQ 0 - Q is a strictly positive definite matrixc - binary variables {0, 1}

2. PROBLEM FORMULATION

In this section, based on the model of individual agentsthe principles of a prediction based optimization problemis stated such that the group converge to a fix formation.Imposing the specified state constraints the agents will pre-serve a safety distance in-between, thus allowing collisionavoidance both inside the group and with possible obsta-cles. Non-convex velocity constraints can be considered inthe same formulation.

A. Model description

Let us consider a linear system (vehicle, pedestrian oragent in a general form) whose dynamics is modeled bythe following equation:

xyvxvy

=

0 0 1 00 0 0 10 0

m0

0 0 0 m

xyvxvy

+

+

0 00 01m

0

01m

[uxuy

], (1)

where (x, y) are the position coordinates and (vx, vy) thevelocity coordinates of the agent in a two-dimensionalrepresentation. The agent mass is denoted by m, and is the damping factor. By associating the index i to thei-th agent the following model is obtained:

i(t) = Aci(t) +Bcui(t), i = 1 : Na, (2)with the corresponding state and input vectors:i = [x y vx yy]T , ui = [ux uy]T , and Na the numberof agents. A corresponding discrete-time model for theequations (2) is constructed upon a chosen sampling periodTs by considering the time instants tk = kTs:

ik+1 = Aik +Bu

ik, k N, i = 1 : Na, (3)

where k = tk , uk = utk . The pair (A, B) is given by:

A = eAcTs , B = Ts0

eAc(Ts)Bcd

For the collision avoidance problem, let us consider aconvex set, a polytope (in the state space) that describesa safety region around an agent i and also safety limits forthe velocity of an agent i:

P (i) = { Rn : H( i) K} (4)where H Rncn , K Rnc1, and nc is the number ofhyperplanes. The position of the agent i represents thecenter of the region defined by the projection of P (i)on the position subspace of the state space. The feasibleregion in the space of solutions is the complement of thesafety region, denoted by P (i), which can be describedas a union of regions that cover all the space exceptthe polytope P (i). The velocity constraints imposed tothe agent i represent for example safety limits, such asa minimum maneuvering velocity near an obstacle oranother agent. Another example considers a spacecraftformation flying, where each agent has to keep its velocitygrater than a specified value, even if the formation followsa trajectory with a relative velocity inferior to some pre-imposed bounds for each spacecraft.

For sake of completeness, the problem of generating areference trajectory for the linear system (1) is nextsummarized, along the line in Van Nieuwstadt and Murray(1998).

B. Trajectory generation

The idea is to find a trajectory ((t), u(t)) that steers themodel (1) from an initial state x0 to a final state xf ,over a fixed time interval [t0, tf ]. Using the flatness theoryintroduced by Van Nieuwstadt and Murray (1998) thesystem is parameterized in terms of a finite set of variablesz(t) and a finite number of their derivatives:

xi

yi

xf

yf

xo

yo

Fig. 1. A flat trajectory for obstacle avoidance

(t) = (z(t), z(t), , z(q)(t)), (5)u(t) = u(z(t), z(t), , z(q)(t)),

where z(t) = ((t), u(t), u(t), , u(q)(t)) is called theflat output. In order to generate the reference trajectorythe class of polynomial functions is used. Based on theparametrization (5) and imposing boundary constraintsfor the evolution of the differentially flat systems De Donaet al. (2009) one can generate a reference trajectory zref (t)by the resolution of a linear system of equalities. Thereforethe corresponding reference state and input for the system(1) are obtained by replacing the reference flat outputzref (t) with t [t0, tf ] in (5):

ref (t) = (zref (t), zref (t)), (6)

uref (t) = u(zref (t), zref (t)),where t [t0, tf ].The flat trajectory can also be generated to enforce ob-stacle avoidance at the path planning stage. The idea isillustrated in Figure 1. In this framework the obstacles canbe modeled in terms of a convex safety region around eachagent, as in (4). Even if the reference trajectory is gen-erated over the entire interval [t0, tf ], intermediary pointscan be added along the system trajectory in order to avoidobstacles at a specific time subinterval by redesigning ofthe flat trajectory.

Since the reference trajectory is available beforehand,an optimization problem which minimizes the trackingerror for the system can be formulated in a predictivecontrol framework. Consequently the agents must followthe reference trajectory from the initial position to thedesired position, using the available information over afinite time horizon N in the presence of constraints.

C. Constrained Predictive Control

The aim is to find the N-move control sequenceu =

{uk|k, uk+1|k, , uk+N1|k

}that minimizes the

finite horizon quadratic objective function VN (k, u):

VN = Tk+N |kPk+N |k +N1l=1

Tk+l|kQk+l|k+

+N1l=0

uTk+l|kRuk+l|k,

(7)

while respecting the constraints imposed by each agentdynamics (3), and the physical limitations

i P (j), (i, j) N[1,Na] N[1,Na], i 6= j. (8)

Here Q = QT 0, R 0 are the weighting matrices andP = PT 0 defines the terminal cost. A finite horizontrajectory optimization is performed at each sample in-stant, the first component of the resulting control sequenceis effectively applied and the optimization procedure isreiterated using the available measurements based on thereceding horizon principle Mayne et al. (2000).

The constraints (8) describe a non-convex region in thestate-space and thus, the MPC problem (7) can not becasted in the classical LP/QP parametric problem formu-lation. In the following the constraints (8) are reformulatedin a linear form, by introducing a set of auxiliary binaryvariables, which have to be considered as decision variablesin the new MPC formulation. It is worth mentioning thatthe problem can be interpreted as a hybrid system controlproblem Bemporad and Morari (1999).

3. LINEAR CONSTRAINTS REFORMULATION

The constraints for 2 agents are discussed here, the gener-alization to Na agents following the same lines.

Let us consider the global model of any two different agents(i, j) N[1,Na] N[1,Na], i 6= j:[

ik+1jk+1

]=[Ai 00 Aj

] [ikjk

]+[Bi 00 Bj

] [uikujk

](9)

From the point of view of the MPC algorithm, the feasibleregion P (i) is a non-convex region. In order to reformulatethe non-convex constraints in a convex form one has touse mixed integer techniques. By introducing nc additionalbinary variables c {0, 1} one can write:

H( i) K +Mc, c = 1 : nc (10)This linear description gives a natural formulation for theconstrains verification. If c = 1, the right-hand side ofthe inequality is negative and elementwise inferior to theleft-hand side, for a sufficiently large user-defined scalarM > 0. From the optimization point of view the inequalityis inactive in this case and trivially satisfied. If c = 0, thec-th inequality is activated. For collision avoidance, it isrequired that at least one of the half-spaces defining theconstraints in (10) has to be active which is translated bythe additional constraint

ncc=1

c nc 1

Remark: Introducing a large number of constraints in(4) allows a better approximation of the safety regionwhile increasing complexity of the problem by the increaseof the number of binary variables. It is worthwhile toconsider simple candidates for the safety region of theagents (hypercubes or simplices).

For simplicity reasons, in the rest of the paper a rectan-gular shape for the region is considered to be a fair choicein terms of precision and complexity (Fig.2):

Pd =d

2B(xi, yi), (11)

where B(xi, yi) is the ball with norm infinity centered in(xi, yi) and d is a constant which defines the size of thebox.

Figure 2 illustrates that the avoidance constrains can bewritten as:

xi + dxi d xi

yi + d

yi d

yi

xj

yj

i

j

Fig. 2. Approximation of state constraints: the squareapproximates the regions with linear constraints

xj xi + d or xj xi d oryj yi + d or yj yi d. (12)

To translate the avoidance constraints as the complementof the safety region (in order to avoid logical operands)in terms of linear constraints one has to introduce in (12)four binary variables c, c = 1, . . . , 4, leading to:

xi xj d+M1,xi + xj d+M2,yi yj d+M3, yi + yj d+M4,1 + 2 + 3 + 4 3. (13)

Thus a binary variable is associated to each inequality(12). Consequently, a large number of inequalities in thedescription of the safety region will enforce the use of aexceeding number of binary variables, which exponentiallyaffects the complexity of the MPC problem.

A more compact representation can be obtained using onlytwo binary variables c, c = 1, 2:

xi xj d+M(1 + 2),xi + xj d+M(1 1 + 2),yi yj d+M(1 + 1 2),yi + yj d+M(2 1 2). (14)

For any combination of the two binary variables a con-straint from (12) will be activated. With respect to theoriginal system (9) a compact form is the following: 1 0 0 0 1 0 0 01 0 0 0 1 0 0 0

0 1 0 0 0 1 0 00 1 0 0 0 1 0 0

Cpij

[ikjk

]+

+

M MM MM MM M

Dpij

[1

2

]

dd+Md+Md+ 2M

pij

(15)

These constraints have to be used in the classical MPCframework with the values of (Cpij , D

pij ,

pij) as in (15).

The auxiliary binary variables c = [1, 2] have to beconsidered in the optimization problem.

The rectangular region (11) is also considered to define thevelocity constraints for an agent i:

vix vm, orvix vm, or (16)viy vm, orviy vm,

where the constant vm > 0.

Similarly, the non-convex velocity constraints (16) canbe rewritten using binary variables c, c = 1, 2, whichcorrespond in general terms to fixed-obstacle avoidanceconstraints:

vix vm +M(1 + 2),vix vm +M(1 1 + 2),viy vm +M(1 + 1 2),viy vm +M(2 1 2), (17)

For each agent velocity constraints can be added in theclassical MPC framework with the values of (Cvij , D

vij ,

vij),

defined similar with(15).

Globally, the non-convex state constraints (12), (16) aretransposed in a compact form:[

CpijCvij

]ij +

[Dpij 00 Dvij

] [p

v

][pijvij

], (18)

with p and v the auxiliary binary variables introducedfor position and velocity constraints reformulation andij = [iT jT ]T , the global state of any two differentagents (i, j) N[1,Na] N[1,Na], i 6= j.

4. OPTIMIZATION-BASED CONTROL & PATHFOLLOWING

This section presents the centralized MPC problem, wherean optimization is performed to compute the control lawsfor each agent. Based on the global model, a flat trajectoryis planned and, based on the information received from thereal-time predictive control law, the avoidance constraintsare taken into account. This leads the agents to achieve aformation while following the trajectory.

To define the MPC centralized problem, let us consider ina first step the global system defined as:

(t) = Agc(t) +Bgcu(t), (19)

with the corresponding state and input vectors: = [x1 y1 v1x v

1y| |xNa yNa vNax vNay ]T ,

u = [u1x u1y| |uNax uNay ]T ,

and the matrices which describe the global model:Agc = diag(A1, , ANa), Bgc = diag(B1, , BNa).The next step is to compare the measured state and inputvariables with the reference trajectory (ref (t), uref (t))which satisfies the nominal dynamics:

ref (t) = Acref (t) +Bcuref (t), (20)

ref = [ref1 | |refNa ]T , uref = [uref1 | |urefNa ]T .Then from (19) and (20) the global system becomes:

(t) = Agc(t) +Bgcu(t), (21)

with u(t) = u(t) uref (t), (t) = (t) ref (t). The cor-responding discrete time prediction model is the following:

k+1 = Ag k +Bguk, (22)with Ag, Bg the discrete form of Agc , Bgc as described inSection 2. Taking into account the constraints (18) and

the fact that all the agents must follow the given referencetrajectory, the centralized MPC problem is formulated as:

VN (k, uk) = minuk,pk ,vk

VN (k, uk, pk , vk), (23)

subject to

k+l+1|k = Ag k+l|k +Bguk+l|k, l = 0 : N 1[Cp

Cv

]k+l|k +

[Dp 00 Dv

] [pk+l|kvk+i|k

][p

v

](24)

where Ag, Bg, Cp, Cv, Dp, Dv, p, v contain the central-ized structural information of the multi-agent system.Therefore the centralized MPC controller is acting on theglobal system (22) while offering the control inputs foreach agent.

The simulation results show that the agents eventuallyform a certain structure while they follow the given ref-erence trajectory.

5. SIMULATION RESULTS

This section proposes three simulation examples in orderto better illustrate the proposed techniques. The systemdynamics of each agent is given by (1) with the parametersm = 60kg, and = 20Ns/m.Example 1: Based on the model proposed in Section II, thisexample considers the tracking problem for a single agent.The generated trajectory is plotted in blue in Fig.3. Thebehavior of the agent over a time window of 400s, startingfrom an initial position = [15 1 1 5]T is depicted inred in the same figure.Example 2: This example considers three agents followingthe same trajectory (in magenta, Fig.4 ) generated forthe first example. The initial states for the agents are:1 = [15 1 1 5]T , 2 = [7 7 5 10], 3 = [8 8 5 2].The parameters for the collision avoidance constraints (14)are: d = 1, M = 100. In Fig.4, the evolution of the agentformation is represented at three different time instants,all the agents are represented as filled circles and thesafety region for each agent is represented as a square withd = 1. Each square points in the direction of each agentvelocity vector. Good tracking performances for the givenreference trajectory is obtained with a prediction horizonN = 2. In order to avoid the collision and according to theoptimization result, the agents are self-organized (and canbe assimilated with a flocking behavior) into the formationdepicted in Fig.4.Example 3: This example considers the case of threeagents both with collision avoidance (14) and velocity (17)constraints (Fig.6) with d = 1, vmin = 8, M > 0. Fig.5presents the simulation results for the initial states are:1 = [15 1 1 5]T , 2 = [7 7 5 10], 3 = [8 8 5 2].The agents are self-organizing in a triangle formation andthe trajectory of its center of mass is plotted in Fig.6 inmagenta. Fig.5.c illustrates good tracking performances ofthe center of mass (in blue), for a prediction horizon aslow as N = 2. Increasing the prediction horizon leads to abetter tracking of the reference trajectory. This impose atrade-off between complexity (increasing N) and precisionof tracking the given path.

For different initial conditions and tunning parameters ofthe optimization problem, most simulations show that theagents have a regular motion while following the path

(a) (b)

(c)

Fig. 3. The reference trajectory and the time evolution ofone agent, along the path: (a) X-axis, (b) Y-axis, (c)X,Y-axes

in a specific formation. The state constraints are alwayssatisfied. Although for some initial position of the agents,simulations have shown that appears a lack of synchro-nization while following the path in a line formation.Therefore the agents formation proves to be sensitive tothe alignment and this indicates from simulations that thetriangle formation is more stable. This deserves a detailedanalysis when disturbances and noises affect the agentsdynamic and represents one of the current research topics.

Fig. 4. Behavior of three agents in a triangle formation,with position constraints, at different time instances

6. CONCLUSIONS

A centralized constrained MPC formulation for multipleagents that follow a given path, while satisfying collisionavoidance and velocity constraints has been proposed inthis paper. In the path following of multi-agents systemsproblems may appear when dealing with collision avoid-ance constraints, both within the group or with any ob-stacles. In this context, the collision avoidance constraintsdescribe a non-convex region, therefore a set of auxiliarybinary variables are introduced in order to translate the

(a) (b)

(c)

Fig. 5. The reference trajectory and the time evolutionof the center of mass of the triangle formation of 3agents, along the path: (a) X-axis, (b) Y-axis, (c) X,Y-axes

non-convex state constraints into linear inequalities. In asimilar way, constraints on velocity are also imposed andtreated by adding other binary variables. The propertiesof differentially flat systems were used in order to obtaina reference trajectory for the group of agents. Therefore,based on the results provided by the constrained optimiza-tion problem, the agents organize themselves in a specificformation while they follow the given path. The resultswere presented through some illustrative simulations ofseveral examples.

Depending on the size of the global system, a centralizedMPC problem may be too large or may require a largecomputational effort. Therefore, future work will focuson investigating the case where the global system isdecomposed in subsystems, leading to a distributed MPCformulation problem.

Fig. 6. Behavior of three agents in a triangle formation,with position and velocity constraints, at differenttime instances.

7. ACKNOWLEDGEMENTS

The research of Ionela Prodan is supported by the EADSFoundation.

REFERENCES

Bemporad A., Morari M. (1999). Control of systems in-tegrating logic, dynamics, and constraints. Automatica35, 407428

Blondel V., Hendrickx J., Olshevsky A., Tsitsiklis J.(2005). Convergence in multiagent coordination, con-sensus, and flocking. In: 44th IEEE Conference onDecision and Control and European Control Conference,pp. 29963000. Seville, Spain

Camacho E.F., Bordons C. (2004). Model predictive con-trol. Springer Verlag

De Dona J., Suryawan F., Seron M., Levine J. (2009). AFlatness-Based Iterative Method for Reference Trajec-tory Generation in Constrained NMPC. Lecture notesin control and information sciences 384, 325333

Dunbar W., Murray R. (2002). Model predictive control ofcoordinated multi-vehicle formations. In: Proceedingsof the 41th IEEE Conference on Decision and Control,vol. 4, pp. 46314636. Las Vegas, Nevada, USA

Fang Z., Song W., Zhang J., Wu H. (2010). Experimentand modeling of exit-selecting behaviors during a build-ing evacuation. Statistical Mechanics and its Applica-tions 389(4), 815824

Helbing D., Farkas I., Vicsek T. (2000). Simulating dy-namical features of escape panic. Nature 407(6803),487490

Maciejowski J. (2002). Predictive control: with constraints.Pearson education

Manikonda V., Arambel P., Gopinathan M., Mehra R.,Hadaegh F., Inc S., Woburn M. (1999). A model pre-dictive control-based approach for spacecraft formationkeeping and attitude control. In: Proceedings of theAmerican Control Conference, vol. 6. San Diego, Cali-fornia, USA

Mayne D., Rawlings J., Rao C., Scokaert P.O. (2000).Constrained model predictive control: Stability and op-timality. Automatica 36, 789814

Mesbahi M., Egerstedt M. (2010). Graph-theoretic Meth-ods in Multi-agent Networks. Princeton University Press

Olfati-Saber R., Murray R. (2002). Distributed coopera-tive control of multiple vehicle formations using struc-tural potential functions. In: Proceedings of the 15thIFAC World Congress, pp. 346352. Barcelona, Spain

Richards A., How J. (2005). Model predictive control ofvehicle maneuvers with guaranteed completion time androbust feasibility. In: Proceedings of the 24th AmericanControl Conference, vol. 5, pp. 40344040. Portland,Oregon, USA

Rossiter J. (2003). Model-based predictive control: a prac-tical approach. CRC

Van Nieuwstadt M., Murray R. (1998). Real-time trajec-tory generation for differentially flat systems. Interna-tional Journal of Robust and Nonlinear Control 8(11),9951020

path following with collision avoidance and velocity constraints for multi-agent systems

Documents