projection operator strategies for trajectory optimization: application...

1 / 34

Projection Operator Strategies for Trajectory Optimizati on:Application to Cooperative Planning

John HauserCU Boulder

Introduction

Introduction❖ Minimization ofTrajectory Functionals

❖ Why do TrajectoryOptimization?

The Projection Operatorapproach to TrajectoryFunctional Minimization

A Sampling of SystemInvestigations

Positive Definiteness

Constraints

2 / 34

Minimization of Trajectory Functionals






Constraints

3 / 34

Consider the problem of minimizing a functional

h(x(·), u(·)) :=

∫ T

0

l(x(τ), u(τ), τ) dτ +m(x(T ))

over the set T of bounded trajectories of the nonlinear system

x = f(x, u) , x(0) = x0

Here, x(t) ∈ Rn and u(t) ∈ R

m.

Minimization of Trajectory Functionals






Constraints

3 / 34

Consider the problem of minimizing a functional

h(x(·), u(·)) :=

∫ T

0

l(x(τ), u(τ), τ) dτ +m(x(T ))

over the set T of bounded trajectories of the nonlinear system

x = f(x, u) , x(0) = x0

Here, x(t) ∈ Rn and u(t) ∈ R

m.

We write this constrained problem as

minξ∈T

h(ξ)

where• ξ = (α(·), µ(·)) is a bounded curve with α(·) continuous• ξ ∈ T means α(t) = f(α(t), u(t)) and α(0) = x0

Why do Trajectory Optimization?






Constraints

4 / 34

Well known:● Optimal control may be used to provide stabilization, tracking, etc., for

nonlinear systems(trajs are characteristics of HJB soln)

● Model predictive/receding horizon strategies have been usedsuccessful for a number of nonlinear systems with constraints

Why do Trajectory Optimization?






Constraints

4 / 34

Well known:● Optimal control may be used to provide stabilization, tracking, etc., for

nonlinear systems(trajs are characteristics of HJB soln)

● Model predictive/receding horizon strategies have been usedsuccessful for a number of nonlinear systems with constraints

Also:● Trajectory exploration : What cool stuff can this system do?

✦ capabilities✦ limitations

● Trajectory modeling : Can the trajectories of a (complex) system bemodeled by those of a simpler system?

● Systems analysis : investigate system structure, e.g., controllability

The Projection Operator approach to TrajectoryFunctional Minimization

Introduction

The Projection Operatorapproach to TrajectoryFunctional Minimization❖ Unconstrained (?)Optimal Control

❖ Projection OperatorApproach

❖ Projection Operator

❖ Projection OperatorProperties

❖ Trajectory Manifold

❖ EquivalentOptimization Problems

❖ Projection operatorNewton method



Constraints

5 / 34

Unconstrained (?) Optimal Control

Introduction










Constraints

6 / 34

● The choice of a control trajectory u(·) determines the state trajectoryx(·) (recall that x0 has been specified).

● With such a trajectory parametrization , one obtains so-calledunconstrained optimal control problem

minu(·)

h(x(·;x0, u(·)), u(·))

● Why not just search over control trajectories u(·)?If the system described by f is sufficiently stable, then such a shootingmethod may be effective.

● Unfortunately, the modulus of continuity of the mapu(·) 7→ (x(·), u(·)) is often so large that such shooting iscomputationally useless :

small changes in u(·) may give LARGE changes in x(·)

Projection Operator Approach

Introduction










Constraints

7 / 34

Key idea: a trajectory tracking controller may be used to minimize theeffects of system instabilities, providing anumerically effective , redundant trajectory parametrization .

● Let ξ(t) = (α(t), µ(t)), t ≥ 0, be a bounded curve● Let η(t) = (x(t), u(t)), t ≥ 0, be the trajectory of f determined by

the nonlinear feedback system

x = f(x, u), x(0) = x0,

u = µ(t) +K(t)(α(t)− x) .

● The map

P : ξ = (α(·), µ(·)) 7→ η = (x(·), u(·))

is a continuous, Nonlinear Projection Operator .

Projection Operator Approach

Introduction










Constraints

7 / 34

Key idea: a trajectory tracking controller may be used to minimize theeffects of system instabilities, providing anumerically effective , redundant trajectory parametrization .

● Let ξ(t) = (α(t), µ(t)), t ≥ 0, be a bounded curve● Let η(t) = (x(t), u(t)), t ≥ 0, be the trajectory of f determined by

the nonlinear feedback system

x = f(x, u), x(0) = x0,

u = µ(t) +K(t)(α(t)− x) .

● The map

P : ξ = (α(·), µ(·)) 7→ η = (x(·), u(·))

is a continuous, Nonlinear Projection Operator .● For each ξ ∈ domP, the curve η = P(ξ) is a trajectory.

Note: the trajectory contains both state and control curves.

Projection Operator

Introduction










Constraints

8 / 34

η = P(ξ)

ξ

Projection Operator Properties

Introduction










Constraints

9 / 34

Suppose that f is Cr and that K is bounded andexponentially stabilizes ξ0 ∈ T . Then

● P is well defined on an L∞ neighborhood of ξ0● P is Cr (Frechet diff wrt L∞ norm)● ξ ∈ T if and only if ξ = P(ξ)● P = P ◦ P (projection )

Projection Operator Properties

Introduction










Constraints

9 / 34

Suppose that f is Cr and that K is bounded andexponentially stabilizes ξ0 ∈ T . Then

● P is well defined on an L∞ neighborhood of ξ0● P is Cr (Frechet diff wrt L∞ norm)● ξ ∈ T if and only if ξ = P(ξ)● P = P ◦ P (projection )

● On the finite interval [0, T ], choose K(·) to obtain stability-likeproperties so that the modulus of continuity of P is relatively small .

● On the infinite horizon, instabilities must be stabilized in order toobtain a projection operator; consider x = x+ u.

Trajectory Manifold

Introduction










Constraints

10 / 34

ξ

ξ+ζ

η = P(ξ+ζ)

Theorem: T is a Banach manifold : Every η ∈ T near ξ ∈ T can beuniquely represented as

η = P(ξ + ζ), ζ ∈ TξT

Key: the projection operator DP(ξ) provides the required subspacesplitting . Note: ζ ∈ TξT if and only if ζ = DP(ξ) · ζ

Equivalent Optimization Problems

Introduction










Constraints

11 / 34

Using the projection operator , we see that

minξ∈T

h(ξ) = minξ=P(ξ)

h(ξ)

where

h(x(·), u(·)) =

∫ T

0

l(x(τ), u(τ), τ) dτ +m(x(T ))

Furthermore, definingg(ξ) = h(P(ξ))

for ξ ∈ U with P(U) ⊂ U ⊂ domP, we see that

minξ∈T

h(ξ)

︸︷︷︸

constrained

and minξ∈U

g(ξ)

︸︷︷︸

unconstrained

are equivalent in the sense that

● if ξ∗ ∈ T ∩ U is a constrained local minimum of h,then it is an unconstrained local minimum of g;

● if ξ+ ∈ U is an unconstrained local minimum of g in U ,then ξ∗ = P(ξ+) is a constrained local minimum of h.

Projection operator Newton method

Introduction










Constraints

12 / 34

given initial trajectory ξ0 ∈ T

for i = 0, 1, 2, . . .

redesign feedback K(·) if desired/needed

descent direction ζi = arg minζ∈Tξi

TDh(ξi)·ζ +

12D2g(ξi)·(ζ, ζ) (LQ)

line search γi = arg minγ∈(0,1]

h(P(ξi + γζi))

update ξi+1 = P(ξi + γiζi)

end


Introduction










Constraints

12 / 34


for i = 0, 1, 2, . . .



TDh(ξi)·ζ +

12D2g(ξi)·(ζ, ζ) (LQ)


h(P(ξi + γζi))


end

This direct method generates a descending trajectory sequence inBanach space , with quadratic convergence to SSC local minimizers.


Introduction










Constraints

12 / 34


for i = 0, 1, 2, . . .



TDh(ξi)·ζ +

12D2g(ξi)·(ζ, ζ) (LQ)


h(P(ξi + γζi))


end

This direct method generates a descending trajectory sequence inBanach space , with quadratic convergence to SSC local minimizers.

When D2g(ξi) is not positive definite on TξiT , one may obtain aquasi-Newton descent direction by solving

ζi = arg minζ∈Tξi

TDh(ξi)·ζ +

12q(ξi) · (ζ, ζ)

where q(ξi) is positive definite on TξiT (e.g., an approximation to D2g(ξi))

A Sampling of System Investigations

Introduction



❖ Inverted Pendulum❖ Heisenberg system -Brockett’s nonholonomicintegrator

❖ Raceline optimization

❖ Collision freemaneuvering

❖ PVTOL❖ Optimal control on Liegroups

❖ Submarinemaneuvering


Constraints

13 / 34

Inverted Pendulum

Introduction









Constraints

14 / 34

al t

L2 trajectory exploration: minimize

h(ξ) =

∫ T

0

‖x(τ)−xd(τ)‖2Q/2+‖u(τ)−ud(τ)‖

2R/2 dτ+‖x(T )−xd(T )‖

2P1

/2

over trajectories ξ = (x(·), u(·)) ∈ T .

Driven inverted pendulum system is

ϕ = (g/l) sinϕ− (1/l) cosϕ u

where the control u is taken to be the pivot point lateral acceleration al.Extensions to pendubot.

Heisenberg system - Brockett’s nonholonomic integrator

Introduction









Constraints

15 / 34

min

∫ 1

0

‖u(τ)‖2/2 dτ + ‖x(T )‖2P1/2

x1 = u1

x2 = u2

x3 = x2u1 − x1u2

P1 = diag([10 10 100])

Raceline optimization

Introduction









Constraints

16 / 34

−150 −100 −50 0 50 100 150−80

−60

−40

−20

0

20

40

60

80track

−10

−8

−6

−4

−2

0

2

4

6

8

10

Collision free maneuvering

Introduction









Constraints

17 / 34

PVTOL

Introduction









Constraints

18 / 34

0 5 10 15 20 25 30 35

0

5

10

15

20

desired vs feasible path

desiredρ = 4ρ = 1

y = u1 sinϕ− ǫu2 cosϕz = −u1 cosϕ− ǫu2 sinϕ+ gϕ = u2.

Optimal control on Lie groups

Introduction









Constraints

19 / 34

x17 x18 x19 x20 x21 x22 x23 x24 x25 x26 x27

to exploit the structure of Lie groups and retain second order properties, anew second order geometric derivative was developed

applicable to quantum spin systems and many classical mechanicalsystems including spacecraft (and aircraft)

Submarine maneuvering

Introduction









Constraints

20 / 34

turn to turn maneuverscritical velocity (loss of linear controllability)


Introduction



Positive Definiteness❖ Descent directioncalculation

❖ Derivatives

❖ Computation of D2P

❖D2g LagrangeMultiplier

❖D2g

❖ descent direction LQOCP

Constraints

21 / 34

Descent direction calculation

Introduction




❖ Derivatives



❖D2g


Constraints

22 / 34

Recall that a descent direction may be found using a quadraticapproximation of the cost functional.

Making use of the representation theorem, the trajectory functional is fullydescribed locally by

h(P(ξi + ζ)) = h(ξi) +Dh(ξi) · ζ +1

2D2g(ξi) · (ζ, ζ) + h.o.t.

where ζ ranges over the space of tangent trajectories TξiT .

We thus seek a (Newton) descent direction as the solution of a linearquadratic optimal control problem:

ζi = arg minζ∈Tξi

TDh(ξi)·ζ +

12D2g(ξi)·(ζ, ζ)

Derivatives

Introduction




❖ Derivatives



❖D2g


Constraints

23 / 34

First and second derivatives of g(ξ) = h(P(ξ)) are given by

Dg(ξ) · ζ = Dh(P(ξ)) ·DP(ξ) · ζ

D2g(ξ) · (ζ1, ζ2) =

D2h(P(ξ)) · (DP(ξ) · ζ1, DP(ξ) · ζ2)

+Dh(P(ξ)) ·D2P(ξ) · (ζ1, ζ2)

Derivatives

Introduction




❖ Derivatives



❖D2g


Constraints

23 / 34


Dg(ξ) · ζ = Dh(P(ξ)) ·DP(ξ) · ζ

D2g(ξ) · (ζ1, ζ2) =

D2h(P(ξ)) · (DP(ξ) · ζ1, DP(ξ) · ζ2)

+Dh(P(ξ)) ·D2P(ξ) · (ζ1, ζ2)

When ξ ∈ T and ζi ∈ TξT , they specialize to

Dg(ξ) · ζ = Dh(ξ) · ζ

D2g(ξ) · (ζ1, ζ2) = D2h(ξ) · (ζ1, ζ2) + Dh(ξ) ·D2P(ξ) · (ζ1, ζ2)︸︷︷︸generalizes Lagrange multiplier

Derivatives

Introduction




❖ Derivatives



❖D2g


Constraints

23 / 34


Dg(ξ) · ζ = Dh(P(ξ)) ·DP(ξ) · ζ

D2g(ξ) · (ζ1, ζ2) =

D2h(P(ξ)) · (DP(ξ) · ζ1, DP(ξ) · ζ2)

+Dh(P(ξ)) ·D2P(ξ) · (ζ1, ζ2)

When ξ ∈ T and ζi ∈ TξT , they specialize to

Dg(ξ) · ζ = Dh(ξ) · ζ

D2g(ξ) · (ζ1, ζ2) = D2h(ξ) · (ζ1, ζ2) + Dh(ξ) ·D2P(ξ) · (ζ1, ζ2)︸︷︷︸generalizes Lagrange multiplier

How to compute D2P(ξ) · (ζ1, ζ2) ?

Computation of D2P

Introduction




❖ Derivatives



❖D2g


Constraints

24 / 34

We may use ODEs to calculate D2P(ξ) · (ζ1, ζ2):

η = (x, u) = P(ξ) = P(α, µ)γi = (zi, vi) = DP(ξ) · ζi = DP(ξ) · (βi, νi)ω = (y, w) = D2P(ξ) · (ζ1, ζ2)

η(t) : x(t) = f(x(t), u(t)), x(0) = x0

u(t) = µ(t) +K(t)(α(t)− x(t))

γi(t) : zi(t) = A(η(t))zi(t) +B(η(t))vi(t), zi(0) = 0vi(t) = νi(t) +K(t)(βi(t)− zi(t))

ω(t) : y(t) = A(η(t))y(t) +B(η(t))w(t) +D2f(η(t)) · (γ1(t), γ2(t))w(t) = −K(t)y(t), y(0) = 0

● The derivatives are about the trajectory η = P(ξ)● The feedback K(·) stabilizes the state at each level

D2g Lagrange Multiplier

Introduction




❖ Derivatives



❖D2g


Constraints

25 / 34

Dh(ξ) ·D2P(ξ) · (ζ, ζ) =

∫ T

0

D2l(τ, ξ(τ)) · (D2P(ξ) · (ζ, ζ))(τ) dτ

=

∫ T

0

D2l(τ, ξ(τ)) ·

[I

−K(τ)

] ∫ τ

0

Φc(τ, s)D2f(ξ(s)) · (ζ(s), ζ(s)) ds dτ

=

∫ T

0

∫ T

s

D2l(τ, ξ(τ)) ·

[I

−K(τ)

]Φc(τ, s) dτ D2f(ξ(s)) · (ζ(s), ζ(s)) ds

=

∫ T

0

q(s)T D2f(ξ(s)) · (ζ(s), ζ(s)) ds

where

q(t) = −[A(ξ(t))−B(ξ(t))K(t)]T q(t)− lTx (t) +K(t)T lTu (t), q(T ) = 0

We obtain a stabilized adjoint variable, independent of stationaryconsiderations!

(for nonzero terminal cost, q(T ) = mTx (x(T )))

D2g

Introduction




❖ Derivatives



❖D2g


Constraints

26 / 34

For ξ ∈ T and ζ ∈ TξP, D2g(ξ) · (ζ, ζ) has the form

∫ T

0

(z(τ)v(τ)

)T [Q(τ) S(τ)S(τ)T R(τ)

](z(τ)v(τ)

)dτ + z(T )TP1z(T )

where

W (t) =

[Q(τ) S(τ)S(τ)T R(τ)

]

has elements

wij(t) =∂2l

∂ξi∂ξj(t, ξ(t)) +

n∑

k=1

qk(t)∂2fk∂ξi∂ξj

(ξ(t))

and P1 = ∂2m∂x2 (x(T )).

In fact, W (·) is just the second derivative matrix of the Hamiltonian

H(t, x, u, q) = l(t, x, u) + qT f(x, u)

Again, no stationary considerations.

descent direction LQ OCP

Introduction




❖ Derivatives



❖D2g


Constraints

27 / 34

The descent direction problem is a linear quadratic optimal control problem

min∫ T

0

(a(τ)b(τ)

)T(z(τ)v(τ)

)+

1

2

(z(τ)v(τ)

)T[Q(τ) S(τ)S(τ)T R(τ)

](z(τ)v(τ)

)dτ

+ rT1 z(T ) + z(T )TP1z(T )/2

subj to z = A(t)z +B(t)v, z(0) = 0,

where the cost is, in general, non-convex.

This LQ OCP (with PD R(·)) has a unique solution if and only if

P + ATP + PA− PBR−1BTP + Q = 0, P (T ) = P1

has a bounded solution on [0, T ].[ A = A−BR−1ST , Q = Q− SR−1ST ]

Constraints

Introduction




Constraints

❖ Constraints❖ Barrier FunctionApproach n

❖ Barrier FunctionApproach ∞

❖ Approximate BarrierFunction❖ Approximate BarrierFunctional

❖ Collision Avoidance

28 / 34

Constraints

Introduction




Constraints





29 / 34

Use a barrier function approach to approximate the (local) solution ofconstrained optimal control problems of the form

minimize∫ T

0

l(τ, x(τ), u(τ)) dτ +m(x(T ))

subject to x(t) = f(x(t), u(t)), x(0) = x0

cj(t, x(t), u(t)) ≤ 0, t ∈ [0, T ], a.e.j = 1, . . . , k,

where the data satisfies some reasonable smoothness and convexityproperties.

Approximating OCPs will be unconstrained .

Barrier Function Approach n

Introduction




Constraints





30 / 34

In finite dimensions, a solution to a C2 convex problem

min f(x)s.t. cj(x) ≤ 0, j = 1, . . . , k

is found by solving a sequence of convex problems

minx∈C

f(x)− ǫ∑

j log(−cj(x))

where C = {x ∈ Rn : cj(x) < 0} is the open strictly feasible set.

Barrier Function Approach ∞

Introduction




Constraints





31 / 34

The direct OCP translation is

min

∫ T

0

l(τ, x(τ), u(τ))− ǫ∑

j log(−cj(τ, x(τ), u(τ))) dτ

+ m(x(T ))

s.t. x(t) = f(x(t), u(t)), x(0) = x0

Suppose that at some ǫ0 > 0, this problem possesses a locally optimaltrajectory ξ∗ǫ0 = (x∗

ǫ0(·), u∗ǫ0(·)) that is SSC and that the Hamiltonian is

strongly convex in u.

Then ξ∗ǫ0 is a strictly feasible trajectory (of constrained problem) and theIFT indicates nice dependence on ǫ.

Looks promising ... but guaranteeing strict feasibility during optimizationprocess may be difficult .

Approximate Barrier Function

Introduction




Constraints





32 / 34

For 0 < δ ≤ 1, define the C2 approximate log barrier function

βδ : (−∞,∞) → (0,∞)

βδ(z) =

− log z z > δ

k − 1

k

[(z − kδ

(k − 1)δ

)k

− 1

]− log δ z ≤ δ

where k > 1 is an even integer, e.g., k = 2.

βδ(·) retains many of the important properties of the log barrier function

Similar to z 7→ − log z: for strictly convex proper c : R → R, z 7→ βδ(−c(z))is also strictly convex so that

minx∈C

f(x) + ǫ∑

j βδ(−cj(x))

is a convex problem that has the same solution (x∗ǫ ) provided δ < cj(x

∗ǫ ) for

all j.

Approximate Barrier Functional

Introduction




Constraints





33 / 34

Returning to infinite dimensions, define, for ξ = (α(·), µ(·))

bδ(ξ) =

∫ T

0

∑jβδ(−cj(τ, α(τ), µ(τ))) dτ

and consider unconstrained approximation (to constrained OCP)

minξ∈T

h(ξ) + ǫbδ(ξ)

Note: h(·) + ǫbδ(·) can be evaluated on any curve ξ in X.

Collision Avoidance

Introduction




Constraints





34 / 34

For collision avoidance, the constraint function is not convex.

‖(xi, yi)− (xj , yj)‖2 > r2

Nevertheless, this approach seems to work due since, in radial directions,local convexity does hold.To take into account the notion that being far away is not really better thanmerely being feasible, we replace − log−z by

− log tanh−z

for z < 0. For z ≥ 0, we use −z in place of tanh−z, and use the usualapproximation.

projection operator strategies for trajectory optimization: application...

Documents