Stochastic, Robust, and Adaptive Control
Robert Stengel
Robotics and Intelligent Systems, MAE 345, Princeton
University, 2011
Uncertain dynamic systems
State estimation
Stochastic control
Robust control
Adaptive control
Copyright 2011 by Robert Stengel. All rights reserved. For educational use only.
http://www.princeton.edu/~stengel/MAE345.html
The problem: physical plants have uncertain
Initial conditions
Inputs
Measurements
System parameters or dynamic structure
Design goal: control systems that provide
satisfactory stability and response in the
presence of uncertainty
Systems
with
Uncertainty
Stochastic controller minimizes response to random initial
conditions, disturbances, and measurement errors
Robust controller has fixed gains and structure, and it
minimizes likelihood of instability or unsatisfactory
performance due to parameter uncertainty in the plant
Adaptive controller has variable gains and/or structure, and
it minimizes likelihood of instability or unsatisfactory
performance due to plant parameter uncertainty,
disturbances, and measurement errors
Stochastic, Robust, and Adaptive
Control
Optimal State Estimation
State Estimation
Goals
Minimize effects of measurement error on knowledge of the state
Reconstruct full state from reduced measurement set (r < n)
Average redundant measurements (r > n) to produce estimate of
the full state
Method
Provide optimal balance between measurements and estimates
based on the dynamic model alone
Continuous- or discrete-time implementation
Linear-Optimal
State Estimation
Continuous-time linear dynamic process with random disturbance:
\dot{x}(t) = F(t)x(t) + G(t)u(t) + L(t)w(t)
Measurement with random error:
z(t) = H x(t) + n(t)
Uncertainty model for initial condition, disturbance input, and measurement error:
\bar{x}(t_0) = E[x(t_0)]; \quad P(t_0) = E\{[x(t_0) - \bar{x}(t_0)][x(t_0) - \bar{x}(t_0)]^T\}
\bar{u}(t) = E[u(t)]; \quad U(t) = 0
\bar{w}(t) = 0; \quad W(t) = E\{w(t)\,w^T(\tau)\}
\bar{n}(t) = 0; \quad N(t) = E\{n(t)\,n^T(\tau)\}
Linear-Optimal
State Estimator
(Kalman-Bucy Filter)
Optimal estimate of state:
\dot{\hat{x}}(t) = F(t)\hat{x}(t) + G(t)u(t) + K(t)[z(t) - H\hat{x}(t)], \quad \hat{x}(t_0) = \hat{x}_0
K(t): optimal estimator gain matrix (n x r)
Two parts to the optimal estimator:
Propagation of the expected value of x: F(t)\hat{x}(t) + G(t)u(t)
Least-squares correction to the model-based estimate: K(t)[z(t) - H\hat{x}(t)]
Estimator Gain for the
Kalman-Bucy Filter
Optimal filter gain matrix:
K(t) = P(t) H^T N^{-1}(t)
Matrix Riccati equation for the estimator covariance:
\dot{P}(t) = F(t)P(t) + P(t)F^T(t) + L(t)W(t)L^T(t) - P(t)H^T N^{-1} H P(t), \quad P(t_0) = P_0
Same equations as those that define LQ control gain,
except
Solution matrix, P, propagated forward in time
Matrices and matrix sequences are different
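To make the Riccati/gain machinery concrete, the scalar sketch below integrates the estimator Riccati equation forward in time by Euler steps and forms K = P H^T N^-1. All numerical values are illustrative assumptions, not values from the lecture.

```python
import math

def riccati_step(P, F, LWL, H, Ninv, dt):
    """One Euler step of the scalar estimator Riccati equation:
    Pdot = 2 F P + L W L^T - P H N^-1 H P."""
    Pdot = 2.0 * F * P + LWL - P * H * Ninv * H * P
    return P + dt * Pdot

# Illustrative scalar system: xdot = -x + w, z = x + n, W = 1, N = 0.5
F, LWL, H, Ninv = -1.0, 1.0, 1.0, 2.0
P = 10.0                        # large initial uncertainty P(t0)
for _ in range(20000):          # propagate forward 20 time units
    P = riccati_step(P, F, LWL, H, Ninv, dt=1e-3)
K = P * H * Ninv                # Kalman-Bucy gain K = P H^T N^-1
```

Because the scalar Riccati equation has a stable equilibrium, P settles at the root of 2FP + W - P^2/N = 0, and the gain follows from it; the same forward propagation applies in the matrix case.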
(Figure: comparison of running average and Kalman estimate of velocity from position measurement)
Second-Order Example of Kalman-Bucy Filter
Rolling motion of an airplane:
\begin{bmatrix} \dot{p}(t) \\ \dot{\phi}(t) \end{bmatrix} = \begin{bmatrix} L_p & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} p(t) \\ \phi(t) \end{bmatrix} + \begin{bmatrix} L_{\delta A} \\ 0 \end{bmatrix} \delta A(t) + \begin{bmatrix} L_p \\ 0 \end{bmatrix} p_w(t)
p = roll rate, rad/s; \phi = roll angle, rad
\delta A = aileron deflection, rad; p_w = turbulence disturbance, rad/s
L_p: roll-rate damping and turbulence sensitivity; L_{\delta A}: control effectiveness
Measurement of roll rate and angle:
\begin{bmatrix} p_M(t) \\ \phi_M(t) \end{bmatrix}_k = \begin{bmatrix} p(t) + n_p(t) \\ \phi(t) + n_\phi(t) \end{bmatrix}_k = I_2 x(t) + n(t)
Second-Order Example
of Kalman-Bucy Filter
Covariance extrapolation, with P(t) = \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix}:
\dot{P}(t) = \begin{bmatrix} L_p & 0 \\ 1 & 0 \end{bmatrix} P(t) + P(t) \begin{bmatrix} L_p & 1 \\ 0 & 0 \end{bmatrix} + L_p^2 \sigma_{p_w}^2 \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} - P(t) \begin{bmatrix} \sigma_{p_M}^2 & 0 \\ 0 & \sigma_{\phi_M}^2 \end{bmatrix}^{-1} P(t)
Estimator gain computation:
\begin{bmatrix} k_{11}(t) & k_{12}(t) \\ k_{21}(t) & k_{22}(t) \end{bmatrix} = \begin{bmatrix} p_{11}(t) & p_{12}(t) \\ p_{12}(t) & p_{22}(t) \end{bmatrix} \begin{bmatrix} \sigma_{p_M}^2 & 0 \\ 0 & \sigma_{\phi_M}^2 \end{bmatrix}^{-1}
Kalman-Bucy Filter with
Two Measurements
State estimate with roll rate and angle measurements:
\begin{bmatrix} \dot{\hat{p}}(t) \\ \dot{\hat{\phi}}(t) \end{bmatrix} = \begin{bmatrix} L_p & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \hat{p}(t) \\ \hat{\phi}(t) \end{bmatrix} + \begin{bmatrix} L_{\delta A} \\ 0 \end{bmatrix} \delta A(t) + \begin{bmatrix} k_{11}(t) & k_{12}(t) \\ k_{21}(t) & k_{22}(t) \end{bmatrix} \begin{bmatrix} p_M(t) - \hat{p}(t) \\ \phi_M(t) - \hat{\phi}(t) \end{bmatrix}
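A minimal simulation sketch of this two-measurement filter, with Euler integration of both the covariance and the state estimate. The values of L_p, L_dA, the noise statistics, and the initial conditions are illustrative assumptions, not numbers from the lecture; the measurement is taken noise-free so the estimator's convergence is easy to see.

```python
import numpy as np

# Assumed illustrative roll dynamics and statistics
Lp, LdA = -2.0, 5.0
F = np.array([[Lp, 0.0], [1.0, 0.0]])
G = np.array([LdA, 0.0])
H = np.eye(2)                                      # both p and phi measured
Ninv = np.diag([10.0, 10.0])                       # inverse noise intensities
LWL = np.array([[Lp**2 * 0.1, 0.0], [0.0, 0.0]])   # Lp^2 sigma_pw^2 term

dt = 1e-3
P = np.eye(2)                                      # initial covariance
x_true = np.array([0.5, -0.2])                     # true state, unknown to filter
x_hat = np.zeros(2)
u = 0.1                                            # constant aileron deflection

for _ in range(20000):
    # Covariance extrapolation and gain K = P H^T N^-1
    Pdot = F @ P + P @ F.T + LWL - P @ H.T @ Ninv @ H @ P
    P = P + dt * Pdot
    K = P @ H.T @ Ninv
    z = H @ x_true                                 # noise-free for illustration
    x_hat = x_hat + dt * (F @ x_hat + G * u + K @ (z - H @ x_hat))
    x_true = x_true + dt * (F @ x_true + G * u)
```

The estimation error obeys the stable dynamics (F - KH)e, so x_hat converges to x_true while P settles to its steady-state value.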
State Estimate with
Angle Measurement Only
Covariance extrapolation (H = [0 1]):
\dot{P}(t) = \begin{bmatrix} L_p & 0 \\ 1 & 0 \end{bmatrix} P(t) + P(t) \begin{bmatrix} L_p & 1 \\ 0 & 0 \end{bmatrix} + L_p^2 \sigma_{p_w}^2 \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} - \frac{1}{\sigma_{\phi_M}^2} \begin{bmatrix} p_{12}(t) \\ p_{22}(t) \end{bmatrix} \begin{bmatrix} p_{12}(t) & p_{22}(t) \end{bmatrix}
Gain computation:
\begin{bmatrix} k_{11}(t) \\ k_{21}(t) \end{bmatrix} = \frac{1}{\sigma_{\phi_M}^2} \begin{bmatrix} p_{12}(t) \\ p_{22}(t) \end{bmatrix}
State estimate with roll angle measurement:
\begin{bmatrix} \dot{\hat{p}}(t) \\ \dot{\hat{\phi}}(t) \end{bmatrix} = \begin{bmatrix} L_p & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \hat{p}(t) \\ \hat{\phi}(t) \end{bmatrix} + \begin{bmatrix} L_{\delta A} \\ 0 \end{bmatrix} \delta A(t) + \begin{bmatrix} k_{11}(t) \\ k_{21}(t) \end{bmatrix} [\phi_M(t) - \hat{\phi}(t)]
Stochastic Optimal Control
Deterministic vs.
Stochastic Optimization
Deterministic: the state is defined by a known dynamic process and by
precise input
precise initial condition
precise measurement
J* = J(x*, u*)
Stochastic: the state is defined by a known dynamic process and by
an unknown input
an imprecise initial condition
an imprecise or incomplete measurement
Optimal cost = E{J[x*, u*]}
Linear-Quadratic Gaussian (LQG)
Optimal Control Law
Minimize expected value of cost, subject to uncertainty:
\min_u E(J) \geq E(J*)
Stochastic optimal feedback control law combines the linear-optimal control law with a linear-optimal state estimate:
u^*(t) = -R^{-1} G^T(t) S(t) \hat{x}(t) = -C(t)\hat{x}(t)
where \hat{x}(t) is an optimal estimate of the state perturbation
Certainty equivalence:
Feedback control is computed from the optimal estimate of the state
Stochastic feedback control law is the same as the deterministic control law
z(t) = H x(t) + n(t)
\dot{\hat{x}}(t) = F(t)\hat{x}(t) + G(t)u(t) + K(t)[z(t) - H\hat{x}(t)]
u(t) = -C(t)\hat{x}(t)
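The certainty-equivalent structure can be sketched for a scalar plant: the LQ gain C and the Kalman-Bucy gain K come from two separate steady-state Riccati equations, and the control uses only the state estimate. All numbers below are illustrative assumptions.

```python
import math

# Illustrative scalar plant: xdot = f x + g u + w, z = x + n
f, g = -0.5, 1.0
q, r = 1.0, 1.0        # LQ state and control weights
W, N = 0.2, 0.1        # disturbance and noise spectral densities

# Steady-state control Riccati: 2 f S + q - S^2 g^2 / r = 0
S = (f + math.sqrt(f**2 + q * g**2 / r)) * r / g**2
C = g * S / r          # LQ feedback gain

# Steady-state estimator Riccati: 2 f P + W - P^2 / N = 0
P = (f + math.sqrt(f**2 + W / N)) * N
K = P / N              # Kalman-Bucy gain

# Certainty equivalence: u = -C * x_hat, with x_hat from the filter
dt, x, x_hat = 1e-3, 1.0, 0.0
for _ in range(20000):
    u = -C * x_hat
    z = x                               # noise-free for illustration
    x_new = x + dt * (f * x + g * u)
    x_hat = x_hat + dt * (f * x_hat + g * u + K * (z - x_hat))
    x = x_new
```

By the separation principle the regulator pole (f - gC) and estimator pole (f - K) can be checked independently; here both are stable, so the state and the estimation error both decay.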
Linear-Quadratic-Gaussian
Control of a Dynamic Process
LQG Rolling Mill Control
System Design Example
Maintain desired thickness of
shaped beam
Account for random variations in
thickness/hardness of incoming beam
eccentricity in rolling cylinders
measurement errors
Open- and Closed-Loop Response
http://www.mathworks.com/help/toolbox/control/ug/f0-1004500.html
Stochastic Robust Control
Robust Control System Design
Make closed-loop response insensitive to plant
parameter variations
Robust controller
Fixed gains and structure
Minimize likelihood of instability
Minimize likelihood of unsatisfactory performance
Probabilistic Robust
Control Design
Design a fixed-parameter controller for
stochastic robustness
Monte Carlo Evaluation of competing designs
Genetic Algorithm or Simulated Annealing
search for best design
Representations of Uncertainty
Characteristic equation of the uncontrolled system:
\Delta(s) = |sI - F| = \det(sI - F) = s^n + a_{n-1}s^{n-1} + \ldots + a_1 s + a_0 = (s - \lambda_1)(s - \lambda_2)\ldots(s - \lambda_n) = 0
Uncertainty can be expressed in
Elements of F
Coefficients of \Delta(s)
Eigenvalues
Root Locations for an
Uncertain System
Variation may be represented by
Worst-case, e.g., Upper/lower bounds
Probability, e.g., Gaussian distribution
(Figures: root locations in the s plane under uniform and Gaussian parameter distributions)
Stochastic Root Loci for Second-Order Example
Root distributions are nonlinear functions of parameter distributions
Unbounded parameter distributions always lead to non-zero probability of instability
Bounded distributions may be guaranteed to be stable
Probability of Satisfying a Design Metric
Probability of satisfying a design metric:
d: control design parameter vector [e.g., SA, GA, ]
v: uncertain plant parameter vector [e.g., RNG]
e: binary indicator, e.g., 0: satisfactory, 1: unsatisfactory
H(v): plant
C(d): controller (compensator)
\Pr(d, v) \approx \frac{1}{N} \sum_{i=1}^{N} e[C(d), H(v)]
Design Control System to Minimize
Probability of Instability
Characteristic equation of the closed-loop system:
\Delta_{closed-loop}(s) = |sI - F(v) - G(v)C(d)| = [(s - \lambda_1)(s - \lambda_2)\ldots(s - \lambda_n)]_{closed-loop} = 0
Monte Carlo evaluation of probability of instability
Minimize probability of instability over the control parameters using numerical search:
\min_d \Pr\{\mathrm{Re}(\lambda_i) > 0, \; i = 1, \ldots, n\}
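A minimal Monte Carlo sketch of estimating a probability of instability: take an assumed second-order characteristic equation s^2 + 2*zeta*wn*s + wn^2 with a Gaussian uncertainty on the damping ratio (illustrative numbers, not the benchmark problem). For a second-order polynomial, the Hurwitz condition reduces stability to all coefficients being positive.

```python
import random

random.seed(1)

# Assumed uncertainty: zeta ~ N(0.1, 0.05^2); wn > 0 fixed
N_trials = 100000
unstable = 0
for _ in range(N_trials):
    zeta = random.gauss(0.1, 0.05)
    # Roots of s^2 + 2*zeta*wn*s + wn^2 lie in the left half plane
    # iff all coefficients are positive, i.e., iff zeta > 0 here
    if zeta <= 0.0:
        unstable += 1

pr_instability = unstable / N_trials
```

The estimate's standard error is sqrt(p(1-p)/N), which is why the required number of trials grows as the outcome probability shrinks; the analytic value here is Phi(-2), about 0.023.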
Control Design Example*
Challenge: Design a feedback compensator for a 4th-order
spring-mass system (the plant) whose parameters are
bounded but unknown
Minimize the likelihood of instability
Satisfy a settling time requirement
Don't use too much control
* 1990 American Control Conference Robust Control Benchmark Problem
Uncertain Plant*
Plant dynamic equation
(Figure: two masses m1 and m2 connected by spring k, with control u applied to m1 and output y measured at m2)
\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \\ \dot{x}_4 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -k/m_1 & k/m_1 & 0 & 0 \\ k/m_2 & -k/m_2 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 1/m_1 \\ 0 \end{bmatrix} u + \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1/m_2 \end{bmatrix} w
Plant characteristic equation, with output y = x_2:
\Delta(s) = s^2 \left[ s^2 + \frac{k(m_1 + m_2)}{m_1 m_2} \right] = s^2 (s^2 + \omega_n^2)
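The characteristic equation can be checked numerically at nominal parameters (assuming m1 = m2 = k = 1 purely for illustration): the eigenvalues are a double rigid-body root at the origin plus an undamped pair at +/- j*wn.

```python
import math
import numpy as np

m1 = m2 = k = 1.0
F = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [-k / m1,  k / m1, 0.0, 0.0],
              [ k / m2, -k / m2, 0.0, 0.0]])

eig = np.linalg.eigvals(F)
wn = math.sqrt(k * (m1 + m2) / (m1 * m2))   # oscillatory-pair frequency

# Delta(s) = s^2 (s^2 + wn^2): two roots at s = 0 and a pair at +/- j*wn
```

The double root at the origin is why proportional output feedback alone cannot stabilize the plant, which the next slides discuss.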
Parameter Uncertainties, Root
Locus, and Control Law
Parameters of mass-spring system
Uniform probability density functions for
0.5 < m1, m2 < 1.5
0.5 < k < 2
Effects of parameters on root locations (figure at right)
Single-input/single-output feedback control law:
u(s) = -C(s) y(s)
Design Cost Function
Probability of instability, Pr_i: e_i = 1 (unstable) or 0 (stable)
Probability of settling time exceedance, Pr_ts: e_ts = 1 (exceeded) or 0 (not exceeded)
Probability of control limit exceedance, Pr_u: e_u = 1 (exceeded) or 0 (not exceeded)
Design cost function:
J = a \Pr_i^2 + b \Pr_{ts}^2 + c \Pr_u^2
a = 1; b = c = 0.01
Monte Carlo Evaluation of Probability of Satisfying a Design Metric
Compute v using random number generators over N trials
Required number of trials depends on outcome probability (figure at right)
Search for best d using a genetic algorithm to minimize J
\Pr_k(d, v) \approx \frac{1}{N} \sum_{i=1}^{N} e_k[C(d), H(v)], \quad k = 1, \ldots, 3
J = a \Pr_i^2(d, v) + b \Pr_{ts}^2(d, v) + c \Pr_u^2(d, v)
Stabilization Requires
Compensation
Proportional feedback alone cannot stabilize the system
Feedback of either sign drives at least one root into the right half plane
u(s) = -c\, y(s)
Search-and-Sweep Design of Family of
Robust Feedback Compensators
1) Begin with lowest-order feedback compensator
2) Arrange parameters as binary design vector
3) Genetic algorithm search for best values of the
design vector, i.e., design vector that minimizes J
C_{12}(s) = \frac{a_0 + a_1 s}{b_0 + b_1 s + b_2 s^2} \triangleq C(d)
d = \{a_0, a_1, b_0, b_1, b_2\} \rightarrow d^* = \{a_0^*, a_1^*, b_0^*, b_1^*, b_2^*\}
Search-and-Sweep Design of Family of Robust Feedback Compensators
1) Define next higher-order compensator
2) Optimize over all parameters, including optimal
coefficients in starting population
3) Sweep to satisfactory design or no further improvement
C_{22}(s) = \frac{a_0 + a_1 s + a_2 s^2}{b_0 + b_1 s + b_2 s^2}
d = \{a_0^*, a_1^*, a_2, b_0^*, b_1^*, b_2^*\} \rightarrow d^{**} = \{a_0^{**}, a_1^{**}, a_2^{**}, b_0^{**}, b_1^{**}, b_2^{**}\}
C_{23}(s) = \frac{a_0 + a_1 s + a_2 s^2}{b_0 + b_1 s + b_2 s^2 + b_3 s^3}
C_{33}(s) = \frac{a_0 + a_1 s + a_2 s^2 + a_3 s^3}{b_0 + b_1 s + b_2 s^2 + b_3 s^3}
C_{34}(s) = \frac{a_0 + a_1 s + a_2 s^2 + a_3 s^3}{b_0 + b_1 s + b_2 s^2 + b_3 s^3 + b_4 s^4}
...
Design Cost and Probabilities for Optimal 2nd- to 5th-Order Compensators
(Number of Zeros = Number of Poles)
(Bar chart: cost, probability of instability, probability of settling-time violation x 0.01, and probability of control violation x 0.1 for the 2nd-, 3rd-, 4th-, and 5th-order compensators)
System Identification
Parameter-Dependent
Linear System
\dot{x}(t) = F(p)x(t) + G(p)u(t) + L(p)w(t)
z(t) = H x(t) + n(t)
The linear system contains parameters
What if the parameter vector, p, is unknown?
Dynamic Model for
Parameter Estimation
Augment the state vector to include the original state and the parameter vector:
x_A(t) = \begin{bmatrix} x(t) \\ p(t) \end{bmatrix}
The system model for parameter identification becomes nonlinear because the parameter is contained in the augmented state:
\dot{x}_A(t) = \begin{bmatrix} F(p)x(t) + G(p)u(t) + L(p)w(t) \\ f_p[p(t), w(t)] \end{bmatrix} \triangleq f_A[x(t), p(t), u(t), w(t)]
z(t) = H_A \begin{bmatrix} x(t) \\ p(t) \end{bmatrix} + n(t) = \begin{bmatrix} H & 0 \end{bmatrix} \begin{bmatrix} x(t) \\ p(t) \end{bmatrix} + n(t)
System Identification Using an
Extended Kalman-Bucy Filter
\begin{bmatrix} \dot{\hat{x}}(t) \\ \dot{\hat{p}}(t) \end{bmatrix} = f[\hat{x}(t), \hat{p}(t), u(t)] + K \left\{ z(t) - H \begin{bmatrix} \hat{x}(t) \\ \hat{p}(t) \end{bmatrix} \right\}
Add unknown parameter vector to the estimator state
Extend the state estimator derived for linear systems to account for the nonlinear dynamics
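A sketch of that idea for a scalar plant xdot = a x + u with an unknown constant parameter a: the augmented state is [x, a], the Jacobian of the augmented dynamics is evaluated at the current estimate, and the covariance propagates through the estimator Riccati equation. All values are illustrative assumptions, with a noise-free measurement and a sinusoidal input for persistent excitation.

```python
import numpy as np

a_true, dt = -1.5, 1e-3
x = 1.0                               # true state
xa = np.array([1.0, -0.5])            # augmented estimate [x_hat, a_hat]
P = np.diag([0.1, 1.0])               # parameter initially very uncertain
H = np.array([[1.0, 0.0]])            # only x is measured
Ninv = 10.0                           # inverse measurement-noise intensity
Q = np.diag([0.0, 1e-3])              # small process noise keeps learning alive

for k in range(20000):
    u = np.sin(k * dt)                # persistent excitation
    x += dt * (a_true * x + u)        # true plant
    z = x                             # noise-free measurement for illustration
    xh, ah = xa
    Fj = np.array([[ah, xh], [0.0, 0.0]])   # Jacobian of [a x + u, 0]
    K = (P @ H.T) * Ninv                    # K = P H^T N^-1  (2x1)
    xa = xa + dt * (np.array([ah * xh + u, 0.0]) + K.flatten() * (z - xh))
    P = P + dt * (Fj @ P + P @ Fj.T + Q - K @ (H @ P))
```

With persistent excitation the innovation carries parameter information through the cross-covariance term, so a_hat moves toward a_true while x_hat tracks x.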
Multiple-Model Testing for System
Identification
Create a separate estimator for each hypothetical model:
\dot{\hat{x}}_1(t) = F_1\hat{x}_1(t) + G_1 u(t) + K_1[z(t) - H_1\hat{x}_1(t)]
\dot{\hat{x}}_2(t) = F_2\hat{x}_2(t) + G_2 u(t) + K_2[z(t) - H_2\hat{x}_2(t)]
...
\dot{\hat{x}}_n(t) = F_n\hat{x}_n(t) + G_n u(t) + K_n[z(t) - H_n\hat{x}_n(t)]
Choose the model with minimum error residual:
J_i = \frac{1}{2}\epsilon^T\epsilon = \frac{1}{2}[z(t) - H_i\hat{x}_i(t)]^T[z(t) - H_i\hat{x}_i(t)], \quad i = 1, \ldots, n
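A sketch of multiple-model selection for a scalar plant with three hypothesized values of the dynamic coefficient. A fixed estimator gain, a noise-free measurement, and a sinusoidal input (for persistent excitation) keep the example minimal; all values are illustrative assumptions.

```python
import math

candidates = [-0.5, -1.0, -2.0]   # hypothetical values of F
F_true = -1.0
dt, K = 1e-2, 2.0                 # step size and fixed estimator gain

x_true = 0.0
x_hat = {Fc: 0.0 for Fc in candidates}
J = {Fc: 0.0 for Fc in candidates}   # accumulated (1/2) eps^2

for k in range(5000):
    u = math.sin(k * dt)          # persistent excitation
    z = x_true                    # noise-free measurement for illustration
    for Fc in candidates:
        eps = z - x_hat[Fc]
        J[Fc] += 0.5 * eps * eps * dt
        x_hat[Fc] += dt * (Fc * x_hat[Fc] + u + K * eps)
    x_true += dt * (F_true * x_true + u)

best = min(J, key=J.get)          # model with minimum error residual
```

The mismatched models are forced by the term (F_true - F_c) x, so their residuals keep accumulating, while the correct model's residual stays small; the minimum-residual rule therefore identifies F_true.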
Adaptive Control
Adaptive Control
System Design
Control logic changes to accommodate changes or unknown parameters of the plant
System identification
Gain scheduling
Learning systems
Control law is nonlinear:
u(t) = c[z(t), a, y^*(t)]
c[\;]: control law
x(t): state
z[x(t)]: measurement of state
a: parameters of operating point
y^*(t): command input
Operating Points Within a
Flight Envelope
Dynamic model is a function of altitude and airspeed
Design LTI controllers throughout the flight envelope
Gain Scheduling
Proportional-integral controller with scheduled gains:
u(t) = C_F(a)\, y^* + C_B(a)\, \Delta x + C_I(a) \int \Delta y(t)\, dt \triangleq c[x(t), a, y^*(t)]
Scheduling variables, a: e.g., altitude, speed, properties of chemical process, ...
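The table-lookup side of gain scheduling might look like the sketch below, with hypothetical gain tables indexed by airspeed (none of the numbers come from the lecture, and the state-feedback term C_B(a) dx is omitted for brevity).

```python
# Hypothetical scheduling tables: gains stored at three airspeeds
speeds = [100.0, 200.0, 300.0]      # scheduling variable a (airspeed, m/s)
Kf_table = [2.0, 1.5, 1.0]          # forward gains C_F(a)
Ki_table = [0.5, 0.4, 0.3]          # integral gains C_I(a)

def interp(a, xs, ys):
    """Piecewise-linear table lookup with endpoint clamping."""
    if a <= xs[0]:
        return ys[0]
    if a >= xs[-1]:
        return ys[-1]
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= a <= x1:
            return y0 + (y1 - y0) * (a - x0) / (x1 - x0)

def pi_control(y_cmd, y_integral, a):
    """u = C_F(a) y* + C_I(a) * integral of dy (C_B(a) dx term omitted)."""
    return (interp(a, speeds, Kf_table) * y_cmd
            + interp(a, speeds, Ki_table) * y_integral)
```

Between table points the gains vary linearly with the scheduling variable, so the controller remains an LTI design at each operating point while blending smoothly across the envelope.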
Cerebellar Model
Articulation Controller (CMAC)
Inspired by models of the cerebellum
CMAC: two-stage mapping of a vector input to a scalar output
First mapping: input space to association space
s is fixed; a is binary
s: x -> a (input -> selector vector)
Second mapping: association space to output space
g contains learned weights
g: a -> y (selector -> output)
Example of Single-Input CMAC Association Space
x is in (xmin, xmax)
Selector vector is binary and has N_A elements
Receptive regions of association space map x to a
Analogous to neurons that fire in response to stimulus
N_A = number of receptive regions = N + C - 1 = dim(a)
C = generalization parameter = number of overlapping regions
Input quantization = (xmax - xmin) / N
s: x -> a (input -> selector vector), e.g., a = [0 0 0 1 1 1 0 0]^T
CMAC Output and
Training
CMAC output (i.e., control command) comes from the activated cells of C associative memory layers:
y_{CMAC} = w^T a = \sum_{i=j}^{j+C-1} w_i, \quad j = index of first activated region
Least-squares training of CMAC weights, w (analogous to synapses between neurons):
w_j^{new} = w_j^{old} + \frac{\eta}{C}\left[ y_{desired} - \sum_{i=1}^{C} w_i^{old} \right]
\eta is the learning rate and w_j is an activated cell weight
Localized generalization and training
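A minimal single-input CMAC sketch with the LMS-style training rule spread over the C activated cells. The parameters (N, C, learning rate, target function) are illustrative, not those of the fuel-cell application described later.

```python
import math

N = 50                 # input divisions
C = 4                  # generalization: overlapping receptive regions
NA = N + C - 1         # number of weights (= receptive regions)
xmin, xmax = 0.0, 1.0
w = [0.0] * NA
eta = 0.5              # learning rate

def activated(x):
    """Indices of the C cells activated by input x."""
    j = int((x - xmin) / (xmax - xmin) * N)
    j = min(max(j, 0), N - 1)
    return range(j, j + C)

def output(x):
    """y = w^T a: sum of the activated weights."""
    return sum(w[i] for i in activated(x))

def train(x, y_desired):
    """Spread eta * error equally over the C activated cells."""
    err = y_desired - output(x)
    for i in activated(x):
        w[i] += eta * err / C

# Learn one period of a sine over the input range
for _ in range(200):
    for k in range(N):
        xk = (k + 0.5) / N
        train(xk, math.sin(2.0 * math.pi * xk))
```

Because only C cells update at each step, training is localized: nearby inputs share cells and generalize, while distant inputs are unaffected.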
CMAC Output and Training
In higher dimensions, the association space has dimension dim(x): a plane, cube, or hypercube
Potentially large memory requirements
Granularity (quantization) of output
Variable generalization and granularity
(Figure: 2-dimensional association space with three overlapping layers)
CMAC Control of a Fuel-Cell Pre-Processor (Iwan and Stengel)
(Block diagram: fuel storage feeds a fuel processor (reformer or partial-oxidation reactor, shift reactor, and preferential oxidizer, PrOx); the reformate feeds the fuel cell stack, which, with batteries and power conditioning and motor control, drives the gear motor/generator)
Fuel cell produces electricity for the electric motor
Proton-Exchange Membrane Fuel Cell converts hydrogen
and oxygen to water and electrical power
Steam Reformer/Partial Oxidizer-Shift Reactor converts fuel
(e.g., alcohol or gasoline) to H2, CO2, H2O, and CO. Fuel flow
rate is proportional to power demand
CO poisons the fuel cell and must be removed from the
reformate
Catalyst promotes oxidation of CO to CO2 over oxidation of
H2 in a Preferential Oxidizer (PrOx)
PrOx reactions are nonlinear functions of catalyst,
reformate composition, temperature, and air flow
CMAC/PID Control System for Preferential Oxidizer
(Block diagram: the desired H2 conversion command feeds a hybrid control system in which a CMAC (neural network) controller and a conventional PID controller, with gains scheduled on reformate flow rate, sum their air commands, air_CMAC + air_PID = air_TOTAL, to the PrOx; actual H2 conversion is computed from air_TOTAL, inlet and outlet [H2], flow rate, and sensor dynamics, and the H2 conversion error trains the CMAC. PrOx inputs: reformate flow rate, inlet [CO], inlet coolant temperature)
Summary of CMAC
Characteristics
Inputs and number of divisions:
PrOx inlet reformate flow rate (95)
PrOx inlet cooling temperature (80)
PrOx inlet CO concentration (100)
Output: PrOx air injection rate
Associative layers, C: 24
Number of associative memory cells/weights and layer offsets: 1,276 and [1, 5, 7]
Learning rate, \eta: ~0.01
Sampling interval: 100 ms
Flow Rate and Hydrogen Conversion
of CMAC/PID Controller
H2 conversion command (across PrOx only): 1.5%
Novel data, with (---) and without pre-training
Federal Urban Driving Cycle (FUDS)
Comparison of PrOx Controllers on FUDS

               mean H2    max H2     mean CO    max CO     net H2
               error, %   error, %   out, ppm   out, ppm   output, %
Fixed-Air      0.68       0.87       6.3        28         57.2
Table Look-up  0.13       1.43       6.5        26         57.8
PID            0.05       0.51       7.7        30         58.1
CMAC/PID       0.02       0.16       7.3        26         58.1
Reinforcement Learning
Learn from success and failure
Repetitive trials
Reward correct behavior
Penalize incorrect behavior
Learn to control from a human operator
http://en.wikipedia.org/wiki/Reinforcement_learning
Next Time: Classification of Data Sets
Supplementary Material
Dynamic Models for the
Parameter Vector
Unknown constant: p(t) = constant
\dot{p}(t) = 0; \quad p(0) = p_0; \quad P_p(0) = P_{p_0}
Random p(t) (integrated white noise):
\dot{p}(t) = w(t); \quad p(0) = 0; \quad P_p(0) = P_{p_0}
E[w(t)] = 0; \quad E[w(t)w^T(\tau)] = Q_w \delta(t - \tau)
Linear dynamic system (Markov process):
\dot{p}(t) = A p(t) + B w(t) \triangleq f_p[p(t), w(t)]; \quad w(t) \sim N(0, Q_w)
Inputs for System Identification
Transient inputs
Step or square wave
Impulse or pulse train
Persistent excitation
Random noise
Sinusoidal frequency sweep
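A sinusoidal frequency sweep (linear chirp) for persistent excitation might be generated as below; the start/end frequencies and duration are illustrative assumptions.

```python
import math

def sweep(t, f0=0.1, f1=5.0, T=10.0):
    """Linear chirp: instantaneous frequency rises from f0 to f1 Hz over T s.
    Phase is the integral of 2*pi*f(t) with f(t) = f0 + (f1 - f0) * t / T."""
    k = (f1 - f0) / T
    return math.sin(2.0 * math.pi * (f0 * t + 0.5 * k * t * t))

# Sample it as a persistent-excitation input u(t) for identification
u = [sweep(0.01 * i) for i in range(1000)]
```

Sweeping through a band of frequencies excites all modes in that band, which is what makes the resulting data informative for parameter estimation.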