Theoretical and practical aspects of linear and nonlinear model order reduction techniques

Dmitry Vasilyev

Thesis supervisor: Jacob K White

December 19, 2007

2

Outline

• Motivation
• Overview of existing methods: linear MOR, nonlinear MOR
• TBR-based trajectory piecewise-linear method (nonlinear)
• Modified AISIAD linear reduction method (linear)
• Graph-based model reduction for RC circuits (linear)
• Case study: microfluidic channel
• Conclusions

3

Main motivation for MOR: system-level simulation

System to simulate

Device 1

Device 3

Device 2

Device 10

…

…

Q: How to reduce the cost of simulating the big system?

A: Reduce the complexity of each sub-system, i.e. approximate the input-output behavior of the system by a system of lower complexity.

The goal of MOR in a nutshell

4

Main motivation for MOR: system-level simulation

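Modern processor or system-on-a-chip

• Millions of transistors
• Kilometers of interconnects
• Linear and nonlinear devices

$C(v)\,\frac{dv}{dt} + G(v)\,v = B\,i_{in}$

$EI\,\frac{\partial^4 u}{\partial x^4} - S\,\frac{\partial^2 u}{\partial x^2} = F_{elec} + \int_0^w (p - p_a)\,dy - \rho\,\frac{\partial^2 u}{\partial t^2}$

$\nabla\cdot\left((1 + 6K)\,u^3 p\,\nabla p\right) = 12\mu\,\frac{\partial (pu)}{\partial t}$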

Figures thanks to Mike Chou, Michał Rewienski

5

Model reduction problem

• Reduction should be automatic • Must preserve input-output properties

Many (> $10^4$) internal states

inputs outputs

few (<100) internal states

inputs outputs

6

Differential equation model

Original complex model:

Model can represent:

• Finite-difference spatial discretization of PDEs
• Circuits with linear capacitors and inductors

Need accurate input-output behavior

7

Nonlinear model reduction problem

Requirements for reduced model:

• Want $q \ll n$ (cost of simulation is $q^3$)
• $f^r$ should be fast to compute
• Want $y_r(t)$ to be close to $y(t)$

Original complex model: Reduced model:

8

Linear Model

$\frac{dx}{dt} = Ax + Bu, \quad y = Cx + Du$

A – stable, n x n (large)

x – state

u – vector of inputs

y – vector of outputs

Model can represent:

• Spatial discretization of linear PDEs
• Circuits with linear elements

9

Transfer function of LTI system

Transfer function:

$Y(s) = G(s)\,U(s), \quad G(s) = C(sI - A)^{-1}B + D$

where $Y(s)$ is the Laplace transform of the output and $U(s)$ is the Laplace transform of the input

Matrix-valued rational function of s
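A minimal sketch of evaluating the transfer function of a state-space model at one frequency, using the standard formula $G(s) = C(sI - A)^{-1}B + D$. The function name `transfer_function` and the one-state test system are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def transfer_function(A, B, C, D, s):
    """Evaluate G(s) = C (sI - A)^{-1} B + D at one complex frequency s."""
    n = A.shape[0]
    # Solve (sI - A) X = B instead of forming the inverse explicitly.
    X = np.linalg.solve(s * np.eye(n) - A, B)
    return C @ X + D

# Hypothetical one-state example: dx/dt = -x + u, y = x, so G(s) = 1 / (s + 1).
A = np.array([[-1.0]])
B = np.array([[1.0]])
C = np.array([[1.0]])
D = np.array([[0.0]])
G_dc = transfer_function(A, B, C, D, 0.0)   # response at zero frequency
G_1j = transfer_function(A, B, C, D, 1j)    # response at omega = 1 rad/s
```

For a system with several inputs and outputs the same call returns a matrix, one entry per input-output pair.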

10





11

Linear MOR problem

n – large (thousands)!

Need the reduction to be automatic and preserve input-output properties (transfer function)

q – small (tens)

12

Approximation error

Wide-band applications: model should have small worst-case error, i.e. a small maximal difference over all frequencies ω

13

Approximation error

Narrow-band approximation: need good approximation of the frequency response only near a particular frequency ω

Elmore delay: preserved if the first derivative at zero frequency is matched.
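To make the Elmore-delay remark concrete, the sketch below computes the first two moments of the transfer function at zero frequency, using the expansion $G(s) = \sum_k m_k s^k$ with $m_k = -C A^{-(k+1)} B$. Taking the Elmore delay as $-m_1/m_0$ is the usual definition and is assumed here; the single RC stage is a hypothetical example.

```python
import numpy as np

def moments_at_zero(A, B, C, count):
    """Return m_0..m_{count-1}, where G(s) = sum_k m_k s^k and m_k = -C A^{-(k+1)} B."""
    moments = []
    X = B
    for _ in range(count):
        X = np.linalg.solve(A, X)      # X now holds A^{-(k+1)} B
        moments.append(-(C @ X))
    return moments

# Hypothetical single RC stage with time constant tau: G(s) = 1 / (1 + s * tau).
tau = 2.0
A = np.array([[-1.0 / tau]])
B = np.array([[1.0 / tau]])
C = np.array([[1.0]])
m0, m1 = moments_at_zero(A, B, C, 2)
elmore_delay = (-m1 / m0).item()       # first derivative at s = 0, normalized by the DC gain
```

A reduced model that reproduces $m_0$ and $m_1$ therefore has the same Elmore delay as the original.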

14

Linear MOR methods roadmap

Linear MOR

Projection-based Non projection-based

Most widely used. Will be the central topic of this work.

15

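Projection-based linear MOR

Pick projection matrices V and U (each n x q) such that $V^TU = I$

Approximate the state: $x \approx Uz$ (x has n entries, z has q entries)

The term $Ax$ is replaced by its projection $V^TAUz$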

16

Projection-based linear MOR

$x \approx Uz, \quad V^T\left(\frac{dx}{dt} - Ax - Bu\right) = 0$

Important: reduced system depends only on column spans of V and U

$D = D_r$, preserves response at infinite frequency
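The statement that only the column spans matter can be checked numerically. The sketch below builds the projected system $(V^TAU, V^TB, CU)$, then changes both bases by an invertible q x q matrix T (keeping the spans and the condition $V^TU = I$) and compares transfer functions. All matrices are random, hypothetical test data.

```python
import numpy as np

def project(A, B, C, V, U):
    """Petrov-Galerkin projection: returns (V^T A U, V^T B, C U); assumes V^T U = I."""
    return V.T @ A @ U, V.T @ B, C @ U

def tf(A, B, C, s):
    """Transfer function C (sI - A)^{-1} B (D is omitted because projection leaves it unchanged)."""
    return C @ np.linalg.solve(s * np.eye(A.shape[0]) - A, B)

rng = np.random.default_rng(0)
n, q = 8, 3
A = -np.diag(np.arange(1.0, n + 1)) + 0.05 * rng.standard_normal((n, n))
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))

U = rng.standard_normal((n, q))
V = rng.standard_normal((n, q))
V = V @ np.linalg.inv(U.T @ V)           # enforce V^T U = I

# Change the bases without changing their column spans.
T = rng.standard_normal((q, q)) + 3.0 * np.eye(q)
U2 = U @ T
V2 = V @ np.linalg.inv(T).T              # keeps V2^T U2 = I

s = 0.7j
G1 = tf(*project(A, B, C, V, U), s)
G2 = tf(*project(A, B, C, V2, U2), s)    # same input-output behavior as G1
```

The two reduced models differ only by a change of coordinates in z, which is why their transfer functions coincide.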

17


Linear MOR

• Proper Orthogonal Decomposition methods: $V = U = \{x(t_1), \dots, x(t_q)\}$
• Balancing-based (TBR): $V, U = \mathrm{eig}\{PQ, QP\}$
• Krylov-subspace methods: $V, U = K((s_iI - A)^{-1}, B),\ K((s_iI - A)^{-T}, C^T)$

18


Linear MOR

Projection-based:

• Proper Orthogonal Decomposition methods: $V = U = \{x(t_1), \dots, x(t_q)\}$
• Krylov-subspace methods: $V, U = K((s_iI - A)^{-1}, B),\ K((s_iI - A)^{-T}, C^T)$
• Balancing-based (TBR): will describe next.

Non projection-based

19

TBR idea

LTI SYSTEM: input u(t) → X (state) → output y(t)

P (controllability): Which states are easier to reach?

Q (observability): Which states produce more output?

Reduced model retains most controllable and most observable states

Such states must be both very controllable and very observable

20

Balanced truncation reduction (TBR)

Compute controllability and observability gramians P and Q (~$n^3$):

$AP + PA^T + BB^T = 0, \quad A^TQ + QA + C^TC = 0$

Reduced model keeps the dominant eigenspaces of PQ (~$n^3$):

$PQ\,u_i = \lambda_i u_i, \quad v_i^T PQ = \lambda_i v_i^T$

Reduced system: $(V^TAU, V^TB, CU, D)$

Very expensive. P and Q are dense even for sparse models
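A dense, small-scale sketch of balanced truncation. It uses the square-root variant (factor the gramians, take an SVD of the factor product), which in exact arithmetic yields the same dominant eigenspaces of PQ as the eigenvector formulation. The helper names `psd_factor` and `tbr` and the 10-state test system are assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, svd

def psd_factor(X):
    """Return L with X ~= L L^T for a symmetric positive semidefinite X."""
    w, E = np.linalg.eigh((X + X.T) / 2)
    return E * np.sqrt(np.clip(w, 0.0, None))

def tbr(A, B, C, q):
    """Balanced truncation of order q; returns V, U, Hankel singular values, P and Q."""
    P = solve_continuous_lyapunov(A, -B @ B.T)       # A P + P A^T + B B^T = 0
    Q = solve_continuous_lyapunov(A.T, -C.T @ C)     # A^T Q + Q A + C^T C = 0
    Lp, Lq = psd_factor(P), psd_factor(Q)
    W, hsv, Zt = svd(Lq.T @ Lp)                      # hsv are the Hankel singular values
    scale = 1.0 / np.sqrt(hsv[:q])
    U = Lp @ Zt[:q].T * scale                        # right basis: P Q U = U diag(hsv^2)
    V = Lq @ W[:, :q] * scale                        # left basis, scaled so that V^T U = I
    return V, U, hsv, P, Q

# Hypothetical test system: 10-state tridiagonal chain, input at node 1, output at node 2.
n, q = 10, 3
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
B = np.zeros((n, 1)); B[0, 0] = 1.0
C = np.zeros((1, n)); C[0, 1] = 1.0
V, U, hsv, P, Q = tbr(A, B, C, q)
Ar, Br, Cr = V.T @ A @ U, V.T @ B, C @ U             # reduced system (D is unchanged)
```

Both Lyapunov solves and the SVD are dense operations of cubic cost in n, which is the expense the slide refers to.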

21

TBR benefits

• Guaranteed stability
• In practice provides more reduction than Krylov
• H-infinity error bound => ideal for wide-band approximations

[Figure: Hankel singular values]

22


Linear MOR

Non projection-based:

• Hankel-optimal MOR: twice better error bound than TBR [Glover ’84]
• Singular perturbation approximation: match at zero frequency instead of infinity [Liu ‘89]
• Transfer function fitting methods: promising topic of ongoing research [Sou ‘05]

23





24

Nonlinear MOR framework

Consider original (large) system:

Projection of the nonlinear operator f(x): substitute $x \approx Uz$ and project the residual onto $V^T$

Problem: evaluation of $V^Tf(Uz)$ is still expensive

25

Nonlinear MOR framework

Problem: evaluation of $V^Tf(Uz)$ is still expensive

Two solutions:

• Use Taylor series of f
• Use TPWL approximation

26

Taylor series for nonlinear MOR




• Accurate only near expansion point or for weakly nonlinear systems
• Storing of dense tensors is expensive; limits the series to orders no more than 3.

27

Nonlinear MOR framework




Will be discussed next

28

Trajectory piecewise linear (TPWL) approximation of f( ) [Rewieński, 2001]

[Figure: training trajectory through linearization points $x_0, x_1, x_2, \dots, x_s$ and a simulating trajectory; $w_i(x)$ is zero outside the circle]

$f^{TPWL}(x) = \sum_{i=0}^{s} w_i(x)\,\bigl(f(x_i) + A_i\,(x - x_i)\bigr)$

$\sum_{i=0}^{s} w_i(x) = 1$
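A minimal sketch of evaluating a TPWL approximation: a weighted sum of linearizations $f(x_i) + A_i(x - x_i)$ with weights that add up to one. The exponential weighting in `tpwl_weights` is an assumed smooth choice; it only approximates the compactly supported weights sketched on the slide. The two-state function is a hypothetical example.

```python
import numpy as np

def tpwl_weights(x, points, beta=25.0):
    """Normalized weights w_i(x) >= 0 with sum_i w_i(x) = 1 (assumed exponential form)."""
    d = np.linalg.norm(points - x, axis=1)
    d_min = d.min()
    if d_min == 0.0:
        w = (d == 0.0).astype(float)          # x coincides with a linearization point
    else:
        w = np.exp(-beta * d / d_min)         # the nearest point dominates
    return w / w.sum()

def f_tpwl(x, points, f_values, jacobians, beta=25.0):
    """Weighted combination of the linearizations f(x_i) + A_i (x - x_i)."""
    w = tpwl_weights(x, points, beta)
    pieces = [fi + Ai @ (x - xi) for xi, fi, Ai in zip(points, f_values, jacobians)]
    return sum(wi * p for wi, p in zip(w, pieces))

# Hypothetical 2-state example: f(x) = -x - x**3 (elementwise), linearized at 3 points.
f = lambda x: -x - x**3
jac = lambda x: np.diag(-1.0 - 3.0 * x**2)
points = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.5]])
f_values = [f(p) for p in points]
jacobians = [jac(p) for p in points]
approx = f_tpwl(np.array([0.5, 0.0]), points, f_values, jacobians)
```

Because the weights sum to one, the approximation is exact at every linearization point and for any function that is already linear.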

29

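Projection and TPWL approximation yields efficient $f^r(\,)$

$A_i^r = V^T A_i\, U$, where $A_i$ is n x n, $V^T$ is q x n, U is n x q, and $A_i^r$ is q x q (reduced vectors are q x 1)

Evaluating $f^r_{TPWL}(\,)$ requires only $O(sq^2)$ operations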

30

TPWL approximation of f( ). Extraction algorithm

1. Compute $A_1$
2. Obtain $V_1$ and $U_1$ using linear reduction for $A_1$
3. Simulate training input, collect and reduce linearizations: $A_i^r = V_1^T A_i U_1$, $f^r(x_i) = V_1^T f(x_i)$

[Figure: non-reduced state space with the initial system position $x_0$ and the training trajectory through $x_0, x_1, x_2, \dots, x_s$]

31





32

The matter of this contribution

What are projection options for TPWL?

• Krylov-subspace methods: used in the original work [Rewienski ‘02]
• Balanced-truncation method: can we use it?

33

Example problem

Linearized system has non-symmetric, indefinite Jacobian

RLC line

34

Numerical results – nonlinear RLC transmission line

System response for input current i(t) = (sin(2π/10)+1)/2

[Figure: voltage at node 1 [V] versus time [s]; curves: full linearized model, N=800; full nonlinear model, N=800; TPWL model, q=4, TBR basis; TPWL model, q=30, Krylov basis. Inset: training input and testing input]

35

Numerical results – RLC transmission line

[Figure: error in transient $\|y_r - y\|_2$ versus order of the reduced model; curves: TBR TPWL model, Krylov TPWL model]

TBR-based TPWL beats Krylov-based

4th-order TBR TPWL reaches the limit of TPWL representation

36

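Micromachined switch example

Model description [Hung ‘97]:

$\hat{E}I\,\frac{\partial^4 u}{\partial x^4} - S\,\frac{\partial^2 u}{\partial x^2} = F_{elec} + \int_0^w (p - p_a)\,dy - \rho\,\frac{\partial^2 u}{\partial t^2}$

$\nabla\cdot\left((1 + 6K)\,u^3 p\,\nabla p\right) = 12\mu\,\frac{d(pu)}{dt}$

• Finite-difference model of order 880
• Non-symmetric indefinite Jacobian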

37

TPWL-TBR results – MEMS switch example

[Figure: errors in transient $\|y_r - y\|_2$ versus order of reduced system; curves: TBR TPWL model, Krylov TPWL model]

Odd order models unstable!

Even order models beat Krylov

Why???

38

Explanation of even-odd effect – Problem statement

Consider two LTI systems:

Initial: → TBR reduction → Projection basis $V$

Perturbed: → TBR reduction → Projection basis $\tilde{V}$

Define our problem: How does a perturbation in the initial system affect the TBR projection matrices?

39

Perturbation behavior of TBR basis is similar to symmetric eigenvalue problem

Eigenvectors of $M_0$: $e_1^0, e_2^0, \dots, e_N^0$

Eigenvectors of $M_0 + \Delta$: $e_k = \sum_{i=1}^{N} c_i^k\, e_i^0$

Mixing of eigenvectors (assuming small perturbations):

$c_i^k = \frac{(e_i^0)^T \Delta\, e_k^0}{\lambda_k^0 - \lambda_i^0}, \quad k \neq i$

$c_i^k$ is large when $\lambda_i^0 \approx \lambda_k^0$
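A small numerical check of the mixing effect, using the standard first-order perturbation formula $c_i^k = (e_i^0)^T \Delta\, e_k^0 / (\lambda_k^0 - \lambda_i^0)$ for a symmetric matrix. The 3 x 3 matrix with two close eigenvalues and the perturbation size are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical symmetric matrix: eigenvalues 1.00 and 1.01 are close, 3.00 is far away.
lam0 = np.array([1.0, 1.01, 3.0])
M0 = np.diag(lam0)                       # unperturbed eigenvectors are the unit vectors
E0 = np.eye(3)
Delta = 1e-4 * (np.ones((3, 3)) - np.eye(3))   # small symmetric perturbation

def mixing_coefficients(k):
    """First-order c_i^k = (e_i^0)^T Delta e_k^0 / (lam_k^0 - lam_i^0) for i != k."""
    c = np.zeros(3)
    for i in range(3):
        if i != k:
            c[i] = E0[:, i] @ Delta @ E0[:, k] / (lam0[k] - lam0[i])
    return c

c_pred = mixing_coefficients(0)

# Compare with the exact eigenvector of M0 + Delta that continues the first unit vector.
_, E = np.linalg.eigh(M0 + Delta)
e_k = E[:, 0] * np.sign(E[0, 0])         # remove the sign ambiguity of the eigenvector
c_exact = E0.T @ e_k
```

The coefficient towards the neighbouring eigenvalue is about two hundred times larger than the one towards the distant eigenvalue, although both perturbation entries have the same size.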

40

Hankel singular values, MEMS beam example

[Figure: Hankel singular value (log scale, $10^{-6}$ to $10^{-3}$) versus # of the Hankel singular value]

Singular values are arranged in pairs! This is the key to the problem.

41

Explaining even-odd behavior

The closer Hankel singular values lie to each other, the more the corresponding eigenvectors of V tend to intermix!

$c_i^k = \frac{(e_i^0)^T \Delta\, e_k^0}{\lambda_k^0 - \lambda_i^0}, \quad k \neq i$

Analysis implies a simple recipe for using TBR. Pick the reduced order to ensure that:

• Remaining Hankel singular values are small enough
• The last kept and the first removed Hankel singular values are well separated

Helps to ensure that linearizations are stable
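The recipe can be written as a short selection routine. This is a sketch under assumed thresholds: the function name `choose_tbr_order`, the tail tolerance, and the gap ratio are hypothetical parameters, and the sample Hankel singular values are made up to mimic the paired pattern.

```python
def choose_tbr_order(hsv, tail_tol, min_gap_ratio):
    """Smallest q whose discarded tail is small and whose cut falls in a clear gap.

    hsv must be sorted in decreasing order; returns None if no order qualifies.
    """
    for q in range(1, len(hsv)):
        tail_small = sum(hsv[q:]) <= tail_tol
        well_separated = hsv[q - 1] >= min_gap_ratio * hsv[q]
        if tail_small and well_separated:
            return q
    return None

# Hypothetical Hankel singular values arranged in pairs.
hsv = [1.0, 0.9, 0.1, 0.09, 0.001, 0.0009]
order_loose = choose_tbr_order(hsv, tail_tol=0.5, min_gap_ratio=2.0)
order_tight = choose_tbr_order(hsv, tail_tol=0.1, min_gap_ratio=2.0)
```

With these numbers the routine selects orders 2 and 4: order 3 already meets the tighter tail tolerance but is skipped because it would split a pair.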

42

Summary

• We used TBR-based linear reduction procedure to generate TPWL reduced models
• Order reduced 5 times while maintaining comparable accuracy with Krylov TPWL method (efficiency improved 125 times!)
• Simple recipe found which helps to ensure stability.

43





44

Balanced truncation reduction (TBR)

Compute controllability and observability gramians P and Q (~$n^3$):

$AP + PA^T + BB^T = 0, \quad A^TQ + QA + C^TC = 0$

Reduced model keeps the dominant eigenspaces of PQ (~$n^3$):

$PQ\,u_i = \lambda_i u_i, \quad v_i^T PQ = \lambda_i v_i^T$

Reduced system: $(V^TAU, V^TB, CU, D)$

Very expensive. P and Q are dense even for sparse models

45

Most reduction algorithms effectively separately approximate dominant eigenspaces of P and Q:

• Arnoldi [Grimme ‘97]: $U = \mathrm{colsp}\{A^{-1}B, A^{-2}B, \dots\}$, $V = U$, approx. $P_{dom}$ only
• Padé via Lanczos [Feldman and Freund ‘95]:
  $\mathrm{colsp}(U) = \{A^{-1}B, A^{-2}B, \dots\}$, approx. $P_{dom}$
  $\mathrm{colsp}(V) = \{A^{-T}C^T, (A^{-T})^2C^T, \dots\}$, approx. $Q_{dom}$
• Frequency domain POD [Willcox ‘02], Poor Man’s TBR [Phillips ‘04]:
  $\mathrm{colsp}(U) = \{(j\omega_1I - A)^{-1}B, (j\omega_2I - A)^{-1}B, \dots\}$, approx. $P_{dom}$
  $\mathrm{colsp}(V) = \{(j\omega_1I - A)^{-T}C^T, (j\omega_2I - A)^{-T}C^T, \dots\}$, approx. $Q_{dom}$

However, what matters is the product PQ
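As an illustration of the Arnoldi-type option, the sketch below builds an orthonormal basis of $\mathrm{span}\{A^{-1}B, \dots, A^{-q}B\}$ with Gram-Schmidt, projects with $V = U$, and compares the first q moments at zero frequency. The helper names and the random test system are assumptions; this is a simplified single-input version, not a production Arnoldi code.

```python
import numpy as np

def krylov_basis(A, B, q):
    """Orthonormal basis of span{A^-1 B, A^-2 B, ..., A^-q B} for a single input."""
    n = A.shape[0]
    U = np.zeros((n, q))
    v = np.linalg.solve(A, B[:, 0])
    for j in range(q):
        for _ in range(2):                       # two Gram-Schmidt passes for robustness
            v = v - U[:, :j] @ (U[:, :j].T @ v)
        U[:, j] = v / np.linalg.norm(v)
        v = np.linalg.solve(A, U[:, j])          # next Krylov direction
    return U

def moments(A, B, C, count):
    """Moments at s = 0: m_k = -C A^{-(k+1)} B."""
    out, X = [], B
    for _ in range(count):
        X = np.linalg.solve(A, X)
        out.append((-(C @ X)).item())
    return out

rng = np.random.default_rng(1)
n, q = 20, 4
A = -np.diag(np.arange(1.0, n + 1)) + 0.05 * rng.standard_normal((n, n))
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))

U = krylov_basis(A, B, q)
Ar, Br, Cr = U.T @ A @ U, U.T @ B, C @ U         # V = U (orthogonal projection)
m_full = moments(A, B, C, q)
m_red = moments(Ar, Br, Cr, q)                   # agrees with m_full up to round-off
```

The basis is built from B only, so it carries information about controllability but none about the output matrix C.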

46

RC line (symmetric circuit)

Symmetric Jacobian, $B = C^T$, $P = Q$

All controllable states are observable and vice versa

V(t) – input, i(t) – output

47

RLC line (nonsymmetric circuit)

P and Q are no longer equal!

By keeping only mostly controllable and/or only mostly observable states, we may not find dominant eigenvectors of PQ

Vector of states:

48

Lightly damped RLC circuit

Exact low-rank approximations of P and Q of order < 50 lead to PQ ≈ 0

$R = 0.008$, $L = 10^{-5}$, $C = 10^{-6}$, $N = 100$

$y(t) = i_1$

49

AISIAD model reduction algorithm

Idea of AISIAD approximation: approximate eigenvectors using power iterations:

$X_i = (PQ)\,U_i, \quad U_{i+1} = \mathrm{qr}(X_i)$ (“iterate”)

$U_i$ converges to dominant eigenvectors of PQ

Need to find the product $(PQ)\,U_i$. How?
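The power iteration itself is easy to sketch when P and Q are available explicitly, which is exactly what a large-scale method cannot afford. The gramians below are hypothetical: they are constructed so that PQ has known eigenvectors (the columns of T), which lets the result be checked.

```python
import numpy as np

def dominant_eigenspace(P, Q, q, iterations=60, seed=0):
    """Orthogonal (subspace) power iteration: X_i = (PQ) U_i, U_{i+1} = qr(X_i)."""
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((P.shape[0], q)))
    for _ in range(iterations):
        X = P @ (Q @ U)            # exact product; AISIAD approximates this step
        U, _ = np.linalg.qr(X)
    return U

# Hypothetical gramians built so that PQ = T diag(lam) T^{-1} with known eigenvectors T.
rng = np.random.default_rng(2)
n, q = 6, 2
T = np.eye(n) + 0.2 * rng.standard_normal((n, n))
lam = np.array([10.0, 5.0, 1.0, 0.5, 0.1, 0.05])
T_inv = np.linalg.inv(T)
P = T @ T.T                                  # symmetric positive definite
Q = T_inv.T @ np.diag(lam) @ T_inv           # symmetric positive definite
U = dominant_eigenspace(P, Q, q)

# Component of the true dominant eigenvectors lying outside span(U).
residual = T[:, :q] - U @ (U.T @ T[:, :q])
```

The convergence rate is set by the ratio of the first discarded to the last kept eigenvalue of PQ, here 1/5 per iteration.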

50

Approximation of the product $U_{i+1} = \mathrm{qr}(PQU_i)$, AISIAD algorithm

$V_i \approx \mathrm{qr}(QU_i)$: approximate using solution of Sylvester equation

$U_{i+1} \approx \mathrm{qr}(PV_i)$: approximate using solution of Sylvester equation

51

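More detailed view of AISIAD approximation

Right-multiply the Lyapunov equation $AP + PA^T + BB^T = 0$ by $V_i$:

$A X + X H + M = 0$

where $X = PV_i$ is n x q, $H = V_i^TA^TV_i$ is q x q, and $M = P(I - V_iV_i^T)A^TV_i + BB^TV_i$ is n x q

(original AISIAD)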

52

Modified AISIAD approximation

$A X + X H + M = 0$, where H is q x q and M is n x q

Approximate! M is computed with $\hat{P}$ in place of P

53

Modified AISIAD approximation

$A X + X H + M = 0$, where H is q x q and M is n x q

Approximate! M is computed with $\hat{P}$ in place of P

We can take advantage of various methods which approximate P and Q

54

Specialized Sylvester equation

$A X + X H = -M$

where A is n x n, X and M are n x q, and H is q x q

Need only column span of X

55

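Solving Sylvester equation

Schur decomposition of H: $H = Z S Z^*$, with S upper triangular

$A\tilde{X} + \tilde{X}S = -\tilde{M}$, where $\tilde{X} = XZ$ and $\tilde{M} = MZ$

Solve for the columns of $\tilde{X}$, then recover X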

56

Solving Sylvester equation

Schur decomposition of H:

• Applicable to any stable A
• Requires solving a linear system of the form $(A + s_{jj}I)\,x = b$ q times
• Solution can be accelerated via fast MVP
• Original method suggests IRA, needs A>0 [Zhou ‘02]
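A sketch of the column-by-column solve for the tall Sylvester equation $AX + XH = -M$: a complex Schur decomposition of the small matrix H turns the equation into q shifted linear systems with A. The function name `solve_sylvester_schur` and the test matrices are assumptions; SciPy's general-purpose solver is used only as a reference.

```python
import numpy as np
from scipy.linalg import schur, solve_sylvester

def solve_sylvester_schur(A, H, M):
    """Solve A X + X H = -M for tall X (n x q) using q shifted solves with A."""
    n, q = M.shape
    S, Z = schur(H, output="complex")        # H = Z S Z^*, S upper triangular
    M_t = M @ Z
    X_t = np.zeros((n, q), dtype=complex)
    I = np.eye(n)
    for j in range(q):
        # Column j of A X_t + X_t S = -M_t, with the earlier columns already known.
        rhs = -M_t[:, j] - X_t[:, :j] @ S[:j, j]
        X_t[:, j] = np.linalg.solve(A + S[j, j] * I, rhs)
    X = X_t @ Z.conj().T
    return X.real                            # imaginary part is round-off for real data

rng = np.random.default_rng(5)
n, q = 8, 3
A = -np.diag(np.arange(1.0, n + 1)) + 0.05 * rng.standard_normal((n, n))
V, _ = np.linalg.qr(rng.standard_normal((n, q)))
H = V.T @ A.T @ V                            # small q x q matrix, as in the AISIAD step
M = rng.standard_normal((n, q))
X = solve_sylvester_schur(A, H, M)
X_ref = solve_sylvester(A, H, -M)            # SciPy solves A X + X H = (third argument)
```

For a sparse A, each shifted system can be handed to a sparse or iterative solver, so A never needs to be decomposed as a dense matrix.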

57

Modified AISIAD algorithm

1. Obtain low-rank approximations $\hat{P}$ and $\hat{Q}$ of P and Q
2. Solve $AX_i + X_iH + M = 0 \Rightarrow X_i \approx PV_i$, where $H = V_i^TA^TV_i$, $M = \hat{P}(I - V_iV_i^T)A^TV_i + BB^TV_i$
3. Perform QR decomposition of $X_i = U_iR$
4. Solve $A^TY_i + Y_iF + N = 0 \Rightarrow Y_i \approx QU_i$, where $F = U_i^TAU_i$, $N = \hat{Q}(I - U_iU_i^T)AU_i + C^TCU_i$
5. Perform QR decomposition of $Y_i = V_{i+1}R$ to get new iterate.
6. Go to step 2 and iterate.
7. Bi-orthogonalize V and U and construct reduced model: $(V^TAU, V^TB, CU, D)$

LR-sqrt
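The listed steps translate into a short dense sketch. It is a simplified illustration, not the author's implementation: the exact gramians stand in for the low-rank approximations, SciPy's dense `solve_sylvester` replaces the specialized solver, and the iteration count, starting basis, and test system are assumptions.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_sylvester

def maisiad(A, B, C, P_hat, Q_hat, q, iterations=10, seed=0):
    """Sketch of the listed steps; P_hat and Q_hat are (approximate) gramians."""
    n = A.shape[0]
    I = np.eye(n)
    rng = np.random.default_rng(seed)
    V, _ = np.linalg.qr(rng.standard_normal((n, q)))   # assumed random starting basis
    for _ in range(iterations):
        H = V.T @ A.T @ V
        M = P_hat @ (I - V @ V.T) @ A.T @ V + B @ (B.T @ V)
        X = solve_sylvester(A, H, -M)                  # A X + X H + M = 0, X ~ P V
        U, _ = np.linalg.qr(X)
        F = U.T @ A @ U
        N = Q_hat @ (I - U @ U.T) @ A @ U + C.T @ (C @ U)
        Y = solve_sylvester(A.T, F, -N)                # A^T Y + Y F + N = 0, Y ~ Q U
        V, _ = np.linalg.qr(Y)
    V = V @ np.linalg.inv(U.T @ V)                     # bi-orthogonalize: V^T U = I
    return V, U

# Hypothetical stable, nonsymmetric test system.
n, q = 12, 3
A = -2.0 * np.eye(n) + np.eye(n, k=1) - 0.5 * np.eye(n, k=-1)
B = np.zeros((n, 1)); B[0, 0] = 1.0
C = np.zeros((1, n)); C[0, 0] = 1.0
P = solve_continuous_lyapunov(A, -B @ B.T)
Q = solve_continuous_lyapunov(A.T, -C.T @ C)
V, U = maisiad(A, B, C, P, Q, q)      # exact gramians stand in for low-rank approximations
Ar, Br, Cr = V.T @ A @ U, V.T @ B, C @ U
```

With exact gramians the Sylvester solution equals $PV_i$ exactly, so the loop reduces to the plain power iteration on PQ; with approximate gramians only the correction term inside M and N is approximated.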

58

RLC line example results

H-infinity norm of reduction error (worst-case discrepancy over all frequencies)

N = 1000, 1 input, 2 outputs

59

Summary of the modified AISIAD

• Fast approximation to TBR
• Especially useful if gramians do not share common dominant eigenspace
• Improved accuracy and extended applicability over AISIAD
• Generalized to the systems in descriptor form

60





61

Features of the method

Cost of reduction

Reduction quality

TBR

Hankel-optimal


Graph-based reduction: manipulates RC network by removing nodes and inserting new elements

62

Linear RC network description

[Figure: RC network with node voltages $v_k$, $v_m$, $v_s$ and currents $J_k$, $J_m$ at the external ports]

State-space model in the frequency domain:

Vector of node voltages (state):

symmetric

Conductance matrix is analogous, ground node is excluded.

63

Low-frequency approximation for reduced circuit

Consider removing a single internal node (Nth), partition matrices and vectors:

Substitute vN in the system equations (one step of Gaussian elimination):

Where

64

Node elimination

Added conductance

Capacitance-like

Problem: last capacitance term is negative! Potentially inserting a negative capacitor???

The term was ignored in the TICER algorithm [Sheehan ‘99]. Leads to inconsistent diagonal update.

65

Node elimination – Theorem 1

Claim: keeping the exact Taylor series is OK:

Proof: Define projection:

$G_{new}$ and $C_{new}$ are obtained by a congruence transform

Model is always stable and passive
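The claim can be checked on a tiny network. The sketch assumes the nodal form $(G + sC)\,v = J$ with symmetric G and C, eliminates the last (internal) node, and compares two routes: a congruence transform with the projection that expresses the eliminated voltage at zero frequency, and the first-order Taylor expansion of the Schur complement in s. The formulas are reconstructed from that assumed form and the 3-node network is hypothetical.

```python
import numpy as np

def eliminate_last_node(G, C):
    """Eliminate the last node of (G + sC) v = J to first order in s via congruence."""
    n = G.shape[0]
    g, g_nn = G[:-1, -1], G[-1, -1]
    # Projection that expresses v_N through the remaining voltages at s = 0.
    U = np.vstack([np.eye(n - 1), -g[None, :] / g_nn])
    return U.T @ G @ U, U.T @ C @ U

def taylor_update(G, C):
    """Same reduction written as the first-order Taylor expansion of the Schur complement."""
    G11, g, g_nn = G[:-1, :-1], G[:-1, -1], G[-1, -1]
    C11, c, c_nn = C[:-1, :-1], C[:-1, -1], C[-1, -1]
    G_new = G11 - np.outer(g, g) / g_nn
    C_new = C11 - (np.outer(g, c) + np.outer(c, g)) / g_nn + c_nn * np.outer(g, g) / g_nn**2
    return G_new, C_new

# Hypothetical 3-node RC network (ground excluded); node 3 is internal.
G = np.array([[1.5, 0.0, -1.0],
              [0.0, 2.0, -2.0],
              [-1.0, -2.0, 3.0]])
C = np.array([[0.3, 0.0, -0.1],
              [0.0, 0.2, 0.0],
              [-0.1, 0.0, 0.4]])
G_new, C_new = eliminate_last_node(G, C)
G_tay, C_tay = taylor_update(G, C)       # identical to the congruence result
```

Because a congruence transform maps positive semidefinite matrices to positive semidefinite matrices, keeping every first-order term preserves the definiteness of both G and C.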

66

Node elimination criteria

When is it safe to eliminate a node?

• Denominator expansion: (used in TICER)
• Numerator term ~$s^2$ (element-by-element): (overlooked in TICER)

Using these criteria the reduced order will be chosen on-the-fly

67

Resulting algorithm:

• Given the initial circuit and maximal frequency of interest
• Using lowest-degree ordering (minimize fill-ins)
• Perform the elimination of the “qualified” nodes by inserting new capacitors and resistors (for every pair of nodes i and j which were connected via the node N)
• Until no nodes satisfy elimination conditions.

68

Results: testing substitution rules

Testing only substitution rules, 1-CDF of the reduction error

tested more than 30,000 circuits

69

Results: testing elimination conditions

Same elimination rules, same average reduction, different elimination criteria:

• Narrower distribution
• Better worst-case accuracy

70

Summary of the new method

• Improved accuracy and error control over TICER by using correct Taylor series and elimination criteria
• Preserves stability and passivity
• Generalized to parameter-dependent case
• Fastest, though conservative

71





72

Electro-osmotic flow in the 3D U-shaped microchannel

[Figure: U-shaped channel with labels r1, w, V; input u(t) = Cin(t); outputs y1(t) = <Cout>(t), y2(t), y3(t)]

Inside the carrying fluid, the marker fluid spreads as governed by the 3D convection-diffusion equation:

Using mapped-domain finite-difference volume discretization, the obtained model has 2842 unknowns (large)

73

How the marker spreads

[Figure: concentration C(t) versus t, and concentration profiles C(r,t) over r and t at locations 1, 2, 3]

74

Linear case

In case of constant mobility and diffusivity the model is linear:

Linear reduction techniques are extremely efficient for such models [Vasilyev, Rewienski, White ‘06]

75

Modified AISIAD reduction - results

76

TBR, Arnoldi and mAISIAD

Modified AISIAD runtime: 73 s

TBR runtime: 2207 s

(Matlab implementation)

77

Comparison with other reduction methods

78

Nonlinear microchannel problem

For arbitrary nonlinearity in convection and diffusion coefficients and TPWL, this problem is very challenging! [Vasilyev, Rewienski, White ‘06]

However, the problem becomes more tractable if one considers a quadratic problem

This is the case for affine μ and D:

$\mu(C) = \mu_0 C + \mu_1$

$D(C) = D_0 C + D_1$

79

Quadratic model of microchannel system

Affine mobility and diffusivity lead to a quadratic model:

Use orthogonal projection $V = U$, $V^TV = I$

Reduced quadratic system
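A minimal sketch of projecting a quadratic model with an orthogonal basis. The model form $\frac{dx}{dt} = Ax + W(x \otimes x) + Bu$ is an assumption (the slide's own equation is not reproduced in the text), as are the function name and the random test data.

```python
import numpy as np

def reduce_quadratic(A, W, B, V):
    """Galerkin projection of dx/dt = A x + W (x kron x) + B u with V^T V = I."""
    A_r = V.T @ A @ V
    W_r = V.T @ W @ np.kron(V, V)      # q x q^2 reduced quadratic tensor
    B_r = V.T @ B
    return A_r, W_r, B_r

rng = np.random.default_rng(4)
n, q = 5, 2
A = rng.standard_normal((n, n))
W = rng.standard_normal((n, n * n))
B = rng.standard_normal((n, 1))
V, _ = np.linalg.qr(rng.standard_normal((n, q)))   # hypothetical orthonormal basis

A_r, W_r, B_r = reduce_quadratic(A, W, B, V)
z = rng.standard_normal(q)
x = V @ z
full_term = V.T @ (W @ np.kron(x, x))      # evaluation cost grows with n
reduced_term = W_r @ np.kron(z, z)         # evaluation cost depends only on q
```

The reduced tensor is computed once, so each evaluation of the reduced right-hand side never touches an n-dimensional quantity.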

80

Quadratic microchannel problem – result

Projected reduced quadratic model of size 60 approximates original system of size 2842 quite well:

[Figure: Krylov-subspace basis, quadratic reduction]

81

Conclusions

• Performed applicability analysis of TBR-based TPWL models based on matrix perturbation theory
• Developed modified AISIAD method which is aimed at approximating TBR for the cases where gramians do not necessarily share common dominant eigenspaces
• Developed graph-based parameterized RC reduction method and improved nominal reduction

82

I extend my sincere thanks to:

Prof. Jacob White – my supervisor,
Profs. Luca Daniel and Alexandre Megretski,
Profs. Karen Willcox, John Kassakian, John Wyatt,
Dr. Yehuda Avniel, Dr. Joel Phillips, Dr. Mark Reichelt
My groupmates: Anne, Bo, Brad, Carlos, Dave, Jay, Jung Hoon, Homer, Kin, Laura, Lei, Michał, Shihhsien, Steve, Tarek, Tom, Xin, Zhenhai
My wife, Patrycja

Thank you! Спасибо! Grazie! Dziękuje! Komapsumnida! Xie Xie! Dua Netjer en ek!

83

For systems in the descriptor form

Generalized Lyapunov equations:

Lead to similar approximate power iterations

84

mAISIAD and low-rank square root

• Low-rank gramians (cost varies)
• LR-square root (inexpensive step)
• mAISIAD (more expensive)

For the majority of non-symmetric cases, mAISIAD works better than low-rank square root

85

RLC line example results

H-infinity norm of reduction error (worst-case discrepancy over all frequencies)

N = 1000, 1 input, 2 outputs

86

Steel rail cooling profile benchmark

Taken from Oberwolfach benchmark collection: N=1357, 7 inputs, 6 outputs

87

mAISIAD is useless for symmetric models

For symmetric systems ($A = A^T$, $B = C^T$) $P = Q$, therefore mAISIAD is equivalent to LRSQRT for $\hat{P}$, $\hat{Q}$ of order q

RC line example

88

Cost of the algorithm

Cost of the algorithm is directly proportional to the cost of solving a linear system:

$(A + s_{jj}I)\,x = b$ (non-descriptor case)

$(A + s_{jj}E)\,x = b$ (descriptor case)

(where $s_{jj}$ is a complex number)

Cost does not depend on the number of inputs and outputs

89

Lightly damped RLC circuit

Union of eigenspaces of P and Q does not necessarily approximate the dominant eigenspace of PQ.

[Figure: top 5 eigenvectors of P; top 5 eigenvectors of Q]
