
WHAT EVERY ENGINEER SHOULD KNOW ABOUT COMPUTATIONAL TECHNIQUES OF FINITE ELEMENT ANALYSIS
Second Edition

LOUIS KOMZSIK

CRC Press is an imprint of the Taylor & Francis Group, an informa business

Boca Raton   London   New York

CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2009 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works
Version Date: 20131125

International Standard Book Number-13: 978-1-4398-0295-3 (eBook - PDF)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com

and the CRC Press Web site at
http://www.crcpress.com


To my son, Victor


Contents

Preface to the second edition
Preface to the first edition
Acknowledgments

I Numerical Model Generation

1 Finite Element Analysis
1.1 Solution of boundary value problems
1.2 Finite element shape functions
1.3 Finite element basis functions
1.4 Assembly of finite element matrices
1.5 Element matrix generation
1.6 Local to global coordinate transformation
1.7 A linear quadrilateral finite element
1.8 Quadratic finite elements
References

2 Finite Element Model Generation
2.1 Bezier spline approximation
2.2 Bezier surfaces
2.3 B-spline technology
2.4 Computational example
2.5 NURBS objects
2.6 Geometric model discretization
2.7 Delaunay mesh generation
2.8 Model generation case study
References

3 Modeling of Physical Phenomena
3.1 Lagrange's equations of motion
3.2 Continuum mechanical systems
3.3 Finite element analysis of elastic continuum
3.4 A tetrahedral finite element
3.5 Equation of motion of mechanical system
3.6 Transformation to frequency domain
References

4 Constraints and Boundary Conditions
4.1 The concept of multi-point constraints
4.2 The elimination of multi-point constraints
4.3 An axial bar element
4.4 The concept of single-point constraints
4.5 The elimination of single-point constraints
4.6 Rigid body motion support
4.7 Constraint augmentation approach
References

5 Singularity Detection of Finite Element Models
5.1 Local singularities
5.2 Global singularities
5.3 Massless degrees of freedom
5.4 Massless mechanisms
5.5 Industrial case studies
References

6 Coupling Physical Phenomena
6.1 Fluid-structure interaction
6.2 A hexahedral finite element
6.3 Fluid finite elements
6.4 Coupling structure with compressible fluid
6.5 Coupling structure with incompressible fluid
6.6 Structural acoustic case study
References

II Computational Reduction Techniques

7 Matrix Factorization and Linear Systems
7.1 Finite element matrix reordering
7.2 Sparse matrix factorization
7.3 Multi-frontal factorization
7.4 Linear system solution
7.5 Distributed factorization and solution
7.6 Factorization and solution case studies
7.7 Iterative solution of linear systems
7.8 Preconditioned iterative solution technique
References

8 Static Condensation
8.1 Single-level, single-component condensation
8.2 Computational example
8.3 Single-level, multiple-component condensation
8.4 Multiple-level static condensation
8.5 Static condensation case study
References

9 Real Spectral Computations
9.1 Spectral transformation
9.2 Lanczos reduction
9.3 Generalized eigenvalue problem
9.4 Eigensolution computation
9.5 Distributed eigenvalue computation
9.6 Dense eigenvalue analysis
9.7 Householder reduction technique
9.8 Normal modes analysis case studies
References

10 Complex Spectral Computations
10.1 Complex spectral transformation
10.2 Biorthogonal Lanczos reduction
10.3 Implicit operator multiplication
10.4 Recovery of physical solution
10.5 Solution evaluation
10.6 Reduction to Hessenberg form
10.7 Rotating component application
10.8 Complex modal analysis case studies
References

11 Dynamic Reduction
11.1 Single-level, single-component dynamic reduction
11.2 Accuracy of dynamic reduction
11.3 Computational example
11.4 Single-level, multiple-component dynamic reduction
11.5 Multiple-level dynamic reduction
11.6 Multi-body analysis application
References

12 Component Mode Synthesis
12.1 Single-level, single-component modal synthesis
12.2 Mixed boundary component mode reduction
12.3 Computational example
12.4 Single-level, multiple-component modal synthesis
12.5 Multiple-level modal synthesis
12.6 Component mode synthesis case study
References

III Engineering Solution Computations

13 Modal Solution Technique
13.1 Modal solution
13.2 Truncation error in modal solution
13.3 The method of residual flexibility
13.4 The method of mode acceleration
13.5 Coupled modal solution application
13.6 Modal contributions and energies
References

14 Transient Response Analysis
14.1 The central difference method
14.2 The Newmark method
14.3 Starting conditions and time step changes
14.4 Stability of time integration techniques
14.5 Transient response case study
14.6 State-space formulation
References

15 Frequency Domain Analysis
15.1 Direct and modal frequency response analysis
15.2 Reduced-order frequency response analysis
15.3 Accuracy of reduced-order solution
15.4 Frequency response case study
15.5 Enforced motion application
References

16 Nonlinear Analysis
16.1 Introduction to nonlinear analysis
16.2 Geometric nonlinearity
16.3 Newton-Raphson methods
16.4 Quasi-Newton iteration techniques
16.5 Convergence criteria
16.6 Computational example
16.7 Nonlinear dynamics
References

17 Sensitivity and Optimization
17.1 Design sensitivity
17.2 Design optimization
17.3 Planar bending of the bar
17.4 Computational example
17.5 Eigenfunction sensitivities
17.6 Variational analysis
References

18 Engineering Result Computations
18.1 Displacement recovery
18.2 Stress calculation
18.3 Nodal data interpolation
18.4 Level curve computation
18.5 Engineering analysis case study
References

Annotation
List of Figures
List of Tables
Index
Closing Remarks


Preface to the second edition

I am grateful to Taylor & Francis, in particular to Nora Konopka, publisher, for the opportunity to revise this book after five years in print, and for her enthusiastic support of the first edition. This made the book available to a wide range of students and practicing engineers, fulfilling my original intentions. My sincere thanks are also due to Amy Blalock, project coordinator, and Michele Dimont, project editor, at Taylor & Francis.

Mike Gockel, my colleague of many years, now retired, was again instrumental in clarifying some of the presentation, and he deserves my repeated gratitude. I would like to thank Professor Duc Nguyen for his proofreading of the extensions of this edition. His use of the first edition in his teaching provided me with valuable feedback and confirmation of the approach of the book.

Half a decade has passed since the original writing of the first edition, and this edition contains numerous noteworthy technical extensions. In Part I the finite element chapter now contains a brief introduction to quadratic finite element shape functions (1.8). Also in Part I, the geometry modeling chapter has been extended with three sections (2.3, 2.4 and 2.5) to discuss the B-spline technology that has become the de facto industry standard. Several new sections were added to address reader-requested topics, such as supporting the rigid body motion (4.6), the method of augmenting constraints (4.7) and a discussion on detecting and eliminating massless mechanisms (5.5).

Still in Part I, a new Chapter 6 describes a significant application trend of the past years: the use of the technology to couple multiple physical phenomena. This includes a more detailed description of the fluid-structure interaction application, a hexahedral finite element, as well as a structural-acoustics case study.

In Part II, a new section (7.7) addressing iterative solutions of linear systems, and specifically the method of conjugate gradients, was also recommended by readers of the first edition. Also in Part II, a new Chapter 10 is dedicated to complex spectral computations, a topic briefly mentioned but not elaborated on in the first edition. The rotor dynamics application topic and related case study examples round out this new chapter.

In Part III, the modal solution chapter has been extended with a new section (13.6) describing modal energies and contributions. A new section (14.6) in the transient response analysis chapter discusses the state-space formulation. The frequency domain analysis chapter has been enhanced with a new section (15.5) on enforced motion computations. Finally, the nonlinear chapter received a new section (16.2) describing geometric nonlinearity computations in some detail.

The application focus has also significantly expanded during the years since the publication of the first edition, and one of the goals of this edition was to reflect these changes. The updated case study sections' (2.8, 7.6, 9.8, 10.8, 12.6, 14.5, 15.4 and 18.5) state-of-the-art application results demonstrate the tremendously increased computational complexity.

The final goal of this edition was to correct some of the typing mistakes and technical misstatements of the first edition, which were pointed out to me by readers. While they kindly stated that those were not limiting the usefulness of the book, I exercised extreme caution to make this edition as error-free and clear as possible.

Louis Komzsik
2009

The model in the cover art is courtesy of Pilatus Aircraft Corporation, Stans, Switzerland. It depicts the tail wing vibrations of a PC-21 aircraft, computed by utilizing the techniques described in this book.


Preface to the first edition

The method of finite elements has become a dominant tool of engineering analysis in a large variety of industries and sciences, especially in mechanical and aerospace engineering. In this role, the method enables the engineer or scientist to solve a physical problem or analyze a process. There is, however, significant computational work - in several distinct phases - involved in the solution of a physical problem with the finite element method. The emphasis of this book is on the computational techniques of this complete process, from the physical problem to the computed solution.

In the first phase the physical problem is described in mathematical form, most of the time by a boundary value problem of some sort. At the same time the geometry of the physical problem is also approximated by computational geometry techniques, resulting in the finite element model. Applying boundary conditions and various constraints to the finite element model results in a numerically solvable form. The first part of the book addresses these topics.

In the second phase of operations the numerical model is reduced to a computationally more efficient form via various spectral representations. Today finite element problems are extremely large in industrial applications; therefore, this is an important step. The subject of the second part of the book is the reduction techniques used to reach an efficiently solvable computational model.

Finally, the solution of the engineering problem is obtained with specific computational techniques. Both time and frequency domain solutions are used in practice. Advanced computations addressing nonlinearity and optimization may also be applied. The third part of the book deals with these topics as well as the representation of the computed results.

The book is intended to be a concise, self-contained reference for the topic, aimed at practicing engineers who put the finite element technique to practical use. It may be of specific interest to users of commercial finite element analysis products, as those products execute most of these computational techniques in various forms. Graduate students of finite element techniques in any discipline could benefit from using the book as well.

The material comes from my three decades of activity in the shipbuilding, aerospace and automobile industries, during which I used many of these techniques. I have also personally implemented some of these techniques into various versions of NASTRAN¹, the world's leading finite element software.

Finally, I have also encountered many students during my years of teaching whose understanding of these computations would have been significantly better with such a book.

Louis Komzsik
2004

¹ NASTRAN is a registered trademark of the National Aeronautics and Space Administration.


Acknowledgments

I appreciate Mr. Mike Gockel's (MSC Software Corporation, retired) technical evaluation of the manuscript and his important recommendations, especially those related to the techniques of Chapters 4 and 5.

I would also like to thank Dr. Al Danial (Northrop-Grumman Corporation) for his repeated and very careful proofreading of the entire manuscript. His clarifying comments, representing the application engineer's perspective, have significantly contributed to the readability of the book.

Professor Barna Szabo (Washington University, St. Louis) deserves credit for his valuable corrections and insightful advice through several revisions of the book. His professional influence in the subject area has reached a wide range of engineers and analysts, including me.

Many thanks are also due to Mrs. Lori Lampert (MSC Software Corporation) for her expertise and patience in producing figures from my hand-drawings.

I also value the professional contribution of the publication staff at Taylor and Francis Group. My sincere thanks to Nora Konopka, publisher, Helena Redshaw, manager, and editor Richard Tressider. They all deserve significant credit in the final outcome.

Louis Komzsik
2004

Part I

Numerical Model Generation

1 Finite Element Analysis

The goal of this chapter is to introduce the reader to finite element analysis, which is the basis for the discussion of the computational methods in the remainder of the book. This chapter first focuses on the computational fundamentals of the method in connection with a simple boundary value problem. These fundamentals will be expanded with the derivation of a practical finite element, and further when dealing with the application of the technique for mechanical systems in Chapter 3.

1.1 Solution of boundary value problems

The method of using finite elements for the solution of boundary value problems has almost a century of history. The pioneering paper by Ritz [8] laid the foundation for this technology. The most widely used practical technique, however, is Galerkin's method [3].

The difference between the Ritz method and Galerkin's lies in the fact that the former addresses the variational form of the boundary value problem. Galerkin's method minimizes the residual of the differential equation integrated over the domain with a weight function; hence it is also called the method of weighted residuals.

This difference lends more generality and computational convenience to Galerkin's method. Let us consider a linear differential equation in two variables on a simple domain D:

L(q(x, y)) = 0, (x, y) ∈ D,

and apply Dirichlet boundary conditions on the boundary B

q(x, y) = 0, (x, y) ∈ B.

Galerkin's method is based on Ritz's approximate solution idea and constructs the approximate solution as


q(x, y) = q_1 N_1 + q_2 N_2 + \dots + q_n N_n,

where the q_i are the yet unknown solution values at discrete points in the domain (the node points of the finite element mesh) and

N_i, \quad i = 1, \dots, n,

is the set of the finite element shape functions to be derived shortly. In this case, of course, there is a residual of the differential equation:

L(q) \neq 0.

Galerkin proposed using the shape functions of the approximate solution also as the weights, and requires that the integral of the so-weighted residual vanish:

\int\int_D L(q) N_j(x, y) \, dx \, dy = 0, \quad j = 1, 2, \dots, n.

This yields a system for the solution of the coefficients as

\int\int_D L\Big(\sum_{i=1}^{n} q_i N_i(x, y)\Big) N_j(x, y) \, dx \, dy = 0, \quad j = 1, 2, \dots, n.

This is a linear system and produces the unknown values of q_i.

Let us now consider the deformation of an elastic membrane loaded by a distributed force f(x, y), shown in Figure 1.1. The mathematical model is the well-known Poisson's equation

-\frac{\partial^2 q}{\partial x^2} - \frac{\partial^2 q}{\partial y^2} = f(x, y),

where q(x, y) is the vertical displacement of the membrane at (x, y) and f(x, y) is the distributed load on the surface of the membrane. Assume the membrane occupies the domain D in the x-y plane with boundary B. We assume that the membrane is clamped, manifested by a Dirichlet boundary condition. It should be noted that in practical problems the boundary is not necessarily as smooth as shown in Figure 1.1; in fact, it is usually only piecewise analytic.

Let us now apply Galerkin's method to this problem:

\int\int_D -\Big(\frac{\partial^2 q}{\partial x^2} + \frac{\partial^2 q}{\partial y^2} + f(x, y)\Big) N_j \, dx \, dy = 0, \quad j = 1, \dots, n.

Substituting the approximate solution yields

\int\int_D -\Big(\sum_{i=1}^{n} q_i \frac{\partial^2 N_i}{\partial x^2} + \sum_{i=1}^{n} q_i \frac{\partial^2 N_i}{\partial y^2} + f(x, y)\Big) N_j \, dx \, dy = 0, \quad j = 1, \dots, n.

FIGURE 1.1 Membrane model: the displacement q(x, y) over the domain D with boundary B, where q = 0.

The left-hand side terms may be integrated by parts, and after employing the boundary condition they simplify as

\int\int_D -\Big(\frac{\partial^2 N_i}{\partial x^2} + \frac{\partial^2 N_i}{\partial y^2}\Big) N_j \, dx \, dy = \int\int_D \Big(\frac{\partial N_i}{\partial x}\frac{\partial N_j}{\partial x} + \frac{\partial N_i}{\partial y}\frac{\partial N_j}{\partial y}\Big) dx \, dy.

Substituting and regrouping yields

\int\int_D \Big(\sum_{i=1}^{n} q_i \frac{\partial N_i}{\partial x}\frac{\partial N_j}{\partial x} + \sum_{i=1}^{n} q_i \frac{\partial N_i}{\partial y}\frac{\partial N_j}{\partial y} - f(x, y) N_j\Big) dx \, dy = 0, \quad j = 1, \dots, n.

Unrolling the sums and reordering, we get the Galerkin equations

\int\int \Big( \big(q_1 \frac{\partial N_1}{\partial x} + \dots + q_n \frac{\partial N_n}{\partial x}\big)\frac{\partial N_j}{\partial x} + \big(q_1 \frac{\partial N_1}{\partial y} + \dots + q_n \frac{\partial N_n}{\partial y}\big)\frac{\partial N_j}{\partial y} \Big) dx \, dy = \int\int f(x, y) N_j \, dx \, dy

for j = 1, \dots, n. Introducing the notation

K_{ij} = K_{ji} = \int\int \Big(\frac{\partial N_i}{\partial x}\frac{\partial N_j}{\partial x} + \frac{\partial N_i}{\partial y}\frac{\partial N_j}{\partial y}\Big) dx \, dy

and

F_j = \int\int f(x, y) N_j \, dx \, dy,

the Galerkin equations may be written as a matrix equation

K q = F.

The system matrix is

K = \begin{bmatrix} K_{1,1} & K_{1,2} & \dots & K_{1,n} \\ K_{2,1} & K_{2,2} & \dots & K_{2,n} \\ \dots & \dots & \dots & \dots \\ K_{n,1} & K_{n,2} & \dots & K_{n,n} \end{bmatrix},

with solution vector

q = \begin{bmatrix} q_1 \\ q_2 \\ \dots \\ q_n \end{bmatrix},

and right-hand side vector

F = \begin{bmatrix} F_1 \\ F_2 \\ \dots \\ F_n \end{bmatrix}.

The assembly process is addressed in more detail in Section 1.4, after introducing the shape functions. The K matrix is usually very sparse, as many K_{ij} become zero. This equation is known as the linear static analysis problem, where K is called the stiffness matrix, F is the load vector and q is the vector of displacements, the solution of Poisson's equation. Other differential equations lead to a similar form, as demonstrated in, for example, [2].

The generality of the concept has contributed to its widespread application success. For the mathematical theory see [6]; the matrix algebraic foundation is thoroughly discussed in [7]. More details may be obtained from the now classic text [11].
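As a concrete illustration of the matrix equation above, the following is a minimal sketch, assuming SciPy is available, that solves Kq = F for a small hypothetical sparse stiffness matrix; the numbers are placeholders, not the result of an actual assembly.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import spsolve

# Hypothetical small, sparse, symmetric stiffness matrix K and load vector F
K = csr_matrix(np.array([[ 4.0, -1.0,  0.0],
                         [-1.0,  4.0, -1.0],
                         [ 0.0, -1.0,  4.0]]))
F = np.array([1.0, 0.0, 2.0])

q = spsolve(K, F)           # displacement vector solving K q = F
print(q)
print(K @ q - F)            # residual is (numerically) zero
```

Industrial models lead to the same structure, only with millions of unknowns; the factorization techniques exploiting the sparsity are discussed in Part II.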

1.2 Finite element shape functions

To interpolate inside the elements, piecewise polynomials are usually used. For example, a triangular discretization of a two-dimensional domain may be approximated by linear interpolation functions of the form

q(x, y) = a + bx + cy.

In order to find the coefficients, let us consider the triangular region (element) of the x-y plane in a specifically located local coordinate system and the notation shown in Figure 1.2.

FIGURE 1.2 Local coordinates of the triangular element: node 1 at (0, 0), node 2 at (x_2, 0), node 3 at (x_3, y_3), with nodal values q_1, q_2, q_3.

The usage of a local coordinate system in Figure 1.2 does not limit the generality of the following discussion. The arrangement can always be achieved by appropriate coordinate transformations on a generally located triangle. Using the notation and assignments of Figure 1.2 and evaluating at each node of the triangle,

q_e = \begin{bmatrix} q_1 \\ q_2 \\ q_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & x_2 & 0 \\ 1 & x_3 & y_3 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix}.

The triangular system of equations is easily solved for the unknown coefficients as

\begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ -\frac{1}{x_2} & \frac{1}{x_2} & 0 \\ \frac{x_3 - x_2}{x_2 y_3} & -\frac{x_3}{x_2 y_3} & \frac{1}{y_3} \end{bmatrix} \begin{bmatrix} q_1 \\ q_2 \\ q_3 \end{bmatrix}.

By back-substituting into the approximation equation we get

q(x, y) = N \begin{bmatrix} q_1 \\ q_2 \\ q_3 \end{bmatrix} = \begin{bmatrix} N_1 & N_2 & N_3 \end{bmatrix} \begin{bmatrix} q_1 \\ q_2 \\ q_3 \end{bmatrix}.

Here N contains the N_1, N_2, N_3 shape functions (more precisely, the traces of the shape functions inside an element). With these we are now able to describe the solution value inside an element in terms of the solutions at the corner node points:

q(x, y) = N_1 q_1 + N_2 q_2 + N_3 q_3.

The values of N_i are

N_1 = 1 - \frac{1}{x_2} x + \frac{x_3 - x_2}{x_2 y_3} y,

N_2 = \frac{1}{x_2} x - \frac{x_3}{x_2 y_3} y,

and

N_3 = \frac{1}{y_3} y.

These clearly depend on the coordinates of the corner nodes of the particular triangular element of the domain. It is easy to see that at every node only one of the shape functions is nonzero. Specifically, at node 1, N_2 and N_3 vanish while N_1 = 1. At node 2, N_2 = 1 and both N_1 and N_3 are zero. Finally, at node 3, N_3 takes a value of one and the other two vanish. It is also easy to verify that the

N_1 + N_2 + N_3 = 1

equation is satisfied.
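These properties are easy to check numerically. The following is a small sketch, assuming NumPy and hypothetical values for the local node coordinates x_2, x_3 and y_3 of Figure 1.2.

```python
import numpy as np

# Hypothetical local node coordinates: node 1 at (0, 0), node 2 at (x2, 0), node 3 at (x3, y3)
x2, x3, y3 = 2.0, 0.5, 1.5

def N(x, y):
    """Traces of the shape functions N1, N2, N3 at the point (x, y)."""
    return np.array([1.0 - x/x2 + (x3 - x2)/(x2*y3) * y,
                     x/x2 - x3/(x2*y3) * y,
                     y/y3])

nodes = [(0.0, 0.0), (x2, 0.0), (x3, y3)]
for i, (xn, yn) in enumerate(nodes):
    print("node", i + 1, N(xn, yn))   # unity at its own node, zero at the others

print(N(0.7, 0.4).sum())              # partition of unity: 1.0
```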

The nonzero shape function at a certain node point reduces to zero at the other two nodes. The interpolation is continuous across neighboring elements: on an edge between two triangles the approximation is linear, and it is the same when approached from either element.

Specifically, along the edge between nodes 1 and 2 the shape function N_3 is zero. The shape functions N_1 and N_2 along this edge are the same when calculated from the element on either side of that edge.

Naturally, additional computations are required to account for the case when the triangle is generally located, i.e., none of its sides is collinear with any axis. This issue of local-to-global coordinate transformations will be discussed shortly.

1.3 Finite element basis functions

There is another (sometimes misinterpreted) component of finite element technology, the basis functions. They are sometimes used in place of shape functions by engineers, although, as shown below, they are distinctly different. The approximation

q(x, y) = N q_e

may also be written as

q(x, y) = M c_e,

where M is the matrix of basis functions and c_e is the vector of basis coefficients. Clearly, for our example

M = \begin{bmatrix} 1 & x & y \end{bmatrix}

and

c_e = \begin{bmatrix} a \\ b \\ c \end{bmatrix}.

The family of basis functions for two-dimensional elements may be written from the terms shown in Table 1.1.

Depending on how the basis functions are chosen, various two-dimensional elements may be derived. Naturally, a higher order basis function family requires more node points. For example, a quadratic (order = 2) triangular element, often used in industry, is based on introducing midpoint nodes on each side of the triangle. This enables the use of the following interpolation function in each triangle:

q(x, y) = a + bx + cy + dx^2 + ey^2 + fxy.

TABLE 1.1 Basis function terms for two-dimensional elements

Order   Terms
0       1
1       x, y
2       x^2, xy, y^2
3       x^3, x^2y, xy^2, y^3

The six coefficients are again easily established by a procedure similar to the linear triangular element above. The interpolation across quadratic element boundaries is also continuous; however, now it is parabolic along an edge. Nevertheless, the parabola produced by the neighboring elements is the same from both sides. Quadratic finite elements will be discussed in Section 1.8.

For a first order rectangular element the interpolation may be of the form

q(x, y) = a + bx + cy + dxy.

In this case, all the first-order basis functions were used as well as one component of the second-order basis function family. We will derive a practical rectangular element in Section 1.7. Similarly, a second-order (eight-noded) rectangular element is approximated as

q(x, y) = a + bx + cy + dxy + ex^2 + fx^2y + gxy^2 + hy^2.

This is again the use of the complete 2nd order family plus two components of the 3rd order family to accommodate additional node points. The latter are usually located on the midpoints of each side, as they were on the quadratic triangle.

For a three-dimensional domain, the four-noded tetrahedron is one of the most commonly used finite elements. The interpolation inside a tetrahedral element is of the form

q(x, y, z) = a + bx + cy + dz.

The basis function terms for three-dimensional elements are shown in Table 1.2.

Quadratic interpolation of the tetrahedron is also possible; the related element is called the 10-noded tetrahedron. The extra node points are located on the midpoints of the edges:

q(x, y, z) = a + bx + cy + dz + ex^2 + fxy + gy^2 + hxz + iyz + jz^2.

TABLE 1.2 Basis function terms for three-dimensional elements

Order   Terms
0       1
1       x, y, z
2       x^2, xy, y^2, xz, yz, z^2
3       x^3, ..., xyz, ..., y^3, z^3

The third-order three-dimensional basis function family introduces another 10 terms, some of which are shown in Table 1.2.

Finally, additional volume elements are also frequently used. The hexahedron is one of the most widely accepted. Its first order version consists of eight node points at the corners of the hexahedron and, therefore, it is defined with specifically chosen basis functions as

q(x, y, z) = a + bx + cy + dz + exy + fxz + gyz + hxyz.

The quadratic hexahedral element consists of 20 nodes: the eight corner nodes and the 12 midpoints of the edges. A 3rd order hexahedral element with 27 nodes is also used, albeit not widely. The additional seven nodes come from the midpoints of the six faces and from the center of the volume.

Finally, higher order polynomial (p-version) elements are also used in industry. These elements introduce side shape functions in addition to the nodal shape functions mentioned earlier. The side shape functions, as their name indicates, are assigned to the sides of the elements. They are formulated in terms of some orthogonal, most often Legendre, polynomials of order p, hence the name. There are clear advantages in computational accuracy when applying such elements. On the other hand, they introduce extra computational costs, so they are mainly used in specific applications and not generally. The method and some applications are described in detail in the book of the pioneering authors of the technique [9].

The gradual widening of the finite element technology may be assessed by reviewing the early articles [10] and [1], as well as from the reference of the first general purpose, and still premier, finite element analysis tool [4].


1.4 Assembly of finite element matrices

The repeated application of general triangles may be used to cover the planar domain D as shown in Figure 1.3. The process is called meshing. The points inside the domain and on the boundary are the node points; they span the finite element mesh.

FIGURE 1.3 Meshing the membrane model

There may be small gaps between the boundary and the sides of the triangles adjacent to the boundary. This issue contributes to the approximation error of the finite element method. The gaps may be filled by progressively smaller elements, or those triangles may be replaced by triangles with curved edges. Nevertheless, all the elemental matrices contribute to the global finite element matrices, and the process of computing these contributions is the finite element matrix assembly process.

One way to view the assembly of the K matrix is by way of the shape functions. For the triangular element discussed in the last section, a shape function associated with a node describes a plane going through the other two nodes and having a height of unity above the associated node. On the other hand, in an adjacent element the shape function associated with the same node describes another plane, and so on. In general, a shape function N_i will define a pyramid over node i.

This geometric interpretation explains the sparsity of the K matrix. Only those N_i N_j products will exist, and in turn produce a K_{ij} entry in the K matrix, where the two pyramids of N_i and N_j overlap.

A computationally more practical method is based on summing up the energy contributions from each element to the global matrix. The strain energy (a component of the potential energy) of a certain element is

E_e = \frac{1}{2} \int\int \Big[ \Big(\frac{\partial q}{\partial x}\Big)^2 + \Big(\frac{\partial q}{\partial y}\Big)^2 \Big] dx \, dy.

Introducing the strain vector

\epsilon = \begin{bmatrix} \frac{\partial q}{\partial x} \\ \frac{\partial q}{\partial y} \end{bmatrix},

the strain energy of the element is

E_e = \frac{1}{2} \int\int \epsilon^T \epsilon \, dx \, dy.

Considering our simple triangular element, differentiating and using matrix notation yields

\epsilon = \begin{bmatrix} \frac{\partial q}{\partial x} \\ \frac{\partial q}{\partial y} \end{bmatrix} = \begin{bmatrix} b \\ c \end{bmatrix} = \begin{bmatrix} -\frac{1}{x_2} & \frac{1}{x_2} & 0 \\ \frac{x_3 - x_2}{x_2 y_3} & -\frac{x_3}{x_2 y_3} & \frac{1}{y_3} \end{bmatrix} \begin{bmatrix} q_1 \\ q_2 \\ q_3 \end{bmatrix} = B q_e,

where

q_e = \begin{bmatrix} q_1 \\ q_2 \\ q_3 \end{bmatrix}.

In the above, B is commonly called the strain-displacement matrix. The \partial q/\partial x and \partial q/\partial y terms are the strain components of our element, in essence the rate of change of the deformation of the element in the coordinate directions. The B matrix relates the strains to the nodal displacements on the right, hence the name.

Note that the structure of B depends on the physical model, in our case having only one degree of freedom per node point for the membrane element. Elements representing other physical phenomena, for example, triangles having two in-plane degrees of freedom per node point, have a different B matrix, as they have more possible strain components. This issue will be addressed in more detail in Section 1.7 and in Chapter 3. Here we stay with a mathematical focus.

With this the element energy contribution is

E_e = \frac{1}{2} \int\int q_e^T B^T B q_e \, dx \, dy.

Since the node point coordinates are constant with respect to the integration, we may write

E_e = \frac{1}{2} q_e^T \Big( \int\int B^T B \, dx \, dy \Big) q_e = \frac{1}{2} q_e^T k_e q_e.

Here k_e is the element matrix whose entries depend only on the shape of the element. If our element is the one described by nodes 1, 2 and 3, then the terms in k_e contribute to the terms of the 1st, 2nd and 3rd columns and rows of the global K matrix. The actual integration for computing k_e is addressed in the next section.

Let us assume that another element is adjacent to the 2-3 edge, its other node being 4. Then by similar arguments, the 2nd element's matrix terms (depending on that particular element's shape) will contribute to the 2nd, 3rd and 4th columns and rows of the global matrix. This process is continued for all the elements contained in the finite element mesh.

Note that in the case of quadratic or quadrilateral shape elements the actual element matrices are again of different sizes. This fact is due to the different number of node points describing the element geometry. Nevertheless, the matrix generation and assembly process is conceptually the same.

Furthermore, in the case of three-dimensional elements the energy formulation is even more complex. These issues will be discussed in more detail in Chapter 3.
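The scatter of element terms into the global rows and columns described above can be sketched as follows; the two-element mesh, the zero-based node numbering and the placeholder element matrices are hypothetical, standing in for the k_e integrals computed in the next section.

```python
import numpy as np

# Hypothetical 4-node mesh of two triangles sharing the 2-3 edge (zero-based node numbers)
n_nodes = 4
connectivity = [(0, 1, 2),    # element 1: nodes 1, 2, 3
                (1, 2, 3)]    # element 2: nodes 2, 3, 4

def element_matrix(e):
    # Placeholder for k_e (the integral of B^T B, computed per element in Section 1.5);
    # any symmetric 3x3 matrix illustrates the scatter pattern.
    return np.full((3, 3), float(e + 1))

K = np.zeros((n_nodes, n_nodes))
for e, conn in enumerate(connectivity):
    ke = element_matrix(e)
    for a, ga in enumerate(conn):          # local row a -> global row ga
        for b, gb in enumerate(conn):      # local column b -> global column gb
            K[ga, gb] += ke[a, b]

print(K)   # rows/columns of nodes 2 and 3 receive contributions from both elements
```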


1.5 Element matrix generation

Let us now focus on calculating the element matrix integrals. Since for our model B is constant (a function of only the coordinates of the node points of the element), this may be simplified to

k_e = B^T B \int\int dx \, dy = B^T B A_e,

where A_e is the surface area of the element,

A_e = \int\int dx \, dy.

In order to evaluate this integral, the element is usually represented in parametric coordinates. Let us consider again the local coordinates of the triangular element, now shown in Figure 1.4 with two specific coordinate axes representing the parametric system. The axis (coincident with the local x axis) going through node points 1 and 2 is the first parametric axis u. Define the other axis, going from node 1 through node 3, as v. If we define the (0, 0) parametric location to be node 1, (1, 0) to be node 2 and (0, 1) to be node 3, then the parametric transformation is of the form

u = \frac{1}{x_2} x - \frac{x_3}{x_2 y_3} y

and

v = \frac{1}{y_3} y.

Here we took advantage of the local coordinates of the nodes as shown in Figure 1.2. Note that this transformation may also be written as

u = N_2

and

v = N_3.

Furthermore, the points inside one element may be written as

x = N_1 x_1 + N_2 x_2 + N_3 x_3,

and

y = N_1 y_1 + N_2 y_2 + N_3 y_3.

Since we describe the coordinates of a point inside an element with the same shape functions that were used to approximate the displacement field, this is called an iso-parametric representation and our element is called an iso-parametric element.

FIGURE 1.4 Parametric coordinates of the triangular element: node 1 at (u, v) = (0, 0), node 2 at (1, 0), node 3 at (0, 1).

Applying the local coordinates of our element of Figure 1.2 yields

x = x_2 u + x_3 v

and

y = y_3 v.

The integral with this parameterization is

\int\int dx \, dy = \int\int \det\Big[\frac{\partial(x, y)}{\partial(u, v)}\Big] du \, dv.

Here the Jacobian matrix is

\frac{\partial(x, y)}{\partial(u, v)} = \begin{bmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{bmatrix} = \begin{bmatrix} x_2 & x_3 \\ 0 & y_3 \end{bmatrix}.

With this result

A_e = x_2 y_3 \int\int du \, dv.

In practice the parametric integral for each element is executed numerically, most commonly via Gaussian numerical integration: quadrature for two dimensions and cubature for three dimensions. Note that this is in essence a reduction type computation, the main focus of Part II, as opposed to the analytic integration over the continuum domain.

Gaussian numerical integration has become the industry standard tool for integration of the element matrices by virtue of its higher accuracy than the Newton-Cotes type methods, such as Simpson's rule. In general, an integral over a specific continuous interval is approximated by a sum of weighted function values at some specific locations:

\int_{-1}^{1} f(t) \, dt = \sum_{i=1}^{n} c_i f(t_i).

Here n is the number of integration points used. The specific sampling locations are the zeroes of the n-th Legendre polynomial,

t_i : P_n(t_i) = 0,

and the recursive definition of the Legendre polynomials is

(k + 1) P_{k+1}(t) = (2k + 1) t P_k(t) - k P_{k-1}(t).

Starting from P_0(t) = 1 and P_1(t) = t, the recurrence produces

P_2(t) = \frac{1}{2}(3t^2 - 1),

P_3(t) = \frac{1}{2}(5t^3 - 3t),

and so on. The c_i weights are computed as

c_i = \int_{-1}^{1} L_{n-1,i}(t) \, dt,

where

L_{n,i}(t) = \prod_{j=1, j \neq i}^{n} \frac{t - t_j}{t_i - t_j}

is the i-th n-th order Lagrange polynomial with roots at the Legendre polynomial zeroes described above. For the most commonly occurring cases, Table 1.3 shows the values of c_i and t_i.

TABLE 1.3 Gauss weights and locations

n   t_i                         c_i
1   0                           2
2   1/√3, -1/√3                 1, 1
3   √(3/5), 0, -√(3/5)          5/9, 8/9, 5/9
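The sampling points and weights of Table 1.3 can be reproduced with a standard Gauss-Legendre routine; the sketch below assumes NumPy and also checks that an n-point rule integrates polynomials up to degree 2n - 1 exactly.

```python
import numpy as np

# Reproduce the entries of Table 1.3: roots of P_n and the corresponding weights
for n in (1, 2, 3):
    t, c = np.polynomial.legendre.leggauss(n)
    print(n, np.round(t, 6), np.round(c, 6))
# n = 1: t = [0],                      c = [2]
# n = 2: t = [-1/sqrt(3), 1/sqrt(3)],  c = [1, 1]
# n = 3: t = [-sqrt(3/5), 0, sqrt(3/5)], c = [5/9, 8/9, 5/9]

# An n-point rule integrates polynomials up to degree 2n - 1 exactly on [-1, 1]
t, c = np.polynomial.legendre.leggauss(2)
f = lambda t: 3.0 * t**2 + t              # exact integral over [-1, 1] is 2
print(np.sum(c * f(t)))                   # 2.0
```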

Now integrating over the parametric domain of our element, the integral has the following boundaries:

\int_{u=0}^{1} \int_{v=0}^{1-u} dv \, du.

This is clear when looking at Figure 1.4. One needs to transform the above integral boundaries to the standard [-1, 1] interval required by the Gaussian numerical integration. This may be done with the transformation

v = \frac{1 - u}{2} + \frac{1 - u}{2} r,

and

dv = \frac{1 - u}{2} dr,

as well as

u = \frac{1}{2} + \frac{1}{2} s,

and

du = \frac{1}{2} ds.

The transformed integral, amenable to Gauss quadrature, is

\int_{s=-1}^{1} \frac{1}{2} \int_{r=-1}^{1} \Big(\frac{1}{4} - \frac{1}{4} s\Big) dr \, ds.

Using the 1-point Gauss formula this is

\frac{1}{2} \cdot 2 \cdot \Big(\frac{1}{4} \cdot 2\Big) = \frac{1}{2}.

With this the surface area of the element is

A_e = \frac{x_2 y_3}{2},

which agrees with the geometric computation based on the triangle's local coordinates. This is a rather roundabout way of computing the area of a triangle; note, however, that the discussion here is aimed at introducing generally applicable principles.
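Putting the pieces together for the membrane triangle, the following sketch (assuming NumPy and hypothetical local node coordinates) evaluates the strain-displacement matrix B of Section 1.4, forms k_e = B^T B A_e with A_e = x_2 y_3 / 2, and cross-checks the area against the usual geometric formula.

```python
import numpy as np

# Hypothetical local node coordinates of Figure 1.2
x2, x3, y3 = 2.0, 0.5, 1.5

# Strain-displacement matrix B from Section 1.4 (constant over the element)
B = np.array([[-1.0/x2,             1.0/x2,        0.0   ],
              [(x3 - x2)/(x2*y3),  -x3/(x2*y3),    1.0/y3]])

Ae = x2 * y3 / 2.0          # area obtained above by the 1-point Gauss integration
ke = B.T @ B * Ae           # 3x3 membrane element matrix k_e = B^T B A_e
print(ke)

# Cross-check the area with the standard geometric (shoelace) formula
p1, p2, p3 = np.array([0.0, 0.0]), np.array([x2, 0.0]), np.array([x3, y3])
area = 0.5 * abs((p2[0]-p1[0])*(p3[1]-p1[1]) - (p2[1]-p1[1])*(p3[0]-p1[0]))
print(Ae, area)             # both 1.5
```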

Naturally, there is a wealth of element types used in various industries. Even for the simple triangular geometry there are other formulations. The extensions are in both the number of node points describing the triangular element as well as in the number of degrees of freedom associated with a node point.


1.6 Local to global coordinate transformation

When the element matrix assembly issue was addressed earlier, the element matrix had been developed in terms of local (x, y, z) coordinates. In the case of multiple elements, all the elements have their respective local coordinate system chosen on the same principle of the local x axis being collinear with one of the element sides and another one perpendicular.

Thus, before assembling any element, the element matrix must be transformed to the global coordinate system common to all the elements. Let us denote the element's local coordinate system by (x, y, z) and the global coordinate system by (X, Y, Z). The unit direction vectors of the two coordinate systems are related as

\begin{bmatrix} i \\ j \\ k \end{bmatrix} = T \begin{bmatrix} I \\ J \\ K \end{bmatrix},

where the terms of the transformation are easily obtained from the geometric relation between the local and global systems. Specifically,

T = \begin{bmatrix} t_{11} & t_{12} & t_{13} \\ t_{21} & t_{22} & t_{23} \\ t_{31} & t_{32} & t_{33} \end{bmatrix},

where the t_{mn} term is the cosine of the angle between the m-th local coordinate axis and the n-th global coordinate axis. The same transformation is applicable to the nodal degrees of freedom of any element:

\begin{bmatrix} q_x \\ q_y \\ q_z \end{bmatrix} = T \begin{bmatrix} q_X \\ q_Y \\ q_Z \end{bmatrix}.

Hence, the element displacements in the two systems are related as

q_e = G^{lg} q_e^g,

where the upper left and the lower right 3 x 3 blocks of the 6 x 6 G^{lg} matrix are the same as the T matrix, and the other blocks are zero. The q_e^g notation refers to the element displacements in the global coordinate system.

Considering the element energy contribution

E_e = \frac{1}{2} q_e^T k_e q_e

and substituting the above, we get

E_e = \frac{1}{2} q_e^{g,T} G^{lg,T} k_e G^{lg} q_e^g,

or

k_e^g = G^{lg,T} k_e G^{lg}.

This transformation follows the element matrix generation and precedes the assembly process. Naturally, the solution q_e^g is also represented in global coordinates, which is the subject of interest to the engineer anyway.
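A minimal sketch of this transformation, assuming NumPy, a hypothetical rotation of the local frame about the global Z axis, and a placeholder 6 x 6 local element matrix:

```python
import numpy as np

# Hypothetical rotation of the local frame about the global Z axis by 30 degrees
alpha = np.radians(30.0)
c, s = np.cos(alpha), np.sin(alpha)

# Direction cosine matrix T: t_mn = cosine between local axis m and global axis n
T = np.array([[  c,   s, 0.0],
              [ -s,   c, 0.0],
              [0.0, 0.0, 1.0]])

# Block-diagonal G^lg with T in the upper-left and lower-right 3x3 blocks
Glg = np.kron(np.eye(2), T)

# Placeholder local element matrix (any symmetric 6x6 serves to show the congruence)
ke = np.diag([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

kg = Glg.T @ ke @ Glg          # element matrix in global coordinates
print(np.allclose(kg, kg.T))   # symmetry (and hence the energy E_e) is preserved
```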

This issue will not be discussed further; the elements introduced later will be generated either in terms of local or global coordinates to simplify the particular discussion. Commercial finite element analysis systems have specific rules for the definition of local coordinates for various element types.

1.7 A linear quadrilateral finite element

So far we have discussed the rather limited triangular element formulation, mainly to provide a foundation for presenting the integration and assembly computations. We continue this chapter with the discussion of a more practical quadrilateral or rectangular element, but first we focus on the linear case. Quadrilateral elements are the most frequently used elements of industrial finite element analysis when analyzing topologically two-dimensional models, such as the body of an automobile or an airplane fuselage.

Let us place the element in the x-y plane as shown in Figure 1.5, but pose no other restriction on its location. Based on the principles we developed in connection with the simple triangular element, we introduce shape functions. As we have four nodes in a quadrilateral element, we will have four shape functions, each of whose values vanish at every node but one. For j = 1, 2, 3, 4, we define

N_i = \begin{cases} 1 & \text{when } i = j, \\ 0 & \text{when } i \neq j. \end{cases}

We create an element parametric coordinate system u, v, with origin in the interior of the element, in which the shape functions take the form

N_i = \frac{1}{4}(1 + u u_i)(1 + v v_i).

Such a coordinate system is shown in Figure 1.6.

FIGURE 1.5 A planar quadrilateral element with nodes 1-4 and nodal displacements q_{ix}, q_{iy}

The mapping of the general element to the parametric coordinates is the following counterclockwise pattern:

(x_1, y_1) \rightarrow (-1, -1),

(x_2, y_2) \rightarrow (1, -1),

(x_3, y_3) \rightarrow (1, 1),

and

(x_4, y_4) \rightarrow (-1, 1).

The corresponding four shape functions are

N_1 = \frac{1}{4}(1 - u)(1 - v),

N_2 = \frac{1}{4}(1 + u)(1 - v),

N_3 = \frac{1}{4}(1 + u)(1 + v),

and

N_4 = \frac{1}{4}(1 - u)(1 + v).

FIGURE 1.6 Parametric coordinates of the quadrilateral element: corner nodes at (±1, ±1), origin at (0, 0)

These so-called Lagrangian shape functions will be used for the element formulation. The above selection of the N_i functions obviously again satisfies

N_1 + N_2 + N_3 + N_4 = 1.
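A quick numerical check of these four shape functions, assuming NumPy: each is unity at its own parametric corner, zero at the other three, and they sum to one anywhere in the element.

```python
import numpy as np

# Parametric corner coordinates (u_i, v_i) from Figure 1.6
uv = np.array([[-1.0, -1.0], [1.0, -1.0], [1.0, 1.0], [-1.0, 1.0]])

def N(u, v):
    return 0.25 * (1.0 + u * uv[:, 0]) * (1.0 + v * uv[:, 1])

for i, (ui, vi) in enumerate(uv):
    print("node", i + 1, N(ui, vi))   # unity at its own corner, zero at the others

print(N(0.3, -0.2).sum())             # partition of unity: 1.0
```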

The element deformations, however, will not be vertical to the plane of the element as in the earlier triangular membrane element. This element will have deformations in the plane of the element. Hence, there are eight nodal displacements of the element:

q_e = \begin{bmatrix} q_{1x} \\ q_{1y} \\ q_{2x} \\ q_{2y} \\ q_{3x} \\ q_{3y} \\ q_{4x} \\ q_{4y} \end{bmatrix}.

The displacement at any location inside this element is approximated with the help of the matrix of shape functions as

q(x, y) = N q_e.

Since

q(x, y) = \begin{bmatrix} q_x(x, y) \\ q_y(x, y) \end{bmatrix},

the N matrix of the four shape functions is organized as

N = \begin{bmatrix} N_1 & 0 & N_2 & 0 & N_3 & 0 & N_4 & 0 \\ 0 & N_1 & 0 & N_2 & 0 & N_3 & 0 & N_4 \end{bmatrix}.

Following the iso-parametric principle introduced earlier, the location of a point inside the element is approximated with the same four shape functions as the displacement field:

x = N_1 x_1 + N_2 x_2 + N_3 x_3 + N_4 x_4,

and

y = N_1 y_1 + N_2 y_2 + N_3 y_3 + N_4 y_4.

Here x_i, y_i is the location of the i-th node of the element in the x and y directions. Using the shape functions defined above with the element coordinates and substituting, we get

x = \frac{1}{4}[(1-u)(1-v)x_1 + (1+u)(1-v)x_2 + (1+u)(1+v)x_3 + (1-u)(1+v)x_4]

= \frac{1}{4}[(x_1 + x_2 + x_3 + x_4) + u(-x_1 + x_2 + x_3 - x_4) + v(-x_1 - x_2 + x_3 + x_4) + uv(x_1 - x_2 + x_3 - x_4)].

Similarly,

y = \frac{1}{4}[(y_1 + y_2 + y_3 + y_4) + u(-y_1 + y_2 + y_3 - y_4) + v(-y_1 - y_2 + y_3 + y_4) + uv(y_1 - y_2 + y_3 - y_4)].

To calculate the element energy and the element matrix, the strain components and the B strain-displacement matrix need to be computed. The element has three strain components, defined from the possible six used in a three-dimensional continuum. They are

\epsilon = \begin{bmatrix} \frac{\partial q_x}{\partial x} \\ \frac{\partial q_y}{\partial y} \\ \frac{\partial q_x}{\partial y} + \frac{\partial q_y}{\partial x} \end{bmatrix}.

We still make an effort here to stay on the mathematical side of the discussion; this will be expanded when modeling a physical phenomenon. Clearly, the first two components are the rates of change of distances between points of the element in the appropriate directions. The third component is a combined rate of change with respect to the other variable in the plane, defining an angular deformation.

The relationship to the nodal displacements is described in matrix form as

\epsilon = B q_e.

Since the shape functions are given in terms of the parametric coordinates, we again need the Jacobian:

J = \frac{\partial(x, y)}{\partial(u, v)} = \begin{bmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{bmatrix} = \frac{1}{4} \begin{bmatrix} j_{11} & j_{12} \\ j_{21} & j_{22} \end{bmatrix}.

The terms are

j_{11} = -(1 - v)x_1 + (1 - v)x_2 + (1 + v)x_3 - (1 + v)x_4,

j_{12} = -(1 - u)x_1 - (1 + u)x_2 + (1 + u)x_3 + (1 - u)x_4,

j_{21} = -(1 - v)y_1 + (1 - v)y_2 + (1 + v)y_3 - (1 + v)y_4,

and

j_{22} = -(1 - u)y_1 - (1 + u)y_2 + (1 + u)y_3 + (1 - u)y_4.

Since

\begin{bmatrix} \frac{\partial q}{\partial u} \\ \frac{\partial q}{\partial v} \end{bmatrix} = J \begin{bmatrix} \frac{\partial q}{\partial x} \\ \frac{\partial q}{\partial y} \end{bmatrix},

the strain components required for the element are

\begin{bmatrix} \frac{\partial q_x}{\partial x} \\ \frac{\partial q_x}{\partial y} \end{bmatrix} = J^{-1} \begin{bmatrix} \frac{\partial q_x}{\partial u} \\ \frac{\partial q_x}{\partial v} \end{bmatrix}

and

\begin{bmatrix} \frac{\partial q_y}{\partial x} \\ \frac{\partial q_y}{\partial y} \end{bmatrix} = J^{-1} \begin{bmatrix} \frac{\partial q_y}{\partial u} \\ \frac{\partial q_y}{\partial v} \end{bmatrix}.

Taking advantage of the components of J and using the adjoint-based inverse, we compute

\epsilon = \begin{bmatrix} \frac{\partial q_x}{\partial x} \\ \frac{\partial q_y}{\partial y} \\ \frac{\partial q_x}{\partial y} + \frac{\partial q_y}{\partial x} \end{bmatrix} = \frac{1}{\det(J)} \begin{bmatrix} j_{22} & -j_{12} & 0 & 0 \\ 0 & 0 & -j_{21} & j_{11} \\ -j_{21} & j_{11} & j_{22} & -j_{12} \end{bmatrix} \begin{bmatrix} \frac{\partial q_x}{\partial u} \\ \frac{\partial q_x}{\partial v} \\ \frac{\partial q_y}{\partial u} \\ \frac{\partial q_y}{\partial v} \end{bmatrix}.

From the displacement field approximation equations we obtain

\begin{bmatrix} \frac{\partial q_x}{\partial u} \\ \frac{\partial q_x}{\partial v} \\ \frac{\partial q_y}{\partial u} \\ \frac{\partial q_y}{\partial v} \end{bmatrix} = \frac{1}{4} \begin{bmatrix} -(1-v) & 0 & (1-v) & 0 & (1+v) & 0 & -(1+v) & 0 \\ -(1-u) & 0 & -(1+u) & 0 & (1+u) & 0 & (1-u) & 0 \\ 0 & -(1-v) & 0 & (1-v) & 0 & (1+v) & 0 & -(1+v) \\ 0 & -(1-u) & 0 & -(1+u) & 0 & (1+u) & 0 & (1-u) \end{bmatrix} q_e.

The last two equations produce the B matrix, of size 3 x 8, which is now not constant; it is linear in u and v. Recall that the energy of the element is

E_e = \frac{1}{2} \int\int \epsilon^T \epsilon \, dx \, dy.

With the substitution \epsilon = B q_e we obtain

E_e = \frac{1}{2} q_e^T \int\int B^T B \, dx \, dy \, q_e = \frac{1}{2} q_e^T k_e q_e.

The element matrix is

k_e = \int\int B^T B \det\Big[\frac{\partial(x, y)}{\partial(u, v)}\Big] du \, dv.

By the fortuitous choice of the parametric coordinate system, this integral is now directly amenable to Gaussian quadrature, as the limits are -1, +1. Introducing

f(u, v) = B^T B \det(J),

the element integral becomes

k_e = \int_{u=-1}^{1} \int_{v=-1}^{1} f(u, v) \, du \, dv = \sum_{i=1}^{n} c_i \sum_{j=1}^{n} c_j f(u_i, v_j).

Here u_i, v_j are not the nodal point displacements, but the Gauss point locations (shown as t_i in Table 1.3) in those directions. Applying the two-point (n = 2) formula,

k_e = c_1^2 f(u_1, v_1) + c_1 c_2 f(u_1, v_2) + c_2 c_1 f(u_2, v_1) + c_2^2 f(u_2, v_2),

where the c_i are also listed in Table 1.3.
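The following sketch carries out this 2 x 2 Gauss integration of the element matrix, assuming NumPy and hypothetical node coordinates; the Jacobian is assembled here directly from the parametric derivatives of the shape functions rather than via the j_{mn} expressions above, but the result is the same k_e = ∫∫ B^T B det(J) du dv.

```python
import numpy as np

# Hypothetical node coordinates (x_i, y_i), numbered counterclockwise as in Figure 1.5
xy = np.array([[0.0, 0.0], [2.0, 0.1], [2.2, 1.2], [0.1, 1.0]])
uv = np.array([[-1, -1], [1, -1], [1, 1], [-1, 1]], dtype=float)  # parametric corners

def B_and_detJ(u, v):
    # Parametric derivatives of the bilinear shape functions N_i = (1 + u*ui)(1 + v*vi)/4
    dNdu = 0.25 * uv[:, 0] * (1.0 + v * uv[:, 1])
    dNdv = 0.25 * uv[:, 1] * (1.0 + u * uv[:, 0])
    # Jacobian of the iso-parametric mapping x = sum N_i x_i, y = sum N_i y_i,
    # arranged so that it maps (dN/dx, dN/dy) to (dN/du, dN/dv)
    Jmat = np.array([[dNdu @ xy[:, 0], dNdu @ xy[:, 1]],
                     [dNdv @ xy[:, 0], dNdv @ xy[:, 1]]])
    dNdx, dNdy = np.linalg.solve(Jmat, np.vstack([dNdu, dNdv]))
    # Strain-displacement matrix for the three in-plane strain components of the text
    B = np.zeros((3, 8))
    B[0, 0::2] = dNdx                     # d(qx)/dx
    B[1, 1::2] = dNdy                     # d(qy)/dy
    B[2, 0::2], B[2, 1::2] = dNdy, dNdx   # d(qx)/dy + d(qy)/dx
    return B, np.linalg.det(Jmat)

g = 1.0 / np.sqrt(3.0)                    # two-point Gauss locations; weights are 1
ke = np.zeros((8, 8))
for ui in (-g, g):
    for vj in (-g, g):
        B, detJ = B_and_detJ(ui, vj)
        ke += B.T @ B * detJ              # c_i = c_j = 1 for the two-point rule
print(ke.shape, np.allclose(ke, ke.T))    # (8, 8) True
```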

This concludes the computational techniques of the linear two-dimensional quadrilateral element. In practice the quadratic version is much preferred and will be described in the following section.


1.8 Quadratic finite elements

We view the element in the x-y plane as shown in Figure 1.5, but add nodes at the middle of the sides of the square shown in Figure 1.6, depicting the parametric plane of the element. The locations of these new node points of the quadratic element are:

(x_5, y_5) \rightarrow (0, -1),

(x_6, y_6) \rightarrow (1, 0),

(x_7, y_7) \rightarrow (0, 1),

and

(x_8, y_8) \rightarrow (-1, 0).

Connecting these points are four interior lines, described by the parametric equations

1 - u + v = 0,

connecting nodes 5 and 6,

1 - u - v = 0,

connecting nodes 6 and 7,

1 + u - v = 0,

connecting nodes 7 and 8, and finally

1 + u + v = 0,

connecting nodes 8 and 5, completing the loop. For j = 1, \dots, 8 we seek functions N_i that are unity at the i-th node and vanish at the others:

N_i = \begin{cases} 1 & \text{when } i = j, \\ 0 & \text{when } i \neq j. \end{cases}

Let us consider for example node 3. N3 must vanish along the opposite sidesof the rectangle

u = −1,

and

v = −1.


That will account for nodes 1, 2, 4, 5, 8. Furthermore it must also vanish at nodes 6 and 7, represented by the line

1 − u − v = 0.

Hence the form of the corresponding shape function is

N3 = nc(1 + u)(1 + v)(1 − u − v),

where the normalization coefficient nc for the corner shape functions may be established from the condition of N3 becoming unity at node 3

N3 = nc(1 + 1)(1 + 1)(1 − 1 − 1) = nc(−4) = 1,

yielding

nc = −1/4.

This is in part identical to the N3 shape function of the linear element, apart from the last term. The shape functions corresponding to the corner nodes, based on similar considerations, are of form

N1 = −(1/4)(1 − u)(1 − v)(1 + u + v),

N2 = −(1/4)(1 + u)(1 − v)(1 − u + v),

N3 = −(1/4)(1 + u)(1 + v)(1 − u − v),

and

N4 = −(1/4)(1 − u)(1 + v)(1 + u − v).

To define the shape functions at the mid-side points, we consider node 6 first. N6 must vanish along the three edges

v = 1,

v = −1,

and

u = −1.

Hence it will be of form

N6 = nm(1 + u)(1 + v)(1 − v).

Combining the last two terms via the well-known algebraic identity, we obtain

N6 = nm(1 + u)(1 − v^2).


This form now demonstrates the quadratic nature of the element. The normalization constant of the mid-side nodes nm is established by using the coordinates (1, 0) of node 6:

N6 = nm(1 + 1)(1 − 0^2) = nm · 2 = 1,

implying

nm = 1/2.

Hence, the mid-side shape functions are:

N5 = (1/2)(1 − u^2)(1 − v),

N6 = (1/2)(1 + u)(1 − v^2),

N7 = (1/2)(1 − u^2)(1 + v),

and

N8 = (1/2)(1 − u)(1 − v^2).

From here on, the process established in connection with the linear element is directly applicable. The nodal displacement vector will, of course, consist of 16 components and the N matrix of shape functions will also double in column size. The steps of the element matrix generation process are identical.
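The Kronecker delta property of the eight shape functions derived above is easy to verify numerically. The short sketch below (illustrative only; the node ordering of Figures 1.5 and 1.6 is assumed) checks that each N_i is unity at its own node, vanishes at the others, and that the functions sum to one inside the element.

```python
import numpy as np

def serendipity_shape_functions(u, v):
    """The eight shape functions of the quadratic (8-node) quadrilateral."""
    return np.array([
        -0.25 * (1 - u) * (1 - v) * (1 + u + v),   # N1, corner (-1, -1)
        -0.25 * (1 + u) * (1 - v) * (1 - u + v),   # N2, corner ( 1, -1)
        -0.25 * (1 + u) * (1 + v) * (1 - u - v),   # N3, corner ( 1,  1)
        -0.25 * (1 - u) * (1 + v) * (1 + u - v),   # N4, corner (-1,  1)
         0.50 * (1 - u**2) * (1 - v),              # N5, mid-side ( 0, -1)
         0.50 * (1 + u) * (1 - v**2),              # N6, mid-side ( 1,  0)
         0.50 * (1 - u**2) * (1 + v),              # N7, mid-side ( 0,  1)
         0.50 * (1 - u) * (1 - v**2),              # N8, mid-side (-1,  0)
    ])

nodes = [(-1, -1), (1, -1), (1, 1), (-1, 1),
         (0, -1), (1, 0), (0, 1), (-1, 0)]

# Each N_i must be 1 at its own node and 0 at the seven others.
table = np.array([serendipity_shape_functions(u, v) for (u, v) in nodes])
assert np.allclose(table, np.eye(8))

# The functions also sum to one anywhere in the element (partition of unity).
assert np.isclose(serendipity_shape_functions(0.3, -0.7).sum(), 1.0)
print("shape function checks passed")
```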

A similar flow of operations, in connection with the triangular element, results in a quadratic triangular element, the six-noded triangle. Let us consider the element depicted in Figure 1.4 and place mid-side nodes as follows:

(x4, y4) → (1/2, 1/2),

(x5, y5) → (0, 1/2),

and

(x6, y6) → (1/2, 0).

Following the above, the shape functions of the corner nodes will be

N1 = (1 − u − v)(1 − 2u − 2v),

N2 = u(2u − 1),

and

N3 = v(2v − 1).

The mid-side nodes are represented by

N4 = 4uv,


N5 = 4(1 − u − v)v,

and

N6 = 4u(1 − u − v).

They are of unit value in their respective locations and zero otherwise.

For higher order (so-called p-version) or physically more elaborate (non-planar) element formulations the reader is referred to [9] and [5], respectively.

The computational process of adding shape functions to mid-side nodes will easily generalize to three dimensions. The linear three-dimensional elements, such as the linear tetrahedral element introduced in Section 3.4 and the linear hexahedral element, subject of Section 6.2, may be extended to quadratic elements by the same procedure.

The foundation established in this chapter should carry us into the modeling of a physical phenomenon, where one more generalization of the finite element technology will be done by addressing three-dimensional domains. Before this issue is explored, however, the generation of a finite element model is discussed.

References

[1] Clough, R. W.; The finite element method in plane stress analysis, Proceedings of 2nd Conference of Electronic Computations, ASCE, 1960

[2] Courant, R.; Variational methods for the solution of problems of equilibrium and vibrations, Bulletin of American Mathematical Society, Vol. 49, pp. 1-23, 1943

[3] Galerkin, B. G.; Stabe und Platten: Reihen in gewissen Gleichgewichtsproblemen elastischer Stabe und Platten, Vestnik der Ingenieure, Vol. 19, pp. 897-908, 1915

[4] MacNeal, R. H.; NASTRAN theoretical manual, The MacNeal-Schwendler Corporation, 1972

[5] MacNeal, R. H.; Finite elements: Their design and performance, Marcel Dekker, New York, 1994

[6] Oden, J. T. and Reddy, J. N.; An introduction to the mathematical theory of finite elements, Wiley, New York, 1976


[7] Przemieniecki, J. S.; Theory of matrix structural analysis, McGraw-Hill, New York, 1968

[8] Ritz, W.; Uber eine neue Methode zur Losung gewisser Variationsprobleme der Mathematischen Physik, J. Reine Angewandte Mathematik, Vol. 135, pp. 1-61, 1908

[9] Szabo, B. and Babuska, I.; Finite element analysis, Wiley, New York, 1991

[10] Turner, M. J. et al; Stiffness and deflection analysis of complex structures, Journal of Aeronautical Science, Vol. 23, pp. 803-823, 1956

[11] Zienkiewicz, O. C.; The finite element method, McGraw-Hill, New York, 1968


2

Finite Element Model Generation

Finite element model generation involves two distinct components. First, the real life geometry of the physical phenomenon is approximated by geometric modeling. Second, the computational geometry model is discretized producing the finite element model. These issues are addressed in this chapter.

2.1 Bezier spline approximation

The first step in modeling the geometry of a solid object involves approximating its surfaces and edges with splines. Note that this step also embodies a certain reduction as the real life continuum geometry is approximated by a finite number of computational geometry entities. The most popular and practical geometric modeling tools are based on parametric splines.

Let us first consider an edge of a physical model described by a curve whose equation is

r(t) = x(t)i + y(t)j + z(t)k.

The original curve will be approximated by a set of cubic parametric spline segments of form

S(t) = a + bt + ct^2 + dt^3,

where t ranges from 0.0 to 1.0. Let us assume a set of points Pj, j = 1...m, representing the geometric object we are to model. For simplicity let us focus on the first segment of the curve defined by four points P0, P1, P2, P3. These four points define a Bezier [1] polygon as shown in Figure 2.1. The curve will go through the end-points P0 and P3. The tangents of the curve at the end points will be defined by the two intermediate (control) points P1, P2.

The Bezier spline segment is formed from these four points as

\[ S(t) = \sum_{i=0}^{3} P_i J_{3,i}(t). \]



FIGURE 2.1 Bezier polygon

Here

\[ J_{3,i}(t) = \binom{3}{i} t^i (1 - t)^{3-i} \]

are binomial polynomials. Using the boundary conditions of the Bezier curve (S(0), S(1), S'(0), S'(1)) the matrix form of the Bezier spline segment may be written as

S(t) = TMP.

Here the matrix P contains the Bezier vertices

\[ P = \begin{bmatrix} P_0 \\ P_1 \\ P_2 \\ P_3 \end{bmatrix}, \]

and the matrix M the interpolation coefficients


\[ M = \begin{bmatrix} 1 & 0 & 0 & 0 \\ -3 & 3 & 0 & 0 \\ 3 & -6 & 3 & 0 \\ -1 & 3 & -3 & 1 \end{bmatrix}. \]

T is a parametric row vector:

\[ T = \begin{bmatrix} 1 & t & t^2 & t^3 \end{bmatrix}. \]

A very important generalization of this form is to introduce weight functions. The result is the rational parametric Bezier spline segment of form

\[ S(t) = \frac{\sum_{i=0}^{3} w_i P_i J_{3,i}(t)}{\sum_{i=0}^{3} w_i J_{3,i}(t)}, \]

or in matrix notation

\[ S(t) = \frac{TMP}{TMW}. \]

Here

\[ P = \begin{bmatrix} w_0 P_0 \\ w_1 P_1 \\ w_2 P_2 \\ w_3 P_3 \end{bmatrix} \]

is the vector of weighted point coordinates and

\[ W = \begin{bmatrix} w_0 \\ w_1 \\ w_2 \\ w_3 \end{bmatrix} \]

is the array of weights. The weights have the effect of moving the curve closer to the control points, P1, P2, as shown in Figure 2.2.

The location of a specified point on the curve, Ps in Figure 2.2, defines three weights, while the remaining weight is covered by specifying the parameter value t* to which the specified point should belong on the spline. Most commonly t* = 1/2 is chosen for such a point. The weights enable us to increase the fidelity of the approximation of the original curves.

The curve segment is finally approximated by

\[ r(t) = \frac{TMX}{TMW}\,\mathbf{i} + \frac{TMY}{TMW}\,\mathbf{j} + \frac{TMZ}{TMW}\,\mathbf{k}. \]

Here

\[ X = \begin{bmatrix} w_0 x_0 \\ w_1 x_1 \\ w_2 x_2 \\ w_3 x_3 \end{bmatrix}, \quad
Y = \begin{bmatrix} w_0 y_0 \\ w_1 y_1 \\ w_2 y_2 \\ w_3 y_3 \end{bmatrix}, \quad
Z = \begin{bmatrix} w_0 z_0 \\ w_1 z_1 \\ w_2 z_2 \\ w_3 z_3 \end{bmatrix}, \]


FIGURE 2.2 The effect of weights on the shape of spline

where xi, yi, zi are the coordinates of the i-th Bezier point. An additional advantage of using rational Bezier splines is to be able to exactly represent conic sections and quadratic surfaces. These are common components of industrial models, for manufacturing as well as esthetic reasons.

In practice the geometric boundary is likely to be described by many points and, therefore, a collection of spline segments. Consider the collection of points describing multiple spline segments shown in Figure 2.3. The most important question arising in this regard is the continuity between segments. Since the Bezier splines are always tangential to the first and last segments of the Bezier polygon, clearly a first order continuity exists only if the Pi−1, Pi, Pi+1 points are collinear.

The presence of weights further specifies the continuity. Computing

\[ \frac{\partial S}{\partial t}(t=0) = 3 \frac{w_1}{w_0} (P_1 - P_0) \]

and

\[ \frac{\partial S}{\partial t}(t=1) = 3 \frac{w_2}{w_3} (P_3 - P_2). \]


FIGURE 2.3 Multiple Bezier segments

Here (Pi − Pj) is a vector pointing from Pj to Pi. Focusing on the adjoining segments of splines in Figure 2.4, the first order continuity condition is

\[ \frac{w_{i-1}}{w_{i-0}} (P_i - P_{i-1}) = \frac{w_{i+1}}{w_{i+0}} (P_{i+1} - P_i). \]

There is a rather subtle but important distinction here. There is a geometric continuity component that means that the tangents of the neighboring spline segments are collinear. Then there is an algebraic component resulting in the fact that the magnitude of the tangent vectors is also the same. The notation wi+0, wi−0 manifests the fact that the weights assigned to a control point in the neighboring segments do not have to be the same. If they are, a simplified first order continuity condition exists when

\[ \frac{w_{i-1}}{w_{i+1}} = \frac{(P_{i+1} - P_i)}{(P_i - P_{i-1})}. \]

Enforcing such a continuity is important in the fidelity of the geometry approximation and in the discretization to be discussed later.

For the same reasons a second order continuity is also desirable. By definition


FIGURE 2.4 Continuity of spline segments

\[ \frac{\partial^2 S}{\partial t^2}(t=0) = \left( 6\frac{w_1}{w_0} + 6\frac{w_2}{w_0} - 18\frac{w_1^2}{w_0^2} \right)(P_1 - P_0) + 6\frac{w_2}{w_0}(P_2 - P_1) \]

and

\[ \frac{\partial^2 S}{\partial t^2}(t=1) = \left( 6\frac{w_1}{w_3} + 6\frac{w_2}{w_3} - 18\frac{w_2^2}{w_3^2} \right)(P_2 - P_3) + 6\frac{w_1}{w_3}(P_1 - P_2). \]

Generalization to the boundary of neighboring segments, assuming that the weights assigned to the common point between the segments are the same, yields the second order continuity condition as

\[ w_{i-2}(P_{i-2} - P_i) - 3\frac{w_{i-1}^2}{w_i}(P_{i-1} - P_i) = w_{i+2}(P_{i+2} - P_i) - 3\frac{w_{i+1}^2}{w_i}(P_{i+1} - P_i). \]

This is a rather strict condition requiring that the two control points prior to and after the common point (five points in all) are coplanar with some additional weight relations.


FIGURE 2.5 Bezier patch definition

2.2 Bezier surfaces

The method discussed above is easily generalized to surfaces. A Bezier surface patch is defined by a set of points on the surface of the physical model (plus the control points and weights) as shown in Figure 2.5. The rational parametric Bezier patch is described as

\[ S(u, v) = \frac{\sum_{i=0}^{3} \sum_{j=0}^{3} w_{ij} J_{3,i}(u) J_{3,j}(v) P_{ij}}{\sum_{i=0}^{3} \sum_{j=0}^{3} w_{ij} J_{3,i}(u) J_{3,j}(v)}, \]

or in matrix form

\[ S(u, v) = \frac{U M P M^T V}{U M W M^T V}. \]


The computational components are the matrix of weighted point coordinates

\[ P = \begin{bmatrix} w_{00}P_{00} & w_{01}P_{01} & w_{02}P_{02} & w_{03}P_{03} \\ w_{10}P_{10} & w_{11}P_{11} & w_{12}P_{12} & w_{13}P_{13} \\ w_{20}P_{20} & w_{21}P_{21} & w_{22}P_{22} & w_{23}P_{23} \\ w_{30}P_{30} & w_{31}P_{31} & w_{32}P_{32} & w_{33}P_{33} \end{bmatrix}, \]

the parametric row vector

\[ U = \begin{bmatrix} 1 & u & u^2 & u^3 \end{bmatrix}, \]

and the column vector

\[ V = \begin{bmatrix} 1 \\ v \\ v^2 \\ v^3 \end{bmatrix}. \]

The weights form the matrix

\[ W = \begin{bmatrix} w_{00} & w_{01} & w_{02} & w_{03} \\ w_{10} & w_{11} & w_{12} & w_{13} \\ w_{20} & w_{21} & w_{22} & w_{23} \\ w_{30} & w_{31} & w_{32} & w_{33} \end{bmatrix}. \]

The geometric surface of the physical model is now approximated by the patch of

\[ r(u, v) = \frac{U M X M^T V}{U M W M^T V}\,\mathbf{i} + \frac{U M Y M^T V}{U M W M^T V}\,\mathbf{j} + \frac{U M Z M^T V}{U M W M^T V}\,\mathbf{k}. \]

Here X, Y, Z contain the weighted x, y, z point coordinates, respectively. Again, in a complex physical domain a multitude of these patches is used to completely cover the surface. The earlier continuity discussion generalizes for surface patches. The derivatives

\[ \frac{\partial S(u, v)}{\partial u} \quad \text{and} \quad \frac{\partial S(u, v)}{\partial v} \]

will be the cornerstones of such relations. Similar arithmetic expressions used for the spline segments produce the first order continuity condition across the patch boundaries as shown in Figure 2.6:

\[ \frac{w_{i+1,j+1}}{w_{i-1,j+1}} = \frac{(P_{i-1,j+1} - P_{i,j+1})}{(P_{i+1,j+1} - P_{i,j+1})}. \]

A similar treatment is applied to the v parametric direction. The second order


FIGURE 2.6 Patch continuity definition

continuity is based on

\[ \frac{\partial^2 P(u, v)}{\partial u\, \partial v} \]

computed at the corners, and the mathematics is rather tedious, albeit straightforward. The strictness of this condition is now almost overbearing, requiring nine control points to be coplanar. Therefore, it is seldom enforced in geometric modeling for finite element applications. It mainly contributes to the esthetic appearance of the surface created and as such it is preferred by shape designers.

The technique also generalizes to volumes of the physical model as

\[ S(u, v, t) = \frac{\sum_{i=0}^{3} \sum_{j=0}^{3} \sum_{k=0}^{3} w_{ijk} J_{3,i}(t) J_{3,j}(u) J_{3,k}(v) P_{ijk}}{\sum_{i=0}^{3} \sum_{j=0}^{3} \sum_{k=0}^{3} w_{ijk} J_{3,i}(t) J_{3,j}(u) J_{3,k}(v)}. \]

The matrix form and the final approximation form may be developed along the same lines as above for splines or patches. The result is

\[ S(u, v, t) = \frac{\sum_{k=0}^{3} J_{3,k}(t)\, U M P_k M^T V}{\sum_{k=0}^{3} J_{3,k}(t)\, U M W_k M^T V}. \]


The k layers of the volume are individual spline patches and the weights are defined as earlier. The formulation enables the modeling of volumes of revolution or extrusion; the details of those are beyond our needs here.

The points corresponding to equi-parametric values of the splines, surface patches and volumes are of course not equally separated in space. Sometimes it is necessary to re-parameterize one of these objects to smoothen the parametric distribution in a geometric sense. Nevertheless, the equi-parametric locations of these objects may constitute a basis for the discretization discussed in the next section.

The Bezier objects’ industrial popularity is due to the following reasons:

1. The convex hull property: all Bezier curves, surface patches or volumes are contained inside of the hull of their control points.
2. The variation diminishing property: the number of intersection points between a Bezier curve and an infinite plane is the same as the number of intersections between the plane and the control polygon.
3. All derivatives and products of Bezier functions are easily computed Bezier functions.

These properties are exploited in industrial geometric modeling computations.

2.3 B-spline technology

An alternative to the Bezier spline technology is based on the B-splines. The technology allows a set of input points to be either interpolated or approximated, providing much more flexibility. The curves are still directed by control points, however, they are not given a priori, they are computed as part of the process. The technology, therefore, is more flexible than the Bezier technology and is preferred in the industry.

A general non-uniform, non-rational B-spline is described by

\[ S(t) = \sum_{i=0}^{n} B_{i,k}(t) Q_i, \]

where Qi are the yet unknown control points and Bi,k are the B-spline basis functions of degree k. They are computed based on a certain parameterization influencing the shape of the curve. Note that for now we are focusing on non-rational, non-uniform B-splines.

The basis functions are initiated by

\[ B_{i,0}(t) = \begin{cases} 1, & t_i \le t < t_{i+1} \\ 0, & t < t_i \ \text{or}\ t \ge t_{i+1} \end{cases} \]

and higher order terms are recursively computed:

\[ B_{i,k}(t) = \frac{t - t_i}{t_{i+k} - t_i} B_{i,k-1}(t) + \frac{t_{i+k+1} - t}{t_{i+k+1} - t_{i+1}} B_{i+1,k-1}(t). \]

The parameter values for the spline may be assigned via various methods. The simplest, and most widely used, method is uniform spacing. The method for n + 1 points is defined by the parameter vector

\[ t = \begin{bmatrix} 0 & 1 & 2 & \dots & n \end{bmatrix}. \]

When the input points are geometrically somewhat equidistant this is proven to be a good method for parameterization. When the input points are spaced in widely varying intervals, a parameterization based on the chord length may also be used.

The parameter vector is commonly normalized as

\[ t = \begin{bmatrix} 0 & 1/n & 2/n & \dots & 1 \end{bmatrix}. \]

Such normalization places all the parameter values in the interval (0, 1), easing the complexity of the evaluation of the basis functions.

First we seek to interpolate a given set of points

\[ P_j = (P_{xj}, P_{yj}, P_{zj}); \quad j = 0, \dots, m, \]

requiring that the B-spline S(t) at parameter value t_j passes through the given point P_j. This results in the equation

\[ \begin{bmatrix} P_0 \\ P_1 \\ \vdots \\ P_m \end{bmatrix} =
\begin{bmatrix}
B_{0,k}(t_0) & B_{1,k}(t_0) & B_{2,k}(t_0) & \dots & B_{n,k}(t_0) \\
B_{0,k}(t_1) & B_{1,k}(t_1) & B_{2,k}(t_1) & \dots & B_{n,k}(t_1) \\
\dots & \dots & \dots & \dots & \dots \\
B_{0,k}(t_m) & B_{1,k}(t_m) & B_{2,k}(t_m) & \dots & B_{n,k}(t_m)
\end{bmatrix}
\begin{bmatrix} Q_0 \\ Q_1 \\ \vdots \\ Q_n \end{bmatrix}. \]

Using a matrix notation, the problem is

P = BQ,

where the P column matrix contains m + 1 terms and the Q column matrix contains n + 1 terms, resulting in a rectangular system matrix B with (m + 1) rows and (n + 1) columns. This problem may not be solved in general when m < n, the case when the number of points given is less than the number of control points. The problem may also be only solved in a least squares sense when m > n, having more input points than control points.

The problem has a unique solution for the case of m = n, and in this case the sequence of unknown control points is obtained in the form of

Q = B−1P,

where the inverse is shown for the sake of simplicity; it is not necessarily computed. In fact, the B matrix exhibits a banded pattern that is dependent on the degree k of the spline chosen. Specifically, the semi-bandwidth is less than the order k:

\[ B_{i,k}(t_j) = 0 \quad \text{for } |i - j| \ge k. \]

This fact should be exploited to produce an efficient solution.

The second approach is to approximate the input points in a least squares sense, resulting in a distinctly different curve. This may be obtained by finding a minimum of the squares of the distances between the spline and the points,

\[ \sum_{j=0}^{m} \big(S(t_j) - P_j\big)^2. \]

Substituting the B-spline formulation and the basis functions results in

\[ \sum_{j=0}^{m} \left( \sum_{i=0}^{n} B_{i,k}(t_j) Q_i - P_j \right)^2. \]

The derivative with respect to an unknown control point Qp is

\[ 2 \sum_{j=0}^{m} B_{p,k}(t_j) \left( \sum_{i=0}^{n} B_{i,k}(t_j) Q_i - P_j \right) = 0, \]

where p = 0, 1, . . . , n. This results in a system of equations, with n + 1 rows and columns, in the form

\[ B^T B Q = B^T P \]

with the earlier introduced B matrix. The solution of this system produces an approximating, not an interpolating, solution.

The technology may also be extended to include smoothing considerations and directional constraints to the splines, topics that are discussed at length in [3].


2.4 Computational example

Considering that the problem is given in 3-space, the solution for the x, y, z coordinates may be obtained simultaneously:

\[ \begin{bmatrix} Q_{x0} & Q_{y0} & Q_{z0} \\ Q_{x1} & Q_{y1} & Q_{z1} \\ \dots & \dots & \dots \\ Q_{xn} & Q_{yn} & Q_{zn} \end{bmatrix}
= B^{-1}
\begin{bmatrix} P_{x0} & P_{y0} & P_{z0} \\ P_{x1} & P_{y1} & P_{z1} \\ \dots & \dots & \dots \\ P_{xn} & P_{yn} & P_{zn} \end{bmatrix}. \]

For a fixed degree, say k = 3, and uniformly parameterized B-spline segments the basis functions may be analytically computed as:

\[ B_{0,3} = \frac{1}{6}(1 - t)^3, \]

\[ B_{1,3} = \frac{1}{6}(3t^3 - 6t^2 + 4), \]

\[ B_{2,3} = \frac{1}{6}(-3t^3 + 3t^2 + 3t + 1), \]

and

\[ B_{3,3} = \frac{1}{6} t^3. \]

For the case of 4 points (n = 3) the uniform parameter vector becomes:

\[ t = \begin{bmatrix} 0 & 1 & 2 & 3 \end{bmatrix}. \]

For this case the interpolation system matrix is easily computed by hand as

\[ B = \frac{1}{6} \begin{bmatrix} 1 & 4 & 1 & 0 \\ 0 & 1 & 4 & 1 \\ -1 & 4 & -5 & 8 \\ -8 & 31 & -44 & 27 \end{bmatrix}. \]

The matrix is nonsingular and its inverse is:

\[ B^{-1} = \frac{1}{6} \begin{bmatrix} 21 & -28 & 17 & -4 \\ 4 & 5 & -4 & 1 \\ -1 & 8 & -1 & 0 \\ 0 & -1 & 8 & -1 \end{bmatrix}. \]

The solution for the control points is obtained as

Q = B−1P,


where P is the vector of input points. For example for the points

\[ P = \begin{bmatrix} 0 & 0 \\ 1 & 1 \\ 2 & 1 \\ 3 & 0 \end{bmatrix}, \]

the control points obtained are

\[ Q = \begin{bmatrix} -1 & -11/6 \\ 0 & 1/6 \\ 1 & 7/6 \\ 2 & 7/6 \end{bmatrix}. \]

Figure 2.7 shows the curve generated from the control points interpolating the given input points, while spanning the parameter range from 0 to 3.

FIGURE 2.7 B spline interpolation

To evaluate the spline curve as a function of any parameter value in the span, the following matrix formula (conceptually similar to the Bezier form) may be used:

S(t) = TCQ,

with

\[ C = \frac{1}{6} \begin{bmatrix} 1 & 4 & 1 & 0 \\ -3 & 0 & 3 & 0 \\ 3 & -6 & 3 & 0 \\ -1 & 3 & -3 & 1 \end{bmatrix}, \]

where the C matrix is gathered from the coefficients of the analytic basis functions above, and

\[ T = \begin{bmatrix} 1 & t & t^2 & t^3 \end{bmatrix}. \]

This formula enables the validation of the spline going through the input points. For example

\[ S(t=1) = TCQ = \begin{bmatrix} 1 & 1 \end{bmatrix}, \]

which of course agrees with the second input point.
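The hand computation above is easy to reproduce numerically. The sketch below (an illustration, using a linear solve rather than forming B^{-1} explicitly, in the spirit of the earlier remark about banded solutions) recovers the control points Q and re-evaluates the spline at t = 1.

```python
import numpy as np

def basis_row(t):
    """Uniform cubic B-spline basis values from their analytic forms above."""
    return np.array([(1 - t)**3,
                     3*t**3 - 6*t**2 + 4,
                     -3*t**3 + 3*t**2 + 3*t + 1,
                     t**3]) / 6.0

P = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 1.0], [3.0, 0.0]])  # input points
t = np.array([0.0, 1.0, 2.0, 3.0])                              # uniform parameters

B = np.array([basis_row(tj) for tj in t])      # the 4 x 4 system matrix above
Q = np.linalg.solve(B, P)                      # control points, Q = B^-1 P
print(Q)   # rows approach [-1, -11/6], [0, 1/6], [1, 7/6], [2, 7/6]

# Evaluation S(t) = T C Q reproduces the input points, e.g. at t = 1:
C = np.array([[ 1,  4,  1, 0],
              [-3,  0,  3, 0],
              [ 3, -6,  3, 0],
              [-1,  3, -3, 1]]) / 6.0
T = np.array([1.0, 1.0, 1.0, 1.0])             # [1 t t^2 t^3] at t = 1
print(T @ C @ Q)                               # -> [1, 1], the second input point
```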

For demonstration of the approximation computation, we add another input point to the above set. The given set of 5 input points is:

\[ P = \begin{bmatrix} 0 & 0 \\ 1 & 1 \\ 2 & 1 \\ 3 & 0 \\ 2 & -1 \end{bmatrix}. \]

For the case of 5 points (n = 4) the parameter vector becomes:

\[ t = \begin{bmatrix} 0 & 1 & 2 & 3 & 4 \end{bmatrix}. \]

For this case the B matrix is

\[ B = \frac{1}{6} \begin{bmatrix} 1 & 4 & 1 & 0 \\ 0 & 1 & 4 & 1 \\ -1 & 4 & -5 & 8 \\ -8 & 31 & -44 & 27 \\ -27 & 100 & -131 & 64 \end{bmatrix}. \]

The solution for the control points in this case is obtained as

\[ Q = (B^T B)^{-1} B^T P. \]

Figure 2.8 shows the curve generated from these control points approximating the given input points, while spanning the parameter range from 0 to 4. The evaluation yields the approximation points


FIGURE 2.8 B spline approximation

\[ S_{app} = \begin{bmatrix} 0.028571 & -0.014286 \\ 0.885714 & 1.057143 \\ 2.171429 & 0.914286 \\ 2.885714 & 0.057143 \\ 2.028571 & -1.014286 \end{bmatrix}, \]

which reasonably well approximate the input points, while producing a smooth curve.
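The least squares variant may be sketched the same way. The code below (an illustration, not the author's implementation) forms the rectangular B matrix and solves the normal equations B^T B Q = B^T P via a least squares routine; it should reproduce, up to rounding, the approximation points listed above.

```python
import numpy as np

def basis_row(t):
    """Uniform cubic B-spline basis values, extended analytically as in the text."""
    return np.array([(1 - t)**3,
                     3*t**3 - 6*t**2 + 4,
                     -3*t**3 + 3*t**2 + 3*t + 1,
                     t**3]) / 6.0

P = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 1.0], [3.0, 0.0], [2.0, -1.0]])
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])

B = np.array([basis_row(tj) for tj in t])          # 5 x 4 rectangular system
Q, *_ = np.linalg.lstsq(B, P, rcond=None)          # solves B^T B Q = B^T P
S_app = B @ Q                                      # approximation points S(t_j)
print(S_app)       # close to, but not exactly on, the five input points
```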

The selection of the parameter values enables interesting and useful shape variations of the spline around the same set of given points. For example, repeated parameter values at both ends enforce a clamped boundary condition, forcing the curve through the end points. Figure 2.9 shows the curve generated from these control points and for the same input points, still approximating them.

Note that the number of pre-assigned parameter values in this case becomes 9 and the parameter vector will be

\[ t = \begin{bmatrix} 0 & 0 & 0 & 1/4 & 1/2 & 3/4 & 1 & 1 & 1 \end{bmatrix}. \]


FIGURE 2.9 Clamped B spline approximation

Also note that the number of sections of the spline increased to 6 in this case. The figure depicts the sections of the spline with different line patterns as shown on the legend, while spanning the parameter range from 0 to 1.

Finally, the curve adhering to the same set of input points may also be closed by repeating an input point. The starting point repeated at the end results in the closed curve shown in Figure 2.10, in this case with 5 sections.

This example also demonstrated that a level of continuity between the segments of the B-spline is automatically assured depending on the degree k of the spline. There is no need for special considerations when a large number of input points is given.


FIGURE 2.10 Closed B spline approximation

2.5 NURBS objects

As in the Bezier technology, it is also possible to use weights in the B-spline technology, resulting in rational B-splines. When a non-uniform parameterization is also used, the splines become Non-Uniform, Rational B-splines, known as NURBS.

Introducing weights associated with each control point results in the NURBS curve of form

\[ S(t) = \frac{\sum_{i=0}^{n} w_i B_{i,k}(t) Q_i}{\sum_{i=0}^{n} w_i B_{i,k}(t)}. \]

The geometric meaning of the weights is similar to that of the Bezier technology; they will pull the curve closer to the input points. It is, however, important to point out that changing one single weight value will result only in a local shape change in the segment related to the point. This local control is one of the advantages of the B-spline technology over the Bezier approach.


The formulation extends quite easily to surfaces:

\[ S(u, v) = \frac{\sum_{i=0}^{n} \sum_{j=0}^{m} w_{i,j} B_{i,k}(u) B_{j,l}(v) Q_{i,j}}{\sum_{i=0}^{n} \sum_{j=0}^{m} w_{i,j} B_{i,k}(u) B_{j,l}(v)}. \]

Note that the degree of the v directional parametric curve, denoted by l, may be different than that of the u curve. Similarly the parameterization in both directions may be different. This gives tremendous flexibility to the method.

Geometric modeling operations are enabled by these objects. Consider generating a swept surface by moving a curve C(u) along a trajectory T(v). This is conceptually similar to generating a cylinder by defining a circle and the axis perpendicular to the plane of the circle. In general, the surface generated by this process may be described as

S(u, v) = C(u) + T (v).

Assume that the curves are NURBS of the same order

\[ C(u) = \frac{\sum_{i=0}^{n} w_i^C B_{i,k}(u) Q_i^C}{\sum_{i=0}^{n} w_i^C B_{i,k}(u)} \]

and

\[ T(v) = \frac{\sum_{j=0}^{m} w_j^T B_{j,k}(v) Q_j^T}{\sum_{j=0}^{m} w_j^T B_{j,k}(v)}. \]

Then the swept NURBS surface is of form

\[ S(u, v) = \frac{\sum_{i=0}^{n} \sum_{j=0}^{m} w_{i,j} B_{i,k}(u) B_{j,k}(v) Q_{i,j}}{\sum_{i=0}^{n} \sum_{j=0}^{m} w_{i,j} B_{i,k}(u) B_{j,k}(v)}, \]

where

\[ Q_{i,j} = Q_i^C + Q_j^T \]

and

\[ w_{i,j} = w_i^C w_j^T. \]

Similar considerations may be used to generate NURBS surfaces of revolution around a given axis.

Finally, the NURBS also generalize to three dimensions for modeling volumes:

\[ S(u, v, t) = \frac{\sum_{i=0}^{n} \sum_{j=0}^{m} \sum_{p=0}^{q} w_{i,j,p} B_{i,k}(u) B_{j,k}(v) B_{p,k}(t) Q_{i,j,p}}{\sum_{i=0}^{n} \sum_{j=0}^{m} \sum_{p=0}^{q} w_{i,j,p} B_{i,k}(u) B_{j,k}(v) B_{p,k}(t)}. \]

The form is written with the assumption of the curve degree being the same (k) in all three parametric directions, albeit that is not necessary.


Finally, it is important to point out that the surface representations via either Bezier or B-splines may produce non-rectangular surface patches. Such, for example triangular, patches are very important in the finite element discretization step to be discussed next. They may easily be produced from the above formulations by collapsing a pair of points into one and will not be discussed further [3].

2.6 Geometric model discretization

FIGURE 2.11 Voronoi polygon

The foundation of many general methods of discretization (commonly called meshing) is the classical Delaunay triangulation method [4]. The Delaunay triangulation technique in turn is based on Voronoi polygons [8]. The Voronoi polygon, assigned to a certain point of a set of points in the plane,


contains all the points that are closer to the selected point than to any other point of the set.

Let us define the set of points S ⊆ R^2 and let P_i ∈ S, i = 1, 2, . . . , n, be the points of the set. The points Q(x, y) ∈ R^2 that satisfy

‖Q(x, y) − Pi‖ ≤ ‖Q(x, y) − Pj‖, ∀Pj ∈ S,

constitute the Voronoi polygon V(Pi) of point Pi. The Voronoi polygon is a convex polygon.

The inequalities represent half planes between point Pi and every point Pj. The intersection of these half planes produces the Voronoi polygon. For example consider the set of points shown in Figure 2.11. The irregular hexagon containing one point in the middle (the Pi point) is the Voronoi polygon of point Pi.

It is easy to see that the points inside the polygon (Q(x, y)) are closer to Pi than to any other points of the set. It is also quite intuitive that the edges of the Voronoi polygon are the perpendicular bisectors of the line segments connecting the points of the set.

The union of the Voronoi polygons of all the points in the set completely covers the plane. It follows that the Voronoi polygons of two points of the set do not have common interior points; at most they share points on their common boundary.

The definition and process generalizes to three dimensions very easily. If the set of points is in space, S ⊆ R^3, the points Q(x, y, z) ∈ R^3 that satisfy

‖Q(x, y, z)− Pi‖ ≤ ‖Q(x, y, z) − Pj‖, ∀Pj ∈ S,

define the Voronoi polyhedron V (Pi) of Pi.

Every inequality defines a half-space and the Voronoi polyhedron V(Pi) is the intersection of all the half-spaces defined by the point set. The Voronoi polyhedron is a convex polyhedron.

2.7 Delaunay mesh generation

The Delaunay triangulation process is based on the Voronoi polygons as follows. Let us construct Delaunay edges by connecting points Pi and Pj when their Voronoi polygons V(Pi) and V(Pj) have a common edge. Constructing all such possible edges will result in the covering of the planar region of our interest with triangular regions, the Delaunay triangles.

FIGURE 2.12 Delaunay triangle

Figure 2.12 shows a Delaunay triangle. The dotted lines are the edges of the Voronoi polygons and the solid lines depict the Delaunay edges. The process extends quite naturally and covers the plane as shown in Figure 2.13 with 6 Delaunay triangles. It is known that under the given definitions no two Delaunay edges cross each other.

FIGURE 2.13 Delaunay triangularization

On the other hand it is possible to have a special case when four (or even more) Voronoi polygons meet at a common point. This degenerate case will result in the Delaunay edges producing a quadrilateral. As the discretized regions are the finite elements for our further computations, this case is no cause for panic. We can certainly have quadrilateral finite elements as was shown earlier. There are also remedies to preserve a purely triangular mesh; slightly moving one of the points participating in the scenario will eliminate the special case.

Finally, in three dimensions the Delaunay edges are defined as lines connecting points that share a common Voronoi facet (a face of a Voronoi polyhedron). Furthermore, the Delaunay facets are defined by points that share a common Voronoi edge (an edge of a Voronoi polyhedron). In general each edge is shared by exactly three Voronoi polyhedra, hence the Delaunay regions' facets are going to be triangles.

The Delaunay regions connect points of Voronoi polyhedra that share a common vertex. Since in general the number of such polyhedra is four, the generated Delaunay regions will be tetrahedra. The Delaunay method generalized into three dimensions is called Delaunay tessellation [6].
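For experimentation, the Voronoi/Delaunay duality described above is readily available in standard computational geometry libraries. The sketch below (an illustration on random points, not a mesh generator) computes both objects with scipy.spatial; in three dimensions the same call returns the tetrahedra of the Delaunay tessellation.

```python
import numpy as np
from scipy.spatial import Delaunay, Voronoi

# A small illustrative planar point set.
rng = np.random.default_rng(0)
points = rng.random((20, 2))

vor = Voronoi(points)          # Voronoi polygons of the set
tri = Delaunay(points)         # the dual Delaunay triangulation

print(len(vor.regions), "Voronoi regions")
print(tri.simplices.shape)     # (n_triangles, 3) node indices of each triangle

# In three dimensions the same call produces a Delaunay tessellation:
pts3d = rng.random((30, 3))
tets = Delaunay(pts3d).simplices   # (n_tetrahedra, 4) node indices
print(tets.shape)
```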

There are many automatic methods to discretize a two-dimensional, not necessarily planar, domain. [2] describes such a method for surface meshing with rectangular elements. There are also other methods in the industry to partition a three-dimensional domain into a collection of non-overlapping elements that covers the entire solution domain, see for example [7]. The most successful techniques are the proprietary heuristic algorithms used in commercial software. The quality of the mesh heavily influences the finite element solution results. A good quality mesh has elements close to equal in size with shapes that are not too distorted. In the case of hexahedron elements this means element shapes that approach cubes. Gross inequality in the ratios of the sides (called aspect ratio in the industry) results in less accurate solutions.

The final topic of the finite element model generation is the assignment of node numbers. This step will influence the topology of the assembled finite element matrices, and as such, it influences the computational performance. The finite element matrix reordering is discussed in Section 7.1.

The assignment of the node numbers usually starts at a corner or an edge of the geometric model, now meshed, and proceeds inward towards the interior of the model while at the same time considering the element connectivity. The goal of this is that nodes of an element should have neighboring numbers. It is not necessary to achieve that, but it is useful as a pre-processing for reordering and assuring that operation's efficiency.

2.8 Model generation case study

To demonstrate the model generation process we consider a simple engineering component of a bracket. This example will be used in the last section also as a case study for a complete engineering analysis. The process in today's engineering practice is almost exclusively executed in a computer aided design (CAD) software environment. The advantage of working in such an environment is that the engineer is able to immediately analyze the model, since the model is created in a computer.

Still, the engineer starts by creating a design sketch, such as shown in Figure 2.14. The role of the design sketch is to specify the contours of the desired shape that will accommodate the kinematic relationships between the component and the rest of the product. Since this is an interactive process, the engineer could easily modify the sketch until it satisfies the goals.

The next step in the design process is to "fill out" the details of the geometry. The model may be extruded from two dimensional contour elements in a certain direction, or blended between contour curves. The interior volumes may be filled with standard geometrical components like cylinders or cones. The process usually entails generating the faces and interior volumes of the model from many components. Figure 2.15 depicts the geometric model of the bracket example.

FIGURE 2.14 Design sketch of a bracket

The geometric modeling software environment facilitates the easy execution of coordinate transformations, such as rotations, translations or reflections, enabling the engineer to try various scenarios. Earlier designs may be reused and modifications easily made to produce a variant product. Since shape is really independent of frame of reference, this approach encapsulates the shape in a parametric form, not in a fixed reference frame of the blue-prints of the past. Another advantage of the parametric representation is the easy re-sizing of the model.

Finally the finite element discretization step is executed on the geometric model using the techniques described in the last two sections. This step is nowadays fully automated and produces mostly triangular surface and tetrahedral volume meshes, such as visible in Figure 2.16.

For complex models consisting of multitudes of surface and volume components, the various sections may be meshed separately and the boundary regions are re-meshed to achieve mesh-continuity. This approach is also advantageous from a computational performance point of view, since the separate sections may be meshed simultaneously on multiprocessor computers.

FIGURE 2.15 Geometric model of bracket

In most cases the finite element sizes are strongly influenced by the smallest geometric features of the geometric model and this may result in a denser mesh in other areas of the geometry. For instance, it is noticeable in the example that the fillet surface between the cylinder on the left and the facing planar side of the model seems to have dictated the mesh size. This approach may be detrimental to the solution performance when there are structurally unimportant minor details, such as esthetic components of the structure. The modeling and meshing of such do not necessarily improve the quality of the results either.

The physical problem is not yet fully defined by the geometric and the finite element models. The modeling of the physical phenomenon, such as elasticity, and the specification of the material properties of the part need to be executed. These topics are the subject of the next chapter.


FIGURE 2.16 Finite element model of bracket

References

[1] Bezier, P.; Essai de definition numerique des courbes et de surfaces experimentales, Universite P. et M. Curie, Paris, 1977

[2] Blacker, T. D. and Stephenson, M. B.; Paving: A new approach to automated quadrilateral mesh generation, Report DE-AC04-17DP00789, Sandia National Laboratory, 1990

[3] Gregory, J. A.; The mathematics of surfaces, Springer, New York, 1978

[4] Delaunay, B.; Sur la sphere vide, Izv. Akad. Nauk SSSR, Otdelenie Matematicheskih i Estestvennyh Nauk, Vol. 7, pp. 793-800, 1934

[5] Komzsik, L.; Applied variational analysis for engineers, Taylor and Francis, Boca Raton, 2009


[6] Shenton, D. N. and Cendes, Z. J.; Three-dimensional finite element mesh generation using Delaunay tessellation, IEEE Trans. Magn., Vol. 21, pp. 2535-2538, 1985

[7] Shephard, M. S.; Finite element modeling within an integrated geometric modeling environment: Part I - Mesh generation, Report TR-85024, Rensselaer Polytechnic Institute, 1985

[8] Voronoi, G.; Nouvelles applications des parametres continus a la theorie des formes quadratiques, J. Reine Angew. Math., Vol. 133, pp. 97-178, 1907


3

Modeling of Physical Phenomena

A mechanical system will be used to present the modeling of a physical problem with the finite element technique. The techniques presented in the book are, however, applicable in many other engineering principles as shown in [2].

3.1 Lagrange’s equations of motion

The analysis of a mechanical system is based on Lagrange's equations of motion of analytic mechanics, see [1] and [3] for various formulations. The equations are

\[ \frac{d}{dt} \frac{\partial T}{\partial \dot{q}_i} - \frac{\partial T}{\partial q_i} + \frac{\partial P}{\partial q_i} + \frac{\partial D}{\partial \dot{q}_i} = 0, \quad i = 1, \dots, n. \]

The qi are generalized coordinates, describing the motion of the mechanical system. Here T is the kinetic energy and P is the potential energy, while D is the dissipative function of the system. The potential energy consists of the internal strain energy (Es) and work potential (Wp) of external forces as

P = Es + Wp.

For demonstration consider the simple discrete mass-spring system shown in Figure 3.1. The (only) generalized coordinate describing the motion of the system is the only degree of freedom, the displacement in the x direction, q1 = x. The kinetic energy of the system is related to the motion of the mass as

\[ T = \frac{1}{2} m \dot{x}^2. \]

The strain energy here is the energy stored in the spring and it is

\[ E_s = \frac{1}{2} k x^2. \]


FIGURE 3.1 Discrete mechanical system

The total potential energy is

P = Es.

Appropriate differentiation yields

\[ \frac{d}{dt} \frac{\partial T}{\partial \dot{x}} = m \ddot{x} \]

and

\[ \frac{\partial P}{\partial x} = k x. \]

Substituting into Lagrange’s equation of motion produces the well-knownequation of

mx + kx = 0.

This equation, the archetype example used in the study of ordinary differential equations, describes the undamped free vibrations of a single degree of freedom mass-spring discrete mechanical system. For the damped and forced vibration cases see [6] for example.
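A few lines of code illustrate the single degree of freedom solution (the values of m and k are illustrative): the natural circular frequency is omega = sqrt(k/m) and the free vibration is a harmonic motion determined by the initial conditions.

```python
import numpy as np

m, k = 2.0, 800.0                  # illustrative mass [kg] and stiffness [N/m]
omega = np.sqrt(k / m)             # natural circular frequency of m x'' + k x = 0

x0, v0 = 0.01, 0.0                 # initial displacement and velocity
t = np.linspace(0.0, 1.0, 5)
x = x0 * np.cos(omega * t) + (v0 / omega) * np.sin(omega * t)
print(omega, x)                    # undamped free vibration x(t)
```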


3.2 Continuum mechanical systems

A continuum mechanical system with a general geometry is usually analyzed in terms of the displacements of its particles. The displacements of the particles of the continuum are q = q(x, y, z) where x, y, z are the geometric coordinates of the particle in space. The finite element discretization of the three-dimensional continuum model leads to a set of nodes. They are like the nodes in the two-dimensional example in the prior chapter, however, with an added spatial dimension.

The kinetic energy of a continuum system in terms of the particle velocities is

\[ T = \frac{1}{2} \int \dot{q}^T \dot{q}\, \rho\, dV, \]

where ρ is the mass per unit volume. The internal strain energy for the continuum mechanical system is

\[ E_s = \frac{1}{2} \int \sigma^T \varepsilon\, dV. \]

Here σ, ε are the stresses and strains of the system. The work potential is

\[ W_p = -\int q^T f_A\, dV, \]

where fA contains the active forces acting on the system. Finally the dissipative function is

\[ D = \frac{1}{2} \int \dot{q}^T f_D\, \dot{q}\, dV, \]

where fD contains dissipative forces. The active forces are associated with the q displacements and the dissipative forces with the velocities $\dot{q}$. The dissipative forces usually, and the active forces sometimes, act on the surface of the mechanical system; however, we assume here for simplicity that they have been converted to a volume integral.

Let us consider a mechanical system with node points having six degrees of freedom. We assume that they are free to move in all three spatial directions and rotate freely around all three axes. This is true in standard analysis of structures and the infrastructure of some commercial finite element codes is


FIGURE 3.2 Degrees of freedom of mechanical particle

built around such an assumption. For the i-th node point

\[ v(i) = \begin{bmatrix} q_x(i) \\ q_y(i) \\ q_z(i) \\ \theta_x(i) \\ \theta_y(i) \\ \theta_z(i) \end{bmatrix}, \]

where qx(i), qy(i), qz(i) are the translational degrees of freedom of the i-th node point and θx(i), θy(i), θz(i) are the rotational degrees of freedom, as shown in Figure 3.2.

The complete mechanical model is described with a vector v that contains all the node displacements of all node points as


\[ v = \begin{bmatrix} v(1) \\ v(2) \\ \vdots \\ v(i) \\ \vdots \\ v(n) \end{bmatrix}, \]

where n is the number of discrete node points, hence the order of v is g = 6n, where g is the total number of degrees of freedom in the model.

3.3 Finite element analysis of elastic continuum

The purpose of this book is to discuss the computational techniques of finite element analysis that are applicable to various physical principles. Therefore, while the concepts will have a mechanical foundation, they will also carry over to other principles where a potential field is the basis of describing the physical phenomenon.

For example, in heat conduction the potential function is the temperature field (t), the "strain" is the temperature gradient (−∇t) and the "stress" is the heat flow (q). Similarly in magneto-statics the magnetic vector potential (A) is the fundamental "displacement" field, and the physical equivalent to the strain is the magnetic induction (B). The "stress" in magneto-statics is the magnetic field strength (H).

Similar analogies exist with other physical disciplines. It is important to point out that the most important quantity from the engineer's perspective varies from discipline to discipline. For the heat conduction engineer the temperature field (the "displacement") is of primary interest and it is luckily the primary result of the finite element solutions. For the electrical engineer studying magneto-statics the magnetic induction (the "strain") is the most important. Finally, for the structural engineer both stresses and displacements are of practical interest.

These major concepts are now defined in connection with an elastic continuum. The physical behavior of an elastic continuum is analyzed via the relative displacements of neighboring points in the body. The relative displacements represent the physical strain inside the body. The strains were already discussed in Chapter 1 as a mathematical concept; here we give a physical foundation.


There are two distinct types of physical strains:
a. Extensional strains, and
b. Shear strains.
The first kind of strain is the change in distance between two points of the body. The shear strain is defined as the change in the angle between two lines which were perpendicular in the undeformed body. The strain vector of these distinct components for the general 3-dimensional model is

\[ \varepsilon = \begin{bmatrix} \frac{\partial q_x}{\partial x} \\[2pt] \frac{\partial q_y}{\partial y} \\[2pt] \frac{\partial q_z}{\partial z} \\[2pt] \frac{\partial q_x}{\partial y} + \frac{\partial q_y}{\partial x} \\[2pt] \frac{\partial q_y}{\partial z} + \frac{\partial q_z}{\partial y} \\[2pt] \frac{\partial q_z}{\partial x} + \frac{\partial q_x}{\partial z} \end{bmatrix}. \]

The physical stress components of the body corresponding to these strains are:

\[ \sigma = \begin{bmatrix} \sigma_x \\ \sigma_y \\ \sigma_z \\ \tau_{xy} \\ \tau_{yz} \\ \tau_{zx} \end{bmatrix}. \]

Here the σ are the normal and the τ are the shear stresses. The stress-strain relationship is described by

σ = Dε,

where the D matrix describes the constitutive relationship due to the elastic properties of the model, such as Poisson's ratio and Young's modulus of elasticity. The actual structure of D depends on the material modeling. For a general three-dimensional model of linear elastic, isotropic material the D matrix is of form

\[ D = \frac{E}{(1 + \nu)(1 - 2\nu)}
\begin{bmatrix}
1-\nu & \nu & \nu & 0 & 0 & 0 \\
\nu & 1-\nu & \nu & 0 & 0 & 0 \\
\nu & \nu & 1-\nu & 0 & 0 & 0 \\
0 & 0 & 0 & 0.5-\nu & 0 & 0 \\
0 & 0 & 0 & 0 & 0.5-\nu & 0 \\
0 & 0 & 0 & 0 & 0 & 0.5-\nu
\end{bmatrix}. \]

Here E is the Young’s modulus and ν is the Poisson ratio.

It was established earlier that the strains are related to the node displacements as


ε = Bqe,

therefore, the stresses are also related via

σ = DBqe.

The structure of the B matrix for general three-dimensional continuum problems will be discussed in detail shortly. For a three-dimensional element, the element strain energy is formed as

\[ E_e = \frac{1}{2} \int\!\!\int\!\!\int \sigma^T \varepsilon\, dx\, dy\, dz = \frac{1}{2} q_e^T k_e q_e, \]

where the element stiffness matrix is

\[ k_e = \int\!\!\int\!\!\int B^T D B\, dx\, dy\, dz. \]

To find more details on the continuum mechanics foundation, [5] is the classical reference.

3.4 A tetrahedral finite element

To demonstrate the finite element modeling of a three-dimensional continuum we consider the tetrahedron element, such as shown in Figure 3.3, located generally in global coordinates. This is the most common object resulting from the discretization process discussed in Sections 2.6 and 2.7, albeit not the most advantageous from a numerical perspective. Every node point of the tetrahedron has three degrees of freedom; they are free to move in the three spatial directions. Hence, there are twelve nodal displacements of the element as

\[ q_e = \begin{bmatrix} q_{1x} \\ q_{1y} \\ q_{1z} \\ q_{2x} \\ q_{2y} \\ q_{2z} \\ q_{3x} \\ q_{3y} \\ q_{3z} \\ q_{4x} \\ q_{4y} \\ q_{4z} \end{bmatrix}. \]


FIGURE 3.3 Tetrahedron element

Note that in the element derivations throughout, we use the notation q_{ix} to refer to the x translation of the i-th local node of the element. This is a distinction from the notation q_x(i), which refers to the x translation of the i-th global node of the model.

The displacement at any location inside this element is approximated with the help of the shape functions as

q(x, y, z) = Nqe.

Generalizing the observation of the relationship between the shape functions and the parametric coordinates with respect to the triangular element, we may use

N1 = w, N2 = u, N3 = v,

and

N4 = 1 − u − v − w.

Here the parametric coordinate w is from the 1st node to the 4th node. This arbitrary selection is chosen to be consistent with the triangular element introduced in Chapter 1. Then u is from the 1st node to the 2nd node and


finally, the parametric coordinate v is from the 1st node to the 3rd node. Such a selection of the Ni functions obviously satisfies

N1 + N2 + N3 + N4 = 1.

We organize the N matrix of the four shape functions as

\[ N = \begin{bmatrix} N_1 & 0 & 0 & N_2 & 0 & 0 & N_3 & 0 & 0 & N_4 & 0 & 0 \\ 0 & N_1 & 0 & 0 & N_2 & 0 & 0 & N_3 & 0 & 0 & N_4 & 0 \\ 0 & 0 & N_1 & 0 & 0 & N_2 & 0 & 0 & N_3 & 0 & 0 & N_4 \end{bmatrix}. \]

Then

\[ \begin{bmatrix} q_x(x,y,z) \\ q_y(x,y,z) \\ q_z(x,y,z) \end{bmatrix} =
\begin{bmatrix} N_1 & 0 & 0 & N_2 & 0 & 0 & N_3 & 0 & 0 & N_4 & 0 & 0 \\ 0 & N_1 & 0 & 0 & N_2 & 0 & 0 & N_3 & 0 & 0 & N_4 & 0 \\ 0 & 0 & N_1 & 0 & 0 & N_2 & 0 & 0 & N_3 & 0 & 0 & N_4 \end{bmatrix}
\begin{bmatrix} q_{1x} \\ q_{1y} \\ q_{1z} \\ q_{2x} \\ q_{2y} \\ q_{2z} \\ q_{3x} \\ q_{3y} \\ q_{3z} \\ q_{4x} \\ q_{4y} \\ q_{4z} \end{bmatrix}, \]

or

\[ q(x, y, z) = N q_e, \]

as desired. The location of a point inside the element is approximated again with the same four shape functions as the displacement field:

x = N1x1 + N2x2 + N3x3 + N4x4,

y = N1y1 + N2y2 + N3y3 + N4y4,

and

z = N1z1 + N2z2 + N3z3 + N4z4.

Here xi, yi, zi are the x, y, z coordinates of the i-th node of the tetrahedron, hence the element is again an iso-parametric element. Using the above definition of the shape functions with the element coordinates and by substituting we get

x = x4 + (x1 − x4)w + (x2 − x4)u + (x3 − x4)v,

y = y4 + (y1 − y4)w + (y2 − y4)u + (y3 − y4)v,

and

z = z4 + (z1 − z4)w + (z2 − z4)u + (z3 − z4)v.


These equations will be fundamental to the integral transformation of the element matrix generation for the tetrahedron element. Assuming that the B, D matrices are constant in an element (constant strain element), the element stiffness matrix is formulated as

\[ k_e = B^T D B \int\!\!\int\!\!\int dx\, dy\, dz = B^T D B\, V_e, \]

where Ve is the volume of the element,

\[ V_e = \int\!\!\int\!\!\int dx\, dy\, dz. \]

The integral will be again transformed to the parametric coordinates

\[ V_e = \int\!\!\int\!\!\int \det\!\left[ \frac{\partial(x, y, z)}{\partial(u, v, w)} \right] du\, dv\, dw. \]

Here the Jacobian matrix is

\[ \frac{\partial(x, y, z)}{\partial(u, v, w)} =
\begin{bmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} & \frac{\partial x}{\partial w} \\[2pt] \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} & \frac{\partial y}{\partial w} \\[2pt] \frac{\partial z}{\partial u} & \frac{\partial z}{\partial v} & \frac{\partial z}{\partial w} \end{bmatrix}
= \begin{bmatrix} x_2 - x_4 & x_3 - x_4 & x_1 - x_4 \\ y_2 - y_4 & y_3 - y_4 & y_1 - y_4 \\ z_2 - z_4 & z_3 - z_4 & z_1 - z_4 \end{bmatrix} = J, \]

and it is visibly constant for the element.

Gaussian integration requires an additional transformation of this integral, since over the parametric domain of our element the integral has the following boundaries:

\[ \int_{u=0}^{1} \int_{v=0}^{1-u} \int_{w=0}^{1-u-v} f(u, v, w)\, dw\, dv\, du = \int_{r=-1}^{1} \int_{s=-1}^{1} \int_{t=-1}^{1} f(s, r, t)\, dt\, dr\, ds. \]

The Gaussian integration method applied to this triple integral is

\[ \int_{r=-1}^{1} \int_{s=-1}^{1} \int_{t=-1}^{1} f(s, r, t)\, dt\, dr\, ds = \sum_{i=1}^{n} c_i \sum_{j=1}^{n} c_j \sum_{k=1}^{n} c_k f(s_i, r_j, t_k). \]

Applying the one- or two-point formula, the element volume is

\[ V_e = \frac{1}{6} \det(J). \]

To complete the element matrix, the strain components and, in turn, the terms of the B matrix still need to be computed. Clearly

\[ \begin{bmatrix} \frac{\partial q}{\partial u} \\[2pt] \frac{\partial q}{\partial v} \\[2pt] \frac{\partial q}{\partial w} \end{bmatrix} = J \begin{bmatrix} \frac{\partial q}{\partial x} \\[2pt] \frac{\partial q}{\partial y} \\[2pt] \frac{\partial q}{\partial z} \end{bmatrix}. \]


Then

\[ \begin{bmatrix} \frac{\partial q}{\partial x} \\[2pt] \frac{\partial q}{\partial y} \\[2pt] \frac{\partial q}{\partial z} \end{bmatrix} = J^{-1} \begin{bmatrix} \frac{\partial q}{\partial u} \\[2pt] \frac{\partial q}{\partial v} \\[2pt] \frac{\partial q}{\partial w} \end{bmatrix}. \]

The terms of J−1 may be computed as

\[ J^{-1} = \frac{\mathrm{adj}(J)}{\det(J)} = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix}. \]

With the introduction of terms

b1 = −(b11 + b12 + b13),

b2 = −(b21 + b22 + b23),

and

b3 = −(b31 + b32 + b33),

the terms of the B matrix are

\[ B = \begin{bmatrix}
b_{11} & 0 & 0 & b_{12} & 0 & 0 & b_{13} & 0 & 0 & b_1 & 0 & 0 \\
0 & b_{21} & 0 & 0 & b_{22} & 0 & 0 & b_{23} & 0 & 0 & b_2 & 0 \\
0 & 0 & b_{31} & 0 & 0 & b_{32} & 0 & 0 & b_{33} & 0 & 0 & b_3 \\
0 & b_{31} & b_{21} & 0 & b_{32} & b_{22} & 0 & b_{33} & b_{23} & 0 & b_3 & b_2 \\
b_{31} & 0 & b_{11} & b_{32} & 0 & b_{12} & b_{33} & 0 & b_{13} & b_3 & 0 & b_1 \\
b_{21} & b_{11} & 0 & b_{22} & b_{12} & 0 & b_{23} & b_{13} & 0 & b_2 & b_1 & 0
\end{bmatrix}. \]

B^T D B produces a 12 × 12 element matrix. This element matrix will contribute to 12 columns and rows of the assembled finite element matrix.
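The complete element computation is compact enough to sketch in code. The example below (node coordinates and material constants are illustrative) builds the isotropic D matrix of Section 3.3 and the constant B matrix directly from the shape function gradients — an equivalent route to the b_{ij} bookkeeping above — and forms k_e = B^T D B V_e.

```python
import numpy as np

def isotropic_D(E, nu):
    """Constitutive matrix of a linear elastic, isotropic material (Section 3.3)."""
    c = E / ((1 + nu) * (1 - 2 * nu))
    D = np.zeros((6, 6))
    D[:3, :3] = nu
    np.fill_diagonal(D[:3, :3], 1 - nu)
    D[3, 3] = D[4, 4] = D[5, 5] = 0.5 - nu
    return c * D

def tetrahedron_stiffness(nodes, E, nu):
    """12 x 12 stiffness of the constant-strain tetrahedron, k_e = B^T D B V_e."""
    # Jacobian of the parametric mapping; columns follow the text (u, v, w directions).
    J = np.column_stack([nodes[1] - nodes[3],
                         nodes[2] - nodes[3],
                         nodes[0] - nodes[3]])
    Ve = abs(np.linalg.det(J)) / 6.0
    # Gradients of N1..N4 with respect to the parametric coordinates (u, v, w).
    dN_duvw = np.array([[0.0, 0.0, 1.0],      # N1 = w
                        [1.0, 0.0, 0.0],      # N2 = u
                        [0.0, 1.0, 0.0],      # N3 = v
                        [-1.0, -1.0, -1.0]])  # N4 = 1 - u - v - w
    # Gradients with respect to x, y, z through the (constant) Jacobian.
    dN_dxyz = np.linalg.solve(J.T, dN_duvw.T).T
    B = np.zeros((6, 12))
    for i in range(4):
        bx, by, bz = dN_dxyz[i]
        c = 3 * i
        B[0, c] = bx                          # eps_x
        B[1, c + 1] = by                      # eps_y
        B[2, c + 2] = bz                      # eps_z
        B[3, c], B[3, c + 1] = by, bx         # gamma_xy
        B[4, c + 1], B[4, c + 2] = bz, by     # gamma_yz
        B[5, c], B[5, c + 2] = bz, bx         # gamma_zx
    return B.T @ isotropic_D(E, nu) @ B * Ve

nodes = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
ke = tetrahedron_stiffness(nodes, E=2.0e11, nu=0.3)
print(ke.shape, np.allclose(ke, ke.T))        # (12, 12) True
```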

3.5 Equation of motion of mechanical system

The motion of the particles of the continuum mechanical system is approximated by the discrete node displacements as

q = Nv,

where N is a collection of shape functions shown earlier. The kinetic energy with this approximation becomes

\[ T = \frac{1}{2} \dot{v}^T \sum_{e=1}^{m} \int N^T N \rho\, dV_e\; \dot{v}, \]


where m is the number of elements and e is the element index. The integral in the above equation is performed on each element. The shape of the element and the connectivity of the related nodes is represented in the N shape functions. Executing the summation for all the finite elements of the model we get

\[ T = \frac{1}{2} \dot{v}^T M \dot{v}, \]

where M is the mass matrix. It is computed similarly to the stiffness matrix as

\[ M = \sum_{e=1}^{m} m_e \]

and

\[ m_e = \int N^T N \rho\, dV_e, \]

using the same numerical integration and local-global transformation principles as discussed earlier. Similar manipulations on the potential energy yield

\[ P = \frac{1}{2} v^T K v - v^T F, \]

where K is the stiffness matrix. It is computed according to the procedure developed earlier. The F is the vector of all active forces (volume and surface) and is computed as

\[ F = \sum_{e=1}^{m} f_e, \]

where fe is the element force. The differentiation of the kinetic energy yields

\[ \frac{\partial T}{\partial \dot{v}} = M \dot{v} \]

and

\[ \frac{d}{dt}(M \dot{v}) = M \ddot{v}. \]

Note that T depends only on $\dot{v}$, so the second term of Lagrange's equations of motion (the derivative of T with respect to v) is ignored. Similarly the potential energy terms yield:

\[ \frac{\partial P}{\partial v} = K v - F. \]

Here the K stiffness matrix generation was detailed in the last section. The dissipative function becomes

\[ D = \frac{1}{2} \dot{v}^T B \dot{v}, \]

and differentiation yields

\[ \frac{\partial D}{\partial \dot{v}} = B \dot{v}. \]


Here B is the damping matrix and not the strain-displacement matrix. Substituting all the above forms into Lagrange's equations of motion we obtain the matrix equation of the equilibrium of a general mechanical system

\[ M \ddot{v}(t) + B \dot{v}(t) + K v(t) = F(t), \]

where v is the displacement vector of the system at time t. In Chapter 14 we will address the direct solution of this problem in the time domain. This is a second order, non-homogeneous, ordinary differential equation with matrix coefficients. The generally time-dependent active loads are contained in the right-hand side.

For our discussion we will mostly assume that the coefficient matrices are constant; hence the equation is linear. In practice the coefficient matrices are often not constant, as material and geometric nonlinearities exist; see, for example, [4] for more details. The computational components of such cases will be discussed in Chapter 16; however, the computational techniques developed until then are also applicable to them.

We will also assume in most of the following that the matrices are real and symmetric. This restriction will be relaxed in Chapter 10, where the complex spectral computation will also be addressed.

The computational techniques discussed in the bulk of the book with theserestrictions in mind are just as applicable.

3.6 Transformation to frequency domain

In many cases the equilibrium equation is transformed from the time domain to the frequency domain. This is accomplished by a Fourier transformation of form

u(\omega) = \int_0^\infty v(t) e^{-i\omega t} \, dt.

Assuming zero initial conditions, the result is the algebraic equation of motion

(-\omega^2 M + i\omega B + K) u(\omega) = F(\omega).

This equation describes the most general problem of the forced vibrations of a damped structure. It is called the frequency response analysis equation in some commercial software, and the harmonic analysis in some other programs. It has become the most widely used type of analysis because it is more repeatable and less costly than transient response analysis. The application of the Fourier transformation to the right-hand side results in

F(\omega) = \int_0^\infty F(t) e^{-i\omega t} \, dt.

Once the frequency domain solution is obtained, the time domain response may be computed as

v(t) = \frac{1}{\pi} \int_0^\infty Re\left( u(\omega) e^{i\omega t} \right) d\omega.
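To make the algebraic form concrete, the following is a minimal NumPy sketch (an illustration only, not taken from the book) of a direct frequency response sweep: the complex dynamic stiffness is formed and solved at each excitation frequency. The matrices M, B, K and the load F are small, hypothetical stand-ins for the assembled finite element matrices.

```python
import numpy as np

def frequency_response(M, B, K, F, omegas):
    """Solve (-omega^2 M + i omega B + K) u = F for each omega."""
    U = np.zeros((K.shape[0], len(omegas)), dtype=complex)
    for j, w in enumerate(omegas):
        Z = -w**2 * M + 1j * w * B + K      # complex dynamic stiffness
        U[:, j] = np.linalg.solve(Z, F)
    return U

# Hypothetical 2-DOF example data
M = np.diag([1.0, 2.0])
K = np.array([[300.0, -100.0], [-100.0, 100.0]])
B = 0.01 * K                                # simple proportional damping assumption
F = np.array([1.0, 0.0])
omegas = np.linspace(1.0, 20.0, 200)
U = frequency_response(M, B, K, F, omegas)
print(np.abs(U[:, 0]))                      # response magnitudes at the first frequency
```

A time-domain response could then be recovered by numerically evaluating the inverse transform above over the computed frequency grid.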

Depending on the absence of certain matrices in the algebraic equation of motion, various other structural engineering analysis problems are formulated.

The simplest case is when there is no mass (M) or damping (B) in the mechanical model and the load, hence the solution, is not frequency-dependent. This is the case of linear static analysis which computes the static equilibrium of the system. The algebraic equation is simply

Ku = F.

Another distinct class of analysis is the free vibration of structures. In thisclass, there are no external loads acting on the structure, i.e., F does notexist. The class is further subdivided into damped and undamped cases. Thealgebraic equation of the free vibration of an undamped system is

(K − λM)u(ω) = 0, λ = ω2.

This very important problem, called normal modes analysis in the industry, will be addressed in Chapter 9. The dynamic reduction technique of Chapter 11 and the modal synthesis technique discussed in Chapter 12 are also used for the solution of this problem.
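As an illustration only, the undamped free vibration problem is a symmetric generalized eigenvalue problem and can be sketched with SciPy; the stiffness and mass matrices below are small hypothetical examples, not the element matrices derived in the text.

```python
import numpy as np
from scipy.linalg import eigh

# Hypothetical assembled stiffness and mass matrices (symmetric, M positive definite)
K = np.array([[400.0, -200.0,    0.0],
              [-200.0, 400.0, -200.0],
              [   0.0, -200.0,  200.0]])
M = np.diag([2.0, 2.0, 1.0])

# Solve (K - lambda M) u = 0; eigh exploits the symmetry of the pencil
lam, U = eigh(K, M)
frequencies = np.sqrt(lam) / (2.0 * np.pi)   # lambda = omega^2, natural frequencies in Hz
print(frequencies)
```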

A very specific subcase of this is the buckling analysis. The goal of the buckling analysis is to find the critical load under which a structure may become unstable. The problem is formulated as

(K - \lambda_b K_g) u(\lambda_b) = 0,

where the K_g matrix is the so-called geometric or differential stiffness matrix. The geometric stiffness matrix depends on the geometry of the structure and the applied load. The buckling eigenvalue \lambda_b is a scaling factor. Multiplying the applied load (that was used to compute the geometric stiffness) by \lambda_b will produce a critical load for the structure. The corresponding vector u(\lambda_b) is the buckling shape.

The stiffness and the mass matrix terms are generally derived with mathematical rigor. Damping terms, by contrast, are developed more for engineering convenience. One type, called structural damping, is defined by multiplying the terms of the stiffness matrix by ig, where i is the imaginary unit and g is a parameter based on engineering experience, with 0.03 being a typical value. The g values may be applied to the element stiffness matrix terms, or the assembled stiffness matrix terms, or both.

The structure may also contain actual damper components, such as hydraulic shock absorbers, that are always modeled in the damping matrix. The damping level is one of the main parameters in determining the amount of amplification of response at resonance. Considering the above, free vibrations of damped systems are the solutions of

((1 + ig)K + i\omega B - \omega^2 M) u(\omega) = 0.

For types of analyses where imaginary stiffness terms are inconvenient, such as transient response analysis, they may be converted into equivalent terms of the damping matrix B. This conversion is somewhat dubious, especially since the structural damping may not capture all the damping phenomena, such as the play in riveted or bolted joints, flexing of welds, or the thickness of adhesives in bonded structures. Nevertheless, in the following the damping will always be contained in the B matrix.

The most practical class of analysis, however, is the forced vibration of structures. In this case external loads act on the structure, i.e., F ≠ 0. This class is also subdivided into damped and undamped cases. The algebraic equation of the forced vibration of an undamped system is

(K - \omega^2 M) u(\omega) = F(\omega),

where \omega is an excitation frequency. The forced vibration of a damped system is described by

(K + i\omega B - \omega^2 M) u(\omega) = F(\omega).

These problems of frequency response analysis will be addressed in Chapter 15.

These equations, however, are far from ready to be solved. The steps of modifying the equations of motion usually reduce the size of the matrices. In order to follow the change of size in the reduction process, in the following the matrices will always have their sizes indicated in double subscripts. The vectors whose size is not indicated by a single subscript are assumed to be compatible with the corresponding matrices. For example, the normal modes equation in this form is

(K_{gg} - \lambda M_{gg}) u_g(\omega) = 0,

indicating that the matrices are the yet-unreduced global matrices with g = 6n columns and rows, and the solution vector with the same number of rows. Here n is the number of node points.

In order to obtain a good numerical solution from the equations, the issue of processing constraints as well as various boundary conditions must be addressed. These are usually applied by the engineer directly or by the choice of the applied modeling technique and discussed next.

References

[1] Beda, Gy. and Bezak, A.; Kinematika es dinamika, Tankonyvkiado, Budapest, 1969

[2] Brauer, J. R.; What every engineer should know about finite element analysis, Marcel Dekker, 1993

[3] Lanczos, C.; The variational principles of mechanics, Dover, Mineola, 1979

[4] Oden, J. T.; Finite elements of nonlinear continua, McGraw-Hill, New York, 1972

[5] Timoshenko, S. and Goodier, J. N.; Theory of elasticity, McGraw-Hill, New York, 1951

[6] Vierck, R. K.; Vibration analysis, International Textbook Co., Scranton, 1967


4

Constraints and Boundary Conditions

Constraint equations may specify the coefficients for flexible elements or for a completely assembled finite element model. A constraint equation states that the displacements at a selected set of degrees of freedom must satisfy an equation of constraint independently of the force-deflection relationships defined by the flexible elements. There may be many constraint equations and each may enforce different laws.

Commercial finite element codes often have methods for defining the coefficients of the constraint equation with element-like input formats because the coefficients are tedious to develop by hand, and can cause implausible results when a decimal point is in the wrong location, or a sign is changed. These types of errors in constraint equation coefficients are difficult to identify by scanning input files.

One class of an element-like constraint equation generator is for rigid elements whose degrees of freedom move as if they were attached to a rigid component. Another type is an interpolation element, where a set of degrees of freedom is attached to a reference node in a manner that the motion of the reference node is a weighted combination of the motion of the other connected degrees of freedom.

Some programs also allow input of constraint equation coefficients directly as well, to allow analysts to implement features that are not built into the program. For example, component mode synthesis (the topic of Chapter 12) may also be achieved by appropriately defined constraint equations.

The subject of this chapter is the mathematics of describing and applying (removing or augmenting) these constraints, most commonly named multi-point constraints. The bibliography contains some publications related to this topic, see [1], [3] and [4].


4.1 The concept of multi-point constraints

Multiple degrees of freedom constraints are usually applied before the single degree of freedom constraints in commercial applications. The reason for this is that the application of the multiple degrees of freedom constraints removes some of the singularities of the global matrix. The issue of dealing with the single degree of freedom constraints representing boundary conditions is then simpler as it is related only to the remaining singularities in the model.

Let us consider a very simple mechanical problem of a bar in the x − y plane connecting two node points as shown in Figure 4.1. The constraints of the bar will be discussed in this chapter and the solution of the problem will culminate in Chapter 5.

FIGURE 4.1 Rigid bar


Let us assume the bar is rigid under bending but flexible under axial elongation; such an element is sometimes called an axial bar. This restriction will be released in Chapter 17 with the introduction of a planar bending of the bar.

The bar is of length l and its local x axis is aligned with its axis. As this element constitutes the whole example model, the global degree of freedom notation of qx(i) will be used.

The motion of nodes 1 and 2 is obviously related. The y displacement of node 2 depends on the motion of node 1 as

qy(2) = qy(1) + lθz(1),

where θz(1) is the rotation of node 1 with the axis of rotation being the z coordinate axis. It is assumed to be a small rotation such that the

sin(θz(1)) ≈ θz(1)

assumption is valid. Similarly the z displacement is

qz(2) = qz(1) + lθy(1).

Let us consider the case of l = 1 for simplicity, without limiting the generality of the discussion. The rotations with respect to the out of plane z axis (resulting from the rigidity) are obviously identical

θz(2) = θz(1).

Similarly the rotations with respect to the y axis are related as

θy(2) = θy(1).

Finally, assuming that the bar is rigid with respect to torsion also, we have another constraint of

θx(2) = θx(1).

Note that these are already multi-point constraints, as "point" in this context means the degrees of freedom, not the nodes. One can write the constraint equations in the form of

qy(1) + θz(1) − qy(2) = 0,

qz(1) + θy(1) − qz(2) = 0,

θx(1) − θx(2) = 0,

θy(1) − θy(2) = 0,


and

θz(1) − θz(2) = 0.

Note that the length variable is now ignored as it is assumed to be unity, and the equations are reordered according to the dependent degrees of freedom. A more general form of writing these equations in connection with the global finite element model is

R_{mg} u_g = 0,

where Rmg is the constraint matrix and m is the number of constraints. In the simple bar example that is a five by twelve matrix, as we have five constraints:

R_{mg} = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & -1
\end{bmatrix}.

The structure of ug is simply an order twelve column vector, as we have two nodes each consisting of six degrees of freedom:

u_g = \begin{bmatrix} q_x(1) \\ q_y(1) \\ q_z(1) \\ \theta_x(1) \\ \theta_y(1) \\ \theta_z(1) \\ q_x(2) \\ q_y(2) \\ q_z(2) \\ \theta_x(2) \\ \theta_y(2) \\ \theta_z(2) \end{bmatrix}.

We can further partition the solution vector as

u_g = \begin{bmatrix} u_n \\ u_m \end{bmatrix},

where the m partition contains the degrees of freedom whose motion depends on the motion of the n partition degrees of freedom. Hence, the m and n degrees of freedom are called dependent and independent degrees of freedom, respectively. More on this in the next section. In the simple bar example it is also easy to identify these partitions as the dependent motions of node 2:


u_m = \begin{bmatrix} q_y(2) \\ q_z(2) \\ \theta_x(2) \\ \theta_y(2) \\ \theta_z(2) \end{bmatrix},

and the independent motions of nodes 1 and 2:

u_n = \begin{bmatrix} q_x(1) \\ q_y(1) \\ q_z(1) \\ \theta_x(1) \\ \theta_y(1) \\ \theta_z(1) \\ q_x(2) \end{bmatrix}.

Although trivial in the bar example, this partitioning is representative of multi-point constraints.

In practical applications more difficult rigid elements, such as levers, mechanisms and gear trains, are frequently applied. These constraints connect several nodes and there may be many of them. Therefore, a practical finite element analysis model may be subject to many multi-point constraints, resulting in a large number of independent constraint equations. Note that the number of equations is not necessarily the same as the number of physical constraints, as some of these may produce more than one equation.

The remainder of this chapter focuses on the reduction step eliminating these constraints.

4.2 The elimination of multi-point constraints

In a general form, the problem of eliminating the multi-point constraints is equivalent to solving the quadratic minimization problem

\min \Pi = u_g^T K_{gg} u_g - u_g^T F_g,

subject to the equality constraints

R_{mg} u_g = 0,

where \Pi is the functional to be minimized (in stationary problems the potential energy). The Rmg matrix is the coefficient matrix of the linear equality constraints, representing relationships amongst the variables required to be enforced. The reduction involves the selection of m linearly independent vectors from the columns of Rmg.

Let us partition the Rmg matrix accordingly into the m and n partitions:

R_{mg} = \begin{bmatrix} R_{mn} & R_{mm} \end{bmatrix}.

Note that the a priori identification of the m (linearly independent) and the n partitions is somewhat arbitrary and in general cases requires some engineering intuition. The dependent degrees of freedom (m partition) define a matrix made from a subset of the constraint equations that must be well conditioned for matrix inversion.

The reduction technique related to the multi-point constraints is as follows. The m partition will be removed from further computations. The solution will be obtained in the remaining n partition. Finally the m partition's displacements are computed from the n partition solution. Hence, the m partition degrees of freedom "depend" on the n partition degrees of freedom, as was mentioned in the prior section. Therefore, they are called dependent degrees of freedom in the industry, although they actually form a linearly independent set. By the same token the n partition degrees of freedom are called the independent set. In the following this industrial notation is used.

Finding the linearly independent m partition of the Rmg matrix is a standard linear algebra problem. In part it may be solved by purely mathematical considerations. For example, a degree of freedom may not be dependent in more than one constraint equation. It is also important to identify redundant constraint equations.

There can be a lot of computational work required to pick the dependent set, particularly when there are many overlapping constraint equations. Most of the commercial finite element tools provide efficient techniques to automatically select the dependent set.

In industrial practice, there are some other considerations. For example, in the case of structural analysis, the degrees of freedom of substructures that are to be coupled with other substructures may not be eliminated, because the elimination makes them unavailable for use as boundary points. These application-specific decisions may also be automated, although the underlying rules must be made by the engineer.

In the following discussion we assume that the partitioning is properly chosen. The multi-point constraint reduction will be facilitated by the matrix

G_{mn} = -R_{mm}^{-1} R_{mn}.

Let us consider the linear statics problem partitioned accordingly

\begin{bmatrix} \bar{K}_{nn} & \bar{K}_{nm} \\ \bar{K}_{mn} & \bar{K}_{mm} \end{bmatrix} \begin{bmatrix} u_n \\ u_m \end{bmatrix} = \begin{bmatrix} \bar{F}_n \\ \bar{F}_m \end{bmatrix},

where the bar over certain partitions is for notational convenience. The original Kgg matrix is assumed to be symmetric, meaning that \bar{K}_{nm} = \bar{K}_{mn}^T. Since

\begin{bmatrix} R_{mn} & R_{mm} \end{bmatrix} \begin{bmatrix} u_n \\ u_m \end{bmatrix} = 0,

using the definition of Gmn it follows that

u_m = G_{mn} u_n.

Introducing this to the last equation yields

\begin{bmatrix} \bar{K}_{nn} & \bar{K}_{nm} G_{mn} \\ \bar{K}_{mn} & \bar{K}_{mm} G_{mn} \end{bmatrix} \begin{bmatrix} u_n \\ u_n \end{bmatrix} = \begin{bmatrix} \bar{F}_n \\ \bar{F}_m \end{bmatrix}.

Pre-multiplying the second equation by G_{mn}^T to maintain symmetry we get

\begin{bmatrix} \bar{K}_{nn} & \bar{K}_{nm} G_{mn} \\ G_{mn}^T \bar{K}_{mn} & G_{mn}^T \bar{K}_{mm} G_{mn} \end{bmatrix} \begin{bmatrix} u_n \\ u_n \end{bmatrix} = \begin{bmatrix} \bar{F}_n \\ G_{mn}^T \bar{F}_m \end{bmatrix}.

The simultaneous solution of the two equations by summing them yields

K_{nn} u_n = F_n.

The reduced n-size stiffness matrix is built as

K_{nn} = \bar{K}_{nn} + \bar{K}_{nm} G_{mn} + G_{mn}^T \bar{K}_{mn} + G_{mn}^T \bar{K}_{mm} G_{mn}.

It is important to note that since we are ultimately solving equilibrium equations, the matrix modifications need to be reflected on the right-hand sides also:

F_n = \bar{F}_n + G_{mn}^T \bar{F}_m.

In practical circumstances, depending on the industry, the g partition to n partition reduction may be as much as 20-30 percent. The larger numbers are characteristic of the automobile industry due to very specific techniques such as modeling spot welds with multi-point constraints.
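The elimination can be stated compactly in code. The following NumPy sketch (an illustration, not the book's implementation) assumes the constraint matrix and the index lists of the dependent (m) and independent (n) degrees of freedom are given; it forms Gmn and the reduced stiffness and load.

```python
import numpy as np

def mpc_reduce(Kgg, Fg, Rmg, m_idx, n_idx):
    """Eliminate multi-point constraints R_mg u_g = 0.

    m_idx, n_idx: index lists of the dependent and independent DOFs.
    Returns Gmn, the reduced stiffness Knn and the reduced load Fn.
    """
    Rmm = Rmg[:, m_idx]
    Rmn = Rmg[:, n_idx]
    Gmn = -np.linalg.solve(Rmm, Rmn)            # Gmn = -Rmm^{-1} Rmn

    Knn_b = Kgg[np.ix_(n_idx, n_idx)]           # barred partitions of Kgg
    Knm_b = Kgg[np.ix_(n_idx, m_idx)]
    Kmn_b = Kgg[np.ix_(m_idx, n_idx)]
    Kmm_b = Kgg[np.ix_(m_idx, m_idx)]

    Knn = Knn_b + Knm_b @ Gmn + Gmn.T @ Kmn_b + Gmn.T @ Kmm_b @ Gmn
    Fn = Fg[n_idx] + Gmn.T @ Fg[m_idx]
    return Gmn, Knn, Fn
```

Once u_n is solved for, the dependent displacements are recovered as u_m = G_{mn} u_n.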


For the simple bar example, the calculation is as follows.

R_{mm} = \begin{bmatrix}
-1 & 0 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 & 0 \\
0 & 0 & -1 & 0 & 0 \\
0 & 0 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & -1
\end{bmatrix}.

R_{mn} = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0
\end{bmatrix}.

Since R_{mm} = -I is non-singular and equal to its inverse,

G_{mn} = -R_{mm}^{-1} R_{mn} = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0
\end{bmatrix}.

4.3 An axial bar element

Before we proceed further, let us derive the stiffness matrix for the simple bar element shown in Figure 4.1, allowing for axial deformation. The local element coordinate system is as earlier, aligned with the axis of the element. The following derivation is aimed at the specific example we used for the constraint elimination process.

Using earlier principles we introduce a local coordinate u with the following convention: u = 0 at node 1 and u = 1 at node 2. We describe the coordinates of the points of the element with

x = N_1 x_1 + N_2 x_2 = (1 - u) x_1 + u x_2.

Here x_i are the node point coordinates.

Using the iso-parametric concept, the same shape functions are used to describe the deformation of the element as

q(x) = N_1 q_{1x} + N_2 q_{2x} = (1 - u) q_{1x} + u q_{2x}.

With

N = \begin{bmatrix} 1 - u & u \end{bmatrix},


and

q_e = \begin{bmatrix} q_{1x} \\ q_{2x} \end{bmatrix},

we have

q(x) = N q_e.

Observe the local displacement notation.

The strain in the element is

\varepsilon = \frac{dq}{dx} = \frac{dq}{du} \frac{du}{dx}.

Differentiation yields

\frac{dq}{du} = q_{2x} - q_{1x}

and

\frac{dx}{du} = x_2 - x_1 = l,

where l is the length of the element. Hence, the strain is

\varepsilon = \frac{q_{2x} - q_{1x}}{l}.

The strain-displacement relationship of

\varepsilon = B q_e

is satisfied with the strain-displacement matrix

B = \frac{1}{l} \begin{bmatrix} -1 & 1 \end{bmatrix}.

Note again that when using linear shape functions the strain will be constant within an element.

For this one-dimensional tension-compression problem the elastic behavior is defined by the well-known Hooke's law

\sigma = E \varepsilon,

where E is the Young's modulus of elasticity. The stiffness matrix for the element is

k_e = \int \int \int B^T E B \, dx \, dy \, dz.

Considering that the bar has a constant cross section A:

k_e = AE \int_{x_1}^{x_2} B^T B \, dx.


Introducing the local coordinates and substituting yields

k_e = \frac{AE}{l^2} \int_0^1 \begin{bmatrix} -1 \\ 1 \end{bmatrix} \begin{bmatrix} -1 & 1 \end{bmatrix} J \, du.

With the Jacobian

J = \frac{dx}{du} = l,

the element stiffness matrix finally is

k_e = \begin{bmatrix} a & -a \\ -a & a \end{bmatrix},

where

a = \frac{AE}{l}.

This element stiffness matrix will contribute to the 1st and 7th columns and rows of the assembled matrix, resulting in

K_{gg} = \begin{bmatrix}
a & 0 & 0 & 0 & 0 & 0 & -a & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-a & 0 & 0 & 0 & 0 & 0 & a & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}.

The other degrees of freedom are undefined. Applying the elimination process to the stiffness matrix results in the following reduced, 7 by 7 matrix:

K_{nn} = \begin{bmatrix}
a & 0 & 0 & 0 & 0 & 0 & -a \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-a & 0 & 0 & 0 & 0 & 0 & a
\end{bmatrix}.
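To tie the element derivation to the constraint elimination of Section 4.2, here is a short NumPy sketch (illustrative only) that scatters the 2 × 2 axial bar stiffness into the 12 × 12 global matrix, builds the bar's constraint matrix, and reproduces the 7 × 7 reduced stiffness shown above.

```python
import numpy as np

A, E, l = 0.1, 1.0e7, 1.0
a = A * E / l
ke = a * np.array([[1.0, -1.0], [-1.0, 1.0]])   # axial bar element stiffness

# Scatter into the 12-DOF global matrix: qx(1) is DOF 0, qx(2) is DOF 6
Kgg = np.zeros((12, 12))
dofs = [0, 6]
Kgg[np.ix_(dofs, dofs)] += ke

# Constraint matrix Rmg of the five rigid-bar constraints (l = 1)
Rmg = np.zeros((5, 12))
Rmg[0, [1, 5, 7]] = [1.0, 1.0, -1.0]    # qy(1) + thz(1) - qy(2) = 0
Rmg[1, [2, 4, 8]] = [1.0, 1.0, -1.0]    # qz(1) + thy(1) - qz(2) = 0
Rmg[2, [3, 9]]    = [1.0, -1.0]         # thx(1) - thx(2) = 0
Rmg[3, [4, 10]]   = [1.0, -1.0]         # thy(1) - thy(2) = 0
Rmg[4, [5, 11]]   = [1.0, -1.0]         # thz(1) - thz(2) = 0

n_idx, m_idx = list(range(7)), list(range(7, 12))
Gmn = -np.linalg.solve(Rmg[:, m_idx], Rmg[:, n_idx])
Knn = (Kgg[np.ix_(n_idx, n_idx)] + Kgg[np.ix_(n_idx, m_idx)] @ Gmn
       + Gmn.T @ Kgg[np.ix_(m_idx, n_idx)]
       + Gmn.T @ Kgg[np.ix_(m_idx, m_idx)] @ Gmn)
print(Knn)   # nonzero only in rows/columns 1 and 7 (1-based), matching the text
```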

After this elimination step the recovery of the engineering solution set is formally obtained by

u_g = \begin{bmatrix} I_{nn} \\ G_{mn} \end{bmatrix} u_n,

where the actual merging of the very likely interleaved partitions is not visible. This will be clarified when executing this step for our example in the next chapter.

Let us summarize the multi-point constraint elimination step as follows:

[ K_{gg} [ K_{nn} ] ].

The special notation in this chart (and similar charts at the end of later chapters) represents the fact that the original Kgg matrix has been reduced to the Knn matrix. It is not a mathematical expression in the sense that Knn is not a direct partition of Kgg.

The topic of the next two sections is the application of boundary conditions to the finite element model. These are usually enforced displacements of the finite element model. These are represented by single degree of freedom constraints, commonly called single-point constraints. These constraints are in essence a sub-case of the more general method of the multi-point constraints presented in the last sections. Their separate presentation in these sections is justified because they are also handled separately in the industry and they are related to the automatic detection of singularities presented in the next chapter.

4.4 The concept of single-point constraints

The single-point constraints apply a fixed value to a degree of freedom of a certain node, hence the name. The most common occurrence of such is the description of boundary conditions, i.e., the fixed (often zero) displacement of a node in a certain direction or its rotation about a certain axis.

Let us consider our simple example again, but constrain the translation of node 1, imposing a boundary condition via the following constraint equations

qx(1) = 0.0,

qy(1) = 0.0,

and

qz(1) = 0.0.

These are clearly single-point constraints as they constrain one single degree of freedom each. Their physical meaning is to keep node 1 fixed at the origin, represented by the shading at the left of the bar in Figure 4.2.

Some commercial software programs automatically assign six degrees of freedom per node point. Some types of models have fewer degrees of freedom per node point. Most solid elements have only translation degrees of freedom, leaving only three degrees of freedom per node point. The stiffness matrix columns for the other degrees of freedom are null, meaning that the structure is not defined in these degrees of freedom. Single-point constraints can be used to remove these undefined degrees of freedom. Other possible uses for single-point constraints, such as to enforce symmetry of the deformation of a symmetric structure or component, are not discussed here.

4.5 The elimination of single-point constraints

The n-size equilibrium equation may further be reduced by applying the single-point constraints. The single-point constraints are described by

u_s = Y_s,

where the u_s is a partition of u_n:

u_n = \begin{bmatrix} u_f \\ u_s \end{bmatrix}.

For our simple example

u_s = \begin{bmatrix} q_x(1) \\ q_y(1) \\ q_z(1) \end{bmatrix},

leaving the unconstrained set

u_f = \begin{bmatrix} \theta_x(1) \\ \theta_y(1) \\ \theta_z(1) \\ q_x(2) \end{bmatrix}.

The Ys vector of enforced generalized displacements (translations and rotations) for our example is:

Y_s = \begin{bmatrix} 0.0 \\ 0.0 \\ 0.0 \end{bmatrix}.


FIGURE 4.2 Boundary conditions

Note that the components are not always zero; sometimes a non-zero displacement or rotation value is enforced by the engineer. This so-called enforced motion case will be discussed in Section 15.5. The corresponding reduction of the stiffness matrix is based on the appropriate partitioning of the last reduced equation:

K_{nn} u_n = \begin{bmatrix} K_{ff} & K_{fs} \\ K_{sf} & K_{ss} \end{bmatrix} \begin{bmatrix} u_f \\ u_s \end{bmatrix} = \begin{bmatrix} F_f \\ F_s \end{bmatrix}.

As the second equation has a prescribed solution, the remaining equation is

K_{ff} u_f = F_f - K_{fs} Y_s.

The f partition is sometimes called the free partition, as the constraints have been eliminated. The size reduction between the n and f partitions is not as dramatic as between the g and the n partitions. Usually there are only a few boundary conditions even for very large problems.
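The single-point constraint elimination is a straightforward partitioning; the following NumPy sketch (an illustration with an assumed index set of constrained DOFs, not the book's code) extracts the free partition and moves the enforced displacements to the right-hand side.

```python
import numpy as np

def spc_reduce(Knn, Fn, s_idx, Ys):
    """Apply single-point constraints u_s = Y_s to K_nn u_n = F_n."""
    n = Knn.shape[0]
    f_idx = [i for i in range(n) if i not in set(s_idx)]   # free DOFs
    Kff = Knn[np.ix_(f_idx, f_idx)]
    Kfs = Knn[np.ix_(f_idx, s_idx)]
    Ff = Fn[f_idx] - Kfs @ Ys          # K_ff u_f = F_f - K_fs Y_s
    return f_idx, Kff, Ff
```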


This step for our example results in the further reduced, now 4 by 4 matrix:

K_{ff} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & a \end{bmatrix}.

The recovery of the solution for the independent set before the elimination of the single-point constraints is done by

u_n = \begin{bmatrix} u_f \\ Y_s \end{bmatrix}.

The u_n to u_g transformation via u_m was shown at the end of Section 4.3. The summary of the two reduction steps completed so far is

[ K_{gg} [ K_{nn} [ K_{ff} ] ] ].

4.6 Rigid body motion support

After executing the above steps it is still possible that the Kff stiffness matrix is singular and the structure remains a free body exhibiting a rigid body motion. Such a motion does not produce internal forces in the structure and should be restrained in order to solve the flexible body problem.

A common occurrence of this in static analysis is when the engineer forgets to constrain the structure to ground in some way. This results in six mechanisms, where the entire structure can move in a stress-free manner in the three translational and three rotational directions. A set of functions that describes the mechanism motions are called rigid body mode shapes.

Accordingly, the stiffness matrix is partitioned into restrained (r) and unrestrained (l) partitions and the static equilibrium is posed as

K_{ff} u_f = \begin{bmatrix} K_{rr} & K_{rl} \\ K_{lr} & K_{ll} \end{bmatrix} \begin{bmatrix} u_r \\ u_l \end{bmatrix} = \begin{bmatrix} F_r \\ F_l \end{bmatrix} = F_f.

The displacements of the supported partition by definition are zero, hence the second equation yields

K_{ll} u_l = F_l.


The stiffness matrix of this problem is now non-singular and may be solved for the internal deformations as

u_l = K_{ll}^{-1} F_l.

To verify that the support provided by the engineer is adequate, we turn to the first equation. Utilizing again the zero displacements, the first equation yields

K_{rl} u_l = F_r.

Substituting the u_l displacements, we find the loads necessary to support the structure in terms of the active loads as

F_r = K_{rl} K_{ll}^{-1} F_l.

Introducing the rigid body transformation matrix of

G_{rl} = -K_{rl} K_{ll}^{-1}

and expanding it to the f partition size as

G_{fr} = \begin{bmatrix} I_{rr} \\ G_{rl}^T \end{bmatrix}

provides a way to measure the quality of the rigid body support. By transforming the stiffness matrix as

\bar{K}_{rr} = G_{fr}^T K_{ff} G_{fr}

we obtain the rigid body stiffness matrix that should be computationally zero. Measuring the magnitude of the Euclidean norm of the rigid body stiffness matrix and comparing it to the norm of the supported partition provides a rigid body error ratio:

\varepsilon_r = \frac{||\bar{K}_{rr}||}{||K_{rr}||}.
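A compact NumPy sketch of this check follows (illustrative only; the r and l index lists are assumed to be given). It builds Grl, the expanded transformation, and the rigid body error ratio from the norms.

```python
import numpy as np

def rigid_body_check(Kff, r_idx, l_idx):
    """Return the rigid body error ratio for a chosen support (r) partition."""
    Krr = Kff[np.ix_(r_idx, r_idx)]
    Krl = Kff[np.ix_(r_idx, l_idx)]
    Kll = Kff[np.ix_(l_idx, l_idx)]

    Grl = -np.linalg.solve(Kll.T, Krl.T).T        # Grl = -Krl Kll^{-1}
    Gfr = np.vstack([np.eye(len(r_idx)), Grl.T])  # expanded to the f partition size

    K_perm = Kff[np.ix_(r_idx + l_idx, r_idx + l_idx)]  # reorder f as [r, l]
    Krr_rb = Gfr.T @ K_perm @ Gfr                 # rigid body stiffness, should be ~ zero
    return np.linalg.norm(Krr_rb) / np.linalg.norm(Krr)
```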

If this ratio is not small enough, the r partition is not adequately specified and the l set stiffness matrix is still singular. The summary of the three reduction steps is

[ K_{gg} [ K_{nn} [ K_{ff} [ K_{ll} ] ] ] ].


4.7 Constraint augmentation approach

There is an alternative way of dealing with constraints in finite element computations and that is by augmenting them to the system of equations, instead of the elimination methods discussed in the prior sections.

As earlier, let us first consider the multi-point constraints. In Section 4.2 the partitioning of

K_{gg} u_g = \begin{bmatrix} \bar{K}_{nn} & \bar{K}_{nm} \\ \bar{K}_{mn} & \bar{K}_{mm} \end{bmatrix} \begin{bmatrix} u_n \\ u_m \end{bmatrix} = \begin{bmatrix} \bar{F}_n \\ \bar{F}_m \end{bmatrix} = F_g,

was the basis of the elimination and here it will serve as the basis of the augmentation. Recall the relationship between the dependent and independent degrees of freedom of the model:

u_m = G_{mn} u_n,

where Gmn is the constraint matrix. Let us augment the system of equations by the constraint matrix with the help of Lagrange multipliers as follows:

\begin{bmatrix} \bar{K}_{nn} & \bar{K}_{nm} & G_{mn}^T \\ \bar{K}_{mn} & \bar{K}_{mm} & -I \\ G_{mn} & -I & 0 \end{bmatrix} \begin{bmatrix} u_n \\ u_m \\ \lambda_m \end{bmatrix} = \begin{bmatrix} \bar{F}_n \\ \bar{F}_m \\ 0 \end{bmatrix},

or

\begin{bmatrix} K_{gg} & G_{mg}^T \\ G_{mg} & 0_{mm} \end{bmatrix} \begin{bmatrix} u_g \\ \lambda_m \end{bmatrix} = \begin{bmatrix} F_g \\ 0_m \end{bmatrix}.

The physical meaning of the Lagrange multipliers will be the reaction forces at the multi-point constraints, computed as

\lambda_m = \bar{K}_{mn} u_n + \bar{K}_{mm} u_m - \bar{F}_m.

A similar approach to the single-point constraint partitioning form of

K_{nn} u_n = \begin{bmatrix} K_{ff} & K_{fs} \\ K_{sf} & K_{ss} \end{bmatrix} \begin{bmatrix} u_f \\ u_s \end{bmatrix} = \begin{bmatrix} F_f \\ F_s \end{bmatrix},

by augmenting with the single-point constraints represented by the Ys matrix, results in

\begin{bmatrix} K_{ff} & K_{fs} & 0 \\ K_{sf} & K_{ss} & -I \\ 0 & I & 0 \end{bmatrix} \begin{bmatrix} u_f \\ u_s \\ \lambda_s \end{bmatrix} = \begin{bmatrix} F_f \\ F_s \\ Y_s \end{bmatrix},

or

\begin{bmatrix} K_{nn} & G_{sn}^T \\ G_{sn} & 0_{ss} \end{bmatrix} \begin{bmatrix} u_n \\ \lambda_s \end{bmatrix} = \begin{bmatrix} F_n \\ Y_s \end{bmatrix}.


The single-point constraint forces are recovered as

\lambda_s = K_{sf} u_f + K_{ss} u_s - F_s.

The augmentation approach may be processed simultaneously by introducing two new partitions. First combine the two constraint sets into one as

2 = \begin{bmatrix} s \\ m \end{bmatrix}.

Then augment the g partition with this partition to obtain the super-partition

1 = \begin{bmatrix} g \\ 2 \end{bmatrix}.

With these, the augmented form of the constrained linear static problem becomes

K_{11} u_1 = F_1,

where

F_1 = \begin{bmatrix} F_g \\ Y_2 \end{bmatrix}.

The enforced displacement term is a simple extension

Y_2 = \begin{bmatrix} Y_s \\ 0_m \end{bmatrix}.

The system matrix is

K_{11} = \begin{bmatrix} K_{gg} & G_{2g}^T \\ G_{2g} & 0 \end{bmatrix},

where the combined, single and multi-point constraint matrix is of form

G_{2g} = \begin{bmatrix} G_{sg} \\ G_{mg} \end{bmatrix}.

The Gmg sub-matrix was defined above as an appropriately scattered version of the Gmn sub-matrix, while the Gsg sub-matrix is based on the earlier Gsn matrix.

This augmented linear system may also be solved with specific pivoting techniques, such as described in [2]. These are needed due to the fact that the augmented stiffness matrix is indefinite on account of the zero diagonal block introduced. Nevertheless, factorization techniques, such as discussed in Sections 7.2 and 7.3, solve these problems routinely.
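As a sketch of the augmented (Lagrange multiplier) approach, the following NumPy fragment assembles the indefinite block system for a generic constraint matrix and right-hand side and solves it with a general solver; a production code would instead use a symmetric indefinite factorization with pivoting as referenced in [2]. The inputs are hypothetical placeholders.

```python
import numpy as np

def solve_augmented(Kgg, Fg, G2g, Y2):
    """Solve the augmented system [Kgg G^T; G 0][u; lam] = [Fg; Y2]."""
    nc = G2g.shape[0]
    K11 = np.block([[Kgg, G2g.T],
                    [G2g, np.zeros((nc, nc))]])
    F1 = np.concatenate([Fg, Y2])
    sol = np.linalg.solve(K11, F1)   # indefinite, but nonsingular for admissible constraints
    n = Kgg.shape[0]
    return sol[:n], sol[n:]          # u_g and the Lagrange multipliers (reaction forces)
```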


The simultaneously augmented solution is partitioned as

u_1 = \begin{bmatrix} u_g \\ \lambda_2 \end{bmatrix},

where the first partition is the ultimate subject of the engineer's interest and the reaction forces at the constraints are trivial to compute.

The method of augmentation is not as widespread as the elimination, mainly because solver components were designed with a positive-definite matrix focus in the past. The augmented approach is clearly more efficient for effects such as enforced motion, a subject of Section 15.5. It is also the basis of coupling finite element systems with multi-body analysis software, as shown in Section 11.6.

The next issue is the detection of singularities introduced by modeling techniques or errors in them. These may be mechanisms, massless degrees of freedom, or even the dangerous combination of both, the massless mechanisms, subjects of the next chapter.

References

[1] Barlow, J.; Constraint relationships in linear and nonlinear finite element analyses, Int. Journal for Numerical Methods in Engineering, Vol. 2, Nos. 2/3, 149-156, 1982

[2] Bunch, J. R. and Parlett, B. N.; Direct methods for solving symmetric indefinite systems of linear equations, SIAM Journal of Numerical Analysis, Vol. 8, No. 2, 1971

[3] Komzsik, L.; The Lagrange multiplier approach for constraint processing in finite element applications, Proceedings of microCAD-SYSTEM '93, Vol. M, pp. 1-6, The University of Miskolc, Hungary, 1993

[4] Komzsik, L. and Chiang, K.-N.; The effect of a Lagrange multiplier approach on large scale parallel computations, Computing Systems in Engineering, Vol. 4, pp. 399-403, 1993


5

Singularity Detection of Finite Element Models

It is clear from the example of Chapter 4 that singularities may remain in the system after eliminating the specified constraints. The possible remaining singularities in the system may be due to modeling techniques. For example, modeling planar behavior via membrane elements causes singularities due to the out of plane, drilling degrees of freedom being undefined. Similarly, in models with solid elements, the nodes have no rotational degrees of freedom. These are all called local singularities, the topic of Section 5.1.

There are also global singularities caused by rank deficiency of the stiffness matrix due to a part of the model having an unconstrained rigid body motion. These are called mechanisms and are discussed in Section 5.2.

5.1 Local singularities

These singularities, once the proper degrees of freedom are identified, are removed by adding additional single-point constraints to the s partition and repeating the single-point elimination step.

In order to describe the detection process, let us consider the simple bar example again. The f partition version of the equilibrium equation of the model has only one nonzero term in a 4 × 4 matrix, so it is clearly singular.

The most practical technique to further identify singularities is by the systematic evaluation of node point singularity. This is executed by numerically examining the 3 by 3 sub-matrices corresponding to the translational and rotational degrees of freedom of every node. Let us denote the translational 3 by 3 sub-matrix of the i-th node by K_t^i and the rotational 3 by 3 sub-matrix by K_r^i. In case the Kff matrix does not contain the full 3 by 3 sub-matrix of a certain node (due to constraint elimination), a smaller 2 by 2 or ultimately 1 by 1 sub-matrix is examined.


The process is based on the following singular value decompositions:

K_t^i = U_t \Sigma_t V_t^T,

and

K_r^i = U_r \Sigma_r V_r^T.

If any of the singular values obtained in \Sigma_r or \Sigma_t are less than a small computational threshold, then the corresponding direction defined by the appropriate column of V^T is considered to be singular. The singular direction may be resolved by an appropriate single-point constraint.

Naturally, more than one direction may be singular and ultimately all three may be singular. There may also be cases when a singular direction is not directly aligned with any of the coordinate axes. In this case, the coordinate direction closest to the singular vector (based on direction cosines) is constrained.
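A minimal NumPy sketch of this nodal test follows (illustrative, with an assumed tolerance): it decomposes a node's 3 by 3 stiffness block and reports the near-singular directions.

```python
import numpy as np

def singular_directions(K3, tol=1.0e-8):
    """Return the (nearly) singular directions of a nodal 3x3 stiffness block."""
    U, s, Vt = np.linalg.svd(K3)
    smax = s.max() if s.max() > 0.0 else 1.0
    # directions whose singular value is below tol relative to the largest one
    return [Vt[i] for i in range(len(s)) if s[i] <= tol * smax]

# Node 1 of the bar example: the rotational block is entirely zero
K1r = np.zeros((3, 3))
for d in singular_directions(K1r):
    print("singular direction:", d)   # all three directions are reported
```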

For example, in the case of our example

K_r^1 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.

The singular value decomposition is trivial; all 3 directions are singular. The automatically found singularities are:

u_s^a = \begin{bmatrix} \theta_x(1) \\ \theta_y(1) \\ \theta_z(1) \end{bmatrix}.

The superscript notes that this set is the automatically detected singular set. The f-size equilibrium equation may now be reduced by applying the automatic single-point constraints. The automatic single-point constraints are described by

u_s^a = Y_s^a,

where the u_s^a is a partition of u_f:

u_f = \begin{bmatrix} u_a \\ u_s^a \end{bmatrix}.

It follows that for our example:

Y_s^a = \begin{bmatrix} 0.0 \\ 0.0 \\ 0.0 \end{bmatrix}.


Applying these to the f-size equilibrium equation results in

K_{aa} u_a = F_a - K_{a s^a} Y_s^a.

This is commonly called the analysis set formulation as, after this step, the numerical model is ready for analysis. The following chart summarizes all the numerical reduction steps leading to the analysis formulation

[ K_{gg} [ K_{nn} [ K_{ff} [ K_{ll} [ K_{aa} ] ] ] ] ].

The final stiffness matrix ready for analysis is

K_{aa} = [ a ],

and the final, clearly non-singular, equilibrium equation is

K_{aa} u_a = F_a,

or

a \, q_x(2) = F.

Assuming numerical values of E = 10^7, A = 0.1, F = 1.0 in the appropriate matching units, the numerical solution of our example problem is a displacement of 10^{-6} units. Figure 5.1 shows the (exaggerated) deformed shape of the example model.

The recovery of the solution prior to the single-point constraint elimination is

u_n = \begin{bmatrix} 0.0 \\ 0.0 \\ 0.0 \\ 0.0 \\ 0.0 \\ 0.0 \\ 10^{-6} \end{bmatrix} = \begin{bmatrix} q_x(1) \\ q_y(1) \\ q_z(1) \\ \theta_x(1) \\ \theta_y(1) \\ \theta_z(1) \\ q_x(2) \end{bmatrix}.

Here the first 3 rows containing zeroes are the boundary condition Ys constraints. The second 3 represent the automatically found Y_s^a constraints. This vector is then further reprocessed to reflect the multi-point constraint elimination.


FIGURE 5.1 Deformed shape of bar

u_m = G_{mn} u_n = \begin{bmatrix} 0.0 \\ 0.0 \\ 0.0 \\ 0.0 \\ 0.0 \end{bmatrix} = \begin{bmatrix} q_y(2) \\ q_z(2) \\ \theta_x(2) \\ \theta_y(2) \\ \theta_z(2) \end{bmatrix}.

The zero result is explained by the fact that the last column of the Gmn matrix derived in Section 4.2 is zero and only the last term of the u_n vector is nonzero. The g partition result vector is obtained by merging the m and n partitions:


u_g = \begin{bmatrix} 0.0 \\ 0.0 \\ 0.0 \\ 0.0 \\ 0.0 \\ 0.0 \\ 10^{-6} \\ 0.0 \\ 0.0 \\ 0.0 \\ 0.0 \\ 0.0 \end{bmatrix} = \begin{bmatrix} q_x(1) \\ q_y(1) \\ q_z(1) \\ \theta_x(1) \\ \theta_y(1) \\ \theta_z(1) \\ q_x(2) \\ q_y(2) \\ q_z(2) \\ \theta_x(2) \\ \theta_y(2) \\ \theta_z(2) \end{bmatrix}.

This is the final result of our computational example; the displacement at every node point and each degree of freedom is known.

5.2 Global singularities

Global singularities occur when the stiffness matrix is rank deficient due to linear dependency among the columns. They manifest an engineering phenomenon called a mechanism. After all flexible elements have been accounted for, the constrained degrees of freedom have been eliminated and rigid body motions are supported, it is still possible to have groups of degrees of freedom that are free to move without incurring loads in other degrees of freedom.

For example, an open automobile door is free to rotate without causing loads in the hinges and other structures to which it attaches. Each group of degrees of freedom which can move in a stress-free manner is attached to what is called a mechanism. The presence of mechanisms causes difficulties in the solution process unless they are constrained by some other device. Degrees of freedom on a mechanism are likely to move through a large motion for moderate loads, and a small change in loads can lead to a large change in response.

The above car door example is a local mechanism. If it is a part of the model where constraints to ground were neglected, the model has seven mechanisms. This condition is sometimes described as seven unconstrained rigid body modes being present in the model. Aerospace industry models may have even larger numbers of rigid body modes due to a larger number of door-like components. The flaps and other stability control components could all represent mechanisms.


Mechanisms may or may not have been intentional by the engineer. In any case these types of singularities also need to be removed before any numerical solution is attempted. After eliminating the constraints and the local singularities, this situation could still happen if there is a linear dependency among the columns of Kaa. To recognize this case, we rely on some diagnostics provided during the

K_{aa} = L_{aa} D_{aa} L_{aa}^T

factorization (more details on the process in the next chapter) of the stiffness matrix as follows.

Due to round-off, mechanisms usually manifest themselves as very small values of D(i, i). That of course could result in an unstable factorization and unreliable solutions later. To prevent this, the ratio of

r(i) = K_{aa}(i, i)/D_{aa}(i, i)

is monitored throughout the factorization. This strategy is based on the recommendation of [2] and is used widely in the industry, where it is called the matrix/factor diagonal ratio. Wherever this ratio exceeds a certain threshold the related degree of freedom is considered to be part of a possible mechanism. The threshold used in practice is usually close to \sqrt{\varepsilon_{machine}}, where \varepsilon_{machine} is representative of the floating point arithmetic accuracy of the computer used.
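The diagnostic can be sketched as follows (a simplified illustration: a dense, unpivoted LDL^T factorization written out explicitly, with an assumed numeric threshold; production sparse solvers obtain the same diagonal ratios as a by-product of the factorization).

```python
import numpy as np

def ldlt(A):
    """Unpivoted LDL^T factorization of a symmetric matrix (dense, illustrative)."""
    n = A.shape[0]
    L, d = np.eye(n), np.zeros(n)
    for j in range(n):
        d[j] = A[j, j] - L[j, :j]**2 @ d[:j]
        for i in range(j + 1, n):
            L[i, j] = (A[i, j] - L[i, :j] * L[j, :j] @ d[:j]) / d[j]
    return L, d

def mechanism_dofs(Kaa, threshold=1.0e6):
    """Flag DOFs whose matrix/factor diagonal ratio exceeds the threshold."""
    _, d = ldlt(Kaa)
    ratios = np.abs(np.diag(Kaa) / d)
    return np.where(ratios > threshold)[0]
```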

In large models the straightforward listing of these degrees of freedom may not be adequate to aid the engineer in resolving the problem. To visualize the "shape" of the mechanism, let us partition the degrees of freedom into a partition (x) exceeding and the complementary set (\bar{x}) below the threshold.

u_a = \begin{bmatrix} u_x \\ u_{\bar{x}} \end{bmatrix}.

A similar partitioning of the stiffness matrix is

K_{aa} = \begin{bmatrix} K_{xx} & K_{x\bar{x}} \\ K_{\bar{x}x} & K_{\bar{x}\bar{x}} \end{bmatrix}.

Let us virtually apply a force (P_x) to the degrees of freedom with exceedingly high ratios. Let us assume that this virtual force moves the degrees of freedom in the x partition by one unit.

\begin{bmatrix} K_{xx} & K_{x\bar{x}} \\ K_{\bar{x}x} & K_{\bar{x}\bar{x}} \end{bmatrix} \begin{bmatrix} I_x \\ u_{\bar{x}} \end{bmatrix} = \begin{bmatrix} P_x \\ 0 \end{bmatrix}.

The solution shape of the stable, below the threshold partition is

u_{\bar{x}} = -K_{\bar{x}\bar{x}}^{-1} K_{\bar{x}x}.


This shape helps the analyst to recognize the components of the mechanism. The drawback of this method of identifying mechanisms is its non-trivial cost; therefore, it is executed only at the engineer's specific request in commercial software.

The solution of the mechanism problem is to tie the independent component to the rest of the structure. This is done by applying more flexible elements and/or appropriate multi-point constraints and repeating the multi-point elimination process. Writing these constraint equations is not trivial and needs engineering knowledge in most cases.

5.3 Massless degrees of freedom

So far our focus has been restricted to the stiffness matrix. The mass matrix, whose generation was shown in Chapter 3, usually has a large number of zero rows and columns. In industrial finite element analysis these are called massless degrees of freedom. Some of the eigenvalue analysis methods, on the other hand, require a nonsingular mass matrix. Such a method is the dense matrix reduction type method of Householder, to be further discussed in Chapter 9.

In order to allow the use of such methods, another reduction step is sometimes executed in the industry, to eliminate the massless degrees of freedom from both the global mass and the global stiffness matrix.

Let us consider the a partition normal modes analysis problem of

[K_{aa} - \lambda M_{aa}] u_a = 0.

Remember, this problem now has the global singularities also removed. Let these matrices be partitioned into the o partition of massless degrees of freedom and the \bar{a} partition of degrees of freedom with masses.

\begin{bmatrix} K_{oo} & K_{o\bar{a}} \\ K_{o\bar{a}}^T & K_{\bar{a}\bar{a}} - \lambda M_{\bar{a}\bar{a}} \end{bmatrix} \begin{bmatrix} u_o \\ u_{\bar{a}} \end{bmatrix} = 0.

The fact that the mass matrix contains zero rows and columns in the o partitions is demonstrated by the lack of M terms in the appropriate partitions. Introducing a transformation matrix

G_{a\bar{a}} = \begin{bmatrix} -K_{oo}^{-1} K_{o\bar{a}} \\ I_{\bar{a}\bar{a}} \end{bmatrix},


the displacement vector is reduced as

u_a = G_{a\bar{a}} u_{\bar{a}}.

Left multiplication by the transformation matrix yields

G_{a\bar{a}}^T (K_{aa} - \lambda M_{aa}) G_{a\bar{a}} u_{\bar{a}} = 0.

With

\bar{K}_{\bar{a}\bar{a}} = G_{a\bar{a}}^T K_{aa} G_{a\bar{a}},

and

\bar{M}_{\bar{a}\bar{a}} = G_{a\bar{a}}^T M_{aa} G_{a\bar{a}},

one obtains a reduced set of equations devoid of massless degrees of freedom

[\bar{K}_{\bar{a}\bar{a}} - \lambda \bar{M}_{\bar{a}\bar{a}}] u_{\bar{a}} = 0.
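In code, this condensation of the massless set can be sketched as follows (illustrative NumPy only, with the massless index set taken from the zero rows of the mass matrix).

```python
import numpy as np

def eliminate_massless(Kaa, Maa, tol=0.0):
    """Condense out DOFs whose mass matrix rows are (numerically) zero."""
    n = Kaa.shape[0]
    o_idx = [i for i in range(n) if np.all(np.abs(Maa[i]) <= tol)]   # massless set
    a_idx = [i for i in range(n) if i not in set(o_idx)]             # set with mass

    Koo = Kaa[np.ix_(o_idx, o_idx)]
    Koa = Kaa[np.ix_(o_idx, a_idx)]
    G = np.zeros((n, len(a_idx)))
    G[o_idx, :] = -np.linalg.solve(Koo, Koa)
    G[a_idx, :] = np.eye(len(a_idx))

    Kred = G.T @ Kaa @ G
    Mred = G.T @ Maa @ G
    return a_idx, Kred, Mred
```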

5.4 Massless mechanisms

Massless mechanisms are the most dangerous and annoying singularities in finite element computations. They are a combination of the mechanisms and massless degrees of freedom, occurring when the stiffness and mass matrices have a coinciding zero subspace.

Recall that the rigid body eigenvalues were computational zeroes. The massless degrees of freedom, if left in the system, may result in infinite eigenvalues. Neither of them is troublesome in its own right, but their combination is especially troublesome because the eigenvalues corresponding to that case are indefinite. This may carry the grave consequence of finding spurious modes.

The practical scenario under which this may happen is when either some mass or stiffness components are left out of the model due to engineering error. It is very important to detect such cases and this may be done as follows. Let us compute a linear combination of the stiffness and mass matrix as

A = K_{aa} + \lambda_s M_{aa}.

Note that the combination coefficient is positive, resulting in a negative shift in an eigenvalue analysis sense, avoiding the influence of the eigenvalues to the left of the shift.


The detection is again based on the factorization

A = L_{AA} D_{AA} L_{AA}^T,

and the ratio of matrix diagonals and factor diagonals is monitored:

R(i) = A(i, i)/D_{AA}(i, i).

The following strategy is based on the industrially acknowledged solution in [3] and may be used to correct massless mechanism scenarios. Based on the R vector, a P matrix originally initialized to zero is populated as follows. For i = 1, 2, . . . , n, if

R(i) > threshold,

then the i-th row of the next available column j, j = 1, 2, . . . , m of the P matrix will be set to unity as

P(i, j) = 1.

The threshold is the maximum ratio, for example 10^6, tolerated by the engineer. The process results in having only one nonzero term in each column of the P matrix. If there is no entry in R that violates the maximum threshold, then P is empty and there are no massless mechanisms detected in the system.

The solution of the system

AU = P

by exploiting the factorization executed for the detection,

U = (L_{AA} D_{AA} L_{AA}^T)^{-1} P,

provides the shapes depicting the massless mechanism. Each column of the U matrix represents a potential massless mechanism mode, assuming that the rigid body modes were already removed.
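A compact sketch of this detection loop follows (illustrative only; it embeds a simple dense LDL^T like the one shown for mechanism detection, with an assumed shift and threshold, and solves for the shapes with a general solver where a production code would reuse the factors).

```python
import numpy as np

def massless_mechanism_shapes(Kaa, Maa, shift=1.0e3, threshold=1.0e6):
    """Detect massless mechanisms of the pencil (Kaa, Maa) and return their shapes."""
    A = Kaa + shift * Maa                    # positive coefficient = negative shift
    n = A.shape[0]
    L, d = np.eye(n), np.zeros(n)
    for j in range(n):                       # unpivoted LDL^T diagonals
        d[j] = A[j, j] - L[j, :j]**2 @ d[:j]
        for i in range(j + 1, n):
            L[i, j] = (A[i, j] - L[i, :j] * L[j, :j] @ d[:j]) / d[j]
    R = np.abs(np.diag(A) / d)
    flagged = np.where(R > threshold)[0]
    if flagged.size == 0:
        return flagged, None                 # no massless mechanisms detected
    P = np.zeros((n, flagged.size))
    P[flagged, np.arange(flagged.size)] = 1.0
    U = np.linalg.solve(A, P)                # columns depict the mechanism shapes
    return flagged, U
```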

The locations at the end of each of the remaining mode shapes are gathered into a single partitioning vector Q. The nonzero terms in the vector constitute the m partition of massless mechanism degrees of freedom. The vector is used to partition both the stiffness and mass matrices simultaneously.

K_{aa} = \begin{bmatrix} K_{\bar{a}\bar{a}} & K_{\bar{a}m} \\ K_{m\bar{a}} & K_{mm} \end{bmatrix}

and

M_{aa} = \begin{bmatrix} M_{\bar{a}\bar{a}} & M_{\bar{a}m} \\ M_{m\bar{a}} & M_{mm} \end{bmatrix}.

The m partition represents the subspaces corresponding to the massless mechanisms and, as such, is discarded. The pencil of

(K_{\bar{a}\bar{a}}, M_{\bar{a}\bar{a}})

may now be safely subjected to eigenvalue analysis. The final reduction summary chart after the elimination of massless degrees of freedom and mechanisms is

[ K_{gg} [ K_{nn} [ K_{ff} [ K_{ll} [ K_{aa} [ \bar{K}_{\bar{a}\bar{a}} [ K_{\bar{a}\bar{a}} ] ] ] ] ] ] ].

There are further singularity phenomena, such as, for example, singularities at element corner nodes [1]. These are not addressed via generic reduction approaches, but by specific numerical adjustments, and as such are not discussed here further. It is also possible that multiple physical phenomena are modeled simultaneously, for example elasticity with fluid dynamics, and this topic will be discussed in the next, final chapter of Part I, after reviewing some industrial case studies.

5.5 Industrial case studies

To quantitatively demonstrate the effect of the numerical finite element model generation techniques discussed so far, several automobile industry examples were collected. Table 5.1 shows the number of nodes and various elements of three finite element models. They were automobile body examples such as shown in Figure 5.2.

Such structural models are called "body-in-white" models in the automobile industry to distinguish from the fully equipped "trimmed" models shown in Chapter 9. They all had various shell and solid elements. The shell elements include 4 or 8 noded quadrilateral elements and 3 or 6 noded triangular elements (first and second order elements, respectively). The solid elements were hexahedral, pentahedral (also called wedge) and 4 or 10 noded (first or second order) tetrahedral elements.


FIGURE 5.2 Typical automobile body-in-white model

TABLE 5.1
Element statistics of automobile model examples

Model   Nodes       Shells      Solids    Rigids
A       529,116     428,154     38,385    59,071
B       712,680     563,563     123,602   50,868
C       1,322,766   1,195,701   89,980    4,766

The matrix partition sizes corresponding to the various steps of numerical model generation are collated in Table 5.2. The g partition is the global, assembled partition; the n partition is what remains after eliminating the multi-point constraints. The f partition has the single-point constraints also removed and finally the a partition, ready for analysis, has no singularities.

It is important to notice that the difference between the n and the f partitions is rather small, indicating the practical industrial tendency of engineers giving only a few single-point constraints (as boundary conditions) explicitly.


TABLE 5.2
Reduction sizes of automobile model examples

Model   g           n           f           a
A       3,135,144   2,857,100   2,856,080   2,643,561
B       4,276,080   4,101,639   4,101,557   3,577,998
C       7,936,560   7,882,177   7,881,784   6,936,560

There is much reliance on automated singularity processing, demonstrated by the difference between the f and a partitions.

On the other hand, the large number of rigid elements gives rise to many multi-point constraints. This produces the noticeably large difference between the g and the n partition sizes.

We have now arrived at the numerical model containing matrices ready for various analyses. Due to the often enormous size of this model, a variety of reduction methods is used in the industry to achieve a computational model. These are the subject of the chapters of Part II.

References

[1] Benzley, S. E.; Representation of singularities with iso-parametric finite elements, Numerical Methods in Engineering, Vol. 8, No. 3, pp. 537-545, 2005

[2] Gockel, M. A.; An index for the quality of linear equation solutions, Lockheed Corp. Technical Report, LR 25507, 1973

[3] Gockel, M. A.; Massless mechanism detection for real modes, MSC User report MMA.v707, 1999


6

Coupling Physical Phenomena

The life cycle of products contains scenarios when the structure is interacting with another physical entity. Important practical problems are, for example, when the structure surrounds or is immersed into a volume of fluid, both called fluid-structure interaction scenarios.

6.1 Fluid-structure interaction

We focus on analysis scenarios when the behavior is dominated by the structure. The coupling between the fluid and the structure introduces an unsymmetric matrix component. The coupled equilibrium of an undamped structure is

M_c \ddot{u}_c + K_c u_c = F_c,

where the subscript c refers to the coupling. The coupled matrices are

M_c = \begin{bmatrix} M_s & 0 \\ A & M_f \end{bmatrix}

and

K_c = \begin{bmatrix} K_s & -A^T \\ 0 & K_f \end{bmatrix}.

The subscripts s and f refer to the structure and fluid, respectively, not to the s or f partitions presented in prior sections. The A matrix represents the fluid-structure coupling and will be discussed shortly. The components of the coupled solution vector are

u_c = \begin{bmatrix} u_s \\ p_f \end{bmatrix},

where the p_f is the pressure in the fluid and u_s is the displacement in the structural part. The external load may also be given as structural forces and pressure values as


F_c = \begin{bmatrix} F_s \\ P_f \end{bmatrix}.

6.2 A hexahedral finite element

The fluid volume may be best represented by hexahedral finite elements. This enables the interior to be modeled by element shapes close to the Cartesian discretization of the volume. The element shapes are gradually deformed to adhere to the shape of the boundary of the fluid volume.

Following the principles laid down regarding the three dimensional continuum modeling by tetrahedral elements in Section 3.4, every node point of the linear (so-called eight-noded) hexahedron has three degrees of freedom and there are twenty-four nodal displacements of the element.

FIGURE 6.1 Hexahedral finite element

There are higher order hexahedral elements; for example, having nodes in the middle of every edge will result in the so-called twenty-noded element. Finally, having nodes on the middle of faces as well as one in the middle of the volume produces the 27-noded element. We will review the 8-noded element, such as shown in Figure 6.1, in the following.

The cube has sides of two units and the local coordinate system is originated in the center of the volume of the element, resulting in the local coordinate values of the node points described in Table 6.1.

TABLE 6.1
Local coordinates of hexahedral element

Node    u    v    w
1      -1   -1   -1
2       1   -1   -1
3      -1    1   -1
4       1    1   -1
5      -1   -1    1
6       1   -1    1
7      -1    1    1
8       1    1    1

This arrangement of the nodes corresponds to the quadrilateral element introduced in Section 1.7 and consequently the shape functions of the hexahedral element may also be generalized for i = 1, 2, . . . , 8 as

N_i = \frac{1}{8}(1 + u_i u)(1 + v_i v)(1 + w_i w).

Developing the formula for i = 1 as an example yields

N_1 = \frac{1}{8}(1 - u)(1 - v)(1 - w),

which is an obvious generalization of the quadrilateral element's shape function. The vector of nodal displacements for the element becomes


q_e = \begin{bmatrix} q_{1x} \\ q_{1y} \\ q_{1z} \\ \vdots \\ q_{8x} \\ q_{8y} \\ q_{8z} \end{bmatrix}.

Recall that q_{ix} refers to the x translation of the i-th local node of the element. The displacement at any location inside this element is approximated as

q(x, y, z) = N q_e.

The organization of the N matrix of the eight shape functions is

N = \begin{bmatrix} N_1 & 0 & 0 & N_2 & 0 & 0 & \ldots & N_8 & 0 & 0 \\ 0 & N_1 & 0 & 0 & N_2 & 0 & \ldots & 0 & N_8 & 0 \\ 0 & 0 & N_1 & 0 & 0 & N_2 & \ldots & 0 & 0 & N_8 \end{bmatrix}.

The location of a point inside the element is approximated again with the same eight shape functions as the displacement field

x = \sum_{i=1}^{8} N_i x_i,

y = \sum_{i=1}^{8} N_i y_i,

and

z = \sum_{i=1}^{8} N_i z_i.

Here x_i, y_i, z_i are the eight x, y, z coordinates of the nodes of the hexahedron, hence the element is again an iso-parametric element.
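The trilinear shape functions and the iso-parametric mapping are easy to express in code; the sketch below (illustrative only, with hypothetical node coordinates) evaluates all eight N_i at a local point, maps it to global coordinates, and checks that the functions sum to one.

```python
import numpy as np

# Local corner coordinates (u_i, v_i, w_i) of the 8-noded hexahedron, Table 6.1
CORNERS = np.array([[-1, -1, -1], [1, -1, -1], [-1, 1, -1], [1, 1, -1],
                    [-1, -1,  1], [1, -1,  1], [-1, 1,  1], [1, 1,  1]], dtype=float)

def shape_functions(u, v, w):
    """N_i = 1/8 (1 + u_i u)(1 + v_i v)(1 + w_i w), i = 1..8."""
    return 0.125 * (1 + CORNERS[:, 0]*u) * (1 + CORNERS[:, 1]*v) * (1 + CORNERS[:, 2]*w)

# Hypothetical global node coordinates of a (slightly scaled and shifted) hexahedron
xyz = CORNERS * 0.5 + np.array([1.0, 2.0, 3.0])

N = shape_functions(0.2, -0.3, 0.7)
print(N.sum())    # partition of unity: the shape functions sum to 1
print(N @ xyz)    # iso-parametric mapping of the local point to global x, y, z
```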

Further mimicking of the procedures developed for the quadrilateral element (Section 1.7) and the tetrahedral element (Section 3.4) results in the element stiffness matrix of

k_e = \int_{-1}^{+1} \int_{-1}^{+1} \int_{-1}^{+1} B^T D B \, |J| \, du \, dv \, dw,

where B and J are computed as before in the sections mentioned above.


6.3 Fluid finite elements

The physical behavior of the fluid is governed by the wave equation of form:

$$\frac{1}{b}\ddot{p} = \nabla\cdot\Big(\frac{1}{\rho}\nabla p\Big).$$
Here
$$b = c^2\rho_0$$
is the bulk modulus, c is the speed of sound in the fluid and $\rho$ is its density [4]. The connection with the surrounding structure is based on relating the structural displacements to the pressure in the fluid.

The boundary condition at a structure-fluid interface is defined as
$$\frac{\partial p}{\partial n} = -\rho\,\ddot{u}_n,$$
where n is the direction of the outward normal. At free surfaces of the fluid we assume
$$u = p = 0.$$
We address the problem in its variational formulation [3]:
$$\int\!\!\int\!\!\int_V \Big[\frac{1}{b}\ddot{p} - \frac{1}{\rho}\nabla\cdot\nabla p\Big]\,\delta p\, dV = 0.$$

The finite element discretization of this can be carried out again based on Galerkin's principle, assuming
$$p(x, y, z) = \sum_{i=1}^{n} N_i p_i = N p,$$
where $p_i$ and $N_i$ are the discrete pressure value and shape function associated with the i-th node of the fluid finite element mesh.

The same holds for the time derivatives:
$$\ddot{p}(x, y, z) = N\ddot{p}.$$

Separating the two parts of the variational equation, the first yields
$$\int_V \frac{1}{b}\ddot{p}\,\delta p\, dV = \delta p^T \int_V \frac{1}{b} N^T N\, dV\, \ddot{p}.$$
Introducing the fluid mass matrix
$$M_f = \int_V \frac{1}{b} N^T N\, dV,$$
this term simplifies to
$$\int_V \frac{1}{b}\ddot{p}\,\delta p\, dV = \delta p^T M_f \ddot{p}.$$
The components of the fluid mass matrix are computed as
$$M_f(i, j) = \frac{1}{b}\int_V N_i N_j\, dV.$$

The second part of the variational equation, integrated by parts, yields
$$-\int_V \frac{1}{\rho}(\nabla\cdot\nabla p)\,\delta p\, dV = \int_V \frac{1}{\rho}\nabla p\cdot\nabla\delta p\, dV - \int_S \frac{1}{\rho}\nabla p\,\delta p\, dS.$$
From the above assumptions it follows that
$$\nabla p = \nabla N p,$$
and using the boundary condition stated above, we obtain
$$\delta p^T\int_V \frac{1}{\rho}\nabla N^T\nabla N\, dV\, p + \delta p^T\int_S N^T \ddot{u}_n\, dS.$$

Introducing the fluid stiffness matrix of form
$$K_f = \int_V \frac{1}{\rho}\nabla N^T\nabla N\, dV,$$
the first term of the above expression simplifies to
$$\delta p^T K_f p.$$
The components of the fluid stiffness matrix are computed as
$$K_f(i, j) = \frac{1}{\rho}\int_V \nabla N_i\cdot\nabla N_j\, dV.$$

The coupling force exerted on the boundary by the surrounding structure is
$$A = \int_S N^T \ddot{u}_n\, dS.$$
Utilizing the boundary condition, the terms of the coupling matrix are computed as
$$A(i, j) = \int_S N_i N_j\, dS.$$

6.4 Coupling structure with compressible fluid

A symmetric coupled formulation is also possible using the same constituent matrices but with a different order of operations. It produces the same solution but with symmetric matrices, which can be solved more economically and reliably.

This method is applicable when the fluid component is highly compressible, as it requires the inverse of the fluid matrices. Since these matrices are usually significantly smaller than the structural matrices, the cost of obtaining these inverses is not prohibitive.

The derivation is as follows. Let us write out the first row of the coupled matrix equation:
$$M_{ss}\ddot{u} + K_{ss}u - A^T p = F_s.$$
Add and subtract a term that will appear cryptic for now:
$$M_{ss}\ddot{u} + K_{ss}u - A^T p + A^T M_{ff}^{-1}Au - A^T M_{ff}^{-1}Au = F_s.$$
After grouping and reorganizing,
$$M_{ss}\ddot{u} + (K_{ss} + A^T M_{ff}^{-1}A)u - A^T(p + M_{ff}^{-1}Au) = F_s.$$

This will become the first equation of the new coupled form. The second coupled equation is written as
$$A\ddot{u} + M_{ff}\ddot{p} + K_{ff}p = F_f.$$
Pre-multiplying by $M_{ff}K_{ff}^{-1}$ and again adding and subtracting a specifically chosen term results in
$$M_{ff}K_{ff}^{-1}A\ddot{u} + M_{ff}K_{ff}^{-1}M_{ff}\ddot{p} + M_{ff}K_{ff}^{-1}K_{ff}p + Au - Au = M_{ff}K_{ff}^{-1}F_f.$$
Grouping and reordering again produces the second equation of the new coupled form
$$M_{ff}K_{ff}^{-1}M_{ff}(\ddot{p} + M_{ff}^{-1}A\ddot{u}) - Au + M_{ff}(p + M_{ff}^{-1}Au) = M_{ff}K_{ff}^{-1}F_f.$$
Introducing the new variable
$$q = p + M_{ff}^{-1}Au$$
and its second derivative
$$\ddot{q} = \ddot{p} + M_{ff}^{-1}A\ddot{u}$$

results in the symmetric coupled formulation of
$$\begin{bmatrix} M_{ss} & 0 \\ 0 & M_{ff}K_{ff}^{-1}M_{ff} \end{bmatrix}
\begin{bmatrix} \ddot{u} \\ \ddot{q} \end{bmatrix} +
\begin{bmatrix} K_{ss} + A^T M_{ff}^{-1}A & -A^T \\ -A & M_{ff} \end{bmatrix}
\begin{bmatrix} u \\ q \end{bmatrix} =
\begin{bmatrix} F_s \\ M_{ff}K_{ff}^{-1}F_f \end{bmatrix}.$$
More details on the physics and another alternative formulation may be seen in [1].
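As a small illustration of the block structure just derived, the following NumPy sketch (our own helper, with all matrices assumed dense and already assembled) forms the symmetric coupled mass and stiffness blocks from given $M_{ss}$, $K_{ss}$, $M_{ff}$, $K_{ff}$ and $A$.

```python
import numpy as np

def symmetric_coupled_blocks(Mss, Kss, Mff, Kff, A):
    """Form the block mass and stiffness matrices of the symmetric
    structure-fluid formulation derived above (dense matrices assumed)."""
    Mff_inv = np.linalg.inv(Mff)
    Kff_inv = np.linalg.inv(Kff)
    zero_sf = np.zeros((Mss.shape[0], Mff.shape[1]))
    M = np.block([[Mss, zero_sf],
                  [zero_sf.T, Mff @ Kff_inv @ Mff]])
    K = np.block([[Kss + A.T @ Mff_inv @ A, -A.T],
                  [-A,                       Mff]])
    return M, K
```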

6.5 Coupling structure with incompressible fluid

Coupling structures with an incompressible fluid results in a simpler computational form that is still based on the coupled equilibrium
$$M_c\ddot{u}_c + K_c u_c = F_c;$$
however, due to the incompressibility of the fluid there is no acceleration dependent fluid term in the coupled mass matrix:
$$M_c = \begin{bmatrix} M_s & 0 \\ A & 0 \end{bmatrix}.$$
The stiffness matrix remains
$$K_c = \begin{bmatrix} K_s & -A^T \\ 0 & K_f \end{bmatrix}.$$
This enables the elimination of the pressure from the second equation,
$$A\ddot{u} + K_f p = 0,$$
resulting in
$$p = -K_f^{-1}A\ddot{u}.$$
Substituting into the first equation produces
$$M_s\ddot{u} + K_s u - A^T p = M_s\ddot{u} + K_s u + A^T K_f^{-1}A\ddot{u}.$$
By introducing the so-called virtual mass
$$M_v = A^T K_f^{-1}A,$$
the coupled equations of motion simplify to
$$(M_s + M_v)\ddot{u} + K_s u = 0.$$


FIGURE 6.2 Fuel tank model

This computation is very practical for fluid containers, such as the fuel tank of automobiles shown in Figure 6.2.

Other applications include structures surrounded by water, such as ships and off-shore drilling platforms.

6.6 Structural acoustic case study

In the interior acoustics application of car bodies [2] the interior fluid is the air, and as such it is compressible. The aim of acoustic response analysis is to compute and ultimately reduce the noise at a certain location inside of an automobile or a truck cabin, such as the one shown in Figure 6.3.

FIGURE 6.3 Truck cabin model

An example from a leading auto-maker had 5.6 million total coupled degrees of freedom. The structural excitation originated from the front tires and the goal was to compute the pressure (the noise) at the hypothetical driver's ear.

Table 6.2 contains statistics of the matrices in this analysis. Note the very large number of zero columns in the mass matrix, resulting from the fact that the external body model is mostly built from shell elements. The rather higher density of the stiffness matrix, manifested by the more than ten thousand nonzero terms in some columns, is also noteworthy. This is a result of the presence of the interior fluid model built from solid elements with higher connectivity.

TABLE 6.2
Acoustic response analysis matrix statistics

K matrix   number of rows   nonzero terms   max terms
           4.25 million     156 million     10,322

M matrix   number of rows   zero columns    max terms
           4.25 million     665,486         12

The analysis was executed on a workstation with 4 (1.7 GHz) CPUs. The overall analysis required 1,995 elapsed minutes, of which about 840 minutes was the eigenvalue solution. The amount of I/O executed was 4.3 Terabytes and the disk footprint was 183 Gigabytes.

Note that in practice a series of such forced vibration problems is solved, between which the structural model is changed to modify the acoustic response (lower the noise). These computations are very time consuming; therefore the advanced computational methods that are the subject of Part II are of paramount importance.

Furthermore, this physical problem may also contain damping and could result in a generalized quadratic eigenvalue problem (zero excitation) or forced, damped vibration (nonzero excitation), topics also discussed at length in Part II.

References

[1] Everstine, G. C.; Finite element formulation of structural acoustics problems, Computers and Structures, Vol. 65, No. 3, pp. 307-321, 1997

[2] Komzsik, L.; Computational acoustic analysis in the automobile industry, Proc. Supercomputer Applications in the Automobile Industry, Florence, Italy, 1996

[3] Komzsik, L.; Applied variational analysis for engineers, Taylor and Francis, Boca Raton, 2009

[4] Warsi, Z. U. A.; Fluid dynamics, Taylor and Francis, Boca Raton, 2006


Part II

Computational Reduction Techniques

7

Matrix Factorization and Linear Systems

The factorization of finite element matrices and the solution of linear systems play a significant role in the following chapters. They are also the foundation of one of the practical problems mentioned earlier, linear static analysis. Therefore, we focus this chapter on these topics.

7.1 Finite element matrix reordering

The main reasons for reordering the finite element matrices are to minimize:

1. the storage requirements of the factor matrix,
2. the computing time of the factorization, and
3. the round-off errors during the computation.

Reordering means exchanging various rows and the corresponding columns of the matrix while maintaining symmetry. To demonstrate this, we look at the following two matrices:

$$A = \begin{bmatrix}
x & x & x & x & & & & \\
x & x & & & & & & \\
x & & x & & & & & \\
x & & & x & & & & \\
 & & & & x & & & \\
 & & & & & x & & \\
 & & & & & & x & \\
 & & & & & & & x
\end{bmatrix}$$

and

$$B = \begin{bmatrix}
x & & & & & & & \\
 & x & & & & & & \\
 & & x & & & & & \\
 & & & x & & & & \\
 & & & & x & & & x \\
 & & & & & x & & x \\
 & & & & & & x & x \\
 & & & & x & x & x & x
\end{bmatrix}.$$

The x locations are the nonzero values of the matrices and the other locations hold zeroes. As noted in earlier chapters, the assembled finite element matrices are very sparse. It is easy to see that the two matrices are reordered versions of each other.

Specifically, the last four rows and columns of the A matrix were "reordered" to be the first four rows and columns of the B matrix. Furthermore, the first row and column of A became the last row and column of B. This choice, and the "betterness" of the B matrix, becomes obvious when viewing the factorization
$$A = LDL^T.$$
Here the L matrix is a unit triangular matrix and D is diagonal. An algorithm for such a factorization of an order n symmetric matrix may be written as:

For i = 1, n
    $D(i,i) = A(i,i) - \Sigma_{k=1}^{i-1} L(i,k)^2 D(k,k)$
    For j = i + 1, n
        $L(j,i) = \big(A(j,i) - \Sigma_{k=1}^{i-1} L(i,k) D(k,k) L(j,k)\big) / D(i,i)$
    End loop j
End loop i

It is important to notice that the inner loop involves a division by D(i, i). If this term is zero, the matrix is singular and cannot be factored. Alternatively, if this term is very small, the round-off error of the division is very large. This is the third reason for reordering: moving small terms off the diagonal improves accuracy.
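A direct NumPy transcription of the dense factorization loop above (a minimal sketch with no pivoting or sparsity handling, intended only for small, well-conditioned symmetric matrices):

```python
import numpy as np

def ldlt(A):
    """Dense LDL^T factorization following the loop above; returns (L, D)."""
    n = A.shape[0]
    L = np.eye(n)
    D = np.zeros(n)
    for i in range(n):
        D[i] = A[i, i] - np.sum(L[i, :i] ** 2 * D[:i])
        for j in range(i + 1, n):
            L[j, i] = (A[j, i] - np.sum(L[i, :i] * D[:i] * L[j, :i])) / D[i]
    return L, np.diag(D)

# quick check on a small symmetric matrix
A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]])
L, D = ldlt(A)
print(np.allclose(L @ D @ L.T, A))   # -> True
```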

Executing the above factorization for both the A and B matrices, the triangular factors have the following sparsity patterns.

$$L_A^T = \begin{bmatrix}
x & x & x & x & & & & \\
 & x & y & y & & & & \\
 & & x & y & & & & \\
 & & & x & & & & \\
 & & & & x & & & \\
 & & & & & x & & \\
 & & & & & & x & \\
 & & & & & & & x
\end{bmatrix}$$

and

$$L_B^T = \begin{bmatrix}
x & & & & & & & \\
 & x & & & & & & \\
 & & x & & & & & \\
 & & & x & & & & \\
 & & & & x & & & x \\
 & & & & & x & & x \\
 & & & & & & x & x \\
 & & & & & & & x
\end{bmatrix}.$$

The observation is that the $L_A^T$ factor has new (fill-in) terms noted by y. These terms increase the storage requirement of the factor (reason 1) and of course the computational time (reason 2).

Mathematically the reordering is represented by a permutation matrix, containing only zeroes and ones in specific locations. For our specific example
$$B = PAP^T,$$
where

$$P = \begin{bmatrix}
 & & & & 1 & & & \\
 & & & & & 1 & & \\
 & & & & & & 1 & \\
 & & & & & & & 1 \\
 & 1 & & & & & & \\
 & & 1 & & & & & \\
 & & & 1 & & & & \\
1 & & & & & & &
\end{bmatrix}.$$

To calculate the P permutation matrix, it is best to consider a graph representation of the matrix A. We introduce the following equivalency criteria:

1. Diagonal terms of the matrix correspond to vertices of the graph.
2. Off-diagonal terms of the matrix correspond to edges in the graph.

With these, every symmetric matrix has an equivalent undirected graph. Traversing the graph in a suitable manner yields a proper permutation matrix.


One practical traversal method is called the minimum degree algorithm [6]. The algorithm systematically removes nodes from the graph until the graph is eliminated. The selection of the next node is based on the node's connectivity (the number of other nodes to which it is connected). The node with the minimum connectivity is removed, hence the name. Based on the matrix-graph analogy, this means factoring first the row of the matrix with the least number of off-diagonal terms.

The process maintains adjacency of nodes by introducing temporary edges. The temporary edges correspond to the fill-in terms of the factorization process. The minimum degree elimination of the graph is the symbolic factorization of the corresponding matrix. Capturing the order of the elimination produces the P permutation matrix.

There are many advanced variations of the minimum degree method. They are reviewed in [3]. These methods are the predecessors of the domain decomposition methods mentioned in later chapters.

Note again that the above deals with symmetric matrices. For unsymmetric matrices a factorization of
$$A = LU$$
is applied. The reordering in this case is represented by two distinct (row and column) permutation matrices as
$$PAQ = B.$$
Otherwise most of the discussion of symmetric matrices applies to unsymmetric matrices as well.

7.2 Sparse matrix factorization

The finite element method gives rise to very sparse matrices. This was especially obvious in our simple example in Chapter 3. Depending on the model and the discretization technique, matrices with less than a tenth of a percent density are commonplace. It is a natural desire to exploit this level of sparsity in the matrices.

First and foremost, if the matrices were stored as two-dimensional arrays, for today's multi-million degree of freedom problems the memory requirement would be in the terabytes. Secondly, the amount of numerical work executed on zero operations would be the dominant part of the factorization. Most of the computational techniques described in the remainder of the book can be implemented by taking the sparsity into consideration. We review the technique here briefly.

The factorization of finite element matrices is usually executed in two phases. The first is the symbolic phase, called such because the factorization process is executed by considering only the locations of the terms. The second, numeric phase executes the actual factorization on the terms.

The symbolic phase separates the topology structure of the matrix from its numerical content. In this form the matrix is stored in three linear arrays. One contains the row and another the column indices of every term, while a third one contains the numeric values. The matrix is then described as
$$A = [IA(k), JA(k), NA(k);\ k = 1, \ldots, NZ],$$
where NZ is the number of nonzero terms in the matrix. Here IA and JA are integer arrays and NA is real. Specifically, a term A(i, j) of the matrix is nonzero if there is an index k for which IA(k) = i and JA(k) = j. Then A(i, j) = NA(k).
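A minimal sketch of this coordinate-style storage (the class and method names are ours for illustration; production codes use compressed sparse formats such as those in scipy.sparse):

```python
import numpy as np

class CoordinateMatrix:
    """Sparse matrix stored as parallel index/value arrays IA, JA, NA."""
    def __init__(self, ia, ja, na, n):
        self.ia = np.asarray(ia)      # row index of each nonzero term
        self.ja = np.asarray(ja)      # column index of each nonzero term
        self.na = np.asarray(na)      # numeric value of each nonzero term
        self.n = n                    # matrix order

    def entry(self, i, j):
        """Return A(i, j): NA(k) if some k has IA(k)=i, JA(k)=j, else zero."""
        mask = (self.ia == i) & (self.ja == j)
        return self.na[mask][0] if mask.any() else 0.0
```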

The symbolic factorization also computes a P permutation matrix mentioned in the last section, which may be modified during the numerical factorization process. With this, the numeric factorization may be executed very efficiently. The numeric factorization operation shown in Section 7.1 may be rewritten in this indexed form. The operations in the inner i, j loop of the algorithm are then executed only for nonzero terms.

Note that the sparsity pattern of the resulting factor matrix is different from that of the matrix factored, due to the aforementioned fill-in terms. In fact, the number of nonzero terms of the factor matrix is sometimes orders of magnitude higher. Factor matrices are still very far from being dense, however, and they are also stored in sparse, indexed form.

While sparse matrix factorization is clearly an advantageous method when considering the outer fringes of the sparse finite element matrices, it could become a burden close to the diagonal. Specifically, there are usually fairly dense blocks in finite element matrices as well as completely empty blocks. The following method takes advantage of the best of the sparse and dense approaches.


7.3 Multi-frontal factorization

The premier, industry standard factorization techniques in finite element analysis are in the class of multi-frontal techniques [2]. The simplified idea of such methods is to execute the sparse factorization in terms of dense sub-matrices, the "frontal matrices". The frontal method is really an implementation of the Gaussian elimination by eliminating several rows at the same time.

The multi-frontal method is an extension of this by recognizing that many separate fronts can be eliminated simultaneously by following the elimination pattern of some reordering, like the minimum degree method. Hence, the method takes advantage of the sparsity preserving nature of the reordering and the efficiency provided by the dense block computations.

Let us review the method in terms of our A matrix as follows. Permute and partition A as
$$P_1 A P_1^T = \begin{bmatrix} E_1 & C_1^T \\ C_1 & B_1 \end{bmatrix}.$$
$P_1$ is the permutation matrix computed by the reordering step, but possibly slightly modified to assure that the inverse of the $s_1 \times s_1$ sub-matrix $E_1$ exists. The first block elimination or factorization step is then

$$P_1 A P_1^T = \begin{bmatrix} I_1 & 0 \\ C_1 E_1^{-1} & I_{n-1} \end{bmatrix}
\begin{bmatrix} E_1 & 0 \\ 0 & B_1 - C_1 E_1^{-1} C_1^T \end{bmatrix}
\begin{bmatrix} I_1 & E_1^{-1} C_1^T \\ 0 & I_{n-1} \end{bmatrix}.$$
Here the subscript of $I_1$ indicates the solution step rather than the size of the identity matrix; the size is $s_1$, as indicated by the size of $E_1$. The process is repeated by taking
$$A_2 = B_1 - C_1 E_1^{-1} C_1^T.$$

We attack the next "front" by permuting and partitioning as
$$P_2 A_2 P_2^T = \begin{bmatrix} E_2 & C_2^T \\ C_2 & B_2 \end{bmatrix},$$

and factorizing as above. $P_2$ is a partition of the P matrix computed by the reordering step and possibly modified to ensure that $E_2^{-1}$ exists. The final factors of
$$\overline{P} A \overline{P}^T = LDL^T$$

are built as
$$L = \begin{bmatrix}
I_1 & 0 & 0 & \cdot & 0 \\
C_1E_1^{-1} & I_2 & 0 & \cdot & 0 \\
 & C_2E_2^{-1} & I_3 & \cdot & 0 \\
 & & \cdot & \cdot & \\
 & & & & I_k
\end{bmatrix},$$

and
$$D = \begin{bmatrix}
E_1 & 0 & 0 & 0 & 0 \\
0 & E_2 & 0 & 0 & 0 \\
0 & 0 & E_3 & 0 & 0 \\
\cdot & \cdot & \cdot & \cdot & \cdot \\
0 & 0 & 0 & 0 & B_k - C_kE_k^{-1}C_k^T
\end{bmatrix}.$$

D is built from variable size $s_i \times s_i$ diagonal blocks. Note that the $C_iE_i^{-1}$ sub-matrices are rectangular, extending to the bottom of the factor. The process stops when $s_k$ is desirably small. $\overline{P}$ is an aggregate of the intermediate $P_i$ permutation matrices. If there were no numerical reasons to modify the original order, then $\overline{P} = P$.

It is evident from the form above that the main computational component of the factorization process is the execution of the
$$B_i - C_i E_i^{-1} C_i^T$$
step. Since the $B_i$ matrix is an ever shrinking partition of the updated A matrix, this step is called a matrix update. The size of the $E_i$ matrix is the rank of the update. Linear algebra libraries, such as LAPACK [1], have readily available high performance routines to execute such operations. The $C_i$ sub-matrices of course are still sparse; therefore, these operations are executed with indexed operations as shown earlier.

A few words on numerical considerations: in the above we assumed that the inverse of each $E_i$ sub-matrix exists. This is assured by the proper choice of $P_i$. In fact, the quality of the inverse determines the numerical stability of the factorization. If the sparsity pattern based original permutation matrix P does not produce an invertible $E_i$, it is modified (pivoted) to do so.

Finally, this procedure again generalizes to unsymmetric matrices as well.


7.4 Linear system solution

The solution of linear systems is of paramount importance for finite element computations. The direct solution algorithms follow the factorization of the matrices. The reordering executed in the factorization must be observed during the solution phase. Consider the linear system

AX = B.

Since for the permutation matrices
$$P^T P = I$$
holds, the system may be solved in terms of the factors of the reordered matrix
$$PAP^T = LDL^T$$
as
$$LDL^T PX = PB.$$
There are two solution steps, the forward and the backward substitution. The forward substitution solves for the intermediate result of
$$LY = PB.$$
Here the column permutation of the factorization is executed on the right-hand side B matrix. The backward substitution computes
$$L^T(PX) = D^{-1}Y.$$
The row permutation of the factorization is now observed on the result matrix shown by (PX). Here $D^{-1}$ of course exists, as D is comprised of the $E_i$ matrices of the factorization.

For the sake of completeness, these steps for the unsymmetric case are

LUQX = PB,

with forward substitution of

LY = PB,

and backward substitution of

U(QX) = Y.


Naturally, the sparse block structure of the participating matrices is exploited in these computations also.
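As an illustration of the two substitution steps, a dense NumPy/SciPy sketch using factors of the form produced by the ldlt routine shown earlier (perm is a permutation vector representing P; this is only a sketch and ignores sparsity):

```python
import numpy as np
from scipy.linalg import solve_triangular

def ldlt_solve(L, D, perm, b):
    """Solve A x = b given P A P^T = L D L^T, with P encoded by perm."""
    pb = b[perm]                                                    # P b
    y = solve_triangular(L, pb, lower=True, unit_diagonal=True)    # L y = P b
    z = y / np.diag(D)                                              # D^{-1} y
    px = solve_triangular(L.T, z, lower=False, unit_diagonal=True)  # L^T (P x) = D^{-1} y
    x = np.empty_like(px)
    x[perm] = px                                                    # undo the permutation
    return x
```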

7.5 Distributed factorization and solution

In industrial practice the finite element matrices are exceedingly large and are partitioned with tools like [3] for an efficient solution on multiprocessor computers or networks of workstations. The subject of this section is to discuss the computational process enabling such an operation.

Assume that the matrix is partitioned into sub-matrices as follows:

$$A = \begin{bmatrix}
A_{oo}^1 & & & & & A_{ot}^1 \\
 & A_{oo}^2 & & & & A_{ot}^2 \\
 & & \cdot & & & \cdot \\
 & & & A_{oo}^j & & A_{ot}^j \\
 & & & & \cdot & \cdot \\
A_{to}^1 & A_{to}^2 & \cdot & A_{to}^j & \cdot & A_{tt}
\end{bmatrix},$$
where superscript j refers to the j-th partition. The subscript o refers to the interior, while subscript t refers to the common boundary of the partitions, and s is the number of partitions, so $j = 1, 2, \ldots, s$. The size of the global matrix is N.

As the solution of a linear system is our goal, let the solution vector and the right-hand side vector be partitioned accordingly:
$$x = \begin{bmatrix} x_o^1 \\ x_o^2 \\ \cdot \\ x_o^j \\ \cdot \\ x_t \end{bmatrix},
\quad
b = \begin{bmatrix} b_o^1 \\ b_o^2 \\ \cdot \\ b_o^j \\ \cdot \\ b_t \end{bmatrix}.$$

For simplicity of the discussion, we consider only a single vector right-hand side and solution. The j-th processor contains only the j-th partition of the matrix:
$$A^j = \begin{bmatrix} A_{oo}^j & A_{ot}^j \\ A_{to}^j & A_{tt}^j \end{bmatrix},$$
where $A_{tt}^j$ is the complete boundary of the j-th partition, which may be shared by several other partitions. Note also that it is a subset of the global boundary $A_{tt}$. Similarly, the local solution vector component is partitioned as

$$x^j = \begin{bmatrix} x_o^j \\ x_t^j \end{bmatrix},$$
where $x_t^j$ is a partition of $x_t$ as
$$x_t = \begin{bmatrix} x_t^1 \\ \cdot \\ x_t^j \\ \cdot \\ x_t^s \end{bmatrix}.$$

It is desired that
$$A = LDL^T$$
be computed in partitions. Consider the factor matrices partitioned similarly:
$$L = \begin{bmatrix}
L_{oo}^1 & & & & & \\
 & L_{oo}^2 & & & & \\
 & & \cdot & & & \\
 & & & L_{oo}^j & & \\
 & & & & \cdot & \\
L_{to}^1 & L_{to}^2 & \cdot & L_{to}^j & \cdot & L_{tt}
\end{bmatrix},$$
and
$$D = \begin{bmatrix}
D_{oo}^1 & & & & & \\
 & D_{oo}^2 & & & & \\
 & & \cdot & & & \\
 & & & D_{oo}^j & & \\
 & & & & \cdot & \\
 & & & & & D_{tt}
\end{bmatrix}.$$

Multiplication of the partitioned factors yields
$$A = \begin{bmatrix}
L_{oo}^1 D_{oo}^1 L_{oo}^{1,T} & & & & & L_{oo}^1 D_{oo}^1 L_{to}^{1,T} \\
 & L_{oo}^2 D_{oo}^2 L_{oo}^{2,T} & & & & L_{oo}^2 D_{oo}^2 L_{to}^{2,T} \\
 & & \cdot & & & \cdot \\
 & & & L_{oo}^j D_{oo}^j L_{oo}^{j,T} & & L_{oo}^j D_{oo}^j L_{to}^{j,T} \\
 & & & & \cdot & \cdot \\
L_{to}^1 D_{oo}^1 L_{oo}^{1,T} & L_{to}^2 D_{oo}^2 L_{oo}^{2,T} & \cdot & L_{to}^j D_{oo}^j L_{oo}^{j,T} & \cdot & L_{tt} D_{tt} L_{tt}^T + \Sigma_{j=1}^{s} L_{to}^j D_{oo}^j L_{to}^{j,T}
\end{bmatrix}.$$

The terms of the partitioned factors are obtained in several steps and some of them ($L_{tt}$, $D_{tt}$) are not explicitly computed. First, the factorization of the interior of the j-th partition is executed:
$$A^j = \begin{bmatrix} A_{oo}^j & A_{ot}^j \\ A_{to}^j & A_{tt}^j \end{bmatrix}
= \begin{bmatrix} L_{oo}^j & 0 \\ L_{to}^j & I \end{bmatrix}
\begin{bmatrix} D_{oo}^j & 0 \\ 0 & \overline{A}_{tt}^j \end{bmatrix}
\begin{bmatrix} L_{oo}^{j,T} & L_{to}^{j,T} \\ 0 & I \end{bmatrix},$$
where the identity matrices are not computed; they are only presented to make the matrix equation algebraically correct. This step produces the $L_{oo}^j$, $L_{to}^j$ and $D_{oo}^j$ local factor components. The $\overline{A}_{tt}^j$ sub-matrix is the local Schur complement of the j-th partition:
$$\overline{A}_{tt}^j = A_{tt}^j - L_{to}^j D_{oo}^j L_{to}^{j,T},$$
with $A_{tt} = \Sigma_{j=1}^{s} A_{tt}^j$. Next the individual Schur complement matrices are summed up as
$$\overline{A}_{tt} = \Sigma_{j=1}^{s} \overline{A}_{tt}^j$$
to create the global Schur complement. Finally, the global Schur complement is factored as
$$\overline{A}_{tt} = \overline{L}_{tt}\overline{D}_{tt}\overline{L}_{tt}^T.$$

Note the bar over the factor terms, as these are the factors of the global Schur complement, not of the original boundary partition.

The partitioned solution of
$$LDL^T x = b$$

will also take multiple steps. The forward substitution in the partitioned form is
$$\begin{bmatrix} L_{oo}^j D_{oo}^j & 0 \\ L_{to}^j D_{oo}^j & I \end{bmatrix}
\begin{bmatrix} y_o^j \\ y_t^j \end{bmatrix} =
\begin{bmatrix} b_o^j \\ b_t^j \end{bmatrix}.$$

Here the I sub-matrix is included only to assure compatibility, and
$$b_t = \begin{bmatrix} b_t^1 \\ \cdot \\ b_t^j \\ \cdot \\ b_t^s \end{bmatrix}.$$

The forward substitution on the first block may be executed for the interior of all partitions first:
$$y_o^j = [L_{oo}^j D_{oo}^j]^{-1} b_o^j.$$

The second block of the forward solve for each partition yields
$$y_t^j = b_t^j - L_{to}^j D_{oo}^j y_o^j,$$
which then has to be summed up for all partitions as
$$y_t = \Sigma_{j=1}^{s} y_t^j.$$

The global boundary solution is a complete forward-backward substitution of
$$\overline{L}_{tt}\overline{D}_{tt}\overline{L}_{tt}^T x_t = y_t.$$

The partitions' interior solutions are finalized from the backward step of
$$\begin{bmatrix} L_{oo}^{j,T} & L_{to}^{j,T} \\ 0 & I \end{bmatrix}
\begin{bmatrix} x_o^j \\ x_t^j \end{bmatrix} =
\begin{bmatrix} y_o^j \\ x_t^j \end{bmatrix}.$$
The first equation yields
$$x_o^j = L_{oo}^{j,-T}\big(y_o^j - L_{to}^{j,T} x_t^j\big),$$
where $x_o^j$ is the final result in the interior of the j-th partition.

Note that the permutation matrix was ignored in the above presentation for simplicity; however, the process works with it also. Computer implementation aspects of the technology are discussed in [8].
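A dense, single-process NumPy sketch of the partitioned solve for s partitions, using direct solves in place of the partial factors and assuming, for brevity, that every partition couples to the full boundary set (all names are ours for illustration):

```python
import numpy as np

def partitioned_solve(parts, bt):
    """parts: list of dicts with local blocks Aoo, Aot, Att and bo.
    bt: assembled boundary right-hand side. Returns ([x_o^j], x_t)."""
    schur = np.zeros((bt.size, bt.size))
    yt = bt.copy()
    for p in parts:
        Aoo, Aot, Att, bo = p["Aoo"], p["Aot"], p["Att"], p["bo"]
        p["Aoo_inv_Aot"] = np.linalg.solve(Aoo, Aot)       # interior solves
        p["yo"] = np.linalg.solve(Aoo, bo)
        schur += Att - Aot.T @ p["Aoo_inv_Aot"]            # local Schur contribution
        yt -= Aot.T @ p["yo"]                              # boundary RHS update
    xt = np.linalg.solve(schur, yt)                        # global boundary solution
    xo = [p["yo"] - p["Aoo_inv_Aot"] @ xt for p in parts]  # interior back-substitution
    return xo, xt
```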

7.6 Factorization and solution case studies

The cost of the multi-frontal factorization is $O(n f_{avg}^2)$, where n is the matrix size and $f_{avg}$ is the average front size. The cost of the solution is $O(n f_{avg} m)$, where the number of vectors (the number of columns of the B matrix) is denoted by m. The range of m depends on the application, but could easily be in the thousands. Nevertheless, the cost of the factorization is the dominant cost of linear system solutions, hence the reason for the first case study example.

The size of matrices in practical problems, even when talking about components, is significant. This is largely due to the fact that automated mesh generators tend to "over-mesh" physical models. To retain fine local details in one part of the model, other parts which do not need detailed results are nonetheless meshed with small elements. This is done because it takes less effort to create a mesh with a single overall element size than a mesh with a gradation of element sizes. Additionally, models with fine geometric detail - for example, fillets around holes and chamfers around edges - force automatic mesh generators to put small elements in these regions even though these details may be structurally insignificant.

For studying the factorization issue, we consider a component arising in the automobile industry, the crankshaft casing shown in Figure 7.1.

FIGURE 7.1 Crankshaft casing finite element model

Most such models are automatically meshed with solid elements, and in such models the rotational degrees of freedom are all eliminated by the procedure explained in detail in Chapter 5. Therefore, the number of free degrees of freedom (DOF) is roughly half of the global degrees of freedom in the model minus the boundary conditions. The statistics of the model with a four-way partitioning are shown in Table 7.1.

TABLE 7.1
Size statistics of the casing component model

Model       Nodes     Elements   g-size      f-size
complete    213,470   131,416    1,280,009   638,058

Partition   Nodes     Elements   t-size      f-size
1           55,312    33,937     1,161       164,745
2           50,608    31,424     2,580       154,404
3           53,301    34,351     10,068      169,771
4           50,178    31,704     8,298       158,832

There are some observations to be made. It is interesting to note that the sizes of the partitions are very close, within 10 percent of each other. More importantly, the partition boundary and front sizes tend to follow the partition sizes: the largest partition (3) has the largest boundary and front size, the latter shown in Table 7.2.

TABLE 7.2
Computational statistics of the casing component model

Model       Front size   Factor size   Factorization   Boundary
            (max)        (Kterms)      time (sec)      time (sec)
complete    4,335        372,132       573.1           -

Partition
1           3,729        80,087        112.2           23.8
2           3,447        80,739        120.5           37.8
3           3,816        86,907        135.0           70.5
4           3,585        75,452        108.5           55.2

Table 7.2 also demonstrates the computational complexity of the factorization of this model; the times reported are CPU times. The statistics are from the factorization on a workstation using 4 processor nodes with 1.5 GHz clock speed. As the machine was dedicated, the CPU times are indicative of elapsed performance.

The factor size and factorization time also largely follow the front size order of the partitions. Ultimately, the number of terms in the factor matrix has a direct correlation with the factorization time.

It is also noticeable that the boundary time (Schur complement computation) is significant: it ranges from 20 to almost 70 percent of the partition factorization time. This is related to the t-sizes reported in Table 7.1. The total t-size of the problem was 12,213, as the individual t-sizes largely overlap. Nevertheless, the longest total execution time (205.5 sec for partition 3) is almost one third of the unpartitioned solution time. Computer implementation aspects of the technology are discussed in [8].

We demonstrate the computational complexity of linear system solutions with another case study example of a model from the aerospace industry. The model consisted of approximately 34 million node points and elements. Table 7.3 contains statistics of the matrices in a linear statics analysis.

TABLE 7.3
Linear static analysis matrix statistics

K matrix        number of rows   nonzero terms       max terms
                204 million      3.8 billion         54

F matrix        number of rows   number of columns   max terms
                204 million      1                   999

Factor matrix   number of rows   nonzero terms       max front
                35.7 million     51.1 billion        6,995

The direct linear static analysis was executed on a workstation with 8 (1.9 GHz) CPUs. The total linear static solution required 338 minutes of elapsed time. Of this, the factorization alone required approximately 100 minutes and the direct solve about 30 minutes. The amount of I/O processed was 3.04 Terabytes with a 758 Gigabytes disk footprint.

In practical circumstances many different load scenarios are used, resulting in multiple columns of the F matrix. These systems nowadays are also modeled by CAD systems using various solid modeling techniques. This fact leads to somewhat denser matrices, in contrast to the sparse matrices of shell models of a car body or an airplane fuselage.

Direct solution of such systems requires enormous memory and disk resources due to the size of the factor matrices. The advantages of an iterative solution are quite clear and often exploited in industrial practice.


7.7 Iterative solution of linear systems

It is very clear that the cost of the factorization operation is significant, especially so when the model results in dense matrices. There are some computational solutions where the factorization cannot be avoided; however, simple linear system solutions may be executed more efficiently by iterations.

Let us consider the linear statics problem of

Ku = F,

where the partition designation is omitted for simplicity of the presentation. It is safe to assume that we address the f-partition problem.

The simplest iterative solution of this system may be found by splitting the stiffness matrix into two additive components as
$$K = K_1 - K_2.$$
Then the system may be presented in terms of these components as
$$K_1 u - K_2 u = F,$$
providing a very simple iteration scheme of
$$K_1 u_i = K_2 u_{i-1} + F,$$
where $u_i$ is the i-th iterative solution. When the inverse of the $K_1$ component exists (and any reasonable splitting aims for that), then
$$u_i = K_1^{-1} K_2 u_{i-1} + K_1^{-1} F.$$
The process may be started with $u_0 = 0$ and it is known to converge as long as
$$\|K_1^{-1} K_2\| < 1.$$

The above is a sufficient condition for the convergence of an iterative solution of a system based on a particular splitting.

Naturally the inverse should also be easy to compute, and the Jacobi method [5], well known to engineers, is based on simply splitting off the diagonal of the stiffness matrix:
$$K_1 = \mathrm{diag}(K(j, j)),\quad j = 1, 2, \ldots, n,$$
a reasonable strategy as long as the stiffness matrix does not contain very small diagonal terms. The iteration scheme presented above simplifies to a termwise formula for every $j = 1, 2, \ldots, n$:
$$u_i(j) = \frac{1}{K(j, j)}\Big(F(j) - \sum_{k \neq j} K(j, k)\, u_{i-1}(k)\Big).$$

The above convergence condition adjusted for the Jacobi method is
$$\|K_1^{-1} K_2\| = \max_{1 \leq j \leq n} \sum_{k \neq j} \left|\frac{K(j, k)}{K(j, j)}\right| < 1,$$
from which it follows that
$$\sum_{k \neq j} |K(j, k)| < |K(j, j)|,\quad j = 1, \ldots, n$$
is required. This translates into the requirement of diagonal dominance of the K matrix, not at all at odds with finite element stiffness matrices.

The most successful iterative technique in engineering practice today is the conjugate gradient method [4], minimizing the functional
$$G(u) = \frac{1}{2}u^TKu - u^TF,$$
whose first derivative (the gradient) is the residual of the linear system
$$\frac{dG}{du} = Ku - F = -r.$$
The method is a series of approximate solutions of the form
$$u_i = u_{i-1} + \alpha_i p_{i-1},$$
and the consecutive residuals are
$$r_i = r_{i-1} - \alpha_i K p_{i-1}.$$
The method's mathematical foundation is rooted in the Ritz-Galerkin principle, which proposes to select such iterative solution vectors $u_i$ for which the residual is orthogonal (hence the conjugate in the name) to a Krylov subspace generated by K and the initial residual $r_0$.

From the recursive application of the principle emerge the distance coefficients
$$\alpha_i = \frac{r_{i-1}^T r_{i-1}}{p_{i-1}^T K p_{i-1}}$$
and the search directions
$$p_i = r_i + \beta_i p_{i-1}.$$
The relative improvement of the solution is measured by
$$\beta_i = \frac{r_i^T r_i}{r_{i-1}^T r_{i-1}}.$$
The process is initialized as
$$u_0 = 0,\quad r_0 = F,\quad p_0 = r_0.$$

The conjugate gradient algorithm uses the above formulae in a specific order:

For i = 1, 2, 3, . . . until convergence compute:
    Distance: $\alpha_i$
    Approximate solution: $u_i$
    Current residual: $r_i$
    Relative improvement: $\beta_i$
    New search direction: $p_i$
    If $\|r_i\| \leq \epsilon$, stop
End loop i.

In the above, $\epsilon$ is a certain threshold. The method generalizes nicely to the unsymmetric case and is known as the biconjugate gradient method. Its underlying principle is still the orthogonalization of the successive residuals to the continuously generated Krylov subspace. Due to the unsymmetric nature of the matrix, there are left-handed and right-handed residual sequences
$$r_i = r_{i-1} - \alpha_i K p_{i-1},$$
$$s_i = s_{i-1} - \alpha_i K^T q_{i-1},$$
corresponding to left and right search directions
$$p_i = r_i + \beta_i p_{i-1},$$
$$q_i = s_i + \beta_i q_{i-1}.$$

The coefficients combine the two sides as
$$\alpha_i = \frac{s_{i-1}^T r_{i-1}}{q_{i-1}^T K p_{i-1}}$$
and
$$\beta_i = \frac{s_i^T r_i}{s_{i-1}^T r_{i-1}}.$$

Ultimately, however, the next approximate solution is of the same form
$$u_i = u_{i-1} + \alpha_i p_{i-1},$$
but the process may only be stopped when both residual norms, $\|r_i\|$ and $\|s_i\|$, are less than a certain threshold.

7.8 Preconditioned iterative solution technique

There are also variations of iterative solutions in which the matrix is preconditioned to accelerate the convergence. The essence of preconditioning is pre-multiplying the problem with a suitable preconditioning matrix as
$$P^{-1}Ku = P^{-1}F.$$
The conjugate gradient algorithm allows the implicit application of the preconditioner during the iterative process. The preconditioner is commonly presented as an inverse matrix, because of the goal of approximating the inverse of the system matrix. Clearly, with the selection of
$$P = K$$
the solution is trivial, since
$$P^{-1}Ku = K^{-1}Ku = Iu = K^{-1}F.$$
The cost of computing the preconditioner in this case is of course the cost of the factorization. The iterative solution itself is a forward-backward substitution.

More practical approaches use various incomplete factorizations of the matrix,
$$K \approx CC^T = P.$$
In an incomplete factorization process only those terms of the factor matrix are computed that correspond to nonzero terms of the original matrix. The fill-in terms are omitted. There are several variations of this approach, such as also computing fill-in terms, but only above a certain threshold value. Another extension is to compute the fill-in terms inside a certain band of the matrix.

Another commonly used approach is to exploit the fact that the matrix is assembled from finite element matrices. In that case the preconditioners are various factorizations of the element matrices,
$$K_e \approx C_eC_e^T = P_e,$$
assembled into a global preconditioner as
$$P = \Sigma_{e=1}^{n} P_e.$$
Such approaches are described in [9] and [10]. The preconditioning approach is most successful, however, when the preconditioner captures and exploits information specific to the physics being modeled.

There are cases when the stiffness matrix may consist of two components, one that is easy to solve and one that is difficult. An example is static aero-elastic analysis, where there is a structural stiffness matrix $K_s$ and a matrix $K_a$ representing the aero-elastic effects. The static aero-elastic solution is described by an equation of the form
$$(K_s + K_a)u = F.$$
$K_s$ is generally sparse and banded and can be solved directly by itself at a reasonable cost. The $K_a$ matrix is developed by a theory where every point of the structure that is touched by air is coupled to every other point touching air. This leads to a dense, largely un-banded, and unsymmetric matrix.

In general, an unsymmetric problem is at least twice as expensive to solve as a symmetric problem, and the other unfortunate characteristics of $K_a$ result in an order of magnitude increase in solution cost. Therefore, preconditioning the static aero-elastic equation with
$$P = K_s$$

and reordering to place the aero effects on the right-hand side results in an iterative solution scheme:
$$u_i = K_s^{-1}(F - K_a u_{i-1}).$$
The inverse operation is done with a factorization followed by a forward-backward substitution with the vector of unknowns from the right-hand side. The structural stiffness matrix acts as a pre-conditioner. $u_0$ is set to zero, resulting in $u_1$ being the static solution due to the structural effects only. Only one matrix factorization is required for all the iterations. The troublesome $K_a$ is never factorized.

The physics of the problem dictates that the terms in $K_s$ are much larger than in $K_a$ because the structure is stiffer than the air it intersects. This leads to rapid convergence of this solution in connection with, for example, the bi-conjugate gradient method. The cost of the coupled aero-elastic solution by this approach is generally no more than twice the cost of a structure-only solution.
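A minimal dense sketch of this splitting iteration (in practice $K_s$ would be factored once and the factors reused for the repeated solves; all names here are illustrative):

```python
import numpy as np

def aeroelastic_iteration(Ks, Ka, F, tol=1e-10, max_iter=100):
    """u_i = K_s^{-1} (F - K_a u_{i-1}); the structural stiffness acts as
    the preconditioner and K_a is never factored."""
    u = np.zeros_like(F)
    for _ in range(max_iter):
        u_new = np.linalg.solve(Ks, F - Ka @ u)
        if np.linalg.norm(u_new - u) <= tol * max(np.linalg.norm(u_new), 1.0):
            return u_new
        u = u_new
    return u
```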

There are several other classical splitting methods, such as the Gauss-Seidel or the successive over-relaxation methods [13]. The concept in itself is valuable for engineers when they combine different physical phenomena and they know the coupling and partitioning a priori.

There are also other minimization based methods, such as the generalized minimum residual method [12]. There is also another class of solutions based on an adaptive idea [11]. Both of these are subjects of strong academic interest but have not yet proven as generally useful in the industry as the conjugate gradient method.

References

[1] Anderson, E. et al.; LAPACK user's guide, 2nd ed., SIAM, Philadelphia, 1995

[2] Duff, I. S. and Reid, J. K.; The multi-frontal solution of indefinite sparse symmetric linear systems, ACM Trans. Math. Softw., Vol. 9, pp. 302-325, 1983

[3] George, A. and Liu, J. W. H.; The evolution of the minimum degree ordering algorithm, SIAM Review, Vol. 31, pp. 1-19, 1989


[4] Hestenes, M. R. and Stiefel, E.; Methods of conjugate gradients for solving linear systems, Journal Res. National Bureau of Standards, Vol. 49, pp. 409-436, 1952

[5] Jacobi, C. G. J.; Über ein leichtes Verfahren die in der Theorie der Säcularstörungen vorkommenden Gleichungen numerisch aufzulösen, J. Reine Angewandte Math., Vol. 30, pp. 51-94, 1846

[6] Liu, J. W. H.; The minimum degree ordering with constraints, SIAM J. of Scientific and Statistical Computing, Vol. 10, pp. 1136-1145, 1988

[7] Karypis, G. and Kumar, V.; ParMETIS: Parallel graph partitioning and sparse matrix library, University of Minnesota, 1998

[8] Mayer, S.; Distributed parallel solution of very large systems of linear equations in the finite element method, PhD Thesis, Technical University of Munich, 1998

[9] Poschmann, P. and Komzsik, L.; Iterative solution technique for finite element applications, Journal of Finite Element Analysis and Design, Vol. 14, No. 4, pp. 373-381, 1993

[10] Komzsik, L., Sharapov, I. and Poschmann, P.; A preconditioning technique for indefinite linear systems, Journal of Finite Element Analysis and Design, Vol. 26, No. 3, pp. 253-258, 1997

[11] Rüde, U.; Fully adaptive multigrid methods, SIAM Journal of Numerical Analysis, Vol. 30, pp. 230-248, 1993

[12] Saad, Y. and Schultz, M. H.; GMRES: a generalized minimum residual algorithm for solving nonsymmetric linear systems, SIAM Journal of Scientific and Statistical Computing, Vol. 7, pp. 856-869, 1986

[13] Varga, R. S.; Matrix Iterative Analysis, Prentice-Hall, Englewood Cliffs, New Jersey, 1962


8

Static Condensation

The $K_{aa}$ matrix is free of constraints and ready for analysis. However, for computational advantages it may be further reduced. Let us partition the remaining degrees of freedom into two groups. One such physical partitioning may be based on considering the boundary degrees of freedom as one, and the interior degrees of freedom as the other partition. This is a single-level, single-component static condensation, the topic of Section 8.1. One of the first publications related to this topic is [2].

It is also possible to first partition the model into components and apply the boundary reduction for each. This is the single-level, multiple-component condensation of Section 8.3. Finally, the two techniques may be recursively applied, yielding the multiple-level, multiple-component method of Section 8.4.

All of these methods allow the possibility of parallel processing. Due to the cost of reduction and back-transformation, however, the computational complexity of solving a certain problem may not change. The static condensation methods produce computationally exact results when applied to the linear static problem. The following sections detail the static condensation technique.

8.1 Single-level, single-component condensation

Let us consider the linear statics problem of
$$K_{aa}u_a = F_a$$
and the simple finite element model shown in Figure 8.1, where we marked the boundary partition with t and the interior with o.

The matrix partitioning corresponding to the model partitioning shown in Figure 8.1 is simply:


FIGURE 8.1 Single-level, single-component partitioning

$$K_{aa} = \begin{bmatrix} K_{oo} & K_{ot} \\ K_{to} & K_{tt} \end{bmatrix}.$$

The static problem is partitioned accordingly:
$$K_{aa}u_a = \begin{bmatrix} K_{oo} & K_{ot} \\ K_{to} & K_{tt} \end{bmatrix}
\begin{bmatrix} u_o \\ u_t \end{bmatrix} =
\begin{bmatrix} F_o \\ F_t \end{bmatrix}.$$

Let us introduce a transformation matrix
$$T = \begin{bmatrix} I_{oo} & G_{ot} \\ 0 & I_{tt} \end{bmatrix},$$
where
$$G_{ot} = -K_{oo}^{-1}K_{ot}$$
is the static condensation matrix. Substituting
$$T\overline{u}_a = u_a$$
and pre-multiplying by $T^T$ yields
$$T^T K_{aa}T\overline{u}_a = T^T F_a,$$

or
$$\overline{K}_{aa}\overline{u}_a = \overline{F}_a.$$
The latter equation in detail reads
$$\begin{bmatrix} K_{oo} & 0_{ot} \\ 0_{to} & \overline{K}_{tt} \end{bmatrix}
\begin{bmatrix} \overline{u}_o \\ u_t \end{bmatrix} =
\begin{bmatrix} F_o \\ \overline{F}_t \end{bmatrix}.$$
Here
$$\overline{u}_o = u_o - G_{ot}u_t,$$
and
$$\overline{K}_{tt} = K_{tt} + K_{to}G_{ot}$$
is the Schur complement. Finally, the modified load is
$$\overline{F}_t = F_t + G_{ot}^T F_o.$$
This results in the reduced (statically condensed) problem of
$$\overline{K}_{tt}u_t = \overline{F}_t,$$
from which the reduced, boundary solution is computed. To obtain the interior solution, one solves
$$K_{oo}\overline{u}_o = F_o,$$
followed by the back-transformation
$$u_o = \overline{u}_o + G_{ot}u_t.$$

The order of operations is critical to the efficiency of numerical algorithms, and we often trade formulation for efficiency. A case in point is the method of reducing matrices via the Schur complement. The above matrix formulation is conceptually simple and easier to implement; however, it is not very efficient for large scale analysis.

In practice the $G_{ot}$ transformation is not calculated explicitly. Instead, the $K_{aa}$ matrix is partially factored as shown in the last chapter; only degrees of freedom in the o partition are eliminated:
$$\begin{bmatrix} L_{oo} & 0_{ot} \\ L_{to} & I_{tt} \end{bmatrix}
\begin{bmatrix} D_{oo} & 0_{ot} \\ 0_{to} & \overline{K}_{tt} \end{bmatrix}
\begin{bmatrix} L_{oo}^T & L_{to}^T \\ 0_{to} & I_{tt} \end{bmatrix}
\begin{bmatrix} u_o \\ u_t \end{bmatrix} =
\begin{bmatrix} F_o \\ F_t \end{bmatrix}.$$

The partial factor matrices can also be used to derive the partitioned solution. This indicates that they will produce identical results (within computational errors) and are merely a different order of operations, not a change in the method of solution. Here the Schur complement is of the form
$$\overline{K}_{tt} = K_{tt} - L_{to}D_{oo}L_{to}^T.$$

Developing the second row of the above matrix equation one gets
$$L_{to}D_{oo}L_{oo}^T u_o + \overline{K}_{tt}u_t + L_{to}D_{oo}L_{to}^T u_t = F_t.$$
Expanding the first row yields
$$L_{oo}D_{oo}L_{oo}^T u_o + L_{oo}D_{oo}L_{to}^T u_t = F_o.$$
Executing a forward substitution on the interior we get an intermediate interior solution as
$$L_{oo}^T u_o + L_{to}^T u_t = (L_{oo}D_{oo})^{-1}F_o = \overline{u}_o.$$
Note that this intermediate solution is different from that computed in the matrix form. Similarly, the modified boundary load in terms of the partial factors is different and is computed as
$$\overline{F}_t = F_t - L_{to}D_{oo}\overline{u}_o.$$

Despite these intermediate differences, the final solution is the same. Pre-multiplying this by $L_{to}D_{oo}$ and subtracting it from the developed form of the second row, we get the reduced problem of
$$\overline{K}_{tt}u_t = \overline{F}_t.$$
The reduced matrix factorization of
$$\overline{K}_{tt} = L_{tt}D_{tt}L_{tt}^T$$
produces the reduced solution from
$$L_{tt}D_{tt}L_{tt}^T u_t = \overline{F}_t$$
using forward-backward substitution. The interior solution is finally computed via the earlier calculated partial factors as
$$u_o = L_{oo}^{-T}(\overline{u}_o - L_{to}^T u_t).$$
This process, when using the sparse matrix factorization techniques described in Chapter 7 that recognize the sparsity pattern of finite element matrices, is very efficient for large problems.

8.2 Computational example

To demonstrate the static condensation principle, let us consider the following small numerical example.

$$K_{aa} = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 4 \end{bmatrix}.$$

To execute the static condensation for the single component case, partition the problem into
$$K_{oo} = \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix},\quad K_{ot} = \begin{bmatrix} 0 \\ 1 \end{bmatrix},$$
$$K_{to} = \begin{bmatrix} 0 & 1 \end{bmatrix},\quad K_{tt} = \begin{bmatrix} 4 \end{bmatrix}.$$
The static condensation matrix is calculated as
$$G_{ot} = -\begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}^{-1}\begin{bmatrix} 0 \\ 1 \end{bmatrix}
= -\begin{bmatrix} 3/5 & -1/5 \\ -1/5 & 2/5 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix}
= \begin{bmatrix} 1/5 \\ -2/5 \end{bmatrix}.$$
The static condensation transformation matrix becomes
$$T = \begin{bmatrix} 1 & 0 & 1/5 \\ 0 & 1 & -2/5 \\ 0 & 0 & 1 \end{bmatrix}.$$
The Schur complement result of the static condensation is
$$\overline{K}_{tt} = \begin{bmatrix} 4 \end{bmatrix} + \begin{bmatrix} 0 & 1 \end{bmatrix}\begin{bmatrix} 1/5 \\ -2/5 \end{bmatrix} = \begin{bmatrix} 18/5 \end{bmatrix}.$$
Finally, the statically condensed matrix is
$$\overline{K}_{aa} = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 3 & 0 \\ 0 & 0 & 18/5 \end{bmatrix}.$$

The procedure is rather straightforward, but in practice the transformation and condensation matrices are not built explicitly, as was shown in the last section and will be demonstrated in the following.

As these operations are instrumental in the following chapters, we continue the example with the solution of a system with the same matrix:
$$Ku = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 4 \end{bmatrix}
\begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} =
\begin{bmatrix} 8 \\ 10 \\ 6 \end{bmatrix} = F.$$

The first, matrix based solution scheme is as follows. The modified load vector boundary component with the condensation matrix computed above is
$$\overline{F}_t = \begin{bmatrix} 6 \end{bmatrix} + \begin{bmatrix} 1/5 & -2/5 \end{bmatrix}\begin{bmatrix} 8 \\ 10 \end{bmatrix} = \begin{bmatrix} 18/5 \end{bmatrix}.$$

The boundary solution, using the Schur complement from the condensation part of the example above, is
$$u_t = \overline{K}_{tt}^{-1}\overline{F}_t = \begin{bmatrix} 18/5 \end{bmatrix}^{-1}\begin{bmatrix} 18/5 \end{bmatrix} = \begin{bmatrix} 1 \end{bmatrix}.$$
This is already a component of the final solution and agrees with the analytic value. The intermediate interior solution is
$$\overline{u}_o = K_{oo}^{-1}F_o = \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}^{-1}\begin{bmatrix} 8 \\ 10 \end{bmatrix} = \begin{bmatrix} 14/5 \\ 12/5 \end{bmatrix}.$$
Finally, the actual interior solution is
$$u_o = \overline{u}_o + G_{ot}u_t = \begin{bmatrix} 14/5 \\ 12/5 \end{bmatrix} + \begin{bmatrix} 1/5 \\ -2/5 \end{bmatrix}\begin{bmatrix} 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix}.$$

This also agrees with the analytic solution.

The partial factor based computational solution scheme is as follows. The partial factorization of the matrix results in
$$K = \begin{bmatrix} 1 & 0 & 0 \\ 1/2 & 1 & 0 \\ 0 & 2/5 & 1 \end{bmatrix}
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 5/2 & 0 \\ 0 & 0 & 18/5 \end{bmatrix}
\begin{bmatrix} 1 & 1/2 & 0 \\ 0 & 1 & 2/5 \\ 0 & 0 & 1 \end{bmatrix}.$$
The resulting components of the partial factorization are
$$L_{oo} = \begin{bmatrix} 1 & 0 \\ 1/2 & 1 \end{bmatrix},\quad L_{to} = \begin{bmatrix} 0 & 2/5 \end{bmatrix},$$
$$D_{oo} = \begin{bmatrix} 2 & 0 \\ 0 & 5/2 \end{bmatrix},\quad \overline{K}_{tt} = \begin{bmatrix} 18/5 \end{bmatrix}.$$

Note that the 18/5 term above, while identical to the explicitly computed Schur complement, was a side result of the update step of the partial factorization and as such was obtained for free.

The intermediate interior solution in terms of the partial factors is obtained from the forward-only substitution of the equation
$$L_{oo}D_{oo}\overline{u}_o = F_o,$$
or
$$\begin{bmatrix} 1 & 0 \\ 1/2 & 1 \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & 5/2 \end{bmatrix}\begin{bmatrix} \overline{u}_o(1) \\ \overline{u}_o(2) \end{bmatrix} = \begin{bmatrix} 8 \\ 10 \end{bmatrix},$$
as
$$\overline{u}_o = \begin{bmatrix} 4 \\ 12/5 \end{bmatrix}.$$

Note that the intermediate solution is different from that of the matrix formulation. The modified load vector in terms of the partial factors and the intermediate interior result is
$$\overline{F}_t = F_t - L_{to}D_{oo}\overline{u}_o,$$
numerically
$$\begin{bmatrix} 6 \end{bmatrix} - \begin{bmatrix} 0 & 2/5 \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & 5/2 \end{bmatrix}\begin{bmatrix} 4 \\ 12/5 \end{bmatrix} = \begin{bmatrix} 18/5 \end{bmatrix}.$$

This results in the boundary solution of
$$u_t = \begin{bmatrix} 1 \end{bmatrix},$$
as before and as required by the analytic result. The final interior results are computed from the backward-only substitution of
$$L_{oo}^T u_o = \overline{u}_o - L_{to}^T u_t,$$
or
$$\begin{bmatrix} 1 & 1/2 \\ 0 & 1 \end{bmatrix} u_o = \begin{bmatrix} 4 \\ 12/5 \end{bmatrix} - \begin{bmatrix} 0 \\ 2/5 \end{bmatrix}\begin{bmatrix} 1 \end{bmatrix},$$
resulting in the correct values of
$$u_o = \begin{bmatrix} 3 \\ 2 \end{bmatrix}.$$
The complete solution is assembled as
$$u = \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix}.$$

The computational solution clearly required more steps to execute, however at a smaller overall computational cost, mainly due to the avoidance of explicit matrix forming and inverse computations.
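A short NumPy sketch reproducing the matrix-based scheme of this example (the partitioning index sets are specific to this 3-by-3 case):

```python
import numpy as np

K = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]])
F = np.array([8.0, 10.0, 6.0])
o, t = [0, 1], [2]                             # interior and boundary index sets

Koo, Kot = K[np.ix_(o, o)], K[np.ix_(o, t)]
Kto, Ktt = K[np.ix_(t, o)], K[np.ix_(t, t)]

Got = -np.linalg.solve(Koo, Kot)               # static condensation matrix
Ktt_bar = Ktt + Kto @ Got                      # Schur complement -> 18/5
Ft_bar = F[t] + Got.T @ F[o]                   # modified boundary load -> 18/5
ut = np.linalg.solve(Ktt_bar, Ft_bar)          # boundary solution -> 1
uo = np.linalg.solve(Koo, F[o]) + Got @ ut     # back-transformed interior -> [3, 2]
print(uo, ut)
```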

8.3 Single-level, multiple-component condensation

In this case, the model is first subdivided into multiple components and the static condensation is executed simultaneously on all components. The partitioning is executed automatically by applying specialized graph partitioning techniques to the finite element model. It is important that the partitioning produce close to equal partitions and a minimal boundary between the partitions, a tough problem indeed.

A well-known method of separating the graph of the matrix into partitions is the nested dissection [4]. The class of multilevel partitioning methods of [1] and [3] is also widely accepted in the industry.


FIGURE 8.2 Single-level, multiple-component partitioning

Let us consider the model partitioning shown in Figure 8.2. Let $t_{12}$ denote the common boundary between the geometric partitions $o_1$ and $o_2$, in essence the upper half of the vertical divider line in the figure. Similarly, $t_{13}$ is the common boundary between $o_1$ and $o_3$, and so on. Finally, $t_0$ is the boundary that is shared by all partitions, in this case the central vertex of the vertical and horizontal dividers.

Ordering the interior partitions $o_i$ first, followed by the boundary partitions $t_{ij}$ and finished by the shared boundary $t_0$, as
$$o_1,\ o_2,\ o_3,\ o_4,\ t_{12},\ t_{13},\ t_{24},\ t_{34},\ t_0,$$

will result in the following $K_{aa}$ stiffness matrix pattern:
$$\begin{bmatrix}
K_{oo}^1 & & & & K_{o_1 t_{12}} & K_{o_1 t_{13}} & & & K_{o_1 t_0} \\
 & K_{oo}^2 & & & K_{o_2 t_{12}} & & K_{o_2 t_{24}} & & K_{o_2 t_0} \\
 & & K_{oo}^3 & & & K_{o_3 t_{13}} & & K_{o_3 t_{34}} & K_{o_3 t_0} \\
 & & & K_{oo}^4 & & & K_{o_4 t_{24}} & K_{o_4 t_{34}} & K_{o_4 t_0} \\
K_{t_{12} o_1} & K_{t_{12} o_2} & & & K_{tt}^{12} & K_{tt}^{12,13} & K_{tt}^{12,24} & & K_{tt}^{12,0} \\
K_{t_{13} o_1} & & K_{t_{13} o_3} & & K_{tt}^{13,12} & K_{tt}^{13} & & K_{tt}^{13,34} & K_{tt}^{13,0} \\
 & K_{t_{24} o_2} & & K_{t_{24} o_4} & K_{tt}^{24,12} & & K_{tt}^{24} & K_{tt}^{24,34} & K_{tt}^{24,0} \\
 & & K_{t_{34} o_3} & K_{t_{34} o_4} & & K_{tt}^{34,13} & K_{tt}^{34,24} & K_{tt}^{34} & K_{tt}^{34,0} \\
K_{t_0 o_1} & K_{t_0 o_2} & K_{t_0 o_3} & K_{t_0 o_4} & K_{tt}^{0,12} & K_{tt}^{0,13} & K_{tt}^{0,24} & K_{tt}^{0,34} & K_{tt}^{0}
\end{bmatrix}.$$

Here $K_{oo}^i$, $i = 1 \ldots 4$, contains the stiffness matrix partition corresponding to the interior of the i-th partition. $K_{o_i t_{ij}}$ designates the i-th partition's boundary coupling with the j-th, $j = 1 \ldots 4$. The terms $K_{tt}^{ij}$ contain the stiffness matrix partition of the common boundary between the i-th and j-th partitions. The terms $K_{tt}^{ij,kl}$ denote coupling between the ij-th and kl-th boundaries, $k, l = 1 \ldots 4$. Finally, $K_{tt}^{ij,0}$ is the coupling between the ij-th and the common boundary.

The specific sparsity pattern of the boundary partitions is reflective of the particular partitioning and ordering. In general cases there may be a rather difficult boundary and boundary-coupling sparsity pattern; therefore, a combined $K_{tt}$ term of
$$K_{tt} = \begin{bmatrix}
K_{tt}^{12} & K_{tt}^{12,13} & K_{tt}^{12,24} & & K_{tt}^{12,0} \\
K_{tt}^{13,12} & K_{tt}^{13} & & K_{tt}^{13,34} & K_{tt}^{13,0} \\
K_{tt}^{24,12} & & K_{tt}^{24} & K_{tt}^{24,34} & K_{tt}^{24,0} \\
 & K_{tt}^{34,13} & K_{tt}^{34,24} & K_{tt}^{34} & K_{tt}^{34,0} \\
K_{tt}^{0,12} & K_{tt}^{0,13} & K_{tt}^{0,24} & K_{tt}^{0,34} & K_{tt}^{0}
\end{bmatrix}$$
is introduced, where the total boundary is comprised of all the local boundaries:
$$t = t_{12} + t_{13} + t_{24} + t_{34} + t_0.$$

Let us also introduce
$$K_{ot}^1 = \begin{bmatrix} K_{o_1 t_{12}} & K_{o_1 t_{13}} & 0 & 0 & K_{o_1 t_0} \end{bmatrix},$$
$$K_{ot}^2 = \begin{bmatrix} K_{o_2 t_{12}} & 0 & K_{o_2 t_{24}} & 0 & K_{o_2 t_0} \end{bmatrix},$$
$$K_{ot}^3 = \begin{bmatrix} 0 & K_{o_3 t_{13}} & 0 & K_{o_3 t_{34}} & K_{o_3 t_0} \end{bmatrix},$$
and
$$K_{ot}^4 = \begin{bmatrix} 0 & 0 & K_{o_4 t_{24}} & K_{o_4 t_{34}} & K_{o_4 t_0} \end{bmatrix}.$$

With the above, the matrix partitioning used for the following discussions is
$$K_{aa} = \begin{bmatrix}
K_{oo}^1 & & & & K_{ot}^1 \\
 & K_{oo}^2 & & & K_{ot}^2 \\
 & & K_{oo}^3 & & K_{ot}^3 \\
 & & & K_{oo}^4 & K_{ot}^4 \\
K_{to}^1 & K_{to}^2 & K_{to}^3 & K_{to}^4 & K_{tt}
\end{bmatrix}.$$

The static problem in the general multiple (n) component case, following the latter notation, is
$$K_{aa}u_a = \begin{bmatrix}
K_{oo}^1 & & & & K_{ot}^1 \\
 & \cdot & & & \cdot \\
 & & K_{oo}^i & & K_{ot}^i \\
 & & & \cdot & \cdot \\
K_{to}^1 & \cdot & K_{to}^i & \cdot & K_{tt}
\end{bmatrix}
\begin{bmatrix} u_o^1 \\ \cdot \\ u_o^i \\ \cdot \\ u_t \end{bmatrix}
= \begin{bmatrix} F_o^1 \\ \cdot \\ F_o^i \\ \cdot \\ F_t \end{bmatrix} = F_a.$$

The condensation matrix of the i-th component, following the notation of the preceding section and introducing a superscript for the component, is
$$G_{ot}^i = -K_{oo}^{-1,i}K_{ot}^i.$$
The multiple component transformation matrix is
$$T = \begin{bmatrix}
I_{oo}^1 & & & & G_{ot}^1 \\
 & \cdot & & & \cdot \\
 & & I_{oo}^i & & G_{ot}^i \\
 & & & \cdot & \cdot \\
 & & & & I_{tt}
\end{bmatrix}.$$
Using the pre-multiplication by $T^T$ and the substitution of $T\overline{u}_a = u_a$ as in the single component case results in
$$T^T K_{aa}T\overline{u}_a = T^T F_a,$$
or
$$\overline{K}_{aa}\overline{u}_a = \overline{F}_a.$$

In detail, this multiple component partitioned form of the condensed problem is
$$\begin{bmatrix}
K_{oo}^1 & & & & \\
 & \cdot & & & \\
 & & K_{oo}^i & & \\
 & & & \cdot & \\
 & & & & \overline{K}_{tt}
\end{bmatrix}
\begin{bmatrix} \overline{u}_o^1 \\ \cdot \\ \overline{u}_o^i \\ \cdot \\ u_t \end{bmatrix}
= \begin{bmatrix} F_o^1 \\ \cdot \\ F_o^i \\ \cdot \\ \overline{F}_t \end{bmatrix}.$$

The multiple component Schur complement is of the form
$$\overline{K}_{tt} = K_{tt} + \Sigma_{i=1}^{n} K_{ot}^{T,i} G_{ot}^i.$$
The modified interior solution components formally are
$$\overline{u}_o^i = u_o^i - G_{ot}^i u_t,$$
and the modified boundary load is
$$\overline{F}_t = F_t + \Sigma_{i=1}^{n} G_{ot}^{T,i} F_o^i.$$

The efficient computational technology relies on the partial factors, as shown earlier. The modified boundary load in terms of the partial factors of the components is

$$\overline{F}_t = F_t - \sum_{i=1}^{n}L^i_{to}D^i_{oo}\overline{u}^i_o,$$

where the intermediate interior solutions are

$$\overline{u}^i_o = (L^i_{oo}D^i_{oo})^{-1}F^i_o.$$

In terms of the partial factors of the components the reduced matrix is

$$\overline{K}_{tt} = K_{tt} - \sum_{i=1}^{n}L^i_{to}D^i_{oo}L^{T,i}_{to},$$

and while small, it is very dense. The reduced solution again is of the form

$$\overline{K}_{tt}u_t = \overline{F}_t.$$

The final interior solution of the components is calculated from

$$u^i_o = L^{-T,i}_{oo}(\overline{u}^i_o - L^{T,i}_{to}u_t).$$

Since the computations related to the interior of the components are independent of each other, this method is naturally applicable to parallel computers. However, care must be taken when dealing with the $t$ partition, as multiple components contribute to it. The problem becomes even more complex on a parallel computer; special synchronization logic is needed to deal with the contributions to the $t$ partition. This topic is beyond our focus.
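The procedure above may be illustrated by a small, purely hypothetical Python/NumPy sketch. It uses dense matrices, an arbitrary two-component partitioning and plain solves instead of the partial $LDL^T$ factors, but it follows the same sequence: independent interior eliminations, accumulation of the Schur complement and of the modified boundary load, the reduced boundary solution, and the interior recovery.

import numpy as np

def condense_and_solve(K, F, interiors, boundary):
    """Single-level, multiple-component static condensation (dense sketch)."""
    t = np.asarray(boundary)
    Ktt_bar = K[np.ix_(t, t)].copy()        # K_tt, updated into the Schur complement
    Ft_bar = F[t].copy()                    # F_t, updated into the modified boundary load
    G, u_bar = [], []
    for o in interiors:
        Koo = K[np.ix_(o, o)]
        Kot = K[np.ix_(o, t)]
        Gi = -np.linalg.solve(Koo, Kot)     # G^i_ot = -K_oo^{-1} K_ot
        ui = np.linalg.solve(Koo, F[o])     # intermediate interior solution
        Ktt_bar += Kot.T @ Gi               # Schur complement accumulation
        Ft_bar += Gi.T @ F[o]               # modified boundary load accumulation
        G.append(Gi); u_bar.append(ui)
    ut = np.linalg.solve(Ktt_bar, Ft_bar)   # reduced boundary solution
    u = np.zeros_like(F)
    u[t] = ut
    for o, Gi, ui in zip(interiors, G, u_bar):
        u[o] = ui + Gi @ ut                 # interior recovery
    return u

# Usage on a hypothetical test matrix whose interiors couple only through the boundary:
n = 12
o1, o2, t = np.arange(0, 5), np.arange(5, 10), np.arange(10, 12)
A = np.random.rand(n, n); K = A + A.T
K[np.ix_(o1, o2)] = 0.0; K[np.ix_(o2, o1)] = 0.0
K += np.diag(np.abs(K).sum(axis=1) + 1.0)   # make it safely positive definite
F = np.random.rand(n)
u = condense_and_solve(K, F, interiors=[o1, o2], boundary=t)
print(np.allclose(K @ u, F))                 # the condensation is computationally exact

Since the two interiors never reference each other, the loop body could in principle execute on separate processors; only the boundary accumulation requires synchronization.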

The single-level, multiple-component method has a notable computational shortcoming because the size of the $t$ partition increases proportionally to the number of components. To overcome this problem, a multiple-level static condensation may also be used.


8.4 Multiple-level static condensation

Let us reconsider the simple finite element problem again with a different partitioning, as shown in Figure 8.3. In the figure the nodes in set $t_{12}$ represent the common boundary between partitions $o_1$ and $o_2$. Similarly set $t_{34}$ is the boundary between partitions $o_3, o_4$. These boundaries represent the first level of partitioning. The second level partitioning is represented by set $t_{1234}$ which is the collection of the boundary sets.

FIGURE 8.3 Multiple-level, multiple-component partitioning

The stiffness matrix structure corresponding to this partitioning is discussed in the following. It is important to point out that the coupling matrices from an interior domain to the various boundaries are different. They are noted by a specific superscript structure. For example the $K^{1,12}_{ot}$ coupling is from the 1st interior domain to the common boundary between the 1st and the 2nd components. For simplicity the coupling from the $i$-th domain to the final boundary is noted as $K^{i,0}_{ot}$. Note that only the upper triangle of the symmetric matrix is shown.

$$K_{aa} = \begin{bmatrix}
K^1_{oo} & & K^{1,12}_{ot} & & & & K^{1,0}_{ot} \\
& K^2_{oo} & K^{2,12}_{ot} & & & & K^{2,0}_{ot} \\
& & K^{12}_{tt} & & & & K^{12,0}_{tt} \\
& & & K^3_{oo} & & K^{3,34}_{ot} & K^{3,0}_{ot} \\
& \mathrm{sym} & & & K^4_{oo} & K^{4,34}_{ot} & K^{4,0}_{ot} \\
& & & & & K^{34}_{tt} & K^{34,0}_{tt} \\
& & & & & & K^0_{tt}
\end{bmatrix}.$$

The component condensation matrices for this arrangement are

$$G^{1,12}_{ot} = -K^{-1,1}_{oo}K^{1,12}_{ot},$$
$$G^{2,12}_{ot} = -K^{-1,2}_{oo}K^{2,12}_{ot},$$
$$G^{3,34}_{ot} = -K^{-1,3}_{oo}K^{3,34}_{ot},$$

and

$$G^{4,34}_{ot} = -K^{-1,4}_{oo}K^{4,34}_{ot}.$$

Note that the $-1,i$ superscript marks the inverse of the interior of the $i$-th component. Two transformation matrices may be built to condense the components to the boundaries between them. They are

$$T_{12} = \begin{bmatrix}
I^1_{oo} & & G^{1,12}_{ot} & & & & \\
& I^2_{oo} & G^{2,12}_{ot} & & & & \\
& & I^{12}_{tt} & & & & \\
& & & I^3_{oo} & & & \\
& & & & I^4_{oo} & & \\
& & & & & I^{34}_{tt} & \\
& & & & & & I^0_{tt}
\end{bmatrix}$$

and

$$T_{34} = \begin{bmatrix}
I^1_{oo} & & & & & & \\
& I^2_{oo} & & & & & \\
& & I^{12}_{tt} & & & & \\
& & & I^3_{oo} & & G^{3,34}_{ot} & \\
& & & & I^4_{oo} & G^{4,34}_{ot} & \\
& & & & & I^{34}_{tt} & \\
& & & & & & I^0_{tt}
\end{bmatrix}.$$

The identity matrices' sub- and superscripts mark their sizes and locations. The successive application to the stiffness matrix accomplishes the condensation of the two pairs of components to their respective common boundaries, completing the first level of the reduction.

The second (the highest in our case) level of reduction requires the elimination of the boundary-coupling terms of the last column via a new set of condensation matrices. The following are still simple boundary condensation matrices from the interior of the components to the final boundary:

$$G^{1,0}_{ot} = -K^{-1,1}_{oo}K^{1,0}_{ot},$$
$$G^{2,0}_{ot} = -K^{-1,2}_{oo}K^{2,0}_{ot},$$
$$G^{3,0}_{ot} = -K^{-1,3}_{oo}K^{3,0}_{ot},$$

and

$$G^{4,0}_{ot} = -K^{-1,4}_{oo}K^{4,0}_{ot}.$$

Another two condensation matrices are needed to eliminate the boundary-coupling terms. They are:

$$G^{12,0}_{tt} = -\overline{K}^{-1,12}_{tt}K^{12,0}_{tt}$$

and

$$G^{34,0}_{tt} = -\overline{K}^{-1,34}_{tt}K^{34,0}_{tt}.$$

Note that these condensation matrices are calculated with the $\overline{K}_{tt}$ terms, which are the already statically condensed lower level boundary components (local Schur complements). The transformation matrix for the second level reduction is formed in terms of the previous condensation matrices.

$$T_0 = \begin{bmatrix}
I^1_{oo} & & & & & & G^{1,0}_{ot} \\
& I^2_{oo} & & & & & G^{2,0}_{ot} \\
& & I^{12}_{tt} & & & & G^{12,0}_{tt} \\
& & & I^3_{oo} & & & G^{3,0}_{ot} \\
& & & & I^4_{oo} & & G^{4,0}_{ot} \\
& & & & & I^{34}_{tt} & G^{34,0}_{tt} \\
& & & & & & I^0_{tt}
\end{bmatrix}.$$

The transformation matrix executing the multiple-component, multiple-level static condensation for our case is

$$T = T_{12}T_{34}T_0.$$

Note that more than two transformation matrices may be used in the first level and also more than two levels may be applied. Applying the transformation matrix as

$$\overline{K}_{aa} = T^TK_{aa}T$$


results in the following stiffness matrix structure

$$\overline{K}_{aa} = \begin{bmatrix}
K^1_{oo} & & & & & & \\
& K^2_{oo} & & & & & \\
& & \overline{K}^{12}_{tt} & & & & \\
& & & K^3_{oo} & & & \\
& & & & K^4_{oo} & & \\
& & & & & \overline{K}^{34}_{tt} & \\
& & & & & & \overline{K}^0_{tt}
\end{bmatrix}.$$

Note that this reduction step is not marked by a different partition name but with $\overline{K}_{aa}$, as the size of this matrix has not been reduced. On the other hand, the density has been radically reduced. The terms denoted with overbars are the condensed (Schur complement) terms. The actual execution of the solution of the condensed static problem is now straightforward.

The following chart illustrates the effect of the static condensation on the stiffness matrix, independently of which version was used. Of course the $\overline{K}_{tt}$ is not a direct partition of $K_{aa}$:

$$\begin{bmatrix} K_{aa} \begin{bmatrix} \overline{K}_{tt} \end{bmatrix} \end{bmatrix}.$$

Note that the static condensation process is computationally exact, meaning that apart from the errors introduced by the floating point arithmetic, no other approximation error is incurred.

It is also important to mention that the multiple-component static condensation technique enables very efficient solution of large linear statics problems when executed on parallel computers or networks of workstations. The automatic partitioning technique mentioned in the beginning of this chapter should also take into consideration the cost of decomposition of the domains and the possible equivalency of those costs. Specifically, the number of nonzero terms in the rows of the component matrices (the front size) should also be considered during domain decomposition.

8.5 Static condensation case study

To demonstrate the practicalities of the static condensation, let us consider the industrial example of the crankshaft of an automobile. An example of such a structural component is shown in Figure 8.4.

FIGURE 8.4 Automobile crankshaft industrial example

The model had 550,132 node points and 369,468 4-noded tetrahedral elements. An eight-component static condensation resulted in the characteristics shown in Table 8.1. The internal nodes are the $o_i$ partitions. The boundary nodes are the $t_i$ partitions and their union produces the $t$ partition.

The original model was partitioned automatically, and the number of elements and the number of degrees of freedom in each component demonstrate the quality of the partitioning.

TABLE 8.1
Component statistics of crankshaft model

Internal nodes   Boundary nodes   Elements   DOF
76,265              503           50,775     230,304
73,167            1,339           49,912     223,518
59,133            1,899           39,671     183,096
67,245            1,035           46,155     204,840
66,898            3,179           45,990     210,231
77,805            2,111           53,185     239,748
63,504            2,528           43,943     198,096
58,999            2,438           39,837     183,111

The effect of the multiple-component static condensation for a linear static analysis of the model is shown in Table 8.2. The serial execution is the original model without static condensation and the parallel is the static condensation version on eight processors. The computer was the same as described in Chapter 7, but it is immaterial here, as the relation between the serial and the parallel executions is of importance.

TABLE 8.2
Performance statistics of crankshaft model

           Elapsed    CPU        I/O       Memory
           min:sec    seconds    GBytes    Mwords
Serial     71:14      3,982      36.67     120
Parallel   24:46      1,442      11.08     50

The I/O reported is not the total disk storage usage; it is the amount of data transferred during the analysis. As such, it is much larger than the actual disk requirement, due to repeated data access.

It is noticeable that the memory and I/O requirements (the main advantage of using the condensation) are significantly reduced. This advantage is present even when the analysis is executed on a serial machine. On the other hand, when the condensed version is running on multiple processors, there is an additional performance advantage in execution times.

The speedup is modest. This is because the data reflect a complete analysis job, of which the factorization and solution steps are only a part. This fact limits the achievable speedup.


References

[1] Barnard, S. T. and Simon, H. D.; Fast multilevel implementation of recursive spectral bisection for partitioning unstructured problems, Concurrency: Practice and Experience, Vol. 6(2), pp. 101-117, Wiley and Sons, New York, 1994

[2] Guyan, R. J.; Reduction of stiffness and mass matrices, AIAA Journal, Vol. 3, pp. 380-390, 1965

[3] Karypis, G. and Kumar, V.; A fast and high quality multilevel scheme for partitioning irregular graphs, Tech. Report TR 95-035, Department of Computer Science, University of Minnesota, Minneapolis, 1995

[4] Liu, J. W. H.; The role of elimination trees in sparse factorization, SIAM J. of Matrix Analysis and Applications, Vol. 11, pp. 134-172, 1990


9

Real Spectral Computations

In practice, large scale eigenvalue problems are solved with specific techniques discussed in this chapter. These techniques also provide the foundation for the dynamic reduction methods of this second part of the book. These computations are usually executed by employing a spectral transformation followed by a robust eigenvalue solution technique, such as the Lanczos method.

9.1 Spectral transformation

In industrial applications the eigenvalue spectrum of interest is very wide and non-homogeneous. Regions with closely spaced eigenvalues are followed by empty segments and vice versa. The convergence of the most commonly used eigenvalue methods is slow in that case.

The motivation of the spectral transformation is to modify the spectral distribution in order to find the eigenvalues more efficiently. This is done in the form of a transformed eigenvalue:

$$\mu = \frac{1}{\lambda - \lambda_s},$$

where $\lambda_s$ is an appropriately chosen eigenvalue shift. The graphical representation of this is a hyperbola shifted to the right by $\lambda_s$ in the $(\mu, \lambda)$ coordinate system. As shown in Figure 9.1, this transformation enables one to find closely spaced eigenvalues in the neighborhood of $\lambda_s$ in a well separated form on the $\mu$ axis.

Applying the spectral transformation to the algebraic eigenvalue problem of

$$Ax = \lambda x$$

in the form of

$$\lambda = \frac{1}{\mu} + \lambda_s,$$

FIGURE 9.1 Spectral transformation

we get

$$(A - \lambda_sI)^{-1}x = \mu x,$$

or

$$\overline{A}x = \mu x.$$

In practical computations the $\overline{A}$ matrix of course is never explicitly formed. Any time the eigenvalue method requires an operator multiplication of

$$z = \overline{A}x,$$

one executes the

$$(A - \lambda_sI) = LDL^T$$

symmetric factorization followed by the

$$LDL^Tz = x$$

forward-backward substitution. In the iterations only the substitution steps are executed until a new spectral transformation is done. While the cost of the substitution operations may seem too high in place of the matrix-vector multiplication, the numerical advantages gained clearly outweigh the costs. The explicit inverse of $(A - \lambda_sI)$ is generally a full matrix, while its factor $L$, albeit denser than $A$, is still a sparse matrix in most applications. The spectral transformation step may be executed at multiple locations $\lambda_{s1}, \lambda_{s2}, \ldots$, enabling the traversal of the very wide frequency ranges of interest that are commonplace in the industry.

The possible problem of $\lambda_s$ being identical (or too close) to an eigenvalue must be resolved by a singularity detection mechanism. The singularity detection is done in the factorization operation, similarly to the technique described in Section 5.2. It is based on monitoring the terms of the $D$ diagonal factor matrix relative to the corresponding original terms in $A - \lambda_sI$. If their ratio is too high ($\lambda_s$ is too close to an eigenvalue), the value of $\lambda_s$ is perturbed and the factorization is repeated.

Another advantage of the spectral transformation is that the factorization at the various shifts produces a mechanism to monitor the distribution of eigenvalues. Namely, the number of negative terms in the $D$ factor matrix is equivalent to the Sturm number. The Sturm number is the number of sign changes in the Sturm sequence $d_0, d_1, \ldots, d_{n-1}$, where $d_i = \det(A_i - \lambda_sI)$ is the determinant of the leading $i$-th principal sub-matrix of the shifted matrix, with $d_0 = 1$.

It follows that the Sturm number also indicates the number of eigenvalues located to the left of the current shift $\lambda_s$ in the spectrum. This is a tool exploited by all industrial eigenvalue solution software packages to verify that all eigenvalues in a region bounded by two $\lambda_s$ values are found. If the number of eigenvalues found in a region is less than the difference of the Sturm numbers at the boundaries of the region, the region is bisected and additional iterations are executed as needed. If the number of eigenvalues found in the region exceeds the Sturm count difference, then an error has occurred and one or more spurious eigenvalues were found.
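The following Python/NumPy sketch, with a hypothetical unpivoted $LDL^T$ helper and an arbitrary test matrix, illustrates how the factorization at a shift doubles as a Sturm count; it is an illustration of the idea only, not of any production implementation.

import numpy as np

def ldl_no_pivot(A):
    """Plain LDL^T factorization without pivoting (sketch; assumes nonzero pivots)."""
    n = A.shape[0]
    L = np.eye(n)
    d = np.zeros(n)
    for j in range(n):
        d[j] = A[j, j] - L[j, :j] ** 2 @ d[:j]
        for i in range(j + 1, n):
            L[i, j] = (A[i, j] - L[i, :j] * L[j, :j] @ d[:j]) / d[j]
    return L, d

def sturm_count(A, shift):
    """Eigenvalues of A below the shift = negative terms of D in A - shift*I = L D L^T."""
    _, d = ldl_no_pivot(A - shift * np.eye(A.shape[0]))
    return int(np.sum(d < 0.0))

# Usage: the counts at two shifts bracket the number of eigenvalues in between.
A = np.diag([1.0, 2.0, 2.1, 5.0, 9.0]) + 0.01 * np.ones((5, 5))
print(sturm_count(A, 3.0) - sturm_count(A, 1.5))   # 2 eigenvalues of this example lie in (1.5, 3.0)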

9.2 Lanczos reduction

The most widespread and robust method for the solution of industrial eigenvalue problems is the Lanczos method [6]. Let us consider the canonical eigenvalue problem


$$A\underline{x} = \lambda\underline{x},$$

with a real, symmetric $A$. Here $\underline{x}$ are the eigenvectors of the original problem; the underlining is used to distinguish them from the soon to be introduced $x$ Lanczos vectors.

The Lanczos method generates a set of orthogonal vectors $X_n$ such that:

$$X_n^TX_n = I,$$

where $I$ is the identity matrix of order $n$, and

$$X_n^TAX_n = T_n,$$

where $A$ is the original real, symmetric matrix and $T_n$ is a tridiagonal matrix of the form

$$T_n = \begin{bmatrix}
\alpha_1 & \beta_1 & & & \\
\beta_1 & \alpha_2 & \beta_2 & & \\
& \ddots & \ddots & \ddots & \\
& & \beta_{n-2} & \alpha_{n-1} & \beta_{n-1} \\
& & & \beta_{n-1} & \alpha_n
\end{bmatrix}.$$

By multiplication we get the following equation:

$$AX_n = X_nT_n.$$

By equating columns on both sides of the equation we get:

$$Ax_k = \beta_{k-1}x_{k-1} + \alpha_kx_k + \beta_kx_{k+1},$$

where $k = 1, 2, \ldots, n-1$ and $x_k$ is the $k$-th column of $X_n$. For any $k < n$ the following is also true:

$$AX_k = X_kT_k + \beta_kx_{k+1}e_k^T,$$

where $e_k$ is the $k$-th unit vector containing a unit value in row $k$ and zeroes elsewhere. Its presence is only needed to make the matrix addition operation compatible. This equation will be very important in the error-bound calculation. The following starting assumption is made:

$$\beta_0x_0 = 0.$$

By reordering we obtain the following Lanczos recurrence formula:

$$\beta_kx_{k+1} = Ax_k - \alpha_kx_k - \beta_{k-1}x_{k-1}.$$

The coefficients $\beta_k$ and $\alpha_k$ are defined as

$$\beta_k = \sqrt{|x_{k+1}^Tx_{k+1}|},$$

and

$$\alpha_k = x_k^TAx_k.$$

The process is continued. Sometimes, due to round-off error in the computations, the orthogonality between the Lanczos vectors is lost. This is remedied by executing a Gram-Schmidt orthogonalization step at certain $k$ values:

$$\gamma_i = x_{k+1}^Tx_i, \quad i = 1, \ldots, k,$$

and

$$x_{k+1} = x_{k+1} - \sum_{i=1}^{k}\gamma_ix_i.$$

This can be a very time-consuming step, and in the industry the orthogonalization is only executed against a certain selected set of $i$ indices, not all.

The solution of the eigenvalue problem is now based on the tridiagonal matrix $T_n$ as follows:

$$T_nu_i = \lambda_iu_i.$$

The eigenvalues are invariant under the reduction and the eigenvectors are recovered as

$$\underline{x} = X_nu_i.$$

In practice the Lanczos reduction is executed only up to a certain number of steps, say $j \ll n$. The approximated residual error in the original solution following a partial Lanczos reduction [7] can be calculated as:

$$||r_j|| = ||A\underline{x} - \lambda_i\underline{x}|| = ||AX_ju_i - \lambda_iX_ju_i|| = ||(AX_j - \lambda_iX_j)u_i||,$$

where $i = 1, 2, \ldots, j$, and $j$ is the number of Lanczos steps executed. Furthermore,

$$||r_j|| = ||(AX_j - X_jT_j)u_i|| = ||(\beta_jx_{j+1}e_j^T)u_i|| = \beta_j||e_j^Tu_i||,$$

assuming the norm of the Lanczos vector $x_{j+1}$ is unity. All above norms are Euclidean norms. Taking advantage of the structure of the unit vector we can simplify to the following scalar form:

$$||r_j|| = \beta_j|u_{ji}|,$$

where $u_{ji}$ is the $j$-th (last) term of the $u_i$ eigenvector.

The last equation gives a convergence monitoring tool. When the error norm is less than the required tolerance $\varepsilon$ and the value of $j$ is higher than the number of eigenvalues required, the reduction process can be stopped. The beauty of this convergence criterion is that only the eigenvector of the tridiagonal problem has to be found, which is inexpensive compared to finding the eigenvector of the physical (size $n$) problem.
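A minimal sketch of the recurrence, assuming a dense symmetric NumPy matrix, full re-orthogonalization at every step, and the scalar estimate $\beta_j|u_{ji}|$ as the stopping measure, could read as follows; all names and tolerances are illustrative.

import numpy as np

def lanczos(A, m, tol=1e-10):
    """Plain Lanczos reduction of a symmetric matrix to tridiagonal form (sketch)."""
    n = A.shape[0]
    X = np.zeros((n, m)); alpha = np.zeros(m); beta = np.zeros(m)
    x = np.random.rand(n)
    X[:, 0] = x / np.linalg.norm(x)
    for k in range(m):
        w = A @ X[:, k]                                # operator application
        alpha[k] = X[:, k] @ w
        w -= alpha[k] * X[:, k]
        if k > 0:
            w -= beta[k - 1] * X[:, k - 1]
        w -= X[:, :k + 1] @ (X[:, :k + 1].T @ w)        # full Gram-Schmidt re-orthogonalization
        beta[k] = np.linalg.norm(w)
        # residual estimate beta_j |u_ji| for the Ritz pairs of the current T_j
        Tj = np.diag(alpha[:k + 1]) + np.diag(beta[:k], -1) + np.diag(beta[:k], 1)
        mu, U = np.linalg.eigh(Tj)
        if np.all(beta[k] * np.abs(U[-1, :]) < tol):
            return X[:, :k + 1], alpha[:k + 1], beta[:k + 1]
        if k + 1 < m:
            X[:, k + 1] = w / beta[k]
    return X, alpha, beta

# Usage: the extreme Ritz values closely track the extreme eigenvalues of A.
A = np.random.rand(50, 50); A = (A + A.T) / 2
X, alpha, beta = lanczos(A, 30)
Tj = np.diag(alpha) + np.diag(beta[:-1], -1) + np.diag(beta[:-1], 1)
print(np.sort(np.linalg.eigvalsh(Tj))[-3:], np.sort(np.linalg.eigvalsh(A))[-3:])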

9.3 Generalized eigenvalue problem

In engineering finite element analysis the eigenvalue problem appears in connection with at least two matrices. This is the focus of our interest here, the generalized linear eigenvalue problem of

$$K\phi - \lambda M\phi = 0,$$

with the assumption that the matrices are symmetric and the partition subscript of the matrices is omitted for simplicity. Since the frequency range of interest in these problems is at the lower end of the spectrum, it is advisable to use the spectral transformation introduced in Section 9.1:

$$\mu = \frac{1}{\lambda - \lambda_s}.$$

This will change the problem into

$$(K - \lambda_sM)\phi = \frac{1}{\mu}M\phi,$$

which results in a canonical form amenable to the Lanczos algorithm:

$$\mu\phi = (K - \lambda_sM)^{-1}M\phi.$$

Despite its appearance, the matrix operator will be symmetric if the $M$ matrix is used in the inner product of the Lanczos iteration. The process scheme, consisting of the spectral transformation, the block tridiagonal reduction and the eigenvalue solution steps, is depicted in Figure 9.2.

The eigenvectors are invariant under the spectral transformation and the eigenvalues may be recovered as

$$\lambda = \frac{1}{\mu} + \lambda_s.$$

The industrial standard block Lanczos method [5] carries out the Lanczos recurrence with several vectors, called a block, simultaneously. A step of the Lanczos recurrence algorithm using blocks of vectors, formulated for the generalized eigenvalue problem, is

Page 177: What Every Engineer Should Know about Computational Techniques of Finite Element Analysis, Second Edition

Real Spectral Computations 165

K M

K - λsM

TB

Λ

FIGURE 9.2 Generalized solution scheme

$$R_{k+1} = (K - \lambda_sM)^{-1}MQ_k - Q_kA_k - Q_{k-1}B_k^T,$$

where

$$A_k = Q_k^TM(K - \lambda_sM)^{-1}MQ_k,$$

and

$$R_{k+1} = Q_{k+1}B_{k+1}.$$

$Q_{k+1}$ is an $n$ by $b$ matrix with $M$-orthonormal columns (the Lanczos vectors) and $B_{k+1}$ is a $b$ by $b$ upper triangular matrix, $n$ being the problem size and $b$ the block size, obtainable by the QR decomposition [2].

The block orthogonalization may be formulated as

$$Q_{k+1} = Q_{k+1} - \sum_{i=1}^{k}Q_i\Gamma_i,$$

where

$$\Gamma_i^T = Q_{k+1}^TMQ_i, \quad i = 1, \ldots, k,$$

and the vectors are mass orthogonalized.

Page 178: What Every Engineer Should Know about Computational Techniques of Finite Element Analysis, Second Edition

166 Chapter 9

The initial values are $Q_0 = 0$ and $R_0$ is a collection of $b$ pseudo-random vectors at the start. This process, executed $j$ times, results in a block tridiagonal matrix of the form

$$T_B = \begin{bmatrix}
A_1 & B_2^T & & \\
B_2 & A_2 & \ddots & \\
& \ddots & \ddots & B_j^T \\
& & B_j & A_j
\end{bmatrix}.$$

With appropriately chosen Givens transformations this block tridiagonal matrix is reduced into a scalar tridiagonal matrix $T_J$. If $b$ is the number of vectors in a block, then the size of $T_J$ is $J = jb$ ($J \ll n$). The solution of the eigenvalue problem of

$$T_J\psi = \mu\psi$$

will be addressed in the next section. The eigenvalues of the original large problem are invariant under the transformations resulting in the tridiagonal form. Finally, to find the eigenvectors of the original problem, a back-transformation of the form

$$\phi_a = Q_J\psi$$

is required. The $Q_J$ matrix is a collection of the Lanczos vector blocks:

$$Q_J = \begin{bmatrix} Q_1 & Q_2 & \ldots & Q_j \end{bmatrix}.$$

This process then may be repeated until all eigenvalues (and corresponding eigenvectors) of the required frequency range are found.

9.4 Eigensolution computation

The Lanczos process produced the $T_J$ reduced tridiagonal matrix. Consider the

$$T_J\psi_i = \mu_i\psi_i$$

eigenvalue problem. Here $i$ is the index of the eigenvalue of the tridiagonal matrix, $i = 1, 2, \ldots, J$.

Since the eigenvalues are invariant under the transformation to tridiagonal form, the $\mu_i$ eigenvalues of the tridiagonal matrix, the Ritz values, are approximations of the $\lambda$ eigenvalues of the original problem, apart from the spectral transformation.

The approximations to the eigenvectors of the original problem (Ritz vectors) are calculated from the eigenvectors of the tridiagonal problem via the multiplication by the Lanczos vectors shown earlier.

To solve the eigenvalue problem the QR iteration [2] may be used, among other well known methods such as the QL method or the method of bisection in the symmetric case. The QR method is based on a decomposition of the $T_J$ matrix into the form

$$T_J - \omega I = Q^1R^1,$$

where $R^1$ is an upper triangular matrix and $Q^1$ (these are not the Lanczos vectors) contains orthogonal columns:

$$Q^{1,T}Q^1 = I.$$

The presence of $\omega$ accounts for a diagonal shift to aid the stability of this decomposition. The process is followed by iterating as follows:

$$T_J^1 = R^1Q^1 + \omega I.$$

Pre- and post-multiplying yields

$$T_J^2 = Q^{1,T}T_J^1Q^1.$$

This is a congruence transformation; therefore, the newly created matrix preserves the eigenvalue spectrum of the old one. By repeatedly applying this procedure, $T_J^m$ finally becomes of diagonal form, whose elements are the eigenvalues of the original tridiagonal matrix, that is

$$T_J^m = \mathrm{diag}(\mu_i),$$

where $m$ is the number of iterations, hopefully, but not necessarily, much less than $J$. The computation in the QR iteration takes advantage of the tridiagonal nature of $T_J$. It creates orthogonal transformation matrices $Q^i$ containing only the four terms of the Givens rotations [3] that zero out the sub-diagonal terms in forming $R^i$. This operation will not be detailed further here.

For the eigenvectors, an inverse power iteration procedure [8] will be used. The eigenvectors corresponding to the $i$-th eigenvalue of the $T_J$ tridiagonal matrix may be determined by the following factorization:

$$T_J - \mu_iI = L_iU_i,$$

where $L_i$ is unit lower triangular and $U_i$ is upper triangular. Gaussian elimination with partial pivoting is used, i.e., the pivotal row at each stage is selected to be the equation with the largest coefficient of the variable being eliminated. Since the original matrix is tridiagonal, at each stage there are only two equations containing that variable. Approximate eigenvectors of the $i$-th eigenvalue $\lambda_i$ will be calculated by the simple (since $U_i$ has only three nonzero diagonals) iterative procedure:

$$U_i\psi_i^{k+1} = \psi_i^k,$$

where $\psi_i^0$ is random and $k$ is a counter. Practice shows that the convergence of this procedure is so rapid that $k$ only goes to 2 or 3.

This original method also has some significant extensions to deal with special cases, such as multiple eigenvalues, which are ignored here. These two steps complete the solution of the eigenvalue problem.

The approximated residual error of the $i$-th eigenpair is

$$||r_i|| = ||K_{aa}\phi_a^i - \lambda_iM_{aa}\phi_a^i||,$$

where

$$\lambda_i = \frac{1}{\mu_i} + \lambda_s.$$

Based on the discourse in the last section, the error can be estimated by

$$||r_J^i|| = \beta_J|\psi_i(J)|,$$

where $\psi_i(J)$ is the $J$-th (last) term of the $\psi_i$ eigenvector and $\beta_J$ is the last off-diagonal term of the $T_J$ matrix.

The last equation gives the convergence monitoring tool mentioned earlier. When this error norm is less than a pre-defined tolerance $\varepsilon$ and the value of $J$ is higher than the number of eigenvalues required, the process can be stopped.
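The two stages of this section can be sketched as follows; the dense NumPy QR factorization stands in for the Givens-rotation implementation, and the small test matrix is hypothetical.

import numpy as np

def qr_eigenvalues(T, iters=200):
    """Shifted QR iteration on a symmetric (tridiagonal) matrix; dense sketch.
    A production code would use Givens rotations to exploit the tridiagonal form."""
    A = T.copy()
    n = A.shape[0]
    for _ in range(iters):
        omega = A[-1, -1]                              # simple shift from the trailing term
        Q, R = np.linalg.qr(A - omega * np.eye(n))
        A = R @ Q + omega * np.eye(n)                  # congruence step, spectrum preserved
    return np.sort(np.diag(A))

def inverse_iteration(T, mu, steps=3):
    """Inverse power iteration for the eigenvector belonging to the Ritz value mu."""
    n = T.shape[0]
    psi = np.random.rand(n)
    shifted = T - mu * np.eye(n) + 1e-12 * np.eye(n)   # tiny perturbation guards the solve
    for _ in range(steps):
        psi = np.linalg.solve(shifted, psi)
        psi /= np.linalg.norm(psi)
    return psi

# Usage on a small symmetric tridiagonal matrix:
alpha = np.array([4.0, 3.0, 2.0, 1.0]); beta = np.array([0.5, 0.4, 0.3])
T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
mus = qr_eigenvalues(T)
print(np.allclose(mus, np.linalg.eigvalsh(T)))
psi = inverse_iteration(T, mus[0])
print(np.linalg.norm(T @ psi - mus[0] * psi))          # small residual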

9.5 Distributed eigenvalue computation

Due to the vast sizes of industrial problems, a distributed computational scheme also has significant merit here. This section introduces an implicit distributed formulation of the Lanczos method. The formulation relies heavily on the distributed factorization and solution technique introduced in Section 7.4. Let us use the standard Lanczos recurrence for the generalized problem as:

$$\overline{x}_{k+1} = Ax_k - \alpha_kx_k - \beta_{k-1}x_{k-1}, \quad k = 1, 2, \ldots$$

with orthogonality parameter

$$\alpha_k = x_k^TM_{aa}Ax_k.$$

The canonical (dynamic) matrix is

$$A = (K_{aa} - \lambda_sM_{aa})^{-1}M_{aa},$$

where the fixed $\lambda_s$ value represents the spectral transformation. The $K_{aa}, M_{aa}$ matrices are the structural stiffness and mass matrices. The normalization parameter is

$$\beta_k = \sqrt{|\overline{x}_{k+1}^TM_{aa}\overline{x}_{k+1}|},$$

and the next Lanczos vector of the recurrence is computed as

$$x_{k+1} = \overline{x}_{k+1}/\beta_k.$$

The following six steps constitute a distributed execution of this process.

I. Partitioned eigenproblem

The global dynamic matrix is partitioned into sub-matrices as follows:

$$A = \begin{bmatrix}
A^1_{oo} & & & & & A^1_{oa} \\
& A^2_{oo} & & & & A^2_{oa} \\
& & \ddots & & & \vdots \\
& & & A^j_{oo} & & A^j_{oa} \\
& & & & \ddots & \vdots \\
A^1_{ao} & A^2_{ao} & \cdots & A^j_{ao} & \cdots & A_{aa}
\end{bmatrix},$$

where superscript $j$ refers to the $j$-th partition, subscript $a$ to the common boundary of the partitions, and $s$ is the number of partitions, so $j = 1, 2, \ldots, s$, just like in the static condensation case. The size of the global matrix as well as of the global eigenvectors is $N$. A key feature of the distributed algorithm is the partitioning of the global Lanczos vectors accordingly:

$$x = \begin{bmatrix} x^1_o \\ x^2_o \\ \vdots \\ x^j_o \\ \vdots \\ x_a \end{bmatrix}.$$

In the distributed execution the $j$-th processor contains only the $j$-th partition of:

$$A^j = \begin{bmatrix} A^j_{oo} & A^j_{oa} \\ A^j_{ao} & A^j_{aa} \end{bmatrix},$$


where $A^j_{aa}$ is the complete boundary of the $j$-th partition, which may be shared by several other partitions. Similarly, a local Lanczos vector component is built as

$$x^j = \begin{bmatrix} x^j_o \\ x^j_a \end{bmatrix},$$

where $x^j_a$ is a partition of $x_a$.

II. Partitioned matrix factorization

Since the $A$ matrix is available only in the partitions shown above, the factorization is comprised of several components, as shown in Section 7.4. The partitioned factorization of the $j$-th partition is as follows:

$$A^j = \begin{bmatrix} A^j_{oo} & A^j_{oa} \\ A^j_{ao} & A^j_{aa} \end{bmatrix} =
\begin{bmatrix} L^j_{oo} & 0 \\ L^j_{ao} & I \end{bmatrix}
\begin{bmatrix} D^j_{oo} & 0 \\ 0 & \overline{A}^j_{aa} \end{bmatrix}
\begin{bmatrix} L^{T,j}_{oo} & L^{T,j}_{ao} \\ 0 & I \end{bmatrix},$$

where

$$\overline{A}^j_{aa} = A^j_{aa} - L^j_{ao}D^j_{oo}L^{T,j}_{ao},$$

with $A_{aa} = \sum_{j=1}^{s}A^j_{aa}$. The individual Schur complement matrices of each partition are summed up as

$$\overline{A}_{aa} = \sum_{j=1}^{s}\overline{A}^j_{aa}$$

to create the global Schur complement, which is factored:

$$\overline{A}_{aa} = L_{aa}D_{aa}L_{aa}^T.$$

This is the foundation of the distributed Lanczos step.

III. Partitioned substitution

The forward substitution in the partitioned form is

$$\begin{bmatrix} L^j_{oo}D^j_{oo} & 0 \\ L^j_{ao}D^j_{oo} & I \end{bmatrix}
\begin{bmatrix} z^j_o \\ z^j_a \end{bmatrix} =
\begin{bmatrix} x^j_o \\ x_a \end{bmatrix}.$$

Here the $I$ sub-matrix is included only to assure compatibility. The forward substitution yields

$$z^j_o = [L^j_{oo}D^j_{oo}]^{-1}x^j_o,$$

and

$$z^j_a = x_a - L^j_{ao}D^j_{oo}z^j_o.$$

This is the boundary component of the new local Lanczos vector. The latter has to be summed up over all partitions as:

$$z_a = \sum_{j=1}^{s}z^j_a.$$

The global boundary solution is therefore obtained from

$$L_{aa}D_{aa}L_{aa}^T\overline{z}_a = z_a.$$

The backward substitution for the partitions,

$$\begin{bmatrix} L^{T,j}_{oo} & L^{T,j}_{ao} \\ 0 & I \end{bmatrix}
\begin{bmatrix} \overline{z}^j_o \\ \overline{z}_a \end{bmatrix} =
\begin{bmatrix} z^j_o \\ \overline{z}_a \end{bmatrix},$$

produces

$$\overline{z}^j_o = L^{-T,j}_{oo}(z^j_o - L^{T,j}_{ao}\overline{z}_a),$$

which is the interior component of the local Lanczos vector.

IV. Local matrix multiply

In order to compute the orthogonality parameter, the local Lanczos vector component must be multiplied by the mass matrix as follows:

$$y^j_k = \begin{bmatrix} y^j_o \\ y^j_a \end{bmatrix} =
\begin{bmatrix} M^j_{oo} & M^j_{oa} \\ M^j_{ao} & M^j_{aa} \end{bmatrix}
\begin{bmatrix} x^j_o \\ x^j_a \end{bmatrix} = M^jx^j_k.$$

Note that the $x^j_a, y^j_a$ local boundary partitions must be updated across all the processes.

V. Implicit distributed Lanczos step

At this point we have calculated both components $z^j_o, z^j_a$ of the local $z^j_k$ vector as well as the $y^j_o, y^j_a$ components of the local $y^j_k$ vector partitions. The appropriate partitions of the last two Lanczos vectors $x^j_k, x^j_{k-1}$ are also available. Hence, the implicit distributed Lanczos recurrence is executed with the following steps.

1. Calculate partition local inner product:

$$\alpha^j_k = y^{T,j}_kz^j_k$$

2. Accumulate global inner product:

$$\alpha_k = \sum_{j=1}^{s}\alpha^j_k$$

3. Execute local Lanczos step:

$$\overline{x}^j_{k+1} = z^j_k - \alpha_kx^j_k - \beta_{k-1}x^j_{k-1}$$

4. Calculate local normalization parameter:

$$y^j_{k+1} = M^j\overline{x}^j_{k+1}$$
$$\beta^j_k = (\overline{x}^{T,j}_{k+1}y^j_{k+1})^{1/2}$$

5. Collect global value:

$$\beta_k = \sqrt{\sum_{j=1}^{s}(\beta^j_k)^2}$$

6. Produce the next normalized Lanczos vector local component:

$$x^j_{k+1} = \overline{x}^j_{k+1}/\beta_k$$

VI. Distributed orthogonalization scheme

A time-consuming operation in the Lanczos reduction is the orthogonalization. This is done by first calculating the local coefficients

$$\omega^j_i = x^{T,j}_{k+1}M^jx^j_i,$$

and summing up the global values as

$$\omega_i = \sum_{j=1}^{s}\omega^j_i,$$

where $i = 1, 2, \ldots, k-1$ is the index of the already computed Lanczos vectors. The local orthogonalization is done by

$$x^j_{o,k+1} = x^j_{o,k+1} - \omega_ix^j_{o,i},$$

and

$$x^j_{a,k+1} = x^j_{a,k+1} - \omega_ix^j_{a,i}.$$

This concludes the distributed computational form of the Lanczos reduction.
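A serial imitation of the accumulation pattern of steps 1 through 6 is sketched below. The index lists play the role of the $s$ processors, the mass matrix is kept diagonal so the local products stay partition-local, the boundary bookkeeping of the real method is simplified away, and all data are hypothetical.

import numpy as np

np.random.seed(0)
n, s = 12, 3
parts = np.array_split(np.arange(n), s)          # disjoint partitions standing in for processors

M = np.diag(np.random.rand(n) + 1.0)             # diagonal "mass" keeps local products local
z = np.random.rand(n)                             # stands in for the substitution result z_k
x_k, x_km1 = np.random.rand(n), np.random.rand(n)
beta_km1 = 1.0

# Steps 1-2: local inner products accumulated into the global orthogonality parameter
y = M @ x_k
alpha_k = sum(y[p] @ z[p] for p in parts)

# Step 3: the local Lanczos step executed independently on every partition
x_bar = np.concatenate([z[p] - alpha_k * x_k[p] - beta_km1 * x_km1[p] for p in parts])

# Steps 4-5: local normalization contributions collected into the global beta_k
My = M @ x_bar
beta_k = np.sqrt(sum(x_bar[p] @ My[p] for p in parts))

# Step 6: normalize the local components
x_next = x_bar / beta_k

# The accumulated values reproduce what a single-process computation would give:
x_ref = z - alpha_k * x_k - beta_km1 * x_km1
print(np.isclose(alpha_k, (M @ x_k) @ z),
      np.allclose(x_next, x_ref / np.sqrt(x_ref @ M @ x_ref)))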

9.6 Dense eigenvalue analysis

Here we discuss the solution of the free vibration problem when the matrices are dense. Such may occur when the structure is a solid model, such as the automobile crankshaft example shown in the last chapter. This may also occur when dealing with a boundary component of a complex structure; this will be discussed in more detail in the next chapter. Finally, in the coupled fluid-structure application of Chapter 6, the fluid component also produces dense matrices.

While the overall concept of solving these eigenvalue problems is similar to the solution of the global problem, in the sense that a reduction is followed by eigenvalue iterations, the applied reduction techniques are different and are discussed in the following.

In order to apply the reduction techniques to these problems, the problem is explicitly converted to a single matrix canonical form. Since the origin of the dense matrix may be from different sources, in the following we omit the partition notation when we address the real, symmetric, undamped problem of

$$(K - \lambda M)\phi = 0.$$

Let us assume first that the Cholesky factorization of the dense mass matrix exists:

$$M = CC^T.$$

The algorithm for such a factorization of an order $n$ symmetric, positive definite $A$ matrix is a mild variation of the factorization introduced in Chapter 7 and may be written as:

$C(1,1) = \sqrt{A(1,1)}$
For $j = 2, n$
  $C(j,1) = A(j,1)/C(1,1)$
End loop $j$
For $i = 2, n-1$
  $C(i,i) = (A(i,i) - \sum_{k=1}^{i-1}C(i,k)^2)^{1/2}$
  For $j = i+1, n$
    $C(j,i) = (A(j,i) - \sum_{k=1}^{i-1}C(i,k)C(j,k))/C(i,i)$
  End loop $j$
End loop $i$
$C(n,n) = (A(n,n) - \sum_{k=1}^{n-1}C(n,k)^2)^{1/2}$


With this factorization the eigenvalue problem is converted as

$$C^{-1}K\phi - \lambda C^{-1}CC^T\phi = 0.$$

Introducing

$$\psi = C^T\phi$$

and substituting yields the canonical problem of

$$A\psi = \lambda\psi.$$

Here

$$A = C^{-1}KC^{-1,T}.$$

It is of course not always possible to factor $M$ as shown above. In this case we assume that the factorization of a linear combination of the mass and stiffness matrices exists. If that does not hold, a massless mechanism exists in the problem and should be removed as shown in Chapter 5.

Assuming a positive shift value ($\lambda_s$) for the linear combination, let us factor

$$K + \lambda_sM = CC^T.$$

A heuristic, but industrially proven selection of such a value is of the form

$$\lambda_s = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\frac{K(i,i)}{M(i,i)},$$

where $n$ is the size of the matrices. The following rearrangement is the basis of the canonical form in this case:

$$(K + \lambda_sM - (\lambda + \lambda_s)M)\phi = 0.$$

Pre-multiplying this equation by

$$\frac{-1}{\lambda + \lambda_s}$$

and substituting the above factorization yields

$$\left(M - \frac{CC^T}{\lambda + \lambda_s}\right)\phi = 0.$$

Introducing

$$\phi = C^{-1,T}\psi$$

and pre-multiplying with $C^{-1}$ yields the following canonical form:

$$(\overline{A} - \overline{\lambda}I)\psi = 0.$$


Here

$$\overline{A} = C^{-1}MC^{-1,T},$$

and

$$\overline{\lambda} = \frac{1}{\lambda + \lambda_s}.$$

The solution of this canonical problem is addressed in the next section.
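A dense Python/NumPy sketch of the second conversion, using the heuristic shift as reconstructed above and SciPy's Cholesky and triangular solves, may look like this; the test matrices are arbitrary illustrations, not part of any production code.

import numpy as np
from scipy.linalg import cholesky, solve_triangular, eigh

def dense_canonical_modes(K, M):
    """Convert (K - lambda*M)phi = 0 to a canonical symmetric problem via a Cholesky
    factor of K + lambda_s*M, then recover the physical eigenpairs (dense sketch)."""
    n = K.shape[0]
    lam_s = np.sum(np.diag(K) / np.diag(M)) / np.sqrt(n)   # heuristic shift from the text
    C = cholesky(K + lam_s * M, lower=True)                 # K + lambda_s M = C C^T
    Ci_M = solve_triangular(C, M, lower=True)               # C^{-1} M
    A_bar = solve_triangular(C, Ci_M.T, lower=True)         # C^{-1} M C^{-T}
    lam_bar, Psi = eigh((A_bar + A_bar.T) / 2)              # canonical eigenproblem
    lam = 1.0 / lam_bar - lam_s                              # back-transformed eigenvalues
    Phi = solve_triangular(C, Psi, trans='T', lower=True)    # phi = C^{-T} psi
    return lam, Phi

# Usage on a small random test pair (diagonal M, symmetric positive semi-definite K):
n = 6
X = np.random.rand(n, n)
K = X @ X.T
M = np.diag(np.random.rand(n) + 0.5)
lam, Phi = dense_canonical_modes(K, M)
i = np.argmin(np.abs(lam))                                   # check one recovered eigenpair
print(np.linalg.norm(K @ Phi[:, i] - lam[i] * M @ Phi[:, i]))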

9.7 Householder reduction technique

The most practical methods for dense matrix reduction are based on the Householder reflection. The Householder [4] reflection matrices are of the form

$$P_k = I - 2\frac{v_kv_k^T}{v_k^Tv_k}.$$

Consider a dense vector of order $n$,

$$x = \begin{bmatrix} x_1 \\ \vdots \\ x_k \\ \vdots \\ x_n \end{bmatrix}.$$

Let us select the elements of the $v_k$ vector as follows:

- zeroes in the first $k-1$ terms,
- the value of $x_k + \alpha\,\mathrm{sign}(x_k)$ for the $k$-th element, and
- the elements of the $x$ vector in the $k+1$-th to the $n$-th element.

Here

$$\alpha = \sqrt{\sum_{i=k}^{n}x_i^2}.$$

With such a vector,

$$v_k = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ x_k + \alpha\,\mathrm{sign}(x_k) \\ x_{k+1} \\ \vdots \\ x_n \end{bmatrix},$$


the following is true:

$$P_kx = \begin{bmatrix} x_1 \\ \vdots \\ x_{k-1} \\ -\alpha\,\mathrm{sign}(x_k) \\ 0 \\ \vdots \\ 0 \end{bmatrix}.$$

The index $k$ for both $v$ and $P$ indicates the pivoting location. The transformed (reflected) vector has zeroes in the locations from the $k+1$-th to the $n$-th. The $k$-th term is modified, and the first $k-1$ terms are unmodified. The geometric meaning of the operation is that the $x$ vector is reflected through the hyper-plane with normal vector $v_k$, hence the name.

The application of the Householder process to dense matrices results in a reduction to a compact tridiagonal form. This is executed by recursively multiplying a matrix with transformation matrices based on the Householder transformation. With such steps, certain components of some columns and rows of the matrix may be zeroed out.

Let us consider a symmetric $A$ matrix and execute the following sequence of matrix transformations

$$A_r = P_rA_{r-1}P_r,$$

where the $P_r$ matrix is a Householder matrix of the form

$$P_r = I - \beta v_rv_r^T$$

with $\beta = 2/(v_r^Tv_r)$. We choose the elements of $v_r$ such that the rows $r+1$ to $n$ of a vector may be zeroed out. For example, after the $r$-th step of the process the first $r$ columns of the matrix contain zeroes below the sub-diagonal and the first $r$ rows have zeroes beyond the super-diagonal term:

$$A_r = \begin{bmatrix}
x & x & 0 & 0 & 0 & 0 \\
x & x & x & 0 & 0 & 0 \\
0 & x & . & . & . & . \\
0 & 0 & . & x & x & x \\
0 & 0 & . & x & x & x \\
0 & 0 & . & x & x & x
\end{bmatrix}.$$

Here $x$ indicates nonzero terms. It is important to execute these operations in such a manner as to maintain symmetry. In detail, the transformation at step $r$ is

$$A_r = (I - \beta_rv_rv_r^T)A_{r-1}(I - \beta_rv_rv_r^T).$$


Introducing

$$p_r = \beta_rA_{r-1}v_r,$$

the transformation is

$$A_r = A_{r-1} - v_rp_r^T - p_rv_r^T + \beta_rv_r(v_r^Tp_r)v_r^T.$$

Furthermore, with

$$q_r = p_r - \frac{1}{2}v_r\beta_r(v_r^Tp_r),$$

the symmetric step is

$$A_r = A_{r-1} - v_rq_r^T - q_rv_r^T.$$

The latter form is practical for computer implementation. Repeated application from $r = 1$ to $n-2$ results in

$$A_{n-2} = T,$$

where $T$ is tridiagonal.

The importance of this reduction is that it is again a congruence transformation, and as such does not change the eigenvalues of the original matrix. Hence, the eigensolution calculation method for tridiagonal matrices introduced in Section 9.4 is also applicable here.
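A compact sketch of this symmetric reduction, assuming a dense NumPy matrix and following the $p_r$, $q_r$ update form above, might read as follows; the test matrix is hypothetical.

import numpy as np

def householder_tridiagonalize(A):
    """Reduce a symmetric matrix to tridiagonal form by Householder reflections.
    Uses the symmetric rank-two update A_r = A_{r-1} - v q^T - q v^T (dense sketch)."""
    A = A.copy().astype(float)
    n = A.shape[0]
    for r in range(n - 2):
        x = A[r + 1:, r]                          # column part below the sub-diagonal
        alpha = np.linalg.norm(x)
        if alpha == 0.0:
            continue
        v = np.zeros(n)
        v[r + 1] = x[0] + np.sign(x[0]) * alpha if x[0] != 0 else alpha
        v[r + 2:] = x[1:]
        beta = 2.0 / (v @ v)
        p = beta * (A @ v)
        q = p - 0.5 * beta * (v @ p) * v
        A -= np.outer(v, q) + np.outer(q, v)      # symmetric rank-two update
    return A

# Usage: the reduction preserves the eigenvalues of the original matrix.
B = np.random.rand(6, 6); B = (B + B.T) / 2
T = householder_tridiagonalize(B)
print(np.allclose(np.linalg.eigvalsh(T), np.linalg.eigvalsh(B)))
print(np.allclose(np.triu(T, 2), 0.0, atol=1e-12))   # terms beyond the super-diagonal vanish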

9.8 Normal modes analysis case studies

To demonstrate the power of the computational methods introduced in this chapter, we solve another practical engineering problem introduced in Chapter 3, the free, undamped vibration of structures. This is called normal modes analysis in the industry. This solution will be the foundation of the dynamic reduction technique of the next chapter. It is also very important in the dynamic analysis of global structures to avoid vibration conflicts with the environment in which the structure is operating. The computational problem is

$$(K - \lambda M)u = 0.$$

The goal of normal modes analysis is to find the natural frequencies ($\lambda$) and corresponding vibration shapes ($u$) of the structure. Mostly the lowest natural frequencies of the structure are of particular interest to the engineer.


FIGURE 9.3 Trimmed car body model

Occasionally the natural frequencies in a certain range are needed, to assure the avoidance of resonance catastrophes or annoying vibrations in the audible range.

Let us consider a trimmed car body automobile model, shown in Figure 9.3. Such models have all major components of the car, such as wheels, shocks, windows, etc., incorporated, and they lead to large sparse eigenvalue problems solved in the industry mostly by the Lanczos method.

TABLE 9.1
Model statistics of trimmed car body

Model     Number of   Number of   Number of   Number of
data      nodes       shells      solids      rigids
          380,007     361,249     3,762       9,056

Sizes     g           n           f           a
          2,280,042   2,223,139   2,223,109   1,937,282


FIGURE 9.4 Speedup of parallel normal modes analysis

The statistics of Table 9.1 show the characteristics of such a finite element model. Note that they are not the statistics of the illustration model. As was shown earlier, such models usually contain a variety of elements and a large number of constraints and rigid components.

The task of finding the natural frequencies and corresponding mode shapes of such a model is truly an enormous one. The distributed parallel computation of Section 9.5 enables the feasible execution of this task.

The statistics of the computation, encompassing about 900 modes up to 200 Hz, are shown in Table 9.2. The analysis was executed on a cluster of 8 workstations, each containing 8 processors with 1.5 GHz clock cycle. The cluster had a 1 Gigabit Ethernet connection.

The elapsed time utilizing 32 processors is already a feasible execution, considering the work environment and time schedule in typical automobile companies. The efficiency decreases above that, but the speedup is still increasing. It peaks at 56 processors, as shown in Figure 9.4, where the horizontal axis is the number of processors. More efficient implementation of the computational technique may overcome this limit. Furthermore, the technique may scale to over 100 processors with a larger problem or wider frequency range.

TABLE 9.2
Distributed normal modes analysis statistics

I/O       Elapsed    Elapsed    Number of
GByte     min:sec    speedup    processors
2,028.4   523:58      1.00        1
  266.9    83:41      6.26        8
  191.1    45:16     11.57       16
   98.3    34:07     15.35       32
   77.8    27:14     19.23       48
   67.1    24:41     21.22       56
   61.4    27:00     19.40       64

Another model is used to demonstrate the computational complexity of dense component models, also from the automobile industry. A complete engine model, such as shown for example in Figure 9.5, consisted of approximately 12 million node points and 7.5 million elements.

Table 9.3 contains statistics of the matrices in the eigenvalue analysis. The max terms column indicates the maximum number of nonzero terms in the densest column of the matrices. The zero columns of the $M$ matrix constitute the zero subspace of $M$. The max front is the maximum front size of the factor matrix.

TABLE 9.3
Normal modes analysis dense matrix statistics

K matrix        number of rows   nonzero terms   max terms
                35.7 million     1.38 billion    18,957

M matrix        number of rows   zero columns    max terms
                35.7 million     5,317,732       11

Factor matrix   number of rows   nonzero terms   max front
                35.7 million     43.8 billion    30,310

The task of finding the natural frequencies and corresponding mode shapes was executed up to 200 Hz on a workstation with 8 (1.95 GHz) CPUs. The computation required about 680 minutes of elapsed time and 100,000 seconds of CPU time, using all 8 processors of the workstation in a shared memory fashion. 11.5 Terabytes of I/O were executed and 650 Gigabytes of disk footprint were required. The computational complexity of such industrial normal modes applications is overwhelming.

FIGURE 9.5 Engine block model

References

[1] Cullum, J. K. and Willoughby, R. A.; Lanczos algorithms for large symmetric eigenvalue computations, Birkhauser, Boston, 1985

[2] Francis, J. G. F.; The QR transformation I. and II., The Computer Journal, Vol. 4, pp. 265-271, Vol. 5, pp. 332-345, 1961, 1962

[3] Givens, W.; Numerical computation of the characteristic values of a real symmetric matrix, Report ORNL-1574, Oak Ridge National Laboratory, 1954

[4] Householder, A. S.; Unitary triangularization of a non-symmetric matrix, J. Assoc. Comp. Mach., Vol. 5, pp. 339-342, 1958

[5] Komzsik, L.; The Lanczos method: Evolution and Application, SIAM, Philadelphia, 2003

[6] Lanczos, C.; An iteration method for the solution of the eigenvalue problem of linear differential and integral operators, Journal of the National Bureau of Standards, Vol. 49, pp. 409-436, 1952

[7] Parlett, B. N.; The symmetric eigenvalue problem, Prentice-Hall, 1980

[8] Wilkinson, J. H.; The calculation of eigenvectors of co-diagonal matrices, The Computer Journal, Vol. 1, pp. 90-92, 1958


10

Complex Spectral Computations

The damped vibration of structures is described by

$$M\ddot{v} + B\dot{v} + Kv = 0,$$

where $B$ is the damping matrix and $\dot{v}$ refers to the velocity. Executing a Fourier transformation yields the following quadratic eigenvalue problem:

$$(M\lambda^2 + B\lambda + K)\phi = 0.$$

The matrices of this quadratic eigenvalue problem may be complex, and the problem may also have a left-handed solution

$$\psi^H(M\lambda^2 + B\lambda + K) = 0$$

that is different from the right-handed solution. Here and in the following, $^H$ denotes the complex conjugate transpose. The solution of this problem usually results in complex eigenvalues. In order to solve the quadratic eigenvalue problem, a transformation is executed to convert the original quadratic problem to a linear problem of twice the size.

10.1 Complex spectral transformation

First the problem is rewritten as a 2 by 2 block linear problem:

$$\lambda\begin{bmatrix} M & 0 \\ 0 & I \end{bmatrix}\begin{bmatrix} \overline{\phi} \\ \phi \end{bmatrix} + \begin{bmatrix} B & K \\ -I & 0 \end{bmatrix}\begin{bmatrix} \overline{\phi} \\ \phi \end{bmatrix} = 0,$$

where $\overline{\phi} = \lambda\phi$. This equation is now linear, albeit it has some serious limitations. For example, one would need to invert both the mass and the damping matrices and an unsymmetric, indefinite matrix built from the damping and the stiffness matrices, in order to reach a solution. Even though the explicit inverses are not needed, the numerical decompositions of either of these matrices may not be very well defined.


A way to improve on this is to execute a complex spectral transformation, similar to the one introduced in Section 9.1, as

$$\mu = \frac{1}{\lambda - \lambda_0},$$

where now $\lambda_0$ is a possibly complex shift. By substituting and reordering we get:

$$\mu\begin{bmatrix} \overline{\phi} \\ \phi \end{bmatrix} = \begin{bmatrix} -B - M\lambda_0 & -K \\ I & -\lambda_0I \end{bmatrix}^{-1}\begin{bmatrix} M & 0 \\ 0 & I \end{bmatrix}\begin{bmatrix} \overline{\phi} \\ \phi \end{bmatrix}.$$

The latter equation is a canonical form of

$$\mu x = Ax,$$

and the corresponding left-handed problem of

$$\mu y^H = y^HA,$$

where

$$A = \begin{bmatrix} -B - M\lambda_0 & -K \\ I & -\lambda_0I \end{bmatrix}^{-1}\begin{bmatrix} M & 0 \\ 0 & I \end{bmatrix}.$$

This form allows the singularity of the participating matrices; however, their zero subspaces may not coincide. This is a much lighter and more practical restriction.

10.2 Biorthogonal Lanczos reduction

The industry preferred solution for such problems is the bi-orthogonal Lanczos method. In practical implementations the $A$ matrix is not built explicitly; the matrix and its inverse are implicitly used in the eigensolution, as shown in the next section.

The block implementation of the bi-orthogonal Lanczos method [1] generates two sets of bi-orthonormal blocks of vectors $P_i$ and $Q_j$ such that:

$$P_i^HQ_j = I$$

when $i = j$ ($i, j \le n$) and zero otherwise. These vector sets reduce the $A$ matrix to a $T_B$ block tridiagonal matrix form:

$$T_B = \overline{P}_j^HA\overline{Q}_j,$$


where the

$$\overline{P}_j = [P_1, P_2, \ldots, P_j]$$

and

$$\overline{Q}_j = [Q_1, Q_2, \ldots, Q_j]$$

matrices are the collections of the Lanczos blocks. The structure of the block tridiagonal matrix is:

$$T_B = \begin{bmatrix}
A_1 & B_2 & & \\
C_2 & A_2 & \ddots & \\
& \ddots & \ddots & B_j \\
& & C_j & A_j
\end{bmatrix}.$$

The bi-orthogonal block Lanczos process is manifested in the following three-term recurrence matrix equations:

$$B_{j+1}P_{j+1}^H = P_j^HA - A_jP_j^H - C_jP_{j-1}^H$$

and

$$Q_{j+1}C_{j+1} = AQ_j - Q_jA_j - Q_{j-1}B_j.$$

Note that in these equations the transpose of the matrix $A$ is avoided.

In order to find the mathematical eigenvalues and eigenvectors we solve two tridiagonal eigenvalue problems posed as:

$$w^HT_J = \mu w^H$$

and

$$T_Jz = \mu z,$$

where the size of the scalar tridiagonal matrix $T_J$ is $j \times p$, assuming a block size $p$. It is derived from the block tridiagonal matrix $T_B$ with Givens transformations. The $\mu$ eigenvalues of the tridiagonal problem are approximations of the $\lambda$ eigenvalues of the mathematical problem. The approximations to the eigenvectors of the original problem are calculated from the eigenvectors of the tridiagonal problem by:

$$y = \overline{P}_jw$$

and

$$x = \overline{Q}_jz,$$

where again $\overline{P}_j, \overline{Q}_j$ are the matrices containing the first $j$ Lanczos blocks of vectors and $w, z$ are the left and right eigenvectors of the tridiagonal problem. Finally, $x, y$ are the right and left approximated eigenvectors of the mathematical problem.


The useful aspect of the Lanczos method exploited earlier, that the error norm of the original problem may be calculated from the tridiagonal solution without calculating the eigenvectors, is applicable here too. For a similar arrangement, let us introduce a rectangular $n \times p$ matrix $E_j$ having an identity matrix as its bottom square. Using this, a residual vector for the left-handed solution is:

$$s^H = y^HA - \mu y^H = (w^HE_j)B_{j+1}P_{j+1}^H,$$

which means that only the bottom $p$ (if the block size is $p$) terms of the new eigenvector $w$ are required (due to the structure of $E_j$). Similarly, for the right-handed vectors:

$$r = Ax - \mu x = Q_{j+1}C_{j+1}(E_j^Hz).$$

An acceptance criterion (an extension of the one used in the previous chapter) may be based on the norms of the above residual vectors as:

$$\min\left(\frac{||s^H||}{||y^H||}, \frac{||r||}{||x||}\right) \le \varepsilon_{acceptance},$$

where the $\varepsilon_{acceptance}$ value to accept convergence is again given based on the physical problem. The $||.||$ denotes the Euclidean norm.

The physical eigenvalues may easily be recovered from the backward substitution of the spectral transformation:

$$\lambda = \frac{1}{\mu} + \lambda_0.$$

The physical eigenvectors are partitioned as follows:

$$x = \begin{bmatrix} \overline{\phi} \\ \phi \end{bmatrix},$$

and

$$y^H = \begin{bmatrix} \overline{\psi}^H & \psi^H \end{bmatrix}.$$

10.3 Implicit operator multiplication

The operator multiplication step of the bi-orthogonal Lanczos algorithm is crucial for efficiency. The implicit process described here exploits the structure of the $A$ matrix:

$$A = \begin{bmatrix} -B - M\lambda_0 & -K \\ I & -\lambda_0I \end{bmatrix}^{-1}\begin{bmatrix} M & 0 \\ 0 & I \end{bmatrix}.$$


It is clear that the matrix need not be built explicitly. In the following, the implicit execution of the operator matrix multiplication in both the transpose and non-transpose case is detailed [4].

In the non-transpose case any $z = Ax$ operation in the process will be identical to the

$$\begin{bmatrix} -B - M\lambda_0 & -K \\ I & -\lambda_0I \end{bmatrix}z = \begin{bmatrix} M & 0 \\ 0 & I \end{bmatrix}x$$

solution of systems of equations. Let us consider the partitioning of the vectors according to the $A$ matrix partitions:

$$\begin{bmatrix} -B - M\lambda_0 & -K \\ I & -\lambda_0I \end{bmatrix}\begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \begin{bmatrix} M & 0 \\ 0 & I \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}.$$

Developing the first row of this matrix equation results in

$$(-B - M\lambda_0)z_1 - Kz_2 = Mx_1.$$

Similarly, developing the second row we have

$$z_1 = \lambda_0z_2 + x_2.$$

Substituting the latter into the first row we obtain

$$(-B - M\lambda_0)(\lambda_0z_2 + x_2) - Kz_2 = Mx_1.$$

Some reordering yields the computational form of

$$-(K + \lambda_0B + \lambda_0^2M)z_2 = Mx_1 + (B + \lambda_0M)x_2.$$

The latter formulation has significant advantages. Besides avoiding the explicit formulation of $A$, the decomposition of the $2N$ size problem is also avoided.
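The non-transpose case can be illustrated by the following hypothetical dense NumPy sketch, which applies the operator through the derived size-$n$ solve and checks the result against an explicitly built $A$; complex data is used since the shift $\lambda_0$ and the matrices may be complex.

import numpy as np

def apply_A_implicit(K, B, M, lam0, x):
    """Apply z = A x without forming A (dense sketch of the non-transpose case):
    -(K + lam0*B + lam0^2*M) z2 = M x1 + (B + lam0*M) x2,  z1 = lam0*z2 + x2."""
    n = K.shape[0]
    x1, x2 = x[:n], x[n:]
    rhs = M @ x1 + (B + lam0 * M) @ x2
    z2 = np.linalg.solve(-(K + lam0 * B + lam0**2 * M), rhs)   # one size-n solve
    z1 = lam0 * z2 + x2
    return np.concatenate([z1, z2])

# Verification against the explicitly built 2n-size operator (hypothetical data):
n = 5
rng = np.random.default_rng(1)
K = rng.random((n, n)) + 1j * rng.random((n, n))
B = rng.random((n, n)); M = rng.random((n, n)) + n * np.eye(n)
lam0 = 0.3 + 0.1j
top = np.hstack([-B - M * lam0, -K])
bot = np.hstack([np.eye(n), -lam0 * np.eye(n)])
mass = np.vstack([np.hstack([M, np.zeros((n, n))]),
                  np.hstack([np.zeros((n, n)), np.eye(n)])])
A = np.linalg.solve(np.vstack([top, bot]), mass)
x = rng.random(2 * n) + 1j * rng.random(2 * n)
print(np.allclose(A @ x, apply_A_implicit(K, B, M, lam0, x)))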

It is important that the transpose operation be executed without any matrix transpose at all. In this case any $y^T = x^TA$ operation in the process will be identical to the

$$y^T = x^T\begin{bmatrix} -B - M\lambda_0 & -K \\ I & -\lambda_0I \end{bmatrix}^{-1}\begin{bmatrix} M & 0 \\ 0 & I \end{bmatrix}$$

operation. Let us introduce an intermediate vector $z$:

$$z^T = x^T\begin{bmatrix} -(B + M\lambda_0) & -K \\ I & -\lambda_0I \end{bmatrix}^{-1}.$$

We partition this vector also according to the matrix partitions and transpose to obtain

$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -(B + M\lambda_0)^T & I \\ -K^T & -\lambda_0I \end{bmatrix}\begin{bmatrix} z_1 \\ z_2 \end{bmatrix}.$$

From the first row of this equation we obtain

$$x_1 = (-B - M\lambda_0)^Tz_1 + z_2,$$

and from the second row:

$$x_2 = -K^Tz_1 - \lambda_0z_2.$$

Expressing $z_2$ from the first of these equations, substituting it into the second and reordering yields

$$x_2 + \lambda_0x_1 = -(K^T + \lambda_0B^T + \lambda_0^2M^T)z_1.$$

From this the computational solution form becomes

$$z_1 = -(K^T + \lambda_0B^T + \lambda_0^2M^T)^{-1}(x_2 + \lambda_0x_1).$$

The lower part of the $z$ vector is recovered from the prior equation as

$$z_2 = x_1 + (B + \lambda_0M)^Tz_1.$$

The latter two equations will also be used in the recovery of the left-handed physical eigenvectors. Finally,

$$y^T = z^T\begin{bmatrix} M & 0 \\ 0 & I \end{bmatrix}.$$

This formulation has even more significant advantages than the non-transpose case. Besides avoiding the explicit formulation of $A$ and the decomposition of the double size problem, the transpose may also be avoided with left-handed multiplications and forward-backward substitution.

10.4 Recovery of physical solution

The physical eigenvalues may be recovered from the backward substitution of the complex spectral transformation:

$$\lambda = \frac{1}{\Lambda} + \lambda_0.$$


In order to find the relationship between the mathematical and physical eigenvectors, let us write the block form of

$$(\lambda\overline{M} + \overline{K})x = 0,$$

where

$$x = \begin{bmatrix} \overline{\phi} \\ \phi \end{bmatrix}.$$

The block matrices are simply

$$\overline{K} = \begin{bmatrix} B & K \\ -I & 0 \end{bmatrix},$$

and

$$\overline{M} = \begin{bmatrix} M & 0 \\ 0 & I \end{bmatrix}.$$

Substituting and reordering yields

$$[(\overline{K} + \lambda_0\overline{M})^{-1}\overline{M} + \Lambda I]x = 0.$$

This proves that the right eigenvectors are invariant under the spectral transformation, i.e. the right physical eigenvectors are the same as their mathematical counterparts, apart from the appropriate partitioning [5].

For the left-handed problem we also use the block notation

$$y^H(\lambda\overline{M} + \overline{K}) = 0$$

with a left-handed physical eigenvector of

$$y^H = \begin{bmatrix} \overline{\psi}^H & \psi^H \end{bmatrix}.$$

Substituting again and introducing an appropriate identity matrix to accommodate the left multiplication yields

$$y^H(\overline{K} + \lambda_0\overline{M})(\overline{K} + \lambda_0\overline{M})^{-1}[\overline{M} + (\overline{K} + \lambda_0\overline{M})\Lambda] = 0.$$

By multiplying we obtain

$$y^H(\overline{K} + \lambda_0\overline{M})[(\overline{K} + \lambda_0\overline{M})^{-1}\overline{M} + \Lambda I] = 0,$$

which is equivalent to

$$[(\overline{K} + \lambda_0\overline{M})^Hy]^H[(\overline{K} + \lambda_0\overline{M})^{-1}\overline{M} + \Lambda I] = 0.$$

Since the original mathematical problem we solve is

$$\overline{y}^H[(\overline{K} + \lambda_0\overline{M})^{-1}\overline{M} + \Lambda I] = 0,$$


the left-handed physical eigenvectors are not invariant under the transformation. Comparing gives the following relationship:

$$\overline{y}^H = [(\overline{K} + \lambda_0\overline{M})^Hy]^H,$$

or, expanding into the solution terms,

$$\overline{y} = -\begin{bmatrix} -B - M\lambda_0 & -K \\ I & -\lambda_0I \end{bmatrix}^Hy.$$

Finally,

$$y = -\begin{bmatrix} -B - M\lambda_0 & -K \\ I & -\lambda_0I \end{bmatrix}^{-1,H}\overline{y}.$$

The cost of this back-transformation is not very large since the factors of the operator matrix are available; we need a forward-backward substitution only.

10.5 Solution evaluation

Since the mathematical solution is significantly different from the physical solution, some additional accuracy considerations are needed. From the eigenvalue solution it will be guaranteed that the left and right mathematical eigenvectors are orthonormal to computational accuracy:

$$Y^HX = \overline{I}.$$

Here $\overline{I}$ is an approximate identity matrix with off-diagonal terms that are computational zeroes. Monitoring these terms and printing the largest one, or the ones above a certain threshold, provides a final orthogonality check.

Based on the physical eigenvalues recovered by the shift formulae and the physical eigenvectors, another orthogonality criterion can be formed. Using the left and right solutions, the following equations are true for the problem:

$$(M\lambda_i^2 + B\lambda_i + K)\phi_i = 0,$$

and

$$\psi_j^H(M\lambda_j^2 + B\lambda_j + K) = 0.$$

By appropriate pre- and post-multiplications we get

$$\psi_j^H(M\lambda_i^2 + B\lambda_i + K)\phi_i = 0,$$

and

$$\psi_j^H(M\lambda_j^2 + B\lambda_j + K)\phi_i = 0.$$


A subtraction yields

$$\psi_j^HM\phi_i(\lambda_i^2 - \lambda_j^2) + \psi_j^HB\phi_i(\lambda_i - \lambda_j) = 0.$$

Assuming $\lambda_j \neq \lambda_i$ we can divide by $(\lambda_i - \lambda_j)$ and obtain an orthogonality criterion as:

$$O_1 = \psi_j^HM\phi_i(\lambda_i + \lambda_j) + \psi_j^HB\phi_i.$$

The $O_1$ matrix also has computational zeroes as off-diagonal terms (when $i \neq j$) and nonzero (containing $2\lambda_i$) diagonal terms.

Now pre-multiplying by $\lambda_j\psi_j^H$, post-multiplying by $\lambda_i\phi_i$ and subtracting, we obtain

$$\lambda_j\psi_j^H(M\lambda_i^2 + B\lambda_i + K)\phi_i - \psi_j^H(M\lambda_j^2 + B\lambda_j + K)\lambda_i\phi_i = 0.$$

By expanding and canceling we get

$$(\lambda_i - \lambda_j)\psi_j^HM\phi_i\lambda_i\lambda_j + (\lambda_j - \lambda_i)\psi_j^HK\phi_i = 0.$$

Assuming again that $\lambda_j \neq \lambda_i$, we can divide by $(\lambda_i - \lambda_j)$ and obtain another orthogonality criterion, recommended mainly for the structural damping option (the case when there is no $B$ matrix, but damping is present in the $K$ matrix), as

$$O_2 = \lambda_j\psi_j^HM\lambda_i\phi_i - \psi_j^HK\phi_i.$$

This orthogonality matrix will also have zero off-diagonal terms, but nonzero (containing $\lambda_i^2$) diagonal terms.
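A numerical illustration of the $O_1$ criterion on a small, hypothetical quadratic pencil is sketched below; the left and right eigenvectors are obtained here through the block linearization of Section 10.1 and SciPy's generalized eigensolver rather than the Lanczos process.

import numpy as np
from scipy.linalg import eig

n = 4
rng = np.random.default_rng(2)
M = np.diag(rng.random(n) + 1.0)
K = rng.random((n, n)); K = K @ K.T + n * np.eye(n)
B = 0.1 * (K + M)                                   # proportional damping for the toy problem

Zero, Id = np.zeros((n, n)), np.eye(n)
a = -np.block([[B, K], [-Id, Zero]])                # lam * b * x = a * x (linearized pencil)
b = np.block([[M, Zero], [Zero, Id]])
lam, Y, X = eig(a, b, left=True, right=True)

Phi = X[n:, :]                                      # right physical eigenvectors (lower block)
Psi = Y[:n, :]                                      # left physical eigenvectors (upper block)

# O1(j,i) = psi_j^H M phi_i (lam_i + lam_j) + psi_j^H B phi_i  -- off-diagonals ~ 0
O1 = (Psi.conj().T @ M @ Phi) * (lam[None, :] + lam[:, None]) + Psi.conj().T @ B @ Phi
off = O1 - np.diag(np.diag(O1))
print(np.max(np.abs(off)) / np.max(np.abs(np.diag(O1))))   # small ratio indicates orthogonality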

10.6 Reduction to Hessenberg form

For dense, unsymmetric matrices the Householder reduction technique of the last chapter is still applicable, but it will result in an upper Hessenberg form. Let us review this transformation in detail. Due to the lack of symmetry, the pre- and post-multiplications must be considered separately. First we pre-multiply:

$$P_rA_{r-1} = (I - \beta v_rv_r^T)A_{r-1} = A_{r-1} - \beta v_r(v_r^TA_{r-1}) = B_r.$$

Because $v_r$ has zeroes in its first $r-1$ rows, the pre-multiplication leaves the first $r-1$ rows of $A_{r-1}$ unchanged. The post-multiplication of

$$A_r = B_rP_r = B_r(I - \beta v_rv_r^T) = B_r - \beta(B_rv_r)v_r^T$$

will leave the first $r-1$ columns of $A_{r-1}$ unchanged.


This may be computationally efficiently executed by the following steps:

1. $p_r^T = v_r^TA_{r-1}$
2. $B_r = A_{r-1} - 2v_rp_r^T$
3. $q_r = B_rv_r$
4. $A_r = B_r - 2q_rv_r^T$

The latter steps enable the execution of this computation in the case when the complete $A$ matrix does not reside in memory, a common occurrence in the industry.

Finally, after $n-2$ transformations we obtain

$$A_{n-2} = H,$$

where $H$ is a matrix in upper Hessenberg form, having all terms below the sub-diagonal zero:

$$H = \begin{bmatrix}
x & x & x & x & x \\
x & x & x & x & x \\
0 & x & x & x & x \\
0 & 0 & x & x & x \\
0 & 0 & 0 & x & x
\end{bmatrix}.$$

This reduction is again a congruence transformation. The eigenvalue calculation method for tridiagonal matrices introduced in Section 9.4 is also applicable here; however, the final result is an upper triangular matrix, as opposed to a diagonal matrix.

10.7 Rotating component application

An important engineering application resulting in the need for complex modal reduction is the analysis of rotating components. The gyroscopic effects are modeled by the Coriolis matrix that is related to the velocity term of the analysis. The Coriolis matrix of a single node point for a rotation about the z axis is formed as

C_i = \begin{bmatrix} 0 & -m & 0 & 0 & 0 & 0 \\ m & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & I & 0 \\ 0 & 0 & 0 & -I & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix},

where

I = \frac{1}{2}(I_{zz} - (I_{xx} + I_{yy})),

and the I_{**} terms are the second order moments of inertia about the respective axes. The C_i matrices are assembled into the global Coriolis matrix C [6]. For a symmetric rotor, a most practical application, the inertia term simplifies, since

I_{xx} = I_{yy}.

Furthermore, the centripetal force results in a phenomenon called centrifugal softening, and the matrix describing this for a single node for rotation about the z axis is

H_i = \begin{bmatrix} m & 0 & 0 & 0 & 0 & 0 \\ 0 & m & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -I_{zz} + I_{yy} & 0 & 0 \\ 0 & 0 & 0 & 0 & -I_{zz} + I_{xx} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix},

where m is the mass associated with the single node. These matrices are also assembled into the global centrifugal softening matrix H.

There are three main sources of damping in a rotating system. The internal damping of the rotor and the external damping acting on the bearing populate the damping matrix B. The third damping source is the structural damping related to the displacement, which as such appears as an anti-symmetric stiffness matrix component K_b.

With these, the free vibration of a rotating structure in a rotating reference system is described by [3]

(λ^2 M + iλ(B + 2ΩC) + (K + K_b - Ω^2 H))φ = 0.

The matrices M, B and K are the regular mass, damping and stiffness matrices of earlier chapters. The C Coriolis matrix is anti-symmetric, but the H centrifugal softening matrix is symmetric.

All matrices are real, except for the stiffness matrix K, which may be complex if there is structural damping also applied to the model.

Ω is the rotational speed and the eigenvalues appear in complex conjugate pairs as

λ = α ± iω.

The natural frequency of the mode shapes is

f = \frac{ω}{2π},

and the damping coefficient is

g = \frac{2α}{ω}.

Plotting the frequencies as a function of the rotation speed results in the Campbell diagram [2], such as shown in Figure 10.1. The figure shows the results translated into a fixed reference system, since that is easier for the engineer to interpret. The translation between the fixed and rotational frames of reference is done by simply adding or subtracting the rotor speed from the curves.

The 1P (one per revolution) line starting from the origin in the figure represents the locations of equal frequency and rotor speed. Intersections of the modes with this curve pinpoint the critical speed locations. Those locations should be avoided in the operational range of the rotor to avoid resonance catastrophes.

The example in the figure has four modes plotted. The first three modes intersect the 1P line, resulting in three critical speeds. The negative slope curves of modes 1 and 3 represent backward whirl and the positive slope curves of modes 2 and 4 represent forward whirl motion. The forward whirl motion, also called direct whirl, is in the direction of the rotation, while the backward or reverse whirl is opposite.

The whirl direction may be computed from the eigenvectors as follows. The real and imaginary components of every node point are gathered into two vectors whose ith terms are

Ψ^{Re}(i) = \begin{bmatrix} φ_x^{Re}(i) \\ φ_y^{Re}(i) \\ 0 \end{bmatrix},


FIGURE 10.1 Campbell diagram

and

Ψ^{Im}(i) = \begin{bmatrix} φ_x^{Im}(i) \\ φ_y^{Im}(i) \\ 0 \end{bmatrix}.

The whirl direction is forward if the vector

w = Ψ^{Re} × Ψ^{Im}

points to the positive z-direction, backwards otherwise.
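A small sketch of this test for a single node is given below; the function name and the scalar arguments phi_x, phi_y (the complex eigenvector components of the node's in-plane displacements) are illustrative assumptions.

```python
import numpy as np

def whirl_direction(phi_x, phi_y):
    """Classify the whirl of one node from its complex eigenvector components."""
    psi_re = np.array([phi_x.real, phi_y.real, 0.0])
    psi_im = np.array([phi_x.imag, phi_y.imag, 0.0])
    w = np.cross(psi_re, psi_im)          # w points along +z for forward whirl
    return "forward" if w[2] > 0 else "backward"
```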

By plotting the real parts of the eigenvalues as a function of the rotation speed, such as shown in Figure 10.2, the regions of instability of the rotor may be established. Unstable regions are where the real part of any mode is positive. In the case of this example, the region of instability starts at approximately 5,000 RPM, where mode 2 crosses the horizontal axis. Both of these graphs and the related computations are very important for the engineer analyzing rotational machinery.

FIGURE 10.2 Stability diagram

The fact that these graphs were plotted in the fixed frame of reference while computed in a rotating frame is noteworthy. The analysis of rotating components may also be directly executed in a fixed reference system. In this case, however, Steiner inertia terms of the form

\frac{1}{2} m r^2

need to be added to the matrices, and the rotor must be symmetric. For unsymmetric rotor models, such as helicopter rotors, propellers and windmills, using the rotational reference system is necessary.

10.8 Complex modal analysis case studies

As a first application example we consider another automobile component, the brake. The physical phenomenon of braking is a stick-slip type vibration between the friction pads on the caliper and the rotor.

The friction between the pads and the rotor is a function of both pressure and velocity. The pressure-based friction, sometimes called friction stiffness, creates a relationship between the normal and tangential components of the brake elements. The mathematical effect of this is the asymmetry of the K stiffness matrix of the finite element model. The velocity-based friction, also known as friction damping, introduces the B damping matrix of the model.


FIGURE 10.3 Brake model

A brake model of a test problem (such as shown in Figure 10.3) had the model and execution statistics shown in Table 10.1.

TABLE 10.1 Statistics of brake model

Model:      Nodes    Elements   g-size     f-size
            50,305   33,750     301,833    298,724

Execution:  Modes    CPU-sec    I/O-GB     Elapsed
            190      5,098.6    393.4      86:35

The complex eigenvalue analysis of this brake model was executed with the block, bi-orthogonal Lanczos method described above. The workstation used had a 1.5 GHz clock speed and the elapsed time shown in the table is in minutes:seconds.

The statistics demonstrate the time-consuming nature of the complex modal calculation. In essence, the expense of computing complex modes of an unsymmetric problem is four-fold. In the normal modes analysis case study in Section 9.6 we computed a little more than four times as many eigenvectors of a seven times larger problem in about the same time, using eight processors.

Models with higher node and element counts commonly occur in the rotating machinery industry. A part of such a model is shown in Figure 10.4 and the partial view into the cylinder from the left side hints at a difficult interior.

FIGURE 10.4 Rotating machinery model


Table 10.2 shows the matrix statistics of the rotating machinery model. Note that the stiffness matrix was complex. The "max terms" entry represents the maximum number of terms in the densest column of the matrix. The automatically generated, very detailed model consisted of 213,000 nodes and 130,000 elements. This model clearly demonstrates the computational complexity of rotating dynamic applications.

TABLE 10.2 Complex eigenvalue analysis statistics

K matrix:       number of rows   nonzero terms   max terms
                638,057          24.8 million    405

M matrix:       number of rows   zero columns    max terms
                638,057          368             1

Factor matrix:  number of rows   nonzero terms   max front
                638,057          670 million     7,668

The implicit solution technique detailed in this chapter and the block unsymmetric bi-orthogonal method were used in the complex eigenvalue analysis of this model. The analysis took 250 elapsed minutes on a workstation containing 4 (1.5 GHz) CPUs. Of this, about 170 minutes were spent in the complex eigenvalue analysis computation extracting the complex eigenpairs at each rotational frequency.

The amount of I/O was 1.2 GBytes and the disk footprint was 16 Gigabytes. This is mainly due to the fact that the structural matrices were residing out of core, a commonplace phenomenon in industry.

References

[1] Bai, Z. et al.; ABLE: an adaptive block Lanczos method for non-Hermitian eigenvalue problems, SIAM Journal on Matrix Analysis and Applications, Vol. 20, pp. 1060-1082, 1999

[2] Campbell, W.; Protection of steam turbine disk wheels from axial vibration, Transactions of ASME, No. 46, pp. 31-160, 1924

[3] Genta, G.; Dynamics of rotating systems, Springer, New York, 2005

[4] Komzsik, L.; Implicit solution of quadratic eigenvalue problems, Finite Elements in Analysis and Design, Vol. 35, pp. 799-810, Elsevier, 2001

[5] Komzsik, L.; The Lanczos method: Evolution and Application, SIAM, Philadelphia, 2003

[6] Vollan, A.; GAROS reference manual, AeroFEM GmbH, 1995


11

Dynamic Reduction

In this chapter, we formulate a reduction technique applicable to dynamic analysis problems. This is a demanding technique and only approximate solutions are produced. However, by virtue of the reduction, the actual computational complexity may be reduced. The relevant publications listed in the references range from the classical [2] to the recent [4] and from the theoretical to the practical [9].

We discuss the procedure in the context of the undamped, free vibration or normal modes problem discussed in the chapter before last. This analysis problem is described as

[K_{aa} - λM_{aa}]u_a = 0,

where M_{aa} and K_{aa} are the a partition mass and stiffness matrices, and u_a is the eigenvector of the structure corresponding to eigenvalue λ.

11.1 Single-level, single-component dynamic reduction

Following the single-level, single-component static condensation principle, the normal modes problem may also be partitioned into interior and boundary components:

\begin{bmatrix} K_{oo} - λM_{oo} & K_{ot} - λM_{ot} \\ K_{ot}^T - λM_{ot}^T & K_{tt} - λM_{tt} \end{bmatrix} \begin{bmatrix} u_o \\ u_t \end{bmatrix} = 0.

Let us first solve the interior eigenvalue problem of

[K_{oo} - λM_{oo}]φ_o = 0.

We construct a diagonal matrix Λ_{q_oq_o} containing q_o eigenvalues and a rectangular matrix Φ_{oq_o} that contains all corresponding φ_o eigenvectors. The latter are called the fixed boundary interior modes, as this equation assumes that the boundary degrees of freedom have zero displacement, i.e., the boundary is fixed. They are usually, but not necessarily, chosen to be mass-orthogonal, so the so-called generalized mass is the identity:

Φ_{oq_o}^T M_{oo} Φ_{oq_o} = I_{q_oq_o}.

The generalized stiffness yields the eigenvalues

Φ_{oq_o}^T K_{oo} Φ_{oq_o} = Λ_{q_oq_o}.

The number of eigenvalues extracted from the interior problem is q_o and it is usually much less than the o partition size. This is the result of a compromise. On one hand, the numerical accuracy of the solution increases with a higher number of interior modes. On the other hand, reducing the problem size by retaining fewer than all the modes produces the computational advantage.

In the industry, the number of eigenvalues extracted from the o partition depends on the width of the frequency range in which the global modes are sought. A heuristic value of 1.5 times the width of the global frequency range is used to set the local frequency range of interest. The accuracy of the reduction is addressed in more detail in the next section.

Similarly, the eigenvalue problem of the boundary

[K_{tt} - λM_{tt}]φ_t = 0

produces the so-called boundary mode shapes, gathered into Φ_{tq_t}. In this case, q_t refers to the number of eigenvalues found in the boundary problem and Λ_{q_tq_t} is a diagonal matrix containing those eigenvalues. The dynamic reduction transformation matrix is formed as

S = \begin{bmatrix} Φ_{oq_o} & 0 \\ 0 & Φ_{tq_t} \end{bmatrix}.

Note that this matrix is rectangular, as the numbers of mode shapes extracted from the interior and from the boundary are both less than the sizes of their partitions. S has d = q_t + q_o columns and t + o rows. Pre-multiplying by S^T and substituting

S u_d = u_a

yields

S^T [K_{aa} - λM_{aa}] S u_d = 0.

Executing the multiplications produces the dynamically reduced problem of

[K_{dd} - λM_{dd}]u_d = 0.

The reduced problem in detail is

\begin{bmatrix} Λ_{q_oq_o} - λI_{q_oq_o} & K_{q_oq_t} - λM_{q_oq_t} \\ K_{q_oq_t}^T - λM_{q_oq_t}^T & Λ_{q_tq_t} - λI_{q_tq_t} \end{bmatrix} \begin{bmatrix} u_{q_o} \\ u_{q_t} \end{bmatrix} = 0.

Here the transformed coupling matrices are

M_{q_oq_t} = Φ_{oq_o}^T M_{ot} Φ_{tq_t},

and

K_{q_oq_t} = Φ_{oq_o}^T K_{ot} Φ_{tq_t}.

The transformed eigenvector components are defined by

Φ_{oq_o} u_{q_o} = u_o,

and

Φ_{tq_t} u_{q_t} = u_t.

The recovery of the original size eigenvector (which is the subject of the engineer's interest) is simply the application of the last two equations.

The reduced problem's size is q_o + q_t and it has a very specific structure. The matrices are diagonal apart from the coupling terms. Clearly there are computational advantages to be gained by solving this reduced problem. On the other hand, there is a truncation error introduced, which will be addressed in detail in the next section as well as in Chapter 13.

The dynamic behavior of the finite element model may be (computationally) exactly reproduced if the dynamic reduction executed on the stiffness and mass matrices is done with all the eigenvectors, the case of the full reduction. There may still be a computational advantage in calculating the response of a system with the matrices of the full reduction, as they are much sparser, even though they are of the same dimensions as the original matrices.
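A minimal NumPy/SciPy sketch of the single-level, single-component procedure is shown below; the function name, the argument layout and the use of dense matrices are illustrative assumptions intended for small problems only.

```python
import numpy as np
from scipy.linalg import eigh

def dynamic_reduction(Koo, Moo, Kot, Mot, Ktt, Mtt, qo, qt):
    """Retain qo fixed boundary interior modes and qt boundary modes,
    build the rectangular S and return the reduced pencil (Kdd, Mdd)."""
    _, Phi_o = eigh(Koo, Moo)              # mass-orthogonal interior modes
    _, Phi_t = eigh(Ktt, Mtt)              # mass-orthogonal boundary modes
    Phi_o, Phi_t = Phi_o[:, :qo], Phi_t[:, :qt]
    o, t = Koo.shape[0], Ktt.shape[0]
    S = np.zeros((o + t, qo + qt))         # block diagonal transformation
    S[:o, :qo] = Phi_o
    S[o:, qo:] = Phi_t
    Kaa = np.block([[Koo, Kot], [Kot.T, Ktt]])
    Maa = np.block([[Moo, Mot], [Mot.T, Mtt]])
    return S.T @ Kaa @ S, S.T @ Maa @ S, S
```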

11.2 Accuracy of dynamic reduction

While our focus has been and remains on the computational side of these techniques, a few words on the accuracy of dynamic reduction are warranted. Recall the original analysis problem of

K_{aa} u_a = λ_a M_{aa} u_a.

The transformations of

K_{dd} = S^T K_{aa} S,

M_{dd} = S^T M_{aa} S,

and

u_a = S u_d

resulted in the reduced problem:

K_{dd} u_d = λ_d M_{dd} u_d.

Note the distinction made between the eigenvalues computed from the two problems for this discussion. The eigensolution of the reduced problem is an approximation from the subspace \mathcal{S} spanned by S. Let us evaluate the accuracy of the eigensolution obtained from the reduced problem.

Assume that the analytical eigenvalues of the unreduced problem are

λ_{a1} ≤ λ_{a2} ≤ ... ≤ λ_{an}

and the eigenvalues of the reduced problem are

λ_{d1} ≤ λ_{d2} ≤ ... ≤ λ_{dk}.

Here n is the number of eigenvalues of the unreduced problem and k is that of the reduced problem. The value of k is the dimension of the subspace from which the approximate solution is obtained, specifically

k = q_o + q_t.

Let us define the norm of a vector x with respect to the mass matrix M as

||x||_M = \sqrt{x^T M x}

and the angle between the vector and a subspace \mathcal{S} spanned by S in an M-inner product space as

\cos∠(x, \mathcal{S})_M = \frac{||S^T M x||_2}{||x||_M}.

Here we used the assumption of

S^T M S = I.

Based on these it is possible to give bounds for the solutions of the reduced problem [7]. For the eigenvalues the inequality

λ_{d1} - λ_{a1} ≤ (λ_{an} - λ_{a1}) \sin^2∠(u_{a1}, \mathcal{S})_M

holds. In this context the M matrix is the M_{aa} matrix. This means that the error between the eigenvalue of the reduced problem and the corresponding ("exact") eigenvalue of the unreduced problem is bounded by the angle between the corresponding eigenvector and the subspace from which the reduced solution was obtained.

In the case of a full reduction, q_o = o, q_t = t, the subspace has the same dimension as the original space, k = n, and the angle between the eigenvector and the subspace is zero. This implies that the reduced problem solution is computationally exact. The reduction is then a congruence transformation: since S is nonsingular, the eigenvalues of the (K_{aa}, M_{aa}) matrix pencil are identical to those of the (K_{dd}, M_{dd}) pencil.

The λ_{an} - λ_{a1} term on the right-hand side represents the eigenspectrum of the problem. This implies that the accuracy of the dynamic reduction also depends on the segment of the eigenspectrum represented by the component eigenvectors. Usually the lower end of the spectrum up to a certain frequency is of practical interest, however, with the help of the spectral transformation described in Section 9.1, the focus may be placed on the upper end or the middle of the spectrum.

For the error between the eigenvector from the reduced problem and the "exact" eigenvector of the unreduced problem, the bound is

\sin∠(\overline{u}_{d1}, u_{a1})_M ≤ \sqrt{\frac{λ_{an} - λ_{a1}}{λ_{a2} - λ_{a1}}} \sin∠(u_{a1}, \mathcal{S})_M.

Here \overline{u}_{d1} is an appropriately inflated a-size version of u_{d1}. The λ_{a2} - λ_{a1} term is indicative of the spacing of the eigenvalues. In both bounds the angle term may be computed by the definition as

\sin∠(u_{a1}, \mathcal{S})_M = \sqrt{1 - \left(\frac{||S^T M x||_2}{||x||_M}\right)^2}.

For a large problem and a large subspace size, the computation of the angle between the eigenvector and the subspace may be simplified to

\sin∠(u_{a1}, \mathcal{S})_M = \sin∠(u_{a1}, v_1)_M,

where the v_1 vector is the vector closest to the eigenvector in the M-inner product space:

v_1 : \min_{v ∈ \mathcal{S}} ||u_{a1} - v||_M.

These bounds are a posteriori in nature. There is ongoing research and there are some recent results in providing a priori bounds, i.e., bounds in terms of the reduced problem's solution, see [8] for example. These methods are not widely accepted yet in the industry and will not be discussed further here. The effect of the dynamic reduction on the accuracy of the response of the system will be addressed in more detail in Chapter 13.
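The a posteriori eigenvalue bound above is easy to evaluate once the reduction basis is available. The sketch below assumes an M-orthonormal basis S (S^T M S = I), a reference eigenvector u of the unreduced problem, and the extreme eigenvalues lam_min, lam_max; all names are illustrative.

```python
import numpy as np

def eigenvalue_error_bound(S, M, u, lam_min, lam_max):
    """Evaluate (lam_an - lam_a1) * sin^2 of the angle between u and span(S),
    measured in the M-inner product, as an upper bound on lam_d1 - lam_a1."""
    norm_u = np.sqrt(u @ (M @ u))                    # ||u||_M
    cos_angle = np.linalg.norm(S.T @ (M @ u)) / norm_u
    sin2 = max(0.0, 1.0 - cos_angle**2)
    return (lam_max - lam_min) * sin2
```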


11.3 Computational example

For a dynamic reduction example let us use as the stiffness matrix the result of the static condensation example of Section 8.2,

K = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 3 & 0 \\ 0 & 0 & 18/5 \end{bmatrix},

and as mass matrix

M = \begin{bmatrix} 1 & 1/2 & 0 \\ 1/2 & 1 & 1/5 \\ 0 & 1/5 & 18/25 \end{bmatrix}.

The analytic solution of the

(K - λM)u = 0

problem is

Λ = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 6 \end{bmatrix}

and

U = \begin{bmatrix} 1 & -6 & 3 \\ 0 & 12 & -6 \\ 0 & 5 & 10 \end{bmatrix}.

Here Λ is a diagonal matrix containing all three eigenvalues and U contains the eigenvectors.

We assume the same partitioning as used in Section 8.2, namely that the upper left, in this case 2 by 2 size, partition is the interior. The eigenvalues of the interior partition are

Λ_{q_oq_o} = \begin{bmatrix} 2 & 0 \\ 0 & 5/2 \end{bmatrix},

and the eigenvectors

Φ_{oq_o} = \begin{bmatrix} 1 & -1/2 \\ 0 & 1 \end{bmatrix}.

The generalized mass matrix with these (not mass normalized) eigenvectors is

m_{q_oq_o} = \begin{bmatrix} 1 & 0 \\ 0 & 3/4 \end{bmatrix}.

The mass coupling term is

M_{q_oq_t} = \begin{bmatrix} 1 & -1/2 \\ 0 & 1 \end{bmatrix}^T \begin{bmatrix} 0 \\ 1/5 \end{bmatrix} = \begin{bmatrix} 0 \\ 1/5 \end{bmatrix}.

The eigensolution components of the boundary modal solution are

Λ_{q_tq_t} = [18/5],

and

Φ_{tq_t} = [1].

The boundary generalized mass is

m_{q_tq_t} = [18/25].

Hence, the transformation matrix of the dynamic reduction is

S = \begin{bmatrix} 1 & -1/2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.

Note that since the number of eigenvectors extracted from the interior is the same as the size of the interior (q_o = o = 2), this matrix is now square. With this, the matrices of the dynamically reduced eigenproblem are

K_{dd} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 5/2 & 0 \\ 0 & 0 & 18/5 \end{bmatrix}

and

M_{dd} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 3/4 & 1/5 \\ 0 & 1/5 & 18/25 \end{bmatrix}.

The reduced

(K_{dd} - λM_{dd})u_d = 0

problem's eigenvalues are

Λ = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 6 \end{bmatrix}.

These eigenvalues are identical to the eigenvalues of the (K - λM)u = 0 analytic global solution, as the transformation is a congruence transformation. The eigenvectors are

U_{dd} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -3/5 \\ 0 & 5/12 & 1 \end{bmatrix} = \begin{bmatrix} U_{od} \\ U_{td} \end{bmatrix}.

To obtain the global eigenvector solution, we need to recover the effects of the dynamic reduction. For the interior:

U_o = Φ_{oq_o} U_{od} = \begin{bmatrix} 1 & -1/2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -3/5 \end{bmatrix} = \begin{bmatrix} 1 & -1/2 & 3/10 \\ 0 & 1 & -3/5 \end{bmatrix}.

For the boundary:

U_t = Φ_{tq_t} U_{td} = [1] \begin{bmatrix} 0 & 5/12 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 5/12 & 1 \end{bmatrix}.

Hence, the recovered global eigenvectors of our problem are

U = \begin{bmatrix} 1 & -1/2 & 3/10 \\ 0 & 1 & -3/5 \\ 0 & 5/12 & 1 \end{bmatrix}.

These eigenvectors are the same as the analytic global solution apart from a scalar multiplier.
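The worked example can be reproduced in a few lines; the sketch below uses scipy.linalg.eigh to solve the reduced generalized eigenvalue problem and should return the eigenvalues 2, 3 and 6 found above (the eigenvector columns agree up to scaling).

```python
import numpy as np
from scipy.linalg import eigh

K = np.array([[2., 1., 0.], [1., 3., 0.], [0., 0., 18/5]])
M = np.array([[1., 1/2, 0.], [1/2, 1., 1/5], [0., 1/5, 18/25]])
S = np.array([[1., -1/2, 0.], [0., 1., 0.], [0., 0., 1.]])   # reduction basis

Kdd, Mdd = S.T @ K @ S, S.T @ M @ S
lam, Udd = eigh(Kdd, Mdd)      # solves Kdd u = lambda Mdd u
print(lam)                     # -> [2. 3. 6.]
print(S @ Udd)                 # global eigenvectors, up to column scaling
```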

11.4 Single-level, multiple-component dynamic reduction

As in the case of static condensation, dynamic reduction may also be expanded to multiple components. In this case, the partitioned eigenvalue problem is

\begin{bmatrix}
K_{oo}^1 - λM_{oo}^1 & & & & K_{ot}^1 - λM_{ot}^1 \\
& \ddots & & & \vdots \\
& & K_{oo}^i - λM_{oo}^i & & K_{ot}^i - λM_{ot}^i \\
& & & \ddots & \vdots \\
K_{ot}^{T,1} - λM_{ot}^{T,1} & \ldots & K_{ot}^{T,i} - λM_{ot}^{T,i} & \ldots & K_{tt} - λM_{tt}
\end{bmatrix}
\begin{bmatrix} u_o^1 \\ \vdots \\ u_o^i \\ \vdots \\ u_t \end{bmatrix} = 0.

For each component, fixed boundary interior mode shapes are computed as

K_{oo}^i Φ_{oq_o^i}^i = M_{oo}^i Φ_{oq_o^i}^i Λ_{q_o^iq_o^i}^i.

The coupled boundary mode shapes are computed the same way as before:

K_{tt} Φ_{tq_t} = M_{tt} Φ_{tq_t} Λ_{q_tq_t}.

The dynamic reduction transformation matrix becomes

The dynamic reduction transformation matrix becomes

S = \begin{bmatrix}
Φ_{oq_o^1}^1 & & & & \\
& \ddots & & & \\
& & Φ_{oq_o^i}^i & & \\
& & & \ddots & \\
& & & & Φ_{tq_t}
\end{bmatrix},

where the q_o^i index denotes the number of eigenvectors from the interior (o partition) of the i-th component. Note that this matrix is also rectangular, as the number of mode shapes extracted from each component is very likely less than the size of the component. S has d = q_t + Σ_{i=1}^n q_o^i columns and t + Σ_{i=1}^n o^i rows. Here n is the number of components. Pre-multiplying by S^T and substituting

S u_d = u_a

yields

S^T [K_{aa} - λM_{aa}] S u_d = 0.

Executing the multiplications results in

[K_{dd} - λM_{dd}]u_d = 0.

The detailed structure of the reduced problem is

\begin{bmatrix}
Λ^1_{q_o^1q_o^1} - λI^1_{q_o^1q_o^1} & & & & K^1_{q_o^1q_t} - λM^1_{q_o^1q_t} \\
& \ddots & & & \vdots \\
& & Λ^i_{q_o^iq_o^i} - λI^i_{q_o^iq_o^i} & & K^i_{q_o^iq_t} - λM^i_{q_o^iq_t} \\
& & & \ddots & \vdots \\
K^{1,T}_{q_o^1q_t} - λM^{1,T}_{q_o^1q_t} & \ldots & K^{i,T}_{q_o^iq_t} - λM^{i,T}_{q_o^iq_t} & \ldots & Λ_{q_tq_t} - λI_{q_tq_t}
\end{bmatrix}
\begin{bmatrix} u^1_{q_o^1} \\ \vdots \\ u^i_{q_o^i} \\ \vdots \\ u_{q_t} \end{bmatrix} = 0.

The reduced problem's size is now d = q_t + Σ_{i=1}^n q_o^i and it still has a very specific structure. Its diagonal blocks are the generalized stiffness and mass matrices that are diagonal, so the complete reduced matrix is diagonal apart from the coupling blocks on the sides.

The diagonal generalized stiffnesses are

Λ^i_{q_o^iq_o^i} = Φ^{i,T}_{oq_o^i} K^i_{oo} Φ^i_{oq_o^i},

and

Λ_{q_tq_t} = Φ^T_{tq_t} K_{tt} Φ_{tq_t}.

Assuming mass-orthogonality of the eigenvectors, the generalized mass matrices are identity:

I^i_{q_o^iq_o^i} = Φ^{i,T}_{oq_o^i} M^i_{oo} Φ^i_{oq_o^i},

and

I_{q_tq_t} = Φ^T_{tq_t} M_{tt} Φ_{tq_t}.

The mass and stiffness coupling terms are

M^i_{q_o^iq_t} = Φ^{i,T}_{oq_o^i} M^i_{ot} Φ_{tq_t}

and

K^i_{q_o^iq_t} = Φ^{i,T}_{oq_o^i} K^i_{ot} Φ_{tq_t}.

The transformed eigenvector components are defined by

Φ^i_{oq_o^i} u^i_{q_o^i} = u^i_o

and

Φ_{tq_t} u_{q_t} = u_t.

These equations are the basis for the engineering solution in the case of single-level, multiple-component dynamic reduction.
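Assembling the rectangular transformation matrix for several components is a simple block-diagonal construction; the sketch below, with hypothetical retained mode matrices, uses scipy.linalg.block_diag for that purpose.

```python
import numpy as np
from scipy.linalg import block_diag

# hypothetical retained mode matrices: two interiors and one boundary
Phi1 = np.random.rand(10, 3)    # component 1: o^1 = 10 rows, q_o^1 = 3 modes
Phi2 = np.random.rand(8, 3)     # component 2: o^2 = 8 rows,  q_o^2 = 3 modes
Phit = np.random.rand(4, 2)     # boundary:    t = 4 rows,    q_t = 2 modes

S = block_diag(Phi1, Phi2, Phit)
print(S.shape)   # (22, 8): t + sum(o^i) rows, q_t + sum(q_o^i) columns
```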

11.5 Multiple-level dynamic reduction

It is possible to formulate the dynamic reduction scheme in multiple levels also. We are still addressing the problem of

[K_{aa} - λM_{aa}]u_a = 0.

Let us assume the same 2 levels, 2 components per level partitioning of Section 8.4. Then the mass matrix is partitioned as

M_{aa} = \begin{bmatrix}
M^1_{oo} & & M^{1,12}_{ot} & & & & M^{1,0}_{ot} \\
& M^2_{oo} & M^{2,12}_{ot} & & & & M^{2,0}_{ot} \\
& & M^{12}_{tt} & & & & M^{12,0}_{tt} \\
& & & M^3_{oo} & & M^{3,34}_{ot} & M^{3,0}_{ot} \\
& & & & M^4_{oo} & M^{4,34}_{ot} & M^{4,0}_{ot} \\
& sym & & & & M^{34}_{tt} & M^{34,0}_{tt} \\
& & & & & & M^0_{tt}
\end{bmatrix}.

The dynamic reduction transformation matrix is a collection of the interior component mode shapes and the boundary mode shapes:

S = \begin{bmatrix}
Φ^1_{oq_o^1} & & & & & & \\
& Φ^2_{oq_o^2} & & & & & \\
& & Φ^{12}_{tq_{t12}} & & & & \\
& & & Φ^3_{oq_o^3} & & & \\
& & & & Φ^4_{oq_o^4} & & \\
& & & & & Φ^{34}_{tq_{t34}} & \\
& & & & & & Φ^0_{tq_{t0}}
\end{bmatrix},

where now the q_{t12} index denotes the number of eigenvectors from the boundary of the first and second partitions. Similarly, the q_{t34} index denotes the number of eigenvectors from the boundary of the third and fourth partitions and the q_{t0} index denotes the number of eigenvectors of the final boundary.

Note that this matrix is also rectangular, as the number of mode shapes extracted from each component is very likely less than the size of the component. S has d = q_{t12} + q_{t34} + q_{t0} + Σ_{i=1}^n q_o^i columns and the original number of rows. Pre-multiplying by S^T and substituting

S u_d = u_a

yields

S^T [K_{aa} - λM_{aa}] S u_d = 0.

Executing the multiplications results in

[K_{dd} - λM_{dd}]u_d = 0.

The detailed structure of the reduced stiffness matrix is

K_{dd} = \begin{bmatrix}
Λ^1_{q_o^1q_o^1} & & K_{q_o^1q_{t12}} & & & & K_{q_o^1q_{t0}} \\
& Λ^2_{q_o^2q_o^2} & K_{q_o^2q_{t12}} & & & & K_{q_o^2q_{t0}} \\
& & Λ^{12}_{q_{t12}q_{t12}} & & & & K_{q_{t12}q_{t0}} \\
& & & Λ^3_{q_o^3q_o^3} & & K_{q_o^3q_{t34}} & K_{q_o^3q_{t0}} \\
& & & & Λ^4_{q_o^4q_o^4} & K_{q_o^4q_{t34}} & K_{q_o^4q_{t0}} \\
& sym & & & & Λ^{34}_{q_{t34}q_{t34}} & K_{q_{t34}q_{t0}} \\
& & & & & & Λ^0_{q_{t0}q_{t0}}
\end{bmatrix}.

The detailed structure of the reduced mass matrix is

M_{dd} = \begin{bmatrix}
I^1_{q_o^1q_o^1} & & M_{q_o^1q_{t12}} & & & & M_{q_o^1q_{t0}} \\
& I^2_{q_o^2q_o^2} & M_{q_o^2q_{t12}} & & & & M_{q_o^2q_{t0}} \\
& & I^{12}_{q_{t12}q_{t12}} & & & & M_{q_{t12}q_{t0}} \\
& & & I^3_{q_o^3q_o^3} & & M_{q_o^3q_{t34}} & M_{q_o^3q_{t0}} \\
& & & & I^4_{q_o^4q_o^4} & M_{q_o^4q_{t34}} & M_{q_o^4q_{t0}} \\
& sym & & & & I^{34}_{q_{t34}q_{t34}} & M_{q_{t34}q_{t0}} \\
& & & & & & I^0_{q_{t0}q_{t0}}
\end{bmatrix}.

The interior generalized masses I^i_{q_o^iq_o^i} and stiffnesses Λ^i_{q_o^iq_o^i} are the same as in Section 11.4. The boundary components' generalized masses are

I^j_{q_t^jq_t^j} = Φ^T_{tq_t^j} M^j_{tt} Φ_{tq_t^j}.

Similarly the stiffnesses:

Λ^j_{q_t^jq_t^j} = Φ^T_{tq_t^j} K^j_{tt} Φ_{tq_t^j}.

The component-to-boundary mass and stiffness coupling terms are

M_{q_o^iq_t^j} = Φ^{i,T}_{oq_o^i} M^i_{ot} Φ_{tq_t^j}

and

K_{q_o^iq_t^j} = Φ^{i,T}_{oq_o^i} K^i_{ot} Φ_{tq_t^j},

where j is either 12, 34 or 0. The boundary-to-boundary mass and stiffness coupling terms are

M_{q_t^jq_t^0} = Φ^T_{tq_t^j} M^j_{tt} Φ_{tq_t^0}

and

K_{q_t^jq_t^0} = Φ^T_{tq_t^j} K^j_{tt} Φ_{tq_t^0},

where j is either 12 or 34. This problem is of the same size as the number of columns in S, i.e. d, and it has the structure of a main diagonal with some coupling matrices.

The following chart summarizes the effect of the dynamic reduction. The K_{dd} partition again is not a straightforward partition; it has been modified.

\begin{bmatrix} K_{aa} \begin{bmatrix} K_{dd} \end{bmatrix} \end{bmatrix}.

11.6 Multi-body analysis application

As an application example for dynamic reduction, let us consider the case in engineering analysis when the rigid components are not eliminated from the system as shown in Chapter 4. This happens, for example, when the rigid components have large displacements and act as an independent, albeit attached, body of the system, hence the name multi-body analysis. An example of such an application is the steering mechanism of a car, such as shown in Figure 11.1, in relation to the car body.


FIGURE 11.1 Steering mechanism

The relative motion between the flexible car body and the steering mechanism is described via a set of constraints dependent on the type of the joints used between the bodies. The rigid body system's (the steering mechanism's) governing equation using Lagrange multipliers is

\begin{bmatrix} M & R^T \\ R & 0 \end{bmatrix} \begin{bmatrix} \ddot{q} \\ λ \end{bmatrix} = \begin{bmatrix} p \\ μ \end{bmatrix}.

In this equation R is the constraint matrix and λ, μ are the Lagrange multipliers and the enforced accelerations, respectively. The physical meaning of the Lagrange multipliers is the reaction forces at the constraints. The q is the generalized displacement of the rigid body and \ddot{q} is the acceleration. Note that the governing equation does not contain stiffness terms for the rigid body system.

The actual structure of R is similar to that of R_{mg} introduced in Chapter 4. It is the mathematical representation of mechanical constraints between components of a multi-body system. An example is a simple hinge connecting two bodies, where the two bodies rotate freely about the same axis. More complex joints may have more involved representations [5].


Most of the time rigid body components are an integrated part of a flexible structure. The most common integration technique is based on applying the dynamic reduction technique to the flexible partition of the structure. In other words, the car body is reduced to a number of attachment points that specify the connection with the steering mechanism. This reduction process is detailed next.

Let us partition a finite element problem into the flexible or elastic (e partition) and rigid (r partition) parts [3] as follows:

\begin{bmatrix} M_{rr} & M_{re} & R_r^T \\ M_{er} & M_{ee} & R_e^T \\ R_r & R_e & 0 \end{bmatrix}
\begin{bmatrix} \ddot{q} \\ \ddot{u}_e \\ λ \end{bmatrix} =
\begin{bmatrix} P_r \\ P_e \\ μ \end{bmatrix} -
\begin{bmatrix} 0 \\ K_{ee}u_e \\ 0 \end{bmatrix}.

The M_{re} matrix represents the mass coupling between the flexible and rigid partitions. Note the presence of the K_{ee} term to represent the flexibility of the structure. The dynamic reduction transformation matrix for this partitioning is of the form

T = \begin{bmatrix} I_r & 0 & 0 \\ 0 & Φ_{ee_q} & 0 \\ 0 & 0 & I_r \end{bmatrix},

where Φ_{ee_q} are the eigenvectors of the flexible part of the integrated structure. The number of eigenvectors found is e_q. They are the solution of the problem

K_{ee} Φ_{ee_q} = M_{ee} Φ_{ee_q} Λ_{e_qe_q}.

Let the eigenvectors be mass-orthogonal as

I_{e_qe_q} = Φ^T_{ee_q} M_{ee} Φ_{ee_q}.

The generalized stiffness is

Λ_{e_qe_q} = Φ^T_{ee_q} K_{ee} Φ_{ee_q}.

Let us introduce

u_e = Φ_{ee_q} u_{e_q},

and

\ddot{u}_e = Φ_{ee_q} \ddot{u}_{e_q}.

Pre-multiplying the above finite element problem by T^T and substituting yields

\begin{bmatrix} M_{rr} & M_{re}Φ_{ee_q} & R_r^T \\ Φ^T_{ee_q}M_{er} & I_{e_qe_q} & Φ^T_{ee_q}R_e^T \\ R_r & R_eΦ_{ee_q} & 0 \end{bmatrix}
\begin{bmatrix} \ddot{q} \\ \ddot{u}_{e_q} \\ λ \end{bmatrix} =
\begin{bmatrix} P_r \\ Φ^T_{ee_q}P_e \\ μ \end{bmatrix} -
\begin{bmatrix} 0 \\ Λ_{e_qe_q}u_{e_q} \\ 0 \end{bmatrix}.

The dynamically reduced form (size 2r + e_q as opposed to 2r + e, where e ≫ e_q) of the integrated equation of motion is now easier to solve. This technique is especially advantageous when the dynamic behavior of the structure is analyzed over a longer time interval and many time step solutions are required. The issue of the time integration is addressed in Chapter 14.

This formulation governs the rigid body system's behavior with the flexible body effects represented by the reduction and preserved in the reduced form. It is also practical to take dynamic forces computed from this simulation back to the detailed flexible body for further analysis and detailed stress and response calculations.

Finally, it is prudent to point out that the dynamic reduction process described in this chapter needs to be generalized when the stiffness matrix is unsymmetric or complex, in which case the normal modes basis cannot be computed. In such a case the quadratic eigenvalue problem presented in the last chapter is solved and the distinct left and right eigenvectors form the basis of reduction.

Naturally, the simple generalized mass and stiffness formulations do not hold anymore and they are replaced by the orthogonality criteria reflecting the distinct eigenvectors. Otherwise the concepts of the real dynamic reduction apply equally well. This complex dynamic reduction technique is beyond our scope and will not be further discussed here.

References

[1] Chatelin, F.; Eigenvalues of matrices, Wiley, New York, 1993

[2] Hurty, W. C.; Dynamic analysis of structural systems using component modes, AIAA Journal, Vol. 3, No. 4, pp. 678-685, 1965

[3] Komzsik, L.; Integrated multi-body system and finite element analysis, Proceedings of 4th International Conference on Tools and Methods of Competitive Engineering, Wuhan, China, 2002

[4] Masson, G. et al.; Parameterized reduced models for efficient optimization of structural dynamic behavior, AIAA-2002-1392, AIAA, 2002

[5] Nikravesh, P. E.; Computer-aided analysis of mechanical systems, Prentice Hall, New Jersey, 1988

[6] Ortega, J. M.; Matrix theory, Plenum Press, New York, 1987

[7] Saad, Y.; Numerical methods for large eigenvalue problems, Halsted Press, 1992

[8] Sleijpen, G. L. G., Eshof, J. V. D. and Smit, P.; Optimal a priori error bounds for the Rayleigh-Ritz method, Math. Comp., Vol. 72, pp. 667-684, 2002

[9] Wamsler, M.; Retaining the influence of crucial local effects in mixed Guyan and modal reduction, Engineering with Computers, Vol. 20, pp. 363-371, 2005


12

Component Mode Synthesis

The component mode synthesis procedure is a successive application of the static condensation and dynamic reduction and as such combines the techniques of Chapters 8 and 11. It is applicable to any dynamic analysis problem, however, we continue to use the undamped, free vibration or normal modes problem of structural analysis as the basis for the discussion of this method. See [1], [6] and [5] for early references.

This technique is also commonly executed with single and multiple components as well as on multiple levels. First we review the concept with the single-level case.

12.1 Single-level, single-component modal synthesis

The combined static condensation and dynamic reduction transformation matrix may be written as

R = TS,

where the static condensation transformation is

T = \begin{bmatrix} I_{oo} & G_{ot} \\ 0 & I_{tt} \end{bmatrix},

with the static condensation matrix G_{ot} = -K_{oo}^{-1}K_{ot}. The dynamic reduction matrix is

S = \begin{bmatrix} Φ_{oq_o} & 0 \\ 0 & Φ_{tq_t} \end{bmatrix}.

The modal spaces of the interior and the boundary are computed from

K_{oo}Φ_{oq_o} = M_{oo}Φ_{oq_o}Λ_{q_oq_o},

and

K_{tt}Φ_{tq_t} = M_{tt}Φ_{tq_t}Λ_{q_tq_t}.

Pre-multiplying the

(K_{aa} - λM_{aa})u_a = 0

equation by R^T and introducing Ru_d = u_a results in

R^T(K_{aa} - λM_{aa})Ru_d = 0,

or

(K_{dd} - λM_{dd})u_d = 0,

where the notation K_{dd} reflects the fact that this matrix is now the result of both static condensation and dynamic reduction. In partitioned form

\begin{bmatrix} Λ_{q_oq_o} - λI_{q_oq_o} & -λM_{q_oq_t} \\ -λM_{q_oq_t}^T & Λ_{q_tq_t} - λI_{q_tq_t} \end{bmatrix} \begin{bmatrix} u_{q_o} \\ u_{q_t} \end{bmatrix} = 0.

Here the transformed mass coupling matrix is

M_{q_oq_t} = Φ_{oq_o}^T \overline{M}_{ot} Φ_{tq_t}

and due to the static condensation the stiffness matrix is uncoupled, while the mass is not, as represented by \overline{M}_{ot}.

As before, due to the mass-orthogonality of the eigenvectors,

Φ_{oq_o}^T M_{oo} Φ_{oq_o} = I_{q_oq_o},

and

Φ_{tq_t}^T M_{tt} Φ_{tq_t} = I_{q_tq_t}.

The transformed eigenvector components are defined by

Φ_{oq_o} u_{q_o} = u_o

and

Φ_{tq_t} u_{q_t} = u_t.

Note that while the static condensation step is computationally exact, the dynamic reduction is not, and therefore the modal synthesis also contains truncation error. This is the topic of more discussion in Chapter 13.

Before we invoke an example, let us reconcile the method shown here with the well-known Craig-Bampton method [2]. The R transformation matrix in our case, assuming the full modal space, is

R = TS = \begin{bmatrix} Φ_{oo} & G_{ot}Φ_{tt} \\ 0 & Φ_{tt} \end{bmatrix}.

The Craig-Bampton method's transformation matrix is

R_{CB} = \begin{bmatrix} Φ_{oo} & G_{ot} \\ 0 & I_{tt} \end{bmatrix}.

It is easy to see that this form is the product R_{CB} = TS_{CB}, where

S_{CB} = \begin{bmatrix} Φ_{oo} & 0 \\ 0 & I_{tt} \end{bmatrix}.

This means that the Craig-Bampton method is a component mode synthesis method that does not do dynamic reduction on the boundary component. This is the only difference.
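A minimal sketch of this Craig-Bampton style reduction for small dense matrices is shown below; the function name, the argument layout and the choice of q_o retained interior modes are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import eigh

def craig_bampton(Koo, Moo, Kot, Mot, Ktt, Mtt, qo):
    """Static condensation plus qo fixed boundary interior modes; the
    boundary degrees of freedom are kept physically (no boundary modes)."""
    Got = -np.linalg.solve(Koo, Kot)         # static condensation matrix
    _, Phi_o = eigh(Koo, Moo)
    Phi_o = Phi_o[:, :qo]                    # retained fixed boundary modes
    o, t = Koo.shape[0], Ktt.shape[0]
    R = np.zeros((o + t, qo + t))
    R[:o, :qo] = Phi_o
    R[:o, qo:] = Got                         # constraint modes
    R[o:, qo:] = np.eye(t)
    Kaa = np.block([[Koo, Kot], [Kot.T, Ktt]])
    Maa = np.block([[Moo, Mot], [Mot.T, Mtt]])
    return R.T @ Kaa @ R, R.T @ Maa @ R, R
```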

12.2 Mixed boundary component mode reduction

The method shown in the prior section assumed a fixed boundary when computing the component modes. In some practical applications that is not appropriate. For example, when some of the boundary components are excited by loads, the dynamic responses computed from fixed boundary component mode reduction are poor [7]. The subject of this section is the case when some parts of the boundary are considered to be free and some parts are fixed during the component mode computations, hence the name.

The matrix partitioning for the mixed boundary reduction is

K_{aa} = \begin{bmatrix} K_{cc} & 0 & K_{co} \\ 0 & K_{bb} & K_{bo} \\ K_{oc} & K_{ob} & K_{oo} \end{bmatrix} = \begin{bmatrix} K_{tt} & K_{to} \\ K_{ot} & K_{oo} \end{bmatrix}.

Here the c partition is the boundary segment that is free. The b partition is the fixed boundary segment as before. Similar partitioning is used for the mass matrix:

M_{aa} = \begin{bmatrix} M_{cc} & 0 & M_{co} \\ 0 & M_{bb} & M_{bo} \\ M_{oc} & M_{ob} & M_{oo} \end{bmatrix} = \begin{bmatrix} M_{tt} & M_{to} \\ M_{ot} & M_{oo} \end{bmatrix}.

Note that in the above partitioning the boundary was ordered first, followed by the interior, opposite to the order of the earlier sections. This is to reflect our focus on the split boundary and to ease the notation; otherwise this issue is of no computational consequence. The two partitions are really interspersed after all.

Two static condensation matrices are computed, one for each boundary segment, as

\begin{bmatrix} 0 & 0 & 0 \\ 0 & K_{bb} & K_{bo} \\ 0 & K_{ob} & K_{oo} \end{bmatrix} \begin{bmatrix} 0 \\ I_{bb} \\ G_{ob} \end{bmatrix} = \begin{bmatrix} 0 \\ P_b \\ 0 \end{bmatrix} → G_{ob} = -K_{oo}^{-1}K_{ob},

and

\begin{bmatrix} K_{cc} & 0 & K_{co} \\ 0 & 0 & 0 \\ K_{oc} & 0 & K_{oo} \end{bmatrix} \begin{bmatrix} I_{cc} \\ 0 \\ G_{oc} \end{bmatrix} = \begin{bmatrix} P_c \\ 0 \\ 0 \end{bmatrix} → G_{oc} = -K_{oo}^{-1}K_{oc}.

To calculate the dynamic reduction mode shapes, an eigenvalue problem of size v = o + c is solved:

K_{vv}Φ_{vz} = M_{vv}Φ_{vz}Λ_{zz},

where z is the number of eigenpairs extracted. These mode shapes, however, are not directly used in the dynamic reduction. They are first partitioned as

Φ_{vz} = \begin{bmatrix} Φ_{oz} \\ Φ_{cz} \end{bmatrix},

where the eigenvector partitions correspond to the interior and the free boundary, respectively. The constrained modes corresponding to static condensation are

Φ_{oz}^1 = Φ_{oz} - G_{oc}Φ_{cz}.

The possible zero modes are partitioned out as

Φ_{oz}^1 = \begin{bmatrix} Φ_{oz}^0 & Φ_{oz}^2 \end{bmatrix}

to compute the generalized mass

M_{zz}^2 = Φ_{oz}^{2,T} M_{oo} Φ_{oz}^2.

By introducing a scaling factor

S_{zz} = \sqrt{diag(M_{zz}^2)},

the modes are scaled as

Φ_{oz}^3 = Φ_{oz}^2 S_{zz}.

The superscripts represent various modifications to the vectors and matrices. The scaled modes produce a scaled generalized mass

M_{zz} = S_{zz}^T M_{zz}^2 S_{zz}.

To verify the non-singularity of this matrix, the generalized mass is factored:

M_{zz} = L_{zz}^T D_{zz} L_{zz}.

Dependent vectors are purged from the eigenvectors if

M_{zz}(i,i)/D_{zz}(i) > ε_{dep},

where ε_{dep} is a dependency cut-off level. With these, the generalized stiffness matrix is

K_{zz} = Φ_{oz}^{3,T} K_{oo} Φ_{oz}^3.

Finally, the scaled generalized modes are computed from

K_{zz}Φ_{oz}^4 = M_{zz}Φ_{oz}^4Λ_{zz}.

These modes are going to be used for the dynamic reduction part. Specifically, the reduction matrix for the mixed boundary case will be

R = \begin{bmatrix} G_{ob} & G_{oc} & Φ_{oz}^4 \\ I_{bb} & 0 & 0 \\ 0 & I_{cc} & 0 \end{bmatrix} = \begin{bmatrix} G_{ot} & Φ_{oz}^4 \\ I_{tt} & 0 \end{bmatrix},

with

G_{ot} = \begin{bmatrix} G_{ob} & G_{oc} \end{bmatrix}.

At this point the two methods are reconciled, albeit with a different transformation matrix. The reduced stiffness matrix is

K_{dd} = R^T K_{aa} R = \begin{bmatrix} \overline{K}_{tt} & 0 \\ 0 & K_{qq} \end{bmatrix},

with

\overline{K}_{tt} = K_{tt} + K_{to}G_{ot}

and

K_{qq} = Φ_{oz}^{4,T} K_{oo} Φ_{oz}^4.

The mass reduction, due to the boundary-coupling terms, is a bit more difficult but conceptually similar:

M_{dd} = R^T M_{aa} R = \begin{bmatrix} \overline{M}_{tt} & M_{tq} \\ M_{qt} & M_{qq} \end{bmatrix},

with

\overline{M}_{tt} = M_{tt} + M_{to}G_{ot} + G_{ot}^T M_{ot} + G_{ot}^T M_{oo} G_{ot},

M_{qt} = Φ_{oz}^{4,T} M_{ot} + Φ_{oz}^{4,T} M_{oo} G_{ot},

and

M_{qq} = Φ_{oz}^{4,T} M_{oo} Φ_{oz}^4.

The full-size result is obtained from

u_a = R u_d.

In detailed partitions this equation is

\begin{bmatrix} u_t \\ u_o \end{bmatrix} = \begin{bmatrix} I_{tt} & 0 \\ G_{ot} & G_{oq} \end{bmatrix} \begin{bmatrix} u_t \\ u_q \end{bmatrix}.

The first column partition of

\begin{bmatrix} I_{tt} \\ G_{ot} \end{bmatrix}

represents the "constraint modes" of the component due to (and including the) unit boundary motion.

The second partition of

\begin{bmatrix} 0 \\ G_{oq} \end{bmatrix}

represents the "fixed boundary modes" of the component assuming (and including the) zero boundary motion.

In contrast, for the fixed method all of the boundary has zero motion during the mode shape computation, since

G_{oq} = Φ_{oz}.

For the mixed boundary method only part of the boundary has zero motion during the mode shape computation, since

G_{oq} = \begin{bmatrix} Φ_{oz}^3 & Φ_{zz} \end{bmatrix}.

In the following sections of this chapter, we will assume the fixed boundary approach for simplicity; however, the mixed boundary reduction results are applicable as well.

12.3 Computational example

Let us now clarify the modal synthesis procedure with the help of a simple numerical example of the eigenvalue problem. Consider the generalized eigenvalue problem with the stiffness matrix used in Section 8.2,

K_{aa} = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 4 \end{bmatrix}.

We will use the mass matrix of

M_{aa} = \begin{bmatrix} 1 & 1/2 & 0 \\ 1/2 & 1 & 1/2 \\ 0 & 1/2 & 1 \end{bmatrix}.

The analytic global eigenvalue solution for the

(K_{aa} - λM_{aa})u_a = 0

eigenvalue problem is

Λ = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 6 \end{bmatrix},

and

U = \begin{bmatrix} 1 & -1/2 & -1/3 \\ 0 & 1 & 2/3 \\ 0 & 1/2 & -2/3 \end{bmatrix}.

Let us again partition the problem into the interior consisting of the principal 2 by 2 minor

K_{oo} = \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}

and

M_{oo} = \begin{bmatrix} 1 & 1/2 \\ 1/2 & 1 \end{bmatrix},

as well as

K_{ot} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}

and

M_{ot} = \begin{bmatrix} 0 \\ 1/2 \end{bmatrix}.

The boundary components are

K_{tt} = [4]

and

M_{tt} = [1].

For this stiffness matrix we have calculated the static condensation matrix in Chapter 8 as

G_{ot} = \begin{bmatrix} 1/5 \\ -2/5 \end{bmatrix}.

The mass coupling matrix with this is

\overline{M}_{ot} = \begin{bmatrix} 0 \\ 1/2 \end{bmatrix} + \begin{bmatrix} 1 & 1/2 \\ 1/2 & 1 \end{bmatrix}\begin{bmatrix} 1/5 \\ -2/5 \end{bmatrix} = \begin{bmatrix} 0 \\ 1/5 \end{bmatrix}.

The Schur complement result of the static condensation was

\overline{K}_{tt} = [18/5].

The corresponding mass complement:

\overline{M}_{tt} = [1] + \begin{bmatrix} 0 & 1/2 \end{bmatrix}\begin{bmatrix} 1/5 \\ -2/5 \end{bmatrix} + \begin{bmatrix} 1/5 & -2/5 \end{bmatrix}\begin{bmatrix} 0 \\ 1/2 \end{bmatrix} + \begin{bmatrix} 1/5 & -2/5 \end{bmatrix}\begin{bmatrix} 1 & 1/2 \\ 1/2 & 1 \end{bmatrix}\begin{bmatrix} 1/5 \\ -2/5 \end{bmatrix} = [18/25].

Finally, the matrices of the statically condensed eigenproblem are

K = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 3 & 0 \\ 0 & 0 & 18/5 \end{bmatrix}

and

M = \begin{bmatrix} 1 & 1/2 & 0 \\ 1/2 & 1 & 1/5 \\ 0 & 1/5 & 18/25 \end{bmatrix}.

The next step is to execute the dynamic reduction of this problem. This is, however, the same matrix problem we used in discussing the dynamic reduction in Section 11.3, so the solution of the component mode synthesis problem comes from the solution of the example in Chapter 11. The eigenvalues were

Λ = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 6 \end{bmatrix},

which are identical to the original problem's solution. The eigenvectors (back-transformed with respect to the dynamic reduction) were:

U = \begin{bmatrix} 1 & -1/2 & 3/10 \\ 0 & 1 & -3/5 \\ 0 & 5/12 & 1 \end{bmatrix}.

The global eigenvectors are obtained by reversing the static condensation as well. For the interior:

U_o = \begin{bmatrix} 1 & -1/2 & 3/10 \\ 0 & 1 & -3/5 \end{bmatrix} + \begin{bmatrix} 1/5 \\ -2/5 \end{bmatrix}\begin{bmatrix} 0 & 5/12 & 1 \end{bmatrix} = \begin{bmatrix} 1 & -5/12 & 1/2 \\ 0 & 5/6 & -1 \end{bmatrix}.

For the boundary:

U_t = \begin{bmatrix} 0 & 5/12 & 1 \end{bmatrix}.

Hence, the final eigenvectors are

U = \begin{bmatrix} 1 & -5/12 & 1/2 \\ 0 & 5/6 & -1 \\ 0 & 5/12 & 1 \end{bmatrix},

which are the same as the analytic global solution apart from a scalar multiplier.

In the example, the full eigenspectrum of both the boundary and the interior is obtained, t = q_t and o = q_o; therefore, there was no reduction and the problem was merely transformed. Naturally, the results then are accurate. The smaller the number of eigenvectors compared to the rank of the matrix, the less accurate the component mode synthesis becomes due to the truncation error.
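The combined reduction of this example can also be checked numerically; the sketch below forms R = TS from the static condensation matrix and the interior modes computed above and should reproduce the eigenvalues 2, 3 and 6.

```python
import numpy as np
from scipy.linalg import eigh

Kaa = np.array([[2., 1., 0.], [1., 3., 1.], [0., 1., 4.]])
Maa = np.array([[1., 1/2, 0.], [1/2, 1., 1/2], [0., 1/2, 1.]])
T = np.array([[1., 0., 1/5], [0., 1., -2/5], [0., 0., 1.]])   # static condensation
S = np.array([[1., -1/2, 0.], [0., 1., 0.], [0., 0., 1.]])    # dynamic reduction
R = T @ S

lam, Ud = eigh(R.T @ Kaa @ R, R.T @ Maa @ R)
print(lam)         # -> [2. 3. 6.]
print(R @ Ud)      # global eigenvectors, up to column scaling
```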

In the following we discuss the two stages of the component mode synthesis reduction technique for the multiple-component cases.

12.4 Single-level, multiple-component modal synthesis

The problem at hand with multiple components is partitioned as

\begin{bmatrix}
K^1_{oo} - λM^1_{oo} & & & & K^1_{ot} - λM^1_{ot} \\
& \ddots & & & \vdots \\
& & K^i_{oo} - λM^i_{oo} & & K^i_{ot} - λM^i_{ot} \\
& & & \ddots & \vdots \\
K^{T,1}_{ot} - λM^{T,1}_{ot} & \ldots & K^{T,i}_{ot} - λM^{T,i}_{ot} & \ldots & K_{tt} - λM_{tt}
\end{bmatrix}
\begin{bmatrix} u^1_o \\ \vdots \\ u^i_o \\ \vdots \\ u_t \end{bmatrix} = 0.

For each component the static reduction to the boundary is facilitated by

G^i_{ot} = -K^{-1,i}_{oo} K^i_{ot},

where i goes from 1 to n and the computations are independent. The simultaneous static reduction for all components is formally executed by the

T = \begin{bmatrix}
I^1_{oo} & & & & G^1_{ot} \\
& \ddots & & & \vdots \\
& & I^i_{oo} & & G^i_{ot} \\
& & & \ddots & \vdots \\
& & & & I_{tt}
\end{bmatrix}

transformation matrix. Pre-multiplying with T^T and introducing T\overline{u}_a = u_a as

T^T [K_{aa} - λM_{aa}] T\overline{u}_a = 0

results in

[\overline{K}_{aa} - λ\overline{M}_{aa}]\overline{u}_a = 0.

In detail we get

\begin{bmatrix}
K^1_{oo} - λM^1_{oo} & & & & -λ\overline{M}^1_{ot} \\
& \ddots & & & \vdots \\
& & K^i_{oo} - λM^i_{oo} & & -λ\overline{M}^i_{ot} \\
& & & \ddots & \vdots \\
-λ\overline{M}^{T,1}_{ot} & \ldots & -λ\overline{M}^{T,i}_{ot} & \ldots & \overline{K}_{tt} - λ\overline{M}_{tt}
\end{bmatrix}
\begin{bmatrix} \overline{u}^1_o \\ \vdots \\ \overline{u}^i_o \\ \vdots \\ u_t \end{bmatrix} = 0,

where

\overline{K}_{tt} = K_{tt} + Σ_{i=1}^n K^{T,i}_{ot} G^i_{ot},

\overline{M}_{tt} = M_{tt} + Σ_{i=1}^n (M^{T,i}_{ot} G^i_{ot} + G^{T,i}_{ot} M^i_{ot} + G^{T,i}_{ot} M^i_{oo} G^i_{ot}),

and the mass coupling term is

\overline{M}^i_{ot} = M^i_{ot} + M^i_{oo} G^i_{ot}.

Note that the coupling stiffness term has been eliminated and the eigenvalues are invariant under this congruence transformation. The eigenvector interior components are transformed as

\overline{u}^i_o = u^i_o - G^i_{ot} u_t,

which will be the basis for result recovery. Note that the u_t term is unaffected during the transformation by T.

To complete the component mode synthesis we now apply dynamic reduction to the statically condensed problem. The dynamic behavior of the components is represented by the fixed boundary component modes

K^i_{oo} Φ^i_{oq^i_o} = M^i_{oo} Φ^i_{oq^i_o} Λ^i_{q^i_oq^i_o}

and the boundary modes of

\overline{K}_{tt} Φ_{tq_t} = \overline{M}_{tt} Φ_{tq_t} Λ_{q_tq_t}.

Note that the boundary modes are based on the statically condensed boundary problem and that the number of mode shapes found for each component may be different.

The dynamic reduction transformation matrix in this form is

S = \begin{bmatrix}
Φ^1_{oq^1_o} & & & & \\
& \ddots & & & \\
& & Φ^i_{oq^i_o} & & \\
& & & \ddots & \\
& & & & Φ_{tq_t}
\end{bmatrix}.

Observe that this matrix is rectangular as the number of mode shapes extracted from each component is less than the size of the component. S has d = q_t + Σ_{i=1}^n q^i_o columns and t + Σ_{i=1}^n o^i rows. Appropriate multiplication of the equilibrium equation with S^T and the introduction of S u_d = \overline{u}_a as

S^T (\overline{K}_{aa} - λ\overline{M}_{aa}) S u_d = 0

yields the modal synthesis reduced form of

(K_{dd} - λM_{dd})u_d = 0.

In detail the reduced form of the modal synthesis is

\begin{bmatrix}
Λ^1_{q^1_oq^1_o} - λI^1_{q^1_oq^1_o} & & & & -λM^1_{q^1_oq_t} \\
& \ddots & & & \vdots \\
& & Λ^i_{q^i_oq^i_o} - λI^i_{q^i_oq^i_o} & & -λM^i_{q^i_oq_t} \\
& & & \ddots & \vdots \\
-λM^{T,1}_{q_tq^1_o} & \ldots & -λM^{T,i}_{q_tq^i_o} & \ldots & Λ_{q_tq_t} - λI_{q_tq_t}
\end{bmatrix}
\begin{bmatrix} u^1_q \\ \vdots \\ u^i_q \\ \vdots \\ u_{q_t} \end{bmatrix} = 0.

Here the generalized stiffnesses are

Λ^i_{q^i_oq^i_o} = Φ^{T,i}_{oq^i_o} K^i_{oo} Φ^i_{oq^i_o}

and

Λ_{q_tq_t} = Φ^T_{tq_t} \overline{K}_{tt} Φ_{tq_t}.

The generalized mass matrices are

I^i_{q^i_oq^i_o} = Φ^{T,i}_{oq^i_o} M^i_{oo} Φ^i_{oq^i_o}

and

I_{q_tq_t} = Φ^T_{tq_t} \overline{M}_{tt} Φ_{tq_t},

since the eigenvectors are again mass-orthogonal. The mass coupling term is

M^i_{q^i_oq_t} = Φ^{T,i}_{oq^i_o} \overline{M}^i_{ot} Φ_{tq_t}.

The eigenvalues are again invariant under this congruence transformation. The eigenvector components are transformed as

Φ^i_{oq^i_o} u^i_q = \overline{u}^i_o

and

Φ_{tq_t} u_{q_t} = u_t.

The reduced eigenvalue problem contains matrices with a very specific structure. Namely, the stiffness matrix is diagonal, but the mass is not (although it is very sparse). This gives the idea, sometimes used in commercial analyses, to solve the mode-synthesized problem in inverse form. This interchanging of the roles of the mass and stiffness matrices allows for an efficient Lanczos method solution, where the mass matrix needs to be used in every step of the operation. The shortcoming of this computationally speedy idea is the need for special handling of the possible rigid body modes remaining in the reduced problem. These modes become computationally infinite in the inverted solution scheme and as such will not appear in the solution; however, they are of particular interest to the engineer.

The reduced eigenvalue problem is again of order d = q_t + Σ_{i=1}^n q^i_o, as opposed to the original order t + Σ_{i=1}^n o^i.

The eigenvectors of the reduced problem are twice transformed. Therefore, the final eigenvectors will be recovered in two steps. First, the dynamic reduction is recovered as

\overline{u}^i_o = Φ^i_{oq^i_o} u^i_q

and

u_t = Φ_{tq_t} u_{q_t}.

Secondly, the effects of the static reduction are accounted for as

u^i_o = \overline{u}^i_o + G^i_{ot} u_t.

Remember, u_t does not change during static condensation.

12.5 Multiple-level modal synthesis

The combined static condensation and dynamic reduction transformation matrix may be written as

R = TS.

The T and S matrices were described in Sections 8.4 and 11.5, respectively. Pre-multiplying

[K_{aa} - λM_{aa}]u_a = 0

by R^T and substituting Ru_d = u_a yields

R^T [K_{aa} - λM_{aa}]Ru_d = 0.

Executing the multiplications results in

[K_{dd} - λM_{dd}]u_d = 0.

The detailed structure of the reduced stiffness matrix is

K_{dd} = \begin{bmatrix}
Λ^1_{q^1_oq^1_o} & & & & & & \\
& Λ^2_{q^2_oq^2_o} & & & & & \\
& & \overline{Λ}^{12}_{q_{t12}q_{t12}} & & & & \\
& & & Λ^3_{q^3_oq^3_o} & & & \\
& & & & Λ^4_{q^4_oq^4_o} & & \\
& & & & & \overline{Λ}^{34}_{q_{t34}q_{t34}} & \\
& & & & & & \overline{Λ}^0_{q_{t0}q_{t0}}
\end{bmatrix}.

The \overline{Λ} notation reflects the fact that the boundary eigenvalue solution is computed from the statically condensed boundary problem. The detailed structure of the reduced mass matrix is

M_{dd} = \begin{bmatrix}
I^1_{q^1_oq^1_o} & & \overline{M}_{q^1_oq_{t12}} & & & & \overline{M}_{q^1_oq_{t0}} \\
& I^2_{q^2_oq^2_o} & \overline{M}_{q^2_oq_{t12}} & & & & \overline{M}_{q^2_oq_{t0}} \\
& & I^{12}_{q_{t12}q_{t12}} & & & & \overline{M}_{q_{t12}q_{t0}} \\
& & & I^3_{q^3_oq^3_o} & & \overline{M}_{q^3_oq_{t34}} & \overline{M}_{q^3_oq_{t0}} \\
& & & & I^4_{q^4_oq^4_o} & \overline{M}_{q^4_oq_{t34}} & \overline{M}_{q^4_oq_{t0}} \\
& sym & & & & I^{34}_{q_{t34}q_{t34}} & \overline{M}_{q_{t34}q_{t0}} \\
& & & & & & I^0_{q_{t0}q_{t0}}
\end{bmatrix}.

The \overline{M} notation represents the fact that the mass coupling matrices underwent static condensation prior to the dynamic reduction. All generalized masses and stiffnesses are the same as in Sections 8.4 and 11.4. This problem is of the same size as the reduced problem after multilevel dynamic reduction; however, now the stiffness matrix is diagonal. This fact provides further computational advantages.

The following is the summary chart for this stage of the reduction:

\begin{bmatrix} K_{aa} \begin{bmatrix} \overline{K}_{tt} \begin{bmatrix} K_{dd} \end{bmatrix} \end{bmatrix} \end{bmatrix}.

Here again, \overline{K}_{tt} and K_{dd} are not direct partitions.


12.6 Component mode synthesis case study

The effectiveness of the computational reduction techniques is viewed through a case study. This model was also selected from the automobile industry, although similar characteristics may also be found in the aerospace industry, specifically in airplane fuselage models. The example was similar to the convertible car body as shown in Figure 12.1.

FIGURE 12.1 Convertible car body

The model incorporated a variety of structural components containing various element types as shown in Table 12.1.

TABLE 12.1 Element types of case study automobile model

Element type             Number of elements
4 noded quadrilateral    274,192
3 noded triangular       59,729
8 noded hexahedral       3,645
4 noded tetrahedral      532,720
Rigid elements           12,830

The rigid elements included rigid bar elements, such as the one used in the mechanical example in Chapter 3. They also included various forms of multi-point constraints, which are used regularly in the auto industry to model spot welds. As their large number indicates, there are quite a few of these in a car body.

The model had 1,301,381 nodes and the statistics of the degrees of freedom are presented in Table 12.2.

TABLE 12.2 Problem statistics

Partition                        Size
Global degrees of freedom = g    7,808,286
Multi-point constraints = m      170,345
Single-point constraints = s     3,101,117
Free set size = f                4,536,824

The rather large number of single-point constraints is in part due to boundary conditions. Most of them, however, are related to the three dimensional elements and were found by the automatic elimination process discussed in Chapter 5.

Various scenarios of component mode synthesis were executed on this model to obtain all the eigenmodes up to 600 Hz. Since the problem was extremely compute intensive, establishing a serial baseline was impractical. Instead, a baseline was established with a 64 processor execution of the hierarchic parallel normal modes analysis demonstrated in Section 9.6. The comparison statistics are in Table 12.3.

TABLE 12.3 Execution statistics

Components   Elapsed time   I/O amount   Number of modes
             min:sec        Gigabytes
-            829:53         1,814        5261
256          201:01         707          5176
512          189:22         888          5132

In all three executions a 64 node Linux cluster with dual core 1.85 GHz processors, 4 Gigabytes of memory and 50 Gigabytes of disk per node was used. It is noticeable that the component mode synthesis finds fewer modes than the computationally exact baseline solution, a fact the engineer needs to be aware of. The discrepancy, of course, is due to the approximation involved in the dynamic reduction.

It appears that for this example 256 components was optimal, considering that the incremental elapsed time improvement for 512 components was negligible. Assuming that the baseline run would be about 20 times faster than a serial run (an assumption justified by Figure 9.4), the speed-up of the 512 component run over the serial one would approach 90. Properly applied component mode synthesis normal modes analyses often produce two orders of magnitude (hundredfold) improvements over a serial execution. In many cases, the serial execution cannot even be done, due to machine limitations.

Such a performance advantage of the component mode synthesis based analysis solutions is obviously of great industrial importance. In essence, a multi-day computational job is reduced to an overnight execution. Due to the current industry tendency of ever widening frequency ranges of interest [4], the component modal synthesis approach seems to be most practical. That is of course with the understanding of the engineering approximations involved.

This concludes the second part of the book dealing with computational reduction techniques.

References

[1] Benfield, W. A. and Hruda, R. F.; Vibration analysis of structures by component mode substitution, AIAA Journal, Vol. 9, No. 7, pp. 1255-1261, 1971

[2] Craig, R. R., Jr. and Bampton, M. C. C.; Coupling of substructures for dynamic analysis, AIAA Journal, Vol. 6, No. 7, pp. 1313-1319, 1968

[3] Komzsik, L.; A comparison of Lanczos method and AMLS for eigenvalue analysis, US National Conference on Computational Mechanics, Structural Dynamics Mini-symposium, Albuquerque, 2003

[4] Kropp, A. and Heiserer, D.; Efficient broadband vibro-acoustic analysis of passenger car bodies using a FE-based component mode synthesis approach, Proc. of 5th World Congress on Computational Mechanics, Vienna, 2002

[5] MacNeal, R. H.; A hybrid method of component mode synthesis, Computers and Structures, Vol. 1, No. 4, pp. 389-412, 1971

[6] Rubin, S.; Improved component-mode representation for structural dynamic analysis, AIAA Journal, Vol. 12, No. 8, pp. 995-1006, 1975

[7] Wamsler, M., Komzsik, L., and Rose, T.; Combination of quasi-static and truncated system mode shapes, Proceedings of NASTRAN European User's Conference, The MacNeal-Schwendler Corporation, Amsterdam, 1992


Part III

Engineering Solution Computations


13

Modal Solution Technique

We now embark on the road to calculate the engineering solutions, the topic of the third and final part of the book. This chapter will discuss a frequently used solution technique for the analysis of the transient behavior of a mechanical system, the modal solution technique. This is the application of the computational reduction techniques to the forced vibration problem.

13.1 Modal solution

The subject of the modal solution is the efficient solution of the forced, damped vibration problem of

M_{aa} \ddot{v}_a(t) + B_{aa} \dot{v}_a(t) + K_{aa} v_a(t) = F_a(t).

The modal solution is based on the free, undamped vibrations of the system, represented by the linear, generalized eigenvalue problem

(K_{aa} - \lambda M_{aa}) \phi_a = 0,

where \lambda = \omega^2. Let the matrix \Phi_{ah} contain h eigenvectors. They may have been computed directly by solving the a-partition eigenvalue problem, via dynamic reduction or via component mode synthesis.

The essence of the modal solution is to introduce the modal displacement w, defined as

v_a(t) = \Phi_{ah} w_h(t).

The length of the modal displacement vector is h. This equation will actually be used to transform the modal solution w_h(t) back to the original solution v_a(t) of the engineering problem.

Similarly, modal velocities

\dot{v}_a(t) = \Phi_{ah} \dot{w}_h(t)

and accelerations

\ddot{v}_a(t) = \Phi_{ah} \ddot{w}_h(t)

are introduced. Pre-multiplying the forced, damped vibration equilibrium equation by \Phi_{ah}^T gives

\Phi_{ah}^T [M_{aa} \ddot{v}_a(t) + B_{aa} \dot{v}_a(t) + K_{aa} v_a(t)] = \Phi_{ah}^T F_a(t),

and substituting the modal quantities, called modal coordinates, we get

\Phi_{ah}^T M_{aa} \Phi_{ah} \ddot{w}_h(t) + \Phi_{ah}^T B_{aa} \Phi_{ah} \dot{w}_h(t) + \Phi_{ah}^T K_{aa} \Phi_{ah} w_h(t) = \Phi_{ah}^T F_a(t).

Let us define the modal matrices as modal mass

M_{hh} = \Phi_{ah}^T M_{aa} \Phi_{ah},

modal damping

B_{hh} = \Phi_{ah}^T B_{aa} \Phi_{ah},

and modal stiffness

K_{hh} = \Phi_{ah}^T K_{aa} \Phi_{ah}.

The modal load is

F_h(t) = \Phi_{ah}^T F_a(t).

Substituting all of these results in the modal form of the equation of motion,

M_{hh} \ddot{w}_h(t) + B_{hh} \dot{w}_h(t) + K_{hh} w_h(t) = F_h(t).

This equation is now of order h, much smaller than the original equation of motion, and as such is much cheaper and easier to solve. The time domain analysis technique shown in Chapter 14 is eminently applicable to this set of equations.

The summary chart now demonstrates the relationship between the analysis set and the modal solution set as

\begin{bmatrix} K_{aa} \begin{bmatrix} K_{hh} \end{bmatrix} \end{bmatrix}.

In practical modal solution techniques the eigenvectors of the free vibration problem are mass normalized, so M_{hh} is the identity matrix and K_{hh} is a diagonal matrix containing the eigenvalues.

The damping definition style can have a major effect on the efficiency of the modal solution, where the reduction results in diagonal stiffness and mass matrices in the modal basis, but the reduced damping matrix can become full, causing the cost of such analysis to grow greatly. In the midst of all the uncertainty about the actual physics, and the great growth in computational costs, the most common damping used in modal analysis is called modal damping.

An experienced engineer who has modeled and tested similar structures has some notion of what the modal damping for different types of modes is likely to be on the structure being analyzed. Based on this, a modal damping value for each mode may be established, resulting in a diagonal modal damping matrix. This modal damping coefficient is multiplied by the natural frequency, resulting in the method called proportional damping. The modal equation in this case may be decoupled into a series of scalar equations as

\ddot{w}_{h[j]}(t) + 2 \xi_j \omega_j \dot{w}_{h[j]}(t) + \omega_j^2 w_{h[j]}(t) = f_j(t), \quad j = 1, 2, \ldots, h.

Here \omega_j^2 = \lambda_j is the j-th eigenvalue and w_{h[j]} is the j-th term in the modal solution vector. The modal damping matrix with proportional damping is defined as

B_{hh}[j, j] = 2 \xi_j \omega_j,

and is a diagonal matrix. The j-th modal load is

f_j(t) = \Phi_{ah}[\cdot, j]^T F(t),

where \Phi_{ah}[\cdot, j] is the j-th eigenvector (the j-th column of \Phi_{ah}).
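To make the reduction concrete, the following is a minimal numerical sketch (not the book's code) of forming the modal equation of motion with mass-normalized modes and proportional modal damping; the matrices K, M, the load F and the damping ratios zeta are assumed to be given.

```python
import numpy as np
from scipy.linalg import eigh

def modal_system(K, M, F, zeta, h):
    """Project a damped equation of motion onto h mass-normalized modes.

    Returns the mode shapes, the diagonal modal stiffness (eigenvalues),
    the diagonal proportional modal damping matrix and the modal load.
    """
    # Generalized symmetric eigenvalue problem (K - lambda M) phi = 0;
    # eigh returns mass-normalized eigenvectors: Phi^T M Phi = I.
    lam, Phi = eigh(K, M)
    lam, Phi = lam[:h], Phi[:, :h]         # keep the lowest h modes
    omega = np.sqrt(lam)                   # natural circular frequencies
    Khh = np.diag(lam)                     # modal stiffness = eigenvalues
    Bhh = np.diag(2.0 * zeta[:h] * omega)  # proportional modal damping
    Fh = Phi.T @ F                         # modal load
    return Phi, Khh, Bhh, Fh
```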

13.2 Truncation error in modal solution

The truncation error is introduced by having less than the full eigenspectrum of the K_{aa}, M_{aa} matrix pencil represented in the reduced forms. This issue is also of paramount importance to the dynamic reduction techniques presented in the earlier chapters.

To assess the error accrued by the modal solution, let us concentrate on the undamped case of the modal problem and execute a Laplace transformation of the form

w(\omega) = \int_0^{\infty} w(t) e^{-\omega t} \, dt.

Assuming zero initial conditions w(0) = 0 and \dot{w}(0) = 0 we get

(\omega^2 + \lambda_j) \, w_{h[j]}(\omega) = f_j(\omega),

for all j = 1, 2, \ldots, h. From this the modal solution component is

w_{h[j]}(\omega) = \frac{1}{\omega^2 + \lambda_j} f_j(\omega).

For simplicity in the following, the (\omega) notation will be ignored, as the frequency dependence is obvious. Combining the scalar equations into a matrix equation again, we obtain

w_h = \mathrm{diag}\left( \frac{1}{\omega^2 + \lambda_j} \right) \Phi_{ah}^T F,

where the right-hand side has been back-transformed to the original a-partition load vector. Back-transforming the modal solution results in

v_a = \Phi_{ah} w_h = Z_m F,

with the flexibility matrix

Z_m = \Phi_{ah} \, \mathrm{diag}\left( \frac{1}{\omega^2 + \lambda_j} \right) \Phi_{ah}^T.

This solution is unfortunately not exact, since not all the eigenvectors of the matrix pencil are used, i.e., h < a. This is the cause of the truncation error.

Assume now that the complete eigenspace consists of the calculated portion \Phi_{ah}^m and the truncated component \Phi_{ak}^r, with a = h + k. Let us represent this by the partitioning

\Phi_{aa} = \begin{bmatrix} \Phi_{ah}^m & \Phi_{ak}^r \end{bmatrix}.

Taking this into consideration, the exact solution is now

v_a = Z_m F_a + Z_r F_a,

where the two distinct flexibility matrices are called the modal and residual flexibility matrices, respectively. Note that the latter is not computed. The truncated component of the response solution is simply

v_a^r = Z_r F_a = v_a - Z_m F_a.

As the exact solution v_a is not known, this formula is not useful for measuring the error. It is, however, useful in that it indicates how to reduce the truncation error. This is the topic of the next section.

13.3 The method of residual flexibility

The challenge in improving the accuracy of the modal solution is to account for the effect of the residual flexibility (Z_r) without computing the residual modes. Hence, the method described in the following is called the residual flexibility method [5].

Let us also assume that the eigenvectors contained in the residual set are related to eigenvalues well above the frequency of interest, the high frequency modes. Then \lambda_j \gg \omega^2 and we may approximate the residual flexibility matrix as

Z_r \approx \Phi_{ak}^r \, \mathrm{diag}\left( \frac{1}{\lambda_j} \right) \Phi_{ak}^{r,T}.

This suggests that the residual flexibility may be represented via static shapes. For this we consider the general case when the load consists of a time-dependent component H_a(t) and a time-invariant (sometimes called scaling) component G_a, as

F_a(t) = G_a H_a(t).

The static response \psi_a of the structure at any point in time is the solution of

K_{aa} \psi_a = G_a.

The complete modal representation of the stiffness matrix is

\Lambda_{aa} = \Phi_{aa}^T K_{aa} \Phi_{aa},

where

\Lambda_{aa} = \mathrm{diag}(\lambda_1, \ldots, \lambda_j, \ldots, \lambda_a).

Inverting both sides produces

\Lambda_{aa}^{-1} = \Phi_{aa}^{-1} K_{aa}^{-1} \Phi_{aa}^{-T}.

Pre- and post-multiplying both sides and using the orthogonality property of the eigenvectors results in

K_{aa}^{-1} = \Phi_{aa} \Lambda_{aa}^{-1} \Phi_{aa}^T.

Assuming the partitioning

\Phi_{aa} = \begin{bmatrix} \Phi_{ah}^m & \Phi_{ak}^r \end{bmatrix},

the inverse is of the form

K_{aa}^{-1} = \Phi_{ah}^m \Lambda^{-1,m} \Phi_{ah}^{m,T} + \Phi_{ak}^r \Lambda^{-1,r} \Phi_{ak}^{r,T} = Z_m + Z_r.

Here \Lambda^{-1,m} and \Lambda^{-1,r} are diagonal matrices containing the inverses of the eigenvalues corresponding to the computed modal space and the uncomputed residual space, respectively.

The residual flexibility from this equation is

Z_r = K_{aa}^{-1} - Z_m = K_{aa}^{-1} - \Phi_{ah}^m \Lambda^{-1,m} \Phi_{ah}^{m,T}.

The static solution component (residual vector) representing the residual flexibility is

\psi_a^r = Z_r G_a = K_{aa}^{-1} G_a - \Phi_{ah}^m \Lambda^{-1,m} \Phi_{ah}^{m,T} G_a.

Note that the right-hand side contains only computed quantities. Hence, the truncated component of the response solution is

v_a^r = \psi_a^r H_a(t).

Finally, the improved response solution corrected for the truncated residual flexibility is

v_a = \begin{bmatrix} \Phi_{ah}^m & \psi_a^r \end{bmatrix} \begin{bmatrix} w_h \\ H_a(t) \end{bmatrix}.

The disadvantage of this method is its computational expense. The stiffness matrix has to be factored and forward-backward substitutions executed for each load vector. Note that in the case of multiple loads, the \psi_a matrix has more than one column. In this case, it is important to assure the linear independence of the columns.

One can extend this method to produce a pseudo-mode shape from the static solution vector and augment the computed modal space with this vector. By the nature of this augmentation the method is then called the modal truncation augmentation method [1]. Note that if the F_a(t) load has multiple columns, the method described here would work with multiple residual vectors. Let us augment the modal space by the residual vector (for simplicity of discussion we consider only one):

\Psi_{ah+} = \begin{bmatrix} \Phi_{ah}^m & \psi_a \end{bmatrix}.

Here the index h+ represents the fact that the h size has been augmented by at least one additional vector. Let us execute the modal reduction of the stiffness and the mass matrix with this mode set as

k_{h+h+} = \Psi_{ah+}^T K_{aa} \Psi_{ah+}

and

m_{h+h+} = \Psi_{ah+}^T M_{aa} \Psi_{ah+}.

Solve the modal eigenvalue problem of

k_{h+h+} \phi_{h+h+} = m_{h+h+} \phi_{h+h+} \Lambda_{h+h+}

for all the eigenvalues and eigenvectors. These will be the basis of the final mode shapes of

\Phi_{ah+} = \Psi_{ah+} \phi_{h+h+}.

The last vector in \Phi_{ah+} corresponds to the residual vector shape and it is now mass and stiffness orthogonalized. Hence, executing the modal reduction with this modal space will result in improved modal solutions. See [2] for some practical results using this technique.
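As an illustration, the following sketch (an assumption-laden outline, not the book's implementation) computes one residual vector for a static load shape G and orthogonalizes the augmented basis through the small eigenvalue problem described above.

```python
import numpy as np
from scipy.linalg import eigh, solve

def augmented_modal_space(K, M, Phi_m, lam_m, G):
    """Append a residual vector for load shape G to the computed modes Phi_m.

    lam_m holds the eigenvalues of the computed modes. The returned basis is
    mass and stiffness orthogonalized via the small (h+1) eigenvalue problem.
    """
    # Residual flexibility applied to G: K^-1 G - Phi_m Lambda^-1 Phi_m^T G
    psi = solve(K, G) - Phi_m @ ((Phi_m.T @ G) / lam_m)
    Psi = np.column_stack([Phi_m, psi])   # augmented basis
    k = Psi.T @ K @ Psi                   # reduced stiffness
    m = Psi.T @ M @ Psi                   # reduced mass
    lam_aug, phi = eigh(k, m)             # small modal eigenvalue problem
    return Psi @ phi, lam_aug             # final, orthogonalized mode shapes
```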

The effect of the application of residual vectors is apparent in the following example. The applied load was to simulate a multiple G-force impact, with the load pattern shown in Figure 13.1. The duration of the impact load was 0.005 seconds.

FIGURE 13.1 Time dependent load

Correctly identifying such peak response locations is the most challenging component of modal analysis techniques. Missing such peak response components could result in catastrophic failures of structures in industrial use.

FIGURE 13.2 The effect of residual vector


The solid line shown in Figure 13.2 represents the results without using the residual vector. The dotted line represents the response with the inclusion of the residual vector. It is clearly following the peak of the load in the time interval.

13.4 The method of mode acceleration

There is yet another class of methods to improve the accuracy of modal solutions, called the mode acceleration method [1]. This method is applied a posteriori to improve the modal solution.

Let us reorganize the dynamic response equation as

K_{aa} v(t) = F(t) - M_{aa} \ddot{v}(t) - B_{aa} \dot{v}(t).

Furthermore, solve for the displacement response

v(t) = K_{aa}^{-1} F(t) - K_{aa}^{-1} M_{aa} \ddot{v}(t) - K_{aa}^{-1} B_{aa} \dot{v}(t).

Let us assume that the modal acceleration and velocity of the system are well represented by the computed modal space:

\ddot{v}(t) = \Phi_{ah}^m \ddot{w}(t)

and

\dot{v}(t) = \Phi_{ah}^m \dot{w}(t).

Approximate the inverse of the stiffness matrix with the computed modal space as

K_{aa}^{-1} \approx \Phi_{ah}^m \Lambda^{-1,m} \Phi_{ah}^{m,T}.

Using the mass orthogonality property of

\Phi_{ah}^{m,T} M_{aa} \Phi_{ah}^m = I

and substituting the approximate inverse into the last two terms of the right-hand side of the solution equation yields

v(t) = K_{aa}^{-1} F(t) - \Phi_{ah}^m \Lambda^{-1,m} \ddot{w}(t) - \Phi_{ah}^m \Lambda^{-1,m} B_{hh} \dot{w}(t),

where

B_{hh} = \Phi_{ah}^{m,T} B_{aa} \Phi_{ah}^m

is the modal damping matrix.

It is possible to prove that this method produces a mathematically identical solution to that of the method of residual flexibility. The only difference is in the order of operations and the computational complexity. The advantage of one method over the other is problem dependent and will not be further explored here.

13.5 Coupled modal solution application

A practical application problem for the modal solution technique is related to the structural acoustic problem presented in Chapter 6. In this application one executes two independent real symmetric eigensolutions. The first one assumes the structure is in a vacuum (fluid effects are ignored):

K_s \Phi_{sq_s} = M_s \Phi_{sq_s} \Lambda_s.

The second problem assumes that the structure in connection with the fluid is rigid:

K_f \Phi_{fq_f} = M_f \Phi_{fq_f} \Lambda_f.

The notation uses the earlier convention of q_s, q_f being the number of eigenvectors extracted from the structural and fluid problem, respectively. Let both sets of eigenvectors be mass normalized as

\Phi_{sq_s}^T M_s \Phi_{sq_s} = I_{q_s q_s}

and

\Phi_{fq_f}^T M_f \Phi_{fq_f} = I_{q_f q_f}.

Applying the modal reduction to the coupled problem yields

\begin{bmatrix} I_{q_s q_s} & 0 \\ \Phi_{fq_f}^T A \Phi_{sq_s} & I_{q_f q_f} \end{bmatrix} \begin{bmatrix} \ddot{w}_{q_s} \\ \ddot{w}_{q_f} \end{bmatrix} + \begin{bmatrix} \Lambda_{q_s q_s} & -\Phi_{sq_s}^T A^T \Phi_{fq_f} \\ 0 & \Lambda_{q_f q_f} \end{bmatrix} \begin{bmatrix} w_{q_s} \\ w_{q_f} \end{bmatrix} = \begin{bmatrix} \Phi_{sq_s}^T F_s \\ \Phi_{fq_f}^T F_f \end{bmatrix}.

Here \Lambda_{q_s q_s} and \Lambda_{q_f q_f} are the generalized stiffnesses. This is the coupled modal equilibrium equation that may be subjected to the time integration technique shown in Chapter 14. The final results are recovered from the modal coordinates as

u_s = \Phi_{sq_s} w_{q_s}

and

u_f = \Phi_{fq_f} w_{q_f}.

The formulation shown here utilizes the fact that the constituent matrices are symmetric, although they are combined in a form that leads to unsymmetric coupled matrices.

13.6 Modal contributions and energies

Upon completion of a modal solution the engineer is faced with the problem of evaluating the contribution of the various mode shapes to the result. Identifying these modal contributions is instrumental in improving the design, for example by pinpointing the locations where vibration absorbers or other kinds of damping components may be needed in the structure [3].

The modal contributions are computed from the modal space used to generate the modal solution. Let us partition the eigenvector matrix as follows:

\Phi_{ah} = \begin{bmatrix} \phi_{11} & \phi_{12} & \ldots & \phi_{1i} & \ldots & \phi_{1h} \\ \phi_{21} & \phi_{22} & \ldots & \phi_{2i} & \ldots & \phi_{2h} \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ \phi_{i1} & \phi_{i2} & \ldots & \phi_{ii} & \ldots & \phi_{ih} \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ \phi_{a1} & \phi_{a2} & \ldots & \phi_{ai} & \ldots & \phi_{ah} \end{bmatrix}.

The ith column of this matrix corresponds to the ith eigenvector of the modal space and the components of the eigenvectors are the terms of that column. Using the fact from Section 13.1 that the physical solutions are recovered from the modal solutions by

v_a(t) = \Phi_{ah} w_h(t),

the total modal contribution of all modes to the kth degree of freedom at time t is computed by

mc_k(t) = \begin{bmatrix} \phi_{k1} & \phi_{k2} & \ldots & \phi_{ki} & \ldots & \phi_{kh} \end{bmatrix} \begin{bmatrix} w_{h,1}(t) \\ w_{h,2}(t) \\ \ldots \\ w_{h,i}(t) \\ \ldots \\ w_{h,h}(t) \end{bmatrix}.

Here the term w_{h,i}(t) is the i-th component of the modal solution vector w_h(t). The individual terms in the result of this multiplication,

mc_{k,i}(t) = \phi_{ki} w_{h,i}(t),

represent the modal contribution of the ith mode to the kth physical displacement response at time t.

FIGURE 13.3 Modal contributions

In order to be able to compare the contributions of the individual modes, they are normalized by the total contribution of all modes as

mcn_{k,i}(t) = mc_{k,i}(t) / mc_k(t).

Since these computations are executed at distinct time steps of the solution, their normalization base is different and additional normalization may be necessary to compute actual contributions comparable between time instances.


These computations may also be executed at distinct excitation frequencies when the modal solution is executed in the frequency domain. This case is shown in Figure 13.3, from an auto industry example of modal contributions. The additional normalization across frequencies was executed to compute comparable actual modal contributions. The dominance of certain modes' contribution at a certain frequency enables the engineer to optimize the structure.

Another similar analysis capability with diagnostic value is the computation of modal energies. Assuming that the modal solution is the result of a harmonic excitation, the modal displacement at a time instance may be expressed as

w_h(t) = w_h^{Re} \cos(\omega t) - w_h^{Im} \sin(\omega t).

The value of analyzing the behavior of the structure via modal energies under harmonic excitation is in its indicative nature for the more generic excitations of the life cycle of the product. The modal strain energy of the ith mode is computed as

MSE_i(t) = \frac{1}{2} \left[ (w_{h,i}^{Re} \cos(\omega t) - w_{h,i}^{Im} \sin(\omega t))^T k_{ii} (w_{h,i}^{Re} \cos(\omega t) - w_{h,i}^{Im} \sin(\omega t)) \right],

where the k_{ii} term is the ith modal stiffness and w_{h,i} is the ith modal displacement component. Arithmetic manipulations yield the instructive form of

MSE_i(t) = \frac{1}{4} k_{ii} ((w_{h,i}^{Re})^2 + (w_{h,i}^{Im})^2) + \frac{1}{4} k_{ii} ((w_{h,i}^{Re})^2 - (w_{h,i}^{Im})^2) \cos(2\omega t) - \frac{1}{2} k_{ii} w_{h,i}^{Re} w_{h,i}^{Im} \sin(2\omega t).

The first term is the constant portion and the second two are the oscillating portions of the modal strain energy. The computed energy may be presented as

MSE = MSE^{const} + MSE^{osc} e^{i 2\omega t}.

Modal kinetic energy may be computed via an identical process, but replacing k_{ii} by the ith modal mass m_{ii} and replacing the modal solution terms w_{h,i} with the modal velocities \dot{w}_{h,i}, resulting in

MKE_i(t) = \frac{1}{4} m_{ii} ((\dot{w}_{h,i}^{Re})^2 + (\dot{w}_{h,i}^{Im})^2) + \frac{1}{4} m_{ii} ((\dot{w}_{h,i}^{Re})^2 - (\dot{w}_{h,i}^{Im})^2) \cos(2\omega t) - \frac{1}{2} m_{ii} \dot{w}_{h,i}^{Re} \dot{w}_{h,i}^{Im} \sin(2\omega t).

The structure of the modal kinetic energy is the same:

MKE = MKE^{const} + MKE^{osc} e^{i 2\omega t}.
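The decomposition into constant and oscillating parts is easy to evaluate numerically; the sketch below (an assumed helper, not from the book) returns the three coefficients of the modal strain energy of one mode from the real and imaginary parts of its modal displacement.

```python
def modal_strain_energy_parts(k_ii, w_re, w_im):
    """Coefficients of the modal strain energy of one mode:
    MSE_i(t) = const + c_cos * cos(2*omega*t) + c_sin * sin(2*omega*t)."""
    const = 0.25 * k_ii * (w_re**2 + w_im**2)   # constant portion
    c_cos = 0.25 * k_ii * (w_re**2 - w_im**2)   # cos(2wt) coefficient
    c_sin = -0.5 * k_ii * w_re * w_im           # sin(2wt) coefficient
    return const, c_cos, c_sin
```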

FIGURE 13.4 Modal kinetic energy distribution

Figure 13.4 enables the engineer to observe various scenarios. In the figure the dotted line represents the total modal kinetic energy. On the left, modes 48 and 49 are out of phase and result in a lower total energy than the higher mode 48. In contrast, on the right hand side, modes 57 and 58 are in phase, hence the total modal energy is higher in their region.

References

[1] Dickens, J. M., Nakagawa, J. M. and Wittbrodt, M. J.; A critique of mode acceleration and modal truncation methods for modal response analysis, Computers and Structures, Vol. 62, No. 6, pp. 985-998, 1997

[2] Rose, T.; Using residual vectors in MSC/NASTRAN dynamic analysis to improve accuracy, Proceedings of NASTRAN World User's Conference, The MacNeal-Schwendler Corporation, Los Angeles, 1991

[3] Wamsler, M.; The role of actual modal contributions in the optimization of structures, Engineering with Computers, Springer, August, 2008


14

Transient Response Analysis

We now address the problem of calculating the displacement response of the mechanical system in the time domain, called transient response analysis. The transient response analysis is the solution of

M \ddot{v}(t) + B \dot{v}(t) + K v(t) = F(t).

Note that the subscript of the matrices indicating their reduction state is omitted, as the techniques discussed here are applicable to either a-size or h-size matrices. In the first case it is called a direct transient solution, in the second case a modal transient solution. Naturally, the computational costs are significantly less in the latter case.

This equation is solved numerically at discrete time intervals using various numerical differentiation schemes. The two major classes of computations are explicit and implicit schemes.

14.1 The central difference method

The method is based on the equidistant 3-point central difference formulae for the first and second order numerical derivatives,

\dot{v}(t) = \frac{1}{2\Delta t} (v(t + \Delta t) - v(t - \Delta t)),

and

\ddot{v}(t) = \frac{1}{\Delta t^2} (v(t + \Delta t) - 2v(t) + v(t - \Delta t)).

In the central difference method the equilibrium of the system is considered at time t and the displacements are calculated at time t + \Delta t, where \Delta t is the equidistant time step; hence this is an explicit time integration scheme. Substituting into the equilibrium equation at t we get

M \frac{1}{\Delta t^2} [v(t+\Delta t) - 2v(t) + v(t-\Delta t)] + B \frac{1}{2\Delta t} [v(t+\Delta t) - v(t-\Delta t)] + K v(t) = F(t).

Reordering yields

\left[ \frac{1}{\Delta t^2} M + \frac{1}{2\Delta t} B \right] v(t + \Delta t) = F(t) + \left[ \frac{2}{\Delta t^2} M - K \right] v(t) - \left[ \frac{1}{\Delta t^2} M - \frac{1}{2\Delta t} B \right] v(t - \Delta t).

Note that the right-hand side contains only components at time t and t - \Delta t. Assigning

C_1 = \frac{1}{\Delta t^2} M + \frac{1}{2\Delta t} B,

C_2 = F(t),

C_3 = \frac{2}{\Delta t^2} M - K,

and

C_4 = -\frac{1}{\Delta t^2} M + \frac{1}{2\Delta t} B,

the problem becomes

C_1 v(t + \Delta t) = C_2 + C_3 v(t) + C_4 v(t - \Delta t).

In the modal transient solution the matrices are diagonal and the problem is easy to solve. In the case of direct transient response a linear system solution is required at each time step.

Assuming that the M, B and K matrices are constant (do not change with time) and the time step \Delta t is also constant, the C_1 matrix needs to be factored only once. Otherwise the C_1 matrix needs to be factored at each time step, rendering the method very expensive.
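A minimal dense-matrix sketch of this explicit scheme follows (illustrative only; the variable names and the single factorization reflect the constant-matrix case described above, and the starting vector at -Delta t follows the rule given in Section 14.3).

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def central_difference(M, B, K, F, v0, v0dot, dt, nsteps):
    """Explicit central difference time integration with constant M, B, K.

    F(i) must return the load vector at time step i. Returns the displacement
    history as a (nsteps+1) x n array.
    """
    n = M.shape[0]
    C1 = M / dt**2 + B / (2 * dt)
    C3 = 2 * M / dt**2 - K
    C4 = -M / dt**2 + B / (2 * dt)
    lu = lu_factor(C1)                     # factor C1 once
    v = np.zeros((nsteps + 1, n))
    v[0] = v0
    v_prev = v0 - v0dot * dt               # starting value v(-dt)
    for i in range(nsteps):
        rhs = F(i) + C3 @ v[i] + C4 @ v_prev
        v_prev = v[i]
        v[i + 1] = lu_solve(lu, rhs)       # advance to t + dt
    return v
```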

14.2 The Newmark method

The time integration method preferred for large scale linear analyses is based on the classical Newmark method. The foundation of the Newmark-\beta method [3] is the following approximation:

v(t + \Delta t) = \beta v(t) + (1 - 2\beta) v(t + \Delta t) + \beta v(t + 2\Delta t).

As the equation contains terms at time t + \Delta t on both sides, it is an implicit method. The central difference forms from the previous section computed at time t + \Delta t are

\dot{v}(t + \Delta t) = \frac{1}{2\Delta t} (v(t + 2\Delta t) - v(t)),

and

\ddot{v}(t + \Delta t) = \frac{1}{\Delta t^2} (v(t + 2\Delta t) - 2v(t + \Delta t) + v(t)).

Considering the equilibrium equation at time t + \Delta t,

M \ddot{v}(t + \Delta t) + B \dot{v}(t + \Delta t) + K v(t + \Delta t) = F(t + \Delta t),

and substituting, we obtain

\frac{M}{\Delta t^2} [v(t + 2\Delta t) - 2v(t + \Delta t) + v(t)] + \frac{B}{2\Delta t} [v(t + 2\Delta t) - v(t)] + K [\beta v(t + 2\Delta t) + (1 - 2\beta) v(t + \Delta t) + \beta v(t)] = F(t + \Delta t).

The stability of this formulation will be discussed in more detail in the next section. The choice of \beta = 1/3, originally recommended by Newmark, renders the form unconditionally stable. Using this value and reorganizing, one obtains

\left[ \frac{1}{\Delta t^2} M + \frac{1}{2\Delta t} B + \frac{1}{3} K \right] v(t + 2\Delta t) = F(t + \Delta t) + \left[ \frac{2}{\Delta t^2} M - \frac{1}{3} K \right] v(t + \Delta t) + \left[ -\frac{1}{\Delta t^2} M + \frac{1}{2\Delta t} B - \frac{1}{3} K \right] v(t).

Some practical implementations average the load vector over 3 time steps also:

F(t + \Delta t) = \frac{1}{3} (F(t + 2\Delta t) + F(t + \Delta t) + F(t)).

Introducing intermediate matrices, we have the following integration scheme:

C_1 v(t + 2\Delta t) = C_2 + C_3 v(t + \Delta t) + C_4 v(t),

where the coefficients are

C_1 = \frac{M}{\Delta t^2} + \frac{B}{2\Delta t} + \frac{K}{3},

C_2 = \frac{1}{3} (F(t + 2\Delta t) + F(t + \Delta t) + F(t)),

C_3 = \frac{2M}{\Delta t^2} - \frac{K}{3},

and

C_4 = -\frac{M}{\Delta t^2} + \frac{B}{2\Delta t} - \frac{K}{3}.

An iteration process is executed by factoring the C_1 matrix and solving against the current right-hand side. In essence each time step is a static solution; however, the right-hand side is updated at each time step.

If we assume again that the M, B and K matrices are constant, do not change with time, and the time step \Delta t is constant, then the C_1 matrix needs to be factored only once. If either one of these conditions is untrue, the C_1 matrix needs to be factored at each time step, a daunting task indeed.
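The corresponding implicit loop can be sketched as follows (again a simplified dense-matrix illustration under the constant-matrix assumption; the two starting displacement vectors are taken as given, as discussed in the next section).

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def newmark_beta(M, B, K, F, v_prev, v_curr, dt, nsteps, beta=1.0/3.0):
    """Implicit Newmark-beta integration with constant M, B, K.

    v_prev and v_curr are the displacement vectors at the first two time
    points; F(i) returns the load at step i. Returns the displacement history.
    """
    C1 = M / dt**2 + B / (2 * dt) + beta * K
    C3 = 2 * M / dt**2 - (1 - 2 * beta) * K
    C4 = -M / dt**2 + B / (2 * dt) - beta * K
    lu = lu_factor(C1)                          # single factorization
    history = [v_prev, v_curr]
    for i in range(nsteps):
        # average the load over three consecutive steps
        C2 = (F(i + 2) + F(i + 1) + F(i)) / 3.0
        v_next = lu_solve(lu, C2 + C3 @ v_curr + C4 @ v_prev)
        v_prev, v_curr = v_curr, v_next
        history.append(v_next)
    return np.array(history)
```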

14.3 Starting conditions and time step changes

We now need to discuss the starting conditions of these algorithms. If we start at time t = 0 and the first step is to evaluate the equilibrium at time t = \Delta t, we need to have v(0), F(0), v(-\Delta t) and F(-\Delta t) specified.

Let us assume that the initial displacement and velocity values are specified by the engineer. The starting components are computed from the assumption that at t < 0 the acceleration of the system is zero. Then the negative time components are computed as

v(-\Delta t) = v(0) - \dot{v}(0) \Delta t,

and

F(-\Delta t) = K v(-\Delta t) + B \dot{v}(0).

The starting load is also computed to assure that the acceleration at t = 0 is also zero:

F(0) = K v(0) + B \dot{v}(0).

It is sometimes necessary to change the time step in the middle of the process. Naturally, in this case the C_1 matrix needs to be re-factored. In addition, the starting components of the new time integration process need to be computed.

Let us consider the case of stopping an integration process at time t = T and changing the time step. The initial conditions for the new process are clearly

v(0) = v(T), \quad F(0) = F(T)

and

\dot{v}(0) = \dot{v}(T), \quad \ddot{v}(0) = \ddot{v}(T).

The starting components for the new integration process starting at time t = T with time step \Delta t_2 are computed as

v(-\Delta t_2) = v(T) - \dot{v}(T) \Delta t_2 + \frac{1}{2} \Delta t_2^2 \ddot{v}(T)

and

F(-\Delta t_2) = K v(-\Delta t_2) + B [\dot{v}(T) - \Delta t_2 \ddot{v}(T)] + M \ddot{v}(T).

14.4 Stability of time integration techniques

An issue of consequence is the stability of these numerical methods. A stable numerical solution method has the characteristic of producing small changes in the subsequent approximation steps if the initial conditions are subjected to small changes.

A two-step numerical solution scheme of

w_{i+1} = f_i + b w_i + c w_{i-1}

has its characteristic equation defined by

P(\lambda) = \lambda^2 - b\lambda - c = 0.

If the roots of the characteristic equation satisfy

|\lambda_i| \le 1,

then the method is called stable.

For the stability analysis of the two methods discussed above, we consider the modal form of the time domain equilibrium equation. Without restricting the generality of the following discussion, we assume that the full modal space was computed during the modal formulation. This results in a completely decoupled set of scalar differential equations of the form

\ddot{w}(t) + 2 \xi_i \omega_i \dot{w}(t) + \omega_i^2 w(t) = f_i(t), \quad i = 1, \ldots, n.

Here

\Phi_i^T B \Phi_i = 2 \xi_i \omega_i

implies proportional modal damping, and

f_i(t) = \Phi_i^T F(t).

The orthogonality conditions produce

\Phi_i^T K \Phi_i = \omega_i^2

and

\Phi_i^T M \Phi_i = 1.

The stability analysis of the i-th decoupled equation enables drawing conclusions for any particular method.

The central difference method in that context is formulated as

w(t + \Delta t)[1 + \xi_i \omega_i \Delta t] = f_i(t) \Delta t^2 + w(t)[2 - \omega_i^2 \Delta t^2] - w(t - \Delta t)[1 - \xi_i \omega_i \Delta t].

The characteristic equation for this case is

\lambda^2 (1 + t) + (r - 2)\lambda + (1 - t) = 0,

where

t = \xi_i \omega_i \Delta t

and

r = \omega_i^2 \Delta t^2.

Reformulating into a form with a unit leading coefficient,

\lambda^2 + \frac{r - 2}{1 + t} \lambda + \frac{1 - t}{1 + t} = 0,

yields the solution of the characteristic equation as

\lambda_{1,2} = \frac{2 - r}{2(1 + t)} \mp \frac{1}{2} \sqrt{ \left( \frac{r - 2}{1 + t} \right)^2 - 4 \, \frac{1 - t}{1 + t} }.

Setting \lambda_{1,2} = 1 and simplifying with the t \ne 0 condition, one obtains the limit of

\Delta t = \frac{2 \xi_i}{\omega_i}.

Introducing the period of the vibration (T_i = \frac{2\pi}{\omega_i}) instead of the frequency,

\Delta t = \frac{T_i \xi_i}{\pi}.

The interpretation of this is that for stability we need to impose a limit on the size of the time step \Delta t. The central difference method hence is a conditionally stable method; above the critical value of the time step the method is unstable. The condition for the complete model is

\Delta t_{critical} = \frac{c T}{\pi},

where T is the smallest free vibration period of the finite element model and c is a constant [1]. As this value is not necessarily or easily available, especially in the case of an unreduced (a-set size) solution, the explicit method is preferred only in cases when the time step can be kept very small to assure stability. An example of such cases is the crash analysis of automobiles.

The stability of the Newmark method is discussed next. Using the same approach as above, the Newmark equilibrium equation may be written as

w(t + 2\Delta t)[1 + \xi_i \omega_i \Delta t + \beta \omega_i^2 \Delta t^2] = w(t + \Delta t)[2 - (1 - 2\beta) \omega_i^2 \Delta t^2] + w(t)[-1 + \xi_i \omega_i \Delta t - \beta \omega_i^2 \Delta t^2] + f_i(t + 2\Delta t).

The characteristic equation defining the stability is

\lambda^2 (1 + t + \beta r) + \lambda(-2 + (1 - 2\beta) r) + (1 - t + \beta r) = 0.

Here t, r are the same as above. Rewriting the equation with a unit leading coefficient,

\lambda^2 + \lambda \, \frac{-2 + (1 - 2\beta) r}{1 + t + \beta r} + \frac{1 - t + \beta r}{1 + t + \beta r} = 0.

The solution of this equation is

\lambda_{1,2} = -\frac{-2 + (1 - 2\beta) r}{2(1 + t + \beta r)} \mp \frac{1}{2} \sqrt{ \left( \frac{-2 + (1 - 2\beta) r}{1 + t + \beta r} \right)^2 - 4 \, \frac{1 - t + \beta r}{1 + t + \beta r} }.

Assume real solutions for simplicity of our algebra; the general complex solution produces the same final result. Setting \lambda_{1,2} = 1 and reordering yields

4 + (4\beta - 1) r = 0.

Assuming that r \ne 0 (or \omega_i \ne 0), the selection of \beta \ge \frac{1}{4} will assure that the condition of the eigenvalues being less than unity is always satisfied. Hence, that is a lower bound for \beta. The value of \frac{1}{3} used in the method developed earlier is slightly higher, providing a safety cushion of stability.

In conclusion, the Newmark method with a proper selection of \beta is unconditionally stable, i.e., there is no limit on the time step size. Naturally, there are other accuracy considerations one must be aware of; for example, unreasonably large time steps will produce large finite difference approximation errors.


14.5 Transient response case study

The case study involves the modal transient response calculation of a complex chassis and wheel assembly of a car body, of which it is impractical to show a picture here. The subject of the analysis was to establish the transient response of the model to an impact type load, as shown in Figure 14.1.

FIGURE 14.1 Transient response

The figure demonstrates the effect of the initial impact and the following stabilization of the structure, indicated by the gradual decaying of the motion. The model consisted of 1.4 million nodes and over 788 thousand finite elements. The total number of degrees of freedom was above 8.6 million. The solution was obtained with the modal transient response approach, and 180 modes up to 3,000 Hz were captured to represent the modal space.

The total transient response analysis required 2,997 minutes of elapsed time, which amounts to more than two days of computing, clearly a weekend job. Of this time 2,175 minutes were spent on the computation of the modal space (the eigenvalue solution) and the rest was the time integration as well as the engineering result computations.

The run used 315 Mwords of memory and the disk high water level was 223 GBytes. The amount of I/O executed was a huge 6.65 Terabytes. The analysis was performed on a workstation containing 4 (1.5 GHz) CPUs.

Note that this solution would not have been practical at all without the modal approach, even on a higher performing workstation.

14.6 State-space formulation

In the complex spectral computations of Chapter 10 it proved practical to transform the quadratic problem into a linear problem of twice the size. The same concept applied to the transient response problem leads to the state-space formulation. This is especially advantageous when viscous damping is applied, as occurs when modeling the shock absorbers and other vibration control devices of car bodies.

We consider the transient response problem

M \ddot{v}(t) + B \dot{v}(t) + K v(t) = f(t),

and rewrite it as a 2 by 2 block linear problem:

\begin{bmatrix} M & 0 \\ 0 & I \end{bmatrix} \begin{bmatrix} \ddot{v}(t) \\ \dot{v}(t) \end{bmatrix} + \begin{bmatrix} B & K \\ -I & 0 \end{bmatrix} \begin{bmatrix} \dot{v}(t) \\ v(t) \end{bmatrix} = \begin{bmatrix} f(t) \\ 0 \end{bmatrix}.

Inverting and reordering brings the form of

\begin{bmatrix} \ddot{v}(t) \\ \dot{v}(t) \end{bmatrix} = \begin{bmatrix} -M^{-1}B & -M^{-1}K \\ I & 0 \end{bmatrix} \begin{bmatrix} \dot{v}(t) \\ v(t) \end{bmatrix} + \begin{bmatrix} M^{-1} f(t) \\ 0 \end{bmatrix}.

Introducing the so-called state vector

x = \begin{bmatrix} \dot{v}(t) \\ v(t) \end{bmatrix}

and its derivative

\dot{x} = \begin{bmatrix} \ddot{v}(t) \\ \dot{v}(t) \end{bmatrix}

results in the state-space formulation of

\dot{x} = A x + u.

The state transition matrix is the consequence of the above:

A = \begin{bmatrix} -M^{-1}B & -M^{-1}K \\ I & 0 \end{bmatrix},

and the state input vector is

u = \begin{bmatrix} M^{-1} f(t) \\ 0 \end{bmatrix}.

This formulation appears to be simpler to solve than the time integration techniques presented earlier in this chapter. It requires, however, the existence of the inverse of the M matrix, and the solution of the indefinite system may also pose computational difficulties.
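A compact sketch of assembling the state-space form and integrating it with an off-the-shelf ODE solver is shown below; the use of scipy's solve_ivp and an explicit dense inverse of M are illustrative assumptions, not the book's procedure.

```python
import numpy as np
from scipy.integrate import solve_ivp

def state_space_response(M, B, K, f, x0, t_span, t_eval):
    """Integrate x_dot = A x + u built from M, B, K and the load f(t).

    The state is x = [v_dot; v]; x0 stacks the initial velocity and
    displacement. Returns the solver result with the state history.
    """
    n = M.shape[0]
    Minv = np.linalg.inv(M)                      # assumes M is invertible
    A = np.block([[-Minv @ B, -Minv @ K],
                  [np.eye(n), np.zeros((n, n))]])

    def rhs(t, x):
        u = np.concatenate([Minv @ f(t), np.zeros(n)])
        return A @ x + u

    return solve_ivp(rhs, t_span, x0, t_eval=t_eval)
```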

The modal transient problem may also be brought to a state-space form. Let the modal reduction of the undamped problem be represented by

\omega_i^2 = \phi_i^T K \phi_i,

2 \xi_i \omega_i = \phi_i^T B \phi_i,

and

\phi_i^T M \phi_i = 1,

for i = 1, 2, \ldots, h. Then the modal state transition matrix is formed as

a = \begin{bmatrix} \mathrm{diag}(-2\xi_i\omega_i) & \mathrm{diag}(-\omega_i^2) \\ I_{hh} & 0_{hh} \end{bmatrix}.

The modal input becomes

u_h = \begin{bmatrix} \Phi_h^T M^{-1} f(t) \\ 0 \end{bmatrix}.

Introducing the modal displacement

v = \Phi_h w

and modal velocity

\dot{v} = \Phi_h \dot{w},

the modal state variable becomes

x_h = \begin{bmatrix} \dot{w}(t) \\ w(t) \end{bmatrix},

with derivative

\dot{x}_h = \begin{bmatrix} \ddot{w}(t) \\ \dot{w}(t) \end{bmatrix}.

The modal state-space form is

\dot{x}_h = a x_h + u_h.

In practical applications the input load may affect only a few degrees of freedom of the system and, conversely, the results may also be needed at only a few locations. These locations are not necessarily the same, leading to the so-called transfer mobility problem: what is the effect of a load applied at one place of the structure on the response at another location?

For example, the effect of the wheel excitation on the driver seat vertical acceleration is a standard evaluation procedure in the NVH (Noise, Vibration and Harshness) analysis of the automobile industry. By introducing a B input coupling matrix and a C output selection matrix, the formulation may be refined as a pair of equations

\dot{x} = A x + B u,

y = C x.

Here y contains the selected output. For example, if the engineer wants to measure the response only at a single output location, the C matrix will have a single row and as many columns as the state vector size (twice the free degrees of freedom in the system). The terms of the matrix will all be zero, except for the location corresponding to the output degree of freedom.

Conversely, if the input load affects only one degree of freedom of the model, the B matrix will have a single column with all zeroes but for the location of the loaded degree of freedom, which will be one. Multiple loaded or sensed degrees of freedom will result in multiple rows and columns of the C and B matrices, respectively.

A similar form for the modal case is also possible,

\dot{x}_h = a x_h + b u_h

and

y = c x_h,

but special considerations are necessary in populating b and c to accommodate the translation between the physical and modal degrees of freedom. A prominent application of the state-space formulation is in analyzing structures cooperating with control systems or active damping components [2].


References

[1] Bathe, K.-J. and Wilson, E. L.; Stability and accuracy analysis of direct integration methods, Int. Journal of Earthquake Engineering and Structural Dynamics, Vol. 1, pp. 283-291, 1973

[2] Gawronski, W. K.; Balanced control of flexible structures, Springer, New York, 1996

[3] Newmark, N. M.; A method of computation for structural dynamics, Proceedings of ASME Conference, ASME, 1959


15

Frequency Domain Analysis

Now we address the problem of calculating the displacement response of the mechanical system in the frequency domain. These calculations consider a load exerted on the structure with a frequency-dependent component. Since the computation is in the frequency domain, it is called frequency response analysis.

The discussions below focus on the undamped symmetric case, although the methodology directly carries over to the more general cases of unsymmetric matrices [1] and the damped (3-matrix) problem [2].

15.1 Direct and modal frequency response analysis

The frequency response solution methodology requires the calculation of the response of a mechanical model to external excitation in a sometimes very wide frequency range. That is expensive and the computational resource requirements are significant.

The direct frequency response of an undamped structure at an excitation frequency \omega_j is described by

(K_{aa} - \omega_j^2 M_{aa}) u_a(\omega_j) = F_a(\omega_j).

Here u is the response vector and F is the external load. The direct method provides a computationally exact solution for the engineering problem at hand.

The modal frequency response of an undamped structure is described by

(K_{hh} - \omega_j^2 M_{hh}) u_h(\omega_j) = F_h(\omega_j).

The modal frequency response solution provides a computationally exact solution to the engineering problem only when the full modal space is used.

Note that the loads may be frequency-dependent and the equation is evaluated at many given frequency locations \omega_j, j = 1, \ldots, m. Therefore, the solution is also frequency-dependent.

The solution techniques of this chapter are applicable to both of these cases; therefore, the subscripts will be deliberately ignored in the remainder of the chapter.

Let us introduce \mu = \omega_j^2 for simplicity. Let us also assume for now that the matrices are symmetric; this is not a restriction of generality, it just clarifies the discussion. Then the direct response at the j-th frequency is given as the solution of

(K - \mu M) u(\mu) = F(\mu).

For numerical stability the problem is sometimes shifted, or a form of spectral transformation is executed as follows. By defining a shift \sigma \ne \mu in the neighborhood of \mu and pre-multiplying this equation we get

(K - \sigma M)^{-1} (K - \mu M) u(\mu) = (K - \sigma M)^{-1} F(\mu).

In short, this is called the shifted dynamic equation of the response problem,

D(\mu) u(\mu) = q(\mu),

where the shifted dynamic matrix is

D(\mu) = (K - \sigma M)^{-1} (K - \mu M),

and the modified load is

q(\mu) = (K - \sigma M)^{-1} F(\mu).

The direct solution is then based on the

D(\mu) = LU

factorization and the following forward-backward substitution. These techniques were shown in Chapter 7.
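In a dense-matrix setting the direct approach amounts to one factorization and one substitution per excitation frequency, as in the following sketch (illustrative only; no shifting is applied here).

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def direct_frequency_response(K, M, loads, omegas):
    """Direct frequency response: solve (K - w^2 M) u = F(w) at each frequency.

    loads(w) returns the load vector at circular frequency w. Returns the
    responses as a list of vectors, one per entry of omegas.
    """
    responses = []
    for w in omegas:
        D = K - w**2 * M              # dynamic stiffness at this frequency
        lu = lu_factor(D)             # factorization ...
        responses.append(lu_solve(lu, loads(w)))  # ... and substitution
    return responses
```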

15.2 Reduced-order frequency response analysis

The reduced-order modeling method enables the approximate solution of the frequency response problem with much reduced resource requirements. The reduced-order response technique is an alternative spectral reduction technique. In this case, the spectrum of the dynamic matrix is approximated by a Krylov subspace, as opposed to approximation by actual eigenvectors.

The shifted dynamic matrix may be written as

D(\mu) = (K - \sigma M)^{-1} (K - (\sigma + \mu - \sigma) M) = I + (\sigma - \mu) S,

where

S = (K - \sigma M)^{-1} M.

A k-th order Krylov subspace spanned by this matrix S is defined by

\kappa_k = \mathrm{span}(q, Sq, S^2 q, S^3 q, \ldots, S^{k-1} q),

where q is a starting vector.

To generate the basis vectors for the Krylov subspace, the Lanczos algorithm, first introduced in Chapter 9, is an excellent candidate. Starting from v_1 = q / \beta, \beta = ||q||, the Lanczos method produces a set of vectors

V_k = \begin{bmatrix} v_1 & v_2 & v_3 & \ldots & v_k \end{bmatrix}

that span the k-th order Krylov subspace of S:

\kappa_k = \mathrm{span}(v_1, v_2, v_3, \ldots, v_k).

The Lanczos recurrence is described by

S V_k - V_k T_k = \beta_k v_{k+1} e_k^T,

where the T_k matrix is tridiagonal, containing the orthogonalization and normalization parameters of the process. The V_k vectors are orthonormal: V_k^T V_k = I. The right-hand side term represents the truncation error in case the subspace size k is less than the matrix size n.

Note that if the matrices are not symmetric, the bi-orthogonal Lanczos method from Section 10.2 is used, resulting in two (bi-orthogonal) sets of Lanczos vectors. For the sake of the following derivation, let us ignore the right-hand side term temporarily. Reordering and pre-multiplying yields

(\sigma - \mu) S V_k = (\sigma - \mu) V_k T_k.

From the reordered form of the shifted dynamic matrix it follows that

(\sigma - \mu) S = D(\mu) - I.

Substituting the latter into the prior equation produces

(D(\mu) - I) V_k = (\sigma - \mu) V_k T_k.

Another reordering yields

D(\mu) V_k = V_k (I + (\sigma - \mu) T_k).

Introducing

u(\mu) = V_k \bar{u}(\mu),

and substituting into the dynamic equation results in

V_k (I + (\sigma - \mu) T_k) \bar{u}(\mu) = q(\mu).

Finally, pre-multiplying by V_k^T produces

(I + (\sigma - \mu) T_k) \bar{u}(\mu) = V_k^T q(\mu).

The last equation is a reduced (k-th) order problem (hence the name of the method) as

D_k(\mu) \bar{u}(\mu) = \bar{q}(\mu).

Here

D_k(\mu) = I + (\sigma - \mu) T_k,

\bar{u}(\mu) = V_k^T u(\mu),

and

\bar{q} = V_k^T q.

This also may be considered the projection of our original problem onto the Krylov subspace.

The reduced-order problem (which is usually not formed explicitly) may be solved very conveniently. Solving for \bar{u}(\mu) produces the final response solution

u(\mu) = V_k (I + (\sigma - \mu) T_k)^{-1} V_k^T q(\mu).

In the case of a constant right-hand side, q(\mu) = q(\sigma), a further simplification is possible. Since q = \beta v_1 and V_k^T V_k = I, the right-hand side reduces to

V_k^T q = \beta e_1,

where e_1 is the first unit vector of the k-dimensional subspace. In this case, the final response solution is simply

u(\mu) = \beta V_k (I + (\sigma - \mu) T_k)^{-1} e_1.

Two specific aspects of this equation are noteworthy. One is that the matrix to be inverted here is tridiagonal. That, of course, is a computational advantage, ignoring the cost of producing the tridiagonal matrix T_k for the moment. Secondly, an appropriately chosen \sigma enables finding approximate solutions at several \mu locations without recomputing T_k, V_k.

Finally, since the order of the Krylov subspace is less than that of the original dynamic equation (k < n), the reduced-order, approximate solution acceptance issue needs to be addressed.
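The projection idea can be prototyped as below. This is only a rough sketch: it builds an orthonormal Krylov basis with a simple Arnoldi-style Gram-Schmidt loop and projects the operator explicitly, instead of using the Lanczos recurrence and the tridiagonal T_k described above; all names are illustrative.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve, solve

def reduced_order_response(K, M, F, sigma, mus, k):
    """Approximate u(mu) for several mu values from one Krylov subspace of
    S = (K - sigma*M)^-1 M, built around the shift sigma."""
    lu = lu_factor(K - sigma * M)            # single factorization at the shift
    q = lu_solve(lu, F)                      # modified load q = (K - sigma M)^-1 F
    # Orthonormal basis of span{q, Sq, ..., S^(k-1) q}
    V = np.zeros((len(F), k))
    V[:, 0] = q / np.linalg.norm(q)
    for j in range(1, k):
        w = lu_solve(lu, M @ V[:, j - 1])    # apply S to the previous vector
        w -= V[:, :j] @ (V[:, :j].T @ w)     # orthogonalize against earlier vectors
        V[:, j] = w / np.linalg.norm(w)
    T = V.T @ lu_solve(lu, M @ V)            # projected S (k x k, not tridiagonal here)
    responses = []
    for mu in mus:
        ubar = solve(np.eye(k) + (sigma - mu) * T, V.T @ q)
        responses.append(V @ ubar)           # back-project to full size
    return responses
```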

15.3 Accuracy of reduced-order solution

Revisiting the Lanczos recurrence equation above, including the truncation term and adding a term accounting for round-off error in finite arithmetic, we obtain

S V_k - V_k T_k = \beta_k v_{k+1} e_k^T + E_k,

where ||E_k|| = \epsilon_{machine} ||S|| and \epsilon_{machine} is representative of the floating point arithmetic accuracy. The residual of the original, full-size response problem is

r(\mu) = D(\mu) u(\mu) - q(\mu).

Combining the last two equations and repeating some algebraic steps, we produce a form useful for a convergence estimate:

r(\mu) = (\sigma - \mu)(\beta_k v_{k+1} e_k^T + E_k) q.

In practice the relative error of

e(\mu) = \frac{||r(\mu)||}{||F(\mu)||}

is used, where

||r(\mu)|| = |\sigma - \mu| \max( \beta_k |e_k^T q| ; ||E_k|| \, ||q|| ).

Finally, a response at a certain frequency is accepted if

e(μ) ≤ εacceptance,

where εacceptance is based on various engineering criteria.

It should be noted that the technique is advantageous mainly in models dominated by three-dimensional (solid) elements. These models have more widely spaced natural frequencies. Such models arise in the analysis of automobile components such as brakes and engine blocks.

The accuracy of the approximated responses deteriorates quickly if there are natural frequencies between the shift point and the response points. These issues unfortunately limit the performance of this method in the response analysis of shell structures. The latter class includes car body and airplane fuselage or wing models.

15.4 Frequency response case study

FIGURE 15.1 Satellite model

We will consider the example of a satellite similar to the one shown in Figure 15.1 as an illustration. Such objects undergo a wide frequency range of excitation during their launch and operational life-cycle; therefore the frequency response analysis is of utmost importance.

The satellite model consisted of approximately 710,000 node points and 760,000 elements of various kinds, reflecting the complexity of the model. The total number of degrees of freedom exceeded 4.2 million and the model was analyzed at 800 excitation frequencies.

The modal frequency response analysis required approximately 200 minutes of elapsed time on a workstation with 4 (1.5 GHz) processors. The amount of I/O operations was just over one Terabyte, and almost 100 Gigabytes of disk footprint were required to complete the analysis.

15.5 Enforced motion application

It is common in engineering practice to have a non-stationary excitation on the structure called an enforced motion, as it may be a displacement, velocity or even acceleration. The practical value of such an approach is immense [4].

The excitation is restricted to a partition of the problem's degrees of freedom denoted by (s). Since the mechanism of applying an enforced displacement is similar to single point constraints, the notation and partitioning resemble the process described in Section 4.5:

\begin{bmatrix} M_{ff} & M_{fs} \\ M_{sf} & M_{ss} \end{bmatrix} \begin{bmatrix} \ddot{v}_f \\ \ddot{v}_s \end{bmatrix} + \begin{bmatrix} B_{ff} & B_{fs} \\ B_{sf} & B_{ss} \end{bmatrix} \begin{bmatrix} \dot{v}_f \\ \dot{v}_s \end{bmatrix} + \begin{bmatrix} K_{ff} & K_{fs} \\ K_{sf} & K_{ss} \end{bmatrix} \begin{bmatrix} v_f \\ v_s \end{bmatrix} = \begin{bmatrix} P_f \\ 0_s \end{bmatrix}.

The commonly used approach is based on the first equation as

M_{ff} \ddot{v}_f + B_{ff} \dot{v}_f + K_{ff} v_f = P_f - (M_{fs} \ddot{v}_s + B_{fs} \dot{v}_s + K_{fs} v_s).

Note that the right hand side contains the active load and the enforced motion terms and as such is properly computable [3]. We assume a frequency dependent enforced motion of the structure in the form of

v_s(t) = u_s(\omega) e^{i\omega t},

resulting in a harmonic response

v_f(t) = u_f(\omega) e^{i\omega t}.

It follows that

\dot{v}_f = i\omega u_f, \quad \ddot{v}_f = -\omega^2 u_f,

and

\dot{v}_s = i\omega u_s, \quad \ddot{v}_s = -\omega^2 u_s.

Substituting into the equilibrium equation produces the governing equation in the frequency domain,

(-\omega^2 M_{ff} + i\omega B_{ff} + K_{ff}) u_f = P_f - (-\omega^2 M_{fs} + i\omega B_{fs} + K_{fs}) u_s.

Introducing

Z_{ff} = -\omega^2 M_{ff} + i\omega B_{ff} + K_{ff},

Z_{fs} = -\omega^2 M_{fs} + i\omega B_{fs} + K_{fs},

Z_{sf} = -\omega^2 M_{sf} + i\omega B_{sf} + K_{sf},

and

Z_{ss} = -\omega^2 M_{ss} + i\omega B_{ss} + K_{ss},

the complete frequency domain equation of the enforced motion problem may be written as

\begin{bmatrix} Z_{ff} & Z_{fs} \\ Z_{sf} & Z_{ss} \end{bmatrix} \begin{bmatrix} u_f \\ u_s \end{bmatrix} = \begin{bmatrix} P_f \\ 0 \end{bmatrix}.

Analyzing enforced motion in the modal space requires that the static shapes associated with the unit motion of each enforced motion point be appended to the flexible mode shapes when doing the modal reduction. These static shapes are computed as

\Phi_{fs} = -K_{ff}^{-1} K_{fs}.

The flexible mode shapes are obtained from the eigenvalue solution as

K_{ff} \Phi_{fh} = M_{ff} \Phi_{fh} \Lambda_h.

Introducing

u_d = \begin{bmatrix} u_f \\ u_s \end{bmatrix}

and

\Phi_{dx} = \begin{bmatrix} \Phi_{fh} & \Phi_{fs} \\ 0 & I_{ss} \end{bmatrix},

the modal substitution is executed as

u_d = \Phi_{dx} w_x,

where

w_x = \begin{bmatrix} w_h \\ u_s \end{bmatrix}.

Here w_h is the modal displacement sought. Pre-multiplying the complete equation by \Phi_{dx}^T results in the modal form of the problem:

\begin{bmatrix} \Phi_{fh}^T & 0 \\ \Phi_{fs}^T & I_{ss} \end{bmatrix} \begin{bmatrix} Z_{ff} & Z_{fs} \\ Z_{sf} & Z_{ss} \end{bmatrix} \begin{bmatrix} \Phi_{fh} & \Phi_{fs} \\ 0 & I_{ss} \end{bmatrix} \begin{bmatrix} w_h \\ u_s \end{bmatrix} = \begin{bmatrix} \Phi_{fh}^T P_f \\ 0 \end{bmatrix}.

Executing the indicated multiplications and developing the first equation results in

\Phi_{fh}^T Z_{ff} \Phi_{fh} w_h = \Phi_{fh}^T P_f - \Phi_{fh}^T (Z_{ff} \Phi_{fs} + Z_{fs}) u_s.

Introducing the modal matrices

z_{hh} = \Phi_{fh}^T Z_{ff} \Phi_{fh}

and

z_{hs} = \Phi_{fh}^T Z_{ff} \Phi_{fs} + \Phi_{fh}^T Z_{fs},

as well as the modal load

P_h = \Phi_{fh}^T P_f,

the modal solution may be obtained from

z_{hh} w_h = P_h - z_{hs} u_s.

The physical solution component is recovered by the relation

u_f = \Phi_{fh} w_h + \Phi_{fs} u_s.

The reaction forces at the constraints enforcing the displacements are computed as

Q = -\Phi_{fs}^T P_f.

Finally, it is quite simple to apply forces also to the enforced motion locations, by putting P_s into the zero block of the right hand sides of the equations.

References

[1] Freund, R. W.; Passive reduced-order modeling via Krylov subspace methods, Bell Laboratories Numerical Analysis Manuscript No. 00-3-02, March 2000

[2] Meerbergen, K.; The solution of parameterized symmetric linear systems and the connection with model reduction, SIAM Journal of Matrix Analysis and Applications, 2003

[3] Timoshenko, S.; Vibration problems in engineering, Wiley, New York, 1974

[4] Wamsler, M., Blanck, N. and Kern, G.; On the enforced relative motion inside a structure, Proceedings of the MSC 20th User's Conference, 1993


16

Nonlinear Analysis

In all earlier chapters we assumed that the load vs. displacement relationship is linear. In some applications, however, there is a nonlinear relationship between the loads and the displacements. This chapter presents the most widely used computational concepts of such solutions, but will not focus on the details of these issues, deferring to elaborate, specialized references such as [1].

16.1 Introduction to nonlinear analysis

A nonlinear relationship between the loads and the displacements can occur in several categories. The simplest case is called material nonlinearity, and it is captured in the stress-strain matrix D. This may be due to nonlinear elasticity, plasticity or visco-elasticity of the material. This category, as the origin of the nonlinearity indicates, is closely related to the material modeling issues and as such is application dependent.

In nonlinear analysis the stiffness matrix is composed of two distinct components as

K = K_l + K_{nl},

where the subscripts refer to linear and nonlinear, respectively. The accompanying strains also have two components,

\epsilon = \epsilon_l + \epsilon_{nl}.

The linear components K_l, \epsilon_l are the conventional stiffness and strain as derived in Chapter 3. The second component is formulated similarly, however, in terms of the nonlinear matrices.

In the case of material nonlinearity,

K_{nl} = \Sigma_e \int \int \int B^T D_{nl} B \, dV,

where D_{nl} is a nonlinear stress-strain matrix representing the material nonlinearity. Such a case is shown for example in Figure 16.1, when the strain exceeds the yield point of the material.


FIGURE 16.1 Nonlinear stress-strain relationship

Up to the yield point (\sigma_y, \epsilon_y), considering a bar undergoing tension or compression,

\sigma = E \epsilon,

which is clearly linear. Above the yield point, however, the relationship is not linear:

\sigma = f(E, \epsilon).

Finite elements formulated to accommodate this scenario are commonly called hyper-elastic elements. In the nonlinear case, due to the dependence of the stiffness matrix on the displacement, there is an imbalance between the external load and the internal forces of the model:

\Delta F = F - F_{int}(u).

F is the external force and F_{int} is the internal force of the model, computed by

F_{int}(u) = \Sigma_e \int \int \int B^T \sigma \, dV.

The force imbalance results in an incremental displacement \Delta u as follows:

\Delta F = K \Delta u.

The equilibrium of the nonlinear model is achieved when the force imbalance or the incremental displacement is zero or sufficiently small. This equilibrium is obtained by an iterative procedure that consists of steps at which the force imbalance is computed and tested. If it is not small enough, an incremental displacement related to the imbalance is computed, the displacement is adjusted and the imbalance is evaluated again. If the displacement exceeds a certain level, the stiffness matrix is also updated and the process repeated.

On a side note: in nonlinear analysis the automated singularity elimination process discussed in Chapter 5 may also be repeatedly executed. This is due to the presence of large deformations, which modify the original balance in that regard.

16.2 Geometric nonlinearity

Another case, called geometric nonlinearity, is manifested in the nonlinearity of the strain-displacement matrix B. The possible causes of this may be large deformation of some elements, or even contact between elements. This requires taking the second order displacement effects into consideration.

The nonlinear stiffness matrix in the case of geometric nonlinearity is

K_{nl} = \Sigma_e \int \int \int B_{nl}^T D B_{nl} \, dV.

The nonlinear strain-displacement matrix produces a nonlinear strain vector

\epsilon_{nl} = B_{nl} q_e.

The nonlinear strain vector components are of the form

\epsilon_{nl,x} = \frac{1}{2} \left( \left(\frac{\partial q_u}{\partial x}\right)^2 + \left(\frac{\partial q_v}{\partial x}\right)^2 + \left(\frac{\partial q_w}{\partial x}\right)^2 \right)

and

\tau_{nl,xy} = \frac{\partial q_u}{\partial x} \frac{\partial q_u}{\partial y} + \frac{\partial q_v}{\partial x} \frac{\partial q_v}{\partial y} + \frac{\partial q_w}{\partial x} \frac{\partial q_w}{\partial y}.

For a geometrically nonlinear tetrahedral element corresponding to the linear element introduced in Section 3.4, the strain-displacement matrix producing the nonlinear strain vector may be computed by introducing a 6 \times 9 intermediate matrix of the form

A = \begin{bmatrix} a_x^T & 0 & 0 \\ 0 & a_y^T & 0 \\ 0 & 0 & a_z^T \\ a_y^T & a_x^T & 0 \\ 0 & a_z^T & a_y^T \\ a_z^T & 0 & a_x^T \end{bmatrix},

with terms of

a_x = \begin{bmatrix} \frac{\partial q_u}{\partial x} \\ \frac{\partial q_v}{\partial x} \\ \frac{\partial q_w}{\partial x} \end{bmatrix}, \quad a_y = \begin{bmatrix} \frac{\partial q_u}{\partial y} \\ \frac{\partial q_v}{\partial y} \\ \frac{\partial q_w}{\partial y} \end{bmatrix}, \quad a_z = \begin{bmatrix} \frac{\partial q_u}{\partial z} \\ \frac{\partial q_v}{\partial z} \\ \frac{\partial q_w}{\partial z} \end{bmatrix}.

With the vector

b = \begin{bmatrix} a_x \\ a_y \\ a_z \end{bmatrix},

it is easy to verify that the multiplication

\epsilon_{nl} = \frac{1}{2} A b

produces the geometrically nonlinear strains of the above forms. Substituting the shape function derivatives as

b = C q_e,

where the 9 \times 12 matrix

C = \begin{bmatrix} \frac{\partial N_1}{\partial x} & 0 & 0 & \frac{\partial N_2}{\partial x} & 0 & 0 & \frac{\partial N_3}{\partial x} & 0 & 0 & \frac{\partial N_4}{\partial x} & 0 & 0 \\ 0 & \frac{\partial N_1}{\partial y} & 0 & 0 & \frac{\partial N_2}{\partial y} & 0 & 0 & \frac{\partial N_3}{\partial y} & 0 & 0 & \frac{\partial N_4}{\partial y} & 0 \\ 0 & 0 & \frac{\partial N_1}{\partial z} & 0 & 0 & \frac{\partial N_2}{\partial z} & 0 & 0 & \frac{\partial N_3}{\partial z} & 0 & 0 & \frac{\partial N_4}{\partial z} \end{bmatrix},

the 6 \times 12 nonlinear strain-displacement matrix becomes

B_{nl} = \frac{1}{2} A C,

satisfying

\epsilon_{nl} = B_{nl} q_e.

The computation of the nonlinear element matrix now proceeds according to Section 3.4.

A possible cause of geometric nonlinearity is large rotation of some of the structural elements. Large rotations of the elements could occur in many different physical applications. For a simple example we reconsider the bar model introduced in Chapter 4 undergoing tension-compression. In this case, however, we let it also attain a large rotation around the first node point on the left, as shown in Figure 16.2.


FIGURE 16.2 Rotated bar model

We will consider two degrees of freedom per node point, for the x and y displacements. The strain-displacement relationship is

\epsilon = B_{nl} q_e,

where

q_e = \begin{bmatrix} q_{1x} \\ q_{1y} \\ q_{2x} \\ q_{2y} \end{bmatrix},

and x is the original axial direction prior to the rotation and y is perpendicular to that. The strain-displacement matrix

B_{nl} = \frac{1}{l} \begin{bmatrix} -1 & 0 & 1 & 0 \\ 0 & -1 & 0 & 1 \end{bmatrix}

will result in the strains accommodating the large rotation as

\epsilon = \begin{bmatrix} l \cos(\alpha) - q_{1x} \\ l \sin(\alpha) - q_{1y} \end{bmatrix},

where l is the length of the bar and \alpha is the angle of rotation.

16.3 Newton-Raphson methods

The iterative process applied in the industry is based on the well-known Newton-Raphson iteration method of finding the zero of a nonlinear equation. In that method the solution to the nonlinear equation g(x) = 0 is obtained by approximating the equation with its first order Taylor polynomial around a point x_i in the iteration process, where i = 1, 2, \ldots until convergence is achieved:

g(x) = g(x_i) + g'(x_i)(x - x_i),

where

g'(x_i) = \frac{dg(x)}{dx} \Big|_{x = x_i}.

The estimate for the next point in the iteration comes from setting g(x) = 0:

x_{i+1} = x_i - \frac{g(x_i)}{g'(x_i)}.

Applying the method to the problem at hand, the Taylor polynomial for the force imbalance is written as

\Delta F(u) = \Delta F(u_i) + \frac{\partial \Delta F}{\partial u} \Big|_{u = u_i} (u - u_i).

Since the external force F is constant, the derivative simplifies as

\frac{\partial \Delta F}{\partial u} \Big|_{u = u_i} = -\frac{\partial F_{int}}{\partial u} \Big|_{u = u_i}.

Assuming that \Delta F(u) is zero as above, we get the next approximate displacement solution u_{i+1} from the equation

\Delta F(u_i) = \frac{\partial F_{int}}{\partial u} \Big|_{u = u_i} (u_{i+1} - u_i).

Introducing

\Delta u_{i+1} = u_{i+1} - u_i

and comparing with the relation between the force imbalance and the incremental displacement yields

K_i = \frac{\partial F_{int}}{\partial u} \Big|_{u = u_i}.

Hence, K_i is commonly called the tangent stiffness. Finally,

\Delta u_{i+1} = K_i^{-1} \Delta F_i.

A simplified algorithm of such an iterative scheme is as follows; a programmatic sketch is given after the loop.

For i = 1, 2, ... until convergence, compute:

1. Internal force: $F_{int}^i = F_{int}(u = u^i)$
2. Force imbalance: $\Delta F^i = F - F_{int}^i$
3. Tangent stiffness: $K^i = \left. \frac{\partial F_{int}}{\partial u} \right|_{u=u^i}$
4. Incremental displacement: $\Delta u^{i+1} = K^{i,-1} \Delta F^i$
5. Updated displacement: $u^{i+1} = u^i + \Delta u^{i+1}$

End loop on i.
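The following is a minimal Python sketch of the above loop for a system with user-supplied internal force and tangent stiffness functions; the function names, the convergence test on the force imbalance and the tolerance value are illustrative assumptions, not part of the formulation above.

```python
import numpy as np

def newton_raphson(F_ext, f_int, k_tangent, u0, tol=1.0e-8, max_iter=20):
    """Full Newton-Raphson iteration for the nonlinear equilibrium F = F_int(u).

    F_ext     : external load vector (constant)
    f_int     : function u -> internal force vector F_int(u)
    k_tangent : function u -> tangent stiffness matrix dF_int/du
    u0        : initial displacement guess
    """
    u = np.array(u0, dtype=float)
    for i in range(1, max_iter + 1):
        dF = F_ext - f_int(u)                  # force imbalance
        if np.linalg.norm(dF) <= tol:          # converged on the load imbalance
            return u, i
        K = k_tangent(u)                       # tangent stiffness at u^i
        du = np.linalg.solve(K, dF)            # incremental displacement
        u = u + du                             # updated displacement
    return u, max_iter

# for the single degree of freedom example of Section 16.6 (F = 3, F_int = 1 + sqrt(u)):
u, it = newton_raphson(np.array([3.0]),
                       lambda u: 1.0 + np.sqrt(u),
                       lambda u: np.diag(1.0 / (2.0 * np.sqrt(u))),
                       np.array([1.0]))
```

The call at the end converges to the theoretical equilibrium u = 4 of that example.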

The initial displacement $u^1$ and the initial internal force $F_{int}^1$ may or may not be zero. Note that the initial tangent stiffness $K^1$ is assumed to be nonzero. Figure 16.3 shows three steps of the iterative process.


FIGURE 16.3 Newton-Raphson iteration

If the ui displacement is close to the exact solution u∗ then the convergencerate of Newton’s method is quadratic, that is

||u∗ − ui+1|| ≤ ||u∗ − ui||2.This may not be true for an initial displacement very far from the equilib-rium or when the tangent (stiffness) changes direction. These are issues ofconcern in practical implementations of the method and some heuristics arerequired to solve them. The calculation of specific convergence criteria willbe discussed in Section 16.4.

The price for this good convergence rate is the rather time-consuming stepof the algorithm: the solution for the incremental displacement. In practice,the inverse of the tangent stiffness matrix is not explicitly computed, the fac-torization and forward-backward substitution operations shown in Chapter 7are used. In contrast, in linear static analysis, as discussed in earlier chapters,there is only one factorization and forward backward substitution required. Innonlinear analysis many of those steps may be needed to find the equilibrium.

The computational efficiency may be improved by modifying the Newton

Page 291: What Every Engineer Should Know about Computational Techniques of Finite Element Analysis, Second Edition

Nonlinear Analysis 281

iteration method. In this method, depicted in Figure 16.4, the tangent stiff-ness matrix is updated only at selected but not all the steps. This lessens thenumber of solutions (factorization and substitution) required. The decisionwhether to update the matrix may be based on the last incremental displace-ment.


FIGURE 16.4 Modified Newton iteration

By comparing Figures 16.3 and 16.4, one can see that the price of computational efficiency is numerical accuracy. While the modified method saves time by computing fewer updates and factorizations, the incremental displacements become smaller, resulting in a slower convergence rate. It is possible, however, to produce a nonlinear iteration technique that is both computationally and numerically efficient, as shown in the following section.


16.4 Quasi-Newton iteration techniques

The Newton method may be further improved by computing only an approximate update and inverse of the tangent stiffness matrix. Such a computation of the updated tangent stiffness matrix is called a quasi-Newton method and the scheme is similar to the one shown in Figure 16.4; however, the update now is a quasi-Newton update.

The quasi-Newton update calculates a secant type approximation of the tangent stiffness matrix based on the two previous iterations. One of the most frequently used methods in this class is the BFGS method (Broyden-Fletcher-Goldfarb-Shanno [2], [3], [4], [5]), discussed in the following.

We first introduce

$$ \gamma = \Delta F^{i-1} - \Delta F^i, $$

and for notational convenience

$$ \delta = u^i - u^{i-1}, $$

although it is really equivalent to $\Delta u^i$. The approximate updated stiffness matrix, based on the last two points and the just introduced quantities, is proposed as

$$ K^{i+1} = K^i + \frac{\gamma \gamma^T}{\gamma^T \delta} - \frac{K^i \delta \delta^T K^i}{\delta^T K^i \delta}. $$

This is a rank two update of a matrix. The Sherman-Morrison formula enables the calculation of the inverse of an updated matrix based on the original inverse and the update information. The general formula is

$$ (A + uv^T)^{-1} = A^{-1} - \frac{A^{-1} u v^T A^{-1}}{1 + v^T A^{-1} u}, $$

where A is an n by n matrix updated by the u and v vectors. The formula computes the inverse of the updated matrix in terms of the inverse of the original matrix and the updating vectors.

Applying the formula for $K^i$ being updated by two pairs of vectors simultaneously yields the BFGS formula:

$$ K^{i+1,-1} = K^{i,-1} + \left( 1 + \frac{\gamma^T K^{i,-1} \gamma}{\gamma^T \delta} \right) \frac{\delta \delta^T}{\delta^T \gamma} - \frac{\delta \gamma^T K^{i,-1} + K^{i,-1} \gamma \delta^T}{\gamma^T \delta}. $$


This is a rather complex formula and will be reformulated for computational purposes. Nevertheless, it directly produces the inverse for the next approximate stiffness from the last inverse and information from the last two steps.

We can reformulate the BFGS update as the sum of a triple product and a vector update as

$$ K^{i,-1} = A^T K^{i-1,-1} A + \frac{\delta \delta^T}{\delta^T \gamma}. $$

Here

$$ A = I - \frac{\gamma \delta^T}{\delta^T \gamma}. $$

This is a form more convenient for computer implementation, as it is possible to calculate the displacement increment without explicitly computing the new inverse:

$$ \Delta u^{i+1} = K^{i,-1} \Delta F^i. $$

This may be done in terms of the following intermediate computational steps. Compute the scalars

$$ a = (\delta^T \gamma)^{-1} $$

and

$$ b = a (\delta^T \Delta F^i). $$

Compute the vector

$$ q = \Delta F^i - b \gamma. $$

Solve with the earlier inverse

$$ r = K^{i-1,-1} q. $$

Compute the scalar

$$ c = a (\gamma^T r). $$

The incremental displacement due to the BFGS update is

$$ \Delta u^{i+1} = r + (b - c)\delta. $$

This implicit BFGS update may be executed recursively.
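A minimal Python sketch of these intermediate steps follows; the function interface is an illustrative assumption, with the solve against the previous stiffness passed in as a callable so that an existing factorization can be reused.

```python
import numpy as np

def bfgs_increment(solve_prev, dF_i, delta, gamma):
    """Implicit BFGS displacement increment, Delta u^{i+1} = K^{i,-1} Delta F^i.

    solve_prev : function q -> K^{i-1,-1} q (solve with the previous stiffness,
                 e.g. by reusing its factorization; no new inverse is formed)
    dF_i       : current force imbalance Delta F^i
    delta      : u^i - u^{i-1}
    gamma      : Delta F^{i-1} - Delta F^i
    """
    a = 1.0 / np.dot(delta, gamma)      # a = (delta^T gamma)^(-1)
    b = a * np.dot(delta, dF_i)         # b = a (delta^T dF^i)
    q = dF_i - b * gamma                # q = dF^i - b gamma
    r = solve_prev(q)                   # r = K^{i-1,-1} q
    c = a * np.dot(gamma, r)            # c = a (gamma^T r)
    return r + (b - c) * delta          # r + (b - c) delta
```

Because only the solve with the earlier stiffness is required, this form avoids both the explicit inverse and a new factorization, which is the whole point of the implicit update.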


16.5 Convergence criteria

We need to establish convergence criteria for both the incremental displacement and the load imbalance. Let us introduce the ratio of the incremental displacements

$$ q = \frac{||\Delta u^{i+1}||}{||\Delta u^i||}. $$

An upper bound for the displacement error at step i may be written as

$$ ||u - u^i|| \le ||u - u^{n+i+1}|| + ||u^{n+i+1} - u^{n+i}|| + ... + ||u^{i+1} - u^i||, $$

where n is the additional number of iterations required for convergence. We assume that q is constant during the iteration. Then

$$ ||u - u^i|| \le ||\Delta u^i|| (q^n + q^{n-1} + ... + q). $$

Finally, with the assumption that q is less than one, taking the limit of n to infinity and summing the geometric series, we get

$$ ||u - u^i|| \le ||\Delta u^i|| \frac{q}{1 - q}. $$

This convergence criterion may not be very accurate due to the assumptions. To represent fluctuations in the incremental displacement ratio, a form of

$$ q^i = c \frac{||\Delta u^i||}{||\Delta u^{i-1}||} + (1 - c) q^{i-1} $$

is sometimes used in practice, with c being a constant less than unity. An error function of the displacements can be formulated as

$$ \epsilon_u^i = \frac{||\Delta u^i||}{||u^i||} \frac{q^i}{1 - q^i}. $$

Finally, the error criterion for the load imbalance is composed as

$$ \epsilon_F^i = \frac{||\Delta F^i * u^i||}{||(F + F_{int}^i) * u^i||}, $$

where the * implies term by term multiplication of the vectors.

Convergence is achieved when either the condition for the load imbalance,

$$ \epsilon_F^i \le \epsilon_F, $$

or for the incremental displacement,

$$ \epsilon_u^i \le \epsilon_u, $$


is satisfied. The $\epsilon_F$, $\epsilon_u$ values are set by the engineer.
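A short Python sketch of these error measures follows; the smoothing constant value and the function signature are illustrative assumptions.

```python
import numpy as np

def convergence_errors(du, du_prev, u, dF, F_ext, F_int, q_prev, c=0.5):
    """Displacement and load imbalance error measures of this section.

    du, du_prev : current and previous displacement increments
    u           : current displacement vector
    dF          : current force imbalance, F_ext - F_int
    q_prev      : smoothed increment ratio from the previous iteration
    c           : smoothing constant, assumed less than one
    Returns (eps_u, eps_F, q_i); the displacement measure assumes q_i < 1.
    """
    q_i = c * np.linalg.norm(du) / np.linalg.norm(du_prev) + (1.0 - c) * q_prev
    eps_u = np.linalg.norm(du) / np.linalg.norm(u) * q_i / (1.0 - q_i)
    eps_F = np.linalg.norm(dF * u) / np.linalg.norm((F_ext + F_int) * u)
    return eps_u, eps_F, q_i
```

Convergence would then be declared when either returned value drops below the engineer-specified tolerance.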

16.6 Computational example

Let us demonstrate the above formulations with a simple computational example. We will consider a single degree of freedom system with the following input:

External load: F = 3.

Initial displacement: $u^1 = 1$.

Let the characteristics of the system be described by the internal load of

$$ F_{int} = 1 + \sqrt{u}. $$

Using the relationship developed above, the tangent stiffness is then

$$ K = \frac{1}{2\sqrt{u}}. $$

The theoretical equilibrium is at u = 4, when $F_{int} = 1 + \sqrt{4} = 3 = F$.

Let us first consider two Newton steps.

i = 1:

$$ \Delta F^1 = F - F_{int}(u = u^1) = 3 - (1 + \sqrt{1}) = 1 $$

$$ K^1 = \frac{1}{2\sqrt{u^1}} = \frac{1}{2}, \quad K^{1,-1} = 2 $$

$$ \Delta u^2 = K^{1,-1} \Delta F^1 = 2 \cdot 1 = 2 $$

$$ u^2 = u^1 + \Delta u^2 = 1 + 2 = 3 $$

i = 2:

$$ \Delta F^2 = F - F_{int}(u = u^2) = 3 - (1 + \sqrt{3}) = 2 - \sqrt{3} $$

$$ K^2 = \frac{1}{2\sqrt{u^2}} = \frac{1}{2\sqrt{3}}, \quad K^{2,-1} = 2\sqrt{3} $$

$$ \Delta u^3 = K^{2,-1} \Delta F^2 = 2\sqrt{3}(2 - \sqrt{3}) = 4\sqrt{3} - 6 $$


$$ u^3 = u^2 + \Delta u^3 = 3 + 4\sqrt{3} - 6 = 3.9282 $$

That is now very close to the theoretical solution.

Now we replace the last step with a modified Newton step as follows.

i = 2:

$$ \Delta F^2 = F - F_{int}(u = u^2) = 3 - (1 + \sqrt{3}) = 2 - \sqrt{3} $$

$$ \Delta u^3 = K^{1,-1} \Delta F^2 = 2(2 - \sqrt{3}) = 4 - 2\sqrt{3} $$

$$ u^3 = u^2 + \Delta u^3 = 3 + 4 - 2\sqrt{3} = 3.5359 $$

This value is clearly not as close to the theoretical solution as that of the regular Newton step; however, since the tangent stiffness was not updated and inverted, it was obtained at a much cheaper computational price.

Finally, we execute the step with the BFGS update.

i = 2:

$$ \delta = u^2 - u^1 = 3 - 1 = 2 $$

$$ \gamma = \Delta F^1 - \Delta F^2 = 1 - (2 - \sqrt{3}) = \sqrt{3} - 1 $$

$$ \bar{K}^{2,-1} = \frac{\delta \delta^T}{\delta^T \gamma} = \frac{\delta}{\gamma} = \frac{2}{\sqrt{3} - 1} $$

$$ \Delta u^3 = \bar{K}^{2,-1} \Delta F^2 = \frac{2}{\sqrt{3} - 1}(2 - \sqrt{3}) = 0.7321 $$

$$ u^3 = u^2 + \Delta u^3 = 3 + 0.7321 = 3.7321 $$

This result demonstrates the power of the method: without explicitly updating the stiffness matrix ($\bar{K}$ indicates the approximate inverse), we obtained a solution better than that of the modified method. Note that the update is in the simplified form since

$$ A = I - \frac{\gamma \delta^T}{\delta^T \gamma} = 0 $$

for the single degree of freedom (scalar) problem.

It is also notable that the approximate stiffness (not explicitly computed) would be

$$ \bar{K}^2 = \frac{\sqrt{3} - 1}{2}. $$

On the other hand, the slope of the secant between the points $(u^1, F_{int}^1)$ and $(u^2, F_{int}^2)$ is the same value. As mentioned above, the quasi-Newton method is a secant type approximation.
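The three variants of this example are easily reproduced in a few lines of Python; the following sketch prints the three values of $u^3$ computed above (3.9282, 3.5359 and 3.7321).

```python
import numpy as np

# single degree of freedom example: F = 3, F_int(u) = 1 + sqrt(u), K(u) = 1/(2 sqrt(u))
F = 3.0
f_int = lambda u: 1.0 + np.sqrt(u)
k_tan = lambda u: 1.0 / (2.0 * np.sqrt(u))

u1 = 1.0
dF1 = F - f_int(u1)                     # = 1
u2 = u1 + dF1 / k_tan(u1)               # full Newton step: u2 = 3
dF2 = F - f_int(u2)                     # = 2 - sqrt(3)

u3_newton = u2 + dF2 / k_tan(u2)        # tangent updated
u3_modified = u2 + dF2 / k_tan(u1)      # old tangent reused
delta, gamma = u2 - u1, dF1 - dF2
u3_bfgs = u2 + (delta / gamma) * dF2    # secant (BFGS) update

print(u3_newton, u3_modified, u3_bfgs)
```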


16.7 Nonlinear dynamics

The solution of nonlinear transient (or frequency) response problems introduces additional complexities. The solution is still based on the time integration schemes shown in Chapter 14; however, the integration is now embedded into the nonlinear iterations. The governing equation in this case is

$$ M \ddot{u}(t) + B \dot{u}(t) + K u(t) = F(t) - F_{int}(u(t)). $$

Note that the external load is considered to be time-dependent, but not dependent on the displacement. The Newmark β method introduced in Chapter 14 may also be recast in a two point recurrence formula more suitable for the nonlinear dynamics case:

$$ u(t + \Delta t) = u(t) + \Delta t \, \dot{u}(t) + \frac{1}{2}\Delta t^2 \ddot{u}(t) + \beta \Delta t^2 \left( \ddot{u}(t + \Delta t) - \ddot{u}(t) \right), $$

and

$$ \dot{u}(t + \Delta t) = \dot{u}(t) + \Delta t \, \ddot{u}(t) + \frac{1}{2}\Delta t \left( \ddot{u}(t + \Delta t) - \ddot{u}(t) \right). $$

Solving these equations for the velocity $\dot{u}(t + \Delta t)$ and acceleration $\ddot{u}(t + \Delta t)$ in terms of the displacements results in

$$ \dot{u}(t + \Delta t) = \frac{1}{2\beta\Delta t}\left( u(t + \Delta t) - u(t) \right) + \left( 1 - \frac{1}{2\beta} \right)\dot{u}(t) + \left( 1 - \frac{1}{4\beta} \right)\Delta t \, \ddot{u}(t), $$

and

$$ \ddot{u}(t + \Delta t) = \frac{1}{\beta\Delta t^2}\left( u(t + \Delta t) - u(t) \right) - \frac{1}{\beta\Delta t}\dot{u}(t) - \left( \frac{1}{2\beta} - 1 \right)\ddot{u}(t). $$

Substituting these into the governing equation at the next time step, considering also the iteration steps, yields

$$ M \ddot{u}^{i+1}(t + \Delta t) + B \dot{u}^{i+1}(t + \Delta t) + K^i(t + \Delta t) \Delta u^{i+1} = F(t + \Delta t) - F_{int}^i(t + \Delta t), $$

where

$$ K^i(t + \Delta t) = \frac{\partial F_{int}^i}{\partial u}(t + \Delta t). $$

The resulting equation is the nonlinear Newton-Raphson iteration scheme

$$ \left[ \frac{1}{\beta\Delta t^2} M + \frac{1}{2\beta\Delta t} B + K^i(t + \Delta t) \right] \Delta u^{i+1} = \Delta \bar{F}^i(t + \Delta t), $$

where

$$ u^{i+1}(t + \Delta t) = u^i(t + \Delta t) + \Delta u^{i+1}. $$


The dynamic load imbalance is

$$ \Delta \bar{F}^i(t + \Delta t) = \Delta F^i(t + \Delta t) - \frac{M}{\beta\Delta t^2}\left( u^i(t + \Delta t) - u(t) - \Delta t \, \dot{u}(t) \right) + \left( \frac{1}{2\beta} - 1 \right) M \ddot{u}(t) $$
$$ - \frac{1}{2\beta\Delta t} B \left( u^i(t + \Delta t) - u(t) \right) + \left( \frac{1}{2\beta} - 1 \right) B \dot{u}(t) - \left( 1 - \frac{1}{4\beta} \right)\Delta t \, B \ddot{u}(t). $$

Several comments are in order. First, the tangent stiffness matrix $K^i(t + \Delta t)$ may be replaced by K(t), resulting in the modified Newton-Raphson scheme. Also note that u(t) is the converged displacement in the last time step and $u^i(t + \Delta t)$ is the displacement in the i-th iteration for the next time step. Finally, the tangent stiffness matrix $K^i(t + \Delta t)$ may also be replaced by its approximation $\bar{K}^i(t + \Delta t)$, representing a BFGS update.

The iteration at a time step proceeds until the force imbalance is sufficiently small. At this point the resulting displacement becomes the starting point for the next time step. The algorithmic implementation of the above is rather delicate, and commercial finite element analysis systems have various proprietary adjustments to control the efficiency and numerical convergence of the process.
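To illustrate the structure of the scheme, the following is a minimal Python sketch for a single degree of freedom system; it assumes that any linear stiffness contribution is absorbed into the internal force function, and the tolerances, default β value and function names are illustrative choices rather than part of the formulation.

```python
import numpy as np

def newmark_newton_sdof(m, b, f_int, k_tan, f_ext, u0, v0, dt, nsteps,
                        beta=0.25, tol=1.0e-10, max_iter=30):
    """Nonlinear transient solution of m*a + b*v + f_int(u) = f_ext(t)
    with the Newmark beta recurrences of this section and a Newton
    iteration in every time step (single degree of freedom sketch)."""
    u, v = u0, v0
    a = (f_ext(0.0) - b * v - f_int(u)) / m             # initial acceleration
    history = [(0.0, u)]
    for n in range(1, nsteps + 1):
        t = n * dt
        ui = u                                          # start iterate for u(t+dt)
        for _ in range(max_iter):
            # dynamic load imbalance of this section
            dF = (f_ext(t) - f_int(ui)
                  - m / (beta * dt**2) * (ui - u - dt * v)
                  + (1.0 / (2 * beta) - 1.0) * m * a
                  - b / (2 * beta * dt) * (ui - u)
                  + (1.0 / (2 * beta) - 1.0) * b * v
                  - (1.0 - 1.0 / (4 * beta)) * dt * b * a)
            lhs = m / (beta * dt**2) + b / (2 * beta * dt) + k_tan(ui)
            du = dF / lhs
            ui += du
            if abs(du) <= tol:
                break
        # recover acceleration and velocity from the Newmark expressions
        a_new = ((ui - u) / (beta * dt**2) - v / (beta * dt)
                 - (1.0 / (2 * beta) - 1.0) * a)
        v_new = ((ui - u) / (2 * beta * dt) + (1.0 - 1.0 / (2 * beta)) * v
                 + (1.0 - 1.0 / (4 * beta)) * dt * a)
        u, v, a = ui, v_new, a_new
        history.append((t, u))
    return history
```

For a linear internal force this sketch reduces to the standard Newmark average acceleration integration, which provides a convenient consistency check.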

References

[1] Bathe, K.-J.; Finite element procedures in engineering analysis,Prentice-Hall, Englewood Cliffs, New Jersey, 1988

[2] Broyden, C. G.; The convergence of a class of double rank minimizationalgorithms, J. Inst. Math. Appl., Vol. 6, pp. 76-90, 222-231, 1970

[3] Fletcher, R.; A new approach to variable metric algorithms, ComputerJournal, Vol. 13, pp. 317-322, 1970

[4] Goldfarb, D.; A family of variable metric methods derived by variationalmeans, Math. Comp., Vol. 24, pp. 23-26, 1970

[5] Shanno, D. F.; Conditioning of quasi-Newton methods for function min-imization, Math. Comp., Vol. 24, pp. 647-656, 1970


17 Sensitivity and Optimization

The subject of this chapter is a class of computations that characterize the stability and provide the methodology for optimization of the engineering design.

17.1 Design sensitivity

At this stage of the engineering solution process we are getting close to the original engineering problem defined by the engineer. Sensitivity computations are based on computing derivatives of some computational results with respect to some changes in the model. The changes in the model are described by design variables. Design variables may be some geometric measures of the model, for example the length and width of a component. They may also be other engineering quantities, such as eigenfrequencies.

We collect the set of design variables in an array

$$ x = \begin{bmatrix} x_1 \\ .. \\ x_i \\ .. \\ x_m \end{bmatrix}. $$

The design variables are related to the displacements,

$$ x = x(u), $$

and conversely the displacement vector may be expressed in terms of the design variables,

$$ u = u(x). $$

Let us consider, again, the linear static solution

$$ K u = F. $$


The first variation of the linear static solution,

$$ K \frac{\partial u}{\partial x_i} + \frac{\partial K}{\partial x_i} u = \frac{\partial F}{\partial x_i}, $$

allows us to compute the sensitivity of the linear static solution with respect to the ith design variable as

$$ \frac{\partial u}{\partial x_i} = K^{-1}\left( \frac{\partial F}{\partial x_i} - \frac{\partial K}{\partial x_i} u \right). $$

Assuming that the initial design variable vector is $x_0$, the terms on the right-hand side may be computed by finite differences as

$$ \frac{\partial F}{\partial x_i} = \frac{F(x_0 + \Delta x_i) - F(x_0 - \Delta x_i)}{2\Delta x_i}, $$

and

$$ \frac{\partial K}{\partial x_i} = \frac{K(x_0 + \Delta x_i) - K(x_0 - \Delta x_i)}{2\Delta x_i}. $$

Hence, the sensitivity of the solution with respect to changes in the ith design variable is obtained.
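A minimal Python sketch of this computation follows; the functions returning the assembled matrices for a given design variable vector are an assumed, illustrative interface, and in practice the factorization of K from the nominal solution would be reused instead of a fresh dense solve.

```python
import numpy as np

def static_sensitivity(K, u, K_of_x, F_of_x, x0, i, dx):
    """Sensitivity du/dx_i of the linear static solution K u = F,
    with dK/dx_i and dF/dx_i approximated by central differences.

    K_of_x, F_of_x : functions returning the assembled K and F for a
                     given design variable vector (illustrative interface)
    x0, i, dx      : nominal design variables, variable index, perturbation
    """
    xp, xm = x0.copy(), x0.copy()
    xp[i] += dx
    xm[i] -= dx
    dK = (K_of_x(xp) - K_of_x(xm)) / (2.0 * dx)      # central difference of K
    dF = (F_of_x(xp) - F_of_x(xm)) / (2.0 * dx)      # central difference of F
    return np.linalg.solve(K, dF - dK @ u)
```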

17.2 Design optimization

The design variables may be automatically changed to optimize some design objective under some constraints, a process called design optimization. Let us consider, for example, a maximum stress constraint applied to the linear statics problem,

$$ \sigma \le \sigma_{max}, $$

or written in constraint equation form,

$$ g(x) = \sigma - \sigma_{max} \le 0. $$

Here the stress may be a compound measure, such as the von Mises stress described in Chapter 18, or it could be a single component. In the latter case multiple constraint equations are given. Other kinds of constraints may be, for example, the physical limits of some design variables.

The design objective may be formed as

$$ f(x) = minimum. $$


For example, a design objective may be to find the minimum weight (or volume) of the model that satisfies the stress constraint. Note that both the objective function f(x) and the constraint function g(x) are vector valued functions.

Finding the optimal design is based on the approximation of both the objective function,

$$ f(x_0 + \Delta x) = f(x_0) + \nabla f|_{x_0} \Delta x, $$

and the constraint function,

$$ g(x_0 + \Delta x) = g(x_0) + \nabla g|_{x_0} \Delta x. $$

The sensitivity of the objective function is computed by finite differences:

$$ \nabla f(x) = \begin{bmatrix} \frac{\partial f}{\partial x_1} \\ .. \\ \frac{\partial f}{\partial x_i} \\ .. \\ \frac{\partial f}{\partial x_m} \end{bmatrix} = \begin{bmatrix} \frac{f(x_0 + \Delta x_1) - f(x_0)}{\Delta x_1} \\ .. \\ \frac{f(x_0 + \Delta x_i) - f(x_0)}{\Delta x_i} \\ .. \\ \frac{f(x_0 + \Delta x_m) - f(x_0)}{\Delta x_m} \end{bmatrix}. $$

The constraint sensitivities are

$$ \nabla g(x) = \begin{bmatrix} \frac{\partial g}{\partial x_1} \\ .. \\ \frac{\partial g}{\partial x_i} \\ .. \\ \frac{\partial g}{\partial x_m} \end{bmatrix}. $$

The derivatives are

$$ \frac{\partial g}{\partial x_i} = \frac{\partial g}{\partial \sigma} \frac{\partial \sigma}{\partial x_i}. $$

As we introduced earlier, the stress vector is related to the displacements as

$$ \sigma = D B u, $$

hence

$$ \frac{\partial \sigma}{\partial x_i} = \frac{\partial \sigma}{\partial u} \frac{\partial u}{\partial x_i} = D B \frac{\partial u}{\partial x_i}. $$

Here $\frac{\partial u}{\partial x_i}$ is the sensitivity we computed in the prior section. With that,

$$ \frac{\partial g}{\partial x_i} = D B \frac{\partial u}{\partial x_i}. $$


To minimize the objective function, one direction in which the model may be suitably modified is the direction opposite to the gradient,

$$ d = -\nabla f. $$

This is the method called the steepest descent. However, this direction may violate the constraints, so it may need to be modified. The feasible directions are mathematically described as

$$ (\nabla f(x))^T d \le 0. $$

The equality in this equation results in the direction vector of the steepest descent. Use of any other direction results in the feasible direction method. The range of the directions is from 90 to 270 degrees from the gradient of the objective function.

To avoid violating the constraints, the

$$ (\nabla g(x))^T d \le 0 $$

condition must also be satisfied. The range of this direction vector is again from 90 to 270 degrees from the gradient of the constraint equation. The direction that satisfies both may be found by various heuristic techniques, such as the modified feasible direction method [7] frequently used in the industry. The formulation also enables the application of advanced automated mathematical optimization techniques, such as linear or nonlinear programming.

The modification of the model after a step of this optimization is

$$ x_1 = x_0 + \alpha d, $$

where α is a scalar (distance) variable and d is the direction chosen to obey all conditions. This is a very powerful approach since with the help of this single scalar we can modify the values of a large number of design variables.

The objective function and the constraint equations are adjusted accordingly:

$$ f_1 = f(x_1) $$

and

$$ g_1 = g(x_1). $$

It is assumed that

$$ f_1 \le f_0 $$

in the sense in which the minimum is defined. From this new design variable set we can again compute the sensitivities, and the process is repeated until the constrained optimum is achieved. At the constrained optimum the Kuhn-Tucker condition applies. The condition states that at the constrained optimum the vector sum of the objective and constraint gradients must be zero with some appropriately chosen multiplying factors. Mathematically,

$$ \lambda \nabla g(x^*) + \nabla f(x^*) = 0, $$

where $x^*$ is the optimal design variable setting. The design cannot be improved further. The λ is the appropriate (Lagrange) multiplier. Note again that most of the time there are several constraint conditions given, resulting in multiple ∇g vectors in the condition.


FIGURE 17.1 Optimum condition

The graphical representation of the condition is shown in Figure 17.1 with two constraint conditions.


17.3 Planar bending of the bar

During the course of the book we have discussed the bar element several times in order to present the computational material that was our main subject at that particular point. We started with a rigid bar in connection with multi-point constraints. Then the bar was allowed to have axial flexibility, and later it was allowed to undergo large rotations. Now, as a foundation for an optimization example, we consider the in-plane bending of the bar.

Let us consider the bar shown in Figure 17.2 and assume that it is bending in the x, y plane. Ignore for the moment the fact that it is supported at the left end. The bending depends on the forces and moments applied at the ends.


FIGURE 17.2 Planar bending of bar element

The possible deformations of the planar bending of the bar element are the vertical displacements ($q_y$) and the rotations around the z axis ($\theta_z$) at both of the two node points.


$$ q_e = \begin{bmatrix} q_{y1} \\ \theta_{z1} \\ q_{y2} \\ \theta_{z2} \end{bmatrix}. $$

Let us assume that the vertical displacement function is a cubic polynomial. This is a prudent choice based on the following argument. The mathematical model describing the bar bending problem is

$$ EI \frac{d^4 q_y}{dx^4} = F(x), $$

where I is the moment of inertia of the cross section with respect to the bending axis (the z axis) and E is the already introduced Young's modulus [1]. The right-hand side is the load; as a function of x it is distributed over the length of the bar. It may also be a point load or a moment applied at a certain point of the bar.

The homogeneous differential equation has a solution of the form

$$ q_y(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3, $$

hence the above assumption. Then the following is true:

$$ q_{y1} = q_y(0) = a_0, $$

and

$$ q_{y2} = q_y(l) = a_0 + a_1 l + a_2 l^2 + a_3 l^3. $$

The rotation angle and the slope of the vertical displacement are related as

$$ \theta_z = -\frac{dq_y}{dx}. $$

Utilizing this results in

$$ \theta_{z1} = -a_1, $$

and

$$ \theta_{z2} = -a_1 - 2a_2 l - 3a_3 l^2. $$

We are now seeking a matrix of shape functions to describe the vertical displacement field,

$$ q = N q_e, $$

where

$$ N = \begin{bmatrix} N_1 & N_2 & N_3 & N_4 \end{bmatrix}. $$

Page 306: What Every Engineer Should Know about Computational Techniques of Finite Element Analysis, Second Edition

296 Chapter 17

Some tedious algebra produces the satisfactory shape functions as

$$ N_1 = 1 - 3\frac{x^2}{l^2} + 2\frac{x^3}{l^3}, $$

$$ N_2 = -x + 2\frac{x^2}{l} - \frac{x^3}{l^2}, $$

$$ N_3 = 3\frac{x^2}{l^2} - 2\frac{x^3}{l^3}, $$

and

$$ N_4 = -\frac{x^2}{l} + \frac{x^3}{l^2}. $$

The strain due to the bending of the bar is

$$ \varepsilon = -y \frac{d^2 q_y}{dx^2}. $$

That is distinctly different from the axial strain of the bar under tension or compression, as shown in Section 4.3. The strain-displacement relationship of

$$ \varepsilon = B q_e $$

is satisfied with the matrix

$$ B^T = \begin{bmatrix} -\frac{6}{l^2} + \frac{12x}{l^3} \\ -\frac{6x}{l^2} + \frac{4}{l} \\ \frac{6}{l^2} - \frac{12x}{l^3} \\ \frac{2}{l} - \frac{6x}{l^2} \end{bmatrix}, $$

where the algebraic details are again omitted. Note that due to the form of the shape functions, the B matrix is not constant. The stiffness matrix for the bending bar element finally is

$$ k_e = \int_{x=0}^{x=l} B^T D B \, dx = EI \int_{x=0}^{x=l} B^T B \, dx, $$

with the stress-strain matrix for bending being

$$ D = EI. $$

Executing the multiplication and integration results in

$$ k_e = EI \begin{bmatrix} \frac{12}{l^3} & \frac{6}{l^2} & -\frac{12}{l^3} & \frac{6}{l^2} \\ \frac{6}{l^2} & \frac{4}{l} & -\frac{6}{l^2} & \frac{2}{l} \\ -\frac{12}{l^3} & -\frac{6}{l^2} & \frac{12}{l^3} & -\frac{6}{l^2} \\ \frac{6}{l^2} & \frac{2}{l} & -\frac{6}{l^2} & \frac{4}{l} \end{bmatrix}. $$

Note that this is the stiffness matrix for the bar bending in the x, y plane only. One can, however, easily modify this equation for bending in another plane by systematic changes. General formulations for simultaneous deformations in both planes and other variations are also possible [3], but beyond our focus.

The finite element equilibrium of the cantilever bar problem of Figure 17.2, modeled by a single bar element, is now written as

$$ k_e q = f, $$

or

$$ k_e \begin{bmatrix} q_{y1} \\ \theta_{z1} \\ q_{y2} \\ \theta_{z2} \end{bmatrix} = \begin{bmatrix} F_{y1} \\ M_{z1} \\ F_{y2} \\ M_{z2} \end{bmatrix}, $$

where the F are point loads and the M are moments applied at the ends. Let us now constrain the left end and apply only a point load at the right end, $F_{y2} = F$. Then

$$ k_e \begin{bmatrix} 0 \\ 0 \\ q_{y2} \\ \theta_{z2} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ F \\ 0 \end{bmatrix}. $$

The lower 2 × 2 partition of the equilibrium yields

$$ EI \begin{bmatrix} \frac{12}{l^3} & -\frac{6}{l^2} \\ -\frac{6}{l^2} & \frac{4}{l} \end{bmatrix} \begin{bmatrix} q_{y2} \\ \theta_{z2} \end{bmatrix} = \begin{bmatrix} F \\ 0 \end{bmatrix}. $$

Inverting gives the solution of

$$ \begin{bmatrix} q_{y2} \\ \theta_{z2} \end{bmatrix} = \frac{1}{EI} \begin{bmatrix} \frac{l^3}{3} & \frac{l^2}{2} \\ \frac{l^2}{2} & l \end{bmatrix} \begin{bmatrix} F \\ 0 \end{bmatrix} = \begin{bmatrix} \frac{Fl^3}{3EI} \\ \frac{Fl^2}{2EI} \end{bmatrix}. $$

Finally, based on the vertical displacement and the deflection angle solution, the maximum stress of the bar may be obtained as

$$ \sigma_{max} = E\varepsilon = E B q_e = \frac{Fly}{2I}, $$

where y is the vertical measure of the cross section profile of the bar.
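A short Python sketch of this single element cantilever solution follows; the numerical values of E, I, l and F are illustrative, and the printed tip deflection and rotation reproduce the closed forms $Fl^3/(3EI)$ and $Fl^2/(2EI)$ derived above.

```python
import numpy as np

def beam_stiffness(E, I, l):
    """Planar bending stiffness of a single bar element,
    DOFs ordered as [q_y1, theta_z1, q_y2, theta_z2]."""
    return E * I / l**3 * np.array([
        [ 12.0,    6*l, -12.0,    6*l],
        [  6*l, 4*l*l,   -6*l, 2*l*l],
        [-12.0,   -6*l,  12.0,   -6*l],
        [  6*l, 2*l*l,   -6*l, 4*l*l]])

E, I, l, F = 2.0e11, 1.0e-6, 2.0, 100.0      # illustrative values
ke = beam_stiffness(E, I, l)
K_ff = ke[2:, 2:]                            # constrain node 1, keep the free DOFs
q = np.linalg.solve(K_ff, np.array([F, 0.0]))
print(q[0], F * l**3 / (3 * E * I))          # tip deflection
print(q[1], F * l**2 / (2 * E * I))          # tip rotation
```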

17.4 Computational example

We consider the example of a bar constrained at the left end (see Figure 17.2) and loaded with a constant force F on the right end. The objective is to optimize the shape of the cross section of the bar to minimize the material needed, subject to the constraint of a given maximum stress $\sigma_{max}$.

Let the bar have a rectangular cross section of width $x_1$ and height $x_2$. These two will be chosen as the design variables, as hinted by the notation also.

The maximum stress of such a bar with length l, based on the results of the last section and substituting the cross section moment of inertia, is

$$ \sigma_{max} = \frac{6Fl}{x_1 x_2^2}. $$

This formula would of course not be explicitly available for a multiple element model; the stresses would have to be evaluated from the displacements, as shown in Section 18.2 in more detail. This fact does not take away from the generality of the following example, whose goal is to demonstrate some of the steps of optimization computations.

The volume of the bar is

$$ V = x_1 x_2 l. $$

For the sake of simplicity of the example, we consider a unit length bar, l = 1, and a loading force of F = 1/6.

From engineering practicality it follows that the shape of the cross section is bounded; neither a very thin but wide nor a very narrow but thick cross section is desirable. These may be expressed as

$$ \frac{x_2}{x_1} \le r_{max} $$

and

$$ \frac{x_2}{x_1} \ge r_{min}. $$

Again for simplicity's sake we will use

$$ r_{max} = 2, \quad r_{min} = \frac{1}{2} $$

in the following. These are quite practical limits; a cross section of height twice the width, or half the width, is a frequently used manufactured profile.

The optimization problem is now formally posed as follows.


In the design space of

$$ x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, $$

minimize the volume, i.e. the cross section area,

$$ f(x) = x_1 x_2, $$

subject to the stress constraint of

$$ g_1(x) = \frac{1}{x_1 x_2^2} - 1 \le 0 $$

and the shape constraints of

$$ g_2(x) = \frac{1}{2} - \frac{x_2}{x_1} \le 0 $$

and

$$ g_3(x) = \frac{x_2}{x_1} - 2 \le 0. $$

Here we used a unit numeric limit for the stress maximum,

$$ \sigma_{max} = 1. $$

The two-dimensional design space of the two variables is shown graphically in Figure 17.3.

The level curves of

$$ x_2 = \frac{c}{x_1} $$

demonstrate the objective function's behavior. The inequalities

$$ x_2 \le 2 x_1 $$

and

$$ x_2 \ge \frac{1}{2} x_1 $$

represent the boundaries of admissible cross section shapes. Finally, the inequality

$$ x_2 \ge \sqrt{\frac{1}{x_1}} $$

defines the boundary of the feasible region with respect to the stress constraint. It is clear from the arrangement that the objective function decreases toward the origin. The objective function's gradient may be calculated as

$$ \nabla f(x) = \begin{bmatrix} \frac{\partial f}{\partial x_1} \\ \frac{\partial f}{\partial x_2} \end{bmatrix} = \begin{bmatrix} x_2 \\ x_1 \end{bmatrix}. $$



FIGURE 17.3 Design space of optimization example

The feasible direction with respect to the objective function is simply

$$ d = -x_2 i - x_1 j, $$

pointing back toward the origin. Let us assume a quite feasible starting point of the optimization at $x_1 = 2, x_2 = 2$, a square cross section. The corresponding feasible direction vector is

$$ d = -2i - 2j. $$

This vector will clearly intersect the stress constraint curve $g_1(x)$ at the point $x_1 = 1, x_2 = 1$, so we may consider this the next point in the optimization sequence. The feasible direction vector is now

$$ d = -i - j, $$

which needs to be modified as it immediately violates the stress constraint.


A modification of the feasible direction vector to the left and to the right is possible until reaching the intersections of the shape constraint lines and the stress constraint curve. Of these points, the one closer to the $x_2$ axis,

$$ x_1 = \frac{1}{4^{1/3}}, \quad x_2 = \frac{2}{4^{1/3}}, $$

lies at a lower level of the objective function. In order to verify that we have reached an optimum, we will use the Kuhn-Tucker condition. The gradient of the upper shape constraint is

$$ \nabla g_3(x) = \begin{bmatrix} \frac{\partial g_3}{\partial x_1} \\ \frac{\partial g_3}{\partial x_2} \end{bmatrix} = \begin{bmatrix} -\frac{x_2}{x_1^2} \\ \frac{1}{x_1} \end{bmatrix} = -2 \cdot 4^{1/3} i + 4^{1/3} j. $$

The gradient of the stress constraint is

$$ \nabla g_1(x) = \begin{bmatrix} \frac{\partial g_1}{\partial x_1} \\ \frac{\partial g_1}{\partial x_2} \end{bmatrix} = \begin{bmatrix} -\frac{1}{x_1^2 x_2^2} \\ -\frac{2}{x_1 x_2^3} \end{bmatrix} = -4^{1/3} i - 4^{1/3} j. $$

Finally, the gradient of the objective function at this point is

$$ \nabla f(x) = \frac{2}{4^{1/3}} i + \frac{1}{4^{1/3}} j. $$

If this point is an optimum, the Kuhn-Tucker condition

$$ \nabla f(x) + \lambda_1 \nabla g_1(x) + \lambda_3 \nabla g_3(x) = 0 $$

must hold, since $g_1$ and $g_3$ are the active constraints here. With the selection of

$$ \lambda_1 = \frac{4}{3 \cdot 4^{2/3}} $$

and

$$ \lambda_3 = \frac{1}{3 \cdot 4^{2/3}}, $$

the condition is satisfied and an optimum exists at that point.

The optimal cross section shape has width $= \frac{1}{4^{1/3}}$ and height $= \frac{2}{4^{1/3}}$ units, resulting in the minimum volume of

$$ V = \frac{2}{4^{2/3}} = \frac{1}{2^{1/3}} $$

cubic units. Considering the starting material volume of 4 units, the reduction is rather significant.
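The optimum and the multipliers are easily checked numerically; the following Python sketch solves the two-by-two system for the multipliers of the active constraints by least squares and confirms that both are positive and that the stress constraint is active.

```python
import numpy as np

# candidate optimum of the example: x1 = 4**(-1/3), x2 = 2 * 4**(-1/3)
x1, x2 = 4.0 ** (-1.0 / 3.0), 2.0 * 4.0 ** (-1.0 / 3.0)

grad_f  = np.array([x2, x1])                     # gradient of f = x1 * x2
grad_g1 = np.array([-1.0 / (x1**2 * x2**2),      # stress constraint 1/(x1*x2^2) - 1
                    -2.0 / (x1 * x2**3)])
grad_g3 = np.array([-x2 / x1**2, 1.0 / x1])      # shape constraint x2/x1 - 2

# solve grad_f + lam1*grad_g1 + lam3*grad_g3 = 0 for the multipliers
A = np.column_stack([grad_g1, grad_g3])
lam = np.linalg.lstsq(A, -grad_f, rcond=None)[0]
print(lam)                  # both positive: Kuhn-Tucker condition satisfied
print(x1 * x2**2)           # = 1, the stress constraint is active
print(x1 * x2)              # minimum volume, equals 2**(-1/3)
```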


In practice, the resulting sizes may not be completely appropriate for measurement and manufacturing purposes. Therefore, the resulting dimensions are likely to be rounded up to the nearest measurable level. For our example the final results of width = 0.63 and height = 1.26 are practical. This slightly increases the theoretical optimum volume to the practical 0.794, which is still a significant improvement.

In the preceding, emphasis was given to the fact that the procedure outlined finds "an" optimum, not necessarily "the" optimum. This is an important aspect of such techniques: a local optimum in the neighborhood is found, but the global optimum may not be found.

The heuristic procedure for modifying the search direction, on the other hand, has not been exposed in very much detail beyond the conceptual level. Such algorithms are often proprietary, and the actual implementation in a software environment contributes to the efficiency as much as the cleverness of the algorithm.

17.5 Eigenfunction sensitivities

This topic has been a subject of interest in the area of perturbation theory for linear differential operators literally for centuries. For example, [4] presented results for eigenvalue derivatives over 160 years ago.

The sensitivities of the linear eigenvalue problem

$$ (K_{aa} - \lambda^i M_{aa}) \phi_a^i = 0 $$

deserve further consideration. Here $\lambda^i, \phi_a^i$ is the ith eigenpair of the problem, $i = 1, \ldots, m$. Differentiation with respect to the jth design variable and reordering results in

$$ (K_{aa} - \lambda^i M_{aa}) \frac{\partial \phi_a^i}{\partial x_j} + \left( \frac{\partial K_{aa}}{\partial x_j} - \lambda^i \frac{\partial M_{aa}}{\partial x_j} \right) \phi_a^i = \frac{\partial \lambda^i}{\partial x_j} M_{aa} \phi_a^i. $$

Pre-multiplying the equation with $\phi_a^{i,T}$ yields the sensitivity of the ith eigenvalue with respect to the jth design variable,

$$ \frac{\partial \lambda^i}{\partial x_j} = \frac{\phi_a^{i,T} \left( \frac{\partial K_{aa}}{\partial x_j} - \lambda^i \frac{\partial M_{aa}}{\partial x_j} \right) \phi_a^i}{\phi_a^{i,T} M_{aa} \phi_a^i}, $$


as the first term on the left-hand side vanishes. The derivative of the stiffness matrix was computed by finite differences in the prior section. Similarly, the mass matrix derivative is

$$ \frac{\partial M_{aa}}{\partial x_i} = \frac{M_{aa}(x_0 + \Delta x_i) - M_{aa}(x_0 - \Delta x_i)}{2\Delta x_i}. $$

Note that this equation is valid only for simple eigenvalues.
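A minimal Python sketch of the eigenvalue sensitivity follows, assuming the finite-difference matrix derivatives have already been formed as above.

```python
import numpy as np

def eigenvalue_sensitivity(phi, lam, dK, dM, M):
    """Sensitivity of a simple eigenvalue lam with eigenvector phi.

    dK, dM : finite-difference derivatives of the stiffness and mass matrices
             with respect to the chosen design variable
    M      : mass matrix (used for the modal mass in the denominator)
    """
    numerator = phi @ (dK - lam * dM) @ phi
    modal_mass = phi @ M @ phi
    return numerator / modal_mass
```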

Another reorganization yields the equation for the eigenvector sensitivity,

$$ (K_{aa} - \lambda^i M_{aa}) \frac{\partial \phi_a^i}{\partial x_j} = \left( \frac{\partial \lambda^i}{\partial x_j} M_{aa} + \lambda^i \frac{\partial M_{aa}}{\partial x_j} - \frac{\partial K_{aa}}{\partial x_j} \right) \phi_a^i. $$

This form contains the eigenvalue sensitivity and the mass and stiffness matrix derivatives, which may not be available. The following approximate form for the eigenvector sensitivity is based on [6]. Let us assume a slight perturbation of the jth design variable with the amount of $\Delta x_j$. Then the right-hand side of the above equation may be approximated by

$$ (K_{aa} - \lambda^i M_{aa}) \frac{\partial \phi_a^i}{\partial x_j} = \frac{1}{\Delta x_j} \left( \Delta \lambda^i M_{aa} + \lambda^i \Delta M_{aa} - \Delta K_{aa} \right) \phi_a^i. $$

Here

$$ \Delta K_{aa} = K_{aa}|_{x_j + \Delta x_j} - K_{aa}|_{x_j}, $$

and

$$ \Delta M_{aa} = M_{aa}|_{x_j + \Delta x_j} - M_{aa}|_{x_j}. $$

With the ith modal mass

$$ m^i = \phi_a^{i,T} M_{aa} \phi_a^i, $$

the third approximation component of the right-hand side is

$$ \Delta \lambda^i = \frac{1}{m^i} \left( \phi_a^{i,T} \Delta K_{aa} \phi_a^i - \lambda^i \phi_a^{i,T} \Delta M_{aa} \phi_a^i \right). $$

With these, the complete right-hand side is

$$ R_{aa}^i = \frac{1}{\Delta x_j} \left( \Delta \lambda^i M_{aa} + \lambda^i \Delta M_{aa} - \Delta K_{aa} \right) \phi_a^i. $$

Finally, denoting

$$ A_{aa}^i = K_{aa} - \lambda^i M_{aa}, $$

we can present the problem as a system of linear equations,

$$ A_{aa}^i \frac{\partial \phi_a^i}{\partial x_j} = R_{aa}^i. $$


By definition, the matrix of this system of n linear equations is singular. The rank deficiency is, however, only one, which may be overcome by arbitrarily setting one component of the solution vector. If we set it to zero and eliminate the corresponding row and column of $A_{aa}^i$, the problem may be solved and the eigenvector sensitivities obtained.

There are ways to calculate the eigenfunction derivatives analytically [5]. These methods are not attractive for the very large problems occurring in the industry and as such are not widely used in practice.

17.6 Variational analysis

This section addresses the issue of calculating the variation of the response of a structure with respect to a variation in a design parameter.

Let us consider the linear statics problem

$$ K u = F, $$

where K is the stiffness matrix and F is the load vector. The stiffness matrix is assembled as

$$ K = \Sigma_{e=1}^{ne} K_e, $$

where $K_e$ is the eth element matrix.

For the sake of simplicity, consider a single, scalar design variable; the forthcoming is, however, also true for a set of design variables. We assume that both K and F are functions of the design variable. The goal of the development is to establish the solution u as a function of the design variable x.

In order to achieve that, express the solution in a Taylor series

$$ u(x) = u_0 + \frac{\partial u}{\partial x} x + \frac{\partial^2 u}{\partial x^2} \frac{x^2}{2} + \ldots. $$

Here $u_0$ is the fixed static solution

$$ u_0 = K^{-1} F, $$

computed via the factorization

$$ K = LDL^T. $$


Considering n terms in the series we get

$$ u = \Sigma_{j=0}^{n} \frac{\partial^j u}{\partial x^j} \frac{x^j}{j!}. $$

A similar series expression is assumed for the element matrices as

$$ K_e = \Sigma_{j=0}^{n} \frac{\partial^j K_e}{\partial x^j} \frac{x^j}{j!}. $$

Let us now consider derivatives of the equilibrium equation with respect to the design variable. The kth derivative is

$$ \frac{\partial^k K}{\partial x^k} u + K \frac{\partial^k u}{\partial x^k} = \frac{\partial^k F}{\partial x^k}. $$

Substituting the element matrices,

$$ \Sigma_{e=1}^{ne} \frac{\partial^k K_e}{\partial x^k} u + K \frac{\partial^k u}{\partial x^k} = \frac{\partial^k F}{\partial x^k}. $$

The kth derivative of an element matrix may be written as

$$ \frac{\partial^k K_e}{\partial x^k} = \Sigma_{j=0}^{k-1} a_j \frac{\partial^j K_e}{\partial x^j}, $$

where

$$ a_j = \frac{x^j}{j!}. $$

Substituting results in

$$ \left( \Sigma_{e=1}^{ne} \Sigma_{j=0}^{k-1} a_j \frac{\partial^j K_e}{\partial x^j} \right) u + K \frac{\partial^k u}{\partial x^k} = \frac{\partial^k F}{\partial x^k}. $$

Considering that u is also a polynomial expression in $a_j$, albeit with different coefficients, and truncating it at the (k − 1)st term, we may write

$$ \left( \Sigma_{e=1}^{ne} \Sigma_{j=0}^{k-1} a_j \frac{\partial^j K_e}{\partial x^j} \right) u = \Sigma_{e=1}^{ne} \Sigma_{j=0}^{k-1} b_j \frac{\partial^{k-1-j} K_e}{\partial x^{k-1-j}} \frac{\partial^j u}{\partial x^j}, $$

where the binomial coefficients are

$$ b_j = \frac{(k-1)!}{(k-1-j)! \, j!}. $$

Reorganization yields a recursive equation for computing the derivatives of the displacement with respect to the design variable:

$$ K \frac{\partial^k u}{\partial x^k} = \frac{\partial^k F}{\partial x^k} - \Sigma_{e=1}^{ne} \Sigma_{j=0}^{k-1} b_j \frac{\partial^{k-1-j} K_e}{\partial x^{k-1-j}} \frac{\partial^j u}{\partial x^j}. $$

Page 316: What Every Engineer Should Know about Computational Techniques of Finite Element Analysis, Second Edition

306 Chapter 17

Specifically, for k = 1,

$$ K \frac{\partial u}{\partial x} = \frac{\partial F}{\partial x} - \Sigma_{e=1}^{ne} K_e u_0 $$

gives the first derivative. The second derivative is obtained from

$$ K \frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 F}{\partial x^2} - \Sigma_{e=1}^{ne} \Sigma_{j=0}^{1} b_j \frac{\partial^{1-j} K_e}{\partial x^{1-j}} \frac{\partial^j u}{\partial x^j} = \frac{\partial^2 F}{\partial x^2} - \Sigma_{e=1}^{ne} \left( \frac{\partial K_e}{\partial x} u_0 + K_e \frac{\partial u}{\partial x} \right), $$

and recursively on for higher derivatives. The computational complexity of the approach lies in the fact that the element stiffness matrices and their derivatives have to be reevaluated at all settings of the design variable, while the factorization of K is available from the initial static solution.
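A minimal dense-matrix Python sketch of the first two derivatives follows; it assumes the element matrices and their derivatives have been expanded to global size, and the Cholesky factorization stands in for the $LDL^T$ factorization that would be reused in practice.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def variational_derivatives(K, F, Ke_list, dKe_list, dF, d2F=None):
    """First and second displacement derivatives of K u = F with respect to a
    scalar design variable, following the recursion of this section.

    Ke_list  : element matrices K_e, expanded to global size
    dKe_list : element matrix derivatives dK_e/dx, same layout
    dF, d2F  : load vector derivatives (d2F defaults to zero)
    """
    factor = cho_factor(K)                  # single factorization, reused below
    u0 = cho_solve(factor, F)               # nominal static solution

    # k = 1:  K du/dx = dF/dx - sum_e K_e u0
    rhs1 = dF - sum(Ke @ u0 for Ke in Ke_list)
    du = cho_solve(factor, rhs1)

    # k = 2:  K d2u/dx2 = d2F/dx2 - sum_e (dK_e/dx u0 + K_e du/dx)
    d2F = np.zeros_like(F) if d2F is None else d2F
    rhs2 = d2F - sum(dKe @ u0 + Ke @ du for Ke, dKe in zip(Ke_list, dKe_list))
    d2u = cho_solve(factor, rhs2)
    return u0, du, d2u
```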

The cost of the evaluation of the right-hand side for each derivative may be lessened by considering the fact that the element matrices are assembled as

$$ K_e = \Sigma_{i=1}^{ng} J_i w_i B^T D B, $$

where ng is the number of Gauss points in the element. The $J_i$ Jacobian evaluated at a Gauss point, the $w_i$ Gaussian weights and the D material constitutive matrix do not change. The B strain-displacement matrix needs to be modified for every solution step.

In practice, the number of derivatives evaluated should be rather small to prevent this cost from becoming prohibitive. With two derivatives, for example, the variational solution of the linear statics problem is represented by

$$ u(x) = u(x_0) + \frac{\partial u}{\partial x} x + \frac{\partial^2 u}{\partial x^2} \frac{x^2}{2}, $$

where

$$ u(x_0) = u_0. $$

The usage of three design variable values, one below and one above a nominal value ($x_0$), has the merit of representing the range of a variational design variable. Let these be denoted by

$$ x_0 - \Delta x, \quad x_0, \quad x_0 + \Delta x. $$

This situation occurs, for example, when taking a symmetric manufacturing tolerance of a component into consideration. Such an approach provides an easy evaluation of the yet unresolved derivative of the load vector via finite differences:

$$ \frac{\partial F}{\partial x} = \frac{F(x_0 + \Delta x) - F(x_0 - \Delta x)}{2\Delta x}. $$

Since the nominal value is associated with the $u(x_0)$ solution, the variational expression is usually only evaluated for the upper and lower boundaries. This enables the evaluation of the range of variation of the static solution as

$$ u_{max} = u(x_0 + \Delta x) $$

and

$$ u_{min} = u(x_0 - \Delta x). $$

The number of derivatives to be computed may be raised if necessary, based on some magnitude criterion. It is not very expensive to go one derivative higher, as the computation is recursive. On the other hand, using a larger number of design variable values could increase the computing time considerably.

An advancement over variational analysis is stochastic analysis. The latter is useful when a structural component exhibits stochastic behavior. Such behavior could occur, for example, when material production procedures result in random variations of the Young's modulus or the density of the material.

One of the most popular stochastic approaches is the Monte-Carlo method. The method simulates the random response of the structure by computing deterministic responses at many randomly selected values of the design variable. After collecting these samples, the stochastic measures of the response variables, such as the expected value or the deviation, may be computed.

More advanced stochastic approaches use chaos theory based approximation of the response solution. Instead of giving a range to the design variable, it is considered to be a random variable with a certain distribution around a mean value,

$$ x \rightarrow (\xi, d(\xi)). $$

The stochastic solution is given by a polynomial expansion of the form

$$ u(\xi) = \Sigma_{i=0}^{m} a_i \Psi_i(\xi), $$

where the $\Psi_i$ are basis functions chosen in accord with the type of the distribution function of the stochastic variable. For example, for a normally distributed variable Hermite polynomials are used. The coefficients of the expansion are found by a least squares solution fitting the above response to a set of known, deterministic solution points.

Both the Monte-Carlo and the chaos expansion solutions rely on a set of sampling solutions that are in themselves deterministic, hence these methods may be executed externally to commercial finite element software. Finally, it is also possible to carry the stochastic phenomenon into the finite element formulation [2]; however, this requires an internal modification of the finite element software.


References

[1] Gallagher, R. H.; Finite element analysis: Fundamentals, Prentice Hall,1975

[2] Ghanem, R. G. and Spanos, P. D.; Stochastic finite elements: A spectralapproach, Springer,, New York, 1991

[3] Hughes, T. J. R.; The finite element method, Prentice-Hall, EnglewoodCliffs, New Jersey, 1987

[4] Jacobi, C. G., Uber eines lechtes Verfahren die in der Theorie derseccularstorungen vorkommenden Gleichungen numerisch aufzulosen,Zeitschrift fur Reine ungewandte Mathematik, Vol. 30, pp. 51-95, 1846

[5] Jankowic, M. S.; Exact nth derivatives of eigenvalues and eigenvectors,J. of Guidance and Control, Vol. 17, No. 1, 1994

[6] Nelson, R. B.; Simplified calculation of eigenvector derivatives, AIAAJournal, Vol. 14, pp. 1201-1205, 1976

[7] Vanderplaats, G. N.; An efficient feasible direction algorithm for designsynthesis, AIAA Journal, Vol. 22, No. 11, 1984


18 Engineering Result Computations

After the solution computations are executed, the engineering solution set is needed. This process is sometimes called data recovery in commercial software [1]. The recovery operations are needed because the final solution may have been obtained in a reduced or in modal form, or some constraints may have been applied.

18.1 Displacement recovery

Some displacement recovery operations were already shown in the various chapters; we now collect them to illuminate the complete process.

If a modal solution (Chapter 13) was executed, the analysis set solution is recovered by

$$ u_a = \Phi_{ah} u_h, $$

where $\Phi_{ah}$ is the modal space, assuming a frequency domain modal solution.

If, for example, dynamic reduction was executed (Chapter 11), then the recovery of the analysis set solution is executed by

$$ u_a = S u_d, $$

where S is the dynamic reduction transformation matrix.

Similar operations are needed when static condensation (Chapter 8) or component modal synthesis (Chapter 12) was executed. Of course, the analysis set solution may also have been obtained directly, in which case the prior steps are not executed.


The free set solution is obtained by

$$ u_f = \begin{bmatrix} u_a \\ u_{as} \end{bmatrix}, $$

where $u_{as}$ contains the automatically applied single-point constraints to remove singularities.

The recovery of the independent solution set, including the single-point constraints specified by the engineer, is done by

$$ u_n = \begin{bmatrix} u_f \\ Y_s \end{bmatrix}, $$

where the array $Y_s$ contains the enforced displacements.

Finally, the engineering g partition of displacements is recovered by

$$ u_g = \begin{bmatrix} I_{nn} \\ G_{mn} \end{bmatrix} u_n, $$

where the $G_{mn}$ matrix represents the multi-point constraints applied to the system.

At this point, the displacement solution of the engineering model $u_g$ is available. In some cases not all the components of the engineering solution set are obtained, in order to execute this process expeditiously.

This is especially useful, for example, in optimization analyses, where some physical quantities are objective function components. Since the optimization process will be doing repeated analyses, in the intermediate stages only some components of the complete engineering solution set are needed.

Let us denote the partition of the analysis solution set interesting to the engineer by p and partition

$$ u_a = \begin{bmatrix} u_p \\ u_{\bar p} \end{bmatrix}, $$

where $\bar p$ is the remainder of the a partition. Then, for example, the modal solution recovery may be significantly expedited in the form of

$$ u_p = \Phi_{ph} u_h, $$

where

$$ \Phi_{ah} = \begin{bmatrix} \Phi_{ph} \\ \Phi_{\bar p h} \end{bmatrix}. $$

Sometimes similar considerations are also used in partitioning the final results. The g partition is also computed only in places where there is a final result requested by the engineer. This, sometimes called sparse data recovery technology, could result in significant performance improvements on very large models in commercial environments.

18.2 Stress calculation

In mechanical system analysis, one of the most important issues besides the deformation (displacement solution) of the structure is the stress occurring in the structural components, as that is the cornerstone of structural integrity. The subject of the analysis is to determine the deformations and related stresses under the given loads, to simulate the real life work environment of the system.

Having computed all nodal displacements, either in the time domain or in the frequency domain, the element nodal displacements $q_e$ may be partitioned out. It was established in Chapter 3 that

$$ \varepsilon = B q_e $$

and

$$ \sigma = D \varepsilon. $$

Hence, the element stress is

$$ \sigma = D B q_e, $$

where

$$ \sigma = \begin{bmatrix} \sigma_x \\ \sigma_y \\ \sigma_z \\ \tau_{xy} \\ \tau_{yz} \\ \tau_{zx} \end{bmatrix}. $$

Here the σ are the normal and the τ are the shear stresses. They are most commonly evaluated at the element centroid. In practice the single stress formula of von Mises is often used. In terms of the normal and shear stresses it is written as

$$ \sigma_v^2 = \frac{1}{2}\left[ (\sigma_x - \sigma_y)^2 + (\sigma_y - \sigma_z)^2 + (\sigma_x - \sigma_z)^2 + 6(\tau_{xy}^2 + \tau_{yz}^2 + \tau_{zx}^2) \right]. $$

This single stress value is computed for every element. In order to present the final results to the engineer, these are converted to nodal values.
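A short Python sketch of the von Mises evaluation follows; the stress component ordering is the one used above.

```python
import numpy as np

def von_mises(s):
    """Von Mises stress from a stress vector
    s = [s_x, s_y, s_z, t_xy, t_yz, t_zx]."""
    sx, sy, sz, txy, tyz, tzx = s
    return np.sqrt(0.5 * ((sx - sy)**2 + (sy - sz)**2 + (sx - sz)**2
                          + 6.0 * (txy**2 + tyz**2 + tzx**2)))

# element stress from the element displacement vector: sigma = D @ B @ q_e,
# with B evaluated, for example, at the element centroid:
# sigma_v = von_mises(D @ B @ q_e)
```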



FIGURE 18.1 Stresses in triangle

18.3 Nodal data interpolation

In industrial practice it is common to calculate nodal values out of the constant element value with a least squares minimization. Assume that the element value given is $\sigma_v$ and we need corner values at a triangular element, as shown in Figure 18.1.

With the help of the shape functions, the stress at a point in the element is expressed in terms of the nodal values as

$$ \sigma = N_1 \sigma_1 + N_2 \sigma_2 + N_3 \sigma_3, $$

where the $\sigma_i$ are the yet unknown corner stresses and the $N_i$ are the shape functions as defined earlier.

Our desire is to minimize the following squared error for this element:

$$ E_e = \frac{1}{2} \int_A (\sigma_v - \sigma)^2 dA, $$


where $\sigma_v$ is the computed element stress. Expanding the square results in

$$ E_e = \frac{1}{2}\int_A \sigma_v^2 dA - \int_A \sigma_v \sigma \, dA + \frac{1}{2}\int_A \sigma^2 dA. $$

As $\sigma_v$ is constant over the element, the first integral is simply a constant,

$$ \frac{1}{2}\sigma_v^2 A = C_e. $$

Introducing

$$ \sigma_e = \begin{bmatrix} \sigma_1 \\ \sigma_2 \\ \sigma_3 \end{bmatrix}, $$

the stress is the product of two vectors,

$$ \sigma = N \sigma_e, $$

where

$$ N = \begin{bmatrix} N_1 & N_2 & N_3 \end{bmatrix}. $$

By substituting, the second integral becomes

$$ \sigma_v \int_A N \sigma_e dA = \sigma_e \sigma_v \int_A N dA, $$

as $\sigma_e$ and $\sigma_v$ are now both constant for the element. The third integral similarly changes to

$$ \frac{1}{2}\sigma_e^T \int_A N^T N dA \, \sigma_e. $$

The evaluation of these element integrals proceeds along the same lines as the computations shown in Chapters 1 and 3. We sum (assemble) all the element errors as

$$ E = \Sigma_{e=1}^{m} E_e = \Sigma_{e=1}^{m} \left( C_e - \sigma_v \sigma_e \int_A N dA + \frac{1}{2}\sigma_e^T \int_A N^T N dA \, \sigma_e \right), $$

where m is the number of elements in the finite element model. We then introduce

$$ S = \Sigma_{e=1}^{m} \sigma_e = \begin{bmatrix} \sigma_1 \\ \sigma_2 \\ \sigma_3 \\ ... \\ \sigma_{n-2} \\ \sigma_{n-1} \\ \sigma_n \end{bmatrix}, $$


where n is the number of nodes in the finite element model. Also introduce

$$ R = \Sigma_{e=1}^{m} \int_A N^T N dA, $$

and

$$ T = \Sigma_{e=1}^{m} \sigma_v \int_A N dA. $$

With these the assembled error is

$$ E = \frac{1}{2} S^T R S - S^T T + C. $$

We obtain the minimum of this error when

$$ \frac{\partial E}{\partial S} = 0, $$

which occurs with

$$ R S = T, $$

since $C = \Sigma_{e=1}^{m} C_e$ is constant. Note that the calculation of R is essentially the same as the calculation of the mass matrix in Section 3.5, apart from the absence of the density ρ. Hence, the matrix of this linear system is similar to the mass matrix in structure, and as such is very likely extremely sparse or strongly banded. Therefore, this linear system may be solved with the techniques shown in Chapter 7. The resulting nodal stress values in S form the basis for the last computational technique, discussed in the next section.

18.4 Level curve computation

The nodal values, whether they were direct solution results, such as the displacements, or were interpolated, as the stresses, are best presented to the engineer in the form of level, or commonly called contour, curves.

Let us consider the example of the triangular element with nodes 1, 2, 3 and nodal values $\sigma_1, \sigma_2, \sigma_3$ shown in Figure 18.2. We are to find the locations of the points R, S on the sides of the triangle that correspond to points P, Q that define a constant level of stress. Let the value of the level curve be $\sigma_l$. Introduce the ratio

$$ p = \frac{\sigma_l - \sigma_1}{\sigma_3 - \sigma_1}. $$



FIGURE 18.2 Level curve computation

Some arithmetic yields the coordinates of point R as

$$ x_R = p x_3 + (1 - p) x_1, $$

and

$$ y_R = p y_3 + (1 - p) y_1. $$

Similarly, the ratio

$$ r = \frac{\sigma_l - \sigma_2}{\sigma_3 - \sigma_2} $$

yields the coordinates of S as

$$ x_S = r x_3 + (1 - r) x_2, $$

and

$$ y_S = r y_3 + (1 - r) y_2. $$

These calculations, executed for all the elements, result in continuous level curves across elements and throughout the finite element model.
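A short Python sketch of this interpolation for one triangle follows; it assumes the level value lies between the corner values on the two edges shown in Figure 18.2 (0 ≤ p, r ≤ 1), while a general contouring routine would test each edge of the element.

```python
import numpy as np

def level_segment(xy, sigma, sigma_l):
    """End points of the level curve sigma = sigma_l inside one triangle.

    xy    : (3, 2) array of the corner coordinates of nodes 1, 2, 3
    sigma : (3,) nodal values sigma_1, sigma_2, sigma_3
    Returns the points R (on edge 1-3) and S (on edge 2-3)."""
    p = (sigma_l - sigma[0]) / (sigma[2] - sigma[0])
    r = (sigma_l - sigma[1]) / (sigma[2] - sigma[1])
    R = p * xy[2] + (1.0 - p) * xy[0]
    S = r * xy[2] + (1.0 - r) * xy[1]
    return R, S
```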


It should be noted that the above method is fine when the stresses of the exact solution are smooth. The method could result in large errors in cases when the exact stress is not smooth, for example, in elements located on singular edges of the model.

FIGURE 18.3 Physical load on bracket

18.5 Engineering analysis case study

The engineering analysis process and the computed results are demonstrated by the bracket model for which the geometric and finite element models were presented in Chapter 2. The first step is to apply the loads to the model, as shown in Figure 18.3, by a pressure load applied horizontally to the left face of the model.

FIGURE 18.4 Constraint conditions of bracket

Note that in modern engineering environments, such as the NX CAE environment of the virtual product development suite of Siemens PLM Software [2], the loads are applied to the geometric model in accordance with the physical intentions of the engineer. In this case the load is distributed on the face bounded by the highlighted edges and the direction of the load is denoted by the arrows.

The constraints are also applied to the geometric model. In the case of the example, the constraints were applied to the interior cylindrical holes where the bolts will be located, and are made visible by the arrows in Figure 18.4.


FIGURE 18.5 Deformed shape of bracket

The computational steps of the analysis process were executed by NX NASTRAN [3]. The displacements superimposed on the finite element mesh of the model result in the deformed shape of the model, as shown in Figure 18.5. This is one of the most useful pieces of information for the engineer.

The stress results projected to the undeformed finite element model are visualized in Figure 18.6. The lighter shades indicate the areas of higher stresses. The level curve representation of the results described in the last section was used for the contours.

The NX environment also enables the animation of the deformed shape with a continuous transition between the undeformed and the deformed geometry, another useful tool for the engineer.


FIGURE 18.6 Stress contours of bracket

References

[1] Craig, R. R. Jr.; Structural dynamics, An introduction to computer methods, Wiley, New York, 1981

[2] www.plm.automation.siemens.com/en_us/products/nx/design/index.shtml

[3] www.siemens.com/plm/nxnastran


Annotation

Notation - Meaning

P - Potential energy, permutation
Pk - Householder matrix
T - Kinetic energy, transformation matrix
D - Dissipative function, material matrix
D - Diagonal factor matrix
WP - Work potential
Ps - Strain energy
Ni - Matrix of shape functions
N - Shape functions
B - Strain displacement matrix
J - Jacobian matrix
qe - Element displacement
ke - Element stiffness
Ae - Element area
Ve - Element volume
Ee - Element energy
K - Stiffness matrix
M - Mass matrix
B - Damping matrix
F - Force matrix
G - Static condensation matrix
H1 - Hilbert space
I - Identity matrix
C - Cholesky factor
L - Lower triangular factor matrix
U - Upper triangular factor matrix
S - Dynamic reduction transformation matrix
Sj - Spline segment
Pi - Point coordinates
R - Rigid constraint matrix
V(Pi) - Voronoi polygon
Ys - Vector of enforced displacements
T - Tridiagonal matrix
Q - Permutation matrix, Lanczos vector matrix
X - Linear system solution
Y - Intermediate solution
Z - Residual flexibility matrix
ε - Strain vector
σ - Stress vector
α - Diagonal Lanczos coefficient
β - Off-diagonal Lanczos coefficient
λ - Eigenvalue, Lagrange multiplier
Λ - Eigenvalue matrix
φ - Eigenvector
Φ - Eigenvector matrix
Ψ - Residual flexibility matrix
ω - Frequency
μ - Shifted eigenvalue
λs - Spectral shift
Δt - Time step
ΔF - Nonlinear force imbalance
Δu - Nonlinear displacement increment
κk - Krylov subspace
θ - Rotational degrees of freedom
bi - Modal damping
f(x) - Objective function
g(x) - Constraint function
ki - Modal stiffness
mi - Modal mass
qk - Lanczos vectors
qi - Generalized degrees of freedom
r - Residual vector
t - Time
u - Displacement in frequency domain
v - Displacement in time domain
w - Modal displacement
wi - Weight coefficients


List of Figures

1.1 Membrane model . . . . . . . . . . . . . . . . . . . . . . . . . . 51.2 Local coordinates of triangular element . . . . . . . . . . . . . . 71.3 Meshing the membrane model . . . . . . . . . . . . . . . . . . . 121.4 Parametric coordinates of triangular element . . . . . . . . . . 161.5 A planar quadrilateral element . . . . . . . . . . . . . . . . . . 211.6 Parametric coordinates of quadrilateral element . . . . . . . . . 22

2.1 Bezier polygon . . . . . . . . . . . . . . . . . . . . . . . . . . . 322.2 The effect of weights on the shape of spline . . . . . . . . . . . 342.3 Multiple Bezier segments . . . . . . . . . . . . . . . . . . . . . 352.4 Continuity of spline segments . . . . . . . . . . . . . . . . . . . 362.5 Bezier patch definition . . . . . . . . . . . . . . . . . . . . . . . 372.6 Patch continuity definition . . . . . . . . . . . . . . . . . . . . . 392.7 B spline interpolation . . . . . . . . . . . . . . . . . . . . . . . 442.8 B spline approximation . . . . . . . . . . . . . . . . . . . . . . 462.9 Clamped B spline approximation . . . . . . . . . . . . . . . . . 472.10 Closed B spline approximation . . . . . . . . . . . . . . . . . . 482.11 Voronoi polygon . . . . . . . . . . . . . . . . . . . . . . . . . . 502.12 Delaunay triangle . . . . . . . . . . . . . . . . . . . . . . . . . . 522.13 Delaunay triangularization . . . . . . . . . . . . . . . . . . . . . 532.14 Design sketch of a bracket . . . . . . . . . . . . . . . . . . . . . 552.15 Geometric model of bracket . . . . . . . . . . . . . . . . . . . . 562.16 Finite element model of bracket . . . . . . . . . . . . . . . . . . 57

3.1 Discrete mechanical system . . . . . . . . . . . . . . . . . . . . 603.2 Degrees of freedom of mechanical particle . . . . . . . . . . . . 623.3 Tetrahedron element . . . . . . . . . . . . . . . . . . . . . . . . 66

4.1 Rigid bar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 764.2 Boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . 87

5.1 Deformed shape of bar . . . . . . . . . . . . . . . . . . . . . . . 965.2 Typical automobile body-in-white model . . . . . . . . . . . . . 103

6.1 Hexahedral finite element . . . . . . . . . . . . . . . . . . . . . 1066.2 Fuel tank model . . . . . . . . . . . . . . . . . . . . . . . . . . 1136.3 Truck cabin model . . . . . . . . . . . . . . . . . . . . . . . . . 114

323

Page 333: What Every Engineer Should Know about Computational Techniques of Finite Element Analysis, Second Edition

324 List of Figures

7.1 Crankshaft casing finite element model . . . . . . . . . . . . . . 131

8.1 Single-level, single-component partitioning . . . . . . . . . . . . 1428.2 Single-level, multiple-component partitioning . . . . . . . . . . 1488.3 Multiple-level, multiple-component partitioning . . . . . . . . . 1528.4 Automobile crankshaft industrial example . . . . . . . . . . . . 156

9.1  Spectral transformation . . . . . 160
9.2  Generalized solution scheme . . . . . 165
9.3  Trimmed car body model . . . . . 178
9.4  Speedup of parallel normal modes analysis . . . . . 179
9.5  Engine block model . . . . . 181

10.1 Campbell diagram . . . . . 195
10.2 Stability diagram . . . . . 196
10.3 Brake model . . . . . 197
10.4 Rotating machinery model . . . . . 198

11.1 Steering mechanism . . . . . . . . . . . . . . . . . . . . . . . . 213

12.1 Convertible car body . . . . . . . . . . . . . . . . . . . . . . . . 230

13.1 Time dependent load . . . . . 244
13.2 The effect of residual vector . . . . . 244
13.3 Modal contributions . . . . . 248
13.4 Modal kinetic energy distribution . . . . . 250

14.1 Transient response . . . . . . . . . . . . . . . . . . . . . . . . . 258

15.1 Satellite model . . . . . . . . . . . . . . . . . . . . . . . . . . . 268

16.1 Nonlinear stress-strain relationship . . . . . 274
16.2 Rotated bar model . . . . . 277
16.3 Newton-Raphson iteration . . . . . 280
16.4 Modified Newton iteration . . . . . 281

17.1 Optimum condition . . . . . 293
17.2 Planar bending of bar element . . . . . 294
17.3 Design space of optimization example . . . . . 300

18.1 Stresses in triangle . . . . . 312
18.2 Level curve computation . . . . . 315
18.3 Physical load on bracket . . . . . 316
18.4 Constraint conditions of bracket . . . . . 317
18.5 Deformed shape of bracket . . . . . 318
18.6 Stress contours of bracket . . . . . 319


List of Tables

1.1  Basis function terms for two-dimensional elements . . . . . 10
1.2  Basis function terms for three-dimensional elements . . . . . 11
1.3  Gauss weights and locations . . . . . 18

5.1  Element statistics of automobile model examples . . . . . 103
5.2  Reduction sizes of automobile model examples . . . . . 104

6.1  Local coordinates of hexahedral element . . . . . 107
6.2  Acoustic response analysis matrix statistics . . . . . 114

7.1  Size statistics of casing component model . . . . . 132
7.2  Computational statistics of casing component model . . . . . 132
7.3  Linear static analysis matrix statistics . . . . . 133

8.1  Component statistics of crankshaft model . . . . . 157
8.2  Performance statistics of crankshaft model . . . . . 157

9.1  Model statistics of trimmed car body . . . . . 178
9.2  Distributed normal modes analysis statistics . . . . . 180
9.3  Normal modes analysis dense matrix statistics . . . . . 180

10.1 Statistics of brake model . . . . . 198
10.2 Complex eigenvalue analysis statistics . . . . . 199

12.1 Element types of case study automobile model . . . . . 231
12.2 Problem statistics . . . . . 231
12.3 Execution statistics . . . . . 232



Closing Remarks

The book’s goal was to give the reader a working knowledge of the main computational techniques of finite element analysis. Particular effort was made to keep the material accessible with the usual engineering mathematical tools.

Some chapters contained a simple computational example to demonstrate the details of the computation. These examples were intended to help the reader develop a solid understanding of the computational steps.

Other chapters contained a description of an industrial application or an actual real-life case study, meant to demonstrate the practical power of the technology discussed.

In order to produce a logically contiguous presentation and to improve readability, some of the more tedious details were omitted. In those areas the reader is encouraged to consult the cited references.

The reference sections at the end of each chapter are organized alphabetically by authors’ names, and all of the references are publicly available. Many original publications on the topics covered in the book are listed, along with the best review references, especially those dealing with the mathematical, engineering, and geometric theory of finite elements.

