
PYRAMID ALGORITHMS FOR BERNSTEIN-BEZIER FINITE ELEMENTS OF HIGH, NON-UNIFORM ORDER IN ANY DIMENSION

MARK AINSWORTH

Abstract. The archetypal pyramid algorithm is the de Casteljau algorithm, which is a standard tool for the evaluation of Bezier curves and surfaces. Pyramid algorithms replace an operation on a single high order polynomial by a recursive sequence of self-similar affine combinations, and are ubiquitous in CAGD for computations involving high order curves and surfaces. Pyramid algorithms have received no attention whatsoever from the high (or low) order finite element community. We develop and analyse pyramid algorithms for the efficient handling of all of the basic finite element building blocks, including the assembly of the element load vectors and element stiffness matrices. The complexity of the algorithm for generating the element stiffness matrix is optimal. A new, non-uniform order, variant of the de Casteljau algorithm is developed that is applicable to the variable polynomial order case but incurs no additional complexity compared with the original algorithm.

1. Introduction

High order finite element methods have been analysed extensively for a wide variety of applications and are known to be capable of producing exponential rates of convergence, even for challenging problems with singularities [6, 18, 19, 29], sharp boundary layers [26, 30], and high frequency oscillations [1, 25].

High order polynomial approximations are commonplace in many areas of scientific computing including computer graphics [15], computer-aided geometric design [14, 20], and spectral methods for PDEs [8, 17, 27]. It is not unusual to see the spectral method used with approximation orders in the hundreds or thousands.

Yet, despite theory giving the nod to the use of very high order finite element methods, the range of polynomial degree used in practical finite element computations is rarely larger than eighth order [11, 21, 32]. Possible explanations for the use of comparatively modest polynomial orders include issues of efficiency, implementation, and stability [33]. Moreover, existing implementations of high order finite elements tend to be memory hungry, often relying on the use of precomputed arrays and look-up tables [4, 12, 22], a feature that does not augur well for the future given the nature of emerging computer hardware systems. Whatever the underlying reasons, it is clear that the rather modest polynomial degrees seen in high order finite element analysis are due to practical considerations rather than any theoretical barriers.

1991 Mathematics Subject Classification. 65N30. 65Y20. 65D17. 68U07.

Key words and phrases. Optimal high order finite elements. Non-uniform order de Casteljau Algorithm. High order Bezier surfaces. Computer aided geometric design. Pyramid algorithms.

The author gratefully acknowledges the partial support of this work under AFOSR contract FA9550-12-1-0399.


The archetypal pyramid algorithm [16] is the de Casteljau algorithm [14, 20, 24]; a standard tool for the evaluation of Bezier curves and surfaces, and polynomials expressed in Bernstein-Bezier form in general. The algorithm enjoys widespread usage, despite being sub-optimal [28], thanks to the ease with which it can be implemented, its underlying stability properties and short, simple recursive nature with minimal memory overhead. In essence, the de Casteljau algorithm replaces a single high order polynomial by a recursive sequence of self-similar affine combinations. Pyramid algorithms are ubiquitous in the computer aided geometric design community for computations using high order curves and surfaces [16].

Pyramid algorithms have received no attention whatsoever from the high (or low) order finite element community. One of our aims in the present article is to develop and analyse pyramid algorithms for the efficient handling of all of the basic finite element building blocks, including the assembly of the element load vectors and element stiffness matrices.

Hierarchic bases [11, 31, 32] have been one of the sacred cows of the high order finite element literature from the very outset. The first step towards developing pyramid algorithms for high order finite elements is to dispense with hierarchic bases in favour of the non-hierarchic, Bernstein-Bezier (BB-) basis [13, 24]. Some practitioners may baulk at this prospect, pointing to the freedom hierarchic bases bring to allow varying local approximation order of the elements. However, we shall see that pyramid algorithms bring the same flexibility to the (non-hierarchic) BB-basis at no additional complexity, but, in addition, with the usual advantages associated with pyramid algorithms. Bernstein-Bezier bases have recently been shown to offer some advantages for uniform order finite element approximation [2, 23].

We begin by generalising the Bernstein Decomposition, developed in [5] for uniform approximation order, to the variable order setting. A non-uniform order Bernstein-Bezier basis emerges for which the enforcement of conformity between elements of differing local orders is a natural consequence of the structure of the non-uniform order Bernstein Decomposition.

A new, non-uniform order, variant of the de Casteljau algorithm is developed (expressed as a pyramid algorithm) that retains the favourable features of the original algorithm. The new variant is applicable to the variable polynomial order case but incurs no additional complexity compared with the original algorithm. The classical degree raising algorithm [13, 24] is given a similar treatment giving a new, non-uniform order, degree raising pyramid algorithm. Yet more interestingly, the dual pyramid [16] of our non-uniform order degree raising algorithm gives a pyramid algorithm for the assembly of the non-uniform order element load vector. The complexity of the algorithm is the same as that of the most efficient hierarchic bases currently in use.

An algorithm for the construction of the element matrices in optimal complexity for uniform order approximation on simplicial elements was developed only recently [2]. The efficiency of that algorithm uses the underlying uniformity of the polynomial order across the element in an essential way. There is no existing algorithm that constructs the non-uniform order element matrices in optimal complexity. In the final part of this work, we extend the algorithm of [2] to the non-uniform order case. Moreover, we show that the resulting algorithm achieves the optimal complexity.

VARIABLE ORDER BBFEM 3

2. The Bernstein Decomposition

2.1. Simplices and k-Faces. Let $T$ be a non-degenerate $n$-simplex in $\mathbb{R}^N$ given by $T = \operatorname{conv}(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_{n+1})$, where $n \le N$. A useful property of non-degenerate simplices that will be used repeatedly in the sequel is recorded in the following result:

Lemma 1. The following conditions are equivalent, i.e. (1) ⇔ (2):

(1) $T = \operatorname{conv}(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_{n+1})$ is a non-degenerate $n$-simplex.

(2) For all $\mathbf{x} \in T$, there exists a unique set of non-negative scalars $s_1, s_2, \ldots, s_{n+1}$ such that

$$\sum_{\ell=1}^{n+1} s_\ell \mathbf{x}_\ell = \mathbf{x}; \qquad \sum_{\ell=1}^{n+1} s_\ell = 1.$$

The convex hull of any subset of $k+1$ distinct points chosen from the $n+1$ points $\mathbf{x}_1, \ldots, \mathbf{x}_{n+1}$ is called a $k$-face of $T$, and is itself a non-degenerate $k$-simplex in $\mathbb{R}^N$. The set of all $k$-faces of $T$ is denoted by $\Delta_k(T)$ and has cardinality $\binom{n+1}{k+1}$. The set $\Delta(T)$ consists of all possible faces of $T$, i.e. $\Delta(T) = \bigcup_{k=0}^{n} \Delta_k(T)$, and has cardinality $\sum_{k=0}^{n} \binom{n+1}{k+1} = 2^{n+1} - 1$. Often, we shall refer to 0-, 1-, 2- and 3-faces as nodes, edges, triangles and tetrahedra respectively.

For example, consider a tetrahedron $T \subset \mathbb{R}^3$. By picking any three of the four available vertices of $T$, we obtain a 2-simplex corresponding to a triangular face of $T$. There are $\binom{4}{3} = 4$ distinct ways to select three vertices of $T$, each of which corresponds to one of the four faces of $T$. Likewise, the edges of the tetrahedron $T$ are 1-simplices specified by a pair of vertices of $T$. A distinct pair of vertices may be chosen in any one of $\binom{4}{2} = 6$ possible ways corresponding to the six edges of $T$. Finally, the four nodes of $T$ are 0-simplices corresponding to the $\binom{4}{1} = 4$ possible ways of selecting a single node of $T$. The $k$-faces of the tetrahedron consist of the nodes, edges and faces of $T$ along with the original simplex $T$ itself.
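The face counts quoted for the tetrahedron can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names `k_faces` and `all_faces` are hypothetical, and each face is represented simply by the tuple of vertex labels $1, \ldots, n+1$ that spans it.

```python
from itertools import combinations
from math import comb


def k_faces(n, k):
    """All k-faces of an n-simplex, each as a tuple of vertex labels 1..n+1."""
    return list(combinations(range(1, n + 2), k + 1))


def all_faces(n):
    """Every face of the n-simplex, from the nodes (k = 0) up to T itself (k = n)."""
    return [face for k in range(n + 1) for face in k_faces(n, k)]


# Tetrahedron (n = 3): number of nodes, edges, triangles and tetrahedra.
counts = [len(k_faces(3, k)) for k in range(4)]
print(counts)             # [4, 6, 4, 1]
print(len(all_faces(3)))  # 15, i.e. 2**4 - 1
```

Representing a face by its vertex labels is the same bookkeeping used in the text, where a face is identified with the subset of vertices of $T$ that spans it.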

The interplay between a simplex and its $k$-faces will play a central role throughout this article. More precisely, various entities will be associated with a simplex. Often, a particular entity may be interpreted as either belonging to a $k$-face of the original simplex or as being a $k$-simplex in its own right. It will be useful to develop a precise specification of the correspondence between these alternative interpretations of a given entity.

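2.2. Domain Points. The set $\mathcal{D}^p_n(T) = \{\mathbf{x}^T_\alpha : \alpha \in \mathcal{I}^p_n\}$ consists of the domain points of $T$ defined by

$$\mathbf{x}^T_\alpha = \frac{1}{p} \sum_{\ell=1}^{n+1} \alpha_\ell \mathbf{x}_\ell \tag{1}$$

where $\mathcal{I}^p_n$ is the indexing set $\mathcal{I}^p_n = \{\alpha \in \mathbb{Z}_+^{n+1} : |\alpha| = p\}$. The subset $\mathring{\mathcal{I}}^p_n \subset \mathcal{I}^p_n$ is defined by $\mathring{\mathcal{I}}^p_n = \{\alpha \in \mathbb{N}^{n+1} : |\alpha| = p\}$. The corresponding subset of interior domain points of $T$ is given by $\mathring{\mathcal{D}}^p_n(T) = \{\mathbf{x}^T_\alpha : \alpha \in \mathring{\mathcal{I}}^p_n\}$.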

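Let $F \in \Delta_k(T)$ be a general $k$-face given by $F = \operatorname{conv}(\mathbf{x}_{\sigma^F_1}, \ldots, \mathbf{x}_{\sigma^F_{k+1}})$ for some fixed ordering $\sigma^F_1, \ldots, \sigma^F_{k+1} \in \{1, \ldots, n+1\}$ of the nodes. The set of domain points of $F$ is defined by $\mathcal{D}^p_k(F) = \{\mathbf{x}^F_\alpha : \alpha \in \mathcal{I}^p_k\}$ where

$$\mathbf{x}^F_\alpha = \frac{1}{p} \sum_{\ell=1}^{k+1} \alpha_\ell \mathbf{x}_{\sigma^F_\ell}, \tag{2}$$

with the interior domain points $\mathring{\mathcal{D}}^p_k(F)$ defined in an analogous fashion.

Figure 1. Example of a 2-simplex $T = \operatorname{conv}(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3)$ and a 1-face $F = \operatorname{conv}(\mathbf{x}_3, \mathbf{x}_1)$, along with domain points in the case $p = 3$. The interior domain points are indicated by open circles in each case.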

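For example, if $p = 3$, then the domain points of a triangle $T = \operatorname{conv}(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3)$ consist of the nodes of the regular lattice shown in Figure 1 along with their corresponding indices $\alpha \in \mathcal{I}^3_2$. The set $\mathring{\mathcal{I}}^3_2$ consists of the single index $\alpha = (1, 1, 1)$, so that the set $\mathring{\mathcal{D}}^3_2(T)$ consists of the domain point located at the centroid of $T$. The set of domain points $\mathcal{D}^3_1(F)$ of the 1-face, or line, $F = \operatorname{conv}(\mathbf{x}_3, \mathbf{x}_1)$ consists of the equally spaced nodes on $F$ as shown in Figure 1 along with their indices $\alpha \in \mathcal{I}^3_1$. The subset $\mathring{\mathcal{I}}^3_1$ is given by $\{(2, 1), (1, 2)\}$ so that the set of interior domain points $\mathring{\mathcal{D}}^3_1(F)$ comprises the two nodes located in the interior of $F$. The set of domain points of the 0-face $F = \{\mathbf{x}_1\}$ consists of the node $\mathbf{x}_1$ itself with the indexing set given by $\mathcal{I}^3_0 = \{(3)\}$. Observe that $\mathring{\mathcal{I}}^3_0 = \{(3)\}$ so that, in the case of a node, the set of interior domain points $\mathring{\mathcal{D}}^3_0(F)$ coincides with $\mathcal{D}^3_0(F)$.

Let $F \in \Delta_k(T)$ be a general $k$-face given by $F = \operatorname{conv}(\mathbf{x}_{\sigma^F_1}, \ldots, \mathbf{x}_{\sigma^F_{k+1}})$ as before. Let $\alpha^F \in \mathcal{I}^p_k$ be any multi-index and let $\mathbf{x} \in \mathcal{D}^p_k(F)$ be the corresponding domain point on $F$, so that $\mathbf{x} = \mathbf{x}^F_{\alpha^F}$, i.e.

$$\mathbf{x} = \sum_{m=1}^{k+1} \frac{\alpha^F_m}{p} \mathbf{x}_{\sigma^F_m}; \qquad \sum_{m=1}^{k+1} \frac{\alpha^F_m}{p} = 1. \tag{3}$$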


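The set of domain points $\mathcal{D}^p_k(F)$ is a subset of the domain points $\mathcal{D}^p_n(T)$ of the simplex itself, so there exists an index $\alpha^T \in \mathcal{I}^p_n$ such that $\mathbf{x} = \mathbf{x}^T_{\alpha^T}$, i.e.

$$\mathbf{x} = \sum_{\ell=1}^{n+1} \frac{\alpha^T_\ell}{p} \mathbf{x}_\ell; \qquad \sum_{\ell=1}^{n+1} \frac{\alpha^T_\ell}{p} = 1. \tag{4}$$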

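Comparing (3) and (4) and using the uniqueness property of Lemma 1 means that

$$\alpha^T = \varepsilon_{FT}\, \alpha^F, \tag{5}$$

where $\varepsilon_{FT} : \mathcal{I}^p_k \to \mathcal{I}^p_n$ is the mapping defined by

$$\alpha^T_\ell = \begin{cases} \alpha^F_m, & \text{if } \ell = \sigma^F_m \text{ for some } m \in \{1, \ldots, k+1\}, \\ 0, & \text{otherwise.} \end{cases} \tag{6}$$

It is convenient to introduce the mapping $\iota_{FT} : \mathcal{I}^p_n \to \mathcal{I}^p_k$ defined by the rule

$$\alpha^F = \iota_{FT}\, \alpha^T \iff \alpha^F_m = \alpha^T_{\sigma^F_m}, \quad m \in \{1, \ldots, k+1\}. \tag{7}$$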

In the interests of retaining a manageable notation, we resist the temptation to explicitly indicate the dependence of $\varepsilon_{FT}$ and $\iota_{FT}$ on $p$, $k$, $n$ and the ordering $\sigma^F$.

The operators $\varepsilon_{FT}$ and $\iota_{FT}$ simply serve to formalise the relationship between multi-indices of the domain points which $F$ and $T$ have in common. For example, the mapping $\varepsilon_{FT}$ between the multi-indices of the common domain points of $F$ and $T$ is indicated by the arrows in Figure 1; e.g. $\varepsilon_{FT}(1, 2) = (2, 0, 1)$, $\varepsilon_{FT}(2, 1) = (1, 0, 2)$, ... and $\iota_{FT}(2, 0, 1) = (1, 2)$, $\iota_{FT}(1, 0, 2) = (2, 1)$ etc.
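The two index maps are short enough to be written out directly. The following is a minimal sketch of rules (6) and (7), assuming multi-indices are stored as Python tuples and the face ordering $\sigma^F$ is passed as a tuple of 1-based vertex labels; the function names `extend_index` and `restrict_index` are hypothetical.

```python
def extend_index(alpha_F, sigma_F, n):
    """Rule (6): place the entries of alpha_F at positions sigma_F, zero elsewhere."""
    alpha_T = [0] * (n + 1)
    for m, ell in enumerate(sigma_F):
        alpha_T[ell - 1] = alpha_F[m]
    return tuple(alpha_T)


def restrict_index(alpha_T, sigma_F):
    """Rule (7): read off the entries of alpha_T at positions sigma_F."""
    return tuple(alpha_T[ell - 1] for ell in sigma_F)


# The 1-face F = conv(x3, x1) of the triangle in Figure 1 has ordering (3, 1).
sigma_F = (3, 1)
print(extend_index((1, 2), sigma_F, n=2))   # (2, 0, 1)
print(extend_index((2, 1), sigma_F, n=2))   # (1, 0, 2)
print(restrict_index((2, 0, 1), sigma_F))   # (1, 2)
```

Note that `restrict_index` undoes `extend_index`, whereas the converse only holds for multi-indices on $T$ that vanish away from the vertices of $F$.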

With a slight abuse, we shall use the notation $\varepsilon_{FT}\mathcal{D}^p_k(F) \subset \mathcal{D}^p_n(T)$ to denote the domain points of $T$ that belong to $F$. Likewise, $\varepsilon_{FT}\mathring{\mathcal{D}}^p_k(F) \subset \mathcal{D}^p_n(T)$ is used to denote the domain points of $T$ that belong to the interior of $F$. The next result confirms the intuitive expectation that distinct faces have no interior domain points in common.

Lemma 2. Let $F \in \Delta_k(T)$ and $G \in \Delta_m(T)$ be distinct faces of $T$. Then, $\varepsilon_{FT}\mathring{\mathcal{D}}^p_k(F)$ and $\varepsilon_{GT}\mathring{\mathcal{D}}^p_m(G)$ are disjoint.

Proof. Suppose otherwise, i.e. there exists a domain point $\mathbf{x} \in \mathcal{D}^p_n(T)$ belonging to both sets. In particular, $\mathbf{x} \in \mathring{\mathcal{D}}^p_k(F)$ (respectively, $\mathbf{x} \in \mathring{\mathcal{D}}^p_m(G)$) so that there exists $\alpha^F \in \mathring{\mathcal{I}}^p_k$ (respectively, $\alpha^G \in \mathring{\mathcal{I}}^p_m$) such that

$$\mathbf{x} = \sum_{\ell=1}^{k+1} \frac{\alpha^F_\ell}{p} \mathbf{x}_{\sigma^F_\ell} \quad \left(\text{respectively } \mathbf{x} = \sum_{\ell=1}^{m+1} \frac{\alpha^G_\ell}{p} \mathbf{x}_{\sigma^G_\ell}\right)$$

and

$$1 = \sum_{\ell=1}^{k+1} \frac{\alpha^F_\ell}{p} \quad \left(\text{respectively } 1 = \sum_{\ell=1}^{m+1} \frac{\alpha^G_\ell}{p}\right).$$

Moreover, the coefficients appearing in these expressions are all strictly positive since, for instance, $\alpha^F \in \mathbb{N}^{k+1}$. These expressions give two alternative representations of $\mathbf{x}$ as convex combinations of nodes of $T$ (since $F, G \in \Delta(T)$). Appealing to the uniqueness property of Lemma 1 we deduce that the coefficients must agree in both representations. Moreover, by again appealing to the fact that the coefficients are strictly positive, the nodes themselves must agree in both representations. Hence, $F$ and $G$ share a common set of vertices, i.e. $F = G$, contradicting the assumption of $F$ and $G$ being distinct. □


An immediate consequence of Lemma 2 is the following decomposition of the domain points on the original simplex into the disjoint union of interior domain points on the faces of $T$:

Lemma 3. There holds

$$\mathcal{D}^p_n(T) = \biguplus_{k=0}^{n} \; \biguplus_{F \in \Delta_k(T)} \varepsilon_{FT}\mathring{\mathcal{D}}^p_k(F) \tag{8}$$

where $\biguplus$ denotes the disjoint union.

Proof. Thanks to Lemma 2 the sets on the right hand side are disjoint, and it suffices to check that the cardinalities of the two sides match. Using $\#\Delta_k(T) = \binom{n+1}{k+1}$ and $\#\mathring{\mathcal{I}}^p_k = \binom{p-1}{k}$ for $k \le p-1$ (and otherwise zero), the cardinality of the set on the right hand side is given by

$$\sum_{k=0}^{\min(n,p-1)} \binom{n+1}{k+1}\binom{p-1}{k} = \frac{n+1}{p} \sum_{k=0}^{\min(n,p-1)} \binom{n}{k}\binom{p}{p-1-k}.$$

The summation on the right hand side is given explicitly by the Chu-Vandermonde identity

$$\sum_{k=0}^{\min(n,p-1)} \binom{n}{k}\binom{p}{p-1-k} = \binom{n+p}{p-1}.$$

Hence,

$$\sum_{k=0}^{\min(n,p-1)} \binom{n+1}{k+1}\binom{p-1}{k} = \binom{p+n}{n} \tag{9}$$

and the result follows at once thanks to $\#\mathcal{D}^p_n(T) = \#\mathcal{I}^p_n = \binom{p+n}{n}$. □
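Lemma 3 and the counting identity (9) are easy to check by brute force for small $n$ and $p$. The sketch below enumerates the multi-indices directly; the helper names (`multi_indices`, `interior_indices`, `face_pieces`) are hypothetical, faces are ordered by increasing vertex label (one admissible choice of $\sigma^F$), and $p \ge 1$ is assumed.

```python
from itertools import combinations
from math import comb


def multi_indices(length, p):
    """All tuples of `length` non-negative integers summing to p."""
    if length == 1:
        return [(p,)]
    return [(a,) + rest
            for a in range(p + 1)
            for rest in multi_indices(length - 1, p - a)]


def interior_indices(length, p):
    """The subset of multi-indices whose entries are all strictly positive."""
    return [alpha for alpha in multi_indices(length, p) if min(alpha) > 0]


def face_pieces(n, p):
    """For each face of T, the interior indices of the face extended to T via (6)."""
    pieces = {}
    for k in range(n + 1):
        for face in combinations(range(1, n + 2), k + 1):
            extended = []
            for alpha in interior_indices(k + 1, p):
                full = [0] * (n + 1)
                for m, ell in enumerate(face):
                    full[ell - 1] = alpha[m]
                extended.append(tuple(full))
            pieces[face] = extended
    return pieces


n, p = 2, 3
pieces = face_pieces(n, p)
union = [alpha for block in pieces.values() for alpha in block]
print(len(union), comb(p + n, n))  # 10 10
```

For the triangle with $p = 3$, each of the three nodes contributes one index, each edge two and the interior one, which accounts for all ten domain points in Figure 1.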

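2.3. Barycentric Coordinates. Let $\mathbf{x} \in T$ denote a point in the simplex $T$. Thanks to Lemma 1, for each $\mathbf{x} \in T$, there exists a unique set of non-negative real numbers $\lambda^T_1, \ldots, \lambda^T_{n+1}$ satisfying

$$1 = \sum_{\ell=1}^{n+1} \lambda^T_\ell; \qquad \mathbf{x} = \sum_{\ell=1}^{n+1} \lambda^T_\ell \mathbf{x}_\ell. \tag{10}$$

The $(n+1)$-tuple $\lambda^T = (\lambda^T_1, \ldots, \lambda^T_{n+1})$ is the barycentric coordinate vector of the point $\mathbf{x}$ relative to $T$, and the rule $T \ni \mathbf{x} \mapsto \lambda^T \in \mathbb{R}^{n+1}$ defines an affine mapping on $T$.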

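Let $F \in \Delta_k(T)$ again be a general $k$-face given by $F = \operatorname{conv}(\mathbf{x}_{\sigma^F_1}, \ldots, \mathbf{x}_{\sigma^F_{k+1}})$. Similarly to (10), the barycentric coordinates of a point $\mathbf{x} \in F$ are non-negative real numbers satisfying

$$1 = \sum_{\ell=1}^{k+1} \lambda^F_\ell; \qquad \mathbf{x} = \sum_{\ell=1}^{k+1} \lambda^F_\ell \mathbf{x}_{\sigma^F_\ell} \tag{11}$$

and likewise define an affine mapping $F \ni \mathbf{x} \mapsto \lambda^F \in \mathbb{R}^{k+1}$ on $F$.

In view of $F \subset T$, a point $\mathbf{x} \in F$ may equally well be regarded as a point belonging to $T$ so that both $\lambda^T(\mathbf{x})$ and $\lambda^F(\mathbf{x})$ are well-defined. How are these barycentric coordinate vectors related? Expressions (10) and (11), along with Lemma 1, imply that

$$\lambda^T_\ell = \begin{cases} \lambda^F_m, & \text{if } \ell = \sigma^F_m \text{ for some } m \in \{1, \ldots, k+1\}, \\ 0, & \text{otherwise.} \end{cases} \tag{12}$$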

This rule defines a mapping from $\mathbb{R}^{k+1}$ to $\mathbb{R}^{n+1}$. In view of the obvious relationship between this mapping and the mapping $\varepsilon_{FT} : \mathcal{I}^p_k \to \mathcal{I}^p_n$ defined in (6) we shall, with an abuse of notation, also denote the new mapping by $\varepsilon_{FT} : \mathbb{R}^{k+1} \to \mathbb{R}^{n+1}$ so that

$$\lambda^T(\mathbf{x}) = \varepsilon_{FT}\lambda^F(\mathbf{x}), \quad \mathbf{x} \in F. \tag{13}$$

Similarly to (7), the barycentric coordinates of a point $\mathbf{x}$ on $F$ can be obtained from those on $T$ by the rule

$$\lambda^F(\mathbf{x}) = \iota_{FT}\lambda^T(\mathbf{x}), \quad \mathbf{x} \in F, \tag{14}$$

where $\iota_{FT}$ is given by the same rule as previously.

The rule (14) is valid for points $\mathbf{x} \in F$, the domain over which $\lambda^F$ is defined. However, for later purposes, we wish to extend the domain of definition of $\lambda^F$ to the whole of $T$ without compromising property (14). There are several possible ways to extend, but the most natural choice is the barycentric extension defined by the rule

$$\lambda^F(\mathbf{x}) = \iota_{FT}\lambda^T(\mathbf{x}), \quad \mathbf{x} \in T. \tag{15}$$

Strictly speaking, this process defines a new set of functions different from $\lambda^F$. Nonetheless, we shall often simply write $\lambda^F$ to denote the extended function safe in the knowledge that the functions coincide on the common domain of definition.

In conclusion, the barycentric coordinates over any face $F \in \Delta(T)$ can be identified with a subset of the barycentric coordinates on $T$. The appropriate subset corresponds to the indices chosen when selecting which of the vertices of $T$ are used to obtain $F$.

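2.4. Bernstein Polynomials. Let $T$ be a non-degenerate $n$-simplex in $\mathbb{R}^N$ as before. The Bernstein polynomials of degree $p \in \mathbb{Z}_+$ associated with $T$ are defined by

$$B^{T,p}_\alpha(\mathbf{x}) = \binom{p}{\alpha} \lambda^T(\mathbf{x})^\alpha, \quad \alpha \in \mathcal{I}^p_n, \quad \mathbf{x} \in T \tag{16}$$

where $\lambda^T$ is the barycentric coordinate vector of $\mathbf{x}$ on $T$. The Bernstein polynomials are linearly independent [24]. This, coupled with the observation that the cardinality of the indexing set $\mathcal{I}^p_n$ coincides with the dimension of the space $\mathbb{P}^p_n(T)$, means that any polynomial $u \in \mathbb{P}^p_n(T)$ has a unique BB-form representation

$$u = \sum_{\alpha \in \mathcal{I}^p_n} c_\alpha B^{T,p}_\alpha. \tag{17}$$

The uniqueness of the BB-vector $\{c_\alpha : \alpha \in \mathcal{I}^p_n\}$ formed from the coefficients of the BB-form means that the set $\Sigma^p_n(T) = \{\varphi_\alpha : \alpha \in \mathcal{I}^p_n\}$, consisting of linear functionals on $\mathbb{P}^p_n(T)$ defined by the rules

$$\mathbb{P}^p_n(T) \ni u \mapsto \varphi_\alpha(u) = c_\alpha, \quad \alpha \in \mathcal{I}^p_n, \tag{18}$$

is unisolvent with respect to $\mathbb{P}^p_n(T)$, i.e.

$$u = 0 \iff \varphi_\alpha(u) = 0 \text{ for all } \alpha \in \mathcal{I}^p_n. \tag{19}$$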
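As a concrete illustration of definition (16), the sketch below evaluates $B^{T,p}_\alpha$ from a given barycentric coordinate vector, reading $\binom{p}{\alpha}$ as the multinomial coefficient $p!/(\alpha_1! \cdots \alpha_{n+1}!)$ and $\lambda^\alpha$ as $\prod_\ell \lambda_\ell^{\alpha_\ell}$. The function names are hypothetical. As a sanity check it uses the standard fact (a consequence of the multinomial theorem, not stated above) that the Bernstein polynomials of a fixed degree sum to one.

```python
from math import factorial, prod


def multi_indices(length, p):
    """All tuples of `length` non-negative integers summing to p."""
    if length == 1:
        return [(p,)]
    return [(a,) + rest
            for a in range(p + 1)
            for rest in multi_indices(length - 1, p - a)]


def multinomial(alpha):
    """The coefficient p! / (alpha_1! ... alpha_{n+1}!) with p = |alpha|."""
    out = factorial(sum(alpha))
    for a in alpha:
        out //= factorial(a)
    return out


def bernstein(alpha, lam):
    """Value of the Bernstein polynomial with index alpha at barycentric point lam."""
    return multinomial(alpha) * prod(l ** a for l, a in zip(lam, alpha))


# Triangle (n = 2), degree p = 3, evaluated at an arbitrary interior point.
lam = (0.2, 0.3, 0.5)
values = {alpha: bernstein(alpha, lam) for alpha in multi_indices(3, 3)}
total = sum(values.values())  # equals 1 up to rounding
```

Working entirely in barycentric coordinates mirrors the text: the vertex positions of $T$ never enter the evaluation once $\lambda^T(\mathbf{x})$ is known.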


Consequently, the triple $(T, \Sigma^p_n(T), \mathbb{P}^p_n(T))$ is a finite element in the sense of [7, 9], which will be referred to as the Bernstein-Bezier finite element (BB-FEM) of (uniform) degree $p$ on $T$. Equation (1) establishes a natural correspondence between the domain points $\mathcal{D}^p_n(T)$ of $T$, the multi-indices $\mathcal{I}^p_n$ and, in turn, the degrees of freedom $\Sigma^p_n(T)$. This correspondence is often exploited by depicting the degrees of freedom on an element graphically using the domain points as in Figure 1.

Let $F \in \Delta_k(T)$ again be a general $k$-face given by $F = \operatorname{conv}(\mathbf{x}_{\sigma^F_1}, \ldots, \mathbf{x}_{\sigma^F_{k+1}})$ as before. One may define a Bernstein basis for $\mathbb{P}^p_k(F)$ by taking

$$B^{F,p}_\alpha(\mathbf{x}) = \binom{p}{\alpha} \lambda^F(\mathbf{x})^\alpha, \quad \alpha \in \mathcal{I}^p_k, \quad \mathbf{x} \in F$$

where $\lambda^F$ is the barycentric coordinate vector of $\mathbf{x}$ on $F$ defined via (11) as before. The same considerations for the element $T$ apply to $F$, so that there is also a natural correspondence between the Bernstein polynomials on $F$ and the domain points $\mathcal{D}^p_k(F)$ of $F$, and one may define the space $\mathring{\mathbb{P}}^p_k(F)$ to be the space spanned by the Bernstein polynomials associated with $\mathring{\mathcal{D}}^p_k(F)$, the interior domain points of $F$:

$$\mathring{\mathbb{P}}^p_k(F) = \operatorname{span}\{B^{F,p}_\alpha : \alpha \in \mathring{\mathcal{I}}^p_k\}. \tag{20}$$

It is easy to see that if $F \in \Delta_k(T)$, then functions belonging to $\mathring{\mathbb{P}}^p_k(F)$ vanish on all faces $F' \in \Delta_\ell(T)$, $F \ne F'$, for which $\ell \le k$. For this reason, $\mathring{\mathbb{P}}^p_k(F)$ is sometimes described as the set of internal functions on $F$. It should, however, be borne in mind that $\mathring{\mathbb{P}}^p_k(F) = \mathbb{P}^p_k(F)$ if $F \in \Delta_0(T)$.

How are the Bernstein polynomials defined on a face $F$ related to the Bernstein polynomials defined on the original simplex $T$? It has already been seen that the barycentric coordinate vector $\lambda^F$ of a point $\mathbf{x}$ on $F$ is related to the barycentric coordinate vector $\lambda^T$ of the same point $\mathbf{x}$, viewed as belonging to $T$, according to the formula (13),

$$\lambda^T = \varepsilon_{FT}\lambda^F.$$

Likewise, for $\alpha^F \in \mathcal{I}^p_k$, the corresponding multi-index $\alpha^T \in \mathcal{I}^p_n$ is given by

$$\alpha^T = \varepsilon_{FT}\alpha^F.$$

In view of these identities and the definition of $\varepsilon_{FT}$, there holds $\alpha^T! = \alpha^F!$ and $\lambda^T(\mathbf{x})^{\alpha^T} = \lambda^F(\mathbf{x})^{\alpha^F}$. Hence, the Bernstein polynomials are related as follows:

$$B^{F,p}_{\alpha^F}(\mathbf{x}) = B^{T,p}_{\alpha^T}(\mathbf{x}) = B^{T,p}_{\varepsilon_{FT}\alpha^F}(\mathbf{x}), \quad \mathbf{x} \in F. \tag{21}$$

As before, it will be convenient to define an extension of a Bernstein polynomial defined on a face $F \in \Delta(T)$ to the whole of $T$. The natural choice is again the barycentric extension based on the identity (21); thus, we define the extension by the rule

$$E_{FT}B^{F,p}_\alpha(\mathbf{x}) = B^{T,p}_{\varepsilon_{FT}\alpha}(\mathbf{x}), \quad \mathbf{x} \in T, \quad \alpha \in \mathcal{I}^p_k. \tag{22}$$

2.5. Bernstein Decomposition. Identity (8) decomposes the domain points on a simplex $T$ into the disjoint union of the interior domain points on the faces of $T$. There is a natural correspondence between the Bernstein polynomials $\{B^{T,p}_\alpha : \alpha \in \mathcal{I}^p_n\}$ on a simplex $T$ and the domain points $\mathcal{D}^p_n(T) = \{\mathbf{x}_\alpha : \alpha \in \mathcal{I}^p_n\}$ of $T$. One may therefore expect that the polynomials on the simplex may be similarly decomposed into the sum of polynomials associated with the interior domain points on the faces of the simplex as follows:

$$\mathbb{P}^p_n(T) = \bigoplus_{k=0}^{n} \; \bigoplus_{F \in \Delta_k(T)} E_{FT}\mathring{\mathbb{P}}^p_k(F) \tag{23}$$

where

$$E_{FT}\mathring{\mathbb{P}}^p_k(F) = \operatorname{span}\{E_{FT}B^{F,p}_\alpha : \alpha \in \mathring{\mathcal{I}}^p_k\} = \operatorname{span}\{B^{T,p}_{\varepsilon_{FT}\alpha} : \alpha \in \mathring{\mathcal{I}}^p_k\}. \tag{24}$$

The identity (23) is referred to as the Bernstein decomposition [5]. It is readily seen to hold: thanks to Lemma 2, the spaces on the right hand side are disjoint subspaces of $\mathbb{P}^p_n(T)$, and the dimensions agree by the same count as in the proof of (8).

2.6. Non-Uniform Order Bernstein-Bezier Finite Element. The Bernstein decomposition (23) shows that the basis functions for an element $T$ can be split into (disjoint) subsets associated with the faces of $T$. The splitting of a simplex into simpler topological components forms the foundation for the construction of basis functions for higher order finite element spaces [32].

The splitting is merely a convenience in the case where polynomials of the same, uniform, order are used throughout the whole finite element partition. However, the splitting becomes crucial when non-uniform order approximation is employed, as is the case with adaptive $p$- and $hp$-version finite element methods. The splitting justifies the use of an arbitrary local polynomial order $p_F \in \mathbb{N}$ independently specified for each face $F \in \Delta(T)$ of the simplex $T$. The associated non-uniform order finite element space will be denoted by $\mathbb{P}^{\vec{p}}_n(T)$, where $\vec{p}$ is the degree vector given by $\vec{p} = \{p_F : F \in \Delta(T)\}$, and has dimension

$$\dim \mathbb{P}^{\vec{p}}_n(T) = \sum_{k=0}^{n} \sum_{F \in \Delta_k(T)} \dim \mathring{\mathbb{P}}^{p_F}_k(F). \tag{25}$$

The degree of the space $\mathbb{P}^{\vec{p}}_n(T)$ is defined by $p_{\max} = \max \vec{p}$. Likewise, define $p_{\min} = \min \vec{p}$.

Traditionally, a basis for this space was constructed by equipping each geometric entity with a hierarchy of basis functions of increasing polynomial degree [3, 11, 12, 32]. Hierarchic bases have been all-pervasive in the treatment of non-uniform order approximation to the extent that any serious alternative is almost inconceivable. Their universal acceptance is largely due to the ease with which inter-element conformity may be maintained through lowering (resp. raising) the polynomial degree on a face through the omission (resp. addition) of higher order hierarchical basis functions to achieve the desired local order of approximation on the face.

In contrast, the non-hierarchic nature of the Bernstein polynomials would seem to imply that varying the polynomial degree cannot be achieved, or at least, not without prohibitive complications. Indeed, there is virtually a complete absence of available non-hierarchic bases for conforming finite element approximation using non-uniform local orders.

These considerations notwithstanding, we argue that a (non-hierarchic) Bernstein polynomial basis is not only completely natural but computationally attractive, even in the case of non-uniform polynomial degree. Concerning the former claim, the uniform order Bernstein decomposition (23) leads quite naturally to the following definition of the non-uniform order space:

$$\mathbb{P}^{\vec{p}}_n(T) = \bigoplus_{k=0}^{n} \; \bigoplus_{F \in \Delta_k(T)} E_{FT}\mathring{\mathbb{P}}^{p_F}_k(F). \tag{26}$$

In other words, in recognition of the importance of the Bernstein decomposition in the case of uniform order approximation, the non-uniform order space is constructed so that a non-uniform Bernstein decomposition, expressed in (26), is maintained.

How about the choice of a basis for the non-uniform order space? Thanks to (24) and (26), it is natural to select a non-uniform order Bernstein basis as follows:

$$\mathbb{P}^{\vec{p}}_n(T) = \operatorname{span} \bigoplus_{k=0}^{n} \; \bigoplus_{F \in \Delta_k(T)} \left\{B^{T,p_F}_{\varepsilon_{FT}\alpha} : \alpha \in \mathring{\mathcal{I}}^{p_F}_k\right\}, \tag{27}$$

in conjunction with an associated non-uniform indexing set $\mathcal{I}^{\vec{p}}_T$ defined by

$$\mathcal{I}^{\vec{p}}_T = \biguplus_{k=0}^{n} \; \biguplus_{F \in \Delta_k(T)} \left\{\varepsilon_{FT}\alpha : \alpha \in \mathring{\mathcal{I}}^{p_F}_k\right\}. \tag{28}$$

The spaces on the right hand side of (26) are disjoint thanks to Lemma 2, and it follows that the dimension of this space satisfies (25).
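The indexing set (28) can be assembled face by face. The sketch below does so for a triangle, using as test data the degree vector quoted in the caption of Figure 2 ($p_T = 4$, $p_{12} = 3$, $p_{23} = 2$, $p_{13} = 4$, $p_1 = 1$, $p_2 = 3$, $p_3 = 3$). The function names and the dictionary layout keyed by vertex-label tuples are assumptions made for illustration, and each face is ordered by increasing vertex label.

```python
from itertools import combinations


def interior_indices(length, p):
    """Tuples of `length` strictly positive integers summing to p."""
    if length == 1:
        return [(p,)] if p >= 1 else []
    return [(a,) + rest
            for a in range(1, p)
            for rest in interior_indices(length - 1, p - a)]


def nonuniform_index_set(n, degrees):
    """Indexing set (28): extended interior indices of degree p_F for every face F."""
    index_set = []
    for k in range(n + 1):
        for face in combinations(range(1, n + 2), k + 1):
            for alpha in interior_indices(k + 1, degrees[face]):
                full = [0] * (n + 1)
                for m, ell in enumerate(face):
                    full[ell - 1] = alpha[m]
                index_set.append(tuple(full))
    return index_set


# Degree vector from the caption of Figure 2, keyed by the vertices of each face.
degrees = {(1,): 1, (2,): 3, (3,): 3,
           (1, 2): 3, (2, 3): 2, (1, 3): 4,
           (1, 2, 3): 4}
indices = nonuniform_index_set(2, degrees)
dimension = len(indices)  # dimension of the non-uniform space, by (25)
```

With this data the nodes contribute 3 indices, the edges 2 + 1 + 3 and the interior 3, giving a space of dimension 12. Multi-indices of different total degree coexist in the set, which is exactly what distinguishes it from the uniform set $\mathcal{I}^p_n$.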

The definition of the Bernstein-Bezier finite element triple $(T, \Sigma^p_n(T), \mathbb{P}^p_n(T))$ is easily generalised to the non-uniform order case. In particular, the linear independence of the set (27) means that any function $u \in \mathbb{P}^{\vec{p}}_n(T)$ may be written uniquely in the form

$$u = \sum_{k=0}^{n} \sum_{F \in \Delta_k(T)} \sum_{\alpha \in \mathring{\mathcal{I}}^{p_F}_k} c_{\varepsilon_{FT}\alpha} B^{T,p_F}_{\varepsilon_{FT}\alpha}. \tag{29}$$

The uniqueness of the non-uniform BB-vector formed from these coefficients means that linear functionals on $\mathbb{P}^{\vec{p}}_n(T)$ may be defined by the rules

$$\mathbb{P}^{\vec{p}}_n(T) \ni u \mapsto \varphi_\alpha(u) = c_\alpha, \quad \alpha \in \mathcal{I}^{\vec{p}}_T \tag{30}$$

and, in turn, that the set of degrees of freedom given by

$$\Sigma^{\vec{p}}_n(T) = \left\{\varphi_\alpha : \alpha \in \mathcal{I}^{\vec{p}}_T\right\} \tag{31}$$

is unisolvent with respect to $\mathbb{P}^{\vec{p}}_n(T)$. The triple $(T, \Sigma^{\vec{p}}_n(T), \mathbb{P}^{\vec{p}}_n(T))$ is referred to as the Bernstein-Bezier finite element of non-uniform degree $\vec{p}$ on $T$. Furthermore, the natural correspondence between the domain points $\mathcal{D}^p_n(T)$ of $T$ and the degrees of freedom $\Sigma^p_n(T)$ in the case of uniform order elements carries over directly to the non-uniform order case, and may again be exploited to depict the degrees of freedom graphically as in Figure 2.

2.7. Non-Uniform Order BB-FEM on a Partition. Of course, the selection of a non-uniform order basis for an individual element, viewed in isolation, is a far simpler proposition than that of developing a basis for a finite element space on a partitioning of a domain into simplices. The main consideration then becomes the desire to maintain conformity (continuity) of the non-uniform order approximation across interfaces (edges, faces, $k$-simplices in higher dimensions, etc.) between adjacent elements in the partition. Indeed, this is the chief factor driving the widespread usage of hierarchic bases.


Figure 2. Degrees of freedom on a non-uniform order element $T = \operatorname{conv}(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3)$ and on its faces, in the case $p_T = 4$, $p_{12} = 3$, $p_{23} = 2$, $p_{13} = 4$, $p_1 = 1$, $p_2 = 3$, $p_3 = 3$. The degrees of freedom associated with a particular entity are indicated by open circles.

How does our non-uniform order Bernstein basis fare in this setting? We have reached the crux of the matter: the element level basis (27) automatically gives a basis for a globally conforming finite element space.

For instance, if distinct elements $T$ and $T' \in \mathcal{P}$ share a common $k$-face $F$, then for each $\alpha \in \mathring{\mathcal{I}}^{p_F}_k$, the face $F$ contributes a Bernstein polynomial $B^{F,p_F}_\alpha$ to the basis. On the face $F$, the value of the basis function is given by the $k$-variate Bernstein polynomial $B^{F,p_F}_\alpha$, whilst on the element $T$ (respectively $T'$), the value is given by the $n$-variate Bernstein polynomial basis function $B^{T,p_F}_{\varepsilon_{FT}\alpha}$ (respectively $B^{T',p_F}_{\varepsilon_{FT'}\alpha}$). The values of all of these Bernstein polynomials agree on the common interface. In other words, conformity is guaranteed by construction. In essence, all elements containing the $k$-face $F$ take their local basis functions for the interface from $F$ itself, viz. the Bernstein polynomials at the interior domain points of $F$. It is worth emphasising that, as we showed earlier, the value of the Bernstein polynomial on any of the entities $T$, $T'$ or $F$ is computed using information local to the given entity. No inter-element communication is implied or required.

In conclusion, the Bernstein polynomials provide a completely natural basis for non-uniform order polynomial approximation. The basis is natural in the sense that it respects the underlying topological structure of the elements manifested in the nodal splitting (8) and the algebraic splitting of the polynomial space in the form of the Bernstein decomposition.

3. Pyramid Schemes

The non-uniform order finite element basis uses Bernstein polynomials of differing polynomial orders at both the element level and global level. Our next objective is the development of simple, efficient and stable algorithms for handling this basis.


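3.1. Non-Uniform Degree De Casteljau Algorithm. Let $u_T \in \mathbb{P}^{\vec{p}}_n(T)$ be a non-uniform degree polynomial approximation on an element $T$ expressed in non-uniform order BB-form: that is,

$$u_T = \sum_{k=0}^{n} \sum_{F \in \Delta_k(T)} \sum_{\alpha \in \mathring{\mathcal{I}}^{p_F}_k} c_{\varepsilon_{FT}\alpha} B^{T,p_F}_{\varepsilon_{FT}\alpha}. \tag{32}$$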

The de Casteljau algorithm [10, 14, 24] is the standard approach to the pointwise evaluation of a polynomial (or its derivatives) written in uniform order BB-form.

Can the de Casteljau algorithm be modified to enable evaluation of polynomials expressed in non-uniform order BB-form? One could take advantage of the structure (27), and simply apply the de Casteljau algorithm to the Bernstein representations on each face $F \in \Delta(T)$ separately, followed by summing the contributions. Clearly, such an approach would be rather inefficient. If, for example, the local orders are uniform, then a fresh de Casteljau cascade would be spawned unnecessarily for every face. Fortunately, there are better alternatives.

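We begin with a brief review of the de Casteljau algorithm in the case of uniform polynomial degree on the element $T$ where $u_T$ has the form

$$u_T = \sum_{\alpha \in \mathcal{I}^p_n} c^{(p)}_\alpha B^{T,p}_\alpha. \tag{33}$$

The algorithm is based on the identity

$$B^{T,p}_\alpha = \sum_{\ell=1}^{n+1} \lambda_\ell B^{T,p-1}_{\alpha - e_\ell} \tag{34}$$

where $\lambda_\ell$ are the (fixed) barycentric coordinates of the given point of interest $\mathbf{x} \in T$. Inserting this formula into the expression for $u_T$ and simplifying gives the alternative representation

$$u_T = \sum_{\alpha \in \mathcal{I}^{p-1}_n} c^{(p-1)}_\alpha B^{T,p-1}_\alpha \tag{35}$$

where

$$c^{(p-1)}_\alpha = \sum_{\ell=1}^{n+1} \lambda_\ell c^{(p)}_{\alpha + e_\ell}, \quad \alpha \in \mathcal{I}^{p-1}_n. \tag{36}$$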

The net effect of these manipulations is to effectively reduce the degree of the representation for $u_T$ from order $p$ in (33) to order $p-1$ in (35). We emphasise that the coefficients in the reduced order representation now depend on the barycentric coordinates. The de Casteljau algorithm consists of iterating this process to obtain the zeroth order representation $u_T = c^{(0)}_0 B^{T,0}_0 = c^{(0)}_0$, from which the value of $u_T(\mathbf{x})$ can simply be read off.

The de Casteljau procedure is an example of a pyramid algorithm [16]. The reason for this nomenclature becomes clear when the procedure (in the case of one spatial dimension, $n = 1$) is depicted graphically as shown in Figure 3. Figure 3(a) shows the coefficients generated using the de Casteljau procedure starting with the coefficients of the initial degree $p = 4$ approximation in the bottom row of the pyramid. Figure 3(b) shows the rule, or computational atom, for ascending from one level to the next in the pyramid. In the present case, the atom depicts the rule

$$c^{(d-1)}_\alpha = \lambda_1 c^{(d)}_{\alpha + e_1} + \lambda_2 c^{(d)}_{\alpha + e_2}, \tag{37}$$

obtained from (36) in the special case $n = 1$. Pyramid algorithms defined by computational atoms of this type will form the foundation of our approach. Such algorithms are highly computationally attractive (simple short recurrence, explicit, stable, minimal memory access, ...). Here, the pyramid scheme reduces the evaluation of a high degree polynomial to a sequence of stable, affine combinations.

Figure 3. (a) Pyramid scheme for de Casteljau algorithm in the case of an interval $n = 1$. (b) Computational atom (see (37)).
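The recursion (36) translates almost line for line into code. The following is a minimal sketch of the uniform order de Casteljau pyramid on a simplex of any dimension, assuming the coefficients are held in a dictionary keyed by multi-index tuples; the function names are hypothetical and no attempt is made at an efficient memory layout. A direct evaluation of (33) is included only as a cross-check.

```python
from math import factorial


def multi_indices(length, p):
    """All tuples of `length` non-negative integers summing to p."""
    if length == 1:
        return [(p,)]
    return [(a,) + rest
            for a in range(p + 1)
            for rest in multi_indices(length - 1, p - a)]


def de_casteljau(coeffs, lam):
    """Evaluate the BB-form (33) at barycentric point lam by iterating (36)."""
    m = len(lam)
    p = sum(next(iter(coeffs)))      # common degree of all multi-indices
    level = dict(coeffs)             # bottom row of the pyramid
    for d in range(p, 0, -1):        # ascend from level d to level d - 1
        lower = {}
        for alpha in multi_indices(m, d - 1):
            lower[alpha] = sum(
                lam[ell] * level[alpha[:ell] + (alpha[ell] + 1,) + alpha[ell + 1:]]
                for ell in range(m))
        level = lower
    return level[(0,) * m]


def direct_evaluation(coeffs, lam):
    """Reference value: sum of c_alpha times the Bernstein polynomial (16)."""
    total = 0.0
    for alpha, c in coeffs.items():
        term = c * factorial(sum(alpha))
        for a, l in zip(alpha, lam):
            term *= l ** a / factorial(a)
        total += term
    return total


# Interval (n = 1) with p = 4, as in Figure 3; the coefficient values are arbitrary.
coeffs = {alpha: float(i + 1) for i, alpha in enumerate(multi_indices(2, 4))}
lam = (0.3, 0.7)
value = de_casteljau(coeffs, lam)
reference = direct_evaluation(coeffs, lam)
```

Each pass of the loop applies the computational atom to every node of one level, so only affine combinations of previously computed coefficients are ever formed.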

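The following result forms the basis for generalising the de Casteljau procedure to the non-uniform order case:

Lemma 4. Let $u_T \in \mathbb{P}^{\vec{p}}_n(T)$ be given in non-uniform BB-form (32), and let $\mathbf{x} \in T$ be a given point with barycentric coordinates $\lambda_1, \ldots, \lambda_{n+1}$. Then, there holds for $d = p_{\max}, p_{\max} - 1, \ldots, 0$,

$$u_T(\mathbf{x}) = \sum_{\alpha \in \mathcal{I}^d_n} c^{(d)}_\alpha B^{T,d}_\alpha(\mathbf{x}) + \sum_{k=0}^{n} \sum_{\substack{F \in \Delta_k(T): \\ p_F < d}} \sum_{\beta \in \mathring{\mathcal{I}}^{p_F}_k} c_{\varepsilon_{FT}\beta} B^{T,p_F}_{\varepsilon_{FT}\beta}(\mathbf{x}) \tag{38}$$

where, for $r = p_{\max}, p_{\max} - 1, \ldots, 0$ and $\alpha \in \mathcal{I}^r_n$,

$$c^{(r)}_\alpha = \sum_{\ell=1}^{n+1} \lambda_\ell c^{(r+1)}_{\alpha + e_\ell} + \begin{cases} c_\alpha & \text{if } \alpha \in \mathcal{I}^{\vec{p}}_T, \\ 0 & \text{otherwise,} \end{cases} \tag{39}$$

and where $c^{(p_{\max}+1)}_\alpha = 0$, $\alpha \in \mathcal{I}^{p_{\max}+1}_n$. In particular, the value of $u_T(\mathbf{x})$ is given by $c^{(0)}_0$.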

Proof. The result may be shown using mathematical induction. In particular, when $d = r = p_{\max}$, equations (38) and (39) are equivalent to (32).

Assume (38) holds for the case $d = r+1$. Inserting identity (34) in the case $p = r+1$ into the first term in (38), and employing the argument used in (35) and (36) gives the first term on the right hand side of (39). The sum over $p_F < d = r+1$ in the second term on the right hand side of (38) is split into the cases (a) $p_F = r$ and (b) $p_F < r = d-1$. The second term on the right hand side of (39) is the coefficient of the Bernstein polynomial $B^{T,r}_{\alpha}$, $\alpha \in \mathcal{I}^{r}_n$ in case (a) where $p_F = r$. Finally the sum


Figure 4. Pyramid scheme for non-uniform degree de Casteljau algorithm to evaluate the function (40). (a) Coefficients of the non-uniform degree representation placed at appropriate nodes in the pyramid. (b) Result of using the standard de Casteljau atom to propagate coefficients through the pyramid augmented with coefficients from non-uniform representation.

in the case (b) is the second term on the right hand side of (38) with $d = r$. This completes the inductive step, and the result follows for $d = p_{\max}, p_{\max}-1, \ldots, 0$. □

The idea presented in Lemma 4 becomes rather simple when freed of notational distractions. We illustrate the algorithm for the particular case where $T$ is an interval $I$ with endpoints at $L$ and $R$, and degree vector given by $p_L = 1$, $p_R = 3$ and $p_I = 4$. The function we wish to evaluate is given by

$$u_I = c^{L,1}_{1} B^{L,1}_{1} + \sum_{\alpha\in\mathcal{I}^{4}_1} c^{I,4}_{\alpha} B^{I,4}_{\alpha} + c^{R,3}_{3} B^{R,3}_{3} \tag{40}$$

where $B^{L,1}_{1}$, $B^{I,4}_{\alpha}$ and $B^{R,3}_{3}$ are the univariate Bernstein polynomials associated with the vertices and interior of the interval.

Figure 4(a) shows the coefficients in the non-uniform degree representation placed in the pyramid at positions corresponding to their multi-indices. For uniform order approximation, this would mean that the lowest row of the pyramid is fully occupied, but this is not the case for the non-uniform representation. Figure 4(b) shows the pyramid obtained after applying the standard de Casteljau computational atom to ascend the pyramid. The coefficients initially residing at higher levels are assimilated into the recursion as the pyramid is ascended. The uppermost node of the pyramid contains the value of the non-uniform approximation at the point whose barycentric coordinates are given by $\lambda_1, \lambda_2$:

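$$\begin{aligned}
c^{(0)}_0 &= \lambda_1 c^{L,1}_{1} + 4\lambda_1^3\lambda_2\, c^{I,4}_{31} + 6\lambda_1^2\lambda_2^2\, c^{I,4}_{22} + 4\lambda_1\lambda_2^3\, c^{I,4}_{13} + \lambda_2^3\, c^{R,3}_{3}\\
&= c^{L,1}_{1} B^{L,1}_{1} + c^{I,4}_{31} B^{I,4}_{31} + c^{I,4}_{22} B^{I,4}_{22} + c^{I,4}_{13} B^{I,4}_{13} + c^{R,3}_{3} B^{R,3}_{3}\\
&= u_I.
\end{aligned}$$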
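The interval example can be reproduced with a short Python sketch of the non-uniform recursion. The dictionary `reserved`, which maps each multi-index `(i, j)` to its coefficient, and the function names are hypothetical conventions introduced for this illustration; coefficients of degree `i + j` are held back until the recursion reaches that degree.

```python
def nonuniform_de_casteljau_1d(reserved, p_max, lam1, lam2):
    """Evaluate a non-uniform order Bernstein form on an interval.

    reserved maps a multi-index (i, j) of degree i + j >= 1 to its coefficient.
    """
    row = [0.0] * (p_max + 1)  # c^(p_max); entry k <-> multi-index (d - k, k)
    for d in range(p_max, 0, -1):
        # Assimilate the coefficients whose multi-index has degree d.
        for (i, j), c in reserved.items():
            if i + j == d:
                row[j] += c
        # Standard de Casteljau atom: ascend from level d to level d - 1.
        row = [lam1 * row[k] + lam2 * row[k + 1] for k in range(d)]
    return row[0]


def direct_value_40(cL, c31, c22, c13, cR, lam1, lam2):
    """Closed form of the example: the first line of the displayed identity."""
    return (lam1 * cL + 4 * lam1 ** 3 * lam2 * c31 + 6 * lam1 ** 2 * lam2 ** 2 * c22
            + 4 * lam1 * lam2 ** 3 * c13 + lam2 ** 3 * cR)
```

For the degree vector $p_L = 1$, $p_R = 3$, $p_I = 4$, the input would be `{(1, 0): cL, (3, 1): c31, (2, 2): c22, (1, 3): c13, (0, 3): cR}` with `p_max = 4`.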


The above example dealt with the simplest case of an interval in order to keep the figures as simple as possible, but the treatment of triangles, tetrahedra and higher dimensional simplices is virtually identical.

The procedure in the general case is presented in Algorithm 1. The basic idea behind the generalisation to the non-uniform degree case recognises that the standard de Casteljau reduces the degree of the Bernstein representation recursively. The generalised de Casteljau algorithm merely keeps the Bernstein coefficients for a face $F$ of degree $p_F$ in reserve, until the degree $d$ of the recursion has been reduced to $d = p_F$. At that stage, the Bernstein coefficients on $F$ are appended to the Bernstein coefficients of the de Casteljau representation, and the standard de Casteljau recursion resumes as usual.

Lemma 5. The non-uniform de Casteljau Algorithm 1 requires

$$p_{\max} \dim \mathbb{P}^{p_{\max}}_n(T) + \dim \mathbb{P}^{\vec{p}}_n(T) = O(p_{\max}^{n+1}) \tag{41}$$

operations to evaluate an element of $\mathbb{P}^{\vec{p}}_n(T)$ at a given point $x \in T$.

Proof. Every coefficient in the representation (32) is handled precisely once in the assimilation step, so a total of $\#\mathcal{I}^{\vec{p}}_T$ additions are entailed. For $d = p_{\max}, p_{\max}-1, \ldots$, the de Casteljau step requires $n+1$ operations for each index $\alpha \in \mathcal{I}^{d-1}_n$, giving a total operation count of

$$(n+1)\binom{p_{\max}-1+n}{n} + (n+1)\binom{p_{\max}-2+n}{n} + \ldots = p_{\max}\binom{p_{\max}+n}{n},$$

and the result follows. □
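The closed form of the operation count can be checked numerically. The sketch below assumes the standard count $\#\mathcal{I}^{d}_n = \binom{d+n}{n}$ for the number of multi-indices of degree $d$; the function names are illustrative only.

```python
from math import comb


def de_casteljau_step_ops(p_max, n):
    """Sum of (n + 1) operations per index alpha in I^{d-1}_n for d = p_max, ..., 1."""
    return sum((n + 1) * comb(d - 1 + n, n) for d in range(1, p_max + 1))


def closed_form_ops(p_max, n):
    """Right hand side of the identity: p_max * dim P^{p_max}_n(T)."""
    return p_max * comb(p_max + n, n)
```

The two counts agree because the partial sums of $\binom{j+n}{n}$ telescope to $\binom{p_{\max}+n}{n+1}$, and $(n+1)\binom{p_{\max}+n}{n+1} = p_{\max}\binom{p_{\max}+n}{n}$.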

The corresponding count for the standard de Casteljau algorithm is $p \dim \mathbb{P}^{p}_n(T)$ operations, showing that the non-uniform variant is of the same complexity as the standard algorithm to leading order.

A practical implementation of Algorithm 1 requires a data structure allowing for the efficient identification of faces $F$ of degree $p_F = d$ in the assimilation step (such as an ordered list of faces, ordered by degree, that would require only a single traverse during the entire procedure). If the function to be evaluated happens to be of uniform order, then the first execution of the assimilation step would exhaust the list of faces, after which the general procedure reduces to the standard (uniform order) de Casteljau algorithm. Thus, Algorithm 1 is just as efficient as the standard de Casteljau algorithm.
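One possible realisation of the ordered face list is sketched below for a simplex of any dimension. It is an illustration under assumptions, not the author's code: each face is passed as a hypothetical pair `(p_F, coeffs)`, where the keys of `coeffs` are multi-indices already embedded in the element numbering (length $n+1$, entries summing to $p_F$), and all names are invented here.

```python
from math import factorial, prod


def multi_indices(d, m):
    """All tuples of m non-negative integers summing to d."""
    if m == 1:
        return [(d,)]
    return [(i,) + rest for i in range(d + 1) for rest in multi_indices(d - i, m - 1)]


def bernstein(alpha, lam):
    """Reference Bernstein polynomial (|alpha|! / alpha!) * lam^alpha."""
    coef = factorial(sum(alpha))
    for a in alpha:
        coef //= factorial(a)
    return coef * prod(l ** a for l, a in zip(lam, alpha))


def nonuniform_de_casteljau(faces, lam):
    """Evaluate a non-uniform order form; faces is a list of (p_F, coeffs) pairs."""
    m = len(lam)
    queue = sorted(faces, key=lambda face: face[0], reverse=True)  # ordered by degree
    p_max = queue[0][0]
    pos = 0  # the ordered list is traversed exactly once
    level = {alpha: 0.0 for alpha in multi_indices(p_max, m)}
    for d in range(p_max, 0, -1):
        # Assimilation: faces of degree d sit at the current position of the list.
        while pos < len(queue) and queue[pos][0] == d:
            for alpha, c in queue[pos][1].items():
                level[alpha] += c
            pos += 1
        # Standard de Casteljau atom: ascend from level d to level d - 1.
        nxt = {}
        for alpha in multi_indices(d - 1, m):
            total = 0.0
            for l in range(m):
                up = alpha[:l] + (alpha[l] + 1,) + alpha[l + 1:]
                total += lam[l] * level[up]
            nxt[alpha] = total
        level = nxt
    return level[(0,) * m]
```

If every face has degree `p_max`, the `while` loop empties the list on the first pass and the remaining iterations are the uniform order algorithm, matching the remark above.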

3.2. Degree Raising. A polynomial of degree $p$ may be expressed in terms of Bernstein polynomials of degree $p$, but could also be written in terms of Bernstein polynomials of any greater degree. Degree raising [14, 24] is a useful technique for visualisation and other applications involving uniform order BB-form representations. A pyramid scheme for the uniform order degree raising may be derived starting again with (33), but this time we substitute for $B^{p}_{\alpha}$ using the identity

$$B^{T,p}_{\alpha} = \sum_{\ell=1}^{n+1} \lambda_\ell B^{T,p}_{\alpha} = \sum_{\ell=1}^{n+1} \frac{\alpha_\ell + 1}{p+1} B^{T,p+1}_{\alpha+e_\ell}$$

to obtain

$$u_T = \sum_{\alpha\in\mathcal{I}^{p+1}_n} c^{(p+1)}_{\alpha} B^{T,p+1}_{\alpha} \tag{42}$$


Algorithm 1: The non-uniform order de Casteljau algorithm.

Input: Coefficients in the non-uniform Bernstein representation (32) of $u_T$: $\{c^{F,p_F}_{\alpha} : \alpha \in \mathcal{I}^{p_F}_k\}$ for each $F \in \Delta_k(T)$, $k = 0, \ldots, n$.
Output: Value of $u_T$ at point $x \in T$ with barycentric coordinates $\lambda_k$, $k = 1, \ldots, n+1$.

Initialise $c^{(p_{\max})}_{\alpha} = 0$ for all $\alpha \in \mathcal{I}^{p_{\max}}_n$;
for $d = p_{\max}, p_{\max}-1, \ldots, 1$ do
    // Assimilate Bernstein coefficients on faces of degree d
    foreach $F \in \Delta_k(T)$, $k = 0, \ldots, n$ such that $p_F = d$ do
        $c^{(d)}_{\varepsilon_{FT}\alpha}$ += $c^{F,p_F}_{\alpha}$ for all $\alpha \in \mathcal{I}^{p_F}_k$;
    // Standard de Casteljau atom
    $c^{(d-1)}_{\alpha} = \sum_{\ell=1}^{n+1} \lambda_\ell\, c^{(d)}_{\alpha+e_\ell}$ for $\alpha \in \mathcal{I}^{d-1}_n$;
return $c^{(0)}_0$;

Figure 5. (a) Pyramid scheme for raising degree in the case of an interval $n = 1$. (b) Computational atom for degree raising.

with

$$c^{(p+1)}_{\alpha} = \sum_{\ell=1}^{n+1} \frac{\alpha_\ell}{p+1}\, c^{(p)}_{\alpha-e_\ell}, \quad \alpha \in \mathcal{I}^{p+1}_n. \tag{43}$$

Here, we adopt the standard convention whereby terms involving negative multi-indices are treated as zero and simply ignored. Repeated application of this procedure gives a polynomial representation of $u_T$ of any desired order. Figure 5 depicts the procedure in pyramid form in the case $n = 1$ along with the computational atom describing the process for progressing down the pyramid. The entries in the final row of the pyramid are the coefficients of the degree 4 representation.
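A single application of the recurrence (43) on an interval might look as follows in Python. The function name and the row layout (entry `k` holds the coefficient with multi-index `(p - k, k)`) are assumptions made for this sketch; terms with a negative multi-index are skipped, in line with the convention just stated.

```python
def raise_degree_1d(coeffs):
    """One degree raising step on an interval: degree p -> degree p + 1."""
    p = len(coeffs) - 1
    raised = []
    for k in range(p + 2):  # target multi-index alpha = (p + 1 - k, k)
        a1, a2 = p + 1 - k, k
        value = 0.0
        if a1 > 0:  # alpha - e1 = (p - k, k) is a valid index of degree p
            value += a1 / (p + 1) * coeffs[k]
        if a2 > 0:  # alpha - e2 = (p + 1 - k, k - 1) is a valid index of degree p
            value += a2 / (p + 1) * coeffs[k - 1]
        raised.append(value)
    return raised
```

For instance, the linear function with end values 1 and 3 has degree 1 coefficients `[1.0, 3.0]`, and one step yields the degree 2 coefficients `[1.0, 2.0, 3.0]`; repeating the step descends the pyramid one further row each time.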

The development of a non-uniform order degree raising procedure is particularly attractive for the treatment of non-uniform order Bernstein bases. It is clear that such an algorithm exists. One could simply proceed face-by-face, applying degree


Figure 6. Example of non-uniform degree raising on an interval. (a) Coefficients from non-uniform representation (40) entered into pyramid. (b) Pyramid populated with values obtained using non-uniform degree raising. The coefficients for the uniform representation appear in the final row.

raising on each entity to change the order of the local Bernstein representation on the face to the maximal order in the element, thereby obtaining an equivalent uniform order approximation. In this way, a non-uniform order representation can always be recast as an equivalent uniform order approximation by a process of local degree raising on each face. The option of changing a non-uniform order representation to an equivalent, uniform order representation constitutes a useful fall-back tactic enabling standard uniform degree based algorithms to be brought into play. This tactic could have been used in lieu of developing a non-uniform de Casteljau algorithm, but would be less efficient. However, for some applications, such as graphical visualisation and rendering of polynomials, the rendering of uniform Bernstein form is implemented at the hardware level using OpenGL evaluators [10]. Such procedures are very highly tuned to the extent that the overhead of raising to a uniform order representation becomes a viable proposition.

The process of performing degree raising separately on each face is inefficient, and a more attractive tailor-made variant is available. Figure 6 illustrates a more effective procedure in the special case of raising of the non-uniform order Bernstein representation on the interval given by (40). Figure 6(a) shows the pyramid populated with the coefficients from the non-uniform representation (40) inserted at the level appropriate to the degree of the representation for each component. The standard degree raising procedure is then applied according to the computational atom in Figure 5(b) with the only difference being that the values already present in the pyramid are added to the values computed using the computational atom. Figure 6(b) shows the result of applying the procedure where the coefficients in the uniform degree 4 representation of $u_I$ emerge in the final row of the pyramid.

Although we illustrated the algorithm in the simplest possible case, the same procedure applies to higher order simplices. Algorithm 2 gives the procedure in the general case. The same technique employed for the non-uniform order de Casteljau


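Algorithm 2: Degree raising to uniform order representation.

Input: Coefficients in the non-uniform Bernstein representation (32) of $u_T$: $\{c^{F,p_F}_{\alpha} : \alpha \in \mathcal{I}^{p_F}_k\}$ for each $F \in \Delta_k(T)$, $k = 0, \ldots, n$.
Output: Coefficients $\{c^{(p_{\max})}_{\alpha} : \alpha \in \mathcal{I}^{p_{\max}}_n\}$ in uniform order representation of $u_T$.

Initialise $c^{(p_{\min}-1)}_{\alpha} = 0$, $\alpha \in \mathcal{I}^{p_{\min}-1}_n$;
for $d = p_{\min}, \ldots, p_{\max}$ do
    // Standard degree raising atom
    $c^{(d)}_{\alpha} = \sum_{\ell=1}^{n+1} \frac{\alpha_\ell}{d}\, c^{(d-1)}_{\alpha-e_\ell}$, $\alpha \in \mathcal{I}^{d}_n$;
    // Assimilate Bernstein coefficients on faces of degree d
    foreach $F \in \Delta_k(T)$, $k = 0, \ldots, n$ such that $p_F = d$ do
        $c^{(d)}_{\varepsilon_{FT}\alpha}$ += $c^{F,p_F}_{\alpha}$ for all $\alpha \in \mathcal{I}^{p_F}_k$;
return $\{c^{(p_{\max})}_{\alpha} : \alpha \in \mathcal{I}^{p_{\max}}_n\}$;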

algorithm is employed in which all faces are handled simultaneously. The standard degree raising recursion is augmented with an additional step at each polynomial level $d$ whereby the Bernstein coefficients are assimilated on all faces $F \in \Delta(T)$ for which $p_F = d$. The structure of Algorithm 2 is similar to that for the non-uniform de Casteljau procedure. A formal proof and verification of the correctness of the algorithm may be given based on an analogue of Lemma 4. Indeed, the argument is virtually identical and is therefore omitted. By the same token, the complexity of Algorithm 2 is never worse than the complexity of Algorithm 1 given in Lemma 5 (and is identical in the case $p_{\min} = 1$).
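The raise-then-assimilate loop can be illustrated on the interval example (40). The sketch below is a hypothetical rendering, assuming $p_{\min} \ge 1$ and the same conventions as before: `reserved` maps a multi-index `(i, j)` to its coefficient, and entry `k` of a row of degree `d` corresponds to `(d - k, k)`.

```python
def nonuniform_raise_1d(reserved, p_min, p_max):
    """Uniform degree p_max coefficients of a non-uniform form on an interval.

    Assumes p_min >= 1; reserved maps (i, j) to a coefficient of degree i + j.
    """
    row = [0.0] * p_min  # c^(p_min - 1) is initialised to zero
    for d in range(p_min, p_max + 1):
        # Standard degree raising atom: degree d - 1 -> degree d.
        new = []
        for k in range(d + 1):
            a1, a2 = d - k, k
            value = 0.0
            if a1 > 0:
                value += a1 / d * row[k]
            if a2 > 0:
                value += a2 / d * row[k - 1]
            new.append(value)
        # Assimilate the coefficients whose multi-index has degree d.
        for (i, j), c in reserved.items():
            if i + j == d:
                new[j] += c
        row = new
    return row
```

With $c^{L,1}_1 = 8$, $c^{I,4}_{31} = 1$, $c^{I,4}_{22} = 2$, $c^{I,4}_{13} = 3$ and $c^{R,3}_3 = 4$, writing $\lambda_1$ and $\lambda_2^3$ in the degree 4 basis by hand gives the uniform coefficients 8, 7, 6, 6, 4, which is what the loop should return.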

3.3. Assembly of the Load Vector. The element load vector $\vec{f}^{T}$ relative to the non-uniform order Bernstein basis has entries defined by

$$f^{T}_{\alpha} = \int_T f(x)\, B^{T,|\alpha|}_{\alpha}(x)\, dx, \quad \alpha \in \mathcal{I}^{\vec{p}}_T, \tag{44}$$

where $f : T \to \mathbb{R}$ is an appropriate source function. Definitions (22) and (28) provide an alternative form for the load vector that is more convenient for our purposes:

$$f^{T}_{\varepsilon_{FT}\alpha} = \int_T f(x)\, B^{T,p_F}_{\varepsilon_{FT}\alpha}(x)\, dx, \quad \alpha \in \mathcal{I}^{p_F}_k,\ F \in \Delta_k(T),\ k \in \{0, \ldots, n\}. \tag{45}$$

If the non-uniform order BB-basis is to be utilised for finite element approximation, it will be necessary to develop efficient procedures for the computation of the load vector relative to the basis. Algorithms were presented in [2] for the efficient computation of the BB-moments of the data $f$ defined by

$$\mu^{T,p}_{\alpha}(f) = \int_T f(x)\, B^{T,p}_{\alpha}(x)\, dx, \quad \alpha \in \mathcal{I}^{p}_n. \tag{46}$$

The BB-moments directly correspond to the entries of the element load vector in the case of uniform order approximation.


Figure 7. (a) Pyramid scheme for moment lowering of BB-moments in the case of an interval $n = 1$. (b) Computational atom.

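The BB-moments of a given degree may be obtained from the BB-moments of any higher degree as follows,

$$\mu^{T,d-1}_{\alpha}(f) = \sum_{\ell=1}^{n+1} \frac{1+\alpha_\ell}{d}\, \mu^{T,d}_{\alpha+e_\ell}(f), \quad \alpha \in \mathcal{I}^{d-1}_n$$

thanks to the following property of Bernstein polynomials

$$B^{T,d-1}_{\alpha} = \sum_{\ell=1}^{n+1} \frac{1+\alpha_\ell}{d}\, B^{T,d}_{\alpha+e_\ell}, \quad \alpha \in \mathcal{I}^{d-1}_n.$$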
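For completeness, the property can be verified in one line. The derivation below assumes the standard definition $B^{T,d}_{\alpha} = \frac{d!}{\alpha!}\lambda^{\alpha}$ (not restated in this section) together with $\sum_{\ell}\lambda_\ell = 1$:

```latex
\begin{align*}
\lambda_\ell\, B^{T,d-1}_{\alpha}
  &= \frac{(d-1)!}{\alpha!}\,\lambda^{\alpha+e_\ell}
   = \frac{1+\alpha_\ell}{d}\cdot\frac{d!}{(\alpha+e_\ell)!}\,\lambda^{\alpha+e_\ell}
   = \frac{1+\alpha_\ell}{d}\, B^{T,d}_{\alpha+e_\ell}, \\
B^{T,d-1}_{\alpha}
  &= \sum_{\ell=1}^{n+1}\lambda_\ell\, B^{T,d-1}_{\alpha}
   = \sum_{\ell=1}^{n+1}\frac{1+\alpha_\ell}{d}\, B^{T,d}_{\alpha+e_\ell}.
\end{align*}
```

Multiplying the second line by $f$ and integrating over $T$ gives the moment lowering rule.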

This process of moment lowering may be expressed as a pyramid scheme, as illustrated in Figure 7 for the case of an interval. Figure 7(a) depicts the process of obtaining lower order moments recursively using the computational atom shown in Figure 7(b) to ascend the pyramid starting with the moments of uniform order $p = 4$. The relevant entries in the load vector for the uniform case are then simply read off from the appropriate row of the pyramid.
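A minimal Python sketch of the moment lowering pyramid on an interval is given below. The function names and the row layout (entry `k` of a degree `d` row holds the moment with multi-index `(d - k, k)`) are illustrative assumptions.

```python
def lower_moments_1d(moments):
    """One moment lowering step on an interval: degree d -> degree d - 1."""
    d = len(moments) - 1
    # For alpha = (d - 1 - k, k): (1 + alpha_1) = d - k and (1 + alpha_2) = k + 1.
    return [((d - k) * moments[k] + (k + 1) * moments[k + 1]) / d for k in range(d)]


def moment_pyramid_1d(top_moments):
    """All rows of the pyramid, keyed by degree, from the given degree down to 0."""
    rows = {len(top_moments) - 1: list(top_moments)}
    row = list(top_moments)
    while len(row) > 1:
        row = lower_moments_1d(row)
        rows[len(row) - 1] = row
    return rows
```

As a sanity check, for $f \equiv 1$ on the unit interval every moment of degree $p$ equals $1/(p+1)$, so lowering the five degree 4 moments $1/5$ should give four degree 3 moments equal to $1/4$.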

The pyramid schemes shown in Figure 5(a) and Figure 7(a) are intimately related: the computational atoms are precisely the same. Indeed, the only difference is that the pyramid is ascended (in the case of moment lowering) or descended (in the case of degree raising). Pyramid schemes related in this way are said to be dual [16]. The same duality property extends to higher order simplices.
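One way to see the duality concretely is that the two linear maps are transposes of one another, so the pairing of coefficients with moments is preserved. The following self-contained sketch checks this on an interval; all names are hypothetical and the data are arbitrary numbers rather than moments of a particular function.

```python
def raise_degree(c):
    """Degree raising step on an interval (descends the pyramid)."""
    p = len(c) - 1
    out = []
    for k in range(p + 2):
        a1, a2 = p + 1 - k, k
        value = 0.0
        if a1 > 0:
            value += a1 / (p + 1) * c[k]
        if a2 > 0:
            value += a2 / (p + 1) * c[k - 1]
        out.append(value)
    return out


def lower_moments(mu):
    """Moment lowering step on an interval (ascends the pyramid)."""
    d = len(mu) - 1
    return [((d - k) * mu[k] + (k + 1) * mu[k + 1]) / d for k in range(d)]


def pairing(c, mu):
    """Sum over multi-indices of coefficient times moment."""
    return sum(x * y for x, y in zip(c, mu))
```

For degree 3 coefficients `c` and degree 4 moments `mu`, `pairing(raise_degree(c), mu)` and `pairing(c, lower_moments(mu))` should coincide up to rounding, since both represent the integral of $f$ against the same polynomial.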

Can the BB-moments of uniform order be used to assemble the load vector in the case of non-uniform approximation? We return to the example of an interval $I$ with endpoints at $L$ and $R$, where a non-uniform approximation of the form (40) is sought. The entries in the load vector corresponding to the basis functions $B^{L,1}_{1}$, $B^{I,4}_{\alpha}$ and $B^{R,3}_{3}$ are required. Suppose that the uniform order BB-moments $\{\mu^{I,4}_{\alpha}, \alpha \in \mathcal{I}^{4}_1\}$ have been computed using, for example, the procedure presented in [2].

Traditionally, expressions for the components of the non-uniform order load vector would be derived by writing the Bernstein polynomials appearing in (40) in


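Algorithm 3: Assembly of load vector $\vec{f}^{T}$ relative to non-uniform order basis.

Input: BB-moments $\mu^{T,p_{\max}}_{\alpha}(f)$, $\alpha \in \mathcal{I}^{p_{\max}}_n$.
Output: Element load vector $\vec{f}^{T} = \{f^{T}_{\alpha}\}$ defined in (44).

for $d = p_{\max}, p_{\max}-1, \ldots, p_{\min}$ do
    // Extract components of load vector on faces of degree d
    foreach $F \in \Delta_k(T)$, $k = 0, \ldots, n$ such that $p_F = d$ do
        $f^{T}_{\varepsilon_{FT}\alpha} = \mu^{T,d}_{\varepsilon_{FT}\alpha}(f)$ for all $\alpha \in \mathcal{I}^{p_F}_k$;
    // Moment lowering atom
    $\mu^{T,d-1}_{\alpha}(f) = \sum_{\ell=1}^{n+1} \frac{1+\alpha_\ell}{d}\, \mu^{T,d}_{\alpha+e_\ell}(f)$ for $\alpha \in \mathcal{I}^{d-1}_n$;
return;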

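terms of the higher (fourth) order Bernstein polynomials as follows:

$$\begin{aligned}
B^{L,1}_{1} &= B^{I,4}_{40} + \tfrac{3}{4} B^{I,4}_{31} + \tfrac{1}{2} B^{I,4}_{22} + \tfrac{1}{4} B^{I,4}_{13}\\
B^{R,3}_{3} &= \tfrac{1}{4} B^{I,4}_{13} + B^{I,4}_{04}
\end{aligned} \tag{47}$$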

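along with the trivial identity $B^{I,4}_{\alpha} = B^{I,4}_{\alpha}$ for the interior nodes. The entries of the load vector for the non-uniform approximation on the interval are given in terms of the fourth order BB-moments $\mu^{(4)}_{\alpha}$, $\alpha \in \mathcal{I}^{4}_1$, by the formulae

$$\begin{aligned}
\mu^{L,1}_{1}(f) &= \mu^{I,4}_{40} + \tfrac{3}{4} \mu^{I,4}_{31} + \tfrac{1}{2} \mu^{I,4}_{22} + \tfrac{1}{4} \mu^{I,4}_{13}\\
\mu^{R,3}_{3}(f) &= \tfrac{1}{4} \mu^{I,4}_{13} + \mu^{I,4}_{04}
\end{aligned} \tag{48}$$

along with the trivial identities $\mu^{I,4}_{\alpha}(f) = \mu^{I,4}_{\alpha}$, $\alpha \in \mathcal{I}^{4}_1$.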

The corresponding formulae for higher order simplices are considerably more complicated. In practice, the coefficients appearing in these formulae are stored in look-up tables containing the coefficients arising from expressing the lower order Bernstein polynomials in terms of the higher order polynomials. However, this approach is at best clumsy, error prone, and, entailing random access of look-up tables, a potential bottleneck.

A superior alternative is to exploit duality. For example, we may construct the dual of the pyramid Figure 6(b). The first step is to place the moments of uniform order $p = 4$ at the nodes of the pyramid corresponding to their multi-indices $\alpha \in \mathcal{I}^{4}_1$, as shown in Figure 8(a). The moments are then propagated through the pyramid using the standard moment lowering computational atom shown in Figure 7(b) precisely as in the case of uniform order moment lowering. This results in the pyramid shown in Figure 8(a) which is dual to the pyramid Figure 6(a). Examining the resulting entries in the pyramid reveals that the expressions (48) for the moments have materialised in the pyramid at the locations corresponding to the (non-uniform order) multi-indices as indicated in Figure 8(b). The components of the non-uniform load vector may simply be read off from the dual pyramid. The pyramid scheme is simple, stable and dispenses with the need for look-up tables.
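The read-off procedure can be sketched for the interval example as follows. This is an illustration under assumptions: `wanted` is a hypothetical set of the multi-indices `(i, j)` that carry basis functions (all of degree at least 1), and `mu_top[k]` holds the uniform moment with multi-index `(p_max - k, k)`.

```python
def nonuniform_load_vector_1d(mu_top, wanted):
    """Extract non-uniform load vector entries while ascending the moment pyramid."""
    row = list(mu_top)
    p_max = len(row) - 1
    p_min = min(i + j for (i, j) in wanted)
    load = {}
    for d in range(p_max, p_min - 1, -1):
        # Extract the entries attached to multi-indices of degree d.
        for (i, j) in wanted:
            if i + j == d:
                load[(i, j)] = row[j]
        # Moment lowering atom: degree d -> degree d - 1.
        row = [((d - k) * row[k] + (k + 1) * row[k + 1]) / d for k in range(d)]
    return load
```

Taking the degree 4 moments $\mu_{40}, \ldots, \mu_{04}$ to be 1, 2, 3, 4, 5, the formulae (48) give $1 + \tfrac34\cdot 2 + \tfrac12\cdot 3 + \tfrac14\cdot 4 = 5$ for the left vertex and $\tfrac14\cdot 4 + 5 = 6$ for the right vertex, and the same values should appear at `(1, 0)` and `(0, 3)` in the returned dictionary without any look-up table.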

In summary, the dual of the non-uniform degree raising pyramid scheme gives a pyramid scheme for the assembly of the non-uniform load vector. The general case


Figure 8. Pyramid scheme for computation of non-uniform BB-moments needed for the non-uniform order approximation (40). (a) Propagation of uniform $p = 4$ order moments using standard computational atom for moment lowering. (b) Extraction of specific moments used in the non-uniform order load vector.

is presented in Algorithm 3 for the assembly of the element load vector starting from the BB-moments of degree $p_{\max}$. Approached in this way, the generation of the non-uniform load vector from the uniform moments is simple, efficient, avoids the need for look-up tables and is applicable to arbitrary order approximation in any number of dimensions. The complexity of the whole assembly procedure is given by:

Theorem 1. The element load vector for the non-uniform order space $\mathbb{P}^{\vec{p}}_n(T)$ can be computed in $O(p_{\max}^{n+1})$ operations.

Proof. The uniform order moments can be computed in $O(p_{\max}^{n+1})$ operations using the techniques presented in [2]. The moment lowering step has complexity at worst that of the non-uniform de Casteljau algorithm presented in Lemma 5. □

4. Assembly of Element Matrices

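The element stiffness matrix $S^{T}$ has entries given by

$$S^{T}_{\alpha\beta} = \int_T \operatorname{grad} B^{T,|\alpha|}_{\alpha} \cdot A(x) \operatorname{grad} B^{T,|\beta|}_{\beta}\, dx, \quad \alpha, \beta \in \mathcal{I}^{\vec{p}}_T, \tag{49}$$

where $A : T \to \mathbb{R}^{d\times d}$ is appropriate data. As with the element load vector, it is more convenient to express the entries of the stiffness matrix in an expanded form using (27) and (28):

$$S^{T}_{\varepsilon_{FT}\alpha,\,\varepsilon_{GT}\beta} = \int_T \operatorname{grad} B^{T,p_F}_{\varepsilon_{FT}\alpha} \cdot A(x) \operatorname{grad} B^{T,p_G}_{\varepsilon_{GT}\beta}\, dx, \tag{50}$$

where $F, G \in \Delta(T)$, $\alpha \in \mathcal{I}^{p_F}_{k_F}$ and $\beta \in \mathcal{I}^{p_G}_{k_G}$. The identity

$$\operatorname{grad} B^{T,p_F}_{\varepsilon_{FT}\alpha} = p_F \sum_{\ell=1}^{k_F+1} B^{T,p_F-1}_{\varepsilon_{FT}(\alpha-e_\ell)} \operatorname{grad} \lambda_{\varepsilon_{FT}e_\ell}$$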


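along with the corresponding expression for $\operatorname{grad} B^{T,p_G}_{\varepsilon_{GT}\beta}$ leads to the following alternative expression for the right hand side of (50)

$$p_F\, p_G \sum_{\ell=1}^{k_F+1} \sum_{m=1}^{k_G+1} \int_T B^{T,p_F-1}_{\varepsilon_{FT}(\alpha-e_\ell)}\, B^{T,p_G-1}_{\varepsilon_{GT}(\beta-e_m)} \operatorname{grad} \lambda_{\varepsilon_{FT}e_\ell} \cdot A(x) \operatorname{grad} \lambda_{\varepsilon_{GT}e_m}\, dx \tag{51}$$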

that is more useful for the purposes of implementation.

The algorithm StiffMat presented in [2] enables the construction of the element stiffness matrix in the case of uniform polynomial order $p$ in $O(p^{2n})$ operations when $T$ is an $n$-simplex. This is the optimal complexity given that the matrix has $O(p^{2n})$ non-zero entries. StiffMat exploits another property of Bernstein polynomials

$$B^{T,p}_{\alpha} B^{T,q}_{\beta} = \frac{\binom{\alpha+\beta}{\alpha}}{\binom{p+q}{p}}\, B^{T,p+q}_{\alpha+\beta}, \quad \alpha \in \mathcal{I}^{p}_n,\ \beta \in \mathcal{I}^{q}_n \tag{52}$$

in the case $p = q$, to reduce the computation of entries in the stiffness matrix to the evaluation of the BB-moments of the data $A$ over the element $T$. Remarkably, it is not the computation of the moments but the evaluation of the multinomial coefficients $\binom{\alpha+\beta}{\alpha}$ that is responsible for the leading term in the complexity estimate.

Care is needed in the computation of the multinomial coefficients if the algorithm is to achieve optimal complexity.
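The product identity (52) is easy to check numerically. The sketch below assumes the standard definitions $B^{T,p}_{\alpha} = \frac{p!}{\alpha!}\lambda^{\alpha}$ and $\binom{\alpha+\beta}{\alpha} = \prod_i \binom{\alpha_i+\beta_i}{\alpha_i}$; it evaluates each coefficient from scratch and is therefore not the nested recursion of [2], only a reference check with hypothetical function names.

```python
from math import comb, factorial, prod


def bernstein(alpha, lam):
    """Bernstein polynomial (|alpha|! / alpha!) * lam^alpha at barycentric point lam."""
    coef = factorial(sum(alpha))
    for a in alpha:
        coef //= factorial(a)
    return coef * prod(l ** a for l, a in zip(lam, alpha))


def product_coefficient(alpha, beta):
    """Factor multiplying B^{p+q}_{alpha+beta} in the product identity (52)."""
    p, q = sum(alpha), sum(beta)
    multinomial = prod(comb(a + b, a) for a, b in zip(alpha, beta))
    return multinomial / comb(p + q, p)
```

Evaluating `product_coefficient` independently for every pair of indices costs a number of operations proportional to the dimension per entry, which illustrates why a more careful scheme is required to reach $O(1)$ work per matrix entry.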

What about the assembly of the stiffness matrix relative to the non-uniform order basis? The product identity (52) still applies and all would seem well. Unfortunately, the critical multinomial algorithm from [2] relies heavily on the fact that in the uniform order case, the orders $p$ and $q$ are equal. The algorithm is no longer applicable to the non-uniform order case and the prospects seem bleak, given that the multinomial algorithm dominates the complexity.

Fortunately, there is a nuance that leads to the following refined version of (52):

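Lemma 6. Let $F, G \in \Delta(T)$ be a pair of faces of $T$. Then,

$$B^{T,p_F}_{\varepsilon_{FT}\alpha}\, B^{T,p_G}_{\varepsilon_{GT}\beta} = \frac{1}{\binom{p_F+p_G}{p_F}} \begin{cases} \binom{\alpha+\beta}{\alpha}\, B^{T,2p_F}_{\varepsilon_{FT}(\alpha+\beta)}, & \text{if } F = G,\\[4pt] B^{T,p_F+p_G}_{\varepsilon_{FT}\alpha+\varepsilon_{GT}\beta}, & \text{if } F \neq G, \end{cases} \tag{53}$$

where $\alpha \in \mathcal{I}^{p_F}_{k_F}$ and $\beta \in \mathcal{I}^{p_G}_{k_G}$.

Proof. Thanks to Lemma 2, $F \neq G$ implies that $F$ and $G$ have no interior domain points in common. This means $(\varepsilon_{FT}\alpha + \varepsilon_{GT}\beta)!$ reduces to $\varepsilon_{FT}\alpha!\,\varepsilon_{GT}\beta!$ and the result follows from (52). □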

The absence of the multinomial coefficient in the second case of (53) is vital to Algorithm 4. If, for instance, the multinomial factor were still present in the second case, then Algorithm 4 would need to compute (or look up) the relevant multinomial coefficients in the else clause. Either way, this step would form a bottleneck resulting in a sub-optimal algorithm. The multinomial coefficients needed in the if clause correspond to both multi-indices being drawn from the same indexing sets. This feature is exploited in the routine StiffMat described in [2] to evaluate the multinomial coefficients efficiently using a nested recursion. The complexity of Algorithm 4 is the subject of our final result:

Theorem 2. The element stiffness matrix for the non-uniform order space $\mathbb{P}^{\vec{p}}_n(T)$ can be computed in $O(p_{\max}^{2n})$ operations.


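Algorithm 4: Assembly of stiffness matrix $S^{T}$ relative to non-uniform order basis.

Input: BB-moments $\{\mu^{T,2p_{\max}-2}_{\alpha}(A) : \alpha \in \mathcal{I}^{2p_{\max}-2}_n\}$.
Output: Element stiffness matrix $S = \{S^{T}_{\alpha\beta}\}$ defined in (49).

for $d = 2p_{\max}-2, 2p_{\max}-3, \ldots, 2p_{\min}-1$ do
    // Update stiffness matrix
    foreach $F, G \in \Delta(T)$: $p_F + p_G = d + 2$ do
        if $F = G$ then
            // Diagonal blocks use procedure from [2]
            StiffMat($S$, $p_F$, $\{\mu^{T,d}_{\alpha}(A) : \alpha \in \mathcal{I}^{d}_n\}$);
        else
            // Exploit second case in (53)
            $w = p_F\, p_G \big/ \binom{p_F+p_G}{p_F}$;
            for $\ell = 1, \ldots, k_F+1$ do
                for $m = 1, \ldots, k_G+1$ do
                    $S_{\varepsilon_{FT}(\alpha+e_\ell),\,\varepsilon_{GT}(\beta+e_m)}$ += $w \operatorname{grad}\lambda_{\varepsilon_{FT}e_\ell} \cdot \mu^{T,d}_{\varepsilon_{FT}\alpha+\varepsilon_{GT}\beta}(A) \operatorname{grad}\lambda_{\varepsilon_{GT}e_m}$ for $\alpha \in \mathcal{I}^{p_F}_{k_F}$, $\beta \in \mathcal{I}^{p_G}_{k_G}$;
    // Moment lowering atom
    $\mu^{T,d-1}_{\alpha}(A) = \sum_{\ell=1}^{n+1} \frac{1+\alpha_\ell}{d}\, \mu^{T,d}_{\alpha+e_\ell}(A)$ for $\alpha \in \mathcal{I}^{d-1}_n$;
return;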

Proof. The moments needed for Algorithm 4 can be evaluated in $O((2p_{\max}-2)^{n+1})$ operations using the procedures given in [2], where it is also shown that the number of operations required in a call to routine StiffMat for a $k$-simplex of polynomial degree $p$ is $O(p^{2k})$. Therefore, the number of operations needed for the assembly of the diagonal blocks using calls to StiffMat is of order

$$\sum_{k=0}^{n} \sum_{F\in\Delta_k(T)} p_F^{2k} \le \sum_{k=0}^{n} \binom{n+1}{k+1} p_{\max}^{2k} = p_{\max}^{-2}\left((1 + p_{\max}^2)^{n+1} - 1\right) = O(p_{\max}^{2n}).$$

The number of operations to deal with the off-diagonal blocks is of order

$$\sum_{F,G\in\Delta(T):\, F\neq G} (k_F+1) \dim \mathcal{I}^{p_F}_{k_F} \times (k_G+1) \dim \mathcal{I}^{p_G}_{k_G},$$

which is, in turn, bounded by

$$\left\{ \sum_{k=0}^{n} (k+1) \binom{n+1}{k+1} \dim \mathcal{I}^{p_{\max}}_k \right\}^2 = (n+1)^2 \binom{p_{\max}-1+n}{n}^2 = O(p_{\max}^{2n}),$$

where a similar argument to the one used in the proof of Lemma 3 has been applied. The result now follows since the number of operations for the moment lowering is, as usual, of the same order as the non-uniform de Casteljau algorithm, $O(p_{\max}^{n+1})$. □
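Both closed forms used in the proof can be confirmed by brute force. The sketch below rests on one assumption that is not stated in this section: that $\dim \mathcal{I}^{p}_k$ counts the interior domain points of a $k$-simplex of degree $p$, namely $\binom{p-1}{k}$. Under that reading the second identity follows from $(k+1)\binom{n+1}{k+1} = (n+1)\binom{n}{k}$ and Vandermonde's identity. The function names are illustrative.

```python
from math import comb


def diagonal_bound(p, n):
    """Sum over k of C(n+1, k+1) * p^(2k): the bound for the diagonal blocks."""
    return sum(comb(n + 1, k + 1) * p ** (2 * k) for k in range(n + 1))


def diagonal_closed_form(p, n):
    """p^(-2) * ((1 + p^2)^(n+1) - 1); the division is exact for integer p >= 1."""
    return ((1 + p * p) ** (n + 1) - 1) // (p * p)


def off_diagonal_sum(p, n):
    """Sum over k of (k+1) * C(n+1, k+1) * dim I^p_k, ASSUMING dim I^p_k = C(p-1, k)."""
    return sum((k + 1) * comb(n + 1, k + 1) * comb(p - 1, k) for k in range(n + 1))


def off_diagonal_closed_form(p, n):
    """(n + 1) * C(p - 1 + n, n), whose square appears in the proof."""
    return (n + 1) * comb(p - 1 + n, n)
```

Both closed forms grow like $p^{n}$ after taking the square root of the off-diagonal bound, consistent with the overall $O(p_{\max}^{2n})$ estimate.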


Theorem 2 shows that the stiffness matrix can be constructed in optimal complexity of $O(1)$ operations per entry for the non-uniform order Bernstein basis. Algorithms with the same structure as Algorithm 4 that exploit analogues of subroutine StiffMat from [2] can easily be developed to construct other matrices, such as the element mass matrix, in optimal complexity.

References

[1] M. Ainsworth, Discrete dispersion relation for hp-version finite element approximation at high wave number, SIAM J. Numer. Anal., 42 (2004), pp. 553–575.

[2] M. Ainsworth, G. Andriamaro, and O. Davydov, Bernstein-Bezier FEM and optimal assembly algorithms, SIAM J. Sci. Comp., (2011), pp. 3087–3109.

[3] M. Ainsworth and J. Coyle, Hierarchic finite element bases on unstructured tetrahedral meshes, Internat. J. Numer. Methods Engrg., 58 (2003), pp. 2103–2130.

[4] M. Ainsworth and B. Senior, Aspects of an adaptive hp-finite element method: Adaptive strategy, conforming approximation and efficient solvers, Comput. Methods Appl. Mech. Engrg., 150 (1997), pp. 65–87.

[5] D. N. Arnold, R. S. Falk, and R. Winther, Geometric decompositions and local bases for spaces of finite element differential forms, Comput. Methods Appl. Mech. Engrg., 198 (2009), pp. 1660–1672.

[6] I. Babuska and B. Guo, The h-p version of the finite element method for domains with curved boundaries, SIAM J. Numer. Anal., 25 (1988), pp. 837–861.

[7] S. Brenner and L. Scott, The Mathematical Theory of Finite Element Methods, vol. 15 of Texts in Applied Mathematics, Springer-Verlag, New York, 1994.

[8] C. Canuto, M. Hussaini, A. Quarteroni, and T. Zang, Spectral methods. Evolution to complex geometries and applications to fluid dynamics, Scientific Computation, Springer, Berlin, 2007.

[9] P. G. Ciarlet, The Finite Element Method for Elliptic Problems, Elsevier, North-Holland, 1978.

[10] T. Davis, J. Neider, and M. Woo, OpenGL Programming Guide: The Official Guide to Learning OpenGL, Version 1.1, Second Edition, Addison-Wesley Longman Pub (Sd), 1997.

[11] L. Demkowicz, Computing with hp-adaptive finite elements. Vol. 1, Chapman & Hall/CRC Applied Mathematics and Nonlinear Science Series, Chapman & Hall/CRC, Boca Raton, FL, 2007. One and two dimensional elliptic and Maxwell problems, With 1 CD-ROM (UNIX).

[12] L. Demkowicz, J. Oden, W. Rachowicz, and O. Hardy, Toward a universal h-p adaptive finite element strategy. Part 1 constrained approximation and data structure, Comput. Methods Appl. Mech. Engrg., 77 (1989), pp. 79–112.

[13] G. Farin, Triangular Bernstein-Bezier patches, Comput. Aided Geom. Design, 3 (1986), pp. 83–127.

[14] G. Farin, Curves and surfaces for CAGD: a practical guide, Fifth Edition, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2002.

[15] J. Foley, A. van Dam, S. Feiner, and J. Hughes, Computer Graphics: Principles and Practice, Addison-Wesley Publishing Company, Inc., 1996.

[16] R. Goldman, Pyramid Algorithms: A Dynamic Programming Approach to Curves and Surfaces for Geometric Modeling, Morgan Kaufmann, 2002.

[17] D. Gottlieb and J. Hesthaven, Spectral methods for hyperbolic problems, J. Comput. Appl. Math., 128 (2001), pp. 83–131. Numerical analysis 2000, Vol. VII, Partial differential equations.

[18] B. Guo and I. Babuska, The h-p version of the finite element method. Part 1 the basic approximation results, Comp. Mech, 1 (1986), pp. 21–41.

[19] B. Guo and I. Babuska, The h-p version of the finite element method. Part 2 general results and applications, Comp. Mech, 1 (1986), pp. 203–226.

[20] J. Hoschek and D. Lasser, Fundamentals of computer aided geometric design, A K Peters Ltd., Wellesley, MA, 1993. Translated from the 1992 German edition by Larry L. Schumaker.

[21] G. E. Karniadakis and S. J. Sherwin, Spectral/hp element methods for computational fluid dynamics, Numerical Mathematics and Scientific Computation, Oxford University Press, New York, second ed., 2005.

[22] I. N. Katz, A. G. Peano, and M. P. Rossow, Nodal variables for complete conforming finite elements of arbitrary polynomial order, Comput. Math. Appl., 4 (1978), pp. 85–112.

[23] R. C. Kirby, Fast simplicial finite element algorithms using Bernstein polynomials, Numer. Math., 117 (2011), pp. 631–652.

[24] M.-J. Lai and L. L. Schumaker, Spline functions on triangulations, vol. 110 of Encyclopedia of Mathematics and its Applications, Cambridge University Press, Cambridge, 2007.

[25] J. M. Melenk and S. Sauter, Wavenumber explicit convergence analysis for Galerkin discretizations of the Helmholtz equation, SIAM J. Numer. Anal., 49 (2011), pp. 1210–1243.

[26] J. M. Melenk and C. Schwab, HP FEM for reaction-diffusion equations. I. Robust exponential convergence, SIAM J. Numer. Anal., 35 (1998), pp. 1520–1557 (electronic).

[27] S. A. Orszag, Spectral methods for problems in complex geometries, J. Comput. Phys., 37 (1980), pp. 70–92.

[28] J. Peters, Evaluation and approximate evaluation of the multivariate Bernstein-Bezier form on a regularly partitioned simplex, ACM Trans. Math. Software, 20 (1994), pp. 460–480.

[29] C. Schwab, p- and hp-Finite Element Methods: Theory and Applications in Solid and Fluid Mechanics, Numerical Mathematics and Scientific Computation, Oxford University Press, 1998.

[30] C. Schwab and M. Suri, The p and hp versions of the finite element method for problems with boundary layers, Math. Comp., 65 (1996), pp. 1403–1429.

[31] B. Szabo, Some recent developments in finite element analysis, Comput. Math. Appl., 5 (1979), pp. 99–115.

[32] B. Szabo and I. Babuska, Finite Element Analysis, John Wiley & Sons, 1991.

[33] P. E. J. Vos, S. J. Sherwin, and R. M. Kirby, From h to p efficiently: implementing finite and spectral/hp element methods to achieve optimal performance for low- and high-order discretisations, J. Comput. Phys., 229 (2010), pp. 5161–5181.

Division of Applied Mathematics, Brown University, 182 George St, Providence RI 02912, USA.

E-mail address: Mark [email protected]
