PYRAMID ALGORITHMS FOR BERNSTEIN-BEZIER FINITE
ELEMENTS OF HIGH, NON-UNIFORM ORDER IN ANY
DIMENSION
MARK AINSWORTH
Abstract. The archetypal pyramid algorithm is the de Casteljau algorithm, which is a standard tool for the evaluation of Bezier curves and surfaces. Pyramid algorithms replace an operation on a single high order polynomial by a recursive sequence of self-similar affine combinations, and are ubiquitous in CAGD for computations involving high order curves and surfaces. Pyramid algorithms have received no attention whatsoever from the high (or low) order finite element community. We develop and analyse pyramid algorithms for the efficient handling of all of the basic finite element building blocks, including the assembly of the element load vectors and element stiffness matrices. The complexity of the algorithm for generating the element stiffness matrix is optimal. A new, non-uniform order, variant of the de Casteljau algorithm is developed that is applicable to the variable polynomial order case but incurs no additional complexity compared with the original algorithm.
1. Introduction
High order finite element methods have been analysed extensively for a wide variety of applications and are known to be capable of producing exponential rates of convergence, even for challenging problems with singularities [6, 18, 19, 29], sharp boundary layers [26, 30], and high frequency oscillations [1, 25].
High order polynomial approximations are commonplace in many areas of scientific computing including computer graphics [15], computer aided geometric design [14, 20], and spectral methods for PDEs [8, 17, 27]. It is commonplace to see the spectral method used with approximation orders in the 100s or 1000s.
Yet, despite theory giving the nod to the use of very high order finite element methods, the range of polynomial degree used in practical finite element computations is rarely larger than eighth order [11, 21, 32]. Possible explanations for the use of comparatively modest polynomial orders include issues of efficiency, implementation, and stability [33]. Moreover, existing implementations of high order finite elements tend to be memory hungry, often relying on the use of precomputed arrays and look-up tables [4, 12, 22], a feature that does not augur well for the future given the nature of emerging computer hardware systems. Whatever the underlying reasons, it is clear that the rather modest polynomial degrees seen in high order finite element analysis are due to practical considerations rather than any theoretical barriers.
1991 Mathematics Subject Classification. 65N30. 65Y20. 65D17. 68U07.
Key words and phrases. Optimal high order finite elements. Non-uniform order de Casteljau Algorithm. High order Bezier surfaces. Computer aided geometric design. Pyramid algorithms.
The author gratefully acknowledges the partial support of this work under AFOSR contract FA9550-12-1-0399.
The archetypal pyramid algorithm [16] is the de Casteljau algorithm [14, 20, 24]; a standard tool for the evaluation of Bezier curves and surfaces, and polynomials expressed in Bernstein-Bezier form in general. The algorithm enjoys widespread usage, despite being sub-optimal [28], thanks to the ease with which it can be implemented, its underlying stability properties and short, simple recursive nature with minimal memory overhead. In essence, the de Casteljau algorithm replaces a single high order polynomial by a recursive sequence of self-similar affine combinations. Pyramid algorithms are ubiquitous in the computer aided geometric design community for computations using high order curves and surfaces [16].
Pyramid algorithms have received no attention whatsoever from the high (or low) order finite element community. One of our aims in the present article is to develop and analyse pyramid algorithms for the efficient handling of all of the basic finite element building blocks, including the assembly of the element load vectors and element stiffness matrices.
Hierarchic bases [11, 31, 32] have been one of the sacred cows of the high order finite element literature from the very outset. The first step towards developing pyramid algorithms for high order finite elements is to dispense with hierarchic bases in favour of the non-hierarchic, Bernstein-Bezier (BB-) basis [13, 24]. Some practitioners may baulk at this prospect, pointing to the freedom hierarchic bases bring to allow varying local approximation order of the elements. However, we shall see that pyramid algorithms bring the same flexibility to the (non-hierarchic) BB-basis at no additional complexity, but, in addition, with the usual advantages associated with pyramid algorithms. Bernstein-Bezier bases have recently been shown to offer some advantages for uniform order finite element approximation [2, 23].
We begin by generalising the Bernstein Decomposition, developed in [5] for uniform approximation order, to the variable order setting. A non-uniform order Bernstein-Bezier basis emerges for which the enforcement of conformity between elements of differing local orders is a natural consequence of the structure of the non-uniform order Bernstein Decomposition.
A new, non-uniform order, variant of the de Casteljau algorithm is developed (expressed as a pyramid algorithm) that retains the favourable features of the original algorithm. The new variant is applicable to the variable polynomial order case but incurs no additional complexity compared with the original algorithm. The classical degree raising algorithm [13, 24] is given a similar treatment giving a new, non-uniform order, degree raising pyramid algorithm. Yet more interestingly, the dual pyramid [16] of our non-uniform order degree raising algorithm gives a pyramid algorithm for the assembly of the non-uniform order element load vector. The complexity of the algorithm is the same as that of the most efficient hierarchic bases currently in use.
An algorithm for the construction of the element matrices in optimal complexity for uniform order approximation on simplicial elements was developed only recently [2]. The efficiency of that algorithm uses the underlying uniformity of the polynomial order across the element in an essential way. There is no existing algorithm that constructs the non-uniform order element matrices in optimal complexity. In the final part of this work, we extend the algorithm of [2] to the non-uniform order case. Moreover, we show that the resulting algorithm achieves the optimal complexity.
VARIABLE ORDER BBFEM 3
2. The Bernstein Decomposition
2.1. Simplices and k-Faces. Let $T$ be a non-degenerate $n$-simplex in $\mathbb{R}^N$ given by $T = \operatorname{conv}(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_{n+1})$, where $n \le N$. A useful property of non-degenerate simplices that will be used repeatedly in the sequel is recorded in the following result:
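Lemma 1. The following conditions are equivalent, i.e. (1) $\Leftrightarrow$ (2):
(1) $T = \operatorname{conv}(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_{n+1})$ is a non-degenerate $n$-simplex.
(2) For all $\mathbf{x} \in T$, there exists a unique set of non-negative scalars $s_1, s_2, \ldots, s_{n+1}$ such that
\[ \sum_{\ell=1}^{n+1} s_\ell \mathbf{x}_\ell = \mathbf{x}; \qquad \sum_{\ell=1}^{n+1} s_\ell = 1. \]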
The convex hull of any subset of $k+1$ distinct points chosen from the $n+1$ points $\mathbf{x}_1, \ldots, \mathbf{x}_{n+1}$ is called a $k$-face of $T$, and is itself a non-degenerate $k$-simplex in $\mathbb{R}^N$. The set of all $k$-faces of $T$ is denoted by $\Delta_k(T)$ and has cardinality $\binom{n+1}{k+1}$. The set $\Delta(T)$ consists of all possible faces of $T$, i.e. $\Delta(T) = \bigcup_{k=0}^{n} \Delta_k(T)$, and has cardinality $\sum_{k=0}^{n} \binom{n+1}{k+1} = 2^{n+1} - 1$. Often, we shall refer to 0-, 1-, 2- and 3-faces as nodes, edges, triangles and tetrahedra respectively.
For example, consider a tetrahedron $T \subset \mathbb{R}^3$. By picking any three of the four available vertices of $T$, we obtain a 2-simplex corresponding to a triangular face of $T$. There are $\binom{4}{3}$ distinct ways to select three vertices of $T$, each of which corresponds to one of the four faces of $T$. Likewise, the edges of the tetrahedron $T$ are 1-simplices specified by a pair of vertices of $T$. A distinct pair of vertices may be chosen in any one of $\binom{4}{2} = 6$ possible ways corresponding to the six edges of $T$. Finally, the four nodes of $T$ are 0-simplices corresponding to $\binom{4}{1}$ possible ways of selecting a single node of $T$. The $k$-faces of the tetrahedron consist of the nodes, edges and faces of $T$ along with the original simplex $T$ itself.
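The face counts in the tetrahedron example can be reproduced by a short enumeration. The following is a minimal sketch, not part of the original development: the function names `k_faces` and `all_faces` are hypothetical, and each face is represented simply by a tuple of 1-based vertex labels in increasing order (the text allows any fixed ordering of the vertices of a face).

```python
from itertools import combinations
from math import comb


def k_faces(n, k):
    """All k-faces of an n-simplex, each as a tuple of 1-based vertex labels."""
    return list(combinations(range(1, n + 2), k + 1))


def all_faces(n):
    """The set Delta(T): every k-face for k = 0, ..., n."""
    return [face for k in range(n + 1) for face in k_faces(n, k)]


# Tetrahedron (n = 3): nodes, edges, triangles and the simplex itself.
n = 3
counts = [len(k_faces(n, k)) for k in range(n + 1)]
print(counts)              # number of k-faces for k = 0, 1, 2, 3
print(len(all_faces(n)))   # total number of faces, 2**(n+1) - 1
```

Representing a face by its vertex labels is only one possible design choice; it is convenient here because the maps between face and simplex indices introduced later depend only on these labels.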
The interplay between a simplex and its $k$-faces will play a central role throughout this article. More precisely, various entities will be associated with a simplex. Often, a particular entity may be interpreted as either belonging to a $k$-face of the original simplex or as being a $k$-simplex in its own right. It will be useful to develop a precise specification of the correspondence between these alternative interpretations of a given entity.
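2.2. Domain Points. The set $\mathcal{D}^p_n(T) = \{\mathbf{x}^T_\alpha : \alpha \in \mathcal{I}^p_n\}$ consists of the domain points of $T$ defined by
\[ \mathbf{x}^T_\alpha = \frac{1}{p} \sum_{\ell=1}^{n+1} \alpha_\ell \mathbf{x}_\ell \tag{1} \]
where $\mathcal{I}^p_n$ is the indexing set $\mathcal{I}^p_n = \{\alpha \in \mathbb{Z}_+^{n+1} : |\alpha| = p\}$. The subset $\mathring{\mathcal{I}}^p_n \subset \mathcal{I}^p_n$ is defined by $\mathring{\mathcal{I}}^p_n = \{\alpha \in \mathbb{N}^{n+1} : |\alpha| = p\}$. The corresponding subset of interior domain points of $T$ is given by $\mathring{\mathcal{D}}^p_n(T) = \{\mathbf{x}^T_\alpha : \alpha \in \mathring{\mathcal{I}}^p_n\}$.
Let $F \in \Delta_k(T)$ be a general $k$-face given by $F = \operatorname{conv}(\mathbf{x}_{\sigma^F_1}, \ldots, \mathbf{x}_{\sigma^F_{k+1}})$ for some fixed ordering $\sigma^F_1, \ldots, \sigma^F_{k+1} \in \{1, \ldots, n+1\}$ of the nodes. The set of domain points of $F$ is defined by $\mathcal{D}^p_k(F) = \{\mathbf{x}^F_\alpha : \alpha \in \mathcal{I}^p_k\}$ where
\[ \mathbf{x}^F_\alpha = \frac{1}{p} \sum_{\ell=1}^{k+1} \alpha_\ell \mathbf{x}_{\sigma^F_\ell}, \tag{2} \]
with the interior domain points $\mathring{\mathcal{D}}^p_k(F)$ defined in an analogous fashion.
Figure 1. Example of a 2-simplex $T = \operatorname{conv}(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3)$ and a 1-face $F = \operatorname{conv}(\mathbf{x}_3, \mathbf{x}_1)$, along with domain points in the case $p = 3$. The interior domain points are indicated by open circles in each case.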
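For example, if $p = 3$, then the domain points of a triangle $T = \operatorname{conv}(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3)$ consist of the nodes of the regular lattice shown in Figure 1 along with their corresponding indices $\alpha \in \mathcal{I}^3_2$. The set $\mathring{\mathcal{I}}^3_2$ consists of the single index $\alpha = (1, 1, 1)$, so that the set $\mathring{\mathcal{D}}^3_2(T)$ consists of the domain point located at the centroid of $T$. The set of domain points $\mathcal{D}^3_1(F)$ of the 1-face, or line, $F = \operatorname{conv}(\mathbf{x}_3, \mathbf{x}_1)$ consists of the equally spaced nodes on $F$ as shown in Figure 1 along with their indices $\alpha \in \mathcal{I}^3_1$. The subset $\mathring{\mathcal{I}}^3_1$ is given by $\{(2, 1), (1, 2)\}$ so that the interior domain points $\mathring{\mathcal{D}}^3_1(F)$ comprise the two nodes located on the interior of $F$. The set of domain points of the 0-face $F = \{\mathbf{x}_1\}$ consists of the node $\mathbf{x}_1$ itself with the indexing set given by $\mathcal{I}^3_0 = \{(3)\}$. Observe that $\mathring{\mathcal{I}}^3_0 = \{(3)\}$ so that in the case of a node, the set of interior domain points $\mathring{\mathcal{D}}^3_0(F)$ coincides with $\mathcal{D}^3_0(F)$.
Let $F \in \Delta_k(T)$ be a general $k$-face given by $F = \operatorname{conv}(\mathbf{x}_{\sigma^F_1}, \ldots, \mathbf{x}_{\sigma^F_{k+1}})$ as before. Let $\alpha^F \in \mathcal{I}^p_k$ be any multi-index and let $\mathbf{x} \in \mathcal{D}^p_k(F)$ be the corresponding domain point on $F$, so that $\mathbf{x} = \mathbf{x}^F_{\alpha^F}$, i.e.
\[ \mathbf{x} = \sum_{m=1}^{k+1} \frac{\alpha^F_m}{p} \mathbf{x}_{\sigma^F_m}; \qquad \sum_{m=1}^{k+1} \frac{\alpha^F_m}{p} = 1. \tag{3} \]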
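The set of domain points $\mathcal{D}^p_k(F)$ is a subset of the domain points $\mathcal{D}^p_n(T)$ of the simplex itself, so there exists an index $\alpha^T \in \mathcal{I}^p_n$ such that $\mathbf{x} = \mathbf{x}^T_{\alpha^T}$, i.e.
\[ \mathbf{x} = \sum_{\ell=1}^{n+1} \frac{\alpha^T_\ell}{p} \mathbf{x}_\ell; \qquad \sum_{\ell=1}^{n+1} \frac{\alpha^T_\ell}{p} = 1. \tag{4} \]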
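Comparing (3) and (4) and using the uniqueness property of Lemma 1 means that
\[ \alpha^T = \varepsilon_{FT}\, \alpha^F, \tag{5} \]
where $\varepsilon_{FT} : \mathcal{I}^p_k \to \mathcal{I}^p_n$ is the mapping defined by
\[ \alpha^T_\ell = \begin{cases} \alpha^F_m, & \text{if } \ell = \sigma^F_m \text{ for some } m \in \{1, \ldots, k+1\}, \\ 0, & \text{otherwise.} \end{cases} \tag{6} \]
It is convenient to introduce the mapping $\iota_{FT} : \mathcal{I}^p_n \to \mathcal{I}^p_k$ defined by the rule
\[ \alpha^F = \iota_{FT}\, \alpha^T \iff \alpha^F_m = \alpha^T_{\sigma^F_m}, \quad m \in \{1, \ldots, k+1\}. \tag{7} \]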
In the interests of retaining a manageable notation, we resist the temptation to explicitly indicate the dependence of $\varepsilon_{FT}$ and $\iota_{FT}$ on $p$, $k$, $n$ and the ordering $\sigma^F$.
The operators $\varepsilon_{FT}$ and $\iota_{FT}$ simply serve to formalise the relationship between multi-indices of the domain points which $F$ and $T$ have in common. For example, the mapping $\varepsilon_{FT}$ between the multi-indices of the common domain points of $F$ and $T$ is indicated by the arrows in Figure 1; e.g. $\varepsilon_{FT}(1, 2) = (2, 0, 1)$, $\varepsilon_{FT}(2, 1) = (1, 0, 2)$, … and $\iota_{FT}(2, 0, 1) = (1, 2)$, $\iota_{FT}(1, 0, 2) = (2, 1)$, etc.
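The rules (6) and (7) amount to scattering and gathering entries of a multi-index according to the vertex ordering of the face. A minimal sketch follows, assuming the ordering $\sigma^F$ is supplied as a tuple of 1-based vertex labels; the function names `epsilon` and `iota` are hypothetical and chosen only to mirror the notation.

```python
def epsilon(alpha_F, sigma_F, n):
    """Rule (6): embed a face multi-index into a multi-index on the n-simplex.

    sigma_F holds the 1-based labels of the vertices of F, in the fixed order
    used to define F; entries not belonging to F are set to zero.
    """
    alpha_T = [0] * (n + 1)
    for m, ell in enumerate(sigma_F):
        alpha_T[ell - 1] = alpha_F[m]
    return tuple(alpha_T)


def iota(alpha_T, sigma_F):
    """Rule (7): pick out the entries of a simplex multi-index that belong to F."""
    return tuple(alpha_T[ell - 1] for ell in sigma_F)


# The face F = conv(x3, x1) of the triangle T = conv(x1, x2, x3).
sigma_F = (3, 1)
print(epsilon((1, 2), sigma_F, n=2))   # multi-index on T for alpha^F = (1, 2)
print(iota((1, 0, 2), sigma_F))        # multi-index on F for alpha^T = (1, 0, 2)
```

Note that `iota(epsilon(a, s, n), s)` returns `a` for every face multi-index, whereas `epsilon(iota(...))` discards any entries of a simplex multi-index that lie off the face.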
With a slight abuse, we shall use the notation $\varepsilon_{FT} \mathcal{D}^p_k(F) \subset \mathcal{D}^p_n(T)$ to denote the domain points of $T$ that belong to $F$. Likewise, $\varepsilon_{FT} \mathring{\mathcal{D}}^p_k(F) \subset \mathcal{D}^p_n(T)$ is used to denote the domain points of $T$ that belong to the interior of $F$. The next result confirms the intuitive expectation that distinct faces have no common interior domain points.
Lemma 2. Let $F \in \Delta_k(T)$ and $G \in \Delta_m(T)$ be distinct faces of $T$. Then, $\varepsilon_{FT} \mathring{\mathcal{D}}^p_k(F)$ and $\varepsilon_{GT} \mathring{\mathcal{D}}^p_m(G)$ are disjoint.
Proof. Suppose otherwise, i.e. there exists a domain point $\mathbf{x} \in \mathcal{D}^p_n(T)$ belonging to both sets. In particular, $\mathbf{x} \in \mathring{\mathcal{D}}^p_k(F)$ (respectively, $\mathbf{x} \in \mathring{\mathcal{D}}^p_m(G)$) so that there exists $\alpha^F \in \mathring{\mathcal{I}}^p_k$ (respectively, $\alpha^G \in \mathring{\mathcal{I}}^p_m$) such that
\[ \mathbf{x} = \sum_{\ell=1}^{k+1} \frac{\alpha^F_\ell}{p} \mathbf{x}_{\sigma^F_\ell} \qquad \Bigl(\text{respectively } \mathbf{x} = \sum_{\ell=1}^{m+1} \frac{\alpha^G_\ell}{p} \mathbf{x}_{\sigma^G_\ell}\Bigr) \]
and
\[ 1 = \sum_{\ell=1}^{k+1} \frac{\alpha^F_\ell}{p} \qquad \Bigl(\text{respectively } 1 = \sum_{\ell=1}^{m+1} \frac{\alpha^G_\ell}{p}\Bigr). \]
Moreover, the coefficients appearing in these expressions are all strictly positive since, for instance, $\alpha^F \in \mathbb{N}^{k+1}$. These expressions give two alternative representations of $\mathbf{x}$ as convex combinations of nodes of $T$ (since $F, G \in \Delta(T)$). Appealing to the uniqueness property of Lemma 1 we deduce that the coefficients must agree in both representations. Moreover, by again appealing to the fact that the coefficients are strictly positive, the nodes themselves must agree in both representations. Hence, $F$ and $G$ share a common set of vertices, i.e. $F = G$, contradicting the assumption of $F$ and $G$ being distinct. □
An immediate consequence of Lemma 2 is the following decomposition of the domain points on the original simplex into the disjoint union of interior domain points on the faces of $T$:
Lemma 3. There holds
\[ \mathcal{D}^p_n(T) = \biguplus_{k=0}^{n} \; \biguplus_{F \in \Delta_k(T)} \varepsilon_{FT} \mathring{\mathcal{D}}^p_k(F) \tag{8} \]
where $\biguplus$ denotes the disjoint union.
Proof. Thanks to Lemma 2 the sets on the right hand side are disjoint and it suffices to check that the cardinalities of the sets match. Using $\#\Delta_k(T) = \binom{n+1}{k+1}$ and $\#\mathring{\mathcal{I}}^p_k = \binom{p-1}{k}$ for $k \le p-1$ (and otherwise zero), the cardinality of the set on the right hand side is given by
\[ \sum_{k=0}^{\min(n,p-1)} \binom{n+1}{k+1} \binom{p-1}{k} = \frac{n+1}{p} \sum_{k=0}^{\min(n,p-1)} \binom{n}{k} \binom{p}{p-1-k}. \]
The summation on the right hand side is given explicitly by the Chu-Vandermonde identity
\[ \sum_{k=0}^{\min(n,p-1)} \binom{n}{k} \binom{p}{p-1-k} = \binom{n+p}{p-1}. \]
Hence,
\[ \sum_{k=0}^{\min(n,p-1)} \binom{n+1}{k+1} \binom{p-1}{k} = \binom{p+n}{n} \tag{9} \]
and the result follows at once thanks to $\#\mathcal{I}^p_n = \binom{p+n}{n}$. □
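The counting argument can be checked numerically for small $n$ and $p$. The sketch below is an illustration only, with hypothetical function names: it builds the right hand side of (8) at the level of multi-indices, by embedding every strictly positive multi-index of each face into the simplex, and compares the result with the full index set and with identity (9).

```python
from itertools import combinations
from math import comb


def multi_indices(length, total, lowest=0):
    """All tuples of `length` integers >= lowest whose entries sum to `total`."""
    if length == 1:
        return [(total,)] if total >= lowest else []
    return [(a,) + rest
            for a in range(lowest, total + 1)
            for rest in multi_indices(length - 1, total - a, lowest)]


def decomposition_check(n, p):
    """Return (#I^p_n, number of embedded interior indices, sets coincide?)."""
    whole = set(multi_indices(n + 1, p))
    reached, count = set(), 0
    for k in range(n + 1):
        for sigma in combinations(range(n + 1), k + 1):   # 0-based vertex labels
            for beta in multi_indices(k + 1, p, lowest=1):
                alpha = [0] * (n + 1)
                for m, ell in enumerate(sigma):
                    alpha[ell] = beta[m]
                reached.add(tuple(alpha))
                count += 1
    return len(whole), count, reached == whole


print(decomposition_check(2, 3))   # triangle with p = 3
```

Equality of `count` with the size of `reached` reflects the disjointness asserted in Lemma 2: no multi-index is produced by two different faces.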
2.3. Barycentric Coordinates. Let $\mathbf{x} \in T$ denote a point in the simplex $T$. Thanks to Lemma 1, for each $\mathbf{x} \in T$, there exists a unique set of non-negative real numbers $\lambda^T_1, \ldots, \lambda^T_{n+1}$ satisfying
\[ 1 = \sum_{\ell=1}^{n+1} \lambda^T_\ell; \qquad \mathbf{x} = \sum_{\ell=1}^{n+1} \lambda^T_\ell \mathbf{x}_\ell. \tag{10} \]
The $(n+1)$-tuple $\lambda^T = (\lambda^T_1, \ldots, \lambda^T_{n+1})$ is the barycentric coordinate vector of the point $\mathbf{x}$ relative to $T$, and the rule $T \ni \mathbf{x} \to \lambda^T \in \mathbb{R}^{n+1}$ defines an affine mapping on $T$.
Let $F \in \Delta_k(T)$ again be a general $k$-face given by $F = \operatorname{conv}(\mathbf{x}_{\sigma^F_1}, \ldots, \mathbf{x}_{\sigma^F_{k+1}})$. Similarly to (10), the barycentric coordinates of a point $\mathbf{x} \in F$ are non-negative real numbers satisfying
\[ 1 = \sum_{\ell=1}^{k+1} \lambda^F_\ell; \qquad \mathbf{x} = \sum_{\ell=1}^{k+1} \lambda^F_\ell \mathbf{x}_{\sigma^F_\ell} \tag{11} \]
and likewise define an affine mapping $F \ni \mathbf{x} \to \lambda^F \in \mathbb{R}^{k+1}$ on $F$.
In view of $F \subset T$, a point $\mathbf{x} \in F$ may equally well be regarded as a point belonging to $T$ so that both $\lambda^T(\mathbf{x})$ and $\lambda^F(\mathbf{x})$ are well-defined. How are these barycentric coordinate vectors related? Expressions (10) and (11), along with Lemma 1, imply that
\[ \lambda^T_\ell = \begin{cases} \lambda^F_m, & \text{if } \ell = \sigma^F_m \text{ for some } m \in \{1, \ldots, k+1\}, \\ 0, & \text{otherwise.} \end{cases} \tag{12} \]
This rule defines a mapping from $\mathbb{R}^{k+1}$ to $\mathbb{R}^{n+1}$. In view of the obvious relationship between this mapping and the mapping $\varepsilon_{FT} : \mathcal{I}^p_k \to \mathcal{I}^p_n$ defined in (6) we shall, with an abuse of notation, also denote the new mapping by $\varepsilon_{FT} : \mathbb{R}^{k+1} \to \mathbb{R}^{n+1}$ so that
\[ \lambda^T(\mathbf{x}) = \varepsilon_{FT}\, \lambda^F(\mathbf{x}), \quad \mathbf{x} \in F. \tag{13} \]
Similarly to (7), the barycentric coordinates of a point $\mathbf{x}$ on $F$ can be obtained from those on $T$ by the rule
\[ \lambda^F(\mathbf{x}) = \iota_{FT}\, \lambda^T(\mathbf{x}), \quad \mathbf{x} \in F, \tag{14} \]
where $\iota_{FT}$ is given by the same rule as previously.
The rule (14) is valid for points $\mathbf{x} \in F$, the domain over which $\lambda^F$ is defined. However, for later purposes, we wish to extend the domain of definition of $\lambda^F$ to the whole of $T$ without compromising property (14). There are several possible ways to extend, but the most natural choice is the barycentric extension defined by the rule
\[ \lambda^F(\mathbf{x}) = \iota_{FT}\, \lambda^T(\mathbf{x}), \quad \mathbf{x} \in T. \tag{15} \]
Strictly speaking, this process defines a new set of functions different from $\lambda^F$. Nonetheless, we shall often simply write $\lambda^F$ to denote the extended function, safe in the knowledge that the functions coincide on the common domain of definition.
In conclusion, the barycentric coordinates over any face $F \in \Delta(T)$ can be identified with a subset of the barycentric coordinates on $T$. The appropriate subset corresponds to the indices chosen when selecting which of the vertices of $T$ are used to obtain $F$.
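2.4. Bernstein Polynomials. Let $T$ be a non-degenerate $n$-simplex in $\mathbb{R}^N$ as before. The Bernstein polynomials of degree $p \in \mathbb{Z}_+$ associated with $T$ are defined by
\[ B^{T,p}_\alpha(\mathbf{x}) = \binom{p}{\alpha} \lambda^T(\mathbf{x})^\alpha, \quad \alpha \in \mathcal{I}^p_n, \; \mathbf{x} \in T \tag{16} \]
where $\lambda^T$ is the barycentric coordinate vector of $\mathbf{x}$ on $T$. The Bernstein polynomials are linearly independent [24]. This, coupled with the observation that the cardinality of the indexing set $\mathcal{I}^p_n$ coincides with the dimension of the space $\mathbb{P}^p_n(T)$, means that any polynomial $u \in \mathbb{P}^p_n(T)$ has a unique BB-form representation
\[ u = \sum_{\alpha \in \mathcal{I}^p_n} c_\alpha B^{T,p}_\alpha. \tag{17} \]
The uniqueness of the BB-vector $\{c_\alpha : \alpha \in \mathcal{I}^p_n\}$ formed from the coefficients of the BB-form means that the set $\Sigma^p_n(T) = \{\varphi_\alpha : \alpha \in \mathcal{I}^p_n\}$, consisting of linear functionals on $\mathbb{P}^p_n(T)$ defined by the rules
\[ \mathbb{P}^p_n(T) \ni u \mapsto \varphi_\alpha(u) = c_\alpha, \quad \alpha \in \mathcal{I}^p_n, \tag{18} \]
is unisolvent with respect to $\mathbb{P}^p_n(T)$, i.e.
\[ u = 0 \iff \varphi_\alpha(u) = 0 \text{ for all } \alpha \in \mathcal{I}^p_n. \tag{19} \]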
Consequently, the triple $(T, \Sigma^p_n(T), \mathbb{P}^p_n(T))$ is a finite element in the sense of [7, 9], which will be referred to as the Bernstein-Bezier finite element (BB-FEM) of (uniform) degree $p$ on $T$. Equation (1) establishes a natural correspondence between the domain points $\mathcal{D}^p_n(T)$ of $T$, the multi-indices $\mathcal{I}^p_n$ and, in turn, the degrees of freedom $\Sigma^p_n(T)$. This correspondence is often exploited by depicting the degrees of freedom on an element graphically using the domain points as in Figure 1.
Let $F \in \Delta_k(T)$ again be a general $k$-face given by $F = \operatorname{conv}(\mathbf{x}_{\sigma^F_1}, \ldots, \mathbf{x}_{\sigma^F_{k+1}})$ as before. One may define a Bernstein basis for $\mathbb{P}^p_k(F)$ by taking
\[ B^{F,p}_\alpha(\mathbf{x}) = \binom{p}{\alpha} \lambda^F(\mathbf{x})^\alpha, \quad \alpha \in \mathcal{I}^p_k, \; \mathbf{x} \in F \]
where $\lambda^F$ is the barycentric coordinate vector of $\mathbf{x}$ on $F$ defined via (11) as before. The same considerations for the element $T$ apply to $F$, so that there is also a natural correspondence between the Bernstein polynomials on $F$ and the domain points $\mathcal{D}^p_k(F)$ of $F$, and one may define the space $\mathring{\mathbb{P}}^p_k(F)$ to be the space spanned by the Bernstein polynomials associated with $\mathring{\mathcal{D}}^p_k(F)$, the interior domain points of $F$:
\[ \mathring{\mathbb{P}}^p_k(F) = \operatorname{span}\{B^{F,p}_\alpha : \alpha \in \mathring{\mathcal{I}}^p_k\}. \tag{20} \]
It is easy to see that if $F \in \Delta_k(T)$, then functions belonging to $\mathring{\mathbb{P}}^p_k(F)$ vanish on all faces $F' \in \Delta_\ell(T)$, $F \ne F'$, for which $\ell \le k$. For this reason, $\mathring{\mathbb{P}}^p_k(F)$ is sometimes described as the set of internal functions on $F$. It should, however, be borne in mind that $\mathring{\mathbb{P}}^p_k(F) = \mathbb{P}^p_k(F)$ if $F \in \Delta_0(T)$.
How are the Bernstein polynomials defined on a face $F$ related to the Bernstein polynomials defined on the original simplex $T$? It has already been seen that the barycentric coordinate vector $\lambda^F$ of a point $\mathbf{x}$ on $F$ is related to the barycentric coordinate vector $\lambda^T$ of the same point $\mathbf{x}$, viewed as belonging to $T$, according to the formula (13),
\[ \lambda^T = \varepsilon_{FT}\, \lambda^F. \]
Likewise, for $\alpha^F \in \mathcal{I}^p_k$, the corresponding multi-index $\alpha^T \in \mathcal{I}^p_n$ is given by
\[ \alpha^T = \varepsilon_{FT}\, \alpha^F. \]
In view of these identities and the definition of $\varepsilon_{FT}$, there holds $\alpha^T! = \alpha^F!$ and $\lambda^T(\mathbf{x})^{\alpha^T} = \lambda^F(\mathbf{x})^{\alpha^F}$. Hence, the Bernstein polynomials are related as follows
\[ B^{F,p}_{\alpha^F}(\mathbf{x}) = B^{T,p}_{\alpha^T}(\mathbf{x}) = B^{T,p}_{\varepsilon_{FT}\alpha^F}(\mathbf{x}), \quad \mathbf{x} \in F. \tag{21} \]
As before, it will be convenient to define an extension of a Bernstein polynomial defined on a face $F \in \Delta(T)$ to the whole of $T$. The natural choice is again the barycentric extension based on the identity (21); thus, we define the extension by the rule
\[ E_{FT} B^{F,p}_\alpha(\mathbf{x}) = B^{T,p}_{\varepsilon_{FT}\alpha}(\mathbf{x}), \quad \mathbf{x} \in T, \; \alpha \in \mathcal{I}^p_k. \tag{22} \]
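Identity (21) is easy to confirm numerically. The sketch below is illustrative only and works directly with barycentric coordinates rather than physical points; the names `multinomial`, `bernstein` and `embed` are hypothetical, and the multinomial coefficient is taken to be $p!/(\alpha_1! \cdots \alpha_{n+1}!)$, the usual meaning of $\binom{p}{\alpha}$.

```python
from math import factorial, prod


def multinomial(p, alpha):
    """The coefficient p! / (alpha_1! ... alpha_m!), assuming sum(alpha) == p."""
    coeff = factorial(p)
    for a in alpha:
        coeff //= factorial(a)
    return coeff


def bernstein(alpha, lam):
    """Bernstein polynomial of degree |alpha| at barycentric coordinates lam."""
    p = sum(alpha)
    return multinomial(p, alpha) * prod(l ** a for l, a in zip(lam, alpha))


def embed(values, sigma_F, n):
    """Zero-extension from a face to the simplex (rules (6) and (12))."""
    out = [0] * (n + 1)
    for m, ell in enumerate(sigma_F):
        out[ell - 1] = values[m]
    return tuple(out)


# Face F = conv(x3, x1) of a triangle, point with lambda^F = (1/4, 3/4).
sigma_F, n = (3, 1), 2
lam_F = (0.25, 0.75)
lam_T = embed(lam_F, sigma_F, n)
alpha_F = (1, 2)
alpha_T = embed(alpha_F, sigma_F, n)
print(bernstein(alpha_F, lam_F), bernstein(alpha_T, lam_T))
```

The zero entries of `lam_T` are harmless because they are always raised to the power zero, which Python evaluates as one.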
2.5. Bernstein Decomposition. Identity (8) decomposes the domain points on a simplex $T$ into the sum of the interior domain points on the faces of $T$. There is a natural correspondence between the Bernstein polynomials $\{B^{T,p}_\alpha : \alpha \in \mathcal{I}^p_n\}$ on a simplex $T$ and the domain points $\mathcal{D}^p_n(T) = \{\mathbf{x}_\alpha : \alpha \in \mathcal{I}^p_n\}$ of $T$. One may therefore expect that the polynomials on the simplex may be similarly decomposed into the sum of polynomials associated with the interior domain points on the faces of the simplex as follows
\[ \mathbb{P}^p_n(T) = \bigoplus_{k=0}^{n} \; \bigoplus_{F \in \Delta_k(T)} E_{FT} \mathring{\mathbb{P}}^p_k(F) \tag{23} \]
where
\[ E_{FT} \mathring{\mathbb{P}}^p_k(F) = \operatorname{span}\{E_{FT} B^{F,p}_\alpha : \alpha \in \mathring{\mathcal{I}}^p_k\} = \operatorname{span}\{B^{T,p}_{\varepsilon_{FT}\alpha} : \alpha \in \mathring{\mathcal{I}}^p_k\}. \tag{24} \]
The identity (23) is referred to as the Bernstein decomposition [5], and is readily seen to hold since the spaces on the right hand side are disjoint subspaces of $\mathbb{P}^p_n(T)$, thanks to Lemma 2, and then using the same dimension count as in the proof of (8).
2.6. Non-Uniform Order Bernstein-Bezier Finite Element. The Bernstein decomposition (23) shows that the basis functions for an element $T$ can be split into (disjoint) subsets associated with the faces of $T$. The splitting of a simplex into simpler topological components forms the foundation for the construction of basis functions for higher order finite element spaces [32].
The splitting is merely a convenience in the case where polynomials of the same, uniform, order are used throughout the whole finite element partition. However, the splitting becomes crucial when non-uniform order approximation is employed, as is the case with adaptive $p$- and $hp$-version finite element methods. The splitting justifies the use of an arbitrary local polynomial order $p_F \in \mathbb{N}$ independently specified for each face $F \in \Delta(T)$ of the simplex $T$. The associated non-uniform order finite element space will be denoted by $\mathbb{P}^{\vec{p}}_n(T)$, where $\vec{p}$ is the degree vector given by $\vec{p} = \{p_F : F \in \Delta(T)\}$, and has dimension
\[ \dim \mathbb{P}^{\vec{p}}_n(T) = \sum_{k=0}^{n} \sum_{F \in \Delta_k(T)} \dim \mathring{\mathbb{P}}^{p_F}_k(F). \tag{25} \]
The degree of the space $\mathbb{P}^{\vec{p}}_n(T)$ is defined by $p_{\max} = \max \vec{p}$. Likewise, define $p_{\min} = \min \vec{p}$.
Traditionally, a basis for this space was constructed by equipping each geometric entity with a hierarchy of basis functions of increasing polynomial degree [3, 11, 12, 32]. Hierarchic bases have been all-pervasive in the treatment of non-uniform order approximation to the extent that any serious alternative is almost inconceivable. Their universal acceptance is largely due to the ease with which inter-element conformity may be maintained through lowering (resp. raising) the polynomial degree on a face through the omission (resp. addition) of higher order hierarchical basis functions to achieve the desired local order of approximation on the face.
In contrast, the non-hierarchic nature of the Bernstein polynomials would seem to imply that varying the polynomial degree cannot be achieved, or at least, not without prohibitive complications. Indeed, there is virtually a complete absence of available non-hierarchic bases for conforming finite element approximation using non-uniform local orders.
These considerations notwithstanding, we argue that a (non-hierarchic) Bernstein polynomial basis is not only completely natural but computationally attractive, even in the case of non-uniform polynomial degree. Concerning the former claim, the uniform order Bernstein decomposition (23) leads quite naturally to the following definition of the non-uniform order space:
\[ \mathbb{P}^{\vec{p}}_n(T) = \bigoplus_{k=0}^{n} \; \bigoplus_{F \in \Delta_k(T)} E_{FT} \mathring{\mathbb{P}}^{p_F}_k(F). \tag{26} \]
In other words, in recognition of the importance of the Bernstein decomposition in the case of uniform order approximation, the non-uniform order space is constructed so that a non-uniform Bernstein decomposition, expressed in (26), is maintained.
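How about the choice of a basis for the non-uniform order space? Thanks to (24) and (26), it is natural to select a non-uniform order Bernstein basis as follows:
\[ \mathbb{P}^{\vec{p}}_n(T) = \operatorname{span} \bigoplus_{k=0}^{n} \; \bigoplus_{F \in \Delta_k(T)} \bigl\{ B^{T,p_F}_{\varepsilon_{FT}\alpha} : \alpha \in \mathring{\mathcal{I}}^{p_F}_k \bigr\}, \tag{27} \]
in conjunction with an associated non-uniform indexing set $\mathcal{I}^{\vec{p}}_T$ defined by
\[ \mathcal{I}^{\vec{p}}_T = \biguplus_{k=0}^{n} \; \biguplus_{F \in \Delta_k(T)} \bigl\{ \varepsilon_{FT}\alpha : \alpha \in \mathring{\mathcal{I}}^{p_F}_k \bigr\}. \tag{28} \]
The spaces on the right hand side of (26) are disjoint thanks to Lemma 2, and it follows that the dimension of this space satisfies (25).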
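The non-uniform indexing set (28) can be assembled face by face. The following sketch is an illustration under stated assumptions rather than the author's implementation: the degree vector is assumed to be stored as a dictionary keyed by the sorted tuple of 1-based vertex labels of each face, and the names `interior_indices` and `nonuniform_index_set` are hypothetical. The degrees used are those quoted in the caption of Figure 2.

```python
from itertools import combinations
from math import comb


def interior_indices(k, p):
    """Multi-indices of length k+1 with strictly positive entries summing to p."""
    if k == 0:
        return [(p,)] if p >= 1 else []
    return [(a,) + rest
            for a in range(1, p)
            for rest in interior_indices(k - 1, p - a)]


def nonuniform_index_set(n, degree):
    """Map each embedded multi-index in (28) to the face that owns it."""
    owner = {}
    for k in range(n + 1):
        for face in combinations(range(1, n + 2), k + 1):
            for beta in interior_indices(k, degree[face]):
                alpha = [0] * (n + 1)
                for m, ell in enumerate(face):
                    alpha[ell - 1] = beta[m]
                owner[tuple(alpha)] = face
    return owner


# Degree vector of the triangle in Figure 2.
degree = {(1,): 1, (2,): 3, (3,): 3,
          (1, 2): 3, (2, 3): 2, (1, 3): 4,
          (1, 2, 3): 4}
index_set = nonuniform_index_set(2, degree)
dimension = sum(comb(p_F - 1, len(face) - 1) for face, p_F in degree.items())
print(len(index_set), dimension)
```

Storing the owning face alongside each multi-index is a convenience of this sketch; the degree $p_F$ of the associated Bernstein polynomial can in any case be recovered as the sum of the entries of the multi-index.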
The definition of the Bernstein-Bezier finite element triple $(T, \Sigma^p_n(T), \mathbb{P}^p_n(T))$ is easily generalised to the non-uniform order case. In particular, the linear independence of the set (27) means that any function $u \in \mathbb{P}^{\vec{p}}_n(T)$ may be written uniquely in the form
\[ u = \sum_{k=0}^{n} \sum_{F \in \Delta_k(T)} \sum_{\alpha \in \mathring{\mathcal{I}}^{p_F}_k} c_{\varepsilon_{FT}\alpha} B^{T,p_F}_{\varepsilon_{FT}\alpha}. \tag{29} \]
The uniqueness of the non-uniform BB-vector formed from these coefficients means that linear functionals on $\mathbb{P}^{\vec{p}}_n(T)$ may be defined by the rules
\[ \mathbb{P}^{\vec{p}}_n(T) \ni u \mapsto \varphi_\alpha(u) = c_\alpha, \quad \alpha \in \mathcal{I}^{\vec{p}}_T \tag{30} \]
and, in turn, that the set of degrees of freedom given by
\[ \Sigma^{\vec{p}}_n(T) = \bigl\{ \varphi_\alpha : \alpha \in \mathcal{I}^{\vec{p}}_T \bigr\} \tag{31} \]
is unisolvent with respect to $\mathbb{P}^{\vec{p}}_n(T)$. The triple $(T, \Sigma^{\vec{p}}_n(T), \mathbb{P}^{\vec{p}}_n(T))$ is referred to as the Bernstein-Bezier finite element of non-uniform degree $\vec{p}$ on $T$. Furthermore, the natural correspondence between the domain points $\mathcal{D}^p_n(T)$ of $T$ and the degrees of freedom $\Sigma^p_n(T)$ in the case of uniform order elements carries over directly to the non-uniform order case, and may again be exploited to depict the degrees of freedom graphically as in Figure 2.
2.7. Non-Uniform Order BB-FEM on a Partition. Of course, the selection of a non-uniform order basis for an individual element, viewed in isolation, is a far simpler proposition than that of developing a basis for a finite element space on a partitioning of a domain into simplices. The main consideration then becomes the desire to maintain conformity (continuity) of the non-uniform order approximation across interfaces (edges, faces, $k$-simplices in higher dimensions, etc.) between adjacent elements in the partition. Indeed, this is the chief factor driving the widespread usage of hierarchic bases.
Figure 2. Degrees of freedom on a non-uniform order element $T = \operatorname{conv}(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3)$ and on its faces, in the case $p_T = 4$, $p_{12} = 3$, $p_{23} = 2$, $p_{13} = 4$, $p_1 = 1$, $p_2 = 3$, $p_3 = 3$. The degrees of freedom associated with a particular entity are indicated by open circles.
How does our non-uniform order Bernstein basis fare in this setting? We have reached the crux of the matter: the element level basis (27) automatically gives a basis for a globally conforming finite element space.
For instance, if distinct elements $T$ and $T' \in \mathcal{P}$ share a common $k$-face $F$, then for each $\alpha \in \mathring{\mathcal{I}}^{p_F}_k$, the face $F$ contributes a Bernstein polynomial $B^{F,p_F}_\alpha$ to the basis. On the face $F$, the value of the basis function is given by the $k$-variate Bernstein polynomial $B^{F,p_F}_\alpha$, whilst on the element $T$ (respectively $T'$), the value is given by the $n$-variate Bernstein polynomial basis function $B^{T,p_F}_{\varepsilon_{FT}\alpha}$ (respectively $B^{T',p_F}_{\varepsilon_{FT'}\alpha}$). The values of all of these Bernstein polynomials agree on the common interface. In other words, conformity is guaranteed by construction. In essence, all elements containing the $k$-face $F$ take their local basis functions for the interface from $F$ itself, viz. the Bernstein polynomials at the interior domain points of $F$. It is worth emphasising that, as we showed earlier, the value of the Bernstein polynomial on any of the entities $T$, $T'$ or $F$ is computed using information local to the given entity. No inter-element communication is implied or required.
In conclusion, the Bernstein polynomials provide a completely natural basis for non-uniform order polynomial approximation. The basis is natural in the sense that it respects the underlying topological structure of the elements manifested in the nodal splitting (8) and the algebraic splitting of the polynomial space in the form of the Bernstein decomposition.
3. Pyramid Schemes
The non-uniform order finite element basis uses Bernstein polynomials of differing polynomial orders at both the element level and global level. Our next objective is the development of simple, efficient and stable algorithms for handling this basis.
3.1. Non-Uniform Degree De Casteljau Algorithm. Let $u_T \in \mathbb{P}^{\vec{p}}_n(T)$ be a non-uniform degree polynomial approximation on an element $T$ expressed in non-uniform order BB-form: that is,
\[ u_T = \sum_{k=0}^{n} \sum_{F \in \Delta_k(T)} \sum_{\alpha \in \mathring{\mathcal{I}}^{p_F}_k} c_{\varepsilon_{FT}\alpha} B^{T,p_F}_{\varepsilon_{FT}\alpha}. \tag{32} \]
The de Casteljau algorithm [10, 14, 24] is the standard approach to the pointwise evaluation of a polynomial (or its derivatives) written in uniform order BB-form.
Can the de Casteljau algorithm be modified to enable evaluation of polynomials expressed in non-uniform order BB-form? One could take advantage of the structure (27), and simply apply the de Casteljau algorithm to the Bernstein representations on each face $F \in \Delta(T)$ separately, followed by summing the contributions. Clearly, such an approach would be rather inefficient. If, for example, the local orders are uniform, then a fresh de Casteljau cascade would be spawned unnecessarily for every face. Fortunately, there are better alternatives.
We begin with a brief review of the de Casteljau algorithm in the case of uniform polynomial degree on the element $T$ where $u_T$ has the form
\[ u_T = \sum_{\alpha \in \mathcal{I}^p_n} c^{(p)}_\alpha B^{T,p}_\alpha. \tag{33} \]
The algorithm is based on the identity
\[ B^{T,p}_\alpha = \sum_{\ell=1}^{n+1} \lambda_\ell B^{T,p-1}_{\alpha - e_\ell} \tag{34} \]
where $\lambda_\ell$ are the (fixed) barycentric coordinates of the given point of interest $\mathbf{x} \in T$. Inserting this formula into the expression for $u_T$ and simplifying gives the alternative representation
\[ u_T = \sum_{\alpha \in \mathcal{I}^{p-1}_n} c^{(p-1)}_\alpha B^{T,p-1}_\alpha \tag{35} \]
where
\[ c^{(p-1)}_\alpha = \sum_{\ell=1}^{n+1} \lambda_\ell c^{(p)}_{\alpha + e_\ell}, \quad \alpha \in \mathcal{I}^{p-1}_n. \tag{36} \]
The net effect of these manipulations is to reduce the degree of the representation for $u_T$ from order $p$ in (33) to order $p-1$ in (35). We emphasise that the coefficients in the reduced order representation now depend on the barycentric coordinates. The de Casteljau algorithm consists of iterating this process to obtain the zeroth order representation $u_T = c^{(0)}_0 B^{T,0}_0 = c^{(0)}_0$, from which the value of $u_T(\mathbf{x})$ can simply be read off.
The de Casteljau procedure is an example of a pyramid algorithm [16]. The reason for this nomenclature becomes clear when the procedure (in the case of one spatial dimension, $n = 1$) is depicted graphically as shown in Figure 3. Figure 3(a) shows the coefficients generated using the de Casteljau procedure starting with the coefficients of the initial degree $p = 4$ approximation in the bottom row of the pyramid. Figure 3(b) shows the rule, or computational atom, for ascending from one level to the next in the pyramid. In the present case, the atom depicts the rule
\[ c^{(d-1)}_\alpha = \lambda_1 c^{(d)}_{\alpha + e_1} + \lambda_2 c^{(d)}_{\alpha + e_2}, \tag{37} \]
obtained from (36) in the special case $n = 1$. Pyramid algorithms defined by computational atoms of this type will form the foundation of our approach. Such algorithms are highly computationally attractive (simple short recurrence, explicit, stable, minimal memory access, …). Here, the pyramid scheme reduces the evaluation of a high degree polynomial to a sequence of stable, affine combinations.
Figure 3. (a) Pyramid scheme for de Casteljau algorithm in the case of an interval $n = 1$. (b) Computational atom (see (37)).
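The recurrence (36) translates almost directly into code. The sketch below is a plain, unoptimised illustration for a simplex of any dimension, assuming the BB-vector is stored as a dictionary from multi-indices to coefficients; this storage scheme and the function names are assumptions made for the example, not something prescribed in the text.

```python
def multi_indices(length, total):
    """All tuples of `length` non-negative integers summing to `total`."""
    if length == 1:
        return [(total,)]
    return [(a,) + rest
            for a in range(total + 1)
            for rest in multi_indices(length - 1, total - a)]


def de_casteljau(coeffs, lam):
    """Evaluate a uniform order BB-form at barycentric coordinates lam.

    coeffs maps every multi-index alpha with |alpha| = p to c_alpha.
    Each pass applies the atom (36), lowering the degree by one.
    """
    m = len(lam)
    p = sum(next(iter(coeffs)))
    level = dict(coeffs)
    for d in range(p, 0, -1):
        level = {
            alpha: sum(lam[l] * level[alpha[:l] + (alpha[l] + 1,) + alpha[l + 1:]]
                       for l in range(m))
            for alpha in multi_indices(m, d - 1)
        }
    return level[(0,) * m]


# Cubic on a triangle with c_alpha = alpha_1 / p, which represents lambda_1.
lam = (0.2, 0.3, 0.5)
coeffs = {alpha: alpha[0] / 3 for alpha in multi_indices(3, 3)}
print(de_casteljau(coeffs, lam))
```

Every step is a convex combination of the previous level's coefficients when `lam` lies in the simplex, which is the source of the stability referred to above. A production code would typically overwrite a single array in place rather than allocate a dictionary per level.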
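The following result forms the basis for generalising the de Casteljau procedure to the non-uniform order case:
Lemma 4. Let $u_T \in \mathbb{P}^{\vec{p}}_n(T)$ be given in non-uniform BB-form (32), and let $\mathbf{x} \in T$ be a given point with barycentric coordinates $\lambda_1, \ldots, \lambda_{n+1}$. Then, there holds for $d = p_{\max}, p_{\max} - 1, \ldots, 0$,
\[ u_T(\mathbf{x}) = \sum_{\alpha \in \mathcal{I}^d_n} c^{(d)}_\alpha B^{T,d}_\alpha(\mathbf{x}) + \sum_{k=0}^{n} \sum_{\substack{F \in \Delta_k(T): \\ p_F < d}} \sum_{\beta \in \mathring{\mathcal{I}}^{p_F}_k} c_{\varepsilon_{FT}\beta} B^{T,p_F}_{\varepsilon_{FT}\beta}(\mathbf{x}) \tag{38} \]
where, for $r = p_{\max}, p_{\max} - 1, \ldots, 0$ and $\alpha \in \mathcal{I}^r_n$,
\[ c^{(r)}_\alpha = \sum_{\ell=1}^{n+1} \lambda_\ell c^{(r+1)}_{\alpha + e_\ell} + \begin{cases} c_\alpha & \text{if } \alpha \in \mathcal{I}^{\vec{p}}_T, \\ 0 & \text{otherwise,} \end{cases} \tag{39} \]
and where $c^{(p_{\max}+1)}_\alpha = 0$, $\alpha \in \mathcal{I}^{p_{\max}+1}_n$. In particular, the value of $u_T(\mathbf{x})$ is given by $c^{(0)}_0$.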
Proof. The result may be shown using mathematical induction. In particular, when $d = r = p_{\max}$, equations (38) and (39) are equivalent to (32).
Assume (38) holds for the case $d = r+1$. Inserting identity (34) in the case $p = r+1$ into the first term in (38), and employing the argument used in (35) and (36), gives the first term on the right hand side of (39). The sum over $p_F < d = r+1$ in the second term on the right hand side of (38) is split into the cases (a) $p_F = r$ and (b) $p_F < r = d-1$. The second term on the right hand side of (39) is the coefficient of the Bernstein polynomial $B^{T,r}_{\alpha}$, $\alpha \in I^{r}_{n}$, in case (a) where $p_F = r$. Finally the sum
Figure 4. Pyramid scheme for non-uniform degree de Casteljau algorithm to evaluate the function (40). (a) Coefficients of the non-uniform degree representation placed at appropriate nodes in the pyramid. (b) Result of using the standard de Casteljau atom to propagate coefficients through the pyramid augmented with coefficients from non-uniform representation.
in the case (b) is the second term on the right hand side of (38) with $d = r$. This completes the inductive step, and the result follows for $d = p_{\max}, p_{\max}-1, \ldots$ □
The idea presented in Lemma 4 becomes rather simple when freed of notational distractions. We illustrate the algorithm for the particular case where $T$ is an interval $I$ with endpoints at $L$ and $R$, and degree vector given by $p_L = 1$, $p_R = 3$ and $p_I = 4$. The function we wish to evaluate is given by
\[ u_I = c^{L,1}_{1} B^{L,1}_{1} + \sum_{\alpha \in I^{4}_{1}} c^{I,4}_{\alpha} B^{I,4}_{\alpha} + c^{R,3}_{3} B^{R,3}_{3} \tag{40} \]
where $B^{L,1}_{1}$, $B^{I,4}_{\alpha}$ and $B^{R,3}_{3}$ are the univariate Bernstein polynomials associated with the vertices and interior of the interval.
Figure 4(a) shows the coefficients in the non-uniform degree representation placed in the pyramid at positions corresponding to their multi-indices. For uniform order approximation, this would mean that the lowest row of the pyramid is fully occupied, but this is not the case for the non-uniform representation. Figure 4(b) shows the pyramid obtained after applying the standard de Casteljau computational atom to ascend the pyramid. The coefficients initially residing at higher levels are assimilated into the recursion as the pyramid is ascended. The uppermost node of the pyramid contains the value of the non-uniform approximation at the point whose barycentric coordinates are given by $\lambda_1, \lambda_2$:
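\begin{align*}
c^{(0)}_{0} &= \lambda_1 c^{L,1}_{1} + 4\lambda_1^{3}\lambda_2\, c^{I,4}_{31} + 6\lambda_1^{2}\lambda_2^{2}\, c^{I,4}_{22} + 4\lambda_1\lambda_2^{3}\, c^{I,4}_{13} + \lambda_2^{3}\, c^{R,3}_{3} \\
&= c^{L,1}_{1} B^{L,1}_{1} + c^{I,4}_{31} B^{I,4}_{31} + c^{I,4}_{22} B^{I,4}_{22} + c^{I,4}_{13} B^{I,4}_{13} + c^{R,3}_{3} B^{R,3}_{3} \\
&= u_I .
\end{align*}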
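The interval example can be reproduced in a few lines. The following Python sketch is an illustration under stated assumptions rather than the paper's implementation: the function names are hypothetical, position `j` of a level of degree `d` is taken to hold the multi-index $(d-j, j)$, and the interior coefficients are passed as the triple $(c^{I,4}_{31}, c^{I,4}_{22}, c^{I,4}_{13})$.

```python
def evaluate_example_40(c_left, c_interior, c_right, lam1, lam2):
    """Non-uniform de Casteljau sweep for (40): p_L = 1, p_I = 4, p_R = 3.

    Position j of a level of degree d holds the multi-index (d - j, j);
    this layout is an assumption made for the sketch.
    """
    c31, c22, c13 = c_interior
    level = [0.0, c31, c22, c13, 0.0]  # degree 4 row, vertex slots empty
    # degree -> (position, coefficient) of a vertex coefficient held in reserve
    reserved = {3: (3, c_right), 1: (0, c_left)}
    for d in range(4, 0, -1):
        if d in reserved:
            j, c = reserved[d]
            level[j] += c  # assimilate before ascending from level d
        level = [lam1 * level[j] + lam2 * level[j + 1] for j in range(d)]
    return level[0]


def direct_example_40(c_left, c_interior, c_right, lam1, lam2):
    """The closed form for c^(0)_0 displayed above, used as a cross-check."""
    c31, c22, c13 = c_interior
    return (lam1 * c_left
            + 4 * lam1 ** 3 * lam2 * c31
            + 6 * lam1 ** 2 * lam2 ** 2 * c22
            + 4 * lam1 * lam2 ** 3 * c13
            + lam2 ** 3 * c_right)
```

The only departure from the uniform sweep is the `reserved` lookup, which adds the degree 3 and degree 1 vertex coefficients when the recursion reaches their level.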
The above example dealt with the simplest case of an interval in order to keep the figures as simple as possible, but the treatment of triangles, tetrahedra and higher dimensional simplices is virtually identical.
The procedure in the general case is presented in Algorithm 1. The basic idea behind the generalisation to the non-uniform degree case recognises that the standard de Casteljau reduces the degree of the Bernstein representation recursively. The generalised de Casteljau algorithm merely keeps the Bernstein coefficients for a face $F$ of degree $p_F$ in reserve, until the degree $d$ of the recursion has been reduced to $d = p_F$. At that stage, the Bernstein coefficients on $F$ are appended to the Bernstein coefficients of the de Casteljau representation, and the standard de Casteljau recursion resumes as usual.
Lemma 5. The non-uniform de Casteljau Algorithm 1 requires
\[ p_{\max} \dim \mathbb{P}^{p_{\max}}_{n}(T) + \dim \mathbb{P}^{\vec{p}}_{n}(T) = O(p_{\max}^{n+1}) \tag{41} \]
operations to evaluate an element of $\mathbb{P}^{\vec{p}}_{n}(T)$ at a given point $x \in T$.
Proof. Every coefficient in the representation (32) is handled precisely once in the assimilation step, so a total of $\# I^{\vec{p}}_{T}$ additions are entailed. For $d = p_{\max}, p_{\max}-1, \ldots$, the de Casteljau step requires $n+1$ operations for each index $\alpha \in I^{d-1}_{n}$, giving a total operation count of
\[ (n+1)\binom{p_{\max}-1+n}{n} + (n+1)\binom{p_{\max}-2+n}{n} + \ldots = p_{\max}\binom{p_{\max}+n}{n}, \]
and the result follows. □
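The closed form for the operation count can be checked with the hockey-stick identity $\sum_{j=0}^{m}\binom{j+n}{n} = \binom{m+n+1}{n+1}$. The short derivation below is a verification added for the reader; the summation index $j = d-1$ is introduced only for convenience.

```latex
\begin{align*}
\sum_{d=1}^{p_{\max}} (n+1)\binom{d-1+n}{n}
  &= (n+1)\sum_{j=0}^{p_{\max}-1}\binom{j+n}{n}
   = (n+1)\binom{p_{\max}+n}{n+1} \\
  &= (n+1)\,\frac{(p_{\max}+n)!}{(n+1)!\,(p_{\max}-1)!}
   = p_{\max}\,\frac{(p_{\max}+n)!}{n!\,p_{\max}!}
   = p_{\max}\binom{p_{\max}+n}{n}.
\end{align*}
```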
The corresponding count for the standard de Casteljau algorithm is $p \dim \mathbb{P}^{p}_{n}(T)$ operations, showing that the non-uniform variant is of the same complexity as the standard algorithm to leading order.
A practical implementation of Algorithm 1 requires a data structure allowing for the efficient identification of faces $F$ of degree $p_F = d$ in the assimilation step (such as an ordered list of faces, ordered by degree, that would require only a single traverse during the entire procedure). If the function to be evaluated happens to be of uniform order, then the first execution of the assimilation step would exhaust the list of faces, after which the general procedure reduces to the standard (uniform order) de Casteljau algorithm. Thus, Algorithm 1 is just as efficient as the standard de Casteljau algorithm.
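One possible realisation of Algorithm 1 on a simplex of any dimension is sketched below in Python. It is a minimal sketch under assumptions, not the paper's code: coefficients are supplied as a dictionary keyed by the full multi-index (so the embedding $\varepsilon_{FT}$ is assumed to have been applied already), bucketing by degree stands in for the ordered list of faces, and all function names are hypothetical.

```python
from math import factorial, prod


def multi_indices(m, d):
    """All tuples of m non-negative integers that sum to d."""
    if m == 1:
        return [(d,)]
    return [(i,) + rest
            for i in range(d, -1, -1)
            for rest in multi_indices(m - 1, d - i)]


def nonuniform_de_casteljau(coeffs, lam):
    """Evaluate sum_alpha coeffs[alpha] * B^{|alpha|}_alpha at barycentric point lam.

    coeffs maps full multi-indices (tuples of length n + 1, degree >= 1)
    to coefficients; this input format is an assumption of the sketch.
    """
    m = len(lam)
    # Bucket coefficients by degree so that each one is visited exactly once.
    by_degree = {}
    for alpha, c in coeffs.items():
        by_degree.setdefault(sum(alpha), []).append((alpha, c))
    p_max = max(by_degree)
    level = {alpha: 0.0 for alpha in multi_indices(m, p_max)}
    for d in range(p_max, 0, -1):
        for alpha, c in by_degree.get(d, []):  # assimilation step
            level[alpha] += c
        above = {}
        for alpha in multi_indices(m, d - 1):  # standard de Casteljau atom
            above[alpha] = sum(
                lam[l] * level[alpha[:l] + (alpha[l] + 1,) + alpha[l + 1:]]
                for l in range(m))
        level = above
    return level[(0,) * m]


def direct_sum(coeffs, lam):
    """Term-by-term evaluation, used only as a cross-check."""
    total = 0.0
    for alpha, c in coeffs.items():
        multinomial = factorial(sum(alpha)) // prod(factorial(a) for a in alpha)
        total += c * multinomial * prod(x ** a for x, a in zip(lam, alpha))
    return total
```

A dictionary per level is convenient for a sketch; a production code would more likely use flat arrays with a fixed ordering of the multi-indices.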
3.2. Degree Raising. A polynomial of degree $p$ may be expressed in terms of Bernstein polynomials of degree $p$, but could also be written in terms of Bernstein polynomials of any greater degree. Degree raising [14, 24] is a useful technique for visualisation and other applications involving uniform order BB-form representations. A pyramid scheme for the uniform order degree raising may be derived starting again with (33), but this time we substitute for $B^{T,p}_{\alpha}$ using the identity
\[ B^{T,p}_{\alpha} = \sum_{\ell=1}^{n+1} \lambda_\ell B^{T,p}_{\alpha} = \sum_{\ell=1}^{n+1} \frac{\alpha_\ell + 1}{p+1}\, B^{T,p+1}_{\alpha+e_\ell} \]
to obtain
\[ u_T = \sum_{\alpha \in I^{p+1}_{n}} c^{(p+1)}_{\alpha} B^{T,p+1}_{\alpha} \tag{42} \]
Algorithm 1: The non-uniform order de Casteljau algorithm.
Input: Coefficients in the non-uniform Bernstein representation (32) of $u_T$: $\{c^{F,p_F}_{\alpha} : \alpha \in I^{p_F}_{k}\}$ for each $F \in \Delta_k(T)$, $k = 0, \ldots, n$.
Output: Value of $u_T$ at point $x \in T$ with barycentric coordinates $\lambda_k$, $k = 1, \ldots, n+1$.
Initialise $c^{(p_{\max})}_{\alpha} = 0$ for all $\alpha \in I^{p_{\max}}_{n}$;
for $d = p_{\max}, p_{\max}-1, \ldots, 1$ do
    // Assimilate Bernstein coefficients on faces of degree d
    foreach $F \in \Delta_k(T)$, $k = 0, \ldots, n$ such that $p_F = d$ do
        $c^{(d)}_{\varepsilon_{FT}\alpha} \mathrel{+}= c^{F,p_F}_{\alpha}$ for all $\alpha \in I^{p_F}_{k}$;
    // Standard de Casteljau atom
    $c^{(d-1)}_{\alpha} = \sum_{\ell=1}^{n+1} \lambda_\ell\, c^{(d)}_{\alpha+e_\ell}$ for $\alpha \in I^{d-1}_{n}$;
return $c^{(0)}_{0}$;
Figure 5. (a) Pyramid scheme for raising degree in the case of an interval, $n = 1$. (b) Computational atom for degree raising.
with
\[ c^{(p+1)}_{\alpha} = \sum_{\ell=1}^{n+1} \frac{\alpha_\ell}{p+1}\, c^{(p)}_{\alpha-e_\ell}, \qquad \alpha \in I^{p+1}_{n}. \tag{43} \]
Here, we adopt the standard convention whereby terms involving negative multi-indices are treated as zero and simply ignored. Repeated application of this procedure gives a polynomial representation of $u_T$ of any desired order. Figure 5 depicts the procedure in pyramid form in the case $n = 1$ along with the computational atom describing the process for progressing down the pyramid. The entries in the final row of the pyramid are the coefficients of the degree 4 representation.
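For an interval, one step of the degree raising rule (43) can be written out explicitly. The Python sketch below is an illustration only: the function names are hypothetical, position `j` is assumed to hold the multi-index $(p-j, j)$, and the evaluation point `x` plays the role of $\lambda_2$ with $\lambda_1 = 1 - x$.

```python
from math import comb


def raise_degree(c):
    """Apply (43) once on an interval: degree p -> p + 1.

    Position j holds the multi-index (p - j, j); this layout is an assumption.
    Entries that would need a negative multi-index are treated as zero.
    """
    p = len(c) - 1
    raised = []
    for j in range(p + 2):
        from_left = c[j - 1] if j >= 1 else 0.0  # alpha - e_2, weight j / (p + 1)
        from_same = c[j] if j <= p else 0.0      # alpha - e_1, weight (p + 1 - j) / (p + 1)
        raised.append((j * from_left + (p + 1 - j) * from_same) / (p + 1))
    return raised


def bernstein_sum(c, x):
    """Direct evaluation of the Bernstein form, used only as a cross-check."""
    p = len(c) - 1
    return sum(cj * comb(p, j) * (1 - x) ** (p - j) * x ** j
               for j, cj in enumerate(c))
```

Calling `raise_degree` repeatedly descends the pyramid one row at a time; the polynomial represented by the coefficients is unchanged at every step.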
The development of a non-uniform order degree raising procedure is particularly attractive for the treatment of non-uniform order Bernstein bases. It is clear that such an algorithm exists. One could simply proceed face-by-face, applying degree
Figure 6. Example of non-uniform degree raising on an interval. (a) Coefficients from non-uniform representation (40) entered into pyramid. (b) Pyramid populated with values obtained using non-uniform degree raising. The coefficients for the uniform representation appear in the final row.
raising on each entity to change the order of the local Bernstein representation on the face to the maximal order in the element, thereby obtaining an equivalent uniform order approximation. In this way, a non-uniform order representation can always be recast as an equivalent uniform order approximation by a process of local degree raising on each face. The option of changing a non-uniform order representation to an equivalent, uniform order representation constitutes a useful fall-back tactic enabling standard uniform degree based algorithms to be brought into play. This tactic could have been used in lieu of developing a non-uniform de Casteljau algorithm, but would be less efficient. However, for some applications, such as graphical visualisation and rendering of polynomials, the rendering of uniform Bernstein form is implemented at the hardware level using OpenGL evaluators [10]. Such procedures are very highly tuned to the extent that the overhead of raising to a uniform order representation becomes a viable proposition.
The process of performing degree raising separately on each face is inefficient, and a more attractive tailor-made variant is available. Figure 6 illustrates a more effective procedure in the special case of raising of the non-uniform order Bernstein representation on the interval given by (40). Figure 6(a) shows the pyramid populated with the coefficients from the non-uniform representation (40) inserted at the level appropriate to the degree of the representation for each component. The standard degree raising procedure is then applied according to the computational atom in Figure 5(b) with the only difference being that the values already present in the pyramid are added to the values computed using the computational atom. Figure 6(b) shows the result of applying the procedure where the coefficients in the uniform degree 4 representation of $u_I$ emerge in the final row of the pyramid.
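The procedure just described can be traced for the interval example (40). The Python sketch below is a hedged illustration, not the paper's implementation: the function name is hypothetical, position `j` of a level of degree `d` is assumed to hold the multi-index $(d-j, j)$, and the interior coefficients are passed as $(c^{I,4}_{31}, c^{I,4}_{22}, c^{I,4}_{13})$.

```python
def raise_example_40(c_left, c_interior, c_right):
    """Uniform degree 4 coefficients of (40), with p_L = 1, p_I = 4, p_R = 3.

    Position j of a level of degree d holds the multi-index (d - j, j);
    this layout is an assumption made for the sketch.
    """
    c31, c22, c13 = c_interior
    # degree -> list of (position, coefficient) entering the pyramid at that level
    entering = {1: [(0, c_left)],
                3: [(3, c_right)],
                4: [(1, c31), (2, c22), (3, c13)]}
    level = [0.0]  # empty degree 0 level
    for d in range(1, 5):
        prev = level
        level = []
        for j in range(d + 1):  # standard degree raising atom, degree d - 1 -> d
            from_left = prev[j - 1] if j >= 1 else 0.0
            from_same = prev[j] if j <= d - 1 else 0.0
            level.append((j * from_left + (d - j) * from_same) / d)
        for j, c in entering.get(d, []):  # add the values already in the pyramid
            level[j] += c
    return level
```

With a unit coefficient at the left vertex and everything else zero, the final row is approximately `[1, 0.75, 0.5, 0.25, 0]`, which is the degree 4 representation of $\lambda_1$.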
Although we illustrated the algorithm in the simplest possible case, the same procedure applies to higher order simplices. Algorithm 2 gives the procedure in the general case. The same technique employed for the non-uniform order de Casteljau
Algorithm 2: Degree raising to uniform order representation.
Input: Coefficients in the non-uniform Bernstein representation (32) of $u_T$: $\{c^{F,p_F}_{\alpha} : \alpha \in I^{p_F}_{k}\}$ for each $F \in \Delta_k(T)$, $k = 0, \ldots, n$.
Output: Coefficients $\{c^{(p_{\max})}_{\alpha} : \alpha \in I^{p_{\max}}_{n}\}$ in uniform order representation of $u_T$.
Initialise $c^{(p_{\min}-1)}_{\alpha} = 0$, $\alpha \in I^{p_{\min}-1}_{n}$;
for $d = p_{\min}, \ldots, p_{\max}$ do
    // Standard degree raising atom
    $c^{(d)}_{\alpha} = \sum_{\ell=1}^{n+1} \frac{\alpha_\ell}{d}\, c^{(d-1)}_{\alpha-e_\ell}$, $\alpha \in I^{d}_{n}$;
    // Assimilate Bernstein coefficients on faces of degree d
    foreach $F \in \Delta_k(T)$, $k = 0, \ldots, n$ such that $p_F = d$ do
        $c^{(d)}_{\varepsilon_{FT}\alpha} \mathrel{+}= c^{F,p_F}_{\alpha}$ for all $\alpha \in I^{p_F}_{k}$;
return $\{c^{(p_{\max})}_{\alpha} : \alpha \in I^{p_{\max}}_{n}\}$;
algorithm is employed in which all faces are handled simultaneously. The standard degree raising recursion is augmented with an additional step at each polynomial level $d$ whereby the Bernstein coefficients are assimilated on all faces $F \in \Delta(T)$ for which $p_F = d$. The structure of Algorithm 2 is similar to that for the non-uniform de Casteljau procedure. A formal proof and verification of the correctness of the algorithm may be given based on an analogue of Lemma 4. Indeed, the argument is virtually identical and is therefore omitted. By the same token, the complexity of Algorithm 2 is never worse than the complexity of Algorithm 1 given in Lemma 5 (and is identical in the case $p_{\min} = 1$).
3.3. Assembly of the Load Vector. The element load vector $\vec{f}^{T}$ relative to the non-uniform order Bernstein basis has entries defined by
\[ f^{T}_{\alpha} = \int_T f(x)\, B^{T,|\alpha|}_{\alpha}(x)\, dx, \qquad \alpha \in I^{\vec{p}}_{T}, \tag{44} \]
where $f : T \to \mathbb{R}$ is an appropriate source function. Definitions (22) and (28) provide an alternative form for the load vector that is more convenient for our purposes:
\[ f^{T}_{\varepsilon_{FT}\alpha} = \int_T f(x)\, B^{T,p_F}_{\varepsilon_{FT}\alpha}(x)\, dx, \qquad \alpha \in I^{p_F}_{k},\ F \in \Delta_k(T),\ k \in \{0, \ldots, n\}. \tag{45} \]
If the non-uniform order BB-basis is to be utilised for finite element approximation, it will be necessary to develop efficient procedures for the computation of the load vector relative to the basis. Algorithms were presented in [2] for the efficient computation of the BB-moments of the data $f$ defined by
\[ \mu^{T,p}_{\alpha}(f) = \int_T f(x)\, B^{T,p}_{\alpha}(x)\, dx, \qquad \alpha \in I^{p}_{n}. \tag{46} \]
The BB-moments directly correspond to the entries of the element load vector in the case of uniform order approximation.
Figure 7. (a) Pyramid scheme for moment lowering of BB-moments in the case of an interval, $n = 1$. (b) Computational atom.
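The BB-moments of a given degree may be obtained from the BB-moments of any higher degree as follows,
\[ \mu^{T,d-1}_{\alpha}(f) = \sum_{\ell=1}^{n+1} \frac{1+\alpha_\ell}{d}\, \mu^{T,d}_{\alpha+e_\ell}(f), \qquad \alpha \in I^{d-1}_{n}, \]
thanks to the following property of Bernstein polynomials
\[ B^{T,d-1}_{\alpha} = \sum_{\ell=1}^{n+1} \frac{1+\alpha_\ell}{d}\, B^{T,d}_{\alpha+e_\ell}, \qquad \alpha \in I^{d-1}_{n}. \]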
This process of moment lowering may be expressed as a pyramid scheme, as illustrated in Figure 7 for the case of an interval. Figure 7(a) depicts the process of obtaining lower order moments recursively using the computational atom shown in Figure 7(b) to ascend the pyramid starting with the moments of uniform order $p = 4$. The relevant entries in the load vector for the uniform case are then simply read off from the appropriate row of the pyramid.
The pyramid schemes shown in Figure 5(a) and Figure 7(a) are intimately related: the computational atoms are precisely the same. Indeed, the only difference is that the pyramid is ascended (in the case of moment lowering) or descended (in the case of degree raising). Pyramid schemes related in this way are said to be dual [16]. The same duality property extends to higher order simplices.
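The duality can be checked concretely on an interval: pairing raised coefficients with degree $d$ moments gives the same number as pairing the original coefficients with lowered moments. The Python sketch below uses exact rational arithmetic; the function names, the layout (position `j` holds the multi-index $(d-j, j)$) and the choice of test data $f(x) = x^k$ on $[0,1]$ with $\lambda_2 = x$ are assumptions made for the illustration.

```python
from fractions import Fraction
from math import comb, factorial


def lower_moments(mu):
    """Moment lowering on an interval: degree d -> d - 1.

    Position j holds the multi-index (d - j, j); this layout is an assumption.
    """
    d = len(mu) - 1
    return [Fraction(d - j, d) * mu[j] + Fraction(j + 1, d) * mu[j + 1]
            for j in range(d)]


def raise_degree(c):
    """Degree raising (43) on an interval: degree p -> p + 1."""
    p = len(c) - 1
    padded = [0] + list(c) + [0]  # out-of-range entries count as zero
    return [Fraction(j, p + 1) * padded[j]
            + Fraction(p + 1 - j, p + 1) * padded[j + 1]
            for j in range(p + 2)]


def monomial_moments(d, k):
    """Exact BB-moments of f(x) = x**k on [0, 1], taking lambda_2 = x."""
    return [Fraction(comb(d, j) * factorial(j + k) * factorial(d - j),
                     factorial(d + k + 1))
            for j in range(d + 1)]
```

For example, with `mu4 = monomial_moments(4, 2)` and any coefficient list `c` of length four, the sums `sum(a * b for a, b in zip(raise_degree(c), mu4))` and `sum(a * b for a, b in zip(c, lower_moments(mu4)))` agree exactly, since both equal the integral of $f$ against the same polynomial.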
Can the BB-moments of uniform order be used to assemble the load vector in the case of non-uniform approximation? We return to the example of an interval $I$ with endpoints at $L$ and $R$, where a non-uniform approximation of the form (40) is sought. The entries in the load vector corresponding to the basis functions $B^{L,1}_{1}$, $B^{I,4}_{\alpha}$ and $B^{R,3}_{3}$ are required. Suppose that the uniform order BB-moments $\{\mu^{I,4}_{\alpha}, \alpha \in I^{4}_{1}\}$ have been computed using, for example, the procedure presented in [2].
Traditionally, expressions for the components of the non-uniform order load vector would be derived by writing the Bernstein polynomials appearing in (40) in
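Algorithm 3: Assembly of load vector $\vec{f}^{T}$ relative to non-uniform order basis.
Input: BB-moments $\mu^{T,p_{\max}}_{\alpha}(f)$, $\alpha \in I^{p_{\max}}_{n}$.
Output: Element load vector $\vec{f}^{T} = \{f^{T}_{\alpha}\}$ defined in (44).
for $d = p_{\max}, p_{\max}-1, \ldots, p_{\min}$ do
    // Extract components of load vector on faces of degree d
    foreach $F \in \Delta_k(T)$, $k = 0, \ldots, n$ such that $p_F = d$ do
        $f^{T}_{\varepsilon_{FT}\alpha} = \mu^{T,d}_{\varepsilon_{FT}\alpha}(f)$ for all $\alpha \in I^{p_F}_{k}$;
    // Moment lowering atom
    $\mu^{T,d-1}_{\alpha}(f) = \sum_{\ell=1}^{n+1} \frac{1+\alpha_\ell}{d}\, \mu^{T,d}_{\alpha+e_\ell}(f)$ for $\alpha \in I^{d-1}_{n}$;
return;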
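terms of the higher (fourth) order Bernstein polynomials as follows:
\begin{align*}
B^{L,1}_{1} &= B^{I,4}_{40} + \tfrac{3}{4} B^{I,4}_{31} + \tfrac{1}{2} B^{I,4}_{22} + \tfrac{1}{4} B^{I,4}_{13} \tag{47} \\
B^{R,3}_{3} &= \tfrac{1}{4} B^{I,4}_{13} + B^{I,4}_{04}
\end{align*}
along with the trivial identity $B^{I,4}_{\alpha} = B^{I,4}_{\alpha}$ for the interior nodes. The entries of the load vector for the non-uniform approximation on the interval are given in terms of the fourth order BB-moments $\mu^{I,4}_{\alpha}$, $\alpha \in I^{4}_{1}$, by the formulae
\begin{align*}
\mu^{L,1}_{1}(f) &= \mu^{I,4}_{40} + \tfrac{3}{4} \mu^{I,4}_{31} + \tfrac{1}{2} \mu^{I,4}_{22} + \tfrac{1}{4} \mu^{I,4}_{13} \tag{48} \\
\mu^{R,3}_{3}(f) &= \tfrac{1}{4} \mu^{I,4}_{13} + \mu^{I,4}_{04}
\end{align*}
along with the trivial identities $\mu^{I,4}_{\alpha}(f) = \mu^{I,4}_{\alpha}$, $\alpha \in I^{4}_{1}$.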
The corresponding formulae for higher order simplices are considerably more complicated. In practice, the coefficients appearing in these formulae are stored in look-up tables containing the coefficients arising from expressing the lower order Bernstein polynomials in terms of the higher order polynomials. However, this approach is at best clumsy, error prone, and, entailing random access of look-up tables, a potential bottleneck.
A superior alternative is to exploit duality. For example, we may construct the dual of the pyramid Figure 6(b). The first step is to place the moments of uniform order $p = 4$ at the nodes of the pyramid corresponding to their multi-indices $\alpha \in I^{4}_{1}$, as shown in Figure 8(a). The moments are then propagated through the pyramid using the standard moment lowering computational atom shown in Figure 7(b) precisely as in the case of uniform order moment lowering. This results in the pyramid shown in Figure 8(a) which is dual to the pyramid Figure 6(a). Examining the resulting entries in the pyramid reveals that the expressions (48) for the moments have materialised in the pyramid at the locations corresponding to the (non-uniform order) multi-indices as indicated in Figure 8(b). The components of the non-uniform load vector may simply be read off from the dual pyramid. The pyramid scheme is simple, stable and dispenses with the need for look-up tables.
In summary, the dual of the non-uniform degree raising pyramid scheme gives a pyramid scheme for the assembly of the non-uniform load vector. The general case
Figure 8. Pyramid scheme for computation of non-uniform BB-moments needed for the non-uniform order approximation (40). (a) Propagation of uniform $p = 4$ order moments using standard computational atom for moment lowering. (b) Extraction of specific moments used in the non-uniform order load vector.
is presented in Algorithm 3 for the assembly of the element load vector starting from the BB-moments of degree $p_{\max}$. Approached in this way, the generation of the non-uniform load vector from the uniform moments is simple, efficient, avoids the need for look-up tables and is applicable to arbitrary order approximation in any number of dimensions. The complexity of the whole assembly procedure is given by:
Theorem 1. The element load vector for the non-uniform order space $\mathbb{P}^{\vec{p}}_{n}(T)$ can be computed in $O(p_{\max}^{n+1})$ operations.
Proof. The uniform order moments can be computed in $O(p_{\max}^{n+1})$ operations using the techniques presented in [2]. The moment lowering step has complexity at worst that of the non-uniform de Casteljau algorithm presented in Lemma 5. □
4. Assembly of Element Matrices
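The element stiffness matrix $S^{T}$ has entries given by
\[ S^{T}_{\alpha\beta} = \int_T \operatorname{grad} B^{T,|\alpha|}_{\alpha} \cdot A(x) \operatorname{grad} B^{T,|\beta|}_{\beta}\, dx, \qquad \alpha, \beta \in I^{\vec{p}}_{T}, \tag{49} \]
where $A : T \to \mathbb{R}^{n \times n}$ is appropriate data. As with the element load vector, it is more convenient to express the entries of the stiffness matrix in an expanded form using (27) and (28):
\[ S^{T}_{\varepsilon_{FT}\alpha,\, \varepsilon_{GT}\beta} = \int_T \operatorname{grad} B^{T,p_F}_{\varepsilon_{FT}\alpha} \cdot A(x) \operatorname{grad} B^{T,p_G}_{\varepsilon_{GT}\beta}\, dx, \tag{50} \]
where $F, G \in \Delta(T)$, $\alpha \in I^{p_F}_{k_F}$ and $\beta \in I^{p_G}_{k_G}$. The identity
\[ \operatorname{grad} B^{T,p_F}_{\varepsilon_{FT}\alpha} = p_F \sum_{\ell=1}^{k_F+1} B^{T,p_F-1}_{\varepsilon_{FT}(\alpha-e_\ell)} \operatorname{grad} \lambda_{\varepsilon_{FT}e_\ell} \]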
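along with the corresponding expression for $\operatorname{grad} B^{T,p_G}_{\varepsilon_{GT}\beta}$ leads to the following alternative expression for the right hand side of (50)
\[ p_F\, p_G \sum_{\ell=1}^{k_F+1} \sum_{m=1}^{k_G+1} \int_T B^{T,p_F-1}_{\varepsilon_{FT}(\alpha-e_\ell)}\, B^{T,p_G-1}_{\varepsilon_{GT}(\beta-e_m)} \operatorname{grad} \lambda_{\varepsilon_{FT}e_\ell} \cdot A(x) \operatorname{grad} \lambda_{\varepsilon_{GT}e_m}\, dx \tag{51} \]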
that is more useful for the purposes of implementation.
The algorithm StiffMat presented in [2] enables the construction of the element stiffness matrix in the case of uniform polynomial order $p$ in $O(p^{2n})$ operations when $T$ is an $n$-simplex. This is the optimal complexity given that the matrix has $O(p^{2n})$ non-zero entries. StiffMat exploits another property of Bernstein polynomials
\[ B^{T,p}_{\alpha} B^{T,q}_{\beta} = \frac{\binom{\alpha+\beta}{\alpha}}{\binom{p+q}{p}}\, B^{T,p+q}_{\alpha+\beta}, \qquad \alpha \in I^{p}_{n},\ \beta \in I^{q}_{n} \tag{52} \]
in the case $p = q$, to reduce the computation of entries in the stiffness matrix to the evaluation of the BB-moments of the data $A$ over the element $T$. Remarkably, it is not the computation of the moments but the evaluation of the multinomial coefficients $\binom{\alpha+\beta}{\alpha}$ that is responsible for the leading term in the complexity estimate. Care is needed in the computation of the multinomial coefficients if the algorithm is to achieve optimal complexity.
What about the assembly of the stiffness matrix relative to the non-uniform order basis? The product identity (52) still applies and all would seem well. Unfortunately, the critical multinomial algorithm from [2] relies heavily on the fact that in the uniform order case, the orders $p$ and $q$ are equal. The algorithm is no longer applicable to the non-uniform order case and the prospects seem bleak, given that the multinomial algorithm dominates the complexity.
Fortunately, there is a nuance that leads to the following refined version of (52):
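Lemma 6. Let $F, G \in \Delta(T)$ be a pair of faces of $T$. Then,
\[ B^{T,p_F}_{\varepsilon_{FT}\alpha}\, B^{T,p_G}_{\varepsilon_{GT}\beta} = \frac{1}{\binom{p_F+p_G}{p_F}} \begin{cases} \binom{\alpha+\beta}{\alpha}\, B^{T,2p_F}_{\varepsilon_{FT}(\alpha+\beta)}, & \text{if } F = G, \\ B^{T,p_F+p_G}_{\varepsilon_{FT}\alpha+\varepsilon_{GT}\beta}, & \text{if } F \neq G, \end{cases} \tag{53} \]
where $\alpha \in I^{p_F}_{k_F}$ and $\beta \in I^{p_G}_{k_G}$.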
Proof. Thanks to Lemma 2, $F \neq G$ implies that $F$ and $G$ have no interior domain points in common. This means $(\varepsilon_{FT}\alpha + \varepsilon_{GT}\beta)!$ reduces to $\varepsilon_{FT}\alpha!\, \varepsilon_{GT}\beta!$ and the result follows from (52). □
The absence of the multinomial coefficient in the second case of (53) is vital to Algorithm 4. If, for instance, the multinomial factor were still present in the second case, then Algorithm 4 would need to compute (or look up) the relevant multinomial coefficients in the else clause. Either way, this step would form a bottleneck resulting in a sub-optimal algorithm. The multinomial coefficients needed in the if clause correspond to both multi-indices being drawn from the same indexing sets. This feature is exploited in the routine StiffMat described in [2] to evaluate the multinomial coefficients efficiently using a nested recursion. The complexity of Algorithm 4 is the subject of our final result:
Theorem 2. The element stiffness matrix for the non-uniform order space $\mathbb{P}^{\vec{p}}_{n}(T)$ can be computed in $O(p_{\max}^{2n})$ operations.
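Algorithm 4: Assembly of stiffness $S^{T}$ relative to non-uniform order basis.
Input: BB-moments $\{\mu^{T,2p_{\max}-2}_{\alpha}(A) : \alpha \in I^{2p_{\max}-2}_{n}\}$.
Output: Element stiffness matrix $S = \{S^{T}_{\alpha\beta}\}$ defined in (49).
for $d = 2p_{\max}-2, 2p_{\max}-3, \ldots, 2p_{\min}-1$ do
    // Update stiffness matrix
    foreach $F, G \in \Delta(T)$: $p_F + p_G = d + 2$ do
        if $F = G$ then
            // Diagonal blocks use procedure from [2]
            StiffMat($S$, $p_F$, $\{\mu^{T,d}_{\alpha}(A) : \alpha \in I^{d}_{n}\}$);
        else
            // Exploit second case in (53)
            $w = p_F\, p_G \big/ \binom{p_F+p_G}{p_F}$;
            for $\ell = 1, \ldots, k_F + 1$ do
                for $m = 1, \ldots, k_G + 1$ do
                    $S_{\varepsilon_{FT}(\alpha+e_\ell),\, \varepsilon_{GT}(\beta+e_m)} \mathrel{+}= w \operatorname{grad}\lambda_{\varepsilon_{FT}e_\ell} \cdot \mu^{T,d}_{\varepsilon_{FT}\alpha+\varepsilon_{GT}\beta}(A) \operatorname{grad}\lambda_{\varepsilon_{GT}e_m}$ for $\alpha \in I^{p_F}_{k_F}$, $\beta \in I^{p_G}_{k_G}$;
    // Moment lowering atom
    $\mu^{T,d-1}_{\alpha}(A) = \sum_{\ell=1}^{n+1} \frac{1+\alpha_\ell}{d}\, \mu^{T,d}_{\alpha+e_\ell}(A)$ for $\alpha \in I^{d-1}_{n}$;
return;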
Proof. The moments needed for Algorithm 4 can be evaluated in $O((2p_{\max}-2)^{n+1})$ operations using the procedures given in [2], where it is also shown that the number of operations required in a call to routine StiffMat for a $k$-simplex of polynomial degree $p$ is $O(p^{2k})$. Therefore, the number of operations needed for the assembly of the diagonal blocks using calls to StiffMat is of order
\[ \sum_{k=0}^{n} \sum_{F \in \Delta_k(T)} p_F^{2k} \le \sum_{k=0}^{n} \binom{n+1}{k+1} p_{\max}^{2k} = p_{\max}^{-2}\left( (1 + p_{\max}^{2})^{n+1} - 1 \right) = O(p_{\max}^{2n}). \]
The number of operations to deal with the off-diagonal blocks is of order
\[ \sum_{F, G \in \Delta(T):\, F \neq G} (k_F + 1) \dim I^{p_F}_{k_F} \times (k_G + 1) \dim I^{p_G}_{k_G}, \]
which is, in turn, bounded by
\[ \left\{ \sum_{k=0}^{n} (k+1) \binom{n+1}{k+1} \dim I^{p_{\max}}_{k} \right\}^{2} = (n+1)^{2} \binom{p_{\max}-1+n}{n}^{2} = O(p_{\max}^{2n}), \]
where a similar argument to the one used in the proof of Lemma 3 has been applied. The result now follows since the number of operations for the moment lowering is, as usual, of the same order as the non-uniform de Casteljau algorithm, $O(p_{\max}^{n+1})$. □
Theorem 2 shows that the stiffness matrix can be constructed in optimal complexity of $O(1)$ operations per entry for the non-uniform order Bernstein basis. Algorithms with the same structure as Algorithm 4 that exploit analogues of subroutine StiffMat from [2] can easily be developed to construct other matrices, such as the element mass matrix, in optimal complexity.
References
[1] M. Ainsworth, Discrete dispersion relation for hp-version finite element approximation at high wave number, SIAM J. Numer. Anal., 42 (2004), pp. 553–575.
[2] M. Ainsworth, G. Andriamaro, and O. Davydov, Bernstein-Bezier FEM and optimal assembly algorithms, SIAM J. Sci. Comp., (2011), pp. 3087–3109.
[3] M. Ainsworth and J. Coyle, Hierarchic finite element bases on unstructured tetrahedral meshes, Internat. J. Numer. Methods Engrg., 58 (2003), pp. 2103–2130.
[4] M. Ainsworth and B. Senior, Aspects of an adaptive hp-finite element method: Adaptive strategy, conforming approximation and efficient solvers, Comput. Methods Appl. Mech. Engrg., 150 (1997), pp. 65–87.
[5] D. N. Arnold, R. S. Falk, and R. Winther, Geometric decompositions and local bases for spaces of finite element differential forms, Comput. Methods Appl. Mech. Engrg., 198 (2009), pp. 1660–1672.
[6] I. Babuska and B. Guo, The h-p version of the finite element method for domains with curved boundaries, SIAM J. Numer. Anal., 25 (1988), pp. 837–861.
[7] S. Brenner and L. Scott, The Mathematical Theory of Finite Element Methods, vol. 15 of Texts in Applied Mathematics, Springer-Verlag, New York, 1994.
[8] C. Canuto, M. Hussaini, A. Quarteroni, and T. Zang, Spectral methods. Evolution to complex geometries and applications to fluid dynamics, Scientific Computation, Springer, Berlin, 2007.
[9] P. G. Ciarlet, The Finite Element Method for Elliptic Problems, Elsevier, North-Holland, 1978.
[10] T. Davis, J. Neider, and M. Woo, OpenGL Programming Guide: The Official Guide to Learning OpenGL, Version 1.1, Second Edition, Addison-Wesley Longman Pub (Sd), 1997.
[11] L. Demkowicz, Computing with hp-adaptive finite elements. Vol. 1, Chapman & Hall/CRC Applied Mathematics and Nonlinear Science Series, Chapman & Hall/CRC, Boca Raton, FL, 2007. One and two dimensional elliptic and Maxwell problems, With 1 CD-ROM (UNIX).
[12] L. Demkowicz, J. Oden, W. Rachowicz, and O. Hardy, Toward a universal h-p adaptive finite element strategy. Part 1 constrained approximation and data structure, Comput. Methods Appl. Mech. Engrg., 77 (1989), pp. 79–112.
[13] G. Farin, Triangular Bernstein-Bezier patches, Comput. Aided Geom. Design, 3 (1986), pp. 83–127.
[14] G. Farin, Curves and surfaces for CAGD: a practical guide, Fifth Edition, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2002.
[15] J. Foley, A. van Dam, S. Feiner, and J. Hughes, Computer Graphics: Principles and Practice, Addison-Wesley Publishing Company, Inc., 1996.
[16] R. Goldman, Pyramid Algorithms: A Dynamic Programming Approach to Curves and Surfaces for Geometric Modeling, Morgan Kaufmann, 2002.
[17] D. Gottlieb and J. Hesthaven, Spectral methods for hyperbolic problems, J. Comput. Appl. Math., 128 (2001), pp. 83–131. Numerical analysis 2000, Vol. VII, Partial differential equations.
[18] B. Guo and I. Babuska, The h-p version of the finite element method. Part 1 the basic approximation results, Comp. Mech, 1 (1986), pp. 21–41.
[19] B. Guo and I. Babuska, The h-p version of the finite element method. Part 2 general results and applications, Comp. Mech, 1 (1986), pp. 203–226.
[20] J. Hoschek and D. Lasser, Fundamentals of computer aided geometric design, A K Peters Ltd., Wellesley, MA, 1993. Translated from the 1992 German edition by Larry L. Schumaker.
[21] G. E. Karniadakis and S. J. Sherwin, Spectral/hp element methods for computational fluid dynamics, Numerical Mathematics and Scientific Computation, Oxford University Press, New York, second ed., 2005.
[22] I. N. Katz, A. G. Peano, and M. P. Rossow, Nodal variables for complete conforming finite elements of arbitrary polynomial order, Comput. Math. Appl., 4 (1978), pp. 85–112.
[23] R. C. Kirby, Fast simplicial finite element algorithms using Bernstein polynomials, Numer. Math., 117 (2011), pp. 631–652.
[24] M.-J. Lai and L. L. Schumaker, Spline functions on triangulations, vol. 110 of Encyclopedia of Mathematics and its Applications, Cambridge University Press, Cambridge, 2007.
[25] J. M. Melenk and S. Sauter, Wavenumber explicit convergence analysis for Galerkin discretizations of the Helmholtz equation, SIAM J. Numer. Anal., 49 (2011), pp. 1210–1243.
[26] J. M. Melenk and C. Schwab, HP FEM for reaction-diffusion equations. I. Robust exponential convergence, SIAM J. Numer. Anal., 35 (1998), pp. 1520–1557 (electronic).
[27] S. A. Orszag, Spectral methods for problems in complex geometries, J. Comput. Phys., 37 (1980), pp. 70–92.
[28] J. Peters, Evaluation and approximate evaluation of the multivariate Bernstein-Bezier form on a regularly partitioned simplex, ACM Trans. Math. Software, 20 (1994), pp. 460–480.
[29] C. Schwab, p- and hp-Finite Element Methods: Theory and Applications in Solid and Fluid Mechanics, Numerical Mathematics and Scientific Computation, Oxford University Press, 1998.
[30] C. Schwab and M. Suri, The p and hp versions of the finite element method for problems with boundary layers, Math. Comp., 65 (1996), pp. 1403–1429.
[31] B. Szabo, Some recent developments in finite element analysis, Comput. Math. Appl., 5 (1979), pp. 99–115.
[32] B. Szabo and I. Babuska, Finite Element Analysis, John Wiley & Sons, 1991.
[33] P. E. J. Vos, S. J. Sherwin, and R. M. Kirby, From h to p efficiently: implementing finite and spectral/hp element methods to achieve optimal performance for low- and high-order discretisations, J. Comput. Phys., 229 (2010), pp. 5161–5181.
Division of Applied Mathematics, Brown University, 182 George St, Providence RI02912, USA.
E-mail address: Mark [email protected]