Chapter 2
An Efficient Parallel Algorithm for a Class of Two-Point BVP involving a Fourth-Order Differential Equation
We consider the extension of the parallel algorithm given in [1] to the class of
two-point boundary value problems (TPBVPs) set up as a system of first-order
equations, beginning with $dy/dx = z_1$, together with boundary conditions at
$x = 0$ and $x = 1$. The system can, however, be written in the compact form
$$y^{(iv)} = f(x, y), \quad y(0) = A_1,\ y''(0) = A_2,\ y(1) = B_1,\ y''(1) = B_2,$$
with $\partial f/\partial y \le 0$ and $\partial f/\partial y$ continuous on $[0, 1] \times (-\infty, \infty)$.
In the following we devise a parallel algorithm on the lines of [1] for our
TPBVP.
As in [1], the interval [0, 1] is divided into p different divisions, each division
consisting of N or (N + 1) unequal intervals. A new fourth-order finite difference
scheme is developed for a general non-uniform mesh and is applied to the above
class of TPBVPs on each of the p divisions. This leads to an
N x N or (N + 1) x (N + 1) system of linear or non-linear equations on each division,
and these systems are solved simultaneously on p processors (p a power of 2). The
solution of the original problem is then obtained at n = Np equally spaced abscissas
on [0, 1].
2.1 INTRODUCTION
Consider the class of TPBVPs
$$y^{(iv)} = f(x, y), \quad y(0) = A_1,\ y''(0) = A_2,\ y(1) = B_1,\ y''(1) = B_2 \qquad (2.1)$$
where $\partial f/\partial y \le 0$ and $\partial f/\partial y$ is continuous on $[0, 1] \times (-\infty, \infty)$. We also
assume that (2.1) has a unique solution on $[0, 1] \times (-\infty, \infty)$ [6].
A number of papers have recently appeared in the literature which consider
the solution of TPBVPs on parallel computers; see, e.g., [3-5]. Most of these,
however, parallelize the matrix computations which arise when the
differential equation and associated boundary conditions are replaced by their
equivalent finite difference schemes. A parallel chopping algorithm is considered
in [1] where, on a computer with p processors, the BVP is solved numerically at each
stage on p meshes using a code based on COLNEW [7, 8].
In the following, we solve (2.1) concurrently on a computer with p processors, p a
power of 2. In particular, we develop a new fourth-order finite difference scheme
for a non-uniform mesh and apply it to the above class of TPBVPs on each of the p
divisions. This leads to (N - 1) x (N - 1) or N x N systems of
linear or non-linear equations, which are solved concurrently on the p processors.
2.2 THE PARALLEL ALGORITHM
To construct p different divisions of the interval [0, 1], each division consisting of N
or (N + 1) equally spaced abscissas, we use the chopping algorithm discussed in
[1]. For the sake of completeness we describe it in the following:
1. Divide [0, 1] into N equal parts such that $h = 1/N$; $x_k = kh$, $k = 0, 1, 2, \ldots, N$.
We solve the (N - 1) x (N - 1) system in the (N - 1) unknowns $y_1, y_2, \ldots, y_{N-1}$
arising out of the finite difference discretization of (2.1) on the
above abscissas on processor 1.
2. Next subdivide each interval $[x_{k-1}, x_k]$ into two equal parts such that $x'_k$ is
the mid-point of $[x_{k-1}, x_k]$. Then $x'_1 = x_0 + h/2$. As in stage 1, we apply the
finite difference discretization at the abscissas $x'_1, x'_2, \ldots, x'_N$ and solve the
N x N system in the N unknowns $y'_1, y'_2, \ldots, y'_N$ on processor 2.
3. Proceeding as in stage 2, we further subdivide the intervals $[x_{k-1}, x'_k]$ and $[x'_k, x_k]$
into two equal parts and denote their mid-points by
$x''_{2k-1}$ and $x''_{2k}$, $k = 1, 2, \ldots, N$.
As in stage 2, we solve an algebraic N x N system in the unknowns
$y''_1, y''_3, \ldots, y''_{2N-1}$ at the abscissas $x''_1, x''_3, \ldots, x''_{2N-1}$ on processor 3. Note that
$x''_1 = x_0 + h/4$.
4. On processor 4, we solve the N x N system in the N unknowns
$y''_2, y''_4, \ldots, y''_{2N}$.
Note that $x''_2 = x_0 + 3h/4$.
The above procedure can be generalised as follows:
We wish to find the solution of (2.1) on [0, 1] at n given points
$x_0, x_1, x_2, \ldots, x_n$, which are equally spaced.
The p processors would then each solve an N x N system concurrently in N unknowns
(except processor 1, which solves an (N - 1) x (N - 1) system).
Thus,
$$n = Np \qquad (2.2)$$
Number the p processors in the order $1, 2, 3, \ldots, p = 2^k$ (k fixed). For
processor j, $2 \le j \le p = 2^k$, we write j uniquely as $j = j(d, m) = 2^{d-1} + m$,
$1 \le m \le 2^{d-1}$, $d = 1, 2, \ldots, k$. Processor j then solves the N x N
system at the abscissas $\{x_{p_0}, x_{p_1}, \ldots, x_{p_{N-1}}\}$, where
$$p_i = 2^{k-d}(2m - 1) + i\,2^k, \quad i = 0, 1, \ldots, N - 1. \qquad (2.3)$$
Also for processor j,
$$x_{p_0} - x_0 = \left[\frac{2m - 1}{2^d}\right] h, \qquad x_n - x_{p_{N-1}} = \left\{1 - \left[\frac{2m - 1}{2^d}\right]\right\} h \qquad (2.4)$$
Processor 1 solves the (N - 1) x (N - 1) system in the unknowns $y_{d_i}$, where
$d_i = 2^k (1 + i)$, $i = 0, 1, 2, \ldots, N - 2$.
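The index bookkeeping of (2.3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `processor_indices` and `all_indices` are hypothetical, abscissas are identified by their integer index on the fine mesh of spacing $h/2^k$, and the decomposition $j = 2^{d-1} + m$ is computed with the bit length of $j - 1$.

```python
def processor_indices(j, k, N):
    """Fine-mesh indices of the unknowns handled by processor j (1 <= j <= 2**k)."""
    if j == 1:
        # Processor 1: d_i = 2^k (1 + i), i = 0, ..., N-2, i.e. (N - 1) unknowns.
        return [2**k * (1 + i) for i in range(N - 1)]
    # Write j = 2^(d-1) + m with 1 <= m <= 2^(d-1).
    d = (j - 1).bit_length()
    m = j - 2**(d - 1)
    # p_i = 2^(k-d) (2m - 1) + i 2^k, i = 0, ..., N-1, as in (2.3).
    return [2**(k - d) * (2 * m - 1) + i * 2**k for i in range(N)]


def all_indices(k, N):
    """Sorted union of the index sets of all p = 2**k processors."""
    p = 2**k
    return sorted(i for j in range(1, p + 1) for i in processor_indices(j, k, N))


if __name__ == "__main__":
    k, N = 2, 3  # p = 4 processors, n = N p = 12
    for j in range(1, 2**k + 1):
        print(j, processor_indices(j, k, N))
```

For p = 4 and N = 3 the four index sets are disjoint and together cover the interior indices 1, ..., n - 1 = 11, which is consistent with n = Np in (2.2).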
2.3 THE FINITE DIFFERENCE METHOD
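We next proceed to construct a fourth-order method for the solution of (2.1) on
[0, 1]. All the subintervals for any of the p divisions are equally spaced except the
first and the last, whose lengths are $ch$ and $(1-c)h$ respectively, $c \in (0, 1)$. A
fourth-order finite difference scheme for (2.1) is then given by
$$-2y_0 + \alpha_1 y_1 + \alpha_2 y_2 + \alpha_3 y_3 + \alpha h^2 y''(0) = h^4[\beta_0 f_0 + \beta_1 f_1 + \beta_2 f_2 + \beta_3 f_3] + t_1 \qquad (2.5)$$
Expanding the left-hand side and the right-hand side in Taylor series about $(x_0, y_0)$,
$$-2y_0 + \alpha_1\Big\{y_0 + ch\,y_0^{(1)} + \frac{c^2h^2}{2!}y_0^{(2)} + \frac{c^3h^3}{3!}y_0^{(3)} + \frac{c^4h^4}{4!}y_0^{(4)} + \cdots\Big\}$$
$$+\ \alpha_2\Big\{y_0 + (c+1)h\,y_0^{(1)} + \frac{(c+1)^2h^2}{2!}y_0^{(2)} + \frac{(c+1)^3h^3}{3!}y_0^{(3)} + \frac{(c+1)^4h^4}{4!}y_0^{(4)} + \cdots\Big\}$$
$$+\ \alpha_3\Big\{y_0 + (c+2)h\,y_0^{(1)} + \frac{(c+2)^2h^2}{2!}y_0^{(2)} + \frac{(c+2)^3h^3}{3!}y_0^{(3)} + \frac{(c+2)^4h^4}{4!}y_0^{(4)} + \cdots\Big\} + \alpha h^2 y_0^{(2)}$$
$$= h^4\Big[\beta_0 y_0^{(4)} + \beta_1\Big\{y_0^{(4)} + ch\,y_0^{(5)} + \frac{c^2h^2}{2!}y_0^{(6)} + \frac{c^3h^3}{3!}y_0^{(7)} + \cdots\Big\}$$
$$+\ \beta_2\Big\{y_0^{(4)} + (c+1)h\,y_0^{(5)} + \frac{(c+1)^2h^2}{2!}y_0^{(6)} + \frac{(c+1)^3h^3}{3!}y_0^{(7)} + \cdots\Big\} + \beta_3\Big\{y_0^{(4)} + (c+2)h\,y_0^{(5)} + \frac{(c+2)^2h^2}{2!}y_0^{(6)} + \frac{(c+2)^3h^3}{3!}y_0^{(7)} + \cdots\Big\}\Big]$$
Comparing coefficients of $h^0, h, h^2, h^3, h^4, h^5, h^6, h^7$, we get 8 equations from
which the 8 unknowns can be calculated.
The corresponding 8 equations are:
$$-2 + \alpha_1 + \alpha_2 + \alpha_3 = 0$$
$$c\,\alpha_1 + (c+1)\alpha_2 + (c+2)\alpha_3 = 0$$
$$\frac{c^2\alpha_1}{2} + \frac{(c+1)^2\alpha_2}{2} + \frac{(c+2)^2\alpha_3}{2} + \alpha = 0$$
$$\frac{c^3\alpha_1}{6} + \frac{(c+1)^3\alpha_2}{6} + \frac{(c+2)^3\alpha_3}{6} = 0$$
$$\frac{c^4\alpha_1}{4!} + \frac{(c+1)^4\alpha_2}{4!} + \frac{(c+2)^4\alpha_3}{4!} = r_0 = \beta_0 + \beta_1 + \beta_2 + \beta_3$$
$$\frac{c^5\alpha_1}{5!} + \frac{(c+1)^5\alpha_2}{5!} + \frac{(c+2)^5\alpha_3}{5!} = r_1 = c\beta_1 + (c+1)\beta_2 + (c+2)\beta_3$$
$$\frac{c^6\alpha_1}{6!} + \frac{(c+1)^6\alpha_2}{6!} + \frac{(c+2)^6\alpha_3}{6!} = r_2 = \frac{c^2\beta_1}{2} + \frac{(c+1)^2\beta_2}{2} + \frac{(c+2)^2\beta_3}{2}$$
$$\frac{c^7\alpha_1}{7!} + \frac{(c+1)^7\alpha_2}{7!} + \frac{(c+2)^7\alpha_3}{7!} = r_3 = \frac{c^3\beta_1}{3!} + \frac{(c+1)^3\beta_2}{3!} + \frac{(c+2)^3\beta_3}{3!}$$
From the above equations we get the following values:
$$\alpha_1 = \frac{(c+2)(2c+3)}{3}, \quad \alpha_2 = -\frac{4c(c+2)}{3}, \quad \alpha_3 = \frac{c(2c+1)}{3}, \quad \alpha = \frac{c(c+2)}{3},$$
$$\beta_1 = \frac{(c+1)(c+2)r_1 - 2(2c+3)r_2 + 6r_3}{2c},$$
$$\beta_2 = -\frac{c(c+2)r_1 - 4(c+1)r_2 + 6r_3}{c+1},$$
$$\beta_3 = \frac{c(c+1)r_1 - 2(2c+1)r_2 + 6r_3}{2(c+2)},$$
with $\beta_0 = r_0 - (\beta_1 + \beta_2 + \beta_3)$. For $c = 1$ (uniform mesh) the left-hand side
reduces to $-2y_0 + 5y_1 - 4y_2 + y_3 + h^2 y''(0)$.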
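The coefficients of (2.5) can be checked mechanically with exact rational arithmetic. The following is a minimal sketch, not the thesis code: the function names are hypothetical, $\alpha_1$ is taken as $(c+2)(2c+3)/3$ and the middle term of $\beta_2$ as $4(c+1)r_2$ (the values for which the eight Taylor conditions actually hold), and $\beta_0$ is assumed to be fixed by the $h^4$ condition.

```python
from fractions import Fraction
from math import factorial


def alphas(c):
    """Left-hand coefficients (alpha_1, alpha_2, alpha_3, alpha) of (2.5)."""
    c = Fraction(c)
    a1 = (c + 2) * (2 * c + 3) / 3
    a2 = -4 * c * (c + 2) / 3
    a3 = c * (2 * c + 1) / 3
    a = c * (c + 2) / 3
    return a1, a2, a3, a


def lhs_coefficient(c, m):
    """Coefficient of h^m y0^(m) on the left-hand side of (2.5)."""
    c = Fraction(c)
    a1, a2, a3, a = alphas(c)
    value = (a1 * c**m + a2 * (c + 1)**m + a3 * (c + 2)**m) / factorial(m)
    if m == 0:
        value -= 2   # contribution of the -2 y0 term
    if m == 2:
        value += a   # contribution of the alpha h^2 y''(0) term
    return value


def betas(c):
    """Right-hand coefficients (beta_0, ..., beta_3); beta_0 from the h^4 condition."""
    c = Fraction(c)
    r0, r1, r2, r3 = (lhs_coefficient(c, j + 4) for j in range(4))
    b1 = ((c + 1) * (c + 2) * r1 - 2 * (2 * c + 3) * r2 + 6 * r3) / (2 * c)
    b2 = -(c * (c + 2) * r1 - 4 * (c + 1) * r2 + 6 * r3) / (c + 1)
    b3 = (c * (c + 1) * r1 - 2 * (2 * c + 1) * r2 + 6 * r3) / (2 * (c + 2))
    b0 = r0 - (b1 + b2 + b3)
    return b0, b1, b2, b3


def rhs_coefficient(c, j):
    """Coefficient of h^(j+4) y0^(j+4) on the right-hand side of (2.5)."""
    c = Fraction(c)
    b0, b1, b2, b3 = betas(c)
    value = (b1 * c**j + b2 * (c + 1)**j + b3 * (c + 2)**j) / factorial(j)
    if j == 0:
        value += b0
    return value
```

With these definitions, `lhs_coefficient(c, m)` vanishes for m = 0, ..., 3 and matches `rhs_coefficient(c, m - 4)` for m = 4, ..., 7 for any rational c in (0, 1], so the local truncation error of (2.5) starts at $h^8$.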
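For k = 2, the discretization takes the form
$$y_4 + a_1 y_3 + a_0 y_2 + a_{-1} y_1 + a_{-2} y_0 = h^4[\beta_2 f_4 + \beta_1 f_3 + \beta_0 f_2 + \beta_{-1} f_1 + \beta_{-2} f_0] + t_2 \qquad (2.6)$$
where the $\beta$'s are specific to this equation and
$$x_4 = x_2 + 2h, \quad x_3 = x_2 + h, \quad x_1 = x_2 - h, \quad x_0 = x_2 - h - ch = x_2 - h(1 + c).$$
Expanding about $x_2$,
$$\Big\{y_2 + 2h\,y_2' + \frac{4h^2}{2!}y_2'' + \frac{8h^3}{3!}y_2^{(3)} + \frac{16h^4}{4!}y_2^{(4)} + \frac{32h^5}{5!}y_2^{(5)} + \cdots\Big\} + a_1\Big\{y_2 + h\,y_2' + \frac{h^2}{2!}y_2'' + \frac{h^3}{3!}y_2^{(3)} + \frac{h^4}{4!}y_2^{(4)} + \frac{h^5}{5!}y_2^{(5)} + \cdots\Big\} + a_0 y_2 + \cdots$$
with the corresponding expansions of $y_1$, $y_0$ and of $f = y^{(4)}$ on the right-hand side.
Comparing the coefficients on both sides, we get the following equations:
$$1 + a_1 + a_0 + a_{-1} + a_{-2} = 0$$
$$2 + a_1 - a_{-1} - (1 + c)a_{-2} = 0$$
$$\frac{4}{2} + \frac{a_1}{2} + \frac{a_{-1}}{2} + \frac{(1+c)^2 a_{-2}}{2} = 0$$
$$\frac{8}{6} + \frac{a_1}{6} - \frac{a_{-1}}{6} - \frac{(1+c)^3 a_{-2}}{6} = 0$$
and, for $j = 0, 1, 2, 3$, equations of the form
$$\frac{2^{j+4} + a_1 + (-1)^j a_{-1} + (-1)^j (1+c)^{j+4} a_{-2}}{(j+4)!} = R_j,$$
for example
$$\frac{64 + a_1 + a_{-1} + (1+c)^6 a_{-2}}{6!} = R_2 = \frac{2^2\beta_2}{2!} + \frac{\beta_1}{2!} + \frac{\beta_{-1}}{2!} + \frac{(1+c)^2\beta_{-2}}{2!}.$$
Solving, we get
$$a_1 = -\frac{3(c+3)}{c+2}, \quad a_0 = \frac{3(c+3)}{c+1}, \quad a_{-1} = -\frac{3+c}{c}, \quad a_{-2} = \frac{6}{c(c+1)(c+2)},$$
$$\beta_1 = \frac{(c+1)R_1 + 2(c+2)R_2 + 6R_3 - 6\beta_2(c+3)}{2(c+2)},$$
and the remaining coefficients $\beta_{-1}$, $\beta_{-2}$ and $\beta_0$ follow from the same equations in terms of the free parameter $\beta_2$.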
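The left-hand weights of the five-point formula at $x_2$ are, up to scaling, the weights of the fourth divided difference on the nodes $x_4, x_3, x_2, x_1, x_0$. The sketch below (hypothetical helper names, exact rational arithmetic) rebuilds them from the node offsets $2, 1, 0, -1, -(1+c)$ in units of h and compares them with the closed forms $a_1 = -3(c+3)/(c+2)$, $a_0 = 3(c+3)/(c+1)$, $a_{-1} = -(c+3)/c$, $a_{-2} = 6/\{c(c+1)(c+2)\}$.

```python
from fractions import Fraction


def divided_difference_weights(offsets):
    """Fourth-divided-difference weights on five nodes, scaled so the first weight is 1."""
    offsets = [Fraction(t) for t in offsets]
    raw = []
    for i, ti in enumerate(offsets):
        prod = Fraction(1)
        for j, tj in enumerate(offsets):
            if j != i:
                prod *= ti - tj   # product of node differences for node i
        raw.append(1 / prod)
    return [w / raw[0] for w in raw]


def closed_form(c):
    """[1, a_1, a_0, a_-1, a_-2] from the closed-form expressions."""
    c = Fraction(c)
    return [
        Fraction(1),
        -3 * (c + 3) / (c + 2),
        3 * (c + 3) / (c + 1),
        -(c + 3) / c,
        6 / (c * (c + 1) * (c + 2)),
    ]


def stencil(c):
    """Weights for the nodes x_4, x_3, x_2, x_1, x_0 (offsets from x_2 in units of h)."""
    c = Fraction(c)
    return divided_difference_weights([2, 1, 0, -1, -(1 + c)])
```

For c = 1 the stencil collapses to the familiar central difference weights 1, -4, 6, -4, 1.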
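For the discretization of the differential equation, for $k = 3, 4, \ldots, N - 3$,
we have
$$\delta^4 y_k = h^4\Big[2a_0 f_k + \sum_{m=1}^{2} a_m (f_{k+m} + f_{k-m})\Big] + t_k \qquad (2.7)$$
where
$$\delta^4 y_k = y_{k+2} - 4y_{k+1} + 6y_k - 4y_{k-1} + y_{k-2}, \quad f_k = y_k^{(4)}, \quad f_{k+m} = y_{k+m}^{(4)},$$
and the weights $a_0, a_1, a_2$ belong to the interior formula only.
Comparing the coefficients of both sides by Taylor's expansion, we get
$$a_0 = \frac{1}{3}, \quad a_1 = \frac{1}{6}, \quad a_2 = 0,$$
$$t_k = -\frac{1}{720} h^8 y_k^{(8)} + O(h^{10}).$$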
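The interior formula and its leading truncation term can be confirmed on monomials. The sketch below is an illustration under assumptions: the helper name `interior_residual` is hypothetical, the weights are $a_0 = 1/3$, $a_1 = 1/6$, $a_2 = 0$, and the node is placed at $x_k = 0$ so that $y = x^q$ isolates the coefficient of $y^{(q)}$.

```python
from fractions import Fraction
from math import factorial


def interior_residual(power, h=Fraction(1, 10)):
    """delta^4 y_k - h^4 [2 a0 f_k + a1 (f_{k+1} + f_{k-1})] for y = x**power at x_k = 0."""
    a0, a1 = Fraction(1, 3), Fraction(1, 6)
    zero = Fraction(0)

    def y(x):
        return x**power

    def f(x):
        # f = y'''' = power (power-1) (power-2) (power-3) x^(power-4)
        if power < 4:
            return Fraction(0)
        coeff = power * (power - 1) * (power - 2) * (power - 3)
        return coeff * x**(power - 4)

    lhs = y(2 * h) - 4 * y(h) + 6 * y(zero) - 4 * y(-h) + y(-2 * h)
    rhs = h**4 * (2 * a0 * f(zero) + a1 * (f(h) + f(-h)))
    return lhs - rhs
```

The residual is exactly zero for all monomials up to degree 7, and for $y = x^8$ it equals $-\frac{8!}{720} h^8 = -56\,h^8$, which is the leading term $-\frac{1}{720} h^8 y^{(8)}$.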
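For $k = N - 2$, we have
$$y_N + \mu_1 y_{N-1} + \mu_0 y_{N-2} + \mu_{-1} y_{N-3} + \mu_{-2} y_{N-4} = h^4[\nu_2 f_N + \nu_1 f_{N-1} + \nu_0 f_{N-2} + \nu_{-1} f_{N-3} + \nu_{-2} f_{N-4}] + t_{N-2} \qquad (2.8)$$
where
$$x_{N-1} = x_{N-2} + h, \quad x_N = x_{N-2} + h(2 - c), \quad x_{N-3} = x_{N-2} - h, \quad x_{N-4} = x_{N-2} - 2h.$$
Expanding about $x_{N-2}$,
$$\Big\{y_{N-2} + h(2-c)\,y'_{N-2} + h^2\frac{(2-c)^2}{2!}y''_{N-2} + h^3\frac{(2-c)^3}{3!}y^{(3)}_{N-2} + \cdots\Big\} + \cdots$$
$$= h^4\Big[\nu_2\Big\{y^{(4)}_{N-2} + h(2-c)\,y^{(5)}_{N-2} + h^2\frac{(2-c)^2}{2!}y^{(6)}_{N-2} + h^3\frac{(2-c)^3}{3!}y^{(7)}_{N-2} + \cdots\Big\} + \cdots + \nu_{-1}\Big\{y^{(4)}_{N-2} - h\,y^{(5)}_{N-2} + \frac{h^2}{2!}y^{(6)}_{N-2} - \frac{h^3}{3!}y^{(7)}_{N-2} + \cdots\Big\} + \cdots\Big]$$
Comparing the coefficients on both sides and solving the equations we get
$$\mu_1 = -\frac{(2-c)(3-c)(4-c)}{6}, \quad \mu_{-1} = -\frac{(2-c)(1-c)(4-c)}{2},$$
$$\mu_{-2} = \frac{(1-c)(2-c)(3-c)}{6}, \quad \mu_0 = -1 - (\mu_1 + \mu_{-2} + \mu_{-1}),$$
$$\nu_1 = \{2R_1 + 6R_2 + 6R_3 - (4-c)(3-c)(2-c)\nu_2\}/6,$$
$$\nu_{-1} = \{2R_2 + 6R_3 - 2R_1 - (1-c)(2-c)(4-c)\nu_2\}/2,$$
$$\nu_{-2} = \{R_1 - 6R_3 + (1-c)(2-c)(3-c)\nu_2\}/6,$$
$$R_j = \{(2-c)^{j+4} + \mu_1 + (-1)^{j+4}\mu_{-1} + (-1)^{j+4}2^{j+4}\mu_{-2}\}/(j+4)!, \quad j = 0, 1, 2, 3,$$
with $\nu_2$ a free parameter and $\nu_0$ fixed by the $h^4$ condition $R_0 = \nu_2 + \nu_1 + \nu_0 + \nu_{-1} + \nu_{-2}$.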
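A quick consistency check of the coefficients for k = N - 2: the sketch below evaluates the closed forms for $\mu_i$ and $\nu_i$ with exact fractions and returns the residuals of the Taylor conditions up to $h^7$. The function name is hypothetical, $\nu_2$ is treated as a free parameter, and taking $\nu_0 = R_0 - (\nu_2 + \nu_1 + \nu_{-1} + \nu_{-2})$ is an assumption based on the $h^4$ condition.

```python
from fractions import Fraction
from math import factorial


def right_boundary_residuals(c, nu2):
    """Residuals of the h^0..h^7 Taylor conditions for the formula at x_{N-2}."""
    c, nu2 = Fraction(c), Fraction(nu2)
    q = 2 - c  # offset of x_N from x_{N-2} in units of h
    mu1 = -(2 - c) * (3 - c) * (4 - c) / 6
    mu_m1 = -(2 - c) * (1 - c) * (4 - c) / 2
    mu_m2 = (1 - c) * (2 - c) * (3 - c) / 6
    mu0 = -1 - (mu1 + mu_m1 + mu_m2)

    # R_j: coefficient of h^(j+4) y^(j+4) on the left-hand side.
    R = [
        (q**(j + 4) + mu1 + (-1)**j * mu_m1 + (-1)**j * 2**(j + 4) * mu_m2)
        / factorial(j + 4)
        for j in range(4)
    ]
    nu1 = (2 * R[1] + 6 * R[2] + 6 * R[3] - (4 - c) * (3 - c) * (2 - c) * nu2) / 6
    nu_m1 = (2 * R[2] + 6 * R[3] - 2 * R[1] - (1 - c) * (2 - c) * (4 - c) * nu2) / 2
    nu_m2 = (R[1] - 6 * R[3] + (1 - c) * (2 - c) * (3 - c) * nu2) / 6
    nu0 = R[0] - (nu2 + nu1 + nu_m1 + nu_m2)  # assumed: h^4 condition

    residuals = []
    for m in range(4):  # h^0 .. h^3: left-hand side must vanish
        value = q**m + mu1 + (-1)**m * mu_m1 + (-2)**m * mu_m2
        if m == 0:
            value += mu0
        residuals.append(value)
    for j in range(4):  # h^4 .. h^7: left- and right-hand sides must agree
        value = q**j * nu2 + nu1 + (-1)**j * nu_m1 + (-2)**j * nu_m2
        if j == 0:
            value += nu0
        residuals.append(value - factorial(j) * R[j])
    return (mu1, mu0, mu_m1, mu_m2), (nu2, nu1, nu0, nu_m1, nu_m2), residuals
```

In the limit c = 0 with $\nu_2 = 0$ the formula reduces to the interior scheme: the $\mu$'s become -4, 6, -4, 1 and the $\nu$'s become 1/6, 2/3, 1/6, 0.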
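and finally the discretization of the boundary condition
$y''(1) = B_2$ leads to
$$-2y_N + \hat\mu_1 y_{N-1} + \hat\mu_2 y_{N-2} + \hat\mu_3 y_{N-3} + \mu h^2 y''(1) = h^4[\hat\nu_0 f_N + \hat\nu_1 f_{N-1} + \hat\nu_2 f_{N-2} + \hat\nu_3 f_{N-3}] + t_{N-1} \qquad (2.9)$$
Expanding about $x_{N-2} = x_N - (2 - c)h$,
$$-2\Big\{y_{N-2} + (2-c)h\,y'_{N-2} + \frac{(2-c)^2h^2}{2!}y''_{N-2} + \cdots\Big\} + \hat\mu_1\Big\{y_{N-2} + h\,y'_{N-2} + \frac{h^2}{2!}y''_{N-2} + \frac{h^3}{3!}y^{(3)}_{N-2} + \cdots\Big\} + \hat\mu_2 y_{N-2}$$
$$+\ \hat\mu_3\Big\{y_{N-2} - h\,y'_{N-2} + \frac{h^2}{2!}y''_{N-2} - \frac{h^3}{3!}y^{(3)}_{N-2} + \cdots\Big\} + \mu h^2\Big\{y^{(2)}_{N-2} + (2-c)h\,y^{(3)}_{N-2} + \frac{(2-c)^2h^2}{2!}y^{(4)}_{N-2} + \cdots\Big\}$$
$$= h^4\Big[\hat\nu_0\Big\{y^{(4)}_{N-2} + h(2-c)\,y^{(5)}_{N-2} + h^2\frac{(2-c)^2}{2!}y^{(6)}_{N-2} + h^3\frac{(2-c)^3}{3!}y^{(7)}_{N-2} + \cdots\Big\} + \cdots\Big]$$
Comparing coefficients of powers of h and solving the equations we get
$$\hat\mu_3 = \frac{(1-c)(3-2c)}{3}, \quad \mu = \frac{(c-1)(c-3)}{3},$$
$$\hat\mu_1 = \frac{(2c-5)(c-3)}{3}, \quad \hat\mu_2 = -\frac{4(1-c)(3-c)}{3},$$
$$\hat\nu_0 = \frac{R_1 - 6R_3}{(c-1)(2-c)(3-c)},$$
where $R_j$ now denotes the coefficient of $h^{j+4} y^{(j+4)}_{N-2}$ on the left-hand side of (2.9).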
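It is convenient to write (2.5) - (2.9) in matrix form for the convergence
analysis. Thus, setting
$$F = (d_{ij})_{i,j=1}^{N-1}$$
where
$$d_{k,k} = 6, \quad d_{k,k\pm1} = -4, \quad d_{k,k\pm2} = 1 \quad (k = 3, 4, \ldots, N - 3),$$
$$d_{N-2,N-3} = \mu_{-1}, \quad d_{N-2,N-4} = \mu_{-2},$$
$$d_{N-1,N-1} = \hat\mu_1, \quad d_{N-1,N-2} = \hat\mu_2, \quad d_{N-1,N-3} = \hat\mu_3,$$
and
$$G(Y) = (g_1, \ldots, g_{N-1})^T,$$
where $g_k$ collects the $h^4$-weighted values of f in the k-th equation (from (2.7) for $k = 3, 4, \ldots, N - 3$),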
we may write (2.5) - (2.9) in the matrix form
$$F Y + G(Y) + Q = T \qquad (2.10)$$
Let $\bar Y$ be an approximation of Y. Then
$$F \bar Y + G(\bar Y) + Q = 0 \qquad (2.11)$$
Subtracting (2.11) from (2.10) and setting
$$Y - \bar Y = E = (e_1, e_2, \ldots, e_{N-1})^T$$
we obtain the error equation
$$(F + h^4 M U)E = T \qquad (2.12)$$
where
$M = (m_{ij})_{i,j=1}^{N-1}$ is a penta-diagonal matrix with
$$m_{k,k} = 2a_0, \quad m_{k,k\pm1} = a_1, \quad m_{k,k\pm2} = 0 \quad (k = 3, 4, \ldots, N - 3),$$
$$m_{N-2,N-1} = \nu_1, \quad m_{N-2,N-2} = \nu_0,$$
$$m_{N-2,N-3} = \nu_{-1}, \quad m_{N-2,N-4} = \nu_{-2},$$
$$m_{N-1,N-1} = \hat\nu_1, \quad m_{N-1,N-2} = \hat\nu_2, \quad m_{N-1,N-3} = \hat\nu_3.$$
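2.4 CONVERGENCE OF FINITE DIFFERENCE SCHEME
Lemma: $\|F^{-1}\| \le O(h^{-4})$ in the sup norm.
Proof: The matrix F can be partitioned so that its leading (N - 3) x (N - 3) block is $\hat F$,
where
$$\hat F = \begin{bmatrix} a & b \\ c & d \end{bmatrix},$$
a is the 2 x 2 matrix
$$a = \begin{bmatrix} \alpha_1 & \alpha_2 \\ a_{-1} & a_0 \end{bmatrix},$$
b is the 2 x (N - 5) matrix
$$b = \begin{bmatrix} \alpha_3 & 0 & 0 & \cdots & 0 \\ a_1 & 1 & 0 & \cdots & 0 \end{bmatrix},$$
c is the (N - 5) x 2 matrix
$$c = \begin{bmatrix} 1 & -4 \\ 0 & 1 \\ 0 & 0 \\ \vdots & \vdots \\ 0 & 0 \end{bmatrix}$$
and d is the penta-diagonal (N - 5) x (N - 5) matrix where
$$d_{i,i} = 6, \quad d_{i,i\pm1} = -4, \quad d_{i,i\pm2} = 1.$$
Assume that
$$\hat F^{-1} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}.$$
This results in the following system of equations:
$$aA + bC = I \qquad (2.13)$$
$$aB + bD = 0 \qquad (2.14)$$
$$cA + dC = 0 \qquad (2.15)$$
$$cB + dD = I \qquad (2.16)$$
From (2.13) and (2.15),
$$A = (a - b\,d^{-1}c)^{-1} = \tilde a \ \text{(say)} = \begin{bmatrix} l_1 & l_2 \\ l_3 & l_4 \end{bmatrix}.$$
From (2.14) and (2.16),
$$B = -\tilde a\, b\, d^{-1},$$
$$D = d^{-1} + d^{-1} c\, \tilde a\, b\, d^{-1} \qquad (2.17)$$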
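We next show that $\|D\| \le O(N^4)$ (in sup norm).
From (2.17),
let $k = [c\,\tilde a\, b]_{(N-5) \times (N-5)} = (k_{ij})_{i,j=1}^{N-5}$,
where
$$k_{21} = l_3\alpha_3 + l_4 a_1, \quad k_{22} = l_4,$$
and all entries of k outside its leading 2 x 2 block are zero.
Also, since $d^{-1} = (D_{ij})_{i,j=1}^{N-5}$,
$d^{-1} c\, \tilde a\, b\, d^{-1}$ is the (N - 5) x (N - 5) matrix whose (i, j)th element is
given by
$$D_{1j}(k_{11}D_{i1} + k_{21}D_{i2}) + D_{2j}(k_{12}D_{i1} + k_{22}D_{i2})$$
where
$$D_{ij} = \frac{(N-i-6)(N-i-5)\,j(j+1)}{6(N-5)(N-6)(N-4)}\,[\,i(j+2)(N-6) - (i+1)(j-1)(N-4)\,] \quad \text{for } i \ge j \qquad (2.18)$$
and
$$D_{ij} = \frac{(N-j-6)(N-j-5)\,i(i+1)}{6(N-5)(N-6)(N-4)}\,[\,j(i+2)(N-6) - (j+1)(i-1)(N-4)\,] \quad \text{for } i \le j \quad [10].$$
Adding all terms of any row i we get
$$(k_{11}D_{i1} + k_{21}D_{i2})\sum_{j=1}^{N-5} D_{1j} + (k_{12}D_{i1} + k_{22}D_{i2})\sum_{j=1}^{N-5} D_{2j}.$$
It can be easily shown that
$$k_{11}D_{i1} + k_{21}D_{i2} \le O(1),$$
$$k_{12}D_{i1} + k_{22}D_{i2} \le O(1).$$
Also
$$\sum_{j=1}^{N-5} D_{1j} = \frac{(N-6)(N-7)}{12} \le O(N^2),$$
$$\sum_{j=1}^{N-5} D_{2j} = \frac{N^3 - 21N^2 + 142N - 316}{4(N-6)} \le O(N^2).$$
Hence
$$\text{sum of all terms of any row } i \le O(N^2) \qquad (2.19)$$
From (2.18) and (2.19), $\|d^{-1} c\, \tilde a\, b\, d^{-1}\| \le O(N^2)$.
From (2.17),
$$\|D\| \le \|d^{-1}\| + \|d^{-1} c\, \tilde a\, b\, d^{-1}\|,$$
$$\|D\| \le O(h^{-4}).$$
Since all of the matrices A, B, C, D have sup norm less than or equal to $O(h^{-4})$, so does $\hat F^{-1}$.
Now partitioning the matrix F as
$$F = \begin{bmatrix} p & q \\ r & s \end{bmatrix}$$
where $p = [\hat F]_{(N-3) \times (N-3)}$,
$q = [0]_{(N-3) \times 2}$,
and r and s are the remaining 2 x (N - 3) and 2 x 2 blocks,
it can be shown in a similar manner as above that $\|F^{-1}\| \le O(h^{-4})$.
From (2.12), taking the sup norm of both sides we obtain
$$\|E\| \le \frac{\|F^{-1}\|\,\|T\|}{1 - h^4\,\|F^{-1}\|\,\|M\|\,\|U\|} \qquad (2.20)$$
provided $U^* = \sup_{1 \le k \le N} |U_k| < c$ for an appropriate positive number c.
It is shown above that $\|F^{-1}\| \le O(h^{-4})$, so (2.20) bounds the error in terms of the truncation error $\|T\|$.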
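For completeness, the step from the error equation (2.12) to the bound (2.20) can be written out as follows. This is a standard norm argument, sketched under the assumptions that F is invertible and that $h^4 \|F^{-1}\| \|M\| \|U\| < 1$ (which is what the condition on $U^*$ is meant to guarantee).

```latex
\begin{align*}
(F + h^4 M U)E = T
  &\;\Longrightarrow\; E = F^{-1}T - h^4 F^{-1} M U E \\
  &\;\Longrightarrow\; \|E\| \le \|F^{-1}\|\,\|T\| + h^4 \|F^{-1}\|\,\|M\|\,\|U\|\,\|E\| \\
  &\;\Longrightarrow\; \bigl(1 - h^4 \|F^{-1}\|\,\|M\|\,\|U\|\bigr)\|E\| \le \|F^{-1}\|\,\|T\| \\
  &\;\Longrightarrow\; \|E\| \le \frac{\|F^{-1}\|\,\|T\|}{1 - h^4 \|F^{-1}\|\,\|M\|\,\|U\|}.
\end{align*}
```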
2.5 NUMERICAL ILLUSTRATION
For the numerical illustration, we consider a linear two-point boundary
value problem of the class (2.1),
subject to the boundary conditions
$$y(0) = 1.0, \quad y(1) = 0, \quad y''(0) = -1.0, \quad y''(1) = -6e \qquad (2.21)$$
with exact solution $y(x) = (1 - x^2)\exp(x)$. We solved (2.21) using the classical
second-order method and the fourth-order method given by (2.5) - (2.9).
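The boundary values in (2.21) can be checked directly against the exact solution. In the sketch below the second derivative $y''(x) = -(1 + 4x + x^2)e^x$ is obtained by differentiating $(1 - x^2)e^x$ twice by hand, and a central difference is used only as an independent cross-check; the function names are hypothetical.

```python
import math


def y_exact(x):
    """Exact solution y(x) = (1 - x^2) e^x."""
    return (1.0 - x * x) * math.exp(x)


def ypp_exact(x):
    """Second derivative of (1 - x^2) e^x, i.e. -(1 + 4x + x^2) e^x."""
    return -(1.0 + 4.0 * x + x * x) * math.exp(x)


def ypp_central(x, h=1e-4):
    """Central-difference approximation of y'' used as a cross-check."""
    return (y_exact(x + h) - 2.0 * y_exact(x) + y_exact(x - h)) / (h * h)


if __name__ == "__main__":
    print(y_exact(0.0), y_exact(1.0))               # boundary values of y
    print(ypp_exact(0.0), ypp_exact(1.0), -6.0 * math.e)  # boundary values of y''
```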
Let $T_{j,N}$ = time taken to solve the N x N algebraic system on the jth processor using
the fourth-order scheme,
$T_{p,N} = \max_j \{T_{j,N}\}$,
$T_{1,n}$ = time taken for solving the (n - 1) x (n - 1) system to obtain $y_1, y_2, \ldots, y_{n-1}$
at the (n - 1) abscissas $x_1, x_2, \ldots, x_{n-1}$ respectively, using the classical
second-order method,
$\|E\|_N^{(j)}$ = maximum of absolute error obtained while solving the N x N system
on the jth processor using the fourth-order method,
$\|E\|_N^{(p)} = \max_j \{\|E\|_N^{(j)}\}$,
$\|E\|_n^{(1)}$ = maximum of absolute error obtained while solving the (n - 1) x (n - 1)
system using the classical second-order method.
Then p (p = 8, 16, 32, 64, 128) N x N systems, for N = 8, 16, 32, 64 and 128
respectively, were solved in parallel, each on a single processor, and the
CPU time was noted against each. Speed-up was calculated by comparing $T_{1,n}$
(obtained using the classical second-order method as benchmark) and $T_{p,N}$ for
values of $\|E\|_n^{(1)}$ and $\|E\|_N^{(p)}$ satisfying $\|E\|_N^{(p)} \le \|E\|_n^{(1)}$ such that $\|E\|_N^{(p)}$ is as
close to $\|E\|_n^{(1)}$ as possible.
The results are presented in Table 1. We note that as the system of algebraic
equations becomes large (n = 1024, 4096), the speed-up is considerably
improved and is very nearly equal to the number of processors used.
TABLE 1

Maximum absolute errors    Classical second-order method    Fourth-order method                     Speed-up (S_p = T_{1,n}/T_{p,N})
5.08(-7), 3.68(-7)         n = 512,  T_{1,n} = 2.0(-2)      N = 8,  p = 64,  T_{p,N} = 6.0(-4)      33.3
1.33(-7), 2.85(-8)         n = 1024, T_{1,n} = 4.0(-2)      N = 64, p = 16,  T_{p,N} = 3.2(-3)
3.24(-8), 2.86(-8)         n = 2048, T_{1,n} = 8.0(-2)      N = 32, p = 64,  T_{p,N} = 1.6(-3)      50
8.11(-9), 1.85(-9)         n = 4096, T_{1,n} = 1.6(-1)      N = 32, p = 128, T_{p,N} = 1.6(-3)      100

Note: a.b(-c) means a.b x 10^{-c}.
2.6 THEORETICAL ESTIMATES
We next calculate theoretically the speed-up for the linear two-point boundary
value problem given by
$$y^{(iv)} = f(x)\,y + g(x), \quad y(0) = A_1,\ y(1) = B_1,\ y''(0) = A_2,\ y''(1) = B_2 \qquad (2.22)$$
as attempted by an algorithm using the fourth-order method for non-uniform meshes.
Suppose
$k_1$ = time for evaluating f(x) at any $x \in [0, 1]$,
$k_2$ = time for evaluating g(x) at any $x \in [0, 1]$,
$\gamma$ = time for a single addition/subtraction,
$\mu$ = time for a single multiplication,
$\nu$ = time for a single division.
For our linear TPBVP, the following estimates were observed:
$$k_1 = 1.58 \times 10^{-6}, \quad k_2 = 1.456 \times 10^{-5}, \quad \gamma = 6.08 \times 10^{-7}, \quad \mu = \nu = 6.95 \times 10^{-7}.$$
Then the time taken for solving (2.22) by the classical second-order method is given by
$$T_{1,n} = 20\gamma + 26\mu + (n+1)k_1 + (n+1)k_2 + (13n - 35)\gamma + (16n - 47)\mu$$
$$T_{1,n} = (13n - 15)\gamma + (16n - 21)\mu + (n+1)(k_1 + k_2)$$
and the time taken by the parallel algorithm is given by
$$T_{p,N} = 112\gamma + 204\mu + (N+2)(k_1 + k_2) + (17N - 39)\gamma + (26N - 71)\mu$$
$$T_{p,N} = (17N + 73)\gamma + (26N + 133)\mu + (N+2)(k_1 + k_2)$$
Thus the speed-up is given by
$$S_p = \frac{T_{1,n}}{T_{p,N}} = \frac{(13n - 15)\gamma + (16n - 21)\mu + (n+1)(k_1 + k_2)}{(17N + 73)\gamma + (26N + 133)\mu + (N+2)(k_1 + k_2)}$$
Thus if n = 512, N = 8, then the speed-up is $S_p = 34.22$. Theoretical
estimates thus confirm the results obtained in section 2.5.
We may note that in case f(x, y) is non-linear, the theoretical estimates for
the speed-up would be a function of the numbers of iterations $n_1$ and $n_2$ required
for the convergence of the classical second-order and fourth-order methods
respectively.
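The operation-count model above is easy to evaluate numerically. The sketch below uses hypothetical function names and the measured constants quoted in the text as defaults; it uses the simplified form $(17N + 73)\gamma + (26N + 133)\mu + (N+2)(k_1 + k_2)$ for the parallel time, which is what the unsimplified sum reduces to.

```python
# Measured unit times quoted in the text (seconds).
K1, K2 = 1.58e-6, 1.456e-5       # evaluating f(x) and g(x)
GAMMA, MU = 6.08e-7, 6.95e-7     # addition/subtraction and multiplication (= division)


def t_classical(n, gamma=GAMMA, mu=MU, k1=K1, k2=K2):
    """Model time T_{1,n} of the classical second-order method."""
    return (13 * n - 15) * gamma + (16 * n - 21) * mu + (n + 1) * (k1 + k2)


def t_parallel(N, gamma=GAMMA, mu=MU, k1=K1, k2=K2):
    """Model time T_{p,N} of one N x N fourth-order solve."""
    return (17 * N + 73) * gamma + (26 * N + 133) * mu + (N + 2) * (k1 + k2)


def speed_up(n, N):
    """Theoretical speed-up S_p = T_{1,n} / T_{p,N}."""
    return t_classical(n) / t_parallel(N)


if __name__ == "__main__":
    print(round(speed_up(512, 8), 2))
```

For n = 512 and N = 8 this model gives a speed-up of roughly 34, in line with the value quoted above and with the measured 33.3 in Table 1.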
2.7 EXTENSION OF PARALLEL ALGORITHM FOR ELLIPTIC PDE's
We consider the solution of the linear partial differential equation
$$A\frac{\partial^2 u}{\partial x^2} + C\frac{\partial^2 u}{\partial y^2} + D\frac{\partial u}{\partial x} + E\frac{\partial u}{\partial y} + Fu = F^* \qquad (2.24)$$
in the region $\mathcal{R}$ with boundary $\partial\mathcal{R}$, where A, C, D, E, F, F* are continuous
functions of x and y in $\mathcal{R} + \partial\mathcal{R}$. We shall assume that A > 0, C > 0 and $F \le 0$ so
that the ellipticity condition and the weak min-max principle are satisfied. The boundary
conditions (BC), as is well known, can be one of the following three types:
(i) The Dirichlet BC.
We have
$$u(x, y) = g(x, y), \quad (x, y) \in \partial\mathcal{R} \qquad (2.25)$$
where g(x, y) is a prescribed function which is defined and continuous on $\partial\mathcal{R}$. The
condition (2.25) is the Dirichlet Condition (DC).
(ii) The Neumann BC
We have
$$\frac{\partial u}{\partial n} = g(x, y), \quad (x, y) \in \partial\mathcal{R}$$
where g(x, y) is a prescribed function defined and continuous on $\partial\mathcal{R}$ and n is the
outwardly directed normal. This is the Neumann BC.
(iii) Mixed BC
We have
$$\frac{\partial u}{\partial n} + a(x, y)\,u = g(x, y), \quad (x, y) \in \partial\mathcal{R} \qquad (2.26)$$
where a(x, y) > 0 and g(x, y) are defined and continuous on $\partial\mathcal{R}$.
First of all we concentrate our attention on the Poisson equation, namely,
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = F^* \qquad (2.27)$$
Thus,
A = C = 1, D = E = F = 0 and $F^* \ne 0$ in (2.24),
and $\partial\mathcal{R}$ is the boundary of the rectangular region on which the BCs are specified. Thus
$$u(0, y) = g(0, y), \quad u(1, y) = g(1, y), \quad 0 \le y \le 1,$$
$$u(x, 0) = g(x, 0), \quad u(x, 1) = g(x, 1), \quad 0 \le x \le 1. \qquad (2.28)$$
In order to find the numerical solution of (2.24) with appropriate BCs (in this case,
say (2.28)) we superimpose on $\mathcal{R}$ a rectangular network with mesh lengths h and k
in the x and y directions respectively. Thus, the nodal points are given by
$$x_l = x_0 + lh, \quad l = 0, \pm 1, \pm 2, \ldots$$
$$y_m = y_0 + mk, \quad m = 0, \pm 1, \pm 2, \ldots$$
The aim is to find the solution at the nodal points $(x_l, y_m)$ for small values of the
mesh spacings h and k.
In this section, our basic aim is to consider the development of parallel algorithms for
the class of BVPs (2.27) - (2.28). Because elliptic PDEs always
occur as BVPs, these algorithms could, in principle, be obtained as an extension of
the parallel algorithm presented in [1].
In this case, following [1], we divide [0, 1] along the x-axis into p different
divisions, each division consisting of N or (N + 1) (N small) unequal intervals.
Similarly we divide [0, 1] along the y-axis into p different divisions. We then have a
total of $p^2$ regions (p a power of 2) on which the solution of (2.24) can be obtained
simultaneously, provided $p^2$ processors are available.
For the solution of (2.24) on the coarser $p^2$ grids we would have to develop high
order finite difference schemes for the BVP (2.27) - (2.28). As in [1], a high order
scheme would have to be developed for a general non-uniform grid on the
rectangular domain. Once the high order methods are successfully devised, the
scheme is applied to each of the $p^2$ regions on which the BVP is defined. This finally
leads to N x N or (N - 1) x (N - 1) systems of linear or non-linear
equations which are solved concurrently on the $p^2$ processors. As an example, let us
consider the case when p = 4, $N_h$ = 4, $N_k$ = 3. The final region in which the
solution has to be obtained is the following:
[Figure 1: the final region, the unit square in the x-y plane.]
The $p^2$ processors solve the problem (2.24) concurrently on the
following 16 grids. This yields the solution at all points depicted in Figure 1, and since
the solution is found in parallel a substantial speed-up is expected, especially in view
of the fact that no communication is required amongst the processors.
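The way the $p^2$ coarse grids tile the fine grid can be sketched by taking the product of two one-dimensional index sets. This is only an illustration under assumptions: the one-dimensional rule is the chopping rule $p_i = 2^{k-d}(2m-1) + i\,2^k$ of section 2.2 applied independently in x and y, $N_h$ and $N_k$ are read as the numbers of intervals per division in each direction, and all function names are hypothetical.

```python
def division_indices(j, k, N):
    """Fine-mesh indices of division j (1 <= j <= 2**k) in one coordinate direction."""
    if j == 1:
        return [2**k * (1 + i) for i in range(N - 1)]
    d = (j - 1).bit_length()          # j = 2^(d-1) + m
    m = j - 2**(d - 1)
    return [2**(k - d) * (2 * m - 1) + i * 2**k for i in range(N)]


def coarse_grid(jx, jy, k, Nx, Ny):
    """Interior nodes (as index pairs) of the coarse grid owned by processor (jx, jy)."""
    return {(ix, iy)
            for ix in division_indices(jx, k, Nx)
            for iy in division_indices(jy, k, Ny)}


def fine_grid(k, Nx, Ny):
    """All interior nodes of the fine grid with Nx * 2^k by Ny * 2^k intervals."""
    p = 2**k
    return {(ix, iy) for ix in range(1, Nx * p) for iy in range(1, Ny * p)}


if __name__ == "__main__":
    k, Nx, Ny = 2, 4, 3               # p = 4, N_h = 4, N_k = 3
    grids = [coarse_grid(jx, jy, k, Nx, Ny)
             for jx in range(1, 5) for jy in range(1, 5)]
    print(len(grids), sum(len(g) for g in grids))
```

Under these assumptions the 16 coarse grids are pairwise disjoint and their union is the full set of interior fine-grid nodes, so no node is solved twice and no inter-processor communication is needed to assemble the result.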
[Figures: Grid # 0 and Grids # 1 to # 4 on the unit square, with axes x and y; the tick labels show x-intervals of length 1/4 and y-intervals of lengths 1/3, 1/4, 1/6 and 1/12.]
For each different division of h, 4 different grids (pertaining to different k's) can be
drawn. So the total number of grids is 4 x 4 = 16. Four grids pertaining to h = 1/4 are
shown in the figure. Twelve more grids can be drawn. If all the grids are
superimposed, we get the solution of the BVP (2.24) at the nodes on GRID # 0.
Now, we have to develop a high order finite difference scheme for obtaining the
solution simultaneously on the $p^2$ regions, consisting of coarser non-uniform grids.
Before the actual implementation, the stability and convergence of the scheme
should be proved, which could be a tricky affair in itself.
As an example, consider the problem (2.27) - (2.28), i.e. the Poisson equation
together with the BCs (2.28), and
write
$$\frac{\partial^2 u}{\partial x^2} = \frac{F^*}{2} \quad \text{and} \quad \frac{\partial^2 u}{\partial y^2} = \frac{F^*}{2} \qquad (2.29)$$
Discretization of the above differential equations leads to
$$u_{k+1,l} - 2u_{k,l} + u_{k-1,l} = \frac{h^2}{12}\,\frac{F^*_{k+1,l} + 10F^*_{k,l} + F^*_{k-1,l}}{2} \qquad (2.30)$$
$$u_{k,l+1} - 2u_{k,l} + u_{k,l-1} = \frac{k^2}{12}\,\frac{F^*_{k,l+1} + 10F^*_{k,l} + F^*_{k,l-1}}{2} \qquad (2.31)$$
Substituting (2.30) and (2.31) in (2.27), i.e. adding them with k = h, gives
$$u_{k+1,l} + u_{k-1,l} - 4u_{k,l} + u_{k,l+1} + u_{k,l-1} = \frac{h^2}{24}\left(F^*_{k+1,l} + F^*_{k-1,l} + 20F^*_{k,l} + F^*_{k,l-1} + F^*_{k,l+1}\right) \qquad (2.32)$$
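The order of the five-point formula $u_{k+1,l} + u_{k-1,l} - 4u_{k,l} + u_{k,l+1} + u_{k,l-1} = \frac{h^2}{24}(F^*_{k+1,l} + F^*_{k-1,l} + 20F^*_{k,l} + F^*_{k,l-1} + F^*_{k,l+1})$ can be probed numerically. The sketch below is illustrative only: the test function $u = e^{x+y}$ is an assumption chosen because it satisfies $u_{xx} = u_{yy} = F^*/2$, which the splitting of the Poisson equation into two equal halves presupposes, and the function name is hypothetical.

```python
import math


def residual(h, x=0.5, y=0.5):
    """Left-hand side minus right-hand side of the five-point formula for u = exp(x + y)."""
    def u(s, t):
        return math.exp(s + t)

    def F(s, t):
        return 2.0 * math.exp(s + t)   # F* = u_xx + u_yy for this u

    lhs = u(x + h, y) + u(x - h, y) - 4.0 * u(x, y) + u(x, y + h) + u(x, y - h)
    rhs = h * h / 24.0 * (F(x + h, y) + F(x - h, y) + 20.0 * F(x, y)
                          + F(x, y - h) + F(x, y + h))
    return lhs - rhs


if __name__ == "__main__":
    r1, r2 = residual(0.1), residual(0.05)
    # Halving h should divide the residual by about 2^6 for an O(h^6) local error.
    print(r1, r2, math.log2(r1 / r2))
```

For this test function the residual behaves like $-h^6 (u_{x^6} + u_{y^6})/240$, so halving h reduces it by a factor close to 64; for a general u that does not satisfy the equal splitting, this check does not apply.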
Similarly, equations (2.33) and (2.34) are obtained,
and adding (2.33) and (2.34) gives (2.35), for
$$1 \le k \le N, \quad 1 \le l \le N',$$
where [0, 1] is subdivided as
$$0 = x_0 < x_1 < \cdots < x_{N+1} = 1 \quad \text{and} \quad 0 = y_0 < y_1 < \cdots < y_{N'+1} = 1.$$
Similarly, we obtain (2.36) and
$$u_{k,N+1} - (1 + \mu_N)u_{k,N} + \mu_N u_{k,N-1} = \frac{k^2}{12}\,\frac{(\mu_N^2 + \mu_N - 1)F^*_{k,N+1} + (\mu_N + 1)(\mu_N^2 + 3\mu_N + 1)F^*_{k,N} + \mu_N(1 + \mu_N - \mu_N^2)F^*_{k,N-1}}{2} + t_N \qquad (2.37)$$
where $\mu_N$ is the ratio of the last y-interval to the one before it.
Adding (2.36) and (2.37) gives (2.38).
(2.32), (2.35) and (2.38) can be written in the matrix form
$$A Y + G(Y) + Q = T \qquad (2.39)$$
for appropriate matrices A, G(Y) and Q, with
$$Y = [u_{11}, u_{12}, \ldots, u_{1N'}, u_{21}, \ldots, u_{2N'}, \ldots, u_{NN'}]^T.$$
Let $\bar Y$ be an approximation to Y. Then
$$A \bar Y + G(\bar Y) + Q = 0 \qquad (2.40)$$
We can show that the matrix A is irreducible and monotone since its off-diagonal
elements are non-positive. A corresponding bound, (2.41), can also be shown.
Finally, there is no basic reason why our algorithm should not work for BVPs and
initial value problems with other kinds of PDEs. For example, we could easily
consider the parabolic equation
$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}$$
with any of the following conditions:
(i) Initial Condition
$$u(x, 0) = f(x), \quad 0 \le x < \infty$$
Boundary Conditions
$$a_0(0, t)\,u + a_1(0, t)\frac{\partial u}{\partial x} = a_2(0, t)$$
where
$a_0(0, t) \ge 0$, $a_1(0, t) \le 0$ and $a_0 - a_1 > 0$, and also a condition at $x = \infty$, $t \ge 0$.
(ii) Initial Condition
$$u(x, 0) = f(x), \quad a \le x \le b$$
Boundary Conditions
$$a_0(a, t)\,u + a_1(a, t)\frac{\partial u}{\partial x} = a_2(a, t)$$
$$b_0(b, t)\,u + b_1(b, t)\frac{\partial u}{\partial x} = b_2(b, t)$$
where
$a_0(a, t) \ge 0$, $a_1(a, t) \le 0$ and $a_0 - a_1 > 0$,
$b_0(b, t) \ge 0$, $b_1(b, t) \le 0$ and $b_0 - b_1 > 0$;
then this is an initial boundary value problem.
Thus, we see that our algorithm is versatile in the sense that the same
algorithm can be used, in parallel, to obtain the solution of varied BVPs. The only
requirement is that we must be able to obtain high and very high order finite difference
methods for the relevant BVPs and IVPs. In view of the fact that no communication
takes place, the speed-up expected is substantial.
REFERENCES
[1] M. Paprzycki and I. Gladwell, A parallel chopping algorithm for ODE
boundary value problems, Parallel Comput. 19, 651-666 (1993).
[2] C. P. Katti and S. Goel, A parallel mesh chopping algorithm for a class of
two-point boundary value problems, Computers Math. Applic. Vol. 35, No.
9, 121-128 (1998).
[3] U. Ascher and S. Y. P. Chan, On parallel methods for boundary value
ODEs, University of British Columbia, Department of Computer Science,
Technical Report 89-119, 1989.
[4] M. Paprzycki and I. Gladwell, Solving almost block diagonal systems on
parallel computers, Parallel Comput. 17, 133-153 (1991).
[5] M. Paprzycki and I. Gladwell, Solving almost block diagonal systems using
level 3 BLAS, in Proc. 5th SIAM Conference on Parallel Processing for
Scientific Computing (edited by J. Dongarra et al.), pp. 52-62, SIAM,
Philadelphia, PA (1992).
[6] P. Henrici, Discrete Variable Methods in Ordinary Differential Equations,
John Wiley, New York (1962).
[7] U. Ascher, J. Christiansen and R. D. Russell, Collocation software for
boundary value ODEs, ACM Trans. Math. Soft. 7 (1981) 209-229.
[8] G. Bader and U. Ascher, A new basis implementation for a mixed-order
boundary value ODE solver, SIAM J. Sci. Stat. Comp. 8 (1987) 483-500.
[9] G. S. Subramianium, Variable mesh difference methods for the solution of
two point singular perturbation BVPs, Ph.D. Thesis, Indian Institute
of Technology, Delhi (1982).
[10] Riaz A. Usmani and Dereck S. Meek, On the application of a five-band
matrix in the numerical solution of a boundary value problem, Utilitas
Mathematica Vol. 14 (1978), pp. 21-29.
[11] L. F. Shampine, Boundary value problems for ordinary differential equations,
SIAM Journal of Numerical Analysis, 2 (1968) 219-242.
[12] M. M. Chawla and C. P. Katti, A finite difference method for a class of singular
two-point boundary value problems, IMA Journal of Numerical Analysis,
4 (1984) 457-466.
[13] J. R. Cash and A. Singhal, High order methods for the numerical solution of
two-point boundary value problems, BIT 22 (1982) 184-199.