https://ntrs.nasa.gov/search.jsp?R=20000093950

Airbreathing Propulsion System Analysis Using

Multithreaded Parallel Processing

Richard Gregory Schunk 1 and T. J. Chung 2

1NASA/MSFC, Huntsville, Alabama

2Department of Mechanical & Aerospace Engineering

The University of Alabama in Huntsville

SUMMARY

In this paper, parallel processing is used to analyze the mixing and combustion behavior

of hypersonic flow. Preliminary work for a sonic transverse hydrogen jet injected from a

slot into a Mach 4 airstream in a two-dimensional duct combustor has been completed

[Moon and Chung, 1996]. Our aim is to extend this work to a three-dimensional domain

using multithreaded domain decomposition parallel processing based on the flowfield-

dependent variation theory [Schunk, Canabal, Heard, and Chung, 1999; Schunk and

Chung, 1999].

Numerical simulations of chemically reacting flows are difficult because of the strong

interactions between the turbulent hydrodynamic and chemical processes. The algorithm

must provide an accurate representation of the flowfield, since unphysical flowfield

calculations will lead to the faulty loss or creation of species mass fraction, or even

premature ignition, which in turn alters the flowfield information. Another difficulty

arises from the disparity in time scales between the flowfield and chemical reactions,

which may require the use of finite rate chemistry. The situation becomes more complex

when turbulence introduces a further disparity in length scales. In order to cope with

these complicated physical phenomena, it is our plan to utilize the flowfield-dependent

variation theory mentioned above, facilitated by large eddy simulation. Undoubtedly, the

proposed computation requires the most sophisticated computational strategies. The

multithreaded domain decomposition parallel processing will be necessary in order to

reduce both computational time and storage. Without such special computational

treatment, the analysis of airbreathing combustion appears to be

difficult, if not impossible. The parallel processing strategy is described in detail below.

Multi-threaded programming is utilized to take advantage of multiple computational

elements on the host computer. Typically, a multi-threaded process will spawn multiple

threads which are allocated by the operating system to the available computational

elements (or processors) within the system. If more than one processor is available, the

threads may execute in parallel, resulting in a significant reduction in execution time. If

more threads are spawned than available processors, the threads appear to execute

concurrently as the operating system decides which threads execute while the others wait.

One unique advantage of multi-threaded programming on shared memory multiprocessor

systems is the ability to share global memory. This alleviates the need for data exchange

or message passing between threads as all global memory allocated by the parent process

is available to each thread. However, precautions must be taken to prevent deadlock or

race conditions resulting from multiple threads trying to simultaneously write to the same data.
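The shared-memory behavior described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it uses Python's standard `threading` module as a stand-in for Pthreads or NT threads, and the function name `parallel_sum_of_squares` and its arguments are hypothetical. Note that in CPython the global interpreter lock prevents pure-Python threads from running computations truly in parallel, so the sketch shows the structure (shared data, disjoint writes, one lock-protected update) rather than a speedup.

```python
import threading

def parallel_sum_of_squares(values, n_threads=4):
    """Hypothetical example: threads share memory and guard one shared update."""
    squares = [0] * len(values)    # allocated by the parent, visible to every thread
    total = [0]                    # shared accumulator (a list so workers can mutate it)
    total_lock = threading.Lock()  # protects the read-modify-write on total[0]

    def worker(tid):
        partial = 0
        # Each thread handles a disjoint set of indices, so no two threads
        # ever write to the same element of `squares`.
        for i in range(tid, len(values), n_threads):
            squares[i] = values[i] * values[i]
            partial += squares[i]
        # All threads update the same location here, so access is serialized.
        with total_lock:
            total[0] += partial

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                   # wait for every thread before reading results
    return squares, total[0]
```

No data is copied or passed between threads; the lock is needed only where several threads write to the same location.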

Threads are implemented by linking an application to a shared library and making calls to the routines within that library. Two popular implementations are widely used: the Pthreads library (and its derivatives) that are available on most Unix operating systems and the NT threads library that is available under Windows NT. There are differences between the two implementations, but applications can be ported from one to the other with moderate ease and many of the basic functions are similar albeit with different names and syntax.

Domain decomposition methods can be used in conjunction with multi-threaded programming to create an efficient parallel application. The sub-domains resulting from the decomposition provide a convenient division of labor for the processing elements within the host computer. In this application, an Additive Schwarz domain decomposition method is utilized. The method is illustrated below (Figure 1) for a two-dimensional square mesh that is decomposed into four sub-domains. The nodes belonging to each of the four sub-domains are denoted with geometric symbols while boundary nodes are identified with bold crosses. The desire is to solve for each node implicitly within a single sub-domain. For nodes on the edge of each sub-domain this is accomplished by treating the adjacent node in the neighboring sub-domain as a boundary. The overlapping of neighboring nodes between sub-domains is illustrated in Figure 2. Higher degrees of overlapping, which may improve convergence at the expense of computation time, are also used.
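To make the decomposition concrete, the following is a minimal one-dimensional sketch under stated assumptions; it is not the flowfield solver of this paper. It solves the model problem -u'' = 1 on (0, 1) with u(0) = u(1) = 0 by splitting the nodes into overlapping blocks, solving each block implicitly (Thomas algorithm), and taking the values just outside each block from the previous sweep, i.e. lagged. Only the nodes a block owns are written back, which is the variant usually called restricted additive Schwarz. All function names and parameters are hypothetical.

```python
def solve_block(rhs, left, right):
    """Thomas algorithm for 2*u[i] - u[i-1] - u[i+1] = rhs[i], with Dirichlet
    values `left` and `right` located just outside the block."""
    n = len(rhs)
    b = list(rhs)
    b[0] += left
    b[-1] += right
    c = [0.0] * n                      # modified super-diagonal
    d = [0.0] * n                      # modified right-hand side
    c[0] = -0.5
    d[0] = b[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (b[i] + d[i - 1]) / denom
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u

def schwarz_sweep(u, rhs, blocks, overlap):
    """One sweep: every block is solved against lagged values taken from `u`."""
    n = len(u)
    new_u = list(u)
    for start, stop in blocks:                  # owned nodes: start .. stop-1
        lo = max(0, start - overlap)            # extended (overlapping) range
        hi = min(n, stop + overlap)
        left = u[lo - 1] if lo > 0 else 0.0     # lagged neighbor or wall value
        right = u[hi] if hi < n else 0.0
        local = solve_block(rhs[lo:hi], left, right)
        new_u[start:stop] = local[start - lo:stop - lo]
    return new_u

def solve_poisson(n=32, n_blocks=4, overlap=2, sweeps=1000):
    """Assumes n is divisible by n_blocks; returns the interior node values."""
    h = 1.0 / (n + 1)
    rhs = [h * h] * n                           # discretized source term for -u'' = 1
    size = n // n_blocks
    blocks = [(k * size, (k + 1) * size) for k in range(n_blocks)]
    u = [0.0] * n
    for _ in range(sweeps):
        u = schwarz_sweep(u, rhs, blocks, overlap)
    return u
```

Because each block reads only lagged values, the blocks within one sweep are independent of each other and could be handed to separate threads. Setting `overlap=0` reduces the sketch to a block Jacobi iteration.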

In a parallel application, load balancing between processors is critical to achieving optimum performance. Ideally, if a domain could be decomposed into regions requiring an identical amount of computation, it would be a simple matter to divide the problem between processing elements as shown in Figure 3 for four threads executing on an equal number of processors.

Unfortunately, in a "real world" application the domain may not be decomposed such that the computation for each processor is balanced, resulting in lost efficiency. If the execution time required for each sub-domain is not identical, the CPUs will become idle for portions of time as shown in Figure 4.
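The cost of an unbalanced decomposition can be quantified with a short calculation. The sketch below assumes one sub-domain per CPU and a synchronization at the end of every timestep, so the timestep lasts as long as the slowest sub-domain. The function name and the example times are hypothetical, not measurements from this work.

```python
def idle_fraction(times):
    """Fraction of total CPU time spent idle in one timestep, given the
    execution time of the single sub-domain assigned to each CPU."""
    step = max(times)                 # all CPUs wait for the slowest one
    busy = sum(times)                 # time actually spent computing
    return 1.0 - busy / (len(times) * step)

# Hypothetical per-sub-domain times (arbitrary units) for four CPUs.
print(idle_fraction([2.5, 2.5, 2.5, 2.5]))  # balanced case: 0.0
print(idle_fraction([4.0, 2.0, 3.0, 1.0]))  # unbalanced case: 0.375
```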

One approach to load balancing, as implemented in this application, is to decompose the domain into more sub-domains than available processors and use threads to perform the computations within each block. The finer granularity permits a more even distribution of work amongst the available processing elements as shown in Figure 5.
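The effect of finer granularity can be illustrated with a small scheduling model. This is a sketch under simplifying assumptions: sub-domain costs are known in advance, a CPU picks up the next sub-domain in list order as soon as it becomes free, and scheduling overhead is ignored. The cost values are hypothetical; the eight-block list is simply the four-block list with every block split in half.

```python
import heapq

def makespan(costs, n_cpus):
    """Time to finish one timestep when each free CPU takes the next sub-domain."""
    free_at = [0.0] * n_cpus          # time at which each CPU next becomes free
    heapq.heapify(free_at)
    for cost in costs:
        start = heapq.heappop(free_at)        # earliest available CPU
        heapq.heappush(free_at, start + cost)
    return max(free_at)               # timestep ends when the last CPU finishes

coarse = [4.0, 2.0, 3.0, 1.0]                        # 4 sub-domains
fine = [2.0, 2.0, 1.0, 1.0, 1.5, 1.5, 0.5, 0.5]      # same work, 8 sub-domains

print(makespan(coarse, 4))  # 4.0
print(makespan(fine, 4))    # 2.5
```

In this example the total work is 10 units in both cases, so 2.5 units is the best possible time on four CPUs; the coarse decomposition is limited by its largest block.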

In this approach, the number of threads spawned is equal to the number of available processors with each thread marching through the available sub-domains (which preferably number at least two times the number of processors), solving one at a time in an "assembly-line" fashion. A stack is employed where each thread pops the next sub-domain to be solved off of the top of the stack. Mutual exclusion locks are employed to protect the stack pointer in the event two or more threads access the stack simultaneously. Each thread remains busy until the number of sub-domains is exhausted. If the number of sub-domains is large enough, the degree of parallelism will be high although decomposing a problem into too many sub-domains may adversely affect convergence.
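The stack-based scheme described above might be organized as in the following sketch. It is an illustration only: Python's `threading` module stands in for the Pthreads or NT threads libraries, the names `run_timestep` and `solve` are hypothetical, and under CPython's global interpreter lock pure-Python work will not actually run in parallel. The structure, however, mirrors the description: a fixed number of threads, a shared stack of sub-domains, and a mutual exclusion lock around each pop.

```python
import threading

def run_timestep(subdomains, solve, n_threads=4):
    """Process every sub-domain once using a shared, lock-protected stack.

    Returns, for each thread, the list of sub-domains that thread solved."""
    stack = list(subdomains)             # shared work stack
    stack_lock = threading.Lock()        # protects the stack against simultaneous pops
    solved = [[] for _ in range(n_threads)]

    def worker(tid):
        while True:
            with stack_lock:             # only one thread may pop at a time
                if not stack:
                    return               # no sub-domains left: thread is done
                sub = stack.pop()        # take the sub-domain on top of the stack
            solve(sub)                   # the solve itself runs outside the lock
            solved[tid].append(sub)      # each thread writes only to its own list

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                         # end-of-timestep synchronization
    return solved
```

A call such as `run_timestep(range(8), solve=my_solver, n_threads=4)` would cover one timestep with eight sub-domains on four threads, where `my_solver` is whatever routine advances a single sub-domain. Holding the lock only for the pop keeps the serialized portion of each iteration small.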

The above approach will be utilized for the analysis of hypersonic airbreathing combustion with hydrogen fuel. Without using parallel processing, the analysis with Mach 4 free-stream velocity for the finite rate chemistry with 18 species has been carried out as shown in Figs. 6 through 8 [Moon and Chung, 1996]. In this study, the standard k-ε model was used. The proposed paper will utilize large eddy simulation with high Mach numbers and high Reynolds numbers. It is our plan to report on the maximum ranges of Mach number and Reynolds number the proposed algorithm can accommodate. Our eventual goal in this paper is to determine the relationship between mixing and combustion efficiency. Length scales and time scales involved in turbulence and combustion will be thoroughly investigated.

References

Moon, S. Y. and Chung, T. J. [1996]: Numerical simulation of heat transfer in chemically

reacting shock wave turbulent boundary layer interactions, Journal of Numerical Heat

Transfer, Part A, 30, 1, 55-72.

Schunk, R. G., Canabal, F., Heard, G. A., and Chung, T. J. [1999]: Unified CFD methods

via flowfield-dependent variation theory, AIAA paper 99-3715.

Schunk, R. G. and Chung, T. J. [1999]: Parallelization of the flowfield-dependent

variation schemes for solving the triple shock/boundary layer interaction problem,

TFAWS 99, NASA/MSFC, September 13-15, 1999.

[Figure 1 legend: Sub-domain 1, Sub-domain 2, Sub-domain 3, Sub-domain 4, Boundary Node]

Figure 1: Multiple Subdomains

[Figure 2 annotations: Nodes shaded in white are solved implicitly within each sub-domain. Nodes from neighboring sub-domains (shaded) are treated as boundary nodes and allowed to lag one timestep.]

Figure 2: Domain Decomposition

[Figure 3 layout: execution time runs horizontally through Timestep #1, Timestep #2, ..., Timestep N; rows are CPU #1 through CPU #4; in each timestep every CPU executes one of Thread #1 through Thread #4, and all threads finish together at the end-of-timestep synchronization.]

Figure 3: Ideal Load Balancing

[Figure 4 layout: execution time runs horizontally through Timestep #1, Timestep #2, ..., Timestep N; rows are CPU #1 through CPU #4, each executing one of Thread #1 through Thread #4 per timestep. Legend: End of Timestep Synchronization, CPU Idle.]

Figure 4: "Real World" Load Balancing

[Figure 5 layout: two panels, 4 Sub-domains and 8 Sub-domains, each showing CPU #1 through CPU #4 against execution time. Legend: End of Timestep Synchronization, CPU Idle.]

Figure 5: Domain Decomposition Improves Parallelism

