
1

A Nested Dissection Parallel Direct Solver for Simulations of 3D DC/AC Resistivity Measurements

Maciej Paszyński (1,2), David Pardo (2), Carlos Torres-Verdín (2)

(1) Department of Computer Science, AGH University of Science and Technology, Kraków, Poland
e-mail: paszynsk@agh.edu.pl
home.agh.edu.pl/~paszynsk

(2) Department of Petroleum and Geosystems Engineering, The University of Texas at Austin

2

OUTLINE

•Formulation of the resistivity measurement simulation model problem
•Sequential algorithm
•Parallel algorithm
•Scalability of the parallel solver
•Parallel solver details
•Conclusions and future work (inversions)

3

FORMULATION OF THE MODEL PROBLEM

$\nabla \times \mathbf{H} = (\sigma + j\omega\varepsilon)\,\mathbf{E} + \mathbf{J}^{imp}$   (Ampère's Law)
$\nabla \times \mathbf{E} = -j\omega\mu\,\mathbf{H} - \mathbf{M}^{imp}$   (Faraday's Law)
$\nabla \cdot (\varepsilon\,\mathbf{E}) = \rho$   (Gauss' Law of Electricity)
$\nabla \cdot (\mu\,\mathbf{H}) = 0$   (Gauss' Law of Magnetism)

H – magnetic field
E – electric field
ε – dielectric permittivity
μ – magnetic permeability
σ – electrical conductivity
ρ – electric charge distribution
ω – angular frequency
J^imp – impressed electric current
M^imp – impressed magnetic current

4

FORMULATION OF THE MODEL PROBLEM

DC formulation ($\omega = 0$):

$\nabla \times \mathbf{H} = \sigma\,\mathbf{E} + \mathbf{J}^{imp}$   (Ampère's Law)
$\nabla \times \mathbf{E} = 0$   (Faraday's Law)
$\nabla \cdot (\varepsilon\,\mathbf{E}) = \rho$   (Gauss' Law of Electricity)
$\nabla \cdot (\mu\,\mathbf{H}) = 0$   (Gauss' Law of Magnetism)

H – magnetic field, E – electric field, ε – dielectric permittivity, μ – magnetic permeability, ρ – electric charge distribution, J^imp – impressed electric current

5

FORMULATION OF THE MODEL PROBLEM

Taking the divergence of Ampère's law and utilizing Gauss' Electric law, we obtain the conductive media equation

$\nabla \cdot (\sigma \nabla u) = \nabla \cdot \mathbf{J}^{imp}$

where $u$ is the scalar potential such that $\mathbf{E} = -\nabla u$.

Variational formulation:

Find $u = u_0 + u_D$ with $u_0 \in H^1_D(\Omega)$ such that

$(\nabla v, \sigma \nabla u)_{L^2(\Omega)} = -(v, \nabla \cdot \mathbf{J}^{imp})_{L^2(\Omega)} + \langle v, h \rangle_{L^2(\Gamma_N)} \quad \forall v \in H^1_D(\Omega)$

where:
$u_D$ – lift of the essential Dirichlet b.c.,
$h = \sigma \nabla u \cdot \mathbf{n}$ – prescribed flux on $\Gamma_N$,
$H^1_D(\Omega) := \{ u \in H^1(\Omega) : u|_{\Gamma_D} = 0 \}$.
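Since the divergence of a curl vanishes identically, applying $\nabla\cdot$ to Ampère's law eliminates $\mathbf{H}$; a one-line reconstruction of the step named above (not spelled out on the slide):

$$0 = \nabla \cdot (\nabla \times \mathbf{H}) = \nabla \cdot (\sigma \mathbf{E}) + \nabla \cdot \mathbf{J}^{imp} \;\Longrightarrow\; \nabla \cdot (\sigma \nabla u) = \nabla \cdot \mathbf{J}^{imp} \quad \text{with } \mathbf{E} = -\nabla u.$$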

6

FORMULATION OF THE MODEL PROBLEM

•1 transmitter electrode
•2 receiver electrodes
•transmitter electrode modeled by the impressed electric current J^imp
•five different layers in the formation, with resistivities 100 Ω·m (sand), 5 Ω·m (shale), 200 Ω·m (oil), 1 Ω·m (water), 1000 Ω·m (rock)
•borehole with resistivity 0.1 Ω·m
•zero Dirichlet b.c. on the external boundary of the domain
•zero Neumann b.c. on the axis of symmetry

7

FORMULATION OF THE MODEL PROBLEM

We utilize a 2D self-adaptive goal-oriented hp-adaptive strategy combined with a Fourier series expansion in a non-orthogonal system of coordinates, as sketched below.
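The slides do not spell out the expansion; assuming a quasi-azimuthal coordinate $\zeta$ of the non-orthogonal system and $2M+1$ retained modes (notation assumed here, not taken from the slides), it takes the usual form

$$u(\xi_1,\xi_2,\zeta) = \sum_{m=-M}^{M} u_m(\xi_1,\xi_2)\, e^{jm\zeta},$$

so that every modal coefficient $u_m$ lives on the same 2D hp mesh; the scalability slides later quote runs with 10 such Fourier modes.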

8

FORMULATION OF THE MODEL PROBLEM

9

SEQUENTIAL ALGORITHM

Loop over electrode locations
  Iterations of the goal-oriented self-adaptive hp FEM
    Solve the problem over the coarse mesh
    Solve the problem over the fine mesh
    Compute relative error estimations over finite elements
    If maximum relative error < required accuracy then exit
    Make decisions about optimal refinements
    Perform optimal refinements
  End
  Store second vertical difference of potential at receiver electrodes
End
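A minimal Python sketch of this driver loop; every helper name here (initial_mesh, solve, globally_refine, estimate_element_errors, select_refinements, refine, second_vertical_difference) is a hypothetical placeholder, not the authors' actual API:

def simulate_logging(electrode_locations, required_accuracy, max_iters=30):
    # Sketch only: all helpers called below are hypothetical stand-ins
    # for the corresponding hp-FEM routines.
    curve = {}
    for location in electrode_locations:
        mesh = initial_mesh(location)
        for _ in range(max_iters):
            u_coarse = solve(mesh)             # coarse-mesh problem
            fine_mesh = globally_refine(mesh)  # h/p-refined everywhere
            u_fine = solve(fine_mesh)          # fine-mesh problem
            errors = estimate_element_errors(u_coarse, u_fine)
            if max(errors.values()) < required_accuracy:
                break                          # accuracy reached
            mesh = refine(mesh, select_refinements(errors))
        # quantity of interest at the receiver electrodes
        curve[location] = second_vertical_difference(u_fine)
    return curve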

10

PA

RA

LLE

L A

LG

OR

ITH

M

Each processor is assigned to a set of finite elements.

Each node from the interface is assigned to multiple processor owners.

Localcopyoftheentiredata structureisstoredon everyprocessor.

But eachprocessorperformscomputationsonlyon assignedset offiniteelements.

Onlylocalsolutiondegreesoffreedomarestoredto savememory.

11

PARALLEL ALGORITHM

Redistribute the computational mesh between processors
Loop over electrode locations
  Iterations of the goal-oriented self-adaptive hp FEM
    Solve the coarse mesh problem by the parallel solver
    Solve the fine mesh problem by the parallel solver
    Compute relative error estimations over finite elements
    Compute global maximum relative error (mpi_allreduce; see the sketch after this list)
    If maximum relative error < required accuracy then exit
    Make local decisions about optimal refinements
    Broadcast required refinements
    Perform optimal refinements on the whole local mesh
  End
  Store second vertical difference of potential at receiver electrodes
  (requires communication to gather the distributed solution)
End
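The global error reduction in this loop maps onto a single MPI collective. A minimal mpi4py sketch, assuming each rank keeps the relative error estimates of its owned elements in a Python list (the data layout is an illustrative assumption):

from mpi4py import MPI

comm = MPI.COMM_WORLD

def global_max_relative_error(local_errors):
    # All-reduce the per-rank maximum so every rank evaluates the same
    # stopping criterion (the mpi_allreduce step in the algorithm above).
    local_max = max(local_errors) if local_errors else 0.0
    return comm.allreduce(local_max, op=MPI.MAX)

Every rank then compares the returned value with the required accuracy and leaves the adaptive loop in lockstep.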

12

SCALABILITY OF THE PARALLEL SOLVER

Fine mesh, 10 Fourier modes, 141 000 degrees of freedom:
211 seconds on 1 processor (per single electrode location)
1.75 seconds on 192 processors (per single electrode location)
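These two timings pin down the parallel efficiency quoted later in the conclusions:

$$S = \frac{T_1}{T_{192}} = \frac{211}{1.75} \approx 120.6, \qquad E = \frac{S}{192} \approx 0.63,$$

i.e. roughly 63% relative efficiency on 192 processors.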

13

SCALABILITY OF THE PARALLEL SOLVER


14

SCALABILITY OF THE PARALLEL SOLVER


15

PARALLEL SOLVER DETAILS

Element stiffness matrix and load vector in terms of the element hp shape functions $e_i^{hp}$:

$b_{ij} = (\nabla e_i^{hp}, \sigma \nabla e_j^{hp})_{L^2(\Omega)}$
$l_j = -(e_j^{hp}, \nabla \cdot \mathbf{J}^{imp})_{L^2(\Omega)} + \langle e_j^{hp}, h \rangle_{L^2(\Gamma_N)}$

16

PARALLEL SOLVER DETAILS

Forward elimination on the whole matrix: O(15^3)

17

PARALLEL SOLVER DETAILS

Partial forward elimination: O(6*9^2)

18

PARALLEL SOLVER DETAILS

Partial forward elimination: O(6*9^2)

19

PARALLEL SOLVER DETAILS

Full forward elimination: O(3^3)

20

PARALLEL SOLVER DETAILS

Forward elimination over the whole matrix:
15^3 = 3375 operations

Partial forward eliminations over the elements:
6*9^2 + 6*9^2 + 3^3 = 486 + 486 + 27 = 999 operations

The idea generalizes to the whole domain, as sketched below.
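A minimal numpy sketch of the partial forward elimination (static condensation) being counted here; the interior-first dof ordering is an assumption for illustration:

import numpy as np

def partial_forward_elimination(A, b, n_interior):
    # Eliminate the first n_interior (fully assembled) unknowns of A x = b
    # and return the Schur complement system for the remaining dof.
    # Assumed blocking, interior dof first:
    #     A = [[A_ii, A_ib],        b = [b_i,
    #          [A_bi, A_bb]]             b_b]
    k = n_interior
    A_ii, A_ib = A[:k, :k], A[:k, k:]
    A_bi, A_bb = A[k:, :k], A[k:, k:]
    X = np.linalg.solve(A_ii, A_ib)    # A_ii^{-1} A_ib
    y = np.linalg.solve(A_ii, b[:k])   # A_ii^{-1} b_i
    S = A_bb - A_bi @ X                # Schur complement
    g = b[k:] - A_bi @ y               # condensed right-hand side
    return S, g

Once the interface unknowns x_b are obtained from S x_b = g, the interior ones follow by back substitution: x_i = A_ii^{-1} (b_i - A_ib x_b).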

21

PARALLEL SOLVER DETAILS

22

PARALLEL SOLVER DETAILS

23

SOLVER RECURSIVE ALGORITHM

matrix function recursive_solver(tree_node)
  if tree_node has no son nodes then
    eliminate internal nodes of the leaf element stiffness matrix
    return Schur complement sub-matrix
  else if tree_node has son nodes then
    do for each son
      son_matrix = recursive_solver(tree_node_son)
      merge son_matrix into new_matrix
    enddo
    decide which unknowns of new_matrix can be eliminated
    perform partial forward elimination on new_matrix
    return Schur complement sub-matrix
  endif
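The same recursion as a compact Python sketch, reusing the partial_forward_elimination sketch from the solver-details slides; the TreeNode layout and the merge helper are simplified assumptions (the real solver tracks hp dof connectivity across the refinement trees):

class TreeNode:
    # Elimination-tree node: a leaf stores an element system (A, b);
    # n_internal counts the dof that are fully assembled at this node.
    def __init__(self, sons=(), A=None, b=None, n_internal=0):
        self.sons, self.A, self.b = list(sons), A, b
        self.n_internal = n_internal

def recursive_solver(node):
    # Returns the Schur complement system for the node's interface dof.
    if not node.sons:
        # Leaf: condense out the element-interior dof.
        return partial_forward_elimination(node.A, node.b, node.n_internal)
    # Merge the sons' Schur complements over their shared dof
    # (merge is a hypothetical assembly routine driven by dof maps).
    A_new, b_new = merge([recursive_solver(son) for son in node.sons])
    # The shared interface is now fully assembled: eliminate it.
    return partial_forward_elimination(A_new, b_new, node.n_internal)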

24

PARALLEL SOLVER RECURSIVE ALGORITHM

matrix function recursive_solver(tree_node)
  if tree_node has no son nodes then
    eliminate internal nodes of the leaf element stiffness matrix
    return Schur complement sub-matrix
  else if tree_node has son nodes then
    do for each son
      if son node is assigned to the current processor
        son_matrix = recursive_solver(tree_node_son)
      if current processor k is the first processor in the processors group
        RECEIVE son_matrix from processor 2k+1
        merge son_matrix into new_matrix
      else
        SEND son_matrix to the first processor in the processors group
    enddo
    decide which unknowns of new_matrix can be eliminated
    perform partial forward elimination on new_matrix
    return Schur complement sub-matrix
  endif
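A hedged mpi4py sketch of the communication pattern at one level of the processor-group hierarchy; the 2k+1 partner rank follows the pseudocode above, while merge and the n_assembled count are hypothetical placeholders (the real solver tracks dof connectivity):

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

def level_merge(own_schur, group_first, partner, n_assembled):
    # own_schur   : (S, g) Schur complement system computed locally
    # group_first : first processor of this node's group (keeps working)
    # partner     : rank holding the sibling contribution (2k+1 above)
    # n_assembled : dof fully assembled after the merge (assumed known)
    if rank == group_first:
        sibling = comm.recv(source=partner, tag=0)
        A_new, b_new = merge([own_schur, sibling])   # hypothetical assembly
        return partial_forward_elimination(A_new, b_new, n_assembled)
    # Non-leading ranks ship their Schur complement and go idle at this level.
    comm.send(own_schur, dest=group_first, tag=0)
    return None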

25

CONCLUSIONS AND FUTURE WORK

•A new parallel direct solver has been developed.
•The solver recursively utilizes the Schur complement pattern to eliminate fully aggregated degrees of freedom on every level of the elimination tree.
•The solver recursively traverses the refinement trees, the tree of initial mesh elements, as well as the tree of sub-domains.
•The parallel version of the solver provides over 60% relative efficiency on 200 processors.
•The solver reduces the solution time for the 141 000 degrees-of-freedom problem from 211 seconds on 1 processor to 1.75 seconds on 192 processors.
•A single logging position can be solved within 2 seconds on 200 processors.
•Future work will involve application of the parallel solver to inverse problem modeling.
