
Geophys. J. Int. (2019) 216, 984–990
doi: 10.1093/gji/ggy295
Advance Access publication 2018 September 3
GJI Seismology

Lazy wave propagation

Christian Boehm and Andreas Fichtner
Institute of Geophysics, Department of Earth Sciences, ETH Zurich, Zurich 8092, Switzerland. E-mail: [email protected]

Accepted 2018 August 31. Received 2018 July 15; in original form 2017 December 20

SUMMARY
We introduce the concept of ‘lazy wave propagation’ for time-domain simulations of the wave equation, which means to locally skip the computation of the wavefield whenever it has no influence on synthetic measurements. This is a simple and efficient extension to conventional implementations, which becomes particularly powerful for sequentially simulating multiple sources with similar receiver locations. Lazy wave propagation utilizes the spatio-temporal localization of the wavefield and it takes advantage of the finite speed at which energy propagates through a medium. The key idea is to dynamically adjust the computational domain and to compute the wavefield only inside the region of influence of the source locations intersected with the domain of dependence of the receiver locations. This approach decreases the amount of floating-point operations to compute internal forces by up to 40 per cent for realistic source–receiver geometries. While our main focus is the ‘lazy spectral-element method’, the same ideas apply to discontinuous Galerkin, finite-volume or finite-difference methods.

Key words: Waveform inversion; Computational seismology; Wave propagation; Numerical modelling.

1 INTRODUCTION

Many applications in seismology require numerical simulations of waves propagating through complex and heterogeneous media because closed-form solutions to the wave equation exist only for very few special cases. Examples of common use cases are waveform inversion for geophysical exploration (Bamberger et al. 1982; Pratt et al. 1998; Sirgue et al. 2008; Virieux & Operto 2009; Plessix 2017), regional or global seismology (Fichtner et al. 2009; Tape et al. 2010; Bozdag et al. 2016) and ground motion or dynamic rupture modelling (Graves et al. 2011; Heinecke et al. 2014). Seismic waves are usually excited by sources that are sparsely distributed in space. Similarly, data observations are typically available at receiver locations, which cover only a small part of the entire domain, for example, the surface or even only a small subset thereof. Thus, the parts of the wavefield that we are interested in are localized in space and time and we can make use of the region of influence or, respectively, the domain of dependence of sources and receivers.

Lazy wave propagation exploits this characteristic spatio-temporal localization of the wavefield and avoids unnecessary computations in parts of the domain that have no influence on the measured seismograms. The key idea is to use local traveltimes to the locations of sources and receivers to form a time-dependent partition of the computational domain into an active region and an inactive region; in the latter, the computation of the wavefield is skipped. This approach applies to both forward and adjoint simulations. It is particularly suited for the computation of sensitivity kernels because there is no need to store the wavefield of the inactive regions either. Lazy wave propagation can be interpreted as a simplistic version of adaptive mesh refinement and goal-oriented adaptivity (Bangerth et al. 2010; Kroner 2011; Boehm & Ulbrich 2013) that is straightforward to integrate into pre-existing implementations and which avoids hanging nodes and projections between different numerical grids.

Lazy wave propagation is a general concept that can be combined with all numerical schemes to discretize the wave equation, such as finite-difference, finite-volume or finite-element methods. In the following, we focus on the spectral-element method (SEM), where the subdivision of the domain into finite elements facilitates dealing with the active and inactive subdomains. SEM is a widely used, high-order numerical scheme for accurate time-domain simulations of seismic wave propagation. For detailed reviews of the method we refer to Komatitsch et al. (2013), Chaljub et al. (2007) and Fichtner (2010). SEM has been applied successfully in a large number of studies ranging from the local to the global scale; see, for instance, Komatitsch & Vilotte (1998), Komatitsch & Tromp (2002), Chaljub & Valette (2004) and Tape et al. (2010). Software packages such as SPECFEM3D (Peter et al. 2011), SES3D (Gokhberg & Fichtner 2016) and Salvus (Afanasiev et al. 2018) provide efficient and scalable implementations for both CPU and GPU architectures. One of the key concepts of the method is to locally approximate the wavefield by high-order Lagrange polynomials. In particular, the method works entirely matrix-free and the main computational burden, which is computing the stress tensor, is carried out locally on the elements. Thus, extensions to the lazy spectral-element method (lSEM) require only small modifications to existing implementations, namely flagging elements as active or inactive during the simulation.

© The Author(s) 2018. Published by Oxford University Press on behalf of The Royal Astronomical Society.


lSEM can significantly reduce the total amount of floating-point operations for a numerical simulation. This directly results in a shorter time-to-solution for serial simulations. Reductions in the total runtime can also be achieved for simulations on parallel computing architectures, either by tailoring the domain decomposition or when modelling multiple events with similar receiver locations, which is a common situation for data sets in full-waveform inversion. Moreover, reducing the floating-point operations is also relevant for the energy-to-solution metric or for asynchronously carrying out additional post-processing tasks.

The rest of the paper is organized as follows. In the following section, we introduce the key ideas of lazy wave propagation, explain how it can be integrated into the SEM and discuss suitable use cases. Next, we present several numerical examples in 2-D and 3-D, followed by a discussion of the implications for simulations on parallel computer architectures.

2 LAZY SPECTRAL-ELEMENT METHOD

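Consider the elastic wave equation on the space–time domain $\Omega \times [0, T]$ given by

$$\rho(\mathbf{x})\,\partial_{tt}\mathbf{u}(\mathbf{x}, t) - \nabla \cdot \big(\mathbf{C}(\mathbf{x}) : \varepsilon(\mathbf{u})(\mathbf{x}, t)\big) = \mathbf{f}(\mathbf{x}, t), \tag{1}$$

with the initial conditions

$$\mathbf{u}(\mathbf{x}, 0) = 0, \qquad \partial_t \mathbf{u}(\mathbf{x}, 0) = 0, \tag{2}$$

and the free-surface condition,

$$\big(\mathbf{C}(\mathbf{x}) : \varepsilon(\mathbf{u})(\mathbf{x}, t)\big) \cdot \vec{n}(\mathbf{x}) = 0, \tag{3}$$

on the boundary $\partial\Omega$. Here, $\mathbf{u}$ denotes the displacement field and $\mathbf{f}$ is an external source term. The model is parametrized by the density $\rho$ and the fourth-order elastic tensor $\mathbf{C}$ that acts on the strain tensor $\varepsilon(\mathbf{u})$, where $:$ denotes contraction over adjacent indices.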

Because seismic sources have a very small spatial support, the wavefield is highly localized at the beginning of the simulation and energy is gradually propagated through the entire computational domain. Similarly, the set of receiver stations where measurements are available is often sparse or at least clustered in a subregion, for example, at the surface of the domain. Hence, we only require the wavefield in a small subset of the computational domain towards the beginning and the end of the simulation.

In the rest of the paper, we define the traveltime between two points $\mathbf{x}_1$ and $\mathbf{x}_2$ as the shortest amount of time it takes a wave to propagate from one point to the other, and denote this temporal distance by $\mathrm{dist}(\mathbf{x}_1, \mathbf{x}_2)$. For a given set of sources and receiver stations, we introduce a time-dependent partitioning of the domain into the inactive subdomain $\Omega_i(\tau)$ and the active subdomain $\Omega_a(\tau)$. Here, $\Omega_i(\tau)$ consists of all points $\mathbf{x} \in \Omega$ which satisfy one of the following conditions:

(C1) $\mathbf{x}$ is outside the region of influence of all source locations, that is, $\mathrm{dist}(\mathbf{x}_s, \mathbf{x}) > \tau$ for all source locations $\mathbf{x}_s$.

(C2) $\mathbf{x}$ is outside the domain of dependence of the receiver locations, that is, $\mathrm{dist}(\mathbf{x}, \mathbf{x}_r) > T - \tau$ for all receiver locations $\mathbf{x}_r$.

All remaining points belong to the active subdomain, which we define as $\Omega_a(\tau) = \Omega \setminus \Omega_i(\tau)$.

Obviously, condition (C1) describes the set of points that cannot be reached by a wave excited at one of the source locations until time $\tau$. Condition (C2) applies to the subdomain for which energy that is present at time $\tau$ does not propagate to the receiver locations before the end of the simulation.
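Conditions (C1) and (C2) amount to a simple test on precomputed traveltimes. The following Python sketch is purely illustrative: the function name `is_active` and the array layout are assumptions made here and are not taken from any existing solver.

```python
import numpy as np

def is_active(dist_to_sources, dist_to_receivers, tau, T):
    """Return a boolean array that is True for points in the active subdomain.

    dist_to_sources:   shape (n_points, n_sources), traveltimes dist(x_s, x)
    dist_to_receivers: shape (n_points, n_receivers), traveltimes dist(x, x_r)
    (hypothetical layout, chosen only for this sketch)
    """
    dist_to_sources = np.asarray(dist_to_sources, dtype=float)
    dist_to_receivers = np.asarray(dist_to_receivers, dtype=float)
    # (C1): no source can have reached the point by time tau
    c1 = np.all(dist_to_sources > tau, axis=1)
    # (C2): energy at the point cannot reach any receiver before time T
    c2 = np.all(dist_to_receivers > T - tau, axis=1)
    # A point is inactive if either condition holds
    return ~(c1 | c2)

# Three points on a line between one source and one receiver, T = 10
ds = np.array([[0.0], [5.0], [10.0]])
dr = np.array([[10.0], [5.0], [0.0]])
mask_start = is_active(ds, dr, 0.0, 10.0)
mask_mid = is_active(ds, dr, 5.0, 10.0)
mask_end = is_active(ds, dr, 10.0, 10.0)
```

In this small example the active point moves from the source (at the start) through the midpoint to the receiver (at the end), which mirrors the qualitative behaviour described above.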

Figure 1. Illustration of the governing equations in the truncated domain. The same wave equation is solved inside the active subdomain $\Omega_a$ and the initial conditions ensure that we start with the original wavefield $\mathbf{u}$. The free-surface condition remains unchanged along the solid black line, which represents $\partial\Omega_a \cap \partial\Omega$. Thus, the modified wavefield $\tilde{\mathbf{u}}$ at time $\tau$ only differs along the interface $\partial\Omega_a \cap \partial\Omega_i$ (dashed line).

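The key idea of lazy wave propagation is to compute the wavefield only within the active subdomain during the time evolution. This means solving the following wave equation with displacement field $\tilde{\mathbf{u}}$ on the truncated domain $\Omega_a(\tau)$ and time interval $t \in [\tau, T]$,

$$\rho(\mathbf{x})\,\partial_{tt}\tilde{\mathbf{u}}(\mathbf{x}, t) - \nabla \cdot \big(\mathbf{C}(\mathbf{x}) : \varepsilon(\tilde{\mathbf{u}})(\mathbf{x}, t)\big) = \mathbf{f}(\mathbf{x}, t), \tag{4}$$

$$\tilde{\mathbf{u}}(\mathbf{x}, \tau) = \mathbf{u}(\mathbf{x}, \tau), \qquad \partial_t \tilde{\mathbf{u}}(\mathbf{x}, \tau) = \partial_t \mathbf{u}(\mathbf{x}, \tau), \tag{5}$$

$$\big(\mathbf{C}(\mathbf{x}) : \varepsilon(\tilde{\mathbf{u}})(\mathbf{x}, t)\big) \cdot \vec{n}(\mathbf{x}) = 0. \tag{6}$$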

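Here, the free-surface condition is defined on the boundary $\partial\Omega_a(\tau)$, which contains the artificial boundary $\partial\Omega_a(\tau) \cap \partial\Omega_i(\tau)$ at the interface between the active and the inactive subdomain. We will drop the dependency on $\tau$ in the following to simplify the notation. To investigate the effects of not updating the inactive subdomain, we consider the differential wavefield $\mathbf{d} = \mathbf{u} - \tilde{\mathbf{u}}$ inside the active region $\Omega_a$. Fig. 1 illustrates the setup in a simplified domain. Subtracting eqs (4)–(6) from eqs (1)–(3) in the interval $[\tau, T]$ gives

$$\rho(\mathbf{x})\,\partial_{tt}\mathbf{d}(\mathbf{x}, t) - \nabla \cdot \big(\mathbf{C}(\mathbf{x}) : \varepsilon(\mathbf{d})(\mathbf{x}, t)\big) = 0, \tag{7}$$

$$\mathbf{d}(\mathbf{x}, \tau) = 0, \qquad \partial_t \mathbf{d}(\mathbf{x}, \tau) = 0, \tag{8}$$

$$\big(\mathbf{C}(\mathbf{x}) : \varepsilon(\mathbf{d})(\mathbf{x}, t)\big) \cdot \vec{n}(\mathbf{x}) = 0, \qquad \mathbf{x} \in \partial\Omega_a \cap \partial\Omega, \tag{9}$$

$$\big(\mathbf{C}(\mathbf{x}) : \varepsilon(\mathbf{d})(\mathbf{x}, t)\big) \cdot \vec{n}(\mathbf{x}) = \big(\mathbf{C}(\mathbf{x}) : \varepsilon(\mathbf{u})(\mathbf{x}, t)\big) \cdot \vec{n}(\mathbf{x}), \qquad \mathbf{x} \in \partial\Omega_a \cap \partial\Omega_i. \tag{10}$$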

The differential wavefield can be interpreted as a solution to the wave equation with a source emitted at the boundary $\partial\Omega_a \cap \partial\Omega_i \subset \Omega_i$. By definition of the inactive subdomain, energy emitted by this source will not reach any of the receiver locations before the end of the simulation. Hence, $\mathbf{d}(\mathbf{x}_r, t) = 0$ for all receiver locations $\mathbf{x}_r$ and all $t$ in $[\tau, T]$. In other words, the wavefield $\tilde{\mathbf{u}}$, which ignores the inactive subdomain $\Omega_i$, is exact at all receiver locations.

The same methodology can be applied to the discretized wave equation, which also has a finite propagation speed. Here, the traveltime from each element to the locations of sources and receivers determines the sets of active and inactive elements. The Courant-Friedrichs-Lewy (CFL) condition limits the size of the time step $\Delta t$ for explicit time stepping schemes. In particular, the time step has to satisfy

$$\Delta t \le C\,\frac{h}{v_p}, \tag{11}$$

where $C < 1$ denotes the Courant number, $h$ is the minimum distance of two grid points in the mesh and $v_p$ is the speed of compressional waves. The estimate in eq. (11) defines the so-called grid velocity $h/\Delta t$, which has to be faster than the actual wave velocity $v_p$ to guarantee stability of the numerical scheme. This difference may lead to some dispersion in the discretized wavefield after propagating many wavelengths (Holmes 2007). Note, however, that if the dispersion error is acceptable for the original wave equation, the numerical errors propagated in the discrete system of eqs (7)–(10) are marginal, as we will show in the numerical examples in the next section. Thus, we do not consider the grid velocity, but work with the physical traveltimes for the discrete wave equation to determine active and inactive elements.

The time-dependent partition into active and inactive sets can be computed in a pre-processing step. Because we keep the same mesh during the simulation, it is sufficient to store the local start and end time for each element. For that purpose, we compute approximate traveltimes between each element and all locations of the sources and the receivers. Note that it is sufficient to work with lower bounds on the local start times and upper bounds on the local end times, respectively. Hence, instead of the exact traveltimes we can use any approximation that is guaranteed not to overestimate the traveltime. This ensures that the approximated inactive domain is contained in the true $\Omega_i(\tau)$. For example, the Euclidean distance divided by the maximum velocity in the mesh gives a lower bound on the traveltime, which can be used as a computationally inexpensive heuristic to determine suboptimal sets $\Omega_i(\tau)$ and $\Omega_a(\tau)$. Alternatively, the traveltimes between each element and all locations of sources and receivers can be computed in a pre-processing step by solving the Eikonal equation.
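As an illustration of the Euclidean heuristic, the sketch below computes conservative start and end times per element. It is not the Salvus implementation. The function name, the use of element centroids, and the subtraction of a per-element bounding radius (so that the bound holds for every point of an element rather than only its centroid) are assumptions made for this example.

```python
import numpy as np

def element_time_windows(centroids, radii, sources, receivers, v_max, T):
    """Conservative activity window [t_start, t_end] for every element.

    centroids: (n_elements, dim) element centres (hypothetical input)
    radii:     (n_elements,) bounding radius of each element
    v_max:     largest wave speed anywhere in the mesh
    """
    centroids = np.asarray(centroids, dtype=float)
    radii = np.asarray(radii, dtype=float)

    def traveltime_lower_bound(points):
        points = np.asarray(points, dtype=float)
        # Pairwise Euclidean distances, shape (n_elements, n_points)
        d = np.linalg.norm(centroids[:, None, :] - points[None, :, :], axis=2)
        # Distance from the closest point to the nearest part of the element
        d = np.maximum(d.min(axis=1) - radii, 0.0)
        # Dividing by the maximum velocity never overestimates the traveltime
        return d / v_max

    t_start = traveltime_lower_bound(sources)        # lower bound on start time
    t_end = T - traveltime_lower_bound(receivers)    # upper bound on end time
    return t_start, t_end

t_start, t_end = element_time_windows(
    centroids=[[0.0, 0.0], [3.0, 4.0]], radii=[0.0, 1.0],
    sources=[[0.0, 0.0]], receivers=[[3.0, 4.0]], v_max=2.0, T=10.0)
```

Elements for which `t_start > t_end` would never become active under this bound. Tighter windows could be obtained from Eikonal traveltimes, as noted above.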

The following steps describe the implementation of lSEM. In each time step,

(1) divide the elements into the active and inactive subdomain;
(2) compute internal and external forces for all active elements; and
(3) communicate the update across elemental boundaries, advance the wavefield in time and continue with the next time step.

When comparing these steps to conventional implementations of the SEM, the overhead is indeed marginal. Step (1) can be done in a pre-processing step by storing the local start and end time of the interval when the element is active during the simulation. Step (2) is the same as in conventional implementations, except that we only loop over the active elements. The inactive elements do not require any computations at all.
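The steps above can be tried out on a toy problem. The following sketch is a 1-D finite-difference leapfrog scheme, not an SEM solver, and every name in it is hypothetical. It assumes a Courant number of 1, so that the grid velocity equals the wave speed and traveltimes counted in grid points describe the discrete scheme exactly. In the SEM setting of the paper, physical traveltimes are used instead, so a small numerical error remains.

```python
import numpy as np

def ricker(t, f0, t0):
    """Ricker wavelet with dominant frequency f0, centred at time t0."""
    a = (np.pi * f0 * (t - t0)) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def simulate(n, nt, src, rec, lazy):
    """Leapfrog scheme for u_tt = c^2 u_xx with fixed ends.

    Returns the receiver trace and the number of grid-point updates.
    """
    c = dx = 1.0
    dt = dx / c                       # Courant number 1 (assumption of this toy)
    idx = np.arange(n)
    u_prev, u_curr = np.zeros(n), np.zeros(n)
    trace = np.zeros(nt + 1)
    n_updates = 0
    for k in range(nt):               # this iteration computes step k + 1
        if lazy:
            # (C1): reachable from the source; (C2): can still reach the receiver
            active = (np.abs(idx - src) <= k + 1) & (np.abs(idx - rec) <= nt - (k + 1))
        else:
            active = np.ones(n, dtype=bool)
        active[[0, -1]] = False       # fixed boundary points are never updated
        ia = np.flatnonzero(active)
        lap = u_curr[ia + 1] - 2.0 * u_curr[ia] + u_curr[ia - 1]
        force = np.where(ia == src, ricker(k * dt, 0.05, 30.0), 0.0)
        u_next = u_curr.copy()        # inactive points keep their old value
        u_next[ia] = (2.0 * u_curr[ia] - u_prev[ia]
                      + (c * dt / dx) ** 2 * lap + dt ** 2 * force)
        u_prev, u_curr = u_curr, u_next
        trace[k + 1] = u_curr[rec]
        n_updates += ia.size
    return trace, n_updates

trace_full, updates_full = simulate(n=201, nt=300, src=50, rec=150, lazy=False)
trace_lazy, updates_lazy = simulate(n=201, nt=300, src=50, rec=150, lazy=True)
```

Under these assumptions the two receiver traces should agree while the lazy run performs fewer updates. The only change to the time loop is the index set `ia`, which corresponds to looping over active elements in step (2).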

We conclude this section with three specific use cases to demonstrate the full potential of lSEM.

(i) Seismic imaging with windowed data. So far, we implicitly assumed that one is interested in computing the entire trace for all receiver locations. However, it is common practice in seismology to adjust the simulation times to the available data and to consider only time windows of the seismograms during an inversion; see, for instance, Tape et al. (2010); Bozdag et al. (2016); and Krischer et al. (2018). Using such limited data to refine condition (C2) and to further restrict the active subdomain is straightforward. This only requires replacing $T$ in (C2) with the interval limit of the last selected time window for each receiver.

(ii) Computing sensitivity kernels. Sensitivity kernels $K$ with respect to structural parameters $\rho$ or $\mathbf{C}$ can be expressed as

$$K(\mathbf{x}) = \int_0^T \big(D\mathbf{u}(\mathbf{x}, t)\big) \cdot \big(D\mathbf{u}^\dagger(\mathbf{x}, t)\big)\, dt, \tag{12}$$

where $D$ denotes a differential operator and $\mathbf{u}^\dagger$ is the adjoint wavefield. A general derivation of the adjoint-based representation of sensitivity kernels can be found, for instance, in Fichtner (2010). Conditions (C1) and (C2) apply to the adjoint run as well. Here, (C2) defines the set of all points that adjoint waves cannot reach until time $\tau$, and condition (C1) does the same for the forward wavefield. This means that the product of forward and adjoint wavefields in eq. (12) is guaranteed to be zero in $\Omega_i(\tau)$ because at least one of the two wavefields is zero. Hence, by defining space-dependent start and end times for all points $\mathbf{x}$:

$$t^{\mathrm{start}}(\mathbf{x}) = \min\{\tau : \mathbf{x} \in \Omega_a(\tau)\}, \qquad t^{\mathrm{end}}(\mathbf{x}) = \max\{\tau : \mathbf{x} \in \Omega_a(\tau)\}, \tag{13}$$

we can restrict the time integration in eq. (12) to

$$K(\mathbf{x}) = \int_{t^{\mathrm{start}}(\mathbf{x})}^{t^{\mathrm{end}}(\mathbf{x})} \big(D\mathbf{u}(\mathbf{x}, t)\big) \cdot \big(D\mathbf{u}^\dagger(\mathbf{x}, t)\big)\, dt. \tag{14}$$

Here, lSEM not only reduces the computational cost of forward and adjoint runs, but also decreases the memory requirements by not storing the forward wavefield in the inactive subdomain $\Omega_i$ (Boehm et al. 2016).

(iii) Simulating multiple events. Full-waveform inversion typically requires simulating multiple events with similar receiver locations on the same domain and mesh. Here, we can stack multiple sources in time and introduce time lags that start the simulation of the subsequent event as soon as its source location becomes inactive during the previous simulation. This allows us to sequentially simulate multiple events in less time and without introducing any errors in the individual seismograms. Furthermore, mesh initialization and domain decomposition are only needed once. This requires one additional modification to the steps outlined above, which is

(2a) setting the wavefield in all inactive elements to zero.

Consider a sequence of $n_s$ point sources emitting at locations $\mathbf{x}_i$. For simplicity, we assume that the receivers are the same for every source, and we introduce source-dependent start and end times for each position $\mathbf{x}$,

$$0 \le t_i^{\mathrm{start}}(\mathbf{x}) < t_i^{\mathrm{end}}(\mathbf{x}) \le T, \qquad i = 1, \ldots, n_s. \tag{15}$$

Here, the start and end times require a small safety buffer in addition to the traveltime to ensure that all remaining energy in the inactive domain is eliminated by applying step (2a). Now, we can define time lags $\tau_1 = 0$ and

$$\tau_i = \tau_{i-1} + t_{i-1}^{\mathrm{end}}(\mathbf{x}_i), \qquad i = 2, \ldots, n_s, \tag{16}$$

and trigger the subsequent sources at time $\tau_i$. This ensures that the wavefields can be computed without any interference due to step (2a). The total simulation length for all sources using lazy wave propagation is then given by

$$\tau_{n_s} + T = \sum_{i=2}^{n_s} t_{i-1}^{\mathrm{end}}(\mathbf{x}_i) + T, \tag{17}$$

while a conventional approach would require $n_s \cdot T$. We will show an example for a configuration in cross borehole tomography in Section 3.3.
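The recursion in eq. (16) and the total length in eq. (17) reduce to a cumulative sum. The short sketch below makes this explicit. The function name and the numerical values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def stacked_schedule(t_end_at_next_source, T):
    """Time lags tau_i (eq. 16) and total stacked simulation length (eq. 17).

    t_end_at_next_source[i - 2] holds t_end of event i - 1 evaluated at the
    source location of event i, for i = 2, ..., n_s (safety buffer included).
    """
    lags = [0.0]                      # tau_1 = 0
    for t_end in t_end_at_next_source:
        lags.append(lags[-1] + t_end)
    return lags, lags[-1] + T

# Hypothetical setup: 5 events, each next source location is free after 1.5 s
lags, total = stacked_schedule([1.5, 1.5, 1.5, 1.5], T=2.5)
sequential = 5 * 2.5                  # conventional approach: n_s * T
```

For these made-up numbers the stacked run lasts 8.5 s of simulated time instead of 12.5 s.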

3 NUMERICAL EXAMPLES

This section demonstrates the effectiveness of lSEM by showing three different examples ranging from the exploration scale to regional simulations. All computations were carried out with Salvus (Afanasiev et al. 2018).

3.1 Validation

We start with a toy problem that validates lSEM in a homogeneous isotropic elastic medium with P-wave velocity $v_p = 5800$ m s$^{-1}$, S-wave velocity $v_s = 4000$ m s$^{-1}$ and density $\rho = 2600$ kg m$^{-3}$. We first consider a square domain of 20 km × 20 km and a single source–receiver pair with a point force source emitting with a dominant frequency of 3 Hz. Instead of using absorbing boundaries, we apply the free-surface condition on all boundaries to obtain a more complex wavefield. Fig. 2 compares snapshots of the wavefield using conventional SEM and lSEM, which shows that no errors propagate inside the active subdomain.

Next, we consider a cubic domain of 100 km × 100 km × 100 km with the same material properties as before and the free-surface condition applied on all boundaries. Furthermore, we consider a moment tensor source at $\mathbf{x}_s = (50, 15, 50)^T$ km with a dominant frequency of 0.4 Hz and a receiver at $\mathbf{x}_r = (50, 85, 50)^T$ km. Fig. 3 shows that lSEM accurately computes the seismograms by comparing them to the conventional SEM solution. Over the course of the simulation, only 46 per cent of the elements are active on average. This is of course not a realistic source–receiver geometry, but it shows the validity and the potential of lazy wave propagation.

3.2 3-D regional simulation

This example considers a regional simulation of a magnitude 4.6 earthquake in Switzerland that occurred on 2017 March 6 near the Linthal valley. We use the locations of 222 stations that recorded the event, a source period of 30 s and PREM (Dziewonski & Anderson 1981) as material model. The traveltimes are estimated by using the highest P-wave velocity in the whole domain and the Euclidean distance between elements and source or receiver locations, respectively. This is guaranteed to give a lower bound on the actual traveltime. The simulation time is $T = 106$ s and the mesh consists of 6.4 million fourth-order hexahedral elements which are distributed across 960 compute cores.

Over the entire simulation, lSEM skips 33 per cent of the element updates, namely those of inactive elements. Fig. 4 depicts the effective local simulation time, that is, how long elements are within the active domain during the simulation. Furthermore, it shows the time evolution of the ratio of active elements, which slowly increases at the beginning and decreases towards the end. This is a good proxy for the total reduction of floating-point operations.
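Because the start and end times of all elements are known before the time loop, the ratio of active elements can be evaluated in advance. The sketch below shows one way to do this. The function name and the three-element example are hypothetical and unrelated to the mesh used in this section.

```python
import numpy as np

def active_ratio(t_start, t_end, times):
    """Fraction of active elements at each time in `times`."""
    t_start = np.asarray(t_start, dtype=float)[:, None]
    t_end = np.asarray(t_end, dtype=float)[:, None]
    times = np.asarray(times, dtype=float)[None, :]
    # Element e is active at time t if t_start[e] <= t <= t_end[e]
    active = (t_start <= times) & (times <= t_end)
    return active.mean(axis=0)

# Three hypothetical elements sampled at 11 time steps
ratio = active_ratio([0.0, 2.0, 4.0], [10.0, 8.0, 6.0], np.arange(11))
skipped_fraction = 1.0 - ratio.mean()   # share of element updates avoided
```

Averaging the ratio over all time steps gives the share of element updates that are carried out, so `skipped_fraction` plays the role of the 33 per cent quoted above.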

However, in simulations on parallel computer architectures, the total runtime only decreases if the active elements are evenly distributed among all processors throughout the simulation. It is possible to use checkpoints and to either adaptively re-partition the mesh during the simulation or to adjust the number of ranks and re-initialize the mesh. To this end, we utilize open-source software libraries such as PETSc (Balay et al. 2017; Lange et al. 2016) to handle distributed data structures on unstructured meshes, and PT-Scotch (Chevalier & Pellegrini 2008) for the parallel domain decomposition. Whether the gains from re-partitioning the domain during the simulation are worth the overhead depends on the problem configuration. Fortunately, all information required to assess the potential savings of lSEM is available prior to the time loop, which enables us to decide a priori whether to re-partition the domain or not.

Alternatively, prior information from approximate traveltimes of the source–receiver geometry can be used to optimize the domain decomposition. Without tuning the partitioning in the example above, the number of active elements is unbalanced during the simulation. The bottom panel of Fig. 4(b) shows that lSEM has only a marginal advantage over SEM close to the start and the end of the simulation when comparing the maximum number of active elements per rank. However, because receiver stations are located only at the surface, the duration of elements being active mainly depends on the depth. Hence, we can decompose the computational domain by assigning every rank a column of elements that reaches from the surface to the bottom. This balances the number of elements per core during the simulation and achieves about 60 per cent of the potential savings from the total number of inactive elements, as indicated by the solid line in the bottom panel of Fig. 4(b).
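The effect of the column decomposition can be seen in a small idealized example. The sketch below assumes a structured mesh in which activity depends on depth only, which is a simplification of the regional setup. The mesh dimensions, rank counts and names are hypothetical.

```python
import numpy as np

def max_active_per_rank(rank_of_element, active, n_ranks):
    """Largest number of active elements owned by any single rank."""
    counts = np.bincount(rank_of_element[active], minlength=n_ranks)
    return int(counts.max())

# Hypothetical structured mesh: 4 x 4 columns, 8 layers (layer 0 = surface)
nx, ny, nz, n_ranks = 4, 4, 8, 8
grids = np.meshgrid(np.arange(nx), np.arange(ny), np.arange(nz), indexing="ij")
ix, iy, iz = [g.ravel() for g in grids]

rank_layers = iz                      # one horizontal layer per rank
rank_columns = (ix * ny + iy) // 2    # two vertical columns per rank

active = iz < 2                       # only the two shallowest layers are active
worst_layers = max_active_per_rank(rank_layers, active, n_ranks)
worst_columns = max_active_per_rank(rank_columns, active, n_ranks)
```

With the layer-wise decomposition the busiest rank still owns 16 active elements, whereas the column-wise decomposition spreads the 32 active elements evenly at 4 per rank. Since the slowest rank determines the runtime of a time step, only the second layout turns the reduced operation count into a shorter runtime.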

3.3 Overlapping the simulation of multiple events

Many applications in seismic tomography use data from several events to infer unknown material properties. To this end, we need several numerical simulations of the wave equation on the same domain, but with different sources and, potentially, also different receiver locations. lSEM is particularly suited for stacking multiple simulations in time without introducing any errors, which we demonstrate using a setup for cross borehole tomography and the acoustic wave equation.

We consider the Gullfaks model (Stovas et al. 2006) with a line of $n_{\mathrm{src}} = 20$ equidistantly spaced sources on the left boundary with a dominant frequency of 10 Hz, and a line of $n_{\mathrm{rec}} = 20$ equidistantly spaced receivers next to the boundary on the right; see Fig. 5. Furthermore, we set the simulation time to $T = 2.4$ s.

Conventional SEM would simulate all sources sequentially, which results in a total simulation time of $n_{\mathrm{src}} \cdot T = 48$ s. With lSEM, however, we can advance the next simulation by exploiting the fact that the elements near the line of sources become inactive after a certain time. Hence, the next source can be triggered as soon as all elements containing the previous source and their neighbouring elements have become inactive. Fig. 5(a) shows the space-dependent start and end times that indicate when elements become active or inactive. Here, the traveltimes are computed by solving the Eikonal equation. Furthermore, we show snapshots of the stacked simulation, where the wavefield of the second source is following the previous one. Again the definition of the inactive subdomain guarantees that the traces of the individual sources do not overlap and so they can easily be extracted from the stacked simulation. This is shown in Fig. 5(b), which compares the stacked trace to seismograms from conventional sequential simulations of the sources. Because the wavefield is reset to zero in the inactive subdomain, no error is introduced by stacking the sources with a time lag. In this example, the next source starts 1.45 s after the previous one. Thus, the stacked simulation of 20 sources requires a total simulation time of 30.4 s in comparison to 48 s when simulating sequentially. This directly translates into decreasing the runtime by 37 per cent because all ranks operate on the same number of elements throughout the simulation.

Figure 2. Comparison of the wavefields generated by conventional SEM (top row) and lSEM (middle) for a single source–receiver pair (black circles). P waves (divergence) are depicted in red and S waves (curl) are depicted in blue. All amplitudes have been normalized. The bottom row shows the differences between both wavefields. The grey shaded areas indicate inactive elements. Although these elements are not updated by lSEM, errors do not propagate to the receiver location. The bottom row, showing the differential wavefield, demonstrates that the wavefield generated by lSEM is accurate within the active subdomain and errors occur only in the grey shaded inactive domain.

Figure 3. Comparison of seismograms computed with lSEM and conventional SEM. By construction, skipping the computations in the inactive domain has no influence on the measurements.

The same idea applies to passive seismic data as well, provided that there is a favourable source–receiver constellation. A suitable example is given by a set of earthquakes along the Mid Atlantic Ridge and stations from the USArray (Krischer et al. 2018). Furthermore, the concept can be carried over to a wide range of other applications, including ultrasound tomography for breast cancer detection (Goncharsky et al. 2016; Boehm et al. 2018) or non-destructive testing (Seidl & Rank 2016).

4 DISCUSSION

The potential savings of using lSEM inevitably depend on the problem-specific setup, such as the source–receiver geometry, the velocity model, the duration of the simulation and the available computer hardware. However, as there is no loss of accuracy and no computational overhead unless the domain is re-partitioned, lSEM will always perform at least as well as conventional SEM.

Serial runs of the solver can achieve a speed-up that is proportional to the ratio of the numbers of active and total elements. Here, it is important that the majority of the computations are carried out locally on the elements and that there is as little interaction as possible on global data structures. This is particularly useful for applications in seismic exploration or medical imaging, where parallelization is typically carried out over a large number of sources (shots), and individual events can be simulated on a single device (GPU or CPU node).

Figure 4. (a) Space-dependent effective simulation time, that is, $t^{\mathrm{end}} - t^{\mathrm{start}}$. The green ball indicates the source location; the black balls depict receiver stations. (b) Top: evolution of the ratio of active elements during the simulation. Bottom: maximum number of elements per rank during the simulation using the default or a customized mesh decomposition where each rank works on a vertical column of the domain. [Panel (b) axes: time [s] against ratio of active elements (curves: SEM, lSEM) and against max elements per core (curves: SEM, lSEM default partitioning, lSEM column partitioning).]

Figure 5. Waveform simulation using lSEM for cross borehole tomography with a line of sources at the left boundary and a line of receivers at the right boundary. (a) Top row: start and end times of elements being active during the simulation of a single event. Bottom row: snapshots of the wavefield excited by the first source (left) followed by the second source (right), which is triggered before the first simulation is finished. (b) Time-stacked and reference seismograms at the fifth receiver for the first five sources. [Panel (b): traces labelled source 1 to source 5 and stacked simulation, plotted against time [s].]

For simulations on parallel computers, reducing the computational cost with lSEM in terms of floating-point operations does not necessarily directly yield a decrease of the total computing time because the number of active elements dynamically changes during the simulation. This limits the value of lSEM for some applications, for instance, global-scale simulations with multi-orbit seismic waves on a huge number of parallel compute cores. However, a suitable choice of the domain decomposition or re-partitioning the mesh can mitigate this issue in certain situations as outlined in Section 3.2.

Another important aspect is the restriction of the time step in an explicit time stepping scheme. The CFL condition (11) introduces a space-dependent local time step, where the smallest value in the entire domain determines the global time step of the simulation. With lSEM, this restriction only applies to the active domain $\Omega_a$. Thus, the global time step may vary over the course of the simulation, and larger steps can be taken when the limiting elements are inactive.
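A minimal sketch of this idea is given below. It evaluates eq. (11) over the active elements only. The function name, the per-element arrays and the default Courant number of 0.6 are assumptions made for this example. How a given time integrator copes with a step size that changes during the run is a separate question that is not addressed here.

```python
import numpy as np

def global_time_step(h, v_p, active, courant=0.6):
    """Largest time step satisfying eq. (11) on all active elements.

    h:      (n_elements,) minimum grid-point distance per element
    v_p:    (n_elements,) compressional wave speed per element
    active: (n_elements,) boolean flags of the current active set
    """
    h = np.asarray(h, dtype=float)
    v_p = np.asarray(v_p, dtype=float)
    active = np.asarray(active, dtype=bool)
    if not active.any():
        raise ValueError("no active elements")
    # The most restrictive active element determines the global step
    return courant * np.min(h[active] / v_p[active])

h = [100.0, 10.0, 100.0]              # the middle element is much smaller
v_p = [5000.0, 5000.0, 5000.0]
dt_all = global_time_step(h, v_p, [True, True, True])
dt_lazy = global_time_step(h, v_p, [True, False, True])
```

In this made-up example the admissible step grows by a factor of ten once the small element becomes inactive.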

Moreover, when stacking multiple events with a similar source–receiver geometry, lSEM allows us to reduce the total simulation time and to avoid the need for repeating the domain decomposition. This applies to both serial and parallel runs.

Reducing the amount of floating-point operations is highly desirable for several other reasons as well. The freed-up resources might be used for post-processing steps or to carry out asynchronous tasks. Furthermore, the metric energy-to-solution is becoming more and more important for applications in high-performance computing (Goddeke et al. 2013; Padoin et al. 2012). Although this is rather difficult to measure, a smaller number of floating-point operations should eventually have a positive impact on the energy consumption.

The integration of lSEM into existing wave propagation solvers comes with a marginal overhead in the implementation because it only requires flagging the elements to be active or inactive during the simulation. While this paper focuses on continuous Galerkin SEMs, the same strategy applies to discontinuous Galerkin discretizations. Extensions to finite-difference methods are possible; however, the implementation is more involved because these methods typically lack the concept of local computations on elements and carry out most of the computations on the global degrees of freedom.

5 CONCLUSIONS

We introduced the lSEM for the numerical simulation of seismic wave propagation. This straightforward extension to classical SEMs utilizes the spatio-temporal domain of dependence of the wavefield. The method significantly reduces the computational cost to compute the elastic forces without adding complexity to existing simulation codes.

ACKNOWLEDGEMENTS

The authors would like to thank Editor Ludovic Metivier, Carl Tape and one anonymous reviewer for their valuable comments and excellent suggestions to improve the paper. Furthermore, we thank Michael Afanasiev, Lion Krischer and Martin van Driel for numerous discussions and their continuous support. We gratefully acknowledge support by the Swiss National Supercomputing Centre (CSCS) under Grants s741, d72 and the PASC project GeoScale. CB acknowledges funding from Shell within the project ‘Boosting full-waveform inversion’.

R E F E R E N C E SAfanasiev, M., Boehm, C., van Driel, M., Krischer, L., Rietmann, M., May,

D.A., Knepley, M.G. & Fichtner, A., 2018. Modular and flexible spectral-element waveform modeling in two and three dimensions, Geophys. J.Int., accepted manuscript.

Balay, S. et al., 2017. PETSc users manual, Argonne National Laboratory, Tech. Rep., ANL-95/11 - Revision 3.8.

Bamberger, A., Chavent, G., Hemon, C. & Lailly, P., 1982. Inversion of normal incidence seismograms, Geophysics, 47(5), 757–770.

Bangerth, W., Geiger, M. & Rannacher, R., 2010. Adaptive Galerkin finite element methods for the wave equation, Comput. Methods Appl. Math., 10(1), 3–48.

Boehm, C. & Ulbrich, M., 2013. A Newton-CG method for full-waveform inversion in a coupled solid-fluid system, in Advanced Computing, pp. 99–117, eds Bader, M., Bungartz, H.-J. & Weinzierl, T., Springer.

Boehm, C., Hanzich, M., de la Puente, J. & Fichtner, A., 2016. Wavefield compression for adjoint methods in full-waveform inversion, Geophysics, 81, R385–R397, doi:10.1190/geo2015-0653.1.

Boehm, C., Korta Martiartu, N., Vinard, N., Jovanovic Balic, I. & Fichtner, A., 2018. Time-domain spectral-element ultrasound waveform tomography using a stochastic quasi-Newton method, Proc. SPIE, 10580, doi:10.1117/12.2293299.

Bozdag, E., Peter, D., Lefebvre, M., Komatitsch, D., Tromp, J., Hill, J., Podhorszki, N. & Pugmire, D., 2016. Global adjoint tomography: first-generation model, Geophys. J. Int., 207(3), 1739–1766.

Chaljub, E. & Valette, B., 2004. Spectral element modelling of three-dimensional wave propagation in a self-gravitating earth with an arbitrarily stratified outer core, Geophys. J. Int., 158(1), 131–141.

Chaljub, E., Komatitsch, D., Vilotte, J.-P., Capdeville, Y., Valette, B. & Festa, G., 2007. Spectral-element analysis in seismology, in Advances in Wave Propagation in Heterogenous Earth, pp. 365–419, eds Ru-Shan Wu, V.M. & Dmowska, R., Elsevier.

Chevalier, C. & Pellegrini, F., 2008. PT-Scotch: a tool for efficient parallel graph ordering, Parallel Comput., 34(6), 318–331.

Dziewonski, A.M. & Anderson, D.L., 1981. Preliminary reference Earth model, Phys. Earth planet. Inter., 25, 297–356.

Fichtner, A., 2010. Full Seismic Waveform Modelling and Inversion, Springer.

Fichtner, A., Kennett, B.L.N., Igel, H. & Bunge, H.-P., 2009. Spectral-element simulation and inversion of seismic waves in a spherical section of the Earth, J. Numer. Anal. Ind. Appl. Math., 4, 11–22.

Goddeke, D., Komatitsch, D., Geveler, M., Ribbrock, D., Rajovic, N., Puzovic, N. & Ramirez, A., 2013. Energy efficiency vs. performance of the numerical solution of PDEs: an application study on a low-power ARM-based cluster, J. Comput. Phys., 237, 132–150.

Gokhberg, A. & Fichtner, A., 2016. Full-waveform inversion on heterogeneous HPC systems, Comput. Geosci., 89, 260–268.

Goncharsky, A., Romanov, S.Y. & Seryozhnikov, S.Y., 2016. A computer simulation study of soft tissue characterization using low-frequency ultrasonic tomography, Ultrasonics, 67, 136–150.

Graves, R. et al., 2011. Cybershake: a physics-based seismic hazard model for Southern California, Pure appl. Geophys., 168(3), 367–381.

Heinecke, A. et al., 2014. Petascale high order dynamic rupture earthquake simulations on heterogeneous supercomputers, in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 3–14, New Orleans, LA, doi:10.1109/SC.2014.6.

Holmes, M.H., 2007. Introduction to Numerical Methods in Differential Equations, Springer.

Komatitsch, D. & Tromp, J., 2002. Spectral-element simulations of global seismic wave propagation, part II: 3-D models, oceans, rotation, and gravity, Geophys. J. Int., 150, 303–318.

Komatitsch, D. & Vilotte, J.P., 1998. The spectral element method: an effective tool to simulate the seismic response of 2D and 3D geological structures, Bull. seism. Soc. Am., 88, 368–392.

Komatitsch, D., Tsuboi, S. & Tromp, J., 2013. The spectral-element method in seismology, in Seismic Earth: Array Analysis of Broadband Seismograms, Vol. 157, pp. 205–227, eds Levander, A. & Nolet, G., American Geophysical Union.

Krischer, L., Fichtner, A., Boehm, C. & Igel, H., 2018. Automated large-scale full seismic waveform inversion for North America and the North Atlantic, J. Geophys. Res.

Kroner, A., 2011. Adaptive finite element methods for optimal control of second order hyperbolic equations, Comput. Methods Appl. Math., 11(2), 214–240.

Lange, M., Mitchell, L., Knepley, M.G. & Gorman, G.J., 2016. Efficient mesh management in Firedrake using PETSc-DMPlex, SIAM J. Sci. Comput., 38(5), S143–S155.

Padoin, E.L., de Oliveira, D.A., Velho, P. & Navaux, P.O., 2012. Time-to-solution and energy-to-solution: a comparison between arm and xeon, in Third Workshop on Applications for Multi-Core Architectures (WAMCA), pp. 48–53, IEEE, New York.

Peter, D. et al., 2011. Forward and adjoint simulations of seismic wave propagation on fully unstructured hexahedral meshes, Geophys. J. Int., 186, 721–739.

Plessix, R.-E., 2017. Some computational aspects of the time and frequency domain formulations of seismic waveform inversion, in Modern Solvers for Helmholtz Problems, pp. 159–187, eds Lahaye, D., Tang, J. & Vuik, K., Birkhauser, Springer.

Pratt, R., Shin, C. & Hicks, G., 1998. Gauss–Newton and full Newton methods in frequency–space seismic waveform inversion, Geophys. J. Int., 133, 341–362.

Seidl, R. & Rank, E., 2016. Iterative time reversal based flaw identification, Comput. Math. Appl., 72(4), 879–892.

Sirgue, L., Etgen, J.T. & Albertin, U., 2008. 3D frequency domain waveform inversion using time domain finite difference methods, in 70th EAGE Conference and Exhibition incorporating SPE EUROPEC.

Stovas, A., Landrø, M. & Arntsen, B., 2006. A sensitivity study based on 2D synthetic data from the Gullfaks field, using PP and PS time-lapse stacks for fluid-pressure discrimination, J. Geophys. Eng., 3(4).

Tape, C., Liu, Q., Maggi, A. & Tromp, J., 2010. Seismic tomography of the southern California crust based upon spectral-element and adjoint methods, Geophys. J. Int., 180, 433–462.

Virieux, J. & Operto, S., 2009. An overview of full waveform inversion in exploration geophysics, Geophysics, 74, WCC127–WCC152.

