



Acknowledgements

With the support of the EU ERC-ADG Advanced Grant iHEART, Project ID: 740132.

Numerical results

e-mail: [email protected]

Introduction

• S. Fresca et al. A comprehensive deep learning-based approach to reduced order modeling of nonlinear time-dependent parametrized PDEs. arXiv preprint arXiv:2001.04001, 2020.

• S. Fresca et al. Deep learning-based reduced order models in cardiac electrophysiology. PLOS ONE, 15(10):1–32, 2020.

• S. Fresca et al. POD-DL-ROM: enhancing deep learning-based reduced order models for nonlinear parametrized PDEs by proper orthogonal decomposition. In preparation.

• https://github.com/stefaniafresca/DL-ROM

Deep learning-based reduced order models for real-time approximation of nonlinear time-dependent parametrized PDEs

S. Fresca 1, A. Manzoni 1, L. Dedè 1, A. Quarteroni 1,2

1 MOX - Dipartimento di Matematica, Politecnico di Milano, Milano, Italy
2 MATHICSE-CSQI, École polytechnique fédérale de Lausanne, Lausanne, Switzerland

Abstract
Conventional linear reduced order modeling techniques, such as, e.g., the reduced basis method, may incur severe limitations when dealing with nonlinear time-dependent parametrized PDEs featuring coherent structures that propagate over time, such as transport, wave, or convection-dominated phenomena. In this work, we propose a new, nonlinear approach relying on deep learning (DL) algorithms to obtain accurate and efficient reduced order models (ROMs), whose dimensionality matches the number of system parameters.

Reduced order modeling aims at replacing the FOM (1) by a model showing a much lower complexity, but still able to express the physical features of the problem at hand.

[Workflow diagram: FOM snapshots, computed by FEM or IGA, feed either a projection-based POD-Galerkin ROM (with hyper-reduction) or the proposed DL-ROM / POD DL-ROM.]

Critical issues:
• coherent structures propagating over time, e.g. wavefronts;
• low regularity of the solution manifold (with respect to the problem parameters).

[Diagram: dimensionality reduction obtained by combining a linear stage with a nonlinear, deep learning-based one; the DL-ROM couples a deep feedforward NN (reduced dynamics learning) with the decoder function of a convolutional autoencoder (reduced trial manifold learning).]

3 A Deep Learning-based Reduced Order Model (DL-ROM)

Let us now detail the construction of the proposed nonlinear ROM. In this respect, we define the functions Ψ_h and Φ_n in (9) and (11) by means of deep learning (DL) models, exploiting neural network architectures. This choice is motivated by their capability of approximating nonlinear maps effectively, and by their ability to learn from data and generalize to unseen data. On the other hand, DL models enable us to build non-intrusive, completely data-driven ROMs, since their construction only requires access to the dataset (the parameter values and the snapshot matrix), but not the FOM arrays appearing in (1).

The DL-ROM technique we developed is composed of two main blocks, responsible, respectively, for the reduced dynamics learning and the nonlinear trial manifold learning (see Figure 2). Hereon, we denote by N_train, N_test and N_t the number of training-parameter instances, of testing-parameter instances and of time instances, respectively, and we set N_s = N_train · N_t. The dimension of both the FOM solution and the ROM approximation is N_h, while n denotes the number of intrinsic coordinates, with n ≪ N_h.

For the description of the system dynamics on the reduced nonlinear trial manifold (which we refer to as reduced dynamics learning), we employ a deep feedforward neural network (DFNN) with L layers, that is, we define the function Φ_n in definition (11) as

    Φ_n(t; µ, θ_DF) = φ_n^DF(t; µ, θ_DF),    (12)

thus yielding the map

    (t, µ) ↦ u_n(t; µ, θ_DF) = φ_n^DF(t; µ, θ_DF),

where φ_n^DF takes the form (30), t ∈ [0, T], and results from the subsequent composition of a nonlinear activation function L times. Here µ ∈ P ⊂ R^{n_µ} and θ_DF denotes the vector of parameters of the DFNN.

Regarding instead the description of the reduced nonlinear trial manifold S̃_n defined in (10) (which we refer to as reduced trial manifold learning), we employ the decoder function of a convolutional autoencoder (AE), that is, we define the function Ψ_h appearing in (9) and (10) as

    Ψ_h(u_n(t; µ); θ_D) = f_h^D(u_n(t; µ); θ_D),    (13)

thus yielding the map

    u_n(t; µ) ↦ ũ_h(t; µ, θ_D) = f_h^D(u_n(t; µ); θ_D),

where f_h^D results from the composition of several layers, some of convolutional type, overall depending on the vector θ_D of parameters of the decoder function. Combining the two stages above, the DL-ROM approximation is then given by

    ũ_h(t; µ, θ) = f_h^D(φ_n^DF(t; µ, θ_DF); θ_D),    (14)

where φ_n^DF(·; ·, θ_DF) : R^{n_µ+1} → R^n and f_h^D(·; θ_D) : R^n → R^{N_h} are defined as in (12) and (13), respectively, and θ = (θ_DF, θ_D) are the parameters defining the neural network. The architecture of the DL-ROM is shown in Figure 2.
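For illustration only, a minimal PyTorch sketch of the two DL-ROM blocks in (12)-(14) is reported below; the network depths and widths, the latent dimension n, and the 64×64 spatial reshape inside the decoder are placeholder assumptions, not the architectures used in the numerical tests.

```python
import torch
import torch.nn as nn

class DFNN(nn.Module):
    """phi_n^DF in (12): maps (t, mu) in R^(n_mu+1) to the intrinsic coordinates u_n in R^n."""
    def __init__(self, n_mu, n, width=50, depth=4):
        super().__init__()
        layers, d_in = [], n_mu + 1
        for _ in range(depth):
            layers += [nn.Linear(d_in, width), nn.ELU()]
            d_in = width
        layers.append(nn.Linear(d_in, n))
        self.net = nn.Sequential(*layers)

    def forward(self, t_mu):              # t_mu: (batch, n_mu + 1)
        return self.net(t_mu)

class Decoder(nn.Module):
    """f_h^D in (13): maps u_n in R^n to the DL-ROM approximation in R^(N_h), here N_h = 64*64."""
    def __init__(self, n):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(n, 64 * 4 * 4), nn.ELU())
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ELU(),  # 4x4 -> 8x8
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ELU(),  # 8x8 -> 16x16
            nn.ConvTranspose2d(16, 8, 4, stride=2, padding=1), nn.ELU(),   # 16x16 -> 32x32
            nn.ConvTranspose2d(8, 1, 4, stride=2, padding=1),              # 32x32 -> 64x64, no activation
        )

    def forward(self, u_n):               # u_n: (batch, n)
        x = self.fc(u_n).view(-1, 64, 4, 4)
        return self.deconv(x).flatten(start_dim=1)   # reshape back to (batch, N_h)

def dl_rom(t_mu, dfnn, decoder):
    """DL-ROM approximation (14): f_h^D(phi_n^DF(t; mu; theta_DF); theta_D)."""
    return decoder(dfnn(t_mu))
```

At testing time, only this composition needs to be evaluated for a new pair (t, µ).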



Full Order Model (FOM)

Proper Orthogonal Decomposition-enhanced Deep Learning-based Reduced Order Model (POD DL-ROM)

Computing the ROM approximation (14) for any new value of µ ∈ P, at any given time, requires evaluating the map (t, µ) ↦ ũ_h(t; µ, θ) at the testing stage, once the parameters θ = (θ_DF, θ_D) have been determined, once and for all, during the training stage. The training stage consists in solving an optimization problem (in the variable θ) after a set of snapshots of the FOM has been computed. More precisely, given the parameter matrix M ∈ R^{(n_µ+1)×N_s} defined as

    M = [(t^1, µ_1) | ... | (t^{N_t}, µ_1) | ... | (t^1, µ_{N_train}) | ... | (t^{N_t}, µ_{N_train})],    (15)

and the snapshot matrix S, defined in (4), we solve the problem: find the optimal parameters θ* solution of

    J(θ) = (1/N_s) Σ_{i=1}^{N_train} Σ_{k=1}^{N_t} L(t^k, µ_i; θ) → min_θ,    (16)

where

    L(t^k, µ_i; θ) = (1/2) ‖u_h(t^k; µ_i) − ũ_h(t^k; µ_i, θ)‖² = (1/2) ‖u_h(t^k; µ_i) − f_h^D(φ_n^DF(t^k; µ_i, θ_DF); θ_D)‖².    (17)

To solve the optimization problem (16)-(17) we use the ADAM algorithm [29], a stochastic gradient descent method [52] computing an adaptive approximation of the first and second moments of the gradients of the loss function; in particular, it computes exponentially weighted moving averages of the gradients and of the squared gradients. We set the starting learning rate to η = 10^{-4}, the batch size to N_b = 20 and the maximum number of epochs to N_epochs = 10000. We perform cross-validation, in order to tune the hyper-parameters of the DL-ROM, by splitting the data into training and validation sets with an 8:2 proportion. Moreover, we implement an early-stopping regularization technique to reduce overfitting [20]: we stop the training if the loss does not decrease over 500 epochs. As nonlinear activation function we employ the ELU function [14], defined as

    σ(z) = z  if z ≥ 0,    σ(z) = exp(z) − 1  if z < 0.

No activation function is applied at the last convolutional layer of the decoder neural network, as usually done when dealing with autoencoders. The parameters (weights and biases) are initialized through the He uniform initialization [24].

As we rely on a convolutional autoencoder to define the function Ψ_h, we also exploit the encoder function

    ũ_n(t; µ, θ_E) = f_n^E(u_h(t; µ); θ_E),    (18)

which maps each FOM solution associated to the pairs (t, µ) ∈ Col(M), provided as inputs to the feedforward neural network (12), onto a low-dimensional representation ũ_n(t; µ, θ_E) depending on the parameter vector θ_E defining the encoder function.

Indeed, the actual architecture of the DL-ROM, used only during the training and validation phases but not during testing, is the one shown in Figure 3. In practice, we add to the architecture of the DL-ROM introduced above the encoder function of the convolutional autoencoder. This produces an additional term in the per-example loss function (17), leading to the following optimization problem:

    min_θ J(θ) = min_θ (1/N_s) Σ_{i=1}^{N_train} Σ_{k=1}^{N_t} L(t^k, µ_i; θ),    (19)

where

    L(t^k, µ_i; θ) = (ω_h/2) ‖u_h(t^k; µ_i) − ũ_h(t^k; µ_i, θ_DF, θ_D)‖² + ((1 − ω_h)/2) ‖ũ_n(t^k; µ_i, θ_E) − u_n(t^k; µ_i, θ_DF)‖².    (20)
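A minimal sketch of the training stage (16)-(20) is given below, again in PyTorch and with placeholder networks and randomly generated data standing in for the FOM snapshots; the learning rate, batch size, maximum number of epochs and early-stopping patience follow the values reported above, while the network sizes, the toy dataset and the value ω_h = 0.5 are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder dimensions and networks (stand-ins for the DFNN, decoder and convolutional encoder).
n_mu, n, N_h = 2, 3, 4096
dfnn    = nn.Sequential(nn.Linear(n_mu + 1, 50), nn.ELU(), nn.Linear(50, n))
decoder = nn.Sequential(nn.Linear(n, 200), nn.ELU(), nn.Linear(200, N_h))
encoder = nn.Sequential(nn.Linear(N_h, 200), nn.ELU(), nn.Linear(200, n))

def batch_loss(u_h, t_mu, omega_h=0.5):
    """Per-batch version of the two-term loss (19)-(20)."""
    u_n_dfnn  = dfnn(t_mu)            # u_n(t; mu, theta_DF), eq. (12)
    u_h_tilde = decoder(u_n_dfnn)     # f_h^D(u_n), eq. (14)
    u_n_enc   = encoder(u_h)          # f_n^E(u_h), eq. (18)
    rec = 0.5 * omega_h * ((u_h - u_h_tilde) ** 2).sum(dim=1)             # misfit on the snapshots
    red = 0.5 * (1.0 - omega_h) * ((u_n_enc - u_n_dfnn) ** 2).sum(dim=1)  # misfit in the reduced space
    return (rec + red).mean()

# Random placeholder snapshots and (t, mu) pairs, split 8:2; in practice these come from S and M.
train_loader = DataLoader(TensorDataset(torch.randn(160, N_h), torch.rand(160, n_mu + 1)), batch_size=20)
val_loader   = DataLoader(TensorDataset(torch.randn(40, N_h), torch.rand(40, n_mu + 1)), batch_size=20)

params = list(dfnn.parameters()) + list(decoder.parameters()) + list(encoder.parameters())
opt = torch.optim.Adam(params, lr=1e-4)               # ADAM with starting learning rate eta = 1e-4

best, patience, wait = float("inf"), 500, 0
for epoch in range(10000):                            # N_epochs = 10000
    for u_h, t_mu in train_loader:                    # mini-batches of size N_b = 20
        opt.zero_grad()
        batch_loss(u_h, t_mu).backward()
        opt.step()
    with torch.no_grad():                             # early stopping on the validation loss
        val = sum(batch_loss(u, tm).item() for u, tm in val_loader) / len(val_loader)
    if val < best:
        best, wait = val, 0
    else:
        wait += 1
        if wait >= patience:                          # stop if no decrease over 500 epochs
            break
```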


Deep learning-based reduced order modeling (DL-ROM)

To overcome the limitations of linear ROMs we consider a new, nonlinear ROM technique based on deep learning models. First introduced in [16] and assessed on one-dimensional benchmark problems, the DL-ROM technique aims at learning both the nonlinear trial manifold (corresponding to the matrix V in the case of a linear ROM) in which we seek the solution to the parametrized system (1) and the nonlinear reduced dynamics (corresponding to the projection stage in a linear ROM). This method is non-intrusive; it relies on DL algorithms trained on a set of FOM solutions obtained for different parameter values.

We denote by N_train and N_test the number of training and testing parameter instances, respectively; the ROM dimension is again denoted by n ≪ N. In order to describe the system dynamics on a suitable reduced nonlinear trial manifold (a task which we refer to as reduced dynamics learning), the intrinsic coordinates of the ROM approximation are defined as

    u_n(t; µ, θ_DF) = φ_n^DF(t; µ, θ_DF),    (8)

where φ_n^DF(·; ·, θ_DF) : R^{n_µ+1} → R^n is a deep feedforward neural network (DFNN), consisting in the subsequent composition of a nonlinear activation function, applied to a linear transformation of the input, multiple times [34]. Here θ_DF denotes the vector of parameters of the DFNN, collecting all the corresponding weights and biases of each layer of the DFNN.

Regarding instead the description of the reduced nonlinear trial manifold approximating the solution one, S̃ ≈ S (a task which we refer to as reduced trial manifold learning), we employ the decoder function of a convolutional autoencoder (AE) [35, 36]. More precisely, S̃ takes the form

    S̃ = { f^D(u_n(t; µ, θ_DF); θ_D) | u_n(t; µ, θ_DF) ∈ R^n, t ∈ [0, T) and µ ∈ P ⊂ R^{n_µ} },    (9)

where f^D(·; θ_D) : R^n → R^N consists in the decoder function of a convolutional AE. This latter results from the composition of several layers (some of which are convolutional), depending upon a vector θ_D collecting all the corresponding weights and biases.

As a matter of fact, the approximation ũ(t; µ) ≈ u(t; µ) provided by the DL-ROM technique is defined as

    ũ(t; µ) = f^D(φ_n^DF(t; µ, θ_DF); θ_D).    (10)

The encoder function of the convolutional AE can then be exploited by mapping the FOM solution associated to (t, µ) onto a low-dimensional representation

    ũ_n(t; µ, θ_E) = f_n^E(u(t; µ); θ_E),    (11)

where f_n^E(·; θ_E) : R^N → R^n is the encoder function, depending on a vector of parameters θ_E.

Computing the DL-ROM approximation of u(t; µ_test), for any possible t ∈ (0, T) and µ_test ∈ P, corresponds to the testing stage of a DFNN and of the decoder function of a convolutional AE; this does not require the evaluation of the encoder function. We remark that our DL-ROM strategy overcomes the three major computational bottlenecks implied by the use of projection-based ROMs, since:

- the dimension of the DL-ROM can be kept extremely small;

- the time resolution required by the DL-ROM can be chosen to be larger than the one required by the numerical solution of dynamical systems in cardiac electrophysiology;
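As a complement to the last paragraph, the snippet below illustrates, with small stand-in networks and placeholder parameter values, how the testing stage reduces to a single batched evaluation of the DFNN and of the decoder over a time grid, with no encoder involved; all sizes and values are assumptions made for the sake of the example.

```python
import torch
import torch.nn as nn

# Stand-ins for the trained DFNN and decoder (sizes are illustrative).
n_mu, n, N_h, N_t, T = 2, 3, 4096, 100, 1.0
dfnn    = nn.Sequential(nn.Linear(n_mu + 1, 50), nn.ELU(), nn.Linear(50, n))
decoder = nn.Sequential(nn.Linear(n, 200), nn.ELU(), nn.Linear(200, N_h))

mu_test = torch.tensor([0.3, 0.7])                        # placeholder testing-parameter instance
t_grid  = torch.linspace(0.0, T, N_t).unsqueeze(1)        # (N_t, 1) time instances
inputs  = torch.cat([t_grid, mu_test.repeat(N_t, 1)], 1)  # (N_t, n_mu + 1) pairs (t, mu_test)

with torch.no_grad():                                     # no gradients are needed at testing time
    u_n     = dfnn(inputs)        # intrinsic coordinates, eq. (8)
    u_tilde = decoder(u_n)        # DL-ROM approximation, eq. (10); one row per time instance
```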

[Figure: POD-DL-ROM architecture, panels (A)-(C): the FOM snapshot u_h(t; µ) is projected onto the POD basis as V^T u_h(t; µ), encoded through f_n^E(V^T u_h(t; µ); θ_E), approximated by the DFNN-decoder pair f_N^D as ũ_N(t; µ, θ_DF, θ_D) by minimizing (ω_h/2) ‖V^T u_h(t; µ) − ũ_N(t; µ, θ_DF, θ_D)‖², and finally lifted back to the FOM space as V ũ_N(t; µ, θ_DF, θ_D).]

Main features:
• POD-DL-ROMs learn, simultaneously, the nonlinear trial manifold and the nonlinear reduced dynamics;
• the POD-DL-ROM dimension is as close as possible to the number of parameters on which the PDE solution depends;
• a prior dimensionality reduction, performed by means of randomized POD, and pretraining allow us to drastically reduce training computational times (a minimal sketch of the randomized POD step is given below).
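As a rough illustration of the prior dimensionality reduction mentioned above, the sketch below computes N randomized POD modes of a snapshot matrix with a standard randomized range-finder (NumPy only); the matrix sizes, oversampling and power-iteration settings are placeholder assumptions, not the values used in the tests.

```python
import numpy as np

def randomized_pod(S, N, oversample=10, power_iters=2, seed=0):
    """First N POD modes of the snapshot matrix S (N_h x N_s) via randomized SVD."""
    rng = np.random.default_rng(seed)
    Y = S @ rng.standard_normal((S.shape[1], N + oversample))   # sample the range of S
    for _ in range(power_iters):                                # power iterations improve accuracy
        Y = S @ (S.T @ Y)
    Q, _ = np.linalg.qr(Y)                                      # orthonormal basis of the sampled range
    U_small, _, _ = np.linalg.svd(Q.T @ S, full_matrices=False) # SVD of a small (N+p) x N_s matrix
    return (Q @ U_small)[:, :N]                                 # approximate leading POD modes

# Placeholder snapshot matrix with N_h degrees of freedom and N_s = N_train * N_t columns.
N_h, N_s, N = 4096, 1000, 64
S = np.random.rand(N_h, N_s)
V = randomized_pod(S, N)        # V: N_h x N
S_N = V.T @ S                   # POD coefficients V^T u_h, the quantities fed to the POD-DL-ROM networks
u_rec = V @ S_N[:, 0]           # lifting one reduced approximation back to the FOM space, as in V u_N
```

Training the networks on the POD coefficients V^T u_h rather than on the full snapshots u_h is the POD-DL-ROM construction summarized in the figure above.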

Test 1: pretraining on 3D elastodynamics equations

Test 2: cardiac electrophysiology on left ventricle and atrium
