
Frances Kuo @ UNSW Australia p.1

Hot new directions for QMC research in step with applications

Frances Kuo

[email protected]

University of New South Wales, Sydney, Australia


Outline

Motivating example – a flow through a porous medium

QMC theory

Setting 1

Setting 2

Setting 3

Applications of QMC

Application 1

Application 2

Application 3

Cost reduction strategies

Saving 1

Saving 2

Saving 3

Concluding remarks


Motivating example: uncertainty in groundwater flow

e.g. risk analysis of radioactive waste disposal or CO2 sequestration

    Darcy's law:             q + a ∇p = 0
    mass conservation law:   ∇ · q = 0        in D ⊂ R^d, d = 1, 2, 3

together with boundary conditions.

Uncertainty in a(x, ω) leads to uncertainty in q(x, ω) and p(x, ω).


Expected values of quantities of interest

To compute the expected value of some quantity of interest:

1. Generate a number of realizations of the random field
   (some approximation may be required)

2. For each realization, solve the PDE using e.g. FEM / FVM / mFEM

3. Take the average of all solutions from different realizations

This describes Monte Carlo simulation.

Example: particle dispersion

[Figure: unit square domain with pressure p = 1 on the left boundary, p = 0 on the right boundary, no-flow conditions ∂p/∂n = 0 on top and bottom, and a marked release point.]



NOTE: expected value = (high dimensional) integral

s = stochastic dimension

→ use quasi-Monte Carlo methods

[Figure: log-log plot of error vs. n (n from 10^3 to 10^6), with MC error decaying like n^{-1/2} and QMC error decaying like n^{-1} or better.]


MC vs. QMC in the unit cube

    ∫_{[0,1]^s} f(y) dy  ≈  (1/n) Σ_{i=1}^n f(t_i)

Monte Carlo method:         t_i random uniform;    n^{-1/2} convergence
Quasi-Monte Carlo methods:  t_i deterministic;     close to n^{-1} convergence or better

[Figure: three point sets in the unit square: 64 random points; the first 64 points of a 2D Sobol′ sequence; a lattice rule with 64 points.]

MC: the order of the variables is irrelevant.
QMC: more effective for earlier variables and lower-order projections, so the order of the variables is very important.

use randomized QMC methods for error estimation


QMC

Two main families of QMC methods: (t,m,s)-nets and (t,s)-sequences, and lattice rules.

[Figure: the first 64 points of a 2D Sobol′ sequence, and a lattice rule with 64 points.]

(t,m,s)-nets and (t,s)-sequences: a (0,6,2)-net has the right number of points in various sub-cubes.
  Niederreiter book (1992); Dick and Pillichshammer book (2010)

Lattice rules: a group under addition modulo Z which includes the integer points.
  Sloan and Joe book (1994)

Important developments: component-by-component (CBC) construction, "fast" CBC [Nuyens and Cools (2006)], higher order digital nets [Dick (2008)]

Dick, K., Sloan, Acta Numerica (2013)

Contribute to: Information-Based Complexity, Tractability
  Traub, Wasilkowski, Wozniakowski book (1988); Novak and Wozniakowski books (2008, 2010, 2012)


Application of QMC to PDEs with random coefficients

[0] Graham, K., Nuyens, Scheichl, Sloan (J. Comput. Physics, 2011)

[1] K., Schwab, Sloan (SIAM J. Numer. Anal., 2012)

[2] K., Schwab, Sloan (J. FoCM, 2015)

[3] Graham, K., Nichols, Scheichl, Schwab, Sloan (Numer. Math., 2015)

[4] K., Scheichl, Schwab, Sloan, Ullmann (in review)

[5] Dick, K., Le Gia, Nuyens, Schwab (SIAM J. Numer. Anal., 2014)

[6] Dick, K., Le Gia, Schwab (SIAM J. Numer. Anal., to appear)

[7] Graham, K., Nuyens, Scheichl, Sloan (in progress)

Also Schwab (Proceedings of MCQMC 2012)

Le Gia (Proceedings of MCQMC 2012)

Dick, Le Gia, Schwab (in review, in review, in progress)

Harbrecht, Peters, Siebenmorgen (Math. Comp., 2015)

Gantner, Schwab (Proceedings of MCQMC 2014)

Ganesh, Hawkings (SIAM J. Sci. Comput., 2015)

Robbe, Nuyens, Vandewalle (in review)

Scheichl, Stuart, Teckentrup (in review)

Gilbert, Graham, K., Scheichl, Sloan (in progress)

Graham, Parkinson, Scheichl (in progress), etc.


Application of QMC to PDEs with random coefficients

Three different QMC theoretical settings:

[1,2] Weighted Sobolev space over [0,1]^s and "randomly shifted lattice rules"

[3,4] Weighted space setting in R^s and "randomly shifted lattice rules"

[5,6] Weighted space of smooth functions over [0,1]^s and (deterministic) "interlaced polynomial lattice rules"

                              Uniform    Lognormal         Lognormal
                                         (KL expansion)    (circulant embedding)
Experiments only                                           [0]*
First order, single-level     [1]        [3]*              [7]*
First order, multi-level      [2]        [4]*
Higher order, single-level    [5]
Higher order, multi-level     [6]*

The * indicates there are accompanying numerical results

K., Nuyens (J. FoCM, to appear) – a survey of [1,2], [3,4], [5,6] in a unified view

Accompanying software package: http://people.cs.kuleuven.be/~dirk.nuyens/qmc4pde/


5-minute QMC primer (most important slide)

Common features among all three QMC theoretical settings:

Separation in error bound:

    (rms) QMC error  ≤  (rms) worst case error_γ  ×  norm of integrand ‖f‖_γ

Weighted spaces

Pairing of QMC rule with function space

Optimal rate of convergence

Rate and constant independent of dimension

Fast component-by-component (CBC) construction

Extensible or embedded rules

Application of QMC theory

Estimate the norm (critical step)

Choose the weights

Weights as input to the CBC construction


Lattice rules

Rank-1 lattice rules have points

    t_i = frac( (i/n) z ),   i = 1, 2, ..., n

z ∈ Z^s – the generating vector, with all components coprime to n
frac(·) – take the fractional part of all components

[Figure: a lattice rule with 64 points in the unit square, points labelled 1 to 64 in generation order.]

∼ quality determined by the choice of z ∼

    n = 64,   z = (1, 19),   t_i = frac( (i/64) (1, 19) )
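The rank-1 construction above is a one-liner in code. A minimal sketch (the function name is my own) that reproduces the slide's 64-point rule with generating vector z = (1, 19):

```python
def lattice_points(n, z):
    """Rank-1 lattice rule: t_i = frac((i/n) * z), i = 1, ..., n."""
    return [tuple((i * zj / n) % 1.0 for zj in z) for i in range(1, n + 1)]

# The 64-point 2D lattice rule from the slide.
pts = lattice_points(64, (1, 19))
# i = n gives frac(z) = (0, 0), so the rule includes the integer point 0.
assert pts[-1] == (0.0, 0.0)
```

Since every component of z is coprime to n, each 1D projection of the point set is the full equispaced grid {0, 1/n, ..., (n-1)/n}.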


Randomly shifted lattice rules

Shifted rank-1 lattice rules have points

    t_i = frac( (i/n) z + ∆ ),   i = 1, 2, ..., n

∆ ∈ [0, 1)^s – the shift

[Figure: a lattice rule with 64 points, and the same rule shifted by ∆ = (0.1, 0.3).]

∼ use a number of random shifts for error estimation ∼
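A minimal sketch of this randomization: average the shifted-lattice estimates over independent uniform shifts, and use the sample standard error across shifts as an error estimate. The test integrand and all names here are my own illustrative choices:

```python
import random
from math import sqrt

def shifted_lattice_estimate(f, n, z, shift):
    """Q(f) = (1/n) sum_i f(frac((i/n) z + shift))."""
    s = len(z)
    total = 0.0
    for i in range(1, n + 1):
        t = tuple((i * z[j] / n + shift[j]) % 1.0 for j in range(s))
        total += f(t)
    return total / n

def rqmc(f, n, z, n_shifts=10, seed=0):
    """Average over random shifts; the sample standard error across shifts
    estimates the RMS error of the randomly shifted lattice rule."""
    rng = random.Random(seed)
    s = len(z)
    Q = [shifted_lattice_estimate(f, n, z, [rng.random() for _ in range(s)])
         for _ in range(n_shifts)]
    mean = sum(Q) / n_shifts
    stderr = sqrt(sum((q - mean) ** 2 for q in Q) / (n_shifts * (n_shifts - 1)))
    return mean, stderr

# Smooth test integrand with known integral 1 over [0,1]^2.
f = lambda t: 4 * t[0] * t[1]
est, err = rqmc(f, 64, (1, 19))
```

Each shifted estimate is an unbiased estimator of the integral, which is what makes the across-shift standard error a valid (probabilistic) error estimate.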


Component-by-component construction

Want to find z for which some error criterion is as small as possible.
∼ Exhaustive search is practically impossible - too many choices! ∼

CBC algorithm [Korobov (1959); Sloan, Reztsov (2002); Sloan, K., Joe (2002), ...]

1. Set z_1 = 1.
2. With z_1 fixed, choose z_2 to minimize the error criterion in 2D.
3. With z_1, z_2 fixed, choose z_3 to minimize the error criterion in 3D.
4. etc.

[K. (2003); Dick (2004)]
Optimal rate of convergence O(n^{-1+δ}) in "weighted Sobolev space",
independently of s under an appropriate condition on the weights.
∼ Averaging argument: there is always one choice as good as average! ∼

[Nuyens, Cools (2006)]
Cost of the algorithm for "product weights" is O(s n log n) using FFT.

Extensible/embedded variants
[Hickernell, Hong, L'Ecuyer, Lemieux (2000); Hickernell, Niederreiter (2003)]
[Cools, K., Nuyens (2006)] [Dick, Pillichshammer, Waterhouse (2007)]

http://people.cs.kuleuven.be/~dirk.nuyens/fast-cbc/
http://people.cs.kuleuven.be/~dirk.nuyens/qmc-generators/


Fast CBC construction

[Figure: the CBC matrix for n = 53, shown under the natural ordering of the indices and under the generator ordering of the indices, where it becomes circulant.]

Matrix-vector multiplication with a circulant matrix can be done using FFT.


Setting 1: standard QMC for the unit cube

Worst case error bound:

    | ∫_{[0,1]^s} f(y) dy − (1/n) Σ_{i=1}^n f(t_i) |  ≤  e^wor(t_1, ..., t_n) ‖f‖

Weighted Sobolev space [Sloan, Wozniakowski (1998)]:

    | ∫_{[0,1]^s} f(y) dy − (1/n) Σ_{i=1}^n f(t_i) |  ≤  e^wor_γ(t_1, ..., t_n) ‖f‖_γ

    ‖f‖_γ² = Σ_{u ⊆ {1:s}} (1/γ_u) ∫_{[0,1]^{|u|}} | (∂^{|u|} f / ∂y_u)(y_u; 0) |² dy_u

The sum is over all 2^s subsets u; the remaining variables are "anchored" at 0 (there is also an "unanchored" variant); the γ_u are the "weights"; mixed first derivatives are square integrable.

Small weight γ_u means that f depends weakly on the variables y_u.

Pair with randomly shifted lattice rules.

Choose weights to minimize the error bound [K., Schwab, Sloan (2012)]:

    ( (2/n) Σ_{u ⊆ {1:s}} γ_u^λ a_u )^{1/(2λ)}  ×  ( Σ_{u ⊆ {1:s}} b_u / γ_u )^{1/2}
      (bound on worst case error, CBC)              (bound on norm)

    ⇒  γ_u = ( b_u / a_u )^{1/(1+λ)}     "POD weights"

Construct points (CBC) to minimize the worst case error.


Setting 2: QMC integration over R^s

Change of variables:

    ∫_{R^s} f(y) ∏_{j=1}^s φ(y_j) dy  =  ∫_{[0,1]^s} f(Φ^{-1}(w)) dw
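In code, this change of variables is just a componentwise inverse CDF applied to the cube points. A minimal sketch for the standard normal case (the point set and function name are my own; `statistics.NormalDist.inv_cdf` supplies Φ^{-1} from the Python standard library):

```python
from statistics import NormalDist

def map_to_Rs(points_cube, dist=NormalDist()):
    """Map QMC points from (0,1)^s to R^s via the inverse CDF Phi^{-1},
    so (1/n) sum_i f(Phi^{-1}(w_i)) approximates the Gaussian-weighted integral."""
    return [tuple(dist.inv_cdf(w) for w in point) for point in points_cube]

# Illustrative 2D cube points kept away from 0 and 1, where Phi^{-1} blows up;
# a random shift of a lattice rule achieves this almost surely.
pts = [((i + 0.5) / 8, ((3 * i + 0.5) / 8) % 1.0) for i in range(8)]
mapped = map_to_Rs(pts)
```

Because Φ^{-1} is unbounded near the boundary of the cube, the analysis in this setting works with the weight functions ϕ_j rather than with boundedness.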

Norm with weight function [Wasilkowski, Wozniakowski (2000)]:

    ‖f‖_γ² = Σ_{u ⊆ {1:s}} (1/γ_u) ∫_{R^{|u|}} | (∂^{|u|} f / ∂y_u)(y_u; 0) |² ∏_{j∈u} ϕ_j²(y_j) dy_u

(there is also an "unanchored" variant; the ϕ_j are weight functions)

[K., Sloan, Wasilkowski, Waterhouse (2010); Nichols, K. (2014)]

Pair with randomly shifted lattice rules.
CBC error bound for general weights γ_u.
Convergence rate depends on the relationship between φ and ϕ_j.
Fast CBC for POD weights.

Important for applications: φ, ϕ_j, γ_u are up to us to choose/tune.


Setting 3: smooth integrands in the cube

Norm involving higher order mixed derivatives

Pair with higher order digital nets [Dick (2008)]

Classical polynomial lattice rule [Niederreiter (1992)]

n = b^m points with prime b

An irreducible polynomial with degree m

A generating vector of s polynomials with degree < m

Interlaced polynomial lattice rule [Goda, Dick (2012)]

Interlacing factor α

An irreducible polynomial with degree m

A generating vector of αs polynomials with degree < m

Digit interlacing function D_α : [0,1)^α → [0,1)

    (0.x11 x12 x13 ···)_b, (0.x21 x22 x23 ···)_b, ..., (0.xα1 xα2 xα3 ···)_b

    becomes  (0.x11 x21 ··· xα1 x12 x22 ··· xα2 x13 x23 ··· xα3 ···)_b

[Dick, K., Le Gia, Nuyens, Schwab (2014)]

Variants of fast CBC construction available “SPOD weights”
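The digit interlacing map D_α can be sketched directly, to finite digit precision (the function name and defaults are my own illustrative choices):

```python
def interlace(xs, base=2, digits=16):
    """Digit interlacing D_alpha: [0,1)^alpha -> [0,1).
    Digit k of component j goes to position (k-1)*alpha + j of the output."""
    alpha = len(xs)
    # Extract `digits` base-b digits of each component.
    digs = []
    for x in xs:
        d, y = [], x
        for _ in range(digits):
            y *= base
            k = int(y)
            d.append(k)
            y -= k
        digs.append(d)
    # Interleave: output digit stream x11 x21 ... x_alpha1 x12 x22 ...
    out, scale = 0.0, 1.0 / base
    for k in range(digits):
        for j in range(alpha):
            out += digs[j][k] * scale
            scale /= base
    return out

# Interlacing 0.5 = (0.100...)_2 and 0.25 = (0.010...)_2 gives
# (0.10 01 00 ...)_2 = 1/2 + 1/16 = 0.5625.
```

Applying this map to each block of α components of an αs-dimensional polynomial lattice point yields the s-dimensional interlaced point.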


Overview

Application 1: option pricing
Application 2: GLMM maximum likelihood
Application 3: PDEs with random coefficients

Setting 1: domain [0,1]^s; randomly shifted lattice rules; mixed first derivatives; first order convergence
Setting 2: domain R^s; randomly shifted lattice rules; mixed first derivatives; first order convergence
Setting 3: domain [0,1]^s; interlaced polynomial lattice rules; mixed higher derivatives; higher order convergence


Application 1: QMC for option pricing

Path-dependent option [Giles, K., Sloan, Waterhouse (2008)]

[Figure: log-log plot of standard error vs. n for MC, naive QMC, and QMC + PCA.]

    ∫_{R^s} max( (1/s) Σ_{j=1}^s S_j(z) − K, 0 )  exp(−½ zᵀ Σ^{-1} z) / √((2π)^s det(Σ))  dz

    = ∫_{R^s} f(y) ∏_{j=1}^s φ_nor(y_j) dy        (write Σ = A Aᵀ, substitute z = A y)

    = ∫_{[0,1]^s} g(w) dw                          (substitute y = Φ_nor^{-1}(w))

In the cube, g is unbounded and has a kink.

[Griebel, K., Sloan (2010, 2013, 2016)]
ANOVA decomposition: f(y) = Σ_{u ⊆ {1:s}} f_u(y_u)
All f_u with u ≠ {1:s} are smooth under the Brownian bridge (BB) construction.

[Griewank, Leövey, K., Sloan (in progress)]
Smoothing by preintegration:

    (P_k f)(y) = ∫_R f(y) φ_nor(y_k) dy_k   is smooth for some choice of k.

[Ian Sloan – Mon 4:50pm Berg A]


Application 2: QMC for maximum likelihood

Generalized linear mixed model [K., Dunsmuir, Sloan, Wand, Womersley (2008)]

    ∫_{R^s} ( ∏_{j=1}^s exp(τ_j (β + z_j) − e^{β + z_j}) / τ_j! )  exp(−½ zᵀ Σ^{-1} z) / √((2π)^s det(Σ))  dz

    = ∫_{R^s} f(y) ∏_{j=1}^s φ(y_j) dy    (re-center and re-scale, and multiply and divide by φ;
                                           substitute z = A* y + z*; cf. importance sampling)

    = ∫_{[0,1]^s} g(w) dw                  (substitute y = Φ^{-1}(w))

Choices: no re-centering (worst); no re-scaling (bad); φ normal (good); φ logistic (better); φ Student-t (best)

In the cube, g and/or its derivatives are unbounded.

[K., Sloan, Wasilkowski, Waterhouse (2010); Nichols, K. (2014)]
New weighted space setting over R^s

[Sinescu, K., Sloan (2013)]
Differentiate f to estimate the norm


Application 3: QMC for PDEs with random coefficients

    −∇ · (a(x, y) ∇u(x, y)) = κ(x),   x ∈ D ⊂ R^d,  d = 1, 2, 3
    u(x, y) = 0,                      x ∈ ∂D

Uniform [Cohen, DeVore, Schwab (2011)]

    a(x, y) = ā(x) + Σ_{j≥1} y_j ψ_j(x),   y_j ~ U[−1/2, 1/2]

    Goal:  ∫_{(−1/2,1/2)^N} G(u(·, y)) dy,   G – a bounded linear functional

Lognormal

    a(x, y) = exp(Z(x, y))

    Karhunen–Loève expansion:  Z(x, y) = Z̄(x) + Σ_{j≥1} y_j √μ_j ξ_j(x),   y_j ~ N(0, 1)

    Goal:  ∫_{R^N} G(u(·, y)) ∏_{j≥1} φ_nor(y_j) dy


Application 3: QMC for PDEs with random coefficients

Error analysis [K., Nuyens (2016) – survey]

    ∫_{(−1/2,1/2)^N} G(u(·, y)) dy  ≈  (1/n) Σ_{i=1}^n G(u^s_h(·, t_i))

Balance three sources of error: (1) dimension truncation, (2) FE discretization, (3) QMC quadrature.

QMC error:

Estimate the norm ‖G(u^s_h)‖_γ by differentiating the PDE.
Choose the weights to minimize (bound on e^wor_γ) × (bound on ‖G(u^s_h)‖_γ).
Get POD weights (product and order dependent) ⇒ feed into the fast CBC construction.

    Σ_{j≥1} ‖ψ_j‖_∞^p < ∞  ⇒  O(n^{−min(1/p − 1/2, 1−δ)}) convergence, independently of s

Improved to O(n^{−1/p}) convergence, independently of s, with higher order QMC and SPOD weights (smoothness-driven product and order dependent).

Lognormal results depend on the smoothness of the covariance function.

Pairing of QMC rule with function space is crucial:
Want the best convergence rate under the least assumptions.
Want the convergence rate and implied constant to be independent of s.


Application 3: QMC for PDEs with random coefficients

Extensions

Lognormal a(x, y) = exp(Z(x, y))

Circulant embedding [Graham, K., Nuyens, Scheichl, Sloan (in progress)]
  Z(x_i, y) for i = 1, ..., M ⇒ M × M covariance matrix R
  Assume a stationary covariance function and a regular grid
  Extend R to an s × s circulant matrix R_ext = A Aᵀ
  Compute the matrix-vector multiplication A y using FFT

H-matrix [Feischl, K., Sloan (in progress)]
  Does not require a stationary covariance function or a regular grid
  Approximate R by an H²-matrix of interpolation order p, R_p = B Bᵀ
  Iterative method to compute B y to some prescribed accuracy
  [Michael Feischl – Fri 3:00pm Berg C]

QMC for PDE eigenvalue problems (neutron transport) [Gilbert, Graham, K., Scheichl, Sloan (in progress)]

QMC for wave propagation models [Ganesh, K., Nuyens, Sloan (in progress)]

QMC for Bayesian inverse problems [several research groups]


Overview

Application 1: option pricing (smoothing by preintegration)
Application 2: GLMM maximum likelihood (integral transformation)
Application 3: PDEs with random coefficients: uniform; lognormal via KL expansion, circulant embedding, or H-matrix

Setting 1: domain [0,1]^s; randomly shifted lattice rules; mixed first derivatives; first order convergence
Setting 2: domain R^s; randomly shifted lattice rules; mixed first derivatives; first order convergence
Setting 3: domain [0,1]^s; interlaced polynomial lattice rules; mixed higher derivatives; higher order convergence

Cost reduction: Saving 1, Saving 2, Saving 3


Saving 1: multi-level and multi-index

Multi-level method [Heinrich (1998); Giles (2008); ...; Giles (2016)]

Telescoping sum:

    f = Σ_{ℓ=0}^∞ (f_ℓ − f_{ℓ−1}),  f_{−1} := 0         A^ML(f) = Σ_{ℓ=0}^L Q_ℓ(f_ℓ − f_{ℓ−1})

    |I(f) − A^ML(f)|  ≤  |I(f) − I_L(f_L)|  +  Σ_{ℓ=0}^L |(I_ℓ − Q_ℓ)(f_ℓ − f_{ℓ−1})|
                          (choose L)             (specify parameters for each Q_ℓ)

Successive differences f_ℓ − f_{ℓ−1} become smaller.
Fewer samples are required for the higher dimensional sub-problems.
Use a Lagrange multiplier to minimize cost subject to a given error threshold.

Multi-index method [Haji-Ali, Nobile, Tempone (2016)]

Apply multi-level simultaneously to more than one aspect of the problem

e.g., quadrature, spatial discretization, time discretization, etc.
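The telescoping estimator above can be sketched as follows. The toy integrand (a piecewise-constant approximation of E[U²] for U ~ Uniform(0,1)) and the sample counts are illustrative assumptions, not from the talk; note the decreasing sample counts across levels:

```python
import random

def mlmc(sample_diff, n_samples, seed=42):
    """Multi-level Monte Carlo: A_ML(f) = sum_l Q_l(f_l - f_{l-1}),
    where sample_diff(l, rng) draws one realization of f_l - f_{l-1}
    (with f_{-1} := 0) and n_samples[l] samples are used on level l."""
    rng = random.Random(seed)
    total = 0.0
    for level, n in enumerate(n_samples):
        total += sum(sample_diff(level, rng) for _ in range(n)) / n
    return total

def sample_diff(level, rng):
    """One draw of f_l - f_{l-1}: f_l approximates u^2 by the midpoint
    value on a grid of 2^l cells, for a common random u."""
    u = rng.random()
    def f(l):
        m = 2 ** l
        c = (int(u * m) + 0.5) / m  # midpoint of the cell containing u
        return c * c
    return f(level) if level == 0 else f(level) - f(level - 1)

# Few samples on the finer (more expensive) levels, many on the coarse ones.
est = mlmc(sample_diff, [1, 4000, 1000])
```

The telescoping sum makes est an unbiased estimator of E[f_2], and since the level differences shrink, most of the sampling effort can stay on the cheap coarse levels.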


Saving 2: MDM

Multivariate decomposition method [...; Wasilkowski (2013); ...]
[K., Nuyens, Plaskota, Sloan, Wasilkowski (in progress)]

Decomposition:

    f = Σ_{u ⊂ {1,2,...}, |u| < ∞} f_u          A^MDM(f) = Σ_{u ∈ A} Q_u(f_u)

    |I(f) − A^MDM(f)|  ≤  Σ_{u ∉ A} |I_u(f_u)|  +  Σ_{u ∈ A} |(I_u − Q_u)(f_u)|
                           (choose the active set A)   (specify parameters for each Q_u)

Assume bounds on the norms ‖f_u‖_{F_u} ≤ B_u are given.
Q_u can be QMC or other quadratures, including Smolyak.
The cardinality of the active set A might be huge, but each set u ∈ A is small.

Implementation [Gilbert, K., Nuyens, Wasilkowski (in progress)]

    Anchored decomposition:  f_u(x_u) = Σ_{v ⊆ u} (−1)^{|u|−|v|} f(x_v; a)

Exploit nestedness/embedding in the quadratures.
Efficient data structure to reuse function evaluations.
Application to PDEs with random coefficients. [Alexander Gilbert – Mon 2:25pm Berg A]
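The anchored decomposition formula can be checked directly. This sketch (function and variable names are my own) evaluates f_u(x_u) = Σ_{v⊆u} (−1)^{|u|−|v|} f(x_v; a), where (x_v; a) keeps x_j for j ∈ v and anchors the remaining coordinates at a, and verifies that the terms sum back to f:

```python
from itertools import combinations

def anchored_term(f, u, x, anchor):
    """Anchored decomposition term f_u(x_u) via inclusion-exclusion
    over the subsets v of u."""
    s = len(x)
    total = 0.0
    for r in range(len(u) + 1):
        for v in combinations(u, r):
            y = [x[j] if j in v else anchor[j] for j in range(s)]
            total += (-1) ** (len(u) - r) * f(y)
    return total

# Example: f(x) = x0 + x0*x1 with anchor a = 0 decomposes into
# f_{} = 0, f_{0} = x0, f_{1} = 0, f_{0,1} = x0*x1.
f = lambda y: y[0] + y[0] * y[1]
x, a = [0.3, 0.7], [0.0, 0.0]
parts = {u: anchored_term(f, u, x, a) for u in [(), (0,), (1,), (0, 1)]}
assert abs(sum(parts.values()) - f(x)) < 1e-12  # terms sum back to f(x)
```

Each term f_u only ever evaluates f at points whose coordinates outside u sit at the anchor, which is what makes the |u|-dimensional sub-problems tractable.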


Saving 3: QMC fast matrix-vector multiplication

QMC fast matrix-vector multiplication [Dick, K., Le Gia, Schwab (2015)]

Want to compute y_i A for all i = 1, ..., n, where the row vectors y_i = ϕ(t_i) and the t_i are QMC points.

Write

    Y' = [y_1; ...; y_{n−1}] = C P

where C is an (n−1) × (n−1) circulant matrix and P has a single 1 in each column and 0 everywhere else.

Can compute Y' a in O(n log n) operations for any column a of A.

When does/doesn't this work?
  Works with deterministic lattice points and deterministic polynomial lattice points, under the inverse cumulative normal or tent transform.
  Does not work with random shifting, scrambling, or interlacing.

Where does this apply?
  Generating normally distributed points with a general covariance matrix.
  PDEs with uniform random coefficients.
  PDEs with lognormal random coefficients involving FE quadrature.


Summary and concluding remarks

Application 1: option pricing (smoothing by preintegration)
Application 2: GLMM maximum likelihood (integral transformation)
Application 3: PDEs with random coefficients: uniform; lognormal via KL expansion, circulant embedding, or H-matrix

Setting 1: domain [0,1]^s; randomly shifted lattice rules; mixed first derivatives; first order convergence
Setting 2: domain R^s; randomly shifted lattice rules; mixed first derivatives; first order convergence
Setting 3: domain [0,1]^s; interlaced polynomial lattice rules; mixed higher derivatives; higher order convergence

Cost reduction:
  Saving 1: multi-level and multi-index methods
  Saving 2: multivariate decomposition method (MDM)
  Saving 3: QMC fast matrix-vector multiplication


Setting 4: higher order QMC for R^s???

Dick, Leobacher, Irrgeher, Pillichshammer

[Christian Irrgeher – Mon 5:20pm Berg A ]

Cools, Nguyen, Nuyens [Dong Nguyen – MCM Linz 2015 ]

Stretch the unit cube

Truncate the integrand

Apply higher order QMC without Φ-1 mapping

s dependence???


Thanks to my collaborators

AU

Austria

Belgium

Ecuador

Germany

NZ

Poland

Switzerland

UK

USA