towards(a(scalable(parallel(sparse( linear(systemsolver · role of reordering schemes •...
TRANSCRIPT
![Page 1: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/1.jpg)
Towards a Scalable Parallel Sparse Linear System Solver
M. Manguoglu METU, Turkey
A. Sameh* Purdue University, U.S.A. O. Schenk & M. Sathe Univ. of Basel, Switzerland
Coupled 2011 Kos Island June 20-‐22
Sparse Days 2011 CERFACS, September 6 & 7 Support*: ARO, Intel, NSF.
Acknowledge: F. Saied (CS, Purdue)
![Page 2: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/2.jpg)
Outline • MoUvaUng applicaUons – a sample • The parallel scalability of the SPIKE family of banded linear system solvers
• Extensions for solving general sparse linear systems:
– The Pardiso-‐SPIKE hybrid solver (PSPIKE) • Role of reordering for faster MATVEC and extracUon of precondiUoners:
– TraceMIN – a parallel eigensolver for compuUng the Fiedler vector (spectral reordering)
2
![Page 3: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/3.jpg)
Target ComputaUonal Loop Integra(on
Newton itera(on
Linear system solvers
ηk
∈k
Δt 3
![Page 4: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/4.jpg)
Role of Reordering Schemes
• “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or spectral reordering,
• This yields more scalable parallel matrix-vector multiplications, and helps in extracting more effective “banded” preconditioners.
a4er RCM reordering
Banded PrecondiUoners
![Page 5: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/5.jpg)
0
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
Factorization Triangular solves
Total
MKL TA0
Time (sec) on a 4-core Intel Clovertown
A x = f n = 600,000 bw = 99
5
![Page 6: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/6.jpg)
0
0.5
1
1.5
2
2.5
3
3.5
4
Factorization Triangular solvers
Total
MKL TA0
off-chip data accessed (in bytes)
X 109
6
“Analyzing memory access intensity in parallel programs for mul(core architectures” L. Liu, Z. Li, and A. S.
![Page 7: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/7.jpg)
Spike-‐based sparse solver: PSPIKE
Solve Mz = r (via Pardiso-‐SPIKE)
BiCGstab (or any Krylov subspace method)
Step 1: reordering (equivalent to HSL -‐-‐ MC64 + MC73) Step 2: extract a “banded” precondiUoner M
7
![Page 8: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/8.jpg)
Weighted Spectral Reordering (WSO)
Main ObjecUves: – encapsulaUng as many of the heaviest sparse matrix elements as possible within a central band to be used as a precondiUoner, • reducing the rank of the matrix lying outside the central band.
– realizing a faster “matvec” (bandedmatrix + few elements outside the band) (banded matrix + few nonzero elements outside the band)
![Page 9: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/9.jpg)
UFL: smt -‐-‐ structural mechanics N: 25,710 NNZ: 3,749,582
Original matrix AGer MC73 AGer TraceMin-‐Fiedler
ager TraceMIN-‐Fiedler ager HSL-‐MC73
9
obtaining the Fiedler vector via the eigensolver: TraceMIN (Wisniewski and A.S. -‐-‐ SINUM, ’82)
![Page 10: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/10.jpg)
UFL: f2 -‐-‐ structural mechanics N: 71,505 NNZ: 5,294,285
Original matrix AGer MC73 AGer TraceMin-‐Fiedler
10 TraceMIN-‐Fiedler: Murat Manguoglu et. al -‐-‐ submiled
![Page 11: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/11.jpg)
Before reordering Ager reordering via TraceMIN-‐Fiedler
UFL – f2
![Page 12: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/12.jpg)
M11
M22
M33
M44
B1
C2
C3
C4
B2
B3
z1
z4
z3
z2
r1
r4
r3
r2
=
M z = r (M is “banded”)
P = M + δ(M) = D’ * S’
(i) Solve D’ y = r
(ii) Solve S’ z = y
PSPIKE: Pardiso-‐SPIKE
12
Each Mkk is a general sparse
matrix
Solving systems involving The precondiPoner M z =r
![Page 13: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/13.jpg)
Reduced system reduced system
![Page 14: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/14.jpg)
GeneraUng Ups of the spikes
Mkk 0?
=
Obtain the upper and lower Ups of the soluUon block via the modified direct
sparse system solver “Pardiso”.
![Page 15: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/15.jpg)
How expensive is spectral reordering? Integra,on
Newton Itera,on
Linear system solvers
ηk
∈k
Δt
Cost of weighted Spectral reordering could be amor(zed over several (me steps.
Cost of WSO is amorPzed across several Pme steps.
Will return to this issue later will return to this issue later 15
![Page 16: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/16.jpg)
σ = T(MC73) on 1 core ÷ T(TraceMIN-‐Fiedler) on 4 nodes Matrix n nnz σ
Rajat31 ~4.7 M ~ 20 M 85
Schenk/nlpkkt20 ~ 3.5 M ~ 95 M 8
Freescale1 ~ 3.4 M ~ 17 M 4
KKT-‐power ~ 2.1 M ~ 13 M 641
![Page 17: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/17.jpg)
Mul(level Parallelism of PSPIKE for solving M z = r
Node 1 Node 2 Node 3 Node 4
Pardiso Pardiso Pardiso Pardiso
SPIKE A1
A2
A3
A4
C2
C3
C4
B1
B2
B3
Precondi(oner: M
17
PrecondiPoner M is “generalized-‐banded” Other opPons: • narrow-‐banded • wide-‐banded
PrecondiUoner M is generalized-‐banded (MBP) Other cases: • narrow-‐banded (NBP) • wide-‐banded (WBP)
![Page 18: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/18.jpg)
18
ager TraceMIN-‐Fiedler Circuit SimulaUon
Wide-‐Banded PrecondiUoner
![Page 19: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/19.jpg)
WBP: wide-‐banded precondiUoners precondiUoner consists of overlapped blocks
Naumov, Manguoglu, A.S. : A Tearing-‐based Hybrid Parallel Sparse Linear System Solver: JACM, 2010.
19
![Page 20: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/20.jpg)
Solve: Ax = f ; A := nonsingular
m << n
A1 A2
20
![Page 21: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/21.jpg)
Determine that which assures that
is the soluPon of the balance system (order = size of overlap)
M y = g M and g are not available explicitly
A1
A2
21
![Page 22: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/22.jpg)
The Balance System
• Solve the system My=g using a Krylov subspace scheme (CG or Bicgstab).
• Two Ume-‐consuming kernels:
– q = M*p
– r(p) = g – M*p
22
![Page 23: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/23.jpg)
ObservaUons • r(p) = g – M*p
= x2̂(p) – x2(p)
• r(0) = g = x2̂(0) – x2(0)
• q = M*p = r(0) – r(p)
23
![Page 24: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/24.jpg)
PSPIKE Stage 1 –
• ExtracUon of an effecUve “banded” precondiUoner M – with or without reordering. • norm(M,’fro’) = (1-‐ ε) norm(A,’fro’); • “bandwidth” ≤ β(n).
Stage 2 – • Use an outer Krylov subspace method (BiCGstab) to solve A x = f with precondiUoner M. • A modified Pardiso (by O. Schenk) is a major kernel for solving M z = r . • Classes of banded precondiUoners:
narrow-‐banded, generalized-‐banded, wide-‐banded
![Page 25: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/25.jpg)
CompuUng Playorm • Intel Cluster – Westmere X5670, 2.93 GHz – Infiniband interconnecUon – 12-‐core nodes
![Page 26: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/26.jpg)
a. PSPIKE-‐NBP
![Page 27: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/27.jpg)
UFL – Rajat31 (circuit simulaUon)
N ~ 4.7 M nnz ~ 20 M nonsymmetric
![Page 28: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/28.jpg)
Scalability of PSPIKE vs. direct solvers (for Rajat31)
• Direct solvers: – WSMP
• matrix input centralized on Node 0
– MUMPS • matrix input distributed by block rows • soluUon vector distributed • ParMETIS
• PSPIKE: (bandwidth of precondiUoner = 5) – TraceMIN-‐Fiedler
– matrix input distributed by block rows
![Page 29: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/29.jpg)
1
10
100
4 8 16 32 64 128 256 512 1024
PSPIKE – WSMP -‐-‐ MUMPS
Time in
seconds
factorizaUon
Total Ume for WSMP & MUMPS
# of cores (8 cores per node)
~37
~25
~17
~2.8
PSPIKE: 1 MPI proc./node 8 OMP threads/MPI proc.
8 MPI proc./node
rel. res. = O(10-‐5) rel. res. = O(10-‐5)
![Page 30: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/30.jpg)
System size:
N = 11,333,520
# of nonzeros: 61,026,416
bandwidth: 334,613
MEMS simulaUon benchmark 1
stopping criterion: rel. res. = O( 10-‐2)
30
![Page 31: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/31.jpg)
Scalability of TraceMin-‐Fiedler
31
0 5 10 15 20 25 30 35 40
0 5 10 15 20 25 30 35
~ Time in seconds
Nodes: 1 to 32
37
21
12
6 3 1.7
![Page 32: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/32.jpg)
MEMS SimulaUon – benchmark-‐1
WSMP total
WSMP factor
1
10
100
1000
4 8 16 32 64 128 256 512 1024
PSPIKE-‐NBP, rel. res. = 5(10-‐5)
~345
~31
~86
~4
~ 2.4
β = 5 β = 5 Time in sec.
cores
Variable # of OMP threads/MPI proc. PSPIKE: variable # of OMP threads/MPI proc.
![Page 33: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/33.jpg)
• Strong scalability of PSPIKE Fixed problem size – 1 to 64 nodes (or 8 to 512 cores)
• Comparison with AMG-‐precondiUoned Krylov subspace solvers in: • Hypre (LLNL) • Trilinos-‐ML (Sandia)
• Smoother – • Chebyshev fastest • Jacobi • Gauss-‐Seidel
Scalability of PSPIKE vs. Trilinos Intel Harpertown
33
![Page 34: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/34.jpg)
0.1
1
10
100 1 2 4 8 16 32 64
Speed Improvement over Trilinos-‐ML (nodes)
Time (Trilinos-‐ML) ÷ Time (PSPIKE)
PSPIKE: k threads per MPI process
break-‐even @ 4 nodes
# of nodes k 1 to 4 1 8 to 16 4 > 16 8 Intel Harpertown
34 MEMS benchmark 1
![Page 35: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/35.jpg)
0
1
2
3
4
5
6
64 128 256 512 1024
time
(sec
)
number of cores
5.4
2.9
1.4 0.9
0.7
Strong Scalability on Intel Nehalem for a larger MEMS system of order ~ 23M (benchmark 2)
35
~ 7.7 speed improv. : 64 to 1024 cores
![Page 36: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/36.jpg)
b. PSPIKE-‐WBP
![Page 37: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/37.jpg)
PSPIKE for ASIC680K system
Time on 1 node: Pardiso – 23 (sec). ILU-‐BiCGstab – 34 (sec). Time on 2 nodes: PSPIKE – 6 (sec).
relaUve residual Pardiso: O(10-‐9) ILU+BiCGstab: O(10-‐7) PSPIKE-‐WBP: O(10-‐7)
PrecondiPoner consisPng of two overlapped blocks
Precondi(oner consis(ng of two overlapped blocks (overlap = 11)
37
![Page 38: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/38.jpg)
Precondi(oner
A Overlapped diagonal blocks
Overlapped diagonal blocks
M
PSPIKE-WBP 38
![Page 39: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/39.jpg)
ASIC_680k_8 nonsymmetric size: ~ 5.5 M nnz: ~31.0 M
PrecondiUoner: 16 overlapped diagonal blocks overlap size = 100
PSPIKE-‐WBP
N = 16 m – 15 β N = 16 m – 15 β
![Page 40: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/40.jpg)
PSPIKE-‐WBP vs. ILU-‐precondiUoned BiCGstab
• ILUT-‐BiCGstab (one core): – drop tolerance = 10-‐3 – fill_in per row = 10% – 16 iteraUons, rel. res. = O(10-‐6) – Time ~ 276 sec.
• PSPIKE-‐WBP : – 1 iteraUon, rel. res = O(10-‐9)
![Page 41: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/41.jpg)
PSPIKE-‐WBP (mulUple nodes) vs.
ILU-‐precondiUoned BiCGstab (one core)
# of cores 32 (4 nodes)
64 (8 nodes)
128 (16 nodes)
Time (sec.) 113 31 9
Speed improv. over ILU-‐BiCGstab (one core)
2.4 8.9 30.7
![Page 42: Towards(a(Scalable(Parallel(Sparse( Linear(SystemSolver · Role of Reordering Schemes • “Banded”, or low-rank perturbations of banded, systems are often obtained after RCM or](https://reader034.vdocuments.us/reader034/viewer/2022043003/5f842687b8804156cf4ea04e/html5/thumbnails/42.jpg)
Thank You!