Graphics Card Computing for Materials Modelling
Case study: Analytic Bond Order Potentials
B. Seiser, T. Hammerschmidt, R. Drautz, D. Pettifor
Funded by EPSRC within the collaborative multi-scale project
“Alloys By Design: Nickel-base superalloys”
Alloys by Design
CUDA Developer Conference Graphics Card Computing for Materials Modelling Bernhard Seiser
Materials for gas turbine blades:
Titanium, Nickel, Steel, Aluminium
Requirements: creep resistant, stable, coatable, castable
[Figure: micrographs with scale bars 0.5 μm, 2.5 μm, 25 μm and 25 cm; panels: precipitation of detrimental phases, freckling instabilities, reaction with coatings, dislocation creep]
Ni-based superalloys:
• Cr, Co, Mo, W, Al, Ti, Ta, Re, Ru, Hf, C, B (<10 wt%)
• alloy design is still empirical rather than theoretical
• expensive, time-consuming, non-optimized alloys
Need multi-scale modelling for alloy design
Challenge:
Materials Modelling with GPUs
Hierarchy in Materials Modelling
Molecular dynamics GPU codes
• http://www.nvidia.com/object/molecular_dynamics.html
• AceMD (the biomolecular MD package used by GPUGRID)
• Ascalaph (molecular modelling suite)
• HOOMD (Highly Optimized Object Oriented Molecular Dynamics)
• VMD & NAMD (Visual Molecular Dynamics)
Density functional theory codes
• TeraChem (GTO, J. Chem. Theory Comput., 2008, 4 (2), pp 222–231), single precision: 26-96x speed-up
• BIGDFT (WL, see Journal of Chemical Physics 131, 034103, 2009)
Dwarfs (linear algebra, FFT) are essential for most electronic structure calculation methods
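Tight-binding method
Total energy:
\[ E = E_{\mathrm{rep}} + E_{\mathrm{bond}} \]
Repulsive energy: summation of pair-wise interactions
Bond energy:
\[ E_{\mathrm{bond}} = \int^{E_F} n(E)\, E \, dE \]
n(E) … density of states
Eigenvalue problem: Hv = Ev (Lapack, Scalapack, Jacket)
Bond integral:
\[ H_{ij} = \langle i | H | j \rangle = R^T \begin{pmatrix} pp\sigma(r_{ij}) & 0 & 0 \\ 0 & pp\pi(r_{ij}) & 0 \\ 0 & 0 & pp\pi(r_{ij}) \end{pmatrix} R \]
Hamiltonian for atoms i, j, k, l (matrices dimension depending on number of orbitals):
\[ H = \begin{pmatrix} H_{ii} & H_{ij} & H_{ik} & 0 \\ H_{ji} & H_{jj} & 0 & H_{jl} \\ H_{ki} & 0 & H_{kk} & H_{kl} \\ 0 & H_{lj} & H_{lk} & H_{ll} \end{pmatrix} \]
[Figure: atoms i, j, k, l connected by bonds H_ij, H_ik, H_jl, H_kl; density of states n(E) versus E with E_F; periodic crystal]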
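The bond integral block for p orbitals can be illustrated with a small sketch. It assumes the rotation R maps the bond direction onto the local x axis, in which case R^T diag(ppσ, ppπ, ppπ) R reduces to ppπ δ_ab + (ppσ − ppπ) d_a d_b for the unit bond vector d. The function name `pp_bond_block` and the numerical values of ppσ and ppπ are hypothetical; the slides only state that both depend on r_ij.

```python
import math

def pp_bond_block(r_ij, pp_sigma, pp_pi):
    """3x3 p-p bond integral block H_ij for the bond vector r_ij.

    Equivalent to R^T diag(pp_sigma, pp_pi, pp_pi) R when R rotates the
    bond direction onto the x axis:
        H_ab = pp_pi * delta_ab + (pp_sigma - pp_pi) * d_a * d_b
    """
    length = math.sqrt(sum(c * c for c in r_ij))
    d = [c / length for c in r_ij]  # unit vector along the bond
    return [[(pp_pi if a == b else 0.0) + (pp_sigma - pp_pi) * d[a] * d[b]
             for b in range(3)]
            for a in range(3)]

# Hypothetical bond integrals (fixed numbers instead of functions of distance)
block_x = pp_bond_block((2.0, 0.0, 0.0), pp_sigma=1.0, pp_pi=-0.25)
block_diag = pp_bond_block((1.0, 1.0, 0.0), pp_sigma=1.0, pp_pi=-0.25)
```

For a bond along x the block is already diagonal, diag(ppσ, ppπ, ppπ); for any other direction the trace ppσ + 2 ppπ is preserved because R is a rotation.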
Hv = Ev
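Analytic Bond Order potentials
Bond order potential (BOP) bond energy: Drautz and Pettifor (2006)
[Figure: curves for n = 3, n = 4 and n = 5 plotted against Ef; horizontal axis 0 to 10, vertical axis -0.2 to 0.2]
Moments of density of states, where \mu_n is the nth moment:
\[ \mu_n = \int E^n \, n(E) \, dE \]
\mu_0 = 1
\mu_1 = centre of gravity
\mu_2 = RMS width
\mu_3 = skewness
\mu_4 = bimodality
Moment theorem: Cyrot-Lackmann (1967)
\[ \mu_n^{(i)} = \langle i | H^n | i \rangle = \sum_{j_1 \dots j_{n-1}} H_{i j_1} H_{j_1 j_2} \cdots H_{j_{n-1} i} \]
H_{ij}: bond integral between atom i and j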
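As a hedged illustration of the moment theorem, the sketch below evaluates the local moments of one atom as diagonal elements of powers of H. The four-atom cluster (atoms i, j, k, l with bonds i-j, i-k, j-l, k-l), one orbital per atom, zero on-site energies and the hopping value t = -1.0 are assumptions chosen for the example, and `local_moments` is a hypothetical helper, not part of BOPfox.

```python
def matmul(a, b):
    """Dense product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def local_moments(h, site, n_max):
    """Return [mu_0, ..., mu_n_max] with mu_n = (H^n)[site][site]."""
    size = len(h)
    power = [[1.0 if i == j else 0.0 for j in range(size)]
             for i in range(size)]          # H^0 = identity
    moments = []
    for _ in range(n_max + 1):
        moments.append(power[site][site])
        power = matmul(power, h)            # next power of H
    return moments

t = -1.0  # hypothetical bond integral, same for every bond
# Row/column order: i, j, k, l
H = [[0, t, t, 0],
     [t, 0, 0, t],
     [t, 0, 0, t],
     [0, t, t, 0]]

mu = local_moments(H, site=0, n_max=4)
```

For this cluster mu is [1, 0, 2, 0, 8]: two closed paths of length 2 and eight of length 4 start and end on atom i, while the odd moments vanish because no closed path of odd length exists.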
BOPfox
BOPfox tool (Fortran 90): Tight-binding, EAM, BOP -> Molecular dynamics, kMC
Benchmark for fcc with 864 W atoms, 12 moments

Step                   Time [s]   Share [%]
initialization           0.24       1.38
neighbour lists          1.11       6.41
bond matrix              0.22       1.25
evaluate moments        14.7       84.74
evaluate aInf,bInf       0.7        4.01
forces                   0.28       1.59
EAM                      0.02       0.14
Fermi level search       0.07       0.39
self-consistency         0.02       0.09
total                   17.45

Evaluate moments: 65 % matrix multiplications
→ rest is spent on path finding
Interference paths
Calculation of interference paths: Length (n) = 2
\[ (H^2)_{li} = (H)_{lj}(H)_{ji} + (H)_{lk}(H)_{ki} \]
2nd moment of atom i = sum of paths (n=2) that start and end on atom i
4th moment of atom i = sum of paths (n=4) that start and end on atom i
\[ (H^4)_{ii} = \sum_{l \in EP} (H^2)_{li}^T \, (H^2)_{li} \]
EP: set of end points
[Figure: atoms i, j, k, l; paths from i to l via j and via k]
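A minimal sketch of the length-2 step with matrix-valued bond integrals, so that the transpose in the 4th-moment sum is visible. The neighbour dictionary `bonds`, the 2x2 blocks B and C and all helper names are hypothetical; `bonds[a][b]` is taken to hold the block H_ba for the hop from atom a to its neighbour b, and the reverse hop is its transpose.

```python
def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def paths_length_2(bonds, i):
    """Return {l: (H^2)_li}; the keys are the end points EP reachable in 2 hops."""
    paths = {}
    for j, h_ji in bonds[i].items():        # first hop  i -> j
        for l, h_lj in bonds[j].items():    # second hop j -> l
            term = matmul(h_lj, h_ji)
            paths[l] = add(paths[l], term) if l in paths else term
    return paths

# Hypothetical 2x2 bond integral blocks (two orbitals per atom)
B = [[1, 1], [0, 1]]   # H_ji and H_ki
C = [[1, 0], [0, 2]]   # H_lj and H_lk
bonds = {
    "i": {"j": B, "k": B},
    "j": {"i": transpose(B), "l": C},
    "k": {"i": transpose(B), "l": C},
    "l": {"j": transpose(C), "k": transpose(C)},
}

xi2 = paths_length_2(bonds, "i")
mu2_block = xi2["i"]                        # paths that return to atom i
mu4_block = [[0, 0], [0, 0]]
for end_point, xi in xi2.items():           # sum over the end points EP
    mu4_block = add(mu4_block, matmul(transpose(xi), xi))
```

Only paths of half the target length are stored: the 4th moment follows from products of length-2 paths over the end points, here {i, l}.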
Interference paths
Calculation of interference paths: Length = 3
\[ (H^3)_{li} = (H)_{lj}(H^2)_{ji} + (H)_{lk}(H^2)_{ki} + \dots \]
[Figure: atoms i, j, k]
[Figure: four panels of density of states (0 to 20) versus energy (-20 to 20)]
Matrix multiplications
[Figure: number of matrix multiplications per atom (0 to 7x10^4) versus number of moments (1 to 15); accuracy scale running from EAM/PP to TB (∞ moments)]
Number of matrix multiplications scales linearly with number of atoms!
BOPfox goes GPU
hostToGpu_UploadAtomicPositions();
hostToGpu_UploadNeighbourList();
gpu_GetTodoList();             // get list of matrix calculations
gpu_CalculateBondIntegrals();  // r_ik -> H_ik
for (i = 2; i <= nInterferencemax; i++) {
    gpu_MatrixMultiplication();
    gpu_MatrixAddition();
    gpu_MomentCalculation();
    gpuToHost_Moments();
}
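The loop structure above can be mirrored by a small CPU reference, useful for checking GPU output on tiny clusters. This is a sketch under stated assumptions, not the BOPC implementation: one orbital per atom, zero on-site energies, a hypothetical neighbour dictionary `bonds`, and the guess that each iteration extends all paths by one bond (multiplication), sums contributions per end point (addition) and then forms two further moments. Unlike the slide's loop, which starts at i = 2, the sketch starts at length 1 so that it is self-contained.

```python
def moments_by_path_recursion(bonds, atom, n_max):
    """Moments mu_0 .. mu_(2*n_max) of `atom` from paths up to length n_max."""
    prev = {atom: 1.0}            # paths of length 0
    moments = [1.0]               # mu_0
    for _ in range(1, n_max + 1):
        curr = {}
        for m, xi in prev.items():                 # extend each path by one bond
            for l, h_lm in bonds[m].items():
                curr[l] = curr.get(l, 0.0) + h_lm * xi   # multiply, then add
        # odd moment: overlap of paths of length n and n-1 on shared end points
        moments.append(sum(v * prev.get(l, 0.0) for l, v in curr.items()))
        # even moment: overlap of paths of length n with themselves
        moments.append(sum(v * v for v in curr.values()))
        prev = curr
    return moments

t = -1.0  # hypothetical bond integral
bonds = {
    "i": {"j": t, "k": t},
    "j": {"i": t, "l": t},
    "k": {"i": t, "l": t},
    "l": {"j": t, "k": t},
}
mu = moments_by_path_recursion(bonds, "i", n_max=3)
```

Each pass touches only the current end points and their neighbours, so the work per atom depends on the local environment rather than on the size of the whole system.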
BOPfox and BOPC
Benchmark of BOPfox vs BOPC

Task                      BOPfox (CPU) [ms]   BOPC (GPU) [ms]   Factor (speed-up)
Calculation of matrices          264                 12               ~22
Path finding                    5412                123               ~44
Matrix multiplication           9237                497               ~19
→ 24 x overall speed up

BOPfox (CPU)
Hardware: Intel Core 2 Duo E6550, 1 core @ 2.33 GHz, 4 GB memory
Compiler options: Gfortran 4.2.1, release mode (-O3)
BOPC (GPU)
Hardware: NVIDIA GeForce GTX 260, 27 multiprocessors → 216 cores (integer) @ 1.5 GHz
Compiler options: nvcc, release mode (-O3), CUDA 2.0
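The overall factor can be cross-checked from the per-task timings in the benchmark, assuming the three listed tasks make up the whole compared run. The dictionary below only restates the numbers from the slide; the variable names are illustrative.

```python
# (CPU time, GPU time) in milliseconds, as listed in the benchmark
timings_ms = {
    "Calculation of matrices": (264, 12),
    "Path finding": (5412, 123),
    "Matrix multiplication": (9237, 497),
}

# Speed-up of each task and of the sum of all tasks
per_task = {task: cpu / gpu for task, (cpu, gpu) in timings_ms.items()}
cpu_total = sum(cpu for cpu, _ in timings_ms.values())
gpu_total = sum(gpu for _, gpu in timings_ms.values())
overall = cpu_total / gpu_total

for task, factor in per_task.items():
    print(f"{task}: ~{factor:.0f}x")
print(f"overall: {cpu_total} ms / {gpu_total} ms = ~{overall:.0f}x")
```

The totals are 14913 ms on the CPU and 632 ms on the GPU, which gives a ratio of about 23.6 and matches the quoted 24x after rounding. The overall factor sits close to the matrix multiplication factor because that task dominates the GPU time.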
Conclusions
• Materials modelling can benefit significantly from GPU parallelization
• Linear algebra and FFT are essential for most electronic structure calculation methods
• Models like analytic bond order potentials try to avoid expensive LA/FFT routines
→ significant speed up possible