A Computational Database for Turbulent Flow
Scott B. Baden and Alden King
Computer Science and Engineering Dept., University of California, San Diego
Sutanu Sarkar, Eric Arobone, Hieu Pham
Mechanical and Aerospace Engineering Dept., University of California, San Diego
10/8/09 Computational Database SIAM MI09 / S. B. Baden 10/8/09 2
Motivation
• Understanding fluid flows is a complex, labor-intensive, iterative process
• Developing models to describe observed behavior hidden within the data requires experimentation
• Database concepts are useful
• Dataset: boxes of numbers → Database: time-dependent sets of features over a sparse domain
• Traditional relational DB solution not appropriate
  – How to optimize across user-defined functions
  – How to avoid disrupting locality
A “Computational Database”
• Traditional relational DB solution not appropriate
  – How to optimize across user-defined functions?
  – How to avoid disrupting locality?
  – Irregular data
• Stonebraker’s “One size fits all: An idea whose time has come and gone”
• Computational Database (NSF CDI project)
  – Pose queries
  – Cross-optimize with computationally intensive kernels
Impact of Database Technology on CFD
• Database technology has made only a modest impact
  – JHU Public Turbulence database: turbulence.pha.jhu.edu
  – iCFD Database: cfd.cineca.it
  – www.cfd-online.com/Wiki/Turbulence_DNS_database
  – Vortonics [Boghosian, Tufts]
• Sloan Digital Sky Survey (SDSS): most successful scientific database to date
• Spatial databases have had the most impact in GIS
• Imaging also popular
  – QBIC [IBM]
  – Virtual Microscope [Kurc, Saltz, Sussman]
Motivating Application
• Spatio-temporal transport and mixing dynamics in simulated geophysical flows (DNS of horizontal shear)
• Goal: identify turbulent structures automatically
[Figure: Horizontal spanwise vorticity (mag) using DNS, Basak & Sarkar, JFM 2006]
Why is this a hard problem?
• Datasets are large: 80 terabytes per simulation
  – 1024 Cray XT-4 processors for 48 hours
  – 2 × 10^9 unknowns, 10K time steps [4096×1024×512]
  – 1000 snapshots @ 80 × 10^9 bytes
• We are interested in a sparse, irregular subset of the data
• If we knew what we were looking for, the problem would be a lot easier!
• For example, there is no agreed-upon definition of a vortex core
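The sizes quoted above can be checked with a few lines of arithmetic. The sketch below only restates the numbers on the slide; the variable names are illustrative and terabytes are taken as decimal (10^12 bytes).

```python
# Back-of-the-envelope check of the figures quoted on the slide.
nx, ny, nz = 4096, 1024, 512           # grid dimensions
unknowns = nx * ny * nz                # about 2 x 10^9

snapshots = 1000                       # snapshots saved per simulation
bytes_per_snapshot = 80 * 10**9        # 80 x 10^9 bytes each
total_bytes = snapshots * bytes_per_snapshot

processors, hours = 1024, 48
processor_hours = processors * hours

print(unknowns)                        # 2147483648
print(total_bytes / 10**12)            # 80.0 (terabytes)
print(processor_hours)                 # 49152
```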
Han Suk Kim
The traditional approach to data discovery
• Analyze the dataset off-line
• Post-process the data by computing derived quantities, e.g. dissipation
• Employ subjective visual analysis to guide the search
  – Find something that appears “interesting”
  – Apply computations to the interesting data
  – How do we make a connection between what the eye sees and what the computer carries out?
What is wrong with the approach?
• This approach is not scalable
  – Bulky data sets
  – Long computation times
• Without a precise definition of what is “interesting,” we can miss important information
• Systematic analysis of feature populations is impractical because it must be done by hand, one feature at a time
How do we extract the information?
• User function identifies features
• Run-time support to apply identification systematically
• Kerney [’02], Diamessis, Kerney, Baden, Nomura [’02]
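The division of labor described above (the user supplies an identification function, the run time applies it everywhere) might look like the following minimal sketch. The names `identify`, `Field`, and `is_feature` are hypothetical and are not taken from the cited work; a snapshot is modeled as a plain dictionary from grid point to scalar value.

```python
from typing import Callable, Dict, List, Set, Tuple

Point = Tuple[int, int, int]
Field = Dict[Point, float]   # hypothetical stand-in for one scalar snapshot

def identify(snapshots: List[Field],
             is_feature: Callable[[float], bool]) -> List[Set[Point]]:
    """Apply a user-supplied predicate to every point of every snapshot.

    Returns one sparse set of selected points per time step.
    """
    return [{p for p, v in snap.items() if is_feature(v)} for snap in snapshots]

# Two tiny snapshots of a made-up scalar field.
snapshots = [
    {(0, 0, 0): 0.1, (1, 0, 0): 2.5, (2, 0, 0): 3.0},
    {(0, 0, 0): 0.2, (1, 0, 0): 0.4, (2, 0, 0): 4.1},
]

# The user function: here, a simple threshold chosen for illustration.
features = identify(snapshots, lambda v: v > 1.0)
for t, pts in enumerate(features):
    print(t, sorted(pts))
```

The result is a time-dependent sequence of sparse point sets, which matches the "sets of features over a sparse domain" view of the data.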
What are our requirements?
• Efficient space-time queries
• Sparse domains
• Conditioned operations
• Conserve data locality
• User-defined query and analysis functions
Use case
• Identify vortex cores using the “Δ criterion”: Δ > 1.0 × 10^-9
• Compute quantities conditioned on Δ: F(x; condition)
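One possible reading of F(x; condition) is an ordinary reduction that only visits points where the condition holds. The sketch below is an assumption about the semantics, not the project's interface: `conditioned_mean` is a hypothetical helper, and the sample values are invented.

```python
from typing import Callable, Sequence

def conditioned_mean(values: Sequence[float],
                     delta: Sequence[float],
                     condition: Callable[[float], bool]) -> float:
    """Mean of `values` over the points whose `delta` satisfies `condition`."""
    selected = [v for v, d in zip(values, delta) if condition(d)]
    if not selected:
        raise ValueError("condition selects no points")
    return sum(selected) / len(selected)

THRESHOLD = 1.0e-9                     # threshold on delta from the use case

# Invented sample data: a quantity x and the delta value at the same points.
x = [0.5, -0.25, 1.0, 0.75]
delta = [2.0e-9, 1.0e-12, 5.0e-9, 0.0]

result = conditioned_mean(x, delta, lambda d: d > THRESHOLD)
print(result)                          # 0.75
```

Only the first and third points pass the threshold, so the conditioned mean is (0.5 + 1.0) / 2.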
Delta criterion
• Weighted sum of Q and the determinant of the gradient of the velocity field > ε
  – W is the anti-symmetric component of the velocity gradient tensor
  – S is the symmetric component
  – Relative strengths of rotation and strain
• Delta is a more refined version of the Q criterion, which has been used in ocean science

$$Q = \frac{1}{2}\left( \lVert W \rVert^{2} - \lVert S \rVert^{2} \right)$$

$$\Delta = \left( \frac{Q}{3} \right)^{3} + \left( \frac{\det \nabla v}{2} \right)^{2}$$
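As a worked example of the Δ criterion above, the sketch below evaluates Q and Δ for a single 3×3 velocity gradient tensor. It assumes the norms of W and S are Frobenius norms, which the slide does not state; the function names and the two test tensors are illustrative only.

```python
from typing import List, Tuple

Matrix = List[List[float]]

def frobenius_sq(a: Matrix) -> float:
    """Squared Frobenius norm (assumed meaning of the norm in Q)."""
    return sum(x * x for row in a for x in row)

def det3(a: Matrix) -> float:
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def q_and_delta(grad_v: Matrix) -> Tuple[float, float]:
    """Return (Q, Delta) for one velocity gradient tensor."""
    # Symmetric (strain) and anti-symmetric (rotation) parts of grad v.
    s = [[0.5 * (grad_v[i][j] + grad_v[j][i]) for j in range(3)] for i in range(3)]
    w = [[0.5 * (grad_v[i][j] - grad_v[j][i]) for j in range(3)] for i in range(3)]
    q = 0.5 * (frobenius_sq(w) - frobenius_sq(s))
    delta = (q / 3.0) ** 3 + (det3(grad_v) / 2.0) ** 2
    return q, delta

# Solid-body rotation about z: rotation dominates, so Delta is positive.
rotation = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
# Pure planar strain: strain dominates, so Delta is negative.
strain = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 0.0]]

q_rot, delta_rot = q_and_delta(rotation)
q_str, delta_str = q_and_delta(strain)
print(q_rot, delta_rot > 1.0e-9)       # 1.0 True
print(q_str, delta_str > 1.0e-9)       # -1.0 False
```

For the rotation case Q = 1 and det ∇v = 0, so Δ = (1/3)^3 = 1/27, well above the 10^-9 threshold; the strain case gives Δ = -1/27 and is rejected.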
Working with sparse representations
• Controlled operations: F(x; condition)
• Geometric abstractions (e.g. G ∩ H): KeLP, Chombo, Titanium, Fidil
• TeraLab, Faisal Mir, KTH Sweden (2006)
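The region intersection G ∩ H can be illustrated with axis-aligned index boxes. The `Box` class below is a hypothetical minimal sketch written for this example; it is not the API of KeLP, Chombo, Titanium, or Fidil.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Box:
    """Axis-aligned box of grid indices with inclusive corners."""
    lo: Tuple[int, int, int]
    hi: Tuple[int, int, int]

    def intersect(self, other: "Box") -> Optional["Box"]:
        """Return the overlap of two boxes, or None if they are disjoint."""
        lo = tuple(max(a, b) for a, b in zip(self.lo, other.lo))
        hi = tuple(min(a, b) for a, b in zip(self.hi, other.hi))
        if any(l > h for l, h in zip(lo, hi)):
            return None
        return Box(lo, hi)

    def size(self) -> int:
        """Number of grid points contained in the box."""
        n = 1
        for l, h in zip(self.lo, self.hi):
            n *= h - l + 1
        return n

G = Box((0, 0, 0), (7, 7, 7))
H = Box((4, 4, 4), (11, 11, 11))
GH = G.intersect(H)
print(GH)          # Box(lo=(4, 4, 4), hi=(7, 7, 7))
print(GH.size())   # 64
```

A sparse feature can then be covered by a list of such boxes, and a conditioned operation only needs to visit the points inside them.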
Preliminary results
• 641 × 385 × 193 × 5 floats
• Time to compute delta: 17.6 sec
• Card(delta > 10^-9): 6.2%
• Δ_h^7 (7-point stencil): 0.621 s
• u'v' correlations: 1.531 s
• delta-thresholded u'v' correlations: 1.387 s
• Fortran:
  – u'v' correlations: 2.660 s
  – delta-thresholded u'v' correlations: 1.735 s
Preliminary results
• 641 × 385 × 193 × 5 floats
• Time to compute delta: 17.6 sec
• Card(delta > 10^-9): 6.2%

Kernel                      Time (sec)   Conditional time (sec)
Stencil-7 (100% and 0%)     0.621        0.210 (0.733, 0.143)
Eddy viscosity <u’v’>       1.56         1.40
Turbulent kinetic energy    2.30         1.76
Dissipation                 5.86         2.58
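The ratio of unconditional to conditional time for each kernel follows directly from the table above. The sketch below just divides the two columns; the dictionary layout is illustrative and the numbers are copied from the slide.

```python
# (time, conditional time) in seconds, copied from the table.
timings = {
    "Stencil-7": (0.621, 0.210),
    "Eddy viscosity <u'v'>": (1.56, 1.40),
    "Turbulent kinetic energy": (2.30, 1.76),
    "Dissipation": (5.86, 2.58),
}

ratios = {kernel: full / cond for kernel, (full, cond) in timings.items()}
for kernel, ratio in ratios.items():
    print(f"{kernel}: {ratio:.2f}x")
# Stencil-7: 2.96x
# Eddy viscosity <u'v'>: 1.11x
# Turbulent kinetic energy: 1.31x
# Dissipation: 2.27x
```

With 6.2% of the points selected, a kernel whose cost scaled purely with the number of selected points would run about 16 times faster; the measured ratios are well below that, which would be consistent with overheads outside the selected points, though the slide does not break the times down.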
Work in progress: feature tracking
• Continuation, creation, dissipation, bifurcation, amalgamation [Silver and Wang 1997]
• Custom spatial join [w/ Diamessis et al. ’02]
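The five tracking events listed above can be sketched as a join between the features of two consecutive snapshots. The version below uses a plain overlap test on point sets as the join condition, which is a simplifying assumption (it presumes features move less than their own extent between snapshots) and is not claimed to be the method of the cited papers; all names and sample features are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Point = Tuple[int, int, int]
Features = Dict[str, Set[Point]]       # feature label -> grid points it covers

def track(prev: Features, curr: Features) -> List[Tuple[str, ...]]:
    """Classify events between two snapshots by spatial overlap."""
    # Spatial join: which current features does each previous feature touch,
    # and which previous features does each current feature touch?
    succ = {a: [b for b, pb in curr.items() if pa & pb] for a, pa in prev.items()}
    pred = {b: [a for a, pa in prev.items() if pa & pb] for b, pb in curr.items()}

    events: List[Tuple[str, ...]] = []
    for a, bs in succ.items():
        if not bs:
            events.append(("dissipation", a))
        elif len(bs) > 1:
            events.append(("bifurcation", a, *bs))
    for b, sources in pred.items():
        if not sources:
            events.append(("creation", b))
        elif len(sources) > 1:
            events.append(("amalgamation", *sources, b))
        elif len(succ[sources[0]]) == 1:
            events.append(("continuation", sources[0], b))
    return events

prev = {
    "A": {(0, 0, 0), (1, 0, 0)},
    "B": {(5, 0, 0), (6, 0, 0)},
    "C": {(9, 0, 0)},
}
curr = {
    "A1": {(1, 0, 0), (2, 0, 0)},      # overlaps A only
    "B1": {(5, 0, 0)},                 # B splits into B1 and B2
    "B2": {(6, 0, 0)},
    "D": {(20, 0, 0)},                 # overlaps nothing: newly created
}

events = track(prev, curr)
for event in events:
    print(event)
```

In this example A continues as A1, B bifurcates into B1 and B2, C dissipates, and D is created.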
Acknowledgements and support
• United States National Science Foundation: OCE 0835839, ACI9619020
• University of California, San Diego
• San Diego Supercomputer Center (SDSC)
• KTH (Sabbatical in 2004-5)
• For more information: http://www-cse.ucsd.edu/groups/hpcl/scg
ABSTRACT
We describe early experiences with a computational database for cyber-enabled discovery of spatio-temporal transport and mixing dynamics in simulated geophysical flows.
Our database combines the machinery of spatial data retrieval with powerful computational facilities tailored to dynamic representations of irregular time-dependent structures.
Our database overcomes the limitations of the relational model by supporting an imperative style of query that conserves spatial locality in numerical algorithms.
It is application-neutral and could help transform the discovery process in any field involving complex, time-dependent, multi-scale physical phenomena that admit mathematical rules to identify features of interest.