A Multiresolution Volume Rendering Framework for Large-Scale Time-Varying Data Visualization
Chaoli Wang¹, Jinzhu Gao², Liya Li¹, Han-Wei Shen¹
¹The Ohio State University
²Oak Ridge National Laboratory
Introduction
• Large-scale numerical simulation
  – Richtmyer-Meshkov Instability (RMI) data @ LLNL
    • 2,048 * 2,048 * 1,920 grid
    • 960 (8 * 8 * 15) nodes of the IBM-SP system
    • 7.5 GB per time step, output 274 time steps
• Goal
  – Data exploration
  – Quick overview, detail on demand
• Approach
  – Multiresolution data representation
  – Error-controlled parallel rendering
Challenge
• Compact hierarchical data representation
• Allow specifying different spatial and temporal resolutions for rendering
• Long chains of parent-child node dependency
• Data dependency among processors
• Balance the workload for parallel rendering
Algorithm Overview
[Figure: The algorithm flow for large-scale time-varying data visualization. Pre-processing: WTSP tree construction (error metric calculation, reconstructed data storage), then WTSP tree partition, then data distribution, producing distributed data. Run-time rendering, repeated for each next frame: WTSP tree traversal, then data block reconstruction, then parallel volume rendering, producing the image.]
Wavelet-Based Time Space Partitioning Tree
• The WTSP tree
  – Space-time hierarchical data structure to organize time-varying data
  – An octree (spatial hierarchy) of binary trees (temporal hierarchy)
  – Originates from the TSP tree [Shen et al. 1999]
  – Borrows the idea of the wavelet tree [Guthe et al. 2002]
[Figure: an octree in which each node holds a binary time tree. The time tree root covers time steps [0,3], its two children cover [0,1] and [2,3], and the leaves are time steps 0, 1, 2, and 3.]
Wavelet-Based Time Space Partitioning Tree
• WTSP tree construction
  – Two-stage block-wise wavelet transform and compression process
  – Build a spatial hierarchy in the form of an octree for each time step
  – Merge the same octree nodes across time into binary time trees
[Figure: (a) 3D wavelet transforms on spatial domain. (b) 1D wavelet transforms on temporal domain: the coefficient blocks C0, C1, C2, C3 along the time axis t are transformed into A[0,3], D[0,3], D[0,1], and D[2,3]. Legend: octree node; time tree node; low-pass filtered subblock; wavelet coefficients after 3D wavelet transforms; wavelet coefficients after 1D wavelet transforms.]
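The temporal stage of the construction can be sketched as a 1D Haar transform across the blocks of one octree node. This is a minimal illustration, not the authors' code: it assumes an unnormalized Haar filter (pairwise average and half-difference), a power-of-two number of time steps, and blocks given as flat Python lists. The names `haar_time_tree` and `reconstruct` are hypothetical.

```python
def haar_time_tree(blocks):
    """Build the temporal hierarchy for one octree node.

    blocks: list of 2^n equally sized blocks, one per time step.
    Returns (average, details), where average plays the role of A[0,T-1]
    and details maps a time interval (lo, hi) to its D[lo,hi] coefficients.
    Assumption: unnormalized Haar (average and half-difference).
    """
    details = {}
    level = [((t, t), list(b)) for t, b in enumerate(blocks)]
    while len(level) > 1:
        merged = []
        for i in range(0, len(level), 2):
            (lo, _), left = level[i]
            (_, hi), right = level[i + 1]
            avg = [(a + b) / 2 for a, b in zip(left, right)]
            dif = [(a - b) / 2 for a, b in zip(left, right)]
            details[(lo, hi)] = dif
            merged.append(((lo, hi), avg))
        level = merged
    return level[0][1], details


def reconstruct(average, details, t, lo, hi):
    """Recover the block at time step t by walking down the time tree."""
    cur = average
    while lo < hi:
        mid = (lo + hi) // 2
        d = details[(lo, hi)]
        if t <= mid:                      # left child = average + detail
            cur = [a + x for a, x in zip(cur, d)]
            hi = mid
        else:                             # right child = average - detail
            cur = [a - x for a, x in zip(cur, d)]
            lo = mid + 1
    return cur
```

For four time steps the `details` dictionary has exactly the keys (0, 3), (0, 1), and (2, 3), matching the D[0,3], D[0,1], D[2,3] labels of panel (b).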
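Hierarchical Spatial and Temporal Error Metric
• Based on MSE calculation
• Compare the error of each block with its children
• Spatial error of an octree node $T$ with children $T_0, \dots, T_7$: $se(T) = \sum_{i=0}^{7} MSE(T, T_i) + \max\{ se(T_i) \mid i = 0..7 \}$
• Temporal error of a time tree node $T$ with children $T_l$ and $T_r$: $te(T) = MSE(T, T_l) + MSE(T, T_r) + \max\{ te(T_l), te(T_r) \}$
[Figure: an octree node T with its children (T1, T3, and T6 labeled), and a time tree node T with its left and right children Tl and Tr.]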
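Both error formulas share one recursive shape: the sum of the MSE between a node and each child, plus the largest error among the children. The sketch below illustrates that recursion under stated assumptions: leaves have error 0, and the parent and child data are already at a comparable resolution so that a plain element-wise MSE applies. The `Node` class and `hierarchical_error` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    data: list                                   # block samples (flat list)
    children: list = field(default_factory=list)  # 8 for se, 2 for te


def mse(a, b):
    """Mean squared error between two equally sized blocks."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def hierarchical_error(node, mse_fn=mse):
    """Sum of MSE(node, child) over children, plus the max child error.

    With 8 children this follows the se(T) formula, with 2 children the
    te(T) formula. Assumption: a leaf has error 0.
    """
    if not node.children:
        return 0.0
    local = sum(mse_fn(node.data, c.data) for c in node.children)
    return local + max(hierarchical_error(c, mse_fn) for c in node.children)
```

A spatial version would pass an `mse_fn` that first brings the low-resolution parent block and the child block to the same grid; that step is omitted here.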
Storing Reconstructed Data for Space-Time Tradeoff
• Alleviate data dependency
• EVERY-K scheme
[Figure: a WTSP tree with $h_o = 6$, $h_t = 4$ and $k_o = 2$, $k_t = 2$. Legend: octree node stores low resolution data; octree node stores 3D wavelet coefficients; time tree node stores 1D reconstruction results; time tree node stores 1D wavelet coefficients.]
WTSP Tree Partition and Data Distribution
• Eliminate dependency among processors
• Distribution units
[Figure: the same WTSP tree ($h_o = 6$, $h_t = 4$, $k_o = 2$, $k_t = 2$) partitioned into octree node groups and time tree node groups, which form the distribution units. Same legend as the previous figure.]
WTSP Tree Partition and Data Distribution
• Space-filling curve traversal
  – Neighboring blocks of similar spatial-temporal resolution should be evenly distributed to different processors
  – Space-filling curve preserves locality, always visits neighboring blocks first
  – Traverse the volume to create a one-dimensional ordering of the blocks
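The slides do not say which space-filling curve is used. As a stand-in, the sketch below orders blocks along a Z-order (Morton) curve, which interleaves the bits of the block coordinates to produce a one-dimensional ordering that keeps nearby blocks close together. The functions `morton3` and `curve_order` are hypothetical names, and the choice of Z-order is an assumption.

```python
def morton3(x, y, z, bits=10):
    """Interleave the low `bits` bits of x, y, z into one Morton code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


def curve_order(nx, ny, nz):
    """Return all block coordinates of an nx*ny*nz grid in Z-order."""
    blocks = [(x, y, z)
              for z in range(nz) for y in range(ny) for x in range(nx)]
    return sorted(blocks, key=lambda b: morton3(*b))
```

Calling `curve_order(2, 2, 2)` visits the eight blocks of one octree cell before moving on, which is the locality property the traversal relies on.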
WTSP Tree Partition and Data Distribution
• Error-guided bucketization
  – Data blocks with similar spatial and temporal errors should be distributed to different processors
  – Create buckets with different spatial-temporal error intervals
[Figure: the spatial error range and the temporal error range, each divided into error intervals that define the buckets.]
WTSP Tree Partition and Data Distribution
• Error-guided bucketization
  – Bucketize the distribution units when performing hierarchical space-filling curve traversals
  – Distribute units in each bucket in a round-robin fashion
[Figure: distribution units numbered in traversal order, grouped into three buckets, and marked by the processor (P0 to P3) each is assigned to. Bucket 0: 0, 1, 3, 5, 6, 8, 14. Bucket 1: 2, 4, 7, 11, 12, 17, 18. Bucket 2: 9, 10, 13, 15, 16, 19, 20.]
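The two bucketization steps can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: each unit is reduced to a single error value (the slides use spatial-temporal error intervals), buckets are defined by a sorted list of interval edges, and the round-robin counter keeps running from one bucket to the next (the slides do not say whether it restarts per bucket). `bucketize` and `distribute` are hypothetical names.

```python
from bisect import bisect_right


def bucketize(units, error_of, edges):
    """Place units (already in space-filling-curve order) into buckets.

    edges: sorted interval boundaries; len(edges) + 1 buckets result.
    """
    buckets = [[] for _ in range(len(edges) + 1)]
    for u in units:
        buckets[bisect_right(edges, error_of[u])].append(u)
    return buckets


def distribute(buckets, num_procs):
    """Hand out units bucket by bucket in round-robin order.

    Assumption: the processor counter continues across buckets.
    """
    assignment = {p: [] for p in range(num_procs)}
    k = 0
    for units in buckets:
        for u in units:
            assignment[k % num_procs].append(u)
            k += 1
    return assignment


# The unit numbers listed for the three buckets in the slide figure.
slide_buckets = [
    [0, 1, 3, 5, 6, 8, 14],
    [2, 4, 7, 11, 12, 17, 18],
    [9, 10, 13, 15, 16, 19, 20],
]
result = distribute(slide_buckets, 4)
```

With the running counter, the 21 units of the slide example split 6, 5, 5, 5 across the four processors, and each processor receives units from every bucket.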
Run-Time Rendering
• WTSP tree traversal
  – User specifies time step and tolerances of both spatial and temporal errors
  – Traverse octree skeleton and the binary time trees for each encountered octree node
  – A sequence of data blocks is identified in back-to-front order for rendering
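The traversal step can be sketched as a nested descent: walk down the octree until a node's spatial error is within tolerance, then walk down that node's time tree toward the requested time step until the temporal error is within tolerance. The sketch assumes each node already carries its precomputed `se` or `te` value and that octree children are stored in back-to-front order for the current view. All class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TimeNode:
    lo: int                       # first time step covered
    hi: int                       # last time step covered
    te: float                     # precomputed temporal error
    left: "TimeNode | None" = None
    right: "TimeNode | None" = None


@dataclass
class OctreeNode:
    name: str
    se: float                     # precomputed spatial error
    time_root: TimeNode
    children: list = field(default_factory=list)  # assumed back to front


def pick_time_node(node, t, te_tol):
    """Descend toward time step t until the temporal error is acceptable."""
    while node.left is not None and node.te > te_tol:
        mid = (node.lo + node.hi) // 2
        node = node.left if t <= mid else node.right
    return node


def traverse(node, t, se_tol, te_tol, out=None):
    """Collect (octree node, time interval) pairs selected for rendering."""
    out = [] if out is None else out
    if node.children and node.se > se_tol:
        for child in node.children:
            traverse(child, t, se_tol, te_tol, out)
    else:
        tn = pick_time_node(node.time_root, t, te_tol)
        out.append((node.name, (tn.lo, tn.hi)))
    return out
```

Loose tolerances stop the descent near the roots and yield few coarse blocks; tight tolerances reach the leaves and yield many fine blocks.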
Run-Time Rendering
• Data block reconstruction
  – Get low-pass filtered subblock from its parent node
  – Decode high-pass filtered wavelet coefficients
  – Perform inverse 3D wavelet transform
  – Reduce reconstruction time from $O(c_1 h_o + c_2 h_o h_t)$ to $O(c_1 k_o + c_2 k_o k_t)$, where
    • $c_1$ = time to perform an inverse 3D wavelet transform
    • $c_2$ = time to perform an inverse 1D wavelet transform
    • $h_o$ = the height of the octree
    • $h_t$ = the height of the time tree
    • $k_o$ = # of levels in an octree node group
    • $k_t$ = # of levels in a time tree node group
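As a worked example of the bound, plugging in the tree heights and group sizes quoted earlier in the slides (6 and 4 for the octree and time tree heights, 2 and 2 for the node group levels) gives the expressions below. These are the terms inside the O-notation, not measured times.

```latex
\begin{aligned}
c_1 h_o + c_2 h_o h_t &= 6\,c_1 + (6 \cdot 4)\,c_2 = 6\,c_1 + 24\,c_2 \\
c_1 k_o + c_2 k_o k_t &= 2\,c_1 + (2 \cdot 2)\,c_2 = 2\,c_1 + 4\,c_2
\end{aligned}
```

Under these values the number of inverse 3D transforms in the bound drops by a factor of 3 and the number of inverse 1D transforms by a factor of 6, at the cost of storing the extra reconstructed data.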
Run-Time Rendering
• Parallel Volume Rendering
  – Each processor renders the data blocks identified by the WTSP tree traversal and assigned to it during the data distribution stage
  – Cache reconstructed data for subsequent frames
  – Screen tiles partition
  – Image composition
Results
• Data sets and wavelet transforms

data (type)          RMI (byte)                                 SPOT (float)
range (threshold)    [0, 255] (0)                               [0.0, 10.109] (0.005)
volume (size)        1024 * 1024 * 960 * 32 (30 GB)             512 * 512 * 256 * 30 (7.5 GB)
block (size)         64 * 64 * 32 (128 KB)                      32 * 32 * 16 (64 KB)
tree depth           6 (octree) and 6 (time tree)               6 (octree) and 6 (time tree)
wavelet transform    Haar with lifting (both space and time)    Daubechies 4 (space) and Haar (time)
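The sizes in the table follow directly from the dimensions and the voxel type. The short check below assumes 1 byte per voxel for the byte data, 4 bytes per voxel for the float data, and binary units (1 KB = 2^10 bytes, 1 GB = 2^30 bytes); `size_bytes` is a hypothetical helper.

```python
from math import prod

KB = 2 ** 10
GB = 2 ** 30


def size_bytes(dims, bytes_per_voxel):
    """Raw size of a regular grid: product of dimensions times voxel size."""
    return prod(dims) * bytes_per_voxel


rmi_volume = size_bytes((1024, 1024, 960, 32), 1) / GB   # x, y, z, time steps
rmi_block = size_bytes((64, 64, 32), 1) / KB
spot_volume = size_bytes((512, 512, 256, 30), 4) / GB
spot_block = size_bytes((32, 32, 16), 4) / KB
```

The same arithmetic applied to the original 2,048 * 2,048 * 1,920 RMI grid at 1 byte per voxel gives the 7.5 GB per time step quoted in the introduction.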
Results
• Testing environment
  – A PC cluster consisting of 32 2.4 GHz Pentium 4 processors connected by Dolphin networks
• Performance
  – Software raycasting
  – 96.53% parallel CPU utilization, or a speedup of 30.89 times for 32 processors
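The two performance figures are the same measurement expressed two ways, assuming utilization here means parallel efficiency (speedup divided by the number of processors):

```latex
\text{speedup} = \text{utilization} \times p = 0.9653 \times 32 = 30.8896 \approx 30.89
```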
Results
• Data distribution with EVERY-K scheme ($k_o = 2$, $k_t = 2$)
[Figure: two bar charts, one for the RMI data set and one for the SPOT data set, showing the number of units distributed (y-axis, ticks 600 to 900 in one panel and 200 to 500 in the other) for each processor ID (x-axis).]
Results
• Rendering balance result
[Figure: two bar charts, one for the RMI data set and one for the SPOT data set, showing the number of blocks rendered (y-axis, 0 to 250) for each processor ID (x-axis). Series in one panel: (1000, 10, 25), (20000, 100, 12), (56000, 10, 3). Series in the other panel: (0.1, 0.1, 24), (1.5, 0.1, 18), (4.0, 1.0, 7).]
Results
• The timing result with 512 * 512 output image resolution

data set                  RMI             SPOT
(se, te, t)               (50, 10, 29)    (0.05, 0.01, 23)
number of blocks          6,218           4,840
wavelet reconstruction    15.637s         4.253s
software raycasting       10.810s         2.715s
image composition         0.118s          0.070s
overhead                  3.093s          1.719s
total time                29.658s         8.757s
difference time           2.043s          0.241s
Results
• Rendering of RMI data set at selected time steps
[Figure: renderings at the 1st, 8th, 15th, and 32nd time steps, labeled 536, 743, 1,317, and 1,625 respectively.]
Results
• Rendering of SPOT data set at selected time steps
[Figure: renderings at the 1st, 12th, 21st, and 30th time steps, labeled 2,558, 2,743, 2,392, and 2,461 respectively.]
Results
• Multiresolution volume rendering
RMI data set, 11th time step
SPOT data set, 5th time step
Conclusion & Future Work
• Multiresolution volume rendering framework for large-scale time-varying data visualization
  – Hierarchical WTSP tree data representation
  – Data partition and distribution scheme
  – Parallel volume rendering algorithm
• Future work
  – Utilize graphics hardware for wavelet reconstruction and rendering speedup
  – Incorporate optimal feature-preserving wavelet transforms for feature detection
Acknowledgements
• Funding agencies
  – NSF ITR grant ACI-0325934
  – NSF Career Award CCF-0346883
  – DOE Early Career Principal Investigator Award DE-FG02-03ER25572
• Data sets
  – Mark Duchaineau @ LLNL
  – John Clyne @ NCAR
• Testing environment
  – Jack Dongarra and Clay England @ UTK
  – Don Stredney and Dennis Sessanna @ OSC