New Visualisation and Analysis Challenges for WALLABY · 15 December 2010
TRANSCRIPT
© Swinburne Astronomy Productions
VISUALISATION AND ANALYSIS CHALLENGES FOR WALLABY
CRICOS provider 00111D
Christopher Fluke, David Barnes, Amr Hassan
[ Scientific Computing & Visualisation Group ]
WALLABY Workflow
Observe field
Generate spectral cube
Transfer to archive
Source finding
Model fitting
Add candidate to catalogue
Ready for WALLABY science
Visualisation + Analysis
• Likely data cube:
  • 6144 × 6144 spatial pixels
  • 16,384 spectral channels
  • ~600 gigavoxels
  • 2.5 TB
• What tools does WALLABY need above/beyond what the ASKAP project will provide?
  • New software?
  • New hardware?
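The quoted totals follow directly from the cube dimensions. A quick back-of-envelope check, assuming 4 bytes (single-precision float) per voxel, which is an assumption consistent with the ~2.5 TB figure:

```python
# Back-of-envelope size of the likely WALLABY spectral cube.
# Assumes 4 bytes per voxel (single-precision float).
spatial = 6144        # spatial pixels per axis
channels = 16384      # spectral channels
bytes_per_voxel = 4   # single precision (assumed)

voxels = spatial * spatial * channels
terabytes = voxels * bytes_per_voxel / 1e12

print(f"{voxels / 1e9:.0f} gigavoxels")  # ~618 gigavoxels
print(f"{terabytes:.1f} TB")             # ~2.5 TB
```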
The WALLABY Data Deluge
• Several petabytes of data products
• Existing solutions may not cope
• Grand Challenges for Visualisation and Analysis:
  A. Handling Big Data Files
  B. Global Views versus Image Slices
  C. Source Finding and Confirmation
  D. Desktop Visualisation and Analysis
  E. Data Product Management
• Need to understand the computing context
• [Assumes that all we get from ASKAP is spectral cubes]
• See Fluke, Barnes & Hassan (2010), "e-Science Challenges in Astronomy and Astrophysics", part of the IEEE e-Science 2010 Conference
Desktop Computing Today (2010)
Assumptions: theoretical peak, single precision, 100% efficiency using all cores/streams
[Chart: 2010 desktop performance against the 2.5 TB WALLABY cube]
Per-Node HPC Performance Today (2010)
[Chart: per-node HPC performance against the 2.5 TB cube; roughly 160 nodes would be needed]
WALLABY-sized data on 2010 desktop and HPC is worrying…
The Multi-Core Corner (Barsdell et al. 2010)
[Diagram: the progression from single-core CPU, to multi-core CPU, to many-core GPU (a low-cost streaming co-processor)]
Desktop Computing for WALLABY
Assumptions: theoretical peak, single precision, 100% efficiency using all cores/streams
Per-Node HPC Performance for WALLABY
Bandwidth and memory are bigger factors than FLOPs
(Specs for “Low cost” HPC cluster – not DIRP)
A. Handling Big (= 2.5 TB) Data Files (in 2014)
• HPC cluster with 72 GB per node = 36 nodes (c.f. 160)
• But… sufficient compute capacity in the CPUs?
• 10 GB/GPU = 256 GPUs
• Remote service mode (Amr Hassan talk)
• Need software that supports:
  • Distributed memory architectures
  • Data-parallel algorithms
• FITS? NetCDF? HDF5? On-the-fly transforms?
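Whatever container format wins, the software will need to touch small pieces of a cube without reading the whole file. A minimal sketch of that access pattern using a NumPy memory map; the toy shape and file name are illustrative only, not WALLABY specifics:

```python
import os
import tempfile

import numpy as np

# Toy cube: (channels, y, x). Real WALLABY cubes are ~5 orders of
# magnitude larger; the access pattern is the point, not the size.
shape = (64, 128, 128)
path = os.path.join(tempfile.mkdtemp(), "cube.dat")

# Write a toy cube to disk once.
cube = np.memmap(path, dtype=np.float32, mode="w+", shape=shape)
cube[:] = np.arange(np.prod(shape), dtype=np.float32).reshape(shape)
cube.flush()

# Later: map the file read-only and read just one spectral channel.
view = np.memmap(path, dtype=np.float32, mode="r", shape=shape)
channel = np.array(view[10])  # pulls ~64 KB, not the whole file
print(channel.shape)          # (128, 128)
```

The same idea carries over to FITS (memory-mapped HDUs) or HDF5 (chunked datasets): only the slab being visualised or analysed has to cross the memory bus.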
[Volume rendering of a 1721 × 1721 × 1024 (12 GB) cube, with labelled features: GPS (interference), LMC, Milky Way, processing artifacts, HI detection]
Visualisation: Amr Hassan
Data courtesy: Russell Jurek (ATNF)
5 GPUs vs 100s of CPUs
B. Global Views versus Image Slices
C. Source Finding and Confirmation
• Identifying and extracting candidate sources
• Source finding is easy, right?
  • Examine each voxel in turn, identify the source contribution
  • But… every voxel contains both source and non-source signal
• WALLABY outcomes rely on source-finding software that:
  • Maximises reliability (only extracts HI sources)
  • Maximises completeness (finds every source in the cube)
• Ideal:
  • 1:1 reliability
  • 100% completeness
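The two figures of merit can be written down directly; the counts below are invented purely for illustration:

```python
def reliability(true_detections: int, all_detections: int) -> float:
    """Fraction of catalogued detections that are real HI sources."""
    return true_detections / all_detections

def completeness(true_detections: int, true_sources: int) -> float:
    """Fraction of the real sources in the cube that were found."""
    return true_detections / true_sources

# Invented counts: a finder reports 500 detections, 450 are real,
# and the cube actually contains 600 sources.
print(reliability(450, 500))   # 0.9
print(completeness(450, 600))  # 0.75
```

The ideal finder scores 1.0 on both; in practice the two pull against each other, since lowering the detection threshold raises completeness but lowers reliability.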
Computing requirements for source finders
• Consider using more than one source finder
  • Tuned for different types of sources?
• Requires (at minimum) a distributed computing approach
• But with GPUs…
  • Massive processing gain at much lower cost than a CPU-only cluster
  • Other source-finding alternatives that are not practical with CPUs? (Brute force on GPU)
  • Accelerate elements of other source finders?
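As a concrete (if naive) example of the brute-force approach that GPUs make affordable, simple n-sigma voxel thresholding is embarrassingly data-parallel. A toy NumPy version, with an injected fake source and an invented threshold:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy noise cube with one injected bright "source".
cube = rng.normal(0.0, 1.0, size=(32, 64, 64)).astype(np.float32)
cube[10:13, 20:24, 20:24] += 8.0  # 48 voxels at roughly 8 sigma

# Brute-force thresholding: flag every voxel above n * sigma.
# Each voxel is tested independently, so this maps directly onto
# a GPU kernel (or a distributed CPU cluster).
n_sigma = 5.0
mask = cube > n_sigma * cube.std()
print(int(mask.sum()))  # the 48 injected voxels dominate the flags
```

A real finder would of course do far more (noise estimation, merging flagged voxels into objects, parameterisation), but each of those stages has the same voxel-parallel character.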
D. Desktop Visualisation and Analysis
• Main memory (bandwidth) is the biggest limitation
  • 16 GB (= 3 min to load)
  • Cropping or subsampling (qualitative only)
• Calculate min, max, mean? Seconds (best case)
• Extra calculations? A standard desktop will not have sufficient compute capability
• Add a GPU?
  • Not a drastic improvement (2-3 GB RAM)
  • Perhaps ×10 at best
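The "3 minutes to load" figure is easy to reproduce, assuming a sustained single-disk read rate of roughly 100 MB/s (an assumption, typical of 2010-era desktop disks):

```python
ram_gb = 16            # desktop main memory to fill
read_mb_per_s = 100    # assumed sustained disk read rate (2010 era)

seconds = ram_gb * 1000 / read_mb_per_s
print(seconds / 60)    # ~2.7 minutes
```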
Desktop 3D Volume Rendering
• Hardware-accelerated, texture-based 3D volume rendering
• Today:
  • Standard (NVIDIA GT120 GPU with 512 MB RAM): 350³ voxels at 6-10 fps for 600 × 600 pixel output
  • Top-end (ATI Radeon 5970): 500³ voxels at 8-15 fps for 1000 × 1000 pixel output
• Tomorrow:
  • 2 GB of GPU memory (500 megavoxels ≈ 800³ voxels)
  • Likely to be ~4 fps?
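The "tomorrow" numbers follow from the same 4-bytes-per-voxel assumption used earlier:

```python
gpu_bytes = 2e9          # 2 GB of GPU memory
bytes_per_voxel = 4      # single precision (assumed)

voxels = gpu_bytes / bytes_per_voxel
side = voxels ** (1 / 3)
print(voxels / 1e6)      # 500.0 megavoxels
print(round(side))       # ~794, i.e. roughly an 800^3 cube
```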
Demonstration – 3D texture rendering (on a laptop)
• NGC 3198 (courtesy E. de Blok/THINGS)
  • Original: 1024 × 1024 × 72 (293 MB)
  • Scaled: 512 × 512 × 72 (18 MB)
E. Data Product Management
• Solutions over and above ASKAP databases/archives
• Need to understand the specific requirements for WALLABY
• Can't use text files for 0.5 million sources! (c.f. HIPASS)
• Options:
  • Build our own system?
  • Buy a solution (don't reinvent it)?
  • Look to SDSS, WiggleZ, Millennium and see what they did?
• Important to think about future scalability
Visualisation and Analysis of WALLABY Data
(Images: http://www.zazzle.com.au, http://www.prague-life.com)

Potential solutions:
• Will require evolutionary and revolutionary changes in hardware and software
• More processing steps will move from the desktop to HPC/remote services
• GPUs show great potential