Amr H. Hassan
Centre for Astrophysics and Supercomputing, Swinburne University of Technology
With:
Christopher Fluke (Swinburne), David Barnes (Monash), Virginia Kilborn (Swinburne)
Astronomical “Big Data” Analysis and Visualization
http://www.gizmodo.com.au csironewsblog.com
Astronomy In the “Big Data” Era
© http://science.psu.edu Photography by Paul Bourke and Jonathan Knispel. Supported by WASP (UWA), iVEC, ICRAR, and CSIRO
Australian SKA Pathfinder (ASKAP): 3 to 12 TB/day
The Large Synoptic Survey Telescope (LSST): 30 TB/day
Swinburne gStar GPU-Supercomputer: 400 TFLOP/s
Square Kilometre Array (SKA): 30 to 360 TB/day
© skatelescope.org
Astronomy In the “Big Data” Era
http://www.wired.com/wiredenterprise
LSST – 2018
• 15 TB / night
• Image size ~ 6 to 10 GB
http://www.bigdatabytes.com
ASKAP – 2014
• 3 to 12 TB / day
• Data unit ~ 1 TB
Image credit – Swinburne Astronomy Productions / CSIRO
The Australian Square Kilometre Array Pathfinder
Big Data - Case Study
[Diagram labels: ASKAP; ASKAP data; ASKAP central processing; Simulated Data; ASDAF; Science Community. WALLABY Science Processing: Quality Control, Source Finding (2), Spectral Stacking, Source Parameterization, Data Management, Science Analysis, Publications, Multi-wavelength data]
Radio Astronomy – Computer Assisted Data Analysis
58 TB
Source: WALLABY ASKAP Review - PIs: B. S. Koribalski & L. Staveley-Smith
© zeiss.magnet.fsu.edu
Radio Astronomy – Computer Assisted Data Analysis
ASKAP cube dimensions: 6144 x 6144 x 16384
At 10 fps: ≈ 30 minutes to step through all 16384 channels
Split into a 6 x 6 grid: 36 sub-cubes of 1024 x 1024 x 16384
At 10 fps: 36 x 27.3 minutes ≈ 16.4 hours
Each sub-cube is 64 GB
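The figures on this slide follow from simple arithmetic. The sketch below reproduces them, assuming one spectral channel is displayed per frame and each voxel is stored as a 4-byte float. Neither assumption is stated on the slide, and the function names are illustrative only.

```python
def playback_minutes(channels, fps=10.0):
    """Minutes needed to step through every spectral channel once."""
    return channels / fps / 60.0


def cube_gib(nx, ny, nz, bytes_per_voxel=4):
    """Cube size in GiB; 4 bytes per voxel is an assumption."""
    return nx * ny * nz * bytes_per_voxel / 2**30


CHANNELS = 16384
per_pass = playback_minutes(CHANNELS)        # one pass through 16384 channels
tiles = (6144 // 1024) ** 2                  # 6 x 6 grid of sub-cubes
total_hours = tiles * per_pass / 60.0        # inspecting every sub-cube in turn
tile_size = cube_gib(1024, 1024, CHANNELS)   # size of one sub-cube

print(f"{per_pass:.1f} min per pass, {tiles} sub-cubes, "
      f"{total_hours:.1f} h in total, {tile_size:.0f} GiB per sub-cube")
```

Under these assumptions a single pass takes about 27.3 minutes, which the slide rounds to 30 minutes for the full cube.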
Radio Astronomy – Computer Assisted Data Analysis
Figure panels: (2σ) (3σ) (4σ) (7σ)
Computer Assisted Data Analysis: Sigma-Clipping Transfer Function
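The slide gives no formula for the transfer function, so the following is only a minimal sketch of one plausible form: voxels below a chosen multiple of the noise level are made fully transparent, and opacity ramps linearly above it. The function name, the linear ramp, the saturation point, and the use of the global mean and standard deviation as the noise estimate are all assumptions.

```python
import statistics


def sigma_clip_opacity(values, k=3.0, max_sigma=7.0):
    """Map voxel values to opacities in [0, 1].

    Voxels below mean + k*sigma are fully transparent. Opacity then
    rises linearly and saturates at mean + max_sigma*sigma.
    """
    if max_sigma <= k:
        raise ValueError("max_sigma must be greater than k")
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0.0:
        # A constant volume has nothing above the noise floor.
        return [0.0] * len(values)
    lo = mean + k * sigma
    hi = mean + max_sigma * sigma
    return [min(1.0, max(0.0, (v - lo) / (hi - lo))) for v in values]


# Hypothetical data: 99 empty voxels and one bright source.
opacity = sigma_clip_opacity([0.0] * 99 + [10.0], k=3.0)
```

Raising `k` hides progressively more of the noise, which is the effect a sequence of 2σ to 7σ views would show.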
Computer Assisted Data Analysis: 3D Spectrum Extraction
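As an illustration of what spectrum extraction from a spectral cube involves, the sketch below averages the flux over a rectangular sky region in every channel. The `cube[channel][y][x]` layout, the rectangular region, and the function name are assumptions rather than details taken from the slides.

```python
def extract_spectrum(cube, x_range, y_range):
    """Mean flux per spectral channel over a rectangular sky region.

    cube is indexed as cube[channel][y][x]. The ranges are half-open
    (start, stop) pixel bounds and are assumed to lie inside the cube.
    """
    x0, x1 = x_range
    y0, y1 = y_range
    if x1 <= x0 or y1 <= y0:
        raise ValueError("region must contain at least one pixel")
    n_pix = (x1 - x0) * (y1 - y0)
    spectrum = []
    for plane in cube:
        # Sum the selected pixels of this channel, then normalise.
        total = sum(sum(row[x0:x1]) for row in plane[y0:y1])
        spectrum.append(total / n_pix)
    return spectrum


# Hypothetical 2-channel, 2 x 2 pixel cube.
demo_cube = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
full_region = extract_spectrum(demo_cube, (0, 2), (0, 2))
single_pixel = extract_spectrum(demo_cube, (1, 2), (0, 1))
```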
Computer Assisted Data Analysis: Integrating Source Finder Output
Computer Assisted Data Analysis: Other Operations
http://www.getmemedia.com
8000×8000 pixel volume rendering of the HIPASS dataset on the CSIRO Optiportal at Marsfield,
NSW. The Southern Sky cube was generated by Russell Jurek (ATNF) from 387 HIPASS cubes.
Credit: Christopher Fluke
Computer Assisted Data Analysis: Next Step – Better Data Interaction
Performance Analysis and Benchmarks: gStar
50 standard SGI C3108-TY11 nodes that each contain:
• 2 six-core Westmere processors at 2.66 GHz
• 48 GB RAM
• 2 NVIDIA Tesla C2070 GPUs (each with 6 GB RAM)
1.7 petabytes of usable disk space served by a Lustre file system.
Performance Analysis and Benchmarks: Datasets
Dataset Name      Dimensions (Data Points)   File Size    Number of Points
HIPASS Cube       1721 x 1721 x 1024         11.3 GB      3 Billion
8X HIPASS Cube    3442 x 3442 x 2048         90.4 GB      24 Billion
27X HIPASS Cube   5163 x 5163 x 3072         305.1 GB     81 Billion
48X HIPASS Cube   6884 x 6884 x 3072         542.33 GB    145 Billion
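The file sizes and point counts in this table can be checked from the dimensions alone. The sketch below assumes 4 bytes per data point and that "GB" on the slide means binary gigabytes (2^30 bytes). Both are assumptions, although the resulting sizes agree with the File Size column when rounded.

```python
DATASETS = {
    "HIPASS": (1721, 1721, 1024),
    "8X HIPASS": (3442, 3442, 2048),
    "27X HIPASS": (5163, 5163, 3072),
    "48X HIPASS": (6884, 6884, 3072),
}


def describe(dims, bytes_per_point=4):
    """Return (number of points, size in GiB) for a cube of the given dimensions."""
    nx, ny, nz = dims
    points = nx * ny * nz
    return points, points * bytes_per_point / 2**30


for name, dims in DATASETS.items():
    points, size_gib = describe(dims)
    print(f"{name}: {points / 1e9:.1f} billion points, {size_gib:.2f} GiB")
```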
Performance Analysis and Benchmarks: Data Analysis Performance - 96 GPUs
Dataset Name      File Loading    Median (s)   Mean/Std (s)   Histogram (s)
48X HIPASS Cube   ~ 9 Minutes     44           1.745          4
27X HIPASS        ~ 5.3 Minutes   22           1.2            3.9
8X HIPASS         ~ 8.3 Minutes   7.8          0.5            1.6
HIPASS            ~ 10 Seconds    2            0.4            0.12
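The slides do not describe how the global mean and standard deviation are computed across 96 GPUs. One common pattern, shown here purely as a sketch, is for each device to reduce its own chunk to a small summary and for the summaries to be merged pairwise afterwards, using the standard formula for combining partial variances. The function names and the list-of-chunks layout are hypothetical.

```python
import math


def chunk_summary(values):
    """Return (count, mean, M2) for one non-empty chunk.

    M2 is the sum of squared deviations from the chunk mean.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)
    return n, mean, m2


def merge(a, b):
    """Combine two chunk summaries into the summary of their union."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return n, mean, m2


def global_mean_std(chunks):
    """Mean and population standard deviation over all chunks."""
    total = chunk_summary(chunks[0])
    for chunk in chunks[1:]:
        total = merge(total, chunk_summary(chunk))
    n, mean, m2 = total
    return mean, math.sqrt(m2 / n)


# Two hypothetical chunks, as if held on two different devices.
mean, std = global_mean_std([[1.0, 2.0], [3.0, 4.0, 5.0]])
```

Only three numbers per chunk need to be exchanged, which is why this kind of reduction scales well. A median has no such compact summary, which is consistent with the median column being the slowest in the table.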
Computer Assisted Data Analysis: Volume Rendering – Performance
Computer Assisted Data Analysis: Volume Rendering Performance with 96 GPUs
http://fresh-flow.co.uk