The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data
Presented at the Astroinformatics 2013: Knowledge from Data conference, December 11, 2013
![Page 1: The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data](https://reader033.vdocuments.us/reader033/viewer/2022042813/54805366b479595e578b467e/html5/thumbnails/1.jpg)
Arna Karick
eResearch Consultant/Data Analyst/Astro
Swinburne Research
Swinburne Pulsar Portal
Real-time Supercomputing Processing of Big Data
![Page 2: The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data](https://reader033.vdocuments.us/reader033/viewer/2022042813/54805366b479595e578b467e/html5/thumbnails/2.jpg)
This project is an extension of the Swinburne University of Technology Metadata Stores Project and is partly supported by the Australian National Data Service (ANDS).
ANDS is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative.
![Page 3: The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data](https://reader033.vdocuments.us/reader033/viewer/2022042813/54805366b479595e578b467e/html5/thumbnails/3.jpg)
The ERA of All-Sky Science @ radio, optical and infrared wavelengths...
ASKAP: Wallaby all-sky HI survey - 620 Giga voxels/2.5 TB data cubes
MWA: All-sky radio survey, ~ 6 PB/yr archived data
WISE: All-sky IR survey – recent AllWISE data release (Nov 2013), source catalog: 747 million objects
VST ATLAS: 4500 square deg. of the Southern sky (U, V, R, I, Z)
VPHAS+: VST H-alpha Survey of the Southern Galactic Plane
+ Molonglo Observatory Synthesis Telescope (MOST)
WISE IR Survey
![Page 4: The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data](https://reader033.vdocuments.us/reader033/viewer/2022042813/54805366b479595e578b467e/html5/thumbnails/4.jpg)
Value of Data Access & Analysis Tools for researchers and citizen scientists...
Hubble Legacy Archive
HST & SDSS obviously...
• Dealing with the data deluge
• Efficient research (less reinventing the wheel)
• Greater Exposure
• New Collaborations
• Publications & increased citations
• New Discoveries
![Page 5: The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data](https://reader033.vdocuments.us/reader033/viewer/2022042813/54805366b479595e578b467e/html5/thumbnails/5.jpg)
Swinburne Pulsar Portal - The Who?
Matthew Bailes: Research Astronomer & Pro-Vice Chancellor (Research)
Andrew Jameson: Software Developer & Systems Engineer
Chris Flynn: Research Astronomer (Molonglo Telescope)
Willem van Straten: Research Astronomer (Software & Instrumentation)
Ewan Barr: Pulsar Postdoc (HTRU reprocessing & Molonglo)
Arna Karick: eResearch (Research Data Management & Policy) & Astronomer (optical: galaxy clusters & ETGs)
![Page 6: The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data](https://reader033.vdocuments.us/reader033/viewer/2022042813/54805366b479595e578b467e/html5/thumbnails/6.jpg)
Swinburne Pulsar Portal - The What?
An online tool facilitating remote access to and processing of CSIRO Parkes pulsar data
Survey snapshot
![Page 7: The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data](https://reader033.vdocuments.us/reader033/viewer/2022042813/54805366b479595e578b467e/html5/thumbnails/7.jpg)
High Time Resolution Universe - HTRU (P630)
• Paper I - Keith et al. (2010) + discovery papers
• Collaborative research (Swinburne, Manchester, ATNF, Cagliari)
• Low-lat survey: thin strip of the Galactic Plane (deep) -> faint pulsars
• Med-lat survey: bright MSPs for timing array projects
• High-lat survey: snapshot of transient sky (Sth +10)
• Rotating radio transients, short duration radio bursts
• Running for ~5 years, over 100 new pulsars, including 26 millisecond pulsars
• Survey has produced over 600 Tb of raw data (Total ~875 Tb)
• Data archived to tape & streamed to Swinburne via 1 Gb/s link - cont. observing
Pulsar Timing Array projects (P140)
• Detection of gravitational waves
High Time Resolution Universe North (HTRU - North)
• Effelsberg Radio Telescope
Molonglo Observatory Synthesis Telescope (MOST): ?? possibly... in consultation with research groups
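The HTRU data volumes and link speed quoted on this slide can be put in perspective with a quick back-of-envelope calculation. The sketch below is illustrative only: it assumes that "Tb" on the slide means terabytes (10^12 bytes) and that the 1 Gb/s link could be fully utilised, neither of which is stated in the talk. The function names are hypothetical.

```python
# Back-of-envelope sketch. Assumptions (not from the talk):
#   - "Tb" means terabytes, i.e. 10**12 bytes
#   - the link sustains its full nominal rate

def transfer_days(volume_tb, link_gbps=1.0):
    """Days needed to move volume_tb terabytes over a link of link_gbps gigabits/s."""
    bits = volume_tb * 1e12 * 8            # terabytes -> bits
    seconds = bits / (link_gbps * 1e9)     # bits / (bits per second)
    return seconds / 86400.0               # seconds -> days


def mean_rate_mbps(volume_tb, years):
    """Average rate in megabits/s needed to accumulate volume_tb over a number of years."""
    bits = volume_tb * 1e12 * 8
    seconds = years * 365.25 * 86400.0
    return bits / seconds / 1e6


if __name__ == "__main__":
    print(f"600 TB over 1 Gb/s: {transfer_days(600):.1f} days")
    print(f"875 TB in 5 years: {mean_rate_mbps(875, 5):.1f} Mb/s on average")
```

Under these assumptions, the raw data alone would occupy a saturated 1 Gb/s link for roughly eight weeks, while the average rate over a five-year survey is a small fraction of the link capacity, which is consistent with streaming during continuous observing.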
![Page 8: The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data](https://reader033.vdocuments.us/reader033/viewer/2022042813/54805366b479595e578b467e/html5/thumbnails/8.jpg)
Swinburne Pulsar Portal - The How?
• User friendly web interface
• Sophisticated analysis tools backed by significant processing power
• MySQL database with a PHP frontend.
• Accesses a Pb scale database
• XML headers for instrumental and astrophysical metadata - format independent & editable - facilitates easy indexing
• Uses the supercomputer (gSTAR) batch queue system - email alerts - currently has ‘timeout’ in place
• Modular - datasets and analysis tools can be added over time
• Attempt to write ‘non-expert’ analysis tools
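As a rough illustration of how XML headers can make metadata format independent and easy to index, the sketch below flattens a small header into key/value pairs and stores them in a lookup table. Everything here is an assumption made for illustration: the element names, the observation ID and the table layout are hypothetical, and Python's built-in `sqlite3` stands in for the MySQL database mentioned on the slide. It is not the portal's actual schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical header: element names and values are illustrative only.
HEADER_XML = """<observation>
  <instrument>
    <telescope>Parkes</telescope>
    <nbeams>13</nbeams>
  </instrument>
  <astro>
    <survey>HTRU</survey>
    <region>med-lat</region>
  </astro>
</observation>"""


def flatten_header(xml_text):
    """Return {"section.tag": text} for every leaf element two levels down."""
    root = ET.fromstring(xml_text)
    fields = {}
    for section in root:
        for leaf in section:
            fields[f"{section.tag}.{leaf.tag}"] = (leaf.text or "").strip()
    return fields


def index_header(conn, obs_id, fields):
    """Store each header field as one row so it can be searched by name."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS header_index "
        "(obs_id TEXT, field TEXT, val TEXT)"
    )
    conn.executemany(
        "INSERT INTO header_index VALUES (?, ?, ?)",
        [(obs_id, name, value) for name, value in fields.items()],
    )
    conn.commit()


def find_observations(conn, field, value):
    """Return the IDs of observations whose header has field == value."""
    rows = conn.execute(
        "SELECT obs_id FROM header_index WHERE field = ? AND val = ? "
        "ORDER BY obs_id",
        (field, value),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")   # in-memory stand-in for MySQL
    index_header(conn, "obs-001", flatten_header(HEADER_XML))
    print(find_observations(conn, "astro.survey", "HTRU"))
```

Because each header field becomes its own row, new instrumental or astrophysical fields can be added to the XML without changing the table definition, which is one way to read the "modular" and "editable" goals listed above.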
![Page 9: The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data](https://reader033.vdocuments.us/reader033/viewer/2022042813/54805366b479595e578b467e/html5/thumbnails/9.jpg)
Swinburne Pulsar Portal - The Why?
• Sharing of collaborative datasets - secure / proprietary periods
• Target: project collaborators, registered astronomers.. public?
• Enables users to query AND analyse data and download data products (metadata available via CSIRO's Data Access Facility)
• Alleviates Tb-Pb storage issues, and the guesswork associated with setup & maintenance of software & hardware infrastructure
• Access pulsar observations, search object catalogues, process time-series data with sophisticated analysis software
• Test & validate analysis techniques (for Parkes data, Molonglo & SKA)
• Improved multi-processing (e.g. orbital solutions for high-eccentricity binaries)
• Science-ready results & greater discovery potential
![Page 10: The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data](https://reader033.vdocuments.us/reader033/viewer/2022042813/54805366b479595e578b467e/html5/thumbnails/10.jpg)
Swinburne Pulsar Portal – Data Processing
![Page 11: The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data](https://reader033.vdocuments.us/reader033/viewer/2022042813/54805366b479595e578b467e/html5/thumbnails/11.jpg)
Swinburne Pulsar Portal – Data Tools
standard routines + novel techniques
candidate sorting
folding & optimisation
pulse periods
plotting software
editable? user software? e.g. Geophysics VO on NeCTAR
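To make "folding & optimisation" and "pulse periods" concrete, here is a minimal sketch of epoch folding: a time series is folded at a trial period into phase bins, and the trial period that gives the sharpest profile is kept. This is a generic textbook illustration in plain Python, not the portal's own software. The function names, the peak-minus-mean figure of merit and the synthetic signal are all assumptions chosen for brevity.

```python
def fold(samples, tsamp, period, nbins=16):
    """Fold a uniformly sampled time series at a trial period.

    samples: sequence of intensities, tsamp: sampling interval in seconds,
    period: trial period in seconds. Returns the mean intensity per phase bin.
    """
    sums = [0.0] * nbins
    counts = [0] * nbins
    for i, value in enumerate(samples):
        phase = (i * tsamp / period) % 1.0           # rotational phase in [0, 1)
        b = min(int(phase * nbins), nbins - 1)       # guard against rounding to nbins
        sums[b] += value
        counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def profile_contrast(profile):
    """Crude figure of merit (an assumption): peak of the profile minus its mean."""
    return max(profile) - sum(profile) / len(profile)


def best_period(samples, tsamp, trial_periods, nbins=16):
    """Return the trial period whose folded profile has the highest contrast."""
    return max(
        trial_periods,
        key=lambda p: profile_contrast(fold(samples, tsamp, p, nbins)),
    )


if __name__ == "__main__":
    # Synthetic, noise-free signal: a 5 ms pulse every 100 ms, sampled at 1 ms.
    tsamp = 0.001
    samples = [1.0 if (i % 100) < 5 else 0.0 for i in range(2000)]
    print(best_period(samples, tsamp, [0.09, 0.1, 0.137]))
```

Real search pipelines also correct for dispersion and use more robust statistics than peak-minus-mean, but the folding step itself follows this pattern: at the correct period the pulses stack into a few adjacent bins, while at a wrong period they smear across the profile.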
![Page 12: The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data](https://reader033.vdocuments.us/reader033/viewer/2022042813/54805366b479595e578b467e/html5/thumbnails/12.jpg)
Swinburne Pulsar Portal – Data Products
failures ~2%, e.g. beams with interference/timeouts
![Page 13: The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data](https://reader033.vdocuments.us/reader033/viewer/2022042813/54805366b479595e578b467e/html5/thumbnails/13.jpg)
Swinburne Pulsar Portal – Data Products
![Page 14: The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data](https://reader033.vdocuments.us/reader033/viewer/2022042813/54805366b479595e578b467e/html5/thumbnails/14.jpg)
Swinburne Pulsar Portal – Data Products
![Page 15: The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data](https://reader033.vdocuments.us/reader033/viewer/2022042813/54805366b479595e578b467e/html5/thumbnails/15.jpg)
Swinburne Pulsar Portal – Data Products
![Page 16: The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data](https://reader033.vdocuments.us/reader033/viewer/2022042813/54805366b479595e578b467e/html5/thumbnails/16.jpg)
Coming Soon...
Mid-2014
![Page 17: The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data](https://reader033.vdocuments.us/reader033/viewer/2022042813/54805366b479595e578b467e/html5/thumbnails/17.jpg)
Other Projects
• MyTardis@Swinburne data solutions
  - Brain Imaging: MEG & EEG
  - Microscopy/Eng: Raman spectrometer & confocal microscope
• Research Data Management & Policy
  - Institutional research storage, Cloudstor+ (file sharing & storage)
  - Research Conduct policy, analytics & strategy
• Swinburne (ANDS) Metadata Store Project
  - Research data collections for Research Data Australia
  - Copyright, software licensing, DOIs
• Astronomy Research
  - HST/ACS Coma Cluster Treasury Survey
  - HST imaging of the Atlas3D galaxy sample