Accelerated Acceleration Searching · 2015-09-18
Ewan Barr, Swinburne University of Technology
Pulsar hunting
Search in five parameters:
Position
Dispersion measure
Period
Width
Acceleration
Tools of the trade
SIGPROC:
Lorimer et al.
C / FORTRAN
Procedural paradigm.
Easily extended.
Suffers from too many cooks...
Implements “time domain” acceleration searching.
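A minimal sketch of what “time domain” acceleration searching involves (an illustration under simple assumptions, not SIGPROC's actual implementation): each sample is shifted by the first-order Doppler delay a·t²/(2c) for a trial acceleration, so that a pulsar with that acceleration becomes strictly periodic again and an ordinary periodicity search can follow.

```cpp
#include <cstddef>
#include <vector>

// Resample a time series for one trial acceleration (nearest-bin version).
// A signal accelerating at `accel` m/s/s in the input becomes periodic in
// the output, at the cost of one full resampling pass per trial.
std::vector<float> resample(const std::vector<float>& in,
                            double tsamp,   // sampling interval (s)
                            double accel)   // trial acceleration (m/s/s)
{
    const double c = 299792458.0;           // speed of light (m/s)
    std::vector<float> out(in.size(), 0.0f);
    for (std::size_t i = 0; i < in.size(); ++i) {
        double t = i * tsamp;
        double shifted = t + accel * t * t / (2.0 * c);  // corrected time
        auto j = static_cast<std::size_t>(shifted / tsamp + 0.5);
        if (j < out.size()) out[j] = in[i]; // nearest-bin assignment
    }
    return out;
}
```

With `accel = 0` the series is returned unchanged; the cost of repeating this (plus an FFT) for every acceleration trial is what makes the time-domain approach expensive.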
PRESTO:
Ransom et al.
ANSI C / PYTHON
Procedural paradigm.
Easily extended.
Well maintained.
Implements “Fourier domain” acceleration searching.
...CUDA support added!
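The motivation for the “Fourier domain” approach: over an observation of length T, a constant line-of-sight acceleration a drifts a signal of spin frequency f through roughly z = a·f·T²/c Fourier bins, and the search correlates the spectrum with templates for each trial z rather than resampling the whole time series. A small helper for that arithmetic (the 400 Hz figure below is an illustrative MSP-like value, not from the talk):

```cpp
// Number of Fourier bins a signal of frequency `freq` (Hz) drifts through
// under constant acceleration `accel` (m/s/s) over `tobs` seconds:
// z = a * f * T^2 / c.
double fourier_drift_bins(double accel, double freq, double tobs)
{
    const double c = 299792458.0;   // speed of light (m/s)
    return accel * freq * tobs * tobs / c;
}
```

For a 72-minute observation, a 400 Hz signal at 5 m/s/s drifts through ~125 bins, which is why an unaccelerated FFT search smears such a signal away.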
Problems
Previous codes were written for the machines of their time: less RAM, smaller cache sizes and no accelerator support.
Often require heavy disk I/O.
FFTs are “slow” to execute on the CPU.
Cannot keep up with data rates without large computing clusters.
Large lags between observation and discovery.
Execution times force smaller parameter space searches.
Case study
Take a 72-minute HTRU observation (2^26 samples, 1.2 hours).
For one DM trial (single-threaded on an Intel Xeon 2.67 GHz):

Accelerations            0 m/s/s      ±5 m/s/s
SIGPROC execution time   13 seconds   1.5 hours
PRESTO execution time    55 seconds   0.6 hours

Scaling up to a 2000-DM-trial search:

Accelerations            0 m/s/s      ±5 m/s/s
SIGPROC execution time   7.2 hours    125 days
PRESTO execution time    30.5 hours   50 days
GPUs vs CPUs
CUDA
Compute Unified Device Architecture
Enables us to quickly write code which can use a large fraction of a GPU's potential.
Problem domain must be decomposed to a high level of granularity.
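An illustration of what “high granularity” means in practice, using harmonic summing (a standard pulsar-search step; this two-harmonic fold is a simplification, not Peasoup's actual kernel). Every output bin is independent, so on a GPU the loop body becomes one lightweight thread with i = blockIdx.x * blockDim.x + threadIdx.x; plain C++ stands in for the kernel here:

```cpp
#include <cstddef>
#include <vector>

// Sum each spectral bin with its second harmonic: out[i] = p[i] + p[2i].
// Each iteration touches only bin i, so the decomposition is one thread
// per output bin, millions of threads for a 2^26-point spectrum.
std::vector<float> harmonic_sum2(const std::vector<float>& powers)
{
    std::vector<float> out(powers.size(), 0.0f);
    for (std::size_t i = 0; i < powers.size(); ++i) {  // "one thread per bin"
        out[i] = powers[i];
        if (2 * i < powers.size()) out[i] += powers[2 * i];
    }
    return out;
}
```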
Peasoup
CUDA / C++ pulsar searching pipeline/toolbox
Based on SIGPROC plus code by Ben Barsdell, Paul Coster and Matthew Bailes.
Currently the only “end-to-end” acceleration searching pipeline in the world.
Very, very fast...
Peasoup pipeline
A thread is spawned for each “worker” GPU.
A master “foreman” thread controls allocation of dispersion trials to GPUs.
Each GPU executes all accelerations for its current DM trial.
Candidates undergo a hierarchical sifting system and are finally folded and “scored”.
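The foreman/worker scheduling described above can be sketched as follows (an illustration of the pattern, not Peasoup's actual code): a shared list of DM trials, a mutex standing in for the foreman, and one host thread per “GPU”, each pulling whole DM trials and running every acceleration for that trial.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Dispatch n_trials DM trials across n_gpus worker threads and return how
// many trials were processed in total.
int run_pipeline(int n_gpus, int n_trials)
{
    std::mutex foreman;                 // serialises trial hand-out
    int next = 0;                       // index of next unclaimed DM trial
    int processed = 0;

    auto worker = [&]() {
        for (;;) {
            int trial;
            {
                std::lock_guard<std::mutex> lock(foreman);
                if (next == n_trials) return;   // no work left
                trial = next++;
            }
            (void)trial;                // here: all accelerations for this
                                        // DM trial run on this worker's GPU
            std::lock_guard<std::mutex> lock(foreman);
            ++processed;
        }
    };

    std::vector<std::thread> gpus;
    for (int g = 0; g < n_gpus; ++g) gpus.emplace_back(worker);
    for (auto& t : gpus) t.join();
    return processed;                   // each trial handled exactly once
}
```

Handing out whole DM trials keeps the granularity coarse on the host side while leaving the fine-grained parallelism (all accelerations for one trial) to each GPU.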
Benchmarks
Same HTRU “lowlat” file as before (2000 DM trials):

Accelerations            0 m/s/s      ±5 m/s/s
SIGPROC execution time   7.2 hours    125 days
PRESTO execution time    30.5 hours   50 days
PEASOUP execution time   5 minutes    30 hours

Rates of ~36 trials per second for 2^23-length time series.
Rates of ~7 trials per second for 2^26-length time series.
Results
Reprocessed all of HTRU “medlat” in under 30 days.
Produced >50 million candidates...
To analyze and select candidates we must employ neural nets + ranking algorithms.
This side still needs work, but it is a very active field.
J1705-1908
[figures: HTRU detection; Jacoby & Edwards survey]
PSRJ    J1705-1908
RAJ     17:05:42.6108168         1  0.27649538519546339521
DECJ    -19:08:02.5899999999903
F0      403.17844714540158663    1  0.00000048482103529509
F1      0
PEPOCH  56582.211983880566549
DM      57.490000000000002
BINARY  ELL1
PB      0.18395397110741149573   1  0.00000004178723294063
A1      0.10435891291481527563   1  0.00000040610657989516
TASC    56582.212268774442624    1  0.00000060684296922283
EPS1    0
EPS2    0
START   56583.219462382228812
FINISH  56585.350115984103468
GL      3.18
GB      12.99
Binary MSP
~4 hour orbit.
Low eccentricity.
Low mass companion (0.04 Msol).
Probably a “black widow”?
Companion Roche lobe is roughly twice A1.
Shows eclipses over ~5% of the orbit.
Expect r.m.s. residual of ~3-6 us.
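The ~0.04 Msol companion mass follows from the binary mass function using PB and A1 from the ephemeris above, assuming a 1.4 Msol pulsar and an edge-on orbit (i = 90°): solve m2³ = f·(m1 + m2)² for the minimum companion mass. A sketch of that calculation:

```cpp
#include <cmath>

// Minimum companion mass (solar masses) from the mass function
// f = 4*pi^2 * (a1*sin i)^3 / (G * Pb^2), assuming sin i = 1, solved by
// fixed-point iteration of m2 = cbrt(f * (m1 + m2)^2).
double min_companion_mass(double pb_days, double a1_ltsec, double m1 = 1.4)
{
    const double c = 299792458.0, G = 6.674e-11, Msun = 1.989e30;
    double a1 = a1_ltsec * c;                       // projected semi-major axis (m)
    double pb = pb_days * 86400.0;                  // orbital period (s)
    double f  = 4 * M_PI * M_PI * a1 * a1 * a1 / (G * pb * pb) / Msun;
    double m2 = 0.1;                                // initial guess (Msun)
    for (int i = 0; i < 50; ++i)
        m2 = std::cbrt(f * (m1 + m2) * (m1 + m2));
    return m2;
}
```

With PB = 0.18395397 days and A1 = 0.10435891 lt-s this gives ~0.042 Msol, consistent with the black-widow interpretation.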
J1705-1908
Low ecliptic latitude (~3.7 degrees) presents a minor positional problem.
No nearby 2FGL sources.
Several X-ray sources within 2 arcminutes.
Several bright stars within 2 arcminutes.
If we wish to do optical follow-up (possibly with GEMINI), then we might require an interferometric position (ATCA or GMRT).
May be of interest to PTAs despite its “black widowiness”, as the profile is very narrow and it can be observed by most 100-m-class radio telescopes.
Source is moderately bright (~S/N 20 in 8 mins with Parkes) and so may be interesting for RM/DM measurements at eclipse ingress/egress.
Summary
GPUs likely represent the future of astronomical data analysis hardware.
The fine-grained parallelism of many DSP operations makes GPUs an ideal choice for pulsar searching.
The speed offered by systems such as Peasoup will be vital in dealing with the SKA’s “Big Data”.
Peasoup has already passed the most important test of pulsar finding software by finding a pulsar.
Peasoup is currently being used in Manchester (for simulated SKA data), Germany (for processing the HTRU-North pulsar survey) and at Swinburne (for processing archival data sets).