TRAP (transient detection pipeline) status update

Posted on 05-Dec-2014

Category:

Technology

DESCRIPTION

These are the slides from the talk I gave at the 'Radio Transients with SKA Pathfinders and Precursors' conference at Kruger Park, South Africa, 9-12 July 2013.

TRANSCRIPT

TRAP STATUS UPDATE

TRAnsients Pipeline

Gijs Molenaar

gijs@pythonic.nl

@gijzelaerr

Thursday, July 11, 13

ABOUT TRAP

• TRAnsients Pipeline

• Detect and classify transients in multi-frequency radio sky image time series

• Emit VOevents

• 99% Python

STEPS

A LOT HAPPENED

• Version 1.0 imminent

• Focused on code quality and performance

• No big new science features

PERFORMANCE

• A lot faster

• Really a lot faster

• 0.85 images per second per core

• Scales well

RSM CYCLE0 RUN0

• 3402 images

• processing record - 5:21 min

• 2 machines, 36 cores

• 5645 unique sources

• 667 detected transients

• previous version: 400 min on 40 cores
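
As a rough check on the figures above, the following sketch compares the two runs by wall-clock time and by core-minutes. It uses only the numbers on this slide. The variable names are illustrative, and it assumes that "5:21 min" means 5 minutes and 21 seconds.

```python
# Back-of-the-envelope comparison of the two RSM Cycle 0 Run 0 timings.
# Assumption: "5:21 min" means 5 minutes and 21 seconds.

images = 3402

new_minutes = 5 + 21 / 60   # new run, on 36 cores
new_cores = 36
old_minutes = 400           # previous version, on 40 cores
old_cores = 40

# How much sooner the result arrives.
wall_clock_speedup = old_minutes / new_minutes

# How much less total compute (cores x minutes) was spent.
core_minutes_ratio = (old_minutes * old_cores) / (new_minutes * new_cores)

# Aggregate throughput of the new run over all 36 cores.
images_per_second = images / (new_minutes * 60)

print(f"wall-clock speedup:   {wall_clock_speedup:.1f}x")
print(f"core-minutes ratio:   {core_minutes_ratio:.1f}x")
print(f"aggregate throughput: {images_per_second:.1f} images/s")
```

Under that reading of the timing, the new run finishes roughly 75 times sooner while using slightly fewer cores.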

TRAP & AARTFAAC

• AARTFAAC

• 48 images/s

• 57 (real) cores required

• 1 or 2 big fat systems will do!
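
The core count above appears to follow from dividing the AARTFAAC image rate by the per-core throughput quoted on the performance slide. A minimal sketch of that arithmetic, assuming throughput scales linearly with the number of cores:

```python
import math

images_per_second_required = 48     # AARTFAAC image rate (this slide)
images_per_second_per_core = 0.85   # TRAP throughput (performance slide)

# Assumption: throughput scales linearly with cores, so round up to whole cores.
cores_required = math.ceil(images_per_second_required / images_per_second_per_core)

print(cores_required)
```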

INSTALLABILITY

• Merged TKP into TRAP

• Almost open source

• Easy database setup

• Removed many dependencies

• Like LOFAR System Software (closed source)

QUALITY CONTROL

• Automated rejection of bad images

• Known bright source in FOV

• RMS x times higher than theoretical noise

• oversampled / undersampled / highly elliptical
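
The noise and beam-shape checks listed above could be sketched roughly as follows. This is not TRAP's actual implementation: the function name, the parameter names and every threshold are hypothetical values chosen for illustration, and the bright-source check is left out.

```python
def rejection_reasons(rms, theoretical_noise, bmaj_pix, bmin_pix,
                      max_rms_factor=5.0, min_beam_pix=2.0,
                      max_beam_pix=10.0, max_axis_ratio=2.0):
    """Return a list of reasons to reject an image (empty list means accept).

    rms and theoretical_noise share the same unit; bmaj_pix and bmin_pix are
    the restoring-beam major and minor axes in pixels (bmin_pix must be > 0).
    All default thresholds are hypothetical, not values used by TRAP.
    """
    reasons = []
    # RMS "x times higher than theoretical noise"; here x = max_rms_factor.
    if rms > max_rms_factor * theoretical_noise:
        reasons.append("rms too high")
    # Too few pixels across the beam: undersampled.
    if bmin_pix < min_beam_pix:
        reasons.append("undersampled beam")
    # Too many pixels across the beam: oversampled.
    if bmaj_pix > max_beam_pix:
        reasons.append("oversampled beam")
    # Major axis much longer than minor axis: highly elliptical.
    if bmaj_pix / bmin_pix > max_axis_ratio:
        reasons.append("highly elliptical beam")
    return reasons


noisy = rejection_reasons(rms=0.02, theoretical_noise=0.001,
                          bmaj_pix=6.0, bmin_pix=2.5)
clean = rejection_reasons(rms=0.002, theoretical_noise=0.001,
                          bmaj_pix=4.0, bmin_pix=3.0)
print(noisy)
print(clean)
```

Returning the list of reasons, rather than a bare boolean, keeps the rejection explainable when an image is dropped automatically.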

STORAGE

• Added support for PostgreSQL

• fast with small datasets

• Many off-the-shelf tools available

UNDER THE HOOD

• Switched to Celery

• asynchronous job queue

• based on distributed message passing

• No more cuisine

WHY CELERY

• Easier to use / install / debug

• Faster - hot processes

• Many off-the-shelf tools

• CEP1 compatible

• Easy to add compute nodes
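
The job-queue pattern can be illustrated without a message broker. The sketch below is not Celery: it uses Python's standard-library concurrent.futures as a stand-in to show the same idea of submitting independent per-image jobs to a pool of long-lived workers and collecting the results as they finish. The function process_image and the file names are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_image(path):
    """Hypothetical per-image job; a real one would run source extraction."""
    return path, len(path)  # placeholder result


def run_jobs(paths, workers=4):
    """Submit one job per image and gather results as workers finish."""
    results = {}
    # The pool keeps its workers alive between jobs, so no per-job startup cost.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(process_image, p) for p in paths]
        for future in as_completed(futures):
            path, value = future.result()
            results[path] = value
    return results


results = run_jobs(["img_000.fits", "img_001.fits", "img_002.fits"])
print(sorted(results))
```

In Celery the workers are separate processes, possibly on other machines, and jobs reach them through a message broker. The stand-in here uses threads in a single process only to keep the example self-contained.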

DISCO?

• Maybe add support for Disco in the future

• Similar

• Map - Reduce

• Hadoop for Python

• Distributed file system

USABILITY

• tkp-manage.py

• Pipeline management tool

• Inspired by Django manage.py command

• Easy to

• set up pipeline

• add and run jobs

• run celery workers

• Add new commands
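
A management tool in the style of Django's manage.py can be sketched with argparse subcommands. This is only an illustration of the pattern, not the code of tkp-manage.py: the command names initjob and run and the handler functions are hypothetical.

```python
import argparse


def init_job(args):
    """Hypothetical handler: would create a job directory with default config."""
    return f"created job {args.name}"


def run_job(args):
    """Hypothetical handler: would start the pipeline for an existing job."""
    return f"running job {args.name}"


def build_parser():
    parser = argparse.ArgumentParser(prog="manage.py")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Adding a new command means adding one sub-parser and one handler.
    initjob = subparsers.add_parser("initjob", help="create a new job")
    initjob.add_argument("name")
    initjob.set_defaults(func=init_job)

    run = subparsers.add_parser("run", help="run an existing job")
    run.add_argument("name")
    run.set_defaults(func=run_job)
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    return args.func(args)  # dispatch to the handler of the chosen command


print(main(["initjob", "demo"]))
```

Keeping each command as a sub-parser plus a handler function is what makes new commands cheap to add.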

DEMO?

SUPPORTED TELESCOPES

• Support for FITS and CASA tables

• field parsers for LOFAR

• Possible to add telescope specific field parsing and quality checks

• ThunderKAT next week
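
One way to make field parsing telescope-specific is a small registry keyed by telescope name. The sketch below is an assumption about how such a plug-in point could look, not TRAP's actual design. Headers are plain dictionaries here. TELESCOP and DATE-OBS are standard FITS keywords, while ANTENNA is a made-up keyword standing in for telescope-specific metadata, and all function names are hypothetical.

```python
FIELD_PARSERS = {}


def register(telescope):
    """Decorator that registers a field parser under a telescope name."""
    def decorator(func):
        FIELD_PARSERS[telescope.upper()] = func
        return func
    return decorator


def parse_generic(header):
    """Fallback parser: read only fields that most FITS images carry."""
    return {"telescope": header.get("TELESCOP", "unknown"),
            "date_obs": header.get("DATE-OBS")}


@register("LOFAR")
def parse_lofar(header):
    """Generic fields plus one illustrative telescope-specific field."""
    fields = parse_generic(header)
    # "ANTENNA" is a made-up keyword used only for this example.
    fields["antenna_set"] = header.get("ANTENNA")
    return fields


def parse_fields(header):
    """Pick the parser registered for the header's telescope, else the fallback."""
    name = str(header.get("TELESCOP", "")).upper()
    return FIELD_PARSERS.get(name, parse_generic)(header)


lofar = parse_fields({"TELESCOP": "LOFAR", "DATE-OBS": "2013-07-11", "ANTENNA": "HBA"})
other = parse_fields({"TELESCOP": "KAT-7", "DATE-OBS": "2013-07-11"})
print(lofar)
print(other)
```

Supporting another telescope then amounts to registering one more parser function, with no change to the calling code.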

PROJECT CLEANUP

• removed 40% of code

• 80% unit tested

• Added Jenkins build server

• Performance regression tests

• Pull request/review workflow

• HipChat for central communication

WEB INTERFACE BANANA

• New web interface

• Rewrite of TKP-web

• Future ready

• Scientist friendly

DEMO?

FUTURE WORK

• More stable releases

• Add support for non-LOFAR data

• More quality checks

• Source storage and association performance

• Distributed file system

• Automated classification

• Web based data exploration

QUESTIONS

gijs@pythonic.nl

@gijzelaerr
