Data Handling and Reduction
Jamie Stevens
May 13 2009
Introduction
• CABB will be different, but similar
• What should you expect?
• What should you look for during reduction?
• Communicating your experiences
CABB Data basics
• CABB data is still in RPFITS format
• MIRIAD is going to be the package of choice for CABB reduction
• Alterations are being made to support CABB data
• Other packages won’t be supported
CSIRO. Data Handling and Reduction, April 8 2009
CABB File Sizes
• CABB files are BIG, really BIG
• You might think downloading videos from the internet takes a lot of space, but that’s just peanuts to CABB!
• Current system of 2 IFs, 2048 channels per pol, 2 pols, 15 cross-correlations + 6 auto-correlations gives a rate of ~24 GB/day
• Full system with all zoom bands will generate over 400GB every day
• Each file is limited to 4GB by the correlator
• A file will close automatically when it reaches this size, and a new file will be opened
• Each file represents about 4 hours of observing at the current data rate
• Files are limited to this size so they can fit on DVD, and are below the maximum file size supported by FAT32 filesystems
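The numbers on this slide can be cross-checked with a little arithmetic. The sketch below takes the quoted rates and the 4GB limit at face value and works out how long one file lasts and how many files a day produces. The function names are illustrative only, and it assumes "GB" means 10^9 bytes when comparing against the FAT32 limit.

```python
# Figures quoted in the talk (units as given there).
CURRENT_RATE_GB_PER_DAY = 24.0
FULL_RATE_GB_PER_DAY = 400.0
FILE_LIMIT_GB = 4.0

# FAT32 cannot hold a single file of 4 GiB (2**32 bytes) or more.
FAT32_MAX_FILE_BYTES = 2**32 - 1


def hours_per_file(rate_gb_per_day, file_limit_gb=FILE_LIMIT_GB):
    """Hours of observing that fit in one file at the given data rate."""
    return 24.0 * file_limit_gb / rate_gb_per_day


def files_per_day(rate_gb_per_day, file_limit_gb=FILE_LIMIT_GB):
    """Number of full-size files written in 24 hours of observing."""
    return rate_gb_per_day / file_limit_gb


def fits_on_fat32(size_bytes):
    """True if a single file of this size can be stored on FAT32."""
    return size_bytes <= FAT32_MAX_FILE_BYTES


if __name__ == "__main__":
    for label, rate in [("current", CURRENT_RATE_GB_PER_DAY),
                        ("full zoom", FULL_RATE_GB_PER_DAY)]:
        print(label, hours_per_file(rate), "h per file,",
              files_per_day(rate), "files per day")
```

At the current rate this reproduces the "about 4 hours" per file quoted above. At the full rate a file would fill in roughly a quarter of an hour.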
Taking data away
• A new approach to archiving is needed
• Users should expect to either take a lot of DVDs away with them, or bring an external USB hard drive
• Computers in the observers area can be used to transfer data from the correlator to your hard disk
• Only USB interfaces are supported
• Only FAT32 or ext2/3 disks are supported
• Of course you can just plug in your laptop and use scp
• If you’re in ATNF, then you can copy it from the correlator over the network
• If you’re outside, check that your institution doesn’t charge large amounts for big data transfers!
• Or you can wait for the data to get copied to the archive (a couple of weeks at the moment, should decrease in the future to hours)
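When planning how to take data away, a rough transfer-time estimate helps. The sketch below is plain arithmetic: the sustained throughput is an assumed input that you would have to measure for your own disk or network link, and the 20 MB/s in the example is a made-up figure, not a property of the observers' area machines.

```python
def transfer_hours(data_gb, throughput_mb_per_s):
    """Hours needed to copy data_gb gigabytes at a sustained rate in MB/s.

    Uses 1 GB = 1000 MB. The throughput is an assumption supplied by the
    caller, not a measured value.
    """
    seconds = data_gb * 1000.0 / throughput_mb_per_s
    return seconds / 3600.0


if __name__ == "__main__":
    # One day of data at the current rate quoted in the talk (~24 GB),
    # copied at a hypothetical 20 MB/s.
    print(round(transfer_hours(24.0, 20.0), 2), "hours")
```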
Reduction
• There is a new MIRIAD version that you will need to use to reduce CABB data
• New ATLOD needed for new file format, including yet-to-be-supported zoom modes
• New uvsplit allows the wideband to be split up into smaller band chunks
• Array sizes changed behind the scenes to allow plotting etc. of big datasets
• New MIRIAD is installed on kaputar
• Can get new binaries/sources from usual web address
CABB File Format
• Currently, a CABB file has
• 2 IFs with 2048 channels (IFs 1 & 2)
• Using ATLOD
• Either use ifsel to choose which IF to extract from the file
• Can’t select more than 1 at a time
• OR extract all IFs and use select=window(n) to select IF n
• My experience is that this doesn't always work
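The two routes above can be sketched as command lines. The helper below only builds the argument lists and does not run MIRIAD. The file names are hypothetical, and only the in, out and ifsel keywords of atlod are used, so check the task's own help for the options your data need.

```python
def atlod_command(rpfits_file, out_vis, ifsel=None):
    """Build an atlod argument list, optionally extracting a single IF."""
    args = ["atlod", f"in={rpfits_file}", f"out={out_vis}"]
    if ifsel is not None:
        # Only one IF can be chosen at a time with ifsel.
        args.append(f"ifsel={ifsel}")
    return args


def window_select(n):
    """select= string that picks IF n from a dataset holding all IFs."""
    return f"select=window({n})"


if __name__ == "__main__":
    # Route 1: one output dataset per IF (file names are made up).
    for if_number in (1, 2):
        print(" ".join(atlod_command("obs.C999", f"obs_if{if_number}.uv",
                                     ifsel=if_number)))
    # Route 2: load everything, then select a window in later tasks.
    print(" ".join(atlod_command("obs.C999", "obs_all.uv")))
    print(window_select(2))
```

Given the caveat that select=window(n) doesn't always work, the first route may be the safer default.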
Reduction Philosophies
• Two main philosophies:
• All-at-once
• Divide and conquer
• All-at-once
• Have to be careful about large fractional bandwidth effects
• Increased sensitivity
• Increased complexity with current routines
• Straightforward process
• Divide and conquer
• Can use current routines with little change
• More effort required to run for each sub-band
• More complicated to produce a wideband image
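To see why fractional bandwidth matters for the all-at-once approach, it helps to put numbers on it. The sketch below assumes the 2 GHz bandwidth mentioned elsewhere in this talk; the centre frequencies are illustrative choices, not values taken from the slides.

```python
def fractional_bandwidth(bandwidth_ghz, centre_ghz):
    """Bandwidth divided by centre frequency."""
    return bandwidth_ghz / centre_ghz


def band_edge_ratio(bandwidth_ghz, centre_ghz):
    """Ratio of the highest to the lowest frequency in the band."""
    low = centre_ghz - bandwidth_ghz / 2.0
    high = centre_ghz + bandwidth_ghz / 2.0
    return high / low


if __name__ == "__main__":
    bandwidth = 2.0  # GHz
    for centre in (5.5, 22.0):  # illustrative centre frequencies in GHz
        print(centre, "GHz:",
              round(100 * fractional_bandwidth(bandwidth, centre), 1), "%,",
              "edge ratio", round(band_edge_ratio(bandwidth, centre), 3))
```

The same 2 GHz is a far larger fraction of the band at low frequencies, which is where splitting into sub-bands is most likely to pay off.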
Experiences
• Reduction process has not changed a great deal
• Calibration still uses the same routines
• mfcal, gpcal, gpboot, gpcopy all work on the increased bandwidth
• Imaging should be considered carefully
• Recommend using mfs for all observations, and using mfclean as well
• There are some things that you should look out for
Calibration
• mfcal
• Works as expected
• Uses models of source to perform calibration
• Be careful with choice of interval
• May be more prudent to calibrate on chunks of time with select than to increase interval
• gpcal
• Works as expected
• Doesn't require as much buffer space as mfcal, so it shouldn't run into the same problem
• gpcopy & gpboot
• Use them as usual
• mfboot != better gpboot!
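One way to picture "chunks of time with select" is to generate the select strings up front. The sketch below builds select=time(...) strings in hh:mm:ss form and a matching mfcal argument list. The dataset name, chunk length and interval are hypothetical, and how you keep the solutions from each chunk apart (for example by working on separate copies of the data) is left open here.

```python
def hms(total_minutes):
    """Format a whole number of minutes since 00:00 as hh:mm:ss."""
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours:02d}:{minutes:02d}:00"


def time_selects(start_min, end_min, chunk_min):
    """select=time(...) strings covering start_min to end_min in chunks."""
    selects = []
    t = start_min
    while t < end_min:
        stop = min(t + chunk_min, end_min)
        selects.append(f"select=time({hms(t)},{hms(stop)})")
        t = stop
    return selects


def mfcal_command(vis, select, interval_min):
    """Build an mfcal argument list for one time chunk."""
    return ["mfcal", f"vis={vis}", select, f"interval={interval_min}"]


if __name__ == "__main__":
    # Hypothetical calibrator dataset, 2.5 hours split into 1 hour chunks.
    for sel in time_selects(0, 150, 60):
        print(" ".join(mfcal_command("cal.uv", sel, 0.1)))
```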
Inspecting Data Quality
• uvplt, uvspec and gpplt are all necessary tools for inspecting the quality of your data
• By default, uvplt will average over frequency
• This can hide some problems from you that are more easily observed with options=nofqav
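The difference between the default and the unaveraged view is a single option. The sketch below builds both uvplt argument lists for comparison; the dataset name and the /xs plot device are assumptions for illustration.

```python
def uvplt_command(vis, axis="time,amp", average_frequency=True, device="/xs"):
    """Build a uvplt argument list, with or without frequency averaging."""
    args = ["uvplt", f"vis={vis}", f"axis={axis}", f"device={device}"]
    if not average_frequency:
        # Plot every channel instead of the frequency-averaged value.
        args.append("options=nofqav")
    return args


if __name__ == "__main__":
    print(" ".join(uvplt_command("source.uv")))
    print(" ".join(uvplt_command("source.uv", average_frequency=False)))
```

With 2048 channels the unaveraged plot holds many more points, so it may be worth restricting it with select or line first.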
Interference
• Self-generated interference can cause some problems when the source is transiting
• i.e. when the fringe rotation rate is low
• restricted to known channels, easily flaggable
Flagging
• Can achieve good results with only uvflag
• Flag based on channels and time ranges
• Flag out 50 channels from each edge
• Other flagging options
• blflag
• Suffers because of default frequency averaging
• tvflag
• Still doesn't run with decent colour depth
• tvclip
• Can be run in batch mode without output, useful for flagging out obviously wrong amplitudes/phases
• pieflag
• Struggles with such large datasets
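The "50 channels from each edge" rule translates into two channel ranges. The sketch below works them out in MIRIAD's line=channel,nchan,start form and wraps each in a uvflag argument list. The dataset name is hypothetical and the 2048 channel count is the figure quoted earlier in the talk.

```python
def edge_flag_lines(n_chan, n_edge):
    """line= values covering the first and last n_edge of n_chan channels."""
    low = f"channel,{n_edge},1"
    high = f"channel,{n_edge},{n_chan - n_edge + 1}"
    return [low, high]


def uvflag_edge_commands(vis, n_chan=2048, n_edge=50):
    """Build one uvflag argument list per band edge."""
    return [["uvflag", f"vis={vis}", f"line={line}", "flagval=flag"]
            for line in edge_flag_lines(n_chan, n_edge)]


if __name__ == "__main__":
    for cmd in uvflag_edge_commands("source.uv"):
        print(" ".join(cmd))
```

For 2048 channels this flags channels 1 to 50 and 1999 to 2048.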
Calibration Quality
• Can achieve good quality calibration using entire bandwidth
Imaging
• When making a broadband image, should strongly consider using mfclean
• If your source strength varies with frequency, clean will distort your image
• Use of mfclean particularly important for high dynamic range imaging
• Need to set options=mfs,sdb in invert
• Also need to image out to three times your primary beam size (setting imsize values)
• When running mfclean, need to set region to be just the area inside the primary beam
• Can be useful even with smaller bandwidths, as dynamic range is improved
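The imaging recipe above can be sketched as follows. The primary beam width and cell size are inputs you must supply for your own observing band (the values in the example are made up), the file names are hypothetical, and the region string uses the arcsec,boxes(...) form, which is worth checking against the MIRIAD region documentation before relying on it.

```python
import math


def imsize_pixels(primary_beam_arcsec, cell_arcsec, factor=3.0):
    """Pixels needed to image out to `factor` times the primary beam."""
    return math.ceil(factor * primary_beam_arcsec / cell_arcsec)


def invert_command(vis, map_out, beam_out, imsize, cell_arcsec):
    """Build an invert argument list with options=mfs,sdb for mfclean."""
    return ["invert", f"vis={vis}", f"map={map_out}", f"beam={beam_out}",
            f"imsize={imsize}", f"cell={cell_arcsec}", "options=mfs,sdb"]


def primary_beam_region(primary_beam_arcsec):
    """region= string for a box spanning the primary beam, centred on the
    reference pixel (region syntax is an assumption to verify)."""
    half = primary_beam_arcsec / 2.0
    return f"region=arcsec,boxes({-half:g},{-half:g},{half:g},{half:g})"


if __name__ == "__main__":
    beam, cell = 600.0, 2.0  # arcsec, illustrative values only
    size = imsize_pixels(beam, cell)
    print(" ".join(invert_command("source.uv", "source.map", "source.beam",
                                  size, cell)))
    print(primary_beam_region(beam))
```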
Broadband imaging
• CenA at 12mm
Divide and Conquer
• Using uvsplit, you can break up your dataset into chunks of smaller bandwidth
• Use the new maxwidth option
• As calibration works quite well over 2 GHz, suggest splitting after calibration, just before imaging
• Imaging smaller bandwidth chunks has its advantages
• Fewer problems with fractional bandwidth effects
• Can measure spectral variation of source flux density
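A quick way to plan a divide-and-conquer run is to count how many chunks a given maxwidth implies. The sketch below assumes maxwidth is given in GHz and that the full band is the 2 GHz mentioned above. Exactly where uvsplit places the chunk boundaries is not covered in this talk, so this only gives the minimum number of output datasets. The dataset name is hypothetical.

```python
import math


def n_subbands(total_ghz, maxwidth_ghz):
    """Minimum number of chunks no wider than maxwidth_ghz."""
    return math.ceil(total_ghz / maxwidth_ghz)


def uvsplit_command(vis, maxwidth_ghz):
    """Build a uvsplit argument list using the maxwidth option."""
    return ["uvsplit", f"vis={vis}", f"maxwidth={maxwidth_ghz}"]


if __name__ == "__main__":
    width = 0.512  # GHz, an illustrative choice
    print(n_subbands(2.0, width), "chunks")
    print(" ".join(uvsplit_command("source.cal.uv", width)))
```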
Calibrators
• We are mindful of the need for wide-band characterisation of calibrators, and new mm calibrators
• Two projects are going to be doing this
• C007 Edwards
• C2050 Newton-McGee
• The first C007 data, taken immediately after the start of science operations, have been reduced and are available through the calibrator search tool
• 7mm fluxes for most mm sources across the sky
Experiment!
• Data coming from the CABB system is looking good, and we have some exciting results coming up in later talks
• Good time now, before your observations, to take some existing data and practice your reduction
Credit to Baerbel Koribalski
Summary
• CABB is on-track and looking good
• Getting experience in reducing the new data is vital
• Please communicate successes, failures to us so we can improve our understanding of the processes, and improve MIRIAD
• A big thanks to Warwick and all the people who made CABB a reality!
• Contact:
• Jamie Stevens re: data reduction
• Mark Wieringa re: MIRIAD CABB issues
• Anybody who has done CABB observing/reduction!
An Aside
Contact Us
Phone: 1300 363 400 or +61 3 9545 2176
Email: [email protected]
Web: www.csiro.au
Thank you
CSIRO ATNF Narrabri
Jamie Stevens
ATCA Senior System Scientist
Phone: 02 6790 4064
Email: [email protected]
Reduction
• 1934-638
Reduction
[Figure panels: Using normal clean | Using mfclean]
Interval Choice
• When running mfcal on a large dataset, it may stop and complain about running out of buffer space
• It will suggest that you increase interval
• Be very cautious about doing this, especially at higher frequencies
[Figure panels: interval=1 | interval=0.1]