
A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics, Spring 2015

http://www.astro.cornell.edu/~cordes/A6523

Lecture 22: –  Source finding in surveys –  Linear and nonlinear least squares

•  Notes Modeling2015.pdf on course web site (will be posted tonight)

•  Chapter 11 in Gregory (Nonlinear model fitting)

•  Chapter 29 of MacKay (Monte Carlo Methods): http://www.inference.phy.cam.ac.uk/mackay/itila/

Assignment 4 to be posted tomorrow

Astronomical Surveys

•  Detection in survey data •  Classification (empirical)

– Morphology – Spectral properties (continuum, line ratios, …) – Time-domain

•  Identification – Star, Galaxy, Asteroid, AGN, unknown

•  Characterization – Detailed parameterization

Surveys and Object Finding

•  Time domain (1D)

–  Pulsar surveys –  Gamma-ray bursts –  Extragalactic radio bursts of unknown origin

•  Image surveys (≥2D): sky locations + spectral –  Sloan Digital Sky Survey –  Radio surveys with the Very Large Array

–  NVSS (1990s) –  VLASS (now being proposed)

–  Spectral line surveys: –  Hydrogen, molecular lines –  SETI (search for extraterrestrial intelligence) –  Particle annihilation lines (e± , axion lines [dark matter], …)

–  Optical: LSST (Large Synoptic Survey Telescope) •  Combined time-domain and image surveys:

–  Transient sources –  Asteroids esp. near-Earth objects (including comets)

Common Threads

•  Detection of weak sources amid noise •  Source confusion

•  Multiple sources in the PSF and its sidelobes •  Overlap of sources on the sky

–  E.g. deep galaxy surveys

•  Known vs. unknown object structures – Known: matched filtering – Unknown: need data-adaptive algorithms

•  Want to maximize detections and minimize false positives (ROC curves)
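The trade-off between detections and false positives can be sketched for the simplest case: a threshold test on a single matched-filter output with unit-variance Gaussian noise. This is a minimal illustration, not a method from the notes; the function names `roc_point` and `roc_curve` are hypothetical, and the Gaussian-noise model is an assumption.

```python
import math


def roc_point(threshold, snr):
    """Return (P_fa, P_d) for a threshold test on a unit-variance Gaussian statistic.

    Assumption: noise-only samples are N(0, 1) and signal-present samples
    are N(snr, 1), as for an ideal matched-filter output.
    """
    # Probability that noise alone exceeds the threshold (false alarm).
    p_fa = 0.5 * math.erfc(threshold / math.sqrt(2.0))
    # Probability that signal plus noise exceeds the threshold (detection).
    p_d = 0.5 * math.erfc((threshold - snr) / math.sqrt(2.0))
    return p_fa, p_d


def roc_curve(snr, thresholds):
    """Trace the ROC curve by sweeping the threshold at fixed S/N."""
    return [roc_point(t, snr) for t in thresholds]


if __name__ == "__main__":
    for t in (1.0, 3.0, 5.0):
        p_fa, p_d = roc_point(t, snr=5.0)
        print(f"threshold={t:.1f}  P_fa={p_fa:.3e}  P_d={p_d:.3f}")
```

Raising the threshold moves along the curve toward fewer false positives at the cost of fewer detections; a stronger source (larger `snr`) pushes the whole curve toward the ideal corner.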

Time-frequency Domains

•  Pulsars, solar/stellar bursts, ETI sources, extragalactic bursts

•  All show “structure” in the 2D t-f plane – Fairly predictable for coherent pulsar/extragalactic pulses – Highly stochastic from other sources

Single pulse Searches

Cordes et al. (200[?])

For rectangular pulses of […] time sampling resolution

For fixed pulse area (Speak W = const.), the S/N goes as W^(−1/2).

As narrow […] time resolution is often intentionally degraded through smoothing (adding neighboring samples). The noise adds incoherently […] time series are subjected to S/N thresholding and a list of candidate events is generated.
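The smoothing-and-thresholding step on this slide can be sketched as a boxcar search: sum W neighboring samples, divide by σ√W (the noise adds incoherently), and keep anything above an S/N threshold. This is a minimal sketch under assumptions that are not in the slide: white noise of known rms `sigma`, a trial-width set of powers of two, and the hypothetical function names `boxcar_snr` and `single_pulse_search`.

```python
import math


def boxcar_snr(series, width, sigma):
    """Best S/N and start index over all positions of a boxcar of `width` samples."""
    best_snr, best_start = -math.inf, None
    running = sum(series[:width])
    # Summing `width` independent noise samples gives rms sigma * sqrt(width).
    norm = sigma * math.sqrt(width)
    for start in range(len(series) - width + 1):
        if start > 0:
            # Slide the window by one sample.
            running += series[start + width - 1] - series[start - 1]
        snr = running / norm
        if snr > best_snr:
            best_snr, best_start = snr, start
    return best_snr, best_start


def single_pulse_search(series, sigma, widths=(1, 2, 4, 8, 16), threshold=5.0):
    """Return candidate events as (snr, start, width), strongest first."""
    candidates = []
    for width in widths:
        if width > len(series):
            continue
        snr, start = boxcar_snr(series, width, sigma)
        if snr >= threshold:
            candidates.append((snr, start, width))
    return sorted(candidates, reverse=True)


if __name__ == "__main__":
    # Noise-free toy series: a rectangular pulse of amplitude 3 lasting 4 samples.
    series = [0.0] * 20 + [3.0] * 4 + [0.0] * 20
    print(single_pulse_search(series, sigma=1.0))
```

In the toy series the S/N peaks when the boxcar width equals the pulse width, which is the matched-filter result; narrower or wider boxcars lose a factor of the square root of the width mismatch.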

The Astrophysical Journal, 758:23 (14pp), 2012 October 10 Condon et al.

Figure 4. Noise levels in the final image. The rms noise σn in the ring of radius ρ in the sky image (dashed curve) and in the weighted P(D) distribution of points inside the circle of radius ρ. Abscissa: offset from the pointing center (arcmin). Ordinate: root-mean-square noise (µJy beam⁻¹).

Figure 6 is a profile plot of the central portion of our S/N-optimized wideband sky image. The intensity scale can be inferred from the highest peaks, which are truncated at Sp = 100 µJy beam⁻¹. This image is confusion limited in the sense that the rms fluctuations are everywhere larger than the noise levels plotted in Figure 4.

3. THE P (D) DISTRIBUTION

Historically, confusion was measured from the probability distribution P(D) of pen deflections D on a chart-recorder plot of fringe amplitudes from the single baseline of a two-element radio interferometer (Scheuer 1957). On a single-dish or aperture-synthesis array image, the corresponding “deflection” D at any pixel is the intensity in units of flux density per beam solid angle. The observed D at any point is the sum of the contribution from the noise-free source confusion and the contribution from image noise. The source confusion and noise contributions are independent of each other, so the observed P(D) distribution is the convolution of the source confusion and noise distributions, and the variance σo² of the observed P(D) distribution is the sum of the variances of the noise-free source confusion (σc²) and the noise (σn²) distributions:

$$\sigma_o^2 = \sigma_c^2 + \sigma_n^2. \tag{10}$$

The goal is to extract the source confusion distribution and its width

$$\sigma_c = \left(\sigma_o^2 - \sigma_n^2\right)^{1/2} \tag{11}$$

from the observed deflections and the measured noise. If σc ≪ σn, then ∂σc/∂σo ≈ (σo/σc) ≈ (σn/σc) ≫ 1 and small errors in the measured σo (or σn) cause large errors in the extracted σc. This is the reason that only fairly low-resolution (σc > σn) images can be used to measure confusion. Also, to the extent that other sources of error (e.g., uncleaned sidelobes) are present but not accounted for, Equation (11) will tend to overestimate the source confusion.
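Equations (10) and (11) and the error-amplification argument can be checked numerically. The sketch below is illustrative only; the function names and the example numbers are assumptions, not values from the paper.

```python
import math


def confusion_sigma(sigma_obs, sigma_noise):
    """Equation (11): sigma_c = sqrt(sigma_o^2 - sigma_n^2)."""
    if sigma_obs <= sigma_noise:
        raise ValueError("observed width must exceed the noise width")
    return math.sqrt(sigma_obs**2 - sigma_noise**2)


def error_amplification(sigma_obs, sigma_noise):
    """d(sigma_c)/d(sigma_o) = sigma_o / sigma_c, from differentiating Equation (11)."""
    return sigma_obs / confusion_sigma(sigma_obs, sigma_noise)


if __name__ == "__main__":
    # Hypothetical widths in arbitrary units: confusion-dominated vs noise-dominated.
    for sigma_o, sigma_n in ((5.0, 3.0), (5.0, 4.9)):
        sigma_c = confusion_sigma(sigma_o, sigma_n)
        amp = error_amplification(sigma_o, sigma_n)
        print(f"sigma_o={sigma_o}  sigma_n={sigma_n}  sigma_c={sigma_c:.3f}  amplification={amp:.2f}")
```

In the noise-dominated case a small fractional error in σo is multiplied several times over in σc, which is why only images with σc > σn are useful for measuring confusion.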

Figure 5. Effective frequencies in the final image. The effective frequency ⟨ν⟩ is the frequency at which the flux density of a source with spectral index α = −0.7 equals the local flux density in the ring of radius ρ in the sky image (dashed curve) or the average flux density of the weighted P(D) distribution of points inside the circle of radius ρ. Abscissa: offset from the pointing center (arcmin). Ordinate: effective frequency (GHz).

Figure 6. Confusion profile. This profile plot shows the 3 GHz confusion amplitude in an 8 arcsec FWHM beam, truncated at 100 µJy beam⁻¹.

The noise in our weighted sky image is low at the center but increases with radial distance ρ from the pointing center, and the noise eventually overwhelms the confusion at large ρ. On the other hand, increasing the radius ρ of the circular region inside which the P(D) distribution is measured increases the number N of beam solid angles sampled and thus acts to decrease the statistical uncertainty Δσo in the width of the observed P(D) distribution. For each thin ring of radius ρ covering N beam solid angles, we use Equation (10) to calculate

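$$(\Delta\sigma_o)^2 = \left(\frac{\partial\sigma_o}{\partial\sigma_c}\,\Delta\sigma_c\right)^2 + \left(\frac{\partial\sigma_o}{\partial\sigma_n}\,\Delta\sigma_n\right)^2, \tag{12}$$

where

$$\frac{\partial\sigma_o}{\partial\sigma_c} = \frac{\sigma_c}{\sigma_o} \quad\text{and}\quad \frac{\partial\sigma_o}{\partial\sigma_n} = \frac{\sigma_n}{\sigma_o}. \tag{13}$$

Then

$$\left(\frac{\Delta\sigma_o}{\sigma_o}\right)^2 = \frac{1}{\sigma_o^4}\left[\sigma_c^4\left(\frac{\Delta\sigma_c}{\sigma_c}\right)^2 + \sigma_n^4\left(\frac{\Delta\sigma_n}{\sigma_n}\right)^2\right], \tag{14}$$

where (Δσc/σc) ∼ N^(−1/2) and (Δσn/σn) = (2N)^(−1/2) (Equation (7)). In the limit σn ≫ σc,

$$\left(\frac{\Delta\sigma_o}{\sigma_o}\right)^2 \approx \left(\frac{1}{2N\sigma_o^4}\right)\sigma_n^4. \tag{15}$$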
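Equation (14) can be evaluated directly to see how the fractional uncertainty in σo falls with the number N of beam solid angles. This is a rough sketch: it treats the order-of-magnitude relation (Δσc/σc) ∼ N^(−1/2) as an equality, and the function name and example values are assumptions.

```python
import math


def fractional_sigma_obs_error(sigma_c, sigma_n, n_beams):
    """Evaluate Equation (14) for Delta(sigma_o) / sigma_o.

    Assumes (Delta sigma_c / sigma_c)^2 = 1/N (an order-of-magnitude
    relation in the text) and (Delta sigma_n / sigma_n)^2 = 1/(2N).
    """
    sigma_o_sq = sigma_c**2 + sigma_n**2  # Equation (10)
    variance = (sigma_c**4 / n_beams + sigma_n**4 / (2.0 * n_beams)) / sigma_o_sq**2
    return math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical widths in arbitrary units.
    for n_beams in (10, 100, 1000):
        frac = fractional_sigma_obs_error(sigma_c=1.0, sigma_n=1.0, n_beams=n_beams)
        print(f"N={n_beams:5d}  Delta(sigma_o)/sigma_o={frac:.4f}")
```

With σc set to zero the function reproduces the noise-dominated limit of Equation (15), where σo = σn and the fractional error is (2N)^(−1/2).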

Source Confusion

Sources fainter than 5σc cannot be identified individually, but analysis of the fluctuations due to the multiplicity of sources within the PSF yields information about the distribution of source amplitudes.

Active Galactic Nuclei

Cygnus A

Radio image VLA, NM

THE ASTRONOMICAL JOURNAL, 115:1693–1716, 1998 May. © 1998. The American Astronomical Society. All rights reserved. Printed in U.S.A.

THE NRAO VLA SKY SURVEY

J. J. Condon, W. D. Cotton, E. W. Greisen, and Q. F. Yin
National Radio Astronomy Observatory,¹ 520 Edgemont Road, Charlottesville, VA 22903; jcondon@nrao.edu, bcotton@nrao.edu, egreisen@nrao.edu, qyin@nrao.edu

R. A. Perley and G. B. Taylor
National Radio Astronomy Observatory,¹ P.O. Box O, Socorro, NM 87801; rperley@nrao.edu, gtaylor@nrao.edu

and

J. J. Broderick
Department of Physics, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061; jjb@vt.edu

Received 1997 November 25

ABSTRACT

The NRAO VLA Sky Survey (NVSS) covers the sky north of J2000.0 δ = −40° (82% of the celestial sphere) at 1.4 GHz. The principal data products are (1) a set of 2326 4° × 4° continuum “cubes” with three planes containing Stokes I, Q, and U images plus (2) a catalog of almost 2 × 10⁶ discrete sources stronger than S ≈ 2.5 mJy. The images all have θ = 45″ FWHM resolution and nearly uniform sensitivity. Their rms brightness fluctuations are σ ≈ 0.45 mJy beam⁻¹ ≈ 0.14 K (Stokes I) and σ ≈ 0.29 mJy beam⁻¹ ≈ 0.09 K (Stokes Q and U). The rms uncertainties in right ascension and declination vary from <1″ for the N ≈ 4 × 10⁵ sources stronger than 15 mJy to 7″ at the survey limit. The NVSS was made as a service to the astronomical community. All data products, user software, and updates are being released via the World Wide Web as soon as they are produced and verified.

Key words: catalogs – methods: data analysis – methods: observational – radio continuum – surveys

1. INTRODUCTION

We used the compact D and DnC configurations of the Very Large Array (VLA) between 1993 September and 1996 October to make 1.4 GHz continuum total intensity and linear polarization images covering the Ω ≈ 10.3 sr of sky with J2000.0 δ ≥ −40°. Additional observations were made during the fourth quarter of 1997 to fill small gaps in this coverage. The full NRAO VLA Sky Survey (NVSS) is based on 217,446 “snapshot” observations of partially overlapping primary-beam areas, each of which was imaged separately. The small snapshot images were weighted, corrected, and combined to yield a set of 2326 large (4° × 4°) image “cubes” whose third axes span the Stokes polarization parameters I, Q, and U. All NVSS images have a circular Gaussian point-source response whose full width between half-maximum (FWHM) points is θ = 45″, significantly larger than the median angular size (φ ≈ 10″) of faint extragalactic sources, in order to achieve the high surface-brightness sensitivity needed for flux-limited completeness and good photometric accuracy. The large images have rms brightness fluctuations σ ≈ 0.45 mJy beam⁻¹ ≈ 0.14 K rms in total intensity and σ ≈ 0.29 mJy beam⁻¹ ≈ 0.09 K in Stokes Q and U. The catalog of sources extracted from these images contains nearly 2 × 10⁶ objects, including radio galaxies and quasars, most of the galaxies in the IRAS Faint Source Catalog (Moshir et al. 1992), ultraluminous starburst galaxies even at cosmological distances, plus statistically useful numbers (N ≫ √N) of nearby “normal” galaxies, low-luminosity active galactic nuclei (AGNs), Galactic planetary nebulae, pulsars, and stars. Although the NVSS observing frequency was determined primarily by technical considerations, 1.4 GHz lies in the “intermediate-frequency gap” between higher frequencies associated with significant intrinsic source variability and lower frequencies at which there is strong variability caused by refractive interstellar scintillations (Spangler & Cotton 1981). Consequently, the NVSS images should accurately reflect the radio sky for decades to come.

All NVSS results are available to the entire astronomical community. The survey team reserves no proprietary rights to either the raw data or the final data products, because we believe that the scientific potential of large surveys can be realized only if all astronomers have full and immediate access to them. To encourage use of the NVSS, we have adopted the following policies for NVSS results: (1) Any astronomer may access the uncalibrated VLA “archive” data immediately, without waiting for the usual 1 year proprietary data period to end. (2) Our calibrated (u, v) data and uncorrected snapshot images are likewise available for copying at any time. (3) The principal data products of interest to most users (the corrected 4° × 4° images, the catalog of sources found on them, plus software for their use and any updates) are being released via the World Wide Web as soon as they are made and verified.² The NVSS team members have agreed to use only these electronically released results for their own research.

This paper reviews the known properties of faint radio source populations and explains how they influenced the NVSS design and scientific goals. It introduces a number of new techniques that were used for the NVSS observations, calibration, imaging, and source fitting. It also describes the error analysis and verification tests of the NVSS data products. Conscientious users who want to skip the details should at least read the user summary (§ 5.3) and recognize those limitations of the NVSS that are not well represented by simple Gaussian error distributions. Caveat emptor.

¹ The National Radio Astronomy Observatory is a facility of the National Science Foundation operated under cooperative agreement by Associated Universities, Inc.

² See http://www.nrao.edu.

2. SCIENTIFIC GOALS AND SURVEY DESIGN

Both very deep (S ≪ 1 mJy) and large-scale (Ω ≫ 1 sr) radio surveys already exist at 1.4 GHz, so we can use the known features of the relevant source populations to optimize the design of a moderately deep large-scale radio survey. Nearly all discrete radio sources more than 1° or 2° from the Galactic plane are extragalactic. The weighted differential counts of extragalactic sources found at ν = 1.4 GHz are shown in Figure 1.

There are two astrophysically distinct populations of extragalactic radio sources: (1) Over 99% of the stronger (S ≳ 60 mJy at 1.4 GHz) sources found in earlier large-scale surveys are classical radio galaxies and quasars powered by “monsters” (e.g., supermassive black holes) in AGNs. (2) An increasing fraction of fainter sources can be identified with low-luminosity AGNs and star-forming galaxies containing H II regions ionized by massive (M ≳ 8 M⊙), short-lived (τ < 3 × 10⁷ yr) stars and relativistic electrons accelerated by their supernova remnants (Condon 1992). These star-forming galaxies are not usually considered to be “true” AGNs, although ultraluminous (L > 2 × 10¹¹ L⊙ if H0 = 75 km s⁻¹ Mpc⁻¹) nuclear “starbursts” are often quite compact (r ∼ 100 pc) and similar in many other observables to AGNs powered by monsters (Sanders et al. 1988; Condon et al. 1991c). The majority of radio sources powered by stars lie in ordinary spiral galaxies. Their median face-on disk brightness temperature is just ⟨T⟩ ≈ 1 K at ν = 1.4 GHz, so they can be detected only by a relatively low-resolution survey, regardless of their distance. For example, T = 1 K corresponds to a peak flux density SP ≈ 3 mJy beam⁻¹ in the θ = 45″ FWHM Gaussian synthesized beam of the VLA D configuration but just SP ≈ 0.3 mJy beam⁻¹ for the θ = 15″ resolution of the C configuration. Reaching the same brightness sensitivity with the C configuration would require a factor of 10² increase in integration time.
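The brightness-temperature figures quoted above can be reproduced with the Rayleigh-Jeans relation S = 2kTν²Ω/c² and the solid angle of a circular Gaussian beam, Ω = πθ²/(4 ln 2). The sketch below is a check under those standard assumptions; the function names are illustrative and do not come from the paper.

```python
import math

K_BOLTZMANN = 1.380649e-23          # J/K
SPEED_OF_LIGHT = 2.99792458e8       # m/s
ARCSEC_TO_RAD = math.pi / (180.0 * 3600.0)


def gaussian_beam_solid_angle(fwhm_arcsec):
    """Solid angle (sr) of a circular Gaussian beam with the given FWHM."""
    theta = fwhm_arcsec * ARCSEC_TO_RAD
    return math.pi * theta**2 / (4.0 * math.log(2.0))


def peak_flux_mjy_per_beam(t_brightness_k, fwhm_arcsec, freq_hz):
    """Rayleigh-Jeans peak flux density (mJy/beam) for a source filling the beam."""
    omega = gaussian_beam_solid_angle(fwhm_arcsec)
    s_si = 2.0 * K_BOLTZMANN * t_brightness_k * freq_hz**2 / SPEED_OF_LIGHT**2 * omega
    return s_si / 1e-26 * 1e3   # W m^-2 Hz^-1 -> Jy -> mJy


if __name__ == "__main__":
    for fwhm in (45.0, 15.0):
        s = peak_flux_mjy_per_beam(1.0, fwhm, 1.4e9)
        print(f"T = 1 K, FWHM = {fwhm:4.1f} arcsec -> {s:.2f} mJy/beam")
    # Noise scales as t^(-1/2) and the beam area as theta^2, so matching
    # brightness sensitivity costs (45/15)^4 in integration time.
    print("integration-time factor:", (45.0 / 15.0) ** 4)
```

The two beams differ by a factor of 9 in solid angle, so the time penalty is 81, consistent with the quoted factor of about 10².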

FIG. 1. Weighted source counts at 1.4 GHz (data points) and models (Condon 1984) indicating the contributions of evolving “monsters” in powerful AGNs (dotted curve) and normal galaxies plus “starbursts” and low-luminosity AGNs (e.g., Seyfert galaxies; dashed curve) to the total (solid curve). The quantity n(S)dS is the number of sources per steradian with flux densities between S and S + dS, and weighting by S^(5/2) divides by the counts expected in a static Euclidean universe. Abscissa, flux density; ordinate, weighted source density.

Radio sources in star-forming galaxies generally obey a tight far-infrared (FIR) to radio luminosity correlation (Condon, Anderson, & Helou 1991a and references therein). Consequently, flux-limited samples of such galaxies selected in the radio and FIR bands are nearly identical. The median FIR-to-radio luminosity ratio for a λ = 60 µm sample is ⟨log(S60µm/S1.4GHz)⟩ ≈ 2.15 (Condon & Broderick 1986), so the NVSS should detect the majority of galaxies above the IRAS Faint Source Catalog (FSC2; Moshir et al. 1992) S60µm = 0.28 Jy completeness limit. The cumulative fractions f(<z) of star-forming galaxies with redshifts less than z are plotted for three different 1.4 GHz flux density levels in Figure 2. The stronger starburst sources are local (z ≪ 1), but deep radio surveys can detect luminous starbursts even at cosmological distances. Evolution is negligible above 100 mJy, detectable at 10 mJy, and quite significant below 1 mJy. The combination of the IRAS FSC2 and the NVSS may reveal new and exotic objects similar to the ultraluminous FSC 10214+4724, at z ≈ 2.3 (Rowan-Robinson et al. 1991), because the NVSS can provide more accurate positions for making unambiguous optical identifications of distant, luminous starbursts in optically faint galaxies. Many ultraluminous starbursts are so compact that they are transparent only in the radio and FIR bands (Condon et al. 1991c), so radio or FIR observations are essential for quantitative studies of these objects.

Even the fairly sensitive Green Bank sky survey detected only ∼30 “starburst” sources per steradian among the 10⁴ sr⁻¹ sources stronger than 25 mJy at 4.85 GHz (equivalent to about 60 mJy at 1.4 GHz). At this level, only one in 300 extragalactic radio sources is powered by stars; such sources are almost as rare as gravitational lenses. There are just enough examples known for the simplest statistical tests. For example, Cox et al. (1988) showed that the FIR-radio correlation is nonlinear using a sample of spiral galaxies brighter than m ∼ 14.5 detected in the 6C 151 MHz surveys. The NVSS should multiply the number of known radio sources in this star-forming population by 10².

Classical radio sources powered by AGNs are typically 2 orders of magnitude rarer (per unit volume) and more luminous than those in star-forming galaxies. Cosmological evolution dominates their redshift distribution even at high

FIG. 2. The cumulative fraction f(<z) of radio sources with redshifts less than z that are powered by stars changes dramatically as the 1.4 GHz flux density falls from 100 mJy through 10 to 1 mJy. These curves are based on the evolutionary model in Condon (1984). Abscissa, redshift z; ordinate, fraction of sources with redshift less than z.


1702 CONDON ET AL. Vol. 115

position in the primary beam. Such variations should be time-independent, so these observations were needed only once per major observing cycle. Holography-mode scans covering the 11 × 11 point grid of all azimuth and elevation offsets up to 25′ in multiples of 5′ were made on several strong point sources, and images based on these scans were used to map the VLA instrumental polarization over the primary-beam area. The results demonstrate that the instrumental polarization rotates with parallactic angle as predicted and does not vary significantly with time. The instrumental polarization is also independent of elevation angle, as expected since the change of antenna gain with elevation is less than 0.5% at 1.4 GHz. The data were averaged to yield beam polarization maps showing the total intensity beam and the fractional circular polarization beam (Fig. 13), as well as the fractional Q and U polarization beams (Fig. 14). The VLA antennas are asymmetric and have off-axis feeds that respond to the two circular polarizations. The total intensity beam map is nearly circular, with FWHM = […] ± […]. The V polarization map agrees well (within 1%) with the expected (Chu & Turrin 1973) beam squint. It shows that the circularly polarized beams are separated by ε = […] ± […] along position angle 134°.6 ± 1°. While this beam squint is large enough to preclude sensitive circular polarization imaging, it does not significantly alter the total intensity beam formed by averaging the two circularly polarized beams. For a separation ε between circular Gaussian circularly polarized beams with FWHM θc ≫ ε, the major-axis FWHM θM of the total intensity beam is

$$\theta_M/\theta_c \approx \left[1 + 2\ln 2\,(\epsilon/\theta_c)^2\right]^{1/2}, \tag{11}$$

so squint broadens the total intensity primary beam by only about 4″.

4. THE NVSS SNAPSHOT IMAGES

A VLA snapshot observation is quick but the synthesized beam is dirty. We have written special AIPS tasks and procedures to automate the production of snapshot images, enforce uniform quality standards, and reduce opportunities for human error. The dirty beam has such high sidelobe levels that cleaning must be done very carefully. The low-resolution, wideband, wide-field NVSS snapshot images cannot be made without stretching or violating many of the usual assumptions behind standard imaging algorithms. We made several first-order corrections to minimize the resulting errors.

4.1. Automated Imaging

The snapshot images were all made by a special AIPS task, VLAD. It reads the calibrated multisource (u, v) data file and successively images all of the survey snapshots from one night’s observing. After splitting out the data from one snapshot and applying the calibration table, VLAD flags those correlators that have excessively large rms fluctuations among their 3 1/3 s samples. This prevents strong interference from corrupting the images. The surviving (u, v) data are time-averaged over the snapshot duration and imaged. The main image field is normally a square 512 pixels × 12″ pixel⁻¹ ≈ 102′ on a side to cover the main lobe of the primary beam plus most of the first sidelobe. The (u, v) data are convolved with a spheroidal function that effectively suppresses aliased images inside the field from sources lying outside the field, but it cannot eliminate those sidelobes that do fall inside the image. Outside sources whose flux densities exceed ≈2 mJy after attenuation by the primary beam can leave noticeable sidelobes in the main field, so they must be imaged and cleaned separately. Figure 15 shows the extended primary-beam response of a uniformly illuminated circular aperture 24.5 m in diameter, a good approximation to the measured VLA primary pattern. VLAD therefore images and cleans additional small fields centered on the positions of all sources whose flux densities exceed the values indicated by the dotted curve in Figure 15. It does so by reading a merged list of sources taken from (in order of preference) the Becker, White, & Edwards (1991) catalog of sources found on the 1400 MHz Green Bank sky images, the Parkes-MIT-NRAO 4.85 GHz survey catalogs of sources in the southern (Gregory et al. 1994), tropical (Griffith et al. 1994), and equatorial (Griffith et al. 1995) zones, the 6C 151 MHz north polar catalog (Baldwin et al. 1985), and the Parkes PKSCAT90 source list (Wright & Otrupcek 1990).

Clean components from images containing peaks brighter than 100 mJy beam⁻¹ are used to phase self-calibrate the (u, v) data, and VLAD automatically performs both amplitude and phase self-calibration on fields containing sources brighter than 400 mJy beam⁻¹. The largest clean components are subtracted from the (u, v) data. Residual amplitudes exceeding half the interference-flagging levels indicate deviant data points, which are flagged before the field is reimaged and recleaned. To minimize the effects of pixel quantization on those images containing a source brighter than 500 mJy beam⁻¹, VLAD measures the position of the brightest peak and reimages at a slightly shifted position so that the brightest peak falls precisely on a pixel center.

The (u, v)-plane coverage of a single VLA snapshot is very poor. Although “natural” weighting theoretically minimizes image noise, the heavy concentration of short VLA baselines produces a broad pedestal under the central peak of the synthesized, or dirty, beam. This pedestal makes cleaning difficult, and the integrated flux densities of extended sources are often affected. In addition, natural weighting produces six very strong diffraction spikes because there are so many intra-arm baselines. We therefore used “superuniform” weighting with UVBOX = 5 (see Sramek & Schwab 1989) to alleviate these effects, at a cost of less than 5% increase in image noise. The resulting synthesized beam has a nearly Gaussian main lobe with FWHM about 45″, small nearby sidelobes with no pedestal (Fig. 16), and slightly reduced diffraction spikes (Fig. 17). Even so, the amplitudes of these diffraction spikes approach 0.3 of the peak at offsets around 0°.3, the grating-ring radius corresponding to the shortest D-configuration spacings, of about 37 m.

FIG. 15. The upper envelope (dotted curve) of the theoretical power pattern of a 24.5 m circular aperture at ν = 1.4 GHz (solid curve) was used to determine the maximum angular distance ρ at which a source of flux density S could have an apparent flux density ≥2 mJy. Such sources must be imaged and cleaned separately. Abscissa, angular distance from the pointing center; left ordinate, primary beam attenuation (decibels); right ordinate, minimum flux density of a source requiring separate imaging.

FIG. 16. With superuniform weighting, the synthesized beam has a nearly Gaussian main lobe whose FWHM is about 45″. The nearby sidelobes are fairly small and have nearly zero mean. Abscissa, right ascension offset; ordinate, declination offset.

FIG. 17. By far the largest sidelobes of the synthesized beam lie along three narrow diffraction spikes normal to the VLA arms. The broad peaks near offset ±0°.3 are grating lobes corresponding to the shortest projected spacings, about 37 m ≈ 170λ. Abscissa, offset from beam center (deg); ordinate, relative power (dimensionless).

The large diffraction sidelobes must be cleaned very carefully. The cleaning must be fairly deep, lest the residual image fluctuations be dominated by uncleaned sidelobes rather than by receiver noise. Cleaning too deeply causes a new problem, systematic suppression of signals by “clean bias.” One aspect of clean bias is already known to VLA users. Since the sidelobes are much broader than the main peak of the dirty beam, the total area under the intersections of the diffraction spikes and the grating ring can be larger than the area under the central peak of the synthesized beam (Fig. 17). In a dirty image, an extended source may therefore appear brighter at these intersections than at the correct position, and cleaning will put some clean components onto the six intersections. Each time the synthesized beam’s response from such a clean component is subtracted from the dirty image, flux is subtracted from the true source position as well. The fraction of the clean-component flux erroneously subtracted from the source position is equal to the relative amplitude of the sidelobe, up to 0.3 for the synthesized beam shown in Figure 17. The result is a clean image in which some of the source flux has been moved from the correct position to six ghost images at the intersections of the diffraction spikes with the grating ring. If the dirty image is dominated by a single extended source at the center, the most effective cure for this problem is to restrict cleaning to a small region containing the source but not the grating ring. This cannot normally be done in a survey, because there are many sources spread over the whole primary beam, which is much larger than the grating ring. Fortunately, few extragalactic sources are larger in both dimensions than several times the D-configuration synthesized beamwidth, θ ≈ 45″.

Unfortunately, the largest sidelobes in the snapshot synthesized beam affect even point sources in the presence of noise. It is clear from Figure 17 that the peak corresponding to a strong (S ≫ σ) point source can easily be located in a



FIG. 32. Simulations (filled circles) with artificial point sources indicate that the incremental completeness of the NVSS catalog is 50% at S = 2.5 mJy. The variation of incremental completeness with flux density is consistent with the integral of a Gaussian error distribution whose standard deviation is 0.45 mJy (curve), the rms noise and confusion error for a point source. Abscissa, true source flux density; ordinate, incremental completeness (dimensionless).
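The curve described in this caption, the integral of a Gaussian with standard deviation 0.45 mJy that reaches 50% at 2.5 mJy, can be written with the error function. The sketch below only encodes the caption's description; the function name is an assumption and it is not the survey's own completeness code.

```python
import math


def incremental_completeness(flux_mjy, s50_mjy=2.5, sigma_mjy=0.45):
    """Cumulative Gaussian completeness model described in the Fig. 32 caption."""
    z = (flux_mjy - s50_mjy) / (sigma_mjy * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


if __name__ == "__main__":
    for s in (2.0, 2.5, 3.0, 3.5):
        print(f"S = {s:.1f} mJy  completeness = {incremental_completeness(s):.3f}")
```

Under this model the catalog is essentially complete two standard deviations above the 50% point, near 3.4 mJy.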

of the linearly polarized brightness is biased upward by image noise. If the rms noise in the Q and U images is σQ,U, then the probability of estimating L′ given a true linearly polarized brightness L is

$$P(L' \mid L) = \frac{L'}{\sigma_{Q,U}^2}\, I_0\!\left(\frac{L L'}{\sigma_{Q,U}^2}\right) \exp\!\left(-\frac{L^2 + L'^2}{2\sigma_{Q,U}^2}\right) \tag{45}$$

(Vinokur 1965), where I0 is the modified Bessel function. The mean of L′ given L,

$$\langle L' \mid L \rangle = \int_0^\infty L'\, P(L' \mid L)\, dL', \tag{46}$$

is greater than L, so we subtract the polarization bias ΔL′ ≡ ⟨L′|L⟩ − L. Introducing the normalized brightnesses l′ ≡ L′/σQ,U and l ≡ L/σQ,U yields

$$\langle l' \mid l \rangle = \exp\!\left(-\tfrac{1}{2}l^2\right) \int_0^\infty x^2\, I_0(l x)\, \exp\!\left(-\tfrac{1}{2}x^2\right) dx, \tag{47}$$

which reduces to

$$\langle l' \mid l \rangle = (\pi/2)^{1/2} \exp\!\left(-\tfrac{1}{2}l^2\right) M\!\left(\tfrac{3}{2}, 1, \tfrac{1}{2}l^2\right) \tag{48}$$

(Prudnikov, Brychkov, & Marichev 1986), where M is the confluent hypergeometric function. For l < 4, we approximated M by the (slowly converging) power series in Jahnke & Emde (1945, p. 275). For l > 4, we used the asymptotic expansion of M: ⟨l′|l⟩ − l ≈ (2l)⁻¹ + (8l³)⁻¹ (Jahnke & Emde 1945).
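Equation (48) and the asymptotic bias can be evaluated with a direct power series for M(a, b, z). This is a sketch of the calculation described above, not the NVSS software: the helper names `kummer_m`, `mean_measured`, and `polarization_bias` are hypothetical, and the series is summed term by term with no claim to match the Jahnke & Emde tabulation.

```python
import math


def kummer_m(a, b, z, rel_tol=1e-15, max_terms=500):
    """Confluent hypergeometric function M(a, b, z) by its power series."""
    term = 1.0
    total = 1.0
    for n in range(1, max_terms):
        # Ratio of successive terms: (a + n - 1) z / ((b + n - 1) n).
        term *= (a + n - 1.0) * z / ((b + n - 1.0) * n)
        total += term
        if abs(term) < rel_tol * abs(total):
            break
    return total


def mean_measured(l_true):
    """Equation (48): mean normalized measured brightness <l'|l>."""
    z = 0.5 * l_true * l_true
    return math.sqrt(math.pi / 2.0) * math.exp(-z) * kummer_m(1.5, 1.0, z)


def polarization_bias(l_true):
    """Normalized bias <l'|l> - l: series below l = 4, asymptotic form above."""
    if l_true < 4.0:
        return mean_measured(l_true) - l_true
    return 1.0 / (2.0 * l_true) + 1.0 / (8.0 * l_true**3)


if __name__ == "__main__":
    for l_true in (0.0, 1.0, 2.0, 3.0, 4.0, 6.0):
        print(f"l = {l_true:.1f}  bias = {polarization_bias(l_true):.4f}")
```

At l = 0 the mean reduces to (π/2)^(1/2), the Rayleigh mean for pure noise, and near l = 4 the series and the asymptotic expansion agree to better than 1%, so the switch between them is smooth.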

The bias-corrected polarized brightness is listed by the fitted version of the NVSS catalog. Note that it can occasionally have (unphysical) negative estimated values. To approximate the bias-corrected polarized flux density of a slightly extended source, we simply multiplied the corrected polarized brightness by the ratio θMθm/θ² of the total intensity fitted area to the beam area. This quantity is given by the deconvolved version of the catalog. It may not be accurate for those very extended (θM ≫ θ) sources that have complex polarization structure on scales larger than the beam area.

The rms uncertainty in the corrected polarized brightness is

$$\sigma_P^2 \approx 2\sigma_{Q,U}^2 + \epsilon_P^2 A_P^2, \tag{49}$$

where εP is the residual instrumental polarization. If all strong sources were unpolarized, the distribution of their measured percentage polarizations would peak at εP. Thus, the peak in the observed polarization distribution yields an upper limit to εP. We found εP ≈ 0.12% for a large sample of sources stronger than 1 Jy. Since sources often have intrinsic polarizations greater than this, we cannot rule out the possibility that the instrumental polarization is occasionally much larger than εP. The rms uncertainty in the polarized flux density is σ(L)(θMθm/θ²).

If the detection of polarized flux is significant at the 98% level, the polarization E-vector position angle PPA measured east from north is also given by both user versions of the NVSS catalog. The probability of measuring $l' > l_0$ with no polarized flux present is $\exp(-\tfrac{1}{2} l_0^2)$, so our criterion for a significant detection is

$$L' > (2 \ln 50)^{1/2}\, \sigma(L) . \qquad (50)$$

The polarization position angle distribution of significantly polarized sources is nearly Gaussian with rms uncertainty

$$\sigma(\mathrm{PPA}) = \frac{90\, \sigma_{Q,U}}{\pi (L' - \Delta L')} \ \mathrm{deg} . \qquad (51)$$
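The detection criterion of eq. (50) and the position-angle uncertainty of eq. (51) are simple enough to evaluate directly. The sketch below does so; the function names and the sample numbers are illustrative assumptions, not catalog values.

```python
import math

def detection_threshold(sigma_l, false_alarm=0.02):
    """Smallest L' counted as significant: solves exp(-l0^2 / 2) = false_alarm
    for l0 and scales by sigma_l. A 98% significance level corresponds to
    false_alarm = 0.02, giving the factor (2 ln 50)^(1/2)."""
    return math.sqrt(2.0 * math.log(1.0 / false_alarm)) * sigma_l

def ppa_uncertainty_deg(l_obs, bias, sigma_qu):
    """rms position-angle uncertainty in degrees, following eq. (51)."""
    return 90.0 * sigma_qu / (math.pi * (l_obs - bias))

factor = detection_threshold(1.0)                 # threshold in units of sigma(L)
false_alarm_check = math.exp(-0.5 * factor ** 2)  # should recover 0.02

# Hypothetical source: bias-corrected brightness ten times the Q, U noise.
sigma_ppa = ppa_uncertainty_deg(l_obs=3.0, bias=0.1, sigma_qu=0.29)
```

The threshold works out to roughly 2.8 σ(L), and a source whose bias-corrected polarized brightness is ten times σ_{Q,U} has a position-angle uncertainty of about 2.9 degrees.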

5.3. User Summary

The NVSS images now cover nearly all of the sky north of δ = −40° at 1.4 GHz with 45″ FWHM resolution in Stokes I (total intensity), Q, and U (linear polarization). The NVSS images, the catalog of sources on them, and user software are all available via the World Wide Web. Some examples of NVSS sources are shown in Figure 24 and discussed above in § 5.

The rms noise and confusion is σ ≈ 0.45 mJy beam⁻¹ ≈ 0.14 K on the I images, except near strong sources. The dynamic range is 1000:1 locally and higher over scales greater than 30′. The Q and U images generally are noise limited, with σ ≈ 0.29 mJy beam⁻¹ ≈ 0.09 K, but the fluctuations are larger near the Galactic plane. The images are insensitive to smooth radio structures much larger than several arcminutes in both coordinates (diffuse synchrotron emission from the Galaxy and the 3 K cosmic microwave background, for example). Smooth sources somewhat larger than that in the bottom left panel of Figure 24 may be visible, but their image flux densities will probably be low. The image positions are known to be slightly offset (⟨Δα⟩ = +0″.025 ± 0″.022 and ⟨Δδ⟩ = −0″.113 ± 0″.027), and clean bias (§ 4.1) depresses the peak flux densities of most sources by about ΔS_P = −0.3 mJy beam⁻¹ on the images.

The NVSS “source” catalog lists all discrete image peaks with uncorrected brightness S_P ≥ 2 mJy beam⁻¹. Extended radio sources may be represented by two or more fitted peaks. There are two user versions of the NVSS catalog, both of which list positions and flux densities corrected for known biases. The “fitted” version returns peak flux densities and fitted sizes; the “deconvolved” version returns integrated flux densities and estimates of source sizes after the point-source response has been deconvolved from the

Isotropy of strong sources

NVSS

Isotropy of faint NVSS sources

VLA Sky Survey

The Jansky-Very Large Array Sky Survey (VLASS)

The VLA Survey Science Group∗

Submitted: January 15, 2015; Rev. 1: February 5, 2015

Contents

1 Executive Summary/Overview
2 VLASS – A Launchpad for the Future
3 VLASS Themes and Headline Science
    3.1 Imaging Galaxies Through Time and Space
    3.2 Radio Sources as Cosmological Probes
    3.3 Hidden Explosions
    3.4 Faraday Tomography of The Magnetic Sky
    3.5 Peering Through Our Dusty Galaxy
    3.6 Missing Physics
    3.7 A Lasting Legacy into the SKA Era
4 Survey Strategy
    4.1 All-Sky
        4.1.1 Context
        4.1.2 Description
        4.1.3 Survey Science – Extragalactic
        4.1.4 Survey Science – Galactic
    4.2 Deep
        4.2.1 Context
        4.2.2 Description
        4.2.3 Survey Science
5 Data Products, Observing, and Implementation Plan
    5.1 Data Products
        5.1.1 Basic Data Products
        5.1.2 Enhanced Data Products
        5.1.3 Enhanced Data Services and the VLASS Archive
    5.2 Observing

∗A full list of those who directly contributed ideas and/or writing that led to this proposal can be found in Appendix A.

1 Executive Summary/Overview

The Very Large Array Sky Survey (VLASS) is a community-driven project initiated to develop and carry out a next-generation large radio sky survey using the recently upgraded Karl G. Jansky Very Large Array (VLA). VLASS will open the radio sky to a new exploration of the time and spectral domains. VLASS was developed through unprecedented community involvement and consensus building, including a public workshop at the AAS, the submission of 22 white papers and long competitive debate in the Survey Science Group (SSG), along with its community working groups of more than 200 multi-wavelength astronomers (see Appendix A). A careful internal NRAO scientific and technical review then provided critical additional input, leading to the optimization of the observing plan. The resulting VLASS survey definition is designed to provide a broad, cohesive science program that will both deliver forefront scientific impact in identified science themes and generate unexpected scientific discoveries. VLASS will engage radio astronomy experts, multi-wavelength astronomers and citizen scientists alike, leaving a lasting legacy value for decades to come. The data from VLASS will be available in the NRAO archive immediately with no proprietary period and science data products will be provided to the community in a timely manner. The design of VLASS paid extremely close attention to future Square Kilometer Array (SKA) pathfinders, leading to a survey that will both stand out, and be complementary to those other pathfinder surveys at 1.4 GHz.

The proposed VLASS is a ∼9000 hour comprehensive two-tiered survey, with 2/3 invested in All-Sky and 1/3 in Deep, as summarized in Table 1, and detailed in §4 below. Using the VLA to capture the radio spectrum from 2–4 GHz, VLASS will measure or constrain spectral shapes (e.g. power-law spectral indices), simultaneously providing full polarization parameterization to enable science that showcases the unique capabilities of the Jansky VLA. VLASS as proposed surveys the whole Jansky VLA-accessible sky through the All-Sky component, providing full spectral and polarimetric data for a myriad of targets and source types, addressing a broad range of scientific questions, individually and statistically. At the same time, through the Deep Component, VLASS will address key science themes of our generation including the nature of dark energy and cosmological bias, questions which can only be addressed through very deep systematic observations over a sufficiently large portion of the sky. Neither of these goals can be reasonably achieved through individual PI programs, as the scale of the survey proposed is orders of magnitude more complex than previous programs. Both components make optimal utilization of the Jansky VLA’s unique capabilities: high resolution imaging and exquisite point-source sensitivity, critical for source identification; wide bandwidth coverage, enabling instantaneous spectral index determination; and full polarimetry with good performance even in lines of sight with high Faraday depth, enabling instantaneous rotation measure and Faraday structure determinations. VLASS will be carried out in multiple passes, providing a synoptic view of the dynamic radio sky similar to those now available through the new generation of synoptic imagers at other wavelengths. VLASS will provide unique measurements of the radio sky at key epochs and sensitivity levels between that from FIRST and NVSS and the new upcoming radio surveys. This will be a critical enabler for early identification and filtering for the most interesting transient events.

Table 1: VLASS Survey Definition Summary

Tier             Area (deg²)                     Resolution (arcsec)   rms (µJy/beam)   Total Time (hr)   Epochs
(1) “All-Sky”    33,885 (δ > −40°)               2.5                   69               5436              3
(3) Deep         10 (COSMOS, ECDFS, ELAIS-N1)    0.8                   1.5              3391              4

Note: Resolution and sensitivities assume robust weighting to achieve near-natural weighted sensitivity with suppressed side lobes.


Figure 1: The anticipated cumulative extragalactic source count distribution by VLASS tier using the S3 simulated radio sky (Wilman et al., 2008). The vertical line indicates the 5σ surface brightness sensitivity of each tier. There are two separate limits for the Deep tier, as the very southern ECDFS field will have an elongated synthesized beam.

Table 2: Expected VLASS Source Statistics

Tier       Area (deg²)   Density (deg⁻²)   Total Detections
All-Sky    33,885        290               9,700,000
Deep       10            9500              95,000

Note: Sky coverage, expected extragalactic source density, and total number of extragalactic detections per VLASS tier above a 5σ surface brightness limit.

As an indication of the power of VLASS, Figure 1 and Table 2 provide an estimate of the expected extragalactic source counts from VLASS based on the S3 simulated radio sky (Wilman et al., 2008).

Why is a 2–4 GHz, high-resolution, synoptic, polarimetry survey able to deliver such high-value science?

• High angular resolution is required to not only identify and associate the radio emissionwith its optical host galaxy, but also to identify the location of the emission within the galaxy.

• The 2–4 GHz band allows polarimetry on lines of sight with high Faraday depth without depolarization and will enable much broader band Faraday diagnostics in advance of, and complementing, upcoming SKA precursor surveys.

• The 2–4 GHz band allows for earlier identification of explosive event afterglows when they are brighter, and unlike at 1–2 GHz, allows for multiple independent epochs occurring within the survey time span.

• Observing at 2–4 GHz provides the highest yield of flat or inverted spectrum compact sources while still detecting a large number of sources with “normal” modestly steep spectra.

• Employing the 2–4 GHz band at high angular resolution will result in a survey that will provide maximum complementary utility when combined with the lower-frequency, lower-resolution surveys planned for the SKA precursors and pathfinders later this decade, while additionally doing key SKA Phase 1 science years before its completion.

• Even in the era of the SKA Phase 1 science observing next decade, VLASS will provide a reference epoch this decade for transient object identification, as well as coverage of the Northern sky not accessible to those instruments.


Source detection with Aegean

Source Detection 16 Apr 2013

Aegean: Part of Paul Hancock’s thesis work with Murphy, Gaensler, et al.

(Hancock et al. 2012, MNRAS, 422, 1812)

Continuum Imaging Surveys

General problem:
– Find heterogeneous sources amid noise + image defects

Prior information?
– Blind surveys: find any source subject to realistic levels of false detections and false negatives
– Targeted surveys: partial knowledge of source properties
    •  Source shapes: e.g. all point sources?
    •  Locations of sources: e.g. optical surveys first, another wavelength second (and vice versa)

Blind Surveys

•  Generally have only weak constraints on source properties (e.g. maximum size)
•  Parametric approach:
    – To approximate matched filtering, a family of templates is needed (different shapes, sizes)
        •  Test statistic
        •  (S/N)_TS = S/N of the cross-correlation function (CCF) of template and data
        •  (S/N)_TS ~ [S/N of source in image] × [N_pixels in template]^(1/2)
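The square-root scaling of the test statistic can be checked with a short calculation. This is a minimal sketch, assuming white noise of equal rms in every pixel, a top-hat source that exactly matches the template, and "S/N of source in image" read as the per-pixel S/N; the function name is hypothetical.

```python
import math

def matched_filter_snr(template, amplitude, sigma):
    """Expected S/N of the zero-lag cross-correlation between `template`
    and data equal to amplitude * template plus white noise of rms
    `sigma` per pixel (assumed uncorrelated between pixels)."""
    energy = sum(t * t for t in template)
    signal = amplitude * energy              # sum_i t_i * (a * t_i)
    noise_rms = sigma * math.sqrt(energy)    # rms of sum_i t_i * n_i
    return signal / noise_rms

n_pix = 25
tophat = [1.0] * n_pix           # flat template covering 25 pixels
per_pixel_snr = 0.8 / 1.0        # amplitude / sigma: undetectable pixel by pixel
snr_ts = matched_filter_snr(tophat, amplitude=0.8, sigma=1.0)
```

Here a source with per-pixel S/N of 0.8 reaches (S/N)_TS = 0.8 × 25^(1/2) = 4. For a template that is not flat, the pixel count is replaced by the sum of squared template values, so a mismatched template recovers less than this.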

Blind Surveys

See summary in Hancock et al. 2012, MNRAS, 422, 1812

Nonparametric methods:
•  Friends-of-friends algorithms:
    –  Find all pixels > initial threshold
    –  Aggregate pixels that are neighbors into a source “island”
    –  Can allow gaps between pixels (esp. in low S/N case)
    –  Apply S/N threshold to source island
•  SExtractor (Source Extractor):
    –  Developed for scanned optical images
    –  Finds source islands
    –  Characterizes islands with Npix > some number of interest
    –  Estimates background RMS from empty regions in image
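The friends-of-friends steps listed above can be sketched as a connected-component search followed by an island-level S/N cut. This is an illustration, not the code of any package named on the slide: the 8-neighbor rule, the absence of gap tolerance, the island S/N definition (summed flux over σ√N), and the toy image are all assumptions.

```python
import math
from collections import deque

def find_islands(image, threshold):
    """Group pixels above `threshold` into islands of 8-connected neighbors."""
    n_rows, n_cols = len(image), len(image[0])
    seen = set()
    islands = []
    for r in range(n_rows):
        for c in range(n_cols):
            if image[r][c] <= threshold or (r, c) in seen:
                continue
            island, queue = [], deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                island.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < n_rows and 0 <= nx < n_cols
                                and (ny, nx) not in seen
                                and image[ny][nx] > threshold):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
            islands.append(island)
    return islands

def island_snr(image, island, sigma):
    """Summed island flux divided by the noise on that sum (white noise assumed)."""
    flux = sum(image[r][c] for r, c in island)
    return flux / (sigma * math.sqrt(len(island)))

# Toy image in units of the background rms (sigma = 1).
image = [
    [0, 1, 0, 0, 0, 0],
    [1, 5, 4, 0, 0, 0],
    [0, 4, 6, 0, 0, 3],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 4, 5, 0],
    [0, 0, 0, 0, 4, 0],
]
islands = find_islands(image, threshold=2)
sizes = sorted(len(island) for island in islands)
kept = [isl for isl in islands if island_snr(image, isl, sigma=1.0) > 5.0]
```

The initial threshold yields three islands, and the island-level S/N cut then discards the isolated single pixel while keeping the two multi-pixel islands.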

Nonparametric methods (continued):
•  IMSAD (Image Search and Destroy)
    –  Threshold to find source islands
    –  Fit Gaussian shape to island
•  SELAVY
    –  Developed for ASKAP survey
    –  A mixture of island-finding methods for images and data cubes (image + wavelength)
    –  Fits for multiple Gaussian components
•  SFIND
    –  Detection threshold based on false-alarm probability
    –  Deals with heterogeneous background statistics by analyzing subregions
    –  Rejects sources not fittable with a 2D Gaussian (biased?)

Nonparametric methods (continued):
•  Floodfill:
    –  Works top down: finds strong pixels first, then grows the region around each source if warranted
    –  Uses two thresholds, Tseed and Tflood
        •  Find pixels above Tseed
        •  Look at immediately adjacent pixels: if > Tflood, include them in the island started by the seed pixel
        •  Keep repeating until no adjacent pixels exceed Tflood
•  Blobcat and Aegean:
    –  Find islands using floodfill and characterize them without (Blobcat) and with (Aegean) source substructure

Source Identification: Flood Fill

[Figure: six panels (A to F) showing successive stages of flood fill on the same 6 × 6 pixel image. Legend: unprocessed pixels, pixels in the island, pixels being considered, background pixels. Pixel values:]

4 6 7 6 4 2
5 7 8 7 5 3
4 6 7 6 4 3
3 4 5 4 3 2
2 2 3 2 2 2
1 1 1 1 1 2
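The flood-fill procedure can be sketched on the 6 × 6 pixel values from the figure. This is an illustration under stated assumptions, not the Blobcat or Aegean implementation: the threshold values (Tseed = 7, Tflood = 4), the strict "greater than" comparisons, and the 4-neighbor definition of "immediately adjacent" are all choices made here.

```python
from collections import deque

# Pixel values of the 6 x 6 example image on the flood-fill slide.
IMAGE = [
    [4, 6, 7, 6, 4, 2],
    [5, 7, 8, 7, 5, 3],
    [4, 6, 7, 6, 4, 3],
    [3, 4, 5, 4, 3, 2],
    [2, 2, 3, 2, 2, 2],
    [1, 1, 1, 1, 1, 2],
]

def flood_fill(image, t_seed, t_flood):
    """Return a list of islands, each a set of (row, col) pixels."""
    n_rows, n_cols = len(image), len(image[0])
    # Seeds are visited brightest first ("works top down").
    seeds = sorted(
        ((r, c) for r in range(n_rows) for c in range(n_cols)
         if image[r][c] > t_seed),
        key=lambda rc: image[rc[0]][rc[1]], reverse=True)
    assigned = set()
    islands = []
    for seed in seeds:
        if seed in assigned:
            continue  # already absorbed into an earlier island
        island = {seed}
        assigned.add(seed)
        queue = deque([seed])
        while queue:
            r, c = queue.popleft()
            # Grow into the four immediately adjacent pixels above t_flood.
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if (0 <= nr < n_rows and 0 <= nc < n_cols
                        and (nr, nc) not in assigned
                        and image[nr][nc] > t_flood):
                    assigned.add((nr, nc))
                    island.add((nr, nc))
                    queue.append((nr, nc))
        islands.append(island)
    return islands

islands = flood_fill(IMAGE, t_seed=7, t_flood=4)
island = islands[0]
total_flux = sum(IMAGE[r][c] for r, c in island)
```

With these thresholds the single seed (the pixel of value 8) grows into one island of 12 pixels with a summed value of 75. Unlike the single-threshold island search, a region that never exceeds Tseed is not reported at all, however many pixels exceed Tflood.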


Source detection with Aegean: Performance Comparisons


Source Finding: Real World Performance


• Source finders compared on real VLA observations (Mooley et al. 2013, arXiv:1303.6282)

Comparison of source finder performance vs DR2 catalog for:
Top: a blended source
Mid: a source with sidelobes
Bottom: a wide field area

Source Finding: Real World Performance


For sources in each S/N ratio bin (top), Mooley et al. compare completeness (middle) and reliability (bottom) of source finders. (Highly blended sources cause dip at SNR∼70)

Bottom line for Source Detection


• Source finders are typically optimized for different domains.

• Trade-off between completeness and reliability (false positives).

• All source finders are pretty good. No source finder is perfect. If you’re using one, understand its strengths and weaknesses first.

• There’s room to do better!
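The completeness and reliability trade-off can be made concrete with a small calculation. The sketch assumes the detected catalog has already been cross-matched against a truth list (injected sources or a deeper reference catalog), so matched sources share an identifier; the identifiers and the two example catalogs are hypothetical.

```python
def completeness_and_reliability(true_ids, detected_ids):
    """Return (completeness, reliability) for cross-matched catalogs.

    completeness: fraction of real sources that were detected.
    reliability:  fraction of detections that are real sources.
    Spurious detections carry identifiers that are absent from true_ids.
    """
    true_set, detected_set = set(true_ids), set(detected_ids)
    matched = true_set & detected_set
    return len(matched) / len(true_set), len(matched) / len(detected_set)

truth = ["s1", "s2", "s3", "s4", "s5"]
# High detection threshold: misses faint sources, no false positives.
conservative = ["s1", "s2", "s3"]
# Low detection threshold: finds everything, plus three noise peaks.
aggressive = ["s1", "s2", "s3", "s4", "s5", "f1", "f2", "f3"]

c_cons, r_cons = completeness_and_reliability(truth, conservative)
c_aggr, r_aggr = completeness_and_reliability(truth, aggressive)
```

Lowering the threshold raises completeness from 0.6 to 1.0 in this toy case while reliability drops from 1.0 to 0.625.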
