Tahoe, Sep. 2006

Calibrating Photometric Redshifts beyond Spectroscopic Limits

Jeffrey Newman

Lawrence Berkeley National Laboratory

A critical problem…

DETF Task Force Report:

But a difficult one

- Future DE experiments plan to use photo-z’s for objects far too faint to obtain spectroscopic redshifts for en masse

- High-z/faint spectroscopic redshift survey samples are far from complete

- Photo-z calibrations for brighter galaxies may not apply directly to fainter galaxies at the same z (smaller galaxies start star formation later; what about Pop. III?)

How can we test photo-z’s for faint galaxies if we can’t get complete sets of spectroscopic redshifts?

Because galaxies cluster together in 3D, they also cluster together on the sky

Both because dark matter halos cluster with each other and because more galaxies are found in more massive halos, all populations of galaxies cluster with each other - both in 3D and in projection on the sky.

Cross-correlations can tell us about p(z):

Consider objects in some photo-z bin, in a region where there is another set of objects with spectroscopic z’s.

$z_{phot} \sim 0.7$

No overlap in z :

If none of the photo-z objects are in fact at the same z as a spectroscopic object, they will not cluster with it on the sky.

Some overlap in z :

Those photo-z objects which are close in z to a spectroscopic object will yield a clustering signal.

Maximal overlap in z :

The cross-correlation is stronger at redshifts where a greater fraction of the photo-z objects truly reside.

Two-point correlation statistics

The simplest clustering observable is the two-point correlation function, the excess probability over random that a second object will be found some distance from another:

$dP = n\,(1 + \xi(r))\,dV$

where $\xi(r)$ denotes the real-space two-point autocorrelation function of this class (which has average density $n$) at separation $r$.

$\xi(r)$ is the Fourier transform of the power spectrum. It is described well by a power law,

$\xi(r) = (r/r_0)^{-\gamma}$

where $r_0 \sim 3$-$5\,h^{-1}$ Mpc, depending on galaxy type, and $\gamma$ is the power-law slope.
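As a minimal sketch of the definitions above, the following Python evaluates the power-law correlation function and the mean number of neighbours implied by dP = n (1 + xi(r)) dV inside a sphere. The default values r0 = 4 h^-1 Mpc and gamma = 1.8, and the function names, are illustrative assumptions rather than numbers taken from the slides.

```python
import math

def xi_power_law(r, r0=4.0, gamma=1.8):
    """Power-law two-point correlation function, xi(r) = (r / r0)^-gamma."""
    return (r / r0) ** (-gamma)

def expected_neighbours(radius, n, r0=4.0, gamma=1.8):
    """Mean number of neighbours within `radius` of a galaxy.

    Integrates dP = n (1 + xi(r)) dV over a sphere analytically
    (the clustered term converges only for gamma < 3).
    """
    # Unclustered (random) contribution: n times the sphere volume.
    random_part = 4.0 / 3.0 * math.pi * radius ** 3
    # Clustered contribution: integral of 4 pi r^2 (r / r0)^-gamma dr from 0 to radius.
    clustered_part = 4.0 * math.pi * r0 ** gamma * radius ** (3.0 - gamma) / (3.0 - gamma)
    return n * (random_part + clustered_part)
```

By construction xi(r0) = 1, so at r = r0 a neighbour is twice as likely as in an unclustered distribution.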

Angular cross-correlations

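For galaxies in a small spectroscopic bin (e.g. $\Delta z = 0.01$) we can measure the cross-correlation of photometric galaxies about a spectroscopic galaxy, defined by:

$dP_{sp}(\theta) \sim \Sigma_p\,(1 + w_{sp}(\theta))\,d\Omega$

where $w_{sp}(\theta) \sim \int \xi_{sp}(y)\,p(z)\,dz$, $y = (l^2 + D^2\theta^2)^{1/2}$, with $l$ the line-of-sight separation and $D$ the angular size distance.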

Phillipps (1985) first used cross-correlations to measure clustering (also applied by Masjedi et al. 2006), but we’ll use it to get redshift distributions instead (cf. also Schneider et al. 2006, Padmanabhan et al. 2006).

Additional observables

In addition to $w_{sp}(\theta) \sim \int \xi_{sp}(y)\,p(z)\,dz$,

we also measure the real-space autocorrelation for spectroscopic galaxies, $\xi_{ss}$,

and the angular autocorrelation for photometric galaxies:

$w_{pp}(\theta) \sim \int \xi_{pp}(y)\,p(z)^2\,dz$

For simple biasing, $\xi_{sp} = (\xi_{ss}\,\xi_{pp})^{1/2}$,

providing enough information to solve separately for $\xi_{sp}$ and $p(z)$.
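The inversion described above can be sketched in Python under strong simplifying assumptions: each spectroscopic bin is summarised by a single cross-correlation amplitude, the clustering amplitudes of the two samples are treated as already known per bin, and the redshift grid is uniform. The function `recover_pz` and all input values are hypothetical; a real analysis would have to infer the photometric clustering amplitude from the angular autocorrelation.

```python
import math

def recover_pz(z_grid, w_sp, xi_ss_amp, xi_pp_amp):
    """Toy inversion of w_sp(z) ~ xi_sp(z) * p(z) on a uniform redshift grid."""
    dz = z_grid[1] - z_grid[0]
    # Simple biasing: xi_sp = sqrt(xi_ss * xi_pp), bin by bin.
    xi_sp_amp = [math.sqrt(ss * pp) for ss, pp in zip(xi_ss_amp, xi_pp_amp)]
    # Divide out the clustering strength to leave something proportional to p(z).
    unnormalised = [w / xi for w, xi in zip(w_sp, xi_sp_amp)]
    # Normalise so that the recovered p(z) integrates to one.
    norm = sum(unnormalised) * dz
    return [u / norm for u in unnormalised]

# Synthetic check: build w_sp from a known Gaussian p(z), then recover it.
z_grid = [0.5 + 0.01 * i for i in range(101)]
true_p = [math.exp(-0.5 * ((z - 1.0) / 0.1) ** 2) for z in z_grid]
xi_ss_amp = [2.0 + 0.5 * z for z in z_grid]   # assumed spectroscopic amplitude
xi_pp_amp = [0.5] * len(z_grid)               # assumed photometric amplitude
w_sp = [p * math.sqrt(ss * pp) for p, ss, pp in zip(true_p, xi_ss_amp, xi_pp_amp)]
p_est = recover_pz(z_grid, w_sp, xi_ss_amp, xi_pp_amp)
```

In this noiseless toy case `p_est` reproduces the shape of the input Gaussian exactly, because the redshift dependence of the spectroscopic clustering is divided out before normalising.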

Assumptions for the following:

1) We have a spectroscopic sample of galaxies with well-measured redshifts. For starters, assume it has a flat redshift distribution (constant $dN_s/dz$), e.g. 25k galaxies/unit z.

2) We want to measure p(z) for a sample of galaxies in one photometric redshift bin whose true redshift distribution is a Gaussian with mean z = 1 and standard deviation $\sigma_z$. For a standard scenario, we take surface density $\Sigma_p = 10$/sq. arcmin and $\sigma_z \sim 0.1$.

Assumptions (continued)

3) We can ignore lensing, which can also cause correlations (can be removed iteratively).

4) The clustering of the photometric sample is independent of z . *

5) We measure correlations within a 5 $h^{-1}$ Mpc comoving radius (trade-off of signal-to-noise vs. nonlinearities).

6) We can ignore sample (“cosmic”) variance (minimize by using many fields/sampling widely separated regions of sky; remove to first order using the observed fluctuations in $dN_s/dz$).

Monte Carlo simulations

Generate realizations with realistic correlation measurement errors in bins and do Gaussian fits to inferred p(z) in each
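A rough sketch of such a Monte Carlo, in Python, is given below. It is not the simulation behind the slides: the per-bin noise is a constant, hypothetical value rather than a modelled correlation measurement error, and a moment estimate (weighted mean and standard deviation) stands in for the Gaussian fit. All names and defaults are assumptions.

```python
import math
import random
import statistics

def gaussian(z, mean, sigma):
    """Normalised Gaussian p(z)."""
    return math.exp(-0.5 * ((z - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def fit_moments(z_bins, p_obs):
    """Moment-based stand-in for a Gaussian fit: weighted mean and sigma."""
    norm = sum(p_obs)
    mean = sum(z * p for z, p in zip(z_bins, p_obs)) / norm
    var = sum((z - mean) ** 2 * p for z, p in zip(z_bins, p_obs)) / norm
    return mean, math.sqrt(max(var, 0.0))

def monte_carlo(n_real=200, mean=1.0, sigma=0.1, noise=0.05, seed=1):
    """Scatter of the recovered mean and sigma over noisy realizations of p(z)."""
    rng = random.Random(seed)                      # seeded for reproducibility
    z_bins = [0.5 + 0.01 * i for i in range(101)]  # bins of width 0.01 in z
    fits = []
    for _ in range(n_real):
        # One realization: true p(z) plus independent Gaussian noise per bin.
        p_obs = [gaussian(z, mean, sigma) + rng.gauss(0.0, noise) for z in z_bins]
        fits.append(fit_moments(z_bins, p_obs))
    means = [m for m, _ in fits]
    sigmas = [s for _, s in fits]
    return statistics.mean(means), statistics.stdev(means), statistics.stdev(sigmas)

mean_z, scatter_mean, scatter_sigma = monte_carlo()
```

The spread of the fitted mean across realizations (`scatter_mean`) plays the role of the uncertainty on the mean redshift; replacing the constant `noise` with errors derived from the correlation measurements would be the next step.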

Scaling with $\Sigma_p$

Scaling with $\sigma_z$

Dominant Errors:

Random errors:

$1.0 \times 10^{-3}\,(\sigma_z/0.1)^{1.5}\,((dN_s/dz)/25{,}000)^{-0.5}\,(\Sigma_p/10)^{-0.5}$

Field-to-field zero point variations:

$< 4.1 \times 10^{-3}\,(\sigma_{zp}/0.01)\,(N_{patch}/4)^{-0.5}$

Systematic errors in $\xi_{ss}$:

$< 1.6 \times 10^{-3}\,(\sigma_{sys}/0.02)\,(\sigma_z/0.1)$

Assuming no bias evolution though it exists:

$< 3 \times 10^{-3}\,\left[((db/dz)/b)/0.3\right]\,(\sigma_z/0.1)^2$
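The four scalings above can be collected into small Python helpers to see how each term responds to survey parameters. The function and parameter names are hypothetical; `zero_point_scatter` and `xi_ss_sys` simply denote the quantities normalised by 0.01 and 0.02 in the corresponding terms.

```python
def random_error(sigma_z=0.1, dns_dz=25_000.0, surface_density=10.0):
    """Random error; surface_density is in galaxies per sq. arcmin."""
    return (1.0e-3 * (sigma_z / 0.1) ** 1.5
            * (dns_dz / 25_000.0) ** -0.5
            * (surface_density / 10.0) ** -0.5)

def zero_point_bound(zero_point_scatter=0.01, n_patch=4):
    """Upper bound from field-to-field zero point variations."""
    return 4.1e-3 * (zero_point_scatter / 0.01) * (n_patch / 4.0) ** -0.5

def xi_ss_bound(xi_ss_sys=0.02, sigma_z=0.1):
    """Upper bound from systematic errors in the spectroscopic autocorrelation."""
    return 1.6e-3 * (xi_ss_sys / 0.02) * (sigma_z / 0.1)

def bias_evolution_bound(frac_db_dz=0.3, sigma_z=0.1):
    """Upper bound from ignoring bias evolution; frac_db_dz is (db/dz)/b."""
    return 3.0e-3 * (frac_db_dz / 0.3) * (sigma_z / 0.1) ** 2

# Example: quadrupling the spectroscopic sample halves the random error.
baseline = random_error()
larger_sample = random_error(dns_dz=100_000.0)
```

With the default (fiducial) arguments each helper returns the coefficient quoted in the corresponding line, which makes it easy to check a changed scenario against the standard one.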

Near-future prospects

Blue: SDSS + AGES + VVDS + DEEP2 + 1700 galaxies/unit z at high z

Red: add zCOSMOS + PRIMUS + WiggleZ + 5000 galaxies/unit z at high z

Monte Carlos for real surveys

Redshift samples will be 3-10x larger than today at most z, with correspondingly smaller errors:

[Figure panels: Current, Future]

Conclusions

• Reasonably-sized spectroscopic datasets can establish redshift distributions for objects in photometric-only samples, with precisions right around what is necessary for future surveys.

• The spectroscopic sample does not need to be complete, very precise, etc. - we can pick the easiest galaxies to get redshifts for, restrict to only the most secure redshifts, and so forth.

• To minimize systematic errors (and sample variance), it is best to sample many surveys/fields.

• What is needed most are larger samples of galaxies at z=0.2-0.7 (under way) and especially z > 1.4.

Net scaling:

For both the uncertainty in the mean z of the photometric galaxies and the uncertainty in $\sigma_z$, we get:

$\sim 1.0 \times 10^{-3}\,(\sigma_z/0.1)^{1.5}\,((dN_s/dz)/25{,}000)^{-0.5}\,(\Sigma_p/10)^{-0.5}$

If p(z) is made up of multiple, nonoverlapping Gaussian peaks each containing $f_{peak}$ of the probability, errors scale as $f_{peak}^{-1/2}$.

Other sources of error

LSST tolerance is 0.002(1+z): matches worst-case systematic errors at z=1.

What if bias evolves with z for the sample?

To get these uncertainties, I assumed that the biasing/clustering of the photometric galaxies is constant with z. We can use the angular autocorrelation, plus dN/dz, to infer the average bias of the photometric galaxies.

If we assume db/dz = 0, and it is not, then we will get a biased estimate of the true <z>; for $b = b_0\,(1 + (db/dz)(z - z_0))$, we will make an error of $(db/dz)\,\sigma_z^2$.

Observed db/dz for reasonable samples is ~0.3, so this corresponds to an error of $\sim 3 \times 10^{-3}$ for $\sigma_z = 0.1$.

In actuality, we should get some handle on db/dz from comparing e.g. photo-z slices… this error can be reduced substantially.
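The size of this error can be checked numerically with a short Python sketch. It assumes that, when bias evolution is ignored, the recovered distribution is proportional to b(z) p(z) with a Gaussian p(z) centred on z0; the function name and grid settings are illustrative choices.

```python
import math

def biased_mean_shift(db_dz=0.3, sigma_z=0.1, z0=1.0, n=20001, half_width=8.0):
    """Shift in <z> when p(z) is weighted by b(z)/b0 = 1 + db_dz * (z - z0)."""
    lo = z0 - half_width * sigma_z
    step = 2.0 * half_width * sigma_z / (n - 1)
    num = 0.0
    den = 0.0
    for i in range(n):
        z = lo + i * step
        # Gaussian p(z) (normalisation cancels) times the linear bias factor.
        weight = math.exp(-0.5 * ((z - z0) / sigma_z) ** 2) * (1.0 + db_dz * (z - z0))
        num += z * weight
        den += weight
    return num / den - z0

shift = biased_mean_shift()   # analytic expectation: db_dz * sigma_z**2 = 3e-3
```

The numerical shift agrees with the analytic value (db/dz) sigma_z^2, and because it grows as sigma_z squared, narrower photo-z bins suffer much less from unmodelled bias evolution.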

Measuring $\xi$ in a redshift survey

The observed clustering of galaxies is not isotropic, as the redshift separation of objects is a combined result of their distance and ‘peculiar motions’ induced by gravity.

Conroy et al. 2005/Coil et al. 2005

Therefore, we commonly measure $w_p(r_p)$: the excess probability two objects are a given separation apart, projected on the sky. This avoids redshift-space effects.

If distance >> $r_0$,

$w_p(r_p) \sim \int \xi(r)\,dz = f(\gamma)\,r_p^{1-\gamma}\,r_0^{\gamma}$
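For a pure power law the projection can be done in closed form. The Python sketch below uses the standard result f(gamma) = Gamma(1/2) Gamma((gamma-1)/2) / Gamma(gamma/2), which the slide does not spell out, and cross-checks it against a direct line-of-sight integral. The defaults r0 = 4 h^-1 Mpc and gamma = 1.8 are illustrative assumptions.

```python
import math

def f_gamma(gamma):
    """Projection factor for a power-law xi(r); requires gamma > 1."""
    return math.gamma(0.5) * math.gamma((gamma - 1.0) / 2.0) / math.gamma(gamma / 2.0)

def wp_power_law(rp, r0=4.0, gamma=1.8):
    """Closed-form projected correlation function w_p(r_p)."""
    return f_gamma(gamma) * rp ** (1.0 - gamma) * r0 ** gamma

def wp_numeric(rp, r0=4.0, gamma=1.8, n_steps=200_000):
    """Midpoint-rule integral of xi along the line of sight.

    Substituting l = rp * tan(t) maps the infinite line of sight onto
    0 <= t < pi/2, where the integrand becomes cos(t)^(gamma - 2).
    """
    h = (math.pi / 2.0) / n_steps
    total = 0.0
    for i in range(n_steps):
        t = (i + 0.5) * h
        total += math.cos(t) ** (gamma - 2.0)
    # Factor 2 accounts for both sides of the line of sight.
    return 2.0 * r0 ** gamma * rp ** (1.0 - gamma) * total * h
```

Comparing `wp_numeric(1.0)` with `wp_power_law(1.0)` gives a quick consistency check of the closed form, and the ratio of `wp_power_law` at two radii shows the r_p^(1-gamma) scaling directly.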

Cross-correlations

Generally, we measure the autocorrelation of some sort of object with other objects of the same sort. However, we can also measure cross-correlations: the excess probability of finding an object of type 2 near an object of type 1.

Coil et al. 2006

For simple, linear biasing, $\xi_{12} \sim (\xi_{11}\,\xi_{22})^{0.5}$

Galaxy-QSO clustering vs. galaxy-galaxy clustering

Given large sets of galaxies with redshifts, we can infer dN/dz from cross-correlation techniques

Phillipps (1985) showed that high-quality correlation function measurements can be obtained by measuring the angular correlation of galaxies without redshifts (but seen in photometry) around galaxies of known redshift.

This can get around the usual problems with angular correlations: we generally must assume luminosity and clustering are uncoupled and then use a known redshift distribution of sources to interpret angular correlation functions (via Limber’s equation). By cross-correlating with galaxies with spectroscopic redshifts, though, the analysis becomes much simpler.

Angular correlations

Much larger sets of galaxies have photometry than spectroscopy/redshifts. Their clustering statistics can be studied using the angular correlation function $w(\theta)$.

Coil et al. 2004

To interpret angular correlations, we need to know the redshift distribution of sources:

$w(\theta) \sim \int (dN/dz)^2\,\xi(r, z)\,dz$

Hybrid methods are also possible

Phillipps (1985) showed that high-quality correlation function measurements can be obtained by measuring the angular correlation of galaxies without redshifts (but seen in photometry) around galaxies of known redshift, e.g. in cases where the photometric dataset is >> the redshift dataset.

This can get around the usual problems with angular correlations: we typically must assume luminosity and clustering are uncoupled and then use a known redshift distribution of sources to interpret angular correlation functions (via Limber’s equation). By cross-correlating with galaxies with spectroscopic redshifts, though, the analysis becomes much simpler.

From cross-correlations to dN/dz

Assume a spectroscopic survey of a total of $N_s$ galaxies has been performed over the same region in which we desire to calibrate redshift distributions (e.g. for a given photo-z bin). From that, we know dN/dz for the spectroscopic sample, $n_s(z)$, plus the two-point autocorrelation function for those galaxies, $\xi_{ss}(r)$.

For the photometric-only sample, we know its total surface density on the sky, $\Sigma_p$, and its two-point angular autocorrelation function, $w_{pp}(\theta)$.

Then…

We can measure the cross-correlation of galaxies in a small spectroscopic bin (e.g. $\Delta z = 0.01$) with the photometric sample, defined by:

$dP_{sp}(\theta) \sim \Sigma_p\,(1 + w_{sp}(\theta))\,d\Omega$

where

$w_{sp}(\theta) \sim (n_p(z)/\Sigma_p) \int \xi_{sp}(y)\,dl$,

and $y = (l^2 + d_A^2\theta^2)^{1/2}$.

So given a sample of galaxies with known z and known clustering, we can derive the fraction of a separate sample of photometric galaxies at that z (as after we measure the cross-correlations, we can get the average clustering of the photometric sample from its angular autocorrelation).

A recent application

This allows us to infer dN/dz for a photometric sample using a spectroscopic survey

The observed clustering on the sky between galaxies in some photometric-only sample and galaxies known to be at a given z depends on the product of the real-space cross-correlation between locations of the two populations and the fraction of the photometric sample at that z.

In general, we have sufficient information to measure the autocorrelation function of the spectroscopic sample with itself and the angular autocorrelation of the photometric sample along with the angular cross-correlation.

This provides enough information to get out dN/dz for the photometric sample, so long as biasing is simple or well-modeled.
