[Plot: Distribution of QSO probabilities. x-axis: Probability (QSO); y-axis: Number; series: Non-QSOs, QSOs.]
Preliminary Census of ROSAT Bright Sources: Results from ClassX
R.L. White, A.A. Suchkov, R.J. Hanisch, M. Postman, M.E. Donahue (STScI), T.A. McGlynn, L. Angelini, M.F. Corcoran, S.A. Drake, W.D. Pence, N. White, E.L. Winter (NASA/GSFC), F. Genova, F. Ochsenbein, P. Fernique, S. Derriere (CDS), W. Voges (MPE)
Abstract. ClassX is being developed as a system for automated classification of X-ray sources within the Virtual Observatory environment. Its core is a network of classifiers “trained” using diverse data sets for X-ray sources of known object type and their optical, infrared, radio, etc. counterparts. The network is integrated in the ClassX pipeline with a search engine that queries remote multi-wavelength data repositories, using systems such as the CDS VizieR service, to get data (in the VOTable format) for the sources to be classified.
In this paper we present a preliminary census from ClassX of previously unclassified X-ray sources observed with the ROSAT PSPC. The early results include the finding that our sources are dominated by QSOs (Fig. 1), in contrast to the star-dominated samples that were used to train our classifiers. The ClassX census appears to be consistent with expectations when one considers that the sources being studied are a fainter population than the previously classified objects.
Fig. 1. Class distribution. Comparison of class fractions for X-ray sources classified in the WGACAT (blue) and previously unclassified sources (red) for which GSC2 counterparts were found. Each panel presents results from a different classifier: “X-ray–optical” (xo9a_xo), “X-ray” (xo9a_x), and “optical” (xo9a_o). The green histogram at the bottom shows the distribution of the classes from the original WGACAT classification for the training set. Note that while the training set of known classifications was dominated by stars, the most common class among the newly classified objects is QSO, followed by galaxy clusters. This change in population is expected since the unidentified X-ray sources are generally fainter. That our classifier is able to respond to the changing population is encouraging, as it is generally a challenging problem to classify a set of objects that differs substantially from the training set.
Fig. 5. Near IR colors. 2MASS J-K color distribution of X-ray sources with previously known classifications in the WGACAT.
Fig. 6. Near IR colors for newly classified sources. Same as in Fig. 5 but for previously unclassified sources. Classes are from the “X-ray” classifier.
Fig. 2. Classification probabilities. Class probability distribution for previously unclassified X-ray sources with GSC2 and 2MASS counterparts (from the classifier trained using only X-ray data). ClassX provides classification probabilities for every class for each source. The plot shows the distribution of QSO probabilities for all objects (gray) and objects classified as QSOs (red), meaning that QSO is the highest probability class. The probabilities are relatively low because the QSO and AGN classes are so similar.
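The rule described in the Fig. 2 caption (a source is assigned the class with the highest probability) can be sketched as follows. This is a minimal illustration, not the ClassX implementation: the function name, the class labels, and the probability values are all hypothetical.

```python
def classify(probabilities):
    """Return (label, probability) for the highest-probability class."""
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]

# Hypothetical per-class probabilities for one source (illustrative values only).
source = {"QSO": 0.38, "AGN": 0.33, "Star": 0.04,
          "Galaxy": 0.10, "Cluster": 0.10, "XRB": 0.05}

# QSO wins, but with a modest probability because AGN takes a similar share.
label, p = classify(source)
```

In this made-up case the source is labeled QSO even though its QSO probability is well below 0.5, which is the situation the caption describes.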
Fig. 7. Mean IR & X-ray colors. Mean 2MASS J-K color (upper panel) and mean X-ray “color” (lower panel) for classified sources (classes based on the WGACAT; blue) and unclassified sources (classes from the X-ray classifier; red).
Reference
A.A. Suchkov, T.A. McGlynn, L. Angelini, M.F. Corcoran, S.A. Drake, W.D. Pence, N. White, E.L. Winter, R.J. Hanisch, R.L. White, M. Postman, M.E. Donahue, F. Genova, F. Ochsenbein, P. Fernique, & S. Derriere, 2002. Automated Object Classification with ClassX, astro-ph/0210407
Introduction: ClassX classifiers. Classification of observed astronomical objects plays a major role in converting observational data into science. It is also trickier than one might guess because the class categories often overlap: the same object can be called a star and a white dwarf, a galaxy and an AGN, an AGN and a QSO, etc. The situation gets even more complicated when the same object is viewed with different instruments: for instance, at the position of an X-ray cluster of galaxies, an optical counterpart from, say, GSC2, would typically be a galaxy rather than a combined entity called a cluster of galaxies. These and similar conceptual issues related to the ClassX project were discussed earlier by Suchkov et al. (2002).
This paper. In this paper, we classify previously unidentified ROSAT sources with several different classifiers, each trained with a different set of parameters. For instance, the training of the “X-ray” classifier involves X-ray magnitudes but not optical and infrared magnitudes, while the training of the “X-ray and optical” classifier involves both X-ray and optical magnitudes. Fig. 1 compares the class frequencies of two samples of X-ray sources, with classifications from three classifiers.
Data. The WGA catalog of X-ray sources from ROSAT PSPC observations contains 36995 sources for which we found optical counterparts within 30 arcsec in the GSC2, with both F and J magnitudes. Of those, 6505 sources were classified in the WGACAT; we used this sample to train our classifiers. The classifiers were then applied to the remaining 30490 sources to determine the object type (class) associated with these previously unclassified sources (see Fig. 1).
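A counterpart search within a fixed radius, such as the 30 arcsec used above, can be sketched as follows. This is only an illustration under stated assumptions: the poster does not say how a counterpart was chosen when several candidates fell inside the radius, so picking the nearest one is an assumption here, and the function names and the catalog layout (name, RA, Dec in degrees) are hypothetical.

```python
import math

def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcsec; inputs in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp guards against rounding slightly above 1 for near-antipodal points.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a)))) * 3600.0

def nearest_counterpart(xray_pos, optical_catalog, radius_arcsec=30.0):
    """Return (name, separation) of the closest entry within the radius, or None.

    Choosing the nearest candidate is an assumption made for this sketch.
    """
    best = None
    for name, ra, dec in optical_catalog:
        sep = angular_separation_arcsec(xray_pos[0], xray_pos[1], ra, dec)
        if sep <= radius_arcsec and (best is None or sep < best[1]):
            best = (name, sep)
    return best

# Hypothetical positions in degrees: offsets of 18, 7.2, and 72 arcsec in Dec.
catalog = [("a", 10.0, 0.005), ("b", 10.0, 0.002), ("c", 10.0, 0.020)]
match = nearest_counterpart((10.0, 0.0), catalog)
```

The sample sizes quoted above are consistent with this kind of split: 6505 classified plus 30490 unclassified sources gives the 36995 sources with counterparts.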
AAS Meeting 201, January 5 – 9, Seattle, WA
Class properties of previously unclassified sources. Not surprisingly, the unclassified sources are on average fainter, which implies that the objects in each class are on average more distant or less luminous. We expect some systematic differences between classes in the classified and unclassified samples. Illustrations of such differences can be found in the figures presented here. For example, QSOs are much more common at fainter magnitudes, which accounts for the large increase in the fraction of QSOs compared with the training set.
Observational biases. Class properties in the classified and unclassified samples also differ because source detectability differs from band to band. Bluer 2MASS colors are found in the unclassified sample because at faint magnitudes detections in the K band are possible only for bluer sources (Fig. 7, upper panel). Similarly, fainter sources are softer in the X-ray because detections in the hard band, x3, are available only for softer sources (Fig. 7, lower panel).
Future work. Clearly these statistical checks on the properties of different classes are not a substitute for checking the accuracy of the classifications using
spectroscopically identified sources. In the near future we plan detailed comparisons of our classes with external data such as SDSS.
Highlights
Compared with the previously classified objects, for the newly classified sources:
• QSOs and clusters of galaxies are much more common (whereas stars dominate the training set).
• All classes in the newly classified sample are softer in X-rays (except for OF stars).
• All classes in the newly classified sample are bluer in the 2MASS bands.
• Class QSO is the “softest” and much softer than class AGN in both samples.
• Class AGN is the reddest and much redder than class QSO in both samples.
• AGNs, galaxies, & clusters of galaxies all show bimodal IR color distributions.
• In the infrared, AGNs, galaxies, and clusters of galaxies are dominated by the group of blue 2MASS counterparts as opposed to the group of red counterparts in the classified sample.
[Plot: x-axis: P(QSO or AGN); y-axis: P(Star); series: QSO+AGN, Stars, Other.]
Validation of ClassX classification. We explore the validity of the ClassX classification using a variety of checks on the internal and external consistency of the classification results. Figs. 5 and 6 display the distribution of the 2MASS J-K color, a parameter that was not used in the training or classification of the sources. Comparing the two figures, we notice a number of common features. For example, the distribution of AGNs is obviously bimodal in both the classified and unclassified samples, which isolates two groups, blue and red, centered at ~0.7 and ~1.4 (although the relative prominence of the two groups is different for the two samples). This kind of consistency suggests that the classifier does indeed do a good job statistically in identifying AGNs among X-ray sources.
Fig. 7 shows the class variation of the mean infrared and X-ray colors for classified and unclassified sources. There is a remarkable consistency between the two samples in the color variation from class to class: classes that are redder/softer in the classified sample are also redder/softer in the unclassified sample. Again, this is indicative of a substantial degree of reliability of the ClassX classification.
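The consistency check described for Fig. 7 (comparing how the mean color varies from class to class in the two samples) can be sketched roughly as below. The helper names are hypothetical and the J-K values are made-up numbers chosen only to show the mechanics; they are not measurements from this work.

```python
from collections import defaultdict

def mean_by_class(rows):
    """Mean of a quantity per class, from (class_label, value) pairs."""
    sums = defaultdict(lambda: [0.0, 0])
    for cls, value in rows:
        sums[cls][0] += value
        sums[cls][1] += 1
    return {cls: total / n for cls, (total, n) in sums.items()}

def same_ordering(means_a, means_b):
    """True if the classes common to both samples rank identically by mean."""
    common = sorted(set(means_a) & set(means_b))
    return sorted(common, key=means_a.get) == sorted(common, key=means_b.get)

# Made-up J-K colors (illustrative only), one list per sample.
classified = [("QSO", 1.0), ("QSO", 1.2), ("AGN", 1.5),
              ("AGN", 1.7), ("Star", 0.6), ("Star", 0.8)]
unclassified = [("QSO", 0.9), ("QSO", 1.1), ("AGN", 1.3),
                ("AGN", 1.5), ("Star", 0.5), ("Star", 0.7)]

means_c = mean_by_class(classified)
means_u = mean_by_class(unclassified)
consistent = same_ordering(means_c, means_u)
```

A rank comparison like this only tests whether the class-to-class trend is preserved; it says nothing about whether individual sources are classified correctly.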
Fig. 4. Separation in probabilities. The total probabilities P(Star) vs. P(QSO or AGN) are plotted for the combined classes of Fig. 3. Note that the histograms in Fig. 3 result from summing this distribution along the y direction. Most of the stars are very well separated from the other classes, as are many of the QSOs+AGN. Objects near the intersections and boundaries are difficult to classify.
Fig. 3. Combining class probabilities. The class probabilities can be usefully combined to compare groups of similar classes. Here the normal stellar classes have been combined into a single “Stars” class, QSOs & AGNs have been combined into a second class, and the X-ray Binary, Galaxy and Cluster classes are left unchanged. Now the QSOs/AGNs (red) separate very well from the stars (blue).
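Combining class probabilities as described in the Fig. 3 caption amounts to summing the probabilities of the member classes in each group. A minimal sketch follows, assuming per-source probabilities stored in a dictionary; the function name, the stellar class labels, and the numbers are hypothetical.

```python
def combine(probabilities, groups):
    """Sum member-class probabilities per group; ungrouped classes pass through."""
    combined = {}
    grouped = set()
    for group, members in groups.items():
        combined[group] = sum(probabilities.get(m, 0.0) for m in members)
        grouped.update(members)
    for cls, p in probabilities.items():
        if cls not in grouped:
            combined[cls] = p
    return combined

# Hypothetical grouping: stellar labels are placeholders, not ClassX class names.
GROUPS = {"Stars": ("Star_early", "Star_late"), "QSO+AGN": ("QSO", "AGN")}

# Illustrative probabilities for one source.
source = {"Star_early": 0.02, "Star_late": 0.03, "QSO": 0.38, "AGN": 0.33,
          "XRB": 0.04, "Galaxy": 0.10, "Cluster": 0.10}

combined = combine(source, GROUPS)
best = max(combined, key=combined.get)
```

In this made-up example neither QSO nor AGN reaches 0.5 alone, but the combined QSO+AGN probability is about 0.71, so the grouped class stands out clearly from the stars.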
[Plot: Distribution of combined QSO+AGN probabilities. x-axis: Probability (QSO or AGN); y-axis: Number; series: Stars, QSO/AGNs, Other.]