The XMM-Newton Serendipitous Source Catalogue: 2XMM Pre-release

User Guide to the Catalogue

Release 1.0.3 22 February 2007 Associated with Catalogue version 1.0

Prepared by the XMM-Newton Survey Science Centre Consortium (

This User Guide refers directly to the full FITS and plain-text formats of the pre-release catalogue. It provides a detailed account of the production and contents of the catalogue. Users interested in the main properties of the catalogue will find the summary and sections 1 & 4.3 of most immediate interest.



2XMMp is the pre-release second catalogue of serendipitous X-ray sources from the European Space Agency's (ESA) XMM-Newton observatory, and has been constructed by the XMM-Newton Survey Science Centre (SSC) on behalf of ESA. The 2XMMp catalogue is essentially a subset of the full 2XMM catalogue due to be released later this year. To make the 2XMMp catalogue we carried out an initial visual screening of each observation and excluded those observations (~17% of the total) in which deficiencies of the automatic processing mean that full visual screening is needed to flag the affected source detections. In this way the overall quality of the 2XMMp catalogue is maintained at the expense of excluding these problematic observations. They will, of course, be included in the full 2XMM catalogue once the detailed (and time-consuming) visual screening is completed.

The pre-release catalogue contains source detections drawn from 2400 XMM-Newton EPIC observations made between 2000 February 4 and 2006 April 20. All datasets included were publicly available by 2006 May 31, but note that not all public observations are included in this catalogue. The total area of the catalogue fields is ~ 400 deg^2, but taking account of the substantial overlaps between observations, the net sky area covered independently is ~ 285 deg^2.

The processing to generate the catalogue is largely based on the pipeline developed for the re-processing of all observations to date and includes a number of significant improvements over the previous data processing system (as used by the SSC in routine processing of XMM-Newton data on behalf of ESA). These improvements include a more sensitive source detection scheme using exposures of all cameras, the detection and parameterization of extended sources and the extraction of spectra and time series for the brightest sources. The catalogue contains 153105 X-ray source detections above the processing likelihood threshold of 6. The 153105 X-ray source detections relate to 123170 unique X-ray sources, that is, a significant fraction of sources, namely 46796, have more than one detection in the catalogue.

The median flux (in the total photon-energy band 0.2 - 12 keV) of the catalogue detections is ~ 2.4 x 10^-14 erg/cm^2/s; ~ 20% have fluxes below 1 x 10^-14 erg/cm^2/s. The positional accuracy of the catalogue detections is generally < 2 arcsec (68% confidence radius). The flux values from the three EPIC cameras are overall in agreement to ~ 2% for on-axis sources, and ~ 6% off-axis.

Known problems and open issues are discussed in Sec. 4.2.

1. Introduction

Pointed observations with the XMM-Newton Observatory detect significant numbers of previously unknown 'serendipitous' X-ray sources in addition to the original target. Combining the data from many observations thus yields a serendipitous source catalogue which, by virtue of the large field of view of XMM-Newton and its high sensitivity, represents a significant resource. The serendipitous source catalogue enhances our knowledge of the X-ray sky and has the potential for advancing our understanding of the nature of various Galactic and extragalactic source populations.

While the first XMM-Newton catalogue contains X-ray source detections from 585 XMM-Newton observations made between 2000 March 1 and 2002 May 5, the present 2XMMp catalogue contains sources from over 4 times as many observations (2400), made between 2000 February 4 and 2006 April 20 and in the public domain by 2006 May 31. The production of this catalogue has been undertaken by the XMM-Newton Survey Science Centre (SSC) consortium in fulfillment of one of its major responsibilities within the XMM-Newton project. The catalogue production process has been designed to exploit fully the capabilities of the XMM-Newton EPIC cameras and to ensure the integrity and quality of the resultant catalogue through rigorous screening of the data.

2. Data selection

2.1 Selection of observations

The selection of XMM-Newton observations for processing in the 2XMM catalogue pipeline is based on the desire to re-process all available observations with the latest data processing system and calibration data. All observations that are in the public domain prior to the release of the 2XMM catalogue will be included. For the 2XMM pre-release catalogue a sub-selection of these (made available to the public by 2006 May 31) is used, chosen through visual screening of the sources detected in each field. Only 'good' observations, that is, those that have no or very few obvious spurious detections (usually found in image artefacts), have been included (cf. Sec. 3.3.5). They comprise ~ 83% of all public observations that have EPIC data. In contrast, the full 2XMM catalogue (to be released later in the year) will contain all observations, and spurious detections will be manually flagged.

Table 2.1 gives the list of the final 2400 observations which are included in the catalogue.

2.2 Selection of exposures

Most, but not all, XMM-Newton observations comprise a single exposure with each of the three EPIC cameras. (A significant number of observations have multiple exposures and/or do not include exposures with one or more of the three cameras.)  For each observation we selected exposures for each of the three EPIC cameras for processing using the following criteria:

(i) An exposure must have > 1000 seconds duration.
(ii) The exposure must have been taken through a scientifically useful filter. In practice this rejected all exposures for which the filter position was closed, calibration or undefined. The possible filters used in the observations selected for the catalogue are Medium, Thick, Thin1, Thin2 (PN only), and Open. For a detailed description of the filters see Sec. 3.3.6 'EPIC filters and effective area' of the XMM-Newton Users Handbook.
(iii) The exposure must have been taken in a mode which could usefully be processed by the detection stage. PN small window modes were rejected since the effective field of view in these modes is small, making the background fitting stage of the source detection problematic. For the MOS nearly all modes were included, among them modes in which the area of the central CCD was windowed or missing (e.g., timing modes, here 'Fast Uncompressed') or modified ('Refreshed Frame Store' mode). The possible observing modes used in the observations selected for the catalogue are given in Tab. 2.2. For a detailed description of the modes see Sec. 3.3.2 'Science modes of the EPIC cameras' of the XMM-Newton Users Handbook.
(iv) Background filtering (see Sec. 3.1.1 c)) must have been successfully applied. This was not the case when the sum of all Good Time Intervals (hereafter GTIs) was less than 1000 seconds. Without background filtering the source detection is typically of limited value due to the much higher net background.
(v) After background filtering has been applied, each of the five images of an exposure (bands 1 - 5) must have sufficient range (as tested with fimgstat), i.e., at least one pixel per image must have more than one event detected.

Where more than one exposure with a particular camera met the above selection conditions, all exposures with the same filter and data mode were merged and then the exposure with maximum duration was chosen for the source detection. The zoom-in flow chart below visualizes the selection procedure. The source-specific products, in contrast, were extracted for all (unmerged) exposures that met the above first four criteria (see Sec. 3.1.4).

Figure 2.1: Flow-chart of the 2XMMp exposure selection.
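
The selection logic of Sec. 2.2 can be summarised in a short Python sketch. This is only an illustration under stated assumptions: the Exposure record and its field names are hypothetical, criteria (iii) and (v) are represented by pre-computed booleans, and merging is modelled simply as summing the durations of exposures that share camera, filter and mode. It is not the pipeline code.

```python
from collections import defaultdict
from dataclasses import dataclass

# Filter names as listed in criterion (ii).
USEFUL_FILTERS = {"Medium", "Thick", "Thin1", "Thin2", "Open"}


@dataclass
class Exposure:
    """Hypothetical per-exposure record; field names are illustrative only."""
    camera: str        # "PN", "M1" or "M2"
    filter: str
    mode: str
    duration: float    # seconds
    gti_sum: float     # summed good time after background filtering, seconds
    mode_usable: bool  # criterion (iii), decided elsewhere
    images_ok: bool    # criterion (v): each band image has a pixel with > 1 event


def passes_selection(exp):
    """Apply criteria (i) to (v) to a single exposure."""
    return (
        exp.duration > 1000.0              # (i)
        and exp.filter in USEFUL_FILTERS   # (ii)
        and exp.mode_usable                # (iii)
        and exp.gti_sum >= 1000.0          # (iv)
        and exp.images_ok                  # (v)
    )


def select_for_detection(exposures):
    """Return {camera: (filter, mode, merged_duration)} for the longest merged set."""
    merged = defaultdict(float)
    for exp in exposures:
        if passes_selection(exp):
            merged[(exp.camera, exp.filter, exp.mode)] += exp.duration
    best = {}
    for (camera, filt, mode), total in merged.items():
        if camera not in best or total > best[camera][2]:
            best[camera] = (filt, mode, total)
    return best
```

In this sketch a camera with no exposure passing all five criteria is simply absent from the result.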

3. Data processing

The observations were processed with an improved pipeline configuration compared with the one currently used for the routine production processing of observations. This new processing pipeline is used to re-process all observations available to date and will become the new routine pipeline after the re-processing has been concluded.

After creation of the pipeline data products (Sec. 3.1), some of the output was modified based on post-processing analysis (Sec. 3.2). The catalogue file was constructed to contain key columns from the source lists plus additional columns which include observation-level meta-data for each source as well as further analyses and evaluation (Sec. 3.3). Additional products for each source were created to facilitate data access (Sec. 3.4).

Throughout the documentation, references to catalogue columns are marked as links to their description. The prefix 'ca_' in a column name indicates a wildcard for any of the three EPIC cameras, i.e. 'ca' is to be replaced by 'PN', 'M1' or 'M2'; it can also stand for 'EP' where applicable. Some of the column names also include an energy band identifier ('_b_', where b = 1,2,3,4,5,6,7,8,9) which is typically not explicitly indicated. The four hardness ratios are identified by the designator 'n' (n = 1,2,3,4).
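
As an illustration of this naming convention, the following Python sketch expands a wildcard template into concrete column names. The function name and the assumed pattern (camera, then band number, then quantity, joined by underscores) are assumptions made for the example; the authoritative names are those given in the column descriptions.

```python
from itertools import product

CAMERAS = ("PN", "M1", "M2")


def expand_column(template, cameras=CAMERAS, bands=range(1, 6)):
    """Expand a template such as 'ca_b_RATE' or 'ca_FLAG' into column names."""
    prefix, _, rest = template.partition("_")
    if prefix != "ca":
        raise ValueError("template must start with 'ca_'")
    if rest.startswith("b_"):
        # Template carries a band identifier: one name per camera and band.
        suffix = rest[2:]
        return [f"{cam}_{b}_{suffix}" for cam, b in product(cameras, bands)]
    # No band identifier: one name per camera.
    return [f"{cam}_{rest}" for cam in cameras]
```

Passing cameras=("EP",) or a different band range covers the all-EPIC and broad-band cases.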

The zoom-in flow chart below gives an overview of the processing steps from the XMM-Newton observation to the final source list as described in this section. The 'exposure group selection' is described in Sec. 2.2.

Figure 3.1: Flow-chart of the 2XMMp processing steps.

3.1 Pipeline Processing

3.1.1 Event calibration and filtering

The following sections describe the individual steps taken within the processing chain leading to images on which source detection could be performed.

a) The processing of a MOS exposure

1. A first pass constructs a flare lightcurve selecting events with the (XMMEA_22 = REJECT_BY_GATTI & XMMEA_EM = GOOD_MOS_EVENTS) flags set.

2. GTIs are made by filtering the flare lightcurve using the MOS flare threshold. These GTIs are used to define the time regions in which bad pixel searching occurs.

3. All GTIs with a duration of less than 100 seconds are excluded.

4. The SAS task embadpixfind is used to locate dark pixels in each MOS CCD (using events which have not been filtered through the flare GTIs).

5. If no flare GTIs were made, the SAS task embadpixfind is used to locate bright pixels.

6. If flare GTIs do exist, the events are filtered through the flare GTIs and then embadpixfind is used to locate bright pixels.

7. The SAS task badpix is run on each CCD event file in order to add a bad pixel extension.

8. The intervals in the global GTI file are aligned with the event list and merged with the CCD GTIs.

9. Attitude correction is applied to the individual events to convert raw CCD pixel coordinates, through camera coordinates, to celestial coordinates.

10. Raw event pulse height values are converted to rectified event energies.

11. Unwanted events are filtered out before lists are merged.

12. The per-CCD event lists are merged into one per camera.

13. Filter the good imaging events into final event lists.

14. Copy the CIF into a separate extension in the event list.

15. Make a second pass flare lightcurve selecting events with the (XMMEA_22 = REJECT_BY_GATTI & XMMEA_EM = GOOD_MOS_EVENTS) flags set and bad pixels excluded.

16. Create flare GTIs using the MOS flare threshold.

17. Filter the event files through GTIs into final event files.

b) The processing of a PN exposure

1. The SAS task badpixfind is run to create a mask of non-source pixels to be used in generating a background flare lightcurve.

2. Badpixfind is run on each CCD to locate bright and dead pixels.

3. Attitude correction is applied to the individual events to convert raw CCD pixel coordinates, through camera coordinates, to celestial coordinates.

4. Raw event pulse height values are converted to rectified event energies.

5. Filter events by selecting events with the (XMMEA_EP = PN_GOOD_EVENTS) flags set.

6. Filter the CCD event files on the HK GTIs and merge into one.

7. Copy the CIF into a separate extension in the event list.

8. Make a background flare lightcurve using events with energies between 7 keV and 15 keV and excluding bad pixels by using the previously created pixel mask.

9. Create flare GTIs using the PN flare threshold for use in later processing stages. 

c) Background filtering

The MOS flare lightcurves were produced from GATTI-flagged (essentially events with energies above 14 keV), single-pixel events from the outer CCDs. After binning the lightcurve, the flare GTIs were selected by imposing a rate threshold of 2 cnts/arcmin^2/ksec.

The PN flare lightcurves were produced in the 7.0 - 15 keV energy range. After binning the lightcurve, the flare GTIs were selected by imposing a rate threshold of 10 cnts/arcmin^2/ksec.
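
The thresholding step can be sketched in Python as follows. This is a minimal illustration, not the SAS implementation: the function name, the input layout (contiguous bins given by their start times) and the choice to treat a rate equal to the threshold as good time are assumptions. The 100-second minimum duration is the one quoted in Sec. 3.1.1 a) and d).

```python
def flare_gtis(times, rates, bin_width, threshold, min_duration=100.0):
    """Return [(start, stop), ...] where the binned rate is at or below threshold.

    times: bin start times in seconds, ascending and contiguous
    rates: background rate per bin, in the same units as threshold
    """
    gtis = []
    start = None
    for t, r in zip(times, rates):
        if r <= threshold:
            if start is None:
                start = t              # a good interval opens here
        elif start is not None:
            gtis.append((start, t))    # a flaring bin closes the interval
            start = None
    if start is not None:
        gtis.append((start, times[-1] + bin_width))
    # Drop intervals shorter than the minimum duration.
    return [(a, b) for a, b in gtis if b - a >= min_duration]
```

With the thresholds above, the call would use threshold=2 for MOS and threshold=10 for PN, with rates expressed in cnts/arcmin^2/ksec.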

d) Image creation

1. Exclude from the flare GTIs all intervals with duration less than 100 seconds.

2. For each energy band make a counts image. The images are 600 x 600 pixels with 4-arcsecond pixel sides. The images are tangent plane projections of celestial coordinates. Note that the energy bands have changed slightly with respect to previous processing: the old band 2 is split into two bands now (0.5 keV - 1.0 keV and 1.0 keV - 2.0 keV), while the old bands 4 and 5 have been merged into a single band. The definitions of all the new energy bands are given in Tab. 3.1 below.

Table 3.1:  New energy bands used in 2XMMp processing
Basic energy bands:
  Band   Energy range       Comment
  1      0.2 -  0.5 keV
  2      0.5 -  1.0 keV     formerly part of band 2
  3      1.0 -  2.0 keV     formerly part of band 2
  4      2.0 -  4.5 keV     formerly band 3
  5      4.5 - 12.0 keV     formerly bands 4 and 5
Broad energy bands:
  Band   Energy range       Comment
  6      0.2 -  2.0 keV     soft band, no images made
  7      2.0 - 12.0 keV     hard band, no images made
  8      0.2 - 12.0 keV     total band
  9      0.5 -  4.5 keV     XID band
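
For scripted work with the catalogue, the band definitions of Tab. 3.1 can be held in a small lookup structure. The Python sketch below is illustrative; the function names are hypothetical and the treatment of band edges (lower edge inclusive, upper edge exclusive) is an assumption, since the table does not specify it.

```python
# Band limits in keV, copied from Tab. 3.1.
BASIC_BANDS = {1: (0.2, 0.5), 2: (0.5, 1.0), 3: (1.0, 2.0), 4: (2.0, 4.5), 5: (4.5, 12.0)}
BROAD_BANDS = {6: (0.2, 2.0), 7: (2.0, 12.0), 8: (0.2, 12.0), 9: (0.5, 4.5)}


def basic_band(energy_kev):
    """Return the basic band (1-5) containing the energy, or None if outside."""
    for band, (lo, hi) in BASIC_BANDS.items():
        if lo <= energy_kev < hi:   # edge convention is an assumption
            return band
    return None


def basic_bands_in(broad_band):
    """Return the basic bands that lie fully inside a broad band (6-9)."""
    lo, hi = BROAD_BANDS[broad_band]
    return [b for b, (blo, bhi) in BASIC_BANDS.items() if blo >= lo and bhi <= hi]
```

For example, basic_bands_in(9) lists the basic bands whose images are merged for the XID band.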

3. Event selection for PN images is PATTERN <= 4 and RAWY > 12 with events ON_OFFSET_COLUMN and excluded. Band 1 images have the additional stricter requirement PATTERN = 0, while band 8 images have PATTERN = 0 below 0.5 keV. Band 1 - 5 images have also events OUT_OF_FOV excluded.

4. For MOS band 1 - 5 and 8 images no PATTERN selection is made beyond the 0 - 25 selection made in the event lists. Events OUT_OF_FOV are excluded for band 1 - 5 images.

5. Make exposure images corresponding to the band 1 - 5 count images.

3.1.2 Source detection and parametrization

Source detection is performed simultaneously on images in the energy bands 1 - 5 and from the three EPIC cameras. For observations having multiple exposures from the same camera, exposures are merged by filter and observing mode, and for each camera the merged ('added') exposure with the longest integration time is used for source detection (cf. Sec. 2.2).

In the fitting routine source parameters are determined for the three cameras [PN,M1,M2] in the energy bands 1 - 5 as well as the XID band (cf. Tab. 3.1). These parameters are then combined to obtain camera dependent parameters in band 8 as well as all-EPIC parameters.

a) Creation of multiband exposure maps

Exposure maps hold the effective exposure time for each detector point (see Sec. 3.1.1). They are created by the SAS task eexpmap for each EPIC camera and energy bands 1 - 5 using the latest calibration information on the mirror vignetting, quantum efficiency and filter transmission. The exposure maps (see ca_EXP) are corrected for bad pixels, bad columns and CCD gaps as well as being multiplied by an out-of-time factor (oot_factor):

oot_factor = 0.9411   for PN PrimeFullFrame modes,
             0.97815  for PN ExtendedFullFrame modes,
             1.0      for all other PN and M1/M2 modes.

b) Creation of detection masks

The SAS task emask is used to create a detector mask for each camera. Detector masks define the area of the detectors which is suitable for source detection. Only the areas of the detector where the exposure is at least 50% of the maximum exposure have been used for source detection.
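
The 50% exposure criterion can be illustrated with a few lines of Python. This is a sketch of the thresholding idea only, assuming the exposure map is available as a list of rows; the SAS task emask applies further criteria that are not reproduced here.

```python
def detection_mask(exposure_map, min_fraction=0.5):
    """Return a boolean mask that is True where exposure >= min_fraction * maximum."""
    peak = max(max(row) for row in exposure_map)
    limit = min_fraction * peak
    # An all-zero map yields an all-False mask.
    return [[peak > 0 and value >= limit for value in row] for row in exposure_map]
```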

c) Sliding box source detection (local mode)

The SAS task eboxdetect is used to create a preliminary source list. Eboxdetect has two operation modes: local and map detection. At this stage eboxdetect is run in local mode and performs a sliding box cell detection (box size 5 x 5 pixels) on the detector areas defined by the detection masks. Eboxdetect in local mode uses a local background that is determined in a frame around the search box. All sources with a 0.2 - 12 keV EPIC detection likelihood above 5 are included in the output source list.

d) Creation of multiband background maps

The SAS task esplinemap is used to create background maps for each camera and energy bands 1 - 5. Using a cut-out radius dependent on source brightness, esplinemap blanks out the areas of the images where sources were detected by eboxdetect in local mode. Then esplinemap performs 12 x 12 node spline fits on the resulting source-free images to calculate a smoothed background map for the entire image (ca_BG).

e) Sliding box source detection (map mode)

A second pass of eboxdetect is carried out in map mode. It creates a new source list using this time the background maps generated by esplinemap, increasing thereby the source detection sensitivity as compared to the local detection step. The box size is again set to 5 x 5 pixels. All sources with a 0.2 - 12 keV EPIC detection likelihood above 5 are included in the eboxdetect map mode source list.

f) Source parameter estimation by maximum likelihood fitting

The sources detected by eboxdetect in map mode are passed on to the SAS task emldetect. Emldetect does not perform source detection; instead it calculates PN/M1/M2 source parameters in bands 1 - 5 by fitting the instrumental point spread function (PSF), convolved with a source extent model, to the distribution of counts of the sources detected by eboxdetect (in map mode). The fit is performed simultaneously in bands 1 - 5 and the three cameras. The extent model used is a beta-model profile. Free parameters of the fits are source positions, extent (ep_EXTENT), and count rates (ca_RATE). Positions and extent are constrained to be the same in all energy bands and for all cameras, while count rates are obtained from the best-fit value for each camera and energy band. Detection likelihoods (ca_DET_ML) and extent likelihoods (ca_EXTENT_ML) are derived as well.

In a second loop emldetect attempts to fit two PSFs to sources detected as extended, and for those detections where the split was successful the (point) source parameters were re-calculated.

Emldetect uses the multiband exposure maps to correct the count rates for vignetting and losses due to inter-chip gaps and bad pixels/columns as well as for losses in the PN due to events arriving during readout times (out of time events):

count_rate = source_counts / exp_map .

Emldetect derives four camera dependent X-ray colours known as hardness ratios (HR), which are obtained combining corrected count rates from different energy bands. Each hardness ratio, HRn, is obtained as

HRn = (RATE_b - RATE_a) / (RATE_b + RATE_a)

where RATE_a and RATE_b are the corrected count rates in energy bands a and b (see ca_b_RATE). Energy bands 1  & 2 are used to obtain ca_HR1, 2 & 3 for ca_HR2, 3 & 4 for ca_HR3, and 4 & 5 for ca_HR4.
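
The hardness ratio definition and the band pairing can be written out as a short Python sketch. The function name and the input layout (a mapping from band number to corrected count rate for one camera) are assumptions made for the example; returning None when both rates are zero is likewise a choice of this sketch.

```python
# (band a, band b) for each hardness ratio HRn, as stated in the text.
HR_BANDS = {1: (1, 2), 2: (2, 3), 3: (3, 4), 4: (4, 5)}


def hardness_ratios(rates):
    """rates: {band: corrected count rate}; returns {n: HRn or None}."""
    out = {}
    for n, (a, b) in HR_BANDS.items():
        total = rates[b] + rates[a]
        # HRn = (RATE_b - RATE_a) / (RATE_b + RATE_a); undefined if the sum is zero.
        out[n] = (rates[b] - rates[a]) / total if total > 0 else None
    return out
```

By construction each ratio lies between -1 (all counts in the softer band) and +1 (all counts in the harder band).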

Count rates and therefore hardness ratios are camera dependent. In addition they depend on the blocking filter used for the observation, especially HR1. This needs to be taken into account when comparing hardness ratios for different sources. Note that a large fraction of the hardness ratios are calculated from marginal or non-detections in at least one of the energy bands. Consequently, individual hardness ratios should only be deemed reliable if the source was above the detection likelihood threshold in both energy bands; otherwise they have to be treated as upper limits.

Emldetect calculates source fluxes (ca_FLUX) in bands 1 - 5, in units of erg/s/cm^2, from the count rates (ca_RATE) in those bands using the following expression:

Flux = Rate / ECF ,

where the ECF is an energy conversion factor (to 'convert' count rates to fluxes). ECFs have been calculated using the most recent calibration matrices, assuming a spectral model of an absorbed power law with absorbing column density N_H = 3.0 x 10^20 cm^-2 and continuum spectral slope Gamma = 1.7  (see

ECFs for each camera, energy band and observation blocking filter, in units of 10^11 cts cm^2/erg, are given in Tab. 3.2. EPIC-PN Thin1 and Thin2 filters have been assumed to have the same transmission.

Table 3.2:  Energy conversion factors (ECFs) used in 2XMMp processing
Camera  Band  Open       Thin       Medium     Thick
PN      1     16.1784    8.95403    7.82028    4.71096
PN      2     10.0418    8.09027    7.83782    6.02015
PN      3     6.17030    5.88255    5.78272    5.00419
PN      4     1.95859    1.92805    1.90529    1.80647
PN      5     0.555924   0.555226   0.554529   0.547205
PN      9     5.07412    4.53836    4.43953    3.74772
MOS-1   1     3.15223    1.80399    1.60150    1.06500
MOS-1   2     2.27921    1.88017    1.82853    1.48465
MOS-1   3     2.14933    2.05034    2.01594    1.79446
MOS-1   4     0.757786   0.746128   0.737800   0.707822
MOS-1   5     0.143619   0.143340   0.143131   0.141213
MOS-1   9     1.54600    1.42040    1.39361    1.23264
MOS-2   1     3.17622    1.81179    1.60670    1.06620
MOS-2   2     2.28390    1.88369    1.83088    1.48818
MOS-2   3     2.15017    2.05117    2.01594    1.79530
MOS-2   4     0.761672   0.750569   0.741687   0.711708
MOS-2   5     0.151083   0.150769   0.150560   0.148537
MOS-2   9     1.54912    1.42326    1.39647    1.23524
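
As a worked illustration of Flux = Rate / ECF, the Python sketch below converts a PN count rate to a flux using the PN rows of Tab. 3.2. The function and dictionary names are hypothetical, and only the PN values are copied for brevity; the mapping of Thin1 and Thin2 to the single 'Thin' column follows the statement above that both filters are assumed to have the same transmission.

```python
# PN ECFs in units of 1e11 cts cm^2 / erg, copied from Tab. 3.2.
PN_ECF = {
    1: {"Open": 16.1784, "Thin": 8.95403, "Medium": 7.82028, "Thick": 4.71096},
    2: {"Open": 10.0418, "Thin": 8.09027, "Medium": 7.83782, "Thick": 6.02015},
    3: {"Open": 6.17030, "Thin": 5.88255, "Medium": 5.78272, "Thick": 5.00419},
    4: {"Open": 1.95859, "Thin": 1.92805, "Medium": 1.90529, "Thick": 1.80647},
    5: {"Open": 0.555924, "Thin": 0.555226, "Medium": 0.554529, "Thick": 0.547205},
    9: {"Open": 5.07412, "Thin": 4.53836, "Medium": 4.43953, "Thick": 3.74772},
}


def flux_from_rate(rate, band, filter_name, ecf_table=PN_ECF):
    """Convert a count rate (cts/s) to a flux (erg/s/cm^2) via Flux = Rate / ECF."""
    key = "Thin" if filter_name in ("Thin1", "Thin2") else filter_name
    ecf = ecf_table[band][key] * 1e11   # cts cm^2 / erg
    return rate / ecf
```

For instance, a PN band 2 rate of 0.01 cts/s through the Thin1 filter corresponds to a flux of order 10^-14 erg/s/cm^2.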

Note that all count rates (ca_RATE) and fluxes (ca_FLUX) correspond to the flux in the entire PSF and do not need any further corrections for PSF losses.

Band 8 source parameters are derived from the combination of parameters from bands 1 - 5. For details on how each parameter was obtained see the column descriptions for the source parameters.

Emldetect detection likelihood values (ca_DET_ML) are based on the likelihood ratio described by Cash (1979). To allow comparisons of source detection runs with different source parameters, the detection likelihoods in emldetect are given in the form of 'equivalent' detection likelihoods, i.e., they are corrected for the number of free fit parameters. All sources (as detected by eboxdetect map mode) with 0.2 - 12 keV EPIC detection likelihoods greater than 6 as determined by emldetect are included in the output source list.

g) Fitting XID band source parameters

Band 9 source parameters are for the XID band (0.5 - 4.5 keV). Instead of combining parameters from bands 2, 3 and 4, which will produce overall larger source parameter errors, the SAS task emldetect is run a second time using merged images, exposure maps and background maps from bands 2 - 4. Source positions are kept fixed and a likelihood threshold of zero is used to ensure that band 9 parameters are obtained for all sources detected in the first run of emldetect. The output source list contains only band 9 parameters with errors determined directly from the merged images.

h) Automatic detection flags

One of the improvements over the previous processing pipelines is the setting of automatic flags by the SAS task dpssflag; based on the available information in the emldetect source list it writes a string of nine different flags back into the source list (ca_FLAG) to indicate various conditions. Because the decision tree had to be simple, these flags should be understood mainly as a warning. In particular, sources with a low coverage on the detector, sources in problematic areas (near a bright source or within an extended source) as well as sources near artefacts like the known bright MOS-1 corner or the occasionally bright low gain columns of the PN are flagged. In addition, an attempt was made to identify spurious extended sources which can often be found near bright sources, within complicated extended emission, or generally in areas where the background changes considerably on a small spatial scale and the spline maps cannot adapt well enough. The nine flag positions have been assigned the meanings given in Tab. 3.3 (note that flags 1, 8, and 9 are camera dependent):

Table 3.3:  Flag Keys
Flag  Meaning                                                                      Definition
1     Low detector coverage                                                        ca_MASKFRAC < 0.5
2     Near other source                                                            R ≤ 65 * SQRT (EP_RATE); R(min) = 10", R(max) = 400"
3     Within extended emission                                                     R ≤ 3 * EP_EXTENT; R(max) = 200"
4     Possible spurious extended source near bright source                         Flag 2 is set and EP_CTS(min) = 1000 for the causing source
5     Possible spurious extended source within extended emission                   R ≤ 160" and fraction of rate wrt causing source is 0.4
6     Possible spurious extended source due to unusually large single-band DET_ML  Fraction of ca_b_DET_ML wrt the sum of all ≥ 0.9
7     Possible spurious extended source                                            At least one of the flags 4, 5, 6 is set
8     On bright MOS-1 corner or bright low gain PN column
9     Near bright MOS-1 corner                                                     R ≤ CUTRAD = 60" of a bright pixel in the corner

The default value of every flag is F for False. When a flag was set it means it has been changed to T for True.
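
A flag string of this kind is easy to decode in a script. The Python sketch below assumes that ca_FLAG is stored as a nine-character string of 'T' and 'F' in flag order, which is how the description above reads; the function names are hypothetical. The second function evaluates the flag 2 radius from Tab. 3.3, including its lower and upper limits.

```python
import math


def decode_flags(flag_string):
    """Return the flag numbers (1-9) that are set to 'T' in a nine-character string."""
    if len(flag_string) != 9 or set(flag_string) - {"T", "F"}:
        raise ValueError("expected nine characters, each 'T' or 'F'")
    return [i + 1 for i, c in enumerate(flag_string) if c == "T"]


def near_source_radius(ep_rate):
    """Flag 2 radius in arcsec: 65 * sqrt(EP_RATE), limited to the range 10 to 400."""
    return min(400.0, max(10.0, 65.0 * math.sqrt(ep_rate)))
```

A detection with no flag set decodes to an empty list, which makes filtering for 'clean' detections a simple emptiness test.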

Dpssflag sets the flags that are not camera specific (i.e., flags 2,3,4,5,6,7) on the summary row (EPIC band 8); these are then propagated backwards to the individual cameras and bands (but see the caveat described in Sec. 4.2).

i) Merging of emldetect source lists

The emldetect source list for the bands 1 - 5 is merged with the emldetect XID source list into a common list by the SAS task srcmatch. The output table consists of a single row per detection with parameters from both input lists in different columns. Srcmatch also calculates band 1 - 5 EPIC fluxes (EP_b_FLUX), EPIC hardness ratios (EP_HRn), and their respective errors.

Srcmatch introduces flag columns which are later populated by the pipeline, e.g., for sources where source-specific products (Sec. 3.1.4) have been made (TSERIES and SPECTRA).


Cash, W., 1979, Parameter estimation in astronomy through application of the likelihood ratio, ApJ, 228, 939

3.1.3 Position rectification

The SAS task eposcorr correlates the X-ray positions from an observation (as determined by the fitting routine of emldetect) with catalogued optical positions and minimizes the positional offsets by applying a translation and rotation to the X-ray positions. For the catalogue pipeline the srcmatch source lists were correlated with the USNO B1.0 optical catalogue. The correlation allows offsets in RA/DEC of up to 10 arcseconds while all optical sources more than 15 arcseconds from an X-ray source are removed prior to correlation.

The SAS task evalcorr evaluates the quality of the position rectification of eposcorr. For 2XMMp the following empirically determined condition was used to accept the refined astrometric solution:

POSCOROK is set to True by evalcorr if

 LIK_HOOD > 5.0 + ( 2.0 * LIK_NULL ) ,

where LIK_HOOD and LIK_NULL are determined by the SAS task eposcorr. LIK_NULL is the likelihood calculated for purely coincidental X-ray/optical matches in a given observation, i.e., if there were no true counterparts.

If POSCOROK is set to True the columns RA and DEC give the corrected X-ray positions calculated by eposcorr. If the refined astrometric solution was not accepted the columns RA and DEC are the same as the uncorrected values RA_UNC and DEC_UNC.

The intrinsic systematic 1-sigma error for XMM-Newton fields (i.e., before any correction is applied) was estimated from the width of the distributions of position shifts found in eposcorr runs. A value of SYSERRCC = 1.5 arcseconds is assigned to detections in all fields for which no acceptable astrometric correction using eposcorr was determined (that is, POSCOROK is False).

The residual systematic 1-sigma error for XMM-Newton fields where an acceptable astrometric correction was determined using eposcorr is SYSERRCC = 0.5 arcseconds (that is, POSCOROK is True). This value was estimated from a comparison of the width of the distribution of accepted XMM position shifts with expectations based on the statistical errors alone.
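
The acceptance rule and its consequences for the position and systematic error columns can be summarised as follows. This Python sketch only restates the logic of this section; the function names are hypothetical and the inputs are assumed to be the eposcorr outputs and the uncorrected and corrected coordinates of one detection.

```python
def astrometry_accepted(lik_hood, lik_null):
    """Acceptance condition quoted above for POSCOROK."""
    return lik_hood > 5.0 + 2.0 * lik_null


def catalogue_position(ra_unc, dec_unc, ra_corr, dec_corr, lik_hood, lik_null):
    """Return (RA, DEC, SYSERRCC in arcsec, POSCOROK) for one detection."""
    if astrometry_accepted(lik_hood, lik_null):
        # Refined solution accepted: corrected position, residual systematic error.
        return ra_corr, dec_corr, 0.5, True
    # Solution rejected: uncorrected position, intrinsic systematic error.
    return ra_unc, dec_unc, 1.5, False
```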

3.1.4 Source-specific products

The new pipeline automatically extracts time series and spectra for the brighter detections (EPIC counts ≥ 500; where the detection was only observed with one or two cameras the equivalent EPIC counts were calculated using the PN to MOS count ratio 3.5 : 1). All exposures that passed the filtering (i) -  (iv) in Sec. 2.2  were used for extraction. To extract a detection for a particular camera two conditions had to be met: (i) ca_MASKFRAC ≥ 0.5, and (ii) ca_DET_ML ≥ 15.

The source counts were extracted from a circular aperture with radius 28", and the background counts were extracted from an annulus around the detection position with r(min) = 60" and r(max) = 180". PATTERN selection is the same as for image creation (PN: PATTERN <= 4, MOS: PATTERN <= 12). FLAG selection was done according to the recommendations: FLAG = 0 for the PN, XMMEA_EM for MOS time series, and XMMEA_SM for MOS spectra. The energy range for the extraction of all products is 0.2 - 12 keV.

While time series are filtered only by instrumental GTIs (see Sec. 3.1.1 a) and  b)), spectra are also filtered for the flare background (see Sec. 3.1.1 c)). The variability tests, however, exclude times where the flare background is high.

The bin size for the time series was chosen in such a way that the PN bins contain at least 18 counts and the MOS bins at least 5 counts (as derived from the source lists; note that these are background subtracted according to the background maps determined in the source detection process, see Sec. 3.1.2 d)). The minimum bin size is 10 seconds, and all other bin sizes are rounded up to an integral multiple of 10 seconds.
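
The bin size rule can be illustrated with a small Python sketch. It assumes that the relevant quantity is the mean source count rate of the detection in the given camera (counts divided by exposure time, as taken from the source list); the function name and that interpretation are assumptions of this example.

```python
import math

# Minimum counts per bin, as quoted above.
MIN_COUNTS = {"PN": 18, "M1": 5, "M2": 5}


def time_bin_size(camera, source_rate, step=10.0):
    """Return the bin size in seconds for a mean source rate in cts/s."""
    if source_rate <= 0:
        raise ValueError("source_rate must be positive")
    needed = MIN_COUNTS[camera] / source_rate   # seconds to collect the minimum counts
    # Round up to a multiple of `step`, never below one step.
    return max(step, math.ceil(needed / step) * step)
```

A PN source at 0.5 cts/s, for example, needs 36 seconds to collect 18 counts, which this rule rounds up to a 40-second bin.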

To test for variability we have used a Chi2-test (suitable for binned data) with Pearson's approximation for Poissonian data. Times with high background flaring were excluded from the test. The SAS task ekstest writes four keywords into the header of the time series file, namely CHI2PROB for the probability, CHISQUAR for the Chi2-statistic, N_POINTS for the number of bins used in the test, and AVRATE for the mean rate in the bins used for the test.
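
To make the idea of the test concrete, the following Python sketch computes a Pearson Chi2 statistic for binned counts against a constant level and the corresponding upper-tail probability. It is an illustration under simplifying assumptions (equal-width bins, raw counts per bin, N - 1 degrees of freedom, null hypothesis of a constant source) and is not the algorithm of the SAS task; the dictionary keys merely mirror three of the keyword names above, and both function names are hypothetical.

```python
import math


def chi2_sf(x, dof):
    """Upper-tail probability of a chi-square variable with integer dof >= 1."""
    if x <= 0:
        return 1.0
    h = 0.5 * x
    if dof % 2 == 0:
        # Even dof: finite sum of Poisson-like terms, evaluated in log space.
        total = sum(math.exp(-h + i * math.log(h) - math.lgamma(i + 1))
                    for i in range(dof // 2))
    else:
        # Odd dof: start from the dof = 1 case and add half-integer terms.
        total = math.erfc(math.sqrt(h))
        for i in range(1, (dof - 1) // 2 + 1):
            total += math.exp(-h + (i - 0.5) * math.log(h) - math.lgamma(i + 0.5))
    return min(1.0, total)


def constancy_test(counts):
    """Pearson Chi2 test of binned counts against their mean."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two bins")
    mean = sum(counts) / n
    if mean <= 0:
        raise ValueError("mean counts must be positive")
    # Pearson's approximation: the variance of each bin is the expected value.
    chisq = sum((c - mean) ** 2 / mean for c in counts)
    return {"CHISQUAR": chisq, "N_POINTS": n, "CHI2PROB": chi2_sf(chisq, n - 1)}
```

In this sketch a small probability indicates that the binned counts are unlikely to come from a constant source.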

The available spectral products for each selected detection in 2XMMp are (i) a grouped source spectrum (20 counts/bin; energies below 0.35 keV as well as energies in the PN around the copper line at 8.05 keV are set to 'bad'), (ii) a background spectrum, (iii) a source ARF (ancillary response file), and (iv) a spectral plot made with XSPEC. A keyword in the header of the source spectrum file indicates the name of the canned RMF (response matrix file) that can be used for this detection.

The available time series products for each selected detection in 2XMMp are (i) a time series file containing the source minus background and background arrays (corrected for exposure, cosmic rays, and dead time) as well as the keywords regarding the variability, and (ii) a plot of the time series and the background made by the SAS task elcplot.

The available products are identified by their observation ID (OBSID), exposure ID and the observation-specific source number SRC_NUM in the hexadecimal system. Further details and a discussion of the limitations of an automatic extraction can be found in

3.2 Modifications to the pipeline output

3.2.1 Ontime values

The ca_ONTIME values in the source lists are copied from the chip-dependent ONTIME keywords in the header of the images by the SAS task emldetect. For 2XMMp these keywords were not updated when exposures were merged for the source detection, and as a consequence the ONTIME value in the individual source lists refers to the first exposure in the merged set (only ~ 2% of all observations have one or more merged exposures). This has been rectified in the catalogue, that is, ca_ONTIME now refers to the merged (and GTI-filtered) exposure time of the chip where the detection is located. Note, though, that the ONTIME values are undefined when the position of a detection fell into a chip gap or outside the detector.

3.2.2 Errors of EPIC hardness ratios 3 and 4

Due to a problem in the SAS task srcmatch, the individual source lists can give an error for a hardness ratio while the hardness ratio itself is NULL. This affected only the columns EP_HR3_ERR and EP_HR4_ERR. These have been set to NULL in the catalogue.

3.2.3 Band-dependent likelihoods

In some cases, the individual source lists give band-dependent likelihoods ca_b_DET_ML that are defined for one band but undefined (i.e., NULL) for another band for a given detection and camera. This is not correct since either the detection was observed (all values are defined) or not observed (all values are undefined). These 'wrong' NULL values have been set to zero in the catalogue if at least one other band likelihood was given as defined and > 0.
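The rule above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the code used to build the catalogue; the function name and the dictionary layout (band number mapped to likelihood, with None standing for NULL) are assumptions made for this example.

```python
from typing import Dict, Optional


def fix_band_likelihoods(band_ml: Dict[int, Optional[float]]) -> Dict[int, Optional[float]]:
    """Apply the NULL-to-zero rule to the ca_b_DET_ML values of one
    detection and one camera.

    band_ml maps a band number to its likelihood; None stands for NULL
    (hypothetical layout, chosen for this sketch only).
    """
    # The rule only applies if at least one band has a defined value > 0.
    has_positive = any(v is not None and v > 0 for v in band_ml.values())
    if not has_positive:
        return dict(band_ml)
    # Replace the 'wrong' NULL values by zero, keep everything else.
    return {band: (0.0 if v is None else v) for band, v in band_ml.items()}
```

A detection that is NULL in all bands is left untouched, since in that case the camera is taken not to have observed it.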

3.3 Catalogue creation

Most of the catalogue columns are derived from information in the output lists by the srcmatch task (see Sec. 3.1.2 i)), but it was also necessary to extract information from the emldetect source lists (see Sec. 3.1.2 f)). Some modifications to a few columns were necessary, see Sec. 3.2. Additional columns derived from other products and from further processing are explained below.

3.3.1 Meta data

The catalogue includes meta data derived from existing products to help characterize the detections. These are the observation ID (OBSID); revolution number (REVOLUT); the beginning and end of the observation in Modified Julian Date format (MJDSTART and MJDSTOP); filter (ca_FILTER) and submode (ca_SUBMODE); note that the latter two apply to all exposures in a merged set, see Sec. 2.2.

3.3.2 Identifications

Every row in the catalogue is a detection and has received a running number (DETID). Several detections can refer to the same physical source in the sky (observed at different times); these are identified with a unique source ID (SRCID, see the description in subsection a) below). Every detection is also identified by its observation-specific (decimal) source number SRC_NUM which, in the hexadecimal system, is used together with the observation and exposure ID to identify source-specific products via their file name.

a) Unique source number

Many parts of the sky were observed more than once, either because an interesting object was a target more than once, or because two or more fields happened to overlap. It was therefore desirable to identify all cases in which the same source was responsible for two or more detections, i.e., separate rows in the catalogue. All detections for which this appears to be true have been given the same SRCID number.

The matching to find unique sources was performed on the basis of coincidence of celestial coordinates within certain limits. An estimated positional error, POSERR, was computed by taking the larger of the RADEC_ERR value from the source fitting and the estimated systematic error, SYSERRCC, for the observation. Because in a few cases RADEC_ERR values were unreasonably large (up to 20 arcseconds), an upper limit of 7 arcseconds on the matching distance was also applied.

All possible pairs of detections from different observations are considered and the great-circle distance between them, GCDIST, computed. Two detections a and b are considered to be matched if (using SQL notation):

GCDIST < LEAST (0.9 * a.DIST_NN, 0.9 * b.DIST_NN,  7.0, 3.0 * (a.POSERR + b.POSERR)) .

The DIST_NN value for each detection records the distance to its nearest neighbour in that observation, which in a few cases was less than 7 arcseconds, generally because a detection which initially appeared to be an extended object was split into two. The 0.9 * DIST_NN part of the formula was therefore used to ensure that close pairs of detections did not cross-match incorrectly.
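As an illustration, the matching criterion can be written as a small Python function. This is a sketch only: the catalogue matching was done in SQL inside a database, and the dictionary keys used here (RA, DEC, POSERR, DIST_NN, OBSID) as well as the use of the haversine formula for GCDIST are assumptions made for the example. Coordinates are taken in degrees and all distances in arcseconds.

```python
import math


def gcdist_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle distance in arcseconds between two positions given
    in degrees (haversine formula, numerically stable for small angles)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2.0) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def is_match(a, b):
    """Return True if detections a and b satisfy the matching criterion.

    a and b are dicts with the (assumed) keys RA, DEC [deg],
    POSERR, DIST_NN [arcsec] and OBSID.
    """
    # Only pairs from different observations are considered.
    if a["OBSID"] == b["OBSID"]:
        return False
    limit = min(0.9 * a["DIST_NN"], 0.9 * b["DIST_NN"],
                7.0, 3.0 * (a["POSERR"] + b["POSERR"]))
    return gcdist_arcsec(a["RA"], a["DEC"], b["RA"], b["DEC"]) < limit
```

For two detections 2 arcseconds apart with POSERR = 1 arcsecond each and no close neighbours, the limit is 6 arcseconds and the pair is matched; if one of them has DIST_NN = 2 arcseconds, the limit drops to 1.8 arcseconds and the pair is not matched.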

The matching was performed efficiently within a Postgres database using R-tree indexing.

Since the actual X-ray sky remains unknown, it is likely that a few cases exist in which two distinct objects have been assigned the same SRCID number, or vice-versa. Further refinement of the matching algorithm will be attempted for the final 2XMM catalogue.

b) IAU identification

An IAU identification, IAUNAME, has been assigned to each unique source (SRCID) based upon the IAU registered classification 2XMMp. The form of these names is "2XMMp Jhhmmss.sSddmmss" where hhmmss.s is taken from the eposcorr corrected right ascension coordinate given in the column RA and Sddmmss is the sign and eposcorr corrected declination taken from the column DEC. A detection (entry in the catalogue) will be uniquely identified by its IAUNAME and DETID (with six digits): "2XMMp Jhhmmss.sSddmmss:detid".
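The name format can be illustrated with the following Python sketch. Note that this guide does not state whether the coordinates are truncated or rounded when the name is formed; the sketch assumes truncation, which is the usual convention for coordinate-based designations, so names generated this way may differ from IAUNAME in the last digit. The function names are hypothetical.

```python
import math


def iau_name(ra_deg, dec_deg):
    """Format 'Jhhmmss.sSddmmss' from RA and DEC in degrees.

    ASSUMPTION: coordinates are truncated, not rounded.
    """
    # RA in tenths of a second of time: deg / 15 * 3600 * 10 = deg * 2400.
    ra_tenths = int(math.floor(ra_deg * 2400.0))
    hh, rem = divmod(ra_tenths, 36000)
    mm, ss_tenths = divmod(rem, 600)
    sign = "-" if dec_deg < 0 else "+"
    dec_arcsec = int(math.floor(abs(dec_deg) * 3600.0))
    dd, rem = divmod(dec_arcsec, 3600)
    dm, ds = divmod(rem, 60)
    return (f"2XMMp J{hh:02d}{mm:02d}{ss_tenths // 10:02d}.{ss_tenths % 10}"
            f"{sign}{dd:02d}{dm:02d}{ds:02d}")


def detection_label(name, detid):
    """Append the six-digit DETID to an IAU name."""
    return f"{name}:{detid:06d}"
```

For example, RA = 10.0 deg and DEC = -5.5 deg give "2XMMp J004000.0-053000", and a detection with DETID 9557 for that source would be labelled "2XMMp J004000.0-053000:009557".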

3.3.3 Combined parameters for unique sources

Several source parameters were averaged or otherwise combined to characterize a unique source in the catalogue (N_DETECTIONS indicates the number of detections found for the unique source). All columns referring to the parameters of a unique source have the prefix AV.

Means weighted by the inverse of the estimated variance, together with their errors, are given for coordinates (AV_RA, AV_DEC, AV_POSERR) as well as the flux (AV_FLUX, AV_FLUX_ERR) and hardness ratios (AV_HRn, AV_HRn_ERR). Note that the error on a weighted mean is calculated as

mean_err = SQRT( 1.0 / SUM( 1 / err_i^2 ) ).
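A minimal Python sketch of the inverse-variance weighted mean and its error is given here for illustration. The function name is hypothetical, and the sketch ignores complications such as right ascension values wrapping around 0/360 degrees.

```python
import math


def weighted_mean(values, errors):
    """Inverse-variance weighted mean and its error.

    values and errors are sequences of equal length; every error must
    be > 0. Returns (mean, mean_err) with
    mean_err = sqrt(1 / sum(1 / err_i^2)).
    """
    weights = [1.0 / (err * err) for err in errors]
    wsum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / wsum
    return mean, math.sqrt(1.0 / wsum)
```

With two measurements 2.0 ± 1.0 and 4.0 ± 2.0, the weights are 1 and 0.25, so the mean is 3.0 / 1.25 = 2.4 and the error is sqrt(1 / 1.25), about 0.89. The more precise measurement dominates the result.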

Maximum likelihoods (AV_DET_ML and AV_EXT_ML) are 'equivalent' detection likelihoods, i.e., they are summed over all detections of this unique source and corrected for the number of free parameters (cf. Sec. 3.1.2 f)). Finally, the maximum of all summary flags was determined (AV_SUM_FLAG).

3.3.4 Cross-Matching with 1XMM detections

Of the 2400 observations used for the present catalogue, 458 observations are in common with the 1XMM Serendipitous Source Catalogue (cf. the selection of observations in Sec. 2.1). The most likely counterparts in the 1XMM catalogue were found by cross-matching the 1XMM detections with the 2XMMp unique source positions (using AV_RA and AV_DEC) with a simple limit of 10 arcseconds in distance; only the closest match is given. Their names (MATCH_1XMM) and the distance between the two detections (SEP_1XMM) are given in the catalogue.

3.3.5 Visual screening

Visual screening of the processed data was performed in a very simple manner: each observation was visually inspected with the sources overlaid and the automatic flags indicated. It was assumed that the automatic flags 7 and 8 indicate detections that are probably spurious and should be excluded from any clean sample, that is, what we call 'spurious' in the following definitions is assumed to not have flags 7 or 8 set. With that in mind, four classes of observations were identified (see OBSFLAG):

0 (52%): no obvious spurious detections (may include one or two non-obvious ones);
1 (30%): one or more spurious detections, but usually less than 8 (or a smaller number when the total number of detections was exceptionally small);
2 (0.9%): should be part of class 1 but with the addition that there is a group of very densely positioned sources where many sources were missed and the quality of the source parameters may be questionable;
3 (16%): not accepted for inclusion in the catalogue.

In addition, observations with known problems that affect the source detection and/or parameterization were excluded (0.7%).

3.3.6 The Summary Flag

The column SUM_FLAG provides an overall quality indication of a detection, as a single integer value, based on the automatically set flags (Sec. 3.1.2 h)).

The summary flag is defined as follows, with the individual flags being set to True for:

0 : Summary flag 0 is given if none of the flags for the three cameras [PN,M1,M2] are set to True, i.e., there are no negative indications for this detection.
1 : Summary flag 1 is given if any of the warning flags [1,2,3,9] for any of the cameras [PN,M1,M2] is True, i.e., the source parameters are considered to possibly have some problems.
2 : Summary flag 2 is given if any of the 'spurious detection' flags [7,8] for any of the cameras [PN,M1,M2] is True (note that flag 7 is set to True if any of the flags for possible spurious extended detection [4,5,6] is set to True), i.e., the detection is likely to be spurious.
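The definition above can be sketched as follows. Two points are assumptions of this example rather than statements of the guide: that summary flag 2 takes precedence over 1 when both kinds of flags are set, and that flags not named in the definition are ignored. The input layout (camera name mapped to the set of flag numbers that are True) is also hypothetical.

```python
WARNING_FLAGS = {1, 2, 3, 9}
# Flags 4, 5, 6 imply flag 7 according to the definition, so they are
# treated here as 'spurious detection' flags as well.
SPURIOUS_FLAGS = {4, 5, 6, 7, 8}


def summary_flag(camera_flags):
    """Derive SUM_FLAG from the per-camera flags.

    camera_flags maps a camera name ('PN', 'M1', 'M2') to the set of
    flag numbers that are True for that camera (assumed layout).
    ASSUMPTION: 2 takes precedence over 1, which takes precedence over 0.
    """
    all_set = set().union(*camera_flags.values())
    if all_set & SPURIOUS_FLAGS:
        return 2
    if all_set & WARNING_FLAGS:
        return 1
    return 0
```

A detection with warning flag 1 on PN and flag 8 on M2 would thus receive SUM_FLAG = 2 under the assumed precedence.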

3.3.7 Variability information

A variability flag VAR_FLAG was set to True for a detection if at least one of the time series for this detection (derived from all appropriate exposures) has a Chi2-probability ≤ 1E-5 as determined by the SAS task ekstest (see CHI2PROB and Sec. 3.1.4). If the flag was set, the camera and exposure ID with the lowest Chi2-probability are given as well (VAR_EXP_ID).
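The flag logic can be sketched in Python as below. The function name, the input layout and the exposure identifiers used in the example are hypothetical; only the threshold of 1E-5 and the choice of the lowest probability come from the description above.

```python
def variability(chi2probs, threshold=1e-5):
    """Return (VAR_FLAG, VAR_EXP_ID) for one detection.

    chi2probs maps a camera/exposure identifier to the CHI2PROB of the
    corresponding time series; None marks a missing time series.
    """
    defined = {exp: p for exp, p in chi2probs.items() if p is not None}
    if not defined:
        return False, None
    # Exposure with the lowest Chi2-probability.
    best = min(defined, key=defined.get)
    if defined[best] <= threshold:
        return True, best
    return False, None
```

For instance, probabilities of 1.3E-7, 3.2E-3 and 1.9E-3 for three exposures would set the flag and report the first exposure, while a detection whose lowest probability is 0.5 would not be flagged.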

3.4 Additional Processing

3.4.1 Thumbnails

Thumbnail images have been made of every detection in the catalogue. A maximum of 12 small and one large thumbnail image (called location image) is available per source. The small thumbnail images were made in the bands 6, 7, and 8 (see Tab. 3.1) for each of the (sometimes merged) exposures used in the source detection from the set M1, M2 and PN as well as the all-EPIC mosaiced image. The thumbnails are stored as GIF files.

The small and large thumbnail images are 8 and 24 arcmin across in size, respectively. The images are not smoothed. The band 1 - 5 image data are taken from the fits images which form part of the catalogue product set, and thus embody the same X-ray event selections as these images. The merged images are all derived by adding the respective individual band images together. The brightness scaling of the thumbnail images is linear, but pixel brightness is truncated at a given saturation value. The 'heat' colour map is used, and the images are scaled so that the pixel range from 0 to the saturation limit spans the colour map. The value of the saturation is calculated for optimum display of the source at the centre of the field.

Green cross-hairs are overlaid over the centre of the image to display the source position of interest.

The legend at the head of each image gives the following:

3.4.2 Summary html pages

Summary html pages have been made for every detection in the catalogue. These pages give a selection of (sometimes simplified) parameters from the catalogue together with zoom-in images of the thumbnails, time series, and spectra.

4. The Catalogue

4.1 Column description

There are 275 columns in the catalogue; they are grouped together and explained in the links below.

For each observation there are up to three cameras with one or more exposures which were merged when the filter and submodes were the same (Sec. 2.2). Each exposure is divided up into several energy bands (Tab. 3.1). Consequently, the source parameters can refer to some or all of these levels: on the observation level there are the final mean parameters of the source (prefix EP), on the camera level the data for each of the up to three cameras are given (prefix PN, M1, or M2), and on the energy band level we give the energy-dependent details of the source parameters (indicated by a _b_ in the column name where b = 1,2,3,4,5,8,9). Finally, on a meta-level we have combined some parameters of sources that were detected more than once (prefix AV), see Sec. 3.3.3.
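The naming scheme can be illustrated by expanding the placeholders ca and b used in this guide into concrete column names. The helper below is only a sketch with a hypothetical name; not every combination it produces necessarily exists as a column in the catalogue, so the result should be checked against the column descriptions.

```python
CAMERAS = ("PN", "M1", "M2")
BANDS = (1, 2, 3, 4, 5, 8, 9)


def expand_column(template):
    """Expand a template such as 'ca_b_DET_ML' into concrete names.

    A leading 'ca' is replaced by each camera prefix and '_b_' by each
    band number; templates without placeholders are returned unchanged.
    """
    names = [template]
    if template.startswith("ca_"):
        names = [cam + name[2:] for name in names for cam in CAMERAS]
    if "_b_" in template:
        names = [name.replace("_b_", f"_{band}_", 1)
                 for name in names for band in BANDS]
    return names
```

Here expand_column("ca_ONTIME") yields PN_ONTIME, M1_ONTIME and M2_ONTIME, while expand_column("ca_b_DET_ML") yields 21 names starting with PN_1_DET_ML.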

The column name is given in capital letters, the FITS data format in brackets and the unit in square brackets. If the column originates from a SAS task, the name of the task is given to the right hand side and a link is set to the online SAS 6.5.0 package documentation on (note that this is a slightly older version compared to the SAS used in the processing of 2XMMp which is not publicly available, see A.3 for more details). A description of the column and possible cross-references follow.

Entries with NULL are given when the detection was not observed with the respective camera (that is, ca_MASKFRAC < 0.15).

Details of the columns

Part 1: 6 columns: Identification of the source
This includes cross matches with the 1XMM catalogue
Part 2: 11 columns: Details of the observation and exposures
Part 3: 8 columns: Coordinates
The external equatorial and Galactic coordinates and the internal equatorial coordinates are given together with the error estimates.
Part 4: 221 columns: Source parameters
The parameters of the source detection as derived from the SAS tasks emldetect and srcmatch are given here.
Part 5: 7 columns: Detection flags
This part lists the flags to qualify the detections. The summary flag, which gives an overall assessment for the detection, is followed by particular flags for each camera. A flag each is given if there exists at least one time series or one spectrum for this source.
Part 6: 5 columns: Source variability
This part gives variability information for those detections for which time series were extracted.
Part 7: 17 columns: Unique source parameters
This part lists the source parameters for the unique sources across all observations (using the prefix AV); these are coordinates, fluxes, hardness ratios, likelihoods, and a summary flag. The number of detections is also given.

4.2 Known problems and open issues

4.3 Properties of the catalogue

The catalogue contains source detections drawn from 2400 XMM-Newton EPIC observations made between 2000 February 4 and 2006 April 20, selected according to the criteria described in Sec. 2. Net exposure times in these observations range from < 1000 up to ~ 90000 seconds.

The total sky area of the 2400 XMM-Newton observations is ~ 400 deg^2 which is reduced to ~ 285 deg^2 when corrected for field overlaps. For more detail see Fig. 4.1 which shows the sky area as a function of net exposure time.

The catalogue contains 153105 X-ray detections with total-band [0.2 - 12 keV] likelihood values ≥ 6. These relate to 123170 unique X-ray sources (Sec. 3.3.2 a)); 46796 X-ray sources were observed more than once, up to 20 times in total. Of the 153105 X-ray detections 6271 are classified as extended, and 5717 of these are unique extended X-ray sources.

The total-band [0.2 - 12 keV] median flux of the catalogue detections is ~ 2.4*10^-14 erg/cm^2/s, while ~ 20% of the detections have fluxes below 1*10^-14 erg/cm^2/s. Hence, the new source detection used for 2XMMp is more sensitive than the source detection used in 1XMM where the median flux was ~ 3*10^-14 erg/cm^2/s and only ~ 12% had fluxes below 1*10^-14 erg/cm^2/s.

The overall astrometric accuracy of the catalogue source positions has been checked statistically by cross-correlating with various optical catalogues and comparing the distribution of position offsets with expectations. These checks confirm that the 2XMMp positions have no systematic shift and that the quoted errors are realistic.

Document revision history

Release No.  Release Date      Comments
1.0          24 July 2006      First release
1.0.1        04 August 2006    Correct column names EP_EXTENT_ML and EP_EXTENT_ERR
1.0.2        31 August 2006    Added notes in section 4.2 on CHI2PROB problem and XID band count rates
1.0.3        22 February 2007  - Added notes in section 4.2 on problems with EPIC band 4 fluxes and a list of incorrect spectrum plots.
                               - Corrected an error in the observation list in Table 2.1 (for observation 0200430701, camera M2).
                               - Corrected equation for extent likelihoods.


A.1 Catalogue data-products description

A small number of associated data products are released together with the catalogue and are available from various Web-based user interfaces (see here). (Note that the final 2XMM catalogue will come with all EPIC products produced in the pipeline). These selected products are:

EPIC images in fits format (bands 1 - 5 and  8 per exposure and EPIC band 8 image) and png format (all band 8 images);
The EPIC exposure maps in fits format (bands 1 - 5 and  8 per exposure and EPIC band 8 image) and png format (all band 8 images);
The source time series files in fits format;
The source time series plots in pdf format;
The source spectra fits files (source as well as background);
The source arf fits files;
The source spectra plots as pdf files;
The optical finding charts as pdf files.

These products follow the standard specification, as described in the Data Files Handbook and the SSC products Specification available at

Outside of the pipeline processing (Sec. 3.4) two extra sets of products were made: the thumbnail and source location images (Sec. 3.4.1), and the source summary html pages (Sec. 3.4.2):

The thumbnail images are graphical products and were made from the fits (merged) images. They are named C<obsid><exp>SRCIMG<band><srcnum>.GIF.
The location image is a graphical product made from the fits EPIC (merged) images; it is named C<obsid>EPX000SRCIMW8<srcnum>.GIF.
The source summary page is an html page which includes several graphic products and some extracts from the catalogue; it is named C<obsid>EPX000SRCSUM0<srcnum>.HTM.

These files were not made using the PCMS, and for this reason they have filenames starting with C instead of P (the other parts of the filename follow the pipeline product standard).
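The file name patterns above can be assembled as in the following Python sketch. The helper names are hypothetical, and two details are assumptions not stated in this guide: that the hexadecimal source number is written in upper case, and that it is zero-padded to three digits (exposed here as the parameter width). The exposure identifier in the example is likewise made up.

```python
def hex_src(src_num, width=3):
    """Hexadecimal form of the decimal SRC_NUM.

    ASSUMPTION: upper-case digits, zero-padded to `width` characters.
    """
    return format(src_num, f"0{width}X")


def thumbnail_name(obsid, exp, band, src_num, width=3):
    """C<obsid><exp>SRCIMG<band><srcnum>.GIF"""
    return f"C{obsid}{exp}SRCIMG{band}{hex_src(src_num, width)}.GIF"


def location_name(obsid, src_num, width=3):
    """C<obsid>EPX000SRCIMW8<srcnum>.GIF"""
    return f"C{obsid}EPX000SRCIMW8{hex_src(src_num, width)}.GIF"


def summary_name(obsid, src_num, width=3):
    """C<obsid>EPX000SRCSUM0<srcnum>.HTM"""
    return f"C{obsid}EPX000SRCSUM0{hex_src(src_num, width)}.HTM"
```

Under these assumptions, source number 26 of observation 0200430701 in a (hypothetical) exposure M2S002 would have the band 8 thumbnail C0200430701M2S002SRCIMG801A.GIF.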

A.2 List of observations used to construct the catalogue

List of observations ('fields').

A.3 Catalogue pipeline processing details

The processing was conducted during April and May 2006 using the calibration files (CCFs) that were available on 11 April 2006. The pipeline manifest can be found here where the SAS task versions used for making the catalogue are listed together with the task version numbers which are publicly available (SAS 6.0, 6.5, and 7.0). The references to the SAS tasks given in the user guide refer to the SAS 6.5 version which is slightly older than the version used for processing and which is not publicly available.

A.4 Corrected values of chi-squared probability

detid   srcid   oldpn_chi2prob  pn_chi2prob  oldm1_chi2prob  m1_chi2prob  oldm2_chi2prob  m2_chi2prob
9557    7650    1.17377e-05     2.10999e-08  1.93683e-07     1.93683e-07  2.67107e-10     2.67107e-10
9559    7650    1.19683e-06     1.31235e-07  0.00324579      0.00324579   0.00188667      0.00188667
10446   8288    0.00660075      0            4.89135e-06     1.75755e-15  3.57594e-05     1.54045e-12
29787   23932   0               0            2.88454e-32     0            0               0
31605   24731   0.691837        1.93202e-16  0.898659        0.608448     0.882586        0.412855
31606   24731   -               -            0.115039        0.00187067   0.278877        1.84597e-06
32499   25582   -               -            0.639053        0.000613184  0.617727        4.06858e-06
32507   25590   1.74005e-20     1.74005e-20  0.639053        1.36321e-12  0.617727        9.53159e-08
32543   25624   3.15928e-07     3.15928e-07  0.639053        0.00165751   0.617727        0.0148653
32550   25631   4.38522e-24     4.38522e-24  0.771211        1.36499e-07  0.00221883      1.19183e-05
32802   25828   0.183871        2.0335e-09   4.32752e-06     4.32752e-06  0.0587464       0.0587464
32805   25830   0.360471        7.39745e-19  0.000125234     0.000125234  0.0092283       0.0092283
32818   25842   0.897808        7.72032e-24  0.00017236      0.00017236   2.52377e-07     2.52377e-07
33016   25993   0.109717        0.0381989    0.391389        0.391389     3.0614e-06      3.0614e-06
35106   27727   8.35872e-20     8.35872e-20  0.343035        1.04537e-06  0.83493         1.49114e-09
46915   36611   0.0191414       2.51477e-11  0.409126        0.000968114  0.622424        0.000166273
48243   37721   0.000812056     0            0.98771         0.98771      -               -
50481   39529   0.000225779     0            0.738336        5.58982e-14  0.833166        2.70628e-17
50811   39827   -               -            0.396345        7.174e-07    0.381365        1.99242e-17
57188   44295   -               -            0.390173        7.17607e-23  0.630363        0.0470923
57193   44297   -               -            0.0411725       1.52168e-20  0.939411        0.0786263
57526   44439   1.11496e-20     1.11496e-20  0.163991        1.35788e-09  0.0768756       1.26636e-09
57744   44563   5.83865e-20     5.83865e-20  0.697517        2.1888e-07   0.154472        1.70633e-05
61927   48076   -               -            -               -            0.704294        3.40813e-06
64780   50589   -               -            0.474691        0            0.953092        0
64807   50609   -               -            0.474691        2.17314e-07  0.953092        1.56783e-08
64816   50618   -               -            0.457168        8.71879e-16  0.953092        1.29783e-18
95914   76436   -               -            0.211565        1.81129e-08  0.0775838       1.31466e-06
117246  94297   -               -            0.388802        2.21633e-11  -               -
131566  105332  0.000964788     3.27169e-22  0.000789271     0            1.96626e-14     0
132697  106206  0.317102        0.317102     5.97427e-27     5.97427e-27  0.00981824      0.00186613
142328  113584  0               0            1.39407e-11     1.54977e-14  6.33997e-14     1.11164e-14