Skip to the navigation links
Last modified: 29 October 2024

URL: https://cxc.cfa.harvard.edu/ciao/merging/imaging.html

Merging Central

Combining Datasets for Imaging & Spatial Analysis



This provides an overview of the considerations involved with combining observations for spatial/imaging analysis. For the most part, the most common tasks is automated using through the merge_obs/reproject_obs/flux_obs family of CIAO scripts. A broad discussion that began pre-dating these scripts and expanded upon their introduction is found in the Why combine data? 'why' page.

Astrometric Corrections (optional)

While updating the astrometry is generally unnecessary, if your science requires the best possible source positions, then you may want to apply astrometric offsets to individual observations to get the correct registration. Chandra's absolute pointing accuracy is generally better than 0.4 arcsec but can be improved in an absolute sense by cross-matching Chandra sources with other higher precision catalogs or in a relative sense by cross matching one Chandra observation with another observation. The CIAO tools reproject_aspect, wcs_match, and wcs_update can be used to compute the fine astrometic shifts between two source lists and apply the offsets to various Chandra files.

Reprojection:

When spatially combining data sets with similar sky coordinate definitions, the effects of the differing tangent planes will be difficult to detect; however, an [extreme] example of how things can go awry without reprojecting to a common sky coordinates grid is by looking at data sets near the celestial poles (e.g. observations within 10 deg of the south celestial pole). The combined data sets without reprojecting the observations to a common tangent plane, a single observation will define the WCS used to transform each observation's sky coordinates to celestial coordinates in the combined data sets, which is clearly incorrect as seen in Figure 1 .

[NOTE]
Note

TL;DR it is a fair assumption that the sky coordinates system and its respective WCS transform are unique for each observation.

Co-adding images within CIAO is typically performed in terms of the real-valued "sky" (also referred as "physical") coordinates by the tools.

Recall that the "sky" coordinate system records the event position on a fictitious tangent plane to a nominal, fixed celestial pointing direction. That is, the "sky" plane is the projection of the celestial sphere onto a 2D plane where the plane intersects orthogonally with the 3D unit sphere in the pointing direction.

For a typical observation the tangent point direction is defined by its nominal right ascension and declination which is generally the same as the telescope's aimpoint location—the time-averaged RA and Dec of the telescope's optical-axis due to spacecraft dithering.

The nominal RA and Dec for the tangent point for an observation is then set to have the sky coordinate position to be (4096.5,4096.5) for ACIS observations, (16384.5,16384.5) for HRC-I observations, and (32768.5,32768.5) for HRC-S observations. The With the exception of multi-ObI observations, the aimpoint will have the exact same sky coordinates as the nominal tangent point too, albeit the nominal tangent point and aimpoint can differ on the detector if the Science Instrument Module is offset for an observation.

A practical consequence of the sky coordinate system being specific to a given observation is that the WCS transform between sky and celestial coordinates will also be specific to that observation. This means that to correctly stack observations for imaging analysis, the observations must all share a common sky coordinate system and WCS transform, which is achieved through reprojecting the data sets to a common spherical projection onto a plane.

In this example, we use 49 ACIS observations of the C-COSMOS field which encompasses a 0.5 square degree region centered at \(\alpha = 10^{\mathrm{h}},\ \delta = {+2}^{\circ}\). The most simplistic approach to combining files is to use dmmerge on a set of FITS tables, e.g. event files with a common set of columns.

An example of directly using the reprocessed data sets and merging event files that have not been projected to a common sky plane.

unix% falafel-194: find_chandra_obsid "10:00:00" "02:00:00" radius=30 download=none | grep "C-COSMOS"
8027     20.8 ACIS-I NONE   49.4 2007-04-14  Elvis                   C-COSMOS
8021     19.1 ACIS-I NONE   48.8 2007-04-09  Elvis                   C-COSMOS
8015     20.5 ACIS-I NONE   45.5 2007-01-07  Elvis                   C-COSMOS
8026     13.8 ACIS-I NONE   47.1 2007-04-13  Elvis                   C-COSMOS
8009     24.7 ACIS-I NONE   45.9 2007-01-02  Elvis                   C-COSMOS
8020     11.1 ACIS-I NONE   49.1 2007-04-09  Elvis                   C-COSMOS
8483     30.4 ACIS-I NONE   21.8 2006-12-04  Elvis                   C-COSMOS
8004     30.4 ACIS-I NONE   15.7 2006-11-27  Elvis                   C-COSMOS
8482     30.4 ACIS-I NONE   10.4 2006-12-02  Elvis                   C-COSMOS
8014     13.5 ACIS-I NONE   47.9 2007-01-05  Elvis                   C-COSMOS
7999     37.0 ACIS-I NONE   29.6 2006-11-25  Elvis                   C-COSMOS
8478     37.0 ACIS-I NONE   18.0 2006-11-24  Elvis                   C-COSMOS
8025      8.9 ACIS-I NONE   49.1 2007-04-12  Elvis                   C-COSMOS
8008     19.2 ACIS-I NONE   46.4 2007-01-02  Elvis                   C-COSMOS
8019      3.1 ACIS-I NONE   49.5 2007-04-06  Elvis                   C-COSMOS
8003     26.1 ACIS-I NONE   47.0 2007-04-02  Elvis                   C-COSMOS
8013      8.3 ACIS-I NONE   48.1 2007-01-04  Elvis                   C-COSMOS
8493     33.5 ACIS-I NONE   19.8 2006-12-12  Elvis                   C-COSMOS
7998     33.5 ACIS-I NONE   27.6 2007-01-10  Elvis                   C-COSMOS
8024      9.7 ACIS-I NONE   49.4 2007-04-11  Elvis                   C-COSMOS
8007     16.0 ACIS-I NONE   21.8 2006-12-19  Elvis                   C-COSMOS
8497     16.0 ACIS-I NONE   27.8 2006-12-25  Elvis                   C-COSMOS
8018      4.9 ACIS-I NONE   46.8 2007-04-05  Elvis                   C-COSMOS
8002     23.9 ACIS-I NONE   29.6 2006-12-19  Elvis                   C-COSMOS
8496     23.9 ACIS-I NONE   18.4 2006-12-23  Elvis                   C-COSMOS
8012      9.1 ACIS-I NONE   49.5 2007-01-04  Elvis                   C-COSMOS
8494     31.8 ACIS-I NONE   20.8 2006-12-16  Elvis                   C-COSMOS
8122     31.8 ACIS-I NONE   28.8 2007-01-20  Elvis                   C-COSMOS
8023     15.4 ACIS-I NONE   49.4 2007-04-10  Elvis                   C-COSMOS
8503     16.4 ACIS-I NONE   20.4 2006-12-31  Elvis                   C-COSMOS
8006     16.4 ACIS-I NONE   26.7 2007-01-01  Elvis                   C-COSMOS
8017     12.9 ACIS-I NONE   46.8 2007-04-04  Elvis                   C-COSMOS
8123     24.2 ACIS-I NONE   49.1 2007-04-07  Elvis                   C-COSMOS
8011     15.0 ACIS-I NONE   47.1 2007-04-04  Elvis                   C-COSMOS
7997     32.1 ACIS-I NONE   45.4 2006-12-30  Elvis                   C-COSMOS
8022     22.5 ACIS-I NONE   31.8 2007-05-10  Elvis                   C-COSMOS
8555     22.5 ACIS-I NONE   16.6 2007-05-12  Elvis                   C-COSMOS
8549     20.3 ACIS-I NONE   17.8 2007-05-05  Elvis                   C-COSMOS
8124     20.3 ACIS-I NONE   31.7 2007-04-08  Elvis                   C-COSMOS
8550     20.9 ACIS-I NONE   23.2 2007-04-18  Elvis                   C-COSMOS
8016     20.9 ACIS-I NONE   23.9 2007-04-19  Elvis                   C-COSMOS
8001     27.0 ACIS-I NONE   48.7 2007-04-02  Elvis                   C-COSMOS
8010     22.3 ACIS-I NONE   33.6 2007-04-27  Elvis                   C-COSMOS
8553     22.3 ACIS-I NONE   14.8 2007-04-29  Elvis                   C-COSMOS
7996     34.2 ACIS-I NONE   46.3 2006-12-28  Elvis                   C-COSMOS
8552     26.1 ACIS-I NONE   14.8 2007-04-26  Elvis                   C-COSMOS
8005     26.1 ACIS-I NONE   31.6 2007-04-25  Elvis                   C-COSMOS
8000     31.6 ACIS-I NONE   46.7 2007-05-26  Elvis                   C-COSMOS
7995     38.0 ACIS-I NONE   46.1 2007-06-01  Elvis                   C-COSMOS


unix% find_chandra_obsid "10:00:00" "02:00:00" radius=30 download=none | grep "C-COSMOS" \
? | awk '{print $1}' > obsids.lis

unix% download_chandra_obsid @obsids.lis

... lots of screen output ...
unix% chandra_repro "*" outdir=""
... lots of screen output ...
Table of ACIS ObsIDs within 10 degrees of the southern celestial pole.
ObsID RA Declination
8027 09:58:36.96 +01:58:41.53
8021 09:58:47.90 +02:06:12.48
8015 09:58:58.85 +02:13:43.42
8026 09:59:07.03 +01:55:57.49
8009 09:59:09.81 +02:21:14.70
8020 09:59:17.97 +02:03:28.44
8004 09:59:20.76 +02:28:45.64
8482 09:59:20.76 +02:28:45.64
8483 09:59:20.76 +02:28:45.64
8014 09:59:28.92 +02:10:59.38
7999 09:59:31.72 +02:36:16.58
8478 09:59:31.72 +02:36:16.58
8025 09:59:37.10 +01:53:13.47
8008 09:59:39.88 +02:18:30.67
8019 09:59:48.04 +02:00:44.41
8003 09:59:50.83 +02:26:01.61
8013 09:59:58.99 +02:08:15.36
7998 10:00:01.79 +02:33:32.55
8493 10:00:01.79 +02:33:32.55
8024 10:00:07.17 +01:50:29.44
8007 10:00:09.95 +02:15:46.64
8497 10:00:09.95 +02:15:46.64
8018 10:00:18.11 +01:58:00.38
8002 10:00:20.90 +02:23:17.58
8496 10:00:20.90 +02:23:17.58
8012 10:00:29.06 +02:05:31.33
8122 10:00:31.85 +02:30:48.52
8494 10:00:31.85 +02:30:48.52
8023 10:00:37.24 +01:47:45.41
8006 10:00:40.02 +02:13:02.61
8503 10:00:40.02 +02:13:02.61
8017 10:00:48.18 +01:55:16.35
8123 10:00:50.97 +02:20:33.55
8011 10:00:59.13 +02:02:47.30
7997 10:01:01.92 +02:28:04.50
8022 10:01:07.30 +01:45:01.39
8555 10:01:07.30 +01:45:01.39
8124 10:01:10.08 +02:10:18.59
8549 10:01:10.08 +02:10:18.59
8016 10:01:18.25 +01:52:32.34
8550 10:01:18.25 +01:52:32.34
8001 10:01:21.03 +02:17:49.54
8010 10:01:29.19 +02:00:03.29
8553 10:01:29.19 +02:00:03.29
7996 10:01:31.99 +02:25:20.48
8005 10:01:40.15 +02:07:34.57
8552 10:01:40.15 +02:07:34.57
8000 10:01:51.10 +02:15:05.52
7995 10:02:02.05 +02:22:36.46
dmmerge-ing these event files will result in will result in using the sky to WCS transform defined in the first input file to the tool.

An example of directly using the reprocessed data sets and merging event files that have not been projected to a common sky plane.

unix% ls */repro/*evt* > evt.lis

unix% dmmerge infile="@evt.lis[cols time,expno,ccd_id,node_id,chip,\
? tdet,det,sky,pha,energy,fltgrade,grade,status]" \
? outfile="merge_no-reproj.evt" clobber+

Figure 1: Stacked Observations without Reprojecting

[Thumbnail image: combined observations of the C-COSMOS field which without reprojecting each observation]

[Version: full-size]

[Print media version: combined observations of the C-COSMOS field which without reprojecting each observation]

Figure 1: Stacked Observations without Reprojecting

dmmerge 49 ObsIDs without first reprojecting the observations to a common tangent plane. The sky coordinates of the first observation listed in the input stack is used as the reference coordinate system.

So you will need to redefine the sky coordinates of all these observations by reprojecting the events to a common tangent plan, which will then be displayed as the set of obervations covering the central portion of the COSMOS field.

Stacking the observations to a common sky tangent point returns the expected wide mosaic rather than being erroneously clustered in a narrow portion of the sky plane.

unix% merge_obs infiles="*/repro/*evt*.fits" \
? outroot="merge_reproj_Dec-90/" bands=broad binsize="16" \
? refcoord="0:0:0 -89:59:59.999999" clobber+

Figure 2: Stacked, Reprojected Observations

[Thumbnail image: combined observations near the pole, individual observations reprojected to a common tangent plane]

[Version: full-size]

[Print media version: combined observations near the pole, individual observations reprojected to a common tangent plane]

Figure 2: Stacked, Reprojected Observations

The mosaicked image stacking the images reprojected to a common tangent plane about the southern celestial pole. In this example, the exposure-corrected images are shown.

Figure 3: Compare Reprojected FOV with Non-Reprojected Data

[Thumbnail image: FOV of reprojected observations overlaid on the non-reprojected data set]

[Version: full-size]

[Print media version: FOV of reprojected observations overlaid on the non-reprojected data set]

Figure 3: Compare Reprojected FOV with Non-Reprojected Data

The FOV of the stacked, reprojected images of the C-COSMOS field overlaid on the non-reprojected data for comparison.

Source Detection with Merged PSF maps:

A primary reason users typically [spatially] stack observations is to detect faint sources. The CIAO celldetect and wavdetect tools can make use of information about the PSF size at a given point to either determine the size of a detect cell at that point or determine the optimum wavelet size that matches the size of the sources being detected (for point sources, this scale is comparable to the size of the PSF), for the respective tool using their psffile parameter with PSF maps.

A discussion may be found in the Running wavdetect or celldetect on Merged Data: Choosing psffile thread and a brief overview of the strengths and weaknesses between the two available detection method (vtpdetect, a Voronoi tessellation and percolation source detection technique does not make use of PSF maps) in CIAO are summarized below.

celldetect

Pros
  • Fast and Robust
  • Works well for point sources
  • Detailed PSF shape not needed, only approximate size
Cons
  • Extended sources can be difficult, without careful selection of cell sizes
  • Can get confused in crowded fields
  • Exposure maps needed to eliminate edge sources
  • Not very sensitive, unless background maps are used, but these can be difficult to make

wavdetect

Pros
  • Works well in crowded fields
  • Works well for point sources superimposed on extended emission
  • Only approximate PSF shape is needed
  • Edge of field effects not a problem
Cons
  • Slow, especially if many wavelet scales used
  • Memory-intensive and very large image sizes can pose a problem

In wavdetect, providing an exposure map will help with eliminating the vast majority of false detections along the chip edges. The most notable feature of the above thread is the discussion regarding the various methods to combine PSF maps.

[TIP]
PSF Map Tidbits

The pixel values of the PSF map represents the radius of a circular PSF at that pixel for the given energy and ecf parameter values passed to mkpsfmap in the units that are also specified in the tool assuming a flat detector. However, realize that the circular assumption used will completely fall apart at large off-axis angles, just based on the changes to the PSF morphology and size.

Scattered throughout the existing documentation related to generating PSF maps (with mkpsfmap and dependent scripts such as fluximage/merge_obs/flux_obs) there are instances where a PSF 'encircled counts fraction' of ecf=0.393 is used to generate PSF maps. An ECF of 39.3% corresponds to the 1\(\sigma\) integrated volume of a 2D Gaussian and in the examples, it is chosen so that point sources with only a few counts which will be concentrated in the center of the PSF to be better detected.

The discussion will be focused on wavdetect herein since it is the most widely used CIAO source detection tool. While the tool cannot compute point spread function sizes if wavdetect is not provided with a PSF map, it will still run. The source characteristics will not be reliable and if the size is wrong, then the wrong wavelet scales may be used to determine the source parameters, and thus the source properties may be wrongly estimated. However, the source detection process is unaffected. That is, when a PSF map is provided, the PSF sizes are used by the tool to determine the detected source significance and whether or not it is included in the source list, thus limiting the number of spurious detections.

[NOTE]
wavdetect—An Overview

wavdetect is basically a wrapper script around the wtransform and wrecon tools. Source detection is performed with wtransform, which provides a list of source candidates and wrecon tests these candidates for significance, providing wavdetect with a source list file of significant detections.

In the detection step, wtransform does not use the PSF information but uses a Ricker (aka "Mexican Hat") wavelet function at different size-scales to correlate the input image with the wavelets. Pixels with a high-correlation are considered sources. The tool then uses an estimated background and the selected detection threshold (sigthresh) to determine whether or not the sources from the correlated pixels can then be considered candidate source detections.

In the next step the PSF map is important. Since wrecon will take the candidate sources and smooth them at various wavelet scales. The source is only statistically significant—and kept as a source—when smoothed with a wavelet scale greater than the PSF size and is reconstructed using the wavelet scale closest to the PSF size. The source size and position are also better estimated during this process. Without a PSF map, the wavelets are scaled with the the image pixels, and only reliable in the case where the PSF size does not vary in the region being examined.

As an aside—and point of clarification that is frequently asked—pertaining to the ellsigma parameter found in both wavdetect and celldetect is that it has no relationship to mkpsfmap's ecf value used to generate PSF maps and it does not affect the detections themselves. ellsigma is an optional parameter with a default value of 3 standard deviations of the distribution of counts within the source cell (the observed counts), with the distribution assumed to be Gaussian. For example, if ellsigma is 1, then the elliptical source region will contain 39.3% of the total observed source counts but is only used as a scaling factor on the major- and minor-axes of the ellipses for each detection to enhance the visualization of the sources regions when the source list is overlaid onto the data and does not affect the source detection process or other determined source properties.

Ultimately, providing a PSF will affect the wavdetect/celldetect results, giving a shorter source list with better estimated source properties. While the PSF size information does not affect the actual detection process, since it is not used, but the PSF map definitely affects the final source list, as it is used for the significance test to reject false detections.

CAVEATS

caveats: merging exposure maps/exposure-corrected images... the merge_obs approach was the simplest

Why combine data?