Reprocess a Chandra dataset
chandra_repro indir outdir [root] [badpixel] [process_events] [destreak] [set_ardlib] [check_vf_pha] [pix_adj] [recreate_tg_mask] [cleanup] [clobber] [verbose]
The chandra_repro reprocessing script automates the recommended data processing steps presented in the CIAO analysis threads. The script reads data from the standard data distribution (e.g. primary and secondary directories) and creates a new bad pixel file, a new level=2 event file, and a new level=2 Type II PHA file with the appropriate response files (grating data only).
chandra_repro may be used to reprocess any ACIS and HRC imaging or grating data.
There are parameters to control certain reprocessing steps, such as creating the new bad pixel file or destreaking the data. Users who wish to have finer-grained control over the reprocessing parameters should instead follow the step-by-step instructions in the CIAO Data Preparation Threads.
Where are the processed files?
The chandra_repro script stores the processed files in the location given by the outdir parameter. This can be relative to the input directory - e.g. a repro/ directory within each obsid - or everything can be placed into a single directory (when processing multiple directories). The grating products - PHA2, grating ARFs, and RMFs - are placed in the tg/ subdirectory of the outdir directory.
The chandra_repro script does not overwrite the CALDBVER keyword in the event header since not all calibrations may have been applied. However, it does record the version of the CALDB in comments that users can check to verify that they applied the intended calibration updates.
unix% dmlist acis_repro_evt2.fits header | grep 'chandra_repro was run'
-- COMMENT chandra_repro was run with CALDB 4.5.1
-- HISTORY PARM :value=chandra_repro was run with CALDB 4.5.1 ASC00639
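If only the version string is needed, for example to compare against the CALDB that is currently installed, the comment text can be filtered further. The following is a minimal sketch in plain shell; the helper name caldb_from_header is hypothetical and is not part of CIAO, and it assumes the comment wording shown above.

```sh
# Hypothetical helper (not part of CIAO): read header text on stdin and
# print the CALDB version recorded by chandra_repro, once.
caldb_from_header() {
    sed -n 's/.*chandra_repro was run with CALDB \([0-9][0-9.]*\).*/\1/p' | sort -u
}

# With CIAO initialised, the input could come from dmlist as in the command above:
#   dmlist acis_repro_evt2.fits header | caldb_from_header
# Here a sample COMMENT line is used so the sketch runs without CIAO.
printf '%s\n' '-- COMMENT chandra_repro was run with CALDB 4.5.1' | caldb_from_header
```

The COMMENT and HISTORY records carry the same version, so the duplicate is collapsed with sort -u.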
Reprocess the dataset in the current working directory with the default script settings. The default verbosity of 1 means that status messages will be printed to the screen.
The script should be run from the directory which contains the primary/ and secondary/ data directories from a standard Chandra Data Archive download tarfile:
unix% pwd
/data/1838/
unix% ls
primary/ secondary/
unix% chandra_repro verbose=0
Reprocess the dataset in the current working directory with the default settings. Nothing will be printed to the screen, unless the script detects an error.
unix% chandra_repro indir=/data/1838 outdir=/data/1838/new cleanup=no
Reprocess the dataset in directory /data/1838, writing the new files to /data/1838/new. All of the intermediate files are kept (cleanup=no).
unix% chandra_repro set_ardlib=no
The data will be reprocessed, but the bad pixel filename will not be set in ardlib.par.
unix% chandra_repro verbose=1 mode=h
Resetting afterglow status bits in evt1.fits file...
Running acis_build_badpix and acis_find_afterglow to create a new bad pixel file...
Running acis_process_events to reprocess the evt1.fits file...
Filtering the evt1.fits file by grade and status...
Applying the good time intervals from the flt1.fits file...
The new evt2.fits file is: /data/1838/repro/acisf01838_repro_evt2.fits
Updating the event file header with chandra_repro HISTORY record
Setting observation-specific bad pixel file in local ardlib.par.
Cleaning up intermediate files
WARNING: Observation-specific bad pixel file set for session use: /data/1838/repro/acisf01838_repro_bpix1.fits
Run 'punlearn ardlib' when analysis of this dataset completed.
The data have been reprocessed.
Start your analysis with the new products in /data/1838/repro
At the verbose=0 setting, only critical warnings and errors are printed to the terminal. At verbose=1 (and above) additional information about each data set is printed.
unix% chandra_repro indir=1838,1839 outdir=""
This reprocesses the data in directories 1838/ and 1839/. The outputs are placed in 1838/repro and 1839/repro. A warning is issued because using multiple indir values is incompatible with the default set_ardlib=yes (see details below). The data are not reprojected to the tangent plane, nor are they corrected for any small astrometric shifts.
unix% chandra_repro "*" outdir=""
chandra_repro will try to process all the obsids in the current directory. The "*" must be in quotes (single or double) or must be escaped, \*, from the shell. All other stack expansion rules also apply (see ahelp stack).
The default outdir parameter is set such that a "repro" directory will be created in each obsid's subdirectory.
unix% download_chandra_obsid 1838
...
unix% /bin/ls 1838
unix% chandra_repro "*" mode=h verb=1
Processing input directory '/data/1838'
Resetting afterglow status bits in evt1.fits file...
Running acis_build_badpix and acis_find_afterglow to create a new bad pixel file...
Running acis_process_events to reprocess the evt1.fits file...
...
unix% /bin/ls 1838/*
1838/axaff01838N002_VV001_vv2.pdf  1838/oif.fits

1838/primary:
acisf01838_000N003_bpix1.fits.gz  acisf01838N003_full_img2.fits.gz
acisf01838_000N003_fov1.fits.gz   acisf01838N003_full_img2.jpg
acisf01838N003_cntr_img2.fits.gz  orbitf084197100N001_eph1.fits.gz
acisf01838N003_cntr_img2.jpg      pcadf084244404N003_asol1.fits.gz
acisf01838N003_evt2.fits.gz

1838/repro:
acisf01838_000N003_bpix1.fits  acisf01838_asol1.lis
acisf01838_000N003_fov1.fits   acisf01838_repro_bpix1.fits
acisf01838_000N003_msk1.fits   acisf01838_repro_evt2.fits
acisf01838_000N003_mtl1.fits   acisf084245776N003_pbk0.fits
acisf01838_000N003_stat1.fits  pcadf084244404N003_asol1.fits

1838/secondary:
acisf01838_000N003_evt1.fits.gz      acisf084244478N003_3_bias0.fits.gz
acisf01838_000N003_flt1.fits.gz      acisf084244478N003_4_bias0.fits.gz
acisf01838_000N003_msk1.fits.gz      acisf084244478N003_5_bias0.fits.gz
acisf01838_000N003_mtl1.fits.gz      acisf084245776N003_pbk0.fits.gz
acisf01838_000N003_stat1.fits.gz     aspect
acisf084244478N003_0_bias0.fits.gz   axaff01838N002_VV001_vvref2.pdf.gz
acisf084244478N003_1_bias0.fits.gz   ephem
acisf084244478N003_2_bias0.fits.gz
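Before passing "*" to the script, it can be useful to see which subdirectories actually have the standard archive layout. The following is a minimal sketch in plain shell, not part of chandra_repro; the function name list_obsid_dirs is hypothetical, and it only checks for the primary/ and secondary/ layout, not for directories that hold all files at the top level.

```sh
# Hypothetical pre-check (not part of chandra_repro): print the
# subdirectories of a parent directory that contain both primary/
# and secondary/ directories.
list_obsid_dirs() {
    parent=${1:-.}
    for d in "$parent"/*/; do
        [ -d "$d" ] || continue      # the glob matched nothing
        d=${d%/}                     # drop the trailing slash
        if [ -d "$d/primary" ] && [ -d "$d/secondary" ]; then
            printf '%s\n' "$d"
        fi
    done
}

# List the candidates below the current directory.
list_obsid_dirs .
```

Directories that are not printed here lack the standard layout and may be skipped or rejected by the script.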
Detailed Parameter Descriptions
Parameter=indir (file required default=./ stacks=yes)
The directory which contains the primary/ and secondary/ data directories from a standard Chandra Data Archive download tarfile. It may also be a directory which contains all of the downloaded data files (i.e. not organized into primary/ and secondary/ subdirectories).
Special Input Case: multi-OBI data
In order to run chandra_repro on observations that have multiple observation intervals (OBIs), the input data has to be separated into different input directories. Users can refer to the Multi-OBI Observations why topic for information on what these datasets are, and the splitobs script to create separate directories for the different OBIs.
The chandra_repro script can be run on interleaved mode datasets. In this case, the output files will use the labels "e1" and "e2" to separate the two sets of outputs. Alternately, the splitobs script can be used to create separate directories for the primary (e1) and secondary (e2) data periods, after which the data can be processed as separate datasets.
indir can be a stack of input directories. Each subdirectory will be processed one at a time; each must have the primary and secondary directories. See the outdir parameter for details on where reprocessed products will be created. Note: since there is only one ardlib.par, the set_ardlib parameter is forced to be "no" since it cannot be set correctly for all directories. Multiple inputs are NOT reprojected to the same tangent plane nor are they corrected for astrometric shifts.
Parameter=outdir (file required default="")
The output directory for the reprocessed data files. This directory will also contain symbolic links to the files which are required for data analysis, e.g. the aspect solution and the mask file. If the file system does not support symbolic links then copies are made.
The default, outdir="", will create a "repro" subdirectory underneath the indir directory.
When the indir parameter is a stack, the outdir must be specified in one of the following ways. If outdir="", then a "repro" subdirectory will be created in each indir in the stack. If outdir="./repro", or any other single directory name, then all the outputs from all the indirs will be put into the same directory. Finally, users can specify a stack of outdirs that match one-to-one with the indirs for the most control over where the output for each indir is written.
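The first two rules can be sketched in plain shell as follows. This is an illustration of the behaviour described above, not the script's own code; the function name resolve_outdir is hypothetical.

```sh
# Hypothetical helper (not part of chandra_repro): print the directory
# that would receive the products for one input directory.
resolve_outdir() {
    indir=${1%/}      # drop one trailing slash, if any
    outdir=$2
    if [ -z "$outdir" ]; then
        printf '%s/repro\n' "$indir"   # outdir="": repro/ inside each indir
    else
        printf '%s\n' "$outdir"        # a named directory is used as given
    fi
}

# outdir="" with a stack of two input directories
for d in 1838 1839; do
    resolve_outdir "$d" ""
done

# a single named output directory shared by all inputs
resolve_outdir 1838 ./repro
```

For the one-to-one case, the caller would pass the matching element of the outdir stack as the second argument for each indir.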
Parameter=root (string not required)
Root for output filenames
If not specified, the script uses the first section of the level=1 event filename (up to the first underscore) to define the root, e.g. "acisf01838" is the root of acisf01838_000N002_evt1.fits.
New files that are created by the script have the filename format <root>_repro_<type>.fits, e.g. acisf01838_repro_evt2.fits or acisf01838_repro_bpix1.fits.
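The naming rule can be illustrated with shell parameter expansion. This is a sketch only, assuming the filename pattern described above; the function names repro_root and repro_name are hypothetical and do not exist in CIAO.

```sh
# Hypothetical helpers illustrating the naming rule; not part of chandra_repro.

# Root = the part of the level=1 event filename before the first underscore.
repro_root() {
    name=$(basename "$1")
    printf '%s\n' "${name%%_*}"
}

# Output name = <root>_repro_<type>.fits
repro_name() {
    printf '%s_repro_%s.fits\n' "$1" "$2"
}

root=$(repro_root secondary/acisf01838_000N002_evt1.fits.gz)
repro_name "$root" evt2
```

Setting the root parameter explicitly replaces the derived value, which corresponds to passing a different first argument to repro_name in this sketch.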
Parameter=badpixel (boolean not required default=yes)
Create a new bad pixel file?
If "badpixel=yes" (the default), a new bad pixel file is created for the dataset.
- status bits 1-5, 14-20 and 23 in the input event file are reset by the acis_clear_status_bits script to remove any previous bad pixel identification and destreaking.
- destreak is run to identify streak events, if the observation used ACIS-S4 (ccd_id=8) and destreak=yes. Flagging the streaks first prevents the misidentification of pixels with a lot of streak events as being hot pixels.
- acis_build_badpix and acis_find_afterglow are run to create a new bad pixel file which flags hot pixel and afterglow events in the event file.
- for HRC data, hrc_run_hotpix is run to define the valid coordinate regions in the detectors and to identify hot pixels in the event file.
The new bad pixel file that is created is used in reprocessing the data, assuming "process_events=yes".
ACIS continuous-clocking mode
This parameter setting does not apply when the input data were taken in ACIS continuous-clocking mode, as the ACIS afterglow and hot pixel tool cannot be used with this data mode.
Parameter=process_events (boolean not required default=yes)
Create a new level=2 event file?
If "process_events=yes" (the default), then the acis_process_events or hrc_process_events tool is run to create a new level=1 event file with the latest calibration applied.
The standard grade, status, and good time filters are applied by the tool dmcopy to create a new level=2 event file. This includes the pulse-height filter to reduce background in LETG/HRC-S observations.
For grating data, a new level=2 Type II PHA file is extracted along with the ARF and RMF files which are stored in the tg/ subdirectory.
ACIS Reprocessing Steps
- the latest charge transfer inefficiency (CTI) correction
- the latest time-dependent gain adjustment
- the latest gain map
- a new bad pixel file
- PHA randomization
- a sub-pixel adjustment (if specified, see "pix_adj" parameter)
- clean the VFAINT background (see "check_vf_pha" parameter)
- continuous clocking mode times of arrival and pulse heights
HRC Reprocessing Steps
- the latest time-dependent gain adjustment
- the AMP_SF correction
- the degap correction
- recomputing average HRC dead time corrections
- pixel randomization (if specified, see "pix_adj" parameter)
Parameter=destreak (boolean not required default=yes)
Destreak the ACIS-8 chip?
There is a flaw in the serial readout of the ACIS chips, causing a significant amount of charge to be randomly deposited along pixel rows as they are read out. If "destreak=yes" (the default), the destreak tool is run on the ACIS-S4 chip to detect probable streak events. The flagged events are then filtered out of the new level=2 event file.
Parameter=set_ardlib (boolean not required default=yes)
Set ardlib.par with the bad pixel file?
If "set_ardlib=yes" (the default), the observation-specific bad pixel file is set in the ardlib.par file so that it is available to the CIAO tools during data analysis.
Setting "set_ardlib=no" may be useful if you are analyzing another dataset while chandra_repro is running. In this case, you would not want ardlib.par to be modified until the other analysis tasks are finished.
Remember to "punlearn" your ardlib.par file after completing analysis of this dataset to ensure that the proper bad pixel maps are used the next time that ardlib.par is referenced by a tool.
If there are multiple input datasets, then the single ardlib.par cannot be set up correctly for all of them. Therefore the chandra_repro script will omit this step if multiple input directories are used.
Parameter=check_vf_pha (boolean not required default=no)
Clean ACIS background in VFAINT data?
In ACIS very faint mode, acis_process_events can use the pulse heights in the outer 16 pixels of the 5x5 event island to help distinguish between good X-ray events and bad events that are most likely associated with cosmic rays.
If "check_vf_pha=yes" (not the default), then the ACIS particle background for very faint mode observations is cleaned. Potential background events - events where one or more of those outer pixels is greater than the default split threshold - are flagged and filtered out of the event file.
Prior to the June 2012 release, the default for check_vf_pha was set to "yes". It has been determined that this could remove good events in observations with modestly bright point sources. Therefore, the default is now set to "no".
The value of this parameter is passed to the "check_vf_pha" parameter of acis_process_events when "process_events=yes".
Parameter=pix_adj (string not required default=default)
Pixel randomization: default|edser|none|randomize
If "pix_adj=default" (the default), then the default value of the tool's pixel randomization parameter is used. For ACIS timed exposure mode data, a subpixel algorithm is applied (acis_process_events pix_adj=EDSER). For ACIS continuous-clocking mode data, no subpixel adjustment is applied (acis_process_events pix_adj=NONE). For HRC data, randomization is not applied (hrc_process_events rand_pix_size=0.0).
The following table summarizes the chandra_repro pix_adj options and how each is passed to acis_process_events and hrc_process_events (when "process_events=yes").
|pix_adj value||acis_process_events parameter||hrc_process_events parameter|
|default||pix_adj=EDSER (TE) or pix_adj=NONE (CC)||rand_pix_size=0.0|
Parameter=recreate_tg_mask (boolean not required default=no)
Re-run tgdetect2 and tg_create_mask rather than use the Level 2 region extension?
In some situations the zero-order location cannot be correctly identified by tgdetect2. This happens when the source is heavily piled up, or if it has been masked out. Additionally, due to the incomplete calibration of the continuous-clocking mode gain, the default order sorting parameters may be inappropriate.
These situations are usually identified as part of the Verification and Validation step in standard data processing. It results in the data being reprocessed with a custom mask created by one of the grating scientists.
If chandra_repro were to re-run tgdetect2 in these cases, it would create an insufficient grating mask, leading to incorrect grating spectra. With the recreate_tg_mask parameter set to "no" (the default), the region used in standard processing is used instead.
The region is defined in physical coordinates. If data has been reprojected or if a fine astrometric shift has been added to the aspect solution, this may invalidate the region coordinates.
Parameter=cleanup (boolean not required default=yes)
Cleanup intermediate files on exit
If "cleanup=yes" (the default), intermediate data files are deleted when the script ends. When the parameter is set to "no", these files remain in the "outdir" directory.
Parameter=clobber (boolean not required default=yes)
Clobber existing files?
Overwrite existing files with the same filename?
Parameter=verbose (integer not required default=1 min=0 max=5)
The amount of information printed to the screen
The default verbosity setting (1) prints status messages as the script runs. Higher verbosity settings print the commands that are being run. Setting verbose=0 turns off all screen output except errors.
Changes in scripts 4.11.3 (May 2019) release
chandra_repro now creates individual TYPE:I source and background PHA files with the appropriate ANCRFILE, RESPFILE, BACKFILE, and BACKSCAL keywords set.
For HRC-S+LETG datasets, the script now creates the grating responses for orders -8, -7, -6, -5, -4, -3, -2, -1, +1, +2, +3, +4, +5, +6, +7, and +8. Previously only the -1 and +1 orders were calculated.
Changes in scripts 4.11.2 (April 2019) release
Changes to allow observations where the Level 2 ONTIME is 0 to be fully processed. Previously the script would error out when running the skyfov tool to create the FOV file. It will now create the file but without using a time filter.
Changes in scripts 4.11.1 (December 2018) release
For HRC, the script will now retain all the columns defined in the hrc_process_events stdlev1 event definition parameter. This includes the coarse coordinates in addition to the new SAMP column.
Change in scripts 4.9.4 (July 2017) release
Internal fix for uninitialized variable ("newasol") when processing an observation that does not have any aspect solution files. This is limited to observations of the Moon and the Earth. These observations still cannot be processed at this time but the error message is more informative.
Changes in scripts 4.9.2 (April 2017) release
These are only internal changes to the script to tweak how HISTORY records are written and to correct some verbose output.
Changes in scripts 4.8.4 (September 2016) release
For ACIS CC mode data, pix_adj="default" now sets acis_process_events pix_adj=NONE whereas before it would be set to "EDSER".
Changes in scripts 4.7.2 (April 2015) release
The following changes are included
- For consistency with the TGCat catalog, only the -1 and +1 order ARF and RMF are now created. Users can run mktgresp to generate the response files for higher orders.
- Errors out if the absolute path name for the input or output directories contains spaces.
- Added support for FAINT_BIAS mode datasets (only a few very early in the mission)
- Compares the files in the primary/ and secondary/ directory to the files listed in the event file header and issues a warning if they are different.
- Added a warning about using splitobs script for observations with multiple level 1 event files.
- ACIS TIME mode observations now have their GTIs aligned on exposure frame time boundaries using the new gti_align script.
- An additional time filter has been added that goes from the first aspect record time to the last. This is to address a rare condition where several of the first or last events in the level 2 event file are inside the GTIs but occur outside the times covered by the aspect solution -- which results in invalid sky coordinates during those times.
Changes in scripts 4.7.1 (December 2014) release
The default verbose level is now set to 1. Also there are internal changes to remove acis_process_events' calc_cc_times parameter.
Changes in scripts 4.6.6 (September 2014) release
The following changes were included
- The script will now create a FOV file - which ends in _repro_fov1.fits - along with the other output products. This may be especially useful if the input data have been reprojected to a different tangent point or filtered to exclude a different time range than that used in the archival processing.
- Added the STATFILT keyword to the event file; it gives the 32-bit status filter that was used to create the level 2 event file.
Changes in the scripts 4.6.1 (December 2013) release
The following changes were included
- The script now runs r4_header_update to add the Repro-4 header keywords if needed.
- The centroid algorithm has been added as a pix_adj option.
- The script now runs tgdetect2 for grating data when recreate_tg_mask=yes
Change in the scripts 4.5.4 (August 2013) release
The following changes were included
- The pi=0:300 filter for HRC-S with no grating has been removed. Users will not see any difference unless they have processed their data with CALDB 4.5.7. The PI filter may optionally be used to reduce the background noise for some observations.
- Bug fix when exactly 2 event files were present in the secondary directory, but it was not an interleaved observation.
- Delay creation of output directory until the inputs have been verified.
Changes in the scripts 4.5.2 (April 2013) release
New recreate_tg_mask parameter
The recreate_tg_mask parameter has been added to direct chandra_repro to use the existing REGION block attached to the original evt2 file rather than re-run tgdetect2 and tg_create_mask. This is especially useful when the 0th order location was manually adjusted in standard data processing.
Grating response (ARF and RMF) files
chandra_repro will now create the appropriate ARF and RMF files for each order and grating arm that is in the _repro_pha2.fits file. It uses the default energy and channel grids. These products are stored in a "tg" subdirectory.
chandra_repro will now work on ACIS interleaved mode datasets without the user needing to split them into separate directories. The output products will have "e1" or "e2" in the file names to identify the original primary and secondary exposures.
Skip directories with missing level 1 event files
If chandra_repro encounters a directory where it cannot find the level1 data products it requires it will now skip that directory rather than exiting.
Changes in the October 2012 Release
Support for multiple observations
The indir parameter can now take a stack of directories and the default outdir setting has changed to "".
Copying, rather than linking, files
The archive files - such as the aspect solution and mask files - are now copied rather than linked. This reverses the change made in the April 2012 release.
Changes in the June 2012 Release
- The default for the check_vf_pha parameter has been changed from "yes" to "no". It has been determined that this algorithm can remove good events for modestly bright point sources.
- In the event that a file system does not support symbolic links (some external hard drives and thumb/USB drives), copies of the auxiliary files will be made.
Changes in the April 2012 Release
Warnings produced by acis_process_events and hrc_process_events are now written out instead of being hidden (this happens even when verbose=0). Please see the Caveats section of the acis_process_events and hrc_process_events bugs pages for information on whether these warnings can be safely ignored.
Links are now used when possible
The Archive data files necessary for data analysis are linked from the output directory instead of being copied.
Changes in the February 2012 Release
- Improved error checking on the input directory.
Changes in the December 2011 Release
- The tools acis_find_afterglow (new in CIAO 4.4) and acis_build_badpix are used in place of acis_run_hotpix because acis_find_afterglow is more efficient at identifying afterglows that have a small number (i.e. 4-10) of events.
- The status bits that are set by destreak or acis_process_events are reset with the acis_clear_status_bits script before the data are reprocessed. Previously, only the bits set by acis_detect_afterglow were reset.
- The destreak tool is run before acis_find_afterglow to improve the streak detection efficiency and to prevent pixels with several streak events from being misidentified as hot pixels.
- The default filename root is chosen from the first segment of the level=1 event file, e.g. changed from "acisf01838_000N002" to "acisf01838".
About Contributed Software
This script is not an official part of the CIAO release but is made available as "contributed" software via the CIAO scripts page. Please see this page for installation instructions.
- Large Datasets: chandra_repro uncompresses files in memory before using them. This may consume too much memory on some machines and cause chandra_repro to fail. Since chandra_repro can work with files that have already been gunzipped, the workaround is to gunzip all the files before running this tool.
- Multi-obi datasets: as noted above, multi-obi obsids need to be separated before running chandra_repro, which can be performed by the splitobs script.
- chandra_repro fails to create MEG responses
For datasets taken with the HETG inserted, chandra_repro is supposed to create the positive and negative first order (+/-1) response files (ARF and RMF) for both the HEG and MEG arms. Due to a bug in mktgresp, this is not happening in the Python 3 edition of CIAO 4.10.
The responses are only created for +/-1 HEG. The +/-1 HEG files which are created are correct.
Users can simply run mktgresp after chandra_repro to create the full set of +/- 1,2,3 HEG and MEG responses by setting the orders parameter to an empty (blank) string.
unix% chandra_repro 19882 out=
...
unix% mktgresp infile=19882/repro/acisf19882_repro_pha2.fits \
  evtfile=19882/repro/acisf19882_repro_evt2.fits \
  outroot=19882/repro/tg/acisf19882_repro \
  orders= \
  mode=h clob+
This creates the response files for each spectrum in the pha2.fits file.
- Problem with large files (18 Sep 2012)
chandra_repro uncompresses files into memory to be written back out to disk. With large files, this can exhaust system resources and cause the script to fail with a non-informative error message:
Unable to read evt1 file
# chandra_repro (09 April 2012): ERROR read error
chandra_repro will use the uncompressed version of a file if it exists, so the easiest workaround is to gunzip the files before running chandra_repro:
unix% gunzip primary/*gz secondary/*gz
- dsAPEPULSEHEIGHTERR -- WARNING: pulse height is less than split threshold when performing serial CTI adjustment.
The following warning may be displayed when running chandra_repro on an ACIS observation
# acis_process_events (CIAO): The following error occurred 2 times:
dsAPEPULSEHEIGHTERR -- WARNING: pulse height is less than split threshold when performing serial CTI adjustment.
As discussed on the acis_process_events bugs page, this warning can safely be ignored as long as the number of times it is printed is small compared to the total number of events.
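To put the warning count in context, it can be compared with the number of events in the file. The following is a minimal sketch in plain shell; the helper names warning_count and warning_fraction are hypothetical, the total of 100000 events is a made-up value, and no particular threshold for "small" is implied.

```sh
# Hypothetical helpers (not part of CIAO).

# Read log text on stdin and print N from "... occurred N times ..." lines.
warning_count() {
    sed -n 's/.*occurred \([0-9][0-9]*\) times.*/\1/p'
}

# Print the fraction of events affected: count / total.
warning_fraction() {
    awk -v n="$1" -v t="$2" 'BEGIN { if (t > 0) printf "%.6f\n", n / t; else print "nan" }'
}

n=$(printf '%s\n' '# acis_process_events (CIAO): The following error occurred 2 times:' | warning_count)

# 100000 is a made-up total event count, used here for illustration only.
warning_fraction "$n" 100000
```

In practice the total would be the number of rows in the level=1 event file being reprocessed.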