Pitfalls using PIMMS for Observed Data
This Why topic highlights the issues with using PIMMS, the Portable Interactive Multi-Mission Simulator, and its CXC proposal planning counterpart to estimate fluxes from observed count rates. PIMMS, developed as a proposal planning tool, relies on effective area data provided by individual mission experts. The Chandra effective area files are only provided at the aimpoint and do not take into account the spatial variations (QE, bad columns) in the Chandra sensitivity. Even for on-axis sources, the cycle-by-cycle evolution of ACIS sensitivity has historically led to systematic offsets in PIMMS-derived fluxes. Overall, errors of 10-20% are common, though substantially larger errors (over 80%) can be seen. This error may be negligible when combined with the statistical error of low-count sources; however, for bright sources and especially for large surveys, this systematic error may become significant and introduce a bias into the results.
This document is intended to serve as a resource for individuals performing data analysis as well as a reference for journal referees.
PIMMS is a familiar tool to almost every observer who has proposed for time on any of the recent high-energy telescopes: XMM, Astro-H, ROSAT, Chandra, NuSTAR, etc. Its most basic operation is the conversion of fluxes: either between missions (i.e., converting Chandra counts in some energy band to NuSTAR counts assuming some spectral model), or between flux units, e.g., count rate to energy flux (ergs/cm2/sec).
The purpose of PIMMS is stated on its homepage:
PIMMS ... is primarily intended as a planning tool for future observations ...
Irrespective of this, and despite the warnings on the Chandra calibration pages, the Adding Old Chandra Calibration Data to PIMMS thread, and on the CXC version of PIMMS, it is commonly used by individuals doing data analysis to convert their observed count rates into fluxes. A search with the ADS "Bumblebee" full-text interface shows over 330 refereed papers published in the past 10 years that mention both Chandra and PIMMS. (This can only be considered an upper limit; PIMMS may have been used for other reasons in some fraction of those papers.)
For past missions with well established calibrations this approach can yield reliable results. However, for the on-going and evolving Chandra mission, this approach can lead to significant systematic errors. The source and magnitude of these errors are discussed herein. Users are also presented with some information on alternative approaches that can yield more reliable results.
Review: How does PIMMS work?
PIMMS makes use of (a) a collection of spectral models and (b) a library of effective area curves. In over-simplified terms: PIMMS interpolates the spectral models to the user's input parameters, multiplies (folds) the result through the effective area curves, and integrates over the user's energy band(s) to determine the appropriate conversion factors.
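The folding step described above can be illustrated with a minimal Python sketch. It is not the PIMMS implementation: the effective area function `toy_area` is a made-up smooth curve (not a Chandra ARF), absorption and the detector response (RMF) are ignored, and the helper names are hypothetical. It only shows how a counts to flux conversion factor falls out of the two integrals.

```python
import math

KEV_TO_ERG = 1.602176634e-9  # erg per keV


def powerlaw(energy_kev, photon_index=1.7, norm=1.0):
    """Unabsorbed power-law photon spectrum, photon / cm^2 / s / keV."""
    return norm * energy_kev ** (-photon_index)


def toy_area(energy_kev):
    """Hypothetical smooth effective area in cm^2 (NOT a real Chandra ARF)."""
    return 400.0 * math.exp(-((energy_kev - 1.5) / 2.5) ** 2)


def counts_to_flux(model, area, emin, emax, nbins=2000):
    """Energy flux (erg / cm^2 / s) corresponding to 1 count / s in [emin, emax] keV."""
    de = (emax - emin) / nbins
    rate = 0.0         # count / s: model folded through the effective area
    energy_flux = 0.0  # erg / cm^2 / s: model integrated over the band
    for i in range(nbins):
        e = emin + (i + 0.5) * de          # midpoint of the energy bin
        photons = model(e) * de            # photon / cm^2 / s in this bin
        rate += photons * area(e)
        energy_flux += photons * e * KEV_TO_ERG
    return energy_flux / rate


factor = counts_to_flux(powerlaw, toy_area, 0.5, 7.0)
print(f"toy conversion factor: {factor:.4e} erg/cm^2/s per count/s")
```

Because the factor is a ratio, the model normalization cancels, while any overall scaling of the effective area scales the factor inversely.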
The spectral models are a set of common high energy spectra computed from the XSPEC model library. The spectral models are generated on a grid of common model parameter values and interpolated to match the user's inputs.
The library of effective area curves is supplied to the PIMMS team at HEASARC by experts working with each individual mission. The selection of which combinations of instruments, detectors, modes, filters/gratings, etc., to include is decided by each mission's curator. For Chandra, the PIMMS files are generated annually in December to support the Call for Proposals (CfP). The Chandra Calibration Database (CALDB) Manager creates and ingests the specially constructed ARF files into the CALDB. These are used to create the PIMMS effective area files which are then delivered to HEASARC.
For the purposes of this document and its links, the terms CfP, Announcement of Opportunity (AO), and cycle are used interchangeably. They refer to approximately one year of observing time. The Chandra cycles do not have rigid start/stop times, but generally begin and end in the October/November time frame.
Chandra PIMMS files
Some key facts about the Chandra PIMMS files are listed below:
The effective area curves are predicted for the middle of the cycle being proposed. Thus the effective area curves generated in December of year Y are for observations that occur in the middle of May of year Y+2, ~18 months into the future.
The CXC's understanding of how the effective area evolves also changes with time. The major time-variable effect for ACIS is the continuing accumulation of contaminants on the detector. Sometimes a change in understanding produces only a gradual, slight change in the effective area, but at other times the change is much more significant. For example, early PIMMS files do not include the contamination at all.
PIMMS files for previous cycles are never updated; neither for updates to the calibration nor due to updates in the software. The ARFs used to generate the PIMMS files are stored in the Chandra CALDB in the directory:
Only cycles 3 through 17 (current) are available. The file names contain the string N00xx, where xx is the CfP cycle number. Users need only review the Chandra CALDB release notes to see that no updates have been made to these files.
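As a small illustration of the naming convention just described, the following Python sketch pulls the cycle number out of a PIMMS ARF file name. The regular expression and the function name are assumptions based only on the two file names quoted in this document (for example acisiD2014-12-04pimmsN0017.fits); other files in the CALDB may follow a different pattern.

```python
import re
from datetime import date

# Assumed pattern: <prefix>D<YYYY-MM-DD>pimmsN<4-digit cycle>.fits
PIMMS_NAME = re.compile(
    r"^(?P<prefix>.+?)D(?P<date>\d{4}-\d{2}-\d{2})pimmsN(?P<cycle>\d{4})\.fits$"
)


def parse_pimms_name(filename):
    """Return (prefix, embedded date, cycle number) from a PIMMS ARF file name."""
    match = PIMMS_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a recognized PIMMS ARF name: {filename}")
    return (
        match.group("prefix"),
        date.fromisoformat(match.group("date")),
        int(match.group("cycle")),
    )


print(parse_pimms_name("acisiD2014-12-04pimmsN0017.fits"))
```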
The Chandra PIMMS files are only generated for observations at the aimpoint. The Chandra aimpoint varies with time and the predicted location is included when the files are created. However, the other spatial variations in effective area (mirror vignetting, detector uniformity, dead-area, spatial variation in the contamination, bad pixels and columns) are not captured. The PIMMS files therefore essentially capture the maximum effective area for the various instrument configurations.
The Cycle 3-17 (current) PIMMS files are created by using the CIAO ARF creation tools (mkarf or mkgarf) and adjusting the parameters to set them up for the appropriate time period. For ACIS-I, no gratings, ObsID 2858 was used for Cycles 4 through 10, and ObsID 9768 was used for Cycles 11 through 17. It is unclear from the history exactly how Cycle 3 was created (it is omitted from the analysis presented below). The dmhistory tool can be used to see the exact command used:
unix% dmhistory acisiD2014-12-04pimmsN0017.fits mkarf
mkarf asphistfile="acisi.asphist" outfile="acisiD2014-12-04pimmsN0017.fits" \
  sourcepixelx="4107.645604423146" sourcepixely="4141.879934604962" engrid="0.1:11.0:#1024" \
  obsfile="/data/draft_caldb/caldbmgr/CY17_effarea/ACIS-I/9768/repro/acisf09768_repro_evt2.fits[EVENTS]" \
  pbkfile="" dafile="CALDB" mirror="HRMA" detsubsys="ACIS-3;TIME=579657667.184" \
  grating="NONE" maskfile="NONE" ardlibparfile="ardlib.par" geompar="geom" verbose="0" clobber="yes"
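The TIME value in the detsubsys parameter above is the epoch for which the effective area was predicted. As a rough check, the sketch below converts it to a calendar date, assuming the usual convention that Chandra mission time counts seconds from 1998-01-01 00:00:00 (TT). The offset of roughly a minute between TT and UTC is ignored, and the function name is hypothetical.

```python
from datetime import datetime, timedelta

# Assumed zero point of Chandra mission time: 1998-01-01 00:00:00 (TT).
CHANDRA_EPOCH = datetime(1998, 1, 1)


def chandra_time_to_datetime(seconds):
    """Convert Chandra mission seconds to an approximate calendar date/time."""
    return CHANDRA_EPOCH + timedelta(seconds=seconds)


predicted_epoch = chandra_time_to_datetime(579657667.184)
print(predicted_epoch.date())
```

For the Cycle 17 file this lands in mid-May 2016, consistent with a file generated in December 2014 for the middle of the cycle being proposed.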
Cycle 3 ARFs were created with CIAO 2.0 which did not include the complete tool history. The version of the software can be verified by checking the ASCDSVER keyword as shown here for the Chandra Cycle 17 effective area curves:
unix% dmkeypar acisiD2014-12-04pimmsN0017.fits ascdsver echo+
CIAO 4.6
CIAO 4.6 was released in December 2014 at the same time as CfP 17 was issued. The CALDBVER keyword, which records the version of the CALDB, is not set in these files.
A new version of CIAO is released annually at the same time as the CfP. The Chandra effective area curves are created with this version of CIAO, so the version differs from year to year. Not all CIAO releases affect the creation of the ARFs, but there have been some significant changes (e.g., inclusion of dead area calibrations and contamination). Smaller changes, such as those to the blocks of HISTORY keywords, have not affected the effective area values but have added to the files' provenance.
Chandra CALDB updates are not on a fixed schedule, though they occur no more often than once per month. Typically CALDB releases are tied to the ACIS time-varying gain (T_GAIN) updates that are released approximately every 3 months. Sometimes CALDB updates require a CIAO release (e.g., if a new calibration product is introduced, such as the ACIS dead-area calibration). Not all CALDB updates include changes that affect the effective area.
- The CIAO release notes contain an abridged version of the CALDB release notes that highlights specifically how each CALDB release affects which tools and threads.
These facts provide the basis for the analysis presented below.
There is some ambiguity when users simply refer to 'PIMMS' in their publications. The most common interpretation is WebPIMMS, the web interface available on the HEASARC web site. This is the most up-to-date version of the tool with the latest calibrations from each of the supported missions. However, it only contains the latest Chandra effective area files for the current proposal cycle.
The CXC provides its own web interface to a modified version of PIMMS that has each of the effective area curves for Chandra proposal cycles starting with Cycle 3 through the current cycle (17). This version is updated annually in December with the Chandra CfP. As the page clearly states, it may contain neither the latest version of the PIMMS software nor the latest effective area files for other missions.
HEASARC also provides users the PIMMS source code (including both the spectral models and the effective area curves). Users can customize PIMMS to add additional effective area curves. The CXC provides the full library of Chandra PIMMS files (cycles 3 through current) on the Adding Old Chandra Calibration Data to PIMMS thread page, along with instructions to overlay these files into the user's local PIMMS installation.
Changes in the PIMMS code or in any of the effective area calibrations, including Chandra, are updated asynchronously across each of these three locations.
Finally, CIAO provides its own tool that takes an effective area curve and folds it through a spectral model to determine fluxes: modelflux. This tool uses Sherpa to perform a model evaluation and integration equivalent to that of PIMMS. In fact Sherpa uses the same XSPEC software model library but accesses it directly via its C++/FORTRAN interfaces. Users supply an arbitrary model expression and an arbitrary ARF file, and modelflux computes the flux conversion factors.
It is easy to show that modelflux produces results consistent with the PIMMS software. With the AO-17 ARF and an absorbed powerlaw model, modelflux yields a broad-band counts to flux conversion of 1.2894e-11 ergs/cm2/sec, whereas the CXC PIMMS service computes the nearly identical value of 1.289e-11 ergs/cm2/sec.
unix% modelflux arf=$CALDB/data/chandra/pimms/acis/acisiD2014-12-04pimmsN0017.fits \
  rmf=std.rmf \
  model="xspowerlaw.p1*xsphabs.abs1" \
  param="p1.PhoIndex=1.7;abs1.nH=0.03" \
  rate=1.0 emin=0.5 emax=7.0
Model fluxes:
  Rate (0.5,7)= 1 count s^-1
  Photon Flux (0.5,7)= 0.004578 photon cm^-2 s^-1
  Energy Flux (0.5,7)= 1.2894e-11 erg cm^-2 s^-1
modelflux requires an input RMF file to create the appropriate channel grid and energy mapping. For wide energy bands the choice of RMF is arbitrary. A single RMF file, std.rmf, was created at the aimpoint location in OBS_ID 9768 and used for all modelflux runs.
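The numbers quoted in the modelflux output above allow a few quick sanity checks, sketched here in Python. The derived quantities (a band-averaged effective area and a mean photon energy) are only illustrative summaries implied by the three reported values; they are not outputs of modelflux or PIMMS.

```python
KEV_TO_ERG = 1.602176634e-9  # erg per keV

# Values copied from the modelflux run and the CXC PIMMS comparison above.
rate = 1.0                 # count / s
photon_flux = 0.004578     # photon / cm^2 / s
energy_flux = 1.2894e-11   # erg / cm^2 / s (modelflux)
pimms_flux = 1.289e-11     # erg / cm^2 / s (CXC PIMMS)

# Spectrum-weighted mean effective area over the 0.5-7.0 keV band.
mean_area_cm2 = rate / photon_flux

# Mean energy per photon in the band for this absorbed powerlaw.
mean_energy_kev = energy_flux / photon_flux / KEV_TO_ERG

# Relative difference between the two tools.
tool_diff_percent = 100.0 * (energy_flux - pimms_flux) / pimms_flux

print(f"{mean_area_cm2:.1f} cm^2, {mean_energy_kev:.3f} keV, {tool_diff_percent:.3f} %")
```

The two tools differ by only a few hundredths of a percent, which is why they are treated as numerically equivalent in this document.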
|Tool|Pros & Cons|
|HEASARC WebPIMMS|Pros: Latest code and non-Chandra calibrations. Cons: Only latest Chandra cycle.|
|CXC PIMMS web interface|Pros: All Chandra cycles (3 through current). Cons: Potentially out-dated software and/or non-Chandra files.|
|Local PIMMS installation (HEASARC source code)|Pros: Users can add their own arbitrary calibrations. Cons: Must build from source and adding files is non-trivial.|
|CIAO modelflux|Pros: Arbitrary models and effective area curves. Cons: Large install footprint (CIAO+CALDB).|
Since the ability to use arbitrary ARFs is key to this analysis and the tools are otherwise numerically equivalent, the CIAO modelflux tool is used with the PIMMS ARFs stored in the Chandra CALDB.
Recreate PIMMS ARFs with updated software and calibrations
For the first part of the investigation, the ARFs for ACIS-I, without gratings, for cycles 4 through 17 have been recreated with CIAO 4.7 using CALDB 4.6.8. As noted above, each of the PIMMS files in the CALDB was generated with a different version of CIAO and different calibration inputs. These CIAO ARFs were created for the same date within the cycle and at the same chip location (which varies from cycle to cycle as the aimpoint drifts).
The data presented here are for the aimpoint on the imaging array, i.e., ACIS-I, aka ACIS-3 or ACIS-I3. About 40% of the Chandra observations are done with this setup. The other aimpoint is ACIS-S, aka ACIS-7 or ACIS-S3. While it is a different type of CCD (backside illuminated), the general results presented here still apply.
The ACIS effective area has changed significantly during the mission due to the buildup of an unknown contaminant(s) on the detector. Characterizing this evolving decrease in efficiency is an ongoing activity. Thus not only is there a physical change to the effective area, but the CXC's understanding of the change is also evolving. The PIMMS files in the CALDB then represent the CXC's best guess at what the effective area would have been 18 months into the future based on the then-current models. Recreating the ARFs with a single software version and single set of calibration inputs highlights how these two have changed.
The ARFs for cycles 4 through 17 were re-created with CIAO 4.7/CALDB 4.6.8 using the same input parameters and observations. The dmhistory tool was used to extract the mkarf parameters for each cycle's ARF. Figure 1 shows the ARF in the Chandra CALDB for cycle 4, acisiD2000-01-29pimmsN0004.fits (earliest available), compared to the version created with this version of CIAO+CALDB. Compared to the CIAO ARF (red curve), the PIMMS ARF (black curve) shows significantly more effective area at low energies (below 2 keV) and a slight deficit at higher energies. A detailed examination of the HISTORY in the PIMMS files shows that this file was created with CIAO 2.0. The contamination model was not introduced until CIAO 3.0 (August 2003). This explains the excess effective area predicted at low energies.
The same analysis can be easily done for AO-04 through AO-17. The data for AO-01 through AO-03 are more difficult to reconstruct and have been omitted from this analysis. AO-01 and AO-02 were pre-launch predictions so are especially unreliable; the files are no longer supplied in the Chandra CALDB. While the AO-03 PIMMS files are in the CALDB, they lack the full HISTORY records needed to easily recreate them so they have been omitted.
In Figure 2a and Figure 2b we show the difference in effective areas between what is available in PIMMS and what a user gets now using the current CIAO+CALDB for cycles 4 through 17. Figure 2b shows the same data as in Figure 2a but with the Y-axis stretched around zero.
The largest difference is seen in the earliest cycle data. The red curve in Figures 2a and 2b is the same data shown in the bottom panel of Figure 1. That is, the PIMMS predictions are the most inaccurate for data observed in the earliest part of the mission. For the most recent observation cycle (17), the PIMMS prediction and the current CIAO calibrations are in excellent agreement. This is because the cycle 17 PIMMS files were created with more recent software (CIAO 4.6) and calibrations.
The other important trend to notice is that, with the exception of a slight underestimate of the effective area at high energies (> 2 keV) in the early observing cycles, the PIMMS curves overestimate the effective area compared to the current CIAO+CALDB calibrations. This means that fluxes derived from PIMMS will be, in general, systematically lower than what would be computed using the current calibrations.
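The direction of this bias follows from the fact that, for a fixed observed count rate, the inferred flux scales inversely with the assumed effective area. A minimal sketch, assuming a single energy (so that no spectral weighting is needed) and a hypothetical function name:

```python
def flux_bias_percent(area_overestimate_percent):
    """Percent error in the inferred flux when the assumed effective area is
    too large by the given percentage (monochromatic approximation)."""
    ratio = 1.0 + area_overestimate_percent / 100.0  # assumed area / true area
    return 100.0 * (1.0 / ratio - 1.0)


# Illustrative inputs only; these are not measured values.
for overestimate in (5.0, 10.0, 20.0):
    print(f"area +{overestimate:.0f}% -> flux {flux_bias_percent(overestimate):+.1f}%")
```

A negative result means the flux is underestimated. Real band-integrated conversions weight this ratio by the spectral model, which is what the next section examines.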
Predicted Flux in Energy Bands
While the point-to-point differences in the ARF are large, the more interesting quantity is the difference in the integrated flux when those ARFs are used with an assumed spectral model. This is what the PIMMS tool does with these effective area curves and is the quantity users use when publishing results.
As discussed earlier, the CIAO tool modelflux is used here to compute the count rate to flux conversion factor. The PIMMS ARFs are taken directly from the CALDB and the new CIAO+CALDB ARFs are as discussed above. By setting the parameter rate=1 count/sec, the tool's output energy flux (flux) is the desired count-rate to flux conversion factor.
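Once a conversion factor has been computed for rate=1, applying it to an observation is a single multiplication. The sketch below uses the AO-17 broad-band factor quoted earlier in this document; the net counts and exposure time are hypothetical values chosen only for illustration.

```python
def counts_to_energy_flux(net_counts, exposure_s, conversion_factor):
    """Energy flux (erg / cm^2 / s) from net counts, exposure time, and a
    counts to flux conversion factor computed for 1 count / s."""
    rate = net_counts / exposure_s  # count / s
    return rate * conversion_factor


AO17_BROAD_FACTOR = 1.2894e-11  # erg / cm^2 / s per count / s (from modelflux above)

# Hypothetical source: 250 net counts in a 50 ks exposure.
flux = counts_to_energy_flux(250.0, 50_000.0, AO17_BROAD_FACTOR)
print(f"{flux:.3e} erg/cm^2/s")
```

Any fractional error in the conversion factor carries over directly into the flux.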
Two spectral models are considered: an absorbed powerlaw (photon index=1.7, nH=0.03) and an absorbed blackbody (temperature=1.0 keV, nH=0.03). nH is in units of 10^22 atoms/cm2. The model normalization is arbitrary (set to 1.0). Results are shown for the four main Chandra Source Catalog (CSC) energy bands: broad (0.5 to 7.0 keV), soft (0.5 to 1.2 keV), medium (1.2 to 2.0 keV), and hard (2.0 to 7.0 keV).
The per-cycle percent difference in the flux conversion factor between the CIAO and PIMMS effective area curves is shown in Figure 3a (Blackbody) and Figure 3b (Powerlaw). The overestimate of the PIMMS effective area curves seen in Figures 2a and 2b results in fluxes that are systematically lower (percent difference is > 0). Again, since PIMMS Cycle 17 used a more recent version of CIAO+CALDB, the results agree well with the current version of CIAO+CALDB.
Based on the data in Figures 3a and 3b, users may expect up to a 15% systematic error in their fluxes when using the PIMMS effective area conversion factors, depending on the observation cycle. The magnitude of the error is strongly dependent on the observing cycle and energy band but is almost exclusively in a direction that causes users to underestimate their fluxes.
Obviously for sources with few counts, adding a 15% systematic error in combination with a much larger statistical error is unlikely to affect the overall results. However, for bright sources this systematic error cannot be ignored. Since the PIMMS-derived fluxes are almost always lower than the current CIAO+CALDB derived fluxes, this creates a sample bias for catalogs of sources.
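To get a feel for where the systematic term starts to matter, the sketch below compares it with a simple Poisson fractional error of 1/sqrt(N). Combining the two in quadrature is a simplification made here for illustration only, since the systematic offset discussed in this document is a one-directional bias rather than a random error; the function names are hypothetical.

```python
import math


def total_fractional_error(net_counts, systematic=0.15):
    """Quadrature sum of a 1/sqrt(N) statistical error and a fixed
    fractional systematic error (a rough, simplified error budget)."""
    statistical = 1.0 / math.sqrt(net_counts)
    return math.hypot(statistical, systematic)


def counts_where_equal(systematic=0.15):
    """Net counts at which the statistical error equals the systematic error."""
    return 1.0 / systematic ** 2


print(f"errors are equal near {counts_where_equal():.0f} counts")
for n in (10, 100, 10_000):
    print(f"N={n:>6}: total fractional error {total_fractional_error(n):.3f}")
```

Under these assumptions a 15% systematic term already matches the statistical term at a few tens of counts and dominates for brighter sources.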
These systematic errors shown in Figures 3 are the best-case scenario. They are for a source at the same location on the detector as was used in the PIMMS predictions. As is discussed below, the location of the source within the field can have a significant impact on the effective area.
Dependence on Energy Band
Figures 3a and 3b show that there is some dependence on the energy band. For example, in Cycle 14 users can see an approximately 16% error when using data from the soft band, whereas data from the hard band only shows a 2% error. Clearly, integrating the flux across sharp spectral features in either the model or in the effective area curve can lead to abrupt changes in the counts to flux conversion factor.
To quantify this energy-band dependent effect, the counts to flux conversion factors were generated for a grid of energy bands. The grid goes from a starting energy of 0.4 keV up to 7.9 keV in steps of 0.1 keV. The width of the energy band at each starting energy is incremented in steps of 0.1 keV until the 8.0 keV upper bound is reached. The Cycle 14 PIMMS and CIAO+CALDB ARFs were used because Cycle 14 is the most recent cycle that shows the largest difference between energy bands. The results are shown in Figure 4a (blackbody) and Figure 4b (powerlaw). The X-axis is the starting energy of the energy band, the Y-axis is the width of the energy band, and the color is the percent error in the difference between the CIAO and PIMMS counts to flux conversion factors. For reference, the locations of the standard CSC bands are identified; the broad and hard bands both have an upper limit of 7.0 keV and thus lie along a diagonal parallel to the 8.0 keV cutoff.
The conversion factors for both the blackbody and the powerlaw models show the same basic trends: The counts to flux conversion factors for wide energy bands and energy bands that start at high energies show good agreement, less than 5% error, between PIMMS and CIAO+CALDB. Narrow energy bands at lower energies produce counts to flux conversion factors with the largest errors, which is not surprising since the ARF has many edges at lower energies.
The results for the two spectral models do show some variations even for wide bands. The differences are not large, but the spectra are also not very different. Users should therefore be extra cautious if using complex models (i.e., many absorption/emission lines) or extreme models (very hard/soft spectra).
For narrow energy bands, users also need to consider the spectral energy resolution of the detector. Band widths of 0.1 keV are at or slightly beyond the practical limit of what users should use without considering energy redistribution effects from adjoining bands.
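The grid of energy bands described above can be enumerated with a short Python sketch. Energies are handled as integer tenths of a keV to avoid floating-point drift; the function name and its defaults are assumptions that simply encode the limits stated in the text.

```python
def band_grid(start_lo=4, start_hi=79, upper=80):
    """Enumerate (emin, emax) bands in keV. Arguments are in tenths of a keV:
    starting energies run from 0.4 to 7.9 keV, widths grow in 0.1 keV steps,
    and no band extends past the 8.0 keV upper bound."""
    bands = []
    for lo in range(start_lo, start_hi + 1):
        for hi in range(lo + 1, upper + 1):
            bands.append((lo / 10.0, hi / 10.0))
    return bands


bands = band_grid()
print(len(bands), "bands, from", bands[0], "to", bands[-1])

# The standard CSC bands are all points on this grid.
csc_bands = {"broad": (0.5, 7.0), "soft": (0.5, 1.2),
             "medium": (1.2, 2.0), "hard": (2.0, 7.0)}
print(all(band in bands for band in csc_bands.values()))
```

Each of these bands would then be passed as the emin and emax limits to a flux conversion such as modelflux, once for the PIMMS ARF and once for the CIAO+CALDB ARF.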
Again, remember that this is the best case scenario with the source at the same location on the detector as was used to generate the PIMMS ARFs.
The PIMMS effective area curves are only created for the on-axis position: the location on the detector with the maximum mirror effective area and the smallest PSF. This is not necessarily the location with maximum detector efficiency (as is the case for ACIS-I). It is therefore necessary to consider the location of sources within the field when using the PIMMS counts to flux conversion factors.
The OBS_ID used to seed the inputs to create the most recent PIMMS effective area files, ObsID 9768, actually has a large number of point-like sources spread fairly uniformly across the field of view. Figure 5 shows the broad band counts image with the location of 225 sources detected by wavdetect. The aimpoint is marked with a red plus ("+") symbol. There are 4 color coded sources (blue, purple, orange, and green) that are discussed more in Figure 6.
For the purposes of this study, any arbitrary locations within the field could also have been used. It would be just as informative to show the results for a regular grid or just a set of randomly selected locations. Users can create an ARF at any location in the field whether a source is detected there or not.
Recent additions to CIAO have made it very easy to automatically create the effective area files for a large number of positions within a single observation. The srcflux script takes in a list of positions (eg those produced by wavdetect) and creates the individual ARF for each location.
unix% srcflux acisf09768_repro_evt2.fits pos=wav.src outroot=powerlaw model="xspowerlaw.p1*xsphabs.abs1" paramvals="p1.PhoIndex=1.7;abs1.nH=0.03"
srcflux also runs modelflux to compute the same counts to flux conversion factor herein discussed which makes it doubly convenient.
As has been discussed, since the PIMMS curve is computed on-axis, it is (almost) the maximum effective area curve. The lowest effective area curve, colored green, is actually for a source located very near the on-axis location, but it lies in the gap between all 4 CCDs.
The peak effective area goes from approximately 550 cm2 to only about 60 cm2, and careful examination shows that the curves are not simply scaled versions of each other. The orange and purple colored curves cross each other at multiple energies, which is even more interesting when we see in Figure 5 that the purple curve corresponds to a nearly on-axis source, and the orange curve corresponds to a source very far away from the aimpoint. The source corresponding to the purple curve dithers into the chip gap for part of the observation, which is why its effective area is reduced. Finally, the source corresponding to the blue curve is about 5 arcmin away from the optical axis, but is not affected by chip gaps or bad columns, so its effective area is fairly close to the PIMMS ARF.
The spatial variation in the ARF can be understood by examining the exposure maps. Whereas the ARF is the effective area vs. energy, the exposure map is the effective area vs. position (and then usually scaled by exposure time and convolved with the aspect solution). There are multiple detector and mirror effects that get combined together to determine the effective area of the detector. These include:
- Mirror area (including vignetting)
- Detector quantum efficiency
- Bad pixels and columns
- Spatially varying contamination
- Dead area due to cosmic rays
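As a rough mental model of how the effects listed above combine, the sketch below treats each one as an independent multiplicative factor at a single energy and position. This is a simplifying assumption made for illustration and is not how mkarf or mkinstmap are implemented; the function name, argument names, and all numerical values are hypothetical.

```python
def local_effective_area(mirror_area_cm2, vignetting, quantum_efficiency,
                         contamination_transmission, dead_area_fraction,
                         live_pixel_fraction):
    """Toy effective area (cm^2) at one energy and one sky position.

    live_pixel_fraction is the fraction of the observation during which the
    dithered source falls on good pixels (not on bad columns or a chip gap).
    """
    return (mirror_area_cm2
            * vignetting
            * quantum_efficiency
            * contamination_transmission
            * (1.0 - dead_area_fraction)
            * live_pixel_fraction)


# Hypothetical numbers: the same source with and without chip-gap losses.
on_chip = local_effective_area(600.0, 0.95, 0.9, 0.8, 0.03, 1.0)
in_gap = local_effective_area(600.0, 0.95, 0.9, 0.8, 0.03, 0.4)
print(f"{on_chip:.0f} cm^2 on chip vs {in_gap:.0f} cm^2 when dithering into a gap")
```

In this picture a source that spends part of its dither pattern in a chip gap loses area in direct proportion to the time lost, even when it sits close to the optical axis.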
Each of these effects can be shown individually as in Figure 7. Using the appropriate ardlib modifiers, each of the calibration effects can be enabled individually when the instrument maps are created with mkinstmap. They are each then convolved with the aspect solution via mkexpmap in the usual way.
Most of the effects in Figure 7 gradually change from pixel to pixel, partly due to the blurring of the aspect solution, but also just due to the nature of the effect. The effects that cause the most significant changes are the column-to-column variation in the CCD quantum efficiency and, even more so, the bad pixels, columns, and the gaps between the CCDs (which are clearly visible in each image, despite being convolved with the aspect solution). We would then expect the ARFs with the lowest effective area in Figure 6 to be associated with sources located at or near these features. This is confirmed in Figure 8, which plots the error in the PIMMS counts to flux conversion factor, relative to the CIAO+CALDB counts to flux conversion factor obtained from srcflux, as a function of location on the exposure map.
Again, the PIMMS on-axis aimpoint location is on the upper left CCD near the corner where the four CCDs meet. The source locations with the largest error (color coded reds, oranges, yellows) are exactly the sources that fall on the edges of the detector, in the chip gaps, and along clusters of bad columns. These sources see errors over 40% with some extreme cases of errors over 80%.
The results for the other energy bands and with the blackbody model are similar: a gradual increase in error farther away from the aimpoint with sharp increases at the edges/bad columns.
At 2.3 keV (shown here), the gradual increase in error is mostly due to the difference in the mirror effective area vs. off-axis angle. The Chandra Proposers' Observatory Guide (POG) has data that show the difference in effective area vs. off-axis angle. Using the data shown for the 1.5 keV curve, the PIMMS fluxes have been corrected for the mirror vignetting with the results shown in Figure 9.
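A vignetting correction of this kind amounts to dividing the PIMMS flux by the relative mirror effective area at the source's off-axis angle. The sketch below shows the idea with linear interpolation in a small lookup table. The table values are placeholders invented for illustration and are not taken from the POG; the function names are hypothetical.

```python
# Placeholder table: (off-axis angle in arcmin, area relative to on-axis).
# These numbers are NOT from the POG; substitute values read from its curves.
VIGNETTING_TABLE = [(0.0, 1.00), (5.0, 0.95), (10.0, 0.85), (15.0, 0.70)]


def relative_area(off_axis_arcmin, table=VIGNETTING_TABLE):
    """Linearly interpolate the relative mirror effective area."""
    if not table[0][0] <= off_axis_arcmin <= table[-1][0]:
        raise ValueError("off-axis angle outside the tabulated range")
    for (x0, y0), (x1, y1) in zip(table, table[1:]):
        if x0 <= off_axis_arcmin <= x1:
            return y0 + (y1 - y0) * (off_axis_arcmin - x0) / (x1 - x0)


def vignetting_corrected_flux(pimms_flux, off_axis_arcmin):
    """Scale an on-axis (PIMMS) flux up by the loss in mirror area."""
    return pimms_flux / relative_area(off_axis_arcmin)


print(vignetting_corrected_flux(1.0e-13, 7.5))
```

This only addresses the smooth, large-scale trend; it does nothing for sources near chip gaps, detector edges, or bad columns.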
This then represents a reasonably simple approach to help minimize the large scale spatial discrepancies; however, it does introduce some new sources of calibration uncertainty, such as the relevance of the data in the POG for past observations (e.g., if the mirror effective area has changed). Like PIMMS, the POG is intended to present the most recent information relevant to the most recent proposing cycle, so using it as a source of calibration information may be unreliable.
While shown for just a single monochromatic energy and a single observation, most of these effects are also a function of energy and time. Things like the QE/U and vignetting have well characterized energy signatures. However, also consider that each observation has its own list of bad columns and pixels. The location of the detector in the focal plane (i.e., the SIM location) affects where the aimpoint is on the detector. The elemental makeup of the contamination has changed with time, leading to a calibration trifecta: temporal, spectral, and spatial dependence. Some things are also affected by the observing mode (dead area is affected by the use of sub-arrays and the number of active CCDs). While a simple majority of observations are done with a "standard" setup akin to the one used to create the PIMMS ARFs, the Chandra archive contains a heterogeneous mix of datasets optimized to achieve the science results of individual programs. The applicability of the PIMMS ARFs to these datasets is subject to scrutiny.
To summarize the key points:
The effective area used by PIMMS is generally higher than what current calibrations establish. Thus fluxes derived from PIMMS will be generally lower than current calibrations would estimate.
Using PIMMS for the earliest cycles produces the largest errors. The most recent data analyzed with the most recent PIMMS files are generally less affected by the changes due to CIAO+CALDB updates. However, they can still suffer from spatial effects.
Using wide energy bands lessens the energy dependency effect.
Energy bands at higher energies have smaller errors, as the calibrations have not changed significantly in those regions.
The assumed spectral model has some effect on the results; extra care should be taken for spectra of super-soft and hard sources or spectra with strong absorption/emission lines.
There is a spatial dependence on the effective area. While many of the spatial effects are gradual, sources at or near the edge of the CCDs or near clusters of bad columns may see very large errors when using the PIMMS counts to flux conversion factors.
The typical systematic error when using the PIMMS counts to flux conversion factor is to underestimate fluxes by between 5% and 20%, with much larger errors possible.
Historically, creating ARFs for individual sources in individual observations, and possibly combining them across multiple observations, has been labor intensive as well as CPU intensive to the level of being prohibitive (or at least highly impractical). Newer CIAO scripts have simplified and automated many of the individual steps needed to create and combine ARFs, such as:
- specextract: creates (and combines) spectra along with the ARFs and RMFs.
- srcflux: runs specextract in parallel for a list of source locations/regions taking advantage of multi-core CPUs.
- combine_spectra: automates the combination of spectra, including combining ARFs and RMFs.
- mktgresp: automates the creation of grating responses making use of multi-core CPUs.
- combine_grating_spectra: combines grating orders, arms, and spectra across observations.
For low-count sources with large statistical errors, the reduction in the systematic error gained by creating individual ARFs may not be entirely justified. However, given the spatial dependencies, it is difficult to predict the magnitude of the systematic error for any particular source.