Last modified: December 2022

URL: https://cxc.cfa.harvard.edu/ciao/ahelp/dmdiff.html
AHELP for CIAO 4.16

dmdiff

Context: Tools::Core

Synopsis

Compare values in two files.

Syntax

dmdiff  infile1 infile2 [outfile] [tolfile] [keys] [data] [subspaces]
[units] [comments] [wcs] [missing] [error_on_value] [error_on_comment]
[error_on_unit] [error_on_range] [error_on_datatype] [error_on_wcs]
[error_on_subspace] [error_on_missing] [verbose] [clobber]

Description

The dmdiff tool compares two files (FITS or ASCII) and determines whether they contain the same data. The default behavior is to compare the data values - e.g. columns in a table and pixel values in an image - as well as the metadata in the file - e.g. keyword values, units, and comments - but options exist to restrict the items being compared. There are multiple ways of comparing values - such as equality or by using absolute or relative differences - that can be specified for different values using the tolfile parameter.

Exit Status

Like the Unix commands `diff' and `cmp', dmdiff assigns special meaning to its exit status. An exit status of 0 means that no differences were found in the two input files. An exit status of 1 means that either differences were found or an error occurred. An exit status greater than one always indicates that an error occurred. Note that if the verbose parameter is set to 0, the tool will produce no output, but the exit status will still reflect whether differences were found in the input files. This feature can be useful in scripts that automatically compare a large number of files.
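
A script can branch on the exit status along the following lines. This is a minimal Python sketch rather than part of CIAO: the helper names describe_status and quiet_dmdiff are hypothetical, and quiet_dmdiff assumes that the dmdiff executable is on the PATH.

```python
import subprocess


def describe_status(status):
    """Map a dmdiff exit status to the meaning described in this section."""
    if status == 0:
        return "no differences"
    if status == 1:
        return "differences found or error"
    # Anything else (including a negative value, which Python reports
    # when the process was killed by a signal) is treated as an error.
    return "error"


def quiet_dmdiff(infile1, infile2):
    """Run dmdiff with verbose=0 and return its exit status.

    Assumes the dmdiff executable is on the PATH.
    """
    completed = subprocess.run(["dmdiff", infile1, infile2, "verbose=0"])
    return completed.returncode
```

For instance, describe_status(quiet_dmdiff("file1.fits", "file2.fits")) would report "no differences" only when the exit status is 0.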

Current Limitations

There are a few limitations in the tool:


Examples

Example 1

unix% dmdiff file1.fits file2.fits

Compare all header and data values in the default block of file1.fits and file2.fits.

Example 2

unix% dmdiff "file1.dat[t=100:][cols x,y]" "file2.dat[cols x,y]"

Here the comparison is limited to the X and Y columns in the two files (in this case ASCII files, using the ASCII kernel support), and the data from the first file has an additional filter (only those rows with t values of 100 or more).

Example 3

unix% dmdiff "file1.fits[EVENTS]" "file2.fits[EVENTS]"

Compare all the header and table values of the EVENTS block in file1.fits and file2.fits.

Example 4

unix% dmdiff "file1.fits[EVENTS]" "file2.fits[EVENTS]" keys=yes data=no

Compare only the header values of the EVENTS block in file1.fits and file2.fits.

Example 5

unix% dmdiff "file1.fits[2]" "file2.fits[2]" tolfile=tolerances.txt outfile=diffs.txt

Compare the header and data values listed in block 2 of file1.fits and file2.fits using the limits given in the file tolerances.txt. Output will be written to diffs.txt.

Example 6

unix% dmdiff "image1.fits[PRIMARY]" "image2.fits[PRIMARY]" tolfile=tols.txt

Compare the header and image values listed in the PRIMARY block of image1.fits and image2.fits using the limits listed in tols.txt. In this example, we have

unix% cat tols.txt
!DATE
!CHECKSUM
PRIMARY=range(1.0e-6)

which means that pixel values that differ by 1.0e-6 or less will be considered equal and the DATE and CHECKSUM keywords will be ignored.


Parameters

name               type     ftype  def  min  max  reqd  stacks
infile1            file     input                 yes   no
infile2            file     input                 yes   no
outfile            file                           no    no
tolfile            file                           no    no
keys               boolean         yes            no
data               boolean         yes            no
subspaces          boolean         yes            no
units              boolean         yes            no
comments           boolean         yes            no
wcs                boolean         yes            no
missing            boolean         yes            no
error_on_value     boolean         yes            no
error_on_comment   boolean         yes            no
error_on_unit      boolean         yes            no
error_on_range     boolean         yes            no
error_on_datatype  boolean         yes            no
error_on_wcs       boolean         yes            no
error_on_subspace  boolean         yes            no
error_on_missing   boolean         yes            no
verbose            integer         1    0    5    no
clobber            boolean         no             no

Detailed Parameter Descriptions

Parameter=infile1 (file required filetype=input stacks=no)

1st input file name

The first file to use. It can contain Data Model syntax. The file does not have to have the same format as the file given by the infile2 parameter.

Parameter=infile2 (file required filetype=input stacks=no)

2nd input file name

The second file to use. It can contain Data Model syntax. The file does not have to have the same format as the file given by the infile1 parameter.

Parameter=outfile (file not required stacks=no)

Output file name

Output file listing summary of differences found. If the value is omitted or set to 'none', 'NONE', or 'stdout', output will go to the standard output device (generally the terminal). If outfile is set to 'stderr', output will go to the standard error device (also generally displayed on the terminal). Finally, if a filename is given, output will be written to that file. The clobber parameter controls whether an existing file will be overwritten.

Parameter=tolfile (file not required stacks=no)

Tolerance file name

This is an ASCII text file that governs how values are compared. The file is case insensitive, with one command per line; empty lines and lines beginning with the '#' character are ignored. The order of the commands does not matter, and commands that do not match anything in the input files are ignored.

There are multiple ways to compare numeric values, as discussed below. To refer to an image, use the block name of the image (use 'dmlist filename blocks' to find this out). The same syntax is used to refer to keyword values, rows of a column, or image pixel values, so whether the command

EVAL=range(1)

refers to a keyword, column, or image, depends on the input files.

A single value

Using "name=value" means that it is an error if the value in either file does not equal the given value. The following example requires all ccd_id values to be equal to 3 and state values to match the string "finished":

ccd_id=3

state=finished

A range of values

The Data Model range syntax - namely a=b:c, with b or c optional - can be used to specify that a must be within the range b to c (a missing value means that there is no lower or upper limit). That is,

ccd_id=6:8

ccd_id=6:

ccd_id=:8

require that the ccd_id values be in the range 6 to 8 (inclusive), greater than or equal to 6, and less than or equal to 8 respectively.

Note that there is no check that the values in the two files equal each other, just that they match the range filter.
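
The range rule can be pictured with the following minimal Python sketch. The functions parse_range and in_range are hypothetical names used for illustration; they are not part of dmdiff, and numeric bounds are assumed.

```python
def parse_range(text):
    """Split a 'b:c' range into (low, high); a missing bound becomes None."""
    low_text, high_text = text.split(":")
    low = float(low_text) if low_text else None
    high = float(high_text) if high_text else None
    return low, high


def in_range(value, low=None, high=None):
    """Return True if value lies in the inclusive range [low, high].

    A bound of None means that side of the range is unlimited.
    """
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True
```

With the rule ccd_id=6:8, a value of 6 in one file and 8 in the other both pass this check, even though the two values differ.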

An absolute difference

The range option is used to check that the absolute difference between the values in the two files is within the given limit. So

chipx=range(1)

events_image=range(1.0e-6)

mean that the chipx values can differ by no more than 1, and the events_image values no more than 1.0e-6.
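
As an illustration of the absolute-difference rule, here is a minimal Python sketch. The function within_absolute is a hypothetical name and not dmdiff's own code; the comparison is taken to be inclusive, following the "no more than" wording above.

```python
def within_absolute(value1, value2, limit):
    """Return True if the two values differ by no more than limit."""
    return abs(value1 - value2) <= limit
```

With chipx=range(1), the values 10 and 11 would be accepted by this check, while 10 and 12 would not.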

A percentage difference

To express a relative difference, use % and then the difference as a percentage (calculated relative to the first file). Note that the % character is written before the limit, otherwise it will be taken as a string comparison. The commands

chipx=%1

events_image=%0.01

mean that the chipx values can differ by 1% or less and the events_image values by 0.01% or less.
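
The percentage rule can be sketched in Python as follows. The function within_percent is a hypothetical name, and the handling of a zero value in the first file is an assumption, since the text above does not say how dmdiff treats that case.

```python
def within_percent(value1, value2, percent):
    """Return True if value2 is within `percent` percent of value1.

    The difference is measured relative to value1 (the first file).
    """
    if value1 == 0:
        # Assumption: with a zero reference value only an exact match passes.
        return value2 == 0
    return abs(value1 - value2) <= abs(value1) * percent / 100.0
```

Because the difference is taken relative to the first file, the check in this sketch is not symmetric: within_percent(2, 1, 60) passes, but within_percent(1, 2, 60) does not.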

File names

When comparing string values (either column values or a keyword) that contain file names, the "ignorepath" directive can be used to compare only the file name, ignoring any preceding path components. That is, with the command

INFILES=ignorepath

the values /path1/to/file1.dat and /data/file1.dat would be considered equal when stored in either the INFILES keyword or column.
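
The effect of ignorepath can be pictured with this minimal Python sketch. The function same_ignoring_path is a hypothetical name, and Unix-style '/' path separators are assumed.

```python
import posixpath


def same_ignoring_path(value1, value2):
    """Compare two file names after dropping any leading directories."""
    return posixpath.basename(value1) == posixpath.basename(value2)
```

Here same_ignoring_path("/path1/to/file1.dat", "/data/file1.dat") is true because both values reduce to file1.dat.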

Ignoring a value

To ignore a keyword, column, or image, use the ! character followed by the name of the item. To ignore multiple items, write each one on a separate line of the file (preceded by the ! character). You can also use the Data Model virtual file syntax for the infile1 and infile2 parameters to select (or hide) certain columns. The following commands will ignore the keywords DATE, CHECKSUM, and CREATOR:

!DATE
!CHECKSUM
!CREATOR

Parameter=keys (boolean not required default=yes)

Check header keywords?

Determines whether or not the header keys will be compared. See also the units and comments parameters. The tolerance file - set by the tolfile parameter - can be used to filter out certain keywords and to control whether, when comparing file names, the path component should be ignored.

Parameter=data (boolean not required default=yes)

Check table or image data?

Determines whether or not the data values - i.e. the image pixels or the rows of each column - will be compared.

Parameter=subspaces (boolean not required default=yes)

Check subspaces?

Controls whether or not the subspace record, stored in the file by CIAO tools to record the filters applied, will be compared.

Parameter=units (boolean not required default=yes)

Check units?

Controls whether or not the units of keywords and columns will be compared.

Parameter=comments (boolean not required default=yes)

Check comments?

Controls whether or not the comments of columns and keywords will be compared. This does not refer to the COMMENT or HISTORY keywords, which are not used when comparing files.

Parameter=wcs (boolean not required default=yes)

Check wcs?

Controls whether the WCS keywords are included in the comparison.

Parameter=missing (boolean not required default=yes)

Check for missing header keys?

Determines if missing header keys will be checked.

Parameter=error_on_value (boolean not required default=yes)

Return error when values are different?

Parameter=error_on_comment (boolean not required default=yes)

Return error when comments are different?

Parameter=error_on_unit (boolean not required default=yes)

Return error when units are different?

Parameter=error_on_range (boolean not required default=yes)

Return error when ranges are different?

Parameter=error_on_datatype (boolean not required default=yes)

Return error when datatypes are different?

Parameter=error_on_wcs (boolean not required default=yes)

Return error when wcs's are different?

Parameter=error_on_subspace (boolean not required default=yes)

Return error when subspaces are different?

Parameter=error_on_missing (boolean not required default=yes)

Return error when header key is missing?

Parameter=verbose (integer not required default=1 min=0 max=5)

Debug level

Verbosity level of the information displayed to the user (Data Model output included). If verbose is set to 0, the tool will produce no output, but its exit status will indicate whether differences were found in the input files. See the section "Exit Status" above.

Parameter=clobber (boolean not required default=no)

Clobber existing file

Controls whether a file is overwritten, or the tool errors out, if the outfile parameter is set to a file name and it exists.


An example tolerance file

The purpose of the tolerance file is to set parameters when comparing the values of the input files. The tolerance file is an ASCII file with one keyword rule per line; see the description of the tolfile parameter for more information on the syntax and semantics of the commands.

An example tolerance file is:

unix% cat tols.dat
TSTART=83201992:83201992.7
chipx=range(50)
#time=83202418:
ccd_id=8
!checksum
!datasum
!DATE
telescop=CHANDRA
backfile=ignorepath

which indicates that any TSTART value must lie between the given minimum and maximum limits (the checks are inclusive, but note that there is no requirement that the TSTART values are the same in the two files, just that they both lie within this range); the CHIPX values must not differ by more than 50; the CCD_ID values must be equal to 8; the CHECKSUM, DATASUM, and DATE values are ignored (whether a column or keyword); the TELESCOP value must equal 'CHANDRA'; and the BACKFILE values are compared ignoring any path component. The TIME rule is ignored because its line begins with a '#' character. Note that the names of the values to be compared are case insensitive.
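
To make the rule types concrete, the following minimal Python sketch classifies each line of the example tolerance file. It is an illustration of the syntax described for the tolfile parameter, not dmdiff's own parser: the function parse_tolerance_text, the rule labels it returns, and the assumption that range bounds are numeric are all hypothetical.

```python
import re

EXAMPLE = """\
TSTART=83201992:83201992.7
chipx=range(50)
#time=83202418:
ccd_id=8
!checksum
!datasum
!DATE
telescop=CHANDRA
backfile=ignorepath
"""


def parse_tolerance_text(text):
    """Return a dict mapping each lower-cased name to a (kind, argument) tuple."""
    rules = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # empty lines and '#' lines are ignored
        if line.startswith("!"):
            rules[line[1:].strip().lower()] = ("ignore", None)
            continue
        name, _, value = line.partition("=")
        name, value = name.strip().lower(), value.strip()
        match = re.fullmatch(r"range\((.+)\)", value, flags=re.IGNORECASE)
        if match:
            rules[name] = ("absolute", float(match.group(1)))
        elif value.startswith("%"):
            rules[name] = ("percent", float(value[1:]))
        elif value.lower() == "ignorepath":
            rules[name] = ("ignorepath", None)
        elif ":" in value:
            # Assumes numeric bounds; a missing bound becomes None.
            low, _, high = value.partition(":")
            rules[name] = ("range", (float(low) if low else None,
                                     float(high) if high else None))
        else:
            rules[name] = ("equals", value)
    return rules
```

Calling parse_tolerance_text(EXAMPLE) yields no entry for time, because that line is commented out, and stores every name in lower case to reflect the case-insensitive matching.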


Bugs

Problem with percent sign (%) in strings

dmdiff will produce bad results if any of the strings (comment, units, value) contain a "%" character, e.g. "90%ecf": the "%e" is parsed as a string-format specifier.

The tool does not recognize differences in vector component ranges.

See Also

tools::header
dmhedit, dmhistory, dmkeypar, dmmakepar, dmreadpar