Synopsis
The CrateData object is used to store column or image data.
Syntax
CrateData()
Description
The CRATES Library uses CrateData objects to store the data values of a column or image.
Important fields of a CrateData object
Unlike the other parts of the Crates interface, access to information in a CrateData object is restricted to the methods and fields of the Python object (i.e. there are no separate functions). The important fields are listed below.
Field | Description |
---|---|
values | The data values, stored as a NumPy array. |
name | The name of the object. |
unit | The units of the value, if set. |
desc | A description of the object, if set. |
Reading in a column
Here we read in the X column from the file evt2.fits and inspect the CrateData object that is returned.
>>> cr = read_file("evt2.fits")
>>> x = cr.get_column("x")
>>> x
   Name:     x
   Shape:    (151781,)
   Unit:     pixel
   Desc:     sky coordinates
   Eltype:   Scalar
   Range:
     Min:   0.5
     Max:   8192.5
>>> x.values.mean()
4468.9414
>>> y = cr.get_column("y")
>>> from matplotlib import pyplot as plt
>>> plt.scatter(x.values, y.values, marker='.')
>>> plt.xlabel(f"{x.name} ({x.unit})")
>>> plt.ylabel(f"{y.name} ({y.unit})")
>>> plt.show()
Modifying a column
With the above setup, we can modify the values in place; for instance:
>>> xv = x.values
>>> xv += 0.5
>>> x.values.mean()
4469.4414
Note that changing the values in the xv array changes the underlying CrateData object. To ensure you are working with a copy of the data (so that changes do not get propagated back to the original Crate), use the NumPy copy() method, e.g.
>>> xv = x.values.copy()
or the copy_colvals() routine from Crates.
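The difference between an alias and a copy can be checked with plain NumPy arrays, without reading a file. The following is a minimal sketch: the stored array is a hypothetical stand-in for the array held in the values field, and it assumes (as described above) that values hands back the stored array itself rather than a copy.

```python
import numpy as np

# Hypothetical stand-in for the array held by a CrateData object.
stored = np.array([10.0, 20.0, 30.0], dtype=np.float32)

# Plain assignment only gives a second name for the same array ...
alias = stored
alias += 0.5  # in-place update, so `stored` changes as well

# ... whereas copy() allocates new memory.
independent = stored.copy()
independent += 100.0  # `stored` is not affected by this

shares_memory = np.shares_memory(alias, stored)
copy_shares_memory = np.shares_memory(independent, stored)

print(stored)
print(shares_memory, copy_shares_memory)
```

np.shares_memory is a quick way to check whether two arrays overlap in memory when it is unclear if a change will propagate.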
Creating a column
The following lines create a CrateData object storing the values from the z NumPy array and called "zcol" (the unit and description fields are optional but are added here):
>>> cd = CrateData()
>>> cd.name = "zcol"
>>> cd.values = z
>>> cd.desc = 'The z value'
>>> cd.unit = 'erg /cm**2 /s'
This can then be added to a table crate using either
>>> add_col(cr, cd)
or
>>> cr.add_column(cd)
where the optional index argument of add_column can be used to place the column at a specific location, rather than at the end (the default).
Vector columns
In the following we access the SKY vector column of a Chandra events file.
>>> sky = cr.get_column("sky")
>>> sky
   Name:     sky
   Shape:    (151781, 2)
   Datatype: float32
   Nsets:    151781
   Unit:     pixel
   Desc:     sky coordinates
   Eltype:   Vector
     NumCpts:   2
     Cpts:      ['x', 'y']
   Range:
     Min:   0.5
     Max:   8192.5
>>> sky.values.shape
(151781, 2)
>>> x = sky.values[:, 0]
>>> x.shape
(151781,)
>>> y = sky.values[:, 1]
>>> row0 = sky.values[0, :]
>>> print(row0)
[5071.793 5225.1724]
Virtual columns
There is no significant difference to handling virtual columns (that is, a column which is calculated by applying a transformation to an actual column in a crate):
>>> msc = cr.get_column("MSC")
>>> msc.is_virtual()
True
>>> msc
   Name:     MSC
   Shape:    (151781, 2)
   Unit:     deg
   Desc:
   Eltype:   Virtual Vector
     NumCpts:   2
     Cpts:      ['PHI', 'THETA']
How about images?
As the CrateData object makes no real distinction between a column and an image, reading, modifying, and writing an image work essentially as described above for columns, as shown in this example:
>>> cr = read_file("evt2.fits[bin sky=::8]")
>>> img = cr.get_image()
>>> img
   Name:     EVENTS_IMAGE
   Datatype: int16
   Unit:
   Desc:
   Eltype:   Array
     Ndim:     2
     Dimarr:   (1024, 1024)
   Range:
     Min:   None
     Max:   None
>>> img.values.mean()
0.14474964141845703
>>> import numpy as np
>>> plt.imshow(np.log10(img.values), origin='lower')
When adding an image to a crate, use either the add_piximg command or the add_image method of the IMAGECrate; that is one of
>>> add_piximg(cr, img)
>>> cr.add_image(img)
The CrateData object
There are three CrateData object types: Regular, Vector, and Virtual.
Regular Objects
Regular CrateData objects contain values from an image array or a single table column, which can be composed of either scalar or array values.
Multi-dimensional data
A CrateData object can contain multi-dimensional data. Whether this is interpreted as an image or as an array column is determined by the crate it is added to: an IMAGECrate (with the add_piximg command) or a TABLECrate (with the add_col command).
Vector Columns
Vector columns are two or more columns that have been grouped together under the same name, but each component column has its own name as well. For example, the vector column SKY has two components, X position and Y position. The notation for vectors in the CRATES library is
vector(cpt1, cpt2, ...)
so the sky vector is represented as SKY(X,Y).
Vector CrateData objects have values which consist of two or more CrateData objects. Using the previous example, the SKY vector values point to regular columns X position and Y position.
Virtual Objects
A Virtual CrateData object has values that have been calculated via a transform from another CrateData object. For example, the virtual column RA is defined by a transform associated with the regular column X.
Vector columns can also be virtual. EQPOS is a virtual vector column comprised of two virtual column components RA and DEC. EQPOS(RA,DEC) values are determined by applying a transform to SKY(X,Y) values.
Creating a vector column
The following two routines were added to simplify creating vector and virtual columns: create_vector_column and create_virtual_column.
This example shows how to create a SKY vector column made up of X and Y arrays:
import numpy as np
rng = np.random.default_rng()
x = rng.normal(4782.3, 5, size=1000).astype(np.float32)
y = rng.normal(5234.1, 5, size=1000).astype(np.float32)
This creates 1000 pairs of X,Y values drawn from a normal distribution and converted to 32-bit floats. To create a SKY vector column you would then say:
sky = create_vector_column('SKY', ['X', 'Y'])
sky.unit = 'pixel'
sky.desc = 'sky coordinates'
sky.values = np.column_stack((x,y))
The sky column can then be added to a crate (a new one in this example) and written out:
cr = TABLECrate()
cr.name = 'GAUSS'
cr.add_column(sky)
cr.write('gauss.fits', clobber=True)
unix% dmlist gauss.fits blocks
--------------------------------------------------------------------------------
Dataset: gauss.fits
--------------------------------------------------------------------------------
     Block Name                          Type         Dimensions
--------------------------------------------------------------------------------
Block    1: PRIMARY                        Null
Block    2: GAUSS                          Table         1 cols x 1000     rows
unix% dmlist gauss.fits cols
--------------------------------------------------------------------------------
Columns for Table Block GAUSS
--------------------------------------------------------------------------------
ColNo  Name                 Unit        Type             Range
   1   SKY(X,Y)             pixel        Real4          -Inf:+Inf            sky coordinates
Note that the values are added to the vector column (the parent) as an array with shape (nrows, num cpts), which is what column_stack creates. Setting the unit and desc fields is not required, but they can provide useful metadata.
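The (nrows, num cpts) layout can be checked with NumPy alone. The sketch below repeats the random draws from the example above (the fixed seed of 42 is an arbitrary choice, used only to make the run repeatable) and compares column_stack with vstack, which produces the transposed layout.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, only for repeatability
x = rng.normal(4782.3, 5, size=1000).astype(np.float32)
y = rng.normal(5234.1, 5, size=1000).astype(np.float32)

# One row per event, one column per component: shape (nrows, num cpts).
stacked = np.column_stack((x, y))
print(stacked.shape, stacked.dtype)

# Each component is recovered by slicing along the second axis.
x_back = stacked[:, 0]
y_back = stacked[:, 1]

# vstack gives (num cpts, nrows), i.e. the transpose of the layout above.
transposed = np.vstack((x, y))
print(transposed.shape)
```

If the components are already held as rows, transposed.T gives the same (nrows, num cpts) shape as column_stack.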
Representing bit values
Boolean columns
A column of boolean values (that is, True or False) can be created in the same way as any other simple type. For instance, the following adds a column called ENFLAG that indicates whether the energy value of the row is between 500 and 7000:
>>> cr = read_file('evt2.fits')
>>> flag = CrateData()
>>> flag.name = 'ENFLAG'
>>> flag.desc = 'Interesting energy'
>>> envals = cr.get_column('energy').values
>>> flag.values = (envals >= 500) & (envals <= 7000)
>>> cr.add_column(flag)
>>> cr.write('flagged.evt2')
which creates a file with an ENFLAG column:
>>> !dmlist "flagged.evt2[cols enflag]" cols
-----------------------------------------------------------------
Columns for Table Block EVENTS
-----------------------------------------------------------------
ColNo  Name       Unit    Type       Range
   1   ENFLAG             Logical               Interesting energy
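The expression used to fill the ENFLAG column is an ordinary NumPy element-wise comparison, so it can be tried out without an event file. In this sketch the energy values are made up for illustration; only the filter expression is taken from the example above.

```python
import numpy as np

# Made-up energy values standing in for the energy column.
envals = np.array([120.0, 500.0, 1486.7, 6999.9, 7000.0, 9500.0],
                  dtype=np.float32)

# Element-wise test; the parentheses are needed because & binds more
# tightly than the comparison operators.
flag_values = (envals >= 500) & (envals <= 7000)

n_selected = int(flag_values.sum())  # True counts as 1

print(flag_values.dtype)
print(flag_values)
print(n_selected)
```

The result has a boolean dtype and one element per row; both limits are inclusive, so the values 500 and 7000 are flagged as True.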
Boolean arrays
The STATUS column of a Chandra event file is a bit column, where each bit represents a different flag value. The Python representation uses an array of np.uint8 values, one for each bit, where a non-zero value indicates set and 0 is unset. For instance:
>>> cr = read_file('evt2.fits')
>>> status = cr.get_column('status')
>>> print(status)
   Name:     status
   Shape:    (107664, 32)
   Datatype: uint8 | Bit[32]
   Nsets:    107664
   Unit:
   Desc:     event status bits
   Eltype:   Array
     Ndim:     1
     Dimarr:   (32,)
   Range:
     Min:   0
     Max:   255
>>> status.is_bit_array()
True
>>> len(status.values[0])
32
>>> print(status.values[0])
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
>>> status.values[0][1] = 1
>>> status.values[0][10] = 1
>>> status.values[0][20] = 1
>>> print(status.values[0])
[0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0]
>>> cr.write('test.fits', clobber=True)
>>> !dmlist "test.fits[#row=1][cols status]" data,clean
#  status[4]
 01000000001000000000100000000000
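The one-element-per-bit representation can be explored with a plain NumPy array. The sketch below uses a hypothetical 32-element row rather than a real STATUS column, sets the same three bits as the example above, and shows how to list the set bits and build the 0/1 string. The final packbits step is only an illustration that 32 bits fit in four bytes, which is consistent with the status[4] label in the dmlist output; how Crates stores the bits on write is not covered here.

```python
import numpy as np

# Hypothetical stand-in for one row of a 32-bit column:
# one uint8 element per bit, 0 meaning unset.
row = np.zeros(32, dtype=np.uint8)
row[[1, 10, 20]] = 1

# Indices of the bits that are set (any non-zero value counts).
set_bits = np.flatnonzero(row).tolist()

# The row written as a string of 0 and 1 characters.
bit_string = "".join(str(b) for b in row.tolist())

# Eight bits per byte, first element as the most significant bit.
packed = np.packbits(row)

print(set_bits)
print(bit_string)
print(packed)
```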
Boolean arrays
The is_bit_array() method of a CrateData object can be used to determine whether a sequence of bits is being represented as a bit array. The resize_bit_array() method is used to increase or decrease the size of the bit array.
Loading Crates
The Crates module is automatically imported into Sherpa sessions; otherwise, use one of the following:
from pycrates import *
or
import pycrates
Changes in CIAO 4.8
Support for variable-length arrays
Support for variable-length arrays has been improved and the CrateData object now supports the is_varlen and get_fixed_length_array methods.
Bugs
See the bug pages on the CIAO website for an up-to-date listing of known bugs.