AHELP for CIAO 4.16

cratedata

Context: crates

Synopsis

The CrateData object is used to store column or image data.

Syntax

CrateData()

Description

The CRATES Library uses CrateData objects to store the data values of a column or image.

Important fields of a CrateData object

Unlike the other parts of the Crates interface, access to information in a CrateData object is restricted to the methods and fields of the Python object (i.e. there are no separate functions). The important fields are listed below.

Field	Description
values	The data values, stored as a NumPy array.
name	The name of the object.
unit	The units of the value, if set.
desc	A description of the object, if set.

Reading in a column

Here we read in the X column from the file evt2.fits and inspect the CrateData object that is returned.

>>> cr = read_file("evt2.fits")
>>> x = cr.get_column("x")
>>> x
  Name:     x
  Shape:    (151781,)
  Unit:     pixel
  Desc:     sky coordinates
  Eltype:   Scalar
  Range:
     Min:   0.5
     Max:   8192.5

>>> x.values.mean()
4468.9414
>>> y = cr.get_column("y")
>>> from matplotlib import pyplot as plt
>>> plt.scatter(x.values, y.values, marker='.')
>>> plt.xlabel(f"{x.name} ({x.unit})")
>>> plt.ylabel(f"{y.name} ({y.unit})")
>>> plt.show()

Modifying a column

With the above set up, we can modify the values; for instance:

>>> xv = x.values
>>> xv += 0.5
>>> x.values.mean()
4469.4414

Note that changing the values in the xv array changes the underlying CrateData object, because xv refers to the same array as x.values. To ensure you are working with a copy of the data (so that changes do not get propagated back to the original Crate), use the NumPy copy() method, e.g.

>>> xv = x.values.copy()

or the copy_colvals() routine from Crates.
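
The difference between working on the array in place and working on a copy can be checked with plain NumPy, without Crates. The following is a minimal sketch; the `values` array is a made-up stand-in for `x.values`, not data read from a file:

```python
import numpy as np

# Hypothetical stand-in for x.values; a real session would get this
# array from a CrateData object.
values = np.array([0.5, 100.0, 8192.5], dtype=np.float32)

alias = values            # same array object, as with xv = x.values
alias += 0.5              # the in-place add also changes `values`

snapshot = values.copy()  # independent copy of the current data
snapshot += 100.0         # does not touch `values`

print(values)
print(snapshot)
print(np.shares_memory(values, alias))     # True: same buffer
print(np.shares_memory(values, snapshot))  # False: separate buffer
```

Note that `xv = xv + 0.5` (rather than `xv += 0.5`) would create a new array and leave the original untouched, so the choice of operator matters when in-place modification is intended.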

Creating a column

The following lines create a CrateData object storing the values from the z NumPy array and called "zcol" (the unit and description fields are optional but are added here):

>>> cd = CrateData()
>>> cd.name = "zcol"
>>> cd.values = z
>>> cd.desc = 'The z value'
>>> cd.unit = 'erg /cm**2 /s'

This can then be added to a table crate using either

>>> add_col(cr, cd)

>>> cr.add_column(cd)

where the optional index argument of add_column can be used to place the column at a specific location, rather than at the end (the default).

Vector columns

In the following we access the SKY vector column of a Chandra events file.

>>> sky = cr.get_column("sky")
>>> sky
  Name:     sky
  Shape:    (151781, 2)
  Datatype: float32
  Nsets:    151781
  Unit:     pixel
  Desc:     sky coordinates
  Eltype:   Vector
     NumCpts:   2
     Cpts:      ['x', 'y']
  Range:
     Min:   0.5
     Max:   8192.5

>>> sky.values.shape
(151781, 2)
>>> x = sky.values[:, 0]
>>> x.shape
(151781,)
>>> y = sky.values[:, 1]
>>> row0 = sky.values[0, :]
>>> print(row0)
[5071.793  5225.1724]

Virtual columns

There is no significant difference in handling virtual columns (that is, columns which are calculated by applying a transformation to an actual column in a crate):

>>> msc = cr.get_column("MSC")
>>> msc.is_virtual()
True
>>> msc
  Name:     MSC
  Shape:    (151781, 2)
  Unit:     deg
  Desc:
  Eltype:   Virtual Vector
     NumCpts:   2
     Cpts:      ['PHI', 'THETA']

How about images?

As there's no real distinction between a column and an image for the CrateData() object, the read, modify, and write sections are essentially the same as above, as shown in this example:

>>> cr = read_file("evt2.fits[bin sky=::8]")
>>> img = cr.get_image()
>>> img
  Name:     EVENTS_IMAGE
  Datatype: int16
  Unit:
  Desc:
  Eltype:   Array
     Ndim:     2
     Dimarr:   (1024, 1024)
  Range:
     Min:   None
     Max:   None

>>> img.values.mean()
0.14474964141845703
>>> import numpy as np
>>> plt.imshow(np.log10(img.values), origin='lower')

When adding an image to a crate, use either the add_piximg command or the add_image method of the IMAGECrate; that is, one of:

>>> add_piximg(cr, img)
>>> cr.add_image(img)

The CrateData object

There are three CrateData object types: Regular, Vector, and Virtual.

Regular Objects

Regular CrateData objects contain values from an image array or a single table column, which can be composed of either scalar or array values.

Multi-dimensional data

A CrateData object can contain multi-dimensional data. Whether it is interpreted as an image or an array column is determined by the crate it is added to: an IMAGECrate (with the add_piximg command) or a TABLECrate (with the add_col command).

Vector Columns

Vector columns are two or more columns that have been grouped together under the same name, but each component column has its own name as well. For example, the vector column SKY has two components, X position and Y position. The notation for vectors in the CRATES library is

vector(cpt1, cpt2, ...)

so the sky vector is represented as

SKY(X,Y)

Vector CrateData objects have values which consist of two or more CrateData objects. Using the previous example, the SKY vector values point to regular columns X position and Y position.

Virtual Objects

A Virtual CrateData object has values that have been calculated via a transform from another CrateData object. For example, the virtual column RA is defined by a transform associated with the regular column X.

Vector columns can also be virtual. EQPOS is a virtual vector column comprised of two virtual column components RA and DEC. EQPOS(RA,DEC) values are determined by applying a transform to SKY(X,Y) values.

Creating a vector column

The following two routines were added to simplify creating vector columns: create_vector_column and create_virtual_column.

This example shows how to create a SKY vector column made up of X and Y arrays. First,

import numpy as np
rng = np.random.default_rng()
x = rng.normal(4782.3, 5, size=1000).astype(np.float32)
y = rng.normal(5234.1, 5, size=1000).astype(np.float32)

creates 1000 pairs of X,Y values drawn from a normal distribution and converted to 32-bit floats. To create a SKY vector column you would then say:

sky = create_vector_column('SKY', ['X', 'Y'])
sky.unit = 'pixel'
sky.desc = 'sky coordinates'
sky.values = np.column_stack((x,y))

The sky column can then be added to a crate (a new one in this example) and written out:

cr = TABLECrate()
cr.name = 'GAUSS'
cr.add_column(sky)
cr.write('gauss.fits', clobber=True)

unix% dmlist gauss.fits blocks

--------------------------------------------------------------------------------
Dataset: gauss.fits
--------------------------------------------------------------------------------

     Block Name                          Type         Dimensions
--------------------------------------------------------------------------------
Block    1: PRIMARY                        Null
Block    2: GAUSS                          Table         1 cols x 1000     rows
unix% dmlist gauss.fits cols

--------------------------------------------------------------------------------
Columns for Table Block GAUSS
--------------------------------------------------------------------------------

ColNo  Name                 Unit        Type             Range
   1   SKY(X,Y)             pixel        Real4          -Inf:+Inf            sky coordinates

Note that the values are added to the vector column - the parent - as an array with shape (nrows, num cpts), which is what column_stack creates. Setting the unit and desc fields is not required, but they can provide useful metadata.
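
The (nrows, num cpts) layout can be checked with NumPy alone before the array is assigned to a vector column. This is a minimal sketch under stated assumptions: the fixed seed and the `pairs` and `wrong` names are illustrative choices, not part of the Crates interface.

```python
import numpy as np

# Fixed seed, chosen only so that the sketch is repeatable.
rng = np.random.default_rng(42)
x = rng.normal(4782.3, 5, size=1000).astype(np.float32)
y = rng.normal(5234.1, 5, size=1000).astype(np.float32)

# column_stack places each 1D input in its own column: one row per pair.
pairs = np.column_stack((x, y))
print(pairs.shape, pairs.dtype)   # (1000, 2) float32

# Component i of the vector is column i of the stacked array.
assert np.array_equal(pairs[:, 0], x)
assert np.array_equal(pairs[:, 1], y)

# vstack gives (num cpts, nrows) instead, which is not the layout
# described above.
wrong = np.vstack((x, y))
print(wrong.shape)                # (2, 1000)
```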

Representing bit values

Boolean columns

A column of boolean values - that is, True or False - can be created as with any other simple type. For instance, the following adds a column called ENFLAG that indicates whether the energy value of the row is between 500 and 7000:

>>> cr = read_file('evt2.fits')
>>> flag = CrateData()
>>> flag.name = 'ENFLAG'
>>> flag.desc = 'Interesting energy'
>>> envals = cr.get_column('energy').values
>>> flag.values = (envals >= 500) & (envals <= 7000)
>>> cr.add_column(flag)
>>> cr.write('flagged.evt2')

which creates a file with an ENFLAG column:

>>> !dmlist "flagged.evt2[cols enflag]" cols
 
-----------------------------------------------------------------
Columns for Table Block EVENTS
-----------------------------------------------------------------
 
ColNo  Name        Unit   Type        Range
   1   ENFLAG              Logical             Interesting energy
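
The flag expression itself is ordinary NumPy and can be tried without a file. A minimal sketch, assuming a small made-up `envals` array in place of the values of the energy column:

```python
import numpy as np

# Made-up energies standing in for cr.get_column('energy').values
envals = np.array([312.4, 500.0, 1486.7, 6999.9, 7000.0, 9120.3],
                  dtype=np.float32)

# Each comparison yields a boolean array; & combines them element-wise.
# The parentheses are required because & binds more tightly than >= and <=.
in_band = (envals >= 500) & (envals <= 7000)

print(in_band.dtype)      # bool
print(in_band.tolist())   # [False, True, True, True, True, False]
print(int(in_band.sum())) # 4 rows fall inside the band
```

Both limits are inclusive here, matching the >= and <= used in the ENFLAG example.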

Bit arrays

The STATUS column of a Chandra event file is a bit column, where each bit represents a different flag value. The Python representation uses an array of np.uint8 values, one for each bit, where a non-zero value indicates set and 0 is unset. For instance:

>>> cr = read_file('evt2.fits')
>>> status = cr.get_column('status')
>>> print(status)
  Name:     status
  Shape:    (107664, 32)
  Datatype: uint8 | Bit[32]
  Nsets:    107664
  Unit:     
  Desc:     event status bits
  Eltype:   Array
     Ndim:     1
     Dimarr:   (32,)
  Range:
     Min:   0
     Max:   255

>>> status.is_bit_array()
True
>>> len(status.values[0])
32
>>> print(status.values[0])
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
>>> status.values[0][1] = 1
>>> status.values[0][10] = 1
>>> status.values[0][20] = 1
>>> print(status.values[0])
[0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0]
>>> cr.write('test.fits', clobber=True)
>>> !dmlist "test.fits[#row=1][cols status]" data,clean
#  status[4]
             01000000001000000000100000000000
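
Because each bit is held as its own np.uint8 element, standard NumPy operations can be used to summarize the flags. The following is a minimal sketch that does not use Crates: the `bits` array is a made-up stand-in for `status.values`, and the packing step only illustrates NumPy's np.packbits routine; it is not a statement about how Crates stores the column on disk.

```python
import numpy as np

# Hypothetical stand-in for status.values: 3 rows of 32 flags,
# one uint8 element per bit.
bits = np.zeros((3, 32), dtype=np.uint8)
bits[0, [1, 10, 20]] = 1   # the bits set in the example above
bits[2, 31] = 1

# Rows with at least one bit set.
flagged = bits.any(axis=1)
print(flagged.tolist())    # [True, False, True]

# Render one row as a string of 0s and 1s, first bit on the left.
text = "".join("1" if b else "0" for b in bits[0])
print(text)                # 01000000001000000000100000000000

# Pack each row's 32 flags into 4 bytes (np.packbits defaults to
# big-endian bit order within each byte).
packed = np.packbits(bits != 0, axis=1)
print(packed[0].tolist())  # [64, 32, 8, 0]
```

The comparison `bits != 0` is applied before packing so that any non-zero element counts as a set bit.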

The is_bit_array() method of a CrateData object can be used to determine whether a sequence of bits is being represented as a bit array. The resize_bit_array() method is used to increase or decrease the size of the bit array.

Loading Crates

The Crates module is automatically imported into Sherpa sessions; otherwise use one of the following:

from pycrates import *

import pycrates

Changes in CIAO 4.8

Support for variable-length arrays

Support for variable-length arrays has been improved and the CrateData object now supports the is_varlen and get_fixed_length_array methods.

Bugs

See the bug pages on the CIAO website for an up-to-date listing of known bugs.