Synopsis
Write out arrays (columns) to a file as a table (ASCII or FITS format).
Syntax
write_columns(filename, col1, ..., coln)
write_columns(filename, dictionary)
write_columns(filename, structarray)

The optional arguments, with their default values, are:

colnames=None
format="text"
clobber=True
sep=" "
comment="#"
Description
This routine provides a quick means of writing out a set of arrays, a dictionary, or a NumPy structured array to a file as a table, in either ASCII or FITS binary format.
Loading the routine
The routine can be loaded into Python by saying:
from crates_contrib.utils import *
Column content: dictionary
If a single argument is given for the column data and it acts like a Python dictionary, then it is used to define both the column names and values (the dictionary keys and values respectively). The dictionary values are assumed to be arrays and follow the same rules as if given individually, as described in the "Column content: separate arguments" section below.
If the colnames argument is given then it is used to determine the order of the columns in the output crate. This argument acts as a column filter on the input dictionary, since any dictionary keys not in colnames are not added to the crate.
If colnames is None then the order of the columns is determined by the dictionary itself - i.e. the order returned by its keys() method. Use the collections.OrderedDict class if you need a specific order (or explicitly specify it with the colnames argument).
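For example, the following sketch (the file and column names are illustrative) writes a dictionary out with an explicit column order; the "err" key is dropped since it does not appear in colnames:

>>> d = {"time": [10, 20, 30], "flux": [1.2, 3.4, 5.6], "err": [0.1, 0.2, 0.3]}
>>> write_columns("lc.dat", d, colnames=["time", "flux"])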
Column content: structured array
If the column argument is a NumPy structured array then it is used to determine the column names and values and the order of the columns. The colnames argument can be used to re-order or subset the columns as with dictionaries.
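As an illustrative sketch (the field and file names are made up for this example), a structured array defines both the column names and their order, and colnames can then re-order or subset them:

>>> import numpy as np
>>> sa = np.zeros(3, dtype=[("x", np.int64), ("y", np.float64)])
>>> sa["x"] = [1, 2, 3]
>>> sa["y"] = [0.1, 0.2, 0.3]
>>> write_columns("sa.dat", sa)
>>> write_columns("sa2.dat", sa, colnames=["y", "x"])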
Column content: separate arguments
If neither a dictionary nor a structured array is given then the data to write out is given as the col1 to coln arguments; they can be Python arrays (e.g. [1,2,3]) or NumPy arrays (e.g. np.arange(4)), can be multi-dimensional (e.g. for vector columns), and can include string vectors. Each column should contain a single datatype, and all will be padded to the length of the longest array. The padding is 0 for numeric columns and "" or "IND" for string arrays.
If the optional colnames argument is not given then the output columns will be named "col1" to "coln", where n is the number of columns. To use your own names, supply an array of strings to the colnames argument.
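As a sketch of the padding behavior (the file and column names are illustrative), the two-element string array below is padded out to match the four-element numeric array:

>>> nums = [1, 2, 3, 4]
>>> labels = ["a", "b"]
>>> write_columns("pad.dat", nums, labels, colnames=["n", "lbl"])

Example 4 below shows the equivalent padding for numeric columns.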
Output format
The default output format is ASCII, using the "TEXT/SIMPLE" flavor supported by the CIAO Data Model (see "ahelp dmascii" for more information on text formats). The output format can be changed with the format argument, using one of the values listed below.
The format argument
Value | Description
---|---
"text" | A simple ASCII format consisting of a header line containing the column names and then the data (TEXT/SIMPLE).
"fits" | FITS binary table format.
"simple" | The same as "text".
"raw" | As "text" but without the column names (TEXT/RAW).
"dtf" | Data Text Format, a FITS-like ASCII format (TEXT/DTF).
ASCII output options
The sep and comment arguments control the ASCII output formats (they are ignored when format is set to "fits"). The sep argument gives the column separator (the default is " ") and the comment argument gives the character that starts a comment line (the default is "#").
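As a sketch (the file name is illustrative), a comma-separated file whose comment lines start with "%" could be written with:

>>> write_columns("tbl.csv", [1, 2, 3], [4.0, 5.5, 6.1], sep=",", comment="%")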
Adding extra metadata to the table
If you wish to add extra metadata to the table then you can use the make_table_crate() routine, which creates a Crate from a list of arrays; add the metadata using Crates routines, and then write the crate out using write_file().
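A minimal sketch, assuming the pycrates CrateKey/add_key interface (the key name and value here are illustrative; see "ahelp make_table_crate" for details):

>>> import pycrates
>>> from crates_contrib.utils import make_table_crate
>>> cr = make_table_crate([1, 2, 3], [4.5, 2.3, 9.7], colnames=["x", "y"])
>>> key = pycrates.CrateKey()   # assumed pycrates API for header keywords
>>> key.name = "OBJECT"
>>> key.value = "example field"
>>> cr.add_key(key)
>>> pycrates.write_file(cr, "meta.fits")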
Examples
Example 1
>>> import numpy as np
>>> a = [1, 2, 3, 4, 5]
>>> b = 2.3 * np.asarray(a)**2
>>> c = ["src a", "src b", "", "multiple sources", "x"]
>>> write_columns("src.dat", a, b, c)
The three columns are written to the file "src.dat" in text format. The output file will contain the following; note that the string column contains leading and trailing quote characters for elements that are empty or contain spaces.
>>> !cat src.dat
#TEXT/SIMPLE
# col1 col2 col3
1 2.300000000000 "src a"
2 9.200000000000 "src b"
3 20.70000000000 ""
4 36.80000000000 "multiple sources"
5 57.50000000000 x
Example 2
>>> write_columns("src.dat", a, b, c, colnames=["x", "y", "comment"])
The same data is written out as in the previous example but this time explicit column names are given. The output file now looks like:
>>> !cat src.dat
#TEXT/SIMPLE
# x y comment
1 2.300000000000 "src a"
2 9.200000000000 "src b"
3 20.70000000000 ""
4 36.80000000000 "multiple sources"
5 57.50000000000 x
Note that this call will overwrite the contents of the "src.dat" file without warning.
Example 3
>>> cnames = ["x", "y", "comment"] >>> write_columns("src.fits", a, b, c, colnames=cnames, format="fits")
The file is now written out as a FITS binary table:
>>> !dmlist src.fits cols
----------------------------------------------------------------------
Columns for Table Block TABLE
----------------------------------------------------------------------

ColNo  Name      Unit   Type         Range
   1   x                Int4         -
   2   y                Real8        -Inf:+Inf
   3   comment          String[16]
Example 4
>>> x = [1, 2, 3, 4]
>>> y = np.arange(10).reshape(5,2)
>>> print(x)
[1, 2, 3, 4]
>>> print(y)
[[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]
>>> write_columns("cols.dat", x, y)
>>> write_columns("cols.fits", x, y, format="fits")
Here we highlight:

- handling multi-dimensional arrays (y),
- dealing with arrays of different lengths (x has 4 scalars and y has 5 pairs).
The ASCII output file contains a scalar column (col1) and a vector column of length 2 (col2). The fifth element of col1 has been set to 0 since the input array (x) did not contain a fifth element.
>>> !cat cols.dat
#TEXT/SIMPLE
# col1 col2[2]
1 1 2
2 3 4
3 5 6
4 7 8
0 9 10
The FITS format output is the same: a 0 in the fifth row of col1, and col2 written as a vector column of length 2.
>>> !dmlist cols.fits data
--------------------------------------------------
Data for Table Block TABLE
--------------------------------------------------

ROW    col1 col2[2]

     1    1 [1 2]
     2    2 [3 4]
     3    3 [5 6]
     4    4 [7 8]
     5    0 [9 10]
Example 5
>>> d = {"x1": a, "x2": b, "x3": c} >>> write_columns("src.dat", d, clobber=True)
Here we use a Python dictionary to name the columns. The order of the columns in the output file is unspecified (it depends on the dictionary). To enforce an ordering - and possibly exclude columns - use the colnames argument, or use a collections.OrderedDict. For example:
>>> write_columns("src.dat', d, colnames=["x1", "x2", "x3"]) >>> write_columns("src.dat', d, colnames=["x1", "x3"])
Changes in the January 2012 Release
Dictionary and structured-array support
The write_columns() routine can now accept a dictionary or NumPy structured array rather than a set of column arguments.
Support for np.int32 arrays
On 64-bit builds of CIAO, any NumPy arrays with a datatype of np.int32 are converted to np.int64 values to avoid data corruption.
Changes in the December 2011 Release
The clobber parameter
The clobber parameter has been added; it defaults to True. In prior releases the routine always overwrote a file if it existed (i.e. it acted as if clobber were True).
Bugs
See the bug pages on the CIAO website for an up-to-date listing of known bugs.