Using CIAO in a notebook
CIAO can be used with notebooks, whether the classic notebook interface or the newer lab environment. The What is Jupyter? page provides a lot of information on why people may want to use "computational notebooks". There are, however, some adjustments that need to be made when using the notebook or lab interfaces:
- Jupyter supports a large number of languages (also called kernels): this document focuses on Python (although bash can also be used);
- interactive tools - such as SAOImageDS9, CSCview, and TOPCAT - are not easily embedded within the notebook environment, so using them can be challenging if the notebook is not running locally (such as via SciServer);
- and the Python environment used by the lab or notebook can get out of sync with that used by CIAO, in particular if the Jupyter installation does not match the CIAO environment.
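One way to spot a mismatch between the kernel and CIAO is to ask the kernel which Python it is running and whether it can see the CIAO modules. The following is a minimal sketch rather than an official check: the helper name describe_kernel_environment is made up for this example, and it assumes that the CIAO setup defines the ASCDS_INSTALL environment variable.

```python
import importlib.util
import os
import sys


def describe_kernel_environment():
    """Collect a few facts about the Python running this kernel."""
    return {
        # The interpreter behind the kernel; compare with "which python"
        # in a shell where CIAO has been set up.
        "python": sys.executable,
        "version": ".".join(str(v) for v in sys.version_info[:3]),
        # Assumption: the CIAO setup defines ASCDS_INSTALL. None suggests
        # that CIAO was not initialised before the server was started.
        "ASCDS_INSTALL": os.environ.get("ASCDS_INSTALL"),
        # True if this interpreter can locate the ciao_contrib package.
        "ciao_contrib_found": importlib.util.find_spec("ciao_contrib") is not None,
    }


for key, value in describe_kernel_environment().items():
    print(f"{key:20s} {value}")
```

If ciao_contrib_found is False, or the interpreter path is outside the CIAO installation, the notebook is probably using a different Python environment from CIAO.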
The sections of this document are:
- Starting the notebook environment
- Check the installation
- Running CIAO tools and scripts
- Running other tools
Starting the notebook environment
The lab or notebook environments can be used. For example, after ensuring that the Jupyter environment is installed:
unix% jupyter lab
...
[I 2025-05-01 15:41:55.575 ServerApp] Jupyter Server 2.14.2 is running at:
[I 2025-05-01 15:41:55.575 ServerApp] http://localhost:8888/lab?token=big-long-hexadecimal-string
[I 2025-05-01 15:41:55.575 ServerApp] http://127.0.0.1:8888/lab?token=big-long-hexadecimal-string
[I 2025-05-01 15:41:55.575 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 2025-05-01 15:41:55.671 ServerApp]
    To access the server, open this file in a browser:
        file:///...a path
    Or copy and paste one of these URLs:
        http://localhost:8888/lab?token=big-long-hexadecimal-string
        http://127.0.0.1:8888/lab?token=big-long-hexadecimal-string
...
The screen output should include the web page to navigate to, if it has not already been opened in a browser. The locations used may well differ from those listed in the example above.
Check the installation
The ciaover tool can be used to check that the notebook environment is working:
!ciaover
CIAO 4.17.0 Friday, December 06, 2024
  bindir      : /soft/ciao-4.17/bin
  CALDB       : 4.12.0
Note that the command starts with !, which is a feature of the Jupyter Python environment that runs the rest of the line in a shell.
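If ciaover is not found, the ! line only reports a shell error. As a rough alternative, the check can be made from Python with the standard shutil and subprocess modules; the helper name run_if_available is hypothetical and is not part of CIAO.

```python
import shutil
import subprocess


def run_if_available(tool, *args):
    """Run tool and return its screen output, or None if it is not on the PATH."""
    path = shutil.which(tool)
    if path is None:
        return None
    # Combine stdout and stderr so that error messages are not lost.
    proc = subprocess.run([path, *args], capture_output=True, text=True, check=False)
    return proc.stdout + proc.stderr


out = run_if_available("ciaover")
print(out if out is not None else "ciaover is not on the PATH used by this kernel")
```

A None result suggests that the notebook server was started from a shell in which CIAO had not been set up.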
Running CIAO tools and scripts
If the CIAO tool or script has a parameter file - that is, if plist name works - then the ciao_contrib.runtool module can be used to run the command from Python. This module:
- allows commands and scripts to be run as a Python function;
- provides easy access to the parameter list;
- allows parameter names to be reduced to just their unique prefix;
- and will check most parameter settings to ensure they are valid.
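The unique-prefix rule can be illustrated with a few lines of plain Python. This is only a sketch of the idea and not the code used by ciao_contrib.runtool: the function resolve_parameter is invented for this example, and the names in dmlist_pars are the dmlist parameters.

```python
def resolve_parameter(prefix, names):
    """Return the single name in names selected by prefix.

    An exact match wins; otherwise the prefix must match exactly one name.
    """
    if prefix in names:
        return prefix
    matches = [name for name in names if name.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise AttributeError(f"no parameter matches '{prefix}'")
    raise AttributeError(f"'{prefix}' is ambiguous: {', '.join(matches)}")


dmlist_pars = ["infile", "opt", "outfile", "rows", "cells", "verbose"]
print(resolve_parameter("inf", dmlist_pars))  # infile
print(resolve_parameter("op", dmlist_pars))   # opt
```

In this sketch the prefix "o" is rejected because it matches both opt and outfile, so at least "op" or "ou" is needed to tell them apart.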
There are some differences from running the tool or script in a shell, including:
- the screen output is only created once the tool or script has run, rather than as the different parts of the program are completed;
- the tools are run with mode=h so that parameters are not prompted for, unlike at the command line;
- parameters are not taken from the user's parameter files, and so have to be set explicitly;
- some parameter interactions - such as redirecting values to another parameter - are not supported;
- the output from one tool cannot be piped in as the input to another tool, unlike in the shell;
- and if the tool or script errors out then there will be extra output from Python before the underlying error message is displayed.
The ciao_contrib.runtool module can be loaded in several ways, such as
from ciao_contrib.runtool import *
or
from ciao_contrib import runtool
but this document will use:
from ciao_contrib import runtool as rt
Tools and scripts are available as rt.<name> and they can be:
- printed, which will display the parameter settings;
print(rt.dmlist)
Parameters for dmlist:
Required parameters:
    infile =          Input dataset/block specification
    opt = data        Option
Optional parameters:
    outfile =         Output file (optional)
    rows =            Range of table rows to print (min:max)
    cells =           Range of array indices to print (min:max)
    verbose = 0       Debug Level(0-5)
- used to set or get parameter values as rt.<name>.<parname>;
print(rt.dmlist.opt)
data
# Note that any unique prefix of the parameter name can be used
rt.dmlist.inf = "test.fits"
rt.dmlist.op = "counts"
print(rt.dmlist)
Parameters for dmlist:
Required parameters:
    infile = test.fits    Input dataset/block specification
    opt = counts          Option
Optional parameters:
    outfile =             Output file (optional)
    rows =                Range of table rows to print (min:max)
    cells =               Range of array indices to print (min:max)
    verbose = 0           Debug Level(0-5)
- or called as a function to run the tool or script.
# Parameters will be taken from the object if not set. This will attempt
# to list the size of the file "test.fits". As the file does not exist then
# dmlist exits with an error, causing a Python exception (an OSError).
#
rt.dmlist()
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[6], line 1
----> 1 rt.dmlist()
File /soft/ciao-4.17/lib/python3.11/site-packages/ciao_contrib/runtool.py:1863, in CIAOToolParFile.__call__(self, *args, **kwargs)
   1861 sep = "\n "
   1862 smsg = sep.join(sout.rstrip().split("\n"))
-> 1863 raise IOError(f"An error occurred while running '{self._toolname}':{sep}{smsg}")
   1865 finally:
   1866 for v in stackfiles.values():
OSError: An error occurred while running 'dmlist':
  Failed to open virtual file test.fits
  # dmlist (CIAO 4.17.0): [DM FITS kernel]: File test.fits does not exist
import os
caldbfile = os.getenv("CALDB") + "/docs/chandra/caldb_version/caldb_version.fits"
# Explicitly set the infile parameter. The opt setting is still set
# to "counts" so the number of rows in this file will be reported.
#
rt.dmlist(infile=caldbfile)
100
What happens if there is an error
If the tool or script exits with a non-zero status value then an OSError will be raised with the error message. This will normally be displayed with a pink background, as shown above.
The get_runtime_details method can be used to find out more information (the start and stop time, the arguments used, the output, and the status value or code) if needed.
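Since a failure is reported as an OSError, a notebook cell can catch it and carry on instead of stopping at the traceback. The sketch below shows one possible pattern. So that it runs without CIAO it uses fake_dmlist, a made-up stand-in that always fails; in a CIAO session rt.dmlist would be passed to the hypothetical run_tool helper instead.

```python
def run_tool(tool, *args, **kwargs):
    """Call a runtool-style function and return (succeeded, text)."""
    try:
        out = tool(*args, **kwargs)
    except OSError as exc:
        # The exception text carries the error message from the tool.
        return False, str(exc)
    # Treat a missing return value as empty screen output.
    return True, "" if out is None else str(out)


def fake_dmlist(infile, opt="data"):
    """Hypothetical stand-in for rt.dmlist that always fails."""
    raise OSError(
        f"An error occurred while running 'dmlist':\n  File {infile} does not exist"
    )


ok, text = run_tool(fake_dmlist, "test.fits", opt="counts")
if not ok:
    print("dmlist failed:")
    print(text)
```

Returning a flag along with the text keeps the calling cell simple: it can skip a problematic file, log the message, and move on to the next one.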
How to get the screen output
The lab/notebook environment makes it hard to understand what messages are printed to the screen and what is the output from the last command. The runtool module catches all the screen output (so both stdout and stderr) and returns it as a string, as shown below:
out = rt.dmlist(caldbfile, "counts")
print(f"Output: {out} type: {type(out)}")
Output: '100' type: <class 'ciao_contrib.runtool.CIAOPrintableString'>
The CIAOPrintableString class is just used to provide nice formatting for the notebook output (that is, newlines will be processed rather than displayed as \n), and it can be treated just as a Python string.
The output from a runtool call is only displayed if it is the last call in the cell, so the following only shows the "blocks" output:
rt.dmlist(caldbfile, "cols")
rt.dmlist(caldbfile, "blocks")
--------------------------------------------------------------------------------
Dataset: /soft/ciao-4.17/CALDB/docs/chandra/caldb_version/caldb_version.fits
--------------------------------------------------------------------------------
     Block Name       Type     Dimensions
--------------------------------------------------------------------------------
Block 1: PRIMARY      Null
Block 2: CALDBVER     Table    14 cols x 100 rows
If multiple calls are made in the same cell, and the output is needed, then store the results as variables, such as:
out1 = rt.dmlist(caldbfile, "cols")
out2 = rt.dmlist(caldbfile, "blocks")
The CIAOPrintableString class is used to make the display in the notebook interface readable, as can be seen by comparing the "blocks" output stored in out2 with its string version. Note that print(out2) and print(str(out2)) produce the same output.
out2
--------------------------------------------------------------------------------
Dataset: /soft/ciao-4.17/CALDB/docs/chandra/caldb_version/caldb_version.fits
--------------------------------------------------------------------------------
     Block Name       Type     Dimensions
--------------------------------------------------------------------------------
Block 1: PRIMARY      Null
Block 2: CALDBVER     Table    14 cols x 100 rows
str(out2)
' \n--------------------------------------------------------------------------------\nDataset: ... rows'
Parameters set by the tool
If a tool or script sets a parameter value then it can be retrieved from the tool after the call. For example:
print(rt.dmkeypar.value)
None
rt.dmkeypar(caldbfile, "revision")
print(rt.dmkeypar.value)
4.12.0
The parameter file for the tool is not changed by runtool - unless this is one of the handful of special-case tools - which means that the dmkeypar parameter file will not have been changed by the above call:
!plist dmkeypar
Parameters for /home/ciaouser/cxcds_param4/dmkeypar.par
    infile =            Input file name
    keyword =           Keyword to retrieve
    exist = no          Keyword existence
    value =             Keyword value
    rval =              Keyword value -- real
    ival =              Keyword value -- integer
    sval =              Keyword value -- string
    bval = no           Keyword value -- boolean
    datatype = null     Keyword data type
    unit =              Keyword unit
    comment =           Keyword comment
    (echo = no)         Print keyword value to screen?
    (mode = ql)
Running multiple copies of a tool
The make_tool routine can be used to make it easier to run multiple copies of a tool. For example:
ia1 = rt.make_tool("dmimgadapt")
ia2 = rt.make_tool("dmimgadapt")
creates two routines ia1 and ia2 that can be used to call dmimgadapt without worrying about any parameter changes made to the other routine (or, indeed, rt.dmimgadapt).
Separate parameter directories
Virtually all CIAO scripts and tools are run in such a way that they are not affected by changes to the standard CIAO parameter directory (that is, the locations in the PFILES environment variable). This means that running multiple copies of the same tool will not cause a problem.
Unfortunately a small number of scripts - axbary, dmgti, evalpos, fullgarf, mean_energy_map, pileup_map, tgdetect, and wavdetect - cannot be run this way, which means that greater care is needed when running them. One way is to explicitly create a temporary parameter directory with the runtool.new_pfiles_environment context manager:
with rt.new_pfiles_environment(ardlib=True):
    # This sets up a new PFILES temporary directory and copies over
    # the current ardlib.par file to it.
    rt.wavdetect(...)
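To give a feel for what a temporary parameter directory involves, here is a simplified sketch of a context manager that changes PFILES. It is not the runtool implementation: temporary_pfiles is an invented name, nothing is copied (so there is no equivalent of ardlib=True), and it assumes that PFILES has the form "user directories;system directories".

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def temporary_pfiles():
    """Point PFILES at a fresh user directory, restoring the old value on exit."""
    original = os.environ.get("PFILES")
    # Assumption: PFILES looks like "user dirs;system dirs". Keep the
    # system part, if there is one, so default parameter files are found.
    system_dirs = original.split(";", 1)[1] if original and ";" in original else ""
    with tempfile.TemporaryDirectory() as tmpdir:
        os.environ["PFILES"] = f"{tmpdir};{system_dirs}"
        try:
            yield tmpdir
        finally:
            # Restore the original setting even if the body raised an error.
            if original is None:
                os.environ.pop("PFILES", None)
            else:
                os.environ["PFILES"] = original


with temporary_pfiles() as pdir:
    print("Inside: ", os.environ["PFILES"])
print("Outside:", os.environ.get("PFILES"))
```

Any tool started inside the with block would write its parameter file to the temporary directory, which is deleted when the block ends.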
Parameter file access
Note: the runtool module is written in such a way that most users will not need to use these routines.
The default parameter settings for each script and tool are taken from the CIAO defaults. The read_params and write_params methods can be used to read in or write out the settings. So the settings can be written out:
# If called with no arguments it will write to the toolname + ".par"
# file in the user parameter directory: see the PFILES environment
# variable, or call rt.get_pfiles().
#
rt.dmlist.write_params("test")
!cat test.par
infile,f,a,"/soft/ciao-4.17/CALDB/docs/chandra/caldb_version/caldb_version.fits",,,"Input dataset/block specification"
opt,s,a,"cols",,,"Option"
outfile,f,h,"",,,"Output file (optional)"
rows,s,h,"",,,"Range of table rows to print (min:max)"
cells,s,h,"",,,"Range of array indices to print (min:max)"
verbose,i,h,0,0,5,"Debug Level(0-5)"
mode,s,h,"hl",,,
or read in:
print(rt.dmlist)
Parameters for dmlist:
Required parameters:
    infile = /soft/ciao-4.17/CALDB/docs/chandra/caldb_version/caldb_version.fits    Input dataset/block specification
    opt = cols        Option
Optional parameters:
    outfile =         Output file (optional)
    rows =            Range of table rows to print (min:max)
    cells =           Range of array indices to print (min:max)
    verbose = 0       Debug Level(0-5)
import paramio
paramio.punlearn("dmlist")
rt.dmlist.read_params()
print(rt.dmlist)
Parameters for dmlist:
Required parameters:
    infile =          Input dataset/block specification
    opt = data        Option
Optional parameters:
    outfile =         Output file (optional)
    rows =            Range of table rows to print (min:max)
    cells =           Range of array indices to print (min:max)
    verbose = 0       Debug Level(0-5)
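The test.par listing shown earlier is comma-separated text, so it can be inspected with the standard csv module when a quick look at a parameter file is wanted. This is only an illustrative sketch: the field order (name, type, mode, value, min, max, prompt) is an assumption read off that listing, parse_par_lines is a made-up helper, and the paramio module remains the supported way to work with parameter files.

```python
import csv

# Assumed field order, inferred from the test.par listing.
FIELDS = ["name", "type", "mode", "value", "min", "max", "prompt"]


def parse_par_lines(lines):
    """Parse parameter-file lines into a list of dictionaries."""
    rows = []
    for record in csv.reader(lines):
        if not record or record[0].startswith("#"):
            continue  # skip blank lines and comments
        # Pad short records so that every field is present.
        record = record + [""] * (len(FIELDS) - len(record))
        rows.append(dict(zip(FIELDS, record)))
    return rows


par_text = '''opt,s,a,"cols",,,"Option"
verbose,i,h,0,0,5,"Debug Level(0-5)"
mode,s,h,"hl",,,
'''

for par in parse_par_lines(par_text.splitlines()):
    print(f"{par['name']:8s} type={par['type']} mode={par['mode']} value={par['value']!r}")
```

All values are returned as strings, so a numeric parameter such as verbose still needs converting with int before use.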
Running other tools
Unix scripts and tools - including those supported by ciao_contrib.runtool - can be run either directly, using the "!" prefix,
!dmlist $CALDB/docs/chandra/caldb_version/caldb_version.fits counts
100
or with the subprocess module:
from subprocess import check_call
status = check_call(["dmlist", caldbfile, "counts"])
100
# This is the exit status and not the screen output.
print(status)
0
Access to the displayed output depends on the method and can include extra white space (as in this example):
out = !dmlist $CALDB/docs/chandra/caldb_version/caldb_version.fits counts
print(out)
['100 ']
The process for subprocess is more involved:
import subprocess as sbp
proc = sbp.run(["dmlist", caldbfile, "counts"], stdout=sbp.PIPE, encoding="utf8", check=True)
proc.stdout
'100 \n'
print(proc.stdout)
100
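Because the captured text includes trailing white space and a newline, it is usually worth stripping it before converting it to a number. A small sketch follows, with capture as a hypothetical helper name. So that it runs without CIAO, the command is a Python one-liner that prints "100 " in the same way as the dmlist example; in a CIAO session the list would be ["dmlist", caldbfile, "counts"].

```python
import subprocess
import sys


def capture(cmd):
    """Run cmd (a list of arguments) and return stdout without outer white space."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, encoding="utf8", check=True)
    return proc.stdout.strip()


# Stand-in for ["dmlist", caldbfile, "counts"], so the sketch runs anywhere.
cmd = [sys.executable, "-c", "print('100 ')"]

nrows = int(capture(cmd))
print(nrows)
```

With check=True a non-zero exit status raises subprocess.CalledProcessError, so a failed command is not silently treated as empty output.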
Running tools in the background
A tool like SAOImageDS9 can be run with a command like
!ds9
and, if the notebook server has permission to connect to the X server, the DS9 application will appear (but not in the notebook, unless run on the HEASARC SciServer). However, trying to run the tool in the background will result in an error:
!ds9 &
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[99], line 1
----> 1 get_ipython().system('ds9 &')
File /soft/ciao-4.17/lib/python3.11/site-packages/ipykernel/zmqshell.py:641, in ZMQInteractiveShell.system_piped(self, cmd)
    634 if cmd.rstrip().endswith("&"):
    635     # this is *far* from a rigorous test
    636     # We do not support backgrounding processes because we either use
    637     # pexpect or pipes to read from. Users can always just call
    638     # os.system() or use ip.system=ip.system_raw
    639     # if they really want a background process.
    640     msg = "Background processes not supported."
--> 641     raise OSError(msg)
    643 # we explicitly do NOT return the subprocess status code, because
    644 # a non-None value would trigger :func:`sys.displayhook` calls.
    645 # Instead, we store the exit_code in user_ns.
    646 # Also, protect system call from UNC paths on Windows here too
    647 # as is done in InteractiveShell.system_raw
    648 if sys.platform == "win32":
OSError: Background processes not supported.
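One possible workaround, sketched here with the standard subprocess module rather than taken from the CIAO documentation, is to start the program with subprocess.Popen, which returns immediately instead of waiting for the program to finish. The helper name start_in_background is made up, and a short Python command stands in for ds9 so that the example runs anywhere.

```python
import subprocess
import sys


def start_in_background(cmd):
    """Start cmd without waiting for it to finish; returns the Popen handle."""
    # Discard the output so that the notebook is not left reading from pipes.
    return subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)


# Stand-in command; in a CIAO session this could be ["ds9"] instead.
proc = start_in_background([sys.executable, "-c", "import time; time.sleep(0.2)"])

# poll() returns None while the process is still running.
print("still running:", proc.poll() is None)

proc.wait()
print("exit status:", proc.returncode)
```

The cell returns as soon as the process has started, and the handle can be kept to check on the program later with proc.poll() or to stop it with proc.terminate(). Whether a graphical program such as DS9 then appears still depends on the notebook server being able to reach a display.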