
Using CIAO in a notebook

CIAO can be used with notebooks, whether the classic notebook interface or the newer lab environment. The What is Jupyter? page provides a lot of information on why people may want to use "computational notebooks". There are, however, some adjustments that need to be made when using the notebook or lab interfaces:

Jupyter supports a large number of languages (also called kernels): this document focuses on Python (although bash can also be used);
interactive tools - such as SAOImageDS9, CSCview, and TOPCAT - are not easily embedded within the notebook environment, so using them can be challenging if the notebook is not running locally (such as via SciServer);
and the Python environment used by the lab or notebook can get out of sync with that used by CIAO, in particular if the Jupyter installation does not match the CIAO environment.
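
One way to spot the mismatch described in the last item is to compare the Python executable running the kernel with the CIAO installation directory. The following is only a sketch: it assumes that CIAO has set the ASCDS_INSTALL environment variable and that the CIAO Python lives somewhere below that directory, and the helper name kernel_uses_ciao_python is made up for this example.

```python
import os
import sys
from pathlib import Path


def kernel_uses_ciao_python(executable, ascds_install):
    """Return True if the Python executable lives under the CIAO installation.

    executable    - path of the Python running the kernel (e.g. sys.executable)
    ascds_install - value of the ASCDS_INSTALL environment variable, or None
    """
    if not ascds_install:
        # CIAO has not been initialised in this environment.
        return False

    exe = Path(executable).resolve()
    root = Path(ascds_install).resolve()

    # Assumption: the CIAO Python is installed below ASCDS_INSTALL.
    return root in exe.parents


# In a notebook cell the check is made against the running kernel.
print(kernel_uses_ciao_python(sys.executable, os.environ.get("ASCDS_INSTALL")))
```

A False result does not prove there is a problem, but it is a hint that the kernel may not be using the Python that was installed with CIAO.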

Starting the notebook environment

The lab or notebook environments can be used. For example, after ensuring that the Jupyter environment is installed:

unix% jupyter lab
...
[I 2025-05-01 15:41:55.575 ServerApp] Jupyter Server 2.14.2 is running at:
[I 2025-05-01 15:41:55.575 ServerApp] http://localhost:8888/lab?token=big-long-hexadecimal-string
[I 2025-05-01 15:41:55.575 ServerApp]     http://127.0.0.1:8888/lab?token=big-long-hexadecimal-string
[I 2025-05-01 15:41:55.575 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 2025-05-01 15:41:55.671 ServerApp] 
    
    To access the server, open this file in a browser:
        file:///...a path
    Or copy and paste one of these URLs:
        http://localhost:8888/lab?token=big-long-hexadecimal-string
        http://127.0.0.1:8888/lab?token=big-long-hexadecimal-string
...

The screen output should include the web page to navigate to, if it hasn't already been opened. The locations used may well differ from those listed in the example above.

Check the installation

The ciaover tool can be used to check that the notebook environment is working:

!ciaover

CIAO 4.17.0 Friday, December 06, 2024
  bindir      : /soft/ciao-4.17/bin
  CALDB       : 4.12.0

Note that the command starts with !, a feature of the Jupyter Python environment that runs the rest of the line in a shell.

Running CIAO tools and scripts

If the CIAO tool or script has a parameter file - that is, if plist name works - then the ciao_contrib.runtool module can be used to run the command from Python. This module:

allows commands and scripts to be run as a Python function;
provides easy access to the parameter list;
allows parameter names to be reduced to just their unique prefix;
and checks that most parameter settings are valid.

There are some differences from running the tool or script in a shell, including:

the screen output is only created once the tool or script has run, rather than as the different parts of the program are completed;
the tools are run with mode=h so that parameters are not prompted for, unlike at the command line;
parameters are not taken from the user's parameter files, and so have to be set explicitly;
some parameter interactions - such as redirecting values to another parameter - are not supported;
the output from one tool cannot be piped in as the input to another tool, unlike in the shell;
and if the tool or script errors out then there will be extra output from Python before the underlying error message is displayed.

The ciao_contrib.runtool module can be loaded in several ways, such as

from ciao_contrib.runtool import *

from ciao_contrib import runtool

but this document will use:

from ciao_contrib import runtool as rt

Tools and scripts are available as rt.<name> and they can be:

printed, which will display the parameter settings;

print(rt.dmlist)

Parameters for dmlist:

Required parameters:
              infile =                  Input dataset/block specification
                 opt = data             Option

Optional parameters:
             outfile =                  Output file (optional)
                rows =                  Range of table rows to print (min:max)
               cells =                  Range of array indices to print (min:max)
             verbose = 0                Debug Level(0-5)

used to set or get parameter values as rt.<name>.<parname>;

print(rt.dmlist.opt)

data

# Note that any unique prefix of the parameter name can be used
rt.dmlist.inf = "test.fits"
rt.dmlist.op = "counts"

print(rt.dmlist)

Parameters for dmlist:

Required parameters:
              infile = test.fits        Input dataset/block specification
                 opt = counts           Option

Optional parameters:
             outfile =                  Output file (optional)
                rows =                  Range of table rows to print (min:max)
               cells =                  Range of array indices to print (min:max)
             verbose = 0                Debug Level(0-5)

or called as a function to run the tool or script.

# Parameters will be taken from the object if not set. This will attempt
# to list the size of the file "test.fits". As the file does not exist then
# dmlist exits with an error, causing a Python exception (an OSError).
#
rt.dmlist()

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[6], line 1
----> 1 rt.dmlist()

File /soft/ciao-4.17/lib/python3.11/site-packages/ciao_contrib/runtool.py:1863, in CIAOToolParFile.__call__(self, *args, **kwargs)
   1861         sep = "\n  "
   1862         smsg = sep.join(sout.rstrip().split("\n"))
-> 1863         raise IOError(f"An error occurred while running '{self._toolname}':{sep}{smsg}")
   1865 finally:
   1866     for v in stackfiles.values():

OSError: An error occurred while running 'dmlist':
  
  Failed to open virtual file test.fits
  # dmlist (CIAO 4.17.0): [DM FITS kernel]: File test.fits does not exist

import os
caldbfile = os.getenv("CALDB") + "/docs/chandra/caldb_version/caldb_version.fits"

# Explicitly set the infile parameter. The opt setting is still set
# to "counts" so the number of rows in this file will be reported.
#
rt.dmlist(infile=caldbfile)

What happens if there is an error

If the tool or script exits with a non-zero status value then an OSError will be raised with the error message. This will normally be displayed with a pink background, as shown above.

The get_runtime_details method can be used to find out more information (the start and stop time, the arguments used, the output, and the status value, or code) if needed.
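
Since the failure is reported as an OSError, it can be trapped with an ordinary try/except block so that the rest of the cell keeps running. The sketch below is illustrative only: run_quietly and fake_dmlist are hypothetical names, and fake_dmlist is a stand-in that lets the example run without CIAO (in a notebook the first argument would be something like rt.dmlist).

```python
def run_quietly(tool, *args, **kwargs):
    """Call a runtool-style callable and trap the OSError raised on failure.

    Returns (True, output) on success and (False, error message) on failure.
    """
    try:
        return True, str(tool(*args, **kwargs))
    except OSError as exc:
        return False, str(exc)


def fake_dmlist(infile, opt="data"):
    """Stand-in for a failing tool; it always raises, as rt.dmlist did above."""
    raise OSError(
        f"An error occurred while running 'dmlist':\n  File {infile} does not exist"
    )


ok, message = run_quietly(fake_dmlist, "test.fits", opt="counts")
print(ok)
print(message)
```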

How to get the screen output

The lab/notebook environment makes it hard to understand what messages are printed to the screen and what is the output from the last command. The runtool module catches all the screen output (so both stdout and stderr) and returns it as a string, as shown below:

out = rt.dmlist(caldbfile, "counts")

print(f"Output: {out}  type: {type(out)}")

Output: '100'  type: <class 'ciao_contrib.runtool.CIAOPrintableString'>

The CIAOPrintableString class is just used to provide nice formatting for the notebook output (that is, newlines will be processed rather than displayed as \n), and it can be treated just as a Python string.
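
Because the return value behaves like a string, it can be converted for later use in the notebook. As a small sketch (parse_row_count is a hypothetical helper, and the sample text mimics the padded value that dmlist reports for the "counts" option):

```python
def parse_row_count(output):
    """Convert the text returned by a dmlist "counts" call into an integer."""
    # str() is a no-op for plain strings; strip() drops padding and newlines.
    return int(str(output).strip())


nrows = parse_row_count("100     \n")
print(nrows + 1)
```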

The output from a runtool call is only displayed if it is the last call in the cell, so the following only shows the "blocks" output:

rt.dmlist(caldbfile, "cols")
rt.dmlist(caldbfile, "blocks")

 
--------------------------------------------------------------------------------
Dataset: /soft/ciao-4.17/CALDB/docs/chandra/caldb_version/caldb_version.fits
--------------------------------------------------------------------------------
 
     Block Name                          Type         Dimensions
--------------------------------------------------------------------------------
Block    1: PRIMARY                        Null        
Block    2: CALDBVER                       Table        14 cols x 100      rows

If multiple calls are made in the same cell, and the output is needed, then store them as variables, such as:

out1 = rt.dmlist(caldbfile, "cols")
out2 = rt.dmlist(caldbfile, "blocks")

The CIAOPrintableString class is used to make the display in the notebook interface readable, as can be seen comparing it to the string version. Note that print(out1) and print(str(out1)) produce the same output.

out1

 
--------------------------------------------------------------------------------
Dataset: /soft/ciao-4.17/CALDB/docs/chandra/caldb_version/caldb_version.fits
--------------------------------------------------------------------------------
 
     Block Name                          Type         Dimensions
--------------------------------------------------------------------------------
Block    1: PRIMARY                        Null        
Block    2: CALDBVER                       Table        14 cols x 100      rows

str(out1)

' \n--------------------------------------------------------------------------------\nDataset: ... rows'

Parameters set by the tool

If a tool or script sets a parameter value then it can be retrieved from the tool after the call. For example:

print(rt.dmkeypar.value)

None

rt.dmkeypar(caldbfile, "revision")

print(rt.dmkeypar.value)

4.12.0

The parameter file for the tool is not changed by runtool - unless this is one of the handful of special-case tools - which means that the dmkeypar parameter file will not have been changed by the above call:

!plist dmkeypar


Parameters for /home/ciaouser/cxcds_param4/dmkeypar.par

        infile =                  Input file name
       keyword =                  Keyword to retrieve
         exist = no               Keyword existence
         value =                  Keyword value
          rval =                  Keyword value -- real
          ival =                  Keyword value -- integer
          sval =                  Keyword value -- string
          bval = no               Keyword value -- boolean
      datatype = null             Keyword data type
          unit =                  Keyword unit
       comment =                  Keyword comment
         (echo = no)              Print keyword value to screen?
         (mode = ql)

Running multiple copies of a tool

The make_tool routine can be used to make it easier to run multiple copies of a tool. For example

ia1 = rt.make_tool("dmimgadapt")
ia2 = rt.make_tool("dmimgadapt")

creates two routines ia1 and ia2 that can be used to call dmimgadapt without worrying about any parameter changes made to the other routine (or, indeed, rt.dmimgadapt).

Separate parameter directories

Virtually all CIAO scripts and tools are run in such a way that they are not affected by changes to the standard CIAO parameter directory (e.g. the locations in the PFILES environment variable). This means that running multiple versions of the same tool will not cause a problem.

Unfortunately a small number of scripts - axbary, dmgti, evalpos, fullgarf, mean_energy_map, pileup_map, tgdetect, and wavdetect - cannot be run this way, which means that greater care is needed when running them. One way is to explicitly create a temporary parameter directory with the runtool.new_pfiles_environment context manager:

with rt.new_pfiles_environment(ardlib=True):
    # This sets up a new PFILES temporary directory and copies over
    # the current ardlib.par file to it.
    rt.wavdetect(...)
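
To see which directories are involved, the PFILES setting can be split into its parts. This sketch assumes the usual layout of the variable, in which a semicolon separates the user (writable) directories from the system ones and colons separate the directories within each group. The helper name split_pfiles and the system paths in the example are illustrative only.

```python
def split_pfiles(pfiles):
    """Split a PFILES value into (user directories, system directories)."""
    # Everything before the first ";" is the user part, the rest is system.
    user, _, system = pfiles.partition(";")
    user_dirs = [d for d in user.split(":") if d]
    system_dirs = [d for d in system.split(":") if d]
    return user_dirs, system_dirs


# Example value; in a notebook use os.environ["PFILES"] instead.
example = (
    "/home/ciaouser/cxcds_param4;"
    "/soft/ciao-4.17/contrib/param:/soft/ciao-4.17/param"
)
user_dirs, system_dirs = split_pfiles(example)
print(user_dirs)
print(system_dirs)
```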

Parameter file access

Note

The runtool module is written in such a way that most users will not need to use these routines.

The default parameter settings for each script and tool are taken from the CIAO defaults. The read_params and write_params methods can be used to read in or write out the settings. So the settings can be written out:

# If called with no arguments it will write to the toolname + ".par"
# file in the user parameter directory: see the PFILES environment
# variable, or call rt.get_pfiles().
#
rt.dmlist.write_params("test")

!cat test.par

infile,f,a,"/soft/ciao-4.17/CALDB/docs/chandra/caldb_version/caldb_version.fits",,,"Input dataset/block specification"
opt,s,a,"cols",,,"Option"
outfile,f,h,"",,,"Output file (optional)"
rows,s,h,"",,,"Range of table rows to print (min:max)"
cells,s,h,"",,,"Range of array indices to print (min:max)"
verbose,i,h,0,0,5,"Debug Level(0-5)"
mode,s,h,"hl",,,

or read in:

print(rt.dmlist)

Parameters for dmlist:

Required parameters:
              infile = /soft/ciao-4.17/CALDB/docs/chandra/caldb_version/caldb_version.fits  Input dataset/block specification
                 opt = cols             Option

Optional parameters:
             outfile =                  Output file (optional)
                rows =                  Range of table rows to print (min:max)
               cells =                  Range of array indices to print (min:max)
             verbose = 0                Debug Level(0-5)

import paramio

paramio.punlearn("dmlist")

rt.dmlist.read_params()

print(rt.dmlist)

Parameters for dmlist:

Required parameters:
              infile =                  Input dataset/block specification
                 opt = data             Option

Optional parameters:
             outfile =                  Output file (optional)
                rows =                  Range of table rows to print (min:max)
               cells =                  Range of array indices to print (min:max)
             verbose = 0                Debug Level(0-5)
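
The test.par file shown earlier in this section is plain text, with one comma-separated record per parameter, so it can be inspected without any CIAO module. The following is a rough sketch using the standard csv module. The field names are labels chosen for this example, based on the layout of the lines shown above, and comment lines or unusual quoting in real parameter files are not handled.

```python
import csv

# Labels for the seven comma-separated fields (names chosen for this sketch).
FIELDS = ("name", "type", "mode", "value", "min", "max", "prompt")


def parse_par_line(line):
    """Split one line of a parameter file into a dictionary of its fields."""
    row = next(csv.reader([line]))
    return dict(zip(FIELDS, row))


par = parse_par_line('verbose,i,h,0,0,5,"Debug Level(0-5)"')
print(par["name"], par["value"], par["prompt"])
```

For everyday use the paramio module and the read_params method remain the better choice, as they understand the full parameter-file syntax.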

Running other tools

Unix scripts and tools - including those supported by ciao_contrib.runtool - can be run either directly, using the "!" prefix,

!dmlist $CALDB/docs/chandra/caldb_version/caldb_version.fits counts

or with the subprocess module:

from subprocess import check_call  

status = check_call(["dmlist", caldbfile, "counts"])

# This is the exit status and not the screen output.
print(status)

Access to the displayed output depends on the method and can include extra white space (as in this example):

out = !dmlist $CALDB/docs/chandra/caldb_version/caldb_version.fits counts

print(out)

['100     ']

The process for subprocess is more involved:

import subprocess as sbp

proc = sbp.run(["dmlist", caldbfile, "counts"], stdout=sbp.PIPE,
               encoding="utf8", check=True)

proc.stdout

'100     \n'

print(proc.stdout)
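
The subprocess steps can be wrapped in a small helper so that the call looks more like the runtool version. This is a sketch rather than a recommended interface: run_and_capture is a hypothetical name, stderr is merged into stdout to mimic the way runtool collects both streams, and a Python one-liner stands in for dmlist so the example runs without CIAO.

```python
import subprocess
import sys


def run_and_capture(command):
    """Run command (a list of strings) and return its screen output.

    stdout and stderr are combined and surrounding white space is removed.
    A non-zero exit status raises subprocess.CalledProcessError.
    """
    proc = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        encoding="utf8",
        check=True,
    )
    return proc.stdout.strip()


# Stand-in command; in a notebook this could be ["dmlist", caldbfile, "counts"].
print(run_and_capture([sys.executable, "-c", "print('100     ')"]))
```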

Running tools in the background

A tool like SAOImageDS9 can be run with a command like

!ds9

and, if the notebook server has permission to connect to the X server, the DS9 application will appear (but not in the notebook, unless run on the HEASARC SciServer). However, trying to run the tool in the background will result in an error:

!ds9 &

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[99], line 1
----> 1 get_ipython().system('ds9 &')

File /soft/ciao-4.17/lib/python3.11/site-packages/ipykernel/zmqshell.py:641, in ZMQInteractiveShell.system_piped(self, cmd)
    634 if cmd.rstrip().endswith("&"):
    635     # this is *far* from a rigorous test
    636     # We do not support backgrounding processes because we either use
    637     # pexpect or pipes to read from.  Users can always just call
    638     # os.system() or use ip.system=ip.system_raw
    639     # if they really want a background process.
    640     msg = "Background processes not supported."
--> 641     raise OSError(msg)
    643 # we explicitly do NOT return the subprocess status code, because
    644 # a non-None value would trigger :func:`sys.displayhook` calls.
    645 # Instead, we store the exit_code in user_ns.
    646 # Also, protect system call from UNC paths on Windows here too
    647 # as is done in InteractiveShell.system_raw
    648 if sys.platform == "win32":

OSError: Background processes not supported.
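
As the comments in the traceback hint, a background process can still be started by going around the "!" syntax. One possible approach, sketched here with the standard subprocess module, is to create the process with Popen and not wait for it. The helper name start_in_background is made up for this example, and a short-lived Python command stands in for ds9 so that the code runs anywhere.

```python
import subprocess
import sys


def start_in_background(command):
    """Start command without waiting for it to finish.

    The screen output is discarded. The returned Popen object can be used
    to check on the process (poll), wait for it, or stop it (terminate).
    """
    return subprocess.Popen(
        command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )


# Stand-in command; in a notebook this could be ["ds9"].
proc = start_in_background([sys.executable, "-c", "import time; time.sleep(0.2)"])

# The cell returns immediately; wait() is only called here to tidy up.
print(proc.wait())
```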
