Extra keywords were included
with PVM and MPI so as to cull false matches (e.g. with the
Max Planck Institute). The keyword
xspec refers to the software program of
the same name (Arnaud 1996), which is generally regarded as
the most widely used application for modeling X-ray spectra.
Queries in ADS (Kurtz et al. 1993) on other modeling tools, or with other search
engines such as Google, all yield similar trends:
astronomers and astrophysicists do employ parallel
computing, but mainly for highly customized, large-scale
problems in simulation, image processing, or data reduction.
Virtually no one is using parallelism for fitting models
within established software systems, especially in the
interactive context, even though a majority of papers
published in observational astronomy result from exactly
this form of analysis.
ISIS, S-Lang, PVM, and SLIRP
To exploit this opportunity
we’ve extended ISIS, the Interactive Spectral
Interpretation System (Houck 2002), with a dynamically
importable module that provides scriptable access to the
Parallel Virtual Machine (Geist et al. 1994). PVM was
selected (e.g. over MPI) for its robust fault tolerance in a
networked environment. ISIS, in brief, was originally
conceived as a tool for analyzing Chandra grating spectra, but quickly grew
into a general-purpose analysis system. It provides a
superset of the XSPEC models and, by embedding the S-Lang
interpreter, a powerful scripting environment complete with fast
array-based mathematical capabilities rivaling commercial
packages such as MATLAB or IDL.
Custom user models may be
loaded into ISIS as either scripts
or compiled code, without any recompilation of ISIS
itself; this flexibility also makes it easy for ISIS to employ an
MPI module for parallelism, if desired. Because of the fast array
manipulation native to S-Lang, scripted models suffer no needless
performance penalties, while the SLIRP code generator (Noble 2003) can
render the use of compiled C, C++, and FORTRAN models a
nearly instantaneous, turnkey process.
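To illustrate the scripted route, the following is a minimal sketch of a user model written entirely in S-Lang. The model name and parameters are invented for illustration, but the name_fit hook convention and the add_slang_function registration are the standard ISIS mechanism.

   % A toy power-law model defined purely in S-Lang (hypothetical example);
   % ISIS calls the hook with the bin edges (lo, hi, in angstroms) and the
   % current parameter values.
   define toypow_fit (lo, hi, par)
   {
      return par[0] * ((lo + hi)/2.0)^(-par[1]);   % evaluated at bin midpoints
   }
   add_slang_function ("toypow", ["norm", "index"]);

Once registered, the scripted model can be combined and fit exactly like any built-in component, e.g. with fit_fun ("toypow(1)").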
Parallel Modeling
Using the PVM module
we’ve parallelized a number of the numerical modeling
tasks in which astronomers engage daily, and we summarize them
here as a series of case studies. Many of the scientific
results stemming from these efforts are already appearing
elsewhere in the literature.
Kerr Disk Line
Relativistic Kerr disk models
are computationally expensive. Historically, implementors have
opted to use precomputed tables to gain speed at the expense
of limiting flexibility in searching parameter space.
However, by recognizing that contributions from individual
radii may be computed independently, we've parallelized the
model to avoid this tradeoff. To gauge the performance
benefits we timed the sequential execution
of a single model evaluation, using a small, simulated test
dataset, on our fastest CPU (a 2 GHz AMD Opteron), yielding a
median runtime of 33.86 seconds. Farming the same
computation out to 14 CPUs on our network reduced the median
runtime to 8.16 seconds, a speedup of 4.15. While 30%
efficiency seems unimpressive at first glance, this result
actually represents 67% of the peak speedup of 6.16
predicted by Amdahl's Law (5.5 seconds of the 33.86 second
single-CPU runtime was not parallelizable in the current
implementation), and it was obtained on CPUs of mixed speeds
during normal working hours. Reducing the model evaluation time to 8
seconds brings it into the realm of interactive use, with
the result that fits requiring 3-4 hours to converge (on
“real” datasets such as the long XMM-Newton observation
of MCG-6-30-15 by Fabian) may now be done in less than 1
hour. The model evaluation is initiated in ISIS through the
S-Lang hook function
public define pkerr_fit (lo, hi, par)
{
   variable klo, khi;
   (klo, khi) = _A (lo, hi);   % convert angstroms to keV
   % master() farms the disk-radius computation out to PVM slave tasks;
   % reverse() maps the energy-ordered result back onto the wavelength grid
   return par[0] * reverse (master (klo, khi, par));
}
where lo and hi are arrays (of roughly 800
elements) representing the left and right edges of each bin
within the model grid, and par is a 10-element array of the Kerr
model parameters. Use of the PVM module is hidden within
the master call (which partitions the disk
radii computation into slave tasks), allowing ISIS to remain
unaware that the model has even been parallelized. This is
an important point: parallel models are installed and
later invoked using precisely the same mechanisms employed
for sequential models.
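To make this concrete, here is a sketch of how the parallel Kerr model might be installed and invoked; the parameter names are placeholders, but the add_slang_function and fit_fun calls shown are the same mechanism used for any sequential model.

   % Register the parallelized model under the name "pkerr"; the ten
   % parameter names below are placeholders for the actual Kerr parameters.
   add_slang_function ("pkerr", ["norm", "p1", "p2", "p3", "p4",
                                 "p5", "p6", "p7", "p8", "p9"]);

   % From here on ISIS treats pkerr like any other component:
   fit_fun ("pkerr(1)");
   () = fit_counts ();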
For each task, the slaves invoke a FORTRAN implementation of the
kerr model (written by Laura Breneman at the University of Maryland),
wrapped by SLIRP as follows:
linux% slirp -make kerr.f
Starter make file generated to kerr.mf
linux% make -f kerr.mf
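The generated module can then be loaded into the slave's S-Lang interpreter with the standard import mechanism (assuming the SLIRP make file retains the source file's name for the module):

   % Load the SLIRP-generated wrapper, making the FORTRAN kerr routine
   % callable from the slave's S-Lang code.
   import ("kerr");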
Confidence Contours and Error Bars
Error analysis is ripe for
exploitation with parallel methods. In the 1D case, an
independent search of χ2 space may be made for each of the
I model parameters, using
N=I slaves, with each treating one
parameter as thawed and I-1 as fixed. Note that superlinear
speedups are possible here, since a slave finding a lower
χ2 value can immediately terminate its
N-1 brethren and restart them with
updated parameter values. Parallelism in the 2D case is
achieved by a straightforward partition of the parameter
value grid into J independently-evaluated rectangles,
where J >> N (N again being the number of slaves) is
typical on our cluster. Our group and collaborators have
already published several results utilizing this
technique. For example, Allen et al. 2004 describes joint
X-ray, radio, and γ-ray fits of SN 1006, containing a
synchrotron radiation component modeled as an integral over
the electron spectrum.
The physics of this integral is not important here; what matters is that the cost of evaluating it over a 2D grid is prohibitive (even though symmetry and precomputed tables have reduced the integral from 3D to 1D), since it must be computed once per spectral bin, hundreds of times per model evaluation, and potentially millions of times per confidence grid. A 170x150 contour grid (of electron spectrum exponential cutoff energy versus magnetic field strength) required 10 days to compute on 20-30 CPUs (the fault tolerance of PVM is critical here), and would scale linearly to a 6-10 month job on a single workstation.
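A minimal sketch of the partitioning step follows; all names are hypothetical, and the PVM module calls that ship each rectangle to a slave and collect its χ2 values are omitted.

   % Split an nx-by-ny grid of (cutoff energy, B field) values into
   % roughly jx*jy rectangles of index ranges; each rectangle is an
   % independent slave task.
   define partition_grid (nx, ny, jx, jy)
   {
      variable dx = nx/jx, dy = ny/jy, ntask = jx*jy;
      variable xlo = Integer_Type[ntask], xhi = Integer_Type[ntask];
      variable ylo = Integer_Type[ntask], yhi = Integer_Type[ntask];
      variable i, j, n = 0;
      for (i = 0; i < jx; i++)
      {
         for (j = 0; j < jy; j++)
         {
            xlo[n] = i*dx;   xhi[n] = (i+1)*dx;
            ylo[n] = j*dy;   yhi[n] = (j+1)*dy;
            if (i == jx-1) xhi[n] = nx;   % last row/column absorb any
            if (j == jy-1) yhi[n] = ny;   % remainder from integer division
            n++;
         }
      }
      return (xlo, xhi, ylo, yhi);
   }

Each rectangle can then be evaluated with an ordinary sequential confidence-grid routine on whichever slave picks it up, which helps balance load across CPUs of mixed speeds.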
Temperature Mapping
Temperature mapping is another problem that is straightforward to parallelize and for which we have already published results. For instance, Wise & Houck 2004 provide a map of heating in the intracluster medium of Perseus, computed from 10,000 spectral extractions and fits distributed over 20+ CPUs in just a few hours.
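Each map region amounts to an ordinary sequential ISIS fit, which is what makes the problem embarrassingly parallel; a minimal sketch of the per-region task follows (the file naming and choice of thermal model are illustrative, not those of the published analysis).

   % Fit one extracted region spectrum and return its best-fit temperature;
   % a slave would run this for each region index it is assigned.
   define fit_region (region)
   {
      variable id = load_data (sprintf ("reg%04d.pha", region));
      fit_fun ("phabs(1) * mekal(1)");
      () = fit_counts ();
      variable kT = get_par ("mekal(1).kT");
      delete_data (id);        % free the dataset before the next region
      return kT;               % sent back to the master to fill in the map
   }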
Going Forward
It is important to note that in the two previous studies the models themselves were not parallelized, so the usual entry barrier of
converting serial codes to parallel does not apply. One consequence is that the community should no longer feel compelled to compute error analyses or temperature maps serially. Another consequence is that the independence between partitions of the data and the computation being performed, which makes the use of sequential models possible in the parallel context, also lurks within other areas of the modeling problem. In principle it should be possible to evaluate an arbitrary sequential model in parallel by partitioning the model grid over which it's evaluated, or by evaluating over each dataset independently (when multiple datasets are fit), or in certain cases even by evaluating non-tied components in parallel. We are implementing these techniques with an eye towards rendering their use as transparent as possible for the non-expert. With simple models or small datasets these measures may not be necessary, but the days of simple models and small datasets are numbered. Reduced datasets have already hit the gigabyte scale, and multi-wavelength analysis such as we describe above is fast becoming the norm. These trends will only accelerate as newer instruments are deployed and the Virtual Observatory is more widely utilized, motivating scientists to tackle more ambitious analysis problems that may have been shunned in the past due to their computational expense.
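To illustrate the first of these approaches (partitioning the model grid over which a sequential model is evaluated), the sketch below splits the grid into contiguous chunks; eval_chunk is a hypothetical stand-in for shipping one chunk to a slave and collecting its result.

   % Evaluate a sequential model in parallel by partitioning its (lo,hi)
   % grid; concatenating the per-chunk results reproduces what a single
   % evaluation over the full grid would return.
   define eval_on_subgrids (lo, hi, par, nchunk)
   {
      variable n = length (lo), size = n/nchunk + 1;
      variable result = Double_Type[n];
      variable i, last;
      for (i = 0; i < n; i += size)
      {
         last = i + size - 1;
         if (last >= n) last = n - 1;
         result[[i:last]] = eval_chunk (lo[[i:last]], hi[[i:last]], par);
      }
      return result;
   }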
M. S. Noble, J. C. Houck, J. E. Davis, A. Young, M. Nowak
This work was supported by NASA through the AISRP grant NNG05GC23G and Smithsonian Astrophysical Observatory contract SV3-73016 for the Chandra X-Ray Center.
References
Allen, G. E., Houck, J. C., & Sturner, S. J. 2004, Advances in Space Research, 33, 440
Arnaud, K. A. 1996, ADASS V
Geist, A., Beguelin, A., Dongarra, J., Jiang, W., Manchek, R., & Sunderam, V. 1994, PVM: Parallel Virtual Machine, A User's Guide and Tutorial for Networked Parallel Computing
Houck, J. C. 2002, ISIS: The Interactive Spectral Interpretation System, in High Resolution X-ray Spectroscopy with XMM-Newton and Chandra
Kurtz, M. J., Karakashian, T., Grant, C. S., Eichhorn, G., Murray, S. S., Watson, J. M., Ossorio, P. G., & Stoner, J. L. 1993, ADASS II
Noble, M. S. 2003, http://space.mit.edu/cxc/software/slang/modules/slirp
Wise, M., & Houck, J. 2004, 35th COSPAR Scientific Assembly, 3997