Chandra Data Archive Operations





For the past two and a half years the CDA (Chandra Data Archive) Operations team ("Arcops") has been collecting information about published papers that present Chandra observations. Links between those articles and the actual observations have been put into the CDA Bibliography database. As a result, users can now access all known publications presenting data from a particular ObsID from the archive's browser, WebChaSeR. Conversely, ADS articles link directly to the associated observations through WebChaSeR. In addition, the ADS browser allows searching on Chandra-related papers; this includes not only the papers that present specific observations, but also those that contain more general references to Chandra and its data.

Stéphane Paltani, who has since left for the Observatoire de Marseille, started the effort and wrote the initial tools for maintaining this database. During the past year we have been working on a substantial expansion of the database, including more categories of papers and more attributes for each entry. The result is a database that holds not only papers that present specific observations, but also papers that refer to Chandra results; that predict Chandra results; that describe Chandra instruments, software, or operations; and papers that do not fall in any of those categories but nevertheless are Chandra-related.

For each article the database keeps the "bibcode" (to allow direct ADS access), category, date of publication, whether it is refereed or not, the kind of article (abstract, full article, memo, erratum, etc.), the type of publication (book, journal, proceedings, circular, review, newsletter, internal), the number of citations, the keywords attached to the article, a Chandra proposal connection, and a number of instrument-like flags (ACIS, HRC, LETG, HETG, HRMA, PCAD, EPHIN, software, operations).
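
As an illustration only, the attributes listed above can be pictured as a single record per article. The sketch below is not the actual CDA schema: the class name, field names, and sample values (including the placeholder bibcode and ObsIDs) are assumptions made for the example. Only the flag names are taken from the list above.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional

# Flag names as listed in the text; everything else here is hypothetical.
KNOWN_FLAGS = frozenset({
    "ACIS", "HRC", "LETG", "HETG", "HRMA", "PCAD", "EPHIN",
    "software", "operations",
})


@dataclass
class BibEntry:
    """One bibliography entry; field names are illustrative assumptions."""
    bibcode: str                  # allows direct ADS access
    category: str                 # e.g. "presents specific observations"
    pub_date: str                 # date of publication
    refereed: bool
    article_kind: str             # abstract, full article, memo, erratum, ...
    publication_type: str         # book, journal, proceedings, ...
    citations: int = 0
    keywords: List[str] = field(default_factory=list)
    proposal: Optional[str] = None          # Chandra proposal connection
    flags: FrozenSet[str] = frozenset()     # instrument-like flags
    obsids: List[int] = field(default_factory=list)  # linked observations

    def __post_init__(self) -> None:
        # Reject any flag that is not in the list given in the text.
        unknown = set(self.flags) - KNOWN_FLAGS
        if unknown:
            raise ValueError(f"unknown flags: {sorted(unknown)}")


# Placeholder values, not a real paper or real ObsIDs.
entry = BibEntry(
    bibcode="PLACEHOLDER-BIBCODE",
    category="presents specific observations",
    pub_date="2004-01",
    refereed=True,
    article_kind="full article",
    publication_type="journal",
    flags=frozenset({"ACIS", "HETG"}),
    obsids=[1, 2],
)
```

Keeping the flags as a set makes combined queries (for example, "ACIS and software") a simple subset test.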

Since Stéphane left, Sherry Winkelman has taken over the project and implemented the expansion of the database. Sherry and John Bright have spent many days back-filling the database so we can now offer a much more detailed bibliographic search mechanism for Chandra-related publications.

We are pleased to announce the release of a new interface, written by Sarah Blecksmith:

http://cxc.harvard.edu/cgi-gen/cda/bibliography/

It allows searches such as: all papers that present observations of a particular object, optionally restricted to ACIS or HRC, with or without gratings; or all refereed papers that deal with software and ACIS. The database also provides ample data for the Director's Office, Mission Planning, and others to extract statistics and metrics.
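
The two example searches above amount to filtering entries on a handful of attributes. The following is a minimal sketch of that idea, not the interface's actual implementation: the `search` function, the dictionary keys, and the sample entries ("paper-A", "Object X", and so on) are all hypothetical.

```python
from typing import Dict, Iterable, List, Optional

GRATINGS = {"LETG", "HETG"}  # grating flags named in the article


def search(entries: Iterable[Dict],
           target: Optional[str] = None,
           refereed: Optional[bool] = None,
           flags: Iterable[str] = (),
           gratings: Optional[bool] = None) -> List[Dict]:
    """Return entries matching every constraint that is not None/empty."""
    wanted = set(flags)
    results = []
    for e in entries:
        if target is not None and e["target"] != target:
            continue
        if refereed is not None and e["refereed"] != refereed:
            continue
        if not wanted <= e["flags"]:           # all requested flags present
            continue
        if gratings is not None and bool(e["flags"] & GRATINGS) != gratings:
            continue
        results.append(e)
    return results


# Made-up sample entries for illustration.
entries = [
    {"id": "paper-A", "target": "Object X", "refereed": True,
     "flags": {"ACIS"}},
    {"id": "paper-B", "target": "Object X", "refereed": True,
     "flags": {"ACIS", "HETG"}},
    {"id": "paper-C", "target": None, "refereed": True,
     "flags": {"ACIS", "software"}},
    {"id": "paper-D", "target": None, "refereed": False,
     "flags": {"ACIS", "software"}},
]

# Papers on one object, using ACIS, without gratings.
no_gratings = search(entries, target="Object X", flags={"ACIS"}, gratings=False)

# Refereed papers that deal with software and ACIS.
software_acis = search(entries, refereed=True, flags={"ACIS", "software"})
```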

As one can imagine, vetting the ADS information is a considerable task. Every paper collected by the automated search needs to be categorized and its attributes determined. One of the most time-consuming aspects of this is the linking to individual ObsIDs. We try to make as many of these links as we can and greatly appreciate it when authors put lists of actual ObsIDs in the Observations section of their papers. If such a list is not present, we try to make a determination on the basis of the text of the article. And if that fails, we contact the author and hope for a prompt answer. Here are some statistics (as of mid-January 2004). We have a total of 3969 papers, with 24016 citations, in our database; of these, 832 present 1363 specific observations, with a total of 2612 links (some observations are linked to more than one article and many articles present more than one ObsID). To create the 3969 entries, Arcops inspected 12008 articles.
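
For readers who want the ratios implied by these figures, the short calculation below derives them from the numbers quoted above. The variable names are ours; only the five input counts come from the text.

```python
# Counts quoted in the article (as of mid-January 2004).
papers = 3969            # entries in the database
citations = 24016        # total citations to those papers
obs_papers = 832         # papers presenting specific observations
observations = 1363      # distinct observations presented
links = 2612             # paper-to-observation links
inspected = 12008        # articles inspected to create the entries

citations_per_paper = citations / papers
obs_paper_fraction = obs_papers / papers
links_per_obs_paper = links / obs_papers
links_per_observation = links / observations
accepted_fraction = papers / inspected

print(f"citations per paper:             {citations_per_paper:.2f}")
print(f"papers presenting observations:  {obs_paper_fraction:.1%}")
print(f"links per such paper:            {links_per_obs_paper:.2f}")
print(f"links per observation:           {links_per_observation:.2f}")
print(f"inspected articles kept:         {accepted_fraction:.1%}")
```

In round numbers, about one inspected article in three ends up in the database, about one entry in five presents specific observations, and each of those papers links to roughly three ObsIDs on average.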

This summer we expect that the ADS and the AAS Publication Board will announce the introduction of new tags that authors can insert into papers submitted to the US journals. One of those tags will be a Dataset Identifier. By putting such a tag into your manuscript you will enable the ADS and the CDA to harvest links between papers and datasets electronically; we would like to encourage you to take advantage of this new feature when it becomes available. Currently, it is an agreement between the NASA data centers, the ADS, and the US journal editors, but we hope that other journals will soon follow suit. We believe that the Dataset Identifier tags will make our task easier. By the time these tags are introduced we will provide information and tools that will make it easy for our users to insert them. Among the planned services is one that will allow users to request the definition of custom datasets, so that a single tag can replace separate tags for each individual ObsID in a long list. The Chandra Deep Fields are examples of existing compound datasets.
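
To make the custom-dataset idea concrete, the sketch below shows how a single compound identifier could stand in for a long list of ObsIDs. It is purely illustrative: the tag syntax had not been announced at the time of writing, so the registry, the identifier name "example-compound-dataset", the ObsID numbers, and the `expand` function are all hypothetical.

```python
from typing import Dict, Iterable, List

# Hypothetical registry of custom (compound) datasets.
# Names and ObsIDs are placeholders, not real archive contents.
CUSTOM_DATASETS: Dict[str, List[int]] = {
    "example-compound-dataset": [101, 102, 103],
}


def expand(identifiers: Iterable[str]) -> List[int]:
    """Expand dataset identifiers into a de-duplicated list of ObsIDs.

    An identifier is either the name of a custom dataset or a single
    ObsID written as a string (an assumption made for this sketch).
    """
    obsids: List[int] = []
    for ident in identifiers:
        if ident in CUSTOM_DATASETS:
            obsids.extend(CUSTOM_DATASETS[ident])
        else:
            obsids.append(int(ident))
    # Remove duplicates while preserving first-seen order.
    return list(dict.fromkeys(obsids))


# One compound identifier plus two individual ObsIDs (one a duplicate).
resolved = expand(["example-compound-dataset", "200", "101"])
```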

At this point it is not feasible to include preprint services in this effort. We are aware that adding papers from astro-ph to our database would be very valuable to our users, but unfortunately we do not have the resources to do so. This would be a massive effort, especially since it seems unlikely that astro-ph will use the ADS Dataset Identifiers. Nevertheless, with the current expansion, Chandra now has the most extensive mission-based bibliographical database and we trust that its interface will be a useful tool for the community.

Arnold Rots for the Archive Operations team







