About Chandra Archive Proposer Instruments & Calibration Newsletters Data Analysis HelpDesk Calibration Database NASA Archives & Centers Chandra Science Links

Skip the navigation links
Last modified: 14 Nov 2007
Hardcopy (PDF): A4 | Letter

Introduction to Fitting ASCII Data with Errors: Single-Component Source Models

Sherpa Threads (CIAO 4.0 Beta)

[S-Lang Syntax]



Overview

Last Update: 14 Nov 2007 - rewritten for CIAO 4.0 Beta 3

Synopsis:

One-dimensional data from an ASCII file are empirically fit with polynomials of several orders. A parameter expression is defined to link the polynomial offset with one of the constants. The data and fits are plotted, and the plots are customized with ChIPS commands.

Proceed to the HTML or hardcopy (PDF: A4 | letter) version of the thread.




Contents



Reading ASCII Data & Errors Into Sherpa

In this thread, we wish to fit 1-D data from the following ASCII dataset:

sherpa> !more data1.dat
0.5    1.6454   0.04114
1.5    1.7236   0.04114
2.5    1.9472   0.04114
3.5    2.2348   0.04114
4.5    2.6187   0.04114
5.5    2.8642   0.04114
6.5    3.1263   0.04114
7.5    3.2073   0.04114
8.5    3.2852   0.04114
9.5    3.3092   0.04114
10.5   3.4496   0.04114

This dataset is input into Sherpa using the load_data() function with the first argument setting the data set id to 1 (default), the second indicating the filename and the third indicating the number of columns to read. The first and second columns are read in as x and y(x), the third column is read in as the statistical errors:

sherpa> load_data(1, "data1.dat", 3);
sherpa> print(get_data()); 
name      = data1.dat
x         = Float64[11]
y         = Float64[11]
staterror = Float64[11]
syserror  = None

sherpa> print(get_data().x);
0.5
1.5
2.5
3.5
4.5
5.5
6.5
7.5
8.5
9.5
10.5

sherpa> print(get_data().staterror);
0.04114
0.04114
0.04114
0.04114
0.04114
0.04114
0.04114
0.04114
0.04114
0.04114
0.04114

To examine a content of the data we use print get_data() command.



Plotting Data

Now the dataset may be plotted:

sherpa> plot_data();

The CIAO software package includes a plotting application called ChIPS. ChIPS plotting can be used within Sherpa to modifying the appearance of a plot. The set_plot_xlabel() and set_plot_ylabel() commands are used to add axis labels:

Figure 1 [Link to Image 1: Data plotted with errors] shows the resulting plot.



Establishing a Model Component

A polynomial model will be used to fit the data. The Sherpa model name for a 1-D polynomial function is polynom1d.

The model is established using the set_source() function, specifying the model name for the session. Here, the generic polynom1d model is named model1:

The get_model_type() and get_model() commands may be used to examine the details of the established model. The initial parameter settings - thawed or frozen, initial values, minimum and maximum allowed values - are listed:

sherpa> print(get_model_type("model1"));
"polynom1d"

sherpa> print(get_model());
model1
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   model1.c0    thawed            1      -1e+120       1e+120
   model1.c1    frozen            0      -1e+120       1e+120
   model1.c2    frozen            0      -1e+120       1e+120
   model1.c3    frozen            0      -1e+120       1e+120
   model1.c4    frozen            0      -1e+120       1e+120
   model1.c5    frozen            0      -1e+120       1e+120
   model1.c6    frozen            0      -1e+120       1e+120
   model1.c7    frozen            0      -1e+120       1e+120
   model1.c8    frozen            0      -1e+120       1e+120
   model1.offset frozen            0      -1e+120       1e+120


Viewing Method & Statistic Settings

We use Sherpa's default optimization method and statistics for these polynomial fits. To view the current method and statistics settings:

sherpa> print(get_method_name());
"levmar"

sherpa> print(get_method());
ftol    = 1.19209289551e-07
xtol    = 1.19209289551e-07
gtol    = 1.19209289551e-07
maxfev  = 500
epsfcn  = 1.19209289551e-07
factor  = 100.0
verbose = 0

sherpa> print(get_stat_name());
"chi2gehrels"


Thawing Model Parameters & Fitting

To start, we wish to fit these data with a first-order polynomial. The c1 parameter of model1 needs to be thawed so that it will be allowed to vary during the fit:

sherpa> thaw(model1.c1); 
sherpa> print(get_model());
model1
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   model1.c0    thawed            1      -1e+120       1e+120           
   model1.c1    thawed            0      -1e+120       1e+120           
   model1.c2    frozen            0      -1e+120       1e+120           
   model1.c3    frozen            0      -1e+120       1e+120           
   model1.c4    frozen            0      -1e+120       1e+120           
   model1.c5    frozen            0      -1e+120       1e+120           
   model1.c6    frozen            0      -1e+120       1e+120           
   model1.c7    frozen            0      -1e+120       1e+120           
   model1.c8    frozen            0      -1e+120       1e+120           
   model1.offset frozen            0      -1e+120       1e+120           

The dataset is then fit using the default optimization method and statistics:

sherpa> fit();
Statistic value = 151.827 at function evaluation 6
Data points = 11
Degrees of freedom = 9
Probability [Q-value] = 3.68798e-28
Reduced statistic  = 16.8697
   model1.c0      1.58227     
   model1.c1      0.198455    

The fit information includes the chi-squared statistic value for chi2gehrels, goodness-of-fit and reduced chi-square along with the best-fit parameter values.

To plot the fit model and the data, use the plot_fit() function:

By default, the fit is plotted as a red line over the data, as shown in Figure 2 [Link to Image 2: First-order polynomial fit to the data]. Clearly a first-order polynomial does not give a good fit to the dataset.

The c2 parameter of model1 is thawed so that a second-order polynomial may be fit to the data:

sherpa> thaw(model1.c2); 
sherpa> fit();
Statistic value = 59.0027 at function evaluation 8
Data points = 11
Degrees of freedom = 8
Probability [Q-value] = 7.31065e-10
Reduced statistic  = 7.37534
   model1.c0      1.30826     
   model1.c1      0.347303    
   model1.c2      -0.0135317  

Plot the fit and the data:

Figure 3 [Link to Image 3: Second-order polynomial fit to the data] shows the plot of the second-order polynomial fit.

Finally, the data are fit with a third-order polynomial. The c3 parameter of model1 is thawed, then the data is fit again and plotted:

sherpa> thaw(model1.c3);
sherpa> fit();
Statistic value = 30.8491 at function evaluation 10
Data points = 11
Degrees of freedom = 7
Probability [Q-value] = 6.62869e-05
Reduced statistic  = 4.40701
   model1.c0      1.49843     
   model1.c1      0.1447      
   model1.c2      0.0322936   
   model1.c3      -0.00277729 

sherpa> plot_fit();
sherpa> set_plot_xlabel("X = Off-Axis (arcmin)");
sherpa> set_plot_ylabel("F(X) = SNR");

Figure 4 [Link to Image 4: Third-order polynomial fit to the data] shows the plot of the third-order polynomial fit together with the data.



Examining Fit Results

The best fit model with the data and residuals may be plotted in the same window:

sherpa> plot_fit_resid();

Various modifications may be made to the plots:

sherpa> set_current_plot("plot2"); 

% Label the x-axis; reformat the y-axis tickmarks 
sherpa> set_plot_xlabel("X = Off-Axis (arcmin)"); 
sherpa> set_axis_tickformat("ay1","%1.2f"); 

% Label the y-axis; reformat the y-axis tickmarks 
sherpa> set_current_plot("plot1"); 
sherpa> set_plot_ylabel("F(X) = SNR"); 
sherpa> set_axis_tickformat("ay1","%1.2f"); 

% Change the title; adjust the space between the two plots
sherpa> set_plot_title("ACIS 25000 Counts Per Chip"); 
sherpa> adjust_grid_ygap(0.04); 

% Add a label that describes the third-order polynomial fit
sherpa> add_label(3.0,1.7,"F(X)=(1.4984)+(0.1447)X+(0.0323)X^2+(-0.0028)X^3"); 

The "%" sign indicates a comment and will be ignored if entered on the command line.

Figure 5 [Link to Image 5: Plotting the fit and residuals], is saved as a JPG image and a PS file:

Use the projection() function - which may be shortened to proj() - to estimate the confidence intervals for the thawed parameters. By default, the 1-sigma upper and lower bounds are returned:

sherpa> proj();
projection 1-sigma bounds:
   Param            Best-Fit  Lower Bound  Upper Bound
   -----            --------  -----------  -----------
   model1.c0         1.49843   -0.0520806    0.0520806
   model1.c1          0.1447   -0.0413683    0.0413683
   model1.c2       0.0322936  -0.00874421   0.00874421
   model1.c3     -0.00277729 -0.000523298  0.000523298

In CIAO 3.4, the goodness command was used to get the chi-squared goodness-of-fit. This information is now reported with the best-fit values after a fit; it is no longer necessary to run a separate command. get_fit_results() allows access to this information after the fit has been performed.

For the final fit (third-order polynomial), the results are:

Statistic value = 30.8491 at function evaluation 10
Data points = 11
Degrees of freedom = 7
Probability [Q-value] = 6.62869e-05
Reduced statistic  = 4.40701


Linking Model Parameters

Instead of empirically fitting a polynomial to the data as before, we now wish to fit a first order polynomial such that the offset constant parameter is the following product:

  offset = 4.3979 * c1

where c1 is the first order coefficient and 4.3979 = log(25000).

First, a new polynom1d model is established and is named model2:

For a third-order polynomial fit, thaw the first three constant parameters of model2:

The offset parameter of model2 is linked to the value of the c1 parameter:

sherpa> model2.offset = (model2.c1)*(4.3979);

The resulting source model expression is:

Here the details of the link of the model2.offset are shown along with the initial settings.

sherpa> print(get_model());
model2
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   model2.c0    thawed            1      -1e+120       1e+120
   model2.c1    thawed            0      -1e+120       1e+120
   model2.c2    thawed            0      -1e+120       1e+120
   model2.c3    thawed            0      -1e+120       1e+120
   model2.c4    frozen            0      -1e+120       1e+120
   model2.c5    frozen            0      -1e+120       1e+120
   model2.c6    frozen            0      -1e+120       1e+120
   model2.c7    frozen            0      -1e+120       1e+120
   model2.c8    frozen            0      -1e+120       1e+120
   model2.offset linked            0      -1e+120       1e+120

sherpa> print(get_model().offset);
val     = 0
min     = -1e+120
default = 0
max     = 1e+120
units   =
frozen  = True
link    = (model2.c1 * 4.3979)

The dataset is then fit:

sherpa> fit();
Statistic value = 30.8491 at function evaluation 25
Data points = 11
Degrees of freedom = 7
Probability [Q-value] = 6.62869e-05
Reduced statistic  = 4.40701
   model2.c0      1.64339     
   model2.c1      0.193666    
   model2.c2      0.0251972   
   model2.c3      -0.00277729 

Plot the best fit model and add axis labels:

Figure 6 [Link to Image 6: Fitting using linked model parameters] shows the resulting plot.

To compare the fitting results obtained using model2 to those obtained using model1:

sherpa> print(model1);
model1
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   model1.c0    thawed      1.49843      -1e+120       1e+120           
   model1.c1    thawed       0.1447      -1e+120       1e+120           
   model1.c2    thawed    0.0322936      -1e+120       1e+120           
   model1.c3    thawed  -0.00277729      -1e+120       1e+120           
   model1.c4    frozen            0      -1e+120       1e+120           
   model1.c5    frozen            0      -1e+120       1e+120           
   model1.c6    frozen            0      -1e+120       1e+120           
   model1.c7    frozen            0      -1e+120       1e+120           
   model1.c8    frozen            0      -1e+120       1e+120           
   model1.offset frozen            0      -1e+120       1e+120           

sherpa> print(model2);
model2
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   model2.c0    thawed      1.64339      -1e+120       1e+120           
   model2.c1    thawed     0.193666      -1e+120       1e+120           
   model2.c2    thawed    0.0251972      -1e+120       1e+120           
   model2.c3    thawed  -0.00277729      -1e+120       1e+120           
   model2.c4    frozen            0      -1e+120       1e+120           
   model2.c5    frozen            0      -1e+120       1e+120           
   model2.c6    frozen            0      -1e+120       1e+120           
   model2.c7    frozen            0      -1e+120       1e+120           
   model2.c8    frozen            0      -1e+120       1e+120           
   model2.offset linked     0.851724      -1e+120       1e+120           

Again, the chi-squared goodness-of-fit was reported with the fit results:

Statistic value = 30.8491 at function evaluation 25
Data points = 11
Degrees of freedom = 7
Probability [Q-value] = 6.62869e-05
Reduced statistic  = 4.40701


Independently Fitting a Second Dataset

Finally, we fit a different dataset with a third-order polynomial.

This dataset and errors are read and loaded into Sherpa as dataset with id="2" so that the previous dataset (with the default id="1") isn't overwritten:

sherpa> !more data2.dat
0.5    2.4602   0.06151
1.5    2.4602   0.06151
2.5    2.6373   0.06151
3.5    2.9393   0.06151
4.5    3.2628   0.06151
5.5    3.4131   0.06151
6.5    3.4999   0.06151
7.5    3.6038   0.06151
8.5    3.6374   0.06151
9.5    3.8728   0.06151
10.5   4.3325   0.06151

sherpa> load_data(2, "data2.dat", 3);
sherpa> print(get_data()); 
name      = data1.dat
x         = Float64[11]
y         = Float64[11]
staterror = Float64[11]
syserror  = None

Dataset id=2 may now be plotted. Note that you need to specify the id in the plot_data() function:

Another polynom1d model - model3 - is established and the first three constants are thawed:

Dataset id=2 is then fit with the model3. Note that id has been explicitly stated in the fit() function:

sherpa> fit(2);
Statistic value = 35.5324 at function evaluation 10
Data points = 11
Degrees of freedom = 7
Probability [Q-value] = 8.88089e-06
Reduced statistic  = 5.07605
   model3.c0      2.19527     
   model3.c1      0.286636    
   model3.c2      -0.0234649  
   model3.c3      0.0013769   

And the best fit model is plotted:

This dataset may need another order in the polynomial, so we free model3.c4 and fit again:

sherpa> thaw(model3.c4);
sherpa> fit(2);
Statistic value = 1.18643 at function evaluation 12
Data points = 11
Degrees of freedom = 6
Probability [Q-value] = 0.97755
Reduced statistic  = 0.197739
   model3.c0      2.60526     
   model3.c1      -0.407013   
   model3.c2      0.254528    
   model3.c3      -0.0377019  
   model3.c4      0.00177631  

sherpa> plot_fit(2);
sherpa> set_plot_xlabel("X = Off-Axis (arcmin)");
sherpa> set_plot_ylabel("F(X) = SNR");

Figure 7 [Link to Image 7: Fitting a second dataset with a fourth-order polynomial] shows the resulting plot.



Checking Sherpa Session Status

A variety of information about the final status of this Sherpa session may be obtained:

% method and statistic setting
sherpa> print(get_method_name());
"levmar"

sherpa> print(get_stat_name());
"chi2gehrels"


% observed counts in each dataset
sherpa> calc_data_sum(1);
29.4115

sherpa> calc_data_sum(2);
36.1193


% what types of models are defined
sherpa> get_model_type("model1");
"polynom1d"

sherpa> get_model_type("model2");
"polynom1d"

sherpa> get_model_type("model3");
"polynom1d"


% final fit values for the source models
sherpa> print(get_model(1));
model2
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   model2.c0    thawed      1.64339      -1e+120       1e+120
   model2.c1    thawed     0.193666      -1e+120       1e+120
   model2.c2    thawed    0.0251972      -1e+120       1e+120
   model2.c3    thawed  -0.00277729      -1e+120       1e+120
   model2.c4    frozen            0      -1e+120       1e+120
   model2.c5    frozen            0      -1e+120       1e+120
   model2.c6    frozen            0      -1e+120       1e+120
   model2.c7    frozen            0      -1e+120       1e+120
   model2.c8    frozen            0      -1e+120       1e+120
   model2.offset linked     0.851724      -1e+120       1e+120

sherpa> print(get_model(2));
model3
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   model3.c0    thawed      2.60526      -1e+120       1e+120
   model3.c1    thawed    -0.407013      -1e+120       1e+120
   model3.c2    thawed     0.254528      -1e+120       1e+120
   model3.c3    thawed   -0.0377019      -1e+120       1e+120
   model3.c4    thawed   0.00177631      -1e+120       1e+120
   model3.c5    frozen            0      -1e+120       1e+120
   model3.c6    frozen            0      -1e+120       1e+120
   model3.c7    frozen            0      -1e+120       1e+120
   model3.c8    frozen            0      -1e+120       1e+120
   model3.offset frozen            0      -1e+120       1e+120


Exiting Sherpa

To exit the current Sherpa session:

sherpa> quit();


History

14 Nov 2007 rewritten for CIAO 4.0 Beta 3

Return to Threads Page: Top | All | Intro | Fitting
Hardcopy (PDF): A4 | Letter
Last modified: 14 Nov 2007


The Chandra X-Ray Center (CXC) is operated for NASA by the Smithsonian Astrophysical Observatory.
60 Garden Street, Cambridge, MA 02138 USA.    Email: cxcweb@head.cfa.harvard.edu
Smithsonian Institution, Copyright © 1998-2004. All rights reserved.