Greater Seattle Aquarium Society Logo (www.gsas.org) used with permission


Using the GSAS2CIF program
to export GSAS results

The crystallographic information file (CIF) was developed by the IUCr as a standardized format to document single-crystal structure determinations and to exchange the results between laboratories. More recently, the pdCIF dictionary was developed to allow CIF to document powder diffraction measurements as well as Rietveld results. A very short introduction to CIF is included at the end of this document.

The GSAS2CIF program is used to create CIFs from GSAS results. This web page documents the ideas behind the program as well as how the program is used.

Overview of the steps to create a CIF in GSAS

  1. Complete (ha!) your refinement
  2. Run DISAGL to compute interatomic distances and angles
  3. (Optional) Edit the "publication flags" in the distance and angle listings
  4. (Optional) Copy previously completed CIF templates to the current directory
  5. Run GSAS2CIF
  6. Edit the template files to include additional information
  7. Run GSAS2CIF

    ... and if you are like me...

  8. Try out a few more ideas in GENLES
  9. Run DISAGL
  10. Run GSAS2CIF

    ... repeat steps 8-10 until really finished... (see quote)

Structure of the GSAS2CIF program

The GSAS2CIF program is used to prepare CIF files containing results from GSAS. Note that many important types of information needed to describe the sample and the measurement are not supplied as input to GSAS. Examples of information that should likely be included in a CIF, but is not defined within GSAS, are the data measurement temperature and the sample preparation conditions. If the CIF will be used as supplementary material to accompany a publication, this sort of documentary information certainly should be supplied, so that the CIF has value as archival and database material. If a CIF is being prepared for submission to an Acta Cryst. journal, the IUCr has a template with a recommended list of CIF data items.

Since the GSAS2CIF program cannot supply many items that need to be present in a well-documented CIF, it copies information from template files into each CIF. Three separate template files are used:

  1. one with publication and other overall information,
  2. one with information about the sample & specimen and
  3. one with information about diffraction instrumentation and data collection parameters.
The intent is that users will modify copies of these template files and thus avoid entering the same information multiple times.

When GSAS2CIF is used to create a CIF file for an experiment named expnam (e.g. from file expnam.EXP), the GSAS2CIF program creates CIF file expnam.cif containing GSAS results. Information from a series of template files is copied directly into the CIF. There will be N+M+1 template files, where N is the number of phases and M is the number of data histograms. The files are named as follows:

  1. expnam_publ.cif for the publication/overall information template file;
  2. expnam_phasen.cif for the N sample/specimen template file(s);
  3. expnam_INSTmm.cif for the M instrument/data collection template file(s), where INST is the instrument name (see below).
If these files do not exist, they are created and filled with the contents of master versions of the template files. In the case of the expnam_publ.cif and expnam_phasen.cif files, template files named template_publ.cif and template_phase.cif are read, if present, from the directory where the expnam.EXP file is found and, if not there, from the GSAS data directory. In the case of the expnam_INSTmm.cif file(s), the program first looks for a file named template_INST.cif in the current directory and then in the GSAS data directory; if that file is not found, file template_instrument.cif is read from the current directory or, if not found there, from the GSAS data directory.
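
The naming scheme and search order described above can be sketched in a few lines of Python. This is only an illustration, not code from GSAS2CIF: the function names are hypothetical, the two-digit histogram number is an assumption based on the "mm" in the file name, and the "current directory" is assumed to be the directory holding expnam.EXP.

```python
from pathlib import Path


def experiment_template_names(expnam, n_phases, instrument_names):
    """List the N+M+1 per-experiment template file names.

    instrument_names holds one instrument name per histogram, in
    histogram order. Formatting the histogram number as two digits
    is an assumption based on the "mm" in expnam_INSTmm.cif.
    """
    names = [f"{expnam}_publ.cif"]
    names += [f"{expnam}_phase{n}.cif" for n in range(1, n_phases + 1)]
    names += [
        f"{expnam}_{inst}{m:02d}.cif"
        for m, inst in enumerate(instrument_names, start=1)
    ]
    return names


def master_template_candidates(kind, exp_dir, gsas_data_dir, inst=None):
    """Return master template paths in the order they would be tried.

    kind is "publ", "phase" or "instrument" (hypothetical labels).
    """
    exp_dir, gsas_data_dir = Path(exp_dir), Path(gsas_data_dir)
    if kind in ("publ", "phase"):
        filenames = [f"template_{kind}.cif"]
    elif kind == "instrument":
        # Instrument-specific master first, then the generic one.
        filenames = [f"template_{inst}.cif"] if inst else []
        filenames.append("template_instrument.cif")
    else:
        raise ValueError(f"unknown template kind: {kind!r}")
    # For each file name: experiment directory first, then GSAS data directory.
    return [d / name for name in filenames for d in (exp_dir, gsas_data_dir)]
```

For example, a refinement named garnet with two phases and two histograms from an instrument called LAB would give five template names, and the first existing path returned by master_template_candidates would be the master copied into a missing template.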

This somewhat complex series of template files allows for the creation of default template files for commonly-used instruments, as well as the reuse of the other template files by copying them as needed. Also, if information is added only to the template files, rather than to the final expnam.cif file, then when GSAS2CIF is rerun at a later stage in the refinement, the crystallographic results in the CIF are updated and the template information is retained automatically. Note that at present there is very little software available for editing a CIF, so these template files must be edited with a text editor.

In addition to reading the GSAS experiment file (file expnam.EXP), GSAS2CIF also reads the variance-covariance matrix created in GENLES (from file expnam.CMT) and a table of interatomic distances and angles created by program DISAGL (file expnam.DISAGL). If these files cannot be read, GSAS2CIF produces a warning message, since the CIF will be incomplete without this information.
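
Before running GSAS2CIF, it can be convenient to check that these three input files are present. The following is a minimal sketch, not part of GSAS; the function name is hypothetical and the upper-case file extensions are assumed to match the names on disk.

```python
from pathlib import Path


def missing_gsas2cif_inputs(expnam, directory="."):
    """Return the names of the GSAS2CIF input files that are not present.

    .EXP is the experiment file, .CMT the variance-covariance matrix
    from GENLES and .DISAGL the distance/angle table from DISAGL.
    """
    base = Path(directory)
    wanted = [f"{expnam}{ext}" for ext in (".EXP", ".CMT", ".DISAGL")]
    return [name for name in wanted if not (base / name).is_file()]
```

An empty list means all three files were found; anything else suggests that GENLES or DISAGL should be rerun first.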

Instrument Name

An instrument name is needed for every GSAS histogram. It is best if this name is unique to a specific instrument, so for commercial instruments the name should contain part of the instrument serial number, the institution name, etc. The instrument name may be defined in the instrument parameter file by including a record of type "INS nn INAME Instrument name". If this name is not defined in the original instrument parameter file, GSAS2CIF will request an instrument name for each histogram when it is run, and this information will be added to the GSAS experiment file. Note that the vertical bar character (|) should not be used in instrument names.
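
As an illustration, the sketch below picks INAME records out of the lines of an instrument parameter file and checks a name for the vertical bar. It is not GSAS code: the function names are hypothetical, and the pattern assumes only the "INS nn INAME Instrument name" form quoted above, not the exact column layout of a real file.

```python
import re

# Assumed record shape: "INS", bank number, "INAME", then the name.
INAME_RECORD = re.compile(r"^INS\s*(\d+)\s*INAME\s+(.*\S)\s*$")


def instrument_names(lines):
    """Map bank number to instrument name for every INAME record found."""
    names = {}
    for line in lines:
        match = INAME_RECORD.match(line)
        if match:
            names[int(match.group(1))] = match.group(2)
    return names


def is_valid_instrument_name(name):
    """Reject empty names and names containing the vertical bar."""
    return bool(name.strip()) and "|" not in name
```

Given the lines of a parameter file, instrument_names returns an empty dictionary when no INAME record is present, which is the case where GSAS2CIF would prompt for a name.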

Publication/Non-Publish Flag for Distances and Angles

The DISAGL program will tabulate all interatomic distances within specified interatomic radii. This information is recorded in a file named expnam.DISAGL. These radii may be modified using EXPEDT (but not, at present, EXPGUI). The IUCr journals use special flags (_geom_bond_publ_flag and _geom_angle_publ_flag) to indicate distances and angles that will be tabulated in a publication. When DISAGL is first run, this flag is set to zero, meaning "do not publish". If this flag is changed to a digit between 1 and 9 (at present, this must be done with a text editing program), the distance/angle flag is set to "publish." If DISAGL is rerun at a later time to update the distances and angles, these publication flags are transferred to the updated expnam.DISAGL file.

Acknowledgements

Richard L. Harlow first got me interested in the problem of a universal file format for powder diffraction data, leading eventually to this effort. I may forgive him someday.


"A Rietveld refinement is never perfected, merely abandoned." Peter Stephens

Appendix: A quick & incomplete introduction to CIF

A CIF file consists of logical groups of information that are called data blocks, since each block is initiated by a label of the form data_label. In the simple case, where a single crystallographic model is determined from a single diffraction dataset (histogram), the CIF can be a single block. Where either multiple datasets (histograms) or multiple phases are used in the refinement, a CIF will require several data blocks to describe the data and results.

CIF consists of a series of tags, called data names, and values associated with these data names. Together the data name and value are called a data item. A separate document, the CIF (or pdCIF) dictionary, defines the meaning of each data name. If a value does not contain any spaces, it may be specified without quotes; otherwise either single or double quotes may be used to delimit the string. A data value of one or more lines is quoted with semicolon characters (;). Each semicolon must be the first character on a line, and the final semicolon is expected to be the only non-blank character on its line. With the exception of semicolon location, CIF ignores spaces. The following lines give examples of a few CIF data items.

_pd_calc_method   'Rietveld Refinement'

_cell_volume   1811.00(5)

_symmetry_space_group_name_H-M   "I a 3 d"

_pd_proc_ls_background_function
;   GSAS Background function number 1 with 4 terms.
 Shifted Chebyshev function of 1st kind
      1:    139.025     2:   -11.5408     3:    9.75652     4:    3.90497    
;
CIF also allows multiple values to be associated with a CIF data item. This is done by preceding the data name with the keyword loop_. If two or more data names follow the loop_ keyword, a table can be constructed, as is shown in the following examples.
loop_  _symmetry_equiv_pos_as_xyz
       +x,+y,+z     +z,+x,+y 
       +y,+z,+x     +x+1/2,+y,-z+1/2 


loop_
      _atom_site_label
      _atom_site_fract_x
      _atom_site_fract_y
      _atom_site_fract_z
Y1      0.125        0.0          0.25     
FE2     0.0          0.0          0.0      
AL3     0.375        0.0          0.25     
O4     -0.02946(5)   0.05385(5)   0.15068(6)
Finally, it should be pointed out that two values in CIF have special meanings. If a value is supplied as a single period (.), this means that the value is not defined or is inappropriate. If the value is a question mark (?), this means that the value is unknown or not specified.

For more information, see: The CIF home page; short intros to pdCIF & CIF syntax

Comments, corrections or questions: crystal@NIST.gov
$Revision: 0.0 $ $Date: 2002/?/? $