Note: Exercise files are found here
It is a common situation
that a powder diffraction dataset will not provide enough observations to allow
refinement of all atoms independently. This is not only a problem for powder
diffraction; it is usually the case in single-crystal protein refinement as
well, as few datasets for that can be collected to atomic resolution. When the
number of parameters is not sufficient to discern individual atomic positions,
it is necessary to simplify the crystallographic model to require fewer
parameters through use of constraints. A common constraint is to group Uiso values so that a single value is used for
all atoms of a particular type or in a common bonding environment. Also common
is to apply restraints, which effectively allows assumptions about the model to
be used as if they were experimental observations. Examples of restraints
include providing target values for bond distances and angles, or penalizing a
refinement if atoms diverge from a plane.
One of the more
powerful types of constraints in GSAS-II is to constrain a group of atoms so
that their relative positions are fixed. Such a group of atoms is known as a
rigid body. The core concept in a rigid body refinement is that 3 parameters for
an origin and 3 orientation parameters are needed to locate a rigid body,
though in some cases symmetry can reduce the number of refinable parameters
below 6. In contrast, 3 position parameters are needed for every atom in a
structure on a general position. Thus, a four-atom group requires 12 positional
parameters and a 6-atom group requires 18 positional parameters, when modeled
with individual atoms, but as rigid-bodies only 6 parameters will be needed.
Also, a more chemically reasonable result will be obtained.
GSAS-II offers two
types of rigid bodies. This tutorial will demonstrate use of one type, residue
rigid bodies, which allow incorporation of optional torsional parameters. The second
type, vector rigid bodies, offer one or more refinable size parameters, which
can allow for refinement of selected bond distances or geometrical factors, but
this is less commonly needed compared to torsions.
In the supplied
example GSAS-II project file, a powder diffraction dataset is supplied for a sample
where there are two independent copies of the chiral R-(+)-Carbidopa
molecule (shown below) as well as two water molecules in the asymmetric
unit.
Since chemical
knowledge dictates that the 6 atoms in the phenyl ring, as well as the 6 atoms
bonded to them, will lie in a plane, and since bond distances and angles are
well known for those atoms, there is no reason to give these atoms any degrees
of freedom in the model used to fit the diffraction data, beyond the orientation
and position of the group. The positions for the remaining non-H atom can be
determined with only 4 additional parameters, noting that rotation is possible
around the four bonds circled below.
Thus, all 16
non-hydrogen atoms can be positioned with 10 parameters: 3 coordinates, 3
orientation parameters and 4 torsions. Compare this to the 48 positional
parameters needed for 16 unconstrained atoms. If we had neutron data that would
allow location of hydrogen atoms, adding 5 more torsions would allow for the
hydrogen atoms to be positioned as well, where a total of 30 atoms would be
located from 15 parameters.
In this example, the
fit in the supplied GSAS-II project file appears quite good with respect to the
agreement to the observed data, but closer examination shows that the bond
distances and angles of the two molecular groups are quite irregular. Since
this distorted structure is not physically reasonable, this model is not acceptable.
This well-fitting but poor-quality model arises because too many parameters
were introduced, which is sometimes called overfitting. In this tutorial, a
better model will be developed using rigid bodies for the two molecules,
providing a more chemically reasonable representation with fewer parameters -- simply
a better fit.
There are two steps
to working with rigid bodies in GSAS-II. The first is to obtain an accurate set
of coordinates for the molecule, molecular fragment, or ionic moiety, which can
be done in a number of ways. One way is to locate an accurate crystal structure
from a database or paper. A second method is to use a computational tool to
determine an approximate structure regularized for reasonable bond lengths.
This can be done with low-accuracy forcefield computations or varying levels of
improved theory. A good tool for simple forcefield minimization is the free
cross-platform Avogadro program.
Note that we could
generate the R-(+)-Carbidopa fairly simply from inside Avogadro by drawing the
molecule (process described
here), or even quicker, since the PubChem entry lists
a
SMILES description of the atomic bonding [C[C@@](CC1=CC(=C(C=C1)O)O)(C(=O)O)NN],
that description could also be used using the Avogadro Build/Insert/SMILES…
menu command, but in this step we will use a more general method to input a
molecular fragment to Avogadro, so that it can be regularized.
1A) For this tutorial, we will export coordinates
from GSAS-II in a form to be read Avogadro. They will be regularized in the
next step.
Begin by starting
GSAS-II. Any .gpx file (GSAS-II project) may be
used. If a new, empty, project is used, select the Data/“Add
new phase” menu item first. Select the Rigid Bodies tab in the data
tree.
1B) Use the “Edit Vector Body”/“Extract
from file” menu item select
the GSAS-II gpx format and read in the distort1.gpx file included in the sample input. This will read in the
coordinates for 34 atoms (two carbidopa molecules and two O atoms), from this
we want to select only one carbidopa molecule.
1C) Unselect
all atoms by pressing the Toggle All button and then select the first
sixteen atoms using the “Set Range” button and selecting the first [“0) C”]
and the sixteenth atom [“15) O3”]. Note that in the plot of the structure, unselected
atoms appear partially “greyed out” compared to selected atoms.
Once the 16 atoms
are selected, press “Continue”. The rigid body then appears in the plot
window. Note how distorted the molecule appears.
1D) If these coordinates were satisfactory, the
rigid body could be created at this point, but since these coordinates are
distorted, we will regularize them in Avogadro first. Export the coordinates by
pressing the “export as xyz” button on the
GSAS-II window.
Make a note of the
file name, as it will be needed in the next step. Press Cancel to close
this window. It is fine to close GSAS-II, but it can also remain open.
For this step, it
is assumed that the cross-platform software Avogadro
program has been installed. If you do not want to install this program,
this step can be skipped, as described at the beginning of Step
3.
2A) Read the coordinates into Avogadro by
starting the program and using the Avogadro File/Open command; select
the XYZ file type option and read the file produced in step 1D.
2B) Note that
the program noted the correct bonding of the carboxylic acid group, where one
of the C-O bonds is noted as a double bond, but the phenyl group does not have
alternating double bonds. To fix this, we need to click on the bonds, but that
is difficult to do unless the atoms are drawn smaller. Make sure that the Display
Settings button is selected (shown in blue) and click on the wrench
symbol to bring up the “Ball and Sticks Settings” dialog, below.
When the “balls” are
made smaller, it becomes possible to click on bonds.
2C) Switch to the
pencil in the toolbar
Click on directly on
every other C-C bond in the phenyl ring to change the bonds from single
to double. Note that when a C-C bond is clicked on, the order changes successively
from single to double and to triple and then back to single again. (Use care to
avoid clicking elsewhere than on a bond or additional atoms will be created;
use a right click to delete unneeded atoms.)
2D) Use the Build/“Add Hydrogens” menu item to add the remaining H atoms
to the other end of the molecule.
2E) Optimize the
bond distances and angles using Extensions/“Optimize
Geometry”. Use it a few times until no further changes are visible. The
molecule should appear as below.
2F) Use “Save
as” to write the newly generated coordinates out. Be sure to select the
file type as “XYZ”. I used name regularized.xyz for the file. Exit Avogadro.
If you skipped
step 2, you can use the file I created with Avogadro, supplied as regularized.xyz
3A) To start, open the supplied distort1.gpx file in
GSAS-II. Then we will create a rigid body for this project by clicking on the “Rigid
bodies” tree item. Click on the “Edit Residue Body”/“Extract
from file” menu item, select the format as XYZ and then select the
file written in step 2F (regularized.xyz).
Note that the “Import XYZ” menu command could also be used, but “Extract
from file” allows atoms to be selected, recomputes the origin as the
midpoint between atoms and allows rigid body axes and the origin to be defined based
on user preferences. With “Import XYZ” the coordinates would be used
exactly as in created by Avogadro.
3B) After the regularized.xyz file is opened and read; all atoms are selected
by default. Note that H atoms are not moved when torsional groups are rotated,
so we do not want to include any of the non-phenyl H atoms. For simplicity here we will remove all of the H atoms. (They could be
regenerated later, if desired). Select only atoms #0-#15 (as below) and
press Continue.
3C) Note that there are several other
things that can be done on the window that is now created. These include:
defining axes and reordering the atoms by dragging rows in the list. As an
example, selecting the atoms in the phenyl ring and pressing “Place in plane”, where
the adjacent button has “xy” selected, will orient
the z axis as normal to the plane of the ring. Selecting a single atom and
pressing “Set origin” will place that atom as the origin, or if two atoms are
selected the origin will be the midpoint between the two atoms. None of this is
needed here, but such actions will likely be needed for bodies that will be
placed at high-symmetry locations in a unit cell.
For this exercise all that is needed
to select all atoms (done by default) and then use the selected atoms to create
a rigid body by pressing the “a Residue Body” button near the bottom of
the list of buttons.
The rigid body is then created as
shown below.
3D) Find atoms in torsions
Press the Plot button to see the rigid body. Note that dragging with the left mouse
button held down is needed to see all the atom labels, as below.
From this it can be seen that the four
bonds with possible torsional rotations are:
a)
C3-C5,
b)
C5-C7,
c)
C7-N2 and
d)
C7-C9.
3E) Define the torsions
Press the “Edit Residue Body”/”Define torsion” menu item. This opens a window to select
an origin atom. Once that is selected, you are offered a choice of atoms that
are bonded to the selected atom.
Thus, for the C3-C5 bond, if C3 is
selected first as the origin atom, the options in the second menu for pivot
atom are: C1, C5 or C8. Select C5. (Note that if C5 had been selected
first as origin, then the pivot choices would be C3 and C7.)
After defining C3 and C5 for a
torsion, then repeat this to define a total of 4 torsions, by adding
C5-C7, C7-N2 and C7-C9. The torsions are displayed at the
bottom of the window, as shown below:
It can be instructive to select each
torsional mode in turn and move the slider to see how the mode rotates the
riding atoms.
Now that the rigid
body has been defined, the next step is to insert it into the crystal structure
model. This means defining where the rigid body is located, oriented and how it
is mapped to atoms in the asymmetric unit. The origin of the rigid body
Cartesian system is placed at a specified location (in fractional coordinates)
and a quaternion defines the rigid body orientation. The same rigid body may be
used multiple times in the asymmetric unit. There will need to be a set of
atoms, an origin, a quaternion for each insertion of each body. There will also
be a set of torsion values angle values associated with each body use.
Note that
understanding GSAS-II’s 3D visualization can be key to managing rigid bodies.
It is valuable to move and rotate the representation, so that distances along
all axes can be visualized. For this it is valuable to understand that as the
structure is viewed, movement of the mouse while holding down a button (also
called dragging) changes the view of the crystal structure, as described here):
· Dragging with the right
button down: moves the viewpoint, which is kept at the center of the screen
(effectively translating the structure)
· Dragging with the left
button down: rotates axes around screen x & y. Rotation is around the
viewpoint
· Dragging with the center
button down: rotates axes around screen z
· Rotating the scroll
wheel: changes “camera position” (zoom in/out)
Note that the screen axes are x: horizontal, y: vertical and z: out of
screen. Also, additional plot controls are available by pressing on the “Draw
Options” tab.
When inserting a
rigid body into a unit cell, as described in the next section, the rigid body can
be is being located into the unit cell, a second set of actions can be accessed
with the mouse that affect the rigid body only. This is done by holding the Alt
keyboard key down while “dragging” the mouse.
· Alt+right mouse dragging changes the rigid body
origin,
· Alt+middle button dragging rotates the rigid body
around the screen z axis.
· Alt+left button dragging rotates the body around the x
& y screen axes.
Visually, the rigid body motions are analogous to what appears when the
view of the structure is changed.
In this tutorial,
we want to place the rigid bodies into the approximate locations where those
molecules are already found. There are two ways to do this. One is that if
specified atoms in the rigid body are paired to atoms in the crystal, the
origin and/or quaternion can be optimized to best collocate the pairs of atoms.
This will be shown in Step 4. Another choice is to visually move the rigid body
to the desired location, which will be shown in Step 5. That process might be
needed to insert a molecule into a region of a Fourier map.
In this step we
will assign pairs of atoms that match between the rigid body and the crystal asymmetric
unit and have the best rigid body location be computed.
4A) To insert a
rigid body into a structure, select the desired phase in the data tree (here
there is only one choice) and then select the “Edit Body”/“Locate
& Insert Rigid Body” menu item. If more than one body were defined, a window
to select which body would be offered. A window is displayed with options to
define where the rigid body will be placed, as below.
Also shown in the
GSAS-II graphics window is a plot of the structure, along with the rigid body
location. Note that by default the asymmetric unit is shown with van der Waals
spheres. The rigid body is displayed as a ball-and-stick model with green
bonds. Note that the origin of the rigid body is shown with a triplet of red,
green and blue lines (for x, y, & z, respectively) as well as a white line
that shows the orientation vector.
Change the crystal
structure display mode to either Ball & Sticks or Sticks (your
preference) using the buttons on the upper right. This is needed because with
van der Waals spheres, it is hard to see when the rigid body is close to the
location where it will be docked.
Note that the
table at the lower part of the window shows each atom in the rigid body and the
closest atom in the crystal structure and the distance to that atom. In the
column labeled as “Assign as atom” to the extreme right of the table, we can assign
specific pairing between atoms in the rigid body and atoms in the crystal
structure that will override the closest atom pairing. Scrolling or expanding
the window so the entire table is visible will make subsequent steps easier (as
seen below). As the rigid body is moved, the distances will be updated. The
closest atom may also change if an assignment has not been made.
4B) We will align the rigid body to the molecule
in the cell closest to the upper right corner by assigning rigid body atoms to
the matching atoms in the unit cell.
If we click on
the 2nd row in the table (with row label “C2”), the row is
highlighted
Note that atom C2 in
the rigid body is highlighted by the ball color changing to green, as well as
its matching atom in the crystal structure (C6), as shown below.
The crystal
structure atom is not the correct one to match to C2. To find the right one, we
can highlight different atoms in the crystal structure by pressing the Tab
keyboard key until the correct atom is displayed; alternately, select different
atoms in the “Crystal Highlight” pull down. After cycling through, one can see
that C11 in the crystal structure should be paired to C2 in the rigid body.
4C) Assign C11 in the crystal structure to be paired to
C2 by clicking in the right-most box in the C2 row of the table and selecting
atom C11 from the dropdown list.
Optional: press
“Process Assignments” and the distance between RB C2 and unit cell C11 will be
shown in the table. Likewise, pressing “Set Origin” will translate the rigid
body to best match the assigned atoms. Since there is only one assigned atom,
the distance will be zero.
4D) Setting the
orientation and origin requires assignment of at least two more atoms and
preferably more. Select another easy to identify an atom in the rigid body by clicking
on the C4 row in the table. (Alternately, press Alt-Tab to highlight
different rigid body atoms.) Using Tab, it can be seen that crystal structure
atom C13 should be paired. Select atom C13 in the “Assign as atom” cell
for C4.
Likewise, rigid body
atom C5 should be assigned to crystal structure atom C14 by selecting atom
C14 in the “Assign as atom” cell for C5.
4E) Then press the “Set Both”
button and the phenyl ring positions are approximately matched, as below.
4F) The agreement
in atom positions at the other end of the molecule is still not very good, but
this can be improved by repositioning the view (dragging with the left
mouse button) to look approximately along the plane of the ring. When this is
done, it becomes clear that the agreement can be improved significantly by
rotating around the first torsion angle by moving the slider. An
angle of ~23 degrees brings the atoms much closer, but there are still atoms as
much as 1 Å in disagreement, since the crystal structure molecule is rather
distorted.
4G) Note that the
table now shows that the rigid body has crystal structure atoms 16-31 paired consecutively,
which is to be expected since the atom order was never changed. Press Add to define these atoms from
the rigid body parameters shown in the window. A warning is displayed, since
some of the distances are larger than 0.5 Å, but press Yes since this is
about as good as we can do.
This causes any
unpaired (shown as “create new”) to be added to the Atoms list and then every
atom in the rigid body is linked to an atom in the asymmetric unit. The rigid-body constrained atoms are now shown
with bonds in orange (unless “Show Rigid Bodies” is unchecked on the “Draw
Options” tab.) The atom coordinates are shown with a grey background on the
Atoms tab. Press the c key (with the graphics window active) to see the view
below.
As mentioned in
the introduction to steps 4 & 5, above, holding a mouse button down (mouse
dragging) changes the view of the crystal while holding Alt down while dragging
the mouse, moves the rigid body within the cell. In this step, to show how that
works, we will use the mouse to place the rigid body.
5A) To insert the rigid body into the structure
again to match the second copy of the molecule, again select the “Edit Body”/“Locate & Insert Rigid Body” menu item. A window is
displayed as before with options to define where the rigid body will be placed
and the graphical display will look like this:
5B) Hold the Alt key down and drag
with the middle mouse button to rotate the rigid body by 180 degrees, as
shown below.
5C) Again with
the Alt key down, drag with the right mouse button and
move the rigid body so that the phenyl rings are more or less aligned.
5D) To see how far the rings are separated in the
screen z direction, we must rotate the view. Drag the mouse upwards
with the left button down (without pressing Alt) to obtain a view
similar to the one below.
From this view it is
clear that a small rotation is needed and a small translation. First with the Alt
and the middle mouse button both pressed, drag upwards until the two rings
molecules are more parallel. Then, with the Alt and the right mouse
buttons both pressed, drag upwards until the molecules are better overlapped.
5E) At the point the rigid body is pretty close
to the position of the molecule in the unit cell, with distances in the table
ranging from 0.1 to nearly 2 Å, but better agreement can be obtained by making
small adjustments with the Alt down and successively pressing each of the three
mouse buttons alternated with changing the view to see the molecules from a
different perspective. Another choice is to assign several atoms and optimize,
as was done in steps 4D and 4E. Choices of atoms to optimize are clear since
the atoms are properly paired in the table. Updating the C3-C5 torsion to ~20
degrees will also improve the agreement.
5F) When the agreement is sufficient, press Add,
accept the warning and this molecule is also now constrained by the rigid body.
Now both molecules will appear as orange.
If the number of
cycles is set to zero (in the Controls tree item), the fit will be seen to have
gotten significantly worse, with the profile wR
jumping from 2.2% with the distorted model to ~6%. This is not surprising and
only a few refinement steps will provide a more realistic model with comparable
agreement, but far fewer parameters.
6A) Initially, turn off refinement of all atoms.
(Select the Atoms tab, double-click on the refine column label and then leave
all boxes unchecked and press OK). Note that the X flag for the atoms in the rigid
bodies would be ignored anyway.
Then select the RB
Models tab. The first rigid body is automatically selected. Make the following
changes:
1.
Select to refine the Origin
2.
Refine both the orientation azimuth and vector by setting the AV
flag
3.
Set the “thermal motion” mode to Uiso
4.
Set the initial Uiso value to
something reasonable, say 0.01
5.
Set the refinement flag for thermal motion
6B) Then select the second rigid body and
repeat the previous actions (setting the refine checkbox, selecting AV
for the orientation refinement, setting Uiso
mode, set the Uiso value to 0.01
and set the thermal motion refinement flag.)
6C) Set the number of cycles back to 5 and
refine. The fit will converge quickly to an wR
of ~4%.
6D) Add refinement of the four torsion angles
for each of the two rigid bodies. Then refine again. The wR quickly drops to 2.5%
The motion of
rigid bodies, known as the TLS description, has been worked out (Schomaker & Trueblood, Acta Cryst.
B24, 63-76, 1968) and consists of three tensors:
·
the T tensor for translation movement of all atoms together
·
the L tensor for libration, rotation of all
atoms as a unit around the rigid body origin, and
·
the S tensor, to describe correlated librational-translational
motion (screw-like motion). The S terms can be readily omitted if the rigid
body origin is at the body center of mass.
In the full expansion there are 6 T and 6 L and 8 S terms. It is
unlikely that all of these terms can ever be used with powder data.
The simplest
description that can be made with a rigid body is to assume that all motion is
translational and isotropic. This is what we have done and would be equivalent
to refining Uiso for each atom in the rigid body, but
applying a constraint so that all atoms in the rigid body share that Uiso value. It makes sense to see if further expansion of
the rigid body motion/disorder description will produce an even better fit,
although it is clear that the crystallographic fit is already quite good
considering the limited resolution and high background levels. Refining 24 more
parameters (6 T + 6 L, for each of the 2 bodies) is clearly impractical. One
possible simplification would be to refine only the diagonal T & L elements
(which constraints translation and libration directions
to crystal axes). That might be appropriate with data more sensitive to group
motion, but is still impractical here. One still simpler model is to include
translation and libration, but require equal
amplitudes in all directions. This can be done by refining only the diagonal T
and L terms, but to constrain them to be equal. If the T & L values are constrained
to be the same for the two groups, this means replacing the 2 group Uiso values with 2 terms (a single grouped Tjj and a Ljj
term). This is easily done as shown below.
7A) Switch to refine the T & L terms by
selecting each body, in the RB Models phase tab, set the thermal
motion model to TL for both rigid bodies and then turn on the refine
checkbox for the T11, T22, T33, L11, L22 & L33 terms.
7B) Delete the two existing constraints
on Uiso values: Use the Constraints tree item
and the Phase tab and delete the two entries. These constraints are
actually ignored with rigid body group motion, so this does not actually have
any effect, but they could lead to confusion.
7C) Create constraints on the Tjj
and Ljj terms through use of the Constraints
tree item. To constrain the Tjj terms, one
needs to know their variable names are RBRTjj:b:0
for body #b of type 0 where j = 1, 2 or 3.
On the Phase tab
(loaded by default), use the “Edit Constr.”/“Add Equivalence”
menu item and enter RBRT in the search filter. The first term is
one of the six that needs to be constrained, so select it as the 1st
variable and press OK.
On the next window
select the remaining five 11, 22 and 33 terms
and press OK:
7D) Create constraints on the Ljj terms through use of the Constraints
tree item as was done before. Their variable name is RBRLjj:b:0.
Again use the “Edit Constr.”/“Add Equivalence” menu item and enter RBRL
in the search filter. The first term is one of the six that needs to be
constrained, and on the next screen select the remaining five 11, 22 and 33
terms and press OK. Constraints should appear as:
7E) Then refine with these parameters and
the Rwp remains at 2.5%, but I would argue this is a
more reasonable model than the Uiso model. If the
thermal ellipsoids are plotted, they get larger further away from the origin as
one might expect.
7F) An even more relaxed model would be to allow
the Tjj and Ljj
terms to differ between the two bodies, with constraints as shown below, but
this produces no improvement in the Rwp. Further it
has a non-physical negative Ljj value for
one of the two bodies, so this is not a better model.
In the tutorial, we have seen how to create a rigid body, using the Avogadro program to regularize the internal geometry. The body was then defined within GSAS-II, including 4 torsion angles internal to the body. The rigid body was then placed into the unit cell in previously determined positions/orientations. The positions, orientations and torsions, as well as rigid body group motion parameters, we then refined. The resulting fit was excellent with a highly realistic structure and a small number of parameters that are appropriate for the dataset being used.