· Note: Exercise files are
found here
It is common that
powder diffraction does not provide enough observations to allow refinement of
all atoms independently. This is not only a problem for powder diffraction; it
is usually the case in protein refinement as well, as few datasets can be
collected to atomic resolution. When the number of parameters is not sufficient
to discern individual atomic positions, it is necessary to simplify the
crystallographic model to require fewer parameters through use of constraints
or to apply restraints, which effectively allow assumptions about the model to
be used as if they were experimental observations. Examples of restraints
include providing target values for bond distances and angles, or penalizing a
refinement if atoms diverge from a plane. A common constraint is to group Uiso values so that a single value is used for all atoms of
a particular type or in a common bonding environment.
One of the more
powerful types of constraints in GSAS-II is to constrain a group of atoms so
that their relative positions are fixed. Such a group of atoms is known as a
rigid body. The core concept in a rigid body refinement is that (at most) 3
position parameters and 3 orientation parameters are needed to locate a simple
rigid body. Sometimes symmetry can reduce the number of refinable parameters.
In contrast, 3 position parameters are needed for every atom in a structure on
a general position. Thus, a four-atom group requires 12 positional parameters
and a 6-atom group requires 18 positional parameters, if modeled with
individual atoms, but as rigid-bodies <=6 parameters will be needed.
GSAS-II offers two
types of rigid bodies. This tutorial will demonstrate use of one type, residue
rigid bodies, which allow incorporation of optional torsional parameters, while
the second type, vector rigid bodies, offer one or more refinable size parameters,
which can allow for refinement of selected bond distances or geometrical
factors.
In the supplied
example GSAS-II project file, a set of diffraction data is supplied for a
crystal with two copies of the chiral R-(+)-Carbidopa
molecule (shown below) and two water molecules in the asymmetric unit.
Note that
ignoring H atom positioning, because the phenyl ring defines the relative
positions of the 6 component atoms and 3 atoms directly bonded to the ring, and
since bond distances and angles are well known, this molecule has only four
torsional degrees of freedom, where rotation is possible around the four bonds
circled below.
Thus, all
16 non-hydrogen atoms can be positioned with 3 coordinates, 3 orientation
parameters and 4 torsions. Compare these 10 parameters to the 48 positional
parameters for 16 unconstrained atoms. (Adding 5 more torsions would allow for
the hydrogen atoms to be positioned as well, for a total of 30 atoms from 15
parameters.)
In this
example, the fit in the supplied GSAS-II project file appears quite good with
respect to the agreement to the observed data, but closer examination shows
that the bond distances and angles of the two molecular groups are quite
irregular. Since this distorted structure is not physically reasonable, this
model is far from optimal. This well-fitting but poor quality
model is due to use of too many parameters, which is sometimes called
overfitting. In this tutorial, a better model will be developed using rigid
bodies for the two molecules.
There are
two steps to working with rigid bodies in GSAS-II. The first is to obtain an
accurate set of coordinates for the molecule, molecular fragment, or ionic
moiety, which can be done in a number of ways. One way is to locate an accurate
crystal structure from a database or paper. A second method is to use a
computational tool to determine an approximate structure regularized for
reasonable bond lengths. This can be done with low-accuracy forcefield
computations or varying levels of theory. A good tool for force field
minimization is the free cross-platform Avogadro
program. The R-(+)-Carbidopa can be generated inside Avogadro by drawing
the molecule (described here),
or since the PubChem
entry lists a
SMILES description of the atomic bonding
[C[C@@](CC1=CC(=C(C=C1)O)O)(C(=O)O)NN], this can also be used using the
Build/Insert/SMILES… menu command in that program, but in the next step we will
use a more general method to regularize a structure.
For this
tutorial, we will export the coordinates from GSAS-II into a form to be read Avogadro.
They will be regularized in the next step.
Begin by starting
GSAS-II. Any .gpx file (GSAS-II project) may be
used. If a new project is used, select the Data/“Add
new phase” menu item first. Select the Rigid Bodies tab in the data
tree.
Use the “Edit Vector Body”/“Extract
from file” menu item
select the GSAS-II gpx format and read in the distort1.gpx file included in the sample input. This will read
in the coordinates for 34 atoms (two carbidopa molecules and two O atoms), from
this we want to select only one carbidopa molecule.
Unselect
all atoms by pressing the Toggle All button and then select the first
sixteen atoms using the “Set Range” button and selecting the first [“0) C”]
and the sixteenth atom [“15) O3”]. Note that the selected atoms appear
differently from unselected atoms in the plot of the structure.
Then press
“Continue”. The rigid body then appears in the plot window. Note how
distorted the molecule appears.
Export the
coordinates by pressing the “export as xyz”
button on the GSAS-II window.
Make a note
of the file name, as it will be needed in the next step. If these coordinates
were satisfactory, the rigid body could be created at this point, but since
these coordinates are distorted, we will regularize them first. Press Cancel
to close this window. It is fine to close GSAS-II, but it can remain open.
For this
step, it is assumed that the cross-platform software Avogadro program has been installed. If you do
not want to install this program, this step can be skipped.
Read the
coordinates into Avogadro by starting the program and using the Avogadro
File/Open command; select the XYZ file type option and read
the file produced in step 1D.
2B)
Note that
the program noted the correct bonding of the carboxylic acid group, where one
of the C-O bonds is noted as a double bond, but the phenyl group does not have
alternating double bonds. To fix this, we need to click on the bonds, but that
is difficult to do unless the atoms are drawn smaller. Make sure that the Display
Settings button is selected (shown in blue) and click on the wrench
symbol to bring up the “Ball and Sticks Settings” dialog, below.
When the
“balls” are smaller, it becomes possible to click on every other bond in
the phenyl ring.
Switch to
the pencil in the toolbar
Click on
directly on every other C-C bond in the ring to change the bonds from
single to double. (Use care to avoid clicking elsewhere or additional atoms
will be created; use a right click to delete them.)
Use the Build/“Add Hydrogens” menu item to add the remaining H atoms
to the other end of the molecule.
Optimize
the bond distances and angles using Extensions/“Optimize
Geometry”. Use it a few times until no further changes are visible. The
molecule should appear as below.
Use “Save
as” to write the newly generated coordinates out. Be sure to select the
file type as “XYZ”. I used name regularized.xyz for the file. Exit Avogadro.
If you skipped step 2, you can
download the Avogadro output file, supplied as regularized.xyz
To start, open the distort1.gpx
file in GSAS-II. Click on the “Rigid bodies” tree item. Click on the “Edit
Residue Body”/“Extract from file” menu item,
select the format as XYZ and then select the file written in step 2F (regularized.xyz).
Note that if
“Import XYZ” were used, the coordinates would be used exactly as in created by
Avogadro, but Extract from file allows atoms to be selected, recomputes the
origin as the midpoint between atoms and allows rigid body axes and the origin
to be defined by user preferences.
Once the file is opened and read; by default all atoms are selected. Note that H atoms are not
moved when torsional groups are moved, so we do not want to include any of the
non-phenyl H atoms. It is easier to remove all of the H atoms (they can be
regenerated later as riders if desired). After selecting only atoms #0-#15
(as below), press Continue.
Here all that is needed to select all
atoms (done by default) and then use the selected atoms to create a rigid body
by pressing the “a Residue Body” button near the bottom of the list of buttons.
The rigid body is then created.
Other things that
can be done on the previous window include defining axes and reordering
the atoms by dragging rows in the list. As an example, selecting the atoms in
the phenyl ring and pressing “Place in plane” “xy”
will orient the z axis as normal to the plane of the ring. Selecting a single
atom and pressing “Set origin” will place that atom as the origin, or if two
atoms are selected the origin will be the midpoint between the two atoms. None
of this is needed here, but such actions can be needed for bodies that will be
placed relative to special positions in a unit cell.
Press the Plot button to see the rigid
body. Note that dragging with the left mouse button held down is needed to see
all the atom labels, as below.
From this it can be seen that the
bonds with possible torsional rotations are C3-C5, C5-C7, C7-N2 and C7-C9.
Press the “Edit Residue Body”/”Define torsion.” This requires selecting an origin
atom. Once that is selected, you are offered a choice of atoms that are bonded
to the selected atom. Thus for the C3-C5 bond, if C3
is selected first, the options in the second menu are C1, C5 or C8 and C5
is selected. If C5 is selected first, then the choices are C3 and C7.
Then repeat this to define a
total of 4 torsions, by then adding C5-C7, C7-N2 and C7-C9.
The torsions are displayed at the bottom of the window, as shown below:
It can be
instructive to select each torsional mode in turn and move the slider to see
how the mode rotates the riding atoms.
Now that
the rigid body has been defined, the next step is to insert it into the crystal
structure model. This means defining where the rigid body is located, oriented
and how it is mapped to atoms in the asymmetric unit. The origin of the rigid
body Cartesian system is placed at a specified location (in fractional coordinates)
and a quaternion defines the rigid body orientation. The same rigid body may be
used multiple times in the asymmetric unit. There will need to be a set of
atoms, an origin, a quaternion for each insertion of each body. There will also
be a set of torsion values angle values associated with each body use.
Note that
understanding GSAS-II’s 3D visualization can be key to managing rigid bodies.
It is valuable to move and rotate the representation, so that distances along
all axes can be visualized. For this it is valuable to understand that as the
structure is viewed, movement of the mouse while holding down a button (also
called dragging) changes the view of the crystal structure, as described here):
* Dragging with
the right button down: moves the viewpoint, which is kept at the center of the
screen (effectively translating the structure)
* Dragging with the left button
down H: rotates axes around screen x & y. Rotation is around the
viewpoint
* Dragging with
the center button down: rotates axes around screen z
* Rotating the
scroll wheel: changes “camera position” (zoom in/out)
Note that the screen axes are x: horizontal,
y: vertical and z: out of screen. Also, additional plot controls are available
by pressing on the “Draw Options” tab.
When, as
described in the next section, a rigid body is being located into the unit
cell, a second set of actions can be accessed with the mouse. This is done by
holding the Alt keyboard key down while dragging the mouse. Alt+right
mouse dragging changes the rigid body origin, while Alt+middle
button and Alt+left button actions rotate the body by
changing the quaternion. Visually, the rigid body motions are analogous to what
appears when the view of the structure is changed.
In this
example, we want to place the rigid bodies into the approximate locations where
those molecules are already found. There are two ways to do this. One is that
if specified atoms in the rigid body are paired to atoms in the crystal, the
origin and/or quaternion can be optimized to best collocate the pairs of atoms.
This will be shown in Step 4. The other choice is to visually “dock” the rigid
body to the location already found, which will be shown in Step 5.
To insert a
rigid body into a structure, select the desired phase in the data tree (here
there is only one choice) and then select the “Edit Body”/“Locate
& Insert Rigid Body” menu item. A window is displayed with options to
define where the rigid body will be placed
and a plot
of the structure along with the rigid body location are shown in the plot
window
where the
asymmetric unit is shown initially with van der Waals spheres and the rigid
body is displayed as a ball-and-stick model with green bonds.
Change the
crystal structure display to either Ball & Sticks or Sticks (your
preference) using the buttons on the upper right. This is needed because with
van der Waals spheres it is hard to see when the rigid body is close to the
location where it will be docked.
Note
that the table at the lower part of the window shows each atom in the rigid
body and the closest atom in the crystal structure and the distance to that
atom. In the column labeled as “Assign as atom” to the extreme right of the
table, we can specify specific pairing between atoms in the rigid body and
atoms in the crystal structure. Scrolling or expanding the window so the entire
table is visible will make subsequent steps easier (as seen below). As the
rigid body is moved, the distances, and if not paired explicitly, possibly the
closest atom will be updated.
We will
align the rigid body to the molecule in the cell closest to the upper right
corner by matching the atoms to the corresponding atoms.
If we click
on the 2nd row in the table (with row label “C1”), the row is
highlighted
Note that
atom C1 in the rigid body is highlighted by the ball color changing to green,
as well as its matching atom in the crystal structure (C6), as shown below.
The crystal
structure atom is not the correct atom to match to C1. We can highlight
different atoms (of the same type) in the crystal structure by pressing the Tab
keyboard key until the correct atom is displayed or by selecting different
atoms in the “Crystal Highlight” pull down. This shows that C11 in the crystal
structure should be paired to C1 in the rigid body.
Force C11
in the crystal structure should be paired to C1 by clicking in the right-most
box in the C2 row of the table and selecting atom C11 from the list.
Optional: press “Process Assignments” and the distance between C1 (RB) and C11
(C) will be shown in the table. Likewise, pressing “Set Origin” will translate
the rigid body to best match the assigned atoms. Since there is only one, the
distance will be zero.
Setting the
orientation requires assignment of at least one more atom and preferably a few.
Select an easy to identify an atom in the rigid body by clicking on the C4
row in the table. (Alternately, press Alt-Tab to highlight different rigid
body atoms.) Using Tab, it can be seen that crystal structure atom C13 should
be paired. Select atom C13 in the “Assign as atom” cell for C4.
Likewise,
rigid body atom C5 should be matched to crystal structure atom C14 by selecting
atom C14 in the “Assign as atom” cell for C5.
Then press
the “Set Both” button and the phenyl ring positions are approximately
matched, as below.
The
agreement in atom positions at the other end of the molecule is still not very
good, but this can be improved by repositioning the view (dragging with
the left mouse button) to look approximately along the plane of the ring. When
this is done, it becomes clear that the agreement can be improved significantly
by rotating around the first torsion angle by moving the slider.
An angle of 22-23 degrees is much brings the atoms much closer, but there are
still atoms as much as 1 Å in disagreement, since the crystal structure
molecule is rather distorted.
Note that
the table now shows that the rigid body has crystal structure atoms 16-31
linked consecutively, which is to be expected since the atom order was never
changed. Press Add to define
these atoms from the rigid body parameters shown in the window. A warning is
displayed, since some of the distances are larger than 0.5 Å, but press Yes
since this is about as good as we can do.
The
rigid-body constrained molecule is now shown in orange, unless “Show Rigid Bodies”
is unchecked on the “Draw Options” tab. Press the c key (with the graphics
window active) to see the view below.
As
mentioned in the introduction to steps 4 & 5, above, mouse holding a mouse
button down (mouse dragging) changes the view of the crystal while holding Alt
down while dragging the mouse, moves the rigid body within the cell. We will
use this to c
To insert a
second copy of the rigid body into the structure, again select the “Edit
Body”/“Locate & Insert Rigid Body” menu item.
A window is displayed with options to define where the rigid body will be
placed
Hold the Alt
key down and drag with the middle mouse button to rotate the
rigid body by 180 degrees, as shown below.
With the Alt
key down and drag with the right mouse button and move the
rigid body so that the phenyl rings are aligned.
To see how
far the rings are separated in the screen z direction, we must rotate the view.
(Without pressing Alt) drag the mouse upwards with the left
button down to obtain a view similar to the one below.
From this
view it is clear that a small rotation is needed and a small translation. First
with the Alt and the middle mouse button both pressed, drag
upwards until the rings molecules are more parallel.
Then with the the Alt and the right mouse
button both pressed, drag upwards until the molecules are better overlapped.
At the
point the rigid body is pretty close to the initial position, with distances in
the table ranging from 0.1 to nearly 2 Å, but better agreement can be obtained
by making small adjustments with the Alt down and successively pressing each of
the three mouse buttons alternated with changing the view to see the molecules
from a different perspective. Another choice is to assign several atoms and
optimize, as was done in steps 4D and 4E. Choices of atoms to optimize are
clear since the atoms are properly paired in the table. Updating the C3-C5
torsion to ~30 degrees will also improve the agreement.
When the agreement
is sufficient, press Add, accept the warning and this molecule is also now
constrained by the rigid body.
If the
number of cycles is set to zero (in the Controls tree item), the fit will be
seen to have gotten significantly worse, with the profile wR
jumping from 2.2% with the distorted model to 5.8%. This is not surprising, but
a few refinement steps will provide a more realistic model with comparable
agreement but far fewer parameters.
Initially, set
all atoms to be refined as U only. (Select the Atoms tab,
double-click on the refine column label and then select U. In fact, this change is cosmetic, as the X
flag for the atoms in rigid bodies is ignored.
Then select
the RB Models tab. The first rigid body is automatically selected.
Select to refine the Origin and the rotation angle/vector with
the AV flag.
Then select
the second rigid body and repeat the previous actions (setting the refine
checkbox and selecting AV for the orientation refinement.)
Set the number
of cycles back to 5 and refine. The fit will converge quickly to an wR of 4.0%.
Add
refinement of the four torsion angles for each of the two rigid
bodies. The wR quickly drops to 2.5%
The
motion of rigid bodies has been worked out (Schomaker
& Trueblood, Acta Cryst, 1968) known as the TLS
description, which consists of three tensors: for translation (movement of all
atoms together, described with the T tensor), libration
(rotation of all atoms as a unit around the rigid body origin, described with
the L tensor) and correlated librational-translational
motion (screw-like motion, described with the S tensor). In the full expansion
there are 6 T and 6 L and 8 S terms. It is unlikely that all of these terms can
be used with powder data. The S terms can be readily omitted if the rigid body
origin is at the body center of mass. This is not exactly true here, but the
origin is close to the center of mass.
The
simplest description that can be made with a rigid body is to assume that all
motion is translational and isotropic. This is equivalent to constraining all
atoms in the rigid body to have the same Uiso value,
which has already been implemented in this refinement via constraints. A
simpler way to set this up would be to select “Uiso”
for the rigid body thermal motion model. This is shown in 7A-7C, below.
Delete the Uiso constraints by clicking on the
constraints tree item and deleting each item. Then click on phase and
the Atoms tab and remove the U flag for all atoms but the last two
(water O atoms).
In the RB
Models refine phase tab, set the thermal motion model to Uiso for both rigid bodies and turn on the refine
checkbox to optimize the group Uiso value.
Carry out a
refinement to confirm that the profile Rw returns to
2.5%.
It makes
sense to see if further expansion of the rigid body motion/disorder description
will produce an even better fit, although it is clear that the crystallographic
fit is already quite good considering the limited resolution and high
background levels. Refining 24 more parameters (6 T + 6 L x 2 bodies) is
clearly impractical. One possible simplification would be to refine only the
diagonal T & L elements (which constraints translation and libration directions to crystal axes). That might be
appropriate with data more sensitive to group motion, but is still impractical
here. One still simpler model is to include translation and libration,
but make then isotropic. This can be done by refining only the diagonal T and L
terms, but constrain them to be equal for both groups. This means replacing the
2 group Uiso values with 2 terms (a single grouped Tjj and a Ljj
term). This is easily done as shown below.
Switch to
refine the T & L terms by selecting each body, in the RB Models
phase tab, set the thermal motion model to TL for both rigid bodies and
then turn on the refine checkbox for the T11, T22, T33, L11,
L22 & L33 terms.
Create constraints on the Tjj and Ljj
terms through use of the Constraints tree item. To constrain the Tjj terms, one needs to know their variable name
is RBRTjj:b:0 for body #b of type
0.
On the Phase
tab (loaded by default), use the “Edit Constr.”/“Add
Equivalence” menu item and enter RBRT in the search filter.
The first term is one of the six that needs to be constrained, so select it as
the 1st variable and press OK.
On the next
window select the remaining 11, 22 and 33 terms
and press
OK:
Create constraints on the Ljj terms through use of the Constraints
tree item as was done before. Their variable name is RBRLjj:b:0.
Again use the “Edit Constr.”/“Add Equivalence” menu item and enter RBRL
in the search filter. The first term is one of the six that needs to be
constrained, and on the next screen select the remaining 11, 22 and 33 terms
and press OK. Constraints should appear as:
Then refine
with these parameters and the Rwp remains at 2.5%,
but I would argue this is a more reasonable model than the Uiso
model as the thermal ellipsoids are plotted, they get larger further away from
the origin.
An even
more relaxed model would be to allow the Tjj
and Ljj terms to differ between the two
bodies, with constraints as shown below, but this produces no improvement in
the Rwp. Further it has a non-physical negative Ljj value for one of the two bodies, so this is
not a better model.