Mark R. Peterson
Structural Chemistry Section, Department of Chemistry,
University of Manchester, Oxford Road, Manchester,
M13 9PL, England, U.K.
(Current Address: Wellcome Sciences Institute, Department of Biochemistry,
University of Dundee, Dundee, DD1 4HN, Scotland, U.K.)
Multiwavelength anomalous dispersion (MAD) methods were used to analyse the
crystal structure of d(CGCGBrCG), extending the work presented in Peterson,
Harrop, McSweeney, Leonard, Thompson, Hunter and Helliwell (1996)
J. Synch. Rad. 3, 24-34.
The brominated oligonucleotide d(CGCGBrCG) of chemical formula
crystallises in space group
with unit cell dimensions a=11.97, b=30.98, c=44.85 Å,
.
It was chosen as a test crystal to evaluate the MAD method itself and to
commission station PX9.5 for several reasons: it was radiation insensitive; it
had a high concentration of anomalous scatterers, i.e. two bromines in two
hundred and forty light atoms; and the bromine K edge was very near the
critical wavelength of the SRS wiggler output. It also diffracted
strongly, owing to the relatively small unit cell, in spite of the rather small
crystal volume. Data to a resolution of 1.65 Å were collected at four
wavelengths about the bromine atom K absorption edge using synchrotron
radiation at Station PX9.5, SRS, Daresbury. Traditionally, the maximum of f "
does not coincide with the minimum in f '; in this case, however, both were
observed in the same data set,
.
Hence
and
could be maximised using only two wavelengths. Various phasing strategies
based on different wavelength combinations were then studied, ranging from four
to two wavelengths. Density modification (DM) phase improvement procedures were
also applied to these combinations, giving highly interpretable maps even for
unoptimised two-wavelength cases.
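The remark about the concentration of anomalous scatterers can be made semi-quantitative with the commonly quoted Hendrickson-type estimates of the expected Bijvoet and dispersive diffraction ratios. The sketch below is illustrative only: the f " and f ' difference values are assumed round numbers (not the values measured in this work), and the effective atomic number Z_eff = 6.7 is the figure usually quoted for protein atoms, so it is only approximate for a nucleic acid.

```python
import math


def bijvoet_ratio(n_anom, n_total, f_double_prime, z_eff=6.7):
    """Estimated <|dF_anom|>/<|F|> arising from the absorptive f'' term."""
    return math.sqrt(2.0 * n_anom / n_total) * f_double_prime / z_eff


def dispersive_ratio(n_anom, n_total, delta_f_prime, z_eff=6.7):
    """Estimated <|dF_disp|>/<|F|> arising from a change in f' between wavelengths."""
    return math.sqrt(n_anom / (2.0 * n_total)) * abs(delta_f_prime) / z_eff


# Composition quoted in the text: two bromines among 240 light atoms.
N_BR = 2
N_LIGHT = 240

# ASSUMED illustrative scattering factors in electrons (not from the source).
F_DOUBLE_PRIME = 4.0
DELTA_F_PRIME = 4.0

anom = bijvoet_ratio(N_BR, N_LIGHT, F_DOUBLE_PRIME)
disp = dispersive_ratio(N_BR, N_LIGHT, DELTA_F_PRIME)

print(f"Estimated Bijvoet ratio:    {100 * anom:.1f} %")
print(f"Estimated dispersive ratio: {100 * disp:.1f} %")
```

Under these assumptions both ratios come out at several per cent of the mean structure factor amplitude, which is consistent with the description of the crystal as a favourable test case.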
Upon inspection of a test diffraction image, it could be seen that the crystal
was relatively well aligned, i.e. the Bijvoet mates could be measured on the
same or adjacent images. No further crystal alignment was undertaken. The
wavelengths for the diffraction measurements were chosen to optimise the
phasing power by (a) maximising the f " effect and (b) maximising the change
in f ' between different wavelengths for each hkl. Hence, four wavelengths were chosen:
(1) a reference on the long wavelength side of the edge
;
(2) at the absorption edge inflection point
;
(3) at the "white line" absorption maximum
;
(4) a reference on the short wavelength side of the edge
.
The choices of
and
follow what are known as f ' dip and f " max respectively.
At each wavelength the crystallographic data were collected as a series of
images, each involving a 4° rotation of the crystal. For each 4° sweep the
total exposure time was 60 seconds. In total 120° of data were
collected for each of the four wavelengths. A fifth data set, covering another
120°, was then collected on the same crystal, immediately after the MAD
data, at the "white line" (i.e.
)
but with the crystal misaligned by offsetting one of the goniometer head arcs
by approximately 30°. This allowed reflections previously in the
blind region to be measured and combined with the
data set.
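The data collection figures quoted above imply a simple budget of images and exposure time. The short calculation below only restates that arithmetic; it counts exposure time alone and ignores detector readout, wavelength changes and other overheads, which are not given in the text.

```python
# Figures quoted in the text.
SWEEP_DEG = 4              # rotation per image
EXPOSURE_S = 60            # exposure time per 4 degree sweep
RANGE_DEG = 120            # total rotation range per data set
N_MAD_WAVELENGTHS = 4      # MAD data sets
N_DATA_SETS = 5            # MAD data sets plus the misaligned "white line" set

images_per_set = RANGE_DEG // SWEEP_DEG
exposure_per_set_min = images_per_set * EXPOSURE_S / 60

total_images = images_per_set * N_DATA_SETS
total_exposure_min = exposure_per_set_min * N_DATA_SETS

print(f"Images per data set:        {images_per_set}")
print(f"Exposure per data set:      {exposure_per_set_min:.0f} min")
print(f"Images over all data sets:  {total_images}")
print(f"Exposure over all data sets: {total_exposure_min:.0f} min")
```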
Merging statistics from the five data sets are displayed in Table 1.
The weak and negative intensities were made consistent with a Wilson
distribution of structure factor amplitudes using TRUNCATE (CCP4). The computer
programs CAD and SCALEIT (CCP4) were employed to combine the five data sets
into one file and to put them on an overall common scale. This was done with
respect to
,
which was treated as the 'native'. It was indeed found that
had the largest mean fractional isomorphous difference (MFID) with respect to
all other data sets.
SCALEIT provides useful estimates of the largest acceptable dispersive
and absorptive differences between and within the different data
sets. Owing to the sensitivity of Patterson methods to spurious large
differences, it was important to reject any unacceptably large differences as
outliers. The final SCALEIT statistics are shown in Table 2.
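The two ideas in this step, a mean fractional isomorphous difference between data sets and the rejection of unacceptably large differences, can be sketched as follows. This is a minimal illustration with made-up amplitudes and a hypothetical cut-off; it does not reproduce the scaling or the rejection criteria actually implemented in SCALEIT.

```python
def mfid(native, derivative):
    """Mean fractional isomorphous difference: sum|Fd - Fn| / sum(Fn)."""
    total_diff = sum(abs(fd - fn) for fn, fd in zip(native, derivative))
    return total_diff / sum(native)


def reject_outliers(native, derivative, max_diff):
    """Keep only reflection pairs whose amplitude difference is within max_diff."""
    kept = [(fn, fd) for fn, fd in zip(native, derivative)
            if abs(fd - fn) <= max_diff]
    kept_native = [fn for fn, _ in kept]
    kept_derivative = [fd for _, fd in kept]
    return kept_native, kept_derivative


# Made-up amplitudes for four reflections (hypothetical, for illustration only).
f_native = [100.0, 50.0, 80.0, 20.0]
f_other = [104.0, 47.0, 82.0, 60.0]   # the last pair is a spurious large difference

MAX_DIFF = 10.0  # hypothetical cut-off

before = mfid(f_native, f_other)
clean_native, clean_other = reject_outliers(f_native, f_other, MAX_DIFF)
after = mfid(clean_native, clean_other)

print(f"MFID before rejection: {before:.3f}")
print(f"MFID after rejection:  {after:.3f}")
```

A single spurious pair dominates the statistic in this toy example, which is why such differences would also dominate a difference Patterson map if they were left in.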
Dispersive and absorptive Patterson maps were then generated with FFT.
The bromine sites could be readily identified using both the anomalous and
dispersive Patterson maps. From the three Harker sections in both maps, two
consistent bromine sites could be easily found. The quality of these Patterson
maps can be seen in Figures 1 and 2. The positions of the two bromine sites
were 0.3241, 0.2009, 0.0100 (Site A) and 0.5010, 0.1807, 0.2310 (Site B)
respectively.
Each bromine atom had its coordinates,
temperature factors and occupancies (both real and anomalous) refined in
MLPHARE (CCP4) for ten cycles. The refined positions of the two bromine atoms
were used in MLPHARE on both hands
.
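Testing both hands amounts to repeating the phasing with the heavy atom sites inverted. The sketch below generates the opposite-hand coordinates for the two bromine sites listed above by inversion through the origin. It assumes a space group for which a simple inversion (x, y, z) to (-x, -y, -z) is the appropriate operation, i.e. one that is not a member of an enantiomorphic pair and needs no origin shift; the space group symbol itself is not reproduced in this text, so this is an assumption.

```python
def invert_site(site, ndigits=4):
    """Invert fractional coordinates through the origin and wrap into [0, 1)."""
    return tuple(round((-x) % 1.0, ndigits) for x in site)


# Bromine sites in fractional coordinates, as quoted in the text.
SITE_A = (0.3241, 0.2009, 0.0100)
SITE_B = (0.5010, 0.1807, 0.2310)

site_a_inverted = invert_site(SITE_A)
site_b_inverted = invert_site(SITE_B)

print("Site A, opposite hand:", site_a_inverted)
print("Site B, opposite hand:", site_b_inverted)
```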
MLPHARE also treats the data sets collected at different wavelengths as
isomorphous derivatives with one data set being chosen as the 'native'. To
maintain a consistent positive dispersive difference between the other data
sets, the f ' dip data set (
)
was chosen as the native. Dispersive differences between
and the other data sets give rise to isomorphous differences, especially
and
with respect to
which were treated as apparent real occupancies of the anomalous scatterers.
For the 'native' data set (
)
the real occupancies of the anomalous scatterers were fixed to zero initially.
The figures of merit of the MAD phases, using all four wavelengths (excluding
the
data set), were 0.86/0.82 to 1.65 Å resolution for the acentric/centric
data respectively, for both hands. The f ' and f " anomalous scattering factors
were added to the form factor list, both being arbitrarily set equal to one
electron so that the real and anomalous occupancies corresponded to the number
of electrons involved in the dispersive and absorptive differences
respectively, the data sets having previously been placed on a common absolute
scale via SCALEIT and TRUNCATE. Table 3 gives the relevant phasing statistics for each
derivative against the native (
)
and also compares the theoretical values of the anomalous coefficients f ' and
f " (Sasaki, 1989) at each wavelength with the coefficients extracted at each
wavelength via the occupancies in MLPHARE.
The phases from MLPHARE were then combined with the structure factor amplitudes
from the
,
native data set, and a MAD electron density map was calculated via FFT
(CCP4). The MAD maps were calculated on both hands (Figs. 3(a) and (b)) at
1.65 Å resolution. The figures of merit for the two sets of phases do not
distinguish between the correct and incorrect enantiomers. The problem is only
resolved upon inspection of the MAD electron density maps for "chemical sense".
That is, the map calculated on the correct hand (Fig. 3(a)) showed the bases
clearly, and building of the model with O (Jones et al. (1989)) could be
easily started from the known heavy atom positions. The map calculated on the
wrong hand was totally uninterpretable (Fig. 3(b)).
In the variation of f ' and f " with wavelength, only two wavelengths
need to be measured to yield a
at one wavelength and a change via
of F between the two wavelengths (Okaya and Pepinsky (1955); Hoppe and
Jakubowski (1975); and Helliwell (1979)). The choice of wavelengths to maximise
and
was made with reference to the fluorescence spectrum. A key objective is to
make the centres of the phasing circles in the Harker phasing diagram well
separated and non-collinear,
which is a necessary and sufficient condition for phasing (Helliwell (1984)).
Traditionally, the maximum of f " is not coincident with the minimum in f '.
Hence, three wavelengths would be needed in such a situation for fully moving
the centres of the phasing circles apart. In this study however, although
was expected to have the largest Friedel anomalous difference, in fact that was
the case for the
(f ' dip) data set (e.g. see Ranom values in Table 1). In light of
being the f " maximum,
was taken as 'native' to confirm if
was indeed at the f ' dip. This was done by comparing MFIDs between data sets
where
then
are taken as the 'native' data sets. It was indeed found that
had the largest MFID between all other data sets (see Table 2). In such a case,
where the f " maximum and the f ' minimum are both observed in
the same data set, i.e.
,
one data set becomes essentially redundant i.e.
in making the biggest anomalous differences. Hence, various alternative
strategies of
combinations were investigated.
The following analysis falls into three categories, involving data sets recorded at four, three and two wavelengths respectively, in a variety of combinations, to explore both experimental strategies for phasing and theoretical/computational strategies of phase improvement (see Figure 4 and Table 4 for the respective map quality and FOMs). The experimental strategies were published in Peterson et al. (1996).
Case 1:
This combination of wavelengths is the case described previously where the f "
anomalous effects of each wavelength are all utilised along with the
isomorphous effects between
and each of the other three wavelengths. The map was of excellent quality and
structural moieties could be easily characterised.
Case 2:
This three-wavelength case and the next compare the two possible
choices of reference wavelength. Sometimes, owing to lack of SR beam time and/or
prolonged exposure times, it may only be feasible to collect data at three
wavelengths. The reference wavelength,
has no anomalous signal as it is situated on the long wavelength side of the Br
K edge. The map, however, was of excellent quality and could be easily
characterised.
Case 3:
The reference wavelength,
has a good anomalous signal as it is situated on the short wavelength side of
the absorption edge, unlike
.
The overall figure of merit was certainly improved compared with case 2. The
map was again of excellent quality and could be easily characterised.
Case 4:
The theoretical minimum case for unique phase determination involves two
wavelengths. This is akin to the 'two-short-wavelength-method'
of Hoppe and Jakubowski (1975). It is required that the centres of the phasing
circles be well separated and non-collinear,
and this is achieved well here (Helliwell (1984)). The
pairing has the largest dispersive difference, whilst,
also has the maximum Friedel difference. The electron density map was of high
quality and totally interpretable.
Case 5:
This combination of wavelengths, prompted by correspondence from D. H.
Templeton, was used to see whether the map could be phased with two extremely
close wavelengths (only 0.0007 Å apart) that might be adversely affected by
dichroism effects. Also the
pairing has half the dispersive signal compared to the theoretical minimum,
case 4,
.
However,
has the largest anomalous difference whereas
has the next largest anomalous difference.
The principle of density modification (DM) is to improve the
experimental phases by imposing restrictions on the density in real space and
then using the phases of the modified map to alter or replace the experimental
phases. In protein crystallography these are important methods for phase
improvement. Moreover they may be applied so as to reduce the number of
wavelengths needed in a MAD phase determination experiment and/or use
wavelengths very close in value, but with reduced (less optimal) values of f "
or
.
The map modification process embodied in the program DM (Cowtan (1994)) was
used on the various wavelength phasing combinations.
Case 1: Density Modified
The quality of the original map was very good; however, DM improved the map quality around all the bases. All bases now had well defined, complete electron density apart from base 7, which still lacked connectivity at one bond.
Case 2: Density Modified
Seven bases (1, 3, 8, 9, 10, 11 and 12) that originally had incomplete density (side chains missing or lack of connectivity) improved sufficiently to show well resolved, connected density. The remainder of the bases, which had previously suffered from a lack of connectivity, were still not significantly altered.
Case 3: Density Modified
Eight bases (1, 3, 4, 8, 9, 10, 11 and 12) that originally had incomplete density (side chains missing or lack of connectivity) improved sufficiently via DM to show well resolved, connected density. The remainder of the bases, which suffered from a lack of connectivity, were not significantly altered.
Case 4: Density Modified
Eight bases (3, 4, 6, 8, 9, 10, 11 and 12) which were defined by density lacking connectivity at at least one bond now showed well defined, connected density after DM. The remaining four bases showed a clear improvement in density quality, e.g. base 1 now has the nitrogenous side chain defined.
Case 5: Density Modified
The original map had most structural moieties in the correct position. DM increased the map quality considerably, so much so that all the bases were easily characterised. Bases 3, 4, 5, 10 and 11 now had well defined, connected density, compared with the lack of connectivity at these positions in the original map. Bases 1, 6, 9 and 12 showed improved density, whereas bases 7 and 8 were still interpretable but were slightly better defined in the original map. Base 2 showed no significant change in density. As might be expected, this modified map was not of as high a quality as the modified case 4 map.
alone yields the largest f " value, as expected from theory, if not from the
Kronig-Kramers
transform curve. Hence, the choice of two wavelengths, a reference wavelength,
or
with
,
whilst being the theoretical minimum number of wavelengths, also yielded the
biggest
and f " differences in the diffraction data. The use of two wavelengths
may be of interest when the concentration of anomalous scatterers in the system
is high and when a three or four wavelength data collection strategy is
not favourable (e.g. because beam time is restricted or long exposure times per
diffraction image are needed).
Density modification was then considered for the various wavelength scenarios.
There is a special interest in the two wavelength cases which simplify the
experimental and beamline needs. Key points are further discussed now. The
already good map quality in the
phasing combination was reinforced further after the DM procedure and structure
solution became even easier. The isomorphous difference between data sets
and
is half that of the previous two cases mentioned above, 3.68 electrons, but
this is generated by a change in wavelength of only 0.0007 Å. The advantage
of this is that the position of the beam incident on the sample would be
essentially identical for the two wavelengths. The original
phases and
map were of only reasonable quality before the DM procedures. The DM phases produced a highly interpretable map in which the structure could be easily solved. Thanks to these modification procedures, structure solution can thus be achieved even when the isomorphous signal is not optimised. Overall, DM could perhaps be further enhanced if the electron density 'data bank' used for histogram matching consisted of nucleic acid density instead of protein density (which had to be used here). In essence, a key result is that cases 1 to 4 become comparable in terms of the FOMs of the phases after DM.
In Peterson et al. (1996), it was reasoned that dichroism effects were
not evident in the f ' and f " values, in essence because the maximum f " and
differences were induced with respect to
in agreement with theory but somewhat unexpectedly. However, it was pointed out
by David Templeton (pers. comm.) that, for the two independent Br sites (A and B)
in the crystallographic asymmetric unit, there did appear to be a variation
between the f ' and f " values of the two sites, which had a maximum at
.
Hence, at
the effect of the different atomic environments of the A and B sites might
explain this, in a similar way to the previously reported bromide example of
Templeton and Templeton (1995), in which there was a very marked edge shift of
0.00031 Å between the parallel and perpendicular polarisation components
(estimated from figure 3 of that paper). Therefore, the
,
pair in the analysis would suffer the most if dichroism were present to a
large degree. Since Figure 5(b) shows good quality phasing and a good quality
electron density map, it can be concluded that dichroism was not a major factor
in the f ' and f " values that we have encountered. Nevertheless, further
experiments are planned to explore the values of f ' and f " at finer
sampling and to investigate dichroism, which must be present to some degree.
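To put the two small wavelength separations quoted in this work (0.0007 Å between the closest pair of wavelengths, and the 0.00031 Å edge shift estimated from Templeton and Templeton (1995)) on an energy scale, they can be converted using E = hc/λ. The sketch below assumes the tabulated bromine K edge energy of about 13.474 keV as the reference point; the actual wavelengths used in the experiment are not reproduced in this text, so the result is indicative only.

```python
HC_KEV_ANGSTROM = 12.39842   # hc in keV * Angstrom
BR_K_EDGE_KEV = 13.474       # ASSUMED tabulated Br K edge energy (not from the source)


def wavelength_from_energy(energy_kev):
    """Wavelength in Angstrom for a photon energy in keV."""
    return HC_KEV_ANGSTROM / energy_kev


def energy_shift_ev(wavelength, delta_wavelength):
    """Energy change in eV when the wavelength increases by delta_wavelength."""
    e_start = HC_KEV_ANGSTROM / wavelength
    e_end = HC_KEV_ANGSTROM / (wavelength + delta_wavelength)
    return (e_start - e_end) * 1000.0


edge_wavelength = wavelength_from_energy(BR_K_EDGE_KEV)

shift_close_pair_ev = energy_shift_ev(edge_wavelength, 0.0007)
shift_dichroism_ev = energy_shift_ev(edge_wavelength, 0.00031)

print(f"Assumed Br K edge wavelength: {edge_wavelength:.4f} A")
print(f"0.0007 A near the edge  ~ {shift_close_pair_ev:.1f} eV")
print(f"0.00031 A near the edge ~ {shift_dichroism_ev:.1f} eV")
```

On this scale the two closest wavelengths are separated by roughly twice the polarisation-dependent edge shift cited above, which illustrates why that wavelength pair was regarded as the one most exposed to any dichroism.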
In summary, this work successfully evaluated and compared a variety of MAD experimental and computational procedures for phase improvement. It provides guidance in planning future experiments and/or new instruments, and is therefore a significant contribution to the methods of protein crystal structure determination. Aspects of the work are published in Peterson et al. (1996).
Thanks for discussions with J. R. Helliwell, W. N. Hunter and G. A. Leonard.
Thanks also to S. J. Harrop and S. M. McSweeney for data collection assistance
at Station 9.5 SRS, Daresbury. Correspondence on possible dichroism in the f'
and f " values at
and
was between D. H. Templeton and J. R. Helliwell.
CCP4 (1994) Acta Cryst. D50, 760-763.
Sasaki, S. (1989) KEK Report 88-14, Tsukuba 305, Japan.
Okaya, Y. and Pepinsky, R. (1955) Phys. Rev. 98, 1857-58.
Hoppe, W. and Jakubowski, U. (1975) In Anomalous Scattering, 437-61.
Helliwell, J. R. (1979) Daresbury study weekend, DL/SCI/R13, 1-6.
Helliwell, J. R. (1984) Reports on Progress in Physics 47, 1403-1409.
Peterson, M. R. et al. (1996) J. Synch. Rad. 3, 24-34.
Cowtan, K. (1994) Newsletter on protein crystallography, 31, 34-38.
Templeton, D. and Templeton, L. (1995) J. Synch. Rad. 2, 31-35.