------------ CCP4 Newsletter - June 1996 ------------

Electron density modification with Solomon

J.P. Abrahams

MRC Laboratory of Molecular Biology,
Hills Road,
Cambridge,
CB2 2QH, U.K.
E-mail: jpa@mrc-lmb.cam.ac.uk

The structure determination of F1 ATPase prompted the development of several new procedures for electron density modification [1]. This resulted in a program, called "Solomon", which has allowed the solution of at least 20 different structures over the last 18 months. The main purpose of this short notice is to draw attention to its inclusion in the CCP4 suite, and to give a brief overview of some of the concepts which, until recently, were unique to Solomon.

Density modification - Why and How

After experimentally determining a phase probability distribution for each of the measured structure factors with techniques like SIR, MIR, or MAD, one can use these distributions to calculate the "best" phases and figures of merit. In practice this is done for each structure factor by an angular integration of its phase probability distribution. For example, a centric structure factor can have two possible phases, and if the probability of it having a phase of 0° is 0.75, its "best" phase will be 0°, and its figure of merit will be (0.75 - 0.25). The "best" phases and figures of merit of non-centric structure factors are calculated similarly in the complex plane.
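
The calculation described above can be illustrated with a small sketch. It assumes the phase probability distribution has been sampled at a set of discrete phase angles; the function name and its arguments are illustrative and are not part of any CCP4 program.

```python
import numpy as np

def best_phase_and_fom(phases_deg, probabilities):
    """Return the "best" (centroid) phase in degrees and the figure of merit."""
    p = np.asarray(probabilities, dtype=float)
    p = p / p.sum()                              # normalise the distribution
    phi = np.deg2rad(np.asarray(phases_deg, dtype=float))
    # Probability-weighted mean of unit vectors in the complex plane.
    centroid = np.sum(p * np.exp(1j * phi))
    fom = abs(centroid)                          # length of the centroid vector
    best = np.rad2deg(np.angle(centroid)) % 360.0
    return best, fom

# Centric example from the text: P(0 deg) = 0.75, P(180 deg) = 0.25.
best, fom = best_phase_and_fom([0.0, 180.0], [0.75, 0.25])
print(best, fom)
```

For the centric example this gives a figure of merit of 0.5, i.e. (0.75 - 0.25), and a best phase of 0° to within rounding error. For a non-centric structure factor the same function can be fed a finely sampled distribution over 0° to 360°.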

However, usually more information is available. Rather than being information about individual structure factors, as measured from experimental differences in intensity, this information pertains to the way the structure factors interact with one another, and can be formulated as a set of constraints in real space. Solomon imposes constraints on solvent flatness, non-crystallographic symmetry, and on the density distribution within the ordered parts of the crystal.

In practice, one first calculates a map using the "best" phases and figure-of-merit weighted structure factor amplitudes. Then one modifies the resulting map according to real-space constraints, and from this map calculates new, modified structure factor amplitudes and phases. The extra information contained within these modified structure factors is then combined with the original phase probability distribution to produce better estimates. The process can be iterated. To summarise, the advised procedure in conjunction with Solomon is:

1. Calculate initial map (FFT)
2. Introduce real-space constraints by modifying the map (Solomon)
3. Calculate structure factors of the modified map (SFALL)
4. Combine experimental phase information with phase information from the new structure factors (SIGMAA)
5. Calculate a new map using these updated "best" structure factors (FFT)
6. Stop or go back to step 2
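
The control flow of this cycle can be sketched as a simple loop. This is only a schematic: the four callables stand in for the roles played by Solomon, SFALL, SIGMAA and FFT, and their names and signatures are hypothetical. The script distributed with Solomon runs the real programs instead.

```python
def run_density_modification(initial_map, modify_map, map_to_sf,
                             combine_phases, sf_to_map, n_cycles):
    """Iterate steps 2 to 5 of the procedure a fixed number of times."""
    density = initial_map                    # step 1: initial map (role of FFT)
    for _ in range(n_cycles):
        modified = modify_map(density)       # step 2: real-space constraints (role of Solomon)
        f_mod = map_to_sf(modified)          # step 3: structure factors of modified map (role of SFALL)
        f_best = combine_phases(f_mod)       # step 4: combine with experimental phases (role of SIGMAA)
        density = sf_to_map(f_best)          # step 5: new map (role of FFT)
    return density                           # step 6: stop after n_cycles
```

Here the stopping rule is a fixed number of cycles, which is an assumption of the sketch; in practice one would monitor the statistics of each cycle.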

Solomon is a program which modifies electron density maps and as such is responsible for only one step of the cycle outlined above, unlike programs like SQUASH [2] or DM [3], which essentially incorporate all the above steps, although quite different computational procedures may be used. Either approach has advantages and disadvantages. Solomon works in concert with several CCP4 programs (FFT, SFALL & SIGMAA), and it is distributed with a script which will correctly go through the cycle as outlined above. However, more complicated crystallographic problems will require a more sophisticated approach, in which case the user will have to amend the script.

Solvent density modification by Solomon

Solvent flattening is a standard technique for phase improvement and it involves three separate steps: location of the solvent, modification of the solvent, and the combination of the modified map with experimental phase information.

One should realise that the solvent will have a mean electron density which is very similar to that of the protein, and that solvent and protein are mainly distinguished by the relative featurelessness of the former, and the undulating landscape of the latter. Solomon is unique in its way of locating the solvent by determining a new map, in which at every grid point the local standard deviation of the original map is stored. The user will have to specify the radius within which the local standard deviation is determined, and it was found that a radius slightly larger than the maximum resolution of the map is optimal in most cases. After calculating such a map, Solomon suggests a contour level which will show the protein mask, given a certain solvent content, allowing inspection on a graphics workstation. Solomon uses this map to construct a solvent mask by excluding small islands of protein. If requested, this mask can be stored as an old-style "O" [4] or CCP4 mask and manipulated as any other mask. It is also possible to use solvent masks which were generated in other ways, or were edited by the user.
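
The idea of locating the solvent from the local standard deviation can be sketched as follows. This is an illustration only and not Solomon's code: it assumes the map covers one unit cell on a regular grid (so that it can be treated as periodic), expresses the radius in grid points rather than in Angstrom, and omits the removal of small islands of protein. All function names are hypothetical.

```python
import numpy as np

def local_std_map(density, radius_vox):
    """Standard deviation of the map within a sphere around every grid point.

    The map is treated as periodic, as it is for a full unit cell.
    """
    shape = density.shape
    # Signed grid offsets (0, 1, ..., -2, -1) so the sphere is centred on the origin.
    axes = [np.fft.fftfreq(n, d=1.0 / n) for n in shape]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    kernel = (gx**2 + gy**2 + gz**2 <= radius_vox**2).astype(float)
    kernel /= kernel.sum()
    kernel_ft = np.fft.fftn(kernel)

    def sphere_average(a):
        # Periodic convolution with the spherical kernel via the FFT.
        return np.fft.ifftn(np.fft.fftn(a) * kernel_ft).real

    mean = sphere_average(density)
    mean_sq = sphere_average(density**2)
    variance = np.clip(mean_sq - mean**2, 0.0, None)   # guard against rounding
    return np.sqrt(variance)

def solvent_mask(std_map, solvent_fraction):
    """Flag the given fraction of grid points with the lowest local deviation."""
    cutoff = np.quantile(std_map, solvent_fraction)
    return std_map <= cutoff
```

The cutoff returned by the quantile plays the role of the contour level that separates protein from solvent for a given solvent content.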

The solvent masks determined from the local standard deviation of the map have a higher resolution than masks determined by the method suggested by Wang [5]. The higher accuracy of the masks was found to be beneficial.

After locating the solvent, it can be modified. In conventional density modification, the density at every grid point of the solvent is replaced by the mean density of the solvent, but Solomon also allows different types of modification. The density within the solvent can be scaled as follows:

r_mod = (r_i - r_mean) * kflip + r_mean + sadd        (1)

where r_mod is the modified electron density at grid point "i", r_i the original density at grid point "i", r_mean the mean solvent density, kflip the solvent multiplier, and sadd a constant to be added to all solvent density.

It is evident that setting the solvent multiplier kflip to zero is equivalent to flattening the solvent. Setting it to a negative value will "flip" features within the solvent, and it turns out that doing so is desirable in many cases. The constant sadd can be used to reconstruct low resolution features of the electron density, and by setting it to a negative value, the density of the protein can be "lifted" slightly above that of the solvent. This feature is used in conjunction with protein density truncation and structure factor reconstruction (see below).
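
Equation (1) can be written out as a short function. This is a minimal sketch, assuming the map is a NumPy array and the solvent mask a boolean array of the same shape; the function name is illustrative.

```python
import numpy as np

def modify_solvent(density, solvent_mask, kflip=0.0, sadd=0.0):
    """Apply equation (1) to the solvent grid points; protein is left untouched."""
    out = np.array(density, dtype=float, copy=True)
    mask = np.asarray(solvent_mask, dtype=bool)
    r_mean = out[mask].mean()                          # mean solvent density
    out[mask] = (out[mask] - r_mean) * kflip + r_mean + sadd
    return out

rho = np.array([1.0, 3.0, 10.0, 20.0])
solvent = np.array([True, True, False, False])
flattened = modify_solvent(rho, solvent, kflip=0.0)    # solvent set to its mean
flipped = modify_solvent(rho, solvent, kflip=-1.0)     # solvent features inverted
```

With kflip = 0 both solvent points become 2.0 (the solvent mean); with kflip = -1 the values 1.0 and 3.0 are mirrored about the mean to 3.0 and 1.0.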

The main benefit of flipping solvent features becomes apparent upon combining the modified structure factors with the experimental phase probability distribution. The intricacies of the recombination are beyond the scope of this short report, but an attempt will be made to give the reader a flavour of the sort of difficulties associated with this computation. It can be shown that, provided the sources of information are independent, the optimal way of combining the information from model structure factor amplitudes with experimentally determined ones is through sigmaA-weighting [6]. In the case of density modification, the information carried by the modified structure factors is not strictly independent of the experimental data, but with current methodology it is not possible to calculate how dependent the information actually is. In fact, the degree of independence will vary from structure factor to structure factor and will also depend crucially on the restraints imposed in real space. As a result of treating the sources of information as independent, the recombined data will be biased. By flipping the solvent instead of flattening it, the modified structure factor amplitudes are made more different from the original ones, and therefore they will appear to be more independent. Any (accidental) improvement of the phases will result in a more featureless solvent, and in the next iteration there will be less density to flip. As a result, the solvent does get flatter as the flipping procedure is iterated, not so much because the solvent is biased to be flat, but rather because of other phase improvements. Solvent flattening cannot be iterated in a similar fashion, but it is entirely possible (and desirable) to flatten the solvent on the very last cycle of the solvent flipping procedure: the bias introduced at this point will not be propagated.
It was found that the reduction in the R-factor between the experimental structure factor amplitudes and the modified structure factor amplitudes on the very last flattening cycle is an accurate indicator of the overall phase improvement.
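
This note does not spell out the R-factor. A minimal sketch, assuming the conventional definition R = sum| |Fexp| - s|Fmod| | / sum|Fexp| with a single overall scale factor s (the choice of scale is an assumption of this sketch):

```python
import numpy as np

def r_factor(f_exp, f_mod):
    """Conventional R-factor between two sets of structure factor amplitudes."""
    f_exp = np.abs(np.asarray(f_exp, dtype=float))
    f_mod = np.abs(np.asarray(f_mod, dtype=float))
    scale = f_exp.sum() / f_mod.sum()      # assumed: one overall scale factor
    return np.abs(f_exp - scale * f_mod).sum() / f_exp.sum()
```

Comparing this value before and after the final flattening cycle gives the reduction referred to above.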

There is a relationship between the solvent content of the crystal and the optimal value for kflip. The higher the solvent content, the less negative kflip should be. If the solvent content is about 30%, kflip should be set to between -1.8 and -1.6; if the solvent content is about 50%, the optimum value is about -1; and if the solvent content is 70-75%, a kflip of zero seems to be optimal. The amount of averaging also influences the optimal value of kflip: with two- or threefold averaging, one should set it at 60% to 80% of the value one would choose in the absence of non-crystallographic averaging. Sixfold or higher non-crystallographic symmetry averaging is incompatible with solvent flipping and requires a value for kflip of zero.
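
These rules of thumb can be collected in a small helper for choosing a starting value. The linear interpolation between the three quoted solvent contents, the midpoint values (-1.7 for 30% solvent, a factor of 0.7 for two- or threefold averaging) and the function itself are assumptions made for illustration; they are not part of Solomon.

```python
import numpy as np

def suggest_kflip(solvent_fraction, ncs_fold=1):
    """Suggest a starting value for kflip from the guidelines in the text."""
    if ncs_fold < 1:
        raise ValueError("ncs_fold must be 1 (no averaging) or larger")
    if ncs_fold >= 6:
        return 0.0                     # flipping is incompatible with high-order averaging
    if ncs_fold in (4, 5):
        raise ValueError("the text gives no guidance for four- or fivefold averaging")
    # Assumed: linear interpolation between the quoted points; values outside
    # the 30% to 70% range are clamped to the nearest quoted value.
    kflip = float(np.interp(solvent_fraction, [0.30, 0.50, 0.70], [-1.7, -1.0, 0.0]))
    if ncs_fold in (2, 3):
        kflip *= 0.7                   # assumed midpoint of the 60% to 80% range
    return kflip
```

Any value suggested this way is only a starting point and should be checked against the statistics of the run.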

Protein density modification with Solomon

Solomon allows two types of density modification in the protein region: averaging and truncation of density. Solomon calculates symmetry related density by cubic spline interpolation, and averages using single, rather than double interpolation. As with masks defining the solvent, masks defining non-crystallographic symmetry are old-style "O" masks or CCP4 masks, and can easily be manipulated with existing programs. Determination of the solvent mask is independent of the specification of the masks defining the non-crystallographic symmetry related density, but one can prevent the solvent mask from intruding into a symmetry mask by inclusion of the appropriate keyword in the script. Although symmetry masks are allowed to overlap, Solomon does provide the opportunity to remove the overlap between these masks.

In density truncation, grid points within the protein region which have a density below a specified threshold are assigned a density equal to this threshold [7]. The result of density truncation of the protein region is that features of high density become sharpened relative to features of lower density. As such it is comparable to histogram matching techniques. However, the sharpening resulting from truncation seems to be beneficial even at resolutions at which the modification resulting from histogram matching is virtually non-existent. Because of this, density truncation was preferred over histogram matching. Other protocols for sharpening protein features were explored, but none of them was very successful, with one exception: in some cases it is better to set the density of truncated grid points to the mean density of all truncated grid points, rather than to the threshold density.
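
Both truncation variants described above can be sketched in a few lines. This is an illustration under the assumption that the map is a NumPy array and the protein mask a boolean array of the same shape; the function and parameter names are hypothetical.

```python
import numpy as np

def truncate_protein(density, protein_mask, threshold, use_mean=False):
    """Truncate protein density below a threshold.

    use_mean=False: truncated points are set to the threshold.
    use_mean=True:  truncated points are set to their own mean density.
    """
    out = np.array(density, dtype=float, copy=True)
    low = np.asarray(protein_mask, dtype=bool) & (out < threshold)
    if low.any():
        out[low] = out[low].mean() if use_mean else threshold
    return out

rho = np.array([-2.0, -1.0, 0.5, 3.0, -5.0])
protein = np.array([True, True, True, True, False])
to_threshold = truncate_protein(rho, protein, threshold=0.0)
to_mean = truncate_protein(rho, protein, threshold=0.0, use_mean=True)
```

In this example the two protein points below the threshold become 0.0 in the first variant and -1.5 (their mean) in the second; the solvent point is not touched.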

As a result of truncating the density within the protein region, the overall variance of electron density of the protein region relative to the solvent region decreases. This means that the density of the solvent region has to be scaled down since it was found desirable to maintain a constant ratio between the two. This is done automatically by Solomon if requested. Another result of truncation is that the mean density within the protein region will increase, relative to the mean density of the solvent region. Since it was found that this is undesirable if one is reconstructing missing structure factors, this can be corrected for by assigning a value to sadd in equation (1).

Reconstruction of missing data

It is vital to try to collect complete data sets, but there will always be some reflections which escape detection. The corresponding structure factors will be systematically absent from all the maps calculated, and as such will introduce a bias, since in density modification procedures one relies on the interaction between structure factors. If certain structure factors are left out, structure factors interacting through real-space with the missing data will incorrectly "assume" them to be zero. Therefore, the script distributed with Solomon allows one to replace missing structure factors with the structure factors of the modified map. In this way, the bias is removed, at the expense of creating potentially dangerous sources of error: since the missing structure factors are not restrained, they can take on unrealistic values in order to satisfy incorrect phase sets. However, if one only includes resolution bins with a completeness of 90% or more, it is quite safe to reconstruct missing structure factors, and in many cases it can be beneficial. Very low resolution terms can also be introduced quite safely.
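
The 90% completeness rule can be sketched as follows. The data layout is an assumption made for illustration: one entry per reflection, a boolean flag marking which reflections were measured, and an integer resolution bin index per reflection. The script distributed with Solomon works on CCP4 reflection files instead.

```python
import numpy as np

def fill_missing(f_exp, f_mod, observed, bin_index, min_completeness=0.9):
    """Replace unmeasured structure factors by those of the modified map,
    but only in resolution bins that are sufficiently complete."""
    f_exp = np.asarray(f_exp, dtype=complex)
    f_mod = np.asarray(f_mod, dtype=complex)
    observed = np.asarray(observed, dtype=bool)
    bin_index = np.asarray(bin_index)

    out = np.where(observed, f_exp, 0.0)        # missing terms start as zero
    filled = np.zeros(observed.shape, dtype=bool)
    for b in np.unique(bin_index):
        in_bin = bin_index == b
        completeness = observed[in_bin].mean()  # fraction measured in this bin
        if completeness >= min_completeness:
            take = in_bin & ~observed
            out[take] = f_mod[take]             # reconstructed from the modified map
            filled |= take
    return out, filled
```

Reflections in poorly measured bins are left at zero, which is the behaviour the text describes for maps calculated without reconstruction.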

Availability & Acknowledgements

Solomon is now available from CCP4, and if results obtained with Solomon are published, a reference to [1] should be made. I am grateful to Kevin Cowtan for making Solomon compatible with CCP4 masks.

References:

[1] Abrahams, J.P. & Leslie, A.G.W. (1996) Acta Cryst. D52, 30-42
[2] Cowtan, K.D. & Main, P. (1993) Acta Cryst. D49, 148-157
[3] Cowtan, K. (1994) Joint CCP4 and ESF-EACBM Newsletter on Protein Crystallography 31, 34-38
[4] Jones, T.A., Zou, J.Y., Cowan, S.W. & Kjeldgaard, M. (1991) Acta Cryst. A47, 110-119
[5] Wang, B.C. (1985) Methods Enzymol. 115, 90-112
[6] Read, R. (1986) Acta Cryst. A42, 102-116
[7] Schevitz, R.W., Podjarny, A.D., Zwick, M., Hughes, J.J. & Sigler, P. (1981) Acta Cryst. A37, 669-677
