**sigmaa HKLIN** *foo_in.mtz* [ **HKLOUT** *foo_out.mtz* ]

[Keyworded input]

The program SIGMAA (Read, 1986) can be used to combine a set of calculated phases with a set of previously determined phases for which the phase probability profiles are held in the form of Hendrickson-Lattman coefficients.

It calculates weighted Fourier coefficients either from the calculated phase from a (partial) model structure, or by combining phase probabilities from isomorphous phases with those from one or more (partial) structures.

WARNING: SIGMAA has been converted to work with Missing Number Flags (MNFs). As in FFT (see also the documentation on Missing Number Flags), a missing Fo is replaced by DFc for the FWT map coefficient. When phases are combined for missing data, the phase probability is assumed to be uniform. This procedure may not be optimal; a version from Randy Read may become available in a subsequent release.

There are 3 main options:

- PARTIAL
- use partial structure information, writing out a weight and coefficients
for maps in columns as follows:
- WCMB
- A weight (analogous to `Sim weight') to estimate the reliability of AlphaCalc.
- DELFWT (m|Fo| - D|Fc|) exp(i AlphaCalc)
- For a difference map, FFT input: F1=DELFWT PHI=PHIC
- FWT (2m|Fo| - D|Fc|) exp(i AlphaCalc)
- Analogous to 2Fo-Fc map, FFT input: F1=FWT PHI=PHIC

where Fo, Fc are observed and calculated structure factors. Note that for centric terms, the (2m|Fo|-D|Fc|) coefficients are replaced by m|Fo|; these coefficients reduce/remove model bias.
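The coefficient rules for the PARTIAL option can be sketched as a short Python function. This is only an illustration of the formulae quoted above, not the program's code: the function name and argument names are hypothetical, the missing-Fo branch follows the WARNING paragraph (FWT falls back to D|Fc|), and setting DELFWT to zero for a missing Fo is an assumption made here because the text does not say what is written in that case.

```python
from typing import Optional, Tuple


def partial_map_coefficients(fo: Optional[float], fc: float, m: float, d: float,
                             centric: bool = False) -> Tuple[float, float]:
    """Return (FWT, DELFWT) amplitudes; both are used with the phase PHIC."""
    if fo is None:
        # Missing Fo: FWT is replaced by D|Fc| (see the WARNING above).
        # DELFWT = 0 is an assumption of this sketch, not documented behaviour.
        return d * fc, 0.0
    # Difference-map coefficient: m|Fo| - D|Fc|
    delfwt = m * fo - d * fc
    # Centric terms use m|Fo|; acentric terms use 2m|Fo| - D|Fc|
    fwt = m * fo if centric else 2.0 * m * fo - d * fc
    return fwt, delfwt


if __name__ == "__main__":
    print(partial_map_coefficients(100.0, 80.0, m=0.9, d=0.95))
    print(partial_map_coefficients(100.0, 80.0, m=0.9, d=0.95, centric=True))
```

Either amplitude can come out negative, which is why the output columns are described as signed terms rather than moduli.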

- COMBINE PART
- combine isomorphous phase (preferably input by Hendrickson-Lattman coefficients ABCD) with calculated phases from up to 3 sources; output the combined phase (PHCMB WCMB) and coefficients which minimise model bias. (labelled again: FWT PHFWT and DELFWT PHDELFWT)

- COMBINE MIR2
- combine two sets of experimental phases with or without Hendrickson-Lattman coefficients. This can only be done pair-wise; it might be argued that all the data should instead be used together in calculating the phase.

The program first calculates, iteratively in resolution bins, the value of SigmaA as defined by Srinivasan (1966). Then, for each reflection, it calculates the figure of merit m and D, the estimate of the error in the partial structure arising from coordinate errors (Luzzati, 1952). There is an option to scale these to modify the weight assigned to the partial structure information, or to read in values of SigmaA derived previously.

If EPS is the multiplicity for the reflection zone (Rogers, 1965),

SigmaA = D*sqrt(sigmaP/sigmaN)

Eo = Fo/sqrt(EPS*sigmaN) and Ec = Fc/sqrt(EPS*sigmaP)

where sigmaN = <Fo**2/EPS> and sigmaP = <Fc**2/EPS>.

The figure of merit m = <cos(AlphaTrue - AlphaCalc)> is calculated from Eo, Ec and SigmaA, while the map coefficients arise from the approximation that

m Eo exp(iAlphaCalc) = 0.5 Eo exp(iAlphaTrue) + 0.5 SigmaA Ec exp(iAlphaCalc)
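The text does not give the expression for m explicitly. As a hedged sketch, the Python below uses the standard Sim-type result for this error model: the phase error distribution is taken to be proportional to exp(X cos(AlphaTrue - AlphaCalc)) with X = 2 SigmaA Eo Ec / (1 - SigmaA**2), so that m = I1(X)/I0(X) for acentric terms and m = tanh(X/2) for centric terms. The function name and the numerical integration are choices made for this illustration and are not taken from the program.

```python
import math


def figure_of_merit(eo: float, ec: float, sigma_a: float,
                    centric: bool = False, steps: int = 2000) -> float:
    """Expected cosine of the phase error, m = <cos(AlphaTrue - AlphaCalc)>."""
    if not 0.0 <= sigma_a < 1.0:
        raise ValueError("sigma_a must lie in [0, 1)")
    x = 2.0 * sigma_a * eo * ec / (1.0 - sigma_a ** 2)
    if centric:
        # Centric phase error is 0 or 180 degrees only.
        return math.tanh(x / 2.0)
    # Acentric: average cos(delta) over P(delta) ~ exp(x*cos(delta)),
    # using the midpoint rule on [0, pi]; this equals I1(x)/I0(x).
    num = 0.0
    den = 0.0
    for k in range(steps):
        delta = math.pi * (k + 0.5) / steps
        weight = math.exp(x * (math.cos(delta) - 1.0))  # shifted to avoid overflow
        num += math.cos(delta) * weight
        den += weight
    return num / den


if __name__ == "__main__":
    for sa in (0.2, 0.5, 0.8, 0.95):
        print(sa, round(figure_of_merit(1.2, 1.1, sa), 4))
```

The weight m rises towards 1 as SigmaA approaches 1, which is the behaviour that the DAMP and SIGMAA keywords are intended to moderate.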

If coordinate errors are assumed to be normally distributed,

ln SigmaA = intercept - slope * (sintheta/lambda)**2

where intercept = 0.5 * ln(sigmaP/sigmaN) and slope = pi**3 * (mean square coordinate error)
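The straight-line relation above can be illustrated with an ordinary least-squares fit. The sketch below is an assumption about how such a fit might be done, not a description of the program's internals; the function name is hypothetical, and the conversion from slope to coordinate error simply follows the expression slope = pi**3 * (mean square coordinate error) quoted in the text.

```python
import math
from typing import Sequence, Tuple


def fit_coordinate_error(s2: Sequence[float],
                         ln_sigma_a: Sequence[float]) -> Tuple[float, float, float]:
    """Fit ln SigmaA = intercept - slope * s2, where s2 = (sintheta/lambda)**2.

    Returns (intercept, slope, rms coordinate error).
    """
    n = len(s2)
    if n < 2 or n != len(ln_sigma_a):
        raise ValueError("need at least two matching (s2, ln SigmaA) points")
    mean_x = sum(s2) / n
    mean_y = sum(ln_sigma_a) / n
    sxx = sum((x - mean_x) ** 2 for x in s2)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(s2, ln_sigma_a))
    gradient = sxy / sxx              # negative for a falling line
    intercept = mean_y - gradient * mean_x
    slope = -gradient                 # sign convention of the text
    mean_square_error = slope / math.pi ** 3
    rms_error = math.sqrt(max(mean_square_error, 0.0))
    return intercept, slope, rms_error


if __name__ == "__main__":
    # Synthetic bins: intercept -0.1, mean square error 0.09 (rms 0.3)
    xs = [0.01 * k for k in range(1, 11)]
    ys = [-0.1 - math.pi ** 3 * 0.09 * x for x in xs]
    print(fit_coordinate_error(xs, ys))
```

With real data the per-bin SigmaA estimates scatter about the line, so the quality of the fit depends on the bins containing enough reflections.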

The various data control lines are identified by keywords. Only the first 4 characters need be given. Those available are:

COMBINE,END,ERROR,LABIN,LABOUT,PARTIAL,RANGES,RESOLUTION,SIGMAA,SYMMETRY,TITLE

[Required for option (b), COMBINE PART.]

Use this option to combine experimental phase information from isomorphous replacement (columns PHIBP, WP, HLA, HLB, HLC, HLD from the input data file) with that from (partial) model structures. This option produces an output data file assigned to HKLOUT.

- PART <nps>
- <nps> is the number of (partial) model structures, default: 1, maximum allowed: 3.
- DAMP <d1> <d2> <d3>
- <d1> <d2> <d3> (default 1.0) are factors by which the SigmaA values generated for the partial structures are multiplied. Once the R-factor between Fobs and Fcalc falls below about 30%, the SigmaA weights become close to 1, so the MIR information contributes very little to the combined phase. Giving values of di<1.0 may then be helpful. See keyword SIGMAA for Randy Read's preferred solution.
- RESOLUTION <Rmin> <Rmax>
- If resolution limits <Rmin>, <Rmax> are given here, phase combination is only done within this resolution shell: typically this would be used to include experimental phases only for high resolution data during a phase extension process. In this case, a low resolution limit would be set, allowing lower resolution data which has already been phased in previous cycles to diverge from the (incorrect) experimental phases according to phase information from averaging or density modification.

COMBINE MIR2 merges two sets of MIR phases. RESOLUTION has the same meaning as for COMBINE PART above.

If the ERROR keyword is present, a straight line is fitted to the plot of ln(SigmaA) against (sintheta/lambda)**2 in order to estimate the rms coordinate error.

Input column assignments. If you wish to make use of Hendrickson Lattman coefficients in the input MTZ file, the program assumes that they will have the column labels HLA, HLB, HLC and HLD. If you wish to use alternative column labels for the HL coefficients then they must be specified using LABIN. Program labels for the various options are:

- PARTIAL
- FP SIGFP FC PHIC
- COMBINE PART ...
- FP SIGFP PHIBP WP [HLA HLB HLC HLD], with FC PHIC or FC1 PHIC1 FC2 PHIC2 [FC3 PHIC3]
- COMBINE MIR2 ...
- FP SIGFP PHIBP WP [HLA HLB HLC HLD], with PHIB2 W2 [HLA2 HLB2 HLC2 HLD2]

Output column assignments. Program labels for the options producing output data are:

- PARTIAL
- DELFWT FWT WCMB
- COMBINE MIR2
- HLAC HLBC HLCC HLDC WCMB PHCMB
- COMBINE PART ...
- DELFWT PHDELFWT FWT PHFWT WCMB PHCMB

For details of these, see INPUT AND OUTPUT FILES.

Produce weighted map coefficients from a partial structure. This is the default option. It produces an output .mtz data file.

- DAMP <d1>
- <d1> is the damping factor for the SigmaA weights (default 1.0).

Set the number of resolution bins <nbin> and the reflection monitoring interval <nmon>. Defaults: 20 1000; maximum <nbin> allowed: 50.

<nbin> is the number of resolution bins (of equal width in [sin(theta)/lambda]**2) into which the partial structure data are divided for normalization and sigmaA estimation. It is IMPORTANT that each resolution range contains sufficient reflections. It is best to use as large a value of <nbin> as possible, as long as the estimates of sigmaA vary smoothly with resolution. If they do not, <nbin> should be reduced until sigmaA does vary smoothly. A good first guess is the number of reflections divided by 1000. If sigmaA refinement converges to zero in one or more of the ranges (which happens sometimes when the correct value is low), this can usually be circumvented by decreasing <nbin>.

Information about every <nmon>-th reflection will be written to the log file.
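The binning guidance above can be sketched in Python as follows. Both function names are hypothetical, and the handling of values at or outside the bin edges is an assumption of this illustration; only the equal-width bins in (sintheta/lambda)**2, the "reflections divided by 1000" first guess and the maximum of 50 bins come from the text.

```python
def suggest_nbin(n_reflections: int, max_bins: int = 50) -> int:
    """First guess for <nbin>: reflections / 1000, limited to 1..max_bins."""
    return max(1, min(max_bins, n_reflections // 1000))


def resolution_bin(s2: float, s2_min: float, s2_max: float, nbin: int = 20) -> int:
    """Index (0-based) of the equal-width bin in (sintheta/lambda)**2 holding s2."""
    if s2_max <= s2_min or nbin < 1:
        raise ValueError("need s2_max > s2_min and nbin >= 1")
    width = (s2_max - s2_min) / nbin
    index = int((s2 - s2_min) / width)
    # Clamp so that the upper edge falls in the last bin (an assumption).
    return min(max(index, 0), nbin - 1)


if __name__ == "__main__":
    nbin = suggest_nbin(23500)
    print(nbin, resolution_bin(0.021, 0.0, 0.04, nbin))
```

If the resulting sigmaA estimates do not vary smoothly from bin to bin, the advice in the text is to rerun with a smaller <nbin>.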

Low and high resolution limits, in either order, or the upper limit alone if only one value is specified. These are in Angstroms or, if both are <1.0, in units of 4(sintheta/lambda)**2. By default, all the data in the file are used.
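The two unit conventions are related through Bragg's law: 4(sintheta/lambda)**2 = 1/d**2, so a limit of 0.25 corresponds to d = 2.0 Angstroms and 0.0 to infinite d. The sketch below shows one way the limits could be interpreted; the function name is hypothetical, and applying the "<1.0" test to a single value is an assumption, since the text states the rule only for a pair of limits.

```python
import math
from typing import Tuple


def to_angstrom_limits(*limits: float) -> Tuple[float, float]:
    """Return (d_low, d_high) in Angstroms, with d_low >= d_high."""
    if len(limits) not in (1, 2):
        raise ValueError("expected one or two resolution limits")
    if all(x < 1.0 for x in limits):
        # Values are 4(sintheta/lambda)**2 = 1/d**2; 0.0 means infinite d.
        d = [math.inf if x == 0.0 else 1.0 / math.sqrt(x) for x in limits]
    else:
        d = list(limits)
    if len(d) == 1:
        # Only the upper (high-resolution) limit was given.
        return math.inf, d[0]
    # Limits may be given in either order.
    return max(d), min(d)


if __name__ == "__main__":
    print(to_angstrom_limits(100.0, 2.6))
    print(to_angstrom_limits(0.0, 0.25))
```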

Input SigmaA values from another source. Normally these values will be calculated in the program so this keyword is unnecessary. However if the agreement between Fobs and Fcalc becomes very good - for example if the Rfactor is <25% - then the calculated SIGMAA values weight up the PHIcalc at the expense of the experimental phases. This may not be desirable and you may need either to invoke the DAMP keyword or retain an early estimate of sigmaA.

- <nps>
- number of partial structures.
- <nbin>
- number of bins, followed by <nbin> lines of the form:

    SigmaA(1,1) [ SigmaA(2,1) ... [SigmaA(nps,1)]]
    SigmaA(1,2) [ SigmaA(2,2) ... [SigmaA(nps,2)]]
    ..........
    SigmaA(1,nbin) [ SigmaA(2,nbin) ... [SigmaA(nps,nbin)]]

Spacegroup number or name or operators in International Tables format. By default, symmetry information is read from the input file header.

A title written to the log file and in the header of the output MTZ data file (if produced).

End of input.

This is an MTZ file assigned to logical name HKLIN. The following column assignments are required (those which are optional are enclosed in square brackets):

- PARTIAL option:
- H K L FP SIGFP FC PHIC

with
- FP, SIGFP
- native amplitude and standard deviation
- FC, PHIC
- calculated amplitude and phase (degrees)

- COMBINE option:
- Combination of two sets of MIR phases:
- H K L FP SIGFP PHIBP WP [HLA HLB HLC HLD]

PHIB2 W2 [HLA2 HLB2 HLC2 HLD2]

with
- FP, SIGFP
- native amplitude and standard deviation
- PHIBP
- isomorphous centroid phase (degrees)
- WP
- figure of merit
- HLA...HLD
- Hendrickson-Lattman probability coefficients corresponding to isomorphous phase. If these are absent, a unimodal probability distribution will be set up around PHIBP.
- PHIB2
- isomorphous centroid phase for second set
- W2
- figure of merit for second set
- HLA2..HLD2
- Hendrickson-Lattman probability coefficients for second set. If these are absent, a unimodal probability distribution will be set up around PHIB2.

- Combination of one set of MIR phases with PARTIAL information:
- H K L FP SIGFP PHIBP WP [HLA HLB HLC HLD]

plus FC PHIC

or FC1 PHIC1 FC2 PHIC2 [FC3 PHIC3]

with
- FP, SIGFP
- native amplitude and standard deviation
- PHIBP
- isomorphous centroid phase (degrees)
- WP
- figure of merit
- HLA...HLD
- Hendrickson-Lattman probability coefficients corresponding to isomorphous phase. If these are absent, a unimodal probability distribution will be set up around PHIBP.
- FC, PHIC
- calculated amplitude and phase (degrees) for one partial structure
- FC1, PHIC1
- calculated amplitude and phase (degrees) for first partial structure when nps > 1
- FC2, PHIC2
- calculated amplitude and phase (degrees) for second partial structure when nps >= 2
- FC3, PHIC3
- calculated amplitude and phase (degrees) for third partial structure when nps = 3

This is an MTZ file assigned to logical name HKLOUT. The file will contain all the columns from the input file with extra columns appended, the number depending on which option was used. The default labels of these columns are given below; these may be changed with LABOUT command.

- PARTIAL option:
- The new columns are: WCMB DELFWT FWT, with
- WCMB
- figure of merit m of calculated phase (Sim weight)
- DELFWT
- Fourier amplitude for `difference' map (mFo-DFc)
- FWT
- Fourier amplitude for '2Fo-Fc' map (2mFo-DFc)

These terms may be positive or negative.

The phases used for these maps will always be PHIC.

- COMBINE option:
- The new columns are: PHCMB WCMB FWT PHFWT DELFWT PHDELFWT, with
- PHCMB
- combined phase angle (degrees)
- WCMB
- combined figure of merit
- FWT
- Fourier amplitude for '2mFo-DFc' map
- PHFWT
- Combined phase for this term
- DELFWT
- Fourier amplitude for 'mFo-DFc' map
- PHDELFWT
- Combined phase for this term

Originator: August 1986: R.J. Read.

Incorporates updates from Vellieux et al. (manuscript in preparation).

    sigmaa HKLIN hktmpico.mtz HKLOUT hksigmaa1.mtz << END-sigmaa
    TITLE SIGMAA m*Fo-Fc map pfk B.st. BP2.. PROLSQ cycle<1>..
    RESOLUTION 100.0 2.6     ! Resolution limits in Angstroms
    RANGES 30 5000           ! Number of bins for analysis v. resolution
                             ! Monitor every 5000th reflection
    PARTIAL                  ! Option for difference map coefficients
    ERROR                    ! Use sigmaA v resolution for coordinate error
    LABIN FP=FO SIGFP=SIGFO FC=FC PHIC=PHIC
    END
    END-sigmaa

Note: This example uses the default output file labels. To calculate the `difference' map, use DELFWT in FFT. To calculate the `2Fo-Fc' map, use FWT.

    sigmaa HKLIN ../data/sp400_monster2.mtz HKLOUT ../data/sp400_phase_comb.mtz << END-sigmaa
    TITLE TRYIT
    RANGES 10 1000           ! Number of analysis bins, monitor interval
    RESOLUTION 0.0 0.25      ! Resolution limits in 4(sintheta/lambda)**2
    ERROR                    ! Use sigmaA v resolution for coordinate error
    COMBINE PART 1           ! Combine isomorphous + 1 partial model
    LABOUT PHCMB=PHCMB WCMB=WCMB FWT=FWT PHFWT=PHFWT
    LABIN FP=F(Mer) SIGFP=SIGF(Mer) PHIBP=PHIBEST WP=FOM -
          HLA=A HLB=B HLC=C HLD=D -
          FC=FC PHIC=AC
    END
    END-sigmaa

The phase combination method used in sigmaa depends on the Hendrickson and Lattman (1970) formulation of the phase probability profile for a phase Alpha:

P(Alpha) = exp(A cosAlpha + B sinAlpha + C cos2Alpha + D sin2Alpha)

A, B, C, D are known as the phase coefficients. Phase information from different sources can be combined by a simple addition of the phase coefficients from each determination. The application of a weighting scheme proposed by Sim (1959) allows for the inclusion of phase information determined from a partial structure.
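The "simple addition" of phase coefficients can be illustrated with a short Python sketch. It adds the A, B, C, D values from each source and then evaluates the centroid phase and figure of merit of the combined profile by numerical integration over Alpha. The function names and the integration step are hypothetical choices for this illustration, and no Sim-type weighting of a partial structure is included.

```python
import math
from typing import Sequence, Tuple


def combine_hl(*coefficient_sets: Sequence[float]) -> Tuple[float, ...]:
    """Add Hendrickson-Lattman coefficients (A, B, C, D) from several sources."""
    return tuple(sum(values) for values in zip(*coefficient_sets))


def centroid_phase_and_fom(a: float, b: float, c: float, d: float,
                           steps: int = 720) -> Tuple[float, float]:
    """Centroid phase (degrees, 0-360) and figure of merit of P(Alpha)."""
    angles = [2.0 * math.pi * (k + 0.5) / steps for k in range(steps)]
    exponents = [a * math.cos(t) + b * math.sin(t)
                 + c * math.cos(2.0 * t) + d * math.sin(2.0 * t) for t in angles]
    shift = max(exponents)  # subtract the maximum to avoid overflow in exp()
    weights = [math.exp(e - shift) for e in exponents]
    total = sum(weights)
    x = sum(w * math.cos(t) for w, t in zip(weights, angles)) / total
    y = sum(w * math.sin(t) for w, t in zip(weights, angles)) / total
    return math.degrees(math.atan2(y, x)) % 360.0, math.hypot(x, y)


if __name__ == "__main__":
    combined = combine_hl((0.0, 1.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0))
    print(combined, centroid_phase_and_fom(*combined))
```

Two agreeing unimodal sources reinforce each other: the combined profile is sharper and its figure of merit is higher than that of either source alone.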

The principles of the method and details of the original phase combination program are described by Bricogne (1976).

It is assumed that the coefficients giving least bias vary as a linear function of partial structure influence. The variation of information is the parameter used to measure the contribution of each partial structure to the combined phase probability profile; and this is normalised to give partial structure weights w. These are tabulated as a function of resolution in the log file. If there are p partial structures, the modified map coefficients are given by

[2mFo - sum_over_p(wDFc)] / [2 - sum_over_p(w)]
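As a hedged illustration of the expression above, the Python sketch below evaluates it with complex structure factors. Treating the first term as carrying the combined phase and each Fc term as carrying its own calculated phase is an assumption of this sketch, since the formula in the text is written without phases; the function and argument names are hypothetical.

```python
import cmath
import math
from typing import Iterable, Tuple


def combined_map_coefficient(fo: float, m: float, phi_comb_deg: float,
                             partials: Iterable[Tuple[float, float, float, float]]
                             ) -> complex:
    """Evaluate [2mFo - sum(w*D*Fc)] / [2 - sum(w)] as a complex coefficient.

    partials holds one (w, D, Fc, PHIC in degrees) tuple per partial structure.
    """
    numerator = 2.0 * m * fo * cmath.exp(1j * math.radians(phi_comb_deg))
    weight_sum = 0.0
    for w, d, fc, phic_deg in partials:
        numerator -= w * d * fc * cmath.exp(1j * math.radians(phic_deg))
        weight_sum += w
    return numerator / (2.0 - weight_sum)


if __name__ == "__main__":
    coeff = combined_map_coefficient(100.0, 0.9, 30.0, [(0.6, 0.95, 80.0, 25.0)])
    print(abs(coeff), math.degrees(cmath.phase(coeff)))
```

The two limits give a useful check: with a single partial structure and w = 1 the expression reduces to 2mFo - DFc, and with no partial structure influence (w = 0) it reduces to mFo.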

- Read, R.J.: Acta Cryst. A42 (1986) 140-149.
- Srinivasan, R.: Acta Cryst. 20 (1966) 143-144.
- Hauptman, H.: Acta Cryst. A38 (1982) 289-294.
- Luzzati, V.: Acta Cryst. 6 (1953) 142-152.
- Rogers, D. in Computing Methods in Crystallography (Rollett, J.S., ed.) (1965) pp. 126-127, Pergamon Press.
- Hendrickson, W.A. & Lattman, E.E.: Acta Cryst. B26 (1970) 136-143.
- Bricogne, G.: Acta Cryst. A32 (1976) 832-847.
- Sim, G.A.: Acta Cryst. 12 (1959) 813-815; 13 (1960) 511-512.
- Read, R.J.: Acta Cryst. A46 (1990) 140-149.
- Read, R.J.: Acta Cryst. A46 (1990) 900-912.
- Vellieux, F.M.D., Livnah, O., Dym, O., Read, R.J. & Sussman, J.L., manuscript in preparation.