Missing Number Flags in MTZ Files: Users Guide


Purpose

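The principal motives for adding a "MISSING NUMBER FLAG" (MNF) are covered in the three sections below: a uniform check for missing data, a complete reflection list with consistent FreeRflag assignments, and the restoration of missing data when calculating maps.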

Old and New Style Missing Data Checks

In the next version of the CCP4 Suite (3.0) the concept of missing number flags within MTZ files will be introduced.

A missing number flag (MNF) will indicate that a certain datum for an HKL record has not been measured or calculated, so any datum within an MTZ file can be tested to see whether it should be used. Previously, files did not contain this information for all data types. The only way of recognising an unmeasured reflection was to check whether its standard error, SIGF or SIGI, was zero. Similarly, the only test for an experimentally determined phase was to check whether its FOM was zero.


Example 1)
   A list of reflection data from one crystal would contain only 
   those values of hkl which had an observation associated with them.
   E.g. if you had a blind region along the C* axis for a P212121 data set,
   your file might begin:

             0 0  8 F SIGF
             0 0 10 F SIGF
              .....

   from which you could deduce that there was no measurement made for 
   reflections 0 0 2, 0 0 4 and 0 0 6.


Example 2)
   For an MTZ file containing native and derivative data sets,
   an old-style MTZ file could look like this:

      H    K    L     FO  SIGFO    FPH  SIGFPH    FOM   PHIB  FreeRflag
      0    0    6      0      0     40       4   0.00    0.0        9.0
      0    0    8     10      1      0       0   0.00    0.0        1.0
      0    0   10     75      2     80       5   0.25   45.0        9.0
      ...........

   From this you could deduce that there was no measurement of the 
   derivative FPH for 0 0 8, no measurement of the native FO for 0 0 6, 
   and no measurement of either FO or FPH for 0 0 2 or 0 0 4 (which are 
   absent from the file), and therefore that no phase or FOM could be 
   calculated for these reflections.

It was possible to misuse these old-style files by assigning F without assigning SIGF, or PHI without FOM. Many people have calculated "difference" maps where some "differences" were between observed and unobserved data, and where phases were taken as 0.00 when in fact no phase had been determined.
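The old-style checks and the pitfall described above can be sketched in a few lines of Python. This is only an illustration: the rows are the three reflections of Example 2 held as plain dictionaries, and the helper names (old_style_measured, old_style_phased, isomorphous_difference) are hypothetical, not CCP4 routines.

```python
# Old-style convention: a zero standard error means "amplitude not measured",
# a zero FOM means "no phase determined". Data are the rows of Example 2.
rows = [
    {"hkl": (0, 0, 6), "FO": 0.0, "SIGFO": 0.0, "FPH": 40.0, "SIGFPH": 4.0,
     "FOM": 0.0, "PHIB": 0.0},
    {"hkl": (0, 0, 8), "FO": 10.0, "SIGFO": 1.0, "FPH": 0.0, "SIGFPH": 0.0,
     "FOM": 0.0, "PHIB": 0.0},
    {"hkl": (0, 0, 10), "FO": 75.0, "SIGFO": 2.0, "FPH": 80.0, "SIGFPH": 5.0,
     "FOM": 0.25, "PHIB": 45.0},
]


def old_style_measured(row, sigma_label):
    """Old-style test: the amplitude counts as measured if its sigma is non-zero."""
    return row[sigma_label] != 0.0


def old_style_phased(row):
    """Old-style test: a phase counts as determined if its FOM is non-zero."""
    return row["FOM"] != 0.0


def isomorphous_difference(row):
    """FPH - FO, or None when either amplitude fails the old-style test."""
    if old_style_measured(row, "SIGFO") and old_style_measured(row, "SIGFPH"):
        return row["FPH"] - row["FO"]
    return None


# Misuse: differencing without looking at the sigmas treats the
# placeholder zeros as if they were observations.
naive = [row["FPH"] - row["FO"] for row in rows]

# Correct old-style use: only 0 0 10 has both amplitudes measured.
checked = [isomorphous_difference(row) for row in rows]
```

Here naive contains two spurious "differences" (for 0 0 6 and 0 0 8), while checked keeps only the difference for 0 0 10. The check works only if every program remembers to apply it, which is the weakness the MNF is meant to remove.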

In the new-style MTZ file the missing number flag (MNF) will indicate that a datum is not present.


   Example 1) would appear as
      0 0  2 MNF MNF  FreeRflag
      0 0  4 MNF MNF  FreeRflag
      0 0  6 MNF MNF  FreeRflag
      0 0  8 F   SIGF FreeRflag
      0 0 10 F   SIGF FreeRflag

   Example 2) would appear as
      H    K    L     FO  SIGFO    FPH  SIGFPH    FOM   PHIB  FreeRflag
      0    0    2    MNF    MNF    MNF     MNF    MNF    MNF        6.0
      0    0    4    MNF    MNF    MNF     MNF    MNF    MNF        3.0
      0    0    6    MNF    MNF     40       4    MNF    MNF        9.0
      0    0    8     10      1    MNF     MNF    MNF    MNF        1.0
      0    0   10     75      2     80       5   0.25   45.0        9.0
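
As an illustration of how a program might test the new-style records, the sketch below holds Example 2 in memory and checks individual data for the flag. It assumes that NaN stands in for the MNF; the value actually stored in a file is defined by the MTZ format and library, not by this sketch. The names is_missing and usable are hypothetical.

```python
import math

MNF = float("nan")  # assumption: NaN stands in for the missing number flag

COLUMNS = ("FO", "SIGFO", "FPH", "SIGFPH", "FOM", "PHIB", "FreeRflag")

# New-style version of Example 2, keyed on (h, k, l).
records = {
    (0, 0, 2): (MNF, MNF, MNF, MNF, MNF, MNF, 6.0),
    (0, 0, 4): (MNF, MNF, MNF, MNF, MNF, MNF, 3.0),
    (0, 0, 6): (MNF, MNF, 40.0, 4.0, MNF, MNF, 9.0),
    (0, 0, 8): (10.0, 1.0, MNF, MNF, MNF, MNF, 1.0),
    (0, 0, 10): (75.0, 2.0, 80.0, 5.0, 0.25, 45.0, 9.0),
}


def is_missing(value):
    """True if the datum carries the flag. NaN never compares equal to
    anything, itself included, so use isnan rather than '=='."""
    return math.isnan(value)


def usable(hkl, *labels):
    """True if every requested column is present for this reflection."""
    row = dict(zip(COLUMNS, records[hkl]))
    return not any(is_missing(row[label]) for label in labels)


# Which reflections can contribute to an FPH - FO difference?
both_measured = [hkl for hkl in records if usable(hkl, "FO", "FPH")]
```

With these records both_measured holds only 0 0 10, and every reflection, measured or not, still carries its FreeRflag.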

All CCP4 programs that use MTZ files have been changed to deal with MNFs. No program will use a datum flagged with an MNF. In most cases the functionality is unchanged, and to ensure backwards compatibility the old-style checks on SIGF and FOM have been kept where appropriate.

It is strongly advised that you convert existing MTZ files to the new-style MTZ format. This is possible through MTZMNF, a new program in the Suite. It uses the old protocols for checking for missing data and replaces the values they identify with MNFs (see mtzmnf.doc).
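
The idea behind such a conversion can be sketched as follows. This is not the MTZMNF source. It is a minimal illustration that assumes NaN stands in for the MNF, that each amplitude is paired with its standard error and each phase with its FOM, and that a row is a plain dictionary. The function name to_new_style is hypothetical.

```python
import math

MNF = float("nan")  # assumption: NaN stands in for the missing number flag


def to_new_style(row, amplitude_pairs, phase_pairs):
    """Return a copy of row with old-style placeholders replaced by MNF.

    amplitude_pairs: (value_label, sigma_label) pairs; sigma == 0 means missing.
    phase_pairs:     (phase_label, fom_label) pairs;   FOM == 0 means missing.
    """
    out = dict(row)
    for value_label, sigma_label in amplitude_pairs:
        if out[sigma_label] == 0.0:
            out[value_label] = MNF
            out[sigma_label] = MNF
    for phase_label, fom_label in phase_pairs:
        if out[fom_label] == 0.0:
            out[phase_label] = MNF
            out[fom_label] = MNF
    return out


# Old-style record for reflection 0 0 8 from Example 2.
old = {"FO": 10.0, "SIGFO": 1.0, "FPH": 0.0, "SIGFPH": 0.0,
       "FOM": 0.0, "PHIB": 0.0, "FreeRflag": 1.0}

new = to_new_style(
    old,
    amplitude_pairs=[("FO", "SIGFO"), ("FPH", "SIGFPH")],
    phase_pairs=[("PHIB", "FOM")],
)
```

The result matches the new-style row for 0 0 8 shown earlier: FO and SIGFO survive, FPH, SIGFPH, FOM and PHIB become MNF, and the FreeRflag is untouched. Running the function again on its own output changes nothing, because a flagged value never compares equal to zero.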

Existing programs which output data, such as MLPHARE, will now output MNFs for undetermined phases and FOMs. Therefore, if the input file to MLPHARE is an old-style MTZ file, the output will be a hybrid of old and new styles. This is undesirable.

A Complete Reflection List and Assigning FreeRflags

FreeR information is becoming a requirement for any structural report. It is also used actively in some programs, e.g. DM, REFMAC, and soon SIGMAA. To be used correctly, FreeRflags should be assigned at the earliest opportunity and used consistently throughout the structure determination. In particular, if you subsequently collect another data set for a structure, either an extended native set or a mutant which crystallises in the same spacegroup, the same FreeR assignments need to be preserved; otherwise the FreeR statistics are to some extent invalidated. To do this, it is sensible first to generate all possible hkl to a given resolution, to assign FreeRflags to this set, and then to merge this master list with the observed data sets. There will be example scripts of the procedure to follow, both for new data sets and for those which already have a FreeR assigned.
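
The three steps (generate, assign, merge) can be sketched as below. This is an illustration under stated assumptions, not the Suite's procedure: the cell is orthorhombic, non-negative indices are taken as the unique set, systematic absences are ignored, flags are drawn at random from 10 sets with a fixed seed, and NaN stands in for the MNF. All function names, the cell dimensions and the resolution limit are hypothetical.

```python
import math
import random

MNF = float("nan")  # assumption: NaN stands in for the missing number flag


def complete_hkl(a, b, c, d_min):
    """All h, k, l >= 0 (except 0 0 0) with d >= d_min, orthorhombic cell.

    Uses 1/d^2 = h^2/a^2 + k^2/b^2 + l^2/c^2. Absences are NOT removed.
    """
    limit = 1.0 / d_min ** 2
    hkls = []
    for h in range(int(a / d_min) + 1):
        for k in range(int(b / d_min) + 1):
            for l in range(int(c / d_min) + 1):
                if (h, k, l) == (0, 0, 0):
                    continue
                if (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2 <= limit:
                    hkls.append((h, k, l))
    return hkls


def assign_free_flags(hkls, n_sets=10, seed=0):
    """Master list: one flag in 0 .. n_sets-1 per reflection, assigned once."""
    rng = random.Random(seed)
    return {hkl: float(rng.randrange(n_sets)) for hkl in hkls}


def merge(master, observed):
    """Attach (F, SIGF) to every master reflection; unobserved ones get MNF.

    Observations outside the master list are dropped in this sketch.
    """
    return {hkl: observed.get(hkl, (MNF, MNF)) + (flag,)
            for hkl, flag in master.items()}


master = assign_free_flags(complete_hkl(20.0, 30.0, 40.0, d_min=3.5))

native = {(0, 0, 8): (10.0, 1.0), (0, 0, 10): (75.0, 2.0)}
mutant = {(0, 0, 6): (40.0, 4.0), (0, 0, 10): (80.0, 5.0)}

merged_native = merge(master, native)
merged_mutant = merge(master, mutant)
```

Because both data sets are merged against the same master list, a given hkl carries the same FreeRflag in merged_native and merged_mutant, whichever of the two actually observed it.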

Restoring Missing Data in the Calculation of Maps

The current practice when calculating "nFo - (n-1)Fc" electron density maps (e.g. Fo, 2Fo-Fc, 3Fo-2Fc or the SIGMAA or REFMAC style 2mFo-DFc maps) is to leave out any term where Fo is unmeasured. Effectively, you are saying that the contribution from that structure factor is zero. This is not correct, and errors will be introduced into the map as a consequence of this assumption. See Kevin's Book of Fourier for a duck's eye view of the problem.

There is now an option in FFT that will allow you to substitute Fc for "nFo - (n-1)Fc" as the Fourier term for all missing values of Fo (see fft.doc). This invokes the assumption that the most likely value for Fo is Fc. REFMAC (and soon SIGMAA) will generate a term DFc to substitute for 2mFo-DFc. This reduces the distortion caused by missing slabs of data. Although the noise in the map will diminish, it is possible that the systematic error (model bias) may increase. Note that this substitution is not needed for difference maps (Fo-Fc), where the assumption Fo~Fc will generate a zero difference.
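
The arithmetic of the substitution can be shown in a short sketch. It is not the FFT implementation and does not reproduce any FFT keyword: the function names and the fill_missing argument are hypothetical, only amplitudes are handled (phases are ignored), and NaN is assumed to stand in for the MNF.

```python
import math

MNF = float("nan")  # assumption: NaN stands in for the missing number flag


def map_amplitude(fo, fc, n=2, fill_missing=True):
    """Amplitude of the nFo - (n-1)Fc Fourier term.

    If Fo is flagged missing, either substitute Fc (fill_missing=True) or
    omit the term (return None), which is the same as giving it zero weight.
    """
    if not math.isnan(fo):
        return n * fo - (n - 1) * fc
    return fc if fill_missing else None


def difference_amplitude(fo, fc):
    """Amplitude of the Fo - Fc term; assuming Fo ~ Fc for a missing Fo
    gives zero, so omitting the term and substituting are equivalent."""
    return 0.0 if math.isnan(fo) else fo - fc


measured = map_amplitude(75.0, 70.0)                      # 2*75 - 70
restored = map_amplitude(MNF, 70.0)                       # Fc stands in
omitted = map_amplitude(MNF, 70.0, fill_missing=False)    # term left out
```

Setting Fo equal to Fc in nFo - (n-1)Fc gives Fc for any n, so the substituted term is exactly what the formula would yield if the assumed value of Fo had been measured.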
