In a diffraction experiment, the presence of an anomalous scattering signal means that the structure factor amplitude measured for a reflection hkl may differ from that measured for its Friedel mate, -h-k-l.
Rather than storing the data as two independent reflections, however, MTZ stores the data for both in just one of the pair (say hkl). Suppose that the amplitude measured for hkl is F(+) and that for -h-k-l is F(-); then there are two ways to store the data:
The data can be stored as a mean amplitude F and an anomalous difference D:
F = 0.5*( F(+) + F(-) )
D = F(+) - F(-)
(There are also equations for calculating the sigmas for F and D, not shown here).
The data can be stored as the raw F(+) and F(-) measurements.
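The two forms can be converted into each other by simple arithmetic. The following is a minimal Python sketch of the relations given above; the function names are illustrative only and are not taken from any CCP4 program. It ignores the sigmas (whose equations are not shown here) and assumes that both members of the pair were measured.

```python
def to_mean_and_difference(f_plus, f_minus):
    """Return (F, D) for a Friedel pair: F = 0.5*(F(+) + F(-)), D = F(+) - F(-)."""
    f_mean = 0.5 * (f_plus + f_minus)
    d_anom = f_plus - f_minus
    return f_mean, d_anom


def to_friedel_pair(f_mean, d_anom):
    """Invert the relations above: F(+) = F + D/2 and F(-) = F - D/2."""
    f_plus = f_mean + 0.5 * d_anom
    f_minus = f_mean - 0.5 * d_anom
    return f_plus, f_minus
```

For example, F(+) = 104.0 and F(-) = 100.0 give F = 102.0 and D = 4.0, and applying the inverse recovers the original pair.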
Originally only mean values were stored and people would often simply throw away the differences. As anomalous scattering techniques became more widespread the differences were also stored. Nowadays it is common to see the data stored as both F and D and F(+)/F(-) - all associated with a single reflection, e.g.
h k l F sigF D sigD F(+) sigF(+) F(-) sigF(-)
(Note that having F sigF D sigD is essentially the same information as having F(+) sigF(+) F(-) sigF(-), only expressed in a different form. In principle you only need one of these two sets of columns in order to have all the data.)
This is different from formats such as XPLOR or SHELX where the data is stored as two separate reflections e.g.:
h k l F(+) sigF(+)
-h -k -l F(-) sigF(-)
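The difference between the two layouts can be sketched in Python as follows. This is an illustration only: the MergedReflection type and the split_friedel_pair function are hypothetical names, and the sketch does not reproduce the actual file syntax of MTZ, XPLOR or SHELX.

```python
from typing import List, NamedTuple, Tuple


class MergedReflection(NamedTuple):
    """One MTZ-style record holding both members of a Friedel pair."""
    h: int
    k: int
    l: int
    f_plus: float
    sigf_plus: float
    f_minus: float
    sigf_minus: float


def split_friedel_pair(r: MergedReflection) -> List[Tuple[Tuple[int, int, int], float, float]]:
    """Expand one merged record into two separate reflections (index, F, sigF)."""
    return [
        ((r.h, r.k, r.l), r.f_plus, r.sigf_plus),      # hkl carries F(+)
        ((-r.h, -r.k, -r.l), r.f_minus, r.sigf_minus),  # -h-k-l carries F(-)
    ]
```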
The CCP4 program MTZ2VARIOUS can convert an MTZ file to mmCIF but there are some issues when dealing with anomalous data.
Older versions of MTZ2VARIOUS (before 23rd May 2003, or revision 1.101) wrote out anomalous data as two explicit reflections, using the following tokens corresponding to columns in the MTZ file:
_refln.F_meas_au            FP (or F(+)/F(-))
_refln.F_meas_sigma_au      SIGFP (or sigF(+)/sigF(-))
_refln.intensity_meas       I (or I(+)/I(-))
_refln.intensity_sigma      SIGI (or sigI(+)/sigI(-))
Now MTZ2VARIOUS treats the anomalous data differently, writing only a single reflection for each anomalous pair and using the following tokens to correspond to the columns in the MTZ file:
_refln.F_meas_au                        FP
_refln.F_meas_sigma_au                  SIGFP
_refln.intensity_meas                   I
_refln.intensity_sigma                  SIGI
_refln.ccp4_SAD_F_meas_plus_au          F(+)
_refln.ccp4_SAD_F_meas_plus_sigma_au    SIGF(+)
_refln.ccp4_SAD_F_meas_minus_au         F(-)
_refln.ccp4_SAD_F_meas_minus_sigma_au   SIGF(-)
_refln.ccp4_SAD_phase_anom              DP
_refln.ccp4_SAD_phase_anom_sigma        SIGDP
_refln.ccp4_I_plus                      I(+)
_refln.ccp4_I_plus_sigma                SIGI(+)
_refln.ccp4_I_minus                     I(-)
_refln.ccp4_I_minus_sigma               SIGI(-)
This maps more closely onto the way that MTZ files store the same information.
The two ways of representing the anomalous data differ in how the standard tokens (_refln.F_meas_au etc.) are used:
in the first representation, these are the actual measured values for the reflection (so the anomalous data is preserved), whereas
in the second representation, these are the mean values of the reflection and its Friedel mate (so the anomalous data is no longer held in these tokens; it is now in the other tokens, for example _refln.ccp4_SAD_phase_anom).
Note that the EBI have these tokens as part of the CIF exchange dictionary mmcif_ccp4.dic: this can be found at http://mmcif.pdb.org/dictionaries/mmcif_ccp4.dic/Index/index.html.
The RCSB convert these to equivalent tokens in the PDB exchange dictionary mmcif_pdbx.dic found at http://mmcif.pdb.org/dictionaries/mmcif_pdbx.dic/Index/index.html. The mappings are:
_refln.ccp4_SAD_F_meas_plus_au          -> _refln.pdbx_F_plus
_refln.ccp4_SAD_F_meas_plus_sigma_au    -> _refln.pdbx_F_plus_sigma
_refln.ccp4_SAD_F_meas_minus_au         -> _refln.pdbx_F_minus
_refln.ccp4_SAD_F_meas_minus_sigma_au   -> _refln.pdbx_F_minus_sigma
_refln.ccp4_I_plus                      -> _refln.pdbx_I_plus
_refln.ccp4_I_plus_sigma                -> _refln.pdbx_I_plus_sigma
_refln.ccp4_I_minus                     -> _refln.pdbx_I_minus
_refln.ccp4_I_minus_sigma               -> _refln.pdbx_I_minus_sigma
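A mapping of this kind is easy to apply mechanically. The Python sketch below holds the eight mappings listed above in a dictionary and renames a list of column tokens; the names CCP4_TO_PDBX and rename_tokens are hypothetical and do not correspond to any RCSB or CCP4 tool. Tokens that are not in the table are passed through unchanged.

```python
# CCP4 anomalous tokens and their PDB exchange dictionary equivalents,
# exactly as listed in the mapping table above.
CCP4_TO_PDBX = {
    "_refln.ccp4_SAD_F_meas_plus_au": "_refln.pdbx_F_plus",
    "_refln.ccp4_SAD_F_meas_plus_sigma_au": "_refln.pdbx_F_plus_sigma",
    "_refln.ccp4_SAD_F_meas_minus_au": "_refln.pdbx_F_minus",
    "_refln.ccp4_SAD_F_meas_minus_sigma_au": "_refln.pdbx_F_minus_sigma",
    "_refln.ccp4_I_plus": "_refln.pdbx_I_plus",
    "_refln.ccp4_I_plus_sigma": "_refln.pdbx_I_plus_sigma",
    "_refln.ccp4_I_minus": "_refln.pdbx_I_minus",
    "_refln.ccp4_I_minus_sigma": "_refln.pdbx_I_minus_sigma",
}


def rename_tokens(tokens):
    """Replace CCP4 anomalous tokens by their pdbx equivalents; keep all others."""
    return [CCP4_TO_PDBX.get(token, token) for token in tokens]
```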
1. In June 2003 the EBI requested that CCP4 change the tokens used explicitly for anomalous data, to make them more generic:
_refln.ccp4_SAD_F_meas_plus_au          -> _refln.F_meas_plus
_refln.ccp4_SAD_F_meas_plus_sigma_au    -> _refln.F_meas_plus_sigma
_refln.ccp4_SAD_F_meas_minus_au         -> _refln.F_meas_minus
_refln.ccp4_SAD_F_meas_minus_sigma_au   -> _refln.F_meas_minus_sigma
_refln.ccp4_I_plus                      -> _refln.intensity_meas_plus
_refln.ccp4_I_plus_sigma                -> _refln.intensity_meas_plus_sigma
_refln.ccp4_I_minus                     -> _refln.intensity_meas_minus
_refln.ccp4_I_minus_sigma               -> _refln.intensity_meas_minus_sigma
2. The EBI recognise Hendrickson-Lattman coefficients from CCP4:
_refln.ccp4_SAD_HL_A_iso    HLA
_refln.ccp4_SAD_HL_B_iso    HLB
_refln.ccp4_SAD_HL_C_iso    HLC
_refln.ccp4_SAD_HL_D_iso    HLD
The RCSB also have equivalents for these in the PDB exchange dictionary:
_refln.ccp4_SAD_HL_A_iso -> _refln.pdbx_HL_A_iso
_refln.ccp4_SAD_HL_B_iso -> _refln.pdbx_HL_B_iso
_refln.ccp4_SAD_HL_C_iso -> _refln.pdbx_HL_C_iso
_refln.ccp4_SAD_HL_D_iso -> _refln.pdbx_HL_D_iso
3. mmCIF files can contain multiple datasets (indexed by crystal and wavelength). Although this maps well onto MTZ, MTZ2VARIOUS does not currently support it.
4. SFCHECK doesn't recognise the _refln.ccp4_SAD_F_meas_plus_au etc tokens.
The CCP4 program CIF2MTZ can be used to convert from mmCIF to MTZ. It was originally intended to convert mmCIF files from the PDB into MTZ files. Here there are two issues affecting the treatment of anomalous data:
1. If the mmCIF file contains the anomalous data in the RCSB representation (i.e. Friedel mates are explicitly given as separate reflections) then the CIF2MTZ program needs to be given the ANOMALOUS keyword in order to correctly convert the pairs of reflections back to the MTZ format.
2. If the mmCIF file contains the anomalous data in the non-standard token format (i.e. the _refln.ccp4_... tokens) then the correct back conversion is not possible because these tokens are not recognised by CIF2MTZ.
This is an issue for CCP4 to resolve.
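Conceptually, the first issue above amounts to re-pairing Friedel mates that were written as separate reflections. The Python sketch below illustrates the idea only; it is not the CIF2MTZ implementation. It assumes a deliberately simplified, hypothetical convention in which the member whose first non-zero index is positive is the (+) reflection, and it ignores space-group symmetry, the choice of asymmetric unit and centric reflections. The names is_plus and pair_friedel_mates are illustrative.

```python
def is_plus(hkl):
    """Hypothetical convention: the first non-zero index is positive."""
    for index in hkl:
        if index != 0:
            return index > 0
    return True  # (0, 0, 0) is treated as its own (+) member


def pair_friedel_mates(reflections):
    """Group separate reflections (hkl, F, sigF) into one record per Friedel pair.

    Returns a dict keyed by the (+) index; each value holds the (F, sigF)
    tuple for "plus" and "minus", or None if that member was not present.
    """
    paired = {}
    for hkl, f, sigf in reflections:
        if is_plus(hkl):
            key, slot = tuple(hkl), "plus"
        else:
            key, slot = tuple(-index for index in hkl), "minus"
        record = paired.setdefault(key, {"plus": None, "minus": None})
        record[slot] = (f, sigf)
    return paired
```

A reflection whose mate is absent from the input keeps None in the missing slot, which mirrors the case where only one member of the pair was measured.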
mmCIF Resources (including the data dictionaries):
PDB_Extract: