This paper provides a brief introduction to the storage and manipulation of reflection data in the Clipper C++ libraries for crystallographic computation. The library is designed to make all common calculations quick and convenient. In particular, symmetry is handled automatically without any additional code. The library is extensible to new data types which were not available at the compilation of the libraries.
Understanding HKL_info and HKL_data<>
Storage of reciprocal space data in Clipper involved two objects: HKL_info and HKL_data<>. The first stores a list of reflections (HKLs), and the second stores a list of data of some type for each reflection.
Why is this division made? Because often, many types of data will be stored for the same reflection list, and duplicating the list of reflections for each data list would be wasteful. The relationship of an HKL_info object and several HKL_data<> objects is shown below.
These objects handle all the crystallographic symmetry operations associated with storing data in reciprocal space. To the programmer, the data appears to fill a sphere of reciprocal space. However, only a unique set of data is stored, and any changes to a data values are automatically reflected in the symmetry and Friedel related reflections.
The HKL_info object stores a list of reflections, transformed into a standard reciprocal space asymmetric unit. For example:
The HKL_info object also stores a lookup table, which is used to rapidly determine the location in the list of a given HKL. It also stores lookup tables for the reflection class (HKL_class) and resolution (invresolsq).
The reflection list depends upon the spacegroup (which determines the reciprocal asymmetric unit and systematic absences), the cell parameters, and the resolution limit. Therefore to initialise an HKL_info object, objects of types Spacegroup, Cell and Resolution are passed to the constructor or initialiser.
The data object is not much more than an array of data, but it has a number of special methods to make crystallographic operations on that data more convenient. It is defined as a template class, where the template type is the type of crystallographic information to be stored in the object. It simply stores a list of data of that type, and a pointer to the parent HKL_info object, which defined the HKL for each element in the list. Additionally a pointer to a Cell object is stored, which may optionally be used for the case where different data comes from slight different unit cells (e.g. RT and frozen data). Therefore to initialise an HKL_data object, an HKL_info object and optionally a Cell object are passed to the constructor or initialiser.
Data types typically include several values. Examples include measured X-ray intensity and its standard deviation (I_sigI), structure factor magnitude and phase (F_sigF), and Hendrickson-Lattmann coefficients (ABCD). Data types are derived from a base type (Datatype_base), and should override all the methods of that type. This will allow the data to be automatically transformed about reciprocal space, and imported or exported to a file, as required.
Methods are provided to access the data by index or by HKL. Any transformations which must be applied to the data in obtaining a symmetry or Friedel related value are applied automatically.
In order to use the class efficiently, some important difficulties must be borne in mind.
A problem arises when we wish to apply some transformation to the values stored in a data list. In this case, we must access every unique value in the asymmetric unit once and once only, applying the desired transformation. Only then will the entire data list have been transformed correctly.
A second problem arises if we want to access the stored value of the data at some position in reciprocal space, for example to expand to a lower spacegroup. Then it is necessary to search through all the symmetry operators, applying each one in turn to the desired HKL to find the operator which brings the HKL into the stored asymmetric unit, with a Friedel inversion if necessary. Clearly this can be time consuming, especially if there are many symmetry operators.
Both these problems are addressed by the use of HKL reference types. These come in two forms:
The coordinate-like reference behaves like an HKL coordinate: it stores a reference to an HKL_info and an HKL. However to enhance performance it also stores the index corresponding to that HKL, and the number of the symmetry operator used to get back into the stored asymmetric unit, along with a flag to signify Friedel inversion. Since reflections are usually accessed systematically, the next HKL used will commonly require the same symmetry operator, and so that operator is tried first. Methods are provided for incrementing and decrementing along the h, k, and l directions.
The differences between the index-like and coordinate-like reference types can be summarised as follows:
Importing and exporting HKL_data
To import a datalist we need an HKL_info object to hold the reflection list and an HKL_data object to hold the actual data. We also need MTZdataset and MTZcrystal objects to hold the additional information which will be returned from the MTZ file. Then we create an MTZfile object, open it onto a file, and read the reflection list and data.
clipper::HKL_info myhkl; // define objects clipper::MTZdataset myset; clipper::MTZcrystal myxtl; clipper::HKL_data<clipper::data32::F_phi> fphidata( myhkl, myxtl ); clipper::MTZfile mtzin; mtzin.open_read( "my.mtz" ); // open new file mtzin.import_hkl_info( myhkl ); // read sg, cell, reso, hkls mtzin.import_hkl_data( fphidata, myset, myxtl, "native/CuKa/[FCAL PHICAL]" ); mtzin.close_read();
The same process is more elegantly achieved using containers, allowing all the information read from the reflection file to become part of a hierarchy. CMTZcrystal and CMTZdataset objects will be inserted in the hierarchy between the HKL_info and HKL_data objects.
clipper::CSpacegroup myspgr(); // define objects clipper::CCell mycell( myspgr ); clipper::CResolution myreso( mycell ); clipper::CHKL_info myhkl( myreso ); clipper::CHKL_data<clipper::data32::F_phi> fphidata( myhkl ); clipper::MTZfile mtzin; mtzin.open_read( "my.mtz" ); // open new file mtzin.import_hkl_info( myhkl ); // read sg, cell, reso, hkls mtzin.import_chkl_data( fphidata, "native/CuKa/[FCAL PHICAL]" ); mtzin.close_read();
Expanding reflection data to a lower symmetry
To expand a list of data to a lower symmetry, we need two reflection lists, one for each spacegroup; and two datalists, one for each reflection list. The lower symmetry list is then filled by looping over all the reflections in that list an requesting the value from the other list for that HKL.
clipper::HKL_info oldhkl( .... ); clipper::HKL_data<clipper::data32::f_phi> olddata(oldhkl); // ---- fill the objects here ---- clipper::HKL_info newhkl( Spacegroup( Spgr_descr( 1 ) ), oldhkl.cell(), oldhkl.resolution() ); clipper::HKL_data<clipper::data32::f_phi> newdata(oldhkl); HKL_info::HKL_reference_index ih; for ( ih = newhkl.first(); !ih.last; ih.next() ) { newdata[ih] = olddata[ih.hkl()]; }
Note that the '.hkl' is vital, as we want the data with the corresponding hkl, not the data from the corresponding position in the list. If efficiency is paramount, using an HKL_reference_coord to access the old list will save some searches over the symmetry operators:
clipper::HKL_info::HKL_reference_index ih; clipper::HKL_info::HKL_reference_coord ik( oldhkl ); for ( ih = newhkl.first(); !ih.last; ih.next() ) { ik.set_hkl( ih.hkl() ); newdata[ih] = olddata[ik]; }
Applying simple operations to a whole data list
While it is simple to loop through a reflection list and apply some transformation on the data, some simple operations have been automated using built-in C++ arithmetic operators for data of specific types, logical operators for data or any type, comparison operators for flags, and function 'Computation operators' for more complex operations.
clipper::HKL_data<clipper::data32::F_phi> fphi1, fphi2, fphi3; // ---- set data here ---- fphi3 = fphi2 + fphi1;
clipper::HKL_data<clipper::data32::F_sigF> fsigf, fsigftest; clipper::HKL_data<clipper::data32::Flag> flag; // ---- set data here ---- fsigftest = fsigf; fsigftest.mask( flag >= 18 );
Some computation operators simply convert a datalist of one type to a datalist of another type. For example, you can convert a phase and weight to Hendrickson Lattmann coefficients. (Of course C and D will be 0, because a phase and weight can only describe a unimodal distribution).
clipper::HKL_data<clipper::data32::Phi_fom> myphifom; // ---- set data here ---- clipper::HKL_data<clipper::data32::ABCD> myabcd; myabcd.compute( myphifom, clipper::data32::Compute_abcd_from_phifom() );
Some computation operators take data from two datalists. For example, you can calculate map coefficient (magnitude and phase) from a set of observed magnitudes and a phase and weight:
clipper::HKL_data<clipper::data32::F_sigF> myfsig; clipper::HKL_data<clipper::data32::Phi_fom> myphifom; // ---- set data here ---- clipper::HKL_data<clipper::data32::F_phi> myfphi; myfphi.compute( myfsig, myphifom, clipper::data32::Compute_fphi_from_fsigf_phifom() );
Computation operators may operate on a datalist itself, and can also take parameters. These parameters are passed to the constructor of the computation operator. For example, to apply a scale factor of 2.0 and and U-value of 0.5 to a list of reflections, use the following code:
clipper::HKL_data<clipper::data32::F_sigF> myfsig; // ---- set data here ---- myfsig.compute( myfsig, clipper::data32::Compute_scale_u<data32::F_sigF>( 2.0, 0.5 ) );
Computation operators are provided for performing addition, subtraction and scaling of map coefficients (i.e. F_phi), addition and computation of Hendrickson Lattmann coefficients, and for scaling data. It is fairly simple to define new computation operators, see core/hkl_convops.h.
Defining a new reflection datatype
Several data types are defined in the file hkl_datatypes.h
. Defining a new type proceeds as follows:
struct F_phi { float f,phi; F_phi() { f=phi=Nan(); } static const String type() { return "F_phi"; } void friedel() { phi=-phi; } void shift_phase(const ftype dphi) { phi+=dphi; } const bool missing() const { return (isnan(f) || isnan(phi)); } };
The datalist types are constructed from the individual data type by a template class.
If you need to store your new datatype in an MTZ file, you must also define an MTZ_iotype by derivation from clipper::MTZ_iotype_base, create a static instance of the new MTZ_iotype, and add it to the mtz_iotype_registry.