------------ CCP4 Newsletter - January 1997 ------------



The BLANC program suite for Protein Crystallography

Alexei A. Vagin¹ ², Garib N. Murshudov¹ ³, Boris V. Strokopytov¹

1. Institute of Crystallography, Leninsky pr.59, Moscow 117333, Russia
2. UCMB-ULB, Free University of Brussels, avenue Paul Heger cp160/16 - P2 1050 Brussels, Belgium
3. Chemistry Department, University of York, Heslington, York, U.K.


Dedicated to the memory of academician B.K.Vainshtein

Abstract

The BLANC program suite is a set of programs for macromolecular structure determination by X-ray crystallography. The suite is designed to provide experienced crystallographers and students with a number of simple tools and, at the same time, to allow new algorithms to be built and tested. Besides a set of small programs, the BLANC system introduces so-called superprograms, which are larger programs composed of several smaller ones. They follow the so-called black-box principle, requiring minimal preparation or intervention from the user. The programs are written in standard Fortran77 and are connected by standard BLANC data files. The package has been ported to all the major platforms, such as Unix, VMS and DOS. The current version of the suite is distributed by anonymous ftp.

Introduction

The BLANC program suite project was started in 1979 in the laboratory headed by B.K.Vainshtein at the Institute of Crystallography, Moscow. The goal of the project was to develop an independent, flexible set of programs that communicate with each other through standard data file formats. The programs can be combined in many different ways, allowing the user to perform any particular task. All computer code is written in standard Fortran77. The suite contains programs for analysis and merging of intensity data, and structure solution programs utilising SIR, MIR, SIRAS, MIRAS, molecular replacement and density modification methods. It also contains programs for crystallographic refinement and for analysis of structures. Programs for displaying electron density, rotation functions, etc. are also available. The suite has been used for the determination of a number of protein structures. Some examples are listed below (Table 1).

Table 1: Examples of protein structures solved using the BLANC suite
Protein                       Reference
Tyrosine phenol-lyase         Antson et al. 1992
Catalase                      Vainshtein et al. 1981
Thermitase                    Teplyakov et al. 1986
Ribonuclease C2               Polyakov et al. 1988
Ribonuclease Pb1              Pavlovsky et al. 1988
Aspartate aminotransferase    Malashkevitch et al. 1995
Pyrophosphatase               Chirgadze et al. 1989
Dehydrogenase                 Lamzin et al. 1992

The program suite

Basic concept

The main idea behind the BLANC suite is simplicity. During development, special attention was paid to making the program system as user-friendly as possible. Most BLANC programs do not require large amounts of memory, and most of them can be run on an IBM PC with 640 K of memory. All BLANC programs are written in standard FORTRAN and run at least under MS-DOS, VAX/VMS and UNIX. Some of the BLANC programs and superprograms are listed in Tables 2 and 3.

Table 2: Main BLANC programs
Program   Function
A. Entrance and exit.
readPDB   converts coordinate file formats to CIF
TOBLANC   converts structure factors to BLANC format
FROMBL    converts structure factor files to CIF
writePDB  converts coordinates to PDB file
B. Fourier transformation.
COEF      calculates various kinds of Fourier coefficients
FFT       calculates maps using FFT
RFT       calculates structure factors
C. Look up.
ISOLINE   draws maps as isolines (PostScript format)
D. Statistics.
FLSTAT    gives various statistics for structure factor files, etc.
MODCHECK  gives statistics about restraints
E. Scaling.
SCALE     calculates Wilson plot scale
PSCALE    Patterson origin peak scaling
ANISOSCL  calculates anisothermal scaling of two files
F. Modification, copy and merge.
MODDEN    density modification program
COPYFL    changes file titles, scale, etc.
CONCRD    modifies coordinate files
JOINFL    merges files of structure factors or phases
SORTMRG   reads, sorts and averages files of structure factors or phases
G. Molecular replacement.
RFCOEF    calculates coefficients of spherical harmonics
RFRES     calculates Rotation Function (Euler angles)
RFROT     calculates rotated spherical harmonic coefficients
RFADD     adds spherical harmonic coefficients
TRPACK    3D translation/packing/phased translation function
RTRANS    transforms Rotation Function map to polar angles
H. Isomorphous replacement.
PHASE     calculates Hendrickson-Lattman coefficients for a derivative
REFINE    full-matrix refinement of heavy atoms
J. Refinement.
ROTLSQ    rigid body refinement
K. Others.
GENDEN    generates electron density
PEAKSRCH  map peak search
WATPEAKS  water peak search and water replacing
FIT       superimposes two sets of coordinates
ABCDPH    phases from Hendrickson-Lattman coefficients
PHABCD    Hendrickson-Lattman coefficients
HISTOGRM  histogram matching
SURFACE   solvent accessible surface area
FRAGSRCH  builds a full atomic model of a protein from C_alpha atom coordinates
CONTACT   computes inter- and/or intramolecular contacts
L. Not converted to current version yet.
SEQSRCH   searches for an amino acid sequence in the local Sequence Data Bank
ALIGN     aligns amino acid sequences
BBONE     inserts side chains of a protein into electron density map
PATLSQ    refines orientation of a model before translation function search
GROUP     converts scattering from protein atoms to group scattering factors
LOCSCL    anisothermal local scaling
DPLOT     draws PostScript stereo picture of the model with electron density
SKELETON  density skeletonisation procedure
SKELETON density skeletonisation procedure

Table 3: BLANC superprograms
Program   Function
A. Isomorphous replacement method:
MIR       automated heavy atom search and phasing
SIR       automated one derivative heavy atom search and phasing
PATTSRCH  automated reciprocal heavy atom structure solution
B. Molecular replacement method:
MOLREP    performs automated molecular replacement search
SELFROT   calculates self-rotation function
CROSSROT  calculates cross-rotation function
TRFUN     calculates translation and packing function
C. Refinement:
MMM       Macro Molecular Minimisation/Crystallographic refinement
MAKECIF   creates list of geometric and energetic parameters
LIBCHECK  reads library of monomers, performs various checks
EMIN      performs energy minimisation
DENMOD    phase refinement by density modification
D. Others:
OMIT      calculates omit synthesis phases
OMIT_MAP  creates global omit map
SFCHECK   checks quality of X-ray structures

Libraries

BLANC maintains a library of subroutines for basic crystallographic and programming operations. Common subroutines (e.g., to open and close data files, read and write data, perform FFTs and matrix operations) are gathered in a special library (LIBUTILS). This markedly shortens program code and makes the programs easy to read and modify. Each program also has a subroutine version, gathered in another library (LIBSUBR). This allows a programmer to develop larger programs composed of smaller ones.

Three levels of programming in BLANC. Introduction of superprograms

There are three main levels of programming modules in the BLANC suite. The first level consists of superprograms, which normally implement a complete method (e.g., molecular replacement using a known model). Some programs may act as subroutines inside a superprogram. On the second level are the usual crystallographic programs, which perform basic operations such as calculation of structure factors, electron density, etc. They use subroutines from the library. The subroutines themselves constitute the third level. The goal of this level is to solve local tasks only: matrix operations, FFT, opening and closing files, etc. This arrangement of BLANC programs significantly simplifies the development of new programs.
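
The three-level arrangement can be illustrated with a small sketch. It is written in Python rather than Fortran77 purely for brevity, and every name in it (scale_factor, r_factor, scale_program, stats_program, check_superprogram) is hypothetical and does not correspond to an actual BLANC routine. It only shows how library subroutines, programs callable as subroutines, and a superprogram might be layered.

```python
# Level 3: library subroutines that solve local tasks only.
# (Hypothetical stand-ins for routines of the LIBUTILS kind.)
def scale_factor(f_obs, f_calc):
    """Least-squares scale k that minimises sum((f_obs - k * f_calc)**2)."""
    return sum(o * c for o, c in zip(f_obs, f_calc)) / sum(c * c for c in f_calc)


def r_factor(f_obs, f_calc):
    """Conventional R-factor: sum|Fo - Fc| / sum(Fo)."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


# Level 2: programs, each also usable as a subroutine
# (the role the text assigns to LIBSUBR).
def scale_program(f_obs, f_calc):
    """Put calculated amplitudes on the scale of the observed ones."""
    k = scale_factor(f_obs, f_calc)
    return [k * c for c in f_calc]


def stats_program(f_obs, f_calc):
    """Report simple agreement statistics between two amplitude sets."""
    return {"n": len(f_obs), "r_factor": r_factor(f_obs, f_calc)}


# Level 1: a superprogram that chains programs with no user intervention.
def check_superprogram(f_obs, f_calc):
    scaled = scale_program(f_obs, f_calc)
    return stats_program(f_obs, scaled)


if __name__ == "__main__":
    print(check_superprogram([10.0, 20.0, 30.0], [5.0, 10.0, 15.0]))
```

In this sketch the superprogram never touches the low-level arithmetic directly, which is the property that makes new superprograms cheap to assemble from existing programs.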

Original features of BLANC

BLANC contains a set of original algorithms and programs developed independently by us. Among them are algorithms for calculation of translation and packing functions (Vagin, A.A., 1983; Vagin, A.A., 1989), a new program for data scaling using the Patterson origin peak (to be published elsewhere), programs for black-box molecular replacement and black-box heavy-atom search and phasing, a global omit map program (Vagin, A.A., unpublished results) and others.

File formats

There are four main types of file format: for reflection data, map data, coordinate data and graphics meta-files. The coordinate data files are in ASCII, but reflection and map files are binary.

The BLANC reflection files in most cases use 12 bytes of disk space per reflection. The three reflection indices are packed into one integer*4, and two real numbers store the amplitude and its error estimate (sigma). The header records contain information such as cell dimensions and symmetry operators. The reflection data are stored notionally as columns of real numbers. There is no need to mark columns with special labels, since native, derivative and calculated data are always kept in separate structure factor files.

Maps are stored in binary sequential-access files as a three-dimensional array preceded by a header, which contains the map dimensions, cell and symmetry information, and the maximum, minimum, mean and root-mean-square deviation of the density values, etc. Each density grid point is packed into a two-byte integer. BLANC maps can be converted to other map file formats for use on graphical devices.

The standard coordinate file format is close to the mmCIF format (Bourne et al., 1996). The program suite allows conversion from BLANC/mmCIF format to the PDB format (Bernstein et al., 1977) and vice versa. Graphical programs produce output in PostScript format.
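
The 12-byte reflection record described above (one integer*4 holding the three indices, plus two reals) can be sketched as follows. The text does not say how the indices are packed, so the scheme used here (10 bits per index with an offset of 512, little-endian byte order) is purely an assumption for illustration, and the function names are hypothetical. The sketch is in Python rather than Fortran77.

```python
import struct

# ASSUMPTION: 10 bits per index with an offset of 512, so each of h, k, l
# must lie in [-512, 511]. The actual BLANC packing is not given in the text.
BITS = 10
OFFSET = 1 << (BITS - 1)  # 512
MASK = (1 << BITS) - 1    # 1023

# One 4-byte integer plus two 4-byte reals: 4 + 4 + 4 = 12 bytes.
# ASSUMPTION: little-endian byte order.
RECORD = struct.Struct("<iff")


def pack_hkl(h, k, l):
    """Pack three Miller indices into a single 32-bit integer."""
    for index in (h, k, l):
        if not -OFFSET <= index < OFFSET:
            raise ValueError(f"index {index} outside [-{OFFSET}, {OFFSET - 1}]")
    return ((h + OFFSET) << (2 * BITS)) | ((k + OFFSET) << BITS) | (l + OFFSET)


def unpack_hkl(packed):
    """Recover (h, k, l) from an integer produced by pack_hkl."""
    h = ((packed >> (2 * BITS)) & MASK) - OFFSET
    k = ((packed >> BITS) & MASK) - OFFSET
    l = (packed & MASK) - OFFSET
    return h, k, l


def write_reflection(h, k, l, amplitude, sigma):
    """Return the 12-byte record for one reflection."""
    return RECORD.pack(pack_hkl(h, k, l), amplitude, sigma)


def read_reflection(record):
    """Return (h, k, l, amplitude, sigma) from a 12-byte record."""
    packed, amplitude, sigma = RECORD.unpack(record)
    return unpack_hkl(packed) + (amplitude, sigma)


if __name__ == "__main__":
    rec = write_reflection(-3, 7, 12, 152.5, 4.25)
    print(len(rec), read_reflection(rec))
```

Whatever the real bit layout, the point of such packing is that the indices cost 4 bytes instead of 12, which matters for the memory-limited machines the suite targets.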

Documentation, Installation and Distribution

The BLANC manual gives the details of the installation procedure. In order to run the programs, certain environment variables need to be set to appropriate values. Each run of a program produces an output document file containing the necessary information about its progress.

The program suite has been implemented on a large number of platforms, including Unix. Installation is straightforward and full instructions are given in the BLANC manual.

The BLANC program suite is licensed free of charge to academic institutions. The programs may be obtained by Internet ftp from anonymous@ftp.ucmb.ulb.ac.be (first read the file pub/alexei/blanc/README). Several programs and superprograms that are independent of the BLANC suite (SFCHECK, MOLREP, CONTACT, MAKECIF, EMIN, LIBCHECK, etc.) are kept in separate directories at the anonymous ftp site. Separate arrangements can be made for commercial organisations. For further details contact Dr. A. Vagin (email: alexei@ucmbcx1.ulb.ac.be).

Acknowledgements

We are very grateful to all our former colleagues who made significant contributions to this project and helped us to eliminate bugs in the programs. We also thank them for numerous scientific discussions.

References