5. Crank pipeline tutorial

Project directory:  5_crank_infl
Reflection data:    gere_MAD_nat.mtz
Sequence file:      gere.pir

5.1. Introduction

This practical demonstrates how Crank solves a macromolecular structure from SAD data using the pipeline Afro - Crunch2 - BP3 - Solomon - Buccaneer.

We will use SAD data from the selenomethionine protein GerE (PDB code: 1FSE), which was originally solved by MAD. The complete MAD and native data set is distributed with the CCP4 package and has also been copied to the project directory for convenience.

5.2. Instructions

To run the tutorial, please follow the steps below. In the section Experimental Pipeline you can change the program used at each pipeline step. Select Display individual program options to see and modify the options of the selected programs. Please consult the CCP4 Crank wiki page for further details.

5.3. Crank output

Files associated with each step of the Crank pipeline are stored in the corresponding subdirectories, together with the scripts used to run each program. The subdirectory workdb contains the files passed between consecutive steps: crank.in.#-NAME.* are the input files of the specified stage and crank.out.#-NAME.* are its output files. The subdirectory logs contains the log files from each stage. The subdirectory xml contains the file input.xml with the input parameters that Crank uses to run each program.
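The naming convention above can be used to get a quick overview of what each stage consumed and produced. The following is a minimal Python sketch, not part of Crank: the function name group_workdb_files is hypothetical, and the pattern assumes that the step number and stage name are joined by either "-" or "_", since both forms appear in the file names quoted in this tutorial.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed pattern: crank.<in|out>.<step><sep><NAME>.<rest>, with <sep> "-" or "_".
WORKDB_NAME = re.compile(r"^crank\.(in|out)\.(\d+)[-_]([A-Za-z0-9]+)\.(.+)$")


def group_workdb_files(workdb):
    """Return {(step, NAME): {"in": [...], "out": [...]}} for the files in workdb."""
    stages = defaultdict(lambda: {"in": [], "out": []})
    for path in sorted(Path(workdb).iterdir()):
        match = WORKDB_NAME.match(path.name)
        if match is None:
            continue  # not a crank.in / crank.out file
        direction, step, name, _rest = match.groups()
        stages[(int(step), name.upper())][direction].append(path.name)
    return dict(stages)
```

Calling group_workdb_files("workdb") from the job directory and printing the sorted items lists the input and output files stage by stage.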

1) Prep

Crank accepts amplitude or intensity data from several datasets. At the Prep stage the SCALEIT program puts all the datasets on the same scale. If intensities are supplied, CTRUNCATE converts them into amplitudes. The results are written to the file crank.out.1_PREP.mtz in the workdb directory.

2) Afro

Afro estimates the contribution of the heavy-atom substructure to the experimental structure factors (F_A values). As input it takes the F+/sigF+ and F-/sigF- values, estimates of the B-factors, the number of heavy atoms and the form factors (f' and f''). On completion Afro writes the columns 2_AFRO_FA/2_AFRO_SIGFA (F_A values) and 2_AFRO_EA/2_AFRO_SIGEA (normalized structure factors, E-values) to workdb/crank.out.2_AFRO.mtz. These data are used in the next stage to locate the heavy-atom substructure.
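To give a feel for the quantities involved, the sketch below computes two textbook ingredients: the anomalous difference between F+ and F- with its propagated error, and a normalisation that scales amplitudes so that the mean of E^2 is 1 within each resolution shell. This is an illustration only and not Afro's algorithm; it ignores the B-factor estimates, the number of heavy atoms and the form factors that Afro takes as input. The function names and the equal-population shell scheme are assumptions made for this example.

```python
import math


def anomalous_difference(f_plus, sig_plus, f_minus, sig_minus):
    """Return (F+ - F-, sigma) for one reflection, assuming independent errors."""
    return f_plus - f_minus, math.hypot(sig_plus, sig_minus)


def normalise_in_shells(amplitudes, resolutions, n_shells=10):
    """Scale amplitudes so that <E^2> = 1 within each resolution shell.

    Reflections are sorted from low to high resolution (large to small d
    spacing) and split into shells of roughly equal population.
    """
    order = sorted(range(len(amplitudes)), key=lambda i: resolutions[i], reverse=True)
    e_values = [0.0] * len(amplitudes)
    shell_size = max(1, math.ceil(len(order) / n_shells))
    for start in range(0, len(order), shell_size):
        shell = order[start:start + shell_size]
        mean_sq = sum(amplitudes[i] ** 2 for i in shell) / len(shell)
        for i in shell:
            e_values[i] = amplitudes[i] / math.sqrt(mean_sq) if mean_sq > 0 else 0.0
    return e_values
```

Shell-wise normalisation removes the overall fall-off of the amplitudes with resolution, which is why E-values rather than raw amplitudes are passed on to the substructure search.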

3) Crunch2

Crunch2 finds the heavy-atom positions using the E-values 2_AFRO_EA/2_AFRO_SIGEA from the previous stage. The heavy-atom substructure is written to the file workdb/crank.out.3_CRUNCH2.substructure.pdb. The FOM data for all Crunch2 trials are summarised in the file log/3-crunch2-logfile.

4) BP3

BP3 is a phasing program that calculates expected values of the structure factor amplitudes and phases from the experimental data (F+/sigF+, F-/sigF-) and the heavy-atom substructure found in the previous step. For each of the two enantiomorphs, BP3 outputs the expected structure factor amplitudes and phases (4_BP3_F, 4_BP3_PHIB), FoM values (4_BP3_FOM), Hendrickson-Lattman (HL) coefficients (4_BP3_HLA, 4_BP3_HLB, 4_BP3_HLC, 4_BP3_HLD) and difference map coefficients (4_BP3_FDIFF, 4_BP3_PDIFF).
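The relationship between the HL coefficients, the phase and the FoM can be illustrated numerically. In the standard Hendrickson-Lattman form the phase probability for a reflection is proportional to exp(A cos(phi) + B sin(phi) + C cos(2 phi) + D sin(2 phi)); the centroid of exp(i phi) under this distribution gives a "best" phase (its argument) and a figure of merit (its modulus). The sketch below evaluates this on a regular phase grid. It is a generic illustration, not BP3 code, and the function name phase_statistics and the grid size are choices made for this example.

```python
import cmath
import math


def phase_statistics(hla, hlb, hlc, hld, n_steps=360):
    """Return (centroid phase in degrees, figure of merit) from HL coefficients."""
    phis = [2.0 * math.pi * k / n_steps for k in range(n_steps)]
    exponents = [
        hla * math.cos(p) + hlb * math.sin(p)
        + hlc * math.cos(2.0 * p) + hld * math.sin(2.0 * p)
        for p in phis
    ]
    peak = max(exponents)  # subtracted before exponentiating to avoid overflow
    weights = [math.exp(x - peak) for x in exponents]
    total = sum(weights)
    centroid = sum(w * cmath.exp(1j * p) for w, p in zip(weights, phis)) / total
    return math.degrees(cmath.phase(centroid)) % 360.0, abs(centroid)
```

A flat distribution (all four coefficients zero) gives a figure of merit close to 0, and a sharply peaked one gives a value close to 1, which is how the FoM column should be read when judging phase quality.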

5) Solomon

Solomon is a density modification program. It is first used to select one of the two enantiomorphs (hands); the results of these runs are stored in the subdirectories 5-solomon/hand1 and 5-solomon/hand2. The hand with the highest contrast and correlation is selected, and its optimized structure factors (5_SOLOMON_F, 5_SOLOMON_SIGF, 5_SOLOMON_PHIB), HL coefficients (5_SOLOMON_HLA, 5_SOLOMON_HLB, 5_SOLOMON_HLC, 5_SOLOMON_HLD) and FoM (5_SOLOMON_FOM) are written to the file workdb/crank.out.5-SOLOMON.mtz. In the next step Solomon performs density modification for the selected hand. The output files and column labels are the same as above, with the step number changed to 6.

6) Buccaneer

The improved map coefficients are used as input to the Buccaneer-Refmac model-building pipeline. The number of built and sequenced residues and the number of built chains can be found in the file log/7-buccaneer-logfile. The files crank.out.7_BUCCANEER.mtz and crank.out.7_BUCCANEER.pdb contain the final map coefficients and model, respectively. These files open automatically in Coot if the check box Display results with Coot has been selected. A Coot button is also available at the end of the result page.

5.4. What next

Not long ago, solving the GerE structure and building the model from the inflection data alone would not have been a trivial task. However, the new Crank, version 1.5, automatically builds a model that is more than 90% complete with the default parameters (see the Buccaneer summary at the end of the result page). Given such a good model, we can proceed immediately to refinement against the higher-resolution native data. Run refmac5 using the output model from Crank and the native GerE data ($CCP4/examples/tutorial/data/gere_MAD_nat.mtz, columns F_nat and SIGF_nat) and inspect the refined model and maps in Coot (there is a Coot button on the Refmac result page). At this point it should be easy to finalise the model manually. It is also possible to try ARP/wARP before proceeding to manual work.
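For those who prefer the command line to the GUI, the refinement job could be set up roughly as follows. This is a hedged sketch: it assumes that refmac5 is on the PATH, that the Crank model is at workdb/crank.out.7_BUCCANEER.pdb relative to the current directory, and that 10 refinement cycles is a reasonable starting point. The output prefix gere_refmac and the helper names are hypothetical, and no free-R column is assigned because the tutorial does not name one.

```python
import os
import subprocess


def build_refmac_job(model_pdb, native_mtz, out_prefix="gere_refmac", cycles=10):
    """Return (command list, keyword script) for a basic refmac5 run."""
    command = [
        "refmac5",
        "XYZIN", model_pdb,
        "HKLIN", os.path.expandvars(native_mtz),
        "XYZOUT", out_prefix + ".pdb",
        "HKLOUT", out_prefix + ".mtz",
    ]
    keywords = "\n".join([
        "LABIN FP=F_nat SIGFP=SIGF_nat",  # native columns named in the tutorial
        f"NCYC {cycles}",                 # number of cycles: an assumed default
        "END",
    ]) + "\n"
    return command, keywords


def run_refmac(model_pdb, native_mtz, **kwargs):
    """Run refmac5, passing the keyword script on standard input."""
    command, keywords = build_refmac_job(model_pdb, native_mtz, **kwargs)
    return subprocess.run(command, input=keywords, text=True, check=True)
```

For example, run_refmac("workdb/crank.out.7_BUCCANEER.pdb", "$CCP4/examples/tutorial/data/gere_MAD_nat.mtz") would write gere_refmac.pdb and gere_refmac.mtz, which can then be opened in Coot.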