5. Crank pipeline tutorial
Project directory:
5_crank_infl
Reflection data:
gere_MAD_nat.mtz
Sequence file:
gere.pir
5.1. Introduction
This practical demonstrates the use of Crank to solve a
macromolecular structure with the pipeline Afro - Crunch2 - BP3 -
Solomon - Buccaneer, using data from a SAD experiment.
We will use SAD data from the selenomethionine protein GerE (PDB
code: 1FSE), which was originally solved by MAD. The complete MAD
and native data set is distributed with the CCP4 package and has
also been copied to the project directory for convenience.
5.2. Instructions
To run the tutorial, follow these steps.
- Open the Crank task interface in CCP4i: Experimental
Phasing > Automated Search & Phasing >
Crank - automated EP pipeline
- Specify a title in the Title field (e.g. "Crank
tutorial GerE structure" or anything you like).
- In the Type of experiment field, select Single
wavelength anomalous diffraction (SAD).
- Make sure that the check box Input protein sequence
is selected. In the SEQ in line, use Browse to specify the
sequence file gere.pir.
- Check the sequence file (View button) to make sure that it
is in valid PIR format, i.e. its second line does not contain
any sequence letters.
- In the MTZ in field, specify gere_MAD_nat.mtz.
- In the Input field, select the Amplitudes option.
- In the Crystal #1 section, specify Se in the Substructure
atom field and enter 2 in the Number
of substructure atoms per monomer field.
- Make sure that SAD is selected for Dataset : 1
Type, and specify the scattering factors f' = -3.5 and
f'' = 5.4 in the Atom line.
- In the FP+ menu, select F_infl(+); the
other three MTZ labels will be set automatically to SIGF_infl(+),
F_infl(-) and SIGF_infl(-).
- In the Experimental Pipeline section, choose to
Start pipeline with Substructure detection and end
with Model building.
- Run > Run Now
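The sequence-file check in the steps above can also be done with a
short script. The following is a minimal Python sketch, assuming the
usual PIR layout: a header line starting with ">", a free-text
description on the second line, and the sequence on the following
lines, terminated by "*". The function names and the example entries
are hypothetical and are not part of Crank or CCP4; the second-line
test is only a heuristic (a description made up solely of upper-case
residue letters would be flagged as well).

```python
# One-letter residue codes accepted by this sketch (X = unknown residue).
AMINO = set("ACDEFGHIKLMNPQRSTVWYX")


def looks_like_sequence(line):
    """Heuristic: the line consists only of upper-case residue letters."""
    stripped = line.strip().rstrip("*")
    return bool(stripped) and set(stripped) <= AMINO


def check_pir_text(text):
    """Return a list of problems found in PIR-formatted text (empty if none)."""
    lines = [line.rstrip() for line in text.splitlines()]
    while lines and not lines[0]:          # ignore leading blank lines
        lines.pop(0)
    if len(lines) < 3:
        return ["expected a header line, a description line and a sequence"]

    problems = []
    if not lines[0].startswith(">"):
        problems.append("first line should start with '>'")
    if looks_like_sequence(lines[1]):
        problems.append("second line looks like sequence, not a description")

    sequence = "".join(lines[2:]).replace(" ", "")
    if not sequence.endswith("*"):
        problems.append("sequence should end with '*'")
    residues = sequence.rstrip("*")
    if not residues or not residues.isalpha():
        problems.append("sequence should contain only letters before '*'")
    return problems


# Made-up residues for illustration only, not the real GerE sequence.
GOOD = ">P1;example\nExample entry (hypothetical)\nMKEKL\nAAAGG*\n"
BAD = ">P1;example\nMKEKL\nAAAGG*\n"   # sequence starts on the second line
```

Here check_pir_text(GOOD) returns an empty list, while
check_pir_text(BAD) reports that the second line looks like sequence.
To check the tutorial file, read gere.pir into a string and pass it to
check_pir_text.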
In the Experimental Pipeline section you can change the
program used at each pipeline step. Select Display
individual program options to see and modify the options of the
selected programs. Please consult the CCP4
Crank wiki page for further details.
5.3. Crank output
Files associated with each step of the Crank pipeline are stored
in the corresponding subdirectories. These include the scripts used
to run each program. The subdirectory workdb
contains the files passed between consecutive steps: the files
crank.in.#-NAME.* are the inputs of the specified stage
and the files crank.out.#-NAME.* are its outputs.
The subdirectory logs contains the log files
from each stage. The subdirectory xml contains the input.xml
file with the input parameters used by Crank to run each
program.
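To get an overview of what each stage read and wrote, the workdb
files can be grouped by stage. The following Python sketch assumes
the naming scheme described above (crank.in.#-NAME.* and
crank.out.#-NAME.*) and accepts either "-" or "_" between the step
number and the stage name, since both appear in the file names quoted
in this tutorial. The function name group_workdb_files is
hypothetical and is not part of Crank.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed pattern: crank.<in|out>.<step><- or _><NAME>.<rest of file name>
PATTERN = re.compile(r"^crank\.(in|out)\.(\d+)[-_]([A-Za-z0-9]+)\.(.+)$")


def group_workdb_files(workdb):
    """Map (step number, stage name) to its input and output file names."""
    stages = defaultdict(lambda: {"in": [], "out": []})
    for path in sorted(Path(workdb).iterdir()):
        match = PATTERN.match(path.name)
        if match is None:
            continue                      # not a crank.in/crank.out file
        direction, step, name, _rest = match.groups()
        stages[(int(step), name.upper())][direction].append(path.name)
    # Return the stages ordered by step number.
    return dict(sorted(stages.items()))
```

Calling group_workdb_files("workdb") from the job directory returns a
dictionary keyed by (step, stage name), so the files can be printed in
pipeline order.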
1) Prep
It is possible to input amplitude or intensity data from several
datasets into Crank. At the Prep stage, the SCALEIT program is used
to put all the datasets on the same scale. If intensities are
input, CTRUNCATE is used to convert them into
amplitudes. The results are written to the file crank.out.1_PREP.mtz
in the workdb directory.
2) Afro
Afro estimates the contribution of the heavy atom substructure to
the experimental structure factors (F_A values). As input it takes
the F+/sigF+ and F-/sigF- values, estimates of the B-factors, the
number of heavy atoms and the form factors (f' and f''). On
completion, Afro writes the columns 2_AFRO_FA/2_AFRO_SIGFA
(F_A values) and 2_AFRO_EA/2_AFRO_SIGEA (normalized structure
factors, E-values) to workdb/crank.out.2_AFRO.mtz. These data are
used in the following stage to identify the heavy atom substructure.
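To give a feel for the quantities involved, the sketch below computes
the Bijvoet difference F+ - F- with its propagated sigma, and rescales
a list of amplitudes so that their mean square is 1. This is a
deliberately crude illustration in Python and is not Afro's algorithm:
Afro's F_A estimation also uses the B-factors, atom count and form
factors listed above, and real normalization is usually done in
resolution shells. The function names are hypothetical.

```python
import math


def bijvoet_difference(f_plus, sig_plus, f_minus, sig_minus):
    """Return (F+ - F-, sigma), assuming independent errors on F+ and F-."""
    delta = f_plus - f_minus
    sigma = math.hypot(sig_plus, sig_minus)   # sqrt(sig+^2 + sig-^2)
    return delta, sigma


def normalise(amplitudes):
    """Scale amplitudes so that the mean of their squares equals 1.

    A single global scale is used here; this is a simplification.
    """
    mean_square = sum(a * a for a in amplitudes) / len(amplitudes)
    if mean_square == 0:
        raise ValueError("all amplitudes are zero")
    scale = math.sqrt(mean_square)
    return [a / scale for a in amplitudes]
```

For example, bijvoet_difference(10.0, 1.0, 8.0, 1.0) gives a
difference of 2.0 with a sigma of sqrt(2).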
3) Crunch2
Crunch2 finds the heavy atom positions using the E-values
2_AFRO_EA/2_AFRO_SIGEA from the previous stage. The heavy atom
substructure is written to the file workdb/crank.out.3_CRUNCH2.substructure.pdb.
The FOM data for all Crunch2 trials are summarised in the
file log/3-crunch2-logfile.
4) BP3
BP3 is a phasing program that calculates expected values for
structure factor amplitudes and phases based on the experimental
data (F+/sigF+, F-/sigF-) and the heavy atom substructure output by
the previous step. BP3 outputs expected values for structure
factor amplitudes and phases (4_BP3_F, 4_BP3_PHIB), FoM values
(4_BP3_FOM), Hendrickson-Lattman (HL) coefficients (4_BP3_HLA,
4_BP3_HLB, 4_BP3_HLC, 4_BP3_HLD) and difference map coefficients
(4_BP3_FDIFF, 4_BP3_PDIFF) for both enantiomorphs.
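The relation between HL coefficients, the centroid phase and the FoM
can be illustrated numerically. In the standard Hendrickson-Lattman
description the phase probability is proportional to
exp(A cos(phi) + B sin(phi) + C cos(2 phi) + D sin(2 phi)); the
centroid phase is the argument, and the FoM the modulus, of the
probability-weighted mean of exp(i phi). The Python sketch below
evaluates this on a grid of phases. It only illustrates how the
columns relate to each other; the values BP3 writes come from its own
calculation, and the function name hl_to_phase_fom is hypothetical.

```python
import cmath
import math


def hl_to_phase_fom(a, b, c, d, steps=360):
    """Return (centroid phase in degrees, FoM) for one set of HL coefficients."""
    phis = [2.0 * math.pi * k / steps for k in range(steps)]
    exponents = [
        a * math.cos(p) + b * math.sin(p)
        + c * math.cos(2.0 * p) + d * math.sin(2.0 * p)
        for p in phis
    ]
    # Subtract the maximum before exponentiating to avoid overflow.
    top = max(exponents)
    weights = [math.exp(x - top) for x in exponents]
    total = sum(weights)
    centroid = sum(w * cmath.exp(1j * p) for w, p in zip(weights, phis)) / total
    fom = abs(centroid)
    phase = math.degrees(cmath.phase(centroid)) % 360.0
    return phase, fom
```

With all four coefficients equal to zero the distribution is flat and
the FoM is essentially 0; a large positive B alone gives a phase close
to 90 degrees with a high FoM.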
5) Solomon
Solomon is a density modification program. It is first used to
select one of the two enantiomorphs. Results from these runs
are stored in the 5-solomon/hand1 and 5-solomon/hand2
subdirectories. The hand with the highest contrast and
correlation is selected, and its optimized structure factors
(5_SOLOMON_F, 5_SOLOMON_SIGF, 5_SOLOMON_PHIB), HL coefficients
(5_SOLOMON_HLA, 5_SOLOMON_HLB, 5_SOLOMON_HLC, 5_SOLOMON_HLD) and
FoM (5_SOLOMON_FOM) are written to the file workdb/crank.out.5-SOLOMON.mtz.
In the next step, Solomon is used for density modification of the
selected hand. The output files and column labels are the same as
above, but with the step number changed to 6.
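The two hands tested here are related by inversion: the alternative
substructure is obtained by inverting the heavy atom sites through
the origin. The Python sketch below shows this for fractional
coordinates. It is an illustration only and makes no claim about how
Crank implements the test; for enantiomorphic space group pairs the
space group must be changed as well, and a few space groups need an
additional origin shift, neither of which is handled here. The
function name and the example sites are hypothetical.

```python
def invert_hand(fractional_sites):
    """Invert fractional coordinates through the origin, wrapped into [0, 1)."""
    return [tuple((-coord) % 1.0 for coord in site) for site in fractional_sites]


# Two made-up sites, used only to show the call.
sites = [(0.10, 0.25, 0.50), (0.75, 0.00, 0.30)]
other_hand = invert_hand(sites)
```

Applying invert_hand twice returns the original sites (up to rounding),
as expected for an inversion.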
6) Buccaneer
The improved map coefficients are used as input for the
Buccaneer-Refmac model-building pipeline. The numbers of built
and sequenced residues and the number of built chains can be
found in the file log/7-buccaneer-logfile. The files crank.out.7_BUCCANEER.mtz
and crank.out.7_BUCCANEER.pdb contain the final map
coefficients and model, respectively. These files open
automatically in Coot if the check box Display results with
Coot has been selected. A Coot button is also available at the
end of the result page.
5.4. What next
Not long ago, solving the GerE structure and building the model using
the inflection data would not have been a trivial task. However,
the new Crank, version 1.5, automatically builds a model that is
more than 90% complete with the default parameters (see the Buccaneer
summary at the end of the result page). With such a good model we can
proceed immediately to refinement against the higher-resolution
native data. Run refmac5 using the output model from Crank and the
native GerE data ($CCP4/examples/tutorial/data/gere_MAD_nat.mtz,
columns F_nat and SIGF_nat) and inspect the refined model and maps in
Coot (there is a Coot button on the Refmac result page). At this
point it should already be easy to finalise the model manually. It
is also possible to try ARP/wARP before proceeding to manual work.
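The refinement step can also be run outside the GUI. The Python
sketch below builds a refmac5 command line and keyword input for the
columns F_nat and SIGF_nat named above. It is a minimal sketch under
stated assumptions: the number of cycles (10) is an arbitrary choice,
no free-R flag column is assigned because the tutorial does not name
one, the output file names are placeholders, and the path of the
Crank output model must be supplied by the user. The helper names are
hypothetical; only the refmac5 logical names (XYZIN, HKLIN, XYZOUT,
HKLOUT) and the keywords LABIN, NCYC and END belong to Refmac itself.

```python
import shutil
import subprocess


def refmac_keywords(f_label="F_nat", sigf_label="SIGF_nat", cycles=10):
    """Keyword input for refmac5 (cycle count is an arbitrary choice)."""
    return "\n".join([
        f"LABIN FP={f_label} SIGFP={sigf_label}",
        f"NCYC {cycles}",
        "END",
    ]) + "\n"


def refmac_command(xyzin, hklin, xyzout, hklout):
    """Command line for refmac5 using its logical file names."""
    return ["refmac5", "XYZIN", xyzin, "HKLIN", hklin,
            "XYZOUT", xyzout, "HKLOUT", hklout]


def run_refmac(xyzin, hklin, xyzout="refined.pdb", hklout="refined.mtz"):
    """Run refmac5; requires the CCP4 environment to be set up."""
    if shutil.which("refmac5") is None:
        raise RuntimeError("refmac5 not found; set up the CCP4 environment first")
    return subprocess.run(
        refmac_command(xyzin, hklin, xyzout, hklout),
        input=refmac_keywords(),   # keywords are passed on standard input
        text=True,
        check=True,
    )
```

For example, run_refmac with the Crank output model as xyzin and
gere_MAD_nat.mtz as hklin writes refined.pdb and refined.mtz, which
can then be inspected in Coot.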