MrBUMP Tutorial
The following tutorial takes you through three example MR problems,
which illustrate various aspects of MrBUMP. Each example starts from a sequence
file containing the target sequence and an MTZ file containing native structure
factor amplitudes, i.e. it assumes you have finished processing your data.
The result of MrBUMP is a set of likely or possible molecular replacement
solutions. The tutorial contains brief suggestions of what to do next.
Normally, you would let MrBUMP look for suitable search model templates,
without specifying what they should be. If MrBUMP finds several possible
templates, then the MrBUMP job can take some time to try them all. In your own
lab, this would not be an issue - let MrBUMP run while you go do other things.
For the purposes of this tutorial, we restrict MrBUMP so that each example
runs relatively quickly (though it may still take some time):
- For each example, we suggest suitable search model templates, which can
be specified explicitly to MrBUMP.
- You can limit the number of models passed to MR.
- Don't be afraid to kill a job once you can see what is happening, and
re-run concentrating on good models.
It is recommended that you create a new ccp4i project for each example.
See
http://www.ccp4.ac.uk/dist/examples/tutorial/html/intro-tutorial.html if you
do not know how to do this.
In the following, $MRBUMP_TUTORIAL refers to the top directory of the tutorial,
where you will find this file and the data subdirectory.
Example 1 - hypF
Introduction
The target is the acylphosphatase-like domain of hydrogenase maturation factor HypF
from E. coli; see Rosano et al., JMB, 321, 785 (2002). The HypF-ACP sulphate
and phosphate complexes have been deposited as 1gxt and 1gxu respectively.
We have prepared a reflection file for you with native data from 1gxu,
in spacegroup H32 and extending to 1.3 A resolution. The target domain has 91 residues
and a Matthews calculation strongly suggests only one molecule in the asymmetric unit.
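For reference, the Matthews calculation mentioned above can be sketched as follows. The Matthews coefficient is Vm = V / (Z * n * M), where V is the unit cell volume, Z the number of asymmetric units per cell, n the number of molecules per asymmetric unit and M the molecular mass. This is an illustration only: the cell dimensions are hypothetical placeholders (this tutorial does not list the 1gxu cell), the mass uses the rule of thumb of about 110 Da per residue, and the function names are made up for this example. Z = 18 is the value for H32 in the hexagonal setting.

```python
import math

AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb approximation, not an exact mass

def hexagonal_cell_volume(a, c):
    """Volume in A^3 of a cell with a = b, alpha = beta = 90, gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))

def matthews_coefficient(cell_volume, n_asu_per_cell, n_molecules, n_residues):
    """Vm in A^3/Da for n_molecules copies per asymmetric unit."""
    mass = n_residues * AVERAGE_RESIDUE_MASS_DA
    return cell_volume / (n_asu_per_cell * n_molecules * mass)

def solvent_fraction(vm):
    """Approximate solvent fraction, using the usual 1 - 1.23/Vm estimate."""
    return 1.0 - 1.23 / vm

# Placeholder cell dimensions, NOT the real 1gxu cell.
volume = hexagonal_cell_volume(a=50.0, c=200.0)
for n in (1, 2):
    vm = matthews_coefficient(volume, n_asu_per_cell=18, n_molecules=n, n_residues=91)
    print(f"{n} molecule(s): Vm = {vm:.2f} A^3/Da, solvent = {solvent_fraction(vm):.0%}")
```

Values of Vm for protein crystals typically fall in a fairly narrow range (very roughly 1.7 to 3.5 A^3/Da), so a number of copies that gives a Vm well outside it, or a negative solvent fraction, can be ruled out.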
The data
Input files supplied:
- hypF-1gxu.mtz (use columns FP1gxu, SIGFP1gxu, FREE)
- hypF_Ndom.seq
Local PDB files available (if you wish to bypass the full search):
- 1w2i_A.pdb (approx 42% sequence identity to target)
- 1v3z_B.pdb (approx 42% sequence identity to target)
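Before starting, it can be worth confirming that the files listed above are where the tutorial expects them. The following is a minimal sketch, assuming the environment variable MRBUMP_TUTORIAL has been set to the top directory of the tutorial; the helper name missing_inputs is hypothetical and not part of MrBUMP or CCP4.

```python
import os
from pathlib import Path

# File names are taken from the lists above.
EXAMPLE1_FILES = ["hypF-1gxu.mtz", "hypF_Ndom.seq", "1w2i_A.pdb", "1v3z_B.pdb"]

def missing_inputs(tutorial_dir, names=EXAMPLE1_FILES):
    """Return the names that are not present in <tutorial_dir>/data."""
    data_dir = Path(tutorial_dir) / "data"
    return [name for name in names if not (data_dir / name).is_file()]

if __name__ == "__main__":
    # Fall back to the current directory if the variable is not set.
    top = os.environ.get("MRBUMP_TUTORIAL", ".")
    absent = missing_inputs(top)
    print("all inputs present" if not absent else f"missing: {absent}")
```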
Checking the data
We first use Sfcheck to check a few things about the data:
- Select the Program List module and open the
sfcheck task window.
- Enter a title (e.g. "checking hypF data").
- Un-check Run Procheck to analyse structure geometry
(we do not yet have any coordinates)
- Select Run Sfcheck to analyse
experimental data only
- Enter MTZ in
$MRBUMP_TUTORIAL/data/hypF-1gxu.mtz
and select the labels FP FP1gxu,
SIGFP SIGFP1gxu and
FreeR FREE
- Enter Output
hypF_analysis.ps
- Click Run -> Run Now.
Sfcheck produces a postscript file (see View Files from Job
-> hypF_analysis.ps) with some useful things:
- Anisotropy of data (it is not very anisotropic)
- Overall B from Wilson plot of 21.8 A**2
- Pseudo-translation not detected (from analysis of the native Patterson map)
Also check the log file (View Files from Job, then
View Log File).
Running MrBUMP
- Select the Molecular Replacement module and open the
MrBUMP task window.
- Enter a title.
- Leave Program Mode
Model search and Molecular Replacement unchanged.
- Enter MTZ in
$MRBUMP_TUTORIAL/data/hypF-1gxu.mtz
and select the labels F FP1gxu,
Sigma SIGFP1gxu and
Free-R FREE
- Enter SEQ in
$MRBUMP_TUTORIAL/data/hypF_Ndom.seq
The rest of the interface is concerned with customising your run of MrBUMP.
We could accept the defaults and click Run -> Run Now at this point.
However, for this tutorial, we will look at some of the options. In particular, we
will look at two ways of specifying search model templates. Note that if you try
both options, it is safest to Save or Restore ->
Restore Default Parameters in between jobs.
Option 1:
We will explicitly specify 2 search model templates. This is useful if you know
which templates you want to look at, if you want a quick run, or if you don't have
internet access.
- Move to the Template Search Options folder.
- Un-check Do a FASTA search for possible template models.
- Un-check Update local copies of search databases
- You may need to change the Multiple alignment program,
depending on what is installed locally
- Un-check all Additional search methods, i.e. SCOP,
PQS and SSM
- The folder User specified search models will have
opened. Because we have switched off all search options, we are required to use
local files. Click on Add PDB file 2 times to add 2 local
PDB files (do not click on Add Chain id).
The first file is $MRBUMP_TUTORIAL/data/1w2i_A.pdb
and specify Chain identifier A.
The second file is $MRBUMP_TUTORIAL/data/1v3z_B.pdb
and specify Chain identifier B.
- Now skip Option 2 and continue with the steps that follow "After a quick
check of the rest of the interface" below.
Option 2:
We will let MrBUMP search for suitable search model templates. It is still a good idea
to let MrBUMP do the FASTA search locally, but MrBUMP will need to download the PDB
files that it requires. To keep the example quick, we limit the number of models passed
to the MR step.
- Move to the Template Search Options folder.
- Check Do a FASTA search for possible template models.
- Check Run the FASTA search locally.
- Check Update local copies of search databases
- You may need to change the Multiple alignment program,
depending on what is installed locally
- Check the Additional search methods SCOP and
PQS
- Move to the Search Model Preparation Options folder.
- Set the Maximum number of search results .... to
3
- Now continue with the steps that follow "After a quick check of the rest
of the interface" below.
After a quick check of the rest of the interface, we start the job:
- In the folder Search Model Preparation Options,
keep the default which is to use Molrep and
Chainsaw. This means there will be 2 search models for
each template chosen or found.
- In the folder Molecular Replacement and Refinement Options,
keep the default which is to use Molrep only. If you
want, you can use Phaser instead or both.
- Again in the folder Molecular Replacement and
Refinement Options, select Finish when
all of the search models have been tried in MR, so that
we can compare all solutions.
- Click Run -> Run Now
MrBUMP output
After a few minutes, have a look at the MrBUMP log file. Do not wait for the
job to finish - it will take some time.
- At the top, it echoes the options selected.
- Under Target Information, it estimates that there
is 1 molecule in the target asymmetric unit.
- For Option 1, under Template Model Search Results,
it lists the two local files entered. They are named "loc0" and "loc1" for internal
use.
- For Option 2, under Template Model Search Results,
it lists the results of the FASTA search, which should include the deposited target
structures (1gxt and 1gxu) and the suggested structures from Option 1.
- Under Search Model Preparation Results, details
of the Molrep and Chainsaw methods are given.
- Finally, the section Molecular Replacement and Refinement
gives details for every MR job tried. In particular, look for messages such as
"MR log: Molecular replacement (using Molrep) found 1 copies out of 1 requested",
which indicates that an MR solution was found, and
"Template Model: loc0_A_CHNSAW has produced a marginal solution, final Rfree = 0.477",
which reports how well the MR solution refined (here, a marginal solution).
- Under Option 2, it will in fact solve the structure trivially, since it finds
the correct structures 1gxt and 1gxu. For testing purposes, these can be excluded
in the folder Development Options.
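With many search models, picking these messages out of the log by eye becomes tedious. The following is a rough sketch of how they could be extracted. The patterns are modelled only on the two messages quoted above and assume each message sits on a single line of the log; the real log may word or wrap them differently, and the function name scan_log is hypothetical.

```python
import re

# Patterns modelled on the two messages quoted above (assumed one per line).
COPIES_RE = re.compile(r"found (\d+) copies out of (\d+) requested")
RESULT_RE = re.compile(
    r"Template Model: (\S+) has produced a (\w+) solution,\s*final Rfree = (\d+\.\d+)"
)

def scan_log(lines):
    """Return (copies, results): MR copy counts and per-model refinement outcomes."""
    copies, results = [], []
    for line in lines:
        match = COPIES_RE.search(line)
        if match:
            copies.append((int(match.group(1)), int(match.group(2))))
        match = RESULT_RE.search(line)
        if match:
            results.append((match.group(1), match.group(2), float(match.group(3))))
    return copies, results

sample = [
    "MR log: Molecular replacement (using Molrep) found 1 copies out of 1 requested",
    "Template Model: loc0_A_CHNSAW has produced a marginal solution, final Rfree = 0.477",
]
copies, results = scan_log(sample)
print(copies)
print(results)
```

To use it on a real job, pass the lines of the main MrBUMP log file, for example scan_log(open(path_to_log)).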
The main MrBUMP log file finishes with a summary of the models tried, and the
results for each. For comparison, here are some example results from MrBUMP (you
may not get exactly the same):
PDB chain | sequence identity | source / release date                   | Rfree (Chainsaw) | Rfree (Molrep)
1w2i_B    | 0.310             | OCA - released Apr 2005                 | 0.447            | 0.442
1w2i_A    | 0.310             | OCA                                     | 0.471            | 0.527
1v3z_B    | 0.310             | OCA - released Mar 2005                 | 0.430            | 0.453
1v3z_A    | 0.310             | OCA                                     | 0.474            | 0.470
2bje_G    | 0.287             | OCA - released Nov 2005                 | 0.458            | 0.442
2bje_E    | 0.287             | OCA                                     | 0.468            | 0.486
2bje_C    | 0.287             | OCA                                     | 0.491            | 0.481
2bje_A    | 0.287             | OCA                                     | 0.448            | 0.443
2bjd_B    | 0.287             | OCA - released Nov 2005                 | 0.468            | 0.529
2bjd_A    | 0.287             | OCA                                     | 0.544            | 0.466
1ulr_A    | 0.286             | OCA - released Nov 2004                 | 0.476            | 0.471
2acy_A    | 0.264             | SSM - released Nov 1997 (authors tried) | 0.539            | 0.564
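To see at a glance which search models did best, the example results above can be ranked by Rfree. The sketch below simply copies the Rfree values from the table; the function name rank_models is made up for this illustration, and your own run may give different numbers.

```python
# (PDB chain, Rfree with Chainsaw model, Rfree with Molrep model), from the table above.
RESULTS = [
    ("1w2i_B", 0.447, 0.442),
    ("1w2i_A", 0.471, 0.527),
    ("1v3z_B", 0.430, 0.453),
    ("1v3z_A", 0.474, 0.470),
    ("2bje_G", 0.458, 0.442),
    ("2bje_E", 0.468, 0.486),
    ("2bje_C", 0.491, 0.481),
    ("2bje_A", 0.448, 0.443),
    ("2bjd_B", 0.468, 0.529),
    ("2bjd_A", 0.544, 0.466),
    ("1ulr_A", 0.476, 0.471),
    ("2acy_A", 0.539, 0.564),
]

def rank_models(results):
    """Return (chain, method, rfree) tuples sorted by increasing Rfree."""
    flat = []
    for chain, chainsaw, molrep in results:
        flat.append((chain, "chainsaw", chainsaw))
        flat.append((chain, "molrep", molrep))
    return sorted(flat, key=lambda row: row[2])

# Show the three lowest Rfree values.
for chain, method, rfree in rank_models(RESULTS)[:3]:
    print(f"{chain:8s} {method:9s} {rfree:.3f}")
```

In these example results the lowest Rfree (0.430) comes from the Chainsaw model of 1v3z_B, and the two preparation methods do not rank the templates in the same order, which is one reason to try both.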
If you want to know more details of the MR runs that MrBUMP has done, then
you need to explore the directory of results. This directory is located at
<ccp4i project directory>/search_<job number>
Open a terminal window, and "cd" to this directory. Use "ls -l" on the command line
to view the directory contents (viewing of results will be easier in future versions
of MrBUMP!). In this directory, there are a number of subdirectories:
- data
- Contains the data files and log files from all jobs run. The directory
hierarchy is of the form <template>/<search model>/<pipeline step>
For example, "<ccp4i project directory>/search_55/data/loc0_A/chainsaw/mr"
contains the Molrep and Phaser results for the Chainsaw model based on chain A of
template loc0.
- input
- Copies of the input data files
- logs
- Some MrBUMP log files
- results
- Results from the successful search model are placed into subdirectory "solution".
Other results are placed into subdirectory "marginal_solns".
- scratch
- Scratch files
- sequences
- Sequence files for the multiple alignment
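As an alternative to "ls -l", the data hierarchy described above can be listed with a short script. This is a minimal sketch that assumes only the <template>/<search model>/<pipeline step> layout given above; the function name list_pipeline_steps and the example job number are illustrative.

```python
from pathlib import Path

def list_pipeline_steps(search_dir):
    """Yield (template, search_model, step) for each directory
    <search_dir>/data/<template>/<search model>/<pipeline step>."""
    data = Path(search_dir) / "data"
    for step_dir in sorted(data.glob("*/*/*")):
        if step_dir.is_dir():
            template, model, step = step_dir.relative_to(data).parts
            yield template, model, step

if __name__ == "__main__":
    # Replace with <ccp4i project directory>/search_<job number> for your own job.
    for template, model, step in list_pipeline_steps("search_55"):
        print(f"{template:12s} {model:10s} {step}")
```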
After MrBUMP
- Check the main MrBUMP log file and look for the best result(s).
- Find the output files under <ccp4i project directory>/search_<job number>/data
or <ccp4i project directory>/search_<job number>/results (see above for description
of directory hierarchy).
- Check the unrefined model in subdirectory "mr". In the case of MR with Phaser, there
will also be an MTZ file with phases.
- Check the refined output in subdirectory "refine".
- If the resolution is good enough, try re-building in ARP/wARP starting from the
refined model.
- Also try re-building using Pirate / Buccaneer.
Example 2 - 1k6d
Introduction
PDB entry 1k6d is the alpha subunit of bacterial acetate CoA-transferase
from E. coli (S. Korolev et al. (2002) Acta Cryst. D58, 2116-21). The asymmetric
unit contains 2 chains of 220 residues, and there is data in P62 to 1.9 A.
The data
Input files supplied:
- 1k6d.mtz (use columns FP, SIGFP, FREE)
- 1k6d.seq
Local PDB files available (if you wish to bypass the full search):
- 1ope.pdb (approx 37% sequence identity to target)
- 1ooy.pdb (approx 37% sequence identity to target)
Running MrBUMP
See Example 1 for more details on how to run MrBUMP. As in Example 1, you
can either let MrBUMP do a search for you, or use the supplied local PDB files.
For this example, specific choices to consider are:
- Enter MTZ in
$MRBUMP_TUTORIAL/data/1k6d.mtz
and select the labels FP FP,
SIGFP SIGFP and
FreeR FREE
- The true spacegroup P62 is one of an enantiomorphic pair (P62/P64) which cannot
be distinguished by the diffraction pattern alone. MrBUMP gives you the option to
test both spacegroups, and you may like to try this. MrBUMP should select the correct
spacegroup based on better translation function and packing function scores.
- Enter SEQ in
$MRBUMP_TUTORIAL/data/1k6d.seq
- The best search models turn out to be domains of longer chains (the bacterial
alpha subunit aligns with the N-domain of eukaryotic succinyl-CoA:3-ketoacid
CoA-transferases) and therefore you will need to select the SCOP
option in the Template Search Options folder.
- Again in the folder Molecular Replacement and
Refinement Options, select Finish when
all of the search models have been tried in MR, so that
we can compare all solutions.
Example 3 - 2gas
Introduction
This is the crystal structure of isoflavone reductase from alfalfa. The
asymmetric unit contains 2 copies of 307 residues each. There is data to 1.6 A
in spacegroup C 1 2 1. MrBUMP should solve it straightforwardly with 1qyc or
1qyd. This example allows you to explore phase improvement with Acorn and
model rebuilding with ARP/wARP.
The data
Input files supplied:
- 2gas.mtz (use columns FP, SIGFP, FREE)
- 2gas.seq
Local PDB files available (if you wish to bypass the full search):
- 1qyc_A (approx 57% sequence identity to target)
- 1qyd_A (approx 47% sequence identity to target)
Running MrBUMP
See Example 1 for more details on how to run MrBUMP. As in Example 1, you
can either let MrBUMP do a search for you, or use the supplied local PDB files.
For this example, specific choices to consider are:
- Enter MTZ in
$MRBUMP_TUTORIAL/data/2gas.mtz
and select the labels FP FP,
SIGFP SIGFP and
FreeR FREE
- Enter SEQ in
$MRBUMP_TUTORIAL/data/2gas.seq
- Because the resolution of the data is good enough, you will be presented
with the option of running Acorn. Select this option in the
Molecular Replacement and Refinement Options folder.
Note: this option requires the latest version of Acorn from the CCP4
Prerelease pages.
- Again in the folder Molecular Replacement and
Refinement Options, select Finish when
all of the search models have been tried in MR, so that
we can compare all solutions.
MrBUMP output
The Acorn option runs a phase improvement step after refinement of the
MR solution. The MrBUMP log file includes a table of correlation coefficients
for the medium E values, such as:
Bin Cycle_number CC $$
1 0 0.17849
2 1 0.22066
3 2 0.24249
4 3 0.24997
5 4 0.25240
6 5 0.25066
An improvement in the CC against cycle number is a good indication of a correct
solution. The absolute value of the CC is relatively low because we are not using the
strongest Es.
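The trend in the table above can also be checked with a few lines of code. The sketch below parses the example table exactly as printed here and reports the starting CC, the best CC and the cycle at which it occurs. The function name parse_cc_table is hypothetical, and real Acorn output may be laid out differently.

```python
ACORN_TABLE = """\
Bin Cycle_number CC $$
1 0 0.17849
2 1 0.22066
3 2 0.24249
4 3 0.24997
5 4 0.25240
6 5 0.25066
"""

def parse_cc_table(text):
    """Return {cycle_number: CC} from rows of the form '<bin> <cycle> <cc>'."""
    cc_by_cycle = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly three fields, the first two being integers.
        if len(fields) == 3 and fields[0].isdigit() and fields[1].isdigit():
            cc_by_cycle[int(fields[1])] = float(fields[2])
    return cc_by_cycle

cc = parse_cc_table(ACORN_TABLE)
best_cycle = max(cc, key=cc.get)
gain = cc[best_cycle] - cc[0]
print(f"start CC = {cc[0]:.5f}, best CC = {cc[best_cycle]:.5f} at cycle {best_cycle}")
print(f"improvement = {gain:.5f}")
```

In this example the CC rises from 0.17849 to a maximum of 0.25240 at cycle 4 and then drops slightly at cycle 5, so it is the overall rise rather than the final value that is worth looking at.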
After MrBUMP
The output file from Acorn can be found in the directory
<ccp4i project directory>/search_<job number>/data/
<template>/<search model>/building
Part of the Acorn procedure involves artificially extending the data to 1.0 A.
You can create a map using columns ECOUT, PHIOUT, WTOUT from the Acorn file
which will look like a 1.0 A map. This can be used as input to ARP/wARP.