MrBUMP Tutorial

The following tutorial takes you through three example MR problems, which illustrate various aspects of MrBUMP. Each example starts from a sequence file containing the target sequence and an MTZ file containing native structure factor amplitudes, i.e. it assumes you have finished processing your data. The result of MrBUMP is a set of likely or possible molecular replacement solutions. The tutorial contains brief suggestions of what to do next.

Normally, you would let MrBUMP look for suitable search model templates, without specifying what they should be. If MrBUMP finds several possible templates, then the MrBUMP job can take some time to try them all. In your own lab, this would not be an issue: let MrBUMP run while you do other things. For the purposes of this tutorial, we restrict MrBUMP so that each example runs relatively quickly (though it may still take some time):

  1. For each example, we suggest suitable search model templates, and these can be specified explicitly to MrBUMP
  2. You can limit the number of models passed to MR.
  3. Don't be afraid to kill a job once you can see what is happening, and re-run concentrating on good models.

It is recommended that you create a new ccp4i project for each example. See http://www.ccp4.ac.uk/dist/examples/tutorial/html/intro-tutorial.html if you do not know how to do this.

In the following, $MRBUMP_TUTORIAL refers to the top directory of the tutorial, where you will find this file and the data subdirectory.
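
Before starting, it can help to confirm that the data files named in this tutorial are present. The following is a minimal Python sketch, assuming you have exported MRBUMP_TUTORIAL as an environment variable (the tutorial itself uses the name only as shorthand). The helper name missing_tutorial_files is ours, and the file list is taken from the steps later in this document, so it may not cover every file shipped with the tutorial.

```python
import os
from pathlib import Path

# File names as referenced in the three examples of this tutorial.
TUTORIAL_FILES = [
    "hypF-1gxu.mtz", "hypF_Ndom.seq", "1w2i_A.pdb", "1v3z_B.pdb",
    "1k6d.mtz", "1k6d.seq",
    "2gas.mtz", "2gas.seq",
]

def missing_tutorial_files(top_dir):
    """Return the tutorial data files that are absent from <top_dir>/data."""
    data_dir = Path(top_dir) / "data"
    return [name for name in TUTORIAL_FILES if not (data_dir / name).is_file()]

if __name__ == "__main__":
    # Assumption: MRBUMP_TUTORIAL is set in your shell; fall back to "." otherwise.
    top = os.environ.get("MRBUMP_TUTORIAL", ".")
    missing = missing_tutorial_files(top)
    print("All files found" if not missing else "Missing: " + ", ".join(missing))
```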

Example 1 - hypF

Introduction

The target is the acylphosphatase-like domain of hydrogenase maturation factor HypF from E. coli, see Rosano et al., JMB, 321, 785 (2002). HypF-ACP sulphate and phosphate complexes have been deposited as 1gxt and 1gxu respectively. We have prepared a reflection file for you with native data from 1gxu, in spacegroup H32 and extending to 1.3 A resolution. The target domain has 91 residues and a Matthews calculation strongly suggests only one molecule in the asymmetric unit.
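
For readers unfamiliar with the Matthews calculation, here is a rough Python sketch of the arithmetic. It assumes an average residue mass of 110 Da and the common approximation solvent fraction = 1 - 1.23/V_M. The cell volume used below is a PLACEHOLDER and is not taken from 1gxu (read the real cell from your MTZ header), and the function name is ours. H32 in the hexagonal setting has 18 symmetry operators.

```python
def matthews_coefficient(cell_volume, n_symops, n_copies, n_residues,
                         avg_residue_mass=110.0):
    """Return (V_M in A^3/Da, approximate solvent fraction)."""
    # Protein mass per asymmetric unit, in Da (approximate).
    protein_mass = n_copies * n_residues * avg_residue_mass
    v_m = cell_volume / (n_symops * protein_mass)
    solvent = 1.0 - 1.23 / v_m
    return v_m, solvent

if __name__ == "__main__":
    # PLACEHOLDER cell volume (A^3), for illustration only.
    volume = 400000.0
    for n in (1, 2):
        v_m, solvent = matthews_coefficient(volume, 18, n, 91)
        print(f"{n} copy/copies: V_M = {v_m:.2f} A^3/Da, solvent = {solvent:.0%}")
```

A copy number that gives a negative or implausibly low solvent fraction can be ruled out, which is the kind of evidence a Matthews calculation provides.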

The data

Input files supplied:

Local PDB files available (if you wish to bypass the full search):

Checking the data

We first use Sfcheck to check a few things about the data:

  1. Select the Program List module and open the sfcheck task window.
  2. Enter a title (e.g. "checking hypF data").
  3. Un-check Run Procheck to analyse structure geometry (we do not yet have any coordinates)
  4. Select Run Sfcheck to analyse experimental data only
  5. Enter MTZ in $MRBUMP_TUTORIAL/data/hypF-1gxu.mtz and select the labels FP FP1gxu, SIGFP SIGFP1gxu and FreeR FREE
  6. Enter Output hypF_analysis.ps
  7. Click Run -> Run Now.

Sfcheck produces a postscript file (see View Files from Job -> hypF_analysis.ps) with some useful things:

Also check the log file View Files from Job then View Log File:

Running MrBUMP

  1. Select the Molecular Replacement module and open the MrBUMP task window.
  2. Enter a title.
  3. Leave Program Mode Model search and Molecular Replacement unchanged.
  4. Enter MTZ in $MRBUMP_TUTORIAL/data/hypF-1gxu.mtz and select the labels F FP1gxu, Sigma SIGFP1gxu and Free-R FREE
  5. Enter SEQ in $MRBUMP_TUTORIAL/data/hypF_Ndom.seq

The rest of the interface is concerned with customising your run of MrBUMP. We could accept the defaults and click Run -> Run Now straight away. However, for this tutorial, we will look at some of the options. In particular, we will look at two ways of specifying search model templates. Note that if you try both options, it is safest to select Save or Restore -> Restore Default Parameters between jobs.

Option 1:
We will explicitly specify 2 search model templates. This is useful if you know which templates you want to look at, if you want a quick run, or if you don't have internet access.

  1. Move to the Template Search Options folder.
  2. Un-check Do a FASTA search for possible template models.
  3. Un-check Update local copies of search databases
  4. You may need to change the Multiple alignment program, depending on what is installed locally
  5. Un-check all Additional search methods, i.e. SCOP, PQS and SSM
  6. The folder User specified search models will have opened. Because we have switched off all search options, we are required to use local files. Click on Add PDB file 2 times to add 2 local PDB files (do not click on Add Chain id). The first file is $MRBUMP_TUTORIAL/data/1w2i_A.pdb and specify Chain identifier A. The second file is $MRBUMP_TUTORIAL/data/1v3z_B.pdb and specify Chain identifier B.
  7. Now skip to the final steps, listed below under "After a quick check of the rest of the interface, we start the job".

Option 2:
We will let MrBUMP search for suitable search model templates. It is still a good idea to let MrBUMP do the FASTA search locally, but MrBUMP will need to download the PDB files that it requires. To keep the example quick, we limit the number of models passed to the MR step.

  1. Move to the Template Search Options folder.
  2. Check Do a FASTA search for possible template models.
  3. Check Run the FASTA search locally.
  4. Check Update local copies of search databases
  5. You may need to change the Multiple alignment program, depending on what is installed locally
  6. Check the Additional search methods SCOP and PQS
  7. Move to the Search Model Preparation Options folder.
  8. Set the Maximum number of search results .... to 3
  9. Now move to the final steps, listed below under "After a quick check of the rest of the interface, we start the job".

After a quick check of the rest of the interface, we start the job:

  1. In the folder Search Model Preparation Options, keep the default which is to use Molrep and Chainsaw. This means there will be 2 search models for each template chosen or found.
  2. In the folder Molecular Replacement and Refinement Options, keep the default which is to use Molrep only. If you want, you can use Phaser instead or both.
  3. Again in the folder Molecular Replacement and Refinement Options, select Finish when all of the search models have been tried in MR, so that we can compare all solutions.
  4. Click Run -> Run Now
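
To get a feel for how many MR jobs a run will involve, the combinations can be enumerated. This is only a counting sketch in Python, assuming each search model is tried once with each selected MR program; it does not reproduce MrBUMP's internal scheduling, and the function name mr_jobs is ours.

```python
from itertools import product

def mr_jobs(templates, prep_methods=("molrep", "chainsaw"), mr_programs=("molrep",)):
    """List (template, search-model preparation, MR program) combinations."""
    return list(product(templates, prep_methods, mr_programs))

# The two local templates from Option 1, with the default settings described above.
jobs = mr_jobs(["1w2i_A", "1v3z_B"])
for template, prep, program in jobs:
    print(template, prep, program)
print(len(jobs), "MR jobs")
```

With two templates and the defaults this gives four jobs; selecting both Molrep and Phaser for the MR step would double that.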

MrBUMP output

After a few minutes, have a look at the MrBUMP log file. Do not wait for the job to finish - it will take some time.

The main MrBUMP log file finishes with a summary of the models tried, and the results for each. For comparison, here are some example results from MrBUMP (you may not get exactly the same):

PDB chain  Sequence identity  Source / release date                    Rfree (chainsaw)  Rfree (molrep)
1w2i_B     0.310              OCA - released Apr 2005                  0.447             0.442
1w2i_A     0.310              OCA                                      0.471             0.527
1v3z_B     0.310              OCA - released Mar 2005                  0.430             0.453
1v3z_A     0.310              OCA                                      0.474             0.470
2bje_G     0.287              OCA - released Nov 2005                  0.458             0.442
2bje_E     0.287              OCA                                      0.468             0.486
2bje_C     0.287              OCA                                      0.491             0.481
2bje_A     0.287              OCA                                      0.448             0.443
2bjd_B     0.287              OCA - released Nov 2005                  0.468             0.529
2bjd_A     0.287              OCA                                      0.544             0.466
1ulr_A     0.286              OCA - released Nov 2004                  0.476             0.471
2acy_A     0.264              SSM - released Nov 1997 (authors tried)  0.539             0.564
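
When there are many results, a quick sort by Rfree helps to pick out candidates. The sketch below, in Python, uses the example values from the table above typed in by hand; it does not parse a real MrBUMP log, and the function name rank_by_rfree is ours. Rfree alone is a crude criterion, so treat the ranking as a starting point only.

```python
# (chain, sequence identity, Rfree chainsaw, Rfree molrep), copied from the example table.
RESULTS = [
    ("1w2i_B", 0.310, 0.447, 0.442),
    ("1w2i_A", 0.310, 0.471, 0.527),
    ("1v3z_B", 0.310, 0.430, 0.453),
    ("1v3z_A", 0.310, 0.474, 0.470),
    ("2bje_G", 0.287, 0.458, 0.442),
    ("2bje_E", 0.287, 0.468, 0.486),
    ("2bje_C", 0.287, 0.491, 0.481),
    ("2bje_A", 0.287, 0.448, 0.443),
    ("2bjd_B", 0.287, 0.468, 0.529),
    ("2bjd_A", 0.287, 0.544, 0.466),
    ("1ulr_A", 0.286, 0.476, 0.471),
    ("2acy_A", 0.264, 0.539, 0.564),
]

def rank_by_rfree(results):
    """Flatten to (Rfree, chain, model type) and sort, lowest Rfree first."""
    flat = []
    for chain, _identity, rf_chainsaw, rf_molrep in results:
        flat.append((rf_chainsaw, chain, "chainsaw"))
        flat.append((rf_molrep, chain, "molrep"))
    return sorted(flat)

for rfree, chain, model in rank_by_rfree(RESULTS)[:3]:
    print(f"{chain:8s} {model:9s} Rfree = {rfree:.3f}")
```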

If you want to know more details of the MR runs that MrBUMP has done, then you need to explore the directory of results. This directory is located at

   <ccp4i project directory>/search_<job number>

Open a terminal window, and "cd" to this directory. Use "ls -l" on the command line to view the directory contents (viewing of results will be easier in future versions of MrBUMP!). In this directory, there are a number of subdirectories:

  data
      Contains the data files and log files from all jobs run. The directory hierarchy is of the form <template>/<search model>/<pipeline step>. For example, "<ccp4i project directory>/search_55/data/loc0_A/chainsaw/mr" contains the Molrep and Phaser results for the Chainsaw model based on chain A of template loc0.
  input
      Copies of the input data files.
  logs
      Some MrBUMP log files.
  results
      Results from the successful search model are placed into subdirectory "solution". Other results are placed into subdirectory "marginal_solns".
  scratch
      Scratch files.
  sequences
      Sequence files for the multiple alignment.
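
If you prefer to explore the results from a script rather than with "ls", the directory hierarchy described above can be walked with a few lines of Python. This is a sketch under the assumption that the layout is exactly <project>/search_<job number>/data/<template>/<search model>/<pipeline step>; the function names and the project directory "myproject" are ours.

```python
from pathlib import Path

def step_dir(project_dir, job_number, template, search_model, step):
    """Build <project>/search_<job>/data/<template>/<search model>/<step>."""
    return (Path(project_dir) / f"search_{job_number}" / "data"
            / template / search_model / step)

def list_step_dirs(project_dir, job_number):
    """Return all existing <template>/<search model>/<step> directories, sorted."""
    data_dir = Path(project_dir) / f"search_{job_number}" / "data"
    return sorted(p for p in data_dir.glob("*/*/*") if p.is_dir())

# Path for the Chainsaw model of template loc0_A, MR step, job 55.
print(step_dir("myproject", 55, "loc0_A", "chainsaw", "mr"))
```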

After MrBUMP

  1. Check the main MrBUMP log file and look for the best result(s).
  2. Find the output files under <ccp4i project directory>/search_<job number>/data or <ccp4i project directory>/search_<job number>/results (see above for description of directory hierarchy).
  3. Check the unrefined model in subdirectory "mr". In the case of MR with Phaser, there will also be an MTZ file with phases.
  4. Check the refined output in subdirectory "refine".
  5. If the resolution is good enough, try re-building in ARP/wARP starting from the refined model.
  6. Also try re-building using Pirate / Buccaneer.

Example 2 - 1k6d

Introduction

PDB entry 1k6d is the alpha subunit of bacterial acetate CoA-transferase from E. coli (S. Korolev et al. (2002) Acta Cryst D58, 2116-21). The asymmetric unit contains 2 chains of 220 residues, and there is data in P62 to 1.9 A.

The data

Input files supplied:

Local PDB files available (if you wish to bypass the full search):

Running MrBUMP

See Example 1 for more details on how to run MrBUMP. As for Example 1, you can either let MrBUMP do a search for you, or use the supplied local PDB files. For this example, specific choices to consider are:

  1. Enter MTZ in $MRBUMP_TUTORIAL/data/1k6d.mtz and select the labels FP FP, SIGFP SIGFP and FreeR FREE
  2. The true spacegroup P62 is one of an enantiomorphic pair (P62/P64) which cannot be distinguished from the diffraction pattern alone. MrBUMP gives you the option to test both spacegroups, and you may like to try this. MrBUMP should select the correct spacegroup based on better translation function and packing function scores.
  3. Enter SEQ in $MRBUMP_TUTORIAL/data/1k6d.seq
  4. The best search models turn out to be domains of longer chains (the bacterial alpha subunit aligns with the N-domain of eukaryotic succinyl-CoA:3-ketoacid CoA-transferases) and therefore you will need to select the SCOP option in the Template Search Options folder.
  5. Again in the folder Molecular Replacement and Refinement Options, select Finish when all of the search models have been tried in MR, so that we can compare all solutions.

Example 3 - 2gas

Introduction

This is the crystal structure of isoflavone reductase from alfalfa. The asymmetric unit contains 2 copies of 307 residues each. There is data to 1.6 A in spacegroup C 1 2 1. MrBUMP should solve it straightforwardly with 1qyc or 1qyd. This example allows you to explore phase improvement with Acorn and model rebuilding with ARP/wARP.

The data

Input files supplied:

Local PDB files available (if you wish to bypass the full search):

Running MrBUMP

See Example 1 for more details on how to run MrBUMP. As for Example 1, you can either let MrBUMP do a search for you, or use the supplied local PDB files. For this example, specific choices to consider are:

  1. Enter MTZ in $MRBUMP_TUTORIAL/data/2gas.mtz and select the labels FP FP, SIGFP SIGFP and FreeR FREE
  2. Enter SEQ in $MRBUMP_TUTORIAL/data/2gas.seq
  3. Because the resolution of the data is good enough, you will be presented with the option of running Acorn. Select this option in the Molecular Replacement and Refinement Options folder. Note: this option requires the latest version of Acorn from the CCP4 Prerelease pages.
  4. Again in the folder Molecular Replacement and Refinement Options, select Finish when all of the search models have been tried in MR, so that we can compare all solutions.

MrBUMP output

The Acorn option runs a phase improvement step after refinement of the MR solution. The MrBUMP log file includes a table of correlation coefficients for the medium E values, such as:


 Bin Cycle_number   CC  $$

     1     0   0.17849
     2     1   0.22066
     3     2   0.24249
     4     3   0.24997
     5     4   0.25240
     6     5   0.25066

An improvement in the CC against cycle number is a good indication of a correct solution. The absolute value of the CC is relatively low because we are not using the strongest Es.
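
The trend can also be checked numerically. The following Python sketch uses the CC values from the example table above, typed in by hand rather than parsed from a log, and reports the overall gain and the cycle with the highest CC. The function name cc_gain is ours, and no threshold for a "correct" solution is implied.

```python
# (cycle number, CC) pairs copied from the example Acorn table above.
CC_BY_CYCLE = [(0, 0.17849), (1, 0.22066), (2, 0.24249),
               (3, 0.24997), (4, 0.25240), (5, 0.25066)]

def cc_gain(cc_by_cycle):
    """Return (CC gain from first to last cycle, cycle with the highest CC)."""
    first_cc = cc_by_cycle[0][1]
    last_cc = cc_by_cycle[-1][1]
    best_cycle = max(cc_by_cycle, key=lambda row: row[1])[0]
    return last_cc - first_cc, best_cycle

gain, best = cc_gain(CC_BY_CYCLE)
print(f"CC gain: {gain:.5f}, best cycle: {best}")
```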

After MrBUMP

The output file from Acorn can be found in the directory <ccp4i project directory>/search_<job number>/data/<template>/<search model>/building. Part of the Acorn procedure involves artificially extending the data to 1.0 A. You can create a map using columns ECOUT, PHIOUT, WTOUT from the Acorn file, which will look like a 1.0 A map. This can be used as input to ARP/wARP.