Mr Bump: automated Molecular Replacement
Ronan Keegan and Martyn Winn


[Using MrBUMP to solve arabinofuranosidase]

Mr Bump Overview

Mr Bump is an automated scheme for Molecular Replacement. Given a target sequence and experimental structure factors, it will search for homologous structures, create a set of suitable search models from the template structures, do molecular replacement, and test the solutions with some rounds of restrained refinement.

The scheme has the potential to be highly parallel, searching over many homologues, derived models and MR techniques. Currently it runs in two modes:

  1. Desktop: the scheme is streamlined for one node
  2. Cluster: jobs are distributed over available nodes

Mr Bump uses existing programs for the various steps. MR itself is done using Molrep or Phaser, and other programs from the CCP4 suite are used for many of the remaining steps. It also uses bioinformatics programs such as Fasta, MAFFT and ClustalW, and services provided by the European Bioinformatics Institute.

Download and Installation

The latest version of MrBUMP is available for Linux/Mac from here and for Windows from here. The release consists of a set of python scripts, a ccp4i GUI (see this screenshot), and documentation. This code is released under the CCP4 Licence as a CCP4 Application.

Mr Bump is a set of python scripts, and as such requires no compilation. However, it does require a relatively recent version of Python (2.3 or later). It requires version 6.0 of CCP4, and MrBUMP will be installed into this version of CCP4. CCP4 6.0 (including Phaser) and Python can be obtained from the CCP4 downloads page. For Windows users, we recommend downloading and installing the Active Python bundle which is available from the CCP4 ftp site here.

MrBUMP requires a multiple alignment program. MrBUMP 0.4.2 supports MAFFT, ClustalW, ProbCons and T-Coffee. We recommend ClustalW for Windows users as it is the easiest to install. Optionally, MrBUMP can also use fasta34 for performing the homologue search locally and Perl/SOAP-Lite for the SSM web service.

To access the CCP4i DBviewer feature, Graphviz should be installed on your machine. For more details see the CCP4i DBviewer section.
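
The prerequisites described above can be checked before a first run. The following is a minimal sketch, not part of MrBUMP: the function names are hypothetical, the executable names (molrep, phaser, mafft, clustalw, fasta34, and dot for Graphviz) are assumptions about how the programs are called on your system, and it assumes that the CCP4 setup script defines a CCP4 environment variable. It uses shutil.which, so it needs Python 3.3 or later, even though MrBUMP itself only requires Python 2.3.

```python
import os
import shutil

# All program names below are assumptions about how the executables are
# called on your system; edit them to match your installation.
MR_PROGRAMS = ["molrep", "phaser"]
ALIGNERS = ["mafft", "clustalw"]
OPTIONAL = ["fasta34", "dot"]  # local FASTA search; Graphviz for the DBviewer


def missing_programs(names, path=None):
    """Return the names that cannot be found on the search path.

    path=None means the current PATH; a string overrides it.
    """
    return [name for name in names if shutil.which(name, path=path) is None]


def check_setup(environ=None, path=None):
    """Return a list of human-readable warnings about the local setup."""
    environ = os.environ if environ is None else environ
    warnings = []
    # Assumption: the CCP4 setup script defines a CCP4 environment variable.
    if "CCP4" not in environ:
        warnings.append("CCP4 environment variable is not set")
    for name in missing_programs(MR_PROGRAMS, path):
        warnings.append("MR program not found: " + name)
    # Only one alignment program is needed, so warn only if none is present.
    if len(missing_programs(ALIGNERS, path)) == len(ALIGNERS):
        warnings.append("no multiple alignment program found")
    for name in missing_programs(OPTIONAL, path):
        warnings.append("optional program not found: " + name)
    return warnings
```

Calling check_setup() with no arguments inspects the current environment and PATH; an empty list means that nothing obvious is missing.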

There is also a MrBUMP tutorial. The material plus the data used in the tutorial can be downloaded from:

Questions / comments to the primary author Ronan Keegan or to Martyn Winn.

Functionality

This first version is expected to handle straightforward MR problems in an automated and convenient manner. Functionality includes:

But note that MrBUMP cannot yet solve complexes (although you may get a solution based on the major component). This and other improvements to the individual steps are planned for future versions.

Note that the time taken by MrBUMP depends on the size of the target, the number of trial models attempted and the speed of your computer. It may be useful to try a restricted search first to get a feeling for this.

The CCP4i DBviewer

The CCP4i database handler and viewer is a key part of the data tracking system being developed by Wanjuan Yang and Peter Briggs of CCP4 for the pan-European BioXHIT project. It is designed to facilitate the development of automated structure determination software pipelines such as MrBUMP. The project is still very much in development but a trial version has been released as part of the MrBUMP project. For more information about the project see the CCP4 BioXHIT page here or the CCP4 newsletter article about the project.

When running MrBUMP you can choose to view the progress of the job by selecting the option to launch the CCP4i DBviewer in the MrBUMP task interface. Note that you must have Graphviz installed for this to work.

It is also possible to launch the CCP4i DBviewer from CCP4i. At or close to the bottom of the "Program List" menu you will find a new task button for it.

Known Problems

  1. If you do a Kill Job in the ccp4i GUI, then child processes are not killed, e.g. a phaser job. These need to be killed manually (using ps and kill) or left to run their course. This should be fixed soon.
  2. We have had one report of the script crashing. We suspect that this is a special case arising when the initial FASTA search only produces one hit. A fix is being worked on. FIXED 9/01/06: This bug has been fixed, and the download file updated.
  3. The online fasta search using the OCA web application does not seem to work any more - OCA returns zero hits. Even before, it would not find all of the possible matches that are in the PDB. Downloading Fasta34 and running the fasta search locally can avoid this problem.
  4. In some circumstances, if the program PDBCUR fails, then the script exits rather than moving onto the next search model. FIXED 10/01/06: This bug has been fixed, and the download file updated.
  5. A crash has been reported resulting from a downloaded PDB file with negative residue numbers. This should be fixable. Meanwhile, a workaround is to exclude such files ("Development options" folder of the GUI).
  6. A problem has been found when running MrBUMP on Linux running under VMware on a Windows machine. UPDATE: This seems to be something to do with the network routing losing part of the MIME header. Tests indicate that Mr Bump will work in general under VMware.
  7. MrBUMP will sometimes use the Coot version of PDBCUR if it finds it in the system path ahead of the CCP4 version. This causes problems in the PDB processing stage as the two versions accept different keywords. FIXED 12/01/06: This bug has been fixed, and the download file updated.
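
For the negative residue number problem (item 5 above), one way to find affected files before excluding them is to scan the coordinate records. The sketch below is not part of MrBUMP. It assumes the standard fixed-column PDB format, in which the chain identifier is in column 22 and the residue number is in columns 23-26 of ATOM and HETATM records; the function name is hypothetical.

```python
def negative_residue_numbers(pdb_lines):
    """Return sorted (chain_id, residue_number) pairs with negative numbering.

    pdb_lines is any iterable of lines from a PDB-format file.
    """
    found = set()
    for line in pdb_lines:
        # Only coordinate records carry a residue number in these columns.
        if not line.startswith(("ATOM", "HETATM")):
            continue
        try:
            res_seq = int(line[22:26])  # columns 23-26, zero-based slice
        except ValueError:
            continue  # short or malformed record: skip rather than guess
        if res_seq < 0:
            found.add((line[21], res_seq))  # column 22 is the chain ID
    return sorted(found)
```

Passing an open file object works directly, for example negative_residue_numbers(open(path)) where path is a downloaded PDB file. A non-empty result suggests that the file is a candidate for exclusion.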

References

As well as referencing MrBUMP, please reference the underlying programs used, such as molrep or phaser. MrBUMP prints out a list of appropriate references at the end of the log file, and a full list is given below.

Primary reference:
R.M.Keegan and M.D.Winn,
Acta Cryst. D63, 447 - 457 (2007)
- "Automated search-model discovery and preparation for structure solution by molecular replacement"
   reprint in PDF format Copyright © International Union of Crystallography

A brief description of an early version of MrBUMP is given in:
Bahar, M., Ballard, C.C., Cohen, S., Cowtan, K., Dodson, E.J., Emsley, P., Esnouf, R.M., Keegan, R.M., Lamzin, V., Langer, G., Levdikov, V., Long, F., Meier, C.J., Muller, A., Murshudov, G., Perrakis, A., Siebold, C., Stein, N.D., Turkenburg, M.G.W., Vagin, A., Winn, M.D., Winter, G. and Wilson, K.S. (2006) "SPINE workshop on automated X-ray analysis: a progress report" Acta Cryst. D62, 1170-1183
Some later developments are described in:
Keegan, R.M. and Winn, M.D. Acta Cryst. D64, 119-124 (2008)
- "MrBUMP: An automated pipeline for molecular replacement"
   reprint in PDF format Copyright © International Union of Crystallography
But please use the primary reference.

Here is a partial list of references illustrating the use of MrBUMP. Others can be found by searching the PDB.

  1. J.Obiero, S.A.Bonderoff, M.M.Goertzen and D.A.R.Sanders, (2006), Acta Cryst F62 757-760 (thioredoxin reductase, PDB code 2q7v)
  2. K.El Omari, B.Dhaliwal, M.Lockyer, I.Charles, A.R.Hawkins and D.K.Stammers (2006), Acta Cryst F62 949-953 (guanylate monophosphate kinase, PDB code 2j41).
  3. Izhar Karbat, Michael Turkov, Lior Cohen, Roy Kahn, Dalia Gordon, Michael Gurevitz and Felix Frolow (2007) J. Mol. Biol. 366 586-601 (scorpion toxin, PDB code 2i61).
  4. L.Lehtio et al, to be published (PDB code 2pa9 superseded by 3fhb)
  5. M.A. Stead, C.H. Trinh, J.A. Garnett, S.B. Carr, A.J. Baron, T.A. Edwards and S.C. Wright (2007) J. Mol. Biol. 373, 820-826 (see Supplementary Data) (Miz-1, PDB code 2q81)
  6. E. Kiyota, S. M. de Sousa, M. L. dos Santos, A. da Costa Lima, M. Menossi, J. A. Yunes and R. Aparicio (2007) Acta Cryst F63 990-992 (aldose reductase)
  7. Van Straaten, K.E., Hoffort, A., Palmer, D.R.J., and Sanders, D.A.R., (2008), Acta Cryst F64 98-101. (inositol dehydrogenase)
  8. Structural Studies of Australian Snake Venom Compounds, Millers, E-K.I. (2008) PhD. Thesis.
  9. Kotaka M, Ye H, Alag R, et al. Biochemistry 47 5951-5961 (2008) (FKBP35-FK506 complex, PDB code 2vn1)
  10. Wang E, Bauer MC, Rogstam A, et al. Molecular Microbiology 69 Issue 2 466-478 (2008) (transcriptional repressor Rex, PDB code 2vt2, PDB code 2vt3)
  11. L. Bleicher et al. FEBS Letters 582, 2985-2992 (2008) (IL22-IL22R1 complex, PDB code 3dlq)
  12. K.I.Miyazono, Y.Nishimura, et al., Proteins - Structure Function and Bioinformatics, 73, 1068-1071 (2008) (PDB code 3d79)
  13. E. Vandermarliere, et al., Biochemical Journal 418, 39-47 (2009) (PDB code 3C7F)
  14. van Straaten KE, Langill DM, Palmer DRJ, et al. Acta Cryst, F65, 426-429 (2009)
  15. Klumpler T, et al. Acta Cryst, F65, 478-481 (2009)
  16. Stead MA, Carr SB and Wright SC, Acta Cryst, F65, 445-449 (2009)
  17. P. Bhaumik, et al. J.Mol.Biol., 388, 520-540 (2009) (HAP, PDB codes 3FNS, 3FNT, 3FNU)
  18. M. Kotaka, et al. J.Biol.Chem., 284, 15739-15749 (2009) (CalE7, PDB code 2W3X, search model 28% identity)
  19. J-J.Zhou et al., J.Mol.Biol., 389, 529-545 (2009) (example of spacegroup discrimination, 2wc5)
  20. S.Partha et al., Acta Cryst, F65, 843-845 (2009)
  21. Bagneris C, Bateman OA, Naylor CE, et al. J.Mol.Biol., 392, 1242-1252 (2009) (rat Hsp20 ACD, PDB code 2wj5)
  22. Obiero J et al., J.Bacteriology, 192, 494-501 (2010) (PDB code 2q7v)
  23. Correia MAS et al., Biochemistry, 49, 6193-6205 (2010) (MrBUMP solution into ARP/wARP, PDB code 2wz8)
  24. Basavannacharya C et al., Tuberculosis, 90, 16-24 (2010) (bootstrapped from single domain solution, PDB code 2wtz)
  25. Sakamoto Y et al., Acta Cryst, F66, 309-312 (2010)
  26. Agarwal S et al., Proteins, 78, 2450-2458 (2010) (PDB code 3fry)
  27. Johansson R et al., FEBS Journal, 277, 4265-4277 (2010) (PDB code 2xod)
  28. Chen BY et al., Structure, 18, 1420-1430 (2010) (PDB code 3m0e)
  29. Alahuhta M et al. J.Mol.Biol., 402, 374-387 (2010) (2 domain solution, PDB code 3k4z)
  30. Welin M et al. Nucl. Acids Res, 38, 7308-7319 (2010) (PDB code 2qk4)
  31. Wang et al. PLOS One, 6, e20950 (2011) (example of MrBUMP usage after failed manual search, PDB code 3rdz)

Related References

CCP4
Collaborative Computational Project, Number 4. (1994), "The CCP4 Suite: Programs for Protein Crystallography". Acta Cryst. D50, 760-763
FASTA
W. R. Pearson and D. J. Lipman (1988), "Improved Tools for Biological Sequence Analysis", PNAS 85, 2444-2448
SSM
E.Krissinel and K.Henrick (2004), "Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions" Acta Cryst. D60, 2256-2268
SCOP
A.G.Murzin, S.E.Brenner, T.Hubbard & C.Chothia (1995), J.Mol.Biol., 247, 536-540
MAFFT
K. Katoh, K. Kuma, H. Toh and T. Miyata (2005) "MAFFT version 5: improvement in accuracy of multiple sequence alignment" Nucleic Acids Res. 33, 511-518
CLUSTALW
Chenna, Ramu, Sugawara, Hideaki, Koike, Tadashi, Lopez, Rodrigo, Gibson, Toby J, Higgins, Desmond G, Thompson, Julie D. (2003) "Multiple sequence alignment with the Clustal series of programs" Nucleic Acids Res 31, 3497-500
CHAINSAW
N.D.Stein (2008) "CHAINSAW: a program for mutating pdb files used as templates in molecular replacement." J. Appl. Cryst. 41, 641-643
MOLREP
A.A.Vagin & A.Teplyakov (1997) J. Appl. Cryst. 30, 1022-1025
PHASER
McCoy, A.J., Grosse-Kunstleve, R.W., Adams, P.D., Winn, M.D., Storoni, L.C. & Read, R.J. (2007). "Phaser crystallographic software" J. Appl. Cryst. 40, 658-674
REFMAC
G.N. Murshudov, A.A.Vagin and E.J.Dodson, (1997) "Refinement of Macromolecular Structures by the Maximum-Likelihood Method" Acta Cryst. D53, 240-255
ACORN
Yao Jia-xing, Woolfson,M.M., Wilson,K.S. and Dodson,E.J. (2005) Acta Cryst. D61, 1465-1475

Acknowledgements

This work was supported by the BBSRC via the e-HTPX project. It is now supported by the CCP4 project.


Old News

Update 21/08/07: MrBUMP version 0.4.1 now available.

Update 17/07/07: MrBUMP version 0.4.0 now available. New features include:

Update 10/04/07: There is now a MrBUMP tutorial. The material and data used in the tutorial can be downloaded from:

Update 23/03/07:

Update 3/10/06:

Beta version 0.3.2 released.

New in version 0.3.2:

  1. Users can now add their own, locally stored PDB files to the search.
  2. Any PDB IDs included in the search by the user are automatically included in the MR stage, regardless of how they score in the multiple alignment.
  3. FASTA search can be turned off if the user specifies search models.
  4. Updated GUI, includes a new section for user specified search models.
  5. Fix for handling NMR models - 1st model in PDB file is extracted and used as a search model.
  6. SSM search fixed.
  7. Additional HTTP source for PDB files has been added to the two existing FTP sources.
  8. Scoring of marginal and full solutions relaxed slightly.
  9. Resolution of template PDBs output to log file.
  10. Phaser ensemble fixed.
  11. Several other fixes.
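
The NMR handling in item 5 above (take the first model in the PDB file as the search model) can be illustrated as follows. This is a minimal sketch under stated assumptions, not MrBUMP's own code: the function name is hypothetical, models are assumed to be delimited by MODEL and ENDMDL records as in the standard PDB format, and records outside any model (header and trailer) are kept.

```python
def first_model(pdb_lines):
    """Return the lines of a PDB file with only the first model retained.

    MODEL and ENDMDL records themselves are dropped. A file without
    MODEL records is returned unchanged.
    """
    kept = []
    model_count = 0
    inside_model = False
    for line in pdb_lines:
        record = line[:6].strip()
        if record == "MODEL":
            model_count += 1
            inside_model = True
        elif record == "ENDMDL":
            inside_model = False
        elif not inside_model or model_count == 1:
            # Keep records outside any model, plus the first model's records.
            kept.append(line)
    return kept
```

Dropping the MODEL and ENDMDL records is a design choice made here so that the output looks like a single-model file; whether downstream programs need this is not something the sketch verifies.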

Update (16/6/06): Beta version 0.3.1 released. This version includes the following new features:

  1. Several bug fixes.
  2. Updated GUI interface.
  3. NMR structures can be used as templates for search models.
  4. Added option to run molrep and/or phaser.
  5. Chains inclusion moved to Template search stage.
  6. TRYALL keyword - all search models tried in MR or exit after first solution is found.
  7. HTML output is now optional.
  8. This version will also run on Windows but requires manual installation of the files in the CCP4 directory.
MrBUMP 0.3.1 can be downloaded here.
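
The behaviour selected by the TRYALL keyword (item 6 above) amounts to a choice between trying every search model and stopping at the first solution. A minimal sketch of that control flow is given below. It is illustrative only: run_search, run_mr and try_all are hypothetical names, and run_mr stands in for whatever runs MR on one search model and reports whether a solution was found.

```python
def run_search(models, run_mr, try_all=False):
    """Run MR on each search model in order and return those that solved.

    models  : iterable of search models (any objects)
    run_mr  : callable taking one model, returning True if MR found a solution
    try_all : if False, stop as soon as the first solution is found
    """
    solutions = []
    for model in models:
        if run_mr(model):
            solutions.append(model)
            if not try_all:
                break  # first solution found: skip the remaining models
    return solutions
```

With try_all=False the run is shorter but later models, which might give a better solution, are never tested; with try_all=True every model is tried at the cost of more computing time.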

Update (23/3/06): Beta version 0.3 released. This version includes the following new features:

  1. Updated GUI interface.
  2. New mode of operation - only generate search models.
  3. New PDB sequence database with sequences from PDB ATOM records.
  4. Faster search model generation step.
  5. Polyalanine models can now be generated.
  6. Display space group information in GUI.
  7. The number of cycles of restrained refinement in Refmac can be varied.
  8. Improved output.
  9. Added Reference details to output.
  10. Several other bug fixes and updates.
MrBUMP 0.3 can be downloaded here.

Newsflash (09/2/06): We're back in business, after 4 days of the web server being off-line. During this time, MrBUMP will have failed while trying to check databases held here. We'll update MrBUMP soon so that it can continue if this happens again.

Update (30/1/06): Beta version 0.2.5 released. This version includes the following new features:

  1. Ability to include specific "chains" in the search models.
  2. User entry of no. of molecules in the a.s.u.
  3. User entry for PACK keyword in Phaser (clashing tolerance).

Newsflash (20/1/06): We have had our first report of a novel structure solution "which resisted solution since December 2002". Details to follow. Interestingly, the solution was based on one of two identical molecules in the PDB entry, the other failing to give a convincing solution.

Newsflash (11/1/06): It seems there has been a format change in the OCA output in the last day or so which breaks the initial FASTA search. Mr Bump has been changed to deal with this. If you are using OCA to get your search results you will need to download and re-install mrbump. If you have fasta34 installed on your machine and are using it to do the search locally mrbump should run fine and you will not need to re-install.

Feedback: The release of Mr Bump has generated some interest, and the feedback from users is uncovering a number of problems. Some of these are minor bugs in Mr Bump. Others are due to peculiarities of particular cases and/or dubious files from databases. We believe that these problems will only manifest themselves in certain cases, so please do not be discouraged from giving it a go.

Known problems are listed at the end of this page. We are fixing these, and updating the download file. So if you have a problem, try downloading again. Alternatively, many problems can be bypassed by turning off the appropriate option in the interface, e.g. if the problem is in SCOP don't include the SCOP search. Finally, please let us know how you are getting on!

*** If it does work for you, please let us know! Future development depends on us knowing what we did right, as well as what we did wrong.