From mgwt@ysbl.york.ac.uk Wed Dec 7 11:52:02 2005
Date: Wed, 06 Jul 2005 14:36:01 +0100
From: Maria Turkenburg
To: ccp4@ccp4.ac.uk
Subject: [ccp4]: oh dear

Dear CCP4-er,

I've just realised that I'm seriously behind in sending you updates for the CCP4 tutorials (examples/tutorial/html/), so I'd better get on with that.
Please find attached a new version for intro-tutorial.html

Cheers,
Maria

--
*********************************************************
Dr. Maria G.W. Turkenburg - van Diepen
Structural Biology Laboratory    phone: +44 1904 328257
Department of Chemistry          fax  : +44 1904 328266
University of York               email: mgwt@ysbl.york.ac.uk
York, UK-YO10 5YW                URL: http://www.ysbl.york.ac.uk/~mgwt
*********************************************************

    [ Part 2: "Attached Text" ]

CCP4 Tutorial: Session 1 - Introduction

See also the accompanying document giving background information.

In the following instructions, when you need to type something, or click on something, it will be shown in red. Output from the programs or text from the interface is given in green.

OUTLINE OF THE METHOD

1. Setting up Project and Directory Aliases
2. Introduction to the MTZ format
3. MTZ format: unmerged files
4. 
   The Loggraph Utility

THE DATA FILES

Directory DATA contains input files:

     toxd.hkl               reflection file from X-PLOR/CNS
     aucn.na4               reflection file in NA4 format
     toxd.pdb               coordinate file of TOXD

Directory RESULTS contains selected output files (you can look at these if you have problems, or the job is too slow):

     toxd.mtz               reflection file in CCP4 format
     import-cns.log         .log of importing CNS reflection file into CCP4
     import-cns.def         CCP4i .def of importing CNS reflection file into CCP4
     import-unmerged.log    .log of importing an unmerged reflection file into CCP4
     import-unmerged.def    CCP4i .def of importing an unmerged reflection file into CCP4

You will work in your own directory TEST.

If you have problems following the instructions, then you can use .def files in directory DATA which contain the necessary parameters. You can load these files into the interface using the option at the bottom of the task window:

     Save&Restore -> Restore from File -> select the file.

Often you will use the output file of one job as the input file for the next job. However, if you do not have the output file, then it will also be available in directory DATA.

1A) SETTING UP PROJECT AND DIRECTORY ALIASES

The Problem

When using ccp4i for the first time, you need to set up a project to work in. You also need to define directories so that ccp4i knows where to find files.

Exercise

1. In your home directory, make a subdirectory TEST:

     > mkdir TEST

2. Start ccp4i:

     > ccp4i

   The Main Window will appear.

3. Click on the Directories&ProjectDir button in the main window.

4. In the new window, click on Add Project and in the new line enter a project alias TEST and then enter the full path name for the subdirectory TEST that you have just made:

     Project TEST uses directory: $HOME/TEST

5. Select this new project on the next line:

     Project for this session of CCP4Interface TEST

6. 
   Click on Add Directory Alias and in this new line add the directory alias DATA and the path name:

     Alias: DATA for directory: $CEXAM/tutorial/data

7. Repeat the previous step to add the alias RESULTS:

     Alias: RESULTS for directory: $CEXAM/tutorial/results

8. Click on Apply&Exit.

1B) INTRODUCTION TO THE MTZ FORMAT

The Problem

The MTZ file format is central to running the CCP4 programs. When using CCP4 for the first time, you will usually have to convert an external file to MTZ format. You also need to understand how information is arranged in an MTZ file.

In this example, we convert a CNS reflection file for the protein toxd to MTZ format, and briefly examine the MTZ file.

Exercise

1. Find the Choose module pull-down menu in the main window, and select Reflection Data Utilities.

2. In the Tasks menu below, click on Convert to MTZ and Standardise. This will open a Task window.

3. On the first line, enter a suitable job title such as:

     Job title  Importing CNS file for toxd (intro tutorial step 20)

4. On the second line, select X-PLOR/CNS from the pull-down menu. Wait while the task window re-draws itself.

5. On the 3rd line, select:

     Create full unique set of reflections and keep existing FreeR data.

   by checking that the radiobutton on the left-hand side is on (this is the default), and selecting the appropriate option from the pull-down menu.

6. Now enter the input CNS file as:

     In   DATA  toxd.hkl

   (you can use the Browse button after selecting DATA from the pull-down menu). The output file should be automatically set to:

     Out  TEST  toxd.mtz

   (if it is not, type this yourself).

7. Now look at the folder MTZ Project, Crystal & Dataset Names. These names will be used to identify the data for Data Harvesting and to categorise the data within MTZ data structures. Enter:

     Crystal wildtype belonging to Project toxd
     Dataset name native

8. Now look at the folder Cell and Spacegroup to be saved in MTZ file. 
   We need to supply the spacegroup and cell dimensions, since these are not included in the input CNS file. Enter:

     Space group name or number  19
     Cell dimensions  a 73.582  b 38.733  c 23.189  alpha 90  beta 90  gamma 90

9. Now look at the folder Detailed specification of file format. The format of X-PLOR/CNS files is variable, and we need to make sure that the task is able to read the format of the input file correctly. If an incorrect format statement is given, the task fails with an error such as: " f2mtz: problems reading reflection 0 ". In this case, the default format statement is slightly wrong, and needs to be changed to:

     Fortran format  '(6X,3F5.0,6X,F10.3,10X,7X,F10.3,6X,F10.0)'

   i.e. add 10X, in front of 7X (this skips the column holding the imaginary component of the input F).

10. The remainder of the task window can be left unchanged, so go to the bottom of the task window and click on Run -> Run Now.

   Look at the main window of the interface again, and look at the Job List. The current Import job should be at the top. The status will be given as STARTING, RUNNING and then FINISHED. This job is very quick, so you may only see the FINISHED status.

11. When the job has finished, highlight the job in the job list by clicking on it. Then select View Files from Job -> toxd.mtz in the main window.

   A window will open displaying the contents of the MTZ file that you have created (the MTZ file is a binary file, so you are actually just seeing the output of a viewer program). The information that is displayed comes from the header of the MTZ file. Look for the following:

   * Dataset ID, project/crystal/dataset names, cell dimensions, wavelength:

       1 toxd
         wildtype
         native
         73.5820 38.7330 23.1890 90.0000 90.0000 90.0000
         0.00000

     Information about the datasets included in the file is given here. In this example, the file just contains one dataset.

   * Column Labels :

       H K L FP SIGFP FreeRflag

     The file contains 6 columns; 3 holding the hkl indices, and 3 containing data. 
     The names of these columns are given here. In the MTZ format, the column names are not fixed, and neither is the order of the columns. Programs use these names to identify the columns that are to be used.

   * Column Types :

       H H H F Q I

     Each column has an associated type. For example, F refers to a structure factor amplitude: the column FP has this type.

   * Associated datasets :

       0 0 0 1 1 1

     This is a list of the datasets associated with each column. In this example, all columns belong to dataset 1.

   * Cell Dimensions : (obsolete - use crystal cells)

       73.5820 38.7330 23.1890 90.0000 90.0000 90.0000

   * Resolution Range :

       0.00074 0.18900 ( 36.786 - 2.300 A )

   * Sort Order :

       1 2 3 0 0

   * Space group = 'P 21 21 21' (number 19)

   The cell dimensions, resolution range and space group are carried in the MTZ file header, so that you do not normally need to enter them explicitly when running programs.

12. By default, only the header information from the MTZ file is displayed. To see more, click on List More Info at the bottom of the display window. A dialogue box will appear. Accept the defaults and click Apply&Exit. Extra information is now displayed at the bottom of the display window. Scroll down and look at the table:

     OVERALL FILE STATISTICS for resolution range 0.001 - 0.189
     =======================

      Col Sort      Min      Max     Num         %     Mean     Mean     Resolution   Type Column
      num order                  Missing  complete              abs.     Low   High        label

        1 ASC         0       31       0    100.00     11.9     11.9   36.79   2.30   H    H
        2 NONE        0       16       0    100.00      6.2      6.2   36.79   2.30   H    K
        3 NONE        0       10       0    100.00      3.6      3.6   36.79   2.30   H    L
        4 NONE    170.0  20154.0      74     97.71  2851.78  2851.78   36.79   2.30   F    FP
        5 NONE     18.0    465.0      74     97.71   140.33   140.33   36.79   2.30   Q    SIGFP
        6 NONE      0.0     22.0       0    100.00    11.59    11.59   36.79   2.30   I    FreeRflag

     No. of reflections used in FILE STATISTICS 3235

   Each line corresponds to a column of data in the MTZ file, and for each line various statistics are given. For example, Num Missing gives the number of reflections in that column which have been flagged as missing data (e.g. 
   a structure factor amplitude which wasn't measured in the diffraction experiment).

13. At the bottom of the display, the first 10 reflections are listed (more can be listed via the List More Info option):

       0   0   2     626.00    112.00      3.00
       0   0   4    9111.00    168.00     22.00
       0   0   6     513.00    146.00     20.00
       0   0   8    2610.00     52.00     10.00
       0   0  10          ?         ?     11.00
       0   1   1    1200.00     38.00     13.00
       0   1   2    2244.00     55.00     21.00
       0   1   3    2163.00     36.00      6.00
       0   1   4    6057.00     82.00     13.00
       0   1   5    3698.00     46.00     16.00

   The rows correspond to different reflections, and the columns correspond to the 6 columns of data described in the header. Some entries are given as "?". This represents missing data, and the total number of such entries for each column is listed in the table OVERALL FILE STATISTICS.

14. When you have finished examining the file, click on Quit. Close all other windows except the main window.

1C) MTZ FORMAT: UNMERGED FILES

The Problem

The previous example looked at a so-called merged MTZ file. This type of file has only one record for each set of hkl indices, and is the type of file one has after merging together all different observations of a particular reflection. In the early stages of data processing, however, one has several observations of each reflection (i.e. from different images or symmetry-related) and such reflection data are held in an unmerged MTZ file.

In this exercise, we examine an unmerged MTZ file.

Exercise

1. Open the Convert to MTZ and Standardise task window again (see above).

2. On the first line, change the job title to:

     Job title  Importing unmerged DMSO data (intro tutorial step 40)

3. On the second line, select ascii MTZ from the pull-down menu.

4. On the 3rd line, turn off Create full unique set of reflections using the radiobutton. This is not appropriate for unmerged data.

5. Now enter the input file as:

     In   DATA  aucn.na4

   (In the File Selection Window, change the Filename filter to *.na4)

   The output file is set automatically to:

     Out  TEST  aucn.mtz

6. 
   Now look at the folder MTZ Project, Crystal & Dataset Names. Enter:

     Crystal aucn belonging to Project dmso
     Dataset name red_aucn

7. Cell and symmetry information is obtained from the input file and doesn't need to be entered. So click on Run -> Run Now.

8. When the job has finished, inspect the contents of the output unmerged file using View Files from Job -> aucn.mtz. Much of the information is the same as for the previous example, but there is some extra information specific to unmerged MTZ files.

9. Unmerged MTZ files have a standard set of column labels:

   * Column Labels :

       H K L M/ISYM BATCH I SIGI IPR SIGIPR FRACTIONCALC XDET YDET ROT WIDTH LP MPART

   These will normally be the same for all unmerged files.

10. Reflection records are grouped into batches: a batch corresponds to an image (or group of images) upon which a subset of the reflections were recorded. The same hkl triplet may occur several times, with different instances being distinguished by different batch numbers. A list of batches is given at the end of the default display:

     Batch number: 5
     Batch number: 6
     Batch number: 7
     Batch number: 8
     Batch number: 9
     Batch number: 10

11. Click on List More Info, and this time select batch headers for multi-record MTZ before clicking Apply&Exit. In the main display window, the batch header for each batch is displayed.

     Orientation data for batch 5 oscillation data

     Crystal number ................... 0
     Associated dataset ID ............ 1
     Cell dimensions .................. 88.91 88.91 229.22 90.00 90.00 90.00
     Cell fix flags ................... -1 1 -1 0 0 0
     Orientation matrix U ............. 1.0000 0.0000 0.0000
         (including setting angles)     0.0000 1.0000 0.0000
                                        0.0000 0.0000 1.0000
     Reciprocal axis nearest .. c*
     Mosaicity ........................ 0.020
     Datum goniostat angles (degrees).. 0.000
     Start & stop Phi angles (degrees). 343.000 344.000
     Range of Phi angles (degrees)..... 0.000
     Start & stop time (minutes)....... 0. 0. 
     Crystal goniostat information :-
     Number of goniostat axes.......... 1
     Goniostat vectors..... .... 0.0000 0.0000 1.0000
                      ..... .... 0.0000 0.0000 0.0000
                      ..... .... 0.0000 0.0000 0.0000

     Beam information :-
     Idealized X-ray beam vector....... -1.0000 0.0000 0.0000
     X-ray beam vector with tilts...... -1.0000 0.0000 0.0000
     Wavelength and dispersion ........ 0.88000 0.00120 0.00010
     Divergence ....................... 0.120 0.020

     Detector information :-
     Number of detectors............... 0
     Crystal to Detector distance (mm). 0.000
     Detector swing angle.............. 0.000
     Pixel limits on detector.......... 0.0 0.0 0.0 0.0

   The batch header contains information on how the corresponding image was recorded, and this information is used by certain programs such as SCALA.

12. When you have finished examining the file, click on Quit. Close all other windows except the main window.

1D) THE LOGGRAPH UTILITY

The Problem

Many of the CCP4 programs produce specially formatted log files containing tables and graphs, which the program LOGGRAPH can recognise and display graphically. Graphs can be edited and annotated, and printed either to a PostScript file or directly to a printer.

In order to create a log file with suitable graphs for the purpose of this tutorial, we will run program BAVERAGE from CCP4i.

Exercise

1. Select the Structure Analysis module, and open the Temperature Factor Analysis task window.

2. On the first line, enter a suitable job title such as:

     Job title  Getting to grips with loggraph (intro tutorial step 100)

3. Select a PDB input file:

     PDB in  DATA  toxd.pdb

   (Use the Browse button after selecting DATA from the pull-down menu)

4. Click on Run -> Run Now. The program BAVERAGE will run, and the Loggraph Viewer will open automatically.

5. In the Loggraph window, from the Tables in File panel, select Average B v residue From baverage. CHAIN A. The first Graph in Selected Table will be displayed, namely Average Bfactors (all atoms) Chain A. 
   Change this to Average Bfactors (side chains) Chain A.

   Some of the residues have Bfactors of 0.0 for the side chains. Use the cursor and the cross-hairs to determine which residues they are. Check against the contents of the PDB file why they should have such a value. To do this, in the Temperature Factor Analysis task window, click on the View button in the line where the PDB file was selected - this will display a CCP4i fileviewer window with the contents of toxd.pdb.

   To come back to Loggraph at a later date, select the baverage job from the Job List in the main window of CCP4i, then click on View Files from Job -> View Log Graphs from the menu on the right-hand side of the main window.

   To view graphs from a log file which has not been produced by CCP4i (and is hence not part of any project Job List), click on View Any File from the menu on the right-hand side of the main window. Then go to the directory which contains the log file, select File type log CCP4 log and Viewer View Log Graphs. Select the desired file, and click on Display&Exit. The Loggraph viewer will now be displayed as before.

6. Close the Loggraph window using File -> Exit. Close or Quit all other interface windows except the main window.

________________________________________________________________________________

On to the next tutorial - Data Processing and Reduction.
Back to the index.
________________________________________________________________________________
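
Note on checking the header figures in section 1B. Two of the numbers shown for toxd.mtz can be cross-checked by hand, and the short Python sketch below does so. It is not part of CCP4; the function names are made up for illustration. It assumes that the first pair of numbers on the Resolution Range line (0.00074 0.18900) are values of 1/d^2 in 1/A^2, which is consistent with the Angstrom limits printed beside them, and that "% complete" is the share of reflections not flagged as missing.

```python
import math


def d_spacing(inv_d_squared: float) -> float:
    """Convert a 1/d^2 value (assumed to be in 1/A^2) to a d-spacing in A."""
    return 1.0 / math.sqrt(inv_d_squared)


def percent_complete(n_reflections: int, n_missing: int) -> float:
    """Percentage of reflections in a column that are not flagged as missing."""
    return 100.0 * (n_reflections - n_missing) / n_reflections


if __name__ == "__main__":
    # Values copied from the header and statistics table shown in section 1B.
    high_res = d_spacing(0.18900)
    low_res = d_spacing(0.00074)
    fp_complete = percent_complete(3235, 74)

    print(f"high resolution limit: {high_res:.3f} A")
    print(f"low resolution limit:  {low_res:.2f} A")
    print(f"FP completeness:       {fp_complete:.2f} %")
```

The high resolution limit and the completeness should reproduce the quoted 2.300 A and 97.71. The low resolution limit comes out slightly below the quoted 36.786 A, because the header prints 1/d^2 rounded to five decimal places.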