Testing the CCP4i SHELX C/D/E Task

1 Introduction

This document outlines the tests that have been performed on the CCP4i SHELX C/D/E task interface, to compare the outputs with those from running the programs as a "pipeline" via standard shell scripts.

The scripts are taken from the examples in the "High throughput phasing with SHELXC/D/E" document, which is available from the SHELX website as a PDF document:

http://shelx.uni-ac.gwdg.de/SHELX/fastphas.pdf

The tests highlighted some differences in outputs from the SHELX pipeline depending on whether the programs were run via the CCP4i SHELX interface in CCP4 version 6.0, or via shell scripts. The differences were relatively minor, and they have since been addressed for release 6.1.

The tests were performed at the beginning of April 2006, and the results are given for the following software versions:

SHELXC: Version 2006/3
SHELXD: Version 2006/3
SHELXE: Version 2006/3
CCP4i: developmental version of v1.4.5

Initial tests were also carried out using CCP4i version 1.4.4 (released with CCP4 6.0). Those results are described, but the data themselves are not given.

All the files and directories (including this document) can be downloaded from the CCP4 ftp server as a compressed archive file:

ftp://ftp.ccp4.ac.uk/pjx/ccp4i/ccp4i_shelx_tests.tar.gz

2 Test Cases

2.1 Test data

There are two sets of data, each of which can be downloaded from the Autostruct Test Data page at http://www.ccp4.ac.uk/autostruct/testdata/index.html:

JIA (Acyl-CoA Thioesterase II): sfdata-jia.tgz
Thaumatin: sfdata-thau.tgz

2.2 Test scripts

There are three test cases:

MAD example using the JIA data
SAD example using the Thaumatin data
SIRAS example using the Thaumatin data

The shell scripts and output files are in the subdirectories MAD, SAD and SIRAS respectively.

The CCP4i files are all in the subdirectory SHELX_TEST. This is a CCP4i project directory - to make it visible, go into the "Directories&ProjectDir" CCP4i window and assign the project name SHELX_TEST to point to this directory.

2.3 How to run the tests

Download the test data for JIA and thaumatin into the DATA directory.
For each test, move to the directory and run the appropriate shell script. This will run the pipeline script and will (re)generate the results files.
Start up CCP4i and change to the project directory SHELX_TEST (you may need to make this a project directory within CCP4i first). Then rerun the examples.
To compare the output, run the differences.sh script within each of the example directories. This will generate a report of the differences that can then be examined e.g. in a web browser.
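
The scripted parts of these steps can be sketched as a small driver script. This is only an illustration, not part of the test archive: it assumes that it is run from the top of the unpacked archive and that the example scripts can be invoked with `sh`. The `DRY_RUN` switch and the `run_in_dir` helper are hypothetical names introduced here. The script names and the `differences.sh` arguments are those listed in the tables in section 3.

```shell
#!/bin/sh
# Sketch of the steps in section 2.3, run from the top of the unpacked archive.
# Assumptions: the scripts can be invoked with "sh"; DRY_RUN is a hypothetical
# switch, defaulting to 1 so that commands are printed rather than executed.
DRY_RUN=${DRY_RUN:-1}

# run_in_dir DIR COMMAND [ARGS...]: run COMMAND inside DIR (or just print it).
run_in_dir() {
    dir=$1
    shift
    if [ "$DRY_RUN" = 1 ]; then
        echo "(cd $dir && $*)"
    else
        (cd "$dir" && "$@")
    fi
}

# Step 2: regenerate the scripted results for each example.
run_in_dir MAD   sh ./shelx_mad_example.sh
run_in_dir SAD   sh ./shelx_sad_example.sh
run_in_dir SIRAS sh ./shelx_siras_example.sh

# Step 3 is interactive: rerun the corresponding jobs in CCP4i before continuing.

# Step 4: build the comparison reports. The SAD call is omitted here; it is
# assumed to follow the same pattern, with its arguments taken from section 3.2.
run_in_dir MAD   ../differences.sh MAD jia jia_nat SHELX_TEST 13
run_in_dir SIRAS ../differences.sh SIRAS thaui thau-iod SHELX_TEST 17
```

With the default `DRY_RUN=1` the script only prints the commands it would run. Setting `DRY_RUN=0` executes them, which requires the test data to be in place and the CCP4i jobs to have been rerun before the comparison step.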

3 Test Results

3.1 MAD Example

The table below links to the input shell script and CCP4i parameter files, plus the logfiles. There is also a link to the other output files and differences between those generated by the different modes of running.

File	Description
shelx_mad_example.sh	Shell script for MAD example
shelx_mad_example.log	Logfile from script
13_shelx_cde.def	CCP4i def file for MAD example
13_shelx_cde.log	Logfile from CCP4i task
index.html	Comparisons of test outputs generated by running `../differences.sh MAD jia jia_nat SHELX_TEST 13`

3.1.1 Summary of results

This test highlighted a minor bug in the way that CCP4i in CCP4 6.0 dealt with the "minimum distance" (MIND) input to the SHELX programs: if the user failed to specify a value then the .ins file would contain a bad MIND command, which resulted in SHELXD using a different value from its default. This problem has since been addressed, so it does not appear in the examples above.

The remaining differences are in the program log files and are concerned with differing filenames, dates and timings. They are thus not considered to be significant.
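
One way to confirm that two logfiles differ only in dates and timings is to mask those fields before comparing them. The sketch below is illustrative and is not how `differences.sh` works. The function names `normalize_log` and `logs_equivalent` are hypothetical, and the patterns (times as `HH:MM:SS`, dates as `DD-Mon-YYYY`, durations as `N seconds`) are assumptions rather than the actual SHELX log format, so they would need adjusting to the real files. Differing filenames are not handled here.

```shell
#!/bin/sh
# normalize_log: read a log on stdin and write it with volatile fields masked.
# The three patterns are assumed formats, not taken from real SHELX logfiles.
normalize_log() {
    sed -e 's/[0-9][0-9]:[0-9][0-9]:[0-9][0-9]/<TIME>/g' \
        -e 's/[0-9][0-9]*-[A-Z][a-z][a-z]-[0-9][0-9][0-9][0-9]/<DATE>/g' \
        -e 's/[0-9][0-9.]* seconds/<N> seconds/g'
}

# logs_equivalent FILE_A FILE_B: succeed if the two logs are identical
# once times, dates and durations have been masked.
logs_equivalent() {
    [ "$(normalize_log < "$1")" = "$(normalize_log < "$2")" ]
}
```

A call such as `logs_equivalent script.log ccp4i.log && echo "only dates and timings differ"` (with placeholder file names) then reports whether anything other than the masked fields differs.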

3.2 SAD Example

File	Description
shelx_sad_example.sh	Shell script for SAD example
shelx_sad_example.log	Logfile from script
14_shelx_cde.def	CCP4i def file for SAD example
14_shelx_cde.log	Logfile from CCP4i task
index.html	Comparisons of test outputs generated by running `../differences.sh MAD jia jia_nat SHELX_TEST 13`

3.2.1 Summary of results

This example highlighted an issue with the way that the heavy atom search parameters were passed to SHELXD by the CCP4 6.0 version of CCP4i. In a script, the parameters are passed to SHELXC, which then generates a .ins file for SHELXD that contains these parameters plus others derived from their values. In the interface, the parameters were not passed to SHELXC; instead they were substituted into the .ins file from SHELXC immediately before running SHELXD. As a result, the values of the derived parameters (in this case the UNIT command) differed between the two modes, which produced different logfile output. This did not seem to affect the resulting sites, which were identical.

This issue has now been addressed in the latest CCP4i, which passes the heavy atom search parameters to SHELXC in the same way that a script does. As a result, this issue does not appear in the examples above.
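
Differences of this kind (the UNIT command here, or the MIND command in the MAD example) can be checked directly by comparing the relevant lines of the two .ins files. The following is a minimal sketch under stated assumptions: the function names `ins_lines` and `same_ins_lines` are hypothetical, and the paths of the .ins files written by the script and by the CCP4i job must be supplied by the user.

```shell
#!/bin/sh
# ins_lines KEYWORD FILE: print the lines of an .ins file that begin with
# KEYWORD (case-insensitive), followed by whitespace or the end of the line.
ins_lines() {
    grep -i -E "^$1([[:space:]]|\$)" "$2"
}

# same_ins_lines KEYWORD FILE_A FILE_B: succeed if both files contain
# identical KEYWORD lines (including the case where neither has any).
same_ins_lines() {
    [ "$(ins_lines "$1" "$2")" = "$(ins_lines "$1" "$3")" ]
}
```

For example, `same_ins_lines UNIT script.ins ccp4i.ins || echo "UNIT differs"` (with placeholder file names) flags a mismatch in the generated UNIT command between the two modes.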

The remaining differences are in the program log files and are concerned with differing filenames, dates and timings. They are thus not considered to be significant.

3.3 SIRAS Example

File	Description
shelx_siras_example.sh	Shell script for SIRAS example
shelx_siras_example.log	Logfile from script
17_shelx_cde.def	CCP4i def file for SIRAS example
17_shelx_cde.log	Logfile from CCP4i task
index.html	Comparisons of test outputs generated by running `../differences.sh SIRAS thaui thau-iod SHELX_TEST 17`

3.3.1 Summary of results

With the CCP4 version 6.0 CCP4i, this example highlighted issues similar to those in the SAD example above. These issues have also since been addressed, so they are not reflected in the examples above.

The remaining differences are in the program log files and are concerned with differing filenames, dates and timings. They are thus not considered to be significant.

4 Conclusions

A set of three standard examples has been run to compare the output of the CCP4i SHELX C/D/E task against that from running a SHELX "pipeline" via a shell script.

Differences in output were identified when using the CCP4i v1.4.4 task, and these have been addressed for subsequent versions of the interface. The results using the updated version do not show any substantive differences in output between the two modes of running the programs.

At this time some other issues remain to be addressed (for example, inclusion of the SHELXE -e "free lunch" option), but these are beyond the scope of these tests.

Peter Briggs 10th April 2006