----------------------------------------
Progress Report on CCP4 work for BIOXHIT
----------------------------------------

Peter Briggs, 22nd June 2005

1. Overview
===========

Wendy Yang has been working on prototyping the database handler with input from myself and Graeme Winter. The prototype is written in the Python scripting language and uses a simple MySQL database backend to store arbitrary data. Since April, Graeme has been using the handler in his own system, and this collaboration has produced valuable input into the development of the prototype. As a result, the prototype has uncovered a large number of requirements and issues which we have documented and which will need to be addressed in a full system.

More recently we have been in touch with Steven Ness at Leiden University. Steven is using CCP4i extensively as the basis for a semi-automated structure determination module called CRANK, and is also interested in the issues of storing and retrieving data from large numbers of jobs. Steven is interested in using the BIOXHIT system for his own work - he has been looking at the prototype handler and has made some suggestions which we are now looking at implementing.

Some work has also been done on basic visualisation tools, using the Graphviz package to generate network diagrams of project histories from existing CCP4i project databases. Based on this I am now looking at developing a more formal toolkit of functions for exploring project histories.

Some work has also been done on integrating the handler into CCP4i.

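As an illustration of the kind of network diagram described above, the following is a minimal sketch only and not the actual tool. It assumes a hypothetical in-memory list of job records (the keys "id", "task" and "inputs" and the function name history_to_dot are invented for this example and do not reflect the real CCP4i project database layout) and writes out plain Graphviz DOT text.

```python
def history_to_dot(jobs, graph_name="project_history"):
    """Return Graphviz DOT text for a list of job records.

    Each record is a dict with the (hypothetical) keys 'id', 'task' and
    'inputs', where 'inputs' lists the ids of the earlier jobs whose
    output this job used.
    """
    lines = ["digraph %s {" % graph_name]
    # One node per job, labelled with its id and task name.
    for job in jobs:
        lines.append('    job%d [label="%d: %s"];'
                     % (job["id"], job["id"], job["task"]))
    # One edge per dependency, pointing from the earlier job to the later one.
    for job in jobs:
        for parent in job["inputs"]:
            lines.append("    job%d -> job%d;" % (parent, job["id"]))
    lines.append("}")
    return "\n".join(lines)


# Made-up example history: job 2 is used by two separate refinement runs.
example_jobs = [
    {"id": 1, "task": "import data", "inputs": []},
    {"id": 2, "task": "scale data", "inputs": [1]},
    {"id": 3, "task": "refine model", "inputs": [2]},
    {"id": 4, "task": "refine model (rerun)", "inputs": [2]},
]

dot_text = history_to_dot(example_jobs)
print(dot_text)
```

The resulting text can be saved to a file (say history.dot) and rendered with the Graphviz command line, for example "dot -Tpng history.dot -o history.png". Emitting DOT text directly keeps the sketch free of any dependency beyond Graphviz itself.
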
So far relatively little work has been performed on the specification of the database itself, for a number of reasons: firstly, Wendy needs to acquire more general background knowledge of the field of PX; secondly, the specification must fit the needs of the end users, who so far do not have a good idea themselves of what the database needs to do; thirdly, we felt that establishing a usable prototype system was most important at this early stage. However, Wendy has recently spent time learning more about the PX background and has been talking to potential end users, so progress is also being made towards this goal.

Presentations on the work have been made at a number of CCP4 meetings, most recently at the CCP4 Annual Developers Meeting (22-23rd March) and the CCP4 Automation Meeting (6-7th June).

2. Progress towards Milestones and Deliverables
===============================================

Ms 5.2.1 Implementation of prototype handler and communication protocols (18)
D5.2.3 Specification for Version 1 of the Project Database Handler design (24, Report)
D5.2.1 Version 1 of independent project database handler application with well-defined interaction protocols (24, Prototype)

A prototype handler has been developed and we are starting to make it available to external developers. We are now starting to prototype the communication protocols (i.e. how the data is passed from the application to the database and back); however, due to Wendy's absence during June/July we will miss milestone Ms 5.2.1 by 6-8 weeks. We are documenting the issues that have arisen as part of the prototyping process, and this will form the basis of the specification in deliverable D5.2.3. Development of the prototype will continue to D5.2.1.

Ms 5.2.2 Implementation of prototype project history visualisation tools (24)

Some work has already been done towards this milestone (see above).

D5.2.2 Evaluation report on the requirements of the database system for storing the data (18, Report)

This work is ongoing, through prototyping and discussions with potential end users. Documentation currently exists which will form the basis of this report; see http://www.ccp4.ac.uk/projects/bioxhit/documents/ProjectHandlerDesc.html (to access use username "bioxhit"; the password is the username backwards). I see working towards D5.2.2 as the next major area to look at in the latter half of 2005.

D5.1.4 Integration of XML Schema for messaging with WP5.2 Project Tracking Database (24, Report)

This is dependent on the delivery of the XML Schema (which I believe is an EBI deliverable) and the creation of the schema for the project tracking database (CCP4 deliverable).

D5.1.10 Report on the evaluation/identification of the toolbox functions (24, Report)

This is separate from the database work above, and refers to the specification of a "PX toolbox for automation". So far, a workshop has been held in Cambridge in February to look at the issues involved in automating software pipelines - I wrote a report on this (see http://www.ebi.ac.uk/msd-srv/docs/9-11Feb2005/BIOXHIT_workshop_report.pdf) which included analyses of the existing automation efforts with a view to isolating common aspects that could be abstracted to a "toolbox" library.