Database Handler in CCP4i

Changes to Database Handler in CCP4i
Peter Briggs, CCP4, 24th July 2002

Current Status:

CCP4i creates a database for each project
The "database" is a flat file in the CCP4i parameter file format (.def)
The database stores project history information for each job (taskname, date, status, title, plus references to input/output files and their locations)
The database handler, which reads and writes the database.def files, is embedded in the main CCP4i process (figure 1)
The handler services requests to: register a new job; get/set information for registered jobs (including updating a job's status and its input/output files); delete job records.
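
To make the request types above concrete, here is a minimal in-memory sketch. It is written in Python for illustration only and is not the CCP4i Tcl code. The class name, method names, field names and status strings are all assumptions, and the real handler reads and writes database.def rather than a dictionary.

```python
from datetime import date


class ProjectDatabase:
    """Hypothetical in-memory stand-in for one project's job database."""

    def __init__(self):
        self._jobs = {}     # job id -> dictionary of fields for that job
        self._last_id = 0   # ids are never reused, even after a delete

    def register_job(self, taskname, title=""):
        """Create a record for a new job and return its id."""
        self._last_id += 1
        self._jobs[self._last_id] = {
            "taskname": taskname,
            "date": date.today().isoformat(),
            "status": "STARTING",   # assumed status value
            "title": title,
            "input_files": [],
            "output_files": [],
        }
        return self._last_id

    def get(self, job_id, field):
        """Return one field of a registered job."""
        return self._jobs[job_id][field]

    def set(self, job_id, field, value):
        """Update one existing field of a registered job."""
        record = self._jobs[job_id]
        if field not in record:
            raise KeyError(field)
        record[field] = value

    def delete_job(self, job_id):
        """Remove the record for a job."""
        del self._jobs[job_id]
```

A caller would register a job, update its status and output files as the job runs, and delete the record when it is no longer wanted.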

Aims and Approach:

Use a client-server model with sockets for inter-process communication (standard issue Tcl)
The CCP4i main process and its children (scripts) interact with the database only via the db handler
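
The client-server approach can be sketched as follows. This is a Python illustration using the standard socket and socketserver modules, not the Tcl implementation. The one-line text protocol (NEWJOB, SETSTATUS, GETSTATUS), the class names and the status strings are hypothetical, and a real handler would also persist its data.

```python
import socket
import socketserver
import threading


class DbRequestHandler(socketserver.StreamRequestHandler):
    """Answers one text request per line until the client disconnects."""

    def handle(self):
        for raw in self.rfile:
            reply = self.server.dispatch(raw.decode("utf-8").split())
            self.wfile.write((reply + "\n").encode("utf-8"))


class DbServer(socketserver.ThreadingTCPServer):
    """Hypothetical db handler process: owns the job records."""

    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address):
        super().__init__(address, DbRequestHandler)
        self.jobs = {}                 # job id -> record
        self.last_id = 0
        self.lock = threading.Lock()   # serialise access from all clients

    def dispatch(self, parts):
        with self.lock:
            if len(parts) == 2 and parts[0] == "NEWJOB":
                self.last_id += 1
                self.jobs[self.last_id] = {"taskname": parts[1],
                                           "status": "STARTING"}
                return f"OK {self.last_id}"
            if len(parts) == 3 and parts[0] == "SETSTATUS" and parts[1].isdigit():
                job = self.jobs.get(int(parts[1]))
                if job is None:
                    return "ERR no such job"
                job["status"] = parts[2]
                return "OK"
            if len(parts) == 2 and parts[0] == "GETSTATUS" and parts[1].isdigit():
                job = self.jobs.get(int(parts[1]))
                if job is None:
                    return "ERR no such job"
                return f"OK {job['status']}"
        return "ERR bad request"


def request(address, line):
    """Client side: send one request line and return the reply line."""
    with socket.create_connection(address) as sock:
        sock.sendall((line + "\n").encode("utf-8"))
        with sock.makefile("r", encoding="utf-8") as stream:
            return stream.readline().strip()
```

Any process that can open a socket to the handler's address, whether the CCP4i main process, one of its scripts or an external program, could use the same request function. None of them touch the stored data directly.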

Motivations:

Allows processes other than CCP4i to communicate directly with the project history database e.g. Molecular Graphics or MOSFLM
More robust than existing implementation (e.g. avoid problems with "unsaved" database information being lost before being "committed" to the database file from the main CCP4i process)
Allows different database back-ends to replace database.def in future (e.g. MySQL) without significant changes to the rest of CCP4i
Suitable for extension to a distributed computing environment

Issues:

Do we start a new db handler process for each CCP4i process, or a new db handler for each database?
Locking issues: how should multiple processes be allowed access (read-write, read-only) to the same database (queuing/lock-grab/lock-out)?
Should subprocesses (i.e. scripts) initiated by the main CCP4i "controller" process communicate directly with the db handler, or via the controller? (Important if the controller is separated from the main CCP4i GUI)
Will the scope of information stored in the database expand in future? (in which case we may need an API to the db handler which can accommodate a broader range of requests than at present)
What are the security implications in a distributed computing environment?