This page is a dynamic document! Send me complaints about existing items, and suggestions for additions. With enough feedback, we might converge on an agreed list.
A first step is to list all the quantities that are included in the last category. This will enable us to set up proper methods of handling such data. These methods will probably be extended to the variables held in ccp4i .def files, so we will include those too. But for the time being, we will not consider variables held in the standard files. The aim is to be able to run a program automatically using these variables and the input files.
The following is a list of these variables. The names reflect a possible hierarchy. The ending _i implies there may be several such items. I have no particular implementation in mind (full database, XML, mmCIF, ASCII in log file, etc.). This list should be made into a dictionary, with explicit definitions.
ccp4i::njobs
ccp4i::job_i::status
ccp4i::job_i::date
ccp4i::job_i::logfile
ccp4i::job_i::scratch
ccp4i::job_i::taskname
ccp4i::job_i::title
ccp4i::job_i::input_files
ccp4i::job_i::input_files_dir
ccp4i::job_i::output_files
ccp4i::job_i::output_files_dir

The list of jobs actually forms a web, with perhaps many jobs providing information for a particular job, and that job perhaps providing information for several more jobs. There will also be many dead-ends. It may be difficult to re-create this web automatically, but we would at least need something like:
ccp4i::job_i::previous_jobs
ccp4i::job_i::next_jobs
ccp4i::job_i::success_or_fail

We need some form of error handling. Each job should produce:
ccp4i::job_i::error_level_number      # e.g. 0,1,2,3,4
ccp4i::job_i::error_level_severity    # e.g. OK, fatal, warning, info, library
ccp4i::job_i::error_program_name
ccp4i::job_i::error_message

Each program should produce some standard information whether or not it is run through ccp4i. In the latter case, these would be used to populate the job database.
program::name
program::ccp4_version
program::date
program::status
program::keywords
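The job web and error fields listed above could be held in memory along the following lines. This is only a sketch under stated assumptions: the `Job` class, the `link`, `ancestors` and `dead_ends` helpers, the integer job ids and the task names are all hypothetical, and no particular storage format (database, XML, mmCIF) is implied.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical in-memory form of one ccp4i::job_i record."""
    taskname: str
    previous_jobs: list = field(default_factory=list)  # ids of jobs that fed this one
    next_jobs: list = field(default_factory=list)      # ids of jobs fed by this one
    success_or_fail: bool = True
    error_level_number: int = 0
    error_level_severity: str = "OK"
    error_program_name: str = ""
    error_message: str = ""


def link(jobs, producer_id, consumer_id):
    """Record that producer_id provided information to consumer_id."""
    jobs[producer_id].next_jobs.append(consumer_id)
    jobs[consumer_id].previous_jobs.append(producer_id)


def ancestors(jobs, job_id):
    """All jobs that directly or indirectly provided information to job_id."""
    seen = set()
    stack = list(jobs[job_id].previous_jobs)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(jobs[current].previous_jobs)
    return seen


def dead_ends(jobs):
    """Jobs whose output was never passed on to another job."""
    return sorted(job_id for job_id, job in jobs.items() if not job.next_jobs)


# Illustrative web: jobs 1 and 2 both feed job 3, which feeds job 4; job 5 is isolated.
jobs = {n: Job(taskname=f"task_{n}") for n in range(1, 6)}
link(jobs, 1, 3)
link(jobs, 2, 3)
link(jobs, 3, 4)

print(sorted(ancestors(jobs, 4)))  # [1, 2, 3]
print(dead_ends(jobs))             # [4, 5]
```

Storing both previous_jobs and next_jobs is redundant, but it lets the web be walked in either direction without a search over all jobs.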
process::crystal::size
process::crystal::colour
process::crystal::data_collection_temperature
process::crystal::oscillation_start
process::crystal::oscillation_end
process::crystal::oscillation_range
process::crystal::oscillation_direction
process::crystal::goniostat_orientation
process::radiation::wavelength
process::radiation::divergence_vertical
process::radiation::divergence_horizontal
process::radiation::polarization
process::radiation::synchrotron_or_lab
process::radiation::lab_site
process::detector::position::crystal_detector_distance
process::detector::position::swing_angle
process::detector::position::beam_position_x
process::detector::position::beam_position_y
process::detector::measurement::num_pixels_x
process::detector::measurement::num_pixels_y
process::detector::measurement::size_pixels_x
process::detector::measurement::size_pixels_y
process::detector::measurement::orientation
process::detector::measurement::overload_value
process::detector::measurement::gain
process::detector::mis_setting::twist
process::detector::mis_setting::tilt
process::detector::mis_setting::bulge
process::detector::mis_setting::radial_offset
process::detector::mis_setting::tangential_offset
See example below.
data::mtzfile::filename
data::mtzfile::title
data::mtzfile::num_reflections
data::mtzfile::missing_number_flag
data::mtzfile::spacegroup_num      # symops obtained from ERF i.e. symop.lib
data::mtzfile::sort_order          # in particular, whether we need sortmtz
data::crystal_i::project_name
data::crystal_i::crystal_name
data::crystal_i::cell
data::crystal_i::dataset_i::dataset_name
data::crystal_i::dataset_i::wavelength
data::crystal_i::dataset_i::column_i::label
data::crystal_i::dataset_i::column_i::type
data::crystal_i::dataset_i::labels::fobs             # These are file labels which
data::crystal_i::dataset_i::labels::sigfobs          # have been identified, and can
data::crystal_i::dataset_i::labels::fobs_plus        # be passed to future jobs.
data::crystal_i::dataset_i::labels::sigfobs_plus     # I.e. these are known columns
data::crystal_i::dataset_i::labels::fobs_minus       # rather than arbitrary columns
data::crystal_i::dataset_i::labels::sigfobs_minus    # from a file.
data::crystal_i::dataset_i::labels::phase
data::crystal_i::dataset_i::labels::fom
data::crystal_i::dataset_i::num_centric
data::crystal_i::dataset_i::num_acentric
data::crystal_i::dataset_i::f_prime
data::crystal_i::dataset_i::f_double_prime
data::crystal_i::dataset_i::wilson_B_iso             # from Wilson plot
data::crystal_i::dataset_i::pseudo_trans_u           # from Patterson
data::crystal_i::dataset_i::pseudo_trans_v
data::crystal_i::dataset_i::pseudo_trans_w
data::crystal_i::dataset_i::twinning::operator
data::crystal_i::dataset_i::twinning::fraction
data::crystal_i::dataset_i::twinning::acentric_moment_2
data::crystal_i::dataset_i::twinning::cumul_I_flag   # Just a Boolean flag
data::crystal_i::dataset_i::anisotropy_ratio
data::crystal_i::dataset_i::res_for_anisotropy_ratio
mir::rmsdiff_iso::dataset1
mir::rmsdiff_iso::dataset2
mir::rmsdiff_iso::num_reflections
mir::rmsdiff_iso::value
mir::rmsdiff_iso::Rfactor
mir::rmsdiff_iso::norm_prob::centric
mir::rmsdiff_iso::norm_prob::acentric
mir::rmsdiff_ano::dataset
mir::rmsdiff_ano::value
mr::model_i::filename                  # coordinate file
mr::model_i::name
mr::model_i::details                   # e.g. of side-chain trimming
mr::model_i::solution_i::goodness      # possibly several versions of this!
mr::model_i::solution_i::space_group   # can be unresolved at time of mr
mr::model_i::solution_i::euler_alpha
mr::model_i::solution_i::euler_beta
mr::model_i::solution_i::euler_gamma
mr::model_i::solution_i::fract_x
mr::model_i::solution_i::fract_y
mr::model_i::solution_i::fract_z
model::procheck_resolution
model::matthews_coefficient
model::fraction_solvent_content
model::entity::number_residues
model::entity::molecular_weight
model::asymmetric::number_molecules
model::ncs_i::euler_alpha
model::ncs_i::euler_beta
model::ncs_i::euler_gamma
model::ncs_i::polar_omega
model::ncs_i::polar_phi
model::ncs_i::polar_kappa
model::ncs_i::translate_x
model::ncs_i::translate_y
model::ncs_i::translate_z
model::ncs_i::mask
model::substructure_i::atom_i::fract_x
model::substructure_i::atom_i::fract_y
model::substructure_i::atom_i::fract_z
model::substructure_i::atom_i::occ
model::substructure_i::atom_i::B_iso
model::substructure_i::atom_i::B_aniso
model::substructure_i::atom_i::aocc
model::substructure_i::atom_i::aB_iso
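As a first step towards a dictionary, the naming convention itself can be read mechanically: components are separated by `::`, and a trailing `_i` marks a level that may occur several times. The following is a minimal sketch of that reading; the function names `split_name` and `build_tree` are hypothetical and nothing here fixes an eventual implementation.

```python
def split_name(name):
    """Split 'a::b_i::c' into a list of (component, repeatable) pairs."""
    parts = []
    for component in name.split("::"):
        repeatable = component.endswith("_i")
        # Drop the '_i' marker but remember that the level may repeat.
        parts.append((component[:-2] if repeatable else component, repeatable))
    return parts


def build_tree(names):
    """Nest the names into dictionaries, one level per '::' component."""
    tree = {}
    for name in names:
        node = tree
        for component in name.split("::"):
            node = node.setdefault(component, {})
    return tree


print(split_name("mr::model_i::solution_i::euler_alpha"))
# [('mr', False), ('model', True), ('solution', True), ('euler_alpha', False)]

print(build_tree(["mr::model_i::name", "mr::model_i::filename"]))
# {'mr': {'model_i': {'name': {}, 'filename': {}}}}
```

The empty dictionaries at the leaves are where explicit definitions (type, units, description) would eventually be attached.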
* File information :

data::mtzfile::title GerE native and MAD.
data::mtzfile::spacegroup_num 5
data::mtzfile::num_reflections 25739
data::mtzfile::missing_number_flag NaN
data::mtzfile::sort_order (not implemented)

* Crystals, datasets :

data::crystal::crystal_name native
data::crystal::project_name Gere
data::crystal::cell 108.8420 61.7790 71.7520 90.0000 97.2510 90.0000
data::crystal::dataset::dataset_name native
data::crystal::dataset::wavelength 1.40000
data::crystal_i::dataset_i::column_i::label data::crystal_i::dataset_i::column_i::type
H H
K H
L H
FP F
SIGFP Q
FreeR_flag I

data::crystal::crystal_name Se_met_deriv
data::crystal::project_name Gere
data::crystal::cell 108.7420 61.6790 71.6520 90.0000 97.1510 90.0000
data::crystal::dataset::dataset_name SEinfl
data::crystal::dataset::wavelength 0.98100
data::crystal_i::dataset_i::column_i::label data::crystal_i::dataset_i::column_i::type
FSEinfl F
SIGFSEinfl Q
DSEinfl D
SIGDSEinfl Q
F(+)SEinfl F
SIGF(+)SEinfl Q
F(-)SEinfl F
SIGF(-)SEinfl Q
data::crystal::dataset::dataset_name SEpeak
data::crystal::dataset::wavelength 0.98000
data::crystal_i::dataset_i::column_i::label data::crystal_i::dataset_i::column_i::type
FSEpeak F
SIGFSEpeak Q
DSEpeak D
SIGDSEpeak Q
F(+)SEpeak F
SIGF(+)SEpeak Q
F(-)SEpeak F
SIGF(-)SEpeak Q

We would then change the tcl procedure ExtractMTZData to search for particular tags rather than assuming a particular layout.
Some information is still dependent on position; for instance, datasets belong to the last crystal specified. It depends on how complex you want these tags to get.
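To make the idea of searching for tags concrete, here is a rough sketch of a tag-driven reader for output like the example above. It is written in Python purely for illustration and is not the tcl procedure ExtractMTZData; the function `parse_tagged` and the abbreviated `SAMPLE` text are hypothetical. It assumes that a crystal_name tag opens a new crystal, that a dataset_name tag opens a new dataset within the last crystal, and that the column header line is followed by "label type" rows until the next tag.

```python
SAMPLE = """\
* File information :
data::mtzfile::title GerE native and MAD.
data::mtzfile::spacegroup_num 5
* Crystals, datasets :
data::crystal::crystal_name native
data::crystal::dataset::dataset_name native
data::crystal::dataset::wavelength 1.40000
data::crystal_i::dataset_i::column_i::label data::crystal_i::dataset_i::column_i::type
FP F
SIGFP Q
data::crystal::crystal_name Se_met_deriv
data::crystal::dataset::dataset_name SEinfl
data::crystal::dataset::wavelength 0.98100
data::crystal::dataset::dataset_name SEpeak
data::crystal::dataset::wavelength 0.98000
"""


def parse_tagged(text):
    """Return (mtzfile, crystals) built from tagged lines, not fixed positions."""
    mtzfile = {}
    crystals = []
    in_columns = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):
            continue  # skip blank lines and section headings
        if line.startswith("data::"):
            tag, _, value = line.partition(" ")
            value = value.strip()
            # The column header switches the reader into "label type" row mode.
            in_columns = tag.endswith("column_i::label")
            if in_columns:
                continue
            key = tag.rsplit("::", 1)[1]
            if tag.startswith("data::mtzfile::"):
                mtzfile[key] = value
            elif "::dataset::" in tag:
                if key == "dataset_name":
                    # Position-dependent step: attach to the last crystal seen.
                    crystals[-1]["datasets"].append({"columns": []})
                crystals[-1]["datasets"][-1][key] = value
            else:
                if key == "crystal_name":
                    crystals.append({"datasets": []})
                crystals[-1][key] = value
        elif in_columns:
            label, column_type = line.split()
            crystals[-1]["datasets"][-1]["columns"].append((label, column_type))
    return mtzfile, crystals


mtzfile, crystals = parse_tagged(SAMPLE)
print(mtzfile["spacegroup_num"])
print([d["dataset_name"] for d in crystals[1]["datasets"]])
```

The only positional knowledge left is the "last crystal" and "last dataset" rule; removing it would need tags that name their parent explicitly, at the cost of more complex tags.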