
cif_filter() cif_filter()

NAME

cif_filter - Parse a CIF file and print out essential data values in CIF format, following the COD CIF style. The script also has several other capabilities: it can restore space group symbols from symmetry operators (consulting predefined tables), parse and tidy up _chemical_formula_sum, compute the cell volume, exclude unknown or "empty" tags, and add specified bibliography data.

SYNOPSIS

cif_filter --options input1.cif input*.cif

DESCRIPTION

Parse a CIF file and print out essential data values in CIF format, following the COD CIF style.

The script also has several other capabilities: it can restore space group symbols from symmetry operators (consulting predefined tables), parse and tidy up _chemical_formula_sum, compute the cell volume, exclude unknown or "empty" tags, and add specified bibliography data.

OPTIONS

-a, --authors 'John Doe; Jane Doe; Joe Bloggs'

-j, --journal 'Acta Cryst. A'

-v, --volume 36

-i, --issue 1

-p, --page 123

--start-page 123

-e, --end-page 132

-y, --year 1999

-D, --doi 10.1010/xyz9999 Specify bibliographic data to be included in the output.

-B, --bibliography bibliography.cif

--bibliography bibliography.mrk Provide a bibliography file with the bibliographic information to be included in the output. The bibliographic information can be provided in CIF format or in an XML-like .mrk file with data items enclosed in <authors>, <journal>, <volume>, <issue>, <year> and <pages> tags (for example, <pages>123-132</pages>).
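
As an illustration of the .mrk layout described above, the following Python sketch pulls the contents of each listed tag out of a string. It is not the parser used by cif_filter: the function name parse_mrk, the MRK_FIELDS tuple and the assumption that every item is written as a plain <tag>...</tag> pair are all hypothetical.

```python
import re

# Field names taken from the tags listed in the option description.
MRK_FIELDS = ("authors", "journal", "volume", "issue", "year", "pages")


def parse_mrk(text):
    """Collect the contents of each known <tag>...</tag> pair (assumed layout)."""
    fields = {}
    for name in MRK_FIELDS:
        # Non-greedy match so that only the first pair of each tag is taken.
        match = re.search(r"<{0}>(.*?)</{0}>".format(name), text, re.DOTALL)
        if match:
            fields[name] = match.group(1).strip()
    return fields


sample = "<journal>Acta Cryst. A</journal>\n<volume>36</volume>\n<pages>123-132</pages>"
print(parse_mrk(sample))
```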

--leave-bibliography Combine bibliographies from various sources. Values of lower precedence are overwritten, but never deleted. List of bibliography sources in order of increasing precedence: 1) the bibliography tags of the original CIF file; 2) the optional bibliography file; 3) the user specified command line options.

--discard-bibliography Only retain the bibliography tags from the bibliography source of highest precedence (see '--leave-bibliography' option) (default).
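
The difference between the two modes can be sketched in Python as follows. This is only a model of the precedence rules stated above, not the cif_filter implementation: the function name, the dictionary keys and the reading of "source of highest precedence" as "the highest-precedence source that supplied any values" are assumptions.

```python
def combine_bibliography(cif_tags, bib_file_tags, cli_tags, leave=False):
    """Combine three bibliography sources given in increasing precedence."""
    sources = [cif_tags, bib_file_tags, cli_tags]
    if leave:
        # '--leave-bibliography': later sources overwrite earlier values,
        # but values found only in a lower-precedence source survive.
        merged = {}
        for source in sources:
            merged.update(source)
        return merged
    # '--discard-bibliography': keep only the highest-precedence source
    # that supplied anything (assumed interpretation).
    for source in reversed(sources):
        if source:
            return dict(source)
    return {}


original = {"journal": "Acta Cryst. A", "year": "1998"}  # hypothetical keys
options = {"year": "1999"}
print(combine_bibliography(original, {}, options, leave=True))
print(combine_bibliography(original, {}, options, leave=False))
```

In the first call the journal name from the original file is kept and only the year is overwritten; in the second call only the command line value remains.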

--leave-title Do not delete the publication title obtained from the source of lower precedence, even if the '--discard-bibliography' option is enabled.

-g, --global-priority Assume bibliography found in 'data_global' data block has precedence over bibliographies found in all the other data blocks.

-g-, --no-global-priority Assume bibliography found in 'data_global' data block does not have precedence over bibliographies found in all the other data blocks. Only missing bibliographic information will be copied from the 'data_global' data block (default).

--exclude-publication-details Exclude potentially copyrighted and irrelevant tags.

--dont-exclude-publication-details,

--no-exclude-publication-details Do not exclude potentially copyrighted and irrelevant tags.

-h, --add-cif-header header_file.cif Prepend the comments from the beginning of the specified file to each of the output files.

-s, --estimate-spacegroup Estimate space group symbols from the symmetry operators in the input.

-s-, --dont-estimate-spacegroup, --no-estimate-spacegroup Do not estimate space group symbols from the symmetry operators in the input (default).

--keep-unrecognised-spacegroups This option is a modifier for the '--estimate-spacegroup' option. Leave tags with unrecognised space group information as they are.

--dont-keep-unrecognised-spacegroups,

--no-keep-unrecognised-spacegroups This option is a modifier for the '--estimate-spacegroup' option. Replace the values of tags with unrecognised space group information with unknown values (represented as '?') and store the unrecognised space group information in '_cod_original_sg_*' tags (default).

--reformat-space-group, --reformat-spacegroup Correct the formatting of the Hermann-Mauguin space group symbol.

--dont-reformat-space-group, --leave-space-group

--dont-reformat-spacegroup, --leave-spacegroup Leave the Hermann-Mauguin symmetry space group symbol as is (default).

--exclude-empty-tags Remove data items that contain only empty values. A value is considered empty if it is equal to a single question mark ('?') or a single period ('.').

--dont-exclude-empty-tags, --no-exclude-empty-tags Disable the '--exclude-empty-tags' option (default).

--placeholder-tag-list '_chemical_name_common,_chemical_name_systematic' A comma-separated list of data items that should be checked for placeholder values (default: '_chemical_name_common,_chemical_name_systematic').

--exclude-placeholder-tags Remove data items that contain common placeholder values (i.e. multiline values consisting only of whitespace and question marks). Only data items specified using the '--placeholder-tag-list' option are affected by this option.

--dont-exclude-placeholder-tags, --no-exclude-placeholder-tags Disable the '--exclude-placeholder-tags' option (default).

--exclude-redundant-chemical-names Remove data items related to various chemical names (systematic, common, mineral) if the stored values match the chemical formula.

--dont-exclude-redundant-chemical-names,

--no-exclude-redundant-chemical-names Disable the '--exclude-redundant-chemical-names' option (default).

--exclude-empty-non-loop-tags Remove tags that contain empty values and are not contained within a CIF loop_ structure. For the definition of empty values, see the '--exclude-empty-tags' option. This option does not override the '--exclude-empty-tags' option.

--dont-exclude-non-loop-empty-tags,

--no-exclude-non-loop-empty-tags Disable the '--exclude-empty-non-loop-tags' option (default).

--exclude-unknown-tags Remove tags that contain only unknown values. A value is considered unknown if it is equal to a single question mark ('?').

--dont-exclude-unknown-tags, --no-exclude-unknown-tags Disable the '--exclude-unknown-tags' option (default).
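
The two definitions above ("empty" covers both '?' and '.', "unknown" covers only '?') can be sketched in Python as follows. The data layout (a dictionary mapping each tag to a list of its values) and the function names are assumptions made for illustration; they do not describe the internal representation used by cif_filter.

```python
def is_empty(value):
    """Empty value: a single question mark or a single period."""
    return value in ("?", ".")


def is_unknown(value):
    """Unknown value: a single question mark."""
    return value == "?"


def drop_tags(data, predicate):
    """Drop tags whose values ALL satisfy the predicate."""
    return {
        tag: values
        for tag, values in data.items()
        if not all(predicate(v) for v in values)
    }


# Hypothetical parsed data block: tag -> list of values.
block = {
    "_cell_length_a": ["10.0"],
    "_journal_year": ["?"],
    "_atom_site_calc_flag": [".", "."],
    "_atom_site_label": ["C1", "?"],
}
print(sorted(drop_tags(block, is_empty)))
print(sorted(drop_tags(block, is_unknown)))
```

A tag with a mix of real and empty values, such as '_atom_site_label' here, is kept in both cases because not all of its values are empty.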

--exclude-unknown-non-loop-tags Remove tags that contain unknown values and are not contained within a CIF loop_ structure. For the definition of unknown values, see the '--exclude-unknown-tags' option. This option does not override the '--exclude-unknown-tags' option.

--dont-exclude-non-loop-unknown-tags,

--no-exclude-non-loop-unknown-tags Disable the '--exclude-unknown-non-loop-tags' option (default).

-x, --extra-tag-list tag-list.lst Add additional tags to the list of recognised CIF tags. These extra tags are presented in a separate file, one tag per line.

--exclude-misspelled-tags Remove tags that are not present in the list of recognised tags.

--dont-exclude-misspelled-tags,

--no-exclude-misspelled-tags Disable the '--exclude-misspelled-tags' option (default).

--parse-formula-sum Parse '_chemical_formula_sum' tag value and reformat it according to the Hill system ordering. If the original and reformatted formulae differ, replace the original value with the reformatted one and store the original value as the '_cod_original_formula_sum' tag value.

--dont-parse-formula-sum, --no-parse-formula-sum Do not parse '_chemical_formula_sum' tag value.
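
For reference, Hill system ordering places carbon first and hydrogen second when carbon is present, followed by the remaining elements alphabetically; without carbon, all elements (including hydrogen) are sorted alphabetically. The Python sketch below applies that rule to a space-separated formula. It is a simplified illustration, not the cif_filter routine: the function name is hypothetical, counts are kept as text, and repeated elements are not merged.

```python
import re


def hill_order(formula_sum):
    """Reorder a formula such as 'H12 C6 O6' according to the Hill system."""
    # Split into (element, count) pairs; a missing count stays an empty string.
    pairs = re.findall(r"([A-Z][a-z]?)(\d*\.?\d*)", formula_sum)
    has_carbon = any(element == "C" for element, _ in pairs)

    def sort_key(pair):
        element = pair[0]
        if has_carbon and element == "C":
            return (0, "")
        if has_carbon and element == "H":
            return (1, "")
        return (2, element)

    return " ".join(element + count for element, count in sorted(pairs, key=sort_key))


print(hill_order("H12 C6 O6"))
print(hill_order("O4 S H2"))
```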

--fix-syntax-errors Try to fix syntax errors in the input CIF files that can be corrected unambiguously.

--dont-fix-syntax-errors, --no-fix-syntax-errors Do not try to fix syntax errors in input CIF files (default).

--retain-tag-order Print tags in the same order they appeared in the original file.

--dont-retain-tag-order Disregard original tag order while printing the tags (default).

--preserve-loop-order Print loops in the same order they appeared in the original file.

--use-internal-loop-order Disregard original loop order while printing the tags (default).

--calculate-cell-volume Calculate the unit cell volume from the cell constants. If the calculated value differs from the one already present in the CIF, replace the original value with the calculated one and store the original value as the '_cod_original_cell_volume' tag value.

--dont-calculate-cell-volume Do not calculate unit cell volume from the cell constants (default).
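
The volume of a general (triclinic) unit cell follows from the cell lengths a, b, c and angles alpha, beta, gamma through the standard formula V = a*b*c*sqrt(1 - cos^2(alpha) - cos^2(beta) - cos^2(gamma) + 2*cos(alpha)*cos(beta)*cos(gamma)). The Python sketch below evaluates that formula; the function name is hypothetical, and the rounding and comparison rules that cif_filter applies when deciding whether the values differ are not described here and are not modelled.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume from lengths (in angstroms) and angles (in degrees)."""
    cos_a, cos_b, cos_g = (
        math.cos(math.radians(angle)) for angle in (alpha, beta, gamma)
    )
    return a * b * c * math.sqrt(
        1.0 - cos_a**2 - cos_b**2 - cos_g**2 + 2.0 * cos_a * cos_b * cos_g
    )


# A cubic cell and a hexagonal cell as quick checks.
print(round(cell_volume(10.0, 10.0, 10.0, 90.0, 90.0, 90.0), 3))
print(round(cell_volume(3.0, 3.0, 5.0, 90.0, 90.0, 120.0), 3))
```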

--original-filename data_source.cif Use the provided string as the name of the original file (see '--record-original-filename').

--clear-original-filename Do not use any previously provided strings as the name of the original file.

--record-original-filename Record the original filename and the original data block name for each data block as the '_cod_data_source_*' tag values.

--dont-record-original-filename Do not record the original filename and the original data block name (default).

-S, --start-data-block-number 1234567 Use the provided number as the start number when renaming data blocks (default '7000001'). Setting this option enables the '-R' option.

-d, --datablock-format '%07d' Use the provided format to determine new data block names from the provided data block numbers (default '%07d').

-R, --renumber-data-blocks Rename all data blocks. The new names are constructed by taking a start number (specified by the '-S' option), applying the string format (specified by the '-d' option) and then incrementing the start number for each sequential data block.

-R-, --dont-renumber-data-blocks Do not rename data blocks (default). Enabling this option sets the '-S' option to default value.
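
The renaming scheme described for the '-S', '-d' and '-R' options can be sketched in Python as follows. The function name and the input block names are hypothetical; only the default start number '7000001' and the default format '%07d' come from the option descriptions.

```python
def renumber(block_names, start=7000001, fmt="%07d"):
    """Pair each original data block name with its new, numbered name."""
    return [(old, fmt % (start + offset)) for offset, old in enumerate(block_names)]


# Three data blocks renamed with the documented defaults.
for old, new in renumber(["global", "compound_1", "compound_2"]):
    print(old, "->", new)

# A smaller start number shows the zero padding applied by '%07d'.
print(renumber(["sample"], start=42))
```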

--original-filename-tag _cod_data_source_file

--original-data-block-tag _cod_data_source_block Use the provided tags to record original filename/data block name (default '_cod_data_source_file' and '_cod_data_source_block').

--database-code-tag _cod_database_code Use the provided tag while adding or updating the database code upon renaming the data blocks (default '_cod_database_code').

--update-database-code Update the database code tag value upon renaming the data blocks (default).

--dont-update-database-code Do not update the database code tag value upon renaming the data blocks.

--use-datablocks-without-coordinates,

--use-all-datablocks Do not remove data blocks without coordinates.

--do-not-use-datablocks-without-coordinates,

--dont-use-datablocks-without-coordinates,

--no-use-datablocks-without-coordinates,

--skip-datablocks-without-coordinates Remove data blocks without coordinates (default).

--use-datablocks-with-structure-factors Do not remove data blocks with structure factors (Fobs).

--dont-use-datablocks-with-structure-factors,

--no-use-datablocks-with-structure-factors,

--skip-datablocks-with-structure-factors Filter out data blocks with structure factors (Fobs) (default).

--folding-width 78 Specify the length of the longest unfolded line (default 76).

--fold-title Fold the title if it is longer than the folding width.

--dont-fold-title Do not fold the title (default).

--fold-long-fields Fold fields that are longer than the folding width.

--dont-fold-long-fields Do not fold fields (default).

--use-perl-parser Use Perl parser for CIF parsing.

--use-c-parser Use Perl & C parser for CIF parsing (default).

--cif-input Use CIF format for input (default).

--json-input Use JSON format for input.

--cif-output Use CIF format for output (default).

--json-output Use JSON format for output.

--cif Use CIF format for both input and output (default).

--json Use JSON format for both input and output.

--help, --usage Output a short usage message (this message) and exit.

--version Output version information and exit.

REPORTING BUGS

Report cif_filter bugs using e-mail: cod-bugs@ibt.lt