NAME
fitsmd5 - Compute/update the DATAMD5 keyword/value
SYNOPSIS
fitsmd5 [-u] [-s] [-a] <FITS files...>
DESCRIPTION
fitsmd5 computes the MD5 signature of all data sections in a FITS file and
prints the result on stdout. It can optionally update the main FITS header by
modifying the value of the DATAMD5 key.
This command is useful to give a unique ID to a FITS file. The algorithm simply
browses through all data sections in the input file and passes the data blocks
to an MD5 hash function. The final result is a 128-bit signature that can be
used to uniquely identify the file.
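The scan described above can be sketched in Python. This is a minimal
illustration, not the actual fitsmd5 source. It assumes that headers are
recognised block by block (a header ends in the block containing the END card,
and a block starting with XTENSION= opens an extension header) and that whole
2880-byte data blocks, padding included, are hashed. The names data_md5 and
_has_end_card are hypothetical.

```python
import hashlib
import io

BLOCK_SIZE = 2880  # FITS files are organised in 2880-byte blocks
CARD_SIZE = 80     # each header card occupies 80 bytes


def _has_end_card(block: bytes) -> bool:
    """Return True if one of the 80-byte cards in this block is the END card."""
    return any(block[i:i + 4] == b"END " for i in range(0, BLOCK_SIZE, CARD_SIZE))


def data_md5(stream: io.BufferedIOBase) -> str:
    """Hash every block that is not part of a header (hypothetical helper)."""
    md5 = hashlib.md5()
    in_header = True  # a FITS file always starts with the primary header
    while True:
        block = stream.read(BLOCK_SIZE)
        if len(block) < BLOCK_SIZE:
            break  # end of file; a trailing partial block is ignored
        # Assumption: a block starting with XTENSION= opens an extension header.
        if not in_header and block.startswith(b"XTENSION="):
            in_header = True
        if in_header:
            if _has_end_card(block):
                in_header = False  # data blocks follow the END card
        else:
            md5.update(block)  # data block: feed it to the hash
    return md5.hexdigest()
```

With a file on disk this would be called as data_md5(open(path, "rb")). The
XTENSION= test is a heuristic: a data block that happens to start with those
bytes would be mistaken for a header.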
This approach is meant to provide a tool for tagging FITS files with unique
IDs. It is not meant to be used as a checksum for file integrity (the CKSUM key
is the solution for that), although it could be used in that spirit. The main point
is that only data sections are taken into account, leaving the possibility of
changing the headers without affecting the data signature.
MD5 produces a 128-bit digest, so the probability of two different FITS files
accidentally getting the same ID is almost zero (MD5 is no longer considered
secure against deliberately constructed collisions, though). It should be good
enough to assign a unique ID to several tens of thousands of frames. Since
there is still a tiny but non-zero possibility that two different files will
get an identical key, this approach is not recommended to tag very large
numbers of files (typically: millions of them). If you do have a large
database of FITS files, using a timestamp is usually a better approach.
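To put a number on the "tiny but non-zero" probability, the usual birthday
approximation can be evaluated directly. The sketch below assumes that MD5
digests of distinct files behave like independent, uniformly random 128-bit
values, which is a reasonable model for accidental collisions but not for
deliberately crafted ones. The function name collision_probability is
hypothetical.

```python
import math


def collision_probability(n_files: int, bits: int = 128) -> float:
    """Approximate probability that at least two of n_files share a digest.

    Uses the birthday approximation p = 1 - exp(-n(n-1) / (2 * 2**bits)).
    """
    pairs = n_files * (n_files - 1) / 2  # number of distinct file pairs
    # expm1 keeps precision when the exponent is extremely small
    return -math.expm1(-pairs / 2.0 ** bits)


for n in (50_000, 1_000_000, 1_000_000_000):
    print(f"{n:>13} files: p = {collision_probability(n):.3e}")
```

Under this model, one million files give a probability on the order of 1e-27.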
The MD5 signature is a good solution to tag a list of FITS files which might
have originated from various sources on which the database maintainer has no
control. Typically, calibration databases holding calibration frames for a
given instrument receive data from different actors who might not follow a
common file naming convention. This command makes sure it is always
possible to assign a unique ID to each frame.
Note that if the input FITS file has no data section, the returned MD5 key is
not zero: it is exactly d41d8cd98f00b204e9800998ecf8427e, the MD5 of empty
input. This signature also has the useful property that two files with exactly
the same pixels (bit-wise comparison) get the same ID, which is handy e.g. for
regression tests.
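Both properties are easy to check with Python's standard hashlib module. The
snippet is only an illustration: same_pixels is a hypothetical helper, and the
byte strings stand in for the raw data sections of two files.

```python
import hashlib

# MD5 of zero bytes: the value reported for a file without data sections.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()


def same_pixels(data_a: bytes, data_b: bytes) -> bool:
    """Compare two data sections through their MD5 signatures."""
    return hashlib.md5(data_a).hexdigest() == hashlib.md5(data_b).hexdigest()
```

In a regression test, the signature of a freshly produced frame can be compared
with the stored signature of a reference frame instead of comparing pixel by
pixel.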
If you want to produce files containing the DATAMD5 key in their main headers,
you should use the qfits library, which always inserts this key. If you
are working with other FITS-processing software, you should allocate an empty
DATAMD5 placeholder and apply this command with the -u option to update the
value.
Note that this command can also compute the MD5 sum of a complete file, not
just its data sections (see the -a option). In this mode, the command behaves
like the GNU md5sum command, which is used to compute checksums on files. Input
files in that case need not be FITS, though they still need to be regular
files.
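For comparison, hashing every byte of a file, as md5sum does, can be sketched
as follows. This is not the fitsmd5 implementation. The names stream_md5 and
file_md5 and the chunk size are illustrative choices.

```python
import hashlib
import io

CHUNK_SIZE = 1 << 20  # read 1 MiB at a time to keep memory use bounded


def stream_md5(stream: io.BufferedIOBase) -> str:
    """Return the hex MD5 digest of everything readable from a binary stream."""
    md5 = hashlib.md5()
    while True:
        chunk = stream.read(CHUNK_SIZE)
        if not chunk:
            break  # end of stream
        md5.update(chunk)
    return md5.hexdigest()


def file_md5(path: str) -> str:
    """Return the hex MD5 digest of a regular file, headers included."""
    with open(path, "rb") as handle:
        return stream_md5(handle)
```

Unlike the data-only signature, this digest changes whenever any header card
changes.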
OPTIONS
- -u
- Try to update the DATAMD5 keyword in the main header if present.
- -s
- Silent mode: run without printing any message.
- -a
- Compute the MD5 sum on all bits in the file. In this mode, the command
behaves like the GNU md5sum command, to be used e.g. as a checksum. This
option excludes all others.
FILES
Input files to fitsmd5 shall comply with the FITS format, except when used
with the -a option.