NAME
dirfile — a filesystem based database format for time-ordered binary data
DESCRIPTION
The
dirfile database format is designed to provide a fast, simple format
for storing and reading binary time-ordered data. Dirfiles can be read using
the GetData library.
The dirfile database is centred around one or more time-ordered data streams
(time streams). Each time stream is written to disk in a separate file,
in its native binary format. The name of each binary file corresponds to the
time stream's
field name. Two time streams may have different constant
sampling frequencies and mechanisms exist within the dirfile format to ensure
these time streams remain properly sequenced in time.
To do this, the time streams in the dirfile are subdivided into
frames.
Each frame contains an integer number of samples of each time stream. When
synchronous retrieval of data from more than one time stream is required,
position in the dirfile can be specified in frames, which will ensure
synchronicity.
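The relationship between frames and samples can be sketched as below. This is only an illustration of the arithmetic, not part of GetData: the function name sample_range and the two sampling rates (20 and 1 samples per frame) are assumptions made for the example.

```python
def sample_range(first_frame, num_frames, spf):
    """Return (first_sample, sample_count) for a run of whole frames.

    spf is the number of samples of this field stored in each frame.
    """
    if spf <= 0:
        raise ValueError("samples per frame must be positive")
    return first_frame * spf, num_frames * spf


# Two hypothetical fields sampled at different constant rates.
fast = sample_range(first_frame=10, num_frames=2, spf=20)
slow = sample_range(first_frame=10, num_frames=2, spf=1)

print(fast)  # (200, 40)
print(slow)  # (10, 2)
```

Both requests start at frame 10, so they cover the same span of time even though the sample offsets and counts differ.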
The binary files are all located in a central directory, known as the
dirfile
directory. The dirfile as a whole may be referred to by its dirfile
directory path.
Included in the dirfile along with the time streams is the
dirfile format
specification, which is an ASCII text file called
format located in
the dirfile directory. This file fully specifies the dirfile's metadata. For
the syntax of this file, see
dirfile-format(5).
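As a rough illustration of the layout described above, the following sketch creates a dirfile directory holding one raw field and a format file. The field name temperature, the directory name toy.dirfile and the helper write_toy_dirfile are hypothetical, and the two lines written to the format file reflect an assumed reading of the syntax; dirfile-format(5) remains the authority.

```python
import os
import struct
import tempfile


def write_toy_dirfile(path, samples):
    """Create a dirfile directory with one raw field and a format file."""
    os.makedirs(path, exist_ok=True)

    # Raw field: a single file named after the field, holding the samples
    # as consecutive little-endian 64-bit floats.
    with open(os.path.join(path, "temperature"), "wb") as f:
        f.write(struct.pack("<%dd" % len(samples), *samples))

    # Format specification: an ASCII text file called "format".
    # Assumed syntax: "<field> RAW <type> <samples per frame>".
    with open(os.path.join(path, "format"), "w", encoding="ascii") as f:
        f.write("/ENDIAN little\n")
        f.write("temperature RAW FLOAT64 1\n")


dirfile_path = os.path.join(tempfile.mkdtemp(), "toy.dirfile")
write_toy_dirfile(dirfile_path, [1.0, 2.5, 4.0])
print(sorted(os.listdir(dirfile_path)))  # ['format', 'temperature']
```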
Version 3 of the Dirfile Standards introduced the
large dirfile
extension. This extension added the ability to distribute the dirfile metadata
among multiple files (called
fragments) in addition to the
format file, as well as the ability to house portions of the database
in
subdirfiles. These subdirfiles may be fully fledged dirfiles in
their own right, but may also be contained within a larger, parent dirfile.
See
dirfile-format(5) for information on specifying these subdirfiles.
In addition to the raw fields on disk, the dirfile format specification may also
specify
derived fields which are calculated from one or more raw or
derived time streams. Derived fields behave identically to raw fields when
read via GetData. See
dirfile-format(5) for a complete list of derived
field types. Dirfiles may also contain both numerical and character string
constant
scalar fields, also further outlined in
dirfile-format(5).
Dirfiles are designed to be written to and read simultaneously. The dirfile
specification dictates that one particular raw field (specified either
explicitly or implicitly by the format specification) is to be used as the
reference field: all other vector fields are assumed to have at least
as many frames as the reference field has, and the size (in frames) of the
reference field is used as the size of the dirfile as a whole.
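For an unencoded raw reference field, the size in frames can be estimated from the size of its file on disk. The sketch below makes two assumptions that are not stated in this page: only complete frames are counted, and the helper name frames_in_raw_file is invented for the example. A reader such as GetData may handle a partially written frame differently.

```python
import os
import tempfile


def frames_in_raw_file(path, bytes_per_sample, spf):
    """Count the complete frames currently stored in an unencoded raw file."""
    frame_bytes = bytes_per_sample * spf
    return os.path.getsize(path) // frame_bytes


with tempfile.TemporaryDirectory() as d:
    ref = os.path.join(d, "reference_field")
    with open(ref, "wb") as f:
        f.write(bytes(2 * 50))  # 50 samples of 2 bytes each
    nframes = frames_in_raw_file(ref, bytes_per_sample=2, spf=20)

print(nframes)  # 2: the trailing 10 samples do not complete a frame
```

Because the count is taken from the file as it currently stands, a reader can call this repeatedly while a writer is still appending data.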
Version 6 of the Dirfile Standards added the ability to encode the binary files
on disk. Each
fragment may have its own encoding scheme. Notably this
can be used to compress these files. See
dirfile-encoding(5) for
information on encoding schemes.
Version 7 of the Dirfile Standards added support for complex valued data. Two
types of complex valued data are supported by the Dirfile Standards:
- A 64-bit complex number consisting of an IEEE-754 standard
32-bit single precision floating point real part and an IEEE-754 standard
32-bit single precision floating point imaginary part, and
- A 128-bit complex number consisting of an IEEE-754 standard
64-bit double precision floating point real part and an IEEE-754 standard
64-bit double precision floating point imaginary part.
No integer-type complex numbers are supported.
Unencoded complex numbers are stored on disk in "Fortran order", that
is with the IEEE-754 real part followed by the IEEE-754 imaginary part. The
specified endianness of the two components follows that of purely real
floating point numbers. Endianness does not affect the ordering of the real
and imaginary parts. This format also conforms to the C99 standard. The C++98
standard does not specify a storage format for native complex numbers, but
C++11 (drafted as C++0x) specifies the above format for compatibility with
C99 (see ISO/IEC JTC1/SC22/WG21/N1388).
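The storage rule for the 64-bit complex type can be sketched with Python's struct module. The helper names pack_complex64 and unpack_complex64 are invented for this example; the sketch shows only the byte layout and is not how GetData itself reads or writes data.

```python
import struct


def pack_complex64(z, little_endian=True):
    """Pack z as two 32-bit floats: real part first, then imaginary part."""
    order = "<" if little_endian else ">"
    return struct.pack(order + "ff", z.real, z.imag)


def unpack_complex64(raw, little_endian=True):
    """Inverse of pack_complex64."""
    order = "<" if little_endian else ">"
    re, im = struct.unpack(order + "ff", raw)
    return complex(re, im)


z = complex(1.5, -2.0)  # both parts are exactly representable in 32 bits
le = pack_complex64(z, little_endian=True)
be = pack_complex64(z, little_endian=False)

# Endianness reverses the bytes within each component, but the real part
# still occupies the first four bytes in both cases.
assert le[:4] == be[:4][::-1]
assert le[4:] == be[4:][::-1]

print(len(le))                    # 8
print(unpack_complex64(le) == z)  # True
```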
To aid in using complex valued data, dirfile field codes may contain a
representation suffix which specifies a norm to apply to the complex
valued data to convert it into purely real data. See
dirfile-format(5).
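The effect of a representation suffix can be sketched as follows. The suffix letters used here (r for the real part, i for the imaginary part, m for the modulus, a for the argument) are an assumption about the syntax defined in dirfile-format(5), which should be consulted for the authoritative list; the function apply_representation and the field name signal are hypothetical.

```python
import cmath

# Assumed suffix letters; see dirfile-format(5) for the definitive set.
REPRESENTATIONS = {
    "r": lambda z: z.real,  # real part
    "i": lambda z: z.imag,  # imaginary part
    "m": abs,               # modulus
    "a": cmath.phase,       # argument (phase angle in radians)
}


def apply_representation(field_code, z):
    """Split off a trailing '.<suffix>' and reduce z to a real value.

    Returns (field_name, value). Without a recognised suffix, the field
    code and the complex value are returned unchanged.
    """
    name, dot, suffix = field_code.rpartition(".")
    if dot and suffix in REPRESENTATIONS:
        return name, REPRESENTATIONS[suffix](z)
    return field_code, z


print(apply_representation("signal.m", complex(3, 4)))  # ('signal', 5.0)
print(apply_representation("signal", complex(3, 4)))    # ('signal', (3+4j))
```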
AUTHORS
The dirfile specification was developed by C. B. Netterfield
<netterfield@astro.utoronto.ca>.
The dirfile specification is now maintained by D. V. Wiebe
<getdata@ketiltrout.net>.
SEE ALSO
dirfile-encoding(5),
dirfile-format(5)