gd_cbopen(3) | GETDATA | gd_cbopen(3) |
NAME
gd_cbopen, gd_open - open or create a dirfile
SYNOPSIS
#include <getdata.h>
DIRFILE *gd_cbopen(const char *dirfilename, unsigned long flags, gd_parser_callback_t sehandler, void *extra);
DIRFILE *gd_open(const char *dirfilename, unsigned long flags);
DESCRIPTION
The gd_cbopen() function opens or creates the dirfile specified by dirfilename, returning a DIRFILE object associated with it. Opening a dirfile will cause the library to read and parse the dirfile's format specification (see dirfile-format(5)).
The flags argument may contain zero or more of the following flags, bitwise or'd together:
- GD_ARM_ENDIAN, GD_NOT_ARM_ENDIAN
- Specifies that double precision floating point raw data on
disk is, or is not, stored in the middle-endian format used by older ARM
processors.
- GD_BIG_ENDIAN, GD_LITTLE_ENDIAN
- Specifies the default byte sex of raw data stored on disk
to be either big-endian (most significant byte first) or little-endian
(least significant byte first). Omitting both flags indicates the default
should be the native endianness of the platform.
- GD_CREAT
- An empty dirfile will be created, if one does not already
exist. This will create both the dirfile directory and an empty format
specification file called format. The directory will have mode
S_IRWXU | S_IRWXG | S_IRWXO (0777), modified by the
caller's umask value (see umask(2)). The format file will
have mode S_IRUSR | S_IWUSR | S_IRGRP |
S_IWGRP | S_IROTH | S_IWOTH (0666), also modified by
the caller's umask.
- GD_EXCL
- Ensure that this call creates a dirfile: when specified along with GD_CREAT, the call will fail if the dirfile specified by dirfilename already exists. Behaviour of this flag is undefined if GD_CREAT is not specified. This flag suffers from all the limitations of the O_EXCL flag as indicated in open(2).
- GD_FORCE_ENCODING
- Specifies that /ENCODING directives (see dirfile-format(5)) found in the dirfile format specification should be ignored. The encoding scheme specified in flags will be used instead (see below).
- GD_FORCE_ENDIAN
- Specifies that /ENDIAN directives (see dirfile-format(5)) found in the dirfile format specification should be ignored. All raw data will be assumed to have the byte sex indicated through the presence or absence of the GD_ARM_ENDIAN, GD_BIG_ENDIAN, GD_LITTLE_ENDIAN, and GD_NOT_ARM_ENDIAN flags.
- GD_IGNORE_DUPS
- If the dirfile format metadata specifies more than one
field with the same name, all but one of them will be ignored by the
parser. Without this flag, parsing would fail with the GD_E_FORMAT
error, possibly resulting in invocation of the registered callback
function. Which of the duplicate fields is kept is not specified. As a
result, this flag is typically only useful in the case where identical
copies of a field specification line are present.
- GD_PEDANTIC
- Reject dirfiles which don't conform to the Dirfile Standards. See the Standards Compliance section below for full details.
- GD_PERMISSIVE
- Allow non-compliant format specification syntax, even when given along with a conflicting /VERSION directive. See the Standards Compliance section below for full details.
- GD_PRETTY_PRINT
- When dirfile metadata is flushed to disk (either explicitly via gd_metaflush(), gd_rewrite_fragment(), or gd_flush() or implicitly by closing the dirfile), an attempt will be made to create a nicer looking format specification (from a human-readable standpoint). What this explicitly means is not part of the API, and any particular behaviour should not be relied on. If the dirfile is opened read-only, this flag is ignored.
- GD_TRUNC
- If dirfilename specifies an already existing
dirfile, it will be truncated before opening. Since gd_cbopen()
decides whether dirfilename specifies an existing dirfile before
attempting to parse the dirfile, dirfilename is considered to
specify an existing dirfile if it refers to a directory containing a
regular file called format, regardless of the content or form of
that file.
- GD_VERBOSE
- Specifies that whenever an error is triggered by the
library when working on this dirfile, the corresponding error string,
which can be retrieved by calling gd_error_string(3), should be
written on standard error by the library. Without this flag, GetData
writes nothing to standard error. (GetData never writes to standard
output.)
- GD_AUTO_ENCODED
- Specifies that the encoding type is not known in advance, but should be detected by the GetData library. Detection is accomplished by searching for raw data files with extensions appropriate to the encoding scheme. This method will notably fail if the library is called via gd_putdata(3) to create a previously non-existent raw field unless a read is first successfully performed on the dirfile. Once the library has determined the encoding scheme for the first time, it remembers it for subsequent calls.
- GD_BZIP2_ENCODED
- Specifies that raw data files are compressed using the Burrows-Wheeler block sorting text compression algorithm and Huffman coding, as implemented in the bzip2 format.
- GD_GZIP_ENCODED
- Specifies that raw data files are compressed using Lempel-Ziv coding (LZ77) as implemented in the gzip format.
- GD_LZMA_ENCODED
- Specifies that raw data files are compressed using the Lempel-Ziv Markov Chain Algorithm (LZMA) as implemented in the xz container format.
- GD_SLIM_ENCODED
- Specifies that raw data files are compressed using the slimlib library.
- GD_TEXT_ENCODED
- Specifies that raw data files are encoded as text files containing one data sample per line.
- GD_UNENCODED
- Specifies that raw data files are not encoded, but written
verbatim to disk.
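Two of the filesystem rules in the flag list above can be sketched in plain POSIX C: how the umask modifies the modes requested under GD_CREAT, and the test GD_TRUNC is described as using to decide whether dirfilename already names a dirfile. This is only an illustration of the documented behaviour, not the library's implementation. The helper names effective_mode() and looks_like_dirfile() and the 4096-byte path buffer are assumptions made for this sketch.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Mode a newly created file or directory ends up with: the requested
 * mode with every bit that is set in the umask cleared. */
static mode_t effective_mode(mode_t requested, mode_t mask)
{
    return requested & ~mask;
}

/* Returns 1 if path is a directory containing a regular file called
 * "format", and 0 otherwise. The content of that file is not examined. */
static int looks_like_dirfile(const char *path)
{
    char format_path[4096]; /* arbitrary limit chosen for this sketch */
    struct stat st;

    if (stat(path, &st) != 0 || !S_ISDIR(st.st_mode))
        return 0;

    int n = snprintf(format_path, sizeof format_path, "%s/format", path);
    if (n < 0 || (size_t)n >= sizeof format_path)
        return 0; /* path too long for this sketch */

    return stat(format_path, &st) == 0 && S_ISREG(st.st_mode);
}
```

With the common umask of 022, effective_mode(0777, 022) is 0755 for the dirfile directory and effective_mode(0666, 022) is 0644 for the format file.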
Standards Compliance
The latest Dirfile Standards Version which this release of GetData understands is provided in the preprocessor macro GD_DIRFILE_STANDARDS_VERSION defined in getdata.h. GetData is able to open and parse any dirfile which conforms to this Standards Version, or to any earlier Version. The dirfile-format(5) manual page lists the changes between Standards Versions.
The Callback Function
The caller-supplied sehandler function is called whenever the format specification parser encounters a syntax error (i.e. whenever it would return the GD_E_FORMAT error). This callback may be used to correct the error, or to tell the parser how to recover from it.
int sehandler(gd_parser_data_t *pdata, void *extra);
typedef struct {
  const DIRFILE* dirfile;
  int suberror;
  int linenum;
  const char* filename;
  char* line;
  size_t buflen;
  ...
} gd_parser_data_t;
The pdata->dirfile member will be a pointer to a DIRFILE object suitable only for passing to gd_error_string(). Notably, the caller should not assume this pointer will be the same as the pointer eventually returned by gd_cbopen(), nor that it will be valid after the callback function returns.
The pdata->suberror member gives the reason for the syntax error. It will be one of the following symbols:
- GD_E_FORMAT_BAD_LINE
- The line was indecipherable. Typically this means that the line contained neither a reserved word, nor a field type.
- GD_E_FORMAT_BAD_NAME
- The specified field name was invalid.
- GD_E_FORMAT_BAD_SPF
- The samples-per-frame of a RAW field was out-of-range.
- GD_E_FORMAT_BAD_TYPE
- The data type of a RAW field was unrecognised.
- GD_E_FORMAT_BITNUM
- The first bit of a BIT field was out-of-range.
- GD_E_FORMAT_BITSIZE
- The last bit of a BIT field was out-of-range.
- GD_E_FORMAT_CHARACTER
- An invalid character was found in the line, or a character escape sequence was malformed.
- GD_E_FORMAT_DUPLICATE
- The specified field name already exists.
- GD_E_FORMAT_ENDIAN
- The byte sex specified by an /ENDIAN directive was unrecognised.
- GD_E_FORMAT_LITERAL
- An unexpected character was encountered in a complex literal.
- GD_E_FORMAT_LOCATION
- The parent of a metafield was defined in another fragment.
- GD_E_FORMAT_METARAW
- An attempt was made to add a RAW metafield.
- GD_E_FORMAT_N_FIELDS
- The number of fields of a LINCOM field was out-of-range.
- GD_E_FORMAT_N_TOK
- An insufficient number of tokens was found on the line.
- GD_E_FORMAT_NO_PARENT
- The parent of a metafield was not found.
- GD_E_FORMAT_NUMBITS
- The number of bits of a BIT field was out-of-range.
- GD_E_FORMAT_PROTECT
- The protection level specified by a PROTECT directive was unrecognised.
- GD_E_FORMAT_RES_NAME
- A field was specified with the reserved name INDEX (or with the reserved name FILEFRAM in a dirfile conforming to Standards Version 5 or earlier).
- GD_E_FORMAT_UNTERM
- The last token of the line was unterminated.
The callback function should return one of the following symbols, which tells the parser how to handle the error:
- GD_SYNTAX_ABORT
- The parser should immediately abort parsing the format specification and fail with the error GD_E_FORMAT. This is the default behaviour, if no callback function is provided (or if the parser is invoked by calling gd_open()).
- GD_SYNTAX_CONTINUE
- The parser should continue parsing the format specification. However, once parsing has finished, the parser will fail with the error GD_E_FORMAT, even if no further syntax errors are encountered. This behaviour may be used by the caller to identify all lines containing syntax errors in the format specification, instead of just the first one.
- GD_SYNTAX_IGNORE
- The parser should ignore the line containing the syntax error completely, and carry on parsing the format specification. If no further errors are encountered, the dirfile will be successfully opened.
- GD_SYNTAX_RESCAN
- The parser should rescan the line argument, which
replaces the line which originally contained the syntax error. The line is
assumed to have been corrected by the callback function. If the line still
contains a syntax error, the callback function will be called again.
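A callback that wants the parser to rescan a line has to write the corrected text into the line buffer first. The following is a minimal sketch of a bounds-checked helper for that step, assuming buflen gives the size in bytes of the buffer behind line. The helper name replace_line() is hypothetical and not part of GetData.

```c
#include <stddef.h>
#include <string.h>

/* Copy a corrected line into the parser's line buffer.
 *   line   - buffer to overwrite (pdata->line inside a callback)
 *   buflen - assumed size in bytes of that buffer (pdata->buflen)
 *   fixed  - NUL-terminated replacement text
 * Returns 1 on success. Returns 0 if the replacement plus its
 * terminator does not fit, leaving the buffer untouched. */
static int replace_line(char *line, size_t buflen, const char *fixed)
{
    size_t need = strlen(fixed) + 1;

    if (line == NULL || need > buflen)
        return 0;

    memcpy(line, fixed, need);
    return 1;
}
```

Inside sehandler, a call such as replace_line(pdata->line, pdata->buflen, corrected) that succeeds would be followed by returning GD_SYNTAX_RESCAN. On failure the callback could fall back to GD_SYNTAX_IGNORE or GD_SYNTAX_ABORT.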
RETURN VALUE
A call to gd_cbopen() or gd_open() always returns a pointer to a newly allocated DIRFILE object. The DIRFILE object is an opaque structure containing the parsed dirfile metadata. If an error occurred, the dirfile error will be set to a non-zero error value. The DIRFILE object will also be internally flagged as invalid. Possible error values are:
- GD_E_ACCMODE
- The library was asked to create or truncate a dirfile opened read-only (i.e. GD_CREAT or GD_TRUNC was specified in flags along with GD_RDONLY).
- GD_E_ALLOC
- The library was unable to allocate memory.
- GD_E_BAD_REFERENCE
- The reference field specified by a /REFERENCE directive in the format specification (see dirfile-format(5)) was not found, or was not a RAW field.
- GD_E_CALLBACK
- The registered callback function, sehandler, returned an unrecognised response.
- GD_E_CREAT
- The library was unable to create the dirfile, or the dirfile exists and both GD_CREAT and GD_EXCL were specified.
- GD_E_FORMAT
- A syntax error occurred in the format specification. See also The Callback Function section above.
- GD_E_INTERNAL_ERROR
- An internal error occurred in the library while trying to perform the task. This indicates a bug in the library. Please report the incident to the GetData developers.
- GD_E_LINE_TOO_LONG
- The parser encountered a line in the format specification longer than it was able to deal with. Lines are limited by the storage size of ssize_t. On 32-bit systems, this limits format specification lines to 2**31 bytes. The limit is larger on 64-bit systems.
- GD_E_OPEN
- The dirfile format specification could not be opened, or dirfilename does not specify a valid dirfile.
- GD_E_OPEN_FRAGMENT
- A file specified in an /INCLUDE directive could not be opened.
- GD_E_TRUNC
- The library was unable to truncate the dirfile.
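The line length limit quoted under GD_E_LINE_TOO_LONG follows from the width of ssize_t. As a worked example, the figure of 2**31 for a 32-bit (4-byte) ssize_t can be reproduced with the small function below. The name signed_limit() is made up for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 2**(8 * width_bytes - 1), the bound implied by a signed
 * integer type that is width_bytes wide. Valid for widths 1 to 8. */
static uint64_t signed_limit(size_t width_bytes)
{
    return (uint64_t)1 << (8 * width_bytes - 1);
}
```

Calling signed_limit(4) gives 2147483648, matching the 32-bit figure above. Passing sizeof(ssize_t), with <sys/types.h> included, gives the corresponding bound for the platform the code is compiled on.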
BUGS
When working with dirfiles conforming to Standards Versions 4 and earlier (before the introduction of the /ENDIAN directive), GetData assumes the dirfile has native byte sex, even though, officially, these early Standards stipulated data to be little-endian. This is necessary since, in the absence of an explicit /VERSION directive, it is often impossible to determine the intended Standards Version of a dirfile, and the current behaviour is to assume native byte sex for modern dirfiles lacking /ENDIAN. To read an old, little-endian dirfile on a big-endian platform, an /ENDIAN directive should be added to the format specification, or else GD_LITTLE_ENDIAN should be specified by the caller.
SEE ALSO
dirfile(5), dirfile-encoding(5), dirfile-format(5), gd_close(3), gd_dirfile_standards(3), gd_discard(3), gd_error(3), gd_error_string(3), gd_getdata(3), gd_include(3), gd_parser_callback(3)
3 November 2010 | Version 0.7.0 |