Scroll to navigation

GDBM_File(3perl) Perl Programmers Reference Guide GDBM_File(3perl)

NAME

GDBM_File - Perl5 access to the gdbm library.

SYNOPSIS

    use GDBM_File;
    [$db =] tie %hash, 'GDBM_File', $filename, GDBM_WRCREAT, 0640
                or die "$GDBM_File::gdbm_errno";
    # Use the %hash...
    $e = $db->errno;
    $e = $db->syserrno;
    $str = $db->strerror;
    $bool = $db->needs_recovery;
    $db->clear_error;
    $db->reorganize;
    $db->sync;
    $n = $db->count;
    $n = $db->flags;
    $str = $db->dbname;
    $db->cache_size;
    $db->cache_size($newsize);
    $n = $db->block_size;
    $bool = $db->sync_mode;
    $db->sync_mode($bool);
    $bool = $db->centfree;
    $db->centfree($bool);
    $bool = $db->coalesce;
    $db->coalesce($bool);
    $bool = $db->mmap;
    $size = $db->mmapsize;
    $db->mmapsize($newsize);
    $db->recover(%args);
    untie %hash ;

DESCRIPTION

GDBM_File is a module which allows Perl programs to make use of the facilities provided by the GNU gdbm library. If you intend to use this module you should really have a copy of the GDBM manual at hand. The manual is avaialble online at <https://www.gnu.org.ua/software/gdbm/manual>.

Most of the gdbm functions are available through the GDBM_File interface.

Unlike Perl's built-in hashes, it is not safe to "delete" the current item from a GDBM_File tied hash while iterating over it with "each". This is a limitation of the gdbm library.

Tie

Use the Perl built-in tie to associate a GDBM database with a Perl hash:

   tie %hash, 'GDBM_File', $filename, $flags, $mode;

Here, $filename is the name of the database file to open or create. $flags is a bitwise OR of access mode and optional modifiers. Access mode is one of:

Open existing database file in read-only mode.
Open existing database file in read-write mode.
If the database file exists, open it in read-write mode. If it doesn't, create it first and open read-write.
Create new database and open it read-write. If the database already exists, truncate it first.

A number of modifiers can be OR'd to the access mode. Most of them are rarely needed (see <https://www.gnu.org.ua/software/gdbm/manual/Open.html> for a complete list), but one is worth mentioning. The GDBM_NUMSYNC modifier, when used with GDBM_NEWDB, instructs GDBM to create the database in extended (so called numsync) format. This format is best suited for crash-tolerant implementations. See CRASH TOLERANCE below for more information.

The $mode parameter is the file mode for creating new database file. Use an octal constant or a combination of "S_I*" constants from the Fcntl module. This parameter is used if $flags is GDBM_NEWDB or GDBM_WRCREAT.

On success, tie returns an object of class GDBM_File. On failure, it returns undef. It is recommended to always check the return value, to make sure your hash is successfully associated with the database file. See ERROR HANDLING below for examples.

STATIC METHODS

GDBM_version

    $str = GDBM_File->GDBM_version;
    @ar = GDBM_File->GDBM_version;

Returns the version number of the underlying libgdbm library. In scalar context, returns the library version formatted as string:

    MINOR.MAJOR[.PATCH][ (GUESS)]

where MINOR, MAJOR, and PATCH are version numbers, and GUESS is a guess level (see below).

In list context, returns a list:

    ( MINOR, MAJOR, PATCH [, GUESS] )

The GUESS component is present only if libgdbm version is 1.8.3 or earlier. This is because earlier releases of libgdbm did not include information about their version and the GDBM_File module has to implement certain guesswork in order to determine it. GUESS is a textual description in string context, and a positive number indicating how rough the guess is in list context. Possible values are:

1 - exact guess
The major and minor version numbers are guaranteed to be correct. The actual patchlevel is most probably guessed right, but can be 1-2 less than indicated.
2 - approximate
The major and minor number are guaranteed to be correct. The patchlevel is set to the upper bound.
3 - rough guess
The version is guaranteed to be not newer than MAJOR.MINOR.

ERROR HANDLING

$GDBM_File::gdbm_errno

When referenced in numeric context, retrieves the current value of the gdbm_errno variable, i.e. a numeric code describing the state of the most recent operation on any gdbm database. Each numeric code has a symbolic name associated with it. For a comprehensive list of these, see <https://www.gnu.org.ua/software/gdbm/manual/Error-codes.html>. Notice, that this list includes all error codes defined for the most recent version of gdbm. Depending on the actual version of the library GDBM_File is built with, some of these may be missing.

In string context, $gdbm_errno returns a human-readable description of the error. If necessary, this description includes the value of $!. This makes it possible to use it in diagnostic messages. For example, the usual tying sequence is

    tie %hash, 'GDBM_File', $filename, GDBM_WRCREAT, 0640
         or die "$GDBM_File::gdbm_errno";

The following, more complex, example illustrates how you can fall back to read-only mode if the database file permissions forbid read-write access:

    use Errno qw(EACCES);
    unless (tie(%hash, 'GDBM_File', $filename, GDBM_WRCREAT, 0640)) {
        if ($GDBM_File::gdbm_errno == GDBM_FILE_OPEN_ERROR
            && $!{EACCES}) {
            if (tie(%hash, 'GDBM_File', $filename, GDBM_READER, 0640)) {
                die "$GDBM_File::gdbm_errno";
            }
        } else {
            die "$GDBM_File::gdbm_errno";
        }
    }

gdbm_check_syserr

    if (gdbm_check_syserr(gdbm_errno)) ...

Returns true if the system error number ($!) gives more information on the cause of the error.

DATABASE METHODS

close

    $db->close;

Closes the database. Normally you would just do untie. However, you will need to use this function if you have explicitly assigned the result of tie to a variable, and wish to release the database to another users. Consider the following code:

    $db = tie %hash, 'GDBM_File', $filename, GDBM_WRCREAT, 0640;
    # Do something with %hash or $db...
    untie %hash;
    $db->close;

In this example, doing untie alone is not enough, since the database would remain referenced by $db, and, as a consequence, the database file would remain locked. Calling $db->close ensures the database file is closed and unlocked.

errno

    $db->errno

Returns the last error status associated with this database. In string context, returns a human-readable description of the error. See also $GDBM_File::gdbm_errno variable above.

syserrno

    $db->syserrno

Returns the last system error status (C "errno" variable), associated with this database,

strerror

    $db->strerror

Returns textual description of the last error that occurred in this database.

clear_error

    $db->clear_error

Clear error status.

needs_recovery

    $db->needs_recovery

Returns true if the database needs recovery.

reorganize

    $db->reorganize;

Reorganizes the database.

sync

    $db->sync;

Synchronizes recent changes to the database with its disk copy.

count

    $n = $db->count;

Returns number of keys in the database.

flags

    $db->flags;

Returns flags passed as 4th argument to tie.

dbname

    $db->dbname;

Returns the database name (i.e. 3rd argument to tie.

cache_size

    $db->cache_size;
    $db->cache_size($newsize);

Returns the size of the internal GDBM cache for that database.

Called with argument, sets the size to $newsize.

block_size

    $db->block_size;

Returns the block size of the database.

sync_mode

    $db->sync_mode;
    $db->sync_mode($bool);

Returns the status of the automatic synchronization mode. Called with argument, enables or disables the sync mode, depending on whether $bool is true or false.

When synchronization mode is on (true), any changes to the database are immediately written to the disk. This ensures database consistency in case of any unforeseen errors (e.g. power failures), at the expense of considerable slowdown of operation.

Synchronization mode is off by default.

centfree

    $db->centfree;
    $db->centfree($bool);

Returns status of the central free block pool (0 - disabled, 1 - enabled).

With argument, changes its status.

By default, central free block pool is disabled.

coalesce

    $db->coalesce;
    $db->coalesce($bool);

mmap

    $db->mmap;

Returns true if memory mapping is enabled.

This method will croak if the libgdbm library is complied without memory mapping support.

mmapsize

    $db->mmapsize;
    $db->mmapsize($newsize);

If memory mapping is enabled, returns the size of memory mapping. With argument, sets the size to $newsize.

This method will croak if the libgdbm library is complied without memory mapping support.

recover

    $db->recover(%args);

Recovers data from a failed database. %args is optional and can contain following keys:

Reference to code for detailed error reporting. Upon encountering an error, recover will call this sub with a single argument - a description of the error.
Creates a backup copy of the database before recovery and returns its filename in $str.
Maximum allowed number of failed keys. If the actual number becomes equal to $n, recover aborts and returns error.
Maximum allowed number of failed buckets. If the actual number becomes equal to $n, recover aborts and returns error.
Maximum allowed number of failures during recovery.
Return recovery statistics in %hash. Upon return, the following keys will be present:
Number of successfully recovered keys.
Number of successfully recovered buckets.
Number of keys that failed to be retrieved.
Number of buckets that failed to be retrieved.

convert

    $db->convert($format);

Changes the format of the database file referred to by $db.

Starting from version 1.20, gdbm supports two database file formats: standard and extended. The former is the traditional database format, used by previous gdbm versions. The extended format contains additional data and is recommended for use in crash tolerant applications.

<https://www.gnu.org.ua/software/gdbm/manual/Numsync.html>, for the discussion of both formats.

The $format argument sets the new desired database format. It is GDBM_NUMSYNC to convert the database from standard to extended format, and 0 to convert it from extended to standard format.

If the database is already in the requested format, the function returns success without doing anything.

dump

    $db->dump($filename, %options)

Creates a dump of the database file in $filename. Such file can be used as a backup copy or sent over a wire to recreate the database on another machine. To create a database from the dump file, use the load method.

GDBM supports two dump formats: old binary and new ascii. The binary format is not portable across architectures and is deprecated. It is supported for backward compatibility. The ascii format is portable and stores additional meta-data about the file. It was introduced with the gdbm version 1.11 and is the preferred dump format. The dump method creates ascii dumps by default.

If the named file already exists, the function will refuse to overwrite and will croak an error. If it doesn't exist, it will be created with the mode 0666 modified by the current umask.

These defaults can be altered using the following %options:

Create dump in binary format.
Set file mode to MODE.
Silently overwrite existing files.

load

    $db->load($filename, %options)

Load the data from the dump file $filename into the database $db. The file must have been previously created using the dump method. File format is recognized automatically. By default, the function will croak if the dump contains a key that already exists in the database. It will silently ignore the failure to restore database mode and/or ownership. These defaults can be altered using the following %options:

Replace existing keys.
If 0, don't try to restore the mode of the database file to that stored in the dump.
If 0, don't try to restore the owner of the database file to that stored in the dump.
Croak if failed to restore ownership and/or mode.

The usual sequence to recreate a database from the dump file is:

    my %hash;
    my $db = tie %hash, 'GDBM_File', 'a.db', GDBM_NEWDB, 0640;
    $db->load('a.dump');

CRASH TOLERANCE

Crash tolerance is a new feature that, given appropriate support from the OS and the filesystem, guarantees that a logically consistent recent state of the database can be recovered following a crash, such as power outage, OS kernel panic, or the like.

Crash tolerance support appeared in gdbm version 1.21. The theory behind it is explained in "Crashproofing the Original NoSQL Key-Value Store", by Terence Kelly (<https://queue.acm.org/detail.cfm?id=3487353>). A detailed discussion of the gdbm implementation is available in the GDBM Manual (<https://www.gnu.org.ua/software/gdbm/manual/Crash-Tolerance.html>). The information below describes the Perl interface.

For maximum robustness, we recommend to use extended database format for crash tolerant databases. To create a database in extended format, use the GDBM_NEWDB|GDBM_NUMSYNC when opening the database, e.g.:

    $db = tie %hash, 'GDBM_File', $filename,
              GDBM_NEWDB|GDBM_NUMSYNC, 0640;

To convert existing database to the extended format, use the convert method, described above, e.g.:

    $db->convert(GDBM_NUMSYNC);

crash_tolerance_status

    GDBM_File->crash_tolerance_status;

This static method returns the status of crash tolerance support. A non-zero value means crash tolerance is compiled in and supported by the operating system.

failure_atomic

    $db->failure_atomic($even, $odd)

Enables crash tolerance for the database $db, Arguments are the pathnames of two files that will be created and filled with snapshots of the database file. The two files must not exist when this method is called and must reside on the same filesystem as the database file. This filesystem must be support the reflink operation (https://www.gnu.org.ua/software/gdbm/manual/Filesystems-supporting-crash-tolerance.html>.

After a successful call to failure_atomic, every call to $db-sync> method will make an efficient reflink snapshot of the database file in one of these files; consecutive calls to sync alternate between the two, hence the names.

The most recent of these files can be used to recover the database after a crash. To select the right snapshot, use the latest_snapshot static method.

latest_snapshot

    $file = GDBM_File->latest_snapshot($even, $odd);
    ($file, $error) = GDBM_File->latest_snapshot($even, $odd);

Given the two snapshot names (the ones used previously in a call to failure_atomic), this method selects the one suitable for database recovery, i.e. the file which contains the most recent database snapshot.

In scalar context, it returns the selected file name or undef in case of failure.

In array context, the returns a list of two elements: the file name and status code. On success, the file name is defined and the code is GDBM_SNAPSHOT_OK. On error, the file name is undef, and the status is one of the following:

Neither snapshot file is applicable. This means that the crash has occurred before a call to failure_atomic completed. In this case, it is best to fall back on a safe backup copy of the data file.
A system error occurred. Examine $! for details. See <https://www.gnu.org.ua/software/gdbm/manual/Crash-recovery.html> for a comprehensive list of error codes and their meaning.
The file modes and modification dates of both snapshot files are exactly the same. This can happen only for databases in standard format.
The numsync counters of the two snapshots differ by more than one. The most probable reason is programmer's error: the two parameters refer to snapshots belonging to different database files.

AVAILABILITY

gdbm is available from any GNU archive. The master site is "ftp.gnu.org", but you are strongly urged to use one of the many mirrors. You can obtain a list of mirror sites from <http://www.gnu.org/order/ftp.html>.

SECURITY AND PORTABILITY

GDBM files are not portable across platforms. If you wish to transfer a GDBM file over the wire, dump it to a portable format first.

Do not accept GDBM files from untrusted sources.

Robustness of GDBM against corrupted databases depends highly on its version. Versions prior to 1.15 did not implement any validity checking, so that a corrupted or maliciously crafted database file could cause perl to crash or even expose a security vulnerability. Versions between 1.15 and 1.20 were progressively strengthened against invalid inputs. Finally, version 1.21 had undergone extensive fuzzy checking which proved its ability to withstand any kinds of inputs without crashing.

SEE ALSO

perl(1), DB_File(3), perldbmfilter, gdbm(3), <https://www.gnu.org.ua/software/gdbm/manual.html>.

2023-11-25 perl v5.36.0