.TH "readers" 3 "Wed Mar 25 2020" "LMDB" \" -*- nroff -*-
.ad l
.nh
.SH NAME
readers \- Readers don't acquire any locks for their data access; they record their transaction ID in the reader table
.SH SYNOPSIS
.br
.PP
.SS "Data Structures"
.in +1c
.ti -1c
.RI "struct \fBMDB_rxbody\fP"
.br
.ti -1c
.RI "struct \fBMDB_reader\fP"
.br
.ti -1c
.RI "struct \fBMDB_txbody\fP"
.br
.ti -1c
.RI "struct \fBMDB_txninfo\fP"
.br
.in -1c
.SS "Macros"
.in +1c
.ti -1c
.RI "#define \fBDEFAULT_READERS\fP 126"
.br
.ti -1c
.RI "#define \fBCACHELINE\fP 64"
.br
.ti -1c
.RI "#define \fBMDB_LOCK_FORMAT\fP"
.br
.in -1c
.SH "Detailed Description"
.PP
Readers don't acquire any locks for their data access\&.
Instead, they simply record their transaction ID in the reader table\&.
The reader mutex is needed just to find an empty slot in the reader table\&.
The slot's address is saved in thread-specific data so that subsequent read transactions started by the same thread need no further locking to proceed\&.
If \fBMDB_NOTLS\fP is set, the slot address is not saved in thread-specific data\&.
.PP
No reader table is used if the database is on a read-only filesystem, or if \fBMDB_NOLOCK\fP is set\&.
.PP
Since the database uses multi-version concurrency control, readers don't actually need any locking\&.
This table is used to keep track of which readers are using data from which old transactions, so that we'll know when a particular old transaction is no longer in use\&.
Old transactions that have discarded any data pages can then have those pages reclaimed for use by a later write transaction\&.
.PP
The lock table is constructed such that reader slots are aligned with the processor's cache line size\&.
Any slot is only ever used by one thread\&.
This alignment guarantees that there will be no contention or cache thrashing as threads update their own slot info, and also eliminates any need for locking when accessing a slot\&.
.PP
A writer thread will scan every slot in the table to determine the oldest outstanding reader transaction\&.
Any freed pages older than this will be reclaimed by the writer\&.
The writer doesn't use any locks when scanning this table\&.
This means that there's no guarantee that the writer will see the most up-to-date reader info, but that's not required for correct operation\&.
All we need is to know the upper bound on the oldest reader; we don't care at all about the newest reader\&.
So the only consequence of reading stale information here is that old pages might hang around a while longer before being reclaimed\&.
That's actually good anyway, because the longer we delay reclaiming old pages, the more likely it is that a string of contiguous pages can be found after coalescing old pages from many old transactions together\&.
.SH "Data Structure Documentation"
.PP
.SH "struct MDB_rxbody"
.PP
The information we store in a single slot of the reader table\&.
In addition to a transaction ID, we also record the process and thread ID that owns a slot, so that we can detect stale information, e\&.g\&. threads or processes that went away without cleaning up\&.
.PP
\fBNote\fP
.RS 4
We currently don't check for stale records\&.
We simply re-init the table when we know that we're the only process opening the lock file\&.
.RE
.PP
.in -1c
.RI "\fBData Fields\fP"
.in +1c
.in +1c
.ti -1c
.RI "volatile \fBtxnid_t\fP \fBmrb_txnid\fP"
.br
.ti -1c
.RI "volatile MDB_PID_T \fBmrb_pid\fP"
.br
.ti -1c
.RI "volatile MDB_THR_T \fBmrb_tid\fP"
.br
.in -1c
.SH "Field Documentation"
.PP
.SS "volatile \fBtxnid_t\fP MDB_rxbody::mrb_txnid"
Current Transaction ID when this transaction began, or (txnid_t)\-1\&.
Multiple readers that start at the same time will probably have the same ID here\&.
Again, it's not important to exclude them from anything; all we need to know is which version of the DB they started from so we can avoid overwriting any data used in that particular version\&.
.SS "volatile MDB_PID_T MDB_rxbody::mrb_pid"
The process ID of the process owning this reader txn\&.
.SS "volatile MDB_THR_T MDB_rxbody::mrb_tid"
The thread ID of the thread owning this txn\&.
.SH "struct MDB_reader"
.PP
The actual reader record, with cacheline padding\&.
.PP
.in -1c
.RI "\fBData Fields\fP"
.in +1c
.in +1c
.ti -1c
.RI "union {"
.br
.ti -1c
.RI " \fBMDB_rxbody\fP \fBmrx\fP"
.br
.ti -1c
.RI " char \fBpad\fP [(sizeof(\fBMDB_rxbody\fP)+\fBCACHELINE\fP\-1) & ~(\fBCACHELINE\fP\-1)]"
.br
.ti -1c
.RI "} \fBmru\fP"
.br
.in -1c
.SH "Field Documentation"
.PP
.SS "char MDB_reader::pad[(sizeof(\fBMDB_rxbody\fP)+\fBCACHELINE\fP\-1) & ~(\fBCACHELINE\fP\-1)]"
Cache line alignment\&.
.SH "struct MDB_txbody"
.PP
The header for the reader table\&.
The table resides in a memory-mapped file\&.
(This is a different file than is used for the main database\&.)
.PP
For POSIX the actual mutexes reside in the shared memory of this mapped file\&.
On Windows, mutexes are named objects allocated by the kernel; we store the mutex names in this mapped file so that other processes can grab them\&.
This same approach is also used on MacOSX/Darwin (using named semaphores) since MacOSX doesn't support process-shared POSIX mutexes\&.
For these cases where a named object is used, the object name is derived from a 64-bit FNV hash of the environment pathname\&.
As such, naming collisions are extremely unlikely\&.
If a collision occurs, the results are unpredictable\&.
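.PP
As a rough illustration of the hashing step mentioned above, here is a minimal sketch of a 64-bit FNV hash in C\&.
It assumes the FNV-1a variant with the standard published offset basis and prime; this page does not say which variant LMDB uses or how the hash value is turned into an object name\&.
The helper name fnv1a_64 is hypothetical and is not part of the LMDB API\&.
.PP
.nf
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an LMDB function): 64-bit FNV-1a over a buffer. */
static uint64_t fnv1a_64(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint64_t h = UINT64_C(0xcbf29ce484222325);  /* FNV-1a 64-bit offset basis */
    size_t i;

    assert(data != NULL || len == 0);            /* an empty input may be NULL */

    for (i = 0; i < len; i++) {
        h ^= p[i];                               /* fold in the next byte */
        h *= UINT64_C(0x100000001b3);            /* multiply by the 64-bit FNV prime */
    }
    return h;
}
```
.fi
.PP
Hashing the same bytes always yields the same value, so unrelated processes that hash the same input could arrive at the same object name without communicating\&.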
.PP
.in -1c
.RI "\fBData Fields\fP"
.in +1c
.in +1c
.ti -1c
.RI "uint32_t \fBmtb_magic\fP"
.br
.ti -1c
.RI "uint32_t \fBmtb_format\fP"
.br
.ti -1c
.RI "\fBmdb_mutex_t\fP \fBmtb_rmutex\fP"
.br
.ti -1c
.RI "volatile \fBtxnid_t\fP \fBmtb_txnid\fP"
.br
.ti -1c
.RI "volatile unsigned \fBmtb_numreaders\fP"
.br
.in -1c
.SH "Field Documentation"
.PP
.SS "uint32_t MDB_txbody::mtb_magic"
Stamp identifying this as an LMDB file\&.
It must be set to \fBMDB_MAGIC\fP\&.
.SS "uint32_t MDB_txbody::mtb_format"
Format of this lock file\&.
Must be set to \fBMDB_LOCK_FORMAT\fP\&.
.SS "\fBmdb_mutex_t\fP MDB_txbody::mtb_rmutex"
Mutex protecting access to this table\&.
This is the reader table lock used with LOCK_MUTEX()\&.
.SS "volatile \fBtxnid_t\fP MDB_txbody::mtb_txnid"
The ID of the last transaction committed to the database\&.
This is recorded here only for convenience; the value can always be determined by reading the main database meta pages\&.
.SS "volatile unsigned MDB_txbody::mtb_numreaders"
The number of slots that have been used in the reader table\&.
This always records the maximum count; it is not decremented when readers release their slots\&.
.SH "struct MDB_txninfo"
.PP
The actual reader table definition\&.
.PP
.in -1c
.RI "\fBData Fields\fP"
.in +1c
.in +1c
.ti -1c
.RI "union {"
.br
.ti -1c
.RI " \fBMDB_txbody\fP \fBmtb\fP"
.br
.ti -1c
.RI " char \fBpad\fP [(sizeof(\fBMDB_txbody\fP)+\fBCACHELINE\fP\-1) & ~(\fBCACHELINE\fP\-1)]"
.br
.ti -1c
.RI "} \fBmt1\fP"
.br
.ti -1c
.RI "union {"
.br
.ti -1c
.RI " \fBmdb_mutex_t\fP \fBmt2_wmutex\fP"
.br
.ti -1c
.RI " char \fBpad\fP [(MNAME_LEN+\fBCACHELINE\fP\-1) & ~(\fBCACHELINE\fP\-1)]"
.br
.ti -1c
.RI "} \fBmt2\fP"
.br
.ti -1c
.RI "\fBMDB_reader\fP \fBmti_readers\fP [1]"
.br
.in -1c
.SH "Macro Definition Documentation"
.PP
.SS "#define DEFAULT_READERS 126"
Number of slots in the reader table\&.
This value was chosen somewhat arbitrarily\&.
126 readers plus a couple of mutexes fit exactly into 8KB on my development machine\&.
Applications should set the table size using \fBmdb_env_set_maxreaders()\fP\&.
.SS "#define CACHELINE 64"
The size of a CPU cache line in bytes\&.
We want our lock structures aligned to this size to avoid false cache line sharing in the lock table\&.
This value works for most CPUs\&.
For Itanium this should be 128\&.
.SS "#define MDB_LOCK_FORMAT"
\fBValue:\fP
.PP
.nf
((uint32_t) \e
    ((MDB_LOCK_VERSION) \e
    /* Flags which describe functionality */ \e
    + (((MDB_PIDLOCK) != 0) << 16)))
.fi
.PP
Lockfile format signature: version, features and field layout\&.
.SH "Author"
.PP
Generated automatically by Doxygen for LMDB from the source code\&.
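.SH "Example"
.PP
The pad members documented above all use the same rounding expression, and a small standalone sketch may make it easier to follow\&.
The code below is illustrative only: struct rx_body is a hypothetical stand-in for \fBMDB_rxbody\fP with assumed field types, and PAD_TO_CACHELINE, padded_slot and slot_size_ok are names invented for this example\&.
.PP
.nf
```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE 64   /* same value as the CACHELINE macro documented above */

/* Hypothetical stand-in for MDB_rxbody; the field types are assumptions. */
struct rx_body {
    volatile unsigned long txnid;
    volatile int pid;
    volatile unsigned long tid;
};

/* Round n up to the next multiple of CACHELINE (which must be a power of two). */
#define PAD_TO_CACHELINE(n) (((n) + CACHELINE - 1) & ~((size_t)CACHELINE - 1))

/* Same shape as MDB_reader: the pad member fixes the size of the union. */
union padded_slot {
    struct rx_body body;
    char pad[PAD_TO_CACHELINE(sizeof(struct rx_body))];
};

/* Nonzero when each slot occupies a whole number of cache lines. */
static int slot_size_ok(void)
{
    return sizeof(union padded_slot) % CACHELINE == 0;
}
```
.fi
.PP
Padding only fixes the size of each slot\&.
Whether slots actually start on cache line boundaries also depends on the table itself starting on one, which this sketch does not show\&.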