.\" Automatically generated by Pandoc 2.0.6
.\"
.TH "LIBRPMEM" "7" "2022-08-25" "PMDK - rpmem API version 1.3" "PMDK Programmer's Manual"
.hy
.\" SPDX-License-Identifier: BSD-3-Clause
.\" Copyright 2016-2022, Intel Corporation
.SH NAME
.PP
\f[B]librpmem\f[] \- remote persistent memory support library
(DEPRECATED)
.SH SYNOPSIS
.IP
.nf
\f[C]
#include\ <librpmem.h>
cc\ ...\ \-lrpmem
\f[]
.fi
.SS Library API versioning:
.IP
.nf
\f[C]
const\ char\ *rpmem_check_version(
\ \ \ \ unsigned\ major_required,
\ \ \ \ unsigned\ minor_required);
\f[]
.fi
.SS Error handling:
.IP
.nf
\f[C]
const\ char\ *rpmem_errormsg(void);
\f[]
.fi
.SS Other library functions:
.PP
A description of other \f[B]librpmem\f[] functions can be found on the
following manual pages:
.IP \[bu] 2
\f[B]rpmem_create\f[](3), \f[B]rpmem_persist\f[](3)
.SH DESCRIPTION
.PP
\f[B]librpmem\f[] provides low\-level support for remote access to
\f[I]persistent memory\f[] (pmem) utilizing RDMA\-capable RNICs.
The library can be used to remotely replicate a memory region over the
RDMA protocol.
It utilizes an appropriate persistency mechanism based on the remote
node's platform capabilities.
\f[B]librpmem\f[] utilizes the \f[B]ssh\f[](1) client to authenticate a
user on the remote node, and for encryption of the connection's
out\-of\-band configuration data.
See \f[B]SSH\f[], below, for details.
.PP
The replicated memory region cannot be larger than the maximum
locked\-in\-memory address space limit.
See \f[B]memlock\f[] in \f[B]limits.conf\f[](5) for more details.
.PP
This library is for applications that use remote persistent memory
directly, without the help of any library\-supplied transactions or memory
allocation.
Higher\-level libraries that build on \f[B]libpmem\f[](7) are available
and are recommended for most applications, see:
.IP \[bu] 2
\f[B]libpmemobj\f[](7), a general use persistent memory API, providing
memory allocation and transactional operations on variable\-sized objects.
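.PP
The locked\-memory limit mentioned above can be checked before a pool is
created.
The following is a minimal sketch, not part of \f[B]librpmem\f[]: the
helper names \f[I]fits_memlock_limit\f[] and \f[I]pool_fits_memlock\f[]
are hypothetical, and the sketch assumes that the soft
\f[B]RLIMIT_MEMLOCK\f[] value reported by \f[B]getrlimit\f[](2) is the
limit that applies to the calling process.

```c
#include <stddef.h>
#include <sys/resource.h>

/*
 * Hypothetical helper (not a librpmem function): returns 1 if a region
 * of pool_size bytes fits within the given locked-memory limit and
 * 0 otherwise.
 */
int
fits_memlock_limit(size_t pool_size, rlim_t limit)
{
	if (limit == RLIM_INFINITY)
		return 1;
	return (rlim_t)pool_size <= limit;
}

/*
 * Hypothetical helper: compares pool_size with the soft RLIMIT_MEMLOCK
 * limit of the calling process (an assumption of this sketch).
 * Returns 1 if the size fits, 0 if it does not, -1 if the limit
 * could not be read.
 */
int
pool_fits_memlock(size_t pool_size)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return -1;
	return fits_memlock_limit(pool_size, rl.rlim_cur);
}
```

.PP
An application could call \f[I]pool_fits_memlock\f[] with its intended
pool size before calling \f[B]rpmem_create\f[](3) and report a clearer
diagnostic when the limit is too low.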
.SH TARGET NODE ADDRESS FORMAT
.IP
.nf
\f[C]
[<user>\@]<hostname>[:<port>]
\f[]
.fi
.PP
The target node address is described by the \f[I]hostname\f[] which the
client connects to, with an optional \f[I]user\f[] name.
The user must be authorized to authenticate to the remote machine without
being prompted for a password or passphrase.
The optional \f[I]port\f[] number is used to establish the SSH connection.
The default port number is 22.
.SH REMOTE POOL ATTRIBUTES
.PP
The \f[I]rpmem_pool_attr\f[] structure describes a remote pool and is
stored in the remote pool's metadata.
The caller must pass this structure to \f[B]rpmem_create\f[](3) when
creating a pool on the remote node.
When a pool is opened with \f[B]rpmem_open\f[](3), the appropriate fields
are read from the pool's metadata and returned to the caller.
.IP
.nf
\f[C]
#define\ RPMEM_POOL_HDR_SIG_LEN\ \ \ \ 8
#define\ RPMEM_POOL_HDR_UUID_LEN\ \ \ 16
#define\ RPMEM_POOL_USER_FLAGS_LEN\ 16

struct\ rpmem_pool_attr\ {
\ \ \ \ char\ signature[RPMEM_POOL_HDR_SIG_LEN];
\ \ \ \ uint32_t\ major;
\ \ \ \ uint32_t\ compat_features;
\ \ \ \ uint32_t\ incompat_features;
\ \ \ \ uint32_t\ ro_compat_features;
\ \ \ \ unsigned\ char\ poolset_uuid[RPMEM_POOL_HDR_UUID_LEN];
\ \ \ \ unsigned\ char\ uuid[RPMEM_POOL_HDR_UUID_LEN];
\ \ \ \ unsigned\ char\ next_uuid[RPMEM_POOL_HDR_UUID_LEN];
\ \ \ \ unsigned\ char\ prev_uuid[RPMEM_POOL_HDR_UUID_LEN];
\ \ \ \ unsigned\ char\ user_flags[RPMEM_POOL_USER_FLAGS_LEN];
};
\f[]
.fi
.PP
The \f[I]signature\f[] field is an 8\-byte field which describes the
pool's on\-media format.
.PP
The \f[I]major\f[] field is the major version number of the pool's
on\-media format.
.PP
The \f[I]compat_features\f[] field is a mask describing compatibility of
the optional features of the pool's on\-media format.
.PP
The \f[I]incompat_features\f[] field is a mask describing compatibility of
the required features of the pool's on\-media format.
.PP
The \f[I]ro_compat_features\f[] field is a mask describing compatibility
of the features of the pool's on\-media format.
If these features are not available, the pool shall be opened in
read\-only mode.
.PP
The \f[I]poolset_uuid\f[] field is the UUID of the pool which the remote
pool is associated with.
.PP
The \f[I]uuid\f[] field is the UUID of the first part of the remote pool.
This field can be used to connect the remote pool with other pools in a
list.
.PP
The \f[I]next_uuid\f[] and \f[I]prev_uuid\f[] fields are the UUIDs of the
next and previous replicas, respectively.
These fields can be used to connect the remote pool with other pools in a
list.
.PP
The \f[I]user_flags\f[] field is a 16\-byte field of user\-defined flags.
.SH SSH
.PP
\f[B]librpmem\f[] utilizes the \f[B]ssh\f[](1) client to log in and
execute the \f[B]rpmemd\f[](1) process on the remote node.
By default, \f[B]ssh\f[](1) is executed with the \f[B]\-4\f[] option,
which forces using \f[B]IPv4\f[] addressing.
.PP
For debugging purposes, both the ssh client and the commands executed on
the remote node may be overridden by setting the \f[B]RPMEM_SSH\f[] and
\f[B]RPMEM_CMD\f[] environment variables, respectively.
See \f[B]ENVIRONMENT\f[] for details.
.SH FORK
.PP
The \f[B]ssh\f[](1) client is executed by \f[B]rpmem_open\f[](3) and
\f[B]rpmem_create\f[](3) after forking a child process using
\f[B]fork\f[](2).
The application must take this into account when using \f[B]wait\f[](2)
and \f[B]waitpid\f[](2), which may return the \f[I]PID\f[] of the
\f[B]ssh\f[](1) process executed by \f[B]librpmem\f[].
.PP
If \f[B]fork\f[](2) support is not enabled in \f[B]libibverbs\f[],
\f[B]rpmem_open\f[](3) and \f[B]rpmem_create\f[](3) will fail.
By default, \f[B]fabric\f[](7) initializes \f[B]libibverbs\f[] with
\f[B]fork\f[](2) support by calling the \f[B]ibv_fork_init\f[](3)
function.
See \f[B]fi_verbs\f[](7) for more details.
.SH CAVEATS
.PP
\f[B]librpmem\f[] relies on the library destructor being called from the
main thread.
For this reason, all functions that might trigger destruction (e.g.
\f[B]dlclose\f[](3)) should be called in the main thread.
Otherwise some of the resources associated with that thread might not be
cleaned up properly.
.PP
\f[B]librpmem\f[] registers a pool as a single memory region.
Chelsio T4 and T5 hardware cannot handle a memory region of 8GB or larger
due to a hardware bug.
The \f[I]pool_size\f[] value passed to \f[B]rpmem_create\f[](3) and
\f[B]rpmem_open\f[](3) on this hardware must therefore be smaller than
8GB.
.SH LIBRARY API VERSIONING
.PP
This section describes how the library API is versioned, allowing
applications to work with an evolving API.
.PP
The \f[B]rpmem_check_version\f[]() function is used to see if the
installed \f[B]librpmem\f[] supports the version of the library API
required by an application.
The easiest way to do this is for the application to supply the
compile\-time version information, supplied by defines in
\f[B]<librpmem.h>\f[], like this:
.IP
.nf
\f[C]
reason\ =\ rpmem_check_version(RPMEM_MAJOR_VERSION,
\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ RPMEM_MINOR_VERSION);
if\ (reason\ !=\ NULL)\ {
\ \ \ \ /*\ version\ check\ failed,\ reason\ string\ tells\ you\ why\ */
}
\f[]
.fi
.PP
Any mismatch in the major version number is considered a failure, but a
library with a newer minor version number will pass this check since
increasing minor versions imply backwards compatibility.
.PP
An application can also check specifically for the existence of an
interface by checking for the version where that interface was
introduced.
These versions are documented in this man page as follows: unless
otherwise specified, all interfaces described here are available in
version 1.0 of the library.
Interfaces added after version 1.0 will contain the text \f[I]introduced
in version x.y\f[] in the section of this manual describing the feature.
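.PP
The compatibility rule described in this section (the major version must
match exactly, while a newer minor version in the installed library is
acceptable) can be sketched as follows.
This is only an illustration of the rule, not the \f[B]librpmem\f[]
implementation: the function \f[I]sketch_check_version\f[], its
\f[I]lib_major\f[] and \f[I]lib_minor\f[] parameters, and the returned
strings are hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical illustration of the versioning rule; this is not how
 * librpmem implements rpmem_check_version(). lib_major and lib_minor
 * stand for the API version of the installed library.
 * Returns NULL when the check passes, otherwise a static string
 * describing why it failed.
 */
const char *
sketch_check_version(unsigned lib_major, unsigned lib_minor,
	unsigned major_required, unsigned minor_required)
{
	/* any difference in the major version is a failure */
	if (lib_major != major_required)
		return "major version mismatch";

	/* the library may be newer than required, but not older */
	if (lib_minor < minor_required)
		return "installed minor version is older than required";

	return NULL;
}
```

.PP
Under this sketch, an installed library at version 1.3 satisfies a request
for version 1.0, while requests for version 2.0 or 1.4 are rejected.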
.PP
When the version check performed by \f[B]rpmem_check_version\f[]() is
successful, the return value is NULL.
Otherwise the return value is a static string describing the reason for
failing the version check.
The string returned by \f[B]rpmem_check_version\f[]() must not be modified
or freed.
.SH ENVIRONMENT
.PP
\f[B]librpmem\f[] can change its default behavior based on the following
environment variables.
These are largely intended for testing and are not normally required.
.IP \[bu] 2
\f[B]RPMEM_SSH\f[]=\f[I]ssh_client\f[]
.PP
Setting this environment variable overrides the default \f[B]ssh\f[](1)
client command name.
.IP \[bu] 2
\f[B]RPMEM_CMD\f[]=\f[I]cmd\f[]
.PP
Setting this environment variable overrides the default command executed
on the remote node using either \f[B]ssh\f[](1) or the alternative remote
shell command specified by \f[B]RPMEM_SSH\f[].
.PP
\f[B]RPMEM_CMD\f[] can contain multiple commands separated by a vertical
bar (\f[C]|\f[]).
Each consecutive command is executed on the remote node in the order read
from the pool set file.
This environment variable is read when the library is initialized, so
\f[B]RPMEM_CMD\f[] must be set prior to application launch (or prior to
\f[B]dlopen\f[](3) if \f[B]librpmem\f[] is being dynamically loaded).
.IP \[bu] 2
\f[B]RPMEM_ENABLE_SOCKETS\f[]=0|1
.PP
Setting this variable to 1 enables the \f[B]fi_sockets\f[](7) provider
for the in\-band RDMA connection.
The \f[I]sockets\f[] provider does not support IPv6.
IPv6 must be disabled system\-wide if \f[B]RPMEM_ENABLE_SOCKETS\f[] == 1,
\f[I]target\f[] == localhost (or any other loopback interface address),
and the \f[B]SSH_CONNECTION\f[] variable (see \f[B]ssh\f[](1) for more
details) contains an IPv6 address after connecting over ssh to the
loopback interface.
By default the \f[I]sockets\f[] provider is disabled.
.IP \[bu] 2
\f[B]RPMEM_ENABLE_VERBS\f[]=0|1
.PP
Setting this variable to 0 disables the \f[B]fi_verbs\f[](7) provider for
the in\-band RDMA connection.
The \f[I]verbs\f[] provider is enabled by default.
.IP \[bu] 2
\f[B]RPMEM_MAX_NLANES\f[]=\f[I]num\f[]
.PP
Limit the maximum number of lanes to \f[I]num\f[].
See \f[B]LANES\f[], in \f[B]rpmem_create\f[](3), for details.
.IP \[bu] 2
\f[B]RPMEM_WORK_QUEUE_SIZE\f[]=\f[I]size\f[]
.PP
Suggest the work queue size.
The effective work queue size can be greater than suggested if
\f[B]librpmem\f[] requires it, or smaller if the underlying hardware does
not support the suggested size.
The work queue size affects the performance of communication to the remote
node.
\f[B]rpmem_flush\f[](3) operations can be added to the work queue up to
the size of this queue.
When the work queue is full, any subsequent call must wait until the work
queue is drained.
Among other things, \f[B]rpmem_drain\f[](3) and \f[B]rpmem_persist\f[](3)
also drain the work queue.
.SH DEBUGGING AND ERROR HANDLING
.PP
If an error is detected during the call to a \f[B]librpmem\f[] function,
the application may retrieve an error message describing the reason for
the failure from \f[B]rpmem_errormsg\f[]().
This function returns a pointer to a static buffer containing the last
error message logged for the current thread.
If \f[I]errno\f[] was set, the error message may include a description of
the corresponding error code as returned by \f[B]strerror\f[](3).
The error message buffer is thread\-local; errors encountered in one
thread do not affect its value in other threads.
The buffer is never cleared by any library function; its content is
significant only when the return value of the immediately preceding call
to a \f[B]librpmem\f[] function indicated an error, or if \f[I]errno\f[]
was set.
The application must not modify or free the error message string, but it
may be modified by subsequent calls to other library functions.
.PP
Two versions of \f[B]librpmem\f[] are typically available on a development
system.
The normal version, accessed when a program is linked using the
\f[B]\-lrpmem\f[] option, is optimized for performance.
That version skips checks that impact performance and never logs any trace
information or performs any run\-time assertions.
.PP
A second version of \f[B]librpmem\f[], accessed when a program uses the
libraries under \f[B]/usr/lib/pmdk_debug\f[], contains run\-time
assertions and trace points.
The typical way to access the debug version is to set the environment
variable \f[B]LD_LIBRARY_PATH\f[] to \f[B]/usr/lib/pmdk_debug\f[] or
\f[B]/usr/lib64/pmdk_debug\f[], as appropriate.
Debugging output is controlled using the following environment variables.
These variables have no effect on the non\-debug version of the library.
.RS
.PP
NOTE: On Debian/Ubuntu systems, this extra debug version of the library is
shipped in the respective \f[B]-debug\f[R] Debian package and placed in
the \f[B]/usr/lib/$ARCH/pmdk_dbg/\f[R] directory.
.RE
.IP \[bu] 2
\f[B]RPMEM_LOG_LEVEL\f[]
.PP
The value of \f[B]RPMEM_LOG_LEVEL\f[] enables trace points in the debug
version of the library, as follows:
.IP \[bu] 2
\f[B]0\f[] \- This is the default level when \f[B]RPMEM_LOG_LEVEL\f[] is
not set.
No log messages are emitted at this level.
.IP \[bu] 2
\f[B]1\f[] \- Additional details on any errors detected are logged (in
addition to returning the \f[I]errno\f[]\-based errors as usual).
The same information may be retrieved using \f[B]rpmem_errormsg\f[]().
.IP \[bu] 2
\f[B]2\f[] \- A trace of basic operations is logged.
.IP \[bu] 2
\f[B]3\f[] \- Enables a very verbose amount of function call tracing in
the library.
.IP \[bu] 2
\f[B]4\f[] \- Enables voluminous and fairly obscure tracing information
that is likely only useful to the \f[B]librpmem\f[] developers.
.PP
Unless \f[B]RPMEM_LOG_FILE\f[] is set, debugging output is written to
\f[I]stderr\f[].
.IP \[bu] 2
\f[B]RPMEM_LOG_FILE\f[]
.PP
Specifies the name of a file where all logging information should be
written.
If the last character in the name is \[lq]\-\[rq], the \f[I]PID\f[] of the
current process will be appended to the file name when the log file is
created.
If \f[B]RPMEM_LOG_FILE\f[] is not set, logging output is written to
\f[I]stderr\f[].
.SH EXAMPLE
.PP
The following example uses \f[B]librpmem\f[] to create a remote pool on a
given target node, identified by a given pool set name.
The associated local memory pool is zeroed and the data is made persistent
on the remote node.
Upon success the remote pool is closed.
.IP
.nf
\f[C]
#include\ <assert.h>
#include\ <unistd.h>
#include\ <stdio.h>
#include\ <stdlib.h>
#include\ <string.h>

#include\ <librpmem.h>

#define\ POOL_SIGNATURE\ \ "MANPAGE"
#define\ POOL_SIZE\ \ \ (32\ *\ 1024\ *\ 1024)
#define\ NLANES\ \ \ \ \ \ 4

#define\ DATA_OFF\ \ \ \ 4096
#define\ DATA_SIZE\ \ \ (POOL_SIZE\ \-\ DATA_OFF)

static\ void
parse_args(int\ argc,\ char\ *argv[],\ const\ char\ **target,\ const\ char\ **poolset)
{
\ \ \ \ if\ (argc\ <\ 3)\ {
\ \ \ \ \ \ \ \ fprintf(stderr,\ "usage:\\t%s\ <target>\ <poolset>\\n",\ argv[0]);
\ \ \ \ \ \ \ \ exit(1);
\ \ \ \ }

\ \ \ \ *target\ =\ argv[1];
\ \ \ \ *poolset\ =\ argv[2];
}

static\ void\ *
alloc_memory(void)
{
\ \ \ \ long\ pagesize\ =\ sysconf(_SC_PAGESIZE);
\ \ \ \ if\ (pagesize\ <\ 0)\ {
\ \ \ \ \ \ \ \ perror("sysconf");
\ \ \ \ \ \ \ \ exit(1);
\ \ \ \ }

\ \ \ \ /*\ allocate\ a\ page\ size\ aligned\ local\ memory\ pool\ */
\ \ \ \ void\ *mem;
\ \ \ \ int\ ret\ =\ posix_memalign(&mem,\ (size_t)pagesize,\ POOL_SIZE);
\ \ \ \ if\ (ret)\ {
\ \ \ \ \ \ \ \ fprintf(stderr,\ "posix_memalign:\ %s\\n",\ strerror(ret));
\ \ \ \ \ \ \ \ exit(1);
\ \ \ \ }

\ \ \ \ assert(mem\ !=\ NULL);

\ \ \ \ return\ mem;
}

int
main(int\ argc,\ char\ *argv[])
{
\ \ \ \ const\ char\ *target,\ *poolset;
\ \ \ \ parse_args(argc,\ argv,\ &target,\ &poolset);

\ \ \ \ unsigned\ nlanes\ =\ NLANES;
\ \ \ \ void\ *pool\ =\ alloc_memory();
\ \ \ \ int\ ret;

\ \ \ \ /*\ fill\ pool_attributes\ */
\ \ \ \ struct\ rpmem_pool_attr\ pool_attr;
\ \ \ \ memset(&pool_attr,\ 0,\ sizeof(pool_attr));
\ \ \ \ strncpy(pool_attr.signature,
\ \ \ \ \ \ \ \ \ \ \ \ POOL_SIGNATURE,\ RPMEM_POOL_HDR_SIG_LEN);

\ \ \ \ /*\ create\ a\ remote\ pool\ */
\ \ \ \ RPMEMpool\ *rpp\ =\ rpmem_create(target,\ poolset,\ pool,\ POOL_SIZE,
\ \ \ \ \ \ \ \ \ \ \ \ &nlanes,\ &pool_attr);
\ \ \ \ if\ (!rpp)\ {
\ \ \ \ \ \ \ \ fprintf(stderr,\ "rpmem_create:\ %s\\n",\ rpmem_errormsg());
\ \ \ \ \ \ \ \ return\ 1;
\ \ \ \ }

\ \ \ \ /*\ store\ data\ on\ local\ pool\ */
\ \ \ \ memset(pool,\ 0,\ POOL_SIZE);

\ \ \ \ /*\ make\ local\ data\ persistent\ on\ remote\ node\ */
\ \ \ \ ret\ =\ rpmem_persist(rpp,\ DATA_OFF,\ DATA_SIZE,\ 0,\ 0);
\ \ \ \ if\ (ret)\ {
\ \ \ \ \ \ \ \ fprintf(stderr,\ "rpmem_persist:\ %s\\n",\ rpmem_errormsg());
\ \ \ \ \ \ \ \ return\ 1;
\ \ \ \ }

\ \ \ \ /*\ close\ the\ remote\ pool\ */
\ \ \ \ ret\ =\ rpmem_close(rpp);
\ \ \ \ if\ (ret)\ {
\ \ \ \ \ \ \ \ fprintf(stderr,\ "rpmem_close:\ %s\\n",\ rpmem_errormsg());
\ \ \ \ \ \ \ \ return\ 1;
\ \ \ \ }

\ \ \ \ free(pool);

\ \ \ \ return\ 0;
}
\f[]
.fi
.SH NOTE
.PP
The \f[B]librpmem\f[] API is experimental and may be subject to change in
the future.
However, using the remote replication in \f[B]libpmemobj\f[](7) is safe
and backward compatibility will be preserved.
.RS
.PP
NOTICE: The \f[B]librpmem\f[] library has been deprecated since the PMDK
1.12 release.
If you are interested in remote persistent memory support, please look at
the new library \f[B]rpma\f[] https://github.com/pmem/rpma.
.RE
.SH ACKNOWLEDGEMENTS
.PP
\f[B]librpmem\f[] builds on the persistent memory programming model
recommended by the SNIA NVM Programming Technical Work Group.
.SH SEE ALSO
.PP
\f[B]rpmemd\f[](1), \f[B]ssh\f[](1), \f[B]fork\f[](2),
\f[B]dlclose\f[](3), \f[B]dlopen\f[](3), \f[B]ibv_fork_init\f[](3),
\f[B]rpmem_create\f[](3), \f[B]rpmem_drain\f[](3),
\f[B]rpmem_flush\f[](3), \f[B]rpmem_open\f[](3),
\f[B]rpmem_persist\f[](3), \f[B]strerror\f[](3),
\f[B]limits.conf\f[](5), \f[B]fabric\f[](7), \f[B]fi_sockets\f[](7),
\f[B]fi_verbs\f[](7), \f[B]libpmem\f[](7), \f[B]libpmemblk\f[](7),
\f[B]libpmemlog\f[](7) and \f[B]libpmemobj\f[](7)