UMA(9)                  Kernel Developer's Manual                  UMA(9)
UMA - general-purpose kernel object allocator
typedef int (*uma_ctor)(void *mem, int size, void *arg, int flags);
typedef void (*uma_dtor)(void *mem, int size, void *arg);
typedef int (*uma_init)(void *mem, int size, int flags);
typedef void (*uma_fini)(void *mem, int size);
typedef int (*uma_import)(void *arg, void **store, int count, int domain, int flags);
typedef void (*uma_release)(void *arg, void **store, int count);
typedef void *(*uma_alloc)(uma_zone_t zone, vm_size_t size, int domain, uint8_t *pflag, int wait);
typedef void (*uma_free)(void *item, vm_size_t size, uint8_t pflag);
uma_zcreate(char *name, int size, uma_ctor ctor, uma_dtor dtor, uma_init zinit, uma_fini zfini, int align, uint16_t flags);
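uma_zcache_create(char *name, int size, uma_ctor ctor, uma_dtor dtor, uma_init zinit, uma_fini zfini, uma_import zimport, uma_release zrelease, void *arg, int flags);
uma_zsecond_create(char *name, uma_ctor ctor, uma_dtor dtor, uma_init zinit, uma_fini zfini, uma_zone_t master);
uma_zalloc_arg(uma_zone_t zone, void *arg, int flags);
uma_zalloc_domain(uma_zone_t zone, void *arg, int domain, int flags);
uma_zalloc_pcpu_arg(uma_zone_t zone, void *arg, int flags);
uma_zfree_arg(uma_zone_t zone, void *item, void *arg);
uma_zfree_domain(uma_zone_t zone, void *item, void *arg);
uma_zfree_pcpu_arg(uma_zone_t zone, void *item, void *arg);
uma_zone_set_warning(uma_zone_t zone, const char *warning);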
UMA (Universal Memory Allocator) provides an efficient interface for managing dynamically-sized collections of items of identical size, referred to as zones. Zones keep track of which items are in use and which are not, and UMA provides functions for allocating items from a zone and for releasing them back, making them available for subsequent allocation requests. Zones maintain per-CPU caches with linear scalability on SMP systems as well as round-robin and first-touch policies for NUMA systems. The number of items cached per CPU is bounded, and each zone additionally maintains an unbounded cache of items that is used to quickly satisfy per-CPU cache allocation misses.
Two types of zones exist: regular zones and cache zones. In a regular zone, items are allocated from a slab, which is one or more virtually contiguous memory pages that have been allocated from the kernel's page allocator. Internally, slabs are managed by a UMA keg, which is responsible for allocating slabs and keeping track of their usage by one or more zones. In typical usage, there is one keg per zone, so slabs are not shared among multiple zones.
Regular zones import items from a keg, and release items back to that keg if requested. Cache zones do not have a keg, and instead use custom import and release methods. For example, some collections of kernel objects are statically allocated at boot-time, and the size of the collection does not change. A cache zone can be used to implement an efficient allocator for the objects in such a collection.
The uma_zcreate() and uma_zcache_create() functions create a new
regular zone and cache zone, respectively. The uma_zsecond_create()
function creates a regular zone which shares the keg of the zone specified
by the master argument. The name
argument is a text name of the zone for debugging and stats; this memory
should not be freed until the zone has been deallocated.
The ctor and
dtor arguments are callback functions that are called
by the UMA subsystem at the time of the call to uma_zalloc() and
uma_zfree() respectively. Their purpose is to
provide hooks for work that must be done each time an item is allocated
or released. A good use for the
ctor and dtor callbacks might be
to initialize a data structure embedded in the item, such as a
queue(3) head.
The zinit and
zfini arguments are used to optimize the allocation of
items from the zone. They are called by the UMA subsystem whenever it needs
to allocate or free items to satisfy requests or memory pressure. A good use
for the zinit and zfini
callbacks might be to initialize and destroy a mutex contained within an
item. This would allow one to avoid destroying and re-initializing the mutex
each time the item is freed and re-allocated. They are not called on each
call to uma_zalloc() and uma_zfree() but rather when an item is imported
into a zone's cache, and when a zone releases an item to the slab allocator,
typically as a response to memory pressure.
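The division of labour between the two callback pairs can be sketched as follows. This is a userland illustration only: struct foo, its fields, and the lock_ready flag standing in for an embedded mutex are hypothetical, and only the function signatures are taken from the typedefs above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical item type; lock_ready stands in for an embedded mutex. */
struct foo {
	int	lock_ready;	/* set up once by init, torn down by fini */
	int	refs;		/* reset on every allocation by ctor */
	char	label[16];
};

/* uma_init-style: runs when an item is imported into the zone's cache. */
int
foo_init(void *mem, int size, int flags)
{
	struct foo *f = mem;

	(void)flags;
	assert((size_t)size >= sizeof(*f));
	f->lock_ready = 1;	/* a kernel consumer might call mtx_init() here */
	return (0);
}

/* uma_fini-style: runs when the item goes back to the slab allocator. */
void
foo_fini(void *mem, int size)
{
	struct foo *f = mem;

	(void)size;
	f->lock_ready = 0;	/* ... and mtx_destroy() here */
}

/* uma_ctor-style: runs on every allocation; arg comes from the caller. */
int
foo_ctor(void *mem, int size, void *arg, int flags)
{
	struct foo *f = mem;

	(void)size;
	(void)flags;
	f->refs = 1;
	strncpy(f->label, arg != NULL ? (const char *)arg : "",
	    sizeof(f->label) - 1);
	f->label[sizeof(f->label) - 1] = '\0';
	return (0);
}

/* uma_dtor-style: runs on every free; the lock state is left untouched. */
void
foo_dtor(void *mem, int size, void *arg)
{
	struct foo *f = mem;

	(void)size;
	(void)arg;
	f->refs = 0;
}
```

In this sketch an item can pass through foo_ctor() and foo_dtor() many times between a single foo_init() and foo_fini(), which is the saving the paragraph above describes.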
For uma_zcache_create(), the zimport and zrelease
functions are called to import items into the zone and to release items from
the zone, respectively. The zimport function should
store pointers to items in the store array, which
contains a maximum of count entries. The function must
return the number of imported items, which may be less than the maximum.
Similarly, the store parameter to the
zrelease function contains an array of
count pointers to items. The arg
parameter passed to
uma_zcache_create() is provided
to the import and release functions. The domain
parameter to zimport specifies the requested
numa(4) domain for the allocation. It is either a NUMA
domain number or the special value UMA_ANYDOMAIN.
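As an illustration of the import and release contract, the following userland sketch backs a pair of callbacks with a fixed pool, in the spirit of the boot-time collection mentioned earlier. struct obj, struct obj_pool, POOL_SIZE and pool_setup() are hypothetical; only the callback signatures come from the uma_import and uma_release typedefs. The sketch ignores the domain argument and omits any locking a real pool would need.

```c
#include <assert.h>
#include <stddef.h>

#define	POOL_SIZE	4	/* hypothetical size of the static collection */

struct obj {
	int	id;
};

/* Hypothetical fixed pool: storage plus a stack of free entries. */
struct obj_pool {
	struct obj	 objs[POOL_SIZE];
	struct obj	*free[POOL_SIZE];
	int		 nfree;
};

void
pool_setup(struct obj_pool *p)
{
	for (int i = 0; i < POOL_SIZE; i++) {
		p->objs[i].id = i;
		p->free[i] = &p->objs[i];
	}
	p->nfree = POOL_SIZE;
}

/* uma_import-style: fill store[] with up to count items, return how many. */
int
pool_import(void *arg, void **store, int count, int domain, int flags)
{
	struct obj_pool *p = arg;
	int n;

	(void)domain;	/* this pool is not NUMA-aware */
	(void)flags;
	for (n = 0; n < count && p->nfree > 0; n++)
		store[n] = p->free[--p->nfree];
	return (n);
}

/* uma_release-style: take back count items from store[]. */
void
pool_release(void *arg, void **store, int count)
{
	struct obj_pool *p = arg;

	for (int i = 0; i < count; i++) {
		assert(p->nfree < POOL_SIZE);
		p->free[p->nfree++] = store[i];
	}
}
```

Returning fewer items than requested, as pool_import() does once the pool runs dry, is permitted by the contract described above.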
The flags argument of uma_zcreate() and
uma_zcache_create() is a subset of the following flags:
- UMA_ZONE_NOFREE: Slabs allocated to the zone's keg are never freed.
- UMA_ZONE_NODUMP: Pages belonging to the zone will not be included in minidumps.
- UMA_ZONE_PCPU: An allocation from the zone has mp_ncpu shadow
copies, each privately assigned to a CPU. A CPU can address its private
copy using the base allocation address plus a multiple of the current CPU
ID and sizeof(struct pcpu):
    foo_zone = uma_zcreate(..., UMA_ZONE_PCPU);
     ...
    foo_base = uma_zalloc(foo_zone, ...);
     ...
    critical_enter();
    foo_pcpu = (foo_t *)zpcpu_get(foo_base);
    /* do something with foo_pcpu */
    critical_exit();
Note that M_ZERO cannot be used when allocating items from a PCPU zone. To obtain zeroed memory from a PCPU zone, use the
uma_zalloc_pcpu() function and its variants instead, and pass M_ZERO.
- UMA_ZONE_OFFPAGE: By default, book-keeping of items within a slab is done
in the slab page itself. This flag explicitly tells the subsystem that the
book-keeping structure should be allocated separately from a special
internal zone. This flag requires either UMA_ZONE_VTOSLAB or
UMA_ZONE_HASH, since the subsystem requires a mechanism to find the book-keeping structure of an item being freed. The subsystem may choose to prefer offpage book-keeping for certain zones implicitly.
- UMA_ZONE_ZINIT: The zone will have its uma_init method set to an
internal method that initializes a newly allocated slab to all zeros. Do not
confuse the uma_init method with
uma_ctor: a zone with the
UMA_ZONE_ZINIT flag does not return zeroed memory on every uma_zalloc().
- UMA_ZONE_HASH: The zone should use an internal hash table to find the slab book-keeping structure to which an allocation being freed belongs.
- UMA_ZONE_VTOSLAB: The zone should use a special field of vm_page_t to find the slab book-keeping structure to which an allocation being freed belongs.
- UMA_ZONE_MALLOC: The zone is for the malloc(9) subsystem.
- UMA_ZONE_VM: The zone is for the VM subsystem.
- UMA_ZONE_NUMA: The zone should use a first-touch NUMA policy rather than
the round-robin default. If the
UMA_FIRSTTOUCH kernel option is configured, all zones implicitly use a first-touch policy, and the
UMA_ZONE_NUMA flag has no effect. The
UMA_XDOMAIN kernel option, when configured, causes UMA to do the extra tracking to ensure that allocations from first-touch zones are always local. Otherwise, consumers that do not free memory on the same domain from which it was allocated will cause mixing in per-CPU caches. See numa(4) for more details.
Zones can be destroyed using uma_zdestroy(),
freeing all memory that is cached in the zone. All items allocated from the
zone must be freed to the zone before the zone may be safely destroyed.
To allocate an item from a zone, simply call uma_zalloc()
with a pointer to that zone and set the flags argument
to selected flags as documented in malloc(9). It will
return a pointer to an item if successful, or NULL
in the rare case where all items in the zone are in use, the allocator is
unable to grow the zone, and M_NOWAIT is specified.
Items are released back to the zone from which they
were allocated by calling uma_zfree()
with a pointer to the zone and a pointer to the item. If
item is NULL, then uma_zfree() does nothing.
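The two edge cases above, a NULL return when nothing can be allocated and a NULL item being ignored on free, can be pictured with a toy model. The names toy_zone, toy_zalloc() and toy_zfree() are hypothetical and are not part of UMA; the sketch only mirrors the documented behaviour of a no-wait allocation from a zone that cannot grow.

```c
#include <assert.h>
#include <stddef.h>

#define	TOY_NITEMS	2	/* hypothetical fixed capacity */

/* Toy stand-in for a zone that cannot grow: a few slots and in-use bits. */
struct toy_zone {
	long	slots[TOY_NITEMS];
	int	used[TOY_NITEMS];
};

/* Like a no-wait allocation: hand out a free slot or return NULL. */
void *
toy_zalloc(struct toy_zone *z)
{
	for (int i = 0; i < TOY_NITEMS; i++) {
		if (!z->used[i]) {
			z->used[i] = 1;
			return (&z->slots[i]);
		}
	}
	return (NULL);	/* every item is in use and the zone cannot grow */
}

/* Like uma_zfree(): a NULL item is silently ignored. */
void
toy_zfree(struct toy_zone *z, void *item)
{
	if (item == NULL)
		return;
	ptrdiff_t i = (long *)item - z->slots;
	assert(i >= 0 && i < TOY_NITEMS && z->used[i]);
	z->used[i] = 0;
}
```

Callers that allocate without being able to sleep should therefore check the result for NULL, while error paths may pass a possibly-NULL pointer to the free routine without a guard.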
The variants uma_zalloc_arg() and uma_zfree_arg()
allow callers to specify an argument for the ctor and
dtor functions of the zone, respectively. The uma_zalloc_domain()
function allows callers to specify a fixed numa(4) domain
to allocate from. This uses a guaranteed but slow path in the allocator
which reduces concurrency. The uma_zfree_domain()
function should be used to return memory allocated in this fashion. This
function infers the domain from the pointer and does not require it as an
argument.
The uma_prealloc()
function allocates slabs for the requested number of items, typically
following the initial creation of a zone. Subsequent allocations from the
zone will be satisfied using the pre-allocated slabs. Note that slab
allocation is performed with the
M_WAITOK flag, so
uma_prealloc() may sleep.
The uma_zone_reserve()
function sets the number of reserved items for the zone.
uma_zalloc() and variants will ensure that the zone
contains at least the reserved number of free items. Reserved items may be
allocated by specifying
M_USE_RESERVE in the
allocation request flags. uma_zone_reserve() does
not perform any pre-allocation by itself.
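The reserve rule can be stated as a small predicate. This is a model of the behaviour described above, not the kernel's code; reserve_permits() and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether one more item may be handed out.  Ordinary requests must
 * leave the reserve intact; requests flagged as reserve users (the role
 * M_USE_RESERVE plays in the text) may dip into it.
 */
bool
reserve_permits(int free_items, int reserved, bool use_reserve)
{
	assert(free_items >= 0 && reserved >= 0);
	if (use_reserve)
		return (free_items > 0);
	return (free_items > reserved);
}
```

For example, with two free items and a reserve of two, an ordinary request is refused in this model while a reserve-using request succeeds.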
The uma_zone_reserve_kva()
function pre-allocates kernel virtual address space for the requested number
of items. Subsequent allocations from the zone will be satisfied using the
pre-allocated address space. Note that unlike uma_zone_reserve(),
uma_zone_reserve_kva() does not restrict the use of
the pre-allocation to
M_USE_RESERVE requests.
The uma_zone_set_allocf() and uma_zone_set_freef()
functions allow a zone's default slab allocation and free functions to be
overridden. This is useful if the zone's items have special memory
allocation constraints. For example, if multi-page objects are required to
be physically contiguous, an allocf function which
requests contiguous memory from the kernel's page allocator may be used.
The uma_zone_set_max()
function limits the number of items (and therefore memory) that can be
allocated to the zone. The nitems
argument specifies the requested upper limit on the number of items. The
effective limit is returned to the caller, as it may end up being higher than
requested due to the implementation rounding up to ensure all memory pages
allocated to the zone are utilised to capacity. The limit applies to the
total number of items in the zone, which includes allocated items, free
items and free items in the per-CPU caches. On systems with more than one
CPU it may not be possible to allocate the specified number of items even
when there is no shortage of memory, because all of the remaining free items
may be in the caches of the other CPUs when the limit is hit.
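The rounding can be pictured as follows. This is a worked illustration of rounding a request up to a whole number of slabs, not the kernel's actual computation; effective_limit() and items_per_slab are hypothetical, and the number of items that fit in a slab depends on the zone.

```c
#include <assert.h>

/*
 * Round a requested item limit up to a whole number of slabs.
 * items_per_slab is a hypothetical, zone-specific value.
 */
int
effective_limit(int nitems, int items_per_slab)
{
	assert(nitems >= 0 && items_per_slab > 0);
	int slabs = (nitems + items_per_slab - 1) / items_per_slab;
	return (slabs * items_per_slab);
}
```

Under these assumptions, a request for 100 items in a zone holding 32 items per slab yields an effective limit of 128, which is why callers should use the returned value rather than the requested one.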
The uma_zone_set_maxcache()
function limits the number of free items which may be cached in the zone,
excluding the per-CPU caches, which are bounded in size. For example, to
implement a ‘pure’ per-CPU cache, a
cache zone may be configured with a maximum cache size of 0.
The uma_zone_get_max()
function returns the effective upper limit on the number of items for a zone.
The uma_zone_get_cur()
function returns an approximation of the number of items currently allocated
from the zone. The returned value is approximate because appropriate
synchronisation to determine an exact value is not performed by the
implementation. This ensures low overhead at the expense of potentially
stale data being used in the calculation.
The uma_zone_set_warning()
function sets a warning that will be printed on the system console when the
given zone becomes full and fails to allocate an item. The warning will be
printed no more often than every five minutes. Warnings can be turned off
globally by setting the vm.zone_warnings sysctl
tunable to 0.
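The rate limit and the global switch can be modelled in a few lines. This is a userland sketch of the behaviour described above, not the kernel's code; struct warn_state and warn_allowed() are hypothetical, and the warnings_enabled parameter stands in for the vm.zone_warnings setting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

#define	WARN_INTERVAL	(5 * 60)	/* seconds between warnings */

/* Hypothetical per-zone state: when the last warning was printed. */
struct warn_state {
	bool	printed;
	time_t	last;
};

/* Return true if a warning may be printed at time now, and record it. */
bool
warn_allowed(struct warn_state *ws, time_t now, int warnings_enabled)
{
	assert(ws != NULL);
	if (!warnings_enabled)	/* models vm.zone_warnings set to 0 */
		return (false);
	if (ws->printed && now - ws->last < WARN_INTERVAL)
		return (false);
	ws->printed = true;
	ws->last = now;
	return (true);
}
```

In this model a zone that stays full produces at most one console message per five-minute window, no matter how many allocations fail in between.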
The uma_zone_set_maxaction()
function sets a function that will be called when the given zone becomes
full and fails to allocate an item. The function will be called with the
zone locked. Also, the caller of the allocation function may hold
additional locks. Therefore, this function should do very little work
(similar to a signal handler).
The SYSCTL_UMA_MAX(parent, nbr, name, access, zone,
descr) macro declares a static
sysctl(9) oid that exports the effective upper limit
on the number of items for a zone. The zone argument should
be a pointer to uma_zone_t. A read of the oid returns
the value obtained through
uma_zone_get_max(). A write
to the oid sets a new value via uma_zone_set_max().
The SYSCTL_ADD_UMA_MAX(ctx, parent, nbr, name, access,
zone, descr) macro is provided
to create this type of oid dynamically.
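The SYSCTL_UMA_CUR(parent, nbr, name, access, zone,
descr) macro declares a static read-only
sysctl(9) oid that exports the approximate current
occupancy of the zone. The zone argument should be a
pointer to uma_zone_t. A read of the oid returns the value
obtained through uma_zone_get_cur().
The SYSCTL_ADD_UMA_CUR(ctx, parent, nbr, name, access, zone,
descr) macro is provided to create this type of oid
dynamically.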
The memory that these allocation calls return is not executable. The
uma_zalloc() function does not support the
M_EXEC flag to allocate executable memory. Not all
platforms enforce a distinction between executable and non-executable
memory.
Jeff Bonwick, The Slab Allocator: An Object-Caching Kernel Memory Allocator, 1994.
The zone allocator first appeared in FreeBSD 3.0. It was radically changed in FreeBSD 5.0 to function as a slab allocator.
The zone allocator was written by John S. Dyson. The zone allocator was rewritten in large parts by Jeff Roberson <jeff@FreeBSD.org> to function as a slab allocator.
August 20, 2020                                                    Debian