LOCK_PROFILING — kernel lock profiling
The LOCK_PROFILING kernel option adds support for
measuring and reporting lock use and contention statistics. These statistics
are collated by “acquisition point”. Acquisition points are
distinct places in the kernel source code (identified by source file name and
line number) where a lock is acquired.
For each acquisition point, the following statistics are accumulated:
- The longest time the lock was ever continuously held
after being acquired at this point.
- The total time the lock was held after being acquired at
this point.
- The total time that threads have spent waiting to
acquire the lock.
- The total number of non-recursive acquisitions.
- The total number of times the lock was already held by
another thread when this point was reached, requiring a spin or a
sleep.
- The total number of times another thread tried to
acquire the lock while it was held after having been acquired at this
point.
In addition, the average hold time and average wait time are derived from the
total hold time and total wait time, respectively, and the number of
acquisitions.
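The accumulation described above can be sketched as a small record type. This is only an illustration of the bookkeeping, not the kernel's implementation: the class name, field names, the record() helper, and the use of integer division for the averages are all assumptions made for the example, and the file name and timings are made up.

```python
from dataclasses import dataclass


@dataclass
class AcquisitionPointStats:
    """Hypothetical per-acquisition-point record (all names are illustrative)."""
    file: str
    line: int
    max_hold: int = 0              # longest continuous hold time
    total_hold: int = 0            # accumulated hold time
    total_wait: int = 0            # accumulated time spent waiting to acquire
    acquisitions: int = 0          # non-recursive acquisitions
    contended_on_entry: int = 0    # lock already held by another thread here
    contended_while_held: int = 0  # other threads tried while held from here

    def record(self, hold, wait=0, was_contended=False, waiters=0):
        """Fold one completed non-recursive acquisition into the record."""
        self.max_hold = max(self.max_hold, hold)
        self.total_hold += hold
        self.total_wait += wait
        self.acquisitions += 1
        if was_contended:
            self.contended_on_entry += 1
        self.contended_while_held += waiters

    @property
    def avg_hold(self):
        # Derived value: total hold time over acquisitions (integer division
        # is an assumption of this sketch).
        return self.total_hold // self.acquisitions if self.acquisitions else 0

    @property
    def avg_wait(self):
        # Derived value: total wait time over acquisitions.
        return self.total_wait // self.acquisitions if self.acquisitions else 0


# Made-up example: two acquisitions at one point, the second one contended.
stats = AcquisitionPointStats("kern_example.c", 42)
stats.record(hold=30)
stats.record(hold=90, wait=20, was_contended=True, waiters=1)
```

Note that the averages are not stored; they are recomputed from the totals and the acquisition count whenever they are read.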
The LOCK_PROFILING kernel option also adds the following
sysctl(8) variables to control and monitor the profiling code:
- Enable or disable the lock profiling code. This defaults to
- Reset the current lock profiling buffers.
- The total number of lock acquisitions recorded.
- The total number of acquisition points recorded. Note that
only active acquisition points (i.e., points that have been reached at
least once) are counted.
- The maximum number of acquisition points the profiling code
is capable of monitoring. Since it would not be possible to call
malloc(9) from within the lock profiling code, this is a
static limit. The number of records can be changed with the
LPROF_BUFFERS kernel option.
- The number of acquisition points that were ignored after
the table filled up.
- The size of the hash table used to map acquisition points
to statistics records. The hash size can be changed with the
LPROF_HASH_SIZE kernel option.
- The number of hash collisions in the acquisition point hash
table.
- The actual profiling statistics in plain text. The columns
are as follows, from left to right:
- The longest continuous hold time in microseconds.
- The longest continuous wait time in microseconds.
- The total (accumulated) hold time in microseconds.
- The total (accumulated) wait time in microseconds.
- The total number of acquisitions.
- The average hold time in microseconds, derived from the
total hold time and the number of acquisitions.
- The average wait time in microseconds, derived from the
total wait time and the number of acquisitions.
- The number of times the lock was held and another
thread attempted to acquire the lock.
- The number of times the lock was already held when this
point was reached.
- The name of the acquisition point, derived from the
source file name and line number, followed by the name of the lock in
parentheses.
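The column layout above lends itself to simple post-processing. The following is a minimal sketch of a parser, assuming the statistics text has nine whitespace-separated integer columns followed by the acquisition point name, in the order listed above. The StatsRow type, its field names, both function names, and the sample line are hypothetical; the real output may contain header lines or formatting not handled here.

```python
from typing import NamedTuple


class StatsRow(NamedTuple):
    """One line of profiling output; field names are illustrative only."""
    max_hold_us: int
    max_wait_us: int
    total_hold_us: int
    total_wait_us: int
    acquisitions: int
    avg_hold_us: int
    avg_wait_us: int
    contended_while_held: int
    contended_on_entry: int
    name: str


def parse_stats_line(line):
    """Parse one statistics line; raise ValueError if it does not fit."""
    # Split off nine numeric columns; the remainder is the name, which may
    # itself contain spaces (for example a lock name in parentheses).
    fields = line.split(None, 9)
    if len(fields) != 10:
        raise ValueError("expected 9 numeric columns and a name")
    numbers = [int(f) for f in fields[:9]]
    return StatsRow(*numbers, fields[9].strip())


def parse_stats(text):
    """Parse all well-formed lines, skipping headers and blank lines."""
    rows = []
    for line in text.splitlines():
        try:
            rows.append(parse_stats_line(line))
        except ValueError:
            continue
    return rows


# Made-up sample input, not real profiling output.
sample = (
    "max wait_max total wait_total count avg wait_avg cnt_hold cnt_lock name\n"
    "   90   20  120   20    2   60   10    1    1 "
    "kern_example.c:42 (example lock)\n"
)
rows = parse_stats(sample)
worst = max(rows, key=lambda r: r.total_wait_us)
```

Sorting or filtering the parsed rows by total wait time, as in the last line, is one way to find the most contended acquisition points.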
Mutex profiling support appeared in FreeBSD 5.0.
Generalized lock profiling support appeared in FreeBSD
code was written by
⟨des@FreeBSD.org⟩ and Robert Watson
⟨rwatson@FreeBSD.org⟩. The LOCK_PROFILING code
was written by Kip Macy
⟨kmacy@FreeBSD.org⟩. This manual page was written by
The LOCK_PROFILING option increases the size of
struct lock_object, so a kernel built with that option
will not work with modules built without it.
The LOCK_PROFILING option also prevents inlining of the
mutex code, which can result in a fairly severe performance penalty. This is,
however, not always the case.
LOCK_PROFILING can
introduce a substantial performance overhead that is easily measurable with
other profiling tools, so combining profiling tools with LOCK_PROFILING
is not recommended.
Measurements are made and stored in nanoseconds using
nanotime(9) (on architectures without a synchronized TSC),
but are presented in microseconds. This should still be sufficient for the
locks one would be most interested in profiling (those that are held long
and/or acquired often).
LOCK_PROFILING should generally not be used in
combination with other debugging options, as the results may be strongly
affected by interactions between the features. In particular,
LOCK_PROFILING will report higher than normal
uma(9) lock contention when run with
INVARIANTS due to extra locking that occurs when
INVARIANTS is present; likewise, using it in
combination with WITNESS will lead to much higher lock
hold times and contention in profiling output.