NAME
hwlatdetect - program to control the kernel hardware latency detection module
SYNOPSIS
hwlatdetect [--duration=<time>] [--threshold=<usecs>]
[--window=<time interval>] [--width=<time interval>]
[--report=<path>] [--cleanup] [--debug] [--quiet]
DESCRIPTION
hwlatdetect is a program that controls the kernel hardware latency
detector module (hwlat_detector.ko). The module is a special purpose kernel
module that is used to detect large system latencies induced by the behavior
of certain underlying hardware or firmware, independent of Linux itself. The
code was developed originally to detect SMIs (System Management Interrupts) on
x86 systems; however, there is nothing x86-specific about this patchset. It was
originally written for use by the "RT" patch, since the Real Time
kernel is highly latency sensitive.
SMIs are usually not serviced by the Linux kernel, which typically does not even
know that they are occurring. SMIs are instead set up by BIOS code and are
serviced by BIOS code, usually for "critical" events such as
management of thermal sensors and fans. Sometimes though, SMIs are used for
other tasks and those tasks can spend an inordinate amount of time in the
handler (sometimes measured in milliseconds). Obviously this is a problem if
you are trying to keep event service latencies down in the microsecond range.
The hardware latency detector module works by hogging all of the CPUs for
configurable amounts of time (by calling stop_machine()), polling the CPU Time
Stamp Counter for some period, then looking for gaps in the TSC data. Any gap
indicates a time when the polling was interrupted, and since the machine is
stopped and interrupts are turned off, the only thing that could do that would
be an SMI.
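The gap search described above can be illustrated in user space. The following is only a sketch, not the module's code: find_gaps() and poll_clock() are hypothetical names, and time.monotonic_ns() stands in for the TSC. Because a user-space process can be preempted by the scheduler and by ordinary interrupts, a gap found this way is not necessarily an SMI; the kernel module avoids that ambiguity by sampling with the machine stopped.

```python
import time


def find_gaps(samples_ns, threshold_us):
    """Return (start_ns, gap_us) for each pair of consecutive samples
    whose spacing exceeds threshold_us microseconds."""
    gaps = []
    for prev, cur in zip(samples_ns, samples_ns[1:]):
        gap_us = (cur - prev) / 1000
        if gap_us > threshold_us:
            gaps.append((prev, gap_us))
    return gaps


def poll_clock(duration_s):
    """Busy-poll a monotonic clock for duration_s seconds and return
    every reading, in nanoseconds (a stand-in for reading the TSC)."""
    end = time.monotonic_ns() + int(duration_s * 1e9)
    samples = []
    while True:
        now = time.monotonic_ns()
        samples.append(now)
        if now >= end:
            return samples


if __name__ == "__main__":
    # Poll for 10 ms and report any gap larger than 10 microseconds.
    for start_ns, gap_us in find_gaps(poll_clock(0.01), threshold_us=10):
        print(f"gap of {gap_us:.1f} us starting at {start_ns} ns")
```

Keeping the polling and the gap search separate means the search can be checked against a hand-written list of timestamps.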
The hwlatdetect script manages the mounting/unmounting of debugfs as well
as the loading/unloading of the hwlat_detector module. If debugfs is
already mounted then hwlatdetect will not unmount it after a run. Likewise,
if the hwlat_detector module is already loaded, it will not be unloaded after
a run.
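The "leave things as they were found" rule above amounts to recording the state before a run and undoing only what the run itself set up. The sketch below shows one way to express that decision; it is not taken from the script. The function names are hypothetical, and reading the state from /proc/mounts and /proc/modules is an assumption about how the check could be made.

```python
def debugfs_mountpoint(proc_mounts_text):
    """Return the first debugfs mount point found in /proc/mounts text,
    or None if debugfs is not mounted."""
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, ...
        if len(fields) >= 3 and fields[2] == "debugfs":
            return fields[1]
    return None


def module_loaded(proc_modules_text, name="hwlat_detector"):
    """Return True if name is the first field of a /proc/modules line."""
    return any(line.split()[:1] == [name]
               for line in proc_modules_text.splitlines())


def plan_cleanup(proc_mounts_text, proc_modules_text):
    """Decide what to undo after a run: only what was absent beforehand."""
    return {
        "unmount_debugfs": debugfs_mountpoint(proc_mounts_text) is None,
        "unload_module": not module_loaded(proc_modules_text),
    }


if __name__ == "__main__":
    with open("/proc/mounts") as mounts, open("/proc/modules") as modules:
        print(plan_cleanup(mounts.read(), modules.read()))
```

The plan has to be computed before anything is mounted or loaded; computed afterwards, it would always say that nothing needs to be undone.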
OPTIONS
- --duration=<time>{s,m,h,d}
- Run the detector logic for the specified duration. The
duration is a base 10 integer that is interpreted as seconds by default.
An optional suffix may be specified to indicate minutes, hours or days.
- --threshold=<microsecond value>
- Specify the TSC gap used to detect an SMI. Any gap value
greater than <threshold> is considered to be the result of an SMI
occurring.
- --window=<time value>{us,ms,s,m,d}
- Specify the size of the sample window. Converted to
microseconds when passed to the kernel module.
- --width=<time value>{us,ms,s,m,d}
- The amount of time within the sample window where the
detector is actually sampling. Must be less than the --window value.
- --report=FILENAME
- Specify the output filename of the detector report. The default
behavior is to print to standard output.
- --cleanup
- Force unloading of hwlat_detector.ko and unmounting of the debugfs
filesystem.
- --debug
- Turn on debug prints.
- --quiet
- Turn off all information prints.
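The time suffixes accepted by --duration, --window and --width can be illustrated with a small conversion sketch. This is not the script's own parser: to_seconds(), to_microseconds() and check_window() are hypothetical names. The "h" suffix follows the description of --duration, which mentions hours, and treating an unsuffixed --window or --width value as microseconds is an assumption.

```python
def to_seconds(value):
    """Convert a --duration style string ('90', '90s', '5m', '2h', '1d')
    to seconds. A bare number is taken as seconds."""
    factors = {"s": 1, "m": 60, "h": 3600, "d": 86400}
    if value[-1] in factors:
        return int(value[:-1]) * factors[value[-1]]
    return int(value)


def to_microseconds(value):
    """Convert a --window/--width style string to microseconds.
    Assumption: a bare number is already in microseconds."""
    # Two-letter suffixes are tested first so that 'ms' and 'us'
    # are not mistaken for 's'.
    factors = [("us", 1), ("ms", 1_000), ("s", 1_000_000),
               ("m", 60_000_000), ("d", 86_400_000_000)]
    for suffix, factor in factors:
        if value.endswith(suffix):
            return int(value[:-len(suffix)]) * factor
    return int(value)


def check_window(window, width):
    """Return (window_us, width_us), enforcing width < window."""
    window_us, width_us = to_microseconds(window), to_microseconds(width)
    if width_us >= window_us:
        raise ValueError("--width must be less than --window")
    return window_us, width_us


if __name__ == "__main__":
    print(to_seconds("5m"))               # 300
    print(check_window("1s", "500ms"))    # (1000000, 500000)
```

With these values the detector would sample for half of every one-second window.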
AUTHOR
hwlatdetect was written by Clark Williams <williams@redhat.com>
hwlat_detector.ko was written by Jon Masters <jcm@redhat.com>