STAPPROBES(3stap)
NAME
stapprobes - systemtap probe points
DESCRIPTION
The following sections enumerate the variety of probe points supported by the systemtap translator, and some of the additional aliases defined by standard tapset scripts. Many are individually documented in the 3stap manual section, with the probe:: prefix.
The general probe point syntax is a dotted-symbol sequence. This allows a breakdown of the event namespace into parts, somewhat like the Domain Name System does on the Internet. Each component identifier may be parametrized by a string or number literal, with a syntax like a function call. A component may include a "*" character, to expand to a set of matching probe points. It may also include "**" to match multiple sequential components at once. Probe aliases likewise expand to other probe points.
Each resulting probe point is normally resolved to some low-level system instrumentation facility (e.g., a kprobe address, marker, or a timer configuration); otherwise the elaboration phase will fail.
However, a probe point may be followed by a "?" character, to indicate that it is optional, and that no error should result if it fails to resolve. Optionality passes down through all levels of alias/wildcard expansion.
Alternately, a probe point may be followed by a "!" character, to indicate that it is both optional and sufficient. (Think vaguely of the Prolog cut operator.) If it does resolve, then no further probe points in the same comma-separated list will be resolved. Therefore, the "!" sufficiency mark only makes sense in a list of probe point alternatives.
Additionally, a probe point may be followed by an "if (expr)" statement, in order to enable or disable the probe point on the fly. If "expr" is false when the probe point is hit, the whole probe body, including any alias's body, is skipped. The condition is stacked up through all levels of alias/wildcard expansion, so the final condition becomes the logical AND of the conditions of all expanded aliases and wildcards.
For example, the following are all syntactically valid probe points:
kernel.function("foo").return
process("/bin/vi").statement(0x2222)
end
syscall.*
sys**open
kernel.function("no_such_function") ?
module("awol").function("no_such_function") !
signal.*? if (switch)
kprobe.function("foo")
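The "?", "!" and "if (expr)" suffixes described above can be tried out in a short script. This is a minimal sketch rather than a recommended pattern: the global name tracing is hypothetical, "no_such_function" is assumed not to exist, and the kernel is assumed to provide at least one of the syscall.openat or syscall.open aliases.

```systemtap
# Hypothetical global flag used by the "if" condition below.
global tracing = 1

# "?" makes the first probe point optional: if it fails to resolve,
# no error results and only "begin" is instrumented.
probe kernel.function("no_such_function") ?, begin
{
  printf("armed at %s\n", pp())
}

# "!" makes the first alternative sufficient: if syscall.openat
# resolves, syscall.open is not resolved at all.
probe syscall.openat !, syscall.open
{
  if (tracing)
    printf("%s: %s(%s)\n", execname(), name, argstr)
}

# The condition is checked each time the probe point is hit;
# the body is skipped while "tracing" is zero.
probe timer.s(1) if (tracing)
{
  printf("tick\n")
}

probe timer.s(3) { tracing = 0 }   # disables the tick probe on the fly
probe timer.s(6) { exit() }
```

The condition after "if" refers only to a global variable here, which keeps the example simple.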
DWARF DEBUGINFO
Resolving some probe points requires DWARF debuginfo or "debug symbols" for the specific part being instrumented. For some others, DWARF is automatically synthesized on the fly from source code header files. For others, it is not needed at all. Since a systemtap script may use any mixture of probe points together, the union of their DWARF requirements has to be met on the computer where script compilation occurs. (See the --use-server option and the stap-server(8) man page for information about the remote compilation facility, which allows these requirements to be met on a different machine.)
The following table lists many of the available probe point families, to classify them with respect to their need for DWARF debuginfo.
DWARF | NON-DWARF
kernel.function, .statement | kernel.mark
module.function, .statement | process.mark
process.function, .statement | begin, end, error, never
process.mark (backup) | timer
 | perf
 | procfs
AUTO-DWARF | kernel.statement.absolute
 | kernel.data
kernel.trace | kprobe.function
 | process.statement.absolute
 | process.begin, .end, .error
PROBE POINT FAMILIES
BEGIN/END/ERROR
The probe points begin and end are defined by the translator to refer to the time of session startup and shutdown. All "begin" probe handlers are run, in some sequence, during the startup of the session. All global variables will have been initialized prior to this point. All "end" probes are run, in some sequence, during the normal shutdown of a session, such as in the aftermath of an exit() function call, or an interruption from the user. In the case of an error-triggered shutdown, "end" probes are not run. There are no target variables available in either context.
If the order of execution among "begin" or "end" probes is significant, then an optional sequence number may be provided:
begin(N)
end(N)
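A small script can show the sequence numbers in use. This sketch assumes that handlers run in increasing order of N, which the text above does not spell out; the global name start_ms is hypothetical.

```systemtap
# Hypothetical global; initialized before any "begin" handler runs.
global start_ms

# Assumed ordering: the handler with the lower sequence number runs first.
probe begin(-1) { start_ms = gettimeofday_ms() }
probe begin(1)  { printf("session started at %d ms\n", start_ms) }

# End the session normally after two seconds so the "end" probe runs.
probe timer.ms(2000) { exit() }

probe end { printf("session ran for %d ms\n", gettimeofday_ms() - start_ms) }
```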
NEVER
The probe point never is specially defined by the translator to mean "never". Its probe handler is never run, though its statements are analyzed for symbol / type correctness as usual. This probe point may be useful in conjunction with optional probes.
SYSCALL
The syscall.* aliases define several hundred probes, too many to summarize here. They take the following forms:
syscall.NAME
syscall.NAME.return
These aliases generally define variables such as:
- argstr
- A pretty-printed form of the entire argument list, without parentheses.
- name
- The name of the system call.
- retstr
- For return probes, a pretty-printed form of the system-call result.
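The variables listed above can be printed from a pair of entry and return probes. This is a sketch under the assumption that the running kernel provides the syscall.openat alias; the filter on target() limits output to the command started with stap -c.

```systemtap
# Entry probe: "name" and "argstr" are set by the alias.
probe syscall.openat
{
  if (pid() == target())
    printf("%s(%s)\n", name, argstr)
}

# Return probe: "retstr" holds the pretty-printed result.
probe syscall.openat.return
{
  if (pid() == target())
    printf("%s = %s\n", name, retstr)
}
```

One way to run it is stap -c "ls /" script.stp, where script.stp is a placeholder file name.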
TIMERS
Intervals defined by the standard kernel "jiffies" timer may be used to trigger probe handlers asynchronously. Two probe point variants are supported by the translator:
timer.jiffies(N)
timer.jiffies(N).randomize(M)
timer.ms(N)
timer.ms(N).randomize(M)
timer.profile
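As an illustration of the timer forms listed above, the following sketch counts how often a randomized jiffies timer fires within a fixed wall-clock window. The global name ticks and the interval values are arbitrary choices for the example.

```systemtap
# Hypothetical counter for the jiffies-based timer.
global ticks

# Fires every 100 jiffies, with the interval randomized by up to 20.
probe timer.jiffies(100).randomize(20) { ticks++ }

# Millisecond-based timer used to stop the session after three seconds.
probe timer.ms(3000)
{
  printf("%d jiffies-timer ticks in 3 s\n", ticks)
  exit()
}
```

The count depends on the kernel's jiffies rate, so no particular output should be expected.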
DWARF
This family of probe points uses symbolic debugging information for the target kernel/module/program, as may be found in unstripped executables, or the separate debuginfo packages. They allow placement of probes logically into the execution path of the target program, by specifying a set of points in the source or object code. When a matching statement executes on any processor, the probe handler is run in that context.
Points in a kernel are identified by module, source file, line number, function name, or some combination of these.
Here is a list of probe point families currently supported. The .function variant places a probe near the beginning of the named function, so that parameters are available as context variables. The .return variant places a probe at the moment after the return from the named function, so the return value is available as the "$return" context variable. The .inline modifier for .function filters the results to include only instances of inlined functions. The .call modifier selects the opposite subset. The .exported modifier filters the results to include only exported functions. Inline functions do not have an identifiable return point, so .return is not supported on .inline probes. The .statement variant places a probe at the exact spot, exposing those local variables that are visible there.
kernel.function(PATTERN)
kernel.function(PATTERN).call
kernel.function(PATTERN).return
kernel.function(PATTERN).inline
kernel.function(PATTERN).label(LPATTERN)
module(MPATTERN).function(PATTERN)
module(MPATTERN).function(PATTERN).call
module(MPATTERN).function(PATTERN).return
module(MPATTERN).function(PATTERN).inline
module(MPATTERN).function(PATTERN).label(LPATTERN)
kernel.statement(PATTERN)
kernel.statement(ADDRESS).absolute
module(MPATTERN).statement(PATTERN)
process("PATH").function("NAME")
process("PATH").statement("*@FILE.c:123")
process("PATH").library("PATH").function("NAME")
process("PATH").library("PATH").statement("*@FILE.c:123")
process("PATH").function("*").return
process("PATH").function("myfun").label("foo")
process(PID).statement(ADDRESS).absolute
In the list above, PATTERN stands for a string literal that identifies a point in the program. It is made up of three parts:
- The first part is the name of a function, as would appear in the nm program's output. This part may use the "*" and "?" wildcarding operators to match multiple names.
- The second part is optional and begins with the "@" character. It is followed by the path to the source file containing the function, which may include a wildcard pattern, such as mm/slab*. If it does not match as is, an implicit "*/" is optionally added before the pattern, so that a script need only name the last few components of a possibly long source directory path.
- Finally, the third part is optional if the file name part was given, and identifies the line number in the source file preceded by a ":" or a "+". The line number is assumed to be an absolute line number if preceded by a ":", or relative to the entry of the function if preceded by a "+". All the lines in the function can be matched with ":*". A range of lines x through y can be matched with ":x-y".
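The three parts of a pattern can be combined as in the following sketch. It assumes a kernel with debuginfo installed in which vfs_read is defined in fs/read_write.c; the file-qualified probes are marked optional because source layout differs between kernel versions.

```systemtap
# Function name only (first part of the pattern).
probe kernel.function("vfs_read").call
{
  printf("%s entered %s\n", execname(), pp())
}

# The return probe exposes the function's result as $return.
probe kernel.function("vfs_read").return
{
  printf("vfs_read returned %d\n", $return)
}

# Name plus "@" source file part; the name uses a wildcard.
probe kernel.function("vfs_w*@fs/read_write.c") ?
{
  printf("hit %s\n", pp())
}

# Name, file, and a "+" line offset relative to the function entry.
probe kernel.statement("vfs_read@fs/read_write.c+1") ?
{
  printf("statement probe at %s\n", pp())
}

# Stop after one second; these probes fire very often.
probe timer.s(1) { exit() }
```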
CONTEXT VARIABLES
Many of the source-level context variables, such as function parameters, locals, and globals visible in the compilation unit, may be visible to probe handlers. Scripts refer to these variables by prefixing their names with "$". In addition, a special syntax allows limited traversal of structures, pointers, and arrays. More syntax allows pretty-printing of individual variables or their groups. See also @cast.
- $var
- refers to an in-scope variable "var". If it's an integer-like type, it will be cast to a 64-bit int for systemtap script use. String-like pointers (char *) may be copied to systemtap string values using the kernel_string or user_string functions.
- $var->field
- traversal via a structure's or a pointer's field. This generalized indirection operator may be repeated to follow more levels. Note that the . operator is not used for plain structure members, only -> for both purposes. (This is because "." is reserved for string concatenation.)
- $return
- is available in return probes only for functions that are declared with a return value.
- $var[N]
- indexes into an array. The index may be given as a literal number or as an arbitrary numeric expression.
- $$vars
- expands to a character string that is equivalent to
sprintf("parm1=%x ... parmN=%x var1=%x ... varN=%x", parm1, ..., parmN, var1, ..., varN)
- $$locals
- expands to a subset of $$vars for only local variables.
- $$parms
- expands to a subset of $$vars for only function parameters.
- $$return
- is available in return probes only. It expands to a string that is equivalent to sprintf("return=%x", $return) if the probed function has a return value, or else an empty string.
- & $EXPR
- expands to the address of the given context variable expression, if it is addressable.
- @defined($EXPR)
- expands to 1 if the given context variable expression is resolvable, or to 0 if it is not, for use in conditionals such as
@defined($foo->bar) ? $foo->bar : 0
- $EXPR$
- expands to a string with all of $EXPR's members, equivalent to
sprintf("{.a=%i, .b=%u, .c={...}, .d=[...]}", $EXPR->a, $EXPR->b)
- $EXPR$$
- expands to a string with all of $EXPR's members and submembers, equivalent to
sprintf("{.a=%i, .b=%u, .c={.x=%p, .y=%c}, .d=[%i, ...]}", $EXPR->a, $EXPR->b, $EXPR->c->x, $EXPR->c->y, $EXPR->d[0])
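In .return probes, the @entry(expr) operator saves the value of an expression evaluated at function entry. For example, the elapsed time of a function can be computed as:
probe kernel.function("do_filp_open").return { println( gettimeofday_us() - @entry(gettimeofday_us()) ) }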
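Several of the context variable forms above can be exercised in one handler. This sketch assumes a kernel with debuginfo in which vfs_read has parameters named file and count, and in which struct file may or may not have an f_flags member; those names come from the Linux source, not from this page.

```systemtap
probe kernel.function("vfs_read")
{
  # $$parms: all function parameters pretty-printed as one string.
  printf("%s: %s\n", execname(), $$parms)

  # @defined guards a member that might not resolve on every kernel.
  flags = @defined($file->f_flags) ? $file->f_flags : 0

  # $count is an integer-like parameter, cast to a 64-bit value.
  printf("count=%d flags=0x%x\n", $count, flags)
}

probe kernel.function("vfs_read").return
{
  # $$return prints "return=..." for functions with a return value.
  printf("%s\n", $$return)
  exit()
}
```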
DWARFLESS
In the absence of debugging information, the entry and exit points of kernel and module functions can be probed using the "kprobe" family of probes. However, these do not permit looking up the arguments or local variables of the function. The following constructs are supported:
kprobe.function(FUNCTION)
kprobe.function(FUNCTION).return
kprobe.module(NAME).function(FUNCTION)
kprobe.module(NAME).function(FUNCTION).return
kprobe.statement(ADDRESS).absolute
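Because kprobe probes cannot look up arguments or locals, a typical use is simple counting or timing. The following sketch assumes the kernel exports a symbol named vfs_read; the global names are hypothetical.

```systemtap
# Hypothetical counters; no context variables are used because
# none are available without debugging information.
global calls, returns

probe kprobe.function("vfs_read")        { calls++ }
probe kprobe.function("vfs_read").return { returns++ }

probe timer.s(2)
{
  printf("vfs_read: %d calls, %d returns\n", calls, returns)
  exit()
}
```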
USER-SPACE
Support for user-space probing is available for kernels that are configured with the utrace extensions.
process(PID).statement(ADDRESS).absolute
process(PID).begin
process("FULLPATH").begin
process.begin
process(PID).thread.begin
process("FULLPATH").thread.begin
process.thread.begin
process(PID).end
process("FULLPATH").end
process.end
process(PID).thread.end
process("FULLPATH").thread.end
process.thread.end
process(PID).syscall
process("FULLPATH").syscall
process.syscall
process(PID).syscall.return
process("FULLPATH").syscall.return
process.syscall.return
process(PID).insn
process("FULLPATH").insn
process(PID).insn.block
process("FULLPATH").insn.block
process("PATH").mark("LABEL")
process("PATH").provider("PROVIDER").mark("LABEL")
process("PATH").function("NAME")
process("PATH").statement("*@FILE.c:123")
process("PATH").plt("NAME")
process("PATH").library("PATH").plt("NAME")
process("PATH").library("PATH").function("NAME")
process("PATH").library("PATH").statement("*@FILE.c:123")
process("PATH").function("*").return
process("PATH").function("myfun").label("foo")
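The process lifecycle and syscall probe points listed above can be combined to count the system calls made by each run of a program. This is a sketch only: the path /bin/ls is an assumption about where the binary lives, and the global name nsys is hypothetical.

```systemtap
# Hypothetical per-process counter, indexed by process ID.
global nsys

probe process("/bin/ls").begin
{
  printf("ls started, pid %d\n", pid())
}

# Fires on every system call entry made by a matching process.
probe process("/bin/ls").syscall
{
  nsys[pid()]++
}

probe process("/bin/ls").end
{
  printf("pid %d made %d system calls\n", pid(), nsys[pid()])
  delete nsys[pid()]
}
```

The script keeps running until interrupted, so ls can be started several times from another terminal while it is active.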
PROCFS
These probe points allow procfs "files" in /proc/systemtap/MODNAME to be created, read and written using a permission that may be modified using the proper umask value. Default permissions are 0400 for read probes, and 0200 for write probes. If both a read and write probe are being used on the same file, a default permission of 0600 will be used. Using procfs.umask(0040).read would result in a 0404 permission set for the file. (MODNAME is the name of the systemtap module.) The proc filesystem is a pseudo-filesystem which is used as an interface to kernel data structures. There are several probe point variants supported by the translator:
procfs("PATH").read
procfs("PATH").umask(UMASK).read
procfs("PATH").read.maxsize(MAXSIZE)
procfs("PATH").umask(UMASK).read.maxsize(MAXSIZE)
procfs("PATH").write
procfs("PATH").umask(UMASK).write
procfs.read
procfs.umask(UMASK).read
procfs.read.maxsize(MAXSIZE)
procfs.umask(UMASK).read.maxsize(MAXSIZE)
procfs.write
procfs.umask(UMASK).write
procfs("PATH").read { $value = "100\n" }
procfs("PATH").write { printf("user wrote: %s", $value) }
procfs.read.maxsize(1024) {
  $value = "long string..."
  $value .= "another long string..."
  $value .= "another long string..."
  $value .= "another long string..."
}
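A read probe and a write probe on the same path can expose a script variable as a small control file. In this sketch the file name "limit" and the global of the same name are hypothetical, and the written text is assumed to be a decimal number.

```systemtap
# Hypothetical tunable exposed through procfs.
global limit = 100

# Reading the file returns the current value as text.
probe procfs("limit").read
{
  $value = sprintf("%d\n", limit)
}

# Writing to the file parses the text as a base-10 integer.
probe procfs("limit").write
{
  limit = strtol($value, 10)
  printf("limit is now %d\n", limit)
}
```

While the script runs, the file appears under /proc/systemtap/MODNAME/limit and can be read with cat or written with echo.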
MARKERS
This family of probe points hooks up to static probing markers inserted into the kernel or modules. These markers are special macro calls inserted by kernel developers to make probing faster and more reliable than with DWARF-based probes. Further, DWARF debugging information is not required to probe markers.
TRACEPOINTS
This family of probe points hooks up to static probing tracepoints inserted into the kernel or modules. As with markers, these tracepoints are special macro calls inserted by kernel developers to make probing faster and more reliable than with DWARF-based probes, and DWARF debugging information is not required to probe tracepoints. Tracepoints have an extra advantage of more strongly-typed parameters than markers.
HARDWARE BREAKPOINTS
This family of probes is used to set hardware watchpoints for a given (global) kernel symbol. The probes take three components as inputs:
probe kernel.data(ADDRESS).write
probe kernel.data(ADDRESS).rw
probe kernel.data(ADDRESS).length(LEN).write
probe kernel.data(ADDRESS).length(LEN).rw
probe kernel.data("SYMBOL_NAME").write
probe kernel.data("SYMBOL_NAME").rw
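A watchpoint on a named symbol might be used as follows. This sketch assumes the kernel has a global symbol called pid_max and that the hardware offers a free breakpoint register; neither is guaranteed on every system.

```systemtap
# Report every write to the kernel's pid_max variable.
probe kernel.data("pid_max").write
{
  printf("%s (pid %d) wrote pid_max\n", execname(), pid())
}
```

The handler would be expected to fire when something in the kernel stores to that variable, for instance when the corresponding sysctl setting is changed.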
EXAMPLES
Here are some example probe points, defining the associated events.
- begin, end, end
- refers to the startup and normal shutdown of the session. In this case, the handler would run once during startup and twice during shutdown.
- timer.jiffies(1000).randomize(200)
- refers to a periodic interrupt, every 1000 +/- 200 jiffies.
- kernel.function("*init*"), kernel.function("*exit*")
- refers to all kernel functions with "init" or "exit" in the name.
- kernel.function("*@kernel/sched.c:240")
- refers to any functions within the "kernel/sched.c" file that span line 240. Note that this is not a probe at the statement at that line number. Use the kernel.statement probe instead.
- kernel.mark("getuid")
- refers to an STAP_MARK(getuid, ...) macro call in the kernel.
- module("usb*").function("*sync*").return
- refers to the moment of return from all functions with "sync" in the name in any of the USB drivers.
- kernel.statement(0xc0044852)
- refers to the first byte of the statement whose compiled instructions include the given address in the kernel.
- kernel.statement("*@kernel/sched.c:2917")
- refers to the statement of line 2917 within "kernel/sched.c".
- kernel.statement("bio_init@fs/bio.c+3")
- refers to the statement at line bio_init+3 within "fs/bio.c".
- kernel.data("pid_max").write
- refers to a hardware breakpoint of type "write" set on pid_max
- syscall.*.return
- refers to the group of probe aliases with any name in the second position
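Two of the probe points above can be put together into a complete script. This sketch counts system call returns by name and prints the five most frequent when the randomized timer first fires; the global name returns is hypothetical.

```systemtap
# Hypothetical table of return counts, indexed by system call name.
global returns

# Matches the return alias of every system call.
probe syscall.*.return
{
  returns[name]++
}

# Fires once after roughly 1000 +/- 200 jiffies, then ends the session.
probe timer.jiffies(1000).randomize(200)
{
  # Iterate in decreasing order of count, at most five entries.
  foreach (n in returns- limit 5)
    printf("%-20s %d\n", n, returns[n])
  exit()
}
```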
PERF
This prototype family of probe points interfaces to the kernel "perf event" infrastructure for controlling hardware performance counters. The events being attached to are described by the "type" and "config" fields of the perf_event_attr structure, and are sampled at an interval governed by the "sample_period" field.
probe perf.type(NN).config(MM).sample(XX)
probe perf.type(NN).config(MM)
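A perf probe can be used for simple sampling. This sketch assumes that type 0 and config 0 select the hardware CPU-cycles counter (PERF_TYPE_HARDWARE and PERF_COUNT_HW_CPU_CYCLES in the Linux perf_event header) and that hardware counters are accessible, which is often not the case inside virtual machines. The global name samples is hypothetical.

```systemtap
# Hypothetical table of sample counts, indexed by executable name.
global samples

# One sample per 1,000,000 counted events of the assumed cycles counter.
probe perf.type(0).config(0).sample(1000000)
{
  samples[execname()]++
}

probe timer.s(5)
{
  # Print the ten executables with the most samples.
  foreach (e in samples- limit 10)
    printf("%-16s %d\n", e, samples[e])
  exit()
}
```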