'\" t
.\" Title: perf-report
.\" Author: [FIXME: author] [see http://docbook.sf.net/el/author]
.\" Generator: DocBook XSL Stylesheets v1.76.1
.\" Date: 02/24/2016
.\" Manual: perf Manual
.\" Source: perf
.\" Language: English
.\"
.TH "PERF_3.16\-REPORT" "1" "02/24/2016" "perf" "perf Manual"
.\" -----------------------------------------------------------------
.\" * Define some portability stuff
.\" -----------------------------------------------------------------
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.\" http://bugs.debian.org/507673
.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\" -----------------------------------------------------------------
.\" * set default formatting
.\" -----------------------------------------------------------------
.\" disable hyphenation
.nh
.\" disable justification (adjust text to left margin only)
.ad l
.\" -----------------------------------------------------------------
.\" * MAIN CONTENT STARTS HERE *
.\" -----------------------------------------------------------------
.SH "NAME"
perf-report \- Read perf\&.data (created by perf record) and display the profile
.SH "SYNOPSIS"
.sp
.nf
\fIperf report\fR [\-i <file> | \-\-input=file]
.fi
.SH "DESCRIPTION"
.sp
This command displays the performance counter profile information recorded via perf record\&.
.SH "OPTIONS"
.PP
\-i, \-\-input=
.RS 4
Input file name\&. (default: perf\&.data unless stdin is a fifo)
.RE
.PP
\-v, \-\-verbose
.RS 4
Be more verbose\&. (show symbol address, etc)
.RE
.PP
\-n, \-\-show\-nr\-samples
.RS 4
Show the number of samples for each symbol
.RE
.PP
\-\-showcpuutilization
.RS 4
Show sample percentage for different cpu modes\&.
.RE
.PP
\-T, \-\-threads
.RS 4
Show per\-thread event counters
.RE
.PP
\-c, \-\-comms=
.RS 4
Only consider symbols in these comms\&.
CSV that understands \m[blue]\fBfile://filename\fR\m[] entries\&. This option will affect the percentage of the overhead column\&. See \-\-percentage for more info\&.
.RE
.PP
\-d, \-\-dsos=
.RS 4
Only consider symbols in these dsos\&. CSV that understands \m[blue]\fBfile://filename\fR\m[] entries\&. This option will affect the percentage of the overhead column\&. See \-\-percentage for more info\&.
.RE
.PP
\-S, \-\-symbols=
.RS 4
Only consider these symbols\&. CSV that understands \m[blue]\fBfile://filename\fR\m[] entries\&. This option will affect the percentage of the overhead column\&. See \-\-percentage for more info\&.
.RE
.PP
\-\-symbol\-filter=
.RS 4
Only show symbols that (partially) match this filter\&.
.RE
.PP
\-U, \-\-hide\-unresolved
.RS 4
Only display entries resolved to a symbol\&.
.RE
.PP
\-s, \-\-sort=
.RS 4
Sort histogram entries by given key(s) \- multiple keys can be specified in CSV format\&. The following sort keys are available: pid, comm, dso, symbol, parent, cpu, srcline, weight, local_weight\&.
.sp
.if n \{\
.RS 4
.\}
.nf
Each key has the following meaning:
.fi
.if n \{\
.RE
.\}
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
comm: command (name) of the task which can be read via /proc/<pid>/comm
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
pid: command and tid of the task
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
dso: name of library or module executed at the time of sample
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
symbol: name of function executed at the time of sample
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
parent: name of function matched to the parent regex filter\&. Unmatched entries are displayed as "[other]"\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
cpu: number of the cpu the task ran on at the time of sample
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
srcline: filename and line number executed at the time of sample\&. The DWARF debugging info must be provided\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
weight: Event specific weight, e\&.g\&. memory latency or transaction abort cost\&. This is the global weight\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
local_weight: Local weight version of the weight above\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
transaction: Transaction abort flags\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
overhead: Overhead percentage of sample
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
overhead_sys: Overhead percentage of sample running in system mode
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
overhead_us: Overhead percentage of sample running in user mode
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
overhead_guest_sys: Overhead percentage of sample running in system mode on guest machine
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
overhead_guest_us: Overhead percentage of sample running in user mode on guest machine
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
sample: Number of samples
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
period: Raw event count of the sample
.sp
.if n \{\
.RS 4
.\}
.nf
By default, comm, dso and symbol keys are used\&.
(i\&.e\&. \-\-sort comm,dso,symbol)
.fi
.if n \{\
.RE
.\}
.sp
.if n \{\
.RS 4
.\}
.nf
If the \-\-branch\-stack option is used, the following sort keys are also
available:
dso_from, dso_to, symbol_from, symbol_to, mispredict\&.
.fi
.if n \{\
.RE
.\}
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
dso_from: name of library or module branched from
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
dso_to: name of library or module branched to
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
symbol_from: name of function branched from
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
symbol_to: name of function branched to
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
mispredict: "N" for predicted branch, "Y" for mispredicted branch
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
in_tx: branch in TSX transaction
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
abort: TSX transaction abort\&.
.sp
.if n \{\
.RS 4
.\}
.nf
The default sort keys are then changed to comm, dso_from, symbol_from, dso_to
and symbol_to; see \*(Aq\-\-branch\-stack\*(Aq\&.
.fi
.if n \{\
.RE
.\}
.RE
.RE
.PP
\-F, \-\-fields=
.RS 4
Specify output field \- multiple keys can be specified in CSV format\&. The following fields are available: overhead, overhead_sys, overhead_us, overhead_children, sample and period\&. It can also contain any sort key(s)\&.
.sp
.if n \{\
.RS 4
.\}
.nf
By default, every sort key not specified in \-F will be appended
automatically\&.
.fi
.if n \{\
.RE
.\}
.sp
.if n \{\
.RS 4
.\}
.nf
If the \-\-mem\-mode option is used, the following sort keys are also available
(incompatible with \-\-branch\-stack):
symbol_daddr, dso_daddr, locked, tlb, mem, snoop, dcacheline\&.
.fi
.if n \{\
.RE
.\}
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
symbol_daddr: name of data symbol being executed on at the time of sample
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
dso_daddr: name of library or module containing the data being executed on at the time of sample
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
locked: whether the bus was locked at the time of sample
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
tlb: type of tlb access for the data at the time of sample
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
mem: type of memory access for the data at the time of sample
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
snoop: type of snoop (if any) for the data at the time of sample
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
dcacheline: the cacheline the data address is on at the time of sample
.sp
.if n \{\
.RS 4
.\}
.nf
The default sort keys are then changed to local_weight, mem, sym, dso,
symbol_daddr, dso_daddr, snoop, tlb, locked; see \*(Aq\-\-mem\-mode\*(Aq\&.
.fi
.if n \{\
.RE
.\}
.RE
.RE
.PP
\-p, \-\-parent=
.RS 4
A regex filter to identify parent\&. The parent is a caller of this function and is searched for through the callchain, so callchain information must have been recorded\&. The pattern is in the extended regex format and defaults to "^sys_|^do_page_fault", see \fI\-\-sort parent\fR\&.
.RE
.PP
\-x, \-\-exclude\-other
.RS 4
Only display entries with parent\-match\&.
.RE
.PP
\-w, \-\-column\-widths=
.RS 4
Force each column width to the provided list, for large terminal readability\&.
.RE
.PP
\-t, \-\-field\-separator=
.RS 4
Use a special separator character and don\(cqt pad with spaces, replacing all occurrences of this separator in symbol names (and other output) with a \fI\&.\fR character, which is therefore the only invalid separator\&.
.RE
.PP
\-D, \-\-dump\-raw\-trace
.RS 4
Dump raw trace in ASCII\&.
.RE
.PP
\-g [type,min[,limit],order[,key]], \-\-call\-graph
.RS 4
Display call chains using type, min percent threshold, optional print limit and order\&. type can be either:
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
flat: single column, linear exposure of call chains\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
graph: use a graph tree, displaying absolute overhead rates\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
fractal: like graph, but displays relative rates\&. Each branch of the tree is considered as a new profiled object\&.
.sp
.if n \{\
.RS 4
.\}
.nf
order can be either:
\- callee: callee based call graph\&.
\- caller: inverted caller based call graph\&.
.fi
.if n \{\
.RE
.\}
.sp
.if n \{\
.RS 4
.\}
.nf
key can be:
\- function: compare on functions
\- address: compare on individual code addresses
.fi
.if n \{\
.RE
.\}
.sp
.if n \{\
.RS 4
.\}
.nf
Default: fractal,0\&.5,callee,function\&.
.fi
.if n \{\
.RE
.\}
.RE
.RE
.PP
\-\-children
.RS 4
Accumulate callchain of children to parent entry so that they can show up in the output\&. The output will have a new "Children" column and will be sorted on the data\&. It requires that callchains are recorded\&.
.RE
.PP
\-\-max\-stack
.RS 4
Set the stack depth limit when parsing the callchain; anything beyond the specified depth will be ignored\&. This is a trade\-off between information loss and faster processing, especially for workloads that can have a very long callchain stack\&.
.sp
.if n \{\
.RS 4
.\}
.nf
Default: 127
.fi
.if n \{\
.RE
.\}
.RE
.PP
\-G, \-\-inverted
.RS 4
Alias for inverted caller based call graph\&.
.RE
.PP
\-\-ignore\-callees=
.RS 4
Ignore callees of the function(s) matching the given regex\&. This has the effect of collecting the callers of each such function into one place in the call\-graph tree\&.
.RE
.PP
\-\-pretty=
.RS 4
Pretty printing style\&. key: normal, raw
.RE
.PP
\-\-stdio
.RS 4
Use the stdio interface\&.
.RE
.PP
\-\-tui
.RS 4
Use the TUI interface, which is integrated with annotate and allows zooming into DSOs or threads, among other features\&. Use of \-\-tui requires a tty; if one is not present, as when piping to other commands, the stdio interface is used\&.
.RE
.PP
\-\-gtk
.RS 4
Use the GTK2 interface\&.
.RE
.PP
\-k, \-\-vmlinux=
.RS 4
vmlinux pathname
.RE
.PP
\-\-kallsyms=
.RS 4
kallsyms pathname
.RE
.PP
\-m, \-\-modules
.RS 4
Load module symbols\&. WARNING: This should only be used with \-k and a LIVE kernel\&.
.RE
.PP
\-f, \-\-force
.RS 4
Don\(cqt complain, do it\&.
.RE
.PP
\-\-symfs=
.RS 4
Look for files with symbols relative to this directory\&.
.RE
.PP
\-C, \-\-cpu
.RS 4
Only report samples for the list of CPUs provided\&. Multiple CPUs can be provided as a comma\-separated list with no space: 0,1\&. Ranges of CPUs are specified with \-: 0\-2\&. Default is to report samples on all CPUs\&.
.RE
.PP
\-M, \-\-disassembler\-style=
.RS 4
Set disassembler style for objdump\&.
.RE
.PP
\-\-source
.RS 4
Interleave source code with assembly code\&. Enabled by default, disable with \-\-no\-source\&.
.RE
.PP
\-\-asm\-raw
.RS 4
Show raw instruction encoding of assembly instructions\&.
.RE
.PP
\-\-show\-total\-period
.RS 4
Show a column with the sum of periods\&.
.RE
.PP
\-I, \-\-show\-info
.RS 4
Display extended information about the perf\&.data file\&. This adds information which may be very large and thus may clutter the display\&. It currently includes: cpu and numa topology of the host system\&.
.RE
.PP
\-b, \-\-branch\-stack
.RS 4
Use the addresses of sampled taken branches instead of the instruction address to build the histograms\&. To generate meaningful output, the perf\&.data file must have been obtained using perf record \-b or perf record \-\-branch\-filter xxx where xxx is a branch filter option\&. perf report is able to auto\-detect whether a perf\&.data file contains branch stacks and it will automatically switch to the branch view mode, unless \-\-no\-branch\-stack is used\&.
.RE
.PP
\-\-objdump=
.RS 4
Path to objdump binary\&.
.RE
.PP
\-\-group
.RS 4
Show event group information together\&.
.RE
.PP
\-\-demangle
.RS 4
Demangle symbol names to human readable form\&. It\(cqs enabled by default, disable with \-\-no\-demangle\&.
.RE
.PP
\-\-mem\-mode
.RS 4
Use the data addresses of samples in addition to instruction addresses to build the histograms\&. To generate meaningful output, the perf\&.data file must have been obtained using perf record \-d \-W and using a special event \-e cpu/mem\-loads/ or \-e cpu/mem\-stores/\&. See \fIperf mem\fR for simpler access\&.
.RE
.PP
\-\-percent\-limit
.RS 4
Do not show entries which have an overhead below the given percentage\&. (Default: 0)\&.
.RE
.PP
\-\-percentage
.RS 4
Determine how to display the overhead percentage of filtered entries\&. Filters can be applied by the \-\-comms, \-\-dsos and/or \-\-symbols options and by Zoom operations on the TUI (thread, dso, etc)\&.
.sp
.if n \{\
.RS 4
.\}
.nf
"relative" means it\*(Aqs relative to filtered entries only so that the
sum of shown entries will always be 100%\&. "absolute" means it retains
the original value before and after the filter is applied\&.
.fi
.if n \{\
.RE
.\}
.RE
.PP
\-\-header
.RS 4
Show header information in the perf\&.data file\&. This includes various information like hostname, OS and perf version, cpu/mem info, perf command line, event list and so on\&. Currently only \-\-stdio output supports this feature\&.
.RE
.PP
\-\-header\-only
.RS 4
Show only perf\&.data header (forces \-\-stdio)\&.
.RE
.SH "SEE ALSO"
.sp
\fBperf_3.16-stat\fR(1), \fBperf_3.16-annotate\fR(1)