.\" Generated by mkman
.TH QPERF 1 "April 2018" "qperf" "User Commands"
.SH NAME
qperf \- Measure RDMA and IP performance
.SH SYNOPSIS
\fBqperf\fP
.br
\fBqperf\fP \fISERVERNODE\fP [\fIOPTIONS\fP] \fITESTS\fP
.SH DESCRIPTION
qperf measures bandwidth and latency between two nodes.  It can work over
TCP/IP as well as the RDMA transports.  On one of the nodes, qperf is
typically run with no arguments, designating it the server node.  One may
then run qperf on a client node to obtain measurements such as bandwidth,
latency and CPU utilization.
.PP
In its most basic form, qperf is run on one node in server mode by invoking
it with no arguments.  On the other node, it is run with two arguments: the
name of the server node followed by the name of the test.  A list of tests
can be found in the TESTS section.  A variety of options may also be
specified.
.PP
One can get more detailed information on qperf by using the --help option.
Below are examples of using the --help option:
.PP
.nf
    qperf --help examples   Some examples of using qperf
    qperf --help opts       Summary of options
    qperf --help options    Description of options
    qperf --help tests      Short summary and description of tests
    qperf --help TESTNAME   More information on test TESTNAME
.fi
.SH EXAMPLES
In these examples, we first run qperf on a node called myserver in server
mode by invoking it with no arguments.  In all the subsequent examples, we
run qperf on another node and connect to the server, which we assume has a
hostname of myserver.
.TP
To run a TCP bandwidth and latency test:
qperf myserver tcp_bw tcp_lat
.TP
To run an SDP bandwidth test for 10 seconds:
qperf myserver -t 10 sdp_bw
.TP
To run a UDP latency test and then cause the server to terminate:
qperf myserver udp_lat quit
.TP
To measure the RDMA UD latency and bandwidth:
qperf myserver ud_lat ud_bw
.TP
To measure RDMA RC bi-directional bandwidth:
qperf myserver rc_bi_bw
.TP
To get a range of TCP latencies with a message size from 1 to 64K:
qperf myserver -oo msg_size:1:64K:*2 -vu tcp_lat
.SH OPTIONS
.TP
\fB-ar\fP, \fB--access_recv\fP \fIOnOff\fP
If OnOff is non-zero, data is accessed once received.  Otherwise, data is
ignored.  By default, OnOff is 0.  This can help to mimic some applications.
.TP
\fB-ar1\fP
Cause received data to be accessed.
.TP
\fB-ap\fP, \fB--alt_port\fP \fIPort\fP
Set alternate path port.  This enables automatic path failover.
.TP
\fB-lap\fP, \fB--loc_alt_port\fP \fIPort\fP
Set local alternate path port.  This enables automatic path failover.
.TP
\fB-rap\fP, \fB--rem_alt_port\fP \fIPort\fP
Set remote alternate path port.  This enables automatic path failover.
.TP
\fB-ca\fP, \fB--cpu_affinity\fP \fIPN\fP
Set CPU affinity to PN.  CPUs are numbered sequentially from 0.  If PN is
"any", any CPU is allowed; otherwise the CPU is limited to the one
specified.
.TP
\fB-lca\fP, \fB--loc_cpu_affinity\fP \fIPN\fP
Set local processor affinity to PN.
.TP
\fB-rca\fP, \fB--rem_cpu_affinity\fP \fIPN\fP
Set remote processor affinity to PN.
.TP
\fB-f\fP, \fB--flip\fP \fIOnOff\fP
If non-zero, cause sender and receiver to play opposite roles.
.TP
\fB-f1\fP
Cause sender and receiver to play opposite roles.
.TP
\fB-h\fP, \fB--help\fP \fITopic\fP
Print out information about Topic.  To see the list of topics, type
qperf --help
.TP
\fB-H\fP, \fB--host\fP \fIHost\fP
Run test between the current node and the qperf running on node Host.  This
can also be specified as the first non-option argument.
.TP
\fB-i\fP, \fB--id\fP \fIDevice:Port\fP
Use RDMA Device and Port.
.TP
\fB-li\fP, \fB--loc_id\fP \fIDevice:Port\fP
Use local RDMA Device and Port.
.TP
\fB-ri\fP, \fB--rem_id\fP \fIDevice:Port\fP
Use remote RDMA Device and Port.
.TP
\fB-lp\fP, \fB--listen_port\fP \fIPort\fP
Set the port we listen on to Port.  This must be set to the same port on
both the server and client machines.  The default value is 19765.
.TP
\fB-oo\fP, \fB--loop\fP \fIVar:Init:Last:Incr\fP
Run a test multiple times sequencing through a series of values.  Var is the
loop variable; Init is the initial value; Last is the value it must not
exceed; and Incr is the increment.  It is useful to set the --verbose_used
(-vu) option in conjunction with this option.
.TP
\fB-m\fP, \fB--msg_size\fP \fISize\fP
Set the message size to Size.  The default value varies by test.  It is
assumed that the value is specified in bytes; however, a trailing kib or K,
mib or M, or gib or G indicates that the size is being specified in
kibibytes, mebibytes or gibibytes respectively, while a trailing kb or k, mb
or m, or gb or g indicates kilobytes, megabytes or gigabytes respectively.
.TP
\fB-mt\fP, \fB--mtu_size\fP \fISize\fP
Set the MTU size.  Only relevant to the RDMA UC/RC tests.  Units are
specified in the same manner as the --msg_size option.
.TP
\fB-n\fP, \fB--no_msgs\fP \fIN\fP
Set test duration by number of messages sent instead of time.
.TP
\fB-cp\fP, \fB--cq_poll\fP \fIOnOff\fP
Turn polling mode on or off.  This is only relevant to the RDMA tests and
determines whether they poll or wait on the completion queues.  If OnOff is
0, they wait; otherwise they poll.
.TP
\fB-lcp\fP, \fB--loc_cq_poll\fP \fIOnOff\fP
Locally turn polling mode on or off.
.TP
\fB-rcp\fP, \fB--rem_cq_poll\fP \fIOnOff\fP
Remotely turn polling mode on or off.
.TP
\fB-cp1\fP
Turn polling mode on.
.TP
\fB-lcp1\fP
Turn local polling mode on.
.TP
\fB-rcp1\fP
Turn remote polling mode on.
.TP
\fB-ip\fP, \fB--ip_port\fP \fIPort\fP
Use Port to run the socket tests.  This is different from --listen_port,
which is used for synchronization.  This is only relevant for the socket
tests and refers to the TCP/UDP/SDP/RDS/SCTP port that the test is run on.
.TP
\fB-e\fP, \fB--precision\fP \fIDigits\fP
Set the number of significant digits that are used to report results.
.TP
\fB-nr\fP, \fB--rd_atomic\fP \fIMax\fP
Set the number of in-flight operations that can be handled for an RDMA read
or atomic operation to Max.  This is only relevant to the RDMA Read and
Atomic tests.
.TP
\fB-lnr\fP, \fB--loc_rd_atomic\fP \fIMax\fP
Set local read/atomic count.
.TP
\fB-rnr\fP, \fB--rem_rd_atomic\fP \fIMax\fP
Set remote read/atomic count.
.TP
\fB-sl\fP, \fB--service_level\fP \fISL\fP
Set RDMA service level to SL.  This is only used by the RDMA tests.  The
service level must be between 0 and 15.  The default service level is 0.
.TP
\fB-lsl\fP, \fB--loc_service_level\fP \fISL\fP
Set local service level.
.TP
\fB-rsl\fP, \fB--rem_service_level\fP \fISL\fP
Set remote service level.
.TP
\fB-sb\fP, \fB--sock_buf_size\fP \fISize\fP
Set the socket buffer size.  This is only relevant to the socket tests.
.TP
\fB-lsb\fP, \fB--loc_sock_buf_size\fP \fISize\fP
Set local socket buffer size.
.TP
\fB-rsb\fP, \fB--rem_sock_buf_size\fP \fISize\fP
Set remote socket buffer size.
.TP
\fB-sp\fP, \fB--src_path_bits\fP \fIN\fP
Set source path bits.  If the LMC is not zero, this will cause the
connection to use a LID with the low order LMC bits set to N.
.TP
\fB-lsp\fP, \fB--loc_src_path_bits\fP \fIN\fP
Set local source path bits.
.TP
\fB-rsp\fP, \fB--rem_src_path_bits\fP \fIN\fP
Set remote source path bits.
.TP
\fB-sr\fP, \fB--static_rate\fP \fIRate\fP
Force InfiniBand static rate.  Rate can be one of: 2.5, 5, 10, 20, 30, 40,
60, 80, 120, 1xSDR (2.5 Gbps), 1xDDR (5 Gbps), 1xQDR (10 Gbps), 4xSDR (2.5
Gbps), 4xDDR (5 Gbps), 4xQDR (10 Gbps), 8xSDR (2.5 Gbps), 8xDDR (5 Gbps),
8xQDR (10 Gbps).  The rates given in parentheses are per lane.
.TP
\fB-lsr\fP, \fB--loc_static_rate\fP \fIRate\fP
Force local InfiniBand static rate.
.TP
\fB-rsr\fP, \fB--rem_static_rate\fP \fIRate\fP
Force remote InfiniBand static rate.
.TP
\fB-t\fP, \fB--time\fP \fITime\fP
Set test duration to Time.  Time is specified in seconds; however, a
trailing m, h or d indicates that the time is specified in minutes, hours or
days respectively.
.TP
\fB-to\fP, \fB--timeout\fP \fITime\fP
Set timeout to Time.  This is the timeout used for various things such as
exchanging messages.  The default is 5 seconds.
.TP
\fB-lto\fP, \fB--loc_timeout\fP \fITime\fP
Set local timeout to Time.  This may be used on the server to set the
timeout when initially exchanging data with each client.  However, as soon
as we receive the client's parameters, the client's remote timeout will
override this parameter.
.TP
\fB-rto\fP, \fB--rem_timeout\fP \fITime\fP
Set remote timeout to Time.
.TP
\fB-un\fP, \fB--unify_nodes\fP
Unify the nodes.  Describe them in terms of local and remote rather than
send and receive.
.TP
\fB-uu\fP, \fB--unify_units\fP
Unify the units that results are shown in.  Uses the lowest common
denominator.  Helpful for scripts.
.TP
\fB-ub\fP, \fB--use_bits_per_sec\fP
Use bits/sec rather than bytes/sec when displaying networking speed.
.TP
\fB-cm\fP, \fB--use_cm\fP \fIOnOff\fP
Use the RDMA Connection Manager (CM) if OnOff is non-zero.  It is necessary
to use the CM for iWARP devices.  The default is to establish the connection
without using the CM.  This only works for the tests that use the RC
transport.
.TP
\fB-cm1\fP
Use RDMA Connection Manager.
.TP
\fB-v\fP, \fB--verbose\fP
Provide more detailed output.  Turns on -vc, -vs, -vt and -vu.
.TP
\fB-vc\fP, \fB--verbose_conf\fP
Provide information on configuration.
.TP
\fB-vs\fP, \fB--verbose_stat\fP
Provide information on statistics.
.TP
\fB-vt\fP, \fB--verbose_time\fP
Provide information on timing.
.TP
\fB-vu\fP, \fB--verbose_used\fP
Provide information on parameters used.
.TP
\fB-vv\fP, \fB--verbose_more\fP
Provide even more detailed output.  Turns on -vvc, -vvs, -vvt and -vvu.
.TP
\fB-vvc\fP, \fB--verbose_more_conf\fP
Provide more information on configuration.
.TP
\fB-vvs\fP, \fB--verbose_more_stat\fP
Provide more information on statistics.
.TP
\fB-vvt\fP, \fB--verbose_more_time\fP
Provide more information on timing.
.TP
\fB-vvu\fP, \fB--verbose_more_used\fP
Provide more information on parameters used.
.TP
\fB-V\fP, \fB--version\fP
The current version of qperf is printed.
.TP
\fB-ws\fP, \fB--wait_server\fP \fITime\fP
If the server is not ready, continue to try connecting for Time seconds
before giving up.  The default is 5 seconds.
.SH TESTS
.TP
\fBconf\fP
Show configuration
.TP
\fBquit\fP
Cause the server to quit
.TP
\fBrds_bw\fP
RDS streaming one way bandwidth
.TP
\fBrds_lat\fP
RDS one way latency
.TP
\fBsctp_bw\fP
SCTP streaming one way bandwidth
.TP
\fBsctp_lat\fP
SCTP one way latency
.TP
\fBsdp_bw\fP
SDP streaming one way bandwidth
.TP
\fBsdp_lat\fP
SDP one way latency
.TP
\fBtcp_bw\fP
TCP streaming one way bandwidth
.TP
\fBtcp_lat\fP
TCP one way latency
.TP
\fBudp_bw\fP
UDP streaming one way bandwidth
.TP
\fBudp_lat\fP
UDP one way latency
.TP
\fBrc_bi_bw\fP
RC streaming two way bandwidth
.TP
\fBrc_bw\fP
RC streaming one way bandwidth
.TP
\fBrc_lat\fP
RC one way latency
.TP
\fBuc_bi_bw\fP
UC streaming two way bandwidth
.TP
\fBuc_bw\fP
UC streaming one way bandwidth
.TP
\fBuc_lat\fP
UC one way latency
.TP
\fBud_bi_bw\fP
UD streaming two way bandwidth
.TP
\fBud_bw\fP
UD streaming one way bandwidth
.TP
\fBud_lat\fP
UD one way latency
.TP
\fBxrc_bi_bw\fP
XRC streaming two way bandwidth
.TP
\fBxrc_bw\fP
XRC streaming one way bandwidth
.TP
\fBxrc_lat\fP
XRC one way latency
.TP
\fBrc_rdma_read_bw\fP
RC RDMA read streaming one way bandwidth
.TP
\fBrc_rdma_read_lat\fP
RC RDMA read one way latency
.TP
\fBrc_rdma_write_bw\fP
RC RDMA write streaming one way bandwidth
.TP
\fBrc_rdma_write_lat\fP
RC RDMA write one way latency
.TP
\fBrc_rdma_write_poll_lat\fP
RC RDMA write one way polling latency
.TP
\fBuc_rdma_write_bw\fP
UC RDMA write streaming one way bandwidth
.TP
\fBuc_rdma_write_lat\fP
UC RDMA write one way latency
.TP
\fBuc_rdma_write_poll_lat\fP
UC RDMA write one way polling latency
.TP
\fBrc_compare_swap_mr\fP
RC compare and swap messaging rate
.TP
\fBrc_fetch_add_mr\fP
RC fetch and add messaging rate
.TP
\fBver_rc_compare_swap\fP
Verify RC compare and swap
.TP
\fBver_rc_fetch_add\fP
Verify RC fetch and add
.SH AUTHOR
Written by Johann George.
.SH BUGS
None of the RDMA tests are available if qperf is compiled without the RDMA
libraries.  None of the XRC tests are available if qperf is compiled without
the XRC extensions.  The -f option is not yet implemented in many of the
tests.
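.SH NOTES
The \fB--loop\fP example in EXAMPLES uses the specification
msg_size:1:64K:*2.  As a rough illustration only, the shell sketch below
prints the message sizes such a loop would step through.  It assumes that an
Incr written as *N multiplies the loop variable by N on each pass, and that
64K means 65536 bytes, following the suffix rules given for
\fB--msg_size\fP.  The function loop_values is a hypothetical helper and is
not part of qperf.
.PP
.nf
```shell
#!/bin/sh
# Hypothetical helper, not part of qperf: print the values that a loop
# specification Var:Init:Last:*N would step through, ASSUMING that "*N"
# multiplies the loop variable by N after every pass.
loop_values() {
    value=$1     # Init: the initial value
    last=$2      # Last: the value the loop variable must not exceed
    factor=$3    # N from an Incr written as *N

    # A factor below 2 or a start below 1 would never move past Last.
    if [ "$factor" -lt 2 ] || [ "$value" -lt 1 ]; then
        echo "need factor >= 2 and Init >= 1" >&2
        return 1
    fi

    while [ "$value" -le "$last" ]; do
        echo "$value"
        value=$((value * factor))
    done
}

# msg_size:1:64K:*2, with 64K taken to mean 64 * 1024 = 65536 bytes
loop_values 1 65536 2
```
.fi
.PP
Under those assumptions the sketch prints 17 values, from 1 up to 65536,
which would correspond to one tcp_lat run per message size in that example.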