perftest(1) General Commands Manual perftest(1)

NAME

ib_write_bw, ib_read_bw, ib_send_bw, ib_atomic_bw, ib_write_lat, ib_read_lat, ib_send_lat, ib_atomic_lat, raw_ethernet_bw, raw_ethernet_lat, raw_ethernet_burst_lat, raw_ethernet_fs_rate - benchmarks for various types of InfiniBand performance

DESCRIPTION

The perftest tools measure different metrics and verbs performance, and include many different options and modes.

RUNNING TESTS


Server: ./<test name> <options>

Client: ./<test name> <options> <server IP address>

1- Running bidirectional bandwidth test using Write verb for 5 seconds with 8388608 as a message size and 3 qps:
Server: ./ib_write_bw -s 8388608 -b -D 5 -q 3
Client: ./ib_write_bw -s 8388608 -b -D 5 -q 3 1.1.1.2


2- Running latency test using Read verb for 5000 iterations with 32 as a message size:
Server: ./ib_read_lat -s 32 -n 5000
Client: ./ib_read_lat -s 32 -n 5000 192.168.0.1
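
In both examples the server and client take the same test options, and only the client adds the server IP address. A minimal Python sketch of that pattern, rebuilding the command lines of example 1; the helper name perftest_commands is hypothetical and not part of perftest:

```python
import shlex

def perftest_commands(test, options, server_ip):
    """Return (server_cmd, client_cmd) for one perftest run.

    Hypothetical helper, not part of perftest: both sides share the
    same options, and only the client appends the server IP address.
    """
    base = [f"./{test}", *options]
    return shlex.join(base), shlex.join(base + [server_ip])

# Options copied from example 1 above.
server_cmd, client_cmd = perftest_commands(
    "ib_write_bw", ["-s", "8388608", "-b", "-D", "5", "-q", "3"], "1.1.1.2"
)
print("Server:", server_cmd)
print("Client:", client_cmd)
```

Building both lines from one option list keeps the two sides from drifting apart when an option is changed.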

IMPORTANT NOTES


1- Options that are specific to a mode in perftest must be the same on both server and client.

2- Perftest applications may need to be run with sudo when running as a non-root user.
3- Perftest applications are usually installed to /usr/bin/.
4- Perftest may print some failures with syndromes to stderr; perftest gets those errors from rdma-core.

OPTIONS


Lists the available options to the screen.

Run sizes from 2 to 2^23.
Not relevant for Atomic and RawEth.
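
The size range of this sweep can be listed with a short Python sketch. It assumes the sweep doubles the message size at each step, which the description above does not state explicitly:

```python
# Message sizes in bytes: every power of two from 2 (2^1) up to 2^23.
# Assumption: the sweep doubles the size at each step.
sizes = [2 ** exp for exp in range(1, 24)]

print(len(sizes), sizes[0], sizes[-1])  # 23 2 8388608
```

The largest size, 8388608, is the message size used in example 1 of RUNNING TESTS.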

Type of atomic operation from {CMP_AND_SWAP,FETCH_AND_ADD} (default FETCH_AND_ADD).
Relevant only for Atomic.

Measure bidirectional bandwidth (default unidirectional).
Relevant only for BW.

Connection type RC/XRC/UC/UD/DC/SRD (default RC).
UD relevant only for Send verb.
SRD relevant only for Read, Write and Send verbs.
UC relevant only for Write and Send verbs.
Not relevant for RawEth.

Run DC initiator as DCS instead of DCI with <log_num dci_stream_channels>.
Not relevant for RawEth.
System support required.

Not relevant for RawEth.
System support required.

Runs traffic with AES_XTS feature (encryption).
Not relevant for RawEth and Write latency.
System support required.

Runs traffic with encryption on tx (default decryption on tx).
Not relevant for RawEth and Write latency.
System support required.

Puts signature on data before encrypting it (default after).
Not relevant for RawEth and Write latency.
System support required.

Not relevant for RawEth and Write latency.
System support required.

Not relevant for RawEth and Write latency.
System support required.

Not relevant for RawEth and Write latency.
System support required.

Not relevant for RawEth and Write latency.
System support required.

Not relevant for RawEth and Write latency.
System support required.

Report times in cpu cycle units (default microseconds).
Relevant only for latency.

Use IB device <dev> (default first device found).

Run test for a customized period of seconds.

Sleep on CQ events (default poll).
Not relevant for Write and RawEth.

Set <completion vector> used for events.
Not relevant for Write and RawEth.

Measure results within margins (default 2 sec).

Do not show a warning even if cpufreq_ondemand module is loaded, and cpu-freq is not on max.

Send messages to multicast group with 1 QP attached to it.
When there is no multicast gid specified, a default IPv6 typed gid '255:1:0:0:0:2:201:133:0:0:0:0:0:0:0:0' will be used.
Relevant only for send non fsRate.

Print out all results (default print summary only).
Relevant only for latency and raw_ethernet_fs_rate.

Use port <port> of IB device (default 1).

Max size of message to be sent in inline.
Not relevant for Read and Atomic.

Post list of send WQEs of <list size> size (instead of single post).
Relevant only for BW and raw_ethernet_burst_lat.

Post list of receive WQEs of <list size> size (instead of single post).
Relevant only for BW and raw_ethernet_burst_lat.

Set hop limit value (ttl for IPv4 RawEth QP). Values 0-255 (default 64).
Relevant only for RawEth
Not relevant for raw_ethernet_fs_rate.

MTU size : 64 - 9600 (default port mtu) for RawEth else 256 - 4096.
Not relevant for raw_ethernet_fs_rate.

In multicast, uses <multicast_gid> as the group MGID.
<multicast_gid> can be either decimal or hexadecimal, e.g. regarding the IPv4 224.0.0.30 :
Decimal: 0:0:0:0:0:0:0:0:0:0:255:255:224:0:0:30 , Hexadecimal: 0:0:0:0:0:0:0:0:0:0:0xff:0xff:0xe0:0:0:0x1e
Relevant only for send non fsRate.
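
The two notations in the 224.0.0.30 example can be derived mechanically. The Python sketch below follows the layout visible in that example (ten zero bytes, two 255 bytes, then the four IPv4 octets); the helper name ipv4_to_mgid is hypothetical and not part of perftest:

```python
import ipaddress

def ipv4_to_mgid(ipv4, hexadecimal=False):
    """Format an IPv4 address in the 16-field MGID notation shown above.

    Hypothetical helper, not part of perftest. Layout taken from the
    224.0.0.30 example: ten zero bytes, two 255 bytes, four IPv4 octets.
    """
    octets = ipaddress.IPv4Address(ipv4).packed
    fields = bytes(10) + b"\xff\xff" + octets
    if hexadecimal:
        # Zero fields stay "0"; non-zero fields get a 0x prefix.
        return ":".join(hex(b) if b else "0" for b in fields)
    return ":".join(str(b) for b in fields)

print(ipv4_to_mgid("224.0.0.30"))
print(ipv4_to_mgid("224.0.0.30", hexadecimal=True))
```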

Number of exchanges (at least 5, default for write 5000 else 1000).

Cancel peak-bw calculation (default with peak up to iters=20000).
Relevant only for bandwidth.

Relevant only for Read and Atomic.

Run test in dual-port mode.
Not relevant for RawEth.
Relevant only for bandwidth.
System support required.

Listen on/connect to port <port> (default 18515).

Number of QPs (default 1).
Relevant only for bandwidth.

Generate a CQE only after <--cq-mod> completions.
Relevant only for bandwidth.

Rx queue size (default 512), if using srq, rx-depth controls max-wr size of the srq.
Relevant only for send non fsRate.

Connect QPs with rdma_cm and run test on those QPs.
Not relevant for RawEth.

Size of message to exchange (default 65536 for bw, for lat 2).
Not relevant for Atomic.

SL (default 0).
Not relevant for raw_ethernet_fs_rate.

Size of tx queue (default 128 for bw else 1).
Relevant only for bw and raw_ethernet_burst_lat.

Set <tos_value> on RDMA-CM QPs. Available only with the -R flag. Values 0-256 (default off).
Not relevant for RawEth

QP timeout, timeout value is 4 usec * 2 ^(timeout), default 14.
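
As a worked example of the formula above, the following Python sketch evaluates it for the default value 14. The function name qp_timeout_usec is hypothetical, and the 4 usec base is taken as written in this page:

```python
def qp_timeout_usec(timeout):
    """Timeout in microseconds: 4 usec * 2^timeout, as stated above."""
    return 4 * 2 ** timeout

default_usec = qp_timeout_usec(14)
print(default_usec)          # 65536
print(default_usec / 1000)   # 65.536 (milliseconds)
```

Each increment of the value doubles the timeout, so 15 gives roughly 131 ms.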

(implies -H) print out unsorted results (default sorted).
Relevant only for latency and raw_ethernet_burst_lat and raw_ethernet_fs_rate.

Display perftest version number.

Report performance counter change (example: counters/port_xmit_data,hw_counters/out_of_buffer).

Test uses GID with GID index.
Not relevant for RawEth.

Communicate with rdma_cm module to exchange data - use regular QPs.
Not relevant for RawEth.

Save the report in a json file.

Name of the report json file. (Default: "perftest_out.json" in the working directory).

Show CPU Utilization in report, valid only in Duration mode.

Set a Destination LID instead of getting it from the other side.
Not relevant for raw_ethernet_fs_rate.

Do not exchange versions and MTU with other side.
Not relevant for RawEth.

Force the link(s) to a specific type: IB or Ethernet.
Not relevant for raw_ethernet_fs_rate.

Use a Shared Receive Queue. --rx-depth controls max-wr size of the SRQ.
Relevant only for Send.

Use IPv6 GID. Default is IPv4.
Not relevant for RawEth.

Use IPv6 address for parameters negotiation. Default is IPv4.
Not relevant for RawEth.

Source IP of the interface used for connection establishment. By default taken from routing table.
Not relevant for RawEth.

Delay time between each post send.
Relevant only for latency.

Use an mmap'd file as the buffer for testing P2P transfers.
Not relevant for RawEth.

The mmap offset.
Not relevant for RawEth.

Create memory region for each qp.
Relevant only for bandwidth.

Use On Demand Paging instead of Memory Registration.
System support required.

Set verbosity output level: bandwidth , message_rate, latency.
Latency measurement is Average calculation.
bw (bandwidth / message_rate), latency (latency).

Set the payload by passing a txt file containing a pattern in the following form (little endian): '0xaaaaaaaa, 0xbbbbbbbb, ...'
Not relevant for RawEth and Write latency.

Use old post send flow (ibv_post_send).

Perform some iterations before starting to measure, in order to warm up the memory cache.
Not relevant for raw_ethernet_fs_rate.

PKey index to use for QP.
Not relevant for raw_ethernet_fs_rate.

Report RX & TX results separately on Bidirectional BW tests.
Relevant only for bidirectional bandwidth.

Report Max/Average BW of test in Gbit/sec (instead of MiB/sec).
Relevant only for bandwidth.
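
The relation between the two reporting units can be sketched in Python. The sketch assumes 1 MiB = 2^20 bytes and 1 Gbit = 10^9 bits; the function name is hypothetical, and perftest's own rounding may differ:

```python
def mib_per_sec_to_gbit_per_sec(mib_per_sec):
    """Convert MiB/sec to Gbit/sec.

    Assumptions: 1 MiB = 2**20 bytes, 1 Gbit = 10**9 bits.
    """
    bytes_per_sec = mib_per_sec * 2 ** 20
    return bytes_per_sec * 8 / 1e9

print(mib_per_sec_to_gbit_per_sec(1000))
```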

Report BW data on both ports when running Dualport and Duration mode.
Not relevant for RawEth.
System support required.

Reverse traffic direction: the server sends to the client.

Run test forever, print results every <duration> seconds.

Set retry count value in rdma_cm mode.
Relevant only for rdma_cm mode.
Not relevant for RawEth.

Set the Traffic Class in GRH (if GRH is in use).
Not relevant for raw_ethernet_fs_rate.

Allocate a null memory region for the client with ibv_alloc_null_mr(3)

Use CUDA specific device for GPUDirect RDMA testing.
Not relevant for raw_ethernet_fs_rate.
System support required.

Use CUDA specific device, based on its full PCIe address, for GPUDirect RDMA testing.
Not relevant for raw_ethernet_fs_rate.
System support required.

Use CUDA DMA-BUF for GPUDirect RDMA testing.
Not relevant for raw_ethernet_fs_rate.
System support required.

Use HabanaLabs specific device for HW accelerator direct RDMA testing.
System support required.

Use Neuron specific device for HW accelerator direct RDMA testing.
System support required.

Use Neuron DMA-BUF for HW accelerator direct RDMA testing.
System support required.

Use selected ROCm device for GPUDirect RDMA testing.
Not relevant for raw_ethernet_fs_rate.
System support required.

Use Hugepages instead of contig, memalign allocations.
Not relevant for raw_ethernet_fs_rate.

Wait <seconds> before destroying allocated resources (QP/CQ/PD/MR..).
Relevant only for bandwidth and raw_ethernet_burst_lat.

Disable PCIe relaxed ordering.
Relevant only for bandwidth and raw_ethernet_burst_lat.
System support required.

Set the amount of messages to send in a burst when using rate limiter.
Relevant only for bandwidth and raw_ethernet_burst_lat.

Set the size of packet to send in a burst. Only supports PP rate limiter.
Relevant only for bandwidth and raw_ethernet_burst_lat.

Set the maximum rate of sent packets. The default unit is [Gbps]; use --rate_units to change it.
Relevant only for bandwidth and raw_ethernet_burst_lat.

[Mgp] Set the units for the rate limit to MiBps (M), Gbps (g) or pps (p). Default is Gbps (g).
Relevant only for bandwidth and raw_ethernet_burst_lat.

[HW/SW/PP] Limit the QPs by HW, PP or SW. Disabled by default. When rate_limit is not specified, HW limit is the default.
Relevant only for bandwidth and raw_ethernet_burst_lat.

Use out of order data placement.
System support required.

Use write-with-immediate verb instead of write.
Write tests only.

RawEth only options


Source MAC address by this format XX:XX:XX:XX:XX:XX **MUST** be entered.

Destination MAC address by this format XX:XX:XX:XX:XX:XX **MUST** be entered.

Use RSS on the server side. Requires opening 2^x QPs (using the -q flag; default is -q 2). Open 2^x clients that transmit to this server.

Destination IP address in the format X.X.X.X for IPv4 or X:X:X:X:X:X for IPv6 (used to send packets with an IP header).
System support required for IPv6.

Source IP address in the format X.X.X.X for IPv4 or X:X:X:X:X:X for IPv6 (used to send packets with an IP header).
System support required for IPv6.

Destination port number (used to send packets with a UDP header by default; use the --tcp flag to send a TCP header instead).

Source port number (used to send packets with a UDP header by default; use the --tcp flag to send a TCP header instead).

Ethertype value in the ethernet frame by this format 0xXXXX.

Choose server side for the current machine (--server/--client must be selected).

Insert vlan tag in ethernet header.

Specify vlan_pcp value for vlan tag, 0~7. 8 means different vlan_pcp for each packet.

Choose client side for the current machine (--server/--client must be selected).
Not relevant for raw_ethernet_fs_rate.

Run mac forwarding test.
Not relevant for raw_ethernet_fs_rate.

Set number of TCP/UDP flows, starting from <src_port, dst_port>.
Not relevant for raw_ethernet_fs_rate.

Set number of burst size per TCP/UDP flow.
Not relevant for raw_ethernet_fs_rate.

Run promiscuous mode.
Not relevant for raw_ethernet_fs_rate.

In a latency test, the receiver sends a pong after the given number of received pings.
Not relevant for raw_ethernet_fs_rate.

Run sniffer mode.
Not relevant for raw_ethernet_fs_rate.
System support required.

IPv6 flow label.
Not relevant for raw_ethernet_fs_rate.

Send TCP packets. IP and port information must be included.

Send IPv6 Packets.
System support required.

Relevant only for bandwidth.

AUTHORS