.\" -*- nroff -*- .\" -*- nroff -*- .\" ovs.tmac .\" .\" Open vSwitch troff macro library . . .\" Continuation line for .IP. .de IQ . br . ns . IP "\\$1" .. . .\" Introduces a sub-subsection .de ST . PP . RS -0.15in . I "\\$1" . RE .. . .\" The content between the lines below is from an-ext.tmac in groff .\" 1.21, with some modifications. .\" ---------------------------------------------------------------------- .\" an-ext.tmac .\" .\" Written by Eric S. Raymond .\" Werner Lemberg .\" .\" Version 2007-Feb-02 .\" .\" Copyright (C) 2007, 2009, 2011 Free Software Foundation, Inc. .\" You may freely use, modify and/or distribute this file. .\" .\" .\" The code below provides extension macros for the `man' macro package. .\" Care has been taken to make the code portable; groff extensions are .\" properly hidden so that all troff implementations can use it without .\" changes. .\" .\" With groff, this file is sourced by the `man' macro package itself. .\" Man page authors who are concerned about portability might add the .\" used macros directly to the prologue of the man page(s). . . .\" Convention: Auxiliary macros and registers start with `m' followed .\" by an uppercase letter or digit. . . .\" Declare start of command synopsis. Sets up hanging indentation. .de SY . ie !\\n(mS \{\ . nh . nr mS 1 . nr mA \\n(.j . ad l . nr mI \\n(.i . \} . el \{\ . br . ns . \} . . HP \w'\fB\\$1\fP\ 'u . B "\\$1" .. . . .\" End of command synopsis. Restores adjustment. .de YS . in \\n(mIu . ad \\n(mA . hy \\n(HY . nr mS 0 .. . . .\" Declare optional option. .de OP . ie \\n(.$-1 \ . RI "[\fB\\$1\fP" "\ \\$2" "]" . el \ . RB "[" "\\$1" "]" .. . . .\" Start URL. .de UR . ds m1 \\$1\" . nh . if \\n(mH \{\ . \" Start diversion in a new environment. . do ev URL-div . do di URL-div . \} .. . . .\" End URL. .de UE . ie \\n(mH \{\ . br . di . ev . . \" Has there been one or more input lines for the link text? . ie \\n(dn \{\ . do HTML-NS "" . 
\" Yes, strip off final newline of diversion and emit it. . do chop URL-div . do URL-div \c . do HTML-NS . \} . el \ . do HTML-NS "\\*(m1" \&\\$*\" . \} . el \ \\*(la\\*(m1\\*(ra\\$*\" . . hy \\n(HY .. . . .\" Start email address. .de MT . ds m1 \\$1\" . nh . if \\n(mH \{\ . \" Start diversion in a new environment. . do ev URL-div . do di URL-div . \} .. . . .\" End email address. .de ME . ie \\n(mH \{\ . br . di . ev . . \" Has there been one or more input lines for the link text? . ie \\n(dn \{\ . do HTML-NS "" . \" Yes, strip off final newline of diversion and emit it. . do chop URL-div . do URL-div \c . do HTML-NS . \} . el \ . do HTML-NS "\\*(m1" \&\\$*\" . \} . el \ \\*(la\\*(m1\\*(ra\\$*\" . . hy \\n(HY .. . . .\" Continuation line for .TP header. .de TQ . br . ns . TP \\$1\" no doublequotes around argument! .. . . .\" Start example. .de EX . nr mE \\n(.f . nf . nh . ft CW .. . . .\" End example. .de EE . ft \\n(mE . fi . hy \\n(HY .. . .\" EOF .\" ---------------------------------------------------------------------- .TH ovs\-vswitchd 8 "2.15.0" "Open vSwitch" "Open vSwitch Manual" .\" This program's name: .ds PN ovs\-vswitchd . .SH NAME ovs\-vswitchd \- Open vSwitch daemon . .SH SYNOPSIS \fBovs\-vswitchd \fR[\fIdatabase\fR] . .SH DESCRIPTION A daemon that manages and controls any number of Open vSwitch switches on the local machine. .PP The \fIdatabase\fR argument specifies how \fBovs\-vswitchd\fR connects to \fBovsdb\-server\fR. \fIdatabase\fR may be an OVSDB active or passive connection method, as described in \fBovsdb\fR(7). The default is \fBunix:/var/run/openvswitch/db.sock\fR. .PP \fBovs\-vswitchd\fR retrieves its configuration from \fIdatabase\fR at startup. It sets up Open vSwitch datapaths and then operates switching across each bridge described in its configuration files. As the database changes, \fBovs\-vswitchd\fR automatically updates its configuration to match. 
.PP \fBovs\-vswitchd\fR switches may be configured with any of the following features: . .IP \(bu L2 switching with MAC learning. . .IP \(bu NIC bonding with automatic fail-over and source MAC-based TX load balancing ("SLB"). . .IP \(bu 802.1Q VLAN support. . .IP \(bu Port mirroring, with optional VLAN tagging. . .IP \(bu NetFlow v5 flow logging. . .IP \(bu sFlow(R) monitoring. . .IP \(bu Connectivity to an external OpenFlow controller, such as NOX. . .PP Only a single instance of \fBovs\-vswitchd\fR is intended to run at a time. A single \fBovs\-vswitchd\fR can manage any number of switch instances, up to the maximum number of supported Open vSwitch datapaths. .PP \fBovs\-vswitchd\fR does all the necessary management of Open vSwitch datapaths itself. Thus, \fBovs\-dpctl\fR(8) (and its userspace datapath counterparts accessible via \fBovs\-appctl dpctl/\fIcommand\fR) are not needed with \fBovs\-vswitchd\fR and should not be used because they can interfere with its operation. These tools are still useful for diagnostics. .PP An Open vSwitch datapath kernel module must be loaded for \fBovs\-vswitchd\fR to be useful. Refer to the documentation for instructions on how to build and load the Open vSwitch kernel module. .PP .SH OPTIONS .IP "\fB\-\-mlockall\fR" Causes \fBovs\-vswitchd\fR to call the \fBmlockall()\fR function, to attempt to lock all of its process memory into physical RAM, preventing the kernel from paging any of its memory to disk. This helps to avoid networking interruptions due to system memory pressure. .IP Some systems do not support \fBmlockall()\fR at all, and other systems only allow privileged users, such as the superuser, to use it. \fBovs\-vswitchd\fR emits a log message if \fBmlockall()\fR is unavailable or unsuccessful. . .SS "DPDK Options" For details on initializing \fBovs\-vswitchd\fR to use DPDK ports, refer to the documentation or \fBovs\-vswitchd.conf.db\fR(5). 
.SS "Daemon Options" .ds DD \ \fBovs\-vswitchd\fR detaches only after it has connected to the \ database, retrieved the initial configuration, and set up that \ configuration. .PP The following options are valid on POSIX based platforms. .TP \fB\-\-pidfile\fR[\fB=\fIpidfile\fR] Causes a file (by default, \fB\*(PN.pid\fR) to be created indicating the PID of the running process. If the \fIpidfile\fR argument is not specified, or if it does not begin with \fB/\fR, then it is created in \fB/var/run/openvswitch\fR. .IP If \fB\-\-pidfile\fR is not specified, no pidfile is created. . .TP \fB\-\-overwrite\-pidfile\fR By default, when \fB\-\-pidfile\fR is specified and the specified pidfile already exists and is locked by a running process, \fB\*(PN\fR refuses to start. Specify \fB\-\-overwrite\-pidfile\fR to cause it to instead overwrite the pidfile. .IP When \fB\-\-pidfile\fR is not specified, this option has no effect. . .IP \fB\-\-detach\fR Runs \fB\*(PN\fR as a background process. The process forks, and in the child it starts a new session, closes the standard file descriptors (which has the side effect of disabling logging to the console), and changes its current directory to the root (unless \fB\-\-no\-chdir\fR is specified). After the child completes its initialization, the parent exits. \*(DD . .TP \fB\-\-monitor\fR Creates an additional process to monitor the \fB\*(PN\fR daemon. If the daemon dies due to a signal that indicates a programming error (\fBSIGABRT\fR, \fBSIGALRM\fR, \fBSIGBUS\fR, \fBSIGFPE\fR, \fBSIGILL\fR, \fBSIGPIPE\fR, \fBSIGSEGV\fR, \fBSIGXCPU\fR, or \fBSIGXFSZ\fR) then the monitor process starts a new copy of it. If the daemon dies or exits for another reason, the monitor process exits. .IP This option is normally used with \fB\-\-detach\fR, but it also functions without it. . .TP \fB\-\-no\-chdir\fR By default, when \fB\-\-detach\fR is specified, \fB\*(PN\fR changes its current working directory to the root directory after it detaches. 
Otherwise, invoking \fB\*(PN\fR from a carelessly chosen directory would prevent the administrator from unmounting the file system that holds that directory. .IP Specifying \fB\-\-no\-chdir\fR suppresses this behavior, preventing \fB\*(PN\fR from changing its current working directory. This may be useful for collecting core files, since it is common behavior to write core dumps into the current working directory and the root directory is not a good directory to use. .IP This option has no effect when \fB\-\-detach\fR is not specified. . .TP \fB\-\-no\-self\-confinement\fR By default, the daemon tries to confine itself to working with files under well-known directories determined at build time. It is better to keep this default behavior and not use this flag unless some other access control mechanism is used to confine the daemon. Note that in contrast to other access control implementations that are typically enforced from kernel-space (e.g. DAC or MAC), self-confinement is imposed from the user-space daemon itself and hence should not be considered as a full confinement strategy, but instead should be viewed as an additional layer of security. . .TP \fB\-\-user\fR Causes \fB\*(PN\fR to run as a different user specified in "user:group", thus dropping most of the root privileges. Short forms "user" and ":group" are also allowed, with the current user or group assumed, respectively. Only daemons started by the root user accept this argument. .IP On Linux, daemons will be granted CAP_IPC_LOCK and CAP_NET_BIND_SERVICE before dropping root privileges. Daemons that interact with a datapath, such as \fBovs\-vswitchd\fR, will be granted three additional capabilities, namely CAP_NET_ADMIN, CAP_NET_BROADCAST and CAP_NET_RAW. The capability change will apply even if the new user is root. .IP On Windows, this option is not currently supported. For security reasons, specifying this option will cause the daemon process not to start.
.SS "Service Options" The following options are valid only on Windows platform. .TP \fB\-\-service\fR Causes \fB\*(PN\fR to run as a service in the background. The service should already have been created through external tools like \fBSC.exe\fR. . .TP \fB\-\-service\-monitor\fR Causes the \fB\*(PN\fR service to be automatically restarted by the Windows services manager if the service dies or exits for unexpected reasons. .IP When \fB\-\-service\fR is not specified, this option has no effect. .SS "Public Key Infrastructure Options" .IP "\fB\-p\fR \fIprivkey.pem\fR" .IQ "\fB\-\-private\-key=\fIprivkey.pem\fR" Specifies a PEM file containing the private key used as \fB\*(PN\fR's identity for outgoing SSL connections. . .IP "\fB\-c\fR \fIcert.pem\fR" .IQ "\fB\-\-certificate=\fIcert.pem\fR" Specifies a PEM file containing a certificate that certifies the private key specified on \fB\-p\fR or \fB\-\-private\-key\fR to be trustworthy. The certificate must be signed by the certificate authority (CA) that the peer in SSL connections will use to verify it. . .IP "\fB\-C\fR \fIcacert.pem\fR" .IQ "\fB\-\-ca\-cert=\fIcacert.pem\fR" Specifies a PEM file containing the CA certificate that \fB\*(PN\fR should use to verify certificates presented to it by SSL peers. (This may be the same certificate that SSL peers use to verify the certificate specified on \fB\-c\fR or \fB\-\-certificate\fR, or it may be a different one, depending on the PKI design in use.) . .IP "\fB\-C none\fR" .IQ "\fB\-\-ca\-cert=none\fR" Disables verification of certificates presented by SSL peers. This introduces a security risk, because it means that certificates cannot be verified to be those of known trusted hosts. .IP "\fB\-\-bootstrap\-ca\-cert=\fIcacert.pem\fR" When \fIcacert.pem\fR exists, this option has the same effect as \fB\-C\fR or \fB\-\-ca\-cert\fR. 
If it does not exist, then \fB\*(PN\fR will attempt to obtain the CA certificate from the SSL peer on its first SSL connection and save it to the named PEM file. If it is successful, it will immediately drop the connection and reconnect, and from then on all SSL connections must be authenticated by a certificate signed by the CA certificate thus obtained. .IP \fBThis option exposes the SSL connection to a man-in-the-middle attack obtaining the initial CA certificate\fR, but it may be useful for bootstrapping. .IP This option is only useful if the SSL peer sends its CA certificate as part of the SSL certificate chain. The SSL protocol does not require the server to send the CA certificate. .IP This option is mutually exclusive with \fB\-C\fR and \fB\-\-ca\-cert\fR. .IP "\fB\-\-peer\-ca\-cert=\fIpeer-cacert.pem\fR" Specifies a PEM file that contains one or more additional certificates to send to SSL peers. \fIpeer-cacert.pem\fR should be the CA certificate used to sign \fB\*(PN\fR's own certificate, that is, the certificate specified on \fB\-c\fR or \fB\-\-certificate\fR. If \fB\*(PN\fR's certificate is self-signed, then \fB\-\-certificate\fR and \fB\-\-peer\-ca\-cert\fR should specify the same file. .IP This option is not useful in normal operation, because the SSL peer must already have the CA certificate for the peer to have any confidence in \fB\*(PN\fR's identity. However, this offers a way for a new installation to bootstrap the CA certificate on its first SSL connection. .SS "Logging Options" .IP "\fB\-v\fR[\fIspec\fR] .IQ "\fB\-\-verbose=\fR[\fIspec\fR] . Sets logging levels. Without any \fIspec\fR, sets the log level for every module and destination to \fBdbg\fR. Otherwise, \fIspec\fR is a list of words separated by spaces or commas or colons, up to one from each category below: . .RS .IP \(bu A valid module name, as displayed by the \fBvlog/list\fR command on \fBovs\-appctl\fR(8), limits the log level change to the specified module. . 
.IP \(bu \fBsyslog\fR, \fBconsole\fR, or \fBfile\fR, to limit the log level change to only the system log, the console, or a file, respectively. (If \fB\-\-detach\fR is specified, \fB\*(PN\fR closes its standard file descriptors, so logging to the console will have no effect.) .IP On Windows, \fBsyslog\fR is accepted as a word and is only useful along with the \fB\-\-syslog\-target\fR option (the word has no effect otherwise). . .IP \(bu \fBoff\fR, \fBemer\fR, \fBerr\fR, \fBwarn\fR, \fBinfo\fR, or \fBdbg\fR, to control the log level. Messages of the given severity or higher will be logged, and messages of lower severity will be filtered out. \fBoff\fR filters out all messages. See \fBovs\-appctl\fR(8) for a definition of each log level. .RE . .IP Case is not significant within \fIspec\fR. .IP Regardless of the log levels set for \fBfile\fR, logging to a file will not take place unless \fB\-\-log\-file\fR is also specified (see below). .IP For compatibility with older versions of OVS, \fBany\fR is accepted as a word but has no effect. . .IP "\fB\-v\fR" .IQ "\fB\-\-verbose\fR" Sets the maximum logging verbosity level, equivalent to \fB\-\-verbose=dbg\fR. . .IP "\fB\-vPATTERN:\fIdestination\fB:\fIpattern\fR" .IQ "\fB\-\-verbose=PATTERN:\fIdestination\fB:\fIpattern\fR" Sets the log pattern for \fIdestination\fR to \fIpattern\fR. Refer to \fBovs\-appctl\fR(8) for a description of the valid syntax for \fIpattern\fR. . .IP "\fB\-vFACILITY:\fIfacility\fR" .IQ "\fB\-\-verbose=FACILITY:\fIfacility\fR" Sets the RFC 5424 facility of the log message. \fIfacility\fR can be one of \fBkern\fR, \fBuser\fR, \fBmail\fR, \fBdaemon\fR, \fBauth\fR, \fBsyslog\fR, \fBlpr\fR, \fBnews\fR, \fBuucp\fR, \fBclock\fR, \fBftp\fR, \fBntp\fR, \fBaudit\fR, \fBalert\fR, \fBclock2\fR, \fBlocal0\fR, \fBlocal1\fR, \fBlocal2\fR, \fBlocal3\fR, \fBlocal4\fR, \fBlocal5\fR, \fBlocal6\fR or \fBlocal7\fR.
If this option is not specified, \fBdaemon\fR is used as the default for the local system syslog and \fBlocal0\fR is used while sending a message to the target provided via the \fB\-\-syslog\-target\fR option. . .TP \fB\-\-log\-file\fR[\fB=\fIfile\fR] Enables logging to a file. If \fIfile\fR is specified, then it is used as the exact name for the log file. The default log file name used if \fIfile\fR is omitted is \fB/var/log/openvswitch/\*(PN.log\fR. . .IP "\fB\-\-syslog\-target=\fIhost\fB:\fIport\fR" Sends syslog messages to UDP \fIport\fR on \fIhost\fR, in addition to the system syslog. The \fIhost\fR must be a numerical IP address, not a hostname. . .IP "\fB\-\-syslog\-method=\fImethod\fR" Specifies the \fImethod\fR by which syslog messages are sent to the syslog daemon. The following forms are supported: .RS .IP \(bu \fBlibc\fR, use the libc \fBsyslog()\fR function. The downside of this option is that libc adds a fixed prefix to every message before it is actually sent to the syslog daemon over the \fB/dev/log\fR UNIX domain socket. .IP \(bu \fBunix:\fIfile\fR\fR, use a UNIX domain socket directly. It is possible to specify an arbitrary message format with this option. However, \fBrsyslogd 8.9\fR and older versions use a hard-coded parser function that limits UNIX domain socket use. If you want to use an arbitrary message format with older \fBrsyslogd\fR versions, then use a UDP socket to a localhost IP address instead. .IP \(bu \fBudp:\fIip\fR:\fIport\fR\fR, use a UDP socket. With this method it is possible to use an arbitrary message format also with older \fBrsyslogd\fR. Sending syslog messages over a UDP socket requires extra care: for example, the syslog daemon must be configured to listen on the specified UDP port, stray iptables rules could interfere with local syslog traffic, and some security considerations apply to UDP sockets that do not apply to UNIX domain sockets. .IP \(bu \fBnull\fR, discards all messages logged to syslog.
.RE .IP The default is taken from the \fBOVS_SYSLOG_METHOD\fR environment variable; if it is unset, the default is \fBlibc\fR. .SS "Other Options" .IP "\fB\-\-unixctl=\fIsocket\fR" Sets the name of the control socket on which \fB\*(PN\fR listens for runtime management commands (see \fBRUNTIME MANAGEMENT COMMANDS\fR, below). If \fIsocket\fR does not begin with \fB/\fR, it is interpreted as relative to \fB/var/run/openvswitch\fR. If \fB\-\-unixctl\fR is not used at all, the default socket is \fB/var/run/openvswitch/\*(PN.\fIpid\fB.ctl\fR, where \fIpid\fR is \fB\*(PN\fR's process ID. .IP On Windows, a local named pipe is used to listen for runtime management commands. A file is created at the absolute path given by \fIsocket\fR or, if \fB\-\-unixctl\fR is not used at all, as \fB\*(PN.ctl\fR in the configured \fIOVS_RUNDIR\fR directory. The file exists just to mimic the behavior of a Unix domain socket. .IP Specifying \fBnone\fR for \fIsocket\fR disables the control socket feature. .IP "\fB\-h\fR" .IQ "\fB\-\-help\fR" Prints a brief help message to the console. . .IP "\fB\-V\fR" .IQ "\fB\-\-version\fR" Prints version information to the console. . .SH "RUNTIME MANAGEMENT COMMANDS" \fBovs\-appctl\fR(8) can send commands to a running \fBovs\-vswitchd\fR process. The currently supported commands are described below. The command descriptions assume an understanding of how to configure Open vSwitch. .SS "GENERAL COMMANDS" .IP "\fBexit\fR \fI--cleanup\fR" Causes \fBovs\-vswitchd\fR to gracefully terminate. If \fI--cleanup\fR is specified, deletes flows from datapaths and releases other datapath resources configured by \fBovs\-vswitchd\fR. Otherwise, datapath flows and other resources remain undeleted. Resources of datapaths that are integrated into \fBovs\-vswitchd\fR (e.g. the \fBnetdev\fR datapath type) are always released regardless of \fI--cleanup\fR, except for ports with \fBinternal\fR type. Use \fI--cleanup\fR to release \fBinternal\fR ports too. .
.IP "\fBqos/show-types\fR \fIinterface\fR" Queries the interface for a list of Quality of Service types that are configurable via Open vSwitch for the given \fIinterface\fR. .IP "\fBqos/show\fR \fIinterface\fR" Queries the kernel for Quality of Service configuration and statistics associated with the given \fIinterface\fR. .IP "\fBbfd/show\fR [\fIinterface\fR]" Displays detailed information about Bidirectional Forwarding Detection configured on \fIinterface\fR. If \fIinterface\fR is not specified, then displays detailed information about all interfaces with BFD enabled. .IP "\fBbfd/set-forwarding\fR [\fIinterface\fR] \fIstatus\fR" Force the fault status of the BFD module on \fIinterface\fR (or all interfaces if none is given) to be \fIstatus\fR. \fIstatus\fR can be "true", "false", or "normal" which reverts to the standard behavior. .IP "\fBcfm/show\fR [\fIinterface\fR]" Displays detailed information about Connectivity Fault Management configured on \fIinterface\fR. If \fIinterface\fR is not specified, then displays detailed information about all interfaces with CFM enabled. .IP "\fBcfm/set-fault\fR [\fIinterface\fR] \fIstatus\fR" Force the fault status of the CFM module on \fIinterface\fR (or all interfaces if none is given) to be \fIstatus\fR. \fIstatus\fR can be "true", "false", or "normal" which reverts to the standard behavior. .IP "\fBstp/tcn\fR [\fIbridge\fR]" Forces a topology change event on \fIbridge\fR if it's running STP. This may cause it to send Topology Change Notifications to its peers and flush its MAC table. If no \fIbridge\fR is given, forces a topology change event on all bridges. .IP "\fBstp/show\fR [\fIbridge\fR]" Displays detailed information about spanning tree on the \fIbridge\fR. If \fIbridge\fR is not specified, then displays detailed information about all bridges with STP enabled. .IP "\fBrstp/tcn\fR [\fIbridge\fR]" Forces a topology change event on \fIbridge\fR if it's running RSTP. 
This may cause it to send Topology Change Notifications to its peers and flush its MAC table. If no \fIbridge\fR is given, forces a topology change event on all bridges. .IP "\fBrstp/show\fR [\fIbridge\fR]" Displays detailed information about rapid spanning tree on the \fIbridge\fR. If \fIbridge\fR is not specified, then displays detailed information about all bridges with RSTP enabled. .SS "BRIDGE COMMANDS" These commands manage bridges. .IP "\fBfdb/flush\fR [\fIbridge\fR]" Flushes \fIbridge\fR MAC address learning table, or all learning tables if no \fIbridge\fR is given. .IP "\fBfdb/show\fR \fIbridge\fR" Lists each MAC address/VLAN pair learned by the specified \fIbridge\fR, along with the port on which it was learned and the age of the entry, in seconds. .IP "\fBfdb/stats-clear\fR [\fIbridge\fR]" Clear \fIbridge\fR MAC address learning table statistics, or all statistics if no \fIbridge\fR is given. .IP "\fBfdb/stats-show\fR \fIbridge\fR" Show MAC address learning table statistics for the specified \fIbridge\fR. .IP "\fBmdb/flush\fR [\fIbridge\fR]" Flushes \fIbridge\fR multicast snooping table, or all snooping tables if no \fIbridge\fR is given. .IP "\fBmdb/show\fR \fIbridge\fR" Lists each multicast group/VLAN pair learned by the specified \fIbridge\fR, along with the port on which it was learned and the age of the entry, in seconds. .IP "\fBbridge/reconnect\fR [\fIbridge\fR]" Makes \fIbridge\fR drop all of its OpenFlow controller connections and reconnect. If \fIbridge\fR is not specified, then all bridges drop their controller connections and reconnect. .IP This command might be useful for debugging OpenFlow controller issues. . .IP "\fBbridge/dump\-flows\fR [\fB\-\-offload-stats\fR] \fIbridge\fR" Lists all flows in \fIbridge\fR, including those normally hidden to commands such as \fBovs\-ofctl dump\-flows\fR. Flows set up by mechanisms such as in-band control and fail-open are hidden from the controller since it is not allowed to modify or override them. 
If \fB\-\-offload-stats\fR is specified, then also lists statistics for offloaded packets and bytes, which are a subset of the total packets and bytes. .SS "BOND COMMANDS" These commands manage bonded ports on Open vSwitch bridges. To understand some of these commands, it is important to understand a detail of the bonding implementation called ``source load balancing'' (SLB). Instead of directly assigning Ethernet source addresses to members, the bonding implementation computes a function that maps a 48-bit Ethernet source address into an 8-bit value (a ``MAC hash'' value). All of the Ethernet addresses that map to a single 8-bit value are then assigned to a single member. .IP "\fBbond/list\fR" Lists all of the bonds, and their members, on each bridge. . .IP "\fBbond/show\fR [\fIport\fR]" Lists all of the bond-specific information (updelay, downdelay, time until the next rebalance) about the given bonded \fIport\fR, or all bonded ports if no \fIport\fR is given. Also lists information about each member: whether it is enabled or disabled, the time to completion of an updelay or downdelay if one is in progress, whether it is the active member, and the hashes assigned to the member. Any LACP information related to this bond may be found using the \fBlacp/show\fR command. . .IP "\fBbond/migrate\fR \fIport\fR \fIhash\fR \fImember\fR" Only valid for SLB bonds. Assigns a given MAC hash to a new member. \fIport\fR specifies the bond port, \fIhash\fR the MAC hash to be migrated (as a decimal number between 0 and 255), and \fImember\fR the new member to be assigned. .IP The reassignment is not permanent: rebalancing or fail-over will cause the MAC hash to be shifted to a new member in the usual manner. .IP A MAC hash cannot be migrated to a disabled member. .IP "\fBbond/set\-active\-member\fR \fIport\fR \fImember\fR" Sets \fImember\fR as the active member on \fIport\fR. \fImember\fR must currently be enabled.
.IP The setting is not permanent: a new active member will be selected if \fImember\fR becomes disabled. .IP "\fBbond/enable\-member\fR \fIport\fR \fImember\fR" .IQ "\fBbond/disable\-member\fR \fIport\fR \fImember\fR" Enables (or disables) \fImember\fR on the given bond \fIport\fR, skipping any updelay (or downdelay). .IP This setting is not permanent: it persists only until the carrier status of \fImember\fR changes. .IP "\fBbond/hash\fR \fImac\fR [\fIvlan\fR] [\fIbasis\fR]" Returns the hash value which would be used for \fImac\fR with \fIvlan\fR and \fIbasis\fR if specified. . .IP "\fBlacp/show\fR [\fIport\fR]" Lists all of the LACP-related information about the given \fIport\fR: active or passive, aggregation key, system id, and system priority. Also lists information about each member: whether it is enabled or disabled, whether it is attached or detached, port id and priority, actor information, and partner information. If \fIport\fR is not specified, then displays detailed information about all interfaces with LACP enabled. . .IP "\fBlacp/stats-show\fR [\fIport\fR]" Lists various stats about LACP PDUs (number of RX/TX PDUs, bad PDUs received) and member state (number of times its state expired/defaulted and carrier status changed) for the given \fIport\fR. If \fIport\fR is not specified, then displays stats of all interfaces with LACP enabled. .SS "DPCTL DATAPATH DEBUGGING COMMANDS" The primary way to configure \fBovs\-vswitchd\fR is through the Open vSwitch database, e.g. using \fBovs\-vsctl\fR(8). These commands provide a debugging interface for managing datapaths. They implement the same features (and syntax) as \fBovs\-dpctl\fR(8). Unlike \fBovs\-dpctl\fR(8), these commands work with datapaths that are integrated into \fBovs\-vswitchd\fR (e.g. the \fBnetdev\fR datapath type). .PP . .ds DX \fBdpctl/\fR .de DO \\$2 \\$1 \\$3 ..
Do not use commands to add or remove or modify datapaths if \fBovs\-vswitchd\fR is running because this interferes with \fBovs\-vswitchd\fR's own datapath management. .TP \*(DX\fBadd\-dp \fIdp\fR [\fInetdev\fR[\fB,\fIoption\fR]...] Creates datapath \fIdp\fR, with a local port also named \fIdp\fR. This will fail if a network device \fIdp\fR already exists. .IP If \fInetdev\fRs are specified, \fB\*(PN\fR adds them to the new datapath, just as if \fBadd\-if\fR was specified. . .TP \*(DX\fBdel\-dp \fIdp\fR Deletes datapath \fIdp\fR. If \fIdp\fR is associated with any network devices, they are automatically removed. . .TP \*(DX\fBadd\-if \fIdp netdev\fR[\fB,\fIoption\fR]... Adds each \fInetdev\fR to the set of network devices datapath \fIdp\fR monitors, where \fIdp\fR is the name of an existing datapath, and \fInetdev\fR is the name of one of the host's network devices, e.g. \fBeth0\fR. Once a network device has been added to a datapath, the datapath has complete ownership of the network device's traffic and the network device appears silent to the rest of the system. .IP A \fInetdev\fR may be followed by a comma-separated list of options. The following options are currently supported: . .RS .IP "\fBtype=\fItype\fR" Specifies the type of port to add. The default type is \fBsystem\fR. .IP "\fBport_no=\fIport\fR" Requests a specific port number within the datapath. If this option is not specified then one will be automatically assigned. .IP "\fIkey\fB=\fIvalue\fR" Adds an arbitrary key-value option to the port's configuration. .RE .IP \fBovs\-vswitchd.conf.db\fR(5) documents the available port types and options. . .IP "\*(DX\fBset\-if \fIdp port\fR[\fB,\fIoption\fR]..." Reconfigures each \fIport\fR in \fIdp\fR as specified. An \fIoption\fR of the form \fIkey\fB=\fIvalue\fR adds the specified key-value option to the port or overrides an existing key's value. An \fIoption\fR of the form \fIkey\fB=\fR, that is, without a value, deletes the key-value named \fIkey\fR. 
The type and port number of a port cannot be changed, so \fBtype\fR and \fBport_no\fR are only allowed if they match the existing configuration. .TP \*(DX\fBdel\-if \fIdp netdev\fR... Removes each \fInetdev\fR from the list of network devices datapath \fIdp\fR monitors. . .TP \*(DX\fBdump\-dps\fR Prints the name of each configured datapath on a separate line. . .TP .DO "[\fB\-s\fR | \fB\-\-statistics\fR]" "\*(DX\fBshow" "\fR[\fIdp\fR...]" Prints a summary of configured datapaths, including their datapath numbers and a list of ports connected to each datapath. (The local port is identified as port 0.) If \fB\-s\fR or \fB\-\-statistics\fR is specified, then packet and byte counters are also printed for each port. .IP The datapath numbers consist of flow stats and mega flow mask stats. .IP The "lookups" row displays three stats related to flow lookups triggered by processing incoming packets in the datapath. "hit" displays the number of packets that match existing flows. "missed" displays the number of packets that match no existing flow and require user space processing. "lost" displays the number of packets destined for user space processing but dropped before reaching userspace. The sum of "hit" and "missed" equals the total number of packets the datapath processed. .IP The "flows" row displays the number of flows in the datapath. .IP The "masks" row displays the mega flow mask stats. This row is omitted for datapaths that do not implement mega flows. "hit" displays the total number of masks visited for matching incoming packets. "total" displays the number of masks in the datapath. "hit/pkt" displays the average number of masks visited per packet, that is, the ratio between "hit" and the total number of packets processed by the datapath. .IP If one or more datapaths are specified, information on only those datapaths is displayed. Otherwise, \fB\*(PN\fR displays information about all configured datapaths.
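.IP
As a worked example of the counters described above, the following sketch recomputes the "hit/pkt" figure from the "lookups" and "masks" rows. The function name and the sample numbers are hypothetical, chosen only for illustration; they are not taken from real \fBdpctl/show\fR output.
```python
def masks_hit_per_packet(lookup_hit, lookup_missed, masks_hit):
    """Average number of masks visited per packet.

    lookup_hit and lookup_missed are the "hit" and "missed" values of the
    "lookups" row; masks_hit is the "hit" value of the "masks" row.
    All names here are illustrative assumptions.
    """
    # The text states that "hit" plus "missed" is the total packet count.
    total_packets = lookup_hit + lookup_missed
    if total_packets == 0:
        return 0.0  # no packets processed yet, so avoid dividing by zero
    return masks_hit / total_packets


# Made-up sample: 900 hits, 100 misses, 2500 mask visits.
print(masks_hit_per_packet(900, 100, 2500))  # 2.5
```
With these sample values the datapath processed 1000 packets and visited 2.5 masks per packet on average.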
.SS "DATAPATH FLOW TABLE DEBUGGING COMMANDS" The following commands are primarily useful for debugging Open vSwitch. The flow table entries (both matches and actions) that they work with are not OpenFlow flow entries. Instead, they are different and considerably simpler flows maintained by the Open vSwitch kernel module. Do not use commands to add or remove or modify datapath flows if \fBovs\-vswitchd\fR is running because this interferes with \fBovs\-vswitchd\fR's own datapath flow management. Use \fBovs\-ofctl\fR(8), instead, to work with OpenFlow flow entries. . .PP The \fIdp\fR argument to each of these commands is optional when exactly one datapath exists, in which case that datapath is the default. When multiple datapaths exist, a datapath name is required. . .TP .DO "[\fB\-m \fR| \fB\-\-more\fR] [\fB\-\-names \fR| \fB\-\-no\-names\fR]" \*(DX\fBdump\-flows\fR "[\fIdp\fR] [\fBfilter=\fIfilter\fR] [\fBtype=\fItype\fR] [\fBpmd=\fIpmd\fR]" Prints to the console all flow entries in datapath \fIdp\fR's flow table. Without \fB\-m\fR or \fB\-\-more\fR, output omits match fields that a flow wildcards entirely; with \fB\-m\fR or \fB\-\-more\fR, output includes all wildcarded fields. .IP If \fBfilter=\fIfilter\fR is specified, only displays the flows that match the \fIfilter\fR. \fIfilter\fR is a flow in a form similar to that accepted by \fBovs\-ofctl\fR(8)'s \fBadd\-flow\fR command. (This is not an OpenFlow flow: besides other differences, it never contains wildcards.) The \fIfilter\fR is also useful to match wildcarded fields in the datapath flow. As an example, \fBfilter='tcp,tp_src=100'\fR will match the datapath flow containing '\fBtcp(src=80/0xff00,dst=8080/0xff)\fR'. .IP If \fBpmd=\fIpmd\fR is specified, only displays flows of the specified pmd. Using \fBpmd=\fI-1\fR will restrict the dump to flows from the main thread. This option is only supported by the \fBuserspace datapath\fR.
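.IP
To see why the \fBfilter='tcp,tp_src=100'\fR example above matches a datapath flow containing \fBsrc=80/0xff00\fR, the sketch below applies the usual value/mask comparison. It only illustrates masked matching in general; the function name is hypothetical and this is not the code \fBovs\-vswitchd\fR itself uses.
```python
def masked_match(value, flow_value, mask):
    """Return True if value agrees with flow_value on every bit set in mask."""
    return (value & mask) == (flow_value & mask)


# Source port 100 against src=80/0xff00: only the high byte is compared,
# and both 100 and 80 have a high byte of zero.
print(masked_match(100, 80, 0xff00))  # True

# Source port 256 has a high byte of 1, so it does not match.
print(masked_match(256, 80, 0xff00))  # False
```
The filter does not name a destination port, so the \fBdst=8080/0xff\fR part of the datapath flow places no constraint on the comparison.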
.IP If \fBtype=\fItype\fR is specified, only displays flows of the specified types. This option is supported only for \fBovs\-appctl dpctl/dump\-flows\fR. \fItype\fR is a comma-separated list, which can contain any of the following: . \fBovs\fR - displays flows handled in the ovs dp \fBtc\fR - displays flows handled in the tc dp \fBdpdk\fR - displays flows fully offloaded by dpdk \fBoffloaded\fR - displays flows offloaded to the HW \fBnon-offloaded\fR - displays flows not offloaded to the HW \fBpartially-offloaded\fR - displays flows where only part of their processing is done in HW \fBall\fR - displays all the types of flows .IP By default all the types of flows are displayed. \fBovs\-dpctl\fR always acts as if the \fBtype\fR was \fIovs\fR. . .IP "\*(DX\fBadd\-flow\fR [\fIdp\fR] \fIflow actions\fR" .TP .DO "[\fB\-\-clear\fR] [\fB\-\-may-create\fR] [\fB\-s\fR | \fB\-\-statistics\fR]" "\*(DX\fBmod\-flow\fR" "[\fIdp\fR] \fIflow actions\fR" Adds or modifies a flow in \fIdp\fR's flow table that, when a packet matching \fIflow\fR arrives, causes \fIactions\fR to be executed. .IP The \fBadd\-flow\fR command succeeds only if \fIflow\fR does not already exist in \fIdp\fR. Contrariwise, \fBmod\-flow\fR without \fB\-\-may\-create\fR only modifies the actions for an existing flow. With \fB\-\-may\-create\fR, \fBmod\-flow\fR will add a new flow or modify an existing one. .IP If \fB\-s\fR or \fB\-\-statistics\fR is specified, then \fBmod\-flow\fR prints the modified flow's statistics. A flow's statistics are the number of packets and bytes that have passed through the flow, the elapsed time since the flow last processed a packet (if ever), and (for TCP flows) the union of the TCP flags processed through the flow. .IP With \fB\-\-clear\fR, \fBmod\-flow\fR zeros out the flow's statistics. The statistics printed if \fB\-s\fR or \fB\-\-statistics\fR is also specified are those from just before clearing the statistics.
.IP NOTE: \fIflow\fR and \fIactions\fR do not match the syntax used with \fBovs\-ofctl\fR(8)'s \fBadd\-flow\fR command. . .IP \fBUsage Examples\fR . .RS .PP Forward ARP between ports 1 and 2 on datapath myDP: .IP ovs-dpctl add-flow myDP \\ . "in_port(1),eth(),eth_type(0x0806),arp()" 2 . .IP ovs-dpctl add-flow myDP \\ . "in_port(2),eth(),eth_type(0x0806),arp()" 1 . .PP Forward all IPv4 traffic between two addresses on ports 1 and 2: . .IP ovs-dpctl add-flow myDP \\ . "in_port(1),eth(),eth_type(0x800),\\ ipv4(src=172.31.110.4,dst=172.31.110.5)" 2 . .IP ovs-dpctl add-flow myDP \\ . "in_port(2),eth(),eth_type(0x800),\\ ipv4(src=172.31.110.5,dst=172.31.110.4)" 1 . .RE .TP \*(DX\fBadd\-flows\fR [\fIdp\fR] \fIfile\fR .TQ \*(DX\fBmod\-flows\fR [\fIdp\fR] \fIfile\fR .TQ \*(DX\fBdel\-flows\fR [\fIdp\fR] \fIfile\fR Reads flow entries from \fIfile\fR (or \fBstdin\fR if \fIfile\fR is \fB\-\fR) and adds, modifies, or deletes each entry to the datapath. . Each flow specification (e.g., each line in \fIfile\fR) may start with \fBadd\fR, \fBmodify\fR, or \fBdelete\fR keyword to specify whether a flow is to be added, modified, or deleted. A flow specification without one of these keywords is treated based on the used command. All flow modifications are executed as individual transactions in the order specified. . .TP .DO "[\fB\-s\fR | \fB\-\-statistics\fR]" "\*(DX\fBdel\-flow\fR" "[\fIdp\fR] \fIflow\fR" Deletes the flow from \fIdp\fR's flow table that matches \fIflow\fR. If \fB\-s\fR or \fB\-\-statistics\fR is specified, then \fBdel\-flow\fR prints the deleted flow's statistics. . .TP .DO "[\fB\-m \fR| \fB\-\-more\fR] [\fB\-\-names \fR| \fB\-\-no\-names\fR]" "\*(DX\fBget\-flow\fR [\fIdp\fR] ufid:\fIufid\fR" Fetches the flow from \fIdp\fR's flow table with unique identifier \fIufid\fR. \fIufid\fR must be specified as a string of 32 hexadecimal characters. . .IP "\*(DX\fBdel\-flows\fR [\fIdp\fR]" Deletes all flow entries from datapath \fIdp\fR's flow table. 
.SS "CONNECTION TRACKING TABLE COMMANDS" The following commands are useful for debugging and configuring the connection tracking table in the datapath. . .PP The \fIdp\fR argument to each of these commands is optional when exactly one datapath exists, in which case that datapath is the default. When multiple datapaths exist, then a datapath name is required. . .PP \fBN.B.\fR(Linux specific): the \fIsystem\fR datapaths (i.e. the Linux kernel module Open vSwitch datapaths) share a single connection tracking table (which is also used by other kernel subsystems, such as iptables, nftables and the regular host stack). Therefore, the following commands do not apply specifically to one datapath. . .TP \*(DX\fBipf\-set\-enabled\fR [\fIdp\fR] \fBv4\fR|\fBv6\fR .TQ \*(DX\fBipf\-set\-disabled\fR [\fIdp\fR] \fBv4\fR|\fBv6\fR Enables or disables IP fragmentation handling for the userspace connection tracker. Either \fBv4\fR or \fBv6\fR must be specified. Both IPv4 and IPv6 fragment reassembly are enabled by default. Only supported for the userspace datapath. . .TP \*(DX\fBipf\-set\-min\-frag\fR [\fIdp\fR] \fBv4\fR|\fBv6\fR \fIminfrag\fR Sets the minimum fragment size (L3 header and data) for non-final fragments to \fIminfrag\fR. Either \fBv4\fR or \fBv6\fR must be specified. For enhanced DOS security, higher minimum fragment sizes can usually be used. The default IPv4 value is 1200 and the clamped minimum is 400. The default IPv6 value is 1280, with a clamped minimum of 400, for testing flexibility. The maximum fragment size is not clamped, however, setting this value too high might result in valid fragments being dropped. Only supported for userspace datapath. . .TP \*(DX\fBipf\-set\-max\-nfrags\fR [\fIdp\fR] \fImaxfrags\fR Sets the maximum number of fragments tracked by the userspace datapath connection tracker to \fImaxfrags\fR. The default value is 1000 and the clamped maximum is 5000. 
Note that packet buffers can be held by the fragmentation module while fragments are incomplete, but will timeout after 15 seconds. Memory pool sizing should be set accordingly when fragmentation is enabled. Only supported for userspace datapath. . .TP .DO "[\fB\-m\fR | \fB\-\-more\fR]" "\*(DX\fBipf\-get\-status\fR [\fIdp\fR]" Gets the configuration settings and fragment counters associated with the fragmentation handling of the userspace datapath connection tracker. With \fB\-m\fR or \fB\-\-more\fR, also dumps the IP fragment lists. Only supported for userspace datapath. . .TP .DO "[\fB\-m\fR | \fB\-\-more\fR] [\fB\-s\fR | \fB\-\-statistics\fR]" "\*(DX\fBdump\-conntrack\fR" "[\fIdp\fR] [\fBzone=\fIzone\fR]" Prints to the console all the connection entries in the tracker used by \fIdp\fR. If \fBzone=\fIzone\fR is specified, only shows the connections in \fIzone\fR. With \fB\-\-more\fR, some implementation specific details are included. With \fB\-\-statistics\fR timeouts and timestamps are added to the output. . .TP \*(DX\fBflush\-conntrack\fR [\fIdp\fR] [\fBzone=\fIzone\fR] [\fIct-tuple\fR] Flushes the connection entries in the tracker used by \fIdp\fR based on \fIzone\fR and connection tracking tuple \fIct-tuple\fR. If \fIct-tuple\fR is not provided, flushes all the connection entries. If \fBzone\fR=\fIzone\fR is specified, only flushes the connections in \fIzone\fR. .IP If \fIct-tuple\fR is provided, flushes the connection entry specified by \fIct-tuple\fR in \fIzone\fR. The zone defaults to 0 if it is not provided. The userspace connection tracker requires flushing with the original pre-NATed tuple and a warning log will be otherwise generated. An example of an IPv4 ICMP \fIct-tuple\fR: .IP "ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=1,icmp_type=8,icmp_code=0,icmp_id=10" .IP An example of an IPv6 TCP \fIct-tuple\fR: .IP "ct_ipv6_src=fc00::1,ct_ipv6_dst=fc00::2,ct_nw_proto=6,ct_tp_src=1,ct_tp_dst=2" . 
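.IP
Because a \fIct-tuple\fR is a comma-separated list of \fIkey\fR=\fIvalue\fR pairs, it can be assembled programmatically before being passed to \fBflush\-conntrack\fR. The following Python sketch rebuilds the IPv4 ICMP example shown above. The helper name is hypothetical, and only field names that already appear in the examples are used.

```python
def format_ct_tuple(fields):
    """Join key/value pairs into a comma-separated ct-tuple string."""
    # Dictionaries keep insertion order, so fields appear as written.
    return ",".join(f"{key}={value}" for key, value in fields.items())


icmp_tuple = format_ct_tuple({
    "ct_nw_src": "10.1.1.1",
    "ct_nw_dst": "10.1.1.2",
    "ct_nw_proto": 1,
    "icmp_type": 8,
    "icmp_code": 0,
    "icmp_id": 10,
})
print(icmp_tuple)
```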
.TP .DO "[\fB\-m\fR | \fB\-\-more\fR]" "\*(DX\fBct\-stats\-show\fR [\fIdp\fR] [\fBzone=\fIzone\fR]" Displays the number of connections grouped by protocol used by \fIdp\fR. If \fBzone=\fIzone\fR is specified, numbers refer to the connections in \fIzone\fR. With \fB\-\-more\fR, groups by connection state for each protocol. . .TP \*(DX\fBct\-bkts\fR [\fIdp\fR] [\fBgt=\fIthreshold\fR] For each conntrack bucket, displays the number of connections used by \fIdp\fR. If \fBgt=\fIthreshold\fR is specified, bucket numbers are displayed when the number of connections in a bucket is greater than \fIthreshold\fR. . .TP \*(DX\fBct\-set\-maxconns\fR [\fIdp\fR] \fImaxconns\fR Sets the maximum limit of connection tracker entries to \fImaxconns\fR on \fIdp\fR. This can be used to reduce the processing load on the system due to connection tracking or simply limiting connection tracking. If the number of connections is already over the new maximum limit request then the new maximum limit will be enforced when the number of connections decreases to that limit, which normally happens due to connection expiry. Only supported for userspace datapath. . .TP \*(DX\fBct\-get\-maxconns\fR [\fIdp\fR] Prints the maximum limit of connection tracker entries on \fIdp\fR. Only supported for userspace datapath. . .TP \*(DX\fBct\-get\-nconns\fR [\fIdp\fR] Prints the current number of connection tracker entries on \fIdp\fR. Only supported for userspace datapath. . .TP \*(DX\fBct\-enable\-tcp\-seq\-chk\fR [\fIdp\fR] .TQ \*(DX\fBct\-disable\-tcp\-seq\-chk\fR [\fIdp\fR] Enables or disables TCP sequence checking. When set to disabled, all sequence number verification is disabled, including for TCP resets. This is similar, but not the same as 'be_liberal' mode, as in Netfilter. Disabling sequence number verification is not an optimization in itself, but is needed for some hardware offload support which might offer some performance advantage. 
Sequence number checking is enabled by default to enforce better security and should only be disabled if required for hardware offload support. This command is only supported for the userspace datapath. . .TP \*(DX\fBct\-get\-tcp\-seq\-chk\fR [\fIdp\fR] Prints whether TCP sequence checking is enabled or disabled on \fIdp\fR. Only supported for the userspace datapath. . .TP \*(DX\fBct\-set\-limits\fR [\fIdp\fR] [\fBdefault=\fIdefault_limit\fR] [\fBzone=\fIzone\fR,\fBlimit=\fIlimit\fR]... Sets the maximum allowed number of connections in a connection tracking zone. A specific \fIzone\fR may be set to \fIlimit\fR, and multiple zones may be specified with a comma-separated list. If a per-zone limit for a particular zone is not specified in the datapath, it defaults to the default per-zone limit. A default zone may be specified with the \fBdefault=\fIdefault_limit\fR argument. Initially, the default per-zone limit is unlimited. An unlimited number of entries may be set with \fB0\fR limit. . .TP \*(DX\fBct\-del\-limits\fR [\fIdp\fR] \fBzone=\fIzone[,zone]\fR... Deletes the connection tracking limit for \fIzone\fR. Multiple zones may be specified with a comma-separated list. . .TP \*(DX\fBct\-get\-limits\fR [\fIdp\fR] [\fBzone=\fIzone\fR[\fB,\fIzone\fR]...] Retrieves the maximum allowed number of connections and current counts per-zone. If \fIzone\fR is given, only the specified zone(s) are printed. If no zones are specified, all the zone limits and counts are provided. The command always displays the default zone limit. . .SS "DPDK COMMANDS" These commands manage DPDK components. .IP "\fBdpdk/log-list\fR" Lists all DPDK components that emit logs and their logging levels. .IP "\fBdpdk/log-set\fR [\fIspec\fR]" Sets DPDK components logging level. Without any \fIspec\fR, sets the logging \fBlevel\fR for all DPDK components to \fBdebug\fR. 
Otherwise, \fIspec\fR is a list of words separated by spaces: a word can be either a logging \fBlevel\fR (\fBemergency\fR, \fBalert\fR, \fBcritical\fR, \fBerror\fR, \fBwarning\fR, \fBnotice\fR, \fBinfo\fR or \fBdebug\fR) or a \fBpattern\fR matching DPDK components (see \fBdpdk/log-list\fR command on \fBovs\-appctl\fR(8)) separated by a colon from the logging \fBlevel\fR to apply. .RE . .SS "DPIF-NETDEV COMMANDS" These commands are used to expose internal information (mostly statistics) about the "dpif-netdev" userspace datapath. If there is only one datapath (as is often the case, unless \fBdpctl/\fR commands are used), the \fIdp\fR argument can be omitted. By default the commands present data for all pmd threads in the datapath. By specifying the "-pmd Core" option one can filter the output for a single pmd in the datapath. . .IP "\fBdpif-netdev/pmd-stats-show\fR [\fB-pmd\fR \fIcore\fR] [\fIdp\fR]" Shows performance statistics for one or all pmd threads of the datapath \fIdp\fR. The special thread "main" sums up the statistics of every non-pmd thread. The sum of "emc hits", "smc hits", "megaflow hits" and "miss" is the number of packet lookups performed by the datapath. Beware that a recirculated packet experiences one additional lookup per recirculation, so there may be more lookups than forwarded packets in the datapath. Cycles are counted using the TSC or similar facilities (when available on the platform). The duration of one cycle depends on the processing platform. "idle cycles" refers to cycles spent in PMD iterations not forwarding any packets. "processing cycles" refers to cycles spent in PMD iterations forwarding at least one packet, including the cost for polling, processing and transmitting said packets. To reset these counters use \fBdpif-netdev/pmd-stats-clear\fR. .
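.IP
The relationship between the hit and miss counters and the number of lookups can be checked with simple arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and the figures are borrowed from the \fBdpif-netdev/pmd-perf-show\fR sample output later in this section, treating its "Upcalls" count as the "miss" value (an assumption).

```python
def total_lookups(emc_hits, smc_hits, megaflow_hits, miss):
    """Sum the four counters that make up the datapath lookup count."""
    return emc_hits + smc_hits + megaflow_hits + miss


def lookups_per_packet(lookups, forwarded_packets):
    """Average lookups per forwarded packet; above 1 implies recirculation."""
    if forwarded_packets == 0:
        return 0.0
    return lookups / forwarded_packets


# Counters from the pmd-perf-show sample output; miss=0 is assumed to
# correspond to the "Upcalls" line there.
lookups = total_lookups(336472, 0, 3262943, 0)
print(lookups)                                        # 3599415
print(round(lookups_per_packet(lookups, 2399607), 2))  # 1.5
```

A ratio of about 1.5 means that, on average, every second packet was recirculated once.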
.IP "\fBdpif-netdev/pmd-stats-clear\fR [\fIdp\fR]" Resets to zero the per pmd thread performance numbers shown by the \fBdpif-netdev/pmd-stats-show\fR and \fBdpif-netdev/pmd-perf-show\fR commands. It will NOT reset datapath or bridge statistics, only the values shown by the above commands. . .IP "\fBdpif-netdev/pmd-perf-show\fR [\fB-nh\fR] [\fB-it\fR \fIiter_len\fR] \ [\fB-ms\fR \fIms_len\fR] [\fB-pmd\fR \fIcore\fR] [\fIdp\fR]" Shows detailed performance metrics for one or all pmd threads of the user space datapath. The collection of detailed statistics can be controlled by the configuration parameter "other_config:pmd-perf-metrics". By default it is disabled. The run-time overhead, when enabled, is on the order of 1%. .RS .IP .PD .4v .IP \(em used cycles .IP \(em forwarded packets .IP \(em number of rx batches .IP \(em packets/rx batch .IP \(em max. vhostuser queue fill level .IP \(em number of upcalls .IP \(em cycles spent in upcalls .PD .RE .IP This raw recorded data is used threefold: .RS .IP .PD .4v .IP 1. In histograms for each of the following metrics: .RS .IP \(em cycles/iteration (logarithmic) .IP \(em packets/iteration (logarithmic) .IP \(em cycles/packet .IP \(em packets/batch .IP \(em max. vhostuser qlen (logarithmic) .IP \(em upcalls .IP \(em cycles/upcall (logarithmic) The histogram bins are divided linearly or logarithmically. .RE .IP 2. A cyclic history of the above metrics for 1024 iterations .IP 3. A cyclic history of the cumulative/average values per millisecond wall clock for the last 1024 milliseconds: .RS .IP \(em number of iterations .IP \(em avg. cycles/iteration .IP \(em packets (Kpps) .IP \(em avg. packets/batch .IP \(em avg. max vhost qlen .IP \(em upcalls .IP \(em avg. cycles/upcall .RE .PD .RE .IP .
The command options are: .RS .IP "\fB-nh\fR" Suppress the histograms .IP "\fB-it\fR \fIiter_len\fR" Display the last iter_len iteration stats .IP "\fB-ms\fR \fIms_len\fR" Display the last ms_len millisecond stats .RE .IP The output always contains the following global PMD statistics: .RS .IP .EX Time: 15:24:55.270 Measurement duration: 1.008 s pmd thread numa_id 0 core_id 1: Iterations: 572817 (1.76 us/it) - Used TSC cycles: 2419034712 ( 99.9 % of total cycles) - idle iterations: 486808 ( 15.9 % of used cycles) - busy iterations: 86009 ( 84.1 % of used cycles) Rx packets: 2399607 (2381 Kpps, 848 cycles/pkt) Datapath passes: 3599415 (1.50 passes/pkt) - EMC hits: 336472 ( 9.3 %) - SMC hits: 0 ( 0.0 %) - Megaflow hits: 3262943 ( 90.7 %, 1.00 subtbl lookups/hit) - Upcalls: 0 ( 0.0 %, 0.0 us/upcall) - Lost upcalls: 0 ( 0.0 %) Tx packets: 2399607 (2381 Kpps) Tx batches: 171400 (14.00 pkts/batch) .EE .RE .IP Here "Rx packets" actually reflects the number of packets forwarded by the datapath. "Datapath passes" matches the number of packet lookups as reported by the \fBdpif-netdev/pmd-stats-show\fR command. To reset the counters and start a new measurement use \fBdpif-netdev/pmd-stats-clear\fR. . .IP "\fBdpif-netdev/pmd-perf-log-set\fR \fBon\fR|\fBoff\fR \ [\fB-b\fR \fIbefore\fR] [\fB-a\fR \fIafter\fR] [\fB-e\fR|\fB-ne\fR] \ [\fB-us\fR \fIusec\fR] [\fB-q\fR \fIqlen\fR]" . The userspace "netdev" datapath is able to supervise the PMD performance metrics and detect iterations with suspicious statistics according to the following criteria: .RS .IP \(em The iteration lasts longer than \fIusec\fR microseconds (default 250). This can be used to capture events where a PMD is blocked or interrupted for such a period of time that there is a risk for dropped packets on any of its Rx queues. .IP \(em The max vhost qlen exceeds a threshold \fIqlen\fR (default 128). This can be used to infer virtio queue overruns and dropped packets inside a VM, which are not visible in OVS otherwise. 
.RE .IP Such suspicious iterations can be logged together with their iteration statistics in the \fBovs-vswitchd.log\fR to be able to correlate them to packet drop or other events outside OVS. The above command enables (\fBon\fR) or disables (\fBoff\fR) supervision and logging at run-time and can be used to adjust the above thresholds for detecting suspicious iterations. By default supervision and logging is disabled. The command options are: .RS .IP "\fB-b\fR \fIbefore\fR" The number of iterations before the suspicious iteration to be logged (default 5). .IP "\fB-a\fR \fIafter\fR" The number of iterations after the suspicious iteration to be logged (default 5). .IP "\fB-e\fR" Extend logging interval if another suspicious iteration is detected before logging occurs. .IP "\fB-ne\fR" Do not extend logging interval if another suspicious iteration is detected before logging occurs (default). .IP "\fB-q\fR \fIqlen\fR" Suspicious vhost queue fill level threshold. Increase this to 512 if the Qemu supports 1024 virtio queue length (default 128). .IP "\fB-us\fR \fIusec\fR" Change the duration threshold for a suspicious iteration (default 250 us). .RE Note: Logging of suspicious iterations itself consumes a considerable amount of processing cycles of a PMD which may be visible in the iteration history. In the worst case this can lead OVS to detect another suspicious iteration caused by logging. If more than 100 iterations around a suspicious iteration have been logged once, OVS falls back to the safe default values (-b 5 -a 5 -ne) to avoid that logging itself continuously causes logging of further suspicious iterations. . .IP "\fBdpif-netdev/pmd-rxq-show\fR [\fB-pmd\fR \fIcore\fR] [\fIdp\fR]" For one or all pmd threads of the datapath \fIdp\fR show the list of queue-ids with port names, which this thread polls. . .IP "\fBdpif-netdev/pmd-rxq-rebalance\fR [\fIdp\fR]" Reassigns rxqs to pmds in the datapath \fIdp\fR based on their current usage. . 
.IP "\fBdpif-netdev/bond-show\fR [\fIdp\fR]" When "other_config:lb-output-action" is set to "true", the userspace datapath handles the load balancing of bonds directly instead of depending on flow recirculation (only in balance-tcp mode). When this is the case, the above command prints the load-balancing information of the bonds configured in datapath \fIdp\fR showing the interface associated with each bucket (hash). .SS "NETDEV-DPDK COMMANDS" These commands manage DPDK related ports (\fBtype=\fR\fIdpdk*\fR). .IP "\fBnetdev-dpdk/set-admin-state\fR [\fIinterface\fR] \fBup\fR | \fBdown\fR" Change the admin state for DPDK \fIinterface\fR to \fBup\fR or \fBdown\fR. If \fIinterface\fR is not specified, then it applies to all DPDK ports. .IP "\fBnetdev-dpdk/detach\fR \fIpci-address\fR" Detaches the device with the corresponding \fIpci-address\fR from DPDK. This command can be used to detach a device if it wasn't detached automatically after port deletion. Refer to the documentation for details and instructions. .IP "\fBnetdev-dpdk/get-mempool-info\fR [\fIinterface\fR]" Prints the debug information about the memory pool used by DPDK \fIinterface\fR. If called without arguments, information of all the available mempools will be printed. For additional mempool statistics enable \fBCONFIG_RTE_LIBRTE_MEMPOOL_DEBUG\fR while building DPDK. .SS "DATAPATH DEBUGGING COMMANDS" These commands query and modify datapaths. They are similar to \fBovs\-dpctl\fR(8) commands. \fBdpif/show\fR has the additional functionality, beyond \fBdpctl/show\fR, of printing OpenFlow port numbers. The other commands are redundant and will be removed in a future release. . .IP "\fBdpif/dump\-dps\fR" Prints the name of each configured datapath on a separate line. . .IP "\fBdpif/show\fR" Prints a summary of configured datapaths, including statistics and a list of connected ports. The port information includes the OpenFlow port number, datapath port number, and the type.
(The local port is identified as OpenFlow port 65534.) . .IP "\fBdpif/dump\-flows\fR [\fB\-m\fR] \fIdp\fR" Prints to the console all flow entries in datapath \fIdp\fR's flow table. Without \fB\-m\fR, output omits match fields that a flow wildcards entirely; with \fB\-m\fR, output includes all wildcarded fields. .IP This command is primarily useful for debugging Open vSwitch. The flow table entries that it displays are not OpenFlow flow entries. Instead, they are different and considerably simpler flows maintained by the datapath module. If you wish to see the OpenFlow flow entries, use \fBovs\-ofctl dump\-flows\fR. . .IP "\fBdpif/del\-flows \fIdp\fR" Deletes all flow entries from datapath \fIdp\fR's flow table and underlying datapath implementation (e.g., kernel datapath module). .IP This command is primarily useful for debugging Open vSwitch. As discussed in \fBdpif/dump\-flows\fR, these entries are not OpenFlow flow entries. .SS "OFPROTO COMMANDS" These commands manage the core OpenFlow switch implementation (called \fBofproto\fR). . .IP "\fBofproto/list\fR" Lists the names of the running ofproto instances. These are the names that may be used on \fBofproto/trace\fR. . .IP "\fBofproto/trace\fR [\fIoptions\fR] [\fIdpname\fR] \fIodp_flow\fR [\fIpacket\fR]" .IQ "\fBofproto/trace\fR [\fIoptions\fR] \fIbridge\fR \fIbr_flow\fR [\fIpacket\fR]" .IQ "\fBofproto/trace\-packet\-out\fR [\fIoptions\fR] [\fIdpname\fR] \fIodp_flow\fR [\fIpacket\fR] \fIactions\fR" .IQ "\fBofproto/trace\-packet\-out\fR [\fIoptions\fR] \fIbridge\fR \fIbr_flow\fR [\fIpacket\fR] \fIactions\fR" Traces the path of an imaginary packet through \fIswitch\fR and reports the path that it took. The initial treatment of the packet varies based on the command: . .RS .IP \(bu \fBofproto/trace\fR looks the packet up in the OpenFlow flow table, as if the packet had arrived on an OpenFlow port. .
.IP \(bu \fBofproto/trace\-packet\-out\fR applies the specified OpenFlow \fIactions\fR, as if the packet, flow, and actions had been specified in an OpenFlow ``packet-out'' request. .RE . .IP The packet's headers (e.g. source and destination) and metadata (e.g. input port), together called its ``flow,'' are usually all that matter for the purpose of tracing a packet. You can specify the flow in the following ways: . .RS .IP "\fIdpname\fR \fIodp_flow\fR" \fIodp_flow\fR is a flow in the form printed by \fBovs\-dpctl\fR(8)'s \fBdump\-flows\fR command. If all of your bridges have the same type, which is the common case, then you can omit \fIdpname\fR, but if you have bridges of different types (say, both \fBovs-netdev\fR and \fBovs-system\fR), then you need to specify a \fIdpname\fR to disambiguate. . .IP "\fIbridge\fR \fIbr_flow\fR" \fIbr_flow\fR is a flow in the form similar to that accepted by \fBovs\-ofctl\fR(8)'s \fBadd\-flow\fR command. (This is not an OpenFlow flow: besides other differences, it never contains wildcards.) \fIbridge\fR names the bridge through which \fIbr_flow\fR should be traced. .RE . .IP .RS These commands support the following options: .IP \fB\-\-generate\fR Generate a packet from the flow (see below for more information). . .IP "\fB\-\-l7 \fIpayload\fR" .IQ "\fB\-\-l7\-len \fIlength\fR" Accepted only with \fB\-\-generate\fR (see below for more information). . .IP \fB\-\-consistent\fR Accepted by \fBofproto/trace\-packet\-out\fR only. With this option, the command rejects \fIactions\fR that are inconsistent with the specified packet. (An example of an inconsistency is attempting to strip the VLAN tag from a packet that does not have a VLAN tag.) Open vSwitch ignores most forms of inconsistency in OpenFlow 1.0 and rejects inconsistencies in later versions of OpenFlow. The option is necessary because the command does not ordinarily imply a particular OpenFlow version.
One exception is that, when \fIactions\fR includes an action that only OpenFlow 1.1 and later supports (such as \fBpush_vlan\fR), \fB\-\-consistent\fR is automatically enabled. . .IP "\fB\-\-ct-next\fR \fIflags\fR" When the traced flow triggers conntrack actions, \fBofproto/trace\fR will automatically trace the forked packet processing pipeline with user specified ct_state. This option sets the ct_state flags that the conntrack module will report. The \fIflags\fR must be a comma- or space-separated list of the following connection tracking flags: . .RS .IP \(bu \fBtrk\fR: Include to indicate connection tracking has taken place. . .IP \(bu \fBnew\fR: Include to indicate a new flow. . .IP \(bu \fBest\fR: Include to indicate an established flow. . .IP \(bu \fBrel\fR: Include to indicate a related flow. . .IP \(bu \fBrpl\fR: Include to indicate a reply flow. . .IP \(bu \fBinv\fR: Include to indicate a connection entry in a bad state. . .IP \(bu \fBdnat\fR: Include to indicate a packet whose destination IP address has been changed. . .IP \(bu \fBsnat\fR: Include to indicate a packet whose source IP address has been changed. . .RE . .IP When \fB\-\-ct-next\fR is unspecified, or when there are fewer \fB\-\-ct-next\fR options than ct \fIactions\fR, the \fIflags\fR default to \fBtrk,new\fR. . .RE . .IP Most commonly, one specifies only a flow, using one of the forms above, but sometimes one might need to specify an actual packet instead of just a flow: . .RS .IP "Side effects." Some actions have side effects. For example, the \fBnormal\fR action can update the MAC learning table, and the \fBlearn\fR action can change OpenFlow tables. The trace commands only perform side effects when a packet is specified. If you want side effects to take place, then you must supply a packet. . .IP (Output actions are obviously side effects too, but the trace commands never execute them, even when one specifies a packet.) . .IP "Incomplete information." 
Most of the time, Open vSwitch can figure out everything about the path of a packet using just the flow, but in some special circumstances it needs to look at parts of the packet that are not included in the flow. When this is the case, and you do not supply a packet, then a trace command will tell you it needs a packet. .RE . .IP If you wish to include a packet as part of a trace operation, there are two ways to do it: . .RS .IP \fB\-\-generate\fR This option, added to one of the ways to specify a flow already described, causes Open vSwitch to internally generate a packet with the flow described and then to use that packet. If your goal is to execute side effects, then \fB\-\-generate\fR is the easiest way to do it, but \fB\-\-generate\fR is not a good way to fill in incomplete information, because it generates packets based on only the flow information, which means that the packets really do not have any more information than the flow. .IP By default, for protocols that allow arbitrary L7 payloads, the generated packet has 64 bytes of payload. Use \fB\-\-l7\-len\fR to change the payload length, or \fB\-\-l7\fR to specify the exact contents of the payload. . .IP \fIpacket\fR This form supplies an explicit \fIpacket\fR as a sequence of hex digits. An Ethernet frame is at least 14 bytes long, so there must be at least 28 hex digits. Obviously, it is inconvenient to type in the hex digits by hand, so the \fBovs\-pcap\fR(1) and \fBovs\-tcpundump\fR(1) utilities provide easier ways. .IP With this form, packet headers are extracted directly from \fIpacket\fR, so the \fIodp_flow\fR or \fIbr_flow\fR should specify only metadata. The metadata can be: .RS .IP \fIskb_priority\fR Packet QoS priority. .IP \fIpkt_mark\fR Mark of the packet. .IP \fIct_state\fR Connection state of the packet. .IP \fIct_zone\fR Connection tracking zone for packet. .IP \fIct_mark\fR Connection mark of the packet. .IP \fIct_label\fR Connection label of the packet. 
.IP \fItun_id\fR The tunnel ID on which the packet arrived. .IP \fIin_port\fR The port on which the packet arrived. .RE .RE . .IP The in_port value is the kernel datapath port number for the first format and the OpenFlow port number for the second format. The numbering of these two types of port usually differs and there is no relationship. . . .IP "Usage examples:" .RS 4 .PP \fBTrace a unicast ICMP echo request on ingress port 1 to destination MAC 00:00:5E:00:53:01\fR .RS 4 .nf ofproto/trace br in_port=1,icmp,icmp_type=8,\\ dl_dst=00:00:5E:00:53:01 .fi .RE .PP \fBTrace a unicast ICMP echo reply on ingress port 1 to destination MAC 00:00:5E:00:53:01\fR .RS 4 .nf ofproto/trace br in_port=1,icmp,icmp_type=0,\\ dl_dst=00:00:5E:00:53:01 .fi .RE .PP \fBTrace an ARP request on ingress port 1\fR .RS 4 .nf ofproto/trace br in_port=1,arp,arp_op=1 .fi .RE .PP \fBTrace an ARP reply on ingress port 1\fR .RS 4 .nf ofproto/trace br in_port=1,arp,arp_op=2 .fi .RE .RE .SS "VLOG COMMANDS" These commands manage \fB\*(PN\fR's logging settings. .IP "\fBvlog/set\fR [\fIspec\fR]" Sets logging levels. Without any \fIspec\fR, sets the log level for every module and destination to \fBdbg\fR. Otherwise, \fIspec\fR is a list of words separated by spaces or commas or colons, up to one from each category below: . .RS .IP \(bu A valid module name, as displayed by the \fBvlog/list\fR command on \fBovs\-appctl\fR(8), limits the log level change to the specified module. . .IP \(bu \fBsyslog\fR, \fBconsole\fR, or \fBfile\fR, to limit the log level change to only the system log, the console, or a file, respectively. .IP On the Windows platform, \fBsyslog\fR is accepted as a word and is only useful along with the \fB\-\-syslog\-target\fR option (the word has no effect otherwise). . .IP \(bu \fBoff\fR, \fBemer\fR, \fBerr\fR, \fBwarn\fR, \fBinfo\fR, or \fBdbg\fR, to control the log level. Messages of the given severity or higher will be logged, and messages of lower severity will be filtered out.
\fBoff\fR filters out all messages. See \fBovs\-appctl\fR(8) for a definition of each log level. .RE . .IP Case is not significant within \fIspec\fR. .IP Regardless of the log levels set for \fBfile\fR, logging to a file will not take place unless \fB\*(PN\fR was invoked with the \fB\-\-log\-file\fR option. .IP For compatibility with older versions of OVS, \fBany\fR is accepted as a word but has no effect. .RE .IP "\fBvlog/set PATTERN:\fIdestination\fB:\fIpattern\fR" Sets the log pattern for \fIdestination\fR to \fIpattern\fR. Refer to \fBovs\-appctl\fR(8) for a description of the valid syntax for \fIpattern\fR. . .IP "\fBvlog/list\fR" Lists the supported logging modules and their current levels. . .IP "\fBvlog/list-pattern\fR" Lists logging patterns used for each destination. . .IP "\fBvlog/close\fR" Causes \fB\*(PN\fR to close its log file, if it is open. (Use \fBvlog/reopen\fR to reopen it later.) . .IP "\fBvlog/reopen\fR" Causes \fB\*(PN\fR to close its log file, if it is open, and then reopen it. (This is useful after rotating log files, to cause a new log file to be used.) .IP This has no effect unless \fB\*(PN\fR was invoked with the \fB\-\-log\-file\fR option. . .IP "\fBvlog/disable\-rate\-limit \fR[\fImodule\fR]..." .IQ "\fBvlog/enable\-rate\-limit \fR[\fImodule\fR]..." By default, \fB\*(PN\fR limits the rate at which certain messages can be logged. When a message would appear more frequently than the limit, it is suppressed. This saves disk space, makes logs easier to read, and speeds up execution, but occasionally troubleshooting requires more detail. Therefore, \fBvlog/disable\-rate\-limit\fR allows rate limits to be disabled at the level of an individual log module. Specify one or more module names, as displayed by the \fBvlog/list\fR command. Specifying either no module names at all or the keyword \fBany\fR disables rate limits for every log module. . 
.IP The \fBvlog/enable\-rate\-limit\fR command, whose syntax is the same as \fBvlog/disable\-rate\-limit\fR, can be used to re-enable a rate limit that was previously disabled. .SS "MEMORY COMMANDS" These commands report memory usage. . .IP "\fBmemory/show\fR" Displays some basic statistics about \fB\*(PN\fR's memory usage. \fB\*(PN\fR also logs this information soon after startup and periodically as its memory consumption grows. .SS "COVERAGE COMMANDS" These commands manage \fB\*(PN\fR's ``coverage counters,'' which count the number of times particular events occur during a daemon's runtime. In addition to these commands, \fB\*(PN\fR automatically logs coverage counter values, at \fBINFO\fR level, when it detects that the daemon's main loop takes unusually long to run. .PP Coverage counters are useful mainly for performance analysis and debugging. .IP "\fBcoverage/show\fR" Displays the averaged per-second rates for the last few seconds, the last minute and the last hour, and the total counts of all of the coverage counters. .IP "\fBcoverage/read-counter\fR \fIcounter\fR" Displays the total count for the given coverage \fIcounter\fR. .SS "OPENVSWITCH TUNNELING COMMANDS" These commands query and modify OVS tunnel components. . .IP "\fBovs/route/add ipv4_address/plen output_bridge [GW]\fR" Adds the ipv4_address/plen route to the vswitchd routing table. output_bridge must be the name of an OVS bridge. This command is useful if the routes that OVS has cached do not look right. . .IP "\fBovs/route/show\fR" Prints all routes in the OVS routing table. This includes routes cached from the system routing table and user-configured routes. . .IP "\fBovs/route/del ipv4_address/plen\fR" Deletes the ipv4_address/plen route from the OVS routing table. . .IP "\fBtnl/neigh/show\fR" .IP "\fBtnl/arp/show\fR" OVS builds its ARP cache by snooping ARP messages. This command shows the ARP cache table. .
.IP "\fBtnl/neigh/set \fIbridge ip mac\fR" .IP "\fBtnl/arp/set \fIbridge ip mac\fR" Adds or modifies an ARP cache entry in \fIbridge\fR, mapping \fIip\fR to \fImac\fR. . .IP "\fBtnl/neigh/flush\fR" .IP "\fBtnl/arp/flush\fR" Flushes the ARP table. . .IP "\fBtnl/egress_port_range [num1] [num2]\fR" Sets the range of UDP source ports used for UDP-based tunnels, for example VXLAN. With no arguments, this command prints the range currently in use. . .SH "OPENFLOW IMPLEMENTATION" . .PP This section documents aspects of OpenFlow for which the OpenFlow specification requires documentation. . .SS "Packet buffering." The OpenFlow specification, version 1.2, says: . .IP Switches that implement buffering are expected to expose, through documentation, both the amount of available buffering, and the length of time before buffers may be reused. . .PP Open vSwitch does not maintain any packet buffers. . .SS "Bundle lifetime" The OpenFlow specification, version 1.4, says: . .IP If the switch does not receive any OFPT_BUNDLE_CONTROL or OFPT_BUNDLE_ADD_MESSAGE message for an opened bundle_id for a switch defined time greater than 1s, it may send an ofp_error_msg with OFPET_BUNDLE_FAILED type and OFPBFC_TIMEOUT code. If the switch does not receive any new message in a bundle apart from echo request and replies for a switch defined time greater than 1s, it may send an ofp_error_msg with OFPET_BUNDLE_FAILED type and OFPBFC_TIMEOUT code. . .PP Open vSwitch implements a default idle bundle lifetime of 10 seconds. (This is configurable via \fBother-config:bundle-idle-timeout\fR in the \fBOpen_vSwitch\fR table. See \fBovs-vswitchd.conf.db\fR(5) for details.) . .SH "LIMITS" . .PP We believe these limits to be accurate as of this writing. These limits assume the use of the Linux kernel datapath. . .IP \(bu \fBovs\-vswitchd\fR started through \fBovs\-ctl\fR(8) provides a limit of 65535 file descriptors. The limit on the number of bridges and ports is determined by the availability of file descriptors.
With the Linux kernel datapath, creation of a single bridge consumes three file descriptors and each port consumes one additional file descriptor. Other platforms may have different limitations. . .IP \(bu 8,192 MAC learning entries per bridge, by default. (This is configurable via \fBother\-config:mac\-table\-size\fR in the \fBBridge\fR table. See \fBovs\-vswitchd.conf.db\fR(5) for details.) . .IP \(bu Kernel flows are limited only by memory available to the kernel. Performance will degrade beyond 1,048,576 kernel flows per bridge with a 32-bit kernel, or beyond 262,144 with a 64-bit kernel. (\fBovs\-vswitchd\fR should never install anywhere near that many flows.) . .IP \(bu OpenFlow flows are limited only by available memory. Performance is linear in the number of unique wildcard patterns. That is, an OpenFlow table that contains many flows that all match on the same fields in the same way has a constant-time lookup, but a table that contains many flows that match on different fields requires lookup time linear in the number of flows. . .IP \(bu 255 ports per bridge participating in 802.1D Spanning Tree Protocol. . .IP \(bu 32 mirrors per bridge. . .IP \(bu 15 bytes for the name of a port, for ports implemented in the Linux kernel. Ports implemented in userspace, such as patch ports, do not have an arbitrary length limitation. OpenFlow also limits port names to 15 bytes. . .SH "SEE ALSO" .BR ovs\-appctl (8), .BR ovsdb\-server (1).