'\" p .\" -*- nroff -*- .TH "ovn-northd" 8 "ovn-northd" "Open vSwitch 2\[char46]10\[char46]1" "Open vSwitch Manual" .fp 5 L CR \\" Make fixed-width font available as \\fL. .de TQ . br . ns . TP "\\$1" .. .de ST . PP . RS -0.15in . I "\\$1" . RE .. .PP .SH "NAME" .PP .PP ovn-northd \- Open Virtual Network central control daemon .SH "SYNOPSIS" .PP \fBovn\-northd\fR [\fIoptions\fR] .SH "DESCRIPTION" .PP .PP \fBovn\-northd\fR is a centralized daemon responsible for translating the high-level OVN configuration into logical configuration consumable by daemons such as \fBovn\-controller\fR\[char46] It translates the logical network configuration in terms of conventional network concepts, taken from the OVN Northbound Database (see \fBovn\-nb\fR(5)), into logical datapath flows in the OVN Southbound Database (see \fBovn\-sb\fR(5)) below it\[char46] .SH "OPTIONS" .TP \fB\-\-ovnnb\-db=\fIdatabase\fB\fR The OVSDB database containing the OVN Northbound Database\[char46] If the \fBOVN_NB_DB\fR environment variable is set, its value is used as the default\[char46] Otherwise, the default is \fBunix:/var/run/openvswitch/ovnnb_db\[char46]sock\fR\[char46] .TP \fB\-\-ovnsb\-db=\fIdatabase\fB\fR The OVSDB database containing the OVN Southbound Database\[char46] If the \fBOVN_SB_DB\fR environment variable is set, its value is used as the default\[char46] Otherwise, the default is \fBunix:/var/run/openvswitch/ovnsb_db\[char46]sock\fR\[char46] .PP .PP \fIdatabase\fR in the above options must be an OVSDB active or passive connection method, as described in \fBovsdb\fR(7)\[char46] .SS "Daemon Options" .TP \fB\-\-pidfile\fR[\fB=\fR\fIpidfile\fR] Causes a file (by default, \fB\fIprogram\fB\[char46]pid\fR) to be created indicating the PID of the running process\[char46] If the \fIpidfile\fR argument is not specified, or if it does not begin with \fB/\fR, then it is created in \fB/var/run/openvswitch\fR\[char46] .IP If \fB\-\-pidfile\fR is not specified, no pidfile is created\[char46] .TP 
\fB\-\-overwrite\-pidfile\fR By default, when \fB\-\-pidfile\fR is specified and the specified pidfile already exists and is locked by a running process, the daemon refuses to start\[char46] Specify \fB\-\-overwrite\-pidfile\fR to cause it to instead overwrite the pidfile\[char46] .IP When \fB\-\-pidfile\fR is not specified, this option has no effect\[char46] .TP \fB\-\-detach\fR Runs this program as a background process\[char46] The process forks, and in the child it starts a new session, closes the standard file descriptors (which has the side effect of disabling logging to the console), and changes its current directory to the root (unless \fB\-\-no\-chdir\fR is specified)\[char46] After the child completes its initialization, the parent exits\[char46] .TP \fB\-\-monitor\fR Creates an additional process to monitor this program\[char46] If it dies due to a signal that indicates a programming error (\fBSIGABRT\fR, \fBSIGALRM\fR, \fBSIGBUS\fR, \fBSIGFPE\fR, \fBSIGILL\fR, \fBSIGPIPE\fR, \fBSIGSEGV\fR, \fBSIGXCPU\fR, or \fBSIGXFSZ\fR) then the monitor process starts a new copy of it\[char46] If the daemon dies or exits for another reason, the monitor process exits\[char46] .IP This option is normally used with \fB\-\-detach\fR, but it also functions without it\[char46] .TP \fB\-\-no\-chdir\fR By default, when \fB\-\-detach\fR is specified, the daemon changes its current working directory to the root directory after it detaches\[char46] Otherwise, invoking the daemon from a carelessly chosen directory would prevent the administrator from unmounting the file system that holds that directory\[char46] .IP Specifying \fB\-\-no\-chdir\fR suppresses this behavior, preventing the daemon from changing its current working directory\[char46] This may be useful for collecting core files, since it is common behavior to write core dumps into the current working directory and the root directory is not a good directory to use\[char46] .IP This option has no effect when 
\fB\-\-detach\fR is not specified\[char46] .TP \fB\-\-no\-self\-confinement\fR By default, this daemon tries to confine itself to work with files under well-known directories whitelisted at build time\[char46] It is better to stick with this default behavior and not to use this flag unless some other access control mechanism is used to confine the daemon\[char46] Note that in contrast to other access control implementations that are typically enforced from kernel-space (e\[char46]g\[char46] DAC or MAC), self-confinement is imposed from the user-space daemon itself and hence should not be considered as a full confinement strategy, but instead should be viewed as an additional layer of security\[char46] .TP \fB\-\-user=\fR\fIuser\fR\fB:\fR\fIgroup\fR Causes this program to run as a different user specified in \fIuser\fR\fB:\fR\fIgroup\fR, thus dropping most of the root privileges\[char46] Short forms \fIuser\fR and \fB:\fR\fIgroup\fR are also allowed, with current user or group assumed, respectively\[char46] Only daemons started by the root user accept this argument\[char46] .IP On Linux, daemons will be granted \fBCAP_IPC_LOCK\fR and \fBCAP_NET_BIND_SERVICE\fR before dropping root privileges\[char46] Daemons that interact with a datapath, such as \fBovs\-vswitchd\fR, will be granted three additional capabilities, namely \fBCAP_NET_ADMIN\fR, \fBCAP_NET_BROADCAST\fR and \fBCAP_NET_RAW\fR\[char46] The capability change will apply even if the new user is root\[char46] .IP On Windows, this option is not currently supported\[char46] For security reasons, specifying this option will cause the daemon process not to start\[char46] .SS "Logging Options" .TP \fB\-v\fR[\fIspec\fR] .TQ .5in \fB\-\-verbose=\fR[\fIspec\fR] Sets logging levels\[char46] Without any \fIspec\fR, sets the log level for every module and destination to \fBdbg\fR\[char46] Otherwise, \fIspec\fR is a list of words separated by spaces or commas or colons, up to one from each category below: .RS .IP \(bu A valid 
module name, as displayed by the \fBvlog/list\fR command on \fBovs\-appctl\fR(8), limits the log level change to the specified module\[char46] .IP \(bu \fBsyslog\fR, \fBconsole\fR, or \fBfile\fR, to limit the log level change only to the system log, to the console, or to a file, respectively\[char46] (If \fB\-\-detach\fR is specified, the daemon closes its standard file descriptors, so logging to the console will have no effect\[char46]) .IP On the Windows platform, \fBsyslog\fR is accepted as a word and is only useful along with the \fB\-\-syslog\-target\fR option (the word has no effect otherwise)\[char46] .IP \(bu \fBoff\fR, \fBemer\fR, \fBerr\fR, \fBwarn\fR, \fBinfo\fR, or \fBdbg\fR, to control the log level\[char46] Messages of the given severity or higher will be logged, and messages of lower severity will be filtered out\[char46] \fBoff\fR filters out all messages\[char46] See \fBovs\-appctl\fR(8) for a definition of each log level\[char46] .RE .IP Case is not significant within \fIspec\fR\[char46] .IP Regardless of the log levels set for \fBfile\fR, logging to a file will not take place unless \fB\-\-log\-file\fR is also specified (see below)\[char46] .IP For compatibility with older versions of OVS, \fBany\fR is accepted as a word but has no effect\[char46] .TP \fB\-v\fR .TQ .5in \fB\-\-verbose\fR Sets the maximum logging verbosity level, equivalent to \fB\-\-verbose=dbg\fR\[char46] .TP \fB\-vPATTERN:\fR\fIdestination\fR\fB:\fR\fIpattern\fR .TQ .5in \fB\-\-verbose=PATTERN:\fR\fIdestination\fR\fB:\fR\fIpattern\fR Sets the log pattern for \fIdestination\fR to \fIpattern\fR\[char46] Refer to \fBovs\-appctl\fR(8) for a description of the valid syntax for \fIpattern\fR\[char46] .TP \fB\-vFACILITY:\fR\fIfacility\fR .TQ .5in \fB\-\-verbose=FACILITY:\fR\fIfacility\fR Sets the RFC 5424 facility of the log message\[char46] \fIfacility\fR can be one of \fBkern\fR, \fBuser\fR, \fBmail\fR, \fBdaemon\fR, \fBauth\fR, \fBsyslog\fR, \fBlpr\fR, \fBnews\fR, \fBuucp\fR, 
\fBclock\fR, \fBftp\fR, \fBntp\fR, \fBaudit\fR, \fBalert\fR, \fBclock2\fR, \fBlocal0\fR, \fBlocal1\fR, \fBlocal2\fR, \fBlocal3\fR, \fBlocal4\fR, \fBlocal5\fR, \fBlocal6\fR or \fBlocal7\fR\[char46] If this option is not specified, \fBdaemon\fR is used as the default for the local system syslog and \fBlocal0\fR is used while sending a message to the target provided via the \fB\-\-syslog\-target\fR option\[char46] .TP \fB\-\-log\-file\fR[\fB=\fR\fIfile\fR] Enables logging to a file\[char46] If \fIfile\fR is specified, then it is used as the exact name for the log file\[char46] The default log file name used if \fIfile\fR is omitted is \fB/var/log/openvswitch/\fIprogram\fB\[char46]log\fR\[char46] .TP \fB\-\-syslog\-target=\fR\fIhost\fR\fB:\fR\fIport\fR Send syslog messages to UDP \fIport\fR on \fIhost\fR, in addition to the system syslog\[char46] The \fIhost\fR must be a numerical IP address, not a hostname\[char46] .TP \fB\-\-syslog\-method=\fR\fImethod\fR Specifies \fImethod\fR as how syslog messages should be sent to the syslog daemon\[char46] The following forms are supported: .RS .IP \(bu \fBlibc\fR, to use the libc \fBsyslog()\fR function\[char46] This is the default behavior\[char46] The downside of using this option is that libc adds a fixed prefix to every message before it is actually sent to the syslog daemon over the \fB/dev/log\fR UNIX domain socket\[char46] .IP \(bu \fBunix:\fIfile\fB\fR, to use a UNIX domain socket directly\[char46] It is possible to specify an arbitrary message format with this option\[char46] However, \fBrsyslogd 8\[char46]9\fR and older versions use a hard-coded parser function anyway, which limits UNIX domain socket use\[char46] If you want to use an arbitrary message format with older \fBrsyslogd\fR versions, then use a UDP socket to the localhost IP address instead\[char46] .IP \(bu \fBudp:\fIip\fB:\fIport\fB\fR, to use a UDP socket\[char46] With this method it is possible to use an arbitrary message format also with older \fBrsyslogd\fR\[char46] When sending 
syslog messages over a UDP socket, extra precautions need to be taken: for example, the syslog daemon needs to be configured to listen on the specified UDP port, accidental iptables rules could interfere with local syslog traffic, and there are some security considerations that apply to UDP sockets but do not apply to UNIX domain sockets\[char46] .RE .SS "PKI Options" .PP .PP PKI configuration is required in order to use SSL for the connections to the Northbound and Southbound databases\[char46] .RS .TP \fB\-p\fR \fIprivkey\[char46]pem\fR .TQ .5in \fB\-\-private\-key=\fR\fIprivkey\[char46]pem\fR Specifies a PEM file containing the private key used as identity for outgoing SSL connections\[char46] .TP \fB\-c\fR \fIcert\[char46]pem\fR .TQ .5in \fB\-\-certificate=\fR\fIcert\[char46]pem\fR Specifies a PEM file containing a certificate that certifies the private key specified on \fB\-p\fR or \fB\-\-private\-key\fR to be trustworthy\[char46] The certificate must be signed by the certificate authority (CA) that the peer in SSL connections will use to verify it\[char46] .TP \fB\-C\fR \fIcacert\[char46]pem\fR .TQ .5in \fB\-\-ca\-cert=\fR\fIcacert\[char46]pem\fR Specifies a PEM file containing the CA certificate for verifying certificates presented to this program by SSL peers\[char46] (This may be the same certificate that SSL peers use to verify the certificate specified on \fB\-c\fR or \fB\-\-certificate\fR, or it may be a different one, depending on the PKI design in use\[char46]) .TP \fB\-C none\fR .TQ .5in \fB\-\-ca\-cert=none\fR Disables verification of certificates presented by SSL peers\[char46] This introduces a security risk, because it means that certificates cannot be verified to be those of known trusted hosts\[char46] .RE .SS "Other Options" .TP \fB\-\-unixctl=\fIsocket\fB\fR Sets the name of the control socket on which \fB\fIprogram\fB\fR listens for runtime management commands (see \fIRUNTIME MANAGEMENT COMMANDS,\fR below)\[char46] If \fIsocket\fR 
does not begin with \fB/\fR, it is interpreted as relative to \fB/var/run/openvswitch\fR\[char46] If \fB\-\-unixctl\fR is not used at all, the default socket is \fB/var/run/openvswitch/\fIprogram\fB\[char46]\fR\fIpid\fR\fB\[char46]ctl\fR, where \fIpid\fR is \fB\fIprogram\fB\fR\(cqs process ID\[char46] .IP On Windows a local named pipe is used to listen for runtime management commands\[char46] A file is created in the absolute path as pointed by \fIsocket\fR or if \fB\-\-unixctl\fR is not used at all, a file is created as \fB\fIprogram\fB\fR in the configured \fIOVS_RUNDIR\fR directory\[char46] The file exists just to mimic the behavior of a Unix domain socket\[char46] .IP Specifying \fBnone\fR for \fIsocket\fR disables the control socket feature\[char46] .ST "" .TP \fB\-h\fR .TQ .5in \fB\-\-help\fR Prints a brief help message to the console\[char46] .TP \fB\-V\fR .TQ .5in \fB\-\-version\fR Prints version information to the console\[char46] .SH "RUNTIME MANAGEMENT COMMANDS" .PP .PP \fBovs\-appctl\fR can send commands to a running \fBovn\-northd\fR process\[char46] The currently supported commands are described below\[char46] .RS .TP \fBexit\fR Causes \fBovn\-northd\fR to gracefully terminate\[char46] .RE .SH "ACTIVE-STANDBY FOR HIGH AVAILABILITY" .PP .PP You may run \fBovn\-northd\fR more than once in an OVN deployment\[char46] OVN will automatically ensure that only one of them is active at a time\[char46] If multiple instances of \fBovn\-northd\fR are running and the active \fBovn\-northd\fR fails, one of the hot standby instances of \fBovn\-northd\fR will automatically take over\[char46] .SH "LOGICAL FLOW TABLE STRUCTURE" .PP .PP One of the main purposes of \fBovn\-northd\fR is to populate the \fBLogical_Flow\fR table in the \fBOVN_Southbound\fR database\[char46] This section describes how \fBovn\-northd\fR does this for switch and router logical datapaths\[char46] .SS "Logical Switch Datapaths" .ST "Ingress Table 0: Admission Control and Ingress Port Security - 
L2" .PP .PP Ingress table 0 contains these logical flows: .RS .IP \(bu Priority 100 flows to drop packets with VLAN tags or multicast Ethernet source addresses\[char46] .IP \(bu Priority 50 flows that implement ingress port security for each enabled logical port\[char46] For logical ports on which port security is enabled, these match the \fBinport\fR and the valid \fBeth\[char46]src\fR address(es) and advance only those packets to the next flow table\[char46] For logical ports on which port security is not enabled, these advance all packets that match the \fBinport\fR\[char46] .RE .PP .PP There are no flows for disabled logical ports because the default-drop behavior of logical flow tables causes packets that ingress from them to be dropped\[char46] .ST "Ingress Table 1: Ingress Port Security - IP" .PP .PP Ingress table 1 contains these logical flows: .RS .IP \(bu For each element in the port security set having one or more IPv4 or IPv6 addresses (or both), .RS .IP \(bu Priority 90 flow to allow IPv4 traffic if it has IPv4 addresses which match the \fBinport\fR, valid \fBeth\[char46]src\fR and valid \fBip4\[char46]src\fR address(es)\[char46] .IP \(bu Priority 90 flow to allow IPv4 DHCP discovery traffic if it has a valid \fBeth\[char46]src\fR\[char46] This is necessary because DHCP discovery messages are sent from the unspecified IPv4 address (0\[char46]0\[char46]0\[char46]0), as no IPv4 address has been assigned yet\[char46] .IP \(bu Priority 90 flow to allow IPv6 traffic if it has IPv6 addresses which match the \fBinport\fR, valid \fBeth\[char46]src\fR and valid \fBip6\[char46]src\fR address(es)\[char46] .IP \(bu Priority 90 flow to allow IPv6 DAD (Duplicate Address Detection) traffic if it has a valid \fBeth\[char46]src\fR\[char46] This is necessary since DAD requires joining a multicast group and sending neighbor solicitations for the newly assigned address\[char46] Since no address is yet assigned, these are sent from the unspecified IPv6 
address (::)\[char46] .IP \(bu Priority 80 flow to drop IP (both IPv4 and IPv6) traffic that matches the \fBinport\fR and valid \fBeth\[char46]src\fR\[char46] .RE .IP \(bu One priority\-0 fallback flow that matches all packets and advances to the next table\[char46] .RE .ST "Ingress Table 2: Ingress Port Security - Neighbor discovery" .PP .PP Ingress table 2 contains these logical flows: .RS .IP \(bu For each element in the port security set, .RS .IP \(bu Priority 90 flow to allow ARP traffic that matches the \fBinport\fR and valid \fBeth\[char46]src\fR and \fBarp\[char46]sha\fR\[char46] If the element has one or more IPv4 addresses, then it also matches the valid \fBarp\[char46]spa\fR\[char46] .IP \(bu Priority 90 flow to allow IPv6 Neighbor Solicitation and Advertisement traffic that matches the \fBinport\fR, valid \fBeth\[char46]src\fR and \fBnd\[char46]sll\fR/\fBnd\[char46]tll\fR\[char46] If the element has one or more IPv6 addresses, then it also matches the valid \fBnd\[char46]target\fR address(es) for Neighbor Advertisement traffic\[char46] .IP \(bu Priority 80 flow to drop ARP and IPv6 Neighbor Solicitation and Advertisement traffic that matches the \fBinport\fR and valid \fBeth\[char46]src\fR\[char46] .RE .IP \(bu One priority\-0 fallback flow that matches all packets and advances to the next table\[char46] .RE .ST "Ingress Table 3: \fBfrom\-lport\fR Pre-ACLs" .PP .PP This table prepares flows for possible stateful ACL processing in ingress table \fBACLs\fR\[char46] It contains a priority\-0 flow that simply moves traffic to the next table\[char46] If stateful ACLs are used in the logical datapath, a priority\-100 flow is added that sets a hint (with \fBreg0[0] = 1; next;\fR) for table \fBPre\-stateful\fR to send IP packets to the connection tracker before eventually advancing to ingress table \fBACLs\fR\[char46] If special ports such as router ports or localnet ports can\(cqt use ct(), a priority\-110 flow is added to skip over stateful ACLs\[char46] .ST 
"Ingress Table 4: Pre-LB" .PP .PP This table prepares flows for possible stateful load balancing processing in ingress table \fBLB\fR and \fBStateful\fR\[char46] It contains a priority\-0 flow that simply moves traffic to the next table\[char46] Moreover it contains a priority\-110 flow to move IPv6 Neighbor Discovery traffic to the next table\[char46] If load balancing rules with virtual IP addresses (and ports) are configured in \fBOVN_Northbound\fR database for a logical switch datapath, a priority\-100 flow is added for each configured virtual IP address \fIVIP\fR\[char46] For IPv4 \fIVIPs\fR, the match is \fBip && ip4\[char46]dst == \fIVIP\fB\fR\[char46] For IPv6 \fIVIPs\fR, the match is \fBip && ip6\[char46]dst == \fIVIP\fB\fR\[char46] The flow sets an action \fBreg0[0] = 1; next;\fR to act as a hint for table \fBPre\-stateful\fR to send IP packets to the connection tracker for packet de-fragmentation before eventually advancing to ingress table \fBLB\fR\[char46] .ST "Ingress Table 5: Pre-stateful" .PP .PP This table prepares flows for all possible stateful processing in next tables\[char46] It contains a priority\-0 flow that simply moves traffic to the next table\[char46] A priority\-100 flow sends the packets to connection tracker based on a hint provided by the previous tables (with a match for \fBreg0[0] == 1\fR) by using the \fBct_next;\fR action\[char46] .ST "Ingress table 6: \fBfrom\-lport\fR ACLs" .PP .PP Logical flows in this table closely reproduce those in the \fBACL\fR table in the \fBOVN_Northbound\fR database for the \fBfrom\-lport\fR direction\[char46] The \fBpriority\fR values from the \fBACL\fR table have a limited range and have 1000 added to them to leave room for OVN default flows at both higher and lower priorities\[char46] .RS .IP \(bu \fBallow\fR ACLs translate into logical flows with the \fBnext;\fR action\[char46] If there are any stateful ACLs on this datapath, then \fBallow\fR ACLs translate to \fBct_commit; next;\fR (which acts as 
a hint for the next tables to commit the connection to conntrack)\[char46] .IP \(bu \fBallow\-related\fR ACLs translate into logical flows with the \fBct_commit(ct_label=0/1); next;\fR actions for new connections and \fBreg0[1] = 1; next;\fR for existing connections\[char46] .IP \(bu Other ACLs translate to \fBdrop;\fR for new or untracked connections and \fBct_commit(ct_label=1/1);\fR for known connections\[char46] Setting \fBct_label\fR marks a connection as one that was previously allowed, but should no longer be allowed due to a policy change\[char46] .RE .PP .PP This table also contains a priority 0 flow with action \fBnext;\fR, so that ACLs allow packets by default\[char46] If the logical datapath has a stateful ACL, the following flows will also be added: .RS .IP \(bu A priority\-1 flow that sets the hint to commit IP traffic to the connection tracker (with action \fBreg0[1] = 1; next;\fR)\[char46] This is needed for the default allow policy because, while the initiator\(cqs direction may not have any stateful rules, the server\(cqs may and then its return traffic would not be known and marked as invalid\[char46] .IP \(bu A priority\-65535 flow that allows any traffic in the reply direction for a connection that has been committed to the connection tracker (i\[char46]e\[char46], established flows), as long as the committed flow does not have \fBct_label\[char46]blocked\fR set\[char46] We only handle traffic in the reply direction here because we want all packets going in the request direction to still go through the flows that implement the currently defined policy based on ACLs\[char46] If a connection is no longer allowed by policy, \fBct_label\[char46]blocked\fR will get set and packets in the reply direction will no longer be allowed, either\[char46] .IP \(bu A priority\-65535 flow that allows any traffic that is considered related to a committed flow in the connection tracker (e\[char46]g\[char46], an ICMP Port Unreachable from a non-listening UDP port), as 
long as the committed flow does not have \fBct_label\[char46]blocked\fR set\[char46] .IP \(bu A priority\-65535 flow that drops all traffic marked by the connection tracker as invalid\[char46] .IP \(bu A priority\-65535 flow that drops all traffic in the reply direction with \fBct_label\[char46]blocked\fR set, meaning that the connection should no longer be allowed due to a policy change\[char46] Packets in the request direction are skipped here to let a newly created ACL re-allow this connection\[char46] .RE .ST "Ingress Table 7: \fBfrom\-lport\fR QoS Marking" .PP .PP Logical flows in this table closely reproduce those in the \fBQoS\fR table with the \fBaction\fR column set in the \fBOVN_Northbound\fR database for the \fBfrom\-lport\fR direction\[char46] .RS .IP \(bu For every qos_rules entry in a logical switch with DSCP marking enabled, a flow will be added at the priority mentioned in the QoS table\[char46] .IP \(bu One priority\-0 fallback flow that matches all packets and advances to the next table\[char46] .RE .ST "Ingress Table 8: \fBfrom\-lport\fR QoS Meter" .PP .PP Logical flows in this table closely reproduce those in the \fBQoS\fR table with the \fBbandwidth\fR column set in the \fBOVN_Northbound\fR database for the \fBfrom\-lport\fR direction\[char46] .RS .IP \(bu For every qos_rules entry in a logical switch with metering enabled, a flow will be added at the priority mentioned in the QoS table\[char46] .IP \(bu One priority\-0 fallback flow that matches all packets and advances to the next table\[char46] .RE .ST "Ingress Table 9: LB" .PP .PP It contains a priority\-0 flow that simply moves traffic to the next table\[char46] For established connections a priority 100 flow matches on \fBct\[char46]est && !ct\[char46]rel && !ct\[char46]new && !ct\[char46]inv\fR and sets an action \fBreg0[2] = 1; next;\fR to act as a hint for table \fBStateful\fR to send packets through connection tracker to NAT the packets\[char46] (The packet will automatically get 
DNATed to the same IP address as the first packet in that connection\[char46]) .ST "Ingress Table 10: Stateful" .RS .IP \(bu For all the configured load balancing rules for a switch in \fBOVN_Northbound\fR database that includes an L4 port \fIPORT\fR of protocol \fIP\fR and IP address \fIVIP\fR, a priority\-120 flow is added\[char46] For IPv4 \fIVIPs\fR, the flow matches \fBct\[char46]new && ip && ip4\[char46]dst == \fIVIP\fB && \fIP\fB && \fIP\fB\[char46]dst == \fIPORT\fB\fR\[char46] For IPv6 \fIVIPs\fR, the flow matches \fBct\[char46]new && ip && ip6\[char46]dst == \fIVIP\fB && \fIP\fB && \fIP\fB\[char46]dst == \fIPORT\fB\fR\[char46] The flow\(cqs action is \fBct_lb(\fIargs\fB)\fR, where \fIargs\fR contains comma-separated IP addresses (and optional port numbers) to load balance to\[char46] The address family of the IP addresses of \fIargs\fR is the same as the address family of \fIVIP\fR\[char46] .IP \(bu For all the configured load balancing rules for a switch in \fBOVN_Northbound\fR database that includes just an IP address \fIVIP\fR to match on, OVN adds a priority\-110 flow\[char46] For IPv4 \fIVIPs\fR, the flow matches \fBct\[char46]new && ip && ip4\[char46]dst == \fIVIP\fB\fR\[char46] For IPv6 \fIVIPs\fR, the flow matches \fBct\[char46]new && ip && ip6\[char46]dst == \fIVIP\fB\fR\[char46] The action on this flow is \fBct_lb(\fIargs\fB)\fR, where \fIargs\fR contains comma-separated IP addresses of the same address family as \fIVIP\fR\[char46] .IP \(bu A priority\-100 flow commits packets to connection tracker using \fBct_commit; next;\fR action based on a hint provided by the previous tables (with a match for \fBreg0[1] == 1\fR)\[char46] .IP \(bu A priority\-100 flow sends the packets to connection tracker using \fBct_lb;\fR as the action based on a hint provided by the previous tables (with a match for \fBreg0[2] == 1\fR)\[char46] .IP \(bu A priority\-0 flow that simply moves traffic to the next table\[char46] .RE .ST "Ingress Table 11: ARP/ND responder" .PP 
.PP This table implements an ARP/ND responder in a logical switch for known IPs\[char46] The advantage of the ARP responder flow is to limit ARP broadcasts by locally responding to ARP requests without the need to send to other hypervisors\[char46] One common case is when the inport is a logical port associated with a VIF and the broadcast is responded to on the local hypervisor rather than broadcast across the whole network and responded to by the destination VM\[char46] This behavior is proxy ARP\[char46] .PP .PP ARP requests arrive from VMs from a logical switch inport of type default\[char46] For this case, the logical switch proxy ARP rules can be for other VMs or logical router ports\[char46] Logical switch proxy ARP rules may be programmed both for MAC binding of IP addresses on other logical switch VIF ports (which are of the default logical switch port type, representing connectivity to VMs or containers), and for MAC binding of IP addresses on logical switch router type ports, representing their logical router port peers\[char46] In order to support proxy ARP for logical router ports, an IP address must be configured on the logical switch router type port, with the same value as the peer logical router port\[char46] The configured MAC addresses must match as well\[char46] When a VM sends an ARP request for a distributed logical router port and if the peer router type port of the attached logical switch does not have an IP address configured, the ARP request will be broadcast on the logical switch\[char46] One of the copies of the ARP request will go through the logical switch router type port to the logical router datapath, where the logical router ARP responder will generate a reply\[char46] The MAC binding of a distributed logical router, once learned by an associated VM, is used for all that VM\(cqs communication needing routing\[char46] Hence, the action of a VM re-ARPing for the MAC binding of the logical router port should be rare\[char46] .PP .PP 
Logical switch ARP responder proxy ARP rules can also be hit when receiving ARP requests externally on an L2 gateway port\[char46] In this case, the hypervisor acting as an L2 gateway responds to the ARP request on behalf of a destination VM\[char46] .PP .PP Note that ARP requests received from \fBlocalnet\fR or \fBvtep\fR logical inports can either go directly to VMs, in which case the VM responds, or hit an ARP responder for a logical router port if the packet is used to resolve a logical router port next hop address\[char46] In either case, logical switch ARP responder rules will not be hit\[char46] It contains these logical flows: .RS .IP \(bu Priority\-100 flows to skip the ARP responder if inport is of type \fBlocalnet\fR or \fBvtep\fR and advance directly to the next table\[char46] ARP requests sent to \fBlocalnet\fR or \fBvtep\fR ports can be received by multiple hypervisors\[char46] Because the same MAC binding rules are downloaded to all hypervisors, each of the multiple hypervisors will respond\[char46] This will confuse L2 learning on the source of the ARP requests\[char46] ARP requests received on an inport of type \fBrouter\fR are not expected to hit any logical switch ARP responder flows\[char46] However, no skip flows are installed for these packets, as there would be some additional flow cost for this and the value appears limited\[char46] .IP \(bu Priority\-50 flows that match ARP requests to each known IP address \fIA\fR of every logical switch port, and respond with ARP replies directly with corresponding Ethernet address \fIE\fR: .IP .nf \fB .br \fBeth\[char46]dst = eth\[char46]src; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBarp\[char46]op = 2; /* ARP reply\[char46] */ .br \fBarp\[char46]tha = arp\[char46]sha; .br \fBarp\[char46]sha = \fR\fIE\fB\fR; .br \fBarp\[char46]tpa = arp\[char46]spa; .br \fBarp\[char46]spa = \fR\fIA\fB\fR; .br \fBoutport = inport; .br \fBflags\[char46]loopback = 1; .br \fBoutput; .br \fB \fR .fi .IP These 
flows are omitted for logical ports (other than router ports or \fBlocalport\fR ports) that are down\[char46] .IP \(bu Priority\-50 flows that match IPv6 ND neighbor solicitations to each known IP address \fIA\fR (and \fIA\fR\(cqs solicited node address) of every logical switch port except those of type router, and respond with neighbor advertisements directly with corresponding Ethernet address \fIE\fR: .IP .nf \fB .br \fBnd_na { .br \fB eth\[char46]src = \fR\fIE\fB\fR; .br \fB ip6\[char46]src = \fR\fIA\fB\fR; .br \fB nd\[char46]target = \fR\fIA\fB\fR; .br \fB nd\[char46]tll = \fR\fIE\fB\fR; .br \fB outport = inport; .br \fB flags\[char46]loopback = 1; .br \fB output; .br \fB}; .br \fB \fR .fi .IP Priority\-50 flows that match IPv6 ND neighbor solicitations to each known IP address \fIA\fR (and \fIA\fR\(cqs solicited node address) of every logical switch port of type router, and respond with neighbor advertisements directly with corresponding Ethernet address \fIE\fR: .IP .nf \fB .br \fBnd_na_router { .br \fB eth\[char46]src = \fR\fIE\fB\fR; .br \fB ip6\[char46]src = \fR\fIA\fB\fR; .br \fB nd\[char46]target = \fR\fIA\fB\fR; .br \fB nd\[char46]tll = \fR\fIE\fB\fR; .br \fB outport = inport; .br \fB flags\[char46]loopback = 1; .br \fB output; .br \fB}; .br \fB \fR .fi .IP These flows are omitted for logical ports (other than router ports or \fBlocalport\fR ports) that are down\[char46] .IP \(bu Priority\-100 flows with match criteria like the ARP and ND flows above, except that they only match packets from the \fBinport\fR that owns the IP addresses in question, with action \fBnext;\fR\[char46] These flows prevent OVN from replying to, for example, an ARP request emitted by a VM for its own IP address\[char46] A VM only makes this kind of request to attempt to detect a duplicate IP address assignment, so sending a reply will prevent the VM from accepting the IP address that it owns\[char46] .IP In place of \fBnext;\fR, it would be reasonable to use \fBdrop;\fR for the flows\(cq 
actions\[char46] If everything is working as it is configured, then this would produce equivalent results, since no host should reply to the request\[char46] But ARPing for one\(cqs own IP address is intended to detect situations where the network is not working as configured, so dropping the request would frustrate that intent\[char46] .IP \(bu One priority\-0 fallback flow that matches all packets and advances to the next table\[char46] .RE .ST "Ingress Table 12: DHCP option processing" .PP .PP This table adds the DHCPv4 options to a DHCPv4 packet from the logical ports configured with IPv4 address(es) and DHCPv4 options, and similarly for DHCPv6 options\[char46] .RS .IP \(bu A priority\-100 logical flow is added for these logical ports which matches the IPv4 packet with \fBudp\[char46]src\fR = 68 and \fBudp\[char46]dst\fR = 67 and applies the action \fBput_dhcp_opts\fR and advances the packet to the next table\[char46] .IP .nf \fB .br \fBreg0[3] = put_dhcp_opts(offer_ip = \fR\fIip\fB\fR, \fR\fIoptions\fB\fR\[char46]\[char46]\[char46]); .br \fBnext; .br \fB \fR .fi .IP For DHCPDISCOVER and DHCPREQUEST, this transforms the packet into a DHCP reply, adds the DHCP offer IP \fIip\fR and options to the packet, and stores 1 into reg0[3]\[char46] For other kinds of packets, it just stores 0 into reg0[3]\[char46] Either way, it continues to the next table\[char46] .IP \(bu A priority\-100 logical flow is added for these logical ports which matches the IPv6 packet with \fBudp\[char46]src\fR = 546 and \fBudp\[char46]dst\fR = 547 and applies the action \fBput_dhcpv6_opts\fR and advances the packet to the next table\[char46] .IP .nf \fB .br \fBreg0[3] = put_dhcpv6_opts(ia_addr = \fR\fIip\fB\fR, \fR\fIoptions\fB\fR\[char46]\[char46]\[char46]); .br \fBnext; .br \fB \fR .fi .IP For DHCPv6 Solicit/Request/Confirm packets, this transforms the packet into a DHCPv6 Advertise/Reply, adds the DHCPv6 offer IP \fIip\fR and options to the packet, and stores 1 into reg0[3]\[char46] For 
other kinds of packets, it just stores 0 into reg0[3]\[char46] Either way, it continues to the next table\[char46] .IP \(bu A priority\-0 flow that matches all packets and advances to the next table\[char46] .RE .ST "Ingress Table 13: DHCP responses" .PP .PP This table implements the DHCP responder for the DHCP replies generated by the previous table\[char46] .RS .IP \(bu A priority\-100 logical flow is added for the logical ports configured with DHCPv4 options which matches IPv4 packets with \fBudp\[char46]src == 68 && udp\[char46]dst == 67 && reg0[3] == 1\fR and responds back to the \fBinport\fR after applying these actions\[char46] If \fBreg0[3]\fR is set to 1, it means that the action \fBput_dhcp_opts\fR was successful\[char46] .IP .nf \fB .br \fBeth\[char46]dst = eth\[char46]src; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBip4\[char46]dst = \fR\fIA\fB\fR; .br \fBip4\[char46]src = \fR\fIS\fB\fR; .br \fBudp\[char46]src = 67; .br \fBudp\[char46]dst = 68; .br \fBoutport = \fR\fIP\fB\fR; .br \fBflags\[char46]loopback = 1; .br \fBoutput; .br \fB \fR .fi .IP where \fIE\fR is the server MAC address and \fIS\fR is the server IPv4 address defined in the DHCPv4 options and \fIA\fR is the IPv4 address defined in the logical port\(cqs addresses column\[char46] .IP (This terminates ingress packet processing; the packet does not go to the next ingress table\[char46]) .IP \(bu A priority\-100 logical flow is added for the logical ports configured with DHCPv6 options which matches IPv6 packets with \fBudp\[char46]src == 546 && udp\[char46]dst == 547 && reg0[3] == 1\fR and responds back to the \fBinport\fR after applying these actions\[char46] If \fBreg0[3]\fR is set to 1, it means that the action \fBput_dhcpv6_opts\fR was successful\[char46] .IP .nf \fB .br \fBeth\[char46]dst = eth\[char46]src; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBip6\[char46]dst = \fR\fIA\fB\fR; .br \fBip6\[char46]src = \fR\fIS\fB\fR; .br \fBudp\[char46]src = 547; .br \fBudp\[char46]dst = 546; .br \fBoutport 
= \fR\fIP\fB\fR; .br \fBflags\[char46]loopback = 1; .br \fBoutput; .br \fB \fR .fi .IP where \fIE\fR is the server MAC address and \fIS\fR is the server IPv6 LLA address generated from the \fBserver_id\fR defined in the DHCPv6 options and \fIA\fR is the IPv6 address defined in the logical port\(cqs addresses column\[char46] .IP (This terminates ingress packet processing; the packet does not go to the next ingress table\[char46]) .IP \(bu A priority\-0 flow that matches all packets and advances to the next table\[char46] .RE .ST "Ingress Table 14: DNS Lookup" .PP .PP This table looks up and resolves the DNS names to the corresponding configured IP address(es)\[char46] .RS .IP \(bu A priority\-100 logical flow for each logical switch datapath if it is configured with DNS records, which matches the IPv4 and IPv6 packets with \fBudp\[char46]dst\fR = 53 and applies the action \fBdns_lookup\fR and advances the packet to the next table\[char46] .IP .nf \fB .br \fBreg0[4] = dns_lookup(); next; .br \fB \fR .fi .IP For valid DNS packets, this transforms the packet into a DNS reply if the DNS name can be resolved, and stores 1 into reg0[4]\[char46] For failed DNS resolution or other kinds of packets, it just stores 0 into reg0[4]\[char46] Either way, it continues to the next table\[char46] .RE .ST "Ingress Table 15: DNS Responses" .PP .PP This table implements the DNS responder for the DNS replies generated by the previous table\[char46] .RS .IP \(bu A priority\-100 logical flow for each logical switch datapath if it is configured with DNS records, which matches the IPv4 and IPv6 packets with \fBudp\[char46]dst = 53 && reg0[4] == 1\fR and responds back to the \fBinport\fR after applying these actions\[char46] If \fBreg0[4]\fR is set to 1, it means that the action \fBdns_lookup\fR was successful\[char46] .IP .nf \fB .br \fBeth\[char46]dst <\-> eth\[char46]src; .br \fBip4\[char46]src <\-> ip4\[char46]dst; .br \fBudp\[char46]dst = udp\[char46]src; .br \fBudp\[char46]src = 53; .br \fBoutport = 
\fR\fIP\fB\fR; .br \fBflags\[char46]loopback = 1; .br \fBoutput; .br \fB \fR .fi .IP (This terminates ingress packet processing; the packet does not go to the next ingress table\[char46]) .RE .ST "Ingress Table 16: Destination Lookup" .PP .PP This table implements switching behavior\[char46] It contains these logical flows: .RS .IP \(bu A priority\-100 flow that outputs all packets with an Ethernet broadcast or multicast \fBeth\[char46]dst\fR to the \fBMC_FLOOD\fR multicast group, which \fBovn\-northd\fR populates with all enabled logical ports\[char46] .IP \(bu One priority\-50 flow that matches each known Ethernet address against \fBeth\[char46]dst\fR and outputs the packet to the single associated output port\[char46] .IP For the Ethernet address on a logical switch port of type \fBrouter\fR, when that logical switch port\(cqs \fBaddresses\fR column is set to \fBrouter\fR and the connected logical router port specifies a \fBredirect\-chassis\fR: .RS .IP \(bu The flow for the connected logical router port\(cqs Ethernet address is only programmed on the \fBredirect\-chassis\fR\[char46] .IP \(bu If the logical router has rules specified in \fBnat\fR with \fBexternal_mac\fR, then those addresses are also used to populate the switch\(cqs destination lookup on the chassis where \fBlogical_port\fR is resident\[char46] .RE .IP \(bu One priority\-0 fallback flow that matches all packets and outputs them to the \fBMC_UNKNOWN\fR multicast group, which \fBovn\-northd\fR populates with all enabled logical ports that accept unknown destination packets\[char46] As a small optimization, if no logical ports accept unknown destination packets, \fBovn\-northd\fR omits this multicast group and logical flow\[char46] .RE .ST "Egress Table 0: Pre-LB" .PP .PP This table is similar to ingress table \fBPre\-LB\fR\[char46] It contains a priority\-0 flow that simply moves traffic to the next table\[char46] Moreover, it contains a priority\-110 flow to move IPv6 Neighbor Discovery traffic to 
the next table\[char46] If any load balancing rules exist for the datapath, a priority\-100 flow is added with a match of \fBip\fR and action of \fBreg0[0] = 1; next;\fR to act as a hint for table \fBPre\-stateful\fR to send IP packets to the connection tracker for packet de-fragmentation\[char46] .ST "Egress Table 1: \fBto\-lport\fR Pre-ACLs" .PP .PP This is similar to ingress table \fBPre\-ACLs\fR except for \fBto\-lport\fR traffic\[char46] .ST "Egress Table 2: Pre-stateful" .PP .PP This is similar to ingress table \fBPre\-stateful\fR\[char46] .ST "Egress Table 3: LB" .PP .PP This is similar to ingress table \fBLB\fR\[char46] .ST "Egress Table 4: \fBto\-lport\fR ACLs" .PP .PP This is similar to ingress table \fBACLs\fR except for \fBto\-lport\fR ACLs\[char46] .PP .PP In addition, the following flows are added\[char46] .RS .IP \(bu A priority\-34000 logical flow is added for each logical port that has DHCPv4 options defined, to allow the DHCPv4 reply packet from \fBIngress Table 13: DHCP responses\fR, and likewise for each logical port that has DHCPv6 options defined, to allow the DHCPv6 reply packet from that table\[char46] .IP \(bu A priority\-34000 logical flow is added for each logical switch datapath configured with DNS records with the match \fBudp\[char46]dst = 53\fR to allow the DNS reply packet from \fBIngress Table 15: DNS Responses\fR\[char46] .RE .ST "Egress Table 5: \fBto\-lport\fR QoS Marking" .PP .PP This is similar to ingress table \fBQoS marking\fR except that it applies to \fBto\-lport\fR QoS rules\[char46] .ST "Egress Table 6: \fBto\-lport\fR QoS Meter" .PP .PP This is similar to ingress table \fBQoS meter\fR except that it applies to \fBto\-lport\fR QoS rules\[char46] .ST "Egress Table 7: Stateful" .PP .PP This is similar to ingress table \fBStateful\fR except that there are no rules added for load balancing new connections\[char46] .ST "Egress Table 8: Egress Port Security - IP" .PP .PP This is similar to the port security logic in table \fBIngress Port Security \- IP\fR except that 
\fBoutport\fR, \fBeth\[char46]dst\fR, \fBip4\[char46]dst\fR and \fBip6\[char46]dst\fR are checked instead of \fBinport\fR, \fBeth\[char46]src\fR, \fBip4\[char46]src\fR and \fBip6\[char46]src\fR\[char46] .ST "Egress Table 9: Egress Port Security - L2" .PP .PP This is similar to the ingress port security logic in ingress table \fBAdmission Control and Ingress Port Security \- L2\fR, but with important differences\[char46] Most obviously, \fBoutport\fR and \fBeth\[char46]dst\fR are checked instead of \fBinport\fR and \fBeth\[char46]src\fR\[char46] Second, packets directed to broadcast or multicast \fBeth\[char46]dst\fR are always accepted instead of being subject to the port security rules; this is implemented through a priority\-100 flow that matches on \fBeth\[char46]mcast\fR with action \fBoutput;\fR\[char46] Finally, to ensure that even broadcast and multicast packets are not delivered to disabled logical ports, a priority\-150 flow for each disabled logical \fBoutport\fR overrides the priority\-100 flow with a \fBdrop;\fR action\[char46] .SS "Logical Router Datapaths" .PP .PP Logical router datapaths exist only for \fBLogical_Router\fR rows in the \fBOVN_Northbound\fR database that do not have \fBenabled\fR set to \fBfalse\fR\[char46] .ST "Ingress Table 0: L2 Admission Control" .PP .PP This table drops packets that the router shouldn\(cqt see at all based on their Ethernet headers\[char46] It contains the following flows: .RS .IP \(bu Priority\-100 flows to drop packets with VLAN tags or multicast Ethernet source addresses\[char46] .IP \(bu For each enabled router port \fIP\fR with Ethernet address \fIE\fR, a priority\-50 flow that matches \fBinport == \fIP\fB && (eth\[char46]mcast || eth\[char46]dst == \fIE\fB)\fR, with action \fBnext;\fR\[char46] .IP For the gateway port on a distributed logical router (where one of the logical router ports specifies a \fBredirect\-chassis\fR), the above flow matching \fBeth\[char46]dst == \fIE\fB\fR is only programmed on the gateway port 
instance on the \fBredirect\-chassis\fR\[char46] .IP \(bu For each \fBdnat_and_snat\fR NAT rule on a distributed router that specifies an external Ethernet address \fIE\fR, a priority\-50 flow that matches \fBinport == \fIGW\fB && eth\[char46]dst == \fIE\fB\fR, where \fIGW\fR is the logical router gateway port, with action \fBnext;\fR\[char46] .IP This flow is only programmed on the gateway port instance on the chassis where the \fBlogical_port\fR specified in the NAT rule resides\[char46] .RE .PP .PP Other packets are implicitly dropped\[char46] .ST "Ingress Table 1: IP Input" .PP .PP This table is the core of the logical router datapath functionality\[char46] It contains the following flows to implement very basic IP host functionality\[char46] .RS .IP \(bu L3 admission control: A priority\-100 flow drops packets that match any of the following: .RS .IP \(bu \fBip4\[char46]src[28\[char46]\[char46]31] == 0xe\fR (multicast source) .IP \(bu \fBip4\[char46]src == 255\[char46]255\[char46]255\[char46]255\fR (broadcast source) .IP \(bu \fBip4\[char46]src == 127\[char46]0\[char46]0\[char46]0/8 || ip4\[char46]dst == 127\[char46]0\[char46]0\[char46]0/8\fR (localhost source or destination) .IP \(bu \fBip4\[char46]src == 0\[char46]0\[char46]0\[char46]0/8 || ip4\[char46]dst == 0\[char46]0\[char46]0\[char46]0/8\fR (zero network source or destination) .IP \(bu \fBip4\[char46]src\fR or \fBip6\[char46]src\fR is any IP address owned by the router, unless the packet was recirculated due to egress loopback as indicated by \fBREGBIT_EGRESS_LOOPBACK\fR\[char46] .IP \(bu \fBip4\[char46]src\fR is the broadcast address of any IP network known to the router\[char46] .RE .IP \(bu ICMP echo reply\[char46] These flows reply to ICMP echo requests received for the router\(cqs IP address\[char46] Let \fIA\fR be an IP address owned by a router port\[char46] Then, for each \fIA\fR that is an IPv4 address, a priority\-90 flow matches on \fBip4\[char46]dst == \fIA\fB\fR and \fBicmp4\[char46]type == 
8 && icmp4\[char46]code == 0\fR (ICMP echo request)\[char46] For each \fIA\fR that is an IPv6 address, a priority\-90 flow matches on \fBip6\[char46]dst == \fIA\fB\fR and \fBicmp6\[char46]type == 128 && icmp6\[char46]code == 0\fR (ICMPv6 echo request)\[char46] The port of the router that receives the echo request does not matter\[char46] Also, the \fBip\[char46]ttl\fR of the echo request packet is not checked, so it complies with RFC 1812, section 4\[char46]2\[char46]2\[char46]9\[char46] Flows for ICMPv4 echo requests use the following actions: .IP .nf \fB .br \fBip4\[char46]dst <\-> ip4\[char46]src; .br \fBip\[char46]ttl = 255; .br \fBicmp4\[char46]type = 0; .br \fBflags\[char46]loopback = 1; .br \fBnext; .br \fB \fR .fi .IP Flows for ICMPv6 echo requests use the following actions: .IP .nf \fB .br \fBip6\[char46]dst <\-> ip6\[char46]src; .br \fBip\[char46]ttl = 255; .br \fBicmp6\[char46]type = 129; .br \fBflags\[char46]loopback = 1; .br \fBnext; .br \fB \fR .fi .IP \(bu Reply to ARP requests\[char46] .IP These flows reply to ARP requests for the router\(cqs own IP address and populate the MAC binding table of the logical router port\[char46] The ARP requests are handled only if the requestor\(cqs IP address belongs to the same subnet as the logical router port\[char46] For each router port \fIP\fR that owns IP address \fIA\fR, which belongs to subnet \fIS\fR with prefix length \fIL\fR, and Ethernet address \fIE\fR, a priority\-90 flow matches \fBinport == \fIP\fB && arp\[char46]spa == \fIS\fB/\fIL\fB && arp\[char46]op == 1 && arp\[char46]tpa == \fIA\fB\fR (ARP request) with the following actions: .IP .nf \fB .br \fBput_arp(inport, arp\[char46]spa, arp\[char46]sha); .br \fBeth\[char46]dst = eth\[char46]src; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBarp\[char46]op = 2; /* ARP reply\[char46] */ .br \fBarp\[char46]tha = arp\[char46]sha; .br \fBarp\[char46]sha = \fR\fIE\fB\fR; .br \fBarp\[char46]tpa = arp\[char46]spa; .br \fBarp\[char46]spa = \fR\fIA\fB\fR; .br \fBoutport = 
\fR\fIP\fB\fR; .br \fBflags\[char46]loopback = 1; .br \fBoutput; .br \fB \fR .fi .IP For the gateway port on a distributed logical router (where one of the logical router ports specifies a \fBredirect\-chassis\fR), the above flows are only programmed on the gateway port instance on the \fBredirect\-chassis\fR\[char46] This behavior avoids generation of multiple ARP responses from different chassis, and allows upstream MAC learning to point to the \fBredirect\-chassis\fR\[char46] .IP \(bu These flows handle ARP requests that are not for the router\(cqs own IP address\[char46] They use the SPA and SHA to populate the logical router port\(cqs MAC binding table, with priority 80\[char46] The typical use case for these flows is handling GARP requests\[char46] For the gateway port on a distributed logical router, these flows are only programmed on the gateway port instance on the \fBredirect\-chassis\fR\[char46] .IP \(bu These flows reply to ARP requests for the virtual IP addresses configured in the router for DNAT or load balancing\[char46] For a configured DNAT IP address or a load balancer IPv4 VIP \fIA\fR, for each router port \fIP\fR with Ethernet address \fIE\fR, a priority\-90 flow matches \fBinport == \fIP\fB && arp\[char46]op == 1 && arp\[char46]tpa == \fIA\fB\fR (ARP request) with the following actions: .IP .nf \fB .br \fBeth\[char46]dst = eth\[char46]src; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBarp\[char46]op = 2; /* ARP reply\[char46] */ .br \fBarp\[char46]tha = arp\[char46]sha; .br \fBarp\[char46]sha = \fR\fIE\fB\fR; .br \fBarp\[char46]tpa = arp\[char46]spa; .br \fBarp\[char46]spa = \fR\fIA\fB\fR; .br \fBoutport = \fR\fIP\fB\fR; .br \fBflags\[char46]loopback = 1; .br \fBoutput; .br \fB \fR .fi .IP For the gateway port on a distributed logical router with NAT (where one of the logical router ports specifies a \fBredirect\-chassis\fR): .RS .IP \(bu If the corresponding NAT rule cannot be handled in a distributed manner, then this flow is only programmed on the 
gateway port instance on the \fBredirect\-chassis\fR\[char46] This behavior avoids generation of multiple ARP responses from different chassis, and allows upstream MAC learning to point to the \fBredirect\-chassis\fR\[char46] .IP \(bu If the corresponding NAT rule can be handled in a distributed manner, then this flow is only programmed on the gateway port instance where the \fBlogical_port\fR specified in the NAT rule resides\[char46] .IP Some of the actions are different for this case, using the \fBexternal_mac\fR specified in the NAT rule rather than the gateway port\(cqs Ethernet address \fIE\fR: .IP .nf \fB .br \fBeth\[char46]src = \fR\fIexternal_mac\fB\fR; .br \fBarp\[char46]sha = \fR\fIexternal_mac\fB\fR; .br \fB \fR .fi .IP This behavior avoids generation of multiple ARP responses from different chassis, and allows upstream MAC learning to point to the correct chassis\[char46] .RE .IP \(bu ARP reply handling\[char46] This flow uses ARP replies to populate the logical router\(cqs ARP table\[char46] A priority\-90 flow with match \fBarp\[char46]op == 2\fR has actions \fBput_arp(inport, arp\[char46]spa, arp\[char46]sha);\fR\[char46] .IP \(bu Reply to IPv6 Neighbor Solicitations\[char46] These flows reply to Neighbor Solicitation requests for the router\(cqs own IPv6 address and load balancing IPv6 VIPs and populate the logical router\(cqs mac binding table\[char46] .IP For each router port \fIP\fR that owns IPv6 address \fIA\fR, solicited node address \fIS\fR, and Ethernet address \fIE\fR, a priority\-90 flow matches \fBinport == \fIP\fB && nd_ns && ip6\[char46]dst == {\fIA\fB, \fIS\fB} && nd\[char46]target == \fIA\fB\fR with the following actions: .IP .nf \fB .br \fBput_nd(inport, ip6\[char46]src, nd\[char46]sll); .br \fBnd_na_router { .br \fB eth\[char46]src = \fR\fIE\fB\fR; .br \fB ip6\[char46]src = \fR\fIA\fB\fR; .br \fB nd\[char46]target = \fR\fIA\fB\fR; .br \fB nd\[char46]tll = \fR\fIE\fB\fR; .br \fB outport = inport; .br \fB flags\[char46]loopback = 1; 
.br \fB output; .br \fB}; .br \fB \fR .fi .IP For each router port \fIP\fR that has load balancing VIP \fIA\fR, solicited node address \fIS\fR, and Ethernet address \fIE\fR, a priority\-90 flow matches \fBinport == \fIP\fB && nd_ns && ip6\[char46]dst == {\fIA\fB, \fIS\fB} && nd\[char46]target == \fIA\fB\fR with the following actions: .IP .nf \fB .br \fBput_nd(inport, ip6\[char46]src, nd\[char46]sll); .br \fBnd_na { .br \fB eth\[char46]src = \fR\fIE\fB\fR; .br \fB ip6\[char46]src = \fR\fIA\fB\fR; .br \fB nd\[char46]target = \fR\fIA\fB\fR; .br \fB nd\[char46]tll = \fR\fIE\fB\fR; .br \fB outport = inport; .br \fB flags\[char46]loopback = 1; .br \fB output; .br \fB}; .br \fB \fR .fi .IP For the gateway port on a distributed logical router (where one of the logical router ports specifies a \fBredirect\-chassis\fR), the above flows replying to IPv6 Neighbor Solicitations are only programmed on the gateway port instance on the \fBredirect\-chassis\fR\[char46] This behavior avoids generation of multiple replies from different chassis, and allows upstream MAC learning to point to the \fBredirect\-chassis\fR\[char46] .IP \(bu IPv6 neighbor advertisement handling\[char46] This flow uses neighbor advertisements to populate the logical router\(cqs mac binding table\[char46] A priority\-90 flow with match \fBnd_na\fR has actions \fBput_nd(inport, nd\[char46]target, nd\[char46]tll);\fR\[char46] .IP \(bu IPv6 neighbor solicitation for non-hosted addresses handling\[char46] This flow uses neighbor solicitations to populate the logical router\(cqs mac binding table (ones that were directed at the logical router would have matched the priority\-90 neighbor solicitation flow already)\[char46] A priority\-80 flow with match \fBnd_ns\fR has actions \fBput_nd(inport, ip6\[char46]src, nd\[char46]sll);\fR\[char46] .IP \(bu UDP port unreachable\[char46] Priority\-80 flows generate ICMP port unreachable messages in reply to UDP datagrams directed to the router\(cqs IP address, except in the 
special case of gateways, which accept traffic directed to a router IP for load balancing and NAT purposes\[char46] .IP These flows should not match IP fragments with nonzero offset\[char46] .IP \(bu TCP reset\[char46] Priority\-80 flows generate TCP reset messages in reply to TCP datagrams directed to the router\(cqs IP address, except in the special case of gateways, which accept traffic directed to a router IP for load balancing and NAT purposes\[char46] .IP These flows should not match IP fragments with nonzero offset\[char46] .IP \(bu Protocol or address unreachable\[char46] Priority\-70 flows generate ICMP protocol or address unreachable messages for IPv4 and IPv6 respectively in reply to packets directed to the router\(cqs IP address on IP protocols other than UDP, TCP, and ICMP, except in the special case of gateways, which accept traffic directed to a router IP for load balancing purposes\[char46] .IP These flows should not match IP fragments with nonzero offset\[char46] .IP \(bu Drop other IP traffic to this router\[char46] These flows drop any other traffic destined to an IP address of this router that is not already handled by one of the flows above, which amounts to ICMP (other than echo requests) and fragments with nonzero offsets\[char46] For each IP address \fIA\fR owned by the router, a priority\-60 flow matches \fBip4\[char46]dst == \fIA\fB\fR and drops the traffic\[char46] An exception is made and the above flow is not added if the router port\(cqs own IP address is used to SNAT packets passing through that router\[char46] .RE .PP .PP The flows above handle all of the traffic that might be directed to the router itself\[char46] The following flows (with lower priorities) handle the remaining traffic, potentially for forwarding: .RS .IP \(bu Drop Ethernet local broadcast\[char46] A priority\-50 flow with match \fBeth\[char46]bcast\fR drops traffic destined to the local Ethernet broadcast address\[char46] By definition this traffic should not be 
forwarded\[char46] .IP \(bu ICMP time exceeded\[char46] For each router port \fIP\fR, whose IP address is \fIA\fR, a priority\-40 flow with match \fBinport == \fIP\fB && ip\[char46]ttl == {0, 1} && !ip\[char46]later_frag\fR matches packets whose TTL has expired, with the following actions to send an ICMP time exceeded reply for IPv4 and IPv6 respectively: .IP .nf \fB .br \fBicmp4 { .br \fB icmp4\[char46]type = 11; /* Time exceeded\[char46] */ .br \fB icmp4\[char46]code = 0; /* TTL exceeded in transit\[char46] */ .br \fB ip4\[char46]dst = ip4\[char46]src; .br \fB ip4\[char46]src = \fR\fIA\fB\fR; .br \fB ip\[char46]ttl = 255; .br \fB next; .br \fB}; .br \fB .br \fBicmp6 { .br \fB icmp6\[char46]type = 3; /* Time exceeded\[char46] */ .br \fB icmp6\[char46]code = 0; /* TTL exceeded in transit\[char46] */ .br \fB ip6\[char46]dst = ip6\[char46]src; .br \fB ip6\[char46]src = \fR\fIA\fB\fR; .br \fB ip\[char46]ttl = 255; .br \fB next; .br \fB}; .br \fB \fR .fi .IP \(bu TTL discard\[char46] A priority\-30 flow with match \fBip\[char46]ttl == {0, 1}\fR and actions \fBdrop;\fR drops other packets whose TTL has expired and that should not receive an ICMP error reply (i\[char46]e\[char46] fragments with nonzero offset)\[char46] .IP \(bu Next table\[char46] A priority\-0 flow matches all packets that aren\(cqt already handled and uses actions \fBnext;\fR to feed them to the next table\[char46] .RE .ST "Ingress Table 2: DEFRAG" .PP .PP This table sends packets to the connection tracker for tracking and defragmentation\[char46] It contains a priority\-0 flow that simply moves traffic to the next table\[char46] If load balancing rules with virtual IP addresses (and ports) are configured in the \fBOVN_Northbound\fR database for a Gateway router, a priority\-100 flow is added for each configured virtual IP address \fIVIP\fR\[char46] For IPv4 \fIVIPs\fR the flow matches \fBip && ip4\[char46]dst == \fIVIP\fB\fR\[char46] For IPv6 \fIVIPs\fR, the flow matches \fBip && ip6\[char46]dst == 
\fIVIP\fB\fR\[char46] The flow uses the action \fBct_next;\fR to send IP packets to the connection tracker for packet de-fragmentation and tracking before sending them to the next table\[char46] .ST "Ingress Table 3: UNSNAT" .PP .PP This table handles the reverse traffic of already established connections, i\[char46]e\[char46], SNAT has already been done in the egress pipeline and now the packet has entered the ingress pipeline as part of a reply\[char46] It is unSNATted here\[char46] .PP .PP Ingress Table 3: UNSNAT on Gateway Routers .RS .IP \(bu If the Gateway router has been configured to force SNAT any previously DNATted packets to \fIB\fR, a priority\-110 flow matches \fBip && ip4\[char46]dst == \fIB\fB\fR with an action \fBct_snat;\fR\[char46] .IP If the Gateway router has been configured to force SNAT any previously load-balanced packets to \fIB\fR, a priority\-100 flow matches \fBip && ip4\[char46]dst == \fIB\fB\fR with an action \fBct_snat;\fR\[char46] .IP For each NAT configuration in the OVN Northbound database that asks to change the source IP address of a packet from \fIA\fR to \fIB\fR, a priority\-90 flow matches \fBip && ip4\[char46]dst == \fIB\fB\fR with an action \fBct_snat;\fR\[char46] .IP A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .PP .PP Ingress Table 3: UNSNAT on Distributed Routers .RS .IP \(bu For each configuration in the OVN Northbound database that asks to change the source IP address of a packet from \fIA\fR to \fIB\fR, a priority\-100 flow matches \fBip && ip4\[char46]dst == \fIB\fB && inport == \fIGW\fB\fR, where \fIGW\fR is the logical router gateway port, with an action \fBct_snat;\fR\[char46] .IP If the NAT rule cannot be handled in a distributed manner, then the priority\-100 flow above is only programmed on the \fBredirect\-chassis\fR\[char46] .IP For each configuration in the OVN Northbound database that asks to change the source IP address of a packet from \fIA\fR to \fIB\fR, a priority\-50 
flow matches \fBip && ip4\[char46]dst == \fIB\fB\fR with an action \fBREGBIT_NAT_REDIRECT = 1; next;\fR\[char46] This flow is for east/west traffic to a NAT destination IPv4 address\[char46] By setting the \fBREGBIT_NAT_REDIRECT\fR flag, in the ingress table \fBGateway Redirect\fR this will trigger a redirect to the instance of the gateway port on the \fBredirect\-chassis\fR\[char46] .IP A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .ST "Ingress Table 4: DNAT" .PP .PP Packets enter the pipeline with a destination IP address that needs to be DNATted from a virtual IP address to a real IP address\[char46] Packets in the reverse direction need to be unDNATted\[char46] .PP .PP Ingress Table 4: Load balancing DNAT rules .PP .PP The following load balancing DNAT flows are added for a Gateway router or a Router with a gateway port\[char46] These flows are programmed only on the \fBredirect\-chassis\fR\[char46] These flows do not get programmed for load balancers with IPv6 \fIVIPs\fR\[char46] .RS .IP \(bu For all the configured load balancing rules for a Gateway router or Router with gateway port in the \fBOVN_Northbound\fR database that include an L4 port \fIPORT\fR of protocol \fIP\fR and IPv4 address \fIVIP\fR, a priority\-120 flow that matches on \fBct\[char46]new && ip && ip4\[char46]dst == \fIVIP\fB && \fIP\fB && \fIP\fB\[char46]dst == \fIPORT\fB\fR with an action of \fBct_lb(\fIargs\fB)\fR, where \fIargs\fR contains comma separated IPv4 addresses (and optional port numbers) to load balance to\[char46] If the router is configured to force SNAT any load-balanced packets, the above action will be replaced by \fBflags\[char46]force_snat_for_lb = 1; ct_lb(\fIargs\fB);\fR\[char46] .IP \(bu For all the configured load balancing rules for a router in the \fBOVN_Northbound\fR database that include an L4 port \fIPORT\fR of protocol \fIP\fR and IPv4 address \fIVIP\fR, a priority\-120 flow that matches on \fBct\[char46]est && ip && ip4\[char46]dst == \fIVIP\fB 
&& \fIP\fB && \fIP\fB\[char46]dst == \fIPORT \fB\fR with an action of \fBct_dnat;\fR\[char46] If the router is configured to force SNAT any load-balanced packets, the above action will be replaced by \fBflags\[char46]force_snat_for_lb = 1; ct_dnat;\fR\[char46] .IP \(bu For all the configured load balancing rules for a router in \fBOVN_Northbound\fR database that includes just an IP address \fIVIP\fR to match on, a priority\-110 flow that matches on \fBct\[char46]new && ip && ip4\[char46]dst == \fIVIP\fB\fR with an action of \fBct_lb(\fIargs\fB)\fR, where \fIargs\fR contains comma separated IPv4 addresses\[char46] If the router is configured to force SNAT any load-balanced packets, the above action will be replaced by \fBflags\[char46]force_snat_for_lb = 1; ct_lb(\fIargs\fB);\fR\[char46] .IP \(bu For all the configured load balancing rules for a router in \fBOVN_Northbound\fR database that includes just an IP address \fIVIP\fR to match on, a priority\-110 flow that matches on \fBct\[char46]est && ip && ip4\[char46]dst == \fIVIP\fB\fR with an action of \fBct_dnat;\fR\[char46] If the router is configured to force SNAT any load-balanced packets, the above action will be replaced by \fBflags\[char46]force_snat_for_lb = 1; ct_dnat;\fR\[char46] .RE .PP .PP Ingress Table 4: DNAT on Gateway Routers .RS .IP \(bu For each configuration in the OVN Northbound database, that asks to change the destination IP address of a packet from \fIA\fR to \fIB\fR, a priority\-100 flow matches \fBip && ip4\[char46]dst == \fIA\fB\fR with an action \fBflags\[char46]loopback = 1; ct_dnat(\fIB\fB);\fR\[char46] If the Gateway router is configured to force SNAT any DNATed packet, the above action will be replaced by \fBflags\[char46]force_snat_for_dnat = 1; flags\[char46]loopback = 1; ct_dnat(\fIB\fB);\fR\[char46] .IP \(bu For all IP packets of a Gateway router, a priority\-50 flow with an action \fBflags\[char46]loopback = 1; ct_dnat;\fR\[char46] .IP \(bu A priority\-0 logical flow with match 
\fB1\fR has actions \fBnext;\fR\[char46] .RE .PP .PP Ingress Table 4: DNAT on Distributed Routers .PP .PP On distributed routers, the DNAT table only handles packets with a destination IP address that needs to be DNATted from a virtual IP address to a real IP address\[char46] The unDNAT processing in the reverse direction is handled in a separate table in the egress pipeline\[char46] .RS .IP \(bu For each configuration in the OVN Northbound database that asks to change the destination IP address of a packet from \fIA\fR to \fIB\fR, a priority\-100 flow matches \fBip && ip4\[char46]dst == \fIA\fB && inport == \fIGW\fB\fR, where \fIGW\fR is the logical router gateway port, with an action \fBct_dnat(\fIB\fB);\fR\[char46] .IP If the NAT rule cannot be handled in a distributed manner, then the priority\-100 flow above is only programmed on the \fBredirect\-chassis\fR\[char46] .IP For each configuration in the OVN Northbound database that asks to change the destination IP address of a packet from \fIA\fR to \fIB\fR, a priority\-50 flow matches \fBip && ip4\[char46]dst == \fIA\fB\fR with an action \fBREGBIT_NAT_REDIRECT = 1; next;\fR\[char46] This flow is for east/west traffic to a NAT destination IPv4 address\[char46] By setting the \fBREGBIT_NAT_REDIRECT\fR flag, in the ingress table \fBGateway Redirect\fR this will trigger a redirect to the instance of the gateway port on the \fBredirect\-chassis\fR\[char46] .IP A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .ST "Ingress Table 5: IPv6 ND RA option processing" .RS .IP \(bu A priority\-50 logical flow is added for each logical router port configured with IPv6 ND RA options which matches IPv6 ND Router Solicitation packet and applies the action \fBput_nd_ra_opts\fR and advances the packet to the next table\[char46] .IP .nf \fB .br \fBreg0[5] = put_nd_ra_opts(\fR\fIoptions\fB\fR); next; .br \fB \fR .fi .IP For a valid IPv6 ND RS packet, this transforms the packet into an IPv6 ND RA reply 
and sets the RA options to the packet and stores 1 into reg0[5]\[char46] For other kinds of packets, it just stores 0 into reg0[5]\[char46] Either way, it continues to the next table\[char46] .IP \(bu A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .ST "Ingress Table 6: IPv6 ND RA responder" .PP .PP This table implements IPv6 ND RA responder for the IPv6 ND RA replies generated by the previous table\[char46] .RS .IP \(bu A priority\-50 logical flow is added for each logical router port configured with IPv6 ND RA options which matches IPv6 ND RA packets and \fBreg0[5] == 1\fR and responds back to the \fBinport\fR after applying these actions\[char46] If \fBreg0[5]\fR is set to 1, it means that the action \fBput_nd_ra_opts\fR was successful\[char46] .IP .nf \fB .br \fBeth\[char46]dst = eth\[char46]src; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBip6\[char46]dst = ip6\[char46]src; .br \fBip6\[char46]src = \fR\fII\fB\fR; .br \fBoutport = \fR\fIP\fB\fR; .br \fBflags\[char46]loopback = 1; .br \fBoutput; .br \fB \fR .fi .IP where \fIE\fR is the MAC address and \fII\fR is the IPv6 link local address of the logical router port\[char46] .IP (This terminates packet processing in ingress pipeline; the packet does not go to the next ingress table\[char46]) .IP \(bu A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .ST "Ingress Table 7: IP Routing" .PP .PP A packet that arrives at this table is an IP packet that should be routed to the address in \fBip4\[char46]dst\fR or \fBip6\[char46]dst\fR\[char46] This table implements IP routing, setting \fBreg0\fR (or \fBxxreg0\fR for IPv6) to the next-hop IP address (leaving \fBip4\[char46]dst\fR or \fBip6\[char46]dst\fR, the packet\(cqs final destination, unchanged) and advances to the next table for ARP resolution\[char46] It also sets \fBreg1\fR (or \fBxxreg1\fR) to the IP address owned by the selected router port (ingress table \fBARP Request\fR will generate an 
ARP request, if needed, with \fBreg0\fR as the target protocol address and \fBreg1\fR as the source protocol address)\[char46] .PP .PP This table contains the following logical flows: .RS .IP \(bu For distributed logical routers where one of the logical router ports specifies a \fBredirect\-chassis\fR, a priority\-300 logical flow with match \fBREGBIT_NAT_REDIRECT == 1\fR has actions \fBip\[char46]ttl\-\-; next;\fR\[char46] The \fBoutport\fR will be set later in the Gateway Redirect table\[char46] .IP \(bu IPv4 routing table\[char46] For each route to IPv4 network \fIN\fR with netmask \fIM\fR, on router port \fIP\fR with IP address \fIA\fR and Ethernet address \fIE\fR, a logical flow with match \fBip4\[char46]dst == \fIN\fB/\fIM\fB\fR, whose priority is the number of 1-bits in \fIM\fR, has the following actions: .IP .nf \fB .br \fBip\[char46]ttl\-\-; .br \fBreg0 = \fR\fIG\fB\fR; .br \fBreg1 = \fR\fIA\fB\fR; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBoutport = \fR\fIP\fB\fR; .br \fBflags\[char46]loopback = 1; .br \fBnext; .br \fB \fR .fi .IP (Ingress table 1 already verified that \fBip\[char46]ttl\-\-;\fR will not yield a TTL exceeded error\[char46]) .IP If the route has a gateway, \fIG\fR is the gateway IP address\[char46] Instead, if the route is from a configured static route, \fIG\fR is the next hop IP address\[char46] Else it is \fBip4\[char46]dst\fR\[char46] .IP \(bu IPv6 routing table\[char46] For each route to IPv6 network \fIN\fR with netmask \fIM\fR, on router port \fIP\fR with IP address \fIA\fR and Ethernet address \fIE\fR, a logical flow with match in CIDR notation \fBip6\[char46]dst == \fIN\fB/\fIM\fB\fR, whose priority is the integer value of \fIM\fR, has the following actions: .IP .nf \fB .br \fBip\[char46]ttl\-\-; .br \fBxxreg0 = \fR\fIG\fB\fR; .br \fBxxreg1 = \fR\fIA\fB\fR; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBoutport = \fR\fIP\fB\fR; .br \fBflags\[char46]loopback = 1; .br \fBnext; .br \fB \fR .fi .IP (Ingress table 1 already verified 
that \fBip\[char46]ttl\-\-;\fR will not yield a TTL exceeded error\[char46]) .IP If the route has a gateway, \fIG\fR is the gateway IP address\[char46] Instead, if the route is from a configured static route, \fIG\fR is the next hop IP address\[char46] Else it is \fBip6\[char46]dst\fR\[char46] .IP If the address \fIA\fR is in the link-local scope, the route will be limited to sending on the ingress port\[char46] .RE .ST "Ingress Table 8: ARP/ND Resolution" .PP .PP Any packet that reaches this table is an IP packet whose next-hop IPv4 address is in \fBreg0\fR or IPv6 address is in \fBxxreg0\fR\[char46] (\fBip4\[char46]dst\fR or \fBip6\[char46]dst\fR contains the final destination\[char46]) This table resolves the IP address in \fBreg0\fR (or \fBxxreg0\fR) into an output port in \fBoutport\fR and an Ethernet address in \fBeth\[char46]dst\fR, using the following flows: .RS .IP \(bu For distributed logical routers where one of the logical router ports specifies a \fBredirect\-chassis\fR, a priority\-200 logical flow with match \fBREGBIT_NAT_REDIRECT == 1\fR has actions \fBeth\[char46]dst = \fIE\fB; next;\fR, where \fIE\fR is the Ethernet address of the router\(cqs distributed gateway port\[char46] .IP \(bu Static MAC bindings\[char46] MAC bindings can be known statically based on data in the \fBOVN_Northbound\fR database\[char46] For router ports connected to logical switches, MAC bindings can be known statically from the \fBaddresses\fR column in the \fBLogical_Switch_Port\fR table\[char46] For router ports connected to other logical routers, MAC bindings can be known statically from the \fBmac\fR and \fBnetworks\fR columns in the \fBLogical_Router_Port\fR table\[char46] .IP For each IPv4 address \fIA\fR whose host is known to have Ethernet address \fIE\fR on router port \fIP\fR, a priority\-100 flow with match \fBoutport == \fIP\fB && reg0 == \fIA\fB\fR has actions \fBeth\[char46]dst = \fIE\fB; next;\fR\[char46] .IP For each IPv6 address \fIA\fR whose host is known 
to have Ethernet address \fIE\fR on router port \fIP\fR, a priority\-100 flow with match \fBoutport == \fIP\fB && xxreg0 == \fIA\fB\fR has actions \fBeth\[char46]dst = \fIE\fB; next;\fR\[char46] .IP For each logical router port with an IPv4 address \fIA\fR and a MAC address of \fIE\fR that is reachable via a different logical router port \fIP\fR, a priority\-100 flow with match \fBoutport == \fIP\fB && reg0 == \fIA\fB\fR has actions \fBeth\[char46]dst = \fIE\fB; next;\fR\[char46] .IP For each logical router port with an IPv6 address \fIA\fR and a MAC address of \fIE\fR that is reachable via a different logical router port \fIP\fR, a priority\-100 flow with match \fBoutport == \fIP\fB && xxreg0 == \fIA\fB\fR has actions \fBeth\[char46]dst = \fIE\fB; next;\fR\[char46] .IP \(bu Dynamic MAC bindings\[char46] These flows resolve MAC-to-IP bindings that have become known dynamically through ARP or neighbor discovery\[char46] (The ingress table \fBARP Request\fR will issue an ARP or neighbor solicitation request for cases where the binding is not yet known\[char46]) .IP A priority\-0 logical flow with match \fBip4\fR has actions \fBget_arp(outport, reg0); next;\fR\[char46] .IP A priority\-0 logical flow with match \fBip6\fR has actions \fBget_nd(outport, xxreg0); next;\fR\[char46] .RE .ST "Ingress Table 9: Gateway Redirect" .PP .PP For distributed logical routers where one of the logical router ports specifies a \fBredirect\-chassis\fR, this table redirects certain packets to the distributed gateway port instance on the \fBredirect\-chassis\fR\[char46] This table has the following flows: .RS .IP \(bu A priority\-200 logical flow with match \fBREGBIT_NAT_REDIRECT == 1\fR has actions \fBoutport = \fICR\fB; next;\fR, where \fICR\fR is the \fBchassisredirect\fR port representing the instance of the logical router distributed gateway port on the \fBredirect\-chassis\fR\[char46] .IP \(bu A priority\-150 logical flow with match \fBoutport == \fIGW\fB && eth\[char46]dst == 
00:00:00:00:00:00\fR has actions \fBoutport = \fICR\fB; next;\fR, where \fIGW\fR is the logical router distributed gateway port and \fICR\fR is the \fBchassisredirect\fR port representing the instance of the logical router distributed gateway port on the \fBredirect\-chassis\fR\[char46] .IP \(bu For each NAT rule in the OVN Northbound database that can be handled in a distributed manner, a priority\-100 logical flow with match \fBip4\[char46]src == \fIB\fB && outport == \fIGW\fB\fR, where \fIGW\fR is the logical router distributed gateway port, with actions \fBnext;\fR\[char46] .IP \(bu A priority\-50 logical flow with match \fBoutport == \fIGW\fB\fR has actions \fBoutport = \fICR\fB; next;\fR, where \fIGW\fR is the logical router distributed gateway port and \fICR\fR is the \fBchassisredirect\fR port representing the instance of the logical router distributed gateway port on the \fBredirect\-chassis\fR\[char46] .IP \(bu A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .ST "Ingress Table 10: ARP Request" .PP .PP In the common case where the Ethernet destination has been resolved, this table outputs the packet\[char46] Otherwise, it composes and sends an ARP or IPv6 Neighbor Solicitation request\[char46] It holds the following flows: .RS .IP \(bu Unknown MAC address\[char46] A priority\-100 flow for IPv4 packets with match \fBeth\[char46]dst == 00:00:00:00:00:00\fR has the following actions: .IP .nf \fB .br \fBarp { .br \fB eth\[char46]dst = ff:ff:ff:ff:ff:ff; .br \fB arp\[char46]spa = reg1; .br \fB arp\[char46]tpa = reg0; .br \fB arp\[char46]op = 1; /* ARP request\[char46] */ .br \fB output; .br \fB}; .br \fB \fR .fi .IP Unknown MAC address\[char46] A priority\-100 flow for IPv6 packets with match \fBeth\[char46]dst == 00:00:00:00:00:00\fR has the following actions: .IP .nf \fB .br \fBnd_ns { .br \fB nd\[char46]target = xxreg0; .br \fB output; .br \fB}; .br \fB \fR .fi .IP (Ingress table \fBIP Routing\fR initialized \fBreg1\fR with 
the IP address owned by \fBoutport\fR and \fB(xx)reg0\fR with the next-hop IP address\[char46]) .IP The IP packet that triggers the ARP/IPv6 NS request is dropped\[char46] .IP \(bu Known MAC address\[char46] A priority\-0 flow with match \fB1\fR has actions \fBoutput;\fR\[char46] .RE .ST "Egress Table 0: UNDNAT" .PP .PP This table handles the reverse traffic of already established connections, i\[char46]e\[char46], DNAT has already been done in the ingress pipeline and now the packet has entered the egress pipeline as part of a reply\[char46] For NAT on a distributed router, it is unDNATted here\[char46] For Gateway routers, the unDNAT processing is carried out in the ingress DNAT table\[char46] .RS .IP \(bu For all the configured load balancing rules for a router with gateway port in the \fBOVN_Northbound\fR database that include an IPv4 address \fBVIP\fR, for every backend IPv4 address \fIB\fR defined for the \fBVIP\fR a priority\-120 flow is programmed on \fBredirect\-chassis\fR that matches \fBip && ip4\[char46]src == \fIB\fB && outport == \fIGW\fB\fR, where \fIGW\fR is the logical router gateway port, with an action \fBct_dnat;\fR\[char46] If the backend IPv4 address \fIB\fR is also configured with L4 port \fIPORT\fR of protocol \fIP\fR, then the match also includes \fBP\[char46]src\fR == \fIPORT\fR\[char46] These flows are not added for load balancers with IPv6 \fIVIPs\fR\[char46] .IP If the router is configured to force SNAT any load-balanced packets, the above action is replaced by \fBflags\[char46]force_snat_for_lb = 1; ct_dnat;\fR\[char46] .IP \(bu For each configuration in the OVN Northbound database that asks to change the destination IP address of a packet from an IP address of \fIA\fR to \fIB\fR, a priority\-100 flow matches \fBip && ip4\[char46]src == \fIB\fB && outport == \fIGW\fB\fR, where \fIGW\fR is the logical router gateway port, with an action \fBct_dnat;\fR\[char46] .IP If the NAT rule cannot be handled in a distributed manner, then the priority\-100 flow 
above is only programmed on the \fBredirect\-chassis\fR\[char46] .IP If the NAT rule can be handled in a distributed manner, then there is an additional action \fBeth\[char46]src = \fIEA\fB;\fR, where \fIEA\fR is the ethernet address associated with the IP address \fIA\fR in the NAT rule\[char46] This allows upstream MAC learning to point to the correct chassis\[char46] .IP \(bu A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .ST "Egress Table 1: SNAT" .PP .PP Packets that are configured to be SNATed get their source IP address changed based on the configuration in the OVN Northbound database\[char46] .PP .PP Egress Table 1: SNAT on Gateway Routers .RS .IP \(bu If the Gateway router in the OVN Northbound database has been configured to force SNAT a packet (that has been previously DNATted) to \fIB\fR, a priority\-100 flow matches \fBflags\[char46]force_snat_for_dnat == 1 && ip\fR with an action \fBct_snat(\fIB\fB);\fR\[char46] .IP If the Gateway router in the OVN Northbound database has been configured to force SNAT a packet (that has been previously load-balanced) to \fIB\fR, a priority\-100 flow matches \fBflags\[char46]force_snat_for_lb == 1 && ip\fR with an action \fBct_snat(\fIB\fB);\fR\[char46] .IP For each configuration in the OVN Northbound database, that asks to change the source IP address of a packet from an IP address of \fIA\fR or to change the source IP address of a packet that belongs to network \fIA\fR to \fIB\fR, a flow matches \fBip && ip4\[char46]src == \fIA\fB\fR with an action \fBct_snat(\fIB\fB);\fR\[char46] The priority of the flow is calculated based on the mask of \fIA\fR, with matches having larger masks getting higher priorities\[char46] .IP A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .PP .PP Egress Table 1: SNAT on Distributed Routers .RS .IP \(bu For each configuration in the OVN Northbound database, that asks to change the source IP address of a packet from an IP 
address of \fIA\fR or to change the source IP address of a packet that belongs to network \fIA\fR to \fIB\fR, a flow matches \fBip && ip4\[char46]src == \fIA\fB && outport == \fIGW\fB\fR, where \fIGW\fR is the logical router gateway port, with an action \fBct_snat(\fIB\fB);\fR\[char46] The priority of the flow is calculated based on the mask of \fIA\fR, with matches having larger masks getting higher priorities\[char46] .IP If the NAT rule cannot be handled in a distributed manner, then the flow above is only programmed on the \fBredirect\-chassis\fR\[char46] .IP If the NAT rule can be handled in a distributed manner, then there is an additional action \fBeth\[char46]src = \fIEA\fB;\fR, where \fIEA\fR is the ethernet address associated with the IP address \fIA\fR in the NAT rule\[char46] This allows upstream MAC learning to point to the correct chassis\[char46] .IP \(bu A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .ST "Egress Table 2: Egress Loopback" .PP .PP For distributed logical routers where one of the logical router ports specifies a \fBredirect\-chassis\fR\[char46] .PP .PP Earlier in the ingress pipeline, some east-west traffic was redirected to the \fBchassisredirect\fR port, based on flows in the \fBUNSNAT\fR and \fBDNAT\fR ingress tables setting the \fBREGBIT_NAT_REDIRECT\fR flag, which then triggered a match to a flow in the \fBGateway Redirect\fR ingress table\[char46] The intention was not to actually send traffic out the distributed gateway port instance on the \fBredirect\-chassis\fR\[char46] This traffic was sent to the distributed gateway port instance in order for DNAT and/or SNAT processing to be applied\[char46] .PP .PP While UNDNAT and SNAT processing have already occurred by this point, this traffic needs to be forced through egress loopback on this distributed gateway port instance, in order for UNSNAT and DNAT processing to be applied, and also for IP routing and ARP resolution after all of the NAT 
processing, so that the packet can be forwarded to the destination\[char46] .PP .PP This table has the following flows: .RS .IP \(bu For each NAT rule in the OVN Northbound database on a distributed router, a priority\-100 logical flow with match \fBip4\[char46]dst == \fIE\fB && outport == \fIGW\fB\fR, where \fIE\fR is the external IP address specified in the NAT rule, and \fIGW\fR is the logical router distributed gateway port, with the following actions: .IP .nf \fB .br \fBclone { .br \fB ct_clear; .br \fB inport = outport; .br \fB outport = \(dq\(dq; .br \fB flags = 0; .br \fB flags\[char46]loopback = 1; .br \fB reg0 = 0; .br \fB reg1 = 0; .br \fB \[char46]\[char46]\[char46] .br \fB reg9 = 0; .br \fB REGBIT_EGRESS_LOOPBACK = 1; .br \fB next(pipeline=ingress, table=0); .br \fB}; .br \fB \fR .fi .IP \fBflags\[char46]loopback\fR is set since in_port is unchanged and the packet may return back to that port after NAT processing\[char46] \fBREGBIT_EGRESS_LOOPBACK\fR is set to indicate that egress loopback has occurred, in order to skip the source IP address check against the router address\[char46] .IP \(bu A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .ST "Egress Table 3: Delivery" .PP .PP Packets that reach this table are ready for delivery\[char46] It contains priority\-100 logical flows that match packets on each enabled logical router port, with action \fBoutput;\fR\[char46]
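.PP
As a rough illustration of how the \fBIP Routing\fR ingress table described earlier orders its flows, the Python sketch below derives the priority, match, and actions for a single route\[char46] It is not part of \fBovn\-northd\fR: the function name \fBroute_flow\fR, its parameters, the CIDR form used for the IPv4 match, and the quoting of the port name are assumptions made for this example only\[char46]
.PP
.nf
```python
import ipaddress


def route_flow(prefix, port, port_ip, port_mac, gateway=None):
    """Return (priority, match, actions) for one route (hypothetical helper)."""
    net = ipaddress.ip_network(prefix, strict=False)
    is_v6 = net.version == 6

    if is_v6:
        # IPv6: the priority is the integer value of the mask (prefix length).
        priority = net.prefixlen
    else:
        # IPv4: the priority is the number of 1-bits in the netmask.
        priority = bin(int(net.netmask)).count("1")

    field = "ip6.dst" if is_v6 else "ip4.dst"
    reg = "xxreg" if is_v6 else "reg"

    # With a gateway or static next hop, G is that address;
    # otherwise the next hop is the packet's own destination.
    next_hop = gateway if gateway else field

    match = f"{field} == {net.network_address}/{net.prefixlen}"
    actions = (
        f"ip.ttl--; {reg}0 = {next_hop}; {reg}1 = {port_ip}; "
        f"eth.src = {port_mac}; "
        f'outport = "{port}"; flags.loopback = 1; next;'
    )
    return priority, match, actions


if __name__ == "__main__":
    # Example values are made up for illustration.
    for row in (
        route_flow("192.168.1.0/24", "lrp1", "192.168.1.1", "00:00:00:00:ff:01"),
        route_flow("0.0.0.0/0", "lrp2", "10.0.0.1", "00:00:00:00:ff:02",
                   gateway="10.0.0.254"),
    ):
        print(row)
```
.fi
.PP
Because a longer prefix yields a higher priority, the most specific matching route wins, and a default route (priority 0 in this sketch) is used only when nothing else matches\[char46]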