'\" p .\" -*- nroff -*- .TH "ovn-northd" 8 "ovn-northd" "OVN 22\[char46]12\[char46]0" "OVN Manual" .fp 5 L CR \\" Make fixed-width font available as \\fL. .de TQ . br . ns . TP "\\$1" .. .de ST . PP . RS -0.15in . I "\\$1" . RE .. .de SU . PP . I "\\$1" .. .PP .SH "NAME" .PP .PP ovn-northd and ovn-northd-ddlog \- Open Virtual Network central control daemon .SH "SYNOPSIS" .PP \fBovn\-northd\fR [\fIoptions\fR] .SH "DESCRIPTION" .PP .PP \fBovn\-northd\fR is a centralized daemon responsible for translating the high-level OVN configuration into logical configuration consumable by daemons such as \fBovn\-controller\fR\[char46] It translates the logical network configuration in terms of conventional network concepts, taken from the OVN Northbound Database (see \fBovn\-nb\fR(5)), into logical datapath flows in the OVN Southbound Database (see \fBovn\-sb\fR(5)) below it\[char46] .PP .PP \fBovn\-northd\fR is implemented in C\[char46] \fBovn\-northd\-ddlog\fR is a compatible implementation written in DDlog, a language for incremental database processing\[char46] This documentation applies to both implementations, with differences indicated where relevant\[char46] .SH "OPTIONS" .TP \fB\-\-ovnnb\-db=\fIdatabase\fB\fR The OVSDB database containing the OVN Northbound Database\[char46] If the \fBOVN_NB_DB\fR environment variable is set, its value is used as the default\[char46] Otherwise, the default is \fBunix:/ovnnb_db\[char46]sock\fR\[char46] .TP \fB\-\-ovnsb\-db=\fIdatabase\fB\fR The OVSDB database containing the OVN Southbound Database\[char46] If the \fBOVN_SB_DB\fR environment variable is set, its value is used as the default\[char46] Otherwise, the default is \fBunix:/ovnsb_db\[char46]sock\fR\[char46] .TP \fB\-\-ddlog\-record=\fIfile\fB\fR This option is for \fBovn\-northd\-ddlog\fR only\[char46] It causes the daemon to record the initial database state and later changes to \fIfile\fR in the text-based DDlog command format\[char46] The \fBovn_northd_cli\fR program can later 
replay these changes for debugging purposes\[char46] This option has a performance impact\[char46] See \fBdebugging\-ddlog\[char46]rst\fR in the OVN documentation for more details\[char46] .TP \fB\-\-dry\-run\fR Causes \fBovn\-northd\fR to start paused\[char46] In the paused state, \fBovn\-northd\fR does not apply any changes to the databases, although it continues to monitor them\[char46] For more information, see the \fBpause\fR command, under \fBRuntime Management Commands\fR below\[char46] .IP For \fBovn\-northd\-ddlog\fR, one could use this option with \fB\-\-ddlog\-record\fR to generate a replay log without restarting a process or disturbing a running system\[char46] .TP \fB\-\-n\-threads N\fR In certain situations, it may be desirable to enable parallelization on a system to decrease latency (at the potential cost of increasing CPU usage)\[char46] .IP This option will cause ovn-northd to use N threads when building logical flows, when N is within [2\-256]\[char46] If N is 1, parallelization is disabled (default behavior)\[char46] If N is less than 1, then N is set to 1, parallelization is disabled and a warning is logged\[char46] If N is more than 256, then N is set to 256, parallelization is enabled (with 256 threads) and a warning is logged\[char46] .IP ovn-northd-ddlog does not support this option\[char46] .PP .PP \fIdatabase\fR in the above options must be an OVSDB active or passive connection method, as described in \fBovsdb\fR(7)\[char46] .SS "Daemon Options" .TP \fB\-\-pidfile\fR[\fB=\fR\fIpidfile\fR] Causes a file (by default, \fB\fIprogram\fB\[char46]pid\fR) to be created indicating the PID of the running process\[char46] If the \fIpidfile\fR argument is not specified, or if it does not begin with \fB/\fR, then it is created in \fB\fR\[char46] .IP If \fB\-\-pidfile\fR is not specified, no pidfile is created\[char46] .TP \fB\-\-overwrite\-pidfile\fR By default, when \fB\-\-pidfile\fR is specified and the specified pidfile already exists and is locked by 
a running process, the daemon refuses to start\[char46] Specify \fB\-\-overwrite\-pidfile\fR to cause it to instead overwrite the pidfile\[char46] .IP When \fB\-\-pidfile\fR is not specified, this option has no effect\[char46] .TP \fB\-\-detach\fR Runs this program as a background process\[char46] The process forks, and in the child it starts a new session, closes the standard file descriptors (which has the side effect of disabling logging to the console), and changes its current directory to the root (unless \fB\-\-no\-chdir\fR is specified)\[char46] After the child completes its initialization, the parent exits\[char46] .TP \fB\-\-monitor\fR Creates an additional process to monitor this program\[char46] If it dies due to a signal that indicates a programming error (\fBSIGABRT\fR, \fBSIGALRM\fR, \fBSIGBUS\fR, \fBSIGFPE\fR, \fBSIGILL\fR, \fBSIGPIPE\fR, \fBSIGSEGV\fR, \fBSIGXCPU\fR, or \fBSIGXFSZ\fR) then the monitor process starts a new copy of it\[char46] If the daemon dies or exits for another reason, the monitor process exits\[char46] .IP This option is normally used with \fB\-\-detach\fR, but it also functions without it\[char46] .TP \fB\-\-no\-chdir\fR By default, when \fB\-\-detach\fR is specified, the daemon changes its current working directory to the root directory after it detaches\[char46] Otherwise, invoking the daemon from a carelessly chosen directory would prevent the administrator from unmounting the file system that holds that directory\[char46] .IP Specifying \fB\-\-no\-chdir\fR suppresses this behavior, preventing the daemon from changing its current working directory\[char46] This may be useful for collecting core files, since it is common behavior to write core dumps into the current working directory and the root directory is not a good directory to use\[char46] .IP This option has no effect when \fB\-\-detach\fR is not specified\[char46] .TP \fB\-\-no\-self\-confinement\fR By default this daemon will try to self-confine itself to work with 
files under well-known directories determined at build time\[char46] It is better to stick with this default behavior and not to use this flag unless some other access control mechanism is used to confine the daemon\[char46] Note that in contrast to other access control implementations that are typically enforced from kernel-space (e\[char46]g\[char46] DAC or MAC), self-confinement is imposed from the user-space daemon itself and hence should not be considered as a full confinement strategy, but instead should be viewed as an additional layer of security\[char46] .TP \fB\-\-user=\fR\fIuser\fR\fB:\fR\fIgroup\fR Causes this program to run as a different user specified in \fIuser\fR\fB:\fR\fIgroup\fR, thus dropping most of the root privileges\[char46] Short forms \fIuser\fR and \fB:\fR\fIgroup\fR are also allowed, with current user or group assumed, respectively\[char46] Only daemons started by the root user accept this argument\[char46] .IP On Linux, daemons will be granted \fBCAP_IPC_LOCK\fR and \fBCAP_NET_BIND_SERVICE\fR before dropping root privileges\[char46] Daemons that interact with a datapath, such as \fBovs\-vswitchd\fR, will be granted three additional capabilities, namely \fBCAP_NET_ADMIN\fR, \fBCAP_NET_BROADCAST\fR and \fBCAP_NET_RAW\fR\[char46] The capability change will apply even if the new user is root\[char46] .IP On Windows, this option is not currently supported\[char46] For security reasons, specifying this option will cause the daemon process not to start\[char46] .SS "Logging Options" .TP \fB\-v\fR[\fIspec\fR] .TQ .5in \fB\-\-verbose=\fR[\fIspec\fR] Sets logging levels\[char46] Without any \fIspec\fR, sets the log level for every module and destination to \fBdbg\fR\[char46] Otherwise, \fIspec\fR is a list of words separated by spaces or commas or colons, up to one from each category below: .RS .IP \(bu A valid module name, as displayed by the \fBvlog/list\fR command on \fBovs\-appctl\fR(8), limits the log level change to the specified module\[char46] .IP 
\(bu \fBsyslog\fR, \fBconsole\fR, or \fBfile\fR, to limit the log level change only to the system log, to the console, or to a file, respectively\[char46] (If \fB\-\-detach\fR is specified, the daemon closes its standard file descriptors, so logging to the console will have no effect\[char46]) .IP On the Windows platform, \fBsyslog\fR is accepted as a word and is only useful along with the \fB\-\-syslog\-target\fR option (the word has no effect otherwise)\[char46] .IP \(bu \fBoff\fR, \fBemer\fR, \fBerr\fR, \fBwarn\fR, \fBinfo\fR, or \fBdbg\fR, to control the log level\[char46] Messages of the given severity or higher will be logged, and messages of lower severity will be filtered out\[char46] \fBoff\fR filters out all messages\[char46] See \fBovs\-appctl\fR(8) for a definition of each log level\[char46] .RE .IP Case is not significant within \fIspec\fR\[char46] .IP Regardless of the log levels set for \fBfile\fR, logging to a file will not take place unless \fB\-\-log\-file\fR is also specified (see below)\[char46] .IP For compatibility with older versions of OVS, \fBany\fR is accepted as a word but has no effect\[char46] .TP \fB\-v\fR .TQ .5in \fB\-\-verbose\fR Sets the maximum logging verbosity level, equivalent to \fB\-\-verbose=dbg\fR\[char46] .TP \fB\-vPATTERN:\fR\fIdestination\fR\fB:\fR\fIpattern\fR .TQ .5in \fB\-\-verbose=PATTERN:\fR\fIdestination\fR\fB:\fR\fIpattern\fR Sets the log pattern for \fIdestination\fR to \fIpattern\fR\[char46] Refer to \fBovs\-appctl\fR(8) for a description of the valid syntax for \fIpattern\fR\[char46] .TP \fB\-vFACILITY:\fR\fIfacility\fR .TQ .5in \fB\-\-verbose=FACILITY:\fR\fIfacility\fR Sets the RFC5424 facility of the log message\[char46] \fIfacility\fR can be one of \fBkern\fR, \fBuser\fR, \fBmail\fR, \fBdaemon\fR, \fBauth\fR, \fBsyslog\fR, \fBlpr\fR, \fBnews\fR, \fBuucp\fR, \fBclock\fR, \fBftp\fR, \fBntp\fR, \fBaudit\fR, \fBalert\fR, \fBclock2\fR, \fBlocal0\fR, \fBlocal1\fR, \fBlocal2\fR, \fBlocal3\fR, \fBlocal4\fR, 
\fBlocal5\fR, \fBlocal6\fR or \fBlocal7\fR\[char46] If this option is not specified, \fBdaemon\fR is used as the default for the local system syslog and \fBlocal0\fR is used while sending a message to the target provided via the \fB\-\-syslog\-target\fR option\[char46] .TP \fB\-\-log\-file\fR[\fB=\fR\fIfile\fR] Enables logging to a file\[char46] If \fIfile\fR is specified, then it is used as the exact name for the log file\[char46] The default log file name used if \fIfile\fR is omitted is \fB/var/log/ovn/\fIprogram\fB\[char46]log\fR\[char46] .TP \fB\-\-syslog\-target=\fR\fIhost\fR\fB:\fR\fIport\fR Send syslog messages to UDP \fIport\fR on \fIhost\fR, in addition to the system syslog\[char46] The \fIhost\fR must be a numerical IP address, not a hostname\[char46] .TP \fB\-\-syslog\-method=\fR\fImethod\fR Specifies \fImethod\fR as the way syslog messages are sent to the syslog daemon\[char46] The following forms are supported: .RS .IP \(bu \fBlibc\fR, to use the libc \fBsyslog()\fR function\[char46] The downside of this option is that libc adds a fixed prefix to every message before it is actually sent to the syslog daemon over the \fB/dev/log\fR UNIX domain socket\[char46] .IP \(bu \fBunix:\fIfile\fB\fR, to use a UNIX domain socket directly\[char46] It is possible to specify an arbitrary message format with this option\[char46] However, \fBrsyslogd 8\[char46]9\fR and older versions use a hard-coded parser function that limits UNIX domain socket use\[char46] If you want to use an arbitrary message format with older \fBrsyslogd\fR versions, use a UDP socket to the localhost IP address instead\[char46] .IP \(bu \fBudp:\fIip\fB:\fIport\fB\fR, to use a UDP socket\[char46] With this method it is possible to use an arbitrary message format even with older \fBrsyslogd\fR\[char46] When sending syslog messages over a UDP socket, extra precautions need to be taken: for example, the syslog daemon needs to be configured to listen on the specified UDP port, accidental iptables rules 
could be interfering with local syslog traffic and there are some security considerations that apply to UDP sockets, but do not apply to UNIX domain sockets\[char46] .IP \(bu \fBnull\fR, to discard all messages logged to syslog\[char46] .RE .IP The default is taken from the \fBOVS_SYSLOG_METHOD\fR environment variable; if it is unset, the default is \fBlibc\fR\[char46] .SS "PKI Options" .PP .PP PKI configuration is required in order to use SSL for the connections to the Northbound and Southbound databases\[char46] .RS .TP \fB\-p\fR \fIprivkey\[char46]pem\fR .TQ .5in \fB\-\-private\-key=\fR\fIprivkey\[char46]pem\fR Specifies a PEM file containing the private key used as identity for outgoing SSL connections\[char46] .TP \fB\-c\fR \fIcert\[char46]pem\fR .TQ .5in \fB\-\-certificate=\fR\fIcert\[char46]pem\fR Specifies a PEM file containing a certificate that certifies the private key specified on \fB\-p\fR or \fB\-\-private\-key\fR to be trustworthy\[char46] The certificate must be signed by the certificate authority (CA) that the peer in SSL connections will use to verify it\[char46] .TP \fB\-C\fR \fIcacert\[char46]pem\fR .TQ .5in \fB\-\-ca\-cert=\fR\fIcacert\[char46]pem\fR Specifies a PEM file containing the CA certificate for verifying certificates presented to this program by SSL peers\[char46] (This may be the same certificate that SSL peers use to verify the certificate specified on \fB\-c\fR or \fB\-\-certificate\fR, or it may be a different one, depending on the PKI design in use\[char46]) .TP \fB\-C none\fR .TQ .5in \fB\-\-ca\-cert=none\fR Disables verification of certificates presented by SSL peers\[char46] This introduces a security risk, because it means that certificates cannot be verified to be those of known trusted hosts\[char46] .RE .SS "Other Options" .TP \fB\-\-unixctl=\fIsocket\fB\fR Sets the name of the control socket on which \fB\fIprogram\fB\fR listens for runtime management commands (see \fIRUNTIME MANAGEMENT COMMANDS,\fR below)\[char46] If 
\fIsocket\fR does not begin with \fB/\fR, it is interpreted as relative to \fB\fR\[char46] If \fB\-\-unixctl\fR is not used at all, the default socket is \fB/\fIprogram\fB\[char46]\fR\fIpid\fR\fB\[char46]ctl\fR, where \fIpid\fR is \fB\fIprogram\fB\fR\(cqs process ID\[char46] .IP On Windows a local named pipe is used to listen for runtime management commands\[char46] A file is created at the absolute path pointed to by \fIsocket\fR or, if \fB\-\-unixctl\fR is not used at all, a file is created as \fB\fIprogram\fB\fR in the configured \fIOVS_RUNDIR\fR directory\[char46] The file exists just to mimic the behavior of a Unix domain socket\[char46] .IP Specifying \fBnone\fR for \fIsocket\fR disables the control socket feature\[char46] .ST "" .TP \fB\-h\fR .TQ .5in \fB\-\-help\fR Prints a brief help message to the console\[char46] .TP \fB\-V\fR .TQ .5in \fB\-\-version\fR Prints version information to the console\[char46] .SH "RUNTIME MANAGEMENT COMMANDS" .PP .PP \fBovs\-appctl\fR can send commands to a running \fBovn\-northd\fR process\[char46] The currently supported commands are described below\[char46] .RS .TP \fBexit\fR Causes \fBovn\-northd\fR to gracefully terminate\[char46] .TP \fBpause\fR Pauses \fBovn\-northd\fR\[char46] When it is paused, \fBovn\-northd\fR receives changes from the Northbound and Southbound databases as usual, but it does not send any updates\[char46] A paused \fBovn\-northd\fR also drops database locks, which allows any other non-paused instance of \fBovn\-northd\fR to take over\[char46] .TP \fBresume\fR Resumes the ovn-northd operation to process Northbound and Southbound database contents and generate logical flows\[char46] This will also instruct ovn-northd to try to acquire the lock on the SB DB\[char46] .TP \fBis\-paused\fR Returns \(dqtrue\(dq if ovn-northd is currently paused, \(dqfalse\(dq otherwise\[char46] .TP \fBstatus\fR Prints this server\(cqs status\[char46] Status will be \(dqactive\(dq if ovn-northd has acquired the OVSDB lock on the SB 
DB, \(dqstandby\(dq if it has not or \(dqpaused\(dq if this instance is paused\[char46] .TP \fBsb\-cluster\-state\-reset\fR Reset southbound database cluster status when databases are destroyed and rebuilt\[char46] .IP If all databases in a clustered southbound database are removed from disk, then the stored index of all databases will be reset to zero\[char46] This will cause ovn-northd to be unable to read or write to the southbound database, because it will always detect the data as stale\[char46] In such a case, run this command so that ovn-northd will reset its local index so that it can interact with the southbound database again\[char46] .TP \fBnb\-cluster\-state\-reset\fR Reset northbound database cluster status when databases are destroyed and rebuilt\[char46] .IP This performs the same task as \fBsb\-cluster\-state\-reset\fR except for the northbound database client\[char46] .TP \fBset\-n\-threads N\fR Set the number of threads used for building logical flows\[char46] When N is within [2\-256], parallelization is enabled\[char46] When N is 1 parallelization is disabled\[char46] When N is less than 1 or more than 256, an error is returned\[char46] If ovn-northd fails to start parallelization (e\[char46]g\[char46], fails to set up semaphores), parallelization is disabled and an error is returned\[char46] .TP \fBget\-n\-threads\fR Return the number of threads used for building logical flows\[char46] .TP \fBinc\-engine/show\-stats\fR Display \fBovn\-northd\fR engine counters\[char46] For each engine node the following counters have been added: .RS .IP \(bu \fBrecompute\fR .IP \(bu \fBcompute\fR .IP \(bu \fBabort\fR .RE .TP \fBinc\-engine/show\-stats \fIengine_node_name\fB \fIcounter_name\fB\fR Display the \fBovn\-northd\fR engine counter(s) for the specified \fIengine_node_name\fR\[char46] \fIcounter_name\fR is optional and can be one of \fBrecompute\fR, \fBcompute\fR or \fBabort\fR\[char46] .TP \fBinc\-engine/clear\-stats\fR Reset \fBovn\-northd\fR engine 
counters\[char46] .RE .PP .PP Only \fBovn\-northd\-ddlog\fR supports the following commands: .RS .TP \fBenable\-cpu\-profiling\fR .TQ .5in \fBdisable\-cpu\-profiling\fR Enables or disables profiling of CPU time used by the DDlog engine\[char46] When CPU profiling is enabled, the \fBprofile\fR command (see below) will include DDlog CPU usage statistics in its output\[char46] Enabling CPU profiling will slow \fBovn\-northd\-ddlog\fR\[char46] Disabling CPU profiling does not clear any previously recorded statistics\[char46] .TP \fBprofile\fR Outputs a profile of the current and peak sizes of arrangements inside DDlog\[char46] This profiling data can be useful for optimizing DDlog code\[char46] If CPU profiling was previously enabled (even if it was later disabled), the output also includes a CPU time profile\[char46] See \fBProfiling\fR inside the tutorial in the DDlog repository for an introduction to profiling DDlog\[char46] .RE .SH "ACTIVE-STANDBY FOR HIGH AVAILABILITY" .PP .PP You may run \fBovn\-northd\fR more than once in an OVN deployment\[char46] When connected to a standalone or clustered DB setup, OVN will automatically ensure that only one of them is active at a time\[char46] If multiple instances of \fBovn\-northd\fR are running and the active \fBovn\-northd\fR fails, one of the hot standby instances of \fBovn\-northd\fR will automatically take over\[char46] .SS "Active\-Standby with multiple OVN DB servers" .PP .PP You may run multiple OVN DB servers in an OVN deployment with: .RS .IP \(bu OVN DB servers deployed in active/passive mode with one active and multiple passive ovsdb-servers\[char46] .IP \(bu \fBovn\-northd\fR also deployed on all these nodes, using unix ctl sockets to connect to the local OVN DB servers\[char46] .RE .PP .PP In such deployments, the ovn-northds on the passive nodes will process the DB changes and compute logical flows to be thrown out later, because write transactions are not allowed by the passive ovsdb-servers\[char46] It 
results in unnecessary CPU usage\[char46] .PP .PP With the help of the runtime management command \fBpause\fR, you can pause \fBovn\-northd\fR on these nodes\[char46] When a passive node becomes master, you can use the runtime management command \fBresume\fR to resume the \fBovn\-northd\fR to process the DB changes\[char46] .SH "LOGICAL FLOW TABLE STRUCTURE" .PP .PP One of the main purposes of \fBovn\-northd\fR is to populate the \fBLogical_Flow\fR table in the \fBOVN_Southbound\fR database\[char46] This section describes how \fBovn\-northd\fR does this for switch and router logical datapaths\[char46] .SS "Logical Switch Datapaths" .ST "Ingress Table 0: Admission Control and Ingress Port Security check" .PP .PP Ingress table 0 contains these logical flows: .RS .IP \(bu Priority 100 flows to drop packets with VLAN tags or multicast Ethernet source addresses\[char46] .IP \(bu For each disabled logical port, a priority 100 flow is added which matches on all packets and applies the action \fBREGBIT_PORT_SEC_DROP = 1; next;\fR so that the packets are dropped in the next stage\[char46] .IP \(bu For each (enabled) vtep logical port, a priority 70 flow is added which matches on all packets and applies the action \fBnext(pipeline=ingress, table=S_SWITCH_IN_L2_LKUP) = 1;\fR to skip most stages of the ingress pipeline and go directly to the ingress L2 lookup table to determine the output port\[char46] Packets from a VTEP (RAMP) switch should not be subjected to any ACL checks\[char46] The egress pipeline will do the ACL checks\[char46] .IP \(bu For each enabled logical port configured with qdisc queue id in the \fBoptions:qdisc_queue_id\fR column of \fBLogical_Switch_Port\fR, a priority 70 flow is added which matches on all packets and applies the action \fBset_queue(id); REGBIT_PORT_SEC_DROP = check_in_port_sec(); next;\fR\[char46] .IP \(bu A priority 1 flow is added which matches on all packets for all the logical ports and applies the action \fBREGBIT_PORT_SEC_DROP = 
check_in_port_sec(); next;\fR to evaluate the port security\[char46] The action \fBcheck_in_port_sec\fR applies the port security rules defined in the \fBport_security\fR column of the \fBLogical_Switch_Port\fR table\[char46] .RE .ST "Ingress Table 1: Ingress Port Security - Apply" .PP .PP This table drops the packets if the port security check failed in the previous stage, i\[char46]e\[char46], if the register bit \fBREGBIT_PORT_SEC_DROP\fR is set to 1\[char46] .PP .PP Ingress table 1 contains these logical flows: .RS .IP \(bu A priority\-50 fallback flow that drops the packet if the register bit \fBREGBIT_PORT_SEC_DROP\fR is set to 1\[char46] .IP \(bu One priority\-0 fallback flow that matches all packets and advances to the next table\[char46] .RE .ST "Ingress Table 2: Lookup MAC address learning table" .PP .PP This table looks up the MAC learning table of the logical switch datapath to check if the \fBport\-mac\fR pair is present or not\[char46] MAC is learnt only for logical switch VIF ports whose port security is disabled and that have the \(cqunknown\(cq address set\[char46] .RS .IP \(bu For each such logical port \fIp\fR whose port security is disabled and that has the \(cqunknown\(cq address set, the following flow is added\[char46] .RS .IP \(bu Priority 100 flow with the match \fBinport == \fIp\fB\fR and action \fBreg0[11] = lookup_fdb(inport, eth\[char46]src); next;\fR .RE .IP \(bu One priority\-0 fallback flow that matches all packets and advances to the next table\[char46] .RE .ST "Ingress Table 3: Learn MAC of \(cqunknown\(cq ports\[char46]" .PP .PP This table learns the MAC addresses seen on the logical ports whose port security is disabled and that have the \(cqunknown\(cq address set, if the \fBlookup_fdb\fR action returned false in the previous table\[char46] .RS .IP \(bu For each such logical port \fIp\fR whose port security is disabled and that has the \(cqunknown\(cq address set, the following flow is added\[char46] .RS .IP \(bu Priority 100 flow with the match \fBinport == \fIp\fB && reg0[11] == 0\fR and action 
\fBput_fdb(inport, eth\[char46]src); next;\fR which stores the \fBport\-mac\fR in the MAC learning table of the logical switch datapath and advances the packet to the next table\[char46] .RE .IP \(bu One priority\-0 fallback flow that matches all packets and advances to the next table\[char46] .RE .ST "Ingress Table 4: \fBfrom\-lport\fI Pre-ACLs" .PP .PP This table prepares flows for possible stateful ACL processing in ingress table \fBACLs\fR\[char46] It contains a priority\-0 flow that simply moves traffic to the next table\[char46] If stateful ACLs are used in the logical datapath, a priority\-100 flow is added that sets a hint (with \fBreg0[0] = 1; next;\fR) for table \fBPre\-stateful\fR to send IP packets to the connection tracker before eventually advancing to ingress table \fBACLs\fR\[char46] If special ports such as router ports or localnet ports can\(cqt use ct(), a priority\-110 flow is added to skip over stateful ACLs\[char46] Multicast, IPv6 Neighbor Discovery and MLD traffic also skips stateful ACLs\[char46] For \(dqallow-stateless\(dq ACLs, a flow is added to bypass setting the hint for connection tracker processing\[char46] .PP .PP This table also has a priority\-110 flow with the match \fBeth\[char46]dst == \fIE\fB\fR for all logical switch datapaths to move traffic to the next table, where \fIE\fR is the service monitor MAC defined in the \fBoptions:svc_monitor_mac\fR column of the \fBNB_Global\fR table\[char46] .ST "Ingress Table 5: Pre-LB" .PP .PP This table prepares flows for possible stateful load balancing processing in ingress tables \fBLB\fR and \fBStateful\fR\[char46] It contains a priority\-0 flow that simply moves traffic to the next table\[char46] Moreover it contains two priority\-110 flows to move multicast, IPv6 Neighbor Discovery and MLD traffic to the next table\[char46] If load balancing rules with virtual IP addresses (and ports) are configured in \fBOVN_Northbound\fR database for a logical switch datapath, a priority\-100 flow 
is added with the match \fBip\fR to match on IP packets and the action \fBreg0[2] = 1; next;\fR to act as a hint for table \fBPre\-stateful\fR to send IP packets to the connection tracker for packet de-fragmentation (and to possibly do DNAT for already established load balanced traffic) before eventually advancing to ingress table \fBStateful\fR\[char46] If controller_event has been enabled and load balancing rules with empty backends have been added in \fBOVN_Northbound\fR, a priority\-130 flow is added to trigger ovn-controller events whenever the chassis receives a packet for that particular VIP\[char46] If the \fBevent\-elb\fR meter has been previously created, it will be associated with the empty_lb logical flow\[char46] .PP .PP Prior to \fBOVN 20\[char46]09\fR we set \fBreg0[0] = 1\fR only if the IP destination matched the load balancer VIP\[char46] However, this caused issues in cases where a logical switch doesn\(cqt have any ACLs with the \fBallow\-related\fR action\[char46] To understand the issue, let\(cqs take a TCP load balancer - \fB10\[char46]0\[char46]0\[char46]10:80=10\[char46]0\[char46]0\[char46]3:80\fR\[char46] If a logical port - p1 with IP - 10\[char46]0\[char46]0\[char46]5 opens a TCP connection with the VIP - 10\[char46]0\[char46]0\[char46]10, then the packet in the ingress pipeline of \(cqp1\(cq is sent to the p1\(cqs conntrack zone id and the packet is load balanced to the backend - 10\[char46]0\[char46]0\[char46]3\[char46] For the reply packet from the backend lport, it is not sent to the conntrack of backend lport\(cqs zone id\[char46] This is fine as long as the packet is valid\[char46] If the backend lport sends an invalid TCP packet (for example, with an incorrect sequence number), the packet gets delivered to the lport \(cqp1\(cq without unDNATing the packet to the VIP - 10\[char46]0\[char46]0\[char46]10\[char46] This causes the connection to be reset by the lport p1\(cqs VIF\[char46] .PP .PP We can\(cqt fix this issue by adding a logical flow to drop 
ct\[char46]inv packets in the egress pipeline since it will drop all other connections not destined to the load balancers\[char46] To fix this issue, we send all the packets to the conntrack in the ingress pipeline if a load balancer is configured\[char46] We can now add a lflow to drop ct\[char46]inv packets\[char46] .PP .PP This table also has priority\-120 flows that punt all IGMP/MLD packets to \fBovn\-controller\fR if the switch is an interconnect switch with multicast snooping enabled\[char46] .PP .PP This table also has a priority\-110 flow with the match \fBeth\[char46]dst == \fIE\fB\fR for all logical switch datapaths to move traffic to the next table, where \fIE\fR is the service monitor MAC defined in the \fBoptions:svc_monitor_mac\fR column of the \fBNB_Global\fR table\[char46] .PP .PP This table also has a priority\-110 flow with the match \fBinport == \fII\fB\fR for all logical switch datapaths to move traffic to the next table, where \fII\fR is the peer of a logical router port\[char46] This flow is added to skip the connection tracking of packets which enter from logical router datapath to logical switch datapath\[char46] .ST "Ingress Table 6: Pre-stateful" .PP .PP This table prepares flows for all possible stateful processing in the next tables\[char46] It contains a priority\-0 flow that simply moves traffic to the next table\[char46] .RS .IP \(bu Priority\-120 flows that send the packets to the connection tracker using \fBct_lb_mark;\fR as the action so that the already established traffic destined to the load balancer VIP gets DNATted\[char46] These flows match each VIP\(cqs IP and port\[char46] For IPv4 traffic the flows also load the original destination IP and transport port in registers \fBreg1\fR and \fBreg2\fR\[char46] For IPv6 traffic the flows also load the original destination IP and transport port in registers \fBxxreg1\fR and \fBreg2\fR\[char46] .IP \(bu A priority\-110 flow sends the packets that don\(cqt match the above flows to 
the connection tracker based on a hint provided by the previous tables (with a match for \fBreg0[2] == 1\fR) by using the \fBct_lb_mark;\fR action\[char46] .IP \(bu A priority\-100 flow sends the packets to the connection tracker based on a hint provided by the previous tables (with a match for \fBreg0[0] == 1\fR) by using the \fBct_next;\fR action\[char46] .RE .ST "Ingress Table 7: \fBfrom\-lport\fI ACL hints" .PP .PP This table consists of logical flows that set hints (\fBreg0\fR bits) to be used in the next stage, in the ACL processing table, if stateful ACLs or load balancers are configured\[char46] Multiple hints can be set for the same packet\[char46] The possible hints are: .RS .IP \(bu \fBreg0[7]\fR: the packet might match an \fBallow\-related\fR ACL and might have to commit the connection to conntrack\[char46] .IP \(bu \fBreg0[8]\fR: the packet might match an \fBallow\-related\fR ACL but there will be no need to commit the connection to conntrack because it already exists\[char46] .IP \(bu \fBreg0[9]\fR: the packet might match a \fBdrop/reject\fR ACL\[char46] .IP \(bu \fBreg0[10]\fR: the packet might match a \fBdrop/reject\fR ACL but the connection was previously allowed so it might have to be committed again with \fBct_label=1/1\fR\[char46] .RE .PP .PP The table contains the following flows: .RS .IP \(bu A priority\-65535 flow to advance to the next table if the logical switch has \fBno\fR ACLs configured, otherwise a priority\-0 flow to advance to the next table\[char46] .RE .RS .IP \(bu A priority\-7 flow that matches on packets that initiate a new session\[char46] This flow sets \fBreg0[7]\fR and \fBreg0[9]\fR and then advances to the next table\[char46] .IP \(bu A priority\-6 flow that matches on packets that are in the request direction of an already existing session that has been marked as blocked\[char46] This flow sets \fBreg0[7]\fR and \fBreg0[9]\fR and then advances to the next table\[char46] .IP \(bu A priority\-5 flow that matches untracked 
packets\[char46] This flow sets \fBreg0[8]\fR and \fBreg0[9]\fR and then advances to the next table\[char46] .IP \(bu A priority\-4 flow that matches on packets that are in the request direction of an already existing session that has not been marked as blocked\[char46] This flow sets \fBreg0[8]\fR and \fBreg0[10]\fR and then advances to the next table\[char46] .IP \(bu A priority\-3 flow that matches on packets that are not part of established sessions\[char46] This flow sets \fBreg0[9]\fR and then advances to the next table\[char46] .IP \(bu A priority\-2 flow that matches on packets that are part of an established session that has been marked as blocked\[char46] This flow sets \fBreg0[9]\fR and then advances to the next table\[char46] .IP \(bu A priority\-1 flow that matches on packets that are part of an established session that has not been marked as blocked\[char46] This flow sets \fBreg0[10]\fR and then advances to the next table\[char46] .RE .ST "Ingress Table 8: \fBfrom\-lport\fI ACLs before LB" .PP .PP Logical flows in this table closely reproduce those in the \fBACL\fR table in the \fBOVN_Northbound\fR database for the \fBfrom\-lport\fR direction without the option \fBapply\-after\-lb\fR set or set to \fBfalse\fR\[char46] The \fBpriority\fR values from the \fBACL\fR table have a limited range and have 1000 added to them to leave room for OVN default flows at both higher and lower priorities\[char46] .RS .IP \(bu \fBallow\fR ACLs translate into logical flows with the \fBnext;\fR action\[char46] If there are any stateful ACLs on this datapath, then \fBallow\fR ACLs translate to \fBct_commit; next;\fR (which acts as a hint for the next tables to commit the connection to conntrack)\[char46] In case the \fBACL\fR has a label then \fBreg3\fR is loaded with the label value and \fBreg0[13]\fR bit is set to 1 (which acts as a hint for the next tables to commit the label to conntrack)\[char46] .IP \(bu \fBallow\-related\fR ACLs translate into logical flows with 
the \fBct_commit(ct_label=0/1); next;\fR actions for new connections and \fBreg0[1] = 1; next;\fR for existing connections\[char46] In case the \fBACL\fR has a label then \fBreg3\fR is loaded with the label value and \fBreg0[13]\fR bit is set to 1 (which acts as a hint for the next tables to commit the label to conntrack)\[char46] .IP \(bu \fBallow\-stateless\fR ACLs translate into logical flows with the \fBnext;\fR action\[char46] .IP \(bu \fBreject\fR ACLs translate into logical flows with the \fBtcp_reset { output <\-> inport; next(pipeline=egress,table=5);}\fR action for TCP connections, \fBicmp4/icmp6\fR action for UDP connections, and \fBsctp_abort { output <\-> inport; next(pipeline=egress,table=5);}\fR action for SCTP associations\[char46] .IP \(bu Other ACLs translate to \fBdrop;\fR for new or untracked connections and \fBct_commit(ct_label=1/1);\fR for known connections\[char46] Setting \fBct_label\fR marks a connection as one that was previously allowed, but should no longer be allowed due to a policy change\[char46] .RE .PP .PP This table contains a priority\-65535 flow to advance to the next table if the logical switch has \fBno\fR ACLs configured, otherwise a priority\-0 flow to advance to the next table so that ACLs allow packets by default if the \fBoptions:default_acl_drop\fR column of \fBNB_Global\fR is \fBfalse\fR or not set\[char46] Otherwise the flow action is set to \fBdrop;\fR to implement a default drop behavior\[char46] .PP .PP If the logical datapath has a stateful ACL or a load balancer with VIP configured, the following flows will also be added: .RS .IP \(bu If the \fBoptions:default_acl_drop\fR column of \fBNB_Global\fR is \fBfalse\fR or not set, a priority\-1 flow that sets the hint to commit IP traffic that is not part of established sessions to the connection tracker (with action \fBreg0[1] = 1; next;\fR)\[char46] This is needed for the default allow policy because, while the initiator\(cqs direction may not have any stateful rules, the
server\(cqs may and then its return traffic would not be known and marked as invalid\[char46] .IP \(bu If \fBoptions:default_acl_drop\fR column of \fBNB_Global\fR is \fBtrue\fR, a priority\-1 flow that drops IP traffic that is not part of established sessions\[char46] .IP \(bu A priority\-1 flow that sets the hint to commit IP traffic to the connection tracker (with action \fBreg0[1] = 1; next;\fR)\[char46] This is needed for the default allow policy because, while the initiator\(cqs direction may not have any stateful rules, the server\(cqs may and then its return traffic would not be known and marked as invalid\[char46] .IP \(bu A priority\-65532 flow that allows any traffic in the reply direction for a connection that has been committed to the connection tracker (i\[char46]e\[char46], established flows), as long as the committed flow does not have \fBct_mark\[char46]blocked\fR set\[char46] We only handle traffic in the reply direction here because we want all packets going in the request direction to still go through the flows that implement the currently defined policy based on ACLs\[char46] If a connection is no longer allowed by policy, \fBct_mark\[char46]blocked\fR will get set and packets in the reply direction will no longer be allowed, either\[char46] This flow also clears the register bits \fBreg0[9]\fR and \fBreg0[10]\fR\[char46] If ACL logging and logging of related packets is enabled, then a companion priority\-65533 flow will be installed that accomplishes the same thing but also logs the traffic\[char46] .IP \(bu A priority\-65532 flow that allows any traffic that is considered related to a committed flow in the connection tracker (e\[char46]g\[char46], an ICMP Port Unreachable from a non-listening UDP port), as long as the committed flow does not have \fBct_mark\[char46]blocked\fR set\[char46] This flow also applies NAT to the related traffic so that ICMP headers and the inner packet have correct addresses\[char46] If ACL logging and logging of 
related packets is enabled, then a companion priority\-65533 flow will be installed that accomplishes the same thing but also logs the traffic\[char46] .IP \(bu A priority\-65532 flow that drops all traffic marked by the connection tracker as invalid\[char46] .IP \(bu A priority\-65532 flow that drops all traffic in the reply direction with \fBct_mark\[char46]blocked\fR set, meaning that the connection should no longer be allowed due to a policy change\[char46] Packets in the request direction are skipped here to let a newly created ACL re-allow this connection\[char46] .IP \(bu A priority\-65532 flow that allows IPv6 Neighbor solicitation, Neighbor discovery, Router solicitation, Router advertisement and MLD packets\[char46] .RE .PP .PP If the logical datapath has any ACL or a load balancer with VIP configured, the following flow will also be added: .RS .IP \(bu A priority\-34000 logical flow is added for each logical switch datapath with the match \fBeth\[char46]dst = \fIE\fB\fR to allow the service monitor reply packet destined to \fBovn\-controller\fR with the action \fBnext\fR, where \fIE\fR is the service monitor MAC defined in the \fBoptions:svc_monitor_mac\fR column of the \fBNB_Global\fR table\[char46] .RE .ST "Ingress Table 9: \fBfrom\-lport\fI QoS Marking" .PP .PP Logical flows in this table closely reproduce those in the \fBQoS\fR table with the \fBaction\fR column set in the \fBOVN_Northbound\fR database for the \fBfrom\-lport\fR direction\[char46] .RS .IP \(bu For every qos_rules entry in a logical switch with DSCP marking enabled, a flow will be added at the priority mentioned in the QoS table\[char46] .IP \(bu One priority\-0 fallback flow that matches all packets and advances to the next table\[char46] .RE .ST "Ingress Table 10: \fBfrom\-lport\fI QoS Meter" .PP .PP Logical flows in this table closely reproduce those in the \fBQoS\fR table with the \fBbandwidth\fR column set in the \fBOVN_Northbound\fR database for the \fBfrom\-lport\fR direction\[char46]
.RS .IP \(bu For every qos_rules entry in a logical switch with metering enabled, a flow will be added at the priority mentioned in the QoS table\[char46] .IP \(bu One priority\-0 fallback flow that matches all packets and advances to the next table\[char46] .RE .ST "Ingress Table 11: Load balancing affinity check" .PP .PP Load balancing affinity check table contains the following logical flows: .RS .IP \(bu For all the configured load balancing rules for a switch in \fBOVN_Northbound\fR database where a positive affinity timeout is specified in \fBoptions\fR column, that includes an L4 port \fIPORT\fR of protocol \fIP\fR and IP address \fIVIP\fR, a priority\-100 flow is added\[char46] For IPv4 \fIVIPs\fR, the flow matches \fBct\[char46]new && ip && ip4\[char46]dst == \fIVIP\fB && \fIP\fB\[char46]dst == \fIPORT\fB\fR\[char46] For IPv6 \fIVIPs\fR, the flow matches \fBct\[char46]new && ip && ip6\[char46]dst == \fIVIP\fB && \fIP\fB && \fIP\fB\[char46]dst == \fIPORT\fB\fR\[char46] The flow\(cqs action is \fBreg9[6] = chk_lb_aff(); next;\fR\[char46] .IP \(bu A priority\-0 flow is added which matches on all packets and applies the action \fBnext;\fR\[char46] .RE .ST "Ingress Table 12: LB" .RS .IP \(bu For all the configured load balancing rules for a switch in \fBOVN_Northbound\fR database where a positive affinity timeout is specified in \fBoptions\fR column, that includes an L4 port \fIPORT\fR of protocol \fIP\fR and IP address \fIVIP\fR, a priority\-150 flow is added\[char46] For IPv4 \fIVIPs\fR, the flow matches \fBreg9[6] == 1 && ct\[char46]new && ip && ip4\[char46]dst == \fIVIP\fB && \fIP\fB\[char46]dst == \fIPORT\fB\fR\[char46] For IPv6 \fIVIPs\fR, the flow matches \fBreg9[6] == 1 && ct\[char46]new && ip && ip6\[char46]dst == \fIVIP\fB && \fIP\fB && \fIP\fB\[char46]dst == \fIPORT\fB\fR\[char46] The flow\(cqs action is \fBct_lb_mark(\fIargs\fB)\fR, where \fIargs\fR contains comma-separated IP addresses (and optional port numbers) to load balance to\[char46] The
address family of the IP addresses of \fIargs\fR is the same as the address family of \fIVIP\fR\[char46] .IP \(bu For all the configured load balancing rules for a switch in \fBOVN_Northbound\fR database that includes an L4 port \fIPORT\fR of protocol \fIP\fR and IP address \fIVIP\fR, a priority\-120 flow is added\[char46] For IPv4 \fIVIPs\fR, the flow matches \fBct\[char46]new && ip && ip4\[char46]dst == \fIVIP\fB && \fIP\fB\[char46]dst == \fIPORT\fB\fR\[char46] For IPv6 \fIVIPs\fR, the flow matches \fBct\[char46]new && ip && ip6\[char46]dst == \fIVIP\fB && \fIP\fB && \fIP\fB\[char46]dst == \fIPORT\fB\fR\[char46] The flow\(cqs action is \fBct_lb_mark(\fIargs\fB)\fR, where \fIargs\fR contains comma-separated IP addresses (and optional port numbers) to load balance to\[char46] The address family of the IP addresses of \fIargs\fR is the same as the address family of \fIVIP\fR\[char46] If health check is enabled, then \fIargs\fR will only contain those endpoints whose service monitor status entry in \fBOVN_Southbound\fR db is either \fBonline\fR or empty\[char46] For IPv4 traffic the flow also loads the original destination IP and transport port in registers \fBreg1\fR and \fBreg2\fR\[char46] For IPv6 traffic the flow also loads the original destination IP and transport port in registers \fBxxreg1\fR and \fBreg2\fR\[char46] The above flow is created even if the load balancer is attached to a logical router connected to the current logical switch and the \fBinstall_ls_lb_from_router\fR variable in \fBoptions\fR is set to true\[char46] .IP \(bu For all the configured load balancing rules for a switch in \fBOVN_Northbound\fR database that includes just an IP address \fIVIP\fR to match on, OVN adds a priority\-110 flow\[char46] For IPv4 \fIVIPs\fR, the flow matches \fBct\[char46]new && ip && ip4\[char46]dst == \fIVIP\fB\fR\[char46] For IPv6 \fIVIPs\fR, the flow matches \fBct\[char46]new && ip && ip6\[char46]dst == \fIVIP\fB\fR\[char46] The action on this flow is \fB
ct_lb_mark(\fIargs\fB)\fR, where \fIargs\fR contains comma-separated IP addresses of the same address family as \fIVIP\fR\[char46] For IPv4 traffic the flow also loads the original destination IP and transport port in registers \fBreg1\fR and \fBreg2\fR\[char46] For IPv6 traffic the flow also loads the original destination IP and transport port in registers \fBxxreg1\fR and \fBreg2\fR\[char46] The above flow is created even if the load balancer is attached to a logical router connected to the current logical switch and the \fBinstall_ls_lb_from_router\fR variable in \fBoptions\fR is set to true\[char46] .IP \(bu If the load balancer is created with the \fB\-\-reject\fR option and it has no active backends, a TCP reset segment (for TCP) or an ICMP port unreachable packet (for all other kinds of traffic) will be sent whenever an incoming packet is received for this load balancer\[char46] Note that using the \fB\-\-reject\fR option disables the empty_lb SB controller event for this load balancer\[char46] .RE .ST "Ingress Table 13: Load balancing affinity learn" .PP .PP Load balancing affinity learn table contains the following logical flows: .RS .IP \(bu For all the configured load balancing rules for a switch in \fBOVN_Northbound\fR database where a positive affinity timeout \fIT\fR is specified in \fBoptions\fR column, that includes an L4 port \fIPORT\fR of protocol \fIP\fR and IP address \fIVIP\fR, a priority\-100 flow is added\[char46] For IPv4 \fIVIPs\fR, the flow matches \fBreg9[6] == 0 && ct\[char46]new && ip && ip4\[char46]dst == \fIVIP\fB && \fIP\fB\[char46]dst == \fIPORT\fB\fR\[char46] For IPv6 \fIVIPs\fR, the flow matches \fBct\[char46]new && ip && ip6\[char46]dst == \fIVIP\fB && \fIP\fB && \fIP\fB\[char46]dst == \fIPORT\fB\fR\[char46] The flow\(cqs action is \fBcommit_lb_aff(vip = \fIVIP\fB:\fIPORT\fB, backend = \fIbackend ip\fB: \fIbackend port\fB, proto = \fIP\fB, timeout = \fIT\fB);\fR\[char46] .IP \(bu A priority\-0 flow is added which matches on all packets
and applies the action \fBnext;\fR\[char46] .RE .ST "Ingress Table 14: \fBfrom\-lport\fI ACLs after LB" .PP .PP Logical flows in this table closely reproduce those in the \fBACL\fR table in the \fBOVN_Northbound\fR database for the \fBfrom\-lport\fR direction with the option \fBapply\-after\-lb\fR set to \fBtrue\fR\[char46] The \fBpriority\fR values from the \fBACL\fR table have a limited range and have 1000 added to them to leave room for OVN default flows at both higher and lower priorities\[char46] .RS .IP \(bu \fBallow\fR apply-after-lb ACLs translate into logical flows with the \fBnext;\fR action\[char46] If there are any stateful ACLs (including both before-lb and after-lb ACLs) on this datapath, then \fBallow\fR ACLs translate to \fBct_commit; next;\fR (which acts as a hint for the next tables to commit the connection to conntrack)\[char46] In case the \fBACL\fR has a label then \fBreg3\fR is loaded with the label value and \fBreg0[13]\fR bit is set to 1 (which acts as a hint for the next tables to commit the label to conntrack)\[char46] .IP \(bu \fBallow\-related\fR apply-after-lb ACLs translate into logical flows with the \fBct_commit(ct_label=0/1); next;\fR actions for new connections and \fBreg0[1] = 1; next;\fR for existing connections\[char46] In case the \fBACL\fR has a label then \fBreg3\fR is loaded with the label value and \fBreg0[13]\fR bit is set to 1 (which acts as a hint for the next tables to commit the label to conntrack)\[char46] .IP \(bu \fBallow\-stateless\fR apply-after-lb ACLs translate into logical flows with the \fBnext;\fR action\[char46] .IP \(bu \fBreject\fR apply-after-lb ACLs translate into logical flows with the \fBtcp_reset { output <\-> inport; next(pipeline=egress,table=5);}\fR action for TCP connections, \fBicmp4/icmp6\fR action for UDP connections, and \fBsctp_abort { output <\-> inport; next(pipeline=egress,table=5);}\fR action for SCTP associations\[char46] .IP \(bu Other apply-after-lb ACLs translate to \fBdrop;\fR for
new or untracked connections and \fBct_commit(ct_label=1/1);\fR for known connections\[char46] Setting \fBct_label\fR marks a connection as one that was previously allowed, but should no longer be allowed due to a policy change\[char46] .RE .RS .IP \(bu One priority\-0 fallback flow that matches all packets and advances to the next table\[char46] .RE .ST "Ingress Table 15: Stateful" .RS .IP \(bu A priority 100 flow is added which commits the packet to the conntrack and sets the most significant 32-bits of \fBct_label\fR with the \fBreg3\fR value based on the hint provided by previous tables (with a match for \fBreg0[1] == 1 && reg0[13] == 1\fR)\[char46] This is used by the \fBACLs\fR with label to commit the label value to conntrack\[char46] .IP \(bu For \fBACLs\fR without label, a second priority\-100 flow commits packets to connection tracker using \fBct_commit; next;\fR action based on a hint provided by the previous tables (with a match for \fBreg0[1] == 1 && reg0[13] == 0\fR)\[char46] .IP \(bu A priority\-0 flow that simply moves traffic to the next table\[char46] .RE .ST "Ingress Table 16: Pre-Hairpin" .RS .IP \(bu If the logical switch has load balancer(s) configured, then a priority\-100 flow is added with the match \fBip && ct\[char46]trk\fR to check if the packet needs to be hairpinned (if after load balancing the destination IP matches the source IP) or not by executing the actions \fBreg0[6] = chk_lb_hairpin();\fR and \fBreg0[12] = chk_lb_hairpin_reply();\fR and advances the packet to the next table\[char46] .IP \(bu A priority\-0 flow that simply moves traffic to the next table\[char46] .RE .ST "Ingress Table 17: Nat-Hairpin" .RS .IP \(bu If the logical switch has load balancer(s) configured, then a priority\-100 flow is added with the match \fBip && ct\[char46]new && ct\[char46]trk && reg0[6] == 1\fR which hairpins the traffic by NATting source IP to the load balancer VIP by executing the action \fBct_snat_to_vip\fR and advances the packet to the next 
table\[char46] .IP \(bu If the logical switch has load balancer(s) configured, then a priority\-100 flow is added with the match \fBip && ct\[char46]est && ct\[char46]trk && reg0[6] == 1\fR which hairpins the traffic by NATting source IP to the load balancer VIP by executing the action \fBct_snat\fR and advances the packet to the next table\[char46] .IP \(bu If the logical switch has load balancer(s) configured, then a priority\-90 flow is added with the match \fBip && reg0[12] == 1\fR which matches on the replies of hairpinned traffic (i\[char46]e\[char46], destination IP is VIP, source IP is the backend IP and source L4 port is backend port for L4 load balancers) and executes \fBct_snat\fR and advances the packet to the next table\[char46] .IP \(bu A priority\-0 flow that simply moves traffic to the next table\[char46] .RE .ST "Ingress Table 18: Hairpin" .RS .IP \(bu For each distributed gateway router port \fIRP\fR attached to the logical switch, a priority\-2000 flow is added with the match \fBreg0[14] == 1 && is_chassis_resident(\fIRP\fB) \fR and action \fBnext;\fR to pass the traffic to the next table to respond to the ARP requests for the router port IPs\[char46] .IP \fBreg0[14]\fR register bit is set in the ingress L2 port security check table for traffic received from HW VTEP (ramp) ports\[char46] .IP \(bu A priority\-1000 flow that matches on \fBreg0[14]\fR register bit for the traffic received from HW VTEP (ramp) ports\[char46] This traffic is passed to ingress table ls_in_l2_lkup\[char46] .IP \(bu A priority\-1 flow that hairpins traffic matched by non-default flows in the Pre-Hairpin table\[char46] Hairpinning is done at L2, Ethernet addresses are swapped and the packets are looped back on the input port\[char46] .IP \(bu A priority\-0 flow that simply moves traffic to the next table\[char46] .RE .ST "Ingress Table 19: ARP/ND responder" .PP .PP This table implements ARP/ND responder in a logical switch for known IPs\[char46] The advantage of the ARP 
responder flow is to limit ARP broadcasts by locally responding to ARP requests without the need to send to other hypervisors\[char46] One common case is when the inport is a logical port associated with a VIF and the broadcast is responded to on the local hypervisor rather than broadcast across the whole network and responded to by the destination VM\[char46] This behavior is proxy ARP\[char46] .PP .PP ARP requests arrive from VMs from a logical switch inport of type default\[char46] For this case, the logical switch proxy ARP rules can be for other VMs or logical router ports\[char46] Logical switch proxy ARP rules may be programmed both for mac binding of IP addresses on other logical switch VIF ports (which are of the default logical switch port type, representing connectivity to VMs or containers), and for mac binding of IP addresses on logical switch router type ports, representing their logical router port peers\[char46] In order to support proxy ARP for logical router ports, an IP address must be configured on the logical switch router type port, with the same value as the peer logical router port\[char46] The configured MAC addresses must match as well\[char46] When a VM sends an ARP request for a distributed logical router port and if the peer router type port of the attached logical switch does not have an IP address configured, the ARP request will be broadcast on the logical switch\[char46] One of the copies of the ARP request will go through the logical switch router type port to the logical router datapath, where the logical router ARP responder will generate a reply\[char46] The MAC binding of a distributed logical router, once learned by an associated VM, is used for all that VM\(cqs communication needing routing\[char46] Hence, the action of a VM re-arping for the mac binding of the logical router port should be rare\[char46] .PP .PP Logical switch ARP responder proxy ARP rules can also be hit when receiving ARP requests externally on a L2 gateway 
port\[char46] In this case, the hypervisor acting as an L2 gateway responds to the ARP request on behalf of a destination VM\[char46] .PP .PP Note that ARP requests received from \fBlocalnet\fR logical inports can either go directly to VMs, in which case the VM responds, or can hit an ARP responder for a logical router port if the packet is used to resolve a logical router port next hop address\[char46] In either case, logical switch ARP responder rules will not be hit\[char46] This table contains these logical flows: .RS .IP \(bu Priority\-100 flows that skip the ARP responder if inport is of type \fBlocalnet\fR and advance directly to the next table\[char46] ARP requests sent to \fBlocalnet\fR ports can be received by multiple hypervisors\[char46] Because the same MAC binding rules are downloaded to all hypervisors, each of those hypervisors would respond\[char46] This would confuse L2 learning on the source of the ARP requests\[char46] ARP requests received on an inport of type \fBrouter\fR are not expected to hit any logical switch ARP responder flows\[char46] However, no skip flows are installed for these packets, as there would be some additional flow cost for this and the value appears limited\[char46] .IP \(bu If inport \fBV\fR is of type \fBvirtual\fR, a priority\-100 logical flow is added for each \fIP\fR configured in the \fBoptions:virtual-parents\fR column with the match .IP .nf \fB .br \fB\fR\fBinport == \fIP\fB && ((arp\[char46]op == 1 && arp\[char46]spa == \fIVIP\fB && arp\[char46]tpa == \fIVIP\fB) || (arp\[char46]op == 2 && arp\[char46]spa == \fIVIP\fB))\fB\fR .br \fB\fR\fBinport == \fIP\fB && ((nd_ns && ip6\[char46]dst == \fI{VIP, NS_MULTICAST_ADDR}\fB && nd\[char46]target == \fIVIP\fB) || (nd_na && nd\[char46]target == \fIVIP\fB))\fB\fR .br \fB \fR .fi .IP and applies the action .IP .nf \fB .br \fB\fR\fBbind_vport(\fIV\fB, inport);\fB\fR .br \fB \fR .fi .IP and advances the packet to the next table\[char46] .IP Where \fIVIP\fR is the virtual ip
configured in the column \fBoptions:virtual-ip\fR and NS_MULTICAST_ADDR is solicited-node multicast address corresponding to the VIP\[char46] .IP \(bu Priority\-50 flows that match ARP requests to each known IP address \fIA\fR of every logical switch port, and respond with ARP replies directly with corresponding Ethernet address \fIE\fR: .IP .nf \fB .br \fBeth\[char46]dst = eth\[char46]src; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBarp\[char46]op = 2; /* ARP reply\[char46] */ .br \fBarp\[char46]tha = arp\[char46]sha; .br \fBarp\[char46]sha = \fR\fIE\fB\fR; .br \fBarp\[char46]tpa = arp\[char46]spa; .br \fBarp\[char46]spa = \fR\fIA\fB\fR; .br \fBoutport = inport; .br \fBflags\[char46]loopback = 1; .br \fBoutput; .br \fB \fR .fi .IP These flows are omitted for logical ports (other than router ports or \fBlocalport\fR ports) that are down (unless \fB ignore_lsp_down\fR is configured as true in \fBoptions\fR column of \fBNB_Global\fR table of the \fBNorthbound\fR database), for logical ports of type \fBvirtual\fR, for logical ports with \(cqunknown\(cq address set and for logical ports of a logical switch configured with \fBother_config:vlan\-passthru=true\fR\[char46] .IP The above ARP responder flows are added for the list of IPv4 addresses if defined in \fBoptions:arp_proxy\fR column of \fBLogical_Switch_Port\fR table for logical switch ports of type \fBrouter\fR\[char46] .IP \(bu Priority\-50 flows that match IPv6 ND neighbor solicitations to each known IP address \fIA\fR (and \fIA\fR\(cqs solicited node address) of every logical switch port except of type router, and respond with neighbor advertisements directly with corresponding Ethernet address \fIE\fR: .IP .nf \fB .br \fBnd_na { .br \fB eth\[char46]src = \fR\fIE\fB\fR; .br \fB ip6\[char46]src = \fR\fIA\fB\fR; .br \fB nd\[char46]target = \fR\fIA\fB\fR; .br \fB nd\[char46]tll = \fR\fIE\fB\fR; .br \fB outport = inport; .br \fB flags\[char46]loopback = 1; .br \fB output; .br \fB}; .br \fB \fR .fi .IP 
Priority\-50 flows that match IPv6 ND neighbor solicitations to each known IP address \fIA\fR (and \fIA\fR\(cqs solicited node address) of logical switch port of type router, and respond with neighbor advertisements directly with corresponding Ethernet address \fIE\fR: .IP .nf \fB .br \fBnd_na_router { .br \fB eth\[char46]src = \fR\fIE\fB\fR; .br \fB ip6\[char46]src = \fR\fIA\fB\fR; .br \fB nd\[char46]target = \fR\fIA\fB\fR; .br \fB nd\[char46]tll = \fR\fIE\fB\fR; .br \fB outport = inport; .br \fB flags\[char46]loopback = 1; .br \fB output; .br \fB}; .br \fB \fR .fi .IP These flows are omitted for logical ports (other than router ports or \fBlocalport\fR ports) that are down (unless \fB ignore_lsp_down\fR is configured as true in \fBoptions\fR column of \fBNB_Global\fR table of the \fBNorthbound\fR database), for logical ports of type \fBvirtual\fR and for logical ports with \(cqunknown\(cq address set\[char46] .IP \(bu Priority\-100 flows with match criteria like the ARP and ND flows above, except that they only match packets from the \fBinport\fR that owns the IP addresses in question, with action \fBnext;\fR\[char46] These flows prevent OVN from replying to, for example, an ARP request emitted by a VM for its own IP address\[char46] A VM only makes this kind of request to attempt to detect a duplicate IP address assignment, so sending a reply will prevent the VM from accepting the IP address that it owns\[char46] .IP In place of \fBnext;\fR, it would be reasonable to use \fBdrop;\fR for the flows\(cq actions\[char46] If everything is working as it is configured, then this would produce equivalent results, since no host should reply to the request\[char46] But ARPing for one\(cqs own IP address is intended to detect situations where the network is not working as configured, so dropping the request would frustrate that intent\[char46] .IP \(bu For each \fISVC_MON_SRC_IP\fR defined in the value of the \fBip_port_mappings:ENDPOINT_IP\fR column of \fBLoad_Balancer\fR 
table, a priority\-110 logical flow is added with the match \fBarp\[char46]tpa == \fISVC_MON_SRC_IP\fB && arp\[char46]op == 1\fR and applies the action .IP .nf \fB .br \fBeth\[char46]dst = eth\[char46]src; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBarp\[char46]op = 2; /* ARP reply\[char46] */ .br \fBarp\[char46]tha = arp\[char46]sha; .br \fBarp\[char46]sha = \fR\fIE\fB\fR; .br \fBarp\[char46]tpa = arp\[char46]spa; .br \fBarp\[char46]spa = \fR\fIA\fB\fR; .br \fBoutport = inport; .br \fBflags\[char46]loopback = 1; .br \fBoutput; .br \fB \fR .fi .IP where \fIE\fR is the service monitor source MAC defined in the \fBoptions:svc_monitor_mac\fR column in the \fBNB_Global\fR table\[char46] This MAC is used as the source MAC in the service monitor packets for the load balancer endpoint IP health checks\[char46] .IP \fISVC_MON_SRC_IP\fR is used as the source IP in the service monitor IPv4 packets for the load balancer endpoint IP health checks\[char46] .IP These flows are required if an ARP request is sent for the IP \fISVC_MON_SRC_IP\fR\[char46] .IP \(bu For each \fIVIP\fR configured in the table \fBForwarding_Group\fR a priority\-50 logical flow is added with the match \fBarp\[char46]tpa == \fIvip\fB && arp\[char46]op == 1\fR and applies the action .IP .nf \fB .br \fBeth\[char46]dst = eth\[char46]src; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBarp\[char46]op = 2; /* ARP reply\[char46] */ .br \fBarp\[char46]tha = arp\[char46]sha; .br \fBarp\[char46]sha = \fR\fIE\fB\fR; .br \fBarp\[char46]tpa = arp\[char46]spa; .br \fBarp\[char46]spa = \fR\fIA\fB\fR; .br \fBoutport = inport; .br \fBflags\[char46]loopback = 1; .br \fBoutput; .br \fB \fR .fi .IP where \fIE\fR is the forwarding group\(cqs MAC defined in the \fBvmac\fR column\[char46] .IP \fIA\fR is used as either the destination IP for load balancing traffic to child ports or as nexthop to hosts behind the child ports\[char46] .IP These flows are required to respond to an ARP request if an ARP request is sent for the IP
\fIvip\fR\[char46] .IP \(bu One priority\-0 fallback flow that matches all packets and advances to the next table\[char46] .RE .ST "Ingress Table 20: DHCP option processing" .PP .PP This table adds the DHCPv4 options to a DHCPv4 packet from the logical ports configured with IPv4 address(es) and DHCPv4 options, and similarly for DHCPv6 options\[char46] This table also adds flows for the logical ports of type \fBexternal\fR\[char46] .RS .IP \(bu A priority\-100 logical flow is added for these logical ports which matches the IPv4 packet with \fBudp\[char46]src\fR = 68 and \fBudp\[char46]dst\fR = 67 and applies the action \fBput_dhcp_opts\fR and advances the packet to the next table\[char46] .IP .nf \fB .br \fBreg0[3] = put_dhcp_opts(offer_ip = \fR\fIip\fB\fR, \fR\fIoptions\fB\fR\[char46]\[char46]\[char46]); .br \fBnext; .br \fB \fR .fi .IP For DHCPDISCOVER and DHCPREQUEST, this transforms the packet into a DHCP reply, adds the DHCP offer IP \fIip\fR and options to the packet, and stores 1 into reg0[3]\[char46] For other kinds of packets, it just stores 0 into reg0[3]\[char46] Either way, it continues to the next table\[char46] .IP \(bu A priority\-100 logical flow is added for these logical ports which matches the IPv6 packet with \fBudp\[char46]src\fR = 546 and \fBudp\[char46]dst\fR = 547 and applies the action \fBput_dhcpv6_opts\fR and advances the packet to the next table\[char46] .IP .nf \fB .br \fBreg0[3] = put_dhcpv6_opts(ia_addr = \fR\fIip\fB\fR, \fR\fIoptions\fB\fR\[char46]\[char46]\[char46]); .br \fBnext; .br \fB \fR .fi .IP For DHCPv6 Solicit/Request/Confirm packets, this transforms the packet into a DHCPv6 Advertise/Reply, adds the DHCPv6 offer IP \fIip\fR and options to the packet, and stores 1 into reg0[3]\[char46] For other kinds of packets, it just stores 0 into reg0[3]\[char46] Either way, it continues to the next table\[char46] .IP \(bu A priority\-0 flow that matches all packets and advances to the next table\[char46] .RE .ST "Ingress Table 21: DHCP
responses" .PP .PP This table implements DHCP responder for the DHCP replies generated by the previous table\[char46] .RS .IP \(bu A priority 100 logical flow is added for the logical ports configured with DHCPv4 options which matches IPv4 packets with \fBudp\[char46]src == 68 && udp\[char46]dst == 67 && reg0[3] == 1\fR and responds back to the \fBinport\fR after applying these actions\[char46] If \fBreg0[3]\fR is set to 1, it means that the action \fBput_dhcp_opts\fR was successful\[char46] .IP .nf \fB .br \fBeth\[char46]dst = eth\[char46]src; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBip4\[char46]src = \fR\fIS\fB\fR; .br \fBudp\[char46]src = 67; .br \fBudp\[char46]dst = 68; .br \fBoutport = \fR\fIP\fB\fR; .br \fBflags\[char46]loopback = 1; .br \fBoutput; .br \fB \fR .fi .IP where \fIE\fR is the server MAC address and \fIS\fR is the server IPv4 address defined in the DHCPv4 options\[char46] Note that \fBip4\[char46]dst\fR field is handled by \fBput_dhcp_opts\fR\[char46] .IP (This terminates ingress packet processing; the packet does not go to the next ingress table\[char46]) .IP \(bu A priority 100 logical flow is added for the logical ports configured with DHCPv6 options which matches IPv6 packets with \fBudp\[char46]src == 546 && udp\[char46]dst == 547 && reg0[3] == 1\fR and responds back to the \fBinport\fR after applying these actions\[char46] If \fBreg0[3]\fR is set to 1, it means that the action \fBput_dhcpv6_opts\fR was successful\[char46] .IP .nf \fB .br \fBeth\[char46]dst = eth\[char46]src; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBip6\[char46]dst = \fR\fIA\fB\fR; .br \fBip6\[char46]src = \fR\fIS\fB\fR; .br \fBudp\[char46]src = 547; .br \fBudp\[char46]dst = 546; .br \fBoutport = \fR\fIP\fB\fR; .br \fBflags\[char46]loopback = 1; .br \fBoutput; .br \fB \fR .fi .IP where \fIE\fR is the server MAC address and \fIS\fR is the server IPv6 LLA address generated from the \fBserver_id\fR defined in the DHCPv6 options and \fIA\fR is the IPv6 address defined 
in the logical port\(cqs addresses column\[char46] .IP (This terminates ingress packet processing; the packet does not go to the next ingress table\[char46]) .IP \(bu A priority\-0 flow that matches all packets and advances to the next table\[char46] .RE .ST "Ingress Table 22: DNS Lookup" .PP .PP This table looks up and resolves the DNS names to the corresponding configured IP address(es)\[char46] .RS .IP \(bu A priority\-100 logical flow for each logical switch datapath if it is configured with DNS records, which matches the IPv4 and IPv6 packets with \fBudp\[char46]dst\fR = 53 and applies the action \fBdns_lookup\fR and advances the packet to the next table\[char46] .IP .nf \fB .br \fBreg0[4] = dns_lookup(); next; .br \fB \fR .fi .IP For valid DNS packets, this transforms the packet into a DNS reply if the DNS name can be resolved, and stores 1 into reg0[4]\[char46] For failed DNS resolution or other kinds of packets, it just stores 0 into reg0[4]\[char46] Either way, it continues to the next table\[char46] .RE .ST "Ingress Table 23: DNS Responses" .PP .PP This table implements DNS responder for the DNS replies generated by the previous table\[char46] .RS .IP \(bu A priority\-100 logical flow for each logical switch datapath if it is configured with DNS records, which matches the IPv4 and IPv6 packets with \fBudp\[char46]dst = 53 && reg0[4] == 1\fR and responds back to the \fBinport\fR after applying these actions\[char46] If \fBreg0[4]\fR is set to 1, it means that the action \fBdns_lookup\fR was successful\[char46] .IP .nf \fB .br \fBeth\[char46]dst <\-> eth\[char46]src; .br \fBip4\[char46]src <\-> ip4\[char46]dst; .br \fBudp\[char46]dst = udp\[char46]src; .br \fBudp\[char46]src = 53; .br \fBoutport = \fR\fIP\fB\fR; .br \fBflags\[char46]loopback = 1; .br \fBoutput; .br \fB \fR .fi .IP (This terminates ingress packet processing; the packet does not go to the next ingress table\[char46]) .RE .ST "Ingress Table 24: External ports" .PP .PP Traffic from the \fBexternal\fR logical
ports enters the ingress datapath pipeline via the \fBlocalnet\fR port\[char46] This table adds the below logical flows to handle the traffic from these ports\[char46] .RS .IP \(bu A priority\-100 flow is added for each \fBexternal\fR logical port which doesn\(cqt reside on a chassis to drop the ARP/IPv6 NS request to the router IP(s) (of the logical switch) which matches on the \fBinport\fR of the \fBexternal\fR logical port and the valid \fBeth\[char46]src\fR address(es) of the \fBexternal\fR logical port\[char46] .IP This flow guarantees that the ARP/NS request to the router IP address from the external ports is answered only by the chassis that has claimed these external ports\[char46] All other chassis drop these packets\[char46] .IP A priority\-100 flow is added for each \fBexternal\fR logical port which doesn\(cqt reside on a chassis to drop any packet destined to the router MAC, with the match \fBinport == \fIexternal\fB && eth\[char46]src == \fIE\fB && eth\[char46]dst == \fIR\fB && !is_chassis_resident(\(dq\fIexternal\fB\(dq)\fR where \fIE\fR is the external port MAC and \fIR\fR is the router port MAC\[char46] .IP \(bu A priority\-0 flow that matches all packets and advances to table 20\[char46] .RE .ST "Ingress Table 25 Destination Lookup" .PP .PP This table implements switching behavior\[char46] It contains these logical flows: .RS .IP \(bu A priority\-110 flow with the match \fBeth\[char46]src == \fIE\fB\fR for all logical switch datapaths that applies the action \fBhandle_svc_check(inport)\fR, where \fIE\fR is the service monitor MAC defined in the \fBoptions:svc_monitor_mac\fR column of the \fBNB_Global\fR table\[char46] .IP \(bu A priority\-100 flow that punts all IGMP/MLD packets to \fBovn\-controller\fR if multicast snooping is enabled on the logical switch\[char46] .IP \(bu Priority\-90 flows that forward registered IP multicast traffic to their corresponding multicast group, which \fBovn\-northd\fR creates based on learnt 
\fBIGMP_Group\fR entries\[char46] The flows also forward packets to the \fBMC_MROUTER_FLOOD\fR multicast group, which \fBovn\-northd\fR populates with all the logical ports that are connected to logical routers with \fBoptions\fR \fB:mcast_relay=\(cqtrue\(cq\fR\[char46] .IP \(bu A priority\-85 flow that forwards all IP multicast traffic destined to 224\[char46]0\[char46]0\[char46]X to the \fBMC_FLOOD_L2\fR multicast group, which \fBovn\-northd\fR populates with all non-router logical ports\[char46] .IP \(bu A priority\-85 flow that forwards all IP multicast traffic destined to reserved multicast IPv6 addresses (RFC 4291, 2\[char46]7\[char46]1, e\[char46]g\[char46], Solicited-Node multicast) to the \fBMC_FLOOD\fR multicast group, which \fBovn\-northd\fR populates with all enabled logical ports\[char46] .IP \(bu A priority\-80 flow that forwards all unregistered IP multicast traffic to the \fBMC_STATIC\fR multicast group, which \fBovn\-northd\fR populates with all the logical ports that have \fBoptions\fR \fB:mcast_flood=\(cqtrue\(cq\fR\[char46] The flow also forwards unregistered IP multicast traffic to the \fBMC_MROUTER_FLOOD\fR multicast group, which \fBovn\-northd\fR populates with all the logical ports connected to logical routers that have \fBoptions\fR \fB:mcast_relay=\(cqtrue\(cq\fR\[char46] .IP \(bu A priority\-80 flow that drops all unregistered IP multicast traffic if \fBother_config\fR \fB:mcast_snoop=\(cqtrue\(cq\fR and \fBother_config\fR \fB:mcast_flood_unregistered=\(cqfalse\(cq\fR and the switch is not connected to a logical router that has \fBoptions\fR \fB:mcast_relay=\(cqtrue\(cq\fR and the switch doesn\(cqt have any logical port with \fBoptions\fR \fB:mcast_flood=\(cqtrue\(cq\fR\[char46] .IP \(bu Priority\-80 flows for each IP address/VIP/NAT address owned by a router port connected to the switch\[char46] These flows match ARP requests and ND packets for the specific IP addresses\[char46] Matched packets are forwarded only to the router that owns the IP 
address and to the \fBMC_FLOOD_L2\fR multicast group which contains all non-router logical ports\[char46] .IP \(bu Priority\-75 flows for each port connected to a logical router matching self originated ARP request/RARP request/ND packets\[char46] These packets are flooded to the \fBMC_FLOOD_L2\fR which contains all non-router logical ports\[char46] .IP \(bu A priority\-70 flow that outputs all packets with an Ethernet broadcast or multicast \fBeth\[char46]dst\fR to the \fBMC_FLOOD\fR multicast group\[char46] .IP \(bu One priority\-50 flow that matches each known Ethernet address against \fBeth\[char46]dst\fR\[char46] Action of this flow outputs the packet to the single associated output port if it is enabled\[char46] \fBdrop;\fR action is applied if LSP is disabled\[char46] .IP For the Ethernet address on a logical switch port of type \fBrouter\fR, when that logical switch port\(cqs \fBaddresses\fR column is set to \fBrouter\fR and the connected logical router port has a gateway chassis: .RS .IP \(bu The flow for the connected logical router port\(cqs Ethernet address is only programmed on the gateway chassis\[char46] .IP \(bu If the logical router has rules specified in \fBnat\fR with \fBexternal_mac\fR, then those addresses are also used to populate the switch\(cqs destination lookup on the chassis where \fBlogical_port\fR is resident\[char46] .RE .IP For the Ethernet address on a logical switch port of type \fBrouter\fR, when that logical switch port\(cqs \fBaddresses\fR column is set to \fBrouter\fR and the connected logical router port specifies a \fBreside\-on\-redirect\-chassis\fR and the logical router to which the connected logical router port belongs to has a distributed gateway LRP: .RS .IP \(bu The flow for the connected logical router port\(cqs Ethernet address is only programmed on the gateway chassis\[char46] .RE .IP For each forwarding group configured on the logical switch datapath, a priority\-50 flow that matches on \fBeth\[char46]dst == 
\fIVIP\fB \fR with an action of \fBfwd_group(childports=\fIargs \fB)\fR, where \fIargs\fR contains comma separated logical switch child ports to load balance to\[char46] If \fBliveness\fR is enabled, then the action also includes \fB liveness=true\fR\[char46] .IP \(bu One priority\-0 fallback flow that matches all packets with the action \fBoutport = get_fdb(eth\[char46]dst); next;\fR\[char46] The action \fBget_fdb\fR gets the port for the \fBeth\[char46]dst\fR in the MAC learning table of the logical switch datapath\[char46] If there is no entry for \fBeth\[char46]dst\fR in the MAC learning table, then it stores \fBnone\fR in the \fBoutport\fR\[char46] .RE .ST "Ingress Table 26 Destination unknown" .PP .PP This table handles the packets whose destination was not found by the flows of the previous table and was therefore looked up in the MAC learning table of the logical switch datapath\[char46] It contains the following flows\[char46] .RS .IP \(bu A priority 50 flow with the match \fBoutport == \fIP\fB\fR is added for each disabled Logical Switch Port \fBP\fR\[char46] This flow has action \fBdrop;\fR\[char46] .IP \(bu If the logical switch has logical ports with \(cqunknown\(cq addresses set, then the below logical flow is added .RS .IP \(bu A priority 50 flow with the match \fBoutport == \(dqnone\(dq\fR that outputs the packets to the \fBMC_UNKNOWN\fR multicast group, which \fBovn\-northd\fR populates with all enabled logical ports that accept unknown destination packets\[char46] As a small optimization, if no logical ports accept unknown destination packets, \fBovn\-northd\fR omits this multicast group and logical flow\[char46] .RE .IP If the logical switch has no logical ports with \(cqunknown\(cq address set, then the below logical flow is added .RS .IP \(bu A priority 50 flow with the match \fBoutport == \(dqnone\(dq\fR that drops the packets\[char46] .RE .IP \(bu One priority\-0 fallback flow that outputs the packet to the egress stage with the outport learnt from the \fBget_fdb\fR action\[char46] .RE .ST "Egress Table 0: Pre-LB" 
.PP .PP This table is similar to ingress table \fBPre\-LB\fR\[char46] It contains a priority\-0 flow that simply moves traffic to the next table\[char46] Moreover it contains two priority\-110 flows to move multicast, IPv6 Neighbor Discovery and MLD traffic to the next table\[char46] If any load balancing rules exist for the datapath, a priority\-100 flow is added with a match of \fBip\fR and action of \fBreg0[2] = 1; next;\fR to act as a hint for table \fBPre\-stateful\fR to send IP packets to the connection tracker for packet de-fragmentation and possibly DNAT the destination VIP to one of the selected backend for already committed load balanced traffic\[char46] .PP .PP This table also has a priority\-110 flow with the match \fBeth\[char46]src == \fIE\fB\fR for all logical switch datapaths to move traffic to the next table\[char46] Where \fIE\fR is the service monitor mac defined in the \fBoptions:svc_monitor_mac\fR column of \fBNB_Global\fR table\[char46] .ST "Egress Table 1: \fBto\-lport\fI Pre-ACLs" .PP .PP This is similar to ingress table \fBPre\-ACLs\fR except for \fBto\-lport\fR traffic\[char46] .PP .PP This table also has a priority\-110 flow with the match \fBeth\[char46]src == \fIE\fB\fR for all logical switch datapaths to move traffic to the next table\[char46] Where \fIE\fR is the service monitor mac defined in the \fBoptions:svc_monitor_mac\fR column of \fBNB_Global\fR table\[char46] .PP .PP This table also has a priority\-110 flow with the match \fBoutport == \fII\fB\fR for all logical switch datapaths to move traffic to the next table\[char46] Where \fII\fR is the peer of a logical router port\[char46] This flow is added to skip the connection tracking of packets which will be entering logical router datapath from logical switch datapath for routing\[char46] .ST "Egress Table 2: Pre-stateful" .PP .PP This is similar to ingress table \fBPre\-stateful\fR\[char46] This table adds the below 3 logical flows\[char46] .RS .IP \(bu A Priority\-120 flow that 
sends the packets to the connection tracker using \fBct_lb_mark;\fR as the action so that the already established traffic gets unDNATted from the backend IP to the load balancer VIP based on a hint provided by the previous tables with a match for \fBreg0[2] == 1\fR\[char46] If the packet was not DNATted earlier, then \fBct_lb_mark\fR functions like \fBct_next\fR\[char46] .IP \(bu A priority\-100 flow sends the packets to the connection tracker based on a hint provided by the previous tables (with a match for \fBreg0[0] == 1\fR) by using the \fBct_next;\fR action\[char46] .IP \(bu A priority\-0 flow that matches all packets and advances them to the next table\[char46] .RE .ST "Egress Table 3: \fBfrom\-lport\fI ACL hints" .PP .PP This is similar to ingress table \fBACL hints\fR\[char46] .ST "Egress Table 4: \fBto\-lport\fI ACLs" .PP .PP This is similar to ingress table \fBACLs\fR except for \fBto\-lport\fR ACLs\[char46] .PP .PP In addition, the following flows are added\[char46] .RS .IP \(bu A priority 34000 logical flow is added for each logical port which has DHCPv4 options defined to allow the DHCPv4 reply packet and which has DHCPv6 options defined to allow the DHCPv6 reply packet from the \fBIngress Table 21: DHCP responses\fR\[char46] .IP \(bu A priority 34000 logical flow is added for each logical switch datapath configured with DNS records with the match \fBudp\[char46]dst == 53\fR to allow the DNS reply packet from the \fBIngress Table 23: DNS responses\fR\[char46] .IP \(bu A priority 34000 logical flow is added for each logical switch datapath with the match \fBeth\[char46]src == \fIE\fB\fR to allow the service monitor request packet generated by \fBovn\-controller\fR with the action \fBnext\fR, where \fIE\fR is the service monitor mac defined in the \fBoptions:svc_monitor_mac\fR column of \fBNB_Global\fR table\[char46] .RE .ST "Egress Table 5: \fBto\-lport\fI QoS Marking" .PP .PP This is similar to ingress table \fBQoS marking\fR except that it applies to \fBto\-lport\fR QoS 
rules\[char46] .ST "Egress Table 6: \fBto\-lport\fI QoS Meter" .PP .PP This is similar to ingress table \fBQoS meter\fR except that it applies to \fBto\-lport\fR QoS rules\[char46] .ST "Egress Table 7: Stateful" .PP .PP This is similar to ingress table \fBStateful\fR except that there are no rules added for load balancing new connections\[char46] .ST "Egress Table 8: Egress Port Security - check" .PP .PP This is similar to the port security logic in table \fBIngress Port Security check\fR except that action \fBcheck_out_port_sec\fR is used to check the port security rules\[char46] This table adds the below logical flows\[char46] .RS .IP \(bu A priority 100 flow which matches on the multicast traffic and applies the action \fBREGBIT_PORT_SEC_DROP = 0; next;\fR to skip the out port security checks\[char46] .IP \(bu A priority 0 logical flow is added which matches on all the packets and applies the action \fBREGBIT_PORT_SEC_DROP = check_out_port_sec(); next;\fR\[char46] The action \fBcheck_out_port_sec\fR applies the port security rules based on the addresses defined in the \fBport_security\fR column of \fBLogical_Switch_Port\fR table before delivering the packet to the \fBoutport\fR\[char46] .RE .ST "Egress Table 9: Egress Port Security - Apply" .PP .PP This is similar to the ingress port security logic in ingress table \fBIngress Port Security \- Apply\fR\[char46] This table drops the packets if the port security check failed in the previous stage, i\[char46]e\[char46], the register bit \fBREGBIT_PORT_SEC_DROP\fR is set to 1\[char46] .PP .PP The following flows are added\[char46] .RS .IP \(bu For each localnet port configured with egress qos in the \fBoptions:qdisc_queue_id\fR column of \fBLogical_Switch_Port\fR, a priority 100 flow is added which matches on the localnet \fBoutport\fR and applies the action \fBset_queue(id); output;\fR\[char46] .IP Please remember to mark the corresponding physical interface with \fBovn\-egress\-iface\fR set to true in 
\fBexternal_ids\fR\[char46] .IP \(bu A priority\-50 flow that drops the packet if the register bit \fBREGBIT_PORT_SEC_DROP\fR is set to 1\[char46] .IP \(bu A priority\-0 flow that outputs the packet to the \fBoutport\fR\[char46] .RE .SS "Logical Router Datapaths" .PP .PP Logical router datapaths will only exist for \fBLogical_Router\fR rows in the \fBOVN_Northbound\fR database that do not have \fBenabled\fR set to \fBfalse\fR .ST "Ingress Table 0: L2 Admission Control" .PP .PP This table drops packets that the router shouldn\(cqt see at all based on their Ethernet headers\[char46] It contains the following flows: .RS .IP \(bu Priority\-100 flows to drop packets with VLAN tags or multicast Ethernet source addresses\[char46] .IP \(bu For each enabled router port \fIP\fR with Ethernet address \fIE\fR, a priority\-50 flow that matches \fBinport == \fIP\fB && (eth\[char46]mcast || eth\[char46]dst == \fIE\fB\fR), stores the router port ethernet address and advances to next table, with action \fBxreg0[0\[char46]\[char46]47]=E; next;\fR\[char46] .IP For the gateway port on a distributed logical router (where one of the logical router ports specifies a gateway chassis), the above flow matching \fBeth\[char46]dst == \fIE\fB\fR is only programmed on the gateway port instance on the gateway chassis\[char46] If LRP\(cqs logical switch has attached LSP of \fBvtep\fR type, the \fBis_chassis_resident()\fR part is not added to lflow to allow traffic originated from logical switch to reach LR services (LBs, NAT)\[char46] .IP For a distributed logical router or for gateway router where the port is configured with \fBoptions:gateway_mtu\fR the action of the above flow is modified adding \fBcheck_pkt_larger\fR in order to mark the packet setting \fBREGBIT_PKT_LARGER\fR if the size is greater than the MTU\[char46] If the port is also configured with \fBoptions:gateway_mtu_bypass\fR then another flow is added, with priority\-55, to bypass the \fBcheck_pkt_larger\fR flow\[char46] This is 
useful for traffic that normally doesn\(cqt need to be fragmented and for which check_pkt_larger, which might not be offloadable, is not really needed\[char46] One such example is TCP traffic\[char46] .IP \(bu For each \fBdnat_and_snat\fR NAT rule on a distributed router that specifies an external Ethernet address \fIE\fR, a priority\-50 flow that matches \fBinport == \fIGW\fB && eth\[char46]dst == \fIE\fB\fR, where \fIGW\fR is the logical router distributed gateway port corresponding to the NAT rule (specified or inferred), with action \fBxreg0[0\[char46]\[char46]47]=E; next;\fR\[char46] .IP This flow is only programmed on the gateway port instance on the chassis where the \fBlogical_port\fR specified in the NAT rule resides\[char46] .IP \(bu A priority\-0 logical flow that matches all packets not already handled (match \fB1\fR) and drops them (action \fBdrop;\fR)\[char46] .RE .PP .PP Other packets are implicitly dropped\[char46] .ST "Ingress Table 1: Neighbor lookup" .PP .PP For ARP and IPv6 Neighbor Discovery packets, this table looks into the \fBMAC_Binding\fR records to determine if OVN needs to learn the mac bindings\[char46] Following flows are added: .RS .IP \(bu For each router port \fIP\fR that owns IP address \fIA\fR, which belongs to subnet \fIS\fR with prefix length \fIL\fR, if the option \fBalways_learn_from_arp_request\fR is \fBtrue\fR for this router, a priority\-100 flow is added which matches \fBinport == \fIP\fB && arp\[char46]spa == \fIS\fB/\fIL\fB && arp\[char46]op == 1\fR (ARP request) with the following actions: .IP .nf \fB .br \fBreg9[2] = lookup_arp(inport, arp\[char46]spa, arp\[char46]sha); .br \fBnext; .br \fB \fR .fi .IP If the option \fBalways_learn_from_arp_request\fR is \fBfalse\fR, the following two flows are added\[char46] .IP A priority\-110 flow is added which matches \fBinport == \fIP\fB && arp\[char46]spa == \fIS\fB/\fIL\fB && arp\[char46]tpa == \fIA\fB && arp\[char46]op == 1\fR (ARP request) with the following actions: .IP .nf 
\fB .br \fBreg9[2] = lookup_arp(inport, arp\[char46]spa, arp\[char46]sha); .br \fBreg9[3] = 1; .br \fBnext; .br \fB \fR .fi .IP A priority\-100 flow is added which matches \fBinport == \fIP\fB && arp\[char46]spa == \fIS\fB/\fIL\fB && arp\[char46]op == 1\fR (ARP request) with the following actions: .IP .nf \fB .br \fBreg9[2] = lookup_arp(inport, arp\[char46]spa, arp\[char46]sha); .br \fBreg9[3] = lookup_arp_ip(inport, arp\[char46]spa); .br \fBnext; .br \fB \fR .fi .IP If the logical router port \fIP\fR is a distributed gateway router port, additional match \fBis_chassis_resident(cr\-\fIP\fB)\fR is added for all these flows\[char46] .IP \(bu A priority\-100 flow which matches on ARP reply packets and applies the actions if the option \fBalways_learn_from_arp_request\fR is \fBtrue\fR: .IP .nf \fB .br \fBreg9[2] = lookup_arp(inport, arp\[char46]spa, arp\[char46]sha); .br \fBnext; .br \fB \fR .fi .IP If the option \fBalways_learn_from_arp_request\fR is \fBfalse\fR, the above actions will be: .IP .nf \fB .br \fBreg9[2] = lookup_arp(inport, arp\[char46]spa, arp\[char46]sha); .br \fBreg9[3] = 1; .br \fBnext; .br \fB \fR .fi .IP \(bu A priority\-100 flow which matches on IPv6 Neighbor Discovery advertisement packet and applies the actions if the option \fBalways_learn_from_arp_request\fR is \fBtrue\fR: .IP .nf \fB .br \fBreg9[2] = lookup_nd(inport, nd\[char46]target, nd\[char46]tll); .br \fBnext; .br \fB \fR .fi .IP If the option \fBalways_learn_from_arp_request\fR is \fBfalse\fR, the above actions will be: .IP .nf \fB .br \fBreg9[2] = lookup_nd(inport, nd\[char46]target, nd\[char46]tll); .br \fBreg9[3] = 1; .br \fBnext; .br \fB \fR .fi .IP \(bu A priority\-100 flow which matches on IPv6 Neighbor Discovery solicitation packet and applies the actions if the option \fBalways_learn_from_arp_request\fR is \fBtrue\fR: .IP .nf \fB .br \fBreg9[2] = lookup_nd(inport, ip6\[char46]src, nd\[char46]sll); .br \fBnext; .br \fB \fR .fi .IP If the option \fBalways_learn_from_arp_request\fR 
is \fBfalse\fR, the above actions will be: .IP .nf \fB .br \fBreg9[2] = lookup_nd(inport, ip6\[char46]src, nd\[char46]sll); .br \fBreg9[3] = lookup_nd_ip(inport, ip6\[char46]src); .br \fBnext; .br \fB \fR .fi .IP \(bu A priority\-0 fallback flow that matches all packets and applies the action \fBreg9[2] = 1; next;\fR advancing the packet to the next table\[char46] .RE .ST "Ingress Table 2: Neighbor learning" .PP .PP This table adds flows to learn the mac bindings from the ARP and IPv6 Neighbor Solicitation/Advertisement packets if it is needed according to the lookup results from the previous stage\[char46] .PP .PP reg9[2] will be \fB1\fR if the \fBlookup_arp/lookup_nd\fR in the previous table was successful or skipped, meaning no need to learn mac binding from the packet\[char46] .PP .PP reg9[3] will be \fB1\fR if the \fBlookup_arp_ip/lookup_nd_ip\fR in the previous table was successful or skipped, meaning it is ok to learn mac binding from the packet (if reg9[2] is 0)\[char46] .RS .IP \(bu A priority\-100 flow with the match \fBreg9[2] == 1 || reg9[3] == 0\fR and advances the packet to the next table as there is no need to learn the neighbor\[char46] .IP \(bu A priority\-95 flow with the match \fBnd_ns && (ip6\[char46]src == 0 || nd\[char46]sll == 0)\fR and applies the action \fBnext;\fR .IP \(bu A priority\-90 flow with the match \fBarp\fR and applies the action \fBput_arp(inport, arp\[char46]spa, arp\[char46]sha); next;\fR .IP \(bu A priority\-95 flow with the match \fBnd_na && nd\[char46]tll == 0\fR and applies the action \fBput_nd(inport, nd\[char46]target, eth\[char46]src); next;\fR .IP \(bu A priority\-90 flow with the match \fBnd_na\fR and applies the action \fBput_nd(inport, nd\[char46]target, nd\[char46]tll); next;\fR .IP \(bu A priority\-90 flow with the match \fBnd_ns\fR and applies the action \fBput_nd(inport, ip6\[char46]src, nd\[char46]sll); next;\fR .IP \(bu A priority\-0 logical flow that matches all packets not already handled (match \fB1\fR) and 
drops them (action \fBdrop;\fR)\[char46] .RE .ST "Ingress Table 3: IP Input" .PP .PP This table is the core of the logical router datapath functionality\[char46] It contains the following flows to implement very basic IP host functionality\[char46] .RS .IP \(bu For each \fBdnat_and_snat\fR NAT rule on distributed logical routers or gateway routers with a gateway port configured with \fBoptions:gateway_mtu\fR to a valid integer value \fIM\fR, a priority\-160 flow with the match \fBinport == \fILRP\fB && REGBIT_PKT_LARGER && REGBIT_EGRESS_LOOPBACK == 0\fR, where \fILRP\fR is the logical router port, applies the following actions for IPv4 and IPv6 respectively: .IP .nf \fB .br \fBicmp4_error { .br \fB icmp4\[char46]type = 3; /* Destination Unreachable\[char46] */ .br \fB icmp4\[char46]code = 4; /* Frag Needed and DF was Set\[char46] */ .br \fB icmp4\[char46]frag_mtu = \fR\fIM\fB\fR; .br \fB eth\[char46]dst = eth\[char46]src; .br \fB eth\[char46]src = \fR\fIE\fB\fR; .br \fB ip4\[char46]dst = ip4\[char46]src; .br \fB ip4\[char46]src = \fR\fII\fB\fR; .br \fB ip\[char46]ttl = 255; .br \fB REGBIT_EGRESS_LOOPBACK = 1; .br \fB REGBIT_PKT_LARGER = 0; .br \fB outport = \fR\fILRP\fB\fR; .br \fB flags\[char46]loopback = 1; .br \fB output; .br \fB}; .br \fB .br \fBicmp6_error { .br \fB icmp6\[char46]type = 2; .br \fB icmp6\[char46]code = 0; .br \fB icmp6\[char46]frag_mtu = \fR\fIM\fB\fR; .br \fB eth\[char46]dst = eth\[char46]src; .br \fB eth\[char46]src = \fR\fIE\fB\fR; .br \fB ip6\[char46]dst = ip6\[char46]src; .br \fB ip6\[char46]src = \fR\fII\fB\fR; .br \fB ip\[char46]ttl = 255; .br \fB REGBIT_EGRESS_LOOPBACK = 1; .br \fB REGBIT_PKT_LARGER = 0; .br \fB outport = \fR\fILRP\fB\fR; .br \fB flags\[char46]loopback = 1; .br \fB output; .br \fB}; .br \fB \fR .fi .IP where \fIE\fR and \fII\fR are the NAT rule external mac and IP respectively\[char46] .IP \(bu For distributed logical routers or gateway routers with gateway port configured with \fBoptions:gateway_mtu\fR to a valid integer 
value, a priority\-150 flow with the match \fBinport == \fILRP\fB && REGBIT_PKT_LARGER && REGBIT_EGRESS_LOOPBACK == 0\fR, where \fILRP\fR is the logical router port, applies the following actions for IPv4 and IPv6 respectively: .IP .nf \fB .br \fBicmp4_error { .br \fB icmp4\[char46]type = 3; /* Destination Unreachable\[char46] */ .br \fB icmp4\[char46]code = 4; /* Frag Needed and DF was Set\[char46] */ .br \fB icmp4\[char46]frag_mtu = \fR\fIM\fB\fR; .br \fB eth\[char46]dst = \fR\fIE\fB\fR; .br \fB ip4\[char46]dst = ip4\[char46]src; .br \fB ip4\[char46]src = \fR\fII\fB\fR; .br \fB ip\[char46]ttl = 255; .br \fB REGBIT_EGRESS_LOOPBACK = 1; .br \fB REGBIT_PKT_LARGER = 0; .br \fB next(pipeline=ingress, table=0); .br \fB}; .br \fB .br \fBicmp6_error { .br \fB icmp6\[char46]type = 2; .br \fB icmp6\[char46]code = 0; .br \fB icmp6\[char46]frag_mtu = \fR\fIM\fB\fR; .br \fB eth\[char46]dst = \fR\fIE\fB\fR; .br \fB ip6\[char46]dst = ip6\[char46]src; .br \fB ip6\[char46]src = \fR\fII\fB\fR; .br \fB ip\[char46]ttl = 255; .br \fB REGBIT_EGRESS_LOOPBACK = 1; .br \fB REGBIT_PKT_LARGER = 0; .br \fB next(pipeline=ingress, table=0); .br \fB}; .br \fB \fR .fi .IP \(bu For each NAT entry of a distributed logical router (with distributed gateway router port(s)) of type \fBsnat\fR, a priority\-120 flow with the match \fBinport == \fIP\fB && ip4\[char46]src == \fIA\fB\fR advances the packet to the next pipeline, where \fIP\fR is the distributed logical router port corresponding to the NAT entry (specified or inferred) and \fIA\fR is the \fBexternal_ip\fR set in the NAT entry\[char46] If \fIA\fR is an IPv6 address, then \fBip6\[char46]src\fR is used for the match\[char46] .IP The above flow is required to handle the routing of the east/west NAT traffic\[char46] .IP \(bu For each BFD port the two following priority\-110 flows are added to manage BFD traffic: .RS .IP \(bu if \fBip4\[char46]src\fR or \fBip6\[char46]src\fR is any IP address owned by the router port and \fBudp\[char46]dst == 3784 
\fR, the packet is advanced to the next pipeline stage\[char46] .IP \(bu if \fBip4\[char46]dst\fR or \fBip6\[char46]dst\fR is any IP address owned by the router port and \fBudp\[char46]dst == 3784 \fR, the \fBhandle_bfd_msg\fR action is executed\[char46] .RE .IP \(bu L3 admission control: Priority\-120 flows allow IGMP and MLD packets if the router has logical ports that have \fBoptions\fR \fB:mcast_flood=\(cqtrue\(cq\fR\[char46] .IP \(bu L3 admission control: A priority\-100 flow drops packets that match any of the following: .RS .IP \(bu \fBip4\[char46]src[28\[char46]\[char46]31] == 0xe\fR (multicast source) .IP \(bu \fBip4\[char46]src == 255\[char46]255\[char46]255\[char46]255\fR (broadcast source) .IP \(bu \fBip4\[char46]src == 127\[char46]0\[char46]0\[char46]0/8 || ip4\[char46]dst == 127\[char46]0\[char46]0\[char46]0/8\fR (localhost source or destination) .IP \(bu \fBip4\[char46]src == 0\[char46]0\[char46]0\[char46]0/8 || ip4\[char46]dst == 0\[char46]0\[char46]0\[char46]0/8\fR (zero network source or destination) .IP \(bu \fBip4\[char46]src\fR or \fBip6\[char46]src\fR is any IP address owned by the router, unless the packet was recirculated due to egress loopback as indicated by \fBREGBIT_EGRESS_LOOPBACK\fR\[char46] .IP \(bu \fBip4\[char46]src\fR is the broadcast address of any IP network known to the router\[char46] .RE .IP \(bu A priority\-100 flow parses DHCPv6 replies from IPv6 prefix delegation routers (\fBudp\[char46]src == 547 && udp\[char46]dst == 546\fR)\[char46] The \fBhandle_dhcpv6_reply\fR action is used to send IPv6 prefix delegation messages to the delegation router\[char46] .IP \(bu ICMP echo reply\[char46] These flows reply to ICMP echo requests received for the router\(cqs IP address\[char46] Let \fIA\fR be an IP address owned by a router port\[char46] Then, for each \fIA\fR that is an IPv4 address, a priority\-90 flow matches on \fBip4\[char46]dst == \fIA\fB\fR and \fBicmp4\[char46]type == 8 && icmp4\[char46]code == 0\fR (ICMP echo 
request)\[char46] For each \fIA\fR that is an IPv6 address, a priority\-90 flow matches on \fBip6\[char46]dst == \fIA\fB\fR and \fBicmp6\[char46]type == 128 && icmp6\[char46]code == 0\fR (ICMPv6 echo request)\[char46] The port of the router that receives the echo request does not matter\[char46] Also, the \fBip\[char46]ttl\fR of the echo request packet is not checked, so it complies with RFC 1812, section 4\[char46]2\[char46]2\[char46]9\[char46] Flows for ICMPv4 echo requests use the following actions: .IP .nf \fB .br \fBip4\[char46]dst <\-> ip4\[char46]src; .br \fBip\[char46]ttl = 255; .br \fBicmp4\[char46]type = 0; .br \fBflags\[char46]loopback = 1; .br \fBnext; .br \fB \fR .fi .IP Flows for ICMPv6 echo requests use the following actions: .IP .nf \fB .br \fBip6\[char46]dst <\-> ip6\[char46]src; .br \fBip\[char46]ttl = 255; .br \fBicmp6\[char46]type = 129; .br \fBflags\[char46]loopback = 1; .br \fBnext; .br \fB \fR .fi .IP \(bu Reply to ARP requests\[char46] .IP These flows reply to ARP requests for the router\(cqs own IP address\[char46] The ARP requests are handled only if the requestor\(cqs IP belongs to the same subnets of the logical router port\[char46] For each router port \fIP\fR that owns IP address \fIA\fR, which belongs to subnet \fIS\fR with prefix length \fIL\fR, and Ethernet address \fIE\fR, a priority\-90 flow matches \fBinport == \fIP\fB && arp\[char46]spa == \fIS\fB/\fIL\fB && arp\[char46]op == 1 && arp\[char46]tpa == \fIA\fB\fR (ARP request) with the following actions: .IP .nf \fB .br \fBeth\[char46]dst = eth\[char46]src; .br \fBeth\[char46]src = xreg0[0\[char46]\[char46]47]; .br \fBarp\[char46]op = 2; /* ARP reply\[char46] */ .br \fBarp\[char46]tha = arp\[char46]sha; .br \fBarp\[char46]sha = xreg0[0\[char46]\[char46]47]; .br \fBarp\[char46]tpa = arp\[char46]spa; .br \fBarp\[char46]spa = \fR\fIA\fB\fR; .br \fBoutport = inport; .br \fBflags\[char46]loopback = 1; .br \fBoutput; .br \fB \fR .fi .IP For the gateway port on a distributed logical 
router (where one of the logical router ports specifies a gateway chassis), the above flows are only programmed on the gateway port instance on the gateway chassis\[char46] This behavior avoids generation of multiple ARP responses from different chassis, and allows upstream MAC learning to point to the gateway chassis\[char46] .IP For the logical router port with the option \fBreside\-on\-redirect\-chassis\fR set (which is centralized), the above flows are only programmed on the gateway port instance on the gateway chassis (if the logical router has a distributed gateway port)\[char46] This behavior avoids generation of multiple ARP responses from different chassis, and allows upstream MAC learning to point to the gateway chassis\[char46] .IP \(bu Reply to IPv6 Neighbor Solicitations\[char46] These flows reply to Neighbor Solicitation requests for the router\(cqs own IPv6 address and populate the logical router\(cqs mac binding table\[char46] .IP For each router port \fIP\fR that owns IPv6 address \fIA\fR, solicited node address \fIS\fR, and Ethernet address \fIE\fR, a priority\-90 flow matches \fBinport == \fIP\fB && nd_ns && ip6\[char46]dst == {\fIA\fB, \fIS\fB} && nd\[char46]target == \fIA\fB\fR with the following actions: .IP .nf \fB .br \fBnd_na_router { .br \fB eth\[char46]src = xreg0[0\[char46]\[char46]47]; .br \fB ip6\[char46]src = \fR\fIA\fB\fR; .br \fB nd\[char46]target = \fR\fIA\fB\fR; .br \fB nd\[char46]tll = xreg0[0\[char46]\[char46]47]; .br \fB outport = inport; .br \fB flags\[char46]loopback = 1; .br \fB output; .br \fB}; .br \fB \fR .fi .IP For the gateway port on a distributed logical router (where one of the logical router ports specifies a gateway chassis), the above flows replying to IPv6 Neighbor Solicitations are only programmed on the gateway port instance on the gateway chassis\[char46] This behavior avoids generation of multiple replies from different chassis, and allows upstream MAC learning to point to the gateway chassis\[char46] .IP 
\(bu These flows reply to ARP requests or IPv6 neighbor solicitation for the virtual IP addresses configured in the router for NAT (both DNAT and SNAT) or load balancing\[char46] .IP IPv4: For a configured NAT (both DNAT and SNAT) IP address or a load balancer IPv4 VIP \fIA\fR, for each router port \fIP\fR with Ethernet address \fIE\fR, a priority\-90 flow matches \fBarp\[char46]op == 1 && arp\[char46]tpa == \fIA\fB\fR (ARP request) with the following actions: .IP .nf \fB .br \fBeth\[char46]dst = eth\[char46]src; .br \fBeth\[char46]src = xreg0[0\[char46]\[char46]47]; .br \fBarp\[char46]op = 2; /* ARP reply\[char46] */ .br \fBarp\[char46]tha = arp\[char46]sha; .br \fBarp\[char46]sha = xreg0[0\[char46]\[char46]47]; .br \fBarp\[char46]tpa <\-> arp\[char46]spa; .br \fBoutport = inport; .br \fBflags\[char46]loopback = 1; .br \fBoutput; .br \fB \fR .fi .IP IPv4: For a configured load balancer IPv4 VIP, a similar flow is added with the additional match \fBinport == \fIP\fB\fR if the VIP is reachable from any logical router port of the logical router\[char46] .IP If the router port \fIP\fR is a distributed gateway router port, then the \fBis_chassis_resident(\fIP\fB)\fR is also added in the match condition for the load balancer IPv4 VIP \fIA\fR\[char46] .IP IPv6: For a configured NAT (both DNAT and SNAT) IP address or a load balancer IPv6 VIP \fIA\fR (if the VIP is reachable from any logical router port of the logical router), solicited node address \fIS\fR, for each router port \fIP\fR with Ethernet address \fIE\fR, a priority\-90 flow matches \fBinport == \fIP\fB && nd_ns && ip6\[char46]dst == {\fIA\fB, \fIS\fB} && nd\[char46]target == \fIA\fB\fR with the following actions: .IP .nf \fB .br \fBeth\[char46]dst = eth\[char46]src; .br \fBnd_na { .br \fB eth\[char46]src = xreg0[0\[char46]\[char46]47]; .br \fB nd\[char46]tll = xreg0[0\[char46]\[char46]47]; .br \fB ip6\[char46]src = \fR\fIA\fB\fR; .br \fB nd\[char46]target = \fR\fIA\fB\fR; .br \fB outport = inport; .br \fB 
flags\[char46]loopback = 1; .br \fB output; .br \fB} .br \fB \fR .fi .IP If the router port \fIP\fR is a distributed gateway router port, then the \fBis_chassis_resident(\fIP\fB)\fR is also added in the match condition for the load balancer IPv6 VIP \fIA\fR\[char46] .IP For the gateway port on a distributed logical router with NAT (where one of the logical router ports specifies a gateway chassis): .RS .IP \(bu If the corresponding NAT rule cannot be handled in a distributed manner, then a priority\-92 flow is programmed on the gateway port instance on the gateway chassis\[char46] A priority\-91 drop flow is programmed on the other chassis when ARP requests/NS packets are received on the gateway port\[char46] This behavior avoids generation of multiple ARP responses from different chassis, and allows upstream MAC learning to point to the gateway chassis\[char46] .IP \(bu If the corresponding NAT rule can be handled in a distributed manner, then this flow is only programmed on the gateway port instance where the \fBlogical_port\fR specified in the NAT rule resides\[char46] .IP Some of the actions are different for this case, using the \fBexternal_mac\fR specified in the NAT rule rather than the gateway port\(cqs Ethernet address \fIE\fR: .IP .nf \fB .br \fBeth\[char46]src = \fR\fIexternal_mac\fB\fR; .br \fBarp\[char46]sha = \fR\fIexternal_mac\fB\fR; .br \fB \fR .fi .IP or in the case of IPv6 neighbor solicitation: .IP .nf \fB .br \fBeth\[char46]src = \fR\fIexternal_mac\fB\fR; .br \fBnd\[char46]tll = \fR\fIexternal_mac\fB\fR; .br \fB \fR .fi .IP This behavior avoids generation of multiple ARP responses from different chassis, and allows upstream MAC learning to point to the correct chassis\[char46] .RE .IP \(bu Priority\-85 flows that drop ARP and IPv6 Neighbor Discovery packets\[char46] .IP \(bu A priority\-84 flow explicitly allows IPv6 multicast traffic that is supposed to reach the router pipeline (i\[char46]e\[char46], router solicitation and router 
advertisement packets)\[char46] .IP \(bu A priority\-83 flow explicitly drops IPv6 multicast traffic that is destined to reserved multicast groups\[char46] .IP \(bu A priority\-82 flow allows IP multicast traffic if \fBoptions\fR:mcast_relay=\(cqtrue\(cq, otherwise drops it\[char46] .IP \(bu UDP port unreachable\[char46] Priority\-80 flows generate ICMP port unreachable messages in reply to UDP datagrams directed to the router\(cqs IP address, except in the special case of gateways, which accept traffic directed to a router IP for load balancing and NAT purposes\[char46] .IP These flows should not match IP fragments with nonzero offset\[char46] .IP \(bu TCP reset\[char46] Priority\-80 flows generate TCP reset messages in reply to TCP datagrams directed to the router\(cqs IP address, except in the special case of gateways, which accept traffic directed to a router IP for load balancing and NAT purposes\[char46] .IP These flows should not match IP fragments with nonzero offset\[char46] .IP \(bu Protocol or address unreachable\[char46] Priority\-70 flows generate ICMP protocol or address unreachable messages for IPv4 and IPv6 respectively in reply to packets directed to the router\(cqs IP address on IP protocols other than UDP, TCP, and ICMP, except in the special case of gateways, which accept traffic directed to a router IP for load balancing purposes\[char46] .IP These flows should not match IP fragments with nonzero offset\[char46] .IP \(bu Drop other IP traffic to this router\[char46] These flows drop any other traffic destined to an IP address of this router that is not already handled by one of the flows above, which amounts to ICMP (other than echo requests) and fragments with nonzero offsets\[char46] For each IP address \fIA\fR owned by the router, a priority\-60 flow matches \fBip4\[char46]dst == \fIA\fB\fR or \fBip6\[char46]dst == \fIA\fB\fR and drops the traffic\[char46] An exception is made and the above flow is not added if the router port\(cqs own IP 
address is used to SNAT packets passing through that router or if it is used as a load balancer VIP\[char46] .RE .PP .PP The flows above handle all of the traffic that might be directed to the router itself\[char46] The following flows (with lower priorities) handle the remaining traffic, potentially for forwarding: .RS .IP \(bu Drop Ethernet local broadcast\[char46] A priority\-50 flow with match \fBeth\[char46]bcast\fR drops traffic destined to the local Ethernet broadcast address\[char46] By definition this traffic should not be forwarded\[char46] .IP \(bu ICMP time exceeded\[char46] For each router port \fIP\fR, whose IP address is \fIA\fR, a priority\-100 flow with match \fBinport == \fIP\fB && ip\[char46]ttl == {0, 1} && !ip\[char46]later_frag\fR matches packets whose TTL has expired, with the following actions to send an ICMP time exceeded reply for IPv4 and IPv6 respectively: .IP .nf \fB .br \fBicmp4 { .br \fB icmp4\[char46]type = 11; /* Time exceeded\[char46] */ .br \fB icmp4\[char46]code = 0; /* TTL exceeded in transit\[char46] */ .br \fB ip4\[char46]dst = ip4\[char46]src; .br \fB ip4\[char46]src = \fR\fIA\fB\fR; .br \fB ip\[char46]ttl = 254; .br \fB next; .br \fB}; .br \fB .br \fBicmp6 { .br \fB icmp6\[char46]type = 3; /* Time exceeded\[char46] */ .br \fB icmp6\[char46]code = 0; /* TTL exceeded in transit\[char46] */ .br \fB ip6\[char46]dst = ip6\[char46]src; .br \fB ip6\[char46]src = \fR\fIA\fB\fR; .br \fB ip\[char46]ttl = 254; .br \fB next; .br \fB}; .br \fB \fR .fi .IP \(bu TTL discard\[char46] A priority\-30 flow with match \fBip\[char46]ttl == {0, 1}\fR and actions \fBdrop;\fR drops other packets whose TTL has expired, that should not receive an ICMP error reply (i\[char46]e\[char46], fragments with nonzero offset)\[char46] .IP \(bu Next table\[char46] A priority\-0 flow matches all packets that aren\(cqt already handled and uses the action \fBnext;\fR to feed them to the next table\[char46] .RE .ST "Ingress Table 4: UNSNAT" .PP .PP This is for already 
established connections\(cq reverse traffic, i\[char46]e\[char46], SNAT has already been done in the egress pipeline and now the packet has entered the ingress pipeline as part of a reply\[char46] It is unSNATted here\[char46] .PP .PP Ingress Table 4: UNSNAT on Gateway and Distributed Routers .RS .IP \(bu If the Router (Gateway or Distributed) is configured with load balancers, then the logical flows below are added: .IP For each IPv4 address \fIA\fR defined as a load balancer VIP with the protocol \fIP\fR (and the protocol port \fIB\fR, if defined) that is also present as an \fBexternal_ip\fR in the NAT table, a priority\-120 logical flow is added with the match \fBip4 && ip4\[char46]dst == \fIA\fB && \fIP\fB\fR with the action \fBnext;\fR to advance the packet to the next table\[char46] If the load balancer has protocol port \fIB\fR defined, then the match also has \fB\fIP\fB\[char46]dst == \fIB\fB\fR\[char46] .IP The above flows are also added for IPv6 load balancers\[char46] .RE .PP .PP Ingress Table 4: UNSNAT on Gateway Routers .RS .IP \(bu If the Gateway router has been configured to force SNAT any previously DNATted packets to \fIB\fR, a priority\-110 flow matches \fBip && ip4\[char46]dst == \fIB\fB\fR or \fBip && ip6\[char46]dst == \fIB\fB\fR with an action \fBct_snat; \fR\[char46] .IP If the Gateway router is configured with \fBlb_force_snat_ip=router_ip\fR then for every logical router port \fIP\fR attached to the Gateway router with the router ip \fIB\fR, a priority\-110 flow is added with the match \fBinport == \fIP\fB && ip4\[char46]dst == \fIB\fB\fR or \fBinport == \fIP\fB && ip6\[char46]dst == \fIB\fB\fR with an action \fBct_snat; \fR\[char46] .IP If the Gateway router has been configured to force SNAT any previously load-balanced packets to \fIB\fR, a priority\-100 flow matches \fBip && ip4\[char46]dst == \fIB\fB\fR or \fBip && ip6\[char46]dst == \fIB\fB\fR with an action \fBct_snat; \fR\[char46] .IP For each NAT configuration in the OVN Northbound database, that 
asks to change the source IP address of a packet from \fIA\fR to \fIB\fR, a priority\-90 flow matches \fBip && ip4\[char46]dst == \fIB\fB\fR or \fBip && ip6\[char46]dst == \fIB\fB\fR with an action \fBct_snat; \fR\[char46] If the NAT rule is of type dnat_and_snat and has \fBstateless=true\fR in the options, then the action would be \fBnext;\fR\[char46] .IP A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .PP .PP Ingress Table 4: UNSNAT on Distributed Routers .RS .IP \(bu For each NAT configuration in the OVN Northbound database, that asks to change the source IP address of a packet from \fIA\fR to \fIB\fR, two priority\-100 flows are added\[char46] .IP If the NAT rule cannot be handled in a distributed manner, then the priority\-100 flows below are only programmed on the gateway chassis\[char46] .RS .IP \(bu The first flow matches \fBip && ip4\[char46]dst == \fIB\fB && inport == \fIGW\fB && flags\[char46]loopback == 0\fR or \fBip && ip6\[char46]dst == \fIB\fB && inport == \fIGW\fB && flags\[char46]loopback == 0\fR where \fIGW\fR is the distributed gateway port corresponding to the NAT rule (specified or inferred), with an action \fBct_snat_in_czone;\fR to unSNAT in the common zone\[char46] If the NAT rule is of type dnat_and_snat and has \fBstateless=true\fR in the options, then the action would be \fBnext;\fR\[char46] .IP If the NAT entry is of type \fBsnat\fR, then there is an additional match \fBis_chassis_resident(\fIcr-GW\fB) \fR where \fIcr-GW\fR is the chassis resident port of \fIGW\fR\[char46] .IP \(bu The second flow matches \fBip && ip4\[char46]dst == \fIB\fB && inport == \fIGW\fB && flags\[char46]loopback == 1 && flags\[char46]use_snat_zone == 1\fR or \fBip && ip6\[char46]dst == \fIB\fB && inport == \fIGW\fB && flags\[char46]loopback == 1 && flags\[char46]use_snat_zone == 1\fR where \fIGW\fR is the distributed gateway port corresponding to the NAT rule (specified or inferred), with an action \fBct_snat;\fR to unSNAT in the 
snat zone\[char46] If the NAT rule is of type dnat_and_snat and has \fBstateless=true\fR in the options, then the action would be \fBip4/6\[char46]dst=(\fIB\fB)\fR\[char46] .IP If the NAT entry is of type \fBsnat\fR, then there is an additional match \fBis_chassis_resident(\fIcr-GW\fB) \fR where \fIcr-GW\fR is the chassis resident port of \fIGW\fR\[char46] .RE .IP A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .ST "Ingress Table 5: DEFRAG" .PP .PP This is to send packets to connection tracker for tracking and defragmentation\[char46] It contains a priority\-0 flow that simply moves traffic to the next table\[char46] .PP .PP If load balancing rules with only virtual IP addresses are configured in \fBOVN_Northbound\fR database for a Gateway router, a priority\-100 flow is added for each configured virtual IP address \fIVIP\fR\[char46] For IPv4 \fIVIPs\fR the flow matches \fBip && ip4\[char46]dst == \fIVIP\fB\fR\[char46] For IPv6 \fIVIPs\fR, the flow matches \fBip && ip6\[char46]dst == \fIVIP\fB\fR\[char46] The flow applies the action \fBreg0 = \fIVIP\fB; ct_dnat;\fR (or \fBxxreg0\fR for IPv6) to send IP packets to the connection tracker for packet de-fragmentation and to dnat the destination IP for the committed connection before sending it to the next table\[char46] .PP .PP If load balancing rules with virtual IP addresses and ports are configured in \fBOVN_Northbound\fR database for a Gateway router, a priority\-110 flow is added for each configured virtual IP address \fIVIP\fR, protocol \fIPROTO\fR and port \fIPORT\fR\[char46] For IPv4 \fIVIPs\fR the flow matches \fBip && ip4\[char46]dst == \fIVIP\fB && \fIPROTO\fB && \fIPROTO\fB\[char46]dst == \fIPORT\fB\fR\[char46] For IPv6 \fIVIPs\fR, the flow matches \fBip && ip6\[char46]dst == \fIVIP\fB && \fIPROTO\fB && \fIPROTO\fB\[char46]dst == \fIPORT\fB\fR\[char46] The flow applies the action \fBreg0 = \fIVIP\fB; reg9[16\[char46]\[char46]31] = \fIPROTO\fB\[char46]dst; ct_dnat;\fR (or 
\fBxxreg0\fR for IPv6) to send IP packets to the connection tracker for packet de-fragmentation and to dnat the destination IP for the committed connection before sending it to the next table\[char46] .PP .PP If ECMP routes with symmetric reply are configured in the \fBOVN_Northbound\fR database for a gateway router, a priority\-100 flow is added for each router port on which symmetric replies are configured\[char46] The matching logic for these ports essentially reverses the configured logic of the ECMP route\[char46] So for instance, a route with a destination routing policy will instead match if the source IP address matches the static route\(cqs prefix\[char46] The flow uses the actions \fBchk_ecmp_nh_mac(); ct_next\fR or \fBchk_ecmp_nh(); ct_next\fR to send IP packets to table \fB76\fR or to table \fB77\fR in order to check if source information is already stored by OVN and then to the connection tracker for packet de-fragmentation and tracking before sending it to the next table\[char46] .PP .PP If load balancing rules are configured in \fBOVN_Northbound\fR database for a Gateway router, a priority\-50 flow is added that matches \fBicmp || icmp6\fR with an action of \fBct_dnat;\fR\[char46] This allows potentially related ICMP traffic to pass through CT\[char46] .ST "Ingress Table 6: Load balancing affinity check" .PP .PP Load balancing affinity check table contains the following logical flows: .RS .IP \(bu For all the configured load balancing rules for a logical router where a positive affinity timeout is specified in \fBoptions\fR column, that includes an L4 port \fIPORT\fR of protocol \fIP\fR and IPv4 or IPv6 address \fIVIP\fR, a priority\-100 flow is added that matches on \fBct\[char46]new && ip && reg0 == \fIVIP\fB && \fIP\fB && reg9[16\[char46]\[char46]31] == \fR \fB\fIPORT\fB\fR (\fBxxreg0 == \fIVIP \fB\fR in the IPv6 case) with an action of \fBreg9[6] = chk_lb_aff(); next;\fR\[char46] .IP \(bu A priority\-0 flow is added which matches on all packets and applies the action \fBnext;\fR\[char46] 
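.IP As a rough illustration only, and not as a description of how \fBovn\-northd\fR is implemented, the match and action text of the priority\-100 affinity check flow could be assembled as in the Python sketch below\[char46] The function name \fIlb_affinity_check_match\fR, the constant \fIAFFINITY_CHECK_ACTION\fR and the sample addresses are hypothetical; only the register names and the shape of the match come from the description in this table\[char46]

```python
import ipaddress


def lb_affinity_check_match(vip: str, proto: str, port: int) -> str:
    """Build the match text for one load balancer rule (illustrative only).

    Hypothetical helper: it mirrors the wording of this manual page and is
    not taken from the ovn-northd sources.
    """
    # IPv4 VIPs are compared against reg0, IPv6 VIPs against xxreg0.
    reg = "reg0" if ipaddress.ip_address(vip).version == 4 else "xxreg0"
    # reg9[16..31] carries the L4 destination port stored by the DEFRAG table.
    return f"ct.new && ip && {reg} == {vip} && {proto} && reg9[16..31] == {port}"


# Action paired with the match above, as stated in the table description.
AFFINITY_CHECK_ACTION = "reg9[6] = chk_lb_aff(); next;"

if __name__ == "__main__":
    print(lb_affinity_check_match("10.0.0.10", "tcp", 80))
    print(lb_affinity_check_match("fd00::10", "tcp", 8080))
```

The sketch only shows how the IPv4 and IPv6 cases differ in the register they compare; everything else in the match is the same for both address families\[char46]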
.RE .ST "Ingress Table 7: DNAT" .PP .PP Packets enter the pipeline with a destination IP address that needs to be DNATted from a virtual IP address to a real IP address\[char46] Packets in the reverse direction need to be unDNATed\[char46] .PP .PP Ingress Table 7: Load balancing DNAT rules .PP .PP The following load balancing DNAT flows are added for Gateway router or Router with gateway port\[char46] These flows are programmed only on the gateway chassis\[char46] These flows do not get programmed for load balancers with IPv6 \fIVIPs\fR\[char46] .RS .IP \(bu For all the configured load balancing rules for a logical router where a positive affinity timeout is specified in \fBoptions\fR column, that includes an L4 port \fIPORT\fR of protocol \fIP\fR and IPv4 or IPv6 address \fIVIP\fR, a priority\-150 flow is added that matches on \fBreg9[6] == 1 && ct\[char46]new && ip && reg0 == \fIVIP\fB && \fIP\fB && reg9[16\[char46]\[char46]31] == \fR \fB\fIPORT\fB\fR (\fBxxreg0 == \fIVIP\fB\fR in the IPv6 case) with an action of \fBct_lb_mark(\fIargs\fB) \fR, where \fIargs\fR contains comma separated IP addresses (and optional port numbers) to load balance to\[char46] The address family of the IP addresses of \fIargs\fR is the same as the address family of \fIVIP\fR\[char46] .IP \(bu If controller_event has been enabled for all the configured load balancing rules for a Gateway router or Router with gateway port in \fBOVN_Northbound\fR database that does not have configured backends, a priority\-130 flow is added to trigger ovn-controller events whenever the chassis receives a packet for that particular VIP\[char46] If \fBevent\-elb\fR meter has been previously created, it will be associated to the empty_lb logical flow\[char46] .IP \(bu For all the configured load balancing rules for a Gateway router or Router with gateway port in \fBOVN_Northbound\fR database that includes an L4 port \fIPORT\fR of protocol \fIP\fR and IPv4 or IPv6 address \fIVIP\fR, a priority\-120 flow that matches on 
\fBct\[char46]new && !ct\[char46]rel && ip && reg0 == \fIVIP\fB && \fIP\fB && reg9[16\[char46]\[char46]31] == \fR \fB\fIPORT\fB\fR (\fBxxreg0 == \fIVIP\fB \fR in the IPv6 case) with an action of \fBct_lb_mark(\fIargs\fB)\fR, where \fIargs\fR contains comma separated IPv4 or IPv6 addresses (and optional port numbers) to load balance to\[char46] If the router is configured to force SNAT any load-balanced packets, the above action will be replaced by \fBflags\[char46]force_snat_for_lb = 1; ct_lb_mark(\fIargs\fB);\fR\[char46] If the load balancing rule is configured with \fBskip_snat\fR set to true, the above action will be replaced by \fBflags\[char46]skip_snat_for_lb = 1; ct_lb_mark(\fIargs\fB);\fR\[char46] If health check is enabled, then \fIargs\fR will only contain those endpoints whose service monitor status entry in \fBOVN_Southbound\fR db is either \fBonline\fR or empty\[char46] .IP The previous table \fBlr_in_defrag\fR sets the register \fBreg0\fR (or \fBxxreg0\fR for IPv6) and does \fBct_dnat\fR\[char46] Hence for established traffic, this table just advances the packet to the next stage\[char46] .IP \(bu For all the configured load balancing rules for a router in \fBOVN_Northbound\fR database that includes a L4 port \fIPORT\fR of protocol \fIP\fR and IPv4 or IPv6 address \fIVIP\fR, a priority\-120 flow that matches on \fBct\[char46]est && !ct\[char46]rel && ip4 && reg0 == \fIVIP\fB && \fIP\fB && reg9[16\[char46]\[char46]31] == \fR \fB\fIPORT\fB\fR (\fBip6\fR and \fBxxreg0 == \fIVIP\fB\fR in the IPv6 case) with an action of \fBnext;\fR\[char46] If the router is configured to force SNAT any load-balanced packets, the above action will be replaced by \fBflags\[char46]force_snat_for_lb = 1; next;\fR\[char46] If the load balancing rule is configured with \fBskip_snat\fR set to true, the above action will be replaced by \fBflags\[char46]skip_snat_for_lb = 1; next;\fR\[char46] .IP The previous table \fBlr_in_defrag\fR sets the register \fBreg0\fR (or \fBxxreg0\fR 
for IPv6) and does \fBct_dnat\fR\[char46] Hence for established traffic, this table just advances the packet to the next stage\[char46] .IP \(bu For all the configured load balancing rules for a router in \fBOVN_Northbound\fR database that includes just an IP address \fIVIP\fR to match on, a priority\-110 flow that matches on \fBct\[char46]new && !ct\[char46]rel && ip4 && reg0 == \fIVIP\fB\fR (\fBip6\fR and \fBxxreg0 == \fIVIP\fB\fR in the IPv6 case) with an action of \fBct_lb_mark(\fIargs\fB)\fR, where \fIargs\fR contains comma separated IPv4 or IPv6 addresses\[char46] If the router is configured to force SNAT any load-balanced packets, the above action will be replaced by \fBflags\[char46]force_snat_for_lb = 1; ct_lb_mark(\fIargs\fB);\fR\[char46] If the load balancing rule is configured with \fBskip_snat\fR set to true, the above action will be replaced by \fBflags\[char46]skip_snat_for_lb = 1; ct_lb_mark(\fIargs\fB);\fR\[char46] .IP The previous table \fBlr_in_defrag\fR sets the register \fBreg0\fR (or \fBxxreg0\fR for IPv6) and does \fBct_dnat\fR\[char46] Hence for established traffic, this table just advances the packet to the next stage\[char46] .IP \(bu For all the configured load balancing rules for a router in \fBOVN_Northbound\fR database that includes just an IP address \fIVIP\fR to match on, a priority\-110 flow that matches on \fBct\[char46]est && !ct\[char46]rel && ip4 && reg0 == \fIVIP\fB\fR (or \fBip6\fR and \fBxxreg0 == \fIVIP\fB\fR) with an action of \fBnext;\fR\[char46] If the router is configured to force SNAT any load-balanced packets, the above action will be replaced by \fBflags\[char46]force_snat_for_lb = 1; next;\fR\[char46] If the load balancing rule is configured with \fBskip_snat\fR set to true, the above action will be replaced by \fBflags\[char46]skip_snat_for_lb = 1; next;\fR\[char46] .IP The previous table \fBlr_in_defrag\fR sets the register \fBreg0\fR (or \fBxxreg0\fR for IPv6) and does \fBct_dnat\fR\[char46] Hence for established 
traffic, this table just advances the packet to the next stage\[char46] .IP \(bu If the load balancer is created with \fB\-\-reject\fR option and it has no active backends, a TCP reset segment (for tcp) or an ICMP port unreachable packet (for all other kinds of traffic) will be sent whenever an incoming packet is received for this load-balancer\[char46] Note that using the \fB\-\-reject\fR option disables the empty_lb SB controller event for this load balancer\[char46] .IP \(bu For related traffic, a priority\-50 flow matches \fBct\[char46]rel && !ct\[char46]est && !ct\[char46]new \fR with an action of \fBct_commit_nat;\fR if the router has a load balancer assigned to it, along with two priority\-70 flows that match the \fBskip_snat\fR and \fBforce_snat\fR flags\[char46] .RE .PP .PP Ingress Table 7: DNAT on Gateway Routers .RS .IP \(bu For each configuration in the OVN Northbound database, that asks to change the destination IP address of a packet from \fIA\fR to \fIB\fR, a priority\-100 flow matches \fBip && ip4\[char46]dst == \fIA\fB\fR or \fBip && ip6\[char46]dst == \fIA\fB\fR with an action \fBflags\[char46]loopback = 1; ct_dnat(\fIB\fB);\fR\[char46] If the Gateway router is configured to force SNAT any DNATed packet, the above action will be replaced by \fBflags\[char46]force_snat_for_dnat = 1; flags\[char46]loopback = 1; ct_dnat(\fIB\fB);\fR\[char46] If the NAT rule is of type dnat_and_snat and has \fBstateless=true\fR in the options, then the action would be \fBip4/6\[char46]dst= (\fIB\fB)\fR\[char46] .IP If the NAT rule has \fBallowed_ext_ips\fR configured, then there is an additional match \fBip4\[char46]src == \fIallowed_ext_ips \fB\fR\[char46] Similarly, for IPv6, the match would be \fBip6\[char46]src == \fIallowed_ext_ips\fB\fR\[char46] .IP If the NAT rule has \fBexempted_ext_ips\fR set, then there is an additional flow configured at priority 101\[char46] The flow matches if the source IP is an \fBexempted_ext_ip\fR and the action is \fBnext; 
\fR\[char46] This flow is used to bypass the ct_dnat action for a packet originating from \fBexempted_ext_ips\fR\[char46] .IP \(bu A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .PP .PP Ingress Table 7: DNAT on Distributed Routers .PP .PP On distributed routers, the DNAT table only handles packets with a destination IP address that needs to be DNATted from a virtual IP address to a real IP address\[char46] The unDNAT processing in the reverse direction is handled in a separate table in the egress pipeline\[char46] .RS .IP \(bu For each configuration in the OVN Northbound database, that asks to change the destination IP address of a packet from \fIA\fR to \fIB\fR, a priority\-100 flow matches \fBip && ip4\[char46]dst == \fIA\fB && inport == \fIGW\fB\fR, where \fIGW\fR is the logical router gateway port corresponding to the NAT rule (specified or inferred), with an action \fBct_dnat(\fIB\fB);\fR\[char46] The match will include \fBip6\[char46]dst == \fIA\fB\fR in the IPv6 case\[char46] If the NAT rule is of type dnat_and_snat and has \fBstateless=true\fR in the options, then the action would be \fBip4/6\[char46]dst=(\fIB\fB)\fR\[char46] .IP If the NAT rule cannot be handled in a distributed manner, then the priority\-100 flow above is only programmed on the gateway chassis\[char46] .IP If the NAT rule has \fBallowed_ext_ips\fR configured, then there is an additional match \fBip4\[char46]src == \fIallowed_ext_ips \fB\fR\[char46] Similarly, for IPv6, the match would be \fBip6\[char46]src == \fIallowed_ext_ips\fB\fR\[char46] .IP If the NAT rule has \fBexempted_ext_ips\fR set, then there is an additional flow configured at priority 101\[char46] The flow matches if the source IP is an \fBexempted_ext_ip\fR and the action is \fBnext; \fR\[char46] This flow is used to bypass the ct_dnat action for a packet originating from \fBexempted_ext_ips\fR\[char46] .IP A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .ST 
"Ingress Table 8: Load balancing affinity learn" .PP .PP Load balancing affinity learn table contains the following logical flows: .RS .IP \(bu For all the configured load balancing rules for a logical router where a positive affinity timeout \fIT\fR is specified in \fBoptions \fR column, that includes an L4 port \fIPORT\fR of protocol \fIP\fR and IPv4 or IPv6 address \fIVIP\fR, a priority\-100 flow is added that matches on \fBreg9[6] == 0 && ct\[char46]new && ip && reg0 == \fIVIP\fB && \fIP\fB && reg9[16\[char46]\[char46]31] == \fR \fB\fIPORT\fB\fR (\fBxxreg0 == \fIVIP\fB \fR in the IPv6 case) with an action of \fBcommit_lb_aff(vip = \fIVIP\fB:\fIPORT\fB, backend = \fIbackend ip\fB: \fIbackend port\fB, proto = \fIP\fB, timeout = \fIT\fB);\fR\[char46] .IP \(bu A priority\-0 flow is added which matches on all packets and applies the action \fBnext;\fR\[char46] .RE .ST "Ingress Table 9: ECMP symmetric reply processing" .RS .IP \(bu If ECMP routes with symmetric reply are configured in the \fBOVN_Northbound\fR database for a gateway router, a priority\-100 flow is added for each router port on which symmetric replies are configured\[char46] The matching logic for these ports essentially reverses the configured logic of the ECMP route\[char46] So for instance, a route with a destination routing policy will instead match if the source IP address matches the static route\(cqs prefix\[char46] The flow uses the action \fBct_commit { ct_label\[char46]ecmp_reply_eth = eth\[char46]src; ct_mark\[char46]ecmp_reply_port = \fIK\fB;}; commit_ecmp_nh(); next; \fR to commit the connection and store \fBeth\[char46]src\fR and the ECMP reply port binding tunnel key \fIK\fR in the \fBct_label\fR and the traffic pattern to table \fB76\fR or \fB77\fR\[char46] .RE .ST "Ingress Table 10: IPv6 ND RA option processing" .RS .IP \(bu A priority\-50 logical flow is added for each logical router port configured with IPv6 ND RA options which matches IPv6 ND Router Solicitation packet and applies 
the action \fBput_nd_ra_opts\fR and advances the packet to the next table\[char46] .IP .nf \fB .br \fBreg0[5] = put_nd_ra_opts(\fR\fIoptions\fB\fR);next; .br \fB \fR .fi .IP For a valid IPv6 ND RS packet, this transforms the packet into an IPv6 ND RA reply and sets the RA options to the packet and stores 1 into reg0[5]\[char46] For other kinds of packets, it just stores 0 into reg0[5]\[char46] Either way, it continues to the next table\[char46] .IP \(bu A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .ST "Ingress Table 11: IPv6 ND RA responder" .PP .PP This table implements IPv6 ND RA responder for the IPv6 ND RA replies generated by the previous table\[char46] .RS .IP \(bu A priority\-50 logical flow is added for each logical router port configured with IPv6 ND RA options which matches IPv6 ND RA packets and \fBreg0[5] == 1\fR and responds back to the \fBinport\fR after applying these actions\[char46] If \fBreg0[5]\fR is set to 1, it means that the action \fBput_nd_ra_opts\fR was successful\[char46] .IP .nf \fB .br \fBeth\[char46]dst = eth\[char46]src; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBip6\[char46]dst = ip6\[char46]src; .br \fBip6\[char46]src = \fR\fII\fB\fR; .br \fBoutport = \fR\fIP\fB\fR; .br \fBflags\[char46]loopback = 1; .br \fBoutput; .br \fB \fR .fi .IP where \fIE\fR is the MAC address and \fII\fR is the IPv6 link local address of the logical router port\[char46] .IP (This terminates packet processing in ingress pipeline; the packet does not go to the next ingress table\[char46]) .IP \(bu A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .ST "Ingress Table 12: IP Routing Pre" .PP .PP If a packet arrives at this table from Logical Router Port \fIP\fR which has the \fBoptions:route_table\fR value set, a priority\-100 logical flow with match \fBinport == \(dq\fIP\fB\(dq\fR sets a unique, generated, per-datapath, non-zero 32-bit value in OVS register 7\[char46] This 
register\(cqs value is checked in next table\[char46] If packet didn\(cqt match any configured inport (\fI
\fR route table), register 7 value is set to 0\[char46] .PP .PP This table contains the following logical flows: .RS .IP \(bu A priority\-100 flow with match \fBinport == \(dqLRP_NAME\(dq\fR and an action that sets the route table identifier in reg7\[char46] .IP A priority\-0 logical flow with match \fB1\fR has actions \fBreg7 = 0; next;\fR\[char46] .RE .ST "Ingress Table 13: IP Routing" .PP .PP A packet that arrives at this table is an IP packet that should be routed to the address in \fBip4\[char46]dst\fR or \fBip6\[char46]dst\fR\[char46] This table implements IP routing, setting \fBreg0\fR (or \fBxxreg0\fR for IPv6) to the next-hop IP address (leaving \fBip4\[char46]dst\fR or \fBip6\[char46]dst\fR, the packet\(cqs final destination, unchanged) and advances to the next table for ARP resolution\[char46] It also sets \fBreg1\fR (or \fBxxreg1\fR) to the IP address owned by the selected router port (ingress table \fBARP Request\fR will generate an ARP request, if needed, with \fBreg0\fR as the target protocol address and \fBreg1\fR as the source protocol address)\[char46] .PP .PP For ECMP routes, i\[char46]e\[char46], multiple static routes with the same policy and prefix but different nexthops, the above actions are deferred to the next table\[char46] This table, instead, is responsible for determining the ECMP group id and selecting a member id within the group based on 5-tuple hashing\[char46] It stores the group id in \fBreg8[0\[char46]\[char46]15]\fR and the member id in \fBreg8[16\[char46]\[char46]31]\fR\[char46] This step is skipped with a priority\-10300 rule if the traffic going out the ECMP route is reply traffic, and the ECMP route was configured to use symmetric replies\[char46] Instead, the stored values in conntrack are used to choose the destination\[char46] The \fBct_label\[char46]ecmp_reply_eth\fR tells the destination MAC address to which the packet should be sent\[char46] The \fBct_mark\[char46]ecmp_reply_port\fR tells the logical router port on which the packet should be 
sent\[char46] These values are saved to the conntrack fields when the initial ingress traffic is received over the ECMP route and committed to conntrack\[char46] If \fBREGBIT_KNOWN_ECMP_NH\fR is set, the priority\-10300 flows in this stage set the \fBoutport\fR, while the \fBeth\[char46]dst\fR is set by flows at the ARP/ND Resolution stage\[char46] .PP .PP This table contains the following logical flows: .RS .IP \(bu Priority\-10550 flow that drops IPv6 Router Solicitation/Advertisement packets that were not processed in previous tables\[char46] .IP \(bu Priority\-10550 flows that drop IGMP and MLD packets with source MAC address owned by the router\[char46] These are used to prevent looping statically forwarded IGMP and MLD packets for which TTL is not decremented (it is always 1)\[char46] .IP \(bu Priority\-10500 flows that match IP multicast traffic destined to groups registered on any of the attached switches and set \fBoutport\fR to the associated multicast group that will eventually flood the traffic to all interested attached logical switches\[char46] The flows also decrement TTL\[char46] .IP \(bu Priority\-10460 flows that match IGMP and MLD control packets and set \fBoutport\fR to the \fBMC_STATIC\fR multicast group, which \fBovn\-northd\fR populates with the logical ports that have \fBoptions\fR \fB:mcast_flood=\(cqtrue\(cq\fR\[char46] If no router ports are configured to flood multicast traffic the packets are dropped\[char46] .IP \(bu Priority\-10450 flow that matches unregistered IP multicast traffic, decrements TTL and sets \fBoutport\fR to the \fBMC_STATIC\fR multicast group, which \fBovn\-northd\fR populates with the logical ports that have \fBoptions\fR \fB:mcast_flood=\(cqtrue\(cq\fR\[char46] If no router ports are configured to flood multicast traffic the packets are dropped\[char46] .IP \(bu IPv4 routing table\[char46] For each route to IPv4 network \fIN\fR with netmask \fIM\fR, on router port \fIP\fR with IP address \fIA\fR and Ethernet address 
\fIE\fR, a logical flow with match \fBip4\[char46]dst == \fIN\fB/\fIM\fB\fR, whose priority is the number of 1-bits in \fIM\fR, has the following actions: .IP .nf \fB .br \fBip\[char46]ttl\-\-; .br \fBreg8[0\[char46]\[char46]15] = 0; .br \fBreg0 = \fR\fIG\fB\fR; .br \fBreg1 = \fR\fIA\fB\fR; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBoutport = \fR\fIP\fB\fR; .br \fBflags\[char46]loopback = 1; .br \fBnext; .br \fB \fR .fi .IP (Ingress table 1 already verified that \fBip\[char46]ttl\-\-;\fR will not yield a TTL exceeded error\[char46]) .IP If the route has a gateway, \fIG\fR is the gateway IP address\[char46] Instead, if the route is from a configured static route, \fIG\fR is the next hop IP address\[char46] Else it is \fBip4\[char46]dst\fR\[char46] .IP \(bu IPv6 routing table\[char46] For each route to IPv6 network \fIN\fR with netmask \fIM\fR, on router port \fIP\fR with IP address \fIA\fR and Ethernet address \fIE\fR, a logical flow with match in CIDR notation \fBip6\[char46]dst == \fIN\fB/\fIM\fB\fR, whose priority is the integer value of \fIM\fR, has the following actions: .IP .nf \fB .br \fBip\[char46]ttl\-\-; .br \fBreg8[0\[char46]\[char46]15] = 0; .br \fBxxreg0 = \fR\fIG\fB\fR; .br \fBxxreg1 = \fR\fIA\fB\fR; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBoutport = inport; .br \fBflags\[char46]loopback = 1; .br \fBnext; .br \fB \fR .fi .IP (Ingress table 1 already verified that \fBip\[char46]ttl\-\-;\fR will not yield a TTL exceeded error\[char46]) .IP If the route has a gateway, \fIG\fR is the gateway IP address\[char46] Instead, if the route is from a configured static route, \fIG\fR is the next hop IP address\[char46] Else it is \fBip6\[char46]dst\fR\[char46] .IP If the address \fIA\fR is in the link-local scope, the route will be limited to sending on the ingress port\[char46] .IP For each static route the \fBreg7 == id &&\fR is prefixed in logical flow match portion\[char46] For routes with \fBroute_table\fR value set a unique non-zero id is 
used\[char46] For routes within the \fB<main>
\fR route table (no route table set), this id value is 0\[char46] .IP For each \fIconnected\fR route (route to the LRP\(cqs subnet CIDR) the logical flow match portion has no \fBreg7 == id &&\fR prefix, so that routes to the LRP\(cqs subnets are present in all routing tables\[char46] .IP \(bu ECMP routes are grouped by policy and prefix\[char46] A unique id (non-zero) is assigned to each group, and each member is also assigned a unique id (non-zero) within each group\[char46] .IP For each IPv4/IPv6 ECMP group with group id \fIGID\fR and member ids \fIMID1\fR, \fIMID2\fR, \[char46]\[char46]\[char46], a logical flow with match in CIDR notation \fBip4\[char46]dst == \fIN\fB/\fIM\fB\fR, or \fBip6\[char46]dst == \fIN\fB/\fIM\fB\fR, whose priority is the integer value of \fIM\fR, has the following actions: .IP .nf \fB .br \fBip\[char46]ttl\-\-; .br \fBflags\[char46]loopback = 1; .br \fBreg8[0\[char46]\[char46]15] = \fR\fIGID\fB\fR; .br \fBselect(reg8[16\[char46]\[char46]31], \fR\fIMID1\fB\fR, \fR\fIMID2\fB\fR, \[char46]\[char46]\[char46]); .br \fB \fR .fi .IP \(bu A priority\-0 logical flow that matches all packets not already handled (match \fB1\fR) and drops them (action \fBdrop;\fR)\[char46] .RE .ST "Ingress Table 14: IP_ROUTING_ECMP" .PP .PP This table implements the second part of IP routing for ECMP routes following the previous table\[char46] If a packet matched an ECMP group in the previous table, this table matches the group id and member id stored from the previous table, setting \fBreg0\fR (or \fBxxreg0\fR for IPv6) to the next-hop IP address (leaving \fBip4\[char46]dst\fR or \fBip6\[char46]dst\fR, the packet\(cqs final destination, unchanged) and advancing to the next table for ARP resolution\[char46] It also sets \fBreg1\fR (or \fBxxreg1\fR) to the IP address owned by the selected router port (ingress table \fBARP Request\fR will generate an ARP request, if needed, with \fBreg0\fR as the target protocol address and \fBreg1\fR as the source protocol address)\[char46]
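.PP
The two-stage ECMP lookup described above can be sketched in Python\[char46] This is only an illustration, not \fBovn\-northd\fR code: the function names, the SHA\-256 based hash standing in for the \fBselect\fR action, and the sample nexthop table are all assumptions, and \fBreg8\fR is modelled as a plain 32-bit integer whose bits 0 to 15 are the low half\[char46] The priorities in the comments refer to the flows of the IP_ROUTING_ECMP table\[char46]
.PP
.nf
```python
import hashlib

GROUP_MASK = 0xFFFF  # reg8[0..15] holds the ECMP group id
MEMBER_SHIFT = 16    # reg8[16..31] holds the ECMP member id


def select_member(five_tuple, member_ids):
    """Pick a member id from a hash of the 5-tuple (stand-in for select())."""
    digest = hashlib.sha256(repr(five_tuple).encode()).digest()
    return member_ids[int.from_bytes(digest[:4], "big") % len(member_ids)]


def pack_reg8(group_id, member_id):
    """Store the group id in the low 16 bits and the member id in the high 16."""
    return ((member_id & GROUP_MASK) << MEMBER_SHIFT) | (group_id & GROUP_MASK)


def unpack_reg8(reg8):
    """Return (group_id, member_id) as the second stage would match them."""
    return reg8 & GROUP_MASK, (reg8 >> MEMBER_SHIFT) & GROUP_MASK


# Hypothetical ECMP group 1 with two members: (GID, MID) -> (nexthop, outport).
ECMP_NEXTHOPS = {
    (1, 1): ("10.0.0.2", "lrp-a"),
    (1, 2): ("10.0.1.2", "lrp-b"),
}


def ecmp_stage(reg8):
    """Model the three flow classes of the second ECMP stage."""
    group_id, member_id = unpack_reg8(reg8)
    if group_id == 0:
        return ("next", None)      # priority-150: non-ECMP route, bypass
    nexthop = ECMP_NEXTHOPS.get((group_id, member_id))
    if nexthop is not None:
        return ("next", nexthop)   # priority-100: set nexthop and outport
    return ("drop", None)          # priority-0: nothing matched
```
.fi
.PP
Hashing the whole 5-tuple keeps every packet of one connection on the same member, which is why the member id can be chosen once in the first stage and simply matched in the second\[char46]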
.PP .PP This processing is skipped for reply traffic being sent out of an ECMP route if the route was configured to use symmetric replies\[char46] .PP .PP This table contains the following logical flows: .RS .IP \(bu A priority\-150 flow that matches \fBreg8[0\[char46]\[char46]15] == 0\fR with action \fBnext;\fR directly bypasses packets of non-ECMP routes\[char46] .IP \(bu For each member with ID \fIMID\fR in each ECMP group with ID \fIGID\fR, a priority\-100 flow with match \fBreg8[0\[char46]\[char46]15] == \fIGID\fB && reg8[16\[char46]\[char46]31] == \fIMID\fB\fR has the following actions: .IP .nf \fB .br \fB[xx]reg0 = \fR\fIG\fB\fR; .br \fB[xx]reg1 = \fR\fIA\fB\fR; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBoutport = \fR\fIP\fB\fR; .br \fB \fR .fi .IP \(bu A priority\-0 logical flow that matches all packets not already handled (match \fB1\fR) and drops them (action \fBdrop;\fR)\[char46] .RE .ST "Ingress Table 15: Router policies" .PP .PP This table adds flows for the logical router policies configured on the logical router\[char46] Please see the \fBOVN_Northbound\fR database \fBLogical_Router_Policy\fR table documentation in \fBovn\-nb\fR for supported actions\[char46] .RS .IP \(bu For each router policy configured on the logical router, a logical flow is added with the specified priority, match and actions\[char46] .IP \(bu If the policy action is \fBreroute\fR with 2 or more nexthops defined, then the logical flow is added with the following actions: .IP .nf \fB .br \fBreg8[0\[char46]\[char46]15] = \fR\fIGID\fB\fR; .br \fBreg8[16\[char46]\[char46]31] = select(1,\[char46]\[char46]n); .br \fB \fR .fi .IP where \fIGID\fR is the ECMP group id generated by \fBovn\-northd\fR for this policy and \fIn\fR is the number of nexthops\[char46] The \fBselect\fR action selects one of the nexthop member ids, stores it in the register \fBreg8[16\[char46]\[char46]31]\fR and advances the packet to the next stage\[char46] .IP \(bu If the policy action is \fBreroute\fR with just one
nexthop, then the logical flow is added with the following actions: .IP .nf \fB .br \fB[xx]reg0 = \fR\fIH\fB\fR; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBoutport = \fR\fIP\fB\fR; .br \fBreg8[0\[char46]\[char46]15] = 0; .br \fBflags\[char46]loopback = 1; .br \fBnext; .br \fB \fR .fi .IP where \fIH\fR is the \fBnexthop\fR defined in the router policy, \fIE\fR is the ethernet address of the logical router port from which the \fBnexthop\fR is reachable and \fIP\fR is the logical router port from which the \fBnexthop\fR is reachable\[char46] .IP \(bu If a router policy has the option \fBpkt_mark=\fIm\fB\fR set and if the action is \fBnot\fR drop, then the action also includes \fBpkt\[char46]mark = \fIm\fB\fR to mark the packet with the marker \fIm\fR\[char46] .RE .ST "Ingress Table 16: ECMP handling for router policies" .PP .PP This table handles ECMP for the router policies configured with multiple nexthops\[char46] .RS .IP \(bu A priority\-150 flow is added to advance the packet to the next stage if the ECMP group id register \fBreg8[0\[char46]\[char46]15]\fR is 0\[char46] .IP \(bu For each ECMP reroute router policy with multiple nexthops, a priority\-100 flow is added for each nexthop \fIH\fR with the match \fBreg8[0\[char46]\[char46]15] == \fIGID\fB && reg8[16\[char46]\[char46]31] == \fIM\fB\fR where \fIGID\fR is the router policy group id generated by \fBovn\-northd\fR and \fIM\fR is the member id of the nexthop \fIH\fR generated by \fBovn\-northd\fR\[char46] The following actions are added to the flow: .IP .nf \fB .br \fB[xx]reg0 = \fR\fIH\fB\fR; .br \fBeth\[char46]src = \fR\fIE\fB\fR; .br \fBoutport = \fR\fIP\fB\fR; .br \fBflags\[char46]loopback = 1; .br \fBnext; .br \fB \fR .fi .IP where \fIH\fR is the \fBnexthop\fR defined in the router policy, \fIE\fR is the ethernet address of the logical router port from which the \fBnexthop\fR is reachable and \fIP\fR is the logical router port from which the \fBnexthop\fR is reachable\[char46]
.IP \(bu A priority\-0 logical flow that matches all packets not already handled (match \fB1\fR) and drops them (action \fBdrop;\fR)\[char46] .RE .ST "Ingress Table 17: ARP/ND Resolution" .PP .PP Any packet that reaches this table is an IP packet whose next-hop IPv4 address is in \fBreg0\fR or IPv6 address is in \fBxxreg0\fR\[char46] (\fBip4\[char46]dst\fR or \fBip6\[char46]dst\fR contains the final destination\[char46]) This table resolves the IP address in \fBreg0\fR (or \fBxxreg0\fR) into an output port in \fBoutport\fR and an Ethernet address in \fBeth\[char46]dst\fR, using the following flows: .RS .IP \(bu A priority\-500 flow that matches IP multicast traffic that was allowed in the routing pipeline\[char46] For this kind of traffic the \fBoutport\fR was already set so the flow just advances to the next table\[char46] .IP \(bu Priority\-200 flows that match ECMP reply traffic for the routes configured to use symmetric replies, with actions \fBpush(xxreg1); xxreg1 = ct_label; eth\[char46]dst = xxreg1[32\[char46]\[char46]79]; pop(xxreg1); next;\fR\[char46] \fBxxreg1\fR is used here to avoid masked access to ct_label, to make the flow HW-offloading friendly\[char46] .IP \(bu Static MAC bindings\[char46] MAC bindings can be known statically based on data in the \fBOVN_Northbound\fR database\[char46] For router ports connected to logical switches, MAC bindings can be known statically from the \fBaddresses\fR column in the \fBLogical_Switch_Port\fR table\[char46] For router ports connected to other logical routers, MAC bindings can be known statically from the \fBmac\fR and \fBnetworks\fR column in the \fBLogical_Router_Port\fR table\[char46] (Note: the flow is NOT installed for the IP addresses that belong to a neighbor logical router port if the current router has the \fBoptions:dynamic_neigh_routers\fR set to \fBtrue\fR) .IP For each IPv4 address \fIA\fR whose host is known to have Ethernet address \fIE\fR on router port \fIP\fR, a priority\-100 flow with match 
\fBoutport == \fIP\fB && reg0 == \fIA\fB\fR has actions \fBeth\[char46]dst = \fIE\fB; next;\fR\[char46] .IP For each virtual IP \fIA\fR configured on a logical port of type \fBvirtual\fR whose virtual parent is set in its corresponding \fBPort_Binding\fR record, where the virtual parent has the Ethernet address \fIE\fR and the virtual IP is reachable via the router port \fIP\fR, a priority\-100 flow with match \fBoutport == \fIP\fB && xxreg0/reg0 == \fIA\fB\fR has actions \fBeth\[char46]dst = \fIE\fB; next;\fR\[char46] .IP For each virtual IP \fIA\fR configured on a logical port of type \fBvirtual\fR whose virtual parent is \fBnot\fR set in its corresponding \fBPort_Binding\fR record, where the virtual IP \fIA\fR is reachable via the router port \fIP\fR, a priority\-100 flow with match \fBoutport == \fIP\fB && xxreg0/reg0 == \fIA\fB\fR has actions \fBeth\[char46]dst = \fI00:00:00:00:00:00\fB; next;\fR\[char46] This flow is added so that the ARP is always resolved for the virtual IP \fIA\fR by generating an ARP request and \fBnot\fR consulting the MAC_Binding table, as it can have an incorrect value for the virtual IP \fIA\fR\[char46] .IP For each IPv6 address \fIA\fR whose host is known to have Ethernet address \fIE\fR on router port \fIP\fR, a priority\-100 flow with match \fBoutport == \fIP\fB && xxreg0 == \fIA\fB\fR has actions \fBeth\[char46]dst = \fIE\fB; next;\fR\[char46] .IP For each logical router port with an IPv4 address \fIA\fR and a mac address of \fIE\fR that is reachable via a different logical router port \fIP\fR, a priority\-100 flow with match \fBoutport == \fIP\fB && reg0 == \fIA\fB\fR has actions \fBeth\[char46]dst = \fIE\fB; next;\fR\[char46] .IP For each logical router port with an IPv6 address \fIA\fR and a mac address of \fIE\fR that is reachable via a different logical router port \fIP\fR, a priority\-100 flow with match \fBoutport == \fIP\fB && xxreg0 == \fIA\fB\fR has actions \fBeth\[char46]dst = \fIE\fB; next;\fR\[char46] .IP \(bu Static MAC
bindings from NAT entries\[char46] MAC bindings can also be known for the entries in the \fBNAT\fR table\[char46] The flows below are programmed for distributed logical routers, i\[char46]e\[char46], those with a distributed router port\[char46] .IP For each row in the \fBNAT\fR table with IPv4 address \fIA\fR in the \fBexternal_ip\fR column of \fBNAT\fR table, a priority\-100 flow with the match \fBoutport == \fIP\fB && reg0 == \fIA\fB\fR has actions \fBeth\[char46]dst = \fIE\fB; next;\fR, where \fIP\fR is the distributed logical router port, \fIE\fR is the Ethernet address if set in the \fBexternal_mac\fR column of \fBNAT\fR table for rules of type \fBdnat_and_snat\fR, otherwise the Ethernet address of the distributed logical router port\[char46] Note that if the \fBexternal_ip\fR is not within a subnet on the owning logical router, then OVN will only create ARP resolution flows if the \fBoptions:add_route\fR is set to \fBtrue\fR\[char46] Otherwise, no ARP resolution flows will be added\[char46] .IP For IPv6 NAT entries, the same flows are added, but using the register \fBxxreg0\fR for the match\[char46] .IP \(bu If the router datapath runs a port with \fBredirect\-type\fR set to \fBbridged\fR, for each distributed NAT rule with IP \fIA\fR in the \fBlogical_ip\fR column and logical port \fIP\fR in the \fBlogical_port\fR column of \fBNAT\fR table, a priority\-90 flow is added with the match \fBoutport == \fIQ\fB && ip\[char46]src == \fIA\fB && is_chassis_resident(\fIP\fB)\fR, where \fIQ\fR is the distributed logical router port, and action \fBget_arp(outport, reg0); next;\fR for IPv4 and \fBget_nd(outport, xxreg0); next;\fR for IPv6\[char46] .IP \(bu Traffic whose IP destination is an address owned by the router should be dropped\[char46] Such traffic is normally dropped in ingress table \fBIP Input\fR except for IPs that are also shared with SNAT rules\[char46] However, if there was no unSNAT operation that happened successfully until this point in the pipeline and the destination IP of the packet is
still a router owned IP, the packets can be safely dropped\[char46] .IP A priority\-2 logical flow with match \fBip4\[char46]dst == {\[char46]\[char46]}\fR matches on traffic destined to router owned IPv4 addresses which are also SNAT IPs\[char46] This flow has action \fBdrop;\fR\[char46] .IP A priority\-2 logical flow with match \fBip6\[char46]dst == {\[char46]\[char46]}\fR matches on traffic destined to router owned IPv6 addresses which are also SNAT IPs\[char46] This flow has action \fBdrop;\fR\[char46] .IP A priority\-0 logical flow that matches all packets not already handled (match \fB1\fR) and drops them (action \fBdrop;\fR)\[char46] .IP \(bu Dynamic MAC bindings\[char46] These flows resolve MAC-to-IP bindings that have become known dynamically through ARP or neighbor discovery\[char46] (The ingress table \fBARP Request\fR will issue an ARP or neighbor solicitation request for cases where the binding is not yet known\[char46]) .IP A priority\-0 logical flow with match \fBip4\fR has actions \fBget_arp(outport, reg0); next;\fR\[char46] .IP A priority\-0 logical flow with match \fBip6\fR has actions \fBget_nd(outport, xxreg0); next;\fR\[char46] .IP \(bu For a distributed gateway LRP with \fBredirect\-type\fR set to \fBbridged\fR, a priority\-50 flow with match \fBoutport == \(dqROUTER_PORT\(dq && !is_chassis_resident(\(dqcr\-ROUTER_PORT\(dq)\fR has actions \fBeth\[char46]dst = \fIE\fB; next;\fR, where \fIE\fR is the ethernet address of the logical router port\[char46] .RE .ST "Ingress Table 18: Check packet length" .PP .PP For distributed logical routers or gateway routers with gateway port configured with \fBoptions:gateway_mtu\fR to a valid integer value, this table adds a priority\-50 logical flow with the match \fBoutport == \fIGW_PORT\fB\fR where \fIGW_PORT\fR is the gateway router port and applies the action \fBcheck_pkt_larger\fR and advances the packet to the next table\[char46] .PP .nf \fB .br \fBREGBIT_PKT_LARGER = check_pkt_larger(\fR\fIL\fB\fR);
next; .br \fB \fR .fi .PP .PP where \fIL\fR is the packet length to check for\[char46] If the packet is larger than \fIL\fR, it stores 1 in the register bit \fBREGBIT_PKT_LARGER\fR\[char46] The value of \fIL\fR is taken from \fBoptions:gateway_mtu\fR column of \fBLogical_Router_Port\fR row\[char46] .PP .PP If the port is also configured with \fBoptions:gateway_mtu_bypass\fR then another flow is added, with priority\-55, to bypass the \fBcheck_pkt_larger\fR flow\[char46] .PP .PP This table adds one priority\-0 fallback flow that matches all packets and advances to the next table\[char46] .ST "Ingress Table 19: Handle larger packets" .PP .PP For distributed logical routers or gateway routers with gateway port configured with \fBoptions:gateway_mtu\fR to a valid integer value, this table adds the following priority\-150 logical flow for each logical router port with the match \fBinport == \fILRP\fB && outport == \fIGW_PORT\fB && REGBIT_PKT_LARGER && !REGBIT_EGRESS_LOOPBACK\fR, where \fILRP\fR is the logical router port and \fIGW_PORT\fR is the gateway port and applies the following action for ipv4 and ipv6 respectively: .PP .nf \fB .br \fBicmp4 { .br \fB icmp4\[char46]type = 3; /* Destination Unreachable\[char46] */ .br \fB icmp4\[char46]code = 4; /* Frag Needed and DF was Set\[char46] */ .br \fB icmp4\[char46]frag_mtu = \fR\fIM\fB\fR; .br \fB eth\[char46]dst = \fR\fIE\fB\fR; .br \fB ip4\[char46]dst = ip4\[char46]src; .br \fB ip4\[char46]src = \fR\fII\fB\fR; .br \fB ip\[char46]ttl = 255; .br \fB REGBIT_EGRESS_LOOPBACK = 1; .br \fB REGBIT_PKT_LARGER = 0; .br \fB next(pipeline=ingress, table=0); .br \fB}; .br \fB .br \fBicmp6 { .br \fB icmp6\[char46]type = 2; .br \fB icmp6\[char46]code = 0; .br \fB icmp6\[char46]frag_mtu = \fR\fIM\fB\fR; .br \fB eth\[char46]dst = \fR\fIE\fB\fR; .br \fB ip6\[char46]dst = ip6\[char46]src; .br \fB ip6\[char46]src = \fR\fII\fB\fR; .br \fB ip\[char46]ttl = 255; .br \fB REGBIT_EGRESS_LOOPBACK = 1; .br \fB REGBIT_PKT_LARGER = 0; .br \fB 
next(pipeline=ingress, table=0); .br \fB}; .br \fB \fR .fi .RS .IP \(bu Where \fIM\fR is the (fragment MTU - 58) whose value is taken from \fBoptions:gateway_mtu\fR column of \fBLogical_Router_Port\fR row\[char46] .IP \(bu \fIE\fR is the Ethernet address of the logical router port\[char46] .IP \(bu \fII\fR is the IPv4/IPv6 address of the logical router port\[char46] .RE .PP .PP This table adds one priority\-0 fallback flow that matches all packets and advances to the next table\[char46] .ST "Ingress Table 20: Gateway Redirect" .PP .PP For distributed logical routers where one or more of the logical router ports specifies a gateway chassis, this table redirects certain packets to the distributed gateway port instances on the gateway chassis\[char46] This table has the following flows: .RS .IP \(bu For each NAT rule in the OVN Northbound database that can be handled in a distributed manner, a priority\-100 logical flow with match \fBip4\[char46]src == \fIB\fB && outport == \fIGW\fB && is_chassis_resident(\fIP\fB)\fR, where \fIGW\fR is the distributed gateway port specified in the NAT rule and \fIP\fR is the NAT logical port\[char46] IP traffic matching the above rule will be managed locally, setting \fBreg1\fR to \fIC\fR and \fBeth\[char46]src\fR to \fID\fR, where \fIC\fR is the NAT external IP and \fID\fR is the NAT external MAC\[char46] .IP \(bu For each \fBdnat_and_snat\fR NAT rule with \fBstateless=true\fR and \fBallowed_ext_ips\fR configured, a priority\-75 flow is programmed with match \fBip4\[char46]dst == \fIB\fB\fR and action \fBoutport = \fICR\fB; next;\fR where \fIB\fR is the NAT rule external IP and \fICR\fR is the \fBchassisredirect\fR port representing the instance of the logical router distributed gateway port on the gateway chassis\[char46] Moreover, a priority\-70 flow is programmed with the same match and action \fBdrop;\fR\[char46] For each \fBdnat_and_snat\fR NAT rule with \fBstateless=true\fR and \fBexempted_ext_ips\fR configured, a priority\-75 flow is
programmed with match \fBip4\[char46]dst == \fIB\fB\fR and action \fBdrop;\fR where \fIB\fR is the NAT rule external IP\[char46] A similar flow is added for IPv6 traffic\[char46] .IP \(bu For each NAT rule in the OVN Northbound database that can be handled in a distributed manner, a priority\-80 logical flow with a \fBdrop;\fR action is added if the NAT logical port is a virtual port not claimed by any chassis yet\[char46] .IP \(bu A priority\-50 logical flow with match \fBoutport == \fIGW\fB\fR has actions \fBoutport = \fICR\fB; next;\fR, where \fIGW\fR is the logical router distributed gateway port and \fICR\fR is the \fBchassisredirect\fR port representing the instance of the logical router distributed gateway port on the gateway chassis\[char46] .IP \(bu A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .ST "Ingress Table 21: ARP Request" .PP .PP In the common case where the Ethernet destination has been resolved, this table outputs the packet\[char46] Otherwise, it composes and sends an ARP or IPv6 Neighbor Solicitation request\[char46] It holds the following flows: .RS .IP \(bu Unknown MAC address\[char46] A priority\-100 flow for IPv4 packets with match \fBeth\[char46]dst == 00:00:00:00:00:00\fR has the following actions: .IP .nf \fB .br \fBarp { .br \fB eth\[char46]dst = ff:ff:ff:ff:ff:ff; .br \fB arp\[char46]spa = reg1; .br \fB arp\[char46]tpa = reg0; .br \fB arp\[char46]op = 1; /* ARP request\[char46] */ .br \fB output; .br \fB}; .br \fB \fR .fi .IP Unknown MAC address\[char46] For each IPv6 static route associated with the router with the nexthop IP \fIG\fR, a priority\-200 flow for IPv6 packets with match \fBeth\[char46]dst == 00:00:00:00:00:00 && xxreg0 == \fIG\fB\fR with the following actions is added: .IP .nf \fB .br \fBnd_ns { .br \fB eth\[char46]dst = \fR\fIE\fB\fR; .br \fB ip6\[char46]dst = \fR\fII\fB\fR; .br \fB nd\[char46]target = \fR\fIG\fB\fR; .br \fB output; .br \fB}; .br \fB \fR .fi .IP Where \fIE\fR is the multicast mac
derived from the Gateway IP, \fII\fR is the solicited-node multicast address corresponding to the target address \fIG\fR\[char46] .IP Unknown MAC address\[char46] A priority\-100 flow for IPv6 packets with match \fBeth\[char46]dst == 00:00:00:00:00:00\fR has the following actions: .IP .nf \fB .br \fBnd_ns { .br \fB nd\[char46]target = xxreg0; .br \fB output; .br \fB}; .br \fB \fR .fi .IP (Ingress table \fBIP Routing\fR initialized \fBreg1\fR with the IP address owned by \fBoutport\fR and \fB(xx)reg0\fR with the next-hop IP address) .IP The IP packet that triggers the ARP/IPv6 NS request is dropped\[char46] .IP \(bu Known MAC address\[char46] A priority\-0 flow with match \fB1\fR has actions \fBoutput;\fR\[char46] .RE .ST "Egress Table 0: Check DNAT local" .PP .PP This table checks if the packet needs to be DNATed in the router ingress table \fBlr_in_dnat\fR after it is SNATed and looped back to the ingress pipeline\[char46] This check is done only for routers configured with distributed gateway ports and NAT entries\[char46] This check is done so that SNAT and DNAT are done in different zones instead of a common zone\[char46] .RS .IP \(bu For each NAT rule in the OVN Northbound database on a distributed router, a priority\-50 logical flow with match \fBip4\[char46]dst == \fIE\fB && is_chassis_resident(\fIP\fB)\fR has actions \fBREGBIT_DST_NAT_IP_LOCAL = 1; next;\fR, where \fIE\fR is the external IP address specified in the NAT rule and \fIGW\fR is the logical router distributed gateway port\[char46] For a dnat_and_snat NAT rule, \fIP\fR is the logical port specified in the NAT rule\[char46] If the \fBlogical_port\fR column of the \fBNAT\fR table is NOT set, then \fIP\fR is the \fBchassisredirect port\fR of \fIGW\fR\[char46] .IP \(bu A priority\-0 logical flow with match \fB1\fR has actions \fBREGBIT_DST_NAT_IP_LOCAL = 0; next;\fR\[char46] .RE .PP .PP This table also installs a priority\-50 logical flow for each logical router that has NATs configured on it\[char46] The flow has
match \fBip && ct_label\[char46]natted == 1\fR and action \fBREGBIT_DST_NAT_IP_LOCAL = 1; next;\fR\[char46] This is intended to ensure that traffic that was DNATted locally will use a separate conntrack zone for SNAT if SNAT is required later in the egress pipeline\[char46] Note that this flow checks the value of \fBct_label\[char46]natted\fR, which is set in the ingress pipeline\[char46] This means that \fBovn\-northd\fR assumes that this value is carried over from the ingress pipeline to the egress pipeline and is not altered or cleared\[char46] If conntrack label values are ever changed to be cleared between the ingress and egress pipelines, then the match conditions of this flow will be updated accordingly\[char46] .ST "Egress Table 1: UNDNAT" .PP .PP This table handles the reverse traffic of already established connections, i\[char46]e\[char46], DNAT has already been done in the ingress pipeline and now the packet has entered the egress pipeline as part of a reply\[char46] This traffic is unDNATed here\[char46] .RS .IP \(bu A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .ST "Egress Table 1: UNDNAT on Gateway Routers" .RS .IP \(bu For all IP packets, a priority\-50 flow with an action \fBflags\[char46]loopback = 1; ct_dnat;\fR\[char46] .RE .ST "Egress Table 1: UNDNAT on Distributed Routers" .RS .IP \(bu For all the configured load balancing rules for a router with gateway port in \fBOVN_Northbound\fR database that includes an IPv4 address \fBVIP\fR, for every backend IPv4 address \fIB\fR defined for the \fBVIP\fR a priority\-120 flow is programmed on gateway chassis that matches \fBip && ip4\[char46]src == \fIB\fB && outport == \fIGW\fB\fR, where \fIGW\fR is the logical router gateway port, with an action \fBct_dnat_in_czone;\fR\[char46] If the backend IPv4 address \fIB\fR is also configured with L4 port \fIPORT\fR of protocol \fIP\fR, then the match also includes \fB\fIP\fB\[char46]src == \fIPORT\fB\fR\[char46] These flows are not added for
load balancers with IPv6 \fIVIPs\fR\[char46] .IP If the router is configured to force SNAT any load-balanced packets, above action will be replaced by \fBflags\[char46]force_snat_for_lb = 1; ct_dnat;\fR\[char46] .IP \(bu For each configuration in the OVN Northbound database that asks to change the destination IP address of a packet from an IP address of \fIA\fR to \fIB\fR, a priority\-100 flow matches \fBip && ip4\[char46]src == \fIB\fB && outport == \fIGW\fB\fR, where \fIGW\fR is the logical router gateway port, with an action \fBct_dnat_in_czone;\fR\[char46] If the NAT rule is of type dnat_and_snat and has \fBstateless=true\fR in the options, then the action would be \fBnext;\fR\[char46] .IP If the NAT rule cannot be handled in a distributed manner, then the priority\-100 flow above is only programmed on the gateway chassis with the action \fBct_dnat_in_czone\fR\[char46] .IP If the NAT rule can be handled in a distributed manner, then there is an additional action \fBeth\[char46]src = \fIEA\fB;\fR, where \fIEA\fR is the ethernet address associated with the IP address \fIA\fR in the NAT rule\[char46] This allows upstream MAC learning to point to the correct chassis\[char46] .RE .ST "Egress Table 2: Post UNDNAT" .PP .PP .RS .IP \(bu A priority\-50 logical flow is added that commits any untracked flows from the previous table \fBlr_out_undnat\fR for Gateway routers\[char46] This flow matches on \fBct\[char46]new && ip\fR with action \fBct_commit { } ; next; \fR\[char46] .IP \(bu A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .ST "Egress Table 3: SNAT" .PP .PP Packets that are configured to be SNATed get their source IP address changed based on the configuration in the OVN Northbound database\[char46] .RS .IP \(bu A priority\-120 flow to advance the IPv6 Neighbor solicitation packet to next table to skip SNAT\[char46] In the case where ovn-controller injects an IPv6 Neighbor Solicitation packet (for \fBnd_ns\fR action) we don\(cqt 
want the packet to go through conntrack\[char46] .RE .PP .PP Egress Table 3: SNAT on Gateway Routers .RS .IP \(bu If the Gateway router in the OVN Northbound database has been configured to force SNAT a packet (that has been previously DNATted) to \fIB\fR, a priority\-100 flow matches \fBflags\[char46]force_snat_for_dnat == 1 && ip\fR with an action \fBct_snat(\fIB\fB);\fR\[char46] .IP \(bu If a load balancer configured to skip snat has been applied to the Gateway router pipeline, a priority\-120 flow matches \fBflags\[char46]skip_snat_for_lb == 1 && ip\fR with an action \fBnext;\fR\[char46] .IP \(bu If the Gateway router in the OVN Northbound database has been configured to force SNAT a packet (that has been previously load-balanced) using router IP (i\[char46]e \fBoptions\fR:lb_force_snat_ip=router_ip), then for each logical router port \fIP\fR attached to the Gateway router, a priority\-110 flow matches \fBflags\[char46]force_snat_for_lb == 1 && outport == \fIP\fB \fR with an action \fBct_snat(\fIR\fB);\fR where \fIR\fR is the IP configured on the router port\[char46] If \fBR\fR is an IPv4 address then the match will also include \fBip4\fR and if it is an IPv6 address, then the match will also include \fBip6\fR\[char46] .IP If the logical router port \fIP\fR is configured with multiple IPv4 and multiple IPv6 addresses, only the first IPv4 and first IPv6 address is considered\[char46] .IP \(bu If the Gateway router in the OVN Northbound database has been configured to force SNAT a packet (that has been previously load-balanced) to \fIB\fR, a priority\-100 flow matches \fBflags\[char46]force_snat_for_lb == 1 && ip\fR with an action \fBct_snat(\fIB\fB);\fR\[char46] .IP \(bu For each configuration in the OVN Northbound database, that asks to change the source IP address of a packet from an IP address of \fIA\fR or to change the source IP address of a packet that belongs to network \fIA\fR to \fIB\fR, a flow matches \fBip && ip4\[char46]src == \fIA\fB && 
(!ct\[char46]trk || !ct\[char46]rpl)\fR with an action \fBct_snat(\fIB\fB);\fR\[char46] The priority of the flow is calculated based on the mask of \fIA\fR, with matches having larger masks getting higher priorities\[char46] If the NAT rule is of type dnat_and_snat and has \fBstateless=true\fR in the options, then the action would be \fBip4/6\[char46]src= (\fIB\fB)\fR\[char46] .IP \(bu If the NAT rule has \fBallowed_ext_ips\fR configured, then there is an additional match \fBip4\[char46]dst == \fIallowed_ext_ips \fB\fR\[char46] Similarly, for IPv6, the match would be \fBip6\[char46]dst == \fIallowed_ext_ips\fB\fR\[char46] .IP \(bu If the NAT rule has \fBexempted_ext_ips\fR set, then there is an additional flow configured at the priority + 1 of the corresponding NAT rule\[char46] The flow matches if the destination IP is an \fBexempted_ext_ip\fR and the action is \fBnext; \fR\[char46] This flow is used to bypass the ct_snat action for a packet which is destined to \fBexempted_ext_ips\fR\[char46] .IP \(bu A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .PP .PP Egress Table 3: SNAT on Distributed Routers .RS .IP \(bu For each configuration in the OVN Northbound database, that asks to change the source IP address of a packet from an IP address of \fIA\fR or to change the source IP address of a packet that belongs to network \fIA\fR to \fIB\fR, two flows are added\[char46] The priority \fIP\fR of these flows is calculated based on the mask of \fIA\fR, with matches having larger masks getting higher priorities\[char46] .IP If the NAT rule cannot be handled in a distributed manner, then the below flows are only programmed on the gateway chassis, increasing flow priority by 128 in order to be run first\[char46] .RS .IP \(bu The first flow is added with the calculated priority \fIP\fR and match \fBip && ip4\[char46]src == \fIA\fB && outport == \fIGW\fB\fR, where \fIGW\fR is the logical router gateway port, with an action
\fBct_snat_in_czone(\fIB\fB);\fR to SNAT in the common zone\[char46] If the NAT rule is of type dnat_and_snat and has \fBstateless=true\fR in the options, then the action would be \fBip4/6\[char46]src=(\fIB\fB)\fR\[char46] .IP \(bu The second flow is added with the calculated priority \fB\fIP\fB + 1 \fR and match \fBip && ip4\[char46]src == \fIA\fB && outport == \fIGW\fB && REGBIT_DST_NAT_IP_LOCAL == 0\fR, where \fIGW\fR is the logical router gateway port, with an action \fBct_snat(\fIB\fB);\fR to SNAT in the snat zone\[char46] If the NAT rule is of type dnat_and_snat and has \fBstateless=true\fR in the options, then the action would be \fBip4/6\[char46]src=(\fIB\fB)\fR\[char46] .RE .IP If the NAT rule can be handled in a distributed manner, then there is an additional action (for both the flows) \fBeth\[char46]src = \fIEA\fB;\fR, where \fIEA\fR is the ethernet address associated with the IP address \fIA\fR in the NAT rule\[char46] This allows upstream MAC learning to point to the correct chassis\[char46] .IP If the NAT rule has \fBallowed_ext_ips\fR configured, then there is an additional match \fBip4\[char46]dst == \fIallowed_ext_ips \fB\fR\[char46] Similarly, for IPv6, the match would be \fBip6\[char46]dst == \fIallowed_ext_ips\fB\fR\[char46] .IP If the NAT rule has \fBexempted_ext_ips\fR set, then there is an additional flow configured at the priority \fB\fIP\fB + 2 \fR of the corresponding NAT rule\[char46] The flow matches if the destination IP is an \fBexempted_ext_ip\fR and the action is \fBnext; \fR\[char46] This flow is used to bypass the ct_snat action for a flow which is destined to \fBexempted_ext_ips\fR\[char46] .IP \(bu A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .ST "Egress Table 4: Egress Loopback" .PP .PP For distributed logical routers where one of the logical router ports specifies a gateway chassis\[char46] .PP .PP While UNDNAT and SNAT processing have already occurred by this point, this traffic needs to be forced
through egress loopback on this distributed gateway port instance, in order for UNSNAT and DNAT processing to be applied, and also for IP routing and ARP resolution after all of the NAT processing, so that the packet can be forwarded to the destination\[char46] .PP .PP This table has the following flows: .RS .IP \(bu For each NAT rule in the OVN Northbound database on a distributed router, a priority\-100 logical flow with match \fBip4\[char46]dst == \fIE\fB && outport == \fIGW\fB && is_chassis_resident(\fIP\fB)\fR, where \fIE\fR is the external IP address specified in the NAT rule and \fIGW\fR is the distributed gateway port corresponding to the NAT rule (specified or inferred)\[char46] For a dnat_and_snat NAT rule, \fIP\fR is the logical port specified in the NAT rule\[char46] If the \fBlogical_port\fR column of the \fBNAT\fR table is not set, then \fIP\fR is the \fBchassisredirect port\fR of \fIGW\fR\[char46] The flow has the following actions: .IP .nf \fB .br \fBclone { .br \fB ct_clear; .br \fB inport = outport; .br \fB outport = \(dq\(dq; .br \fB flags = 0; .br \fB flags\[char46]loopback = 1; .br \fB flags\[char46]use_snat_zone = REGBIT_DST_NAT_IP_LOCAL; .br \fB reg0 = 0; .br \fB reg1 = 0; .br \fB \[char46]\[char46]\[char46] .br \fB reg9 = 0; .br \fB REGBIT_EGRESS_LOOPBACK = 1; .br \fB next(pipeline=ingress, table=0); .br \fB}; .br \fB \fR .fi .IP \fBflags\[char46]loopback\fR is set since in_port is unchanged and the packet may return to that port after NAT processing\[char46] \fBREGBIT_EGRESS_LOOPBACK\fR is set to indicate that egress loopback has occurred, in order to skip the source IP address check against the router address\[char46] .IP \(bu A priority\-0 logical flow with match \fB1\fR has actions \fBnext;\fR\[char46] .RE .ST "Egress Table 5: Delivery" .PP .PP Packets that reach this table are ready for delivery\[char46] It contains: .RS .IP \(bu Priority\-110 logical flows that match IP multicast packets on each enabled logical router port and modify the Ethernet source 
address of the packets to the Ethernet address of the port and then execute action \fBoutput;\fR\[char46] .IP \(bu Priority\-100 logical flows that match packets on each enabled logical router port, with action \fBoutput;\fR\[char46] .IP \(bu A priority\-0 logical flow that matches all packets not already handled (match \fB1\fR) and drops them (action \fBdrop;\fR)\[char46] .RE .SH "DROP SAMPLING" .PP .PP As described in the previous section, there are several places where ovn-northd might decide to drop a packet by explicitly creating a \fBLogical_Flow\fR with the \fBdrop;\fR action\[char46] .PP .PP When debug drop-sampling has been configured in the OVN Northbound database, ovn-northd will replace all the \fBdrop;\fR actions with a \fBsample(priority=65535, collector_set=\fIid\fB, obs_domain=\fIobs_id\fB, obs_point=@cookie)\fR action, where: .RS .IP \(bu \fIid\fR is the value of the \fBdebug_drop_collector_set\fR option configured in the OVN Northbound database\[char46] .IP \(bu \fIobs_id\fR has its 8 most significant bits equal to the value of the \fBdebug_drop_domain_id\fR option in the OVN Northbound database and its 24 least significant bits equal to the datapath\(cqs tunnel key\[char46] .RE
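.PP .PP The composition of \fIobs_id\fR described above can be illustrated with a minimal C sketch\[char46] The function name \fBdrop_sample_obs_domain\fR and its parameter names are hypothetical and are not taken from the OVN sources; the sketch assumes only the bit layout stated above (8 most significant bits from \fBdebug_drop_domain_id\fR, 24 least significant bits from the datapath tunnel key)\[char46]

```c
#include <stdint.h>

/* Hypothetical helper (not part of the OVN code base): builds the 32-bit
 * obs_domain value from the layout described in the text above.
 *
 *   bits 31..24: debug_drop_domain_id (8 bits)
 *   bits 23..0:  datapath tunnel key  (24 bits)
 */
static uint32_t
drop_sample_obs_domain(uint8_t domain_id, uint32_t tunnel_key)
{
    /* Mask the tunnel key so that it cannot spill into the domain ID bits. */
    return ((uint32_t) domain_id << 24) | (tunnel_key & UINT32_C(0xffffff));
}
```

.PP Under these assumptions, a domain ID of 0x2a and a datapath tunnel key of 5 give an \fIobs_id\fR of 0x2a000005\[char46]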