NAME
corosync-qdevice - QDevice daemon
SYNOPSIS
corosync-qdevice [-dfh] [-S option=value[,option2=value2,...]]
DESCRIPTION
corosync-qdevice is a daemon running on each node of a cluster. It
provides a configured number of votes to the quorum subsystem based on a
third-party arbitrator's decision. Its primary use is to allow a cluster to
sustain more node failures than standard quorum rules allow. It is recommended
for clusters with an even number of nodes and highly recommended for two-node
clusters.
OPTIONS
- -d
- Forcefully turn on debug information without the need to change
corosync.conf.
- -f
- Do not daemonize, run in the foreground.
- -h
- Show short help text.
- -S
- Set advanced settings described in their own section below. This option
should generally not be used because most of the settings are not safe to
change.
CONFIGURATION
corosync-qdevice reads its configuration from the corosync.conf file.
The main configuration is within the
quorum.device sub-key. Each model also
has its own configuration within a similarly named sub-key.
- model
- Specifies the model to be used. This parameter is required.
corosync-qdevice is modular and is able to support multiple
different models. The model basically defines what type of arbitrator is
used. Currently only net is supported.
- timeout
- Specifies how often corosync-qdevice should call the
votequorum_poll function. It is also used by the net model to adjust its
heartbeat timeout. It is recommended that you don't change this value.
Default is 10000.
- sync_timeout
- Specifies how often corosync-qdevice should call the
votequorum_poll function during a sync phase. It is recommended that you
don't change this value. Default is 30000.
- votes
- The number of votes provided to the cluster by qdevice. Default is
(number_of_nodes - 1) or generally sum(votes_per_node) - 1.
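The default vote count and its effect on quorum can be worked through with a
short sketch. The function names are hypothetical, and the majority rule
(quorum = floor(expected_votes / 2) + 1) is an assumption taken from the usual
votequorum behaviour, not something stated in this page.

```python
def default_qdevice_votes(votes_per_node):
    """Default qdevice votes: sum(votes_per_node) - 1, as described above."""
    return sum(votes_per_node) - 1


def quorum_threshold(votes_per_node, qdevice_votes):
    """Assumed majority rule: more than half of all expected votes."""
    expected_votes = sum(votes_per_node) + qdevice_votes
    return expected_votes // 2 + 1


# Hypothetical four-node cluster with one vote per node.
nodes = [1, 1, 1, 1]
qdevice = default_qdevice_votes(nodes)      # 4 - 1 = 3
needed = quorum_threshold(nodes, qdevice)   # (4 + 3) // 2 + 1 = 4

# One surviving node (1 vote) plus the qdevice votes (3) reaches the threshold.
survives_alone = 1 + qdevice >= needed
```

Under these assumptions a single remaining node that still receives the
qdevice votes stays quorate, which is the case the default vote count is
designed for.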
quorum.device.net holds the configuration for model 'net'.
- tls
- Can be one of on, off or required and specifies whether TLS should be
used. on means a connection with TLS is attempted first, but if the
server doesn't advertise TLS support then non-TLS will be used. off
means TLS is not required and is not even attempted. This mode
is the only one which doesn't need a properly initialized NSS database.
required means TLS is required and if the server doesn't support
TLS, qdevice will exit with an error message. Default is on.
- host
- Specifies the IP address or host name of the qnetd server to be used. This
parameter is required.
- port
- Specifies the TCP port of the qnetd server. Default is 5403.
- algorithm
- Decision algorithm. Can be one of ffsplit or lms. (There are also
test and 2nodelms, both of which are mainly for developers and
shouldn't be used for production clusters.) For a description of what
each algorithm means and how the algorithms differ, see their
individual sections. Default value is ffsplit.
- tie_breaker
- Can be one of lowest, highest or valid_node_id (a number). It is used
as a fallback if qdevice has to decide between two or more equal
partitions. lowest means the partition with the lowest node id is
chosen. highest means the partition with the highest node id is chosen.
valid_node_id means that the partition containing the node with the
given node id is chosen. Default is lowest.
- connect_timeout
- Timeout used when corosync-qdevice is trying to connect to the
corosync-qnetd host. Default is 0.8 * quorum.sync_timeout.
- force_ip_version
- Can be one of 0, 4 or 6 and forces the software to use the given IP
version. 0 (default value) means IPv6 is preferred and IPv4 is
used as a fallback.
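The three tls modes described above can be summarised as a small decision
function. This is only an illustrative sketch: negotiate_tls and its
parameters are hypothetical names, and the real negotiation happens inside
corosync-qdevice.

```python
def negotiate_tls(mode, server_advertises_tls):
    """Return True if the connection would use TLS, False for non-TLS.

    Raises RuntimeError in the case where the text says qdevice exits
    with an error message.
    """
    if mode == "off":
        # TLS is not required and not even attempted.
        return False
    if mode == "on":
        # TLS is tried first; fall back to non-TLS if not advertised.
        return server_advertises_tls
    if mode == "required":
        if not server_advertises_tls:
            raise RuntimeError("TLS is required but the server does not support it")
        return True
    raise ValueError(f"unknown tls mode: {mode!r}")
```

For example, negotiate_tls("on", False) returns False (non-TLS fallback),
while negotiate_tls("required", False) raises an error.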
Logging configuration is within the
logging directive.
corosync-qdevice parses and supports most of the options, with the
exception of
to_logfile, logfile and
logfile_priority. The
logger_subsys sub-directive can also be used if
subsys is set to
QDEVICE.
For
corosync-qdevice to work correctly, the
nodelist directive has
to be used and properly configured. The net model also requires that
the totem.cluster_name option is set.
MODEL NET TLS CONFIGURATION
For model net to work using TLS, it's necessary to create the NSS database,
import the qnetd CA certificate, and get/distribute a valid client certificate.
If pcs is used (recommended), the following steps are not needed because pcs
performs them automatically.
corosync-qdevice-net-certutil is the tool that performs the required actions
semi-automatically. Please consult its help output and its man page. For
a first time configuration it may make sense to start with the
-Q
option.
If TLS is not required, just edit the corosync.conf file and set
quorum.device.net.tls to
off.
MODEL NET ALGORITHMS
Algorithms change how
corosync-qnetd provides
votes to a given node or partition. Currently two algorithms are supported.
- ffsplit
- This one makes sense only for clusters with an even number of nodes. It
provides exactly one vote to the partition with the highest number of
active nodes. If there are two partitions of exactly equal size, it provides
its vote to the partition that has the most clients connected to the qnetd
server. If this number is also equal, then the tie_breaker is used. It is
able to transition its vote if the currently active partition becomes
partitioned and a non-active partition still has at least 50% of the
active nodes. Because of this, a vote is not provided if the qnetd
connection is not active.
To use this algorithm, the number of votes per node must be 1 (the default)
and the number of qdevice votes must also be 1. The latter is
achieved by setting the quorum.device.votes key in the corosync.conf file
to 1.
- lms
- Last-man-standing. If the node is the only one left in the cluster that
can see the qnetd server, then a vote is returned.
If more than one node can see the qnetd server but some nodes can't see each
other, then the cluster is divided into 'partitions' based on their
ring_id, and this algorithm returns a vote to the largest active partition
or, if there is more than one partition of equal size, to the partition that
contains the tie_breaker node (lowest, highest, etc). For LMS to work, the
number of qdevice votes has to be left at its default (so just delete the
quorum.device.votes key from corosync.conf).
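The selection rules of both algorithms can be sketched as follows. This is a
simplified model under stated assumptions, not the corosync-qnetd
implementation: partitions are modelled as sets of node ids, connected is the
set of node ids currently connected to qnetd, all function names are
hypothetical, and the vote transition behaviour of ffsplit is not modelled.

```python
def tie_break(partitions, tie_breaker="lowest"):
    """Choose between equal partitions (each a set of node ids)."""
    if tie_breaker == "lowest":
        return min(partitions, key=min)   # partition holding the lowest node id
    if tie_breaker == "highest":
        return max(partitions, key=max)   # partition holding the highest node id
    # Otherwise tie_breaker is a node id: choose the partition containing it.
    for partition in partitions:
        if tie_breaker in partition:
            return partition
    return None


def ffsplit_winner(partitions, connected, tie_breaker="lowest"):
    """Largest partition, then most qnetd clients, then tie_breaker."""
    largest = max(len(p) for p in partitions)
    candidates = [p for p in partitions if len(p) == largest]
    if len(candidates) > 1:
        most = max(len(p & connected) for p in candidates)
        candidates = [p for p in candidates if len(p & connected) == most]
    if len(candidates) > 1:
        return tie_break(candidates, tie_breaker)
    return candidates[0]


def lms_winner(partitions, connected, tie_breaker="lowest"):
    """Largest partition that can see qnetd, then tie_breaker."""
    visible = [p for p in partitions if p & connected]
    if not visible:
        return None                       # nobody can see qnetd: no vote
    largest = max(len(p) for p in visible)
    candidates = [p for p in visible if len(p) == largest]
    if len(candidates) > 1:
        return tie_break(candidates, tie_breaker)
    return candidates[0]
```

For a 2+2 split where only node 4 has lost its qnetd connection,
ffsplit_winner([{1, 2}, {3, 4}], {1, 2, 3}) selects {1, 2}. With lms, a lone
node that is the only one able to see qnetd wins even against a larger
partition: lms_winner([{1}, {2, 3}], {1}) selects {1}.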
ADVANCED SETTINGS
Set by using the
-S option. The default value is shown in parentheses.
Options beginning with the
net_ prefix are specific to model net.
- lock_file
- Lock file location. (/var/run/corosync-qdevice/corosync-qdevice.pid)
- local_socket_file
- Internal IPC socket file location.
(/var/run/corosync-qdevice/corosync-qdevice.sock)
- local_socket_backlog
- Parameter passed to listen syscall. (10)
- max_cs_try_again
- How many times to retry the call to a corosync function which has returned
CS_ERR_TRY_AGAIN. (10)
- votequorum_device_name
- Name used for qdevice registration. (Qdevice)
- ipc_max_clients
- Maximum allowed simultaneous IPC clients. (10)
- ipc_max_receive_size
- Maximum size of a message received by IPC client. (4096)
- ipc_max_send_size
- Maximum size of a message allowed to be sent to an IPC client.
(65536)
- master_wins
- Force enable/disable master wins. (default is model)
- net_nss_db_dir
- NSS database directory. (/etc/corosync/qdevice/net/nssdb)
- net_initial_msg_receive_size
- Initial (used during connection parameters negotiation) maximum size of
the receive buffer for message (maximum allowed message size received from
qnetd). (32768)
- net_initial_msg_send_size
- Initial (used during connection parameter negotiation) maximum size of one
send buffer (message) to be sent to server. (32768)
- net_min_msg_send_size
- Minimum required size of one send buffer (message) to be sent to server.
(32768)
- net_max_msg_receive_size
- Maximum allowed size of receive buffer for a message sent by server.
(16777216)
- net_max_send_buffers
- Maximum number of send buffers. (10)
- net_nss_qnetd_cn
- Common name (CN) of qnetd server certificate. (Qnetd Server)
- net_nss_client_cert_nickname
- NSS nickname of qdevice client certificate. (Cluster Cert)
- net_heartbeat_interval_min
- Minimum heartbeat timeout accepted by client in ms. (1000)
- net_heartbeat_interval_max
- Maximum heartbeat timeout accepted by client in ms. (120000)
- net_min_connect_timeout
- Minimum connection timeout accepted by client in ms. (1000)
- net_max_connect_timeout
- Maximum connection timeout accepted by client in ms. (120000)
- net_test_algorithm_enabled
- Enable test algorithm. (on if built with --enable-debug, otherwise
off)
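The -S argument uses the option=value[,option2=value2,...] form shown in the
SYNOPSIS. A minimal sketch of building such an argument from a mapping is
given below. The helper name and the chosen values are hypothetical, and
rejecting commas and equals signs is an assumption of this sketch, since the
page does not describe any escaping.

```python
def format_advanced_settings(settings):
    """Join settings into the option=value[,option2=value2,...] form."""
    parts = []
    for name, value in settings.items():
        text = str(value)
        # Assumption: ',' and '=' are separators and cannot be escaped.
        if "," in name or "=" in name or "," in text:
            raise ValueError(f"cannot encode setting {name!r}")
        parts.append(f"{name}={text}")
    return ",".join(parts)


# Example values only; as noted above, most settings are not safe to change.
advanced = format_advanced_settings({
    "max_cs_try_again": 20,
    "net_max_send_buffers": 10,
})

# Argument vector for a foreground run with the settings applied.
argv = ["corosync-qdevice", "-f", "-S", advanced]
```

The resulting list can be passed to a process launcher such as
subprocess.run on a node where corosync-qdevice is installed.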
SEE ALSO
corosync-qdevice-tool(8) corosync-qdevice-net-certutil(8)
corosync-qnetd(8) corosync.conf(5)
AUTHOR
Jan Friesse