NAME¶
cman_tool - Cluster Management Tool
SYNOPSIS¶
cman_tool join | leave | kill | expected | votes | version | wait | status |
nodes | services | debug [options]
DESCRIPTION¶
cman_tool is a program that manages the cluster management subsystem
CMAN. cman_tool can be used to join the node to a cluster, leave the cluster,
kill another cluster node or change the value of expected votes of a cluster.
Be careful that you understand the consequences of the commands issued via
cman_tool as they can affect all nodes in your cluster. Most of the time the
cman_tool will only be invoked from your startup and shutdown scripts.
SUBCOMMANDS¶
- join
- This is the main use of cman_tool. It instructs the cluster manager to
attempt to join an existing cluster or (if no existing cluster exists)
then to form a new one on its own.
If no options are given to this command then it will take the cluster
configuration information from cluster.conf. However, it is possible to
provide all the information on the command-line or to override
cluster.conf values by using the command line.
- leave
- Tells CMAN to leave the cluster. You cannot do this if there are
subsystems (eg DLM, GFS) active. You should dismount all GFS filesystems,
shutdown CLVM, fenced and anything else using the cluster manager before
using cman_tool leave. Look at 'cman_tool status' and group_tool to
see how many (and which) subsystems are active.
When a node leaves the cluster, the remaining nodes recalculate quorum and
this may block cluster activity if the required number of votes is not
present. If this node is to be down for an extended period of time and you
need to keep the cluster running, add the remove option, and the
remaining nodes will recalculate quorum such that activity can continue.
- kill
- Tells CMAN to kill another node in the cluster. This will cause the local
node to send a "KILL" message to that node and it will shut
down. Recovery will occur for the killed node as if it had failed. This is
a sort of remote version of "leave force" so only use if if you
really know what you are doing.
- expected
- Tells CMAN a new value of expected votes and instructs it to recalculate
quorum based on this value.
The recalculation takes into account the number of currently active nodes,
so this option should not be used in an attempt to artificially lower the
quorum value in advance of a planned shutdown of cluster nodes. Instead,
the 'cman_tool leave remove' command should be used (see the 'leave'
subcommand above).
Use this option if your cluster has lost quorum due to nodes failing and you
need to get it running again in a hurry.
- version
- Used alone this will report the major, minor, patch and config versions
used by CMAN (also displayed in 'cman_tool status'). It can also be used
with -r to tell cluster members to update the cluster configuration.
If -r is specified, cman will read the configuration file, validate it,
distribute it around the cluster (if necessary) an activate it. See the
VERSION OPTIONS section below for additional options to the version
command.
- wait
- Waits until the node is a member of the cluster and then returns.
- status
- Displays the local view of the cluster status.
- nodes
- Displays the local view of the cluster nodes.
- services
- Displays the local view of subsystems using cman (deprecated, group_tool
should be used instead).
- debug
- Sets the debug level of the running cman daemon. Debug output will be sent
to syslog level LOG_DEBUG. the -d switch specifies the new logging
level. This is the same bitmask used for cman_tool join -d
LEAVE OPTIONS¶
- -w
- Normally, "cman_tool leave" will fail if the cluster is in
transition (ie another node is joining or leaving the cluster). By adding
the -w flag, cman_tool will wait and retry the leave operation repeatedly
until it succeeds or a more serious error occurs.
- -t <seconds>
- If -w is also specified then -t dictates the maximum amount of time
cman_tool is prepared to wait. If the operation times out then a status of
2 is returned.
- force
- Shuts down the cluster manager without first telling any of the subsystems
to close down. Use this option with extreme care as it could easily cause
data loss.
- remove
- Tells the rest of the cluster to recalculate quorum such that activity can
continue without this node.
EXPECTED OPTIONS¶
- -e <expected-votes>
- The new value of expected votes to use. This will usually be enough to
bring the cluster back to life. Values that would cause incorrect quorum
will be rejected.
KILL OPTIONS¶
- -n <nodename>
- The node name of the node to be killed. This should be the unqualified
node name as it appears in 'cman_tool nodes'.
VERSION OPTIONS¶
- -r
- Update config version. You don't need to use this when adding a new node,
the new cman node will tell the rest of the cluster to read the latest
version of the config file automatically. The version present in the new
configuration must be higher than the one currently in use by cman.
cman_tool version on its own will always show the current version and not
the one being looked for. So be aware that the display will possibly not
update immediately after you have run cman_tool version -r.
- -D<option>
- see "JOIN" options
- -S
- By default cman_tool version will try to distribute the new cluster.conf
file using ccs_sync and ricci. If you have distributed the file yourself
and/or do not have ricci installed then the -S option will skip this step.
NOTE: it is still important that all nodes in the cluster have the same
version of the file. Make sure that this is the case before using this
option.
WAIT OPTIONS¶
- -q
- Waits until the cluster is quorate before returning. -t
<seconds> Dictates the maximum amount of time cman_tool is
prepared to wait. If the operation times out then a status of 2 is
returned.
JOIN OPTIONS¶
- -c <clustername>
- Provides a text name for the cluster. You can have several clusters on one
LAN and they are distinguished by this name. Note that the name is hashed
to provide a unique number which is what actually distinguishes the
cluster, so it is possible that two different names can clash. If this
happens, the node will not be allowed into the existing cluster and you
will have to pick another name or use different port number for cluster
communication.
- -p <port>
- UDP port number used for cluster communication. This defaults to
5405.
- -v <votes>
- Number of votes this node has in the cluster. Defaults to 1.
- -e <expected votes>
- Number of expected votes for the whole cluster. If different nodes provide
different values then the highest is used. The cluster will only operate
when quorum is reached - that is more than half the available votes are
available to the cluster. The default for this value is the total number
of votes for all nodes in the configuration file.
- -2
- Sets the cluster up for a special "two node only" mode. Because
of the quorum requirements mentioned above, a two-node cluster cannot be
valid. This option tells the cluster manager that there will only ever be
two nodes in the cluster and relies on fencing to ensure cluster
integrity. If you specify this you cannot add more nodes without taking
down the existing cluster and reconfiguring it. Expected votes should be
set to 1 for a two-node cluster.
- -n <nodename>
- Overrides the node name. By default the unqualified hostname is used. This
option is also used to specify which interface is used for cluster
communication.
- -N <nodeid>
- Overrides the node ID for this node. Normally, nodes are assigned a node
id in cluster.conf. If you specify an incorrect node ID here, the node
might not be allowed to join the cluster. Setting node IDs in the
configuration is a far better way to do this. Note that the node's
application to join the cluster may be rejected if you try to set the
nodeid to one that has already been used, or if the node was previously a
member of the cluster but with a different nodeid.
- -o <nodename>
- Override the name this node will have in the cluster. This will normally
be the hostname or the first name specified by -n. Note how this differs
from -n: -n tells cman_tool how to find the host address and/or the entry
in the configuration file. -o simply changes the name the node will have
in the cluster and has no bearing on the actual name of the machine. Use
this option will extreme caution.
- -m <multicast-address>
- Specifies a multicast address to use for cluster communication. This is
required for IPv6 operation. You should also specify an ethernet interface
to bind to this multicast address using the -i option.
- -w
- Join and wait until the node is a cluster member.
- -q
- Join and wait until the cluster is quorate. If the cluster join fails and
-w (or -q) is specified, then it will be retried. Note that cman_tool
cannot tell whether the cluster join was rejected by another node for a
good reason or that it timed out for some benign reason; so it is strongly
recommended that a timeout is also given with the wait options to join. If
you don't want join to retry on failure but do want to wait, use the
cman_tool join command without -w followed by cman_tool
wait.
- -k <keyfile>
- All traffic sent out by cman/corosync is encrypted. By default the
security key used is simply the cluster name. If you need more security
you can specify a key file that contains the key used to encrypt cluster
communications. Of course, the contents of the key file must be the same
on all nodes in the cluster. It is up to you to securely copy the file to
the nodes.
- -t <seconds>
- If -w or -q is also specified then -t dictates the maximum amount of time
cman_tool is prepared to wait. If the operation times out then a status of
2 is returned. Note that just because cman_tool has given up, does not
mean that cman itself has stopped trying to join a cluster.
- -X
- Tells cman not to use the configuration file to get cluster information.
If you use this option then cman will apply several defaults to the
cluster to get it going. The cluster name will be "RHCluster",
node IDs will default to the IP address of the node and remote node names
will show up as Node<nodeid>. All of these, apart from the node
names can be overridden on the cman_tool command-line if required.
If you have to set up fence devices, services or anything else in
cluster.conf then this option is probably not worthwhile to you - the
extra readability of sensible node names and numbers will make it worth
using cluster.conf for the cluster too. But for a simple failover cluster
this might save you some effort.
On each node using this configuration you will need to have the same
authorization key installed. To create this key run
corosync-keygen
mv /etc/ais/authkey /etc/cluster/cman_authkey
then copy that file to all nodes you want to join the cluster.
- -C
- Overrides the default configuration module. Usually cman uses xmlconfig
(cluster.conf) to load its configuration. If you have your configuration
database held elsewhere (eg LDAP) and have a configuration plugin for it,
then you should specify the name of the module (see the documentation for
the module for the name of it - it's not necessarily the same as the
filename) here.
It is possible to chain configuration modules by separating them with
colons. So to add two modules (eg) 'ldapconfig' and 'ldappreproc' to the
chain start cman with -C ldapconfig:ldappreproc
The default value for this is 'xmlconfig'. Note that if the -X is on the
command-line then -C will be ignored.
- -A
- Don't load openais services. Normally cman_tool join will load the
configuration module 'openaisserviceenablestable' which will load the
services installed by openais. If you don't want to use these services or
have not installed openais then this switch will disable them.
- -D
- Tells cman_tool whether to validate the configuration before loading or
reloading it. By default the configuration is validated, which is
equivalent to -Dfail.
-Dwarn will validate the configuration and print any messages arising, but
will attempt to use it regardless of its validity.
-Dnone (or just -D) will skip the validation completely.
The -D switch does not take a space between -D and the parameter. so '-D
fail' will cause an error. Use -Dfail.
NODES OPTIONS¶
- -a
- Shows the IP address(es) the nodes are communicating on.
- -n <nodename>
- Shows node information for a specific node. This should be the unqualified
node name as it appears in 'cman_tool nodes'.
- -F <format>
- Specify the format of the output. The format string may contain one or
more format options, each separated by a comma. Valid format options
include: id, name, type, and addr.
DEBUG OPTIONS¶
- -d<value>
- The value is a bitmask of
2 Barriers
4 Membership messages
8 Daemon operation, including command-line interaction
16 Interaction with Corosync
32 Startup debugging (cman_tool join operations only)
NOTES¶
the
nodes subcommand shows a list of nodes known to cman. the state is
one of the following:
M The node is a member of the cluster
X The node is not a member of the cluster
d The node is known to the cluster but disallowed access to it.
ENVIRONMENT VARIABLES¶
cman_tool removes most environment variables before forking and running
Corosync, as well as adding some of its own for setting up configuration
parameters that were overridden on the command-line, the exception to this is
that variable with names starting COROSYNC_ will be passed down intact as they
are assumed to be used for configuring the daemon.
DISALLOWED NODES¶
Occasionally (but very infrequently I hope) you may see nodes marked as
"Disallowed" in cman_tool status or "d" in cman_tool
nodes. This is a bit of a nasty hack to get around mismatch between what the
upper layers expect of the cluster manager and corosync.
- If a node experiences a momentary lack of connectivity, but one that is
long enough to trigger the token timeouts, then it will be removed from the
cluster. When connectivity is restored corosync will happily let it rejoin
the cluster with no fuss. Sadly the upper layers don't like this very much.
They may (indeed probably will have) have changed their internal state while
the other node was away and there is no straightforward way to bring the
rejoined node up-to-date with that state. When this happens the node is
marked "Disallowed" and is not permitted to take part in cman
operations.
If the remainder of the cluster is quorate the the node will be sent a kill
message and it will be forced to leave the cluster that way. Note that fencing
should kick in to remove the node permanently anyway, but it may take longer
than the network outage for this to complete.
If the remainder of the cluster is inquorate then we have a problem. The
likelihood is that we will have two (or more) partitioned clusters and we
cannot decide which is the "right" one. In this case we need to
defer to the system administrator to kill an appropriate selection of nodes to
restore the cluster to sensible operation.
The latter scenario should be very rare and may indicate a bug somewhere in the
code. If the local network is very flaky or busy it may be necessary to
increase some of the protocol timeouts for corosync. We are trying to think of
better solutions to this problem.
Recovering from this state can, unfortunately, be complicated. Fortunately, in
the majority of cases, fencing will do the job for you, and the disallowed
state will only be temporary. If it persists, the recommended approach it is
to do a cman tool nodes on all systems in the cluster and determine the
largest common subset of nodes that are valid members to each other. Then
reboot the others and let them rejoin correctly. In the case of a single-node
disconnection this should be straightforward, with a large cluster that has
experienced a network partition it could get very complicated!
Example:
In this example we have a five node cluster that has experienced a network
partition. Here is the output of cman_tool nodes from all systems:
Node Sts Inc Joined Name
1 M 2372 2007-11-05 02:58:55 node-01.example.com
2 d 2376 2007-11-05 02:58:56 node-02.example.com
3 d 2376 2007-11-05 02:58:56 node-03.example.com
4 M 2376 2007-11-05 02:58:56 node-04.example.com
5 M 2376 2007-11-05 02:58:56 node-05.example.com
Node Sts Inc Joined Name
1 d 2372 2007-11-05 02:58:55 node-01.example.com
2 M 2376 2007-11-05 02:58:56 node-02.example.com
3 M 2376 2007-11-05 02:58:56 node-03.example.com
4 d 2376 2007-11-05 02:58:56 node-04.example.com
5 d 2376 2007-11-05 02:58:56 node-05.example.com
Node Sts Inc Joined Name
1 d 2372 2007-11-05 02:58:55 node-01.example.com
2 M 2376 2007-11-05 02:58:56 node-02.example.com
3 M 2376 2007-11-05 02:58:56 node-03.example.com
4 d 2376 2007-11-05 02:58:56 node-04.example.com
5 d 2376 2007-11-05 02:58:56 node-05.example.com
Node Sts Inc Joined Name
1 M 2372 2007-11-05 02:58:55 node-01.example.com
2 d 2376 2007-11-05 02:58:56 node-02.example.com
3 d 2376 2007-11-05 02:58:56 node-03.example.com
4 M 2376 2007-11-05 02:58:56 node-04.example.com
5 M 2376 2007-11-05 02:58:56 node-05.example.com
Node Sts Inc Joined Name
1 M 2372 2007-11-05 02:58:55 node-01.example.com
2 d 2376 2007-11-05 02:58:56 node-02.example.com
3 d 2376 2007-11-05 02:58:56 node-03.example.com
4 M 2376 2007-11-05 02:58:56 node-04.example.com
5 M 2376 2007-11-05 02:58:56 node-05.example.com
In this scenario we should kill the node node-02 and node-03. Of course, the 3
node cluster of node-01, node-04 & node-05 should remain quorate and be
able to fenced the two rejoined nodes anyway, but it is possible that the
cluster has a qdisk setup that precludes this.
CONFIGURATION SYSTEMS¶
This section details how the configuration systems work in cman. You might need
to know this if you are using the -C option to cman_tool, or writing your own
configuration subsystem.
By default cman uses two configuration plugins to corosync. The first,
'xmlconfig', reads the configuration information stored in cluster.conf and
stores it in an internal database, in the same schema as it finds in
cluster.conf. The second plugin, 'cmanpreconfig', takes the information in
that the database, adds several cman defaults, determines the corosync node
name and nodeID and formats the information in a similar manner to
corosync.conf(5). Corosync then reads those keys to start the cluster
protocol. cmanpreconfig also reads several environment variables that might be
set by cman_tool which can override information in the configuration.
In the absence of xmlconfig, ie when 'cman_tool join' is run with -X switch
(this removes xmlconfig from the module list), cmanpreconfig also generates
several defaults so that the cluster can be got running without any
configuration information - see above for the details.
Note that cmanpreconfig will not overwrite corosync keys that are explicitly set
in the configuration file, allowing you to provide custom values for token
timeouts etc, even though cman has its own defaults for some of those values.
The exception to this is the node name/address and multicast values, which are
always taken from the cman configuration keys.
Most of the extra keys that cmanpreconfig adds are outside of the /cluster/ tree
and will only be seen if you dump the whole of corosync's object database.
However it does add some keys into /cluster/cman that you would not normally
see in a normal cluster.conf file. These are harmless, though could be
confusing. The most obvious of these is the "nodename" option which
is passed from cmanpreconfig to the name cman module, to save it recalculating
the node name again.