'\" t .\" Title: ocf_linbit_drbd .\" Author: LINBIT HA Solutions GmbH .\" Generator: DocBook XSL Stylesheets vsnapshot .\" Date: 09/29/2020 .\" Manual: OCF resource agents .\" Source: drbd-pacemaker 9.15.0 .\" Language: English .\" .TH "OCF_LINBIT_DRBD" "7" "09/29/2020" "drbd-pacemaker 9.15.0" "OCF resource agents" .\" ----------------------------------------------------------------- .\" * Define some portability stuff .\" ----------------------------------------------------------------- .\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .\" http://bugs.debian.org/507673 .\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html .\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .ie \n(.g .ds Aq \(aq .el .ds Aq ' .\" ----------------------------------------------------------------- .\" * set default formatting .\" ----------------------------------------------------------------- .\" disable hyphenation .nh .\" disable justification (adjust text to left margin only) .ad l .\" ----------------------------------------------------------------- .\" * MAIN CONTENT STARTS HERE * .\" ----------------------------------------------------------------- .SH "NAME" ocf_linbit_drbd \- Manages a DRBD device as a Master/Slave resource .SH "SYNOPSIS" .HP \w'\fBdrbd\fR\ 'u \fBdrbd\fR [start | stop | monitor | promote | demote | meta\-data | validate\-all] .SH "DESCRIPTION" .PP This resource agent manages a DRBD resource as a master/slave resource\&. DRBD is a shared\-nothing replicated storage device\&. .PP NOTE: To avoid data\-divergence, you should enable either DRBD "quorum" and "on\-no\-quorum io\-error" (recommended), or configure proper fencing policies in both DRBD *and* Pacemaker (fencing resource\-and\-stonith)\&. This cannot be done from this resource agent alone\&. .PP See the DRBD User\*(Aqs Guide for more information\&. https://docs\&.linbit\&.com/ .SH "SUPPORTED PARAMETERS" .PP \fBdrbd_resource\fR .RS 4 The name of the drbd resource from the drbd\&.conf file\&. .sp (unique, required, string, no default) .RE .PP \fBdrbdconf\fR .RS 4 Full path to the drbd\&.conf file\&. .sp (optional, string, default "/etc/drbd\&.conf") .RE .PP \fBadjust_master_score\fR .RS 4 Space separated list of four master score adjustments for different scenarios: \- only access to \*(Aqconsistent\*(Aq data \- only remote access to \*(Aquptodate\*(Aq data \- currently Secondary, local access to \*(Aquptodate\*(Aq data, but remote is unknown \- local access to \*(Aquptodate\*(Aq data, and currently Primary or remote is known .sp Numeric values are expected to be non\-decreasing\&. .sp The first value is 0 by default to prevent pacemaker from trying to promote while it is unclear whether the data is really the most recent copy\&. (DRBD knows it is "consistent", but is unsure about "uptodate"ness)\&. Please configure proper fencing methods both in DRBD (fencing resource\-and\-stonith; appropriate (un)fence\-peer handlers) AND in Pacemaker to make this work reliably\&. .sp Advanced use: Adjust the other values to better fit into complex dependency score calculations\&. .sp Intentionally diskless nodes ("Diskless Clients") with access to good data via some (or all) their peers will use the 3rd or 4th value (minus one) when they are (Secondary, not all peers up\-to\-date) or (ALL peers are up\-to\-date, or they are Primary themselves)\&. This may need to change if this should become a frequent use case\&. .sp Special considerations: .sp If a Secondary DRBD is connected to a peer in Primary role, but Pacemaker does not know about any Primary (using crm_resource \-\-locate), we conclude that there likely is a cluster\-split\-brain, and may try to "help" Pacemaker by removing the master\-score\&. Also see "remove_master_score_if_peer_primary"\&. .sp (optional, string, default "0 10 1000 10000") .RE .PP \fBstop_outdates_secondary\fR .RS 4 Recommended setting: leave at default (disabled)\&. .sp Note that this feature depends on the passed in information in OCF_RESKEY_CRM_meta_notify_master_uname to be correct, which unfortunately is not reliable for pacemaker versions up to at least 1\&.0\&.10 / 1\&.1\&.4\&. .sp If a Secondary is stopped (unconfigured), it may be marked as outdated in the drbd meta data, if we know there is still a Primary running in the cluster\&. Note that this does not affect fencing policies set in drbd config, but is an additional safety feature of this resource agent only\&. You can enable this behaviour by setting the parameter to true\&. .sp If this feature seems to not do what you expect, make sure you have defined fencing policies in the drbd configuration as well\&. .sp (optional, boolean, default false) .RE .PP \fBignore_missing_notifications\fR .RS 4 Some setups do not benefit from notifications\&. Allow to disable notifications without patching this resource agent\&. .sp (optional, boolean, default false) .RE .PP \fBwfc_timeout\fR .RS 4 Unless set to the empty string or any non\-digits, wait (at most) this many seconds for the connection(s) to be established after bringing them up during "start"\&. .sp (optional, numeric, default 5) .RE .PP \fBremove_master_score_if_peer_primary\fR .RS 4 See also "adjust_master_score" and "fail_promote_early_if_peer_primary"\&. .sp To prevent a potentially failed promotion attempt in case of cluster split\-brain (Pacemaker communication loss) while DRBD is still connected to a Primary, you can request to remove any master score while DRBD is connected to a Primary (and that Primary peer looks like it has all disks up\-to\-date)\&. .sp This may delay legitimate failovers after Primary crash by up to some TCP timeout (until DRBD realizes that the Primary is gone) plus one monitoring interval\&. .sp This parameter is interpreted almost as an "ocf boolean", with the exception of a literal "unexpected", that is: .sp \- (yes|true|1) [actually, according to the OCF spec, also (YES|TRUE|True|ja|ON), but please don\*(Aqt go there]: is "true": remove (or never assign) master scores, if DRBD appears to see a (healthy) Primary .sp \- "unexpected": assign master scores as described under "adjust_master_score", while removing it if DRBD appears to see a (healthy) Primary that Pacemaker does not know about (as determined by crm_resource \-\-locate)\&. .sp \- everything else is "false": ignore the peer role while assigning master scores\&. .sp (optional, string, default "false") .RE .PP \fBfail_promote_early_if_peer_primary\fR .RS 4 See also "adjust_master_score" and "remove_master_score_if_peer_primary"\&. .sp To avoid a useless retry loop during promotion attempts in case of cluster split\-brain (Pacemaker communication loss) while DRBD is still connected to a Primary, you can chose to give up after the first try if this situation is detected\&. .sp If a Primary "vanishes", TCP may not immediately detect this, and an idle DRBD may take some time until it does in\-DRBD\-protocol "pings"\&. Pacemaker may well detect Primary loss earlier than DRBD, and try to promote while DRBD thinks it can still see a Primary\&. Which means, in general, trying to promote at least once is necessary, as that implies an in\-DRBD\-protocol "peer alive" check\&. .sp But if that does not succeed, re\-trying until we hit the operation timeout may not be desired, so you can disable it\&. .sp (optional, boolean, default false) .RE .PP \fBunfence_if_all_uptodate\fR .RS 4 If all volumes of this resource report to be UpToDate, call an unfence script hook, just in case some stale fencing constraint or similar is still around\&. .sp \- With DRBD utils version <= 8\&.9\&.4, this is hardcoded to /usr/lib/drbd/crm\-unfence\-peer\&.sh \-r $DRBD_RESOURCE .sp \- With DRBD utils version >= 8\&.9\&.5, this is dispatched to $DRBDADM unfence\-peer $DRBD_RESOURCE .sp In any case, the hook itself is responsible to fetch $OCF_RESKEY_unfence_extra_args from its environment\&. .sp (optional, boolean, default false) .RE .PP \fBunfence_extra_args\fR .RS 4 This may be used to pass extra hints to the unfence hook\&. See description of unfence_if_all_uptodate\&. .sp (optional, boolean, default \-\-quiet \-\-flock\-required \-\-flock\-timeout 0 \-\-unfence\-only\-if\-owner\-match) .RE .PP \fBrequire_drbd_module_version_ge\fR .RS 4 Use this you want to force failure of this resource agent if the detected DRBD kernel (module) driver version is lower than a required minimum\&. .sp Example: use require_drbd_module_version_ge=9\&.0\&.16 to fail unless DRBD module version >= 9\&.0\&.16 is available (effectively requires DRBD 9)\&. .sp The intention of this is to give a more useful failure message after accidentally downgrading the DRBD version by installing/upgrading a new kernel\&. .sp Note: "ge", "greater\-or\-equal", inclusive\&. Required format: x\&.y\&.z .sp (optional, string, no default) .RE .PP \fBrequire_drbd_module_version_lt\fR .RS 4 Use this you want to force failure of this resource agent if the detected DRBD kernel (module) driver version is higher than a required maximum\&. .sp Example: use require_drbd_module_version_lt=9\&.0\&.0 to fail unless DRBD module version < 9\&.0 is available (effectively requires DRBD 8\&.4)\&. .sp Note: "lt", "less\-than", exclusive\&. Required format: x\&.y\&.z .sp (optional, string, no default) .RE .PP \fBconnect_only_after_promote\fR .RS 4 This may be useful for "stacked" setups without proper fencing on the lower layer (which we obviously do not recommend), to avoid some of the ugly side effects that may arise after resolving a split\-brain on the lower layer\&. .sp Keep this DRBD instance disconnected until it is promoted\&. After promotion we issue an additional "adjust", which is supposed to initiate the connection attempts\&. .sp This causes a new data generation identifier ("current uuid") to be generated after the failover of a "healthy" DRBD\&. .sp (optional, boolean, default false) .RE .SH "SUPPORTED ACTIONS" .PP This resource agent supports the following actions (operations): .PP \fBstart\fR .RS 4 Starts the resource\&. Suggested minimum timeout: 240\&. .RE .PP \fBreload\fR .RS 4 Suggested minimum timeout: 30\&. .RE .PP \fBpromote\fR .RS 4 Promotes the resource to the Master role\&. Suggested minimum timeout: 90\&. .RE .PP \fBdemote\fR .RS 4 Demotes the resource to the Slave role\&. Suggested minimum timeout: 90\&. .RE .PP \fBnotify\fR .RS 4 Suggested minimum timeout: 90\&. .RE .PP \fBstop\fR .RS 4 Stops the resource\&. Suggested minimum timeout: 100\&. .RE .PP \fBmonitor (Slave role)\fR .RS 4 Performs a detailed status check\&. Suggested minimum timeout: 20\&. Suggested interval: 20\&. .RE .PP \fBmonitor (Master role)\fR .RS 4 Performs a detailed status check\&. Suggested minimum timeout: 20\&. Suggested interval: 10\&. .RE .PP \fBmeta\-data\fR .RS 4 Retrieves resource agent metadata (internal use only)\&. Suggested minimum timeout: 5\&. .RE .PP \fBvalidate\-all\fR .RS 4 Performs a validation of the resource configuration\&. .RE .SH "EXAMPLE CRM SHELL" .PP The following is an example configuration for a drbd resource using the \fBcrm\fR(8) shell: .sp .if n \{\ .RS 4 .\} .nf primitive p_drbd ocf:linbit:drbd \e params \e drbd_resource=\fIstring\fR \e op monitor timeout="20" interval="20" role="Slave" \e op monitor timeout="20" interval="10" role="Master" .fi .if n \{\ .RE .\} .sp .if n \{\ .RS 4 .\} .nf ms ms_drbd p_drbd \e meta notify="true" interleave="true" .fi .if n \{\ .RE .\} .SH "EXAMPLE PCS" .PP The following is an example configuration for a drbd resource using \fBpcs\fR(8) .sp .if n \{\ .RS 4 .\} .nf pcs resource create p_drbd ocf:linbit:drbd \e drbd_resource=\fIstring\fR \e op monitor timeout="20" interval="20" role="Slave" \e op monitor timeout="20" interval="10" role="Master" \-\-master .fi .if n \{\ .RE .\} .SH "SEE ALSO" .PP \m[blue]\fB\%https://docs.linbit.com/\fR\m[], \m[blue]\fB\%https://clusterlabs.org/\fR\m[], \m[blue]\fB\%https://www.linbit.com/drbd-community/\fR\m[] .SH "AUTHORS" .PP \fBLINBIT HA Solutions GmbH\fR