Pacemaker: Difference between revisions

From DWIKI
Line 23: Line 23:
*[http://www.linux-ha.org/doc/dev-guides/ra-dev-guide.html The OCF Resource Agent Developer’s Guide]
*[http://www.linux-ha.org/doc/dev-guides/ra-dev-guide.html The OCF Resource Agent Developer’s Guide]
*https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Administration/_visualizing_the_action_sequence.html]
*https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Administration/_visualizing_the_action_sequence.html]
*[https://documentation.suse.com/sle-ha/15-SP1/html/SLE-HA-all/cha-ha-maintenance.html#sec-ha-maint-shutdown-node Implications of Taking Down a Cluster Node]


=Notes=
=Notes=

Revision as of 12:29, 29 May 2020

uses Corosync or heartbeat, (it seems) corosync is the one to go for.

Links

Notes

by specifying -INFINITY, the constraint is binding.


Commands/tools

Useful commands

save entire config

pcs config backup configfile

Dump entire crm

cibadm -Q

FAQ

Update resource

pcs resource update resourcname variablename=newvalue

Move resource to node

pcs resource move RES NODE

Show default resource stickiness

pcs resource default

Set resource stickiness

pcs resource meta <resource_id> resource-stickiness=100

and to check:

pcs resource show <resource_id>

Or better yet:

crm_simulate -Ls

Undo resource move

pcs constraint --full
Location Constraints:
  Resource: FOO
    Enabled on: santest-a (score:INFINITY) (role: Started) (id:cli-prefer-FOO)
pcs constraint remove cli-prefer-FOO

pcs status: Error: cluster is not currently running on this node

Don't panic until after

sudo pcs status

show detailed resources

pcs resource --full

stop node (standby)

The following command puts the specified node into standby mode. The specified node is no longer able to host resources. Any resources currently active on the node will be moved to another node. If you specify the --all, this command puts all nodes into standby mode.


pcs cluster standby node-1

or

pcs node standby

on the node itself


and undo this with

pcs cluster unstandby node-1

or

pcs node unstandby

set maintenance mode

This sets the cluster in maintenance mode, so it stops managing the resources

pcs property set maintenance-mode=true

Error: cluster is not currently running on this node

pcs cluster start


Remove a constraint

pcs constraint list --full

to identify the constraints and then

pcs constraint remove <whatever-constraint-id>

Clear error messages

 pcs resource cleanup

Call cib_replace failed (-205): Update was older than existing configuration

can be run only once

[Error signing on to the CIB service: Transport endpoint is not connected ]

probably selinux

Show allocation scores

crm_simulate -sL

Show resource failcount

pcs resource failcount show <resource>


export current configuration as commands

pcs config export pcs-commands

debug resource

pcs resource debug-start resource

*** Resource management is DISABLED *** The cluster will not attempt to start, stop or recover services

Cluster is in maintenance mode

Found meta data is "unclean", please apply-al first

Troubleshooting

pcs status all resources stopped

probably a bad ordering constraint

Fencing and resource management disabled due to lack of quorum

Problably means you forgot to pcs cluster start the other node

Resource cannot run anywhere

Check if some stickiness was set

pcs resource update unable to find resource

Trying to unset stickiness:

pcs resource update ISCSIgroupTEST1 meta resource-stickiness=

caused: Error: Unable to find resource: ISCSIgroupTEST1

what his means is: try it on the host where stickiness was set :)