Pacemaker
From DWIKI
uses Corosync or heartbeat, (it seems) corosync is the one to go for.
Links
- Cluster Labs
- https://github.com/ClusterLabs/pacemaker/blob/master/doc/pcs-crmsh-quick-ref.md
- Pacemaker explained
- pcs command resference
- Pacemaker and pcs on Linux example, managing cluster resource
- Building a high-available failover cluster with Pacemaker, Corosync & PCS
- HIGH AVAILABILITY ADD-ON ADMINISTRATION
- How To Create a High Availability Setup with Corosync, Pacemaker, and Floating IPs on Ubuntu 14.04
- http://fibrevillage.com/sysadmin/304-pcs-command-reference
- http://wiki.lustre.org/Creating_Pacemaker_Resources_for_Lustre_Storage_Services
- Pacemaker and pcs on Linux example, managing cluster resource
- Cheatsheet
- PCS tips&tricks
- pacemaker + drbd + iscsi, also useful pcs tips
- Mandatory and advisory ordering in Pacemaker
- http://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/_specifying_a_preferred_location.html
- resource sets
- History of HA clustering
Notes
by specifying -INFINITY, the constraint is binding.
Commands/tools
- crm
- crmadmin
- cibadm
- pcs
- corosync
Useful commands
Dump entire crm
cibadm -Q
FAQ
Update resource
pcs resource update resourcname variablename=newvalue
Move resource to node
pcs resource move RES NODE
Show default resource stickiness
pcs resource default
Set resource stickiness
pcs resource meta <resource_id> resource-stickiness=100
and to check:
pcs resource show <resource_id>
Or better yet:
crm_simulate -Ls
Undo resource move
pcs constraint --full
Location Constraints: Resource: FOO Enabled on: santest-a (score:INFINITY) (role: Started) (id:cli-prefer-FOO)
pcs constraint remove cli-prefer-FOO
pcs status: Error: cluster is not currently running on this node
Don't panic until after
sudo pcs status
show detailed resources
pcs resource --full
stop node (standby)
The following command puts the specified node into standby mode. The specified node is no longer able to host resources. Any resources currently active on the node will be moved to another node. If you specify the --all, this command puts all nodes into standby mode.
pcs cluster standby node-1
or
pcs node standby
on the node itself
and undo this with
pcs cluster unstandby node-1
or
pcs node unstandby
set maintenance mode
This sets the cluster in maintenance mode, so it stops managing the resources
pcs property set maintenance-mode=true
Error: cluster is not currently running on this node
pcs cluster start
Remove a constraint
pcs constraint list --full
to identify the constraints and then
pcs constraint remove <whatever-constraint-id>
Clear error messages
pcs resource cleanup
Call cib_replace failed (-205): Update was older than existing configuration
can be run only once
[Error signing on to the CIB service: Transport endpoint is not connected ]
probably selinux
Show allocation scores
crm_simulate -sL
Show resource failcount
pcs resource failcount show <resource>
debug resource
pcs resource debug-start resource
*** Resource management is DISABLED *** The cluster will not attempt to start, stop or recover services
??