Pacemaker
Pacemaker uses Corosync or Heartbeat as its cluster messaging layer; Corosync seems to be the one to go for.
Links
- Cluster Labs
- https://github.com/ClusterLabs/pacemaker/blob/master/doc/pcs-crmsh-quick-ref.md
- Pacemaker explained
- pcs command reference
- Building a high-available failover cluster with Pacemaker, Corosync & PCS
- HIGH AVAILABILITY ADD-ON ADMINISTRATION
- How To Create a High Availability Setup with Corosync, Pacemaker, and Floating IPs on Ubuntu 14.04
- http://fibrevillage.com/sysadmin/304-pcs-command-reference
- Pacemaker and pcs on Linux example, managing cluster resource
- Cheatsheet
- PCS tips&tricks
- pacemaker + drbd + iscsi, also useful pcs tips
- Mandatory and advisory ordering in Pacemaker
- http://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/_specifying_a_preferred_location.html
- resource sets
Notes
By specifying a score of -INFINITY, the constraint is binding: the resource will never run on that node.
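As an illustration of a binding constraint, here is a minimal sketch that keeps a resource off one node. It assumes the `pcs constraint location ... avoids` form, where a score of INFINITY on "avoids" is stored as -INFINITY; the function name `ban_from_node` and the resource and node names are made up for the example.

```shell
# Keep a resource off a node for good (hypothetical helper).
# "avoids NODE=INFINITY" creates a location constraint with score -INFINITY,
# so the cluster will not place the resource there even as a last resort.
ban_from_node() {
    res=$1    # resource id, e.g. FOO
    node=$2   # node name, e.g. santest-a
    pcs constraint location "$res" avoids "$node=INFINITY"
}
```

Usage would be something like `ban_from_node FOO santest-a`; the resulting constraint shows up in `pcs constraint --full` and can be removed by its id.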
Commands/tools
- crm
- crmadmin
- cibadmin
- pcs
- corosync
Useful commands
Dump the entire CIB (cluster configuration and status)
cibadmin -Q
FAQ
Move resource to node
pcs resource move RES NODE
Undo resource move
pcs constraint --full
Location Constraints: Resource: FOO Enabled on: santest-a (score:INFINITY) (role: Started) (id:cli-prefer-FOO)
pcs constraint remove cli-prefer-FOO
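The lookup and removal above can be scripted. This is only a sketch: it assumes that the constraints left behind by a move or ban carry ids starting with `cli-`, as in the output shown, and the function names are invented for the example.

```shell
# Print the ids of constraints created by "pcs resource move" or "ban".
# Reads "pcs constraint --full" output on stdin.
# Assumption: those ids appear as "(id:cli-...)" in the output.
cli_constraint_ids() {
    grep -o '(id:cli-[^)]*)' | sed -e 's/^(id://' -e 's/)$//'
}

# Remove every such constraint (hypothetical helper; needs a running cluster).
undo_moves() {
    pcs constraint --full | cli_constraint_ids | while read -r id; do
        pcs constraint remove "$id"
    done
}
```

Depending on the pcs version, `pcs resource clear RES` may do the same for a single resource without looking up the id first.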
pcs status: Error: cluster is not currently running on this node
Don't panic until you have retried as root:
sudo pcs status
show detailed resources
pcs resource --full
put a node in standby (stops resources running on it)
pcs cluster standby node-1
or
pcs node standby
on the node itself
set maintenance mode
pcs property set maintenance-mode=true
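Maintenance mode stays on until the property is set back to false. A small sketch for checking and leaving it, assuming that `pcs status` prints the "Resource management is DISABLED" banner while the mode is active; the function names are hypothetical.

```shell
# Succeed if the cluster reports that resource management is disabled.
# Assumption: the banner text is printed by "pcs status" in maintenance mode.
in_maintenance() {
    pcs status 2>&1 | grep -q 'Resource management is DISABLED'
}

# Switch maintenance mode off, but only if it is currently on.
leave_maintenance() {
    if in_maintenance; then
        pcs property set maintenance-mode=false
    fi
}
```

Running `leave_maintenance` on any cluster node should be enough, since the property is cluster-wide.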
Error: cluster is not currently running on this node
pcs cluster start
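The check and the fix can be combined. A minimal sketch, assuming `pcs status` exits with a non-zero status when the cluster is not running on the node; `ensure_cluster_running` is a made-up name.

```shell
# Start the cluster services on this node only if they are not running yet.
# Assumption: "pcs status" fails (non-zero exit) when the cluster is down here.
ensure_cluster_running() {
    if ! pcs status >/dev/null 2>&1; then
        pcs cluster start
    fi
}
```

Run it as root (or via sudo), since an unprivileged `pcs status` can report the same error even when the cluster is up. `pcs cluster start --all` would start every node instead of just the local one.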
Remove a constraint
pcs constraint list --full
to identify the constraints and then
pcs constraint remove <whatever-constraint-id>
Clear error messages
pcs resource cleanup
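Call cib_replace failed (-205): Update was older than existing configuration
The push of a saved CIB can be run only once per saved copy; probably a fresh dump of the CIB is needed before trying again.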
[Error signing on to the CIB service: Transport endpoint is not connected ]
Probably SELinux.
Show allocation scores
crm_simulate -sL
Show resource failcount
pcs resource failcount show <resource>
debug resource
pcs resource debug-start <resource>
*** Resource management is DISABLED *** The cluster will not attempt to start, stop or recover services
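The cluster is in maintenance mode (see "set maintenance mode" above). To leave it:
pcs property set maintenance-mode=false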