DRBD
Distributed Replicated Block Device
Links
- Homepage
- DRBD-8.4 user's guide
- DRBD + pacemaker + NFS, pretty good doc
- http://www.securityandit.com/system/pacemaker-cluster-with-nfs-and-drbd/
- https://www.sebastien-han.fr/blog/2012/08/01/corosync-rrp-configuration/
- Building a redundant pair of Linux storage servers using DRBD and Heartbeat
- https://serverstack.wordpress.com/2017/05/31/install-and-configure-drbd-cluster-on-rhel7-centos7/
- http://www.asplund.nu/xencluster/page2.html
- Resizing resources
- DRBD cheat sheet
- http://www.linux-ha.org/wiki/Main_Page
- HA iSCSI with drbd and pacemaker
- Internal meta data
- Configure sync rate
- https://serverfault.com/questions/740311/drbd-terrible-sync-performance-on-10gige
- also tuning tips
- Debian DRBD: How to resize NFS on drbd volume on top of LVM
- HOWTO: Resolve DRBD split-brain recovery manually
See also: http://www.gluster.org/ and http://ceph.com/
Stacked resources
- Using stacked DRBD resources in Pacemaker clusters
- https://github.com/fghaas/drbd-documentation/blob/master/users-guide/pacemaker.txt
- Pacemaker and DRBD9 stacked resources
- DRBD 8.3 Third Node Replication (stacked)
Support
Tools
- LCMC, a GUI for managing LVM and DRBD in a pacemaker environment
- zabbix monitoring for drbd
drbdadm
drbd-overview
drbdsetup
Docs
- Resizing drbd xen lvm (don't worry about the meta-data if that's not internal) (but it seems wrong, using phy: instead of drbd: )
- http://www.asplund.nu/xencluster/page2.html
Recovery
- Recovering from a DRBD split-brain scenario in heartbeat
- http://www.asplund.nu/xencluster/page2.html
- https://docs.linbit.com/doc/users-guide-83/s-split-brain-notification-and-recovery/ (Split brain notification and automatic recovery)
- Troubleshooting DRBD on MediaCentral
GFS on DRBD
- http://sourceware.org/cluster/wiki/DRBD_Cookbook
- http://www.piemontewireless.net/Storage_on_Cluster_DRBD_and_GFS2
Cheatsheet
Make device primary
drbdadm primary yourdeviceID
or
drbdsetup /dev/drbdX primary -o
Grow resource
On both nodes:
lvextend -L+10G /dev/DRBD/myresource
On one node:
drbdadm resize myresource
Check resource file
Editing files in /etc/drbd.d/ directly is a bad plan; edit a copy and check its syntax first:
drbdadm dump -c /tmp/test.res
FAQ
Get out of 'Standalone'
Disconnect and connect again until it works :)
1: State change failed: (-2) Need access to UpToDate data
When you get that while trying to make a node/resource primary, try
drbdadm primary drbdX --force
calculate metadata size
https://serverfault.com/questions/433999/calculating-drbd-meta-size
Cs=`blockdev --getsz /dev/foo`
Bs=`blockdev --getpbsz /dev/foo`
TODO finish this
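One possible way to finish the calculation, as a sketch only: the DRBD 8.4 user's guide estimates internal meta data as ceil(Cs / 2^18) * 8 + 72 sectors, where Cs is the device size in 512-byte sectors (which is what blockdev --getsz reports). The function name drbd_md_sectors is made up for this page, and the formula is assumed to cover 8.x with a single peer only; DRBD 9 needs more space per additional peer.

```shell
# Hypothetical helper: estimate DRBD 8.4 internal meta data size in 512-byte sectors.
# $1 = device size in 512-byte sectors, e.g. from: blockdev --getsz /dev/foo
drbd_md_sectors() {
    cs=$1
    # ceil(cs / 2^18) * 8 + 72, done with integer arithmetic
    echo $(( (cs + 262143) / 262144 * 8 + 72 ))
}

# Example: a 1 GiB device has 2097152 sectors
drbd_md_sectors 2097152
```

For a real device this would be called as drbd_md_sectors "$(blockdev --getsz /dev/foo)"; multiply the result by 512 to get bytes.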
'mydrbd' not defined in your config (for this host).
If drbdadm create-md throws this, 'this host' is the clue: the host name used in the resource file must match the output of `hostname`
https://newbiedba.wordpress.com/2015/09/21/drbd-not-defined-in-your-config-for-this-host/
show resource sizes
lsblk
commands to show info
drbdmon
drbdtop
resolving split brain issues
- https://docs.linbit.com/doc/users-guide-83/s-resolve-split-brain/
- https://www.sebastien-han.fr/blog/2012/04/25/DRBD-split-brain/
diskless
You might try
drbdadm attach drbd0
The disk contains an unclean file system (0, 0).
Metadata kept in Windows cache, refused to mount. Falling back to read-only mount because the NTFS partition is in an unsafe state. Please resume and shutdown Windows fully (no hibernation or fast restarting.)
When trying to mount a snapshot (kpartx -av backup-snap1 etc)
???
sync is slow
On secondary:
drbdadm disk-options --c-plan-ahead=0 --resync-rate=50M drbd0
and to reset after sync:
drbdadm adjust drbd0
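For choosing the fixed --resync-rate value, the DRBD user's guide gives a rule of thumb of about 30% of the available replication bandwidth. A minimal sketch of that arithmetic, assuming the bandwidth is already known in MB/s; resync_rate_for is a made-up helper name, not a DRBD tool.

```shell
# Hypothetical helper: 30% of the available bandwidth, rounded down, with an M suffix.
# $1 = available replication bandwidth in MB/s (the slower of network and disk)
resync_rate_for() {
    echo "$(( $1 * 30 / 100 ))M"
}

# Example: roughly what a gigabit link can carry
resync_rate_for 110
```

The result could then be passed on, e.g. drbdadm disk-options --c-plan-ahead=0 --resync-rate="$(resync_rate_for 110)" drbd0.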
show configuration
drbdsetup show
show more info
drbdsetup show-gi <minor-number>
Minor number is shown in drbdsetup show
update network settings
drbdsetup net-options 10.0.0.1 10.0.0.2 --sndbuf-size=2M
mount: unknown filesystem type 'drbd'
Usually means your node is not primary. If you're sure you know what you're doing you can use
mount -t ext4 /dev/drbd1 /drbdmount
or when you want to mount the partition while drbd is down:
kpartx -av /dev/mapper/DRBD-test1
#add map DRBD-test1p1 (253:5): 0 10482928 linear /dev/mapper/DRBD-test1 2048
#I suggest mounting ro
mount -o ro /dev/mapper/DRBD-test1p1 /mnt/test1
reload configuration
drbdadm --dry-run adjust <resourcename|all>
and then
drbdadm adjust <resourcename|all>
show configuration of resource
drbdsetup /dev/drbd0 show
resource unknown
First try
drbdadm up resourcename
Command 'drbdmeta 1 v08 /dev/drbd0 internal apply-al' terminated with exit code 20
Most likely split brain issue, check dmesg etc
wfconnection
Could be split brain situation
drbdadm -- --discard-my-data connect resource
cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown
So you're on the primary node. The secondary might be showing nothing; in that case first try
drbdadm up drbdX
or if secondary shows "cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown", it is waiting for connection (no?)
drbdadm disconnect drbdres
drbdadm connect --discard-my-data drbdres
or it shows "cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown", in that case what might work on the primary node:
drbdadm disconnect drbdX
drbdadm connect drbdX
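The status strings quoted in this FAQ are space-separated name:value pairs: cs is the connection state, ro the roles and ds the disk states (local/peer). A small sketch for pulling one field out of such a line, for instance in a check script; drbd_field is a hypothetical helper, and the input is assumed to be in the 8.x /proc/drbd style shown above.

```shell
# Hypothetical helper: print the value of one field from a DRBD status line.
# $1 = field name (cs, ro or ds), $2 = status line
drbd_field() {
    # put every name:value pair on its own line, then strip the wanted prefix
    printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^$1://p"
}

line='cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown'
drbd_field cs "$line"
drbd_field ro "$line"
```

With the example line this prints StandAlone and Primary/Unknown, which makes it easier to decide which of the cases above applies.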
Split brain issue
To force updating resource
drbdadm invalidate resource
cs:WFReportParams ro:Secondary/Unknown ds:UpToDate/DUnknown
connection is made, waiting for more
Unexpected data packet AuthChallenge (0x0010)
Maybe the shared key (shared-secret) is not the same on both nodes
State change failed: Device is held open by someone
could be stacked resource, timeout?
error receiving ReportState, e: -5 l: 0!
??
drbd: error sending genl reply
CentOS feature, https://wiki.centos.org/Manuals/ReleaseNotes/CentOS7.2003. "Try another kernel/module version"
State change failed: (-14) Need a verify algorithm to start online verify
Means no verify-alg was defined, so no online checking
drbdadm dump-md foo: Found meta data is "unclean", please apply-al first
Try
drbdadm apply-al foo
( AL means "activity log", btw )