Revision as of 08:35, 2 September 2024

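Links

Proxmox VE Administration Guide: https://pve.proxmox.com/pve-docs/pve-admin-guide.html
Wiki: https://pve.proxmox.com/wiki
Monitoring Proxmox with Zabbix: https://www.zabbix.com/integrations/proxmox
Proxmox Backup Server: https://www.proxmox.com/en/proxmox-backup-server
Backup and Restore: https://pve.proxmox.com/wiki/Backup_and_Restore
Proxmox Cluster with Raspberry Pi as QDevice (outdated): https://www.danatec.org/2021/05/21/two-node-cluster-in-proxmox-ve-with-raspberry-pi-as-qdevice/
External vote (like a Raspberry Pi): https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support
Useful proxmox commands: https://gist.github.com/dragolabs/f391bdda050480871ddd129aa6080ac2
Proxmox VE helper scripts: https://tteck.github.io/Proxmox/
Proxmox bug tracker: https://bugzilla.proxmox.com/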

Commands

Get PVE version

pveversion -v  | head -n 2

qm Qemu Manager

pvesm Storage manager

Check

pvesm scan

pveperf

pvecm

About pvecm output

A = Alive, NA = Not Alive
V = Vote, NV = Not Vote
MW = Master Wins, NMW = Not Master Wins [0]
NR = Not Registered

pvesh

pvesh get /cluster/resources

pvesh get /cluster/resources --output-format json-pretty

Get backup jobs

pvesh get /cluster/backup

then

pvesh get /cluster/backup/{backupid}

To change the vms included in the job:

pvesh set /cluster/backup/{backupid} -vmid 100,101,102

Get backup errors

pvesh get /nodes/pve01/tasks --typefilter vzdump --errors

Documentation

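Proxmox API

Proxmox VE API: https://pve.proxmox.com/wiki/Proxmox_VE_API
Proxmox API viewer: https://pve.proxmox.com/pve-docs/api-viewer/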

View api: https://your.server:8006/api2/json/ ?

Invalid token name

In the API token string, PVE uses '=' as the separator between token ID and secret, but PBS wants ':'

Ballooning

Dynamic Memory Management: https://pve.proxmox.com/wiki/Dynamic_Memory_Management

Ballooning memory limit 80%:

autoballooning is done by one of our daemons (pvestatd) and this limit is hardcoded at the moment

Directory structure

/etc/pve

/etc/pve/qemu-server

The VM configs

vmstate? Seems related to snapshots

/var/lib/vz

/var/lib/vz/template/iso

Proxmox cluster

https://pve.proxmox.com/wiki/Cluster_Manager

Cluster manager

pvecm status
pvecm nodes

HA status

ha-manager status

VM configuration

Hard disk caching

"If you're using ZFS as your backing store, you should leave the vdisk caching set to 'No cache' (default)."

Monitoring proxmox cluster with zabbix

https://github.com/takala-jp/zabbix-proxmox

Terms

vram_allocated (maxmem)

Maximum amount of ram a VM is allowed to use

vram_used (mem)

Memory being used by VM

Terms

vram

maximum amount of memory a vm may use

lrm

Local Resource Manager

HOWTO

Maintenance

Rebooting a node

If HA is enabled, check https://pve.proxmox.com/wiki/High_Availability#ha_manager_node_maintenance

If you don't want it to start migrating, 'Freeze' might be the right option for HA Settings.

otherwise just do it :)

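Disk cache for guest

https://pve.proxmox.com/wiki/Performance_Tweaks#Disk_Cache
https://forum.proxmox.com/threads/disk-cache-wiki-documentation.125775/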

Show vm configuration

qm config  101

Get VM name by ID

grep '^name:' /etc/pve/nodes/*/qemu-server/$ID.conf | awk '{print $2}'

or

pvesh get /cluster/resources -type vm --output-format yaml | egrep -i 'vmid|name' | sed 's@.*:@@'

or

grep "name:" /etc/pve/nodes/*/*/<vmid>.conf | awk '{ print $2 }'
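The grep variants above can print more than one line, because snapshot sections in the same config file may carry their own copy of the name. A minimal sketch that stops at the first match; the function name is made up for illustration and VMID 101 is just an example:

```shell
# Print the VM name stored in a qemu-server config file.
# Stops at the first "name:" line, so copies inside snapshot
# sections further down in the file are ignored.
vm_name_from_conf() {
    awk '/^name:/ { print $2; exit }' "$1"
}

# Hypothetical usage on a PVE node:
# vm_name_from_conf /etc/pve/nodes/*/qemu-server/101.conf
```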

Clustering

Show cluster status

pvecm status

It seems relatively safe to restart corosync

View cluster logs

pvesh get /cluster/tasks --output-format=json-pretty

Sysctl settings for kvm guests

Still investigating, going for /etc/sysctl.d/50-kvmguest.conf

vm.vfs_cache_pressure=30
vm.swappiness=5

Installing proxmox via PXE

https://github.com/morph027/pve-iso-2-pxe

Storage

Adding another thin pool

lvcreate -L 500G --thinpool newpool vg1

after creating lvm thin pool (TODO: link to that) add to /etc/pve/storage.cfg

lvmthin: lvm-raid10
       thinpool raid10pool
       vgname raid10
       content images

Get OS information of guest

qm guest cmd 105 get-osinfo

Disks

Identify disks in linux guest

lsblk -o +SERIAL

Run fstrim from host

Assuming agent is running:

qm agent 102 fstrim

Suspend or hibernate

Suspend

Suspend does not turn off your computer. It puts the computer and all peripherals on a low power consumption mode. If the battery runs out or the computer turns off for some reason, the current session and unsaved changes will be lost.

qm suspend <vmid>

in GUI: sleep

qm status: paused

Hibernate

Hibernate saves the state of your computer to the hard disk and completely powers off. When resuming, the saved state is restored to RAM.

qm suspend <vmid> --todisk 1

or in GUI: Hibernate

Backups

proxmox-backup-client

export PBS_REPOSITORY="backup@pbs@pbs-server:backuprepo"

proxmox-backup-client snapshot list

proxmox-backup-client prune vm/101 --dry-run --keep-daily 7 --keep-weekly 3

proxmox-backup-client garbage-collect

vzdump limit bandwidth

--bwlimit 50000

It looks like this limits read speed. I also noticed that poor write speed to PBS has bad effects on guests.

Or nowadays in /etc/vzdump.conf:

bwlimit: 50000

Get total memory allocated to vms

grep memory: /etc/pve/nodes/*/qemu-server/*conf|awk '{sum+=$2} END {print sum}'
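The one-liner above also counts any memory: lines that sit inside snapshot sections, which would inflate the total. A rough sketch that only counts the live config part of each file; it assumes snapshot sections start with a line beginning with "[", and the function name is hypothetical:

```shell
# Sum the "memory:" values (MiB) of the current config in each file,
# skipping everything after the first "[section]" header per file.
sum_vm_memory() {
    awk '
        FNR == 1 { in_snapshot = 0 }            # new file: back to live config
        /^\[/ { in_snapshot = 1 }               # snapshot section starts here
        !in_snapshot && /^memory:/ { sum += $2 }
        END { print sum + 0 }
    ' "$@"
}

# Hypothetical usage on a PVE node:
# sum_vm_memory /etc/pve/nodes/*/qemu-server/*.conf
```

VMs without an explicit memory: line are not counted, so treat the result as an estimate.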

nvidia on proxmox

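Upgrade problem to 8.2 (6.8 kernel - nvidia drivers): https://forum.proxmox.com/threads/upgrade-problem-to-8-2-6-8-kernel-nvidia-drivers.145749/page-2#post-664595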

FAQ

Web interface stuck on "loading"

When clicking on guest on a particular node

Works on webui of that node

Different versions of PVE?

Console: unable to find serial interface

Maybe you're trying to get a console on a guest of another node in your cluster. Still to investigate why this goes wrong.

Cores or threads?

What's called "core" in the Web UI is a core from guest point of view, it would probably be a thread on the host.

Cloud-init

No CloudInit Drive found

See https://gist.github.com/aw/ce460c2100163c38734a83e09ac0439a

Error messages

create storage failed: storage 'XX' is not online (500)

When trying to create a storage on NFS

memory: hotplug problem

a used vhost backend has no free memory slots left

echo "options vhost max_mem_regions=509" >> /etc/modprobe.d/vhost.conf

and reboot

Proxmox API call failed: Couldn't authenticate user: zabbix@pve

Funky characters in password string?

SMP vm created on host with unstable TSC; guest TSC will not be reliable

memory: hotplug problem - 400 Parameter verification failed. dimm17: error unplug memory module

bad!

Failed to establish a new connection: [Errno -2] Name or service not known

Just that, check your DNS

ConditionPathExists=/etc/corosync/corosync.conf was not met

Probably set up the node with a bad /etc/hosts, or forgot to join the cluster

https://forum.proxmox.com/threads/cluster.103370/

https://blog.jenningsga.com/proxmox-keeping-quorum-with-qdevices/

https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support

corosync-qdevice[11695]: Can't read quorum.device.model cmap key

On the qdevice node

Check corosync-cmapctl ?

also see https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support

"Quorum: 2 Activity blocked"

In my case this meant booting up the second real node first

On working node:

corosync-cmapctl | grep quorum.device
quorum.device.model (str) = net
quorum.device.net.algorithm (str) = ffsplit
quorum.device.net.host (str) = 192.168.178.2
quorum.device.net.tls (str) = on
quorum.device.votes (u32) = 1

https://bugs.launchpad.net/ubuntu/+source/corosync-qdevice/+bug/1733889

x86/split lock detection: #AC: kvm/1161956 took a split_lock trap at address: 0x7ffebcb378ab

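The number after kvm/ is the process ID; this will help you find the culprit.

See:

In-depth analysis of split locks: https://www.sobyte.net/post/2022-05/split-locks/
Detecting and handling split locks: https://lwn.net/Articles/790464/
The search for the correct amount of split-lock misery: https://lwn.net/Articles/911219/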

Could be large Windows guests on NUMA. Can probably be ignored
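To get from the log line to the guest, the PID can be cut out of the message and looked up with ps. A small sketch, assuming the message has exactly the format shown in the heading above; the function name is made up:

```shell
# Read kernel log lines on stdin and print the PID that follows "kvm/"
# in split_lock trap messages.
split_lock_pid() {
    sed -n 's/.*#AC: kvm\/\([0-9][0-9]*\) took a split_lock trap.*/\1/p'
}

# Hypothetical usage: show the command line of the offending process.
# On PVE the kvm command line should contain "-id <vmid>".
# dmesg | split_lock_pid | sort -u | while read -r pid; do ps -o args= -p "$pid"; done
```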

Shutting down a node

Should just work. Takes guests down with it when they're not in HA

API calls

List all vms in cluster

pvesh get /cluster/resources --type vm --output-format yaml | egrep -i 'vmid|name'

or json:

pvesh get /cluster/resources --type vm --output-format json | jq '.[] | {id,name}'

Cores, sockets and vCPUs

vCPUs is what the vm uses, maximum is sockets*cores but you could set it lower to allow adding cores/vcpus dynamically.

Migrating

VM is locked (create) (500)

Not always clear why, but try

qm unlock 111

CT is locked (snapshot-delete)

pct unlock 115

Replication

missing replicate feature on volume 'local-lvm

Looks like replication of LVM isn't supported; storage replication needs local ZFS storage.

Check if qemu agent is running

See if IP is shown under Summary, also

qm agent 105 ping

or

qm guest cmd 111 ping

qm agent ping return values

0: OK

2: VM not running

255: No QEMU guest agent configured (just disabled in VM config?). "QEMU guest agent is not running" would probably only show when the agent is enabled in the config.

There is no way to tell if agent is running when it's not enabled in VM config.

When VM is not running, GUI claims agent not running
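The return values listed above can be turned into a readable message in scripts. A minimal sketch; the codes are only the ones observed in these notes, and the function name is hypothetical:

```shell
# Translate the exit code of "qm agent <vmid> ping" into a short message.
describe_agent_ping_rc() {
    case "$1" in
        0)   echo "agent responded" ;;
        2)   echo "VM not running" ;;
        255) echo "no guest agent configured in VM config" ;;
        *)   echo "unexpected return code $1" ;;
    esac
}

# Hypothetical usage on a PVE node:
# qm agent 105 ping; describe_agent_ping_rc $?
```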

Move to unused disk

If you moved disk, and decided to move back to the old one:

detach current disk
select the unused disk
click Add

qemu-guest-agent.service: Job qemu-guest-agent.service/start failed with result 'dependency'.

Could mean QEMU guest agent is not enabled in vm config

Stop all proxmox services

systemctl stop pve-cluster
systemctl stop pvedaemon
systemctl stop pveproxy
systemctl stop pvestatd

Storage (xx) not available on selected target

probably some storage mounted only on one node, so not clustered

switch to community repository

cat /etc/apt/sources.list.d/pve-enterprise.list 
#deb https://enterprise.proxmox.com/debian/pve buster pve-enterprise

echo "deb http://download.proxmox.com/debian/pve buster pve-no-subscription" > /etc/apt/sources.list.d/pve-no-subscription.list

apt update
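The example above hardcodes buster, which only fits PVE on Debian 10. A sketch that builds the same line for whatever codename the node runs; it assumes VERSION_CODENAME is set in /etc/os-release, and the function name is made up:

```shell
# Print the pve-no-subscription repository line for a Debian codename.
pve_no_subscription_line() {
    echo "deb http://download.proxmox.com/debian/pve $1 pve-no-subscription"
}

# Hypothetical usage on a PVE node:
# . /etc/os-release
# pve_no_subscription_line "$VERSION_CODENAME" > /etc/apt/sources.list.d/pve-no-subscription.list
```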

W: (pve-apt-hook) You are attempting to remove the meta-package 'proxmox-ve'!

check sources.list :)

Storage

Could not determine current size of volume

When trying to grow a disk. Another secret!

Add local disk or LV to vm

That would be passthrough

qm set 101 -scsi1 /dev/mapper/somevolume

Make sure the guest can't migrate: ?? PVE won't try that anyway, but still

Storage migration failed: block job (mirror) error: drive-scsi0: 'mirror' has been cancelled

Maybe moving disk to LVM, check for 4MiB alignment. qemu-img resize to 4MiB aligned size.
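To find the size to pass to qemu-img resize, the current size in bytes can be rounded up to the next multiple of 4 MiB. A sketch of just the arithmetic; the function name is hypothetical, and 4 MiB is taken from the note above (it matches the default LVM extent size):

```shell
# Round a size in bytes up to the next multiple of 4 MiB.
align_up_4mib() (
    unit=$((4 * 1024 * 1024))
    echo $(( ($1 + unit - 1) / unit * unit ))
)

# Hypothetical usage (image path is an example only):
# qemu-img resize /path/to/disk.qcow2 "$(align_up_4mib 10737418241)"
```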

fstrim guests

qm guest cmd <ID> fstrim

qmp command 'guest-fstrim' failed - got timeout

seems to be a windows thing

No disk unused

when trying to create thin volume, use command line?

qcow image bigger than assigned disk

Probably snapshots

Backups

backup write data failed: command error: protocol canceled

Temporary network failure?

storing login ticket failed: $XDG_RUNTIME_DIR must be set

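Temporary bug, ignore it

PBS GC & Prune scheduling

https://pbs.proxmox.com/docs/prune-simulator/

qmp command 'backup' failed - got timeout

https://github.com/proxmox/qemu/blob/master/qmp-commands.hx

dirty-bitmap status: existing bitmap was invalid and has been cleared

https://qemu-project.gitlab.io/qemu/interop/bitmaps.html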

dirty-bitmap status: created new

unexpected property 'prune-backups' (500)

Happens for example with Add: iSCSI. Uncheck "Keep all backups" in "Backup retention".

FAILED 00:00:02 unable to activate storage

TODO

VM 101 Backup failed: VM is locked (snapshot)

Check if there's no snapshot running (how?)

qm unlock 101

qmp command 'blockdev-snapshot-delete-internal-sync' failed - got timeout

Another job for

qm unlock 101

qmp command 'blockdev-snapshot-delete-internal-sync' failed - Snapshot with id 'null' and name 'mysnapshot' does not exist on device 'drive-scsi1'

Verify there is no such snapshot at all:

qemu-img snapshot -l vm-114-disk-1.qcow2

and then delete the entire [mysnapshot] section from the VM config file

lvremove snapshot 'xx' error: Failed to find logical volume "pve/snap_vm-103-disk-0_xx"

Most likely the logical volume doesn't exist anymore. no idea how this can happen, but:

qm listsnapshot 103
qm delsnapshot 103 xx --force

will most likely cry about failed to find again, but with some luck:

qm listsnapshot 103

and it should be gone.

If you get

VM is locked (snapshot-delete)

just

qm unlock 103

(probably after checking there's not something else locking it)

can't acquire lock '/var/run/vzdump.lock' - got timeout

Check if vzdump is running, otherwise kill it (cluster?)

You could change lockwait in vzdump.conf, or as --lockwait parameter. Default is 180 minutes

VM 101 Backup failed: VM is locked (snapshot-delete)

Check /etc/pve/qemu-server/101.conf for 'snapstate'

If that says 'delete' for a snapshot try deleting the snapshot:

qm delsnapshot 101 snapname

If that throws something like Failed to find logical volume 'pve/snap_vm-101-disk-0_saving'

 qm delsnapshot 101 snapname --force

If it says VM is locked (snapshot-delete) use

qm unlock 101

When you get does not exist on device 'drive-scsi0' you might also need to delete the line "lock: snapshot-delete" from the 101.conf file
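Checking the config for snapstate by eye gets tedious with many snapshots. A small sketch that prints each snapshot section together with its snapstate value; it assumes sections are written as [name] on their own line, and the function name is made up:

```shell
# Print "<snapshot>: <snapstate>" for every snapstate line in a VM config.
list_snapstates() {
    awk '
        /^\[.*\]$/ { section = substr($0, 2, length($0) - 2) }  # remember section name
        /^snapstate:/ { print section ": " $2 }
    ' "$1"
}

# Hypothetical usage on a PVE node:
# list_snapstates /etc/pve/qemu-server/101.conf
```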

qmp command 'query-backup' failed - got wrong command id

Restoring single file from (PBS) backup

Check Mounting of archives with fuse: https://pbs.proxmox.com/docs/backup-client.html#mounting-of-archives-via-fuse

Requires package proxmox-backup-file-restore:

proxmox-file-restore

proxmox-file-restore failed: Error: mounting 'drive-scsi0.img.fidx/part/["2"]' failed: all mounts failed or no supported file system (500)

Maybe because of lvm?

Backup log

Upload size

Seems to be in kilobytes

Duplicates

Error: VM quit/powerdown failed - got timeout

qm stop VMID

if that complains about lock, remove the lock and try again

You have not turned on protection against thin pools running out of space.

Seems nobody knows how, just monitor it?

serial console from command line

qm terminal <id>

enable serial console in guest

https://pve.proxmox.com/wiki/Serial_Terminal

looks like this is not needed:

systemctl enable serial-getty@ttyS0.service

in /etc/default/grub

GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0 console=tty0"

ttyS0 is for qm terminal, tty0 is for the "console" button in UI

debian based

update-grub

redhat based

grub2-mkconfig --output=/boot/grub2/grub.cfg

add

serial0: socket

to /etc/pve/qemu-server/[vmid].conf and restart

agetty: /dev/ttyS0: not a device

systemctl status useless again, means the serial bit is missing from <vmid>.conf

TASK ERROR: command 'apt-get update' failed: exit code 100

subtle way of telling you to get a subscription, or at least change the sources list

Import vmdk to lvm

https://pve.proxmox.com/wiki/Qemu/KVM_Virtual_Machines#_importing_virtual_machines_and_disk_images

Can't apply changes to memory allocation

Maybe try enabling NUMA in CPU settings

Adding hardware shows orange

The keyword here is "PENDING", possibly in /etc/pve/qemu-server/<id>.conf

Maybe something is not supported (Options->Hotplug), options: reboot or click "revert"

"Connection error 401: no ticket"

Login session expired?

can't lock file '/var/lock/qemu-server/lock-102.conf' - got timeout (500)

Maybe someone else has/had webui open, otherwise just remove it

TASK ERROR: Can't use string ("keep-all=0,keep-last=3") as a HASH ref while "strict refs" in use at /usr/share/perl5/PVE/VZDump.pm line 502.

Classic, means incorrect syntax in your /etc/pve/storage.cfg

The current guest configuration does not support taking new snapshots

You're using raw instead of qcow2. Convert: Hardware->Hard disk "Move Disk"
you might be using lvm thin over iscsi, then you can't have snapshots

WARNING: Device /dev/dm-21 not initialized in udev database even after waiting 10000000 microseconds.

Until someone fixes it:

udevadm trigger

Also look for link to dm-21 in /dev/disk/by-id

"connection error - server offline?"

Try reconnecting the browser

Find vm name by id

qm config 100 | grep '^name:' | awk '{print $2}'

or a bit cruder:

grep name: /etc/pve/nodes/*/qemu-server/101.conf |head -n 1 | cut -d ' ' -f 2

Started Proxmox VE replication runner.

??

Find ID by name

grep -l "name: <NAME>"  /etc/pve/nodes/*/qemu-server/*conf| sed 's/^.*\/\([0-9]*\)\.conf/\1/g'
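The grep above also matches names that merely start with <NAME> (web would also hit web2). A sketch that matches the whole line instead; the function name is hypothetical:

```shell
# Print the VMID(s) whose config contains exactly "name: <NAME>".
# $1 = VM name, remaining arguments = config files to search.
find_vmid_by_name() (
    name=$1
    shift
    # -x: whole line must match, -F: treat the name as a fixed string
    grep -lxF "name: $name" "$@" | sed 's/^.*\/\([0-9][0-9]*\)\.conf$/\1/'
)

# Hypothetical usage on a PVE node:
# find_vmid_by_name web01 /etc/pve/nodes/*/qemu-server/*.conf
```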

Can't migrate VM with local CD/DVD

Remove the CD :)

Memory allocated to VMs

qm list|egrep -v "VM|stopped" | awk '{ sum+=$4 } END { print sum }'
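The same sum can be taken by matching the status column directly instead of filtering out header and stopped lines. A sketch, assuming the qm list column order VMID, NAME, STATUS, MEM(MB), which is what the $4 above relies on; the function name is made up:

```shell
# Read "qm list" output on stdin and sum the MEM(MB) column of running VMs.
sum_running_mem() {
    awk 'NR > 1 && $3 == "running" { sum += $4 } END { print sum + 0 }'
}

# Hypothetical usage on a PVE node:
# qm list | sum_running_mem
```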

Ceph

Got timeout(500)

Check

pveceph status

Possibly problem with ceph mgr

vzdump: # cluster wide vzdump cron schedule

Automatically generated file - do not edit

edit it anyway?

Guest issues

virtio_balloon virtio0: Out of puff! Can't get 1 pages

iSCSI

iscsid: conn 0 login rejected: initiator error - target not found

pvesm scan iscsi <targetip>

and

iscsiadm -m session -P 3

udev high load

Check

udevadm monitor

KERNEL[426405.347906] change   /devices/virtual/block/dm-8 (block)
UDEV  [426405.359582] change   /devices/virtual/block/dm-8 (block)

ls -al /dev/mapper/

pve-vm--113--disk--0 -> ../dm-8

So vm/lxc '113' is the one.
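Pulling the guest ID out of the mapper name can be scripted. A sketch that assumes the pve-vm--<id>--disk--<n> naming shown above; the function name is hypothetical:

```shell
# Read device-mapper names on stdin and print the guest ID embedded in them.
vmid_from_mapper_name() {
    sed -n 's/^.*vm--\([0-9][0-9]*\)--disk.*$/\1/p'
}

# Hypothetical usage: which guest owns dm-8?
# ls -l /dev/mapper/ | grep 'dm-8$' | vmid_from_mapper_name
```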

In general see https://forum.proxmox.com/threads/udev-malfunction-udisksd-high-cpu-load.99169/ since it could be udisks2

start failed: org.freedesktop.DBus.Error.Disconnected: Connection is closed

Most likely that VM isn't running.

@@ Line 1: / Line 1: @@
-=Links=
-*[https://pve.proxmox.com/pve-docs/pve-admin-guide.html Proxmox VE Administration Guide]
-=FAQ=
+= Links =
-==Error: VM quit/powerdown failed - got timeout==
+*[https://pve.proxmox.com/pve-docs/pve-admin-guide.html Proxmox VE Administration Guide]
+*[https://pve.proxmox.com/wiki https://pve.proxmox.com/wiki Wiki]
+*[https://www.zabbix.com/integrations/proxmox Monitoring Proxmox with Zabbix]
+*[https://www.proxmox.com/en/proxmox-backup-server Proxmox Backup Server]
+*[https://pve.proxmox.com/wiki/Backup_and_Restore Backup and Restore]
+*[https://www.danatec.org/2021/05/21/two-node-cluster-in-proxmox-ve-with-raspberry-pi-as-qdevice/ Proxmox Cluster with Raspberry Pi as QDevice (outdate)]
+*[https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support External vote ( like a raspberry pi)]
+*[https://gist.github.com/dragolabs/f391bdda050480871ddd129aa6080ac2 Useful proxmox commands ]
+*[https://tteck.github.io/Proxmox/ Proxmox VE helper scripts]
+*[https://bugzilla.proxmox.com/ Proxmox bug tracker]
+= Commands =
+==Get PVE version==
+ pveversion -v  | head -n 2
+== qm Qemu Manager ==
+== pvesm Storage manager ==
+Check
+ pvesm scan
+== pveperf ==
+== pvecm ==
+===About pvecm output===
+ A = Alive, NA = Not Alive
+ V = Vote, NV = Not Vote
+ MW = Master Wins, NMW = Not Master Wins [0]
+ NR = Not Registered
+==pvesh==
+ pvesh get /cluster/resources
+ pvesh get /cluster/resources --output-format json-pretty
+===Get backup jobs===
+ pvesh get /cluster/backup
+then
+ pvesh get /cluster/backup/{backupid}
+To change the vms included in the job:
+ pvesh set /cluster/backup/{backupid} -vmid 100,101,102
+===Get backup errors===
+ pvesh get /nodes/pve01/tasks --typefilter vzdump --errors
+= Documentation =
+== Proxmox API ==
+*[https://pve.proxmox.com/wiki/Proxmox_VE_API Proxmox VE API]
+*[https://pve.proxmox.com/pve-docs/api-viewer/ Proxmox API viewer]
+View api: https://your.server:8006/api2/json/ ?
+===Invalid token name===
+'''pve uses separator '=', but pbs wants ':''''
+==Ballooning==
+*[https://pve.proxmox.com/wiki/Dynamic_Memory_Management Dynamic Memory Management]
+Ballooning memory limit 80%:
+autoballooning is done by one of our daemons (pvestatd) and this limit is hardcoded at the moment
+== Directory structure ==
+=== /etc/pve ===
+===/etc/pve/qemu-server===
+The VM configs
+vmstate? sems related to snapshots
+=== /var/lib/vz ===
+/var/lib/vz/template/iso
+&nbsp;
+&nbsp;
+== Proxmox cluster ==
+[https://pve.proxmox.com/wiki/Cluster_Manager https://pve.proxmox.com/wiki/Cluster_Manager]
+=== Cluster manager ===
+ pvecm status
+ pvecm nodes
+&nbsp;
+=== HA status ===
+ ha-manager status
+== VM configuration==
+===Hard disk caching===
+"If you're using ZFS as your backing store, you should leave the vdisk caching set to 'No cache' (default)."
+= Monitoring proxmox cluster with zabbix =
+[https://github.com/takala-jp/zabbix-proxmox https://github.com/takala-jp/zabbix-proxmox]
+==Terms==
+===vram_allocated (maxmem)===
+Maximum amount of ram a VM is allowed to use
+===vram_used (mem)===
+Memory being used by VM
+=Terms=
+==vram==
+maximum amount of memory a vm may use
+==lrm==
+Local Resource Manager
+=HOWTO=
+==Maintenance==
+===Rebooting a node===
+If HA enabled check https://pve.proxmox.com/wiki/High_Availability#ha_manager_node_maintenance
+If you don't want it to start migrating, 'Freeze' might be the right option for HA Settings.
+otherwise just do it :)
+==Disk cache for guest==
+*https://pve.proxmox.com/wiki/Performance_Tweaks#Disk_Cache
+*https://forum.proxmox.com/threads/disk-cache-wiki-documentation.125775/
+==Show vm configuration==
+ qm config  101
+==Get VM name by ID==
+ grep '^name:' /etc/pve/nodes/*/qemu-server/$ID.conf | awk '{print $2}'
+or
+ pvesh get /cluster/resources -type vm --output-format yaml | egrep -i 'vmid|name' | sed 's@.*:@@'
+or
+ grep "name:" /etc/pve/nodes/*/*/<vmid>.conf | awk '{ print $2 }'
+== Clustering ==
+=== Show cluster status ===
+ pvecm status
+It seems relatively safe to restart corosync
+===View cluster logs===
+ pvesh get /cluster/tasks --output-format=json-pretty
+==Sysctl settings for kvm guests==
+Still investigating, going for /etc/sysctl.d/50-kvmguest.conf
+ vm.vfs_cache_pressure=30
+ vm.swappiness=5
+==Installing proxmox via PXE==
+https://github.com/morph027/pve-iso-2-pxe
+==Storage==
+===Adding another thin pool===
+ lvcreate -L 500G --thinpool newpool vg1
+after creating lvm thin pool (TODO: link to that) add to '''/etc/pve/storage.cfg'''
+ lvmthin: lvm-raid10
+        thinpool raid10pool
+        vgname raid10
+        content images
+== Get OS information of guest==
+ qm guest cmd 105 get-osinfo
+==Disks==
+===Identify disks in linux guest===
+ lsblk -o +SERIAL
+===Run fstrim from host===
+Assuming agent is running:
+ qm agent 102 fstrim
+==Suspend or hibernate==
+===Suspend===
+Suspend does not turn off your computer. It puts the computer and all peripherals on a low power consumption mode. If the battery runs out or the computer turns off for some reason, the current session and unsaved changes will be lost.
+ qm suspend
+in GUI: sleep
+qm status: paused
+===Hibernate===
+Hibernate saves the state of your computer to the hard disk and completely powers off. When resuming, the saved state is restored to RAM.
+ qm suspend to disk
+or in GUI: Hibe
+==Backups==
+=== proxmox-backup-client ===
+ export PBS_REPOSITORY="backup@pbs@pbs-server:backuprepo"
+ proxmox-backup-client snapshot list
+ proxmox-backup-client prune vm/101 --dry-run --keep-daily 7 --keep-weekly 3
+ proxmox-backup-client garbage-collect
+===vzdump limit bandwidth===
+ --bwlimit 50000
+it looks like that limits read speed, i also noticed that bad write/speed to PBS has bad effects on guests
+or nowadays in '''/etc/vzdump.conf''':
+ bwlimit
+===Get total memory allocated to vms===
+ grep memory: /etc/pve/nodes/*/qemu-server/*conf|awk '{sum+=$2} END {print sum}'
+==nvidia on proxmox==
+*[https://forum.proxmox.com/threads/upgrade-problem-to-8-2-6-8-kernel-nvidia-drivers.145749/page-2#post-664595 Upgrade problem to 8.2 (6.8 kernel - nvidia drivers)]
+= FAQ =
+==Web interface stuck on "loading"==
+===When clicking on guest on a particular node===
+====Works on webui of that node====
+Different versions of PVE?
+==Console: unable to find serial interface==
+Maybe you're trying to get console on guest of another node in your cluster. To investigate why this goes wrong.
+==Cores or threads?==
+What's called "core" in the Web UI is a core from guest point of view, it would probably be a thread on the host.
+==Cloud-init==
+===No CloudInit Drive found===
+See https://gist.github.com/aw/ce460c2100163c38734a83e09ac0439a
+==Error messages==
+===create storage failed: storage 'XX' is not online (500)===
+When trying to create a storage on NFS
+===memory: hotplug problem===
+a used vhost backend has no free memory slots left
+ echo "options vhost max_mem_regions=509" >> /etc/modprobe.d/vhost.conf
+and reboot
+=== Proxmox API call failed: Couldn't authenticate user: zabbix@pve ===
+Funky characters in password string?
+=== SMP vm created on host with unstable TSC; guest TSC will not be reliable ===
+===memory: hotplug problem - 400 Parameter verification failed. dimm17: error unplug memory module===
+bad!
+=== Failed to establish a new connection: [Errno -2] Name or service not known ===
+Just that, check your DNS
+=== ConditionPathExists=/etc/corosync/corosync.conf was not met ===
+Problably set up node with bad /etc/hosts, or forgot to join cluster
+https://forum.proxmox.com/threads/cluster.103370/
+=== [https://blog.jenningsga.com/proxmox-keeping-quorum-with-qdevices/ https://blog.jenningsga.com/proxmox-keeping-quorum-with-qdevices/] ===
+[https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support]
+=== corosync-qdevice[11695]: Can't read quorum.device.model cmap key ===
+On the qdevice node
+Check&nbsp;corosync-cmapctl&nbsp;?
+also see&nbsp;[https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support]
+=== "Quorum: 2 Activity blocked" ===
+In my case this meant boot up second real node first
+&nbsp;
+&nbsp;
+On working node:
+corosync-cmapctl | grep quorum.device<br/> quorum.device.model (str) = net<br/> quorum.device.net.algorithm (str) = ffsplit<br/> quorum.device.net.host (str) = 192.168.178.2<br/> quorum.device.net.tls (str) = on<br/> quorum.device.votes (u32) = 1<br/> <br/> [https://bugs.launchpad.net/ubuntu/+source/corosync-qdevice/+bug/1733889 https://bugs.launchpad.net/ubuntu/+source/corosync-qdevice/+bug/1733889]
+=== x86/split lock detection: #AC: kvm/1161956 took a split_lock trap at address: 0x7ffebcb378ab===
+The number after '''kvm/''' is process it, this will help you find the culprit.
+See:
+*[https://www.sobyte.net/post/2022-05/split-locks/ In-depth analysis of split locks]
+*[https://lwn.net/Articles/790464/ Detecting and handling split locks]
+*[https://lwn.net/Articles/911219/ The search for the correct amount of split-lock misery]
+Could be large Windows guests on NUMA. Can probably be ignored
+== Shutting down a node ==
+Should just work. Takes guests down with it when they're not in HA
+== API calls ==
+===List all vms in cluster===
+ pvesh get /cluster/resources --type vm --output-format yaml | egrep -i 'vmid|name'
+or json:
+ pvesh get /cluster/resources --type vm --output-format json | jq '.[] | {id,name}'
+== Cores, sockets and vCPUs ==
+vCPUs is what the vm uses, maximum is sockets*cores but you could set it lower to allow adding cores/vcpus dynamically.
+== Migrating ==
+=== VM is locked (create) (500) ===
+Not always clear why, but try
+ qm unlock 111
+== CT is locked (snapshot-delete) ===
+ pct unlock 115
+== Replication ==
+=== missing replicate feature on volume 'local-lvm ===
+looks like replication of lvm isn't supported
+== Check if qemu agent is running ==
+See if IP is shown under Summary, also
+ qm agent 105 ping
+or
+ qm guest cmd 111 ping
+===qm agent ping return values===
+: OK
+: VM not running
+: No QEMU guest agent configured (just disabled in vm config?) (QEMU guest agent is not running would only show when enabled in in config?)
+There is no way to tell if agent is running when it's not enabled in VM config.
+When VM is not running, GUI claims agent not running
+== Move to unused disk ==
+If you moved disk, and decided to move back to the old one:
+*detach current disk
+*select the unused disk
+*click Add
+== qemu-guest-agent.service: Job qemu-guest-agent.service/start failed with result 'dependency'. ==
+Could mean QEMU guest agent is not enabled in vm config
+== Stop all proxmox services ==
+systemctl stop pve-cluster systemctl stop pvedaemon systemctl stop pveproxy systemctl stop pvestatd
+== Storage (xx) not available on selected target ==
+probably some storage mounted only on one node, so not clustered
+== switch to community repository ==
+ cat /etc/apt/sources.list.d/pve-enterprise.list
+ #deb [https://enterprise.proxmox.com/debian/pve https://enterprise.proxmox.com/debian/pve] buster pve-enterprise
+ echo "deb [http://download.proxmox.com/debian/pve http://download.proxmox.com/debian/pve] buster pve-no-subscription" > /etc/apt/sources.list.d/pve-no-subscription.list
+ apt update
+== W: (pve-apt-hook) You are attempting to remove the meta-package 'proxmox-ve'! ==
+cehck sources.list&nbsp;:)
+&nbsp;
+== Storage ==
+===Could not determine current size of volume===
+When trying to grow a disk
+another secret!
+=== Add local disk or LV to vm ===
+That would be passtrough
+ qm set 101 -scsi1 /dev/mapper/somevolume
+Make sure node node can't migrate:&nbsp;?? PVE won't try that anyway, but still
+&nbsp;
+===Storage migration failed: block job (mirror) error: drive-scsi0: 'mirror' has been cancelled===
+Maybe moving disk to LVM, check for 4MiB alignment. qemu-img resize to 4MiB aligned size.
+===fstrim guests===
+ qm guest <ID> fstrim
+===qmp command 'guest-fstrim' failed - got timeout===
+seems to be a windows thing
+===No disk unused===
+when trying to create thin volume, use command line?
+==qcow image bigger than assigned disk==
+Probably snapshots
+== Backups ==
+===backup write data failed: command error: protocol canceled===
+Temporary network failure?
+===storing login ticket failed: $XDG_RUNTIME_DIR must be set===
+Temporary bug, ignore it
+=== PBS GC & Prune scheduling ===
+[https://pbs.proxmox.com/docs/prune-simulator/ https://pbs.proxmox.com/docs/prune-simulator/]
+=== qmp command 'backup' failed - got timeout ===
+[https://github.com/proxmox/qemu/blob/master/qmp-commands.hx https://github.com/proxmox/qemu/blob/master/qmp-commands.hx]
+&nbsp;
+=== dirty-bitmap status: existing bitmap was invalid and has been cleared ===
+*[https://qemu-project.gitlab.io/qemu/interop/bitmaps.html https://qemu-project.gitlab.io/qemu/interop/bitmaps.html]
+=== dirty-bitmap status: created new ===
+=== unexpected property 'prune-backups' (500) ===
+When for example Add: iSCSI Uncheck "Keep all backups" in "Backup retention"
+&nbsp;
+=== FAILED 00:00:02 unable to activate storage ===
+TODO
+&nbsp;
+=== VM 101 Backup failed: VM is locked (snapshot) ===
+Check if there's no snapshot running (how?)
+ qm unlock 101
+=== qmp command 'blockdev-snapshot-delete-internal-sync' failed - got timeout ===
+Another job for
+ qm unlock 101
+===qmp command 'blockdev-snapshot-delete-internal-sync' failed - Snapshot with id 'null' and name 'mysnapshot' does not exist on device 'drive-scsi1'===
+Verify there is no such snapshot at all:
+ qemu-img snapshot -l vm-114-disk-1.qcow2
+and then delete the entire system from [mysnapshot] in the vm config file
+===lvremove snapshot 'xx' error: Failed to find logical volume "pve/snap_vm-103-disk-0_xx"===
+Most likely the logical volume doesn't exist anymore. no idea how this can happen, but:
+ qm listsnapshot 103
+ qm delsnapshow 103 xx --force
+will most likely cry about failed to find again, but with some luck:
+ qm listsnapshot 103
+and it should be gone.
+If you get
+ VM is locked (snapshot-delete)
+just
+ qm unlock 103
+(probably after checking there's not something else locking it)
+=== can't acquire lock '/var/run/vzdump.lock' - got timeout ===
+Check if vzdump is running, otherwise kill it (cluster?)
+You could change lockwait in vzdump.conf, or as --lockwait parameter.
+Default is 180 minutes
+&nbsp;
+=== VM 101 Backup failed::= VM is locked (snapshot-delete) ===
+Check /etc/pve/qemu-server/101.conf for 'snapstate'
+If that says 'delete' for a snapshot try deleting the snapshot:
+ qm delsnapshot 101 snapname
+If that throws like '''Failed to find logical volume 'pve/snap_vm-101-disk-0_saving'''
+  qm delsnapshot 101 snapname --force
+If it says '''VM is locked (snapshot-delete)''' us
+ qm unlock XXX
+When you get '''does not exist on device 'drive-scsi0''' you might also need to delete the line "lock: snapshot-delete" from the 101.conf file
+===qmp command 'query-backup' failed - got wrong command id===
+=== Restoring single file from (PBS) backup ===
+Check [https://pbs.proxmox.com/docs/backup-client.html#mounting-of-archives-via-fuse Mounting of archives with fuse]
+Requires package proxmox-backup-file-restore:
+ proxmox-file-restore
+===proxmox-file-restore failed: Error: mounting 'drive-scsi0.img.fidx/part/["2"]' failed: all mounts failed or no supported file system (500)===
+Maybe because of lvm?
+===Backup log===
+====Upload size====
+Seems to be in kilobytes
+====Duplicates====
+== Error: VM quit/powerdown failed - got timeout ==
   qm stop VMID
+if that complains about lock, remove the lock and try again
+&nbsp;
+== You have not turned on protection against thin pools running out of space. ==
+Seems noboby knows how, just monitor it?
+== serial console from command line ==
+ qm terminal <id}
+== enable serial console in guest ==
+*[https://pve.proxmox.com/wiki/Serial_Terminal https://pve.proxmox.com/wiki/Serial_Terminal]
+looks like this is not needed:
-==enable serial console in guest==
   systemctl enable serial-getty@ttyS0.service
+in /etc/default/grub
+ GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0 console=tty0"
+ttyS0 is for qm terminal, tty0 is for the "console" buttion in UI
+#debian based
+update-grub
+#redhat based
+grub2-mkconfig --output=/boot/grub2/grub.cfg
+&nbsp;
+add
+ serial0: socket
+to /etc/pve/qemu-server/[vmid].conf and restart
+&nbsp;
+=== agetty: /dev/ttyS0: not a device ===
+systemctl status is not helpful here either; the error means the serial0 line is missing from <vmid>.conf.
+== TASK ERROR: command 'apt-get update' failed: exit code 100 ==
+A subtle way of telling you to get a subscription, or at least change the sources list.
+== Import vmdk to lvm ==
+[https://pve.proxmox.com/wiki/Qemu/KVM_Virtual_Machines#_importing_virtual_machines_and_disk_images https://pve.proxmox.com/wiki/Qemu/KVM_Virtual_Machines#_importing_virtual_machines_and_disk_images]
+== Can't apply changes to memory allocation ==
+Maybe try enabling NUMA in CPU settings
+&nbsp;
+&nbsp;
+== Adding hardware shows orange ==
+The keyword here is "PENDING", possibly in /etc/pve/qemu-server/<id>.conf.
+Maybe something is not supported (Options->Hotplug). Options:
+reboot or click "Revert".
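To see what is still waiting to be applied, <code>qm pending <id></code> lists pending changes. The sketch below reads the same information straight from the config file; the function name <code>show_pending</code> is hypothetical and it assumes pending values live under a <code>[PENDING]</code> section header.

```shell
#!/bin/sh
# Hypothetical helper: print the non-empty lines of the [PENDING]
# section of a VM config file, ignoring all other sections.
show_pending() {
    awk '
        # On every section header, note whether it is the PENDING one.
        /^\[/ { in_pending = ($0 == "[PENDING]"); next }
        in_pending && NF
    ' "$1"
}
```

Usage: <code>show_pending /etc/pve/qemu-server/<id>.conf</code>. Empty output means nothing is pending in that file.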
+== "Connection error 401: no ticket" ==
+Login session expired?
+== can't lock file '/var/lock/qemu-server/lock-102.conf' - got timeout (500) ==
+Maybe someone else has/had webui open, otherwise just remove it
+== TASK ERROR: Can't use string ("keep-all=0,keep-last=3") as a HASH ref while "strict refs" in use at /usr/share/perl5/PVE/VZDump.pm line 502. ==
+Classic, means incorrect syntax in your /etc/pve/storage.cfg
+&nbsp;
+== The current guest configuration does not support taking new snapshots ==
+*You're using raw instead of qcow2. Convert: Hardware->Hard disk "Move Disk"
+*You might be using LVM thin over iSCSI, in which case you can't have snapshots
+== WARNING: Device /dev/dm-21 not initialized in udev database even after waiting 10000000 microseconds. ==
+Until someone fixes it:
+ udevadm trigger
+Also look for link to dm-21 in /dev/disk/by-id
+== "connection error - server offline?" ==
+Try reconnecting the browser.
+&nbsp;
+&nbsp;
+== Find vm name by id ==
+ qm config 100 | grep '^name:' | awk '{print $2}'
+or a bit cruder:
+ grep name: /etc/pve/nodes/*/qemu-server/101.conf |head -n 1 | cut -d ' ' -f 2
+== Started Proxmox VE replication runner. ==
+??
+== Find ID by name ==
+ grep -l "name: <NAME>"  /etc/pve/nodes/*/qemu-server/*conf| sed 's/^.*\/\([0-9]*\)\.conf/\1/g'
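Both lookups can be wrapped in small functions that match the name exactly instead of by substring. This is only a sketch: the function names are invented, the directory layout <code><nodes_dir>/<node>/qemu-server/<vmid>.conf</code> is taken from the commands above, and it assumes snapshot sections (which repeat the <code>name:</code> key) start with a <code>[</code> header.

```shell
#!/bin/sh
# Hypothetical helpers. First argument is the nodes directory,
# normally /etc/pve/nodes.

# Print the name: value from the main part of a config file,
# stopping at the first [section] header.
_conf_name() {
    awk '/^\[/ { exit } $1 == "name:" { print $2; exit }' "$1"
}

# vm_name_by_id NODES_DIR VMID
vm_name_by_id() {
    for f in "$1"/*/qemu-server/"$2".conf; do
        [ -f "$f" ] || continue
        _conf_name "$f"
        return 0
    done
    return 1
}

# vm_id_by_name NODES_DIR NAME (exact match, one ID per line)
vm_id_by_name() {
    for f in "$1"/*/qemu-server/*.conf; do
        [ -f "$f" ] || continue
        if [ "$(_conf_name "$f")" = "$2" ]; then
            basename "$f" .conf
        fi
    done
}
```

Usage: <code>vm_name_by_id /etc/pve/nodes 101</code> or <code>vm_id_by_name /etc/pve/nodes <NAME></code>.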
+&nbsp;
+== Can't migrate VM with local CD/DVD ==
+Remove the CD&nbsp;:)
+&nbsp;
+&nbsp;
+== Memory allocated to VMs ==
+ qm list|egrep -v "VM|stopped" | awk '{ sum+=$4 } END { print sum }'
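A slightly stricter variant filters on the status column instead of excluding text, so a guest with "VM" in its name is still counted. This is a sketch under the assumption that <code>qm list</code> prints the columns VMID, NAME, STATUS, MEM(MB) in that order and that names contain no spaces; the function name <code>sum_running_mem</code> is made up.

```shell
#!/bin/sh
# Hypothetical helper: read "qm list" output on stdin and print the
# total of the 4th column (memory in MB) for rows whose 3rd column
# is exactly "running".
sum_running_mem() {
    awk '$3 == "running" { sum += $4 } END { print sum + 0 }'
}
```

Usage: <code>qm list | sum_running_mem</code>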
+== Ceph ==
+=== Got timeout(500) ===
+Check
+ pveceph status
+Possibly problem with ceph mgr
+&nbsp;
+==vzdump: # cluster wide vzdump cron schedule==
+ # Automatically generated file - do not edit
+Edit it anyway?
+==Guest issues==
+===virtio_balloon virtio0: Out of puff! Can't get 1 pages===
+==iSCSI==
+===iscsid: conn 0 login rejected: initiator error - target not found===
+ pvesm scan iscsi <targetip>
+and
+ iscsiadm -m session -P 3
+==udev high load==
+Check
+ udevadm monitor
+ KERNEL[426405.347906] change   /devices/virtual/block/dm-8 (block)
+ UDEV  [426405.359582] change   /devices/virtual/block/dm-8 (block)
+Then look up which volume that device belongs to:
+ ls -al /dev/mapper/
+ pve-vm--113--disk--0 -> ../dm-8
+So VM/LXC '113' is the one.
+In general see https://forum.proxmox.com/threads/udev-malfunction-udisksd-high-cpu-load.99169/
+since it could be udisks2.
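The dm-to-guest lookup can be scripted. This is a sketch only: the function name <code>vmid_from_dm</code> is invented, and it assumes the mapper links follow the <code>pve-vm--<id>--disk--<n></code> naming seen in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: given a device name such as "dm-8", find the
# symlink in the mapper directory pointing at it and print the guest
# ID embedded in the link name.
# Usage: vmid_from_dm dm-8 [mapper_dir]   (mapper_dir defaults to /dev/mapper)
vmid_from_dm() {
    dir=${2:-/dev/mapper}
    for link in "$dir"/*; do
        [ -L "$link" ] || continue
        target=$(readlink "$link")
        # Compare only the last path component, e.g. "../dm-8" -> "dm-8".
        if [ "${target##*/}" = "$1" ]; then
            basename "$link" | sed -n 's/^.*-vm--\([0-9][0-9]*\)--.*$/\1/p'
        fi
    done
}
```

Usage: <code>vmid_from_dm dm-8</code>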
+==start failed: org.freedesktop.DBus.Error.Disconnected: Connection is closed==
+Most likely that VM isn't running.
+  [[Category:Proxmox]]


Links

Commands

Get PVE version

qm Qemu Manager

pvesm Storage manager

pveperf

pvecm

About pvecm output

pvesh

Get backup jobs

Get backup errors

Documentation

Proxmox API

Invalid token name

Ballooning

Directory structure

/etc/pve

/etc/pve/qemu-server

/var/lib/vz

Proxmox cluster

Cluster manager

HA status

VM configuration

Hard disk caching

Monitoring proxmox cluster with zabbix

Terms

vram_allocated (maxmem)

vram_used (mem)

Terms

vram

lrm

HOWTO

Maintenance

Rebooting a node

Disk cache for guest

Show vm configuration

Get VM name by ID

Clustering

Show cluster status

View cluster logs

Sysctl settings for kvm guests

Installing proxmox via PXE

Storage

Adding another thin pool

Get OS information of guest

Disks

Identify disks in linux guest

Run fstrim from host

Suspend or hibernate

Suspend

Hibernate

Backups

proxmox-backup-client

vzdump limit bandwidth

Get total memory allocated to vms

nvidia on proxmox

FAQ

Web interface stuck on "loading"

When clicking on guest on a particular node

Works on webui of that node

Console: unable to find serial interface

Cores or threads?

Cloud-init

No CloudInit Drive found

Error messages

create storage failed: storage 'XX' is not online (500)

memory: hotplug problem

Proxmox API call failed: Couldn't authenticate user: zabbix@pve

SMP vm created on host with unstable TSC; guest TSC will not be reliable

memory: hotplug problem - 400 Parameter verification failed. dimm17: error unplug memory module

Failed to establish a new connection: [Errno -2] Name or service not known

ConditionPathExists=/etc/corosync/corosync.conf was not met

https://blog.jenningsga.com/proxmox-keeping-quorum-with-qdevices/

corosync-qdevice[11695]: Can't read quorum.device.model cmap key

"Quorum: 2 Activity blocked"

x86/split lock detection: #AC: kvm/1161956 took a split_lock trap at address: 0x7ffebcb378ab