Proxmox

From DWIKI
Revision as of 14:54, 3 August 2022

Links

Proxmox Backup Server: https://www.proxmox.com/en/proxmox-backup-server
Backup and Restore: https://pve.proxmox.com/wiki/Backup_and_Restore
Proxmox Cluster with Raspberry Pi as QDevice: https://www.danatec.org/2021/05/21/two-node-cluster-in-proxmox-ve-with-raspberry-pi-as-qdevice/

 

 

Commands

Get PVE version

pveversion -v  | head -n 2

qm Qemu Manager

pvesm Storage manager

Scan for available storage; needs a storage type argument, e.g.:

pvesm scan nfs <server>

pveperf

pvecm

pvesh

pvesh get /cluster/resources
pvesh get /cluster/resources --output-format json-pretty

Documentation

Proxmox API


Invalid token name

PVE uses separator '=', but PBS wants ':'
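The '='/':' difference shows up in the API token Authorization header. A sketch from memory (user, token name and secret invented; check the PVE/PBS API docs for the authoritative format):

```
# PVE: '=' between token id and secret
Authorization: PVEAPIToken=monitor@pve!mytoken=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
# PBS: ':' between token id and secret
Authorization: PBSAPIToken=monitor@pbs!mytoken:aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
```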

Directory structure

/etc/pve

/etc/pve/qemu-server

The VM configs

vmstate? Seems related to snapshots (saved RAM state when snapshotting with memory)


/var/lib/vz

/var/lib/vz/template/iso

 

 

Proxmox cluster

https://pve.proxmox.com/wiki/Cluster_Manager

Cluster manager

pvecm status
pvecm nodes

 

HA status

ha-manager status

Monitoring proxmox with zabbix

https://github.com/takala-jp/zabbix-proxmox


HOWTO

Get VM name by ID

grep '^name:' /etc/pve/nodes/*/qemu-server/$ID.conf | awk '{print $2}'

and

pvesh get /cluster/resources --type vm --output-format yaml | egrep -i 'vmid|name' | sed 's@.*:@@'
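The grep approach above can be tried without a real cluster; this writes a sample config to a made-up temp path:

```shell
# Demo of the '^name:' extraction on a sample VM config
# (/tmp path and VM name are invented for illustration).
mkdir -p /tmp/qemu-server-demo
cat > /tmp/qemu-server-demo/101.conf <<'EOF'
boot: order=scsi0
cores: 2
name: webserver
memory: 2048
EOF
grep '^name:' /tmp/qemu-server-demo/101.conf | awk '{print $2}'   # → webserver
```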

Clustering

Show cluster status

pvecm status

It seems relatively safe to restart corosync


Sysctl settings for kvm guests

Still investigating, going for /etc/sysctl.d/50-kvmguest.conf

vm.vfs_cache_pressure=30
vm.swappiness=5

FAQ

Error messages

Proxmox API call failed: Couldn't authenticate user: zabbix@pve

Funky characters in password string?

 

Failed to establish a new connection: [Errno -2] Name or service not known

Just that, check your DNS

 

 


https://blog.jenningsga.com/proxmox-keeping-quorum-with-qdevices/

https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support

corosync-qdevice[11695]: Can't read quorum.device.model cmap key

On the qdevice node

Check corosync-cmapctl ?

also see https://pve.proxmox.com/pve-docs/chapter-pvecm.html#_corosync_external_vote_support

"Quorum: 2 Activity blocked"

In my case this meant booting up the second real node first

 

 

On working node:

corosync-cmapctl | grep quorum.device
quorum.device.model (str) = net
quorum.device.net.algorithm (str) = ffsplit
quorum.device.net.host (str) = 192.168.178.2
quorum.device.net.tls (str) = on
quorum.device.votes (u32) = 1

https://bugs.launchpad.net/ubuntu/+source/corosync-qdevice/+bug/1733889

x86/split lock detection: #AC: kvm/1161956 took a split_lock trap at address: 0x7ffebcb378ab

Windows guest? Can probably be ignored

Shutting down a node

Should just work. Takes guests down with it when they're not in HA

API calls

List all vms in cluster

pvesh get /cluster/resources --type vm --output-format yaml | egrep -i 'vmid|name'

or json:

pvesh get /cluster/resources --type vm --output-format json | jq '.[] | {id,name}'
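The YAML pipeline above can be dry-run against canned pvesh output; the VM names and ids below are invented:

```shell
# Filter vmid/name pairs from canned 'pvesh ... --output-format yaml' output.
cat <<'EOF' | egrep -i 'vmid|name'
- id: qemu/101
  name: webserver
  vmid: 101
- id: qemu/102
  name: dbserver
  vmid: 102
EOF
```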


Cores, sockets and vCPUs

vCPUs is what the VM uses; equals sockets*cores
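As a quick sanity check of the sockets*cores arithmetic (example values):

```shell
# 2 sockets x 4 cores per socket = 8 vCPUs in the guest
SOCKETS=2
CORES=4
echo $((SOCKETS * CORES))   # → 8
```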

 

 

Migrating

VM is locked (create) (500)

Not always clear why, but try

qm unlock 111


Replication

missing replicate feature on volume 'local-lvm'

looks like replication of lvm isn't supported

Check if qemu agent is running

See if IP is shown under Summary, also

qm agent 105 ping

qm agent ping return values

0: OK

2: VM not running

255: No QEMU guest agent configured (maybe just disabled in the VM config?); "QEMU guest agent is not running" would presumably only show when the agent is enabled in the config but not running

When VM is not running, GUI claims agent not running
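`qm agent` only exists on a PVE node, so here is a small sketch that just interprets the exit codes listed above (the helper function name is invented):

```shell
# Map 'qm agent <id> ping' exit codes to messages; on a real node you'd
# run 'qm agent $ID ping' and pass in $?.
agent_status() {
  case "$1" in
    0)   echo "agent OK" ;;
    2)   echo "VM not running" ;;
    255) echo "no guest agent configured" ;;
    *)   echo "unknown status $1" ;;
  esac
}
agent_status 255   # → no guest agent configured
```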

Move to unused disk

If you moved disk, and decided to move back to the old one:

  • detach current disk
  • select the unused disk
  • click Add

Stop all proxmox services

systemctl stop pve-cluster
systemctl stop pvedaemon
systemctl stop pveproxy
systemctl stop pvestatd

Storage (xx) not available on selected target

probably some storage mounted only on one node, so not clustered

 

switch to community repository

cat /etc/apt/sources.list.d/pve-enterprise.list 
#deb https://enterprise.proxmox.com/debian/pve buster pve-enterprise
echo "deb http://download.proxmox.com/debian/pve buster pve-no-subscription" > /etc/apt/sources.list.d/pve-no-subscription.list 
apt update

W: (pve-apt-hook) You are attempting to remove the meta-package 'proxmox-ve'!

Check sources.list :)

 

Storage

Add local disk or LV to vm

That would be passthrough

qm set 101 -scsi1 /dev/mapper/somevolume

Make sure the VM can't migrate to another node: ?? PVE won't try that anyway, but still

 

fstrim guests

qm guest <ID> fstrim

qmp command 'guest-fstrim' failed - got timeout

seems to be a windows thing

qcow image bigger than assigned disk

Probably snapshots

Backups

PBS GC & Prune scheduling

https://pbs.proxmox.com/docs/prune-simulator/

qmp command 'backup' failed - got timeout

https://github.com/proxmox/qemu/blob/master/qmp-commands.hx

 

proxmox-backup-client

export PBS_REPOSITORY="backup@pbs@pbs-server:backuprepo"
proxmox-backup-client snapshot list
proxmox-backup-client prune vm/101 --dry-run --keep-daily 7 --keep-weekly 3
proxmox-backup-client garbage-collect

dirty-bitmap status: existing bitmap was invalid and has been cleared

 

unexpected property 'prune-backups' (500)

When, for example, adding iSCSI storage: uncheck "Keep all backups" under "Backup retention"

 

FAILED 00:00:02 unable to activate storage

TODO

 

VM 101 Backup failed: VM is locked (snapshot)

Check if there's no snapshot running (how?)

qm unlock 101


qmp command 'blockdev-snapshot-delete-internal-sync' failed - got timeout

Another job for

qm unlock 101

can't acquire lock '/var/run/vzdump.lock' - got timeout

Check if vzdump is still running; if it's stuck, kill it (cluster-wide?)

 

VM 101 Backup failed: VM is locked (snapshot-delete)

Check /etc/pve/qemu-server/101.conf for 'snapstate'

If that says 'delete' for a snapshot try deleting the snapshot:

qm delsnapshot 101 snapname

If that throws something like Failed to find logical volume 'pve/snap_vm-101-disk-0_saving', use

 qm delsnapshot 101 snapname --force

to get it out of 101.conf
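The snapstate check can be sketched against a sample config (temp path and values invented):

```shell
# Look for a stuck snapshot state in a sample VM config.
cat > /tmp/101-demo.conf <<'EOF'
name: webserver
parent: before-upgrade
snapstate: delete
EOF
grep '^snapstate:' /tmp/101-demo.conf   # → snapstate: delete
```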

 

qmp command 'query-backup' failed - got wrong command id

Restoring single file from (PBS) backup

Check Mounting of archives with fuse

Requires package proxmox-backup-file-restore:

proxmox-file-restore

Error: VM quit/powerdown failed - got timeout

qm stop VMID

if that complains about lock, remove the lock and try again

 


You have not turned on protection against thin pools running out of space.

serial console from command line

qm terminal <id>

enable serial console in guest

looks like this is not needed:

systemctl enable serial-getty@ttyS0.service

in /etc/default/grub

GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0 console=tty0"

ttyS0 is for qm terminal, tty0 is for the "console" button in the UI

On Debian-based guests:

update-grub

On RedHat-based guests:

grub2-mkconfig --output=/boot/grub2/grub.cfg

 

add

serial0: socket

to /etc/pve/qemu-server/[vmid].conf and restart

 

agetty: /dev/ttyS0: not a device

systemctl status is useless again; means the serial0 line is missing from <vmid>.conf

TASK ERROR: command 'apt-get update' failed: exit code 100

Subtle way of telling you to get a subscription, or at least change the sources list

Import vmdk to lvm

https://pve.proxmox.com/wiki/Qemu/KVM_Virtual_Machines#_importing_virtual_machines_and_disk_images

Can't apply changes to memory allocation

Maybe try enabling NUMA in CPU settings

 

 

Adding hardware shows orange

Something is not supported (check Options->Hotplug)

"Connection error 401: no ticket"

Login session expired?

can't lock file '/var/lock/qemu-server/lock-102.conf' - got timeout (500)

Maybe someone else has/had webui open, otherwise just remove it


TASK ERROR: Can't use string ("keep-all=0,keep-last=3") as a HASH ref while "strict refs" in use at /usr/share/perl5/PVE/VZDump.pm line 502.

Classic, means incorrect syntax in your /etc/pve/storage.cfg

 

The current guest configuration does not support taking new snapshots

You're using raw instead of qcow2. Convert: Hardware->Hard disk "Move Disk"

 

 

WARNING: Device /dev/dm-21 not initialized in udev database even after waiting 10000000 microseconds.

Until someone fixes it:

udevadm trigger
      

Also look for link to dm-21 in /dev/disk/by-id

"connection error - server offline?"

Try reconnecting the browser

 

 

Find vm name by id

qm config 100 | grep '^name:' | awk '{print $2}'

or a bit cruder:

grep name: /etc/pve/nodes/*/qemu-server/101.conf |head -n 1 | cut -d ' ' -f 2

Started Proxmox VE replication runner.

??

Find ID by name

grep -l "name: <NAME>"  /etc/pve/nodes/*/qemu-server/*conf| sed 's/^.*\/\([0-9]*\)\.conf/\1/g'
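The sed above just strips everything but the numeric ID from the config path; a quick demo on an invented path:

```shell
# Extract the VMID from a config path.
echo "/etc/pve/nodes/node1/qemu-server/101.conf" \
  | sed 's/^.*\/\([0-9]*\)\.conf/\1/g'   # → 101
```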
      

 

Can't migrate VM with local CD/DVD

Remove the CD :)

 

 

Memory allocated to VMs

qm list|egrep -v "VM|stopped" | awk '{ sum+=$4 } END { print sum }'
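The awk sum can be tried on canned `qm list` output (VMs invented); the header and stopped guests are filtered out first, so this prints 6144 for the sample below:

```shell
# Sum column 4 (memory in MB) of running VMs from canned 'qm list' output.
cat <<'EOF' | egrep -v "VM|stopped" | awk '{ sum+=$4 } END { print sum }'
      VMID NAME                 STATUS     MEM(MB)    BOOTDISK(GB) PID
       101 webserver            running    2048              32.00 1234
       102 dbserver             running    4096              64.00 2345
       103 testbox              stopped    1024              16.00 0
EOF
```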

Ceph

Got timeout (500)

Check

pveceph status

Possibly problem with ceph mgr

 


vzdump: # cluster wide vzdump cron schedule

# Automatically generated file - do not edit

edit it anyway?


Guest issues

virtio_balloon virtio0: Out of puff! Can't get 1 pages