A guest with 480 GiB of RAM on a 512 GiB machine takes several minutes to start because QEMU's memory pre-allocation is too slow (about 2 GiB/s). The solution is to preallocate the necessary amount of RAM in the huge page pool at boot, and to make QEMU/KVM back the guest's memory with that pool.
- Check which huge page sizes the platform supports (the `pse` CPU flag indicates 2 MiB pages, `pdpe1gb` indicates 1 GiB pages):

  ```shell
  # -w matches whole words, so "pse" does not also match e.g. "pse36"
  if grep -qw pse /proc/cpuinfo
  then echo "2048K = OK"
  else echo "2048K = NO"
  fi
  if grep -qw pdpe1gb /proc/cpuinfo
  then echo "1G = OK"
  else echo "1G = NO"
  fi
  ```
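The supported sizes can also be read directly from sysfs; each size the running kernel supports appears as a directory (a quick sketch):

```shell
# e.g. hugepages-2048kB and, with pdpe1gb support, hugepages-1048576kB:
ls /sys/kernel/mm/hugepages
```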
- Add kernel arguments to preallocate the necessary number of huge pages at boot (in this case, 480 × 1 GiB pages):

  ```
  hugepagesz=1G default_hugepagesz=1G hugepages=480
  ```

  - On Proxmox, add them to `/etc/kernel/cmdline` and run `pve-efiboot-tool refresh`.
  - On Ubuntu, add them to the `GRUB_CMDLINE_LINUX_DEFAULT` variable in `/etc/default/grub`, then run `update-grub`.
  - Note: while this can also be done on the fly by writing a value to `/proc/sys/vm/nr_hugepages`, that is likely to take a long time or to fail to allocate the requested number of pages. The kernel argument takes effect at boot, while memory is not yet fragmented, so the allocation completes within seconds.
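After rebooting with these arguments, the state of the pool can be verified; a minimal check, assuming the 1 GiB / 480-page configuration above:

```shell
# HugePages_Total should read 480 and Hugepagesize 1048576 kB
# once the boot-time allocation has succeeded.
grep -E 'HugePages_Total|HugePages_Free|Hugepagesize' /proc/meminfo
```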
- Mount the HugeTLB filesystem:
  - On Proxmox, add `hugetlbfs /dev/hugepages hugetlbfs mode=01770 0 0` to `/etc/fstab`.
  - On Ubuntu, this is not needed, as it is already mounted (check with `mount | grep hugetlbfs`).
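On either distribution, the active mount and its options can also be confirmed from `/proc/mounts`; a minimal check:

```shell
# hugetlbfs mounts are listed in /proc/mounts along with their options:
grep hugetlbfs /proc/mounts || echo "hugetlbfs is not mounted"
```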
- Configure QEMU/KVM to use 1 GiB huge pages:
  - On Proxmox, edit `/etc/pve/qemu-server/<ID>.conf` and add the line `hugepages: 1024` (the value is the page size in MiB, so 1024 means 1 GiB pages).
  - For Libvirt, see here: https://help.ubuntu.com/community/KVM%20-%20Using%20Hugepages
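For Libvirt, huge page backing is requested in the guest's domain XML via a `memoryBacking` element; a sketch, assuming 1 GiB pages as configured above:

```xml
<memoryBacking>
  <hugepages>
    <page size="1" unit="G"/>
  </hugepages>
</memoryBacking>
```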
- On NUMA systems, huge pages allocated by the kernel are distributed equally between nodes, so VMs bound to specific NUMA nodes should be configured not to exceed the per-node allocation.
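The per-node split can be inspected through sysfs; a sketch, assuming the 1 GiB pool (directory `hugepages-1048576kB`):

```shell
# Print how many 1 GiB huge pages each NUMA node currently holds;
# nodes are skipped when the 1 GiB pool directory does not exist.
for d in /sys/devices/system/node/node*/hugepages/hugepages-1048576kB; do
  [ -d "$d" ] || continue
  echo "$d: $(cat "$d/nr_hugepages")"
done
```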