“The only true wisdom is in knowing you know nothing” - Socrates

Introduction

Back when I built my server (Link), I assumed that by right-sizing my VMs (so that two VMs fit in each NUMA node), the hypervisor (libvirt/QEMU/KVM in my case) would take care of any NUMA tuning. I was so wrong…

François Donzé has a post on this: Link

Red Hat also has some guidelines on NUMA tuning: Link

Key Takeaways

  1. According to Red Hat, a combination of vcpupin, emulatorpin and numatune is required to actually pin a VM to a NUMA node.

  2. A “strict” numatune memory policy is dangerous: if memory runs out on the NUMA node, the VM will be OOM-killed. According to François, the “interleave” policy is a much better option since it allows the VM to borrow memory from nearby nodes.

  3. numastat can be used to verify that the NUMA tuning is working as expected. For example, “watch -n 1 numastat -c qemu-kvm” is a great way to observe VMs requesting memory from other nodes.

  4. numactl can be used to check the NUMA topology of the machine. For example, “numactl -H”.
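The topology that “numactl -H” reports can also be read straight from sysfs. Here is a minimal sketch, assuming a Linux host that exposes /sys/devices/system/node. The function name numa_cpulists and its optional root argument are made up for illustration and are not part of any tool.

```shell
# Print "nodeN: <cpulist>" for every NUMA node found under a sysfs-style root.
# The optional argument (hypothetical, for illustration) only exists so the
# function can be pointed at another directory; on a real host, call it with
# no arguments.
numa_cpulists() {
    root="${1:-/sys/devices/system/node}"
    for dir in "$root"/node[0-9]*; do
        # Skip anything without a readable cpulist (including an unmatched glob).
        [ -r "$dir/cpulist" ] || continue
        printf '%s: %s\n' "${dir##*/}" "$(cat "$dir/cpulist")"
    done
}

numa_cpulists
```

The CPU ranges printed per node are the values to feed into vcpupin and emulatorpin, and they should agree with what “numactl -H” shows.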

NUMA Tuning on libvirt

Assuming each NUMA node has access to 16 cores and 32 GB of RAM, we can pin 2 machines (each with 8 cores + 16 GB of RAM) to it.
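The sizing arithmetic can be sketched as a tiny helper. The function name vms_per_node is hypothetical, and it ignores any memory the host itself needs on that node, so treat the result as an upper bound.

```shell
# How many VMs of a given size fit in one NUMA node: the smaller of the
# CPU-bound and memory-bound counts (integer division).
# Arguments: node CPUs, node memory in GB, VM vCPUs, VM memory in GB.
vms_per_node() {
    node_cpus="$1"; node_mem_gb="$2"; vm_cpus="$3"; vm_mem_gb="$4"
    by_cpu=$((node_cpus / vm_cpus))
    by_mem=$((node_mem_gb / vm_mem_gb))
    if [ "$by_cpu" -lt "$by_mem" ]; then
        echo "$by_cpu"
    else
        echo "$by_mem"
    fi
}

vms_per_node 16 32 8 16   # prints 2
```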

Let’s assume NUMA node 0 has access to 16 host CPUs (0-7 and 64-71).

For machine1:

virsh vcpupin machine1 0 0 --config
virsh vcpupin machine1 1 1 --config
virsh vcpupin machine1 2 2 --config
virsh vcpupin machine1 3 3 --config
virsh vcpupin machine1 4 4 --config
virsh vcpupin machine1 5 5 --config
virsh vcpupin machine1 6 6 --config
virsh vcpupin machine1 7 7 --config
virsh emulatorpin machine1 0-7 --config
virsh numatune machine1 --mode interleave --nodeset 0 --config

For machine2:

virsh vcpupin machine2 0 64 --config
virsh vcpupin machine2 1 65 --config
virsh vcpupin machine2 2 66 --config
virsh vcpupin machine2 3 67 --config
virsh vcpupin machine2 4 68 --config
virsh vcpupin machine2 5 69 --config
virsh vcpupin machine2 6 70 --config
virsh vcpupin machine2 7 71 --config
virsh emulatorpin machine2 64-71 --config
virsh numatune machine2 --mode interleave --nodeset 0 --config
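Typing eight vcpupin lines per VM gets tedious, so the commands above can be generated instead. This is a rough sketch, assuming vCPUs map 1:1 onto a contiguous range of host CPUs as in the examples above. The function name gen_pinning is made up for illustration, and it only prints the commands rather than running them.

```shell
# Print (not execute) the virsh commands that pin a domain's vCPUs 1:1 onto a
# contiguous range of host CPUs, pin the emulator threads to the same range,
# and set an interleave memory policy on the given NUMA node.
# Arguments: domain name, first host CPU, vCPU count, NUMA node.
gen_pinning() {
    domain="$1"; first="$2"; count="$3"; node="$4"
    last=$((first + count - 1))
    vcpu=0
    while [ "$vcpu" -lt "$count" ]; do
        echo "virsh vcpupin $domain $vcpu $((first + vcpu)) --config"
        vcpu=$((vcpu + 1))
    done
    echo "virsh emulatorpin $domain $first-$last --config"
    echo "virsh numatune $domain --mode interleave --nodeset $node --config"
}

# Reproduces the machine2 commands listed above.
gen_pinning machine2 64 8 0
```

After reviewing the output, it can be piped into sh to apply it. Since every command uses --config, the change goes into the persistent definition and takes effect the next time the VM is started. Running “virsh vcpupin machine2”, “virsh emulatorpin machine2” and “virsh numatune machine2” without further arguments shows the current settings.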