“The only constant in life is change” - Heraclitus

Update: I have realized that without manually tuning for NUMA, it’s almost certain that you’ll run into cross-NUMA issues! I’ve written about it here: Link

EPYC

Motive

I had to get rid of the rack server I had been using as my homelab: a Dell PowerEdge R620 that I bought second-hand last year. While it was fairly powerful (256 GB of RAM and 48 CPU cores), it was noisy and my electricity bills were through the roof because of its power consumption. The disks were also too slow for my liking; Kubernetes nodes, and especially Ceph, do not like slow disks. And let’s be honest, if you’re not on NVMe in 2023 you’re doing something wrong.

One day, I stumbled upon some forums created by and for people passionate about CFD (Computational Fluid Dynamics) and happened to read about the machines they tend to use for it. Apparently, EPYC processors are pretty popular in that community. Given their high core counts and insane number of PCIe lanes, it only makes sense.

Then, I wondered how good EPYC processors would be for my workloads, i.e. virtualization. Maybe I could get some second-hand processors from resellers here in India…

Down the Rabbit Hole

Anything but the first-gen EPYCs is super expensive, which makes the first generation the only viable option for me since I’m not a millionaire.

Because of how first-gen EPYCs were designed, each CPU package is essentially a group of 4 dies. The following is the NUMA topology described in the Naples spec sheet:

  • 4 Dies per package.
  • 2 Core-Complexes (CCXs) per Die.
  • Up to 4 Cores per CCX sharing an L3 cache. All CCXs configured equally.
  • 2 Threads per Core (SMT) sharing an L2 cache.
  • 2 Memory channels per die.
  • 8 memory channels per package with up to 2 DIMMs per channel.
  • Platform support for one or two SoCs (1P or 2P).

(https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/specifications/56308-numa-topology-for-epyc-naples-family-processors.pdf)

So, 4 NUMA nodes on a single socket! Naturally, this can cause issues with virtualization workloads: if a VM’s core count exceeds the number of cores on a die, some of its cores will be scheduled on a different NUMA node, which incurs a performance penalty. AMD addressed this from the 2nd gen onwards, but those processors remain expensive to this day. So, we’ll have to figure something out.
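To put numbers on that, here’s a quick back-of-the-envelope sketch of what the Naples topology works out to, assuming a fully-enabled 32-core part (4 cores per CCX) in a dual-socket system; smaller SKUs have fewer cores per CCX, but the NUMA node count stays the same.

```python
# Back-of-the-envelope NUMA math for Naples, assuming a 32-core SKU
# (4 cores per CCX) in a dual-socket (2P) system.
sockets = 2
dies_per_package = 4        # each die is its own NUMA node
ccx_per_die = 2
cores_per_ccx = 4           # "up to 4" per the spec sheet; 4 on 32-core parts
threads_per_core = 2        # SMT

numa_nodes = sockets * dies_per_package
cores_per_node = ccx_per_die * cores_per_ccx
threads_per_node = cores_per_node * threads_per_core

print(f"NUMA nodes:       {numa_nodes}")                      # 8
print(f"Cores per node:   {cores_per_node}")                  # 8
print(f"Threads per node: {threads_per_node}")                 # 16
print(f"Total threads:    {numa_nodes * threads_per_node}")    # 128
```

In other words, on a box like this a VM that wants to stay on a single NUMA node gets at most 8 cores / 16 threads, plus whatever memory hangs off that die.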

What if we configure our VMs to have core counts <= the total cores on a single NUMA node? Sounds reasonable, right? Well…

In a mixed configuration where different VMs have different core counts, we run the risk of cross-NUMA access.

For example:

  • Assume we have 2 NUMA nodes (A, B) with 8-cores on each node.
  • Assume a VM (VM1) with 4 cores is scheduled on node A and a VM (VM2) with 6 cores is scheduled on B.

That results in A and B having 4 cores and 2 cores available respectively.

Now, let’s create a VM (VM3) with 6 cores. See the problem here? Its 6 cores might end up split across A and B (4 cores and 2 cores, respectively), which would be bad performance-wise.

Extrapolating the scenario to 4 NUMA nodes, we can see that the same problem holds.
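To make the fragmentation concrete, here’s a toy sketch of a naive “fill the first node that fits, otherwise split” placement policy. It’s purely illustrative (real hypervisor schedulers are far more sophisticated); the node and VM sizes are the ones from the example above.

```python
# Toy vCPU placement across NUMA nodes, just to illustrate the fragmentation
# problem; not how any real hypervisor actually schedules VMs.
free = {"A": 8, "B": 8}   # cores free on each NUMA node

def place(vm, cores):
    # Prefer a node that can hold the whole VM; otherwise split it.
    for node, avail in free.items():
        if avail >= cores:
            free[node] -= cores
            print(f"{vm}: {{{node!r}: {cores}}}")
            return
    placement = {}
    for node, avail in free.items():
        take = min(avail, cores)
        if take:
            placement[node] = take
            free[node] -= take
            cores -= take
        if cores == 0:
            break
    print(f"{vm}: {placement}  <-- split across NUMA nodes")

place("VM1", 4)   # fits on A            -> VM1: {'A': 4}
place("VM2", 6)   # fits on B            -> VM2: {'B': 6}
place("VM3", 6)   # 4 left on A, 2 on B  -> VM3: {'A': 4, 'B': 2}  <-- split
```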

Will not having a mixed configuration for our VMs solve the issue?

Yeah, it would… but that would result in underutilization for certain VMs - those cores could always be put to use elsewhere.

So, it is very important to decide on our VM configurations (number of VMs, memory allocated, core count etc.) before we go about deciding on which CPU to buy.

My requirements:

Count   Node Type                                  CPU (per VM)   RAM (per VM)
3       Kubernetes/OpenShift Control Plane Node    4              8 GB
10      Kubernetes/OpenShift Worker Node           8              16 GB
1       DNS/IPA Server + BGP Router                4              8 GB
1       Helper Services                            4              8 GB
1       GNS3 Server                                16             32 GB
16      Total                                      116            232 GB
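As a quick sanity check on those totals (a throwaway sketch; the per-VM figures are just the rows of the table above):

```python
# Quick sanity check of the sizing table above (hypothetical helper, not part
# of any tooling I actually run).
vm_classes = [
    # (count, vCPUs per VM, RAM per VM in GB)
    (3, 4, 8),     # control plane nodes
    (10, 8, 16),   # worker nodes
    (1, 4, 8),     # DNS/IPA + BGP router
    (1, 4, 8),     # helper services
    (1, 16, 32),   # GNS3 server
]

total_vms = sum(c for c, _, _ in vm_classes)
total_vcpus = sum(c * v for c, v, _ in vm_classes)
total_ram = sum(c * r for c, _, r in vm_classes)

print(total_vms, total_vcpus, total_ram)   # 16 116 232
# Leaves 128 - 116 = 12 threads and 256 - 232 = 24 GB spare on the
# dual EPYC 7601 / 256 GB box described below.
```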

I decided to go with a couple of EPYC 7601 processors, since the seller promised me a 1-year warranty here in India.

With the above configuration, we will not run into cross-NUMA access as it’s guaranteed (Update: This is not true… read the update above… Oops!) that each VM will be scheduled on exactly one NUMA node provided:

  1. Each NUMA node has access to 16 threads (8 physical cores with SMT) and 32 GB of RAM
  2. CPU pinning is configured in your hypervisor
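For completeness, here is a rough sketch (in Python, driving `virsh`) of what pinning one VM to one NUMA node can look like under libvirt/KVM. It reads the node’s CPU list from sysfs and issues `virsh vcpupin` / `virsh numatune` commands; the domain name and node number are made up for the example, and this is only one of several ways to do it (the same thing can be expressed directly in the domain XML with `<cputune>` and `<numatune>`).

```python
# Sketch: pin every vCPU of a libvirt domain to the host CPUs of one NUMA node
# and bind its memory to that node. Domain name and node id are examples only.
import subprocess

def pin_vm_to_node(domain: str, vcpus: int, numa_node: int) -> None:
    # Host CPUs belonging to this NUMA node, e.g. something like "8-15,72-79"
    with open(f"/sys/devices/system/node/node{numa_node}/cpulist") as f:
        cpulist = f.read().strip()

    # Pin each vCPU to the node's CPU set (coarse pinning; you can also pin
    # individual vCPUs to individual host threads for tighter control).
    for vcpu in range(vcpus):
        subprocess.run(
            ["virsh", "vcpupin", domain, str(vcpu), cpulist, "--config"],
            check=True,
        )

    # Keep the VM's memory on the same node.
    subprocess.run(
        ["virsh", "numatune", domain, "--nodeset", str(numa_node),
         "--mode", "strict", "--config"],
        check=True,
    )

# Example (hypothetical domain name): pin an 8-vCPU worker to NUMA node 2.
# pin_vm_to_node("ocp-worker-01", vcpus=8, numa_node=2)
```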

Building the machine

Specs:

  1. Supermicro H11DSi Rev 2.0 (Rev 2.0 has support for 7002 series processors as well)
  2. 2 x EPYC 7601 - 128 threads in total
  3. 256 GB of DDR4 RAM

Here are some of the things that I learned while installing the CPU:

  1. Contact the vendor selling you the CPUs and make sure they’re not “Vendor Locked”. AMD had this EPYC idea of allowing vendors like Dell to lock processors to their motherboards, which means that EPYC CPUs pulled from Dell machines will not work with Supermicro motherboards. On Supermicro boards, using a vendor-locked CPU results in POST code 78. This happened to me and I had to return the processors and get them from somewhere else.

  2. Ensure the CPU is in the holding tray before opening the thingy that protects the pins. Otherwise, the CPU could fall into the socket and damage it which would be an EPYC fail.

  3. Man, those guys on Reddit and STH are not kidding when they say “Installing EPYCs/Threadrippers is a pain”, especially when you (like me) don’t have a torque screwdriver. It’s definitely possible without one, but it takes a lot of time and effort. You have to tighten each screw (1, 2 and 3) properly, and the mounting pressure tells you how you did:

     • Too little mounting pressure = the machine doesn’t turn on when the power button on the case is pressed.
     • Too much mounting pressure = the machine turns on automatically even when the power button is not pressed (assuming auto power-on is disabled in your motherboard’s settings). If the machine turns on by itself, the CPU is not mounted correctly.

     I’ve run memtest for 4 cycles and don’t see any issues, so I’m assuming my CPUs are mounted properly. I guess I can only confirm it once I start running my workloads on this.

     Also, install the CPUs and test the system before installing the motherboard into a case. Once it’s working outside the case, install it into the case and make sure the standoffs are installed correctly, otherwise you’ll encounter weird issues. For me it was POST code 94 (“PCI Enumeration”), and sometimes even ‘FF’, all related to the standoffs not being installed correctly - the problem was probably a short, because I had the case lying sideways on a step stool.

  4. A medium-sized blob of thermal paste at the center of the CPU is enough. I’ve done that, and in stress tests the max temperature for the CPUs is around 52 °C.

Forward Unto Dawn

I hope this post is helpful for homelabbers who wish to build an EPYC rig themselves. I have spent almost 3 months on this little venture of mine. I’m glad that it has finally come to fruition. Now, all that remains is to work hard and continue onward. May the Entropy guide me.