Re: [wallaby][nova] CPU topology and NUMA Nodes

24 Jun 2021

      ...
Hi,
I am using openstack wallaby on ubuntu 20.04 and kvm. I am working to make
optimized flavor properties that should provide optimal performance. I was
reviewing the document below.
https://docs.openstack.org/nova/wallaby/admin/cpu-topologies.html
I have two-socket AMD compute nodes. The workloads running on the nodes are
mixed.
My question is: should I use the default CPU topology and NUMA layout that
nova gives an instance, OR should I use hw:cpu_sockets='2'
and hw:numa_nodes='2'?
On Fri, 2021-06-25 at 10:02 +0500, Ammad Syed wrote:
The latter, hw:cpu_sockets='2' and hw:numa_nodes='2', should give you better performance.
However, you should also set hw:mem_page_size=small or hw:mem_page_size=any.
When you enable virtual NUMA policies, we affinitize the guest memory to host NUMA nodes.
This can lead to out-of-memory events on the host NUMA nodes, which can result in VMs
being killed by the host kernel memory reaper if you do not enable NUMA-aware memory
tracking in nova, which is done by setting hw:mem_page_size. Setting hw:mem_page_size has
the side effect of disabling memory overcommit, so you have to bear that in mind.
If you are using a NUMA topology you should almost always also use hugepages, which are enabled
using hw:mem_page_size=large. This, however, requires you to configure hugepages on the host
at boot.
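
As a rough illustration of the properties above, here is a minimal Python sketch that builds the extra specs and renders the matching `openstack flavor set --property` command line. The flavor name `m1.numa` and both helper functions are hypothetical, made up for this example, and the page-size check only covers the three values named in this thread.

```python
import shlex

# Only the values mentioned in this thread; this list is an assumption of
# the sketch, not a statement of everything nova accepts.
PAGE_SIZES_FROM_THREAD = {"small", "any", "large"}


def numa_flavor_properties(sockets=2, numa_nodes=2, mem_page_size="small"):
    """Return the flavor extra specs discussed above as a dict of strings."""
    if mem_page_size not in PAGE_SIZES_FROM_THREAD:
        raise ValueError(f"page size not covered by this sketch: {mem_page_size}")
    return {
        "hw:cpu_sockets": str(sockets),
        "hw:numa_nodes": str(numa_nodes),
        # Setting this enables NUMA-aware memory tracking and, as a side
        # effect, disables memory overcommit for the instance.
        "hw:mem_page_size": mem_page_size,
    }


def flavor_set_command(flavor, properties):
    """Render an `openstack flavor set` command for the given properties."""
    args = ["openstack", "flavor", "set"]
    for key, value in properties.items():
        args += ["--property", f"{key}={value}"]
    args.append(flavor)
    return " ".join(shlex.quote(arg) for arg in args)


# "m1.numa" is a hypothetical flavor name.
print(flavor_set_command("m1.numa", numa_flavor_properties()))
```

Passing mem_page_size="large" instead gives the hugepage variant, which only makes sense once hugepages have been configured on the host.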
...
Which one of the above provides the best instance performance? Or is there
any other tuning I should do?
In the libvirt driver the default CPU topology we generate
is 1 thread per core, 1 core per socket and 1 socket per flavor vCPU.
(Technically this is an undocumented implementation detail that you should not rely on; we have the hw:cpu_* elements if you care about the topology.)

This was more efficient in the early days of qemu/openstack, but it has many issues when software is charged per socket or when operating systems
have a limit on the number of sockets supported, such as Windows.

Generally I advise that you set hw:cpu_sockets to the typical number of sockets on the underlying host.
Similarly, if the flavor will only be run on hosts with SMT/hyperthreading enabled, you should set hw:cpu_threads=2.

flavor.vcpus must be divisible by the product of hw:cpu_sockets, hw:cpu_cores and hw:cpu_threads if they are set.

So if you have hw:cpu_threads=2, flavor.vcpus must be divisible by 2.
If you have hw:cpu_threads=2 and hw:cpu_sockets=2, flavor.vcpus must be a multiple of 4.
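
The divisibility rule above can be sketched in a few lines of Python. This only restates the arithmetic from this mail; the function names are hypothetical and this is not how nova itself validates a flavor.

```python
def topology_product(sockets=None, cores=None, threads=None):
    """Multiply together whichever of the hw:cpu_* values are set."""
    product = 1
    for value in (sockets, cores, threads):
        if value is not None:
            product *= value
    return product


def vcpus_fit_topology(vcpus, sockets=None, cores=None, threads=None):
    """True if flavor.vcpus is divisible by the product of the set values."""
    return vcpus % topology_product(sockets, cores, threads) == 0


# hw:cpu_threads=2 alone: any even vCPU count is fine.
print(vcpus_fit_topology(6, threads=2))
# hw:cpu_threads=2 and hw:cpu_sockets=2: vCPUs must be a multiple of 4.
print(vcpus_fit_topology(6, sockets=2, threads=2))
print(vcpus_fit_topology(8, sockets=2, threads=2))
```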
...
The note in the URL (CPU topology section) suggests that I should stay with
the default options that nova provides.
In general, no: you should align it to the host topology if you have a similar topology across your data center.
The default should always just work, but it is not necessarily optimal, and Windows guests might not boot if you have too many sockets.
Windows 10, for example, only supports 2 sockets, so you could only have 2 flavor.vcpus if you used the default topology.
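
To make the socket-limit point concrete, here is a small Python sketch. It assumes the default topology described in this thread (one socket per vCPU) and takes the guest's socket limit as an input. The function names are hypothetical, and the 2-socket figure for Windows 10 is the one quoted above.

```python
def guest_sockets(vcpus, cpu_sockets=None):
    """Sockets the guest sees: hw:cpu_sockets if set, else one per vCPU."""
    return cpu_sockets if cpu_sockets is not None else vcpus


def exceeds_socket_limit(vcpus, guest_socket_limit, cpu_sockets=None):
    """True if the guest would see more sockets than its OS supports."""
    return guest_sockets(vcpus, cpu_sockets) > guest_socket_limit


WINDOWS_10_SOCKET_LIMIT = 2  # figure quoted in this thread

# Default topology: an 8 vCPU flavor is presented as 8 sockets.
print(exceeds_socket_limit(8, WINDOWS_10_SOCKET_LIMIT))
# With hw:cpu_sockets=2 the same flavor stays within the limit.
print(exceeds_socket_limit(8, WINDOWS_10_SOCKET_LIMIT, cpu_sockets=2))
```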
...
Currently it also works with libvirt/QEMU driver but we don’t recommend it
in production use cases. This is because vCPUs are actually running in one
thread on host in qemu TCG (Tiny Code Generator), which is the backend for
libvirt/QEMU driver. Work to enable full multi-threading support for TCG
(a.k.a. MTTCG) is on going in QEMU community. Please see this MTTCG project
<http://wiki.qemu.org/Features/tcg-multithread> page for detail.
We do not generally recommend using qemu without kvm in production.
The MTTCG backend is useful in cases where you want to emulate other platforms, but that use case
is not currently supported in nova.
For your deployment you should use libvirt with kvm, and you should also consider whether you want to support
nested virtualisation or not.
...
Ammad


Sean Mooney