Hi,
I am using OpenStack Wallaby on Ubuntu 20.04 with KVM. I am trying to define flavor properties that give optimal performance. I was reviewing the document below.
https://docs.openstack.org/nova/wallaby/admin/cpu-topologies.html
I have two-socket AMD compute nodes. The workloads running on the nodes are mixed.
My question is: should I use the default CPU topology and NUMA placement that nova gives an instance, or should I set hw:cpu_sockets='2' and hw:numa_nodes='2'?
On Fri, 2021-06-25 at 10:02 +0500, Ammad Syed wrote:
The latter, hw:cpu_sockets='2' and hw:numa_nodes='2', should give you better performance. However, you should also set hw:mem_page_size=small or hw:mem_page_size=any. When you enable virtual NUMA policies, we affinitize the guest memory to host NUMA nodes. This can lead to out-of-memory events on the host NUMA nodes, which can result in VMs being killed by the host kernel memory reaper if you do not enable NUMA-aware memory tracking in nova, which is done by setting hw:mem_page_size. Setting hw:mem_page_size has the side effect of disabling memory overcommit, so you have to bear that in mind. If you are using a NUMA topology you should almost always also use hugepages, which are enabled using hw:mem_page_size=large. This, however, requires you to configure hugepages on the host at boot.
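As a rough illustration of the properties mentioned above, here is a minimal Python sketch that collects them and prints the matching `openstack flavor set` command. The helper names `numa_extra_specs` and `flavor_set_command` and the flavor name `m1.numa` are made up for this example; only the property keys and values come from this thread.

```python
import shlex


def numa_extra_specs(sockets=2, numa_nodes=2, mem_page_size="large"):
    """Build the flavor extra specs discussed in this thread (hypothetical helper)."""
    # This sketch only handles the three named values used in the thread.
    if mem_page_size not in {"small", "large", "any"}:
        raise ValueError(f"unsupported mem_page_size in this sketch: {mem_page_size}")
    return {
        "hw:cpu_sockets": str(sockets),
        "hw:numa_nodes": str(numa_nodes),
        "hw:mem_page_size": mem_page_size,
    }


def flavor_set_command(flavor, specs):
    """Render the specs as an `openstack flavor set` command line (hypothetical helper)."""
    args = ["openstack", "flavor", "set"]
    for key, value in specs.items():
        args += ["--property", f"{key}={value}"]
    args.append(flavor)  # "m1.numa" below is an assumed flavor name
    return shlex.join(args)


print(flavor_set_command("m1.numa", numa_extra_specs()))
```

Remember that hw:mem_page_size=large only works if hugepages are already configured on the host, and that any hw:mem_page_size setting disables memory overcommit for that flavor.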
Which of the above provides the best instance performance? Or is there any other tuning I should do?
In the libvirt driver, the default CPU topology we generate is 1 thread per core, 1 core per socket, and 1 socket per flavor.vcpu. (Technically this is an undocumented implementation detail that you should not rely on; we have the hw:cpu_* elements if you care about the topology.) This was more efficient in the early days of QEMU/OpenStack but has many issues when software is charged per socket or when operating systems, such as Windows, have a limit on the number of sockets supported. Generally I advise that you set hw:cpu_sockets to the typical number of sockets on the underlying host. Similarly, if the flavor will only be run on hosts with SMT/hyper-threading enabled, you should set hw:cpu_threads=2. The flavor.vcpus must be divisible by the product of hw:cpu_sockets, hw:cpu_cores and hw:cpu_threads if they are set. So if you have hw:cpu_threads=2 it must be divisible by 2; if you have hw:cpu_threads=2 and hw:cpu_sockets=2, flavor.vcpus must be a multiple of 4.
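The divisibility rule above can be sketched as a small check. This is only an illustration of the rule as stated in this thread, not nova's actual validation code, and the function name `topology_is_valid` is made up.

```python
from math import prod


def topology_is_valid(vcpus, sockets=None, cores=None, threads=None):
    """Return True if vcpus is divisible by the product of the hw:cpu_* values that are set."""
    set_values = [v for v in (sockets, cores, threads) if v is not None]
    if vcpus < 1 or any(v < 1 for v in set_values):
        raise ValueError("vcpus and topology values must be positive integers")
    # prod([]) is 1, so a flavor with no hw:cpu_* values always passes.
    return vcpus % prod(set_values) == 0


print(topology_is_valid(8, sockets=2, threads=2))  # 8 is a multiple of 4
print(topology_is_valid(6, sockets=2, threads=2))  # 6 is not a multiple of 4
print(topology_is_valid(6, threads=2))             # 6 is divisible by 2
```

So with hw:cpu_sockets=2 and hw:cpu_threads=2, an 8 vCPU flavor passes the check and a 6 vCPU flavor does not.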
The note in the URL (CPU topology section) suggests that I should stay with the default options that nova provides.
In general, no: you should align it to the host topology if you have a similar topology across your data center. The default should always just work, but it is not necessarily optimal, and Windows guests might not boot if you have too many sockets. Windows 10, for example, only supports 2 sockets, so you could only have 2 flavor.vcpus if you used the default topology.
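To make the socket-limit point concrete, here is a small sketch that assumes the default topology described earlier in this thread (one socket per vCPU when hw:cpu_sockets is not set). The function names are hypothetical, and the limit of 2 is the Windows 10 figure quoted above.

```python
def guest_visible_sockets(vcpus, sockets=None):
    """Sockets the guest sees: one per vCPU by default, else the hw:cpu_sockets value."""
    return vcpus if sockets is None else sockets


def within_socket_limit(vcpus, guest_socket_limit, sockets=None):
    """Return True if the guest's socket count fits the OS limit (hypothetical helper)."""
    return guest_visible_sockets(vcpus, sockets) <= guest_socket_limit


print(within_socket_limit(8, guest_socket_limit=2))             # default topology: 8 sockets
print(within_socket_limit(8, guest_socket_limit=2, sockets=2))  # hw:cpu_sockets=2
```

With the default topology an 8 vCPU guest exposes 8 sockets and exceeds a 2-socket limit, while the same flavor with hw:cpu_sockets=2 stays within it.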
Currently it also works with libvirt/QEMU driver but we don’t recommend it in production use cases. This is because vCPUs are actually running in one thread on host in qemu TCG (Tiny Code Generator), which is the backend for libvirt/QEMU driver. Work to enable full multi-threading support for TCG (a.k.a. MTTCG) is on going in QEMU community. Please see this MTTCG project <http://wiki.qemu.org/Features/tcg-multithread> page for detail.
We do not generally recommend using QEMU without KVM in production. The MTTCG backend is useful in cases where you want to emulate other platforms, but that use case is not currently supported in nova. For your deployment you should use libvirt with KVM, and you should also consider whether you want to support nested virtualisation.
Ammad