how to define a flavor... numa topology and/or cpu pinning?
Hi, considering my hosts have numa architecture, what is the difference between "numa topology" and "cpu pinning"? Would "cpu pinning" make any difference if "numa topology" is already in place? Thank you very much.
On Sat, 2019-04-13 at 02:12 +0000, Manuel Sopena Ballesteros wrote:
Hi,
considering my hosts have numa architecture.
what is the difference between "numa topology"

A NUMA topology refers to the topology of your memory controllers, and therefore affects any operation involving memory access, including but not limited to memory-mapped IO to hardware devices such as NICs or GPUs.
If we say a VM or host has 2 NUMA nodes, that means it has two independent memory controllers, which is pretty common these days. Having multiple memory controllers doubles the total memory bandwidth of the system; however, the distance (the physical length of the copper traces) between a CPU and a memory controller, or between a device and a memory controller, differs for each of the two controllers, as the CPU core or device is physically positioned closer to one of them. Basically, because the speed of light is not infinite, if you have more than one memory controller then accessing different parts of memory will have different latency, depending on the memory address, the device doing the access, and whether the controller that device is closest to is the one that manages that area of memory.

So, put simply, when we are talking about NUMA we are talking about memory, and when we talk about NUMA affinity we are talking about minimising the latency between a device (a CPU or an MMIO device like a NIC) and that memory by selecting a device whose local memory controller "owns" that memory, resulting in the shortest path. To do this in OpenStack, if you add hw:numa_nodes=1 to the flavor we will ensure the CPUs and PCI passthrough devices used by a guest come from the same NUMA node (memory controller) as the RAM for that guest, thereby optimising for minimum memory latency.
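For example, assuming a flavor named m1.numa (a placeholder name, substitute your own flavor), the flavor-side configuration is roughly:

    # ask for a guest numa topology of a single node so that vcpus, ram
    # and pci passthrough devices come from the same host numa node where
    # possible (the NUMATopologyFilter scheduler filter must be enabled)
    openstack flavor set m1.numa --property hw:numa_nodes=1

hw:numa_nodes can also be set to 2 or more if the guest is too large to fit within a single host NUMA node.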
and "cpu pinning"? would "cpu pinning" make any difference if "numa topology" is already in place? cpu pinning can althouth the effect will be less then from enableing a numa toplogy for io intesive workloads.
When you enable CPU pinning in OpenStack we also implicitly create a NUMA topology of 1 node (you can still specify multiple NUMA nodes), so you get all the benefits of NUMA affinity automatically when you enable CPU pinning. We should have kept these separate, but that is the legacy we have today. Where CPU pinning helps is if your workload is compute bound. When you enable NUMA affinity in OpenStack we confine the guest CPUs to float over the host CPUs of the NUMA node that is local to the memory of the guest. While this soft pinning results in a performance boost for IO operations, it does not improve CPU-bound performance. Enabling CPU pinning will pin each guest CPU to a host hardware thread and additionally prevent oversubscription by other pinned CPUs, i.e. we will never pin two guest vCPUs to the same hardware thread. With NUMA affinity alone, the guest not only floats over host CPUs, but those host CPUs can also be oversubscribed.
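As a sketch of the flavor side only (the compute hosts also need to be configured for pinning, e.g. via vcpu_pin_set in nova.conf), dedicated pinning is requested with hw:cpu_policy:

    # pin each guest vcpu to its own host hardware thread; pinned vcpus
    # are never oversubscribed by other pinned guests
    openstack flavor set m1.pinned --property hw:cpu_policy=dedicated

m1.pinned is again just a placeholder flavor name.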
thank you very much
One other thing you should look into is the use of hugepage memory. Like CPU pinning, enabling hugepages creates an implicit NUMA topology of 1 NUMA node if you do not override it with hw:numa_nodes. Using hugepage memory for the guest via hw:mem_page_size will further improve memory-intensive guest workloads by minimising TLB cache misses, optimising virtual-to-physical memory address translation.
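Assuming hugepages have already been allocated on the compute hosts, the flavor side looks something like:

    # back guest ram with hugepages; like pinning, this implies a
    # 1-node numa topology unless hw:numa_nodes is also set
    openstack flavor set m1.hugepages --property hw:mem_page_size=large

hw:mem_page_size also accepts an explicit size such as 2MB or 1GB if you need a specific page size; m1.hugepages is a placeholder name.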
Also, as I have mentioned in the past, notices like the one below raise legal ambiguity over whether they are allowed to be sent to the list; with a strict reading it would not allow the mailing list to forward the email. I am choosing to read this as all subscribers of the mailing list being the intended recipients. Such disclaimers are almost always unenforceable and are effectively spam, so while it is not strictly against our mailing list etiquette https://wiki.openstack.org/wiki/MailingListEtiquette it is still best avoided.
NOTICE Please consider the environment before printing this email. This message and any attachments are intended for the addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this message in error please notify us at once by return email and then delete both messages. We accept no liability for the distribution of viruses or similar in electronic communications. This notice should not be removed.
participants (2)
- Manuel Sopena Ballesteros
- Sean Mooney