hw:numa_nodes=1 does not enable per-NUMA-node memory tracking. To resolve your OOM issue you need to set hw:mem_page_size=small or hw:mem_page_size=any.
Ah! That's what I was looking for! :) Thank you Sean!
The reason it always selects NUMA node 0 is that nova takes the list of host NUMA nodes and checks each one using itertools.permutations. That always checks the NUMA nodes in a stable order, starting with NUMA node 0.
Since you have just set hw:numa_nodes=1 without requesting any NUMA-specific resources, e.g. memory or CPUs, NUMA node 0 will effectively always fit the VM.
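To see why the order is stable, here is a small standalone illustration of how itertools.permutations walks its input. The two-node host list is a made-up example, and this is not nova's actual code, only the iteration behaviour it relies on:

```python
import itertools

# Hypothetical host with two NUMA nodes, listed in ID order.
host_numa_nodes = [0, 1]

# A guest with hw:numa_nodes=1 needs exactly one host node per candidate,
# so we take permutations of length 1.
candidates = list(itertools.permutations(host_numa_nodes, 1))

# permutations() emits tuples in the order of the input sequence,
# so node 0 is always the first candidate that gets tried.
print(candidates)  # [(0,), (1,)]
```

If the first candidate fits, nothing later in the list is ever considered, which is why node 0 keeps winning.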
Makes sense.
When you set hw:numa_nodes=1 and nothing else, the scheduler will only reject a node if the number of CPUs on the NUMA node is less than the number the VM requests. It will not check the memory available on the NUMA node, since you did not ask nova to do that via hw:mem_page_size.
Effectively, if you are using any NUMA feature in nova and do not set hw:mem_page_size, then your flavor is misconfigured, as it will not request NUMA-local memory tracking to be enabled.
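A toy model of the behaviour described above may help. Everything here (HostNode, pick_node, the field names and the numbers) is hypothetical and heavily simplified; it is not how nova implements the check, only a sketch of "CPUs are always checked, memory only when hw:mem_page_size is set":

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HostNode:
    """Hypothetical view of one host NUMA node."""
    node_id: int
    cpus: int
    free_mem_mb: int


def pick_node(nodes: List[HostNode], vcpus: int, mem_mb: int,
              mem_page_size: Optional[str] = None) -> Optional[int]:
    """Return the first node that fits, walking nodes in list order."""
    for node in nodes:
        # The CPU count is always checked.
        if node.cpus < vcpus:
            continue
        # Memory is only checked when the flavor asked for per-node
        # tracking by setting hw:mem_page_size.
        if mem_page_size is not None and node.free_mem_mb < mem_mb:
            continue
        return node.node_id
    return None


# Node 0 is nearly out of memory, node 1 has plenty.
nodes = [HostNode(0, cpus=8, free_mem_mb=1024),
         HostNode(1, cpus=8, free_mem_mb=65536)]

without_tracking = pick_node(nodes, vcpus=4, mem_mb=8192)
with_tracking = pick_node(nodes, vcpus=4, mem_mb=8192, mem_page_size="small")
print(without_tracking, with_tracking)  # 0 1
```

Without the page size set, the VM still lands on the exhausted node 0, which matches the OOM symptom in this thread.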
Good to know. So it sounds like by setting the hw:mem_page_size parameter (probably best to choose "small" as a general default), NUMA node 0 will fill up, and then NUMA node 1 will be considered. In other words, VMs will NOT be provisioned in a "round-robin" fashion between NUMA nodes. Do I understand that correctly?
You do not need to use hugepages, but you do need to enable per-NUMA-node memory tracking with hw:mem_page_size=small (use non-hugepage, typically 4k, pages) or hw:mem_page_size=any, which is basically the same as small but the image can request hugepages if it wants to. If you set small in the flavor but large in the image, that is an error. If you set any in the flavor, the image can set any value it likes, such as small, large or an explicit page size, and the scheduler will honour that.
If you know you want the flavor to use small pages then you should just set small explicitly.
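The flavor/image rules above can be written down as a small lookup. The function name effective_page_size is made up for illustration, and the sketch only encodes the combinations spelled out in this thread; anything else raises NotImplementedError rather than guessing at what nova does:

```python
from typing import Optional


def effective_page_size(flavor_value: str,
                        image_value: Optional[str] = None) -> str:
    """Hypothetical helper: combine flavor and image page size requests."""
    if image_value is None:
        # With no image request, "any" behaves like "small".
        return "small" if flavor_value == "any" else flavor_value
    if flavor_value == "any":
        # The image may pick small, large or an explicit size.
        return image_value
    if flavor_value == "small" and image_value == "large":
        raise ValueError("flavor requests small pages but image requests large")
    # Combinations not covered in this thread are deliberately left open.
    raise NotImplementedError("combination not described in this thread")


print(effective_page_size("any"))           # small
print(effective_page_size("any", "large"))  # large
print(effective_page_size("small"))         # small
```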
Also good to know. Thanks again! Eric