[nova][numa]Question regarding numa affinity balancer/weigher per host

Sean Mooney smooney at redhat.com
Thu Feb 21 15:41:24 UTC 2019


On Thu, 2019-02-21 at 10:21 -0500, Li Liu wrote:
> Hi Nova folks,
> 
> I am trying to find out how NUMA balancing/weighing per host is taken care of in Nova.
> 
> I know how weighers work in general, but it's weighing between hosts. I am not so clear when it comes to a single host
> with multiple sockets, how does nova weigh them?
the short answer is it doesn't.

we tend to pack numa node 1 before we move on to numa node 2.
the actual assignment of numa resources to the vm is done by the resource tracker on the compute node.
stephen on cc has done some work around avoiding hosts with pci devices when they are not requested,
and i'm not sure if he extended that to numa nodes as well.

the specific code in the libvirt driver and the hardware.py file is rather complicated and hard to extend,
so while this has come up in the past we have not really put a lot of effort into this topic.

it is not clear that balancing the vm placement is always correct, and this is not something an end user should
be able to influence. that means adding a flavor extra spec. this makes modeling numa in placement more complicated,
which is another reason we have not really spent much time on this lately.
one example where you don't want to balance blindly is when you have sriov devices, gpus or fpgas on the host.
in this case, assuming you have not already separated the hosts into a dedicated host aggregate for hosts with special devices,
you would want to avoid balancing so that instances that don't request an fpga are first tried on numa nodes
without an fpga before they are assigned to a numa node with an fpga.
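
to make that ordering concrete, here is a rough python sketch of the preference i am describing. this is not nova code and not what nova does today; the function name and the 'id'/'has_fpga' keys are made up for illustration only.

```python
def order_numa_nodes(nodes, wants_fpga):
    """Return NUMA nodes in the order they should be tried.

    nodes: list of dicts with hypothetical keys 'id' and 'has_fpga'.
    wants_fpga: True if the instance requested an fpga.
    """
    if wants_fpga:
        # only nodes that actually have the device are useful
        return [n for n in nodes if n["has_fpga"]]
    # sorted() is stable and False sorts before True, so nodes without
    # an fpga come first and host order is kept inside each group
    return sorted(nodes, key=lambda n: n["has_fpga"])


# hypothetical two-node host where only node 0 has an fpga
host_nodes = [
    {"id": 0, "has_fpga": True},
    {"id": 1, "has_fpga": False},
]
print([n["id"] for n in order_numa_nodes(host_nodes, wants_fpga=False)])
```

with plain packing, node 0 would be filled first and the fpga-local cores would be used up by instances that never asked for the device.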
> 
> For instance, a host has 4 sockets, A, B, C, D. When a scheduling request comes in asking for 4 cores on 2 sockets (2
> cores per socket), the scheduler realizes that the A+B, A+C, and C+D combinations can all fit the request. In this case, how
> does nova make the decision on which combination to choose?
if i remember the code correctly it will take resources from the first two numa nodes that can fit the vm.
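
as a minimal sketch of that "first fit" behaviour (from memory, and simplified; this is not the actual hardware.py logic, and the function name and free-core numbers are assumptions for illustration):

```python
from itertools import combinations


def first_fit(free_cores, nodes_wanted, cores_per_node):
    """Pick the first group of NUMA nodes, in host order, that fits.

    free_cores: dict mapping a node name to its free core count
                (dicts keep insertion order, used here as host order).
    Returns a tuple of node names, or None if nothing fits.
    """
    # keep only nodes that can hold one guest numa node's worth of cores
    candidates = [n for n, free in free_cores.items() if free >= cores_per_node]
    # combinations() yields groups in input order, so the first one wins;
    # no weighing or balancing between the valid groups is done
    return next(combinations(candidates, nodes_wanted), None)


# hypothetical 4 socket host; B does not have 2 free cores
host = {"A": 2, "B": 1, "C": 2, "D": 2}
print(first_fit(host, nodes_wanted=2, cores_per_node=2))
```

here A+C, A+D and C+D would all fit, and the sketch simply returns A+C because it is found first.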
> 



