[nova][numa] Question regarding NUMA affinity balancer/weigher per host
Hi Nova folks,

I am trying to find out how NUMA balancing/weighing per host is taken care of in Nova. I know how weighers work in general, but they weigh between hosts. I am not so clear on what happens within a single host with multiple sockets: how does Nova weigh them?

For instance, a host has 4 sockets: A, B, C, D. When a scheduling request comes in asking for 4 cores on 2 sockets (2 cores per socket), the scheduler realizes that the A+B, A+C, and C+D combinations can all fit the request. In this case, how does Nova decide which combination to choose?

--
Thank you
Regards
Li
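For context, a request like the one above is normally expressed through flavor extra specs. A minimal sketch (the flavor name numa.test is made up):

    # Create a 4-vCPU flavor and ask Nova to split it across 2 guest NUMA nodes
    openstack flavor create numa.test --vcpus 4 --ram 4096 --disk 10
    openstack flavor set numa.test --property hw:numa_nodes=2

    # Optionally pin guest vCPUs and memory to specific guest NUMA nodes
    openstack flavor set numa.test \
        --property hw:numa_cpus.0=0,1 \
        --property hw:numa_cpus.1=2,3 \
        --property hw:numa_mem.0=2048 \
        --property hw:numa_mem.1=2048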
On Thu, 2019-02-21 at 10:21 -0500, Li Liu wrote:
> Hi Nova folks,
> I am trying to find out how NUMA balancing/weighing per host is taken care of in Nova.
> I know how weighers work in general, but they weigh between hosts. I am not so clear on what happens within a single host with multiple sockets: how does Nova weigh them?

The short answer is it doesn't. We tend to pack NUMA node 1 before we move on to NUMA node 2. The actual assignment of NUMA resources to the VM is done by the resource tracker on the compute node. Stephen, on CC, has done some work around avoiding hosts with PCI devices when they are not requested; I'm not sure if he extended that to NUMA nodes as well.

The specific code in the libvirt driver and the hardware.py file is rather complicated and hard to extend, so while this has come up in the past, we have not really put a lot of effort into this topic.

It is not clear that balancing the VM placement is always correct, and it is not something an end user should be able to influence, since that would mean adding a flavor extra spec. It also makes modelling NUMA in Placement more complicated, which is another reason we have not spent much time on this lately.

One example where you don't want to balance blindly is when you have SR-IOV devices, GPUs, or FPGAs on the host. In that case, assuming you have not already separated the hosts with special devices into a dedicated host aggregate, you would want to avoid balancing, so that instances that don't request an FPGA are first placed on NUMA nodes without an FPGA before they are assigned to a NUMA node that has one.
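As a rough illustration of the dedicated-aggregate approach mentioned above (the host name compute-fpga-01, the flavor fpga.small, and the fpga property are all made up):

    # Group the FPGA hosts into their own aggregate and tag it
    openstack aggregate create fpga-hosts
    openstack aggregate set --property fpga=true fpga-hosts
    openstack aggregate add host fpga-hosts compute-fpga-01

    # With AggregateInstanceExtraSpecsFilter enabled, a flavor carrying
    # a matching property is scheduled only to hosts in aggregates
    # whose metadata matches:
    openstack flavor set fpga.small \
        --property aggregate_instance_extra_specs:fpga=true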
> For instance, a host has 4 sockets: A, B, C, D. When a scheduling request comes in asking for 4 cores on 2 sockets (2 cores per socket), the scheduler realizes that the A+B, A+C, and C+D combinations can all fit the request. In this case, how does Nova decide which combination to choose?
If I remember the code correctly, it will take resources from the first two NUMA nodes that can fit the VM.
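A loose Python sketch of that first-fit behaviour (an illustration only, not the actual nova/virt/hardware.py logic; the free-core numbers are invented):

    from itertools import combinations

    def first_fit(free_cores, cores_per_node, nodes_wanted):
        # free_cores maps NUMA node id -> number of free pCPUs.
        # Node combinations are tried in order; the first one where
        # every node can supply the requested cores wins outright,
        # with no weighing between the candidates that also fit.
        for combo in combinations(sorted(free_cores), nodes_wanted):
            if all(free_cores[node] >= cores_per_node for node in combo):
                return combo
        return None

    free = {"A": 2, "B": 2, "C": 8, "D": 8}
    print(first_fit(free, cores_per_node=2, nodes_wanted=2))  # -> ('A', 'B')

Even though C and D have far more headroom, A+B is returned simply because it is tried first.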
Thanks a lot, Sean, for the clarification :P

Regards,
Li Liu