[nova][scheduler] nova scheduler to balance vms more randomly
Folks,

I have noticed that when I spin up a bunch of VMs, they all go to the same hypervisor until it runs out of resources. I understand the scheduler is trying to fill one hypervisor before it picks the next one, but that is dangerous behavior. For example, one customer created 3 VMs to build a MySQL Galera cluster and all 3 nodes ended up on the same hypervisor (this is dangerous). I would like OpenStack to pick hypervisors more randomly when resources are available in the pool, instead of trying to fill one hypervisor first. How do I change that behavior? (There is a feature called affinity, but again, I would like Nova to place VMs more randomly instead of shoving them all onto a single node.)

I am not an expert in the scheduler logic, so please educate me if I am missing something here.
Hello!

An end user creating VMs which are expected to be kept apart should use Nova's server groups with [soft-]anti-affinity to hint to the scheduler that those VMs are to be kept apart. Soft anti-affinity will try to keep them apart as best it can but will allow doubling up on hosts, while plain anti-affinity will actually fail the create if the VM cannot be kept away from other members of the server group.

There are, AFAIK, some corner cases involving how an operator might do live migration that could break this, but it should generally work.

--
David Zanetti
Chief Technology Officer, Catalyst Cloud
e david.zanetti@catalystcloud.nz | w catalystcloud.nz
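A minimal sketch of what that looks like from the CLI (the group, flavor and image names are placeholders, and soft-anti-affinity needs compute API microversion 2.15 or newer):

```
# Create a server group whose members should be spread across hosts.
openstack --os-compute-api-version 2.15 server group create \
    --policy soft-anti-affinity galera-group

# Boot each Galera node into that group; replace <group-uuid> with the UUID
# returned by the command above.
openstack server create --flavor m1.medium --image ubuntu-22.04 \
    --hint group=<group-uuid> galera-node-1
```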
If I am not mistaken, the scheduler debug logs tell you which hosts were evaluated and which one won. That can be a good starting point to see what's actually happening.

Also, I would recommend checking whether the resource tracker is working properly and supplying correct data, as it is vital for scheduling. If it is not, one scenario would be that the very same host is reported as having 0 VMs, which is why your VMs keep being spun up on the same host.
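If you want to sanity-check what the scheduler and placement are seeing, something along these lines can help (the resource-provider commands assume the osc-placement plugin is installed; host names and UUIDs are placeholders):

```
# What Nova thinks each hypervisor is running and has free.
openstack hypervisor list --long
openstack hypervisor show <hypervisor-hostname>

# What placement has recorded for a given compute node.
openstack resource provider list --name <hypervisor-hostname>
openstack resource provider usage show <resource-provider-uuid>
```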
I'm just a little worried about turning on debug in production. We have 300 compute nodes and I hope debug doesn't blow things up. I wish there were a dry-run mode in Nova that would just tell you where it is going to place a VM.

Currently I'm running a hack script to check which customers' VMs are crowded onto a single hypervisor and migrating VMs to keep them in good balance. I hate doing it. I wish there were a simple flag or something to place VMs on the least-populated compute node.
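For the manual rebalancing described above, the usual building blocks are roughly the following (host and server identifiers are placeholders, and the live-migration syntax differs a little between client releases):

```
# See which instances are packed onto one hypervisor.
openstack server list --all-projects --host <hypervisor-hostname>

# Let the scheduler pick a new host for one of them.
openstack server migrate --live-migration <server-uuid>
```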
On Fri, 2024-05-24 at 12:05 -0400, Satish Patel wrote:
I'm just a little worried about turning on debug in production. We have 300 compute nodes and I hope debug doesn't blow things up.
I was avoiding replying as I have explained this to you previously. Two quick comments.

First, debug mode is intended to be used in production. It creates a lot of logs, but several large deployments permanently run with debug mode enabled. That, however, is not the solution to your problem.

One of the replies mentioned https://docs.openstack.org/nova/latest/configuration/config.html#filter_sche... If you want true randomness, that is not a bad approach. You can also have placement randomise the allocation candidates: https://docs.openstack.org/placement/latest/configuration/config.html#placem...

However, what you are describing is a desire for the scheduler to spread instead of pack. The default for spread vs pack depends on the specific weigher: the RAM, CPU and disk weighers all spread by default, while some of the other weighers pack by default. In all cases you can use the [filter_scheduler] *_weight_multiplier config options to influence this behavior. The RAM, disk and CPU weighers are usually the most effective:

```
ram_weight_multiplier
    Type: floating point
    Default: 1.0
    RAM weight multiplier ratio. This option determines how hosts with more
    or less available RAM are weighed. A positive value will result in the
    scheduler preferring hosts with more available RAM, and a negative number
    will result in the scheduler preferring hosts with less available RAM.
    Another way to look at it is that positive values for this option will
    tend to spread instances across many hosts, while negative values will
    tend to fill up (stack) hosts as much as possible before scheduling to a
    less-used host. The absolute value, whether positive or negative, controls
    how strong the RAM weigher is relative to other weighers.
    Note that this setting only affects scheduling if the RAMWeigher weigher
    is enabled.
    Possible values: An integer or float value, where the value corresponds
    to the multiplier ratio for this weigher.
    Related options: [filter_scheduler] weight_classes

cpu_weight_multiplier
    Type: floating point
    Default: 1.0
    CPU weight multiplier ratio. Multiplier used for weighting free vCPUs.
    Negative numbers indicate stacking rather than spreading.
    Note that this setting only affects scheduling if the CPUWeigher weigher
    is enabled.
    Possible values: An integer or float value, where the value corresponds
    to the multiplier ratio for this weigher.
    Related options: [filter_scheduler] weight_classes

disk_weight_multiplier
    Type: floating point
    Default: 1.0
    Disk weight multiplier ratio. Multiplier used for weighing free disk
    space. Negative numbers mean to stack vs spread.
    Note that this setting only affects scheduling if the DiskWeigher weigher
    is enabled.
    Possible values: An integer or float value, where the value corresponds
    to the multiplier ratio for this weigher.
```

However, you might want to activate the num_instances_weight_multiplier: https://docs.openstack.org/nova/latest/configuration/config.html#filter_sche... Negative numbers indicate preferring hosts with fewer instances (i.e. choosing to spread instances):

[filter_scheduler]
num_instances_weight_multiplier=-10

In the unlikely event that several of your hosts have the same weight, you can also enable https://docs.openstack.org/nova/latest/configuration/config.html#filter_sche... which only introduces randomness in that situation. This is most common when you initially have a large number of empty hosts, and it becomes less useful as the cloud fills. Unless you have a very uniform distribution of VMs, it is unlikely that two or more hosts will have the same weight.

It is effectively always safe to turn on shuffle_best_same_weighed_hosts.
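Pulling those suggestions together, a sketch of what this might look like on the scheduler hosts (the values are illustrative, not recommendations; the placement option lives in placement.conf rather than nova.conf):

```
# nova.conf on the scheduler hosts
[filter_scheduler]
# Prefer hosts with fewer instances; negative values mean spread.
num_instances_weight_multiplier = -10
# The RAM/CPU/disk weighers already spread at their default of 1.0;
# larger positive values spread more aggressively.
ram_weight_multiplier = 2.0
cpu_weight_multiplier = 2.0
disk_weight_multiplier = 1.0
# Break ties randomly when several hosts end up with the same weight.
shuffle_best_same_weighed_hosts = true

# placement.conf, if you also want placement to randomise allocation candidates
[placement]
randomize_allocation_candidates = true
```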
Hi,

We faced the same issue until we found this nova-scheduler option. Hope this will help your case:

host_subset_size
    Type: integer
    Default: 1
    Minimum Value: 1
    Size of the subset of best hosts selected by the scheduler. New instances
    will be scheduled on a host chosen randomly from a subset of the N best
    hosts, where N is the value set by this option. Setting this to a value
    greater than 1 will reduce the chance that multiple scheduler processes
    handling similar requests will select the same host, creating a potential
    race condition. By selecting a host randomly from the N hosts that best
    fit the request, the chance of a conflict is reduced. However, the higher
    you set this value, the less optimal the chosen host may be for a given
    request.

Thanks & Best Regards,
Sang Tran Quoc (Mr.)
FPT Smart Cloud
e sangtq8@fpt.com | w https://fptcloud.com
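For reference, enabling this is a one-line scheduler change (the value 5 is only an example; tune it to the size of your cloud):

```
[filter_scheduler]
# Pick randomly among the 5 best-weighed hosts instead of always the single best.
host_subset_size = 5
```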
participants (5)
- Can Özyurt
- David Zanetti
- Sang Tran Quoc
- Satish Patel
- smooney@redhat.com