[nova-scheduler] always the same compute was selected?
Hi, I have a Ussuri setup with 5 compute nodes (same resource spec). When I launch 5 Cirros VMs in a batch, they are all launched on the same compute node. All 5 nodes have sufficient resources. AFAIK, the Nova scheduler picks a compute node randomly after filtering, but 5 VMs all landing on the same node seems like too much of a coincidence. Could something be missing? Is there any control over this random pick? Is there a way to get better balancing?
From the log below, could anyone tell me how compute-4 was selected? I tried a few more launches, and compute-4 was selected every time.
=========================================
2020-10-27 13:56:45.484 24 DEBUG nova.scheduler.filter_scheduler [req-e2c6bf4b-7512-4ff0-9899-2a24bffa9131 113ee63a9ed0466794e24d069efc302c 32be191b3ec9497eaec108a22220890b - default default] Filtered [(compute-4, compute-4) ram: 131698MB disk: 14251008MB io_ops: 0 instances: 17, (compute-5, compute-5) ram: 232050MB disk: 14251008MB io_ops: 0 instances: 2, (compute-2, compute-2) ram: 367785MB disk: 14252032MB io_ops: 0 instances: 5, (compute-1, compute-1) ram: 298665MB disk: 14251008MB io_ops: 0 instances: 9, (compute-3, compute-3) ram: 259753MB disk: 14251008MB io_ops: 0 instances: 19] _get_sorted_hosts /usr/lib/python3.6/site-packages/nova/scheduler/filter_scheduler.py:443
2020-10-27 13:56:45.485 24 DEBUG nova.scheduler.filter_scheduler [req-e2c6bf4b-7512-4ff0-9899-2a24bffa9131 113ee63a9ed0466794e24d069efc302c 32be191b3ec9497eaec108a22220890b - default default] Weighed [WeighedHost [host: (compute-4, compute-4) ram: 131698MB disk: 14251008MB io_ops: 0 instances: 17, weight: 2.315526126088031], WeighedHost [host: (compute-2, compute-2) ram: 367785MB disk: 14252032MB io_ops: 0 instances: 5, weight: -999997.0], WeighedHost [host: (compute-1, compute-1) ram: 298665MB disk: 14251008MB io_ops: 0 instances: 9, weight: -999997.2171186721], WeighedHost [host: (compute-3, compute-3) ram: 259753MB disk: 14251008MB io_ops: 0 instances: 19, weight: -999997.3362949106], WeighedHost [host: (compute-5, compute-5) ram: 232050MB disk: 14251008MB io_ops: 0 instances: 2, weight: -999997.371492924]] _get_sorted_hosts /usr/lib/python3.6/site-packages/nova/scheduler/filter_scheduler.py:462
2020-10-27 13:56:45.486 24 DEBUG nova.scheduler.utils [req-e2c6bf4b-7512-4ff0-9899-2a24bffa9131 113ee63a9ed0466794e24d069efc302c 32be191b3ec9497eaec108a22220890b - default default] Attempting to claim resources in the placement API for instance ac10e5df-7e55-4c88-a682-df25cb6dd535 claim_resources /usr/lib/python3.6/site-packages/nova/scheduler/utils.py:1175
2020-10-27 13:56:45.657 24 DEBUG nova.scheduler.filter_scheduler [req-e2c6bf4b-7512-4ff0-9899-2a24bffa9131 113ee63a9ed0466794e24d069efc302c 32be191b3ec9497eaec108a22220890b - default default] [instance: ac10e5df-7e55-4c88-a682-df25cb6dd535] Selected host: (compute-4, compute-4) ram: 131698MB disk: 14251008MB io_ops: 0 instances: 17 _consume_selected_host /usr/lib/python3.6/site-packages/nova/scheduler/filter_scheduler.py:354
=========================================
Thanks! Tony
On 10/27/20 14:13, Tony Liu wrote:
Hi,
I have a Ussuri setup with 5 compute nodes (same resource spec). When I launch 5 Cirros VMs in a batch, they are all launched on the same compute node. All 5 nodes have sufficient resources.
AFAIK, the Nova scheduler picks a compute node randomly after filtering, but 5 VMs all landing on the same node seems like too much of a coincidence. Could something be missing? Is there any control over this random pick? Is there a way to get better balancing?
The Nova scheduler weighs all of the candidate hosts using the configured weighers and then, by default, picks randomly from a set of only 1 [1]. This results in a "packing" behavior if the weighers are not configured in a way that "spreads" instances (example: cpu/ram/disk weighers).
To make the scheduler pick randomly from a larger subset of candidate compute hosts, increase host_subset_size [1].
To make the scheduler randomly shuffle candidate hosts when their weights are the same, set shuffle_best_same_weighed_hosts to True [2].
If you have > 1000 compute hosts (via multi-cell) and need to randomize the default 1000 results that come from placement [3], set randomize_allocation_candidates to True [4].
You will likely only be interested in the first two options I described. Tune host_subset_size until you get the desired scheduling behavior.
Hope this helps, -melanie
[1] https://docs.openstack.org/nova/ussuri/configuration/config.html#filter_scheduler.host_subset_size
[2] https://docs.openstack.org/nova/ussuri/configuration/config.html#filter_scheduler.shuffle_best_same_weighed_hosts
[3] https://docs.openstack.org/nova/ussuri/configuration/config.html#scheduler.max_placement_results
[4] https://docs.openstack.org/placement/ussuri/configuration/config.html#placement.randomize_allocation_candidates
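To illustrate why a subset size of 1 keeps choosing compute-4, here is a minimal Python sketch of "sort by weight, then pick randomly from the top N". It is not Nova's code: the function names pick_host and simulate are made up for illustration, the weights are copied from the "Weighed" debug line in the original post, and the sketch assumes the weights stay fixed between launches (the real scheduler re-evaluates hosts as resources are consumed).

```python
import random
from collections import Counter

# Weights copied from the "Weighed" debug line in the original post.
WEIGHED_HOSTS = {
    "compute-4": 2.315526126088031,
    "compute-2": -999997.0,
    "compute-1": -999997.2171186721,
    "compute-3": -999997.3362949106,
    "compute-5": -999997.371492924,
}


def pick_host(weighed, host_subset_size=1, rng=random):
    """Pick one host at random from the top `host_subset_size` hosts by weight.

    Hypothetical helper: a simplified model of the behavior described above,
    not the actual Nova implementation.
    """
    ranked = sorted(weighed, key=weighed.get, reverse=True)
    subset = ranked[:max(1, host_subset_size)]
    return rng.choice(subset)


def simulate(weighed, host_subset_size, launches=5, seed=0):
    """Count which host is picked over several launches (weights held fixed)."""
    rng = random.Random(seed)
    return Counter(
        pick_host(weighed, host_subset_size, rng) for _ in range(launches)
    )


if __name__ == "__main__":
    # With a subset of 1 the best-weighed host wins every time.
    print(simulate(WEIGHED_HOSTS, host_subset_size=1))
    # With a subset of 5 every filtered host is a candidate.
    print(simulate(WEIGHED_HOSTS, host_subset_size=5, launches=200))
```

With host_subset_size=1 the subset only ever contains the top-weighed host, so the "random" pick is deterministic. Raising it widens the pool at the cost of sometimes choosing a host the weighers ranked lower.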
Thanks Melanie! It helps. VMs are spread very well. Tony
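For readers who want to try the same change, the following Python sketch renders the kind of [filter_scheduler] override Melanie describes. The thread does not say which values Tony ended up using, so host_subset_size = 3 is only an illustrative assumption, and scheduler_overrides is a hypothetical helper, not part of Nova. The section and option names come from the documentation links in her reply.

```python
import configparser
import io


def scheduler_overrides(host_subset_size=3, shuffle=True):
    """Render an INI fragment with the two scheduler options from the thread.

    The default values here are illustrative assumptions, not recommendations.
    """
    cfg = configparser.ConfigParser()
    cfg["filter_scheduler"] = {
        "host_subset_size": str(host_subset_size),
        "shuffle_best_same_weighed_hosts": "true" if shuffle else "false",
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Print the fragment so it can be merged into the scheduler's nova.conf.
    print(scheduler_overrides())
```

The fragment would be merged into the configuration read by the nova-scheduler service. As suggested in the reply, host_subset_size is the value to tune until the placement of instances looks the way you want.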
participants (2)
- melanie witt
- Tony Liu