[nova][scheduler] - Stack VMs based on RAM

melanie witt melwittt at gmail.com
Fri Apr 19 21:38:26 UTC 2019


On Sat, 20 Apr 2019 00:17:38 +0300, Georgios Dimitrakakis 
<giorgis at acmac.uoc.gr> wrote:
>> On Fri, 19 Apr 2019 22:47:23 +0300, Georgios Dimitrakakis
>> <giorgis at acmac.uoc.gr> wrote:
>>> Hello again and apologies for my absence!
>>> First of all let me express my gratitude for your valuable feedback.
>>> I understand what all of you are saying and I will try what Sean
>>> suggested by setting the multipliers for CPU and disk to something
>>> beyond the default.
>>> Meanwhile, what I thought of was to change the "weight_classes"
>>> parameter to use only the "RAM" weigher instead of all of them, since
>>> initially that is what I would like scheduling to be based on, and
>>> then move on from there.
>>> Unfortunately I haven't found a way to set it properly, and no matter
>>> what I tried I always ended up with an error in nova-scheduler.log
>>> saying "ERROR nova ValueError: Empty module name".
>>> Any ideas on how to set it based on the available weighers? It seems
>>> that it needs a list format, but how?
>>
>> Did you try:
>>
>> weight_classes = ['nova.scheduler.weights.ram.RAMWeigher']
>>
>> That's what should work, based on the code I see here:
>>
>>
>> https://github.com/openstack/nova/blob/stable/rocky/nova/loadables.py#L98
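>>
>> i.e. the full stanza in nova.conf would look like this (a sketch; the
>> option lives in the [filter_scheduler] section mentioned further down
>> the thread):
>>
>> [filter_scheduler]
>> weight_classes = ['nova.scheduler.weights.ram.RAMWeigher']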
>>
>> -melanie
> 
>   I'll try that ASAP and get back to you. I believe I had failed to
>   use capital letters properly.
> 
>   Meanwhile, what I've done is use Sean's suggestions for the RAM, CPU
>   and disk multipliers, like below:
> 
>   ram_weight_multiplier=-1.0
>   cpu_weight_multiplier=-1.0
>   disk_weight_multiplier=0.001
> 
>   I have managed to stack VMs on one node.
>   VM stacking with the above works correctly as long as no IOPS are
>   involved; when they are, the weights get skewed and the VMs spread
>   out. But I assume that can be dealt with by playing with the
>   "io_ops_weight_multiplier" value, as sketched below.
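> 
>   For example (an untested guess on my part; since each host's final
>   weight is a sum of multiplier * weigher value, a 0.0 multiplier
>   should take the IOPS weigher out of the sum entirely):
> 
>   [filter_scheduler]
>   io_ops_weight_multiplier=0.0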
> 
>   So far so good, if it weren't for the following huge problem: the
>   stacking never stops, even when resources are exceeded.
> 
>   To explain what I mean, let me give you an overview of my hardware.
>   My compute node has 1x Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8
>   cores / 16 threads and 32GB of RAM.
> 
>   Furthermore, in nova.conf I have set:
> 
>   cpu_allocation_ratio=1.0
>   ram_allocation_ratio=1.0
> 
>   With the above settings I have been able to stack 20 VMs with 2
>   vCPUs and 2GB of RAM each on this node, i.e. 40 vCPUs and 40GB of
>   RAM committed against 16 threads and 32GB physical. See the
>   following excerpt (when spawning the 20th VM) from the scheduler log:
> 
> 
>   2019-04-19 23:26:42.322 379112 DEBUG nova.scheduler.filter_scheduler
>   [req-73ddffc1-8cd2-4060-bd4c-786ba00793d1
>   6a4c2e32919e4a6fa5c5d956beb68eef 9f22e9bfa7974e14871d58bbb62242b2 -
>   default default] Weighed [WeighedHost [host: (cpu2, cpu2) ram: -6759MB
>   disk: 1536000MB io_ops: 0 instances: 19, weight: 0.000805585392052],
>   WeighedHost [host: (cpu1, cpu1) ram: 32153MB disk: 1906688MB io_ops: 0
>   instances: 0, weight: -1.999]] _get_sorted_hosts
>   /usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py:454
> 
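>   (The math here checks out: cpu2's "ram: -6759MB" is exactly 32153MB
>   free initially -- the same value cpu1 still shows -- minus 19 x
>   2048MB, i.e. the node is already ~6.6GB past its physical RAM, yet
>   it still wins the weighing.)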
> 
>   To me it is very clear that the aforementioned ratio settings are
>   being ignored. I haven't tried to find what the limit would be (the
>   final number of VMs that can be spawned).
> 
> 
>   Additionally, I tried this again, but this time I set the following:
> 
>   enabled_filters=RetryFilter,AvailabilityZoneFilter,CoreFilter,RamFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter
> 
> 
>   This time only 15 VMs went to the one compute node, because on the
>   16th attempt the RamFilter returned just one host (see log below):
> 
>   2019-04-19 23:59:34.170 383851 DEBUG nova.filters
>   [req-9cdd3c15-5bd8-4e06-bf0b-eb2d4d26d66d
>   6a4c2e32919e4a6fa5c5d956beb68eef 9f22e9bfa7974e14871d58bbb62242b2 -
>   default default] Filter RamFilter returned 1 host(s)
>   get_filtered_objects
>   /usr/lib/python2.7/site-packages/nova/filters.py:104
> 
> 
>   Melanie told me earlier in this thread that setting "enabled_filters"
>   shouldn't make any difference, but it is obvious from the above
>   behavior that this is not the case.
> 
>   In any case even this is not correct behavior, since 15 VMs x 2
>   vCPUs/VM = 30 vCPUs, which is almost double the available 16
>   threads; don't forget that I have a 1:1 ratio for both CPU and RAM,
>   so the CoreFilter should have cut placement off at 8 such VMs.
>   Presumably only the RAM limit was respected (15 x 2GB = 30GB still
>   fits in 32GB, while a 16th VM would not).
> 
>   Does it seem logical to you?

I agree, it sounds like your overcommit settings are not being honored. 
Placement will filter hosts taking overcommit into account, and if it 
doesn't have the values you set, then it will return hosts that the 
RamFilter would have rejected, which explains the difference in 
behavior you saw.

To debug this further, you need to start looking at what values 
placement has for the allocation ratios. You can do this by installing 
the osc-placement plugin [1] for the openstackclient. Once installed, 
you can do 'openstack resource provider inventory list <compute host 
UUID>' to see what allocation ratios are being used on a given compute 
host. If they're not 1.0, that would explain why placement is returning 
too many hosts. If that's the case, the next step would be trying to 
figure out why the values you're setting in nova.conf on the compute 
host (I assume you restarted nova-compute after setting them) are not 
being transferred to placement like they should be.
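
For example, something like this (a sketch; the first command just 
finds the provider UUID, and the exact output columns may vary by 
release):

   $ openstack resource provider list
   $ openstack resource provider inventory list <compute host UUID>

The inventory listing should show one row per resource class (VCPU, 
MEMORY_MB, DISK_GB), each with an allocation_ratio column -- that is 
the value to check against your nova.conf settings.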

-melanie

[1] https://docs.openstack.org/osc-placement/rocky/cli
>>
>>>> On Wed, 2019-04-17 at 18:24 -0700, melanie witt wrote:
>>>>> On Thu, 18 Apr 2019 01:38:17 +0100, Sean Mooney
>>>>> <smooney at redhat.com>
>>>>> wrote:
>>>>>> On Wed, 2019-04-17 at 15:30 -0700, melanie witt wrote:
>>>>>>> On Thu, 18 Apr 2019 01:13:42 +0300, Georgios Dimitrakakis
>>>>>>> <giorgis at acmac.uoc.gr> wrote:
>>>>>>>> Shouldn’t that be the correct behavior and place the new VM on
>>>>>>>> the host with the smaller weight? Isn’t that what the negative
>>>>>>>> value for “ram_weight_multiplier” does?
>>>>>>>
>>>>>>> No, it's the opposite, higher weights win. That's why you have
>>>>>>> to use a negative value for ram_weight_multiplier if you want
>>>>>>> hosts with _less_ RAM to win over hosts with more RAM (stacking).
>>>>>>>
>>>>>>>> Please let me know how I can provide more debug info....
>>>>>>>
>>>>>>> One thing I noticed from your log is that on the second request,
>>>>>>> 'cpu1' has io_ops: 0 whereas 'cpu2' has io_ops: 1, and the
>>>>>>> IoOpsWeigher [1] will prefer hosts with fewer io_ops by default.
>>>>>>> Note that's only one piece of the final weight -- the weighing
>>>>>>> process takes a sum of all of the weights each weigher returns.
>>>>>>> So the weight returned from RAMWeigher is added to the weight
>>>>>>> returned from IoOpsWeigher, which is added to the weight returned
>>>>>>> from CPUWeigher, and so on.
>>>>>>>
>>>>>>> So, as Matt said, we're a bit in the dark now as far as what
>>>>>>> each weigher is returning, and we don't currently have debug
>>>>>>> logging per weigher the way we do for filters. That would be an
>>>>>>> enhancement we could make to aid in debugging issues like this
>>>>>>> one. You could hack something up locally to log/print the
>>>>>>> returned weight in each weight class under the
>>>>>>> nova/scheduler/weights/ directory, if you want to dig into that.
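>>>>>>>
>>>>>>> For example, a rough, untested sketch of that hack in
>>>>>>> nova/scheduler/weights/ram.py (the oslo_log setup is only needed
>>>>>>> if the module doesn't already have it):
>>>>>>>
>>>>>>>     from oslo_log import log as logging
>>>>>>>
>>>>>>>     LOG = logging.getLogger(__name__)
>>>>>>>
>>>>>>>     class RAMWeigher(weights.BaseHostWeigher):
>>>>>>>         def _weigh_object(self, host_state, weight_properties):
>>>>>>>             # raw value, before the multiplier and normalization
>>>>>>>             # are applied by the weight handler
>>>>>>>             weight = host_state.free_ram_mb
>>>>>>>             LOG.debug("RAMWeigher: host %s raw weight %s",
>>>>>>>                       host_state.host, weight)
>>>>>>>             return weight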
>>>>>>>
>>>>>>> Another thing I noticed is that there are probably some new
>>>>>>> weighers available by default that did not exist in the previous
>>>>>>> version of nova that you were using in the past. By default, the
>>>>>>> config option for weighers:
>>>>>>>
>>>>>>> [filter_scheduler]weight_classes = ["nova.scheduler.weights.all_weighers"]
>>>>>>>
>>>>>>> will pick up all weigher classes in the nova/scheduler/weights/
>>>>>>> code directory. You might take a look at these and see if any are
>>>>>>> ones you should exclude in your environment. For example, the
>>>>>>> CPUWeigher [1] (new in Rocky) will spread VMs based on available
>>>>>>> CPU by default.
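>>>>>>>
>>>>>>> As an illustration of the multiplier route, something like this
>>>>>>> should neutralize the CPUWeigher without touching weight_classes,
>>>>>>> since a 0.0 multiplier zeroes that weigher's contribution to the
>>>>>>> sum (untested sketch):
>>>>>>>
>>>>>>> [filter_scheduler]
>>>>>>> cpu_weight_multiplier=0.0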
>>>>>>
>>>>>> most of the weighers spread by default, so the cpu weigher may be
>>>>>> a factor, but the disk weigher tends to impact the choice most
>>>>>> heavily.
>>>>>>
>>>>>> we do not normalise any of the values returned by the different
>>>>>> weighers; the disk weigher is basically host_state.free_disk_mb *
>>>>>> disk_weight_multiplier
>>>>>
>>>>> Hm, I thought we do, based on this code:
>>>>>
>>>>> https://github.com/openstack/nova/blob/stable/rocky/nova/weights.py#L135
>>>>>
>>>>> which normalizes the weight to a value between 0.0 and 1.0.
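>>>>>
>>>>> Roughly, that normalization is a min-max scaling across the hosts,
>>>>> something like this (paraphrased from memory, not the exact nova
>>>>> code):
>>>>>
>>>>>     def normalize(weights):
>>>>>         lo, hi = min(weights), max(weights)
>>>>>         if lo == hi:
>>>>>             # all hosts equal: this weigher contributes nothing
>>>>>             return [0.0] * len(weights)
>>>>>         return [(w - lo) / (hi - lo) for w in weights]
>>>>>
>>>>> Note that with only two hosts this always yields 1.0 for the
>>>>> winner and 0.0 for the loser of each individual weigher, before
>>>>> the multiplier is applied.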
>>>> yes we do, but that is within the same resource type.
>>>>
>>>> we do not normalise between resource types.
>>>> on a typical host you will have between 2GB and 8GB of ram per cpu
>>>> core, and you will have between 10G and 100G of local disk
>>>> typically.
>>>>
>>>> so we don't look at the capacity of each resource and renormalise
>>>> between them. if you want to achieve that you have to carefully
>>>> tweak the weights to do it manually.
>>>>
>>>> that said, i have not sat down and worked out the math to do that
>>>> in about 3-4 years, but i think i used to run my dev cluster with
>>>> the disk_weight_multiplier around 0.04, and i think i used to set
>>>> the ram_weight_multiplier to 2.0.
>>>>
>>>> anyway that is just my personal experience, and i was trying to
>>>> tweak the spreading behavior to spread based on ram, then cpus,
>>>> then disk, rather than pack based on ram but spread based on the
>>>> rest, so i don't know if this is relevant or correct for this
>>>> usecase.
>>>>
>>>>>
>>>>> If the multiplier is large though, that could make the considered
>>>>> value > 1.0 (as is the case with the default
>>>>> build_failure_weight_multiplier):
>>>>>
>>>>> https://github.com/openstack/nova/blob/stable/rocky/nova/conf/scheduler.py#L501
>>>>>
>>>>> -melanie
>>>>>
>>>>>> although host_state.free_disk_mb is actually disk_available_least.
>>>>>>
>>>>>> as a result the disk weigher will weigh cpu1 19456 higher than
>>>>>> cpu2 (1906688MB - 1887232MB = 19456MB more free disk), while the
>>>>>> delta between cpu1 and cpu2 based on the ram weigher is only 2048
>>>>>> (32153MB - 30105MB).
>>>>>>
>>>>>> if you want the ram weigher to take precedence over the disk
>>>>>> weigher you will need to scale the disk weigher down to be in a
>>>>>> similar value range.
>>>>>>
>>>>>> i would suggest setting disk_weight_multiplier=0.001
>>>>>>
>>>>>>> This weigher might be contributing to the VM spreading you're
>>>>>>> seeing. You might try playing with the
>>>>>>> '[filter_scheduler]weight_classes' config option to select only
>>>>>>> the weighers you want, or alternatively you could set the
>>>>>>> weighers' multipliers the way you prefer.
>>>>>>>
>>>>>>> -melanie
>>>>>>>
>>>>>>> [1]
>>>>>>> https://docs.openstack.org/nova/rocky/user/filter-scheduler.html#weights
>>>>>>>
>>>>>>>>>> On 4/17/2019 3:50 PM, Georgios Dimitrakakis wrote:
>>>>>>>>>> And here is the new log, where the spawning of 2 VMs a few
>>>>>>>>>> seconds apart can be seen:
>>>>>>>>>> https://pastebin.com/Xy2FL2KL
>>>>>>>>>> Initially both hosts have weight 1.0; then the one with one
>>>>>>>>>> VM already running gets a negative weight, but the new VM is
>>>>>>>>>> placed on the other host.
>>>>>>>>>> Really, really strange why this is happening...
>>>>>>>>>
>>>>>>>>> < 2019-04-17 23:26:18.770 157355 DEBUG
>>>>>>>>> nova.scheduler.filter_scheduler
>>>>>>>>> [req-14c666e4-3ff4-4d88-947e-377b3d37bff9
>>>>>>>>> 6a4c2e32919e4a6fa5c5d956beb68eef 9f22e9bfa7974e14871d58bbb62242b2
>>>>>>>>> - default default] Filtered [(cpu2, cpu2) ram: 30105MB disk:
>>>>>>>>> 1887232MB io_ops: 1 instances: 1, (cpu1, cpu1) ram: 32153MB
>>>>>>>>> disk: 1906688MB io_ops: 0 instances: 0] _get_sorted_hosts
>>>>>>>>> /usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py:435
>>>>>>>>>
>>>>>>>>> < 2019-04-17 23:26:18.771 157355 DEBUG
>>>>>>>>> nova.scheduler.filter_scheduler
>>>>>>>>> [req-14c666e4-3ff4-4d88-947e-377b3d37bff9
>>>>>>>>> 6a4c2e32919e4a6fa5c5d956beb68eef 9f22e9bfa7974e14871d58bbb62242b2
>>>>>>>>> - default default] Weighed [WeighedHost [host: (cpu1, cpu1)
>>>>>>>>> ram: 32153MB disk: 1906688MB io_ops: 0 instances: 0, weight:
>>>>>>>>> 1.0], WeighedHost [host: (cpu2, cpu2) ram: 30105MB disk:
>>>>>>>>> 1887232MB io_ops: 1 instances: 1, weight: -0.00900862553213]]
>>>>>>>>> _get_sorted_hosts
>>>>>>>>> /usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py:454
>>>>>>>>>
>>>>>>>>> cpu1 is definitely getting weighed higher but I'm not sure why.
>>>>>>>>> We likely need some debug logging on the result of each
>>>>>>>>> weigher, like we have for each filter, to figure out what's
>>>>>>>>> going on with the weighers.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Matt