[Openstack-operators] [openstack-dev] [nova] heads up to users of Aggregate[Core|Ram|Disk]Filter: behavior change in >= Ocata
mgagne at calavera.ca
Thu Jan 18 20:54:02 UTC 2018
On Tue, Jan 16, 2018 at 4:24 PM, melanie witt <melwittt at gmail.com> wrote:
> Hello Stackers,
> This is a heads up to any of you using the AggregateCoreFilter,
> AggregateRamFilter, and/or AggregateDiskFilter in the filter scheduler.
> These filters have effectively allowed operators to set overcommit ratios
> per aggregate rather than per compute node in <= Newton.
> Beginning in Ocata, there is a behavior change where aggregate-based
> overcommit ratios will no longer be honored during scheduling. Instead,
> overcommit values must be set on a per compute node basis in nova.conf.
> Details: as of Ocata, instead of considering all compute nodes at the start
> of scheduler filtering, an optimization has been added to query resource
> capacity from placement and prune the compute node list with the result
> *before* any filters are applied. Placement tracks resource capacity and
> usage and does *not* track aggregate metadata. Because of this,
> placement cannot consider aggregate-based overcommit and will exclude
> compute nodes that do not have capacity based on their per compute node
> overcommit settings.
> How to prepare: if you have been relying on per aggregate overcommit, during
> your upgrade to Ocata, you must change to using per compute node overcommit
> ratios in order for your scheduling behavior to stay consistent. Otherwise,
> you may notice increased NoValidHost scheduling failures as the
> aggregate-based overcommit is no longer being considered. You can safely
> remove the AggregateCoreFilter, AggregateRamFilter, and AggregateDiskFilter
> from your enabled_filters and you do not need to replace them with any other
> core/ram/disk filters. The placement query takes care of the core/ram/disk
> filtering instead, so CoreFilter, RamFilter, and DiskFilter are redundant.
>  Placement is a clean slate for resource management, and prior to
> placement, there were conflicts between the different methods for setting
> overcommit ratios that were never addressed, such as, "which value to take
> if a compute node has overcommit set AND the aggregate has it set? Which
> takes precedence?" And, "if a compute node is in more than one aggregate,
> which overcommit value should be taken?" So, the ambiguities were not
> something that was desirable to bring forward into placement.
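To make the pruning step concrete, here is a rough sketch of the capacity check placement now performs before any scheduler filters run (illustrative Python only, not actual Nova/placement code; the function name is made up):

```python
# Sketch of the placement-style capacity check described above.
# Placement derives capacity from per-node inventory as
# usable = (total - reserved) * allocation_ratio; aggregate
# metadata never enters the calculation, which is why
# aggregate-based ratios are ignored in >= Ocata.
def node_has_capacity(total_vcpus, reserved, allocation_ratio,
                      used, requested):
    """Return True if a compute node can fit `requested` vCPUs."""
    usable = (total_vcpus - reserved) * allocation_ratio
    return used + requested <= usable

# A 16-core node with a 4.0 ratio exposes 64 usable vCPUs:
print(node_has_capacity(16, 0, 4.0, used=60, requested=4))  # True
print(node_has_capacity(16, 0, 4.0, used=60, requested=8))  # False
```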
So we are a user of this feature and I do have some questions/concerns.
We use this feature to segregate capacity/hosts based on CPU
allocation ratio using aggregates.
This is because we have different offers/flavors based on those
allocation ratios. This is part of our business model.
A flavor's extra_specs are used to schedule instances on the appropriate hosts.
Our setup has a configuration management system and we use aggregates
exclusively when it comes to allocation ratio.
We do not rely on cpu_allocation_ratio config in nova-scheduler or nova-compute.
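For context, our aggregate-based workflow looks roughly like this (illustrative openstack CLI commands; the aggregate, property, and flavor names are made up):

```shell
# Create a capacity pool with a 4:1 CPU overcommit, expressed as
# aggregate metadata (honored by AggregateCoreFilter in <= Newton):
openstack aggregate create cpu-4x
openstack aggregate set --property cpu_allocation_ratio=4.0 cpu-4x
openstack aggregate set --property pool=cpu-4x cpu-4x

# Capacity is added or removed through the API, no config redeploy:
openstack aggregate add host cpu-4x compute-01

# Flavors are pinned to the pool via extra_specs
# (matched by AggregateInstanceExtraSpecsFilter):
openstack flavor set \
    --property aggregate_instance_extra_specs:pool=cpu-4x general.4x
```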
One of the reasons is that we do not want to have to
update/package/redeploy our configuration management system just to
add one or more compute nodes to an aggregate/capacity pool.
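To make the trade-off concrete: the per compute node alternative means our configuration management system would have to template something like the following into every compute node's nova.conf (option name as in current Nova; the ratio value is just an example):

```ini
[DEFAULT]
# Must now be set per compute node in >= Ocata; placement reports
# capacity from this value, not from aggregate metadata.
cpu_allocation_ratio = 4.0
```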
This means anyone (likely an operator or other provisioning
technician) can perform this action without having to touch or even
know about our configuration management system.
We can also transfer capacity from one aggregate to another if there
is a need, again, using aggregate memberships. (we do "evacuate" the
node if there are instances on it)
Our capacity monitoring is based on aggregate memberships, which
offers an easy overview of current capacity. Note that a host can
be in one and only one aggregate in our setup.
What's the migration path for us?
My understanding is that we will now be forced to have people rely on
our configuration management system (which they don't have access to)
to perform simple tasks we used to be able to do through the API.
I find this unfortunate and I would like to be offered an alternative
solution as the current proposed solution is not acceptable for us.
We are losing "agility" in our operational tasks.