[Openstack-operators] [openstack-dev] [nova] heads up to users of Aggregate[Core|Ram|Disk]Filter: behavior change in >= Ocata

Jay Pipes jaypipes at gmail.com
Mon Jan 29 13:47:46 UTC 2018


Greetings again, Mathieu, response inline...

On 01/18/2018 07:24 PM, Mathieu Gagné wrote:
> So far, a couple challenges/issues:
> 
> We used to have fine grain control over the calls a user could make to
> the Nova API:
> * os_compute_api:os-aggregates:add_host
> * os_compute_api:os-aggregates:remove_host
> 
> This means we could make it so our technicians could *ONLY* manage
> this aspect of our cloud.
> With placement API, it's all or nothing. (and found some weeks ago
> that it's hardcoded to the "admin" role)
> And you now have to craft your own curl calls and no more UI in
> Horizon. (let me know if I missed something regarding the ACL)
> 
> I will read about placement API and see with my coworkers how we could
> adapt our systems/tools to use placement API instead. (assuming
> disable_allocation_ratio_autoset will be implemented)
> But ACL is a big concern for us if we go down that path.

OK, I think I may have stumbled upon a possible solution to this that 
would allow you to keep using the same host aggregate metadata APIs for 
setting allocation ratios. See below.

> While I agree there are very technical/raw solutions to the issue
> (like the ones you suggested), please understand that from our side,
> this is still a major regression in the usability of OpenStack from an
> operator point of view.

Yes, understood.

> And it's unfortunate that I feel I now have to play catch up and
> explain my concerns about a "fait accompli" that wasn't well
> communicated to the operators and wasn't clearly mentioned in the
> release notes.
> I would have appreciated an email to the ops list explaining the
> proposed change and if anyone has concerns/comments about it. I don't
> often reply but I feel like I would have this time as this is a major
> change for us.

Agree with you. Frankly, I did not realize this would be an issue. Had I 
known, of course we would have sent a note out about this and consulted 
with operators ahead of time.

Anyway, on to a possible solution.

For background, please see this bug:

https://bugs.launchpad.net/nova/+bug/1742747

When looking at that bug and the associated patch, I couldn't help but 
think that perhaps we could just change the default behaviour of the 
resource tracker when it encounters a nova.conf 
CONF.cpu_allocation_ratio value of 0.0.

The current behaviour of the nova-compute resource tracker is to follow 
the policy outlined in the CONF option's documentation: [1]

"From Ocata (15.0.0) this is used to influence the hosts selected by
the Placement API. Note that when Placement is used, the CoreFilter
is redundant, because the Placement API will have already filtered
out hosts that would have failed the CoreFilter.

This configuration specifies ratio for CoreFilter which can be set
per compute node. For AggregateCoreFilter, it will fall back to this
configuration value if no per-aggregate setting is found.

NOTE: This can be set per-compute, or if set to 0.0, the value
set on the scheduler node(s) or compute node(s) will be used
and defaulted to 16.0."

[1] 
https://github.com/openstack/nova/blob/master/nova/conf/compute.py#L407-L418

What I believe we can do is change the behaviour so that if a 0.0 value 
is found in the nova.conf file on the nova-compute worker, then instead 
of defaulting to 16.0, the resource tracker would first look to see if 
the compute node was associated with a host aggregate that had a 
"cpu_allocation_ratio" metadata item. If one was found, then the host 
aggregate's cpu_allocation_ratio would be used. If not, then the 16.0 
default would be used.
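
To make the proposed fallback order concrete, here is a rough Python 
sketch. It is not the actual resource tracker code. The function name, 
the shape of the aggregate metadata (a list of dicts), and the choice to 
take the lowest ratio when a host belongs to several aggregates are all 
assumptions made for illustration only.

```python
from typing import Iterable, Mapping

# Default named in the CONF option's documentation quoted above.
DEFAULT_CPU_ALLOCATION_RATIO = 16.0


def effective_cpu_allocation_ratio(
    conf_ratio: float,
    aggregates_metadata: Iterable[Mapping[str, str]],
) -> float:
    """Hypothetical helper showing the proposed lookup order.

    conf_ratio: the cpu_allocation_ratio value from nova.conf on the
        compute node.
    aggregates_metadata: metadata dicts of the host aggregates the
        compute node belongs to (assumed shape, values are strings).
    """
    # 1. An explicit, non-zero nova.conf value always wins.
    if conf_ratio != 0.0:
        return conf_ratio

    # 2. With 0.0 in nova.conf, look for per-aggregate metadata.
    ratios = []
    for metadata in aggregates_metadata:
        raw = metadata.get("cpu_allocation_ratio")
        if raw is None:
            continue
        try:
            ratios.append(float(raw))
        except (TypeError, ValueError):
            # Ignore values that are not valid numbers.
            continue
    if ratios:
        # Assumption: with several aggregates, use the lowest ratio.
        return min(ratios)

    # 3. Nothing found anywhere: fall back to the 16.0 default.
    return DEFAULT_CPU_ALLOCATION_RATIO
```

For example, a compute node with 0.0 in nova.conf that sits in an 
aggregate carrying cpu_allocation_ratio=2.0 would get 2.0, while the same 
node in no aggregate at all would still get 16.0.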

What do you think?

Best,
-jay


