[Openstack-operators] [openstack-dev] [nova] heads up to users of Aggregate[Core|Ram|Disk]Filter: behavior change in >= Ocata

Jay Pipes jaypipes at gmail.com
Fri Feb 2 15:27:34 UTC 2018


On 01/29/2018 06:30 PM, Mathieu Gagné wrote:
> So let's explore what a placement-centric solution would look like.
> (let me know if I get anything wrong)
> 
> Here are our main concerns/challenges so far, which I will compare to
> our current flow:
> 
> 1. Compute nodes should not be enabled by default
> 
> When adding new compute nodes, we add them to a special aggregate which
> makes scheduling instances on them impossible (with an impossible
> condition).
> We are aware that there is a timing issue where someone *could*
> schedule an instance on one if we aren't fast enough. So far, it has
> never been a major issue for us.
> If we want to move away from aggregates, the "enable_new_services"
> config (since Ocata) would be a great solution to our need.
> I don't think Placement needs to be involved in this case, unless you
> can show me a better alternative solution.

Luckily, this one's easy. No changes would need to be made. The 
enable_new_services option should behave the same. Placement does not do 
service status checking.
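
For reference, that's just a nova.conf setting (a minimal sketch, with 
the option in its usual [DEFAULT] section):

  [DEFAULT]
  # New nova-compute services register as disabled until an operator
  # explicitly enables them.
  enable_new_services = false

and when you're ready you flip a host on with something along the lines 
of "openstack compute service set --enable <host> nova-compute".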

When the scheduler receives a request to determine a target host for an 
instance, it first asks the Placement service for a set of compute nodes 
that meet the resource requirements of that instance [1]. The Placement 
service returns a list of candidate compute nodes that could fit the 
instance.

The scheduler then runs that list of compute nodes through something 
called the servicegroup API when it runs the ComputeFilter [2]. This 
filters out any compute nodes where the nova-compute service is disabled 
on the host.

So, no changes needed here.

[1] 
https://github.com/openstack/nova/blob/master/nova/scheduler/manager.py#L122
[2] 
https://github.com/openstack/nova/blob/master/nova/scheduler/filters/compute_filter.py#L45
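
To make [1] a bit more concrete, the scheduler's query to Placement 
boils down to something like this (illustrative resource amounts, not 
anything specific to your flavors):

  GET /allocation_candidates?resources=VCPU:2,MEMORY_MB:4096,DISK_GB:20

The compute node providers named in the response are what then get run 
through the filters, including the ComputeFilter in [2].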

> 2. Ability to move compute nodes around through API (without
> configuration management system)
> 
> We use aggregates to isolate our flavor series, which are mainly based
> on cpu allocation ratio.
> This means a distinct capacity pool of compute nodes for each flavor series.
> We can easily move compute nodes around through the API if one
> aggregate (or capacity pool) needs more of them.
> There is no need to deploy a new version of our configuration management system.
> 
> Based on your previous comment, Nova developers could implement a way
> so the configuration file is no longer used when pushing ratios to
> Placement.
> An op should be able to provide the ratio himself through the Placement API.

Yes. The following will need to happen in order for you to continue 
operating your cloud without any changes:

nova-compute needs to stop overwriting the inventory records' allocation 
ratio with the value it sees in its own nova.conf file when that config 
value is 0.0. Instead, nova-compute needs to look for the first host 
aggregate that has a corresponding allocation_ratio metadata item and, 
if one is found, use that value for the inventory record's allocation ratio.
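
To make that concrete, here's a rough sketch of the precedence I have in 
mind. The helper below is made up purely for illustration (it is not 
existing Nova code); only the ordering matters:

  # Hypothetical sketch of the proposed ratio selection, not actual
  # Nova code: nova.conf override first, then host aggregate
  # metadata, then the per-resource-class default.
  def pick_allocation_ratio(conf_ratio, host_aggregates, metadata_key,
                            default_ratio):
      # 1. An explicit (non-zero) value in nova.conf still wins.
      if conf_ratio:
          return conf_ratio
      # 2. Otherwise use the first host aggregate that carries the
      #    corresponding metadata item, e.g. 'cpu_allocation_ratio'.
      for agg in host_aggregates:
          if metadata_key in agg.metadata:
              return float(agg.metadata[metadata_key])
      # 3. Fall back to the per-resource-class default
      #    (16.0 for VCPU, 1.5 for RAM, 1.0 for disk).
      return default_ratio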

I'm working on a spec for the above and have added this as an agenda 
item for the PTG.

In addition to the above, we are also looking at making the Placement 
service "oslo.policy-ified" :) This means that the Placement service 
needs to have RBAC rules enforced using the same middleware and policy 
library as Nova. This is also on the PTG agenda.

> Some questions:
> * What's the default ratio if none is provided? (none through config
> file or API)

Depends on the resource class. CPU is 16.0, RAM is 1.5 and disk is 1.0.
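
In nova.conf terms, a sketch of what that means (as I understand the 
Ocata-era behavior): the options default to 0.0, which means "not set", 
and the resource tracker then applies those per-class values.

  [DEFAULT]
  # 0.0 means "not set"; 16.0 / 1.5 / 1.0 are then applied for
  # VCPU / RAM / disk respectively.
  cpu_allocation_ratio = 0.0
  ram_allocation_ratio = 0.0
  disk_allocation_ratio = 0.0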

> * How can I restrict flavors to matching hosts? Will Placement respect
> allocation ratios provided by a flavor and find corresponding compute
> nodes? I couldn't find details on that one in previous emails.

Allocation ratios are not provided by the flavor. What I think you meant 
is: will Placement respect allocation ratios set on the host aggregate? 
And the answer to that is: no, but it won't need to.

Instead, it is *nova-compute* that needs to "respect the host 
aggregate's allocation ratios". That's what needs to change and what I 
describe above.

> Some challenges:
> * Find a way to easily visualise/monitor available capacity/hosts per
> capacity pool (previously aggregate)

ack. This is a good thing to discuss in Dublin.

> 3. Ability to delegate above operations to ops

If we do the "make sure nova-compute respects host aggregate allocation 
ratios" work, things will continue to operate just as they have for you 
since Folsom.

> With aggregates, you can easily and precisely delegate host membership to
> ops or other people using the corresponding policies:
> * os_compute_api:os-aggregates:add_host
> * os_compute_api:os-aggregates:remove_host

Yep, we're not changing any of that.
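
For completeness, delegating just those two calls is a standard policy 
override, for example (the "ops_team" role name is only illustrative, 
and keep whatever admin rule your deployment already uses):

  "os_compute_api:os-aggregates:add_host": "role:ops_team or rule:admin_api",
  "os_compute_api:os-aggregates:remove_host": "role:ops_team or rule:admin_api"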

> And those people can be autonomous with the API without having to
> redeploy anything.

Understood.

> Being able to manipulate/administer the hosts through an API is golden,
> and I therefore totally disagree with "Config management is the solution
> to your problem".

Understood, and we're trying our best to work with you to keep the 
functionality you've gotten accustomed to.

> With the Placement API, there is no fine-grained control over who can
> do what through the API (it's even hardcoded to the "admin" role).

Yep. That's the oslo-policy integration task mentioned above. On the PTG 
agenda.

> So there is some work to be done here:
> * Remove the hardcoded "admin" role from the code. Already being worked
> on by Matt Riedemann [1]
> * Fine-grained control/policies for the Placement API.
> 
> The last point needs a bit more work. If we can implement control on
> resource providers, allocations, usages, traits, etc., I will be happy.
> In the end, that's one of the major "regressions" I found with
> placement. I don't want a human to be able to do more than they should
> be able to do and break everything.
> 
> So I'm not ready to cross that river yet, I'm still running Mitaka.
> But I need to make sure I'm not stuck when Ocata happens for us.

Well, more likely it's going to be Rocky that you will need to 
fast-forward upgrade to, but we'll cross that river when we need to.

Best,
-jay

> [1] https://review.openstack.org/#/c/524425/
> 
> --
> Mathieu
> 


