[openstack-dev] [cyborg] [nova] Cyborg quotas
lvmxhster at gmail.com
Sat May 19 10:34:05 UTC 2018
2018-05-18 19:58 GMT+08:00 Nadathur, Sundar <sundar.nadathur at intel.com>:
> Hi Matt,
> On 5/17/2018 3:18 PM, Matt Riedemann wrote:
> On 5/17/2018 3:36 PM, Nadathur, Sundar wrote:
> This applies only to the resources that Nova handles, IIUC, which does not
> handle accelerators. The generic method that Alex talks about is obviously
> preferable but, if that is not available in Rocky, is the filter an option?
> If nova isn't creating accelerator resources managed by cyborg, I have no
> idea why nova would be doing quota checks on those types of resources. And
> no, I don't think adding a scheduler filter to nova for checking
> accelerator quota is something we'd add either. I'm not sure that would
> even make sense - the quota for the resource is per tenant, not per host, is
> it? The scheduler filters work on a per-host basis.
> Can we not extend BaseFilter.filter_all() to get all the hosts in a
> I should have made it clearer that this putative filter will be
> out-of-tree, and needed only till better solutions become available.
> Like any other resource in openstack, the project that manages that
> resource should be in charge of enforcing quota limits for it.
> Agreed. Not sure how other projects handle it, but here's the situation
> for Cyborg. A request may get scheduled on a compute node with no
> intervention by Cyborg. So, the earliest check that can be made today is in
> the selected compute node. A simple approach can result in quota violations
> as in this example.
> Say there are 5 devices in a cluster. A tenant has a quota of 4 and is
> currently using 3. That leaves 2 unused devices, of which the tenant is
> permitted to use only one. But he may submit two concurrent requests, and
> they may land on two different compute nodes. The Cyborg agent in each node
> will see the current tenant usage as 3 and let the request go through,
> resulting in quota violation.
> That's a bad design if the Cyborg agent in each node lets the request go through.
And the current Cyborg quota design does not have this issue.
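To make the race in the example concrete, here is a minimal sketch in Python. It is an illustration only, not Cyborg code: UsageStore, agent_check, agent_commit and QuotaService are hypothetical names, and the numbers (quota of 4, usage of 3) are taken from the example above. The first half replays the per-node check-then-commit sequence in the unlucky order; the second half assumes a single central service that does the check and the increment under one lock.

```python
import threading

QUOTA = 4  # tenant quota from the example


class UsageStore:
    """Hypothetical shared record of the tenant's current usage."""

    def __init__(self, used):
        self.used = used


def agent_check(store, requested=1):
    # Each node-local agent reads usage and decides on its own.
    return store.used + requested <= QUOTA


def agent_commit(store, requested=1):
    store.used += requested


# Racy variant: both checks happen before either commit.
store = UsageStore(used=3)
ok_a = agent_check(store)  # sees 3, allows
ok_b = agent_check(store)  # also sees 3, allows
if ok_a:
    agent_commit(store)
if ok_b:
    agent_commit(store)
racy_usage = store.used  # quota of 4 is now exceeded


class QuotaService:
    """Hypothetical central service: check and increment under one lock."""

    def __init__(self, quota, used):
        self.quota = quota
        self.used = used
        self._lock = threading.Lock()

    def reserve(self, requested=1):
        with self._lock:
            if self.used + requested > self.quota:
                return False
            self.used += requested
            return True


service = QuotaService(quota=QUOTA, used=3)
results = []
threads = [
    threading.Thread(target=lambda: results.append(service.reserve()))
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

In the racy half both requests are admitted; in the second half exactly one of the two concurrent reserve() calls succeeds, whichever thread takes the lock first.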
> To prevent this, we need some kind of atomic update, like SQLAlchemy's
> That seems to have issues, as documented in the link above. Also, since
> every compute node does that, it would also serialize the bringup of all
> instances with accelerators, across the cluster.
> If there is a better solution, I'll be happy to hear it.
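One way to get an atomic update without holding a lock across a separate read and write is a conditional UPDATE, where the quota check is part of the statement's WHERE clause. The sketch below is only an illustration under stated assumptions: it uses Python's standard sqlite3 module rather than SQLAlchemy, and the quota_usage table, its columns, and try_reserve are hypothetical names, not Cyborg's schema or API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical per-tenant usage table; not Cyborg's actual schema.
conn.execute(
    "CREATE TABLE quota_usage ("
    " tenant TEXT PRIMARY KEY,"
    " hard_limit INTEGER NOT NULL,"
    " in_use INTEGER NOT NULL)"
)
# Same numbers as the example: quota of 4, currently using 3.
conn.execute("INSERT INTO quota_usage VALUES ('tenant-a', 4, 3)")
conn.commit()


def try_reserve(conn, tenant, count=1):
    """Reserve `count` devices only if the result stays within the limit.

    The check and the increment are a single UPDATE statement, so two
    requests cannot both pass the check against the same old value.
    """
    cur = conn.execute(
        "UPDATE quota_usage"
        " SET in_use = in_use + ?"
        " WHERE tenant = ? AND in_use + ? <= hard_limit",
        (count, tenant, count),
    )
    conn.commit()
    # rowcount is 1 if the row matched and was updated, 0 otherwise.
    return cur.rowcount == 1


first = try_reserve(conn, "tenant-a")
second = try_reserve(conn, "tenant-a")
in_use = conn.execute(
    "SELECT in_use FROM quota_usage WHERE tenant = ?", ("tenant-a",)
).fetchone()[0]
```

The first reservation takes usage from 3 to 4; the second matches no row and is refused. Whether this avoids the serialization concern raised above would depend on the database and on where the update runs, so treat it as a starting point for discussion.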