<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Hi,<br>
The Cyborg quota spec [1] proposes to implement a quota (maximum
usage) for accelerators on a per-project basis, to prevent one
project (tenant) from overusing resources and starving other
tenants. Since there are separate resource classes (RCs) for
different accelerator types (GPUs, FPGAs, etc.), quotas can be
applied per RC.<br>
<br>
The current proposal [2] is to track usage in the Cyborg
agent/driver. I am not sure that scheme will work, for the reasons
indicated in my comments on [1]. Here is another possible way.<br>
<div align="left">
<ul>
<li>The operator configures limits in Keystone via oslo.limit,
per project and per resource class (GPU, FPGA, ...).</li>
<ul>
<li>Until this gets into Keystone, Cyborg may define its own
quota table, as defined in [1].</li>
</ul>
<li>Cyborg implements a table to track per-project usage, as
defined in [1] (an illustrative sketch follows this list).</li>
<li>Cyborg provides a filter for the Nova scheduler, which
checks whether the project making the request has exceeded its
quota (see the filter sketch further below).</li>
<ul>
<li>If so, it removes all candidates, thus failing the
request.</li>
<li>If not, it updates the per-project usage in its own DB.
Since this is an out-of-tree filter, at least to start with,
it should be OK to update the DB directly without making
REST API calls.</li>
</ul>
</ul>
</div>
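For concreteness, here is a minimal sketch of the usage table as a
SQLAlchemy model. The column names are illustrative assumptions on
my part; the actual schema is whatever [1] ends up defining.<br>
<pre>
# Illustrative sketch only; the real table is defined in the spec [1].
from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class QuotaUsage(Base):
    """Per-project usage, keyed by resource class ('FPGA', 'GPU', ...)."""
    __tablename__ = 'quota_usages'

    project_id = Column(String(255), primary_key=True)
    resource_class = Column(String(255), primary_key=True)
    in_use = Column(Integer, nullable=False, default=0)
</pre>
<br>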
IOW, resource usage tracking and enforcement are done as part of
request scheduling, rather than at the compute node.<br>
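<br>
To make the filter idea concrete, here is a rough sketch against
Nova's BaseHostFilter API. The helper get_requested_accelerators()
and the cyborg_db module are hypothetical placeholders, not existing
Cyborg code:<br>
<pre>
# Rough sketch of the proposed out-of-tree filter.
from nova.scheduler import filters


class CyborgQuotaFilter(filters.BaseHostFilter):
    # Quota is per project, not per host, so checking once per
    # request is enough; Nova applies the result to all hosts.
    run_filter_once_per_request = True

    def host_passes(self, host_state, spec_obj):
        project_id = spec_obj.project_id
        # Hypothetical helper: derive (resource_class, count) pairs
        # from the request, e.g. [('FPGA', 1)].
        requested = get_requested_accelerators(spec_obj)
        for rc, count in requested:
            usage = cyborg_db.get_usage(project_id, rc)
            limit = cyborg_db.get_limit(project_id, rc)
            if usage + count > limit:
                # Failing every host fails the whole request.
                return False
        # Out-of-tree filter, so update the usage table directly
        # rather than through a REST API call.
        for rc, count in requested:
            cyborg_db.add_usage(project_id, rc, count)
        return True
</pre>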
<br>
If there are better ways, or ways to avoid a filter, please let me
know.<br>
<br>
[1] <a href="https://review.openstack.org/#/c/560285/">https://review.openstack.org/#/c/560285/</a><br>
[2] <a href="https://review.openstack.org/#/c/564968/">https://review.openstack.org/#/c/564968/</a><br>
<br>
Thanks.<br>
<br>
Regards,<br>
Sundar<br>
<br>
</body>
</html>