On Thu, Dec 20, 2018 at 1:09 PM Matt Riedemann <mriedemos@gmail.com> wrote:
I wanted to float something that we talked about in the public cloud SIG meeting today [1] which is the concept of making the lock API more granular to lock on a list of actions rather than globally locking all actions that can be performed on a server.
The primary use case we discussed was around a pre-paid pricing model for servers. A user can pre-pay resources at a discount if let's say they are going to use them for a month at a fixed rate. However, once they do, they can't resize those servers without going through some kind of approval (billing) process to resize up. With this, the provider could lock the user from performing the resize action on the server but the user could do other things like stop/start/reboot/snapshot/etc.
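To make the idea concrete, here is a minimal sketch of a lock that covers a list of actions instead of everything. It is not nova code: `Server`, `lock`, `check_action`, `ServerLockedError` and the `"*"` marker are hypothetical names invented for illustration, and the admin bypass is an assumption about how a provider would want it to behave.

```python
# Hypothetical sketch only; none of these names exist in nova.
ALL_ACTIONS = "*"  # assumed marker meaning "every action is locked"


class ServerLockedError(Exception):
    """Raised when a non-admin user attempts a locked action."""


class Server:
    def __init__(self, name):
        self.name = name
        self.locked_actions = set()  # empty set means the server is unlocked

    def lock(self, actions=None):
        # No list given: lock everything, like a global lock.
        # List given: lock only the named actions.
        self.locked_actions = {ALL_ACTIONS} if actions is None else set(actions)

    def unlock(self):
        self.locked_actions = set()


def check_action(server, action, is_admin=False):
    """Raise ServerLockedError if the action is locked for this caller."""
    if is_admin:
        return  # assumption: the provider/admin can always override the lock
    locked = server.locked_actions
    if ALL_ACTIONS in locked or action in locked:
        raise ServerLockedError(
            "%s is locked for action %r" % (server.name, action))


# Pre-paid server: resize is blocked for the user, reboot is still fine.
prepaid = Server("prepaid-1")
prepaid.lock(["resize"])
check_action(prepaid, "reboot")
check_action(prepaid, "resize", is_admin=True)
```

Calling `lock()` with no arguments keeps the all-or-nothing behavior, so a change like this could stay backward compatible for existing callers.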
The pricing model sounds similar to pre-emptible instances for getting a discount but the scenario is different in that these servers couldn't be pre-empted (they are definitely more non-cloudy pets than cattle).
An alternative solution for that locked resize issue is using granular policy rules such that pre-paid servers have some other kind of role attached to them, so by policy you could restrict users from performing actions on those servers (but the admin could override). In reality I'm not sure how feasible that is in a public cloud with several thousand projects. The issue I see with policy controlling this is that the role is attached to the project, not the resource (the server), so if you did this, would users have to have separate projects for on-demand vs pre-paid resources? I believe that's what CERN and StackHPC are doing with pre-emptible instances (you have different projects with different quota models for pre-emptible resources).
One way you might be able to do this is by shoveling off the policy check using oslo.policy's http_check functionality [0]. But it still doesn't fix the problem that users have roles on projects, and that's the standard for relaying information from keystone to services today. Hypothetically, the external policy system *could* be an API that allows operators to associate users with policies that are more granular than what OpenStack offers today (I could POST to this policy system that a specific user can do everything but resize up this *specific* instance).
When nova processes a policy check, it hands control to oslo.policy, which shuffles it off to this external system for enforcement. The external policy system evaluates the policies based on the information nova passes it, which would have to include the policy check string, the context of the request (like the user), and the resource they are trying to operate on (the instance in this case). The external policy system could query its own policy database for any policies matching that data, run the decisions, and return the enforcement decision per the oslo.policy API.
Obviously, you'll have a performance hit since the policy decision and policy enforcement points are no longer oslo.policy *within* nova, but some external system being called by oslo.policy... Might not be the best idea, but food for thought based on the architecture we have today.
[0] https://docs.openstack.org/oslo.policy/latest/user/plugins.html
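As a rough illustration of the decision side of such an external system, here is a sketch of a per-user, per-instance lookup. Everything in it is an assumption made for the example: the `DENIED` table, the `decide` and `handle_request` helpers, and the JSON payload with `rule`, `target` and `credentials` keys. The rule name is used only as a sample string. As far as I know the stock http check allows the request only when the response body is exactly "True", but check the plugin docs in [0] for the actual request format before relying on any of this.

```python
import json

# Hypothetical policy database: (user_id, instance_id) -> rules denied to
# that user on that specific instance. A real system would persist this and
# expose an API for operators to POST entries.
DENIED = {
    ("user-1", "instance-a"): {"os_compute_api:servers:resize"},
}


def decide(rule, target, credentials):
    """Return True if the user may perform `rule` on the target resource."""
    key = (credentials.get("user_id"), target.get("id"))
    return rule not in DENIED.get(key, set())


def handle_request(body):
    """Turn one policy query into a response body.

    Assumed payload: a JSON object with "rule", "target" and "credentials".
    """
    data = json.loads(body)
    allowed = decide(data["rule"], data["target"], data["credentials"])
    return "True" if allowed else "False"


# user-1 may reboot instance-a but not resize it; other users are unaffected.
query = {
    "rule": "os_compute_api:servers:resize",
    "target": {"id": "instance-a"},
    "credentials": {"user_id": "user-1"},
}
print(handle_request(json.dumps(query)))
```

The lookup is keyed on the instance rather than the project, which is the piece that role-based policy cannot express today. The cost is an extra network round trip for every policy check.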
I believe there are probably other use cases for granular locks on servers for things like service VMs (trove creates some service VMs to run a database cluster and puts locks on those servers). Again, definitely a pet scenario but it's one I've heard before.
Would people be generally in favor of this or opposed, or just meh?
[1] https://etherpad.openstack.org/p/publiccloud-wg
--
Thanks,
Matt