Re: [nova][dev][ops] can we get rid of 'project_only' in the DB layer?

18 Feb 2019


      On Mon, 18 Feb 2019 13:16:38 +0800, He Jie Xu <soulxu@gmail.com> wrote:
...
Add the maillist back, I missed from the previous reply...
melanie witt <melwittt@gmail.com <mailto:melwittt@gmail.com>> 于2019年2 
月16日周六 上午12:22写道：
Thanks for the reply, Alex. Response is inline.
On Fri, 15 Feb 2019 15:49:19 +0800, He Jie Xu <soulxu@gmail.com
    <mailto:soulxu@gmail.com>> wrote:
     > We need to ensure all the APIs check the policy with real instance's
     > project and user id, like this
     >
    https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/eva...
     > Otherwise, any user can get any other tenant's instance.
     >
     > Some of APIs doesn't check like this, for example:
     >
    https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/att...
While I agree with this (policy check instance project/user instead of
    request context project/user), I'm not sure it's related to the
    project_only=True at the database layer. The project_only=True at the
    database layer only enforces request context project, which is what the
    second example here will also do, if policy contains
    'project_id:%(project_id)s'. It seems to me that if we remove
    project_only=True from the database layer, we will get proper
    enforcement of request context project/user by the existing policy
    checks, if the policy is configured as such. And we default policy to
    either rule:admin_api or rule:admin_or_owner:
https://docs.openstack.org/nova/latest/configuration/sample-policy.html
So, it seems to me that changing policy enforcement from request
    context
    project/user => instance project/user would be a separate change.
    Please
    let me know if I'm misunderstanding you.
One thing I do notice though, in your first example, is that the
    get_instance is done _before_ the policy check, which would need to be
    moved after the policy check, in the same change that would remove
    project_only=True. So I'm glad you pointed that out.
Emm...no, the first example is the right example.
project_only=True will ensure db call return the instance belong to the 
project
in the request context. If project_only=False, the db call may return 
other project's
instance.
Then when the rule is 'project_id:%(project_id)s' and the target is 
instance's project_id,
the policy enforcement will ensure the request context's project id 
match the instance's project_id,
then the user won't get other project's instance.
Right, that is the proposal in this email. That we should remove 
project_only=True and let the API policy check handle whether or not the 
user from a different project is allowed to get the instance. Otherwise, 
users are not able to use policy to control the behavior because it is 
hard-coded in the database layer.

I was trying to say that adding the instance.project_id target seems 
unrelated to the issue about removing project_only=True.
...
> I'm trying to memory why we didn't do that in the beginning,
    sounds like
     > we refused to support user-id based policy. But
     > in the end for the backward-campatible for the user like CERN, we
     > support user-id based policy for few APIs.
     >
    https://review.openstack.org/#/q/topic:bp/user_id_based_policy_enforcement+(...)
     >
     > This is probably why some APIs checks policy with real instance's
     > project id and user id and some APIs not.
     >
     > melanie witt <melwittt@gmail.com <mailto:melwittt@gmail.com>
    <mailto:melwittt@gmail.com <mailto:melwittt@gmail.com>>> 于2019年2
     > 月15日周五 上午1:23写道：
     >
     >     Hey all,
     >
     >     Recently, we had a customer try the following command as a
    non-admin
     >     with a policy role granted in policy.json to allow live migrate:
     >
     >         "os_compute_api:os-migrate-server:migrate_live":
    "rule:admin_api or
     >     role:Operator"
     >
     >     The scenario is that they have a server in project A and a
    user in
     >     project B with role:Operator and the user makes a call to
    live migrate
     >     the server.
     >
     >     But when they call the API, they get the following error
    response:
     >
     >         {"itemNotFound": {"message": "Instance <uuid server 1>
    could not be
     >     found.", "code": 404}}
     >
     >     A superficial look through the code shows that the live
    migrate should
     >     work, because we have appropriate policy checks in the API,
    and the
     >     request makes it past those checks because the policy.json
    has been set
     >     correctly.
     >
     >     A common pattern in our APIs is that we first
    compute_api.get() the
     >     instance object and then we call the server action (live
    migrate, stop,
     >     start, etc) with it after we retrieve it. In this scenario, the
     >     compute_api.get() fails with NotFound.
     >
     >     And the reason it fails with NotFound is because, much lower
    level, at
     >     the DB layer, we have a keyword arg called 'project_only'
    which, when
     >     True, will scope a database query to the
    RequestContext.project_id
     >     only.
     >     We have hard-coded 'project_only=True' for the instance get
    query.
     >
     >     So, when the user in project B with role:Operator tries to
    retrieve the
     >     instance record in project A, with appropriate policy rules
    set, it
     >     will
     >     fail because 'project_only=True' and the request context is
    project B,
     >     while the instance is in project A.
     >
     >     My question is: can we get rid of the hard-coded
    'project_only=True' at
     >     the database layer? This seems like something that should be
     >     enforced at
     >     the API layer and not at the database layer. It reminded me of an
     >     effort
     >     we had a few years ago where we removed other hard-coded policy
     >     enforcement from the database layer [1][2]. I've uploaded a WIP
     >     patch to
     >     demonstrate the proposed change [3].
     >
     >     Can anyone think of any potential problems with doing this?
    I'd like to
     >     be able to remove it so that operators are able use policy to
    allow
     >     non-admin users with appropriately configured roles to run server
     >     actions.
     >
     >     Cheers,
     >     -melanie
     >
     >     [1]
     >
    https://blueprints.launchpad.net/nova/+spec/nova-api-policy-final-part
     >     [2]
     >
    https://review.openstack.org/#/q/topic:bp/nova-api-policy-final-part+(status...)
     >     [3] https://review.openstack.org/637010
     >