Add the maillist back, I missed from the previous reply...

melanie witt <melwittt@gmail.com> 于2019年2月16日周六 上午12:22写道:
Thanks for the reply, Alex. Response is inline.

On Fri, 15 Feb 2019 15:49:19 +0800, He Jie Xu <soulxu@gmail.com> wrote:
> We need to ensure all the APIs check the policy with real instance's
> project and user id, like this
> https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/evacuate.py#L85-L88
> Otherwise, any user can get any other tenant's instance.
>
> Some of APIs doesn't check like this, for example:
> https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/attach_interfaces.py#L55

While I agree with this (policy check instance project/user instead of
request context project/user), I'm not sure it's related to the
project_only=True at the database layer. The project_only=True at the
database layer only enforces request context project, which is what the
second example here will also do, if policy contains
'project_id:%(project_id)s'. It seems to me that if we remove
project_only=True from the database layer, we will get proper
enforcement of request context project/user by the existing policy
checks, if the policy is configured as such. And we default policy to
either rule:admin_api or rule:admin_or_owner:

https://docs.openstack.org/nova/latest/configuration/sample-policy.html

So, it seems to me that changing policy enforcement from request context
project/user => instance project/user would be a separate change. Please
let me know if I'm misunderstanding you.

One thing I do notice though, in your first example, is that the
get_instance is done _before_ the policy check, which would need to be
moved after the policy check, in the same change that would remove
project_only=True. So I'm glad you pointed that out.

Emm...no, the first example is the right example.

project_only=True will ensure db call return the instance belong to the project
in the request context. If project_only=False, the db call may return other project's
instance.

Then when the rule is 'project_id:%(project_id)s' and the target is instance's project_id,
the policy enforcement will ensure the request context's project id match the instance's project_id,
then the user won't get other project's instance.
 

> I'm trying to memory why we didn't do that in the beginning, sounds like
> we refused to support user-id based policy. But
> in the end for the backward-campatible for the user like CERN, we
> support user-id based policy for few APIs.
> https://review.openstack.org/#/q/topic:bp/user_id_based_policy_enforcement+(status:open+OR+status:merged)
>
> This is probably why some APIs checks policy with real instance's
> project id and user id and some APIs not.
>
> melanie witt <melwittt@gmail.com <mailto:melwittt@gmail.com>> 于2019年2
> 月15日周五 上午1:23写道:
>
>     Hey all,
>
>     Recently, we had a customer try the following command as a non-admin
>     with a policy role granted in policy.json to allow live migrate:
>
>         "os_compute_api:os-migrate-server:migrate_live": "rule:admin_api or
>     role:Operator"
>
>     The scenario is that they have a server in project A and a user in
>     project B with role:Operator and the user makes a call to live migrate
>     the server.
>
>     But when they call the API, they get the following error response:
>
>         {"itemNotFound": {"message": "Instance <uuid server 1> could not be
>     found.", "code": 404}}
>
>     A superficial look through the code shows that the live migrate should
>     work, because we have appropriate policy checks in the API, and the
>     request makes it past those checks because the policy.json has been set
>     correctly.
>
>     A common pattern in our APIs is that we first compute_api.get() the
>     instance object and then we call the server action (live migrate, stop,
>     start, etc) with it after we retrieve it. In this scenario, the
>     compute_api.get() fails with NotFound.
>
>     And the reason it fails with NotFound is because, much lower level, at
>     the DB layer, we have a keyword arg called 'project_only' which, when
>     True, will scope a database query to the RequestContext.project_id
>     only.
>     We have hard-coded 'project_only=True' for the instance get query.
>
>     So, when the user in project B with role:Operator tries to retrieve the
>     instance record in project A, with appropriate policy rules set, it
>     will
>     fail because 'project_only=True' and the request context is project B,
>     while the instance is in project A.
>
>     My question is: can we get rid of the hard-coded 'project_only=True' at
>     the database layer? This seems like something that should be
>     enforced at
>     the API layer and not at the database layer. It reminded me of an
>     effort
>     we had a few years ago where we removed other hard-coded policy
>     enforcement from the database layer [1][2]. I've uploaded a WIP
>     patch to
>     demonstrate the proposed change [3].
>
>     Can anyone think of any potential problems with doing this? I'd like to
>     be able to remove it so that operators are able use policy to allow
>     non-admin users with appropriately configured roles to run server
>     actions.
>
>     Cheers,
>     -melanie
>
>     [1]
>     https://blueprints.launchpad.net/nova/+spec/nova-api-policy-final-part
>     [2]
>     https://review.openstack.org/#/q/topic:bp/nova-api-policy-final-part+(status:open+OR+status:merged)
>     [3] https://review.openstack.org/637010
>