[openstack-dev] [nova][cinder] Fix nova swap volume (updating an attached volume) function

Matt Riedemann mriedem at linux.vnet.ibm.com
Thu Mar 31 01:39:15 UTC 2016

On 3/30/2016 8:20 PM, Matt Riedemann wrote:
> On 3/30/2016 7:56 PM, Matt Riedemann wrote:
>> On 3/30/2016 7:38 PM, Matt Riedemann wrote:
>>> On 2/25/2016 5:31 AM, Takashi Natsume wrote:
>>>> Hi Nova and Cinder developers.
>>>> As I reported in a bug report [1], nova swap volume
>>>> (updating an attached volume) fuction does not work
>>>> in the case of non admin users by default.
>>>> (Volumes are stuck.)
>>>> Before I was working for fixing another swap volume bug [2][3].
>>>> But Ryan fixed it on the Cinder side [4].
>>>> As a result, admin users can execute swap volume function,
>>>> but it was not fixed in the case of non admin users.
>>>> So I reported the bug report [1].
>>>> In the patch[5], I tried to change the default cinder's policy
>>>> to allow non admin users to execute migrate_volume_completion API.
>>>> But it was rejected by the cinder project ('-2' was voted).
>>>> In the patch[5], it was suggested to make the swap volume API admin
>>>> only
>>>> on the Nova side.
>>>> But IMO, the swap volume function should be allowed to non admin users
>>>> because attaching a volume and detaching a volume can be performed
>>>> by non admin users.
>>> I agree with this. DuncanT said in IRC that he didn't think non-admin
>>> users should be using the swap-volume API in nova because it can be
>>> problematic, but I'm not sure why, is there more history or detail
>>> there? I'd think it shouldn't be any worse than doing a detach/attach in
>>> quick succession (like in a CI test for example).
>>>> If migrate_volume_completion is only allowed to admin users
>>>> by default on the Cinder side, attaching a new volume and
>>>> detaching an old volume should be performed on the Nova side
>>>> when swapping volumes.
>>> My understanding of the problem is as follows:
>>> 1. Admin-initiated volume migration in Cinder calls off to Nova to
>>> perform the swap-volume, and then Nova calls back to Cinder's
>>> migrate_volume_completion API. This is fine since it's an admin that
>>> initiated this series of operations on the Cinder side (that's by
>>> default, however, this is broken if the policy file for Cinder is change
>>> to allow non-admins to migrate volumes).
>>> 2. A non-admin swap-volume API call in Nova fails because Nova blindly
>>> makes the migrate_volume_completion call to Cinder which fails with a
>>> 403 because the Cinder API policy has that as an admin action by
>>> default.
>>> I don't know the history around when the swap-volume API was added to
>>> Nova, was it specifically for this volume migration scenario in Cinder?
>>>   Are there other use cases?  Knowing those would be good to determine
>>> if Nova should change it's default policy for swap-volume, although,
>>> again, that's only a default and can be changed per deployment so we
>>> probably shouldn't rely on it.
>>> Ideally we would have implemented this like the nova/neutron server
>>> events callback API in Nova during vif plugging (nova does the vif plug
>>> on the host then waits for neutron to update it's database for the port
>>> status and sends an event (API call) to nova to continue booting the
>>> server). That server events API in nova is admin-only by default and
>>> neutron is configured with admin credentials for nova to use it.
>>> Another option would be for Nova to handle a 403 response when calling
>>> Cinder's migrate_volume_completion API and ignore it if we don't have an
>>> admin context. This is pretty hacky though. It assumes that it's a
>>> non-admin user initiating the swap-volume operation. It wouldn't be a
>>> problem for the volume migration operation initiated in Cinder since by
>>> default that's admin-only, so nova shouldn't get a 403 when calling
>>> migrate_volume_completion. The trap would be if the cinder policy for
>>> volume migration was changed to allow non-admins, but if someone did
>>> that, they should also change the policy for migrate_volume_completion
>>> to allow non-admin too.
>>>> If you have a good idea, please let me know it.
>>>> [1] Cinder volumes are stuck when non admin user executes nova swap
>>>> volume API
>>>>      https://bugs.launchpad.net/cinder/+bug/1522705
>>>> [2] Cinder volume stuck in swap_volume
>>>>      https://bugs.launchpad.net/nova/+bug/1471098
>>>> [3] Fix cinder volume stuck in swap_volume
>>>>      https://review.openstack.org/#/c/207385/
>>>> [4] Fix swap_volume for case without migration
>>>>      https://review.openstack.org/#/c/247767/
>>>> [5] Enable volume owners to execute migrate_volume_completion
>>>>      https://review.openstack.org/#/c/253363/
>>>> Regards,
>>>> Takashi Natsume
>>>> NTT Software Innovation Center
>>>> E-mail: natsume.takashi at lab.ntt.co.jp
>>>> __________________________________________________________________________
>>>> OpenStack Development Mailing List (not for usage questions)
>>>> Unsubscribe:
>>>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> I also just checked Tempest and apparently we have no coverage for the
>> swap-volume API in Nova, we should fix that as part of this.
> I've done some more digging. The swap-volume functionality was added to
> nova here [1].  The cinder use of it for volume migration was added here
> [2].
> Looking at the cinder volume API for migrate_volume_completion, it
> expects the source (old) volume to have migration_status set [3].
> So, I think we can easily fix this in Nova by simply not calling
> volume_migration_completion if old_volume['migration_status'] is None.
> [1]
> https://github.com/openstack/nova/commit/8f51b120b430c7c21399256f37e1d8f75d030484
> [2]
> https://github.com/openstack/nova/commit/0e4bd7f93b9bfadcc2bb6dfaeae7bb5ee00c194b
> [3]
> http://git.openstack.org/cgit/openstack/cinder/tree/cinder/volume/api.py#n1358

Of course that had to be too easy to be true. The volume 
'migration_status' is only returned for the volume details if you're 
calling with an admin context [1].

I think we can still use this, we just can't expect 
volume['migration_status'] to be in the response from the volume GET. If 
it's not there, we can assume we're not doing a migration and we're not 
an admin anyway, so we can't call migrate_volume_completion.




Matt Riedemann

More information about the OpenStack-dev mailing list