[nova] Updates about Detaching/Attaching root volumes
Hi Nova,

I'm working on a blueprint to support detaching/attaching root volumes. The blueprint has been proposed for quite a while, since Mitaka[1]; in that version of the proposal we only talked about instances in shelved_offloaded status. In Stein[2] the status of stopped was also added. But now we realize that supporting detach/attach of the root volume on a stopped instance could be problematic, since the underlying image could change, which might invalidate the current host.[3]

So Matt and Sean suggested maybe we could just do it for shelved_offloaded instances, and I have updated the patch according to that comment. I will update the spec later, so if anyone has thoughts on this, please let me know.

Another thing I wanted to discuss: in the proposal we will reset some fields in the root_bdm instead of deleting the whole record. Among those fields, the tag field could be tricky. My idea was to reset it too, but there could also be cases where users expect it not to change.[4]

Thoughts?

BR,

[1] http://specs.openstack.org/openstack/nova-specs/specs/mitaka/approved/detach...
[2] http://specs.openstack.org/openstack/nova-specs/specs/stein/approved/detach-...
[3] https://review.openstack.org/#/c/614750/34/nova/compute/manager.py@5467
[4] https://review.openstack.org/#/c/614750/37/nova/objects/block_device.py
On 2/26/2019 6:40 AM, Zhenyu Zheng wrote:
I'm working on a blueprint to support detaching/attaching root volumes. The blueprint has been proposed for quite a while, since Mitaka[1]; in that version of the proposal we only talked about instances in shelved_offloaded status. In Stein[2] the status of stopped was also added. But now we realize that supporting detach/attach of the root volume on a stopped instance could be problematic, since the underlying image could change, which might invalidate the current host.[3]
So Matt and Sean suggested maybe we could just do it for shelved_offloaded instances, and I have updated the patch according to that comment. I will update the spec later, so if anyone has thoughts on this, please let me know.
I mentioned this during the spec review but didn't push on it I guess, or must have talked myself out of it. We will also have to handle the image potentially changing when attaching a new root volume so that when we unshelve, the scheduler filters based on the new image metadata rather than the image metadata stored in the RequestSpec from when the server was originally created. But for a stopped instance, there is no run through the scheduler again so I don't think we can support that case. Also, there is no real good way for us (right now) to even compare the image ID from the new root volume to what was used to originally create the server because for volume-backed servers the RequestSpec.image.id is not set (I'm not sure why, but that's the way it's always been, the image.id is pop'ed from the metadata [1]). And when we detach the root volume, we null out the BDM.volume_id so we can't get back to figure out what that previous root volume's image ID was to compare, i.e. for a stopped instance we can't enforce that the underlying image is the same to support detach/attach root volume. We could probably hack stuff up by stashing the old volume_id/image_id in system_metadata but I'd rather not play that game.

It also occurs to me that the root volume attach code is also not verifying that the new root volume is bootable. So we really need to re-use this code on root volume attach [2].

tl;dr when we attach a new root volume, we need to update the RequestSpec.image (ImageMeta) object based on the new root volume's underlying volume_image_metadata so that when we unshelve we use that image rather than the original image.
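A minimal sketch of that tl;dr, assuming the helper flow rather than quoting the actual patch: at root volume attach time, rebuild RequestSpec.image from the new volume's volume_image_metadata so the scheduler filters against the new image on unshelve. The volume lookup and the exact dict shape handed to ImageMeta.from_dict are assumptions; Nova already has boot-from-volume utilities that build image metadata from volume_image_metadata, which a real change would presumably reuse.

    from nova import objects

    def refresh_request_spec_image(context, request_spec, volume_api, volume_id):
        # Cinder exposes the source image's properties for volumes created
        # from images under 'volume_image_metadata'.
        volume = volume_api.get(context, volume_id)
        image_props = volume.get('volume_image_metadata', {})
        # Build a fresh ImageMeta and persist it so that unshelve schedules
        # against the *new* root volume's image rather than the original one.
        # (The dict shape expected by ImageMeta.from_dict is assumed here.)
        request_spec.image = objects.ImageMeta.from_dict(
            {'properties': image_props})
        request_spec.save()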
Another thing I wanted to discuss: in the proposal we will reset some fields in the root_bdm instead of deleting the whole record. Among those fields, the tag field could be tricky. My idea was to reset it too, but there could also be cases where users expect it not to change.[4]
Yeah I am not sure what to do here. Here is a scenario:

User boots from volume with a tag "ubuntu1604vol" to indicate it's the root volume with the operating system. Then they shelve offload the server and detach the root volume. At this point, the GET /servers/{server_id}/os-volume_attachments API is going to show None for the volume_id on that BDM, but should it show the original tag or also show None for that? Kevin currently has the tag field being reset to None when the root volume is detached.

When the user attaches a new root volume, they can provide a new tag so even if we did not reset the tag, the user can overwrite it. As a user, would you expect the tag to be reset when the root volume is detached or have it persist but be overwritable?

If in this scenario the user then attaches a new root volume that is CentOS or Ubuntu 18.04 or something like that, but forgets to update the tag, then the old tag would be misleading.

So it is probably safest to just reset the tag like Kevin's proposed code is doing, but we could use some wider feedback here.

[1] https://github.com/openstack/nova/blob/33f367ec2f32ce36b00257c11c50844004167...
[2] https://github.com/openstack/nova/blob/33f367ec2f32ce36b00257c11c50844004167...

--
Thanks,
Matt
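For concreteness, with Kevin's current proposal the attachment record in that scenario would come back from GET /servers/{server_id}/os-volume_attachments looking roughly like this (shown as a Python-style dict; values are made up, only a subset of the response fields is included, and the tag field itself only appears in newer microversions):

    {
        "volumeAttachments": [{
            "serverId": "4b293d31-ebd5-4a7f-be03-874b90021e54",  # made up
            "device": "/dev/vda",
            "volumeId": None,  # reset on root volume detach
            "tag": None,       # also reset, per the current patch
        }]
    }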
On 2/26/2019 6:40 AM, Zhenyu Zheng wrote:
I'm working on a blueprint to support detaching/attaching root volumes. The blueprint has been proposed for quite a while, since Mitaka[1]; in that version of the proposal we only talked about instances in shelved_offloaded status. In Stein[2] the status of stopped was also added. But now we realize that supporting detach/attach of the root volume on a stopped instance could be problematic, since the underlying image could change, which might invalidate the current host.[3]
So Matt and Sean suggested maybe we could just do it for shelved_offloaded instances, and I have updated the patch according to that comment. I will update the spec later, so if anyone has thoughts on this, please let me know.
I mentioned this during the spec review but didn't push on it I guess, or must have talked myself out of it. We will also have to handle the image potentially changing when attaching a new root volume so that when we unshelve, the scheduler filters based on the new image metadata rather than the image metadata stored in the RequestSpec from when the server was originally created. But for a stopped instance, there is no run through the scheduler again so I don't think we can support that case. Also, there is no real good way for us (right now) to even compare the image ID from the new root volume to what was used to originally create the server because for volume-backed servers the RequestSpec.image.id is not set (I'm not sure why, but that's the way it's always been, the image.id is pop'ed from the metadata [1]). And when we detach the root volume, we null out the BDM.volume_id so we can't get back to figure out what that previous root volume's image ID was to compare, i.e. for a stopped instance we can't enforce that the underlying image is the same to support detach/attach root volume. We could probably hack stuff up by stashing the old volume_id/image_id in system_metadata but I'd rather not play that game.
It also occurs to me that the root volume attach code is also not verifying that the new root volume is bootable. So we really need to re-use this code on root volume attach [2].
tl;dr when we attach a new root volume, we need to update the RequestSpec.image (ImageMeta) object based on the new root volume's underlying volume_image_metadata so that when we unshelve we use that image rather than the original image.
Another thing I wanted to discuss: in the proposal we will reset some fields in the root_bdm instead of deleting the whole record. Among those fields, the tag field could be tricky. My idea was to reset it too, but there could also be cases where users expect it not to change.[4]
Yeah I am not sure what to do here. Here is a scenario:
User boots from volume with a tag "ubuntu1604vol" to indicate it's the root volume with the operating system. Then they shelve offload the server and detach the root volume. At this point, the GET /servers/{server_id}/os-volume_attachments API is going to show None for the volume_id on that BDM, but should it show the original tag or also show None for that? Kevin currently has the tag field being reset to None when the root volume is detached.
When the user attaches a new root volume, they can provide a new tag so even if we did not reset the tag, the user can overwrite it. As a user, would you expect the tag to be reset when the root volume is detached or have it persist but be overwritable?
If in this scenario the user then attaches a new root volume that is CentOS or Ubuntu 18.04 or something like that, but forgets to update the tag, then the old tag would be misleading.
So it is probably safest to just reset the tag like Kevin's proposed code is doing, but we could use some wider feedback here.
On Tue, 2019-02-26 at 07:21 -0600, Matt Riedemann wrote:

The only thing I can think of would be to have a standard "root" or "boot" tag that we apply to root volumes and encourage users to use that. But I don't know of a better way to do it generically, so resetting is probably as sane as anything else.
[1] https://github.com/openstack/nova/blob/33f367ec2f32ce36b00257c11c50844004167... [2] https://github.com/openstack/nova/blob/33f367ec2f32ce36b00257c11c50844004167...
On 2/26/2019 7:21 AM, Matt Riedemann wrote:
Yeah I am not sure what to do here. Here is a scenario:
User boots from volume with a tag "ubuntu1604vol" to indicate it's the root volume with the operating system. Then they shelve offload the server and detach the root volume. At this point, the GET /servers/{server_id}/os-volume_attachments API is going to show None for the volume_id on that BDM, but should it show the original tag or also show None for that? Kevin currently has the tag field being reset to None when the root volume is detached.
When the user attaches a new root volume, they can provide a new tag so even if we did not reset the tag, the user can overwrite it. As a user, would you expect the tag to be reset when the root volume is detached or have it persist but be overwritable?
If in this scenario the user then attaches a new root volume that is CentOS or Ubuntu 18.04 or something like that, but forgets to update the tag, then the old tag would be misleading.
So it is probably safest to just reset the tag like Kevin's proposed code is doing, but we could use some wider feedback here.
I just realized that the user providing a new tag when attaching the new root volume won't work, because we are only going to allow attaching a new root volume to a shelved offloaded instance, which explicitly rejects providing a tag in that case [1]. So we likely need to lift that restriction in this microversion, and then on unshelve the compute service needs to check if the compute supports device tags, like it does during server create, and if not, the unshelve will fail.

Now that I think about that, that's likely already a bug today, i.e. if I create a server with device tags at server create time and land on a host that supports them, but then shelve offload and unshelve to a compute that does not support them, the unshelve won't fail even though the compute doesn't support the device tags on my attached volumes/ports.

[1] https://review.openstack.org/#/c/623981/18/nova/compute/api.py@4264

--
Thanks,
Matt
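A rough sketch of the kind of check being described here, not code from the tree: on unshelve, mirror the server-create behavior and refuse to spawn if any attached device has a tag that the target driver cannot honor. The 'supports_device_tagging' capability key is borrowed from the create path; the function and the exception raised are illustrative.

    def check_device_tags_supported(driver, bdms, requested_networks):
        """Illustrative guard for the unshelve path: fail if the instance's
        BDMs or ports carry tags that the chosen virt driver cannot honor."""
        wants_tags = any(getattr(bdm, 'tag', None) for bdm in bdms) or any(
            getattr(req, 'tag', None) for req in requested_networks or [])
        if wants_tags and not driver.capabilities.get(
                'supports_device_tagging', False):
            # Hypothetical error; the server-create path aborts the build in
            # a similar situation.
            raise RuntimeError('Attached devices have tags but this compute '
                               'driver does not support device tagging.')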
On Tue, Feb 26, 2019 at 8:23 AM Matt Riedemann <mriedemos@gmail.com> wrote:
On 2/26/2019 6:40 AM, Zhenyu Zheng wrote:
I'm working on a blueprint to support detaching/attaching root volumes. The blueprint has been proposed for quite a while, since Mitaka[1]; in that version of the proposal we only talked about instances in shelved_offloaded status. In Stein[2] the status of stopped was also added. But now we realize that supporting detach/attach of the root volume on a stopped instance could be problematic, since the underlying image could change, which might invalidate the current host.[3]
So Matt and Sean suggested maybe we could just do it for shelved_offloaded instances, and I have updated the patch according to that comment. I will update the spec later, so if anyone has thoughts on this, please let me know.
I mentioned this during the spec review but didn't push on it I guess, or must have talked myself out of it. We will also have to handle the image potentially changing when attaching a new root volume so that when we unshelve, the scheduler filters based on the new image metadata rather than the image metadata stored in the RequestSpec from when the server was originally created. But for a stopped instance, there is no run through the scheduler again so I don't think we can support that case. Also, there is no real good way for us (right now) to even compare the image ID from the new root volume to what was used to originally create the server because for volume-backed servers the RequestSpec.image.id is not set (I'm not sure why, but that's the way it's always been, the image.id is pop'ed from the metadata [1]). And when we detach the root volume, we null out the BDM.volume_id so we can't get back to figure out what that previous root volume's image ID was to compare, i.e. for a stopped instance we can't enforce that the underlying image is the same to support detach/attach root volume. We could probably hack stuff up by stashing the old volume_id/image_id in system_metadata but I'd rather not play that game.
It also occurs to me that the root volume attach code is also not verifying that the new root volume is bootable. So we really need to re-use this code on root volume attach [2].
tl;dr when we attach a new root volume, we need to update the RequestSpec.image (ImageMeta) object based on the new root volume's underlying volume_image_metadata so that when we unshelve we use that image rather than the original image.
Another thing I wanted to discuss: in the proposal we will reset some fields in the root_bdm instead of deleting the whole record. Among those fields, the tag field could be tricky. My idea was to reset it too, but there could also be cases where users expect it not to change.[4]
Yeah I am not sure what to do here. Here is a scenario:
User boots from volume with a tag "ubuntu1604vol" to indicate it's the root volume with the operating system. Then they shelve offload the server and detach the root volume. At this point, the GET /servers/{server_id}/os-volume_attachments API is going to show None for the volume_id on that BDM, but should it show the original tag or also show None for that? Kevin currently has the tag field being reset to None when the root volume is detached.
When the user attaches a new root volume, they can provide a new tag so even if we did not reset the tag, the user can overwrite it. As a user, would you expect the tag to be reset when the root volume is detached or have it persist but be overwritable?
If in this scenario the user then attaches a new root volume that is CentOS or Ubuntu 18.04 or something like that, but forgets to update the tag, then the old tag would be misleading.
The tag is a Nova concept on the attachment. If you detach a volume (root or not) then attach a different one (root or not), to me that's a new attachment, with a new (potentially None) tag. I have no idea how that fits into the semantics around root volume detach, but that's my 2 cents.
So it is probably safest to just reset the tag like Kevin's proposed code is doing, but we could use some wider feedback here.
[1] https://github.com/openstack/nova/blob/33f367ec2f32ce36b00257c11c50844004167... [2] https://github.com/openstack/nova/blob/33f367ec2f32ce36b00257c11c50844004167...
--
Thanks,
Matt
--
Artom Lifshitz
Software Engineer, OpenStack Compute DFG
As for this case, and what Matt mentioned in the patch review:
restriction for attaching volumes with a tag because while it's true we don't know if the compute that the instance is unshelved on will support device tags, we also don't know that during initial server create but we still allow bdm tags in that case. In fact, in the server create case, if you try to create a server with bdm tags on a compute host that does not support them, it will fail the build (not even try to reschedule)
There is something I don't quite understand: what will be different for the volumes that are newly attached versus the existing volumes in the case you mentioned? I mean, the existing volumes could also have tags, and when we unshelve we still have to handle the tags in the BDMs, no matter whether they are existing BDMs or ones newly attached while the instance is in ``shelved_offloaded`` status. What is the difference?

BR,

On Thu, Feb 28, 2019 at 12:08 AM Artom Lifshitz <alifshit@redhat.com> wrote:
On Tue, Feb 26, 2019 at 8:23 AM Matt Riedemann <mriedemos@gmail.com> wrote:
On 2/26/2019 6:40 AM, Zhenyu Zheng wrote:
I'm working on a blueprint to support detaching/attaching root volumes. The blueprint has been proposed for quite a while, since Mitaka[1]; in that version of the proposal we only talked about instances in shelved_offloaded status. In Stein[2] the status of stopped was also added. But now we realize that supporting detach/attach of the root volume on a stopped instance could be problematic, since the underlying image could change, which might invalidate the current host.[3]
So Matt and Sean suggested maybe we could just do it for shelved_offloaded instances, and I have updated the patch according to that comment. I will update the spec later, so if anyone has thoughts on this, please let me know.
I mentioned this during the spec review but didn't push on it I guess, or must have talked myself out of it. We will also have to handle the image potentially changing when attaching a new root volume so that when we unshelve, the scheduler filters based on the new image metadata rather than the image metadata stored in the RequestSpec from when the server was originally created. But for a stopped instance, there is no run through the scheduler again so I don't think we can support that case. Also, there is no real good way for us (right now) to even compare the image ID from the new root volume to what was used to originally create the server because for volume-backed servers the RequestSpec.image.id is not set (I'm not sure why, but that's the way it's always been, the image.id is pop'ed from the metadata [1]). And when we detach the root volume, we null out the BDM.volume_id so we can't get back to figure out what that previous root volume's image ID was to compare, i.e. for a stopped instance we can't enforce that the underlying image is the same to support detach/attach root volume. We could probably hack stuff up by stashing the old volume_id/image_id in system_metadata but I'd rather not play that game.
It also occurs to me that the root volume attach code is also not verifying that the new root volume is bootable. So we really need to re-use this code on root volume attach [2].
tl;dr when we attach a new root volume, we need to update the RequestSpec.image (ImageMeta) object based on the new root volume's underlying volume_image_metadata so that when we unshelve we use that image rather than the original image.
Another thing I wanted to discuss: in the proposal we will reset some fields in the root_bdm instead of deleting the whole record. Among those fields, the tag field could be tricky. My idea was to reset it too, but there could also be cases where users expect it not to change.[4]
Yeah I am not sure what to do here. Here is a scenario:
User boots from volume with a tag "ubuntu1604vol" to indicate it's the root volume with the operating system. Then they shelve offload the server and detach the root volume. At this point, the GET /servers/{server_id}/os-volume_attachments API is going to show None for the volume_id on that BDM, but should it show the original tag or also show None for that? Kevin currently has the tag field being reset to None when the root volume is detached.
When the user attaches a new root volume, they can provide a new tag so even if we did not reset the tag, the user can overwrite it. As a user, would you expect the tag to be reset when the root volume is detached or have it persist but be overwritable?
If in this scenario the user then attaches a new root volume that is CentOS or Ubuntu 18.04 or something like that, but forgets to update the tag, then the old tag would be misleading.
The tag is a Nova concept on the attachment. If you detach a volume (root or not) then attach a different one (root or not), to me that's a new attachment, with a new (potentially None) tag. I have no idea how that fits into the semantics around root volume detach, but that's my 2 cents.
So it is probably safest to just reset the tag like Kevin's proposed code is doing, but we could use some wider feedback here.
[1] https://github.com/openstack/nova/blob/33f367ec2f32ce36b00257c11c50844004167...
[2] https://github.com/openstack/nova/blob/33f367ec2f32ce36b00257c11c50844004167...
--
Thanks,
Matt
--
Artom Lifshitz
Software Engineer, OpenStack Compute DFG
On 2/27/2019 7:41 PM, Zhenyu Zheng wrote:
There is something I don't quite understand: what will be different for the volumes that are newly attached versus the existing volumes in the case you mentioned? I mean, the existing volumes could also have tags, and when we unshelve we still have to handle the tags in the BDMs, no matter whether they are existing BDMs or ones newly attached while the instance is in ``shelved_offloaded`` status. What is the difference?
There isn't, it's a bug:

https://bugs.launchpad.net/nova/+bug/1817927

Which is why I think we should probably lift the restriction in the API so that users can attach volumes with tags to a shelved offloaded instance.

I'm not really comfortable with adding root volume detach/attach support if the user cannot specify a new tag when attaching a new root volume, and to do that we have to remove that restriction on tags + shelved offloaded servers in the API.

--
Thanks,
Matt
Checking the bug report you mentioned, it seems the best solution will rely on the BP you mentioned in the report. One thing I suggest we could do is mention that we will not reset the tag, but that it might not work. Then, when we can support assigning a tag when attaching a volume to shelved_offloaded instances, we perform the reset and update action.

On Thu, Feb 28, 2019 at 10:10 AM Matt Riedemann <mriedemos@gmail.com> wrote:
On 2/27/2019 7:41 PM, Zhenyu Zheng wrote:
There is something I don't quite understand: what will be different for the volumes that are newly attached versus the existing volumes in the case you mentioned? I mean, the existing volumes could also have tags, and when we unshelve we still have to handle the tags in the BDMs, no matter whether they are existing BDMs or ones newly attached while the instance is in ``shelved_offloaded`` status. What is the difference?
There isn't, it's a bug:
https://bugs.launchpad.net/nova/+bug/1817927
Which is why I think we should probably lift the restriction in the API so that users can attach volumes with tags to a shelved offloaded instance.
I'm not really comfortable with adding root volume detach/attach support if the user cannot specify a new tag when attaching a new root volume, and to do that we have to remove that restriction on tags + shelved offloaded servers in the API.
--
Thanks,
Matt
On 2/28/2019 3:35 AM, Zhenyu Zheng wrote:
Then, when we can support assigning a tag when attaching a volume to shelved_offloaded instances, we perform the reset and update action.
Honestly if we're going to do all of that I would rather *not* do the root detach/attach volume stuff if it just means we have to change it later to allow tagging the new root volume.

However, I don't think we really need to wait for driver capabilities traits-based scheduling with a placement request filter either. What is stopping us from simply, within the same microversion for root detach/attach, also lifting the restriction on being able to attach a volume to a shelved offloaded instance with a tag? I realize the restriction was there because we didn't know where the instance would unshelve and it could be on a compute host that doesn't support device tags, but that is also true of the initial server create, so if we're going to support the create scenario in the API I don't see why we don't support the shelved offloaded scenario too.

We don't have any policy rules in place against using tags if you're a full VIO shop with only vmware and don't support tags, but we could always add one to 403 requests that use device tags. Otherwise if you're using libvirt then this would already work.

I realize this is a late change in the design for the root volume detach/attach spec, but I think the two need to go together to avoid releasing a half-baked feature.

--
Thanks,
Matt
On 2/28/2019 8:36 AM, Matt Riedemann wrote:
Honestly if we're going to do all of that I would rather *not* do the root detach/attach volume stuff if it just means we have to change it later to allow tagging the new root volume.
However, I don't think we really need to wait for driver capabilities traits-based scheduling with a placement request filter either. What is stopping us from simply, within the same microversion for root detach/attach, also lifting the restriction on being able to attach a volume to a shelved offloaded instance with a tag? I realize the restriction was there because we didn't know where the instance would unshelve and it could be on a compute host that doesn't support device tags, but that is also true of the initial server create, so if we're going to support the create scenario in the API I don't see why we don't support the shelved offloaded scenario too.
We don't have any policy rules in place against using tags if you're a full VIO shop with only vmware and don't support tags, but we could always add one to 403 requests that use device tags. Otherwise if you're using libvirt then this would already work.
I realize this is a late change in the design for the root volume detach/attach spec, but I think the two need to go together to avoid releasing a half-baked feature.
I've put this on the agenda for today's nova meeting. Here are the options as I see them:

1: lift the restriction in this same microversion so that you can attach volumes to a shelved offloaded server and specify a tag (despite bug 1817927 which we can fix later)

2: reset the tag during root volume detach and don't allow specifying a new tag - this is what we'd get today if we merged Kevin's code as-is

3: don't reset the tag during root volume detach but don't allow overwriting it on root volume attach - this could be poor UX if you need to change the tag

4: same as option 2 but eventually add a new microversion for option 1 thus punting the decision

My preferred option is 1 but time is short to be considering this and we might screw something up. I don't think option 3 is viable given the original tag might not make any sense with a new root volume. Options 2 and 4 get us the root volume detach/attach feature in stein as a short-term win but is half-baked since we still have more changes to make later (and another microversion).

I know another option is just defer to Train when we have more time to think about this but we're nearly there on this feature so I'd like to get it done if we can reach agreement on the design.

--
Thanks,
Matt
On 2/28/2019 12:08 PM, Matt Riedemann wrote:
I've put this on the agenda for today's nova meeting. Here are the options as I see them:
1: lift the restriction in this same microversion so that you can attach volumes to a shelved offloaded server and specify a tag (despite bug 1817927 which we can fix later)
2: reset the tag during root volume detach and don't allow specifying a new tag - this is what we'd get today if we merged Kevin's code as-is
3: don't reset the tag during root volume detach but don't allow overwriting it on root volume attach - this could be poor UX if you need to change the tag
4: same as option 2 but eventually add a new microversion for option 1 thus punting the decision
My preferred option is 1 but time is short to be considering this and we might screw something up. I don't think option 3 is viable given the original tag might not make any sense with a new root volume. Options 2 and 4 get us the root volume detach/attach feature in stein as a short-term win but is half-baked since we still have more changes to make later (and another microversion).
I know another option is just defer to Train when we have more time to think about this but we're nearly there on this feature so I'd like to get it done if we can reach agreement on the design.
We talked about this in the nova meeting today:

http://eavesdrop.openstack.org/meetings/nova/2019/nova.2019-02-28-21.00.log....

It sounds like we might be leaning to option 4 for a couple of reasons:

a) It's getting late in stein to be lifting the restriction on attaching a volume with a tag to a shelved offloaded server and munging that in with the root attach/detach volume API change.

b) As we got talking I noted that for the same reason as a tag, you currently cannot attach a multi-attach volume to a shelved offloaded server, because we don't make the "reserve_block_device_name" RPC call to the compute service to determine if the driver supports it.

c) Once we have code in the API to properly translate tags/multiattach requests to the scheduler (based on [1]) so that we know when we unshelve that we'll pick a host that supports device tags and multiattach volumes, we could lift the restriction in the API with a new microversion.

If we go this route, it means the API reference for the root volume detach/attach change needs to mention that the tag will be reset and cannot be set again on attach of the root volume (at least until we lift that restriction).

[1] https://review.openstack.org/#/c/538498/

--
Thanks,
Matt
On Thu, Feb 28, 2019, 18:29 Matt Riedemann, <mriedemos@gmail.com> wrote:
On 2/28/2019 12:08 PM, Matt Riedemann wrote:
I've put this on the agenda for today's nova meeting. Here are the options as I see them:
1: lift the restriction in this same microversion so that you can attach volumes to a shelved offloaded server and specify a tag (despite bug 1817927 which we can fix later)
2: reset the tag during root volume detach and don't allow specifying a new tag - this is what we'd get today if we merged Kevin's code as-is
3: don't reset the tag during root volume detach but don't allow overwriting it on root volume attach - this could be poor UX if you need to change the tag
4: same as option 2 but eventually add a new microversion for option 1 thus punting the decision
My preferred option is 1 but time is short to be considering this and we might screw something up. I don't think option 3 is viable given the original tag might not make any sense with a new root volume. Options 2 and 4 get us the root volume detach/attach feature in stein as a short-term win but is half-baked since we still have more changes to make later (and another microversion).
I know another option is just defer to Train when we have more time to think about this but we're nearly there on this feature so I'd like to get it done if we can reach agreement on the design.
We talked about this in the nova meeting today:
http://eavesdrop.openstack.org/meetings/nova/2019/nova.2019-02-28-21.00.log....
It sounds like we might be leaning to option 4 for a couple of reasons:
a) It's getting late in stein to be lifting the restriction on attaching a volume with a tag to a shelved offloaded server and munging that in with the root attach/detach volume API change.
Yeah, 1 and 2 end up with the same situation: a tagged volume becoming tagless. In 1 it's ignored by the host during unshelve; in 2 it's reset during the detach. So it becomes a matter of what's faster to implement, and 2 (well, 4, really, because we still need to fix bug 1817927) would be the best choice.
b) As we got talking I noted that for the same reason as a tag, you currently cannot attach a multi-attach volume to a shelved offloaded server, because we don't make the "reserve_block_device_name" RPC call to the compute service to determine if the driver supports it.
c) Once we have code in the API to properly translate tags/multiattach requests to the scheduler (based on [1]) so that we know when we unshelve that we'll pick a host that supports device tags and multiattach volumes, we could lift the restriction in the API with a new microversion.
If we go this route, it means the API reference for the root volume detach/attach change needs to mention that the tag will be reset and cannot be set again on attach of the root volume (at least until we lift that restriction).
[1] https://review.openstack.org/#/c/538498/
--
Thanks,
Matt
We talked about this in the nova meeting today:
http://eavesdrop.openstack.org/meetings/nova/2019/nova.2019-02-28-21.00.log....
It sounds like we might be leaning to option 4 for a couple of reasons:
a) It's getting late in stein to be lifting the restriction on attaching a volume with a tag to a shelved offloaded server and munging that in with the root attach/detach volume API change.
b) As we got talking I noted that for the same reason as a tag, you currently cannot attach a multi-attach volume to a shelved offloaded server, because we don't make the "reserve_block_device_name" RPC call to the compute service to determine if the driver supports it.
c) Once we have code in the API to properly translate tags/multiattach requests to the scheduler (based on [1]) so that we know when we unshelve that we'll pick a host that supports device tags and multiattach volumes, we could lift the restriction in the API with a new microversion.
If we go this route, it means the API reference for the root volume detach/attach change needs to mention that the tag will be reset and cannot be set again on attach of the root volume (at least until we lift that restriction).
So, first off, sorry I got pulled away from the meeting and am now showing up here with alternatives.

My opinion is that it sucks to have to choose between "detaching my root volume" and "can have tags". I think I understand the reasoning for doing the RPC call when attaching to a server on a host to make sure we can honor the tag you want, but I think the shelved case is a bit weird and shouldn't really be held to the same standard. It's always a valid outcome when trying to unshelve an instance to say "sorry, but there's no room for that anymore" or even "sorry, you've had this shelved since our last hardware refresh and your instance no longer fits with the group." Thus, attaching a volume with a tag to a shelved instance making it either unbootable or silently ignore-able seems like not the worst thing, especially if the user can just detach and re-attach without a tag to get out of that situation.

Further, I also think that wiping the tag on detach is the wrong thing to do. The point of tags for both volumes and network devices was to indicate function, not content. Things like "this is my fast SSD-backed database store" or "this is my slow rotating-rust backup disk" or "this is the fast unmetered internal network." They don't really make sense to say "this is ubuntu 16.04" because ... the image (metadata) tells you that. Thus, I would expect someone with a root volume to use a tag like "root" or "boot" or "where the virus lives". Detaching the root volume doesn't likely change that function.

So, IMHO, we should select a path that does not make the user choose between the "tags or detach" dichotomy above. Some options for that are:

1. Don't wipe the tag on detach. On re-attach, let them specify a tag if there is already one set on the BDM. Clearly that means tags were supported before, so a reasonable guess that they still are.

2. Specifically for the above case, either (a) require they pass in the tag they want on the new volume if they want to keep/change it or (b) assume it stays the same unless they specify a null tag to remove it.

3. Always let them specify a tag on attach, and if that means they're not un-shelve-able because no compute nodes support tags, then that's the same case as if you show up with tag-having boot request. Let them de-and-re-attach the volume to wipe the tag and then unshelve.

4. Always let them specify a tag on attach and just not freak out on unshelve if we end up picking a host with no tag support.

Personally, I think 2a or 2b are totally fine, but they might not be as obvious as #3. Either will avoid the user having to make a hard life decision about whether they're going to lose their volume tag forever because they need to detach their root volume. Since someone doing such an operation on an instance is probably caring for a loved pet, that's a sucky position to be in, especially if you're reading the docs after having done so only to realize you've already lost it forever when you did your detach.

--Dan
On 2/28/2019 6:02 PM, Dan Smith wrote:
So, IMHO, we should select a path that does not make the user choose between the "tags or detach" dichotomy above. Some options for that are:
I've got a few questions for clarification below.
1. Don't wipe the tag on detach. On re-attach, let them specify a tag if there is already one set on the BDM. Clearly that means tags were supported before, so a reasonable guess that they still are.
2. Specifically for the above case, either (a) require they pass in the tag they want on the new volume if they want to keep/change it or (b) assume it stays the same unless they specify a null tag to remove it.
Is this (a) *or* (b)?

If I create a server with a device tag on the root bdm, then detach the root volume and am happy with the tag that was there, so on attach of a new root volume I don't specify a new tag - do I lose it because I didn't specify the same tag over again? I would expect that if I had a tag, I can keep it without specifying anything different, or change it by specifying something new (a new tag or null it out). If I didn't have a tag on the root bdm when I detached, I can't specify a new one because we don't know if we can honor it (existing behavior for attaching volumes to shelved instances).

Does what I just described equate to option 1 above? If so, then I'd vote for option 1 here.
3. Always let them specify a tag on attach, and if that means they're not un-shelve-able because no compute nodes support tags, then that's the same case as if you show up with tag-having boot request. Let them de-and-re-attach the volume to wipe the tag and then unshelve.
So this is option 4 but fail the unshelve if the compute driver they land on doesn't support tags, thus making it more restrictive than option 4. I'm not crazy about this one because if we just start checking device tags on unshelve always we will arguably change behavior - granted the behavior we have today is you get lucky and land on a host that supports tags (this is not an unreasonable assumption if you are using a cloud with homogeneous virt drivers which I expect most do) or you land on a host that doesn't and we don't honor the tags, so sorry.
4. Always let them specify a tag on attach and just not freak out on unshelve if we end up picking a host with no tag support.
This was essentially my first option: "1: lift the restriction in this same microversion so that you can attach volumes to a shelved offloaded server and specify a tag (despite bug 1817927 which we can fix later)"

--
Thanks,
Matt
1. Don't wipe the tag on detach. On re-attach, let them specify a tag if there is already one set on the BDM. Clearly that means tags were supported before, so a reasonable guess that they still are.
2. Specifically for the above case, either (a) require they pass in the tag they want on the new volume if they want to keep/change it or (b) assume it stays the same unless they specify a null tag to remove it.
Is this (a) *or* (b)?
Yes because it describes complementary behaviors.
If I create a server with a device tag on the root bdm, then detach the root volume and am happy with the tag that was there, so on attach of a new root volume I don't specify a new tag - do I lose it because I didn't specify the same tag over again? I would expect that if I had a tag, I can keep it without specifying anything different, or change it by specifying something new (a new tag or null it out). If I didn't have a tag on the root bdm when I detached, I can't specify a new one because we don't know if we can honor it (existing behavior for attaching volumes to shelved instances). Does what I just described equate to option 1 above? If so, then I'd vote for option 1 here.
No, what you describe is 2b and that's fine with me. Maybe I should check my outline style, but 1 is the general idea, 2a and 2b are each behaviors we'd choose from when implementing 1.
3. Always let them specify a tag on attach, and if that means they're not un-shelve-able because no compute nodes support tags, then that's the same case as if you show up with tag-having boot request. Let them de-and-re-attach the volume to wipe the tag and then unshelve.
So this is option 4 but fail the unshelve if the compute driver they land on doesn't support tags, thus making it more restrictive than option 4. I'm not crazy about this one because if we just start checking device tags on unshelve always we will arguably change behavior - granted the behavior we have today is you get lucky and land on a host that supports tags (this is not an unreasonable assumption if you are using a cloud with homogeneous virt drivers which I expect most do) or you land on a host that doesn't and we don't honor the tags, so sorry.
Sure, that's fine. It seems like the most obvious option to me, but if it means we'd be tweaking existing behavior, then it makes sense to avoid it.

--Dan
Thanks a lot for all the suggestions. So, as a summary:

1. Do not wipe out the tag on detach.
2. Lift the restriction on attaching a volume with a tag to a shelved_offloaded instance if it is a root volume (we can check whether ``is_root`` is True).
2a. If the user does not provide a tag, we keep the old tag.
2b. If the user provides a new tag, we update to the new tag.
2c. Provide a way for the user to indicate that the previous tag should be nulled out.

Is my understanding correct?

BR,

On Fri, Mar 1, 2019 at 8:32 AM Dan Smith <dms@danplanet.com> wrote:
1. Don't wipe the tag on detach. On re-attach, let them specify a tag if there is already one set on the BDM. Clearly that means tags were supported before, so a reasonable guess that they still are.
2. Specifically for the above case, either (a) require they pass in the tag they want on the new volume if they want to keep/change it or (b) assume it stays the same unless they specify a null tag to remove it.
Is this (a) *or* (b)?
Yes because it describes complementary behaviors.
If I create a server with a device tag on the root bdm, then detach the root volume and am happy with the tag that was there, so on attach of a new root volume I don't specify a new tag - do I lose it because I didn't specify the same tag over again? I would expect that if I had a tag, I can keep it without specifying anything different, or change it by specifying something new (a new tag or null it out). If I didn't have a tag on the root bdm when I detached, I can't specify a new one because we don't know if we can honor it (existing behavior for attaching volumes to shelved instances). Does what I just described equate to option 1 above? If so, then I'd vote for option 1 here.
No, what you describe is 2b and that's fine with me. Maybe I should check my outline style, but 1 is the general idea, 2a and 2b are each behaviors we'd choose from when implementing 1.
3. Always let them specify a tag on attach, and if that means they're not un-shelve-able because no compute nodes support tags, then that's the same case as if you show up with tag-having boot request. Let them de-and-re-attach the volume to wipe the tag and then unshelve.
So this is option 4 but fail the unshelve if the compute driver they land on doesn't support tags, thus making it more restrictive than option 4. I'm not crazy about this one because if we just start checking device tags on unshelve always we will arguably change behavior - granted the behavior we have today is you get lucky and land on a host that supports tags (this is not an unreasonable assumption if you are using a cloud with homogeneous virt drivers which I expect most do) or you land on a host that doesn't and we don't honor the tags, so sorry.
Sure, that's fine. It seems like the most obvious option to me, but if it means we'd be tweaking existing behavior, then it makes sense to avoid it.
--Dan
On 2/28/2019 7:54 PM, Zhenyu Zheng wrote:
Thanks a lot for all the suggestions. So, as a summary:

1. Do not wipe out the tag on detach.
2. Lift the restriction on attaching a volume with a tag to a shelved_offloaded instance if it is a root volume (we can check whether ``is_root`` is True).
2a. If the user does not provide a tag, we keep the old tag.
2b. If the user provides a new tag, we update to the new tag.
2c. Provide a way for the user to indicate that the previous tag should be nulled out.
Is my understanding correct?
I think so. For 2c, that gets complicated because the schema for the tag is not nullable [1][2]. So then you're left with either leaving the tag you had or overwriting it, but you can't go from no tag to tag or vice-versa (without a schema change....).

Also, one final note here: multiattach volumes are in the same conundrum, meaning they aren't allowed to attach to a shelved offloaded instance, but if we draw on Dan's use case, picking between detaching a root volume and not being able to attach another multiattach root volume isn't a good option. So if we're going to do this for tags, I think we ought to do it for multiattach volumes as well. In other words, if the root volume was multiattach before it was detached, then the new root volume can be multiattach, but you can't go from non-multiattach to multiattach while shelved offloaded. The crappy thing about multiattach is that it's not a clear attribute on the BDM object, it's buried in the BDM.connection_info [3], but we can still sort that out with a helper method (which means you can't reset the connection_info when detaching the root volume).

[1] https://github.com/openstack/nova/blob/a8c065dea946599a1b07d003cd21409c4cd58...
[2] https://github.com/openstack/nova/blob/a8c065dea946599a1b07d003cd21409c4cd58...
[3] https://github.com/openstack/nova/blob/a8c065dea946599a1b07d003cd21409c4cd58...

--
Thanks,
Matt
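A sketch of the kind of helper being suggested for the multiattach case, assuming the flag lives at the top level of the stashed connection_info blob (which is where the libvirt driver looks for it); this is illustrative, not actual Nova code.

    from oslo_serialization import jsonutils

    def bdm_was_multiattach(bdm):
        """Best-effort check of whether a (now detached) root BDM pointed at
        a multiattach volume, based on its preserved connection_info."""
        if not bdm.connection_info:
            return False
        connection_info = jsonutils.loads(bdm.connection_info)
        return connection_info.get('multiattach', False)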
A silly method could be to add a new param "cleanup_original_tag" that could only be provided if "is_root"=True and no new tag is provided. So when cleanup_original_tag=True, or when the user provides a new tag, we clean up/update the old tag; otherwise we keep it.

On Fri, Mar 1, 2019 at 10:36 AM Matt Riedemann <mriedemos@gmail.com> wrote:
On 2/28/2019 7:54 PM, Zhenyu Zheng wrote:
Thanks a lot for all the suggestions. So, as a summary:

1. Do not wipe out the tag on detach.
2. Lift the restriction on attaching a volume with a tag to a shelved_offloaded instance if it is a root volume (we can check whether ``is_root`` is True).
2a. If the user does not provide a tag, we keep the old tag.
2b. If the user provides a new tag, we update to the new tag.
2c. Provide a way for the user to indicate that the previous tag should be nulled out.
Is my understanding correct?
I think so. For 2c, that gets complicated because the schema for the tag is not nullable [1][2]. So then you're left with either leaving the tag you had or overwriting it, but you can't go from no tag to tag or vice-versa (without a schema change....).

Also, one final note here: multiattach volumes are in the same conundrum, meaning they aren't allowed to attach to a shelved offloaded instance, but if we draw on Dan's use case, picking between detaching a root volume and not being able to attach another multiattach root volume isn't a good option. So if we're going to do this for tags, I think we ought to do it for multiattach volumes as well. In other words, if the root volume was multiattach before it was detached, then the new root volume can be multiattach, but you can't go from non-multiattach to multiattach while shelved offloaded. The crappy thing about multiattach is that it's not a clear attribute on the BDM object, it's buried in the BDM.connection_info [3], but we can still sort that out with a helper method (which means you can't reset the connection_info when detaching the root volume).
[1] https://github.com/openstack/nova/blob/a8c065dea946599a1b07d003cd21409c4cd58...
[2] https://github.com/openstack/nova/blob/a8c065dea946599a1b07d003cd21409c4cd58...
[3] https://github.com/openstack/nova/blob/a8c065dea946599a1b07d003cd21409c4cd58...
--
Thanks,
Matt
On 3/1/2019 3:56 AM, Zhenyu Zheng wrote:
A silly method could be to add a new param "cleanup_original_tag" that could only be provided if "is_root"=True and no new tag is provided. So when cleanup_original_tag=True, or when the user provides a new tag, we clean up/update the old tag; otherwise we keep it.
I thought about that yesterday when I was talking with Dan about this, and while it's an option, I think it's a pretty terrible user experience so I'd like to avoid that. I think it's probably good enough that you either keep the tag you had or you can overwrite it. I don't really want to start adding new parameters for weird edge cases here.

--
Thanks,
Matt
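Summing up where the thread landed, as a small sketch rather than proposed code: on attaching a new root volume to a shelved offloaded server, the stored tag is either kept or overwritten; there is no way to null it out without a request schema change.

    def resolve_root_volume_tag(existing_tag, requested_tag):
        # requested_tag is None when the attach request omitted 'tag'
        # entirely (the schema does not accept an explicit null).
        if requested_tag is not None:
            return requested_tag  # user supplied a new tag: overwrite
        return existing_tag       # no tag in the request: keep the old one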
participants (5)
- Artom Lifshitz
- Dan Smith
- Matt Riedemann
- Sean Mooney
- Zhenyu Zheng