[openstack-dev] [nova] Should we make rebuild + new image on a volume-backed instance fail fast?

John Griffith john.griffith8 at gmail.com
Fri Oct 6 17:35:12 UTC 2017


On Fri, Oct 6, 2017 at 11:22 AM, Matt Riedemann <mriedemos at gmail.com> wrote:

> This came up in IRC discussion the other day, but we didn't dig into it
> much given we were all (2 of us) exhausted talking about rebuild.
>
> But we have had several bugs over the years where people expect the root
> disk to change to a newly supplied image during rebuild even if the
> instance is volume-backed.
>
> I distilled several of those bugs down to just this one and duplicated the
> rest:
>
> https://bugs.launchpad.net/nova/+bug/1482040
>
> I wanted to see if there is actually any failure on the backend when doing
> this, and there isn't - there is no instance fault or anything like that.
> It's just not what the user expects, and actually the instance image_ref is
> then shown later as the image specified during rebuild, even though that's
> not the actual image in the root disk (the volume).
>
> There have been a couple of patches proposed over time to change this:
>
> https://review.openstack.org/#/c/305079/
>
> https://review.openstack.org/#/c/201458/
>
> https://review.openstack.org/#/c/467588/
>
> And Paul Murray had a related (approved) spec at one point for detach and
> attach of root volumes:
>
> https://review.openstack.org/#/c/221732/
>
> But the blueprint was never completed.
>
> So with all of this in mind, should we at least consider, until at least
> someone owns supporting this, that the API should fail with a 400 response
> if you're trying to rebuild with a new image on a volume-backed instance?
> That way it's a fast failure in the API, similar to trying to backup a
> volume-backed instance fails fast.
>
​Seems reasonable and correct to me, if we were dealing with ephemeral
Cinder (which isn't a thing) we'd just delete the volume, create a new one
with new image.  In this case however the point of BFV for must is
persistence so it seems reasonable to me to start with changing the
response.​


>
> If we did, that would change the API response from a 202 today to a 400,
> which is something we normally don't do. I don't think a microversion would
> be necessary if we did this, however, because essentially what the user is
> asking for isn't what we're actually giving them, so it's a failure in an
> unexpected way even if there is no fault recorded, it's not what the user
> asked for. I might not be thinking of something here though, like
> interoperability for example - a cloud without this change would blissfully
> return 202 but a cloud with the change would return a 400...so that should
> be considered.

​It's a bug IMO, if you're relying on an incorrect response code and not
getting what you expect it seems that should be more important than
consistent behavior.  ​


>
>
> --
>
> Thanks,
>
> Matt
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20171006/046113e5/attachment.html>


More information about the OpenStack-dev mailing list