[openstack-dev] Improvement of Cinder API wrt https://bugs.launchpad.net/nova/+bug/1213953

John Griffith john.griffith at solidfire.com
Tue Nov 5 14:22:06 UTC 2013


On Nov 5, 2013 3:33 PM, "Avishay Traeger" <AVISHAY at il.ibm.com> wrote:
>
> So while doubling the timeout will fix some cases, there will be cases with
> larger volumes and/or slower systems where the bug will still hit.  Even
> timing out on the download progress can lead to unnecessary timeouts (if
> it's really slow, or volume is really big, it can stay at 5% for some
> time).
>
> I think the proper fix is to make sure that Cinder is moving the volume
> into 'error' state in all cases where there is an error.  Nova can then
> poll as long as it's in the 'downloading' state, until it's 'available' or
> 'error'.

Agreed.
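
For what it's worth, here is a rough sketch of the polling loop described
above, in Python. The get_status callable and the VolumeError class are made
up for illustration; they are not real Cinder or Nova APIs. It also assumes
that 'creating' is the only transitional status besides 'downloading', and it
only terminates if Cinder reliably reaches 'available' or 'error'.

```python
import time


class VolumeError(Exception):
    """Raised when the volume ends up in 'error' or an unexpected state."""


def wait_for_volume(get_status, poll_interval=1.0, sleep=time.sleep):
    """Poll until the volume leaves its transitional states.

    get_status is a hypothetical zero-argument callable that returns the
    current volume status string. There is deliberately no wall-clock
    timeout: the loop relies on the volume eventually reaching
    'available' or 'error'.
    """
    # Assumed set of "still working" states; adjust to the real ones.
    transitional = ("creating", "downloading")
    while True:
        status = get_status()
        if status == "available":
            return status
        if status == "error":
            raise VolumeError("volume creation failed")
        if status not in transitional:
            raise VolumeError("unexpected volume status: %s" % status)
        sleep(poll_interval)
```

A caller would pass something that looks up the volume's status, and would
catch VolumeError to fail the boot request instead of waiting out a fixed
timer.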

> Is there a reason why Cinder would legitimately get stuck in
> 'downloading'?
>
> Thanks,
> Avishay
>
>
>
> From:   John Griffith <john.griffith at solidfire.com>
> To:     "OpenStack Development Mailing List (not for usage questions)"
>             <openstack-dev at lists.openstack.org>,
> Date:   11/05/2013 07:41 AM
> Subject:        Re: [openstack-dev] Improvement of Cinder API wrt
>             https://bugs.launchpad.net/nova/+bug/1213953
>
>
>
> On Tue, Nov 5, 2013 at 7:27 AM, John Griffith
> <john.griffith at solidfire.com> wrote:
> > On Tue, Nov 5, 2013 at 6:29 AM, Chris Friesen
> > <chris.friesen at windriver.com> wrote:
> >> On 11/04/2013 03:49 PM, Solly Ross wrote:
> >>>
> > >>> So, there's currently an outstanding issue with regard to a Nova
> >>> shortcut command that creates a volume from an image and then boots
> >>> from it in one fell swoop.  The gist of the issue is that there is
> >>> currently a set timeout which can time out before the volume creation
> >>> has finished (it's designed to time out in case there is an error),
> >>> in cases where the image download or volume creation takes an
> >>> extended period of time (e.g. under a Gluster backend for Cinder with
> >>> certain network conditions).
> >>>
> >>> The proposed solution is a modification to the Cinder API to provide
> >>> more detail on what exactly is going on, so that we could
> >>> programmatically tune the timeout.  My initial thought is to create a
> >>> new column in the Volume table called 'status_detail' to provide more
> >>> detailed information about the current status.  For instance, for the
> >>> 'downloading' status, we could have 'status_detail' be the completion
> >>> percentage or JSON containing the total size and the current amount
> >>> copied.  This way, at each interval we could check to see if the
> >>> amount copied had changed, and trigger the timeout if it had not,
> >>> instead of blindly assuming that the operation will complete within a
> >>> given amount of time.
> >>>
> >>> What do people think?  Would there be a better way to do this?
> >>
> >>
> >> The only other option I can think of would be some kind of callback that
> >> cinder could explicitly call to drive updates and/or notifications of faults
> >> rather than needing to wait for a timeout.  Possibly a combination of both
> >> would be best, that way you could add a --poll option to the "create volume
> >> and boot" CLI command.
> >>
> >> I come from the kernel-hacking world and most things there involve
> >> event-driven callbacks.  Looking at the openstack code I was kind of
> >> surprised to see hardcoded timeouts and RPC casts with no callbacks to
> >> indicate completion.
> >>
> >> Chris
> >>
> >>
> >> _______________________________________________
> >> OpenStack-dev mailing list
> >> OpenStack-dev at lists.openstack.org
> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
> I believe you're referring to [1], which was closed after a patch was
> added to nova to double the timeout length.  Based on the comments, it sounds
> like you're still seeing issues on some Gluster (maybe other) setups?
>
> Rather than mess with the API in order to debug, why don't you use
> the info in the cinder logs?
>
> [1] https://bugs.launchpad.net/nova/+bug/1213953
>