[openstack-dev] Improvement of Cinder API wrt https://bugs.launchpad.net/nova/+bug/1213953

John Griffith john.griffith at solidfire.com
Mon Nov 4 23:31:54 UTC 2013


On Tue, Nov 5, 2013 at 7:27 AM, John Griffith
<john.griffith at solidfire.com> wrote:
> On Tue, Nov 5, 2013 at 6:29 AM, Chris Friesen
> <chris.friesen at windriver.com> wrote:
>> On 11/04/2013 03:49 PM, Solly Ross wrote:
>>>
>>> So, there's currently an outstanding issue with regard to a Nova
>>> shortcut command that creates a volume from an image and then boots
>>> from it in one fell swoop.  The gist of the issue is that there is
>>> currently a set timeout which can time out before the volume creation
>>> has finished (it's designed to time out in case there is an error),
>>> in cases where the image download or volume creation takes an
>>> extended period of time (e.g. under a Gluster backend for Cinder with
>>> certain network conditions).
>>>
>>> The proposed solution is a modification to the Cinder API to provide
>>> more detail on what exactly is going on, so that we could
>>> programmatically tune the timeout.  My initial thought is to create a
>>> new column in the Volume table called 'status_detail' to provide more
>>> detailed information about the current status.  For instance, for the
>>> 'downloading' status, we could have 'status_detail' be the completion
>>> percentage or JSON containing the total size and the current amount
>>> copied.  This way, at each interval we could check to see if the
>>> amount copied had changed, and trigger the timeout if it had not,
>>> instead of blindly assuming that the operation will complete within a
>>> given amount of time.
>>>
>>> What do people think?  Would there be a better way to do this?
>>
>>
>> The only other option I can think of would be some kind of callback that
>> cinder could explicitly call to drive updates and/or notifications of faults
>> rather than needing to wait for a timeout.  Possibly a combination of both
>> would be best, that way you could add a --poll option to the "create volume
>> and boot" CLI command.
>>
>> I come from the kernel-hacking world and most things there involve
>> event-driven callbacks.  Looking at the openstack code I was kind of
>> surprised to see hardcoded timeouts and RPC casts with no callbacks to
>> indicate completion.
>>
>> Chris

I believe you're referring to [1], which was closed after a patch was
added to nova to double the timeout length.  Based on the comments, it
sounds like you're still seeing issues on some Gluster (and maybe
other) setups?

Rather than mess with the API in order to debug this, why don't you
use the info in the Cinder logs?

[1] https://bugs.launchpad.net/nova/+bug/1213953
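
For reference, the progress-based timeout proposed in the quoted message
could be sketched roughly as below.  This is only an illustration and not
code from Nova or Cinder: the 'status_detail' field does not exist today,
and get_status, wait_for_volume, VolumeStalled and the 'copied'/'total'
keys are all hypothetical names chosen for the example.  The idea is to
fail only when the amount copied stops changing, instead of after a fixed
total time.

```python
import time
from typing import Callable, Optional


class VolumeStalled(Exception):
    """Raised when no progress is observed within the stall timeout."""


def wait_for_volume(
    get_status: Callable[[], dict],
    stall_timeout: float = 60.0,
    poll_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the volume is 'available'; give up only if progress stalls.

    get_status is a hypothetical callable returning a dict such as
    {"status": "downloading", "status_detail": {"copied": 10, "total": 100}}.
    """
    last_copied: Optional[int] = None
    last_progress_at = clock()
    while True:
        info = get_status()
        if info["status"] == "available":
            return info
        if info["status"] == "error":
            raise RuntimeError("volume creation failed")
        copied = (info.get("status_detail") or {}).get("copied")
        now = clock()
        if copied is not None and copied != last_copied:
            # The amount copied changed, so restart the stall timer.
            last_copied = copied
            last_progress_at = now
        elif now - last_progress_at >= stall_timeout:
            raise VolumeStalled("no progress for %.0f seconds" % stall_timeout)
        sleep(poll_interval)
```

The clock and sleep parameters are there only so the loop can be
exercised without real waiting; a caller would normally leave them at
their defaults and pass a get_status that queries the volume.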


