[Openstack] Nova and asynchronous instance launching

David Kranz david.kranz at qrclab.com
Fri Jun 29 17:50:59 UTC 2012


An assumption is being made here that the "user" and "cloud provider" 
are unrelated. But I think there are many projects under development 
where a cloud-based service is being provided on top of an OpenStack 
infrastructure. In that use case, the direct user of the OpenStack APIs 
and the "cloud provider" may be the same entity. It would be really nice 
if, when an application fires up an instance that enters the ERROR state, 
there were an API that could return the reason it failed, with as much 
information as the OpenStack code that set the instance state to ERROR had.
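
To make the request concrete, here is a minimal sketch in Python of what
such a lookup could look like. It is not Nova code: InstanceFault,
FaultStore, record and latest are hypothetical names, and the in-memory
dictionary stands in for whatever store a real implementation would use.
The only point is that the code setting the state to ERROR records what it
knows at that moment, so a later API call can return it.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class InstanceFault:
    """Hypothetical record of why an instance entered ERROR (not a Nova type)."""
    instance_id: str
    component: str     # where the failure happened, e.g. "scheduler" or "compute"
    message: str       # short reason suitable for showing to a user
    details: str = ""  # fuller context, e.g. a traceback
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class FaultStore:
    """Hypothetical in-memory store, keyed by instance id."""

    def __init__(self) -> None:
        self._faults: Dict[str, List[InstanceFault]] = {}

    def record(self, fault: InstanceFault) -> None:
        # Called by whatever code sets the instance state to ERROR.
        self._faults.setdefault(fault.instance_id, []).append(fault)

    def latest(self, instance_id: str) -> Optional[InstanceFault]:
        # Return the most recently recorded fault, or None if there is none.
        faults = self._faults.get(instance_id)
        return faults[-1] if faults else None
```

A caller that sees an instance in ERROR would then ask for
store.latest(instance_id) and read the component and message fields.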

If we are concerned that such information is sensitive and a public 
provider might not want to give it all to users, this could be an 
admin-only API. There are many variations on how access to the 
information could be controlled.
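
As one illustration of how that control might work, the sketch below
returns the full failure record to an admin and only a short summary to
everyone else. This is an assumption on my part, not an existing OpenStack
interface: fault_view, PUBLIC_FIELDS and the field names in the record are
all hypothetical.

```python
from typing import Any, Dict

# Hypothetical set of fields that are safe to show to any user.
PUBLIC_FIELDS = ("instance_id", "message")


def fault_view(fault: Dict[str, Any], is_admin: bool) -> Dict[str, Any]:
    """Return the part of a failure record the caller is allowed to see."""
    if is_admin:
        # Admins get a copy of everything that was recorded.
        return dict(fault)
    # Other users get only the fields listed in PUBLIC_FIELDS.
    return {key: fault[key] for key in PUBLIC_FIELDS if key in fault}


fault = {
    "instance_id": "i-1",
    "message": "Image deleted before download",
    "host": "compute-17",           # operationally sensitive
    "details": "traceback text",    # operationally sensitive
}
user_view = fault_view(fault, is_admin=False)
admin_view = fault_view(fault, is_admin=True)
```

A provider that is also the only user could simply treat every caller as
an admin, while a public provider could keep the restricted view as the
default.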

  -David

On 6/29/2012 11:40 AM, Day, Phil wrote:
>
> > However, considering the unhappy-path for a second, is there a place
> > for surfacing some more context as to why the new instance unexpectedly
> > went into the ERROR state?
>
> I assume the philosophy is that the API has validated the request as 
> far as it can, and returned any meaningful error messages, etc. 
> Anything that fails past that point is something going wrong on the 
> cloud provider's side, and there is nothing the user could have done to 
> avoid the error, so any additional information won't help them.
>
> However, on the basis that up-front validation is seldom perfect, and 
> things can change while a request is in flight, I think that being able 
> to tell a user that, for example, their request failed because the 
> image was deleted before it could be downloaded would be useful.
>
> One approach might be to make the task_state more granular and use 
> that to qualify the error. In general, our users have found it useful 
> to have the state shown as "vm_state (task_state)", as it shows 
> progress during things like building.
>
> Phil
>
> From: openstack-bounces+philip.day=hp.com at lists.launchpad.net 
> On Behalf Of Doug Davis
> Sent: 29 June 2012 12:45
> To: Eoghan Glynn
> Cc: openstack at lists.launchpad.net
> Subject: Re: [Openstack] Nova and asynchronous instance launching
>
>
> Right - examining the current state isn't a good way to determine what 
> happened with one particular request.  This is exactly one of the 
> reasons some providers create Jobs for all actions.  Checking the 
> resource "later" to see why something bad happened is fragile since 
> other operations might have happened since then, erasing any "error 
> message" type of state info.  And relying on event/error logs is hard 
> since correlating one particular action with a flood of events is 
> tricky - especially in a multi-user environment where several actions 
> could be underway at once.  If each action resulted in a Job URI being 
> returned, then the client could check that Job resource when it's 
> convenient for them - and this could be quite useful in both happy and 
> unhappy situations.
>
> And to be clear, a Job doesn't necessarily need to be a full new 
> resource; it could (under the covers) map to a grouping of event log 
> entries, but the point is that from a client's perspective they have an 
> easy mechanism (e.g. issue a GET to a single URI) that returns all of 
> the info needed to determine what happened with one particular operation.
>
> thanks
> -Doug
> ______________________________________________________
> STSM |  Standards Architect  |  IBM Software Group
> (919) 254-6905  |  IBM 444-6905  | dug at us.ibm.com
> The more I'm around some people, the more I like my dog.
>
> Eoghan Glynn <eglynn at redhat.com>
> 06/29/2012 06:00 AM
>
> To: Doug Davis/Raleigh/IBM at IBMUS
> cc: openstack at lists.launchpad.net, Jay Pipes <jaypipes at gmail.com>
> Subject: Re: [Openstack] Nova and asynchronous instance launching
>
> > Note that I do distinguish between a 'real' async op (where you
> > really return little more than a 202) and one that returns a
> > skeleton of the resource being created - like instance.create() does
> > now.
>
> So the latter approach at least provides a way to poll on the resource
> status, so as to figure out if and when it becomes usable.
>
> In the happy-path, eventually the instance status transitions to
> ACTIVE and away we go.
>
> However, considering the unhappy-path for a second, is there a place
> for surfacing some more context as to why the new instance unexpectedly
> went into the ERROR state?
>
> For example even just an indication that failure occurred in the scheduler
> (e.g. resource starvation) or on the target compute node. Is the thought
> that such information may be operationally sensitive, or just TMI for a
> typical cloud user?
>
> Cheers,
> Eoghan
>
