[Openstack-operators] Graceful VM shutdown

Matt Van Winkle mvanwink at rackspace.com
Thu May 16 21:46:46 UTC 2013


Ah yes, "rebooting" the cloud is never fun.  This is some really interesting stuff, Narayan.  I need to spend some more time with it.

One of the things we are putting a lot of hope in is XenServer 6.1's live migration between hosts without shared storage.  We are still in the very early stages of testing it, but as it makes its way out into the environment, our hope is that we can run regular update jobs, like patching, so that they crawl through the environments a few hypervisors at a time, moving instances as necessary.  It would almost be a giant game of OpenStack Tetris.  There is still a ton to figure out, but it is something we are thinking about for the long term.
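To make the "a few hypervisors at a time" idea concrete, here is a rough planning sketch. It is only an illustration under stated assumptions: the function names, the host-to-instances dictionary, and the fixed per-host capacity are all hypothetical, and nothing here calls XenServer or OpenStack.

```python
from typing import Dict, List, Tuple


def plan_batches(hosts: Dict[str, List[str]], batch_size: int) -> List[List[str]]:
    """Split hypervisor names into maintenance batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    names = sorted(hosts)
    return [names[i:i + batch_size] for i in range(0, len(names), batch_size)]


def evacuation_moves(
    hosts: Dict[str, List[str]], batch: List[str], capacity: int
) -> List[Tuple[str, str, str]]:
    """Plan (instance, source, target) moves that empty every host in batch.

    Hypothetical model: each host holds at most `capacity` instances, and
    each instance goes to the least-loaded host outside the batch.
    """
    targets = [h for h in sorted(hosts) if h not in batch]
    load = {h: len(hosts[h]) for h in targets}
    moves = []
    for source in batch:
        for instance in hosts[source]:
            candidates = [h for h in targets if load[h] < capacity]
            if not candidates:
                raise RuntimeError("no spare capacity for " + instance)
            # Ties are broken by host name so the plan is deterministic.
            target = min(candidates, key=lambda h: (load[h], h))
            load[target] += 1
            moves.append((instance, source, target))
    return moves


if __name__ == "__main__":
    cloud = {"hv1": ["a", "b"], "hv2": ["c"], "hv3": []}
    for batch in plan_batches(cloud, batch_size=1):
        print(batch, evacuation_moves(cloud, batch, capacity=2))
```

The sketch only plans moves against a static snapshot; a real job would have to re-read placement after each batch, since the instances have actually moved by then.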

An additional benefit of an approach like this would be "defragging" clouds which, from a service provider standpoint, would allow for more efficient use of resources.

The metadata concept below is something my engineers and I have discussed before.  There was talk recently of additional statuses for hypervisors, and we probably still need a couple of those, but I don't think you could create a list robust enough to cover everyone's use cases.  I like the approach that poncho is taking.  For example, we could use the concept of a "draining" host to plan maintenance across collections of capacity, both for the short term and to better manage processes like the one I described above.
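As a rough illustration of the "draining" idea, the sketch below treats host status as free-form metadata instead of a fixed list of states. The field names ("status", "instances") and both helper functions are hypothetical; this is not poncho's or OpenStack's API.

```python
from typing import Dict, List


def schedulable_hosts(host_metadata: Dict[str, dict]) -> List[str]:
    """Return hosts that may receive new instances (anything not draining)."""
    return sorted(
        name
        for name, meta in host_metadata.items()
        if meta.get("status") != "draining"
    )


def ready_for_maintenance(host_metadata: Dict[str, dict]) -> List[str]:
    """Return draining hosts that no longer run any instances."""
    return sorted(
        name
        for name, meta in host_metadata.items()
        if meta.get("status") == "draining" and not meta.get("instances")
    )


if __name__ == "__main__":
    hosts = {
        "hv1": {"status": "draining", "instances": []},
        "hv2": {"status": "draining", "instances": ["vm-7"]},
        "hv3": {"instances": ["vm-1", "vm-2"]},
    }
    print(schedulable_hosts(hosts))
    print(ready_for_maintenance(hosts))
```

Because the status is just a metadata value, an operator could add other states later without changing the filtering logic.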

Nice work, guys!  I'll point my engineers at this stuff and see if some of them can jump in and help push it along.

Thanks!
Matt

--
Matt Van Winkle
Manager, Cloud Engineering
Rackspace

210-312-4442(w)
mvanwink at rackspace.com

From: Tim Bell <Tim.Bell at cern.ch>
Date: Thursday, May 16, 2013 4:16 PM
To: Narayan Desai <narayan.desai at gmail.com>
Cc: openstack-operators at lists.openstack.org, Scott Devoid <sdevoid at gmail.com>
Subject: Re: [Openstack-operators] Graceful VM shutdown


Poncho would be my nomination for best project name, although Ironic deserves a special award also.

Looks very interesting …

Tim

From: Narayan Desai [mailto:narayan.desai at gmail.com]
Sent: 16 May 2013 22:52
To: Tim Bell
Cc: openstack-operators at lists.openstack.org; Scott Devoid; Lorin Hochstein
Subject: Re: [Openstack-operators] Graceful VM shutdown

We (Scott Devoid, Lorin Hochstein, and myself) have built something to help with this class of problem, and submitted a paper to LISA about it. The paper is still under review, but the code is already up on github. We'd love feedback on the code or approach.

Our basic goal here was to start working with a user-annotated approach to load shedding, which will open the door to a bunch of really interesting resource management techniques and reduce the impact of basic system operations tasks.

The code is called poncho (since we needed it to deal with our "full cloud" ;) and is available here:
https://github.com/magellancloud/poncho

It is definitely still under development, but we hope that it will prove out the concepts before we try to build a version that would be suitable for integration into openstack.

I've attached a preprint of the paper to this mail.
 -nld



On Thu, May 16, 2013 at 1:02 PM, Tim Bell <Tim.Bell at cern.ch> wrote:

I am interested to see how other service providers handle the cases where there is a need to reboot a hypervisor but it is planned (such as a reboot after OS patching).

We do not want to do live or block migration for these cases as there is no shared storage and block migration can be a bit unreliable at times.

In my ideal case, we would be able to warn the VMs in some way so they can do a reasonable job of shutting down. In some cases, our users would like many minutes of notice to complete their current transaction.

Does anyone know of a standard mechanism for a hypervisor to warn its guests of an impending shutdown?

Tim


_______________________________________________
OpenStack-operators mailing list
OpenStack-operators at lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
