[Openstack-operators] Live migration
matt.joyce at cloudscaling.com
Fri Sep 7 22:31:08 UTC 2012
I agree largely with Diego on this.
As far as nova-compute failure is concerned, it really falls to the instance
owner to detect whether their instance has failed and to initiate whatever
failover actions they wish to employ. The world of false positives is
tricky. That said, defining an SOP for failing over instances when a
nova-compute host goes down, and adding it to the documentation, is
absolutely in our interest.
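To illustrate why false positives are tricky, here is a minimal sketch of the kind of logic an instance owner might run on their side. It is not anything OpenStack ships; the class name, the function name, and the threshold of 3 are all hypothetical. It only declares a host failed after several consecutive missed health checks, and it refuses to fail over until the old host is confirmed fenced, so the same VM cannot end up running on two compute nodes.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class HostFailureDetector:
    """Declare a host failed only after several consecutive missed checks."""

    threshold: int = 3  # hypothetical value; tune for your environment
    misses: Dict[str, int] = field(default_factory=dict)

    def record(self, host: str, alive: bool) -> bool:
        """Record one health-check result.

        Returns True once the host has missed `threshold` checks in a row.
        A single successful check resets the counter, which filters out
        short network blips.
        """
        if alive:
            self.misses[host] = 0
            return False
        self.misses[host] = self.misses.get(host, 0) + 1
        return self.misses[host] >= self.threshold


def should_fail_over(detector_says_failed: bool, host_fenced: bool) -> bool:
    """Fail over only when the old host is confirmed fenced (powered off
    or isolated), to avoid running the same VM on two nodes at once."""
    return detector_says_failed and host_fenced
```

How the health check itself is performed and how the host gets fenced are left out on purpose; those are exactly the parts an SOP would need to spell out.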
On Thu, Sep 6, 2012 at 12:38 PM, Diego Parrilla Santamaría <
diego.parrilla.santamaria at gmail.com> wrote:
> I think you are mixing live migration of VMs and compute node fail over.
> Live migration assumes your compute nodes are running healthy (or healthy
> enough) and you want to move VMs around in order to perform updates or fix
> problems on specific nodes.
> Failure detection of a compute node seems to be easy, but it's not. There
> are a ton of events that could raise a false positive and the logic to
> filter these events and perform a recovery could lead to VM duplication:
> the same VM running in two compute nodes at the same time. That's the
> reason we let our customers use our APIs to implement their own failure
> detection process and then trigger the failover action.
> Compute node failover means one or more compute nodes fail and you need to
> replace the failing compute nodes with some spare server you have around
> preconfigured in your datacenter. If the compute node has only partially
> failed, you might live migrate the VMs first, and then replace the
> compute node. This is not a complex task with OpenStack.
> But if the compute node has failed completely, then you need to
> replace the compute node and restore the previous node's configuration
> on the replacement. The state of your VMs is lost, but OpenStack can
> spin them up again. The latter needs not only shared storage, but some
> logic to recover the OS configuration, nova-compute configuration, and the
> most important thing: libvirt configuration (if you are running KVM, of
> Diego Parrilla
> www.stackops.com | diego.parrilla at stackops.com | +34 649 94 43 29
> <http://www.stackops.com/>
> On Thu, Sep 6, 2012 at 9:16 PM, Paul Walton <paul.d.walton at gmail.com> wrote:
>> Well, I guess my real question next is, what happens when a VM is running
>> on a compute node and that node simply fails for whatever reason? Does
>> OpenStack have any way to detect a compute node failure, and attempt to
>> migrate its VMs to another node? I realize that the real VM and the
>> migrated VM may be out of sync with each other, but I'm assuming that a
>> relatively recent version of the VM is still present on the distributed
>> file system to facilitate migration.
>> Obviously, regular backups would be required for a real disaster
>> recovery, but the VMs I'm talking about won't be changing much, and being a
>> few minutes out of sync won't be a problem.
>> On Thu, Sep 6, 2012 at 1:32 PM, Joe Topjian <joe.topjian at cybera.ca> wrote:
>>> Hi Paul,
>>> OpenStack does not have anything like this yet. Anton Beloglazov has
>>> created a blueprint from some research work he's done that would provide
>>> such a feature:
>>> On Thu, Sep 6, 2012 at 12:26 PM, Paul Walton <paul.d.walton at gmail.com> wrote:
>>>> I have a quick question about live migration of VMs. I can see how to
>>>> do this manually from the command line, i.e. if I want to move a VM from
>>>> one compute node to another myself, but I was wondering if there is a way
>>>> to have OpenStack do this automatically for me to balance the load across
>>>> all the compute nodes dynamically?
>>>> Paul Walton
>>>> University of Arkansas
>>>> College of Engineering
>>>> CSCE Technical Support Team
>>>> J.B. Hunt Building, Room 440
>>>> OpenStack-operators mailing list
>>>> OpenStack-operators at lists.openstack.org
>>> Joe Topjian
>>> Systems Administrator
>>> Cybera Inc.
>>> Cybera is a not-for-profit organization that works to spur and support
>>> innovation, for the economic benefit of Alberta, through the use
>>> of cyberinfrastructure.