[Openstack-operators] Live migration

Paul Walton paul.d.walton at gmail.com
Mon Sep 10 15:06:57 UTC 2012


Sorry it took so long to respond.  I was out of town all weekend.

This information helps a lot.  Thanks.  It will allow me to better inform
my boss of the situation.  In my case, keeping the student VMs up at all
times is not a major priority, but it is something I would offer if I can.
Restoring from a nightly backup is plenty good enough for our situation.
However, I can certainly see how this would be a major concern in an
enterprise application, where customer VMs must be available at all times.

On Sat, Sep 8, 2012 at 1:29 PM, Diego Parrilla Santamaría <
diego.parrilla.santamaria at gmail.com> wrote:

> Hi Matt,
>
> we already provide this feature in the enterprise version of the StackOps
> Distro. It only works with KVM + libvirt, but it works like a breeze.
> That said, we can only do it thanks to a dirty hack in the libvirt
> configuration. It would be great if we could standardize how to implement
> nova-compute failover features in the future.
>
> Maybe it would be a good idea to work on a blueprint for this for the next
> summit. Sadly, as a small company, StackOps cannot dedicate much effort
> to it, but we can show how we do it and maybe give it back to the
> community with some help from others.
>
> Cheers
>
> Diego
> --
> Diego Parrilla
> CEO
> www.stackops.com | diego.parrilla at stackops.com | +34 649 94 43 29 |
> skype:diegoparrilla
>
> On Sat, Sep 8, 2012 at 12:31 AM, Matt Joyce <matt.joyce at cloudscaling.com> wrote:
>
>> I agree largely with Diego on this.
>>
>> As far as nova-compute failure is concerned, it really falls to the
>> instance owner to detect whether or not their instance has failed and
>> initiate the failover actions they wish to employ.  The world of false
>> positives is tricky.  That being said, a standard operating procedure
>> (SOP) for failing over instances when a nova-compute host goes down is
>> absolutely in our interest to define and add to the documentation.
>>
>> -Matt
>>
>>
>> On Thu, Sep 6, 2012 at 12:38 PM, Diego Parrilla Santamaría <
>> diego.parrilla.santamaria at gmail.com> wrote:
>>
>>> I think you are mixing up live migration of VMs and compute node failover.
>>> Live migration assumes your compute nodes are healthy (or healthy
>>> enough) and you want to move VMs around in order to perform updates or fix
>>> problems on specific nodes.
>>>
>>> Detecting the failure of a compute node seems easy, but it's not.
>>> There are a ton of events that can raise a false positive, and the logic
>>> that filters these events and performs a recovery can lead to VM
>>> duplication: the same VM running on two compute nodes at the same time.
>>> That's the reason we let our customers use our APIs to implement their
>>> own failure detection process and then trigger the failover action.
>>>
>>> Compute node failover means one or more compute nodes fail and you need
>>> to replace the failing compute nodes with spare servers that you keep
>>> preconfigured in your datacenter. If the failure of the compute node is
>>> only partial, you can live migrate the VMs first and then replace the
>>> compute node. This is not a complex task with OpenStack.
>>>
>>> But if the failure of the compute node is complete, then you need to
>>> replace the compute node and restore the failed node's configuration on
>>> its replacement. The state of your VMs is lost, but OpenStack can
>>> spin them up again. This second case needs not only shared storage, but
>>> also some logic to recover the OS configuration, the nova-compute
>>> configuration, and most importantly the libvirt configuration (if you
>>> are running KVM, of course).
>>>
>>> Cheers,
>>> Diego
>>>
>>> On Thu, Sep 6, 2012 at 9:16 PM, Paul Walton <paul.d.walton at gmail.com> wrote:
>>>
>>>> Well, I guess my real question next is, what happens when a VM is
>>>> running on a compute node and that node simply fails for whatever reason?
>>>> Does OpenStack have any way to detect a compute node failure, and attempt
>>>> to migrate its VMs to another node?  I realize that the real VM and the
>>>> migrated VM may be out of sync with each other, but I'm assuming that a
>>>> relatively recent version of the VM is still present on the distributed
>>>> file system to facilitate migration.
>>>>
>>>> Obviously, regular backups would be required for a real disaster
>>>> recovery, but the VMs I'm talking about won't be changing much, and being a
>>>> few minutes out of sync won't be a problem.
>>>>
>>>>
>>>> On Thu, Sep 6, 2012 at 1:32 PM, Joe Topjian <joe.topjian at cybera.ca> wrote:
>>>>
>>>>> Hi Paul,
>>>>>
>>>>> OpenStack does not have anything like this yet. Anton Beloglazov has
>>>>> created a blueprint from some research work he's done that would provide
>>>>> such a feature:
>>>>>
>>>>>
>>>>> https://blueprints.launchpad.net/nova/+spec/dynamic-consolidation-of-virtual-machines
>>>>>
>>>>> Thanks,
>>>>> Joe
>>>>>
>>>>> On Thu, Sep 6, 2012 at 12:26 PM, Paul Walton <paul.d.walton at gmail.com> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I have a quick question about live migration of VMs.  I can see how
>>>>>> to do this manually from the command line, i.e. if I want to move a VM from
>>>>>> one compute node to another myself, but I was wondering if there is a way
>>>>>> to have OpenStack do this automatically for me to balance the load across
>>>>>> all the compute nodes dynamically?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Paul Walton
>>>>>>
>>>>>> University of Arkansas
>>>>>> College of Engineering
>>>>>> CSCE Technical Support Team
>>>>>> J.B. Hunt Building, Room 440
>>>>>>
>>>>>> _______________________________________________
>>>>>> OpenStack-operators mailing list
>>>>>> OpenStack-operators at lists.openstack.org
>>>>>>
>>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Joe Topjian
>>>>> Systems Administrator
>>>>> Cybera Inc.
>>>>>
>>>>> www.cybera.ca
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>
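
The failure-detection point from Diego's message quoted above (a single
missed check is not proof that a node is dead, and acting on a false
positive can leave the same VM running on two nodes) can be sketched
roughly as follows.  This is only an illustration under stated
assumptions: NodeHealth, record_heartbeat, decide and the threshold of 3
are hypothetical names and values, not part of OpenStack, nova or
StackOps.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"                  # do nothing yet
    LIVE_MIGRATE = "live-migrate"  # node is degraded but still reachable
    FAILOVER = "failover"          # node is considered completely lost


@dataclass
class NodeHealth:
    """Hypothetical per-node record; not an OpenStack data structure."""
    name: str
    missed_heartbeats: int = 0  # consecutive failed health checks
    degraded: bool = False      # e.g. an operator flagged failing hardware


def record_heartbeat(node: NodeHealth, alive: bool) -> None:
    """Update the consecutive-miss counter after one health check."""
    if alive:
        node.missed_heartbeats = 0
    else:
        node.missed_heartbeats += 1


def decide(node: NodeHealth, miss_threshold: int = 3) -> Action:
    """Pick an action; the threshold filters out isolated missed checks."""
    if node.missed_heartbeats >= miss_threshold:
        return Action.FAILOVER
    if node.missed_heartbeats == 0 and node.degraded:
        # Partial failure: the node still answers, so VMs can be moved live.
        return Action.LIVE_MIGRATE
    return Action.NONE


node = NodeHealth("compute-1")
record_heartbeat(node, alive=False)
print(decide(node))  # a single miss is not enough to trigger failover
```

Even with a threshold, a network partition can still look like a dead
node, so something would have to confirm the old node is really down
before its VMs are started elsewhere.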


-- 

Paul Walton

University of Arkansas
College of Engineering
CSCE Technical Support Team
J.B. Hunt Building, Room 440