[Openstack] snapshots, backups of running VMs and compute node recovery

Jānis Ģeņģeris janis.gengeris at gmail.com
Mon Nov 12 09:44:45 UTC 2012


On Fri, Nov 9, 2012 at 9:45 PM, Vishvananda Ishaya <vishvananda at gmail.com> wrote:

> The libvirt driver has actually gotten quite good at rebuilding all of the
> data for instances. The only thing it can't do right now is re-download
> base images from glance. With the current state, if you simply back up the
> instances directory (usually /var/lib/nova/instances) then you can recover
> by bringing back the whole directory and doing a nova reboot <uuid> for
> each instance.
>
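
The recovery step quoted above (restore the instances directory, then run nova reboot <uuid> for each instance) could be scripted roughly as follows. This is only a minimal sketch: it assumes the nova CLI is installed and authenticated on the host where it runs, and that the list of instance UUIDs is already known. The helper names reboot_commands and reboot_all are made up for illustration.

```python
import subprocess


def reboot_commands(instance_uuids):
    """Build one `nova reboot <uuid>` command per instance (nothing is executed)."""
    return [["nova", "reboot", uuid] for uuid in instance_uuids]


def reboot_all(instance_uuids, dry_run=True):
    """Print each command; run it only when dry_run is False."""
    for cmd in reboot_commands(instance_uuids):
        print(" ".join(cmd))
        if not dry_run:
            # Assumption: the nova CLI is on PATH and credentials are set
            # in the environment.
            subprocess.run(cmd, check=True)
```

Calling reboot_all(uuids) with the default dry_run=True only prints the commands, which makes it possible to review them before anything is rebooted.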

What about image corruption if I start backing up '/var/lib/nova/instances'
while the instances are running? Should they be paused or suspended while
that is being done?


> You could just stick the whole thing on LVM and snapshot it regularly
> for DR. The _base directory can be regenerated with images from glance so
> you could also write a script to regenerate it and not have to worry about
> backing it up. The code to add to nova to make it automatically re-download
> the image from glance if it isn't there shouldn't be too bad either, which
> would mean you could safely ignore the _base directory for backups.
> Additionally using qcow images in glance and the config option
> `force_raw_images=False` will keep this directory much smaller.
>
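
Since the quoted text says _base can be regenerated from glance, a file-level backup could skip it. The following is a minimal sketch of that idea using only the Python standard library. The function name copy_instances_without_base is hypothetical, the destination must not exist yet, and the sketch does not address whether copying the disks of running instances is safe.

```python
import shutil
from pathlib import Path


def copy_instances_without_base(src, dst):
    """Copy the instances directory to dst, skipping the top-level _base directory."""
    src = Path(src)

    def skip_base(directory, names):
        # Ignore `_base` only directly under src, not in nested directories.
        if Path(directory) == src:
            return [name for name in names if name == "_base"]
        return []

    # copytree creates dst itself and fails if dst already exists.
    shutil.copytree(src, dst, ignore=skip_base)
```

For example, copy_instances_without_base("/var/lib/nova/instances", "/backup/instances") would copy every instance directory but leave _base behind; the backup path here is an arbitrary placeholder.
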
If I put everything on LVM, I will not be able to take regular nova snapshots
anymore. Is there a workaround for this? I want to gain some understanding
and confidence before I start rebuilding my current setup.

By image regeneration, did you mean downloading the image from glance
('glance image-download')?
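
If regeneration does amount to 'glance image-download', a script for it might start from something like the sketch below. It assumes the glance CLI is installed and authenticated and that it accepts the --file option for the output path. The target path is left entirely to the caller, because how nova names and post-processes cached files under _base is not covered in this thread. The function names are hypothetical.

```python
import subprocess


def image_download_command(image_id, target_path):
    """Build a `glance image-download` command that writes the image to target_path."""
    return ["glance", "image-download", "--file", str(target_path), image_id]


def download_image(image_id, target_path, dry_run=True):
    """Print the command; run it only when dry_run is False."""
    cmd = image_download_command(image_id, target_path)
    print(" ".join(cmd))
    if not dry_run:
        # Assumption: the glance CLI is on PATH and credentials are set
        # in the environment.
        subprocess.run(cmd, check=True)
```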


>
> Vish
>
>
> On Nov 9, 2012, at 2:51 AM, Jānis Ģeņģeris <janis.gengeris at gmail.com>
> wrote:
>
> Hello all,
>
> I would like to know what solutions are available for backing up and/or
> snapshotting running instances on compute nodes. The documentation does not
> mention anything related to this. By snapshots I don't mean the current
> snapshot mechanism, which imports an image of the running VM into glance.
> I'm using KVM, but this is relevant for any hypervisor.
>
> Why is this important?
> Consider a simple scenario where hardware on a compute node fails and the
> node goes down immediately and is not recoverable in reasonable time. The
> images of the running instances are also lost. A shared file system is not
> considered here, as it may cause IO bottlenecks and adds another layer of
> complexity.
>
> There have been a few discussions on the list about this problem, but
> none have really answered the question.
>
> The documentation covers disaster recovery after a power loss and recovery
> of a failed compute node from a shared file system, but it doesn't cover
> the case without a shared file system.
>
> I can think of a few solutions currently (for KVM):
> a) using LVM images for VMs and making LVM logical volume snapshots, but
> then the current nova snapshot mechanism will not work (from the docs:
> 'current snapshot mechanism in OpenStack Compute works only with instances
> backed with Qcow2 images');
> b) snapshotting machines with the OpenStack snapshot mechanism, but this
> doesn't really fit, because its goal is something other than creating
> backups, it will be slow, and it will pollute the glance image space.
>
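
Option a) above could be driven by a small script. The sketch below only builds an lvcreate snapshot command and executes nothing. The volume group and logical volume names, the snapshot naming scheme, and the 5G snapshot size are all placeholders chosen for illustration, and running the resulting command would need root privileges. It does not resolve the conflict with the nova snapshot mechanism mentioned in a).

```python
from datetime import datetime, timezone


def lvm_snapshot_command(volume_group, logical_volume, size="5G", now=None):
    """Build an `lvcreate` command that snapshots /dev/<vg>/<lv> (nothing is executed)."""
    now = now or datetime.now(timezone.utc)
    # Hypothetical naming scheme: <lv>-snap-<UTC timestamp>.
    name = "{}-snap-{}".format(logical_volume, now.strftime("%Y%m%d%H%M%S"))
    origin = "/dev/{}/{}".format(volume_group, logical_volume)
    return ["lvcreate", "--snapshot", "--size", size, "--name", name, origin]
```

The returned list can be passed to subprocess.run once the names and size have been adjusted to the actual setup.
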
> Regards
> --janis


-- 
--janis