[Openstack-operators] nova snapshots should dump all RAM to hypervisor disk ?

Antonio Messina antonio.s.messina at gmail.com
Sat Apr 23 14:55:23 UTC 2016


We are in an even worse situation: we have flavors with 256GB of RAM
but only 100GB of local disk, which means that we cannot snapshot VMs
with this flavor at all.

If there is any way to avoid saving the contents of RAM to local disk
(or perhaps a way to write the RAM dump to, e.g., Ceph), we would be
very happy.
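
A quick way to see the problem on the hypervisor is to compare the
flavor's RAM with the free space in the directory where libvirt writes
the managed-save image. A rough python sketch (assuming libvirt's
default /var/lib/libvirt/qemu/save path; the domain name is only an
example):

import os
import libvirt

# Will the managed-save image (roughly RAM-sized) fit on the local disk?
conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('instance-0001cf8c')   # example domain name
ram_bytes = dom.info()[1] * 1024               # info()[1] is max memory in KiB
st = os.statvfs('/var/lib/libvirt/qemu/save')
free_bytes = st.f_bavail * st.f_frsize
if ram_bytes > free_bytes:
    print('the managed-save image will not fit on the local disk')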

.a.

On Sat, Apr 23, 2016 at 12:31 AM, Saverio Proto <zioproto at gmail.com> wrote:
> Hello Operators,
>
> one of the users of our cluster opened a ticket about a snapshot
> corner case: it is not possible to snapshot an instance that is booted
> from volume while the instance is paused. So I wrote this patch, and
> from the review discussion you can see that I learnt a lot about snapshots.
> https://review.openstack.org/#/c/295865/
>
> While discussing the patch I found something that seemed totally
> strange to me, so I want to check with the community whether this is
> the expected behavior.
>
> Scenario:
> OpenStack Kilo
> libvirt
> rbd storage for the images
> instance booted from image
>
> Now the developers pointed out that when I snapshot an active
> instance, nova makes a "managedSave":
> https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainManagedSave
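>
> If I read the driver right, that call boils down to something like this
> libvirt-python sequence (a simplified sketch, not the actual nova code;
> the domain name is just an example):
>
> import libvirt
>
> conn = libvirt.open('qemu:///system')
> dom = conn.lookupByName('instance-0001cf8c')
>
> # stop the guest and write its RAM to /var/lib/libvirt/qemu/save/<name>.save
> dom.managedSave(0)
>
> # ... the disk is snapshotted while the guest is down ...
>
> # restart the guest; create() restores from the managed-save image if present
> dom.create()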
>
> I thought there was a misunderstanding, because I did not see the
> point of dumping the entire contents of the RAM to disk.
>
> I was surprised, checking on the hypervisor where the instance is
> scheduled, to see that two temporary files really are created during
> the snapshotting process.
>
> As soon as you click "snapshot" you will see this file:
>
> /var/lib/libvirt/qemu/save/instance-0001cf8c.save
>
> This file will have the size of the RAM of the instance. In my case I
> had to wait for 32GB of RAM to be written to disk.
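>
> You can confirm from libvirt's side that this is really a managed save
> (another small sketch):
>
> import libvirt
>
> conn = libvirt.open('qemu:///system')
> dom = conn.lookupByName('instance-0001cf8c')
> # returns 1 once a managed-save image for this domain exists on disk
> print(dom.hasManagedSaveImage(0))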
>
> Once that has finished, this second process starts:
>
> qemu-img convert -O raw
> rbd:volumes/ee3a84c3-b870-4669-8847-6b9ac93a8eac_disk:id=cinder:conf=/etc/ceph/ceph.conf
> /var/lib/nova/instances/snapshots/
>
> This convert step is also slow, but that part is already fixed in Mitaka.
> Problem description:
> http://www.sebastien-han.fr/blog/2015/10/05/openstack-nova-snapshots-on-ceph-rbd/
> Patches that should solve the problem:
> https://review.openstack.org/#/c/205282/
> https://review.openstack.org/#/c/188244/
> Merged for Mitaka:
> https://blueprints.launchpad.net/nova/+spec/rbd-instance-snapshots
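>
> As far as I understand, the idea of those patches is to snapshot and
> clone the disk directly inside Ceph instead of pulling it through
> qemu-img on the hypervisor. Roughly this, with the rbd python bindings
> (only a sketch of the approach; the pool names and the target image
> name are examples):
>
> import rados
> import rbd
>
> cluster = rados.Rados(conffile='/etc/ceph/ceph.conf', rados_id='cinder')
> cluster.connect()
> src = cluster.open_ioctx('volumes')   # pool holding the instance disk
> dst = cluster.open_ioctx('images')    # pool where glance keeps images (example)
>
> disk = 'ee3a84c3-b870-4669-8847-6b9ac93a8eac_disk'
> img = rbd.Image(src, disk)
> img.create_snap('nova-snap')          # the snapshot never leaves the cluster
> img.protect_snap('nova-snap')         # cloning requires a protected snapshot
> rbd.RBD().clone(src, disk, 'nova-snap', dst, 'snapshot-image',
>                 features=rbd.RBD_FEATURE_LAYERING)
> img.close()
> src.close()
> dst.close()
> cluster.shutdown()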
>
>
> As a result of that qemu-img convert, you end up with a file whose
> name looks like a UUID in this other folder:
>
> ls /var/lib/nova/instances/snapshots/tmpWsKqvl/
> 51574e9140204c0f89c7d86fcf741579
>
> So this means that when we take a snapshot of an active instance, we
> dump all of its RAM into a temporary file.
>
> This has a real impact for us because we have flavors with 32GB of RAM.
> Because our instances are completely rbd-backed, we have small disks on
> the compute nodes.
> It also takes time to dump 32GB of RAM to disk for nothing!
>
> So, is calling managedSave the intended behavior? Or should nova just
> make a call to libvirt to make sure that filesystem caches are written
> to disk before snapshotting?
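>
> For the "just flush the caches" alternative I was thinking of the
> freeze/thaw calls that go through the qemu guest agent, something like
> this sketch (it obviously requires the guest agent running inside the
> instance):
>
> import libvirt
>
> conn = libvirt.open('qemu:///system')
> dom = conn.lookupByName('instance-0001cf8c')
>
> dom.fsFreeze()     # guest agent syncs and freezes the guest filesystems
> try:
>     pass           # ... take the rbd/disk snapshot here ...
> finally:
>     dom.fsThaw()   # unfreeze even if the snapshot fails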
>
> I tracked this call in the git history, and it looks like nova has
> been implemented this way since 2012.
>
> Please, operators, tell me that I have configured something wrong and
> this is not really how snapshots are implemented :) Or explain why the
> dump of all the RAM is needed :)
>
> Any feedback is appreciated!
>
> Saverio
>



-- 
antonio.s.messina at gmail.com
antonio.messina at uzh.ch                     +41 (0)44 635 42 22
S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich Switzerland


