Open Stack

Fri Apr 22 22:31:03 UTC 2016

Hello Operators,

One of the users of our cluster opened a ticket about a snapshot
corner case: it is not possible to snapshot an instance that is booted
from volume while the instance is paused. So I wrote this patch, and
from the discussion you can see that I learnt a lot about snapshots.
https://review.openstack.org/#/c/295865/

While discussing the patch I came across something that I found totally
strange, so I want to check with the community whether this is the
expected behavior.

Scenario:
OpenStack Kilo
libvirt
rbd storage for the images
instance booted from image

The developers pointed out that when I snapshot an active instance,
nova calls "managedSave":
https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainManagedSave

I thought there was a misunderstanding, because I did not see the
point of dumping the entire content of the RAM to disk.

I was surprised to find, on the hypervisor where the instance is
scheduled, that two temporary files really are created during the
snapshotting process.

As soon as you click "snapshot" you will see this file:

/var/lib/libvirt/qemu/save/instance-0001cf8c.save

This file will have the size of the RAM of the instance. In my
case I had to wait for 32 GB of RAM to be written to disk.
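
To put rough numbers on this, here is a minimal sketch that takes the observation above (the save file is about as large as the instance RAM) and estimates how long the write takes and whether the save directory has room for it. The 200 MB/s write throughput is purely an assumed figure for illustration, and the function names are hypothetical, not part of nova or libvirt.

```python
import shutil

GIB = 1024 ** 3


def save_file_estimate(ram_gib, write_mb_per_s):
    """Return (bytes_needed, seconds) for dumping ram_gib GiB of guest RAM.

    Assumes the save file is as large as the guest RAM and that the
    disk sustains write_mb_per_s (in units of 10**6 bytes per second).
    """
    bytes_needed = ram_gib * GIB
    seconds = bytes_needed / (write_mb_per_s * 1_000_000)
    return bytes_needed, seconds


def has_room(bytes_needed, save_dir="/var/lib/libvirt/qemu/save"):
    """True if the filesystem holding save_dir has enough free space."""
    return shutil.disk_usage(save_dir).free >= bytes_needed


if __name__ == "__main__":
    # 200 MB/s is an assumed throughput, not a measured one.
    needed, secs = save_file_estimate(32, write_mb_per_s=200)
    print(f"{needed / GIB:.0f} GiB to write, roughly {secs / 60:.1f} minutes")
```

On a compute node, `has_room(needed)` would tell you in advance whether the default save directory can hold the dump at all.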

Once that is finished, this second process starts:

qemu-img convert -O raw
rbd:volumes/ee3a84c3-b870-4669-8847-6b9ac93a8eac_disk:id=cinder:conf=/etc/ceph/ceph.conf
/var/lib/nova/instances/snapshots/

OK, this convert is also slow, but that is already fixed in Mitaka:
Problem description -
http://www.sebastien-han.fr/blog/2015/10/05/openstack-nova-snapshots-on-ceph-rbd/
Patches that should solve the problem:
https://review.openstack.org/#/c/205282/
https://review.openstack.org/#/c/188244/
Merged for Mitaka -
https://blueprints.launchpad.net/nova/+spec/rbd-instance-snapshots

As a result you have a file with a name that looks like a UUID in this
other folder:

ls /var/lib/nova/instances/snapshots/tmpWsKqvl/
51574e9140204c0f89c7d86fcf741579

So this means that when we take a snapshot of an active instance, we
dump all the RAM into a temp file.

This has an impact for us because we have flavors with 32 GB of RAM.
Because our instances are completely rbd backed, we have small disks
on the compute nodes.
Also, it takes time to dump 32 GB of RAM to disk for nothing!!

So, is calling managedSave the intended behavior? Or should nova just
make a call to libvirt to make sure that filesystem caches are written
to disk before snapshotting?
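
To make the second option concrete, here is a minimal sketch of what such a call could look like with the libvirt Python bindings. It is not what nova does in Kilo. It assumes the QEMU guest agent is running inside the instance, which `fsFreeze()` and `fsThaw()` require, and `take_disk_snapshot()` in the usage comment is a hypothetical placeholder for the actual snapshot step.

```python
from contextlib import contextmanager


@contextmanager
def frozen_filesystems(dom):
    """Freeze the guest filesystems for the duration of the with-block.

    dom is expected to be a libvirt.virDomain. fsFreeze() asks the guest
    agent to flush and freeze mounted filesystems; fsThaw() releases them.
    The thaw runs even if the snapshot step raises.
    """
    dom.fsFreeze()
    try:
        yield dom
    finally:
        dom.fsThaw()


# Usage sketch (needs libvirt-python and a running instance):
#
#   import libvirt
#   conn = libvirt.open("qemu:///system")
#   dom = conn.lookupByName("instance-0001cf8c")
#   with frozen_filesystems(dom):
#       take_disk_snapshot()  # hypothetical: the disk snapshot itself
```

Unlike managedSave, this only makes the on-disk state consistent and does not keep the RAM contents, so whether it is enough for a snapshot is exactly the question.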

I tracked this call in the git history, and it looks like nova has been
implemented this way since 2012.

Please, operators, tell me that I configured something wrong and this is
not really how snapshots are implemented :) Or explain why the dump of
all the RAM is needed :)

Any feedback is appreciated !!

Saverio

[Openstack-operators] nova snapshots should dump all RAM to hypervisor disk ?
