[nova] unshelve image_ref bugs
alexandre.arents at corp.ovh.com
Fri Nov 22 14:05:19 UTC 2019
We are using the shelve feature more and more, and we recently hit this referenced shelve/unshelve bug:
To sum up the issue:
- Once we shelve/unshelve a qcow2 instance, we can no longer live-migrate, cold-migrate or resize it
without breaking it, which involves data loss.
-> This is because the unshelved instance has a backing file corresponding
to the deleted snapshot (the shelve snapshot), not to the original image_ref in glance.
So when we migrate the instance, the destination host fetches the original image from glance,
not the one currently used by the instance.
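One way to see the mismatch described above is to look at the backing file recorded in the instance disk. This is a minimal sketch, assuming the disk can be inspected with `qemu-img info --output=json` (which reports a `backing-filename` key when a backing file is set). The function names are hypothetical and this is not nova code:

```python
import json
import subprocess


def backing_file_from_info(info_json):
    """Return the backing file named in `qemu-img info --output=json` output, or None."""
    info = json.loads(info_json)
    return info.get("backing-filename")


def qemu_img_info(disk_path):
    """Run qemu-img on a disk image and return its JSON description as text."""
    # -U (--force-share) allows inspecting a disk that a running guest has open.
    result = subprocess.run(
        ["qemu-img", "info", "-U", "--output=json", disk_path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

Calling `backing_file_from_info(qemu_img_info(path))` on the disk of an unshelved instance should then show whether it still points at the shelve snapshot rather than at the image named by image_ref.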
I've tried to sum up the possible solutions:
A) Use the patch proposed by Matt (abandoned):
This change relies on the snapshot image created by a shelve still existing,
that is to say: don't delete the image during unshelve, and update instance.image_ref with it.
- there is no more "hidden image"
- each shelve/unshelve creates an image in glance (it uses space, which can be a problem or not)
- It breaks the assumption:
"I want my instance to remain unchanged after unshelve and keep the original image_ref,
so when I rebuild my instance, I use the original image (from spawn time),
not the one used during the last shelve"
(Note: Matt mentions a possible workaround for the rebuild issue: using
image_id in request.spec instead of image_ref)
B) Someone proposed to rebase the backing file:
I did not check the feasibility/complexity of this.
- What do we do when the original image has been deleted?
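To make (B) a bit more concrete, here is a rough sketch of what a rebase could look like with `qemu-img rebase`, assuming the original image is still available on the host as a file. The helper name, the paths and the default backing format are hypothetical, and I have not tested this against nova:

```python
def build_rebase_cmd(disk_path, new_backing_path, backing_fmt="raw"):
    """Build a safe-mode `qemu-img rebase` command (no -u) as an argument list."""
    return [
        "qemu-img", "rebase",
        "-f", "qcow2",           # format of the instance disk
        "-b", new_backing_path,  # backing file to rebase onto
        "-F", backing_fmt,       # format of the new backing file (assumed here)
        disk_path,
    ]
```

The list can be passed to `subprocess.run(cmd, check=True)` while the disk is not in use. Safe mode compares the disk against its old backing file, so that file has to be still readable when the command runs.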
C) Change the create_image()/imagebackend driver behavior
to create a flattened qcow2 file when unshelving.
Flattening the disk may be a solution because there will be no more "orphan backing file".
(Basically doing what the "flat" backend driver does, except that we need to stay in qcow2 instead of RAW)
- we keep the original unshelve behavior/assumption
- It means that in an infra configured for COW, some instances will be "flat qcow2".
Flat qcow2 instances work great (live migration/resize...). Would all installations be OK with that?
OK, it seems a little odd to ask the COW driver not to do COW in some cases. Alternatively, we can
force the use of the flat driver when unshelving, but then we need to change the flat driver to also support qcow2.
D) During spawn(), if unshelving, we convert the "qcow2 disk with backing file" to a "flattened qcow2 disk",
just after self._create_image().
It looks more like a workaround than a long-term solution, as it needs to convert something created earlier
that does not meet the need (better to do C).
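The flattening in (C) and (D) can be sketched with `qemu-img convert`, which, when no backing file is requested for the output, writes a standalone image holding the data of the whole backing chain. This is only an illustration under that assumption. The helper names and the ".flat" suffix are hypothetical, and it is not the code we run in spawn():

```python
import os
import subprocess


def build_flatten_cmd(src_path, dst_path):
    """Build a `qemu-img convert` command that writes a qcow2 copy with no backing file."""
    return ["qemu-img", "convert", "-f", "qcow2", "-O", "qcow2", src_path, dst_path]


def flatten_qcow2(disk_path):
    """Replace disk_path with a flattened copy of itself (the disk must not be in use)."""
    tmp_path = disk_path + ".flat"
    subprocess.run(build_flatten_cmd(disk_path, tmp_path), check=True)
    # Swap the flattened copy in place of the original disk.
    os.replace(tmp_path, disk_path)
```

The temporary copy needs free space of up to the full virtual size of the disk on the same filesystem, which is one cost of going flat.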
E) Any other ideas?
Currently, to keep OPS and users happy in the short term, we are about to do (D), as it works great in our environment,
but we are looking for the project's solution.