[nova][cinder] Volume-backed instance disks and some operations that do not support those yet
Hi,
First of all, thanks to everyone involved for all the work on Nova, Cinder, os-brick, and actually all the rest of OpenStack, too! So, yeah, I guess it is kind of weird that I'm asking this on the list just a couple of days after the PTG where I could have asked in person, but here goes :)
There seem to still be some quirks with Nova and volume-backed instance disks; some actions on instances are not allowed, others produce somewhat weird results. From a quick look at the code it seems to me that currently these are:
- taking a snapshot of an instance (produces a zero-sized file, no real data backed up)
- backing an instance up (refuses outright)
- rescuing an instance (refuses outright)
...and maybe there are some that I've missed.
So, possibly stupid question here, but what are the project's plans for these - is there an intention to implement them at some point, or are there some very, very hard theoretical or practical problems (so something like "guess not for the present"), or is somebody working on something?
The main reason that I am asking is that we, StorPool, have a shared-storage Cinder driver, and every now and then a customer comes up and asks about one or more of these actions. Every now and then we come back to the idea of writing a vendor-specific Nova image backend, but, first off, we are not really sure whether we want to do this, and second, we are not really sure whether it will be accepted upstream. A couple of years ago people told us "don't do that" and there was some talk about having an image backend for storage drivers supported by libvirt, but that effort seems to have stalled.
Of course, we know that in all software projects, including, but certainly not limited to, the more-or-less volunteer free/libre/open-source projects, there are many tasks and many demands on the developers so that it is only natural that not everything is implemented or adapted at once; things happen, priorities shift, people get redirected, nobody else steps up to continue - it happens. With this in mind, where do things stand right now, should we consider writing an image backend, are there other options or plans?
So thanks for reading through my ramblings, I guess, and keep up the great work!
Best regards, Peter
With this in mind, where do things stand right now, should we consider writing an image backend, are there other options or plans?
I don't think you should, no. The image backend code is messy and problematic for a lot of reasons, and building on what we have there is a path to madness I think. Rewriting it is no small feat, and I think that if we did we'd want to do so in such a way that makes use of cinder for anything other than local disk. That's a really nice ideal, but it's a huge amount of work to do (and review) and also unlikely to ever actually happen.
We can do a lot better by reducing the feature gap with volume-backed instances: implementing the features that aren't supported, and improving the ones that are *weird* when used on a volume-backed instance. These would be much smaller changes, easier to review, easier to gain acceptance for, etc. Personally, if you want to do some work in this area, I'd recommend picking a weird behavior and trying to propose an improvement to it.
--Dan
On 11/11/2019 10:06 AM, Peter Penchev wrote:
There seem to still be some quirks with Nova and volume-backed instance disks; some actions on instances are not allowed, others produce somewhat weird results. From a quick look at the code it seems to me that currently these are:
- taking a snapshot of an instance (produces a zero-sized file, no real data backed up)
Volume-backed instance snapshot is supported [1]. It creates a volume snapshot in cinder and then links that to the glance image via metadata. If you boot a server from that image snapshot it's boot-from-volume under the covers, what is sometimes referred to as an image-defined block device mapping. Tempest also has a scenario test for this [2].
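To illustrate the image-defined block device mapping mentioned above, here is a minimal Python sketch of how one might tell such a snapshot image apart from an ordinary one. It assumes the Glance image carries a `block_device_mapping` property holding a JSON list whose entries use the `source_type` and `snapshot_id` keys. Those names are recalled from Nova's block device mapping format and should be checked against a real deployment. The function names and the sample IDs are hypothetical.

```python
import json


def snapshot_block_devices(image_props):
    """Return the snapshot-backed block device mappings found on an image.

    `image_props` is a plain dict of image properties (assumed layout).
    """
    raw = image_props.get("block_device_mapping", [])
    # The property is assumed to be stored as a JSON string; accept a
    # ready-made list as well so the helper is easy to use in tests.
    if isinstance(raw, str):
        raw = json.loads(raw)
    return [
        bdm for bdm in raw
        if bdm.get("source_type") == "snapshot" and bdm.get("snapshot_id")
    ]


def is_volume_backed_snapshot(image_props):
    """True if the image points at at least one Cinder volume snapshot."""
    return bool(snapshot_block_devices(image_props))


# Hypothetical example: a zero-sized image whose data lives in Cinder.
example_image = {
    "size": 0,
    "block_device_mapping": json.dumps([
        {
            "source_type": "snapshot",
            "destination_type": "volume",
            "snapshot_id": "hypothetical-snapshot-id",
            "boot_index": 0,
        }
    ]),
}
print(is_volume_backed_snapshot(example_image))
```

Under these assumptions, a zero-sized image file is not in itself a sign that nothing was saved: the data would be in the Cinder snapshot that the metadata points to.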
- backing an instance up (refuses outright)
Yeah, not supported, and not really necessary to support. The createBackup API is essentially frozen: it's just orchestration over the existing createImage API and could all be done via external tooling, so making it a more feature-rich API isn't really a priority. We've even talked about deprecating createBackup just to get people to stop using it.
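As a rough sketch of the "external tooling" idea, the rotation part of a backup workflow could be as small as the helper below. It is not Nova's implementation. It assumes the tool already has a list of the images it created for one server, each a dict with hypothetical `id` and `created_at` fields holding ISO 8601 timestamps in one consistent format, and that the desired policy is "keep the newest N".

```python
def backups_to_delete(backups, rotation):
    """Return the backups that fall outside the newest `rotation` entries.

    `backups` is a list of dicts with "id" and "created_at" keys
    (hypothetical record layout); `rotation` is how many to keep.
    """
    if rotation < 0:
        raise ValueError("rotation must be zero or a positive integer")
    # ISO 8601 timestamps in one consistent format sort chronologically
    # as plain strings, so no date parsing is needed here.
    newest_first = sorted(backups, key=lambda b: b["created_at"], reverse=True)
    # Everything past the first `rotation` entries is surplus.
    return newest_first[rotation:]


# Hypothetical records for three earlier backups of one server.
existing = [
    {"id": "a", "created_at": "2019-11-01T00:00:00Z"},
    {"id": "b", "created_at": "2019-11-03T00:00:00Z"},
    {"id": "c", "created_at": "2019-11-02T00:00:00Z"},
]
for backup in backups_to_delete(existing, rotation=2):
    print("would delete", backup["id"])
```

A tool built this way would call createImage itself, then delete whatever this helper returns, which keeps the retention policy outside of Nova.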
- rescuing an instance (refuses outright)
Yeah, not supported, but there have been specs [3][4].
...and maybe there are some that I've missed.
Rebuilding a volume-backed server is another big one. There was actually agreement on how to do this between nova and cinder [5][6]; the cinder implementation was coded up and being reviewed, but the nova side lagged and was eventually abandoned. So that could be picked up again if someone was willing to invest the time in it.
[1] https://github.com/openstack/nova/blob/20.0.0/nova/compute/api.py#L3031
[2] https://github.com/openstack/tempest/blob/22.1.0/tempest/scenario/test_volum...
[3] https://review.opendev.org/#/c/651151/
[4] https://review.opendev.org/#/c/532410/
[5] https://specs.openstack.org/openstack/nova-specs/specs/stein/approved/volume...
[6] https://blueprints.launchpad.net/cinder/+spec/add-volume-re-image-api
On 11-11-19 13:54:02, Matt Riedemann wrote:
On 11/11/2019 10:06 AM, Peter Penchev wrote:
- rescuing an instance (refuses outright)
Yeah, not supported, but there have been specs [3][4].
[..]
[3] https://review.opendev.org/#/c/651151/
[4] https://review.opendev.org/#/c/532410/
I might actually have time for this during U. Third time lucky?
https://review.opendev.org/693849
Cheers,
participants (4)
- Dan Smith
- Lee Yarwood
- Matt Riedemann
- Peter Penchev