[openstack-dev] [nova] Block Device mapping and supporting cd-roms
vishvananda at gmail.com
Wed Jan 23 17:49:27 UTC 2013
There is an extensive discussion going on about a patch to allow attaching cd-roms. I'm cross-posting the last comments to the mailing list in case others have input. It is in response to the following from Daniel Berrange:
> I have been thinking about how this interacts with the block device mapping code, and can't help coming to the conclusion that this design is flawed, and will hurt our ability to do future work in this area without backwards compatibility pains.
> The current way to boot nova instances with a block device mapping is
> --block-device-mapping /dev/vdf=404d2b8e-a174-4d4c-9bfb-6091dc480a01:::0
> This causes the guest to have a disk device /dev/vdf backed by the cinder volume 404d2b8e-a174-4d4c-9bfb-6091dc480a01.
> What if I wanted the guest to have a CDROM device backed by the cinder volume instead ? There's no way to express this in Nova currently.
> What if I don't care about device name & just want Nova to auto-allocate one ? There's no way to express this in Nova currently.
> It is already possible to get Nova to configure a CDROM device for the primary root image of a VM, by setting the 'format' property against the image to 'iso'. This is already flawed because Nova is associating the type of device with the filesystem format of the image. It is perfectly acceptable to have a CDROM device with an ext4 filesystem, or a plain disk with an iso filesystem. There's no way to express this in Nova currently.
> Now on to your proposal
> --block_device_mapping cdrom1=a650060c-09f9-44b4-9eec-196d36a3b7ea:optical mytest-1
> This causes the guest to have an extra cdrom device (user can't choose name) backed by a glance image a650060c-09f9-44b4-9eec-196d36a3b7ea. This magic is done by looking for the 'optical' flag in the block device mapping. There are a number of flaws with this syntax IMHO:
> - It is silently changing the meaning of the UUID to refer to a glance image, instead of cinder volume, which has huge scope for confusing users
> - The user is forced to accept auto allocation of device name - they have no way to control what device name is used, unlike when using cinder volumes
> Your additions to the block device mapping have pretty much completely different semantics from the existing block device mapping data. I think this is a serious design flaw.
> Looking at the bigger picture, what this clearly says is that nova disk image configuration is not sufficiently flexible. For *any* disk, we need to be able to
> - Choose between different types of guest device - harddisk, cdrom, or floppy, mmc (flash)
> - Choose between either glance image or cinder volume
> - Choose a disk name, or let Nova auto-allocate
> These three variables should be completely independent of each other.
> I think we need a clear design on how to achieve this in general, not merely hack up something which is only going to work in one limited use case, and will cause problems for our more general solution. As such I don't think this changeset is suitable to consider merging.
I mostly agree with your comments. IMO the disk name is really about choosing an ordering, since the name in the guest is arbitrary. The name is really a holdover from when nova was trying to mimic ec2. I guess auto-allocate would mean just use the ordering of the block device mapping passed in. We have to keep compatibility in mind. Ultimately it seems like we need to have guest_type, src_type, and target_type. It sounds like we have a few separate patches:
1. allow users to specify guest_type in bdm
(aligned with Choose between different types of guest device - harddisk, cdrom, or floppy, mmc (flash))
The first version of this should probably just be hd + cdrom. This would allow us to attach volumes as cdroms but not images.
2. allow users to specify images as src_type in addition to volume and snapshot
(aligned with Choose between either glance image or cinder volume)
This could be done by adding image_id to the table and allowing one of image_id, snapshot_id, volume_id to be passed in. Honestly it seems like it would be better for the code to accept (id, type) instead of exposing three fields externally, but that is probably an incompatible API change. It should be possible to specify the image_id of the root disk using this syntax as well.
There are two concerns here:
a) exposing the additional functionality via the API (requires a new extension)
b) we need to be careful about users using too much space. Rather than force a maximum size, we should probably just subtract the size of any images that are in the bdm from the flavor ephemeral size (first) and the root disk size (second), so that users get the same total space no matter how many images they use.
3. allow users to specify src_type = blank
Users should be able to create ephemeral disks using bdms as well.
4. allow users to change the size of local storage options (optional)
Right now we resize the disks based on a flavor. Why not allow users to specify the size of the target? For example, if you had a flavor with a 20 GB root disk and 100 GB of local storage, you could specify a 50 GB root drive and a 50 GB ephemeral drive, and attach a 20 GB cd-rom.
5. allow users to specify target_type (optional)
src_type tells us where the disk bits come from. Currently we do some magic to determine where the bits will live while the guest is active.
volume -> self
snapshot -> new volume
image -> local
blank -> local
We could continue to have a magic mapping like this or we could expose to the user the ability to specify where the bits should be. This involves nova having more conversion options, but it could be valuable to specify if it isn't too complex.
6. Nova auto-allocates disk names
(aligned with Choose a disk name, or let Nova auto-allocate)
Nova can already auto-allocate on attach, so we can use the same idea to pick disk names if they are not specified. I think it makes sense to continue to use the same syntax as much as possible, /dev/vdx for example. I think the problem here is that cdroms are attached on a different bus than the other disks, which leads to confusion. I'm not super opposed to using something like cdrom0-1-2 here, but we should discuss whether there is something better.
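To make the auto-allocation idea a bit more concrete, here is a rough Python sketch of picking names in the order the mappings were passed in. It is only an illustration: the function names, the dict layout, and the "vol-a" style ids are made up for this example and are not Nova's actual code, and it deliberately ignores the separate-bus problem for cdroms mentioned above.

```python
import string


def next_device_name(used, prefix="/dev/vd"):
    """Return the first unused name such as /dev/vda, /dev/vdb, ..."""
    for letter in string.ascii_lowercase:
        candidate = prefix + letter
        if candidate not in used:
            return candidate
    raise ValueError("no free device name under prefix %s" % prefix)


def assign_device_names(mappings, prefix="/dev/vd"):
    """Fill in missing device names, following the order of the mappings.

    Names the user chose explicitly are reserved first, so an
    auto-allocated name never collides with a requested one.
    """
    used = {m["device_name"] for m in mappings if m.get("device_name")}
    result = []
    for m in mappings:
        m = dict(m)  # copy, so the caller's input is left untouched
        if not m.get("device_name"):
            m["device_name"] = next_device_name(used, prefix)
            used.add(m["device_name"])
        result.append(m)
    return result


# Hypothetical input: two unnamed mappings around one explicitly named one.
bdms = [
    {"device_name": None, "volume_id": "vol-a"},
    {"device_name": "/dev/vdb", "volume_id": "vol-b"},
    {"device_name": None, "volume_id": "vol-c"},
]
named = assign_device_names(bdms)
print([m["device_name"] for m in named])
```

With this approach the unnamed entries simply take the first free letters in list order, which matches the "name is really an ordering" view.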
So as far as I can tell this patch should be split. I think 2-5 should be covered by ndipainov's blueprint and this patch should only allow attaching volumes and snapshots (current functionality) as cd-roms in addition to hard disks. We also need a competent plan for 6, but I think if we minimize this patch to 1 and discuss 6 further we can come up with something reasonable.
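As a strawman for the discussion, the three independent fields and the space accounting could look roughly like the Python below. Everything here is an assumption for illustration only: the BlockDevice tuple, make_bdm, remaining_local_space, and the size_gb field are invented names, not an existing Nova schema or API. The default target table just restates the current "magic" mapping.

```python
from collections import namedtuple

# Hypothetical record; field names follow the discussion, not Nova's tables.
BlockDevice = namedtuple(
    "BlockDevice", "guest_type src_type src_id size_gb target_type")

GUEST_TYPES = ("hd", "cdrom")  # first version: hd + cdrom only
SRC_TYPES = ("volume", "snapshot", "image", "blank")

# Where the bits live while the guest is active, unless the user overrides it.
DEFAULT_TARGET = {
    "volume": "volume",    # volume -> self
    "snapshot": "volume",  # snapshot -> new volume
    "image": "local",
    "blank": "local",
}


def make_bdm(guest_type, src_type, src_id=None, size_gb=0, target_type=None):
    """Build one mapping entry; the three *_type choices are independent."""
    if guest_type not in GUEST_TYPES:
        raise ValueError("unknown guest_type: %s" % guest_type)
    if src_type not in SRC_TYPES:
        raise ValueError("unknown src_type: %s" % src_type)
    if src_type != "blank" and src_id is None:
        raise ValueError("src_type %s needs an id" % src_type)
    return BlockDevice(guest_type, src_type, src_id, size_gb,
                       target_type or DEFAULT_TARGET[src_type])


def remaining_local_space(root_gb, ephemeral_gb, bdms):
    """Subtract image-backed sizes from ephemeral first, then from root."""
    needed = sum(b.size_gb for b in bdms if b.src_type == "image")
    from_ephemeral = min(needed, ephemeral_gb)
    from_root = needed - from_ephemeral
    if from_root > root_gb:
        raise ValueError("images exceed the flavor's local storage")
    return root_gb - from_root, ephemeral_gb - from_ephemeral


# A 20 GB image attached as a cdrom on a 20 GB root / 100 GB ephemeral flavor.
cd = make_bdm("cdrom", "image",
              "a650060c-09f9-44b4-9eec-196d36a3b7ea", size_gb=20)
print(cd.target_type)
print(remaining_local_space(20, 100, [cd]))
```

The point of the sketch is only that guest_type, src_type, and target_type can be validated separately, and that the total local space stays fixed however many images are attached.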