[Openstack] Libvirt LXC with volume-attach broken ?

Eric W. Biederman ebiederm at xmission.com
Fri Jul 6 09:35:14 UTC 2012


"Daniel P. Berrange" <berrange at redhat.com> writes:

> On Thu, Jul 05, 2012 at 06:49:06PM -0700, Eric W. Biederman wrote:
>> Serge Hallyn <serge.hallyn at canonical.com> writes:
>> 
>> > Quoting Daniel P. Berrange (berrange at redhat.com):
>> >> On Thu, Jul 05, 2012 at 03:00:26PM +0100, Daniel P. Berrange wrote:
>> >> > Now, when using 'nova volume-attach':
>> >> > 
>> >> >   # nova volume-attach 05eb16df-03b8-451b-85c1-b838a8757736 a5ad1d37-aed0-4bf6-8c6e-c28543cd38ac /dev/sdf
>> >> > 
>> >> > nova will import an iSCSI LUN from the nova volume service, on the compute
>> >> > node. The kernel will assign it the next free SCSI drive letter, in my
>> >> > case '/dev/sdc'.
>> >> > 
>> >> > The libvirt nova driver will then do a mknod, using the volume name
>> >> > passed to 'nova volume-attach'.
>> >> > eg it will do
>> >> > 
>> >> >   mknod  /var/lib/nova/instances/instance-0000000e/rootfs/dev/sdf
>> >> 
>> >> Oops, I'm slightly wrong here. What it actually does is
>> >> 
>> >>   mount --bind /dev/sdc /var/lib/nova/instances/instance-0000000e/rootfs/dev/sdf
>> >> 
>> >> so you get a 'sdf' device, but with the major/minor number of the 'sdc'
>> >> device. I can't say I particularly like this approach. Ultimately I
>> >> think we need the kernel support to make this work correctly. In any
>> >
>> > Yes, that's what the 'devices namespace' is meant to address.  I'm hoping
>> > we can have some serious design discussion on that in the next few months.
>> 
>> This is not the device namespace problem.
>> 
>> This is the setns problem for mount namespaces, and the unprivileged
>> mount problem.
>> 
>> There may be a notification issue so user space can perform actions
>> in a container when a device shows up.
>> 
>> But it should be very possible on the host to call:
>> setns(containers_mount_namespace);
>> mknod("/dev/foo");
>> chown("/dev/foo", CONTAINER_ROOT_UID, CONTAINER_ROOT_GID);
>> 
>> And then from inside the container, especially once I get the rest of
>> the user namespace merged, it should be very possible to manipulate
>> the block device because you have permission, and to mount the
>> partitions of the block device, because you are root in your container.
>> 
>> But until the user namespace is merged you really are root so you can
>> mount whatever.
>> 
>> Daniel, does that sound like the support you are looking for?
>
> Yes, the setns(mnt) approach you describe above is exactly what I'd
> like to be able to do, to solve the first half of the problem.
>
> The other part of the problem is that I have a /dev/sdf, or even a
> /dev/volgroup00/logvol3 in the host (with whatever major:minor
> number that implies), and I want to be able to make it always
> appear as /dev/sda  in the container (with the correspondingly
> different major:minor number).  I'm guessing this is what Serge
> was referring to as the 'device' namespace problem.

Getting the device to always appear with the name /dev/sda is easy.
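
A minimal sketch of what that could look like, assuming root on the host,
a kernel whose setns() accepts mount namespaces, and that the container's
mount namespace is reachable through /proc/<pid>/ns/mnt. The helper names
(host_device_numbers, expose_device) and the container_pid parameter are
made up for illustration. The node gets the name the container wants
while keeping the host device's major:minor:

```python
import ctypes
import os
import stat

CLONE_NEWNS = 0x00020000  # mount-namespace flag, as defined in <sched.h>


def host_device_numbers(host_path):
    """Return (major, minor) of an existing block or character device node."""
    st = os.stat(host_path)
    if not (stat.S_ISBLK(st.st_mode) or stat.S_ISCHR(st.st_mode)):
        raise ValueError(f"{host_path} is not a device node")
    return os.major(st.st_rdev), os.minor(st.st_rdev)


def expose_device(container_pid, host_path, guest_path, uid=0, gid=0):
    """Create guest_path in the container's mount namespace for host_path's device.

    Hypothetical helper. setns() changes the namespace of the calling
    process for good and requires a single-threaded caller, so this is
    meant to be run from a short-lived child process, not a daemon.
    """
    st = os.stat(host_path)
    major, minor = host_device_numbers(host_path)
    node_type = stat.S_IFBLK if stat.S_ISBLK(st.st_mode) else stat.S_IFCHR

    libc = ctypes.CDLL(None, use_errno=True)
    ns_fd = os.open(f"/proc/{container_pid}/ns/mnt", os.O_RDONLY)
    try:
        # Join the container's mount namespace.
        if libc.setns(ns_fd, CLONE_NEWNS) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
    finally:
        os.close(ns_fd)

    # Same major:minor as on the host, but under the name the container expects.
    os.mknod(guest_path, node_type | 0o660, os.makedev(major, minor))
    os.chown(guest_path, uid, gid)
```

For example, expose_device(pid, "/dev/sdc", "/dev/sda") would give the
container a /dev/sda backed by the host's sdc. The uid and gid arguments
stand in for CONTAINER_ROOT_UID and CONTAINER_ROOT_GID above.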

Where does the need to have a specific device come from?  I would have
thought by now that hotplug had been around long enough that in general
user space would not care.

The only case that I know of where keeping the same device number seems
reasonable is in the case of live migration of an application, in order to
avoid issues with stat changing for the same file over the transition,
and I think a synthesized hotplug event could probably handle that case.
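
To make the stat concern concrete: for a regular file, stat reports the
device number of the backing filesystem in st_dev, so an application that
remembered (st_dev, st_ino) before a migration would see a different
identity afterwards if the device number changed. A small sketch, with
hypothetical helper names:

```python
import os


def device_identity(path):
    """Return ((major, minor) of the filesystem's device, inode number) for path."""
    st = os.stat(path)
    return (os.major(st.st_dev), os.minor(st.st_dev)), st.st_ino


def identity_changed(before, after):
    """True if a file's (device, inode) identity differs between two observations."""
    return before != after
```

An application that compares device_identity() results taken before and
after the transition is the kind of consumer that would notice a
renumbered device.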

Is there another case, besides buggy applications with hard-coded
device numbers, that needs specific device numbers?

Eric




