[openstack-dev] [nova][libvirt][lxc] Attach volumes to LXC is broken

Vladik Romanovsky vladik.romanovsky at enovance.com
Thu Jun 19 14:58:56 UTC 2014


Hello Everyone,

I've recently been working on bug/1269990 ("attached volumes to LXC are
being lost after reboot/power on"). After fixing it, I realized that the
initial attach-volume operation on LXC is not functional at all (bug/1330981).

I've described the problem in detail in the bug. In essence, it all comes down
to the fact that /dev/nbdX or /dev/loopX is being set as the root_device_name
when LXC is started. Later, when attaching a new volume, these device
names are not parsed properly, nor can a disk_bus be chosen for the new
volume.
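
To illustrate the parsing problem, here is a minimal sketch. It is not the
actual nova code: the regular expression and the split_device_name() helper
are assumptions made up for this example. It shows how a pattern written for
names like /dev/vda or /dev/sdb1 fails to split /dev/nbdX and /dev/loopX into
a prefix and a letter suffix.

```python
import re

# Hypothetical pattern for "ordinary" disk names: an optional 'x', an
# optional bus letter, a 'd', then one or more letters and an optional
# partition number (e.g. /dev/vda, /dev/sdb1, /dev/xvdc).
_DISK_RE = re.compile(r"^(/dev/x?[a-z]?d)([a-z]+)[0-9]*$")


def split_device_name(name):
    """Return (prefix, letters) for a disk name, or None if it does not fit."""
    match = _DISK_RE.match(name)
    if match is None:
        return None
    return match.groups()


if __name__ == "__main__":
    for dev in ("/dev/vda", "/dev/sdb1", "/dev/nbd0", "/dev/loop3"):
        print(dev, split_device_name(dev))
```

With a pattern like this, the nbd and loop names yield None, so there is no
prefix from which the next device name could be derived.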

Saving the nbd device as root_device_name was introduced by patch
6277f8aa9 - Change-Id: I063fd3a9856bba089bcde5cdefd2576e2eb2b0e9,
to fix a problem where nbd and loop devices were not properly disconnected when the LXC instance
was terminated.
These devices were leaking because, while starting LXC, we unmount the LXC rootfs,
and thus clean up the LXC space, in disk.clean_lxc_namespace() -
https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L3570
As a result, the disk API is unable to find the relevant nbd/loop device to disconnect
when terminating an LXC instance.
https://github.com/openstack/nova/blob/master/nova/virt/disk/api.py#L250

The possible solutions to these problems that I could come up with:
1. Stop saving nbd/loop devices as root_device_name, and also stop calling
   disk.clean_lxc_namespace(), letting terminate_instance()
   unmount the LXC rootfs and disconnect the relevant devices
   (relying on an existing mechanism).
   This would also allow attach_volume() to succeed.
   
2. Adjust the get_next_device_name() method to explicitly handle nbd and loop devices in
   https://github.com/openstack/nova/blob/master/nova/compute/utils.py#L129
   
3. Add an additional field to the instance model, other than root_device_name,
   to save the nbd/loop devices that should be disconnected on instance termination.
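
As a rough idea of what option 2 could look like, here is a standalone sketch.
It does not use the real get_next_device_name() signature; the function name
next_device_name(), its parameters, both regular expressions and the default
prefix "/dev/vd" are all assumptions chosen for illustration. When the root
device is an nbd or loop device, it falls back to a default prefix instead of
trying to parse the root device name. Whether a letter should be reserved for
the root filesystem in that case is left open.

```python
import re
import string

# Hypothetical patterns, not taken from nova.
_DISK_RE = re.compile(r"^(/dev/x?[a-z]?d)([a-z]+)[0-9]*$")
_HOST_ONLY_RE = re.compile(r"^/dev/(nbd|loop)[0-9]+$")


def next_device_name(root_device_name, used_names, default_prefix="/dev/vd"):
    """Return the first free single-letter device name.

    nbd/loop root devices carry no usable prefix, so default_prefix
    (an assumption of this sketch) is used for them instead.
    """
    if _HOST_ONLY_RE.match(root_device_name):
        prefix = default_prefix
    else:
        match = _DISK_RE.match(root_device_name)
        if match is None:
            raise ValueError("unrecognised device name: %s" % root_device_name)
        prefix = match.group(1)

    # Collect the letter suffixes already in use; names that do not look
    # like ordinary disks (e.g. /dev/nbd0) are skipped.
    used_letters = set()
    for name in used_names:
        match = _DISK_RE.match(name)
        if match:
            used_letters.add(match.group(2))

    for letter in string.ascii_lowercase:
        if letter not in used_letters:
            return prefix + letter
    raise ValueError("no free device names")
```

For example, next_device_name("/dev/nbd0", ["/dev/nbd0"]) would pick a name
under the default prefix rather than failing on the nbd name.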
   
Not sure which option is better. Also, it is not entirely clear to me why clean_lxc_namespace
was/is needed.

I'd appreciate your opinions and feedback, especially if I'm missing anything or if the explanation is too confusing :)

Thanks,
Vladik 


