Non predictable device name/path in libvirt/qemu config with multipath
Hello Stackers,

Today I had a stressful problem. While doing system upgrades on my Ubuntu-based OpenStack-Ansible deployment, with the Pure Storage cinder driver and boot-from-volume instances, I ran into the following.

While migrating instances to evacuate a compute node, several instances (Kubernetes workers with 16 GB RAM) never completed live migration. After retrying several times and attempting to change some nova/libvirt configuration (which was certainly not taken into account, because qemu was not restarted on the source host), I decided to cold migrate those instances. So I simply used:

    openstack server migrate instance-uuid

The instances tried to start on another compute node, but failed. Two of them hit the problem I want to talk about; the third just didn't find the volume/LUN on the storage array (also very weird).

So, why didn't the other two start? They wanted to use a multipath map that was already in use, and really in use, by another instance. A multipath -f mpathxy was attempted, which made the multipath map unusable for the instance that was using it. It was the database of an important app, to add a little bit of stress.

This does not happen with successful live migrations; I only see it with cold migrations. But with some instances I really can't get a successful live migration, so it would be useful to also have a working cold migration (I mean stop, migrate, and start the instance).

So, my questions are:

1. Should multipath.conf enable user friendly names or not? I didn't find a recommended multipath config. I saw both: the kolla templates do not use friendly names, while the StarlingX ones and some old bugs do use them (so that the data flush is really effective on the paths...). I haven't found out yet whether openstack-ansible manages multipath or not. I would say that without friendly names we get predictable names in /dev/mapper/ (based on the WWN) which, if used, couldn't lead to a device being "stolen", provided every instance records that predictable device.

2. Is there some configuration for libvirt/qemu? Because in the libvirt/qemu instance config file, the devices are not the multipath map but the /dev/dm-XY that the multipath map symlinks to. That is also not predictable and changes, especially when moving between compute hosts with a different number of accessible volumes.

Has anyone already had these problems, and how did you handle them?
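To make the second point concrete, here is a minimal Python sketch (my own illustration, not something nova, os-brick or libvirt provides) that lists which kernel dm-N node each /dev/mapper entry currently resolves to on a host. The function name map_dm_nodes is hypothetical. It only reads symlinks, so it should be safe to run on a compute node, and comparing its output on two hosts shows how the same map can sit behind different /dev/dm-XY nodes.

```python
import os


def map_dm_nodes(mapper_dir="/dev/mapper"):
    """Return {kernel node name: [mapper names]} for the symlinks in mapper_dir.

    Hypothetical helper for illustration only.
    """
    mapping = {}
    for name in sorted(os.listdir(mapper_dir)):
        path = os.path.join(mapper_dir, name)
        if not os.path.islink(path):
            # Skip non-symlink entries such as /dev/mapper/control.
            continue
        # Symlinks in /dev/mapper normally point to ../dm-N.
        node = os.path.basename(os.path.realpath(path))
        mapping.setdefault(node, []).append(name)
    return mapping


if __name__ == "__main__":
    if os.path.isdir("/dev/mapper"):
        for node, names in sorted(map_dm_nodes().items()):
            print(f"/dev/{node} <- {', '.join(names)}")
```

With friendly names disabled, the names on the right-hand side would be the WWIDs, which stay the same across hosts, while the dm-N numbers on the left depend on the order in which maps were created on each host.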
participants (1): Gilles Mocellin