On 12/19/2018 9:09 AM, Torin Woltjer wrote:
After doing live migrations for some instances, and those migrations failing, the attached volumes show duplicates of the same attachment. This is the error message I get when the migration fails: https://pastebin.com/raw/3mxSVnRR
openstack volume list shows the volume is attached to the instance twice:

| a424fd41-a72f-4099-9c1a-47114d43c1dc | zktech-wpdb1 | in-use | 50 | Attached to ead8ecc3-f473-4672-a67b-c44534c6042d on /dev/vda Attached to ead8ecc3-f473-4672-a67b-c44534c6042d on /dev/vda
How do I remove the duplicate attachment, and why could the live migration be failing in the first place? Not all migrations fail, but sometimes they do and I have multiple volumes with duplicate attachments.
Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com <http://www.granddial.com>
From the log, it looks like the live migration is timing out and aborting itself:

2018-12-17 13:47:12.449 16987 INFO nova.virt.libvirt.driver [req-7bc758de-b2e4-461b-a971-f79be6cd4703 313d1247d7b845da9c731eec53e50a26 2f693c782fa748c2baece8db95b4ba5b - default default] [instance: ead8ecc3-f473-4672-a67b-c44534c6042d] Migration running for 2280 secs, memory 3% remaining; (bytes processed=2541888418, remaining=126791680, total=3226542080)

2018-12-17 13:47:29.591 16987 WARNING nova.virt.libvirt.migration [req-7bc758de-b2e4-461b-a971-f79be6cd4703 313d1247d7b845da9c731eec53e50a26 2f693c782fa748c2baece8db95b4ba5b - default default] [instance: ead8ecc3-f473-4672-a67b-c44534c6042d] Live migration not completed after 2400 sec

2018-12-17 13:47:30.097 16987 WARNING nova.virt.libvirt.driver [req-7bc758de-b2e4-461b-a971-f79be6cd4703 313d1247d7b845da9c731eec53e50a26 2f693c782fa748c2baece8db95b4ba5b - default default] [instance: ead8ecc3-f473-4672-a67b-c44534c6042d] Migration operation was cancelled

2018-12-17 13:47:30.299 16987 ERROR nova.virt.libvirt.driver [req-7bc758de-b2e4-461b-a971-f79be6cd4703 313d1247d7b845da9c731eec53e50a26 2f693c782fa748c2baece8db95b4ba5b - default default] [instance: ead8ecc3-f473-4672-a67b-c44534c6042d] Live Migration failure: operation aborted: migration job: canceled by client: libvirtError: operation aborted: migration job: canceled by client

After that, the _rollback_live_migration method is called, which tries to clean up the volume attachments that were created against the destination host during pre_live_migration.
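To see how close the migration was to finishing when it hit the timeout, the "Migration running" progress line can be parsed directly. The following is only a rough sketch: the regular expression and the migration_progress helper are made up for illustration and are not part of nova, and the sample line is the message text copied from the log above.

```python
import re

# Message part of the progress line from the log above.
LINE = ("Migration running for 2280 secs, memory 3% remaining; "
        "(bytes processed=2541888418, remaining=126791680, total=3226542080)")

# Hypothetical pattern for this one message format; not taken from nova.
PATTERN = re.compile(
    r"Migration running for (?P<secs>\d+) secs.*?"
    r"processed=(?P<processed>\d+), remaining=(?P<remaining>\d+), "
    r"total=(?P<total>\d+)"
)


def migration_progress(line):
    """Return elapsed time, percent remaining and average transfer rate.

    Returns None if the line is not a progress message.
    """
    match = PATTERN.search(line)
    if match is None:
        return None
    fields = {name: int(value) for name, value in match.groupdict().items()}
    secs, total = fields["secs"], fields["total"]
    return {
        "elapsed_secs": secs,
        # Share of the reported total that is still to be transferred.
        "percent_remaining": 100.0 * fields["remaining"] / total if total else 0.0,
        # Average over the whole run, not the current rate.
        "avg_bytes_per_sec": fields["processed"] / secs if secs else 0.0,
    }


if __name__ == "__main__":
    print(migration_progress(LINE))
```

For the line above this works out to a little under 4% of memory still remaining after 2280 seconds, at an average of roughly 1.1 MB/s. That is only an average over the whole run, so it cannot say on its own whether the network is slow or the guest is dirtying memory faster than it can be copied.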
The attachment cleanup fails because it looks like the user token has expired:

2018-12-17 13:47:30.685 16987 INFO nova.compute.manager [req-7bc758de-b2e4-461b-a971-f79be6cd4703 313d1247d7b845da9c731eec53e50a26 2f693c782fa748c2baece8db95b4ba5b - default default] [instance: ead8ecc3-f473-4672-a67b-c44534c6042d] Swapping old allocation on 3e32d595-bd1f-4136-a7f4-c6703d2fbe18 held by migration 17bec61d-544d-47e0-a1c1-37f9d7385286 for instance

2018-12-17 13:47:32.450 16987 ERROR nova.volume.cinder [req-7bc758de-b2e4-461b-a971-f79be6cd4703 313d1247d7b845da9c731eec53e50a26 2f693c782fa748c2baece8db95b4ba5b - default default] Delete attachment failed for attachment 58997d5b-24f0-4073-819e-97916fb1ee19. Error: The request you have made requires authentication. (HTTP 401) Code: 401: Unauthorized: The request you have made requires authentication. (HTTP 401)

2018-12-17 13:47:32.497 16987 WARNING nova.virt.libvirt.driver [req-7bc758de-b2e4-461b-a971-f79be6cd4703 313d1247d7b845da9c731eec53e50a26 2f693c782fa748c2baece8db95b4ba5b - default default] [instance: ead8ecc3-f473-4672-a67b-c44534c6042d] Error monitoring migration: The request you have made requires authentication. (HTTP 401): Unauthorized: The request you have made requires authentication. (HTTP 401)

Given that, you'd probably be interested in configuring nova to use service user tokens:

https://docs.openstack.org/nova/latest/configuration/config.html#service-use...

With that feature, you configure nova with service user credentials so that if the user token times out, keystone automatically re-authenticates using the service user credentials. More details can be found in the spec:

https://specs.openstack.org/openstack/nova-specs/specs/ocata/implemented/use...

--
Thanks,
Matt
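As a rough illustration of the service user token setup mentioned above, the sketch below embeds an example [service_user] section for nova.conf and checks that the basic options are present. The option names are the ones I believe nova and keystoneauth use for password authentication. Every value (URL, user name, password, project, domains) is a placeholder, and the check_service_user helper and its REQUIRED list are assumptions made for this example rather than anything nova ships.

```python
import configparser

# Example nova.conf fragment. All values are placeholders for illustration.
SAMPLE_NOVA_CONF = """
[service_user]
send_service_user_token = true
auth_type = password
auth_url = https://keystone.example.com/identity
username = nova
password = REPLACE_ME
user_domain_name = Default
project_name = service
project_domain_name = Default
"""

# Assumed minimum for password auth; adjust for your auth plugin.
REQUIRED = ("auth_type", "auth_url", "username", "password")


def check_service_user(conf_text):
    """Return a list of problems found in the [service_user] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("service_user"):
        return ["missing [service_user] section"]
    section = parser["service_user"]
    problems = []
    # The feature is off unless this flag is explicitly enabled.
    if not section.getboolean("send_service_user_token", fallback=False):
        problems.append("send_service_user_token is not enabled")
    for key in REQUIRED:
        if not section.get(key, fallback="").strip():
            problems.append("missing option: " + key)
    return problems


if __name__ == "__main__":
    print(check_service_user(SAMPLE_NOVA_CONF) or "looks complete")
```

The same section would need to be present in nova.conf on the compute nodes, and the services restarted, before it takes effect. Check the configuration reference linked above for the authoritative option list for your release.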