Hello all,
For a while now we've been attempting to track down some infrequent but
annoying Tempest test cleanup failures in CI when detaching volumes from
an instance. After rewriting part of the Tempest logic controlling the
cleanup, we've finally been able to confirm that these failures are
caused by a kernel panic within the instance at boot time, as documented
in the following bug:
Failure to detach volume during Tempest test cleanup due to APIC related
kernel panic within the guest OS
https://bugs.launchpad.net/nova/+bug/1939108
This was previously found in 2014, but the fix proposed to Nova at the
time only solves it for instances booted with a supplied kernel
image:
cirros 0.3.1 fails to boot
https://bugs.launchpad.net/cirros/+bug/1312199
Use no_timer_check with soft-qemu
https://review.opendev.org/c/openstack/nova/+/96090
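
To illustrate why that change does not help with a full disk image, here
is a rough sketch of the behaviour as I understand it. This is not
Nova's actual code; the function and parameter names are hypothetical
and only meant to show the logic:

```python
from typing import Optional


def kernel_cmdline(base_cmdline: str, virt_type: str,
                   has_kernel_image: bool) -> Optional[str]:
    """Return the guest kernel command line, or None if none can be set.

    Hypothetical helper: the names here are illustrative assumptions and
    do not correspond to functions in Nova.
    """
    if not has_kernel_image:
        # A full disk image boots through its own bootloader, so there
        # is no kernel command line for the virt driver to extend.
        return None

    args = base_cmdline.split()
    # Only software-emulated (TCG) guests get the workaround.
    if virt_type == "qemu" and "no_timer_check" not in args:
        args.append("no_timer_check")
    return " ".join(args)
```

With a supplied kernel image and virt_type=qemu the argument is
appended; with a full disk image there is nothing to append it to, which
is the case our CI hits.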
Most (all?) of our CI currently running with [libvirt]virt_type=qemu
uses the full Cirros 0.5.2 image. Does anyone have any suggestions on
the best way of modifying the image(s) we use in CI to use the
no_timer_check kernel command line arg?
Thanks in advance,
--
Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76