On 26-07-21 08:14:46, Aleksander Wojtal wrote:
Hello,
During our testing we reboot the compute host while a volume attach task is in progress. The result is as follows.
I was about to ask for more details when I saw this initial line. This is almost expected if you restart n-cpu while it's servicing a request such as a volume attach. The compute service can't pick these tasks back up and complete them once it restarts. That said, given the `in-use` volume state below, we should have basically completed the attach before the host reboot, so I don't understand why it didn't show up in the instance.
Volume is marked as attached to VM (also in DB)

ceeinfra@lcm1:~> openstack volume list
+--------------------------------------+-----------------------------------------------------------------------------+-----------+------+--------------------------------------------------------------------------------------------------------------+
| ID                                   | Name                                                                        | Status    | Size | Attached to                                                                                                  |
+--------------------------------------+-----------------------------------------------------------------------------+-----------+------+--------------------------------------------------------------------------------------------------------------+
| c52d406c-5587-445d-9da1-436ffbbe3541 | vol_neXt-488                                                                | available |   10 |                                                                                                              |
| f3e45efc-35eb-4f95-8e71-f50b5cb69028 | RebootOfComputeHostWhilePerformingVolumeOperations-Volume-0618_19_45_12_669 | in-use    |   20 | Attached to neXt-377_VM1--RebootOfComputeHostWhilePerformingVolumeOperations--0618_19_44_16_775 on /dev/vdb  |
| 1bd81392-e2c9-4725-93c8-baa484924f21 | vol_neXt-488                                                                | available |   10 |                                                                                                              |
| 919db17c-0729-409c-bf59-74539edcea47 | vol_neXt-488                                                                | available |   10 |                                                                                                              |
+--------------------------------------+-----------------------------------------------------------------------------+-----------+------+--------------------------------------------------------------------------------------------------------------+
From VM perspective too.

ceeinfra@lcm1:~> openstack server show neXt-377_VM1--RebootOfComputeHostWhilePerformingVolumeOperations--0618_19_44_16_775
+-------------------------------------+-------------------------------------------------------------------------------------+
| Field                               | Value                                                                               |
+-------------------------------------+-------------------------------------------------------------------------------------+
| OS-DCF:diskConfig                   | MANUAL                                                                              |
| OS-EXT-AZ:availability_zone         | nova                                                                                |
| OS-EXT-SRV-ATTR:host                | compute-0-11.k2.ericsson.se                                                         |
| OS-EXT-SRV-ATTR:hypervisor_hostname | compute-0-11.k2.ericsson.se                                                         |
| OS-EXT-SRV-ATTR:instance_name       | instance-00001edc                                                                   |
| OS-EXT-STS:power_state              | Running                                                                             |
| OS-EXT-STS:task_state               | None                                                                                |
| OS-EXT-STS:vm_state                 | active                                                                              |
| OS-SRV-USG:launched_at              | 2021-06-18T14:14:58.000000                                                          |
| OS-SRV-USG:terminated_at            | None                                                                                |
| accessIPv4                          |                                                                                     |
| accessIPv6                          |                                                                                     |
| addresses                           | NovaController-Network-0618_19_44_16_852=10.0.0.5                                   |
| config_drive                        | True                                                                                |
| created                             | 2021-06-18T14:14:19Z                                                                |
| flavor                              | m1.small (2)                                                                        |
| hostId                              | 5078bfd4fabd8b2e7c9f4554b3ac08d536ceebb5f393db4d38c8ff60                            |
| id                                  | 95ed49ed-b3f8-46ae-80a0-544b939c50b7                                                |
| image                               | JCAT_Common_CirrOS_i386 (b71f3fbd-59f1-4c29-a296-5065fe0406f7)                      |
| key_name                            | None                                                                                |
| name                                | neXt-377_VM1--RebootOfComputeHostWhilePerformingVolumeOperations--0618_19_44_16_775 |
| progress                            | 0                                                                                   |
| project_id                          | b63725c8ebaf43ebabbed497f41eb71c                                                    |
| properties                          | ha-policy='managed-on-host'                                                         |
| scheduler_hints                     | {}                                                                                  |
| security_groups                     | name='default'                                                                      |
| status                              | ACTIVE                                                                              |
| updated                             | 2021-06-18T14:22:07Z                                                                |
| user_id                             | f490f07b813748698e56ca7641d46a72                                                    |
| volumes_attached                    | id='f3e45efc-35eb-4f95-8e71-f50b5cb69028'                                           |
+-------------------------------------+-------------------------------------------------------------------------------------+
What about `openstack server volume list 95ed49ed-b3f8-46ae-80a0-544b939c50b7`? Did you restart the instance? I'm assuming a hard reboot would resolve the issue here.
From Libvirt perspective, volume is not attached to VM
compute-0-11:/var/log # virsh dumpxml instance-00001edc
<domain type='kvm' id='1'>
  <name>instance-00001edc</name>
  <uuid>95ed49ed-b3f8-46ae-80a0-544b939c50b7</uuid>
(...)
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='directsync'/>
      <source file='/var/lib/nova/instances/95ed49ed-b3f8-46ae-80a0-544b939c50b7/disk' index='2'/>
      <backingStore type='file' index='3'>
        <format type='raw'/>
        <source file='/var/lib/nova/instances/_base/af82cd8216adb002916bb9e422a3fbb637022ec6'/>
        <backingStore/>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw' cache='directsync'/>
      <source file='/var/lib/nova/instances/95ed49ed-b3f8-46ae-80a0-544b939c50b7/disk.config' index='1'/>
      <backingStore/>
      <target dev='hdd' bus='ide'/>
      <readonly/>
      <alias name='ide0-1-1'/>
      <address type='drive' controller='0' bus='1' target='0' unit='1'/>
    </disk>
    <controller type='usb' index='0' model='piix3-uhci'>
(...)
</domain>
It's easier to use `virsh domblklist 95ed49ed-b3f8-46ae-80a0-544b939c50b7` to list the attached disks FWIW.
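If you want to script that comparison, something like the following rough sketch might do. It assumes the usual tabular `virsh domblklist` output (a `Target  Source` header, a dashed rule, then one row per disk) and matches on device name only, which is itself an assumption since the device Nova records is just what it requested. The function names are made up for illustration, and the sample rows reuse the two disks from the dumpxml output above.

```python
from __future__ import annotations


def parse_domblklist(output: str) -> dict[str, str]:
    """Map target device name -> source path from `virsh domblklist` text."""
    disks = {}
    for line in output.splitlines():
        parts = line.split(None, 1)
        # Blank lines and the dashed rule yield fewer than two fields;
        # the header row is recognised by its first column.
        if len(parts) != 2 or parts[0] == "Target":
            continue
        disks[parts[0]] = parts[1].strip()
    return disks


def missing_from_libvirt(nova_devices: list[str], domblklist_output: str) -> list[str]:
    """Return the devices Nova reports that libvirt does not list as targets."""
    attached = parse_domblklist(domblklist_output)
    return [dev for dev in nova_devices if dev.rsplit("/", 1)[-1] not in attached]


# Sample text assumed to mirror the domain above (vda and hdd only).
sample = """\
 Target   Source
------------------------------------------------------------
 vda      /var/lib/nova/instances/95ed49ed-b3f8-46ae-80a0-544b939c50b7/disk
 hdd      /var/lib/nova/instances/95ed49ed-b3f8-46ae-80a0-544b939c50b7/disk.config
"""

# Nova claims the volume is on /dev/vdb; libvirt has no vdb target.
print(missing_from_libvirt(["/dev/vdb"], sample))
```

A non-empty result means Nova and libvirt disagree about what is attached, which is the situation described in this thread.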
The system does not provide any indication of the problem. There should be some kind of indication to the user that the volume attachment was not completed.
Yeah, there's no real way for n-cpu to handle this. The event log for the action isn't going to be complete, but there's no state machine for volume attachments that we can rely on here.

Would you mind opening a bug with the above and the following event details for the attach request:

$ openstack server event list 95ed49ed-b3f8-46ae-80a0-544b939c50b7
$ openstack server event show 95ed49ed-b3f8-46ae-80a0-544b939c50b7 $volume-attach-request-id

https://docs.openstack.org/api-guide/compute/faults.html

Cheers,

--
Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76
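
For anyone who wants to check the event details mechanically, here is a minimal sketch of spotting an incomplete action. It assumes the action has already been loaded as a dict (for example from the JSON output of `openstack server event show`) and that it carries an `events` list whose entries have `event`, `finish_time` and `result` keys, as in the Compute API's instance action response. The function name and the sample payload are hypothetical, not taken from this report.

```python
from __future__ import annotations


def incomplete_events(action: dict) -> list[str]:
    """Return names of events that never finished or did not succeed."""
    flagged = []
    for event in action.get("events") or []:
        finished = event.get("finish_time") is not None
        succeeded = event.get("result") == "Success"
        if not (finished and succeeded):
            flagged.append(event.get("event", "<unnamed>"))
    return flagged


# Hypothetical payload for illustration only: an attach that was
# interrupted before the compute service could record a result.
example_action = {
    "action": "attach_volume",
    "events": [
        {
            "event": "compute_attach_volume",
            "start_time": "2021-01-01T00:00:00.000000",
            "finish_time": None,
            "result": None,
        },
    ],
}

print(incomplete_events(example_action))
```

An event left without a `finish_time` is the kind of incomplete log entry described above. It is only a hint that the request was interrupted, not proof of the state of the attachment.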