DCN compute service goes down when an instance is scheduled to launch | wallaby | tripleo
Swogat Pradhan
swogatpradhan22 at gmail.com
Wed Mar 22 11:25:08 UTC 2023
Update:
Here is the log from creating a volume using a cirros image:
2023-03-22 11:04:38.449 109 INFO cinder.volume.flows.manager.create_volume
[req-646b9ac8-a5a7-45ac-a96d-8dd6bb45da17 b240e3e89d99489284cd731e75f2a5db
4160ce999a31485fa643aed0936dfef0 - - -] Volume
bf341343-6609-4b8c-b9e0-93e2a89c8c8f: being created as image with
specification: {'status': 'creating', 'volume_name':
'volume-bf341343-6609-4b8c-b9e0-93e2a89c8c8f', 'volume_size': 4,
'image_id': '736d8779-07cd-4510-bab2-adcb653cc538', 'image_location':
('rbd://a5ae877c-bcba-53fe-8336-450e63014757/images/736d8779-07cd-4510-bab2-adcb653cc538/snap',
[{'url':
'rbd://a5ae877c-bcba-53fe-8336-450e63014757/images/736d8779-07cd-4510-bab2-adcb653cc538/snap',
'metadata': {'store': 'ceph'}}, {'url':
'rbd://a8d5f1f5-48e7-5ede-89ab-8aca59b6397b/images/736d8779-07cd-4510-bab2-adcb653cc538/snap',
'metadata': {'store': 'dcn02'}}]), 'image_meta': {'name': 'cirros',
'disk_format': 'qcow2', 'container_format': 'bare', 'visibility': 'public',
'size': 16338944, 'virtual_size': 117440512, 'status': 'active',
'checksum': '1d3062cd89af34e419f7100277f38b2b', 'protected': False,
'min_ram': 0, 'min_disk': 0, 'owner': '4160ce999a31485fa643aed0936dfef0',
'os_hidden': False, 'os_hash_algo': 'sha512', 'os_hash_value':
'553d220ed58cfee7dafe003c446a9f197ab5edf8ffc09396c74187cf83873c877e7ae041cb80f3b91489acf687183adcd689b53b38e3ddd22e627e7f98a09c46',
'id': '736d8779-07cd-4510-bab2-adcb653cc538', 'created_at':
datetime.datetime(2023, 3, 22, 10, 44, 12, tzinfo=datetime.timezone.utc),
'updated_at': datetime.datetime(2023, 3, 22, 10, 54, 1,
tzinfo=datetime.timezone.utc), 'locations': [{'url':
'rbd://a5ae877c-bcba-53fe-8336-450e63014757/images/736d8779-07cd-4510-bab2-adcb653cc538/snap',
'metadata': {'store': 'ceph'}}, {'url':
'rbd://a8d5f1f5-48e7-5ede-89ab-8aca59b6397b/images/736d8779-07cd-4510-bab2-adcb653cc538/snap',
'metadata': {'store': 'dcn02'}}], 'direct_url':
'rbd://a5ae877c-bcba-53fe-8336-450e63014757/images/736d8779-07cd-4510-bab2-adcb653cc538/snap',
'tags': [], 'file': '/v2/images/736d8779-07cd-4510-bab2-adcb653cc538/file',
'stores': 'ceph,dcn02', 'properties': {'os_glance_failed_import': '',
'os_glance_importing_to_stores': '', 'owner_specified.openstack.md5': '',
'owner_specified.openstack.object': 'images/cirros',
'owner_specified.openstack.sha256': ''}}, 'image_service':
<cinder.image.glance.GlanceImageService object at 0x7f449ded1198>}
2023-03-22 11:06:16.570 109 INFO cinder.image.image_utils
[req-646b9ac8-a5a7-45ac-a96d-8dd6bb45da17 b240e3e89d99489284cd731e75f2a5db
4160ce999a31485fa643aed0936dfef0 - - -] Image download 15.58 MB at 0.16 MB/s
2023-03-22 11:07:54.023 109 WARNING py.warnings
[req-646b9ac8-a5a7-45ac-a96d-8dd6bb45da17 b240e3e89d99489284cd731e75f2a5db
4160ce999a31485fa643aed0936dfef0 - - -]
/usr/lib/python3.6/site-packages/oslo_utils/imageutils.py:75:
FutureWarning: The human format is deprecated and the format parameter will
be removed. Use explicitly json instead in version 'xena'
category=FutureWarning)
2023-03-22 11:11:12.161 109 WARNING py.warnings
[req-646b9ac8-a5a7-45ac-a96d-8dd6bb45da17 b240e3e89d99489284cd731e75f2a5db
4160ce999a31485fa643aed0936dfef0 - - -]
/usr/lib/python3.6/site-packages/oslo_utils/imageutils.py:75:
FutureWarning: The human format is deprecated and the format parameter will
be removed. Use explicitly json instead in version 'xena'
category=FutureWarning)
2023-03-22 11:11:12.163 109 INFO cinder.image.image_utils
[req-646b9ac8-a5a7-45ac-a96d-8dd6bb45da17 b240e3e89d99489284cd731e75f2a5db
4160ce999a31485fa643aed0936dfef0 - - -] Converted 112.00 MB image at 112.00
MB/s
2023-03-22 11:11:14.998 109 INFO cinder.volume.flows.manager.create_volume
[req-646b9ac8-a5a7-45ac-a96d-8dd6bb45da17 b240e3e89d99489284cd731e75f2a5db
4160ce999a31485fa643aed0936dfef0 - - -] Volume
volume-bf341343-6609-4b8c-b9e0-93e2a89c8c8f
(bf341343-6609-4b8c-b9e0-93e2a89c8c8f): created successfully
2023-03-22 11:11:15.195 109 INFO cinder.volume.manager
[req-646b9ac8-a5a7-45ac-a96d-8dd6bb45da17 b240e3e89d99489284cd731e75f2a5db
4160ce999a31485fa643aed0936dfef0 - - -] Created volume successfully.
The image is present in the dcn02 store, but cinder still downloaded the image
at 0.16 MB/s and then created the volume.
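One way to check whether the volume was COW-cloned from the local dcn02
images pool or written out from a full download (a sketch, assuming the
volume and image IDs from the log above) is to look at the RBD parent:

[ceph: root at dcn02-ceph-all-0 /]# rbd -p volumes info volume-bf341343-6609-4b8c-b9e0-93e2a89c8c8f | grep parent

A clone of the local glance store would show something like
"parent: images/736d8779-07cd-4510-bab2-adcb653cc538@snap"; no parent line
means the image was downloaded and converted, which would also be expected
here since the image's disk_format is qcow2 and, to my knowledge, the cinder
RBD driver only COW-clones raw images.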
With regards,
Swogat Pradhan
On Tue, Mar 21, 2023 at 6:10 PM Swogat Pradhan <swogatpradhan22 at gmail.com>
wrote:
> Hi John,
> This seems to be an issue.
> When I deployed the DCN Ceph clusters in both dcn01 and dcn02, the --cluster
> parameter was set to the respective cluster names, but the config files were
> still created as ceph.conf and the keyring as ceph.client.openstack.keyring.
>
> This caused issues in Glance as well, since the naming convention of the
> files didn't match the cluster names, so I had to manually rename the
> central Ceph conf file as follows:
>
> [root at dcn02-compute-0 ~]# cd /var/lib/tripleo-config/ceph/
> [root at dcn02-compute-0 ceph]# ll
> total 16
> -rw-------. 1 root root 257 Mar 13 13:56
> ceph_central.client.openstack.keyring
> -rw-r--r--. 1 root root 428 Mar 13 13:56 ceph_central.conf
> -rw-------. 1 root root 205 Mar 15 18:45 ceph.client.openstack.keyring
> -rw-r--r--. 1 root root 362 Mar 15 18:45 ceph.conf
> [root at dcn02-compute-0 ceph]#
>
> ceph.conf and ceph.client.openstack.keyring contain the fsid of the
> respective clusters in both dcn01 and dcn02.
> In the above CLI output, ceph.conf and ceph.client.openstack.keyring are the
> files used to access the dcn02 Ceph cluster, and the ceph_central* files are
> used to access the central Ceph cluster.
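>
> As a quick sanity check (a sketch, based on the layout above), the fsid in
> each conf file can be compared against the fsid reported by the respective
> cluster:
>
> [root at dcn02-compute-0 ceph]# grep fsid ceph.conf ceph_central.conf
> ceph.conf:fsid = <fsid of the dcn02 cluster>
> ceph_central.conf:fsid = <fsid of the central cluster>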
>
> glance multistore config:
> [dcn02]
> rbd_store_ceph_conf=/etc/ceph/ceph.conf
> rbd_store_user=openstack
> rbd_store_pool=images
> rbd_thin_provisioning=False
> store_description=dcn02 rbd glance store
>
> [ceph_central]
> rbd_store_ceph_conf=/etc/ceph/ceph_central.conf
> rbd_store_user=openstack
> rbd_store_pool=images
> rbd_thin_provisioning=False
> store_description=Default glance store backend.
>
>
> With regards,
> Swogat Pradhan
>
> On Tue, Mar 21, 2023 at 5:52 PM John Fulton <johfulto at redhat.com> wrote:
>
>> On Tue, Mar 21, 2023 at 8:03 AM Swogat Pradhan
>> <swogatpradhan22 at gmail.com> wrote:
>> >
>> > Hi,
>> > Seems like cinder is not using the local ceph.
>>
>> That explains the issue. It's a misconfiguration.
>>
>> I hope this is not a production system since the mailing list now has
>> the cinder.conf which contains passwords.
>>
>> The section that looks like this:
>>
>> [tripleo_ceph]
>> volume_backend_name=tripleo_ceph
>> volume_driver=cinder.volume.drivers.rbd.RBDDriver
>> rbd_ceph_conf=/etc/ceph/ceph.conf
>> rbd_user=openstack
>> rbd_pool=volumes
>> rbd_flatten_volume_from_snapshot=False
>> rbd_secret_uuid=<redacted>
>> report_discard_supported=True
>>
>> Should be updated to refer to the local DCN ceph cluster and not the
>> central one. Use the ceph conf file for that cluster and ensure the
>> rbd_secret_uuid corresponds to that one.
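>>
>> A corrected section might look roughly like this (a sketch only; the conf
>> file name depends on how the DCN cluster's config is exported to the
>> compute nodes in your deployment):
>>
>> [tripleo_ceph]
>> volume_backend_name=tripleo_ceph
>> volume_driver=cinder.volume.drivers.rbd.RBDDriver
>> rbd_ceph_conf=/etc/ceph/<dcn02 cluster>.conf
>> rbd_user=openstack
>> rbd_pool=volumes
>> rbd_secret_uuid=<fsid of the dcn02 cluster>
>> report_discard_supported=True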
>>
>> TripleO’s convention is to set the rbd_secret_uuid to the FSID of the
>> Ceph cluster. The FSID should be in the ceph.conf file. The
>> tripleo_nova_libvirt role will use virsh secret-* commands so that
>> libvirt can retrieve the cephx secret using the FSID as a key. This
>> can be confirmed with `podman exec nova_virtsecretd virsh
>> secret-get-value $FSID`.
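>>
>> For example (a sketch, not verified on your system), on the DCN compute
>> node you could compare the FSID in the local ceph conf with the secret
>> libvirt knows about:
>>
>> $ sudo grep fsid /etc/ceph/ceph.conf
>> $ sudo podman exec nova_virtsecretd virsh secret-list
>> $ sudo podman exec nova_virtsecretd virsh secret-get-value <fsid from ceph.conf>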
>>
>> The documentation describes how to configure the central and DCN sites
>> correctly but an error seems to have occurred while you were following
>> it.
>>
>>
>> https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/distributed_multibackend_storage.html
>>
>> John
>>
>> >
>> > Ceph Output:
>> > [ceph: root at dcn02-ceph-all-0 /]# rbd -p images ls -l
>> > NAME SIZE PARENT FMT PROT
>> LOCK
>> > 2abfafaa-eff4-4c2e-a538-dc2e1249ab65 8 MiB 2
>> excl
>> > 55f40c8a-8f79-48c5-a52a-9b679b762f19 16 MiB 2
>> > 55f40c8a-8f79-48c5-a52a-9b679b762f19 at snap 16 MiB 2 yes
>> > 59f6a9cd-721c-45b5-a15f-fd021b08160d 321 MiB 2
>> > 59f6a9cd-721c-45b5-a15f-fd021b08160d at snap 321 MiB 2 yes
>> > 5f5ddd77-35f3-45e8-9dd3-8c1cbb1f39f0 386 MiB 2
>> > 5f5ddd77-35f3-45e8-9dd3-8c1cbb1f39f0 at snap 386 MiB 2 yes
>> > 9b27248e-a8cf-4f00-a039-d3e3066cd26a 15 GiB 2
>> > 9b27248e-a8cf-4f00-a039-d3e3066cd26a at snap 15 GiB 2 yes
>> > b7356adc-bb47-4c05-968b-6d3c9ca0079b 15 GiB 2
>> > b7356adc-bb47-4c05-968b-6d3c9ca0079b at snap 15 GiB 2 yes
>> > e77e78ad-d369-4a1d-b758-8113621269a3 15 GiB 2
>> > e77e78ad-d369-4a1d-b758-8113621269a3 at snap 15 GiB 2 yes
>> >
>> > [ceph: root at dcn02-ceph-all-0 /]# rbd -p volumes ls -l
>> > NAME SIZE PARENT FMT
>> PROT LOCK
>> > volume-c644086f-d3cf-406d-b0f1-7691bde5981d 100 GiB 2
>> > volume-f0969935-a742-4744-9375-80bf323e4d63 10 GiB 2
>> > [ceph: root at dcn02-ceph-all-0 /]#
>> >
>> > Attached the cinder config.
>> > Please let me know how I can solve this issue.
>> >
>> > With regards,
>> > Swogat Pradhan
>> >
>> > On Tue, Mar 21, 2023 at 3:53 PM John Fulton <johfulto at redhat.com>
>> wrote:
>> >>
>> >> In my last message under the line "On a DCN site if you run a command
>> like this:" I suggested some steps you could try to confirm the image is a
>> COW from the local glance as well as how to look at your cinder config.
>> >>
>> >> On Tue, Mar 21, 2023, 12:06 AM Swogat Pradhan <
>> swogatpradhan22 at gmail.com> wrote:
>> >>>
>> >>> Update:
>> >>> I uploaded an image directly to the dcn02 store, and it takes around
>> >>> 10-15 minutes to create a volume from that image in dcn02.
>> >>> The image size is 389 MB.
>> >>>
>> >>> On Mon, Mar 20, 2023 at 10:26 PM Swogat Pradhan <
>> swogatpradhan22 at gmail.com> wrote:
>> >>>>
>> >>>> Hi John,
>> >>>> I checked the Ceph on dcn02, and I can see the images created after
>> >>>> importing from the central site.
>> >>>> But launching an instance normally fails, as it takes a long time for
>> >>>> the volume to get created.
>> >>>>
>> >>>> When launching an instance from a volume, the instance is created
>> >>>> properly without any errors.
>> >>>>
>> >>>> I tried to cache images in nova using
>> >>>> https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/post_deployment/pre_cache_images.html
>> >>>> but I am getting a checksum failed error.
>> >>>>
>> >>>> With regards,
>> >>>> Swogat Pradhan
>> >>>>
>> >>>> On Thu, Mar 16, 2023 at 5:24 PM John Fulton <johfulto at redhat.com>
>> wrote:
>> >>>>>
>> >>>>> On Wed, Mar 15, 2023 at 8:05 PM Swogat Pradhan
>> >>>>> <swogatpradhan22 at gmail.com> wrote:
>> >>>>> >
>> >>>>> > Update: After restarting the nova services on the controller and
>> running the deploy script on the edge site, I was able to launch the VM
>> from volume.
>> >>>>> >
>> >>>>> > Right now the instance creation is failing because the block device
>> >>>>> > creation is stuck in the creating state; it is taking more than 10
>> >>>>> > minutes for the volume to be created, even though the image has
>> >>>>> > already been imported to the edge glance.
>> >>>>>
>> >>>>> Try following this document and making the same observations in your
>> >>>>> environment for AZs and their local ceph cluster.
>> >>>>>
>> >>>>>
>> https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/distributed_multibackend_storage.html#confirm-images-may-be-copied-between-sites
>> >>>>>
>> >>>>> On a DCN site if you run a command like this:
>> >>>>>
>> >>>>> $ sudo cephadm shell --config /etc/ceph/dcn0.conf --keyring
>> >>>>> /etc/ceph/dcn0.client.admin.keyring
>> >>>>> $ rbd --cluster dcn0 -p volumes ls -l
>> >>>>> NAME SIZE PARENT
>> >>>>> FMT PROT LOCK
>> >>>>> volume-28c6fc32-047b-4306-ad2d-de2be02716b7 8 GiB
>> >>>>> images/8083c7e7-32d8-4f7a-b1da-0ed7884f1076 at snap 2 excl
>> >>>>> $
>> >>>>>
>> >>>>> Then, you should see the parent of the volume is the image which is
>> on
>> >>>>> the same local ceph cluster.
>> >>>>>
>> >>>>> I wonder if something is misconfigured and thus you're encountering
>> >>>>> the streaming behavior described here:
>> >>>>>
>> >>>>> Ideally all images should reside in the central Glance and be copied
>> >>>>> to DCN sites before instances of those images are booted on DCN
>> sites.
>> >>>>> If an image is not copied to a DCN site before it is booted, then
>> the
>> >>>>> image will be streamed to the DCN site and then the image will boot
>> as
>> >>>>> an instance. This happens because Glance at the DCN site has access
>> to
>> >>>>> the images store at the Central ceph cluster. Though the booting of
>> >>>>> the image will take time because it has not been copied in advance,
>> >>>>> this is still preferable to failing to boot the image.
>> >>>>>
>> >>>>> You can also exec into the cinder container at the DCN site and
>> >>>>> confirm it's using its local ceph cluster.
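>> >>>>>
>> >>>>> For example (a sketch, assuming the TripleO container names), on the
>> >>>>> DCN node running cinder-volume:
>> >>>>>
>> >>>>> $ sudo podman exec cinder_volume grep -A 8 '\[tripleo_ceph\]' /etc/cinder/cinder.conf
>> >>>>> $ sudo podman exec cinder_volume grep fsid /etc/ceph/ceph.conf
>> >>>>>
>> >>>>> The rbd_ceph_conf and rbd_secret_uuid printed there should point at
>> >>>>> the local DCN cluster, not the central one.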
>> >>>>>
>> >>>>> John
>> >>>>>
>> >>>>> >
>> >>>>> > I will try and create a new fresh image and test again then
>> update.
>> >>>>> >
>> >>>>> > With regards,
>> >>>>> > Swogat Pradhan
>> >>>>> >
>> >>>>> > On Wed, Mar 15, 2023 at 11:13 PM Swogat Pradhan <
>> swogatpradhan22 at gmail.com> wrote:
>> >>>>> >>
>> >>>>> >> Update:
>> >>>>> >> In the hypervisor list the compute node state is showing down.
>> >>>>> >>
>> >>>>> >>
>> >>>>> >> On Wed, Mar 15, 2023 at 11:11 PM Swogat Pradhan <
>> swogatpradhan22 at gmail.com> wrote:
>> >>>>> >>>
>> >>>>> >>> Hi Brendan,
>> >>>>> >>> Now I have deployed another site where I have used a two-Linux-bond
>> >>>>> >>> network template for both the 3 compute nodes and the 3 ceph nodes.
>> >>>>> >>> The bonding option is set to mode=802.3ad (lacp=active).
>> >>>>> >>> I used a cirros image to launch an instance, but the instance timed
>> >>>>> >>> out, so I waited for the volume to be created.
>> >>>>> >>> Once the volume was created I tried launching the instance from
>> >>>>> >>> the volume, and the instance is still stuck in the spawning state.
>> >>>>> >>>
>> >>>>> >>> Here is the nova-compute log:
>> >>>>> >>>
>> >>>>> >>> 2023-03-15 17:35:47.739 185437 INFO oslo.privsep.daemon [-]
>> privsep daemon starting
>> >>>>> >>> 2023-03-15 17:35:47.744 185437 INFO oslo.privsep.daemon [-]
>> privsep process running with uid/gid: 0/0
>> >>>>> >>> 2023-03-15 17:35:47.749 185437 INFO oslo.privsep.daemon [-]
>> privsep process running with capabilities (eff/prm/inh):
>> CAP_SYS_ADMIN/CAP_SYS_ADMIN/none
>> >>>>> >>> 2023-03-15 17:35:47.749 185437 INFO oslo.privsep.daemon [-]
>> privsep daemon running as pid 185437
>> >>>>> >>> 2023-03-15 17:35:47.974 8 WARNING
>> os_brick.initiator.connectors.nvmeof
>> [req-dbb11a9b-317e-4957-b141-f9e0bdf6a266 b240e3e89d99489284cd731e75f2a5db
>> 4160ce999a31485fa643aed0936dfef0 - default default] Process execution error
>> in _get_host_uuid: Unexpected error while running command.
>> >>>>> >>> Command: blkid overlay -s UUID -o value
>> >>>>> >>> Exit code: 2
>> >>>>> >>> Stdout: ''
>> >>>>> >>> Stderr: '':
>> oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while
>> running command.
>> >>>>> >>> 2023-03-15 17:35:51.616 8 INFO nova.virt.libvirt.driver
>> [req-dbb11a9b-317e-4957-b141-f9e0bdf6a266 b240e3e89d99489284cd731e75f2a5db
>> 4160ce999a31485fa643aed0936dfef0 - default default] [instance:
>> 450b749c-a10a-4308-80a9-3b8020fee758] Creating image
>> >>>>> >>>
>> >>>>> >>> It is stuck at creating the image. Do I need to run the template
>> >>>>> >>> mentioned here?:
>> >>>>> >>> https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/post_deployment/pre_cache_images.html
>> >>>>> >>>
>> >>>>> >>> The volume is already created and I do not understand why the
>> >>>>> >>> instance is stuck in the spawning state.
>> >>>>> >>>
>> >>>>> >>> With regards,
>> >>>>> >>> Swogat Pradhan
>> >>>>> >>>
>> >>>>> >>>
>> >>>>> >>> On Sun, Mar 5, 2023 at 4:02 PM Brendan Shephard <
>> bshephar at redhat.com> wrote:
>> >>>>> >>>>
>> >>>>> >>>> Does your environment use different network interfaces for
>> each of the networks? Or does it have a bond with everything on it?
>> >>>>> >>>>
>> >>>>> >>>> One issue I have seen before is that when launching instances,
>> there is a lot of network traffic between nodes as the hypervisor needs to
>> download the image from Glance. Along with various other services sending
>> normal network traffic, it can be enough to cause issues if everything is
>> running over a single 1Gbe interface.
>> >>>>> >>>>
>> >>>>> >>>> I have seen the same situation in fact when using a single
>> active/backup bond on 1Gbe nics. It’s worth checking the network traffic
>> while you try to spawn the instance to see if you’re dropping packets. In
>> the situation I described, there were dropped packets which resulted in a
>> loss of communication between nova_compute and RMQ, so the node appeared
>> offline. You should also confirm that nova_compute is being disconnected in
>> the nova_compute logs if you tail them on the Hypervisor while spawning the
>> instance.
>> >>>>> >>>>
>> >>>>> >>>> In my case, changing from active/backup to LACP helped. So,
>> based on that experience, from my perspective, it certainly sounds like
>> some kind of network issue.
>> >>>>> >>>>
>> >>>>> >>>> Regards,
>> >>>>> >>>>
>> >>>>> >>>> Brendan Shephard
>> >>>>> >>>> Senior Software Engineer
>> >>>>> >>>> Red Hat Australia
>> >>>>> >>>>
>> >>>>> >>>>
>> >>>>> >>>>
>> >>>>> >>>> On 5 Mar 2023, at 6:47 am, Eugen Block <eblock at nde.ag> wrote:
>> >>>>> >>>>
>> >>>>> >>>> Hi,
>> >>>>> >>>>
>> >>>>> >>>> I tried to help someone with a similar issue some time ago in
>> this thread:
>> >>>>> >>>>
>> https://serverfault.com/questions/1116771/openstack-oslo-messaging-exception-in-nova-conductor
>> >>>>> >>>>
>> >>>>> >>>> But apparently a neutron reinstallation fixed it for that
>> user, not sure if that could apply here. But is it possible that your nova
>> and neutron versions are different between central and edge site? Have you
>> restarted nova and neutron services on the compute nodes after
>> installation? Do you have debug logs of nova-conductor and maybe nova-compute?
>> Maybe they can help narrow down the issue.
>> >>>>> >>>> If there isn't any additional information in the debug logs I
>> probably would start "tearing down" rabbitmq. I didn't have to do that in a
>> production system yet so be careful. I can think of two routes:
>> >>>>> >>>>
>> >>>>> >>>> - Either remove queues, exchanges etc. while rabbit is
>> running, this will most likely impact client IO depending on your load.
>> Check out the rabbitmqctl commands.
>> >>>>> >>>> - Or stop the rabbitmq cluster, remove the mnesia tables from
>> all nodes and restart rabbitmq so the exchanges, queues etc. rebuild.
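>> >>>>> >>>>
>> >>>>> >>>> A rough sketch of that second route (untested here, and assuming
>> >>>>> >>>> TripleO's pcs-managed rabbitmq-bundle with its default bind mount
>> >>>>> >>>> of /var/lib/rabbitmq on the controllers; be careful on production):
>> >>>>> >>>>
>> >>>>> >>>> $ sudo pcs resource disable rabbitmq-bundle
>> >>>>> >>>> $ sudo rm -rf /var/lib/rabbitmq/mnesia   # on every controller
>> >>>>> >>>> $ sudo pcs resource enable rabbitmq-bundle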
>> >>>>> >>>>
>> >>>>> >>>> I can imagine that the failed reply "survives" while being
>> replicated across the rabbit nodes. But I don't really know the rabbit
>> internals too well, so maybe someone else can chime in here and give a
>> better advice.
>> >>>>> >>>>
>> >>>>> >>>> Regards,
>> >>>>> >>>> Eugen
>> >>>>> >>>>
>> >>>>> >>>> Zitat von Swogat Pradhan <swogatpradhan22 at gmail.com>:
>> >>>>> >>>>
>> >>>>> >>>> Hi,
>> >>>>> >>>> Can someone please help me out on this issue?
>> >>>>> >>>>
>> >>>>> >>>> With regards,
>> >>>>> >>>> Swogat Pradhan
>> >>>>> >>>>
>> >>>>> >>>> On Thu, Mar 2, 2023 at 1:24 PM Swogat Pradhan <
>> swogatpradhan22 at gmail.com>
>> >>>>> >>>> wrote:
>> >>>>> >>>>
>> >>>>> >>>> Hi
>> >>>>> >>>> I don't see any major packet loss.
>> >>>>> >>>> It seems the problem is somewhere in rabbitmq maybe but not
>> due to packet
>> >>>>> >>>> loss.
>> >>>>> >>>>
>> >>>>> >>>> with regards,
>> >>>>> >>>> Swogat Pradhan
>> >>>>> >>>>
>> >>>>> >>>> On Wed, Mar 1, 2023 at 3:34 PM Swogat Pradhan <
>> swogatpradhan22 at gmail.com>
>> >>>>> >>>> wrote:
>> >>>>> >>>>
>> >>>>> >>>> Hi,
>> >>>>> >>>> Yes the MTU is the same as the default '1500'.
>> >>>>> >>>> Generally I haven't seen any packet loss, but never checked
>> when
>> >>>>> >>>> launching the instance.
>> >>>>> >>>> I will check that and come back.
>> >>>>> >>>> But every time I launch an instance, the instance gets stuck in
>> >>>>> >>>> the spawning state and the hypervisor then goes down, so I am not
>> >>>>> >>>> sure if packet loss causes this.
>> >>>>> >>>>
>> >>>>> >>>> With regards,
>> >>>>> >>>> Swogat pradhan
>> >>>>> >>>>
>> >>>>> >>>> On Wed, Mar 1, 2023 at 3:30 PM Eugen Block <eblock at nde.ag>
>> wrote:
>> >>>>> >>>>
>> >>>>> >>>> One more thing coming to mind is MTU size. Are they identical
>> between
>> >>>>> >>>> central and edge site? Do you see packet loss through the
>> tunnel?
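>> >>>>> >>>>
>> >>>>> >>>> One quick way to test both at once (a sketch; substitute an address
>> >>>>> >>>> on the far side of the tunnel) is a don't-fragment ping sized for a
>> >>>>> >>>> 1500-byte frame:
>> >>>>> >>>>
>> >>>>> >>>> $ ping -M do -s 1472 -c 20 <edge node internal_api IP>
>> >>>>> >>>>
>> >>>>> >>>> If that fails while smaller payloads go through, the effective MTU
>> >>>>> >>>> over the tunnel is below 1500.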
>> >>>>> >>>>
>> >>>>> >>>> Zitat von Swogat Pradhan <swogatpradhan22 at gmail.com>:
>> >>>>> >>>>
>> >>>>> >>>> > Hi Eugen,
>> >>>>> >>>> > Please add my email either in 'to' or 'cc', as I am not
>> >>>>> >>>> > getting emails from you.
>> >>>>> >>>> > Coming to the issue:
>> >>>>> >>>> >
>> >>>>> >>>> > [root at overcloud-controller-no-ceph-3 /]# rabbitmqctl
>> list_policies -p
>> >>>>> >>>> /
>> >>>>> >>>> > Listing policies for vhost "/" ...
>> >>>>> >>>> > vhost name pattern apply-to definition
>> priority
>> >>>>> >>>> > / ha-all ^(?!amq\.).* queues
>> >>>>> >>>> >
>> >>>>> >>>>
>> {"ha-mode":"exactly","ha-params":2,"ha-promote-on-shutdown":"always"} 0
>> >>>>> >>>> >
>> >>>>> >>>> > I have the edge site compute nodes up, it only goes down
>> when i am
>> >>>>> >>>> trying
>> >>>>> >>>> > to launch an instance and the instance comes to a spawning
>> state and
>> >>>>> >>>> then
>> >>>>> >>>> > gets stuck.
>> >>>>> >>>> >
>> >>>>> >>>> > I have a tunnel setup between the central and the edge sites.
>> >>>>> >>>> >
>> >>>>> >>>> > With regards,
>> >>>>> >>>> > Swogat Pradhan
>> >>>>> >>>> >
>> >>>>> >>>> > On Tue, Feb 28, 2023 at 9:11 PM Swogat Pradhan <
>> >>>>> >>>> swogatpradhan22 at gmail.com>
>> >>>>> >>>> > wrote:
>> >>>>> >>>> >
>> >>>>> >>>> >> Hi Eugen,
>> >>>>> >>>> >> For some reason I am not getting your email directly; I am
>> >>>>> >>>> >> checking the email digest, and there I am able to find your reply.
>> >>>>> >>>> >> Here is the log for download: https://we.tl/t-L8FEkGZFSq
>> >>>>> >>>> >> Yes, these logs are from the time when the issue occurred.
>> >>>>> >>>> >>
>> >>>>> >>>> >> *Note: I am able to create VMs and perform other activities in
>> >>>>> >>>> >> the central site; I am only facing this issue in the edge site.*
>> >>>>> >>>> >>
>> >>>>> >>>> >> With regards,
>> >>>>> >>>> >> Swogat Pradhan
>> >>>>> >>>> >>
>> >>>>> >>>> >> On Mon, Feb 27, 2023 at 5:12 PM Swogat Pradhan <
>> >>>>> >>>> swogatpradhan22 at gmail.com>
>> >>>>> >>>> >> wrote:
>> >>>>> >>>> >>
>> >>>>> >>>> >>> Hi Eugen,
>> >>>>> >>>> >>> Thanks for your response.
>> >>>>> >>>> >>> I actually have a 4-controller setup, so here are the
>> >>>>> >>>> >>> details:
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> *PCS Status:*
>> >>>>> >>>> >>> * Container bundle set: rabbitmq-bundle [
>> >>>>> >>>> >>>
>> 172.25.201.68:8787/tripleomaster/openstack-rabbitmq:pcmklatest]:
>> >>>>> >>>> >>> * rabbitmq-bundle-0 (ocf::heartbeat:rabbitmq-cluster):
>> >>>>> >>>> Started
>> >>>>> >>>> >>> overcloud-controller-no-ceph-3
>> >>>>> >>>> >>> * rabbitmq-bundle-1 (ocf::heartbeat:rabbitmq-cluster):
>> >>>>> >>>> Started
>> >>>>> >>>> >>> overcloud-controller-2
>> >>>>> >>>> >>> * rabbitmq-bundle-2 (ocf::heartbeat:rabbitmq-cluster):
>> >>>>> >>>> Started
>> >>>>> >>>> >>> overcloud-controller-1
>> >>>>> >>>> >>> * rabbitmq-bundle-3 (ocf::heartbeat:rabbitmq-cluster):
>> >>>>> >>>> Started
>> >>>>> >>>> >>> overcloud-controller-0
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> I have tried restarting the bundle multiple times but the
>> issue is
>> >>>>> >>>> still
>> >>>>> >>>> >>> present.
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> *Cluster status:*
>> >>>>> >>>> >>> [root at overcloud-controller-0 /]# rabbitmqctl
>> cluster_status
>> >>>>> >>>> >>> Cluster status of node
>> >>>>> >>>> >>> rabbit at overcloud-controller-0.internalapi.bdxworld.com ...
>> >>>>> >>>> >>> Basics
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> Cluster name:
>> rabbit at overcloud-controller-no-ceph-3.bdxworld.com
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> Disk Nodes
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> rabbit at overcloud-controller-0.internalapi.bdxworld.com
>> >>>>> >>>> >>> rabbit at overcloud-controller-1.internalapi.bdxworld.com
>> >>>>> >>>> >>> rabbit at overcloud-controller-2.internalapi.bdxworld.com
>> >>>>> >>>> >>>
>> rabbit at overcloud-controller-no-ceph-3.internalapi.bdxworld.com
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> Running Nodes
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> rabbit at overcloud-controller-0.internalapi.bdxworld.com
>> >>>>> >>>> >>> rabbit at overcloud-controller-1.internalapi.bdxworld.com
>> >>>>> >>>> >>> rabbit at overcloud-controller-2.internalapi.bdxworld.com
>> >>>>> >>>> >>>
>> rabbit at overcloud-controller-no-ceph-3.internalapi.bdxworld.com
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> Versions
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> rabbit at overcloud-controller-0.internalapi.bdxworld.com:
>> RabbitMQ
>> >>>>> >>>> 3.8.3
>> >>>>> >>>> >>> on Erlang 22.3.4.1
>> >>>>> >>>> >>> rabbit at overcloud-controller-1.internalapi.bdxworld.com:
>> RabbitMQ
>> >>>>> >>>> 3.8.3
>> >>>>> >>>> >>> on Erlang 22.3.4.1
>> >>>>> >>>> >>> rabbit at overcloud-controller-2.internalapi.bdxworld.com:
>> RabbitMQ
>> >>>>> >>>> 3.8.3
>> >>>>> >>>> >>> on Erlang 22.3.4.1
>> >>>>> >>>> >>>
>> rabbit at overcloud-controller-no-ceph-3.internalapi.bdxworld.com:
>> >>>>> >>>> RabbitMQ
>> >>>>> >>>> >>> 3.8.3 on Erlang 22.3.4.1
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> Alarms
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> (none)
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> Network Partitions
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> (none)
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> Listeners
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> Node:
>> rabbit at overcloud-controller-0.internalapi.bdxworld.com,
>> >>>>> >>>> interface:
>> >>>>> >>>> >>> [::], port: 25672, protocol: clustering, purpose:
>> inter-node and CLI
>> >>>>> >>>> tool
>> >>>>> >>>> >>> communication
>> >>>>> >>>> >>> Node:
>> rabbit at overcloud-controller-0.internalapi.bdxworld.com,
>> >>>>> >>>> interface:
>> >>>>> >>>> >>> 172.25.201.212, port: 5672, protocol: amqp, purpose: AMQP
>> 0-9-1
>> >>>>> >>>> >>> and AMQP 1.0
>> >>>>> >>>> >>> Node:
>> rabbit at overcloud-controller-0.internalapi.bdxworld.com,
>> >>>>> >>>> interface:
>> >>>>> >>>> >>> [::], port: 15672, protocol: http, purpose: HTTP API
>> >>>>> >>>> >>> Node:
>> rabbit at overcloud-controller-1.internalapi.bdxworld.com,
>> >>>>> >>>> interface:
>> >>>>> >>>> >>> [::], port: 25672, protocol: clustering, purpose:
>> inter-node and CLI
>> >>>>> >>>> tool
>> >>>>> >>>> >>> communication
>> >>>>> >>>> >>> Node:
>> rabbit at overcloud-controller-1.internalapi.bdxworld.com,
>> >>>>> >>>> interface:
>> >>>>> >>>> >>> 172.25.201.205, port: 5672, protocol: amqp, purpose: AMQP
>> 0-9-1
>> >>>>> >>>> >>> and AMQP 1.0
>> >>>>> >>>> >>> Node:
>> rabbit at overcloud-controller-1.internalapi.bdxworld.com,
>> >>>>> >>>> interface:
>> >>>>> >>>> >>> [::], port: 15672, protocol: http, purpose: HTTP API
>> >>>>> >>>> >>> Node:
>> rabbit at overcloud-controller-2.internalapi.bdxworld.com,
>> >>>>> >>>> interface:
>> >>>>> >>>> >>> [::], port: 25672, protocol: clustering, purpose:
>> inter-node and CLI
>> >>>>> >>>> tool
>> >>>>> >>>> >>> communication
>> >>>>> >>>> >>> Node:
>> rabbit at overcloud-controller-2.internalapi.bdxworld.com,
>> >>>>> >>>> interface:
>> >>>>> >>>> >>> 172.25.201.201, port: 5672, protocol: amqp, purpose: AMQP
>> 0-9-1
>> >>>>> >>>> >>> and AMQP 1.0
>> >>>>> >>>> >>> Node:
>> rabbit at overcloud-controller-2.internalapi.bdxworld.com,
>> >>>>> >>>> interface:
>> >>>>> >>>> >>> [::], port: 15672, protocol: http, purpose: HTTP API
>> >>>>> >>>> >>> Node:
>> rabbit at overcloud-controller-no-ceph-3.internalapi.bdxworld.com
>> >>>>> >>>> ,
>> >>>>> >>>> >>> interface: [::], port: 25672, protocol: clustering,
>> purpose:
>> >>>>> >>>> inter-node and
>> >>>>> >>>> >>> CLI tool communication
>> >>>>> >>>> >>> Node:
>> rabbit at overcloud-controller-no-ceph-3.internalapi.bdxworld.com
>> >>>>> >>>> ,
>> >>>>> >>>> >>> interface: 172.25.201.209, port: 5672, protocol: amqp,
>> purpose: AMQP
>> >>>>> >>>> 0-9-1
>> >>>>> >>>> >>> and AMQP 1.0
>> >>>>> >>>> >>> Node:
>> rabbit at overcloud-controller-no-ceph-3.internalapi.bdxworld.com
>> >>>>> >>>> ,
>> >>>>> >>>> >>> interface: [::], port: 15672, protocol: http, purpose:
>> HTTP API
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> Feature flags
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> Flag: drop_unroutable_metric, state: enabled
>> >>>>> >>>> >>> Flag: empty_basic_get_metric, state: enabled
>> >>>>> >>>> >>> Flag: implicit_default_bindings, state: enabled
>> >>>>> >>>> >>> Flag: quorum_queue, state: enabled
>> >>>>> >>>> >>> Flag: virtual_host_metadata, state: enabled
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> *Logs:*
>> >>>>> >>>> >>> *(Attached)*
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> With regards,
>> >>>>> >>>> >>> Swogat Pradhan
>> >>>>> >>>> >>>
>> >>>>> >>>> >>> On Sun, Feb 26, 2023 at 2:34 PM Swogat Pradhan <
>> >>>>> >>>> swogatpradhan22 at gmail.com>
>> >>>>> >>>> >>> wrote:
>> >>>>> >>>> >>>
>> >>>>> >>>> >>>> Hi,
>> >>>>> >>>> >>>> Please find the nova conductor as well as nova api log.
>> >>>>> >>>> >>>>
>> >>>>> >>>> >>>> nova-conductor:
>> >>>>> >>>> >>>>
>> >>>>> >>>> >>>> 2023-02-26 08:45:01.108 31 WARNING
>> >>>>> >>>> oslo_messaging._drivers.amqpdriver
>> >>>>> >>>> >>>> [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - - -]
>> >>>>> >>>> >>>> reply_349bcb075f8c49329435a0f884b33066 doesn't exist,
>> drop reply to
>> >>>>> >>>> >>>> 16152921c1eb45c2b1f562087140168b
>> >>>>> >>>> >>>> 2023-02-26 08:45:02.144 26 WARNING
>> >>>>> >>>> oslo_messaging._drivers.amqpdriver
>> >>>>> >>>> >>>> [req-7b43c4e5-0475-4598-92c0-fcacb51d9813 - - - - -]
>> >>>>> >>>> >>>> reply_276049ec36a84486a8a406911d9802f4 doesn't exist,
>> drop reply to
>> >>>>> >>>> >>>> 83dbe5f567a940b698acfe986f6194fa
>> >>>>> >>>> >>>> 2023-02-26 08:45:02.314 32 WARNING
>> >>>>> >>>> oslo_messaging._drivers.amqpdriver
>> >>>>> >>>> >>>> [req-7b43c4e5-0475-4598-92c0-fcacb51d9813 - - - - -]
>> >>>>> >>>> >>>> reply_276049ec36a84486a8a406911d9802f4 doesn't exist,
>> drop reply to
>> >>>>> >>>> >>>> f3bfd7f65bd542b18d84cea3033abb43:
>> >>>>> >>>> >>>> oslo_messaging.exceptions.MessageUndeliverable
>> >>>>> >>>> >>>> 2023-02-26 08:45:02.316 32 ERROR
>> oslo_messaging._drivers.amqpdriver
>> >>>>> >>>> >>>> [req-7b43c4e5-0475-4598-92c0-fcacb51d9813 - - - - -] The
>> reply
>> >>>>> >>>> >>>> f3bfd7f65bd542b18d84cea3033abb43 failed to send after 60
>> seconds
>> >>>>> >>>> due to a
>> >>>>> >>>> >>>> missing queue (reply_276049ec36a84486a8a406911d9802f4).
>> >>>>> >>>> Abandoning...:
>> >>>>> >>>> >>>> oslo_messaging.exceptions.MessageUndeliverable
>> >>>>> >>>> >>>> 2023-02-26 08:48:01.282 35 WARNING
>> >>>>> >>>> oslo_messaging._drivers.amqpdriver
>> >>>>> >>>> >>>> [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - - -]
>> >>>>> >>>> >>>> reply_349bcb075f8c49329435a0f884b33066 doesn't exist,
>> drop reply to
>> >>>>> >>>> >>>> d4b9180f91a94f9a82c3c9c4b7595566:
>> >>>>> >>>> >>>> oslo_messaging.exceptions.MessageUndeliverable
>> >>>>> >>>> >>>> 2023-02-26 08:48:01.284 35 ERROR
>> oslo_messaging._drivers.amqpdriver
>> >>>>> >>>> >>>> [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - - -] The
>> reply
>> >>>>> >>>> >>>> d4b9180f91a94f9a82c3c9c4b7595566 failed to send after 60
>> seconds
>> >>>>> >>>> due to a
>> >>>>> >>>> >>>> missing queue (reply_349bcb075f8c49329435a0f884b33066).
>> >>>>> >>>> Abandoning...:
>> >>>>> >>>> >>>> oslo_messaging.exceptions.MessageUndeliverable
>> >>>>> >>>> >>>> 2023-02-26 08:49:01.303 33 WARNING
>> >>>>> >>>> oslo_messaging._drivers.amqpdriver
>> >>>>> >>>> >>>> [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - - -]
>> >>>>> >>>> >>>> reply_349bcb075f8c49329435a0f884b33066 doesn't exist,
>> drop reply to
>> >>>>> >>>> >>>> 897911a234a445d8a0d8af02ece40f6f:
>> >>>>> >>>> >>>> oslo_messaging.exceptions.MessageUndeliverable
>> >>>>> >>>> >>>> 2023-02-26 08:49:01.304 33 ERROR
>> oslo_messaging._drivers.amqpdriver
>> >>>>> >>>> >>>> [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - - -] The
>> reply
>> >>>>> >>>> >>>> 897911a234a445d8a0d8af02ece40f6f failed to send after 60
>> seconds
>> >>>>> >>>> due to a
>> >>>>> >>>> >>>> missing queue (reply_349bcb075f8c49329435a0f884b33066).
>> >>>>> >>>> Abandoning...:
>> >>>>> >>>> >>>> oslo_messaging.exceptions.MessageUndeliverable
>> >>>>> >>>> >>>> 2023-02-26 08:49:52.254 31 WARNING nova.cache_utils
>> >>>>> >>>> >>>> [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45
>> >>>>> >>>> b240e3e89d99489284cd731e75f2a5db
>> >>>>> >>>> >>>> 4160ce999a31485fa643aed0936dfef0 - default default] Cache
>> enabled
>> >>>>> >>>> with
>> >>>>> >>>> >>>> backend dogpile.cache.null.
>> >>>>> >>>> >>>> 2023-02-26 08:50:01.264 27 WARNING
>> >>>>> >>>> oslo_messaging._drivers.amqpdriver
>> >>>>> >>>> >>>> [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - - -]
>> >>>>> >>>> >>>> reply_349bcb075f8c49329435a0f884b33066 doesn't exist,
>> drop reply to
>> >>>>> >>>> >>>> 8f723ceb10c3472db9a9f324861df2bb:
>> >>>>> >>>> >>>> oslo_messaging.exceptions.MessageUndeliverable
>> >>>>> >>>> >>>> 2023-02-26 08:50:01.266 27 ERROR
>> oslo_messaging._drivers.amqpdriver
>> >>>>> >>>> >>>> [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - - -] The
>> reply
>> >>>>> >>>> >>>> 8f723ceb10c3472db9a9f324861df2bb failed to send after 60
>> seconds
>> >>>>> >>>> due to a
>> >>>>> >>>> >>>> missing queue (reply_349bcb075f8c49329435a0f884b33066).
>> >>>>> >>>> Abandoning...:
>> >>>>> >>>> >>>> oslo_messaging.exceptions.MessageUndeliverable
>> >>>>> >>>> >>>>
>> >>>>> >>>> >>>> With regards,
>> >>>>> >>>> >>>> Swogat Pradhan
>> >>>>> >>>> >>>>
>> >>>>> >>>> >>>> On Sun, Feb 26, 2023 at 2:26 PM Swogat Pradhan <
>> >>>>> >>>> >>>> swogatpradhan22 at gmail.com> wrote:
>> >>>>> >>>> >>>>
>> >>>>> >>>> >>>>> Hi,
>> >>>>> >>>> >>>>> I currently have 3 compute nodes on edge site1 where I am
>> >>>>> >>>> >>>>> trying to launch VMs.
>> >>>>> >>>> >>>>> When the VM is in the spawning state, the node goes down
>> >>>>> >>>> >>>>> (openstack compute service list); the node comes back up when
>> >>>>> >>>> >>>>> I restart the nova compute service, but then the launch of the
>> >>>>> >>>> >>>>> VM fails.
>> >>>>> >>>> >>>>>
>> >>>>> >>>> >>>>> nova-compute.log
>> >>>>> >>>> >>>>>
>> >>>>> >>>> >>>>> 2023-02-26 08:15:51.808 7 INFO nova.compute.manager
>> >>>>> >>>> >>>>> [req-bc0f5f2e-53fc-4dae-b1da-82f1f972d617 - - - - -]
>> Running
>> >>>>> >>>> >>>>> instance usage
>> >>>>> >>>> >>>>> audit for host dcn01-hci-0.bdxworld.com from 2023-02-26
>> 07:00:00
>> >>>>> >>>> to
>> >>>>> >>>> >>>>> 2023-02-26 08:00:00. 0 instances.
>> >>>>> >>>> >>>>> 2023-02-26 08:49:52.813 7 INFO nova.compute.claims
>> >>>>> >>>> >>>>> [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45
>> >>>>> >>>> >>>>> b240e3e89d99489284cd731e75f2a5db
>> >>>>> >>>> >>>>> 4160ce999a31485fa643aed0936dfef0 - default default]
>> [instance:
>> >>>>> >>>> >>>>> 0c62c1ef-9010-417d-a05f-4db77e901600] Claim successful
>> on node
>> >>>>> >>>> >>>>> dcn01-hci-0.bdxworld.com
>> >>>>> >>>> >>>>> 2023-02-26 08:49:54.225 7 INFO nova.virt.libvirt.driver
>> >>>>> >>>> >>>>> [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45
>> >>>>> >>>> >>>>> b240e3e89d99489284cd731e75f2a5db
>> >>>>> >>>> >>>>> 4160ce999a31485fa643aed0936dfef0 - default default]
>> [instance:
>> >>>>> >>>> >>>>> 0c62c1ef-9010-417d-a05f-4db77e901600] Ignoring supplied
>> device
>> >>>>> >>>> name:
>> >>>>> >>>> >>>>> /dev/vda. Libvirt can't honour user-supplied dev names
>> >>>>> >>>> >>>>> 2023-02-26 08:49:54.398 7 INFO nova.virt.block_device
>> >>>>> >>>> >>>>> [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45
>> >>>>> >>>> >>>>> b240e3e89d99489284cd731e75f2a5db
>> >>>>> >>>> >>>>> 4160ce999a31485fa643aed0936dfef0 - default default]
>> [instance:
>> >>>>> >>>> >>>>> 0c62c1ef-9010-417d-a05f-4db77e901600] Booting with volume
>> >>>>> >>>> >>>>> c4bd7885-5973-4860-bbe6-7a2f726baeee at /dev/vda
>> >>>>> >>>> >>>>> 2023-02-26 08:49:55.216 7 WARNING nova.cache_utils
>> >>>>> >>>> >>>>> [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45
>> >>>>> >>>> >>>>> b240e3e89d99489284cd731e75f2a5db
>> >>>>> >>>> >>>>> 4160ce999a31485fa643aed0936dfef0 - default default]
>> Cache enabled
>> >>>>> >>>> with
>> >>>>> >>>> >>>>> backend dogpile.cache.null.
>> >>>>> >>>> >>>>> 2023-02-26 08:49:55.283 7 INFO oslo.privsep.daemon
>> >>>>> >>>> >>>>> [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45
>> >>>>> >>>> >>>>> b240e3e89d99489284cd731e75f2a5db
>> >>>>> >>>> >>>>> 4160ce999a31485fa643aed0936dfef0 - default default]
>> Running
>> >>>>> >>>> >>>>> privsep helper:
>> >>>>> >>>> >>>>> ['sudo', 'nova-rootwrap', '/etc/nova/rootwrap.conf',
>> >>>>> >>>> 'privsep-helper',
>> >>>>> >>>> >>>>> '--config-file', '/etc/nova/nova.conf', '--config-file',
>> >>>>> >>>> >>>>> '/etc/nova/nova-compute.conf', '--privsep_context',
>> >>>>> >>>> >>>>> 'os_brick.privileged.default', '--privsep_sock_path',
>> >>>>> >>>> >>>>> '/tmp/tmpin40tah6/privsep.sock']
>> >>>>> >>>> >>>>> 2023-02-26 08:49:55.791 7 INFO oslo.privsep.daemon
>> >>>>> >>>> >>>>> [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45
>> >>>>> >>>> >>>>> b240e3e89d99489284cd731e75f2a5db
>> >>>>> >>>> >>>>> 4160ce999a31485fa643aed0936dfef0 - default default]
>> Spawned new
>> >>>>> >>>> privsep
>> >>>>> >>>> >>>>> daemon via rootwrap
>> >>>>> >>>> >>>>> 2023-02-26 08:49:55.717 2647 INFO oslo.privsep.daemon
>> [-] privsep
>> >>>>> >>>> >>>>> daemon starting
>> >>>>> >>>> >>>>> 2023-02-26 08:49:55.722 2647 INFO oslo.privsep.daemon
>> [-] privsep
>> >>>>> >>>> >>>>> process running with uid/gid: 0/0
>> >>>>> >>>> >>>>> 2023-02-26 08:49:55.726 2647 INFO oslo.privsep.daemon
>> [-] privsep
>> >>>>> >>>> >>>>> process running with capabilities (eff/prm/inh):
>> >>>>> >>>> >>>>> CAP_SYS_ADMIN/CAP_SYS_ADMIN/none
>> >>>>> >>>> >>>>> 2023-02-26 08:49:55.726 2647 INFO oslo.privsep.daemon
>> [-] privsep
>> >>>>> >>>> >>>>> daemon running as pid 2647
>> >>>>> >>>> >>>>> 2023-02-26 08:49:55.956 7 WARNING
>> >>>>> >>>> os_brick.initiator.connectors.nvmeof
>> >>>>> >>>> >>>>> [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45
>> >>>>> >>>> >>>>> b240e3e89d99489284cd731e75f2a5db
>> >>>>> >>>> >>>>> 4160ce999a31485fa643aed0936dfef0 - default default]
>> Process
>> >>>>> >>>> >>>>> execution error
>> >>>>> >>>> >>>>> in _get_host_uuid: Unexpected error while running
>> command.
>> >>>>> >>>> >>>>> Command: blkid overlay -s UUID -o value
>> >>>>> >>>> >>>>> Exit code: 2
>> >>>>> >>>> >>>>> Stdout: ''
>> >>>>> >>>> >>>>> Stderr: '':
>> oslo_concurrency.processutils.ProcessExecutionError:
>> >>>>> >>>> >>>>> Unexpected error while running command.
>> >>>>> >>>> >>>>> 2023-02-26 08:49:58.247 7 INFO nova.virt.libvirt.driver
>> >>>>> >>>> >>>>> [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45
>> >>>>> >>>> >>>>> b240e3e89d99489284cd731e75f2a5db
>> >>>>> >>>> >>>>> 4160ce999a31485fa643aed0936dfef0 - default default]
>> [instance:
>> >>>>> >>>> >>>>> 0c62c1ef-9010-417d-a05f-4db77e901600] Creating image
>> >>>>> >>>> >>>>>
>> >>>>> >>>> >>>>> Is there a way to solve this issue?
>> >>>>> >>>> >>>>>
>> >>>>> >>>> >>>>>
>> >>>>> >>>> >>>>> With regards,
>> >>>>> >>>> >>>>>
>> >>>>> >>>> >>>>> Swogat Pradhan
>> >>>>> >>>> >>>>>
>> >>>>> >>>> >>>>
>> >>>>> >>>>
>> >>>>> >>>>
>> >>>>> >>>>
>> >>>>> >>>>
>> >>>>> >>>>
>> >>>>> >>>>
>> >>>>> >>>>
>> >>>>> >>>>
>> >>>>> >>>>
>> >>>>>
>>
>>