DCN compute service goes down when an instance is scheduled to launch | wallaby | tripleo
    Alan Bishop
    abishop at redhat.com
    Thu Mar 23 12:36:35 UTC 2023
  
On Thu, Mar 23, 2023 at 5:20 AM Swogat Pradhan <swogatpradhan22 at gmail.com>
wrote:
> Hi,
> Is this bind not required for cinder_scheduler container?
>
> "/var/lib/tripleo-config/ceph:/var/lib/kolla/config_files/src-ceph:ro,rprivate,rbind",
> I do not see this particular bind on the cinder scheduler containers on my
> controller nodes.
>
That is correct, because the scheduler does not access the ceph cluster.
Alan
> With regards,
> Swogat Pradhan
>
> On Thu, Mar 23, 2023 at 2:46 AM Swogat Pradhan <swogatpradhan22 at gmail.com>
> wrote:
>
>> Cinder volume config:
>>
>> [tripleo_ceph]
>> volume_backend_name=tripleo_ceph
>> volume_driver=cinder.volume.drivers.rbd.RBDDriver
>> rbd_user=openstack
>> rbd_pool=volumes
>> rbd_flatten_volume_from_snapshot=False
>> rbd_secret_uuid=a8d5f1f5-48e7-5ede-89ab-8aca59b6397b
>> report_discard_supported=True
>> rbd_ceph_conf=/etc/ceph/dcn02.conf
>> rbd_cluster_name=dcn02
>>
>> Glance api config:
>>
>> [dcn02]
>> rbd_store_ceph_conf=/etc/ceph/dcn02.conf
>> rbd_store_user=openstack
>> rbd_store_pool=images
>> rbd_thin_provisioning=False
>> store_description=dcn02 rbd glance store
>> [ceph]
>> rbd_store_ceph_conf=/etc/ceph/ceph.conf
>> rbd_store_user=openstack
>> rbd_store_pool=images
>> rbd_thin_provisioning=False
>> store_description=Default glance store backend.
>>
>> On Thu, Mar 23, 2023 at 2:29 AM Swogat Pradhan <swogatpradhan22 at gmail.com>
>> wrote:
>>
>>> I still have the same issue, and I'm not sure what's left to try.
>>> All the pods are now in a healthy state. When I try to create a volume
>>> from an image, the cinder-volume log entries appear about 3 minutes after
>>> I click the create volume button, and the volumes have now been stuck in
>>> the creating state for more than 20 minutes.
>>>
>>> Cinder logs:
>>> 2023-03-22 20:32:44.010 108 INFO cinder.rpc
>>> [req-0d2093a0-efbd-45a5-bd7d-cce25ddc200e b240e3e89d99489284cd731e75f2a5db
>>> 4160ce999a31485fa643aed0936dfef0 - - -] Automatically selected
>>> cinder-volume RPC version 3.17 as minimum service version.
>>> 2023-03-22 20:34:59.166 108 INFO
>>> cinder.volume.flows.manager.create_volume
>>> [req-0d2093a0-efbd-45a5-bd7d-cce25ddc200e b240e3e89d99489284cd731e75f2a5db
>>> 4160ce999a31485fa643aed0936dfef0 - - -] Volume
>>> 5743a879-090d-46db-bc7c-1c0b0669a112: being created as image with
>>> specification: {'status': 'creating', 'volume_name':
>>> 'volume-5743a879-090d-46db-bc7c-1c0b0669a112', 'volume_size': 2,
>>> 'image_id': 'acfd0a14-69e0-44d6-a6a1-aa9dc83e9d5b', 'image_location':
>>> ('rbd://a5ae877c-bcba-53fe-8336-450e63014757/images/acfd0a14-69e0-44d6-a6a1-aa9dc83e9d5b/snap',
>>> [{'url':
>>> 'rbd://a5ae877c-bcba-53fe-8336-450e63014757/images/acfd0a14-69e0-44d6-a6a1-aa9dc83e9d5b/snap',
>>> 'metadata': {'store': 'ceph'}}, {'url':
>>> 'rbd://a8d5f1f5-48e7-5ede-89ab-8aca59b6397b/images/acfd0a14-69e0-44d6-a6a1-aa9dc83e9d5b/snap',
>>> 'metadata': {'store': 'dcn02'}}]), 'image_meta': {'name': 'cirros',
>>> 'disk_format': 'qcow2', 'container_format': 'bare', 'visibility': 'public',
>>> 'size': 16338944, 'virtual_size': 117440512, 'status': 'active',
>>> 'checksum': '1d3062cd89af34e419f7100277f38b2b', 'protected': False,
>>> 'min_ram': 0, 'min_disk': 0, 'owner': '4160ce999a31485fa643aed0936dfef0',
>>> 'os_hidden': False, 'os_hash_algo': 'sha512', 'os_hash_value':
>>> '553d220ed58cfee7dafe003c446a9f197ab5edf8ffc09396c74187cf83873c877e7ae041cb80f3b91489acf687183adcd689b53b38e3ddd22e627e7f98a09c46',
>>> 'id': 'acfd0a14-69e0-44d6-a6a1-aa9dc83e9d5b', 'created_at':
>>> datetime.datetime(2023, 3, 22, 18, 50, 5, tzinfo=datetime.timezone.utc),
>>> 'updated_at': datetime.datetime(2023, 3, 22, 20, 3, 54,
>>> tzinfo=datetime.timezone.utc), 'locations': [{'url':
>>> 'rbd://a5ae877c-bcba-53fe-8336-450e63014757/images/acfd0a14-69e0-44d6-a6a1-aa9dc83e9d5b/snap',
>>> 'metadata': {'store': 'ceph'}}, {'url':
>>> 'rbd://a8d5f1f5-48e7-5ede-89ab-8aca59b6397b/images/acfd0a14-69e0-44d6-a6a1-aa9dc83e9d5b/snap',
>>> 'metadata': {'store': 'dcn02'}}], 'direct_url':
>>> 'rbd://a5ae877c-bcba-53fe-8336-450e63014757/images/acfd0a14-69e0-44d6-a6a1-aa9dc83e9d5b/snap',
>>> 'tags': [], 'file': '/v2/images/acfd0a14-69e0-44d6-a6a1-aa9dc83e9d5b/file',
>>> 'stores': 'ceph,dcn02', 'properties': {'os_glance_failed_import': '',
>>> 'os_glance_importing_to_stores': '', 'owner_specified.openstack.md5': '',
>>> 'owner_specified.openstack.object': 'images/cirros',
>>> 'owner_specified.openstack.sha256': ''}}, 'image_service':
>>> <cinder.image.glance.GlanceImageService object at 0x7f8147973438>}
>>>
>>> With regards,
>>> Swogat Pradhan
>>>
>>> On Wed, Mar 22, 2023 at 9:19 PM Alan Bishop <abishop at redhat.com> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Mar 22, 2023 at 8:38 AM Swogat Pradhan <
>>>> swogatpradhan22 at gmail.com> wrote:
>>>>
>>>>> Hi Alan,
>>>>> The systems are on the same LAN. In this case it seemed the image was
>>>>> being pulled from the central site because of a misconfiguration in the
>>>>> ceph.conf file in the /var/lib/tripleo-config/ceph/ directory, which
>>>>> seems to have been resolved by the changes I made to fix it.
>>>>>
>>>>> Right now the glance-api container is running in an unhealthy state, but
>>>>> the podman logs don't show any error whatsoever. When I run
>>>>> netstat -nultp I do not see any entry for the glance port (9292) at the
>>>>> DCN site, which is why cinder is throwing this error:
>>>>>
>>>>> 2023-03-22 13:32:29.786 108 ERROR oslo_messaging.rpc.server
>>>>> cinder.exception.GlanceConnectionFailed: Connection to glance failed: Error
>>>>> finding address for
>>>>> http://172.25.228.253:9292/v2/images/736d8779-07cd-4510-bab2-adcb653cc538:
>>>>> Unable to establish connection to
>>>>> http://172.25.228.253:9292/v2/images/736d8779-07cd-4510-bab2-adcb653cc538:
>>>>> HTTPConnectionPool(host='172.25.228.253', port=9292): Max retries exceeded
>>>>> with url: /v2/images/736d8779-07cd-4510-bab2-adcb653cc538 (Caused by
>>>>> NewConnectionError('<urllib3.connection.HTTPConnection object at
>>>>> 0x7f7682d2cd30>: Failed to establish a new connection: [Errno 111]
>>>>> ECONNREFUSED',))
>>>>>
>>>>> Now I need to find out why the port is not listed even though the glance
>>>>> service is running, and I am not sure how to do that.
>>>>>
>>>>
>>>> One other thing to investigate is whether your deployment includes this
>>>> patch [1]. If it does, then bear in mind
>>>> the glance-api service running at the edge site will be an "internal"
>>>> (non public facing) instance that uses port 9293
>>>> instead of 9292. You should familiarize yourself with the release note
>>>> [2].
>>>>
>>>> [1]
>>>> https://opendev.org/openstack/tripleo-heat-templates/commit/3605d45e417a77a1d0f153fbeffcbb283ec85fe6
>>>> [2]
>>>> https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/wallaby/releasenotes/notes/glance-internal-service-86274f56712ffaac.yaml
>>>>
>>>> Alan
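
A quick way to see which of the two ports mentioned above (9292 or 9293) is actually accepting connections at the edge site is a plain TCP connect test. This is only a sketch: the helper name port_is_open is made up for illustration, and it says nothing about whether the listener is really glance-api.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (e.g. ECONNREFUSED, timeout) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, calling port_is_open("172.25.228.253", 9292) and then the same with 9293 (the address is the one from the cinder error in this thread) would show whether the service is listening on the internal port instead of the public one.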
>>>>
>>>>
>>>>> With regards,
>>>>> Swogat Pradhan
>>>>>
>>>>> On Wed, Mar 22, 2023 at 8:11 PM Alan Bishop <abishop at redhat.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 22, 2023 at 6:37 AM Swogat Pradhan <
>>>>>> swogatpradhan22 at gmail.com> wrote:
>>>>>>
>>>>>>> Update:
>>>>>>> Here is the log when creating a volume using cirros image:
>>>>>>>
>>>>>>> 2023-03-22 11:04:38.449 109 INFO
>>>>>>> cinder.volume.flows.manager.create_volume
>>>>>>> [req-646b9ac8-a5a7-45ac-a96d-8dd6bb45da17 b240e3e89d99489284cd731e75f2a5db
>>>>>>> 4160ce999a31485fa643aed0936dfef0 - - -] Volume
>>>>>>> bf341343-6609-4b8c-b9e0-93e2a89c8c8f: being created as image with
>>>>>>> specification: {'status': 'creating', 'volume_name':
>>>>>>> 'volume-bf341343-6609-4b8c-b9e0-93e2a89c8c8f', 'volume_size': 4,
>>>>>>> 'image_id': '736d8779-07cd-4510-bab2-adcb653cc538', 'image_location':
>>>>>>> ('rbd://a5ae877c-bcba-53fe-8336-450e63014757/images/736d8779-07cd-4510-bab2-adcb653cc538/snap',
>>>>>>> [{'url':
>>>>>>> 'rbd://a5ae877c-bcba-53fe-8336-450e63014757/images/736d8779-07cd-4510-bab2-adcb653cc538/snap',
>>>>>>> 'metadata': {'store': 'ceph'}}, {'url':
>>>>>>> 'rbd://a8d5f1f5-48e7-5ede-89ab-8aca59b6397b/images/736d8779-07cd-4510-bab2-adcb653cc538/snap',
>>>>>>> 'metadata': {'store': 'dcn02'}}]), 'image_meta': {'name': 'cirros',
>>>>>>> 'disk_format': 'qcow2', 'container_format': 'bare', 'visibility': 'public',
>>>>>>> 'size': 16338944, 'virtual_size': 117440512, 'status': 'active',
>>>>>>> 'checksum': '1d3062cd89af34e419f7100277f38b2b', 'protected': False,
>>>>>>> 'min_ram': 0, 'min_disk': 0, 'owner': '4160ce999a31485fa643aed0936dfef0',
>>>>>>> 'os_hidden': False, 'os_hash_algo': 'sha512', 'os_hash_value':
>>>>>>> '553d220ed58cfee7dafe003c446a9f197ab5edf8ffc09396c74187cf83873c877e7ae041cb80f3b91489acf687183adcd689b53b38e3ddd22e627e7f98a09c46',
>>>>>>> 'id': '736d8779-07cd-4510-bab2-adcb653cc538', 'created_at':
>>>>>>> datetime.datetime(2023, 3, 22, 10, 44, 12, tzinfo=datetime.timezone.utc),
>>>>>>> 'updated_at': datetime.datetime(2023, 3, 22, 10, 54, 1,
>>>>>>> tzinfo=datetime.timezone.utc), 'locations': [{'url':
>>>>>>> 'rbd://a5ae877c-bcba-53fe-8336-450e63014757/images/736d8779-07cd-4510-bab2-adcb653cc538/snap',
>>>>>>> 'metadata': {'store': 'ceph'}}, {'url':
>>>>>>> 'rbd://a8d5f1f5-48e7-5ede-89ab-8aca59b6397b/images/736d8779-07cd-4510-bab2-adcb653cc538/snap',
>>>>>>> 'metadata': {'store': 'dcn02'}}], 'direct_url':
>>>>>>> 'rbd://a5ae877c-bcba-53fe-8336-450e63014757/images/736d8779-07cd-4510-bab2-adcb653cc538/snap',
>>>>>>> 'tags': [], 'file': '/v2/images/736d8779-07cd-4510-bab2-adcb653cc538/file',
>>>>>>> 'stores': 'ceph,dcn02', 'properties': {'os_glance_failed_import': '',
>>>>>>> 'os_glance_importing_to_stores': '', 'owner_specified.openstack.md5': '',
>>>>>>> 'owner_specified.openstack.object': 'images/cirros',
>>>>>>> 'owner_specified.openstack.sha256': ''}}, 'image_service':
>>>>>>> <cinder.image.glance.GlanceImageService object at 0x7f449ded1198>}
>>>>>>> 2023-03-22 11:06:16.570 109 INFO cinder.image.image_utils
>>>>>>> [req-646b9ac8-a5a7-45ac-a96d-8dd6bb45da17 b240e3e89d99489284cd731e75f2a5db
>>>>>>> 4160ce999a31485fa643aed0936dfef0 - - -] Image download 15.58 MB at 0.16 MB/s
>>>>>>>
>>>>>>
>>>>>> As Adam Savage would say, well there's your problem ^^ (Image
>>>>>> download 15.58 MB at 0.16 MB/s). Downloading the image takes too long, and
>>>>>> 0.16 MB/s suggests you have a network issue.
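
As a sanity check on that reading of the log, the reported size and rate can be compared with the gap between the two log timestamps. A small sketch using only the numbers quoted above; note the first timestamp is the "being created" entry, which is only an approximation of when the download started:

```python
from datetime import datetime

# Figures taken from the cinder-volume log lines quoted above.
size_mb = 15.58
rate_mb_per_s = 0.16
estimated_seconds = size_mb / rate_mb_per_s  # about 97.4 s

fmt = "%Y-%m-%d %H:%M:%S.%f"
create_logged = datetime.strptime("2023-03-22 11:04:38.449", fmt)
download_logged = datetime.strptime("2023-03-22 11:06:16.570", fmt)
observed_seconds = (download_logged - create_logged).total_seconds()  # about 98.1 s

print(f"estimated from size/rate: {estimated_seconds:.1f} s")
print(f"observed between entries: {observed_seconds:.1f} s")
```

The two values agree to within a second, so nearly all of the time between those two log entries went into transferring a 15.58 MB image.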
>>>>>>
>>>>>> John Fulton previously stated your cinder-volume service at the edge
>>>>>> site is not using the local ceph image store. Assuming you are deploying
>>>>>> GlanceApiEdge service [1], then the cinder-volume service should be
>>>>>> configured to use the local glance service [2]. You should check cinder's
>>>>>> glance_api_servers to confirm it's the edge site's glance service.
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/openstack/tripleo-heat-templates/blob/stable/wallaby/environments/dcn.yaml#L29
>>>>>> [2]
>>>>>> https://github.com/openstack/tripleo-heat-templates/blob/stable/wallaby/deployment/glance/glance-api-edge-container-puppet.yaml#L80
>>>>>>
>>>>>> Alan
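
One way to do the glance_api_servers check is to read the option out of the cinder.conf used by the cinder-volume container. The sketch below assumes the option sits in the [DEFAULT] section and that the file parses as plain INI; the function name and the sample text are illustrative, with the address borrowed from the cinder error seen in this thread.

```python
import configparser
from typing import List


def glance_api_servers(cinder_conf_text: str) -> List[str]:
    """Return the glance endpoints listed in cinder.conf's [DEFAULT] section."""
    # interpolation=None keeps '%' characters literal; strict=False tolerates
    # duplicate options, which can occur in generated config files.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(cinder_conf_text)
    raw = parser.get("DEFAULT", "glance_api_servers", fallback="")
    return [url.strip() for url in raw.split(",") if url.strip()]


# Illustrative sample only, not the real file from this deployment.
SAMPLE_CINDER_CONF = """
[DEFAULT]
glance_api_servers=http://172.25.228.253:9292
"""

print(glance_api_servers(SAMPLE_CINDER_CONF))
```

If the address returned belongs to the central site rather than the edge site, cinder-volume is not talking to the local glance service.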
>>>>>>
>>>>>>
>>>>>>> 2023-03-22 11:07:54.023 109 WARNING py.warnings
>>>>>>> [req-646b9ac8-a5a7-45ac-a96d-8dd6bb45da17 b240e3e89d99489284cd731e75f2a5db
>>>>>>> 4160ce999a31485fa643aed0936dfef0 - - -]
>>>>>>> /usr/lib/python3.6/site-packages/oslo_utils/imageutils.py:75:
>>>>>>> FutureWarning: The human format is deprecated and the format parameter will
>>>>>>> be removed. Use explicitly json instead in version 'xena'
>>>>>>>   category=FutureWarning)
>>>>>>>
>>>>>>> 2023-03-22 11:11:12.161 109 WARNING py.warnings
>>>>>>> [req-646b9ac8-a5a7-45ac-a96d-8dd6bb45da17 b240e3e89d99489284cd731e75f2a5db
>>>>>>> 4160ce999a31485fa643aed0936dfef0 - - -]
>>>>>>> /usr/lib/python3.6/site-packages/oslo_utils/imageutils.py:75:
>>>>>>> FutureWarning: The human format is deprecated and the format parameter will
>>>>>>> be removed. Use explicitly json instead in version 'xena'
>>>>>>>   category=FutureWarning)
>>>>>>>
>>>>>>> 2023-03-22 11:11:12.163 109 INFO cinder.image.image_utils
>>>>>>> [req-646b9ac8-a5a7-45ac-a96d-8dd6bb45da17 b240e3e89d99489284cd731e75f2a5db
>>>>>>> 4160ce999a31485fa643aed0936dfef0 - - -] Converted 112.00 MB image at 112.00
>>>>>>> MB/s
>>>>>>> 2023-03-22 11:11:14.998 109 INFO
>>>>>>> cinder.volume.flows.manager.create_volume
>>>>>>> [req-646b9ac8-a5a7-45ac-a96d-8dd6bb45da17 b240e3e89d99489284cd731e75f2a5db
>>>>>>> 4160ce999a31485fa643aed0936dfef0 - - -] Volume
>>>>>>> volume-bf341343-6609-4b8c-b9e0-93e2a89c8c8f
>>>>>>> (bf341343-6609-4b8c-b9e0-93e2a89c8c8f): created successfully
>>>>>>> 2023-03-22 11:11:15.195 109 INFO cinder.volume.manager
>>>>>>> [req-646b9ac8-a5a7-45ac-a96d-8dd6bb45da17 b240e3e89d99489284cd731e75f2a5db
>>>>>>> 4160ce999a31485fa643aed0936dfef0 - - -] Created volume successfully.
>>>>>>>
>>>>>>> The image is present in the dcn02 store, but cinder still downloaded the
>>>>>>> image at 0.16 MB/s and then created the volume.
>>>>>>>
>>>>>>> With regards,
>>>>>>> Swogat Pradhan
>>>>>>>
>>>>>>> On Tue, Mar 21, 2023 at 6:10 PM Swogat Pradhan <
>>>>>>> swogatpradhan22 at gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi John,
>>>>>>>> This seems to be an issue.
>>>>>>>> When I deployed the DCN ceph clusters in dcn01 and dcn02, the --cluster
>>>>>>>> parameter was set to the respective cluster names, but the config files
>>>>>>>> were created as ceph.conf and the keyring as
>>>>>>>> ceph.client.openstack.keyring.
>>>>>>>>
>>>>>>>> This caused issues in glance as well, because the file names didn't match
>>>>>>>> the cluster names, so I had to manually rename the central ceph conf
>>>>>>>> file as follows:
>>>>>>>>
>>>>>>>> [root at dcn02-compute-0 ~]# cd /var/lib/tripleo-config/ceph/
>>>>>>>> [root at dcn02-compute-0 ceph]# ll
>>>>>>>> total 16
>>>>>>>> -rw-------. 1 root root 257 Mar 13 13:56 ceph_central.client.openstack.keyring
>>>>>>>> -rw-r--r--. 1 root root 428 Mar 13 13:56 ceph_central.conf
>>>>>>>> -rw-------. 1 root root 205 Mar 15 18:45 ceph.client.openstack.keyring
>>>>>>>> -rw-r--r--. 1 root root 362 Mar 15 18:45 ceph.conf
>>>>>>>> [root at dcn02-compute-0 ceph]#
>>>>>>>>
>>>>>>>> ceph.conf and ceph.client.openstack.keyring contain the fsid of the
>>>>>>>> respective clusters in both dcn01 and dcn02.
>>>>>>>> In the CLI output above, ceph.conf and ceph.client... are the files
>>>>>>>> used to access the dcn02 ceph cluster, and the ceph_central* files are
>>>>>>>> used to access the central ceph cluster.
>>>>>>>>
>>>>>>>> glance multistore config:
>>>>>>>> [dcn02]
>>>>>>>> rbd_store_ceph_conf=/etc/ceph/ceph.conf
>>>>>>>> rbd_store_user=openstack
>>>>>>>> rbd_store_pool=images
>>>>>>>> rbd_thin_provisioning=False
>>>>>>>> store_description=dcn02 rbd glance store
>>>>>>>>
>>>>>>>> [ceph_central]
>>>>>>>> rbd_store_ceph_conf=/etc/ceph/ceph_central.conf
>>>>>>>> rbd_store_user=openstack
>>>>>>>> rbd_store_pool=images
>>>>>>>> rbd_thin_provisioning=False
>>>>>>>> store_description=Default glance store backend.
>>>>>>>>
>>>>>>>>
>>>>>>>> With regards,
>>>>>>>> Swogat Pradhan
>>>>>>>>
>>>>>>>> On Tue, Mar 21, 2023 at 5:52 PM John Fulton <johfulto at redhat.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On Tue, Mar 21, 2023 at 8:03 AM Swogat Pradhan
>>>>>>>>> <swogatpradhan22 at gmail.com> wrote:
>>>>>>>>> >
>>>>>>>>> > Hi,
>>>>>>>>> > Seems like cinder is not using the local ceph.
>>>>>>>>>
>>>>>>>>> That explains the issue. It's a misconfiguration.
>>>>>>>>>
>>>>>>>>> I hope this is not a production system since the mailing list now
>>>>>>>>> has
>>>>>>>>> the cinder.conf which contains passwords.
>>>>>>>>>
>>>>>>>>> The section that looks like this:
>>>>>>>>>
>>>>>>>>> [tripleo_ceph]
>>>>>>>>> volume_backend_name=tripleo_ceph
>>>>>>>>> volume_driver=cinder.volume.drivers.rbd.RBDDriver
>>>>>>>>> rbd_ceph_conf=/etc/ceph/ceph.conf
>>>>>>>>> rbd_user=openstack
>>>>>>>>> rbd_pool=volumes
>>>>>>>>> rbd_flatten_volume_from_snapshot=False
>>>>>>>>> rbd_secret_uuid=<redacted>
>>>>>>>>> report_discard_supported=True
>>>>>>>>>
>>>>>>>>> Should be updated to refer to the local DCN ceph cluster and not
>>>>>>>>> the
>>>>>>>>> central one. Use the ceph conf file for that cluster and ensure the
>>>>>>>>> rbd_secret_uuid corresponds to that one.
>>>>>>>>>
>>>>>>>>> TripleO’s convention is to set the rbd_secret_uuid to the FSID of
>>>>>>>>> the
>>>>>>>>> Ceph cluster. The FSID should be in the ceph.conf file. The
>>>>>>>>> tripleo_nova_libvirt role will use virsh secret-* commands so that
>>>>>>>>> libvirt can retrieve the cephx secret using the FSID as a key. This
>>>>>>>>> can be confirmed with `podman exec nova_virtsecretd virsh
>>>>>>>>> secret-get-value $FSID`.
>>>>>>>>>
>>>>>>>>> The documentation describes how to configure the central and DCN
>>>>>>>>> sites
>>>>>>>>> correctly but an error seems to have occurred while you were
>>>>>>>>> following
>>>>>>>>> it.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/distributed_multibackend_storage.html
>>>>>>>>>
>>>>>>>>>   John
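
The convention John describes (rbd_secret_uuid equals the FSID of the Ceph cluster the backend points at) can be checked mechanically. The following is a sketch, assuming both files parse as plain INI and that the FSID is stored as fsid under [global] in the ceph conf. The helper names and the sample ceph conf text are illustrative; the UUIDs are the ones that appear in the rbd:// image locations in this thread for the dcn02 and ceph stores.

```python
import configparser


def _parse_ini(text: str) -> configparser.ConfigParser:
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(text)
    return parser


def secret_matches_fsid(cinder_conf_text: str, backend: str, ceph_conf_text: str) -> bool:
    """True if the backend's rbd_secret_uuid equals the ceph conf's fsid."""
    secret = _parse_ini(cinder_conf_text).get(backend, "rbd_secret_uuid")
    fsid = _parse_ini(ceph_conf_text).get("global", "fsid")
    return secret.strip().lower() == fsid.strip().lower()


# Backend section as posted later in the thread (trimmed to the relevant keys).
CINDER_CONF = """
[tripleo_ceph]
rbd_secret_uuid=a8d5f1f5-48e7-5ede-89ab-8aca59b6397b
rbd_ceph_conf=/etc/ceph/dcn02.conf
"""

# Assumed contents: the real ceph conf files were not posted to the list.
DCN02_CEPH_CONF = "[global]\nfsid = a8d5f1f5-48e7-5ede-89ab-8aca59b6397b\n"
CENTRAL_CEPH_CONF = "[global]\nfsid = a5ae877c-bcba-53fe-8336-450e63014757\n"

print(secret_matches_fsid(CINDER_CONF, "tripleo_ceph", DCN02_CEPH_CONF))
print(secret_matches_fsid(CINDER_CONF, "tripleo_ceph", CENTRAL_CEPH_CONF))
```

A mismatch against the local cluster's conf file would point to the backend still referring to the central cluster.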
>>>>>>>>>
>>>>>>>>> >
>>>>>>>>> > Ceph Output:
>>>>>>>>> > [ceph: root at dcn02-ceph-all-0 /]# rbd -p images ls -l
>>>>>>>>> > NAME                                       SIZE     PARENT  FMT  PROT  LOCK
>>>>>>>>> > 2abfafaa-eff4-4c2e-a538-dc2e1249ab65         8 MiB            2        excl
>>>>>>>>> > 55f40c8a-8f79-48c5-a52a-9b679b762f19        16 MiB            2
>>>>>>>>> > 55f40c8a-8f79-48c5-a52a-9b679b762f19@snap   16 MiB            2  yes
>>>>>>>>> > 59f6a9cd-721c-45b5-a15f-fd021b08160d       321 MiB            2
>>>>>>>>> > 59f6a9cd-721c-45b5-a15f-fd021b08160d@snap  321 MiB            2  yes
>>>>>>>>> > 5f5ddd77-35f3-45e8-9dd3-8c1cbb1f39f0       386 MiB            2
>>>>>>>>> > 5f5ddd77-35f3-45e8-9dd3-8c1cbb1f39f0@snap  386 MiB            2  yes
>>>>>>>>> > 9b27248e-a8cf-4f00-a039-d3e3066cd26a        15 GiB            2
>>>>>>>>> > 9b27248e-a8cf-4f00-a039-d3e3066cd26a@snap   15 GiB            2  yes
>>>>>>>>> > b7356adc-bb47-4c05-968b-6d3c9ca0079b        15 GiB            2
>>>>>>>>> > b7356adc-bb47-4c05-968b-6d3c9ca0079b@snap   15 GiB            2  yes
>>>>>>>>> > e77e78ad-d369-4a1d-b758-8113621269a3        15 GiB            2
>>>>>>>>> > e77e78ad-d369-4a1d-b758-8113621269a3@snap   15 GiB            2  yes
>>>>>>>>> >
>>>>>>>>> > [ceph: root at dcn02-ceph-all-0 /]# rbd -p volumes ls -l
>>>>>>>>> > NAME                                         SIZE     PARENT  FMT  PROT  LOCK
>>>>>>>>> > volume-c644086f-d3cf-406d-b0f1-7691bde5981d  100 GiB            2
>>>>>>>>> > volume-f0969935-a742-4744-9375-80bf323e4d63   10 GiB            2
>>>>>>>>> > [ceph: root at dcn02-ceph-all-0 /]#
>>>>>>>>> >
>>>>>>>>> > Attached the cinder config.
>>>>>>>>> > Please let me know how I can solve this issue.
>>>>>>>>> >
>>>>>>>>> > With regards,
>>>>>>>>> > Swogat Pradhan
>>>>>>>>> >
>>>>>>>>> > On Tue, Mar 21, 2023 at 3:53 PM John Fulton <johfulto at redhat.com>
>>>>>>>>> wrote:
>>>>>>>>> >>
>>>>>>>>> >> In my last message under the line "On a DCN site if you run a
>>>>>>>>> command like this:" I suggested some steps you could try to confirm the
>>>>>>>>> image is a COW from the local glance as well as how to look at your cinder
>>>>>>>>> config.
>>>>>>>>> >>
>>>>>>>>> >> On Tue, Mar 21, 2023, 12:06 AM Swogat Pradhan <
>>>>>>>>> swogatpradhan22 at gmail.com> wrote:
>>>>>>>>> >>>
>>>>>>>>> >>> Update:
>>>>>>>>> >>> I uploaded an image directly to the dcn02 store, and it takes
>>>>>>>>> around 10 to 15 minutes to create a volume from that image in dcn02.
>>>>>>>>> >>> The image size is 389 MB.
>>>>>>>>> >>>
>>>>>>>>> >>> On Mon, Mar 20, 2023 at 10:26 PM Swogat Pradhan <
>>>>>>>>> swogatpradhan22 at gmail.com> wrote:
>>>>>>>>> >>>>
>>>>>>>>> >>>> Hi John,
>>>>>>>>> >>>> I checked the ceph cluster of dcn02, and I can see the images created
>>>>>>>>> after importing from the central site.
>>>>>>>>> >>>> But launching an instance normally fails as it takes a long
>>>>>>>>> time for the volume to get created.
>>>>>>>>> >>>>
>>>>>>>>> >>>> When launching an instance from volume the instance is
>>>>>>>>> getting created properly without any errors.
>>>>>>>>> >>>>
>>>>>>>>> >>>> I tried to cache images in nova using
>>>>>>>>> https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/post_deployment/pre_cache_images.html
>>>>>>>>> but getting checksum failed error.
>>>>>>>>> >>>>
>>>>>>>>> >>>> With regards,
>>>>>>>>> >>>> Swogat Pradhan
>>>>>>>>> >>>>
>>>>>>>>> >>>> On Thu, Mar 16, 2023 at 5:24 PM John Fulton <
>>>>>>>>> johfulto at redhat.com> wrote:
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> On Wed, Mar 15, 2023 at 8:05 PM Swogat Pradhan
>>>>>>>>> >>>>> <swogatpradhan22 at gmail.com> wrote:
>>>>>>>>> >>>>> >
>>>>>>>>> >>>>> > Update: After restarting the nova services on the
>>>>>>>>> controller and running the deploy script on the edge site, I was able to
>>>>>>>>> launch the VM from volume.
>>>>>>>>> >>>>> >
>>>>>>>>> >>>>> > Right now the instance creation is failing as the block
>>>>>>>>> device creation is stuck in creating state, it is taking more than 10 mins
>>>>>>>>> for the volume to be created, whereas the image has already been imported
>>>>>>>>> to the edge glance.
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> Try following this document and making the same observations
>>>>>>>>> in your
>>>>>>>>> >>>>> environment for AZs and their local ceph cluster.
>>>>>>>>> >>>>>
>>>>>>>>> >>>>>
>>>>>>>>> https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/distributed_multibackend_storage.html#confirm-images-may-be-copied-between-sites
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> On a DCN site if you run a command like this:
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> $ sudo cephadm shell --config /etc/ceph/dcn0.conf --keyring /etc/ceph/dcn0.client.admin.keyring
>>>>>>>>> >>>>> $ rbd --cluster dcn0 -p volumes ls -l
>>>>>>>>> >>>>> NAME                                         SIZE   PARENT                                            FMT  PROT  LOCK
>>>>>>>>> >>>>> volume-28c6fc32-047b-4306-ad2d-de2be02716b7  8 GiB  images/8083c7e7-32d8-4f7a-b1da-0ed7884f1076@snap    2        excl
>>>>>>>>> >>>>> $
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> Then, you should see the parent of the volume is the image
>>>>>>>>> which is on
>>>>>>>>> >>>>> the same local ceph cluster.
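
The parent check can also be scripted. The sketch below works on already-parsed listing entries and assumes the shape that `rbd ls -l --format json` is expected to produce, namely an "image" name plus an optional "parent" mapping with "pool", "image" and "snapshot" keys; treat those field names as an assumption and verify them against your Ceph release. The function name is made up for illustration.

```python
from typing import Dict, List


def volumes_without_image_parent(entries: List[Dict], image_pool: str = "images") -> List[str]:
    """Return volume names whose parent is not a snapshot in the image pool."""
    missing = []
    for entry in entries:
        parent = entry.get("parent") or {}
        if parent.get("pool") != image_pool:
            missing.append(entry["image"])
    return missing


# First entry mirrors the example output above (a COW clone of a glance image);
# the other two mirror the dcn02 listing posted later, which shows no PARENT.
SAMPLE_ENTRIES = [
    {
        "image": "volume-28c6fc32-047b-4306-ad2d-de2be02716b7",
        "parent": {
            "pool": "images",
            "image": "8083c7e7-32d8-4f7a-b1da-0ed7884f1076",
            "snapshot": "snap",
        },
    },
    {"image": "volume-c644086f-d3cf-406d-b0f1-7691bde5981d"},
    {"image": "volume-f0969935-a742-4744-9375-80bf323e4d63"},
]

print(volumes_without_image_parent(SAMPLE_ENTRIES))
```

A volume with no parent is not necessarily a problem (blank volumes have none), but a volume created from an image that shows no parent was probably built by downloading and converting the image rather than by cloning it locally.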
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> I wonder if something is misconfigured and thus you're
>>>>>>>>> encountering
>>>>>>>>> >>>>> the streaming behavior described here:
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> Ideally all images should reside in the central Glance and
>>>>>>>>> be copied
>>>>>>>>> >>>>> to DCN sites before instances of those images are booted on
>>>>>>>>> DCN sites.
>>>>>>>>> >>>>> If an image is not copied to a DCN site before it is booted,
>>>>>>>>> then the
>>>>>>>>> >>>>> image will be streamed to the DCN site and then the image
>>>>>>>>> will boot as
>>>>>>>>> >>>>> an instance. This happens because Glance at the DCN site has
>>>>>>>>> access to
>>>>>>>>> >>>>> the images store at the Central ceph cluster. Though the
>>>>>>>>> booting of
>>>>>>>>> >>>>> the image will take time because it has not been copied in
>>>>>>>>> advance,
>>>>>>>>> >>>>> this is still preferable to failing to boot the image.
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> You can also exec into the cinder container at the DCN site
>>>>>>>>> and
>>>>>>>>> >>>>> confirm it's using its local ceph cluster.
>>>>>>>>> >>>>>
>>>>>>>>> >>>>>   John
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> >
>>>>>>>>> >>>>> > I will try and create a new fresh image and test again
>>>>>>>>> then update.
>>>>>>>>> >>>>> >
>>>>>>>>> >>>>> > With regards,
>>>>>>>>> >>>>> > Swogat Pradhan
>>>>>>>>> >>>>> >
>>>>>>>>> >>>>> > On Wed, Mar 15, 2023 at 11:13 PM Swogat Pradhan <
>>>>>>>>> swogatpradhan22 at gmail.com> wrote:
>>>>>>>>> >>>>> >>
>>>>>>>>> >>>>> >> Update:
>>>>>>>>> >>>>> >> In the hypervisor list the compute node state is showing
>>>>>>>>> down.
>>>>>>>>> >>>>> >>
>>>>>>>>> >>>>> >>
>>>>>>>>> >>>>> >> On Wed, Mar 15, 2023 at 11:11 PM Swogat Pradhan <
>>>>>>>>> swogatpradhan22 at gmail.com> wrote:
>>>>>>>>> >>>>> >>>
>>>>>>>>> >>>>> >>> Hi Brendan,
>>>>>>>>> >>>>> >>> Now I have deployed another site where I used a network
>>>>>>>>> template with 2 Linux bonds for all 3 compute nodes and 3 ceph nodes.
>>>>>>>>> >>>>> >>> The bonding option is set to mode=802.3ad (lacp=active).
>>>>>>>>> >>>>> >>> I used a cirros image to launch an instance, but the
>>>>>>>>> instance timed out, so I waited for the volume to be created.
>>>>>>>>> >>>>> >>> Once the volume was created, I tried launching the
>>>>>>>>> instance from the volume, and the instance is still stuck in the spawning state.
>>>>>>>>> >>>>> >>>
>>>>>>>>> >>>>> >>> Here is the nova-compute log:
>>>>>>>>> >>>>> >>>
>>>>>>>>> >>>>> >>> 2023-03-15 17:35:47.739 185437 INFO oslo.privsep.daemon
>>>>>>>>> [-] privsep daemon starting
>>>>>>>>> >>>>> >>> 2023-03-15 17:35:47.744 185437 INFO oslo.privsep.daemon
>>>>>>>>> [-] privsep process running with uid/gid: 0/0
>>>>>>>>> >>>>> >>> 2023-03-15 17:35:47.749 185437 INFO oslo.privsep.daemon
>>>>>>>>> [-] privsep process running with capabilities (eff/prm/inh):
>>>>>>>>> CAP_SYS_ADMIN/CAP_SYS_ADMIN/none
>>>>>>>>> >>>>> >>> 2023-03-15 17:35:47.749 185437 INFO oslo.privsep.daemon
>>>>>>>>> [-] privsep daemon running as pid 185437
>>>>>>>>> >>>>> >>> 2023-03-15 17:35:47.974 8 WARNING
>>>>>>>>> os_brick.initiator.connectors.nvmeof
>>>>>>>>> [req-dbb11a9b-317e-4957-b141-f9e0bdf6a266 b240e3e89d99489284cd731e75f2a5db
>>>>>>>>> 4160ce999a31485fa643aed0936dfef0 - default default] Process execution error
>>>>>>>>> in _get_host_uuid: Unexpected error while running command.
>>>>>>>>> >>>>> >>> Command: blkid overlay -s UUID -o value
>>>>>>>>> >>>>> >>> Exit code: 2
>>>>>>>>> >>>>> >>> Stdout: ''
>>>>>>>>> >>>>> >>> Stderr: '':
>>>>>>>>> oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while
>>>>>>>>> running command.
>>>>>>>>> >>>>> >>> 2023-03-15 17:35:51.616 8 INFO nova.virt.libvirt.driver
>>>>>>>>> [req-dbb11a9b-317e-4957-b141-f9e0bdf6a266 b240e3e89d99489284cd731e75f2a5db
>>>>>>>>> 4160ce999a31485fa643aed0936dfef0 - default default] [instance:
>>>>>>>>> 450b749c-a10a-4308-80a9-3b8020fee758] Creating image
>>>>>>>>> >>>>> >>>
>>>>>>>>> >>>>> >>> It is stuck at "Creating image". Do I need to run the
>>>>>>>>> template mentioned here?
>>>>>>>>> https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/post_deployment/pre_cache_images.html
>>>>>>>>> >>>>> >>>
>>>>>>>>> >>>>> >>> The volume is already created, and I do not understand
>>>>>>>>> why the instance is stuck in the spawning state.
>>>>>>>>> >>>>> >>>
>>>>>>>>> >>>>> >>> With regards,
>>>>>>>>> >>>>> >>> Swogat Pradhan
>>>>>>>>> >>>>> >>>
>>>>>>>>> >>>>> >>>
>>>>>>>>> >>>>> >>> On Sun, Mar 5, 2023 at 4:02 PM Brendan Shephard <
>>>>>>>>> bshephar at redhat.com> wrote:
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> Does your environment use different network interfaces
>>>>>>>>> for each of the networks? Or does it have a bond with everything on it?
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> One issue I have seen before is that when launching
>>>>>>>>> instances, there is a lot of network traffic between nodes as the
>>>>>>>>> hypervisor needs to download the image from Glance. Along with various
>>>>>>>>> other services sending normal network traffic, it can be enough to cause
>>>>>>>>> issues if everything is running over a single 1Gbe interface.
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> I have seen the same situation in fact when using a
>>>>>>>>> single active/backup bond on 1Gbe nics. It’s worth checking the network
>>>>>>>>> traffic while you try to spawn the instance to see if you’re dropping
>>>>>>>>> packets. In the situation I described, there were dropped packets which
>>>>>>>>> resulted in a loss of communication between nova_compute and RMQ, so the
>>>>>>>>> node appeared offline. You should also confirm that nova_compute is being
>>>>>>>>> disconnected in the nova_compute logs if you tail them on the Hypervisor
>>>>>>>>> while spawning the instance.
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> In my case, changing from active/backup to LACP helped.
>>>>>>>>> So, based on that experience, it certainly sounds like
>>>>>>>>> some kind of network issue.
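
One low-effort way to watch for dropped packets while an instance spawns is to sample the per-interface drop counters in /proc/net/dev before and after the launch. The parser below is a sketch: the function name is made up, and the sample text uses invented counter values and a hypothetical bond1 interface purely to show the layout (8 receive fields followed by 8 transmit fields after the interface name).

```python
from typing import Dict, Tuple


def parse_drops(proc_net_dev_text: str) -> Dict[str, Tuple[int, int]]:
    """Map interface name to (rx_drop, tx_drop) from /proc/net/dev content."""
    drops = {}
    for line in proc_net_dev_text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue
        # Receive: bytes packets errs drop ... ; transmit starts at index 8.
        drops[name.strip()] = (int(fields[3]), int(fields[11]))
    return drops


# Invented sample values for illustration only.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
 bond1: 5000000    4000    0   37    0     0          0         0  6000000    4500    0    5    0     0       0          0
"""

print(parse_drops(SAMPLE))
```

On a real node the input would come from reading /proc/net/dev twice, once before and once after the launch attempt; a rising drop count on the bond during the spawn would support the network explanation.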
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> Regards,
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> Brendan Shephard
>>>>>>>>> >>>>> >>>> Senior Software Engineer
>>>>>>>>> >>>>> >>>> Red Hat Australia
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> On 5 Mar 2023, at 6:47 am, Eugen Block <eblock at nde.ag>
>>>>>>>>> wrote:
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> Hi,
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> I tried to help someone with a similar issue some time
>>>>>>>>> ago in this thread:
>>>>>>>>> >>>>> >>>>
>>>>>>>>> https://serverfault.com/questions/1116771/openstack-oslo-messaging-exception-in-nova-conductor
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> But apparently a neutron reinstallation fixed it for
>>>>>>>>> that user, not sure if that could apply here. But is it possible that your
>>>>>>>>> nova and neutron versions are different between central and edge site? Have
>>>>>>>>> you restarted nova and neutron services on the compute nodes after
>>>>>>>>> installation? Do you have debug logs of nova-conductor and maybe nova-compute?
>>>>>>>>> Maybe they can help narrow down the issue.
>>>>>>>>> >>>>> >>>> If there isn't any additional information in the debug
>>>>>>>>> logs I probably would start "tearing down" rabbitmq. I didn't have to do
>>>>>>>>> that in a production system yet so be careful. I can think of two routes:
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> - Either remove queues, exchanges etc. while rabbit is
>>>>>>>>> running, this will most likely impact client IO depending on your load.
>>>>>>>>> Check out the rabbitmqctl commands.
>>>>>>>>> >>>>> >>>> - Or stop the rabbitmq cluster, remove the mnesia
>>>>>>>>> tables from all nodes and restart rabbitmq so the exchanges, queues etc.
>>>>>>>>> rebuild.
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> I can imagine that the failed reply "survives" while
>>>>>>>>> being replicated across the rabbit nodes. But I don't really know the
>>>>>>>>> rabbit internals too well, so maybe someone else can chime in here and give
>>>>>>>>> better advice.
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> Regards,
>>>>>>>>> >>>>> >>>> Eugen
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> Zitat von Swogat Pradhan <swogatpradhan22 at gmail.com>:
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> Hi,
>>>>>>>>> >>>>> >>>> Can someone please help me out on this issue?
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> With regards,
>>>>>>>>> >>>>> >>>> Swogat Pradhan
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> On Thu, Mar 2, 2023 at 1:24 PM Swogat Pradhan <
>>>>>>>>> swogatpradhan22 at gmail.com>
>>>>>>>>> >>>>> >>>> wrote:
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> Hi
>>>>>>>>> >>>>> >>>> I don't see any major packet loss.
>>>>>>>>> >>>>> >>>> It seems the problem is somewhere in rabbitmq maybe but
>>>>>>>>> not due to packet
>>>>>>>>> >>>>> >>>> loss.
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> with regards,
>>>>>>>>> >>>>> >>>> Swogat Pradhan
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> On Wed, Mar 1, 2023 at 3:34 PM Swogat Pradhan <
>>>>>>>>> swogatpradhan22 at gmail.com>
>>>>>>>>> >>>>> >>>> wrote:
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> Hi,
>>>>>>>>> >>>>> >>>> Yes the MTU is the same as the default '1500'.
>>>>>>>>> >>>>> >>>> Generally I haven't seen any packet loss, but never
>>>>>>>>> checked when
>>>>>>>>> >>>>> >>>> launching the instance.
>>>>>>>>> >>>>> >>>> I will check that and come back.
>>>>>>>>> >>>>> >>>> But every time I launch an instance, it gets stuck in the spawning
>>>>>>>>> >>>>> >>>> state and the hypervisor goes down, so I am not sure whether packet
>>>>>>>>> >>>>> >>>> loss causes this.
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> With regards,
>>>>>>>>> >>>>> >>>> Swogat pradhan
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> On Wed, Mar 1, 2023 at 3:30 PM Eugen Block <
>>>>>>>>> eblock at nde.ag> wrote:
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> One more thing coming to mind is MTU size. Are they
>>>>>>>>> identical between
>>>>>>>>> >>>>> >>>> central and edge site? Do you see packet loss through
>>>>>>>>> the tunnel?
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> Quoting Swogat Pradhan <swogatpradhan22 at gmail.com>:
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>> > Hi Eugen,
>>>>>>>>> >>>>> >>>> > Please add my email address to either 'To' or
>>>>>>>>> 'Cc', as I am not
>>>>>>>>> >>>>> >>>> > getting emails from you.
>>>>>>>>> >>>>> >>>> > Coming to the issue:
>>>>>>>>> >>>>> >>>> >
>>>>>>>>> >>>>> >>>> > [root at overcloud-controller-no-ceph-3 /]# rabbitmqctl
>>>>>>>>> list_policies -p
>>>>>>>>> >>>>> >>>> /
>>>>>>>>> >>>>> >>>> > Listing policies for vhost "/" ...
>>>>>>>>> >>>>> >>>> > vhost   name    pattern apply-to        definition
>>>>>>>>>   priority
>>>>>>>>> >>>>> >>>> > /       ha-all  ^(?!amq\.).*    queues
>>>>>>>>> >>>>> >>>> >
>>>>>>>>> >>>>> >>>>
>>>>>>>>> {"ha-mode":"exactly","ha-params":2,"ha-promote-on-shutdown":"always"}   0
>>>>>>>>> >>>>> >>>> >
>>>>>>>>> >>>>> >>>> > I have the edge site compute nodes up, it only goes
>>>>>>>>> down when i am
>>>>>>>>> >>>>> >>>> trying
>>>>>>>>> >>>>> >>>> > to launch an instance and the instance comes to a
>>>>>>>>> spawning state and
>>>>>>>>> >>>>> >>>> then
>>>>>>>>> >>>>> >>>> > gets stuck.
>>>>>>>>> >>>>> >>>> >
>>>>>>>>> >>>>> >>>> > I have a tunnel setup between the central and the
>>>>>>>>> edge sites.
>>>>>>>>> >>>>> >>>> >
>>>>>>>>> >>>>> >>>> > With regards,
>>>>>>>>> >>>>> >>>> > Swogat Pradhan
>>>>>>>>> >>>>> >>>> >
>>>>>>>>> >>>>> >>>> > On Tue, Feb 28, 2023 at 9:11 PM Swogat Pradhan <
>>>>>>>>> >>>>> >>>> swogatpradhan22 at gmail.com>
>>>>>>>>> >>>>> >>>> > wrote:
>>>>>>>>> >>>>> >>>> >
>>>>>>>>> >>>>> >>>> >> Hi Eugen,
>>>>>>>>> >>>>> >>>> >> For some reason i am not getting your email to me
>>>>>>>>> directly, i am
>>>>>>>>> >>>>> >>>> checking
>>>>>>>>> >>>>> >>>> >> the email digest and there i am able to find your
>>>>>>>>> reply.
>>>>>>>>> >>>>> >>>> >> Here is the log for download:
>>>>>>>>> https://we.tl/t-L8FEkGZFSq
>>>>>>>>> >>>>> >>>> >> Yes, these logs are from the time when the issue
>>>>>>>>> occurred.
>>>>>>>>> >>>>> >>>> >>
>>>>>>>>> >>>>> >>>> >> *Note: i am able to create vm's and perform other
>>>>>>>>> activities in the
>>>>>>>>> >>>>> >>>> >> central site, only facing this issue in the edge
>>>>>>>>> site.*
>>>>>>>>> >>>>> >>>> >>
>>>>>>>>> >>>>> >>>> >> With regards,
>>>>>>>>> >>>>> >>>> >> Swogat Pradhan
>>>>>>>>> >>>>> >>>> >>
>>>>>>>>> >>>>> >>>> >> On Mon, Feb 27, 2023 at 5:12 PM Swogat Pradhan <
>>>>>>>>> >>>>> >>>> swogatpradhan22 at gmail.com>
>>>>>>>>> >>>>> >>>> >> wrote:
>>>>>>>>> >>>>> >>>> >>
>>>>>>>>> >>>>> >>>> >>> Hi Eugen,
>>>>>>>>> >>>>> >>>> >>> Thanks for your response.
>>>>>>>>> >>>>> >>>> >>> I have actually a 4 controller setup so here are
>>>>>>>>> the details:
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>> *PCS Status:*
>>>>>>>>> >>>>> >>>> >>>   * Container bundle set: rabbitmq-bundle [
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> 172.25.201.68:8787/tripleomaster/openstack-rabbitmq:pcmklatest]:
>>>>>>>>> >>>>> >>>> >>>     * rabbitmq-bundle-0
>>>>>>>>> (ocf::heartbeat:rabbitmq-cluster):
>>>>>>>>> >>>>> >>>> Started
>>>>>>>>> >>>>> >>>> >>> overcloud-controller-no-ceph-3
>>>>>>>>> >>>>> >>>> >>>     * rabbitmq-bundle-1
>>>>>>>>> (ocf::heartbeat:rabbitmq-cluster):
>>>>>>>>> >>>>> >>>> Started
>>>>>>>>> >>>>> >>>> >>> overcloud-controller-2
>>>>>>>>> >>>>> >>>> >>>     * rabbitmq-bundle-2
>>>>>>>>> (ocf::heartbeat:rabbitmq-cluster):
>>>>>>>>> >>>>> >>>> Started
>>>>>>>>> >>>>> >>>> >>> overcloud-controller-1
>>>>>>>>> >>>>> >>>> >>>     * rabbitmq-bundle-3
>>>>>>>>> (ocf::heartbeat:rabbitmq-cluster):
>>>>>>>>> >>>>> >>>> Started
>>>>>>>>> >>>>> >>>> >>> overcloud-controller-0
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>> I have tried restarting the bundle multiple times
>>>>>>>>> but the issue is
>>>>>>>>> >>>>> >>>> still
>>>>>>>>> >>>>> >>>> >>> present.
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>> *Cluster status:*
>>>>>>>>> >>>>> >>>> >>> [root at overcloud-controller-0 /]# rabbitmqctl
>>>>>>>>> cluster_status
>>>>>>>>> >>>>> >>>> >>> Cluster status of node
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> rabbit at overcloud-controller-0.internalapi.bdxworld.com ...
>>>>>>>>> >>>>> >>>> >>> Basics
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>> Cluster name:
>>>>>>>>> rabbit at overcloud-controller-no-ceph-3.bdxworld.com
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>> Disk Nodes
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> rabbit at overcloud-controller-0.internalapi.bdxworld.com
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> rabbit at overcloud-controller-1.internalapi.bdxworld.com
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> rabbit at overcloud-controller-2.internalapi.bdxworld.com
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> rabbit at overcloud-controller-no-ceph-3.internalapi.bdxworld.com
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>> Running Nodes
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> rabbit at overcloud-controller-0.internalapi.bdxworld.com
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> rabbit at overcloud-controller-1.internalapi.bdxworld.com
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> rabbit at overcloud-controller-2.internalapi.bdxworld.com
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> rabbit at overcloud-controller-no-ceph-3.internalapi.bdxworld.com
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>> Versions
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> rabbit at overcloud-controller-0.internalapi.bdxworld.com: RabbitMQ
>>>>>>>>> >>>>> >>>> 3.8.3
>>>>>>>>> >>>>> >>>> >>> on Erlang 22.3.4.1
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> rabbit at overcloud-controller-1.internalapi.bdxworld.com: RabbitMQ
>>>>>>>>> >>>>> >>>> 3.8.3
>>>>>>>>> >>>>> >>>> >>> on Erlang 22.3.4.1
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> rabbit at overcloud-controller-2.internalapi.bdxworld.com: RabbitMQ
>>>>>>>>> >>>>> >>>> 3.8.3
>>>>>>>>> >>>>> >>>> >>> on Erlang 22.3.4.1
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> rabbit at overcloud-controller-no-ceph-3.internalapi.bdxworld.com:
>>>>>>>>> >>>>> >>>> RabbitMQ
>>>>>>>>> >>>>> >>>> >>> 3.8.3 on Erlang 22.3.4.1
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>> Alarms
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>> (none)
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>> Network Partitions
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>> (none)
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>> Listeners
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>> Node:
>>>>>>>>> rabbit at overcloud-controller-0.internalapi.bdxworld.com,
>>>>>>>>> >>>>> >>>> interface:
>>>>>>>>> >>>>> >>>> >>> [::], port: 25672, protocol: clustering, purpose:
>>>>>>>>> inter-node and CLI
>>>>>>>>> >>>>> >>>> tool
>>>>>>>>> >>>>> >>>> >>> communication
>>>>>>>>> >>>>> >>>> >>> Node:
>>>>>>>>> rabbit at overcloud-controller-0.internalapi.bdxworld.com,
>>>>>>>>> >>>>> >>>> interface:
>>>>>>>>> >>>>> >>>> >>> 172.25.201.212, port: 5672, protocol: amqp,
>>>>>>>>> purpose: AMQP 0-9-1
>>>>>>>>> >>>>> >>>> >>> and AMQP 1.0
>>>>>>>>> >>>>> >>>> >>> Node:
>>>>>>>>> rabbit at overcloud-controller-0.internalapi.bdxworld.com,
>>>>>>>>> >>>>> >>>> interface:
>>>>>>>>> >>>>> >>>> >>> [::], port: 15672, protocol: http, purpose: HTTP API
>>>>>>>>> >>>>> >>>> >>> Node:
>>>>>>>>> rabbit at overcloud-controller-1.internalapi.bdxworld.com,
>>>>>>>>> >>>>> >>>> interface:
>>>>>>>>> >>>>> >>>> >>> [::], port: 25672, protocol: clustering, purpose:
>>>>>>>>> inter-node and CLI
>>>>>>>>> >>>>> >>>> tool
>>>>>>>>> >>>>> >>>> >>> communication
>>>>>>>>> >>>>> >>>> >>> Node:
>>>>>>>>> rabbit at overcloud-controller-1.internalapi.bdxworld.com,
>>>>>>>>> >>>>> >>>> interface:
>>>>>>>>> >>>>> >>>> >>> 172.25.201.205, port: 5672, protocol: amqp,
>>>>>>>>> purpose: AMQP 0-9-1
>>>>>>>>> >>>>> >>>> >>> and AMQP 1.0
>>>>>>>>> >>>>> >>>> >>> Node:
>>>>>>>>> rabbit at overcloud-controller-1.internalapi.bdxworld.com,
>>>>>>>>> >>>>> >>>> interface:
>>>>>>>>> >>>>> >>>> >>> [::], port: 15672, protocol: http, purpose: HTTP API
>>>>>>>>> >>>>> >>>> >>> Node:
>>>>>>>>> rabbit at overcloud-controller-2.internalapi.bdxworld.com,
>>>>>>>>> >>>>> >>>> interface:
>>>>>>>>> >>>>> >>>> >>> [::], port: 25672, protocol: clustering, purpose:
>>>>>>>>> inter-node and CLI
>>>>>>>>> >>>>> >>>> tool
>>>>>>>>> >>>>> >>>> >>> communication
>>>>>>>>> >>>>> >>>> >>> Node:
>>>>>>>>> rabbit at overcloud-controller-2.internalapi.bdxworld.com,
>>>>>>>>> >>>>> >>>> interface:
>>>>>>>>> >>>>> >>>> >>> 172.25.201.201, port: 5672, protocol: amqp,
>>>>>>>>> purpose: AMQP 0-9-1
>>>>>>>>> >>>>> >>>> >>> and AMQP 1.0
>>>>>>>>> >>>>> >>>> >>> Node:
>>>>>>>>> rabbit at overcloud-controller-2.internalapi.bdxworld.com,
>>>>>>>>> >>>>> >>>> interface:
>>>>>>>>> >>>>> >>>> >>> [::], port: 15672, protocol: http, purpose: HTTP API
>>>>>>>>> >>>>> >>>> >>> Node:
>>>>>>>>> rabbit at overcloud-controller-no-ceph-3.internalapi.bdxworld.com
>>>>>>>>> >>>>> >>>> ,
>>>>>>>>> >>>>> >>>> >>> interface: [::], port: 25672, protocol: clustering,
>>>>>>>>> purpose:
>>>>>>>>> >>>>> >>>> inter-node and
>>>>>>>>> >>>>> >>>> >>> CLI tool communication
>>>>>>>>> >>>>> >>>> >>> Node:
>>>>>>>>> rabbit at overcloud-controller-no-ceph-3.internalapi.bdxworld.com
>>>>>>>>> >>>>> >>>> ,
>>>>>>>>> >>>>> >>>> >>> interface: 172.25.201.209, port: 5672, protocol:
>>>>>>>>> amqp, purpose: AMQP
>>>>>>>>> >>>>> >>>> 0-9-1
>>>>>>>>> >>>>> >>>> >>> and AMQP 1.0
>>>>>>>>> >>>>> >>>> >>> Node:
>>>>>>>>> rabbit at overcloud-controller-no-ceph-3.internalapi.bdxworld.com
>>>>>>>>> >>>>> >>>> ,
>>>>>>>>> >>>>> >>>> >>> interface: [::], port: 15672, protocol: http,
>>>>>>>>> purpose: HTTP API
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>> Feature flags
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>> Flag: drop_unroutable_metric, state: enabled
>>>>>>>>> >>>>> >>>> >>> Flag: empty_basic_get_metric, state: enabled
>>>>>>>>> >>>>> >>>> >>> Flag: implicit_default_bindings, state: enabled
>>>>>>>>> >>>>> >>>> >>> Flag: quorum_queue, state: enabled
>>>>>>>>> >>>>> >>>> >>> Flag: virtual_host_metadata, state: enabled
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>> *Logs:*
>>>>>>>>> >>>>> >>>> >>> *(Attached)*
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>> With regards,
>>>>>>>>> >>>>> >>>> >>> Swogat Pradhan
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>> On Sun, Feb 26, 2023 at 2:34 PM Swogat Pradhan <
>>>>>>>>> >>>>> >>>> swogatpradhan22 at gmail.com>
>>>>>>>>> >>>>> >>>> >>> wrote:
>>>>>>>>> >>>>> >>>> >>>
>>>>>>>>> >>>>> >>>> >>>> Hi,
>>>>>>>>> >>>>> >>>> >>>> Please find the nova conductor as well as nova api
>>>>>>>>> log.
>>>>>>>>> >>>>> >>>> >>>>
>>>>>>>>> >>>>> >>>> >>>> nova-conductor:
>>>>>>>>> >>>>> >>>> >>>>
>>>>>>>>> >>>>> >>>> >>>> 2023-02-26 08:45:01.108 31 WARNING
>>>>>>>>> >>>>> >>>> oslo_messaging._drivers.amqpdriver
>>>>>>>>> >>>>> >>>> >>>> [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - -
>>>>>>>>> -]
>>>>>>>>> >>>>> >>>> >>>> reply_349bcb075f8c49329435a0f884b33066 doesn't
>>>>>>>>> exist, drop reply to
>>>>>>>>> >>>>> >>>> >>>> 16152921c1eb45c2b1f562087140168b
>>>>>>>>> >>>>> >>>> >>>> 2023-02-26 08:45:02.144 26 WARNING
>>>>>>>>> >>>>> >>>> oslo_messaging._drivers.amqpdriver
>>>>>>>>> >>>>> >>>> >>>> [req-7b43c4e5-0475-4598-92c0-fcacb51d9813 - - - -
>>>>>>>>> -]
>>>>>>>>> >>>>> >>>> >>>> reply_276049ec36a84486a8a406911d9802f4 doesn't
>>>>>>>>> exist, drop reply to
>>>>>>>>> >>>>> >>>> >>>> 83dbe5f567a940b698acfe986f6194fa
>>>>>>>>> >>>>> >>>> >>>> 2023-02-26 08:45:02.314 32 WARNING
>>>>>>>>> >>>>> >>>> oslo_messaging._drivers.amqpdriver
>>>>>>>>> >>>>> >>>> >>>> [req-7b43c4e5-0475-4598-92c0-fcacb51d9813 - - - -
>>>>>>>>> -]
>>>>>>>>> >>>>> >>>> >>>> reply_276049ec36a84486a8a406911d9802f4 doesn't
>>>>>>>>> exist, drop reply to
>>>>>>>>> >>>>> >>>> >>>> f3bfd7f65bd542b18d84cea3033abb43:
>>>>>>>>> >>>>> >>>> >>>> oslo_messaging.exceptions.MessageUndeliverable
>>>>>>>>> >>>>> >>>> >>>> 2023-02-26 08:45:02.316 32 ERROR
>>>>>>>>> oslo_messaging._drivers.amqpdriver
>>>>>>>>> >>>>> >>>> >>>> [req-7b43c4e5-0475-4598-92c0-fcacb51d9813 - - - -
>>>>>>>>> -] The reply
>>>>>>>>> >>>>> >>>> >>>> f3bfd7f65bd542b18d84cea3033abb43 failed to send
>>>>>>>>> after 60 seconds
>>>>>>>>> >>>>> >>>> due to a
>>>>>>>>> >>>>> >>>> >>>> missing queue
>>>>>>>>> (reply_276049ec36a84486a8a406911d9802f4).
>>>>>>>>> >>>>> >>>> Abandoning...:
>>>>>>>>> >>>>> >>>> >>>> oslo_messaging.exceptions.MessageUndeliverable
>>>>>>>>> >>>>> >>>> >>>> 2023-02-26 08:48:01.282 35 WARNING
>>>>>>>>> >>>>> >>>> oslo_messaging._drivers.amqpdriver
>>>>>>>>> >>>>> >>>> >>>> [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - -
>>>>>>>>> -]
>>>>>>>>> >>>>> >>>> >>>> reply_349bcb075f8c49329435a0f884b33066 doesn't
>>>>>>>>> exist, drop reply to
>>>>>>>>> >>>>> >>>> >>>> d4b9180f91a94f9a82c3c9c4b7595566:
>>>>>>>>> >>>>> >>>> >>>> oslo_messaging.exceptions.MessageUndeliverable
>>>>>>>>> >>>>> >>>> >>>> 2023-02-26 08:48:01.284 35 ERROR
>>>>>>>>> oslo_messaging._drivers.amqpdriver
>>>>>>>>> >>>>> >>>> >>>> [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - -
>>>>>>>>> -] The reply
>>>>>>>>> >>>>> >>>> >>>> d4b9180f91a94f9a82c3c9c4b7595566 failed to send
>>>>>>>>> after 60 seconds
>>>>>>>>> >>>>> >>>> due to a
>>>>>>>>> >>>>> >>>> >>>> missing queue
>>>>>>>>> (reply_349bcb075f8c49329435a0f884b33066).
>>>>>>>>> >>>>> >>>> Abandoning...:
>>>>>>>>> >>>>> >>>> >>>> oslo_messaging.exceptions.MessageUndeliverable
>>>>>>>>> >>>>> >>>> >>>> 2023-02-26 08:49:01.303 33 WARNING
>>>>>>>>> >>>>> >>>> oslo_messaging._drivers.amqpdriver
>>>>>>>>> >>>>> >>>> >>>> [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - -
>>>>>>>>> -]
>>>>>>>>> >>>>> >>>> >>>> reply_349bcb075f8c49329435a0f884b33066 doesn't
>>>>>>>>> exist, drop reply to
>>>>>>>>> >>>>> >>>> >>>> 897911a234a445d8a0d8af02ece40f6f:
>>>>>>>>> >>>>> >>>> >>>> oslo_messaging.exceptions.MessageUndeliverable
>>>>>>>>> >>>>> >>>> >>>> 2023-02-26 08:49:01.304 33 ERROR
>>>>>>>>> oslo_messaging._drivers.amqpdriver
>>>>>>>>> >>>>> >>>> >>>> [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - -
>>>>>>>>> -] The reply
>>>>>>>>> >>>>> >>>> >>>> 897911a234a445d8a0d8af02ece40f6f failed to send
>>>>>>>>> after 60 seconds
>>>>>>>>> >>>>> >>>> due to a
>>>>>>>>> >>>>> >>>> >>>> missing queue
>>>>>>>>> (reply_349bcb075f8c49329435a0f884b33066).
>>>>>>>>> >>>>> >>>> Abandoning...:
>>>>>>>>> >>>>> >>>> >>>> oslo_messaging.exceptions.MessageUndeliverable
>>>>>>>>> >>>>> >>>> >>>> 2023-02-26 08:49:52.254 31 WARNING nova.cache_utils
>>>>>>>>> >>>>> >>>> >>>> [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45
>>>>>>>>> >>>>> >>>> b240e3e89d99489284cd731e75f2a5db
>>>>>>>>> >>>>> >>>> >>>> 4160ce999a31485fa643aed0936dfef0 - default
>>>>>>>>> default] Cache enabled
>>>>>>>>> >>>>> >>>> with
>>>>>>>>> >>>>> >>>> >>>> backend dogpile.cache.null.
>>>>>>>>> >>>>> >>>> >>>> 2023-02-26 08:50:01.264 27 WARNING
>>>>>>>>> >>>>> >>>> oslo_messaging._drivers.amqpdriver
>>>>>>>>> >>>>> >>>> >>>> [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - -
>>>>>>>>> -]
>>>>>>>>> >>>>> >>>> >>>> reply_349bcb075f8c49329435a0f884b33066 doesn't
>>>>>>>>> exist, drop reply to
>>>>>>>>> >>>>> >>>> >>>> 8f723ceb10c3472db9a9f324861df2bb:
>>>>>>>>> >>>>> >>>> >>>> oslo_messaging.exceptions.MessageUndeliverable
>>>>>>>>> >>>>> >>>> >>>> 2023-02-26 08:50:01.266 27 ERROR
>>>>>>>>> oslo_messaging._drivers.amqpdriver
>>>>>>>>> >>>>> >>>> >>>> [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - -
>>>>>>>>> -] The reply
>>>>>>>>> >>>>> >>>> >>>> 8f723ceb10c3472db9a9f324861df2bb failed to send
>>>>>>>>> after 60 seconds
>>>>>>>>> >>>>> >>>> due to a
>>>>>>>>> >>>>> >>>> >>>> missing queue
>>>>>>>>> (reply_349bcb075f8c49329435a0f884b33066).
>>>>>>>>> >>>>> >>>> Abandoning...:
>>>>>>>>> >>>>> >>>> >>>> oslo_messaging.exceptions.MessageUndeliverable
>>>>>>>>> >>>>> >>>> >>>>
>>>>>>>>> >>>>> >>>> >>>> With regards,
>>>>>>>>> >>>>> >>>> >>>> Swogat Pradhan
>>>>>>>>> >>>>> >>>> >>>>
>>>>>>>>> >>>>> >>>> >>>> On Sun, Feb 26, 2023 at 2:26 PM Swogat Pradhan <
>>>>>>>>> >>>>> >>>> >>>> swogatpradhan22 at gmail.com> wrote:
>>>>>>>>> >>>>> >>>> >>>>
>>>>>>>>> >>>>> >>>> >>>>> Hi,
>>>>>>>>> >>>>> >>>> >>>>> I currently have 3 compute nodes on edge site1
>>>>>>>>> where i am trying to
>>>>>>>>> >>>>> >>>> >>>>> launch vm's.
>>>>>>>>> >>>>> >>>> >>>>> When the VM is in spawning state the node goes
>>>>>>>>> down (openstack
>>>>>>>>> >>>>> >>>> compute
>>>>>>>>> >>>>> >>>> >>>>> service list), the node comes back up when I
>>>>>>>>> restart the nova
>>>>>>>>> >>>>> >>>> compute
>>>>>>>>> >>>>> >>>> >>>>> service but then the launch of the vm fails.
>>>>>>>>> >>>>> >>>> >>>>>
>>>>>>>>> >>>>> >>>> >>>>> nova-compute.log
>>>>>>>>> >>>>> >>>> >>>>>
>>>>>>>>> >>>>> >>>> >>>>> 2023-02-26 08:15:51.808 7 INFO
>>>>>>>>> nova.compute.manager
>>>>>>>>> >>>>> >>>> >>>>> [req-bc0f5f2e-53fc-4dae-b1da-82f1f972d617 - - - -
>>>>>>>>> -] Running
>>>>>>>>> >>>>> >>>> >>>>> instance usage
>>>>>>>>> >>>>> >>>> >>>>> audit for host dcn01-hci-0.bdxworld.com from
>>>>>>>>> 2023-02-26 07:00:00
>>>>>>>>> >>>>> >>>> to
>>>>>>>>> >>>>> >>>> >>>>> 2023-02-26 08:00:00. 0 instances.
>>>>>>>>> >>>>> >>>> >>>>> 2023-02-26 08:49:52.813 7 INFO nova.compute.claims
>>>>>>>>> >>>>> >>>> >>>>> [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45
>>>>>>>>> >>>>> >>>> >>>>> b240e3e89d99489284cd731e75f2a5db
>>>>>>>>> >>>>> >>>> >>>>> 4160ce999a31485fa643aed0936dfef0 - default
>>>>>>>>> default] [instance:
>>>>>>>>> >>>>> >>>> >>>>> 0c62c1ef-9010-417d-a05f-4db77e901600] Claim
>>>>>>>>> successful on node
>>>>>>>>> >>>>> >>>> >>>>> dcn01-hci-0.bdxworld.com
>>>>>>>>> >>>>> >>>> >>>>> 2023-02-26 08:49:54.225 7 INFO
>>>>>>>>> nova.virt.libvirt.driver
>>>>>>>>> >>>>> >>>> >>>>> [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45
>>>>>>>>> >>>>> >>>> >>>>> b240e3e89d99489284cd731e75f2a5db
>>>>>>>>> >>>>> >>>> >>>>> 4160ce999a31485fa643aed0936dfef0 - default
>>>>>>>>> default] [instance:
>>>>>>>>> >>>>> >>>> >>>>> 0c62c1ef-9010-417d-a05f-4db77e901600] Ignoring
>>>>>>>>> supplied device
>>>>>>>>> >>>>> >>>> name:
>>>>>>>>> >>>>> >>>> >>>>> /dev/vda. Libvirt can't honour user-supplied dev
>>>>>>>>> names
>>>>>>>>> >>>>> >>>> >>>>> 2023-02-26 08:49:54.398 7 INFO
>>>>>>>>> nova.virt.block_device
>>>>>>>>> >>>>> >>>> >>>>> [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45
>>>>>>>>> >>>>> >>>> >>>>> b240e3e89d99489284cd731e75f2a5db
>>>>>>>>> >>>>> >>>> >>>>> 4160ce999a31485fa643aed0936dfef0 - default
>>>>>>>>> default] [instance:
>>>>>>>>> >>>>> >>>> >>>>> 0c62c1ef-9010-417d-a05f-4db77e901600] Booting
>>>>>>>>> with volume
>>>>>>>>> >>>>> >>>> >>>>> c4bd7885-5973-4860-bbe6-7a2f726baeee at /dev/vda
>>>>>>>>> >>>>> >>>> >>>>> 2023-02-26 08:49:55.216 7 WARNING nova.cache_utils
>>>>>>>>> >>>>> >>>> >>>>> [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45
>>>>>>>>> >>>>> >>>> >>>>> b240e3e89d99489284cd731e75f2a5db
>>>>>>>>> >>>>> >>>> >>>>> 4160ce999a31485fa643aed0936dfef0 - default
>>>>>>>>> default] Cache enabled
>>>>>>>>> >>>>> >>>> with
>>>>>>>>> >>>>> >>>> >>>>> backend dogpile.cache.null.
>>>>>>>>> >>>>> >>>> >>>>> 2023-02-26 08:49:55.283 7 INFO oslo.privsep.daemon
>>>>>>>>> >>>>> >>>> >>>>> [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45
>>>>>>>>> >>>>> >>>> >>>>> b240e3e89d99489284cd731e75f2a5db
>>>>>>>>> >>>>> >>>> >>>>> 4160ce999a31485fa643aed0936dfef0 - default
>>>>>>>>> default] Running
>>>>>>>>> >>>>> >>>> >>>>> privsep helper:
>>>>>>>>> >>>>> >>>> >>>>> ['sudo', 'nova-rootwrap',
>>>>>>>>> '/etc/nova/rootwrap.conf',
>>>>>>>>> >>>>> >>>> 'privsep-helper',
>>>>>>>>> >>>>> >>>> >>>>> '--config-file', '/etc/nova/nova.conf',
>>>>>>>>> '--config-file',
>>>>>>>>> >>>>> >>>> >>>>> '/etc/nova/nova-compute.conf',
>>>>>>>>> '--privsep_context',
>>>>>>>>> >>>>> >>>> >>>>> 'os_brick.privileged.default',
>>>>>>>>> '--privsep_sock_path',
>>>>>>>>> >>>>> >>>> >>>>> '/tmp/tmpin40tah6/privsep.sock']
>>>>>>>>> >>>>> >>>> >>>>> 2023-02-26 08:49:55.791 7 INFO oslo.privsep.daemon
>>>>>>>>> >>>>> >>>> >>>>> [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45
>>>>>>>>> >>>>> >>>> >>>>> b240e3e89d99489284cd731e75f2a5db
>>>>>>>>> >>>>> >>>> >>>>> 4160ce999a31485fa643aed0936dfef0 - default
>>>>>>>>> default] Spawned new
>>>>>>>>> >>>>> >>>> privsep
>>>>>>>>> >>>>> >>>> >>>>> daemon via rootwrap
>>>>>>>>> >>>>> >>>> >>>>> 2023-02-26 08:49:55.717 2647 INFO
>>>>>>>>> oslo.privsep.daemon [-] privsep
>>>>>>>>> >>>>> >>>> >>>>> daemon starting
>>>>>>>>> >>>>> >>>> >>>>> 2023-02-26 08:49:55.722 2647 INFO
>>>>>>>>> oslo.privsep.daemon [-] privsep
>>>>>>>>> >>>>> >>>> >>>>> process running with uid/gid: 0/0
>>>>>>>>> >>>>> >>>> >>>>> 2023-02-26 08:49:55.726 2647 INFO
>>>>>>>>> oslo.privsep.daemon [-] privsep
>>>>>>>>> >>>>> >>>> >>>>> process running with capabilities (eff/prm/inh):
>>>>>>>>> >>>>> >>>> >>>>> CAP_SYS_ADMIN/CAP_SYS_ADMIN/none
>>>>>>>>> >>>>> >>>> >>>>> 2023-02-26 08:49:55.726 2647 INFO
>>>>>>>>> oslo.privsep.daemon [-] privsep
>>>>>>>>> >>>>> >>>> >>>>> daemon running as pid 2647
>>>>>>>>> >>>>> >>>> >>>>> 2023-02-26 08:49:55.956 7 WARNING
>>>>>>>>> >>>>> >>>> os_brick.initiator.connectors.nvmeof
>>>>>>>>> >>>>> >>>> >>>>> [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45
>>>>>>>>> >>>>> >>>> >>>>> b240e3e89d99489284cd731e75f2a5db
>>>>>>>>> >>>>> >>>> >>>>> 4160ce999a31485fa643aed0936dfef0 - default
>>>>>>>>> default] Process
>>>>>>>>> >>>>> >>>> >>>>> execution error
>>>>>>>>> >>>>> >>>> >>>>> in _get_host_uuid: Unexpected error while running
>>>>>>>>> command.
>>>>>>>>> >>>>> >>>> >>>>> Command: blkid overlay -s UUID -o value
>>>>>>>>> >>>>> >>>> >>>>> Exit code: 2
>>>>>>>>> >>>>> >>>> >>>>> Stdout: ''
>>>>>>>>> >>>>> >>>> >>>>> Stderr: '':
>>>>>>>>> oslo_concurrency.processutils.ProcessExecutionError:
>>>>>>>>> >>>>> >>>> >>>>> Unexpected error while running command.
>>>>>>>>> >>>>> >>>> >>>>> 2023-02-26 08:49:58.247 7 INFO
>>>>>>>>> nova.virt.libvirt.driver
>>>>>>>>> >>>>> >>>> >>>>> [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45
>>>>>>>>> >>>>> >>>> >>>>> b240e3e89d99489284cd731e75f2a5db
>>>>>>>>> >>>>> >>>> >>>>> 4160ce999a31485fa643aed0936dfef0 - default
>>>>>>>>> default] [instance:
>>>>>>>>> >>>>> >>>> >>>>> 0c62c1ef-9010-417d-a05f-4db77e901600] Creating
>>>>>>>>> image
>>>>>>>>> >>>>> >>>> >>>>>
>>>>>>>>> >>>>> >>>> >>>>> Is there a way to solve this issue?
>>>>>>>>> >>>>> >>>> >>>>>
>>>>>>>>> >>>>> >>>> >>>>>
>>>>>>>>> >>>>> >>>> >>>>> With regards,
>>>>>>>>> >>>>> >>>> >>>>>
>>>>>>>>> >>>>> >>>> >>>>> Swogat Pradhan
>>>>>>>>> >>>>> >>>> >>>>>
>>>>>>>>> >>>>> >>>> >>>>
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>> >>>>
>>>>>>>>> >>>>>
>>>>>>>>>
>>>>>>>>>
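
For anyone reading the quoted nova-conductor log, here is a minimal sketch of one way to tally which reply queues are reported missing. It only looks for the "reply_... doesn't exist, drop reply to ..." text shown in the log above. The function name and the regular expression are my own assumptions for illustration and are not part of oslo.messaging.

```python
import re
from collections import Counter

# Hypothetical pattern, based only on the warning text quoted in this thread:
# "reply_<32 hex chars> doesn't exist, drop reply to <32 hex chars>"
MISSING_REPLY = re.compile(
    r"(reply_[0-9a-f]{32}) doesn't exist, drop reply to ([0-9a-f]{32})"
)


def summarize_missing_reply_queues(log_text):
    """Return a Counter mapping each missing reply queue to its dropped replies."""
    # Mail wrapping can split one log record over several lines,
    # so collapse all whitespace before matching.
    flat = " ".join(log_text.split())
    counts = Counter()
    for queue, _msg_id in MISSING_REPLY.findall(flat):
        counts[queue] += 1
    return counts


# Sample records copied from the nova-conductor log quoted above.
sample = """
2023-02-26 08:45:01.108 31 WARNING oslo_messaging._drivers.amqpdriver
[req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - - -]
reply_349bcb075f8c49329435a0f884b33066 doesn't exist, drop reply to
16152921c1eb45c2b1f562087140168b
2023-02-26 08:45:02.144 26 WARNING oslo_messaging._drivers.amqpdriver
[req-7b43c4e5-0475-4598-92c0-fcacb51d9813 - - - - -]
reply_276049ec36a84486a8a406911d9802f4 doesn't exist, drop reply to
83dbe5f567a940b698acfe986f6194fa
2023-02-26 08:48:01.282 35 WARNING oslo_messaging._drivers.amqpdriver
[req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - - -]
reply_349bcb075f8c49329435a0f884b33066 doesn't exist, drop reply to
d4b9180f91a94f9a82c3c9c4b7595566:
oslo_messaging.exceptions.MessageUndeliverable
"""

summary = summarize_missing_reply_queues(sample)
for queue, dropped in summary.most_common():
    print(queue, dropped)
```

If the same one or two queue names keep recurring, as they do in the quoted log, that may suggest a specific caller whose reply queue no longer exists rather than a general broker outage, but this is only a reading aid and not a diagnosis.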