Hi Eugen,

Could you please add my email address to 'To' or 'Cc'? I am not receiving your emails directly.

Coming to the issue:

[root@overcloud-controller-no-ceph-3 /]# rabbitmqctl list_policies -p /
Listing policies for vhost "/" ...
vhost   name    pattern        apply-to   definition                                                              priority
/       ha-all  ^(?!amq\.).*   queues     {"ha-mode":"exactly","ha-params":2,"ha-promote-on-shutdown":"always"}   0

The edge site compute nodes are up; a node only goes down when I try to launch an instance, and the instance reaches the spawning state and then gets stuck. I have a tunnel set up between the central and the edge sites.

With regards,
Swogat Pradhan

On Tue, Feb 28, 2023 at 9:11 PM Swogat Pradhan <swogatpradhan22@gmail.com> wrote:
Hi Eugen,

For some reason I am not receiving your emails directly; I am checking the email digest, and that is where I find your replies.

Here is the log for download: https://we.tl/t-L8FEkGZFSq
Yes, these logs are from the time when the issue occurred.
*Note: I am able to create VMs and perform other activities at the central site; I am only facing this issue at the edge site. See the connectivity check sketched below.*
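Since only the edge site is affected and the traffic goes through the tunnel, I also want to rule out reachability/MTU problems between the edge compute nodes and the controllers. A rough sketch of what I plan to run from an edge compute node (the address 172.25.201.212 is just one of the controller AMQP listener IPs from the cluster status further down in this thread; adjust as needed):

# can the edge compute reach the controllers' AMQP listeners over the tunnel?
nc -vz 172.25.201.212 5672
# does a full-size packet cross the tunnel without fragmentation (MTU check)?
ping -M do -s 1472 -c 3 172.25.201.212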
With regards, Swogat Pradhan
On Mon, Feb 27, 2023 at 5:12 PM Swogat Pradhan <swogatpradhan22@gmail.com> wrote:
Hi Eugen,

Thanks for your response. I actually have a 4-controller setup, so here are the details:
*PCS Status:*
  * Container bundle set: rabbitmq-bundle [172.25.201.68:8787/tripleomaster/openstack-rabbitmq:pcmklatest]:
    * rabbitmq-bundle-0 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-no-ceph-3
    * rabbitmq-bundle-1 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-2
    * rabbitmq-bundle-2 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-1
    * rabbitmq-bundle-3 (ocf::heartbeat:rabbitmq-cluster): Started overcloud-controller-0
I have tried restarting the bundle multiple times but the issue is still present.
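For reference, this is roughly how I have been restarting the bundle and verifying it afterwards (a sketch, run from one of the controllers; the bundle name is the one from the pcs output above):

# restart all replicas of the rabbitmq bundle
pcs resource restart rabbitmq-bundle
# confirm every rabbitmq-bundle-* replica reports Started again
pcs status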
*Cluster status:*
[root@overcloud-controller-0 /]# rabbitmqctl cluster_status
Cluster status of node rabbit@overcloud-controller-0.internalapi.bdxworld.com ...

Basics
Cluster name: rabbit@overcloud-controller-no-ceph-3.bdxworld.com
Disk Nodes
rabbit@overcloud-controller-0.internalapi.bdxworld.com
rabbit@overcloud-controller-1.internalapi.bdxworld.com
rabbit@overcloud-controller-2.internalapi.bdxworld.com
rabbit@overcloud-controller-no-ceph-3.internalapi.bdxworld.com
Running Nodes
rabbit@overcloud-controller-0.internalapi.bdxworld.com
rabbit@overcloud-controller-1.internalapi.bdxworld.com
rabbit@overcloud-controller-2.internalapi.bdxworld.com
rabbit@overcloud-controller-no-ceph-3.internalapi.bdxworld.com
Versions
rabbit@overcloud-controller-0.internalapi.bdxworld.com: RabbitMQ 3.8.3 on Erlang 22.3.4.1
rabbit@overcloud-controller-1.internalapi.bdxworld.com: RabbitMQ 3.8.3 on Erlang 22.3.4.1
rabbit@overcloud-controller-2.internalapi.bdxworld.com: RabbitMQ 3.8.3 on Erlang 22.3.4.1
rabbit@overcloud-controller-no-ceph-3.internalapi.bdxworld.com: RabbitMQ 3.8.3 on Erlang 22.3.4.1
Alarms
(none)
Network Partitions
(none)
Listeners
Node: rabbit@overcloud-controller-0.internalapi.bdxworld.com, interface: [::], port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@overcloud-controller-0.internalapi.bdxworld.com, interface: 172.25.201.212, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
Node: rabbit@overcloud-controller-0.internalapi.bdxworld.com, interface: [::], port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@overcloud-controller-1.internalapi.bdxworld.com, interface: [::], port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@overcloud-controller-1.internalapi.bdxworld.com, interface: 172.25.201.205, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
Node: rabbit@overcloud-controller-1.internalapi.bdxworld.com, interface: [::], port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@overcloud-controller-2.internalapi.bdxworld.com, interface: [::], port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@overcloud-controller-2.internalapi.bdxworld.com, interface: 172.25.201.201, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
Node: rabbit@overcloud-controller-2.internalapi.bdxworld.com, interface: [::], port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@overcloud-controller-no-ceph-3.internalapi.bdxworld.com, interface: [::], port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@overcloud-controller-no-ceph-3.internalapi.bdxworld.com, interface: 172.25.201.209, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
Node: rabbit@overcloud-controller-no-ceph-3.internalapi.bdxworld.com, interface: [::], port: 15672, protocol: http, purpose: HTTP API
Feature flags
Flag: drop_unroutable_metric, state: enabled
Flag: empty_basic_get_metric, state: enabled
Flag: implicit_default_bindings, state: enabled
Flag: quorum_queue, state: enabled
Flag: virtual_host_metadata, state: enabled
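If it helps, I can also dump the oslo.messaging reply queues and the client connections on a controller. A minimal sketch of what I would run (same rabbitmqctl as above; the grep pattern is just an example):

# list all reply queues with their message and consumer counts
rabbitmqctl list_queues -p / name messages consumers | grep reply_
# list client connections so I can see whether the edge compute nodes are connected
rabbitmqctl list_connections user peer_host state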
*Logs:* *(Attached)*
With regards, Swogat Pradhan
On Sun, Feb 26, 2023 at 2:34 PM Swogat Pradhan <swogatpradhan22@gmail.com> wrote:
Hi,

Please find the nova-conductor as well as the nova-api log.
nova-conductor:
2023-02-26 08:45:01.108 31 WARNING oslo_messaging._drivers.amqpdriver [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - - -] reply_349bcb075f8c49329435a0f884b33066 doesn't exist, drop reply to 16152921c1eb45c2b1f562087140168b
2023-02-26 08:45:02.144 26 WARNING oslo_messaging._drivers.amqpdriver [req-7b43c4e5-0475-4598-92c0-fcacb51d9813 - - - - -] reply_276049ec36a84486a8a406911d9802f4 doesn't exist, drop reply to 83dbe5f567a940b698acfe986f6194fa
2023-02-26 08:45:02.314 32 WARNING oslo_messaging._drivers.amqpdriver [req-7b43c4e5-0475-4598-92c0-fcacb51d9813 - - - - -] reply_276049ec36a84486a8a406911d9802f4 doesn't exist, drop reply to f3bfd7f65bd542b18d84cea3033abb43: oslo_messaging.exceptions.MessageUndeliverable
2023-02-26 08:45:02.316 32 ERROR oslo_messaging._drivers.amqpdriver [req-7b43c4e5-0475-4598-92c0-fcacb51d9813 - - - - -] The reply f3bfd7f65bd542b18d84cea3033abb43 failed to send after 60 seconds due to a missing queue (reply_276049ec36a84486a8a406911d9802f4). Abandoning...: oslo_messaging.exceptions.MessageUndeliverable
2023-02-26 08:48:01.282 35 WARNING oslo_messaging._drivers.amqpdriver [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - - -] reply_349bcb075f8c49329435a0f884b33066 doesn't exist, drop reply to d4b9180f91a94f9a82c3c9c4b7595566: oslo_messaging.exceptions.MessageUndeliverable
2023-02-26 08:48:01.284 35 ERROR oslo_messaging._drivers.amqpdriver [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - - -] The reply d4b9180f91a94f9a82c3c9c4b7595566 failed to send after 60 seconds due to a missing queue (reply_349bcb075f8c49329435a0f884b33066). Abandoning...: oslo_messaging.exceptions.MessageUndeliverable
2023-02-26 08:49:01.303 33 WARNING oslo_messaging._drivers.amqpdriver [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - - -] reply_349bcb075f8c49329435a0f884b33066 doesn't exist, drop reply to 897911a234a445d8a0d8af02ece40f6f: oslo_messaging.exceptions.MessageUndeliverable
2023-02-26 08:49:01.304 33 ERROR oslo_messaging._drivers.amqpdriver [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - - -] The reply 897911a234a445d8a0d8af02ece40f6f failed to send after 60 seconds due to a missing queue (reply_349bcb075f8c49329435a0f884b33066). Abandoning...: oslo_messaging.exceptions.MessageUndeliverable
2023-02-26 08:49:52.254 31 WARNING nova.cache_utils [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45 b240e3e89d99489284cd731e75f2a5db 4160ce999a31485fa643aed0936dfef0 - default default] Cache enabled with backend dogpile.cache.null.
2023-02-26 08:50:01.264 27 WARNING oslo_messaging._drivers.amqpdriver [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - - -] reply_349bcb075f8c49329435a0f884b33066 doesn't exist, drop reply to 8f723ceb10c3472db9a9f324861df2bb: oslo_messaging.exceptions.MessageUndeliverable
2023-02-26 08:50:01.266 27 ERROR oslo_messaging._drivers.amqpdriver [req-caefe26d-153a-4dfd-9ea6-bc5ca0d46679 - - - - -] The reply 8f723ceb10c3472db9a9f324861df2bb failed to send after 60 seconds due to a missing queue (reply_349bcb075f8c49329435a0f884b33066). Abandoning...: oslo_messaging.exceptions.MessageUndeliverable
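From what I understand of oslo.messaging, these reply_* queues are declared by the RPC caller (in this case nova-compute on the edge node), so "doesn't exist, drop reply" suggests the caller's queue was already gone, for example because its connection to RabbitMQ dropped, by the time the conductor tried to answer. A quick check I can run on a controller for the queue named in the log (just a sketch, not yet verified):

# does the reply queue from the conductor log still exist, and does it have a consumer?
rabbitmqctl list_queues -p / name consumers | grep reply_349bcb075f8c49329435a0f884b33066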
With regards, Swogat Pradhan
On Sun, Feb 26, 2023 at 2:26 PM Swogat Pradhan < swogatpradhan22@gmail.com> wrote:
Hi,

I currently have 3 compute nodes at edge site1 where I am trying to launch VMs. When a VM is in the spawning state, the node goes down (per openstack compute service list); the node comes back up when I restart the nova-compute service, but then the launch of the VM fails.
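For context, this is roughly what I run to see the node go down and to bring it back (a sketch; the systemd unit name for the containerized nova-compute service may differ in your deployment):

# from the central site: the edge host flips to 'down' while the instance is spawning
openstack compute service list --service nova-compute
# on the edge compute node: restart the containerized nova-compute service
systemctl restart tripleo_nova_compute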
nova-compute.log
2023-02-26 08:15:51.808 7 INFO nova.compute.manager [req-bc0f5f2e-53fc-4dae-b1da-82f1f972d617 - - - - -] Running instance usage audit for host dcn01-hci-0.bdxworld.com from 2023-02-26 07:00:00 to 2023-02-26 08:00:00. 0 instances.
2023-02-26 08:49:52.813 7 INFO nova.compute.claims [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45 b240e3e89d99489284cd731e75f2a5db 4160ce999a31485fa643aed0936dfef0 - default default] [instance: 0c62c1ef-9010-417d-a05f-4db77e901600] Claim successful on node dcn01-hci-0.bdxworld.com
2023-02-26 08:49:54.225 7 INFO nova.virt.libvirt.driver [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45 b240e3e89d99489284cd731e75f2a5db 4160ce999a31485fa643aed0936dfef0 - default default] [instance: 0c62c1ef-9010-417d-a05f-4db77e901600] Ignoring supplied device name: /dev/vda. Libvirt can't honour user-supplied dev names
2023-02-26 08:49:54.398 7 INFO nova.virt.block_device [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45 b240e3e89d99489284cd731e75f2a5db 4160ce999a31485fa643aed0936dfef0 - default default] [instance: 0c62c1ef-9010-417d-a05f-4db77e901600] Booting with volume c4bd7885-5973-4860-bbe6-7a2f726baeee at /dev/vda
2023-02-26 08:49:55.216 7 WARNING nova.cache_utils [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45 b240e3e89d99489284cd731e75f2a5db 4160ce999a31485fa643aed0936dfef0 - default default] Cache enabled with backend dogpile.cache.null.
2023-02-26 08:49:55.283 7 INFO oslo.privsep.daemon [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45 b240e3e89d99489284cd731e75f2a5db 4160ce999a31485fa643aed0936dfef0 - default default] Running privsep helper: ['sudo', 'nova-rootwrap', '/etc/nova/rootwrap.conf', 'privsep-helper', '--config-file', '/etc/nova/nova.conf', '--config-file', '/etc/nova/nova-compute.conf', '--privsep_context', 'os_brick.privileged.default', '--privsep_sock_path', '/tmp/tmpin40tah6/privsep.sock']
2023-02-26 08:49:55.791 7 INFO oslo.privsep.daemon [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45 b240e3e89d99489284cd731e75f2a5db 4160ce999a31485fa643aed0936dfef0 - default default] Spawned new privsep daemon via rootwrap
2023-02-26 08:49:55.717 2647 INFO oslo.privsep.daemon [-] privsep daemon starting
2023-02-26 08:49:55.722 2647 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0
2023-02-26 08:49:55.726 2647 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh): CAP_SYS_ADMIN/CAP_SYS_ADMIN/none
2023-02-26 08:49:55.726 2647 INFO oslo.privsep.daemon [-] privsep daemon running as pid 2647
2023-02-26 08:49:55.956 7 WARNING os_brick.initiator.connectors.nvmeof [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45 b240e3e89d99489284cd731e75f2a5db 4160ce999a31485fa643aed0936dfef0 - default default] Process execution error in _get_host_uuid: Unexpected error while running command.
Command: blkid overlay -s UUID -o value
Exit code: 2
Stdout: ''
Stderr: '': oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2023-02-26 08:49:58.247 7 INFO nova.virt.libvirt.driver [req-3a1547ea-326f-4dd0-9127-7f4a4bdf1e45 b240e3e89d99489284cd731e75f2a5db 4160ce999a31485fa643aed0936dfef0 - default default] [instance: 0c62c1ef-9010-417d-a05f-4db77e901600] Creating image
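While an instance is spawning I also plan to watch, from the edge compute node, whether its AMQP connections to the controllers stay established, and from the central site, when the service flips to down (a rough sketch; 5672 is the standard AMQP port shown in the listener output above):

# on the edge compute node: established connections from nova-compute to RabbitMQ
ss -tnp | grep ':5672'
# from the central site: watch the compute service state while the instance spawns
watch -n 5 'openstack compute service list --service nova-compute'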
Is there a way to solve this issue?
With regards,
Swogat Pradhan