beginning with [masakari]
Hi,

I have recently started working on Masakari (the instance high-availability service for OpenStack) and would appreciate some help. I am trying to test masakari-instancemonitor and masakari-hostmonitor but am having trouble with them.

I have created a Pacemaker remote cluster, with pacemaker and corosync on the controller nodes and pacemaker remote on the compute nodes. Output of pcs status:

    Cluster name: lab-cluster
    Cluster Summary:
      * Stack: corosync
      * Current DC: lab-11053-26006-ceph-2 (version 2.1.2-ada5c3b36e2) - partition with quorum
      * Last updated: Thu Dec 14 08:41:41 2023
      * Last change: Thu Dec 14 08:41:22 2023 by root via cibadmin on lab-11053-26006-ceph-1
      * 6 nodes configured
      * 3 resource instances configured

    Node List:
      * Online: [ lab-11053-26006-ceph-1 lab-11053-26006-ceph-2 lab-11053-26006-ceph-3 ]
      * RemoteOnline: [ lab-11053-26006-comp-1.cluster.local lab-11053-26006-comp-2.cluster.local lab-11053-26006-comp-3.cluster.local ]

    Full List of Resources:
      * lab-11053-26006-comp-1.cluster.local (ocf:pacemaker:remote): Started lab-11053-26006-ceph-1
      * lab-11053-26006-comp-2.cluster.local (ocf:pacemaker:remote): Started lab-11053-26006-ceph-2
      * lab-11053-26006-comp-3.cluster.local (ocf:pacemaker:remote): Started lab-11053-26006-ceph-3

    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled

I have also created a segment and added the compute hosts to it.

When I tested the host monitor by manually powering off a compute node, I expected Masakari to move the VMs from the powered-off compute to a running one. Instead, the VMs remain on the powered-off compute. Am I missing some configuration? Do I need to add any configuration in Nova and Keystone as well?

In the masakari-engine logs I see the following option values (logged by log_opt_values):

    host_failure.evacuate_all_instances = True
    instance_failure.process_all_instances = False
    host_failure.add_reserved_host_to_aggregate = False
    host_failure.ha_enabled_instance_metadata_key = HA_Enabled
    host_failure.ignore_instances_in_error_state = False
    host_failure.service_disable_reason = Masakari detected host failed.

If you have any suggestions, please let me know.

Thank you
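As a rough aid for checking the cluster state shown in the pcs status output, here is a minimal Python sketch that extracts the node lists and reports which expected computes are not listed as RemoteOnline. The function names, the SAMPLE_STATUS excerpt, and the regular expression are illustrative assumptions and not part of Masakari or pcs; the parsing relies only on the "* State: [ names ]" layout visible in the output above.

```python
import re

# Excerpt of the `pcs status` output quoted in this thread.
SAMPLE_STATUS = """\
Node List:
  * Online: [ lab-11053-26006-ceph-1 lab-11053-26006-ceph-2 lab-11053-26006-ceph-3 ]
  * RemoteOnline: [ lab-11053-26006-comp-1.cluster.local lab-11053-26006-comp-2.cluster.local lab-11053-26006-comp-3.cluster.local ]
"""


def parse_pcs_nodes(status_text):
    """Map each node state (e.g. 'Online', 'RemoteOnline') to its node names."""
    nodes = {}
    # Matches entries of the form "* State: [ name1 name2 ... ]".
    for state, names in re.findall(r"\*\s*(\w+):\s*\[([^\]]*)\]", status_text):
        nodes[state] = names.split()
    return nodes


def missing_remotes(status_text, expected_computes):
    """Return the expected computes that are not listed under RemoteOnline."""
    online = set(parse_pcs_nodes(status_text).get("RemoteOnline", []))
    return [name for name in expected_computes if name not in online]


if __name__ == "__main__":
    expected = [
        "lab-11053-26006-comp-1.cluster.local",
        "lab-11053-26006-comp-2.cluster.local",
        "lab-11053-26006-comp-3.cluster.local",
    ]
    print(parse_pcs_nodes(SAMPLE_STATUS))
    print("Not RemoteOnline:", missing_remotes(SAMPLE_STATUS, expected))
```

Running the same check on pcs status output captured after powering off a compute would show whether Pacemaker itself noticed the node going away, which is a precondition for the host monitor to report anything.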
It would help to see the output from masakari-hostmonitor and masakari-engine. The first thing to establish is whether Masakari identified the compute failure at all and, if it did, whether it tried to schedule an evacuation.

Also, do you have the compute failure recorded in the action list in the Masakari API?

On Wed, Dec 20, 2023, 17:11 Shubham Kumar Yadav <shubham.kumar.yadav369@gmail.com> wrote:
> When I tested the host monitor by manually powering off a compute node, I expected Masakari to move the VMs from the powered-off compute to a running one. Instead, the VMs remain on the powered-off compute. Am I missing some configuration?
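To make the last question in the reply concrete, here is a minimal Python sketch that filters a list of notification records down to host failures. It assumes that the "action list" refers to Masakari notifications and that each record carries type, hostname, status, and generated_time fields. The function name and the SAMPLE_NOTIFICATIONS data are hypothetical and were not taken from this deployment.

```python
# Hypothetical records for illustration only; a real deployment would return its own data.
SAMPLE_NOTIFICATIONS = [
    {
        "type": "COMPUTE_HOST",
        "hostname": "lab-11053-26006-comp-1.cluster.local",
        "status": "new",
        "generated_time": "2023-12-14T08:45:00",
    },
    {
        "type": "VM",
        "hostname": "lab-11053-26006-comp-2.cluster.local",
        "status": "finished",
        "generated_time": "2023-12-14T08:30:00",
    },
]


def host_failure_notifications(notifications):
    """Return (hostname, status, generated_time) for each host-failure record."""
    rows = []
    for record in notifications:
        # Skip instance and process notifications; keep only host failures.
        if record.get("type") != "COMPUTE_HOST":
            continue
        rows.append(
            (record.get("hostname"), record.get("status"), record.get("generated_time"))
        )
    return rows


if __name__ == "__main__":
    # Assumption: with openstacksdk the records could come from something like
    #   [n.to_dict() for n in conn.instance_ha.notifications()]
    # if your SDK version exposes the instance_ha proxy.
    for row in host_failure_notifications(SAMPLE_NOTIFICATIONS):
        print(row)
```

An empty result after powering off a compute would suggest that the host monitor never reported the failure, so the engine had nothing to act on. A record that exists but did not complete would point at the engine or at Nova instead.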
participants (2)
- Dmitriy Rabotyagov
- Shubham Kumar Yadav