beginning with [masakari]
Hi,

I have recently started working on Masakari (Kubernetes in OpenStack) and would appreciate some help with it. I am trying to test masakari-instance-monitor and masakari-host-monitor but am having trouble with them.

I have created a Pacemaker remote cluster (Pacemaker and Corosync on the controller nodes, Pacemaker Remote on the compute nodes).

    pcs status
    Cluster name: lab-cluster
    Cluster Summary:
      * Stack: corosync
      * Current DC: lab-11053-26006-ceph-2 (version 2.1.2-ada5c3b36e2) - partition with quorum
      * Last updated: Thu Dec 14 08:41:41 2023
      * Last change: Thu Dec 14 08:41:22 2023 by root via cibadmin on lab-11053-26006-ceph-1
      * 6 nodes configured
      * 3 resource instances configured

    Node List:
      * Online: [ lab-11053-26006-ceph-1 lab-11053-26006-ceph-2 lab-11053-26006-ceph-3 ]
      * RemoteOnline: [ lab-11053-26006-comp-1.cluster.local lab-11053-26006-comp-2.cluster.local lab-11053-26006-comp-3.cluster.local ]

    Full List of Resources:
      * lab-11053-26006-comp-1.cluster.local (ocf:pacemaker:remote): Started lab-11053-26006-ceph-1
      * lab-11053-26006-comp-2.cluster.local (ocf:pacemaker:remote): Started lab-11053-26006-ceph-2
      * lab-11053-26006-comp-3.cluster.local (ocf:pacemaker:remote): Started lab-11053-26006-ceph-3

    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled

I have also created a segment and added the compute hosts to it. When I tested the Masakari host monitor by manually powering off a compute node, I expected Masakari to move the VMs from the powered-off compute to a running one, but after the compute is powered off the VMs still remain on it. Am I missing some configuration? Do I need to add any configuration in Nova and Keystone as well?
In the masakari-engine logs I see:

    host_failure.evacuate_all_instances = True
    instance_failure.process_all_instances = False
    host_failure.add_reserved_host_to_aggregate = False
    host_failure.ha_enabled_instance_metadata_key = HA_Enabled log_opt_values
    host_failure.ignore_instances_in_error_state = False
    host_failure.service_disable_reason = Masakari detected host failed.

I get the error below when I turn off an instance on the compute. These logs are from masakari-instance-monitor:

    2024-01-03 13:01:15.574 1 DEBUG openstack.resource [-] Attribute [] not found in [<openstack.resource._ComponentManager object at 0x7fa051f48730>]: ''. __getattribute__ /var/lib/openstack/lib/python3.8/site-packages/openstack/resource.py:622
    2024-01-03 13:01:15.595 1 WARNING masakarimonitors.ha.masakari [-] Retry sending a notification. (BadRequestException: 400: Client Error for url: http://masakari-api.openstack.svc.cluster.local:15868/v1/9db00e10af034fc6a4d50813131ae3a6/notifications, Host with name lab-11053-26006-cmp-1 could not be found.): openstack.exceptions.BadRequestException: BadRequestException: 400: Client Error for url: http://masakari-api.openstack.svc.cluster.local:15868/v1/9db00e10af034fc6a4d50813131ae3a6/notifications, Host with name lab-11053-26006-cmp-1 could not be found.

(Do I need to create some Masakari notifications for this to work in OpenStack? If so, how can I do that?)

Below I have attached the logs from masakari-hostmonitor and masakari-engine, along with some of the Masakari components I created in OpenStack. If you have any suggestions, please let me know.

Thank you
Hi,

I guess it is an issue with the hostname. The nova-compute services use an FQDN hostname such as 'lab-11053-26006-cmp-1.cluster.local', but masakari-monitors uses the short hostname 'lab-11053-26006-cmp-1'.

You can configure an FQDN hostname in masakari-monitors as below (the value is an example):

    [DEFAULT]
    hostname = lab-11053-26006-cmp-1.cluster.local

Otherwise it uses the short hostname, because the option defaults to socket.gethostname(). From masakarimonitors/conf/service.py:

    service_opts = [
        cfg.StrOpt('hostname',
                   default=socket.gethostname(),
                   deprecated_name="host",
                   help='''
    Hostname, FQDN or IP address of this host. Must be valid within AMQP key.

    Possible values:

    * String with hostname, FQDN or IP address. Default is hostname of this host.
    '''),

From: Shubham Kumar Yadav <shubham.kumar.yadav369@gmail.com>
Sent: January 4, 2024 13:20
To: openstack-discuss@lists.openstack.org
Subject: beginning with [masakari]
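The hostname mismatch suggested in the reply can be illustrated with a small sketch. It is not Masakari code: the standard-library configparser stands in for oslo.config, the helper names effective_hostname and is_registered are hypothetical, and the segment host list is an assumed example (the thread does not show which names are actually registered in the segment). It shows how an explicit hostname setting overrides the socket.gethostname() default, and why a short name fails an exact match against FQDN-registered hosts.

```python
import configparser
import socket


def effective_hostname(conf_text):
    """Return `hostname` from the [DEFAULT] section, else the system hostname.

    This mimics the default shown in the thread (socket.gethostname());
    it does not use oslo.config itself.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # parser.defaults() holds the keys of the [DEFAULT] section.
    return parser.defaults().get("hostname", socket.gethostname())


def is_registered(name, registered_hosts):
    """Exact, case-sensitive match against the registered host names."""
    return name in set(registered_hosts)


# Assumption: the segment hosts were registered under their FQDNs.
segment_hosts = ["lab-11053-26006-cmp-1.cluster.local"]

conf = "[DEFAULT]\nhostname = lab-11053-26006-cmp-1.cluster.local\n"
name = effective_hostname(conf)

print(name, is_registered(name, segment_hosts))
# The short name from the log line does not match an FQDN entry.
print(is_registered("lab-11053-26006-cmp-1", segment_hosts))
```

Under these assumptions the first check succeeds and the second fails, which is consistent with the "Host with name lab-11053-26006-cmp-1 could not be found" error. Whichever form is chosen, the name the monitor reports and the name registered in the segment need to be the same string.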
participants (2)
- Sam Su (苏正伟)
- Shubham Kumar Yadav