[TripleO] what option to enable restart compute node if it becomes unavailable/not pingable
Hi all,
Since TripleO is OpenStack On OpenStack, and the overcloud is able to recover instances if they become unavailable, how can I enable or force the undercloud to launch a reset over Redfish/iDRAC?
Any suggestions?
In case it is relevant, I use Ussuri, but I think the release should not matter for this question, since I would like to have the same option enabled in the latest release as well.
--
Ruslanas Gžibovskis
+370 6030 7030
TripleO supports configuring instance HA, which uses pacemaker and pacemaker-remote to detect unreachable compute nodes, reboot them, and evacuate instances from the unreachable compute nodes.
https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features...
I have never actually tried this, but the fence_evacuate agent supports disabling evacuation, so I guess you can use it to only reboot the nodes.
https://github.com/ClusterLabs/fence-agents/blob/main/agents/evacuate/fence_...
By the way, I'm not really aware of a feature within OpenStack that recovers unavailable instances. In the past Masakari had an instance monitor, which checked the status of instances via the libvirt interface and rebooted instances that did not respond, but afaik that is no longer supported.
On Wed, Feb 8, 2023 at 2:42 AM Ruslanas Gžibovskis <ruslanas@lpic.lt> wrote:
> Hi all,
> Since TripleO is Openstack Over Openstack, and overcloud is able to recover instances if they become unavailable, how to enable/force undercloud to launch reset over redfish/idrac?
> Any suggestions?
> If that is relevant, I use ussuri, but I think this should not be relevant to the question for the option since I would like to have same enabled in latest release also.
> -- Ruslanas Gžibovskis +370 6030 7030
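For the instance HA approach mentioned in the reply, pacemaker needs a fencing device defined per node before it can power-cycle an unreachable compute node. The following is only a rough sketch of generating such an environment file. It assumes the `EnableFencing` and `FencingConfig` Heat parameters and the `fence_ipmilan` agent with its usual `ipaddr`/`login`/`passwd`/`lanplus`/`pcmk_host_list` options; none of these come from the thread, so please check them against the documentation for your release. The helper name `build_fencing_env`, the node field names, and all addresses and credentials are hypothetical placeholders.

```python
import json


def build_fencing_env(nodes):
    """Build a fencing environment dict from a list of node dicts.

    Assumption: the overcloud templates accept EnableFencing and
    FencingConfig under parameter_defaults. Verify for your release.
    """
    devices = []
    for node in nodes:
        devices.append({
            # Assumed agent; a Redfish-capable agent may fit iDRAC better.
            "agent": "fence_ipmilan",
            "host_mac": node["mac"],
            "params": {
                "ipaddr": node["bmc_address"],
                "login": node["bmc_user"],
                "passwd": node["bmc_password"],
                "lanplus": True,
                "pcmk_host_list": node["hostname"],
            },
        })
    return {
        "parameter_defaults": {
            "EnableFencing": True,
            "FencingConfig": {"devices": devices},
        }
    }


if __name__ == "__main__":
    # Hypothetical node; replace with the real BMC details.
    example_nodes = [{
        "hostname": "overcloud-novacompute-0",
        "mac": "aa:bb:cc:dd:ee:01",
        "bmc_address": "192.0.2.11",
        "bmc_user": "root",
        "bmc_password": "CHANGE_ME",
    }]
    # JSON output; YAML parsers generally accept JSON documents as well.
    print(json.dumps(build_fencing_env(example_nodes), indent=2))
```

The output would be saved to a file and passed to the overcloud deployment with `-e`, alongside the instance HA environment described in the deploy guide linked above.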
participants (2)
- Ruslanas Gžibovskis
- Takashi Kajinami