[TripleO] what option to enable restart compute node if it becomes unavailable/not pingable

Takashi Kajinami tkajinam at redhat.com
Wed Feb 8 09:11:39 UTC 2023


TripleO supports configuring instance HA, which uses pacemaker +
pacemaker-remote to detect unreachable compute nodes, reboot them, and
evacuate instances from those nodes.

https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/instance_ha.html

I haven't actually tried this, but the fence_evacuate agent supports
disabling evacuation, so I guess you can use that to only reboot nodes.

https://github.com/ClusterLabs/fence-agents/blob/main/agents/evacuate/fence_evacuate.py#L345-L353
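
To illustrate the idea only, here is a rough sketch of the "reboot only" vs.
"reboot and evacuate" decision. This is not the actual pacemaker or
fence_evacuate code; ComputeNode, plan_recovery and the action strings are
made-up names, and the real agents work quite differently in detail.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ComputeNode:
    # Hypothetical model of a compute node, for illustration only.
    name: str
    reachable: bool        # outcome of whatever health check is in use
    instances: List[str]   # IDs of instances hosted on this node


def plan_recovery(node: ComputeNode, evacuate: bool = True) -> List[str]:
    """Return the ordered recovery actions for a single compute node.

    A reachable node needs nothing. An unreachable node is always
    fenced (rebooted); its instances are evacuated only when
    `evacuate` is left enabled.
    """
    if node.reachable:
        return []
    actions = [f"reboot {node.name}"]
    if evacuate:
        actions += [f"evacuate {inst} from {node.name}" for inst in node.instances]
    return actions
```

Calling plan_recovery(node, evacuate=False) for an unreachable node yields
only the reboot step, which is the behaviour asked about in this thread.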

By the way, I'm not aware of any feature within OpenStack to recover
unavailable instances.
In the past Masakari had an instance monitor which checked the status of
instances via the libvirt interface
and rebooted unresponsive instances, but afaik that is no longer
supported.



On Wed, Feb 8, 2023 at 2:42 AM Ruslanas Gžibovskis <ruslanas at lpic.lt> wrote:

> Hi all,
>
> Since TripleO is OpenStack On OpenStack, and the overcloud is able to
> recover instances if they become unavailable, how can I enable/force
> the undercloud to launch a reset over Redfish/iDRAC?
>
> Any suggestions?
>
> If that is relevant, I use ussuri, but I think this should not be relevant
> to the question for the option since I would like to have same enabled in
> latest release also.
>
> --
> Ruslanas Gžibovskis
> +370 6030 7030
>