Mohammed,

Disregard my earlier emails. I found that Senlin does do auto-healing: you need to create a health policy and attach it to your cluster.

Here is the policy I created to monitor the nodes' health. If a node dies or crashes for some reason, Senlin automatically recreates that instance to restore the desired capacity.

type: senlin.policy.health
version: 1.1
description: A policy for maintaining node health from a cluster.
properties:
  detection:
    # Number of seconds between two adjacent checking
    interval: 60

    detection_modes:
      # Type for health checking, valid values include:
      # NODE_STATUS_POLLING, NODE_STATUS_POLL_URL, LIFECYCLE_EVENTS
      - type: NODE_STATUS_POLLING

  recovery:
    # Action that can be retried on a failed node, will improve to
    # support multiple actions in the future. Valid values include:
    # REBOOT, REBUILD, RECREATE
    actions:
      - name: RECREATE

** Here is the POC

[root@os-infra-1-utility-container-e139058e ~]# nova list
+--------------------------------------+---------------+--------+------------+-------------+-------------------+
| ID                                   | Name          | Status | Task State | Power State | Networks          |
+--------------------------------------+---------------+--------+------------+-------------+-------------------+
| 38ba7f7c-2f5f-4502-a5d0-6c4841d6d145 | cirros_server | ACTIVE | -          | Running     | net1=192.168.1.26 |
| ba55deb6-9488-4455-a472-a0a957cb388a | cirros_server | ACTIVE | -          | Running     | net1=192.168.1.14 |
+--------------------------------------+---------------+--------+------------+-------------+-------------------+

** Let's delete one of the nodes.

[root@os-infra-1-utility-container-e139058e ~]# nova delete ba55deb6-9488-4455-a472-a0a957cb388a
Request to delete server ba55deb6-9488-4455-a472-a0a957cb388a has been accepted.

** After a few minutes I can see the node in RECOVERING status.

[root@os-infra-1-utility-container-e139058e ~]# openstack cluster node list
+----------+---------------+-------+------------+------------+-------------+--------------+----------------------+----------------------+---------+
| id       | name          | index | status     | cluster_id | physical_id | profile_name | created_at           | updated_at           | tainted |
+----------+---------------+-------+------------+------------+-------------+--------------+----------------------+----------------------+---------+
| d4a8f219 | node-YPsjB6bV |     6 | RECOVERING | 091fbd52   | ba55deb6    | myserver     | 2020-09-02T21:01:47Z | 2020-09-03T04:01:58Z | False   |
| bc50c0b9 | node-hoiHkRcS |     7 | ACTIVE     | 091fbd52   | 38ba7f7c    | myserver     | 2020-09-03T03:40:29Z | 2020-09-03T03:57:58Z | False   |
+----------+---------------+-------+------------+------------+-------------+--------------+----------------------+----------------------+---------+

** Finally it's up and running with a new IP address.

[root@os-infra-1-utility-container-e139058e ~]# nova list
+--------------------------------------+---------------+--------+------------+-------------+-------------------+
| ID                                   | Name          | Status | Task State | Power State | Networks          |
+--------------------------------------+---------------+--------+------------+-------------+-------------------+
| 38ba7f7c-2f5f-4502-a5d0-6c4841d6d145 | cirros_server | ACTIVE | -          | Running     | net1=192.168.1.26 |
| 73a658cd-c40a-45d8-9b57-cc9e6c2b4dc1 | cirros_server | ACTIVE | -          | Running     | net1=192.168.1.17 |
+--------------------------------------+---------------+--------+------------+-------------+-------------------+

On Tue, Sep 1, 2020 at 8:51 AM Mohammed Naser <mnaser@vexxhost.com> wrote:
Hi Satish,
I'm interested in this. Did you end up finding a solution?
Thanks, Mohammed
On Thu, Aug 27, 2020 at 1:54 PM Satish Patel <satish.txt@gmail.com> wrote:
Folks,
I have created a very simple cluster using the following command:
openstack cluster create --profile myserver --desired-capacity 2 --min-size 2 --max-size 3 --strict my-asg
It spun up 2 VMs immediately because the desired capacity is 2, so I am assuming that if any node in the cluster dies, Senlin should spin up a new node to bring the count back to 2, right?
So I killed one of the nodes with "nova delete <instance-foo-1>", but Senlin didn't automatically create a node to bring the capacity back to 2. (In AWS, when you kill a node in an ASG, it creates a new node. Is Senlin different from AWS in this respect?)
-- Mohammed Naser VEXXHOST, Inc.
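
For anyone following the thread: the two steps mentioned at the top (create a health policy, then attach it to the cluster) might look roughly like the sketch below. The spec file name health_policy.yaml and the policy name my-health-policy are made up for illustration; my-asg is the cluster from the original question. The run helper is also just a convenience of this sketch: it prints the commands instead of running them when the openstack client is not installed.

```shell
#!/bin/sh
# Sketch only. SPEC_FILE and POLICY_NAME are hypothetical names;
# CLUSTER_NAME is the cluster created in the original question.
set -eu

SPEC_FILE="health_policy.yaml"
POLICY_NAME="my-health-policy"
CLUSTER_NAME="my-asg"

# Write the health policy spec quoted at the top of this thread.
cat > "$SPEC_FILE" <<'EOF'
type: senlin.policy.health
version: 1.1
description: A policy for maintaining node health from a cluster.
properties:
  detection:
    interval: 60
    detection_modes:
      - type: NODE_STATUS_POLLING
  recovery:
    actions:
      - name: RECREATE
EOF

# Run the command if the openstack client is available, otherwise print it.
run() {
    if command -v openstack >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Create the policy from the spec file, attach it to the cluster,
# then list the cluster's policy bindings to confirm it is attached.
run openstack cluster policy create --spec-file "$SPEC_FILE" "$POLICY_NAME"
run openstack cluster policy attach --policy "$POLICY_NAME" "$CLUSTER_NAME"
run openstack cluster policy binding list "$CLUSTER_NAME"
```

Once the policy is attached and enabled, deleting a node as in the POC above should put it into RECOVERING after roughly one detection interval, assuming the health manager is running.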