That’s awesome. Thank you.
Mohammed,
Disregard my earlier emails. I found that Senlin does auto-healing: you
need to create a health policy and attach it to your cluster.
Here is the policy I created to monitor node health. If a node dies or
crashes for some reason, Senlin automatically recreates that instance to
restore the desired capacity.
type: senlin.policy.health
version: 1.1
description: A policy for maintaining node health from a cluster.
properties:
  detection:
    # Number of seconds between two adjacent checking
    interval: 60
    detection_modes:
      # Type for health checking, valid values include:
      # NODE_STATUS_POLLING, NODE_STATUS_POLL_URL, LIFECYCLE_EVENTS
      - type: NODE_STATUS_POLLING
  recovery:
    # Action that can be retried on a failed node, will improve to
    # support multiple actions in the future. Valid values include:
    # REBOOT, REBUILD, RECREATE
    actions:
      - name: RECREATE
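For anyone following along, here is a rough sketch of how a spec like the one above could be registered and attached. The file name health_policy.yaml and the policy name my-health-policy are placeholders I made up; my-asg is the cluster name from the earlier message. Treat it as an outline rather than the exact commands that were run.

```shell
#!/bin/sh
# Sketch: create a Senlin policy from a spec file and attach it to a cluster.
# Assumed names (not from the original mail): health_policy.yaml,
# my-health-policy. The cluster name my-asg comes from the quoted message.

attach_health_policy() {
    spec_file=$1     # path to the YAML spec shown above
    policy_name=$2   # hypothetical name for the policy object
    cluster_name=$3  # cluster to protect, e.g. my-asg

    # Create the policy object from the spec file.
    openstack cluster policy create --spec-file "$spec_file" "$policy_name" || return 1

    # Bind the policy to the cluster so health checking starts.
    openstack cluster policy attach --policy "$policy_name" "$cluster_name" || return 1

    # List the cluster's policy bindings to confirm the attachment.
    openstack cluster policy binding list "$cluster_name"
}

# Usage:
#   attach_health_policy health_policy.yaml my-health-policy my-asg
```

Wrapping the calls in a function keeps the three steps together and stops at the first failure, so a bad spec file does not lead to an attach attempt.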
** Here is the POC
[root@os-infra-1-utility-container-e139058e ~]# nova list
+--------------------------------------+---------------+--------+------------+-------------+-------------------+
| ID                                   | Name          | Status | Task State | Power State | Networks          |
+--------------------------------------+---------------+--------+------------+-------------+-------------------+
| 38ba7f7c-2f5f-4502-a5d0-6c4841d6d145 | cirros_server | ACTIVE | -          | Running     | net1=192.168.1.26 |
| ba55deb6-9488-4455-a472-a0a957cb388a | cirros_server | ACTIVE | -          | Running     | net1=192.168.1.14 |
+--------------------------------------+---------------+--------+------------+-------------+-------------------+
** Let's delete one of the nodes.
[root@os-infra-1-utility-container-e139058e ~]# nova delete ba55deb6-9488-4455-a472-a0a957cb388a
Request to delete server ba55deb6-9488-4455-a472-a0a957cb388a has been accepted.
** After a few minutes I can see the node in RECOVERING status.
[root@os-infra-1-utility-container-e139058e ~]# openstack cluster node list
+----------+---------------+-------+------------+------------+-------------+--------------+----------------------+----------------------+---------+
| id       | name          | index | status     | cluster_id | physical_id | profile_name | created_at           | updated_at           | tainted |
+----------+---------------+-------+------------+------------+-------------+--------------+----------------------+----------------------+---------+
| d4a8f219 | node-YPsjB6bV |     6 | RECOVERING | 091fbd52   | ba55deb6    | myserver     | 2020-09-02T21:01:47Z | 2020-09-03T04:01:58Z | False   |
| bc50c0b9 | node-hoiHkRcS |     7 | ACTIVE     | 091fbd52   | 38ba7f7c    | myserver     | 2020-09-03T03:40:29Z | 2020-09-03T03:57:58Z | False   |
+----------+---------------+-------+------------+------------+-------------+--------------+----------------------+----------------------+---------+
** Finally it's up and running with a new IP address.
[root@os-infra-1-utility-container-e139058e ~]# nova list
+--------------------------------------+---------------+--------+------------+-------------+-------------------+
| ID                                   | Name          | Status | Task State | Power State | Networks          |
+--------------------------------------+---------------+--------+------------+-------------+-------------------+
| 38ba7f7c-2f5f-4502-a5d0-6c4841d6d145 | cirros_server | ACTIVE | -          | Running     | net1=192.168.1.26 |
| 73a658cd-c40a-45d8-9b57-cc9e6c2b4dc1 | cirros_server | ACTIVE | -          | Running     | net1=192.168.1.17 |
+--------------------------------------+---------------+--------+------------+-------------+-------------------+
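The manual check in this POC (run the node list repeatedly until the RECOVERING node goes back to ACTIVE) could be scripted roughly as below. This is only a sketch: the function name, timeout, and polling interval are my own choices, and it assumes the node list accepts the --cluster filter and the usual -f value -c column output options.

```shell
#!/bin/sh
# Sketch: poll a Senlin cluster until every node reports ACTIVE.
# wait_for_cluster_active, the 600 s timeout and the 15 s interval are
# illustrative assumptions, not something from the original mail.

wait_for_cluster_active() {
    cluster_name=$1
    timeout=${2:-600}   # give up after this many seconds
    interval=${3:-15}   # seconds between polls
    elapsed=0

    while [ "$elapsed" -lt "$timeout" ]; do
        # One status per line, e.g. ACTIVE or RECOVERING.
        statuses=$(openstack cluster node list --cluster "$cluster_name" \
            -f value -c status) || return 2

        # Succeed only if at least one node is listed and no line differs
        # from ACTIVE.
        if [ -n "$statuses" ] && ! printf '%s\n' "$statuses" | grep -qv '^ACTIVE$'; then
            return 0
        fi

        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 1   # timed out with at least one node not ACTIVE
}

# Usage:
#   wait_for_cluster_active my-asg && echo "cluster healthy"
```

With a detection interval of 60 seconds in the policy, a polling interval much shorter than that mostly repeats the same answer, so something in the 15 to 60 second range seems reasonable.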
On Tue, Sep 1, 2020 at 8:51 AM Mohammed Naser <mnaser@vexxhost.com> wrote:
>
> Hi Satish,
>
> I'm interested in this. Did you end up finding a solution?
>
> Thanks,
> Mohammed
>
> On Thu, Aug 27, 2020 at 1:54 PM Satish Patel <satish.txt@gmail.com> wrote:
> >
> > Folks,
> >
> > I have created a very simple cluster using the following command:
> >
> > openstack cluster create --profile myserver --desired-capacity 2 --min-size 2 --max-size 3 --strict my-asg
> >
> > It spun up 2 VMs immediately. Because the desired capacity is 2, I
> > am assuming that if any node in the cluster dies, it should spin up a
> > new node to bring the count back to 2, right?
> >
> > So I killed one of the nodes with "nova delete <instance-foo-1>", but
> > Senlin didn't automatically create a node to restore the desired
> > capacity of 2. (In AWS, when you kill a node in an ASG, it creates a
> > new node. Is Senlin different from AWS in this respect?)
> >
>
>
> --
> Mohammed Naser
> VEXXHOST, Inc.