<div><div dir="auto">Satish,</div></div><div dir="auto"><br></div><div dir="auto">I’m glad you were able to find the answer. Just to clarify, your original email mentioned auto scaling in its title. Auto scaling means creating or deleting nodes as load goes up or down. Senlin supports scaling clusters, but it requires another service to make the scaling decision and trigger the scaling (i.e., the "auto" in auto scaling).</div><div dir="auto"><br></div><div dir="auto">But, as you correctly pointed out, auto healing is fully supported by Senlin on its own through its health policy. </div><div dir="auto"><br></div><div dir="auto">Duc Truong</div><div dir="auto"><br></div><div><br><div class="gmail_quote"><div dir="ltr">On Wed, Sep 2, 2020 at 9:31 PM Satish Patel <<a href="mailto:satish.txt@gmail.com">satish.txt@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Mohammed,<br>

<br>

Disregard my earlier emails. I found that Senlin does auto-healing: you<br>

need to create a health policy and attach it to your cluster.<br>

<br>

Here is the policy I created to monitor the nodes' health. If a node dies<br>

or crashes for some reason, Senlin automatically recreates that instance<br>

to replace it.<br>

<br>

type: senlin.policy.health<br>

version: 1.1<br>

description: A policy for maintaining node health from a cluster.<br>

properties:<br>

  detection:<br>

    # Number of seconds between two consecutive checks<br>

    interval: 60<br>

<br>

    detection_modes:<br>

      # Type for health checking, valid values include:<br>

      # NODE_STATUS_POLLING, NODE_STATUS_POLL_URL, LIFECYCLE_EVENTS<br>

      - type: NODE_STATUS_POLLING<br>

<br>

  recovery:<br>

    # Action that can be retried on a failed node, will improve to<br>

    # support multiple actions in the future. Valid values include:<br>

    # REBOOT, REBUILD, RECREATE<br>

    actions:<br>

      - name: RECREATE<br>

<br>
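For reference, a minimal sketch of how a spec like the one above might be created and attached. The file name health_policy.yaml and the policy name my-health-policy are hypothetical placeholders; my-asg is the cluster name from the original question. With DRY_RUN=1 (the default in this sketch) the commands are only printed, not executed.<br>

<pre>
```shell
#!/bin/sh
# Sketch only: SPEC_FILE and POLICY_NAME are hypothetical placeholders.
SPEC_FILE="health_policy.yaml"
POLICY_NAME="my-health-policy"
CLUSTER_NAME="my-asg"

# Print the command when DRY_RUN is 1 (the default here); run it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

attach_health_policy() {
    # Create the policy object from the YAML spec.
    run openstack cluster policy create --spec-file "$SPEC_FILE" "$POLICY_NAME"
    # Attach the policy to the cluster so health checking starts.
    run openstack cluster policy attach --policy "$POLICY_NAME" "$CLUSTER_NAME"
    # List the policies bound to the cluster to confirm the attachment.
    run openstack cluster policy binding list "$CLUSTER_NAME"
}

attach_health_policy
```
</pre>

Setting DRY_RUN=0 would run the commands for real, assuming the Senlin plugin for the openstack client is installed and credentials are loaded.<br>
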

<br>

** Here is the POC<br>

<br>

[root@os-infra-1-utility-container-e139058e ~]# nova list<br>

+--------------------------------------+---------------+--------+------------+-------------+-------------------+<br>

| ID                                   | Name          | Status | Task State | Power State | Networks          |<br>

+--------------------------------------+---------------+--------+------------+-------------+-------------------+<br>

| 38ba7f7c-2f5f-4502-a5d0-6c4841d6d145 | cirros_server | ACTIVE | -          | Running     | net1=192.168.1.26 |<br>

| ba55deb6-9488-4455-a472-a0a957cb388a | cirros_server | ACTIVE | -          | Running     | net1=192.168.1.14 |<br>

+--------------------------------------+---------------+--------+------------+-------------+-------------------+<br>

<br>

** Let's delete one of the nodes.<br>

<br>

[root@os-infra-1-utility-container-e139058e ~]# nova delete ba55deb6-9488-4455-a472-a0a957cb388a<br>

Request to delete server ba55deb6-9488-4455-a472-a0a957cb388a has been accepted.<br>

<br>

** After a few minutes I can see a node in RECOVERING status.<br>

<br>

[root@os-infra-1-utility-container-e139058e ~]# openstack cluster node list<br>

+----------+---------------+-------+------------+------------+-------------+--------------+----------------------+----------------------+---------+<br>

| id       | name          | index | status     | cluster_id | physical_id | profile_name | created_at           | updated_at           | tainted |<br>

+----------+---------------+-------+------------+------------+-------------+--------------+----------------------+----------------------+---------+<br>

| d4a8f219 | node-YPsjB6bV |     6 | RECOVERING | 091fbd52   | ba55deb6    | myserver     | 2020-09-02T21:01:47Z | 2020-09-03T04:01:58Z | False   |<br>

| bc50c0b9 | node-hoiHkRcS |     7 | ACTIVE     | 091fbd52   | 38ba7f7c    | myserver     | 2020-09-03T03:40:29Z | 2020-09-03T03:57:58Z | False   |<br>

+----------+---------------+-------+------------+------------+-------------+--------------+----------------------+----------------------+---------+<br>

<br>

** Finally, it's up and running with a new IP address.<br>

<br>

[root@os-infra-1-utility-container-e139058e ~]# nova list<br>

+--------------------------------------+---------------+--------+------------+-------------+-------------------+<br>

| ID                                   | Name          | Status | Task State | Power State | Networks          |<br>

+--------------------------------------+---------------+--------+------------+-------------+-------------------+<br>

| 38ba7f7c-2f5f-4502-a5d0-6c4841d6d145 | cirros_server | ACTIVE | -          | Running     | net1=192.168.1.26 |<br>

| 73a658cd-c40a-45d8-9b57-cc9e6c2b4dc1 | cirros_server | ACTIVE | -          | Running     | net1=192.168.1.17 |<br>

+--------------------------------------+---------------+--------+------------+-------------+-------------------+<br>

<br>

On Tue, Sep 1, 2020 at 8:51 AM Mohammed Naser <<a href="mailto:mnaser@vexxhost.com" target="_blank">mnaser@vexxhost.com</a>> wrote:<br>

><br>

> Hi Satish,<br>

><br>

> I'm interested in this. Did you end up finding a solution?<br>

><br>

> Thanks,<br>

> Mohammed<br>

><br>

> On Thu, Aug 27, 2020 at 1:54 PM Satish Patel <<a href="mailto:satish.txt@gmail.com" target="_blank">satish.txt@gmail.com</a>> wrote:<br>

> ><br>

> > Folks,<br>

> ><br>

> > I have created a very simple cluster using the following command:<br>

> ><br>

> > openstack cluster create --profile myserver --desired-capacity 2 --min-size 2 --max-size 3 --strict my-asg<br>

> ><br>

> > It spun up 2 VMs immediately because the desired capacity is 2, so I<br>

> > am assuming that if any node in the cluster dies, it should spin up a node<br>

> > to bring the count back to 2, right?<br>

> ><br>

> > So I killed one of the nodes with "nova delete &lt;instance-foo-1&gt;", but<br>

> > Senlin didn't create a node automatically to restore the desired capacity<br>

> > of 2. (In AWS, when you kill a node in an ASG, it creates a new node, so is<br>

> > Senlin different from AWS in this respect?)<br>

> ><br>

><br>

><br>

> --<br>

> Mohammed Naser<br>

> VEXXHOST, Inc.<br>

<br>

</blockquote></div></div>