[tripleo] Scale up/down Ansible tasks


Emilien Macchi

10 Apr 2019

9:57 p.m.

Hi folks,

Today I spent a bit of time on: https://blueprints.launchpad.net/tripleo/+spec/scale-down-tasks

Which is basically adding the capability of running Ansible tasks before a node is removed during a scale down or after a scale-up. I'm focusing on the scale-down right now, as I know it's something people have been waiting for (e.g. RHSM unsubscribe, Ceph OSD tear down, Nova Compute, etc).

I need inputs from folks now, on what kind of tasks would be needed, I will test them and make sure the interface we provide is enough. John, Olie, and Martin in copy have maybe some ideas, please let me know some examples of Ansible tasks that you folks want to run before a node is deleted in Ironic.

Prototype: https://review.openstack.org/#/q/topic:bp/scale-down-tasks+(status:open+OR+s...)

Thanks a lot,
--
Emilien Macchi


John Fulton

10 Apr

10:29 p.m.

On Wed, Apr 10, 2019 at 5:58 PM Emilien Macchi <emilien@redhat.com> wrote:

...

Hi folks,

Today I spent a bit of time on: https://blueprints.launchpad.net/tripleo/+spec/scale-down-tasks

Which is basically adding the capability of running Ansible tasks before a node is removed during a scale down or after a scale-up. I'm focusing on the scale-down right now, as I know it's something people have been waiting for (e.g. RHSM unsubscribe, Ceph OSD tear down, Nova Compute, etc).

I need inputs from folks now, on what kind of tasks would be needed, I will test them and make sure the interface we provide is enough. John, Olie, and Martin in copy have maybe some ideas, please let me know some examples of Ansible tasks that you folks want to run before a node is deleted in Ironic.

In the Ceph case ceph-ansible has playbooks to correctly handle the scale down of different ceph services, e.g. delete a monitor [1] or delete an OSD [2]. The process would be to generate a ceph-ansible inventory, e.g. the same way we do when we scale up [3], and then execute one of those playbooks with that inventory. Examples of running these playbooks are in Seb's blog [4].

This would be a great feature to have because if you don't tell the Ceph cluster that a node is not part of it anymore, then it will want to find it and not be happy if you just delete the node. It's better to tell the ceph cluster not to worry about a particular node anymore by running one of these playbooks before the node is deleted.

Thanks,
John

[1] https://github.com/ceph/ceph-ansible/blob/master/infrastructure-playbooks/sh...
[2] https://github.com/ceph/ceph-ansible/blob/master/infrastructure-playbooks/sh...
[3] https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/c...
[4] https://www.sebastien-han.fr/blog/2016/08/16/Ceph-ansible-can-now-shrink-you...
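As a rough illustration of the "generate an inventory, then run one of those playbooks" step, here is a minimal bash sketch. The function name `run_ceph_playbook` and the `ANSIBLE_PLAYBOOK` override are made up for this example; the inventory path, playbook path, and extra variables are placeholders, and the variables each ceph-ansible playbook actually expects should be taken from the playbook itself.

```shell
#!/usr/bin/env bash
# Sketch only, not the TripleO implementation.
# Assumes an inventory has already been generated and that the playbook
# path points at one of the ceph-ansible infrastructure playbooks.

# run_ceph_playbook INVENTORY PLAYBOOK [KEY=VALUE ...]
# Runs PLAYBOOK against INVENTORY, passing each KEY=VALUE as an extra var.
# ANSIBLE_PLAYBOOK is a hypothetical override so the command can be swapped
# out (for example in a dry run); it defaults to the real ansible-playbook.
run_ceph_playbook() {
    local inventory="$1" playbook="$2"
    shift 2

    local args=(-i "$inventory" "$playbook")
    local kv
    for kv in "$@"; do
        args+=(-e "$kv")
    done

    "${ANSIBLE_PLAYBOOK:-ansible-playbook}" "${args[@]}"
}
```

A call would look like `run_ceph_playbook /path/to/inventory /path/to/playbook.yml some_var=some_value`, with all three arguments replaced by real values for the service being removed.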


Emilien Macchi

11 Apr

12:22 p.m.

On Wed, Apr 10, 2019 at 6:30 PM John Fulton <johfulto@redhat.com> wrote:

...

In the Ceph case ceph-ansible has playbooks to correctly handle the scale down of different ceph services, e.g. delete a monitor [1] or delete an OSD [2]. The process would be to generate a ceph-ansible inventory, e.g. the same way we do when we scale up [3], and then execute one of those playbooks with that inventory. Examples of running these playbook are in Seb's blog [4].

This would be a great feature to have because if you don't tell the Ceph cluster that a node is not part of it anymore, then it will want to find it and not be happy if you just delete the node. It's better to tell the ceph cluster not to worry about a particular node anymore by running one of these playbooks before the node is deleted.

So if I'm not mistaken, these tasks need to run within the mistral_executor on the Undercloud, against a generated ceph-ansible inventory, which means no tasks are run on the hosts themselves in local mode.

Let me know if I'm wrong; I'll make sure this works for the scale tasks.
--
Emilien Macchi

Martin Schuppert

7:22 a.m.

On Wed, Apr 10, 2019 at 11:58 PM Emilien Macchi <emilien@redhat.com> wrote:

...

Hi folks,

Today I spent a bit of time on: https://blueprints.launchpad.net/tripleo/+spec/scale-down-tasks

Which is basically adding the capability of running Ansible tasks before a node is removed during a scale down or after a scale-up. I'm focusing on the scale-down right now, as I know it's something people have been waiting for (e.g. RHSM unsubscribe, Ceph OSD tear down, Nova Compute, etc).

I need inputs from folks now, on what kind of tasks would be needed, I will test them and make sure the interface we provide is enough. John, Olie, and Martin in copy have maybe some ideas, please let me know some examples of Ansible tasks that you folks want to run before a node is deleted in Ironic.

For nova/neutron it would be to disable the service/agent:

(overcloud) $ openstack compute service list
(overcloud) $ openstack compute service set [hostname] nova-compute --disable

(overcloud) $ openstack network agent list
(overcloud) $ openstack network agent set --disable [openvswitch-agent-id]

After the service is stopped or the host is deleted:

(overcloud) $ openstack compute service delete [service-id]
(overcloud) $ openstack network agent delete [openvswitch-agent-id]

Regards,
Martin

[1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/14/...
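The manual steps above could be wrapped roughly as follows. This is only a sketch: the function names and the `DRY_RUN` switch are invented for illustration, the IDs still have to be looked up with the two `list` commands, and overcloud credentials are assumed to be loaded in the environment already.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the manual commands from this thread in functions.
# Assumes overcloud credentials are already sourced.

# run: hypothetical helper. Prints the command when DRY_RUN is set,
# otherwise executes it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# disable_node HOSTNAME AGENT_ID
# Disables nova-compute on HOSTNAME and the given Neutron agent, so nothing
# new is scheduled there before the node is removed.
disable_node() {
    local host="$1" agent_id="$2"
    run openstack compute service set "$host" nova-compute --disable
    run openstack network agent set --disable "$agent_id"
}

# cleanup_node SERVICE_ID AGENT_ID
# Deletes the service and agent records once the service is stopped
# or the host is gone.
cleanup_node() {
    local service_id="$1" agent_id="$2"
    run openstack compute service delete "$service_id"
    run openstack network agent delete "$agent_id"
}
```

Setting `DRY_RUN=1` before calling `disable_node` or `cleanup_node` only prints the commands, which is a cheap way to check the sequence before running it against a real overcloud.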


Oliver Walsh

9:17 a.m.

On Thu, 11 Apr 2019 at 08:23, Martin Schuppert <mschuppert@redhat.com> wrote:

...


For nova/neutron it would be to disable the service/agent:

(overcloud) $ openstack compute service list
(overcloud) $ openstack compute service set [hostname] nova-compute --disable

(overcloud) $ openstack network agent list
(overcloud) $ openstack network agent set --disable [openvswitch-agent-id]

After service is stopped/or host delete
(overcloud) $ openstack compute service delete [service-id]
(overcloud) $ openstack network agent delete [openvswitch-agent-id]

Regards, Martin

[1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/14/...

Might be worth confirming there are no instances running on the node too.

Cheers,
Ollie
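One possible way to do that check from the CLI is sketched below. The function names are invented for this example, and it assumes admin overcloud credentials are loaded, since filtering `openstack server list` by `--host` across all projects is an admin operation.

```shell
#!/usr/bin/env bash
# Sketch only: refuse to continue while Nova still reports instances on a host.
# Assumes admin overcloud credentials are already sourced.

# count_instances HOSTNAME
# Prints how many instances Nova lists on HOSTNAME.
count_instances() {
    local host="$1" ids
    ids="$(openstack server list --all-projects --host "$host" -f value -c ID)" || return 2
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    printf '%s\n' "$ids" | grep -c . || true
}

# host_is_empty HOSTNAME
# Succeeds only when no instances remain on HOSTNAME.
host_is_empty() {
    local count
    count="$(count_instances "$1")" || return 2
    [ "$count" -eq 0 ]
}
```

A scale-down task could then guard the removal with something like `host_is_empty "$node" || exit 1`, where `$node` is whatever hostname is about to be deleted.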


Emilien Macchi

12:20 p.m.

On Thu, Apr 11, 2019 at 3:23 AM Martin Schuppert <mschuppert@redhat.com> wrote:

...

For nova/neutron it would be to disable the service/agent:

(overcloud) $ openstack compute service list
(overcloud) $ openstack compute service set [hostname] nova-compute --disable

(overcloud) $ openstack network agent list
(overcloud) $ openstack network agent set --disable [openvswitch-agent-id]

After service is stopped/or host delete
(overcloud) $ openstack compute service delete [service-id]
(overcloud) $ openstack network agent delete [openvswitch-agent-id]

Ok, so these commands would need to be executed from the Undercloud in the mistral_container, since they do nothing on the nodes themselves and only call the APIs through the CLI. Noted; I'll make sure it's possible.

Do you folks already have some playbooks doing these things or should we start from scratch?
--
Emilien Macchi

Martin Schuppert

12:27 p.m.

On Thu, Apr 11, 2019 at 2:20 PM Emilien Macchi <emilien@redhat.com> wrote:

...


Ok so these commands would need to be executed from the Undercloud in the mistral_container, since they do nothing on local nodes but just do CLI against APIs. I take note and will make sure it's possible.

Do you folks already have some playbooks doing these things or should we start from scratch?

No, right now those are manual tasks, but Rajesh could help on this.

Regards,
Martin


Emilien Macchi

19 Apr

1:47 a.m.

On Thu, Apr 11, 2019 at 8:28 AM Martin Schuppert <mschuppert@redhat.com> wrote:

...


No, right now those are manual tasks, but Rajesh could help on this.

I'm prototyping it on https://review.openstack.org/#/c/653893/ - I'll iterate on that patch and once it works I'll tackle Neutron.

Feedback is welcome!
--
Emilien Macchi
