Best practices to restart / repair broken Octavia LoadBalancer
Michael Johnson
johnsomor at gmail.com
Wed Jun 26 15:47:59 UTC 2019
Hi Pawel,
The intended CLI functionality to address this is the load balancer
failover API; however, we have some open bugs with that right now.
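
For reference, failover is triggered from the CLI like this (the load
balancer has to be in a stable state such as ACTIVE or ERROR, not
PENDING_*, for the API to accept the request):

openstack loadbalancer failover <loadbalancer-id>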
Objects should never get "stuck" in PENDING_*; those are transient
states meaning that one of the controllers has claimed ownership of
the resource to take an action on it. For example, in your case one of
your health manager processes has claimed the load balancer to attempt
an automatic repair.
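
A quick way to check which state a load balancer is currently in:

openstack loadbalancer show <loadbalancer-id> -c provisioning_status -c operating_status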
However, due to a bug in nova
(https://bugs.launchpad.net/nova/+bug/1827746) this automatic repair
was unable to complete. We try for up to five minutes, but then have
to give up as nova is stuck.
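
If you need to tune how long we wait, the relevant options should be
the amphora-active timeouts in octavia.conf. This is a sketch from
memory (option names and defaults may differ in your release, so please
verify against the configuration reference before relying on them):

# octavia.conf - sketch only; 30 retries x 10 seconds is where the
# roughly five minute window comes from
[controller_worker]
amp_active_retries = 30
amp_active_wait_sec = 10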
We have open stories and one patch in progress to improve this
situation. Once resources are available to finish those, we will
backport the bug-fix patches to the stable branches.
Related stories and patches in Octavia:
https://review.opendev.org/#/c/585864/
https://storyboard.openstack.org/#!/story/2006051
As always, we encourage you to open StoryBoard stories for us to track
any issues you have seen. Even if they are duplicates, we can then
track the number of people experiencing an issue and help prioritize
the work.
Michael
On Wed, Jun 26, 2019 at 7:50 AM Pawel Konczalski
<pawel.konczalski at everyware.ch> wrote:
>
> Hi,
>
> I ran into an issue where one of the Octavia LB amphora VMs crashed, and since the load balancer provisioning_status became PENDING_UPDATE (or ERROR) it is no longer possible to use the OpenStack CLI tools to manage the LB:
>
> openstack loadbalancer amphora list --loadbalancer 0ce30f0e-1d75-486c-a09f-79125abf44b8
> +--------------------------------------+--------------------------------------+-----------+--------+---------------+--------------+
> | id                                   | loadbalancer_id                      | status    | role   | lb_network_ip | ha_ip        |
> +--------------------------------------+--------------------------------------+-----------+--------+---------------+--------------+
> | daee2f88-01fd-4ffa-b80d-15c63771d99d | 0ce30f0e-1d75-486c-a09f-79125abf44b8 | ERROR     | BACKUP | 172.10.10.30  | 172.11.12.26 |
> | f22186b1-2865-4f4a-aae2-7f869b7aae12 | 0ce30f0e-1d75-486c-a09f-79125abf44b8 | ALLOCATED | MASTER | 172.10.10.5   | 172.11.12.26 |
> +--------------------------------------+--------------------------------------+-----------+--------+---------------+--------------+
>
> openstack loadbalancer show 0ce30f0e-1d75-486c-a09f-79125abf44b8
> +---------------------+--------------------------------------+
> | Field               | Value                                |
> +---------------------+--------------------------------------+
> | admin_state_up      | True                                 |
> | created_at          | 2019-05-19T09:48:12                  |
> | description         |                                      |
> | flavor              |                                      |
> | id                  | 0ce30f0e-1d75-486c-a09f-79125abf44b8 |
> | listeners           | 12745a48-7277-405f-98da-e7b9fbaf93cc |
> | name                | foo-lb1                              |
> | operating_status    | ONLINE                               |
> | pools               | 482985f9-2804-4960-bd93-6bbb798b57f7 |
> | project_id          | 76e81458c81f6e2xebbbfc81f6bb76e008d  |
> | provider            | amphora                              |
> | provisioning_status | PENDING_UPDATE                       |
> | updated_at          | 2019-06-25T16:58:33                  |
> | vip_address         | 172.11.12.26                         |
> | vip_network_id      | 8cc0f284-613c-40a7-ac72-c83ffdc26a93 |
> | vip_port_id         | f598aac4-4bd0-472b-9b9c-e4e305cb561b |
> | vip_qos_policy_id   | None                                 |
> | vip_subnet_id       | e1478576-23b0-40e8-b4f2-5b284f2b23c4 |
> +---------------------+--------------------------------------+
>
> I was able to fix this by updating the load_balancer provisioning_status to 'ACTIVE' directly in the Octavia database and then triggering a failover:
>
> MySQL [octavia]> update load_balancer set provisioning_status = 'ACTIVE' where id = '0ce30f0e-1d75-486c-a09f-79125abf44b8';
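>
> To double-check that the row actually changed before triggering the failover, a plain read-back of the same table works:
>
> MySQL [octavia]> select id, provisioning_status from load_balancer where id = '0ce30f0e-1d75-486c-a09f-79125abf44b8';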
>
> openstack loadbalancer failover 0ce30f0e-1d75-486c-a09f-79125abf44b8
>
>
> But this seems to be more of a workaround than a proper way to restart / repair the load balancer without manually interfering with the OpenStack DB.
>
> Is there another way to accomplish this with the CLI?
>
> BR
>
> Pawel