Best practices to restart / repair broken Octavia LoadBalancer
Hi,

I ran into an issue where one of the Octavia LB amphora VMs crashed, and since the load balancer provisioning_status became PENDING_UPDATE (or ERROR) it is no longer possible to use the OpenStack CLI tools to manage the LB:

openstack loadbalancer amphora list --loadbalancer 0ce30f0e-1d75-486c-a09f-79125abf44b8
+--------------------------------------+--------------------------------------+-----------+--------+---------------+--------------+
| id                                   | loadbalancer_id                      | status    | role   | lb_network_ip | ha_ip        |
+--------------------------------------+--------------------------------------+-----------+--------+---------------+--------------+
| daee2f88-01fd-4ffa-b80d-15c63771d99d | 0ce30f0e-1d75-486c-a09f-79125abf44b8 | ERROR     | BACKUP | 172.10.10.30  | 172.11.12.26 |
| f22186b1-2865-4f4a-aae2-7f869b7aae12 | 0ce30f0e-1d75-486c-a09f-79125abf44b8 | ALLOCATED | MASTER | 172.10.10.5   | 172.11.12.26 |
+--------------------------------------+--------------------------------------+-----------+--------+---------------+--------------+

openstack loadbalancer show 0ce30f0e-1d75-486c-a09f-79125abf44b8
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2019-05-19T09:48:12                  |
| description         |                                      |
| flavor              |                                      |
| id                  | 0ce30f0e-1d75-486c-a09f-79125abf44b8 |
| listeners           | 12745a48-7277-405f-98da-e7b9fbaf93cc |
| name                | foo-lb1                              |
| operating_status    | ONLINE                               |
| pools               | 482985f9-2804-4960-bd93-6bbb798b57f7 |
| project_id          | 76e81458c81f6e2xebbbfc81f6bb76e008d  |
| provider            | amphora                              |
| provisioning_status | PENDING_UPDATE                       |
| updated_at          | 2019-06-25T16:58:33                  |
| vip_address         | 172.11.12.26                         |
| vip_network_id      | 8cc0f284-613c-40a7-ac72-c83ffdc26a93 |
| vip_port_id         | f598aac4-4bd0-472b-9b9c-e4e305cb561b |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | e1478576-23b0-40e8-b4f2-5b284f2b23c4 |
+---------------------+--------------------------------------+

I was able to fix this by updating the load_balancer state to 'ACTIVE' directly in the Octavia database and triggering a failover:

MySQL [octavia]> update load_balancer set provisioning_status = 'ACTIVE' where id = '0ce30f0e-1d75-486c-a09f-79125abf44b8';

openstack loadbalancer failover 0ce30f0e-1d75-486c-a09f-79125abf44b8

But this seems to be more of a workaround than a proper way to restart / repair the load balancer without manually interfering with the OpenStack DB.

Is there another way to accomplish this with the CLI?

BR
Pawel
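Summarizing the above as a repeatable sequence, a minimal sketch assuming admin CLI credentials and the load balancer ID from this thread; the mysql invocation is only illustrative (connection details are deployment-specific), and the database UPDATE is an unsupported workaround rather than part of the normal CLI workflow:

# Confirm the load balancer is stuck in a PENDING_* provisioning state
openstack loadbalancer show 0ce30f0e-1d75-486c-a09f-79125abf44b8 -c provisioning_status

# Unsupported workaround: release the stuck record directly in the Octavia database
mysql octavia -e "UPDATE load_balancer SET provisioning_status = 'ACTIVE' WHERE id = '0ce30f0e-1d75-486c-a09f-79125abf44b8';"

# Trigger a failover so Octavia rebuilds the failed amphora
openstack loadbalancer failover 0ce30f0e-1d75-486c-a09f-79125abf44b8

# Verify the load balancer returns to ACTIVE and the amphorae are replaced
openstack loadbalancer show 0ce30f0e-1d75-486c-a09f-79125abf44b8 -c provisioning_status
openstack loadbalancer amphora list --loadbalancer 0ce30f0e-1d75-486c-a09f-79125abf44b8

The database edit only releases the stuck record; it is the failover that actually rebuilds the broken amphora.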
Hi Pawel,

The intended CLI functionality to address this is the load balancer failover API; however, we have some open bugs with that right now.

Objects should never get "stuck" in PENDING_*. Those are transient states meaning that one of the controllers has claimed ownership of the resource to take an action on it. For example, in your case one of your health manager processes has claimed the load balancer to attempt an automatic repair. However, due to a bug in nova (https://bugs.launchpad.net/nova/+bug/1827746) this automatic repair was unable to complete. We try for up to five minutes, but then have to give up as nova is stuck.

We have open stories and one patch in progress to improve this situation. Once we can get resources available to finish those, we will backport the bug fix patches to the stable branches.

Related stories and patches in Octavia:
https://review.opendev.org/#/c/585864/
https://storyboard.openstack.org/#!/story/2006051

As always, we encourage you to open StoryBoard stories for us to track any issues you have seen. Even if they are duplicates, we can then track the number of people experiencing an issue and help prioritize the work.

Michael

On Wed, Jun 26, 2019 at 7:50 AM Pawel Konczalski <pawel.konczalski@everyware.ch> wrote:
Hi,
I ran into an issue where one of the Octavia LB amphora VMs crashed, and since the load balancer provisioning_status became PENDING_UPDATE (or ERROR) it is no longer possible to use the OpenStack CLI tools to manage the LB:
openstack loadbalancer amphora list --loadbalancer 0ce30f0e-1d75-486c-a09f-79125abf44b8
+--------------------------------------+--------------------------------------+-----------+--------+---------------+--------------+
| id                                   | loadbalancer_id                      | status    | role   | lb_network_ip | ha_ip        |
+--------------------------------------+--------------------------------------+-----------+--------+---------------+--------------+
| daee2f88-01fd-4ffa-b80d-15c63771d99d | 0ce30f0e-1d75-486c-a09f-79125abf44b8 | ERROR     | BACKUP | 172.10.10.30  | 172.11.12.26 |
| f22186b1-2865-4f4a-aae2-7f869b7aae12 | 0ce30f0e-1d75-486c-a09f-79125abf44b8 | ALLOCATED | MASTER | 172.10.10.5   | 172.11.12.26 |
+--------------------------------------+--------------------------------------+-----------+--------+---------------+--------------+
openstack loadbalancer show 0ce30f0e-1d75-486c-a09f-79125abf44b8
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2019-05-19T09:48:12                  |
| description         |                                      |
| flavor              |                                      |
| id                  | 0ce30f0e-1d75-486c-a09f-79125abf44b8 |
| listeners           | 12745a48-7277-405f-98da-e7b9fbaf93cc |
| name                | foo-lb1                              |
| operating_status    | ONLINE                               |
| pools               | 482985f9-2804-4960-bd93-6bbb798b57f7 |
| project_id          | 76e81458c81f6e2xebbbfc81f6bb76e008d  |
| provider            | amphora                              |
| provisioning_status | PENDING_UPDATE                       |
| updated_at          | 2019-06-25T16:58:33                  |
| vip_address         | 172.11.12.26                         |
| vip_network_id      | 8cc0f284-613c-40a7-ac72-c83ffdc26a93 |
| vip_port_id         | f598aac4-4bd0-472b-9b9c-e4e305cb561b |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | e1478576-23b0-40e8-b4f2-5b284f2b23c4 |
+---------------------+--------------------------------------+
I was able to fix this by updating the load_balancer state to 'ACTIVE' directly in the Octavia database and triggering a failover:
MySQL [octavia]> update load_balancer set provisioning_status = 'ACTIVE' where id = '0ce30f0e-1d75-486c-a09f-79125abf44b8';
openstack loadbalancer failover 0ce30f0e-1d75-486c-a09f-79125abf44b8
But this seems to be more of a workaround than a proper way to restart / repair the load balancer without manually interfering with the OpenStack DB.
Is there another way to accomplish this with the CLI?
BR
Pawel
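Following up on the nova bug Michael references: a minimal sketch, assuming the ERROR amphora ID from this thread, of how one might check the nova instance behind a failed amphora before resorting to a manual database edit. compute_id is the amphora field that points at the backing nova server, and <compute_id> is a placeholder to replace with the value reported by the first command:

# Look up the nova instance that backs the ERROR amphora
openstack loadbalancer amphora show daee2f88-01fd-4ffa-b80d-15c63771d99d -c compute_id -c status

# Inspect that instance in nova (substitute the compute_id value from above)
openstack server show <compute_id>

# Once the load balancer is no longer held in a PENDING_* state, the supported
# recovery path is the failover API rather than a database edit
openstack loadbalancer failover 0ce30f0e-1d75-486c-a09f-79125abf44b8

If nova itself cannot delete or rebuild that instance (the bug Michael linked), the failover can get stuck again until nova is repaired, which is the situation the referenced Octavia stories and patches aim to handle more gracefully.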
participants (2)
- Michael Johnson
- Pawel Konczalski