[octavia] Timeouts during building of lb? But then successful

Michael Johnson johnsomor at gmail.com
Mon Nov 9 16:40:39 UTC 2020


Hi Florian,

That is very unusual. It typically takes less than 30 seconds for a
load balancer to be provisioned. It definitely sounds like the mysql
instance is having trouble. This can also cause longer term issues if
the query response time rises to 10 seconds or more (0.001 seconds is
normal), which could trigger unnecessary failovers.

In Octavia there are layers of "retries" to attempt to handle clouds
that are having trouble. It sounds like database issues are triggering
one or more of these retries.
There are a few retries that will be in play for database transactions:

* MySQL internal retries/timeouts, such as lock timeouts (logged on the
  mysql side)
* oslo.db automatic retries (typically not logged without configuration
  file settings)
* Octavia tenacity and flow retries (typically logged if the
  configuration file has Debug = True enabled)
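
To illustrate why stacked retries can turn a short database hiccup into
a long stall, here is a minimal Python sketch. The RetryLayer class, the
layer names, and the numbers are made up for the example; they are not
Octavia, oslo.db, or MySQL defaults. It assumes a fixed wait per attempt
and that each outer attempt re-runs the whole inner stack.

```python
from dataclasses import dataclass


@dataclass
class RetryLayer:
    """Hypothetical description of one retry layer (not an Octavia type)."""
    name: str
    attempts: int        # maximum tries at this layer
    wait_seconds: float  # fixed wait added per try at this layer


def worst_case_seconds(layers):
    """Worst-case total wait when every layer exhausts its retries.

    `layers` is ordered outermost first. Each outer attempt re-runs the
    full inner stack, so the inner cost is multiplied by the outer count.
    """
    total = 0.0
    for layer in reversed(layers):  # start from the innermost layer
        total = layer.attempts * (total + layer.wait_seconds)
    return total


# Illustrative values only: 2 outer tries, each wrapping 3 inner tries.
example = [
    RetryLayer("outer (e.g. flow retry)", attempts=2, wait_seconds=5),
    RetryLayer("inner (e.g. db retry)", attempts=3, wait_seconds=10),
]
print(worst_case_seconds(example))
```

With these made-up numbers the inner layer alone can take 30 seconds,
and wrapping it in two outer attempts brings the worst case to 70.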

This may also be a general network connection issue. The default REST
timeout (used when we connect to the amphora agents) is 600 seconds, so
I'm wondering if the lb-mgmt-network is also having an issue.
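
One rough way to check the lb-mgmt-network from a controller node is to
time a plain TCP connection to an amphora. This is only a sketch using
the Python standard library: the function name is made up, and the
address and port in the usage comment are placeholders you would replace
with the lb-mgmt-network address and agent port of one of your amphorae.
It only tests TCP reachability, not the TLS-protected agent API itself.

```python
import socket
import time


def time_tcp_connect(host, port, timeout=5.0):
    """Return seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        # create_connection raises OSError on refusal, timeout, or no route
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


# Example usage (placeholder address and port, adjust for your cloud):
#   elapsed = time_tcp_connect("192.0.2.10", 9443)
#   print("unreachable" if elapsed is None else f"{elapsed:.3f}s")
```

If this returns None or takes seconds rather than milliseconds, the
management network is a likely suspect alongside the database.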

Please check your health manager log files. If there are database
query time issues logged, it would point specifically to a mysql
issue. In the past we have seen mysql clustering setups that were bad
and caused performance issues (flipping primary instance, lock
contention between the instances, etc.). You should not be seeing any
log messages saying that the mysql database went away; that is not normal.
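
As a quick way to see how often the database connection is dropping,
something like the following Python sketch could count the usual MySQL
client error strings in a log file. The function name is made up, the
log path in the usage comment is an assumption (it depends on how your
deployment is packaged), and the exact wording in your logs may differ,
so treat the patterns as a starting point.

```python
from collections import Counter

# Common MySQL client error texts (errors 2006 and 2013).
DB_ERROR_PATTERNS = (
    "MySQL server has gone away",
    "Lost connection to MySQL server",
)


def count_db_errors(lines):
    """Count log lines containing each known database error string."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        for pattern in DB_ERROR_PATTERNS:
            if pattern.lower() in lowered:  # case-insensitive match
                counts[pattern] += 1
    return counts


# Example usage (hypothetical path, adjust for your deployment):
#   with open("/var/log/octavia/health-manager.log") as log_file:
#       print(count_db_errors(log_file))
```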

Michael

On Sun, Nov 8, 2020 at 7:06 AM Florian Rommel <florian at datalounges.com> wrote:
>
> Hi, so we have a fully functioning setup of octavia on ussuri and it works nicely, when it completes.
> So here is what happens:
> From octavia api to octavia worker takes 20 seconds for the job to be initiated.
> The loadbalancer gets built quickly and then we get a "mysql went away" error. The listener gets built and then a member, and that works too; then the mysql error comes up with "query took too long to execute".
> Now this is where it gets weird. This is all within the first 2 - 3 minutes.
> At this point it hangs and takes 10 minutes (600 seconds) for the next step to complete and then another 10 minutes and another 10 until it’s completed.
> It seems there is a timeout somewhere but even with debug on we do not see what is going on. Does anyone have a mysql 8 running and octavia executing fine? And could send me their redacted octavia or mysql conf files? We didn’t touch them but it seems that there is something off..
> especially since it then completes and works extremely nicely.
> I would highly appreciate it , even off list.
> Best regards,
> //f
>
>


