[octavia] Timeouts during building of lb? But then successful
Florian Rommel
florian at datalounges.com
Tue Nov 10 08:30:00 UTC 2020
Hi Michael, and thank you.
So it really seems that the MySQL server (fairly vanilla, non-clustered) is the problem. I followed the installation instructions from the official documentation, and yes, when I create an LB this is what comes through in the octavia-worker log:
2020-11-10 08:23:18.450 777550 INFO octavia.controller.queue.v1.endpoints [-] Creating member 'a6389dc7-61cb-489c-b69b-a922f0a9d9f2'...
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines [-] Database connection was found disconnected; reconnecting: oslo_db.exception.DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query')
The MySQL server right now runs on localhost; I changed that to see if the problem persists.
There is no iptables currently running that could be rate limiting, etc.
We made almost no modifications to my.cnf, and Octavia is the only service that has problems; all other services have no issues at all.
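For completeness, these are the my.cnf timeout-related variables I would expect to matter here — shown with what I believe are MySQL 8's stock defaults, since we left them untouched:

```ini
[mysqld]
# Seconds an idle (non-interactive) connection may sit before the server
# drops it; a pooled service connection older than this sees "server has
# gone away" / "lost connection" on its next query.
wait_timeout = 28800
# Largest packet / query result the server will accept.
max_allowed_packet = 64M
# Seconds to wait for more data on a connection mid-query before aborting.
net_read_timeout = 30
net_write_timeout = 60
```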
Are there some specific settings I should set in octavia.conf that deal with this? Here is the Python trace, which in my opinion is fairly basic…
(Background on this error at: http://sqlalche.me/e/e3q8)
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines Traceback (most recent call last):
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1245, in _execute_context
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines self.dialect.do_execute(
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/sqlalchemy/engine/default.py", line 581, in do_execute
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines cursor.execute(statement, parameters)
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 170, in execute
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines result = self._query(query)
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 328, in _query
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines conn.query(q)
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 517, in query
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 732, in _read_query_result
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines result.read()
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1075, in read
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines first_packet = self.connection._read_packet()
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 657, in _read_packet
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines packet_header = self._read_bytes(4)
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 706, in _read_bytes
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines raise err.OperationalError(
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines pymysql.err.OperationalError: (2013, 'Lost connection to MySQL server during query')
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines The above exception was the direct cause of the following exception:
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines Traceback (most recent call last):
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/oslo_db/sqlalchemy/engines.py", line 73, in _connect_ping_listener
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines connection.scalar(select([1]))
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 914, in scalar
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines return self.execute(object_, *multiparams, **params).scalar()
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 982, in execute
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines return meth(self, multiparams, params)
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines return connection._execute_clauseelement(self, multiparams, params)
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1095, in _execute_clauseelement
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines ret = self._execute_context(
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1249, in _execute_context
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines self._handle_dbapi_exception(
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1474, in _handle_dbapi_exception
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines util.raise_from_cause(newraise, exc_info)
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines reraise(type(exception), exception, tb=exc_tb, cause=cause)
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/sqlalchemy/util/compat.py", line 152, in reraise
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines raise value.with_traceback(tb)
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1245, in _execute_context
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines self.dialect.do_execute(
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/sqlalchemy/engine/default.py", line 581, in do_execute
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines cursor.execute(statement, parameters)
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 170, in execute
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines result = self._query(query)
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 328, in _query
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines conn.query(q)
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 517, in query
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 732, in _read_query_result
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines result.read()
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1075, in read
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines first_packet = self.connection._read_packet()
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 657, in _read_packet
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines packet_header = self._read_bytes(4)
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 706, in _read_bytes
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines raise err.OperationalError(
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines oslo_db.exception.DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query')
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines [SQL: SELECT 1]
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines (Background on this error at: http://sqlalche.me/e/e3q8)
2020-11-10 08:23:18.458 777550 ERROR oslo_db.sqlalchemy.engines
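For reference, these are the oslo.db knobs under [database] in octavia.conf that I understand govern reconnect behaviour — a sketch with illustrative values, not our production config:

```ini
[database]
# SQLAlchemy connection URL (credentials redacted).
connection = mysql+pymysql://octavia:PASSWORD@127.0.0.1/octavia
# Recycle pooled connections before MySQL's wait_timeout can drop them.
connection_recycle_time = 3600
# Retry the initial connection instead of failing the service at startup.
max_retries = -1
retry_interval = 10
# Transparently retry operations that hit a disconnect.
use_db_reconnect = true
db_max_retries = 20
db_retry_interval = 1
```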
The other option is to run it all verbose?
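In the meantime I have been timing simple round-trips like the `SELECT 1` liveness ping seen in the trace, to rule out the network path. A self-contained sketch using the stdlib sqlite3 driver as a stand-in — against the real server you would swap in a pymysql connection instead:

```python
# Rough latency probe for DB-API "SELECT 1" round-trips, modelled on the
# oslo.db ping query in the traceback. sqlite3 is only a stand-in here;
# replace the connect() call with pymysql.connect(host=..., user=..., ...)
# to measure the actual MySQL server.
import sqlite3
import time


def ping_latency(conn, rounds=5):
    """Time `rounds` SELECT 1 round-trips; return a list of seconds per query."""
    samples = []
    for _ in range(rounds):
        start = time.perf_counter()
        cur = conn.cursor()
        cur.execute("SELECT 1")
        assert cur.fetchone()[0] == 1
        samples.append(time.perf_counter() - start)
    return samples


conn = sqlite3.connect(":memory:")
samples = ping_latency(conn)
print(f"avg {sum(samples) / len(samples):.6f}s over {len(samples)} pings")
```

Per Michael's note, anything much above 0.001s per ping against the real database would point at the MySQL side.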
Best regards
//F
> On 9. Nov 2020, at 18.40, Michael Johnson <johnsomor at gmail.com> wrote:
>
> Hi Florian,
>
> That is very unusual. It typically takes less than 30 seconds for a
> load balancer to be provisioned. It definitely sounds like the mysql
> instance is having trouble. This can also cause longer term issues if
> the query response time drops to 10 seconds or more (0.001 is normal),
> which could trigger unnecessary failovers.
>
> In Octavia there are layers of "retries" to attempt to handle clouds
> that are having trouble. It sounds like database issues are triggering
> one or more of these retries.
> There are a few retries that will be in play for database transactions:
> - MySQL internal retries/timeouts such as lock timeouts (logged on the MySQL side)
> - oslo.db includes some automatic retries (typically not logged without configuration file settings)
> - Octavia tenacity and flow retries (typically logged if the configuration file has Debug = True enabled)
>
> This may also be a general network connection issue. The default REST
> timeouts (used when we connect to the amphora agents) are 600 seconds; I'm
> wondering if the lb-mgmt-network is also having an issue.
>
> Please check your health manager log files. If there are database
> query time issues logged, it would point specifically to a mysql
> issue. In the past we have seen mysql clustering setups that were bad
> and caused performance issues (flipping primary instance, lock
> contention between the instances, etc.). You should not be seeing any
> log messages that the mysql database went away, that is not normal.
>
> Michael
>
> On Sun, Nov 8, 2020 at 7:06 AM Florian Rommel <florian at datalounges.com> wrote:
>>
>> Hi, so we have a fully functioning setup of Octavia on Ussuri and it works nicely, when it completes.
>> So here is what happens:
>> From the Octavia API to the Octavia worker, it takes 20 seconds for the job to be initiated.
>> The load balancer gets built quickly, and then we get a MySQL "went away" error; the listener gets built, and then a member, which works too; then the MySQL error comes up with "query took too long to execute".
>> Now this is where it gets weird. This is all within the first 2-3 minutes.
>> At this point it hangs and takes 10 minutes (600 seconds) for the next step to complete, and then another 10 minutes, and another 10, until it's completed.
>> It seems there is a timeout somewhere, but even with debug on we do not see what is going on. Does anyone have MySQL 8 running and Octavia executing fine? And could you send me your redacted Octavia or MySQL conf files? We didn't touch them, but it seems that there is something off..
>> especially since it then completes and works extremely nicely.
>> I would highly appreciate it, even off-list.
>> Best regards,
>> //f
>>
>>
>
More information about the openstack-discuss mailing list