Galera config values
Eugen Block
eblock at nde.ag
Fri Jan 17 22:34:58 UTC 2020
Hi,
I'm pretty sure you'll have to figure it out yourself. I always found
the deployment guides quite good, and I got my cloud running without
major issues. But when it comes to HA configuration, the guide lacks a
lot of information. I had to figure out many details on my own, though
haproxy is currently not in use here.
> So it looks like the value of "timeout client" in haproxy.cfg needs
> to match or exceed the value of "wait_timeout" in mysql.
Although I'm not entirely sure, I tend to agree with you. While dealing
with a Ceph RGW deployment I encountered a similar issue and had to
increase some timeout values to get it working.
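A minimal sketch of how that rule could be checked, assuming the
proposed relation ("timeout client" >= "wait_timeout") is the one to
test. It is not taken from any deployment guide, and the function names
are hypothetical. It relies only on the haproxy time format, where a
bare number means milliseconds and the suffixes us, ms, s, m, h and d
are allowed:

```python
import re

# HAProxy time suffixes expressed in microseconds.
# A value without a suffix is interpreted as milliseconds.
_UNIT_IN_MICROSECONDS = {
    "us": 1,
    "ms": 1_000,
    "s": 1_000_000,
    "m": 60_000_000,
    "h": 3_600_000_000,
    "d": 86_400_000_000,
}


def haproxy_time_to_seconds(value):
    """Convert an haproxy time string such as '1m' or '30s' to seconds."""
    match = re.fullmatch(r"(\d+)(us|ms|s|m|h|d)?", value.strip())
    if match is None:
        raise ValueError(f"unrecognised haproxy time value: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNIT_IN_MICROSECONDS[unit or "ms"] / 1_000_000


def client_timeout_covers_wait_timeout(timeout_client, wait_timeout_seconds):
    """Return True if 'timeout client' is at least as long as wait_timeout.

    This encodes only the rule proposed in this thread; it is an
    assumption, not a documented requirement.
    """
    return haproxy_time_to_seconds(timeout_client) >= wait_timeout_seconds


if __name__ == "__main__":
    # Values from the quoted message: haproxy default 1m, mysql wait_timeout 3600.
    print(client_timeout_covers_wait_timeout("1m", 3600))
    print(client_timeout_covers_wait_timeout("60m", 3600))
```

With the values from the quoted message, the default of 1m fails this
check and 60m passes it. The 6m setting that stopped the errors would
also fail it, so the rule may be stricter than what is actually needed.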
I'm convinced that many people would appreciate it if you wrote a doc
for haproxy.
Regards,
Eugen
Quoting Albert Braden <Albert.Braden at synopsys.com>:
> I'm experimenting with Galera in my Rocky openstack-ansible dev
> cluster, and I'm finding that the default haproxy config values
> don't seem to work. Finding the correct values is a lot of work. For
> example, I spent this morning experimenting with different values
> for "timeout client" in /etc/haproxy/haproxy.cfg. The default is
> 1m, and with the default set I see this error in
> /var/log/nova/nova-scheduler.log on the controllers:
>
> 2020-01-17 13:54:26.059 443358 ERROR oslo_db.sqlalchemy.engines
> DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost
> connection to MySQL server during query') [SQL: u'SELECT 1']
> (Background on this error at: http://sqlalche.me/e/e3q8)
>
> There are several timeout values in /etc/haproxy/haproxy.cfg. These
> are the values we started with:
>
> stats timeout 30s
> timeout http-request 10s
> timeout queue 1m
> timeout connect 10s
> timeout client 1m
> timeout server 1m
> timeout check 10s
>
> At first I changed them all to 30m. This stopped the "Lost
> connection" error in nova-scheduler.log. Then, one at a time, I
> changed them back to the default. When I got to "timeout client" I
> found that setting it back to 1m caused the errors to start again. I
> changed it back and forth and found that 4 minutes causes errors,
> and 6m stops them, so I left it at 6m.
>
> These are my active variables:
>
> root at us01odc-dev2-ctrl1:/etc/mysql# mysql -e 'show variables;'|grep timeout
> connect_timeout 20
> deadlock_timeout_long 50000000
> deadlock_timeout_short 10000
> delayed_insert_timeout 300
> idle_readonly_transaction_timeout 0
> idle_transaction_timeout 0
> idle_write_transaction_timeout 0
> innodb_flush_log_at_timeout 1
> innodb_lock_wait_timeout 50
> innodb_rollback_on_timeout OFF
> interactive_timeout 28800
> lock_wait_timeout 86400
> net_read_timeout 30
> net_write_timeout 60
> rpl_semi_sync_master_timeout 10000
> rpl_semi_sync_slave_kill_conn_timeout 5
> slave_net_timeout 60
> thread_pool_idle_timeout 60
> wait_timeout 3600
>
> So it looks like the value of "timeout client" in haproxy.cfg needs
> to match or exceed the value of "wait_timeout" in mysql. Also in
> nova.conf I see "#connection_recycle_time = 3600" - I need to
> experiment to see how that value interacts with the timeouts in the
> other config files.
>
> Is this the best way to find the correct config values? It seems
> like there should be a document that talks about these timeouts and
> how to set them (or maybe more generally how the different timeout
> settings in the various config files interact). Does that document
> exist? If not, maybe I could write one, since I have to figure out
> the correct values anyway.