[oslo][nova] Nova causes MySQL timeouts
Herve Beraud
hberaud at redhat.com
Wed Sep 18 08:05:07 UTC 2019
On Tue, Sep 17, 2019 at 19:55, Albert Braden <Albert.Braden at synopsys.com>
wrote:
> I had not heard about the eventlet heartbeat issue. Where can I read more
> about it?
>
Under Apache and mod_wsgi, eventlet green threads do not work properly.
Nova hit this issue a few months ago through its use of oslo.messaging,
specifically in the heartbeat of the RabbitMQ driver.
The heartbeat was run in a green thread under Apache and mod_wsgi, so
after a few seconds or minutes the heartbeat thread became idle, the
connection to the RabbitMQ server was closed, then re-opened, and so on.
This resulted in a lot of connections being opened and closed between the
client and the server.
You can find more discussion here:
-
http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005822.html
And the oslo.messaging fix related to this issue:
-
https://github.com/openstack/oslo.messaging/commit/22f240b82fffbd62be8568a7d0d3369134596ace
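To illustrate the general idea only (this is a minimal sketch, not the
oslo.messaging code; see the commit above for what the driver actually
does): a heartbeat that runs on a native OS thread keeps firing on its own
timer instead of waiting for a green thread scheduler to give it a turn.
The Heartbeat class and the beat callback below are hypothetical names, and
the sketch assumes the threading module has not been monkey-patched by
eventlet.

```python
import threading
import time


class Heartbeat:
    """Call `beat` every `interval` seconds on a native OS thread.

    Hypothetical helper for illustration; not part of oslo.messaging.
    """

    def __init__(self, beat, interval):
        self._beat = beat
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        # Event.wait() returns False on timeout and True once stop() is called,
        # so the loop beats once per interval until it is asked to stop.
        while not self._stop.wait(self._interval):
            self._beat()


# Stand-in for "send a heartbeat frame to the broker": record a timestamp.
beats = []
hb = Heartbeat(lambda: beats.append(time.monotonic()), interval=0.01)
hb.start()
time.sleep(0.1)
hb.stop()
```

After stop() returns, the thread has exited and beats holds one timestamp
per heartbeat that fired.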
>
> The [wsgi] section of my nova.conf is default; nothing is uncommented.
>
> -----Original Message-----
> From: Sean Mooney <smooney at redhat.com>
> Sent: Tuesday, September 17, 2019 9:50 AM
> To: Albert Braden <albertb at synopsys.com>;
> openstack-discuss at lists.openstack.org
> Cc: Ben Nemec <openstack at nemebean.com>; Chris Hoge <chris at openstack.org>
> Subject: Re: [oslo][nova] Nova causes MySQL timeouts
>
> On Tue, 2019-09-17 at 16:36 +0000, Albert Braden wrote:
> > I thought I had figured out that the solution was to increase the MySQL
> > wait_timeout so that it is longer than the nova (and glance, neutron,
> > etc.) connection_recycle_time (3600). I increased my MySQL wait_timeout
> > to 6000:
> >
> > root at us01odc-qa-ctrl1:~# mysqladmin variables|grep wait_timeout|grep -v _wait
> > > wait_timeout | 6000
> >
> > But I still see the MySQL errors. There's no LB; we are pointing to a
> > single MySQL host.
> >
> > Sep 11 14:59:56 us01odc-qa-ctrl1 mysqld[1052956]: 2019-09-11 14:59:56 8016 [Warning] Aborted connection 8016 to db: 'nova' user: 'nova' host: 'us01odc-qa-ctrl2.internal.synopsys.com' (Got timeout reading communication packets)
> > Sep 11 14:59:57 us01odc-qa-ctrl1 mysqld[1052956]: 2019-09-11 14:59:57 8019 [Warning] Aborted connection 8019 to db: 'glance' user: 'glance' host: 'us01odc-qa-ctrl1.internal.synopsys.com' (Got timeout reading communication packets)
> > Sep 11 14:59:57 us01odc-qa-ctrl1 mysqld[1052956]: 2019-09-11 14:59:57 8018 [Warning] Aborted connection 8018 to db: 'nova_api' user: 'nova' host: 'us01odc-qa-ctrl2.internal.synopsys.com' (Got timeout reading communication packets)
> > Sep 11 15:00:50 us01odc-qa-ctrl1 mysqld[1052956]: 2019-09-11 15:00:50 8022 [Warning] Aborted connection 8022 to db: 'nova_api' user: 'nova' host: 'us01odc-qa-ctrl1.internal.synopsys.com' (Got timeout reading communication packets)
> >
> > The errors come from nova, neutron, glance and keystone; it appears that
> > all default to 3600. So it appears that, even with
> > wait_timeout > connection_recycle_time, we still see mysql timeout errors.
> >
> > Just for fun I tried setting the MySQL wait_timeout to 86400 and
> > restarting MySQL. I expected that this would pause the "Aborted
> > connection" errors for 24 hours, but they started again after an hour.
> > So it looks like my original assumption was incorrect. I thought nova was
> > keeping connections open until the MySQL server timed them out, but now
> > it appears that something else is happening.
> >
> > Has anyone successfully stopped these MySQL error messages?
>
> could this be related to the eventlet heartbeat issue we see for rabbitmq
> when running the api under mod_wsgi/uwsgi?
>
> e.g. have you confirmed that your wsgi server is configured to use 1 thread
> and multiple processes for concurrency?
> multiple threads in one process might have issues.
> > -----Original Message-----
> > From: Ben Nemec <openstack at nemebean.com>
> > Sent: Monday, September 9, 2019 9:50 AM
> > To: Chris Hoge <chris at openstack.org>; openstack-discuss at lists.openstack.org
> > Subject: Re: [oslo][nova] Nova causes MySQL timeouts
> >
> >
> >
> > On 9/9/19 11:38 AM, Chris Hoge wrote:
> > > In my personal experience, running Nova on a four core machine without
> > > limiting the number of database connections will easily exhaust the
> > > available connections to MySQL/MariaDB. Keep in mind that the limit
> > > applies to every instance of a service, so if Nova starts 'm' services
> > > replicated for 'n' cores with 'd' possible connections you'll be up to
> > > 'm x n x d' connections. It gets big fast.
> > >
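The 'm x n x d' estimate above can be sketched as a small calculation. This
is only an illustration under stated assumptions: the max_connections
function is a hypothetical helper, 'd' is taken to be max_pool_size +
max_overflow (the most connections a single SQLAlchemy QueuePool will open
at once), and the example numbers (3 services, 4 workers each) are made up.

```python
def max_connections(services, workers, max_pool_size, max_overflow):
    """Upper bound on DB connections: m services x n workers x d per pool.

    Returns None when the pool is unbounded, which for SQLAlchemy's
    QueuePool means pool_size == 0 or max_overflow == -1.
    """
    if max_pool_size == 0 or max_overflow == -1:
        return None
    per_pool = max_pool_size + max_overflow
    return services * workers * per_pool


# Hypothetical deployment: 3 services, 4 workers each,
# with max_pool_size = 5 and max_overflow = 50.
bounded = max_connections(3, 4, max_pool_size=5, max_overflow=50)

# A pool size of 0 removes the limit, so there is no upper bound to report.
unbounded = max_connections(3, 4, max_pool_size=0, max_overflow=50)
```

With these numbers the bound is 3 x 4 x 55 = 660 connections, which is why
the MySQL max_connections setting has to be sized with every service in
mind.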
> > > The default setting of '0' (that is, unlimited) does not make for a
> > > good first-run experience, IMO.
> >
> > We don't default to 0. We default to 5:
> >
> > https://docs.openstack.org/oslo.db/stein/reference/opts.html#database.max_pool_size
> >
> >
> > >
> > > This issue comes up every few years or so, and the previous consensus
> > > was that 200-2000 connections is recommended based on your needs. Your
> > > database has to be configured to handle the load and looking at the
> > > configuration value across all your services and setting them
> > > consistently and appropriately is important.
> > >
> > >
> > > http://lists.openstack.org/pipermail/openstack-dev/2015-April/061808.html
> > >
> >
> > Thanks, I did not recall that discussion.
> >
> > If I'm reading it correctly, Jay is suggesting that for MySQL we should
> > just disable connection pooling. As I noted earlier, I don't think we
> > expose the ability to do that in oslo.db (patches welcome!), but setting
> > max_pool_size to 1 would get you pretty close. Maybe we should add that
> > to the help text for the option in oslo.db?
> >
> > >
> > > > On Sep 6, 2019, at 7:34 AM, Ben Nemec <openstack at nemebean.com> wrote:
> > > >
> > > > Tagging with oslo as this sounds related to oslo.db.
> > > >
> > > > On 9/5/19 7:37 PM, Albert Braden wrote:
> > > > > After more googling it appears that max_pool_size is a maximum limit
> > > > > on the number of connections that can stay open, and max_overflow is
> > > > > a maximum limit on the number of connections that can be temporarily
> > > > > opened when the pool has been consumed. It looks like the defaults
> > > > > are 5 and 10 which would keep 5 connections open all the time and
> > > > > allow 10 temp.
> > > > > Do I need to set max_pool_size to 0 and max_overflow to the number of
> > > > > connections that I want to allow? Is that a reasonable and correct
> > > > > configuration? Intuitively that doesn't seem right, to have a pool
> > > > > size of 0, but if the "pool" is a group of connections that will
> > > > > remain open until they time out, then maybe 0 is correct?
> > > >
> > > > I don't think so. According to [0] and [1], a pool_size of 0 means
> > > > unlimited. You could probably set it to 1 to minimize the number of
> > > > connections kept open, but then I expect you'll have overhead from
> > > > having to re-open connections frequently.
> > > >
> > > > It sounds like you could use a NullPool to eliminate connection
> > > > pooling entirely, but I don't think we support that in oslo.db. Based
> > > > on the error message you're seeing, I would take a look at
> > > > connection_recycle_time[2]. I seem to recall seeing a comment that the
> > > > recycle time needs to be shorter than any of the timeouts in the path
> > > > between the service and the db (so anything like haproxy or mysql
> > > > itself). Shortening that, or lengthening intervening timeouts, might
> > > > get rid of these disconnection messages.
> > > >
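The rule of thumb above (recycle time shorter than every timeout between
the service and the db) can be written as a quick check. This is a sketch
under stated assumptions: timeouts_before_recycle is a hypothetical helper,
all values are in seconds, and the haproxy figure is invented for the
example.

```python
def timeouts_before_recycle(recycle_time, path_timeouts):
    """Return the names of timeouts that would fire before the pool recycles.

    `path_timeouts` maps a label (for example "mysql wait_timeout") to a
    timeout in seconds. An empty result means the recycle time is shorter
    than every timeout in the path.
    """
    return sorted(
        name
        for name, timeout in path_timeouts.items()
        if timeout <= recycle_time
    )


# connection_recycle_time = 3600 against a MySQL wait_timeout of 6000:
# nothing in the path expires first.
direct = timeouts_before_recycle(3600, {"mysql wait_timeout": 6000})

# Same settings with a hypothetical proxy idle timeout of 1800 in between:
# the proxy would drop idle connections before they are recycled.
proxied = timeouts_before_recycle(
    3600,
    {"mysql wait_timeout": 6000, "haproxy idle timeout": 1800},
)
```

Any name returned is a candidate for lengthening, or a reason to shorten
connection_recycle_time instead.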
> > > > 0: https://docs.openstack.org/oslo.db/stein/reference/opts.html#database.max_pool_size
> > > >
> > > > 1: https://docs.sqlalchemy.org/en/13/core/pooling.html#sqlalchemy.pool.QueuePool.__init__
> > > >
> > > > 2: https://docs.openstack.org/oslo.db/stein/reference/opts.html#database.connection_recycle_time
> > > >
> > > >
> > > > > *From:* Albert Braden <Albert.Braden at synopsys.com>
> > > > > *Sent:* Wednesday, September 4, 2019 10:19 AM
> > > > > *To:* openstack-discuss at lists.openstack.org
> > > > > *Cc:* Gaëtan Trellu <gaetan.trellu at incloudus.com>
> > > > > *Subject:* RE: Nova causes MySQL timeouts
> > > > > We’re not setting the max_pool_size or max_overflow options
> > > > > presently. I googled around and found this document:
> > > > > https://docs.openstack.org/keystone/stein/configuration/config-options.html
> > > > > Document says:
> > > > > [api_database]
> > > > > connection_recycle_time = 3600   (Integer) Timeout before idle SQL connections are reaped.
> > > > > max_overflow = None              (Integer) If set, use this value for max_overflow with SQLAlchemy.
> > > > > max_pool_size = None             (Integer) Maximum number of SQL connections to keep open in a pool.
> > > > > [database]
> > > > > connection_recycle_time = 3600   (Integer) Timeout before idle SQL connections are reaped.
> > > > > min_pool_size = 1                (Integer) Minimum number of SQL connections to keep open in a pool.
> > > > > max_overflow = 50                (Integer) If set, use this value for max_overflow with SQLAlchemy.
> > > > > max_pool_size = None             (Integer) Maximum number of SQL connections to keep open in a pool.
> > > > > If min_pool_size is >0, would that cause at least 1 connection to
> > > > > remain open until it times out? What are the recommended values for
> > > > > these, to allow unused connections to close before they time out? Is
> > > > > “min_pool_size = 0” an acceptable setting?
> > > > > My settings are default:
> > > > > [api_database]:
> > > > > #connection_recycle_time = 3600
> > > > > #max_overflow = <None>
> > > > > #max_pool_size = <None>
> > > > > [database]:
> > > > > #connection_recycle_time = 3600
> > > > > #min_pool_size = 1
> > > > > #max_overflow = 50
> > > > > #max_pool_size = 5
> > > > > It’s not obvious what max_overflow does. Where can I find a document
> > > > > that explains more about these settings?
> > > > > *From:* Gaëtan Trellu <gaetan.trellu at incloudus.com>
> > > > > *Sent:* Tuesday, September 3, 2019 1:37 PM
> > > > > *To:* Albert Braden <albertb at synopsys.com>
> > > > > *Cc:* openstack-discuss at lists.openstack.org
> > > > > *Subject:* Re: Nova causes MySQL timeouts
> > > > > Hi Albert,
> > > > > It is a configuration issue; have a look at the max_pool_size and
> > > > > max_overflow options under the [database] section.
> > > > > Keep in mind that the more workers you have, the more connections
> > > > > will be opened on the database.
> > > > > Gaetan (goldyfruit)
> > > > > On Sep 3, 2019 4:31 PM, Albert Braden <Albert.Braden at synopsys.com> wrote:
> > > > >     It looks like nova is keeping mysql connections open until they
> > > > >     time out. How are others responding to this issue? Do you just
> > > > >     ignore the mysql errors, or is it possible to change configuration
> > > > >     so that nova closes and reopens connections before they time out?
> > > > >     Or is there a way to stop mysql from logging these aborted
> > > > >     connections without hiding real issues?
> > > > >     Aborted connection 10726 to db: 'nova' user: 'nova' host: 'asdf'
> > > > >     (Got timeout reading communication packets)
> > >
> > >
> >
> >
>
>
--
Hervé Beraud
Senior Software Engineer
Red Hat - Openstack Oslo
irc: hberaud