[oslo][nova] Nova causes MySQL timeouts
Satish Patel
satish.txt at gmail.com
Tue Sep 17 17:27:40 UTC 2019
I don't want to boil the ocean, but I had a similar problem: my nova was
losing its MySQL DB connection, and we found the culprit was the load balancer
(BigIP F5). It has a different TCP timeout compared to whatever OpenStack
provides. After adjusting the timeout on the F5, my issue was resolved.
On Tue, Sep 17, 2019 at 12:56 PM Sean Mooney <smooney at redhat.com> wrote:
>
> On Tue, 2019-09-17 at 16:36 +0000, Albert Braden wrote:
> > I thought I had figured out that the solution was to increase the MySQL wait_timeout so that it is longer than the
> > nova (and glance, neutron, etc.) connection_recycle_time (3600). I increased my MySQL wait_timeout to 6000:
> >
> > root@us01odc-qa-ctrl1:~# mysqladmin variables|grep wait_timeout|grep -v _wait
> > | wait_timeout | 6000
> >
> > But I still see the MySQL errors. There's no LB; we are pointing to a single MySQL host.
> >
> > Sep 11 14:59:56 us01odc-qa-ctrl1 mysqld[1052956]: 2019-09-11 14:59:56 8016 [Warning] Aborted connection 8016 to db:
> > 'nova' user: 'nova' host: 'us01odc-qa-ctrl2.internal.synopsys.com' (Got timeout reading communication packets)
> > Sep 11 14:59:57 us01odc-qa-ctrl1 mysqld[1052956]: 2019-09-11 14:59:57 8019 [Warning] Aborted connection 8019 to db:
> > 'glance' user: 'glance' host: 'us01odc-qa-ctrl1.internal.synopsys.com' (Got timeout reading communication packets)
> > Sep 11 14:59:57 us01odc-qa-ctrl1 mysqld[1052956]: 2019-09-11 14:59:57 8018 [Warning] Aborted connection 8018 to db:
> > 'nova_api' user: 'nova' host: 'us01odc-qa-ctrl2.internal.synopsys.com' (Got timeout reading communication packets)
> > Sep 11 15:00:50 us01odc-qa-ctrl1 mysqld[1052956]: 2019-09-11 15:00:50 8022 [Warning] Aborted connection 8022 to db:
> > 'nova_api' user: 'nova' host: 'us01odc-qa-ctrl1.internal.synopsys.com' (Got timeout reading communication packets)
> >
> > The errors come from nova, neutron, glance and keystone; it appears that all default to 3600. So it appears that, even
> > with wait_timeout > connection_recycle_time we still see mysql timeout errors.
> >
> > Just for fun I tried setting the MySQL wait_timeout to 86400 and restarting MySQL. I expected that this would pause
> > the "Aborted connection" errors for 24 hours, but they started again after an hour. So it looks like my original
> > assumption was incorrect. I thought nova was keeping connections open until the MySQL server timed them out, but now
> > it appears that something else is happening.
> >
> > Has anyone successfully stopped these MySQL error messages?
>
> could this be related to the eventlet heartbeat issue we see for rabbitmq when running the api under mod_wsgi/uwsgi?
>
> e.g. have you confirmed that your WSGI server is configured to use 1 thread and multiple processes for concurrency?
> Multiple threads in one process might have issues.
> > -----Original Message-----
> > From: Ben Nemec <openstack at nemebean.com>
> > Sent: Monday, September 9, 2019 9:50 AM
> > To: Chris Hoge <chris at openstack.org>; openstack-discuss at lists.openstack.org
> > Subject: Re: [oslo][nova] Nova causes MySQL timeouts
> >
> >
> >
> > On 9/9/19 11:38 AM, Chris Hoge wrote:
> > > In my personal experience, running Nova on a four core machine without
> > > limiting the number of database connections will easily exhaust the
> > > available connections to MySQL/MariaDB. Keep in mind that the limit
> > > applies to every instance of a service, so if Nova starts 'm' services
> > > replicated for 'n' cores with 'd' possible connections you'll be up to
> > > 'm x n x d' connections. It gets big fast.
> > >
> > > The default setting of '0' (that is, unlimited) does not make for a good
> > > first-run experience, IMO.
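To make the 'm x n x d' arithmetic above concrete, here is a minimal Python sketch. It assumes that the per-process ceiling 'd' is max_pool_size + max_overflow, and it uses the [database] defaults quoted elsewhere in this thread (5 and 50) together with the four-core machine mentioned above. The service count of 3 and the function name are hypothetical, chosen only for illustration.

```python
def worst_case_connections(services: int, workers: int, per_process: int) -> int:
    """Upper bound on DB connections: m services x n workers x d connections each."""
    return services * workers * per_process


# Assumed ceiling per process: pooled connections plus temporary overflow.
max_pool_size = 5    # [database] default quoted in this thread
max_overflow = 50    # [database] default quoted in this thread
per_process = max_pool_size + max_overflow

# Hypothetical deployment: 3 services, each with one worker per core on 4 cores.
total = worst_case_connections(services=3, workers=4, per_process=per_process)
print(total)
```

With these assumed numbers the bound is 3 x 4 x 55 = 660 connections from a single host, which is why the MySQL max_connections setting needs to be sized with every service in mind.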
> >
> > We don't default to 0. We default to 5:
> >
> > https://docs.openstack.org/oslo.db/stein/reference/opts.html#database.max_pool_size
> >
> >
> > >
> > > This issue comes up every few years or so, and the previous consensus
> > > was that 200-2000 connections are recommended, based on your needs. Your
> > > database has to be configured to handle the load, and it is important to
> > > look at the configuration value across all your services and set it
> > > consistently and appropriately.
> > >
> > >
> > > http://lists.openstack.org/pipermail/openstack-dev/2015-April/061808.html
> > >
> >
> > Thanks, I did not recall that discussion.
> >
> > If I'm reading it correctly, Jay is suggesting that for MySQL we should
> > just disable connection pooling. As I noted earlier, I don't think we
> > expose the ability to do that in oslo.db (patches welcome!), but setting
> > max_pool_size to 1 would get you pretty close. Maybe we should add that
> > to the help text for the option in oslo.db?
> >
> > >
> > > > On Sep 6, 2019, at 7:34 AM, Ben Nemec <openstack at nemebean.com> wrote:
> > > >
> > > > Tagging with oslo as this sounds related to oslo.db.
> > > >
> > > > On 9/5/19 7:37 PM, Albert Braden wrote:
> > > > > After more googling it appears that max_pool_size is a maximum limit on the number of connections that can stay
> > > > > open, and max_overflow is a maximum limit on the number of connections that can be temporarily opened when the
> > > > > pool has been consumed. It looks like the defaults are 5 and 10 which would keep 5 connections open all the time
> > > > > and allow 10 temp.
> > > > > Do I need to set max_pool_size to 0 and max_overflow to the number of connections that I want to allow? Is that
> > > > > a reasonable and correct configuration? Intuitively that doesn't seem right, to have a pool size of 0, but if
> > > > > the "pool" is a group of connections that will remain open until they time out, then maybe 0 is correct?
> > > >
> > > > I don't think so. According to [0] and [1], a pool_size of 0 means unlimited. You could probably set it to 1 to
> > > > minimize the number of connections kept open, but then I expect you'll have overhead from having to re-open
> > > > connections frequently.
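As an illustration of how max_pool_size and max_overflow interact, here is a toy Python model of the accounting described above. It is not SQLAlchemy or oslo.db code: the class and method names are hypothetical, and it raises immediately when the pool is exhausted, whereas a real QueuePool would wait for a connection to be returned before giving up. It also does not model the special "0 means unlimited" case.

```python
class ToyPool:
    """Toy model of pool_size / max_overflow accounting (not SQLAlchemy code)."""

    def __init__(self, pool_size: int, max_overflow: int) -> None:
        self.pool_size = pool_size
        self.max_overflow = max_overflow
        self.idle = 0          # open connections waiting in the pool
        self.checked_out = 0   # open connections currently in use

    @property
    def open_connections(self) -> int:
        return self.idle + self.checked_out

    def checkout(self) -> None:
        if self.idle > 0:
            self.idle -= 1     # reuse an idle pooled connection
        elif self.open_connections >= self.pool_size + self.max_overflow:
            raise RuntimeError("pool and overflow exhausted")
        # Otherwise a new connection is opened for this checkout.
        self.checked_out += 1

    def checkin(self) -> None:
        self.checked_out -= 1
        if self.idle < self.pool_size:
            self.idle += 1     # kept open for reuse
        # Otherwise this was an overflow connection and it is closed.


# Defaults mentioned above: 5 pooled connections and 10 temporary ones.
pool = ToyPool(pool_size=5, max_overflow=10)
for _ in range(15):
    pool.checkout()
peak = pool.open_connections      # all 15 connections are open at once
for _ in range(15):
    pool.checkin()
after_idle = pool.open_connections  # only the pooled ones stay open
print(peak, after_idle)
```

In this model the pool peaks at 15 open connections under load and falls back to 5 once everything is returned. Those 5 idle connections are the ones that connection_recycle_time and the server-side timeouts act on.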
> > > >
> > > > It sounds like you could use a NullPool to eliminate connection pooling entirely, but I don't think we support
> > > > that in oslo.db. Based on the error message you're seeing, I would take a look at connection_recycle_time[2]. I
> > > > seem to recall seeing a comment that the recycle time needs to be shorter than any of the timeouts in the path
> > > > between the service and the db (so anything like haproxy or mysql itself). Shortening that, or lengthening
> > > > intervening timeouts, might get rid of these disconnection messages.
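The rule of thumb above can be written as a small check. This is only a sketch: the function name and the 300-second load balancer timeout are hypothetical, while 3600 and 6000 are the connection_recycle_time and wait_timeout values reported in this thread.

```python
from typing import Dict, List


def hops_not_longer_than_recycle(recycle_time: int,
                                 path_timeouts: Dict[str, int]) -> List[str]:
    """Return the hops whose idle timeout is not longer than recycle_time.

    Any hop listed here may drop a pooled connection before the service
    gets a chance to recycle it.
    """
    return sorted(name for name, timeout in path_timeouts.items()
                  if timeout <= recycle_time)


connection_recycle_time = 3600          # default reported in this thread
path = {
    "mysql wait_timeout": 6000,         # value reported in this thread
    "load balancer idle timeout": 300,  # hypothetical value for illustration
}
offenders = hops_not_longer_than_recycle(connection_recycle_time, path)
print(offenders)
```

Here only the hypothetical load balancer entry is flagged. Passing this check is necessary but may not be sufficient: as reported elsewhere in this thread, the warnings can persist even when wait_timeout is longer than connection_recycle_time.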
> > > >
> > > > 0:
> > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.openstack.org_oslo.db_stein_reference_opts.html-23database.max-5Fpool-5Fsize&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=W7apBhYbgfvGgB46HWLe-By9d_MYg6RB_eU3C2mARRY&s=p7bBYcuhnDR_J08MWFBj8XLiRUUV8JfruAIcl0zF234&e=
> > > >
> > > > 1:
> > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.sqlalchemy.org_en_13_core_pooling.html-23sqlalchemy.pool.QueuePool.-5F-5Finit-5F-5F&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=W7apBhYbgfvGgB46HWLe-By9d_MYg6RB_eU3C2mARRY&s=_EIhQyyj1gSM0PrX7de3yJr8hNi7tD8-tnfPo2VV_LU&e=
> > > >
> > > > 2:
> > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.openstack.org_oslo.db_stein_reference_opts.html-23database.connection-5Frecycle-5Ftime&d=DwIDaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=W7apBhYbgfvGgB46HWLe-By9d_MYg6RB_eU3C2mARRY&s=xDnj80EQrxXwenOLgmKEaJbF3VRIylapDgqyMs81pSY&e=
> > > >
> > > >
> > > > > *From:* Albert Braden <Albert.Braden at synopsys.com>
> > > > > *Sent:* Wednesday, September 4, 2019 10:19 AM
> > > > > *To:* openstack-discuss at lists.openstack.org
> > > > > *Cc:* Gaëtan Trellu <gaetan.trellu at incloudus.com>
> > > > > *Subject:* RE: Nova causes MySQL timeouts
> > > > > > We’re not setting the max_pool_size or max_overflow options presently. I googled around and found this document:
> > > > >
> > > > > > https://docs.openstack.org/keystone/stein/configuration/config-options.html
> > > > > Document says:
> > > > > > [api_database]
> > > > > > connection_recycle_time = 3600   (Integer) Timeout before idle SQL connections are reaped.
> > > > > > max_overflow = None              (Integer) If set, use this value for max_overflow with SQLAlchemy.
> > > > > > max_pool_size = None             (Integer) Maximum number of SQL connections to keep open in a pool.
> > > > > > [database]
> > > > > > connection_recycle_time = 3600   (Integer) Timeout before idle SQL connections are reaped.
> > > > > > min_pool_size = 1                (Integer) Minimum number of SQL connections to keep open in a pool.
> > > > > > max_overflow = 50                (Integer) If set, use this value for max_overflow with SQLAlchemy.
> > > > > > max_pool_size = None             (Integer) Maximum number of SQL connections to keep open in a pool.
> > > > > If min_pool_size is >0, would that cause at least 1 connection to remain open until it times out? What are the
> > > > > recommended values for these, to allow unused connections to close before they time out? Is “min_pool_size = 0”
> > > > > an acceptable setting?
> > > > > My settings are default:
> > > > > [api_database]:
> > > > > #connection_recycle_time = 3600
> > > > > #max_overflow = <None>
> > > > > #max_pool_size = <None>
> > > > > [database]:
> > > > > #connection_recycle_time = 3600
> > > > > #min_pool_size = 1
> > > > > #max_overflow = 50
> > > > > #max_pool_size = 5
> > > > > It’s not obvious what max_overflow does. Where can I find a document that explains more about these settings?
> > > > > > *From:* Gaëtan Trellu <gaetan.trellu at incloudus.com>
> > > > > > *Sent:* Tuesday, September 3, 2019 1:37 PM
> > > > > > *To:* Albert Braden <albertb at synopsys.com>
> > > > > > *Cc:* openstack-discuss at lists.openstack.org
> > > > > *Subject:* Re: Nova causes MySQL timeouts
> > > > > Hi Albert,
> > > > > > It is a configuration issue; have a look at the max_pool_size and max_overflow options under the [database] section.
> > > > > > Keep in mind that the more workers you have, the more connections will be opened on the database.
> > > > > Gaetan (goldyfruit)
> > > > > > On Sep 3, 2019 4:31 PM, Albert Braden <Albert.Braden at synopsys.com> wrote:
> > > > > It looks like nova is keeping mysql connections open until they time
> > > > > out. How are others responding to this issue? Do you just ignore the
> > > > > mysql errors, or is it possible to change configuration so that nova
> > > > > closes and reopens connections before they time out? Or is there a
> > > > > way to stop mysql from logging these aborted connections without
> > > > > hiding real issues?
> > > > > Aborted connection 10726 to db: 'nova' user: 'nova' host: 'asdf'
> > > > > (Got timeout reading communication packets)
> > >
> > >
> >
> >
>
>