On Mon, Sep 14, 2020 at 18:09, Tony Liu <tonyliu0592@hotmail.com> wrote:
Radosław pointed out another bug
https://bugs.launchpad.net/keystonemiddleware/+bug/1883659
referring to the same fix
https://review.opendev.org/#/c/742193/

Regarding the fix, the comment says "This flag is off by
default for backwards compatibility." But I see this flag is
on by default in the current code; that's how it causes issues.
This fix changes the default value from on to off, so it does break
backwards compatibility. To keep Keystone working the old way
along with this fix, this flag has to be explicitly set to true
in keystone.conf. For neutron-server and nova-api, it's good to
leave this flag off by default. Am I correct?


Long story short, as far as I correctly remember this topic:

Currently, flush on reconnect is not configurable; it is always triggered (in the corresponding scenario).

If we decide to introduce this new option, `memcache_pool_flush_on_reconnect`, we would need to set it to `True` by default to keep backward compat.

If this option is set to `True`, then flush on reconnect will be triggered every time the corresponding scenario occurs.

Using `True` as the default value was my first choice for these changes; I think we should give priority to backward compat at first, and then, in a second step, deprecate this behavior and switch the default to `False` if it helps to fix things.

Finally, after some discussions, `False` was retained as the default value (cf. the comments on https://review.opendev.org/#/c/742193/), which means that flush on reconnect will not be executed. In that case I think we can say that backward compat is broken, as this is not the current behavior.
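For operators who want to keep the historical behavior once the fix lands, the option would then have to be enabled explicitly. A minimal sketch of what that could look like in keystone.conf, assuming the option is exposed through the oslo.cache `[cache]` group alongside the existing memcache pool settings (the server names are placeholders):

    [cache]
    # Existing oslo.cache settings: pooled memcache backend and servers.
    backend = oslo_cache.memcache_pool
    memcache_servers = controller1:11211,controller2:11211,controller3:11211
    # Proposed option: explicitly re-enable flush on reconnect to keep
    # the old always-flush behavior, since the retained default is False.
    memcache_pool_flush_on_reconnect = true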

AFAIK `flush_on_reconnect` was added for Keystone, and I think only Keystone really needs it, but other people could confirm that.

If we decide to continue with `False` as the default value, then neutron-server and nova-api can leave the default as-is, as I don't think they need flush on reconnect (cf. my previous line).
 
Finally, it could be worth deep diving into the python-memcached side, which is where the root cause lies (the exponential connection growth), to see how to address it there.
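To make the python-memcached side concrete, here is a minimal sketch assuming a plain `memcache.Client` (the `flush_on_reconnect` keyword exists on that class; the server list is a placeholder). It only shows where the behavior is toggled; the connection growth itself happens inside the library when a dead server comes back and the flush is issued:

    import memcache

    SERVERS = ["controller1:11211", "controller2:11211", "controller3:11211"]

    # Historical behavior: when a dead server comes back, the client
    # flushes the cache on every server, which is where the connection
    # growth described in bug 1888394 is triggered.
    flushing_client = memcache.Client(SERVERS, flush_on_reconnect=1)

    # Behavior with the proposed False default: no flush on reconnect,
    # so stale entries may briefly be served, but no reconnect storm.
    plain_client = memcache.Client(SERVERS, flush_on_reconnect=0)

    flushing_client.set("token-abc", "cached-payload", time=300)
    print(plain_client.get("token-abc"))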

Hope that helps you.


Thanks!
Tony
> -----Original Message-----
> From: Herve Beraud <hberaud@redhat.com>
> Sent: Monday, September 14, 2020 8:27 AM
> To: Tony Liu <tonyliu0592@hotmail.com>
> Cc: openstack-discuss <openstack-discuss@lists.openstack.org>
> Subject: Re: memcached connections
>
> Hello,
>
> python-memcached badly handles connections during a flush on reconnect,
> and so connections can grow exponentially [1].
>
>
> I don't know if it is the same issue you faced, but it could be a
> lead to follow.
>
> In oslo.cache, a fix has been submitted but it is not yet merged [2].
>
>
> [1] https://bugs.launchpad.net/oslo.cache/+bug/1888394
> [2] https://review.opendev.org/#/c/742193/
>
> On Fri, Sep 11, 2020 at 23:29, Tony Liu <tonyliu0592@hotmail.com> wrote:
>
>
>       Hi,
>
>       Is there any guidance or experiences to estimate the number
>       of memcached connections?
>
>       Here is memcached connection on one of the 3 controllers.
>       Connection number is the total established connections to
>       all 3 memcached nodes.
>
>       Node 1:
>       10 Keystone workers have 62 connections.
>       11 Nova API workers have 37 connections.
>       6 Neutron server workers have 4304 connections.
>       1 memcached has 4973 connections.
>
>       Node 2:
>       10 Keystone workers have 62 connections.
>       11 Nova API workers have 30 connections.
>       6 Neutron server workers have 3703 connections.
>       1 memcached has 4973 connections.
>
>       Node 3:
>       10 Keystone workers have 54 connections.
>       11 Nova API workers have 15 connections.
>       6 Neutron server workers have 6541 connections.
>       1 memcached has 4973 connections.
>
>       Before I increase the connection limit for memcached, I'd
>       like to understand whether all of the above is expected.
>
>       How do Neutron server and memcached end up with so many connections?
>
>       Any elaboration is appreciated.
>
>       BTW, the problem leading me here is memcached connection timeouts,
>       which result in all services depending on memcached no longer
>       working properly.
>
>
>       Thanks!
>       Tony
>
> --
>
> Hervé Beraud
> Senior Software Engineer
>
> Red Hat - Openstack Oslo
> irc: hberaud
>



--
Hervé Beraud
Senior Software Engineer
Red Hat - Openstack Oslo
irc: hberaud