[Openstack-operators] memcached redundancy
morgan.fainberg at gmail.com
Fri Aug 22 18:39:12 UTC 2014
While keystone uses memcache as a possible token storage backend we are
working towards eliminating the design that makes memcache a desirable
Using memcache for the token backend is not the best approach, because the
token backend (up through Icehouse, and in some cases still in Juno)
assumes stable storage for at least the life of the token.
I agree with Josh, we are likely using memcached incorrectly in a number
On Thursday, August 21, 2014, Joshua Harlow <harlowja at outlook.com> wrote:
> +1 for this, remember the 'cache' in memcache *strongly* indicates what it
> should be used for.
> A useful link to read over @
> > Excerpts from Joe Topjian's message of 2014-08-14 09:09:59 -0700:
> >> Hello,
> >> I have an OpenStack cloud with two HA cloud controllers. Each controller
> >> runs the standard controller components: glance, keystone, nova minus
> >> compute and network, cinder, horizon, mysql, rabbitmq, and memcached.
> >> Everything except memcached is accessed through haproxy and everything is
> >> working great (well, rabbit can be finicky ... I might post about that if
> >> it continues).
> >> The problem I currently have is how to effectively work with memcached in
> >> this environment. Since all components are load balanced, they need access
> >> to the same memcached servers. That's solved by the ability to specify
> >> multiple memcached servers in the various openstack config files.
> >> But if I take a server down for maintenance, I notice a 2-3 second delay in
> >> all requests. I've confirmed it's memcached by editing the list of
> >> memcached servers in the config files and the delay goes away.
> > I've seen a few responses to this that show a _massive_ misunderstanding
> > of how memcached is intended to work.
> > Memcached should never need to be load balanced at the connection
> > level. It has a consistent hash ring based on the keys to handle
> > load balancing and failover. If you have 2 servers, and 1 is gone,
> > the failover should happen automatically. This gets important when you
> > have, say, 5 memcached servers as it means that given 1 failed server,
> > you retain n-1 RAM for caching.
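
To make the hash-ring point above concrete, here is a minimal sketch of a consistent hash ring. It is illustrative only and is not the code of any particular memcached client; the class name HashRing, the server names mc1..mc5, the use of SHA-256, and the replica count are all assumptions chosen for the example. It shows that when one of five servers is removed, only the keys that lived on that server are remapped, so the other four keep their cached data.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent hash ring (hypothetical, not a real client's code)."""

    def __init__(self, servers, replicas=100):
        self.replicas = replicas  # virtual points per server, to even out load
        self._points = []         # sorted hash positions on the ring
        self._owner = {}          # hash position -> server name
        for server in servers:
            self.add(server)

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def add(self, server):
        for i in range(self.replicas):
            point = self._hash("%s#%d" % (server, i))
            self._owner[point] = server
            bisect.insort(self._points, point)

    def remove(self, server):
        for i in range(self.replicas):
            point = self._hash("%s#%d" % (server, i))
            if self._owner.get(point) == server:
                del self._owner[point]
                self._points.remove(point)

    def lookup(self, key):
        # The first ring position after the key's hash owns the key.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


# Hypothetical server names and keys, for illustration only.
servers = ["mc%d:11211" % n for n in range(1, 6)]
ring = HashRing(servers)
keys = ["token-%d" % n for n in range(1000)]

before = {k: ring.lookup(k) for k in keys}
ring.remove("mc3:11211")  # simulate one server going away
after = {k: ring.lookup(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print("keys remapped: %d of %d" % (len(moved), len(keys)))
```

Every remapped key previously belonged to the removed server; keys on the surviving servers stay where they were, which is what keeps the remaining cache warm.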
> > What I suspect is happening is that we're not doing that right by
> > either not keeping persistent connections, or retrying dead servers
> > too aggressively.
> > In fact, it looks like the default one used in oslo-incubator's
> > 'memorycache', the 'memcache' driver, will by default retry dead servers
> > every 30 seconds, and wait 3 seconds for a timeout, which probably
> > matches the behavior you see. None of the places I looked in Nova seem
> > to allow passing in a different dead_retry or timeout. In my experience,
> > you probably want something like dead_retry == 600, so only one slow
> > operation every 10 minutes per process (so if you have 10 nova-api's
> > running, that's 10 requests every 10 minutes).
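
The arithmetic above can be checked with a small simulation. This is a simplified model under stated assumptions, not the driver's actual code: the class DeadServerTracker and the function stalled_requests are hypothetical names, one process is assumed to issue one request per second for an hour, and a request that probes the downed server is assumed to wait the full timeout. The 30-second retry and 3-second timeout are the defaults quoted in the message.

```python
class DeadServerTracker:
    """Toy model of 'mark dead, retry after dead_retry seconds' (hypothetical)."""

    def __init__(self, dead_retry=30, timeout=3):
        self.dead_retry = dead_retry
        self.timeout = timeout
        self.dead_until = 0.0

    def request(self, now):
        """Return the seconds this request spends waiting on the downed server."""
        if now < self.dead_until:
            return 0.0  # server is marked dead, so it is skipped immediately
        # Retry window has expired: probe the server, time out, mark dead again.
        self.dead_until = now + self.dead_retry
        return float(self.timeout)


def stalled_requests(dead_retry, timeout=3, duration=3600, interval=1):
    """Count slow requests seen by one process over `duration` seconds."""
    tracker = DeadServerTracker(dead_retry, timeout)
    return sum(
        1 for t in range(0, duration, interval) if tracker.request(t) > 0
    )


for retry in (30, 600):
    print("dead_retry=%d: %d slow requests per hour per process"
          % (retry, stalled_requests(retry)))
```

Under these assumptions a process sees 120 slow requests per hour with dead_retry of 30, and 6 per hour with dead_retry of 600, i.e. one every 10 minutes per process.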
> > It is also possible that some of these objects are being re-created on
> > every request, as is common if caching is implemented too deep inside
> > "middleware" and not at the edges of a solution. I haven't dug deep
> > enough in, but suffice to say, replicating and load balancing may be
> > cheaper than auditing the code and fixing it at this point.
> > _______________________________________________
> > OpenStack-operators mailing list
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators