[Openstack-operators] memcached redundancy

Joe Topjian joe at topjian.net
Thu Aug 21 15:34:03 UTC 2014


I had some time this morning to set up an environment to reproduce the
issue. The environment is just as described before: all services go through
haproxy except for memcached.

This is all Ubuntu 12.04 and OpenStack Havana.

Keystone is using the memcached backend for tokens:

[token]
driver = keystone.token.backends.memcache.Token
[memcache]
servers=hnl-memcached-1:11211,hnl-memcached-2:11211

nova.conf also has a memcached setting:

memcached_servers=hnl-memcached-1:11211,hnl-memcached-2:11211

Normal requests take less than a second:

[ root@hnl-nova-1 ~ ] # time keystone token-get
+-----------+----------------------------------+
|  Property |              Value               |
+-----------+----------------------------------+
|  expires  |       2014-08-22T15:10:20Z       |
|     id    | be78e6ee4992435d8bcde6860810b2db |
| tenant_id | 20f49a5947d94d2ca25f68e8c790b9d6 |
|  user_id  | 79c71416b20f453eb1d48ad57454cdb7 |
+-----------+----------------------------------+

real    0m0.498s
user    0m0.254s
sys     0m0.072s

[ root@hnl-nova-1 ~ ] # time nova list


real    0m0.969s
user    0m0.417s
sys     0m0.104s

If I shut off hnl-memcached-1, requests take an extra 3 seconds:

[ root@hnl-nova-1 ~ ] # time keystone token-get
+-----------+----------------------------------+
|  Property |              Value               |
+-----------+----------------------------------+
|  expires  |       2014-08-22T15:20:24Z       |
|     id    | 8112e925344e4401830b205c60fc4f26 |
| tenant_id | 20f49a5947d94d2ca25f68e8c790b9d6 |
|  user_id  | 79c71416b20f453eb1d48ad57454cdb7 |
+-----------+----------------------------------+

real    0m3.549s
user    0m0.273s
sys     0m0.076s

I'm assuming the extra 3 seconds come from the 3-second SYN retry described
here <https://code.google.com/p/memcached/wiki/Timeouts>.
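
A quick way to test that assumption is to time a bare TCP connect to each
memcached host with an explicit timeout. The sketch below uses only the
Python standard library; the `probe` helper and its 1-second default
timeout are illustrative, not part of Keystone, Nova, or any memcached
client.

```python
import socket
import time


def probe(host, port, timeout=1.0):
    """Time one TCP connect attempt; return (reachable, seconds)."""
    start = time.monotonic()
    try:
        # The timeout bounds the connect attempt itself, so the call gives
        # up after `timeout` seconds rather than waiting on SYN retries.
        # It does not bound hostname resolution.
        with socket.create_connection((host, port), timeout=timeout):
            reachable = True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        reachable = False
    return reachable, time.monotonic() - start


# Example call (hostname and port taken from the configuration above):
#   probe("hnl-memcached-1", 11211)
```

Running this against a host that is shut off should show whether the
connect attempt alone accounts for the delay: a powered-off host typically
runs into the timeout, while a host that is up with memcached stopped
usually refuses the connection almost immediately.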

If I turn hnl-memcached-1 back on and shut down hnl-memcached-2, I get
mixed results. Initially I don't see a delay in requests, but repeated
requests take longer and longer until they hit the 3 second mark.

If I switch the order of the memcached servers (so hnl-memcached-2 is first
and hnl-memcached-1 is second), I immediately see the 3 second delay.

If I play around with the memcached settings in nova.conf, I see similar
results including a delay of up to 10 seconds.

So in summary, I'm seeing:

* Delay in requests when the first memcached server in a list is offline
* Sporadic delays in requests when the second memcached server in a list is
offline
* Even longer delays for subsequent services using the same memcached
servers.
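
One possible reading of the sporadic delays is that the client shards keys
across the configured servers, so only keys that map to the offline server
pay the connect timeout. The sketch below is a simplified model of that
idea, not the actual algorithm of any memcached client library: the
CRC32-modulo mapping, the function names, and the 3-second timeout (taken
from the delay observed above) are all assumptions.

```python
import zlib

SERVERS = ["hnl-memcached-1:11211", "hnl-memcached-2:11211"]


def pick_server(key, servers):
    """Map a key to one server by hashing it (simplified sharding model)."""
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]


def estimated_delay(keys, servers, offline, connect_timeout=3.0):
    """Total connect timeout paid for keys whose server is offline.

    Assumes the client retries the offline server on every request and
    never marks it dead; real clients may back off instead.
    """
    return sum(
        connect_timeout for key in keys if pick_server(key, servers) in offline
    )
```

Under this model a request whose key maps to a live server sees no delay,
which would match the intermittent behaviour. It does not by itself explain
delays that grow over repeated requests.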

Any ideas on what's going on?

Thanks,
Joe


On Thu, Aug 14, 2014 at 1:58 PM, Juan José Pavlik Salles
<jjpavlik at gmail.com> wrote:

> Joe, I'm not running it in my Grizzly cloud because I still have only
> one controller node. However, I remember testing repcached on two VMs and
> it worked fine. Anyway, it may not be the best solution for a big
> production cloud, since the repcached project is pretty much dead already
> (last update in 2012).
>
> Cheers.
>
>
> 2014-08-14 16:38 GMT-03:00 Joe Topjian <joe at topjian.net>:
>
>> Thanks all for the input! Aggregating all replies:
>>
>> Juan: I've seen that patch and was curious about it. When you say it
>> worked for you, have you used it in OpenStack for similar reasons?
>>
>> Daneyon: ah - sticky sessions are a good idea. I will look into that.
>>
>> John: Another component is a very good possibility. I'll try this out in
>> a dev environment or if we have planned maintenance where I can try this
>> prior to the maintenance window. Thank you for the insight on what the
>> memcached client should be doing.
>>
>> Thanks,
>> Joe
>>
>>
>> On Thu, Aug 14, 2014 at 1:22 PM, Juan José Pavlik Salles <
>> jjpavlik at gmail.com> wrote:
>>
>>> It might not be the best idea ever, but it worked for me. There's a
>>> memcached patch called repcached that lets you keep 2 memcached
>>> servers synchronized with each other (as far as I know, it only works
>>> for 2 servers), so you get HA and LB. The patch source is
>>> https://github.com/usecide/repcached/blob/master/repcached-2.3.1-1.4.13.patch
>>> and there is a short example running on Ubuntu 12.04 at
>>> http://viviendolared.blogspot.com.ar/2014/01/memcached-replicado-en-ubuntu-1204-lts.html
>>> (sorry about the Spanish link, but the steps are pretty
>>> straightforward).
>>>
>>>
>>> 2014-08-14 16:05 GMT-03:00 Daneyon Hansen (danehans)
>>> <danehans at cisco.com>:
>>>
>>>>
>>>>  It has been a while, but I believe I load-balanced memcached using
>>>> HAProxy (using sticky sessions) and observed no issues with fail-over.
>>>>
>>>>  Regards,
>>>> Daneyon Hansen
>>>> Software Engineer
>>>> Email: danehans at cisco.com
>>>> Phone: 303-718-0400
>>>> http://about.me/daneyon_hansen
>>>>
>>>>   From: Joe Topjian <joe at topjian.net>
>>>> Date: Thursday, August 14, 2014 10:09 AM
>>>> To: "openstack-operators at lists.openstack.org" <
>>>> openstack-operators at lists.openstack.org>
>>>> Subject: [Openstack-operators] memcached redundancy
>>>>
>>>>   Hello,
>>>>
>>>>  I have an OpenStack cloud with two HA cloud controllers. Each
>>>> controller runs the standard controller components: glance, keystone, nova
>>>> minus compute and network, cinder, horizon, mysql, rabbitmq, and memcached.
>>>>
>>>>  Everything except memcached is accessed through haproxy and
>>>> everything is working great (well, rabbit can be finicky ... I might post
>>>> about that if it continues).
>>>>
>>>>  The problem I currently have is how to effectively work with
>>>> memcached in this environment. Since all components are load balanced, they
>>>> need access to the same memcached servers. That's solved by the ability to
>>>> specify multiple memcached servers in the various openstack config files.
>>>>
>>>>  But if I take a server down for maintenance, I notice a 2-3 second
>>>> delay in all requests. I've confirmed the cause is memcached: when I
>>>> remove the offline server from the list of memcached servers in the
>>>> config files, the delay goes away.
>>>>
>>>>  I'm wondering how people deploy memcached in environments like this?
>>>> Are you using some type of memcached replication between servers? Or if a
>>>> memcached server goes offline are you reconfiguring OpenStack to remove the
>>>> offline memcached server?
>>>>
>>>>  Thanks,
>>>> Joe
>>>>
>>>> _______________________________________________
>>>> OpenStack-operators mailing list
>>>> OpenStack-operators at lists.openstack.org
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>>>
>>>>
>>>
>>>
>>> --
>>> Pavlik Salles Juan José
>>> Blog - http://viviendolared.blogspot.com
>>
>
>
> --
> Pavlik Salles Juan José
> Blog - http://viviendolared.blogspot.com