Hi Eugen,

Thanks for your response. This is also a good idea. Since adding the 'ignore_exc' option works for us for now, I will stay with it, but if other issues come up with multiple memcached servers configured, we will switch to a localhost-only memcached client.
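
For reference, a minimal sketch of the 'ignore_exc' variant with the full server list. It assumes the PyMemcacheCache backend (the one visible in the traceback further down) and the three memcached addresses from the original report; where this override lives in a given kolla-ansible deployment is not shown here.

```python
# Sketch of Horizon/Django cache settings with "ignore_exc" enabled.
# The addresses are the ones from the original report; adjust to your nodes.
SESSION_ENGINE = "django.contrib.sessions.backends.cache"

CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": [
            "192.168.56.11:11211",
            "192.168.56.12:11211",
            "192.168.56.21:11211",
        ],
        "OPTIONS": {
            # Passed through to the pymemcache client: connection errors are
            # then treated like a cache miss instead of raising.
            "ignore_exc": True,
        },
    }
}
```

Since sessions are kept in the cache, any session stored on the stopped node should still be gone, so the affected users would have to log in again.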

Kamil


On Tue, May 13, 2025 at 8:23 AM Eugen Block <eblock@nde.ag> wrote:
Hi,

we were facing the same thing when we reinstalled our cloud to be 
highly available. We had a list of memcached servers for all the 
required services, and then noticed that a failed control node would 
disrupt our services. We could pinpoint it to memcached not being 
highly-available despite having a list of servers. So we decided to 
point all services to localhost only:

# nova
root@controller02:~# grep memcached /etc/nova/nova.conf
memcached_servers = localhost:11211

# Dashboard
CACHES = {
     'default': {
         'BACKEND': 'django.core.cache.backends.memcached.PyMemcacheCache',
         'LOCATION': '127.0.0.1:11211',
     },
}

This has been working great for years now.

Regards,
Eugen

Quoting Sean Mooney <smooney@redhat.com>:

> On 12/05/2025 19:27, Kamil Madac wrote:
>> I have deployed OpenStack 2024.2 with kolla-ansible in an HA setup
>> with 3 control nodes. Everything works without issues, but when I
>> stop the memcached node with IP address 192.168.56.12 (or when that
>> node goes down), it is not possible to log in to Horizon. It fails
>> with this error message:
>>
>> Something went wrong!
>> An unexpected error has occurred. Try refreshing the page. If that 
>> doesn't help, contact your local administrator.
>>
>> In the Horizon log I have this error message:
>>
>> 2025-05-12 18:05:37.406588 Internal Server Error: /
>> 2025-05-12 18:05:37.406613 Traceback (most recent call last):
>> 2025-05-12 18:05:37.406615   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/django/core/handlers/exception.py", line 55, in inner
>> 2025-05-12 18:05:37.406617     response = get_response(request)
>> 2025-05-12 18:05:37.406619   File "/var/lib/kolla/venv/lib/python3.9/site-packages/horizon/middleware/simultaneous_sessions.py", line 30, in __call__
>> 2025-05-12 18:05:37.406621     self._process_request(request)
>> 2025-05-12 18:05:37.406623   File "/var/lib/kolla/venv/lib/python3.9/site-packages/horizon/middleware/simultaneous_sessions.py", line 37, in _process_request
>> 2025-05-12 18:05:37.406625     cache_value = cache.get(cache_key)
>> 2025-05-12 18:05:37.406627   File "/var/lib/kolla/venv/lib64/python3.9/site-packages/django/core/cache/backends/memcached.py", line 75, in get
>> 2025-05-12 18:05:37.406628     return self._cache.get(key, default)
>> 2025-05-12 18:05:37.406630   File "/var/lib/kolla/venv/lib/python3.9/site-packages/pymemcache/client/hash.py", line 347, in get
>> 2025-05-12 18:05:37.406632     return self._run_cmd("get", key, default, default=default, **kwargs)
>> 2025-05-12 18:05:37.406634   File "/var/lib/kolla/venv/lib/python3.9/site-packages/pymemcache/client/hash.py", line 322, in _run_cmd
>> 2025-05-12 18:05:37.406636     return self._safely_run_func(client, func, default_val, *args, **kwargs)
>> 2025-05-12 18:05:37.406637   File "/var/lib/kolla/venv/lib/python3.9/site-packages/pymemcache/client/hash.py", line 199, in _safely_run_func
>> 2025-05-12 18:05:37.406639     result = func(*args, **kwargs)
>> 2025-05-12 18:05:37.406640   File "/var/lib/kolla/venv/lib/python3.9/site-packages/pymemcache/client/base.py", line 687, in get
>> 2025-05-12 18:05:37.406642     return self._fetch_cmd(b"get", [key], False, key_prefix=self.key_prefix).get(
>> 2025-05-12 18:05:37.406644   File "/var/lib/kolla/venv/lib/python3.9/site-packages/pymemcache/client/base.py", line 1133, in _fetch_cmd
>> 2025-05-12 18:05:37.406645     self._connect()
>> 2025-05-12 18:05:37.406647   File "/var/lib/kolla/venv/lib/python3.9/site-packages/pymemcache/client/base.py", line 424, in _connect
>> 2025-05-12 18:05:37.406648     sock.connect(sockaddr)
>> 2025-05-12 18:05:37.406650 ConnectionRefusedError: [Errno 111] Connection refused
>>
>> So Horizon tries to connect to the memcached node which is down. I
>> have the default kolla-ansible config with memcached enabled, and the
>> Horizon config is as follows:
>>
>>
>> SESSION_ENGINE = 'django.contrib.sessions.backends.cache'
>> CACHES['default']['LOCATION'] = ['192.168.56.11:11211','192.168.56.12:11211','192.168.56.21:11211']
>>
>> When I stop any other memcached node, Horizon keeps working without issues.
>>
>> Why is that exact node important for Horizon?
>
> It's not just important for Horizon.
>
> In most services we take the list of memcached servers and then
> distribute the cache keys across them.
>
> memcached is not a clustered solution like a DB, where you can write
> to one peer, read from another, and expect to get consistent results.
>
> If one instance goes down, all keys associated with that instance are
> unavailable and typically lost.
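
A rough illustration of the point above: each key belongs to exactly one server, so a key that lands on the stopped node fails every time while other keys keep working. This sketch uses plain modulo hashing for brevity; pymemcache's HashClient uses rendezvous hashing by default, as far as I know. The names `pick_server` and `get` and the example keys are made up for the illustration.

```python
import hashlib

SERVERS = ["192.168.56.11:11211", "192.168.56.12:11211", "192.168.56.21:11211"]


def pick_server(key, servers):
    """Map a cache key to exactly one server (simplified modulo hashing)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


def get(key, down=frozenset()):
    """Pretend lookup: fails if the one server owning the key is down."""
    server = pick_server(key, SERVERS)
    if server in down:
        raise ConnectionRefusedError(111, "Connection refused")
    return server


# Hypothetical keys, only for illustration.
keys = ["session-a", "session-b", "user-count", "token-1", "token-2", "token-3"]
for key in keys:
    print(f"{key} -> {pick_server(key, SERVERS)}")
```

A key always maps to the same server, so no other server is tried for it when that one is unreachable. That would explain why stopping one particular node breaks a given lookup while stopping the others does not.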
>
> Most OpenStack services will either catch the connection issue and
> internally tolerate it as if it were a cache miss, or have oslo do
> that for them. But this looks like the caching is not using oslo
> but Django.
>
> It can also be configured to do that:
>
> ```
> CACHES = {
>     "default": {
>         "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
>         "LOCATION": "127.0.0.1:11211",
>         "OPTIONS": {
>             "no_delay": True,
>             "ignore_exc": True,
>             "max_pool_size": 4,
>             "use_pooling": True,
>         },
>     }
> }
> ```
>
> "ignore_exc": True seems to be the relevant parameter; it appears in
> one of the examples in
> https://docs.djangoproject.com/en/5.2/topics/cache/#cache-arguments
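
Conceptually, the option makes a connection failure look like a cache miss. A minimal sketch of that behavior follows; it is not pymemcache's code, and `tolerant_get` and `fetch_from_dead_node` are hypothetical names used only here.

```python
def tolerant_get(fetch, key, default=None, ignore_exc=True):
    """Call fetch(key); on a connection error return default if ignore_exc."""
    try:
        return fetch(key)
    except OSError:  # ConnectionRefusedError is a subclass of OSError
        if not ignore_exc:
            raise
        return default


def fetch_from_dead_node(key):
    """Stand-in for a client whose memcached server is down."""
    raise ConnectionRefusedError(111, "Connection refused")


# With ignore_exc the failure looks like a cache miss (None here).
print(tolerant_get(fetch_from_dead_node, "some-key"))
```

The caller then has to cope with a miss, which is the open question raised just below about Horizon.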
>
> It would appear that Django, at least as it's used by Horizon, is not
> fault tolerant to memcached outages, so if there is a connection issue
> it will break. I'm not sure if that means Horizon is also not fault
> tolerant to cache misses, but it's worth a try.
>
> Perhaps a more fault-tolerant cache backend supported by Django is
> also an option:
>
> https://docs.djangoproject.com/en/5.2/topics/cache/#django-s-cache-framework
>
> If you have Redis or Valkey, then perhaps
> django.core.cache.backends.redis.RedisCache or one of the DB caches
> would be an option, but I would first test adding the
> "ignore_exc": True parameter to your config.
>
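
If someone wants to try the Redis route, the Django docs describe a configuration along these lines. The address and port are assumptions (a local Redis or Valkey on its default port), and the backend needs Django 4.0 or newer plus the redis Python package.

```python
# Sketch of Django cache settings using the built-in Redis backend.
# "redis://127.0.0.1:6379" is an assumed local instance; adjust as needed.
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379",
    }
}
```

This only swaps the backend; whether it tolerates an outage any better depends on how Redis itself is deployed.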
>> Does anyone else have the same experience?
>>
>> Thanks for any advice.
>>
>> --
>> Kamil Madac





--
Kamil Madac