Hi Sean,

Thanks for your response. You are right: I just tested the cache options from the Django docs you sent (all 4 options), and now Horizon works even if not all memcached nodes are up. Thank you very much.

For others who want to set this up in a kolla-ansible deployment: I created the config file /etc/kolla/config/horizon/_9999-custom-settings.py with the following content:

CACHES['default']['OPTIONS'] = {
    "no_delay": True,
    "ignore_exc": True,
    "max_pool_size": 4,
    "use_pooling": True,
}

and redeployed Horizon:

kolla-ansible deploy -i inventory -t horizon

Kamil

On Mon, May 12, 2025 at 9:01 PM Sean Mooney <smooney@redhat.com> wrote:

On 12/05/2025 19:27, Kamil Madac wrote:
> I have deployed OpenStack 2024.2 with kolla-ansible in an HA setup with 3
> control nodes. Everything works without issues, but when I stop the
> memcached node with IP address 192.168.56.12 (or when that node goes
> down), it is not possible to log in to Horizon. It fails with the error message:
>
> Something went wrong!
> An unexpected error has occurred. Try refreshing the page. If that
> doesn't help, contact your local administrator.
>
> In the Horizon log I have this error message:
>
> 2025-05-12 18:05:37.406588 Internal Server Error: /
> 2025-05-12 18:05:37.406613 Traceback (most recent call last):
> 2025-05-12 18:05:37.406615   File
> "/var/lib/kolla/venv/lib64/python3.9/site-packages/django/core/handlers/exception.py",
> line 55, in inner
> 2025-05-12 18:05:37.406617     response = get_response(request)
> 2025-05-12 18:05:37.406619   File
> "/var/lib/kolla/venv/lib/python3.9/site-packages/horizon/middleware/simultaneous_sessions.py",
> line 30, in __call__
> 2025-05-12 18:05:37.406621     self._process_request(request)
> 2025-05-12 18:05:37.406623   File
> "/var/lib/kolla/venv/lib/python3.9/site-packages/horizon/middleware/simultaneous_sessions.py",
> line 37, in _process_request
> 2025-05-12 18:05:37.406625     cache_value = cache.get(cache_key)
> 2025-05-12 18:05:37.406627   File
> "/var/lib/kolla/venv/lib64/python3.9/site-packages/django/core/cache/backends/memcached.py",
> line 75, in get
> 2025-05-12 18:05:37.406628     return self._cache.get(key, default)
> 2025-05-12 18:05:37.406630   File
> "/var/lib/kolla/venv/lib/python3.9/site-packages/pymemcache/client/hash.py",
> line 347, in get
> 2025-05-12 18:05:37.406632     return self._run_cmd("get", key,
> default, default=default, **kwargs)
> 2025-05-12 18:05:37.406634   File
> "/var/lib/kolla/venv/lib/python3.9/site-packages/pymemcache/client/hash.py",
> line 322, in _run_cmd
> 2025-05-12 18:05:37.406636     return self._safely_run_func(client,
> func, default_val, *args, **kwargs)
> 2025-05-12 18:05:37.406637   File
> "/var/lib/kolla/venv/lib/python3.9/site-packages/pymemcache/client/hash.py",
> line 199, in _safely_run_func
> 2025-05-12 18:05:37.406639     result = func(*args, **kwargs)
> 2025-05-12 18:05:37.406640   File
> "/var/lib/kolla/venv/lib/python3.9/site-packages/pymemcache/client/base.py",
> line 687, in get
> 2025-05-12 18:05:37.406642     return self._fetch_cmd(b"get", [key],
> False, key_prefix=self.key_prefix).get(
> 2025-05-12 18:05:37.406644   File
> "/var/lib/kolla/venv/lib/python3.9/site-packages/pymemcache/client/base.py",
> line 1133, in _fetch_cmd
> 2025-05-12 18:05:37.406645     self._connect()
> 2025-05-12 18:05:37.406647   File
> "/var/lib/kolla/venv/lib/python3.9/site-packages/pymemcache/client/base.py",
> line 424, in _connect
> 2025-05-12 18:05:37.406648     sock.connect(sockaddr)
> 2025-05-12 18:05:37.406650 ConnectionRefusedError: [Errno 111]
> Connection refused
>
> So Horizon tries to connect to the memcached node which is down. I have
> the default kolla-ansible config with memcached enabled, and the Horizon
> config is as follows:
>
>
> SESSION_ENGINE = 'django.contrib.sessions.backends.cache'
> CACHES['default']['LOCATION'] = ['192.168.56.11:11211',
> '192.168.56.12:11211', '192.168.56.21:11211']
>
> When I stop any other memcached node, Horizon works without issues.
>
> Why is that exact node important for Horizon?

It's not just important for Horizon.

In most services we take the list of memcached servers and then
distribute the cache keys across them.

memcached is not a clustered solution like a DB, where you can write to
one peer and read from another and expect to get consistent results.

If one instance goes down, all keys associated with that instance are
unavailable and typically lost.
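
To illustrate why one specific node can matter, here is a minimal sketch of
client-side key distribution. It is a toy under stated assumptions:
pick_server and get are hypothetical helpers, and the simple hash-modulo
scheme only stands in for whatever hashing the real client (pymemcache's
hash client in the traceback) actually uses.

```python
import hashlib

# Server list taken from the Horizon config quoted in this thread.
SERVERS = ["192.168.56.11:11211", "192.168.56.12:11211", "192.168.56.21:11211"]


def pick_server(key, servers=SERVERS):
    """Map a key to exactly one server (hypothetical hash-modulo scheme)."""
    digest = hashlib.sha256(key.encode()).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


def get(key, down=frozenset()):
    """Pretend lookup: fails only if the key's own server is down."""
    server = pick_server(key)
    if server in down:
        raise ConnectionRefusedError(f"{server} is down")
    return None  # an empty cache: every reachable lookup is a miss
```

With one server marked as down, only the keys that hash to that server
raise; lookups for all other keys still go through. That would match the
observation that stopping one particular node breaks a given request while
stopping another does not.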

Most OpenStack services will either catch the connection issue and
internally tolerate it as if it were a cache miss, or have oslo do that
for them. But this looks like the caching is not using oslo but Django.
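
As a rough sketch of what "tolerate it as if it were a cache miss" means
(not the actual oslo or Horizon code; tolerant_get and broken_backend are
hypothetical names used only for illustration):

```python
def tolerant_get(cache_get, key, default=None):
    """Call cache_get(key); treat connection failures as a cache miss."""
    try:
        return cache_get(key)
    except OSError:
        # ConnectionRefusedError (as in the traceback) is a subclass of OSError.
        return default


def broken_backend(key):
    """Stand-in for a cache client whose server is unreachable."""
    raise ConnectionRefusedError(111, "Connection refused")


print(tolerant_get(broken_backend, "session:abc"))  # prints None
```

The caller then sees the same result as for a key that was never cached,
instead of an unhandled exception.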

Django can also be configured to do that:

```
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": "127.0.0.1:11211",
        "OPTIONS": {
            "no_delay": True,
            "ignore_exc": True,
            "max_pool_size": 4,
            "use_pooling": True,
        },
    }
}
```

"ignore_exc": True seems to be the relevant parameter. That is one of the
examples in https://docs.djangoproject.com/en/5.2/topics/cache/#cache-arguments

It would appear that Django, at least as it's used by Horizon, is not
fault tolerant to memcached outages, so if there is a connection issue
it will break. I'm not sure if that means Horizon is also not fault
tolerant to cache misses, but it's worth a try.

Perhaps a more fault-tolerant cache backend supported by Django is also
an option:

https://docs.djangoproject.com/en/5.2/topics/cache/#django-s-cache-framework

If you have Redis or Valkey, then perhaps
django.core.cache.backends.redis.RedisCache or one of the DB caches
would be an option, but I would first test adding the
"ignore_exc": True parameter to your config.

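For reference, a Redis-backed variant of the settings might look like the
sketch below. The backend path is Django's built-in Redis cache (available
since Django 4.0, and it needs the redis Python package); the address is
only a placeholder assumption, not something taken from this deployment,
and it has not been tested with Horizon.

```python
# Hypothetical cache settings; the Redis address is a placeholder.
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379",
    }
}
```
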
> Does anyone else have the same experience?
>
> Thanks for any advice.
>
> --
> Kamil Madac



--
Kamil Madac