[Openstack-operators] [keystone] Timeouts reading response headers in keystone-wsgi-public and keystone-wsgi-admin
Chris Apsey
bitskrieg at bitskrieg.net
Sun Feb 25 00:58:06 UTC 2018
All,
As a follow up to this, we eventually discovered that the issues were
caused by a recent migration to a new public cloud provider (they host
our stateful data - user accounts, etc.). Our old cloud provider
allowed for publicly-routable IPs to be directly attached to instances,
and our new one does not (they utilize 1:1 NAT). The 1:1 NAT was
causing issued with our LDAP provider (ipa) and would occasionally cause
the keystone-wsgi service to hang (but only when leveraging the LDAP
keystone domain - accounts in the default sql domain worked fine).
Pointing keystone at a fully-routable address over a VPN instead of at
the 1:1 NAT address over the internet caused the issues to disappear.
This may have been more of a FreeIPA not liking 1:1 NAT more than
anything else, but just thought I'd share in case anyone ran into
something similar in the future.
---
v/r
Chris Apsey
bitskrieg at bitskrieg.net
https://www.bitskrieg.net
On 2018-02-20 10:27 PM, Chris Apsey wrote:
> All,
>
> Currently experiencing a sporadic issue with our keystone endpoints.
> Throughout the day, keystone will just stop responding on both the
> admin and public endpoints, which will cause all services to hang.
> Restarting apache2 fixes the issue for some amount of time, but it
> inevitably appears again later on. Here is what I am seeing:
>
> keystone: /var/log/apache2/keystone.log
>
> 2018-02-20 21:50:38.830302 Timeout when reading response headers
> from daemon process 'keystone-admin': /usr/bin/keystone-wsgi-admin
> 2018-02-20 21:50:50.799587 Timeout when reading response headers
> from daemon process 'keystone-admin': /usr/bin/keystone-wsgi-admin
> 2018-02-20 21:51:02.857266 Timeout when reading response headers
> from daemon process 'keystone-admin': /usr/bin/keystone-wsgi-admin
> 2018-02-20 21:51:02.879630 mod_wsgi (pid=1221): Exception occurred
> processing WSGI script '/usr/bin/keystone-wsgi-admin'.
> 2018-02-20 21:51:02.879796 IOError: failed to write data
> 2018-02-20 21:51:07.005702 mod_wsgi (pid=1220): Exception occurred
> processing WSGI script '/usr/bin/keystone-wsgi-admin'.
>
> horizon: /var/log/apache2/error.log
>
> [Tue Feb 20 21:47:02.582511 2018] [wsgi:error] [pid 1227:tid
> 140591048296192] [client 10.10.5.200:57462] Timeout when reading
> response headers from daemon process 'horizon':
> /usr/share/openstack-dashboard/openstack_dashboard/wsgi/django.wsgi,
> referer:
> https://vta.cybbh.space/horizon/project/instances/900e9d57-752d-488c-8dba-ffc098e1a51a/
> [Tue Feb 20 21:48:03.962589 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] ERROR openstack_auth.user Unable to retrieve project
> list.
> [Tue Feb 20 21:48:03.962646 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] Traceback (most recent call last):
> [Tue Feb 20 21:48:03.962656 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] File
> "/usr/lib/python2.7/dist-packages/openstack_auth/user.py", line 350,
> in authorized_tenants
> [Tue Feb 20 21:48:03.962665 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] is_federated=self.is_federated)
> [Tue Feb 20 21:48:03.962673 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] File
> "/usr/lib/python2.7/dist-packages/openstack_auth/utils.py", line 372,
> in get_project_list
> [Tue Feb 20 21:48:03.962734 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] projects =
> client.projects.list(user=kwargs.get('user_id'))
> [Tue Feb 20 21:48:03.962744 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] File
> "/usr/lib/python2.7/dist-packages/positional/__init__.py", line 101,
> in inner
> [Tue Feb 20 21:48:03.962752 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] return wrapped(*args, **kwargs)
> [Tue Feb 20 21:48:03.962759 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] File
> "/usr/lib/python2.7/dist-packages/keystoneclient/v3/projects.py", line
> 119, in list
> [Tue Feb 20 21:48:03.962767 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] **kwargs)
> [Tue Feb 20 21:48:03.962774 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] File
> "/usr/lib/python2.7/dist-packages/keystoneclient/base.py", line 75, in
> func
> [Tue Feb 20 21:48:03.962782 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] return f(*args, **new_kwargs)
> [Tue Feb 20 21:48:03.962789 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] File
> "/usr/lib/python2.7/dist-packages/keystoneclient/base.py", line 390,
> in list
> [Tue Feb 20 21:48:03.962796 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] self.collection_key)
> [Tue Feb 20 21:48:03.962803 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] File
> "/usr/lib/python2.7/dist-packages/keystoneclient/base.py", line 125,
> in _list
> [Tue Feb 20 21:48:03.962811 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] resp, body = self.client.get(url, **kwargs)
> [Tue Feb 20 21:48:03.962818 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] File
> "/usr/lib/python2.7/dist-packages/keystoneauth1/adapter.py", line 288,
> in get
> [Tue Feb 20 21:48:03.962826 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] return self.request(url, 'GET', **kwargs)
> [Tue Feb 20 21:48:03.962833 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] File
> "/usr/lib/python2.7/dist-packages/keystoneauth1/adapter.py", line 447,
> in request
> [Tue Feb 20 21:48:03.962841 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] resp = super(LegacyJsonAdapter,
> self).request(*args, **kwargs)
> [Tue Feb 20 21:48:03.962848 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] File
> "/usr/lib/python2.7/dist-packages/keystoneauth1/adapter.py", line 192,
> in request
> [Tue Feb 20 21:48:03.962855 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] return self.session.request(url, method,
> **kwargs)
> [Tue Feb 20 21:48:03.962863 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] File
> "/usr/lib/python2.7/dist-packages/positional/__init__.py", line 101,
> in inner
> [Tue Feb 20 21:48:03.962870 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] return wrapped(*args, **kwargs)
> [Tue Feb 20 21:48:03.962877 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] File
> "/usr/lib/python2.7/dist-packages/keystoneauth1/session.py", line 703,
> in request
> [Tue Feb 20 21:48:03.962885 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] resp = send(**kwargs)
> [Tue Feb 20 21:48:03.962892 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] File
> "/usr/lib/python2.7/dist-packages/keystoneauth1/session.py", line 777,
> in _send_request
> [Tue Feb 20 21:48:03.962899 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] raise exceptions.ConnectFailure(msg)
> [Tue Feb 20 21:48:03.962907 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] ConnectFailure: Unable to establish connection to
> https://*******:5000/v3/users/7e68b998ee1ec26139d3482818c9643d1ce3b5aff532c865cff65e1c9fe01306/projects?:
> ('Connection aborted.', BadStatusLine("''",))
>
>
> I get the same behavior regardless of service and regardless of
> whether or not I use the CLI or Horizon. All signs point to keystone
> being the culprit.
>
> I have adjusted my /etc/apache2/sites-available/keystone.conf:
>
> WSGIDaemonProcess keystone-public processes=8 threads=4 user=keystone
> group=keystone display-name=%{GROUP}
> WSGIDaemonProcess keystone-admin processes=8 threads=4 user=keystone
> group=keystone display-name=%{GROUP}
>
> And ensured that WSGIApplicationGroup %{GLOBAL} is present.
>
> haproxy is sitting in between keystone and all other services, and is
> configured as follows:
>
> defaults
> log global
> maxconn 16384
> option redispatch
> retries 3
> timeout http-request 30s
> timeout queue 1m
> timeout connect 30s
> timeout client 2m
> timeout server 2m
> timeout check 10s
>
> ...
>
> listen keystone_admin_cluster
> bind 10.10.5.200:35357
> balance source
> option tcpka
> option httpchk
> option tcplog
> server keystone-0 10.10.5.120:35357 check inter 2000 rise 2 fall 5
> server keystone-1 10.10.5.121:35357 check inter 2000 rise 2 fall 5
>
> listen keystone_public_internal_cluster
> bind 10.50.10.0:5000 ssl crt /etc/letsencrypt/live/*****/master.pem
> bind 10.10.5.200:5000
> balance roundrobin
> option tcpka
> option httpchk
> option tcplog
> server keystone-0 10.10.5.120:5000 check inter 2000 rise 2 fall 5
> server keystone-1 10.10.5.121:5000 check inter 2000 rise 2 fall 5
>
> ...
>
>
> Any ideas on where else I should look?
>
> Thanks in advance,
>
> ---
> v/r
>
> Chris Apsey
> bitskrieg at bitskrieg.net
> https://www.bitskrieg.net
More information about the OpenStack-operators
mailing list