[Openstack-operators] [keystone] Timeouts reading response headers in keystone-wsgi-public and keystone-wsgi-admin

Chris Apsey bitskrieg at bitskrieg.net
Sun Feb 25 00:58:06 UTC 2018


All,

As a follow-up to this, we eventually discovered that the issues were 
caused by a recent migration to a new public cloud provider (they host 
our stateful data - user accounts, etc.).  Our old cloud provider 
allowed publicly-routable IPs to be attached directly to instances, 
and our new one does not (they use 1:1 NAT).  The 1:1 NAT was causing 
issues with our LDAP provider (FreeIPA) and would occasionally cause 
the keystone-wsgi service to hang (but only when using the LDAP 
keystone domain - accounts in the default sql domain worked fine).  
Pointing keystone at a fully-routable address over a VPN, instead of 
at the 1:1 NAT address over the internet, made the issues disappear.  
This may have been more a case of FreeIPA not liking 1:1 NAT than 
anything else, but I thought I'd share it in case anyone runs into 
something similar in the future.
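
For anyone wanting the concrete change: the fix boiled down to swapping 
the LDAP URL in our domain-specific keystone config from the provider's 
1:1 NAT address to the IPA server's address over the VPN, then restarting 
apache2 so the keystone wsgi processes picked it up.  A rough sketch 
(assuming domain-specific config files are enabled in keystone.conf) - the 
hostnames and DNs below are placeholders, not our real values:

    # /etc/keystone/domains/keystone.DOMAINNAME.conf
    [identity]
    driver = ldap

    [ldap]
    # old: url = ldaps://ipa.nat.example.com    (1:1 NAT address, intermittent hangs)
    url = ldaps://ipa.vpn.example.com            # routable address over the VPN
    user_tree_dn = cn=users,cn=accounts,dc=example,dc=com
    group_tree_dn = cn=groups,cn=accounts,dc=example,dc=com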

---
v/r

Chris Apsey
bitskrieg at bitskrieg.net
https://www.bitskrieg.net

On 2018-02-20 10:27 PM, Chris Apsey wrote:
> All,
> 
> We are currently experiencing a sporadic issue with our keystone
> endpoints.  Throughout the day, keystone will simply stop responding
> on both the admin and public endpoints, which causes all services to
> hang.  Restarting apache2 clears it for a while, but the problem
> inevitably comes back later on.  Here is what I am seeing:
> 
> keystone: /var/log/apache2/keystone.log
> 
>     2018-02-20 21:50:38.830302 Timeout when reading response headers
> from daemon process 'keystone-admin': /usr/bin/keystone-wsgi-admin
>     2018-02-20 21:50:50.799587 Timeout when reading response headers
> from daemon process 'keystone-admin': /usr/bin/keystone-wsgi-admin
>     2018-02-20 21:51:02.857266 Timeout when reading response headers
> from daemon process 'keystone-admin': /usr/bin/keystone-wsgi-admin
>     2018-02-20 21:51:02.879630 mod_wsgi (pid=1221): Exception occurred
> processing WSGI script '/usr/bin/keystone-wsgi-admin'.
>     2018-02-20 21:51:02.879796 IOError: failed to write data
>     2018-02-20 21:51:07.005702 mod_wsgi (pid=1220): Exception occurred
> processing WSGI script '/usr/bin/keystone-wsgi-admin'.
> 
> horizon: /var/log/apache2/error.log
> 
>     [Tue Feb 20 21:47:02.582511 2018] [wsgi:error] [pid 1227:tid
> 140591048296192] [client 10.10.5.200:57462] Timeout when reading
> response headers from daemon process 'horizon':
> /usr/share/openstack-dashboard/openstack_dashboard/wsgi/django.wsgi,
> referer:
> https://vta.cybbh.space/horizon/project/instances/900e9d57-752d-488c-8dba-ffc098e1a51a/
>     [Tue Feb 20 21:48:03.962589 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] ERROR openstack_auth.user Unable to retrieve project
> list.
>     [Tue Feb 20 21:48:03.962646 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] Traceback (most recent call last):
>     [Tue Feb 20 21:48:03.962656 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]   File
> "/usr/lib/python2.7/dist-packages/openstack_auth/user.py", line 350,
> in authorized_tenants
>     [Tue Feb 20 21:48:03.962665 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]     is_federated=self.is_federated)
>     [Tue Feb 20 21:48:03.962673 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]   File
> "/usr/lib/python2.7/dist-packages/openstack_auth/utils.py", line 372,
> in get_project_list
>     [Tue Feb 20 21:48:03.962734 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]     projects =
> client.projects.list(user=kwargs.get('user_id'))
>     [Tue Feb 20 21:48:03.962744 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]   File
> "/usr/lib/python2.7/dist-packages/positional/__init__.py", line 101,
> in inner
>     [Tue Feb 20 21:48:03.962752 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]     return wrapped(*args, **kwargs)
>     [Tue Feb 20 21:48:03.962759 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]   File
> "/usr/lib/python2.7/dist-packages/keystoneclient/v3/projects.py", line
> 119, in list
>     [Tue Feb 20 21:48:03.962767 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]     **kwargs)
>     [Tue Feb 20 21:48:03.962774 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]   File
> "/usr/lib/python2.7/dist-packages/keystoneclient/base.py", line 75, in
> func
>     [Tue Feb 20 21:48:03.962782 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]     return f(*args, **new_kwargs)
>     [Tue Feb 20 21:48:03.962789 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]   File
> "/usr/lib/python2.7/dist-packages/keystoneclient/base.py", line 390,
> in list
>     [Tue Feb 20 21:48:03.962796 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]     self.collection_key)
>     [Tue Feb 20 21:48:03.962803 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]   File
> "/usr/lib/python2.7/dist-packages/keystoneclient/base.py", line 125,
> in _list
>     [Tue Feb 20 21:48:03.962811 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]     resp, body = self.client.get(url, **kwargs)
>     [Tue Feb 20 21:48:03.962818 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]   File
> "/usr/lib/python2.7/dist-packages/keystoneauth1/adapter.py", line 288,
> in get
>     [Tue Feb 20 21:48:03.962826 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]     return self.request(url, 'GET', **kwargs)
>     [Tue Feb 20 21:48:03.962833 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]   File
> "/usr/lib/python2.7/dist-packages/keystoneauth1/adapter.py", line 447,
> in request
>     [Tue Feb 20 21:48:03.962841 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]     resp = super(LegacyJsonAdapter,
> self).request(*args, **kwargs)
>     [Tue Feb 20 21:48:03.962848 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]   File
> "/usr/lib/python2.7/dist-packages/keystoneauth1/adapter.py", line 192,
> in request
>     [Tue Feb 20 21:48:03.962855 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]     return self.session.request(url, method,
> **kwargs)
>     [Tue Feb 20 21:48:03.962863 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]   File
> "/usr/lib/python2.7/dist-packages/positional/__init__.py", line 101,
> in inner
>     [Tue Feb 20 21:48:03.962870 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]     return wrapped(*args, **kwargs)
>     [Tue Feb 20 21:48:03.962877 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]   File
> "/usr/lib/python2.7/dist-packages/keystoneauth1/session.py", line 703,
> in request
>     [Tue Feb 20 21:48:03.962885 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]     resp = send(**kwargs)
>     [Tue Feb 20 21:48:03.962892 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]   File
> "/usr/lib/python2.7/dist-packages/keystoneauth1/session.py", line 777,
> in _send_request
>     [Tue Feb 20 21:48:03.962899 2018] [wsgi:error] [pid 1225:tid
> 140591249823488]     raise exceptions.ConnectFailure(msg)
>     [Tue Feb 20 21:48:03.962907 2018] [wsgi:error] [pid 1225:tid
> 140591249823488] ConnectFailure: Unable to establish connection to
> https://*******:5000/v3/users/7e68b998ee1ec26139d3482818c9643d1ce3b5aff532c865cff65e1c9fe01306/projects?:
> ('Connection aborted.', BadStatusLine("''",))
> 
> 
> I get the same behavior regardless of service, and regardless of
> whether I use the CLI or Horizon.  All signs point to keystone being
> the culprit.
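
Looking back at this: a quick way to confirm whether keystone itself or 
the proxy/network path is the part that is wedged is to hit the version 
document on each backend and on the VIP directly and compare.  The 
addresses here match the haproxy config quoted further down:

    # each keystone backend directly
    curl -s -o /dev/null -w '%{http_code}\n' http://10.10.5.120:5000/v3/
    curl -s -o /dev/null -w '%{http_code}\n' http://10.10.5.121:5000/v3/
    # through the haproxy VIP (internal, non-SSL bind)
    curl -s -o /dev/null -w '%{http_code}\n' http://10.10.5.200:5000/v3/

If the backends answer 200 but the VIP does not, the problem is in front 
of keystone; in our case the keystone wsgi processes themselves had 
stopped answering.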
> 
> I have adjusted my /etc/apache2/sites-available/keystone.conf:
> 
> WSGIDaemonProcess keystone-public processes=8 threads=4 user=keystone
> group=keystone display-name=%{GROUP}
> WSGIDaemonProcess keystone-admin processes=8 threads=4 user=keystone
> group=keystone display-name=%{GROUP}
> 
> And ensured that WSGIApplicationGroup %{GLOBAL} is present.
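
One thing worth adding in hindsight: as far as I can tell, the "Timeout 
when reading response headers" message is governed by the socket timeout 
between the Apache worker and the mod_wsgi daemon process, and newer 
mod_wsgi (the 4.x packages on Ubuntu) also supports a per-daemon 
request-timeout that restarts a daemon process whose requests run too 
long, instead of leaving it wedged until the next manual apache2 restart.  
A sketch of what that could look like on top of the directives above (the 
60-second values are arbitrary; check the mod_wsgi docs for the exact 
semantics when threads > 1):

    WSGIDaemonProcess keystone-public processes=8 threads=4 user=keystone group=keystone display-name=%{GROUP} socket-timeout=60 request-timeout=60
    WSGIDaemonProcess keystone-admin processes=8 threads=4 user=keystone group=keystone display-name=%{GROUP} socket-timeout=60 request-timeout=60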
> 
> haproxy is sitting in between keystone and all other services, and is
> configured as follows:
> 
> defaults
>   log  global
>   maxconn  16384
>   option  redispatch
>   retries  3
>   timeout  http-request 30s
>   timeout  queue 1m
>   timeout  connect 30s
>   timeout  client 2m
>   timeout  server 2m
>   timeout  check 10s
> 
> ...
> 
> listen keystone_admin_cluster
>   bind 10.10.5.200:35357
>   balance  source
>   option  tcpka
>   option  httpchk
>   option  tcplog
>   server keystone-0 10.10.5.120:35357 check inter 2000 rise 2 fall 5
>   server keystone-1 10.10.5.121:35357 check inter 2000 rise 2 fall 5
> 
> listen keystone_public_internal_cluster
>   bind 10.50.10.0:5000 ssl crt /etc/letsencrypt/live/*****/master.pem
>   bind 10.10.5.200:5000
>   balance  roundrobin
>   option  tcpka
>   option  httpchk
>   option  tcplog
>   server keystone-0 10.10.5.120:5000 check inter 2000 rise 2 fall 5
>   server keystone-1 10.10.5.121:5000 check inter 2000 rise 2 fall 5
> 
> ...
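
On the haproxy side, the bare "option httpchk" above only sends an 
OPTIONS / probe, which doesn't say much about whether keystone can 
actually answer an API request; pointing the check at a concrete path 
would probably take a dead backend out of rotation sooner.  A sketch 
using standard haproxy directives (the path is just keystone's version 
document):

    listen keystone_public_internal_cluster
      ...
      option  httpchk GET /v3
      http-check expect status 200
      ...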
> 
> 
> Any ideas on where else I should look?
> 
> Thanks in advance,
> 
> ---
> v/r
> 
> Chris Apsey
> bitskrieg at bitskrieg.net
> https://www.bitskrieg.net


