[openstack-dev] Attempting to proxy websockets through Apache or HAProxy for Zaqar

Dan Trainor dtrainor at redhat.com
Tue Jan 17 17:23:31 UTC 2017


Hi -

In an attempt to work on [0], I've been experimenting with proxying all of
the service API endpoints that the UI needs to communicate with through
either haproxy or Apache.  The goal is to avoid a bug[1] in how non-Chrome
browsers handle SSL connections to different ports on the same domain.

The blueprint suggests using haproxy for this, but we're currently using
the "old" notation of listen/server, not frontend/backend.  The distinction
is important because the ACLs that would allow any kind of proxying to
facilitate this are only available in the latter notation.  In order to do
this in haproxy, tripleo::haproxy would need a rewrite (looks pretty
trivial, but likely out of scope for this).  So I'd really like to isolate
this to UI, which is convenient since UI runs largely self-contained inside
Apache.

I've made good progress with almost all of the services, since they were
pretty straightforward - mod_proxy handles them just fine.  The one I'm
not able to make work right now is the Websocket service that the UI uses.
I see the connection get upgraded and the Websocket open, but it then
stays open indefinitely and never carries more than 0 bytes: no data is
transferred from the browser over the Websocket, and the UI does not
complete any operations that depend on the Zaqar Websocket.

Observing trace6[4] output, I can see mod_proxy_wstunnel (which relies on
mod_proxy) make the connection, and I can see Zaqar recognize the request
in its logs, but the client (UI) doesn't send or receive any data over it.
It's as if the persistent Websocket connection just dies immediately after
the Upgrade[2].
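
For anyone trying to reproduce this level of logging: Apache 2.4 accepts
per-module log levels, so something along these lines should raise only the
proxy-related modules to trace level.  This is a sketch under assumptions
(the "info" baseline and the exact set of modules are illustrative, not
necessarily the configuration that produced [4]):

  # Assumed baseline of "info"; only the proxy-related modules are raised
  LogLevel info core:trace6 proxy:trace6 proxy_http:trace6 proxy_wstunnel:trace6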

I've had limited success using a couple of different implementations of
this in Apache.  ProxyPass/ProxyPassReverse looks as if it should work (so
long as mod_proxy_wstunnel is loaded), but that has not been my experience.
Using a mod_rewrite rule[3] to force the Websocket proxy for a specific
URI (/zaqar) has the same outcome.
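
For reference, a mod_rewrite rule of this kind generally takes the shape
below.  This is a sketch of the common pattern, not necessarily the exact
rule tried here; it assumes mod_rewrite, mod_proxy and mod_proxy_wstunnel
are all loaded, and it reuses the backend address from this setup:

  RewriteEngine On
  # Only proxy requests that ask for a Websocket upgrade
  RewriteCond %{HTTP:Upgrade} websocket [NC]
  RewriteCond %{HTTP:Connection} upgrade [NC]
  # [P] hands the request to mod_proxy; the ws:// scheme selects
  # mod_proxy_wstunnel as the handler
  RewriteRule ^/zaqar/?(.*)$ "ws://192.0.2.1:9000/$1" [P,L]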

In its simplest form, the ProxyPass rule I'm attempting to use is:

  ProxyPass             "/zaqar"        "ws://192.0.2.1:9000/"
  ProxyPassReverse      "/zaqar"        "ws://192.0.2.1:9000/"

Note that I've tried several variations of both ProxyPass configurations
and mod_rewrite rules using the [P] flag, all of which seem to net the same
result.  I've also tried writing the functional equivalent in haproxy using
frontend/backend notation, to determine whether this was a protocol problem
or a software problem (that is, whether haproxy could do this where Apache
could not).

From the top, here are some Apache logs (trace6 is noisy, so I just grep'd
for ws, wss, and the Websocket port (9000); the full logs of this request
are at [4]):

[Tue Jan 17 12:08:16.639170 2017] [proxy_wstunnel:debug] [pid 32128] mod_proxy_wstunnel.c(253): [client 192.0.2.1:51508] AH02445: woke from poll(), i=1
[Tue Jan 17 12:08:16.639220 2017] [proxy_wstunnel:debug] [pid 32128] mod_proxy_wstunnel.c(278): [client 192.0.2.1:51508] AH02448: client was readable
[Tue Jan 17 12:08:16.639265 2017] [core:trace6] [pid 32128] core_filters.c(525): [remote 192.0.2.1:9000] core_output_filter: flushing because of FLUSH bucket
[Tue Jan 17 12:08:16.639337 2017] [proxy_wstunnel:trace2] [pid 32128] mod_proxy_wstunnel.c(295): [client 192.0.2.1:51508] finished with poll() - cleaning up
[Tue Jan 17 12:08:16.640023 2017] [proxy:debug] [pid 32128] proxy_util.c(2218): AH00943: WS: has released connection for (192.0.2.1)
[Tue Jan 17 12:08:19.238044 2017] [core:trace5] [pid 32128] protocol.c(618): [client 192.0.2.1:51996] Request received from client: GET /zaqar HTTP/1.1
[Tue Jan 17 12:08:19.238191 2017] [core:trace3] [pid 32128] request.c(293): [client 192.0.2.1:51996] request authorized without authentication by access_checker_ex hook: /zaqar
[Tue Jan 17 12:08:19.238202 2017] [proxy_wstunnel:trace1] [pid 32128] mod_proxy_wstunnel.c(51): [client 192.0.2.1:51996] canonicalising URL //192.0.2.1:9000/
[Tue Jan 17 12:08:19.238223 2017] [proxy:trace2] [pid 32128] proxy_util.c(1985): [client 192.0.2.1:51996] ws: found worker ws://192.0.2.1:9000/ for ws://192.0.2.1:9000/
[Tue Jan 17 12:08:19.238227 2017] [proxy:debug] [pid 32128] mod_proxy.c(1117): [client 192.0.2.1:51996] AH01143: Running scheme ws handler (attempt 0)
[Tue Jan 17 12:08:19.238231 2017] [proxy_http:debug] [pid 32128] mod_proxy_http.c(1925): [client 192.0.2.1:51996] AH01113: HTTP: declining URL ws://192.0.2.1:9000/
[Tue Jan 17 12:08:19.238236 2017] [proxy_wstunnel:debug] [pid 32128] mod_proxy_wstunnel.c(333): [client 192.0.2.1:51996] AH02451: serving URL ws://192.0.2.1:9000/
[Tue Jan 17 12:08:19.238239 2017] [proxy:debug] [pid 32128] proxy_util.c(2203): AH00942: WS: has acquired connection for (192.0.2.1)
[Tue Jan 17 12:08:19.238244 2017] [proxy:debug] [pid 32128] proxy_util.c(2256): [client 192.0.2.1:51996] AH00944: connecting ws://192.0.2.1:9000/ to 192.0.2.1:9000
[Tue Jan 17 12:08:19.238249 2017] [proxy:debug] [pid 32128] proxy_util.c(2422): [client 192.0.2.1:51996] AH00947: connected / to 192.0.2.1:9000
[Tue Jan 17 12:08:19.238263 2017] [proxy_wstunnel:trace2] [pid 32128] mod_proxy_wstunnel.c(192): [client 192.0.2.1:51996] sending request
[Tue Jan 17 12:08:19.238283 2017] [core:trace6] [pid 32128] core_filters.c(525): [remote 192.0.2.1:9000] core_output_filter: flushing because of FLUSH bucket
[Tue Jan 17 12:08:19.238333 2017] [proxy_wstunnel:trace2] [pid 32128] mod_proxy_wstunnel.c(210): [client 192.0.2.1:51996] setting up poll()
[Tue Jan 17 12:08:19.238348 2017] [proxy_wstunnel:debug] [pid 32128] mod_proxy_wstunnel.c(253): [client 192.0.2.1:51996] AH02445: woke from poll(), i=1
[Tue Jan 17 12:08:19.238351 2017] [proxy_wstunnel:debug] [pid 32128] mod_proxy_wstunnel.c(262): [client 192.0.2.1:51996] AH02446: sock was readable
[Tue Jan 17 12:08:19.238390 2017] [proxy_wstunnel:trace2] [pid 32128] mod_proxy_wstunnel.c(295): [client 192.0.2.1:51996] finished with poll() - cleaning up
[Tue Jan 17 12:08:19.238394 2017] [proxy:debug] [pid 32128] proxy_util.c(2218): AH00943: WS: has released connection for (192.0.2.1)

Note that mod_proxy_wstunnel is picking up the connection (the AH01113 is
mod_proxy_http declining the connection, since it's a wstunnel connection,
not a typical http connection):

[Tue Jan 17 12:08:19.238231 2017] [proxy_http:debug] [pid 32128] mod_proxy_http.c(1925): [client 192.0.2.1:51996] AH01113: HTTP: declining URL ws://192.0.2.1:9000/
[Tue Jan 17 12:08:19.238236 2017] [proxy_wstunnel:debug] [pid 32128] mod_proxy_wstunnel.c(333): [client 192.0.2.1:51996] AH02451: serving URL ws://192.0.2.1:9000/

Here's Zaqar's Websocket transport answering the request and creating both
a queue and a subscription, but there is no data after that:

2017-01-17 12:08:00.902 31693 INFO zaqar.transport.websocket.protocol [-] Client connecting: tcp:192.0.2.1:59954
2017-01-17 12:08:00.903 31693 INFO zaqar.transport.websocket.protocol [-] WebSocket connection open.
2017-01-17 12:08:00.984 31693 DEBUG zaqar.transport.auth [-] Installing Keystone's auth protocol install /usr/lib/python2.7/site-packages/zaqar/transport/auth.py:53
2017-01-17 12:08:00.994 31693 WARNING keystonemiddleware.auth_token [-] Using the in-process token cache is deprecated as of the 4.2.0 release and may be removed in the 5.0.0 release or the 'O' development cycle. The in-process cache causes inconsistent results and high memory usage. When the feature is removed the auth_token middleware will not cache tokens by default which may result in performance issues. It is recommended to use memcache for the auth_token token cache by setting the memcached_servers option.
2017-01-17 12:08:01.637 31693 INFO zaqar.transport.websocket.protocol [-] Response: API v2 txt, 200. Request: action "authenticate", body {}.
2017-01-17 12:08:01.638 31693 DEBUG zaqar.api.v2.endpoints [-] Queue create - queue: tripleo, project: 27b25cdc7ee4407491e959bb357c7c0e queue_create /usr/lib/python2.7/site-packages/zaqar/api/v2/endpoints.py:101
2017-01-17 12:08:01.638 31693 DEBUG zaqar.common.pipeline [-] Stage <zaqar.storage.mongodb.messages.MessageQueueHandler object at 0x2de1d50> does not implement create consumer /usr/lib/python2.7/site-packages/zaqar/common/pipeline.py:94
2017-01-17 12:08:01.640 31693 INFO zaqar.transport.websocket.protocol [-] Response: API v2 txt, 204. Request: action "queue_create", body {"queue_name": "tripleo"}.
2017-01-17 12:08:01.641 31693 DEBUG zaqar.api.v2.endpoints [-] Subscription create - queue: tripleo, project: 27b25cdc7ee4407491e959bb357c7c0e subscription_create /usr/lib/python2.7/site-packages/zaqar/api/v2/endpoints.py:810
2017-01-17 12:08:01.641 31693 DEBUG zaqar.common.pipeline [-] Stage <zaqar.storage.mongodb.messages.MessageQueueHandler object at 0x2de1d50> does not implement exists consumer /usr/lib/python2.7/site-packages/zaqar/common/pipeline.py:94
2017-01-17 12:08:01.644 31693 INFO zaqar.transport.websocket.protocol [-] Response: API v2 txt, 201. Request: action "subscription_create", body {"queue_name": "tripleo", "ttl": 3600}.

Part of me wonders if this has to do with trying to use the wss scheme
through https, but I've not read anything to indicate this would be a
problem; in fact, [3] suggests that it is possible.  Another thought is
that Websockets can't coexist with other connections, but from what I'm
reading that is not the case either.

Does anyone have experience in this area and might be able to shed some
light?  What am I missing here?

Thanks!
-dant

---

[0] https://blueprints.launchpad.net/tripleo/+spec/proxy-undercloud-api-services
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1392627
[2] https://tools.ietf.org/html/rfc6455#page-56
[3] https://lists.apache.org/thread.html/97be7dd152d1a657187b729d8114f769c24d400322b13c7cce071bd0@%3Cusers.httpd.apache.org%3E
[4] http://paste.openstack.org/show/595246/