[Openstack-operators] Fwd: HAPROXY 504 errors in HA conf

Pedro Sousa pgsousa at gmail.com
Thu Jan 15 12:13:12 UTC 2015


Hi all,

the culprit was haproxy: I had "option httpchk" enabled, and once I
disabled it the timeouts stopped when rebooting the servers.
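
In case it helps someone, the stanza I changed looked roughly like this
(the listen name and backend addresses are placeholders, not my exact
conf):

    listen glance-api
        bind 172.16.21.20:9292
        balance roundrobin
        # option httpchk   <-- removing this stopped the timeouts;
        #                      haproxy falls back to plain TCP checks
        server ctrl1 172.16.21.21:9292 check inter 2000 rise 2 fall 5
        server ctrl2 172.16.21.22:9292 check inter 2000 rise 2 fall 5
        server ctrl3 172.16.21.23:9292 check inter 2000 rise 2 fall 5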

Thank you all.


On Wed, Jan 14, 2015 at 5:29 PM, John Dewey <john at dewey.ws> wrote:

>  I would verify that the VIP failover is occurring.
>
> Your master should have the IP address.  If you shut down keepalived the
> VIP should move to one of the others.   I generally set the state to MASTER
> on all systems, and have one with a higher priority than the others (e.g.
> 150 vs. 100 on the others).
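>
> A minimal keepalived.conf sketch of that layout (interface,
> virtual_router_id and VIP here are placeholders for whatever you use):
>
>     vrrp_instance VI_1 {
>         state MASTER            # MASTER on every node
>         interface eth0
>         virtual_router_id 51
>         priority 150            # e.g. 100 on the other two nodes
>         advert_int 1
>         virtual_ipaddress {
>             172.16.21.20
>         }
>     }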
>
> On Tuesday, January 13, 2015 at 12:18 PM, Pedro Sousa wrote:
>
> As expected If I reboot the Keepalived MASTER node, I get timeouts again,
> so my understanding is that this happens when the VIP fails over to another
> node. Does anyone have an explanation for this?
>
> Thanks
>
> On Tue, Jan 13, 2015 at 8:08 PM, Pedro Sousa <pgsousa at gmail.com> wrote:
>
> Hi,
>
> I think I found out the issue: as I have all 3 nodes running
> Keepalived as MASTER, when I reboot one of the servers, one of the VIPs
> fails over to it, causing the timeout issues. So I left only one server as
> MASTER and the other 2 as BACKUP, and if I reboot the BACKUP servers
> everything works fine.
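>
> In keepalived.conf terms the difference is just the state line (a sketch;
> interface and router id are placeholders):
>
>     vrrp_instance VI_1 {
>         state BACKUP       # was MASTER; only one node keeps state MASTER
>         nopreempt          # optional: don't take the VIP back after a reboot
>         interface eth0
>         virtual_router_id 51
>         priority 100       # the MASTER node keeps a higher one, e.g. 150
>         virtual_ipaddress {
>             172.16.21.20
>         }
>     }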
>
> As an aside, I don't know if this is some ARP issue, because I have a
> similar problem with Neutron L3 running in HA mode. If I reboot the server
> that is running as MASTER, I lose connectivity to my floating IPs because
> the switch doesn't yet know that the MAC address has changed. For
> everything to start working again I have to ping an outside host, like
> google, from an instance.
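>
> If it really is the switch's ARP cache, forcing a gratuitous ARP from the
> node that took over should do the same thing as pinging out from an
> instance. Something like this (a sketch; the router uuid, qg- interface
> name and floating IP are placeholders):
>
>     ip netns exec qrouter-<router-uuid> \
>         arping -U -c 3 -I qg-<interface> <floating-ip>
>
> (iputils arping; -U sends unsolicited/gratuitous ARP.) keepalived is
> supposed to send these itself when it becomes MASTER, so maybe mine
> aren't reaching the switch.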
>
> Maybe someone could share some experience with this.
>
> Thank you for your help.
>
>
>
>
> On Tue, Jan 13, 2015 at 7:18 PM, Pedro Sousa <pgsousa at gmail.com> wrote:
>
> Jesse,
>
> I see a lot of these messages in glance-api:
>
> 2015-01-13 19:16:29.084 29269 DEBUG
> glance.api.middleware.version_negotiation
> [29d94a9a-135b-4bf2-a97b-f23b0704ee15 eb7ff2b5f0f34f51ac9ea0f75b60065d
> 2524b02b63994749ad1fed6f3a825c15 - - -] Unknown version. Returning version
> choices. process_request
> /usr/lib/python2.7/site-packages/glance/api/middleware/version_negotiation.py:64
>
> While running openstack-status (at the glance image-list step) I get:
>
> == Glance images ==
> Error finding address for
> http://172.16.21.20:9292/v1/images/detail?sort_key=name&sort_dir=asc&limit=20:
> HTTPConnectionPool(host='172.16.21.20', port=9292): Max retries exceeded
> with url: /v1/images/detail?sort_key=name&sort_dir=asc&limit=20 (Caused by
> <class 'httplib.BadStatusLine'>: '')
>
>
> Thanks
>
>
> On Tue, Jan 13, 2015 at 6:52 PM, Jesse Keating <jlk at bluebox.net> wrote:
>
> On 1/13/15 10:42 AM, Pedro Sousa wrote:
>
> Hi
>
>
>     I've changed some haproxy confs; now I'm getting a different error:
>
>     == Nova networks ==
>     ERROR (ConnectionError): HTTPConnectionPool(host='172.16.21.20',
>     port=8774): Max retries exceeded with url:
>     /v2/2524b02b63994749ad1fed6f3a825c15/os-networks (Caused by <class
>     'httplib.BadStatusLine'>: '')
>     == Nova instance flavors ==
>
> If I restart my OpenStack services, everything starts working again.
>
>     I'm attaching my new haproxy conf.
>
>
> Thanks
>
>
> Sounds like your services are losing access to something, like rabbit or
> the database. What do your service logs show prior to restart? Are they
> throwing any errors?
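>
> Something like this on each controller would be a start (assuming the
> usual /var/log/<service> locations; adjust paths to your install):
>
>     grep -iE 'error|timed out|traceback' \
>         /var/log/nova/*.log /var/log/glance/*.log | tail -n 50
>
> AMQP or database connection errors just before the restart would point at
> rabbit or mysql.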
>
>
> --
> -jlk
>
>

