[Openstack-operators] Openstack HA active/passive vs. active/active
Alvise Dorigo
alvise.dorigo at pd.infn.it
Tue Dec 3 19:00:21 UTC 2013
Thank you all for your suggestions and comments.
I would like to use a more robust and advanced tool like RabbitMQ, but I'm running into a problem that I don't understand and for which I can't find any clear solution on Google:
2013-12-03 19:15:47.312 1953 CRITICAL glance [-] 'RabbitStrategy' object has no attribute 'connection_errors'
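(In case it is relevant: is a bare kombu connection test like the one below (placeholder host and credentials) a reasonable way to check that the broker itself is fine? It is just a sketch, not what Glance does internally.)

    from kombu import Connection

    # placeholder URL: adjust host, port and credentials to the real broker
    conn = Connection("amqp://guest:guest@rabbit-host:5672//")
    conn.connect()          # raises if the broker cannot be reached
    print(conn.connected)   # True after a successful AMQP handshake
    conn.release()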
Any hint?
thank you,
Alvise
On 29 Nov 2013, at 05:15, Mike Wilson <geekinutah at gmail.com> wrote:
> I would advise you to avoid Qpid like the plague; we've had a boatload of problems with it, even at very small scale [1]. I would say the system with the most experience, reliability and support behind it is Rabbit. That being said, we've completely ditched the broker-in-the-middle type of solution and started using the 0MQ driver in our deployment. HA is not a big deal for us from an MQ perspective, since all nodes and services are already HA; we get messaging HA for free with that. We have been using ZeroMQ in our installation for a while and I would highly recommend it. There are some caveats: some of the behaviors in Ceilometer, at least, are not quite supported by the 0MQ driver. I would love to have more people interested in the driver so we can start building more support. We recently presented some of our findings and reasoning at the Hong Kong summit [2]. Good luck!
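> To make the "brokerless" point concrete, here is a trivial pyzmq sketch (illustrative only, not the 0MQ driver code; the address and port are made up) of two peers exchanging messages directly, with no broker in the middle to keep highly available:
>
>     import zmq
>
>     ctx = zmq.Context()
>
>     # "server" peer: a reply socket bound on one node
>     rep = ctx.socket(zmq.REP)
>     rep.bind("tcp://127.0.0.1:5555")
>
>     # "client" peer: a request socket connecting straight to the other node
>     req = ctx.socket(zmq.REQ)
>     req.connect("tcp://127.0.0.1:5555")
>
>     req.send(b"ping")
>     print(rep.recv())   # b'ping'
>     rep.send(b"pong")
>     print(req.recv())   # b'pong'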
>
> -Mike
>
> [1] http://www.openstack.org/summit/portland-2013/session-videos/presentation/using-openstack-in-a-traditional-hosting-environment
> [2] http://www.openstack.org/summit/openstack-summit-hong-kong-2013/session-videos/presentation/going-brokerless-the-transition-from-qpid-to-0mq
>
>
> On Wed, Nov 27, 2013 at 11:02 AM, Alvise Dorigo <alvise.dorigo at pd.infn.it> wrote:
> Hi Jay, thanks a lot for your detailed answer. More comments and questions inline...
>
> On 26 Nov 2013, at 16:51, Jay Pipes <jaypipes at gmail.com> wrote:
>
> > On 11/26/2013 07:26 AM, Alvise Dorigo wrote:
> >> Hello,
> >> I've read the documentation about Openstack HA
> >> (http://docs.openstack.org/high-availability-guide/content/index.html)
> >> and I successfully implemented the active/passive model (with
> >> corosync/pacemaker) for the two services Keystone and Glance (MySQL HA
> >> is based on Percona-XtraDB multi-master).
> >>
> >> I'd like to know from the experts which model is best for HA (and
> >> possibly why), active/passive or active/active, based on their usage
> >> experience (which is surely longer than mine).
> >
> > There is no reason to run any OpenStack endpoint -- other than the Neutron L3 agent -- in an active/passive way. The reason is that none of the OpenStack endpoints maintains any state. The backend storage systems used by those endpoints *do* contain state -- but the endpoint services themselves do not.
> >
>
> So, in principle, I could simply install a cloud controller (with Keystone, Glance, Nova API and Cinder) and just clone it onto another machine. Then I could put HAProxy (made redundant with Keepalived) on top of them. (A different story applies to the Neutron L3 agent, for which an active/passive mode is preferable, as you pointed out.)
> Does this make sense?
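> In HAProxy terms I imagine something like the following fragment per API service (Keystone shown as an example; the bind address would be the Keepalived VIP, and all IPs and ports are placeholders):
>
>     listen keystone-api
>         bind 192.168.100.10:5000
>         mode http
>         balance roundrobin
>         server controller1 192.168.100.11:5000 check
>         server controller2 192.168.100.12:5000 check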
>
> > Simply front each OpenStack endpoint with a DNS name that resolves to a virtual IP managed by a load balancer, ensure that sessions are managed by the load balancer, and you're good.
> >
> > For the Neutron L3 agent you will need a separate strategy because, unfortunately, the L3 agent is stateful. We use a number of Python scripts to handle failover of routes when an agent fails. You can see these tools here (we simply run them from a cron job):
> >
> > https://github.com/stackforge/cookbook-openstack-network/blob/master/files/default/quantum-ha-tool.py
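> > (A crontab entry for it might look roughly like this; the interval and path are illustrative and the script's own options are omitted:)
> >
> >     */5 * * * * root /usr/local/bin/quantum-ha-tool.py ...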
> >
> > My advice would be to continue using Percona XtraDB for your database backend (we use the same in a variety of ways, from intra-deployment-zone clusters to WAN-replicated clusters). That solves your database availability issues, and, nicely, we've found PXC to be as easy as, or easier than, normal MySQL replication to administer and keep in sync.
> >
>
> Definitely. It has proven to be as robust as we expected. In addition, the combination of Percona+HAProxy makes it possible to expand (or replace) nodes without any outage, for example if we need to increase the cluster's performance (more CPU, more RAM, more disk)… not to mention the round-robin balancing, which comes for free.
>
> > For your message queue, you need to determine a) what level of data loss you are comfortable with, and b) whether to use certain OpenStack projects' ability to retry multiple MQ hosts in the event of a failure (currently Nova, Neutron and Cinder support this but Glance does not, IIRC).
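> > (For reference, that retry behaviour is driven by options along these lines in nova.conf / neutron.conf / cinder.conf; the host names and values are placeholders:)
> >
> >     rabbit_hosts = rabbit1:5672,rabbit2:5672
> >     rabbit_retry_interval = 1
> >     rabbit_retry_backoff = 2
> >     rabbit_max_retries = 0      # 0 means keep retrying forever
> >     rabbit_ha_queues = True     # use mirrored queues on the Rabbit side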
> >
>
> What about having an instance of Qpid per node? As far as I know, Qpid is also stateless, isn't it? In my current active/passive cluster I have Qpid running on both nodes, and when I migrate Keystone/Glance from one node to the other I don't notice anything strange. Do you see any drawback to this?
>
> Thanks,
>
> Alvise
>
> > We use RabbitMQ clustering and have had numerous problems with it, frankly. It's been our pain point from an HA perspective. There are other clustering MQ technologies out there, of course. One could write a whole book just about how crappy the MQ clustering "story" is...
> >
> > All the best,
> > -jay
> >
> >
>
>
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>