<HTML>

<HEAD>

<TITLE>Re: [Openstack] High availability in openstack?</TITLE>

</HEAD>

<BODY>

<FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'>Thanks,<BR>

<BR>

It was along the lines of what I was thinking.<BR>

<BR>

If messages are made persistent (which I hope is planned), or persistence becomes a configuration option, what would be the effects of not making them persistent?<BR>

<BR>

Right now, if a message is lost, it seems the DB and the other nodes are left in a bad state. Is there any plan to have a “reaper” Python object that will reap this bad data and these stale instances?<BR>
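<BR>
To make the “reaper” idea concrete, here is a minimal sketch, not something that exists in Nova as far as I know. InstanceRecord, TRANSITIONAL_STATES, find_stale and reap are all hypothetical names, and the 10 minute timeout is an arbitrary assumption. The idea is only to flag records that have sat in a transitional state for too long:<BR>
<PRE>
```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical states an instance could be stuck in if a message is lost.
TRANSITIONAL_STATES = {"building", "rebooting", "deleting"}


@dataclass
class InstanceRecord:
    """Hypothetical stand-in for a row in the instances table."""
    instance_id: str
    state: str
    updated_at: datetime


def find_stale(records, now, max_age=timedelta(minutes=10)):
    """Return records stuck in a transitional state for longer than max_age."""
    return [
        r for r in records
        if r.state in TRANSITIONAL_STATES and now - r.updated_at > max_age
    ]


def reap(records, now, max_age=timedelta(minutes=10)):
    """Mark stale records as 'error' and return the ids that were reaped."""
    reaped = []
    for record in find_stale(records, now, max_age):
        record.state = "error"
        record.updated_at = now
        reaped.append(record.instance_id)
    return reaped
```
</PRE>
A periodic task could call reap() with the current time and log the returned ids. Whether "error" is the right state to move such records to is an open question.<BR>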

<BR>

On 8/18/11 4:54 PM, "Edward "koko" Konetzko" &lt;<a href="konetzed@quixoticagony.com">konetzed@quixoticagony.com</a>&gt; wrote:<BR>

<BR>

</SPAN></FONT><BLOCKQUOTE><FONT FACE="Calibri, Verdana, Helvetica, Arial"><SPAN STYLE='font-size:11pt'>On 08/16/2011 04:50 PM, Joshua Harlow wrote:<BR>

> Are there any good documentations on making openstack fault tolerant or<BR>

> exactly how it will handle failures?<BR>

><BR>

> Like say the mq server dies, can another mq server take over. Similar<BR>

> with the database (mysql replication?)....<BR>

><BR>

> Seems like having that kind of information for corporate users would be<BR>

> nice, at least a recommended “guide”.<BR>

><BR>

> -Josh<BR>

><BR>

><BR>

><BR>

> _______________________________________________<BR>

> Mailing list: <a href="https://launchpad.net/~openstack">https://launchpad.net/~openstack</a><BR>

> Post to     : <a href="openstack@lists.launchpad.net">openstack@lists.launchpad.net</a><BR>

> Unsubscribe : <a href="https://launchpad.net/~openstack">https://launchpad.net/~openstack</a><BR>

> More help   : <a href="https://help.launchpad.net/ListHelp">https://help.launchpad.net/ListHelp</a><BR>

<BR>

Josh<BR>

<BR>

I have a very bare bones start of a doc on making parts of Nova HA.  The<BR>

problem is that this document is nowhere near ready for release, as I am<BR>

probably the only person who can understand it.  I will try to point you<BR>

in the right direction on things I have done that work pretty well.<BR>

<BR>

RabbitMQ<BR>

<a href="http://www.rabbitmq.com/pacemaker.html">http://www.rabbitmq.com/pacemaker.html</a><BR>

<BR>

In the version of Nova my team is working with, nothing is marked<BR>

'persistent'.  In this setup, if a node fails, RabbitMQ moves over and all<BR>

the managers reconnect with no issues, but all in-flight messages are<BR>

lost.  Maybe someone here can clarify the direction on this.  We are<BR>

using Ubuntu 10.04, and the version of RabbitMQ in that release does not<BR>

have the Pacemaker scripts, so I pulled the current package from the<BR>

rabbitmq.com apt repo; after that, the Pacemaker setup worked perfectly.<BR>
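<BR>
To illustrate what 'persistent' changes on failover, here is a toy model in Python. It is not RabbitMQ code, and ToyBroker is a made-up class. It only assumes that persistent messages are written to storage the standby node can read (for example a replicated volume), while non-persistent ones live only in memory. In AMQP 0-9-1 terms, persistence corresponds to publishing with delivery mode 2 to a durable queue.<BR>
<PRE>
```python
class ToyBroker:
    """In-memory toy model of a broker; NOT how RabbitMQ is implemented."""

    def __init__(self):
        self.memory = []  # messages held only in RAM on the active node
        self.disk = []    # messages also written to shared/replicated storage

    def publish(self, body, persistent=False):
        """Queue a message; persistent ones are also written to 'disk'."""
        self.memory.append(body)
        if persistent:
            self.disk.append(body)

    def fail_over(self):
        """Simulate the active node dying and a standby taking over.

        The standby starts with empty RAM and recovers only what is on disk.
        """
        self.memory = list(self.disk)

    def pending(self):
        """Return the messages still waiting to be delivered."""
        return list(self.memory)


broker = ToyBroker()
broker.publish("run_instance i-1")                          # in flight, not persistent
broker.publish("terminate_instance i-2", persistent=True)   # in flight, persistent
broker.fail_over()
print(broker.pending())  # only the persistent message survives
```
</PRE>
The trade-off in a real broker is that persistent messages cost a disk write each, which is presumably why it would be worth having as a configuration option.<BR>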

<BR>

MySQL<BR>

For MySQL I did a simple setup using DRBD to replicate<BR>

/var/lib/mysql and set up corosync/pacemaker to manage all the MySQL<BR>

resources between two nodes.  Again, on failover I had no issues with<BR>

clients reconnecting to the VIP.<BR>
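<BR>
The client side of a failover like this can be sketched as a reconnect loop. This is an assumption about how a client might behave, not how Nova does it: connect and operation are hypothetical callables supplied by the caller, and a real MySQL driver raises its own exception types rather than the built-in ConnectionError used here.<BR>
<PRE>
```python
import time


def call_with_reconnect(connect, operation, retries=5, delay=1.0, sleep=time.sleep):
    """Run operation(conn); on ConnectionError, reconnect and try again.

    connect   -- hypothetical callable that opens a connection to the VIP
    operation -- hypothetical callable that uses the connection
    """
    last_error = None
    for attempt in range(retries):
        try:
            conn = connect()  # always dial the VIP, never a node's own address
            return operation(conn)
        except ConnectionError as exc:
            last_error = exc
            # Linear backoff gives the cluster time to move the VIP and MySQL.
            sleep(delay * (attempt + 1))
    raise ConnectionError("gave up after %d attempts" % retries) from last_error
```
</PRE>
Note that an operation interrupted mid-transaction is not replayed safely by a loop like this; it only covers the reconnect itself.<BR>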

<BR>

I hope this points you in the right direction; I know it's not exactly<BR>

what you wanted.  Maybe next week I can clean up my documentation and<BR>

send it out to the list.<BR>

<BR>

Edward Konetzko<BR>

<BR>


</SPAN></FONT></BLOCKQUOTE>

</BODY>

</HTML>