[Openstack] [Openstack-operators] Nova Controller HA issues

John Garbutt John.Garbutt at citrix.com
Fri Jun 15 10:36:04 UTC 2012


I know there is some work in the XenAPI driver to make it resilient to these kinds of failures (to allow frequent updates of the nova code), and I think there were plans for the work to be reused in the Libvirt driver.

AFAIK, in Essex and earlier, bad things can happen if you don't wait for all the tasks to finish. You may well be OK some of the time.

It boils down to two issues: the message is consumed from Rabbit before the task is completed, and there is no way to recover from half-completed tasks.
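
To make the failure mode concrete, here is a toy Python model of the difference between acknowledging a message on receipt and acknowledging it only after the task has finished. ToyBroker and run_worker are hypothetical names made up for this sketch; this is neither nova's code nor RabbitMQ's API, and it does not claim to show how nova configures its consumers.

```python
from collections import deque


class ToyBroker:
    """In-memory stand-in for a message broker; not RabbitMQ's real API."""

    def __init__(self, messages):
        self.ready = deque(messages)   # messages waiting for delivery
        self.unacked = {}              # delivered but not yet acknowledged
        self._next_tag = 0

    def deliver(self, auto_ack):
        """Hand the next message to a consumer and return (tag, message)."""
        message = self.ready.popleft()
        self._next_tag += 1
        if not auto_ack:
            # The broker keeps the message until the consumer acknowledges it.
            self.unacked[self._next_tag] = message
        return self._next_tag, message

    def ack(self, tag):
        """Consumer confirms the task is finished; the message can be dropped."""
        del self.unacked[tag]

    def consumer_died(self):
        """Requeue everything the dead consumer had not acknowledged."""
        for tag in sorted(self.unacked, reverse=True):
            self.ready.appendleft(self.unacked.pop(tag))


def run_worker(broker, auto_ack, crash_on):
    """Process messages until the queue is empty or the worker 'crashes'."""
    done = []
    while broker.ready:
        tag, message = broker.deliver(auto_ack)
        if message == crash_on:
            broker.consumer_died()     # service killed in the middle of a task
            return done
        done.append(message)           # task completed
        if not auto_ack:
            broker.ack(tag)
    return done


tasks = ["run_instance vm-1", "run_instance vm-2", "run_instance vm-3"]

# Acknowledge on receipt: the interrupted task disappears with the worker.
lost = ToyBroker(tasks)
run_worker(lost, auto_ack=True, crash_on="run_instance vm-2")
print(list(lost.ready))

# Acknowledge after completion: the interrupted task goes back on the queue.
kept = ToyBroker(tasks)
run_worker(kept, auto_ack=False, crash_on="run_instance vm-2")
print(list(kept.ready))
```

Note that redelivery only covers the first half of the problem. The task may already have been partly carried out before the crash, so whoever picks it up again still has to cope with the half-completed state.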

Hope that helps,
John

From: Igor Laskovy [mailto:igor.laskovy at gmail.com]
Sent: 15 June 2012 11:31
To: Christian Parpart
Cc: John Garbutt; openstack-operators at lists.openstack.org; <openstack at lists.launchpad.net>
Subject: Re: [Openstack-operators] Nova Controller HA issues


I have only been using OpenStack in my little lab for a short time too))

OK, you are right of course, but I meant a different design when I talked about virtualizing the controller nodes.

It could be just two dedicated hypervisors with dedicated shared storage/DRBD between them. These hypervisors would be standalone and not part of nova. Then maybe Pacemaker or another tool could provide availability by restarting the VM on the surviving node when the active one dies.

The main question here: how bad can it be if the controller nodes power off unexpectedly? In other words, when the VM restarts, it will be in a crash-consistent state.
Will any nova services lose anything here?
Will RabbitMQ lose some data here? (I am new to RabbitMQ too)

Igor Laskovy
facebook.com/igor.laskovy
Kiev, Ukraine
On Jun 15, 2012 10:54 AM, "Christian Parpart" <trapni at gmail.com> wrote:
Hey,

well, I said "I might be wrong" because I have no clear picture of how OpenStack works in
its deepest detail. However, I would not like to depend on a controller node that
runs inside a virtual machine, controlled by compute nodes, that are in turn controlled by the
controller node. This sounds quite like a chicken-and-egg problem.

At the time of this writing, I think you need a working nova-scheduler process,
which is responsible for deciding on which compute node to spawn your VM (what else?).
So think about what you do when this VM (or all your controller VMs) dies badly
and you want to rebuild it: how do you plan to do this when your controller node is out of service?

In my case, I have put the controller services onto two compute nodes and use Pacemaker
to switch between them: if one node goes down, the other can take over (via a shared service IP).
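
The active/passive idea behind that setup can be sketched in a few lines of Python. This is only a toy model of the decision logic, not a Pacemaker configuration: the node names, the pick_holder function and the service IP below are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    alive: bool = True


def pick_holder(nodes, current):
    """Return the node that should hold the service IP.

    The current holder keeps it while it is alive (no needless failback);
    otherwise the first live node in the list takes over.
    """
    if current is not None and current.alive:
        return current
    return next((node for node in nodes if node.alive), None)


SERVICE_IP = "192.0.2.10"  # hypothetical address from the documentation range

node_a, node_b = Node("compute-a"), Node("compute-b")
cluster = [node_a, node_b]

holder = pick_holder(cluster, None)
print(SERVICE_IP, "->", holder.name)   # initial placement

node_a.alive = False                   # simulate the active node going down
holder = pick_holder(cluster, holder)
print(SERVICE_IP, "->", holder.name)   # the surviving node takes over
```

Clients only ever talk to the service IP, so they do not need to know which of the two nodes is currently active. What the sketch leaves out is exactly the hard part discussed in this thread: whatever state the failed node held in memory or in flight is not moved along with the IP.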

Again, these are my thoughts, and I am using OpenStack for just about a month now :-)
But I hope this helps a bit...

Best regards,
Christian Parpart.

On Fri, Jun 15, 2012 at 8:16 AM, Igor Laskovy <igor.laskovy at gmail.com> wrote:

Why? Can you please clarify.

Igor Laskovy
facebook.com/igor.laskovy
Kiev, Ukraine
On Jun 15, 2012 1:55 AM, "Christian Parpart" <trapni at gmail.com> wrote:
I don't think putting the controller node completely into a VM is good advice,
at least when speaking of nova-scheduler and nova-api (if central).

I may be wrong, and if so, please correct me.

Christian.

On Thu, Jun 14, 2012 at 7:20 PM, Igor Laskovy <igor.laskovy at gmail.com> wrote:
Hi, are there any updates on this?
Can anybody clarify what happens if the controller nodes just go through a hard shutdown?

I am thinking about a solution with two hypervisors, putting the controller
node in a VM on shared storage, which can be relaunched when the active
hypervisor dies.
Any ideas or advice?


On Tue, Jun 12, 2012 at 3:52 PM, John Garbutt <John.Garbutt at citrix.com> wrote:
> Sure, I get your point.
>
> I think Florian is working on some docs to help on that.
>
> Not sure how much has been done already.
>
>
>
> Cheers,
>
> John
>
>
>
> From: Christian Parpart [mailto:trapni at gmail.com]
> Sent: 12 June 2012 13:47
> To: John Garbutt
> Cc: openstack-operators at lists.openstack.org
> Subject: Re: [Openstack-operators] Nova Controller HA issues
>
>
>
> Hey, yes, I also found this page, but I haven't found it that helpful yet.
> It reads more like a theoretical paper on how they implemented it than
> a guide on how to actually make it happen (from the sysop point of view :-)
>
> I had hoped that someone had faced this already, since I find it very
> unintuitive to set up. Otherwise I will need to wait until I get more
> time to investigate it properly. :-)
>
>
>
> Regards,
>
> Christian.
>
> On Tue, Jun 12, 2012 at 12:52 PM, John Garbutt <John.Garbutt at citrix.com> wrote:
>
> I thought Rabbit had a built-in HA solution these days:
>
> http://www.rabbitmq.com/ha.html
>
>
>
> From: openstack-operators-bounces at lists.openstack.org
> [mailto:openstack-operators-bounces at lists.openstack.org] On Behalf Of
> Christian Parpart
> Sent: 12 June 2012 09:59
> To: openstack-operators at lists.openstack.org
> Subject: [Openstack-operators] Nova Controller HA issues
>
>
>
> Hi all,
>
>
>
> after spending the whole evening making our cloud controller node highly
> available using Corosync/Pacemaker, which I am really proud of, I have
> just a few problems left, and the one that freaks me out the most is
> rabbitmq-server.
>
> I just can't seem to find any good documentation on how to set that beast
> up properly for HA.
>
> Has anyone ever tried to set up a nova controller (including the rabbitmq
> dependency) for HA?
>
> If so, I'd be pleased to share experiences, especially on the latter part.
> :-)
>
>
>
> Best regards,
>
> Christian Parpart
>
>
>
>
> _______________________________________________
> Openstack-operators mailing list
> Openstack-operators at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>



--
Igor Laskovy
Kiev, Ukraine

