[openstack-dev] blueprint proposal nova-compute fencing for HA ?

Doug Hellmann doug.hellmann at dreamhost.com
Mon Apr 22 15:58:28 UTC 2013


We had several sessions at the summit last week to discuss ways to
implement tenant-defined alarms in Ceilometer to notify Heat (and other
consuming services) when some action might need to be taken. See the
session notes [1] for details.

Doug

[1] https://wiki.openstack.org/wiki/Summit/Havana/Etherpads#Ceilometer



On Mon, Apr 22, 2013 at 6:56 AM, Sylvain Bauza
<sylvain.bauza at digimind.com>wrote:

>  I'm not fully aware of the Heat/Ceilometer parts, but I know there are
> some attempts to mix them up in order to get some HA capabilities based on
> metrics given by Ceilometer.
>
> -Sylvain
>
>
> Le 22/04/2013 14:58, Alex Glikson a écrit :
>
> I think this is a good idea.
> We already have a framework in Nova to detect and report the failure
> (service group monitoring APIs, with DB and ZK backends already
> implemented), as well as APIs to list instances on a host and to evacuate
> individual instances (soon with destination selected by the scheduler).
> Indeed, the missing pieces now are the end-to-end orchestration (which is
> probably not going to happen within Nova, at least at the moment), and the
> mechanism(s) to isolate the failed host (e.g., to protect against false
> failure detection events) -- which could potentially happen in several
> places, as you mentioned. It might be the case that whatever can be done
> within Nova is already there -- the corresponding nova-compute will be
> considered down. So, maybe now the question is which additional components
> might be used (as you mentioned -- bare-metal, quantum, cinder, etc). Once
> the individual measures are clear (and implemented), the orchestration
> logic (wherever that would be) can use them.
>
> Regards,
> Alex
>
>
>
>
> From:        Leen Besselink <ubuntu at consolejunkie.net>
> To:        OpenStack Development Mailing List
> <openstack-dev at lists.openstack.org>
> Date:        22/04/2013 03:18 PM
> Subject:        [openstack-dev] blueprint proposal nova-compute fencing
> for HA ?
> ------------------------------
>
>
>
> Hi,
>
> As I was not at the summit and the technical videos of the summit are not
> yet online, I am not aware of what was discussed there.
>
> But I would like to submit a blueprint.
>
> My idea is:
>
> It is a step towards supporting VM high availability.
>
> This part is about handling compute node failure.
>
> My proposal would be to create a framework/API/plugin/agent or whatever is
> needed for fencing off a nova-compute node.
>
> So when something detects that a nova-compute node isn't functional
> anymore it can fence off that nova-compute node.
>
> After that, it can call 'evacuate' to restart the instance(s) that were
> previously running on the failed compute node on other compute node(s).
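The two steps above (fence first, evacuate second) could be sketched roughly like this. All names here are illustrative placeholders, not existing Nova APIs; the point is only the ordering:

```python
# Hypothetical sketch of the fence-then-evacuate flow described above.
# The callables are injected; none of these names exist in Nova today.

def handle_compute_failure(host, fence, list_instances, evacuate):
    """Fence a failed compute node, then evacuate its instances.

    fence(host)           -> bool: isolate the host; True on success.
    list_instances(host)  -> list of instance ids running on the host.
    evacuate(instance)    -> restart the instance on another node.
    """
    # Never evacuate before fencing has succeeded: a half-dead host
    # could otherwise keep writing to shared storage while the same
    # instance restarts elsewhere (split-brain).
    if not fence(host):
        raise RuntimeError("fencing of %s failed; refusing to evacuate" % host)
    evacuated = []
    for instance in list_instances(host):
        evacuate(instance)
        evacuated.append(instance)
    return evacuated
```

The important design choice is that evacuation is refused outright when fencing cannot be confirmed.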
>
> The fencing code could be implemented in different ways for different
> environments:
>
> - The IPMI code that handles baremetal provisioning could, for example, be
> used to power off or reboot the node.
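As a rough illustration of the IPMI variant: power-fencing boils down to asking the node's BMC for a hard power-off, e.g. via the standard ipmitool CLI. The host and credentials below are placeholders, and the runner is injectable so nothing real is powered off:

```python
# Illustrative power-fence over IPMI using the ipmitool CLI.
# BMC address and credentials are placeholders.
import subprocess


def build_ipmi_power_off(bmc_host, user, password):
    """Build the ipmitool command to hard power off a node via its BMC."""
    return ["ipmitool", "-I", "lanplus",
            "-H", bmc_host, "-U", user, "-P", password,
            "chassis", "power", "off"]


def fence_via_ipmi(bmc_host, user, password, run=subprocess.check_call):
    # 'run' is injectable so the command can be exercised without hardware.
    run(build_ipmi_power_off(bmc_host, user, password))
```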
>
> - The Quantum networking code could be used to "disconnect" the
> instance(s) of the failed compute node (or the whole compute node) from
> their respective networks. If you are using overlays, you could configure
> the other machines not to accept tunnel traffic from the failed compute
> node for the networks of the instance(s).
>
> - You could also have a firewall agent configure the shared storage
> servers (or a firewall in between) to reject traffic from the failed
> compute node.
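The storage-side fence is essentially one firewall rule at the storage server (or the firewall in between). A hypothetical agent might build it like this; the rule shape is standard iptables, but no such OpenStack agent exists:

```python
# Hypothetical storage-side fence: drop all traffic from the failed
# compute node at the shared storage server. Plain iptables rule;
# the agent wrapping it is imaginary.
def build_storage_fence_rule(failed_node_ip):
    """iptables rule dropping all traffic from the failed compute node."""
    return ["iptables", "-I", "INPUT", "1",
            "-s", failed_node_ip, "-j", "DROP"]
```

Inserting the rule at position 1 of the INPUT chain ensures it takes effect ahead of any existing accept rules.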
>
> I am sure other people have other ideas.
>
> My request would be to have an API and general framework which can call
> the different implementations that are configured for that environment.
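A minimal sketch of what such a framework could look like: an abstract fencing driver plus a registry, with the concrete implementation (IPMI, network, storage firewall, ...) chosen per deployment. All names are illustrative; no such API exists in Nova:

```python
# Illustrative pluggable-fencing framework, as requested above.
# Nothing here is an existing Nova interface.
import abc


class FencingDriver(metaclass=abc.ABCMeta):
    @abc.abstractmethod
    def fence(self, host):
        """Isolate 'host'. Return True only once isolation is confirmed."""


_DRIVERS = {}


def register_driver(name, driver_cls):
    """Make a concrete driver selectable by name."""
    _DRIVERS[name] = driver_cls


def get_driver(name):
    # 'name' would come from configuration, e.g. fencing_driver=ipmi
    return _DRIVERS[name]()
```

The orchestration logic (wherever it ends up living) would then only ever talk to `FencingDriver.fence()` and never to a specific mechanism.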
>
> Does that make any sense ?
>
> Or maybe this should be handled by creating clusters with, for example,
> Pacemaker, as I assume oVirt might be doing with their proposals:
>
>
> https://blueprints.launchpad.net/nova/+spec/rhev-m-ovirt-clusters-as-compute-resources/
>
> As I am not yet all that familiar with the structure of OpenStack or how
> it is organized, it could be that I am asking in the wrong place to
> discuss this; if it does not fit architecturally, do let me know where I
> went wrong.
>
> I've looked at the list of existing blueprints and see several other
> evacuate-, fault-tolerance-, and HA-related blueprints as well:
>
> https://blueprints.launchpad.net/nova/+spec/evacuate-host
> https://blueprints.launchpad.net/nova/+spec/find-host-and-evacuate-instance
> https://blueprints.launchpad.net/nova/+spec/unify-migrate-and-live-migrate
> https://etherpad.openstack.org/HavanaUnifyMigrateAndLiveMigrate
> https://blueprints.launchpad.net/nova/+spec/live-migration-scheduling
> https://blueprints.launchpad.net/nova/+spec/bare-metal-fault-tolerance
>
> http://openstacksummitapril2013.sched.org/event/92e3468e458c13616331e75f15685560#.UXUeVXyuiw4
>
> I think it would be a good idea to get an overview of all the use cases
> and then split them up into tasks.
>
> Hope this is helpful.
>
> Have a nice day,
>                 Leen.
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
>
>
>
>
>

