[openstack-dev] [heat] Re: deliver the vm-level HA to improve the business continuity with openstack

Steven Dake sdake at redhat.com
Tue Apr 15 15:01:19 UTC 2014


On 04/15/2014 03:16 AM, Qiming Teng wrote:
> What I saw in this thread are several topics:
>
> 1) Is VM HA really relevant (in a cloud)?
>
> This is the most difficult question to answer, because it really depends
> on who you are talking to and which user community you are facing.
> IMHO, for most web-based applications that are born to run on cloud,
> a certain level of business resiliency may already be built into
> the code, so the application or service can live happily when VMs come
> and go.
>
> For traditional business applications, the scenario may be quite
> different.  These apps are migrated to cloud for reasons like cost
> savings, server consolidation, etc.  Quite a few companies are
> evaluating OpenStack for their "private cloud" -- which is a weird term,
> IMHO.
>
> In addition to this, while we are looking into the 'utility' vision of
> cloud, we can still ask ourselves: a) can we survive one month of power
> outage or water outage, though there is abundant supply elsewhere on
> this planet? b) what are the costs we need to pay if we eventually make
> it? c) do we want to pay for this?
>
> My personal experience is that our customers really want this feature
> (VM HA) for their private clouds.  The question they asked us was:
>
> "
>    Does OpenStack support VM HA?  Maybe not for all VMs...
>    We know we can have that using vSphere, Azure, or CloudStack...
> "
>
>
> 2) Where is the best location to provide VM HA?
>
> Suppose that we do feel the need to support VM HA, then the questions
> following this would be 'where' and 'how'.
>
> Considering that a VM is not merely a bundle of compute processes, it is
> actually a virtual execution environment that consumes resources like
> storage and network bandwidth besides processor cycles, Nova may be NOT
> the ideal location to deal with this cross-cutting concern.
>
> High availability involves redundant resource provisioning, effective
> failure detection and appropriate fail-over policies, including fencing.
> Imposing all these requirements on Nova is impractical.  We may need to
> consider whether VM HA, if ever implemented/supported, should be part of
> the orchestration service, aka Heat.
>
>
> 3) Can/should we do the VM HA orchestration in Heat?
>
> My perception is that it can be done in Heat, based on my limited
> understanding of how Heat works.  It may imply some requirements on other
> projects (e.g. nova, cinder, neutron ...) as well, though Heat should be
> the orchestrator.
>
> What do we need then?
>
>    - A resource type for VM groups/clusters, for the redundant
>      provisioning.  VMs in the group can be identical instances, managed
>      by a Pacemaker setup among the VMs, just like a WatchRule in Heat can
>      be controlled by Ceilometer.
>
>      Another way to do this is to have the VMs monitored via heartbeat
>      messages sent by Nova (if possible/needed), or some services injected
>      into the VMs (consider what cfn-hup, cfn-signal does today).
>
>      However, the VM group/cluster can decide how to react to a VM online
>      /offline signal.  It may choose to a) restart the VM in-place; b)
>      remote-restart (aka evacuate) the VM somewhere else; c) live/cold
>      migrate the VM to other nodes.
>
>      The policies can be outsourced to other plugins, considering
>      global load-balancing or power management requirements.  But that is an
>      advanced feature that warrants another blueprint.
>
>    - Some fencing support from nova, cinder, neutron to shoot the bad VMs
>      in the head so a VM that cannot be reached is guaranteed to be cleanly
>      killed.
>
>    - VM failure detectors that can reliably tell whether a VM has failed.
>      Sometimes a VM that failed the expected performance goal should be
>      treated as failed as well, if we really want to be strict on this.
>
>      A failure detector can reside inside Nova, as has been done for
>      the 'service groups' there.  It can reside inside a VM, as a service
>      installed there, sending out heartbeat messages (before the battery runs
>      out, :))
>
>    - A generic signaling mechanism that allows a secure message delivery
>      back to Heat indicating that a VM is alive or dead.
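
As a rough illustration of the failure-detector and group-policy items in
the list above, here is a minimal sketch in Python.  Everything in it is
hypothetical: HeartbeatDetector, choose_action and the Action names are
not Heat or Nova APIs, and the mapping from host state to reaction is
only one possible policy.

```python
import time
from enum import Enum


class Action(Enum):
    # The three reactions named in the list above (names are hypothetical).
    RESTART_IN_PLACE = "restart"
    EVACUATE = "evacuate"
    MIGRATE = "migrate"


class HeartbeatDetector:
    """Treats a VM as failed when no heartbeat arrived within `timeout` seconds."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock          # injectable so the logic can be tested
        self.last_seen = {}         # vm_id -> time of the last heartbeat

    def heartbeat(self, vm_id):
        # Record that a heartbeat message was received from this VM.
        self.last_seen[vm_id] = self.clock()

    def failed_vms(self):
        # Return the VMs whose last heartbeat is older than the timeout.
        now = self.clock()
        return sorted(vm for vm, seen in self.last_seen.items()
                      if now - seen > self.timeout)


def choose_action(host_alive, host_overloaded=False):
    """Assumed policy: evacuate if the host is gone, migrate away from an
    overloaded host, otherwise restart the VM where it is."""
    if not host_alive:
        return Action.EVACUATE
    if host_overloaded:
        return Action.MIGRATE
    return Action.RESTART_IN_PLACE
```

A real detector would also need fencing before any restart, so that a VM
which merely stopped sending heartbeats cannot come back and run twice.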
>
> My current understanding is that we may avoid complicated task-flow
> here.
>
> Regards,
>    - Qiming
>
Qiming,

If you read my original post on this thread, it outlines the current
heat-core thinking, which is to reduce the scope of this resource from
the Heat resources since it describes a workflow rather than an
orchestrated thing (a Noun).

A good framework for HA already exists in the HARestarter
resource.  It incorporates HA escalation, which is a critical feature of
any HA system.  The fundamental problem with HARestarter is that it is in
the wrong project.
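
To make "escalation" concrete, here is a minimal sketch of the general
idea in Python.  It is not the HARestarter code; the class name, the
level names and the thresholds are all made up for illustration.  The
assumption is that repeated failures inside a time window move recovery
up to the next, more drastic level.

```python
from collections import deque


class EscalationPolicy:
    """Escalate to the next recovery level when more than `max_failures`
    failures are seen within `window` seconds (hypothetical policy)."""

    def __init__(self, levels, max_failures, window):
        self.levels = list(levels)      # ordered from mildest to most drastic
        self.max_failures = max_failures
        self.window = window
        self.failures = deque()         # timestamps of recent failures
        self.level = 0

    def on_failure(self, now):
        # Forget failures that fell out of the observation window.
        while self.failures and now - self.failures[0] > self.window:
            self.failures.popleft()
        self.failures.append(now)

        # Too many recent failures: move up one level, if one is left,
        # and start counting afresh at the new level.
        if (len(self.failures) > self.max_failures
                and self.level < len(self.levels) - 1):
            self.level += 1
            self.failures.clear()
        return self.levels[self.level]
```

For example, with levels ["restart_service", "restart_vm",
"rebuild_stack"], max_failures=2 and window=60, the third failure inside
one minute returns "restart_vm".  The sketch never de-escalates; a real
policy would probably want to.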

Long term, HA, if desired, should be part of taskflow, though, because
it's a verb, and verbs don't belong as heat orchestrated resources.

How we get from here to there is left as an exercise to the reader ;-)

Regards
-steve

>>>> For the most part we've been trying to encourage projects that want to
>>>> control VMs to add such functionality to the Orchestration program, aka
>>>> "Heat".
>>> Yes, exactly.
>>>
>>> -jay
>>>
>> Hey folks,
>>
>> Just as a note for HA for VMs, our current heat-core thinking is our
>> HARestarter resource functionality is a workflow (Restarter is a
>> verb, rather than a Noun - Heat orchestrates Nouns) and would be
>> better suited to a workflow service like Mistral.  Clearly we don't
>> know how to get from where we are today to the proper separation of
>> concerns as pointed out by Zane Bitter in recent threads on the ml
>> but just throwing this out there so folks are aware.
>>
>> Regards
>> -steve
>>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
