[openstack-dev] [Heat] Multi region support for Heat

Clint Byrum clint at fewbar.com
Wed Jul 24 06:40:29 UTC 2013

Excerpts from Adrian Otto's message of 2013-07-23 21:22:14 -0700:
> Clint,
> On Jul 23, 2013, at 10:03 AM, Clint Byrum <clint at fewbar.com>
>  wrote:
> > Excerpts from Steve Baker's message of 2013-07-22 21:43:05 -0700:
> >> On 07/23/2013 10:46 AM, Angus Salkeld wrote:
> >>> On 22/07/13 16:52 +0200, Bartosz Górski wrote:
> >>>> Hi folks,
> >>>> 
> >>>> I would like to start a discussion about the blueprint I raised about
> >>>> multi region support.
> >>>> I would like to get feedback from you. If something is not clear or
> >>>> you have questions do not hesitate to ask.
> >>>> Please let me know what you think.
> >>>> 
> >>>> Blueprint:
> >>>> https://blueprints.launchpad.net/heat/+spec/multi-region-support
> >>>> 
> >>>> Wikipage:
> >>>> https://wiki.openstack.org/wiki/Heat/Blueprints/Multi_Region_Support_for_Heat
> >>>> 
> >>> 
> >>> What immediately looks odd to me is you have a MultiCloud Heat talking
> >>> to other Heats in each region. This seems like unnecessary
> >>> complexity to me.
> >>> I would have expected one Heat to do this job.
> >> 
> >> It should be possible to achieve this with a single Heat installation -
> >> that would make the architecture much simpler.
> >> 
> > 
> > Agreed that it would be simpler and is definitely possible.
> > 
> > However, consider that having a Heat in each region means Heat is more
> > resilient to failure. So focusing on a way to make multiple Heats
> > collaborate, rather than on a way to make one Heat talk to two regions,
> > may be a more productive exercise.
> I agree with Angus, Steve Baker, and Randall on this one. We should aim
> for simplicity where practical. Having Heat services interacting with
> other Heat services seems like a whole category of complexity that's
> difficult to justify. If it were implemented as Steve Baker described,
> and the local Heat service were unavailable, the client may still have
> the option to use a Heat service in another region and still
> successfully orchestrate. That seems to me like a failure mode that's
> easier for users to anticipate and plan for.

I'm all for keeping the solution simple. However, I am not for making
it so simple that it cannot actually achieve its stated goals.

> Can you further explain your perspective? What sort of failures would
> you expect a network of coordinated Heat services to be more effective
> with? Is there any way this would be more simple or more elegant than
> other options?

I expect partitions across regions to be common. Regions should be
expected to operate completely isolated from one another if need be. What
is the point of deploying a service to two regions, if one region's
failure means you cannot manage the resources in the surviving region?

Active/Passive means you now have an untested passive heat engine in
the passive region. You also have a lot of pointers to update when the
active is taken offline or when there is a network partition. Also split
brain is basically guaranteed in that scenario.

Active/Active(/Active/...), where each region's Heat service collaborates
and owns its own respective pieces of the stack, means that on partition,
one is simply prevented from telling one region to scale, migrate, etc.
onto another one. It also affords a local Heat the ability to
replace resources in a failed region with local resources.
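
The fallback behaviour described above can be sketched roughly as follows.
This is only an illustration under stated assumptions: RegionalHeat and
place() are hypothetical names, not part of Heat, and a partition is
simulated with a simple flag.

```python
from dataclasses import dataclass, field


@dataclass
class RegionalHeat:
    """Hypothetical stand-in for one region's Heat service (not a Heat API)."""
    region: str
    reachable: bool = True  # False simulates a network partition
    resources: list = field(default_factory=list)

    def create(self, name):
        # A partitioned region cannot be told to do anything.
        if not self.reachable:
            raise ConnectionError(f"region {self.region} is unreachable")
        self.resources.append(name)
        return f"{self.region}/{name}"


def place(name, preferred, local):
    """Try the preferred region; if it is unreachable, use the local one."""
    try:
        return preferred.create(name)
    except ConnectionError:
        # The local Heat still owns its own piece of the stack, so it can
        # substitute a local resource for the one it cannot reach.
        return local.create(name)
```

With the "west" region partitioned away, place("web-1", west, east) would
create the resource in "east" instead, and nothing is recorded in "west".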

The way I see it working is actually pretty simple. One stack would
lead to resources in multiple regions. The collaboration I speak of
would simply be that if given a stack that requires crossing regions,
the other Heat is contacted and the same stack is deployed. Cross-region
attribute/ref sharing would need an efficient way to pass data about
resources as well.
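
A minimal sketch of that collaboration, assuming a stack is just a dict of
resources that each name a region. The dict layout, split_by_region(),
deploy() and the send() callable are all hypothetical and are not Heat's
template format or API; cross-region attribute/ref sharing is left out.

```python
from collections import defaultdict


def split_by_region(stack):
    """Group a stack's resources by the region each one names.

    `stack` is an assumed dict of {resource_name: {"region": ...}}.
    """
    pieces = defaultdict(dict)
    for name, res in stack.items():
        pieces[res["region"]][name] = res
    return dict(pieces)


def deploy(stack, local_region, send):
    """Return the local piece and forward the same stack to other regions.

    `send(region, stack)` is an assumed callable standing in for whatever
    transport contacts the other region's Heat.
    """
    pieces = split_by_region(stack)
    for region in pieces:
        if region != local_region:
            send(region, stack)  # the other Heat receives the same stack
    # Each Heat only owns the resources that live in its own region.
    return pieces.get(local_region, {})
```

Each Heat would run the same deploy() on the same stack and keep only its
own region's piece, which is what lets a region carry on managing its
resources when the others are unreachable.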

Anyway, I'm not the one doing the work, so I'll step back from the
position, but if I were a user who wanted multi-region, I'd certainly
want _a plan_ from day 1 to handle partitions.
