[openstack-dev] [Heat] Multi region support for Heat

Zane Bitter zbitter at redhat.com
Mon Jul 29 17:21:29 UTC 2013

On 29/07/13 17:40, Clint Byrum wrote:
> Excerpts from Zane Bitter's message of 2013-07-29 00:51:38 -0700:
>> On 29/07/13 02:04, Angus Salkeld wrote:
>>> On 26/07/13 09:43 -0700, Clint Byrum wrote:
>>>>
>>>> These are all predictable limitations and can be handled at the parsing
>>>> level.  You will know as soon as you have template + params whether or
>>>> not that cinder volume in region A can be attached to the nova server
>>>> in region B.
>>
>> That's true, but IMO it's even better if it's obvious at the time you
>> are writing the template. e.g. if (as is currently the case) there is no
>> mechanism within a template to select a region for each resource, then
>> it's obvious you have to write separate templates for each region (and
>> combine them somehow).
>>
> The language doesn't need to be training wheels for template writers. The

Sure it does ;)

But seriously, the language should reflect the underlying model.

> language parser should just make it obvious when you've done something
> patently impossible. Because cinderclient will only work with a region
> at a time, we can make that presumption at that resource level, and flag
> the problem during creation before any resources have been allocated.
> But it would be quite presumptuous of Heat, at a language layer, to
> say all things are single region.

Well, CloudFormation has made that presumption. And while I would 
*never* suggest that we limit ourselves to features that CloudFormation 
supports, it behooves us to pause and consider why that might be. (I'm 
thinking here of a great many things in Heat, not just this particular 
one.) Because if we do just charge ahead with every idea, the complexity 
of the resulting system will be baroque.

> There's an entirely good use case
> for cross-region volumes and if it is ever implemented that way, and
> cinderclient got some multi-region features, we wouldn't want users
> to be faced with a big rewrite of their templates just to make use of
> them. We should just be able to do the cross region thing now, because
> we have that capability.

This is interesting, because I think of Availability Zones within a 
Region as "things that are separated somewhat, but still make sense to 
be used together sometimes" and Regions as "things that don't make sense 
to be used together". So we have this three-tier system already (local, 
different AZ, different Region) that cloud providers can use to specify 
the capabilities of resources, yet we propose to make two of those tiers
effectively equivalent? That doesn't sound like we are modelling the
problem domain accurately.

If we accept that distinction between AZs and Regions, then your example 
is, by definition, not going to happen. Presumably you don't accept that 
definition though, so I'd be curious to hear what everybody else thinks 
it means when a cloud provider creates a separate Region.
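
To make the three tiers concrete, here is a rough Python sketch. The
names (Tier, Placement, tier, can_attach) are invented for illustration
and are not part of Heat, Nova or Cinder. It also assumes that resources
in different AZs of one Region may be combined, which in practice is up
to the provider.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    LOCAL = "local"                      # same Region, same AZ
    DIFFERENT_AZ = "different AZ"        # same Region, different AZ
    DIFFERENT_REGION = "different Region"


@dataclass(frozen=True)
class Placement:
    """Hypothetical description of where a resource lives."""
    region: str
    az: str


def tier(a: Placement, b: Placement) -> Tier:
    """Classify how far apart two resources are."""
    if a.region != b.region:
        return Tier.DIFFERENT_REGION
    if a.az != b.az:
        return Tier.DIFFERENT_AZ
    return Tier.LOCAL


def can_attach(volume: Placement, server: Placement) -> bool:
    # Assumption: Regions are never used together; AZs sometimes are.
    return tier(volume, server) is not Tier.DIFFERENT_REGION
```

Under the definition above, can_attach() for a volume in region A and a
server in region B is always False. Treating AZs and Regions alike would
collapse the last two members of Tier into one.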

> I dislike the language defining everything and putting users in too
> tight of a box. How many times have you been using a DSL and shaken your
> fists in the air, "get out of my way!"? I would suggest that cross region
> support is handled resource by resource.

Honestly, I can't recall that ever happening, though probably only
because I haven't used a lot of DSLs. However, I wish I had a dollar for
every time I've tried desperately to get some combination of things that
worked individually to work together, and ended up having to read the
source code to figure out that the combination wasn't supported.

>>>> I'm still convinced that none of this matters if you rely on a single
>>>> Heat in one of the regions. The whole point of multi region is to
>>>> eliminate a SPOF.
>>
>> So the idea here would be that you spin up a master template in one
>> region, and this would contain OS::Heat::Stack resources that use
>> python-heatclient to connect to Heat APIs in other regions to spin up
>> the constituent stacks in each region. If a region goes down, even if it
>> is the one with your master template, that's no problem because you can
>> still interact with the constituent stacks directly in whatever
>> region(s) you _can_ reach.
>>
> Interacting with a nested stack directly would be a very uncomfortable
> proposition. How would this affect changes from the master? Would the
> master just overwrite any changes I make or refuse to operate?

Heat resources (with the exception of a few hacks that are implemented 
internally *cough*AutoScaling*cough*) are stateless. We only really 
store the uuid of the target resource.

So if you updated the template for the nested stack directly, the Heat
engine holding the master stack wouldn't notice or care. If you deleted
the nested stack directly, you should probably also remove it from the
master stack before attempting an update that modifies it; that should
go smoothly, because Heat ignores NotFound errors when trying to delete
resources. We can and probably should improve on that by checking during
an update that resources still exist, and recreating them if not.
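
A minimal sketch of that behaviour, for illustration only. FakeBackend,
NotFound, delete_resource and update_resource are all made-up names
standing in for a service API and the resource plugin code; this is not
Heat's actual implementation.

```python
class NotFound(Exception):
    """Stand-in for the 'not found' error a client library raises."""


class FakeBackend:
    """In-memory stand-in for a service API keyed by resource uuid."""

    def __init__(self):
        self.resources = {}
        self._next = 0

    def create(self, properties):
        self._next += 1
        uuid = "res-%d" % self._next
        self.resources[uuid] = dict(properties)
        return uuid

    def get(self, uuid):
        try:
            return self.resources[uuid]
        except KeyError:
            raise NotFound(uuid)

    def update(self, uuid, properties):
        self.get(uuid).update(properties)

    def delete(self, uuid):
        if uuid not in self.resources:
            raise NotFound(uuid)
        del self.resources[uuid]


def delete_resource(backend, uuid):
    """Delete, treating an already-missing resource as success."""
    try:
        backend.delete(uuid)
    except NotFound:
        pass


def update_resource(backend, uuid, properties):
    """Update in place, or recreate if the resource was deleted out of band.

    Returns the uuid the caller should store from now on.
    """
    try:
        backend.update(uuid, properties)
        return uuid
    except NotFound:
        return backend.create(properties)
```

The only state the caller keeps is the uuid, so a resource that was
removed behind its back costs one extra create on the next update and
nothing at all on delete.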

Note that this is completely possible now, both for all resources that
have their own API and for nested stacks. (Updates after out-of-band
changes are actually a much bigger problem for virtually everything 
_but_ nested stacks, which is one reason I suggested limiting 
foreign-region resources to only nested stacks, rather than the 
competing proposal from others in this thread to allow it for all 
resource types.) This proposal doesn't change anything at all in this 
regard, aside from increasing the chance that you might want to make use 
of it.

>> So it solves the non-obviousness problem and the single-point-of-failure
>> problem in one fell swoop. The question for me is whether there are
>> legitimate use cases that this would shut out.
>>
> With this solution to those problems, you have a new non-obvious problem
> which is the split brain I describe above. If master goes down, you have
> to play the master's role, and when the master returns, does it just
> give up because you've tainted its stacks? Do you develop new API commands
> to help resolve this? Use merge techniques?

Other than the "replace a deleted resource on update" thing mentioned 
above, nothing should be necessary afaict.

> I believe that those problems are no less complex than the problem of
> how to make Heat engines simply act as peers that need to share data.

Looking back over your proposal, you seem to be suggesting more or less 
the same thing happening under the hood. The only difference I can 
detect is that I want the structure of the templates to explicitly model 
how it works, whereas you want to infer from one multi-region template 
which resources go to each region.
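
As a rough illustration of that difference, using plain dicts in place
of real templates: the per-resource "region" key and the "region" and
"template" properties on OS::Heat::Stack are hypothetical, as are both
function names.

```python
from collections import defaultdict


def split_by_region(template, default_region):
    """Inferred form: group one multi-region template's resources by a
    hypothetical per-resource 'region' key."""
    per_region = defaultdict(dict)
    for name, resource in template["resources"].items():
        region = resource.get("region", default_region)
        per_region[region][name] = {
            key: value for key, value in resource.items() if key != "region"
        }
    return {region: {"resources": res} for region, res in per_region.items()}


def master_template(per_region):
    """Explicit form: one nested-stack resource per region."""
    return {
        "resources": {
            "stack_%s" % region: {
                "type": "OS::Heat::Stack",
                "region": region,      # hypothetical property
                "template": tmpl,      # hypothetical property
            }
            for region, tmpl in sorted(per_region.items())
        }
    }


flat = {
    "resources": {
        "server": {"type": "OS::Nova::Server", "region": "A"},
        "volume": {"type": "OS::Cinder::Volume", "region": "B"},
    }
}
explicit = master_template(split_by_region(flat, default_region="A"))
```

Either way the engine ends up with one stack per region. In the first
form Heat computes the split; in the second the template author writes
it down, so the structure in "explicit" is visible when the template is
written rather than when it is parsed.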

