[openstack-dev] [Heat] Short term scaling strategies for large Heat stacks

Zane Bitter zbitter at redhat.com
Fri May 30 19:01:02 UTC 2014


On 29/05/14 19:52, Clint Byrum wrote:
> Multiple Stacks
> ===============
>
> We could break the stack up between controllers and compute nodes. The
> controller will be less likely to fail because it will probably be 3 nodes
> for a reasonably sized cloud. The compute nodes would then live in their
> own stack of (n) nodes. We could further break that up into chunks of
> compute nodes, which would further mitigate failure. If a small chunk of
> compute nodes fails, we can just migrate off of them. One challenge here
> is that compute nodes need to know about all of the other compute nodes
> to support live migration. We would have to do a second stack update after
> creation to share data between all of these stacks to make this work.
>
> Pros: * Exists today
>
> Cons: * Complicates host awareness
>        * Still vulnerable to stack failure (just reduces probability and
>          impact).

Separating the controllers and compute nodes is something you should do 
anyway (although moving to autoscaling, which will be even better when 
it is possible, would actually have the same effect). Splitting the 
compute nodes into smaller groups would certainly reduce the cost of 
failure. If we were to use an OS::Heat::Stack resource that calls 
python-heatclient instead of creating a nested stack in the same engine, 
then these child stacks would get split across a multi-engine deployment 
automagically. There's a possible implementation already at 
https://review.openstack.org/53313
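
For what it's worth, here's a minimal sketch of what driving that chunking
from outside Heat could look like today with python-heatclient: one child
stack per chunk of compute nodes. The 'overcloud-compute-chunk.yaml'
template, its 'compute_count' parameter, the chunk size and the
endpoint/token are all hypothetical placeholders, not anything that exists
in TripleO right now:

from heatclient.client import Client

# Endpoint and token are placeholders; in practice they would come from
# keystone via the usual auth plumbing.
heat = Client('1', endpoint='http://heat.example.com:8004/v1/TENANT_ID',
              token='AUTH_TOKEN')

TOTAL_COMPUTE = 500
CHUNK_SIZE = 50  # purely illustrative chunk size

# Hypothetical template that brings up 'compute_count' compute nodes.
with open('overcloud-compute-chunk.yaml') as f:
    chunk_template = f.read()

# Create one independent stack per chunk, so a failure only poisons
# that chunk rather than the whole set of compute nodes.
for i in range(0, TOTAL_COMPUTE, CHUNK_SIZE):
    heat.stacks.create(
        stack_name='overcloud-compute-%d' % (i // CHUNK_SIZE),
        template=chunk_template,
        parameters={'compute_count': min(CHUNK_SIZE, TOTAL_COMPUTE - i)})

The second pass to share data between the chunks (for live migration) would
then be a stack-update on each of them, as Clint describes above.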

> update-failure-recovery
> =======================
>
> This is a blueprint I believe Zane is working on to land in Juno. It will
> allow us to retry a failed create or update action. Combined with the
> separate controller/compute node strategy, this may be our best option,
> but it is unclear whether that code will be available soon or not. The
> chunking is definitely required, because with 500 compute nodes, if
> node #250 fails, the remaining 249 nodes that are IN_PROGRESS will be
> cancelled, which makes the impact of a transient failure quite extreme.
> Also without chunking, we'll suffer from some of the performance
> problems we've seen where a single engine process will have to do all of
> the work to bring up a stack.
>
> Pros: * Uses blessed strategy
>
> Cons: * Implementation is not complete
>       * Still suffers from heavy impact of failure
>       * Requires chunking to be feasible

I've already started working on this and I'm expecting to have it ready 
some time between the j-1 and j-2 milestones.

I think these two strategies combined could probably get you a long way 
in the short term, though obviously they are not a replacement for the 
convergence strategy in the long term.
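
To make it concrete, here's a rough sketch of what retrying from the client
side might look like once the blueprint lands, assuming recovery is driven
by simply re-issuing the same update against a *_FAILED stack (that detail,
and all the names below, are assumptions for illustration only):

import time

from heatclient.client import Client

heat = Client('1', endpoint='http://heat.example.com:8004/v1/TENANT_ID',
              token='AUTH_TOKEN')


def update_with_retry(stack_id, template, parameters, max_retries=3):
    for attempt in range(max_retries):
        heat.stacks.update(stack_id, template=template,
                           parameters=parameters)
        # Poll until the update finishes one way or the other.
        stack = heat.stacks.get(stack_id)
        while stack.stack_status.endswith('IN_PROGRESS'):
            time.sleep(30)
            stack = heat.stacks.get(stack_id)
        if stack.stack_status == 'UPDATE_COMPLETE':
            return stack
        # UPDATE_FAILED: with update-failure-recovery, re-issuing the
        # update should continue from where the failed one stopped
        # instead of throwing the whole stack away.
    raise RuntimeError('%s still failed after %d attempts'
                       % (stack_id, max_retries))

Combined with chunking, a transient failure would then only cost you a
retry of one chunk rather than the whole deployment.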


BTW, you missed off another strategy that we have discussed in the past, 
and which I think Steve Baker might(?) be working on: retrying failed 
calls at the client level.
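
Just to illustrate what I mean (this is not Steve's actual patch, and the
exception list and back-off values are placeholders): the idea is to wrap
the calls Heat makes to the other services and retry a few times on
transient errors before giving up and marking the resource FAILED.

import time


def call_with_retries(func, *args, **kwargs):
    """Call func, retrying a handful of times on transient failures.

    A real implementation would catch the specific over-limit and
    connection exceptions each client raises, not bare Exception.
    """
    retries = 5
    delay = 2
    for attempt in range(retries):
        try:
            return func(*args, **kwargs)
        except Exception:  # placeholder for e.g. ConnectionError/OverLimit
            if attempt == retries - 1:
                raise
            time.sleep(delay)
            delay *= 2  # exponential back-off between attempts


# e.g. server = call_with_retries(nova.servers.create, name, image, flavor)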

cheers,
Zane.
