[openstack-dev] [TripleO] [Tuskar] Deployment Management section - Wireframes

Clint Byrum clint at fewbar.com
Wed Jan 15 18:38:34 UTC 2014


Excerpts from Jaromir Coufal's message of 2014-01-15 02:05:52 -0800:
> > 1) Check for an already deleted server before deleting any. This is
> > related to stack convergence:
> >
> > https://blueprints.launchpad.net/heat/+spec/stack-convergence
> >
> > This will allow users to just delete a server they want to delete,
> > and then update the template to reflect reality.
> >
> > 2) Allow resources to be marked as critical or disposable. Critical
> > resources would not ever be deleted for scaling purposes or during
> > updates. An update would fail if there were no disposable resources.
> > Scaling down would just need to be retried at this point.
> >
> > With those two things, TripleO can make the default "disposable" for
> > stateless resources, and "critical" for stateful resources. Tuskar would
> > just report on problems in managing the Heat stack. Admins can then
> > control any business cases for evacuations/retirement of workloads/etc
> > for automation purposes.
> This is a nice feature, though it looks like post-Icehouse business. I
> would focus more on (1) at the moment. But it's a good direction.
> 

(2) is definitely a feature that we'd have to propose and flesh out for
Juno. I think it would solve a lot of the problems with auto scaling and
allow it to work for workloads other than stateless app servers. I am not
sure if it fits in with the autoscaling work that has been ongoing. Also
I think we can live without it indefinitely, _if_ we can accept the scary
part: the Heat template size/count hanging like a veritable "Sword of
Damocles" over the heads of the resources in the group.
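To make the idea in (2) concrete, here is a minimal sketch of what the
scale-down selection could look like. Nothing like this exists in Heat
today; the `policy` field and `pick_victims` helper are purely
illustrative names for the proposed "critical vs. disposable" marking:

```python
# Hypothetical sketch of proposal (2): only "disposable" resources may
# be removed on scale-down; "critical" ones are never touched. An
# update fails (here: raises) when too few disposable resources exist.

def pick_victims(resources, count):
    """Choose `count` resources to remove when scaling down."""
    disposable = [r for r in resources if r.get("policy") == "disposable"]
    if len(disposable) < count:
        # mirrors the proposed update failure; scale-down is retried later
        raise RuntimeError("not enough disposable resources; retry scale-down")
    return disposable[:count]

nodes = [
    {"name": "compute-0", "policy": "disposable"},
    {"name": "db-0", "policy": "critical"},       # stateful, never deleted
    {"name": "compute-1", "policy": "disposable"},
]
victims = pick_victims(nodes, 1)  # picks a disposable node, skips db-0
```

With that in place, TripleO could default stateless resources to
"disposable" and stateful ones to "critical", as described above.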

For (1), I think anything that does not follow the path of least surprise
is a bug. The way reducing count/size of groups works now is a surprise
if users have manually deleted resources already. So I filed (1) from
above as a bug:

https://bugs.launchpad.net/heat/+bug/1269534

This should be a relatively simple fix: the update just does a
check_active on all of the resources that should exist, and if any are
missing, it reduces the count of active resources to match. This flips
the logic a bit (right now we check the count and template, and then
reduce), but I think we can at least _try_ to get this into I3. It also
fits in with the convergence blueprint, which has been accepted for
Icehouse (though I don't know if any work has been completed). I linked
the bug to said blueprint.
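Roughly, the reconciliation step the bug proposes could look like the
following. This is a sketch only; `check_active` here is a stand-in
predicate, not Heat's actual resource-check interface:

```python
# Sketch of the fix proposed in bug 1269534: before acting on a group
# size change, check which members still exist and reconcile the
# recorded count with reality.

def reconcile_group(members, check_active, desired_count):
    """Drop members that no longer exist, then return the surviving
    members and how many further deletions (positive) or creations
    (negative delta) are needed to reach desired_count."""
    alive = [m for m in members if check_active(m)]
    return alive, len(alive) - desired_count

members = ["server-0", "server-1", "server-2"]
existing = {"server-0", "server-2"}  # server-1 was manually nova-deleted
alive, extra = reconcile_group(members, existing.__contains__, 2)
# the manual deletion already satisfied the scale-down: extra == 0
```

The point is the order of operations: reality is checked first, so a
user's manual delete counts toward the reduced size instead of
surprising them with an extra deletion.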

> > Eventually perhaps we could use Mistral to manage that, but for now,
> > I think just being able to protect and manually delete important nodes
> > for scale down is enough. Perhaps Tuskar could even pop up a dialog
> > showing them and allowing manual selection.
> Clint, this is exactly what I was asking for, and thank you for
> bringing this up. It would be awesome if we can do it. But I was told
> that this is not really possible with the current TripleO approach.
> 
> So my question is - when scaling down, are we able to show the user a
> list of participating nodes, let him select which nodes he wants to
> remove, and then update the template to reflect reality? All within
> the Icehouse timeframe...?
> 

For Icehouse Tuskar should probably just focus on generating templates
for "NovaCompute%d" the way it does now, and then giving users a way to
choose which of those to terminate. If we can get that bug fixed though,
then Tuskar could just nova delete the selected servers and then update
the size of an instance group in Heat.
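The flow described above could be sketched as follows. The two client
calls are stubbed out here; with the real clients they would correspond
roughly to python-novaclient's servers.delete() and python-heatclient's
stacks.update(), and the "compute_count" parameter name is illustrative:

```python
# Sketch of the Icehouse-timeframe Tuskar flow: nova-delete the
# user-selected servers, then shrink the group size in the Heat stack.
# `nova_delete` and `heat_update` are injected stand-ins for the real
# client calls so the logic can be shown self-contained.

def scale_down(selected_ids, group_size, nova_delete, heat_update):
    """Delete the chosen servers, then update the stack parameter
    controlling the instance group size to match."""
    for server_id in selected_ids:
        nova_delete(server_id)
    new_size = group_size - len(selected_ids)
    heat_update({"compute_count": new_size})  # parameter name is hypothetical
    return new_size

deleted, updates = [], []
new_size = scale_down(["NovaCompute2"], 3, deleted.append, updates.append)
```

Combined with the bug fix above, the subsequent stack update would then
see the already-deleted servers and not remove anything further.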
