[openstack-dev] [Heat][Summit] Input wanted - real world heat spec
Zane Bitter
zbitter at redhat.com
Thu Apr 24 21:23:38 UTC 2014
On 23/04/14 20:45, Robert Collins wrote:
> Hi, we've got this summit session planned -
> http://summit.openstack.org/cfp/details/428 which is really about
> https://etherpad.openstack.org/p/heat-workflow-vs-convergence
>
> We'd love feedback and questions - this is a significant amount of
> work, but work that I (and many others, based on responses so far)
> believe is needed to really take Heat to users and ops teams.
>
> Right now we're looking for both high and low level design and input.
>
> If you're an operator/user/developer of/with/around heat - please take
> a couple of minutes to look - feedback inline in the etherpad, or here
> on the list - whatever suits you.
>
> The basic idea is:
> - no changes needed to the heat template language etc
+1 for this part, definitely :)
> - take a holistic view and fix the system's emergent properties by
> using a different baseline architecture within it
> - ???
> - profit!
Thanks for writing this up, Rob. This is certainly a more ambitious scale
of application to deploy than we ever envisioned in the early days of
Heat ;) But I firmly believe that what is good for TripleO will be great
for the rest of our users too. All of the observed issues mentioned are
things we definitely want to address.

I have a few questions about the specific architecture being proposed.
It's not clear to me what you mean by "call-stack style" in referring to
the current paradigm. Maybe you could elaborate on how the current style
and the "convergence style" differ.
Specifically, I am not clear on whether 'convergence' means:
(a) Heat continues to respect the dependency graph but does not stop
after one traversal, instead repeatedly processing it until (and even
after) the stack is complete; or
(b) Heat ignores the dependency graph and just throws everything
against the wall, repeating until it has all stuck.
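
To make the two readings concrete, here is a minimal Python sketch of each.
It is an illustration only, not Heat code: `converge_with_graph`,
`converge_without_graph`, `make_backend`, and the three-resource `DEPS`
example are all hypothetical, and the fake backend simply refuses to create
a resource whose dependencies do not exist yet. For brevity both loops stop
once everything exists, whereas reading (a) as described would keep going
after the stack is complete.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def converge_with_graph(deps, try_create, max_passes=10):
    """Reading (a): repeat passes, but only touch resources whose deps exist."""
    order = list(TopologicalSorter(deps).static_order())
    done = set()
    for _ in range(max_passes):
        for name in order:
            ready = all(d in done for d in deps.get(name, ()))
            if name not in done and ready and try_create(name, done):
                done.add(name)
        if len(done) == len(order):
            break
    return done


def converge_without_graph(names, try_create, max_passes=10):
    """Reading (b): ignore dependencies and retry everything until it sticks."""
    done = set()
    for _ in range(max_passes):
        for name in names:
            if name not in done and try_create(name, done):
                done.add(name)
        if len(done) == len(names):
            break
    return done


# Hypothetical stack: server needs port, port needs network.
DEPS = {"server": {"port"}, "port": {"network"}, "network": set()}


def make_backend(attempts):
    """Fake backend (assumption): creation fails until dependencies exist."""
    def try_create(name, done):
        attempts.append(name)
        return DEPS[name] <= done
    return try_create


graph_attempts, wall_attempts = [], []
graph_done = converge_with_graph(DEPS, make_backend(graph_attempts))
wall_done = converge_without_graph(
    ["server", "port", "network"], make_backend(wall_attempts)
)
print(len(graph_attempts), len(wall_attempts))
```

In this toy case both variants end with the same three resources, but the
graph-aware loop makes 3 backend calls while the graph-blind one makes 6,
since it keeps attempting resources whose dependencies are not there yet.
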
I also have doubts about the principle "Users should only need to
intervene with a stack when there is no right action that Heat can take
to deliver the current template+parameters". That sounds good in theory,
but in practice it's very hard to know when there is a right action Heat
can take and when there isn't. For example, there are innumerable ways to create
a template that can _never_ actually converge, and I don't believe
there's a general way we can detect that, only the hard way: one error
type at a time, for every single resource type. Offering users a way to
control how and when that happens allows them to make the best decisions
for their particular circumstances - and hopefully a future WFaaS like
Mistral will make it easy to set up continuous monitoring for those who
require it. (Not incidentally, it also gives cloud operators an
opportunity to charge their users in proportion to their actual
requirements.)
> This can be contrasted with many other existing attempts to design
> solutions which relied on keeping the basic internals of heat as-is
> and just tweaking things - an approach we don't believe will work -
> the issues arise from the current architecture, not the quality of the
> code (which is fine).
Some of the ideas that have been proposed in the past:
- Moving execution of operations on individual resources to a
distributed execution system using taskflow. (This should address the
scalability issue.)
- Updating the stored template in real time during stack updates - this
is happening in Juno btw. (This will solve the problem of inability to
ever recover from an update failure. In theory, it would also make it
possible to interrupt a running update and make changes.)
- Implementing a 'stack converge' operation that the user can trigger to
compare the actual state of the stack with the model and bring it back
into spec.
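
As a rough illustration of the last idea, a user-triggered converge step
could be sketched in Python as a diff between the model and the observed
state. This is an assumption about the shape of such an operation, not an
existing Heat API: `plan_converge` and the dictionary layout are
hypothetical, and real resources would need per-type logic to observe their
actual state.

```python
def plan_converge(model, actual):
    """Return the actions needed to bring `actual` back in line with `model`.

    Both arguments map a resource name to its properties (hypothetical layout).
    """
    actions = []
    for name, desired in sorted(model.items()):
        if name not in actual:
            actions.append(("create", name))   # missing from the real world
        elif actual[name] != desired:
            actions.append(("update", name))   # drifted from the model
    for name in sorted(actual.keys() - model.keys()):
        actions.append(("delete", name))       # exists but is not in the model
    return actions


model = {
    "network": {"cidr": "10.0.0.0/24"},
    "server": {"flavor": "m1.small"},
}
actual = {
    "server": {"flavor": "m1.tiny"},
    "orphan": {},
}
plan = plan_converge(model, actual)
print(plan)
```

Here the plan comes out as create `network`, update `server`, delete
`orphan`. The sketch only computes the plan; ordering the actions by
dependency and executing them is left out.
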
It would be interesting to see some analysis on exactly how these
existing attempts fall down in trying to fulfil the goals, as well as
the specific points at which the proposed implementation differs.
Depending on the answers to the above questions, this proposal could be
anything between a modest reworking of those existing ideas and a
complete re-imagining of the entire concept of Heat. I'd very much like
to find out where along that spectrum it lies :)
BTW, it appears that the schedule you're suggesting involves assigning a
bunch of people unfamiliar with the current code base and having them
complete a ground-up rearchitecting of the whole engine, all within the
Juno development cycle (about 3.5 months). This is simply not consistent
with reality as I have observed it up to this point.
cheers,
Zane.