[openstack-dev] [Heat] Convergence prototyping

Anant Patil anant.patil at hp.com
Tue Oct 28 07:37:55 UTC 2014


On 23-Oct-14 23:40, Zane Bitter wrote:
> Hi folks,
> I've been looking at the convergence stuff, and become a bit concerned 
> that we're more or less flying blind (or at least I have been) in trying 
> to figure out the design, and also that some of the first implementation 
> efforts seem to be around the stuff that is _most_ expensive to change 
> (e.g. database schemata).
> 
> What we really want is to experiment on stuff that is cheap to change 
> with a view to figuring out the big picture without having to iterate on 
> the expensive stuff. To that end, I started last week to write a little 
> prototype system to demonstrate the concepts of convergence. (Note that 
> none of this code is intended to end up in Heat!) You can find the code 
> here:
> 
> https://github.com/zaneb/heat-convergence-prototype
> 
> Note that this is a *very* early prototype. At the moment it can create 
> resources, and not much else. I plan to continue working on it to 
> implement updates and so forth. My hope is that we can develop a test 
> framework and scenarios around this that can eventually be transplanted 
> into Heat's functional tests. So the prototype code is throwaway, but 
> the tests we might write against it in future should be useful.
> 
> I'd like to encourage anyone who needs to figure out any part of the 
> design of convergence to fork the repo and try out some alternatives - 
> it should be very lightweight to do so. I will also entertain pull 
> requests (though I see my branch primarily as a vehicle for my own 
> learning at this early stage, so if you want to go in a different 
> direction it may be best to do so on your own branch), and the issue 
> tracker is enabled if there is something you want to track.
> 

We are working on a PoC for convergence and have some patches lined up
for review under the convergence-poc topic. We planned for the changes
to be incremental to the existing design, rather than prototyping them
separately, to make it easier for everyone to understand what we are
trying to achieve and to assess what it takes to do it (in terms of the
amount of change).

The functional tests are going to be great; we can all move with
confidence once they are in place.

> I have learned a bunch of stuff already:
> 
> * The proposed spec for persisting the dependency graph 
> (https://review.openstack.org/#/c/123749/1) is really well done. Kudos 
> to Anant and the other folks who had input to it. I have left comments 
> based on what I learned so far from trying it out.
> 
> 
> * We should isolate the problem of merging two branches of execution 
> (i.e. knowing when to trigger a check on one resource that depends on 
> multiple others). Either in a library (like taskflow) or just a separate 
> database table (like my current prototype). Baking it into the 
> orchestration algorithms (e.g. by marking nodes in the dependency graph) 
> would be a colossal mistake IMHO.
> 
> 
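
To make the "separate table" option above concrete, here is a minimal
sketch of tracking branch merges outside the dependency graph. It is
not taken from the prototype or from Heat: SyncTable, mark_complete and
the resource names are hypothetical, and an in-memory dict stands in
for the database table.

```python
class SyncTable:
    """Hypothetical stand-in for a separate table that records which
    prerequisites of each resource have finished."""

    def __init__(self, requires):
        # requires: resource name -> names of the resources it depends on
        self._requires = {name: set(deps) for name, deps in requires.items()}
        self._satisfied = {name: set() for name in requires}

    def mark_complete(self, finished):
        """Record that `finished` is done and return the resources whose
        prerequisites are now all satisfied."""
        ready = []
        for name, needs in self._requires.items():
            if finished in needs:
                self._satisfied[name].add(finished)
                if self._satisfied[name] == needs:
                    ready.append(name)
        return sorted(ready)


# "server" depends on both "port" and "volume" (illustrative names only).
table = SyncTable({"port": [], "volume": [], "server": ["port", "volume"]})
first = table.mark_complete("port")     # server still waits on volume
second = table.mark_complete("volume")  # both branches have now arrived
```

The dependency graph itself is never marked here; only the side table
changes as branches finish, so the merge logic could be swapped for a
library without touching the graph.
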
> * Our overarching plan is backwards.
> 
> There are two quite separable parts to this architecture - the worker 
> and the observer. Up until now, we have been assuming that implementing 
> the observer would be the first step. Originally we thought that this 
> would give us the best incremental benefits. At the mid-cycle meetup we 
> came to the conclusion that there were actually no real incremental 
> benefits to be had until everything was close to completion. I am now of 
> the opinion that we had it exactly backwards - the observer 
> implementation should come last. That will allow us to deliver 
> incremental benefits from the observer sooner.
> 
> The problem with the observer is that it requires new plugins. (That 
> sucks BTW, because a lot of the value of Heat is in having all of these 
> tested, working plugins. I'd love it if we could take the opportunity to 
> design a plugin framework such that plugins would require much less 
> custom code, but it looks like a really hard job.) Basically this means 
> that convergence would be stalled until we could rewrite all the 
> plugins. I think it's much better to implement a first stage that can 
> work with existing plugins *or* the new ones we'll eventually have with 
> the observer. That allows us to get some benefits soon and further 
> incremental benefits as we convert plugins one at a time. It should also 
> mean a transition period (possibly with a performance penalty) for 
> existing plugin authors, and for things like HARestarter (can we please 
> please deprecate it now?).
> 
> So the two phases I'm proposing are:
>   1. (Workers) Distribute tasks for individual resources among workers; 
> implement update-during-update (no more locking).
>   2. (Observers) Compare against real-world values instead of template 
> values to determine when updates are needed. Make use of notifications 
> and such.
> 
> I believe it's quite realistic to aim to get #1 done for Kilo. There 
> could also be a phase 1.5, where we use the existing stack-check 
> mechanism to detect the most egregious divergences between template and 
> reality (e.g. whole resource is missing should be easy-ish). I think 
> this means that we could have a feasible Autoscaling API for Kilo if 
> folks step up to work on it - and in any case now is the time to start 
> on that to avoid it being delayed more than it needs to be based purely 
> on the availability of underlying features. That's why I proposed a 
> session on Autoscaling for the design summit.
> 
> 

IMHO, Autoscaling and convergence are two different things.

> * This thing is going to _hammer_ the database
> 
> The advantage is that we'll be able to spread the access across an 
> arbitrary number of workers, but it's still going to be brutal because 
> there's only one database. We'll need to work hard to keep DB access to 
> the minimum we can.
> 

There are going to be many queries to the DB, but I think they will be
small and quick rather than bulky and time-consuming. We can achieve
simplicity and scalability once we decouple the resource from the stack
and the entire template. This is one of the reasons the persist-graph
spec talks about persisting resource definitions. The template for each
resource is self-contained, and with the graph in the DB, any engine can
pull the definition of the resource and its dependent resources to find
the desired properties.
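
As a rough illustration of the idea (not the schema from the spec), the
sketch below uses plain dicts as a stand-in for the DB. The names
RESOURCE_DEFINITIONS, REQUIRES and load_for_worker are hypothetical, and
I am assuming "dependent resources" means the resources a given resource
directly requires.

```python
# Hypothetical stand-in for per-resource definitions persisted in the DB.
RESOURCE_DEFINITIONS = {
    "network": {"type": "OS::Neutron::Net",
                "properties": {"name": "net0"}},
    "subnet": {"type": "OS::Neutron::Subnet",
               "properties": {"cidr": "10.0.0.0/24"}},
    "server": {"type": "OS::Nova::Server",
               "properties": {"flavor": "m1.small"}},
}

# Hypothetical stand-in for the persisted graph:
# resource name -> names of the resources it directly requires.
REQUIRES = {
    "network": [],
    "subnet": ["network"],
    "server": ["subnet"],
}


def load_for_worker(name):
    """Return only the definitions needed to work on one resource:
    its own and those of the resources it directly requires."""
    needed = [name] + REQUIRES[name]
    return {n: RESOURCE_DEFINITIONS[n] for n in needed}


# An engine working on "server" reads two small rows, not the whole
# template.
server_view = load_for_worker("server")
```

Each lookup touches a handful of rows, which is the kind of small,
quick query I have in mind.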

> 
> That's it for now. I'm going to continue hacking on this, though don't 
> expect much progress before the Summit. I'd be happy to hear any 
> feedback that anybody has, and I hope this might prove a useful platform 
> for other folks to experiment with too.
> 
> cheers,
> Zane.
> 
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> 

IMHO, we should discuss implementable specs so that we are all aligned
on a thought process that makes it easier for us to implement, review
the changes, and give comments.

This prototyping is a great exercise that should help us collaborate on
the specs, where we can all give feedback/comments based on our
understanding.

- Anant


