[openstack-dev] [Heat] Convergence prototyping

Clint Byrum clint at fewbar.com
Sun Oct 26 00:35:04 UTC 2014


Excerpts from Zane Bitter's message of 2014-10-23 11:10:29 -0700:
> Hi folks,
> I've been looking at the convergence stuff, and become a bit concerned 
> that we're more or less flying blind (or at least I have been) in trying 
> to figure out the design, and also that some of the first implementation 
> efforts seem to be around the stuff that is _most_ expensive to change 
> (e.g. database schemata).
> 
> What we really want is to experiment on stuff that is cheap to change 
> with a view to figuring out the big picture without having to iterate on 
> the expensive stuff. To that end, I started last week to write a little 
> prototype system to demonstrate the concepts of convergence. (Note that 
> none of this code is intended to end up in Heat!) You can find the code 
> here:
> 
> https://github.com/zaneb/heat-convergence-prototype
> 
> Note that this is a *very* early prototype. At the moment it can create 
> resources, and not much else. I plan to continue working on it to 
> implement updates and so forth. My hope is that we can develop a test 
> framework and scenarios around this that can eventually be transplanted 
> into Heat's functional tests. So the prototype code is throwaway, but 
> the tests we might write against it in future should be useful.
> 
> I'd like to encourage anyone who needs to figure out any part of the 
> design of convergence to fork the repo and try out some alternatives - 
> it should be very lightweight to do so. I will also entertain pull 
> requests (though I see my branch primarily as a vehicle for my own 
> learning at this early stage, so if you want to go in a different 
> direction it may be best to do so on your own branch), and the issue 
> tracker is enabled if there is something you want to track.
> 
> I have learned a bunch of stuff already:
> 
> * The proposed spec for persisting the dependency graph 
> (https://review.openstack.org/#/c/123749/1) is really well done. Kudos 
> to Anant and the other folks who had input to it. I have left comments 
> based on what I learned so far from trying it out.
> 
> 
> * We should isolate the problem of merging two branches of execution 
> (i.e. knowing when to trigger a check on one resource that depends on 
> multiple others). Either in a library (like taskflow) or just a separate 
> database table (like my current prototype). Baking it into the 
> orchestration algorithms (e.g. by marking nodes in the dependency graph) 
> would be a colossal mistake IMHO.
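
Agreed on keeping that merge bookkeeping out of the orchestration code. For
anyone who hasn't poked at the prototype yet, the idea is roughly this (a toy
sketch with made-up names, using a dict where Heat would use a DB table, and
where the update would have to be atomic):

    # Toy sync-point sketch: one entry per resource that has multiple
    # predecessors, recording which branches haven't reported in yet.
    sync_points = {}

    def create_sync_point(resource, required):
        """Record which predecessors must finish before 'resource' is checked."""
        sync_points[resource] = set(required)

    def predecessor_done(resource, predecessor):
        """Mark one branch as merged; return True when all of them have."""
        outstanding = sync_points[resource]
        outstanding.discard(predecessor)
        return not outstanding

    # e.g. C depends on both A and B:
    create_sync_point('C', ['A', 'B'])
    predecessor_done('C', 'A')   # False - still waiting on B
    predecessor_done('C', 'B')   # True  - both branches merged, check C

Keeping that in its own table (or a library) means the traversal never has to
mutate the graph it is walking, which is exactly the mistake to avoid.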
> 
> 
> * Our overarching plan is backwards.
> 
> There are two quite separable parts to this architecture - the worker 
> and the observer. Up until now, we have been assuming that implementing 
> the observer would be the first step. Originally we thought that this 
> would give us the best incremental benefits. At the mid-cycle meetup we 
> came to the conclusion that there were actually no real incremental 
> benefits to be had until everything was close to completion. I am now of 
> the opinion that we had it exactly backwards - the observer 
> implementation should come last. That will allow us to start delivering 
> incremental benefits sooner.
> 
> The problem with the observer is that it requires new plugins. (That 
> sucks BTW, because a lot of the value of Heat is in having all of these 
> tested, working plugins. I'd love it if we could take the opportunity to 
> design a plugin framework such that plugins would require much less 
> custom code, but it looks like a really hard job.) Basically this means 
> that convergence would be stalled until we could rewrite all the 
> plugins. I think it's much better to implement a first stage that can 
> work with existing plugins *or* the new ones we'll eventually have with 
> the observer. That allows us to get some benefits soon and further 
> incremental benefits as we convert plugins one at a time. It should also 
> mean a transition period (possibly with a performance penalty) for 
> existing plugin authors, and for things like HARestarter (can we please 
> please deprecate it now?).
> 
> So the two phases I'm proposing are:
>   1. (Workers) Distribute tasks for individual resources among workers; 
> implement update-during-update (no more locking).
>   2. (Observers) Compare against real-world values instead of template 
> values to determine when updates are needed. Make use of notifications 
> and such.
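
To make the phase 2 half concrete, this is the kind of check I picture the
observers enabling (purely illustrative, not code from the prototype):

    # Decide whether a resource needs work by comparing its observed,
    # real-world state against the goal state from the template, rather
    # than against whatever the previous template said.

    def needs_update(observed_props, goal_props):
        if observed_props is None:     # resource has gone missing entirely
            return True
        return any(observed_props.get(k) != v for k, v in goal_props.items())

    goal = {'flavor': 'm1.small', 'image': 'fedora-20'}
    needs_update({'flavor': 'm1.large', 'image': 'fedora-20'}, goal)  # True
    needs_update(dict(goal), goal)                                    # False
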
> 
> I believe it's quite realistic to aim to get #1 done for Kilo. There 
> could also be a phase 1.5, where we use the existing stack-check 
> mechanism to detect the most egregious divergences between template and 
> reality (e.g. a whole resource going missing should be easy-ish to detect). I think 
> this means that we could have a feasible Autoscaling API for Kilo if 
> folks step up to work on it - and in any case now is the time to start 
> on that to avoid it being delayed more than it needs to be based purely 
> on the availability of underlying features. That's why I proposed a 
> session on Autoscaling for the design summit.
> 
> 
> * This thing is going to _hammer_ the database
> 
> The advantage is that we'll be able to spread the access across an 
> arbitrary number of workers, but it's still going to be brutal because 
> there's only one database. We'll need to work hard to keep DB access to 
> the minimum we can.
> 

I think it will just hammer the DB in different ways from the current
implementation. Currently, one must do a _lot_ of queries to parse a
stack. And to do almost any operation, one ends up parsing the stack.

Conversely, with everything being resource-oriented, operations will do
more DB queries and transactions, but each one will be smaller and more
like a single-key read/write.
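
Roughly the difference, with sqlite3 standing in for the real thing (table
and column names here are just illustrative, not Heat's actual schema):

    import sqlite3

    db = sqlite3.connect(':memory:')
    db.execute('CREATE TABLE resource'
               ' (id TEXT PRIMARY KEY, stack_id TEXT, status TEXT)')
    db.execute("INSERT INTO resource VALUES ('r1', 's1', 'IN_PROGRESS')")

    # Today: to act on anything, fetch every resource in the stack and
    # re-parse the whole thing, even when only one resource matters.
    whole_stack = db.execute(
        'SELECT * FROM resource WHERE stack_id = ?', ('s1',)).fetchall()

    # Resource-oriented: a worker does one small keyed read and one small
    # keyed write per step.
    row = db.execute('SELECT * FROM resource WHERE id = ?', ('r1',)).fetchone()
    db.execute("UPDATE resource SET status = 'COMPLETE' WHERE id = ?", ('r1',))
    db.commit()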

Also, the new design will lend itself toward sharding by resource, since
only the stack's graph needs to relate resources to one another and thus
has to live in a single DB. But that is _way_ down the road. I think with
a modest-sized database server/cluster one can likely run tens of thousands
of concurrent tiny reads/writes per second.


