[openstack-dev] Thoughts on OpenStack Layers and a Big Tent model

Flavio Percoco flavio at redhat.com
Thu Sep 25 08:50:03 UTC 2014


On 09/24/2014 07:55 PM, Zane Bitter wrote:
> On 18/09/14 14:53, Monty Taylor wrote:
>> Hey all,
>>
>> I've recently been thinking a lot about Sean's Layers stuff. So I wrote
>> a blog post which Jim Blair and Devananda were kind enough to help me
>> edit.
>>
>> http://inaugust.com/post/108
> 
> Thanks Monty, I think there are some very interesting ideas in here.
> 
> I'm particularly glad to see the 'big tent' camp reasserting itself,
> because I have no sympathy with anyone who wants to join the OpenStack
> community and then bolt the door behind them. Anyone who contributes to
> a project that is related to OpenStack's goals, is willing to do things
> the OpenStack way, and submits itself to the scrutiny of the TC deserves
> to be treated as a member of our community with voting rights, entry to
> the Design Summit and so on.
> 
> I'm curious how you're suggesting we decide which projects satisfy those
> criteria though. Up until now, we've done it through the incubation
> process (or technically, the new program approval process... but in
> practice we've never added a project that was targeted for eventual
> inclusion in the integrated release to a program without incubating it).
> Would the TC continue to judge whether a project is doing things the
> OpenStack way prior to inclusion, or would we let projects self-certify?
> What does it mean for a project to submit itself to TC scrutiny if it
> knows that realistically the TC will never have time to actually
> scrutinise it? Or are you not suggesting a change to the current
> incubation process, just a willingness to incubate multiple projects in
> the same problem space?
> 
> I feel like I need to play devil's advocate here, because overall I'm
> just not sure I understand the purpose of arbitrarily - and it *is*
> arbitrary - declaring "Layer #1" to be anything required to run
> Wordpress. To anyone whose goal is not to run Wordpress, how is that
> relevant?
> 
> Speaking of arbitrary, I had to laugh a little at this bit:
> 
>   Also, please someone notice that the above is too many steps and
>   should be:
>
>     openstack boot gentoo on-a 2G-VM with-a publicIP with-a 10G-volume
>       call-it blog.inaugust.com
> 
> That's kinda sorta exactly what Heat does ;) Minus the part about
> assuming there is only one kind of application, obviously.
> 
> 
> I think there are a number of unjustified assumptions behind this
> arrangement of things. I'm going to list some here, but I don't want
> anyone to interpret this as a personal criticism of Monty. The point is
> that we all suffer from biases - not for any questionable reasons but
> purely as a result of our own experiences, who we spend our time talking
> to and what we spend our time thinking about - and therefore we should
> all be extremely circumspect about trying to bake our own mental models
> of what OpenStack should be into the organisational structure of the
> project itself.
> 
> * Assumption #1: The purpose of OpenStack is to provide a Compute cloud
> 
> This assumption is front-and-centre throughout everything Monty wrote.
> Yet this wasn't how the OpenStack project started. In fact there are now
> at least three services - Swift, Nova, Zaqar - that could each make
> sense as the core of a standalone product.
> 
> Yes, it's true that Nova effectively depends on Glance and Neutron (and
> everything depends on Keystone). We should definitely document that
> somewhere. But why does it make Nova special?
> 
> * Assumption #2: Yawnoc's Law
> 
> Don't bother Googling that, I just made it up. It's the reverse of
> Conway's Law:
> 
>   Infra engineers who design governance structures for OpenStack are
>   constrained to produce designs that are copies of the structure of
>   Tempest.
> 
> I just don't understand why that needs to be the case. Currently, for
> understandable historic reasons, every project gates against every other
> project. That makes no sense any more, completely independently of the
> project governance structure. We should just change it! There is no
> organisational obstacle to changing how gating works.
> 
> Even this proposal doesn't entirely make sense on this front - e.g.
> Designate requires only Neutron and Keystone... why should Nova, Glance
> and every other project in "Layer 1" gate against it, and vice-versa?
> 
> I suggested in another thread[1] a model where each project would
> publish a set of tests, each project would decide which sets of tests to
> pull in and gate on, and Tempest would just be a shell for setting up
> the environment and running the selected tests. Maybe that idea is crazy
> or at least needs more work (it certainly met with only crickets and
> tumbleweeds on the mailing list), but implementing it wouldn't require
> TC intervention and certainly not by-laws changes. It just requires...
> implementing it.
> 
> Perhaps the idea here is that by designating "Layer 1" the TC is
> indicating to projects which other projects they should accept gate test
> jobs from (a function previously fulfilled by Incubation). I'd argue
> that this is a very bad way to do it, because (a) it says nothing to
> projects outside of "Layer 1" how they should decide, and (b) it jumps
> straight to the TC mandating the result without even letting the
> projects try to sort it out amongst themselves.
> 
> For example, I would actually prefer that Nova not gate against Heat
> because Nova is pretty unlikely to break us and the trade-off of putting
> us in a position to accidentally break them is not worth it. No edict
> from the TC required. On the other hand, I would push very strongly for
> all of the python-*client libraries to gate against both Heat and
> Horizon, because they can easily break us - and if they break us,
> they're probably breaking other users out there too, so I'm confident I
> could convince people that this would be mutually beneficial. (It could
> potentially even extend so far as running the unit tests of Heat and
> Horizon in the client gates, to avoid issues like [2].)
> 
> [1] http://lists.openstack.org/pipermail/openstack-dev/2014-September/045446.html
> [2] http://lists.openstack.org/pipermail/openstack-dev/2014-September/046686.html
> 
> 
> * Assumption #3: The world is static
> 
> This is a giant red flag:
> 
>   "the set of things in Layer #1 should never change -- unless we
>    refactor something already in Layer #1 into a new project."
> 
> There is no greater act of hubris than to stick a stake in the ground
> and declare that "we will never know more than we do at this moment;
> we'll only get dumber from here, so we must precommit to all of our
> future decisions based on the information we have at present".
> 
> What if, for example, Nova wanted to add a dependency on Zaqar? They'd
> be prevented from doing so because Zaqar is not used by Wordpress. How
> is that relevant? A rigid ban on dependencies is a death knell for
> innovation.
> 
> Can you really never imagine a time where it might be better to run
> Wordpress on a container service rather than a full-fledged VM? I guess
> that's OK but only as long as it starts in Nova and then gets split out?
> Because... nova-core don't have enough to do?
> 
> And none of this is any help at all to projects outside of "Layer 1",
> because they get no guidance at all on what makes sense to depend on.
> This is already hurting with our current system (for example, Mistral is
> implementing a bunch of notification stuff that should properly be
> delegated to Zaqar, and in fact as of 6 months ago it was the
> centrepiece of the design), and the TC abdicating all interest in the
> subject will make it even worse.
> 
> * Assumption #4: The sky is falling
> 
> From reading openstack-dev, it's pretty clear that both the QA and Nova
> programs are facing a scaling crisis of sorts. It's easy to see why
> anybody deeply involved with either or both of those two would indeed
> think that radical change is required. I'm not sure, however, that the
> same sense of crisis pervades all of the other projects. We all have a
> lot of work to do, but I suspect that most projects would say that they
> are trucking along nicely. Meanwhile, the proposal is to change pretty
> much everything about how OpenStack is organised *except* QA and Nova
> (in fact, it creates incentives to stick even more stuff inside Nova),
> which remain sacrosanct. That doesn't seem like attacking the problem at
> its source.
> 
> 
> So we've identified the minimum set of OpenStack services required to
> sensibly run Wordpress. Awesome! Somebody should totally write a blog
> post about that. But officially and permanently baking that in as the
> structure of the OpenStack project? I hate to use the c-word, but the
> bottom line is that "Layer 1" just resurrects Core with a pretext to
> finally kick Swift out. That seems particularly ironic, because I would
> pay good money to be a fly on the wall in a board meeting where anyone
> but Monty proposed such a thing in those terms, just to watch his
> reaction. Given that the TC informed the DefCore committee that it
> regarded everything that has graduated to the integrated release as the
> "designated sections" for DefCore purposes and told them to go do their
> own dirty work, you can bet your last dollar that this will be
> interpreted as a TC endorsement for permanently excluding Swift - and
> all the other non-"Layer 1" projects - from the designated sections. In
> fact, by removing only those tests from Tempest it's likely to have the
> side-effect of eliminating them from RefStack altogether.
> 
> 
> Let's sum up, first by looking at a list of questions that developers,
> distributors, operators and users might ask about a project:
> 
> 1) Are they "one of us"?
> 2) Should I gate against it?
> 3) Can I add a dependency on it?
> 4) Should this be widely distributed as part of OpenStack?
> 5) Can I use this knowing that the API will be somewhat stable?
> 6) Should this be used at scale in production?
> 
> 
> Here's how the TC is answering those questions at the moment:
> 
> 1) New program acceptance + incubation or adoption processes
> 2) Incubation process
> 3) Graduation process
> 4) Graduation process
> 5) Graduation process
> 6) You're on your own
> 
> Here are Monty's answers:
> 
> 1) ???
> 2) No
> 3) No
> 4) You're on your own
> 5) You're on your own?
> 6) "CERN test"
> 
> Both of those feel unsatisfactory in different ways. Monty's suggestions
> seem like an overly radical change to me; I would like to try something
> a bit more incremental to give us the chance to see how the community
> adapts:
> 
> 1) Incubation process (much lower bar)

IMHO, what's really broken in our incubation process is not the
requirements themselves but how we think about it. As far as my
experience goes, the incubation process has been something like:

	"OpenStack likes the project goals and thinks it may be a good thing
	to have in the future. Now, with those few things you already have,
	go and make it happen... See you in 6 months"

Whereas I think the whole incubation period should be a guided period
where the team and project are led through the process until the project
is mature enough to be part of things like integrated release and whatnot.

What I'd rather have is a process where the mission, vision and goals of
the project are approved and then the project team works towards those
goals along a path that has been agreed on. For example (non-exhaustive
list):

1. Work on the design spec
   - Stamp the project as having a reasonable design that follows the
     guidelines OpenStack has
2. Develop the first POC and work on an API
   - Stamp the project as having an API that sounds reasonably stable
3. Add jobs in the gate, devstack support, etc.
   - Stamp the project as gated
4. Work on a client library
5. Clean up the API and stabilize it
   - Stamp the project as "consumable" by other services
6. ....
7. ....

Once the items on that list have been marked off, the project can go
through the final graduation review. The above reduces (removes?) the
chance of "surprises" when the graduation review happens and should make
the graduation review fairly simple. In addition, it encourages adoption
of the project *before* it graduates, once the project has reached the
point of "API maturity" and needs field-testing.
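
As a rough illustration of the "stamps" idea, here is a minimal sketch,
assuming made-up milestone names that loosely follow the list above
(`MILESTONES`, `missing_stamps` and `ready_for_graduation_review` are
hypothetical, not part of any existing governance tooling). The point is
only that the final review becomes a check that nothing is missing:

```python
# Hypothetical milestone names, loosely following the list above.
MILESTONES = [
    "design-approved",
    "api-reasonably-stable",
    "gated",
    "consumable",
]


def missing_stamps(stamps):
    """Return the milestones a project has not been stamped for, in order."""
    return [m for m in MILESTONES if m not in stamps]


def ready_for_graduation_review(stamps):
    """A project reaches the final review only once every stamp is collected."""
    return not missing_stamps(stamps)


stamps = {"design-approved", "api-reasonably-stable", "gated"}
print(missing_stamps(stamps))               # ['consumable']
print(ready_for_graduation_review(stamps))  # False
```

With something like this, both the project team and the TC can see at
any point which step is still open, which is where the "no surprises"
property would come from.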

For the above to be feasible, the TC will definitely need to be
re-structured as you've mentioned below and as others, myself included,
have suggested in other emails.


> 2) Do your own cost/benefit analysis
> 3) Graduation process
> 4) Graduation process (maintain high bar, but less capricious)

To keep the bar high or even make it higher, we need a better incubation
process. Expecting projects to jump from underground to
super-high-graduation-level without any kind of guidance has proven to
be bad (and I'm not just referring to Zaqar).

> 5) Graduation process
> 6) TC/UC production-readiness review

I think this could happen before the graduation review if the incubation
process is restructured into something more like a "growing-up" plan
where early adoption is encouraged.


> Finally, since the motivation for change is that we think the current
> structure isn't scaling, let's examine the individual things that are
> currently pain points:
> 
> * Continuous Integration
> 
> We all agree that the gate doesn't scale. I submit that it doesn't scale
> because it tests every project against every other project, and that
> kicking projects out of the gate not only fails to solve the problem in
> the long term (since the projects that _are_ in will continue to grow),
> but also ignores the actual risks that the gate is meant to guard
> against in favour of an arbitrary designation.
> 
> We should scale the gate by only gating projects against other projects
> where the benefit in reduced risk outweighs the cost in increased risk
> of false negatives. For projects that don't depend on each other at all,
> the benefit is precisely zero (beyond the install-only gate suggested by
> Monty, which I support). We should apply the same cost-benefit
> calculation regardless of how involved the projects in question are with
> running Wordpress, and we should let projects themselves decide what to
> gate against in the first instance, with the TC only stepping in in the
> event that consensus can't be reached by other means.
> 
> * Documentation
> 
> This is a tricky one, and not an area of OpenStack that I am an expert
> on. It does seem to me that the only real solution is to make projects
> more responsible for their own documentation. Arbitrarily splitting
> projects into a category where they're not responsible at all and a
> category where they're completely on their own doesn't seem like a good
> solution.
> 
> * Release Management
> 
> This is something we have not really even attempted to scale beyond
> Thierry. As a first step, there is no real organisational obstacle to
> having a different release manager for incubated projects than for
> integrated projects, it's more a matter of making it known to either the
> Foundation or the various companies who employ contributors that we need
> one. I don't want to make that process sound trivial, but I'm confident
> that the release management program could handle it, and I think we
> should at least give them a chance to try before pre-emptively kicking
> anything non-Wordpress-related out of the release forever.
> 
> * Technical Committee
> 
> It is inevitable that we will reach a point where the Technical
> Committee itself does not scale. I'm surprised, because I thought that
> was a ways off, but after watching the latest Zaqar fiasco I think we
> have to consider the possibility that we have reached that point already.
> 
> Perhaps we should consider having subcommittees, maybe based on the
> groupings identified by John (Dickinson), possibly comprised of the
> relevant PTLs plus a representative of the TC. These subcommittees would
> do the legwork of investigating new projects making their way through
> the incubation/graduation process and report summaries and
> recommendations to the TC.
> 

I pretty much agree with all the above!

Flavio

-- 
@flaper87
Flavio Percoco


