[openstack-dev] [TripleO] Kicking TripleO up a notch

Robert Collins robertc at robertcollins.net
Thu Oct 3 23:52:11 UTC 2013


On 4 October 2013 11:03, James Slagle <james.slagle at gmail.com> wrote:
> On Tue, Oct 1, 2013 at 4:37 AM, Robert Collins
>>

>> In the call we had, we agreed that this approach makes a lot of sense,
>> and spent a bunch of time talking through the ramifications on TripleO
>> and Ironic, and sketched out one way to slice and dice things;
>> https://docs.google.com/drawings/d/1kgBlHvkW8Kj_ynCA5oCILg4sPqCUvmlytY5p1p9AjW0/edit?usp=sharing
>> is the diagram we came up with.
>
> Phase 0...makes sense.
>
> A couple of questions about the other phases:
> What is "Persistent Overcloud with CD" in TripleO Phase 1?
> Is that where the overcloud gets upgraded on each commit, vs torn down
> and redeployed?

Yes indeed!

> I'd take it this is the image based upgrade approach where we'd need
> the read-only /, and
> storage somewhere for the persistent data support that has been
> previously discussed?

Read-only / would be nice but isn't in the MVP for it; persistent data
is crucial however.

> If one of the other goals of the MVP of Phase 1 is to stop causing API
> downtime during
> upgrades, then this implies an HA Overcloud?  I believe that also
> implies that we'd need support
> across the upstream Openstack projects of different versions of the same service
> being compatible (to an extent).  Meaning, if we have a HA Overcloud
> with 2 Control nodes, and
> we bring one of the nodes down and upgrade Nova to a newer version,
> when we start
> the upgraded node again, the 2 running Novas need to be
> interoperable.  AIUI,
> this type of support is still not ready in most projects.  But, I
> guess that's why this is phase 1
> and not 0 :).

Yes, and yes, and yes! :)

> In Phase 2, does  Undercloud CD also imply persistent Undercloud?  I'm
> guessing yes, since
> the Overcloud couldn't stay persistent if its undercloud was destroyed.

Totally, thanks for calling these things out; suggests to me we may
want to split into a few more phases, and tighten up the descriptions.

>> then start tackling CD of its
>> infrastructure, then remove the seed.
>
> Removing the seed and starting with the undercloud is one of the areas
> I've looked
> at, with the goal being making it easier to bootstrap an undercloud
> for the folks working
> on Tuskar.  I know I've pointed out these things before, but I wanted
> to again here.  I'm not
> sure if these efforts align with the long term vision of "removing the
> seed", or what
> exactly the plan is around that.  I just want to make folks aware of
> these, so as to
> avoid duplication if similar paths are chosen.

I don't think they align all that much, but OTOH I think they are
useful things in their own right.

> First, there's the undercloud-live effort to build a live usb image of
> an undercloud that people can boot, and install if they choose to.
> https://github.com/agroup/undercloud-live
>
> Second, undercloud-live makes use of some other python code I worked
> on to apply d-i-b
> elements to the current system, as opposed to a chroot.  This is the
> work I mentioned
> in Seattle (still working on a patch for d-i-b proper for this code
> btw).  For now, it's at:
> https://github.com/agroup/python-dib-elements/
>
> undercloud-live is Fedora based at the moment, because we wanted to integrate
> it with the Fedora build toolchain easily.

Yeah, and that's cool.

>> Ramifications:
>>  - long term a much better project health and responsiveness to
>> changing user needs.
>>  - may cause disruption in the short term as we do what's needed to get
>> /something/ working.
>
> I *think* this is a fair trade off.  Though, I'm not sure I understand
> the short term
> disruption.  Do you just mean there won't be as many people focusing on devtest
> and the low level tooling because instead they're focused on the CD environment?

And that for instance we probably won't catch nova-bm regressions as
often because we'll be leaving the CD environment running for a period
of time, only retooling that when we get to phase 2.

>>  - will need community buy-in and support to make it work : two of the
>> key things about working Lean are keeping WIP - inventory - low and
>> ensuring that bottlenecks are not used for anything other than
>> bottleneck tasks. Both of these things impact what can be done at any
>> point in time within the project: we may need to say 'no' to proposed
>> work to permit driving more momentum as a whole... at least in the
>> short term.
>
> Can you give some examples of something that might be said no to?

So for instance, let's say someone wanted to switch us to Ironic next
week. That would disrupt the effort to get the overcloud doing CD; it
would slow velocity on that. It's a good thing to do, but we should do
that when we can, as a team, focus on dealing with the side effects
together, effectively. Or switching devtest to using tuskar, likewise
- long term totally do it, but let's fit it in so we don't cause
extended disruption or, crucially, context switching amongst folk
working on whatever is the current critical path.

> In my head, I read that as "refactoring or new functionality that is likely to
> break stuff that works now".

Yes.

> Some of the high level things that are important to me now are:
>
> - getting fixes committed to any of the repos that correct issues that
>    are causing things to not work as intended
> - new d-i-b elements for new functionality
> - minor changes to existing d-i-b elements, things like making something
>   more configurable if needed, fixing installation issues, etc
> - perhaps new heat templates for additional deployment scenarios (as opposed
>   to changes to existing heat templates)
>
> Do you see anything like that suffering?

I don't expect so.

> Like you say below, it's open source, so people can still work on what
> they want to :).  And in that regard, the things I mentioned above might
> only really suffer if there is suddenly a much longer turn around time on
> reviews, upstream feedback, etc.

Right. I guess one thing we can do is make it clear - say things like
this in a review:
"This is good stuff, but it's not currently on the team kanban, so
you'll need to make sure it comes in with no API changes / in a very
incremental and safe fashion..."

>> Our long term steady state should be a small amount of category 2 work
>> and a lot of category 3 with no category 1; but to get there we have
>> to go through a crucible where it will be all category 1 and category
>> 2: we should expect all forward momentum to stop while we get our
>
> I'd classify forward momentum recently as polishing the devtest story,
> and working
> on the tooling to do so.  So, maybe that is set aside for a moment while
> the CD environment is brought up.

Exactly.

> However, I think that having a working devtest is important.
> devtest can be quite daunting to a newcomer, but, a nice thing about it
> is that it gives people not familiar with tripleo and new contributors
> a place to
> get started.   And, I think that's important for the community.

It is; we shouldn't deliberately break it, but the reality is we're
spending a huge chunk of time dealing with other project fragility
around it - pip/pbr/nova/keystone/neutron have all broken devtest; and
as a result the actual thing we want to /deliver/ hasn't had nearly
enough cycles. I dunno, it's a hard call to make. Thus this being an
experiment; let's try it for a month and review our status after that?
...

>> system as a whole than push forward something we can't use yet.
>
> I'll say that I really like tracking stuff in trello.  I think the
> reality is that there are going
> to be some well defined project goals (like you're doing here), and
> probably other
> people (or groups of people) within the community may have slightly
> different goals.
>
> Not saying that those are necessarily going to conflict.  Just that there may
> be other stuff that folks are trying to accomplish.  The more
> stuff like that that can be shared in a public trello for tripleo, the
> better for
> everyone.

Ack!

-Rob

-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud


