[openstack-dev] [tripleo] Setting up to 3rd party CI OVB jobs

James Slagle james.slagle at gmail.com
Wed Oct 12 18:36:23 UTC 2016


On Wed, Oct 12, 2016 at 1:32 PM, Dan Prince <dprince at redhat.com> wrote:
> On Fri, 2016-10-07 at 09:03 -0400, Paul Belanger wrote:
>> Greetings,
>>
>> I wanted to propose a work item, that I am happy to spearhead, about
>> setting up
>> a 3rd party CI system for tripleo project. The work I am proposing,
>> wouldn't
>> actually affect anything today about tripleo-ci but provide a
>> working example
>> of how 3rd party CI will work and potential migration path.
>>
>> This is just one example of how it would work, obviously everything
>> is open for
>> discussions but I think you'll find the plan to be workable.
>> Additionally, this
>> topic would only apply to OVB jobs, existing jobs already running on
>> cloud
>> providers from openstack-infra would not be affected.
>
>
> The plan you describe here sounds reasonable. Testing out a 3rd party
> system in parallel to our existing CI causes no harm and certainly
> allows us to evaluate things and learn from the new setup.
>
> A couple of things I would like to see discussed a bit more (either
> here or in a new thread if deemed unrelated) are how do we benefit from
> these changes in making the OVB jobs 3rd party.
>
> There are at least 3 groups who likely care about this along with how
> this benefits them:
>
> -the openstack-infra team:
>
>   * standardization: doesn't have to deal with special case OVB clouds
>
> -the tripleo OVB cloud/CI maintainers:
>
>   * Can manage the 3rd party cloud how they like it. Using images or whatever with less regard for openstack-infra compatibility.
>
> -the tripleo core team:
>
>   * The OVB jobs are mostly the same. The maintenance is potentially
> further diverging from upstream though. So is there any benefit to 3rd
> party for the core team? Unclear to me at this point. The OVB jobs are
> still the same. They aren't running any faster than they are today. The
> maintenance of them might even get harder for some due to the fact that
> we have different base images across our upstream infra multinode jobs
> and what we run via the OVB 3rd party testing.

Benefits I see include:

- moving to 3rd party CI potentially means moving the jobs to a
different cloud that the core team doesn't have to maintain. If we
move tripleo-test-cloud-rh2 to rdoproject.org's nodepool instead of
infra's, I think that is a step in the direction of having the
rdoproject.org team maintain this cloud (eventually as part of RDO
cloud) in the future. That would free up those responsibilities from
the tripleo-core team. There may be some on the tripleo-core team that
still want to participate in maintaining that cloud, and they should
feel free to do so as I don't think we're drawing strong
organizational lines here. But, ultimately I would like to see the
tripleo-core team not be on the hook for maintaining a multi
region/site public cloud(s). That frees up the core team to do
development work, reviews, actually fix real CI failures, etc.

- using custom images would be done so that the jobs do indeed run
faster. So, those 2 of your points contradict each other a bit. While
it would be more maintenance, we would be doing it so that the jobs
run faster.

- 3rd party CI jobs can actually vote in the check queue whereas
current tripleo-ci OVB jobs cannot vote at all.

>
> ----
>
> The tripleo-ci end-to-end test jobs have always fallen into the high
> maintenance category. We've only recently switched to OVB and one of
> the nice things about doing that is we are using something much closer
> to stock openstack vs. our previous CI cloud. Sure there are some OVB
> configuration differences to enable testing of baremetal in the cloud
> but we are using more OpenStack to drive things. So by simply using
> more OpenStack within our CI we should be more closely aligning with
> infra. A move in the right direction anyway.
>
> Going through all this effort I really would like to see all the teams
> gain from the effort. Like, for me the point of having upstream
> tripleo-ci tests is that we catch breakages. Breakages that no other
> upstream projects are catching. And the solution to stopping those
> breakages from happening isn't IMO to move some of the most valuable CI
> tests into 3rd party. That may cover over some of the maintenance rubs
> in the short/mid term perhaps. But I view it as a bit of a retreat in
> where we could be with upstream testing.
>
> So rather than just taking what we have in the OVB jobs today and
> making the same, long running (1.5 hours +) CI job (which catches lots
> of things) could we re-imagine the pipeline a bit in the process so we
> improve this? I guess my concern is we'll go to all the trouble to move
> this and we'll actually negatively impact the speed with which the
> tripleo core team can land code instead of increasing it. I guess what
> I'm asking is in doing this move can we raise the bar for TripleO core
> any too?

The full end to end tests are valuable and we definitely need them.

But, and this may be quite controversial, my thinking has evolved to
where I'm not convinced we need multiple full end to end tests on
every single TripleO patch.

Personally I'd like to see the TripleO core team be able to focus more
on actually testing TripleO projects. I think that is the primary way
we will be able to increase the velocity at which we are able to land
TripleO patches. That's opposed to trying to optimize how quickly we
can do a full end to end test (which is a never ending losing battle
as the entire system is continually made slower overall).

Features like containerization support could go a long way in helping
us bootstrap tests using last known good artifacts from the last
successful end to end test. That could allow us to have jobs that
focus more on just testing individual changes more efficiently.

There would be a lot of work to be done to optimize our jobs running
on infra's nodepool spawned instances, add new jobs, use pipelines,
etc. Some of the features we would likely need would not be present
until Zuul v3 (custom pipelines perhaps?).

For me though, the path forward to do that is to continue to
drive tighter integration with infra's nodepool/zuul via the multinode
job efforts. OVB jobs, which require their own unique cloud, are an
unnatural fit for infra's nodepool. Conversely, OVB jobs are a natural
fit for 3rd party CI.

So, I feel like more naturally aligning our CI efforts with the rest
of the infra community is the most reasonable path to make significant
forward progress that can scale our CI and improve how quickly the
core team can actually land TripleO code.

You're absolutely right though that just moving OVB to 3rd party does
not solve any of that. In fact, it creates more work initially in that
someone has to set up the 3rd Party infrastructure. Thankfully Paul
has volunteered to do that :).

-- 
-- James Slagle
--


