[openstack-dev] [ci][infra][tripleo] Multi-staged check pipelines for Zuul v3 proposal

Bogdan Dobrelya bdobreli at redhat.com
Tue May 15 08:54:37 UTC 2018


On 5/14/18 10:06 PM, Alex Schultz wrote:
> On Mon, May 14, 2018 at 10:15 AM, Bogdan Dobrelya <bdobreli at redhat.com> wrote:
>> An update for your review, please, folks
>>
>>> Bogdan Dobrelya <bdobreli at redhat.com> writes:
>>>
>>>> Hello.
>>>> As the Zuul documentation [0] explains, the names "check", "gate", and
>>>> "post" may be altered for more advanced pipelines. Is it doable to
>>>> introduce, for particular openstack projects, multiple check
>>>> stages/steps, as check-1, check-2 and so on? And is it possible to
>>>> have the subsequent steps reuse the environments that the previous
>>>> steps finished with?
>>>>
>>>> Narrowing down to the tripleo CI scope, the problem I'd want us to
>>>> solve with this "virtual RFE", using such multi-staged check
>>>> pipelines, is reducing (ideally, de-duplicating) some of the steps
>>>> common to the existing CI jobs.
>>>
>>>
>>> What you're describing sounds more like a job graph within a pipeline.
>>> See:
>>> https://docs.openstack.org/infra/zuul/user/config.html#attr-job.dependencies
>>> for how to configure a job to run only after another job has completed.
>>> There is also a facility to pass data between such jobs.
>>>
>>> ... (skipped) ...
>>>
>>> Creating a job graph to have one job use the results of the previous job
>>> can make sense in a lot of cases.  It doesn't always save *time*
>>> however.
>>>
>>> It's worth noting that in OpenStack's Zuul, we have made an explicit
>>> choice not to have long-running integration jobs depend on shorter pep8
>>> or tox jobs, and that's because we value developer time more than CPU
>>> time.  We would rather run all of the tests and return all of the
>>> results so a developer can fix all of the errors as quickly as possible,
>>> rather than forcing an iterative workflow where they have to fix all the
>>> whitespace issues before the CI system will tell them which actual tests
>>> broke.
>>>
>>> -Jim
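
For readers less familiar with the syntax, such a job graph is expressed
in a project's pipeline configuration roughly like this (a minimal,
untested sketch; the job names are purely illustrative):

  - project:
      check:
        jobs:
          - tox-pep8
          - deploy-base
          - deploy-upgrade:
              dependencies:
                - deploy-base

Here deploy-upgrade would start only once deploy-base has succeeded, and
could consume data returned by it.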
>>
>>
>> I proposed a few zuul dependencies [0], [1] to the tripleo CI pipelines
>> for undercloud deployment vs upgrade testing (and some more). Given
>> that those undercloud jobs do not have particularly high failure rates
>> though, I think Emilien is right in his comments and those dependencies
>> would buy us nothing.
>>
>> On the other hand, what do you think, folks, of making
>> tripleo-ci-centos-7-3nodes-multinode depend on
>> tripleo-ci-centos-7-containers-multinode [2]? The former seems quite
>> failure-prone and long running, and is non-voting. It deploys (see the
>> featureset configs [3]*) 3 nodes in an HA fashion. And it almost never
>> passes when containers-multinode fails - see the CI stats page [4].
>> I've found only 2 cases there of the opposite situation, where
>> containers-multinode fails but 3nodes-multinode passes. So cutting off
>> those likely failures via the added dependency *would* buy us something
>> and would allow other jobs to start sooner, at the reasonable price of
>> a somewhat extended run time for the main zuul pipeline. I think it
>> makes sense, and that the extended CI time will not exceed the RDO CI
>> execution times enough to become a problem. WDYT?
>>
> 
> I'm not sure it makes sense to add a dependency on other deployment
> tests. It's going to add additional time to the CI run because the
> upgrade won't start until well over an hour after the rest of the

Things are not so simple. There is also a significant delay while jobs
wait in the queue before they start, and that probably takes even longer
than executing the jobs themselves. That delay is a function of the
available HW resources and the zuul queue length, and the proposed
change affects those parameters as well, since jobs with failed
dependencies won't run at all. So we could expect the longer execution
times to be compensated by shorter wait times! I'm not sure how to
estimate that, though. You folks have all the numbers and knowledge, so
let's use them please.
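
For clarity, the dependency proposed in [2] amounts to roughly this (a
sketch only; see the review itself for the actual change):

  - tripleo-ci-centos-7-3nodes-multinode:
      dependencies:
        - tripleo-ci-centos-7-containers-multinode

That is, the 3nodes job would only start (and only consume nodes) once
the containers-multinode job has succeeded.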

> jobs.  The only thing I could think of where this makes more sense is
> to delay the deployment tests until the pep8/unit tests pass.  e.g.
> let's not burn resources when the code is bad. There might be
> arguments about lack of information from a deployment when developing
> things but I would argue that the patch should be vetted properly
> first in a local environment before taking CI resources.

I support this idea as well, though I'm sceptical it will get blessed in
the end :) I'll add a patch anyway.
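
Roughly, such a patch could add a dependency like this to the deployment
jobs (again a sketch only; the exact job names would need checking):

  - tripleo-ci-centos-7-containers-multinode:
      dependencies:
        - openstack-tox-pep8
        - openstack-tox-py27

so that no multinode test nodes get allocated while pep8 or the unit
tests are failing.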

> 
> Thanks,
> -Alex
> 
>> [0] https://review.openstack.org/#/c/568275/
>> [1] https://review.openstack.org/#/c/568278/
>> [2] https://review.openstack.org/#/c/568326/
>> [3]
>> https://docs.openstack.org/tripleo-quickstart/latest/feature-configuration.html
>> [4] http://tripleo.org/cistatus.html
>>
>> * ignore column 1, it's obsolete; all CI jobs now use config download,
>> AFAICT...
>>
>> --
>> Best regards,
>> Bogdan Dobrelya,
>> Irc #bogdando
>>
> 


-- 
Best regards,
Bogdan Dobrelya,
Irc #bogdando


