[openstack-dev] [ci][infra][tripleo] Multi-staged check pipelines for Zuul v3 proposal

Tristan Cacqueray tdecacqu at redhat.com
Fri May 25 16:40:50 UTC 2018


Hello Bogdan,

Perhaps this has something to do with the job evaluation order. It may be
worth trying to add the dependencies list in the project-templates, as is
done here, for example:
http://git.openstack.org/cgit/openstack-infra/project-config/tree/zuul.d/projects.yaml#n9799

It is also easier to read dependencies from the pipeline definitions, imo.
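
To make the expected behaviour concrete, here is a minimal Python sketch.
It is not Zuul code and does not reproduce any real project-config entry.
The job names are the ones from this thread, and the single dependency
(3nodes-multinode on containers-multinode) is only an assumption taken from
the earlier proposal. The helpers start_waves and skipped_jobs are
hypothetical names. The sketch shows which jobs would start together when
dependencies are honoured, and which would be skipped when a dependency
fails.

```python
from typing import Dict, List, Set

# Hypothetical job graph, for illustration only: job name -> dependencies.
JOBS: Dict[str, List[str]] = {
    "tripleo-ci-centos-7-undercloud-containers": [],
    "tripleo-ci-centos-7-containers-multinode": [],
    "tripleo-ci-centos-7-3nodes-multinode": [
        "tripleo-ci-centos-7-containers-multinode",
    ],
}


def start_waves(jobs: Dict[str, List[str]]) -> List[List[str]]:
    """Group jobs into waves; a job starts only after all its dependencies."""
    done: Set[str] = set()
    waves: List[List[str]] = []
    while len(done) < len(jobs):
        ready = sorted(
            name for name, deps in jobs.items()
            if name not in done and all(dep in done for dep in deps)
        )
        if not ready:
            # Nothing can start: a cycle or a dependency on an unknown job.
            raise ValueError("dependency cycle or unknown dependency")
        waves.append(ready)
        done.update(ready)
    return waves


def skipped_jobs(jobs: Dict[str, List[str]], failed: Set[str]) -> List[str]:
    """Return jobs that never run because a (transitive) dependency failed."""
    skipped: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for name, deps in jobs.items():
            if name in skipped or name in failed:
                continue
            if any(dep in failed or dep in skipped for dep in deps):
                skipped.add(name)
                changed = True
    return sorted(skipped)


if __name__ == "__main__":
    for number, wave in enumerate(start_waves(JOBS), start=1):
        print("wave", number, wave)
    print("skipped:", skipped_jobs(
        JOBS, {"tripleo-ci-centos-7-containers-multinode"}))
```

If all three jobs land in the first wave in practice, as in the logs quoted
below, then the dependency was most likely not applied to the job variant
that actually ran.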

-Tristan

On May 25, 2018 12:45 pm, Bogdan Dobrelya wrote:
> Job dependencies seem to be ignored by zuul: jobs [0], [1] and [2] started 
> simultaneously, while I expected them to run one by one. According to the 
> patch 568536 [3], [1] is a dependency for [2] and [3].
> 
> The same can be observed for the remaining patches in the topic [4].
> Is that a bug, or did I misunderstand what zuul job dependencies
> actually do?
> 
> [0] 
> http://logs.openstack.org/36/568536/2/check/tripleo-ci-centos-7-undercloud-containers/731183a/ara-report/
> [1] 
> http://logs.openstack.org/36/568536/2/check/tripleo-ci-centos-7-3nodes-multinode/a1353ed/ara-report/
> [2] 
> http://logs.openstack.org/36/568536/2/check/tripleo-ci-centos-7-containers-multinode/9777136/ara-report/
> [3] https://review.openstack.org/#/c/568536/
> [4] 
> https://review.openstack.org/#/q/topic:ci_pipelines+(status:open+OR+status:merged)
> 
> On 5/15/18 11:39 AM, Bogdan Dobrelya wrote:
>> Added a few more patches [0], [1] based on the discussion results. PTAL 
>> folks. Wrt the remaining patches in the topic, I'd propose giving them a 
>> try and reverting them if they prove to do more harm than good.
>> Thank you for feedback!
>> 
>> The next step could be reusing artifacts, like DLRN repos and containers 
>> built for patches and hosted undercloud, in the subsequent pipelined 
>> jobs. But I'm not sure how to even approach that.
>> 
>> [0] https://review.openstack.org/#/c/568536/
>> [1] https://review.openstack.org/#/c/568543/
>> 
>> On 5/15/18 10:54 AM, Bogdan Dobrelya wrote:
>>> On 5/14/18 10:06 PM, Alex Schultz wrote:
>>>> On Mon, May 14, 2018 at 10:15 AM, Bogdan Dobrelya 
>>>> <bdobreli at redhat.com> wrote:
>>>>> An update for your review please folks
>>>>>
>>>>>> Bogdan Dobrelya <bdobreli at redhat.com> writes:
>>>>>>
>>>>>>> Hello.
>>>>>>> As Zuul documentation [0] explains, the names "check", "gate", and
>>>>>>> "post"  may be altered for more advanced pipelines. Is it doable to
>>>>>>> introduce, for particular openstack projects, multiple check
>>>>>>> stages/steps as check-1, check-2 and so on? And is it possible to
>>>>>>> make the subsequent steps reuse the environments that the previous
>>>>>>> steps finished with?
>>>>>>>
>>>>>>> Narrowing down to tripleo CI scope, the problem I'd want us to solve
>>>>>>> with this "virtual RFE", and using such multi-staged check pipelines,
>>>>>>> is reducing (ideally, de-duplicating) some of the common steps for
>>>>>>> existing CI jobs.
>>>>>>
>>>>>>
>>>>>> What you're describing sounds more like a job graph within a pipeline.
>>>>>> See:
>>>>>> https://docs.openstack.org/infra/zuul/user/config.html#attr-job.dependencies
>>>>>> for how to configure a job to run only after another job has completed.
>>>>>> There is also a facility to pass data between such jobs.
>>>>>>
>>>>>> ... (skipped) ...
>>>>>>
>>>>>> Creating a job graph to have one job use the results of the previous
>>>>>> job can make sense in a lot of cases.  It doesn't always save *time*
>>>>>> however.
>>>>>>
>>>>>> It's worth noting that in OpenStack's Zuul, we have made an explicit
>>>>>> choice not to have long-running integration jobs depend on shorter
>>>>>> pep8 or tox jobs, and that's because we value developer time more
>>>>>> than CPU time.  We would rather run all of the tests and return all
>>>>>> of the results so a developer can fix all of the errors as quickly as
>>>>>> possible, rather than forcing an iterative workflow where they have
>>>>>> to fix all the whitespace issues before the CI system will tell them
>>>>>> which actual tests broke.
>>>>>>
>>>>>> -Jim
>>>>>
>>>>>
>>>>> I proposed a few zuul dependencies [0], [1] to tripleo CI pipelines for
>>>>> undercloud deployments vs upgrades testing (and some more). Given that
>>>>> those undercloud jobs do not have high failure rates, though, I think
>>>>> Emilien is right in his comments and those would buy us nothing.
>>>>>
>>>>> On the other hand, what do you folks think of making
>>>>> tripleo-ci-centos-7-3nodes-multinode depend on
>>>>> tripleo-ci-centos-7-containers-multinode [2]? The former seems quite
>>>>> failure-prone and long-running, and is non-voting. It deploys (see the
>>>>> featureset configs [3]*) 3 nodes in HA fashion. And it seems to almost
>>>>> never pass when containers-multinode fails - see the CI stats page [4].
>>>>> I've found only 2 cases there of the opposite situation, where
>>>>> containers-multinode fails but 3nodes-multinode passes. So cutting off
>>>>> those future failures via the added dependency *would* buy us something
>>>>> and allow other jobs to wait less before starting, at the reasonable
>>>>> price of a somewhat longer run of the main zuul pipeline. I think it
>>>>> makes sense, and that the extended CI time will not add so much to the
>>>>> RDO CI execution times as to become a problem. WDYT?
>>>>>
>>>>
>>>> I'm not sure it makes sense to add a dependency on other deployment
>>>> tests. It's going to add additional time to the CI run because the
>>>> upgrade won't start until well over an hour after the rest of the
>>>
>>> Things are not so simple. There is also a significant delay while jobs 
>>> wait in the queue to start, and it is probably even longer than the time 
>>> to execute the jobs. That delay is a function of the available HW 
>>> resources and the zuul queue length. The proposed change affects those 
>>> parameters as well, assuming jobs with failed dependencies won't run at 
>>> all. So we could expect longer execution times compensated by shorter 
>>> wait times! I'm not sure how to estimate that, though. You folks have all 
>>> the numbers and knowledge, let's use them please.
>>>
>>>> jobs.  The only thing I could think of where this makes more sense is
>>>> to delay the deployment tests until the pep8/unit tests pass.  e.g.
>>>> let's not burn resources when the code is bad. There might be
>>>> arguments about lack of information from a deployment when developing
>>>> things but I would argue that the patch should be vetted properly
>>>> first in a local environment before taking CI resources.
>>>
>>> I support this idea as well, though I'm sceptical about having that 
>>> blessed in the end :) I'll add a patch though.
>>>
>>>>
>>>> Thanks,
>>>> -Alex
>>>>
>>>>> [0] https://review.openstack.org/#/c/568275/
>>>>> [1] https://review.openstack.org/#/c/568278/
>>>>> [2] https://review.openstack.org/#/c/568326/
>>>>> [3]
>>>>> https://docs.openstack.org/tripleo-quickstart/latest/feature-configuration.html 
>>>>>
>>>>> [4] http://tripleo.org/cistatus.html
>>>>>
>>>>> * ignore column 1, it's obsolete; all CI jobs are now using config
>>>>> download AFAICT...
>>>>>
>>>>> -- 
>>>>> Best regards,
>>>>> Bogdan Dobrelya,
>>>>> Irc #bogdando
>>>>>
>>>>> __________________________________________________________________________ 
>>>>>
>>>>> OpenStack Development Mailing List (not for usage questions)
>>>>> Unsubscribe: 
>>>>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>>
>>>>
>>>
>>>
>> 
>> 
> 
> 
> -- 
> Best regards,
> Bogdan Dobrelya,
> Irc #bogdando
> 