[openstack-dev] [ci][infra][tripleo] Multi-staged check pipelines for Zuul v3 proposal

Bogdan Dobrelya bdobreli at redhat.com
Fri May 25 12:45:01 UTC 2018


Job dependencies seem to be ignored by Zuul: jobs [0], [1], and [2] started
simultaneously, while I expected them to run one after another. According to
patch 568536 [3], [1] is a dependency for [2] and [3].

The same can be observed for the remaining patches in the topic [4].
Is that a bug, or have I misunderstood what Zuul job dependencies actually do?
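
To make the expected ordering concrete, here is a minimal Python sketch
(Python 3.9+ for graphlib) of how a job graph would normally be resolved.
It is only a model of the semantics, not Zuul code. The single dependency
edge is an assumption borrowed from the proposal quoted further down
(3nodes-multinode waiting for containers-multinode); the actual edges in
patch 568536 may differ. The helper names start_waves and skipped_jobs are
hypothetical.

```python
from graphlib import TopologicalSorter

# Assumed dependency map: job -> set of jobs it must wait for.
# Illustrative only; the real edges in patch 568536 may differ.
DEPENDENCIES = {
    "tripleo-ci-centos-7-undercloud-containers": set(),
    "tripleo-ci-centos-7-containers-multinode": set(),
    "tripleo-ci-centos-7-3nodes-multinode": {
        "tripleo-ci-centos-7-containers-multinode",
    },
}


def start_waves(dependencies):
    """Group jobs into waves; a job starts only after all its dependencies finish."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # Every job returned here has no unfinished dependencies left.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


def skipped_jobs(dependencies, failed):
    """Return jobs that would not run because a direct or transitive dependency failed."""
    skipped = set()
    # static_order() yields each job after all of its dependencies.
    for job in TopologicalSorter(dependencies).static_order():
        deps = dependencies.get(job, ())
        if any(dep in failed or dep in skipped for dep in deps):
            skipped.add(job)
    return skipped


if __name__ == "__main__":
    for number, wave in enumerate(start_waves(DEPENDENCIES), start=1):
        print(f"wave {number}: {', '.join(wave)}")
    print(skipped_jobs(DEPENDENCIES, {"tripleo-ci-centos-7-containers-multinode"}))
```

Under that assumed edge, the two independent jobs start together in the
first wave and 3nodes-multinode starts only in the second one. All three
jobs starting at the same moment would mean the dependency was not applied.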

[0] http://logs.openstack.org/36/568536/2/check/tripleo-ci-centos-7-undercloud-containers/731183a/ara-report/
[1] http://logs.openstack.org/36/568536/2/check/tripleo-ci-centos-7-3nodes-multinode/a1353ed/ara-report/
[2] http://logs.openstack.org/36/568536/2/check/tripleo-ci-centos-7-containers-multinode/9777136/ara-report/
[3] https://review.openstack.org/#/c/568536/
[4] https://review.openstack.org/#/q/topic:ci_pipelines+(status:open+OR+status:merged)

On 5/15/18 11:39 AM, Bogdan Dobrelya wrote:
> Added a few more patches [0], [1] based on the discussion results. PTAL, folks.
> As for the remaining patches in the topic, I'd propose to give it a try and
> revert it if it proves to do more harm than good.
> Thank you for the feedback!
> 
> The next step could be reusing artifacts, like DLRN repos and containers 
> built for patches and hosted undercloud, in the subsequent pipelined 
> jobs. But I'm not sure how to even approach that.
> 
> [0] https://review.openstack.org/#/c/568536/
> [1] https://review.openstack.org/#/c/568543/
> 
> On 5/15/18 10:54 AM, Bogdan Dobrelya wrote:
>> On 5/14/18 10:06 PM, Alex Schultz wrote:
>>> On Mon, May 14, 2018 at 10:15 AM, Bogdan Dobrelya 
>>> <bdobreli at redhat.com> wrote:
>>>> An update for your review please folks
>>>>
>>>>> Bogdan Dobrelya <bdobreli at redhat.com> writes:
>>>>>
>>>>>> Hello.
>>>>>> As the Zuul documentation [0] explains, the names "check", "gate", and
>>>>>> "post" may be altered for more advanced pipelines. Is it doable to
>>>>>> introduce, for particular openstack projects, multiple check
>>>>>> stages/steps such as check-1, check-2 and so on? And is it possible to
>>>>>> make the subsequent steps reuse the environments that the previous
>>>>>> steps finished with?
>>>>>>
>>>>>> Narrowing down to the tripleo CI scope, the problem I'd like us to solve
>>>>>> with this "virtual RFE", using such multi-staged check pipelines,
>>>>>> is reducing (ideally, de-duplicating) some of the common steps of the
>>>>>> existing CI jobs.
>>>>>
>>>>>
>>>>> What you're describing sounds more like a job graph within a pipeline.
>>>>> See:
>>>>> https://docs.openstack.org/infra/zuul/user/config.html#attr-job.dependencies
>>>>> for how to configure a job to run only after another job has completed.
>>>>> There is also a facility to pass data between such jobs.
>>>>>
>>>>> ... (skipped) ...
>>>>>
>>>>> Creating a job graph to have one job use the results of the previous job
>>>>> can make sense in a lot of cases.  It doesn't always save *time*
>>>>> however.
>>>>>
>>>>> It's worth noting that in OpenStack's Zuul, we have made an explicit
>>>>> choice not to have long-running integration jobs depend on shorter pep8
>>>>> or tox jobs, and that's because we value developer time more than CPU
>>>>> time.  We would rather run all of the tests and return all of the
>>>>> results so a developer can fix all of the errors as quickly as possible,
>>>>> rather than forcing an iterative workflow where they have to fix all the
>>>>> whitespace issues before the CI system will tell them which actual tests
>>>>> broke.
>>>>>
>>>>> -Jim
>>>>
>>>>
>>>> I proposed a few Zuul dependencies [0], [1] for the tripleo CI pipelines,
>>>> covering undercloud deployment vs. upgrade testing (and some more). Given
>>>> that those undercloud jobs do not have very high failure rates, though, I
>>>> think Emilien is right in his comments and those would buy us nothing.
>>>>
>>>> On the other hand, folks, what do you think of making
>>>> tripleo-ci-centos-7-3nodes-multinode depend on
>>>> tripleo-ci-centos-7-containers-multinode [2]? The former seems quite
>>>> failure-prone and long running, and is non-voting. It deploys (see the
>>>> featureset configs [3]*) 3 nodes in HA fashion. And it seems to almost
>>>> never pass when containers-multinode fails; see the CI stats page [4].
>>>> I've found only 2 cases there of the opposite situation, where
>>>> containers-multinode fails but 3nodes-multinode passes. So cutting off
>>>> those future failures via the added dependency *would* buy us something
>>>> and let other jobs wait less before starting, at the reasonable price of
>>>> a somewhat longer main Zuul pipeline. I think it makes sense, and that the
>>>> extended CI time will not add so much overhead to the RDO CI execution
>>>> times as to become a problem. WDYT?
>>>>
>>>
>>> I'm not sure it makes sense to add a dependency on other deployment
>>> tests. It's going to add additional time to the CI run because the
>>> upgrade won't start until well over an hour after the rest of the
>>
>> Things are not so simple. There is also a significant delay while jobs
>> wait in the queue before starting, and it is probably even longer than
>> the time it takes to execute the jobs. That delay is a function of the
>> available HW resources and the Zuul queue length, and the proposed change
>> affects those parameters as well, assuming jobs with failed dependencies
>> won't run at all. So we could expect longer execution times compensated
>> by shorter wait times! I'm not sure how to estimate that, though. You
>> folks have all the numbers and knowledge, so let's use them, please.
>>
>>> jobs.  The only thing I could think of where this makes more sense is
>>> to delay the deployment tests until the pep8/unit tests pass.  e.g.
>>> let's not burn resources when the code is bad. There might be
>>> arguments about lack of information from a deployment when developing
>>> things but I would argue that the patch should be vetted properly
>>> first in a local environment before taking CI resources.
>>
>> I support this idea as well, though I'm sceptical about having that 
>> blessed in the end :) I'll add a patch though.
>>
>>>
>>> Thanks,
>>> -Alex
>>>
>>>> [0] https://review.openstack.org/#/c/568275/
>>>> [1] https://review.openstack.org/#/c/568278/
>>>> [2] https://review.openstack.org/#/c/568326/
>>>> [3] https://docs.openstack.org/tripleo-quickstart/latest/feature-configuration.html
>>>> [4] http://tripleo.org/cistatus.html
>>>>
>>>> * Ignore column 1, it's obsolete; all CI jobs now use config download,
>>>> AFAICT...
>>>>
>>>> -- 
>>>> Best regards,
>>>> Bogdan Dobrelya,
>>>> Irc #bogdando
>>>>
>>>> __________________________________________________________________________ 
>>>>
>>>> OpenStack Development Mailing List (not for usage questions)
>>>> Unsubscribe: 
>>>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>>
>>
> 
> 


-- 
Best regards,
Bogdan Dobrelya,
Irc #bogdando


