[openstack-dev] [ci][infra][tripleo] Multi-staged check pipelines for Zuul v3 proposal
Bogdan Dobrelya
bdobreli at redhat.com
Fri May 25 12:45:01 UTC 2018
Job dependencies seem to be ignored by Zuul: jobs [0], [1], [2] started
simultaneously, while I expected them to run one by one. According to
patch 568536 [3], [1] is a dependency for [2] and [3].
The same can be observed for the remaining patches in the topic [4].
Is that a bug, or did I misunderstand what Zuul job dependencies actually do?
[0] http://logs.openstack.org/36/568536/2/check/tripleo-ci-centos-7-undercloud-containers/731183a/ara-report/
[1] http://logs.openstack.org/36/568536/2/check/tripleo-ci-centos-7-3nodes-multinode/a1353ed/ara-report/
[2] http://logs.openstack.org/36/568536/2/check/tripleo-ci-centos-7-containers-multinode/9777136/ara-report/
[3] https://review.openstack.org/#/c/568536/
[4] https://review.openstack.org/#/q/topic:ci_pipelines+(status:open+OR+status:merged)
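
To make the expectation concrete, here is a minimal Python sketch of the
behaviour I assumed: a job starts only after the jobs it depends on have
finished, and it does not run at all if one of them failed. This is only
a toy model, not Zuul's scheduler code. The function names are made up
for illustration, and the example graph is an assumption based on the
dependency proposed earlier in this thread (3nodes-multinode after
containers-multinode), not a copy of what patch 568536 configures.

```python
from typing import Dict, List, Set

# Illustrative job graph (an assumption, see the note above):
# job name -> list of jobs it depends on.
JOBS: Dict[str, List[str]] = {
    "tripleo-ci-centos-7-undercloud-containers": [],
    "tripleo-ci-centos-7-containers-multinode": [],
    "tripleo-ci-centos-7-3nodes-multinode": [
        "tripleo-ci-centos-7-containers-multinode",
    ],
}


def plan_stages(deps: Dict[str, List[str]]) -> List[List[str]]:
    """Group jobs into stages: a job lands one stage after its latest dependency."""
    stage_of: Dict[str, int] = {}

    def resolve(job: str, seen: Set[str]) -> int:
        if job in stage_of:
            return stage_of[job]
        if job in seen:
            raise ValueError(f"dependency cycle at {job}")
        parents = deps.get(job, [])
        # Jobs without dependencies get stage 0 (max of nothing is -1).
        stage = 1 + max((resolve(p, seen | {job}) for p in parents), default=-1)
        stage_of[job] = stage
        return stage

    for job in deps:
        resolve(job, set())

    stages: List[List[str]] = [[] for _ in range(max(stage_of.values(), default=-1) + 1)]
    for job in sorted(stage_of):
        stages[stage_of[job]].append(job)
    return stages


def skipped_jobs(deps: Dict[str, List[str]], failed: Set[str]) -> Set[str]:
    """Jobs that never start because a direct or transitive dependency failed."""
    skipped: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for job, parents in deps.items():
            if job in skipped or job in failed:
                continue
            if any(p in failed or p in skipped for p in parents):
                skipped.add(job)
                changed = True
    return skipped


if __name__ == "__main__":
    for number, stage in enumerate(plan_stages(JOBS)):
        print(f"stage {number}: {', '.join(stage)}")
    print("skipped:", skipped_jobs(JOBS, {"tripleo-ci-centos-7-containers-multinode"}))
```

With this graph, the model puts the two jobs without dependencies in the
first stage and 3nodes-multinode in the second, and it skips
3nodes-multinode when containers-multinode fails. What I see in the logs
above is all three jobs starting together instead.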
On 5/15/18 11:39 AM, Bogdan Dobrelya wrote:
> Added a few more patches [0], [1] based on the discussion results. PTAL, folks.
> As for the remaining patches in the topic, I'd propose to give them a try
> and revert them if they prove to do more harm than good.
> Thank you for feedback!
>
> The next step could be reusing artifacts, like DLRN repos and containers
> built for patches and the hosted undercloud, in the subsequent pipelined
> jobs. But I'm not sure how to even approach that.
>
> [0] https://review.openstack.org/#/c/568536/
> [1] https://review.openstack.org/#/c/568543/
>
> On 5/15/18 10:54 AM, Bogdan Dobrelya wrote:
>> On 5/14/18 10:06 PM, Alex Schultz wrote:
>>> On Mon, May 14, 2018 at 10:15 AM, Bogdan Dobrelya
>>> <bdobreli at redhat.com> wrote:
>>>> An update for your review please folks
>>>>
>>>>> Bogdan Dobrelya <bdobreli at redhat.com> writes:
>>>>>
>>>>>> Hello.
>>>>>> As the Zuul documentation [0] explains, the names "check", "gate", and
>>>>>> "post" may be altered for more advanced pipelines. Is it doable to
>>>>>> introduce, for particular OpenStack projects, multiple check
>>>>>> stages/steps such as check-1, check-2 and so on? And is it possible to
>>>>>> make the subsequent steps reuse the environments that the previous
>>>>>> steps finished with?
>>>>>>
>>>>>> Narrowing down to the tripleo CI scope, the problem I'd like us to solve
>>>>>> with this "virtual RFE", using such multi-staged check pipelines,
>>>>>> is reducing (ideally, de-duplicating) some of the common steps of
>>>>>> existing CI jobs.
>>>>>
>>>>>
>>>>> What you're describing sounds more like a job graph within a pipeline.
>>>>> See:
>>>>> https://docs.openstack.org/infra/zuul/user/config.html#attr-job.dependencies
>>>>>
>>>>> for how to configure a job to run only after another job has
>>>>> completed.
>>>>> There is also a facility to pass data between such jobs.
>>>>>
>>>>> ... (skipped) ...
>>>>>
>>>>> Creating a job graph to have one job use the results of the previous
>>>>> job can make sense in a lot of cases. It doesn't always save *time*
>>>>> however.
>>>>>
>>>>> It's worth noting that in OpenStack's Zuul, we have made an explicit
>>>>> choice not to have long-running integration jobs depend on shorter
>>>>> pep8 or tox jobs, and that's because we value developer time more
>>>>> than CPU time. We would rather run all of the tests and return all of
>>>>> the results so a developer can fix all of the errors as quickly as
>>>>> possible, rather than forcing an iterative workflow where they have
>>>>> to fix all the whitespace issues before the CI system will tell them
>>>>> which actual tests broke.
>>>>>
>>>>> -Jim
>>>>
>>>>
>>>> I proposed a few zuul dependencies [0], [1] for the tripleo CI pipelines,
>>>> for undercloud deployments vs upgrades testing (and some more). Given
>>>> that those undercloud jobs do not have very high failure rates, though,
>>>> I think Emilien is right in his comments and those would buy us nothing.
>>>>
>>>> On the other hand, what do you folks think of making
>>>> tripleo-ci-centos-7-3nodes-multinode depend on
>>>> tripleo-ci-centos-7-containers-multinode [2]? The former seems quite
>>>> failure-prone and long-running, and is non-voting. It deploys (see the
>>>> featureset configs [3]*) 3 nodes in HA fashion. And it seems to almost
>>>> never pass when containers-multinode fails - see the CI stats page [4].
>>>> I've found only 2 cases there of the opposite situation, where
>>>> containers-multinode fails but 3nodes-multinode passes. So cutting off
>>>> those future failures via the added dependency *would* buy us something
>>>> and allow other jobs to wait less to commence, at the reasonable price
>>>> of a somewhat extended run time for the main zuul pipeline. I think it
>>>> makes sense, and that the extended CI time will not add so much
>>>> overhead to the RDO CI execution times as to become a problem. WDYT?
>>>>
>>>
>>> I'm not sure it makes sense to add a dependency on other deployment
>>> tests. It's going to add additional time to the CI run because the
>>> upgrade won't start until well over an hour after the rest of the
>>
>> Things are not so simple. There is also a significant delay while
>> jobs wait in the queue to start, and it is probably even longer than
>> the time it takes to execute the jobs. That delay is a function of the
>> available HW resources and the zuul queue length, and the proposed change
>> affects those parameters as well, assuming jobs with failed
>> dependencies won't run at all. So we could expect longer execution
>> times compensated by shorter wait times! I'm not sure how to
>> estimate that, though. You folks have all the numbers and knowledge;
>> let's use that, please.
>>
>>> jobs. The only thing I could think of where this makes more sense is
>>> to delay the deployment tests until the pep8/unit tests pass. e.g.
>>> let's not burn resources when the code is bad. There might be
>>> arguments about lack of information from a deployment when developing
>>> things but I would argue that the patch should be vetted properly
>>> first in a local environment before taking CI resources.
>>
>> I support this idea as well, though I'm sceptical about having that
>> blessed in the end :) I'll add a patch though.
>>
>>>
>>> Thanks,
>>> -Alex
>>>
>>>> [0] https://review.openstack.org/#/c/568275/
>>>> [1] https://review.openstack.org/#/c/568278/
>>>> [2] https://review.openstack.org/#/c/568326/
>>>> [3]
>>>> https://docs.openstack.org/tripleo-quickstart/latest/feature-configuration.html
>>>>
>>>> [4] http://tripleo.org/cistatus.html
>>>>
>>>> * ignore column 1, it's obsolete; all CI jobs now use config download,
>>>> AFAICT...
>>>>
>>>> --
>>>> Best regards,
>>>> Bogdan Dobrelya,
>>>> Irc #bogdando
>>>>
>>>> __________________________________________________________________________
>>>>
>>>> OpenStack Development Mailing List (not for usage questions)
>>>> Unsubscribe:
>>>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>>
>>
>
>
--
Best regards,
Bogdan Dobrelya,
Irc #bogdando