[openstack-dev] [tripleo] reducing our upstream CI footprint
Ben Nemec
openstack at nemebean.com
Wed Oct 31 22:19:22 UTC 2018
On 10/31/18 4:59 PM, Harald Jensås wrote:
> On Wed, 2018-10-31 at 11:39 -0600, Wesley Hayutin wrote:
>>
>>
>> On Wed, Oct 31, 2018 at 11:21 AM Alex Schultz <aschultz at redhat.com>
>> wrote:
>>> Hey everyone,
>>>
>>> Based on previous emails around this[0][1], I have proposed a
>>> possible reduction in our usage by switching the scenario001-011
>>> jobs to non-voting and removing them from the gate[2]. This will
>>> reduce the likelihood of causing gate resets and hopefully allow
>>> us to land corrective patches sooner. In terms of risks, we might
>>> introduce breaking changes in the scenarios once they are
>>> non-voting, and we will still be gating promotions on these
>>> scenarios. This means that if they are broken, they will need the
>>> same attention and care to fix, so we should be vigilant when the
>>> jobs are failing.
>>>
>>> The hope is that we can switch these scenarios out for voting
>>> standalone versions in the next few weeks, but until then I think
>>> we should proceed by removing them from the gate. I know this is
>>> less than ideal, but as most failures with these jobs in the gate
>>> are either timeouts or unrelated to the changes (or gate queue),
>>> they are more of a hindrance than a help at this point.
>>>
>>> Thanks,
>>> -Alex
>>
>> I think I also have to agree.
>> Having to deploy with containers, update containers, and run with
>> two nodes is no longer a very viable option upstream. It's not
>> impossible, but it should be the exception and not the rule for all
>> our jobs.
>>
> AFAICT, in my local environment the container prep stuff takes ages
> once the playbooks that update the containers with yum are added. We
> will still have to do this for every standalone job, right?
>
>
>
> Also, I enabled profiling for ansible tasks on the undercloud and
> noticed that UndercloudPostDeploy was high on the list, in fact the
> longest-running task when re-running the undercloud install.
>
> Moving this task from a shell script using the openstack CLI to
> python reduced its run time dramatically in my environment, see:
> https://review.openstack.org/614540. Six and a half minutes reduced
> to 40 seconds.
Everything old is new again:
https://github.com/openstack/instack-undercloud/commit/0eb1b59926c7dc46e321c56db29af95b3d755f34#diff-5602f1b710e86ca1eb7334cb0632f9ee
:-)
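
For reference, the win comes from paying interpreter start-up and
authentication once instead of once per command. A minimal sketch of
what such a conversion can look like, assuming openstacksdk is
installed and clouds.yaml defines an "undercloud" cloud (the names
here are illustrative, not the actual contents of the patch):

  # One process, one authenticated session, many API calls, rather
  # than a full interpreter start-up and token negotiation for every
  # CLI invocation.
  import openstack

  conn = openstack.connect(cloud='undercloud')

  servers = [s.name for s in conn.compute.servers()]
  networks = [n.id for n in conn.network.networks()]
  subnets = [s.id for s in conn.network.subnets()]

  print(servers, networks, subnets)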
>
>
> How much time would we save in the gates if we converted some of the
> shell scripting to python? If we want to stay in shell script, we
> can use the interactive shell or the client-as-a-service approach[2]:
>
> Interactive shell:
>
> time openstack <<-EOC
> server list
> workflow list
> workflow execution list
> EOC
>
> real 0m2.852s
>
> Separate invocations:
>
> time (openstack server list; \
> openstack workflow list; \
> openstack workflow execution list)
>
> real 0m7.119s
>
> The difference is significant.
>
> We could cache a token[1] and specify the endpoint on each command,
> but doing so is still far less effective than using the interactive
> shell.
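
A rough sketch of the token caching idea with keystoneauth1 (the
endpoint, credentials, and domain names below are placeholders, not
our actual undercloud values):

  from keystoneauth1.identity import v3
  from keystoneauth1 import session

  # Authenticate once with a password and cache the issued token.
  password_auth = v3.Password(auth_url='http://192.0.2.1:5000/v3',
                              username='admin', password='secret',
                              project_name='admin',
                              user_domain_name='Default',
                              project_domain_name='Default')
  token = session.Session(auth=password_auth).get_token()

  # Later clients reuse the cached token and skip the password round
  # trip, though each separate CLI invocation still pays the full
  # interpreter start-up cost.
  token_auth = v3.Token(auth_url='http://192.0.2.1:5000/v3',
                        token=token, project_name='admin',
                        project_domain_name='Default')
  sess = session.Session(auth=token_auth)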
>
>
> There is an old thread[2] on the mailing list which contains a
> server/client solution. If we run this service on CI nodes and drop
> a replacement openstack command into /usr/local/bin/openstack, each
> command would take roughly 1/5 of the time:
>
> (undercloud) [stack at leafs ~]$ time (/usr/bin/openstack network list -f
> value -c ID; /usr/bin/openstack network segment list -f value -c ID;
> /usr/bin/openstack subnet list -f value -c ID)
>
>
> real 0m6.443s
> user 0m2.171s
> sys 0m0.366s
>
> (undercloud) [stack at leafs ~]$ time (/usr/local/bin/openstack network
> list -f value -c ID; /usr/local/bin/openstack network segment list -f
> value -c ID; /usr/local/bin/openstack subnet list -f value -c ID)
>
> real 0m1.698s
> user 0m0.042s
> sys 0m0.018s
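
Most of that per-command cost is interpreter start-up and module
imports, which a resident service pays only once. This is not the
exact implementation from the thread in [2], but one possible shape
of such a service (the socket path is hypothetical):

  import io
  import os
  import socket
  import sys

  from openstackclient.shell import OpenStackShell

  # Hypothetical socket path; the /usr/local/bin/openstack shim
  # writes its arguments here and prints whatever comes back.
  SOCKET_PATH = '/tmp/openstack-service.sock'

  def serve():
      if os.path.exists(SOCKET_PATH):
          os.unlink(SOCKET_PATH)
      server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
      server.bind(SOCKET_PATH)
      server.listen(1)
      while True:
          conn, _ = server.accept()
          # One whitespace-separated command line per connection,
          # without the program name, e.g. "network list -f value".
          argv = conn.makefile().readline().split()
          out = io.StringIO()
          real_stdout, sys.stdout = sys.stdout, out
          try:
              OpenStackShell().run(argv)
          finally:
              sys.stdout = real_stdout
          conn.sendall(out.getvalue().encode())
          conn.close()

  if __name__ == '__main__':
      serve()

The drop-in openstack command then only needs to connect to the
socket, send its argument list, and print the reply.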
>
>
>
> I realize this is a kind of hacky approach, but it does seem to work,
> and it should be fairly quick to get in there. (With the undercloud
> post script I see 6 minutes returned; what could we get back in CI,
> 10-15 minutes?) Then we could look at moving these scripts to python,
> or use the ansible openstack modules, which hopefully don't share the
> same loading issues as the python clients.
I'm personally a fan of using Python, since it is then unit-testable,
but I'm not sure how that works with the tht-based code, so maybe it's
not a factor.
>
>
>
> [1] https://wiki.openstack.org/wiki/OpenStackClient/Authentication
> [2]
> http://lists.openstack.org/pipermail/openstack-dev/2016-April/092546.html
>
>
>> Thanks Alex
>>
>>
>>> [0]
>>> http://lists.openstack.org/pipermail/openstack-dev/2018-October/136141.html
>>> [1]
>>> http://lists.openstack.org/pipermail/openstack-dev/2018-October/135396.html
>>> [2]
>>> https://review.openstack.org/#/q/topic:reduce-tripleo-usage+(status:open+OR+status:merged)
>>>
>>
>> --
>> WES HAYUTIN
>> ASSOCIATE MANAGER
>> Red Hat
>>
>> whayutin at redhat.com T: +19194232509 IRC: weshay
>>