[openstack-dev] [TripleO][review] Please treat -1s on check-tripleo-*-precise as voting.

Robert Collins robertc at robertcollins.net
Sun Feb 23 20:30:10 UTC 2014


On 22 February 2014 05:36, Derek Higgins <derekh at redhat.com> wrote:

> the ci cloud seems to be running today as expected, but we have a bit of
> tuning to do
>
> check-tripleo-overcloud-precise is throwing out false negatives because
> the testenv-worker has a timeout that is less than the timeout on the
> jenkins job (and less than the length of time it takes to run the job)
> o this should handle the false negatives
>   https://review.openstack.org/#/c/75402/

[reviewed]

> o and this is a more permanent solution (to remove the possibility of
> double booking environments), a new test-env cluster will need to be
> built with it, we can do that once we iron out anything else that may
> pop up over the next few days.
>   https://review.openstack.org/#/c/75403/

[reviewed] - but I wonder if Jenkins tells us the timeout, so we could
pass the desired timeout in the lease request, for a more permanent
solution?
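
A rough sketch of the idea, not the actual testenv-worker code. It
assumes the job timeout is visible to the job as an environment
variable (JOB_TIMEOUT_MINUTES is a made-up name; I haven't checked what
Jenkins really exposes) and that the lease request is a plain dict with
a "timeout" field, which is also hypothetical. The only point is that
the lease is derived from the job timeout plus a margin, so the
environment can't be released while the job is still running:

```python
import os

# Assumed fallback when the job timeout is unknown (hypothetical value).
DEFAULT_JOB_TIMEOUT_S = 3 * 60 * 60
# Extra time so the lease always outlives the job itself.
SAFETY_MARGIN_S = 10 * 60


def lease_timeout_seconds(env=None, margin=SAFETY_MARGIN_S):
    """Return a lease timeout that is longer than the job timeout."""
    env = os.environ if env is None else env
    # JOB_TIMEOUT_MINUTES is a hypothetical variable name.
    raw = env.get("JOB_TIMEOUT_MINUTES")
    try:
        job_timeout = int(raw) * 60
    except (TypeError, ValueError):
        # Missing or unparseable value: use the fallback.
        job_timeout = DEFAULT_JOB_TIMEOUT_S
    if job_timeout <= 0:
        job_timeout = DEFAULT_JOB_TIMEOUT_S
    return job_timeout + margin


def build_lease_request(env=None):
    """Build a (hypothetical) lease request carrying the timeout."""
    return {"action": "lease", "timeout": lease_timeout_seconds(env)}
```

With the fallback, a job that doesn't advertise a timeout still gets a
lease longer than the default, rather than a shorter hardcoded one.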

> Current status is that a lot of jobs are failing because they are not
> completing the "nova-manage db sync" on the seed quickly enough, this
> only started happening today and doesn't immediately suggest a problem
> with our test environment setup (unless we are over committing resources
> on the test environments), I suspect some part of the seed boot process
> on or before the db sync is now taking longer than it used to. I was
> trying to track down the problem but I'm about to run out of time.

Ok; we should file a bug about performance regressions like that - if
we're not sure it's a nova one, start on tripleo, then add nova when
we've ruled out test environment IO contention.

> This begs the question,
>   If this proves to be a failure in tripleo-ci that is being caused by a
> change that happened outside of tripleo should we stop merging commits?

Yes, IMO.

> Or are we ok to go ahead and merge while also helping the other project
> to solve the problem? Of course if we were gating on all projects this
> problem would be far less frequent than I suspect it will be, but for
> now how do we proceed in these situations?

If we keep merging commits other than the minimal set needed to fix
the problem, we can't tell if those commits are correct. So we end up
chasing broken commits for a while until we get stable again.

Better to focus on the problem, fix it sufficiently for CI (e.g. even
if we end up running a patched nova while nova review the patches,
that's better than no CI).

-Rob


-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud


