[OpenStack-Infra] Status of check-tempest-dsvm-f20 job

Sean Dague sean at dague.net
Wed Jun 18 10:24:20 UTC 2014


On 06/18/2014 05:45 AM, Eoghan Glynn wrote:
> 
> 
>> On 06/18/2014 06:46 PM, Eoghan Glynn wrote:
>>> If we were to use f20 more widely in the gate (not to entirely
>>> supplant precise, more just to split the load more evenly) then
>>> would the problem observed tend to naturally resolve itself?
>>
>> I would be happy to see that, having spent some time on the Fedora
>> bring-up :) However I guess there is a chicken-egg problem with
>> large-scale roll-out in that the platform isn't quite stable yet.
>> We've hit some things that only really become apparent "in the wild";
>> differences between Rackspace & HP images, issues running on Xen which
>> we don't test much, the odd upstream bug requiring work-arounds [1],
>> etc.
> 
> Fair point.
>  
>> It did seem that devstack changes were the best place to stabilize the
>> job.  However, as is apparent, devstack changes often need to be
>> pushed through quickly and that does not match well with a slightly
>> unstable job.
>>
>> Having it experimental in devstack isn't much help in stabilizing.  If
>> I trigger experimental builds for every devstack change it runs
>> several other jobs too, so really I've just increased contention for
>> limited resources by doing that.
> 
> Very true also.
> 
>> I say this *has* to be running for devstack eventually to stop the
>> fairly frequent breakage of devstack on Fedora, which causes a lot of
>> people wasted time often chasing the same bugs.
> 
> I agree, if we're committed to Fedora being a first class citizen (as
> per TC distro policy, IIUC) then it's crucial that Fedora-specific
> breakages are exposed quickly in the gate, as opposed to being seen
> by developers for the first time in the wild whenever they happen to
> refresh their devstack.
> 
>> But in the meantime, maybe suggestions for getting the Fedora job
>> exposure somewhere else where it can brew and stabilize are a good
>> idea.
> 
> Well, I would suggest the ceilometer/py27 unit test job as a first
> candidate for such exposure.
> 
> The reason being that mongodb 2.4 is not available on precise, but
> is on f20. As a result, the mongodb scenario tests are effectively
> skipped in the ceilo/py27 units, which is clearly badness and needs
> to be addressed.
> 
> Obviously this lack of coverage will resolve itself quickly once
> the Trusty switchover occurs, but it seems like we can short-circuit
> that process by simply switching to f20 right now.
> 
> I think the marconi jobs would be another good candidate, where
> switching over to f20 now would add real value. The marconi tests
> include some coverage against mongodb proper, but this is currently
> disabled, as marconi requires mongodb version >= 2.2 (and precise
> can only offer 2.0.4).
> 
>> We could make a special queue just for f20 that only triggers that
>> job, if others like that idea.
>>
>> Otherwise, ceilometer maybe?  I made some WIP patches [2,3] for this
>> already.  I think it's close, just deciding what tempest tests to
>> match for the job in [2].
> 
> Thanks for that.
> 
> So my feeling is that at least the following would make sense to base
> on f20:
> 
> 1. ceilometer/py27 
> 2. tempest variant with the ceilo DB configured as mongodb
> 3. marconi/py27
> 
> Then a random selection of other py27 jobs could potentially be added
> over time to bring f20 usage up to approximately the same breadth
> as precise.
> 
> Cheers,
> Eoghan

Unit test nodes use yet another image, so this wouldn't actually make
anything better; it would just stall out the ceilometer and marconi
unit tests in the same scenario as well.

I think the real issue is to come up with a fairer algorithm that
prevents any node class from starving, even in the extreme case, and to
get that implemented and accepted in nodepool.
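
To make "fairer" a bit more concrete, here is a minimal sketch of one
possible approach: hand out free nodes one at a time, round-robin across
node classes that still have unmet demand. This is not nodepool's actual
allocator; the function name, its arguments, and the class names used as
dictionary keys are all hypothetical and only there for illustration.

```python
from collections import deque


def allocate_fairly(available, demand):
    """Split `available` nodes across node classes, one node per turn.

    `demand` maps a node class name to the number of nodes it wants.
    Hypothetical helper for illustration; not part of nodepool.
    """
    grants = {label: 0 for label in demand}
    # Only classes with outstanding demand take part in the rotation.
    pending = deque(label for label in sorted(demand) if demand[label] > 0)
    while available > 0 and pending:
        label = pending.popleft()
        grants[label] += 1
        available -= 1
        # Rejoin the back of the queue only if still short of nodes.
        if grants[label] < demand[label]:
            pending.append(label)
    return grants
```

With 10 free nodes and a demand of 100 for precise against 3 for f20,
this grants f20 its 3 nodes and precise the other 7, so the small class
is served in full while the large one still absorbs all spare capacity.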

I do think devstack was the right starting point, because it fixes lots
of issues we've had with us accidentally breaking Fedora in devstack.
We've yet to figure out how reliable Fedora is going to be overall.

	-Sean

-- 
Sean Dague
http://dague.net
