[openstack-dev] Thoughts on the patch test failure rate and moving forward

Robert Collins robertc at robertcollins.net
Fri Jul 25 21:50:01 UTC 2014


On 26 July 2014 08:20, Matthew Treinish <mtreinish at kortar.org> wrote:

>> This is also more of a pragmatic organic approach to figuring out the
>> interfaces we need to lock down. When one project breaks because it depends on
>> an interface in another project, that should trigger this kind of
>> contract growth, which hopefully formally turns into a document later
>> for a stable interface.
>
> So notifications are a good example of this, but I think how we handled them
> is also an example of what not to do. The order was backwards: there should
> have been a stability guarantee up front, with a versioning mechanism on
> notifications, once another project started relying on them. The fact that
> there are at least two ML threads on how to fix and test this at this point in
> ceilometer's life seems like a poor way to handle it. I don't want to see us
> repeat this by allowing cross-project interactions to depend on unstable
> interfaces.

+1
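
To make that concrete, the sort of thing I mean is no more than an explicit
version field in the payload that consumers check before parsing it. A
minimal, hypothetical sketch in Python (made-up field names, not the actual
nova or ceilometer notification format):

# Hypothetical sketch only -- the point is that the payload advertises a
# version and the consumer rejects versions it does not understand, instead
# of breaking three layers down in another project.
NOTIFICATION = {
    "event_type": "compute.instance.create.end",
    "payload_version": "1.0",
    "payload": {"instance_id": "example-id", "state": "active"},
}

SUPPORTED_VERSIONS = {"1.0", "1.1"}

def handle(notification):
    version = notification.get("payload_version")
    if version not in SUPPORTED_VERSIONS:
        # An unknown version is an explicit contract break we can gate on.
        raise ValueError("unsupported notification version: %r" % version)
    return notification["payload"]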

> I agree that there is a scaling issue; our variable testing quality and coverage
> across all the projects in the Tempest tree is proof enough of this. I just
> don't want to see us lose the protection we have against inadvertent changes.
> Having the friction of something like the Tempest two-step is important; we've
> blocked a lot of breaking API changes because of it.
>
> The other thing to consider is that when we adopted branchless Tempest, part of
> the goal was to ensure consistency across release boundaries. If
> we're really advocating dropping most of the API coverage out of Tempest, part of
> the story needs to be how we prevent things from slipping between release
> boundaries too.

I'm also worried about the impact on TripleO - we run everything
together functionally, and we've been aiming at the gate since
forever: we need more stability, and I'm worried that this may lead to
less. I don't think more lock-down and a bigger test matrix is the answer,
and I do support running an experiment to see if we end up in a better place.
Still worried :).


> But, having worked on this stuff for ~2 years, I can say from personal experience
> that every project slips when it comes to API stability, despite the best
> intentions, unless there is test coverage for it. I don't want to see us open
> the floodgates on this just because we've gotten ourselves into a bad situation
> with the state of the gate.

+1


>> Our current model leans far too much on the idea that the only time we
>> ever test things for real is when we throw all 1 million lines of
>> source code into one pot and stir. It really shouldn't be surprising how
>> many bugs shake out there. And this is the wrong layer to debug from, so
>> I firmly believe we need to change this back to something we can
>> actually manage to shake the bugs out with. Because right now we're
>> finding them, but our infrastructure isn't optimized for fixing them,
>> and we need to change that.
>>
>
> I agree a layered approach is best; I'm not disagreeing on that point. I'm just
> not sure how much we really should be decreasing the scope of Tempest as the top
> layer around the API tests. I don't think we should shrink it much just because
> we're beefing up the middle with improved functional testing. In my view, having
> some duplication between the layers is fine, and actually desirable.
>
> Anyway, I feel like I'm diverging this thread off into a different area, so I'll
> shoot off a separate thread on the scale and scope of Tempest and the
> new in-tree project-specific functional tests. But to summarize, what I think we
> should be clear about at the high level for this thread is that, for the short
> term, we aren't changing the scope of Tempest. Instead we should just be
> vigilant in managing Tempest's growth (which we've been trying to do already).
> We can revisit the discussion of decreasing Tempest's size once everyone has
> figured out the per-project functional testing. This will also give us time to
> collect longer-term data about test stability in the gate so we can figure out
> which things are actually valuable to have in Tempest. I think this is what
> probably got lost in the noise here, but it has been discussed elsewhere.

I'm pretty interested in having contract tests within each project; I
think that's the right responsibility for them. My specific concern is
the recovery process / time to recovery when a regression does get
through.
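
To be clear about what I mean by contract tests, here's a minimal,
hypothetical sketch (not an existing Tempest or project test): an in-tree
check that pins the response fields another project depends on, so an
accidental rename fails in the owning project's own gate first.

import unittest

# Hypothetical contract test; show_server_stub() stands in for whatever
# API layer the project actually exposes to other projects.
EXPECTED_SERVER_FIELDS = {"id", "name", "status", "created"}

def show_server_stub():
    return {"id": "abc", "name": "vm1", "status": "ACTIVE",
            "created": "2014-07-25T00:00:00Z"}

class ServerContractTest(unittest.TestCase):
    def test_show_server_keeps_contract(self):
        body = show_server_stub()
        missing = EXPECTED_SERVER_FIELDS - set(body)
        self.assertFalse(missing, "contract fields removed: %s" % missing)

if __name__ == "__main__":
    unittest.main()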

-Rob

-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud


