[openstack-dev] [qa][all] Branchless Tempest beyond pure-API tests, impact on backporting policy
Eoghan Glynn
eglynn at redhat.com
Wed Jul 9 17:58:06 UTC 2014
> I think we need to actually step back a little and figure out where we
> are, how we got here, and what the future of validation might need to
> look like in OpenStack. Because I think there has been some
> communication gaps. (Also, for people I've had vigorous conversations
> about this before, realize my positions have changed somewhat,
> especially on separation of concerns.)
>
> (Also note, this is all stream of consciousness right now, so I won't
> pretend it's an entirely coherent view of the world; my hope in getting
> things down is that we can come up with that coherent view of the world
> together.)
>
> == Basic History ==
>
> In the Essex time frame Tempest was 70 tests. It was basically a barely
> adequate sniff test for OpenStack integration. So much so that our
> first 3rd party CI system, SmokeStack, used its own test suite, which
> legitimately found completely different bugs than Tempest. Not
> surprising, given that Tempest was a really small number of integration
> tests.
>
> As we got to Grizzly Tempest had grown to 1300 tests, somewhat
> organically. People were throwing a mix of tests into the fold, some
> using Tempest's client, some using official clients, some trying to hit
> the database doing white box testing. It had become kind of a mess and a
> Rorschach test. We had some really weird design summit sessions because
> many people had only looked at a piece of Tempest, and assumed the rest
> was like it.
>
> So we spent some time defining scope. Tempest couldn't really be
> everything to everyone. It would be a few things:
> * API testing for public APIs with a contract
> * Some throughput integration scenarios to test some common flows
> (these were expected to be small in number)
> * 3rd Party API testing (because it had existed previously)
>
> But importantly, Tempest isn't a generic functional test suite. Focus is
> important, because Tempest's mission was always highly aligned with what
> eventually became called DefCore: some way to validate compatibility
> between clouds. Be that clouds built from upstream (is the cloud of 5
> patches ago compatible with the cloud right now?), clouds from
> different vendors, public clouds vs. private clouds, etc.
>
> == The Current Validation Environment ==
>
> Today most OpenStack projects have 2 levels of validation. Unit tests &
> Tempest. That's sort of like saying your house has a basement and a
> roof. For sufficiently small values of house, this is fine. I don't
> think our house is sufficiently small any more.
>
> This has caused things like Neutron's unit tests, which actually bring
> up a full wsgi functional stack and test plugins through http calls
> through the entire wsgi stack, replicated 17 times. It's the reason that
> Neutron's unit tests take many GB of memory to run, and often run longer
> than Tempest does. (Maru has been doing heroic work to fix much of this.)
>
> In the last year we made it *really* easy to get a devstack node of your
> own, configured any way you want, to do any project level validation you
> like. Swift uses it to drive their own functional testing. Neutron is
> working on heading down this path.
>
> == New Challenges with New Projects ==
>
> When we started down this path all projects had user APIs, so all
> projects were something we could think about from a tenant usage
> perspective. Looking at both Ironic and Ceilometer, we now have
> projects that are Admin-API only.
>
> == Contracts or lack thereof ==
>
> I think this is where we start to overlap with Eoghan's thread most.
> Because branchless Tempest assumes that the tests in Tempest are
> governed by a stable contract. The behavior should only change based on
> API version, not on day of the week. In the case that triggered this
> thread, what was really being tested was not an API, but the existence
> of a meter that only showed up in Juno.
>
> Ceilometer is another great instance of something that's often in a
> state of huge amounts of stack tracing because it depends on some
> internals interface in a project which isn't a contract. Or on
> notification formats, which are (largely) unversioned.
>
> Ironic has a Nova driver in their tree, which implements the Nova driver
> internals interface. Which means they depend on something that's not a
> contract. It gets broken a lot.
>
> == Depth of reach of a test suite ==
>
> Tempest can only reach so far into a stack given that its levers are
> basically public API calls. That's ok. But it means that things like
> testing a bunch of different dbs in the gate (i.e. the postgresql job)
> are pretty ineffectual. Trying to exercise code 4 levels deep through
> API calls is like driving a rover on Mars. You can do it, but only very
> carefully.
>
> == Replication ==
>
> Because there is such a huge gap between unit tests, and Tempest tests,
> replication of issues is often challenging. We have the ability to see
> races in the gate due to volume of results, that don't show up for
> developers very easily. When you do 30k runs a week, a ton of data falls
> out of it.
>
> A good instance is the live snapshot bug. It was failing on about 3% of
> Tempest runs, which means that it had about a 10% chance of killing a
> patch on its own. So it's definitely real. It's real enough that if we
> enable that path, there are a ton of extra rechecks required by people.
> However it's at a frequency that reproducing on demand is hard. And
> reproducing with enough signal to make it debuggable is also hard.
>
> == The Fail Pit ==
>
> All of which has somewhat led us to the fail pit. Where keeping
> OpenStack in a state that it can actually pass Tempest consistently is a
> full time job. It's actually more than a full time job, it's a full time
> program. If it were its own program it would probably be larger than
> half the official programs in OpenStack.
>
> Also, when the Gate "program" is understaffed, it means that all the
> rest of the OpenStack programs (possibly excepting infra and tripleo
> because they aren't in the integrated gate) are slowed down
> dramatically. That velocity loss has real community and people power
> implications.
>
> This is especially true of people trying to get time, review, or
> mentoring out of the QA team. There is a natural overlap with the folks
> who actually keep us able to merge code, so while the Gate is under
> water, help on Tempest issues isn't going to come at any really
> responsive rate.
>
> Also, all the folks who have been the workhorses here, myself, joe
> gordon, matt treinish, matt riedemann, are pretty burnt out on this.
> Every time we seem to nail one issue, 3 more crop up. Having no ending
> in sight and spending all your time shoveling out other project bugs is
> not a happy place to be.
>
> == New Thinking about our validation layers ==
>
> I feel like an ideal world would be the following:
>
> 1. all projects have unit tests for their own internal testing, and
> these pass 100% of the time (note, most projects have races in their
> unit tests, and they don't pass 100% of the time. And they are low
> priority to fix).
> 2. all projects have a functional devstack job with tests *in their own
> tree* that pokes their project in interesting ways. This is akin to what
> neutron is trying and what swift is doing. These are *not* co-gating.
> 3. all non public API contracts are shored up by landing contract tests
> in projects. We did this recently with Ironic in Nova -
> https://github.com/openstack/nova/blob/master/nova/tests/virt/test_ironic_api_contracts.py.
>
> 4. all public API contracts are tested in Tempest (these are co-gating,
> and ensure a contract breakage in keystone doesn't break swift).
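To make #3 concrete: a contract test pins down the shape of an un-versioned internal interface so that a breaking change fails fast in the owning project's own tree. This is only a minimal sketch in the spirit of the linked test_ironic_api_contracts.py; the driver class and the method signature here are made up for illustration.

```python
import inspect
import unittest


class FakeComputeDriver(object):
    """Hypothetical stand-in for an internal driver interface that
    another project depends on without a formal contract."""

    def spawn(self, context, instance, image_meta, injected_files,
              admin_password, network_info=None, block_device_info=None):
        pass


class DriverContractTest(unittest.TestCase):
    """Fail loudly if the internal interface changes shape."""

    def test_spawn_signature(self):
        # Compare the full parameter list, not just the arity, so that
        # renames and reorderings are caught too.
        params = list(inspect.signature(FakeComputeDriver.spawn).parameters)
        self.assertEqual(
            ['self', 'context', 'instance', 'image_meta', 'injected_files',
             'admin_password', 'network_info', 'block_device_info'],
            params)
```

Landing a test like this in the project that owns the interface means a signature change breaks that project's own unit-test run, rather than surfacing later as a mystery gate failure in the consumer.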
>
> Out of these 4 levels, we currently have 2 (1 and 4). In some projects
> we're making #1 cover 1 & 2. And we're making #4 cover 4, 3, and
> sometimes 2. The problem with this is that it's actually pretty
> wasteful, and when things fail, they fail so far away from the test
> that reproducing the failure is hard.
>
> I actually think that if we went down this path we could actually make
> Tempest smaller. For instance, negative API testing is something I'd say
> is really #2. While these tests don't take a ton of time, they do add a
> certain amount of complexity. It might also mean that admin tests, whose
> side effects are sometimes hard to understand without white/greybox
> interactions, might be migrated into #2.
>
> I also think that #3 would help expose much more surgically what the
> cross project pain points are instead of proxy efforts through Tempest
> for these subtle issues. Because Tempest is probably a terrible tool to
> discover that notifications in nova changed. The result is some weird
> failure in a ceilometer test which says some instance didn't run when it
> was expected, and then you have to dig through 5 different openstack
> logs to figure out that it was really a deep exception somewhere. If it
> was logged, which it often isn't. (I actually challenge anyone to figure
> out the reason for a ceilometer failure from a Tempest test based on its
> current logging. :) )
>
> And ensuring specific functionality earlier in the stack, letting Nova
> beat up Nova the way they think they should in a functional test (or
> landing a Neutron functional test to ensure that it's doing the right
> thing), would make the Tempest runs that are co-gating a ton more
> predictable.
>
> == Back to Branchless Tempest ==
>
> I think the real issue that projects are running into with branchless
> Tempest is that they are coming forward with tests not in class #4,
> which fail because, while the same API existed 4 months ago as today,
> the semantics of the project have changed in a non-discoverable way.
> I'd say that's bad, but until we tried the radical idea of running the
> API test suite against all releases that declared they had the same
> API, we didn't see it. :)
>
>
> Ok, that was a lot. Hopefully it was vaguely coherent. I want to preface
> that I don't consider this all fully formed, but it's a lot of what's
> been rattling around in my brain.
Thanks for the very detailed response Sean.
There's a lot in there, some of it background, some of it more focussed
on the what-next question.
I'll need to take a bit of time to digest all of that, and also discuss
at the weekly ceilometer meeting tomorrow. I'll circle back with a more
complete response after that.
Cheers,
Eoghan
> -Sean
>
> On 07/09/2014 05:41 AM, Eoghan Glynn wrote:
> >
> > TL;DR: branchless Tempest shouldn't impact on backporting policy, yet
> > makes it difficult to test new features not discoverable via APIs
> >
> > Folks,
> >
> > At the project/release status meeting yesterday[1], I raised the issue
> > that featureful backports to stable are beginning to show up[2], purely
> > to facilitate branchless Tempest. We had a useful exchange of views on
> > IRC but ran out of time, so this thread is intended to capture and
> > complete the discussion.
> >
> > The issues, as I see it, are:
> >
> > * Tempest is expected to do double-duty as both the integration
> >   testing harness for upstream CI and as a tool for externally
> >   probing capabilities in public clouds
> >
> > * Tempest has an implicit bent towards pure API tests, yet not all
> > interactions between OpenStack services that we want to test are
> > mediated by APIs
> >
> > * We don't have any integration test harness other than Tempest
> > that we could use to host tests that don't just make assertions
> > about the correctness/presence of versioned APIs
> >
> > * We want to be able to add new features to Juno, or fix bugs of
> > omission, in ways that aren't necessarily discoverable in the API;
> > without backporting these patches to stable if we wouldn't have
> > done so under the normal stable-maint policy[3]
> >
> > * Integrated projects are required[4] to provide Tempest coverage,
> > so the rate of addition of tests to Tempest is unlikely to slow
> > down anytime soon
> >
> > So the specific type of test that I have in mind would be common
> > for Ceilometer, but also possibly for Ironic and others:
> >
> > 1. an end-user initiates some action via an API
> > (e.g. calls the cinder snapshot API)
> >
> > 2. this initiates some actions behind the scenes
> > (e.g. a volume is snapshot'd and a notification emitted)
> >
> > 3. the test reasons over some expected side-effect
> > (e.g. some metering data shows up in ceilometer)
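Since the side effect in step 3 is asynchronous, a test like this has to poll rather than assert once. A minimal sketch of the polling helper; the client calls in the trailing comment are hypothetical, just restating the three steps:

```python
import time


def wait_for(predicate, timeout=60, interval=1.0):
    """Poll until predicate() is truthy or the timeout expires.

    Returns True on success, False if the deadline passes. Side effects
    like "a sample shows up in ceilometer" are eventually consistent,
    so a single immediate check would be racy.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Sketch of the three steps, with hypothetical client objects:
#   cinder.volume_snapshots.create(volume_id)          # 1. end-user API call
#   # 2. cinder snapshots the volume and emits a notification
#   assert wait_for(lambda: ceilometer.samples.list(   # 3. assert side effect
#       meter_name='snapshot'))
```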
> >
> > The branchless Tempest spec envisages that new features will be added
> > and will need to be skipped when testing stable/previous, but IIUC it
> > requires that the presence of new behaviors is externally
> > discoverable[5].
> >
> > One approach mooted for allowing these kind of scenarios to be tested
> > was to split off the pure-API aspects of Tempest so that it can be used
> > for probing public-cloud-capabilities as well as upstream CI, and then
> > build project-specific mini-Tempests to test integration with other
> > projects.
> >
> > Personally, I'm not a fan of that approach as it would require a lot
> > of QA expertise in each project, lead to inefficient use of CI
> > nodepool resources to run all the mini-Tempests, and probably lead to
> > a divergent hotchpotch of per-project approaches.
> >
> > Another idea would be to keep all tests in Tempest, while also
> > micro-versioning the services such that tests can be skipped on the
> > basis of whether a particular feature-adding commit is present.
> >
> > When this micro-versioning can't be discovered by the test (as in the
> > public cloud capabilities probing case), those tests would be skipped
> > anyway.
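The skip logic described here might look something like the following sketch; the discovery function and version tuples are hypothetical, since the whole point is that some deployments expose nothing to discover:

```python
import unittest


def discovered_microversion(service):
    """Hypothetical version discovery: return the service's advertised
    micro-version as a tuple, or None when the deployment (e.g. an
    opaque public cloud being probed) exposes nothing to discover."""
    return getattr(service, 'microversion', None)


def skip_unless_microversion(service, minimum):
    """Skip a feature-dependent test when the cloud is too old, or when
    the micro-version simply is not discoverable."""
    version = discovered_microversion(service)
    if version is None or version < minimum:
        raise unittest.SkipTest(
            "requires micro-version >= %r, got %r" % (minimum, version))
```

A test for a feature added mid-cycle would call skip_unless_microversion(service, (2, 1)) before making any assertions, so the same test could run unmodified against stable/previous, master, or a public cloud.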
> >
> > The final, less palatable, approach that occurs to me would be to
> > revert to branchful Tempest.
> >
> > Any other ideas, or preferences among the options laid out above?
> >
> > Cheers,
> > Eoghan
> >
> > [1]
> > http://eavesdrop.openstack.org/meetings/project/2014/project.2014-07-08-21.03.html
> > [2] https://review.openstack.org/104863
> > [3] https://wiki.openstack.org/wiki/StableBranch#Appropriate_Fixes
> > [4]
> > https://github.com/openstack/governance/blob/master/reference/incubation-integration-requirements.rst#qa-1
> > [5]
> > https://github.com/openstack/qa-specs/blob/master/specs/implemented/branchless-tempest.rst#scenario-1-new-tests-for-new-features
> >
> > _______________________________________________
> > OpenStack-dev mailing list
> > OpenStack-dev at lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
>
>
> --
> Sean Dague
> http://dague.net
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>