[openstack-dev] [tempest][nova][defcore] Add option to disable some strict response checking for interop testing

Mike Perez thingee at gmail.com
Sat Jun 18 23:15:48 UTC 2016

On 23:53 Jun 17, Matthew Treinish wrote:
> On Fri, Jun 17, 2016 at 04:26:49PM -0700, Mike Perez wrote:
> > On 15:12 Jun 14, Matthew Treinish wrote:


> > > I don't think backwards compatibility policies really apply to what we define
> > > as the set of tests that as a community we are saying a vendor has to pass to
> > > say they're OpenStack. From my perspective as a community we either take a hard
> > > stance on this and say to be considered an interoperable cloud (and to get the
> > > trademark) you have to actually have an interoperable product. We slowly ratchet
> > > up the requirements every 6 months, there isn't any implied backwards
> > > compatibility in doing that. You passed in the past but not in the newer stricter
> > > guidelines.
> > > 
> > > Also, even if I did think it applied, we're not talking about a change which
> > > would fall into breaking that. The change was introduced a year and a half ago
> > > during kilo and landed a year ago during liberty:
> > > 
> > > https://review.openstack.org/#/c/156130/
> > > 
> > > That's way longer than our normal deprecation period of 3 months and a release
> > > boundary.
> > 
> > <snip>
> > 
> > What kind of communication happens today for these changes? There are so many
> > channels/high volume mailing lists a downstream deployer is expected by the
> > community to be listening in on. A disruptive change introduced a year or
> > longer ago can still be communicated poorly.
> Sure, I agree with that, but I don't think this was necessarily communicated
> poorly. This has been already mentioned a few times on this thread but:
> It was talked about on openstack-dev:
> http://lists.openstack.org/pipermail/openstack-dev/2015-February/057613.html
> On the defcore list: (which is definitely not a high volume/traffic ML)
> http://lists.openstack.org/pipermail/defcore-committee/2015-June/000849.html
> This was also raised as an issue for 1 vendor ~6 months ago. (which is also the
> same duration of the hard deadline being discussed in this thread):
> http://lists.openstack.org/pipermail/defcore-committee/2016-January/000986.html
> IMHO, this was more than enough time to introduce a fix or workaround on their
> end. Likely the easiest being just adding an extra nova-api endpoint with the
> extensions disabled.
> I don't have any links or other evidence to point to, but I know that this
> exact topic has been discussed with people from the vendors having
> difficulties during sessions at at least one of the 2 summits and/or 2 QA
> midcycle meetups since this change landed. I really don't think this is a
> communication problem or unfair surprise for anyone.
> There might be more too, but I don't remember every conversation that I've had
> in the community over the past year. (or where to find the links to point to)

Thanks for the references. So these references show:

* DefCore was aware of these changes long ago.
* Vendors were aware of these changes long ago.
* Referenced vendor is still failing after knowing about this change for six

Question to DefCore, what were you doing in this time frame to prepare vendors
for success with this change rolling out in order to keep a healthy market

> > Just like we've done with Reno in communicating better about disruptive changes
> > in release notes, what tells teams like DefCore about changes with Tempest?
> > (I looked in release.o.o for tempest release notes, although maybe I missed
> > it?)
> Yes, tempest has release notes, they are here:
> http://docs.openstack.org/releasenotes/tempest/
> But, the change in question predates the existence of reno and centralized
> release notes for everything in openstack.
> If this change were pushed today it would definitely be included in the release
> notes. We also would do the same things, put it on the dev list, put it on the
> defcore list. (although probably as a standalone thread this time) I also think
> we'd probably ping hogepodge on irc about it too just so he could also raise it
> up on the defcore side. (which we might have done back then too) Defcore and
> tempest are tightly coupled so we do have pretty constant communication around
> changes being made. But, I do admit we have better mechanisms in place today
> to communicate this kind of change, and hopefully this would be handled better
> now.

This is great! I hope people who have use cases with Tempest are using these
release notes for future big changes.

> > 
> > Since some members of DefCore have interest in making the market place healthy,
> > what is DefCore doing today to communicate these disruptive changes early to
> > deployers? Did it not happen in this particular case because:
> > 
> > * DefCore has no one working closely in the Tempest project to flag things?
> > * Defcore does work closely with Tempest, but somehow the communication for
> >   this was missed?
> > * Not having clear deprecation notices because release notes in Tempest
> >   don't exist (see above)?
> > 
> > This all just sounds like a communication problem, and it makes me sad to
> > interpret this thread as people being angry with deployers as a result. How
> > about we not think the worse of people that are trying to prove our project
> > being successful and start working with them?
> I actually don't think that's the fundamental issue here. Chris and the
> other defcore members interact quite regularly with tempest and QA teams, and
> this exact change has been talked about in both circles before this thread
> started. I also don't think looking at things that happened a year or more ago
> (which is ages in terms of openstack) is a particularly fair assessment. The
> openstack powered program, or whatever it's officially called, was very young
> back then. IIRC, it was only officially done for the first time back around
> vancouver. I don't think it's right to look at things from back then and
> declare there is a communication problem. It seems unfair to everyone working
> in this space. The interactions between defcore and QA have only improved over
> time as both teams have grown.

Right here is partially what I was looking for. I'm really only interested in
how this slipped by. This thread started off with "we need more time". Let me
lay out a poor order of events:

* QA/Tempest folks in February announce this change on the dev list.
* DefCore becomes more of a thing in Vancouver.

* ... throughout here DefCore & tempest/qa speak regularly

Summer 2015:
* This change is announced on the DefCore mailing list
* One vendor notices stuff failing

* Tests fail everywhere for DefCore's interop testing
* Chris asks for more time

While this helps me understand one side of things, I would like to go back to
my earlier question above and earlier email on communication. How was this
being communicated by DefCore to vendors who are failing now?

> Also, I wouldn't say I'm angry with deployers, more like frustrated that this
> discussion is still going on. It's not a new topic, it's been discussed
> multiple times in the past year. This is just the first time it's been raised as
> a huge problem on the dev list. (likely because the certifications from a year
> ago are expiring) 
> The crux of the issue here is we're saying that we want to give the openstack
> trademark to the ~3 vendors [1] that are failing the certification tests because
> of proprietary, non-openstack code they're running in their products. TBH, if
> that's what the foundation and the defcore committee want to do that's perfectly
> fine. I don't necessarily agree with it, but I understand there are larger
> politics involved and I probably don't have a complete picture. If we give these
> vendors another 6 months to fix the problem that seems totally fair. Just as
> long as we clearly mark how these clouds are not interoperable, this way users
> can actually see what the vendors are changing.
> But, I'm still having a hard time understanding why a workaround has to be added
> in tempest to move forward here. We all seem to be in agreement that these
> products don't actually pass the tests, and that tempest is doing the correct
> thing and failing because the api is not actually the nova api.
> It feels to me this would normally be something handled on the defcore side. But
> the only mechanism they currently have for this is flagging a test, which would
> basically mean invalidating most of the tests in defcore (especially if the
> extensions are modifying a resource like servers), and that is a bad idea.
> I think maybe we should be discussing adding a different mechanism to the defcore
> schema to special case these failures. Instead of flagging a test add a new tag,
> something like 'conditional_failures_allowed: True'. Where if a product fails
> this test (with the specific jsonschema exception?) it can be counted as a pass
> but only if they get an asterisk on the marketplace and the incompatibilities
> are documented there too.

I agree, I would prefer this not happen on the Tempest side of things. There
are many things that have specific use cases with Tempest, and I don't think
those should be leaking into Tempest itself. DefCore has their use of it, and
should be able to instead say on their side "here is the set of tests we're
granting exceptions for, defined as a grey list." Yes they will show up as
failures, and yes that's by design, but it won't matter to vendors for now
because they're in a granted grey area by *DefCore's program*.
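
To make the grey list idea a bit more concrete, here is a minimal Python
sketch of how results could be scored on the DefCore side without touching
Tempest. Everything in it is hypothetical: the Result type, the "pass*"
label, the SchemaValidationError name and the score() helper are
illustrations only, not part of any existing DefCore schema or Tempest API.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical outcome labels; "pass*" is the "asterisk" case from the thread.
PASS, FAIL, CONDITIONAL_PASS = "pass", "fail", "pass*"


@dataclass(frozen=True)
class Result:
    """One test outcome as reported by a test run (hypothetical shape)."""
    name: str
    passed: bool
    error: Optional[str] = None  # name of the failure type, if the test failed


def score(results, grey_list, allowed_error="SchemaValidationError"):
    """Map each test name to PASS, FAIL or CONDITIONAL_PASS.

    A failure only counts as a conditional pass when the test is on the
    grey list AND it failed with the one allowed error type. Any other
    failure, or a failure of a test that is not grey-listed, stays a FAIL.
    """
    outcome = {}
    for r in results:
        if r.passed:
            outcome[r.name] = PASS
        elif r.name in grey_list and r.error == allowed_error:
            outcome[r.name] = CONDITIONAL_PASS
        else:
            outcome[r.name] = FAIL
    return outcome


def needs_asterisk(outcome):
    """True if the product relied on at least one grey-listed exception."""
    return any(v == CONDITIONAL_PASS for v in outcome.values())
```

The tests still fail in Tempest itself; only the DefCore-side scoring treats
grey-listed failures differently, and needs_asterisk() is where the
marketplace note about documented incompatibilities would hang off.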

Mike Perez
