[OpenStack-DefCore] Additional Properties on API responses

Monty Taylor mordred at inaugust.com
Wed Jun 22 13:29:53 UTC 2016


On 06/21/2016 07:49 PM, Rob Hirschfeld wrote:
> Chris - I have a question about this... since this is upstream, will the
> latest (v2) CLI start enforcing this?  If so, it seems like the cloud
> providers should have a real motivation to make the migration since the
> CLI failing would, IMHO, be a major issue.

If the CLI started enforcing this, that would be a major breaking change
and I believe would necessitate a major version bump in the CLI. It
would break end users for no gain to the end user, and I would be VERY
strongly against such a thing happening.

> On 06/21/2016 04:51 PM, Van Lindberg wrote:
>>
>>
>> On Jun 21, 2016 6:30 PM, Chris Hoge <chris at openstack.org> wrote:
>>>
>>> * Should there be a short-term exception for additional properties?
>>
>> Yes, absolutely.
>>
>>> This change has been harmful to vendors, users, and the OpenStack
>>> Powered program, and there should be a facility in DefCore to handle
>>> this. Existing companies, that passed early in the year last year, no
>>> longer pass.
>>
>> This is the crux of the issue; a cloud that passed in December stopped
>> passing in January, not due to any change on the part of the product
>> offered for sale.
>>
>> Making this change even more egregious, this change is not part of any
>> required capability - this was an extraneous change. Even if we want
>> to make "does not return extra properties" a tested capability in the
>> defcorish sense, we would need to put it through the standard process
>> (widely deployed? Aligned with future direction? etc.) and then put in
>> via the standard advisory/approved process.

I agree completely. This is an _effective_ API change without a
deprecation period.
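
To make concrete what "additional properties" checking means, here is a
minimal sketch. It is not the Tempest or CLI implementation; the field
set, the record, and the extension's value are made up for illustration.
Lenient mode is what clients have effectively been doing, and strict
mode is the behavior that causes a previously passing cloud to fail.

```python
# Hypothetical field set for illustration; this is NOT the real Nova schema.
KNOWN_SERVER_FIELDS = {"id", "name", "status", "metadata"}


def extra_properties(record, known=KNOWN_SERVER_FIELDS):
    """Return the sorted keys in record that the schema does not define."""
    return sorted(set(record) - set(known))


def check_response(record, strict):
    """Lenient mode ignores unknown keys; strict mode rejects them."""
    extras = extra_properties(record)
    if strict and extras:
        raise ValueError(
            "additional properties not allowed: " + ", ".join(extras))
    return record


# A server record carrying one vendor extension key (value is made up).
server = {
    "id": "abc123",
    "name": "web1",
    "status": "ACTIVE",
    "metadata": {},
    "RAX-SI:image_schedule": {"retention": 7},
}
```

The same record is accepted with strict=False and rejected with
strict=True, even though nothing about the cloud itself changed.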

>>> While there has been warning, product decisions move more
>>> slowly.
>>
>> There have been lots of warnings on lots of things. That doesn't mean
>> that whatever makes it into upstream automatically is part of
>> Openstack(TM). The things that are part of the Openstack trademark are
>> solely decided according to the defcore process - a process that was
>> negotiated out for literally years.
>>
>>> * Should there be a permanent exception for additional properties?
>>
>> Yes, until such time as "Does not return additional properties" is an
>> accepted defcore capability.
>>
>>> However, upstream is moving in this
>>> direction, and it's only a matter of time before more projects and tools
>>> adopt strict response checking, making it a long-term interoperability
>>> issue regardless of the position DefCore takes.
>>
>> I would add that I oppose any restriction on the ability to provide
>> extensions until such point as there is a workable alternative to
>> migrate to, plus an appropriate time to migrate.
>>
>> I believe that this whole issue is an inappropriate technical fix for
>> what is really a social problem.

I disagree that this is an inappropriate fix. The vendor extension
mechanism originally was an inappropriate technical solution to a social
problem. The social problem at its root isn't so much a problem any more
- we all mostly get along now and can actually communicate. Getting rid
of vendor extensions in the API is a Very Good Thing for end users. As a
person who develops tools to consume OpenStack clouds, I can tell you
that vendor API extensions and turning off API calls using policy.json
are the two biggest _actual_ issues with interoperability.

That said, I wholeheartedly agree that moving forward on the plan must
provide both a workable replacement mechanism for the things existing
deployments have done so far, and a reasonable time frame for people to
move to and adopt the new thing.

Take policy.json as an example, since we're not debating it at the
moment. Its use to disable API calls is a ludicrously bad user
experience, as it causes the user to get Does Not Exist on published
portions of the API ... usually leading the user to troubleshoot a bunch
of other things and think they are going crazy. The API is the API - it
shouldn't change. The API _should_ have a defined response code which is
"this cloud does not enable this feature" - preferably with a payload
that could allow the vendor to point the user at something which has
more information on the disabled feature and/or how the user can
request/purchase the ability to do that thing.
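
As a rough sketch of what I mean, and nothing more than that: the
status code, the body layout, and every name below are hypothetical,
since no such response is defined anywhere today. The point is only
that a client can tell "disabled on this cloud" apart from "does not
exist" and show the user where to look next.

```python
import json

# Hypothetical: no status code or body format has been agreed on for this.
FEATURE_DISABLED_STATUS = 501


def feature_disabled_response(feature, info_url=None):
    """Server side: build a (status, body) pair for a disabled feature."""
    error = {"type": "feature_disabled", "feature": feature}
    if info_url is not None:
        # Optional pointer to vendor docs or a purchase/request page.
        error["info_url"] = info_url
    return FEATURE_DISABLED_STATUS, json.dumps({"error": error})


def explain(status, raw_body):
    """Client side: turn a response into a message a user can act on."""
    if status == 404:
        return "resource not found"
    if status == FEATURE_DISABLED_STATUS:
        error = json.loads(raw_body)["error"]
        message = "this cloud does not enable " + error["feature"]
        if "info_url" in error:
            message += " (see " + error["info_url"] + ")"
        return message
    return "unexpected status %d" % status
```

With something like this, a disabled call no longer looks the same as a
typo in the URL.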

HOWEVER, policy.json is what it is and has been that way for six years.
So at whatever point we develop a replacement for it, be it my
suggestion above or something else, it needs to be able to address in a
sane manner the use cases that people have been solving with
policy.json. And we must give the vendors both specific guidance and
reasonable time to react.

>>> * What is our guidance for vendors going forward?
>>>
>>> My suggestion to vendors is to use the next year to adjust their product
>>> strategy and releases. The ideal solution is to work with upstream to
>>> have additional properties rolled into a new micro-version [2], which
>>> would force those properties to be adopted upstream into the Tempest
>>> library.
>>
>> I am generally supportive of this, but that presupposes that
>> microversions actually provide a realistic alternative mechanism. This
>> has not been shown yet.

I think microversions themselves are working fine. Looking at the
Rackspace server record extensions (purely because I'm responding to Van
and it's a ready example), there are two main extensions that are
specific to Rackspace:

servers.RAX-SI:image_schedule
and
servers.RAX-PUBLIC-IP-ZONE-ID:publicIPZoneId

Both _could_ be handled using nova metadata today. There are several
other public clouds that add a couple of salient pieces of information
into the metadata dict. "Here's how this instance should talk to our
proprietary logging service" is one I've seen elsewhere that springs to
mind.

Metadata is really for users though, and there are limits on how many
keys/values you can put in there, so if vendors start using up all the
user metadata, that seems bad.
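
A small sketch of the concern, assuming a per-server cap on metadata
keys. The limit is passed in as a parameter because the actual value is
a deployment setting; the key names are invented for the example.

```python
def metadata_budget(metadata, vendor_keys, limit):
    """Count how a per-server metadata key limit is split up.

    metadata is the server's metadata dict, vendor_keys is the set of
    keys the cloud itself injected, and limit is the per-server cap.
    """
    used_by_vendor = sum(1 for key in metadata if key in vendor_keys)
    used_by_user = len(metadata) - used_by_vendor
    free = max(limit - len(metadata), 0)
    return {"vendor": used_by_vendor, "user": used_by_user, "free": free}


# Invented example: the cloud injects two keys, the user sets one.
example = metadata_budget(
    {"logging_endpoint": "x", "billing_tag": "y", "role": "web"},
    vendor_keys={"logging_endpoint", "billing_tag"},
    limit=5,
)
```

Every key the cloud injects comes straight out of the budget the user
would otherwise have.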

A simple microversion extension would be to maybe add a second dict like
metadata that is specifically for additional information the cloud wants
to communicate. There is already a construct for this in Nova - in the
metadata service and config-drive there is vendor_data.json. In fact,
Rackspace currently uses it to communicate static IP information - and
it's both super helpful and appropriate. (As a user of this, it's really
easy to do a check: is vendor_data.json non-empty, and does it have
"Provider": "Rackspace"? If so, I should do XXX.)
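
The check is about this simple. This is a sketch that assumes the
contents of vendor_data.json have already been read into a string (from
config-drive or the metadata service); the function name is mine, and
only the "Provider" key comes from the example above.

```python
import json


def provider_from_vendor_data(raw):
    """Return the "Provider" value from vendor_data.json text, or None.

    Returns None when the text is empty, is not valid JSON, is not a
    JSON object, or has no "Provider" key.
    """
    if not raw or not raw.strip():
        return None
    try:
        data = json.loads(raw)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        return None
    if not isinstance(data, dict):
        return None
    return data.get("Provider")
```

A caller can then branch on provider_from_vendor_data(raw) ==
"Rackspace" and fall back to generic behavior otherwise.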

As a user, it would be AWESOME if the contents of vendor_data.json were
available in the server dict. Adding a vendor_data dict to the server
record is a simple microversion bump.
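
To illustrate the shape I have in mind, here is a sketch that takes a
server record with extension keys at the top level and folds them into
a single vendor_data dict. This is not how Nova does or would implement
it; the function, the field set, and the zone value are assumptions for
the example.

```python
def fold_vendor_extensions(server, known_fields):
    """Return a copy of server with unknown keys moved under "vendor_data".

    Keys listed in known_fields stay at the top level; everything else
    is treated as a vendor extension.
    """
    core = {}
    vendor_data = {}
    for key, value in server.items():
        if key in known_fields:
            core[key] = value
        else:
            vendor_data[key] = value
    core["vendor_data"] = vendor_data
    return core


# Made-up record; only the extension key name comes from the list above.
folded = fold_vendor_extensions(
    {
        "id": "abc123",
        "name": "web1",
        "RAX-PUBLIC-IP-ZONE-ID:publicIPZoneId": "zone-1",
    },
    known_fields={"id", "name"},
)
```

The top level then contains only fields every cloud agrees on, and a
strict client has exactly one well-known place to skip or inspect.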

That said, image_schedule seems like a thing that it would be completely
reasonable to have be a normal part of the server record ... except that
I don't believe OpenStack has a scheduled images feature. It looks like
there have been at least two proposed attempts at that (on a quick
google). So another tack could be putting together a scheduled images
feature and then exposing that feature and the associated metadata in a
microversion.

RAX-PUBLIC-IP-ZONE-ID is a bit trickier, as the docs for it are a bit
more opaque. It says "Enables booting the server from a volume when
additional parameters are given. If specified, the volume status must be
available, and the volume attach_status must be detached." I'm guessing
that's a doc bug. I would guess, without really knowing, that this is
addressing similar issues to the ones that adding availability-zone
support to neutron was aiming at dealing with. Namely - AZs are great,
but what good are they if you can't also associate networking choices
with them. It's possible this is one step deeper and is trying to
communicate cell-level placement too - I really don't know. However,
every possible meaning for this I can come up with in my head is a
thing that would be widely useful to users of OpenStack clouds, so I
cannot imagine that we would not be able to find a legit API that could
come with agreed-upon semantic meanings.



To sum up - I don't think anyone is attempting anything nefarious here,
or has any bad intentions. I think that we're still digging ourselves
out of a hole we made six years ago by not having a clear understanding
shared by everyone as to whether OpenStack was a thing that was intended
to be interoperable and that cloud end users would be OpenStack users,
or if OpenStack was a construction kit whose users were cloud providers
and a user of any of the clouds would not necessarily think of
themselves as an OpenStack user. We had people firmly in both camps. The
interoperable stance is the understanding we've broadly come to - thus
defcore's existence - but that doesn't mean that we aren't left with a
bunch of things, like vendor API extensions, that were put in place in
service of the other world view.

It's a hard position to extricate ourselves from, and it's going to take
continued patience and understanding. We're going to run into more
cases like this where we step on something in one direction or the other
that hurts. I think we'll get through them as long as we can all
remember that we're all in the same boat, and that we all have good
intentions - even when we disagree.

Monty


