[openstack-dev] [Nova] Some thoughts on API microversions

John Garbutt john at johngarbutt.com
Thu Aug 4 16:47:38 UTC 2016

On 4 August 2016 at 14:18, Andrew Laski <andrew at lascii.com> wrote:
> On Thu, Aug 4, 2016, at 08:20 AM, Sean Dague wrote:
>> On 08/03/2016 08:54 PM, Andrew Laski wrote:
>> > I've brought some of these thoughts up a few times in conversations
>> > where the Nova team is trying to decide if a particular change warrants
>> > a microversion. I'm sure I've annoyed some people by this point because
>> > it wasn't germane to those discussions. So I'll lay this out in its own
>> > thread.
>> >
>> > I am a fan of microversions. I think they work wonderfully to express
>> > when a resource representation changes, or when different data is
>> > required in a request. This allows clients to make the same request
>> > across multiple clouds and expect the exact same response format,
>> > assuming those clouds support that particular microversion. I also think
>> > they work well to express that a new resource is available. However I do
>> > think they have some shortcomings in expressing that a resource
>> > has been removed. But in short I think microversions work great for
>> > expressing that there have been changes to the structure and format of
>> > the API.
>> >
>> > I think microversions are being overused as a signal for other types of
>> > changes in the API because they are the only tool we have available. The
>> > most recent example is a proposal to allow the revert_resize API call to
>> > work when a resizing instance ends up in an error state. I consider
>> > microversions to be problematic for changes like that because we end up
>> > in one of two situations:
>> >
>> > 1. The microversion is a signal that the API now supports this action,
>> > but users can perform the action at any microversion. What this really
>> > indicates is that the deployment being queried has upgraded to a certain
>> > point and has a new capability. The structure and format of the API have
>> > not changed so an API microversion is the wrong tool here. And the
>> > expected use of a microversion, in my opinion, is to demarcate that the
>> > API is now different at this particular point.
>> >
>> > 2. The microversion is a signal that the API now supports this action,
>> > and users are restricted to using it only on or after that microversion.
>> > In many cases this is an artificial constraint placed just to satisfy
>> > the expectation that the API does not change before the microversion.
>> > But the reality is that if the API change was exposed to every
>> > microversion it does not affect the ability I lauded above of a client
>> > being able to send the same request and receive the same response from
>> > disparate clouds. In other words exposing the new action for all
>> > microversions does not affect the interoperability story of Nova which
>> > is the real use case for microversions. I do recognize that the
>> > situation may be more nuanced and constraining the action to specific
>> > microversions may be necessary, but that's not always true.
>> >
>> > In case 1 above I think we could find a better way to do this. And I
>> > don't think we should do case 2, though there may be special cases that
>> > warrant it.
>> >
>> > As possible alternate signalling methods I would like to propose the
>> > following for consideration:
>> >
>> > Exposing capabilities that a user is allowed to use. This has been
>> > discussed before and there is general agreement that this is something
>> > we would like in Nova. Capabilities will programmatically inform users
>> > that a new action has been added or an existing action can be performed
>> > in more cases, like revert_resize. With that in place we can avoid the
>> > ambiguous use of microversions to do that. In the meantime I would like
>> > the team to consider not using microversions for this case. We have
>> > enough of them being added that I think for now we could just wait for
>> > the next microversion after a capability is added and document the new
>> > capability there.
>> The problem with this approach is that the capability add isn't on a
>> microversion boundary. As long as we continue to believe that we want to
>> support CD deployments, this means people can deploy code with the
>> behavior change that's not documented or signaled in any way.


I do wonder if we want to relax our support of CD, to some extent, but
that's a different thread.

> The fact that the capability add isn't on a microversion boundary is
> exactly my point. There's no need for it to be in many cases. But it
> would only apply for capability adds which don't affect the
> interoperability of multiple deployments.
> The signaling would come from the ability to query the capabilities
> listing. A change in what that listing returns indicates a behavior
> change.
> Another reason I like the above mechanism is that it handles differences
> in policy better as well. As much as we say that two clouds with the
> same microversions available should accept the same requests and return
> the same responses that's not actually true due to policy checks. I know
> we discussed removing the ability to modify the response based on policy
> so I'm not referring to that. What I mean is that a full action could be
> disabled for a user. In this situation the microversion is useless
> because it can't signal this behavior to the user, while a capabilities
> list could.

I would hate to bloat the list of possible capabilities, like we had
with lots of silly little "extensions".

I am personally leaning towards a more coarse-grained set of
high-level policy and capabilities, so it's more human understandable.

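To make the distinction concrete, here is a minimal sketch of the two
signals being discussed: the microversion range answers "what request and
response format can I use?", while a coarse-grained capabilities list
answers "what am I allowed to do here?". Nova has no such capabilities API
at the time of this thread, so the `Deployment` class, its fields, and the
capability names below are all hypothetical.

```python
from dataclasses import dataclass, field


def parse_version(text):
    """Turn '2.25' into (2, 25) so versions compare numerically."""
    major, minor = text.split(".")
    return int(major), int(minor)


@dataclass
class Deployment:
    """Hypothetical summary of what one cloud advertises to one user."""
    min_version: str
    max_version: str
    # Coarse-grained capability names, invented for this sketch.
    capabilities: frozenset = field(default_factory=frozenset)

    def supports_microversion(self, wanted):
        """Format question: can I speak this microversion here?"""
        low = parse_version(self.min_version)
        high = parse_version(self.max_version)
        return low <= parse_version(wanted) <= high

    def allows(self, capability):
        """Behavior question: is this action available to me here?"""
        return capability in self.capabilities


# Two clouds with identical microversion ranges but different behavior,
# for example because of an upgrade or a policy difference.
cloud_a = Deployment("2.1", "2.25",
                     frozenset({"resize", "revert-resize-from-error"}))
cloud_b = Deployment("2.1", "2.25", frozenset({"resize"}))
```

Both clouds accept the same microversions, yet only `cloud_a` reports the
hypothetical `revert-resize-from-error` capability. The microversion alone
cannot tell the two apart.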
>> A microversion is communication of change that is accessible all the way
>> to the end user (and is currently our only communication channel for
>> that). There are definitely gray areas here, in which case you are
>> deciding whether you over communicate (put in a microversion just in
>> case it turns out to be significant from a programming perspective) or
>> under communicate, bunch things up and figure no one will really mind.
>> When we have discoverable capabilities infrastructure, we can probably
>> move some of these to that. But currently we don't have that. And until
>> we do, version numbers are cheap.
> I agree that version numbers are cheap. My contention is that their
> signal is unclear because they can mean almost anything.
>> > Secondly we could consider some indicator that exposes how new the code
>> > in a deployment is. Rather than using microversions as a proxy to
>> > indicate that a deployment has hit a certain point perhaps there could
>> > be a header that indicates the date of the last commit in that code.
>> > That's not an ideal way to implement it but hopefully it makes it clear
>> > what I'm suggesting. Some marker that a user can use to determine that a
>> > new behavior is to be expected, but not one that's more intended to
>> > signal structural API changes.
>> >
>> > Thoughts?
>> I think we're going to get a ton of push back from people on this. When
>> we first rolled out microversions I got a number of people asking if
>> they could hide the supported microversions, because it gave some
>> indication of the code level on the server (like people hide apache
>> version in production). Which entirely missed the point of the
>> infrastructure. I can't see folks allowing this in. Plus this is git,
>> and people have local patches, so I'm not sure there is any meaningful
>> concept here to expose.
> There may not be anything meaningful to expose. I realize it's far from
> trivial to implement something like that and I don't think using git is
> a good idea; it was purely for illustration.
> It's fair to say that information of this nature may be too sensitive to
> expose, though we are actually exposing it coarsely with microversions.

I like to think we are exposing the minimum needed to aid interop.
Although that may not be entirely accurate, I like that aim.
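As a rough illustration of that minimum: a client only needs the server's
advertised minimum and maximum microversion to choose a version both sides
understand. The sketch below assumes the client has already read those two
values from the version document. The function names are hypothetical, and
the `OpenStack-API-Version: compute X.Y` header form is the one I understand
newer microversions to use (older ones use `X-OpenStack-Nova-API-Version`).

```python
def parse_version(text):
    """Turn '2.25' into (2, 25); comparing as floats would put 2.10 below 2.9."""
    major, minor = text.split(".")
    return int(major), int(minor)


def negotiate(client_min, client_max, server_min, server_max):
    """Return the highest microversion both sides support, or None."""
    low = max(parse_version(client_min), parse_version(server_min))
    high = min(parse_version(client_max), parse_version(server_max))
    if low > high:
        return None  # no overlap between the two ranges
    return "{}.{}".format(*high)


def request_headers(microversion):
    """Build the header that pins a request to one microversion."""
    return {"OpenStack-API-Version": "compute " + microversion}


chosen = negotiate("2.1", "2.30", "2.1", "2.25")
headers = request_headers(chosen) if chosen else {}
```

Nothing in this exchange reveals more about the deployment than the range
itself, which is the coarse exposure mentioned above.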

>> I'm on board with a future where we have the monotonically increasing
>> microversions, as well as a side channel of discoverable capabilities.
>> But I think the moment you try to introduce a 3rd communication channel
>> for behavior, which has something looking like a version number, it
>> actually becomes way too confusing from a consumption point of view. And
>> it also breaks one of the things we were trying to do, which is
>> guarantee old behavior (as much as possible) on old API versions.
> This gets to the point I'm trying to make. We don't guarantee old
> behavior in all cases at which point users can no longer rely on
> microversions to signal non breaking changes. And where we do guarantee
> old behavior sometimes we do it artificially because the only signal we
> have is microversions and that's the contract we're trying to adhere to.
>> I think that if we put some code version into place, we'll just assume
>> we can use that to signal changes, and stop realizing how disruptive it
>> is to make those changes for existing users.
> I already think we're not thinking enough about making those disruptive
> changes because microversions are cheap and don't have a clear
> definition. I personally have not encountered a situation where we've
> said no to a change because we can't be clear how disruptive it is. We
> just say we'll add a microversion and move on.

We have held back on the more disruptive consistency changes for that
reason. For the cases where we add a URL alias, I can live with those.
It's where we would ideally fix the resource representations to make
them prettier / less broken looking. At some point, a user would need
to adopt those changes to get a feature added in a later version. I am
not sure we have a great story there; we just tell users to suck it up.

> And just to be clear, I am completely in support of some of those more
> disruptive changes because there are very valid reasons for them. But
> when the range of what a microversion can represent runs from a new
> field on an instance all the way to a complete resource tree being
> removed from the API across all microversions, I'm not sure we're
> helping users as much as we think.

It does feel like we need a stronger/additional signal for the
removals. I am starting to think we will need to keep some of the
proxies for quite a bit of time after nova-network dies.

