[openstack-dev] [nova] Future of the Nova API

Christopher Yeoh cbkyeoh at gmail.com
Tue Feb 25 06:11:26 UTC 2014


On Mon, 24 Feb 2014 17:37:04 -0800
Dan Smith <dms at danplanet.com> wrote:

> > onSharedStorage = True
> > on_shared_storage = False
> 
> This is a good example. I'm not sure it's worth breaking users _or_
> introducing a new microversion for something like this. This is
> definitely what I would call a "purity" concern as opposed to
> "usability".

If it was just one case it wouldn't matter but when we're inconsistent
across the whole API it is a usability issue because it makes it so
much harder for a user of the API to "learn" it. They may for example
remember that they need to pass a server id, but they also have to
remember for a particular call whether it should be server_id,
instance_uuid, or id. So referring to the documentation (assuming it is
present and correct) becomes required even after using the API for an
extended period of time. It also makes it much more error prone -
simple typos are much less likely to be picked up by reviewers.
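
To make the cost concrete, here is a rough sketch of what a client ends up
carrying around when the key for the same server id varies per call. The
call names and the mapping below are hypothetical, chosen only to
illustrate the point; they are not taken from the real API.

```python
# Hypothetical call names and parameter keys, for illustration only;
# they are not taken from the actual Nova API.
SERVER_ID_KEY = {
    "attach_volume": "server_id",
    "evacuate": "instance_uuid",
    "reboot": "id",
}


def build_params(call, server_id):
    """Return request parameters using whichever key this call expects."""
    try:
        key = SERVER_ID_KEY[call]
    except KeyError:
        raise ValueError("unknown call: %s" % call)
    return {key: server_id}


def build_params_consistent(call, server_id):
    """With a consistent API the call name no longer affects the key."""
    return {"server_id": server_id}
```

The lookup table is exactly the thing a user has to memorise or look up in
the documentation; with consistent naming it disappears.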

Imagine we had to use a python library where sometimes the method and
parameter names were in snake_case, others CamelCase. Sometimes a mix
of the two in the same call. Sometimes it would refer to a widget as
widget and other times you had to refer to it as thingy or the call
failed. And if you passed the wrong parameters in it would sometimes
just quietly ignore the bad ones and proceed as if everything was ok.

Oh and other times it returned saying it had done the work you asked it
to, when really it meant "I'll look at it, but I might not be able to"
(more on this below). I think most developers and reviewers would be
banging their heads on their desks after a while.

> Things like the twenty different datetime formats we expose _do_ seem
> worth the change to me as it requires the client to parse a bunch of
> different formats depending on the situation. However, we could solve
> that with very little code by just exposing all the datetimes again in
> proper format:
> 
>  {
>   "updated_at": "%(random_weirdo)s",
>   "updated_at_iso": "%(isotime)s",
>  }
> 
> Doing the above is backwards compatible and doesn't create code
> organizations based on any sort of pasta metaphor. If we introduce a
> discoverable version tag so the client knows if they will be
> available, I think we're good.

Except we also now need to handle the case where both are passed in and
end up disagreeing. And what about the user confusion where they see
that in most cases updated_at means one thing, so they start assuming
that it always means that, and then get it wrong in the odd case out?
Again, it is harder to code against and harder to review, and it is the
unfortunate side effect of being too lax in what we accept.
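
As a minimal sketch of the extra handling this implies, assume a request
body may carry both fields. The legacy format string and the function name
are assumptions made for the example; the thread does not say which
formats are actually in use.

```python
from datetime import datetime, timezone

# Assumed legacy format, purely for illustration; the thread does not say
# which formats the API actually uses.
LEGACY_FORMAT = "%Y-%m-%d %H:%M:%S"
ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def resolve_updated_at(body):
    """Return one UTC datetime from a body that may carry both fields."""
    values = []
    if "updated_at" in body:
        legacy = datetime.strptime(body["updated_at"], LEGACY_FORMAT)
        values.append(legacy.replace(tzinfo=timezone.utc))
    if "updated_at_iso" in body:
        iso = datetime.strptime(body["updated_at_iso"], ISO_FORMAT)
        values.append(iso.replace(tzinfo=timezone.utc))
    if not values:
        raise ValueError("no timestamp supplied")
    # Refuse to guess when the two representations do not match.
    if len(values) == 2 and values[0] != values[1]:
        raise ValueError("updated_at and updated_at_iso disagree")
    return values[0]
```

Every duplicated field needs a check like this, plus a decision about
which error to return, which is code and review effort we would not
otherwise have.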

> URL inconsistencies seem "not worth the trouble" and I tend to think
> that the "server" vs. "instance" distinction probably isn't either,
> but I guess I'm willing to consider it.

So again I think it comes down to consistency increasing usability - eg
knowing that if you want to operate on a "foo" that you always access
it through /foo rather than most of the time except for those cases when
someone (almost certainly accidentally) ended up writing an interface
where you modify a foo through /bar. The latter makes it much harder to
understand an API.

> Personally, I would rather do what we can/need in order to provide
> features in a compatible way, fix real functional issues (like the
> datetimes), and not ask users to port to a new API to clear up a bunch
> of CamelCase inconsistencies. Just MHO.

So to pick another example of something we can't change in a backwards
compatible way - success return codes.

In the V2 API we have often returned 200 (OK) or 201 (Created) when we
really mean 202 (Accepted). The first two mean we've done what you
wanted; the last means we've got your request, but hey, it might still
fail. This is often the case where we have an async call underneath
somewhere. We can't change the return code now because existing apps
that test for 200 or 201 will break if we start returning 202.

The more experienced users (eg those who have got bitten by the bug)
know that the 200 doesn't really mean the operation requested has
succeeded, but the new naive user doesn't. And so in testing everything
works fine (lighter load, not hitting quotas, fewer races etc). But then
occasionally in production things fail because they're not testing that
the operation has succeeded, just proceeding as if it has because our
API told them it has. That's not a very user-friendly API.
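
A rough sketch of what the careful client has to do instead: treat the
success code as "request received" and poll until the server reaches a
final state. The fake API class, its method names and the status strings
are all hypothetical stand-ins so the example is self-contained; they are
not a real client library.

```python
import itertools


class FakeServerAPI:
    """Stand-in for a compute API; every name here is hypothetical."""

    def __init__(self, final_status):
        # The server reports BUILD twice, then its final status forever.
        self._states = itertools.chain(
            ["BUILD", "BUILD"], itertools.repeat(final_status))

    def create_server(self):
        # 202 Accepted: the request was queued, not completed.
        return 202, "server-1"

    def get_status(self, server_id):
        return next(self._states)


def create_and_wait(api, max_polls=10):
    """Create a server and only report success once it is ACTIVE."""
    code, server_id = api.create_server()
    if code not in (200, 201, 202):
        raise RuntimeError("create rejected with HTTP %d" % code)
    for _ in range(max_polls):
        status = api.get_status(server_id)
        if status == "ACTIVE":
            return server_id
        if status == "ERROR":
            raise RuntimeError("server %s failed to build" % server_id)
    raise RuntimeError("timed out waiting for %s" % server_id)
```

A real client would sleep between polls. The point is that a 202 tells the
user up front that this loop is needed, whereas a 200 or 201 suggests it
is not.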

Chris


