[openstack-dev] [api] API Definition Formats

Ian Cordasco ian.cordasco at RACKSPACE.COM
Mon Jan 12 22:35:11 UTC 2015


On 1/9/15, 15:17, "Everett Toews" <everett.toews at RACKSPACE.COM> wrote:

>One thing that has come up in the past couple of API WG meetings [1] is
>just how useful a proper API definition would be for the OpenStack
>projects.
>
>By API definition I mean a format like Swagger, RAML, API Blueprint, etc.
>These formats are a machine/human readable way of describing your API.
>Ideally they drive the implementation of both the service and the client,
>rather than treating the format like documentation where it’s produced as
>a by-product of the implementation.
>
>I think this blog post [2] does an excellent job of summarizing the role
>of API definition formats.
>
>Some of the other benefits include validation of requests/responses,
>easier review of API design/changes, more consideration given to client
>design, generating some portion of your client code, generating
>documentation, mock testing, etc.
>
>If you have experience with an API definition format, how has it
>benefitted your prior projects?
>
>Do you think it would benefit your current OpenStack project?
>
>Thanks,
>Everett
>
>[1] https://wiki.openstack.org/wiki/Meetings/API-WG
>[2] http://apievangelist.com/2014/12/21/making-sure-the-most-important-layers-of-api-space-stay-open/

Hey Everett,

As we discussed in the meeting, I have some experience with a library
called Interpol [1] and using it in a massive API service. The idea behind
that service was re-written as an open source case study in a project
called Caravan [2].

In short, each and every endpoint used JSON Schema to validate the request
and response for each version of the endpoint. (Yes, endpoints were
versioned individually and that’s a topic for a different discussion.) The
files used by Interpol (which is what applied the defined JSON Schema to
the request/response cycle via Rack middleware) looked something like
https://github.com/bendyworks/caravan/blob/master/lib/endpoint_definitions/users/user_by_id.yml.
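
To make the middleware idea concrete, here is a minimal Python sketch of
what a WSGI equivalent might look like. It is not Interpol (which is Ruby
and Rack based), and the names validate and ResponseValidator are made up
for illustration. The validator only understands a tiny, hand-picked subset
of JSON Schema ("type", "required", "properties"); a real service would
plug in a complete JSON Schema validator instead.

```python
import json

# Hypothetical, hand-rolled subset of JSON Schema; a stand-in for a
# complete validator, not a replacement for one.
TYPES = {"object": dict, "string": str, "integer": int, "array": list}


def validate(instance, schema):
    """Return a list of error strings; an empty list means valid."""
    errors = []
    expected = TYPES.get(schema.get("type"))
    if expected and not isinstance(instance, expected):
        return ["expected %s" % schema["type"]]
    if isinstance(instance, dict):
        for name in schema.get("required", []):
            if name not in instance:
                errors.append("missing property %r" % name)
        for name, sub in schema.get("properties", {}).items():
            if name in instance:
                errors.extend(validate(instance[name], sub))
    return errors


class ResponseValidator(object):
    """WSGI middleware that checks JSON response bodies against a schema."""

    def __init__(self, app, schema):
        self.app = app
        self.schema = schema

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold the wrapped app's status and headers until validated.
            captured["status"], captured["headers"] = status, headers

        body = b"".join(self.app(environ, capture))
        errors = validate(json.loads(body.decode("utf-8")), self.schema)
        if errors:
            # Replace a non-conforming response with an error report.
            body = json.dumps({"errors": errors}).encode("utf-8")
            start_response("500 Internal Server Error",
                           [("Content-Type", "application/json")])
        else:
            start_response(captured["status"], captured["headers"])
        return [body]
```

Wrapping an application is then a one-liner along the lines of
app = ResponseValidator(app, schema). Request validation would follow the
same pattern, reading the body from environ["wsgi.input"] before calling
the wrapped application.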

If you read it closely, you’ll notice that path parameters are part of the
schema [3] and status codes are required [4]. Each part of the schema also
has the ability to be described [5]. This allows for Interpol to
automatically document the API for you. You can also define example
responses [6] so you can prop up a stub application for other
services/applications to use. Finally, Interpol has a way of testing the
endpoint definitions (as they’re referred to) to ensure that the example
data actually does follow the schema provided.
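
As a rough illustration of the pieces described above, here is a Python
sketch of an endpoint definition and of the "do the examples match the
schema" check. The key names (path_params, status_codes, examples and so
on) are my own illustrative choices, not Interpol's actual file format, and
conforms only handles a small subset of JSON Schema.

```python
# Hypothetical endpoint definition, loosely mirroring the layout described
# above; the key names are illustrative, not Interpol's format.
USER_BY_ID = {
    "name": "user_by_id",
    "route": "/users/:user_id",
    "method": "GET",
    "path_params": {"user_id": {"type": "integer"}},
    "status_codes": ["200"],
    "schema": {
        "type": "object",
        "required": ["id", "name"],
        "properties": {
            "id": {"type": "integer", "description": "Unique user id."},
            "name": {"type": "string", "description": "Display name."},
        },
    },
    "examples": [{"id": 42, "name": "Ian"}],
}

TYPES = {"object": dict, "string": str, "integer": int}


def conforms(instance, schema):
    """Check a small, hand-picked subset of JSON Schema keywords."""
    if not isinstance(instance, TYPES[schema["type"]]):
        return False
    if schema["type"] != "object":
        return True
    if any(key not in instance for key in schema.get("required", [])):
        return False
    return all(conforms(instance[key], sub)
               for key, sub in schema.get("properties", {}).items()
               if key in instance)


def invalid_examples(definition):
    """Return the examples that do not match the definition's schema."""
    return [example for example in definition["examples"]
            if not conforms(example, definition["schema"])]
```

Running invalid_examples over every definition in a test suite gives you
the guarantee that the stub data and the schema never drift apart.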

As far as I know, there’s nothing similar to Interpol in Python … yet. I’m
fairly confident that the middleware would take a weekend or two of
sprinting to complete. Further, we could allow for more formats than YAML
but I think this could tie in well with the gabbi testing discussion
taking place. The rest might take a bit longer to complete.

In short, using schemas in test and in production allowed the
integration/acceptance tests to remain far more succinct. If you have
something enforcing your request and response formats then you can simply
test that you did get a status code 200 because something else has
validated the contents. If you want to validate that there are items in the
array, you can skip validating the other properties because if there’s at
least one, the objects inside have been validated by the middleware (so
you can assert at least one came back and be confident).
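
A test written in that style might look like the sketch below. The
get_users function is a hypothetical stand-in for a request made through
the validated stack; it assumes validating middleware has already checked
each object in the body, which is why the test itself stays so short.

```python
import json
import unittest


def get_users():
    """Hypothetical stand-in for a request through the validated stack.

    Assumes validating middleware has already checked every object in the
    body against the endpoint's schema before the test sees it.
    """
    return 200, json.dumps([{"id": 1, "name": "ian"}])


class TestListUsers(unittest.TestCase):
    def test_returns_at_least_one_user(self):
        status, body = get_users()
        self.assertEqual(status, 200)
        # No per-property assertions: the middleware owns that check.
        self.assertGreaterEqual(len(json.loads(body)), 1)
```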

This worked extremely well in my experience and helped improve development
time for new endpoints and new endpoint versions. The documentation was
also heavily used for the multiple internal clients for that API.

The company that used this ran the validation in production (as well as
in testing) and had no problems with scaling or performance.

The problem with building something like this /might/ be tying it in to
the different frameworks used by each of the services, but on the whole
that could be delegated to each service as it looks to integrate.

From my personal perspective, YAML is a nice way to document all of this
data, especially since it’s a format that most any language can parse. We
used these endpoint definitions to simplify how we wrote clients for the API
we were developing and I suspect we could do something similar with the
existing clients. It would also definitely help any new clients that
people are currently writing. The biggest win for us would be having our
documentation mostly auto-generated for us and having a whole suite of
tests that would check that a real response matches the schema. If it
doesn’t, we know the schema needs to be updated and then the docs would be
automatically updated as a consequence. It’s a nice way of enforcing that
the response changes are documented as they’re changed.
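
For a sense of how little is needed to generate docs from a definition,
here is a small Python sketch. The definition layout and the render_docs
helper are assumptions made for illustration; a real generator would emit
HTML or reStructuredText and cover requests, status codes and examples too.

```python
# Hypothetical definition; key names are illustrative only.
DEFINITION = {
    "route": "/users/:user_id",
    "method": "GET",
    "schema": {
        "required": ["id"],
        "properties": {
            "id": {"type": "integer", "description": "Unique user id."},
            "name": {"type": "string", "description": "Display name."},
        },
    },
}


def render_docs(definition):
    """Render one endpoint definition as plain-text documentation."""
    schema = definition["schema"]
    required = set(schema.get("required", []))
    lines = ["%s %s" % (definition["method"], definition["route"])]
    for name, prop in sorted(schema["properties"].items()):
        flag = "required" if name in required else "optional"
        lines.append("  %s (%s, %s): %s" % (
            name, prop["type"], flag, prop.get("description", "")))
    return "\n".join(lines)


print(render_docs(DEFINITION))
```

Because the docs are derived from the same data the validator enforces, a
schema change shows up in the documentation without anyone editing it by
hand.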

Cheers,
Ian

[1] https://github.com/seomoz/interpol
[2] https://github.com/bendyworks/caravan
[3] https://github.com/bendyworks/caravan/blob/aa05fb345ad346b85fa989e8574784912104570b/lib/endpoint_definitions/users/user_by_id.yml#L8..L12
[4] https://github.com/bendyworks/caravan/blob/aa05fb345ad346b85fa989e8574784912104570b/lib/endpoint_definitions/users/user_by_id.yml#L18
[5] https://github.com/bendyworks/caravan/blob/aa05fb345ad346b85fa989e8574784912104570b/lib/endpoint_definitions/users/user_by_id.yml#L20..L43
[6] https://github.com/bendyworks/caravan/blob/aa05fb345ad346b85fa989e8574784912104570b/lib/endpoint_definitions/users/user_by_id.yml#L45


