Hey,

Some of you may know that for some time I have been experimenting with code generation on the SDK/CLI/Ansible modules side. The reason is simple: huge codebases and too few resources. A further pointer in the same direction, though not the primary goal of this initiative, is that most of the tools we need to maintain do the same thing, with only the adaptations required by the framework in question.

I made many different attempts to use the current OpenStackSDK codebase as a “source of truth” for what exists in which service, and to generate code from it (OSC, Ansible modules, etc.). It quickly became clear that it does not carry enough information to cover all the different needs:

- when a query parameter was added to or dropped from the API

- what the type of a query parameter is and, if it is a list, whether it should be rendered as "?tags=a,b" or "?tags=a&tags=b"

- which fields must be sent in the request, which are optional, and how to deal with sending a name but getting back a UUID in the same field

- what the parameter/resource field types are in detail, with respect to strongly typed programming languages (“int” is not enough to differentiate between u8 and f64)

- how to determine the list of supported “actions” on a resource that go beyond plain CRUD
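
To make the list-rendering point concrete, here is a minimal Python sketch of the two encodings. In OpenAPI terms they correspond to a query parameter with "explode: true" and "explode: false"; the helper name render_query is made up for this illustration and is not part of the SDK.

```python
from urllib.parse import urlencode


def render_query(params, explode):
    """Render query parameters; list values are either repeated or comma-joined.

    explode=True  -> tags=a&tags=b
    explode=False -> tags=a,b
    """
    if explode:
        # doseq=True makes urlencode emit one key=value pair per list element
        return urlencode(params, doseq=True)
    flat = {
        key: ",".join(map(str, value)) if isinstance(value, (list, tuple)) else value
        for key, value in params.items()
    }
    # safe="," keeps the separator literal instead of percent-encoding it
    return urlencode(flat, safe=",")


print(render_query({"tags": ["a", "b"]}, explode=True))
print(render_query({"tags": ["a", "b"]}, explode=False))
```

Without that one flag in the spec, a generator has no way to know which of the two forms a given service expects.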

At first I tried to extend our current model in the SDK, but even that did not help much. So I decided to take another look at OpenAPI, given that we have wanted to move in that direction for several years. First of all, it can carry all the information required for what we need. I then gathered everything we have said about OpenAPI in OpenStack, along with the challenges we have, and compared it against OpenAPI 3.1. Problems:

1) OpenAPI does not support the microversion model. More precisely, you can’t describe different request/response bodies depending on a header value

2) URL + method form a unique combination, which makes it impossible to represent different actions based on the request content

3) All relevant headers are supposed to be described in the spec. For example, Swift uses the X-Object-Meta-* prefix to allow user customisations, which the spec does not currently support

Actually that’s it, everything else was doable. Have I missed something? So what can we do about it:

1) OpenAPI 3.1 is now based on JSON Schema, which allows huge flexibility in describing object schemas, including inheritance and polymorphism (with composition, discriminator, etc.). So we can describe the necessary microversion header, and use “oneOf” in the schema wherever needed. Moreover, it is possible to extend a schema with “x-” prefixed custom properties, so I started adding “x-openstack-” prefixed properties where necessary

2) I personally agree with OpenAPI that URL + method should be enough to uniquely identify what is expected (while not a big deal for the API consumer, it would have been easier on the server side to have the URL uniquely tied to the data and operation instead of dynamically inspecting the request to figure out what to do). This is actually the most complex thing to deal with, and the only solution that comes to my mind is not to try to put everything into a single spec, but rather to have dedicated specs for different operations. That makes such a spec easier to maintain anyway. Maybe a future OpenAPI iteration resolves this and we can adopt the changes.

3) A tricky one, but nothing prevents us from specifying headers as “x-object-meta-*” and defining custom processors for such things, for example to compose/decompose them into a dictionary on the client side
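
For point 3, such a custom processor could be as small as the following Python sketch. The function names are hypothetical and only show the compose/decompose idea; they are not taken from the WIP change.

```python
PREFIX = "x-object-meta-"


def headers_to_metadata(headers, prefix=PREFIX):
    """Decompose: collect every header carrying the prefix into a plain dict."""
    return {
        name[len(prefix):]: value
        for name, value in headers.items()
        # header names are case-insensitive, so compare in lower case
        if name.lower().startswith(prefix)
    }


def metadata_to_headers(metadata, prefix=PREFIX):
    """Compose: turn a metadata dict back into prefixed request headers."""
    return {f"{prefix}{key}": value for key, value in metadata.items()}


response_headers = {"X-Object-Meta-Color": "blue", "Content-Type": "text/plain"}
print(headers_to_metadata(response_headers))
print(metadata_to_headers({"color": "blue"}))
```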

OpenAPI now even allows defining spec extensions if necessary, but I think at the moment we can cover all issues without that.
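
As a rough illustration of point 1, a client could pick the matching “oneOf” variant by microversion along these lines. This is a sketch under assumptions: the property names “x-openstack-min-ver” and “x-openstack-max-ver”, the field names and the version numbers are all invented here and may differ from what the linked specs actually use.

```python
def parse_microversion(value):
    """Turn "2.25" into (2, 25) so versions compare numerically, not as text."""
    major, minor = value.split(".")
    return int(major), int(minor)


# Hypothetical request body schema for some action; every name and version
# in it is made up for this sketch.
ACTION_BODY = {
    "oneOf": [
        {
            "x-openstack-min-ver": "2.1",
            "x-openstack-max-ver": "2.24",
            "type": "object",
            "required": ["host", "legacy_flag"],
        },
        {
            "x-openstack-min-ver": "2.25",
            "type": "object",
            "required": ["host"],
        },
    ]
}


def select_schema(schema, microversion):
    """Return the first oneOf variant whose version range covers microversion."""
    wanted = parse_microversion(microversion)
    for variant in schema["oneOf"]:
        low = parse_microversion(variant.get("x-openstack-min-ver", "0.0"))
        high = variant.get("x-openstack-max-ver")  # no upper bound if absent
        if wanted >= low and (high is None or wanted <= parse_microversion(high)):
            return variant
    raise ValueError(f"no schema variant for microversion {microversion}")


print(select_schema(ACTION_BODY, "2.9")["required"])
print(select_schema(ACTION_BODY, "2.30")["required"])
```

Comparing versions as integer tuples matters here: as plain strings, "2.9" would sort after "2.25".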

Demo time:

- server live migration (in my eyes one of the trickiest things I have seen so far in the OpenStack APIs). Here you have an action requiring a dedicated spec, multiple revisions with microversions, and even a bool parameter that can also be a string: [2] https://paste.opendev.org/show/bcxFi2CrNX1YNEuTzoEh/ More real-life examples are in [4]

- image create and upload: [3] https://paste.opendev.org/show/bNN1t0qVmHEdVHu0Mtrn/ Here it is possible to put several methods into a single spec. You can also see samples of different media types, capture headers, and readOnly/writeOnly properties

(You can actually take those specs and paste them into https://editor-next.swagger.io/ to see how they look)

Closing notes:

I have implemented a WIP change [1] in the SDK that consumes a list of specs and performs the requested operation. A series of SDK commands has been converted to the new way of dealing with APIs. This is backwards compatible, but opens the road to heavily reducing the amount of code in the SDK (precisely by dropping all the customisations that are necessary now). Honestly, my drafts even show a performance improvement on the SDK side of around 100-300ms for various API operations, but I am still not really able to explain why. Maybe it is because most of the SDK’s black magic is avoided. I assume that after completion we could get another performance boost by dropping other unnecessary logic and abstractions. As the change title states, the work is not complete.

Once we have specs for the commands, the code generators can be updated to consume this data and produce much better code. BTW, I now also have a prototype of a new CLI written in Rust, and it is a hypersonic bullet compared to the current OSC. But please do not push me on that yet; it is still at a very early stage and depends heavily on the available specs. I can only say that it is designed so that every API call can get thin CLI coverage just by providing a spec; where additional logic is desired, that will of course require human implementation.
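
To give an idea of what “thin CLI coverage from a spec” can mean, here is a small Python sketch that derives command-line options from an operation's parameter list. It is not how the Rust prototype works; the OPERATION dict and build_parser are hypothetical and only mimic the shape of OpenAPI parameter objects.

```python
import argparse

# Hypothetical operation description, shaped like OpenAPI parameter objects.
OPERATION = {
    "operationId": "image-list",
    "parameters": [
        {"name": "limit", "in": "query", "schema": {"type": "integer"}},
        {"name": "name", "in": "query", "schema": {"type": "string"}},
        {
            "name": "tags",
            "in": "query",
            "schema": {"type": "array", "items": {"type": "string"}},
        },
    ],
}

# Map JSON Schema primitive types to Python converters; default is str.
TYPES = {"integer": int, "number": float, "string": str}


def build_parser(operation):
    """Create one --option per spec parameter; arrays become repeatable options."""
    parser = argparse.ArgumentParser(prog=operation["operationId"])
    for param in operation["parameters"]:
        schema = param.get("schema", {})
        flag = "--" + param["name"].replace("_", "-")
        if schema.get("type") == "array":
            item_type = TYPES.get(schema.get("items", {}).get("type"), str)
            parser.add_argument(
                flag, dest=param["name"], action="append", type=item_type
            )
        else:
            parser.add_argument(
                flag,
                dest=param["name"],
                type=TYPES.get(schema.get("type"), str),
                required=param.get("required", False),
            )
    return parser


args = build_parser(OPERATION).parse_args(["--limit", "5", "--tags", "a", "--tags", "b"])
print(args.limit, args.name, args.tags)
```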

Code generators in the pipe: OSC, AnsibleModules, RustSDK (sync/async), RustCLI. Next things on the radar: Gophercloud, Terraform modules, async Python SDK, JS SDK(?)

If all of that is executed properly and gets some community traction, we can all have the following covered:

- improved standardisation of OpenStack internals and externals: glance and nova (at least those 2) already use jsonschema internally in different areas to describe requests/responses. Why not make this a standard that reaches the service consumers?

- getting rid of api-ref work by updating our sphinx machinery to consume our customised specs and produce nice docs matching reality

- sharing specs between teams to improve interfaces (unlike today, when we need to read the api-ref with tons of bugs, plus the source code, to understand how to cover a new feature in service X). Maybe even a central repo with the specs per release.

- there are plenty of code generators and server bindings for OpenAPI specs, so we can potentially align the frameworks used by different teams and have less to maintain

- less work for all of us who need services talking to each other (not right away, but once the code is switched to consuming specs)

- request verification already on the client side, without waiting for the response

- finally having something to show customers who annoyingly often ask “where are your openapi specs?” (no offence here ;-))
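
As a sketch of the client-side verification point, the snippet below checks a request body against a schema before anything is sent. It is a deliberately tiny stand-in for a real JSON Schema validator: it only looks at “required” and a few primitive “type” values, and the SCHEMA shown is hypothetical, loosely modelled on the “bool that can also be a string” case mentioned above.

```python
def type_matches(value, type_name):
    """Check a value against one JSON Schema primitive type name."""
    if type_name == "boolean":
        return isinstance(value, bool)
    if type_name == "integer":
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(value, int) and not isinstance(value, bool)
    if type_name == "string":
        return isinstance(value, str)
    return True  # other types are not checked in this sketch


def validate_body(body, schema):
    """Return a list of problems; an empty list means the body looks valid."""
    errors = [
        f"missing required field: {name}"
        for name in schema.get("required", [])
        if name not in body
    ]
    for name, value in body.items():
        allowed = schema.get("properties", {}).get(name, {}).get("type")
        if allowed is None:
            continue
        if isinstance(allowed, str):
            allowed = [allowed]  # "type" may be a single name or a list of names
        if not any(type_matches(value, t) for t in allowed):
            errors.append(f"{name}: expected {' or '.join(allowed)}")
    return errors


# Hypothetical schema for illustration only.
SCHEMA = {
    "type": "object",
    "required": ["host"],
    "properties": {
        "host": {"type": "string"},
        "block_migration": {"type": ["boolean", "string"]},
    },
}

print(validate_body({"host": "node-1", "block_migration": "auto"}, SCHEMA))
print(validate_body({"block_migration": 1}, SCHEMA))
```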

I know it is a long message, but I am pretty excited about the progress and would like to hear the community’s opinions. For a more detailed discussion, consider this a pre-announcement of the topic for the PTG in the sdk/cli slots.

Huge investment, but huge outcome

P.S. it can result in a good chunk of relatively easy work for students

Regards,

Artem

[1] https://review.opendev.org/c/openstack/openstacksdk/+/892161

[2] https://paste.opendev.org/show/bcxFi2CrNX1YNEuTzoEh/

[3] https://paste.opendev.org/show/bNN1t0qVmHEdVHu0Mtrn/

[4] https://review.opendev.org/c/openstack/openstacksdk/+/893365