Open Stack

Tue Jan 14 02:59:41 UTC 2014

Caution! Long and opinionated response.

tl;dr:

A versioned API with no API extensions is *not* a static API. If you
want an example of a heavily versioned Compute control API without
extensions that has successfully evolved over the years to meet its
customers' needs, you need look no further than the EC2 API.

We don't need API extensions and they make our Compute API laughably
complex and cumbersome. We should ditch entirely the concept of API
extensions in our next Compute API major release.

Details
-------

All the gory details, including a categorization of which API extensions
need to just go away and which should just be part of the core Compute
API, are below.

On Tue, 2014-01-14 at 06:29 +0800, Christopher Yeoh wrote:
> We have around 50+ plugins and only slightly more than 10 are
> considered to be "core". So with a static API we'd have to have what I
> think would be a pretty long discussion about what functionality we
> will no longer support, and inevitably we'll end up with a much larger
> subset of existing functionality that all OpenStack deployers would
> have to support. Including perhaps all the backend requirements as
> having an API which doesn't actually work is I don't think an
> improvement for client programs on not offering the API functionality
> for a deployment in the first place.

I am not describing a static API. More on that below...

Let's actually talk specifics here, Chris.

In my opinion, out of 78 (!) existing extensions of the OpenStack API,
the *vast* majority of those extensions represent stuff that simply
should have been added to the main Compute API, plain and simple, with a
simple minor/revision version increment since the changes are not
backwards-incompatible at all. The remaining API extensions either
aren't appropriate for a Compute control API targeted at users/tenants
or aren't appropriate for any HTTP API to begin with.

== Should have just been in core API ==

Out of the 78 extensions, I came up with 58 that should have just been
additions and/or modifications to the Compute API (with a version
increment):

admin_actions -- seriously...why wouldn't pause/unpause, etc be part of
the API? if some hypervisor doesn't support the action, then raise
NotImplemented and return an HTTP 501 Not Implemented -- after all,
that's what a 501 was designed for, and client tooling for HTTP APIs
should understand that.

attach_interfaces
block_device_mapping_v2_boot
certificates
console_output
evacuate
fixed_ips
flavor_access
flavor_disabled
flavor_rxtx
flavor_swap
flavorextradata
flavorextraspecs
flavormanage
floating_ip_dns
floating_ip_pools
floating_ips
floating_ips_bulk
instance_actions
keypairs
migrations
multinic
multiple_create
os_networks
os_tenant_networks
quota_classes
quotas
rescue
scheduler_hints
security_group_default_rules
security_groups
server_diagnostics
server_password
server_start_stop
shelve
simple_tenant_usage
used_limits
used_limits_for_admin
user_quotas
virtual_interfaces
volume_attachment_update
volumes

For all of the above, there's really no reason NOT to have the
functionality just be part of the core API.
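
To make the admin_actions point concrete, here is a minimal sketch of
mapping an unsupported action to a 501. The driver classes and the
handle_pause function are hypothetical names invented for illustration;
this is not Nova code, just the shape of the idea.

```python
# Hypothetical sketch: names below are illustrative, not Nova's real classes.
from http import HTTPStatus


class MinimalDriver:
    """A pretend hypervisor driver that cannot pause instances."""

    def pause(self, instance_id):
        raise NotImplementedError("pause is not supported by this driver")


class PausingDriver:
    """A pretend hypervisor driver that can pause instances."""

    def pause(self, instance_id):
        return {"id": instance_id, "status": "PAUSED"}


def handle_pause(driver, instance_id):
    """Return an (HTTP status code, body) pair for a pause request."""
    try:
        body = driver.pause(instance_id)
    except NotImplementedError as exc:
        # The action is part of the API; this backend just cannot do it.
        return HTTPStatus.NOT_IMPLEMENTED.value, {"error": str(exc)}
    return HTTPStatus.ACCEPTED.value, body
```

The action stays in the API for every deployment, and a client can tell
"this cloud can't do that" apart from "this API has no such call" by
looking at the status code alone.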

consoles -- Should be part of the core API -- and even worse, there is an
os-getVNCConsole extension and an os-getSPICEConsole extension entry
point instead of just allowing an extensible request dict input...
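
As a rough sketch of what "an extensible request dict input" could look
like: one console action whose request body names the console type. The
get_console function, the "type" key, and the example.com URL are all
assumptions made up for this illustration.

```python
# Hypothetical sketch: the handler and URL scheme are invented for illustration.
SUPPORTED_CONSOLE_TYPES = {"vnc", "spice"}


def get_console(server_id, request):
    """Handle one console action whose request dict carries the console type."""
    console_type = request.get("type")
    if console_type not in SUPPORTED_CONSOLE_TYPES:
        raise ValueError("unsupported console type: %r" % (console_type,))
    # A real implementation would ask the hypervisor driver for a console URL.
    return {
        "console": {
            "type": console_type,
            "url": "https://console.example.com/%s/%s" % (console_type, server_id),
        }
    }
```

Supporting another console type then means adding a value to the set,
not adding another entry point.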

createserverext
extended_availability_zone
extended_floating_ip
extended_ips
extended_ips_max
extended_quotas
extended_server_attributes
extended_services
extended_status
extended_virtual_interfaces_net
extended_volumes
image_size
server_usage (despite name, only adds launched_at and terminated_at
attributes)
user_data

All of the above should have just been part of the core API, with
"extended attributes" added in version increments to the API.
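
Here is one way that could be sketched, assuming a (major, minor) API
version is known for each request. The version numbers and the table of
which attribute arrived when are invented for illustration; only the
attribute names launched_at and terminated_at come from the server_usage
extension mentioned above.

```python
# Hypothetical sketch: version numbers and attribute placement are made up.
# Each attribute is tagged with the (major, minor) API version that added it.
ATTRIBUTE_INTRODUCED_IN = {
    "id": (3, 0),
    "name": (3, 0),
    "status": (3, 0),
    "availability_zone": (3, 1),
    "launched_at": (3, 2),
    "terminated_at": (3, 2),
}


def render_server(server, api_version):
    """Return only the attributes that exist in the requested API version."""
    return {
        key: server.get(key)
        for key, introduced in ATTRIBUTE_INTRODUCED_IN.items()
        if introduced <= api_version
    }
```

A client asking for version (3, 0) sees the original three attributes,
while a client asking for (3, 2) also gets the usage timestamps, with no
extension lookup involved.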

== Should never have been in the API to begin with ==

The following extensions should never have been added to the Compute API
at all, IMO. They are specific to particular hypervisors or deployers,
are not tenant-facing and don't belong in a Compute control API at all. 

agents -- a list of *guest agents* that Xen, VirtualBox, and VMware
hypervisors talk to. Seriously, who cares in the sense of a public
Compute control API? This is the domain of the operator... see below.

aggregates -- was originally XenServer-specific (resource pools)
functionality that was added as an extension to the API with no regard
to how or if the functionality would work in a non-XenServer
environment. It should never have been accepted as an API extension. It
was providing XenServer-specific functionality for an API that is (and
was from the beginning) deliberately supposed to be agnostic of the
underlying hypervisor.

assisted-volume-snapshots -- again, an API extension that is really just
about enabling some functionality for specific hypervisors and
backend storage (GlusterFS). There is *absolutely zero reason this needs
to be part of the API*. If some drivers can take advantage of something
to make something better for the user, what does that have to do with
the control API? Answer: nothing at all. This code again belongs in the
driver layer entirely.

availability_zone -- yes, folks, believe it or not, Availability Zone is
actually an API extension. What does this extension do? Well, it returns
the results of the EC2 API DescribeAvailabilityZones call. As such, it
doesn't belong in the OpenStack Compute API at all. Seriously, if
somebody wants the results of DescribeAvailabilityZones, they should
call the EC2 API endpoint.

cells and cells_capacities -- Cells are not tenant-facing, and they
really are an implementation detail. This should never have polluted the
Compute API as an extension. It belongs elsewhere, either as a separate
scheduler driver that has its own data store that describes the cell
relationships, or as an RPC-only API.

config_drive -- It's an implementation detail and belongs in the driver
layer, not as something queryable or controllable via an HTTP control
API.

deferred_delete -- Frankly, the functionality here is suspect and smells
like it was just ported as-is from a particular deployer's API... If
this kind of use case is common (doubtful), then the existing Compute
API DELETE /servers should just have been enhanced to support "forced
deletion".

disk_config -- I still believe this is mostly an implementation detail
and doesn't belong as an API extension.

fping -- Eh... this isn't about a Compute API at all. It's a monitoring
API -- specific to one implementation -- and doesn't belong in the
Compute API.

hide_server_addresses -- Should never have been a control API extension.
It belongs as a simple configuration option, handled at the manager
layer.

hosts
hypervisors
instance_usage_audit_log
services
baremetal_nodes
baremetal_ext_status

The above are for operators, not for tenants or even "admins" in the
traditional sense of an admin of cloud resources. The Ironic API --
which is targeted at operators -- is where this HTTP API stuff should
have gone, not in a Compute API that is targeted at users. Mixing
operator APIs with tenant/user control APIs is wrong, IMO.

> The other issue is how often with a static API we would have to do
> version bumping. Icehouse is not unique in having the Nova API
> extended in several areas and I doubt Juno would be any different. So
> I expect even if we decided we could simply say "no" to new features
> more often, with a static API we'll be releasing a new API version
> every release for the foreseeable future - what sort of impact does that
> have on deployers and users?

There's a difference between an API that does not have extensions and a
static API. I am not describing a static API. I am describing an HTTP
control API that evolves over time with versioning. You know, just like
the EC2 API has -- dozens of times over the years. In fact, you will
note that new AWS EC2 API versions sometimes come out every two weeks.
Is there some uproar when this happens? No. Developers just look at the
version changelog and documentation and go "oh, hey, cool, new
functionality."

> I think this effect is magnified with continuous deployment. With a
> static API and no ability to not immediately deploy an extension just
> merged, do they have to have an API version bump every time an API
> changing patch lands?

I beg to differ. I would go so far as to say the existing API extension
usage *hampers* deployers using a CD model. With a semver incremental
API versioning system -- just like the one used for the Nova RPC API
incidentally -- change is expected and appropriately discovered/handled.
With API extensions, it's just a mess of "if extensionX.enabled do this,
if extensionY.enabled do this", etc.

Plus, there's no accounting for dependencies between API extensions.
Case in point: ever tried using the SimpleTenantUsage extension when the
OS-SRV-USG extension isn't enabled? Don't bother. The first depends on
database schema changes that the latter introduces, but you'd never know
that from the extension descriptor.

However, you *would* be able to guard for that condition with
incremental API versions, since the dependent OS-SRV-USG code would, by
nature, have an earlier semver API version than the code that introduced
the SimpleTenantUsage calls.
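
A minimal sketch of that guard, assuming the server advertises a
"major.minor" version string. The specific version numbers and function
names are hypothetical; the point is only that one ordered comparison
covers the dependency.

```python
# Hypothetical sketch: the version numbers are invented for illustration.
SERVER_USAGE_FIELDS_ADDED = (3, 2)   # launched_at / terminated_at appear here
TENANT_USAGE_CALLS_ADDED = (3, 4)    # usage reporting, which relies on them


def parse_version(text):
    """Turn a 'major.minor' string into a comparable tuple of ints."""
    major, minor = text.split(".")
    return int(major), int(minor)


def can_report_tenant_usage(server_api_version):
    """One comparison replaces a pile of per-extension 'is it enabled' checks."""
    version = parse_version(server_api_version)
    # Any version that has the usage calls also has the fields they depend on,
    # because the later version number implies everything before it.
    return version >= TENANT_USAGE_CALLS_ADDED
```

Comparing integer tuples rather than raw strings matters here: as
strings, "3.10" sorts before "3.9", which would quietly break the guard.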

> And although we don't really have an official policy around it, I
> think the API extension functionality has been used as a way of
> allowing new functionality into Nova and evaluating it in place before
> deciding whether or not it becomes a core part of Nova. 

I do understand this. But I just think that it's mainly laziness that
drives this. Instead of doing the hard work of determining a useful API
structure ahead of time -- and validating that the new features actually
fit the API audience -- extensions become just one more way of pushing
immature or ill-fitting code into a codebase.

Sorry for ranting.

Best,
-jay

> Chris
> 
> 
> 
>  
>         -jay
>         
>         > ---
>         > Ryan Petrello
>         > Senior Developer, DreamHost
>         > ryan.petrello at dreamhost.com
>         >
>         > On Jan 13, 2014, at 9:23 AM, Jay Pipes <jaypipes at gmail.com>
>         wrote:
>         >
>         > > On Sun, 2014-01-12 at 19:52 -0800, Christopher Yeoh wrote:
>         > >> On my phone so will be very brief but perhaps the
>         extensions extension
>         > >> could publish the jsonschema(s) for the extension. I
>         think the only
>         > >> complicating  factor would be where extensions extend
>         extensions but I
>         > >> think it's all doable.
>         > >
>         > > Am I the only one that sees the above statement as another
>         indication of
>         > > why API extensions should eventually find their way into
>         the dustbin of
>         > > OpenStack history?
>         > >
>         > > -jay
>         > >
>         > >
>         > >
>         > > _______________________________________________
>         > > OpenStack-dev mailing list
>         > > OpenStack-dev at lists.openstack.org
>         > >
>         http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

[openstack-dev] [nova] api schema validation pattern changes