[keystone][nova] Availability zones and unified limits

Jay Pipes jaypipes at gmail.com
Tue Nov 20 14:08:46 UTC 2018


On 11/20/2018 07:33 AM, Lance Bragstad wrote:
> Unified limits cropped up in a few discussions last week. After 
> describing the current implementation of limits, namely the attributes 
> that make up a limit, someone asked if availability zones were on the 
> roadmap.
> 
> Ideally, it sounded like they had multiple AZs in a single region, and 
> they wanted to be able to limit usage within the AZ. With the current 
> implementation, regions would be the smallest unit of a deployment they 
> could limit (e.g., limit project Foo to only using 64 compute cores 
> within RegionOne). Instead, the idea would be to limit the number of 
> resources for that project on an AZ within a region.
> 
> What do people think about adding availability zones to limits? Should 
> it be an official attribute in keystone? What other services would need 
> this outside of nova?
> 
> There were a few other interesting cases that popped up, but I figured 
> we could start here. I can start another thread if needed and we can 
> keep this specific to discussing limits + AZs.

Keystone should have always been the thing that stores region and 
availability zone information.

When I wrote the regions functionality for Keystone's catalog [1] I 
deliberately added the concept that a region can have zero or more 
sub-regions [2] to it. A region in Keystone wasn't (and AFAIK to this 
day isn't) specific to a geographic location. There's nothing preventing 
anyone from adding sub-regions that represent a nova availability zone 
[3] to Keystone.
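
As a rough sketch of what that could look like, the snippet below builds
request bodies for POST /v3/regions: one parent region plus one
sub-region per availability zone. The "<region>-<az>" naming scheme, the
description strings, and the helper name build_region_payloads are
assumptions made for illustration; only the "id", "description" and
"parent_region_id" attributes come from the regions API [2]. Nothing is
sent over the network here.

```python
import json


def build_region_payloads(parent_region_id, az_names):
    """Return POST /v3/regions bodies: the parent first, then one per AZ.

    The "<parent>-<az>" id scheme is a hypothetical convention, not
    something Keystone or Nova mandates.
    """
    payloads = [
        {"region": {"id": parent_region_id,
                    "description": "Top-level region"}}
    ]
    for az in az_names:
        payloads.append({
            "region": {
                "id": f"{parent_region_id}-{az}",
                "description": f"Nova availability zone {az}",
                # Linking to the parent is what makes this a sub-region.
                "parent_region_id": parent_region_id,
            }
        })
    return payloads


if __name__ == "__main__":
    for body in build_region_payloads("RegionOne", ["az1", "az2"]):
        print(json.dumps(body))
```

Each body would be sent as its own authenticated request, parent first,
so that the parent_region_id reference resolves.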

And since there's nothing preventing the existing Keystone limits API 
from associating a set of limits with a specific region [4] (yes, even a 
sub-region that represents an availability zone) I don't see any reason 
why the existing structures in Keystone could not be used to fulfill 
this functionality from the Keystone side.
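
For illustration, a per-AZ limit would then just be a project limit
whose region_id points at the sub-region. The sketch below builds a
request body for POST /v3/limits [4]. The helper name
build_limit_payload, the placeholder IDs, the "RegionOne-az1" sub-region
id and the "cores" resource name are all assumptions, and a matching
registered limit is assumed to already exist in Keystone.

```python
def build_limit_payload(service_id, project_id, region_id,
                        resource_name, resource_limit):
    """Return a POST /v3/limits body holding a single project limit."""
    return {
        "limits": [
            {
                "service_id": service_id,
                "project_id": project_id,
                # A sub-region id standing in for an availability zone.
                "region_id": region_id,
                "resource_name": resource_name,
                "resource_limit": resource_limit,
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical IDs; project "Foo" capped at 64 cores in one AZ.
    print(build_limit_payload(
        service_id="<compute-service-id>",
        project_id="<project-foo-id>",
        region_id="RegionOne-az1",
        resource_name="cores",
        resource_limit=64,
    ))
```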

The problem is *always* going to be on the Nova side (and any project 
that made the unfortunate decision to copy nova's availability zone 
"implementation" [5]... hi Cinder! [6]). The way that the availability 
zone concept has been hacked into Nova means it's just going to be a 
hack on top of a hack to get per-AZ quotas working in Nova. I know this 
because Oath deploys this hack on top of a hack in order to divvy up 
resources per power domain and physical network (those two things and 
the site/DC essentially comprise what our availability zones are 
internally).

Once again, not addressing the technical debt of years past -- which in 
this case is the lack of solid modeling of an availability zone concept 
in the Nova subsystems -- is hindering the forward progress of the 
project, which is sad.

The long-term solution is to have Nova scrap its availability zone code 
that relies on host aggregate metadata to work, use Keystone's /regions 
endpoint (and the hierarchy possible therein) as the single source of 
truth about availability zones, move the association of availability 
zone out of the Service model and onto the ComputeNode model, and have a 
cache of real AvailabilityZone data models stored in the Nova API 
top-level service.
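
To make the shape of that proposal a bit more concrete, here is a very
rough sketch, not Nova code. The AvailabilityZone and ComputeNode
classes and the build_az_cache helper are hypothetical, and treating
every Keystone region that has a parent_region_id as an availability
zone is an assumption made purely for illustration. The input mimics the
"regions" list returned by GET /v3/regions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AvailabilityZone:
    """Hypothetical AZ model backed by a Keystone sub-region."""
    name: str
    region_id: str  # id of the parent Keystone region


@dataclass
class ComputeNode:
    """Hypothetical node model carrying the AZ association directly."""
    hostname: str
    availability_zone: Optional[str] = None


def build_az_cache(regions: List[dict]) -> Dict[str, AvailabilityZone]:
    """Build an AZ cache from a Keystone-style list of region dicts.

    Assumption: a region with a non-null parent_region_id is an AZ.
    """
    cache = {}
    for region in regions:
        parent = region.get("parent_region_id")
        if parent is not None:
            cache[region["id"]] = AvailabilityZone(
                name=region["id"], region_id=parent)
    return cache


if __name__ == "__main__":
    regions = [
        {"id": "RegionOne", "parent_region_id": None},
        {"id": "RegionOne-az1", "parent_region_id": "RegionOne"},
    ]
    cache = build_az_cache(regions)
    node = ComputeNode("compute-01", availability_zone="RegionOne-az1")
    print(cache[node.availability_zone])
```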

I fear that without truly tackling this underlying problem, we can make 
lots of progress on the Keystone side but things will slow to a crawl 
with people trying to figure out what the heck is going on in Nova with 
availability zones and how they could be tied in to quota handling.

Sorry to be a pessimist^realist,
-jay

[1] 
https://github.com/openstack/keystone/commit/7c847578c8ed6a4921a24acb8a60f9264dd72aa1

[2] https://developer.openstack.org/api-ref/identity/v3/index.html#regions

[3] As I've mentioned numerous times before, there's nothing 
"availability" about a nova availability zone. There are no guarantees 
that failure domains won't leak across multiple nova availability zones. 
Nova's availability zones are an anachronism from when a nova endpoint 
serviced a single isolated set of compute resources.

[4] 
https://developer.openstack.org/api-ref/identity/v3/index.html?expanded=create-limits-detail#create-limits

[5] Behold, the availability zone implementation in Nova: 
https://github.com/openstack/nova/blob/ea26392239e67b9504801ee9a478e066ffa2951f/nova/availability_zones.py

You'll notice there's no actual data model for an AvailabilityZone 
anywhere in either the nova_api or nova_cell databases:

https://github.com/openstack/nova/blob/ea26392239e67b9504801ee9a478e066ffa2951f/nova/db/sqlalchemy/api_models.py

https://github.com/openstack/nova/blob/ea26392239e67b9504801ee9a478e066ffa2951f/nova/db/sqlalchemy/models.py

Which puts "availability zone" in rare company inside Nova as one of the 
only data concepts that has no actual data model behind it. It shares 
this distinction with one other data concept... yep, you guessed it... 
the "region".

[6] Sorry, Cinder, you inherited the lack of a real data model for 
availability zones from Nova:

https://github.com/openstack/cinder/blob/b8167a5c3e5952cc52ff8844804b7a5ab36459c8/cinder/volume/api.py#L115-L153


