[keystone][nova] Availability zones and unified limits

Jay Pipes jaypipes at gmail.com
Tue Nov 20 15:41:18 UTC 2018

On 11/20/2018 10:11 AM, Tobias Urdin wrote:
> I partially agree with your statements.
> I'm currently rolling the ball on availability zones in our deployments 
> and it's a real pain and I think if there was an
> easier concept for a source of truth for AZs (and aggregates as well) 
> there would be less confusion and a much better
> way forward for inter-project support for such resources.

Keep in mind that host aggregates are not something an end-user is aware 
of -- only operators can see information about host aggregates. This is 
not the same with availability zones, which the end user is aware of.

> My opinion though: quotas really make sense to store in Keystone, since 
> it's the source of truth for users and projects. However,
> I would say that AZs themselves are not on such a high level, and that isn't 
> the same as Keystone knowing about the regions; it's region specific.

My point was that Keystone's catalog can (and should) serve as 
the place to put regional and sub-region information. There is nothing 
in the Keystone concept of a region that denotes it as a geographic 
location nor anything in the concept of a region that prohibits a region 
from representing a smaller grouping of related compute resources -- 
a.k.a. a Nova availability zone.
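
As a rough illustration of that idea, a sub-region standing in for a Nova 
availability zone could be described with a request body for Keystone's 
POST /v3/regions call. The region IDs and the region_body() helper below 
are made up for the example, and nothing here talks to a real Keystone; 
it only builds the JSON.

```python
import json


def region_body(region_id, parent_region_id=None, description=""):
    """Build a JSON body for Keystone's POST /v3/regions (hypothetical helper)."""
    region = {"id": region_id, "description": description}
    # A sub-region is just a region that names its parent.
    if parent_region_id is not None:
        region["parent_region_id"] = parent_region_id
    return {"region": region}


# Illustrative IDs only: a top-level region and one sub-region per AZ.
parent = region_body("RegionOne", description="Top-level region")
az = region_body(
    "RegionOne-az1",
    parent_region_id="RegionOne",
    description="Sub-region representing nova availability zone az1",
)

print(json.dumps(az, indent=2))
```

Keystone itself attaches no meaning to the "-az1" suffix; treating that 
sub-region as an availability zone would be a deployment convention.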

And if Keystone were used as the centralized catalog service that it was 
designed for, it just naturally makes sense to use the region and 
sub-region concept in Keystone's existing APIs to further partition 
quota limits for a project based on those regions. And there's nothing 
preventing anyone from creating a region in Keystone that represents a 
nova availability zone.
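
To make the quota side concrete, here is a minimal sketch of a request body 
for Keystone's POST /v3/limits call that scopes a project limit to such a 
sub-region. The project ID, service ID, region ID and the limit_body() 
helper are placeholders invented for the example; the 64-core figure is 
borrowed from Lance's message. As far as I understand the unified limits 
API, a matching registered limit for the same service, region and resource 
name would need to exist first.

```python
import json


def limit_body(project_id, service_id, resource_name, resource_limit,
               region_id=None):
    """Build a JSON body for Keystone's POST /v3/limits (hypothetical helper)."""
    limit = {
        "project_id": project_id,
        "service_id": service_id,
        "resource_name": resource_name,
        "resource_limit": resource_limit,
    }
    # region_id is optional; here it points at the sub-region that
    # stands in for an availability zone.
    if region_id is not None:
        limit["region_id"] = region_id
    return {"limits": [limit]}


# Placeholder IDs: limit project "foo" to 64 cores in one AZ-like sub-region.
body = limit_body(
    project_id="foo-project-id",
    service_id="nova-service-id",
    resource_name="cores",
    resource_limit=64,
    region_id="RegionOne-az1",
)

print(json.dumps(body, indent=2))
```

This only covers the Keystone half. Nova would still have to map an 
instance's availability zone to that region ID when it checks usage against 
the limit, which is where the real work is.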

> I don't think Keystone should have a view on such a detailed level 
> inside a region, I do agree that there is a void to be filled with 
> something
> that gives the source of truth on AZs, host aggregates etc though. But 
> moving that outside of a region defeats some purpose on the whole idea
> to isolate it in the first place.

I haven't proposed "moving that outside of a region". Not sure where 
that's coming from. Could you elaborate?


> Best regards
> On 11/20/2018 03:10 PM, Jay Pipes wrote:
>> On 11/20/2018 07:33 AM, Lance Bragstad wrote:
>>> Unified limits cropped up in a few discussions last week. After
>>> describing the current implementation of limits, namely the attributes
>>> that make up a limit, someone asked if availability zones were on the
>>> roadmap.
>>> Ideally, it sounded like they had multiple AZs in a single region, and
>>> they wanted to be able to limit usage within the AZ. With the current
>>> implementation, regions would be the smallest unit of a deployment they
>>> could limit (e.g., limit project Foo to only using 64 compute cores
>>> within RegionOne). Instead, the idea would be to limit the number of
>>> resources for that project on an AZ within a region.
>>> What do people think about adding availability zones to limits? Should
>>> it be an official attribute in keystone? What other services would need
>>> this outside of nova?
>>> There were a few other interesting cases that popped up, but I figured
>>> we could start here. I can start another thread if needed and we can
>>> keep this specific to discussing limits + AZs.
>> Keystone should have always been the thing that stores region and
>> availability zone information.
>> When I wrote the regions functionality for Keystone's catalog [1] I
>> deliberately added the concept that a region can have zero or more
>> sub-regions [2] to it. A region in Keystone wasn't (and AFAIK to this
>> day isn't) specific to a geographic location. There's nothing preventing
>> anyone from adding sub-regions that represent a nova availability zone
>> [3] to Keystone.
>> And since there's nothing preventing the existing Keystone limits API
>> from associating a set of limits with a specific region [4] (yes, even a
>> sub-region that represents an availability zone) I don't see any reason
>> why the existing structures in Keystone could not be used to fulfill
>> this functionality from the Keystone side.
>> The problem is *always* going to be on the Nova side (and any project
>> that made the unfortunate decision to copy nova's availability zone
>> "implementation" [5]... hi Cinder! [6]). The way that the availability
>> zone concept has been hacked into Nova means it's just going to be a
>> hack on top of a hack to get per-AZ quotas working in Nova. I know this
>> because Oath deploys this hack on top of a hack in order to divvy up
>> resources per power domain and physical network (those two things and
>> the site/DC essentially comprise what our availability zones are
>> internally).
>> Once again, not addressing the technical debt of years past -- which in
>> this case is the lack of solid modeling of an availability zone concept
>> in the Nova subsystems -- is hindering the forward progress of the
>> project, which is sad.
>> The long-term solution is to have Nova scrap its availability zone code
>> that relies on host aggregate metadata to work, use Keystone's /regions
>> endpoint (and the hierarchy possible therein) as the single source of
>> truth about availability zones, move the association of availability
>> zone out of the Service model and onto the ComputeNode model, and have a
>> cache of real AvailabilityZone data models stored in the Nova API
>> top-level service.
>> I fear that without truly tackling this underlying problem, we can make
>> lots of progress on the Keystone side but things will slow to a crawl
>> with people trying to figure out what the heck is going on in Nova with
>> availability zones and how they could be tied in to quota handling.
>> Sorry to be a pessimist^realist,
>> -jay
>> [1]
>> https://github.com/openstack/keystone/commit/7c847578c8ed6a4921a24acb8a60f9264dd72aa1 
>> [2] 
>> https://developer.openstack.org/api-ref/identity/v3/index.html#regions
>> [3] As I've mentioned numerous times before, there's nothing
>> "availability" about a nova availability zone. There are no guarantees
>> about failure domains leaking across multiple nova availability zones.
>> Nova's availability zones are an anachronism from when a nova endpoint
>> serviced a single isolated set of compute resources.
>> [4]
>> https://developer.openstack.org/api-ref/identity/v3/index.html?expanded=create-limits-detail#create-limits 
>> [5] Behold, the availability zone implementation in Nova:
>> https://github.com/openstack/nova/blob/ea26392239e67b9504801ee9a478e066ffa2951f/nova/availability_zones.py 
>> You'll notice there's no actual data model for an AvailabilityZone
>> anywhere in either the nova_api or nova_cell databases:
>> https://github.com/openstack/nova/blob/ea26392239e67b9504801ee9a478e066ffa2951f/nova/db/sqlalchemy/api_models.py 
>> https://github.com/openstack/nova/blob/ea26392239e67b9504801ee9a478e066ffa2951f/nova/db/sqlalchemy/models.py 
>> Which puts "availability zone" in rare company inside Nova as one of the
>> only data concepts that has no actual data model behind it. It shares
>> this distinction with one other data concept... yep, you guessed it...
>> the "region".
>> [6] Sorry, Cinder, you inherited the lack of a real data model for
>> availability zones from Nova:
>> https://github.com/openstack/cinder/blob/b8167a5c3e5952cc52ff8844804b7a5ab36459c8/cinder/volume/api.py#L115-L153 
