[keystone][nova] Availability zones and unified limits

Tobias Urdin tobias.urdin at binero.se
Wed Nov 21 09:03:52 UTC 2018


On 11/20/2018 04:42 PM, Jay Pipes wrote:
> On 11/20/2018 10:11 AM, Tobias Urdin wrote:
>> I partially agree with your statements.
>>
>> I'm currently getting the ball rolling on availability zones in our
>> deployments and it's a real pain. I think if there was an easier
>> concept for a source of truth for AZs (and aggregates as well) there
>> would be less confusion and a much better path forward for
>> inter-project support for such resources.
> Keep in mind that host aggregates are not something an end-user is aware
> of -- only operators can see information about host aggregates. This is
> not the same with availability zones, which the end user is aware of.
My bad, keeping it to availability zones.
>
>> My opinion, though: quotas really make sense to store in Keystone since
>> it's the source of truth for users and projects. However,
>> I would say that AZs themselves are not on such a high level, and that
>> isn't the same as Keystone knowing about the regions; it's region-specific.
> My point was that Keystone's catalog can (and should) serve as
> the place to put regional and sub-region information. There is nothing
> in the Keystone concept of a region that denotes it as a geographic
> location nor anything in the concept of a region that prohibits a region
> from representing a smaller grouping of related compute resources --
> a.k.a. a Nova availability zone.
>
> And if Keystone was used as the centralized catalog service that it was
> designed for, it just naturally makes sense to use the region and
> sub-region concept in Keystone's existing APIs to further partition
> quota limits for a project based on those regions. And there's nothing
> preventing anyone from creating a region in Keystone that represents a
> nova availability zone.
I agree that this might currently be the best possible solution, and
perhaps even the only viable one, rather than introducing more technical
debt by, for example, adding more services. I just want to make sure we
are aware of the dependency on Keystone that this would create, which is
my concern.

Would your proposal be that AZs cease to exist in their current form and
are instead replaced by a common community effort to move all existing
projects to use Keystone as the source of truth? I really like that
concept, but I'm not convinced about where we should do that.

There is great potential for improvement here, considering that Nova,
Cinder and Neutron all provide the availability zone concept in what
are, in some cases, somewhat different ways. Because the concept lives
inside each project instead of being moved out to a shared place, it is
painful to manage, e.g. the nova <-> cinder relationship for AZs.
>
>> I don't think Keystone should have a view on such a detailed level
>> inside a region, I do agree that there is a void to be filled with
>> something
>> that gives the source of truth on AZs, host aggregates etc though. But
>> moving that outside of a region defeats some purpose on the whole idea
>> to isolate it in the first place.
> I haven't proposed "moving that outside of a region". Not sure where
> that's coming from. Could you elaborate?
My biggest concern is that we are moving logic that is internal to a
region out to a central dependency, which is what Keystone is. I've
always viewed Keystone (and Horizon deployed alongside it) as the
central part of an OpenStack cloud (yes, even though we stretch
databases between regions and deploy Keystone there, or do federation),
one which does not interact with regions except for those high-level
shared resources that come with authentication and the catalog.

After sleeping on it though, I understand your perspective, and what I
really like is that it's pretty much already there. I'm not sure this
would still be considered availability zones as we see them today; it is
more a region partitioned into sub-regions, so we move down one layer
further, which means you can schedule resources in a sub-region. Does
that mean a gradual move away from, and deprecation of, availability
zones, or is it just aliasing the Keystone region concept under another
name?
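
To make the sub-region idea concrete, here is a minimal sketch of the two
Keystone v3 request bodies involved, as I understand the API: a
sub-region created with parent_region_id (POST /v3/regions) and a
project limit scoped to it with region_id (POST /v3/limits). RegionOne
and the 64 cores come from Lance's example; the sub-region name
"RegionOne-az1", the project and service IDs, and the "cores" resource
name are placeholders I made up for illustration. Nothing is sent to a
real Keystone here, it only builds the JSON.

```python
import json


def region_payload(region_id, parent_region_id=None, description=""):
    """Build the body for POST /v3/regions.

    Setting parent_region_id makes the new region a sub-region.
    """
    region = {"id": region_id, "description": description}
    if parent_region_id is not None:
        region["parent_region_id"] = parent_region_id
    return {"region": region}


def limit_payload(project_id, service_id, region_id, resource_name,
                  resource_limit):
    """Build the body for POST /v3/limits, scoped to a single region."""
    return {
        "limits": [
            {
                "project_id": project_id,
                "service_id": service_id,
                "region_id": region_id,
                "resource_name": resource_name,
                "resource_limit": resource_limit,
            }
        ]
    }


# Hypothetical sub-region standing in for a nova availability zone.
az_region = region_payload(
    "RegionOne-az1",
    parent_region_id="RegionOne",
    description="Sub-region representing one availability zone",
)

# Hypothetical IDs; real ones would come from the project and service
# records in Keystone.
az_limit = limit_payload(
    project_id="PROJECT_FOO_ID",
    service_id="COMPUTE_SERVICE_ID",
    region_id="RegionOne-az1",
    resource_name="cores",
    resource_limit=64,
)

print(json.dumps(az_region, indent=2))
print(json.dumps(az_limit, indent=2))
```

As far as I know, Keystone also expects a matching registered limit
(same service, region and resource name) to exist before a project limit
can be created, so a real deployment would need that step first. This
only covers the Keystone side; Nova would still have to enforce the
limit per sub-region.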

I must say this seems like very important work. The cross-project
involvement is huge, but the win for all projects here sounds
substantial, and a community goal would probably benefit a lot of users.

Best regards
Tobias
>
> Best,
> -jay
>
>> Best regards
>>
>> On 11/20/2018 03:10 PM, Jay Pipes wrote:
>>> On 11/20/2018 07:33 AM, Lance Bragstad wrote:
>>>> Unified limits cropped up in a few discussions last week. After
>>>> describing the current implementation of limits, namely the attributes
>>>> that make up a limit, someone asked if availability zones were on the
>>>> roadmap.
>>>>
>>>> Ideally, it sounded like they had multiple AZs in a single region, and
>>>> they wanted to be able to limit usage within the AZ. With the current
>>>> implementation, regions would be the smallest unit of a deployment they
>>>> could limit (e.g., limit project Foo to only using 64 compute cores
>>>> within RegionOne). Instead, the idea would be to limit the number of
>>>> resources for that project on an AZ within a region.
>>>>
>>>> What do people think about adding availability zones to limits? Should
>>>> it be an official attribute in keystone? What other services would need
>>>> this outside of nova?
>>>>
>>>> There were a few other interesting cases that popped up, but I figured
>>>> we could start here. I can start another thread if needed and we can
>>>> keep this specific to discussing limits + AZs.
>>> Keystone should have always been the thing that stores region and
>>> availability zone information.
>>>
>>> When I wrote the regions functionality for Keystone's catalog [1] I
>>> deliberately added the concept that a region can have zero or more
>>> sub-regions [2] to it. A region in Keystone wasn't (and AFAIK to this
>>> day isn't) specific to a geographic location. There's nothing preventing
>>> anyone from adding sub-regions that represent a nova availability zone
>>> [3] to Keystone.
>>>
>>> And since there's nothing preventing the existing Keystone limits API
>>> from associating a set of limits with a specific region [4] (yes, even a
>>> sub-region that represents an availability zone) I don't see any reason
>>> why the existing structures in Keystone could not be used to fulfill
>>> this functionality from the Keystone side.
>>>
>>> The problem is *always* going to be on the Nova side (and any project
>>> that made the unfortunate decision to copy nova's availability zone
>>> "implementation" [5]... hi Cinder! [6]). The way that the availability
>>> zone concept has been hacked into Nova means it's just going to be a
>>> hack on top of a hack to get per-AZ quotas working in Nova. I know this
>>> because Oath deploys this hack on top of a hack in order to divvy up
>>> resources per power domain and physical network (those two things and
>>> the site/DC essentially comprise what our availability zones are
>>> internally).
>>>
>>> Once again, not addressing the technical debt of years past -- which in
>>> this case is the lack of solid modeling of an availability zone concept
>>> in the Nova subsystems -- is hindering the forward progress of the
>>> project, which is sad.
>>>
>>> The long-term solution is to have Nova scrap its availability zone code
>>> that relies on host aggregate metadata to work, use Keystone's /regions
>>> endpoint (and the hierarchy possible therein) as the single source of
>>> truth about availability zones, move the association of availability
>>> zone out of the Service model and onto the ComputeNode model, and have a
>>> cache of real AvailabilityZone data models stored in the Nova API
>>> top-level service.
>>>
>>> I fear that without truly tackling this underlying problem, we can make
>>> lots of progress on the Keystone side but things will slow to a crawl
>>> with people trying to figure out what the heck is going on in Nova with
>>> availability zones and how they could be tied in to quota handling.
>>>
>>> Sorry to be a pessimist^realist,
>>> -jay
>>>
>>> [1]
>>> https://github.com/openstack/keystone/commit/7c847578c8ed6a4921a24acb8a60f9264dd72aa1
>>>
>>>
>>> [2]
>>> https://developer.openstack.org/api-ref/identity/v3/index.html#regions
>>>
>>> [3] As I've mentioned numerous times before, there's nothing
>>> "availability" about a nova availability zone. There are no guarantees
>>> about failure domains leaking across multiple nova availability zones.
>>> Nova's availability zones are an anachronism from when a nova endpoint
>>> serviced a single isolated set of compute resources.
>>>
>>> [4]
>>> https://developer.openstack.org/api-ref/identity/v3/index.html?expanded=create-limits-detail#create-limits
>>>
>>>
>>> [5] Behold, the availability zone implementation in Nova:
>>> https://github.com/openstack/nova/blob/ea26392239e67b9504801ee9a478e066ffa2951f/nova/availability_zones.py
>>>
>>>
>>> You'll notice there's no actual data model for an AvailabilityZone
>>> anywhere in either the nova_api or nova_cell databases:
>>>
>>> https://github.com/openstack/nova/blob/ea26392239e67b9504801ee9a478e066ffa2951f/nova/db/sqlalchemy/api_models.py
>>>
>>>
>>> https://github.com/openstack/nova/blob/ea26392239e67b9504801ee9a478e066ffa2951f/nova/db/sqlalchemy/models.py
>>>
>>>
>>> Which puts "availability zone" in rare company inside Nova as one of the
>>> only data concepts that has no actual data model behind it. It shares
>>> this distinction with one other data concept... yep, you guessed it...
>>> the "region".
>>>
>>> [6] Sorry, Cinder, you inherited the lack of a real data model for
>>> availability zones from Nova:
>>>
>>> https://github.com/openstack/cinder/blob/b8167a5c3e5952cc52ff8844804b7a5ab36459c8/cinder/volume/api.py#L115-L153
>>>
>>>
>>>
>>
>



