[keystone][nova] Availability zones and unified limits
Jay Pipes
jaypipes at gmail.com
Tue Nov 20 14:08:46 UTC 2018
On 11/20/2018 07:33 AM, Lance Bragstad wrote:
> Unified limits cropped up in a few discussions last week. After
> describing the current implementation of limits, namely the attributes
> that make up a limit, someone asked if availability zones were on the
> roadmap.
>
> Ideally, it sounded like they had multiple AZs in a single region, and
> they wanted to be able to limit usage within the AZ. With the current
> implementation, regions would be the smallest unit of a deployment they
> could limit (e.g., limit project Foo to only using 64 compute cores
> within RegionOne). Instead, the idea would be to limit the number of
> resources for that project on an AZ within a region.
>
> What do people think about adding availability zones to limits? Should
> it be an official attribute in keystone? What other services would need
> this outside of nova?
>
> There were a few other interesting cases that popped up, but I figured
> we could start here. I can start another thread if needed and we can
> keep this specific to discussing limits + AZs.
Keystone should have always been the thing that stores region and
availability zone information.
When I wrote the regions functionality for Keystone's catalog [1] I
deliberately added the concept that a region can have zero or more
sub-regions [2] to it. A region in Keystone wasn't (and AFAIK to this
day isn't) specific to a geographic location. There's nothing preventing
anyone from adding sub-regions that represent a nova availability zone
[3] to Keystone.
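A minimal sketch of what registering such a sub-region could look like: it only builds the request body for Keystone's POST /v3/regions call, using the parent_region_id attribute to hang an availability zone under its region. The region IDs ("RegionOne", "RegionOne-az1") and the helper name are made up for illustration, and nothing is sent over the wire.

```python
import json


def subregion_payload(region_id, parent_region_id, description=""):
    """Build the JSON body for POST /v3/regions (hypothetical helper)."""
    return {
        "region": {
            "id": region_id,
            # Pointing at an existing region makes this one a sub-region.
            "parent_region_id": parent_region_id,
            "description": description,
        }
    }


# Illustrative IDs only; a real deployment would pick its own naming scheme.
body = subregion_payload("RegionOne-az1", "RegionOne", "nova AZ az1")
print(json.dumps(body, indent=2))
```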
And since there's nothing preventing the existing Keystone limits API
from associating a set of limits with a specific region [4] (yes, even a
sub-region that represents an availability zone) I don't see any reason
why the existing structures in Keystone could not be used to fulfill
this functionality from the Keystone side.
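As a rough illustration, the body for Keystone's POST /v3/limits call could carry the sub-region's ID in region_id. The sketch below only constructs that body; the project and service IDs are placeholders, "RegionOne-az1" is an assumed sub-region ID, and the helper name is hypothetical. Keystone is expected to require a matching registered limit for the same service, region and resource name before it accepts a project limit, so treat this as one half of the setup.

```python
def az_limit_payload(project_id, service_id, az_region_id,
                     resource_name, resource_limit):
    """Build the JSON body for POST /v3/limits (hypothetical helper)."""
    return {
        "limits": [
            {
                "project_id": project_id,
                "service_id": service_id,
                # A sub-region standing in for an availability zone.
                "region_id": az_region_id,
                "resource_name": resource_name,
                "resource_limit": resource_limit,
            }
        ]
    }


# Mirrors the example from the quoted mail: project Foo, 64 compute cores,
# but scoped to an AZ-level sub-region instead of the whole region.
limit_body = az_limit_payload(
    project_id="<project-foo-id>",      # placeholder
    service_id="<compute-service-id>",  # placeholder
    az_region_id="RegionOne-az1",       # assumed sub-region ID
    resource_name="cores",
    resource_limit=64,
)
print(limit_body["limits"][0]["region_id"])
```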
The problem is *always* going to be on the Nova side (and any project
that made the unfortunate decision to copy nova's availability zone
"implementation" [5]... hi Cinder! [6]). The way that the availability
zone concept has been hacked into Nova means it's just going to be a
hack on top of a hack to get per-AZ quotas working in Nova. I know this
because Oath deploys this hack on top of a hack in order to divvy up
resources per power domain and physical network (those two things and
the site/DC essentially comprise what our availability zones are
internally).
Once again, not addressing the technical debt of years past -- which in
this case is the lack of solid modeling of an availability zone concept
in the Nova subsystems -- is hindering the forward progress of the
project, which is sad.
The long-term solution is to have Nova scrap its availability zone code
that relies on host aggregate metadata to work, use Keystone's /regions
endpoint (and the hierarchy possible therein) as the single source of
truth about availability zones, move the association of availability
zone out of the Service model and onto the ComputeNode model, and have a
cache of real AvailabilityZone data models stored in the Nova API
top-level service.
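To make the last part of that proposal concrete, here is a small sketch of an AvailabilityZone cache derived from a Keystone region listing. It is not Nova code: the AvailabilityZone class, the zones_from_regions helper, and the convention that every region with a parent is an availability zone of that parent are all assumptions made for illustration. The sample data imitates the shape of the "regions" list returned by GET /v3/regions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AvailabilityZone:
    """Hypothetical data model; no such class exists in Nova today."""
    name: str
    region: str


def zones_from_regions(regions):
    """Map sub-region ID -> AvailabilityZone.

    Assumes each region that has a parent_region_id represents an
    availability zone belonging to that parent region.
    """
    return {
        r["id"]: AvailabilityZone(name=r["id"], region=r["parent_region_id"])
        for r in regions
        if r.get("parent_region_id")
    }


# Sample data shaped like the "regions" list from GET /v3/regions.
sample_regions = [
    {"id": "RegionOne", "parent_region_id": None},
    {"id": "RegionOne-az1", "parent_region_id": "RegionOne"},
    {"id": "RegionOne-az2", "parent_region_id": "RegionOne"},
]

az_cache = zones_from_regions(sample_regions)
print(sorted(az_cache))
```

A real cache would also need invalidation and a way to associate each compute node with one of these zones, which this sketch leaves out.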
I fear that without truly tackling this underlying problem, we can make
lots of progress on the Keystone side but things will slow to a crawl
with people trying to figure out what the heck is going on in Nova with
availability zones and how they could be tied in to quota handling.
Sorry to be a pessimist^realist,
-jay
[1]
https://github.com/openstack/keystone/commit/7c847578c8ed6a4921a24acb8a60f9264dd72aa1
[2] https://developer.openstack.org/api-ref/identity/v3/index.html#regions
[3] As I've mentioned numerous times before, there's nothing
"availability" about a nova availability zone. There are no guarantees
that failure domains do not leak across multiple nova availability zones.
Nova's availability zones are an anachronism from when a nova endpoint
serviced a single isolated set of compute resources.
[4]
https://developer.openstack.org/api-ref/identity/v3/index.html?expanded=create-limits-detail#create-limits
[5] Behold, the availability zone implementation in Nova:
https://github.com/openstack/nova/blob/ea26392239e67b9504801ee9a478e066ffa2951f/nova/availability_zones.py
You'll notice there's no actual data model for an AvailabilityZone
anywhere in either the nova_api or nova_cell databases:
https://github.com/openstack/nova/blob/ea26392239e67b9504801ee9a478e066ffa2951f/nova/db/sqlalchemy/api_models.py
https://github.com/openstack/nova/blob/ea26392239e67b9504801ee9a478e066ffa2951f/nova/db/sqlalchemy/models.py
Which puts "availability zone" in rare company inside Nova as one of the
only data concepts that has no actual data model behind it. It shares
this distinction with one other data concept... yep, you guessed it...
the "region".
[6] Sorry, Cinder, you inherited the lack of a real data model for
availability zones from Nova:
https://github.com/openstack/cinder/blob/b8167a5c3e5952cc52ff8844804b7a5ab36459c8/cinder/volume/api.py#L115-L153