[placement][ptg] Resource Provider Partitioning
From the etherpad [1]:
* do we need this?
* what is it?
* who is going to drive it?

As I recall, resource provider partitioning (distinct from allocation partitioning) is a way of declaring that a set of resource providers are in a thing. This would allow, for example, one placement to service multiple OpenStack clouds, or for a placement to be a part of a single-pane-of-glass system in a fog or edge setup.

This was mentioned during Stein nova discussions [2] but since then I've not personally heard a lot of discussion on this topic, so it's unclear if it is a pressing issue. Do we want to build it so they come, or wait until they come and then build it?

The discussion at [2] mentions the possibility of an 'openstack-shard' header (instead of query parameter) that would be sent with any request to placement.

There is, however, no substantive discussion on the internal implementation. Options:

* Do nothing (see above)
* Internally manipulate aggregates (all these resource providers below to shard X).
* Add a 1:1 or 1:N relation between an RP and a shard uuid in the DB.
* Use a trait! [3]

But before we get into implementation details we should discuss the use cases for this (if any), the need to do it (if any), and the people who will do it (if any). All three of those are thin at this point.

[1] https://etherpad.openstack.org/p/placement-ptg-train
[2] around line 243 on https://etherpad.openstack.org/p/nova-ptg-stein where both types (allocation/rp) of partitioning are discussed.
[3] Not for the trait strict constructionists.

--
Chris Dent ٩◔̯◔۶ https://anticdent.org/
freenode: cdent tw: @anticdent
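To make the header idea concrete, here is a minimal sketch of what a placement request carrying the proposed 'openstack-shard' header might look like. The header does not exist in placement today; the endpoint, token handling, microversion, and shard value below are all assumptions for illustration.

    # Sketch only: 'openstack-shard' is a proposed header from the Stein PTG
    # discussion, not an implemented placement API feature.
    import uuid

    import requests

    PLACEMENT_ENDPOINT = 'https://placement.example.com'  # hypothetical endpoint
    SHARD = str(uuid.uuid4())                             # hypothetical shard id

    resp = requests.get(
        PLACEMENT_ENDPOINT + '/resource_providers',
        headers={
            'x-auth-token': 'REDACTED',                  # normal keystone token
            'openstack-api-version': 'placement 1.30',   # version is illustrative
            'openstack-shard': SHARD,                    # scope to one partition
        },
    )
    print(resp.status_code, resp.json())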
Does this have anything to do with the topic of separate projects owning different resource providers in the same tree? Where "owning" indicates responsibility for creation, positioning (i.e. where in the tree, who's the parent), and inventory management, but *not* allocations. This was in the context of e.g. neutron owning bandwidth RPs, or cyborg owning FPGA RPs.

In Denver (possibly twice) we talked about the various actors actually needing to know this. I don't remember exactly why - was it only so that each actor knows not to stomp on providers it doesn't own? And is that a problem that needs a solution other than each actor just knowing which providers it's responsible for and leaving anything else alone?

If this is unrelated to the subject, please kill this subthread.

efried
On Mon, 8 Apr 2019, Eric Fried wrote:
Does this have anything to do with the topic of separate projects owning different resource providers in the same tree? Where "owning" indicates responsibility for creation, positioning (i.e. where in the tree, who's the parent), and inventory management, but *not* allocations.
This is different: Multiple clouds, one placement. However, what you describe is probably a thing that warrants discussion. If you agree, stick it on the etherpad with these two paragraphs and I'll come around to it, eventually, in this process.
This was in the context of e.g. neutron owning bandwidth RPs, or cyborg owning FPGA RPs.
In Denver (possibly twice) we talked about the various actors actually needing to know this. I don't remember exactly why - was it only so that each actor knows not to stomp on providers it doesn't own? And is that a problem that needs a solution other than each actor just knowing which providers it's responsible for and leaving anything else alone?
-- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent
On 04/08/2019 10:25 AM, Chris Dent wrote:
From the etherpad [1]:
* do we need this?
* what is it?
* who is going to drive it?
As I recall, resource provider partitioning (distinct from allocation partitioning) is a way of declaring that a set of resource providers are in a thing. This would allow, for example, one placement to service multiple OpenStack clouds, or for a placement to be a part of a single-pane-of-glass system in a fog or edge setup.
This was mentioned during Stein nova discussions [2] but since then I've not personally heard a lot of discussion on this topic, so it's unclear if it is a pressing issue. Do we want to build it so they come, or wait until they come and then build it?
The discussion at [2] mentions the possibility of an 'openstack-shard' header (instead of query parameter) that would be sent with any request to placement.
There is, however, no substantive discussion on the internal implementation. Options:
* Do nothing (see above)
* Internally manipulate aggregates (all these resource providers below to shard X).
The problem with this implementation is that resource providers can belong to zero or multiple aggregates, of course. And a "shard" or "source partition" is clearly something that a provider belongs to at most *one of*, and *must* belong to exactly one.
* Add a 1:1 or 1:N relation between an RP and a shard uuid in the DB.
1:1 is the only thing that makes sense to me. Therefore, it should be a field on the resource_providers table (source_id or partition_id or whatever).
* Use a trait! [3]
Same problem as aggregates. A provider can have zero or more traits, so we would run into the same unholy mess that we currently have in Nova aggregate metadata for "availability zones": we need a bunch of hacky code to make sure that nobody associates a compute service with multiple aggregates *if* those aggregates have different availability_zone metadata keys. Yuck. This is why getting the data model right is so important... and why bolting on attributes to the wrong entity or cramming relational data into a JSON blob always ends up biting us in the long run.
But before we get into implementation details we should discuss the use cases for this (if any), the need to do it (if any), and the people who will do it (if any). All three of those are thin at this point.
Mentioned in the other thread on consumer types (what you are calling allocation partitioning for some reason), but the best *current* use case for these partitions/types is in solving the quota usage calculations in an efficient manner using the placement data model. Best, -jay
[1] https://etherpad.openstack.org/p/placement-ptg-train
[2] around line 243 on https://etherpad.openstack.org/p/nova-ptg-stein where both types (allocation/rp) of partitioning are discussed.
[3] Not for the trait strict constructionists.
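As a concrete (and purely illustrative) rendering of the "field on the resource_providers table" option above, the schema change could be as small as the Alembic migration sketched here. The column name partition_id, its nullability, and the index name are assumptions, not a proposal.

    # Sketch of a possible schema change; names and nullability are illustrative.
    from alembic import op
    import sqlalchemy as sa


    def upgrade():
        # Nullable to start with; whether NULL should be allowed at all is
        # exactly the cardinality question being debated in this thread.
        op.add_column(
            'resource_providers',
            sa.Column('partition_id', sa.String(36), nullable=True),
        )
        op.create_index(
            'resource_providers_partition_id_idx',
            'resource_providers',
            ['partition_id'],
        )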
For those following along, probably useful to also see my response on the allocation partitioning thread (which may be misnamed):

http://lists.openstack.org/pipermail/openstack-discuss/2019-April/004800.htm...

more within...

On Tue, 9 Apr 2019, Jay Pipes wrote:
* Internally manipulate aggregates (all these resource providers below to shard X).
oops, belong
The problem with this implementation is that resource providers can belong to zero or multiple aggregates, of course. And a "shard" or "source partition" is clearly something that a provider belongs to at most *one of*, and *must* belong to exactly one.
I agree that aggregates may be a bad choice because of the option to belong to zero but we could control that above the db level if we cared to. Not making a vote for aggregates here, just pointing out that we have the power to do what we want, and aggregates provide an existing grouping model. And these are groups.

Do we want to enforce that any resource provider only belongs to one partition? If so, why? By calling them shards or partitions, then sure, that cardinality makes sense, but what happens when some bright bulb decides there is a monster inter-galactic storage service that can serve multiple clouds, transparently [1]? Do we want the data model to prevent that?
* Add a 1:1 or 1:N relation between an RP and a shard uuid in the DB.
1:1 is the only thing that makes sense to me. Therefore, it should be a field on the resource_providers table (source_id or partition_id or whatever).
See above.
But before we get into implementation details we should discuss the use cases for this (if any), the need to do it (if any), and the people who will do it (if any). All three of those are thin at this point.
Mentioned in the other thread on consumer types (what you are calling allocation partitioning for some reason), but the best *current* use case for these partitions/types is in solving the quota usage calculations in an efficient manner using the placement data model.
As with the allocation partitioning thread, the mental models that I'm trying to integrate here are also similar but not coincident:

* A placement service that manages resources in multiple things, some of which happen to be disjoint OpenStack clouds.
* A single placement in a multi-region OpenStack.
* Others I can't remember right this minute but hope will come out in further conversation.

[1] by way of quantum entanglement, you see...

--
Chris Dent ٩◔̯◔۶ https://anticdent.org/
freenode: cdent tw: @anticdent
On 04/09/2019 10:14 AM, Chris Dent wrote:
On Tue, 9 Apr 2019, Jay Pipes wrote:
The problem with this implementation is that resource providers can belong to zero or multiple aggregates, of course. And a "shard" or "source partition" is clearly something that a provider belongs to at most *one of*, and *must* belong to exactly one.
I agree that aggregates may be a bad choice because of the option to belong to zero but we could control that above the db level if we cared to.
We currently control this in Nova above the DB level, as I mentioned in my original reply about how hacky aggregate -> availability zone handling is. I'd prefer less hacky approaches in placement, if possible.
Not making a vote for aggregates here, just pointing out that we have the power to do what we want, and aggregates provide an existing grouping model. And these are groups.
I don't view these as groups, though. Groups are many-to-many relationships. A thing can be in many groups. A group has many things. But in the case of a shard or partition, it's really a one-to-one thing in my opinion.
Do we want to enforce that any resource provider only belongs to one partition? If so, why? By calling them shards or partitions, then sure, that cardinality makes sense, but what happens when some bright bulb decides there is a monster inter-galactic storage service that can serve multiple clouds, transparently [1]? Do we want the data model to prevent that?

I'd prefer to have the data model express the explicit nature of a 1:1 relationship. It's simpler and easier to reason about.
There's a reason we said that for hierarchical resource providers, there can be only one parent provider, much to Mr. Mooney's dismay. Sure, we can try to bend our minds to support quantum entanglement and an alternative universe where there is no spoon. But if we do that, we'll end up with... well, with no spoon. And I like spoons. They are useful scoopers of both liquid and semi-solid materials. Attempting to enjoy a nice soup with a fork tends to spoil the experience. Best, -jay
[1] by way of quantum entanglement, you see...
On Wed, 10 Apr 2019, Jay Pipes wrote:
I'd prefer to have the data model express the explicit nature of a 1:1 relationship. It's simpler and easier to reason about.
FTR: I'm perfectly fine with that and in fact pretty much prefer it. I agree it keeps things simpler and saner.

I bring it up at this stage of the conversation because this is the point where we want to get all the ideas and options on the table and draw out people who might have unique and illuminating perspectives. Doing these conversations in email in advance of the PTG gives us some license to cough out all the wild hairs so they don't tangle us up later. If we discover a golden fleece, that's cool too.

/me tangles his metaphors

--
Chris Dent ٩◔̯◔۶ https://anticdent.org/
freenode: cdent tw: @anticdent
I don't really have my brain around the motivators and use cases for this topic, but at a surface level, I thought this deserved to be crystallized:
I'd prefer to have the data model express the explicit nature of a 1:1 relationship. It's simpler and easier to reason about.
We already have ways (plural deliberate) to express non-1:1 relationships. If there's a use case that gets easier with a 1:1 mechanism, let's make one. If quantum things happen, and 1:1 is no good, we can always use one of the non-1:1 mechanisms. Or a spoon.

efried
As I recall, resource provider partitioning (distinct from allocation partitioning) is a way of declaring that a set of resource providers are in a thing. This would allow, for example, one placement to service multiple OpenStack clouds, or for a placement to be a part of a single-pane-of-glass system in a fog or edge setup.
Yep, so taking the edge case because it's easy: It's very nice to be able to have one single central cloud, where things like your keystone (and possibly glance) live. Then on the edges, you might have very small or single-node nova deployments that use those services. They're necessarily tiny because they're edges, and you want to maximize your workload space on those nodes by not running anything you don't really need. Further, since they're all disparate and disconnected (from a resource-reporting point of view), being able to get *some* amount of unified capacity view from a centralized placement would be nice (at the expense of availability of course).
1:1 is the only thing that makes sense to me. Therefore, it should be a field on the resource_providers table (source_id or partition_id or whatever).
Yep, agree. Also, I think this would make it easy to use regular database replication on just a shard of the whole dataset to improve the availability concern above.

I don't think there's anyone beating down the door needing this right now, but it's one of those things that will take a little time to filter in (since we have to shard-ify the existing data) and we can hardly even propose the conversation until it's at least planned. Most of the other services are going through a more painful "okay we need to shard our data" phase right now, so the longer we wait to do this, the harder it will be (and the more latency involved when we do). Easy for me to say, I know.

--Dan
On Tue, 16 Apr 2019, Dan Smith wrote:
I don't think there's anyone beating down the door needing this right now, but it's one of those things that will take a little time to filter in (since we have to shard-ify the existing data) and we can hardly even propose the conversation until it's at least planned. Most of the other services are going through a more painful "okay we need to shard our data" phase right now, so the longer we wait to do this, the harder it will be (and the more latency involved when we do). Easy for me to say, I know.
In a sense we're already sharded now, it's just that everyone is in the same default null shard. So, for the sake of making it explicit, what's wrong with:

* add a nullable shard column
* enable the openstack-shard header I suggested at the last ptg [1]
* let deployments start using that, but only if they want to. If they do, all rp writes and queries use it.

There are probably plenty of things wrong with it, but if we write it out then we can come up with the right solution fairly quickly.

[1] around line 259 in https://etherpad.openstack.org/p/nova-ptg-stein

--
Chris Dent ٩◔̯◔۶ https://anticdent.org/
freenode: cdent tw: @anticdent
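A rough sketch of what the "only if they want to" behaviour in the proposal above could look like inside placement, assuming the hypothetical partition_id column from earlier in the thread. The helper names are made up for illustration; this is not how (or whether) placement implements it.

    # Sketch: read an optional openstack-shard header and scope resource
    # provider queries to it when present; absence of the header means the
    # default "null shard", i.e. today's behaviour. Names are hypothetical.

    def shard_from_environ(environ):
        """Return the shard identifier from the WSGI environ, or None if unset."""
        # WSGI spells the 'openstack-shard' header as HTTP_OPENSTACK_SHARD.
        return environ.get('HTTP_OPENSTACK_SHARD')


    def scope_to_shard(query, rp_table, shard):
        """Filter a SQLAlchemy resource_providers select to one shard, if any."""
        if shard is None:
            # No header: behave exactly as placement does today.
            return query
        return query.where(rp_table.c.partition_id == shard)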
I don't think there's anyone beating down the door needing this right now, but it's one of those things that will take a little time to filter in (since we have to shard-ify the existing data) and we can hardly even propose the conversation until it's at least planned. Most of the other services are going through a more painful "okay we need to shard our data" phase right now, so the longer we wait to do this, the harder it will be (and the more latency involved when we do). Easy for me to say, I know.
In a sense we're already sharded now, it's just that everyone is in the same default null shard.
Yep.
So, for the sake of making it explicit, what's wrong with:
* add a nullable shard column
* enable the openstack-shard header I suggested at the last ptg [1]
* let deployments start using that, but only if they want to. If they do, all rp writes and queries use it.
This sounds like a trap, so I'm curious... What, um, is left in such a feature beyond this? :)

When we discussed this some number of Denvers ago, I think I said that I would want everything to declare a shard identifier all the time, and you (and I think Jay) wanted a "null is the default shard" type behavior. So, what you say above seems to map to the latter, no?

I'm not crazy opposed to everyone being in the undeclared null shard until they need to be in something else. I don't prefer it because:

- I think it will be better tested (and test-able) if it's not optional
- It's an identifier, and we'd never say "we don't need a non-null row id column until later when we need it"
- I think that other services that may start reporting to or using placement may just omit that part in early development

However, whether the null default thing is transitional or lives forever, doing what you said above is better than doing nothing, IMHO.

--Dan
On Tue, 16 Apr 2019, Dan Smith wrote:
So, for the sake of making it explicit, what's wrong with:
* add a nullable shard column
* enable the openstack-shard header I suggested at the last ptg [1]
* let deployments start using that, but only if they want to. If they do, all rp writes and queries use it.
This sounds like a trap, so I'm curious... What, um, is left in such a feature beyond this? :)
Sorry, no trap intended. I had recalled that you had a preference for non-null and I couldn't remember the reasons. [snip]
I'm not crazy opposed to everyone being in the undeclared null shard until they need to be in something else. I don't prefer it because:
- I think it will be better tested (and test-able) if it's not optional
- It's an identifier, and we'd never say "we don't need a non-null row id column until later when we need it"
- I think that other services that may start reporting to or using placement may just omit that part in early development
For completeness, your preference would be something more like the way we do incomplete project and user for allocations that pre-dated consumers? Make an explicit default (and configurable) shard uuid and migrate to that? -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent
On Tue, 16 Apr 2019, Dan Smith wrote:
For completeness, your preference would be something more like the way we do incomplete project and user for allocations that pre-dated consumers? Make an explicit default (and configurable) shard uuid and migrate to that?
Yup.
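For readers unfamiliar with the incomplete-consumer precedent being referenced: the approach agreed to here is roughly "pick a configurable default identifier and backfill it". A sketch of what that might look like follows; the option name, the all-zeros default, and the direct SQLAlchemy update are assumptions for illustration, not a design.

    # Sketch of a configurable default shard plus a backfill, loosely modelled
    # on how placement handled project/user for allocations that pre-dated
    # consumers. All names here are hypothetical.
    from oslo_config import cfg
    import sqlalchemy as sa

    shard_opts = [
        cfg.StrOpt('default_shard_uuid',
                   default='00000000-0000-0000-0000-000000000000',
                   help='Shard assigned to resource providers that were '
                        'created before sharding existed, or without a '
                        'shard supplied.'),
    ]

    # Minimal table definition so the update statement can be built without
    # reflecting the real schema; only the columns we touch are declared.
    resource_providers = sa.Table(
        'resource_providers', sa.MetaData(),
        sa.Column('id', sa.Integer, primary_key=True),
        sa.Column('partition_id', sa.String(36), nullable=True),
    )


    def backfill_default_shard(engine, default_shard_uuid):
        """Assign the configured default shard to providers that have none."""
        with engine.begin() as conn:
            conn.execute(
                sa.update(resource_providers)
                .where(resource_providers.c.partition_id.is_(None))
                .values(partition_id=default_shard_uuid)
            )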
I've made a story for this one too: https://storyboard.openstack.org/#!/story/2005474

Again, the first task is to create a spec. If someone is interested in taking on this work, please assign yourself. If nobody assigns themselves, it most likely won't happen in Train unless we come up with a magical time machine or clone machine (or both).

--
Chris Dent ٩◔̯◔۶ https://anticdent.org/
freenode: cdent tw: @anticdent