[placement][nova][ptg] Resource provider - request group mapping
This is about this spec [1] "Resource provider - request group mapping in allocation candidate" which didn't get approved in Stein and will need to find its appropriate home (in placement) at some point. This topic is from the cross project etherpad [2].
The questions associated with this are of two forms:
* How should the data be presented in the allocation candidates response?
* How best to capture the pending discussion on a nova spec as it is moved to becoming a placement spec.
There's quite a lot of useful information on the spec, including multiple alternatives and reasons why those alternatives are good or not good.
This is one of those API changes where we need to be careful to be general and within the existing grammar of placement and not simply evolving reactively to increased complexity in Nova. Obviously Placement needs to evolve in response to Nova, but carefully.
What might be useful is for people who feel some ownership for the various proposed structures to discuss their merits, here. Or go the other way: If there are some structures you dislike, why?
[1] https://review.openstack.org/#/c/597601/
[2] https://etherpad.openstack.org/p/ptg-train-xproj-nova-placement
-- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent
On Tue, Apr 9, 2019 at 7:46 PM, Chris Dent <cdent+os@anticdent.org> wrote:
This is about this spec [1] "Resource provider - request group mapping in allocation candidate" which didn't get approved in Stein and will need to find its appropriate home (in placement) at some point. This topic is from the cross project etherpad [2].
It is on my TODO list to create a story for it in placement and move the spec to the placement repo. I don't know when I will reach this item on my list, sorry.
The questions associated with this are of two forms:
* How should the data be presented in the allocation candidates response?
* How best to capture the pending discussion on a nova spec as it is moved to becoming a placement spec.
When I move the spec I can add the open questions from the nova spec review to the placement spec directly to help continuity. Is that OK?
There's quite a lot of useful information on the spec, including multiple alternatives and reasons why those alternatives are good or not good.
This is one of those API changes where we need to be careful to be general and within the existing grammar of placement and not simply evolving reactively to increased complexity in Nova. Obviously Placement needs to evolve in response to Nova, but carefully.
Pinging Cyborg folks. Does Cyborg need something similar? If yes then we can have at least two users of such an API.
What might be useful is for people who feel some ownership for the various proposed structures to discuss their merits, here. Or go the other way: If there are some structures you dislike, why?
I can own the first alternative in the spec [3].
[1] https://review.openstack.org/#/c/597601/
[2] https://etherpad.openstack.org/p/ptg-train-xproj-nova-placement
[3] https://review.openstack.org/#/c/597601/1/specs/stein/approved/placement-res...
Cheers, gibi
It is on my TODO list to create a story for it in placement and move the spec to the placement repo. I don't know when I will reach this item on my list, sorry.
I was getting ready to volunteer (again) to help move the ball on this because it's really important that we get this done. But then I started thinking, is it really? The workarounds we have in the client-side code right now are pretty sucky, but they work. The effort of $subject is an optimization and suck-reducer, but is it crucial? Probably not. Though I would like to hear from Cyborg before we decide we can live without it for Train.
When I move the spec I can add the open questions from the nova spec review to the placement spec directly to help continuity. Is that OK?
WFM.
Pinging Cyborg folks. Does Cyborg need something similar?
I know for sure this is a yes (somewhere around [6]?), but I won't be able to express the details as well as Sundar.
I can own the first alternative in the spec [3].
I'll champion the one I described in the third comment at [4], where we add a "mappings" dict next to "allocations". IMO, it's a tad cleaner because it's per "allocations" rather than per "allocations.$rp". That said, both of these options:
- Provide the same information: which request groups got satisfied by which providers [5].
- Violate the "black box" principle and require one side or the other to work around it (either client removes or placement ignores the new key on PUT /allocations). As I said further down in [4], I don't care about that. (Ed?)
- Maintain the existing levels of hierarchy for the existing elements, which Chris explained was important (see bottom five comments at [4]).
- Don't require correlation by list index, which was the only thing I was a hard -1 on.
So if anyone has a strong preference for [3], I'm not going to fight hard.
efried
[3] https://review.openstack.org/#/c/597601/1/specs/stein/approved/placement-res...
[4] https://review.openstack.org/#/c/597601/1/specs/stein/approved/placement-res...
[5] Note that they also both *don't* provide information about which *resource* satisfied which request group. E.g. this spec doesn't help us with the "multiple disks" problem: resources1=DISK_GB:50&resources2=DISK_GB:25&group_policy=none may result in one RP providing DISK_GB:75, request_groups=[resources1,resources2]. I'm assuming we don't care (yet).
[6] https://review.openstack.org/#/c/631244/
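The "multiple disks" case in [5] can be illustrated with a small sketch. This is not placement code: parse_groups and merge_on_one_provider are hypothetical helpers, and the merged result only models the situation where a single provider happens to satisfy both groups under group_policy=none.

```python
from urllib.parse import parse_qs

# Query string taken from footnote [5].
QUERY = "resources1=DISK_GB:50&resources2=DISK_GB:25&group_policy=none"


def parse_groups(query):
    """Parse the resourcesN parameters into {group: {resource_class: amount}}."""
    groups = {}
    for key, values in parse_qs(query).items():
        if not key.startswith("resources"):
            continue  # skip group_policy and any other parameter
        amounts = {}
        for item in values[0].split(","):
            resource_class, amount = item.split(":")
            amounts[resource_class] = int(amount)
        groups[key] = amounts
    return groups


def merge_on_one_provider(groups):
    """Model one provider satisfying every group (hypothetical result shape)."""
    total = {}
    for amounts in groups.values():
        for resource_class, amount in amounts.items():
            total[resource_class] = total.get(resource_class, 0) + amount
    return {"resources": total, "request_groups": sorted(groups)}


groups = parse_groups(QUERY)
merged = merge_on_one_provider(groups)
```

Here merged carries a single DISK_GB total plus the two group names, so nothing in it says which part of the total belongs to resources1 and which to resources2. That is the gap the footnote describes.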
From: Eric Fried <openstack@fried.cc> Sent: Wednesday, April 10, 2019 7:31 AM To: openstack-discuss@lists.openstack.org Subject: Re: [placement][nova][ptg] Resource provider - request group mapping
Pinging Cyborg folks. Does Cyborg need something similar?
I know for sure this is a yes (somewhere around [6]?), but I won't be able to express the details as well as Sundar.
Yes, Cyborg can certainly use that. Right now, the Nova patch [6] is relying on the mapping done in [7] as part of the bandwidth provider.
I saw the thread where cdent is asking if Ironic would directly invoke Placement. Just curious, are there plans for Zun to leverage Placement and resource providers for containers?
I should review the spec.
[6] https://review.openstack.org/#/c/631244/
[7] https://git.openstack.org/cgit/openstack/nova/tree/nova/objects/request_spec...
Regards, Sundar
On Thu, 11 Apr 2019, Nadathur, Sundar wrote:
I saw the thread where cdent is asking if Ironic would directly invoke Placement. Just curious, are there plans for Zun to leverage Placement and resource providers for containers?
There's been a blueprint and some code written in that direction:
* https://blueprints.launchpad.net/zun/+spec/use-placement-resource-management
* https://review.openstack.org/#/c/586960/
But I'm not sure of the status of things. Shall we have a zun+placement thread too?
-- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent
On Wed, 10 Apr 2019, Eric Fried wrote:
I'll champion the one I described in the third comment at [4], where we add a "mappings" dict next to "allocations". IMO, it's a tad cleaner because it's per "allocations" rather than per "allocations.$rp". That said, both of these options:
- Provide the same information: which request groups got satisfied by which providers [5].
- Violate the "black box" principle and require one side or the other to work around it (either client removes or placement ignores the new key on PUT /allocations). As I said further down in [4], I don't care about that. (Ed?)
- Maintain the existing levels of hierarchy for the existing elements, which Chris explained was important (see bottom five comments at [4]).
It's true that they do not change existing paths to existing data, but they do "invade" (as you say in your second point) existing data structures. The cleanest solution would probably be a new top-level key that provides the mapping information (line 426 on the spec), but as discussed there that ends up repeating too much information and needing ordering, because there's no independent identifier of a single allocation_request.
So yeah, one of those will do.
A moment to riff on one of the points of having these conversations in the expanse of email:
To my internal unfettered brain the above sort-of-mess is yet more evidence that nested is a nightmare and gosh I wish we had never done: it, request groups, provider trees on the nova side, and pretty much all the stuff that's in progress related to enhanced platform awareness. It makes me squirm, see bad smells everywhere, and facepalm multiple times per day.
I feel it is important that I get this off my chest and I encourage anyone else who has internal voices that don't feel entirely appropriate to go ahead and let them out as part of this pre-PTG email process. We are much more likely in the long term to be able to get things done in a useful fashion if we are able to honestly flush/vent such concerns in "public" and safely.
Because a lot of this stuff is the reality that we have and a situation that we need to solve. We can achieve the goals better by building up a shared language of what it is. That language includes the shit as well as the shine.
All that stuff above that I don't like at some gut level, I do like (or at least appreciate the value of) at a more practical level.
-- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent
On Thu, 2019-04-11 at 12:38 +0100, Chris Dent wrote:
On Wed, 10 Apr 2019, Eric Fried wrote:
I'll champion the one I described in the third comment at [4], where we add a "mappings" dict next to "allocations". IMO, it's a tad cleaner because it's per "allocations" rather than per "allocations.$rp". That said, both of these options:
- Provide the same information: which request groups got satisfied by which providers [5].
- Violate the "black box" principle and require one side or the other to work around it (either client removes or placement ignores the new key on PUT /allocations). As I said further down in [4], I don't care about that. (Ed?)
- Maintain the existing levels of hierarchy for the existing elements, which Chris explained was important (see bottom five comments at [4]).
It's true that they do not change existing paths to existing data, but they do "invade" (as you say in your second point) existing data structures. The cleanest solution would probably be a new top-level key that provides the mapping information (line 426 on the spec), but as discussed there that ends up repeating too much information and needing ordering, because there's no independent identifier of a single allocation_request.
So yeah, one of those will do.
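To make the ordering concern concrete, here is a rough sketch of what a separate top-level mapping could look like. The response shape, the provider names, and both helper functions are invented for illustration and are not taken from the spec; the only point is that, without an identifier per allocation request, the two lists can only be correlated by position.

```python
# Hypothetical response: mapping data lives in its own top-level list,
# parallel to "allocation_requests".
response = {
    "allocation_requests": [
        {"allocations": {"rp-a": {"resources": {"VCPU": 1}}}},
        {"allocations": {"rp-b": {"resources": {"VCPU": 1}}}},
    ],
    "mappings": [
        {"resources": ["rp-a"]},
        {"resources": ["rp-b"]},
    ],
}


def pair_by_index(resp):
    """Pair each allocation request with the mapping at the same list index."""
    return list(zip(resp["allocation_requests"], resp["mappings"]))


def mapping_is_consistent(allocation_request, mapping):
    """True if every provider named in the mapping is actually allocated from."""
    allocated = set(allocation_request["allocations"])
    mapped = {rp for providers in mapping.values() for rp in providers}
    return mapped <= allocated


pairs = pair_by_index(response)
```

If anything reorders one list but not the other, the pairing silently goes wrong, which is the kind of correlation by list index that was objected to earlier in the thread.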
A moment to riff on one of the points of having these conversations in the expanse of email:
To my internal unfettered brain the above sort-of-mess is yet more evidence that nested is a nightmare and gosh I wish we had never done: it, request groups, provider trees on the nova side, and pretty much all the stuff that's in progress related to enhanced platform awareness. It makes me squirm, see bad smells everywhere, and facepalm multiple times per day.
So I just want to put this out there: there is an alternative to having placement be NUMA aware.
We can track all the resources in placement, pass the allocation_candidates/provider_summaries to the nova filters, and have those filters eliminate invalid allocation candidates based on NUMA or any other criteria that placement does not understand. We can do this today without modifying placement's data structures. We would still have to create multiple nested resource providers per NUMA node, but we don't need to have a nested tree.
This has a number of pros and cons. The main cons are that nova needs to continue to have non-trivial filters and that we will need to have a limited amount of info in the resource tracker (basically a list of placement RP UUIDs per NUMA node). Another con is that some of the allocation candidates returned will be invalid, so it makes using limit trickier, but we can already run out of allocation candidates today because of filters, so it's not really any different.
On the pros side, the placement team could focus on things it deems more important, such as requirements from projects other than nova, like shared storage. Another pro is that we can still use placement for capacity and traits, and I also think it's doable in Train. The flat two-level tree we create with vGPU or bandwidth RPs can be used for a lot of other use cases, like just tracking SR-IOV devices, persistent memory, or cache; basically any resource that you have multiple pools of on a compute node can be tracked with just two levels. If we always use group_policy=none we can figure out on the client side whether the allocation candidate meets our other constraints.
Long term I think there is value in having a richer query syntax in placement, but if we need to pause that to think about it some more, I think that is an OK viewpoint to express. I would personally prefer to have a clean way to express requirements that is maintainable and does not tie our hands going forward.
Again, with that said, if we decide not to use placement for NUMA in Train, I hope we can pursue using the facilities that we already have to make some progress instead, and not block those efforts on "we should do this with placement" if we have decided not to do them in placement in Train. I know a lot of nova folks won't like that, as it means we have to keep some of the complexity in nova, but I would at least like to have that conversation.
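A minimal sketch of the client-side filtering idea above, assuming the resource tracker keeps a per-NUMA-node set of placement RP UUIDs. The provider names, the candidate data, and the "at most one NUMA node" rule are all illustrative assumptions; a real NUMA fit check in nova would be considerably more involved.

```python
# Hypothetical resource tracker data: NUMA node id -> placement RP identifiers.
numa_node_providers = {
    0: {"rp-numa0"},
    1: {"rp-numa1"},
}

# Hypothetical allocation candidates as they might come back with
# group_policy=none; "rp-root" stands in for the compute node itself.
candidates = [
    {"allocations": {
        "rp-root": {"resources": {"DISK_GB": 10}},
        "rp-numa0": {"resources": {"VCPU": 2, "MEMORY_MB": 1024}},
    }},
    {"allocations": {
        "rp-root": {"resources": {"DISK_GB": 10}},
        "rp-numa0": {"resources": {"VCPU": 2}},
        "rp-numa1": {"resources": {"MEMORY_MB": 1024}},
    }},
]


def numa_nodes_used(allocation_request, node_providers):
    """Return the NUMA node ids whose providers appear in the candidate."""
    used = set(allocation_request["allocations"])
    return {node for node, rps in node_providers.items() if rps & used}


def filter_single_numa(allocation_requests, node_providers):
    """Keep only candidates that draw from at most one NUMA node."""
    return [
        ar for ar in allocation_requests
        if len(numa_nodes_used(ar, node_providers)) <= 1
    ]


valid = filter_single_numa(candidates, numa_node_providers)
```

The second candidate is dropped because it spans both NUMA nodes. This also shows the limit concern raised above: a candidate that placement returned can still be thrown away on the nova side.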
I feel it is important that I get this off my chest and I encourage anyone else who has internal voices that don't feel entirely appropriate to go ahead and let them out as part of this pre-PTG email process. We are much more likely in the long term to be able to get things done in a useful fashion if we are able to honestly flush/vent such concerns in "public" and safely.
Because a lot of this stuff is the reality that we have and a situation that we need to solve. We can achieve the goals better by building up a shared language of what it is. That language includes the shit as well as the shine.
All that stuff above that I don't like at some gut level, I do like (or at least appreciate the value of) at a more practical level.
On Thu, 11 Apr 2019, Sean Mooney wrote:
Long term I think there is value in having a richer query syntax in placement, but if we need to pause that to think about it some more, I think that is an OK viewpoint to express. I would personally prefer to have a clean way to express requirements that is maintainable and does not tie our hands going forward. Again, with that said, if we decide not to use placement for NUMA in Train, I hope we can pursue using the facilities that we already have to make some progress instead, and not block those efforts on "we should do this with placement" if we have decided not to do them in placement in Train. I know a lot of nova folks won't like that, as it means we have to keep some of the complexity in nova, but I would at least like to have that conversation.
I think the pros you present (and others have presented) are (and have been) strong enough that doing NUMA via nested-in-placement is the right way to go.
The gist of my prior rant is not so much that I don't like nested but that I don't like the reasons for its existence (NUMA and other hardware awarenesses) and the costs from those reasons.
The resolve is: I have to get past that; it's what we've got. Since that's what we've got, it may as well be placement that makes it cleaner. It's good for that sort of thing.
But it also means that I will sometimes be compelled to defend the simpler way, as at the "horrible idea" in my comment at https://review.openstack.org/#/c/650476/1/doc/source/specs/train/approved/20...
-- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent
On 04/11/2019 07:38 AM, Chris Dent wrote:
A moment to riff on one of the points of having these conversations in the expanse of email:
To my internal unfettered brain the above sort-of-mess is yet more evidence that nested is a nightmare and gosh I wish we had never done: it, request groups, provider trees on the nova side, and pretty much all the stuff that's in progress related to enhanced platform awareness. It makes me squirm, see bad smells everywhere, and facepalm multiple times per day.
I share your frustration in many ways. The many "features" added to Nova around NUMA, virtual guest CPU topologies, PCI device management/tuning, realtime support, and the proposed integration of RMD et al have each eroded the abstraction that Nova originally served as: an abstraction layer *above* the hardware and hypervisor.
Ironically, the hierarchical and shared resource providers modeling was intended to *ensure* and *promote* a structured, consistent, easy-to-reason-about model for resource management.
In other words, the whole idea of placement -- including the addition of hierarchical and shared providers -- was to provide relief from the free-for-all Wild West frontier that still exists in the Nova PCI manager, hardware.py module, NUMATopologyFilter, and all that. It was supposed to provide us a path out of that quagmire.
I take it as a personal failure that we've yet to be able to take advantage of the more consistent and structured data model in placement for these more "advanced" resource classes :(
The road to hell is paved with good intentions, I guess.
Best, -jay
On Sun, 21 Apr 2019, Jay Pipes wrote:
In other words, the whole idea of placement -- including the addition of hierarchical and shared providers -- was to provide relief from the free-for-all Wild West frontier that still exists in the Nova PCI manager, hardware.py module, NUMATopologyFilter, and all that. It was supposed to provide us a path out of that quagmire.
I take it as a personal failure that we've yet to be able to take advantage of the more consistent and structured data model in placement for these more "advanced" resource classes :(
I think we can still do it, but we need to get our mental models arranged and aligned. Thus my drive for some kind of universal theory of nested operation. Some simple heuristics that make it easy to map goals to processes to code. Things that allow us to say "we could do it that way, but that upsets the model".
One area where we seem to be running into problems a lot lately is situations where people want to make requests to placement that are explicit and specific about an instance of a type of thing, rather than just the type of thing. Placement is oriented towards the latter and while we could do that level of specificity it requires a much more detailed awareness of the cloud across all the tools that are involved than is ideal. If you already have that much awareness, placement isn't really what you need: you already know the place.
-- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent
On 04/10/2019 10:31 AM, Eric Fried wrote:
It is on my TODO list to create a story for it in placement and move the spec to the placement repo. I don't know when I will reach this item on my list, sorry.
I was getting ready to volunteer (again) to help move the ball on this because it's really important that we get this done.
But then I started thinking, is it really? The workarounds we have in the client-side code right now are pretty sucky, but they work. The effort of $subject is an optimization and suck-reducer, but is it crucial? Probably not. Though I would like to hear from Cyborg before we decide we can live without it for Train.
When I move the spec I can add the open questions from the nova spec review to the placement spec directly to help continuity. Is that OK?
WFM.
Pinging Cyborg folks. Does Cyborg need something similar?
I know for sure this is a yes (somewhere around [6]?), but I won't be able to express the details as well as Sundar.
I can own the first alternative in the spec [3].
I'll champion the one I described in the third comment at [4], where we add a "mappings" dict next to "allocations". IMO, it's a tad cleaner because it's per "allocations" rather than per "allocations.$rp".
After considering the alternatives, this is my preference as well. Having a "mappings" key in each element of the "allocation_requests" array makes sense to me: We are providing information *about that particular allocation request's request group to provider mapping* and therefore I feel this is the best location for it to be.
Best, -jay
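As a rough illustration of the preferred layout, the sketch below puts a "mappings" key next to "allocations" inside one element of "allocation_requests". The provider names, the group labels used as keys, and both helper functions are assumptions made for the example; the exact key format was still under discussion in the spec review.

```python
# Hypothetical element of "allocation_requests" with the proposed sibling key.
allocation_request = {
    "allocations": {
        "rp-compute": {"resources": {"VCPU": 2, "MEMORY_MB": 2048}},
        "rp-nic-1": {"resources": {"NET_BW_EGR_KILOBIT_PER_SEC": 1000}},
    },
    # Request group -> providers that satisfied it (labels are illustrative).
    "mappings": {
        "resources": ["rp-compute"],
        "resources1": ["rp-nic-1"],
    },
}


def providers_for_group(request, group):
    """Look up which providers satisfied a given request group."""
    return list(request.get("mappings", {}).get(group, []))


def strip_mappings(request):
    """Return a copy without "mappings", for a client that has to drop the
    key before sending the body to PUT /allocations."""
    return {key: value for key, value in request.items() if key != "mappings"}


nic_providers = providers_for_group(allocation_request, "resources1")
put_body = strip_mappings(allocation_request)
```

strip_mappings models the client-side half of the "black box" workaround mentioned earlier in the thread; the other half would be placement ignoring the key on write.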
participants (6)
- Balázs Gibizer
- Chris Dent
- Eric Fried
- Jay Pipes
- Nadathur, Sundar
- Sean Mooney