[nova] review guide for the bandwidth patches
Hi,

After discussing generic concerns about the currently proposed patch series[1] for the bandwidth-resource-provider bp [2] with Matt, I decided to write a summary that might help other reviewers.

Grouping of the patches
-----------------------
The whole nova implementation is one long patch series that can be split into smaller logical steps:

* Already merged patches added the two new standard resource classes and added a field to RequestSpec to store the resource request of each neutron port.

* The open patch series starts at [3] now, and the patches between [3] and [4] introduce the ability to boot a server with neutron ports that have resource requests. These patches implement handling the extra resources during boot, re-schedule and delete. They only work if a server has at most one neutron port with a resource request.

* The patches after [4] up until [5] implement the calculation and communication of the mapping between ports and the resource providers that fulfill the given port resource requests. (The reason why this is done in nova is described below.) These patches allow having more than one neutron port with a resource request per server.

* The next two patches [6][7] implement rejecting use cases that were decided to be out of scope for the feature (interface attach with a resource request, booting with a network having a resource request).

* The next patch [8] adds support for removing the resource allocation during interface detach if the detached port has a resource request.

* The rest of the series, up until [9], implements support for selecting VFs from the same PF that the bandwidth was allocated from. This allows handling bandwidth and VF resources properly even if there is more than one PF on the same compute connected to the same physnet.

What is still missing from the patch series:

* Support for any kind of server move operation (resize, migrate, evacuate, live-migrate, unshelve).

* Upgrade support: bound SRIOV ports might already have a QoS policy, which would require healing the resource allocations in placement.

* Supporting separation of QoS aware and non aware ports on the same host. This can be done with host aggregates today, but the bp describes possible ways to support a better separation.

* Supporting multisegment networks that have more than one segment with a physnet name. For this we would need any-trait support from placement. The spec for it is approved but the implementation is on hold due to the placement extraction.

Order of the patches
--------------------
One comment I got from Matt is that the patches are in the wrong order. The current order allows booting a server with a bandwidth request _before_ every other server operation is supported (or rejected). This could lead to a violation of the resource view of the system. For example, booting a server with a QoS aware port will result in a bandwidth allocation on the compute host, but later migrating that server to another host does not properly move the allocation to the target host.

In my view, reordering the patches so that booting the first server with bandwidth is only allowed after every other operation is supported is not feasible. The current chain is already 17 patches long and it does not touch the server move operations at all. Also, if I cannot boot a server with bandwidth then I cannot test that such a server can be moved properly, so all the functional tests would end up in the last patch of the series.

However, I think we can use the feature flag approach.
We can implement and verify everything in the current order without the end user seeing or causing any kind of inconsistencies. The nova implementation is only activated if a neutron port has a resource_request field. This field is added to the neutron API as an API extension. We can mark this extension experimental while we are working through the missing pieces of the feature. But we can also turn on that extension in our test environment to verify each and every step during the implementation process.

Which RP fulfills the request of a neutron port
-----------------------------------------------
Neutron expects a single RP uuid sent by nova in the port binding that tells neutron which RP provides the resources for that port. This is the famous problem of finding which request group is fulfilled by which resource provider in an allocation candidate. There are 3 possible solutions:

* Neutron does the mapping. This would require neutron to see every port binding for a given server as a single request. This is not the case today and it is not feasible to hack around in neutron.

* Placement does the mapping. This is technically feasible and in my opinion this is the right solution, but placement is in feature freeze so the spec [10] describing this change is put on hold.

* Nova does the mapping. This is what the patches after [4] up until [5] implement. It is a temporary solution until the placement based solution can be provided. The current patches are structured in a way that all the ovo and neutronv2/api impacts in nova are reusable for the placement based solution later without change. There is only one patch [11] that will be thrown away when the placement based solution is available.

I hope this mail helps reviewing the series.

Cheers,
gibi

[1] https://review.openstack.org/#/q/topic:bp/bandwidth-resource-provider+(statu...)
[2] https://blueprints.launchpad.net/nova/+spec/bandwidth-resource-provider
[3] https://review.openstack.org/#/c/625941/
[4] https://review.openstack.org/#/c/567268/
[5] https://review.openstack.org/#/c/573317/
[6] https://review.openstack.org/#/c/570078
[7] https://review.openstack.org/#/c/570079
[8] https://review.openstack.org/#/c/622421
[9] https://review.openstack.org/#/c/623543
[10] https://review.openstack.org/#/c/597601
[11] https://review.openstack.org/#/c/616239
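(For reviewers new to the placement side: the trigger for all of the above is the port's resource_request field, which nova turns into a separate, numbered "granular" request group in the GET /allocation_candidates query. Below is a minimal sketch of that translation; the payload shape and query-parameter syntax reflect my reading of the proposal, and the helper name is made up, so treat it as an illustration rather than the actual code in the series.)

    # Illustrative sketch only -- not the code under review. The helper name
    # is invented; the resource_request payload shape follows the neutron API
    # extension as described in the series.

    def port_to_granular_query_params(port, group_number):
        """Translate one port's resource_request into query params for a
        numbered request group in GET /allocation_candidates."""
        resource_request = port.get('resource_request') or {}
        params = {}

        resources = resource_request.get('resources', {})
        if resources:
            params['resources%d' % group_number] = ','.join(
                '%s:%s' % (rc, amount)
                for rc, amount in sorted(resources.items()))

        required = resource_request.get('required', [])
        if required:
            params['required%d' % group_number] = ','.join(required)
        return params


    example_port = {
        'resource_request': {
            'resources': {
                'NET_BW_EGR_KILOBIT_PER_SEC': 1000,
                'NET_BW_IGR_KILOBIT_PER_SEC': 2000,
            },
            'required': ['CUSTOM_PHYSNET_PHYSNET0', 'CUSTOM_VNIC_TYPE_NORMAL'],
        },
    }

    print(port_to_granular_query_params(example_port, 1))
    # {'resources1': 'NET_BW_EGR_KILOBIT_PER_SEC:1000,NET_BW_IGR_KILOBIT_PER_SEC:2000',
    #  'required1': 'CUSTOM_PHYSNET_PHYSNET0,CUSTOM_VNIC_TYPE_NORMAL'}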
Hi Gibi,
However I think we can use the feature flag approach. We can implement and verify everything in the current order without the end user seeing or causing any kind of inconsistencies. The nova implementation is only activated if a neutron port has a resource_request field. This field is added to the neutron API as an API extension. We can mark this extension experimental while we are working through the missing pieces of the feature.
I like the idea of a feature flag, however technically in neutron we don't directly control the list of extensions loaded. Instead what we control (through configuration) is the list of service plugins loaded. The 'resource_request' extension is implemented by the 'qos' service plugin. But the 'qos' service plugin implements other API extensions and features too. A cloud admin may want to use these other features of the 'qos' service plugin, but not the guaranteed minimum bandwidth. Therefore I'd suggest adding the feature flag to nova to keep this usage possible. Cheers, Bence
On Thu, Dec 20, 2018 at 3:02 PM, Bence Romsics <bence.romsics@gmail.com> wrote:
Hi Gibi,
However I think we can use the feature flag approach. We can implement and verify everything in the current order without the end user seeing or causing any kind of inconsistencies. The nova implementation is only activated if a neutron port has a resource_request field. This field is added to the neutron API as an API extension. We can mark this extension experimental while we are working through the missing pieces of the feature.
I like the idea of a feature flag, however technically in neutron we don't directly control the list of extensions loaded. Instead what we control (through configuration) is the list of service plugins loaded. The 'resource_request' extension is implemented by the 'qos' service plugin. But the 'qos' service plugin implements other API extensions and features too. A cloud admin may want to use these other features of the 'qos' service plugin, but not the guaranteed minimum bandwidth. Therefore I'd suggest adding the feature flag to nova to keep this usage possible.
Thanks Bence. I still want to believe that the proper solution for this long-running implementation is to use a feature flag. Nova has a 'workarounds' config section that might be used to add a new config option as a feature flag, similar to the existing 'enable_numa_live_migration' option. This way the feature can be controlled externally. Would be good to see the reaction from other Nova developers about this approach. cheers, gibi
Cheers, Bence
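(For concreteness, such a flag in nova's existing [workarounds] config group would presumably look something like the sketch below; the option name here is made up, not necessarily what the series proposes.)

    # Hypothetical sketch of a feature flag in nova's [workarounds] config
    # section (nova/conf/workarounds.py); the option name is illustrative.
    from oslo_config import cfg

    enable_port_resource_request = cfg.BoolOpt(
        'enable_port_resource_request',
        default=False,
        help="""
    If disabled, nova ignores the resource_request field of neutron ports,
    i.e. no bandwidth allocation is made in placement even when the neutron
    QoS plugin reports such a request. Intended as a temporary switch while
    the feature is incomplete.
    """)


    def register_opts(conf):
        conf.register_opt(enable_port_resource_request, group='workarounds')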
On 12/20/2018 09:33 AM, Balázs Gibizer wrote:
On Thu, Dec 20, 2018 at 3:02 PM, Bence Romsics <bence.romsics@gmail.com> wrote:
Hi Gibi,
However I think we can use the feature flag approach. We can implement and verify everything in the current order without the end user seeing or causing any kind of inconsistencies. The nova implementation is only activated if a neutron port has a resource_request field. This field is added to the neutron API as an API extension. We can mark this extension experimental while we are working through the missing pieces of the feature.
I like the idea of a feature flag, however technically in neutron we don't directly control the list of extensions loaded. Instead what we control (through configuration) is the list of service plugins loaded. The 'resource_request' extension is implemented by the 'qos' service plugin. But the 'qos' service plugin implements other API extensions and features too. A cloud admin may want to use these other features of the 'qos' service plugin, but not the guaranteed minimum bandwidth. Therefore I'd suggest adding the feature flag to nova to keep this usage possible.
Thanks Bence. I still want to believe that the proper solution for this long-running implementation is to use a feature flag. Nova has a 'workarounds' config section that might be used to add a new config option as a feature flag, similar to the existing 'enable_numa_live_migration' option. This way the feature can be controlled externally.
Would be good to see the reaction from other Nova developers about this approach.
The problem with feature flags is they are not discoverable. As much as I hate API extensions, at least they are discoverable. Is there a reason why we can't just put code into Nova that, upon seeing the resource requests, acts on it? Is there a reason it needs to be behind a feature flag or even needs to query Neutron for extension support? I mean, if it's in the response payload, just use it, right? Best, -jay
On Thu, Dec 20, 2018 at 3:47 PM, Jay Pipes <jaypipes@gmail.com> wrote:
On 12/20/2018 09:33 AM, Balázs Gibizer wrote:
On Thu, Dec 20, 2018 at 3:02 PM, Bence Romsics <bence.romsics@gmail.com> wrote:
Hi Gibi,
However I think we can use the feature flag approach. We can implement and verify everything in the current order without the end user seeing or causing any kind of inconsistencies. The nova implementation is only activated if a neutron port has a resource_request field. This field is added to the neutron API as an API extension. We can mark this extension experimental while we are working through the missing pieces of the feature.
I like the idea of a feature flag, however technically in neutron we don't directly control the list of extensions loaded. Instead what we control (through configuration) is the list of service plugins loaded. The 'resource_request' extension is implemented by the 'qos' service plugin. But the 'qos' service plugin implements other API extensions and features too. A cloud admin may want to use these other features of the 'qos' service plugin, but not the guaranteed minimum bandwidth. Therefore I'd suggest adding the feature flag to nova to keep this usage possible.
Thanks Bence. I still want to believe that the proper solution for this long-running implementation is to use a feature flag. Nova has a 'workarounds' config section that might be used to add a new config option as a feature flag, similar to the existing 'enable_numa_live_migration' option. This way the feature can be controlled externally.
Would be good to see the reaction from other Nova developers about this approach.
The problem with feature flags is they are not discoverable. As much as I hate API extensions, at least they are discoverable.
Is there a reason why we can't just put code into Nova that, upon seeing the resource requests, acts on it? Is there a reason it needs to be behind a feature flag or even needs to query Neutron for extension support? I mean, if it's in the response payload, just use it, right?
The currently proposed Nova implementation only acts if it sees a resource request in a neutron port[1]. However, the currently proposed nova implementation is incomplete, as I described in my initial mail of this thread. So Neutron can send resource requests in the ports before Nova can fully handle them, which can lead to resource allocation inconsistencies. This is why I'm trying to merge the partial support (e.g. boot, delete) guarded by a feature flag while the other operations (migrate, evacuate) are under development. Cheers, gibi [1] https://review.openstack.org/#/c/567268/41/nova/network/neutronv2/api.py@187...
Best, -jay
On 12/20/2018 8:47 AM, Jay Pipes wrote:
The problem with feature flags is they are not discoverable. As much as I hate API extensions, at least they are discoverable.
100% agree. Config-driven API behavior like this flies in the face of any kind of hope for interoperability of this feature. Can I do this on cloud X? Maybe? Can I do it on cloud Y? Let's find out! At the very least, I would think if you're going to add a config option to enable/disable this support in the compute API, at least temporarily, you would fail any requests which include ports with QoS resource requests because we're not going to honor them. That aligns more with what I was saying about how I thought the "reject" patches would come earlier in the series since we're currently not supporting this feature, and lying to users by allowing them to attach these kinds of ports. Although I might be conflating QoS ports in general and QoS ports that have the bandwidth resource request things on them.

During the spec review, we waffled a bit on whether or not this should have a new microversion with it [1][2]. The way this works definitely sounds like how volume multiattach is setup - you create a resource in another service with a special characteristic and then try to attach that thing to a server. Nova can either support that thing or not, and the only way for a user to find that out without a microversion is to try it and find out (does it fail? if it doesn't fail, is the thing working in my guest? if not, is it because it's not supported or because something failed or there is a bug?). With volume multiattach we added a microversion so that users could know if they can even make the request to attach those types of volumes so they didn't have to guess. So I could argue that QoS ports like this should get similar treatment, but things get fuzzy when it's implicit behavior on some external resource.
Is there a reason why we can't just put code into Nova that, upon seeing the resource requests, acts on it? Is there a reason it needs to be behind a feature flag or even needs to query Neutron for extension support? I mean, if it's in the response payload, just use it, right?
The main reason is how the code is written backward, as I've said elsewhere in this thread. If everything was properly plumbed in from the bottom up for at least creating servers with these kinds of ports (and cleaning them up properly), then it's just a matter of doing like you said - take action on the thing if it's there. Like [3] I think we could explicitly fail for things that are not supported until they do become supported, like any move operations. Whether those are considered bug fixes or new features (a la microversions) is debatable. It sucks adding stuff that works when creating a server but isn't supported when you perform actions on that server, and it sucks holding up features before everything is ready, and it sucks to need microversions for every little thing that ideally would have just been done up front, so I can't really say what we should do here. I know in the golden days before microversions it was totally kosher to just feature flag our way to nirvana and we're still dealing with a lot of that fallout (like the fact that live migration with NUMA-affined instances still doesn't work properly).

[1] https://review.openstack.org/#/c/502306/17/specs/rocky/approved/bandwidth-re... [2] https://review.openstack.org/#/c/502306/26/specs/rocky/approved/bandwidth-re... [3] https://review.openstack.org/#/c/611088/

--
Thanks,
Matt
On Thu, Dec 20, 2018 at 7:14 PM, Matt Riedemann <mriedemos@gmail.com> wrote:
On 12/20/2018 8:47 AM, Jay Pipes wrote:
The problem with feature flags is they are not discoverable. As much as I hate API extensions, at least they are discoverable.
100% agree. Config-driven API behavior like this flies in the face of any kind of hope for interoperability of this feature. Can I do this on cloud X? Maybe? Can I do it on cloud Y? Let's find out! At the very least, I would think if you're going to add a config option to enable/disable this support in the compute API, at least temporarily, you would fail any requests which include ports with QoS resource requests because we're not going to honor them. That aligns more with what I was saying about how I thought the "reject" patches would come earlier in the series since we're currently not supporting this feature, and lying to users by allowing them to attach these kinds of ports. Although I might be conflating QoS ports in general and QoS ports that have the bandwidth resource request things on them.
During the spec review, we waffled a bit on whether or not this should have a new microversion with it [1][2]. The way this works definitely sounds like how volume multiattach is setup - you create a resource in another service with a special characteristic and then try to attach that thing to a server. Nova can either support that thing or not, and the only way for a user to find that out without a microversion is to try it and find out (does it fail? if it doesn't fail, is the thing working in my guest? if not, is it because it's not supported or because something failed or there is a bug?). With volume multiattach we added a microversion so that users could know if they can even make the request to attach those types of volumes so they didn't have to guess. So I could argue that QoS ports like this should get similar treatment, but things get fuzzy when it's implicit behavior on some external resource
Is there a reason why we can't just put code into Nova that, upon seeing the resource requests, acts on it? Is there a reason it needs to be behind a feature flag or even needs to query Neutron for extension support? I mean, if it's in the response payload, just use it, right?
The main reason is how the code is written backward, as I've said elsewhere in this thread. If everything was properly plumbed in from the bottom up for at least creating servers with these kinds of ports (and cleaning them up properly), then it's just a matter of doing like you said - take action on the thing if it's there. Like [3] I think we could explicitly fail for things that are not supported until they do become supported, like any move operations. Whether those are considered bug fixes or new features (a la microversions) is debatable. It sucks adding stuff that works when creating a server but isn't supported when you perform actions on that server, and it sucks holding up features before everything is ready, and it sucks to need microversions for every little thing that ideally would have just been done up front, so I can't really say what we should do here. I know in the golden days before microversions it was totally kosher to just feature flag our way to nirvana and we're still dealing with a lot of that fallout (like the fact that live migration with NUMA-affined instances still doesn't work properly).
I'd like to make this feature work and get it merged without drowning in the huge patch series I'm already pushing forward. I'd like to make small and verifiable steps toward the full feature, so for me, moving the functional tests to the end is scary like hell.

I'm wondering whether introducing an API microversion could act as the feature flag I need and at the same time still make the feature discoverable as you would like to see it. Something like: create a feature flag in the code but do not put it in the config as a settable flag. Instead, add an API microversion patch to the top of the series and when the new version is requested it enables the feature via the feature flag. This API patch can be small and simple enough to cherry-pick earlier into the series for local end-to-end testing if needed. Also, in the functional tests I can set the flag via a mock so I can add and run functional tests patch by patch.

Regarding rejections: I can add patches that reject server operations (including create) if the new microversion isn't requested but a port included in the request, or associated with the server, has a resource request.

How does this sound to you?

Cheers,
gibi

[1] https://review.openstack.org/#/c/569459 [2] https://review.openstack.org/#/c/623543 [3] https://review.openstack.org/#/c/567268/42/nova/conf/workarounds.py
[1] https://review.openstack.org/#/c/502306/17/specs/rocky/approved/bandwidth-re... [2] https://review.openstack.org/#/c/502306/26/specs/rocky/approved/bandwidth-re... [3] https://review.openstack.org/#/c/611088/
--
Thanks,
Matt
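(A minimal illustration of the "microversion as internal feature flag" idea described above, assuming a module-level flag that only the top-of-series API microversion patch flips on; the module layout and names are made up for this sketch, and the mock trick is how the patch-by-patch functional testing could work.)

    # Hypothetical sketch only; module and attribute names are invented.
    from unittest import mock

    # Not exposed in nova.conf. In production code, only the API microversion
    # patch at the top of the series turns this on, when the new version is
    # requested.
    PORT_RESOURCE_REQUEST_ENABLED = False


    def handle_port_resource_request(port):
        """Only act on the port's resource_request when the flag is on."""
        if not PORT_RESOURCE_REQUEST_ENABLED:
            return None
        return port.get('resource_request')


    # In functional tests, patches lower in the series can still be exercised
    # end-to-end, patch by patch, by forcing the flag on:
    port = {'resource_request': {'resources': {'NET_BW_EGR_KILOBIT_PER_SEC': 1000}}}
    assert handle_port_resource_request(port) is None
    with mock.patch(__name__ + '.PORT_RESOURCE_REQUEST_ENABLED', True):
        assert handle_port_resource_request(port) is not None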
On 12/28/2018 4:13 AM, Balázs Gibizer wrote:
I'm wondering whether introducing an API microversion could act as the feature flag I need and at the same time still make the feature discoverable as you would like to see it. Something like: create a feature flag in the code but do not put it in the config as a settable flag. Instead, add an API microversion patch to the top of the series and when the new version is requested it enables the feature via the feature flag. This API patch can be small and simple enough to cherry-pick earlier into the series for local end-to-end testing if needed. Also, in the functional tests I can set the flag via a mock so I can add and run functional tests patch by patch.
That may work. It's not how I would have done this, I would have started from the bottom and worked my way up with the end to end functional testing at the end, as already noted, but I realize you've been pushing this boulder for a couple of releases now so that's not really something you want to change at this point. I guess the question is should this change have a microversion at all? That's been wrestled in the spec review and called out in this thread. I don't think a microversion would be *wrong* in any sense and could only help with discoverability on the nova side, but am open to other opinions. -- Thanks, Matt
On Thu, 3 Jan 2019 11:40:22 -0600, Matt Riedemann <mriedemos@gmail.com> wrote:
On 12/28/2018 4:13 AM, Balázs Gibizer wrote:
I'm wondering whether introducing an API microversion could act as the feature flag I need and at the same time still make the feature discoverable as you would like to see it. Something like: create a feature flag in the code but do not put it in the config as a settable flag. Instead, add an API microversion patch to the top of the series and when the new version is requested it enables the feature via the feature flag. This API patch can be small and simple enough to cherry-pick earlier into the series for local end-to-end testing if needed. Also, in the functional tests I can set the flag via a mock so I can add and run functional tests patch by patch.
That may work. It's not how I would have done this, I would have started from the bottom and worked my way up with the end to end functional testing at the end, as already noted, but I realize you've been pushing this boulder for a couple of releases now so that's not really something you want to change at this point.
I guess the question is should this change have a microversion at all? That's been wrestled in the spec review and called out in this thread. I don't think a microversion would be *wrong* in any sense and could only help with discoverability on the nova side, but am open to other opinions.
Sorry to be late to this discussion, but this was brought up in the nova meeting today to get more thoughts. I'm going to briefly summarize my thoughts here.

IMHO, I think this change should have a microversion, to help with discoverability. I'm thinking, how will users be able to detect they're able to leverage the new functionality otherwise? A microversion would signal the availability. As for dealing with the situation where a user specifies an older microversion combined with resource requests, I think it should behave similarly to how multiattach works, where the request will be rejected straight away if the microversion is too low + resource requests are passed.

The current behavior today is that the resource requests are ignored. If we only ignored the resource requests when they're passed with an older microversion, it seems like it would be an unnecessarily poor UX to have their parameters ignored, and likely lead them on a debugging journey if and when they realize things aren't working the way they expect given the resource requests they specified.

-melanie
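(A rough sketch of what the multiattach-style rejection described above could look like in the API layer; the microversion value, exception and helper names are placeholders, not the actual nova code.)

    # Hypothetical sketch only; names and the version number are placeholders.

    MIN_PORT_RESOURCE_REQUEST_VERSION = (2, 99)   # placeholder microversion


    class PortResourceRequestNotSupported(Exception):
        """Would map to a 400 response in the real API."""


    def ports_have_resource_request(requested_ports):
        return any(port.get('resource_request') for port in requested_ports)


    def validate_create_request(requested_version, requested_ports):
        """Reject old-microversion requests that include bandwidth-aware ports.

        Mirrors the volume multiattach precedent: an old microversion combined
        with a "special" external resource results in an immediate 400 instead
        of a silently ignored resource request.
        """
        if (ports_have_resource_request(requested_ports)
                and requested_version < MIN_PORT_RESOURCE_REQUEST_VERSION):
            raise PortResourceRequestNotSupported(
                'Creating a server with a neutron port that has a resource '
                'request is only supported from microversion %d.%d'
                % MIN_PORT_RESOURCE_REQUEST_VERSION)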
On Fri, 2019-01-04 at 00:48 -0800, melanie witt wrote:
On Thu, 3 Jan 2019 11:40:22 -0600, Matt Riedemann <mriedemos@gmail.com> wrote:
On 12/28/2018 4:13 AM, Balázs Gibizer wrote:
I'm wondering whether introducing an API microversion could act as the feature flag I need and at the same time still make the feature discoverable as you would like to see it. Something like: create a feature flag in the code but do not put it in the config as a settable flag. Instead, add an API microversion patch to the top of the series and when the new version is requested it enables the feature via the feature flag. This API patch can be small and simple enough to cherry-pick earlier into the series for local end-to-end testing if needed. Also, in the functional tests I can set the flag via a mock so I can add and run functional tests patch by patch.
That may work. It's not how I would have done this, I would have started from the bottom and worked my way up with the end to end functional testing at the end, as already noted, but I realize you've been pushing this boulder for a couple of releases now so that's not really something you want to change at this point.
I guess the question is should this change have a microversion at all? That's been wrestled in the spec review and called out in this thread. I don't think a microversion would be *wrong* in any sense and could only help with discoverability on the nova side, but am open to other opinions.
Sorry to be late to this discussion, but this was brought up in the nova meeting today to get more thoughts. I'm going to briefly summarize my thoughts here.
IMHO, I think this change should have a microversion, to help with discoverability. I'm thinking, how will users be able to detect they're able to leverage the new functionality otherwise? A microversion would signal the availability. As for dealing with the situation where a user specifies an older microversion combined with resource requests, I think it should behave similarly to how multiattach works, where the request will be rejected straight away if microversion too low + resource requests are passed.
This has implications for upgrades and version compatibility. If a newer version of neutron is used with an older nova then the behavior will change when nova is upgraded to a version that has the new microversion.

My concern is as follows. A given deployment has Rocky nova and Rocky neutron. A tenant defines a minimum bandwidth policy and applies it to a network. They create a port on that network. Neutron will automatically apply the minimum bandwidth policy to the port when it is created on the network (but we could also assume the tenant applied the policy to the port directly if we liked). The tenant then boots a vm with that port.

When the vm is scheduled to a node, neutron will ask the network backend via the ml2 driver to configure the minimum bandwidth policy, if the network backend supports it, as part of the bind port call. The ml2 driver can refuse to bind the port at this point if it cannot fulfill the request, to prevent the vm from spawning. Assuming the binding succeeds, the backend will configure the minimum bandwidth policy on the interface. Nova in Rocky will not schedule based on the qos policy, as there is no resource request in the port and placement will not model bandwidth availability.

Note: this is how minimum bandwidth was originally planned to be implemented with ml2/odl and other sdn controller backends several years ago, but odl did not implement the required features so this mechanism was never used. I am not aware of any ml2 driver that actually implemented the bandwidth check, but before placement was created this was the mechanism that at least my team at intel and some others had been planning to use.

So in Rocky the vm should boot, there will be no prevention of oversubscription in placement, and neutron will configure the minimum bandwidth policy if the network backend supports it. The ingress qos minimum bandwidth rules were only added to neutron recently; egress qos minimum bandwidth support was added back in Newton with https://github.com/openstack/neutron/commit/60325f4ae9ec53734d792d111cbcf242... so there will be a lot of existing cases where ports have minimum bandwidth policies before Stein.

If we repeat the same exercise with Rocky nova and Stein neutron this changes slightly, in that neutron will look at the qos policy associated with the port and add a resource request. As Rocky nova will not have code to parse the resource requests from the neutron port, they will be ignored and the vm will boot, neutron will configure minimum bandwidth enforcement on the port, and placement will model the bandwidth as an inventory, but no allocation will be created for the vm.

Note: I have not checked the neutron code to confirm the qos plugin will still work without the placement allocation, but if it does not, it's a bug, as Stein neutron would no longer work with pre-Stein nova. As such we would have broken the ability to upgrade nova and neutron separately.

If you use Stein nova and Stein neutron and the new microversion, then the vm boots, we allocate the bandwidth in placement and configure the enforcement in the networking backend if it supports it, which is our end goal.

The last configuration is Stein nova and Stein neutron with an old microversion. This will happen in two cases: either no microversion is specified explicitly and the openstack client is used, since it will not negotiate the latest microversion, or an explicit old microversion is passed.
If the last Rocky microversion was passed, for example, and we chose to ignore the presence of the resource request, then it would work the way it did with Rocky nova and Stein neutron above. If we choose to reject the request instead, anyone who tries to perform instance actions on an existing instance will break after nova is upgraded to Stein. While the fact that oversubscription may happen could be problematic to debug for some, I think that ux cost is less than the cost of updating all software that has used egress qos since it was introduced in Newton to explicitly pass the latest microversion.

I am in favor of adding a microversion by the way, I just think we should ignore the resource request if an old microversion is used.
The current behavior today is that the resource requests are ignored. If we only ignored the resource requests when they're passed with an older microversion, it seems like it would be an unnecessarily poor UX to have their parameters ignored, and likely lead them on a debugging journey if and when they realize things aren't working the way they expect given the resource requests they specified.
-melanie
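(For concreteness, the "tenant defines a minimum bandwidth policy and applies it" step in the scenario above looks roughly like the following with openstacksdk; the names and numbers are illustrative and this is only a sketch of the workflow, not something taken from the patches.)

    # Illustrative sketch using openstacksdk; values are made up.
    import openstack

    conn = openstack.connect(cloud='mycloud')  # assumes a clouds.yaml entry

    # 1. Tenant creates a QoS policy with a minimum bandwidth rule.
    policy = conn.network.create_qos_policy(name='guaranteed-min-bw')
    conn.network.create_qos_minimum_bandwidth_rule(
        policy, min_kbps=1000, direction='egress')

    # 2. The policy is attached to a port (or to the network, in which case
    #    neutron applies it to ports created on that network).
    net = conn.network.find_network('private')
    port = conn.network.create_port(network_id=net.id, qos_policy_id=policy.id)

    # 3. The port is then used to boot the server; with a Stein neutron the
    #    port will also carry a resource_request that nova/placement can use.
    print(port.id)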
On Fri, 04 Jan 2019 13:20:54 +0000, Sean Mooney <smooney@redhat.com> wrote:
On Fri, 2019-01-04 at 00:48 -0800, melanie witt wrote:
On Thu, 3 Jan 2019 11:40:22 -0600, Matt Riedemann <mriedemos@gmail.com> wrote:
On 12/28/2018 4:13 AM, Balázs Gibizer wrote:
I'm wondering whether introducing an API microversion could act as the feature flag I need and at the same time still make the feature discoverable as you would like to see it. Something like: create a feature flag in the code but do not put it in the config as a settable flag. Instead, add an API microversion patch to the top of the series and when the new version is requested it enables the feature via the feature flag. This API patch can be small and simple enough to cherry-pick earlier into the series for local end-to-end testing if needed. Also, in the functional tests I can set the flag via a mock so I can add and run functional tests patch by patch.
That may work. It's not how I would have done this, I would have started from the bottom and worked my way up with the end to end functional testing at the end, as already noted, but I realize you've been pushing this boulder for a couple of releases now so that's not really something you want to change at this point.
I guess the question is should this change have a microversion at all? That's been wrestled in the spec review and called out in this thread. I don't think a microversion would be *wrong* in any sense and could only help with discoverability on the nova side, but am open to other opinions.
Sorry to be late to this discussion, but this was brought up in the nova meeting today to get more thoughts. I'm going to briefly summarize my thoughts here.
IMHO, I think this change should have a microversion, to help with discoverability. I'm thinking, how will users be able to detect they're able to leverage the new functionality otherwise? A microversion would signal the availability. As for dealing with the situation where a user specifies an older microversion combined with resource requests, I think it should behave similarly to how multiattach works, where the request will be rejected straight away if microversion too low + resource requests are passed.
This has implications for upgrades and version compatibility. If a newer version of neutron is used with an older nova then the behavior will change when nova is upgraded to a version that has the new microversion.

My concern is as follows. A given deployment has Rocky nova and Rocky neutron. A tenant defines a minimum bandwidth policy and applies it to a network. They create a port on that network. Neutron will automatically apply the minimum bandwidth policy to the port when it is created on the network (but we could also assume the tenant applied the policy to the port directly if we liked). The tenant then boots a vm with that port.

When the vm is scheduled to a node, neutron will ask the network backend via the ml2 driver to configure the minimum bandwidth policy, if the network backend supports it, as part of the bind port call. The ml2 driver can refuse to bind the port at this point if it cannot fulfill the request, to prevent the vm from spawning. Assuming the binding succeeds, the backend will configure the minimum bandwidth policy on the interface. Nova in Rocky will not schedule based on the qos policy, as there is no resource request in the port and placement will not model bandwidth availability.

Note: this is how minimum bandwidth was originally planned to be implemented with ml2/odl and other sdn controller backends several years ago, but odl did not implement the required features so this mechanism was never used. I am not aware of any ml2 driver that actually implemented the bandwidth check, but before placement was created this was the mechanism that at least my team at intel and some others had been planning to use.

So in Rocky the vm should boot, there will be no prevention of oversubscription in placement, and neutron will configure the minimum bandwidth policy if the network backend supports it. The ingress qos minimum bandwidth rules were only added to neutron recently; egress qos minimum bandwidth support was added back in Newton with https://github.com/openstack/neutron/commit/60325f4ae9ec53734d792d111cbcf242... so there will be a lot of existing cases where ports have minimum bandwidth policies before Stein.

If we repeat the same exercise with Rocky nova and Stein neutron this changes slightly, in that neutron will look at the qos policy associated with the port and add a resource request. As Rocky nova will not have code to parse the resource requests from the neutron port, they will be ignored and the vm will boot, neutron will configure minimum bandwidth enforcement on the port, and placement will model the bandwidth as an inventory, but no allocation will be created for the vm.

Note: I have not checked the neutron code to confirm the qos plugin will still work without the placement allocation, but if it does not, it's a bug, as Stein neutron would no longer work with pre-Stein nova. As such we would have broken the ability to upgrade nova and neutron separately.

If you use Stein nova and Stein neutron and the new microversion, then the vm boots, we allocate the bandwidth in placement and configure the enforcement in the networking backend if it supports it, which is our end goal.

The last configuration is Stein nova and Stein neutron with an old microversion. This will happen in two cases: either no microversion is specified explicitly and the openstack client is used, since it will not negotiate the latest microversion, or an explicit old microversion is passed.

If the last Rocky microversion was passed, for example, and we chose to ignore the presence of the resource request, then it would work the way it did with Rocky nova and Stein neutron above. If we choose to reject the request instead, anyone who tries to perform instance actions on an existing instance will break after nova is upgraded to Stein.

While the fact that oversubscription may happen could be problematic to debug for some, I think that ux cost is less than the cost of updating all software that has used egress qos since it was introduced in Newton to explicitly pass the latest microversion.

I am in favor of adding a microversion by the way, I just think we should ignore the resource request if an old microversion is used.
Thanks for describing this detailed scenario -- I didn't realize that today, you can get _some_ QoS support by pre-creating ports in neutron with resource requests attached and specifying those ports when creating a server. I understand now the concern with the idea of rejecting requests < new microversion + port.resource_request existing on pre-created ports. And there's no notion of being able to request QoS support via ports created by Nova (no change in Nova API or flavor extra-specs in the design). So, I could see this situation being reason enough not to reject requests when an old microversion is specified.

But, let's chat more about it via a hangout the week after next (week of January 14 when Matt is back), as suggested in #openstack-nova today. We'll be able to have a high-bandwidth discussion then and agree on a decision on how to move forward with this.
The current behavior today is that the resource requests are ignored. If we only ignored the resource requests when they're passed with an older microversion, it seems like it would be an unnecessarily poor UX to have their parameters ignored, and likely lead them on a debugging journey if and when they realize things aren't working the way they expect given the resource requests they specified.
-melanie
But, let's chat more about it via a hangout the week after next (week of January 14 when Matt is back), as suggested in #openstack-nova today. We'll be able to have a high-bandwidth discussion then and agree on a decision on how to move forward with this.
Thank you all for the discussion. I agree to have a real-time discussion about the way forward. Would Monday, 14th of Jan, 17:00 UTC[1] work for you for a hangouts[2]?

I see the following topics we need to discuss:
* backward compatibility with already existing SRIOV ports having min bandwidth
* introducing microversion(s) for this feature in Nova
* allowing partial support for this feature in Nova in Stein (e.g.: only server create/delete but no migrate support)
* step-by-step verification of the really long commit chain in Nova

I will post a summary of each issue to the ML during this week.

Cheers,
gibi

[1] https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190114T170000
[2] https://hangouts.google.com/call/oZAfCFV3XaH3IxaA0-ITAEEI
On Mon, Jan 7, 2019 at 1:52 PM, Balázs Gibizer <balazs.gibizer@ericsson.com> wrote:
But, let's chat more about it via a hangout the week after next (week of January 14 when Matt is back), as suggested in #openstack-nova today. We'll be able to have a high-bandwidth discussion then and agree on a decision on how to move forward with this.
Thank you all for the discussion. I agree to have a real-time discussion about the way forward.
Would Monday, 14th of Jan, 17:00 UTC[1] work for you for a hangouts[2]?
I see the following topics we need to discuss: * backward compatibility with already existing SRIOV ports having min bandwidth * introducing microversion(s) for this feature in Nova * allowing partial support for this feature in Nova in Stein (E.g.: only server create/delete but no migrate support). * step-by-step verification of the really long commit chain in Nova
I will post a summary of each issue to the ML during this week.
Hi, As I promised, here is an etherpad[1] for the hangouts discussion, with a short summary of the topics I think we need to discuss. Feel free to comment in there or add new topics you feel are important. [1] https://etherpad.openstack.org/p/bandwidth-way-forward Cheers, gibi
On Wed, Jan 9, 2019 at 11:30 AM, Balázs Gibizer <balazs.gibizer@ericsson.com> wrote:
On Mon, Jan 7, 2019 at 1:52 PM, Balázs Gibizer <balazs.gibizer@ericsson.com> wrote:
But, let's chat more about it via a hangout the week after next (week of January 14 when Matt is back), as suggested in #openstack-nova today. We'll be able to have a high-bandwidth discussion then and agree on a decision on how to move forward with this.
Thank you all for the discussion. I agree to have a real-time discussion about the way forward.
Would Monday, 14th of Jan, 17:00 UTC[1] work for you for a hangouts[2]?
It seems that Tuesday 15th of Jan, 17:00 UTC [2] would be better for the team. So I'm moving the call there. Cheers, gibi [1] https://hangouts.google.com/call/oZAfCFV3XaH3IxaA0-ITAEEI [2] https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190115T170000
On Wed, Jan 9, 2019 at 5:56 PM, Balázs Gibizer <balazs.gibizer@ericsson.com> wrote:
On Wed, Jan 9, 2019 at 11:30 AM, Balázs Gibizer <balazs.gibizer@ericsson.com> wrote:
On Mon, Jan 7, 2019 at 1:52 PM, Balázs Gibizer <balazs.gibizer@ericsson.com> wrote:
But, let's chat more about it via a hangout the week after next (week of January 14 when Matt is back), as suggested in #openstack-nova today. We'll be able to have a high-bandwidth discussion then and agree on a decision on how to move forward with this.
Thank you all for the discussion. I agree to have a real-time discussion about the way forward.
Would Monday, 14th of Jan, 17:00 UTC[1] work for you for a hangouts[2]?
It seems that Tuesday 15th of Jan, 17:00 UTC [2] would be better for the team. So I'm moving the call there.
Sorry to change it again. I hope this is the final time. Friday 18th of Jan, 17:00 UTC [2]. The discussion etherpad is updated with a bit more info [3]. Cheers, gibi [1] https://hangouts.google.com/call/oZAfCFV3XaH3IxaA0-ITAEEI [2] https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190118T170000 [3] https://etherpad.openstack.org/p/bandwidth-way-forward
Cheers, gibi
[1] https://hangouts.google.com/call/oZAfCFV3XaH3IxaA0-ITAEEI [2] https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190115T170000
Hello,

Thank you for the productive call. Here is a short summary of the agreements:

* There will be a new microversion in nova for creating (and deleting) a server with ports having resource requests. If such a request is sent with any old microversion then nova will reject the request. Nova will reject not just the create request with an old microversion but all the requests that would lead to an inconsistent resource view. These are: create, attachInterface, resize, migrate, live-migrate, evacuate, and unshelve of an offloaded server.

* Deleting a server having resource aware ports will be supported with an old microversion. Nova will delete the whole consumer in placement anyhow.

* There will be a second new microversion (probably in Train) that will enable move operations for servers having resource aware ports. This microversion split will allow us not to block the create / delete support for the feature in Stein.

* The new microversions will act as a feature flag in the code. This will allow merging single use cases (e.g.: server create with one ovs backed resource aware port) and functionally verifying it before the whole generic create use case is ready and enabled.

* A nova-manage command will be provided to heal the port allocations without moving the servers, if there is enough resource inventory available for it on the current host. This tool will only work online as it will call the neutron and placement APIs.

* Server move operations with the second new microversion will automatically heal the server allocation.

Cheers,
gibi

On Mon, Jan 14, 2019 at 6:16 PM, Balázs Gibizer <balazs.gibizer@ericsson.com> wrote:
On Wed, Jan 9, 2019 at 5:56 PM, Balázs Gibizer <balazs.gibizer@ericsson.com> wrote:
On Wed, Jan 9, 2019 at 11:30 AM, Balázs Gibizer <balazs.gibizer@ericsson.com> wrote:
On Mon, Jan 7, 2019 at 1:52 PM, Balázs Gibizer <balazs.gibizer@ericsson.com> wrote:
But, let's chat more about it via a hangout the week after next (week of January 14 when Matt is back), as suggested in #openstack-nova today. We'll be able to have a high-bandwidth discussion then and agree on a decision on how to move forward with this.
Thank you all for the discussion. I agree to have a real-time discussion about the way forward.
Would Monday, 14th of Jan, 17:00 UTC[1] work for you for a hangouts[2]?
It seems that Tuesday 15th of Jan, 17:00 UTC [2] would be better for the team. So I'm moving the call there.
Sorry to change it again. I hope this is the final time. Friday 18th of Jan, 17:00 UTC [2]. The discussion etherpad is updated with a bit more info [3].
Cheers, gibi
[1] https://hangouts.google.com/call/oZAfCFV3XaH3IxaA0-ITAEEI [2] https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190118T170000 [3] https://etherpad.openstack.org/p/bandwidth-way-forward
Cheers, gibi
[1] https://hangouts.google.com/call/oZAfCFV3XaH3IxaA0-ITAEEI [2]
https://www.timeanddate.com/worldclock/fixedtime.html?iso=20190115T170000
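(A side note on the delete agreement above, that delete with an old microversion stays consistent because nova deletes the whole allocation consumer in placement: the placement call involved is DELETE /allocations/{consumer_uuid}, which drops every allocation the consumer holds regardless of resource class or provider. Below is a minimal sketch of that call from outside nova, assuming a keystoneauth1 adapter pointed at placement; error handling omitted.)

    # Sketch only -- not nova code. Shows why deleting the consumer removes
    # the port/bandwidth allocations together with the compute allocations.
    from keystoneauth1 import adapter, session
    from keystoneauth1.identity import v3


    def make_placement_client(auth_url, username, password, project_name):
        auth = v3.Password(auth_url=auth_url, username=username,
                           password=password, project_name=project_name,
                           user_domain_id='default', project_domain_id='default')
        return adapter.Adapter(session=session.Session(auth=auth),
                               service_type='placement', interface='public')


    def delete_server_allocations(placement, instance_uuid):
        # DELETE /allocations/{consumer_uuid} drops every allocation the
        # consumer holds, regardless of resource class or resource provider.
        resp = placement.delete('/allocations/%s' % instance_uuid)
        return resp.status_code == 204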
* There will be a second new microversion (probably in Train) that will enable move operations for server having resource aware ports. This microversion split will allow us not to block the create / delete support for the feature in Stein.
* The new microversions will act as a feature flag in the code. This will allow merging single use cases (e.g.: server create with one ovs backed resource aware port) and functionally verifying it before the whole generic create use case is ready and enabled.
* A nova-manage command will be provided to heal the port allocations without moving the servers if there is enough resource inventory available for it on the current host. This tool will only work online as it will call neutron and placement APIs.
* Server move operations with the second new microversion will automatically heal the server allocation.
I wasn't on this call, so apologies if I'm missing something important. Having a microversion that allows move operations for an instance configured with one of these ports seems really terrible to me. What exactly is the point of that? To distinguish between Stein and Train systems purely because Stein didn't have time to finish the feature? IMHO, we should really avoid abusing microversions for that sort of thing. I would tend to err on the side of "if it's not ready, then it's not ready" for Stein, but I'm sure the desire to get this in (even if partially) is too strong for that sort of restraint. Can we not return 403 in Stein, since moving instances is disable-able anyway, and just make it work in Train? Having a new microversion with a description of "nothing changed except we finished a feature so you can do this very obscure thing now" seems like we're just using them as an experimental feature flag, which was definitely not the intent. I know returning 403 for "you can't do this right now" isn't *as* discoverable, but you kinda have to handle 403 for operations that could be disabled anyway, so... --Dan
* There will be a second new microversion (probably in Train) that will enable move operations for server having resource aware ports. This microversion split will allow us not to block the create / delete support for the feature in Stein.
* The new microversions will act as a feature flag in the code. This will allow merging single use cases (e.g.: server create with one ovs backed resource aware port) and functionally verifying it before the whole generic create use case is ready and enabled.
* A nova-manage command will be provided to heal the port allocations without moving the servers if there is enough resource inventory available for it on the current host. This tool will only work online as it will call neutron and placement APIs.
* Server move operations with the second new microversion will automatically heal the server allocation.
I wasn't on this call, so apologies if I'm missing something important.
Having a microversion that allows move operations for an instance configured with one of these ports seems really terrible to me. What exactly is the point of that? To distinguish between Stein and Train systems purely because Stein didn't have time to finish the feature?
On Fri, 2019-01-18 at 10:40 -0800, Dan Smith wrote:

The intent of the second microversion was primarily discoverability. The option of having 1 vs 2 microversions was discussed; no one really pushed strongly for only one on the call, so I think we just converged on two mainly because of the discoverability aspect.
IMHO, we should really avoid abusing microversions for that sort of thing. I would tend to err on the side of "if it's not ready, then it's not ready" for Stein, but I'm sure the desire to get this in (even if partially) is too strong for that sort of restraint.
So don't merge it at all for Stein and merge it all in Train? It's an option, but yes, I think a lot of people would like to see at least some support in Stein. It would not be the only nova feature that does not support move operations, but maybe it will cause operators more headaches than it solves if we enable the feature without move support.
Can we not return 403 in Stein, since moving instances is disable-able anyway, and just make it work in Train? Having a new microversion with a description of "nothing changed except we finished a feature so you can do this very obscure thing now" seems like we're just using them as an experimental feature flag, which was definitely not the intent. I know returning 403 for "you can't do this right now" isn't *as* discoverable, but you kinda have to handle 403 for operations that could be disabled anyway, so...
Ya, I guess that is a fair point. Microversions are not like neutron's api extensions, you can't mix and match microversions in the same way. In neutron the presence of the extension tells you the feature is available and enabled. The nova microversion just tells you the code supports it, not that it's configured, so ya, you would have to handle 403s if move operations were disabled even if we had two microversions.
--Dan
On 1/18/2019 12:40 PM, Dan Smith wrote:
Having a microversion that allows move operations for an instance configured with one of these ports seems really terrible to me.
I agree with that sentiment.
Can we not return 403 in Stein, since moving instances is disable-able anyway, and just make it work in Train? Having a new microversion with a description of "nothing changed except we finished a feature so you can do this very obscure thing now" seems like we're just using them as an experimental feature flag, which was definitely not the intent. I know returning 403 for "you can't do this right now" isn't *as* discoverable, but you kinda have to handle 403 for operations that could be disabled anyway, so...
We didn't discuss it too much on the call, but in thinking about it afterward, I think I would be OK with treating this like a bug fix in Train. We can fail move operations until we support this, and then once we support it, we just do, without a microversion. As noted, clients have to deal with this kind of stuff already, and I don't remember saying when we support live migration with NUMA (which now fails unless configured otherwise) that we would add a microversion for that - it either just works or it doesn't. So I'm OK with not adding a second microversion for move operation support later. -- Thanks, Matt
On Fri, Jan 18, 2019 at 7:40 PM, Dan Smith <dms@danplanet.com> wrote:
* There will be a second new microversion (probably in Train) that will enable move operations for server having resource aware ports. This microversion split will allow us not to block the create / delete support for the feature in Stein.
* The new microversions will act as a feature flag in the code. This will allow merging single use cases (e.g.: server create with one ovs backed resource aware port) and functionally verifying it before the whole generic create use case is ready and enabled.
* A nova-manage command will be provided to heal the port allocations without moving the servers if there is enough resource inventory available for it on the current host. This tool will only work online as it will call neutron and placement APIs.
* Server move operations with the second new microversion will automatically heal the server allocation.
I wasn't on this call, so apologies if I'm missing something important.
Having a microversion that allows move operations for an instance configured with one of these ports seems really terrible to me. What exactly is the point of that? To distinguish between Stein and Train systems purely because Stein didn't have time to finish the feature?
I think in Stein we have time to finish the boot / delete use case of the feature but most probably do not have time to finish the move use cases. I believe that the boot / delete use case is already useful for end users. There are plenty of features in nova that are enabled before supporting all the cases, like move operations with NUMA.
IMHO, we should really avoid abusing microversions for that sort of thing. I would tend to err on the side of "if it's not ready, then it's not ready" for Stein, but I'm sure the desire to get this in (even if partially) is too strong for that sort of restraint.
Why is it an abuse of the microversion to use it to signal that a new use case is supported? I'm confused. I was asked to use microversions to signal that a feature is ready. So I'm not sure why in case of a feature (feature == one or more use case(s)) it is OK to use a microversion but not OK when a use case (e.g. boot/delete) is completed.
Can we not return 403 in Stein, since moving instances is disable-able anyway, and just make it work in Train? Having a new microversion with a description of "nothing changed except we finished a feature so you can do this very obscure thing now" seems like we're just using them as an
I think "nothing is changed" would not be true. Some operation (e.g. server move) that was rejected before (or even accepted but caused unintentional resource overallocation) now works properly. Isn't it the "you can do this very obscure thing now" documentation of a microversion that makes the new API behavior discoverable?
experimental feature flag, which was definitely not the intent. I know returning 403 for "you can't do this right now" isn't *as* discoverable, but you kinda have to handle 403 for operations that could be disabled anyway, so...
The boot / delete use case would not be experimental, that would be final. 403 is a client error but in this case, in Stein, move operations would not be implemented yet. So for me that error is not a client error (e.g. there is no way a client can fix it) but a server error, like HTTP 501. Cheers, gibi
--Dan
On Mon, 2019-01-21 at 10:45 +0000, Balázs Gibizer wrote:
On Fri, Jan 18, 2019 at 7:40 PM, Dan Smith <dms@danplanet.com> wrote:
* There will be a second new microversion (probably in Train) that will enable move operations for server having resource aware ports. This microversion split will allow us not to block the create / delete support for the feature in Stein.
* The new microversions will act as a feature flag in the code. This will allow merging single use cases (e.g. server create with one OVS-backed resource-aware port) and functionally verifying them before the whole generic create use case is ready and enabled.
* A nova-manage command will be provided to heal the port allocations without moving the servers if there is enough resource inventory available for it on the current host. This tool will only work online as it will call neutron and placement APIs.
* Server move operations with the second new microversion will automatically heal the server allocation.
I wasn't on this call, so apologies if I'm missing something important.
Having a microversion that allows move operations for an instance configured with one of these ports seems really terrible to me. What exactly is the point of that? To distinguish between Stein and Train systems purely because Stein didn't have time to finish the feature?
I think in Stein we have time to finish the boot / delete use case of the feature but most probably do not have time to finish the move use cases. I believe that the boot / delete use case is already useful for end users. There are plenty of features in nova that are enabled before supporting all the cases, like move operations with NUMA.
That is true, however NUMA in particular was due to an oversight, not by design. As is the case with macvtap SR-IOV, NUMA had intended to support live migration from its introduction, even if those are only now being completed. NUMA, even without Artom's work, has always supported cold migrations; the same is true of CPU pinning, hugepages and PCI/SR-IOV passthrough.
IMHO, we should really avoid abusing microversions for that sort of thing. I would tend to err on the side of "if it's not ready, then it's not ready" for Stein, but I'm sure the desire to get this in (even if partially) is too strong for that sort of restraint.
Why is it an abuse of the microversion to use it to signal that a new use case is supported? I'm confused. I was asked to use microversions to signal that a feature is ready. So I'm not sure why, in the case of a feature (feature == one or more use cases), it is OK to use a microversion but not OK when a use case (e.g. boot/delete) is completed.
Dan can speak for himself, but I would assume because it does not signal that the use case is supported. It merely signals that the codebase could support it. As move operations can be disabled via config, or may not be supported by the selected hypervisor (ironic), the presence of the microversion alone is not enough to determine that the use case is supported. Unlike neutron extensions, microversions are not advertised individually and cannot be enabled only when the deployment is configured to support a feature.
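To make the discoverability difference concrete, here is a minimal sketch (the endpoint URLs and token are placeholders): nova publishes one min/max microversion range for the whole API, while neutron lists each extension individually.

  import requests

  COMPUTE_ENDPOINT = 'https://cloud.example.com/compute/v2.1'  # placeholder
  NETWORK_ENDPOINT = 'https://cloud.example.com:9696'           # placeholder
  AUTH_HEADERS = {'X-Auth-Token': '...'}                        # placeholder

  # nova: a single versions document; a high max microversion does not tell
  # you whether move operations are enabled or supported by the virt driver
  version_doc = requests.get(COMPUTE_ENDPOINT, headers=AUTH_HEADERS).json()
  max_microversion = version_doc['version']['version']

  # neutron: extensions are advertised one by one, so a per-feature flag
  # is directly visible to clients
  extensions = requests.get(NETWORK_ENDPOINT + '/v2.0/extensions',
                            headers=AUTH_HEADERS).json()['extensions']
  has_resource_request = any(
      ext['alias'] == 'port-resource-request' for ext in extensions)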
Can we not return 403 in Stein, since moving instances is disable-able anyway, and just make it work in Train? Having a new microversion with a description of "nothing changed except we finished a feature so you can do this very obscure thing now" seems like we're just using them as an
I think "nothing is changed" would not be true. Some operation (e.g. server move) that was rejected before (or even accepted but caused unintentional resource overallocation) now works properly.
Isn't it the "you can do this very obscure thing now" documentation of a microversion that makes the new API behavior discoverable?
experimental feature flag, which was definitely not the intent. I know returning 403 for "you can't do this right now" isn't *as* discoverable, but you kinda have to handle 403 for operations that could be disabled anyway, so...
The boot / delete use case would not be experimental, that would be final.
403 is a client error but in this case, in Stein, move operations would not be implemented yet. So for me that error is not a client error (i.e. there is no way the client can fix it) but a server error, like HTTP 501. A 501 "Not Implemented" would be a valid error code to use with the new microversion.
Since min bandwidth was best effort before, any overallocation was not a bug or unintentional; it was allowed by design, given that we initially planned to delegate the bandwidth management to the SDN controller. As Matt pointed out, the APIs for creating qos rules and policies are admin only, as are most of the move operations. A tenant could have chosen to apply the QoS policy, but the admin had to create it in the first place, and that is what declares support for bandwidth based scheduling. Resize today does not return 501 https://developer.openstack.org/api-ref/compute/?expanded=resize-server-resi... nor do shelve/unshelve https://developer.openstack.org/api-ref/compute/#shelve-server-shelve-action https://developer.openstack.org/api-ref/compute/#unshelve-restore-shelved-se... and the same is true of migrate and live migrate https://developer.openstack.org/api-ref/compute/?expanded=#migrate-server-mi... https://developer.openstack.org/api-ref/compute/?expanded=#live-migrate-serv... As such, returning 501 for older microversions would be incorrect, as it is a change in the set of response codes that existing clients should expect from those endpoints. While I agree it is not a client error, being consistent with existing behavior would be preferable, as clients presumably know how to deal with it.
Cheers, gibi
--Dan
On 1/4/2019 7:20 AM, Sean Mooney wrote:
so in rocky the VM should boot, there will be no prevention of oversubscription in placement and neutron will configure the minimum bandwidth policy if the network backend supports it. The ingress qos minimum bandwidth rules was only added in neutron but egress qos minimum bandwidth support was added in newton with https://github.com/openstack/neutron/commit/60325f4ae9ec53734d792d111cbcf242...
You said "The ingress qos minimum bandwidth rules was only added in neutron" - did you mean a release rather than "neutron", as in a release newer than newton, presumably much newer?
so there will be a lot of existing cases where ports will have minimum bandwidth policies before stein.
Isn't this all admin-only by default in neutron since newton? So how do we know there will be "a lot" of existing cases? Do we know of any public openstack clouds that enable this for their users? If not, I'm guessing by "a lot" maybe you mean a lot of telco private cloud openstack deployments that just have a single MANO tenant? -- Thanks, Matt
On Fri, 2019-01-18 at 10:55 -0600, Matt Riedemann wrote:
On 1/4/2019 7:20 AM, Sean Mooney wrote:
so in rocky the VM should boot, there will be no prevention of oversubscription in placement and neutron will configure the minimum bandwidth policy if the network backend supports it. The ingress qos minimum bandwidth rules was only added in neutron but egress qos minimum bandwidth support was added in newton with
https://github.com/openstack/neutron/commit/60325f4ae9ec53734d792d111cbcf242...
You said "The ingress qos minium bandwith rules was only added in neutron" - did you mean a release rather than "neutron", as in a release newer than newton, presumably much newer? yes i meant to say minium ingress qos was only added to neutron in rocky where as minium egress qos dates back to newton.
so there will be a lot of existing cases where ports will have minimum bandwidth policies before stein.
Isn't this all admin-only by default in neutron since newton? So how do we know there will be "a lot" of existing cases? Do we know of any public openstack clouds that enable this for their users? If not, I'm guessing by "a lot" maybe you mean a lot of telco private cloud openstack deployments that just have a single MANO tenant? Yes, telco/NFV deployments where a MANO system is used to manage openstack were the primary use case I was thinking about. Looking at the API definition https://github.com/openstack/neutron-lib/blob/master/neutron_lib/api/definit... and the API docs
https://developer.openstack.org/api-ref/network/v2/index.html?expanded=creat... I don't see anything calling this API admin only. I know qos in general was not intended to be admin only. Looking at https://github.com/openstack/neutron/blob/master/neutron/conf/policies/qos.p... it looks like you need admin rights to create, update and delete qos rules/policies, but I think any user can apply a qos policy that was created by an admin to a port or network. https://github.com/openstack/neutron-lib/blob/master/neutron_lib/api/definit... extends the port and network resources with a qos policy id. https://github.com/openstack/neutron/blob/master/neutron/conf/policies/qos.p... https://github.com/openstack/neutron/blob/master/neutron/conf/policies/port.... https://github.com/openstack/neutron/blob/master/neutron/conf/policies/netwo... do not set an admin-only policy on the qos policy id, so I assume the default of RULE_ANY ('rule:regular_user') or RULE_ADMIN_OR_OWNER ('rule:admin_or_owner') applies. The intent of having policy creation be admin only, while requesting a policy is available to all tenants, was to allow operators to charge for guaranteed bandwidth or prioritised traffic and let tenants opt in to that. If the admin did not define any qos policies and used the default API policies, then yes, there are likely few users of this outside of telco deployments.
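For reference, the split described above (the admin creates the policy and rule, a regular user merely attaches it to their port) maps onto the neutron API roughly like this sketch (endpoints, tokens and values are placeholders):

  import requests

  NETWORK_ENDPOINT = 'https://cloud.example.com:9696'  # placeholder
  ADMIN_HEADERS = {'X-Auth-Token': 'admin-token'}      # placeholder
  USER_HEADERS = {'X-Auth-Token': 'user-token'}        # placeholder
  PORT_ID = 'the-tenant-port-uuid'                     # placeholder

  # admin only by default: create the policy and its minimum-bandwidth rule
  policy = requests.post(
      NETWORK_ENDPOINT + '/v2.0/qos/policies', headers=ADMIN_HEADERS,
      json={'policy': {'name': 'gold', 'shared': True}}).json()['policy']
  requests.post(
      NETWORK_ENDPOINT + '/v2.0/qos/policies/%s/minimum_bandwidth_rules'
      % policy['id'],
      headers=ADMIN_HEADERS,
      json={'minimum_bandwidth_rule': {'min_kbps': 1000,
                                       'direction': 'egress'}})

  # allowed for a regular user under the default API policy: attach the
  # shared policy to their own port
  requests.put(NETWORK_ENDPOINT + '/v2.0/ports/%s' % PORT_ID,
               headers=USER_HEADERS,
               json={'port': {'qos_policy_id': policy['id']}})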
On 12/20/2018 8:02 AM, Bence Romsics wrote:
I like the idea of a feature flag, however technically in neutron we don't directly control the list of extensions loaded. Instead what we control (through configuration) is the list of service plugins loaded. The 'resource_request' extension is implemented by the 'qos' service plugin. But the 'qos' service plugin implements other API extensions and features too. A cloud admin may want to use these other features of the 'qos' service plugin, but not the guaranteed minimum bandwidth. Therefore I'd suggest adding the feature flag to nova to keep this usage possible.
Can't the resource_request part of this be controlled via policy rules or something similar? If neutron were using microversions, users would have to opt into this new QoS bandwidth behavior and nova, as a client, would also have to request that information about each port from neutron to know if it's even available in the neutron server version they are talking to. Anyway, microversions aside, it's not clear to me why a neutron API extension doesn't control whether this new field is added to the port resource response? Barring that, are policy rules something that could be used so deployers could decide which users can use this feature while it's being rolled out? -- Thanks, Matt
Hi Matt, Sorry for the slow response over the winter holidays. First I have to correct myself:
On 12/20/2018 8:02 AM, Bence Romsics wrote:
... in neutron we don't directly control the list of extensions loaded. Instead what we control (through configuration) is the list of service plugins loaded. The 'resource_request' extension is implemented by the 'qos' service plugin. But the 'qos' service plugin implements other API extensions and features too. A cloud admin may want to use these other features of the 'qos' service plugin, but not the guaranteed minimum bandwidth.
This is the default behavior, but it can be overcome by a small patch like this: https://review.openstack.org/627978 With a patch like that we could control loading the port-resource-request extension (and thereby the presence of the resource_request attribute) on its own, independently of all other extensions implemented by the qos service plugin. On Thu, Dec 20, 2018 at 6:58 PM Matt Riedemann <mriedemos@gmail.com> wrote:
Can't the resource_request part of this be controlled via policy rules or something similar?
Is this question still relevant given the above? Even if we could control the resource-request attribute via policy rules wouldn't that be just as undiscoverable as a config-only feature flag?
Barring that, are policy rules something that could be used so deployers could decide which users can use this feature while it's being rolled out?
Using a standalone neutron extension (controlled on its own by a neutron config option) as a feature flag (and keeping it not loaded by default until the feature is past experimental) would lessen the number of cloud deployments (probably to zero) where the experimental feature is unintentionally exposed. On the other hand - now that Jay called my attention to the undiscoverability of feature flags - I realize that this approach is not enough to distinguish the complete and experimental versions of the feature, given the experimental version was exposed intentionally. Cheers, Bence
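As a rough sketch of what controlling the port-resource-request extension on its own could look like (the config option name here is made up; see the actual patch for how it is really done), the qos service plugin could simply stop advertising that one alias unless explicitly enabled:

  from oslo_config import cfg

  cfg.CONF.register_opts(
      [cfg.BoolOpt('expose_port_resource_request', default=False)],  # hypothetical option
      group='qos')

  class QoSPlugin(object):  # stands in for neutron's qos service plugin
      _all_aliases = ['qos', 'port-resource-request']

      @property
      def supported_extension_aliases(self):
          aliases = list(self._all_aliases)
          if not cfg.CONF.qos.expose_port_resource_request:
              aliases.remove('port-resource-request')
          return aliases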
On 12/19/2018 9:03 AM, Balázs Gibizer wrote:
* The next two patches [6][7] implement rejecting use cases that were decided to be out of scope for the feature (interface attach with a resource request, booting with a network having a resource request)
I really thought these would go *before* any new feature changes since we currently do not support these types of ports, so if a user were to create a port (or network) with QoS rules and attach it to a server in nova, we're just ignoring it, which is essentially broken. I think of this like how we rejected attempts to rebuild a volume-backed server with a new image (we used to just let that slide with no errors but it didn't actually rebuild the root volume) or trying to attach SR-IOV ports (again, not supported in compute). I think of the "reject" patches more as bug fixes for existing broken, or at least misleading, code.
* The next patch [8] adds support for removing the resource allocation during interface detach if the detached port has a resource request
This is one example of how the implementation is backward. This is something that would be necessary plumbing before enabling a new feature in the API (assuming this cleanup happens in nova-compute?).
* The rest of the series up until [9] implements support for selecting VFs from the same PF that the bandwidth was allocated from. This allows handling bandwidth and VF resources properly even if there is more than one PF on the same compute connected to the same physnet.
This is really kind of news to me. Maybe I glossed over this or didn't understand it in the spec, but the spec doesn't really say much about this either. There is only one mention of VFs in the spec: "When multiple ports, having QoS policy rules towards the same physical network, are attached to the server (e.g. two VFs on the same PF) then the resulting allocation is the sum of the resource amounts of each individual port request." Anyway, it seems like a sub-feature and I just didn't realize it.
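For what it's worth, the quoted spec sentence just means the per-port amounts are added up on the provider they map to (the resource class name is shown for illustration); roughly:

  from collections import Counter

  # two VFs on the same PF, each with its own bandwidth request
  port_requests = [
      {'NET_BW_EGR_KILOBIT_PER_SEC': 1000},
      {'NET_BW_EGR_KILOBIT_PER_SEC': 2000},
  ]
  total = Counter()
  for request in port_requests:
      total.update(request)
  # resulting allocation on that PF's provider:
  # {'NET_BW_EGR_KILOBIT_PER_SEC': 3000}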
What is still missing from the patch series: * Support for any kind of server move operations (resize, migrate, evacuate, live-migrate, unshelve) * Upgrade support: Bound SR-IOV ports might already have a QoS policy that would need healing of the resource allocation in placement.
This wouldn't be a problem if we were already rejecting ports with QoS rules applied as mentioned above.
Order of the patches --------------------
One comment I got from Matt is that the patches are in the wrong order. The current order allows booting a server with a bandwidth request _before_ every other server operation is supported (or rejected). This could lead to a violation of the resource view of the system. For example, booting a server with a QoS aware port will result in a bandwidth allocation on the compute host. But then later on, migrating that server to another host does not properly handle moving the allocation to the target host.
In my view reordering the patches to only allow booting the first server with bandwidth after every other operation is supported is not feasible. The current chain is already 17 patches long and it does not touch server moving operations at all. Also if I cannot boot a server with bandwidth then I cannot test that such a server can be moved properly. So all the functional tests would end up in the last patch of the series.
When I say wrong order, I mostly mean from the bottom up, not from a feature parity perspective. The normal way to build a long, complicated feature series like this, one that involves the entire stack, would be to start plumbing code into nova-compute at the bottom layer and then work your way up to the control layer services (conductor/scheduler) and finally the API where the feature is turned on (usually with a microversion). I am semi OK with deferring move operation support in v1 and building that on later given the complicated nature of this feature, but my real issue with the order is the backward nature of it (I don't want to approve patches that add code to the API for a feature just to let it fall on the ground if the compute code isn't already merged).
However I think we can use the feature flag approach. We can implement and verify everything in the current order without the end user seeing or causing any kind of inconsistencies. The nova implementation is only activated if a neutron port has a resource_request field. This field is added to the neutron API as an API extension. We can mark this extension experimental while we are working through the missing pieces of the feature. But we can also turn on that extension in our test environment to verify each and every step during the implementation process.
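To make the activation condition concrete, a minimal sketch (the helper name is made up): nova only has work to do for ports that actually carry the new field, so without the extension the new code paths stay dormant.

  # ports as returned by neutron's GET /v2.0/ports
  def ports_with_resource_request(ports):
      # if the port-resource-request extension is not loaded the field is
      # simply absent and nothing new happens in nova
      return [p for p in ports if p.get('resource_request')]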
I'm definitely not crazy about the feature flag idea, but I need to read and reply to the other responses in the thread first.
Which RP fulfills the request of a neutron port -----------------------------------------------
Neutron expects a single RP uuid sent by nova in the port binding that tells neutron which RP provides the resources for that port. This is the famous problem to find which request group is fulfilled by which resource provider in an allocation candidate. There are 3 possible solutions:
* Neutron does the mapping. This would require Neutron to see each port binding for a given server as a single request. This is not the case today and not feasible to hack around in neutron.
Before I started reviewing the code, I read the spec again to get the context in my head and I really thought this is what was proposed - that the allocation candidates that we used to claim resources in the scheduler would be passed through to the port binding request and neutron would sort out whatever it needed to from that. This is the section of the spec I'm thinking of: https://specs.openstack.org/openstack/nova-specs/specs/stein/approved/bandwi... Did something change drastically since the spec was written such that nova needs to do this now, at least temporarily? -- Thanks, Matt
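For reviewers who want the gist of the interim nova-side mapping, a deliberately oversimplified sketch (it ignores required traits and the case where several ports need the same resource class; the real logic is in the patches under review):

  # allocations: what nova claimed in placement for the server, e.g.
  #   {rp_uuid: {'resources': {'NET_BW_EGR_KILOBIT_PER_SEC': 1000}}}
  # port_request: the port's resource_request from neutron, e.g.
  #   {'resources': {'NET_BW_EGR_KILOBIT_PER_SEC': 1000}, 'required': [...]}
  def find_rp_for_port(port_request, allocations):
      wanted = set(port_request['resources'])
      for rp_uuid, alloc in allocations.items():
          if wanted <= set(alloc['resources']):
              return rp_uuid
      return None  # no single provider covers this port's request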
participants (7)
- Balázs Gibizer
- Bence Romsics
- Dan Smith
- Jay Pipes
- Matt Riedemann
- melanie witt
- Sean Mooney