Re: [nova] review guide for the bandwidth patches

5 Jan 2019


      On Fri, 04 Jan 2019 13:20:54 +0000, Sean Mooney <smooney@redhat.com> wrote:
...
On Fri, 2019-01-04 at 00:48 -0800, melanie witt wrote:
...
On Thu, 3 Jan 2019 11:40:22 -0600, Matt Riedemann <mriedemos@gmail.com>
wrote:
...
On 12/28/2018 4:13 AM, Balázs Gibizer wrote:
...
I'm wondering whether introducing an API microversion could act like the
feature flag I need and at the same time still make the feature
discoverable as you would like to see it. Something like: Create a
feature flag in the code but do not put it in the config as a settable
flag. Instead add an API microversion patch to the top of the series
and when the new version is requested it enables the feature via the
feature flag. This API patch can be small and simple enough to
cherry-pick earlier into the series for local end-to-end testing if
needed. Also in functional tests I can set the flag via a mock so I can
add and run functional tests patch by patch.
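A minimal sketch of the idea above, assuming hypothetical names throughout (`FeatureFlag`, `build_scheduler_request`, `NEW_MICROVERSION` and the shape of the port dict are illustrative, not nova code). The flag is not a config option; it is switched on by the requested microversion, and a functional test can force it on with a mock:

```python
from unittest import mock

# Placeholder value, not a real nova microversion.
NEW_MICROVERSION = (2, 99)


class Request:
    """Stand-in for an API request carrying a parsed microversion."""

    def __init__(self, microversion):
        self.microversion = microversion


class FeatureFlag:
    """In-code feature flag; deliberately not exposed in the config."""

    @staticmethod
    def enabled(request):
        # The API patch at the top of the series would add this check.
        return request.microversion >= NEW_MICROVERSION


def build_scheduler_request(request, port):
    """Include the port's resource request only when the flag is on."""
    resources = {}
    if FeatureFlag.enabled(request):
        resources = dict(port.get("resource_request") or {})
    return {"port_id": port["id"], "resources": resources}


def run_with_flag_forced_on(request, port):
    """How a functional test lower in the series could enable the feature."""
    with mock.patch.object(FeatureFlag, "enabled", return_value=True):
        return build_scheduler_request(request, port)
```

With this layout, patches below the API change can be tested through `run_with_flag_forced_on` while real requests keep the old behavior until the microversion patch lands.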
That may work. It's not how I would have done this, I would have started
from the bottom and worked my way up with the end to end functional
testing at the end, as already noted, but I realize you've been pushing
this boulder for a couple of releases now so that's not really something
you want to change at this point.
I guess the question is should this change have a microversion at all?
That's been wrestled in the spec review and called out in this thread. I
don't think a microversion would be *wrong* in any sense and could only
help with discoverability on the nova side, but am open to other opinions.
Sorry to be late to this discussion, but this was brought up in the nova
meeting today to get more thoughts. I'm going to briefly summarize my
thoughts here.
IMHO, I think this change should have a microversion, to help with
discoverability. I'm thinking, how will users be able to detect they're
able to leverage the new functionality otherwise? A microversion would
signal the availability. As for dealing with the situation where a user
specifies an older microversion combined with resource requests, I think
it should behave similarly to how multiattach works, where the request
will be rejected straight away if the microversion is too low and resource
requests are passed.
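The "reject straight away" behavior proposed above could look roughly like the following sketch. Everything here is an assumption for illustration: `NEW_MICROVERSION` is a placeholder, `BadRequest` stands in for an HTTP 400 response, and ports are plain dicts rather than real neutron port objects.

```python
# Placeholder value, not a real nova microversion.
NEW_MICROVERSION = (2, 99)


class BadRequest(Exception):
    """Stand-in for the HTTP 400 the API would return."""


def check_ports(microversion, ports):
    """Reject the request if an old microversion meets a resource request."""
    if microversion >= NEW_MICROVERSION:
        return
    offending = [p["id"] for p in ports if p.get("resource_request")]
    if offending:
        raise BadRequest(
            "ports %s have a resource request; microversion %d.%d or later "
            "is required" % (", ".join(offending), *NEW_MICROVERSION)
        )
```

Ports without a resource request pass through unchanged at any microversion, so only requests that would otherwise be silently ignored are refused.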
this has implications for upgrades and version compatibility.
if a newer version of neutron is used with an older nova then
behavior will change when nova is upgraded to a version of
nova that has the new microversion.
my concern is as follows.
a given deployment has rocky nova and rocky neutron.
a tenant defines a minimum bandwidth policy and applies it to a network.
they create a port on that network.
neutron will automatically apply the minimum bandwidth policy to the port when it is created on the network.
but we could also assume the tenant applied the policy to the port if we liked.
the tenant then boots a vm with that port.
when the vm is scheduled to a node neutron will ask the network backend via the ml2 driver to configure the minimum
bandwidth policy if the network backend supports it as part of the bind port call. the ml2 driver can refuse to bind the
port at this point if it cannot fulfil the request to prevent the vm from spawning. assuming the binding succeeds the
backend will configure the minimum bandwidth policy on the interface. nova in rocky will not schedule based on the qos
policy as there is no resource request in the port and placement will not model bandwidth availability.
note: this is how minimum bandwidth was originally planned to be implemented with ml2/odl and other sdn controller
         backends several years ago but odl did not implement the required features so this mechanism was never used.
         i am not aware of any ml2 driver that actually implemented a bandwidth check but before placement was created this
         was the mechanism that at least my team at intel and some others had been planning to use.
so in rocky the vm should boot, there will be no prevention of oversubscription in placement and neutron will configure
the minimum bandwidth policy if the network backend supports it. The ingress qos minimum bandwidth rules were only added
later in neutron, but egress qos minimum bandwidth support was added in newton with
https://github.com/openstack/neutron/commit/60325f4ae9ec53734d792d111cbcf242...
so there will be a lot of existing cases where ports will have minimum bandwidth policies before stein.
if we repeat the same exercise with rocky nova and stein neutron this changes slightly in that
neutron will look at the qos policy associated with the port and add a resource request. as rocky nova
will not have code to parse the resource requests from the neutron port they will be ignored and
the vm will boot, the neutron backend will configure minimum bandwidth enforcement on the port, placement will
model the bandwidth as an inventory but no allocation will be created for the vm.
note: i have not checked the neutron code to confirm the qos plugin will still work without the placement allocation
         but if it does not it's a bug as stein neutron would no longer work with pre-stein nova. as such we would have
         broken the ability to upgrade nova and neutron separately.
if you use stein nova and stein neutron and the new microversion then the vm boots, we allocate the bandwidth in
placement and configure the enforcement in the networking backend if it supports it, which is our end goal.
the last configuration is stein nova and stein neutron with an old microversion.
this will happen in two cases.
first, when no microversion is specified explicitly and the openstack client is used, since it will not negotiate the latest
microversion; second, when an explicit microversion is passed.
if the last rocky microversion was passed for example and we chose to ignore the presence of the resource request then
it would work the way it did with nova rocky and neutron stein above. if we choose to reject the request instead
anyone who tries to perform instance actions on an existing instance will break after nova is upgraded to stein.
while the fact that oversubscription may happen could be problematic to debug for some i think the ux cost is less than
the cost of updating all software that has used egress qos since it was introduced in newton to explicitly pass the latest
microversion.
i am in favor of adding a microversion by the way, i just think we should ignore the resource request if an old
microversion is used.
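The four configurations walked through above can be summarized as a small decision function. This is only a sketch of the described outcomes, not nova or neutron code: `boot_outcome` and its parameters are hypothetical, and only the "rocky" and "stein" releases are modeled. Backend enforcement is left out because it depends on the network backend in every case.

```python
def boot_outcome(nova, neutron, new_microversion=False,
                 old_microversion_policy="ignore"):
    """Model the scenarios above for a port with a minimum bandwidth policy.

    nova / neutron: "rocky" or "stein".
    new_microversion: whether the request uses the new microversion.
    old_microversion_policy: "ignore" or "reject", the two options
        being debated for stein nova with an old microversion.
    """
    # Only stein neutron adds a resource request to the port.
    port_has_request = neutron == "stein"
    # Only stein nova has code to parse that resource request.
    nova_understands = nova == "stein" and port_has_request

    if (nova_understands and not new_microversion
            and old_microversion_policy == "reject"):
        # Rejecting would break clients that never pass a microversion.
        return {"boots": False, "placement_allocation": False}

    return {
        "boots": True,
        # Placement prevents oversubscription only in this case.
        "placement_allocation": nova_understands and new_microversion,
    }
```

Under the "ignore" policy, stein nova with an old microversion behaves exactly like rocky nova with stein neutron: the vm boots and no allocation is created.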
Thanks for describing this detailed scenario -- I hadn't realized that
today, you can get _some_ QoS support by pre-creating ports in neutron 
with resource requests attached and specifying those ports when creating 
a server. I understand now the concern with the idea of rejecting 
requests < new microversion + port.resource_request existing on 
pre-created ports. And there's no notion of being able to request QoS 
support via ports created by Nova (no change in Nova API or flavor 
extra-specs in the design). So, I could see this situation being reason 
enough not to reject requests when an old microversion is specified.

But, let's chat more about it via a hangout the week after next (week of 
January 14 when Matt is back), as suggested in #openstack-nova today. 
We'll be able to have a high-bandwidth discussion then and agree on a 
decision on how to move forward with this.
...
...
Current behavior today would be, the resource
requests are ignored. If we only ignored the resource requests when
they're passed with an older microversion, it seems like it would be an
unnecessarily poor UX to have their parameters ignored and likely lead
them on a debugging journey if and when they realize things aren't
working the way they expect given the resource requests they specified.
-melanie