Re: [nova] review guide for the bandwidth patches

28 Dec 2018

      On Thu, Dec 20, 2018 at 7:14 PM, Matt Riedemann <mriedemos@gmail.com> 
wrote:
...
On 12/20/2018 8:47 AM, Jay Pipes wrote:
...
The problem with feature flags is they are not discoverable. As much 
as I hate API extensions, at least they are discoverable.
100% agree. Config-driven API behavior like this flies in the face of 
any kind of hope for interoperability of this feature. Can I do this 
on cloud X? Maybe? Can I do it on cloud Y? Let's find out! At the 
very least, I would think if you're going to add a config option to 
enable/disable this support in the compute API, at least temporarily, 
you would fail any requests which include ports with QoS resource 
requests because we're not going to honor them. That aligns more with 
what I was saying about how I thought the "reject" patches would come 
earlier in the series since we're currently not supporting this 
feature, and lying to users by allowing them to attach these kinds of 
ports. Although I might be conflating QoS ports in general and QoS 
ports that have the bandwidth resource request things on them.
During the spec review, we waffled a bit on whether or not this 
should have a new microversion with it [1][2]. The way this works 
definitely sounds like how volume multiattach is setup - you create a 
resource in another service with a special characteristic and then 
try to attach that thing to a server. Nova can either support that 
thing or not, and the only way for a user to find that out without a 
microversion is to try it and find out (does it fail? if it doesn't 
fail, is the thing working in my guest? if not, is it because it's 
not supported or because something failed or there is a bug?). With 
volume multiattach we added a microversion so that users could know 
if they can even make the request to attach those types of volumes so 
they didn't have to guess. So I could argue that QoS ports like this 
should get similar treatment, but things get fuzzy when it's implicit 
behavior on some external resource
...
Is there a reason why we can't just put code into Nova that, upon 
seeing the resource requests, acts on it? Is there a reason it 
needs to be behind a feature flag or even needs to query Neutron 
for extension support? I mean, if it's in the response payload, 
just use it, right?
The main reason is how the code is written backward, as I've said 
elsewhere in this thread. If everything was properly plumbed in from 
the bottom up for at least creating servers with these kinds of ports 
(and cleaning them up properly), then it's just a matter of doing 
like you said - take action on the thing if it's there. Like [3] I 
think we could explicitly fail for things that are not supported 
until they do become supported, like any move operations. Whether 
those are considered bug fixes or new features (a la microversions) 
is debatable. It sucks adding stuff that works when creating a server 
but isn't supported when you perform actions on that server, and it 
sucks holding up features before everything is ready, and it sucks to 
need microversions for every little thing that ideally would have 
just be done up front, so I can't really say here on what we should 
do. I know in the golden days before microversions it was totally 
kosher to just feature flag our way to nirvana and we're still 
dealing with a lot of that fallout (like the fact that live migration 
with NUMA-affined instances still doesn't work properly).
I'd like to make this feature work and merged without drowning into the 
huge patch series I'm already pushing forward. I'd like to make small 
and verifyable steps toward the full feature, so for me, moving the 
functional tests to the end is scary like hell.

I'm wondering that introducing an API microversion could act like a 
feature flag I need and at the same time still make the feautre 
discoverable as you would like to see it. Something like: Create a 
feature flag in the code but do not put it in the config as a settable 
flag. Instead add an API microversion patch to the top of the series 
and when the new version is requested it enables the feature via the 
feature flag. This API patch can be small and simple enough to 
cherry-pick to earlier into the series for local end-to-end testing if 
needed. Also in functional test I can set the flag via a mock so I can 
add and run functional tests patch by patch.

Regarding rejections. I can add patches that rejects server operations 
(including create too) if the new microversion isn't requested but a 
port included in the request or associated to the server has resource 
request.

How does this sound to you?

Cheers,
gibi

[1] https://review.openstack.org/#/c/569459
[2] https://review.openstack.org/#/c/623543
[3] https://review.openstack.org/#/c/567268/42/nova/conf/workarounds.py
...
[1] 
https://review.openstack.org/#/c/502306/17/specs/rocky/approved/bandwidth-re...
[2] 
https://review.openstack.org/#/c/502306/26/specs/rocky/approved/bandwidth-re...
[3] https://review.openstack.org/#/c/611088/
--
Thanks,
Matt

Re: [nova] review guide for the bandwidth patches

Balázs Gibizer