[oslo] PTG Summary

8 May 2019

Hi,

You can find the raw notes on the etherpad 
(https://etherpad.openstack.org/p/oslo-train-topics), but hopefully this 
summary will be easier to read and understand.

Pluggable Policy
----------------
Spec: https://review.opendev.org/#/c/578719/

Since this sort of ran out of steam last cycle, we discussed the option 
of not actually making it pluggable and just explicitly adding support 
for other policy backends. The specific one that seems to be of interest 
is Open Policy Agent. To do this we would add an option to enable OPA 
mode, where all policy checks would be passed through to OPA by default. 
An OPACheck class would also be added to facilitate migration: as each 
rule is added to OPA, the corresponding policy is switched to OPACheck. 
Once all rules are present in OPA, the policy file can be removed and 
OPA mode turned on.
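
As a rough illustration of the migration path described above, here is a 
minimal, standard-library-only sketch. It is not oslo.policy code: 
OPACheck as written here, Enforcer, opa_mode, the query callable, and the 
mapping of rule names to OPA paths (replacing ":" with "/") are all 
hypothetical, since the design was never finalized. The only outside fact 
it relies on is OPA's Data API, which answers POST /v1/data/<path> with a 
JSON body containing a "result" key.

```python
import json
import urllib.request


def http_opa_query(url, path, payload):
    """POST the input document to OPA's Data API and return its 'result'."""
    req = urllib.request.Request(
        f"{url}/v1/data/{path}",
        data=json.dumps({"input": payload}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # An undefined OPA document has no "result" key; treat that as deny.
        return json.load(resp).get("result", False)


class OPACheck:
    """Hypothetical check that delegates a single rule to OPA."""

    def __init__(self, path, query):
        self.path = path    # OPA document path (assumed layout)
        self.query = query  # callable(path, payload) -> decision

    def __call__(self, target, creds):
        payload = {"target": target, "credentials": creds}
        return self.query(self.path, payload) is True


class Enforcer:
    """Hypothetical enforcer: local rules, OPACheck rules, or full OPA mode."""

    def __init__(self, rules, query, opa_mode=False):
        self.rules = rules        # rule name -> callable(target, creds)
        self.query = query
        self.opa_mode = opa_mode  # when True, every check goes to OPA

    def enforce(self, rule, target, creds):
        if self.opa_mode:
            # Assumed mapping from rule name to OPA path.
            check = OPACheck(rule.replace(":", "/"), self.query)
        else:
            check = self.rules.get(rule)
            if check is None:
                return False  # unknown rules are denied in this sketch
        return bool(check(target, creds))
```

During migration, individual entries in rules would be replaced with 
OPACheck instances one at a time. Once every rule lives in OPA, the rules 
dict can be dropped and opa_mode set to True. To talk to a real OPA 
server, query could be something like 
functools.partial(http_opa_query, "http://localhost:8181"), where the URL 
is again only an assumption.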

However, after some further investigation by Patrick East, it was not 
clear if users were asking for this or if the original spec was more of 
a "this might be useful" thing. He's following up with some OPA users to 
see if they would use such a feature, but at this point it's not clear 
whether there is enough demand to justify spending time on it.

Image Encryption/Decryption Library
-----------------------------------
I mention this mostly because the current plan is _not_ to create a new 
Oslo library to enable the feature. The common code between services is 
expected to live in os-brick, and there does not appear to be a need to 
create a new encryption library to support this (yay!).

oslo.service SIGHUP bug
-----------------------
This is a problem a number of people have run into recently and there's 
been some ongoing, but spotty, discussion of how to deal with it. In 
Denver we were able to have some face-to-face discussions and hammer out 
a plan to get this fixed. I think we have a fix identified, and now we 
just need to get it proposed and tested so we don't regress this in the 
future. Most of the prior discussion and a previously proposed fix are 
at https://review.opendev.org/#/c/641907/ so if you want to follow this 
that's the place to do it.

In case anyone is interested, it looks like this is a bug that was 
introduced with mutable config. Mutable config requires a different type 
of service restart, and that was never implemented. Now that most 
services are using mutable config, this is a much bigger problem.
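
To make the distinction concrete, here is a small standard-library sketch 
of two things a SIGHUP could trigger. It is not the oslo.service fix and 
does not use the oslo.config API. Service, MUTABLE, and on_sighup are 
hypothetical names, and the sketch reflects only one reading of "a 
different type of service restart": a mutable-config reload updates the 
options marked mutable in the running process instead of tearing the 
service down and starting it again.

```python
import signal

# Assumption: only these options may change without a restart.
MUTABLE = {"debug", "log_level"}


class Service:
    def __init__(self, load_config):
        self.load_config = load_config  # callable returning a dict of options
        self.conf = dict(load_config())
        self.restarts = 0

    def full_restart(self):
        """Restart-style handling: reread every option and count a restart."""
        self.conf = dict(self.load_config())
        self.restarts += 1

    def mutate(self):
        """Mutable-config reload: apply only mutable options, keep running."""
        fresh = self.load_config()
        changed = {}
        for key in MUTABLE:
            if key in fresh and fresh[key] != self.conf.get(key):
                changed[key] = (self.conf.get(key), fresh[key])
                self.conf[key] = fresh[key]
        return changed

    def on_sighup(self, signum=None, frame=None):
        """Signal handler: reload in place rather than restarting."""
        return self.mutate()


def install(service):
    """Register the handler where SIGHUP exists (it does not on Windows)."""
    if hasattr(signal, "SIGHUP"):
        signal.signal(signal.SIGHUP, service.on_sighup)
```

In this sketch a change to a non-mutable option such as a bind port is 
ignored by on_sighup and would only take effect through full_restart.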

Unified Limits and Policy
-------------------------
I won't try to cover everything in detail here, but good progress was 
made on both of these topics. There isn't much to do from the Oslo side 
for the policy changes, but we identified a plan for an initial 
implementation of oslo.limit. There was general agreement that we don't 
necessarily have to get it 100% right on the first attempt. We just need 
to get something in the repo that people can start prototyping with. 
Until we release a 1.0 we aren't committed to any API, so we have 
flexibility to iterate.

For more details, see: 
https://etherpad.openstack.org/p/ptg-train-xproj-nova-keystone

oslo.service profiling and PyPy
-------------------------------
Oslo has dropped support for PyPy in general due to lack of maintainers, 
so although the profiling work has apparently broken oslo.service under 
PyPy, this isn't something we're likely to address. Based on our 
conversation at the PTG game night, it sounds like this isn't a priority 
anymore anyway, because PyPy didn't deliver the desired performance 
improvement.

oslo.privsep eventlet timeout
-----------------------------
AFAICT, oslo.privsep only uses eventlet at all if monkey-patching is 
enabled (and then only to make sure it returns the right type of pipe 
for the environment). It's doubtful that any eventlet exceptions are 
being raised from the privsep code, and even if they are, they would go 
away once monkey-patching in the calling service is disabled. Privsep 
explicitly does not depend on eventlet for any of its functionality, so 
services should be able to freely move away from eventlet if they wish.
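
For readers unfamiliar with this kind of conditional dependency, the 
pattern looks roughly like the sketch below. This is not copied from 
oslo.privsep. pick_subprocess_module is a hypothetical helper, and the 
choice of the subprocess module is only an example. The one real API it 
relies on is eventlet.patcher.is_monkey_patched().

```python
import importlib


def pick_subprocess_module():
    """Return eventlet's green subprocess only if the caller monkey-patched.

    Falls back to the standard library when eventlet is not installed or
    when the calling service has not monkey-patched the socket module.
    """
    try:
        from eventlet import patcher
    except ImportError:
        return importlib.import_module("subprocess")
    if patcher.is_monkey_patched("socket"):
        return importlib.import_module("eventlet.green.subprocess")
    return importlib.import_module("subprocess")
```

A service that stops monkey-patching gets the standard-library module 
back without any change to the calling code.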

Retrospective
-------------
In general, we got some major features implemented that unblocked things 
either users or services were asking for. We did add two cores during 
the cycle, but we also lost a long-time Oslo core and some of the other 
cores are being pulled away on other projects. So far this has probably 
resulted in a net loss in review capacity.

As a result, our primary action item out of this is to keep watching 
for new candidates to join the Oslo team. We have at least one person we 
are working closely with and a number of other people approached me at 
the event with interest in contributing to one or more Oslo projects. So 
while this cycle was a bit of a mixed bag, I have a cautiously 
optimistic view of the future.

Service Healthchecks and Metrics
--------------------------------
We had some initial hallway-track discussions about this. The self-healing 
SIG is looking into ways to improve the healthcheck and metric situation 
in OpenStack, and some of them may require additions or changes in Oslo. 
There is quite a bit of discussion (not all of which I have read yet) 
related to this on https://review.opendev.org/#/c/653707/

On the metrics side, there are some notes on the SIG etherpad (currently 
around line 209): https://etherpad.openstack.org/p/DEN-self-healing-SIG

It's still a bit early days for both of these things so plans may 
change, but it seems likely that Oslo will be involved to some extent. 
Stay tuned.

Endgame
-------
No spoilers, I promise. If you made it all the way here then thanks and 
congrats. :-)

I hope this was helpful, and if you have any thoughts about anything 
above please let me know.

Thanks.

-Ben

Ben Nemec
